SEO Book: Chapter 6: Avoiding Things That Search Engines Hate

Search Engine Optimization: SEO Book

Chapter 6

Avoiding Things That Search Engines Hate

Dealing with Frames
Frames were very popular a few years ago. A framed site is one in which the browser window is broken into two or more frames, each of which holds a Web page. Frames cause a number of problems. Some browsers don't handle them well - in fact, the first frame-enabled browsers weren't that enabled and often crashed when loading frames. In addition, many designers created framed sites without properly testing them. They built the sites on large, high-resolution screens, so they didn't realize that they were creating sites that would be almost unusable on small, low-resolution screens.
From a search engine perspective, frames create the following problems:
· Some search engines have trouble getting through the frame-definition, or frameset, page to the actual Web pages.
· If the search engine gets through, it indexes individual pages, not framesets. Each page is indexed separately, so pages that make sense only as part of the frameset end up in the search engines as independent pages.
· You can't point to a particular page in your site. This may be a problem in the following situations:
    · Linking campaigns. Other sites can link only to the front of your site; they can't link to specific pages during link campaigns.
    · Pay-per-click campaigns. If you are running a pay-per-click campaign, you can't link directly to a page related to a particular product.
    · Placing your products in shopping directories. In this case, you need to be able to link to a particular product page.
Search engines index URLs - single pages. By definition, a framed site is a collection of URLs, and as such, search engines don't know how to properly index the pages.

The HTML Nitty-Gritty of Frames
Here's an example of a frame-definition, or frameset, document:

<HTML>
<HEAD>
</HEAD>
<FRAMESET ROWS="110, *">
<FRAME SRC="navbar.htm">
<FRAME SRC="main.htm">
</FRAMESET>
</HTML>
This document describes how the frames should be created. It tells the browser to create two rows, one 110 pixels high and the other * high - that is, whatever room is left over. It also tells the browser to grab the navbar.htm document and place it in the first frame - the top row - and place main.htm into the bottom frame. Most of the bigger search engines can find their way through the frameset to the navbar.htm and main.htm documents, so Google, for instance, indexes those documents. Some older systems may not, however, effectively making the site invisible to them.

Providing Search Engines With The Necessary Information
The first thing you can do is to provide information in the frame-definition document for the search engines to index. First, add a TITLE and your meta tags, like this:

<HTML>
<HEAD>
<TITLE>Rodent Racing - Scores, Mouse Events, Rat Events, Gerbil Events - Everything about Rodent Racing</TITLE>
<META NAME="description" CONTENT="Rodent Racing - Scores, Schedules, everything Rodent Racing. Whether you're into mouse racing, stoat racing, rats, or gerbils, our site provides everything you'll ever need to know about Rodent Racing and caring for your racers.">
<META NAME="keywords" CONTENT="Rodent Racing, Racing Rodents, Gerbils, Mice, Mouse, Rodent Races, Rat Races, Mouse Races, Stoat, Stoat Racing, Rats">
</HEAD>
<FRAMESET ROWS="110,*">
<FRAME SRC="navbar.htm">
<FRAME SRC="main.htm">
</FRAMESET>
</HTML>

Then, at the bottom of the FRAMESET, add <NOFRAMES> tags - <NOFRAMES> tags were originally designed to enclose text that would be displayed by a browser that couldn't handle frames - with <BODY> tags and information inside, like this:

<FRAMESET ROWS="110,*">
<FRAME SRC="navbar.htm">
<FRAME SRC="main.htm">
<NOFRAMES>
<BODY>
<H1>Rodent Racing - Everything You Ever Wanted to Know about Rodent Racing
Events and the Rodent Racing Lifestyle</H1>
<p>[This site uses frames, so if you are reading this, your browser doesn't handle frames]</p>
<p>This is the world's top rodent-racing Web site. You won't find more information about the world's top rodent-racing events anywhere else ... [more info]</p>
</BODY>
</NOFRAMES>
</FRAMESET>
</HTML>
The <NOFRAMES></NOFRAMES> tags were originally intended to display text for browsers that don't handle frames. Although few people still use such browsers, you can use the NOFRAMES tags to provide information to the search engines that they can index. For example, you can take the information from main.htm and place it into the NOFRAMES area. Provide 200 to 400 words of keyword-rich text to give the search engines something to work with. Make sure that the content between the <NOFRAMES> tags is about your site, and is descriptive and useful to visitors.

Providing A Navigation Path
You can easily provide a navigation path in the NOFRAMES area. Simply add links in your text to other pages in the site. Include a simple text-link navigation system on the page and remember to link to your sitemap.
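As a rough sketch, the NOFRAMES area of the frameset shown earlier could carry a simple text-link navigation bar along these lines. The page names scores.htm, events.htm, and sitemap.htm are hypothetical placeholders, not files from the example site.

```html
<NOFRAMES>
<BODY>
<H1>Rodent Racing - Everything You Ever Wanted to Know</H1>
<P>Read the latest <A HREF="scores.htm">rodent-racing scores</A> and
check the schedule of <A HREF="events.htm">upcoming events</A>.</P>
<!-- Plain text links give searchbots a path into the rest of the site -->
<P>
<A HREF="main.htm">Home</A> |
<A HREF="scores.htm">Scores</A> |
<A HREF="events.htm">Events</A> |
<A HREF="sitemap.htm">Site Map</A>
</P>
</BODY>
</NOFRAMES>
```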
Remember also to do the following:
· Give all your pages unique <TITLE> and meta tags. Many designers don't bother to do this for pages in frames because browsers read only the TITLE in the frame-definition document. But search engines index these pages individually, not as part of a frameset, so they should all have this information.
· Give all your pages simple text navigation systems so that a search engine can find its way through your site.
You'll run into one problem using these links inside the pages. The links will work fine for people who arrive at the page directly through the search engines, and any link that simply points at another page will work fine both in that situation and for someone who arrives at your home page and sees the pages in the frames. But any link that points at a frame-definition document rather than another page won't work properly if someone is viewing the page in a frame.
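One common way around this, offered here as a suggestion rather than something taken from the chapter, is the TARGET attribute. A link that points at a frame-definition document can be given TARGET="_top" so that the browser replaces the whole window instead of loading a frameset inside a frame. The file names below are placeholders.

```html
<!-- Link to another content page: loads in the current frame or window -->
<A HREF="scores.htm">Scores</A>

<!-- Link to a frame-definition document: TARGET="_top" makes the browser
     replace the entire window, so the frameset is not nested in a frame -->
<A HREF="index.html" TARGET="_top">Home</A>
```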

Opening Pages In a Frameset
Given the way search engines work, pages in a Web site will be indexed individually. If you've created a site using frames, you want the site displayed in frames. You don't want individual pages pulled out of the frames and displayed, well, individually. You can use JavaScript to force the browser to load the frameset. Of course, this won't work for the small percentage of users working with browsers that don't handle JavaScript, but those browsers probably don't handle frames either, and you can't have everything. You have to place a small piece of JavaScript into each page so that when the browser loads the page, it reads the JavaScript and loads the frame-definition document. It's a simple little script: if the page finds that it is not displayed inside the frameset, it tells the browser to load the frame-definition document (index.html in this discussion) instead.
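A minimal sketch of such a script, assuming the frame-definition document is named index.html and sits in the same directory as the page, might look like this:

```html
<SCRIPT TYPE="text/javascript">
// If this page is the top-level window, it is not inside the frameset,
// so send the browser to the frame-definition document instead.
if (window.top === window.self) {
  window.location.href = "index.html";
}
</SCRIPT>
```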

This JavaScript causes two problems:
· The browser loads the frameset defined in index.html, which may not include the page that was indexed in the search engine. Visitors may have to use the navigation to find the page that, presumably, had the content they were looking for.
· The Back button doesn't work correctly, because each time the browser sees the JavaScript, it loads the index.html file again.
Another option is to have a programmer create a customized script that loads the frame-definition document and drops the specific page into the correct frame. If you work for a company with a Web-development department, this is a real possibility.
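One way such a customized script could work is sketched below. It is an illustration under stated assumptions, not a finished implementation: the "page" query parameter, the frame name "main", and the content page scores.htm are all made up for the example, and the pages are assumed to live in the same directory as index.html.

```html
<!-- 1. In each content page (scores.htm is a hypothetical example) -->
<SCRIPT TYPE="text/javascript">
// Not inside the frameset? Reload it and say which page is wanted.
// location.replace() does not add a history entry, so the Back button
// is not trapped on this page.
if (window.top === window.self) {
  window.location.replace("index.html?page=" + encodeURIComponent("scores.htm"));
}
</SCRIPT>

<!-- 2. The frame-definition document, index.html -->
<HTML>
<HEAD>
<TITLE>Rodent Racing</TITLE>
<SCRIPT TYPE="text/javascript">
// Read the hypothetical "page" parameter and load that page into the
// frame named "main" once the frameset has finished loading.
function loadRequestedPage() {
  var match = /[?&]page=([^&]+)/.exec(window.location.search);
  if (!match) { return; }
  var page = decodeURIComponent(match[1]);
  // Accept only simple local file names so the parameter cannot be used
  // to pull in a page from somewhere else.
  if (/^[\w-]+\.html?$/.test(page)) {
    window.frames["main"].location.replace(page);
  }
}
</SCRIPT>
</HEAD>
<FRAMESET ROWS="110,*" ONLOAD="loadRequestedPage()">
<FRAME SRC="navbar.htm" NAME="nav">
<FRAME SRC="main.htm" NAME="main">
</FRAMESET>
</HTML>
```

With this arrangement, a visitor who lands on scores.htm from a search engine ends up looking at the frameset with scores.htm in the bottom frame, rather than at the default main.htm.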

Handling Iframes
The iframe is a special type of frame. This is an Internet Explorer feature and something that is not as common as normal frames. An iframe is an inline floating frame. It allows you to grab content from one page and drop it into another, in the same way you can grab an image and drop it into the page. The tag is <IFRAME>, and its SRC attribute names the page to drop in.
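As an illustration, an iframe that pulls in a page might be written as follows. The file name bottom.htm and the dimensions are placeholders chosen for the example.

```html
<!-- The page named in SRC is displayed inside a 400 x 200 pixel box -->
<IFRAME SRC="bottom.htm" WIDTH="400" HEIGHT="200" FRAMEBORDER="0"></IFRAME>
```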

Iframes have problems similar to those of regular frames. In particular, some search engines don't see the content in the iframe, and the ones that do index it separately. You can add a link between the <IFRAME> and </IFRAME> tags so that older searchbots will find the document.
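A sketch of that approach, again using the placeholder file name bottom.htm, puts an ordinary text link between the opening and closing tags:

```html
<!-- The link between the tags is a fallback for software that does not
     display the iframe; it also gives searchbots a URL to follow -->
<IFRAME SRC="bottom.htm" WIDTH="400" HEIGHT="200" FRAMEBORDER="0">
<A HREF="bottom.htm">Rodent Racing scores and schedules</A>
</IFRAME>
```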

Fixing Invisible Navigation Systems
Navigation systems that never show up on search engines' radar screens are a common problem. Many Web sites use navigation systems that are invisible to search engines. A Web page is compiled in two places - on the server and in the browser. If the navigation system is created in the browser, it's probably not visible to a search engine. Examples of such systems include those created with the following:
· Java applets
· JavaScripts
· Macromedia Flash

How can you tell if your navigation has this problem? If you created the pages yourself, you probably know how you built the navigation, although you may be using an authoring tool that did it all for you. So if that's the case, or if you are examining a site that was built by someone else, here are a few ways to figure out how the navigation is built:
· If navigation is created with a Java applet, when the page loads you probably see a gray box where the navigation sits for a moment or two, with a message such as Loading Java Applet.
· Look in the page's source code and see how its navigation is created.
· Turn off the display of JavaScript and other active scripting and then reload the page to see if the navigation is still there.

Looking At the Source Code
Take a look at the source code of the document to see how it's created. Open the raw HTML file or choose View > Source from the browser's main menu, and then dig through the file looking for the navigation. If the page is large and complex, or if your HTML skills are correspondingly small and simple, you may want to try the technique under "Turning Off Scripting and Java."
Here's an example. Suppose that you find the following code where the navigation should be:
<applet code="MenuApplet" width="160" height="400" archive="http://www.yourdomain.com/menu.jar">
This is a navigation system created with a Java applet. Search engines don't read applet files, so they won't see the navigation.

Turning Off Scripting and Java
You can also turn off scripting and Java in the browser, and then look at the pages. If the navigation has simply disappeared, or it's there but doesn't work anymore, you've found the problem. Here's how to disable these settings in Internet Explorer:
· Choose Tools > Internet Options from the main menu.
· Click the Security tab.
· Click the Custom Level button to open the Security dialog box.
· Select the Microsoft VM > Java Permissions > Disable Java option button.
· Select the Active Scripting > Disable option button.
· Click the OK button, answer Yes in the message box, and click the OK button again in the Internet Options dialog box.

Fixing the Problem
If you want, you can continue to use these invisible menus and navigation tools. They can be attractive and effective. The search engines won't see them, but that's okay because you can add a secondary form of navigation, one that duplicates the top navigation. You can duplicate the navigation structure by using simple text links at the bottom of the page. If you have long pages or extremely cluttered HTML, you may want to place small text links near the top of the page to make sure search engines get to them, perhaps in the leftmost table column.
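A secondary navigation block of this kind can be as simple as a row of text links placed just before the end of the page. The sketch below uses hypothetical page names; substitute the pages that your top navigation actually points to.

```html
<!-- Secondary navigation: plain text links that duplicate the top menu -->
<P ALIGN="center">
<A HREF="index.html">Home</A> |
<A HREF="scores.htm">Scores</A> |
<A HREF="events.htm">Events</A> |
<A HREF="sitemap.htm">Site Map</A>
</P>
</BODY>
</HTML>
```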

Reducing The Clutter in Your Web Pages
Simple is good; cluttered is bad. The more cluttered your pages, the more work it is for search engines to dig through them. What do I mean by clutter? Here's an example. On one site, the HTML source document for the home page had 21414 characters, of which 19418 were characters other than spaces. However, the home page did not contain a lot of text: 1196 characters, not including the spaces between the words. So if 1196 characters were used to create the words on the page, what were the other 18222 characters used for? Things like this:
· JavaScripts: 4251 characters
· JavaScript event handlers on links: 1822 characters
· The top navigation bar: 6018 characters
· Text used to embed a Flash animation near the top of the page: 808 characters
The rest is the normal clutter that you always have in HTML: tags used to format text, create tables, and so on. The problem with this page was that a search engine had to read 17127 characters before it ever reached the page content. The page did not have much content, and what was there was hidden away below all that HTML. This clutter above the page content means that some search engines may not reach it.

Use External JavaScripts
You don't need to put JavaScripts inside a page. JavaScripts generally should be placed in an external file - a tag in the Web page "calls" a script that is pulled from another file on the Web server - for various reasons:
· They are actually safer outside the HTML file. They are less likely to be damaged while you make changes to the HTML.
· They are easier to manage externally. Why not have a nice library of all the scripts in your site in one directory?
· The download time is slightly less. If you use the same script in multiple pages, the browser downloads the script once and caches it.
· They are easier to reuse. You don't need to copy scripts from one page to another and fix all the pages when you have to make a change to the script. Just store the script externally and change the external file to automatically change the script in any number of pages.
· Doing so removes clutter from your pages!

Use document.write to remove problem code
If you have a complicated top navigation bar - one with text colors in the main bar that change when you point at a menu and/or drop-down lists, also with changing colors - you can easily get code character counts peaking at 5000 to 6000 characters. That's a lot of characters! Add some Flash animation, and you are probably up to 7000 characters, which can easily end up being a significant portion of the overall code for the page. You can easily remove all this clutter by using JavaScript to write the text into the page. Here's how:
· In an external text file, type this text: document.write("")
· Grab the entire code you want to remove from the HTML page and then paste it between the quotation marks: document.write("place code here"). Keep the pasted code on one line, and escape any double quotation marks inside it as \" or change them to single quotation marks.
· Save this file and place it on your Web server.
· Call the file from the HTML page by adding a src= attribute to your <SCRIPT> tag to refer to the external file, like this:
<script language="JavaScript" src="/scripts/navbar.js" type="text/javascript"></script>

Use External CSS Files
If you can stick JavaScript stuff into an external file, it shouldn't surprise you that you can do the same thing - drop stuff into a file that is then referred to in the HTML file proper - with Cascading Style Sheets information. For whatever reason, many designers place CSS information directly into the page, despite the fact that the ideal use of a style sheet is external. Just think about it - one of the basic ideas behind style sheets is to allow you to make formatting changes to an entire site very quickly. If you want to change the size of the body text or the color of the heading text, you make one small change in the CSS file, and it affects the whole site immediately. If you have your CSS information in each page, though, you have to change each and every page. Here's how to remove CSS information from the main block of HTML code: place the style rules in an external file - everything between the <STYLE> and </STYLE> tags, but not the tags themselves - and then call the file in your HTML pages by using a <LINK> tag in the <HEAD> of each page.
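For illustration, a page that calls an external style sheet might have a HEAD like the one below. The file name site.css is a placeholder for whatever you call your own style sheet.

```html
<HEAD>
<TITLE>Rodent Racing</TITLE>
<!-- site.css holds the style rules that used to sit in a STYLE block -->
<LINK REL="stylesheet" HREF="site.css" TYPE="text/css">
</HEAD>
```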

Move image maps to the Bottom of the page
Image maps are images that contain multiple links. One way to clean up clutter in a page is to move the code that defines the links to the bottom of the Web page, right before the </BODY> tag. Doing so doesn't remove the clutter from the page, but it moves the clutter out from between the top of the page and the page content, making it more likely that search engines will reach the content.

Don't copy and paste from MS Word
Don't copy text directly from Microsoft Word and drop it into a Web page. You'll end up with all sorts of formatting clutter in your page! Here's one way to get around the problem:
· Save the file as an HTML file.
· In your HTML authoring program, look for a Word cleaning tool! Word has such a bad reputation that HTML programs are now starting to add tools to help you clean the text before you use it. Dreamweaver has such a thing, and even Microsoft's own HTML-authoring tool, FrontPage, has one.

Managing Dynamic Web Pages
Pages pulled from databases are known as dynamic pages, as opposed to the normal static pages that don't come from a database. They are dynamic because they are created on the fly, when requested. The page doesn't exist until a browser requests it, at which point the data is grabbed from a database and put together with a CGI, an ASP, or a PHP program. Dynamic pages can create problems. Even the best search engines sometimes don't read them. Yet by the time the searchbot receives the page, the page is already complete. So why don't search engines always read dynamic pages? Because search engines don't want to read them. Here are a few of the problems searchbots can run into reading dynamic pages:
· Dynamic pages often have only minor changes in them. A searchbot reading these pages may end up with hundreds of pages that are almost exactly the same, with nothing more than minor differences to distinguish one from the other.
· The search engines are concerned that databased pages might change frequently, making search results inaccurate.
· Searchbots sometimes get stuck in the dynamic system, going from page to page among tens of thousands of pages. This happens when a Web programmer hasn't properly written the link code, and the database continually feeds data to the search engine, sometimes even crashing your server.
· Hitting a database for thousands of pages can slow down the server, so searchbots often avoid getting into situations in which that is likely to happen.
· Sometimes URLs can change, so even if the search engine does index the page, the next time someone tries to get there, it'll be gone, and search engines don't want to index dead links.

Finding out if your dynamic site is scaring off search engines
You can often tell if search engines are likely to omit your pages just by looking at the URL. Go deep into the site; if it's a product catalog, then go to the furthest subcategory you can find. Then look at the URL. Suppose you have a URL like this:

http://www.yourdomein.edu/rodent-racing-scores/match/index.php

This is a normal URL that should have few problems. It's a static page - or at least looks like a static page, which is what counts. Compare this URL with the next one:

http://yourdomain.com/products/index.html?&DID=18&CATID=13&ObjectGroup_ID=79

This second URL has three parameters (DID, CATID, and ObjectGroup_ID). If you have a clean URL with no parameters, the search engines should be able to get to it. If you have a single parameter, it's probably okay for the major search engines, though not necessarily for older systems. If you have two parameters, it may be a problem, or it may not, although two parameters are more likely to be a problem than a single parameter, and three parameters are certainly a problem. You can also find out if a page in your site is indexed by using the following techniques:
· If you have the Google Toolbar, open the page you want to check and then click the I button and select Cached snapshot of page. Or go to Google and type cache:YourURL, where YourURL is the URL of the page you are interested in. If Google displays a cached page, it's in the index, of course. If Google doesn't display it, move to the next technique.
· Go to Google, type the URL of the page into the text box, and click Search. If the page is in the index, Google displays some information about it.
· Use similar techniques with other search engines if you want to check them for your page.

Fixing Your Dynamic Web Page Problem
So how do you get search engines to take a look at your state-of-the-art dynamic Web site? Here are a few ideas:
· Find out if the database program has a built-in way to create static HTML.
· Modify URLs so they don't look like they are pointing to dynamic pages.
· Use a URL rewrite trick - a technique for changing the way URLs look. In other words, this technique allows you to use what appear to be static URLs, yet still grab pages from a database. This is complicated stuff, so if your server administrator doesn't understand it, it may take him or her a few days to figure it all out.
· Find out if the programmer can create static pages from the database. Rather than creating a single Web page each time it's requested, the database could "spit out" the entire site periodically.
· You can get your information into some search engines by using an XML feed, often known as a trusted feed.
· You can get pages into search engines by using a paid-inclusion program.

Using Session IDs in URLs
A session ID identifies a particular person visiting the site at a particular time, which enables the server to track what pages the visitor looks at and what actions the visitor takes during the session. If you request a page from a Web site - by clicking a link on a Web page, for instance - the Web server that has the page sends it to your browser. Then if you request another page, the server sends that page, too, but the server doesn't know that you are the same person. If the server needs to know who you are, it needs a way to identify you each time you request a page. It does that by using session IDs.
Session IDs are used for a variety of reasons, but the main purpose is to allow Web developers to create various types of interactive sites. For example, if the developers have created a secure environment, they may want to force visitors to go through the home page first. Or the developers may want a way to pick up a session where it left off. By setting cookies on the visitor's computer containing the session ID, the developers can see where the visitor was in the site at the end of the visitor's last session. Session IDs are common when running a software application that has any kind of security, needs to store variables, or wants to defeat the browser cache.
Session IDs can be created in two ways:
· They can be stored in cookies.
· They can be displayed in the URL itself.
Some systems are set up to store the session ID in a cookie but then use a URL session ID if the user's browser is set to not accept cookies. If a search engine recognizes a URL as including a session ID, it probably won't read the referenced page, because the server can handle a session ID in two different ways when the searchbot returns. Each time the searchbot returns to your site, the session ID will have expired, so the server could do either of the following:
· Display an error page, rather than the indexed page, or perhaps the site's default page.
· Assign a new session ID.

Dealing with session IDs is like a magic trick. Sites that were invisible to search engines suddenly become visible! If your site uses URLs with session IDs, you can do various things:
· Instead of using session IDs in the URL, store session information in a cookie on the user's computer. Each time a page is requested, the server can check the cookie to see if session information is stored there.
· Get your programmer to omit session IDs if the device requesting a Web page from the server is a searchbot. The server will deliver the same page to the searchbot but won't assign a session ID, so the searchbot can travel throughout the site without using session IDs.

Examining Cookie-Based Navigation
Cookies - the small text files that a Web server can store on a site visitor's hard drive - can often prove as indigestible to search engines as dynamic Web pages and session IDs. Cookies are sometimes used for navigation purposes. You may have seen crumb trails, a series of links showing where you have been as you travel through the site. Crumb trails look something like this:

Home > Rodents > Rats > Racing

This information is generally stored in a cookie and read each time you load a new page. Or the server may read the cookie to determine how many times you have visited the site or what you did the last time you were on the site, and direct you to a particular page based on that information. If you are using Internet Explorer on Microsoft Windows, follow these steps to see what these cookie files look like:
· Choose Tools?Internet Options from the main menu.
· In the Internet Options dialog box, make sure that the General tab is selected.
· Click the Settings button in the Temporary Internet Files area.
· In the Settings dialog box, click the View Files button.
· Double-click any of these cookie files to view the file's contents; a warning message appears, but ignore it and click Yes.
There is nothing wrong with using cookies, unless they are required in order to navigate through your site. A server can be set up to simply refuse to send a Web page to a site visitor if the visitor's browser doesn't accept cookies. That's a problem for several reasons:
· A few browsers simply don't accept cookies.
· A small number of people have changed their browser settings to refuse to accept cookies.
· Searchbots can't accept cookies.

If your Web site demands the use of cookies, you won't get indexed. That's all there is to it! The searchbot will request a page, your server will try to set a cookie, and the searchbot won't be able to accept it. The server won't send the page, so the searchbot won't index it. How can you check to see if your site has this problem? Change your browser's cookie settings and see if you can travel through the Web site. Here's how:
· Choose Tools > Internet Options from the main menu.
· In the Internet Options dialog box, click the Privacy tab.
· On the Privacy tab, click the Advanced button.
· Select the Override Automatic Cookie Handling check box - if it's not already selected.
· Select both of the Prompt option buttons.
· Click OK to close the Advanced Privacy Settings dialog box.
· In the Internet Options dialog box, click the General tab.
· On the General tab, click the Delete Cookies button. Note that some sites won't recognize you when you revisit them, until you log in again and they reset their cookies.
· Click the OK button in the confirmation message box.
· Click the OK button to close the dialog box.

Now go to your Web site and see what happens. Each time the site tries to set a cookie, you see a message box. Block the cookie and then see if you can still travel around the site. If you can't, the searchbots can't navigate it either.
How do you fix this problem?
· Don't require cookies: Ask your site programmers to find some other way to handle what you are doing with cookies, or do without the fancy navigation trick.
· As with session IDs, you can use a User-Agent script that treats searchbots differently. If the server sees a normal visitor, it requires cookies; if it sees a searchbot, it doesn't.

Fixing Bits and Pieces
Forwarded pages, image maps and special characters can also cause problems for search engines.

Forwarded Pages
Search engines don't want to index pages that automatically forward to other pages. You've undoubtedly seen pages telling you that something has moved to another location and that you can click a link or wait a few seconds for the page to automatically forward the browser to another page. This is often done with a REFRESH meta tag: a <META> tag whose HTTP-EQUIV attribute is set to refresh and whose CONTENT attribute gives a delay in seconds followed by the URL to load.
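A sketch of such a tag, assuming a delay of zero seconds and yourdomain.com as the destination, looks like this:

```html
<HEAD>
<!-- "0" is the delay in seconds; the URL is where the browser is sent -->
<META HTTP-EQUIV="refresh" CONTENT="0; url=http://yourdomain.com">
</HEAD>
```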

With a delay of 0 and a URL of http://yourdomain.com, this meta tag forwards the browser immediately to yourdomain.com. Quite reasonably, search engines don't like these pages. Why index a page that doesn't contain information but forwards visitors to the page with the information? Why not index the target page? That's just what search engines do. If you use the REFRESH meta tag, you can expect search engines to ignore the page.

Image Maps
An image map is an image that has multiple links. You can create the image like this:
<img name="main" src="images/main.gif" usemap="#m_main">

The usemap= parameter refers to the map instructions. You can create the information defining the hotspots on the image - the individual links - by using a <MAP> tag whose NAME matches the usemap= value (m_main here) and which holds one <AREA> tag per link.
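As an illustration, a map for the image above might be defined as follows. The coordinates and the linked page names are invented for the example; only the name m_main comes from the image tag.

```html
<MAP NAME="m_main">
<!-- Each AREA is one hotspot: a rectangle given as left, top, right, bottom -->
<AREA SHAPE="rect" COORDS="0,0,100,50" HREF="scores.htm" ALT="Scores">
<AREA SHAPE="rect" COORDS="100,0,200,50" HREF="events.htm" ALT="Events">
</MAP>
```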

Will search engines follow these links? Many search engines don't read image maps, so use additional simple text links in the document.

Special Characters
Don't use special characters, such as accents, in your text. To use unusual characters, you have to use special codes in HTML, and the search engines generally don't like these codes. If you want to write the word rôle, for example, you can do it in three ways:

R&ocirc;le
R&#244;le
Rôle

The third method displays okay in Internet Explorer but not in a number of other browsers. But you probably should not use any of these forms because the search engines don't like them, and therefore won't index these words. Stick to basic characters.


All Rights Reserved by SEOTOPPERS@2004
