This article will discuss the general structure of a website – folder structure, filenames, domain names and pages, and how content should be crafted on pages.
Structure by Theme and Topic
The general subject or category of your website dictates its theme. Loosely stated, the theme of your website is your Primary Keyword Phrase.
Ideally, your site is only about one major subject or category. If you have more than one major subject for your site, say, for example, you sell baby diapers AND garage door openers, you should strongly consider creating multiple sites, one per subject.
The main idea is to separate content onto different pages by topic (keyword phrase) within your site. Suppose that a site sells house plans online and that is the theme of the site (its Primary Keyword Phrase). This site also sells country house plans, garage plans, and duplex plans, and let’s say for this example that each page of the site mentions all three plan types.
However, what is each page’s specific topic? The different plan types have been mentioned on multiple pages, so each page contains the keywords country house plans, garage plans, and duplex plans. None of the three plan types would be strongly relevant on any of these pages for Google.
The correct way to structure this site is to have one page that discusses only country house plans, another page that discusses only garage plans, and a third page that discusses only duplex plans. Each page is now strongly relevant for one keyword phrase. No “dilution” occurs in any of the pages, and each page should subsequently fare better in the rankings for its particular keyword phrase. This is important.
Next, you would add links on each page so that garage plan pages link only to other garage plan pages, duplex plan pages link only to duplex plan pages, and so forth. By using the applicable keyword phrase in the link text (the clickable part of the link), you can also help strengthen the importance of each page. We’ll discuss in greater detail later how to link pages correctly.
To properly structure a site that offers different products, services, or content categories, you should split the content onto different pages. You ideally want a single topic, or keyword phrase, applied per page.
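The linking rule described above can be sketched in a few lines of Python. This is only an illustration: the file names, topics, and link text in PAGES are hypothetical, and the function simply lists links to other pages that share the current page’s topic, using a keyword phrase as the link text.

```python
from html import escape

# Hypothetical pages: (file name, topic, keyword phrase used as link text).
PAGES = [
    ("garage-plans.html", "garage plans", "garage plans"),
    ("two-car-garage-plans.html", "garage plans", "two-car garage plans"),
    ("duplex-plans.html", "duplex plans", "duplex plans"),
]

def same_topic_links(current_file, pages=PAGES):
    """Return HTML links to the other pages that share current_file's topic."""
    topics = {name: topic for name, topic, _ in pages}
    current_topic = topics[current_file]
    return [
        f'<a href="{escape(name, quote=True)}">{escape(text)}</a>'
        for name, topic, text in pages
        if topic == current_topic and name != current_file
    ]
```

Calling same_topic_links("garage-plans.html") returns a single link to the two-car garage page; the duplex page is left out because it belongs to a different topic.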
Create Some Pages With Content
Websites with lots of pages generally rank better than sites with just a few pages, all other things being equal. It is better to have a 50-page site with short pages than a 5-page site with long, flowing pages. Each page should, however, contain a minimum of about 200 visible words of text to maximize relevance with Google.
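One rough way to check the 200-word guideline is to count the words a visitor can actually see. The sketch below uses Python’s standard html.parser module and assumes that “visible” means any text outside <script> and <style> blocks; it does not account for text hidden with CSS.

```python
from html.parser import HTMLParser

class VisibleTextCounter(HTMLParser):
    """Counts words in text nodes, skipping <script> and <style> contents."""
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Only count text that is not inside a skipped element.
        if self.skip_depth == 0:
            self.words += len(data.split())

def visible_word_count(html):
    counter = VisibleTextCounter()
    counter.feed(html)
    counter.close()
    return counter.words
```

A page whose count comes back well under 200 is a candidate for more content.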
Also, you need pages with real content – don’t create just a lot of “fluff” pages that are standard fare anyway – About Us page, Contact Us page, Our Mission page, etc.
Break up your pages using headings (<h1>) and subheadings (<h2>), and include your keywords in these headings. Not only will it help visitors read your pages by providing visual separators, it will give your pages more relevance with Google.
Don’t create pages that are identical or nearly so in content. Google may consider them to be duplicates and your site may be penalized. Pages full of high-quality, unique, keyword-rich content are a must. Be careful if you use both HTML and PDF versions of the same content. Google will index both.
To prevent this, create a robots.txt file and place it in the main (root) directory on your server. A robots.txt file specifies which directories and file types to exclude from crawling. If your PDF files are duplicates of your HTML files, put all the PDF files in a different directory and specify that this directory be excluded from crawling.
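As a minimal sketch, assuming the duplicate PDF files live in a folder named /pdf/ (a hypothetical name), the robots.txt rules and a quick check of them might look like this. The check uses Python’s standard urllib.robotparser module; the example.com URLs in the usage note are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: keep all crawlers out of the /pdf/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""

def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, is_crawlable(ROBOTS_TXT, "http://www.example.com/pdf/garage-plans.pdf") is False, while the HTML version at "http://www.example.com/garage-plans.html" remains crawlable.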
For more information on creating a robots.txt file, see
Here are some standard pages you should consider for your site:
• Home page
• Your main product, service, or content pages (this is the meat of your site)
• FAQ page(s) (Frequently Asked Questions) or Articles pages
• Sitemap page (links to each page on your site)
• About Us page
• Contact Us page
• Related Links page(s) (discussed later)
• Link to Us page (discussed later)
• Testimonials page
• Ordering page
Don’t Nest Your Pages Too Deeply
When Google crawls your site, it typically starts at the home page and then follows each link on the page to all your other pages. Google, in turn, finds your home page by following a link on another website that points to your site.
Google seems to attach more importance to files that are closer to the root folder on your server – the folder on your Web server where the home page file is located. Some web designers, however, create multiple folders and subfolders on the server for ease in maintaining lots of files.
Google may not value pages located in subfolders as strongly as files located in the root folder. In general, Google doesn’t like to index pages that are more than about three folder levels deep. Ideally, all pages should live in the same folder as your home page or at most be one level deep.
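To audit how deeply your pages are nested, you can count the folders in each URL. The sketch below is one simple way to do that in Python; it assumes that a final path segment containing a dot is a file name rather than a folder, which will not hold for every site.

```python
from urllib.parse import urlparse

def folder_depth(url):
    """Number of folders between the site root and the page file."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Assumption: a last segment with a dot (e.g. "index.html") is a file.
    if segments and "." in segments[-1]:
        segments = segments[:-1]
    return len(segments)
```

A page in the root folder has a depth of 0; anything deeper than about 3 is worth moving closer to the home page.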
Don’t Bloat Your Pages With Code
Google has a time limit that it sets to crawl sites. If you have a very large site, Google may not have time to crawl all pages during the first pass. This problem can be minimized if you keep the code of your web pages lean and clean.
This also makes your pages download faster, which improves the visitor experience. Studies show that you lose 10% of your visitors for every second it takes your page to load. After about 5 seconds, you might as well forget it – most people will have left your site. Remember there is still a percentage of people who use dial-up modems – particularly outside of the US. This will not change real soon, despite the hype over broadband.
In addition, create a style sheet (.css) file that contains your font information and link to it from each page.
<link rel="stylesheet" href="YourFile.css">
Stay Away From Frames and Flash
No popular websites use frames and neither should you. Yes, they provide some degree of navigational ease, and yes, there are workarounds, but search engines simply cannot properly crawl framed sites. In addition, visitors can’t bookmark any interior pages of your site or link to them. There are some that still beat this dead horse, but framed sites simply have too many negatives to contend with. Don’t do it.
Same goes for sites whose entire home page is a Flash movie. How many times have YOU actually watched a Flash movie when arriving on a home page? If you are like most, you’ve clicked “Skip Intro” as quickly as possible. We are all busy and to wait for a gratuitous Flash movie to download is downright annoying – especially each and every time we visit the site. The only people who care about Flash are Adobe, the Flash developer that you paid, and the CEO or Marketing Director who enjoys it for the coolness factor. Google can index Flash somewhat successfully, but this doesn’t mean it’s going to boost your page ranking or increase sales.
If you must use Flash, confine it to a small location on your page or provide a link to it. Flash movies that take up the entire web page do have their uses but the homepage is not one of them. If you do use Flash on a page, make sure to add the following code:
<NOEMBED>My keyword-rich content</NOEMBED>
Pay Attention To Your Dynamic Page URLs
Many sites today display content dynamically from a database. Common examples include search engines on a site that return directory pages, product pages, shopping cart pages, or news articles. Some content management software also produces pages with dynamic URLs. All dynamic pages can be identified by the “?” symbol in the URL, such as
Google can crawl and index dynamic pages as long as you don’t have more than 2 parameters in the URL (the example above has two parameters separated by the “&” symbol). Even so, Google may not spider your dynamic pages for some time. Spiders do not want to get caught in a loop of trying to index hundreds of thousands of potential pages.
Google will not follow links that contain session IDs embedded in them.
Specifically, Google will not index pages that include "&id=" in the URL string, whether you actually use session IDs or not. This means that if you have a dynamic site that generates multiple-parameter URL strings, you should strongly consider changing your server code to use a string other than "id" for generating dynamic URLs. Don’t use anything that uses "id" anywhere in the string, including sessid, rid, pid, id1, etc.
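The two rules above (no more than 2 parameters, and no parameter name containing "id") are easy to check automatically. The following Python sketch applies them to a URL; the function name and the example URLs in the usage note are hypothetical, and the checks only mirror the guidelines given in this article.

```python
from urllib.parse import urlparse, parse_qsl

def dynamic_url_warnings(url, max_params=2):
    """Return a list of warnings for a dynamic URL, empty if none apply."""
    params = parse_qsl(urlparse(url).query, keep_blank_values=True)
    warnings = []
    if len(params) > max_params:
        warnings.append(f"{len(params)} parameters (more than {max_params})")
    # Flag any parameter name with "id" anywhere in it (sessid, pid, id1, ...).
    id_like = [name for name, _ in params if "id" in name.lower()]
    if id_like:
        warnings.append("parameter names containing 'id': " + ", ".join(id_like))
    return warnings
```

For example, "http://www.example.com/catalog.php?category=garage&item=42" produces no warnings, while "http://www.example.com/catalog.php?cat=1&pid=7&color=red" is flagged both for having three parameters and for the name pid.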
A simple solution is to create static pages with hard-coded links to your most important dynamic pages whenever possible. You can create a series of sitemap pages just for this purpose. Yes, it can be tedious if you have hundreds or thousands of products, but it is worth the effort. You want to make it as easy as possible for Google to find all your important pages. This has the added benefit of helping your visitors find a specific product page – be sure to use the product name or keyword in the link text.
A more advanced technique involves installing a script on your server that changes a dynamic URL to a static page, whereby each parameter name is translated to a folder name. This method varies by server platform and is something a more experienced webmaster should implement. For the Apache platform, it can be as simple as creating a .htaccess file that contains regular expressions. Do a search on Google and you’ll find a number of ways to do URL rewriting (also called mod_rewrite or server rewriting).
All search engines prefer static pages over dynamic pages. If you have a large site with lots of dynamic pages, you should consider URL rewriting, as dynamic pages can take months longer to be indexed and then ranked in Google. And once indexed, Google will not re-crawl dynamic pages as often as static pages.
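To make the idea concrete, here is a small Python sketch of the mapping only: it turns each parameter name and value into a folder-style segment. The function name and URL layout are hypothetical, and the actual rewriting would still be done by your server (for example with Apache rewrite rules), not by this code.

```python
from urllib.parse import urlparse, parse_qsl, quote

def static_style_path(dynamic_url):
    """Map script.ext?name=value&... to /script/name/value/.../ (sketch only)."""
    parts = urlparse(dynamic_url)
    script = parts.path.rsplit("/", 1)[-1]   # e.g. "catalog.php"
    base = script.rsplit(".", 1)[0]          # e.g. "catalog"
    segments = [base]
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        # Each parameter name and its value become two path segments.
        segments.extend([quote(name, safe=""), quote(value, safe="")])
    return "/" + "/".join(segments) + "/"
```

Under these assumptions, "http://www.example.com/catalog.php?category=garage&item=42" maps to "/catalog/category/garage/item/42/". Any folders in the original path are dropped to keep the sketch short.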
Keywords in File Names
Google does look to see if keywords are used in the filenames of your web pages, but this is not an important factor and its overall influence on your ranking is very small.
When naming files, separate each word with a hyphen; otherwise Google will not be able to recognize the phrase and will think it is a single word.
As a general rule of thumb, don’t use more than two hyphens. More than that looks spammy, and Google may take a closer look at your site for other possible issues.
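Both naming rules can be applied with a short helper. The Python sketch below builds a hyphenated filename from a keyword phrase and flags names with more than two hyphens; the function names and the .html extension are assumptions made for illustration.

```python
import re

def keyword_filename(phrase, extension="html"):
    """Turn a keyword phrase into a lowercase, hyphen-separated filename."""
    words = re.findall(r"[a-z0-9]+", phrase.lower())
    return "-".join(words) + "." + extension

def too_many_hyphens(filename, limit=2):
    """True if the filename uses more hyphens than the rule of thumb allows."""
    return filename.count("-") > limit
```

For instance, keyword_filename("Country House Plans") gives "country-house-plans.html", which stays within the two-hyphen limit, whereas "cheap-country-house-plans.html" would be flagged.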
About Google Sitemaps
Google Sitemaps uses a special file that lists all the pages on your site and tells Google whether your content has changed and whether you have added new pages. While this is a neat feature, many sites don’t need to use it. Keep in mind that Google will find your site and pages by following links.
With that said, some dynamic sites and other websites that have had problems getting their pages indexed (think Flash) may find it helpful. If your website is well-designed with clean internal links and a standard sitemap page, there is no need to use Google Sitemaps.
Bear in mind that once you are signed up for the Google Sitemaps program, you are committed to updating the Sitemap XML file on a regular basis, which can be a drain on your time.
In this regard, it is somewhat of a crutch for webmasters who have a messy or search engine "unfriendly" site and don’t want to change their site. Your time would be better spent fixing your site so that it can be crawled completely by all the search engines, and employing SEO best practices, than continually updating an XML file. You may be able to get Google to crawl some of your new pages quicker, but that doesn’t mean it will rank your pages any faster.
Remember, having a page in their index doesn’t equate to that page being ranked. For more information on the Google Sitemaps program, go to http://www.google.com/webmasters/sitemaps/docs/en/about.html
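If you do decide to use a Sitemap file, generating it with a script avoids hand-editing the XML each time a page changes. The Python sketch below is a minimal example that assumes the sitemaps.org 0.9 XML format (urlset, url, loc, and lastmod elements); the page URLs and dates you pass in are your own, and the ones in the usage note are placeholders.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Build Sitemap XML from (url, last_modified) pairs, dates as YYYY-MM-DD."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")
```

Calling build_sitemap([("http://www.example.com/", "2024-01-15")]) returns a one-entry Sitemap as a string, which you would save to a file in your root folder.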