Before your site can be listed in Google’s search database (also called the Google index), Google has to visit it using automated programs called bots, robots, or spiders. These programs “read” the pages of your website, typically starting with your home page and then following each link to the other web pages on your site. When a search engine robot or spider visits your site, it is said to crawl or spider your site.
Important: Google will not add a new web page to its index unless there is at least one other web page in its index that links to that page. So don’t fret over submitting your site to Google directly. Instead, you need to get another website to link to your website first.
Website crawls are performed by the main Google spider, called Googlebot. The more “popular” your site, the more often Google typically crawls it. Highly ranked sites and sites that update content frequently (like news and blog sites) get crawled daily.
If you are interested, you can check your server log files for the user-agent “Googlebot”. This will tell you when Google crawls your site. You can also check by IP address, but this method is less reliable because Google uses many different IP addresses for its robots, and these can change over time.
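As a rough illustration of the log check described above, here is a minimal Python sketch. It assumes your server writes the common "combined" access log format used by Apache and nginx; the function name googlebot_hits and the sample log lines are hypothetical and only there for demonstration. Keep in mind that the user-agent string is reported by the visitor itself, so a match is a hint rather than proof that the visit came from Google.

```python
import re

# Assumed layout (combined log format):
# ip - - [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Return (time, ip, request) for each line whose user-agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        # Skip lines that do not fit the assumed format.
        if match and "googlebot" in match.group("agent").lower():
            hits.append((match.group("time"), match.group("ip"), match.group("request")))
    return hits

# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2005:13:55:36 -0700] "GET /index.html HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [10/Oct/2005:14:01:02 -0700] "GET /about.html HTTP/1.1" '
    '200 1500 "http://example.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

if __name__ == "__main__":
    for time, ip, request in googlebot_hits(SAMPLE_LOG):
        print(time, ip, request)
```

To run it against a real log, you would pass an open file instead of SAMPLE_LOG, for example googlebot_hits(open("access.log")), where the file name depends on how your server is set up.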
Google updates its main index more or less continuously, although major “updates” still happen several times a year. These major updates correspond to major changes in the ranking algorithm. They have all been named: you may have heard about Florida, Bourbon, Allegra or Jagger in the forums.
For new websites, I advise you to make your site live as quickly as possible, even before it is complete. Given that Google prefers older sites, it no longer makes sense to wait until every "i" is dotted and every "t" is crossed before going live with a new site. Instead, create an overall skeleton of your site, with a reasonably finished Home page and other important pages, and make it live. Then add new content, or update existing content, at least once a month, because Google also prefers sites that add or update content regularly.
This strategy has to do with what is called the Google Sandbox, also known as the aging factor or aging delay. The Sandbox is a set of filters applied to new websites that prevents a site from ranking well (or at all) for any competitive keywords for 6 to 24 months. New sites can rank well for very niche, unique keyword phrases, such as their company name, but that’s about it. It is for this reason that new sites need to be made live on the Web as soon as possible in order to “start the aging clock”.
Important: It is critical that your website is up and running when Google visits it by following a link from another site. If your site is down, your listing on Google may disappear until the next update! The reason is that Google concludes your site doesn’t exist and may remove it from the index after a couple of failed attempts.