The Difference between Indexing and Crawling for SEO

Many webmasters get the terms “indexing” and “crawling” mixed up. The two terms describe separate actions performed by search engines: crawling is how a search engine discovers and reads your pages, and indexing is how it stores them so they can appear in results. The way a search engine crawls a website can affect how the website is indexed, and this subtle difference confuses new webmasters who keep track of their indexed pages. Here is a brief overview of what each term means and how each action affects search engine rank.

Search Engine Crawling

Crawling occurs when the search engine sends a “bot” to find your pages. A bot is a simple program that crawls the web and looks for content pages. A bot can find a page on your website in several ways, including backlinks from external websites, social networking links, a submitted sitemap, or search engine submission tools.

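Bots emulate a browser when accessing your website, but they send a value called the “user agent” (the User-Agent request header) to tell your server that the visitor is a bot and not an actual user. You can inspect this user agent value to detect when a bot, rather than a human, is requesting your pages. Keep in mind that any client can send any user agent string, so treat the value as a hint rather than proof.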
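As a rough illustration of user agent detection, the Python sketch below checks a User-Agent string for substrings used by a few well-known crawlers. The token list and the function name is_probable_bot are assumptions made for this example, not an official or complete list, and a real site would usually combine this check with other verification.

```python
# Illustrative (not exhaustive) substrings that appear in the User-Agent
# strings of some well-known search engine crawlers. This list is an
# assumption for the example; extend it for the bots you care about.
KNOWN_BOT_TOKENS = ("googlebot", "bingbot", "duckduckbot", "baiduspider", "yandexbot")


def is_probable_bot(user_agent):
    """Return True if the User-Agent string contains a known crawler token."""
    ua = (user_agent or "").lower()  # tolerate a missing header
    return any(token in ua for token in KNOWN_BOT_TOKENS)


# Sample User-Agent strings for demonstration.
googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)

print(is_probable_bot(googlebot_ua))
print(is_probable_bot(browser_ua))
```

In a web application you would pass the value of the incoming request's User-Agent header to this function, for example to separate bot visits from human visits in your traffic reports.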

For low-quality pages or pages you don’t want indexed, you can block a bot from crawling parts of your site using a file called “robots.txt.” Robots.txt uses file and directory rules to tell bots which areas of your site they should not crawl. However, robots.txt is only a request that well-behaved bots follow, so it is not sufficient to protect sensitive information. If any private information is stored in a directory, make sure you restrict access with server permissions or a login in addition to blocking the directory in the robots.txt file.

For example, the following robots.txt syntax blocks the directory “blockme” from being crawled:
User-agent: *
Disallow: /blockme/

The asterisk means “all bots.” However, you can change the asterisk to a specific bot name to only block certain search engines and not others.
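If you want to check how a set of robots.txt rules would be read, Python’s standard urllib.robotparser module can parse them. The sketch below feeds it the rules from the example above, plus a second hypothetical rule set that targets only one bot. The example.com URLs and the choice of “Bingbot” as the blocked bot are assumptions for illustration, and real search engines may interpret edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above: block /blockme/ for all bots.
ALL_BOTS_RULES = """\
User-agent: *
Disallow: /blockme/
"""

# Hypothetical variant: block /blockme/ for one named bot only.
ONE_BOT_RULES = """\
User-agent: Bingbot
Disallow: /blockme/
"""


def build_parser(robots_txt):
    """Parse robots.txt text and return a RobotFileParser for it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


all_bots = build_parser(ALL_BOTS_RULES)
one_bot = build_parser(ONE_BOT_RULES)

blocked_url = "https://example.com/blockme/page.html"
open_url = "https://example.com/blog/post.html"

# With the asterisk, every bot is asked to stay out of /blockme/.
print(all_bots.can_fetch("Googlebot", blocked_url))
print(all_bots.can_fetch("Googlebot", open_url))

# With a named bot, only that bot is asked to stay out.
print(one_bot.can_fetch("Bingbot", blocked_url))
print(one_bot.can_fetch("Googlebot", blocked_url))
```

The can_fetch method returns False when the rules ask the named bot not to crawl the URL and True otherwise, which makes it a quick way to test a robots.txt change before you publish it.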

Search Engine Indexing

After a search engine bot crawls your site, it can take anywhere from a few minutes to several days for the content to be indexed. Popular sites are typically indexed more quickly than new websites. For instance, CNN articles are indexed within minutes of being published, while your new blog or content site might take a few days. Search engines use a variety of signals when indexing a website, and the age of the site, its popularity, and duplicate content all play a role in which of your pages are indexed.

The term “indexed” means the search engine has stored your page and can display it when a user performs a relevant query. Your page might display several pages below other websites, but it is formally a part of the search engine’s result set.

Of course, ranking and indexing are two separate issues. Your search engine rank is determined using several hundred factors. Do not get ranking and indexing confused. Ranking takes a lot of hard work, and you should understand search engine guidelines before tackling competitive ranking.

When you evaluate your website, understanding the difference between indexing and crawling will help as you dive into search engine optimization. SEO is not difficult once you understand the terminology. Just remember to create a quality site that is focused on users rather than search engines.
