Why robots.txt: The robots.txt file lets you tell search engines which pages of your site they should not crawl. Many sites contain files that are not relevant to search engines (such as images or admin files), and a robots.txt file lets you keep crawlers away from them. You can also use a robots.txt file to block specific search engines.
A robots.txt is a simple text file that can be created with any plain text editor, such as Notepad. If you are using WordPress, a sample robots.txt file would be:
User-agent: *
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
“User-agent: *” means that all search bots (from Google, Yahoo, MSN and so on) should follow these instructions when crawling your website. Unless your website is complex, you will not need to set different instructions for different spiders.
“Disallow: /wp-” keeps search engines from crawling the WordPress files. This line excludes every file and folder whose path starts with “/wp-” (such as /wp-admin/ and /wp-includes/), which avoids duplicate content and keeps admin files out of the crawl.
If you are not using WordPress, just replace the Disallow lines with the files or folders on your website that should not be crawled, for instance:
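If you want to see which paths the sample WordPress rules actually block, you can test them before uploading anything. The following is only a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the example paths are made up for illustration, and real search engines may interpret some rules slightly differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The sample WordPress rules from this post.
RULES = """\
User-agent: *
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
"""


def is_allowed(path, agent="Googlebot"):
    """Return True if `agent` may crawl `path` under RULES (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, path)


# Illustrative paths only; substitute paths from your own site.
for path in ["/about/", "/wp-admin/", "/wp-content/uploads/logo.png", "/feed/"]:
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Note that the “/wp-” prefix also covers /wp-content/, so anything stored there (uploaded images, for example) is blocked as well.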
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
Disallow: /any other folder to be excluded/
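If you have many folders to exclude, the file can also be generated instead of typed by hand. This is just a sketch: the function name build_robots_txt and the folder list are assumptions for illustration, not part of any standard tool.

```python
def build_robots_txt(excluded_paths, user_agent="*"):
    """Build robots.txt content that disallows each given path (hypothetical helper)."""
    lines = [f"User-agent: {user_agent}"]
    for path in excluded_paths:
        # Paths in robots.txt are relative to the site root, so add a
        # leading slash when one is missing.
        if not path.startswith("/"):
            path = "/" + path
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


# Example folders only; list the folders of your own site here.
content = build_robots_txt(["/images/", "cgi-bin/"])
print(content)
```

The resulting text can then be saved as robots.txt.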
After you have created the robots.txt file, just upload it to the root directory of your site (so that it is reachable at /robots.txt) and you are done!
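As mentioned at the start, a robots.txt file can also block one specific search engine while leaving the others alone. Here is a rough sketch of how such a file could look and how to check it with Python's urllib.robotparser. “BadBot” is a made-up crawler name and the allowed helper is hypothetical. Keep in mind that only crawlers that respect robots.txt will obey it.

```python
from urllib.robotparser import RobotFileParser

# "BadBot" is a made-up crawler name used only for this example.
RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(agent, path):
    """Return True if `agent` may crawl `path` under RULES (hypothetical helper)."""
    return parser.can_fetch(agent, path)


print("BadBot    /index.html:", allowed("BadBot", "/index.html"))
print("Googlebot /index.html:", allowed("Googlebot", "/index.html"))
print("Googlebot /cgi-bin/x :", allowed("Googlebot", "/cgi-bin/x"))
```

The first group shuts the named crawler out of the whole site, while the “User-agent: *” group still applies to everyone else.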
