Most of the time we want our website or blog content to be indexed by search engines to gain more visibility on the Internet. But in some cases we don’t want our content to be indexed, or we want only a specific portion indexed. If you have been worried about your personal content (a personal images folder, an admin folder, or a folder like cgi-bin) getting indexed, here is the solution.
Robots.txt, sometimes called the Robots Exclusion Protocol, is a file used to exclude content from the crawling process of search engine spiders/bots. If you are familiar with robots.txt, you must be acquainted with its directives, such as Allow, Request-rate, Crawl-delay, Visit-time, etc.
ROBOTStxt is a spruced-up freeware tool with a very snappy interface for creating and editing a robots.txt file. With its help you can keep your web/blog content secure and prevent search engines from prying into your personal content.
Here is an example of a robots.txt file:
User-agent: * (the rules apply to all search engine spiders)
Disallow: /secretcontent/ (disallows them from crawling the secretcontent folder)
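To see how a well-behaved crawler interprets rules like these, here is a minimal sketch using Python’s standard urllib.robotparser module. The folder names and Crawl-delay value are hypothetical, added just to illustrate the directives mentioned above:

```python
import urllib.robotparser

# A hypothetical robots.txt, mirroring the example above plus a Crawl-delay
rules = """\
User-agent: *
Disallow: /secretcontent/
Crawl-delay: 10
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())  # parse the rules line by line

# A compliant spider checks each URL against the rules before fetching it
print(rp.can_fetch("*", "/secretcontent/page.html"))  # False - blocked
print(rp.can_fetch("*", "/public/page.html"))         # True - allowed
print(rp.crawl_delay("*"))                            # 10 seconds between requests
```

Note that robots.txt is advisory: polite crawlers honor it, but it is not an access-control mechanism, so truly sensitive folders should also be protected on the server side.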
Choose a robots.txt file from the File menu: hit ‘Open From www’ to open it from a website, or simply hit Open to choose it from disk. Once the file is open, use the ‘Add/Modify Entry’ drop-down box to pick any directive and set its value.
By choosing a user agent from the list, you can allow or disallow that agent or set its attributes.
As you can see, I have added two user agents and disallowed them from my folders. Any of the directives can be chosen from the drop-down list and applied to a user agent.
It requires .NET Framework 2.0 or later and works on Windows 2000, Windows XP, Windows 2003, Windows Vista, and Windows 7.