Monday, 11 July 2011

What is robots.txt?


robots.txt is a plain text file (not an HTML file) that you put on your site to tell search robots which pages you would like them not to visit. Suppose your website has many pages and there is one that you do not want search engines to visit. You can list that page in robots.txt so that well-behaved search engines will not crawl it. Keep in mind that robots.txt is only a request: the file itself is publicly readable and robots are free to ignore it, so truly sensitive data should also be protected with a password or other access control.

The location of robots.txt is very important. It must be in the main (root) directory of the site. For example, if your website is mydomain.com, the file must be named robots.txt and be reachable at http://mydomain.com/robots.txt. Robots look for the file only at that location. If they do not find it there, they simply assume that the site has no robots.txt file and crawl everything they find along the way.
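As a rough illustration of the location rule, the following Python sketch works out where a robot would look for the file, starting from any page on a site. The helper name robots_txt_url and the sample page path are made up for this example; only the standard urllib.parse module is used.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the address where a robot would look for robots.txt.

    Hypothetical helper for illustration: it keeps the scheme and host
    of the page and replaces everything else with /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The page path below is invented; any page on the site gives the same answer.
print(robots_txt_url("http://mydomain.com/some/folder/page.html"))
```

However deep the page sits in the site, the result is always the file in the main directory, here http://mydomain.com/robots.txt.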

EXAMPLE:

1)      Here's a basic "robots.txt":

User-agent: *
Disallow: /

In the example above, User-agent: * means the rule applies to all robots, and Disallow: / means that no page should be visited, so compliant search engines will not crawl any page of the site.
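One way to check how a rule set like this is read is Python's standard urllib.robotparser module. The sketch below feeds it the two lines from example 1 and asks whether a robot may fetch some pages. The robot name "ExampleBot" and the page paths are assumptions made for the illustration, not real crawlers or pages.

```python
from urllib.robotparser import RobotFileParser

# The same two lines as in example 1.
BLOCK_ALL_RULES = """\
User-agent: *
Disallow: /
"""

block_all = RobotFileParser()
block_all.parse(BLOCK_ALL_RULES.splitlines())

# "ExampleBot" and the page paths are assumptions for this sketch.
for url in ("http://mydomain.com/", "http://mydomain.com/any/page.html"):
    allowed = block_all.can_fetch("ExampleBot", url)
    print(url, "allowed" if allowed else "blocked")
```

Both addresses are reported as blocked, because every path on the site begins with /.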

2)      Suppose your website is http://www.shmoop.com/ and it contains the page http://www.shmoop.com/shakespeare/, which you do not want search engines to visit. Your robots.txt will be:

User-agent: *
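Disallow: /shakespeare/

The leading / stands for the root of your domain, i.e. http://www.shmoop.com, and /shakespeare/ means search engines will not visit that page or anything else whose path begins with /shakespeare/. Paths in robots.txt are case-sensitive, so the rule must match the capitalization of the URL exactly.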
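The effect of a single-directory rule can be tried out the same way with Python's standard urllib.robotparser module. This sketch assumes the rule is written in lowercase to match the page URL from example 2. The robot name "ExampleBot", the helper is_allowed and the hamlet sub-page are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rule written in lowercase so that it matches the page URL from example 2.
SHAKESPEARE_RULES = """\
User-agent: *
Disallow: /shakespeare/
"""

shakespeare_parser = RobotFileParser()
shakespeare_parser.parse(SHAKESPEARE_RULES.splitlines())


def is_allowed(url, agent="ExampleBot"):
    """Hypothetical helper: True if the agent may fetch the URL."""
    return shakespeare_parser.can_fetch(agent, url)


print(is_allowed("http://www.shmoop.com/shakespeare/"))         # the blocked page
print(is_allowed("http://www.shmoop.com/shakespeare/hamlet/"))  # invented sub-page
print(is_allowed("http://www.shmoop.com/"))                     # rest of the site
print(is_allowed("http://www.shmoop.com/Shakespeare/"))         # different capitalization
```

The first two checks return False and the last two return True: the rule blocks everything under /shakespeare/, leaves the rest of the site open, and does not match a path with different capitalization.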

