How to configure robots.txt to allow everything?

My robots.txt in Google Webmaster Tools displays the following values:

    User-agent: *
    Allow: /

What does this mean? I don't know much about this, so I am looking for your help. I want to allow all robots to crawl my site; is this the correct configuration?

+110
robots.txt
Nov 25 '10 at 12:16
4 answers

This file will allow all crawlers access:

    User-agent: *
    Allow: /

This basically grants all user agents (*) access to all parts of the site (/).
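If you want to double-check how a crawler would interpret that policy, one way (not part of the original answer) is to feed it to Python's standard urllib.robotparser, which understands Allow lines; the bot names and URLs below are only placeholders:

    from urllib.robotparser import RobotFileParser

    # The policy from above, split into the two lines a robots.txt actually contains.
    rules = [
        "User-agent: *",
        "Allow: /",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    # Under this policy, any crawler may fetch any path.
    print(parser.can_fetch("Googlebot", "https://example.com/"))            # True
    print(parser.can_fetch("SomeOtherBot", "https://example.com/private"))  # True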

+144
Nov 25 '10 at 12:23

If you want to allow every bot to crawl everything, this is the best way to specify it in your robots.txt:

    User-agent: *
    Disallow:

Note that the Disallow field has an empty value, which, according to the specification, means:

Any empty value indicates that all URLs can be retrieved.




Your way (with Allow: / instead of Disallow:) also works, but Allow is not part of the original robots.txt specification, so it is not supported by all bots (many popular ones do support it, though, such as Googlebot). That said, unrecognized fields have to be ignored, and for bots that do not recognize Allow the result would be the same in this case anyway: if nothing is forbidden from being crawled (via Disallow), everything is allowed to be crawled.
However, formally (according to the original specification) this is an invalid record, because at least one Disallow field is required:

At least one Disallow field needs to be present in a record.
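As a quick sanity check (an illustration, not part of the original answer), the equivalence of the two forms can be demonstrated with Python's standard urllib.robotparser; the bot name and URLs are placeholders:

    from urllib.robotparser import RobotFileParser

    def allows_everything(robots_txt):
        # Parse the given robots.txt text and probe a couple of sample paths.
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        urls = ["https://example.com/", "https://example.com/some/deep/page"]
        return all(parser.can_fetch("SomeBot", url) for url in urls)

    print(allows_everything("User-agent: *\nDisallow:"))  # True
    print(allows_everything("User-agent: *\nAllow: /"))   # True

Both print True with that parser; bots that ignore Allow fall back to the "nothing is disallowed" reading described above.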

+52
Jun 09 '17 at 21:48

I understand that this is a rather old question and there are pretty good answers to it. But here are my two cents to complete the picture.

According to the official documentation, there are four ways to give robots full access to crawl your site.

Clean:

Specify a global matcher with an empty disallow segment, as mentioned by @unor. So your /robots.txt looks like this:

    User-agent: *
    Disallow:

Hack:

Create a /robots.txt file with no content at all, which by default will allow everything for all types of bots.

I don't care:

Do not create a /robots.txt file at all, which should give exactly the same result as the two options above (the sketch after the last option below illustrates this).

Ugly:

From the documentation on robots meta tags, you can use the following meta tag on all the pages of your site to tell bots that those pages are not supposed to be indexed.

 <META NAME="ROBOTS" CONTENT="NOINDEX"> 

For this to apply to your entire site, you would need to add this meta tag to every one of your pages, and the tag must be placed inside the HEAD section of each page. Read more about this meta tag here.
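To illustrate the second and third options above (an empty file, or no file at all): at least with Python's standard urllib.robotparser, a rule-less robots.txt parses to "no restrictions", and most crawlers treat a missing /robots.txt (a 404) the same way. A minimal sketch, with placeholder names:

    from urllib.robotparser import RobotFileParser

    parser = RobotFileParser()
    parser.parse([])  # no rules at all, as with an empty /robots.txt

    # With nothing disallowed, any bot may fetch any path.
    print(parser.can_fetch("AnyBot", "https://example.com/anything"))  # True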

+16
Dec 25 '17 at 6:58

This means that you are allowing every (*) user agent/crawler to access the root (/) of your site. You are fine.

+7
Nov 25 '10 at 12:24


