Robots.txt Allows All but Several Subdirectories

I want my site to be indexed by search engines, with the exception of a few subdirectories. These are my robots.txt settings:

robots.txt in the root directory

 User-agent: *
 Allow: /

A separate robots.txt in the subdirectory to be excluded

 User-agent: *
 Disallow: /

Is this correct, or will the root rule override the subdirectory rule?

3 answers

No, this is wrong.

You cannot have a robots.txt file in a subdirectory; crawlers never read it there. Your robots.txt must be placed in the document root of your host.

If you want to prevent crawling of URLs that start with /foo, use this entry in the robots.txt file (http://example.com/robots.txt):

 User-agent: *
 Disallow: /foo

This allows everything to be crawled (so there is no need for Allow) except URLs such as

  • http://example.com/foo
  • http://example.com/foo/
  • http://example.com/foo.html
  • http://example.com/foobar
  • http://example.com/foo/bar
  • ...
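So, for the scenario in the question, a single root robots.txt can block several subdirectories at once. A minimal sketch (the directory names /private/ and /drafts/ are placeholders, not taken from the question):

 User-agent: *
 Disallow: /private/
 Disallow: /drafts/

Note the trailing slash: Disallow: /private/ blocks only URLs inside that directory, while Disallow: /private would also block URLs like http://example.com/private.html, as the /foo examples above show.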

There is also this blanket rule:

 User-agent: *
 Disallow: /

This directive is useful if you are developing a new website and do not want search engines to index your incomplete site. You can also find more detailed information here.


You can manage the excluded subdirectories with the robots.txt file located in the root directory. Make sure the appropriate Disallow patterns are in place.
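To check that your Disallow patterns behave as intended before uploading the file, here is a minimal sketch using Python's standard-library urllib.robotparser (the host example.com and the directory names are placeholders):

 from urllib.robotparser import RobotFileParser

 # Hypothetical rules mirroring the robots.txt sketch above.
 rules = [
     "User-agent: *",
     "Disallow: /private/",
     "Disallow: /drafts/",
 ]

 rp = RobotFileParser()
 rp.parse(rules)

 # Blocked: inside an excluded subdirectory.
 print(rp.can_fetch("*", "http://example.com/private/page.html"))  # False
 # Allowed: everything else on the site.
 print(rp.can_fetch("*", "http://example.com/index.html"))         # True

This is only a local sanity check; real crawlers fetch the file from http://example.com/robots.txt themselves.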


Source: https://habr.com/ru/post/1213305/

