Can disallowing the entire site in robots.txt have consequences after the rule is removed?

I published a website and, due to misunderstandings beyond my control, I had to block all pages before they were indexed. Some of these pages had already been shared on social networks, so to avoid a bad user experience I decided to paste the following into robots.txt:

    User-agent: *
    Disallow: *

I got a "critical issue" warning in webmaster tools, and I'm a little worried about it. In your experience, would restoring the original robots.txt be enough to fix this (if that is even possible)? Can the current situation have lasting consequences (penalties or the like) for the website if it stays in place for a long time, and if so, how can I fix it? I apologize if the question seems a little general, but I cannot find specific answers. Thanks in advance.

2 answers

A "critical issue" occurs because Google cannot index pages on your site using the robots.txt configuration. If you are still developing a site, the standard procedure has this robots.txt configuration. Webmaster tools treat your site as if it were in production, but it looks like you are still developing, so in this case it’s kind of a false positive error message.

This robots.txt configuration has no long-term negative consequences for search engine rankings. However, the longer search engines can access your site uninterrupted, the better it tends to rank: with Google, roughly three months of stable crawling earns a site a kind of trusted status. So it does depend on the domain, whether it was previously indexed by Google, and for how long, but there will be no permanent consequences; at worst you will have to wait another three months or so to earn Google's trust again.

Most social networks read the robots.txt file as and when a user shares a link. Search engines, on the other hand, vary in crawl speed and will take anywhere from a few hours to a few weeks to detect the change in your robots.txt and update their index.
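
If you want to check for yourself how a compliant bot sees your site, here is a minimal sketch using Python's standard-library urllib.robotparser. The "example.com" URLs and the "MyCrawler" user-agent are placeholders, not details from the question:

    from urllib import robotparser

    # Fetch and parse the live robots.txt, the way a well-behaved crawler
    # does before requesting any page.
    rp = robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()

    # A compliant bot checks each URL against the parsed rules before crawling.
    url = "https://example.com/some-page.html"
    if rp.can_fetch("MyCrawler", url):
        print("allowed to crawl:", url)
    else:
        print("blocked by robots.txt:", url)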

Hope this helps. If you can share more details about your circumstances I may be able to help further, but this should at least answer your question.


"My goal (for now) is to block all bots."

Your current robots.txt file does not block all bots.

In the original robots.txt specification, Disallow: * means: block crawling of all URLs whose path starts with a literal *, for example:

  • http://example.com/*
  • http://example.com/****
  • http://example.com/*p
  • http://example.com/*.html
  • ...

Some parsers do not follow the original specification and instead interpret * as a wildcard meaning "any character(s)". For them (and only for them), this rule would indeed block all URLs.
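
You can observe the literal-prefix behaviour with Python's standard-library parser, which follows the original specification rather than treating * as a wildcard. This is just a sketch: the URL is made up, and the printed result reflects recent CPython versions.

    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Disallow: *",
    ])

    # For a literal-prefix parser, the path "/page.html" does not start
    # with "*", so the URL is not blocked. A wildcard-aware parser such as
    # Google's would block the same URL.
    print(rp.can_fetch("AnyBot", "http://example.com/page.html"))  # True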

"In a few words, I would like the site to be accessible only to people, not to bots."

Then you should use:

    User-agent: *
    Disallow: /
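
As a sanity check, the same standard-library parser confirms that Disallow: / blocks every URL, since every path begins with / (the URLs here are placeholders):

    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Disallow: /",
    ])

    # Every URL path starts with "/", so everything is blocked for every bot.
    for url in ("http://example.com/", "http://example.com/page.html"):
        print(url, rp.can_fetch("AnyBot", url))  # False for both

Keep in mind that robots.txt is a voluntary protocol: it stops well-behaved crawlers, but it is not access control, so it cannot guarantee that only people reach the site.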
