Protecting website content from crawlers

The content of an e-commerce website (ASP.NET MVC) is regularly crawled by a competitor. These people are programmers and use sophisticated methods to crawl the site, so identifying them by IP address is not possible. Unfortunately, replacing values with images is not an option, because the site must remain readable by screen readers (JAWS).

My own idea is to use robots.txt to forbid crawlers from accessing one URL on the page. It can be disguised as a link to a normal item but hidden from ordinary users (valid URL: http://example.com?itemId=1234; forbidden URL: http://example.com?itemId=123, i.e. an ID below 128). If the owner of an IP address requests the forbidden link, show a CAPTCHA challenge. An ordinary user will never follow such a link because it is not displayed, and Google has no need to crawl it because it is a dummy. The problem is that a screen reader still reads the link, and I don't think the approach would be effective enough to be worth implementing.
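As a rough illustration of the robots.txt part of this idea, the fragment below disallows only the trap URL from the example. It assumes the crawler honors the `$` end-of-pattern anchor (defined in RFC 9309 and supported by the major search engines); without it, the rule would match by prefix and also block the valid `itemId=1234` URL.

```text
# Sketch only: block the trap link, leave real item pages crawlable.
User-agent: *
Disallow: /?itemId=123$
```

Well-behaved crawlers skip the trap, so any client that still requests it is either ignoring robots.txt or following hidden links.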

+5
4 answers

Your idea may work against a few basic crawlers, but it will be very easy to work around. They just need to use a proxy server and request each link from a new IP address.
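To make the weakness concrete, here is a minimal sketch of the IP-keyed trap check described in the question. `HoneypotTracker` and `ShouldChallenge` are hypothetical names, not part of ASP.NET MVC, and the threshold of 128 is only an assumption taken from the question's example.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: remembers which client IPs requested a trap URL
// so that their later requests can be answered with a CAPTCHA.
public sealed class HoneypotTracker
{
    // Assumption from the question's example: real item IDs are 128 or higher.
    private const int FirstRealItemId = 128;

    // Flagged IP address -> time of the most recent trap hit.
    private readonly ConcurrentDictionary<string, DateTime> _flagged =
        new ConcurrentDictionary<string, DateTime>();

    // Call for every item request; returns true if this IP should see a CAPTCHA.
    public bool ShouldChallenge(string clientIp, int itemId)
    {
        if (itemId < FirstRealItemId)
            _flagged[clientIp] = DateTime.UtcNow; // the hidden trap link was followed

        return _flagged.ContainsKey(clientIp);
    }
}
```

With one tracker instance, `ShouldChallenge("203.0.113.5", 123)` flags that address and a later `ShouldChallenge("203.0.113.5", 1234)` still returns true, but `ShouldChallenge("203.0.113.9", 1234)` returns false. Because the state is keyed by IP address, a crawler that switches proxies starts again with a clean record.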

If you allow anonymous access to your site, you can never fully protect your data. Even if you manage to block crawlers with a lot of time and effort, they can simply have a person browse the site and capture the content with something like Fiddler. The best way to keep your data away from your competitors is not to publish it on the public part of your site.

Forcing users to register can help: at least then you can identify who is crawling your site and ban them.

+2

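// Controller actions (System.Web.Mvc); place them inside a Controller class.
public ActionResult Index()
{
    // Signed-in users are redirected to the full listing.
    if (User.Identity.IsAuthenticated)
        return RedirectToAction("IndexAll");

    // Anonymous visitors see only limited content.
    return View();
}

[Authorize(Roles = "Users")]
public ActionResult IndexAll()
{
    // Show everything to users in the "Users" role.
    return View();
}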

0

How can I protect the content of the casino website https://www.onlinecasinosat.com/? Can I use https://de.wordpress.org/plugins/wp-content-copy-protector/? This WordPress plugin protects a site's published content from being copied by other site authors.

0