Nginx: serving a different robots.txt for an alternate domain

Summary

I have one web application with both an internal and an external domain pointing at it, and I want robots.txt to block all crawler access to the internal domain but allow full access to the external domain.

Problem details

I have a simple Nginx server block that I use as a proxy in front of a Django application (see below). As you can see, this server block responds to any domain (since it has no server_name directive). However, I am wondering how to flag certain domains so that Nginx will serve a custom robots.txt file for them.

In particular, let's say the example.com and www.example.com domains should serve the default robots.txt file from the htdocs directory. (Since root /sites/mysite/htdocs; is set, the robots.txt file lives at /sites/mysite/htdocs/robots.txt.)

BUT, I also want the domain internal.example.com (which points to the same server as example.com) to serve a custom robots.txt file; I want that file to disallow everything so Google does not index this internal domain.
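The internal domain would get a standard disallow-all file, something like:

    User-agent: *
    Disallow: /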

I thought about duplicating the server block, setting the following in one of the copies, and then somehow overriding the robots.txt lookup in that block:

"server_name internal.example.com;" 

But duplicating the entire server block just for this purpose does not seem very DRY.

I also thought about using an if statement to check whether the Host header contains the internal domain, and serving the custom robots.txt file that way. But the Nginx docs warn that If Is Evil.
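For reference, this is roughly what I had in mind (just a sketch; as I understand it, rewrite ... last is one of the few directives considered safe inside a location-level if, but I would rather avoid if entirely):

    location = /robots.txt {
        if ($host = internal.example.com) {
            rewrite ^ /internal-robots.txt last;
        }
    }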

What is a good approach for serving a custom robots.txt file for an internal domain?

Thank you for your help.

Here is the example server block that I use:

    upstream app_server {
        server unix:/sites/mysite/var/run/wsgi.socket fail_timeout=0;
    }

    server {
        listen 80;
        root /sites/mysite/htdocs;

        location / {
            try_files $uri @proxy_to_app;
        }

        location @proxy_to_app {
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Protocol $scheme;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Scheme $scheme;
            proxy_set_header Host $http_host;
            proxy_redirect off;
            proxy_pass http://app_server;
        }
    }
Tags: django, nginx, dns, robots.txt
1 answer

You can use map to define a conditional variable. Since a map is only evaluated when the variable is used, it avoids the pitfalls of if. Add this outside your server block, at the http level:

    map $host $robots_file {
        default              robots.txt;
        internal.example.com internal-robots.txt;
    }

Then the variable can be used with try_files as follows:

    server_name internal.example.com;

    location = /robots.txt {
        try_files /$robots_file =404;
    }
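For completeness, here is a sketch of how this could fit into the configuration from the question, keeping the single catch-all server block (no server_name) so all three domains reach it; some proxy headers are abbreviated:

    upstream app_server {
        server unix:/sites/mysite/var/run/wsgi.socket fail_timeout=0;
    }

    # Pick a robots file per requested host.
    map $host $robots_file {
        default              robots.txt;
        internal.example.com internal-robots.txt;
    }

    server {
        listen 80;
        root /sites/mysite/htdocs;

        # Serve the per-host robots.txt chosen by the map above.
        location = /robots.txt {
            try_files /$robots_file =404;
        }

        location / {
            try_files $uri @proxy_to_app;
        }

        location @proxy_to_app {
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $http_host;
            proxy_redirect off;
            proxy_pass http://app_server;
        }
    }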

You then keep two robots.txt files in your document root:

    robots.txt
    internal-robots.txt
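To verify which file each domain receives, you can send requests with an explicit Host header, for example (assuming the server is reachable on localhost):

    curl -H "Host: internal.example.com" http://127.0.0.1/robots.txt
    curl -H "Host: www.example.com" http://127.0.0.1/robots.txt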
