How do I fix my .htaccess to proxy search engine crawler requests?

I built a website with React on the front end and WordPress as the back end. So that search bots can see my site, I set up server-side prerendering, and I'm trying to configure .htaccess to proxy requests coming from search engines so that they are served the prerendered pages.

For testing, I use the "Fetch as Google" tool in Google Webmaster Tools.

Here is my attempt:

    <IfModule mod_rewrite.c>
        RewriteEngine On

        <IfModule mod_proxy_http.c>
            RewriteCond %{REQUEST_FILENAME} -f [OR]
            RewriteCond %{REQUEST_FILENAME} -d
            RewriteCond %{HTTP_USER_AGENT} googlebot [NC,OR]
            RewriteCond %{QUERY_STRING} _escaped_fragment_
            # Proxy the request ... works for inner pages only
            RewriteRule ^(?!.*?)$ http://example.com:3000/https://example.com/$1 [P,L]
        </IfModule>
    </IfModule>

    # BEGIN WordPress
    <IfModule mod_rewrite.c>
        RewriteEngine On
        RewriteBase /
        RewriteRule ^index\.php$ - [L]
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule . /index.php [L]
    </IfModule>
    # END WordPress

My problem is that this directive does not work for my homepage; it only works for inner pages (http://example.com/inner-page/):

 RewriteRule ^(?!.*?)$ http://example.com:3000/https://example.com/$1 [P,L] 

When I replace it with the following line, the homepage request is proxied correctly, but the inner pages stop working:

 RewriteRule ^(index\.php)?(.*) http://example.com:3000/https://example.com/$1 [P,L] 

Could you help me fix the rewrite rule so that my homepage is also correctly proxied for googlebot?

regex apache reactjs .htaccess
2 answers

Change the RewriteRule to:

 RewriteRule ^(.*)/?$ http://example.com:3000/https://example.com/$1 [P,L] 

First, avoid the duplication by merging the two mod_rewrite blocks into one:

    <IfModule mod_rewrite.c>
        RewriteEngine On

        <IfModule mod_proxy_http.c>
            RewriteCond %{REQUEST_FILENAME} -f [OR]
            RewriteCond %{REQUEST_FILENAME} -d
            RewriteCond %{HTTP_USER_AGENT} googlebot [NC,OR]
            RewriteCond %{QUERY_STRING} _escaped_fragment_
            # Proxy the request ... works for inner pages only
            RewriteRule ^(?!.*?)$ http://example.com:3000/https://example.com/$1 [P,L]

            RewriteBase /
            RewriteRule ^index\.php$ - [L]
            RewriteCond %{REQUEST_FILENAME} !-f
            RewriteCond %{REQUEST_FILENAME} !-d
            RewriteRule . /index.php [L]
        </IfModule>
    </IfModule>

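Then change ^(?!.*?)$ to ^(.*)$, or to a stricter pattern, for example [a-zA-Z0-9-.]*. Keep the capture group so that $1 still carries the requested path, and remember to use the zero-or-more quantifier (*) there so that the homepage is matched too.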
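To illustrate why the quantifier matters: in a per-directory (.htaccess) context, mod_rewrite strips the leading slash before matching, so a request for the homepage is matched against an empty string. Here is a minimal sketch of the difference, reusing the proxy target from the question. The commented-out rule is only a hypothetical counter-example, and in the real file the rule would still be preceded by the RewriteCond lines.

```apache
# In .htaccess, the pattern sees "" for "/" and "inner-page/" for "/inner-page/".

# Hypothetical counter-example: "+" needs at least one character,
# so the homepage ("") would never be proxied.
# RewriteRule ^(.+)$ http://example.com:3000/https://example.com/$1 [P,L]

# "*" also accepts zero characters, so the homepage and inner pages both match.
RewriteRule ^(.*)$ http://example.com:3000/https://example.com/$1 [P,L]
```

The same reasoning explains why the WordPress rule "RewriteRule . /index.php [L]" does not fire for the homepage: a single dot requires one character.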

The corrected code is:

    <IfModule mod_rewrite.c>
        RewriteEngine On

        <IfModule mod_proxy_http.c>
            RewriteCond %{REQUEST_FILENAME} -f [OR]
            RewriteCond %{REQUEST_FILENAME} -d
            RewriteCond %{HTTP_USER_AGENT} googlebot [NC,OR]
            RewriteCond %{QUERY_STRING} _escaped_fragment_
            # Proxy the request (homepage and inner pages)
            RewriteRule ^(.*)$ http://example.com:3000/https://example.com/$1 [P,L]

            RewriteBase /
            RewriteRule ^index\.php$ - [L]
            RewriteCond %{REQUEST_FILENAME} !-f
            RewriteCond %{REQUEST_FILENAME} !-d
            RewriteRule . /index.php [L]
        </IfModule>
    </IfModule>