I have an AngularJS app and I want to be able to share its pages on Facebook. Sharing is handled by meta tags (https://developers.facebook.com/docs/sharing/best-practices), but I cannot set the meta tags with JS because Facebook's crawler does not execute JavaScript. Therefore I want to use prerender.io to execute and render my pages before the crawler receives them from the server.
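For context, these are the kind of Open Graph meta tags the crawler is supposed to read from the page <head> (simplified, with placeholder values):

<!-- Open Graph tags in the <head>; the values here are just placeholders -->
<meta property="og:title" content="Page title" />
<meta property="og:description" content="Page description" />
<meta property="og:image" content="http://example.com/image.png" />
<meta property="og:url" content="http://example.com/some-page" />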
The fact is that I'm not sure how correctly I understand the documentation ( https://github.com/greengerong/prerender-java ).
This is an example web.xml from README.md on GitHub:
<filter>
    <filter-name>prerender</filter-name>
    <filter-class>com.github.greengerong.PreRenderSEOFilter</filter-class>
    <init-param>
        <param-name>prerenderServiceUrl</param-name>
        <param-value>http://localhost:3000</param-value>
    </init-param>
    <init-param>
        <param-name>crawlerUserAgents</param-name>
        <param-value>me</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>prerender</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
After several attempts to do everything right, I found out that if I just delete this part:
<init-param>
    <param-name>prerenderServiceUrl</param-name>
    <param-value>http://localhost:3000</param-value>
</init-param>
then I don't have to deal with sockets on GAE (which were giving me this error: "Caused by: java.net.SocketException: Permission denied: ..."), and the filter can use the default service already deployed at http://prerender.herokuapp.com. Question 1) What are the pros and cons of using the default service versus deploying my own?
Now the service is working, and I am not getting server errors - great!
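At that point my web.xml filter was simply the README example without the prerenderServiceUrl block:

<filter>
    <filter-name>prerender</filter-name>
    <filter-class>com.github.greengerong.PreRenderSEOFilter</filter-class>
    <init-param>
        <param-name>crawlerUserAgents</param-name>
        <param-value>me</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>prerender</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>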
As described in the documentation (https://github.com/greengerong/prerender-java), I first used "me" as the crawler user agent, exactly as in the README example. With "me" as the crawler agent, prerender started caching my own API calls, presumably because the filter triggers on any User-Agent that contains the configured string, and "me" is a substring of e.g. "Chrome". For example, when I fetched a bunch of items from my server, prerender returned some HTML instead of the JSON I wanted and cached that URI. So now I have some cached pages on prerender.io, just not the pages I actually want cached :).
So I changed crawlerUserAgents to this:
<init-param>
    <param-name>crawlerUserAgents</param-name>
    <param-value>facebookexternalhit/1.1</param-value>
</init-param>
(I also tried facebookexternalhit, FacebookUserExternalHit, ...). Now nothing gets cached on prerender.io, and the JavaScript is not executed before Facebook's crawler receives the pages. The debugger (https://developers.facebook.com/tools/debug/og/object/) shows that the crawler only sees the original meta tags, not the meta tags I replace with JS on the different pages (the meta tags are replaced correctly when I open the page myself and inspect the elements).
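As I understand it from the README, crawlerUserAgents accepts a comma-separated list, so the variants I tried looked roughly like this:

<init-param>
    <param-name>crawlerUserAgents</param-name>
    <!-- user-agent substrings that should trigger prerendering -->
    <param-value>facebookexternalhit,facebookexternalhit/1.1,FacebookUserExternalHit</param-value>
</init-param>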
Question 2) Am I doing this right? Should I try other crawler user agents? Is facebookexternalhit right?