Google does not crawl links in AngularJS application

Question

I have an AngularJS application that is embedded in third-party sites. It injects dynamic content into a div on the third-party page. Google has successfully indexed this dynamic content, but does not seem to crawl the links inside it. The links in the dynamic content look something like this:

<a href="http://www.example.com/support?title=Example Title&titleId=12345">Link Here</a>

I use request parameters for the links rather than a path-based URL structure such as:

 http://www.example.com/support/title/Example Title/titleId/12345

I need to use request parameters because I do not want the third-party site to have to reconfigure its web server to rewrite URLs it does not recognize.

When a link is clicked, I use the $location service to update the URL in the browser, and my Angular app responds accordingly: it shows only the content relevant to the request parameters and sets the page title and meta description.

Many of the articles I have read use AngularJS's $routeProvider and templates, but I'm not sure why that would affect the crawler.

I read that Google should treat URLs with different request parameters as separate pages, so I don't think this should be a problem: https://webmasters.googleblog.com/2008/09/dynamic-urls-vs-static-urls.html

The only things I haven't tried are: 1. providing a sitemap with the URLs that have request parameters, and 2. adding static links from other pages to the dynamic links to help Google discover these pages.

Any help or ideas are appreciated.

javascript angularjs seo googlebot

Aqua lunger Oct 12 '16 at 10:35

4 answers

Himanshu · Answer 1 · 2016-10-26T14:06:34+0000

This is because Google's crawlers cannot get static HTML from your URLs: your pages are rendered dynamically with JavaScript. You can achieve what you want as follows.

Since #! is deprecated, you can tell Google that your pages are rendered with JavaScript by putting the following tag in the page's head:

 <meta name="fragment" content="!">

If this tag is detected, Google's bots will request your URLs from your server with the _escaped_fragment_ query parameter, for example:

 http://www.example.com/?_escaped_fragment_=/support?title=Example Title&titleId=12345

Then you need to rebuild the original URL from _escaped_fragment_ on your server, so that it looks like this again:

 http://www.example.com/support?title=Example Title&titleId=12345
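
The rebuild step might look roughly like the sketch below, written for Node.js. The function name rebuildOriginalUrl is hypothetical, and treating the fragment value as a path plus query relative to the site root is an assumption taken from the example URLs above, not from Google's specification. An empty _escaped_fragment_ value is assumed to mean the page uses the meta tag without any hash fragment.

```javascript
// Hypothetical helper: map a crawler request URL back to the URL a user sees.
// Returns null when the request is not a snapshot request at all.
function rebuildOriginalUrl(requestUrl) {
  const url = new URL(requestUrl);
  const fragment = url.searchParams.get("_escaped_fragment_");
  if (fragment === null) {
    return null; // ordinary request, no snapshot needed
  }
  // Drop the crawler-only parameter and keep everything else.
  url.searchParams.delete("_escaped_fragment_");
  if (fragment === "") {
    return url.href; // meta-tag case: the rest of the URL is already the page URL
  }
  // Assumption: the fragment holds a path plus query relative to the origin.
  return new URL(fragment, url.origin).href;
}

const rebuilt = rebuildOriginalUrl(
  "http://www.example.com/?_escaped_fragment_=/support?title=Example%20Title%26titleId=12345"
);
console.log(rebuilt);
```

A real server would probably also want to check that the rebuilt URL still points at its own host before rendering it.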

You will then need to serve static HTML to the crawler for this URL. You can do this by using a headless browser to access the URL. PhantomJS is a good option: it renders your page, running its JavaScript, and can write the contents to a file to create an HTML snapshot of the page. You can also store the snapshot on your server for later crawls, so that when Google's bots visit you can serve the snapshot directly instead of rendering the page again.
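
Reusing stored snapshots could be sketched as below. The render argument is a stand-in for whatever actually produces the HTML (for example a wrapper that launches PhantomJS). That wrapper, the createSnapshotCache name, and the in-memory Map are all assumptions for illustration; a real setup would more likely keep snapshots on disk and expire them when the content changes.

```javascript
// Hypothetical snapshot cache: renders each URL at most once per process.
// `render` is any function that takes a URL and returns HTML or a Promise of HTML.
function createSnapshotCache(render) {
  const snapshots = new Map(); // URL -> Promise<string>
  return {
    get(url) {
      if (!snapshots.has(url)) {
        const pending = Promise.resolve(render(url)).catch((err) => {
          // Forget failed renders so the next crawl can try again.
          snapshots.delete(url);
          throw err;
        });
        snapshots.set(url, pending);
      }
      return snapshots.get(url);
    },
  };
}

// Usage with a fake renderer standing in for the headless browser.
let renderCount = 0;
const cache = createSnapshotCache(async (url) => {
  renderCount += 1;
  return "<html><body>Snapshot of " + url + "</body></html>";
});
const first = cache.get("http://www.example.com/support?titleId=12345");
const second = cache.get("http://www.example.com/support?titleId=12345");
first.then((html) => console.log(html, "renders:", renderCount));
```

Caching the promise instead of the finished HTML means two crawler requests that arrive while a render is still running share the same render.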

Michael davidson · Answer 2 · 2016-10-19T21:07:12+0000

The web crawler may be processing the page before AngularJS has interpreted your dynamic links. If you use ng-href, the dynamic links are interpreted with a higher priority. Hope it works!

sikandar shaikh · Answer 3 · 2016-10-20T10:26:09+0000

If you use # URLs, nothing after the hash in the URL is sent to your server. Since JavaScript frameworks originally used the hash as a routing mechanism, this is the main reason Google created this protocol.

Change your URLs to use #! instead of just #.

 angular.module('MYAPP').config([
   '$locationProvider', function ($locationProvider) {
     // Prefix hash routes with "!" so they become #! URLs.
     $locationProvider.hashPrefix('!');
   }
 ]);

Pritish vaidya · Answer 4 · 2016-10-21T15:13:57+0000

This is how Google and Bing handle AJAX crawling.

The documentation is mentioned here.

The overview mentioned in the documents is as follows

The crawler finds a pretty AJAX URL (that is, a URL containing a #! hash fragment). It then requests the content for this URL from your server in a slightly modified form. Your web server returns the content in the form of an HTML snapshot, which is then processed by the crawler. The search results will show the original URL.
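
The "slightly modified form" can be illustrated with a small sketch of what the crawler does on its side. The function name toEscapedFragmentUrl is hypothetical, and the set of escaped characters reflects my reading of Google's (now deprecated) AJAX crawling scheme, so treat it as an approximation and not a reference implementation.

```javascript
// Hypothetical helper: turn a pretty #! URL into the URL the crawler requests.
// Returns null when the URL has no #! fragment.
function toEscapedFragmentUrl(prettyUrl) {
  const bang = prettyUrl.indexOf("#!");
  if (bang === -1) {
    return null;
  }
  const base = prettyUrl.slice(0, bang);
  const fragment = prettyUrl.slice(bang + 2);
  // Assumption: only control characters, space, #, %, &, + and non-ASCII
  // characters are percent-encoded; everything else is passed through.
  const escaped = fragment.replace(
    /[\u0000-\u0020#%&+\u007f-\u{10ffff}]/gu,
    (ch) => encodeURIComponent(ch)
  );
  // Append to an existing query string if there is one.
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "_escaped_fragment_=" + escaped;
}

const uglyUrl = toEscapedFragmentUrl(
  "http://www.example.com/#!/support?title=Example Title&titleId=12345"
);
console.log(uglyUrl);
```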

A step-by-step guide is shown in the docs.

Since AngularJS runs on the client side, you need to configure your web server to call a headless browser that loads your page and delivers the rendered HTML for the hashbang URL, which the crawler requests through Google's special URL form.
If you use hashbang URLs, you need to tell the Angular application to use them instead of plain hash URLs:

  App.config(['$routeProvider', '$locationProvider', function($routes, $location) {
    // Use #! instead of # as the hash prefix.
    $location.hashPrefix('!');
    $routes.when('/home', {
      controller: 'IndexCtrl',
      templateUrl: './pages/index.html'
    });
  }]);

as indicated in the sample code here

However, if you do not want to use hashbang URLs but still want to tell Google that the page's HTML content is rendered with JavaScript, you can use this meta tag:

 <meta name="fragment" content="!" />

and then configure Angular to use HTML5-mode URLs (note that html5Mode is set on $locationProvider, not $routeProvider):

  angular.module('HTML5ModeURLs', []).config(['$locationProvider', function($location) {
    $location.html5Mode(true);
  }]);

Then include whichever method you chose as a dependency of your app module:

 var App = angular.module('App', ['HashBangURLs']);
 // or
 // var App = angular.module('App', ['HTML5ModeURLs']);

Now you need a headless browser to access the URL. You can use PhantomJS to load the page, run its JavaScript, and then write the contents to a temporary file.

Phantomrunner.js takes any URL as input, loads and parses the HTML into the DOM, and then checks the status of the data.

Check each page using a specific function here.

A sitemap can also be created as shown in this example.
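
For URLs that carry request parameters, a sitemap could be generated along these lines. This is a minimal sketch, not the linked example: buildSitemap and escapeXml are hypothetical names, and the only rules relied on are that sitemap entries are XML, so & must be written as &amp;, and that URLs should be percent-encoded.

```javascript
// Escape the characters that are special in XML text content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: build a sitemap document from a list of page URLs.
function buildSitemap(pageUrls) {
  const entries = pageUrls.map((pageUrl) => {
    const normalized = new URL(pageUrl).href; // percent-encodes spaces
    return "  <url><loc>" + escapeXml(normalized) + "</loc></url>";
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

const sitemap = buildSitemap([
  "http://www.example.com/support?title=Example Title&titleId=12345",
]);
console.log(sitemap);
```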

Best of all, you can use the Google Search Console to check the URLs of your site.

Full attribution goes to the site and the author mentioned on that site.

UPDATE 1

Your crawler needs pages like:

 - com/
 - com/category/
 - com/category/page/

By default, however, Angular sets up your pages like this:

 - com
 - com/#/category
 - com/#/page

Approach 1

The hashbang lets Angular know which HTML elements to inject using JS, and can be set up as described earlier. Since that scheme has been deprecated, however, another solution is the following.

Configure $locationProvider and set up the base for relative links

You can use $locationProvider as indicated in these docs and set html5Mode to true:

 $locationProvider.html5Mode(true);

This allows Angular to change the routing and URLs of your pages without refreshing the page.

Set the base in the head of the document: <base href="/">

The $location service will automatically fall back to the hashbang method for browsers that do not support the HTML5 History API.
Full attribution goes to the page and the author

You can also look at some other measures and tests that you can apply, as indicated in this document.
