Google Scholar with Matlab

I would like to get some data from Google Scholar automatically through a MATLAB script. What interests me most is data such as Google Scholar BibTeX records and direct citations. However, there seems to be no API for Google Scholar. Is there a way to automatically get bibliographic data from Google Scholar using MATLAB? Are there any tools or code for this?

+8
matlab google-scholar
2 answers

If you really want to use MATLAB for this (which I really don't recommend), you can look at several different web-scraper examples, and there is this code that already retrieves some information from Google Scholar. Basically, it comes down to scraping the web pages and parsing the result.

Personally, I would recommend using Python for this, because Python is better for general-purpose programming, IMHO. For example, this guy has already done something similar to what you want with Python. However, if you know MATLAB and have no interest in or time for Python, then follow the links in the first paragraph.

+4

A word of caution, based on what I found while working on this project.

There is a reason why Google Scholar does not have an API. Using bots to collect data from Google Scholar is against the EULA. The basic idea is that a program interacting with Google Scholar must not do so in a qualitatively different way than a human end user would. In other words, you cannot automatically pull large amounts of data. Although the script in @JustinPeel's answer does not necessarily violate the terms, putting it in a massive loop will.

Some specific points from this EULA:

You shall not, and shall not allow any third party to: ...

(i) directly or indirectly generate queries, or impressions of or clicks on Results, through any automated, deceptive, fraudulent or other invalid means (including, but not limited to, click spam, robots, macro programs, and Internet agents);

...

(l) “crawl”, “spider”, index or in any non-transitory manner store or cache information obtained from the Service (including, but not limited to, Results, or any part, copy or derivative thereof);

If you look at Google Scholar's robots.txt, you will also see that no bots are allowed.
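
If you would rather check a site's robots.txt rules programmatically than read the file by hand, here is a minimal Python sketch using the standard-library `urllib.robotparser`. Note the assumptions: the rules in `SAMPLE_ROBOTS_TXT`, the `example.com` URLs, and the `my-bot` agent name are made up so the example runs offline, and they are not a copy of Google Scholar's actual file; `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only,
# NOT Google Scholar's real file). Kept inline so no network access is needed.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /scholar
Allow: /about
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so split the text first.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Made-up URLs: one under the disallowed prefix, one under the allowed one.
    for url in ("https://example.com/scholar?q=matlab", "https://example.com/about"):
        print(url, "allowed" if is_allowed(SAMPLE_ROBOTS_TXT, "my-bot", url) else "disallowed")
```

To check a live site, you could instead call `set_url()` with the address of its robots.txt and then `read()` before `can_fetch()`. Keep in mind that robots.txt only covers crawler rules; the terms quoted above apply regardless of what it says.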

I heard from some colleagues that you will have problems if you try to circumvent this policy, which could lead to your laboratory losing access to Google Scholar.

+7