A word of caution from what I found while working on this project.
There is a reason why Google Scholar does not have an API: using bots to scrape Google Scholar violates the EULA. The basic idea is that any program interacting with Google Scholar must not do so in a way that is qualitatively different from what an end user can do. In other words, you cannot automatically extract large amounts of data. Although the script in @JustinPeel's answer does not necessarily violate the terms, putting it in a massive loop would.
Some specific points from the EULA:
You shall not, and shall not allow any third party to: ...
(i) directly or indirectly generate queries, or impressions of or clicks on Results, through any automated, deceptive, fraudulent or other invalid means (including, but not limited to, click spam, robots, macro programs, and Internet agents);
...
(l) "crawl", "spider", index or in any non-transitory manner store or cache information obtained from the Service (including, but not limited to, Results, or any part, copy, or derivative thereof);
If you look at Google Scholar's robots.txt, you will also see that no bots are allowed.
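If you want a script to check robots.txt rules before making any request, the standard library's `urllib.robotparser` can do it. This is only a minimal sketch: `SAMPLE_ROBOTS_TXT`, the `is_allowed` helper, the `my-bot` user agent, and the `example.com` URLs are all made up for illustration and are not copied from Google's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# It is NOT a copy of Google Scholar's real robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /scholar
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/scholar?q=test", "https://example.com/about"):
        print(url, is_allowed(SAMPLE_ROBOTS_TXT, "my-bot", url))
```

To check a live site instead, `RobotFileParser` also provides `set_url()` and `read()` to download the file. Keep in mind that passing a robots.txt check says nothing about the EULA, which applies regardless.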
I have heard from some colleagues that you will run into trouble if you try to circumvent this policy, possibly to the point of your lab losing access to Google Scholar.
Artem Kaznatcheev