I want to read the source code (HTML tags) of a given URL from my servlet.
For example, if the URL is http://www.google.com, my servlet needs to read that page's HTML source. I need this because my web application is going to read other web pages, extract useful content, and do something with it.
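A minimal sketch of the fetching step, using only `java.net.HttpURLConnection` from the standard library. The class name `PageFetcher`, the method names, the 5-second timeouts, and the UTF-8 charset are assumptions made for illustration; a real page may declare a different encoding in its `Content-Type` header.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name and methods are illustrative only.
public class PageFetcher {

    // Reads the whole stream into a String, normalizing line endings to '\n'.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Sends a GET request to pageUrl and returns the response body as text.
    // Assumes the page is UTF-8 encoded (an assumption, not checked here).
    public static String fetchHtml(String pageUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000); // milliseconds, arbitrary choice
        conn.setReadTimeout(5000);    // milliseconds, arbitrary choice
        try (InputStream in = conn.getInputStream()) {
            return readAll(in, StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

Inside a servlet, something like `String html = PageFetcher.fetchHtml("http://www.google.com");` could be called from `doGet`, since nothing in this sketch depends on the servlet API.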
Suppose my application shows a list of stores of one category in a city. To build this list, my web application (servlet) fetches a web page that displays various stores and reads its content. My servlet then filters the page source to extract the useful information, and finally the list is created. (I have to do it this way because my servlet does not have access to the database of the web application behind the given URL.)
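The filtering step described above might look roughly like the sketch below. It assumes, purely for illustration, that the target page marks each store as `<li class="store">Name</li>`; the real page's markup will differ, and the class name `StoreExtractor` is hypothetical. A regular expression is used here only to keep the example dependency-free; for real-world HTML a proper HTML parser is usually more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the markup it expects is an assumption, not the real page's.
public class StoreExtractor {

    // Matches <li class="store"> ... </li> and captures the trimmed inner text.
    private static final Pattern STORE_ITEM = Pattern.compile(
            "<li class=\"store\">\\s*(.*?)\\s*</li>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the store names found in the given HTML, in document order.
    public static List<String> extractStores(String html) {
        List<String> stores = new ArrayList<>();
        Matcher m = STORE_ITEM.matcher(html);
        while (m.find()) {
            stores.add(m.group(1));
        }
        return stores;
    }
}
```

The returned list could then be stored as a request attribute and rendered by a JSP.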
Does anyone know a solution? (I specifically need to do this in servlets.) If you think there is a better way to get information from another site, please let me know.
Thanks.
java html jsp servlets web-scraping
Decora