Using a web scraper, you can extract the useful content from a web page and convert it to whatever format you need.
    WebScrap ws = new WebScrap();
    // set the URL of the website you want to scrape
    ws.setUrl("http://dasnicdev.imtqy.com/webscrap4j/");
    // start the scrape session
    ws.startWebScrap();
Your web scraping session has now started, and you can scrape (retrieve) data in Java using the webscrap4j library.
For the title:
    System.out.println("-------------------Title-----------------------------");
    System.out.println(ws.getSingleHTMLTagData("title"));
For the tagline:
    System.out.println("-------------------Tagline-----------------------------");
    System.out.println(ws.getSingleHTMLScriptData("<h2 id='project_tagline'>", "</h2>"));
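The tagline call above passes a start marker and an end marker and gets back the text between them. To illustrate that idea only, here is a minimal sketch in plain Java. It is not webscrap4j's implementation; the class `MarkerExtract`, the method `between`, and the sample HTML string are hypothetical names made up for this example.

```java
class MarkerExtract {
    // Hypothetical helper: returns the trimmed text between the first
    // occurrence of `start` and the next occurrence of `end`,
    // or an empty string if either marker is missing.
    static String between(String html, String start, String end) {
        int from = html.indexOf(start);
        if (from < 0) {
            return "";
        }
        from += start.length();
        int to = html.indexOf(end, from);
        if (to < 0) {
            return "";
        }
        return html.substring(from, to).trim();
    }

    public static void main(String[] args) {
        // Sample markup invented for the example, not fetched from the real page.
        String html = "<body><h2 id='project_tagline'> Web scraping in Java </h2></body>";
        System.out.println(between(html, "<h2 id='project_tagline'>", "</h2>"));
    }
}
```

Plain string matching like this only works when the start marker appears in the page exactly as written, including quote style and attribute order, which is probably also worth keeping in mind when choosing markers for the library call.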
For all anchor tags:
    System.out.println("-------------------All anchor tag-----------------------------");
    al = ws.getImageTagData("a", "href");
    for (String adata : al) {
        System.out.println(adata);
    }
For image data:
    System.out.println("-------------------Image data-----------------------------");
    System.out.println(ws.getImageTagData("img", "src"));
    System.out.println(ws.getImageTagData("img", "alt"));
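The anchor and image calls above both take a tag name and an attribute name and return the matching attribute values. As a rough illustration of that kind of lookup, here is a sketch using only `java.util.regex`. It makes no claim about how webscrap4j does it internally; `AttributeExtract`, `attributeValues`, and the sample HTML are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AttributeExtract {
    // Hypothetical helper: collects the value of `attr` from every `<tag ...>`
    // in the given HTML. Only handles quoted attribute values.
    static List<String> attributeValues(String html, String tag, String attr) {
        Pattern p = Pattern.compile(
                "<" + Pattern.quote(tag) + "\\b[^>]*?\\b" + Pattern.quote(attr)
                        + "\\s*=\\s*([\"'])(.*?)\\1",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher m = p.matcher(html);
        List<String> values = new ArrayList<>();
        while (m.find()) {
            values.add(m.group(2)); // group 1 is the quote character, group 2 the value
        }
        return values;
    }

    public static void main(String[] args) {
        // Sample markup invented for the example.
        String html = "<a href=\"/docs\">Docs</a> "
                + "<img src='logo.png' alt='Logo'> "
                + "<a class=\"x\" href='https://example.com'>Example</a>";
        System.out.println(attributeValues(html, "a", "href"));
        System.out.println(attributeValues(html, "img", "src"));
        System.out.println(attributeValues(html, "img", "alt"));
    }
}
```

A regular expression is fine for a quick sketch, but it is fragile on real-world markup (unquoted values, attributes such as `data-href`), so a proper HTML parser is usually the safer choice for anything beyond simple pages.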
For ul/li data:
    System.out.println("-------------------Ul-Li Data-----------------------------");
    al = ws.getSingleHTMLScriptData("<ul>", "</ul>", "<li>", "</li>");
    for (String str : al) {
        System.out.println(str);
    }
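The four-argument call above names an outer pair of markers (`<ul>`, `</ul>`) and an inner pair (`<li>`, `</li>`). One plausible reading is "find each outer block, then collect every inner item inside it". The sketch below shows that two-level extraction in plain Java, under that assumption; `NestedExtract`, `allBetween`, and `nested` are hypothetical names, and this is not taken from the library's source.

```java
import java.util.ArrayList;
import java.util.List;

class NestedExtract {
    // Hypothetical helper: every trimmed substring found between `start` and `end`.
    static List<String> allBetween(String text, String start, String end) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (true) {
            int from = text.indexOf(start, pos);
            if (from < 0) {
                break;
            }
            from += start.length();
            int to = text.indexOf(end, from);
            if (to < 0) {
                break;
            }
            out.add(text.substring(from, to).trim());
            pos = to + end.length();
        }
        return out;
    }

    // Hypothetical helper: inner items found inside each outer block, in document order.
    static List<String> nested(String html, String outerStart, String outerEnd,
                               String innerStart, String innerEnd) {
        List<String> items = new ArrayList<>();
        for (String block : allBetween(html, outerStart, outerEnd)) {
            items.addAll(allBetween(block, innerStart, innerEnd));
        }
        return items;
    }

    public static void main(String[] args) {
        // Sample markup invented for the example.
        String html = "<ul><li>One</li><li>Two</li></ul><p>skip</p><ul><li>Three</li></ul>";
        for (String item : nested(html, "<ul>", "</ul>", "<li>", "</li>")) {
            System.out.println(item);
        }
    }
}
```

Note that this simple approach does not handle lists nested inside other lists, since the first `</ul>` it meets closes the outer block.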
For the complete source code, check out this tutorial.
Tell Me How, Jun 02 '15 at 8:37