I want to read the visible text of a web page, not its HTML source. I found this code:
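    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;

    try {
        // Create a URL for the desired page
        URL url = new URL("http://www.uefa.com/uefa/aboutuefa/organisation/congress/news/newsid=1772321.html#uefa+moving+with+tide+history");

        // Read all the text returned by the server
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        String str;
        while ((str = in.readLine()) != null) {
            // str is one line of text; readLine() strips the newline character(s).
            // Do not call readLine() again here, or every other line is skipped.
            System.out.println(str);
        }
        in.close();
    } catch (MalformedURLException e) {
        e.printStackTrace(); // the URL string is not valid
    } catch (IOException e) {
        e.printStackTrace(); // the connection or the read failed
    }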
But this code gives me the HTML source of the page. I want only the text inside the page. How can I do this in Java?
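To make the goal concrete, here is a minimal sketch of the kind of result I am after, using the HTML parser that ships with the JDK (`ParserDelegator` with an `HTMLEditorKit.ParserCallback`). The class name `HtmlTextSketch`, the method name `extractText`, and the sample HTML string are only made up for illustration, and I have not checked how this parser copes with scripts, styles, or malformed markup on real pages.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, named only for this sketch.
class HtmlTextSketch {

    // Returns the text chunks found in the given HTML, one per line, without tags.
    static String extractText(String html) {
        StringBuilder text = new StringBuilder();

        // The parser calls handleText() for each run of character data it finds.
        HTMLEditorKit.ParserCallback collector = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                text.append(data).append('\n');
            }
        };

        try (Reader reader = new StringReader(html)) {
            // The third argument tells the parser to ignore charset declarations.
            new ParserDelegator().parse(reader, collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString().trim();
    }

    public static void main(String[] args) {
        // Sample input is an assumption; a real page would be read from url.openStream().
        String html = "<html><body><h1>Title</h1><p>Some text.</p></body></html>";
        System.out.println(extractText(html));
    }
}
```

For a real page, the `StringReader` could presumably be replaced by the `InputStreamReader` from the code above, since `parse` accepts any `Reader`.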
java
Rigor mortis