Java regex retrieve data between href tags

I am trying to extract the data between anchor (`<a href=...>`) tags in a Java string. I can already do this with replaceAll, substring, indexOf, and so on.

I would like to know how I can get data using regex.

So basically I'm trying to extract the data and store it in an array or a list.

    String data = "12345";
    String sampleStr = "";
    for (int i = 0; i < 10; i++) {
        data += i;
        sampleStr += "<a href=\"javascript:yyy_getDetail(\'" + data + "\')\">" + data + "</a>" + ", ";
    }
    System.out.println(sampleStr);
    String temp = sampleStr.substring(sampleStr.indexOf("\">") + 2);

Any suggestions would be appreciated. What should the regular expression be so that I retrieve only the data?

2 answers

Here is an example that does what you need. Note that the full match (group 0) includes the anchor tags themselves; the content you are looking for is in group 1.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String data = "12345";
    String sampleStr = "";
    for (int i = 0; i < 10; i++) {
        data += i;
        sampleStr += "<a href=\"javascript:yyy_getDetail(\'" + data + "\')\">" + data + "</a>" + ", ";
    }

    // Group 1 lazily captures the text between <a ...> and </a>.
    Pattern pattern = Pattern.compile("<a[^>]*>(.*?)</a>");
    Matcher matcher = pattern.matcher(sampleStr);
    while (matcher.find()) {
        System.out.println("Result " + matcher.group(1));
    }
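Since the question asks for the values in a list, here is a minimal sketch of the same pattern wrapped in a helper that collects every group 1 match. The class name `AnchorTextExtractor` and the method `extractAnchorText` are made up for illustration; they are not from the question. It assumes the anchors are not nested and that no attribute value contains a `>` character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorTextExtractor {

    // Same pattern as above: group 1 lazily captures the text between <a ...> and </a>.
    private static final Pattern ANCHOR = Pattern.compile("<a[^>]*>(.*?)</a>");

    /** Hypothetical helper: returns the text of every anchor, in order of appearance. */
    public static List<String> extractAnchorText(String html) {
        List<String> result = new ArrayList<>();
        Matcher matcher = ANCHOR.matcher(html);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<a href=\"javascript:yyy_getDetail('1')\">one</a>, "
                    + "<a href=\"javascript:yyy_getDetail('2')\">two</a>, ";
        System.out.println(extractAnchorText(html)); // prints [one, two]
    }
}
```

Compiling the pattern once in a static field avoids recompiling it on every call, which helps if the helper runs over many strings.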

Use an HTML / XML parser instead. Your life will be much easier.

HTML in the wild is usually very inconsistent, and you cannot be sure it will always be structured the way your regex expects.

Actually, there is a famous answer on this topic: "RegEx match open tags except XHTML self-contained tags".

If you decide to use an HTML / XML parser, take a look at "Best XML parser for Java" for your options :)
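For what it's worth, here is a rough sketch of the parser approach using only the JDK's built-in DOM parser. It works for the sample string in the question only because that fragment happens to be well-formed XML once it is wrapped in a root element. Real-world HTML often is not, and a dedicated HTML parser would be the more robust choice there. The class name `AnchorTextParser`, the method `extractAnchorText`, and the `<root>` wrapper are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AnchorTextParser {

    /** Hypothetical helper: parses a well-formed fragment and returns the text of each <a> element. */
    public static List<String> extractAnchorText(String fragment) throws Exception {
        // Wrap the fragment in a single root element so it forms a valid XML document.
        String xml = "<root>" + fragment + "</root>";

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Collect the text content of every <a> element, in document order.
        NodeList anchors = doc.getElementsByTagName("a");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < anchors.getLength(); i++) {
            result.add(anchors.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String fragment = "<a href=\"javascript:yyy_getDetail('1')\">one</a>, "
                        + "<a href=\"javascript:yyy_getDetail('2')\">two</a>, ";
        System.out.println(extractAnchorText(fragment)); // prints [one, two]
    }
}
```

If the input is not well-formed (unclosed tags, unescaped `&`, and so on), `builder.parse` throws a `SAXException`, so this sketch is only suitable when you control how the markup is generated.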
