I have a large block of HTML to scan. So far, I have used preg_match_all to extract the details I need from it. From the very beginning, the problem was that this was extremely complex, so we finally decided to try a different extraction method. I have read a few articles comparing the performance of preg_match with strpos. They claim that strpos can be up to 20 times faster than the regular expression engine. I would like to try this approach, but I do not know where to start.
Let's say I have this HTML line:
<li id="ncc-nba-16451" class="che10"><a href="/en/star">23 - Star</a></li> <li id="ncd-bbt-5674" class="che10"><a href="/en/moon">54 - Moon</a></li> <li id="ertw-cxda-c6543" class="che10"><a href="/en/sun">34,780 - Sun</a></li>
I want to extract only the number from each id attribute and only the text (letters) from the contents of each <a> tag, so I run this preg_match_all scan:
'/<li.*?id=".*?([\d]+)".*?<a.*?>.*?([\w]+)<\/a>/s'
here you can see the result: LINK
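For reference, a minimal sketch of how this pattern could be run against the sample line. The variable names ($html, $pattern, $matches) and the PREG_SET_ORDER flag are choices made for illustration, not something fixed by the problem:

```php
<?php
// Sample input from above; $html, $pattern and $matches are illustrative names.
$html = '<li id="ncc-nba-16451" class="che10"><a href="/en/star">23 - Star</a></li> '
      . '<li id="ncd-bbt-5674" class="che10"><a href="/en/moon">54 - Moon</a></li> '
      . '<li id="ertw-cxda-c6543" class="che10"><a href="/en/sun">34,780 - Sun</a></li>';

$pattern = '/<li.*?id=".*?([\d]+)".*?<a.*?>.*?([\w]+)<\/a>/s';

// PREG_SET_ORDER groups the captures per <li>: [full match, id number, link text].
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo $m[1], ' => ', $m[2], PHP_EOL;
}
// Prints:
// 16451 => Star
// 5674 => Moon
// 6543 => Sun
```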
Now, if I wanted to replace my method with strpos, what would the approach look like? I understand that strpos returns the index at which the match starts. But how can I use it to:
- get all possible matches, not just the first one
- extract numbers or text from the desired location in the string
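One possible shape for such a scan, as a rough sketch only: strpos accepts a third offset argument, so calling it in a loop and moving the offset past each hit yields every match, and substr cuts out the piece between two positions. The helper between() is hypothetical (not a PHP built-in), and the sketch assumes PHP 7.1 or later, that id is the first attribute written exactly as <li id="...">, and that the wanted word follows the last space in the link text:

```php
<?php
// Hypothetical helper (not a built-in): returns the text between the next
// $start and $end markers at or after $offset, and moves $offset past $end.
// Returns null once either marker can no longer be found.
function between(string $haystack, string $start, string $end, int &$offset): ?string
{
    $from = strpos($haystack, $start, $offset);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    $offset = $to + strlen($end);
    return substr($haystack, $from, $to - $from);
}

$html = '<li id="ncc-nba-16451" class="che10"><a href="/en/star">23 - Star</a></li> '
      . '<li id="ncd-bbt-5674" class="che10"><a href="/en/moon">54 - Moon</a></li> '
      . '<li id="ertw-cxda-c6543" class="che10"><a href="/en/sun">34,780 - Sun</a></li>';

$results = [];
$offset  = 0; // carrying the offset forward is what finds every match, not just the first

while (($id = between($html, '<li id="', '"', $offset)) !== null) {
    $link = between($html, '<a', '</a>', $offset);
    if ($link === null) {
        break;
    }
    // Keep only the trailing digits of the id, e.g. "ncc-nba-16451" -> "16451".
    $number = substr($id, strlen(rtrim($id, '0123456789')));
    // Drop the rest of the opening <a ...> tag, then keep the word after the last space.
    $gt    = strpos($link, '>');
    $text  = $gt === false ? $link : substr($link, $gt + 1);
    $space = strrpos($text, ' ');
    $word  = $space === false ? $text : substr($text, $space + 1);
    $results[] = [$number, $word];
}

print_r($results);
```

This trades the flexibility of the pattern for fixed literal markers, so any change in attribute order or spacing in the real HTML would need matching changes to the marker strings.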
Thanks for the help and advice;)
Mevia