PHP regex: optionally match a whole word

I'm using PHP and I need to scrape some info from some pages on a site. I simulate both an AJAX request by the browser and a normal (full) page request by the browser, but the AJAX response is slightly different from the full page response in this section of HTML.

AJAX response: <div id="accountProfile"><h2>THIS IS THE BIT I WANT</h2><dl id="accountProfileData">

However, the normal response is: <div id="accountProfile"><html xmlns="http://www.w3.org/1999/xhtml"><h2>THIS IS THE BIT I WANT</h2><dl id="accountProfileData">

That is, the AJAX response is missing the tag <html xmlns="http://www.w3.org/1999/xhtml">. I need to get the bit between the h2 tags. Obviously, I can't just search the page for <h2>THIS IS THE BIT I WANT</h2><dl id="accountProfileData">, because these tags may occur elsewhere without containing the information I want.

I can match either of the patterns individually, but I would like to handle both in the same regex. Here is my solution for matching the AJAX response:

 <?php $pattern = '/\<div id="accountProfile"\>\<h2\>(.+?)\<\/h2\>\<dl id="accountProfileData"\>/'; preg_match($pattern, $haystack, $matches); print_r($matches); ?> 

Can someone show me how I should change the pattern to optionally match the <html xmlns="http://www.w3.org/1999/xhtml"> tag? If it helps to simplify the haystack for brevity, that's fine.

1 answer

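I have not tested it, but you can try making the <html> tag an optional group. The forward slashes in the URL have to be escaped because / is the pattern delimiter, and the group is non-capturing, (?:...)?, so the heading text stays in $matches[1]:

  $pattern = '/\<div id="accountProfile"\>(?:\<html xmlns="http:\/\/www\.w3\.org\/1999\/xhtml"\>)?\<h2\>(.+?)\<\/h2\>\<dl id="accountProfileData"\>/';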
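To check the optional-group idea against both responses from the question, here is a minimal sketch. The helper name extractProfileHeading is made up for illustration, and the choice of ~ as the delimiter (so the slashes in the URL need no escaping) is an assumption about taste, not something from the original post.

```php
<?php
// Hypothetical helper, not part of the original question or answer.
function extractProfileHeading(string $haystack): ?string
{
    // '~' is used as the delimiter so the '/' characters in the URL and in
    // the closing tag can be written literally.
    // (?: ... )? makes the whole <html ...> tag optional without capturing it,
    // so the heading text is always in $matches[1].
    $pattern = '~<div id="accountProfile">'
             . '(?:<html xmlns="http://www\.w3\.org/1999/xhtml">)?'
             . '<h2>(.+?)</h2><dl id="accountProfileData">~';

    return preg_match($pattern, $haystack, $matches) === 1 ? $matches[1] : null;
}

// The two haystacks from the question.
$ajax = '<div id="accountProfile"><h2>THIS IS THE BIT I WANT</h2><dl id="accountProfileData">';
$full = '<div id="accountProfile"><html xmlns="http://www.w3.org/1999/xhtml">'
      . '<h2>THIS IS THE BIT I WANT</h2><dl id="accountProfileData">';

var_dump(extractProfileHeading($ajax));
var_dump(extractProfileHeading($full));
// An <h2> outside the accountProfile div should not match.
var_dump(extractProfileHeading('<h2>Elsewhere</h2><dl id="accountProfileData">'));
```

The first two calls should return the heading text and the last one null, since the pattern is still anchored on the accountProfile div.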