PHP - splitting a string of HTML attributes into an associative array

I have a string of HTML attributes:

$attribs = ' id= "header " class = "foo bar" style ="background-color:#fff; color: red; "'; 

How can I convert this string into an associative array, for example:

    array(
        'id'    => 'header',
        'class' => array('foo', 'bar'),
        'style' => array(
            'background-color' => '#fff',
            'color'            => 'red'
        )
    )

so I can use the array_merge_recursive PHP function to combine the two sets of HTML attributes.

thanks

+8
html split php
6 answers

Use SimpleXML:

    <?php
    $attribs = ' id= "header " class = "foo bar" style ="background-color:#fff; color: red; "';
    $x = new SimpleXMLElement("<element $attribs />");
    print_r($x);
    ?>

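This assumes that every attribute is a name/value pair. A valueless attribute such as disabled is not well-formed XML, so the SimpleXMLElement constructor will throw an exception on it.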
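To get a plain PHP array rather than a SimpleXMLElement object, you can iterate over attributes(). This is only a sketch: attributes_via_simplexml is a hypothetical helper name, and trimming the values is my assumption about what the asker wants (the expected output in the question shows 'header' without the trailing space).

```php
<?php
// Hypothetical helper: read the attributes into a flat name => value array.
// Assumes $attribs is well-formed XML attribute syntax (every name has a quoted value).
function attributes_via_simplexml(string $attribs): array
{
    $element = new SimpleXMLElement("<element $attribs />");
    $result = [];
    foreach ($element->attributes() as $name => $value) {
        // Each $value is a SimpleXMLElement; cast it to string and trim the padding.
        $result[$name] = trim((string) $value);
    }
    return $result;
}

$attribs = ' id= "header " class = "foo bar" style ="background-color:#fff; color: red; "';
print_r(attributes_via_simplexml($attribs));
```

The class and style values still come back as plain strings here, so they would need to be split further before calling array_merge_recursive.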

+20

You can use regex to extract this information:

    $attribs = ' id= "header " class = "foo bar" style ="background-color:#fff; color: red; "';
    $pattern = '/(\\w+)\s*=\\s*("[^"]*"|\'[^\']*\'|[^"\'\\s>]*)/';
    preg_match_all($pattern, $attribs, $matches, PREG_SET_ORDER);
    $attrs = array();
    foreach ($matches as $match) {
        if (($match[2][0] == '"' || $match[2][0] == "'") && $match[2][0] == $match[2][strlen($match[2])-1]) {
            $match[2] = substr($match[2], 1, -1);
        }
        $name = strtolower($match[1]);
        $value = html_entity_decode($match[2]);
        switch ($name) {
            case 'class':
                $attrs[$name] = preg_split('/\s+/', trim($value));
                break;
            case 'style':
                // parse CSS property declarations
                break;
            default:
                $attrs[$name] = $value;
        }
    }
    var_dump($attrs);

That leaves the style attribute. Parsing its property declarations is a bit more complicated than splitting the space-separated class names, because a declaration may contain comments and URLs with a ; in them.
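For simple cases the style branch could be filled in roughly like this. It is a naive sketch, not a CSS parser: it assumes the style value contains no comments and no ; inside url(...) or quoted strings, which are exactly the cases warned about above. parse_style is a hypothetical helper name.

```php
<?php
// Hypothetical helper: naive parser for the value of a style attribute.
// Assumes no CSS comments and no ';' inside url(...) or quoted strings.
function parse_style(string $style): array
{
    $declarations = [];
    foreach (explode(';', $style) as $declaration) {
        // Split on the first colon only, so a value that contains ':' stays intact.
        $parts = explode(':', $declaration, 2);
        if (count($parts) !== 2) {
            continue; // empty chunk (e.g. after a trailing ';') or malformed declaration
        }
        $property = strtolower(trim($parts[0]));
        if ($property !== '') {
            $declarations[$property] = trim($parts[1]);
        }
    }
    return $declarations;
}

var_dump(parse_style('background-color:#fff; color: red; '));
```

In the switch above, the style case would then become something like $attrs[$name] = parse_style($value);.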

+8

You cannot use regex to parse html attributes. This is because the syntax is contextual. You can use regular expressions to tokenize input, but you need a state machine to parse it.
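To illustrate what such a state machine might look like, here is a small hand-written scanner that walks through the states "before name", "name", "after name" and "value". It is only a sketch under my own assumptions: scan_attributes is a hypothetical name, valueless attributes are stored as null, names are lowercased, and entities in values are not decoded.

```php
<?php
// Hypothetical helper: a tiny hand-written scanner for an attribute string.
function scan_attributes(string $input): array
{
    $ws = " \t\r\n\f";
    $attrs = [];
    $len = strlen($input);
    $i = 0;
    while ($i < $len) {
        // State "before name": skip whitespace and tag delimiters.
        $i += strspn($input, $ws . '/>', $i);
        if ($i >= $len) {
            break;
        }
        // State "name": read up to whitespace, '=', '/' or '>'.
        $n = strcspn($input, $ws . '=/>', $i);
        if ($n === 0) {
            $i++; // stray '=' with no name in front of it; skip it
            continue;
        }
        $name = strtolower(substr($input, $i, $n));
        $i += $n;
        // State "after name": an '=' means a value follows, anything else means none.
        $i += strspn($input, $ws, $i);
        if ($i >= $len || $input[$i] !== '=') {
            $attrs[$name] = null;
            continue;
        }
        $i++;
        $i += strspn($input, $ws, $i);
        // State "value": quoted (either quote style) or unquoted.
        if ($i < $len && ($input[$i] === '"' || $input[$i] === "'")) {
            $end = strpos($input, $input[$i], $i + 1);
            if ($end === false) {
                $end = $len; // unterminated quote: take the rest of the input
            }
            $attrs[$name] = substr($input, $i + 1, $end - $i - 1);
            $i = $end + 1;
        } else {
            $n = strcspn($input, $ws . '>', $i);
            $attrs[$name] = substr($input, $i, $n);
            $i += $n;
        }
    }
    return $attrs;
}

print_r(scan_attributes(' id= "header " class = "foo bar" disabled data-x=1'));
```

Unlike a single regular expression, each state only has to decide what the next character means in its own context, which is why quoted values, unquoted values and valueless attributes can all be handled in one pass.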

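If performance isn't critical, the safest approach is probably to wrap the attributes in a tag and run it through an HTML parser. For example:

    function parse_attributes($input) {
        $dom = new DOMDocument();
        // Collect libxml warnings (e.g. about the unknown <foo> tag) instead of emitting them.
        $previous = libxml_use_internal_errors(true);
        $dom->loadHTML("<foo " . $input . "/>");
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
        // loadHTML() wraps the fragment in <html><body>, so look the element up by name.
        $element = $dom->getElementsByTagName('foo')->item(0);
        $attributes = array();
        if ($element !== null) {
            foreach ($element->attributes as $name => $attr) {
                $attributes[$name] = $attr->value;
            }
        }
        return $attributes;
    }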

You could perhaps optimize the above by reusing the parser, or by using XMLReader or a SAX parser.
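As a rough idea of the XMLReader variant, something like the following should work. Note the trade-off: XMLReader is an XML parser, so unlike the HTML parser it requires well-formed input (quoted values, no valueless attributes). attributes_via_xmlreader is a hypothetical name, and I have not measured whether it is actually faster.

```php
<?php
// Hypothetical helper: read attributes with the streaming XMLReader API.
// Assumes $attribs is well-formed XML attribute syntax.
function attributes_via_xmlreader(string $attribs): array
{
    $reader = new XMLReader();
    $reader->XML("<element $attribs />");
    $result = [];
    // The first read() positions the reader on the <element> node.
    if ($reader->read()) {
        // From an element node, moveToNextAttribute() starts at its first attribute.
        while ($reader->moveToNextAttribute()) {
            $result[$reader->name] = $reader->value;
        }
    }
    $reader->close();
    return $result;
}

$attribs = ' id= "header " class = "foo bar" style ="background-color:#fff; color: red; "';
print_r(attributes_via_xmlreader($attribs));
```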

+4

Maybe this will help you. Here is what it does:

  • An HTML DOM parser written in PHP 5+ that lets you manipulate HTML very easily.
  • Requires PHP 5+.
  • Supports invalid HTML.
  • Finds tags on an HTML page with selectors, just like jQuery.
  • Extracts content from HTML in a single line.

http://simplehtmldom.sourceforge.net/

+2

A simple way could also be:

    $atts_array = current((array) new SimpleXMLElement("<element $attribs />"));

+2

A simple and effective function that solves this problem:

    function attrString2Array($attr) {
        $atList = [];
        if (preg_match_all('/\s*(?:([a-z0-9-]+)\s*=\s*"([^"]*)")|(?:\s+([a-z0-9-]+)(?=\s*|>|\s+[a-z0-9]+))/i', $attr, $m)) {
            for ($i = 0; $i < count($m[0]); $i++) {
                if ($m[3][$i])
                    $atList[$m[3][$i]] = null;   // valueless attribute
                else
                    $atList[$m[1][$i]] = $m[2][$i];
            }
        }
        return $atList;
    }

    print_r(attrString2Array('<li data-tpl-classname="class" data-tpl-title="innerHTML" disabled nowrap href="#" hide src = "images/asas.gif">'));
    print_r(attrString2Array('data-tpl-classname="class" data-tpl-title="innerHTML" disabled nowrap href="#" hide src = "images/asas.gif"'));

    //Array
    //(
    //    [data-tpl-classname] => class
    //    [data-tpl-title] => innerHTML
    //    [disabled] =>
    //    [nowrap] =>
    //    [href] => #
    //    [hide] =>
    //    [src] => images/asas.gif
    //)

0