How to split a long string in PHP?

I am currently working on splitting a very long string that may contain HTML markup.

Example:

Thiiiissssaaaveryyyylonnngggstringgg 

For this, I used this function in the past:

    function split($sString, $iCount = 75)
    {
        $text = $sString;
        $new_text = '';
        $text_1 = explode('>', $text);
        $sizeof = sizeof($text_1);
        for ($i = 0; $i < $sizeof; ++$i) {
            $text_2 = explode('<', $text_1[$i]);
            if (!empty($text_2[0])) {
                $new_text .= preg_replace('#([^\n\r .]{'. $iCount .'})#iu', '\\1 ', $text_2[0]);
            }
            if (!empty($text_2[1])) {
                $new_text .= '<' . $text_2[1] . '>';
            }
        }
        return $new_text;
    }

The function finds long runs of characters and inserts a space after every X characters. The problem is that HTML tags and character entities can be mixed into the string, as follows:

 Thissssiisss<a href="#">lonnnggg</a>sting&#228;&#228;&#228; 

I am trying to figure out how to break the line above without counting the characters inside the HTML tags, while counting each character entity (such as &#228;) as 1.

Any help would be great.

Thanks.

+4
4 answers

If you need a UTF-8-aware wordwrap(), then you want:

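    // wordwrap() with UTF-8 support
    function utf8_wordwrap($str, $width = 75, $break = "\n")
    {
        // Split the input into words on any run of whitespace.
        $str = preg_split('#[\s\n\r]+#', $str);
        $return = ''; // initialise the result so the first .= does not hit an undefined variable
        $len = 0;
        foreach ($str as $val) {
            $val .= ' ';
            $tmp = mb_strlen($val, 'utf-8'); // length in characters, not bytes
            $len += $tmp;
            if ($len >= $width) {
                // The current line is full: start a new one with this word.
                $return .= $break . $val;
                $len = $tmp;
            } else {
                $return .= $val;
            }
        }
        return $return;
    }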

Source: PHP Manual Comment

As for your problem with character references, you can look at html_entity_decode(), which converts entities and numeric references (like &#223;, which stands for ß) to the characters they represent. You will need to give it a character set (for example 'UTF-8') so that it knows which encoding to use for the decoded characters.
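
To illustrate the decoding step, here is a minimal sketch that measures the visible length of a fragment. The helper name visible_length() is hypothetical (it is not from this thread), and the sketch assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical helper (not from the thread): visible length of an HTML fragment.
// Assumes UTF-8 input and that the mbstring extension is loaded.
function visible_length($html)
{
    // Drop the tags, then turn references such as &#228; into real characters.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // mb_strlen() counts characters, so each decoded entity counts as 1.
    return mb_strlen($text, 'UTF-8');
}

// 12 + 8 + 5 letters plus 3 decoded entities: prints 28
echo visible_length('Thissssiisss<a href="#">lonnnggg</a>sting&#228;&#228;&#228;'), "\n";
```

This only measures the string. It does not insert any breaks, and strip_tags() discards the markup, so it is useful for counting rather than for rewriting the HTML.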

+2

Use the built-in wordwrap() instead.
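
A short sketch of that call on the example string from the question, with a width of 10 chosen only for illustration. Note that wordwrap() counts bytes rather than characters and treats tags and entities as ordinary text, so on its own it does not cover the HTML case.

```php
<?php
$long = 'Thiiiissssaaaveryyyylonnngggstringgg';

// The fourth argument (true) forces a break inside words longer than the width;
// without it, a single long word is left untouched.
$wrapped = wordwrap($long, 10, "\n", true);

echo $wrapped, "\n";
// Thiiiissss
// aaaveryyyy
// lonnngggst
// ringgg
```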

+2

I use this function to split lines in FireStats.

You can probably take it out of context and use it quite easily. Note that it calls some other functions. You can skip the UTF-8 check if you want.

0

Get rid of this complexity: use a DOM parser to extract the plain text.

    // Dump contents (without tags) from HTML
    $pageText = file_get_html('http://www.google.com/')->plaintext;
    echo "Length is: " . strlen($pageText);
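
If the tags have to stay in the output rather than be stripped, one alternative is to tokenise the string and count only the visible characters. The following is a rough sketch under stated assumptions, not a tested library routine: the function name break_long_runs() is hypothetical, the input is assumed to be valid UTF-8, and tags are recognised with a simple pattern that does not handle a ">" inside attribute values.

```php
<?php
// Hypothetical helper (not from the thread): insert $break after every $width
// visible characters of a run without whitespace. Tags are copied through
// uncounted, and an entity such as &#228; or &auml; counts as one character.
function break_long_runs($html, $width = 75, $break = ' ')
{
    // Tokenise into: a tag, an entity, or a single (possibly multi-byte) character.
    if (!preg_match_all('/<[^>]*>|&#?[0-9a-zA-Z]+;|./us', $html, $m)) {
        return $html; // empty input or invalid UTF-8: return unchanged
    }

    $out = '';
    $run = 0; // visible characters since the last whitespace or inserted break
    foreach ($m[0] as $token) {
        if ($token[0] === '<' && strlen($token) > 1) {
            $out .= $token;          // tag: copied through, never counted
            continue;
        }
        if (preg_match('/^\s$/u', $token)) {
            $run = 0;                // existing whitespace ends the run
        } else {
            if ($run === $width) {   // run is full: break before this token
                $out .= $break;
                $run = 0;
            }
            $run++;
        }
        $out .= $token;
    }
    return $out;
}

echo break_long_runs('Thissssiisss<a href="#">lonnnggg</a>sting&#228;&#228;&#228;', 10), "\n";
// Thissssiis ss<a href="#">lonnnggg</a> sting&#228;&#228;&#228;
```

The count carries across tags, because text on both sides of an inline tag is rendered as a single run. A break is only inserted when another visible character follows, so the output never ends with a dangling break.
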
0
