How to split a long string in PHP?

Question

I am trying to break up a very long string that may contain HTML markup.

Example:

Thiiiissssaaaveryyyylonnngggstringgg

For this, I used this function in the past:

    function split($sString, $iCount = 75)
    {
        $text = $sString;
        $new_text = '';
        // Split on '>' so that each piece is some text followed by an optional tag
        $text_1 = explode('>', $text);
        $sizeof = sizeof($text_1);
        for ($i = 0; $i < $sizeof; ++$i) {
            $text_2 = explode('<', $text_1[$i]);
            if (!empty($text_2[0])) {
                // Insert a space after every $iCount characters that are not whitespace or a dot
                $new_text .= preg_replace('#([^\n\r .]{' . $iCount . '})#iu', '\\1 ', $text_2[0]);
            }
            if (!empty($text_2[1])) {
                // Put the tag back unchanged
                $new_text .= '<' . $text_2[1] . '>';
            }
        }
        return $new_text;
    }

The function finds long runs of characters and inserts a break after every X characters. The problem is that HTML tags and character entities can be mixed into the string, like this:

 Thissssiisss<a href="#">lonnnggg</a>sting&#228;&#228;&#228;

I am trying to work out how to break the line above without counting the characters inside HTML tags, while counting each entity (such as &#228;) as one character.

Any help would be great.

thanks

+4

php

Patrik Johansson Sep 03 '09 at 9:57

4 answers

Use the built-in wordwrap() instead.
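
A minimal sketch of that suggestion, assuming the text contains no HTML and only single-byte characters: wordwrap() counts bytes, so it would count the characters inside tags and entities too. The width of 10 is only for illustration. Setting the fourth argument to true forces a break inside words that are longer than the width.

```php
<?php
// wordwrap(string, width, break, cut_long_words)
$long = 'Thiiiissssaaaveryyyylonnngggstringgg';

// With the fourth argument set to true, words longer than the width are cut.
$wrapped = wordwrap($long, 10, "\n", true);

echo $wrapped, "\n";
```

Without the fourth argument, wordwrap() only breaks at whitespace, so a single long word like the one above would be left untouched.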

+2

Amber Sep 03 '09 at 10:00

I use this function to split lines in FireStats.

You can probably lift it out of its context and use it quite easily. Note that it calls some other functions. You can skip the UTF-8 check if you want.

0

Omry yadan Sep 03 '09 at 10:02

Avoid this complexity: use a DOM parser to extract the plain text.

    // file_get_html() comes from the PHP Simple HTML DOM Parser library
    // Dump contents (without tags) from HTML
    $pageText = file_get_html('http://www.google.com/')->plaintext;
    echo "Length is: " . strlen($pageText);
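
If pulling in a DOM library is more than you need, a rougher sketch using only core functions might look like the following. It swaps the DOM parser for strip_tags(), which is a simple scanner rather than a real HTML parser and can be fooled by malformed markup. The helper name visible_length is made up for this example.

```php
<?php
// Hypothetical helper: count the characters a reader would actually see.
function visible_length(string $html): int
{
    // Drop the tags (strip_tags() is a simple scanner, not a DOM parser).
    $text = strip_tags($html);
    // Turn entities such as &#228; into the UTF-8 characters they stand for.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    // Count UTF-8 characters rather than bytes; mb_strlen($text, 'UTF-8')
    // does the same job when the mbstring extension is available.
    return (int) preg_match_all('/./us', $text);
}

echo visible_length('Thissssiisss<a href="#">lonnnggg</a>sting&#228;&#228;&#228;'), "\n";
```

Note that strlen() on the same decoded text would report a larger number, because each ä takes two bytes in UTF-8.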

0

karim79 Sep 03 '09 at 10:06

Dominic Rodger · Accepted Answer · 2009-09-03T10:13:04+0000

If you are worried about UTF-8 support in wordwrap, then you want:

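    // wordwrap() with UTF-8 support
    function utf8_wordwrap($str, $width = 75, $break = "\n")
    {
        // Split the input into words on any run of whitespace
        $words = preg_split('#[\s\n\r]+#', $str);
        $len = 0;
        $return = ''; // initialise to avoid an undefined-variable notice
        foreach ($words as $val) {
            $val .= ' ';
            // Count characters, not bytes
            $tmp = mb_strlen($val, 'utf-8');
            $len += $tmp;
            if ($len >= $width) {
                // The line is full: start a new one with this word
                $return .= $break . $val;
                $len = $tmp;
            } else {
                $return .= $val;
            }
        }
        return $return;
    }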

Source: PHP Manual Comment

As for your problem with code points, you can look at html_entity_decode, which I think converts numeric character references (like &#223;) to the characters that they represent. You will need to give it a character set so that it knows which encoding to use for the result: &#223; always refers to Unicode code point 223, but the bytes that represent that character depend on the encoding.
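
Putting those pieces together, one possible sketch (not the code from the manual comment) splits the string on tags, leaves the tags untouched, and inserts a break into the text in between while counting each entity as a single character. The function name break_long_text is hypothetical, and the regular expressions assume reasonably well-formed markup with no `>` inside attribute values.

```php
<?php
// Hypothetical helper: break long runs of visible characters without
// counting HTML tags, and counting each entity as one character.
function break_long_text(string $html, int $width = 75, string $break = ' '): string
{
    // Split into tags and text, keeping the tags in the result.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    $out = '';
    $run = 0; // visible non-whitespace characters since the last break or space

    foreach ($parts as $part) {
        if (preg_match('/^<[^>]*>$/', $part)) {
            $out .= $part; // tags are copied through and never counted
            continue;
        }
        // Each token is either a whole entity (&#228; &#xE4; &amp;) or one UTF-8 character.
        preg_match_all('/&(?:#\d+|#x[0-9a-f]+|[a-z][a-z0-9]*);|./isu', $part, $m);
        foreach ($m[0] as $token) {
            if (trim($token) === '') {
                $run = 0; // existing whitespace already allows a line break
            } else {
                if ($run === $width) {
                    $out .= $break; // the run is full and more text follows
                    $run = 0;
                }
                $run++;
            }
            $out .= $token;
        }
    }
    return $out;
}

echo break_long_text('Thissssiisss<a href="#">lonnnggg</a>sting&#228;&#228;&#228;', 10), "\n";
```

The entities are left encoded in the output, so there is no need to decode and re-encode them; they are simply matched as one unit when counting.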
