PHP UTF-8 Decoding Problem

Question


I have the following address line: Praha 5, Staré Město,

I need to run utf8_decode() on this string before I can write it to a PDF file (using the domPDF library).

However, the output of PHP's utf8_decode() for this address line looks incorrect (or rather, incomplete).

The following code:

<?php echo utf8_decode('Praha 5, Staré Město,'); ?>

Produces the following:

Praha 5, Staré M?sto,

Any idea why ě is not decoded?

+7

php utf-8 character-encoding

Latheesan Jun 20 '13 at 10:15


4 answers

You don't need it. (@Rajeev: this string is automatically detected as UTF-8 encoded:

 echo mb_detect_encoding('Praha 5, Staré Město,');

will always return UTF-8.)

You would be better off looking at: https://code.google.com/p/dompdf/wiki/CPDFUnicode
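If you want to confirm that a string really is valid UTF-8 before handing it to the PDF library, something like the following might help. It is only a sketch: it assumes the mbstring extension is loaded, and the sample byte strings are made up for illustration.

```php
<?php
// Assumption: the mbstring extension is available.
// Sample strings are written as explicit bytes so the result does not
// depend on how this file itself is saved.
$utf8   = "Star\xC3\xA9 M\xC4\x9Bsto"; // "Staré Město" as UTF-8 bytes
$latin1 = "Star\xE9";                  // "Staré" as ISO-8859-1 bytes

// mb_check_encoding() validates the byte sequence against the given encoding.
$utf8IsValid   = mb_check_encoding($utf8, 'UTF-8');
$latin1IsValid = mb_check_encoding($latin1, 'UTF-8');

var_dump($utf8IsValid);   // bool(true)
var_dump($latin1IsValid); // bool(false): a lone 0xE9 is not valid UTF-8
```

Unlike mb_detect_encoding(), which guesses among candidate encodings, mb_check_encoding() only answers whether the bytes are valid in the one encoding you name.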

0

scraaappy Jun 20 '13 at 10:47


I stopped relying on the built-in UTF-8 / UTF-16 decoding functions and convert to &#number; representations myself. I have not found any pattern to why UTF-8 is sometimes not detected; I suspect it is because the "encoded-as" sequence is not always in exactly the same position in the returned string. You can do an additional check for that.

UTF-8 three-byte indicator (the byte order mark): $startutf8 = chr(0xEF).chr(187).chr(191); (if you see this ANYWHERE, and not just in the first three bytes, the string is encoded in UTF-8)
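A rough sketch of that check follows. The helper name contains_utf8_bom is made up for this example and is not from any library. Keep in mind that most UTF-8 text carries no byte order mark at all, so a negative result says nothing about the encoding.

```php
<?php
// Hypothetical helper (name invented for this example): reports whether the
// UTF-8 byte order mark (bytes EF BB BF) occurs anywhere in the string.
function contains_utf8_bom(string $s): bool
{
    return strpos($s, "\xEF\xBB\xBF") !== false;
}

// Same three bytes as in the text above: 0xEF, 187 (0xBB), 191 (0xBF).
$startutf8 = chr(0xEF) . chr(187) . chr(191);

var_dump(contains_utf8_bom($startutf8 . 'Praha 5')); // bool(true)
var_dump(contains_utf8_bom('Praha 5'));              // bool(false)
```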

Decode according to the rules of UTF-8. This replaced an earlier version that went through the string byte by byte:

 function charset_decode_utf_8($string)
 {
     // Only do the slow conversion if there are 8-bit characters.
     if (!preg_match('/[\x80-\xFF]/', $string)) {
         return $string;
     }

     // Decode three-byte UTF-8 sequences into &#number; references.
     $string = preg_replace_callback(
         '/([\xE0-\xEF])([\x80-\xBF])([\x80-\xBF])/',
         function ($m) {
             return '&#' . ((ord($m[1]) - 224) * 4096 + (ord($m[2]) - 128) * 64 + (ord($m[3]) - 128)) . ';';
         },
         $string
     );

     // Decode two-byte UTF-8 sequences into &#number; references.
     $string = preg_replace_callback(
         '/([\xC0-\xDF])([\x80-\xBF])/',
         function ($m) {
             return '&#' . ((ord($m[1]) - 192) * 64 + (ord($m[2]) - 128)) . ';';
         },
         $string
     );

     return $string;
 }
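To see what the two-byte branch of the function above does, here is the same arithmetic worked by hand for "ě" from the question. This is just an illustration; the variable names are made up.

```php
<?php
// "ě" is U+011B, which UTF-8 encodes as the two bytes C4 9B.
$char = "\xC4\x9B";

// Same arithmetic as the two-byte branch:
// (0xC4 - 192) * 64 + (0x9B - 128) = 4 * 64 + 27 = 283
$codepoint = (ord($char[0]) - 192) * 64 + (ord($char[1]) - 128);
$reference = '&#' . $codepoint . ';';

echo $reference, "\n"; // &#283;
```

283 is 0x11B in hexadecimal, so the numeric reference points at the right code point even though Latin-1 has no byte for it.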

0

Peters v Aug 10 '13 at 1:25


The problem is the encoding of your PHP file. Save the file in UTF-8 encoding and you will not even need utf8_decode. If you get the data 'Praha 5, Staré Město,' from a database, it is better to change its character set to UTF-8.

0

vimal1083 Apr 25 '14 at 10:09


deceze · Accepted Answer · 2013-06-20T10:19:43+0000

utf8_decode converts a string from the UTF-8 encoding to ISO-8859-1, a.k.a. "Latin-1".
The Latin-1 encoding cannot represent the letter "ě". It's that simple.
"Decode" is a complete misnomer; the function performs the same conversion as iconv('UTF-8', 'ISO-8859-1', $string).

See What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text.
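To make the point concrete, here is a small sketch comparing a conversion to Latin-1 with a conversion to ISO-8859-2 (Latin-2), which does cover Czech. It assumes the mbstring extension is loaded and that mbstring's substitute character is left at its default of "?". Whether a Latin-2 string is of any use to your PDF library is a separate question that this example does not address.

```php
<?php
// Assumption: the mbstring extension is available, with its default
// substitute character ("?") for unrepresentable characters.
$word = "M\xC4\x9Bsto"; // "Město" as UTF-8 bytes

// Latin-1 has no code point for "ě", so it is replaced.
$latin1 = mb_convert_encoding($word, 'ISO-8859-1', 'UTF-8');

// Latin-2 covers Czech: "ě" becomes the single byte 0xEC.
$latin2 = mb_convert_encoding($word, 'ISO-8859-2', 'UTF-8');

echo bin2hex($latin1), "\n"; // 4d3f73746f  (M ? s t o)
echo bin2hex($latin2), "\n"; // 4dec73746f  (M 0xEC s t o)
```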
