PHP: How to create Unicode file names

I am trying to create files with Unicode characters in file names. I do not quite understand what encoding I should use, or if it is possible at all.

I have this file saved in latin1 encoding:

$h = fopen("unicode_♫.txt", 'w'); fclose($h); 

Decoded as UTF-8, this name should read "unicode_♫.txt". PHP writes the latin1 bytes to disk instead (which is to be expected?). I need the name to be saved as it would appear when decoded as UTF-8. I also tried encoding it as UTF-16, but that does not work either.

I am using PHP 5.2 and would like this to work with NTFS, ext3 and ext4.

How can I do that?

+7
4 answers

This is currently not possible on Windows (perhaps PHP 5.4 will support this scenario). In PHP, you can only write file names using the code page Windows is configured with. If that code page does not contain a character, you cannot use it. Even worse, if a file on Windows already has such a character in its name, you will have trouble accessing it from PHP.
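
One way to check whether a given name falls under this limitation is to round-trip it through the code page. The sketch below assumes the iconv extension is available and uses CP1252 purely as an example code page; `isRepresentableIn` is a hypothetical helper, not a PHP built-in.

```php
<?php
// Hypothetical helper: true if $utf8Name survives a round trip through
// $codePage, i.e. every character of the name exists in that code page.
function isRepresentableIn($utf8Name, $codePage)
{
    // @ suppresses the notice iconv may raise for unconvertible characters.
    $encoded = @iconv('UTF-8', $codePage, $utf8Name);
    if ($encoded === false) {
        return false;
    }
    // Some iconv implementations substitute a placeholder instead of failing,
    // so compare the decoded result with the original name.
    return @iconv($codePage, 'UTF-8', $encoded) === $utf8Name;
}

// Names are spelled as explicit UTF-8 bytes so the result does not depend
// on the encoding this source file is saved in.
$note   = "unicode_\xE2\x99\xAB.txt"; // unicode_♫.txt
$accent = "Dep\xC3\xB3sito";          // Depósito

var_dump(isRepresentableIn($note, 'CP1252'));
var_dump(isRepresentableIn($accent, 'CP1252'));
```

The musical note is not part of CP1252, so the first check is expected to fail, while the accented name should pass. The code page to test against depends on the Windows installation.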

On Linux, at least with ext*, it is a completely different story. You can use any file names you like; the OS does not care about encoding. So if you consistently use file names in UTF-8, you should be fine. UTF-16, however, is out of the question, because file names cannot contain bytes with a value of 0.
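
A small sketch of both points, assuming the iconv extension and a writable temporary directory (the variable names are illustrative only): the UTF-8 form of the name contains no zero bytes, the UTF-16 form does, and on Linux the UTF-8 bytes can be handed to the file functions unchanged.

```php
<?php
// "unicode_♫.txt" as explicit UTF-8 bytes, independent of how this
// source file itself is encoded.
$utf8Name  = "unicode_\xE2\x99\xAB.txt";
$utf16Name = iconv('UTF-8', 'UTF-16LE', $utf8Name);

// UTF-16 stores ASCII characters as two bytes, one of which is zero,
// so the encoded name is unusable as a byte-string file name.
$utf8HasNul  = strpos($utf8Name, "\0") !== false;
$utf16HasNul = strpos($utf16Name, "\0") !== false;

// On ext3/ext4 the UTF-8 bytes are simply stored as the file name.
$path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . $utf8Name;
if (file_put_contents($path, "test\n") !== false) {
    // Directory listings return the same bytes that were written.
    $listed = in_array($utf8Name, scandir(sys_get_temp_dir()), true);
    unlink($path); // clean up the demo file
}
```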

+10

For me, the code below works well on Win7/NTFS with Apache 2.2.21.0 and PHP 5.3.8.0:

 <?php
 // this source file is utf-8 encoded
 $fileContent = "Content of my file which contains Turkish characters such as şığŞİĞ";
 $dirName = 'Dirname with utf-8 chars such as şığŞİĞ';
 $fileName = 'Filename with utf-8 chars such as şığŞİĞ';

 // converting encodings of names from utf-8 to iso-8859-9 (Turkish)
 $encodedDirName = iconv("UTF-8", "ISO-8859-9//TRANSLIT", $dirName);
 $encodedFileName = iconv("UTF-8", "ISO-8859-9//TRANSLIT", $fileName);

 mkdir($encodedDirName);
 file_put_contents("$encodedDirName/$encodedFileName.txt", $fileContent);

You can do the same for opening files:

 <?php
 $fileName = "Filename with utf-8 chars such as şığ";
 $fileContent = file_get_contents(iconv("UTF-8", "ISO-8859-9//TRANSLIT", "$fileName.txt"));
 print $fileContent;
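
Since the question also targets ext3 and ext4, the conversion can be kept in one place and applied only on Windows. This is a minimal sketch, assuming a PHP build that passes file names to Windows in the system code page as described in this thread; `toFsName` is a hypothetical helper, and ISO-8859-9 only suits a Turkish setup.

```php
<?php
// Hypothetical helper: turn a UTF-8 name into the bytes the file functions
// should receive. The caller decides whether Windows rules apply.
function toFsName($utf8Name, $isWindows, $codePage = 'ISO-8859-9')
{
    if (!$isWindows) {
        return $utf8Name; // ext3/ext4: keep the UTF-8 bytes as they are
    }
    // Assumption: $codePage matches the code page Windows is configured with.
    $converted = @iconv('UTF-8', $codePage, $utf8Name);
    if ($converted === false) {
        throw new InvalidArgumentException("Name cannot be converted to $codePage");
    }
    return $converted;
}

// Detect Windows from the PHP_OS constant.
$isWindows = strtoupper(substr(PHP_OS, 0, 3)) === 'WIN';

// "dosya_ş.txt" spelled as explicit UTF-8 bytes.
$fsName = toFsName("dosya_\xC5\x9F.txt", $isWindows);
```

The same helper can then be used for both writing and reading, so the two code paths cannot drift apart.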
+5

Using the com_dotnet PHP extension, you can access Windows' Scripting.FileSystemObject and then do whatever you want with UTF-8 file/folder names.

I have packaged this as a PHP stream wrapper, so it is very easy to use:

https://github.com/nicolas-grekas/Patchwork-UTF8/blob/lab-windows-fs/class/Patchwork/Utf8/WinFsStreamWrapper.php

First, make sure the com_dotnet extension is enabled in your php.ini, then register the wrapper with:

 stream_wrapper_register('win', 'Patchwork\Utf8\WinFsStreamWrapper'); 

Finally, use the functions you are used to (mkdir, fopen, rename, etc.), but prefix your path with win://

For example:

 <?php
 $dir_name = "Depósito";
 mkdir('win://' . $dir_name);
 ?>
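
For code that also has to run where the wrapper is unavailable (for example on Linux), the registration can be guarded. This is only a sketch: it assumes the wrapper class is loadable through an autoloader, and the `$prefix` and `$target` variables are illustrative names.

```php
<?php
// Use the win:// prefix only when the wrapper can actually work;
// otherwise fall back to plain paths.
$wrapperClass = 'Patchwork\Utf8\WinFsStreamWrapper';
$prefix = '';

if (extension_loaded('com_dotnet') && class_exists($wrapperClass)) {
    // Avoid registering the same protocol twice.
    if (in_array('win', stream_get_wrappers(), true)
        || stream_wrapper_register('win', $wrapperClass)) {
        $prefix = 'win://';
    }
}

// "Depósito" spelled as explicit UTF-8 bytes.
$dirName = "Dep\xC3\xB3sito";
$target  = $prefix . $dirName;

// mkdir($target); // left commented out so the sketch has no side effects
```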
+1

File names do not have a notion of encoding. You have to think about the file name differently. The only thing that matters in your situation is that on most file systems a file name is a null-terminated byte string, whereas on NTFS it is a null-terminated string of 16-bit units. Therefore, you cannot use the standard fopen-type functions to access every possible NTFS file name.

However, if you obtained the NTFS file name of an existing file by other means, you can use the Windows API function GetShortPathName to get a short file name that you can pass to fopen. I do not know whether PHP supports access to Windows API functions, but maybe someone has written a module or plugin for this.
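
One possible route, sketched here under several assumptions, is the Scripting.FileSystemObject COM object mentioned in another answer, whose File objects expose a ShortPath property. This does not call GetShortPathName directly. It assumes the com_dotnet extension is loaded and that short (8.3) names are enabled on the volume; `shortPathOrNull` is a hypothetical helper.

```php
<?php
// Hypothetical helper: return the short path of an existing file,
// or null when COM is unavailable or the file cannot be found.
function shortPathOrNull($utf8Path)
{
    if (!class_exists('COM')) {
        return null; // not on Windows, or com_dotnet is not loaded
    }
    try {
        // CP_UTF8 tells the COM layer that PHP strings are UTF-8.
        $fso = new COM('Scripting.FileSystemObject', null, CP_UTF8);
        if (!$fso->FileExists($utf8Path)) {
            return null;
        }
        return (string) $fso->GetFile($utf8Path)->ShortPath;
    } catch (Exception $e) {
        return null; // treat any COM failure as "no short path available"
    }
}

// "unicode_♫.txt" spelled as explicit UTF-8 bytes.
$short = shortPathOrNull("C:\\temp\\unicode_\xE2\x99\xAB.txt");
if ($short !== null) {
    $h = fopen($short, 'r');
    if ($h !== false) {
        fclose($h);
    }
}
```

The path `C:\temp\...` is only a placeholder. Whether the returned short name is plain ASCII depends on the file and the system, so the result should be checked before relying on it.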

0