What is the most direct way to convert a Path to a *const c_char?

Given a std::path::Path, what is the most direct way to convert it to a NUL-terminated std::os::raw::c_char string (for passing to C functions that take a path)?

    use std::ffi::CString;
    use std::os::raw::c_char;
    use std::path::Path;

    extern "C" {
        fn some_c_function(path: *const c_char);
    }

    fn example_c_wrapper(path: &Path) {
        // Path -> OsStr -> &str -> CString.
        // Panics if the path is not valid UTF-8 or contains a NUL byte.
        let path_str_c = CString::new(path.as_os_str().to_str().unwrap()).unwrap();
        unsafe { some_c_function(path_str_c.as_ptr()) };
    }

Is there a way to avoid so many intermediate steps?

 Path -> OsStr -> &str -> CString -> as_ptr() 
+13
3 answers

It is not as simple as it seems. There is one piece of information that you did not provide: in what encoding does the C function expect the path?

On Linux, paths are simply arrays of bytes (a 0 byte is not allowed), and applications usually don't try to decode them. (They may still have to decode them with a specific encoding, for example to display them to the user, in which case they will usually decode them according to the current locale, which often uses UTF-8.)
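To make the "just bytes" point concrete, here is a minimal Unix-only sketch. The helper names `path_from_raw` and `is_utf8` are made up for illustration; `OsStr::from_bytes` comes from the standard `std::os::unix::ffi::OsStrExt` trait.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt; // Unix-only extension trait
use std::path::Path;

/// Hypothetical helper: views raw bytes as a path.
/// On Unix any byte sequence is accepted here, valid UTF-8 or not.
fn path_from_raw(bytes: &[u8]) -> &Path {
    Path::new(OsStr::from_bytes(bytes))
}

/// Hypothetical helper: reports whether the path can be viewed as a `&str`.
fn is_utf8(path: &Path) -> bool {
    path.to_str().is_some()
}
```

A path such as `b"/tmp/caf\xE9"` (Latin-1 bytes) is a perfectly usable path on Linux, yet `to_str()` returns `None` for it, so any conversion that goes through `&str` and unwraps would panic.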

On Windows, this is more complicated, because there are variants of the API functions that use the "ANSI" code page and variants that use "Unicode" (UTF-16). In addition, Windows has traditionally not supported setting UTF-8 as the "ANSI" code page. This means that unless the library specifically expects UTF-8 and converts the path to the native encoding itself, passing it a UTF-8 encoded path is incorrect (although it may appear to work for strings containing only ASCII characters).
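For a C API that takes UTF-16 (wide) strings, a conversion could look roughly like the sketch below. The function name `path_to_wide` is hypothetical. The Windows branch uses the standard `encode_wide` method from `std::os::windows::ffi::OsStrExt`; the other branch is only a stand-in so the sketch compiles elsewhere, and it gives up on non-UTF-8 paths. Neither branch checks for interior NUL code units.

```rust
use std::iter::once;
use std::path::Path;

/// Hypothetical helper: encodes a path as a NUL-terminated UTF-16 buffer.
#[cfg(windows)]
fn path_to_wide(path: &Path) -> Option<Vec<u16>> {
    use std::os::windows::ffi::OsStrExt;
    // encode_wide() yields the platform's native 16-bit units.
    Some(path.as_os_str().encode_wide().chain(once(0)).collect())
}

/// Stand-in for non-Windows targets: only works for valid UTF-8 paths.
#[cfg(not(windows))]
fn path_to_wide(path: &Path) -> Option<Vec<u16>> {
    let s = path.to_str()?;
    Some(s.encode_utf16().chain(once(0)).collect())
}
```

The resulting buffer's `as_ptr()` can then be handed to a function that expects a pointer to 16-bit units, as long as the `Vec<u16>` stays alive for the duration of the call.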

(I do not know about other platforms, but this is already messy enough.)

In Rust, Path is just a wrapper around OsStr. OsStr uses a platform-specific representation that happens to be compatible with UTF-8 when the string is indeed valid UTF-8, but non-UTF-8 strings use an unspecified encoding (on Windows it actually uses WTF-8, but this is not guaranteed; on Linux, it is just the array of bytes as is).

Before passing the path to the C function, you must determine which encoding it expects the string in, and if that does not match Rust's encoding, you will have to convert it before wrapping it in a CString. In a platform-independent way, Rust only lets you convert a Path or OsStr to a str. On Unix targets, the OsStrExt trait additionally provides access to the OsStr as a slice of bytes.
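One way to express that per-platform decision is a pair of `cfg`-gated functions. This is only a sketch: `path_to_cstring` is a made-up name, and the non-Unix branch assumes the C library really does accept UTF-8, which (as discussed above) is often not the case on Windows.

```rust
use std::ffi::CString;
use std::path::Path;

/// Unix: hand the raw path bytes to CString unchanged.
/// Returns None if the path contains an interior NUL byte.
#[cfg(unix)]
fn path_to_cstring(path: &Path) -> Option<CString> {
    use std::os::unix::ffi::OsStrExt;
    CString::new(path.as_os_str().as_bytes()).ok()
}

/// Elsewhere: only valid UTF-8 paths can be converted.
/// ASSUMPTION: the C side expects UTF-8; verify this for your library.
#[cfg(not(unix))]
fn path_to_cstring(path: &Path) -> Option<CString> {
    let s = path.to_str()?;
    CString::new(s).ok()
}
```

Returning `Option` instead of calling `unwrap()` lets the caller decide what to do with a path that cannot be represented, rather than panicking inside the wrapper.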

Rust used to provide a to_cstring method on OsStr, but it was never stabilized, and it was deprecated in Rust 1.6.0 because its behavior turned out to be inappropriate for Windows (it returned a UTF-8 encoded path, but the Windows APIs do not support that!).

+8

Since Path is just a thin wrapper around OsStr, you can almost pass it to your C function as is. But to be a valid C string, it needs a trailing NUL byte, so we must allocate a CString.

Converting to str, on the other hand, carries a risk (what if the Path is not a valid UTF-8 string?) as well as overhead, so I use as_bytes() instead of to_str().

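    use std::ffi::CString;
    use std::os::unix::ffi::OsStrExt; // provides as_bytes() on OsStr (Unix only)

    fn example_c_wrapper<P: AsRef<std::path::Path>>(path: P) {
        // Panics if the path contains an interior NUL byte.
        let path_str_c = CString::new(path.as_ref().as_os_str().as_bytes()).unwrap();
        unsafe { some_c_function(path_str_c.as_ptr()) };
    }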

This is for Unix. I do not know how this works on Windows.
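Two details of this approach are easy to get wrong: `CString::new` fails on interior NUL bytes, and the pointer from `as_ptr()` is only valid while the `CString` is alive. The sketch below illustrates both. `fake_c_strlen` is a hypothetical stand-in written in Rust so the example runs without linking a C library; it is not part of any real API.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for a C function: reads a NUL-terminated
/// string and returns its length in bytes (without the terminator).
unsafe fn fake_c_strlen(ptr: *const c_char) -> usize {
    // SAFETY: the caller guarantees `ptr` points to a live,
    // NUL-terminated string.
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}

/// Returns None instead of panicking when the bytes contain a NUL.
fn call_with_path_bytes(bytes: &[u8]) -> Option<usize> {
    // Keep the CString in a named binding so it outlives the raw pointer.
    let c_path = CString::new(bytes).ok()?;
    // SAFETY: `c_path` is alive for the whole call and NUL-terminated.
    Some(unsafe { fake_c_strlen(c_path.as_ptr()) })
}
```

Writing `CString::new(bytes).unwrap().as_ptr()` in a single expression would drop the temporary `CString` at the end of the statement and leave a dangling pointer, which is why the named binding matters.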

+1

If you are trying to produce a Vec<u8>, this is what I usually do:

    use std::path::Path;

    #[cfg(unix)]
    fn path_to_bytes<P: AsRef<Path>>(path: P) -> Vec<u8> {
        use std::os::unix::ffi::OsStrExt;
        path.as_ref().as_os_str().as_bytes().to_vec()
    }

    #[cfg(not(unix))]
    fn path_to_bytes<P: AsRef<Path>>(path: P) -> Vec<u8> {
        // On Windows, could use std::os::windows::ffi::OsStrExt to encode_wide(),
        // but you end up with a Vec<u16> instead of a Vec<u8>, so that does not
        // really help.
        path.as_ref().to_string_lossy().to_string().into_bytes()
    }

I do this knowing full well that non-UTF-8 paths on non-Unix platforms will not be supported properly. Note that you may need a Vec<u8> when working with Thrift or protocol buffers rather than with a C API.
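If the bytes travel through a serialization layer and need to become a path again on the receiving side, the reverse direction on Unix might look like this. `bytes_to_path` is a hypothetical name; `OsString::from_vec` is the standard method from `std::os::unix::ffi::OsStringExt`.

```rust
use std::ffi::OsString;
use std::os::unix::ffi::OsStringExt; // Unix-only extension trait
use std::path::PathBuf;

/// Hypothetical helper: rebuilds a path from raw bytes without
/// any UTF-8 validation, so non-UTF-8 paths survive the round trip.
fn bytes_to_path(bytes: Vec<u8>) -> PathBuf {
    PathBuf::from(OsString::from_vec(bytes))
}
```

This only round-trips losslessly on Unix; bytes produced by a lossy conversion on another platform may already have lost information before they are sent.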

0