C++ WCHAR Manipulations

I am developing a tiny Win32 application in C++. I learned the basics of C++ a long time ago, so now I'm completely confused about character strings in C++. Back then there was no WCHAR or TCHAR, only char and String. After a little investigation, I decided not to use TCHAR.

My problem is very simple, I think, but I cannot find a clear guide on how to manipulate strings in C++. After coding in PHP for the last few years, I expected string manipulation to be simple, and I was wrong!

All I need to do is put new data into a character string.

    WCHAR* cs = L"\0";
    swprintf( cs, "NEW DATA" );

This was my first attempt. When debugging my application, I found that swprintf only puts the first two characters into my variable. I solved this problem as follows:

    WCHAR cs[1000];
    swprintf( cs, "NEW DATA" );

But this trick can fail, because in my case the new data is not a constant value but another variable that can be longer than 1000 characters. My code looks like this:

    WCHAR cs[1000];
    WCHAR* nd1;
    WCHAR* nd2;
    wcscpy(nd1, L"Some value");
    wcscpy(nd2, L"Another value");
    // Actually these variables store the paths of the user-selected folders
    swprintf( cs, "The paths are %s and %s", nd1, nd2);

In this case, the combined length of nd1 and nd2 may well exceed 1000 characters, so critical data will be lost.

The question is: how can I copy all the data I need into a WCHAR string declared as WCHAR* wchar_var; without losing anything?

PS Since I am Russian, the question may be unclear. Let me know if it is, and I will try to explain my problem more clearly and in more detail.

3 answers

In modern Windows programming, you can simply ignore TCHAR and use wchar_t (WCHAR) and Unicode UTF-16 instead.

(TCHAR is a model from the past, when you wanted a single code base that could produce both ANSI/MBCS and Unicode builds by changing preprocessor switches such as _UNICODE and UNICODE.)

In any case, to simplify the code, you should use C++ and its convenient string classes. You can use ATL::CString (which corresponds to CStringW in Unicode builds, the default since VS2005) or the STL's std::wstring.

Using CString, you can do:
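
To make the switch mechanism concrete, here is a small portable imitation of it. This is only a sketch and not the contents of the real `<tchar.h>`: the names `DEMO_UNICODE`, `demo_tchar`, `DEMO_T` and `demo_tcslen` are invented for this illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for TCHAR, _T() and _tcslen().
// Defining DEMO_UNICODE before this point selects the wide variant.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;
#define DEMO_T(x) L##x
inline std::size_t demo_tcslen(const demo_tchar* s) { return std::wcslen(s); }
#else
typedef char demo_tchar;
#define DEMO_T(x) x
inline std::size_t demo_tcslen(const demo_tchar* s) { return std::strlen(s); }
#endif

// The same source line compiles in both configurations.
inline std::size_t demo_folder_name_length()
{
    const demo_tchar* name = DEMO_T("folder");
    return demo_tcslen(name);
}
```

Code written against such aliases builds either way, which is exactly the flexibility that is rarely needed anymore.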

    CString str1 = L"Some value";
    CString str2 = L"Another value";
    CString cs;
    cs.Format(L"The paths are %s and %s", str1.GetString(), str2.GetString());

CString also provides the proper operator+ overloads to concatenate strings (so you don't need to calculate the total length of the resulting string, dynamically allocate a buffer for the target string or check the existing buffer size, call wcscpy or wcscat, remember to free the buffer, etc.).

And you can simply pass CString instances to Win32 APIs that expect const wchar_t* parameters (LPCWSTR/PCWSTR), since CString offers an implicit conversion operator to const wchar_t*.
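
With std::wstring, the other class mentioned above, there is no implicit conversion, so you call c_str() when a function wants a const wchar_t*. A minimal sketch, in which `api_taking_wide_string` is a made-up stand-in for any Win32 function with an LPCWSTR parameter and `show_paths` is a hypothetical helper:

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical stand-in for a Win32 function that takes an LPCWSTR.
// Here it only measures the text so the example stays portable.
std::size_t api_taking_wide_string(const wchar_t* text)
{
    return std::wcslen(text);
}

// Builds the message with std::wstring and hands it to the "API".
std::size_t show_paths(const std::wstring& first, const std::wstring& second)
{
    // operator+ grows the string as needed; no manual length bookkeeping.
    std::wstring message = L"The paths are " + first + L" and " + second;

    // c_str() returns a null-terminated const wchar_t* that stays valid
    // as long as `message` is alive and not modified.
    return api_taking_wide_string(message.c_str());
}
```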


When you use WCHAR* like that, you invoke undefined behavior, because you have a pointer that does not point to anything valid. You need to work out how long the resulting string will be and dynamically allocate space for it. For instance:

    WCHAR* nd1 = new WCHAR[lstrlen(L"Some value") + 1];    // +1 for the null terminator
    WCHAR* nd2 = new WCHAR[lstrlen(L"Another value") + 1];
    wcscpy(nd1, L"Some value");
    wcscpy(nd2, L"Another value");
    // Actually these variables store the paths of the user-selected folders

    // Measure nd1 and nd2 only after they have been filled in
    size_t len = lstrlen(L"The paths are ") + lstrlen(nd1)
               + lstrlen(L" and ") + lstrlen(nd2) + 1;
    WCHAR* cs = new WCHAR[len];
    swprintf(cs, len, L"The paths are %ls and %ls", nd1, nd2);

    delete[] nd1;
    delete[] nd2;
    delete[] cs;

But this is very ugly and error-prone. As already noted, you should use std::wstring instead, something like this:

    std::wstring cs;
    std::wstring nd1;
    std::wstring nd2;
    nd1 = L"Some value";
    nd2 = L"Another value";
    cs = std::wstring(L"The paths are ") + nd1 + L" and " + nd2;
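
If you prefer to keep the printf-style format string, the result can still end up in a std::wstring of whatever length is needed. The following is only a sketch: `format_wide` is a hypothetical helper, not a standard or Win32 function, and it assumes the standard vswprintf behavior of returning a negative value when the buffer is too small.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: printf-style formatting into a std::wstring.
// The buffer is doubled and the call retried until the output fits.
std::wstring format_wide(const wchar_t* fmt, ...)
{
    std::vector<wchar_t> buf(64);
    for (;;) {
        va_list args;
        va_start(args, fmt);
        int written = std::vswprintf(buf.data(), buf.size(), fmt, args);
        va_end(args);

        if (written >= 0) {
            return std::wstring(buf.data(), static_cast<std::size_t>(written));
        }
        // A negative result may also mean a genuine formatting error,
        // so stop growing at an arbitrary cap (about one million characters).
        if (buf.size() > (1u << 20)) {
            return std::wstring();
        }
        buf.resize(buf.size() * 2);
    }
}
```

A call such as `format_wide(L"The paths are %ls and %ls", nd1.c_str(), nd2.c_str())` then works however long the two paths are. `%ls` is used because it means a wide string both in standard C++ and in Microsoft's wide printf functions.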

I suggest using the ATL class CStringW instead of raw WCHAR; it is much more convenient. CString is a wrapper around a dynamically allocated C string. It keeps track of the string length and adjusts the allocated buffer after each operation, so you do not have to worry about either.

Typical Usage:

    #include <atlstr.h>

    CStringW s;
    s.Format(L"The paths are %s and %s", L"Some value", L"Another value");
    const WCHAR* wstr = s.GetString(); // To pass to an API that needs a WCHAR string

or

    #include <atlstr.h>

    CStringW s(L"The paths are ");
    s += L"Some value";
    s += L" and ";
    s += L"Another value";
    const WCHAR* wstr = s.GetString(); // To pass to an API that needs a WCHAR string