How to write a non-English string to a file and read it back using C++?

I want to write a std::wstring to a file and read the content back as a std::wstring. This works as expected when the string is L"<any English letter>". The problem occurs with characters from Bengali, Kannada, Japanese, or any other non-English script. I have tried various options, for example:

  • Converting the std::wstring to std::string, writing that to the file, then reading it back as std::string and converting it to std::wstring
    • The write succeeds (I can see the text in an editor), but the content read back is incorrect.
  • Writing the std::wstring to a wofstream; this does not work for native-language text either, such as std::wstring data = L"হ্যালো ওয়ার্ল্ড";

The platforms are macOS and Linux; the language is C++.

The code:

    bool write_file( const char* path, const std::wstring data )
    {
        bool status = false;
        try {
            std::wofstream file(path, std::ios::out|std::ios::trunc|std::ios::binary);
            if (file.is_open()) {
                //std::string data_str = convert_wstring_to_string(data);
                file.write(data.c_str(), (std::streamsize)data.size());
                file.close();
                status = true;
            }
        } catch (...) {
            std::cout<<"exception !"<<std::endl;
        }
        return status;
    }

    // Read Method
    std::wstring read_file( const char* filename )
    {
        std::wifstream fhandle(filename, std::ios::in | std::ios::binary);
        if (fhandle) {
            std::wstring contents;
            fhandle.seekg(0, std::ios::end);
            contents.resize((int)fhandle.tellg());
            fhandle.seekg(0, std::ios::beg);
            fhandle.read(&contents[0], contents.size());
            fhandle.close();
            return(contents);
        }
        else {
            return L"";
        }
    }

    // Main
    int main()
    {
        const char* file_path_1 = "./file_content_1.txt";
        const char* file_path_2 = "./file_content_2.txt";

        //std::wstring data = L"Text message to write onto the file\n"; // This is happening as expected
        std::wstring data = L"হ্যালো ওয়ার্ল্ড"; // Not happening as expected.

        // Lets write some data
        write_file(file_path_1, data);
        // Lets read the file
        std::wstring out = read_file(file_path_1);

        std::wcout<<L"File Content: "<<out<<std::endl;
        // Let write that same data onto the different file
        write_file(file_path_2, out);
        return 0;
    }
+7
c++ clang++ wifstream wofstream
5 answers

How wchar_t is output depends on the locale. The default locale ("C") usually does not accept anything other than ASCII (Unicode code points 0x20 ... 0x7E, plus several control characters).

Whenever a program processes text, the very first statement in main should be:

 std::locale::global( std::locale( "" ) ); 

If a program uses any of the standard stream objects, the code should also imbue them with the global locale before any input or output.
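
As a rough sketch of the advice above, the two helpers below imbue a wide file stream with a given locale before any I/O takes place. The names `write_wide` and `read_wide` are made up for illustration, and the sketch assumes the locale passed in uses a UTF-8 encoding (for example when the environment sets `LANG=en_US.UTF-8`).

```cpp
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper: write a wide string through a stream imbued with `loc`.
// The locale's codecvt facet decides which bytes end up in the file.
bool write_wide(const char* path, const std::wstring& text, const std::locale& loc) {
    std::wofstream out;
    out.imbue(loc);  // imbue before the file is opened and before any output
    out.open(path, std::ios::out | std::ios::trunc);
    if (!out) return false;
    out << text;
    out.flush();
    return static_cast<bool>(out);  // false if a character could not be converted
}

// Hypothetical helper: read the whole file back as wide characters.
// Returns an empty string if the file cannot be opened.
std::wstring read_wide(const char* path, const std::locale& loc) {
    std::wifstream in;
    in.imbue(loc);
    in.open(path);
    return std::wstring(std::istreambuf_iterator<wchar_t>(in),
                        std::istreambuf_iterator<wchar_t>());
}
```

In a real program, `main` would start with `std::locale::global(std::locale(""));`, then call `std::wcout.imbue(std::locale());`, and pass `std::locale()` to both helpers. Note that `std::locale("")` throws `std::runtime_error` if the environment names a locale the system does not have.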

+3

To read and write Unicode files (assuming you want to write Unicode characters) you can try fopen_s

    FILE *file;
    // fopen_s returns 0 on success; "ccs=UNICODE" is a Microsoft extension.
    if (fopen_s(&file, file_path, "w,ccs=UNICODE") == 0) {
        fputws(your_wstring().c_str(), file);
        fclose(file);
    }
0

Later edit: this is for Windows (there was no platform tag on the question when this answer was written).

You need to imbue the stream with a locale that supports these characters. Try something like this (for UTF-8 / UTF-16):

    std::wofstream myFile("out.txt"); // writing to this file
    myFile.imbue(std::locale(myFile.getloc(), new std::codecvt_utf8_utf16<wchar_t>));

And when you read from this file, you should do the same:

    std::wifstream myFile2("out.txt"); // reading from this file
    myFile2.imbue(std::locale(myFile2.getloc(), new std::codecvt_utf8_utf16<wchar_t>));
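
For a complete round trip, here is a minimal sketch along the same lines. It swaps in `std::codecvt_utf8<wchar_t>` rather than `std::codecvt_utf8_utf16<wchar_t>`, on the assumption that `wchar_t` is 32 bits wide, as it is on Linux and macOS. The helper names are hypothetical, and `<codecvt>` has been deprecated since C++17, so this assumes your standard library still provides it.

```cpp
#include <codecvt>  // deprecated since C++17; assumed to be available
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Build a locale whose wchar_t <-> char conversion is UTF-8.
// The std::locale object takes ownership of the facet allocated with new.
std::locale utf8_facet_locale() {
    return std::locale(std::locale::classic(), new std::codecvt_utf8<wchar_t>);
}

// Hypothetical helper: write `text` to `path` encoded as UTF-8.
bool save_utf8(const char* path, const std::wstring& text) {
    std::wofstream out(path, std::ios::out | std::ios::trunc | std::ios::binary);
    if (!out) return false;
    out.imbue(utf8_facet_locale());  // imbue before the first write
    out << text;
    out.flush();
    return static_cast<bool>(out);
}

// Hypothetical helper: decode a UTF-8 file into a wide string.
std::wstring load_utf8(const char* path) {
    std::wifstream in(path, std::ios::in | std::ios::binary);
    in.imbue(utf8_facet_locale());   // imbue before the first read
    return std::wstring(std::istreambuf_iterator<wchar_t>(in),
                        std::istreambuf_iterator<wchar_t>());
}
```

Because the facet is attached to the stream explicitly, this variant does not depend on which locales are installed on the machine.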
0

One possible problem occurs when reading the string back: you set the string length to the number of bytes in the file, not the number of characters. This means that you try to read past the end of the file, and also that the string will contain garbage at the end.
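
To illustrate the gap between bytes and characters, here is a small sketch that counts code points in a UTF-8 byte string. It assumes the file content is UTF-8; the function name `count_code_points` is made up for this example.

```cpp
#include <cstddef>
#include <string>

// Count Unicode code points in a UTF-8 byte string by skipping
// continuation bytes, which always have the bit pattern 10xxxxxx.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

The six code points U+09B9 U+09CD U+09AF U+09BE U+09B2 U+09CB take 18 bytes in UTF-8, so a std::wstring sized from `tellg()` would end up with 12 surplus elements.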

If you are dealing with text files, why not just use the usual output and input operators << and >> or other text functions such as std::getline ?
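
A minimal sketch of that suggestion, reading with std::getline so the code never needs the file size. The helper name `read_lines` is hypothetical, and the locale passed in is assumed to match the file's encoding.

```cpp
#include <fstream>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: read a text file line by line through a wide
// stream. The loop stops at end of file, so no byte count is required.
std::vector<std::wstring> read_lines(const char* path, const std::locale& loc) {
    std::vector<std::wstring> lines;
    std::wifstream in;
    in.imbue(loc);  // must match the encoding of the file
    in.open(path);
    std::wstring line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

Each returned element holds exactly the characters of one line, without the trailing newline.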

0

Do not use wstring or wchar_t. On platforms other than Windows, wchar_t is almost useless today.

You should use UTF-8 instead.

    #include <fstream>
    #include <iostream>
    #include <string>

    bool write_file( const char* path, const std::string& data )
    {
        try {
            std::ofstream file(path, std::ios::out | std::ios::trunc | std::ios::binary);
            // Throw on any I/O error, including a failed open.
            file.exceptions(std::ios::failbit | std::ios::badbit);
            file << data;
            return true;
        } catch (...) {
            std::cout << "exception!\n";
            return false;
        }
    }

    // Read Method
    std::string read_file( const char* filename )
    {
        std::ifstream fhandle(filename, std::ios::in | std::ios::binary);
        if (fhandle) {
            std::string contents;
            fhandle.seekg(0, std::ios::end);
            contents.resize(fhandle.tellg());
            fhandle.seekg(0, std::ios::beg);
            fhandle.read(&contents[0], contents.size());
            return contents;
        } else {
            return "";
        }
    }

    int main()
    {
        const char* file_path_1 = "./file_content_1.txt";
        const char* file_path_2 = "./file_content_2.txt";

        std::string data = "হ্যালো ওয়ার্ল্ড"; // linux and os x compilers use UTF-8 as the default execution encoding.

        write_file(file_path_1, data);
        std::string out = read_file(file_path_1);

        // out is a narrow UTF-8 string, so print it with std::cout, not std::wcout.
        std::cout << "File Content: " << out << '\n';
        write_file(file_path_2, out);
    }
0
