Cross platform newline confusion

For some reason, my write-to-textfile function suddenly stopped working.

void write_data(char* filename, char* writethis)
{
    ofstream myfile;
    myfile.open(filename, std::ios_base::app);
    myfile << endl << writethis;
    myfile.close();
}

The function is called from a loop, so the file started with an empty line and each following "writethis" string was appended on a new line.

Then all of a sudden, no more new lines: all the text ended up on a single line. So I did some digging and came across this:

  • Windows = CR LF
  • Linux = LF
  • Mac (pre-OS X) = CR

So, I changed the line to

 myfile << "\r\n" << writethis; 

And it worked again. But now I'm puzzled. I code on Linux, but I read the text files the program creates on Windows, after transferring them with FileZilla. Now, which part of this caused the lines to show up in the text file as one line?

I was sure that endl worked fine on Linux, so now I'm thinking Windows messed with the file afterwards? If I have to guess at how the text file is written (and read), I'm guaranteeing that my program will break, so if anyone can explain this, I'll be grateful.

I also don't remember what I changed in my program to make it break, because it worked perfectly before. The only thing I added is threading.

Edit: I tried switching the transfer mode between ASCII and binary (I even removed the force-ASCII-for-.txt-files setting), but it makes no difference. New lines appear on Linux, but not on Windows.

How strange.

+7
4 answers

What happens is that you write a Unix line ending ('\n'), then transfer the file to a Windows machine, getting a bit-for-bit identical file, and then try to open it with a viewer that does not understand Unix line endings (probably Notepad).

From my experience writing portable code:

  • Standardize on ONE line ending ('\n', LF) on all platforms.
  • Always open your files in binary mode, even when you are writing text (see the sketch after this list).
  • Let whoever opens the file use a text viewer that understands either line ending. There are plenty on Windows (including Visual Studio, Notepad++, WordPad, and your favorite browser).
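
To make the second bullet concrete, here is a minimal sketch of an append function written the way this answer recommends (the name write_line and the use of std::string are my own choices, not from the question): the stream is opened in binary mode and a single LF is written explicitly, so the bytes on disk are identical on every platform.

#include <fstream>
#include <string>

// Sketch: append one line that always ends in a single LF, on any platform.
// std::ios_base::binary disables the text-mode newline translation.
void write_line(const std::string& filename, const std::string& text)
{
    std::ofstream out(filename, std::ios_base::app | std::ios_base::binary);
    out << text << '\n';   // '\n' stays a single LF byte because the stream is binary
}

Any viewer that understands LF-only files (Notepad++, WordPad, a browser) will then show the same lines on Windows as on Linux.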

Yes, I would rather standardize on one thing for everyone than support all of them everywhere. I also deny the existence of "the correct line ending for the proper platform." The fact that Microsoft decided that their native APIs would not speak UTF-8 or understand Unix line endings does not stop your code from doing so on Windows; just remember to convert at the WinAPI boundary. Much of the time you are doing text processing on internal data the system will never see, so why the hell complicate your life by meeting those systems' expectations there?

+5

endl "works great for Linux." Streaming endl passes the \n character and clears the stream. Always.

However, a text-mode file stream converts that '\n' to "\r\n" at the implementation level on Windows, and line endings will also often be converted when a file is transferred between platforms.

This is probably not a problem in your C++ at all, and nothing is "broken"; you should probably configure FileZilla to treat your file as text rather than binary (the mode in which line endings are not converted). If your file has no name extension such as ".txt", it may not do this by default.

+5

FTP can "corrupt" your files (that is, convert the newlines) if you transfer them as ASCII. Try transferring as BIN (binary).

+2

Internally, all applications use '\n' to indicate line termination.

The problem is that the line-termination sequence in a text file is platform specific (as your research shows). Note: text mode is the default when opening a file; if you explicitly select binary when opening the file, no translation happens on read or write.

What this actually means is that the character '\n' is converted into the platform-specific sequence when you write it to a file, and that platform-specific sequence is converted back into '\n' when the file is read. The problem you are facing is that you wrote the files on one platform and read them on another.

On Linux the line-ending sequence is LF ('\n'), so when you write the file every '\n' becomes a single LF character. You then transfer those files to a Windows system and read them there. On Windows the line-termination sequence is CRLF, so an editor reading the file looks for that two-character sequence to convert back to '\n' and does not find it. Whether you end up with one line or several then depends on how smart the editor is.
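
If you want to see the translation for yourself, here is a small sketch (the file name demo.txt is just an example): it writes one line in the default text mode, then reads the raw bytes back in binary mode, where no translation happens, and prints them with the CR and LF made visible.

#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

int main()
{
    // Write one line in the default text mode ('\n' may be translated).
    {
        std::ofstream out("demo.txt");
        out << "hello\n";
    }

    // Read the raw bytes back in binary mode (no translation).
    std::ifstream in("demo.txt", std::ios::binary);
    std::string bytes((std::istreambuf_iterator<char>(in)),
                      std::istreambuf_iterator<char>());

    for (char c : bytes) {
        if (c == '\r')      std::cout << "\\r";
        else if (c == '\n') std::cout << "\\n";
        else                std::cout << c;
    }
    std::cout << std::endl;
    // Prints "hello\n" when built on Linux and "hello\r\n" when built on Windows.
}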

+1
