The difference between opening a file in binary and text

I did a few things like:

    FILE* a = fopen("a.txt", "w");
    const char* data = "abc123";
    fwrite(data, 6, 1, a);
    fclose(a);

and then in the generated text file, it says "abc123", as expected. But then I:

    // this time it is "wb", not just "w"
    FILE* a = fopen("a.txt", "wb");
    const char* data = "abc123";
    fwrite(data, 6, 1, a);
    fclose(a);

and get the same result. If I read the file back in binary or normal mode, I also get the same result. So my question is: what is the difference between opening a file with and without binary mode?

Where I read about fopen modes: http://www.cplusplus.com/reference/cstdio/fopen/

+8
c++ file text binary
2 answers

The link you gave does describe the differences, but only near the bottom of the page:

http://www.cplusplus.com/reference/cstdio/fopen/

Text files are files containing sequences of lines of text. Depending on the environment where the application runs, some special character conversion may occur in input/output operations in text mode to adapt them to a system-specific text file format. Although on some environments no conversions occur and both text files and binary files are treated the same way, using the appropriate mode improves portability.

The conversion could be normalizing \r\n to \n (or vice versa), or perhaps ignoring characters beyond 0x7F (à la 'text mode' in FTP). Personally, I would open everything in binary mode and use a good text encoding library to deal with text.

+10

The most important difference to be aware of is that with a stream opened in text mode you get newline translation on non-*nix systems (it is also used for network communications, but that is not supported by the standard library). On *nix a newline is just the ASCII linefeed, \n, for both the internal and the external representation of text. On Windows the external representation often uses a carriage return + linefeed pair, "CRLF" (ASCII codes 13 and 10), which is converted to a single \n on input and back to the pair on output.


From the C99 standard (draft N869), §7.19.2/2:

A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined. Characters may have to be added, altered, or deleted on input and output to conform to differing conventions for representing text in the host environment. Thus, there need not be a one-to-one correspondence between the characters in a stream and those in the external representation. Data read in from a text stream will necessarily compare equal to the data that were earlier written out to that stream only if: the data consist only of printing characters and the control characters horizontal tab and new-line; no new-line character is immediately preceded by space characters; and the last character is a new-line character. Whether space characters that are written out immediately before a new-line character appear when read in is implementation-defined.

And in §7.19.3/2:

Binary files are not truncated, except as defined in 7.19.5.3. Whether a write on a text stream causes the associated file to be truncated beyond that point is implementation-defined.

About the use of fseek, in §7.19.9.2/4:

For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET.

About the use of ftell, in §7.19.9.4:

The ftell function obtains the current value of the file position indicator for the stream pointed to by stream. For a binary stream, the value is the number of characters from the beginning of the file. For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call; the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read.

I think those are the most important points, but there are some more details.

+5