Aren't the C-supplied integer types good enough for just about any project?

I am more of a system administrator than a programmer, but I spend an inordinate amount of time crawling through programmers' code trying to figure out what went wrong. And an alarming amount of that time is spent on problems where the programmer expected one definition of __u_ll_int32_t or whatever (yes, I know that's not real), but either expected the file defining that type to be somewhere other than where it is, or (and this is much worse, but thankfully rare) expected the semantics of that definition to be something other than what they are.

As I understand C, it quite deliberately does not define fixed widths for the integer types (and that is a Good Thing). Instead it gives the programmer char , short , int , long , and long long , in all their signed and unsigned glory, with certain minimum widths that the implementation must (hopefully) honor. Furthermore, it gives the programmer various macros that the implementation must supply to tell you things like the width of a char, the largest unsigned long, and so on. And yet the first thing any non-trivial C project seems to do is either import or invent another set of types that give it explicitly 8-, 16-, 32-, and 64-bit integers. This means that as the sysadmin I have to have those definition files in the places the programmers expect (that is, after all, my job), but then not all the semantics of all those definitions are the same (this wheel has been reinvented many times), and there is no non-ad-hoc way I know of to satisfy all of my users' needs here. (I have resorted on occasion to making a <bits/types_for_ralph.h>, which I know makes puppies cry every time I do it.)

What does explicitly pinning down the bit width of numbers (in a language that specifically doesn't want to do that) buy a programmer that makes it worth turning all of this into a configuration-management headache? Why isn't knowing the various MIN/MAX macros and limits supplied by the platform enough to do what C programmers want to do? Why would you take a language whose chief virtue is that it is portable across arbitrarily sized platforms, and then typedef yourself into specific bit widths?

+7
c types integer
5 answers

When a C or C++ programmer (hereinafter addressed in the second person) is choosing the size of an integer variable, it's usually in one of the following circumstances:

  • You know (at least roughly) the valid range for the variable, based on the real-world quantity it represents. For example,
    • numPassengersOnPlane in an airline reservation system must accommodate the largest supported aircraft, so it needs at least 10 bits. (Round up to 16.)
    • numPeopleInState in a US census tabulation program must accommodate the most populous state (currently about 38 million), so it needs at least 26 bits. (Round up to 32.)

In this case, you want the semantics of the int_leastN_t types from <stdint.h> . It is common for programmers to use the exact-width intN_t types here, when technically they shouldn't; however, 8/16/32/64-bit machines are so overwhelmingly dominant today that the distinction is merely academic.

You could use the standard types and rely on constraints like " int must be at least 16 bits", but a drawback of this is that there is no standard maximum size for the integer types. If int happens to be 32 bits when you only needed 16, you have unnecessarily doubled the size of your data. In many cases (see below) this is not a problem, but if you have an array of millions of numbers, you will incur a lot of extra page faults.
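For instance, a minimal sketch of that range-driven choice (the variable names are simply the examples above, not from any real code base):

  #include <stdint.h>
  #include <stdio.h>

  int main(void) {
      /* Needs at least 10 bits: int_least16_t is the smallest guaranteed fit. */
      int_least16_t numPassengersOnPlane = 853;

      /* Needs at least 26 bits: int_least32_t comfortably holds ~38 million. */
      int_least32_t numPeopleInState = 38000000;

      printf("%jd %jd\n", (intmax_t)numPassengersOnPlane,
                          (intmax_t)numPeopleInState);
      return 0;
  }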

  • Your numbers don't need to be that large, but for efficiency reasons you want a fast, "native" data type rather than a small one that may cost extra time in bit-masking or zero/sign extension.

These are the int_fastN_t types in <stdint.h> . However, it is common to simply use the built-in int here, which back in the 16/32-bit days had int_fast16_t semantics. It is not the native type on 64-bit systems, but it is usually good enough.
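A small sketch of the "fast" flavor; as an aside, on a typical 64-bit glibc system int_fast16_t is actually 8 bytes wide, because the standard only promises "at least 16 bits, and fast":

  #include <stdint.h>
  #include <stdio.h>

  int main(void) {
      /* The values stay small, but we ask for whatever width the CPU
         handles most efficiently rather than the smallest possible one. */
      int_fast16_t i;
      uint_fast32_t sum = 0;
      for (i = 0; i < 1000; i++)
          sum += (uint_fast32_t)i;
      printf("sum = %ju, sizeof(int_fast16_t) = %zu\n",
             (uintmax_t)sum, sizeof(int_fast16_t));
      return 0;
  }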

  • The variable is an amount of memory, an array index, or a cast pointer, and therefore needs a size that depends on the amount of addressable memory.

This corresponds to the typedefs size_t , ptrdiff_t , intptr_t , etc. You have to use typedefs here, because there is no built-in type that is guaranteed to be memory-sized.
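A short sketch of those memory-sized typedefs in their natural roles:

  #include <stddef.h>
  #include <stdint.h>
  #include <stdio.h>

  int main(void) {
      char buf[4096];
      size_t    len  = sizeof buf;            /* object sizes and array indices */
      ptrdiff_t span = &buf[4096] - &buf[0];  /* difference between two pointers */
      intptr_t  addr = (intptr_t)&buf[0];     /* a pointer carried as an integer */
      printf("%zu %td %jd\n", len, span, (intmax_t)addr);
      return 0;
  }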

  • The variable is part of a structure that is serialized to a file using fread / fwrite , or exchanged with a non-C language (Java, COBOL, etc.) that has its own fixed-width data types.

In these cases, you really do need an exact-width type.
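A minimal, hedged sketch of that serialization case (struct record_on_disk and write_record are invented for illustration; a real format would also pin down byte order and padding, which fwrite alone does not):

  #include <stdint.h>
  #include <stdio.h>

  struct record_on_disk {
      uint32_t id;       /* exactly 4 bytes in the file format */
      int16_t  balance;  /* exactly 2 bytes */
      uint16_t flags;    /* exactly 2 bytes */
  };

  /* Writes one record; assumes the on-disk layout matches this struct's
     in-memory layout (no padding surprises, native byte order). */
  int write_record(FILE *f, const struct record_on_disk *r) {
      return fwrite(r, sizeof *r, 1, f) == 1 ? 0 : -1;
  }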

  • You simply haven't thought about the appropriate type and use int out of habit.

Often this works quite well.


So, in summary, all of the <stdint.h> typedefs have their use cases. However, the usefulness of the built-in types is limited by:

  • The lack of maximum sizes for these types.
  • The lack of a memory-sized ("memsize") type.
  • The arbitrary split between the LP64 (Unix-like systems) and LLP64 (Windows) data models on 64-bit systems; a sketch of the practical difference follows this list.
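That last point is easy to observe directly; a small program like this should report long as 8 bytes on a 64-bit Unix-like system and 4 bytes on 64-bit Windows:

  #include <stdio.h>

  int main(void) {
      /* LP64 (64-bit Unix-like): int=4, long=8, long long=8, void*=8 */
      /* LLP64 (64-bit Windows):  int=4, long=4, long long=8, void*=8 */
      printf("int=%zu long=%zu long long=%zu void*=%zu\n",
             sizeof(int), sizeof(long), sizeof(long long), sizeof(void *));
      return 0;
  }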

As for why there are so many redundant fixed-width typedefs ( WORD , DWORD , __int64 , gint64 , FINT64 , etc.) and memsize typedefs ( INT_PTR , LPARAM , VPTRDIFF , etc.): it is mainly because <stdint.h> came late in C's development, and people are still using older compilers that don't support it, so libraries have to define their own. Same reason C++ has so many string classes.

+11

This is sometimes important. For example, most image file formats require fields to be an exact number of bits/bytes (or at least a specified number).

If you only wanted to share a file between programs created by the same compiler on the same computer architecture, you would be right (or at least everything would work). But in real life, things like file specifications and network packets are produced by a variety of computer architectures and compilers, so we have to care about the details in those cases (at the very least).
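For example, a hedged sketch of reading one fixed-width, little-endian field from such a file (read_le32 is an invented helper name; the field width and byte order would come from the format's specification):

  #include <stdint.h>
  #include <stdio.h>

  /* Reads a 4-byte little-endian field correctly regardless of the host's
     integer sizes or byte order -- the kind of care a file format forces. */
  int read_le32(FILE *f, uint32_t *out) {
      unsigned char b[4];
      if (fread(b, 1, 4, f) != 4)
          return -1;
      *out = (uint32_t)b[0]
           | ((uint32_t)b[1] << 8)
           | ((uint32_t)b[2] << 16)
           | ((uint32_t)b[3] << 24);
      return 0;
  }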

+4

The main reason the fundamental types can't be fixed is that a few machines don't use 8-bit bytes. Enough programmers don't care about such beasts, or actively don't want to be bothered supporting them, that the majority of well-written code demands a specific number of bits wherever overflow would otherwise be a problem.

It is better to specify the required range than to use int or long directly, because asking for "relatively big" or "relatively small" is fairly meaningless. The point is to know what inputs the program can work with.

Incidentally, there is usually a compiler flag that will adjust the built-in types; see INT_TYPE_SIZE for GCC. It might be easier to drop that into the makefile than to specialize the whole system environment with new headers.
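Alternatively (or additionally), a build-time check can at least make the assumed width explicit; a minimal sketch using C11's _Static_assert, dropped into any translation unit:

  #include <limits.h>

  /* Fail the build, rather than misbehave at run time, if int is not
     as wide as this code base assumes. */
  _Static_assert(INT_MAX >= 2147483647,
                 "this code assumes int is at least 32 bits");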

+3

If you need portable code, you want it to behave the same on every platform. If you have

  int i = 32767; 

you cannot say for certain what i+1 will give you on every platform.

It is not portable. Some compilers (on the same CPU architecture!) will give you -32768, and some will give you 32768. Some perverse ones will give you 0. That is a pretty big difference. Strictly speaking, if it overflows it is undefined behavior, but you can't even tell whether it overflows unless you know exactly how big an int is.

If you use the standard integer definitions (that is, <stdint.h> , from the ISO/IEC 9899:1999 standard), then you know exactly what adding 1 will give you:

  int16_t  i = 32767;  // storing i + 1 back into i overflows int16_t;
                       // on most compilers i ends up as -32768
  uint16_t j = 32767;  // j + 1 gives 32768
  int8_t   k = 32767;  // should warn, but may not; most compilers set k to -1
                       // k + 1 gives 0 (here the addition itself doesn't overflow)
  uint8_t  m = 32767;  // should warn, but may not; most compilers set m to 255
                       // storing m + 1 back into m gives 0
  int32_t  n = 32767;  // n + 1 gives 32768
  uint32_t p = 32767;  // p + 1 gives 32768
+1

Two opposing forces are at play here:

  • C's need to adapt naturally to any processor architecture.
  • The need to move data into and out of a program (over the network, to disk, etc.) so that a program running on any architecture can interpret it correctly.

The need to match the processor comes down to inherent efficiency. Every CPU has a number size it handles most naturally, in which all arithmetic operations are performed easily and efficiently and which requires the fewest bits of instruction encoding. That type is int . It can be 16 bits, 18 bits*, 32 bits, 36 bits*, 64 bits, or even 128 bits on some machines. (* Those were some obscure machines from the 1960s and 1970s which may never have had a C compiler.)

The requirements of binary data interchange demand that record fields be the same size and alignment on both ends. For this, it is very important to have control over the sizes of the data. There are also byte-order issues, and possibly differing binary representations of the data, such as floating-point formats.
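As a hedged illustration of controlling both the size and the byte order of a field bound for the wire (put_be32 is an invented helper name):

  #include <stdint.h>

  /* Stores a 32-bit value in big-endian ("network") order into a byte
     buffer, independent of the host CPU's own representation. */
  void put_be32(unsigned char *dst, uint32_t v) {
      dst[0] = (unsigned char)(v >> 24);
      dst[1] = (unsigned char)(v >> 16);
      dst[2] = (unsigned char)(v >> 8);
      dst[3] = (unsigned char)(v);
  }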

A program that forces all integer operations to be 32-bit in the interest of size compatibility will work well on some CPU architectures, but not on others (especially 16-bit ones, and possibly some 64-bit ones).

Using the CPU's register size is preferable when all data interchange is done in a non-binary format, such as XML or SQL (or any other ASCII encoding).

0
