This is a very interesting and difficult question.
In short, there were two drivers that led to the existence of the competing type families, one based on DWORD and one based on int:
1) The desire for cross-platform types on the one hand and for constant-sized types on the other.
2) Human conservatism.
In any case, to give a detailed answer to your question and a reasonably good background for this area, we have to dig into the history of computing and start the story from its early days.
First of all, there is such a thing as a machine word. A machine word is a chunk of binary data that is natural for a particular processor to handle. Its size is highly processor-dependent and is generally equal to the size of the processor's internal registers. Usually it can be divided into equal parts that the processor can also access as independent pieces of data. For example, on 32-bit x86 processors the machine word size is 32 bits. This means that all the main registers (eax, ebx, ecx, edx, esi, edi, ebp, esp and eip) have the same size - 32 bits. But many of them can also be accessed in parts. For example, you can access eax as a 32-bit block of data, ax as a 16-bit block of data, or even al as an 8-bit block of data. Physically it is all the same 32-bit register. I think you can find very good background on this area on Wikipedia (http://en.wikipedia.org/wiki/Word_(computer_architecture)). In short, the machine word is the size of the largest piece of data that can be used as an integer operand of a single instruction. Even today, different processor architectures have different machine word sizes.
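As a rough illustration in C++ (not in the x86 assembler itself), accessing a smaller part of a wider value is just masking and shifting, which is essentially what the ax/ah/al views of eax give you. This is only a sketch; the variable names mimic the register names for clarity:

#include <cstdint>
#include <cstdio>

int main() {
    std::uint32_t eax = 0x12345678;          // pretend this is the full 32-bit register
    std::uint16_t ax  = eax & 0xFFFF;        // low 16 bits, like ax
    std::uint8_t  al  = eax & 0xFF;          // low 8 bits, like al
    std::uint8_t  ah  = (eax >> 8) & 0xFF;   // next 8 bits, like ah
    std::printf("eax=0x%08X ax=0x%04X ah=0x%02X al=0x%02X\n",
                (unsigned)eax, (unsigned)ax, (unsigned)ah, (unsigned)al);
    return 0;
}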
Well, now that we have some understanding of the machine word, it is time to return to the history of computing. The first popular Intel x86 processor, the 8086, was 16-bit. It appeared on the market in 1978. Assembler was very popular at that time, if not the main programming language. As you know, assembler is a very thin wrapper over the processor's native instruction set, and because of that it is completely hardware-dependent. When Intel pushed the new 8086 processor onto the market, the first thing they needed in order to succeed was to bring an assembler for the new processor to market as well. Nobody wants a processor that nobody knows how to program. And when Intel chose names for the different data sizes in the 8086 assembler, they made the obvious choice and called the 16-bit piece of data a word, because the 8086 machine word is 16 bits in size. Half of the machine word was called a byte (8 bits), and two words used as a single operand were called a double word (32 bits). Intel used these terms in the processor manuals and in the assembler mnemonics (db, dw and dd for static allocation of a byte, word and double word).
Years passed, and in 1985 Intel moved from the 16-bit architecture to a 32-bit one with the introduction of the 80386 processor. But by that time there was a huge number of developers accustomed to the fact that a word is a 16-bit value, and a huge amount of code had been written with the firm belief that a word is 16 bits and relying on that fact. Because of this, even though the size of the machine word actually changed, the notation remained the same, except that a new data type arrived in assembler - the quad word (64 bits), because the instructions that operated on two machine words remained the same while the machine word itself was doubled. In the same way, the double quad word (128 bits) appeared with the 64-bit AMD64 architecture. As a result, we have
byte   = 8 bits
word   = 16 bits
dword  = 32 bits
qword  = 64 bits
dqword = 128 bits
Please note that the main thing about this family is that it is a family of strictly sized types. Because it comes from, and is used in, assembler, which requires constant-sized data types, these types keep the same constant sizes year after year, even though their names no longer match their original meaning.
On the other hand, over those same years high-level languages were becoming more and more popular. And because those languages were designed with cross-platform development in mind, they looked at the sizes of their built-in data types from a completely different perspective. If I understand correctly, no high-level language explicitly states that some of its built-in data types have a fixed constant size that will never change in the future. Let's look at C++ as an example. The C++ standard says:
"The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set and is composed of a contiguous sequence of bits, the number of which is implementa- tion-defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit. The memory available to a C++ program consists of one or more sequences of contiguous bytes. Every byte has a unique address."
So we see a surprising thing - in C++ even a byte does not have a constant size. So even though we are used to thinking that a byte is 8 bits, according to C++ it may be not only 8 but also 9, 10, 11, 12 and so on bits (though <climits> does require CHAR_BIT to be at least 8).
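If you want to see what your own implementation uses, a minimal check (assuming nothing beyond the standard headers) is to print CHAR_BIT:

#include <climits>
#include <cstdio>

int main() {
    // CHAR_BIT is the number of bits in a byte for this implementation.
    // The standard guarantees it is at least 8, nothing more.
    std::printf("bits in a byte: %d\n", CHAR_BIT);
    return 0;
}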
"There are five signed integer types: "signed char", "short int", "int", and "long int"., and "long long int". In this list, each type provides at least as much storage as those preceding it in the list. Plain ints have the natural size suggested by the architecture of the execution environment; the other signed integer types are provided to meet special needs."
This passage describes two main requirements:
1) sizeof (char) <= sizeof (short) <= sizeof (int) <= sizeof (long) <= sizeof (long long)
2) Plain ints have the natural size suggested by the runtime architecture. This means that int must be the machine word size of the target processor architecture.
You can go through the entire C++ standard text, but you won't find anything like "the size of int is 4 bytes" or "the length of long is 64 bits". The sizes of particular C++ integer types can change when moving from one processor architecture to another and from one compiler to another. But even when you write a program in C++, you will periodically face the requirement to use data types with a known constant size.
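A minimal sketch that makes this visible on your own compiler and target (the printed numbers are implementation-defined; only the ordering from requirement 1 is guaranteed):

#include <cstdio>

int main() {
    std::printf("char:      %zu\n", sizeof(char));       // always 1 by definition
    std::printf("short:     %zu\n", sizeof(short));
    std::printf("int:       %zu\n", sizeof(int));
    std::printf("long:      %zu\n", sizeof(long));
    std::printf("long long: %zu\n", sizeof(long long));
    // Typical 64-bit Windows output: 1 2 4 4 8; typical 64-bit Linux: 1 2 4 8 8.
    return 0;
}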
At least in earlier days, compiler developers followed these standard requirements. But now we see that conservatism comes into play again. People are used to thinking that int is 32-bit and can store values in the range from -2,147,483,648 to 2,147,483,647. Earlier, when the industry crossed the line between 16-bit and 32-bit architectures, the second requirement was strictly observed: when you used a C++ compiler to build a 16-bit program, the compiler used a 16-bit int, which is the "natural size" for 16-bit processors, and when you used another C++ compiler to build a 32-bit program from the same source code, the compiler used a 32-bit int, which is the "natural size" for 32-bit processors. Today, if you look at the Microsoft C++ compiler, for example, you will find that it uses a 32-bit int regardless of the target processor architecture (32-bit or 64-bit), simply because people are used to thinking that int is 32-bit!
To summarize, we can see that there are two families of data types - DWORD-based and int-based. The motivation behind the int-based family is obvious - cross-platform development. The motivation behind the DWORD-based family covers all the cases where the exact size and binary layout of a value matters. For example, among others, the following cases can be noted:
1) You need to store values in a predefined range in some class or other data structure that will have a huge number of instances at run time. In this case, if you use an int-based type to store those values, it will waste huge amounts of memory on some architectures and could potentially break the logic on others. For example, you need to handle values in the range from 0 to 1,000,000. If you use int to store them, the program will behave correctly if int is 32-bit, will carry a 4-byte memory overhead per instance if int is 64-bit, and will not work correctly at all if int is 16-bit.
2) Data involved in network communication. To process your network protocol correctly on different PCs, you need to specify it in terms of constant-size types that describe every packet and header bit by bit (see the sketch after this list). Your network communication will be completely broken if your protocol header is 20 bytes long on one PC with a 32-bit int and 28 bytes long on another PC with a 64-bit int.
3) Your program stores values used by some special processor instructions, or your program communicates with modules or code fragments written in assembler.
4) You need to store values that are used to communicate with devices. Each device has its own specification that describes what kind of input it requires and in what form it provides output. If a device requires a 16-bit value as input, it must receive exactly a 16-bit value regardless of the size of int and even regardless of the machine word size of the processor in the system where the device is installed.
5) Your algorithm relies on integer overflow logic. For example, you have an array of 2^16 records, and you want to walk through it endlessly and sequentially, updating the records (see the sketch after this list). If you use a 16-bit int as the index, your program will work fine, but as soon as you move to a 32-bit int, you will get out-of-range index accesses.
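A rough sketch of cases 2 and 5, using the fixed-width types from <cstdint> as stand-ins for the DWORD-style family (the header layout and field names are made up for illustration, and real protocol code would also have to care about padding and endianness):

#include <cstdint>
#include <cstdio>

// Case 2: a protocol header built only from fixed-width types, so it has the
// same size on every platform regardless of what int happens to be.
struct PacketHeader {
    std::uint32_t magic;
    std::uint16_t version;
    std::uint16_t flags;
    std::uint32_t payload_length;
};
static_assert(sizeof(PacketHeader) == 12, "header must be 12 bytes everywhere");

int main() {
    // Case 5: an index that is deliberately allowed to wrap around.
    // A 16-bit counter walks a 2^16-entry table forever; a wider index type
    // would run past the end of the array instead of wrapping back to 0.
    static int table[65536];
    std::uint16_t i = 0;
    for (int step = 0; step < 70000; ++step) {
        table[i] += 1;   // i wraps from 65535 back to 0 automatically
        ++i;
    }
    std::printf("table[0] was visited %d times\n", table[0]);
    return 0;
}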
In this regard, Microsoft uses both families of data types: int-based types in the cases where the actual size of the data does not matter much, and DWORD-based types in the cases where it does. And even then, Microsoft defines the latter through typedefs and macros, so that the virtual type system it uses can be quickly and easily mapped onto the correct C++ equivalents for a specific processor architecture and/or compiler.
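For reference, this is roughly how that mapping looks; a simplified sketch, not the literal SDK header text (the real definitions live in headers such as windef.h and basetsd.h and are chosen per compiler/architecture so that the advertised width always holds):

#include <cstdio>

// Simplified sketch of the Windows-style fixed-size family for an MSVC x86/x64 toolchain.
typedef unsigned char      BYTE;   // always 8 bits
typedef unsigned short     WORD;   // always 16 bits
typedef unsigned long      DWORD;  // always 32 bits (long stays 32-bit in MSVC, even on x64)
typedef unsigned long long QWORD;  // 64 bits (the SDK spells this DWORD64 / ULONGLONG)

int main() {
    std::printf("BYTE=%zu WORD=%zu DWORD=%zu QWORD=%zu bytes\n",
                sizeof(BYTE), sizeof(WORD), sizeof(DWORD), sizeof(QWORD));
    return 0;
}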
I hope I have covered the origin of the two families of data types and their differences reasonably well.
So we can move on to the question of why hexadecimal numbers are used to denote the values of DWORD-based data types. There are actually several reasons:
1) If we use data types with hard constant sizes, it is natural to want to look at their values in a binary-based form.
2) It is very easy to understand the meaning of a bit mask when it is encoded in a binary-based form. Agree that it is much easier to understand which bits are set and which are reset when the value looks like this
1100010001011001
than when it is encoded like this
50265
3) Data encoded in a binary-based form and describing a single DWORD-based value has a constant length, whereas the same data encoded in decimal form has a variable length. Note that even when a small number is encoded this way, the full width of the value is shown:
0x00000100
instead of
0x100
This property of binary-based encoding is very attractive when you have to analyze huge amounts of binary data - for example, in a hex editor, or when examining the memory your program uses in the debugger after you hit a breakpoint. Agree that it is much more convenient to look at neat columns of values than at a jumble of poorly aligned, variable-size values.
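A small illustration of why flag values are usually written this way: the mask you test against is readable at a glance (the flag names and values here are invented for the example):

#include <cstdint>
#include <cstdio>

// Hypothetical flag values; in hexadecimal each flag clearly occupies one bit.
const std::uint32_t FLAG_READABLE   = 0x00000001;
const std::uint32_t FLAG_WRITABLE   = 0x00000002;
const std::uint32_t FLAG_EXECUTABLE = 0x00000004;
const std::uint32_t FLAG_HIDDEN     = 0x00000800;  // much harder to spot as decimal 2048

int main() {
    std::uint32_t attrs = FLAG_READABLE | FLAG_HIDDEN;  // 0x00000801
    if (attrs & FLAG_HIDDEN)
        std::printf("hidden bit is set (attrs = 0x%08X)\n", (unsigned)attrs);
    return 0;
}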
So, we have decided that we want a binary-based encoding, and we have three options: plain binary, octal, and hexadecimal. People prefer hexadecimal because it is the shortest of the available encodings. Just compare
10010001101000101011001111000
and
0x12345678
Can you quickly find the number of the bit that is set in the following value?
00000000000100000000000000000000
And in this one?
0x00100000
In the second case you can quickly split the number into four separate bytes
0x00 0x10 0x00 0x00
  3    2    1    0
where in each byte the first digit denotes the 4 most significant bits and the second digit denotes the 4 least significant bits. After you spend some time working with hexadecimal values, you will remember the bit pattern of each hexadecimal digit and translate one into the other in your head without any problem:
0 - 0000    4 - 0100    8 - 1000    C - 1100
1 - 0001    5 - 0101    9 - 1001    D - 1101
2 - 0010    6 - 0110    A - 1010    E - 1110
3 - 0011    7 - 0111    B - 1011    F - 1111
So it takes us only a second or two to find that bit number 20 is set!
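The same lookup can of course be done in code; a tiny sketch that scans a value and reports which bit positions are set and which hex digit each one falls into:

#include <cstdint>
#include <cstdio>

int main() {
    std::uint32_t value = 0x00100000;
    // Scan all 32 bit positions; each hex digit covers positions 4*k .. 4*k+3.
    for (int bit = 0; bit < 32; ++bit) {
        if (value & (std::uint32_t(1) << bit))
            std::printf("bit %d is set (hex digit %d from the right)\n", bit, bit / 4);
    }
    return 0;
}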
People use hexadecimal because it is the shortest and most convenient binary-based form of encoding data.