At the level of the CPU's arithmetic units there are effectively no bytes, only words, which on current hardware are 32 or 64 bits wide. Arithmetic units are usually built around operands of word size (or larger, in the case of floating point).
So for arithmetic there is no speed advantage in using types smaller than a word, and there may even be a speed penalty, because extra work is needed to emulate types the processor does not natively support. For example, on hardware that can only access whole words, writing a single byte to memory means first reading the word that contains it, modifying it, and then writing the word back. To avoid this, many compilers pad small variables out to a full word, so even a boolean variable can end up occupying 32 or 64 bits.
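To make the read-modify-write step concrete, here is a minimal sketch in C of what storing one byte looks like when only whole 32-bit words can be read or written. The function name `store_byte_in_word` and the byte numbering (index 0 is the least significant byte) are assumptions made for illustration. Real hardware or a compiler would do the equivalent internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replace one byte inside a 32-bit word using only
   whole-word operations. `index` 0 is the least significant byte. */
uint32_t store_byte_in_word(uint32_t word, unsigned index, uint8_t value)
{
    assert(index < 4);                          /* a 32-bit word holds 4 bytes */
    unsigned shift = index * 8u;                /* bit position of the byte */
    uint32_t mask = (uint32_t)0xFFu << shift;   /* selects the target byte */

    /* Clear the old byte, then merge in the new one. */
    return (word & ~mask) | ((uint32_t)value << shift);
}
```

A byte store then becomes: load the word, call something like `store_byte_in_word`, and store the result. That is three steps where a word-sized store needs one.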
However, if you have a large amount of data, such as a large array, smaller types will usually give better performance, because more elements fit in each cache line and you get fewer cache misses.
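A rough way to see the cache effect is to count how many cache lines an array spans for different element sizes. The sketch below assumes a 64-byte cache line and an array that starts on a line boundary. Both are assumptions, and `cache_lines_needed` is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64u   /* assumed line size; varies by CPU */

/* Hypothetical helper: number of cache lines covered by a contiguous,
   line-aligned array of `count` elements of `elem_size` bytes each. */
size_t cache_lines_needed(size_t count, size_t elem_size)
{
    /* Guard against overflow in count * elem_size. */
    assert(elem_size == 0 || count <= SIZE_MAX / elem_size);

    size_t bytes = count * elem_size;
    return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;  /* round up */
}
```

Under these assumptions, one million `uint8_t` elements span 15625 lines, while one million `int64_t` elements span 125000. A full pass over the wider array therefore touches eight times as many lines.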