It depends on many factors. If you only access pixel data one byte at a time, alignment will not make any difference in the vast majority of cases. To read / write one byte of data, most processors do not care whether this byte is on a 4-byte boundary or not.
However, if you access data in units larger than a byte (say, in 2-byte or 4-byte units), you will definitely see alignment effects. On some processors (for example, many RISC processors) it is outright illegal to access unaligned data at certain levels: an attempt to read a 4-byte word from an address that is not 4-byte aligned will generate a Data Access Exception (or Data Storage Exception) on a PowerPC, for example.
On other processors (for example, x86), access to unaligned addresses is allowed, but it often comes with a hidden performance penalty. Memory loads and stores are often implemented in microcode, and the microcode will detect the unaligned access. Normally the microcode fetches the required 4-byte quantity from memory in one go, but if it is not aligned, it has to fetch two 4-byte locations and reconstruct the desired 4-byte quantity from the appropriate bytes of the two. Fetching two memory locations is obviously slower than fetching one.
That is just for simple loads and stores, though. Some instructions, such as many in the SSE instruction set, require their memory operands to be properly aligned (to 16 bytes in the case of SSE). If you attempt to access unaligned memory with these instructions, the CPU raises an exception (a general-protection fault for SSE) instead of quietly handling the access.
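For plain loads, one portable way to read a multi-byte value from an address that may not be aligned is to go through `memcpy`, which has no alignment requirement. The following is only a minimal sketch; the helper names `load_u32_unaligned` and `is_aligned4` are made up for illustration, and the pointer-to-integer check assumes a flat address space, as on mainstream platforms.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be 4-byte aligned.
   memcpy has no alignment requirement, so this is safe even on CPUs
   that trap on unaligned word loads. The result is in host byte order. */
static uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Return nonzero if p lies on a 4-byte boundary.
   Assumption: converting a pointer to uintptr_t yields its address. */
static int is_aligned4(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}
```

Compilers usually turn the `memcpy` into a single load on targets that permit unaligned access, so this tends to cost nothing where the hardware is forgiving.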
To summarize, I would not worry too much about alignment unless you are writing highly performance-critical code (for example, in assembly). The compiler helps you a lot, e.g. by padding structures so that 4-byte values are aligned on 4-byte boundaries, and on x86 the CPU also helps you out with unaligned accesses. Since the pixel data you are dealing with is 3 bytes in size, you will almost always be doing single-byte accesses anyway.
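To see the padding the compiler inserts, you can inspect `offsetof` and `sizeof`. This is a rough sketch; the struct names `Padded` and `Pixel24` and the two helper functions are hypothetical, and the exact numbers are up to the compiler and ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* A 1-byte field followed by a 4-byte field: the compiler inserts
   padding after 'tag' so that 'value' meets its alignment requirement
   (typically offset 4 and a total size of 8). */
struct Padded {
    uint8_t  tag;
    uint32_t value;
};

/* Three 1-byte fields need no padding on typical ABIs, so an array of
   these is laid out like raw 24-bit pixel data (typically size 3). */
struct Pixel24 {
    uint8_t b, g, r;
};

/* Offset of the 4-byte member inside struct Padded. */
static size_t padded_value_offset(void)
{
    return offsetof(struct Padded, value);
}

/* Size in bytes of one 24-bit pixel struct. */
static size_t pixel24_size(void)
{
    return sizeof(struct Pixel24);
}
```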
If you decide that you want to access each pixel with a single 4-byte access (as opposed to 3 single-byte accesses), it would be better to use 32-bit pixels and have each individual pixel aligned on a 4-byte boundary. Aligning each row on a 4-byte boundary, but not each pixel, will have little, if any, effect.
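If you do go the 32-bit route, a conversion along these lines would let every pixel be read or written as one aligned word. It is only a sketch: the `0x00RRGGBB` layout and the function names `pack_xrgb` and `expand_bgr24_to_xrgb32` are my own choices, not anything your code or the bitmap format requires.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack one pixel into a 32-bit word laid out as 0x00RRGGBB.
   The top byte is unused padding. */
static uint32_t pack_xrgb(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* Expand 'count' tightly packed 24-bit BGR pixels into 32-bit pixels.
   Because dst is an array of uint32_t, every destination pixel sits on
   a 4-byte boundary and can be accessed with a single aligned load or
   store. The source is still read one byte at a time. */
static void expand_bgr24_to_xrgb32(const uint8_t *src, uint32_t *dst,
                                   size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        const uint8_t *p = src + 3 * i;   /* p[0]=B, p[1]=G, p[2]=R */
        dst[i] = pack_xrgb(p[2], p[1], p[0]);
    }
}
```

The price is one wasted byte per pixel, i.e. a third more memory than the 24-bit layout.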
Based on your code, I am guessing this is related to reading the Windows bitmap file format. Bitmap files require the length of each scanline to be a multiple of 4 bytes, so if your pixel data buffers have the same property, you can read the entire bitmap into your buffer in one fell swoop (of course, you still have to deal with the fact that the scanlines are stored bottom-to-top instead of top-to-bottom, and that the pixel data is BGR instead of RGB). This is not much of an advantage, though, since it is not much harder to read the bitmap one scanline at a time.
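For illustration, here is roughly how the padded scanline length and the bottom-up BGR layout could be handled once the pixel data is in memory. This is a sketch that assumes an uncompressed 24-bit bitmap stored bottom-up (positive height in the header); `bmp_stride24` and `bmp24_to_rgb` are hypothetical helper names, and parsing of the file headers is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes per scanline of a 24-bit bitmap: width * 3, rounded up to the
   next multiple of 4. */
static size_t bmp_stride24(size_t width)
{
    return (width * 3 + 3) & ~(size_t)3;
}

/* Convert bottom-up, padded BGR scanlines (as stored in the file) into
   top-down, tightly packed RGB.
   src must hold bmp_stride24(width) * height bytes,
   dst must hold width * height * 3 bytes. */
static void bmp24_to_rgb(const uint8_t *src, uint8_t *dst,
                         size_t width, size_t height)
{
    size_t stride = bmp_stride24(width);
    for (size_t y = 0; y < height; ++y) {
        /* Row y from the top of the image is row (height - 1 - y)
           counted from the start of the file's pixel data. */
        const uint8_t *row = src + (height - 1 - y) * stride;
        uint8_t *out = dst + y * width * 3;
        for (size_t x = 0; x < width; ++x) {
            out[3 * x + 0] = row[3 * x + 2];  /* R */
            out[3 * x + 1] = row[3 * x + 1];  /* G */
            out[3 * x + 2] = row[3 * x + 0];  /* B */
        }
    }
}
```

For example, a 5-pixel-wide image has 15 bytes of pixel data per row, padded to a 16-byte scanline.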