Preventing sign extension with a byte mask

I am reading TCP/IP Sockets in Java, 2nd Edition. I was hoping to get more clarity on something, but since there is no forum or anything similar on the book's website, I thought I'd ask here. In several places in the book, a byte mask is used to avoid sign extension. Here is an example:

    private final static int BYTEMASK = 0xFF; // 8 bits

    public static long decodeIntBigEndian(byte[] val, int offset, int size) {
        long rtn = 0;
        for (int i = 0; i < size; i++) {
            rtn = (rtn << Byte.SIZE) | ((long) val[offset + i] & BYTEMASK);
        }
        return rtn;
    }

So, here is what I think is happening. Let me know if I'm right.

BYTEMASK in binary should look like 00000000 00000000 00000000 11111111. To simplify things, let's say the val byte array contains only one short, so the offset is 0. So let's set val[0] = 11111111 and val[1] = 00001111.

At i = 0, rtn is all 0s, so rtn << Byte.SIZE leaves the value unchanged. Then (long) val[0] widens it to 8 bytes of all 1s because of sign extension. But & BYTEMASK sets all the extra 1s to 0, leaving only the least significant byte as all 1s. Then rtn | val[0] turns on those 1 bits in the least significant byte of rtn.

For i = 1, (rtn << Byte.SIZE) shifts the least significant byte up by one byte and leaves all 0s in its place. Then (long) val[1] produces a long of all 0s plus 00001111 in the least significant byte, which is what we want, so & BYTEMASK does not change it. Then rtn | val[1] copies 00001111 into the low byte of rtn. The final return value is rtn = 00000000 00000000 00000000 00000000 00000000 00000000 11111111 00001111.

I hope this was not too long and that it was understandable. I just want to know whether I am thinking about this correctly and not completely butchering the logic.

Also, one thing that bothers me is that BYTEMASK is 0xFF. In binary that is 11111111, so if it is implicitly cast to an int, wouldn't it actually be 11111111 11111111 11111111 11111111 because of sign extension? If so, I don't see how BYTEMASK would work. Thanks for reading.
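The walk-through above can be checked with a small sketch. It reuses decodeIntBigEndian from the book excerpt and adds decodeWithoutMask, a hypothetical variant introduced here only to show what the mask prevents; the class name SignExtensionDemo is also just an assumption for illustration.

```java
class SignExtensionDemo {
    private static final int BYTEMASK = 0xFF; // only the low 8 bits are set

    // Same loop as in the question: each byte is masked before being OR-ed in.
    static long decodeIntBigEndian(byte[] val, int offset, int size) {
        long rtn = 0;
        for (int i = 0; i < size; i++) {
            rtn = (rtn << Byte.SIZE) | ((long) val[offset + i] & BYTEMASK);
        }
        return rtn;
    }

    // Hypothetical variant without the mask (not from the book).
    // A negative byte is sign-extended to a long of mostly 1 bits,
    // which then overwrites the higher bytes of rtn.
    static long decodeWithoutMask(byte[] val, int offset, int size) {
        long rtn = 0;
        for (int i = 0; i < size; i++) {
            rtn = (rtn << Byte.SIZE) | (long) val[offset + i];
        }
        return rtn;
    }

    public static void main(String[] args) {
        // val[0] = 11111111, val[1] = 00001111, as in the question
        byte[] val = { (byte) 0xFF, (byte) 0x0F };

        // With the mask: prints ff0f
        System.out.println(Long.toHexString(decodeIntBigEndian(val, 0, 2)));

        // Without the mask the result is negative: the sign-extended 1 bits
        // from val[0] fill every byte above the low one.
        System.out.println(Long.toHexString(decodeWithoutMask(val, 0, 2)));
    }
}
```

With the mask the two bytes decode to 0xFF0F (65295). Without it, the first byte alone already turns rtn into -1, and the shift and OR in the next iteration cannot clear those upper bits.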

1 answer

Everything is correct, except for the last point:

0xFF is already an int (0x000000FF), so it will not be sign-extended. In general, integer literals in Java are ints unless they end with L or l, in which case they are longs.
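A minimal sketch of the point about literal types. The class LiteralTypeDemo and the typeOf overloads are hypothetical helpers made up for this illustration; they rely only on the compiler choosing an overload by the static type of the argument.

```java
class LiteralTypeDemo {
    // The compiler picks the overload that matches the argument's static type.
    static String typeOf(byte x) { return "byte"; }
    static String typeOf(int x)  { return "int"; }
    static String typeOf(long x) { return "long"; }

    public static void main(String[] args) {
        System.out.println(typeOf(0xFF));        // int: no suffix
        System.out.println(typeOf(0xFFL));       // long: L suffix
        System.out.println(typeOf((byte) 0xFF)); // byte: explicit narrowing cast

        // The int literal 0xFF is positive, so widening it keeps the value.
        long fromInt = 0xFF;
        System.out.println(fromInt);             // 255

        // Only a value of type byte with the high bit set is negative,
        // and widening that one does sign-extend.
        long fromByte = (byte) 0xFF;
        System.out.println(fromByte);            // -1
    }
}
```

So sign extension only comes into play for the byte read from the array, not for the mask itself.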


Source: https://habr.com/ru/post/1411004/

