Since my comments kept growing longer, here is a complete answer:
Your char * buffer should store the length of the string in its first X bytes (as Pascal does, for example). After the length comes the string data, which can contain any characters you like. After that, the next X bytes give the length of the next string, and so on to the end of the list, which is terminated by an empty string (i.e., the last X bytes say that the next string has zero length, and your application takes this as the signal to stop looking for more strings).
One advantage is that you never need to scan the string data: finding the next string from the start of the current one takes O(1) time. Counting how many strings are in the list is O(n), but it will be very fast anyway (if O(n) is unacceptable you can work around it, but I don't think that is worth going into right now).
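To make the layout concrete, here is a minimal sketch of walking such a list. It assumes X = 1 (a single unsigned length byte per entry); the names lp_next and lp_count are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: X = 1, so each length prefix is one unsigned byte and a
 * zero-length entry terminates the list. */

/* Return a pointer to the entry that follows `entry`: one hop, O(1). */
static const unsigned char *lp_next(const unsigned char *entry)
{
    assert(entry != NULL);
    return entry + 1 + entry[0];   /* skip the prefix byte plus the payload */
}

/* Count the strings stored before the zero-length terminator: O(n) hops. */
static size_t lp_count(const unsigned char *list)
{
    size_t n = 0;
    assert(list != NULL);
    while (list[0] != 0) {
        list = lp_next(list);
        n++;
    }
    return n;
}
```

For a buffer such as { 2, 'h', 'i', 3, 'f', 'o', 'o', 0 }, lp_count returns 2, and lp_next on the first entry points at the byte holding 3. Note that neither function ever looks at the string data itself.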
Another advantage is that the string data can contain any character you like. This means that if your strings can contain NUL characters, you can still retrieve them safely, but you must be careful not to pass them to a C string function (e.g. strlen() or strcat()), which will treat the first NUL character as the end of your data (which it may or may not be). You will have to rely on memcpy() and pointer arithmetic instead.
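As a rough sketch of the memcpy() approach, again assuming X = 1 and using a hypothetical helper name (lp_copy_out), extracting one entry might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: X = 1 (one unsigned length byte per entry).
 * Copies the payload of `entry` into `dst` and returns its length, or
 * returns (size_t)-1 without copying if `dst_cap` is too small. */
static size_t lp_copy_out(const unsigned char *entry, void *dst, size_t dst_cap)
{
    size_t len;
    assert(entry != NULL && dst != NULL);
    len = entry[0];
    if (len > dst_cap)
        return (size_t)-1;
    memcpy(dst, entry + 1, len);   /* copies embedded NUL bytes as well */
    return len;
}
```

With the entry { 3, 'a', 0, 'b' }, this copies all three bytes and returns 3, whereas strlen() on the copied data would stop at the embedded NUL and report 1.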
The problem is the value of X (the number of bytes used to store the length of a string). The easiest choice would be 1, which sidesteps all the endianness and alignment problems but limits your strings to 255 characters. If that is a limitation you can live with, fine, but 255 seems a little low to me.
X could be 2 or 4 bytes, but you will need to make sure you have an unsigned data type that holds at least that many bytes (uint16_t or uint32_t from stdint.h, or perhaps uint_least16_t or uint_least32_t). The best solution would be to make X = sizeof(size_t), since the type size_t is guaranteed to be able to store the length of any string you could hold in memory.
Having X > 1 introduces alignment issues and, if network portability is a concern, endianness. The easiest way to read the first X bytes as a size_t variable is to cast your char * data to size_t * and simply dereference it. However, if you cannot guarantee that your char * data is correctly aligned, this will break on some systems. Even if you do guarantee the alignment of the buffer, you will have to waste a few padding bytes at the end of most strings so that the next length value is aligned too.
The easiest way to overcome alignment is to manually convert the first sizeof(size_t) bytes to a size_t value. You will need to decide whether to store the data little- or big-endian. Most computers you are likely to target are natively little-endian, but for a manual conversion it does not matter: just pick one. The number 65538 (2^16 + 2) stored in 4 bytes looks like { 0, 1, 0, 2 } in big-endian and { 2, 0, 1, 0 } in little-endian.
Once you have decided (it doesn't matter which, choose whichever you like), you simply cast each of the first X bytes to unsigned char, then to size_t, shift each one left by the appropriate number of bits to put it in the right place, and add them all together. In the big-endian example { 0, 1, 0, 2 }, 0 is multiplied by 2^24, 1 by 2^16, 0 by 2^8 and 2 by 2^0 (or 1), producing 0 + 65536 + 0 + 2, or 65538. There will probably be no difference in efficiency between big- and little-endian if you convert manually; I want to stress (again) that the choice is completely arbitrary, as far as I can tell.
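The manual conversion could be sketched roughly as follows. This assumes X = 4 and big-endian storage, and it uses uint32_t rather than size_t so the field width is fixed; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: X = 4 and lengths are stored big-endian. */

/* Decode a 4-byte big-endian length prefix one byte at a time, so the
 * pointer needs no particular alignment. */
static uint32_t lp_read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |
            (uint32_t)p[3];
}

/* Encode `len` as a 4-byte big-endian length prefix. */
static void lp_write_be32(unsigned char *p, uint32_t len)
{
    assert(p != NULL);
    p[0] = (unsigned char)((len >> 24) & 0xFF);
    p[1] = (unsigned char)((len >> 16) & 0xFF);
    p[2] = (unsigned char)((len >>  8) & 0xFF);
    p[3] = (unsigned char)(len & 0xFF);
}
```

Calling lp_read_be32 on the bytes { 0, 1, 0, 2 } gives 1 * 65536 + 2 = 65538 on any host, whatever its native byte order. A little-endian version would simply reverse the indices.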
Doing the conversion manually avoids alignment problems and completely sidesteps cross-system endianness concerns, so data transferred from a little-endian machine to a big-endian one will be read the same way. There is still a potential problem when transferring data from a system where sizeof(size_t) == 4 to one where sizeof(size_t) == 8. If that is a concern, you can either a) abandon size_t and pick a fixed size, or b) encode the sender's sizeof(size_t) value as the first byte of the data (one byte is all you need) and have the receiver make the necessary adjustments. Option a) may be simpler, but it can cause trouble (what if you pick too small a size in order to accommodate older computers on your network, and once those are phased out you are stuck with a length limit that is smaller than it needs to be?). I would therefore prefer b), because it scales with whatever system you are running on (16-bit, 32-bit, 64-bit, maybe even 128-bit in the future), but you may not need that.
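Option b) might look something like the sketch below. It assumes the first byte of the stream holds the sender's sizeof(size_t), that lengths are big-endian, and that no sender uses more than 8 bytes; lp_read_len and lp_first_len are names I made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions: byte 0 of the stream is the sender's sizeof(size_t)
 * (at most 8), and every length prefix is big-endian. */

/* Decode a big-endian length that is `width` bytes wide. */
static uint64_t lp_read_len(const unsigned char *p, unsigned width)
{
    uint64_t v = 0;
    unsigned i;
    assert(p != NULL && width >= 1 && width <= 8);
    for (i = 0; i < width; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Read the first length of a stream that begins with a width byte.
 * Returns 0 on success, or -1 if the width byte is invalid or the
 * value does not fit in the receiver's size_t. */
static int lp_first_len(const unsigned char *stream, size_t *out)
{
    uint64_t v;
    assert(stream != NULL && out != NULL);
    if (stream[0] < 1 || stream[0] > 8)
        return -1;
    v = lp_read_len(stream + 1, stream[0]);
    if (v > SIZE_MAX)
        return -1;
    *out = (size_t)v;
    return 0;
}
```

A 32-bit sender would emit { 4, 0, 1, 0, 2, ... } and a 64-bit sender { 8, 0, 0, 0, 0, 0, 1, 0, 2, ... }; the receiver decodes 65538 from both. The SIZE_MAX check covers the case where a 64-bit sender transmits a length that a smaller receiver cannot represent.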
</vomit> I leave it as an exercise for the reader to implement everything I just wrote.