Both PPP and Ethernet have mechanisms for framing, that is, for splitting the bitstream into frames in such a way that if the receiver loses track of where it is, it can pick up again at the start of the next frame. These mechanisms sit at the bottom of the protocol stack; all other details of the protocol are built on the idea of frames. In particular, the preamble, LCP, and FCS are at a higher level and are not used to control framing.
PPP over serial links such as dialup is framed using HDLC-like framing. The byte value 0x7e, called the flag sequence, indicates the start of a frame. The frame continues until the next flag byte. Any occurrence of the flag byte in the contents of the frame is escaped. Escaping is done by writing 0x7d, known as the control escape byte, followed by the byte to be escaped xor'd with 0x20. The flag sequence is therefore escaped to 0x5e; the control escape itself must also be escaped, to 0x5d. Other values can also be escaped if their presence would upset the modem. As a result, if the receiver loses synchronization, it can simply read and discard bytes until it sees a 0x7e, at which point it knows it is at the start of a frame again. The contents of the frame are then structured, containing some odd little fields that are not really important but are retained from an earlier IBM protocol, along with the PPP packet (called a protocol data unit, PDU), and also the frame check sequence (FCS).
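The escaping rules above are small enough to sketch in code. The following is only an illustration under stated assumptions, not a PPP implementation: the function names `stuff` and `unstuff_stream` and the `extra_escapes` parameter are made up for this example, the address/control/protocol fields and the FCS are ignored, and a frame that ends with a dangling control escape is simply dropped.

```python
FLAG = 0x7E      # flag sequence: marks frame boundaries
ESCAPE = 0x7D    # control escape byte
XOR_MASK = 0x20  # an escaped byte is sent XORed with this value


def stuff(payload, extra_escapes=()):
    """Wrap payload in flag bytes, escaping FLAG, ESCAPE and any extras.

    extra_escapes is a hypothetical parameter standing in for "other
    values that would upset the modem".
    """
    must_escape = {FLAG, ESCAPE} | set(extra_escapes)
    out = bytearray([FLAG])
    for b in payload:
        if b in must_escape:
            out += bytes([ESCAPE, b ^ XOR_MASK])
        else:
            out.append(b)
    out.append(FLAG)
    return bytes(out)


def unstuff_stream(stream):
    """Split a received byte stream into unescaped frame payloads.

    Bytes seen before the first flag are discarded, which is how a
    receiver that has lost synchronization recovers.
    """
    frames = []
    current = None    # None means "not yet synchronized"
    escaped = False
    for b in stream:
        if b == FLAG:
            # Keep non-empty frames; drop one that ends mid-escape.
            if current and not escaped:
                frames.append(bytes(current))
            current = bytearray()
            escaped = False
        elif current is None:
            continue  # still hunting for the first flag
        elif escaped:
            current.append(b ^ XOR_MASK)
            escaped = False
        elif b == ESCAPE:
            escaped = True
        else:
            current.append(b)
    return frames
```

For example, feeding `unstuff_stream` some leading garbage followed by two stuffed frames returns just the two original payloads, because everything before the first 0x7e is thrown away.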
Ethernet uses a logically similar approach, in that it has symbols that are recognized as start and end markers rather than as data. But instead of having reserved bytes plus an escape mechanism, it uses a coding scheme that can express special control symbols that are distinct from data bytes, a bit like using punctuation to break up a sequence of letters. The details of the scheme used vary with speed.
Standard (10 Mbit/s) ethernet is encoded using Manchester coding, in which each bit to be transmitted is represented as two successive levels on the line, such that there is always a transition between levels within every bit, which helps the receiver stay synchronized. Frame boundaries are indicated by violating the encoding rule, so that there is a bit with no transition (I read this in a book years ago, but I can't find a reference online, so I might be wrong). In effect, this system extends the binary code to three symbols: 0, 1, and violation.
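To make the "three symbols" idea concrete, here is a minimal sketch of Manchester encoding and decoding. It assumes the convention where a 1 is sent as low-then-high and a 0 as high-then-low; the function names are made up for this example. It only shows how a missing mid-bit transition can be told apart from data, and says nothing about whether real 10 Mbit/s hardware marks frames exactly this way.

```python
def manchester_encode(bits):
    """Map each bit to two half-bit line levels.

    Assumed convention: 1 -> low, high; 0 -> high, low.
    """
    levels = []
    for bit in bits:
        levels += [0, 1] if bit else [1, 0]
    return levels


def manchester_decode(levels):
    """Decode pairs of levels into 0, 1, or 'V' for a code violation."""
    symbols = []
    for i in range(0, len(levels) - 1, 2):
        pair = (levels[i], levels[i + 1])
        if pair == (0, 1):
            symbols.append(1)
        elif pair == (1, 0):
            symbols.append(0)
        else:
            symbols.append("V")  # no transition in the middle of the bit
    return symbols
```

Decoding the levels `[0, 1, 1, 1, 1, 0]` gives `[1, "V", 0]`: the middle pair has no transition, so it cannot be a data bit.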
Fast (100 Mbit/s) ethernet uses a different coding scheme based on a 4b/5b code, where groups of four data bits (nybbles) are represented as groups of five bits on the wire and are transmitted directly, without the Manchester scheme. Expanding to five bits lets the sixteen necessary patterns be chosen so that they meet the requirement for frequent level transitions, again to help the receiver stay synchronized. However, there is still room to choose some extra symbols that can be transmitted but do not correspond to a data value, in effect expanding the set of nybbles to 24 symbols: the nybbles 0 to F, and symbols called Q, I, J, K, T, R, S, and H. Ethernet uses a JK pair to mark the start of a frame, and a TR pair to mark the end of a frame.
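The symbol set is easier to see as a table. The sketch below uses the commonly published 4b/5b code groups and wraps a payload in JK and TR markers. Treat it as an illustration under assumptions: the function names are made up, the low nybble of each byte is assumed to go first, and the preamble, scrambling, and line signalling are left out entirely.

```python
# Five-bit code groups for the data nybbles 0x0..0xF, indexed by value.
DATA_CODES = [
    "11110", "01001", "10100", "10101", "01010", "01011", "01110", "01111",
    "10010", "10011", "10110", "10111", "11010", "11011", "11100", "11101",
]

# Extra code groups that carry no data value.
CONTROL_CODES = {
    "Q": "00000", "I": "11111", "J": "11000", "K": "10001",
    "T": "01101", "R": "00111", "S": "11001", "H": "00100",
}

# Reverse lookup: code group -> symbol name ("0".."F" or a control letter).
SYMBOL_NAMES = {code: format(n, "X") for n, code in enumerate(DATA_CODES)}
SYMBOL_NAMES.update({code: name for name, code in CONTROL_CODES.items()})


def encode_frame(payload):
    """Return the code groups for one frame: J K, data nybbles, T R."""
    groups = [CONTROL_CODES["J"], CONTROL_CODES["K"]]
    for byte in payload:
        groups.append(DATA_CODES[byte & 0x0F])  # low nybble first (assumed)
        groups.append(DATA_CODES[byte >> 4])
    groups += [CONTROL_CODES["T"], CONTROL_CODES["R"]]
    return groups


def symbol_names(groups):
    """Translate code groups back into a string of symbol names."""
    return "".join(SYMBOL_NAMES[g] for g in groups)
```

With these definitions, `symbol_names(encode_frame(b"\x12"))` yields `"JK21TR"`: the start marker, the two nybbles of the byte, and the end marker, none of which can be confused with one another because all 24 code groups are distinct.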
Gigabit ethernet is similar to fast ethernet, but with a different coding scheme: the optical fiber versions use an 8b/10b code instead of 4b/5b, and the twisted pair version uses a very complex quinary (five-level) code arrangement that I don't really understand. Both approaches give the same result, which is the ability to transmit either data bytes or one of a small set of additional special symbols, and these special symbols are used for framing.
On top of this basic framing structure, there is then a fixed preamble followed by a start frame delimiter, and then some control fields of varying pointlessness (hello, LLC/SNAP!). The validity of these fields can be used to validate a frame, but they cannot be used to delimit frames on their own.