For a virtual machine, everything is virtual, including time. For example, within 123 real seconds you might emulate 5432 virtual seconds of processing. A common way to measure virtual time is to keep a "number of cycles" counter and add each instruction's cycle cost to it every time a virtual instruction is emulated.
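As a minimal sketch (the opcodes and their cycle costs below are invented for illustration), the interpreter loop just adds each instruction's cost to a counter:

```c
/* Minimal sketch of virtual time as a cycle counter.
 * The opcodes and their cycle costs are invented for illustration. */
#include <stdint.h>
#include <stdio.h>

static uint64_t cycles = 0;          /* virtual time, in emulated CPU cycles */

static unsigned cycle_cost(uint8_t opcode) {
    switch (opcode) {
    case 0x00: return 4;             /* e.g. a NOP */
    case 0x10: return 12;            /* e.g. a memory access */
    default:   return 8;
    }
}

static void emulate_one(uint8_t opcode) {
    /* ... execute the instruction's effects here ... */
    cycles += cycle_cost(opcode);    /* advance virtual time */
}

int main(void) {
    uint8_t program[] = { 0x00, 0x10, 0x00 };
    for (size_t i = 0; i < sizeof program; i++)
        emulate_one(program[i]);
    printf("virtual time elapsed: %llu cycles\n", (unsigned long long)cycles);
    return 0;
}
```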
From time to time, you try to synchronize virtual time with real time. If virtual time gets too far ahead of real time, you insert a delay to let real time catch up. If virtual time falls behind real time, you need to find an excuse for the slowdown. Depending on the emulated architecture, there may be nothing you can do; but some architectures have power management features such as thermal throttling (for example, you might pretend that the virtual CPU has overheated and is running slower to cool down).
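A synchronization point might look something like this sketch, assuming POSIX clock_gettime()/nanosleep() and an invented 4 MHz virtual clock; `cycles` is the counter from the previous sketch:

```c
/* Sketch of a synchronization point. Assumes POSIX clock_gettime() and
 * nanosleep(); CLOCK_HZ is an invented virtual clock speed. */
#include <stdint.h>
#include <time.h>

#define CLOCK_HZ 4000000ULL          /* pretend the emulated CPU runs at 4 MHz */

extern uint64_t cycles;              /* virtual time, from the previous sketch */
static struct timespec start;        /* real time when emulation began */

void sync_init(void) {
    clock_gettime(CLOCK_MONOTONIC, &start);
}

static uint64_t real_ns_elapsed(void) {
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    /* Unsigned arithmetic wraps correctly even if tv_nsec went "backwards". */
    return (uint64_t)(now.tv_sec - start.tv_sec) * 1000000000ULL
         + (uint64_t)(now.tv_nsec - start.tv_nsec);
}

void synchronize(void) {
    uint64_t virt_ns = cycles * 1000000000ULL / CLOCK_HZ;
    uint64_t real_ns = real_ns_elapsed();
    if (virt_ns > real_ns) {
        /* Virtual time is ahead: sleep so real time can catch up. */
        uint64_t gap = virt_ns - real_ns;
        struct timespec delay = { (time_t)(gap / 1000000000ULL),
                                  (long)(gap % 1000000000ULL) };
        nanosleep(&delay, NULL);
    }
    /* If virtual time is behind, sleeping won't help; this is where you
     * would invent an excuse (e.g. pretend the virtual CPU throttled). */
}
```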
You probably also want a queue of events, where the various emulated devices can say "this event will happen at this particular virtual time"; then, if the emulated CPU is idle (waiting for an event to occur), you can skip straight to the next event. This also gives the emulator a natural way to catch up when it is running behind.
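A sketch of such an event queue (the fixed-size array and linear scan keep it short; a real emulator would likely use a min-heap):

```c
/* Sketch of a virtual-time event queue. The fixed-size array and linear
 * scan are for brevity; a real emulator would likely use a min-heap. */
#include <stdint.h>

typedef struct {
    uint64_t when;                   /* virtual time (in cycles) the event fires */
    void (*fire)(void);              /* device callback */
} event_t;

#define MAX_EVENTS 32
static event_t queue[MAX_EVENTS];
static int nevents = 0;
extern uint64_t cycles;              /* virtual time, from the first sketch */

/* A device registers "this event will happen at virtual time `when`". */
void schedule(uint64_t when, void (*fire)(void)) {
    if (nevents < MAX_EVENTS)
        queue[nevents++] = (event_t){ when, fire };
}

/* Called when the emulated CPU is idle (e.g. after a HLT): instead of
 * spinning, jump virtual time straight to the next pending event. */
void skip_to_next_event(void) {
    int next = -1;
    for (int i = 0; i < nevents; i++)
        if (next < 0 || queue[i].when < queue[next].when)
            next = i;
    if (next < 0)
        return;                      /* nothing pending */
    if (queue[next].when > cycles)
        cycles = queue[next].when;   /* fast-forward virtual time */
    queue[next].fire();
    queue[next] = queue[--nevents];  /* remove the fired event */
}
```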
The next step is to identify the places where synchronization actually matters, and only synchronize virtual time with real time at those specific places. If the emulated machine is doing heavy processing and nothing that is visible to an external observer, then the external observer cannot tell whether virtual time matches real time or not. When the virtual machine does something that is visible to an external observer (for example, sends a network packet, updates the video/screen, produces sound, etc.), you synchronize virtual time with real time first.
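For example (host_send_packet() is a hypothetical host-side helper; synchronize() is the function from the earlier sketch):

```c
/* Sketch: synchronize only at externally visible operations; pure
 * computation runs unthrottled. host_send_packet() is hypothetical. */
void synchronize(void);                             /* from the earlier sketch */
void host_send_packet(const void *packet, int len); /* hypothetical host send */

void emulated_nic_send(const void *packet, int len) {
    /* The outside world is about to observe the machine, so make
     * virtual time and real time agree before the packet leaves. */
    synchronize();
    host_send_packet(packet, len);
}
```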
Taking this a step further, you can use buffering to decouple when something happens inside the emulator from when it becomes visible to an external observer. For an (exaggerated) example, imagine that the emulated machine thinks it is 8:23 AM and wants to send a network packet, but in reality it is only 8:00 AM. A simple solution is to delay emulation for 23 minutes and then send the packet. This sounds fine, but if (after the virtual machine sends the packet) the emulator struggles to keep up with real time (because of other processes running on the real computer, or for any other reason), the emulator may lag, and you may have trouble maintaining the illusion that virtual time matches real time. Alternatively, you can pretend the packet was sent, put it in a buffer, continue emulating other things, and actually send the packet later (when it really is 8:23 in the real world). In that case, if (after the virtual machine sends the packet) the emulator struggles to keep up with real time, you still have 23 minutes of slack.
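A sketch of that buffering (the ring buffer and its names are invented; real_ns_elapsed() is from the earlier sketch, and virt_ns() is assumed to convert the cycle counter to nanoseconds):

```c
/* Sketch of buffered output: record the packet with its virtual timestamp
 * and keep emulating; send it once real time catches up. */
#include <stdint.h>
#include <string.h>

#define BUF_SLOTS 16
#define MAX_PKT   1514

typedef struct {
    uint64_t due_ns;                 /* real-time deadline, from virtual time */
    int      len;
    unsigned char data[MAX_PKT];
} pending_t;

static pending_t pending[BUF_SLOTS];
static int head = 0, tail = 0;

uint64_t real_ns_elapsed(void);                /* from the earlier sketch */
uint64_t virt_ns(void);                        /* assumed: cycles converted to ns */
void host_send_packet(const void *p, int len); /* hypothetical host send */

/* The guest "sends" at virtual 8:23 while real time is 8:00: just queue it. */
void buffered_send(const void *p, int len) {
    pending_t *slot = &pending[tail];
    tail = (tail + 1) % BUF_SLOTS;   /* (a real emulator would handle overflow) */
    slot->due_ns = virt_ns();
    slot->len = len > MAX_PKT ? MAX_PKT : len;
    memcpy(slot->data, p, slot->len);
}

/* Called periodically from the main loop: flush packets whose virtual
 * timestamp real time has now reached. */
void flush_pending(void) {
    while (head != tail && pending[head].due_ns <= real_ns_elapsed()) {
        host_send_packet(pending[head].data, pending[head].len);
        head = (head + 1) % BUF_SLOTS;
    }
}
```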