Will the processor wait until all the writes are done before writing the line back, or will it evict the line earlier, after only one or a few writes have occurred?
The CPU may evict the line earlier, but only if its cache set is under heavy pressure from other accesses that map to the same set. This is unlikely. Caches are designed to avoid prematurely evicting recently fetched data.
Do I have to do all the writes at the end of the procedure in quick succession?
In general, yes. Temporal locality matters: caches work best when accesses to the same data are grouped closely in time. Other tricks may apply. For example, you can try to "warm" the cache line by doing a dummy write to your structure ahead of the real writes. This exploits some memory-level parallelism, because the core fetches the cache line while it executes the intermediate code. By the time you do the real writes, the cache line is more likely to be ready in L1.
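The warm-up idea can be sketched roughly as follows. This is only an illustration: `struct record`, `intermediate_work`, and `fill_record` are hypothetical names, and the 64-byte line size is an assumption that happens to hold on most current x86 and ARM parts. The dummy store goes through a `volatile` pointer so the compiler does not discard it as a dead store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record sized to fit one cache line (64 bytes is assumed). */
struct record {
    uint64_t fields[8];
};

/* Stand-in for whatever the procedure computes before the real writes. */
static uint64_t intermediate_work(uint64_t n)
{
    uint64_t acc = 0;
    for (uint64_t i = 0; i < n; i++)
        acc += i * i;
    return acc;
}

/* Touch the record early, do the intermediate work, then write for real. */
uint64_t fill_record(struct record *rec, uint64_t n)
{
    assert(rec != NULL);

    /* Warm-up: a dummy write that asks the core to bring the line in.
     * volatile keeps the compiler from removing the store. */
    *(volatile uint64_t *)&rec->fields[0] = 0;

    /* The line fetch can overlap with this work. */
    uint64_t acc = intermediate_work(n);

    /* Real writes, grouped together in quick succession. */
    for (size_t i = 0; i < 8; i++)
        rec->fields[i] = acc + i;

    return acc;
}
```

With GCC or Clang, `__builtin_prefetch(rec, 1, 3)` is an alternative to the dummy store that does not modify memory. Whether either variant helps at all depends on the hardware and on how long the intermediate work runs, so treat it as something to measure, not as a guaranteed win.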
In general, be very careful about doing unnatural things in your code to improve cache performance. Caches do a pretty good job when left to themselves. You should always measure performance before and after any change. What you think is an improvement could actually hurt. If your program is multithreaded, a whole other can of worms, cache contention between cores, may come into play.
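For the before-and-after measurement, a minimal timing harness might look like the sketch below. It assumes a POSIX system with `clock_gettime`; `time_best_ns`, `elapsed_ns`, and `sample_workload` are hypothetical helpers, not part of any library. Taking the best of several runs is one common way to reduce noise, not the only one.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Difference between two timestamps in nanoseconds. */
static int64_t elapsed_ns(struct timespec start, struct timespec end)
{
    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
         + (int64_t)(end.tv_nsec - start.tv_nsec);
}

/* Run fn(arg) `reps` times and return the fastest run in nanoseconds. */
int64_t time_best_ns(void (*fn)(void *), void *arg, int reps)
{
    assert(reps > 0);

    int64_t best = INT64_MAX;
    for (int r = 0; r < reps; r++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        fn(arg);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        int64_t ns = elapsed_ns(t0, t1);
        if (ns < best)
            best = ns;
    }
    return best;
}

/* Placeholder workload: replace with the code path being tuned. */
static void sample_workload(void *arg)
{
    volatile uint64_t *counter = arg;
    for (int i = 0; i < 1000; i++)
        *counter += 1;
}
```

Call `time_best_ns` once on the original code path and once on the modified one, with the same input and the same number of repetitions, and keep the change only if the difference is consistent across runs.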