The concept of "block size" in the cache

I'm just starting to learn the concepts of direct-mapped and set-associative caches, and I have some elementary doubts. Here they are.

Assuming the addresses are 32 bits long and I have a 32 KB cache with a block size of 64 bytes and 512 frames, how much data is actually stored inside a “block”? If I have an instruction that loads a value from a memory location, and that value is a 16-bit integer, does one of the 64-byte blocks now store only that one 16-bit integer (2 bytes)? What about the other 62 bytes in the block? If I then have another load instruction that also loads a 16-bit integer, does this value go into a block in another frame, depending on the load address (and if the address maps to the same frame as the previous instruction, is the previous value evicted, so that the block again holds only 2 useful bytes out of 64)? Right?

Please forgive me if this seems like a very silly question; I just want to make sure I understand the concepts correctly.

2 answers

I originally wrote this up for someone else, to explain caches, but I think you may find it useful too.

You have 32-bit addresses that refer to bytes in RAM. You want to be able to cache the data you are accessing, to use it again later.

Let's say you need a 1-MiB (2^20-byte) cache.

What do you do?

You have two constraints that you need to meet:

  • Caching should be as uniform as possible across all addresses, i.e. you don't want to bias toward any particular addresses.
    • How do you do this? Use the remainder! With mod, you can distribute any integer evenly over any range you want.
  • You want to minimize bookkeeping costs. For example, if you cached data in 1-byte blocks, you wouldn't want to store 4 bytes of metadata just to track where each single byte belongs.
    • How do you do this? You store blocks bigger than 1 byte.

Suppose you choose 16-byte (2^4-byte) blocks. This means you can cache 2^20 / 2^4 = 2^16 = 65,536 blocks of data.
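To make the arithmetic concrete, here is a minimal C sketch of that geometry; the macro names are my own choices, not anything standard:

    #include <stdio.h>

    #define CACHE_SIZE (1 << 20)                  /* 2^20 bytes = 1 MiB   */
    #define BLOCK_SIZE (1 << 4)                   /* 2^4 bytes = 16 bytes */
    #define NUM_BLOCKS (CACHE_SIZE / BLOCK_SIZE)  /* 2^16 = 65,536 blocks */

    int main(void) {
        printf("blocks in cache: %d\n", NUM_BLOCKS);   /* prints 65536 */
        /* mod spreads any address uniformly across the cache: */
        unsigned addr = 0x12345678;
        printf("block index: %u\n", (addr % CACHE_SIZE) / BLOCK_SIZE);
        return 0;
    }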

You now have several options:

  • You can design the cache so that data from any memory block can be stored in any cache block. This is called a fully associative cache.
    • The advantage is that this is the “fairest” kind of cache: all blocks are treated exactly the same.
    • The trade-off is speed: to find where to put a memory block, you have to search every cache block for free space. That is very slow.
  • You can design the cache so that data from any memory block can be stored in only one cache block. This is called a direct-mapped cache.
    • The advantage is that this is the fastest kind of cache: you only have one place to check to see whether the item is in the cache or not.
    • The trade-off is that, with a bad memory access pattern, you can have two blocks repeatedly evicting each other while unused blocks still sit in the cache.
  • You can make a mixture of both: map each memory block to more than one cache block. This is what real processors do: they have N-way set-associative caches (a sketch follows this list).
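As an illustration of that mixed approach, here is a small sketch of how a hypothetical 4-way set-associative version of the same 1-MiB cache would derive its set index. NUM_BLOCKS and BLOCK_SIZE are the macros from the sketch above; WAYS and NUM_SETS are illustrative names:

    /* With 4 ways, the 65,536 lines are grouped into 16,384 sets of 4.
       An address maps to exactly one set, but its block may sit in any
       of the 4 ways within that set. */
    #define WAYS     4
    #define NUM_SETS (NUM_BLOCKS / WAYS)

    unsigned set_index(unsigned addr) {
        return (addr / BLOCK_SIZE) % NUM_SETS;   /* block number mod set count */
    }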

Direct-mapped cache:

You now have 65,536 blocks of data, each 16 bytes long.
You store them as 65,536 “lines” inside your cache, with each “line” consisting of the data itself plus some metadata (about where the block belongs, whether it has been written to, etc.).

Question: How does each block in memory map to a block in the cache?

Answer: Well, you're using a direct-mapped cache, via mod. That means addresses 0 to 15 map to block 0 in the cache; 16 to 31 map to block 1, etc., and it wraps around when you reach the 1-MiB mark.

So, given a memory address M, how do you find the line number N? Easy: N = (M mod 2^20) / 2^4.
But that only tells you where to store the data, not how to retrieve it. Once you've stored it and try to access it again, you have to know which part of memory was stored in that line, right?

So, one piece of metadata: the tag bits. If the block is in line N, all you need to know is what the quotient was in the mod operation, which for a 32-bit address is 12 bits wide (since the remainder is 20 bits).

So your tag ends up being 12 bits wide: specifically, the top 12 bits of any memory address.
And you already knew that the lowest 4 bits are used as the offset within a block (since memory is byte-addressed and a block is 16 bytes).
That leaves the middle 16 bits as the “index” bits of the memory address, which determine which line the address belongs to. (This is just the same division + remainder operation, written in binary.)
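Here is a small, self-contained C sketch of that 12/16/4 split; the constants and variable names are mine, chosen to match the numbers above:

    #include <stdint.h>
    #include <stdio.h>

    #define OFFSET_BITS 4    /* byte within the 16-byte block  */
    #define INDEX_BITS  16   /* which of the 65,536 lines      */
                             /* the remaining 12 bits: the tag */
    int main(void) {
        uint32_t addr   = 0xDEADBEEF;                         /* example address */
        uint32_t offset = addr & ((1u << OFFSET_BITS) - 1);   /* low 4 bits      */
        uint32_t index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
        uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS); /* top 12 bits     */
        printf("tag=0x%03x index=0x%04x offset=0x%x\n", tag, index, offset);
        return 0;
    }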

You also need a few other bits: for example, you need to know whether a block is valid or not, because when the CPU powers on the cache contains garbage data. So you add 1 more bit of metadata: the valid bit.
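Putting the pieces together, one plausible layout for a single cache line might look like this in C (the field names are illustrative, not from any real hardware description):

    #include <stdbool.h>
    #include <stdint.h>

    struct cache_line {
        bool     valid;    /* false at power-on: contents are garbage   */
        bool     dirty;    /* has the block been written to?            */
        uint16_t tag;      /* the 12 tag bits, stored in a 16-bit field */
        uint8_t  data[16]; /* the cached block itself                   */
    };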

The other bits you will learn about are used for optimization, synchronization, etc., but these are the main ones. :)


I assume you know the basics of tag, index, and offset, but here is a short explanation as I learned it in my computer architecture class. Data is replaced in 64-byte blocks, so every time a new block is cached, all 64 bytes are replaced, regardless of whether you need only one byte of it. That is why a cache access carries an offset indicating which byte you want from the block. Taking your example: if only a 16-bit integer is loaded, the cache will look up the block by index, check the tag to make sure it holds the right data, and then fetch the bytes at the given offset. Now, if you load another 16-bit value with, say, the same index but a different tag, the cache will replace the whole 64-byte block with the new one and fetch the data from the specified offset (assuming direct mapping).
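Here is a rough C sketch of that lookup, using the numbers from the question (32 KB direct-mapped cache, 64-byte blocks, 512 frames, so 6 offset bits, 9 index bits, 17 tag bits); read_block_from_memory() is a hypothetical placeholder for the memory-side refill:

    #include <stdbool.h>
    #include <stdint.h>

    #define BLOCK_SIZE 64
    #define NUM_FRAMES 512

    struct line { bool valid; uint32_t tag; uint8_t data[BLOCK_SIZE]; };
    static struct line cache[NUM_FRAMES];

    /* hypothetical helper: fills dst with the 64-byte block at addr */
    extern void read_block_from_memory(uint32_t addr, uint8_t *dst);

    uint8_t read_byte(uint32_t addr) {
        uint32_t offset = addr % BLOCK_SIZE;
        uint32_t index  = (addr / BLOCK_SIZE) % NUM_FRAMES;
        uint32_t tag    = addr / (BLOCK_SIZE * NUM_FRAMES);
        struct line *l  = &cache[index];

        if (!l->valid || l->tag != tag) {              /* miss: evict and refill */
            read_block_from_memory(addr - offset, l->data); /* all 64 bytes      */
            l->valid = true;
            l->tag   = tag;
        }
        return l->data[offset];                        /* return the wanted byte */
    }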

Hope this helps! If you need more information, or it's still fuzzy, let me know; I know a couple of good sites that cover this material well.

