How can I disable processor cache for certain memory areas?

I read on Wikipedia that disabling the CPU cache can improve performance:

Marking some memory ranges as non-cached can improve performance by avoiding caching memory areas that are rarely accessed.

When I googled how to do this in C on Linux, I did not find anything. It's not that I really need this feature, but I'm still interested.

And do you know about any projects that use this optimization?

Edit: I am programming for x86_64

+7
c linux caching
2 answers

That remark about non-cached memory does not mean what you think it means, and where the technique is used, it is usually not a user-accessible feature. That is, managing the CPU cache is usually a privileged operation.

That said ...

- A regular user program can be built with functions marked with the "hot" or "cold" attributes, so that the compiler and linker can group functions in ways that use the cache most efficiently.

- A regular program can use the madvise() function on Linux to tell the paging subsystem various things, including whether memory that was just used is or is not likely to be used again soon.

- The kernel itself uses the memory type range registers (MTRRs), and in later kernels the page attribute table (PAT), to tell the hardware that certain memory ranges (e.g. memory-mapped display buffers and various parts of the PCI bus) should not be cached.
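To make the first bullet concrete, here is a minimal sketch of the hot/cold function attributes, assuming GCC or Clang (other compilers may not accept them). The function names `sum_values` and `report_negative_value` are hypothetical, chosen only for illustration. The attributes are hints: the compiler may place cold code in a separate section and optimize hot code more aggressively, but nothing is guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* GCC and Clang understand these attributes; elsewhere they expand to nothing. */
#if defined(__GNUC__)
#define HOT_FN  __attribute__((hot))
#define COLD_FN __attribute__((cold, noinline))
#else
#define HOT_FN
#define COLD_FN
#endif

/* Hypothetical error path: rarely executed, so it is marked cold. */
COLD_FN long report_negative_value(size_t index)
{
    fprintf(stderr, "negative value at index %zu\n", index);
    return -1;
}

/* Hypothetical inner loop: executed constantly, so it is marked hot.
 * Returns the sum of the values, or -1 if a negative value is found. */
HOT_FN long sum_values(const int *values, size_t count)
{
    assert(values != NULL || count == 0);

    long total = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] < 0)
            return report_negative_value(i);
        total += values[i];
    }
    return total;
}
```

Calling `sum_values` on an array of non-negative integers returns their sum; the cold path only runs when the input is bad, which is the situation these attributes are meant to describe.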

Normal Data™, such as you are likely to use in any C program, will essentially never benefit from being marked as uncached. The performance gain from uncached memory comes from avoiding the various cache flush and memory barrier operations that would otherwise be needed for memory-mapped devices and display buffers. For example, to layer the cache on top of a memory-mapped device, you would need a cache invalidate instruction before each read and a forced cache write-back after each write to make sure that reads and writes happen at the right time. This would "poison" the cache by using up and instantly discarding cache lines (a physically limited resource) in the most unfriendly and useless way.

In the rare case where you write a program that accesses one of these cache-hostile regions, for example if you were writing part of the X display server on a Linux system, the kernel would already have set up the registers for the device, and the uncached behavior would be transparent to you.

Effectively, there is no time when a normal application-level program would have any occasion to mark a variable as cache-hostile, outside of the various uses of madvise().
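For reference, a minimal sketch of what such madvise() hints look like, assuming Linux with glibc. The function name `use_once_and_release` is hypothetical. Note that these hints are aimed at the kernel's paging behavior, not at the CPU cache itself, and the kernel is free to ignore some of them.

```c
#define _DEFAULT_SOURCE 1  /* exposes madvise() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fill a scratch buffer once, then tell the kernel its pages are no longer
 * needed. Returns 0 on success, -1 on failure. */
int use_once_and_release(size_t length)
{
    assert(length > 0);

    /* mmap() returns page-aligned memory, which madvise() requires. */
    unsigned char *buf = mmap(NULL, length, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Hint: the buffer is about to be walked from start to end. */
    (void)madvise(buf, length, MADV_SEQUENTIAL);
    for (size_t i = 0; i < length; i++)
        buf[i] = (unsigned char)i;

    /* Hint: we are done with these pages; the kernel may reclaim them. */
    int rc = madvise(buf, length, MADV_DONTNEED);

    munmap(buf, length);
    return rc == 0 ? 0 : -1;
}
```

A call such as `use_once_and_release(1 << 20)` exercises both hints on a 1 MiB anonymous mapping.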

Even then, the cases where you could gain any benefit are so rare that if you ever faced one, the problem itself would present the need and the methodology as part of your research, and you would be told how and why so clearly that you would never need to ask this question.

To go back to the same example once more: if you were writing a driver, then when you read up on the display adapter or PCI device, the relevant flags and methods would be documented and discussed in the hardware manual.

There are ways to trigger cache flushes and the like from user space, with things like the CLFLUSH instruction on Intel platforms. These methods will not improve overall performance.
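As an illustration of such a user-space flush, here is a minimal sketch using the x86 cache-line flush instruction CLFLUSH through the `_mm_clflush` intrinsic. It assumes an x86_64 compiler with SSE2 and a 64-byte cache line (typical, but an assumption); on other targets the sketch compiles to a no-op. The function name `flush_range` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence */
#endif

#define CACHE_LINE_SIZE 64  /* assumption: typical x86_64 cache line size */

/* Flush every cache line overlapping [addr, addr + len).
 * Returns the number of lines flushed (0 where CLFLUSH is unavailable). */
size_t flush_range(const void *addr, size_t len)
{
    assert(addr != NULL);
    size_t lines = 0;

#if defined(__SSE2__)
    uintptr_t start = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t end = (uintptr_t)addr + len;

    for (uintptr_t p = start; p < end; p += CACHE_LINE_SIZE) {
        _mm_clflush((const void *)p);  /* write back and invalidate one line */
        lines++;
    }
    _mm_mfence();  /* order the flushes against later loads and stores */
#else
    (void)len;  /* no portable equivalent; do nothing on other targets */
#endif

    return lines;
}
```

The data in the buffer is unchanged afterwards; the only effect is that the next access has to go back to main memory, which is why this tends to cost performance rather than gain it.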

Since this is a privileged operation on a Linux system, you could write a kernel driver that acquires a memory region, marks it as uncacheable, and then lets you map it into your application. But the need for such a region is so rare, and so open to misuse, that there is no standard way to do it.

So how do you do this? You don't, at least not as who you are today. Once you are a kernel driver author with intimate knowledge of multi-threaded code and data synchronization issues, you will know how to do it, and at that point you will understand why you don't want to, except as a last resort.

TL;DR: because of the way Linux uses and manages data and code, marking any part of a regular application as uncacheable will essentially never be useful; it would cause more grief than it saves. Therefore, there is no unprivileged API for it.

P.S. Along the same lines, someone has already pointed to the material leading up to this article, http://lwn.net/Articles/255364/ , which describes ways to lay out your program to be very cache-friendly, including some ways to perform cache-bypassing operations very cheaply. For example, memset() tends to bypass the cache when initializing memory, and some operations can stream data without going through the cache in the usual way. This is not the same thing you are asking for, but once you understand that whole article, you will have a much better understanding of why designating a memory area as uncacheable is usually, as the Jedi say, not the solution you are looking for.
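To show what a cheap cache-bypassing operation can look like, here is a minimal sketch using non-temporal ("streaming") stores through the `_mm_stream_si32` intrinsic. It assumes an x86_64 compiler with SSE2 and falls back to ordinary stores elsewhere. The function name `fill_streaming` is hypothetical, and whether a particular memset() implementation uses stores like these depends on the C library and the size of the buffer.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32, _mm_sfence */
#endif

/* Fill dst[0 .. count) with value, using non-temporal stores where
 * available so the written data does not displace useful cache lines. */
void fill_streaming(int *dst, size_t count, int value)
{
    assert(dst != NULL || count == 0);

#if defined(__SSE2__)
    for (size_t i = 0; i < count; i++)
        _mm_stream_si32(&dst[i], value);  /* store that hints "do not cache" */
    _mm_sfence();  /* make the streamed stores globally visible */
#else
    for (size_t i = 0; i < count; i++)
        dst[i] = value;  /* plain stores on targets without SSE2 */
#endif
}
```

This only pays off for large buffers that will not be read again soon; for data you are about to reuse, ordinary cached stores are the better choice.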

+6

Recently, I needed to experiment with uncached memory in a multi-threaded cache application.

I came up with this module that lets you map uncached memory into user space.

A user process obtains the uncached memory by calling mmap() on the module's character device (see the test directory for a demonstration).
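On the user side, that would look roughly like the sketch below. The device path is not given here, so the caller must supply it; the function name `map_device_memory` is hypothetical, and the mapping offset and size the module expects are assumptions. The cache attributes themselves would be set by the driver's mmap handler, so the user code is an ordinary mmap() call.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `length` bytes from a character device into this process.
 * Returns the mapping, or NULL if the device cannot be opened or mapped. */
void *map_device_memory(const char *device_path, size_t length)
{
    assert(device_path != NULL && length > 0);

    int fd = open(device_path, O_RDWR);
    if (fd < 0)
        return NULL;

    /* MAP_SHARED so accesses go to the memory the driver provides. */
    void *mem = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return mem == MAP_FAILED ? NULL : mem;
}
```

The caller should release the region with `munmap()` when done and treat a NULL return as "module not loaded or permission denied".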

"What Every Programmer Should Know About Memory" is a must-read!

0