How to check for memory leaks in a large-scale C++ Linux application?

I am currently working on a large-scale application project (written in C++), which started from scratch some time ago, and we have reached the point where it is necessary to review the memory leak checks.

The application runs on Ubuntu Linux, has a lot of multimedia content, and uses OpenGL, SDL and ffmpeg for various purposes, including 3D graphics rendering, windowing, audio and video playback. You can think of it as a video game: it is not one, but its responsibilities are easiest to describe that way.

I am currently somewhat at a loss in determining whether we still have memory leaks or not. In the past we identified some and removed them. However, these days the application is almost complete, and the tests we ran give me results that I cannot interpret with confidence.

The first thing I did was try to run the application through Valgrind... unfortunately, the application crashes when launched under Valgrind. The crash is "non-deterministic", as it happens in different places each time. So I gave up on Valgrind as an easy way to identify the source of potential leaks, and ended up using two Linux commands: free and top.

free is used to check the use of system memory while the application is running

top is used with the '-p' option to examine the memory usage of the application process while it is running.

The output from top and free is dumped into files for later processing. I made two graphs from the data, which are linked at the bottom of the question.

The test is very simple: the memory data is sampled once the application is already running and waiting for commands. Then I start a sequence of commands that always do the same thing. The application is expected to load a lot of multimedia data into RAM, and then unload it.

Unfortunately, the graph does not show me what I expected. Memory usage increases in three distinct steps and then stops. The memory, apparently, is never released, which hinted to me that there was a HUGE memory leak. That would have been fine, as it would mean that, most likely, we are not freeing the memory consumed by the media.

But after the first three steps... memory usage is stable... there are no more big steps... just small ups and downs that correspond to the expected loading and unloading of data. What is unexpected here is that the data being loaded / unloaded amounts to hundreds of megabytes of RAM, whereas the ups and downs are only a few megabytes (say, 8-10 MB).

Currently, I am pretty much at a loss as to how to interpret this data.

Does anyone have any tips or suggestions? What am I missing? Is the method I use to check for macroscopic memory leaks completely wrong? Do you know of any other (preferably free) tool, besides Valgrind, for checking memory leaks?

System memory usage graph

Process memory usage graph

+6
7 answers

First of all...

and we have reached the point where it is necessary to review the memory leak checks.

This, in fact, is a problem of methodology. Correctness should be the main goal of any piece of software, not an afterthought.

I will assume you realize this now, and also how much easier it would have been to identify the problems had you run an instrumented unit test suite on every commit.


So what to do now?

  • Runtime detection:

    • Try to get Valgrind working; you probably have some environment issue.
    • Try ASan, ThreadSan, and MemSan; they are not trivial to set up on Linux, but they are so impressive!
    • Try instrumented builds: tcmalloc includes a heap checker, for example
    • ...
  • Compile-time detection:

    • Turn on warnings (preferably with -Werror) (not specific to your problem)
    • Use static analysis such as Clang's; it can detect unpaired allocation routines
    • ...
  • Human detection:

    • Code reviews: make sure all resources are allocated within RAII classes.
    • ...

Note. Using only RAII classes helps eliminate memory leaks, but does not help with dangling references. Fortunately, detecting dangling references is what ASan does.
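To make the RAII advice concrete, here is a minimal sketch, assuming a media buffer obtained from a C-style allocator. The names `load_media`, `BufferDeleter` and `g_live_buffers` are hypothetical and exist only for illustration; the counter is there just so the effect can be observed.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Hypothetical counter of live buffers, only here to make the effect visible.
static int g_live_buffers = 0;

// Deleter that releases a malloc'ed buffer and updates the counter.
struct BufferDeleter {
    void operator()(unsigned char* p) const noexcept {
        std::free(p);
        --g_live_buffers;
    }
};

// The owning handle: the buffer is freed when this object is destroyed.
using Buffer = std::unique_ptr<unsigned char, BufferDeleter>;

// Hypothetical stand-in for "load some media data into RAM".
Buffer load_media(std::size_t bytes) {
    auto* raw = static_cast<unsigned char*>(std::malloc(bytes));
    if (!raw) return Buffer{};  // allocation failed: return an empty handle
    ++g_live_buffers;
    return Buffer{raw};
}

// Loads a buffer in an inner scope and reports how many are still alive.
int live_buffers_after_scope() {
    {
        Buffer frame = load_media(1024);
        // ... use frame.get() ...
    }  // frame's destructor runs here, also on early return or exception
    return g_live_buffers;
}
```

Because ownership lives in the type, there is no code path on which the buffer can be forgotten; a raw `malloc`/`free` pair offers no such guarantee.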


And once you have fixed all the problems, make sure this becomes part of the process. Changes should be reviewed and tested, always, so that rotten apples are culled immediately rather than left to spoil the code base.

+3

Instead of abandoning Valgrind, you should work with its developers and try to:

  • get rid of the errors you encountered under Valgrind
  • get your application thoroughly tested and debugged with an up-to-date Valgrind.

Saying that you abandoned Valgrind, which is the solution to your problem, does not help...

Valgrind is the tool that we all use to check for memory leaks and threading errors on Linux.

In the end, it is definitely better to invest time in figuring out "why Valgrind does not work with my application" than to look for alternative solutions. Valgrind is a tried and tested tool, though not a perfect one. And it beats the alternatives by a long, long shot.

The Valgrind page says that it is better to file bugs in Bugzilla, but in practice it is better to ask on https://lists.sourceforge.net/lists/listinfo/valgrind-users whether anyone has seen such problems before and what to do in such a situation. In the worst case, they will tell you to file a Bugzilla bug, or file one themselves.

+5

You probably want to look at valgrind.

You can start with really simple examples to get a feel for valgrind's output, which can be somewhat verbose. Consider this simplified example, where valgrind reports exactly what is missing and how much:

    edd@max:/tmp$ cat valgrindex.cpp
    #include <cstdlib>

    int main() {
      double *a = new double[100];
      exit(0);
    }
    edd@max:/tmp$ g++ -o valgrindex valgrindex.cpp
    edd@max:/tmp$ valgrind ./valgrindex
    ==15910== Memcheck, a memory error detector
    ==15910== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
    ==15910== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
    ==15910== Command: ./valgrindex
    ==15910==
    ==15910==
    ==15910== HEAP SUMMARY:
    ==15910==     in use at exit: 800 bytes in 1 blocks
    ==15910==   total heap usage: 1 allocs, 0 frees, 800 bytes allocated
    ==15910==
    ==15910== LEAK SUMMARY:
    ==15910==    definitely lost: 0 bytes in 0 blocks
    ==15910==    indirectly lost: 0 bytes in 0 blocks
    ==15910==      possibly lost: 0 bytes in 0 blocks
    ==15910==    still reachable: 800 bytes in 1 blocks
    ==15910==         suppressed: 0 bytes in 0 blocks
    ==15910== Rerun with --leak-check=full to see details of leaked memory
    ==15910==
    ==15910== For counts of detected and suppressed errors, rerun with: -v
    ==15910== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
    edd@max:/tmp$

+2

The results from free and top will not help you. I am sorry that you went to the effort of graphing their output. I gave a good explanation of why they are useless in a similar topic here: Memory stability of a C++ application on Linux.

I also agree with the other answers here that you should probably give high priority to resolving the crashes you encounter under Valgrind. Valgrind is considered very stable these days, and I personally run quite complex multi-threaded multimedia SDL / OpenGL etc. applications through it without problems. Most likely, running under Valgrind is exposing a latent instability in your application. The crash sounds like a threading race condition, although it could also be heap / memory corruption.

What you may want to ask next, then, is how to debug an application that crashes when run under Valgrind (something I don't know the answer to).

+2

The problem with free and top is that they can show you that there is a problem, but they do not help much in fixing it. Of the 100 or 1000 lines of code that allocate memory, which ones leak? This is where Valgrind helps.

If this is for a company with a budget for tools, you can look at Purify or other commercial tools.

Just for completeness, I will mention the conservative garbage collector from Boehm (which works for C and C++ code). You can disable the GC and use GC_free(), and it becomes a leak detection tool. Or you can leave the GC on to automatically free memory that is no longer in use.

+2

It all depends on which allocator you are using. The libc allocators (malloc, calloc, realloc) and the C++ allocators (new, delete) probably use an optimization trick that involves not returning memory to the OS. You see, if you ask malloc for some memory, use it and then free it, it will not necessarily be released back to the OS. Rather, when you ask malloc for memory, it will (most likely) obtain more from the OS than is necessary (rounded up to whole pages). So the next time you need more memory, malloc already has it on hand. The same goes for free: the memory may simply have been added back to malloc's memory pool, from which your subsequent allocations are drawn.

So your application's first few mallocs push memory usage up considerably, but after that the pool is large enough to accommodate future allocations.
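This effect can be observed from inside the process. The following is a rough sketch, assuming a Linux system where `/proc/self/status` contains a `VmRSS:` line; the helper names (`parse_vm_rss_kb`, `current_rss_kb`, `load_and_release`) are made up for illustration, and the numbers you get will depend on the allocator and on the block sizes used.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Extract the VmRSS value (in kB) from text in /proc/<pid>/status format.
// Returns -1 if no usable VmRSS line is found.
long parse_vm_rss_kb(std::istream& status) {
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kb = -1;
            fields >> kb;
            return fields ? kb : -1;
        }
    }
    return -1;
}

// Resident set size of the current process in kB, or -1 if unavailable.
long current_rss_kb() {
    std::ifstream status("/proc/self/status");
    return status ? parse_vm_rss_kb(status) : -1;
}

struct RssSamples {
    long before_kb;
    long loaded_kb;
    long released_kb;
};

// Allocate `blocks` buffers of `block_bytes` each, touch them, free them,
// and sample the RSS before, during and after.
RssSamples load_and_release(std::size_t blocks, std::size_t block_bytes) {
    RssSamples s{};
    s.before_kb = current_rss_kb();
    {
        std::vector<std::unique_ptr<char[]>> pool;
        pool.reserve(blocks);
        for (std::size_t i = 0; i < blocks; ++i) {
            pool.emplace_back(new char[block_bytes]);
            // Touch one byte per 4 kB so the pages become resident.
            for (std::size_t j = 0; j < block_bytes; j += 4096) pool.back()[j] = 1;
        }
        s.loaded_kb = current_rss_kb();
    }  // every buffer has been deleted at this point
    s.released_kb = current_rss_kb();
    return s;
}
```

Calling, say, `load_and_release(100000, 512)` and comparing `released_kb` with `before_kb` shows how much freed memory the allocator is still holding. With many small blocks the RSS may stay close to `loaded_kb` even though nothing has leaked, which is consistent with the plateau described in the question.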

0

In addition to using valgrind, you can also consider using Boehm's conservative GC; you probably want to compile and configure it as a memory leak detector.

And you can even dare to use the Boehm GC as your primary memory allocator.

By the way, looking at /proc/1234/maps can help you (where 1234 is the pid of your process).
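As a rough sketch of how that file could be put to use, the code below sums the size of each mapping by name, so that growth of `[heap]` or of anonymous regions can be tracked over time. It assumes the usual `start-end perms offset dev inode pathname` line layout; the function names and the `[anon]` label are choices made for this example only.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <map>
#include <sstream>
#include <string>

using MappingSizes = std::map<std::string, unsigned long long>;

// Sum the size in bytes of every mapping, grouped by pathname.
// Mappings without a pathname are grouped under the label "[anon]".
MappingSizes mapping_sizes(std::istream& maps) {
    MappingSizes totals;
    std::string line;
    while (std::getline(maps, line)) {
        std::istringstream fields(line);
        unsigned long long start = 0, end = 0;
        char dash = 0;
        std::string perms, offset, dev, inode;
        // Addresses are hexadecimal; the remaining columns are read as text.
        if (!(fields >> std::hex >> start >> dash >> end
                     >> perms >> offset >> dev >> inode)
            || dash != '-' || end < start) {
            continue;  // skip lines that do not match the expected layout
        }
        std::string name;
        std::getline(fields >> std::ws, name);  // rest of line, may be empty
        if (name.empty()) name = "[anon]";
        totals[name] += end - start;
    }
    return totals;
}

// Convenience wrapper: read /proc/<pid>/maps (empty result if unreadable).
MappingSizes mapping_sizes_for_pid(int pid) {
    std::ifstream maps("/proc/" + std::to_string(pid) + "/maps");
    return mapping_sizes(maps);
}
```

Sampling `mapping_sizes_for_pid(1234)` before and after the load/unload sequence, and comparing the `[heap]` and `[anon]` totals, gives a coarse picture of where the address space is growing. Like free and top, it cannot tell you which allocation leaked.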

0
