Guppy reports different memory usage than the ps command

I am profiling my Twisted server. It uses far more memory than I expected, and its memory usage grows over time.

    ps -o pid,rss,vsz,sz,size,command
      PID   RSS    VSZ    SZ     SZ COMMAND
     7697 70856 102176 25544  88320 twistd -y broadcast.tac

As you can see, it takes 102176 KB of virtual memory, namely 99.78125 MB. And I use Guppy from a Twisted manhole to inspect the memory usage profile.
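
For reference, the snapshot below was taken with something along these lines (a minimal sketch, assuming guppy is installed and you have a Python shell, such as a Twisted manhole, into the running process):

    # Minimal sketch: take a heap snapshot with guppy
    # (assumes the guppy package is installed and you are in a Python
    # shell attached to the running process, e.g. a Twisted manhole).
    from guppy import hpy

    hp = hpy()          # heap inspector
    print(hp.heap())    # partition of all reachable Python objects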

    >>> hp.heap()
    Partition of a set of 120537 objects. Total size = 10096636 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0  61145  51  5309736  53   5309736  53 str
         1  27139  23  1031596  10   6341332  63 tuple
         2   2138   2   541328   5   6882660  68 dict (no owner)
         3   7190   6   488920   5   7371580  73 types.CodeType
         4    325   0   436264   4   7807844  77 dict of module
         5   7272   6   407232   4   8215076  81 function
         6    574   0   305776   3   8520852  84 dict of class
         7    605   1   263432   3   8784284  87 type
         8    602   0   237200   2   9021484  89 dict of type
         9    303   0   157560   2   9179044  91 dict of zope.interface.interface.Method
    <384 more rows. Type e.g. '_.more' to view.>

Hmm... something seems to be wrong. Guppy shows a total memory usage of 10096636 bytes, namely 9859.996 KB, or about 9.63 MB.

That is a huge difference. Why are the two results so far apart? What am I doing wrong?

Update: I wrote a monitoring script last night. It records the memory usage and the number of online users. This is an internet radio server, so you can see the number of radios and the total number of listeners. Here is a chart I generated with matplotlib.

[matplotlib chart: memory usage and online user counts over time]
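
A simplified sketch of what such a script can look like (the pid, the log format and the get_counts helper are placeholders, not the actual script):

    # Hypothetical monitoring sketch: once a minute, log the radio /
    # listener counts together with the memory figures ps reports
    # for the twistd process.
    import logging
    import subprocess
    import time

    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s %(levelname)s %(message)s')

    TWISTD_PID = 7697   # pid of the twistd process (assumed)

    def memory_fields(pid):
        # rss, vsz, sz and size in KB, as reported by ps
        out = subprocess.check_output(
            ['ps', '-o', 'rss=,vsz=,sz=,size=', '-p', str(pid)])
        return out.decode().split()

    def get_counts():
        # placeholder for however the server exposes radio/listener counts
        return 0, 0

    while True:
        radios, listeners = get_counts()
        logging.info('%s %s %s', radios, listeners,
                     ' '.join(memory_fields(TWISTD_PID)))
        time.sleep(60)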

Something strange: sometimes the memory usage printed by ps is very small, like this:

    2010-01-15 00:46:05,139 INFO 4 4 17904 36732 9183 25944
    2010-01-15 00:47:03,967 INFO 4 4 17916 36732 9183 25944
    2010-01-15 00:48:04,373 INFO 4 4 17916 36732 9183 25944
    2010-01-15 00:49:04,379 INFO 4 4 17916 36732 9183 25944
    2010-01-15 00:50:02,989 INFO 4 4  3700  5256 1314  2260

What could be the reason for such ultra-low memory usage? And what's more, even when there is no online radio and no listeners, memory usage is still high.

+7
python memory-management twisted guppy
3 answers

This is possibly due to memory swapping / reservation; from the ps definitions:

    RSS: resident set size, the non-swapped physical memory that a task has used (in kilobytes).
    VSZ: virtual memory usage of the entire process (vm_lib + vm_exe + vm_data + vm_stack).

It can be a little confusing; you can see 4 different size metrics with:

    # ps -eo pid,vsz,rss,sz,size,cmd | egrep python
      PID  VSZ  RSS   SZ   SZ CMD
    23801 4920 2896 1230 1100 python

The virtual size includes memory that was reserved by the process but not used, the size of all shared libraries that were loaded, pages that have been swapped out, and blocks that were already freed by your process, so it can be much larger than the size of all live objects in Python.
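
If you want to inspect the raw numbers yourself, here is a small sketch (assuming Linux) that reads the corresponding Vm* fields straight from /proc instead of parsing ps output:

    # Sketch (assumes Linux): read RSS / virtual-size figures directly
    # from /proc/<pid>/status.
    def proc_memory(pid):
        fields = {}
        with open('/proc/%d/status' % pid) as f:
            for line in f:
                if line.startswith(('VmRSS:', 'VmSize:', 'VmData:', 'VmLib:')):
                    key, value = line.split(':', 1)
                    fields[key] = int(value.split()[0])   # values are in kB
        return fields

    print(proc_memory(7697))   # e.g. {'VmRSS': 70856, 'VmSize': 102176, ...}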

Some additional tools for investigating memory usage in Python:

A good guide to tracking down memory leaks in Python using pdb and objgraph:

http://www.lshift.net/blog/2008/11/14/tracing-python-memory-leaks
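
To give a taste of the objgraph approach from that guide, a minimal sketch (assuming the objgraph package is installed):

    # Minimal objgraph sketch: a census of live objects and the growth
    # since the previous call (assumes the objgraph package is installed).
    import objgraph

    objgraph.show_most_common_types(limit=10)   # most numerous object types
    objgraph.show_growth()                      # delta since the last call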

+6

As mentioned above, the RSS size is the one you are most interested in. The "virtual" size includes mapped libraries that you probably do not want to count.

It has been a while since I last used heapy, but I am fairly sure that the statistics it prints do not include the overhead added by heapy itself. This overhead can be quite significant (I have seen a 100 MB RSS process grow by another few dozen MB; see http://www.pkgcore.org/trac/pkgcore/doc/dev-notes/heapy.rst ).

But in your case I suspect the problem is that you are using some C library that either leaks or uses memory in a way that heapy does not track. Heapy is aware of memory used directly by Python objects, but if those objects wrap C objects that are allocated separately, heapy is normally not aware of that memory at all. You may be able to add heapy support to your bindings (but if you do not control the bindings you use, that is obviously a hassle, and even if you do control them it may not be possible, depending on what you wrap).

If there are leaks at the C level, heapy will also lose track of that memory (the RSS size will go up while heapy's reported size stays the same). Valgrind is probably the best tool to track those down, just as it is for other C applications.

Finally: memory fragmentation will often cause memory usage (as seen in top) to go up, but not (much) back down. This is usually not much of a problem for daemons, since the process will reuse the memory anyway; it simply is not released back to the OS, so the values in top do not come back down. If memory usage (as seen in top) goes up more or less linearly with the number of users (connections), does not come back down, but also does not keep growing forever until you hit a new maximum number of users, fragmentation is probably to blame.

+3

This is not a complete answer, but from your manhole I would also suggest manually running gc.collect() before looking at ps or top. guppy will show the allocated heap, but does nothing to proactively free objects that are no longer referenced.
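
For example, something like this from the manhole (a sketch; it assumes guppy is importable there):

    # Sketch: from the manhole, force a collection before re-checking ps/top.
    import gc
    from guppy import hpy

    hp = hpy()
    print(gc.collect())   # number of unreachable objects found by the collector
    print(hp.heap())      # heap snapshot after the collection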

+1
