Reference counting for SWIG-ed C structures containing complex types does not seem to work properly

I ran into something interesting in how SWIG handles reference counting for C structures that contain other structures as members.

I noticed that my Python SWIG objects were being garbage collected before I was done with them in situations where I stored data from sub-structure members in other Python objects (lists / dicts). After some digging, I found that the members of the SWIG-ed structure do not seem to have their own independent reference counts, even though the interpreter reports them as "Swig Objects". As a result, when I added data from a sub-element of the structure to my list, Python did not know that I had added a reference to that data.

I created a simple demo example. I SWIG-ed the following 3 structures:

SWIG-ed C Structures:

    typedef struct {
        unsigned long source;
        unsigned long destination;
    } message_header;

    typedef struct {
        unsigned long data[120];
    } message_large_body;

    typedef struct {
        message_header header;
        message_large_body body;
    } large_message;

Then I created a somewhat equivalent python class to compare behavior with a purely SWIG-ed solution.

A somewhat equivalent Python class

    class pyLargeMessage(object):
        def __init__(self):
            self.header = bar.message_header()
            self.body = bar.message_large_body()

Then I performed the following test in the interpreter.

Python interpreter results

    >>> y = pyLargeMessage()
    >>> y
    <__main__.pyLargeMessage object at 0x06C5E6B0>
    >>> y.header
    <Swig Object of type 'message_header *' at 0x06C5E700>
    >>> sys.getrefcount(y.header)
    3
    >>> z = [y.header]
    >>> sys.getrefcount(y.header)
    3
    >>> z += [y.header]
    >>> sys.getrefcount(y.header)
    4
    >>>
    >>> y = bar.large_message()
    >>> y
    <Swig Object of type 'large_message *' at 0x06C668E0>
    >>> y.header
    <Swig Object of type 'message_header *' at 0x06C66B60>
    >>> sys.getrefcount(y.header)
    1
    >>> z = [y.header]
    >>> sys.getrefcount(y.header)
    1
    >>> z += [y.header]
    >>> sys.getrefcount(y.header)
    1
    >>>

The Python implementation behaved as I expected, but the pure SWIG implementation did not. Can someone explain what is going on here?

I have read various sections of the SWIG documentation many times and cannot find anything that directly explains this. I learned a lot more about how everything works, but I cannot find a clear explanation / workaround for the phenomenon above.

I have thought about this for a long time, re-read the "Structures and Classes", "Proxy classes" and "Structure data members" sections, and looked at the generated wrapper code again and again. I still can't understand why the reference counts are not handled normally.

The generated C code calls SWIG_NewPointerObj, which ultimately (in most cases) calls PyObject_New, which in turn should (as stated in the Python documentation) return a new reference.

Generated SWIG code for the getter of the header element

    SWIGINTERN PyObject *_wrap_large_message_header_get(PyObject *self, PyObject *args) {
      PyObject *resultobj = 0;
      large_message *arg1 = (large_message *) 0 ;
      void *argp1 = 0 ;
      int res1 = 0 ;
      message_header *result = 0 ;

      if (args && PyTuple_Check(args) && PyTuple_GET_SIZE(args) > 0) SWIG_fail;
      res1 = SWIG_ConvertPtr(self, &argp1,SWIGTYPE_p_large_message, 0 | 0 );
      if (!SWIG_IsOK(res1)) {
        SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "large_message_header_get" "', argument " "1"" of type '" "large_message *""'");
      }
      arg1 = (large_message *)(argp1);
      result = (message_header *)& ((arg1)->header);
      resultobj = SWIG_NewPointerObj(SWIG_as_voidptr(result), SWIGTYPE_p_message_header, 0 | 0 );
      return resultobj;
    fail:
      return NULL;
    }
1 answer

As you noticed, the object returned by the getter for header or body is essentially a lightweight proxy object that holds a pointer to the header / body memory inside the struct. It does not own this memory (it still "belongs" to the message object itself or to the C library, depending on how you created it), and it is not a copy.

Even if it were a copy, your call to sys.getrefcount would still always return 1, because each call to the getter returns a new object.
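To make the "new object on every getter call" point concrete, here is a minimal pure-Python sketch. It does not use SWIG at all: `Header`, `StoredMessage` and `ProxyMessage` are hypothetical stand-ins, with a property that builds a fresh object on each access playing the role of the generated getter.

```python
import sys


class Header:
    """Hypothetical stand-in for the wrapped message_header."""


class StoredMessage:
    """Like pyLargeMessage: the header object is created once and stored."""

    def __init__(self):
        self.header = Header()


class ProxyMessage:
    """Like the SWIG-ed large_message: each read builds a new wrapper."""

    @property
    def header(self):
        return Header()  # a brand-new Python object on every access


stored = StoredMessage()
proxied = ProxyMessage()

# Identity check: one persistent object vs. two distinct temporaries.
same_stored = stored.header is stored.header
same_proxied = proxied.header is proxied.header

# Storing the persistent object in a list raises its reference count by one.
before = sys.getrefcount(stored.header)
keep = [stored.header]
stored_delta = sys.getrefcount(stored.header) - before

# With a fresh object per access, each getrefcount call measures a different
# temporary, so the list entry is never reflected in the number we read.
before = sys.getrefcount(proxied.header)
keep_proxy = [proxied.header]
proxied_delta = sys.getrefcount(proxied.header) - before

print(same_stored, same_proxied, stored_delta, proxied_delta)
```

The absolute numbers printed by sys.getrefcount depend on the interpreter and on whether you are in an interactive session, which is why the sketch only compares differences.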

From the Python side, if you want to never end up with a dangling pointer, there are two possible approaches:

  • The getter returns a proxy for a copy of header / body that owns the memory it points to.
  • The getter returns a proxy that holds a reference to the message itself, so even if message is released, its refcount cannot drop to 0 as long as proxy objects for its parts still exist.
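The difference between the two options can be sketched in plain Python, without SWIG. Everything here is a hypothetical stand-in: `Message` plays the role of large_message, a dict plays the role of the header memory, and a weak reference is used only to observe when the parent is destroyed.

```python
import copy
import gc
import weakref


class Message:
    """Hypothetical stand-in for large_message; owns the header data."""

    def __init__(self):
        self.header_data = {"source": 1, "destination": 2}


class HeaderProxy:
    """Option 2: a view onto the parent's data that keeps the parent alive."""

    def __init__(self, parent):
        self._parent = parent           # strong reference to the owner
        self.data = parent.header_data  # shared with the parent, not copied


def header_copy(message):
    """Option 1: an independent copy that owns its own data."""
    return copy.deepcopy(message.header_data)


msg = Message()
alive = weakref.ref(msg)  # does not keep msg alive; only lets us watch it

owned = header_copy(msg)
view = HeaderProxy(msg)
msg.header_data["source"] = 99  # change the parent after taking both

copy_sees_change = owned["source"] == 99      # the copy is detached
view_sees_change = view.data["source"] == 99  # the view shares the data

del msg
parent_alive_with_view = alive() is not None  # view._parent still holds it

del view
gc.collect()  # CPython frees it immediately; this covers other runtimes
parent_alive_after = alive() is not None

print(copy_sees_change, view_sees_change,
      parent_alive_with_view, parent_alive_after)
```

Option 1 trades memory and write-through semantics for independence; option 2 keeps writes visible in the parent at the cost of extending the parent's lifetime.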

I put together a working example of option #2 with SWIG. Your header file remains unchanged, but the interface becomes:

    %module test

    %{
    #include "test.h"
    %}

    %typemap(out) message_header * header %{
      // This expands to resultobj = SWIG_NewPointerObj(...) exactly as before:
      $result = SWIG_NewPointerObj(SWIG_as_voidptr($1), $1_descriptor, 0);
      // This sets a reference to the parent object inside the child
      PyObject_SetAttrString($result, "_parent", obj0);
    %}

    %include "test.h"

This is equivalent to saying:

    z = y.header
    z._parent = y

in Python.

Now we can run:

    import sys
    import test

    y = test.large_message()
    print(sys.getrefcount(y))
    print(y.header)
    z = [y.header]
    print(sys.getrefcount(y))
    z += [y.header]
    print(sys.getrefcount(y))

As expected, the reference count of y increases with each proxy that is created and kept. Thus, the memory the proxies refer to cannot be freed prematurely (at least not by SWIG).

You can make this more general and apply it to multiple types / members with %apply :

    %module test

    %{
    #include "test.h"
    %}

    %typemap(out) SWIGTYPE * SUBOBJECT %{
      $result = SWIG_NewPointerObj(SWIG_as_voidptr($1), $1_descriptor, 0);
      PyObject_SetAttrString($result, "_parent", obj0);
      assert(obj0); // hello world
    %}

    %apply SWIGTYPE * SUBOBJECT { message_header * header };
    %apply SWIGTYPE * SUBOBJECT { message_large_body * body };

    %include "test.h"