Find Unused Structures and Elements

Some time ago we took over responsibility for a legacy code base.

One of the quirks of this very poorly structured and poorly written code was that it contained a number of really huge structures, each with hundreds of members. One of the many steps we have taken is to remove as much unused code as possible, which means we need to search for unused structs and struct members.

For the structures, I put together a combination of Python, GNU Global, and ctags to list unused structure members.

Basically, I use ctags to generate a tags file. The Python script below parses that file to find every structure and its members, and then uses GNU Global to query the previously generated Global database to see whether each member is used in the code.

This approach has a number of rather serious drawbacks, but it more or less solved the problem we were facing and gave us a good starting point for further cleanup.

There must be a better way to do this!

So the question is: how do you find unused structures and structure members in a code base?

    #!/usr/bin/env python

    import os
    import string
    import sys
    import operator

    def printheader(word):
        """generate a nice header string"""
        print "\n%s\n%s" % (word, "-" * len(word))

    class StructFreqAnalysis:
        """ add description"""
        def __init__(self):
            self.path2hfile=''
            self.name=''
            self.id=''
            self.members=[]

        def show(self):
            print 'path2hfile:',self.path2hfile
            print 'name:',self.name
            print 'members:',self.members
            print

        def sort(self):
            return sorted(self.members, key=operator.itemgetter(1))

        def prettyprint(self):
            '''display a sorted list'''
            print 'struct:',self.name
            print 'path:',self.path2hfile
            for i in self.sort():
                print ' ',i[0],':',i[1]
            print

    f=open('tags','r')
    x={} # struct_name -> class
    y={} # internal tags id -> class
    for i in f:
        i=i.strip()
        if 'typeref:struct:' in i:
            line=i.split()
            x[line[0]]=StructFreqAnalysis()
            x[line[0]].name=line[0]
            x[line[0]].path2hfile=line[1]
            for j in line:
                if 'typeref' in j:
                    s=j.split(':')
                    x[line[0]].id=s[-1]
                    y[s[-1]]=x[line[0]]

    f.seek(0)
    for i in f:
        i=i.strip()
        if 'struct:' in i:
            items=i.split()
            name=items[0]
            id=items[-1].split(':')[-1]
            if id:
                if id in y:
                    key=y[id]
                    key.members.append([name,0])
    f.close()

    # do frequency count
    for k,v in x.iteritems():
        for i in v.members:
            cmd='global -a -s %s'%i[0] # -a absolute path. use global to give src-file for member
            g=os.popen(cmd)
            for gout in g:
                if '.c' in gout:
                    gout=gout.strip()
                    f=open(gout,'r')
                    for line in f:
                        if '->'+i[0] in line or '.'+i[0] in line:
                            i[1]=i[1]+1
                    f.close()

    printheader('All structures')
    for k,v in x.iteritems():
        v.prettyprint()

    #show which structs that can be removed
    printheader('These structs could perhaps be removed')
    for k,v in x.iteritems():
        if len(v.members)==0:
            v.show()

    printheader('Total number of probably unused members')
    cnt=0
    for k,v in x.iteritems():
        for i in v.members:
            if i[1]==0:
                cnt=cnt+1
    print cnt
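One limitation of the substring test in the script above is that `'.'+name in line` also fires on longer identifiers, so a member called `count` is counted as used whenever `.counter` appears. A minimal Python 3 sketch of a stricter per-line check follows. The function name `count_member_uses` is hypothetical, not part of the original script, and the check is still purely textual, so it knows nothing about which struct the member belongs to.

```python
import re

def count_member_uses(member, lines):
    """Count lines that access `member` through '.' or '->' as a whole identifier."""
    # \b after the name rejects longer identifiers such as 'counter' for 'count'.
    pattern = re.compile(r"(?:->|\.)\s*" + re.escape(member) + r"\b")
    return sum(1 for line in lines if pattern.search(line))

if __name__ == "__main__":
    sample = [
        "p->count = 1;",    # real use
        "s.counter++;",     # different member, must not match
        "x = s . count;",   # real use with extra whitespace
        "count = 0;",       # plain variable, not a member access
    ]
    print(count_member_uses("count", sample))
```

Like the original loop, this counts matching lines rather than individual occurrences, so it could replace the `if '->'+i[0] in line or '.'+i[0] in line` test without changing how the totals are reported.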

Edit

As @Jens-Gustedt suggested, using the compiler is a good way to do this. I am looking for an approach that can do some sort of high-level filtering before falling back on the compiler approach.

+4
4 answers

If there are only a few structs, and if the code does not do nasty hacks such as accessing a struct through another type, then you can simply comment out all the fields of the first struct and let the compiler tell you.

Uncomment one used field after another until the compiler is satisfied. Once it compiles, test thoroughly to confirm the precondition that there are no such hacks.

Iterate over all structs.

Definitely not pretty, but in the end you will have at least one person who knows the code a bit.
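The uncommenting step can be partly mechanized by reading the member names out of the compiler's error output instead of fixing one error at a time. The following Python 3 sketch assumes diagnostics of the form `'struct foo' has no member named 'bar'` (GCC) or `no member named 'bar' in 'struct foo'` (Clang); the exact wording can differ between compiler versions, and `members_still_needed` is a hypothetical helper name.

```python
import re

# Both assumed message shapes contain "no member named <quoted name>".
# The quotes may be ASCII or typographic (U+2018 / U+2019), depending on locale.
NO_MEMBER_RE = re.compile(r"no member named ['\u2018](\w+)['\u2019]")

def members_still_needed(compiler_output):
    """Return the sorted member names the compiler reported as missing."""
    needed = set()
    for line in compiler_output.splitlines():
        match = NO_MEMBER_RE.search(line)
        if match:
            needed.add(match.group(1))
    return sorted(needed)

if __name__ == "__main__":
    sample = (
        "main.c:10:6: error: 'struct foo' has no member named 'len'\n"
        "main.c:12:9: error: no member named 'buf' in 'struct foo'\n"
        "main.c:20:1: warning: unused variable 'i'\n"
    )
    print(members_still_needed(sample))
```

Compilers may stop after a limited number of errors, so a single build does not necessarily list every used member; repeating the build after uncommenting the reported fields is still needed.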

+1

Use Coverity. It is a great tool for detecting code flaws, but it is a bit expensive.

+1

Although this is a very old post, I recently did the same thing with Python and gdb. I compiled the following code snippet with the structure at the top of the hierarchy, then used gdb to print the type of the structure and recursed into its members.

    #include <usedheader.h>
    UsedStructureInTop *to_print = 0;
    int main(){return 0;}

    (gdb) p to_print
    (gdb) $1 = (UsedStructureInTop *) 0x0
    (gdb) pt UsedStructureInTop
    type = struct StructureTag {
        members displayed here line by line
    }
    (gdb)
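To recurse over the members, the text that gdb prints for the type has to be split into member types and names. A rough Python 3 sketch of that parsing step is below. It assumes the output has one simple declaration per line, as in the session above; `parse_ptype` is a hypothetical name, and function pointers, bit-fields, and inline nested definitions are not handled.

```python
import re

# "<type> <name>[optional array suffix];" with the name as the last identifier.
MEMBER_RE = re.compile(r"(.+?)(\w+)(?:\[\d*\])*;$")

def parse_ptype(text):
    """Return (type, name) pairs for the simple member lines in gdb type output."""
    members = []
    for line in text.splitlines():
        line = line.strip()
        if not line.endswith(";"):
            continue  # skips the "type = struct ... {" header and the closing "}"
        match = MEMBER_RE.match(line)
        if match:
            # Array dimensions are dropped; only the element type is kept.
            members.append((match.group(1).strip(), match.group(2)))
    return members

if __name__ == "__main__":
    sample = """type = struct StructureTag {
    int id;
    struct node *next;
    char name[16];
}"""
    for member_type, member_name in parse_ptype(sample):
        print(member_type, member_name)
```

Member types that start with `struct` (or that are typedef names) are the ones to feed back into gdb for the next level of the recursion.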

My goal was a little different, though: to generate a header that contains only the UsedStructureInTop structure and the types it depends on. There are compiler options for this, but they do not remove unused or unrelated structures found in the included header files.

+1

According to the rules of C, you can access the members of a structure through another structure that has a similar layout. This means that you can access struct Foo { int a; float b; char c; }; via struct Bar { int x; float y; }; (except, of course, for Foo::c).

Therefore, your algorithm is potentially wrong. What you are looking for is extremely difficult to find, simply because C is difficult to optimize.
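The effect is easy to reproduce from Python with `ctypes`, which lays out structures the way the C compiler does. In this sketch, which uses Python 3 and the Foo and Bar layouts from above, the members `a` and `b` are read and written without their names ever being mentioned, which is exactly the kind of use a name-based search cannot see. The pointer cast here stands in for whatever cast the legacy C code might perform.

```python
import ctypes

class Foo(ctypes.Structure):
    _fields_ = [("a", ctypes.c_int), ("b", ctypes.c_float), ("c", ctypes.c_char)]

class Bar(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_float)]

foo = Foo(7, 2.5, b"z")

# View the same memory through Bar, as a C cast (struct Bar *)&foo would.
bar = ctypes.cast(ctypes.pointer(foo), ctypes.POINTER(Bar)).contents

print(bar.x, bar.y)   # reads foo.a and foo.b without naming them

bar.x = 99            # writes foo.a through the other type
print(foo.a)
```

A search for `.a` or `->a` finds nothing in code like this, so `a` would be reported as unused even though it is both read and written.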

0
