Automatically create object file (linker) dependencies for C/C++ executables

Question


I am currently working on a flexible C/C++ build framework, which I will (hopefully) open-source soon (see this question for some background).

I use the following command to generate #include file dependencies for source/header files.

gcc -M -MM -MF

Is there a way to cleverly derive linker (.o file) dependencies for executables (unit tests plus the main executable for the target platform, in my case) using gcc / GNU utilities in a similar way? Currently, the framework makes a lot of assumptions and is pretty dumb about determining these dependencies.

I have heard of one approach in which the nm command is used to list the undefined symbols in an object file. For example, running nm on an object file (compiled using gcc -c) produces something like this:

    nm -o module.o
    module.o:         U _undefinedSymbol1
    module.o:         U _undefinedSymbol2
    module.o:0000386f T _definedSymbol

You can then search the other object files for the ones in which these undefined symbols are defined, which gives the list of object files required for a successful link.

Is this considered best practice for determining linker dependencies for executables? Are there other ways to determine these dependencies? When proposing a solution, assume that all object files already exist (i.e. they have already been compiled using gcc -c).

+7

c++ c gcc linker dependencies

thegreendroid May 20 '12 at 2:48


3 answers

The following Python script can be used to collect and process nm output for all object files in the current directory:

    #!/usr/bin/env python3
    import collections
    import os
    import re
    import subprocess

    # One line of nm output: optional address, one-letter type code, symbol name.
    addr_re = r"(?P<address>[0-9a-f]{1,16})?"
    code_re = r"(?P<code>[a-z])"
    symbol_re = r"(?P<symbol>[a-z0-9_.$]+)"
    nm_line_re = re.compile(r"\s+".join([addr_re, code_re, symbol_re]) + r"\s*$", re.I)

    requires = collections.defaultdict(set)  # object file -> undefined symbols
    provides = collections.defaultdict(set)  # symbol -> object files defining it

    def get_symbols(fname):
        # text=True makes check_output return str instead of bytes.
        output = subprocess.check_output(["nm", "-g", fname], text=True)
        for line in output.splitlines():
            m = nm_line_re.match(line)
            if m is None:
                continue  # skip lines that are not symbol entries
            symbol = m.group('symbol')
            if m.group('code') == 'U':
                requires[fname].add(symbol)
            else:
                provides[symbol].add(fname)

    for dirpath, dirnames, filenames in os.walk("."):
        for f in filenames:
            if f.endswith(".o"):
                # Join with dirpath so files in subdirectories are found too.
                get_symbols(os.path.join(dirpath, f))

    def pick(symbols):
        # `symbols` is the set of files providing one symbol.
        # If several files provide a symbol, choose the one with the shortest name.
        best = None
        for s in symbols:
            if best is None or len(s) < len(best):
                best = s
        if len(symbols) > 1:
            best = "*" + best
        return best

    for fname, symbols in requires.items():
        dependencies = set(pick(provides[s]) for s in symbols if s in provides)
        print(fname + ': ' + ' '.join(sorted(dependencies)))

The script searches the current directory and all subdirectories for .o files, calls nm on each file it finds, and parses the output. Symbols that are undefined in one .o file and defined in another are interpreted as a dependency between the two files. Symbols that are defined nowhere (usually because they are provided by external libraries) are ignored. Finally, the script prints a list of direct dependencies for every object file.
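Since the script only prints direct dependencies, linking an executable needs the transitive closure starting from the object file that contains main(). Here is a minimal sketch of that step. The function name link_closure and the example file names are hypothetical, and it assumes the direct dependencies have already been collected into a dictionary (with any leading * markers stripped):

```python
from collections import deque

def link_closure(root, direct_deps):
    """Return every object file reachable from `root` via direct dependencies."""
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for dep in direct_deps.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return sorted(seen)

# Hypothetical direct dependencies, in the shape the script reports them.
example_deps = {
    "main.o": {"parser.o", "util.o"},
    "parser.o": {"lexer.o", "util.o"},
    "lexer.o": set(),
    "util.o": set(),
    "unused.o": {"util.o"},
}

print(link_closure("main.o", example_deps))
# prints ['lexer.o', 'main.o', 'parser.o', 'util.o']
```

Note that unused.o is left out because nothing reachable from main.o refers to it.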

If a symbol is provided by several object files, the script arbitrarily takes a dependency on the object file with the shortest file name (and marks the selected file with * in the output). This behavior can be changed by modifying the pick function.
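As one possible variation (only a sketch, and pick_strict is a hypothetical name), a replacement for pick could refuse to guess and report the ambiguity instead, which may be preferable in a build system where a wrong guess would silently link the wrong file:

```python
def pick_strict(providers):
    """Return the single provider of a symbol, or raise if there is no clear choice."""
    providers = sorted(providers)
    if not providers:
        raise ValueError("no provider for symbol")
    if len(providers) > 1:
        raise ValueError("ambiguous providers: " + ", ".join(providers))
    return providers[0]

print(pick_strict({"util.o"}))
# prints util.o
```

Calling pick_strict({"a.o", "b.o"}) raises ValueError rather than choosing one of the two.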

The script works for me on Linux and macOS. I have not tried any other operating systems, and the script has only been lightly tested.

+5

jochen May 29 '13 at 17:01


The nm utility reads object files (and archives such as .a libraries) using libbfd. I think what you really want to do is maintain a database of the global symbols defined in the libraries you know about and in the object files that are part of this project, so that whenever you generate a new object file you can look at the undefined symbols in it and determine which object, either a plain object file or one inside a library, you need to link against to resolve the references. Essentially, you are doing the same job as the linker, but sort of in reverse, so that you can identify which symbols you can find.
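To make the idea of a symbol database concrete, here is a rough in-memory sketch. Everything in it is an assumption for illustration: the class name SymbolIndex, its methods, and the example symbols are hypothetical, and in practice the index would be filled from nm output or from libbfd rather than by hand.

```python
class SymbolIndex:
    """Maps each defined symbol to the objects or libraries that provide it."""

    def __init__(self):
        self._providers = {}  # symbol -> list of providers, in registration order

    def add(self, provider, defined_symbols):
        for symbol in defined_symbols:
            self._providers.setdefault(symbol, []).append(provider)

    def resolve(self, undefined_symbols):
        """Return (providers needed, symbols nobody provides)."""
        needed, unresolved = set(), set()
        for symbol in undefined_symbols:
            providers = self._providers.get(symbol)
            if providers:
                needed.add(providers[0])  # assumption: first registered provider wins
            else:
                unresolved.add(symbol)
        return needed, unresolved

index = SymbolIndex()
index.add("util.o", {"_log_message"})
index.add("libnet.a", {"_net_send", "_net_recv"})

needed, unresolved = index.resolve({"_log_message", "_net_send", "_missing"})
print(sorted(needed), sorted(unresolved))
# prints ['libnet.a', 'util.o'] ['_missing']
```

Anything left in the unresolved set would have to come from a library that is not in the index, such as the C runtime.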

If you work with GCC, you can always look at the source package for your "binutils" to find the sources for nm, and even ld if you want to. You certainly don't want to run nm and parse its output when it just uses libbfd under the hood; just call libbfd yourself.

+4

Wexxor Sep 25 '12 at 10:43


Jonathan Leffler · Accepted Answer · 2012-09-23T07:30:27+0000

If there are several executables (or even a single executable) that need different sets of dependencies, then the usual, classic way of handling this is to use a library, either a static .a or a shared .so (or equivalent), to hold the object files that can be used by more than one program, and to link the programs against that library. The linker automatically pulls the correct object files out of the static archive. The process for a shared library is slightly different, but the end result is the same: the executable has the correct object files available to it at runtime.

For any program, there is at least one file unique to that program (usually the file containing the main() function). There may be a few such files for a program. These files are probably known and can be listed easily. The ones you may or may not need, depending on configuration and compilation options, are probably shared between programs and are easily handled through the library mechanism.
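As a small sketch of how a build script might drive this, the helpers below only construct the ar and gcc command lines and do not run them. The function names and file names are hypothetical; ar rcs and the -L / -l linker options are the standard ones, and for C++ the driver would be g++ rather than gcc.

```python
def archive_command(library, shared_objects):
    """Command that puts the shared object files into a static archive."""
    return ["ar", "rcs", library] + sorted(shared_objects)

def link_command(executable, unique_objects, library_dir, library_name):
    """Command that links a program's own objects against the archive.

    The library comes after the object files so that the linker already
    knows which undefined symbols it has to pull from the archive.
    """
    return (["gcc", "-o", executable] + sorted(unique_objects)
            + ["-L" + library_dir, "-l" + library_name])

print(" ".join(archive_command("libcommon.a", ["util.o", "parser.o"])))
# prints ar rcs libcommon.a parser.o util.o
print(" ".join(link_command("prog", ["main.o"], ".", "common")))
# prints gcc -o prog main.o -L. -lcommon
```

Each unit test executable would get its own link_command call with its own unique object files and the same library.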

You must decide whether you want to use static or shared libraries. Creating shared libraries well is more complicated than creating static libraries. On the other hand, you can update a shared library and immediately affect all the programs that use it, whereas a static library can be changed, but only the programs that are relinked against the new library benefit from the changes.
