Are there any tools for tracking bloat in C++?

A carelessly written template here, some excessive inlining there - it's all too easy to write bloated C++ code. In principle, refactoring to reduce that bloat isn't too hard. The problem is tracking down the worst offenders - identifying the elements that cause real bloat in real programs.

With that in mind, and because I'm sure my libraries are rather more bloated than they should be, I was wondering if there are any tools that can automatically track down these worst offenders - that is, identify the elements that contribute the most (including all of their duplicated instantiations and calls) to the size of a particular target.

At the moment, I'm not really interested in performance - it's all about the size of the executable file.

Are there any tools for this job that work on Windows, building with either MinGW GCC or Visual Studio?

EDIT - some context

I have a set of multiway-tree templates that act as replacements for the standard red-black-tree containers. They are written as thin wrappers around non-template code, but they were also written a long time ago, as a "better cache-friendliness" experiment. The point is that they were never really written with long-term use in mind.

Since they support some convenient tricks (search based on custom comparisons/partial keys, efficient indexed access, finding the smallest unused key), they end up being used almost everywhere in my code. I hardly ever use std::map these days.

Layered on top of them, I have several more complex containers, such as two-way maps. On top of those, I have tree and digraph classes. On top of those...

Using map files, I can track whether non-inline template methods cause bloat. It is simply a matter of finding all instantiations of a particular method and adding up their sizes. But what about inlined methods? The templates were, after all, intended as thin wrappers around non-template code, but historically my judgement about whether to inline something or not has not been very reliable. The bloating effect of these templates is not so easy to measure.

I have an idea which methods are heavily used, but that's the well-known fallacy of guessing without profiling.

+7
c++ profiler
5 answers

Check out Symbol Sort . I used it some time ago to find out why our installer had quadrupled in size over six months (it turned out the answer was statically linking the C runtime and libxml2).

+7

Map File Analysis

I ran into this problem some time ago, and ended up writing my own tool that analyzed the map file (the Visual Studio linker can be instructed to create one). Tool output:

  • a list of functions sorted in descending order of code size, listing only the first N
  • a list of source files sorted in descending order of code size, listing only the first N

Parsing the map file is relatively simple (the size of a function's code can be computed as the difference between its address and the next line's). The most difficult part is probably handling the mangled names in a reasonable way. You may find some ready-to-use libraries for this; I did it a few years ago and don't know the current situation.

Here is a brief excerpt from the map file so you know what to expect:

 Address         Publics by Value                           Rva+Base  Lib:Object

 0001:0023cbb4  ?ApplyScheme@Input@@QAEXPBVParamEntry@@@Z   0063dbb4 f mainInput.obj
 0001:0023cea1  ?InitKeys@Input@@QAEXXZ                     0063dea1 f mainInput.obj
 0001:0023cf47  ?LoadKeys@Input@@QAEXABVParamEntry@@@Z      0063df47 f mainInput.obj

Symbol Sort

As posted in Ben Staub's answer , Symbol Sort is a ready-to-use command-line utility (it comes with complete C# source) that does all of this, with the only difference that it doesn't parse map files, but rather pdb/exe files.

+5

So what I'm reading based on your question and your comments is that the library is actually not too big.

The only tool you need for that determination is the command shell or Windows File Explorer. Look at the file size. Is it so big that it causes real, practical problems (unacceptable load times, won't fit in memory on the target platform, anything like that)?

If not, then you should worry about code readability and maintainability, and nothing else. And the tool for that is your eyes. Read the code, and take whatever steps are needed to make it easier to read, if necessary.

If you can point to an actual reason why executable size is a problem , edit it into your question, as that is important context.

However, assuming file size is actually a problem:

Inlined functions are usually not a problem, because the compiler, and nobody else, chooses which functions get inlined. Merely marking a function inline does not inline the actual generated code. The compiler inlines a call only if it judges the trade-off between larger code and less indirection to be worth it. If a function is called from many places, it will generally not be inlined, because that would drastically increase code size, which can itself hurt performance.

If you are concerned that inlined functions cause code bloat, just compile with the "optimize for size" option. The compiler will then limit inlining to cases where it does not noticeably affect executable size.

To find out which symbols are the largest, parse the map file as @Suma suggested.

But really, you said it yourself when you mentioned "the well-known fallacy of guessing without profiling."

The very first act of profiling you need to do is ask: is the size of the executable actually a problem ? In the comments you said you "have a feeling", which in the context of profiling is useless, and can be translated as "no, the size of the executable is not a problem."

Profile. Gather data and identify problems. Before worrying about how to reduce the size of the executable, find out what that size actually is and determine whether it really is a problem. You haven't done that yet. You read in a book that "code bloat is a problem in C++", and so you assume code bloat is a problem in your program. But is it? Why? How do you determine that it is?

+2

Basically, you are looking for expensive things you don't need. Suppose there is a certain category of functions you don't need that takes up some sizeable percentage of the space, say 20%. Then if you pick 20 random bytes out of the image, on average 4 of them (20 * 20%) will fall in that category, and you can see them. So take such samples, look at them, and if you see an obvious set of functions you really don't need, remove them. Then do it again, because other categories of routines that previously took less space now take a higher percentage.

So I agree with Suma that parsing the map file is a good start. Then I would write a routine to walk through it, and at every 5% point (in space) print the routine it landed in. That way I get 20 samples. Often I find that most of the object space comes from a very small number (like 1) of source-code lines that I could easily have done differently.

You were also worried about functions being inlined that shouldn't be. To find that out, I would take each of those samples and, since it represents a specific address in a specific function, trace it back to the source line it came from. That way I can tell whether it is in an inlined function. It's a bit of work, but doable.

An analogous problem is finding what is taking the space when a disk is full. Same idea: walk the directory tree, adding up file sizes. Then walk it again, and as you pass each 5% point, print the path of the file you are in. That tells you not only whether you have large files; it tells you whether you have large numbers of small files, and it doesn't matter how deeply they are buried or how widely they are scattered. When you clear out one category of files you don't need, you can do it again to get the next category, and so on.

Good luck.

+1

http://www.sikorskiy.net/prj/amap/index.html

This is a great graphical tool for analyzing the sizes of object files and libraries, based on the map file generated by the Visual Studio linker. The tool parses the map file and produces a report, and you can filter and sort the results dynamically. Just feed in the map file generated for your dll/exe, and the tool will list which functions take how much space; you can also sort by size. Check out the screenshots on the page above.

+1
