How to handle extremely long file lists in a make recipe?

Question

Because GNU make allows variables to be as large as memory allows, it has no problem building massive dependency lists. However, if you want to actually use these file lists in a recipe (a sequence of shell commands for building a target), you run into a problem: the command may exceed the system's command line length limit, producing an error such as “Argument list too long”.

For example, suppose I want to concatenate the files in the list $(INPUTS) to produce a file combined.txt. Ordinarily I could use:

    combined.txt: $(INPUTS)
            cat $^ > $@

But if $(INPUTS) contains many thousands of files, as it does in my case, the call to cat is too long and fails. Is there a way to get around this problem in general? It is safe to assume that there exists some sequence of commands with the same behaviour as the one enormous command. In this case, a series of cat commands, one per input file, each using >> to append to combined.txt, would work. But how can make be persuaded to generate those commands?

+7

makefile

j_random_hacker Aug 12 '11 at 12:04

1 answer

j_random_hacker · Accepted Answer · 2011-08-12T12:56:53+0000

While looking for an answer, about the best suggestion I could find was to break the list up into a series of smaller lists and process them using shell for loops. But you can't always do that, and even when you can it is a messy hack: for example, it is not obvious how to get the usual make behaviour of stopping as soon as a command fails. Luckily, after much searching and experimentation, it turns out that a general solution does exist.

Subshells and newlines

make recipes invoke a separate subshell for each line in the recipe. This behaviour can be annoying and counterintuitive: for example, a cd command on one line will not affect subsequent commands, because they run in separate subshells. Nevertheless, it is exactly what we need to get make to perform actions on very long lists of files.

Usually, if you create a “multi-line” list of files with a regular variable assignment that uses backslashes to break the statement over multiple lines, make removes all the newlines:

    # The following two statements are equivalent
    FILES := a b c
    FILES := \
    a \
    b \
    c

However, using the define directive, it is possible to build variable values that contain newlines. What's more, if you substitute such a variable into a recipe, each line will indeed be run in a separate subshell. For example, running make test from /home/jbloggs with the makefile below (and assuming that no file called test exists) will produce the output /home/jbloggs, because the effect of the cd .. command is lost when its subshell ends:

    define CMDS
    cd ..
    pwd
    endef

    test:
            $(CMDS)
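For contrast, here is a minimal sketch of the usual way to keep the cd in effect: chain the commands on a single recipe line so that they run in the same shell. This is a standard make idiom rather than part of the original answer, and the recipe line below starts with a real tab.

```make
# Sketch only: both commands share one recipe line, hence one subshell.
test:
	cd .. && pwd
```

Run from /home/jbloggs, this variant should print /home, since pwd now executes in the shell whose directory was changed.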

A variable containing newlines that was created with define can be concatenated with other text as usual, and processed with all the usual make functions. This, combined with the $(foreach) function, allows us to get what we want:

    # Just a single newline!  Note 2 blank lines are needed.
    define NL


    endef

    combined.txt: $(INPUTS)
            rm -f $@
            $(foreach f,$(INPUTS),cat $(f) >> $@$(NL))

We ask $(foreach) to convert each file name into a newline-terminated command, which will be executed in its own subshell. For more complicated needs, you could instead write the list of file names out to a file with a series of echo commands, and then use xargs.
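The xargs alternative could be sketched roughly as follows. This is an illustration under stated assumptions, not the author's tested code: the INPUTS definition and the temporary list file $@.list are hypothetical, file names are assumed to contain no whitespace or shell metacharacters, and recipe lines start with real tabs.

```make
# Hypothetical input list; substitute your own definition of INPUTS.
INPUTS := $(wildcard parts/*.txt)

# A single newline (two blank lines are needed inside the define).
define NL


endef

combined.txt: $(INPUTS)
	# Start with an empty list file ($@.list is an assumed temporary name).
	: > $@.list
	# One short echo command per file, each run in its own subshell.
	$(foreach f,$(INPUTS),echo $(f) >> $@.list$(NL))
	# xargs splits the list into as many cat invocations as needed.
	xargs cat < $@.list > $@
	rm -f $@.list
```

Because every cat started by xargs writes to the same redirected output, the pieces end up in combined.txt in list order. GNU make 4.0 and later also provide a $(file ...) function that can write the list to a file without spawning any echo commands.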

Notes

The define directive is documented as optionally taking a =, := or += token at the end of the first line to determine which variable flavour is created, but note that this only works in GNU make 3.82 and later! You may well be running the popular version 3.81, as I was, which silently assigns nothing to the variable if you add one of these tokens, leading to much frustration. See here for more details.
All recipe lines should start with a literal tab character, not the 8 spaces that I used here.
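To guard against the version pitfall in the first note, a makefile could test whether the flavour token was understood. The sketch below relies on the behaviour described above (an older make leaves the variable empty); FLAVOR_CHECK is a hypothetical name chosen for illustration, and the recipe line starts with a real tab.

```make
# define with an explicit flavour token: GNU make 3.82 and later only.
define FLAVOR_CHECK :=
ok
endef

# If the token was not understood, the variable is empty and we stop early.
ifneq ($(FLAVOR_CHECK),ok)
$(error GNU make $(MAKE_VERSION) ignored the := after define; use the plain form instead)
endif

all:
	@echo "define with a flavour token works in GNU make $(MAKE_VERSION)"
```

Omitting the token altogether, as in the NL definition earlier, remains the form that behaves the same in 3.81 and later versions.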
