It is odd that no one else has mentioned that modern versions of GNU tar
allow you to compress as you bundle:
tar -czf output.tar.gz directory1 ...
tar -cjf output.tar.bz2 directory2 ...
You can also use the compressor of your choice, provided it supports the '-c' (write to stdout, or read from stdin) and '-d' (decompress) options:
tar -cf output.tar.xxx --use-compress-program=xxx directory1 ...
This will allow you to specify any alternative compressor.
[Added: if you are extracting from files compressed with gzip or bzip2, GNU tar detects this automatically and runs the corresponding program. That is, you can use:
tar -xf output.tar.gz
tar -xf output.tgz # A synonym for the .tar.gz extension
tar -xf output.tar.bz2
and they will be handled properly. If you use a non-standard compressor, you need to specify it again when you extract.]
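As a rough sketch of that last point, here is a round trip where the compressor is named both when creating and when extracting. It assumes GNU tar; gzip is only a stand-in for "xxx" (any program that honours -c and -d should do), and the directory and file names are made up for the illustration.

```shell
#!/bin/sh
# Sketch: create and extract with an explicitly named compressor (GNU tar assumed).
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical sample content
mkdir directory1
echo "hello" > directory1/file.txt

# Create: tar pipes the archive through the named program
# (gzip here stands in for any compressor supporting -c and -d)
tar -cf output.tar.xxx --use-compress-program=gzip directory1

# Extract into a separate directory, naming the same program again
mkdir restored
tar -xf output.tar.xxx --use-compress-program=gzip -C restored

# Confirm the restored file matches the original
cmp directory1/file.txt restored/directory1/file.txt && echo "round trip OK"
```

The archive name deliberately keeps the odd ".xxx" suffix, so the extraction does not depend on tar recognising a familiar extension.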
The reason for the separation is, as in the selected answer, separation of duties. Among other things, it means that people could use the cpio program to package files (instead of tar) and then use the compressor of their choice. Once upon a time the preferred compressor was pack; later it was compress (which was much more effective than pack); then gzip, which ran rings around both its predecessors and is entirely competitive with zip (which has been ported to Unix but is not native there); and now bzip2, which in my experience usually has a 10-20% advantage over gzip.
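If you want to check the size difference on your own data, something along these lines should work. It is only a sketch: the sample data is invented and highly repetitive, so the numbers it prints say nothing about typical ratios, and any compressor that is not installed is simply skipped.

```shell
#!/bin/sh
# Sketch: package once with tar, then compare compressor output sizes.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical sample data (repetitive text, so it compresses well)
mkdir "$workdir/sample"
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > "$workdir/sample/data.txt"

# Packaging is tar's job ...
tar -cf "$workdir/sample.tar" -C "$workdir" sample
raw_size=$(wc -c < "$workdir/sample.tar" | tr -d ' ')
echo "uncompressed: $raw_size bytes"

# ... and compression is a separate step, so the compressor is interchangeable
size=0
for prog in gzip bzip2; do
    if command -v "$prog" >/dev/null 2>&1; then
        size=$("$prog" -c < "$workdir/sample.tar" | wc -c | tr -d ' ')
        echo "$prog: $size bytes"
    else
        echo "$prog: not installed, skipped"
    fi
done
```

Point the tar command at a real directory instead of the generated sample to get figures that mean something for your files.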
[Added: someone noted in their answer that cpio has funny conventions. That is true, but until GNU tar gained the relevant option ('-T -'), cpio was the better command when you did not want to archive everything under a given directory: you could choose exactly which files were archived. The downside of cpio was that you not only could choose the files, you had to choose them. There is still one place where cpio scores: it can do an in-situ copy from one directory hierarchy to another without any intermediate storage:
cd /old/location; find . -depth -print | cpio -pvdumB /new/place
Incidentally, the '-depth' option to find is important in this context: it ensures that the contents of directories are copied before the permissions are set on the directories themselves. When I checked the command before adding this to the answer, I copied some read-only directories (permission 555); when I went to delete the copy, I had to relax the permissions on the directories before rm -fr /new/place could finish. Without the -depth option, the cpio command would have failed. I only remembered this when I went to clean up; the formula quoted is that automatic to me (mainly by virtue of many repetitions over the years).]
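For comparison, the '-T -' option mentioned above lets tar take its file list from standard input much as cpio does. A minimal sketch, assuming a tar that supports '-T -' (GNU tar does); the project layout and file names are invented for the example:

```shell
#!/bin/sh
# Sketch: let find choose the files and have tar read the list from stdin.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical tree with files we do and do not want archived
mkdir -p project/src project/build
echo "int main(void) { return 0; }" > project/src/main.c
echo "notes" > project/src/notes.txt
echo "binary" > project/build/main.o

# Only the .c files go into the archive; -T - means "read names from stdin"
find project -type f -name '*.c' -print | tar -cf selected.tar -T -

# List what was actually archived
tar -tf selected.tar
```

With file names that may contain newlines, GNU tar's --null option combined with find's -print0 is the safer pairing.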