If you mean an equal number of lines, split has an option for this:
split --lines=75
If you need to work out what that 75 should be for N equal parts, it's:
lines_per_part = int((total_lines + N - 1) / N)
where total_lines can be obtained using wc -l.
See the following script for an example:
#!/usr/bin/bash

# Configuration stuff
fspec=qq.c
num_files=6

# Work out lines per file.
total_lines=$(wc -l <"${fspec}")
((lines_per_file = (total_lines + num_files - 1) / num_files))

# Split the actual file, maintaining lines.
split --lines="${lines_per_file}" "${fspec}" xyzzy.

# Debug information
echo "Total lines = ${total_lines}"
echo "Lines per file = ${lines_per_file}"
wc -l xyzzy.*
This outputs:
Total lines = 70
Lines per file = 12
  12 xyzzy.aa
  12 xyzzy.ab
  12 xyzzy.ac
  12 xyzzy.ad
  12 xyzzy.ae
  10 xyzzy.af
  70 total
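To see why the formula rounds up rather than down, here is a minimal sketch using the same numbers as the run above (70 lines, 6 parts). The variable names floor_lines, ceil_lines, files_with_floor and files_with_ceil are made up for this illustration and are not part of the script.

```shell
#!/usr/bin/env bash
# Illustrative values taken from the example run: 70 lines, 6 parts.
total_lines=70
num_files=6

# Plain integer division rounds down ...
floor_lines=$(( total_lines / num_files ))
# ... while adding (num_files - 1) first makes the division round up.
ceil_lines=$(( (total_lines + num_files - 1) / num_files ))

# split --lines=K produces ceil(total_lines / K) files.
files_with_floor=$(( (total_lines + floor_lines - 1) / floor_lines ))
files_with_ceil=$(( (total_lines + ceil_lines - 1) / ceil_lines ))

echo "floor: ${floor_lines} lines per file -> ${files_with_floor} files"
echo "ceil:  ${ceil_lines} lines per file -> ${files_with_ceil} files"
```

Rounding down leaves a few lines over, which end up in a seventh file. Rounding up never yields more than num_files files, although for small inputs it can yield fewer.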
More recent versions of split let you specify a number of CHUNKS with the -n/--number option, so you can use something like:
split --number=l/6 ${fspec} xyzzy.
(that's ell-slash-six, meaning lines, not one-slash-six).
This will give you files of roughly equal size, without splitting in the middle of a line.
I mention that last point because it doesn't give you roughly the same number of lines in each file, but rather roughly the same number of characters.
So, if you have one 20-character line and 19 1-character lines (twenty lines in total) and split the file into five parts, you most likely won't get four lines in each file.
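The twenty-line case can be tried out with a short sketch like the one below. It assumes GNU coreutils split (the -n option is not part of POSIX split); the temporary directory, the sample.txt name and the part. prefix are placeholders chosen for this illustration.

```shell
#!/usr/bin/env bash
# Assumes GNU coreutils split; -n/--number is not available in POSIX split.
workdir=$(mktemp -d)
sample="${workdir}/sample.txt"   # placeholder file name

# One 20-character line followed by nineteen 1-character lines.
{
    printf '%020d\n' 0
    for i in $(seq 19); do
        printf '%s\n' "b"
    done
} > "${sample}"

# Ask for five chunks made up of whole lines.
split --number=l/5 "${sample}" "${workdir}/part."

# Per-file line counts; the chunks are balanced by size, not by line count.
wc -l "${workdir}"/part.*

num_parts=$(ls "${workdir}"/part.* | wc -l)
part_lines=$(cat "${workdir}"/part.* | wc -l)
echo "parts = ${num_parts}, lines across parts = ${part_lines}"
```

All twenty lines are still there across the five parts, but the per-file counts printed by wc -l show how the long line skews the distribution.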
paxdiablo, Oct 14 '11 at 8:10