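    awk '
    {
        # store every field, keyed by (line number, field number)
        for (i=1; i<=NF; i++) {
            a[NR,i] = $i
        }
    }
    # remember the widest line seen so far
    NF>p { p = NF }
    END {
        # each input column j becomes one output line
        for (j=1; j<=p; j++) {
            str = a[1,j]
            for (i=2; i<=NR; i++) {
                str = str" "a[i,j]
            }
            print str
        }
    }' file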
Output
    $ more file
    0 1 2
    3 4 5
    6 7 8
    9 10 11

    $ ./shell.sh
    0 3 6 9
    1 4 7 10
    2 5 8 11
Performance against Jonathan's Perl solution on a 10,000-line file
    $ head -5 file
    1 0 1 2
    2 3 4 5
    3 6 7 8
    4 9 10 11
    1 0 1 2

    $ wc -l < file
    10000

    $ time perl test.pl file >/dev/null

    real    0m0.480s
    user    0m0.442s
    sys     0m0.026s

    $ time awk -f test.awk file >/dev/null

    real    0m0.382s
    user    0m0.367s
    sys     0m0.011s

    $ time perl test.pl file >/dev/null

    real    0m0.481s
    user    0m0.431s
    sys     0m0.022s

    $ time awk -f test.awk file >/dev/null

    real    0m0.390s
    user    0m0.370s
    sys     0m0.010s
EDIT by Ed Morton (@ghostdog74 feel free to delete if you disapprove).
Maybe this version with some more explicit variable names will help answer some of the questions below and generally clarify what the script is doing. It also uses tabs as the separator, which the OP had originally asked for so that empty fields are handled, and it coincidentally tidies up the output a bit for this particular case.
    $ cat tst.awk
    BEGIN { FS=OFS="\t" }
    {
        for (rowNr=1;rowNr<=NF;rowNr++) {
            cell[rowNr,NR] = $rowNr
        }
        maxRows = (NF > maxRows ? NF : maxRows)
        maxCols = NR
    }
    END {
        for (rowNr=1;rowNr<=maxRows;rowNr++) {
            for (colNr=1;colNr<=maxCols;colNr++) {
                printf "%s%s", cell[rowNr,colNr], (colNr < maxCols ? OFS : ORS)
            }
        }
    }

    $ awk -f tst.awk file
    X       row1    row2    row3    row4
    column1 0       3       6       9
    column2 1       4       7       10
    column3 2       5       8       11
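As a small illustration of why the tab separator matters for empty fields: with awk's default field splitting, a run of blanks counts as a single separator, so an empty field disappears, whereas with FS set to a tab every tab delimits a field. The sample record and the function names below are made up for this sketch.

```shell
# Hypothetical sample record: three tab-separated fields, the middle one empty.
sample() { printf 'a\t\tc\n'; }

# Default FS: runs of spaces/tabs act as one separator, so the empty field is lost.
count_default() { sample | awk '{ print NF }'; }

# FS="\t": each tab is a separator, so the empty field is kept.
count_tab() { sample | awk 'BEGIN { FS = "\t" } { print NF }'; }
```

Calling `count_default` reports 2 fields for the sample record and `count_tab` reports 3, which is why the tab-separated version keeps the columns aligned when a cell is empty.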
The above solutions will work in any awk (except old, broken awk, of course; there, YMMV).
The above solutions do read the whole file into memory, though. If the input files are too large for that, you can do this:
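    $ cat tst.awk
    BEGIN { FS=OFS="\t" }
    # on pass number ARGIND, print field number ARGIND of every line
    { printf "%s%s", (FNR>1 ? OFS : ""), $ARGIND }
    ENDFILE {
        print ""
        # queue the same file for another pass while fields remain
        if (ARGIND < NF) {
            ARGV[ARGC] = FILENAME
            ARGC++
        }
    }

    $ awk -f tst.awk file
    X       row1    row2    row3    row4
    column1 0       3       6       9
    column2 1       4       7       10
    column3 2       5       8       11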
which uses almost no memory but reads the input file once per field on a line, so it will be much slower than the version that reads the whole file into memory. It also assumes that every line has the same number of fields, and it uses GNU awk for ENDFILE and ARGIND, but any awk can do the same with tests on FNR==1 and END.
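A minimal sketch of that portable variant, not taken from the original answer: a counter incremented at FNR==1 stands in for ARGIND, and the newline that ENDFILE printed is emitted at the start of the next pass and once more in END. The names `transpose_multipass` and `pass` are illustrative, and the sketch assumes the awk in use picks up entries appended to ARGV while input is being read, as the GNU awk version above relies on.

```shell
# Transpose a tab-separated file one output row per pass, without ARGIND/ENDFILE.
# Assumes every line has the same number of fields.
transpose_multipass() {
    awk '
    BEGIN { FS = OFS = "\t" }
    FNR == 1 {
        if (pass > 0) print ""      # close the output line of the previous pass
        pass++                      # pass number is also the field to print
        if (pass < NF) {            # more fields left: queue the file again
            ARGV[ARGC] = FILENAME
            ARGC++
        }
    }
    { printf "%s%s", (FNR > 1 ? OFS : ""), $pass }
    END { if (pass > 0) print "" }  # close the last output line
    ' "$1"
}
```

It would be called as `transpose_multipass file`. Like the ENDFILE version, it trades speed for memory: the file is reread once per field.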