Like awk every nth line, starting from different lines, every iteration

I would like awk print every nth line from a file, starting from line 0. Then, after awk has gone through the whole file, I would like it to print every nth line, starting from line 1 .. . then printed every nth line, starting from line 2 ... etc., up to printing every nth line, starting from line n-1. My sad attempt so far:

 #!/bin/bash rm *.sad *.sadd *.out #Create loop index for i in $(seq 20 1 36); do listm+=($i) done #Create input file for j in "${listm[@]}" do if [ $j -eq 20 ]; then awk 'NR % 20 == 0' vel_VMDout > atomvel.dat awk '{print $2,$3,$4}' atomvel.dat > velocity.dat else awk 'NR % 20 == 1' vel_VMDout > $j.sad egrep -v "^[[:space:]]*$|^#" $j.sad > $j.sadd awk '{print $2, $3, $4}' $j.sadd > $j.out paste velocity.dat $j.out > taste fi done 

Let me try to clarify this by providing input and how the output should look. The input data is an xyz MD simulation file consisting of xyz atom coordinate frames.

ENTRANCE:

input

This image shows the first shot and part of the second shot. Since this is a snapshot, the arrangement of the atoms does not change. Thus, I am trying to print the XYZ coordinates from each image for each specific atom in their own columns, as shown below. The result is a file consisting of 3N columns, where N is the number of atoms.

OUTPUT:

output

As you can see, the coordinates of each atom are in their own columns, and the common file is an Nx3N array. My bash script tried to do this, but could only do the first two atoms. I wanted to print every nth line (the coordinates of the nth atom) so that they looked like output. I really appreciate your patience to everyone.

+5
source share
1 answer

Sample data generation

This is a step that should not be necessary; the question was to include suitable data samples and the required result from this sample data.

At one level, this will not help, because you do not have my random number generation program, but the script below shows how I generated the data that follows and illustrates the lengths it may need to go when the question does not give readable data. I created some data similar to the data in the question (at least superficially):

 18 Generated by VMD in absentia C 0.979485 -6.665347 0.575383 C 1.191999 -3.002386 2.859484 C 3.151517 -5.610077 0.429413 C 3.439828 -6.454984 1.319724 C 3.726201 -0.123038 2.096854 C 1.363325 -3.031238 0.016019 C 6.090283 -3.915340 2.396358 C 0.407755 -7.957784 -0.846842 C 0.203074 -0.796428 2.659573 O 2.600610 -2.259674 -0.260378 O 4.773839 -6.765097 0.588508 H 2.743424 -2.890016 2.906452 H 2.810233 -6.641054 -0.797672 H 6.854169 -3.191721 -0.925670 O 2.914233 -1.060001 0.776983 H 3.803923 -1.497032 2.908799 H 5.669443 -7.227666 -0.647552 H 0.092455 -5.850637 2.959987 18 Generated by VMD in absentia C 6.042840 -7.254720 2.093573 C 2.551942 -6.044322 2.061072 C 3.523150 -6.167163 2.451689 C 5.197316 -3.429866 -0.412062 C 2.548777 -6.422851 1.282846 C 3.775197 -2.012031 1.377440 C 3.405112 -3.206415 -0.879886 C 1.448359 -5.419629 0.467291 C 3.661964 -2.789234 2.644294 O 4.214854 -2.439574 -0.951704 O 5.297609 -2.320418 2.709898 H 2.653940 -4.431080 -0.511743 H 5.040635 -0.676199 -0.590970 H 1.546725 -1.294582 2.562937 O 4.231461 -7.180908 1.629901 H 3.297836 -1.557133 -0.133280 H 3.442481 -4.489962 2.111930 H 1.423611 -7.982655 0.715618 18 Generated by VMD in absentia C 1.432495 -7.686243 2.525734 C 5.038409 -4.976270 2.826846 C 6.184137 -7.303094 2.711561 C 3.208125 -0.606556 1.978725 C 2.171859 -6.792060 0.678988 C 6.521124 -5.622797 -0.773797 C 1.725619 -5.768633 -0.223397 C 3.602427 -2.325680 1.762008 C 1.937521 -1.686895 1.743159 O 0.745526 -0.114246 -0.949490 O 4.754360 -6.531145 1.998913 H 1.114732 -1.158810 1.486939 H 6.410490 -5.411647 0.062737 H 4.164330 -6.743763 1.802804 O 2.587841 -3.979700 2.609748 H 2.192073 -2.815376 -0.809569 H 5.501795 -2.326438 1.325829 H 3.285032 -1.212541 1.284453 18 Generated by VMD in absentia C 3.564424 -3.117406 -0.032879 C 2.894745 -0.632591 0.532311 C 3.384916 -5.383135 1.179585 C 0.793488 -0.894539 -0.886891 C 1.348785 -6.501867 1.648604 C 2.189941 -2.438067 0.616090 C 2.043378 -4.966472 0.691603 C 3.124161 -5.792896 0.545362 C 5.741472 -0.640590 2.825374 O 0.300550 -7.149663 0.942726 O 1.344387 -0.121382 2.169401 H 4.963296 -0.964665 -0.230523 H 6.651423 -4.905053 2.509626 H 5.059694 -6.166516 0.102255 O 5.046864 -3.288883 0.853948 H 2.389007 -3.057664 1.806301 H 2.365876 -0.956860 1.458959 H 2.892502 -0.097422 -0.531714 

script I used this:

 random -n $((4 * 18)) -T '%8:6[0:7]F %8:6[-8:0]F %8:6[-1:3]F' | awk 'BEGIN { n = split("CCCCCCCCCOOHHHOHHH", atoms, ""); atoms[0] = atoms[n] } NR % n == 1 { print n; print " Generated by VMD in absentia" } { print "", atoms[NR%18], " ", $0 }' 

The -n to random indicates how many lines should be generated; I chose 72. The -T is a pattern, and the designation %8:6[0:7]F means using the %8.6F format to print evenly distributed random numbers between 0 and 7. awk script accepts data that was generated and interpolates noise (the number of atoms and the variant on the line "generated by VMD"), as well as marking the lines with the corresponding atomic symbol.

Sample Data Processing

Given some data, you need to execute it to get the desired result. This script works more or less. Of course, there are endless paths that need to be improved: for example, file names as command line arguments, using temporary file names instead of fixed names, cleaning up intermediate files, different compounds, different atoms (nitrogen, phosphorus, etc.), and so on. Further. However, it should adapt easily.

 input="data" output="output" n=$(sed 1q "$input") n2=$(($n+2)) for ((i = 3; i <= n2; i++)) do colno=$(printf "%.2d" $(($i-2))) awk -v N=$n2 -v R=$i \ ' BEGIN { name["C"] = "Carbon"; name["H"] = "Hydrogen"; name["O"] = "Oxygen"; R0 = R % N } NR > 2 && NR <= R { count[$1]++; } NR == R { printf "%-32.32s\n", name[$1] " " count[$1]; } NR % N == R0 { xyz = sprintf("%s %s %s", $2, $3, $4); printf "%-32.32s\n", xyz } ' "$input" > "column.$colno" done paste -d ' ' column.* > "$output" 

The first four lines set the control parameters, collect the number of lines per unit of data from the input file and adjust them accordingly. The for loop iterates at offsets 3 to $n2 inclusive (skipping two header lines) and runs the awk script. This encodes the types of atoms ( BEGIN ), determines which atom it processes this time ( NR > 2 && NR <= R and NR == R ), and then organizes the printing of data triplets for the corresponding atom. The formatting is carefully organized so that the column headers and the actual xyz triplets are evenly distributed. They are written to the column.$colno . When done, column.* Files are inserted to create a single output file that looks like this:

 Carbon 1 Carbon 2 Carbon 3 Carbon 4 Carbon 5 Carbon 6 Carbon 7 Carbon 8 Carbon 9 Oxygen 1 Oxygen 2 Hydrogen 1 Hydrogen 2 Hydrogen 3 Oxygen 3 Hydrogen 4 Hydrogen 5 Hydrogen 6 0.979485 -6.665347 0.575383 1.191999 -3.002386 2.859484 3.151517 -5.610077 0.429413 3.439828 -6.454984 1.319724 3.726201 -0.123038 2.096854 1.363325 -3.031238 0.016019 6.090283 -3.915340 2.396358 0.407755 -7.957784 -0.846842 0.203074 -0.796428 2.659573 2.600610 -2.259674 -0.260378 4.773839 -6.765097 0.588508 2.743424 -2.890016 2.906452 2.810233 -6.641054 -0.797672 6.854169 -3.191721 -0.925670 2.914233 -1.060001 0.776983 3.803923 -1.497032 2.908799 5.669443 -7.227666 -0.647552 0.092455 -5.850637 2.959987 6.042840 -7.254720 2.093573 2.551942 -6.044322 2.061072 3.523150 -6.167163 2.451689 5.197316 -3.429866 -0.412062 2.548777 -6.422851 1.282846 3.775197 -2.012031 1.377440 3.405112 -3.206415 -0.879886 1.448359 -5.419629 0.467291 3.661964 -2.789234 2.644294 4.214854 -2.439574 -0.951704 5.297609 -2.320418 2.709898 2.653940 -4.431080 -0.511743 5.040635 -0.676199 -0.590970 1.546725 -1.294582 2.562937 4.231461 -7.180908 1.629901 3.297836 -1.557133 -0.133280 3.442481 -4.489962 2.111930 1.423611 -7.982655 0.715618 1.432495 -7.686243 2.525734 5.038409 -4.976270 2.826846 6.184137 -7.303094 2.711561 3.208125 -0.606556 1.978725 2.171859 -6.792060 0.678988 6.521124 -5.622797 -0.773797 1.725619 -5.768633 -0.223397 3.602427 -2.325680 1.762008 1.937521 -1.686895 1.743159 0.745526 -0.114246 -0.949490 4.754360 -6.531145 1.998913 1.114732 -1.158810 1.486939 6.410490 -5.411647 0.062737 4.164330 -6.743763 1.802804 2.587841 -3.979700 2.609748 2.192073 -2.815376 -0.809569 5.501795 -2.326438 1.325829 3.285032 -1.212541 1.284453 3.564424 -3.117406 -0.032879 2.894745 -0.632591 0.532311 3.384916 -5.383135 1.179585 0.793488 -0.894539 -0.886891 1.348785 -6.501867 1.648604 2.189941 -2.438067 0.616090 2.043378 -4.966472 0.691603 3.124161 -5.792896 0.545362 5.741472 -0.640590 2.825374 0.300550 -7.149663 0.942726 1.344387 -0.121382 2.169401 4.963296 -0.964665 -0.230523 6.651423 -4.905053 2.509626 5.059694 -6.166516 0.102255 5.046864 -3.288883 0.853948 2.389007 -3.057664 1.806301 2.365876 -0.956860 1.458959 2.892502 -0.097422 -0.531714 3.439828 -6.454984 1.319724 3.726201 -0.123038 2.096854 1.363325 -3.031238 0.016019 6.090283 -3.915340 2.396358 0.407755 -7.957784 -0.846842 0.203074 -0.796428 2.659573 2.600610 -2.259674 -0.260378 4.773839 -6.765097 0.588508 2.743424 -2.890016 2.906452 Carbon 1 Carbon 2 Carbon 3 Carbon 4 Carbon 5 Carbon 6 Carbon 7 Carbon 8 Carbon 9 Oxygen 1 Oxygen 2 Hydrogen 1 Hydrogen 2 Hydrogen 3 Oxygen 3 Hydrogen 4 Hydrogen 5 Hydrogen 6 0.979485 -6.665347 0.575383 1.191999 -3.002386 2.859484 3.151517 -5.610077 0.429413 3.439828 -6.454984 1.319724 3.726201 -0.123038 2.096854 1.363325 -3.031238 0.016019 6.090283 -3.915340 2.396358 0.407755 -7.957784 -0.846842 0.203074 -0.796428 2.659573 2.600610 -2.259674 -0.260378 4.773839 -6.765097 0.588508 2.743424 -2.890016 2.906452 2.810233 -6.641054 -0.797672 6.854169 -3.191721 -0.925670 2.914233 -1.060001 0.776983 3.803923 -1.497032 2.908799 5.669443 -7.227666 -0.647552 0.092455 -5.850637 2.959987 6.042840 -7.254720 2.093573 2.551942 -6.044322 2.061072 3.523150 -6.167163 2.451689 5.197316 -3.429866 -0.412062 2.548777 -6.422851 1.282846 3.775197 -2.012031 1.377440 3.405112 -3.206415 -0.879886 1.448359 -5.419629 0.467291 3.661964 -2.789234 2.644294 4.214854 -2.439574 -0.951704 5.297609 -2.320418 2.709898 2.653940 -4.431080 -0.511743 5.040635 -0.676199 -0.590970 1.546725 -1.294582 2.562937 4.231461 -7.180908 1.629901 3.297836 -1.557133 -0.133280 3.442481 -4.489962 2.111930 1.423611 -7.982655 0.715618 1.432495 -7.686243 2.525734 5.038409 -4.976270 2.826846 6.184137 -7.303094 2.711561 3.208125 -0.606556 1.978725 2.171859 -6.792060 0.678988 6.521124 -5.622797 -0.773797 1.725619 -5.768633 -0.223397 3.602427 -2.325680 1.762008 1.937521 -1.686895 1.743159 0.745526 -0.114246 -0.949490 4.754360 -6.531145 1.998913 1.114732 -1.158810 1.486939 6.410490 -5.411647 0.062737 4.164330 -6.743763 1.802804 2.587841 -3.979700 2.609748 2.192073 -2.815376 -0.809569 5.501795 -2.326438 1.325829 3.285032 -1.212541 1.284453 3.564424 -3.117406 -0.032879 2.894745 -0.632591 0.532311 3.384916 -5.383135 1.179585 0.793488 -0.894539 -0.886891 1.348785 -6.501867 1.648604 2.189941 -2.438067 0.616090 2.043378 -4.966472 0.691603 3.124161 -5.792896 0.545362 5.741472 -0.640590 2.825374 0.300550 -7.149663 0.942726 1.344387 -0.121382 2.169401 4.963296 -0.964665 -0.230523 6.651423 -4.905053 2.509626 5.059694 -6.166516 0.102255 5.046864 -3.288883 0.853948 2.389007 -3.057664 1.806301 2.365876 -0.956860 1.458959 2.892502 -0.097422 -0.531714 5.197316 -3.429866 -0.412062 2.548777 -6.422851 1.282846 3.775197 -2.012031 1.377440 3.405112 -3.206415 -0.879886 1.448359 -5.419629 0.467291 3.661964 -2.789234 2.644294 4.214854 -2.439574 -0.951704 5.297609 -2.320418 2.709898 2.653940 -4.431080 Carbon 1 Carbon 2 Carbon 3 Carbon 4 Carbon 5 Carbon 6 Carbon 7 Carbon 8 Carbon 9 Oxygen 1 Oxygen 2 Hydrogen 1 Hydrogen 2 Hydrogen 3 Oxygen 3 Hydrogen 4 Hydrogen 5 Hydrogen 6 0.979485 -6.665347 0.575383 1.191999 -3.002386 2.859484 3.151517 -5.610077 0.429413 3.439828 -6.454984 1.319724 3.726201 -0.123038 2.096854 1.363325 -3.031238 0.016019 6.090283 -3.915340 2.396358 0.407755 -7.957784 -0.846842 0.203074 -0.796428 2.659573 2.600610 -2.259674 -0.260378 4.773839 -6.765097 0.588508 2.743424 -2.890016 2.906452 2.810233 -6.641054 -0.797672 6.854169 -3.191721 -0.925670 2.914233 -1.060001 0.776983 3.803923 -1.497032 2.908799 5.669443 -7.227666 -0.647552 0.092455 -5.850637 2.959987 6.042840 -7.254720 2.093573 2.551942 -6.044322 2.061072 3.523150 -6.167163 2.451689 5.197316 -3.429866 -0.412062 2.548777 -6.422851 1.282846 3.775197 -2.012031 1.377440 3.405112 -3.206415 -0.879886 1.448359 -5.419629 0.467291 3.661964 -2.789234 2.644294 4.214854 -2.439574 -0.951704 5.297609 -2.320418 2.709898 2.653940 -4.431080 -0.511743 5.040635 -0.676199 -0.590970 1.546725 -1.294582 2.562937 4.231461 -7.180908 1.629901 3.297836 -1.557133 -0.133280 3.442481 -4.489962 2.111930 1.423611 -7.982655 0.715618 1.432495 -7.686243 2.525734 5.038409 -4.976270 2.826846 6.184137 -7.303094 2.711561 3.208125 -0.606556 1.978725 2.171859 -6.792060 0.678988 6.521124 -5.622797 -0.773797 1.725619 -5.768633 -0.223397 3.602427 -2.325680 1.762008 1.937521 -1.686895 1.743159 0.745526 -0.114246 -0.949490 4.754360 -6.531145 1.998913 1.114732 -1.158810 1.486939 6.410490 -5.411647 0.062737 4.164330 -6.743763 1.802804 2.587841 -3.979700 2.609748 2.192073 -2.815376 -0.809569 5.501795 -2.326438 1.325829 3.285032 -1.212541 1.284453 3.564424 -3.117406 -0.032879 2.894745 -0.632591 0.532311 3.384916 -5.383135 1.179585 0.793488 -0.894539 -0.886891 1.348785 -6.501867 1.648604 2.189941 -2.438067 0.616090 2.043378 -4.966472 0.691603 3.124161 -5.792896 0.545362 5.741472 -0.640590 2.825374 0.300550 -7.149663 0.942726 1.344387 -0.121382 2.169401 4.963296 -0.964665 -0.230523 6.651423 -4.905053 2.509626 5.059694 -6.166516 0.102255 5.046864 -3.288883 0.853948 2.389007 -3.057664 1.806301 2.365876 -0.956860 1.458959 2.892502 -0.097422 -0.531714 3.208125 -0.606556 1.978725 2.171859 -6.792060 0.678988 6.521124 -5.622797 -0.773797 1.725619 -5.768633 -0.223397 3.602427 -2.325680 1.762008 1.937521 -1.686895 1.743159 0.745526 -0.114246 -0.949490 4.754360 -6.531145 1.998913 1.114732 -1.158810 Carbon 1 Carbon 2 Carbon 3 Carbon 4 Carbon 5 Carbon 6 Carbon 7 Carbon 8 Carbon 9 Oxygen 1 Oxygen 2 Hydrogen 1 Hydrogen 2 Hydrogen 3 Oxygen 3 Hydrogen 4 Hydrogen 5 Hydrogen 6 0.979485 -6.665347 0.575383 1.191999 -3.002386 2.859484 3.151517 -5.610077 0.429413 3.439828 -6.454984 1.319724 3.726201 -0.123038 2.096854 1.363325 -3.031238 0.016019 6.090283 -3.915340 2.396358 0.407755 -7.957784 -0.846842 0.203074 -0.796428 2.659573 2.600610 -2.259674 -0.260378 4.773839 -6.765097 0.588508 2.743424 -2.890016 2.906452 2.810233 -6.641054 -0.797672 6.854169 -3.191721 -0.925670 2.914233 -1.060001 0.776983 3.803923 -1.497032 2.908799 5.669443 -7.227666 -0.647552 0.092455 -5.850637 2.959987 6.042840 -7.254720 2.093573 2.551942 -6.044322 2.061072 3.523150 -6.167163 2.451689 5.197316 -3.429866 -0.412062 2.548777 -6.422851 1.282846 3.775197 -2.012031 1.377440 3.405112 -3.206415 -0.879886 1.448359 -5.419629 0.467291 3.661964 -2.789234 2.644294 4.214854 -2.439574 -0.951704 5.297609 -2.320418 2.709898 2.653940 -4.431080 -0.511743 5.040635 -0.676199 -0.590970 1.546725 -1.294582 2.562937 4.231461 -7.180908 1.629901 3.297836 -1.557133 -0.133280 3.442481 -4.489962 2.111930 1.423611 -7.982655 0.715618 1.432495 -7.686243 2.525734 5.038409 -4.976270 2.826846 6.184137 -7.303094 2.711561 3.208125 -0.606556 1.978725 2.171859 -6.792060 0.678988 6.521124 -5.622797 -0.773797 1.725619 -5.768633 -0.223397 3.602427 -2.325680 1.762008 1.937521 -1.686895 1.743159 0.745526 -0.114246 -0.949490 4.754360 -6.531145 1.998913 1.114732 -1.158810 1.486939 6.410490 -5.411647 0.062737 4.164330 -6.743763 1.802804 2.587841 -3.979700 2.609748 2.192073 -2.815376 -0.809569 5.501795 -2.326438 1.325829 3.285032 -1.212541 1.284453 3.564424 -3.117406 -0.032879 2.894745 -0.632591 0.532311 3.384916 -5.383135 1.179585 0.793488 -0.894539 -0.886891 1.348785 -6.501867 1.648604 2.189941 -2.438067 0.616090 2.043378 -4.966472 0.691603 3.124161 -5.792896 0.545362 5.741472 -0.640590 2.825374 0.300550 -7.149663 0.942726 1.344387 -0.121382 2.169401 4.963296 -0.964665 -0.230523 6.651423 -4.905053 2.509626 5.059694 -6.166516 0.102255 5.046864 -3.288883 0.853948 2.389007 -3.057664 1.806301 2.365876 -0.956860 1.458959 2.892502 -0.097422 -0.531714 

Your task is to understand why all bits of the awk script are present. For example, why do you need R0 (hint, experiment without calculating R0 and use R in its place).

+2
source

All Articles