How to remove the first two columns in a file using the shell (awk, sed, whatever)

I have a file with many rows. In each row there are many columns (fields) separated by a single space, and the number of columns differs from row to row. How can I delete the first two columns?

+54
shell awk perl sed cut
Nov 19
10 answers

You can do this with cut :

 cut -d " " -f 3- input_filename > output_filename 

Explanation:

  • cut : call the cut command
  • -d " " : use a single space as delimiter ( cut uses TAB by default)
  • -f : specify fields to save
  • 3- : all fields starting with field 3
  • input_filename : use this file as input
  • > output_filename : write the output to this file.
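
For example (the sample file contents below are just an illustration, not from the original question), given an input_filename containing:

 a b c d e
 1 2 3

the command above writes this to output_filename:

 c d e
 3

Note that with -d " " every single space counts as a delimiter, so runs of two or more spaces produce empty fields; cut does not collapse them.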

Alternatively, you can do this with awk :

 awk '{$1=""; $2=""; sub(" ", " "); print}' input_filename > output_filename 

Explanation:

  • awk : invoke the awk command
  • $1=""; $2=""; : set field 1 and 2 to an empty line
  • sub(...); : clear the output fields because fields 1 and 2 will still be separated by a ""
  • print : print the modified line
  • input_filename > output_filename : same as above.
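
As a quick illustration (the sample line is made up), note that awk rebuilds the record with single-space separators (the default OFS) as soon as you assign to a field, so runs of spaces or tabs between the remaining fields get squeezed to one space:

 printf 'a  b\tc   d\n' | awk '{$1=""; $2=""; sub(/^  /, ""); print}'
 # prints: c d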
+120
Nov 19

Here is one way to do this with Awk, which is relatively easy to understand:

 awk '{print substr($0, index($0, $3))}' 

This is a simple awk command with no pattern, so the action inside {} is performed for every line of input.

The action is simply to print a substring starting at the position of the third field.

  • $0 : entire input line
  • $3 : 3rd field
  • index(in, find) : returns the position of find within the string in
  • substr(string, start) : returns the substring of string starting at position start
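
A quick illustration (the input is invented for this example):

 printf 'one  two   three four\n' | awk '{print substr($0, index($0, $3))}'
 # prints: three four

Unlike the field-assignment approaches above, this keeps the original spacing between the remaining fields. One caveat: index finds the first occurrence of the text of $3, so if that exact text also appears inside the first two fields, the substring will start too early.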

If you want to use another separator, such as a comma, you can specify it with the -F option:

 awk -F"," '{print substr($0, index($0, $3))}' 

You can also apply this to a subset of the input lines by putting a pattern before the action in {} . The action is executed only for lines that match the pattern.

 awk 'pattern{print substr($0, index($0, $3))}' 

Where the pattern can be something like:

  • /abcdef/ : a regexp, matched against $0 by default.
  • $1 ~ /abcdef/ : match against a specific field.
  • $1 == blabla : use string comparison
  • NR > 1 : use the record / line number
  • NF > 0 : use the field / column count
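
For instance, to skip a one-line header and strip the first two columns from the remaining lines (input_filename is just a placeholder):

 awk 'NR > 1 {print substr($0, index($0, $3))}' input_filename

Lines that do not match the pattern are simply not printed, since there is no other rule in the program.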
+17
Feb 05 '13 at 19:18

Thanks for posting the question. I would also like to add a script that helped me.

 awk '{ $1=""; print $0 }' file 
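
Note that as written this blanks only the first column; a minimal variation for clearing the first two instead (not from the original answer; like the one-liner above, it leaves the now-empty leading separators in place) would be:

 awk '{ $1=$2=""; print $0 }' file 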
+9
Jul 07 '14 at 1:13
 awk '{$1=$2="";$0=$0;$1=$1}1' 

Input

 a b c d 

Output

 c d 
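
To see why all three assignments are needed, here is the same one-liner spread over several lines with explanatory comments added (the layout and comments are not part of the original answer):

 awk '{
   $1 = $2 = ""   # blank the first two fields; awk rebuilds $0 as "  c d"
   $0 = $0        # re-split the rebuilt line, so c and d become $1 and $2
   $1 = $1        # rebuild $0 again from the new fields, dropping the leading blanks
 } 1'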
+8
Nov 18 '14 at 6:12

You can use sed :

 sed 's/^[^ ][^ ]* [^ ][^ ]* //' 

It looks for lines starting with one or more non-blanks followed by a blank, then another run of one or more non-blanks followed by another blank, and deletes the matched material, i.e. the first two fields. [^ ][^ ]* is slightly shorter than the equivalent but more explicit notation [^ ]\{1,\} , and the second form may run into problems with GNU sed (although if you use --posix as an option, even GNU sed can't use it). OTOH, if the character class to be repeated were more complex, the numeric notation wins for brevity. It is easy to extend this to treat "blank or tab" as the delimiter, or "multiple blanks" or "multiple blanks or tabs". It can also be modified to handle optional blanks (or tabs) before the first field, etc.
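
As a sketch of that extension (not from the original answer; it uses the POSIX [[:blank:]] class, which matches a space or a tab, and the file names are placeholders), a version that treats one or more blanks or tabs as the delimiter could look like:

 sed 's/^[^[:blank:]]\{1,\}[[:blank:]]\{1,\}[^[:blank:]]\{1,\}[[:blank:]]\{1,\}//' input_filename > output_filename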

For awk and cut solutions, see Sampson Chen's answer. There are other ways to write the awk script, but they are not much better than that answer. Note that you may need to set the field separator explicitly ( -F" " ) in awk if you do not want tabs treated as separators, or if there may be multiple spaces between fields. POSIX cut does not support multiple delimiters between fields; GNU cut has a useful but non-standard -i option that allows multiple delimiters between fields.

You can also do this in pure shell:

 while read junk1 junk2 residue; do echo "$residue"; done < in-file > out-file 
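
A slightly more careful variation (not part of the answer above) uses read -r so backslashes in the data are left alone, and printf instead of echo so lines that look like echo options are printed literally:

 while read -r junk1 junk2 residue; do
   printf '%s\n' "$residue"   # print everything from the third field onward
 done < in-file > out-file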
+6
Nov 19

Pretty simple to do this with the shell only

 while read A B C; do echo "$C"; done < oldfile > newfile 
+6
Jul 07 '14 at 2:09

Perl:

 perl -lane 'print join(" ", @F[2..$#F])' File 
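
To unpack the flags (a summary, not part of the original answer): -n loops over the input lines, -a autosplits each line on whitespace into @F, -l strips the input newline and adds one back on output, and @F[2..$#F] is the slice from the third field to the last. An equivalent spelling using splice:

 perl -lane 'splice @F, 0, 2; print "@F"' File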

AWK:

 awk '{$1=$2=""}1' File 
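
Note that this leaves the two separating blanks at the start of each output line; if that matters, a small variation (not from the original answer) strips them as well:

 awk '{$1=$2=""; sub(/^  /, "")}1' File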
+4
Dec 10 '14 at 9:17

This may work for you (GNU sed):

 sed -r 's/^([^ ]+ ){2}//' file 

or for columns separated by one or more spaces:

 sed -r 's/^(\S+\s+){2}//' file 
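
Since these already assume GNU sed, you can also edit the file in place by adding -i (shown here with a .bak backup suffix as a precaution; the suffix is illustrative, not from the original answer):

 sed -r -i.bak 's/^(\S+\s+){2}//' file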
+1
Nov 19 '12 at 7:14

Use kscript

 kscript 'lines.split().select(-1,-2).print()' file 
0
May 12 '17 at 8:40

Using awk with a for loop, based on some of the other answers here, makes it a little more flexible; sometimes I want to delete the first 9 columns (for example, with "ls -lrt"), so I just change the 2 to a 9 and that's it:

 awk '{ for(i=0;i++<2;){$i=""}; print $0 }' your_file.txt
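
If you change the count often, a variation (a generalization, not from the original answer) that takes the number of columns as a variable saves editing the script each time:

 awk -v n=2 '{ for(i=1;i<=n;i++) $i=""; print $0 }' your_file.txt

Setting n=9 then drops the first nine columns instead, as in the ls -lrt example above.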

0
Dec 20 '17 at 19:13


