How to remove the first column (which is actually row names) from a data file in Linux?

Question

I have a data file with many thousands of columns and rows. I want to remove the first column, which is actually a row counter. I used this command in Linux:

cut -d " " -f 2- input.txt > output.txt

but nothing has changed in my output. Does anyone know why this is not working and what I should do?

This is what my input file looks like:

     col1 col2 col3 col4 ...
  1  0    0    0    1
  2  0    1    0    1
  3  0    1    0    0
  4  0    0    0    0
  5  0    1    1    1
  6  1    1    1    0
  7  1    0    0    0
  8  0    0    0    0
  9  1    0    0    0
 10  1    1    1    1
 11  0    0    0    1
  .
  .
  .

I want my result to look like this:

 col1 col2 col3 col4 ...
 0    0    0    1
 0    1    0    1
 0    1    0    0
 0    0    0    0
 0    1    1    1
 1    1    1    0
 1    0    0    0
 0    0    0    0
 1    0    0    0
 1    1    1    1
 0    0    0    1
 .
 .
 .

I also tried the sed command:

  sed '1d' input.file > output.file

But it deletes the first row, not the first column.

Can anyone help me?

+16

linux bash shell

zara Sep 27 '15 at 21:14

5 answers

The idiomatic use of cut would be

 cut -f2- input > output

if the separator is a tab ("\t").

Or use a bit of awk magic (this works with both space and tab delimiters):

  awk '{$1=""}1' input | awk '{$1=$1}1' > output

The first awk empties field 1 but leaves its delimiter; the second awk removes that delimiter. The default output separator is a space; if you want tab-separated output, add -v OFS="\t" to the second awk.
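As a quick check of the two-stage awk pipeline, here is a minimal sketch on made-up sample rows. The data and the temporary file are assumptions for illustration, not the asker's real file. It uses the tab-separated variant via -v OFS.

```shell
# Hypothetical sample: leading spaces, a row counter, then four data columns.
sample=$(mktemp)
printf '  1 0 0 0 1\n  2 0 1 0 1\n 10 1 1 1 1\n' > "$sample"

# Stage 1 empties field 1 (its separator stays behind as a leading space).
# Stage 2 re-splits each line and rejoins the remaining fields with tabs.
result=$(awk '{$1=""}1' "$sample" | awk -v OFS='\t' '{$1=$1}1')

printf '%s\n' "$result"
rm -f "$sample"
```

Note that a header line without a row-name label would lose its first column name under this pipeline, so it may need separate handling.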

UPDATED

Based on your updated input, the problem is that the leading spaces are counted as extra (empty) columns. One way to handle this is to strip them before passing the data to cut.

 sed 's/^ *//' input | cut -d" " -f2- > output

or use the awk alternative, which will work in this case too.
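To see why the leading space matters, here is a small sketch on a single made-up line (the sample line and variable names are assumptions for illustration). It compares cut on its own with the sed-then-cut pipeline.

```shell
# Hypothetical one-line sample with a single leading space before the row counter.
line=' 1 0 0 0 1'

# With -d" ", the leading space makes field 1 empty, so -f2- only drops that
# empty field and the row counter survives.
without_strip=$(printf '%s\n' "$line" | cut -d" " -f2-)

# Stripping leading spaces first makes the row counter field 1.
with_strip=$(printf '%s\n' "$line" | sed 's/^ *//' | cut -d" " -f2-)

printf 'cut only:  [%s]\n' "$without_strip"
printf 'sed + cut: [%s]\n' "$with_strip"
```

The first result still contains the row counter, which is consistent with the "nothing has changed" symptom in the question.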

+15

karakfa Sep 27 '15 at 23:56

You can use cut with the --complement option:

 cut -f1 -d" " --complement input.file > output.file

This will output all the columns except the first.

+13

buff Sep 27 '15 at 21:20

As @karakfa notes, this looks like a leading space that is causing your problems.

Here is a sed one-liner to do the job (it handles leading spaces or tabs):

 sed -i.bak "s|^[ \t]\+[0-9]\+[ \t]\+||" input.txt

Explanation:

 -i       edit existing file in place
 .bak     backup original file and add .bak file extension (can use whatever you like)
 s        substitute
 |        separator (easiest character to read as sed separator IMO)
 ^        start match at start of the line
 [ \t]    match space or tab
 \+       match one or more times (escape required so sed does not interpret '+' literally)
 [0-9]    match any number 0 - 9

As noted, input.txt will be edited in place, and its original contents will be saved as input.txt.bak. Use plain -i instead if you do not want to back up the original file.

In addition, if you know that the leading whitespace consists only of spaces (not tabs), you can shorten it to this:

 sed -i.bak "s|^ \+[0-9]\+[ \t]\+||" input.txt
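The in-place edit with a backup can be tried safely on a throwaway file. The sketch below is a variant rather than the exact command above: it assumes a sed that supports -E (extended regular expressions) and uses [[:blank:]] for "space or tab", and the sample file contents are made up.

```shell
# Hypothetical sample file in a temporary directory.
workdir=$(mktemp -d)
printf ' col1 col2 col3 col4\n  1 0 0 0 1\n\t2 0 1 0 1\n' > "$workdir/input.txt"

# -E enables extended regexes, so + needs no backslash.
# [[:blank:]] matches a space or a tab.
sed -E -i.bak 's|^[[:blank:]]+[0-9]+[[:blank:]]+||' "$workdir/input.txt"

edited=$(cat "$workdir/input.txt")
backup=$(cat "$workdir/input.txt.bak")
printf '%s\n' "$edited"
```

Because the pattern requires digits after the leading whitespace, the header line is left untouched while the numeric row counters are removed.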

0

Jeremy davis Aug 15 '19 at 1:09

You can also achieve this with grep:

 grep -E -o '[[:digit:]]([[:space:]][[:digit:]]){3}$' input.txt

This assumes single-digit numbers separated by single whitespace characters. To allow a variable number of spaces and digits, you can do:

 grep -E -o '[[:digit:]]+([[:space:]]+[[:digit:]]+){3}$' input.txt

If your grep supports the -P flag ( --perl-regexp ), you can do:

 grep -P -o '\d+(\s+\d+){3}$' input.txt

Here are a few options if you are using GNU sed:

 sed 's/^\s\+\w\+\s\+//' input.txt
 sed 's/^\s\+\S\+\s\+//' input.txt
 sed 's/^\s\+[0-9]\+\s\+//' input.txt
 sed 's/^\s\+[[:digit:]]\+\s\+//' input.txt

Note that the grep regular expressions match the parts that we want to keep, while the sed regular expressions match the parts that we want to remove.
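A small sketch of the "keep what matches" grep approach, run on made-up rows (the sample data and variable names are assumptions). It uses the variable-width pattern, which presumes exactly four data columns per row.

```shell
# Hypothetical sample: a header plus rows with one- and two-digit row counters.
rows=$(printf ' col1 col2 col3 col4\n  1 0 0 0 1\n 10 1 1 1 1\n')

# -o prints only the matched text: the last four whitespace-separated numbers.
kept=$(printf '%s\n' "$rows" | grep -E -o '[[:digit:]]+([[:space:]]+[[:digit:]]+){3}$')

printf '%s\n' "$kept"
```

The header line contains no run of four numbers, so grep -o drops it entirely; it would have to be re-added separately if it is needed in the output.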

0

htaccess Sep 03 '19 at 22:29

Fouad djebbar · Accepted Answer · 2018-03-20T09:56:17+0000

@Karafka I had CSV files, so I added the separator "," (you can replace it with your own):

 cut -d"," -f2- input.csv > output.csv

Then I used a loop to iterate over all the files inside the directory

 # files are in the directory tmp/
 # create the output directory if it does not exist yet
 mkdir -p new
 for f in tmp/*
 do
     name=$(basename "$f")
     echo "processing file : $name"
     # keep all columns except the first one of each csv file
     cut -d"," -f2- "$f" > "new/$name"
     # files using the same names are stored in directory new/
 done
