Convert ls output to csv

How do I convert:

$ find . -ls > /tmp/files.txt 

Which gives me something like:

 908715 40 -rwxrwxr-x 1 david staff 16542 Nov 15 14:12 ./dump_info.py
 908723 0 drwxr-xr-x 2 david staff 68 Nov 20 17:35 ./metadata

into CSV output? It should look like this:

 908715,40,-rwxrwxr-x,1,david,staff,16542,Nov 15 14:12,./dump_info.py
 908723,0,drwxr-xr-x,2,david,staff,68,Nov 20 17:35,./metadata

Here is an example with spaces in the file name:

 652640,80,-rw-rw-r--,1,david,staff,40036,Nov,6,15:32,./v_all_titles/V Catalog Report 11.5.xlsx 
linux unix find csv
5 answers

It's a bit longer to type on the command line, but it correctly preserves spaces in the file name (and quotes it too!)

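 find . -ls | python -c '
 import sys
 for line in sys.stdin:
     # 10 leading fields, then the file name (which may contain spaces)
     r = line.strip("\n").split(None, 10)
     fn = r.pop()
     # quote the file name, doubling any embedded double quotes
     print(",".join(r) + ",\"" + fn.replace("\"", "\"\"") + "\"")
 '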
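The doubled-quote escaping used above is the convention that Python's `csv` module reads by default, so the result can be checked by parsing it back. A minimal sketch, assuming Python 3; the helper names `to_csv_line` and `parse_back` are made up for illustration:

```python
import csv
import io


def to_csv_line(ls_line):
    """Mirror the one-liner: 10 comma-joined fields plus a quoted file name."""
    r = ls_line.strip("\n").split(None, 10)
    fn = r.pop()
    return ",".join(r) + ',"' + fn.replace('"', '""') + '"'


def parse_back(csv_line):
    """Parse one CSV line with the standard csv reader (default dialect)."""
    return next(csv.reader(io.StringIO(csv_line)))


if __name__ == "__main__":
    # Sample line taken from the question (file name contains spaces).
    sample = ("652640 80 -rw-rw-r-- 1 david staff 40036 Nov 6 15:32 "
              "./v_all_titles/V Catalog Report 11.5.xlsx")
    line = to_csv_line(sample)
    print(line)
    print(parse_back(line)[-1])  # the file name as one field
```

Note that the date still comes out as three separate columns (month, day, time), as in the question's example with spaces.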

If you don't need to keep the spaces in the date:

 $ find . -ls | tr -s ' ' , 

If you care about those spaces:

 $ find . -ls | awk '{printf( "%s,%s,%s,%s,%s,%s,%s,%s %s %s,%s\n", $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11 )}' 

Neither of these will work if your file names contain spaces. As a hack to deal with spaces in the file name, you can try:

  ... | sed 's/,/ /8g' 

to turn the 8th and every later comma back into a space (if your sed supports the non-standard 8g flag, as GNU sed does). Of course, this will not cope with commas in the file name.
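For file names that contain both spaces and commas, one option is to split each line a fixed number of times and let a CSV writer do the quoting. A minimal sketch, assuming Python 3 and the usual 11 whitespace-separated fields of `find . -ls` with the file name last; the function names `ls_line_to_row` and `convert` are hypothetical:

```python
import csv
import sys


def ls_line_to_row(line):
    """Split one `find . -ls` line into the 9 columns asked for in the question.

    Returns None for lines that do not have the expected 11 fields.
    """
    parts = line.rstrip("\n").split(None, 10)
    if len(parts) < 11:
        return None
    date = " ".join(parts[7:10])  # month, day, time-or-year as one column
    return parts[:7] + [date, parts[10]]


def convert(lines, out):
    """Write CSV rows to `out`; csv.writer quotes a field only when needed."""
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        row = ls_line_to_row(line)
        if row is not None:
            writer.writerow(row)


if __name__ == "__main__":
    # Sample lines taken from the question.
    samples = [
        " 908715 40 -rwxrwxr-x 1 david staff 16542 Nov 15 14:12 ./dump_info.py\n",
        " 908723 0 drwxr-xr-x 2 david staff 68 Nov 20 17:35 ./metadata\n",
    ]
    convert(samples, sys.stdout)
```

To use it as a filter, call `convert(sys.stdin, sys.stdout)` instead and pipe `find . -ls` into the script. It still assumes that user and group names contain no spaces and that file names contain no newlines; a symlink's `name -> target` text stays in the name column.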


This should do the job:

  find . -ls|awk 'BEGIN{OFS=","}$1=$1' 

And one more option. See the "-printf format" section of the find man page to customize the fields.

 $ find . -type f -fprintf /tmp/files.txt "%i,%b,%M,%n,%u,%g,%s,%CY-%Cm-%Cd %CT,%p\n" 

Output Example:

 $ less /tmp/files.txt
 3414558,40,-rw-rw-r--,1,webwurst,webwurst,16542,2014-09-18 15:54:36.9232917780,./dump_info.py
 3414559,8,-rw-rw-r--,1,webwurst,webwurst,68,2014-09-18 15:54:51.1752922580,./metadata
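`-printf` and `-fprintf` are GNU find extensions, so the find shipped with macOS or the BSDs may not accept them. As a rough equivalent, the same columns can be read straight from the file system. A minimal sketch, assuming Python 3 on a Unix-like system; the function names are hypothetical, and the timestamp is the status change time to match `%C` above, without fractional seconds:

```python
import csv
import grp
import os
import pwd
import stat
import time


def owner_name(uid):
    """Return the user name for uid, or the number if it has no entry."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def group_name(gid):
    """Return the group name for gid, or the number if it has no entry."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return str(gid)


def file_rows(top):
    """Yield one row per regular file under `top`, similar to `-type f`."""
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and so on
            changed = time.strftime("%Y-%m-%d %H:%M:%S",
                                    time.localtime(st.st_ctime))
            yield [st.st_ino, st.st_blocks, stat.filemode(st.st_mode),
                   st.st_nlink, owner_name(st.st_uid), group_name(st.st_gid),
                   st.st_size, changed, path]


def write_csv(top, out):
    """Write the rows as CSV; names with commas or quotes get quoted."""
    csv.writer(out, lineterminator="\n").writerows(file_rows(top))
```

For example, `write_csv(".", open("/tmp/files.txt", "w", newline=""))` would write one line per regular file below the current directory.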

Here's a python script I wrote ...

 #!/opt/app/python/bin/python
 # Convert ls output to clean csv    Paolo Villaflores 2015-03-16
 #
 # Sample usage: ls -l | ls2csv.py
 #
 # Features:
 #   accepts -d argument to change dates to yyyy-mm-dd_hhmm format
 #   input is via stdin
 #   separate file/directory field
 #   handle -dils type input (find -ls) versus -l
 #   handle space in filename, by applying quotes around filename
 #   handle date - format into something excel can handle correctly,
 #     whether it is from current year or not.
 #   adds a header
 #   handle symlinks - type l

 import sys
 from datetime import datetime

 b0 = True  # True until the first usable line has been seen


 def is_f(s):
     if s == '-':
         return 'f'
     return s


 for line in sys.stdin:
     if len(line) < 40:
         continue
     if b0:
         # c is for devices eg /devices/pseudo/pts@0:5, l is for symbolic link
         b1 = line[0] in ['-', 'd', 'c', 'l']
         b0 = False
         if b1:  # true when shorter ls -l style 8/9 columns. 9 for symlink
             cols = 7
             print("d,perms,#links,owner,group,size,modtime,name,symlink")
         else:
             cols = 9
             print("inode,bsize,d,perms,#links,owner,group,size,modtime,name,symlink")
     r = line.strip("\n").split(None, cols + 1)
     if len(r) < cols + 1:
         continue
     if r[cols - 7][0] == 'c':
         continue  # ignore c records: devices
     fn = r.pop()
     if b1:
         c = ''
     else:
         c = ",".join(r[0:2]) + ","
     z = r[cols].find(':')
     if z < 0:
         # the last date field is a year, not a time
         d = r[cols - 1] + "/" + r[cols - 2] + "/" + r[cols]
         # parse it too, so that -d also works for these lines
         tm = datetime.strptime(r[cols - 2] + " " + r[cols - 1] + " " + r[cols], "%b %d %Y")
     else:
         n = str(datetime.now())
         d = ''
         # handle the case where the timestamp has no year field
         tm = datetime.strptime(r[cols - 2] + " " + r[cols - 1] + " " + n[:4] + " " + r[cols], "%b %d %Y %H:%M")
         if (tm - datetime.now()).days > 0:
             # a date in the future must belong to the previous year
             d = r[cols - 1] + "/" + r[cols - 2] + "/" + str(datetime.now().year - 1) + " " + r[cols]
             tm = datetime.strptime(r[cols - 2] + " " + r[cols - 1] + " " + str(int(n[:4]) - 1) + " " + r[cols], "%b %d %Y %H:%M")
         else:
             d = r[cols - 1] + "/" + r[cols - 2] + "/" + " ".join([n[:4], r[cols]])
     if len(sys.argv) > 1 and sys.argv[1] == '-d':
         d = tm.strftime("%Y-%m-%d_%H%M")
     # split "name -> target" for symlinks
     y = fn.find(">")
     symlink = ''
     if y > 0:
         symlink = ',"' + fn[y + 2:] + '"'
         fn = fn[:y - 2]
     # quote the file name if it contains a space or a double quote
     if fn.find(" ") < 0:
         if fn.find('"') < 0:
             fn2 = fn
         else:
             fn2 = "'" + fn + "'"
     else:
         fn2 = "'" + fn + "'"
     print(c + is_f(r[cols - 7][0]) + ",\"" + r[cols - 7][1:] + "\"," + ",".join(r[cols - 6:cols - 2]) + "," + d + "," + fn2 + symlink)
