Replacing the first occurrence on each line with vi / vim / sed, etc.

Using vi to replace the first occurrence of a pattern is pretty simple.

:%s/search/replace/args 

But here is my dataset, in .csv file format:

 "192.168.2.1","www.google.com","2009/01/11_10:00"," What a great website"
 "192.168.2.2/driving/is/fun","-","2009/03/22_00:00","Driving website"
 "192.168.2.4/boating/is/crazy","-","2009/03/22_00:00","Boating Website"
 "192.168.2.5","www.cars.com","2009/04/27_00:00","What a good car website"

As you will notice, the first row has 4 columns; this is an ideal line for the .csv format.

The second row also has 4 columns, but the first column should accept only IP addresses and nothing else, so in 192.168.2.2/driving/is/fun the path must either be deleted or be separated from the address with ",", the csv delimiter.

In vi, I was able to use the following:

  :/^"\d\{,3}\.\d\{,3}\.\d\{,3}\.\d\{,3}\//s/\//","/ 

which performs the following actions:

  • /^"\d\{,3}\.\d\{,3}\.\d\{,3}\.\d\{,3}\// - anchors the search to the start of a line and matches the leading IP address plus the slash after it, for example on line 2: "192.168.2.2/

  • s/\//","/ - replaces the / at the end of the IP address with the .csv delimiter ","

This works fine in vi / Vim, replacing exactly what I need, one line at a time. However, the data set is much larger, and repeating this search-and-replace by hand is time-consuming. I am looking for a script or an alternative solution, because the command above only handles one line at a time, while s/search/replace/g replaces every / on the line, which also changes the date column.

Obviously I tried the following:

Adding % for the whole file at the beginning of the substitution, like this:

  :/^"\d\{,3}\.\d\{,3}\.\d\{,3}\.\d\{,3}\//%s/\//","/ 

which highlights every entry I need to change, but fails with the error:

  E492: Not an editor command: /^"\d\{,3}\.\d\{,3}\.\d\{,3}\.\d\{,3}\//%s/\// 

which is pretty confusing.

Ultimately, I would like to use sed / perl for the script to edit the entire file in one shot.

So..

"192.168.2.2/ -> "192.168.2.2","

The first occurrence in each line.

Any help would be greatly appreciated.

Thanks!

+4
3 answers

In Vim, try:

  :%s/^\("\d\{,3}\.\d\{,3}\.\d\{,3}\.\d\{,3}\)\(\/[^"]\)/\1","\2 

That is, instead of searching first and then substituting, I run the substitution over the whole file (% is shorthand for the range 1,$, i.e. from the first line to the last line). I moved your search pattern into the substitution pattern and captured the IP address and the path in separate groups. The replacement then puts both groups back, with "," squeezed in between them.
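For a scripted version of the same idea, here is a minimal Python sketch of this substitution. It assumes the pattern carries over to Python's re syntax as shown (\d{1,3} is slightly stricter than Vim's \d\{,3}), and the names IP_WITH_PATH and split_ip_from_path are made up for illustration. As with the Vim command, the slash stays at the start of the new second field.

```python
import re

# Hypothetical names; the pattern mirrors the Vim one above.
# Group 1: opening quote plus the IP address. Group 2: the slash and the
# first character of the path that follows it.
IP_WITH_PATH = re.compile(r'^("\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(/[^"])')


def split_ip_from_path(line):
    """Insert "," between a leading IP address and the path that follows it."""
    # count=1 is redundant with the ^ anchor, but it makes the
    # "first occurrence only" intent explicit.
    return IP_WITH_PATH.sub(r'\1","\2', line, count=1)


if __name__ == "__main__":
    sample = '"192.168.2.2/driving/is/fun","-","2009/03/22_00:00","Driving website"'
    print(split_ip_from_path(sample))
```

Lines whose first field is a bare IP address do not match the pattern and pass through unchanged, so the slashes in the date column are never touched.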

+3

In vi / vim, you can specify the range of lines a substitution applies to. In this case, you want :%s, which substitutes on all lines:

 :%s/search/replace/g 

You can also specify:

 :2,5s/search/replace/g      Replace on lines 2-5
 :.,$s/search/replace/g      Replace from the current line (.) to the last line ($)
 :.,+3s/search/replace/g     Replace on the current line (.) and the next three lines (+3)
 :g/^asd/s/search/replace/g  Replace on lines starting with 'asd'

You can then combine this with a simpler pattern to make the replacements you need throughout the file:

 :%s/^\("[^/"]*\)[^"]*"/\1"/ 

This deletes everything after the IP address in the first field of each CSV line.

 :%s/^\("[^/"]*\)\/\([^"]*\)"/\1","\2"/ 

This splits the first field into the IP address and the remainder, although only on lines where there is a slash after the IP address. The replacement ends with a " because the pattern consumes the field's closing quote, which has to be put back. As for your attempt: a /pattern/ placed in front of s is itself a line address (search for the pattern, go to that line, then substitute there), so adding % after it produced an invalid command.
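Since the question asks for a script, here is a minimal Python sketch of this split. It assumes the remainder should stay a quoted field of its own, so the replacement ends with a closing quote; FIRST_FIELD, split_first_field and clean_lines are hypothetical names chosen for illustration.

```python
import re

# Hypothetical names for illustration; not part of the original answer.
# Group 1: opening quote plus everything before the first slash.
# Group 2: everything between that slash and the field's closing quote.
FIRST_FIELD = re.compile(r'^("[^/"]*)/([^"]*)"')


def split_first_field(line):
    """Split the first quoted field at its first slash into two quoted fields."""
    return FIRST_FIELD.sub(r'\1","\2"', line)


def clean_lines(lines):
    """Apply the split to every line; lines with no slash in field one are unchanged."""
    return [split_first_field(line) for line in lines]


if __name__ == "__main__":
    rows = [
        '"192.168.2.1","www.google.com","2009/01/11_10:00"," What a great website"',
        '"192.168.2.4/boating/is/crazy","-","2009/03/22_00:00","Boating Website"',
    ]
    for row in clean_lines(rows):
        print(row)
```

Because the pattern is anchored with ^ and group 1 cannot cross a quote, only the first field can match, which keeps the slashes in the date column intact.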

+4

You can do what you want with a simpler pattern:

 s/^\("[^/"]*\)[^"]*"/\1"/ 

This reads as: match the beginning of the line; open a capture group; match a " ; match any number of characters that are neither a slash nor a " ; close the capture group; match any number of characters that are not a " ; and match a " . The whole match is replaced with the contents of the capture group plus a " .

The above pattern should be easy to use in a script. Here is a Python example.

    #!/usr/bin/env python
    import re
    import sys

    if len(sys.argv) != 3:
        print("Usage: log_file_cleaner <input_file> <output_file>")
        sys.exit(1)

    # Keep the first field only up to (not including) its first slash.
    pat = re.compile(r'^("[^/"]*)[^"]*"')

    with open(sys.argv[1]) as in_f, open(sys.argv[2], "w") as out_f:
        for line in in_f:
            line = re.sub(pat, r'\1"', line)
            out_f.write(line)

Note: you need a recent version of Python (2.7 or 3.1 and later) to write a single with statement that makes two open() calls. If you are stuck with an older version, for example under Cygwin, you can rewrite the above as two nested with statements, each of which makes one call to open() .
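The nested form could look roughly like the sketch below. Wrapping the loop in a function named clean_file is my own choice for illustration and is not part of the script above; the pattern and the replacement are the same.

```python
import re

# Same pattern as in the script above.
PAT = re.compile(r'^("[^/"]*)[^"]*"')


def clean_file(in_path, out_path):
    """Copy in_path to out_path, trimming the first field of each line at its first slash."""
    # Two nested with statements, one open() call each, for Python
    # versions that do not accept several context managers in one with.
    with open(in_path) as in_f:
        with open(out_path, "w") as out_f:
            for line in in_f:
                out_f.write(PAT.sub(r'\1"', line))
```

Both files are still closed automatically when their with block exits, so the behavior matches the single-statement version.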

+2