How to replace text with sed or awk?

I have the following JSON file:

{ "last_modified": { "type": "/type/datetime", "value": "2008-04-01T03:28:50.625462" }, "type": { "key": "/type/author" }, "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.", "key": "/authors/OL2108538A", "revision": 1 } 

The name value contains a stray double quote, and I want to replace that one double quote with a single quote (and leave every other double quote alone). How can I do this?

+4
source share
7 answers

If you want to replace all occurrences of a single character, you can also use the tr command, which is simpler than sed or awk:

  cat myfile.txt | tr \" \' 

Note that both quotes are escaped so that the shell passes them to tr literally. Characters that are not special to the shell need no escaping:

  cat myfile.txt | tr a A 

Edit: note that after the question has been modified, this answer is no longer valid: it replaces all double quotes, not just those inside the Name property.
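
To make that caveat concrete, here is a small sketch on a shortened, made-up line (the sample text and the variable names are assumptions, not the asker's full record). Every double quote is converted, including the ones that delimit keys and values:

```shell
# Shortened sample line (hypothetical); the stray quote is the one in Rico"s.
sample='{ "name": "Puerto Rico"s Economy.", "revision": 1 }'

# tr maps every double quote to a single quote, with no notion of position.
converted=$(printf '%s\n' "$sample" | tr '"' "'")
printf '%s\n' "$converted"
# prints: { 'name': 'Puerto Rico's Economy.', 'revision': 1 }
```

The result is no longer JSON at all, which is why tr only fits the case where every quote in the file should change.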

+3
source

I think it would be better to use sed, something like this (note that it replaces every double quote in the file):

 sed "s/\"/'/g" yourfile

+1
source

If you mean only the double quote in 'Rico"s' , you can use:

 sed "s/Rico\"s/Rico's/" 

as in:

 pax> echo '{"name": "National Res...rto Rico"s Economy.", "key": "blah"}' | sed "s/Rico\"s/Rico's/"
 {"name": "National Res...rto Rico's Economy.", "key": "blah"}

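
If the stray quote is not always in the same word, one possible generalization is to replace a double quote only when it has a letter on both sides. This is a sketch under that assumption, not a general JSON repair: it would also change any legitimate quote that happens to sit directly between two letters. The variable names are illustrative.

```shell
# Shortened sample line, same shape as the one used in this answer.
sample='{"name": "National Res...rto Rico"s Economy.", "key": "blah"}'

# Replace a double quote only when a letter sits on both sides of it.
# \1 and \2 put the surrounding letters back; wrapping the sed script in
# double quotes lets the shell pass a literal single quote through.
fixed=$(printf '%s\n' "$sample" | sed "s/\([[:alpha:]]\)\"\([[:alpha:]]\)/\1'\2/g")
printf '%s\n' "$fixed"
# prints: {"name": "National Res...rto Rico's Economy.", "key": "blah"}
```

The quotes that delimit keys and values are left alone here because each of them has a space, brace, colon, comma, or period on at least one side.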
0
source

Assuming your data is exactly the same as you showed, and additional double quotes appear only in the name field:

Update:

I made the script a little more reliable (handling ',' inside fields).

 BEGIN {
     q = "\""
     FS = OFS = q ", " q
 }
 {
     split($1, arr, ": " q)
     gsub(q, "'", arr[2])
     print arr[1] ": " q arr[2], $2, $3
 }

Put this script in a file (say dequote.awk ) and run the script using awk -f dequote.awk input.json > output.json .

Update 2:

OK, so your input is quite hard to process. The only thing I can think of is this:

 {
     start = match($0, "\"name\": ") + 8    # index of the opening quote of the name value
     stop  = match($0, "\", \"key\": ")     # index of its closing quote
     if (start == 8 || stop == 0) {         # a marker is missing: pass the line through
         print
         next
     }
     pre  = substr($0, 1, start)
     post = substr($0, stop)
     name = substr($0, start + 1, stop - start - 1)
     gsub("\"", "'", name)
     print pre name post
 }

Explanation: I slice the line into three parts:

  • everything up to and including the opening double quote of the "name" value;
  • the name value itself, without its enclosing double quotes;
  • the closing double quote and the rest of the line.

In part 2, I replace all double quotes with single quotes. Then I glue the three parts back and print them.
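
To see the three parts on their own, here is a small sketch that prints them for a shortened, made-up line before any replacement happens (the sample text and the pre:/name:/post: labels are illustrative assumptions):

```shell
# Shortened sample line (hypothetical) with the "name" and "key" fields adjacent.
sample='{ "name": "Puerto Rico"s Economy.", "key": "/authors/OL2108538A" }'

parts=$(printf '%s\n' "$sample" | awk '{
    start = match($0, "\"name\": ") + 8    # index of the opening quote of the value
    stop  = match($0, "\", \"key\": ")     # index of the closing quote of the value
    print "pre:  " substr($0, 1, start)
    print "name: " substr($0, start + 1, stop - start - 1)
    print "post: " substr($0, stop)
}')
printf '%s\n' "$parts"
```

Only the middle part still contains the stray quote, so running gsub on it alone cannot touch the quotes that delimit the fields. The approach depends on "key" following "name" directly, as it does in the question's data.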

0
source
 awk '{for(i=1;i<=NF;i++) if($i~/name/) { gsub("\042","\047",$(i+1)) } }1' file 
0
source

Adding some other weird errors to your input

 { "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
   "type": {"key": "/type/author"},
   "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
   "key": "/authors/OL2108538A",
   "revision": 1,
   "has \" escaped quote": 1,
   "has \" escaped quotes \"": 1,
   "has multiple " internal " quotes": 1,
 }

this Perl program fixes unescaped internal double quotes, using the heuristic that the real closing quote of a string is followed by optional whitespace and then a colon, comma, semicolon, or closing curly brace:

 #! /usr/bin/perl -p

 s<"(.+?)"(\s*[:,;}])> {
   my($text,$terminator) = ($1,$2);
   $text =~ s/(?<!\\)"/'/g;  # " oh, the irony!
   qq["$text"] . $terminator;
 }eg;

outputs the following result:

  $ ./fixdqs input.json
 { "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
   "type": {"key": "/type/author"},
   "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico's Economy.",
   "key": "/authors/OL2108538A",
   "revision": 1,
   "has \" escaped quote": 1,
   "has \" escaped quotes \"": 1,
   "has multiple ' internal ' quotes": 1,
 }

Delta from input to output:

  $ diff -ub input.json <(./fixdqs input.json)
 --- input.json
 +++ /dev/fd/63
 @@ -1,9 +1,9 @@
  { "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
    "type": {"key": "/type/author"},
 -  "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
 +  "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico's Economy.",
    "key": "/authors/OL2108538A",
    "revision": 1,
    "has \" escaped quote": 1,
    "has \" escaped quotes \"": 1,
 -  "has multiple " internal " quotes": 1,
 +  "has multiple ' internal ' quotes": 1,
  }
0
source

If there are only quotes around the "name", you can use sed from the command line or in a bash script:

  sed -i 's/ "name"/ '\'name\''/g' filename.json 

Tested, working.

0
source
