Here is your input with a few more odd errors added:
{ "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
"type": {"key": "/type/author"},
"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
"key": "/authors/OL2108538A",
"revision": 1,
"has \" escaped quote": 1,
"has \" escaped quotes \"": 1,
"has multiple " internal " quotes": 1,
}
This Perl program fixes unescaped internal double quotes using a heuristic: the real closing quote of a string is the one followed by optional whitespace and then a colon, comma, semicolon, or closing curly brace. Any other unescaped double quote inside the string is replaced with an apostrophe.
#!/usr/bin/perl -p
s<"(.+?)"(\s*[:,;}])> {
    my($text,$terminator) = ($1,$2);
    $text =~ s/(?<!\\)"/'/g;  # " oh, the irony!
    qq["$text"] . $terminator;
}eg;
It produces the following output:
$ ./fixdqs input.json
{ "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
"type": {"key": "/type/author"},
"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico's Economy.",
"key": "/authors/OL2108538A",
"revision": 1,
"has \" escaped quote": 1,
"has \" escaped quotes \"": 1,
"has multiple ' internal ' quotes": 1,
}
The diff between input and output:
$ diff -ub input.json <(./fixdqs input.json)
--- input.json
+++ /dev/fd/63
@@ -1,9 +1,9 @@
 { "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
 "type": {"key": "/type/author"},
-"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
+"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico's Economy.",
 "key": "/authors/OL2108538A",
 "revision": 1,
 "has \" escaped quote": 1,
 "has \" escaped quotes \"": 1,
-"has multiple " internal " quotes": 1,
+"has multiple ' internal ' quotes": 1,
 }
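Because this is only a heuristic, it helps to try it on single lines and see where it holds and where it breaks. The sketch below wraps the same substitution in a sub so it can be called on a string. The name `fix_dqs` and the "note" sample line are made up for illustration and are not part of the original program or input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption): the same substitution
# as the one-liner, written with /x so the pattern can carry comments.
sub fix_dqs {
    my ($line) = @_;
    $line =~ s<
        " (.+?) "          # shortest text between two double quotes
        ( \s* [:,;}] )     # the closing quote must precede a terminator
    > {
        my ($text, $terminator) = ($1, $2);
        $text =~ s/(?<!\\)"/'/g;   # unescaped inner quotes become apostrophes
        qq["$text"] . $terminator;
    }egx;
    return $line;
}

# Holds: neither inner quote is followed by a terminator character.
print fix_dqs(q{"has multiple " internal " quotes": 1,}), "\n";

# Holds: escaped quotes are skipped by the negative lookbehind.
print fix_dqs(q{"has \" escaped quote": 1,}), "\n";

# Misfires (invented sample): the quote after hi is followed by a comma,
# so it is taken as the closing quote and the line stays malformed.
print fix_dqs(q{"note": "she said "hi", then left",}), "\n";
```

The last case shows the limit of the approach: an internal quote that happens to sit directly before a colon, comma, semicolon, or closing brace cannot be told apart from a real closing quote, so lines like that still need fixing by hand.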