Parse JSON for an array in a shell script

Question

I am trying to parse a JSON object into an array inside a shell script.

For example: [Amanda, 25, http://mywebsite.com]

JSON looks like this:

 {
   "name" : "Amanda",
   "age" : "25",
   "websiteurl" : "http://mywebsite.com"
 }

I don't want to use any libraries; it would be better if I could use a regex or grep. I tried:

 cat myfile.json | grep name

It gives me the line "name" : "Amanda", in full. I could do this in a loop for each line in the file and add the result to the array, but I only need the right-hand side, not the entire line.

+8

json bash shell parsing

unconditionalcoder Jul 14 '16 at 1:45

3 answers

jq is good enough to solve this problem

 paste -s <(jq '.files[].name' YourJsonString) <(jq '.files[].age' YourJsonString) <( jq '.files[].websiteurl' YourJsonString)

This gives you a table in which you can grep for any rows or print any columns with awk. Note that the command assumes the records sit in a top-level "files" array.
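How paste -s arranges its inputs can be sketched without jq. In the block below, printf merely stands in for the three jq calls, and the two sample records are assumptions borrowed from the other answers in this thread.

```shell
#!/usr/bin/env bash
# Sketch: how paste -s lays out its inputs. printf stands in for the jq calls;
# the sample values are hypothetical.
table=$(paste -s <(printf '%s\n' '"Amanda"' '"samantha"') \
                 <(printf '%s\n' '"25"' '"31"') \
                 <(printf '%s\n' '"http://mywebsite.com"' '"http://anotherwebsite.org"'))
printf '%s\n' "$table"

# Each input becomes one tab-separated row, so column 2 holds the second record.
second=$(printf '%s\n' "$table" | awk -F'\t' '{ print $2 }')
printf '%s\n' "$second"
```

Each property ends up as a row and each record as a column, which is why grep selects by row and awk by column here.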

+3

Dr_hope Apr 25 '17 at 0:33

To achieve this, you can use a sed one-liner:

 array=( $(sed -n "/{/,/}/{s/[^:]*:[[:blank:]]*//p;}" json ) )

Result:

 $ echo ${array[@]}
 "Amanda" "25" "http://mywebsite.com"

If you don't want the quotes, the following sed command strips them:

 array=( $(sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' json) )

Result:

 $ echo ${array[@]}
 Amanda 25 http://mywebsite.com

It will also work if you have multiple entries, for example:

 $ cat json
 {
   "name" : "Amanda"
   "age" : "25"
   "websiteurl" : "http://mywebsite.com"
 }
 {
   "name" : "samantha"
   "age" : "31"
   "websiteurl" : "http://anotherwebsite.org"
 }
 $ echo ${array[@]}
 Amanda 25 http://mywebsite.com samantha 31 http://anotherwebsite.org

UPDATE:

As mklement0 pointed out in the comments, a problem arises if a value contains embedded whitespace, for example "name" : "Amanda lastname". In that case, Amanda and lastname are read into separate array elements. To avoid this, you can use readarray, for example:

 readarray -t array < <(sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' json2)

This also takes care of the globbing issues mentioned in the comments.
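The difference between the two ways of filling the array can be seen in a small sketch. The file name json2 follows the answer; its contents are made up for illustration (one value with an embedded space, one consisting of a glob character), and the temporary directory is only there to make the glob expansion predictable. readarray requires Bash 4 or later.

```shell
#!/usr/bin/env bash
# Sketch: word splitting vs. readarray. Sample data is hypothetical.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

cat > json2 <<'EOF'
{
  "name" : "Amanda lastname"
  "note" : "*"
}
EOF

extract() { sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' json2; }

# Unquoted command substitution: the output is split on whitespace,
# and each resulting word is then subject to filename expansion.
split=( $(extract) )

# readarray -t: one array element per output line, no splitting, no globbing.
readarray -t lines < <(extract)

echo "word splitting: ${#split[@]} elements"
echo "readarray:      ${#lines[@]} elements"
declare -p split lines
```

With this sample, the first array ends up with three elements (Amanda, lastname, and json2, because * expands to the only file in the directory), while readarray yields the two intended values.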

0

nautical Jul 14 '16 at 5:31

mklement0 · Accepted Answer · 2016-07-14T04:50:50+0000

If you really cannot use a proper JSON parser such as jq [1], try an awk-based solution:

Bash 4.x:

 readarray -t values < <(awk -F\" 'NF>=3 {print $4}' myfile.json)

Bash 3.x:

 IFS=$'\n' read -d '' -ra values < <(awk -F\" 'NF>=3 {print $4}' myfile.json)

All property values are now stored in the Bash array ${values[@]}, which you can inspect with declare -p values.
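To see why $4 is the value, it may help to print how awk splits a line when the double quote is the field separator. The sample line is taken from the question; the loop over the fields is only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: show the fields awk sees when '"' is the field separator.
line='  "name" : "Amanda",'

printf '%s\n' "$line" |
  awk -F\" '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'

# $2 is the property name and $4 the value; lines without any double quote
# (such as "{" and "}") have NF == 1 and are skipped by the NF>=3 condition.
value=$(printf '%s\n' "$line" | awk -F\" 'NF>=3 {print $4}')
echo "$value"
```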

These solutions have limitations:

each property must be on its own line,
all values must be double-quoted,
embedded escaped double quotes are not supported.

All of these limitations reinforce the recommendation to use a proper JSON parser.
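The limitations listed above can be reproduced with a few hand-made input lines. The sample lines below are hypothetical, and extract is just a local helper name wrapping the awk command from this answer.

```shell
#!/usr/bin/env bash
# Sketch: inputs that break the line-based awk approach. Sample data is hypothetical.
extract() { awk -F\" 'NF>=3 {print $4}'; }

# 1) An embedded escaped double quote truncates the value.
escaped=$(printf '%s\n' '  "name" : "Amanda \"Mandy\" Smith",' | extract)
echo "escaped quote: [$escaped]"

# 2) Several properties on one line: only the first value is returned.
oneline=$(printf '%s\n' '{ "name" : "Amanda", "age" : "25" }' | extract)
echo "single line:   [$oneline]"

# 3) An unquoted value is not picked up; an empty line is printed instead.
unquoted=$(printf '%s\n' '  "age" : 25,' | extract)
echo "unquoted:      [$unquoted]"
```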

Note: The following alternative solutions use the Bash 4.x+ readarray -t values, but they also work with the Bash 3.x alternative, IFS=$'\n' read -d '' -ra values.

grep + cut combination: a single grep will not do (unless you use GNU grep - see below), but adding cut helps:

 readarray -t values < <(grep '"' myfile.json | cut -d '"' -f4)

GNU grep: use -P for PCRE support, which provides \K to discard everything matched so far (a more flexible alternative to a look-behind assertion) as well as look-ahead assertions ((?=...)):

 readarray -t values < <(grep -Po ':\s*"\K.+(?="\s*,?\s*$)' myfile.json)
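The effect of \K is easiest to see by running the same pattern with and without it on a single line. This is a sketch that assumes GNU grep built with PCRE support; the sample line comes from the question.

```shell
#!/usr/bin/env bash
# Sketch: \K and the look-ahead in isolation (assumes GNU grep with -P support).
line='  "websiteurl" : "http://mywebsite.com",'

# Without \K, -o prints the whole match, including the leading ': "'.
whole=$(printf '%s\n' "$line" | grep -Po ':\s*".+(?="\s*,?\s*$)')

# With \K, everything matched before it is dropped from the reported match.
value=$(printf '%s\n' "$line" | grep -Po ':\s*"\K.+(?="\s*,?\s*$)')

printf 'whole: [%s]\nvalue: [%s]\n' "$whole" "$value"
```

In both cases the closing quote and the trailing comma are excluded, because the look-ahead checks for them without consuming them.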

Finally, here's a pure Bash solution (3.x+):

What makes this a viable alternative in terms of performance is that no external utilities are called in each iteration of the loop; however, for large input files, a solution based on external utilities will be much faster.

 #!/usr/bin/env bash

 declare -a values # declare the array

 # Read each line and use regex parsing (with Bash `=~` operator)
 # to extract the value.
 while read -r line; do
   # Extract the value from between the double quotes
   # and add it to the array.
   [[ $line =~ :[[:blank:]]+\"(.*)\" ]] && values+=( "${BASH_REMATCH[1]}" )
 done < myfile.json

 declare -p values # print the array
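For trying the loop without creating myfile.json, a self-contained variant can read the sample from a here-document. The `|| [[ -n $line ]]` guard is an addition not present in the answer; it keeps a final line that lacks a trailing newline from being dropped.

```shell
#!/usr/bin/env bash
# Sketch: the same loop, fed from a here-document instead of myfile.json.
values=()

while read -r line || [[ -n $line ]]; do   # also handle a last line without newline
  # BASH_REMATCH[0] is the whole match, BASH_REMATCH[1] the first capture group.
  if [[ $line =~ :[[:blank:]]+\"(.*)\" ]]; then
    values+=( "${BASH_REMATCH[1]}" )
  fi
done <<'EOF'
{
  "name" : "Amanda",
  "age" : "25",
  "websiteurl" : "http://mywebsite.com"
}
EOF

declare -p values
```

The colon inside http:// does not start a match, since the pattern requires at least one blank after the colon.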

[1] Here's what a robust jq-based solution would look like (Bash 4.x):

 readarray -t values < <(jq -r '.[]' myfile.json)
