Faster retrieval of multiple values from a large jsonb field (PostgreSQL 9.4)

TL;DR

Using PostgreSQL 9.4, is there a way to get multiple values from a jsonb field in one call, for example with an imaginary function:

 jsonb_extract_path(x, ARRAY['a_dictionary_key', 'a_second_dictionary_key', 'a_third_dictionary_key'])

The hope is to avoid the almost linear growth in the time needed to select multiple values (1 value = 300 ms, 2 values = 450 ms, 3 values = 600 ms).

Background

I have the following jsonb table:

 CREATE TABLE "public"."analysis" (
     "date" date NOT NULL,
     "name" character varying (10) NOT NULL,
     "country" character (3) NOT NULL,
     "x" jsonb,
     PRIMARY KEY(date,name)
 );

It has approximately 100,000 rows, and each row holds a jsonb dictionary with 90+ keys and corresponding values. I am trying to write an SQL query that selects a few (<10) keys and their values reasonably fast (<500 ms).

Index and query: 190 ms

I started by adding an index:

 CREATE INDEX ON analysis USING GIN (x); 

This makes queries that filter on values in the x dictionary fast, for example:

 SELECT date, name, country FROM analysis where date > '2014-01-01' and date < '2014-05-01' and cast(x#>> '{a_dictionary_key}' as float) > 100; 

This takes ~190 ms (acceptable for us).
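
One caveat worth checking: a default GIN index on a jsonb column serves the containment and key-existence operators (@>, ?, ?|, ?&), so a range comparison on a cast value may not use it at all, and the 190 ms may instead come from the primary key's leading date column or from a sequential scan. The following is a minimal sketch for verifying the plan and, if it matters, trying a B-tree expression index. The index name is hypothetical, and the query is rewritten to use ->> so that it matches the indexed expression exactly.

```sql
-- Show the actual plan: look for which index (if any) the planner uses.
EXPLAIN (ANALYZE, BUFFERS)
SELECT date, name, country
FROM analysis
WHERE date > '2014-01-01'
  AND date < '2014-05-01'
  AND cast(x #>> '{a_dictionary_key}' AS float) > 100;

-- Hypothetical B-tree expression index on the value being compared.
-- The extra parentheses around the expression are required.
-- Building it fails if any row holds a non-numeric value under this key.
CREATE INDEX analysis_a_dictionary_key_idx
    ON analysis (((x ->> 'a_dictionary_key')::float));

-- The WHERE clause must use the same expression for the index to be considered.
SELECT date, name, country
FROM analysis
WHERE date > '2014-01-01'
  AND date < '2014-05-01'
  AND (x ->> 'a_dictionary_key')::float > 100;
```

Whether the planner prefers the expression index over the date range depends on how selective each condition is, so comparing the EXPLAIN output before and after is the safer guide.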

Retrieving Dictionary Values

However, as soon as I add keys to return in the SELECT list, the execution time increases almost linearly:

1 value: 300 ms

 select jsonb_extract_path(x, 'a_dictionary_key') from analysis where date > '2014-01-01' and date < '2014-05-01' and cast(x#>> '{a_dictionary_key}' as float) > 100; 

Takes 366 ms (+175 ms)

 select x#>'{a_dictionary_key}' as gear_down_altitude from analysis where date > '2014-01-01' and date < '2014-05-01' and cast(x#>> '{a_dictionary_key}' as float) > 100 ; 

Takes 300 ms (+110 ms)

3 values: 600 ms

 select jsonb_extract_path(x, 'a_dictionary_key'),
        jsonb_extract_path(x, 'a_second_dictionary_key'),
        jsonb_extract_path(x, 'a_third_dictionary_key')
 from analysis
 where date > '2014-01-01' and date < '2014-05-01'
   and cast(x#>> '{a_dictionary_key}' as float) > 100;

Takes 600 ms (+410 ms, or roughly +100 ms for each selected value)

 select x#>'{a_dictionary_key}' as a_dictionary_key,
        x#>'{a_second_dictionary_key}' as a_second_dictionary_key,
        x#>'{a_third_dictionary_key}' as a_third_dictionary_key
 from analysis
 where date > '2014-01-01' and date < '2014-05-01'
   and cast(x#>> '{a_dictionary_key}' as float) > 100;

Takes 600 ms (+410 ms, or roughly +100 ms for each selected value)
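
One plausible explanation for the near-linear growth, offered as a hypothesis rather than a diagnosis: values larger than roughly 2 kB are normally stored out of line (TOAST) and often compressed, and each extraction expression in the SELECT list may fetch and decompress x again on its own. A quick sketch for checking whether the stored values are large enough for this to apply:

```sql
-- Stored (possibly compressed) size of x per row, in bytes,
-- for the same date range as the queries above.
SELECT avg(pg_column_size(x)) AS avg_stored_bytes,
       max(pg_column_size(x)) AS max_stored_bytes
FROM analysis
WHERE date > '2014-01-01'
  AND date < '2014-05-01';
```

If the sizes turn out to be well below the TOAST threshold, this explanation is unlikely and the cost lies elsewhere.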

Getting More Values

Is there a way to get multiple values from a jsonb field, for example with an imaginary function:

 jsonb_extract_path(x, ARRAY['a_dictionary_key', 'a_second_dictionary_key', 'a_third_dictionary_key']) 

This could speed up the lookup. It could return the values either as columns, as a list/array, or even as a json object.
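
The closest built-in to this in 9.4 is probably jsonb_to_record, which expands one jsonb object into typed columns in a single call. The sketch below assumes the three values can be read as text (the column types in the AS clause are a guess, not something stated above), and whether it is actually faster than three separate #> expressions would need to be measured.

```sql
-- One function call per row returns all requested keys as columns.
-- Keys missing from x come back as NULL.
SELECT r.a_dictionary_key,
       r.a_second_dictionary_key,
       r.a_third_dictionary_key
FROM analysis AS a,
     jsonb_to_record(a.x) AS r(
         a_dictionary_key        text,  -- assumed type
         a_second_dictionary_key text,  -- assumed type
         a_third_dictionary_key  text   -- assumed type
     )
WHERE a.date > '2014-01-01'
  AND a.date < '2014-05-01'
  AND cast(a.x #>> '{a_dictionary_key}' AS float) > 100;
```

The function call in FROM may refer to a.x because functions are implicitly LATERAL; the column names in the AS clause must match the jsonb keys exactly.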

Getting an array using PL/Python

Just to try this, I wrote a custom function using PL/Python, but it was much slower (5 s+), possibly due to json.loads:

 CREATE OR REPLACE FUNCTION retrieve_objects(data jsonb, k VARCHAR[]) RETURNS TEXT[] AS $$
 if not data:
     return []
 import simplejson as json
 # In 9.4, jsonb arrives in PL/Python as its text representation, so it is parsed here
 j = json.loads(data)
 l = []
 for i in k:
     l.append(j[i])
 return l
 $$ LANGUAGE plpython2u;

 -- Usage:
 -- select retrieve_objects(x, ARRAY['a_dictionary_key', 'a_second_dictionary_key', 'a_third_dictionary_key']) from analysis where date > '2014-01-01' and date < '2014-05-01'

Update 2015-05-21

I rebuilt the table using hstore with a GIN index, and the performance is almost identical to jsonb, so it does not help in my case.
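
For reference, hstore does ship an operator of the shape asked for above: hstore -> text[] returns the values for several keys as one text[]. A minimal sketch, assuming the hstore extension is installed and that the rebuilt table is called analysis_hstore (a hypothetical name) with x of type hstore. Given the timings reported for hstore, it may well not be faster.

```sql
-- Fetch three values in a single operator call; missing keys yield NULL elements.
SELECT x -> ARRAY['a_dictionary_key',
                  'a_second_dictionary_key',
                  'a_third_dictionary_key'] AS vals
FROM analysis_hstore  -- hypothetical table name
WHERE date > '2014-01-01'
  AND date < '2014-05-01'
  AND cast(x -> 'a_dictionary_key' AS float) > 100;
```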

1 answer

You are using the #> operator, which looks like it performs a path search. Have you tried the plain key lookup operator ->? Like this:

 select json_column->'json_field1' , json_column->'json_field2' 

It would be interesting to see what happens if you use a temporary table. Like this:

 create temporary table tmp_doclist (doc jsonb);

 insert into tmp_doclist (doc)
 select x from analysis where ... your conditions here ...;

 select doc->'col1', doc->'col2', doc->'col3' from tmp_doclist;