PostgreSQL hstore key/value versus traditional SQL performance

Question

I need to develop a backend for keys / values, something like this:

Table T1: id (PK), Key (string), Value (string)

 INSERT INTO T1 (Key, Value) VALUES ('String1', 'Value1');
 INSERT INTO T1 (Key, Value) VALUES ('String1', 'Value2');

Table T2: id (PK2), id2 (foreign key to T1.id), plus some other data in T2 that references data in T1 (for example, users that have those K/V pairs).

I have heard about PostgreSQL hstore with GIN / GiST indexes. Which is better in terms of performance: hstore, or the traditional way with SQL joins and separate Key / Value columns? Does PostgreSQL hstore improve performance in this case?

The data format must be any key => any value. I also want to do text matching, for example partial searches (LIKE '%...' in SQL, or the hstore equivalent). I plan to have 1M-2M records in it and will probably scale at some point.

What do you recommend? Going with traditional SQL, PostgreSQL hstore, or some other distributed key/value store with persistence?

If it helps, my server is a VPS with 1-2 GB of RAM, so the hardware is not very good. I also thought about putting a cache layer on top of this, but I think that complicates the problem. I just want good performance for 2M records. Updates will be frequent, but searches will be more frequent.

Thanks.

performance sql postgresql key

florinp Feb 28 '12 at 18:33

2 answers

Elliot chance · Answer 1 · 2012-07-04T05:05:26+0000

Your question is unclear because you have not stated your objective clearly.

The key here is the index (pun intended): if you are dealing with a large number of keys, you want to retrieve them with the fewest lookups and without pulling in unrelated data.

Short answer: you probably do not want to use hstore, but let's look in more detail...

Will each id have many key/value pairs (hundreds or more)? Do not use hstore.
Will any of your values contain large blocks of text (4 kB or more)? Do not use hstore.
Do you want to be able to search by key with wildcard expressions? Do not use hstore.
Do you want to do complex joins / aggregations / reports? Do not use hstore.
Will you be updating the value for just one key? Do not use hstore.
Multiple keys with the same name under one id? You cannot use hstore.
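A few of these points can be seen directly in psql. The following is only a minimal sketch: it assumes PostgreSQL 9.3 or later with the hstore contrib module available, and the table `hs_demo` and its data are hypothetical, not taken from the question.

```sql
-- Assumption: the hstore contrib module is installed on the server.
CREATE EXTENSION IF NOT EXISTS hstore;

-- Hypothetical demo table (not from the question).
CREATE TABLE hs_demo (
    id    integer PRIMARY KEY,
    props hstore NOT NULL
);

-- Duplicate keys collapse: only one 'color' pair is stored, and
-- PostgreSQL does not guarantee which of the two is kept.
INSERT INTO hs_demo VALUES (1, 'color=>red, color=>blue, size=>L');
SELECT props FROM hs_demo WHERE id = 1;

-- Changing a single key still writes a whole new hstore value
-- (and therefore a new row version) for that id.
UPDATE hs_demo
SET    props = props || hstore('size', 'XL')
WHERE  id = 1;

-- There is no wildcard operator on keys. One workaround is to expand
-- every pair with each(), which cannot use a GIN/GiST index on props.
SELECT d.id, kv.key, kv.value
FROM   hs_demo AS d, each(d.props) AS kv
WHERE  kv.key LIKE 'col%';
```

The last query has to expand every row's pairs, which is the main reason wildcard key searches tend to favour a plain key/value table.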

So what is hstore good for? Well, one good scenario would be if you wanted to hold key/value pairs for an external application, where you know that you always want to retrieve all the keys/values and will always save the data back as a block (i.e., it is never edited in place). At the same time, you want some flexibility to search this data, if only in simple ways, rather than storing it in a block of XML or JSON. In this case, since the number of key/value pairs is small, you save space because you are compressing several tuples into one hstore.

Consider this as your table:
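That scenario might look roughly like the sketch below. The table `app_settings`, its columns, and the sample pairs are assumptions made for illustration; only the hstore operators and the GIN index type come from PostgreSQL itself.

```sql
-- Assumption: the hstore contrib module is installed on the server.
CREATE EXTENSION IF NOT EXISTS hstore;

-- Hypothetical table: one row per external object, all pairs in one column.
CREATE TABLE app_settings (
    object_id integer PRIMARY KEY,
    settings  hstore NOT NULL
);

-- A GIN index on an hstore column supports the @>, ?, ?& and ?| operators.
CREATE INDEX app_settings_gin ON app_settings USING GIN (settings);

-- Save the whole block at once...
INSERT INTO app_settings VALUES (42, 'theme=>dark, lang=>en');

-- ...and replace the whole block on the next save.
UPDATE app_settings
SET    settings = 'theme=>light, lang=>en'
WHERE  object_id = 42;

-- Retrieve every pair for one object.
SELECT settings FROM app_settings WHERE object_id = 42;

-- Simple searches: containment of a pair, and existence of a key.
SELECT object_id FROM app_settings WHERE settings @> 'theme=>light';
SELECT object_id FROM app_settings WHERE settings ? 'lang';
```

Whether the planner actually uses the GIN index depends on table size and statistics, so checking with EXPLAIN on realistic data is advisable.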

 CREATE TABLE kv (
   id /* SOME TYPE */ PRIMARY KEY,
   key_name TEXT NOT NULL,
   key_value TEXT,
   UNIQUE(id, key_name)
 );
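Queries against a table of this shape could be sketched as follows. The concrete choices here are assumptions, not part of the answer: the table is named `kv_demo` to keep it separate, `id` is taken to be a `bigint`, and the key is declared on `(id, key_name)` so that one id can carry several keys.

```sql
-- Assumptions: bigint id, composite key so one id can hold many keys.
CREATE TABLE kv_demo (
    id        bigint NOT NULL,
    key_name  text   NOT NULL,
    key_value text,
    PRIMARY KEY (id, key_name)
);

-- text_pattern_ops lets a b-tree index serve left-anchored LIKE patterns
-- even when the database locale is not C.
CREATE INDEX kv_demo_name_value_idx
    ON kv_demo (key_name, key_value text_pattern_ops);

INSERT INTO kv_demo VALUES
    (1, 'color', 'blue'),
    (1, 'size',  'L'),
    (2, 'color', 'black');

-- All pairs for one id (served by the primary key index).
SELECT key_name, key_value FROM kv_demo WHERE id = 1;

-- Ids whose 'color' value starts with 'bl'.
SELECT id
FROM   kv_demo
WHERE  key_name = 'color'
AND    key_value LIKE 'bl%';

-- Updating one key touches only that one small row.
UPDATE kv_demo
SET    key_value = 'XL'
WHERE  id = 1 AND key_name = 'size';
```

A pattern with a leading wildcard, such as `'%lu%'`, cannot use this b-tree index; a trigram index from the pg_trgm module is one commonly used option for that case.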

kgrittn · Answer 2 · 2013-07-10T15:33:04+0000

I think the design is poorly normalized. Try something more like this:

 CREATE TABLE t1 (
   t1_id serial PRIMARY KEY,
   <other data which depends on t1_id and nothing else>,
   -- possibly an hstore, but maybe better as a separate table
   t1_props hstore
 );

 -- if properties are done as a separate table:
 CREATE TABLE t1_properties (
   t1_id int NOT NULL REFERENCES t1,
   key_name text NOT NULL,
   key_value text,
   PRIMARY KEY (t1_id, key_name)
 );

If the properties are small and you don't need to use them heavily in joins or in selection criteria, hstore might be enough. Elliot outlined some reasonable things to consider in this regard.

Your reference to users suggests that this design is incomplete, but you really did not give enough information to suggest where they belong. You might get by with an array in t1, or you might be better off with a separate table.
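To make the join and aggregation case concrete, here is a small sketch using the separate-table variant. The `name` column stands in for the unspecified "other data" and, like the sample rows, is purely hypothetical.

```sql
-- Hypothetical stand-in for the parent table; "name" is a made-up column.
CREATE TABLE t1 (
    t1_id serial PRIMARY KEY,
    name  text NOT NULL
);

CREATE TABLE t1_properties (
    t1_id     int  NOT NULL REFERENCES t1,
    key_name  text NOT NULL,
    key_value text,
    PRIMARY KEY (t1_id, key_name)
);

INSERT INTO t1 (name) VALUES ('alice'), ('bob');

-- Look the generated ids up by name rather than guessing them.
INSERT INTO t1_properties
SELECT t1_id, 'role', 'admin' FROM t1 WHERE name = 'alice';
INSERT INTO t1_properties
SELECT t1_id, 'role', 'guest' FROM t1 WHERE name = 'bob';

-- Join: parent rows that have a given property value.
SELECT t.t1_id, t.name
FROM   t1 AS t
JOIN   t1_properties AS p USING (t1_id)
WHERE  p.key_name = 'role'
AND    p.key_value = 'admin';

-- Aggregation: how many rows carry each value of one key.
SELECT p.key_value, count(*) AS n
FROM   t1_properties AS p
WHERE  p.key_name = 'role'
GROUP  BY p.key_value
ORDER  BY n DESC;
```

With the hstore column instead, the first query would become a containment test such as `t1_props @> 'role=>admin'`, while the grouping query would need the pairs expanded first.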
