Key/value pairs in a relational database

Does anyone have experience storing key-value pairs in a database?

I am using this type of table:

 CREATE TABLE key_value_pairs (
     itemid    varchar(32) NOT NULL,
     itemkey   varchar(32) NOT NULL,
     itemvalue varchar(32) NOT NULL,
     CONSTRAINT ct_primarykey PRIMARY KEY (itemid, itemkey)
 )

Then, for example, the following lines may exist:

  itemid  itemkey  itemvalue
  ------  -------  ---------
  123     Colour   Red
  123     Size     Medium
  123     Fabric   Cotton

The problem with this schema is that the SQL needed to retrieve the data is quite complex. Would it be better to just create a series of key/value columns instead?

 CREATE TABLE key_value_pairs (
     itemid     varchar(32) NOT NULL,
     itemkey1   varchar(32) NOT NULL,
     itemvalue1 varchar(32) NOT NULL,
     itemkey2   varchar(32) NOT NULL,
     itemvalue2 varchar(32) NOT NULL,
     ...etc...
 )

This would be easier and faster to query, but it lacks the extensibility of the first approach. Any tips?
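For reference, here is a runnable sketch of the schema above together with the pivot-style query that makes retrieval feel complex. SQLite via Python is used purely for illustration; the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE key_value_pairs (
        itemid    TEXT NOT NULL,
        itemkey   TEXT NOT NULL,
        itemvalue TEXT NOT NULL,
        PRIMARY KEY (itemid, itemkey)
    )""")
conn.executemany(
    "INSERT INTO key_value_pairs VALUES (?, ?, ?)",
    [("123", "Colour", "Red"), ("123", "Size", "Medium"), ("123", "Fabric", "Cotton")],
)

# Pivoting the rows back into columns needs one conditional
# aggregate per key -- this is the part that gets verbose.
row = conn.execute("""
    SELECT itemid,
           MAX(CASE WHEN itemkey = 'Colour' THEN itemvalue END) AS colour,
           MAX(CASE WHEN itemkey = 'Size'   THEN itemvalue END) AS size,
           MAX(CASE WHEN itemkey = 'Fabric' THEN itemvalue END) AS fabric
    FROM key_value_pairs
    WHERE itemid = '123'
    GROUP BY itemid
""").fetchone()
print(row)  # ('123', 'Red', 'Medium', 'Cotton')
```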

+66
sql database
Sep 24 '08 at 9:50
18 answers

Before you continue down this road, I humbly suggest you step back and consider whether you really want to store this data in a key/value pair table. I don't know your application, but my experience has been that every time I have done what you are doing, I later wished I had created a Colour table, a Fabric table, and a Size table.

Think about referential integrity constraints: with the key/value pair approach, the database cannot tell you anything when you try to store a colour identifier in the size field.

Think about the performance of a join against a lookup table with 10 rows compared to a generic value table that may hold thousands of values across several domains. How selective will an index on the key value really be?

Usually the justification for this design is that the domains need to be "user-defined". If that is the case, I am not even going to push you towards creating tables on the fly (although that is a workable approach).

However, if your reasoning is that you think it will be easier to manage than several tables, or that you anticipate building a maintenance user interface that is generic across all domains, then stop and think very hard before continuing.

+115
Sep 24 '08 at 13:37

There is another solution somewhere in between. You can use an XML-typed column for the keys and values: keep the itemid field, then have an XML field containing the key/value pairs, e.g. <items><item key="colour" value="red"/><item key="xxx" value="blah"/></items>. When you retrieve the data from the database you can then process the XML in several ways, depending on your usage. This is a workable middle-ground solution.
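As a sketch of this in-between approach, here is how such an XML payload might be unpacked on the application side. The payload below is a hypothetical example in the format the answer suggests:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payload in the <items>/<item key= value=> format above.
payload = '<items><item key="colour" value="red"/><item key="fabric" value="cotton"/></items>'

# Each <item> element carries one key/value pair as attributes.
pairs = {item.get("key"): item.get("value") for item in ET.fromstring(payload)}
print(pairs)  # {'colour': 'red', 'fabric': 'cotton'}
```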

+16
Sep 24 '08 at 10:29

In most cases where the first method gets used, it is because you have not actually sat down and thought through your model: "Well, we don't know yet what the keys will be." In general, that is a pretty bad sign, and it will be slower than using real columns the way they are meant to be used.

I would also ask why your id is a varchar.

In the rare case where you really do need a key/value table, the first solution is fine, although I would keep the keys in a separate table so that you are not storing varchars as keys in your key/value table.

eg,

 CREATE TABLE valid_keys (
     id          NUMBER(10)   NOT NULL,
     description VARCHAR2(32) NOT NULL,
     CONSTRAINT pk_valid_keys PRIMARY KEY (id)
 );

 CREATE TABLE item_values (
     item_id    NUMBER(10)   NOT NULL,
     key_id     NUMBER(10)   NOT NULL,
     item_value VARCHAR2(32) NOT NULL,
     -- composite key: one row per (item, key) pair
     CONSTRAINT pk_item_values PRIMARY KEY (item_id, key_id),
     CONSTRAINT fk_item_values_iv FOREIGN KEY (key_id) REFERENCES valid_keys (id)
 );

Then you can even go wild and add a TYPE column to the keys, which allows some type checking.
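A minimal sketch of what the valid_keys table buys you: the database rejects a value stored under an unknown key. SQLite via Python is used for illustration (SQLite requires foreign-key enforcement to be switched on per connection; column types are simplified):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE valid_keys (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL
    );
    CREATE TABLE item_values (
        item_id    INTEGER NOT NULL,
        key_id     INTEGER NOT NULL,
        item_value TEXT NOT NULL,
        PRIMARY KEY (item_id, key_id),
        FOREIGN KEY (key_id) REFERENCES valid_keys (id)
    );
""")
conn.execute("INSERT INTO valid_keys VALUES (1, 'Colour')")
conn.execute("INSERT INTO item_values VALUES (123, 1, 'Red')")  # known key: accepted

try:
    conn.execute("INSERT INTO item_values VALUES (123, 99, 'Red')")  # unknown key
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)  # the FK constraint catches the bad key
```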

+14
Sep 24 '08 at 12:11

I once used key/value pairs in a database to build a spreadsheet-style data-entry screen on which a cashier summarized the day's activity for a cash drawer. Each k/v pair was a named cell into which the user entered an amount of money. The main reason for this approach was that the sheet was highly susceptible to change: new products and services were added regularly (new cells appeared), and certain cells were not needed in certain situations and could be dropped.

The application I wrote was a rewrite of one that had broken the cash sheet into separate sections, each represented by its own table. The problem there was that adding products and services required schema changes. As with all design choices, there are pros and cons to each direction. My redesign certainly ran more slowly and consumed disk space faster; on the other hand, it was very agile and let us add new products and services in minutes. The only real complaint was disk consumption; I cannot remember any other headaches.

As already mentioned, the situation in which I usually consider the key/value pair approach is when users (this might be the business owner) want to create their own types, each with its own user-defined set of attributes. In such situations I have arrived at the following rules.

If there is no need to query by these attributes, or the lookup can be done by the application after retrieving the record, I recommend storing all the attributes in a single text field (as JSON, YAML, XML, etc.). If there is a real need to query by these attributes, it gets uglier.

You can create a single attribute table (id, item_id, key, value, data_type, sort_value), where the sort column holds the actual value rendered in a string-sortable form (e.g. date: "2010-12-25 12:00:00", number: "0000000001"). Or you can create separate attribute tables per data type (e.g. string_attributes, date_attributes, number_attributes). Among the many pros and cons of the two approaches: the first is simpler, the second is faster. Both will have you writing ugly, complex queries.
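A small sketch of the sort_value idea: render every value into a string whose lexicographic order matches its natural order. The 10-digit padding width is an assumption, matching the "0000000001" example above:

```python
from datetime import datetime

def sort_value(value):
    """Render a value so that plain string ordering matches natural ordering.

    Conventions from the answer: numbers are zero-padded to a fixed width,
    dates are rendered as 'YYYY-MM-DD HH:MM:SS'. The width of 10 digits
    is an assumption for illustration.
    """
    if isinstance(value, bool):          # bool is an int subclass; exclude it
        return str(value)
    if isinstance(value, int):
        return f"{value:010d}"
    if isinstance(value, datetime):
        return value.strftime("%Y-%m-%d %H:%M:%S")
    return str(value)

# Raw string sorting puts numbers in the wrong order...
print(sorted(["2", "100", "25"]))  # ['100', '2', '25']
# ...but sorting by the padded rendering restores numeric order.
print(sorted(["2", "100", "25"], key=lambda s: sort_value(int(s))))  # ['2', '25', '100']
```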

+13
Oct 12 '09 at 18:57

From experience, I have found that some keys are used or queried much more often than others. In that situation we have usually de-normalized the design slightly and put those specific fields on the main "item" table.

E.g. if every item has a colour, you can add a Colour column to the item table. Fabric and size may be used less often and can stay in the key/value pair table. You could even keep colour in the key/value pair table as well, but duplicate the data in the item table for the performance benefit.

Obviously this depends on the data and on how flexible the key/value pairs need to be. It can also mean your attribute data no longer lives in one place. Still, de-normalization greatly simplifies the queries and improves their performance.

I would usually only consider de-normalizing when performance becomes a problem, not merely to simplify a query.
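A sketch of the hybrid layout this answer describes, with colour promoted to a real column and the rarer attributes left in the key/value table (SQLite via Python; table and column names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The frequently queried attribute is promoted to a real column...
    CREATE TABLE item (
        itemid TEXT PRIMARY KEY,
        colour TEXT
    );
    -- ...while rarer attributes stay in the key/value table.
    CREATE TABLE key_value_pairs (
        itemid    TEXT NOT NULL REFERENCES item (itemid),
        itemkey   TEXT NOT NULL,
        itemvalue TEXT NOT NULL,
        PRIMARY KEY (itemid, itemkey)
    );
""")
conn.execute("INSERT INTO item VALUES ('123', 'Red')")
conn.executemany("INSERT INTO key_value_pairs VALUES (?, ?, ?)",
                 [("123", "Size", "Medium"), ("123", "Fabric", "Cotton")])

# The common filter now runs against an ordinary column, no join needed.
reds = conn.execute("SELECT itemid FROM item WHERE colour = 'Red'").fetchall()
print(reds)  # [('123',)]
```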

+5
Sep 24 '08 at 9:56

I do not understand why the SQL to extract the data should be difficult with your first design. Surely, to get all the values for an item you simply do:

 SELECT itemkey,itemvalue FROM key_value_pairs WHERE itemid='123'; 

or, if you need only one specific key for that item:

 SELECT itemvalue FROM key_value_pairs WHERE itemid='123' AND itemkey='Fabric'; 

The first design also gives you the ability to easily add new keys whenever you want.

+2
Sep 24 '08 at 10:24

I think the best way to design such tables is as follows:

  • Make the commonly used fields real columns in the database.
  • Provide a Misc column that holds a dictionary (in JSON / XML / some other string format) containing the remaining fields as key/value pairs.

Highlights:

  • You can write regular SQL queries against the real columns in most situations.
  • You can use full-text search over the key/value pairs. MySQL has a full-text search engine; otherwise you can fall back to LIKE queries, which are somewhat slower. Full-text search over a blob is crude, but the assumption is that such queries are rare, so this should not cause too many problems.
  • If your key/value pairs are simple Boolean flags, this method works about as well as a separate column per key. Any more complex operation on the key/value pairs has to be done outside the database.
  • By watching query frequency over a period of time, you can tell which key/value pairs deserve to be promoted to real columns.
  • This method also keeps database integrity constraints manageable.
  • It gives developers a natural path for evolving their schema and code.
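A sketch of the Misc-column idea using JSON, including the LIKE fallback mentioned above (SQLite via Python; table and field names are hypothetical):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE item (
        itemid TEXT PRIMARY KEY,
        colour TEXT,   -- commonly used field as a real column
        misc   TEXT    -- rarely used fields as a JSON dictionary
    )""")
conn.execute("INSERT INTO item VALUES (?, ?, ?)",
             ("123", "Red", json.dumps({"fabric": "cotton", "gift_wrap": True})))

# Crude text search over the JSON blob -- the slower LIKE fallback.
hits = conn.execute(
    "SELECT itemid, misc FROM item WHERE misc LIKE '%\"fabric\": \"cotton\"%'"
).fetchall()

# Anything more complex is done in the application after decoding.
misc = json.loads(hits[0][1])
print(misc["fabric"])  # cotton
```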
+2
May 27 '09 at 16:48

The first method is quite normal. You can create a UDF that retrieves the data you need and simply call it.

+1
Sep 24 '08 at 9:53

If you have very few possible keys, I would just save them as columns. But if the set of possible keys is large, then your first approach is good (and the second approach will be impossible).

Or is it that each item can have only a limited number of keys, but the keys can come from a large set?

You may also consider using object relational mapping to simplify queries.

+1
Sep 24 '08 at 10:01

The first method is much more flexible, at the price you mentioned.

And the second approach will never be viable as you have shown it. Instead, you would do (following your first example):

 create table item_config (
     item_id int,
     colour  varchar,
     size    varchar,
     fabric  varchar
 )

Of course, this only works when the set of attributes is known and does not change much.

As a rule, any application that requires DDL changes to its tables during normal operation deserves a second and third thought.

+1
Sep 24 '08 at 10:01

Breaking normalization rules is fine as long as the business requirement can still be met. Having key_1, value_1, key_2, value_2, ... key_n, value_n can be OK, right up to the point where you need key_n+1, value_n+1.

My solution was a data table for common attributes and XML for unique attributes; that is, I used both. If everything (or nearly everything) has a size, then size is a column in the table. If only object A has attribute Z, then Z is stored as XML, much like the answer Peter Marshall already gave.

+1
Sep 24 '08 at 10:53

PostgreSQL 8.4 supports the hstore data type for storing sets of (key, value) pairs within a single PostgreSQL field. See http://www.postgresql.org/docs/8.4/static/hstore.html for usage information. Although this is a very old question, I ran across it and thought this might help someone.

+1
Sep 22 '15 at 10:20

The second table would be badly denormalized. I would take the first approach.

0
Sep 24 '08 at 9:57

I think you are doing the right thing if the keys/values for a particular type of item change often.
If they are mostly static, just make the item table wider instead.

We use a similar (but considerably more complex) approach, with a lot of logic around the keys/values, as well as tables for the types of values allowed for each key.
This lets us define items as just another instance of a key, and our central table maps arbitrary key types to other key types. It can quickly tie your brain in knots, but once you have written and encapsulated the logic to handle it all, you have a great deal of flexibility.

I can write more details about what we do if necessary.

0
Sep 24 '08 at 10:22

If the keys are dynamic, or there are many of them, use the mapping table you already have as your first example. It is the most general solution, it scales best as you add more keys, the SQL to fetch the data is easy to write, and the database can optimize the query better than you might expect. (In other words, don't bother prematurely optimizing this case unless testing later proves it to be a bottleneck, at which point you could consider the two options below.)

If the keys are a known set and there are not many of them (< 10, perhaps < 5), then I see no problem with making them value columns on the item.

If there is a moderate number of known, fixed keys (say 10-30), perhaps keep them in a separate item_details table.

However, I never see a need for your second sample structure; it looks unwieldy.

0
Sep 24 '08 at 10:28

If you do go the KVP-table route (and I have to say I don't like the technique at all, since it is genuinely hard to query), then you should consider clustering the values for a single item id together, using whatever mechanism is appropriate for the platform you are on.

RDBMSs tend to scatter rows around to avoid block contention on inserts, so if you have 8 rows to retrieve you can easily end up reading 8 separate table blocks. In Oracle, you might consider a hash cluster for storing them, which can greatly improve performance when accessing the values for an item.

0
Sep 25

Times have changed. You now have other kinds of databases you can use alongside relational ones. NoSQL options now include column stores, document stores, graph databases, and multi-model databases (see http://en.wikipedia.org/wiki/NoSQL ).

For key/value style storage, your choices include (but are not limited to) CouchDB, Redis, and MongoDB.

0
Mar 06 '15 at 23:34

Your example is not a great fit for key/value pairs. A better example would be something like a Fee table, a Customer table, and a Customer_Fee table in a billing application. The Fee table would have fields such as fee_id, fee_name, fee_description; the Customer_Fee table would have fields such as customer_id, fee_id, fee_value.

-1
Nov 28 '08 at 8:04


