Use hstore or multiple columns?

I'm having trouble deciding which approach to use.

I have several types of entities, call them A, B and C, which share a certain number of attributes (about 10-15). I created a table called ENTITIES with a column for each of the common attributes.

A, B and C also have some (mostly) unique attributes (all Boolean, anywhere from 10 to 30 per type). I'm not sure how best to model this in tables:

  1. Create a column in the ENTITIES table for each attribute, which means entity types that don't share an attribute will have a NULL value in that column.
  2. Use a separate table for the unique attributes of each entity type, which is a bit harder to manage.
  3. Use an hstore column; each entity stores its unique flags in this column.
  4. ???

I'm leaning toward option 3, but I'd like to know if there is a better solution.


(4) Inheritance

The cleanest approach in terms of database design is probably table inheritance, as @yieldsfalsehood suggested in his comment. Here is a related answer with more information, code and links:
Select (retrieve) all records from multiple schemas using Postgres
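A minimal sketch of what the inheritance approach could look like (table and column names are hypothetical, not from the question):

```sql
-- Common attributes live in the parent table.
CREATE TABLE entities (
    entity_id serial PRIMARY KEY,
    name      text NOT NULL,
    created   timestamptz NOT NULL DEFAULT now()
);

-- Each entity type gets a child table with its unique Boolean flags.
-- Child rows are visible when you query the parent table.
CREATE TABLE entity_a (
    flag_x boolean,
    flag_y boolean
) INHERITS (entities);
```

Note that indexes and constraints such as the PRIMARY KEY above are *not* inherited by child tables, which is exactly the kind of limitation discussed next.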

The current implementation of inheritance in Postgres has a number of limitations. In particular, you cannot define foreign key constraints that span all inheriting tables. Read the final section on caveats carefully.

(3) hstore, json (pg 9.2+) / jsonb (pg 9.4+)

A good alternative for many distinct or frequently changing attributes, especially since you can even have functional indexes on attributes inside the column.
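As a sketch of that idea (column and flag names are assumptions for illustration), the unique flags could go into a single `jsonb` column with an expression index on a flag you query often:

```sql
-- Store each entity's unique flags in one jsonb column.
ALTER TABLE entities ADD COLUMN flags jsonb NOT NULL DEFAULT '{}';

-- Expression ("functional") index on one flag inside the column, so
-- WHERE (flags ->> 'is_active')::boolean queries can use an index.
CREATE INDEX entities_is_active_idx
    ON entities (((flags ->> 'is_active')::boolean));
```

The same pattern works with `hstore` (using `flags -> 'is_active'`), and with `jsonb` you can additionally use GIN indexes for containment queries.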

(2) EAV

The EAV (entity-attribute-value) storage model has its own set of advantages and disadvantages. This question on dba.SE gives a very good overview.
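For reference, a minimal EAV sketch (hypothetical names): one row per entity/attribute pair instead of one column per attribute:

```sql
-- Each Boolean flag becomes a row; attributes can be added
-- without any schema change, at the cost of weaker typing.
CREATE TABLE entity_attributes (
    entity_id int     NOT NULL REFERENCES entities,
    attr_name text    NOT NULL,
    attr_val  boolean NOT NULL,
    PRIMARY KEY (entity_id, attr_name)
);
```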

(1) One table with many columns

This is the simple brute-force alternative. Judging by your description, you would end up with about 100 columns, most of them Boolean and most of them NULL most of the time. Add an entity_type column to mark the type. Enforcing constraints per type is a bit awkward with that many columns; I wouldn't bother with constraints that may not be needed.

The maximum number of columns allowed is 1600, so you are nowhere near the limit. As long as you stay in the range of 100-200 columns, I wouldn't worry. NULL storage is very cheap in Postgres (essentially 1 bit per column, though it's more complicated than that): only 10-20 bytes per row here. Contrary to what one might expect, this is probably much smaller on disk than the hstore solution.

Although such a table looks monstrous to the human eye, it is no problem for Postgres; RDBMSs specialize in brute force. You can define a set of views (one per entity type) on top of the base table, exposing only the columns of interest, and work with those where applicable. That's like the inheritance approach in reverse, but this way you can have shared indexes, foreign keys, etc. Not that bad. I could live with it.
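A sketch of such a per-type view (the `entity_type` marker and the `a_flag*` columns are assumed names, not from the question):

```sql
-- Expose only the columns relevant to entity type 'A'.
-- Queries, and simple updates, can target the view instead
-- of the wide base table.
CREATE VIEW entities_a AS
SELECT entity_id, name, a_flag1, a_flag2
FROM   entities
WHERE  entity_type = 'A';
```

Since it is a plain single-table view, it remains updatable in modern Postgres, so inserts and updates through the view write to the base table.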

All that said, the decision is still yours. It all depends on the details of your requirements.
