Comparing two db constructs for internal messaging

Which of the following db options would be preferable for an internal messaging system?

Three tables:

MessageThread(models.Model):
 - subject
 - timestamp
 - creator

Message(models.Model):
 - thread (pk)
 - content
 - timestamp
 - sender

MessageRecipient
 - message_id (pk)
 - recipient (pk)
 - status (read, unread, deleted)

Two tables:

Message
 - thread_id
 - subject
 - content
 - timestamp
 - sender (fk)

MessageRecipient
 - message_id (fk)
 - recipient (fk)
 - status (read, unread, deleted)

What would be the advantages of one over the other? Thanks.

2 answers

Strengths of the first

The first scheme is better normalized and is therefore probably the better choice in most cases.

Having a thread_id that is essentially a natural key, not an FK to another table, is asking for trouble. It will be very difficult to ensure that it is unique when you want it to be unique, and the same when you want it to be the same. For this reason, I would recommend the first proposed scheme.

Strengths of the second

The second scheme allows the subject to change with each message in the thread. If that is a feature you want, you cannot use the first option as you wrote it (but see below).

Other options

Message
 - id
 - parent (fk to Message.id)
 - subject
 - content
 - timestamp
 - sender (fk)

MessageRecipient
 - message_id (fk)
 - recipient (fk)
 - status (read, unread, deleted)

Instead of the thread_id concept, you can use a parent concept. Each response then points to the message it replies to. This allows threading without a “thread” table. Another possible advantage is that it also allows thread trees. Simply put, you can represent much more complex relationships between messages and responses this way. If you don't need that, it won't be a bonus for your application.

If you don't care about the threading benefits I just mentioned, I would probably recommend a hybrid of your two schemes:
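To illustrate how the parent column gives you a thread tree, here is a minimal sketch using Python's built-in sqlite3 module and a recursive query. The table layout is trimmed down from the scheme above, and the sample rows and the thread_tree helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE message (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES message(id),  -- NULL for the first message
        subject   TEXT NOT NULL,
        content   TEXT NOT NULL
    );
    -- Hypothetical sample data: one thread with nested replies, one unrelated message.
    INSERT INTO message (id, parent_id, subject, content) VALUES
        (1, NULL, 'Lunch?',     'Anyone free at noon?'),
        (2, 1,    'Re: Lunch?', 'I am.'),
        (3, 1,    'Re: Lunch?', 'Not today.'),
        (4, 2,    'Re: Lunch?', 'Great, see you then.'),
        (5, NULL, 'Other',      'Unrelated thread.');
""")

def thread_tree(conn, root_id):
    """Return (id, depth, content) for the root message and all of its descendants."""
    return conn.execute("""
        WITH RECURSIVE tree(id, depth, content) AS (
            SELECT id, 0, content FROM message WHERE id = ?
            UNION ALL
            SELECT m.id, tree.depth + 1, m.content
            FROM message AS m JOIN tree ON m.parent_id = tree.id
        )
        SELECT id, depth, content FROM tree ORDER BY id
    """, (root_id,)).fetchall()

rows = thread_tree(conn, 1)
for msg_id, depth, content in rows:
    # Indentation shows reply depth only; rows are ordered by id, not by branch.
    print("  " * depth + f"[{msg_id}] {content}")
```

The trade-off is that fetching a whole thread needs a recursive query (or several round trips), where a plain thread column needs only a simple filter.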

MessageThread(models.Model):
 - id

Message(models.Model):
 - thread (pk)
 - subject
 - content
 - timestamp
 - sender

MessageRecipient
 - message_id (pk)
 - recipient (pk)
 - status (read, unread, deleted)

This is similar to the first scheme, except that I moved the "subject" column from the MessageThread table to Message so that the subject can change as the thread progresses. MessageThread now serves only as a constraint on the thread identifier used in Message (which overcomes the limitation I mentioned at the beginning of my answer). You may have additional metadata that you want to include in the MessageThread table, but I will leave this to you and your application.
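As a rough sketch of the hybrid, here it is expressed as plain SQL through Python's sqlite3 module rather than as Django models. Column types, the snake_case table names, and storing sender and recipient as plain text (instead of foreign keys to a user table) are simplifying assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE message_thread (
        id INTEGER PRIMARY KEY              -- exists only to anchor thread ids
    );
    CREATE TABLE message (
        id        INTEGER PRIMARY KEY,
        thread_id INTEGER NOT NULL REFERENCES message_thread(id),
        subject   TEXT NOT NULL,            -- per message, so it can change
        content   TEXT NOT NULL,
        timestamp TEXT NOT NULL,
        sender    TEXT NOT NULL
    );
    CREATE TABLE message_recipient (
        message_id INTEGER NOT NULL REFERENCES message(id),
        recipient  TEXT NOT NULL,
        status     TEXT NOT NULL DEFAULT 'unread'
                   CHECK (status IN ('read', 'unread', 'deleted')),
        PRIMARY KEY (message_id, recipient)
    );
""")

# Create a thread, then two messages whose subjects differ.
thread_id = conn.execute("INSERT INTO message_thread DEFAULT VALUES").lastrowid
conn.executemany(
    "INSERT INTO message (thread_id, subject, content, timestamp, sender) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (thread_id, "Deploy plan", "Draft attached.", "2024-01-01T09:00", "alice"),
        (thread_id, "Deploy plan (moved to Friday)", "New date.", "2024-01-01T10:00", "bob"),
    ],
)
conn.execute("INSERT INTO message_recipient (message_id, recipient) VALUES (1, 'bob')")

subjects = [row[0] for row in conn.execute(
    "SELECT subject FROM message WHERE thread_id = ? ORDER BY timestamp", (thread_id,))]
unread = conn.execute(
    "SELECT COUNT(*) FROM message_recipient WHERE recipient = 'bob' AND status = 'unread'"
).fetchone()[0]
print(subjects, unread)
```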


A separate MessageThread table may come in handy if you later want to add thread properties such as “locked”, “sticky” or “important”. That said, choosing a more complex model just to allow for features you might add in the future is usually not a good idea.

The first model (the one with the MessageThread table) ensures that all messages in a thread have the same subject; in the second model, each message in a thread can have a different subject. This can be good or bad, depending on how you want messaging to work.

The first model allows you to declare the column message.thread_id as a foreign key, so you cannot insert a message without a valid thread reference. The second model gives you no such guarantee, which may lead to errors later.

I do not think the columns MessageThread.timestamp and MessageThread.creator in the first model are really needed; aren't they the same as the timestamp and creator of the first message in the thread? Such redundancy can have negative consequences.

I would go with the first model, but I would drop the creator and timestamp fields from MessageThread.
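The difference can be seen in a small experiment. This is only a sketch with Python's sqlite3 module and trimmed-down tables; the try_insert helper and the thread id 999 are invented for the illustration.

```python
import sqlite3

def try_insert(with_thread_table):
    """Insert a message pointing at thread 999, which was never created."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    if with_thread_table:
        # First model: thread_id is a real foreign key.
        conn.execute("CREATE TABLE message_thread (id INTEGER PRIMARY KEY, subject TEXT)")
        conn.execute("""CREATE TABLE message (
            id INTEGER PRIMARY KEY,
            thread_id INTEGER NOT NULL REFERENCES message_thread(id),
            content TEXT)""")
    else:
        # Second model: thread_id is just a number with nothing to check it against.
        conn.execute("""CREATE TABLE message (
            id INTEGER PRIMARY KEY,
            thread_id INTEGER NOT NULL,
            content TEXT)""")
    try:
        conn.execute("INSERT INTO message (thread_id, content) VALUES (999, 'orphan')")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"
    finally:
        conn.close()

first_model = try_insert(with_thread_table=True)
second_model = try_insert(with_thread_table=False)
print(first_model, second_model)  # rejected accepted
```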
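If those two columns are dropped, the same information can be read from the earliest message of the thread. A minimal sketch with Python's sqlite3 module, assuming the message's sender plays the role of the thread creator; the sample rows and the thread_origin helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE message_thread (id INTEGER PRIMARY KEY, subject TEXT NOT NULL);
    CREATE TABLE message (
        id        INTEGER PRIMARY KEY,
        thread_id INTEGER NOT NULL REFERENCES message_thread(id),
        content   TEXT NOT NULL,
        timestamp TEXT NOT NULL,
        sender    TEXT NOT NULL
    );
    -- Hypothetical sample data.
    INSERT INTO message_thread (id, subject) VALUES (1, 'Budget');
    INSERT INTO message (thread_id, content, timestamp, sender) VALUES
        (1, 'First draft', '2024-03-01T09:00', 'alice'),
        (1, 'Looks good',  '2024-03-01T09:30', 'bob');
""")

def thread_origin(conn, thread_id):
    """Return (sender, timestamp) of the earliest message in the thread."""
    return conn.execute("""
        SELECT sender, timestamp FROM message
        WHERE thread_id = ?
        ORDER BY timestamp, id   -- id breaks ties between equal timestamps
        LIMIT 1
    """, (thread_id,)).fetchone()

origin = thread_origin(conn, 1)
print(origin)  # ('alice', '2024-03-01T09:00')
```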
