Support for multiple languages in a NoSQL solution?

We are about to start a new project in which we will (hopefully) support thousands of customers, so we are studying the architecture. One of the key aspects of the application will be support for multiple languages (English, Spanish, etc., with no limit on the number of languages). We have a lot of experience modeling this in a traditional DBMS (SQL Server, Oracle, etc.), but we struggle when it comes to NoSQL modeling. In the SQL model, we would create a "text" table with a "language" column pointing to a "language" table containing all the different languages. That way, every text can be provided in every supported language. Consider a simple example:

table: Category columns: id (PK), Enabled (Bool)

table: Category_Descriptions columns: id (PK), CategoryID (FK), LanguageID (FK), description (text)

table: Language columns: id (PK), Enabled (Bool)

table: Language_Descriptions columns: id (PK), DescriptionLanguageID (FK), LanguageID (FK), description (text)

Thus, all languages will be saved in the Language table, with their corresponding descriptions stored in the Language_Descriptions table. In addition, all categories will be stored in the Category table, with descriptions in all languages in the Category_Descriptions table. Thus, to get all categories in a given language (English = 1):

select c.id, cd.Description from Category c, Category_Descriptions cd where c.id = cd.CategoryID and cd.LanguageID = 1 and c.Enabled = 1;

Of course, the category itself is not very useful; it will be part of another entity, such as an incident report:

table: Incident columns: identifier (PK), generated (date), category code (FK), etc.

To get information from this table, I would then do the same join as before and select the description column in the correct language. Basic stuff; we have all done this before ...

Finally, we come to my question: how to store it correctly in a NoSQL database? :)

I reviewed a few (bad) solutions:

  • Save only the code and then look up the correct description at runtime
  • Save the last used description along with the language code and then update if the language has changed (another user)
  • Store all descriptions in one document
  • Save the description for the code in the active language, then add descriptions in other languages as needed (i.e., when first requested in a language not yet stored).

All these solutions have many shortcomings and require a lot of work on implementation and support ... So, any contribution to how to best solve this problem will be appreciated.

EDIT: We are looking at NoSQL for two reasons:

  • Performance (scale)
  • Dynamic schema (a lot of work needs to be done to make this happen in SQL)
2 answers

Some time has passed since this was asked, but I thought, why not =) ...

From my experience with NoSQL, you should first try to forget your RDBMS background and your strong desire to normalize data. It is fine to have redundant data. It is fine to store things in bulk (even if they are redundant!). It is fine for the data to be inconsistent for a while. In other words, since you will potentially be storing a language description in 5 places, it is fine for those 5 places to differ for some period of time.

If you are willing to make these concessions in the name of performance and a dynamic schema, that can help with your modeling.

I think it's a good idea to start by using the user interface as a model. If you were a web developer and wanted to get this data, what would you need? Ideally, you want to minimize the number of calls a web developer has to make to get what they need. This often helps you decide how much information should be embedded in a document.
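As a rough sketch of what modeling from the UI could look like here, assume an incident page has to show the category text in the viewer's language. The document shape and the names below (`incident`, `descriptions`, `render_incident`) are hypothetical and not tied to any particular database; a plain Python dict stands in for a stored document.

```python
# Hypothetical denormalized incident document: the category's translations
# are copied into the incident, so a single read is enough to render the page.
incident = {
    "_id": 1001,
    "generated": "2012-05-01",
    "category": {
        "id": 7,
        "descriptions": {"en": "Hardware failure", "es": "Fallo de hardware"},
    },
}


def render_incident(doc, language, default_language="en"):
    """Build the view model for one incident without any further lookups."""
    descriptions = doc["category"]["descriptions"]
    # Fall back to the default language if the requested one is missing.
    text = descriptions.get(language, descriptions[default_language])
    return {"id": doc["_id"], "generated": doc["generated"], "category": text}
```

The price is the redundancy already mentioned: if a category description changes, every incident holding a copy has to be updated eventually, and until then the copies may disagree.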

With your SQL example, I think you were hinting at the need to run queries across documents. In other words, if you do your best and create 10 document types, and mostly everything works fine, but then you suddenly realize that you need a "join", you will run into trouble.

NoSQL is not suitable for creating conceptual joins.

As a rule, most of them use map/reduce. For example, in Mongo you can write map/reduce functions that essentially give you join functionality. However, you pay a price in speed.
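To make the idea concrete, here is a minimal pure-Python emulation of a join done in map/reduce style. It does not use Mongo's API; the document shapes and the function names (`map_doc`, `reduce_group`, `map_reduce_join`) are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical documents of two different types sharing a category_id.
categories = [
    {"type": "category", "category_id": 7, "description": "Hardware failure"},
]
incidents = [
    {"type": "incident", "incident_id": 1001, "category_id": 7},
    {"type": "incident", "incident_id": 1002, "category_id": 7},
]


def map_doc(doc):
    """Map step: emit (join key, tagged value) for each document."""
    if doc["type"] == "category":
        yield doc["category_id"], ("category", doc["description"])
    else:
        yield doc["category_id"], ("incident", doc["incident_id"])


def reduce_group(values):
    """Reduce step: attach the category text to every incident in the group."""
    description = next((v for tag, v in values if tag == "category"), None)
    return [
        {"incident_id": v, "category": description}
        for tag, v in values
        if tag == "incident"
    ]


def map_reduce_join(docs):
    """Group mapped values by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            groups[key].append(value)
    joined = []
    for values in groups.values():
        joined.extend(reduce_group(values))
    return joined
```

Every document has to pass through the map step before anything is joined, which is where the speed cost comes from compared with reading one pre-joined document.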

But if you are OK with complex queries (ones that don't match your original document model) running a little slower, you can do whatever you want.

How would you determine which queries should be fast and which might be a little slow? Again I would point to the user interface.

Simple trial and error with modeling really helped me. I realize this is a lame suggestion, but it's true. =)


You can create a description field as an array of objects with two fields: language and text. Just make sure the first member of this array is always the default locale value.
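A minimal sketch of that layout, with a lookup that falls back to the first entry. The field names `language` and `text` come from the suggestion above; the sample category, its values, and the `describe` helper are hypothetical.

```python
# Hypothetical category document: description is an array of
# {language, text} objects, with the default locale always first.
category = {
    "_id": 7,
    "enabled": True,
    "description": [
        {"language": "en", "text": "Hardware failure"},  # default locale
        {"language": "es", "text": "Fallo de hardware"},
    ],
}


def describe(doc, language):
    """Return the text for `language`, or the default (first) entry."""
    entries = doc["description"]
    for entry in entries:
        if entry["language"] == language:
            return entry["text"]
    # No translation stored for this language: use the default locale.
    return entries[0]["text"]
```

Keeping the default locale first means the fallback needs no extra configuration, at the cost of having to preserve that ordering on every write.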

