Declension algorithm for nouns in Polish / Slavic languages

Note: knowledge of Polish, or of any other heavily inflected natural language, preferably one with a case system (such as German), will help greatly in answering this question. In particular, the Polish declension system is very similar to those of other Slavic languages, such as Russian, Czech and Serbian.

Take a look at this unfinished Polish declinator: declinator.com. I plan to extend it to other languages, namely Russian and Latin, but for now I'm struggling with Polish.

Besides having a large declension database for hundreds of nouns, I also support declining nouns that don't exist. The best solution I have come up with so far is simply to check the endings of nouns so that they can be declined accordingly.

In my code, this comes down to the calculateDeclination method, which I call when the noun is not in the database. The body of the method looks like this:

    if (areLast2Letters(word, "il"))
        declinator = new KamilDeclinator(word);
    else if (areLast2Letters(word, "sk"))
        declinator = new DyskDeclinator(word);
    else if (isLastLetter(word, 'm'))
        declinator = new RealizmDeclinator(word);

And so on. These are only the first three of the dozens of else if clauses in this method.

An example declinator looks like this:

    import static declining.utils.StringUtils.*;

    public class RealizmDeclinator extends realizm_XuXowiX_XemXieXieDeclinator {

        public RealizmDeclinator(String noun) {
            super(noun);
        }

        @Override
        protected String calculateStem() {
            return word;
        }

        @Override
        public String calculateLocative() {
            return swap2ndFromEnd(stem, "ź") + "ie";
        }

        @Override
        public String calculateVocative() {
            return swap2ndFromEnd(stem, "ź") + "ie";
        }
    }

So the question is: is there another, more elegant algorithm for declining Polish words? Do there have to be so many if/else clauses? Do I have to write a separate declinator for each type of noun?

This problem showed me how far from simple, and how incredibly numerous, the Polish declension rules are. That has made my algorithm boring and monotonous. I hope one of you can help me make it interesting and concise!

Cheers

2 answers

Although I am a native speaker of Polish, my answer concerns the code patterns in your program. As others have pointed out, tables are the way to go. However, you can also try refactoring the long if/else blocks using the Command pattern. See this page for a diagram.
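
A minimal sketch of what a table of commands could look like in place of the if/else chain, assuming each rule is just an ending plus a function that builds something from the word. The names SuffixRuleTable, SuffixRuleDemo, register and select are hypothetical and not part of the asker's code, and the commands here return a plain label rather than a real declinator object so that the example stays self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: an ordered table mapping word endings to commands.
final class SuffixRuleTable<T> {
    // LinkedHashMap keeps registration order, so more specific endings
    // should be registered before more general ones.
    private final Map<String, Function<String, T>> rules = new LinkedHashMap<>();

    SuffixRuleTable<T> register(String ending, Function<String, T> command) {
        rules.put(ending, command);
        return this;
    }

    // Runs the first command whose ending matches the word, if any.
    Optional<T> select(String word) {
        for (Map.Entry<String, Function<String, T>> rule : rules.entrySet()) {
            if (word.endsWith(rule.getKey())) {
                return Optional.of(rule.getValue().apply(word));
            }
        }
        return Optional.empty();
    }
}

final class SuffixRuleDemo {
    // Assumption: in a real program each command would construct a declinator
    // (for example via a constructor reference); here it only builds a label.
    static SuffixRuleTable<String> demoTable() {
        return new SuffixRuleTable<String>()
                .register("il", w -> "KamilDeclinator(" + w + ")")
                .register("sk", w -> "DyskDeclinator(" + w + ")")
                .register("m", w -> "RealizmDeclinator(" + w + ")");
    }

    public static void main(String[] args) {
        SuffixRuleTable<String> table = demoTable();
        System.out.println(table.select("Kamil"));
        System.out.println(table.select("dysk"));
        System.out.println(table.select("realizm"));
    }
}
```

With this shape, adding a new noun type means adding one register call rather than another else if branch, and the table could just as well be loaded from a data file.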


I believe the right way to do this is to reproduce the algorithm (with its many helper functions and conditions) from a good book on morphology, and then refine it against a large dictionary used as a unit test.
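
A rough sketch of the "dictionary as a unit test" idea, assuming the dictionary is available as pairs of nominative and expected locative forms. The class DictionaryCheck, its methods and the deliberately naive rule are hypothetical; the three sample entries only stand in for a real dictionary.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: run a rule over every dictionary entry and report mismatches.
final class DictionaryCheck {

    // Returns one message per entry where the rule's output differs from the dictionary form.
    static List<String> mismatches(Map<String, String> dictionary, UnaryOperator<String> rule) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, String> entry : dictionary.entrySet()) {
            String actual = rule.apply(entry.getKey());
            if (!entry.getValue().equals(actual)) {
                failures.add(entry.getKey() + ": expected " + entry.getValue() + ", got " + actual);
            }
        }
        return failures;
    }

    // Tiny stand-in for a large dictionary: nominative -> locative singular.
    static Map<String, String> sampleLocatives() {
        Map<String, String> forms = new LinkedHashMap<>();
        forms.put("realizm", "realizmie");
        forms.put("komunizm", "komunizmie");
        forms.put("dom", "domu");
        return forms;
    }

    // Deliberately naive rule (an assumption for illustration): append "ie" after a final "m".
    static String naiveLocative(String noun) {
        return noun.endsWith("m") ? noun + "ie" : noun;
    }

    public static void main(String[] args) {
        // Each printed line is a word the rule gets wrong and therefore a case to refine.
        mismatches(sampleLocatives(), DictionaryCheck::naiveLocative).forEach(System.out::println);
    }
}
```

Every reported mismatch points at a rule that needs another condition or an exception entry, which is how the dictionary drives the refinement.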

Updated link to my Russian declension library: https://github.com/georgy7/RussianNounsJS
