Taxonomy of Computer Science

I am developing a web application where each user has a collection of tags. I need to build a list of suggested users for each user, based on how similar their tags are.
For example, when a user logs in, the system retrieves that user's tags, searches the user database for them, and shows the users who have similar tags. If User 1 has the tags [ Linux, Apache, MySQL, PHP ] and User 2 has [ Windows, IIS, PHP, MySQL ], the system says that User 2 matches User 1 with a weight of 50%, because they share 2 tags ( PHP and MySQL ).
But imagine a situation where User 1 has [ ASP, IIS, MS Access ] and User 2 has [ PHP, Apache, MySQL ]. In this situation, my system does not suggest User 2 as a "friend" for User 1, or vice versa. Yet we know that these two users work in a similar area: both work with web technologies (or web programming, etc.).
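To make the two cases above concrete, here is a minimal sketch of the exact-match weighting. It assumes the weight is the number of shared tags divided by the number of the first user's tags (which gives the 50% from the first example); the function name `tag_overlap` is only illustrative and not part of my actual system.

```python
def tag_overlap(own_tags, other_tags):
    """Share of own_tags that also appear in other_tags (case-insensitive).

    Assumption: weight = shared tags / number of own tags.
    """
    own = {t.lower() for t in own_tags}
    other = {t.lower() for t in other_tags}
    if not own:
        return 0.0
    return len(own & other) / len(own)


# First example: two shared tags out of four.
lamp = ["Linux", "Apache", "MySQL", "PHP"]
wimp = ["Windows", "IIS", "PHP", "MySQL"]
print(tag_overlap(lamp, wimp))  # 0.5

# Second example: no tag matches literally, so the weight is zero
# even though both users work on web technologies.
ms_stack = ["ASP", "IIS", "MS Access"]
php_stack = ["PHP", "Apache", "MySQL"]
print(tag_overlap(ms_stack, php_stack))  # 0.0
```

The second call is the problem case: literal matching has no way to know that ASP and PHP are related.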
That's why I need a taxonomy of computer science (for now; later I will probably need taxonomies of other areas, such as medicine, physics, mathematics, etc.) in which these concepts are classified. Then, when I look at the similarity of ASP and PHP, for example, the system can tell that they are related and belong to one group (or category).
I hope I have described my problem clearly. If I explained something incorrectly, I would be happy to receive your corrections.
Thanks.

+4
4 answers

I don't think you necessarily need a taxonomy. With enough data, you can run a cluster analysis on the tags and infer the relationships between them. See this paper on automatic tag clustering for some details. If you doubt that tag clustering and tag-based analysis can get you very far, take a look at Flickr.
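As a rough illustration of inferring tag relationships from data, here is a minimal sketch that compares tags by the company they keep: each tag is described by the tags it shares a profile with, and two tags are compared by cosine similarity. This is a simple stand-in, not the method of the paper mentioned above, and the profiles are made-up toy data.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import sqrt


def cooccurrence_vectors(profiles):
    """Map each tag to a Counter of the tags it shares a profile with."""
    vectors = defaultdict(Counter)
    for tags in profiles:
        for a, b in permutations(set(tags), 2):
            vectors[a][b] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical user profiles; "php" and "asp" never appear together.
profiles = [
    ["php", "html", "javascript", "mysql"],
    ["asp", "html", "javascript", "iis"],
    ["php", "html", "apache"],
    ["asp", "javascript", "ms access"],
    ["c", "linux", "kernel"],
]
vectors = cooccurrence_vectors(profiles)

# Related through shared neighbours such as "html" and "javascript".
print(cosine(vectors["php"], vectors["asp"]))     # about 0.57 on this toy data
# No shared neighbours at all.
print(cosine(vectors["php"], vectors["kernel"]))  # 0.0
```

With real data, the resulting similarity scores could feed a standard clustering step, so that groups such as "web programming" emerge without anyone writing them down by hand.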

Alternatively, if you think you do need a taxonomy, consider using SKOS. If you can map your tags to SKOS concepts, you can perform this kind of analysis on them. Two sources of SKOS data that you may find particularly useful are the Library of Congress Subject Headings and DBpedia. If you have further questions about using SKOS, try SemanticOverflow.

+3

If these terms appear in a forum or a similar body of text, you can use latent semantic analysis to build clusters of terms.

+2

Build some with Google Sets? It would be hard to get a larger dataset than this:

http://labs.google.com/sets

+2

You need to create relationships between tags. I do not think this can be done automatically. You must build a database that says sql = mysql = postgresql = oracle, asp = jsp = php, etc. This way you create typed groups of tags. A tag can belong to several groups.
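A minimal sketch of such a hand-made table, assuming a plain in-memory dictionary instead of a real database. The first two groups come from the examples above; the group names, the other two groups, and the Jaccard scoring are only illustrative assumptions.

```python
# Hand-maintained groups of related tags. A tag may sit in several groups.
TAG_GROUPS = {
    "database": {"sql", "mysql", "postgresql", "oracle"},
    "web scripting": {"asp", "jsp", "php"},
    "web server": {"apache", "iis"},                  # hypothetical group
    "microsoft stack": {"asp", "iis", "ms access"},   # hypothetical group
}


def groups_of(tags):
    """Names of all groups that contain at least one of the given tags."""
    wanted = {t.lower() for t in tags}
    return {name for name, members in TAG_GROUPS.items() if members & wanted}


def group_similarity(tags_a, tags_b):
    """Jaccard overlap of the two users' groups (an assumed scoring rule)."""
    a, b = groups_of(tags_a), groups_of(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0


user1 = ["ASP", "IIS", "MS Access"]
user2 = ["PHP", "Apache", "MySQL"]
print(sorted(groups_of(user1)))
print(group_similarity(user1, user2))  # 0.5
```

The two users from the question share no tag, but they share the "web scripting" and "web server" groups, so they now get a non-zero score.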

+1

Source: https://habr.com/ru/post/1312062/

