I am developing a mobile application for Android that potentially has many users (say, about 1 million). These users can follow other users (as on Twitter). The application synchronizes user data through a remote REST server. The user data itself is stored in a document-oriented database (in my case, MongoDB).
I am currently asking myself what the best way is to design the user model, including its follower and following relationships. My first thought was to embed the relationships in the user document.
Example user document:
{ "_id":"50fd6bb530043e3c569af288", "name":"Marsha Garcia", "follower":["50fd6bb530043e3c569af287","50fd6bb530043e3c569af289","50fd6bb530043e3c569af28c"], "following":["70fd6bb530043e3c569af289","10fd6bb530043e3c569af222","89fd6bb530043e3c569af45o"] }
The upside is that the follower / following relationships are already attached to the user. However, let's say that a user follows about 100,000 or more other users. Then the document becomes very large. If I load this user object through the REST service in my mobile application, this may take some time. In addition, in the worst case, a user document may exceed MongoDB's 16 MB document size limit.
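To put rough numbers on that concern, here is a minimal sketch that computes the BSON size of one embedded ID array, assuming the IDs are stored as 24-character hex strings as in the example above. The function name and the constant are made up for illustration; the byte counts follow the BSON layout for arrays of strings (one type byte, the decimal index as a null-terminated key, a 4-byte length, the string bytes, and a trailing null).

```python
# Assumption: 16 MB is taken as 16 * 1024 * 1024 bytes.
BSON_MAX_DOCUMENT_BYTES = 16 * 1024 * 1024


def bson_string_array_size(count: int, string_len: int = 24) -> int:
    """Size in bytes of a BSON array holding `count` ASCII strings of equal length."""
    size = 4 + 1  # int32 length prefix plus trailing null of the array itself
    for index in range(count):
        key_len = len(str(index)) + 1   # array keys are decimal indexes stored as C strings
        value_len = 4 + string_len + 1  # int32 length, the bytes, trailing null
        size += 1 + key_len + value_len  # plus one type byte per element
    return size


if __name__ == "__main__":
    for n in (1_000, 100_000, 500_000):
        size = bson_string_array_size(n)
        verdict = "fits" if size < BSON_MAX_DOCUMENT_BYTES else "exceeds the limit"
        print(f"{n} ids: {size} bytes ({verdict})")
```

Under these assumptions, 100,000 IDs in one array come to about 3.6 MB, so a single array of that size still fits, but the whole document has to be transferred and rewritten as it grows. Storing the IDs as ObjectId values instead of strings would shrink each entry, which is a separate choice from the one discussed here.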
So my second thought was to model the follower and following relationships in a more classic way: a separate collection in which each document describes one relationship between two users.
Example document 'user relation':
{ "_id": "50fe65828de290c0a8a8ea2d", "uid": "50fd6bb530043e3c569af288", "rel_uid": "50fe65828de290c0a8a8e9a6", "type": "FOLLOWING" }
The upside is that the size of each user document stays roughly constant. The downside is that with a lot of users and following relationships, I could easily end up with millions of entries in my MongoDB user relation collection. Of course, I am going to set up appropriate indexes, but I am not sure whether this solution will scale well for the use case where an application user requests his / her current followers.
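To make the second approach concrete, here is a minimal sketch of the relation documents, the two lookups, and the indexes I have in mind. It reuses the field names from the example ("uid", "rel_uid", "type"); the helper names and the collection name "user_relations" are made up for illustration. The helpers only build plain dictionaries and key lists, so nothing here needs a running database.

```python
FOLLOWING = "FOLLOWING"

# Index key specs as (field, direction) pairs, 1 meaning ascending.
# One index serves "whom does X follow", the other "who follows X".
FOLLOWING_INDEX = [("uid", 1), ("type", 1), ("rel_uid", 1)]
FOLLOWERS_INDEX = [("rel_uid", 1), ("type", 1), ("uid", 1)]


def relation_doc(uid: str, rel_uid: str) -> dict:
    """Document stating that user `uid` follows user `rel_uid`."""
    return {"uid": uid, "rel_uid": rel_uid, "type": FOLLOWING}


def following_filter(user_id: str) -> dict:
    """Query filter for the users that `user_id` follows."""
    return {"uid": user_id, "type": FOLLOWING}


def followers_filter(user_id: str) -> dict:
    """Query filter for the users that follow `user_id`."""
    return {"rel_uid": user_id, "type": FOLLOWING}


def matches(doc: dict, query: dict) -> bool:
    """Tiny stand-in for an equality-only query, used only to check the filters."""
    return all(doc.get(field) == value for field, value in query.items())


if __name__ == "__main__":
    docs = [relation_doc("alice", "bob"), relation_doc("carol", "bob")]
    print([d["uid"] for d in docs if matches(d, followers_filter("bob"))])
```

With pymongo, I would expect to pass these to something like db.user_relations.create_index(FOLLOWERS_INDEX) and db.user_relations.find(followers_filter(user_id)).limit(50), so that the mobile client loads followers one page at a time instead of all at once. Whether two compound indexes are enough at this scale is exactly what I am unsure about.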
I would appreciate any thoughts or impressions on my modeling problem. Perhaps someone even has a better approach.
Thanks in advance.