Well, I think you need to break this down into the basic "varieties" of objects.
You have two entity objects: User and Campaign.
You have one "mapping" object: UserCampaign.
You have one "transactional"-style object: Click.
Step 1: Entity
Let's start with the simple ones: User and Campaign. These are truly two separate objects; neither depends on the other for its existence. There is also no implied hierarchy between the two: Users do not belong to Campaigns, and Campaigns do not belong to Users.
When you have two top-level objects like this, they generally earn their own collection. So you will want a Users collection and a Campaigns collection.
Step 2: Mapping
UserCampaign is currently used to represent an N-to-M mapping. In general, when you have an N-to-1 mapping, you can put the N inside the 1. With an N-to-M mapping, however, you usually have to "pick a side."
In theory, you can do one of the following:
1. Put a list of Campaign IDs inside each User
2. Put a list of User IDs inside each Campaign
Personally, I would go with the first option. You probably have far more users than campaigns, so each user will reference only a few campaigns, and you want to put the array where it will be shorter.
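To make the "pick a side" idea concrete, here is a minimal sketch using plain in-memory JavaScript arrays in place of real collections. The field name campaign_ids, the sample data, and both helper functions are assumptions made up for illustration, not something from the question.

```javascript
// Hypothetical sample data: the field name campaign_ids is an assumption.
const users = [
  { _id: "u1", name: "Alice", campaign_ids: ["c1", "c2"] },
  { _id: "u2", name: "Bob", campaign_ids: ["c2"] },
];
const campaigns = [
  { _id: "c1", title: "Spring sale" },
  { _id: "c2", title: "Newsletter" },
];

// User -> Campaigns: follow the IDs stored on the user document.
function campaignsForUser(userId) {
  const user = users.find((u) => u._id === userId);
  if (!user) return [];
  return campaigns.filter((c) => user.campaign_ids.includes(c._id));
}

// Campaign -> Users: search for users whose array contains the ID.
// In the mongo shell this direction would be roughly
// db.users.find({ campaign_ids: campaignId }).
function usersForCampaign(campaignId) {
  return users.filter((u) => u.campaign_ids.includes(campaignId));
}
```

Both directions remain answerable even though the list lives on only one side; the reverse lookup simply becomes a search on the array field.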
Step 3: Transactional
Clicks are a completely different beast. In object terms, you could think of it like this: a Click "belongs to" a User, and a Click "refers to" a Campaign. So, in theory, you could store clicks as part of either of those objects. It is easy to think that clicks belong under Users or Campaigns.
But if you dig deeper, that simplification is really flawed. In your system, Clicks are the true central object. In fact, you could even say that Users and Campaigns are really just "associated with" a click.
Take a look at the questions and queries you are asking. All of them are actually centered around clicks. Users and Campaigns are not the central object in your data; Clicks are.
In addition, clicks will be the most abundant data in your system. You will have more clicks than anything else.
This is the biggest hitch in designing a schema for data like this. Sometimes you need to push aside the "parent" objects when they are not the most important thing. Imagine building a simple e-commerce system. It is clear that Orders "belong to" Users, but Orders are so central to the system that they become a top-level object.
Wrap it up
You will probably need three collections:
- Users → each has a list of campaign._id
- Campaigns
- Clicks → each contains user._id, campaign._id
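As a rough sketch of what documents in these three collections might look like, the snippet below builds one of each as a plain JavaScript object. Apart from _id and user_id, every field name here (campaign_id, campaign_ids, created_at, and so on) and the makeClick helper are assumptions chosen for illustration.

```javascript
// Hypothetical document shapes; field names are assumptions for illustration.
const exampleUser = { _id: "u1", name: "Alice", campaign_ids: ["c1"] };
const exampleCampaign = { _id: "c1", title: "Spring sale" };

// Build a click document that points at its user and campaign by ID
// instead of being embedded inside either of them.
function makeClick(user, campaign, details) {
  return {
    user_id: user._id,
    campaign_id: campaign._id,
    ip: details.ip,
    referer: details.referer,
    os: details.os,
    created_at: details.created_at,
  };
}

const exampleClick = makeClick(exampleUser, exampleCampaign, {
  ip: "203.0.113.7",
  referer: "https://example.com/",
  os: "Linux",
  created_at: new Date("2024-01-01T00:00:00Z"),
});
```

The click carries only the two IDs plus its own details, so it can live in its own collection and still be tied back to a user and a campaign.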
This should satisfy all your needs:
See information from every click, such as IP, Referer, OS, etc.
db.clicks.find()
See how often clicks come from X IP, X Referer, X OS
db.clicks.group() or run a Map-Reduce.
Associate each click with a user and campaign
db.clicks.find({user_id : blah})
You can also push click IDs into the users and campaigns (if that makes sense).
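To show what these queries compute, here is a small in-memory sketch in plain JavaScript. It does not talk to MongoDB; the sample clicks and the helper names countBy and clicksForUser are made up for illustration.

```javascript
// Hypothetical click documents standing in for the clicks collection.
const clickLog = [
  { user_id: "u1", campaign_id: "c1", ip: "203.0.113.7", os: "Linux" },
  { user_id: "u2", campaign_id: "c1", ip: "203.0.113.7", os: "Windows" },
  { user_id: "u1", campaign_id: "c2", ip: "198.51.100.4", os: "Linux" },
];

// Count clicks per distinct value of one field, which is the kind of
// result a group or Map-Reduce over the collection would produce.
function countBy(docs, field) {
  const counts = {};
  for (const doc of docs) {
    counts[doc[field]] = (counts[doc[field]] || 0) + 1;
  }
  return counts;
}

// In-memory stand-in for db.clicks.find({ user_id: userId }).
function clicksForUser(docs, userId) {
  return docs.filter((doc) => doc.user_id === userId);
}

const byIp = countBy(clickLog, "ip");
const byOs = countBy(clickLog, "os");
```

The same countBy call works for IP, Referer, or OS because each click document carries all of those fields itself.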
Please note that if you have lots and lots of clicks, you will really have to analyze the queries you run most. You cannot index on every field, so you will often want to run Map-Reduces to "roll up" the data for these queries.
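One way to picture such a "roll-up" is a job that collapses raw clicks into one small summary document per campaign per day, so that frequent queries read the summaries instead of scanning every click. The sketch below does this in plain JavaScript over an in-memory array; the function rollUpDaily, the created_at field, and the shape of the summary documents are all assumptions for illustration.

```javascript
// Collapse raw clicks into one summary document per campaign per day.
function rollUpDaily(docs) {
  const summaries = new Map();
  for (const doc of docs) {
    const day = doc.created_at.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
    const key = `${doc.campaign_id}|${day}`;
    if (!summaries.has(key)) {
      summaries.set(key, { campaign_id: doc.campaign_id, day, clicks: 0 });
    }
    summaries.get(key).clicks += 1;
  }
  return [...summaries.values()];
}

// Hypothetical raw clicks; only the fields needed for the roll-up are shown.
const rawClicks = [
  { campaign_id: "c1", created_at: new Date("2024-01-01T09:00:00Z") },
  { campaign_id: "c1", created_at: new Date("2024-01-01T17:30:00Z") },
  { campaign_id: "c1", created_at: new Date("2024-01-02T08:15:00Z") },
  { campaign_id: "c2", created_at: new Date("2024-01-01T12:00:00Z") },
];
const dailySummaries = rollUpDaily(rawClicks);
```

The summary set grows with the number of campaigns and days rather than with the number of clicks, which is what makes it cheap to query and to index.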