You can use MapReduce (MR) to accomplish this. In the map step, you simply pick out the tags and emit them:
var map = function(){ for(var i=0;i<this.tags.length;i++){ emit(this.tags[i].tagname, {count: 1}); } }
Your reduce function then goes through the emitted documents, summing up the number of times each tag was seen.
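A matching reduce function could look like the sketch below. It assumes the map function above emits `{count: 1}` values. The collection names `posts` and `tag_counts` in the commented-out call are hypothetical.

```javascript
// Sums the partial counts emitted for one tag.
// The return value has the same shape as the emitted values ({count: n}),
// because MongoDB may re-run reduce on its own output.
var reduce = function (key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i].count;
    }
    return { count: total };
};

// In the mongo shell this would be run roughly as follows
// ("posts" and "tag_counts" are assumed names):
// db.posts.mapReduce(map, reduce, { out: "tag_counts" });
```

Each document in the output collection then holds the tag name in `_id` and the total in `value.count`.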
If you are up for upgrading to the latest unstable 2.2, you can also use the aggregation framework. You would use $project and $sum in an aggregation pipeline ($sum is an accumulator, so it goes inside a $group stage, typically after a $unwind on the tags array) to project the tags out of each post and then sum them. The result is a count-based tag cloud, where the text size of each tag can be derived from its total.
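One possible shape for such a pipeline is sketched below. It assumes the same document layout as the map function above (a `tags` array whose elements carry a `tagname` field). The collection name `posts` and the helper `tagCloudCounts` are hypothetical.

```javascript
// Pipeline stages, in order.
var pipeline = [
    // Keep only the tags array from each post.
    { $project: { tags: 1 } },
    // Produce one document per tag.
    { $unwind: "$tags" },
    // Count how many times each tag name occurs.
    { $group: { _id: "$tags.tagname", count: { $sum: 1 } } },
    // Most-used tags first.
    { $sort: { count: -1 } }
];

// Hypothetical helper: "posts" is an assumed collection name.
function tagCloudCounts(db) {
    return db.posts.aggregate(pipeline);
}
```

In the mongo shell you would call `tagCloudCounts(db)` and scale each tag's font size from its `count`.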
If so, is this a good practice? Or does this violate the NoSQL paradigm?
This is a fairly standard problem in MongoDB, and one you will not be able to get around. With a reusable structure comes the inevitable need to run some complex queries over it. Fortunately, version 2.2 has the aggregation framework to help.
As for whether this is good or bad practice: it is a fairly standard approach, so it is neither.
As for improving the structure, you can pre-aggregate the unique tags, together with their counts, into a separate collection. This makes it easier to build the tag cloud in real time.
Pre-aggregation means building, from your application, the collection you would otherwise get out of MR, without needing MR or the aggregation framework. It is usually event-based: when a user creates a post or retags a post, that triggers a pre-aggregation event against a tag_count collection, which looks like this:
{ _id: {}, tagname: "", count: 1 }
When the event fires, your application loops over the tags on the post, doing a $inc upsert for each one, like this:
db.tag_count.update({tagname: 'whoop'}, {$inc: {count: 1}}, true);
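The loop around that upsert might look like the following sketch. The helper name `incrementTagCounts` is hypothetical, and `post.tags` is assumed to be an array of `{tagname: ...}` objects, matching the map function shown earlier.

```javascript
// Hypothetical helper: call it from the code path that saves a new post.
function incrementTagCounts(db, post) {
    var tags = post.tags || [];
    for (var i = 0; i < tags.length; i++) {
        // The third argument `true` requests an upsert: the counter
        // document is created on first use and incremented afterwards.
        db.tag_count.update(
            { tagname: tags[i].tagname },
            { $inc: { count: 1 } },
            true
        );
    }
}
```

Because each tag gets its own atomic $inc, concurrent posts do not need to read the current count before writing.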
So now you will have a collection of tags with their counts across your blog. From there, you go the same route as with MR and simply query this collection to pull out your data. Of course, you will also need to handle delete and update events, but you get the general idea.
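For the delete and update events, one option is to diff the old and new tag lists and adjust only the counters that changed. This is a sketch under assumptions: the helper `applyTagChanges` is hypothetical, and it takes plain arrays of tag-name strings rather than full post documents.

```javascript
// Hypothetical helper for update events. Deleting a post is the special
// case where newTags is an empty array.
function applyTagChanges(db, oldTags, newTags) {
    var i;
    for (i = 0; i < oldTags.length; i++) {
        if (newTags.indexOf(oldTags[i]) === -1) {
            // Tag was removed from the post: decrement its counter.
            db.tag_count.update({ tagname: oldTags[i] }, { $inc: { count: -1 } });
        }
    }
    for (i = 0; i < newTags.length; i++) {
        if (oldTags.indexOf(newTags[i]) === -1) {
            // Tag was added to the post: increment, creating it if missing.
            db.tag_count.update({ tagname: newTags[i] }, { $inc: { count: 1 } }, true);
        }
    }
}
```

Counters can drop to zero this way. Whether to delete such documents or filter them out when reading is left to the application.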