Converting some fields in Mongo from String to Array

I have a collection of documents in which the "tags" field was changed from a space-separated string of tags to an array of individual tags. I want to update the older documents, which still hold space-separated strings, so that every document stores an array, as the newly incoming data does.

I am also having trouble with the $type selector, because on an array field it applies the type check to the individual elements, which are strings. As a result, filtering by type simply returns everything.

How can I convert each document that looks like the first example below into the format of the second?

    { "_id" : ObjectId("12345"), "tags" : "red blue green white" }
    { "_id" : ObjectId("54321"), "tags" : [ "red", "orange", "black" ] }
1 answer

We cannot use the $type operator to filter our documents here, because the type of elements in our array is "string" and, as stated in the documentation:

When applied to arrays, $type matches any inner element that has the specified BSON type. For example, when matching for $type: 'array', the document will match if the field has a nested array. It will not return results where the field itself is an array.

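Fortunately, MongoDB also provides the $exists operator, which can be combined with a numeric array index here: { "tags.0": { "$exists": false } } matches only documents whose "tags" field has no element at index 0, which rules out every non-empty array.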
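As a rough sketch, the filter described above could be written as a plain object and passed to find(). The variable name legacyTagsFilter is an assumption made for illustration; the two conditions are the ones used in the mapReduce query further down in this answer.

```javascript
// Sketch of a filter for documents whose "tags" is still a plain string.
// `legacyTagsFilter` is a hypothetical name, not part of any MongoDB API.
var legacyTagsFilter = {
    // No element at index 0, so the field is not a non-empty array.
    "tags.0": { "$exists": false },
    // BSON type 2 is "string".
    "tags": { "$type": 2 }
};

// In the mongo shell this would be used as:
//   db.collection.find(legacyTagsFilter)
```

An empty array is not matched either, because it contains no string element for the $type check to match.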

Now, how can we update these documents?

In MongoDB 3.2 and earlier, the only option is mapReduce(), but first let's look at an alternative available in the upcoming MongoDB release.

Starting with MongoDB 3.4, we can $project our documents and use the $split operator to split our string into an array of substrings.

Note that to split only the "tags" values that are strings, we need the conditional $cond expression. The condition here is $eq, which evaluates to true when the $type of the field is "string". By the way, the $type aggregation operator is new in 3.4.

Finally, we can overwrite the old collection using the $out stage. Because $project only passes along _id and the fields we list, any other field we want to keep must be included explicitly in the $project stage.

    db.collection.aggregate([
        { "$project": {
            "tags": {
                "$cond": [
                    { "$eq": [ { "$type": "$tags" }, "string" ] },
                    { "$split": [ "$tags", " " ] },
                    "$tags"
                ]
            }
        }},
        { "$out": "collection" }
    ])
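To make the effect of the $cond expression more concrete, here is a plain JavaScript sketch of the same per-document logic. The helper normalizeTags is hypothetical and runs outside MongoDB; it only illustrates the "split if string, otherwise keep as is" decision.

```javascript
// Hypothetical helper mirroring the $cond / $split expression above:
// strings are split on single spaces, anything else is returned unchanged.
function normalizeTags(tags) {
    return typeof tags === "string" ? tags.split(" ") : tags;
}

var fromString = normalizeTags("red blue green white");
// fromString is ["red", "blue", "green", "white"]

var fromArray = normalizeTags(["red", "orange", "black"]);
// fromArray is the same array, untouched
```

As in this sketch, splitting on a single space means that consecutive spaces in a string produce empty-string elements, so cleaner input may need an extra filtering step.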

With mapReduce, we need to use Array.prototype.split() in our map function to emit an array of substrings. We also need to filter our documents using the "query" option. From there, we iterate over the results array and $set the new value for "tags" using bulk operations: the bulkWrite() method, new in 3.2, or the now deprecated Bulk() API if we are on 2.6 or 3.0.

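    db.collection.mapReduce(
        // map: emit the split tags, keyed by the document _id
        function() { emit(this._id, this.tags.split(" ")); },
        // reduce: never invoked, because each _id key has only a single value
        function(key, value) {},
        {
            "out": { "inline": 1 },
            "query": {
                "tags.0": { "$exists": false },
                "tags": { "$type": 2 }
            }
        }
    )['results']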
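The results array returned by the mapReduce call holds one { _id, value } entry per matched document. One possible way to turn it into bulkWrite() operations is sketched below; buildTagUpdates and sampleResults are hypothetical names, and the sample data merely stands in for real mapReduce output.

```javascript
// Hypothetical helper: map inline mapReduce results ({ _id, value })
// to the updateOne operations that bulkWrite() accepts.
function buildTagUpdates(results) {
    return results.map(function (doc) {
        return {
            updateOne: {
                filter: { _id: doc._id },
                update: { $set: { tags: doc.value } }
            }
        };
    });
}

// Stand-in data; in the shell this would be the mapReduce 'results' array.
var sampleResults = [
    { _id: 1, value: ["red", "blue", "green", "white"] }
];

var operations = buildTagUpdates(sampleResults);

// In the mongo shell (3.2 or newer) the operations would be applied with:
//   db.collection.bulkWrite(operations)
```

For large collections it may be worth sending the operations in batches rather than in a single bulkWrite() call.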