Mongoimport of CSV files with string _id and upsert

I am trying to use mongoimport to upsert data with string values in _id. Since the identifiers look like integers (even when they are enclosed in quotation marks), mongoimport treats them as integers and creates new records instead of updating the existing ones.

The command that I run:

mongoimport --host localhost --db database --collection my_collection --type csv --file mydata.csv --headerline --upsert

Sample data in mydata.csv file:

{ "_id" : "0364", someField: "value" }

As a result, mongo inserts a new record, { "_id" : 364, someField: "value" }, instead of updating the record with _id "0364".

Does anyone know how to make it treat _id as strings?

Things that don't work:

  • Surrounding the data with doubled double quotes (""0364""), or with double and single quotes ("'0364'" or '"0364"')
  • Adding an empty string to a value: { "_id" : "0364" + "", someField: "value" }
5 answers

Unfortunately, there is currently no way to force numeric-looking strings to be interpreted as strings:

https://jira.mongodb.org/browse/SERVER-3731

You can write a script in Python, or any other language you are comfortable with, along these lines:

import csv
import pymongo

# Connects to mongod on localhost:27017 by default.
client = pymongo.MongoClient()
collection = client.mydatabase.mycollection

with open('myfile.csv', newline='') as f:
    # csv.DictReader yields every value as a str, so '0364' stays a string.
    for line in csv.DictReader(f):
        print('_id', line['_id'])
        upsert_fields = {
            '_id': line['_id'],
            'my_other_upsert_field': line['my_other_upsert_field']}

        # Replace the matching document, or insert it if none matches.
        collection.replace_one(upsert_fields, line, upsert=True)

I just ran into the same problem and found an alternative. You can force Mongo to use string types for non-string values by converting the CSV to JSON and quoting the field. For example, if your CSV looks like this:

key value
123 foo
abc bar

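then 123 is imported as an integer and abc as a string. If you convert the file to JSON, making sure the values are quoted, and use --type json when importing, you get the desired behavior: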

{
    "123":"foo",
    "abc":"bar"
}
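
A minimal sketch of the conversion step, assuming a comma-separated file with a header row; the function name csv_to_json_lines is hypothetical. csv.DictReader returns every cell as a str, so json.dumps writes numeric-looking values in quotes. The output has one document per line, which is the layout mongoimport --type json reads by default.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text):
    """Convert CSV text (with a header row) to one JSON document per line.

    csv.DictReader yields every cell as a str, so json.dumps emits
    numeric-looking values such as 0364 as quoted strings.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return "\n".join(json.dumps(row) for row in reader)


if __name__ == "__main__":
    sample = "_id,someField\n0364,value\n"
    print(csv_to_json_lines(sample))
```

Writing the result to a file and importing it with --type json instead of --type csv should keep "0364" as a string.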

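Prefixing the numeric string with a non-numeric string works. For example:

00012345 is imported as 12345 (type Int)
string00012345 is imported as string00012345 (type String)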

If the data comes from SQL, the prefix can be added in the query:

select 'string'+column as name

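The prefix has to be stripped again afterwards, but that is less effort than converting a TSV file to json.

+1 on the jira issue above.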
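
The same prefix trick can be applied to the file itself before importing. A rough sketch, assuming a comma-separated file with a header row; add_prefix, strip_prefix and the marker "string" are illustrative names, not part of any tool.

```python
import csv
import io

PREFIX = "string"  # hypothetical marker; any non-numeric text works


def add_prefix(csv_text, column, prefix=PREFIX):
    """Return CSV text with `prefix` prepended to every value in `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = prefix + row[column]
        writer.writerow(row)
    return out.getvalue()


def strip_prefix(value, prefix=PREFIX):
    """Undo add_prefix on a single value after the import."""
    return value[len(prefix):] if value.startswith(prefix) else value
```

strip_prefix is meant for whatever code reads the documents back, since the stored _id keeps the marker.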


As an alternative to @Jesse's approach, you can convert the values in the mongo shell after importing:

db.my_collection.find().forEach(function (obj) {
  db.my_collection.remove({_id: obj._id}); // remove the old one
  obj._id = '' + obj._id; // change to string (leading zeros lost on import are not restored)
  db.my_collection.save(obj); // resave
});

For fields other than _id:

db.my_collection.find().forEach(function (obj) {
  obj.someField = '' + obj.someField; // change to string
  db.my_collection.save(obj); // resave
});

I ran into the same problem.

I believe the easiest way is to convert the CSV file to a JSON file using an online tool and then import.

This is the tool I used:

http://www.convertcsv.com/csv-to-json.htm

It lets you wrap the integer values of your CSV file in double quotes in the resulting JSON file.

If importing this JSON file fails with an error, add --jsonArray to the import command, since the tool writes the documents as a single JSON array:

mongoimport --host localhost --db mydb -c mycollection --type json --jsonArray --file <file_path>