I need to choose between Cassandra and MongoDB (or another NoSQL database; I'm open to suggestions) for a project with a lot of inserts (1M/day). So I wrote a small test to measure write performance. Here is the code that inserts into Cassandra:
    import time
    import os
    import random
    import string
    import pycassa

    def get_random_string(string_length):
        return ''.join(random.choice(string.letters) for i in xrange(string_length))

    def connect():
        """Connect to a test database"""
        connection = pycassa.connect('test_keyspace', ['localhost:9160'])
        db = pycassa.ColumnFamily(connection,'foo')
        return db

    def random_insert(db):
        """Insert a record into the database.

        The record has the following format
        ID timestamp
        4 random strings
        3 random integers"""
        record = {}
        record['id'] = str(time.time())
        record['str1'] = get_random_string(64)
        record['str2'] = get_random_string(64)
        record['str3'] = get_random_string(64)
        record['str4'] = get_random_string(64)
        record['num1'] = str(random.randint(0, 100))
        record['num2'] = str(random.randint(0, 1000))
        record['num3'] = str(random.randint(0, 10000))
        db.insert(str(time.time()), record)

    if __name__ == "__main__":
        db = connect()
        start_time = time.time()
        for i in range(1000000):
            random_insert(db)
        end_time = time.time()
        print "Insert time: %lf " %(end_time - start_time)
The MongoDB version differs only in the import and the connection function:
    import pymongo  # instead of pycassa

    def connect():
        """Connect to a test database"""
        connection = pymongo.Connection('localhost', 27017)
        db = connection.test_insert
        return db.foo2
Results: ~1046 seconds to insert into Cassandra and ~437 seconds to insert into Mongo. I expected Cassandra to be much faster than Mongo at inserting data. So what am I doing wrong?
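To put those timings in perspective, here is a rough throughput calculation. It is only a sketch: it assumes the 1M daily inserts are spread evenly over the day (real traffic is probably burstier), and the helper name `inserts_per_second` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def inserts_per_second(total_inserts, elapsed_seconds):
    """Average throughput over the whole run (hypothetical helper)."""
    return float(total_inserts) / elapsed_seconds


# Assumption: the 1M inserts/day arrive at a constant rate.
required = inserts_per_second(1000000, SECONDS_PER_DAY)

# Measured totals from the test above (1M inserts each).
cassandra = inserts_per_second(1000000, 1046)
mongo = inserts_per_second(1000000, 437)

print("required:  %.1f inserts/s" % required)   # required:  11.6 inserts/s
print("cassandra: %.1f inserts/s" % cassandra)  # cassandra: 956.0 inserts/s
print("mongo:     %.1f inserts/s" % mongo)      # mongo:     2288.3 inserts/s
```

Under that even-load assumption, both measured rates are well above the average rate the project needs. Note that the measured time also includes generating the random records in Python, not just the database writes.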
python mongodb cassandra nosql
fasouto