Efficiently create unique keys for database entries

I am currently developing a prototype registration system. It is very simple: essentially just a .NET form that writes to MongoDB.

What I'm stuck on is an efficient way to create a unique identifier / key for each user. These identifiers should be human friendly, so something like a 7-character alphanumeric string, for example A1B2C3X.

The solutions I have seen so far use a simple function to generate a random string, then check the database to see if it is unique (and if not, repeat until a unique one is found). Of course, this will become more and more expensive as the number of records in the database grows.

My idea is to pre-generate the unique identifiers and store them in a separate database. Then, when I need to add a new record to the user database, I can "pop" an identifier from my ID database (in constant time) and know that it does not already exist in the user database, without needing to search for it.

I'm sure someone must have done something like this before. Is there a better way? I don't know why I'm struggling with this so much. Your input is greatly appreciated.

1 answer

Creating a random string in the application and checking its uniqueness is not a bad solution. Don't worry about it being inefficient: it isn't, and definitely not compared to the alternatives. It will certainly be faster than running db.user.count() or keeping a separate table of pre-generated identifiers. You just need to do it right.

First of all, how often will new users be created? Probably not very often compared to other operations, so the whole discussion of efficiency is really moot. Secondly, with 7 characters from A-Z and 0-9, the range is 36^7, or about 78 billion. It will be some time before you begin to see collisions, to say the least.
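To put numbers on that, here is a minimal Python sketch (not .NET, and not part of the original answer) that generates such an ID and estimates the chance that a single new ID collides with an existing one. The names `random_id` and `collision_chance` are made up for illustration, and the estimate assumes IDs are drawn uniformly at random.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits  # A-Z and 0-9: 36 symbols
ID_LENGTH = 7

def random_id(length=ID_LENGTH):
    """Return a random ID such as 'A1B2C3X' (hypothetical helper)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def collision_chance(existing_ids, length=ID_LENGTH):
    """Chance that one new uniformly random ID matches an existing one."""
    return existing_ids / len(ALPHABET) ** length

key_space = len(ALPHABET) ** ID_LENGTH
print(key_space)                    # 78364164096
print(collision_chance(1_000_000))  # roughly 1.3e-05
```

Even with a million users already registered, only about one new registration in 78,000 would need a second attempt.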

If you do it like this, there is no penalty unless a collision occurs (which is highly unlikely):

  • Generate a random user ID
  • Insert your user object using the user ID as the _id value
  • Check for a duplicate key error (how to do this depends on the language and driver, but may involve running the getLastError command).
  • If you get a duplicate key error, start over by generating a new user ID

This way there is extra work only in the event of a collision (and I really want to emphasize how incredibly unlikely that is).
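The steps above can be sketched as follows. This is an illustration in Python under stated assumptions, not the original poster's .NET code: `InMemoryUsers` and the local `DuplicateKeyError` are hypothetical stand-ins for a real collection and the driver's duplicate key error, and `create_user`, `random_id`, and `max_attempts` are names invented here. With PyMongo, for instance, `insert_one` raises `pymongo.errors.DuplicateKeyError` in the same situation.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits

class DuplicateKeyError(Exception):
    """Stand-in for the driver's duplicate key error."""

class InMemoryUsers:
    """Hypothetical stand-in for a collection with a unique _id."""

    def __init__(self):
        self._docs = {}

    def insert_one(self, doc):
        if doc["_id"] in self._docs:
            raise DuplicateKeyError(doc["_id"])
        self._docs[doc["_id"]] = doc

def random_id(length=7):
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def create_user(users, fields, make_id=random_id, max_attempts=5):
    """Insert a user under a fresh random _id, retrying only on collision."""
    for _ in range(max_attempts):
        doc = dict(fields, _id=make_id())
        try:
            users.insert_one(doc)  # no lookup beforehand: just try the insert
        except DuplicateKeyError:
            continue  # collision: generate a new ID and try again
        return doc["_id"]
    raise RuntimeError("could not find a free ID")

users = InMemoryUsers()
user_id = create_user(users, {"name": "Ada"})
```

The database's unique index does the uniqueness check as part of the insert, so the common case costs exactly one write and no separate query.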

Another way to generate a unique user ID is to take the current UNIX timestamp (to the second), append a hash of the host name, then the process ID and, finally, the current value of a counter. This is actually how a Mongo ObjectId is built, and it means each process can generate as many IDs per second as the maximum value of the counter (which in Mongo is 3 bytes, so about 16 million). See the ObjectId docs for details: http://www.mongodb.org/display/DOCS/Object+IDs

This has the property that your user IDs sort in the order they were created, but an ObjectId is 12 bytes long, so unfortunately it is quite a bit longer than your 7 characters. You could use the same method but skip the host name / PID and reduce the counter (which can also be a random number if you want) to two bytes. That gets you down to 6 bytes, which can probably be encoded in about 9 or 10 characters from A-Z, 0-9.
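A rough Python sketch of that shortened scheme, under assumptions of my own: the names `to_base36` and `compact_id` are made up, the counter is a simple per-process counter, and the result is zero-padded to 10 characters because the full 48-bit range does not quite fit in 9 (36^9 is smaller than 2^48). Without the host name and PID, two processes can produce the same ID in the same second, so the duplicate key check on insert is still needed.

```python
import itertools
import time

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
_counter = itertools.count()  # per-process counter (an assumption)

def to_base36(number):
    """Encode a non-negative integer using 0-9 and A-Z."""
    if number == 0:
        return "0"
    chars = []
    while number:
        number, remainder = divmod(number, 36)
        chars.append(DIGITS[remainder])
    return "".join(reversed(chars))

def compact_id(now=None):
    """4-byte UNIX timestamp followed by a 2-byte counter, in base 36."""
    seconds = int(time.time() if now is None else now) & 0xFFFFFFFF
    count = next(_counter) & 0xFFFF  # wraps around after 65,536 IDs
    value = (seconds << 16) | count  # 6 bytes in total
    # Zero-pad so that string order matches creation order.
    return to_base36(value).rjust(10, "0")

print(compact_id())
```

Because the timestamp occupies the high bits and the output has a fixed width, IDs generated in later seconds compare greater as strings, which preserves the sort-by-creation property mentioned above.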
