Create a unique identifier (check or not)?

Given the youtube video url (for example):

eg.:

http://www.youtube.com/watch?v=-JVkaMqD5mI&feature=related 

I am talking about the part -JVkaMqD5mI . (length = 11)

allows you to calculate the parameters:

 az = 26 | AZ = 26 |_______ > 26+26+10+2 = 64 optional chars in 11 places = 64^11 = 73786976294838206464 0-9 = 10 | -_ = 2 | 

I'm still curious when they generate a new ID for a new video, they still check if it already exists?

I am sure that they have a list (db or cache) of “already generated identifiers” ... (and if they do, do they each get db? Or in cache? Or ...?)

Or they rely on odds of 1.355252...e-20 , which are almost 0 . (but still! = 0)

What are the best solutions for such situations?

+2
source share
2 answers

Well, just because they use an alphanumeric identifier on the video does not mean that they simply generate these characters at random. Just because this line looks like random trash to you, I assure you that this is not an accident, and it contains a lot of information.

Such a quick answer: No, it is impossible to create a random sequence of letters, and then: a) hope that there will be no collisions, or b) check perhaps billions of entries to see if you have it.

It is much easier to keep the central “last identifier” and have an algorithm that moves from the “last identifier” to the “next identifier to use” in such a way as to mathematically guarantee the receipt of a previously unused identifier. In the case of consecutive identification numbers, this formula is simply f (n + 1) = f (n) +1 (for example, the last identifier was 150, the next was 151 .. guaranteed not to be used far), but you can develop your own formulas in accordance with your needs.

+6
source

A hash function is usually used for these purposes. It creates data or rows of a fixed length from some other data, which can be any given length or type. To do this, he uses some algorithm. One example is the one you gave, encoding letters into numbers.

Hash functions are not as simple as they look. There can be a serious mathematical method behind them, and you can try to prove that they are ideal or minimal (which is not so important for this example).

An ideal function is a hash function that cannot generate the same output for any two different imputations. If you have such a hash function, you do not need to check for duplicates. If you want to do this, you need to prove that your hash function is perfect.

0
source

All Articles