There are two methods for implementing a mapping service, such as the one you describe.
- Clients generate globally unique identifiers, or
- Server generates globally unique identifiers
Clients generate globally unique identifiers
As far as I know, option 1 should only be attempted with Guids, unless you devise a comparable way to pack sufficiently distinct information into a short byte stream. Either way, if you have a byte stream representing a globally unique identifier, you can do something like this
    // source is either a Guid, or some other globally unique byte stream
    byte[] bytes = Guid.NewGuid().ToByteArray();
    // TrimEnd takes a char; this strips the trailing '=' padding
    string base64String = Convert.ToBase64String(bytes).TrimEnd('=');
to get a user-readable string of mostly alphanumeric characters (Base64 also uses '+' and '/') that looks random but avoids the collisions inherent in other random schemes. A Guid contains 16 bytes, or 128 bits, which at 6 bits per character comes to a Base64 encoding of 22 characters once the padding is trimmed.
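As a minimal sketch of the idea above, the following wraps the conversion in a helper and makes the result safe to place in a Uri path. The class and method names (ShortId, FromGuid, ToGuid) are made up for illustration, and replacing '+' and '/' with '-' and '_' is an assumption about what your Uris need, not something the approach itself requires.

```csharp
using System;

// Hypothetical helper; the names are illustrative only.
public static class ShortId
{
    // Encodes the 16 bytes of a Guid as 22 URL-safe Base64 characters.
    public static string FromGuid(Guid value)
    {
        return Convert.ToBase64String(value.ToByteArray())
            .TrimEnd('=')        // 16 bytes always produce exactly two '=' padding characters
            .Replace('+', '-')   // '+' and '/' have special meaning inside Uris
            .Replace('/', '_');
    }

    // Reverses FromGuid by restoring the standard alphabet and the padding.
    public static Guid ToGuid(string id)
    {
        string base64 = id.Replace('-', '+').Replace('_', '/') + "==";
        return new Guid(Convert.FromBase64String(base64));
    }
}
```

Because the mapping is reversible, the server can recover the original Guid from the short form without storing a separate lookup entry.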
The advantage of this approach is that clients can generate their own tiny Uris without a central authority. The disadvantage is the length if you go with Guid, or the need to implement your own globally unique byte stream, which, let's face it, is error prone.
If you go this route, consider searching for existing globally unique byte stream schemes or the like. Oh, and STAY AWAY FROM RANDOM BYTES, otherwise you will need to build collision resolution ON TOP of your tiny Uri generator.
Server generates globally unique identifiers
Again, the main advantage of the above is that the client can generate its Uris a priori. That is especially handy if you are about to submit a long-running request that you want to check on later. This may not be particularly relevant to your situation and may provide only limited value.
So, that aside, a server-centric approach, in which a single authority generates and hands out identifiers, may be more appealing. If this is the route you choose, then the only question is how long would you like your Uri to be?
Assuming a desired length of 5 characters, and supposing you go with Base64 encoding, each identifier holds 5 characters at 6 bits per character, which is 30 bits, or 2 ^ 30 [1 073 741 824] distinct values. This is a fairly large domain. *
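To make the arithmetic concrete: a Base64-style alphabet has 64 symbols, so each character carries 6 bits and five characters cover 64 ^ 5 = 2 ^ 30 values. The sketch below maps a number in that range to a fixed five-character string. The TinyCode class and its alphabet order are hypothetical choices for illustration; any 64 distinct Uri-safe characters would do.

```csharp
using System;

// Hypothetical encoder; the alphabet order is an arbitrary choice.
public static class TinyCode
{
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

    public const int Length = 5;
    public const long DomainSize = 1L << (6 * Length); // 64^5 = 2^30 = 1,073,741,824

    // Maps a value in [0, DomainSize) to exactly five characters, 6 bits each.
    public static string Encode(long value)
    {
        if (value < 0 || value >= DomainSize)
            throw new ArgumentOutOfRangeException(nameof(value));

        var chars = new char[Length];
        for (int i = Length - 1; i >= 0; i--)
        {
            chars[i] = Alphabet[(int)(value & 63)]; // the lowest 6 bits select one character
            value >>= 6;
        }
        return new string(chars);
    }
}
```

Under these assumptions, the server only needs to track which numbers are in use; the string form is derived on demand.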
Then the question becomes how to hand out a value for a given request. There are probably a lot of ways to do this, but I would go with something like this,
- List all possible values in the "free list" in your database
- Remove value from free list when consumed
- Add value to free list at release
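The three steps above can be sketched in memory as follows. This is only a stand-in for the database-backed free list described here: the FreeList class, its method names, and the use of a queue plus a set are all assumptions made for illustration, and a real service would persist this state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a database-backed free list.
public sealed class FreeList
{
    private readonly Queue<long> _free = new Queue<long>();
    private readonly HashSet<long> _inUse = new HashSet<long>();
    private readonly object _gate = new object();

    // Lists every value in [start, start + count) as free.
    public FreeList(long start, int count)
    {
        for (long v = start; v < start + count; v++)
            _free.Enqueue(v);
    }

    // Removes a value from the free list and marks it as consumed.
    public long Acquire()
    {
        lock (_gate)
        {
            if (_free.Count == 0)
                throw new InvalidOperationException("Free list exhausted.");
            long value = _free.Dequeue();
            _inUse.Add(value);
            return value;
        }
    }

    // Returns a previously consumed value to the free list.
    public bool Release(long value)
    {
        lock (_gate)
        {
            if (!_inUse.Remove(value))
                return false; // never handed out, or already released
            _free.Enqueue(value);
            return true;
        }
    }
}
```

Since a value can only leave the free list once until it is released, two requests can never receive the same identifier, which is the uniqueness guarantee the server approach is after.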
Improvements or optimizations may include
- Do not list every value in the full 5-character domain; instead list a manageable subset of, say, 100,000 values at a time, and when all of them are consumed, generate the next 100,000 values in the sequence and continue
- Add an expiration date to the values and recycle the expired ones at the end of the day.
- Distribute your service: parallelizing it simply means handing small, mutually exclusive subsets of your free list to each distributed instance.
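The first and third improvements both come down to carving the identifier space into disjoint blocks. Here is a rough sketch, assuming a single central allocator that hands out half-open ranges; the BlockAllocator name and its shape are hypothetical, and expiration is not covered.

```csharp
using System;

// Hypothetical central allocator: hands out disjoint blocks of the identifier space.
public sealed class BlockAllocator
{
    private readonly long _domainSize;
    private readonly int _blockSize;
    private readonly object _gate = new object();
    private long _next; // first value not yet assigned to any block

    public BlockAllocator(long domainSize, int blockSize)
    {
        _domainSize = domainSize;
        _blockSize = blockSize;
    }

    // Returns the next unassigned block as a half-open range [Start, End).
    public (long Start, long End) NextBlock()
    {
        lock (_gate)
        {
            if (_next >= _domainSize)
                throw new InvalidOperationException("Identifier space exhausted.");
            long start = _next;
            long end = Math.Min(start + _blockSize, _domainSize);
            _next = end;
            return (start, end);
        }
    }
}
```

Each service instance would request a block, populate its own local free list from it, and come back for another block only when that one runs out, so instances never need to coordinate on individual values.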
Conclusion
The bottom line is that you want to guarantee uniqueness - so a collision is a big no-no.
* = 1 073 741 824 is the size of the raw domain at 6 bits per Base64 character (2 ^ 30), that is, all identifiers of length 0 to 5. If you want every identifier to be exactly 5 characters long, your domain is all identifiers of length 0 to 5 (2 ^ 30) minus all identifiers of length 0 to 4 (2 ^ 24), or 2 ^ 30 - 2 ^ 24 = 1 056 964 608, which is still quite large :)