C# URL shortener

Question

I want to uniquely shorten strings and file names for use in URLs, the way bit.ly and similar services do. I could use the IDs from the database, but I want the URLs to look random.

What would be the best solution?

The site will be a mobile site, so I want the URLs to be as short as possible.

+6

c# url-shortener bit.ly

nLL Jan 12 '10 at 21:08

5 answers

There are two methods for implementing a mapping service, such as the one you describe.

1. Clients submit globally unique identifiers, or
2. Server generates globally unique identifiers

Clients submit globally unique identifiers

As for the first approach, you should really only attempt it with Guids, unless you devise a similar way of packing sufficiently distinct information into a short byte stream. Either way, if you have a byte stream representing a globally unique identifier, you can do something like this:

    // source is either a Guid or some other globally unique byte stream
    byte[] bytes = Guid.NewGuid().ToByteArray();
    // strip the trailing '=' padding characters
    string base64String = Convert.ToBase64String(bytes).TrimEnd('=');

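to get a compact string that looks random but avoids the collisions inherent in other random schemes. A Guid is 16 bytes, or 128 bits, which Base64-encodes to 24 characters, or 22 once the two padding characters are trimmed. Note that standard Base64 output can contain '+' and '/', so it is not purely alphanumeric; replace those characters (for example with '-' and '_') before putting the string in a URL.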
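As a minimal sketch of the client-side encoding described above, the following turns a Guid into a 22-character URL-safe token and back. The class name GuidToken and its two methods are hypothetical names chosen for illustration; the '-' and '_' substitutions follow the common "base64url" convention and are an assumption, not something the answer prescribes.

```csharp
using System;

public static class GuidToken
{
    // Encodes a Guid as 22 URL-safe characters: standard Base64 with the
    // two '=' padding characters removed and '+' / '/' swapped for '-' / '_'.
    public static string Encode(Guid id)
    {
        return Convert.ToBase64String(id.ToByteArray())
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Reverses Encode: restores the standard alphabet and the padding
    // (16 bytes always need exactly two '=' characters) before decoding.
    public static Guid Decode(string token)
    {
        string base64 = token.Replace('-', '+').Replace('_', '/') + "==";
        return new Guid(Convert.FromBase64String(base64));
    }
}
```

A client could call GuidToken.Encode(Guid.NewGuid()) to mint its own token without contacting the server, and the server can recover the original Guid with GuidToken.Decode.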

The advantage of this approach is that clients can generate their own tiny URIs without a central authority. The disadvantage is the considerable length if you go with Guid, or the need to implement your own globally unique byte stream, which, let's face it, is error-prone.

If you go this route, consider searching for existing schemes for globally unique byte streams. Oh, and STAY AWAY FROM RANDOM BYTES, otherwise you will have to build collision resolution ON TOP of your tiny URI generator.

Server generates globally unique identifiers

Again, the main advantage of the approach above is that the client can generate its URIs a priori. That is especially handy if the client is about to submit a long-running request that it wants to check on later. This may not be particularly relevant to your situation and may be of only limited value.

That aside, a server-centric approach, in which a single authority generates and hands out identifiers, may be more appealing. If this is the route you choose, the only question is how long you would like your URIs to be.

Assuming a desired length of 5 characters and Base64 encoding, each identifier holds 5 characters at 6 bits per character, i.e. 30 bits, or 2^30 [1,073,741,824] distinct values. That is a fairly large domain. *

Then there is the question of handing out a value for a given submission. There are probably many ways to do this, but I would go with something like this:

Enumerate all possible values in a "free list" in your database
Remove a value from the free list when it is consumed
Add a value back to the free list when it is released

Improvements or optimizations may include:

Do not enumerate every value in the range [0, 2^30) up front; instead list a manageable subset of, say, 100,000 values at a time, and when they have all been consumed, generate the next 100,000 values in the sequence and continue
Add an expiry date to each value, and recycle expired values at the end of the day
Distribute your service: when parallelizing, simply hand out small, mutually exclusive subsets of your free list to the distributed instances

Conclusion

The bottom line is that you want to guarantee uniqueness, so collisions are a big no-no.

* 1,073,741,824 is the size of the raw domain, i.e. all identifiers of length 0 to 5. If you want to restrict identifiers to exactly 5 characters, your domain is all identifiers of length 0 to 5 (2^30) minus all identifiers of length 0 to 4 (2^24): 2^30 - 2^24 = 1,056,964,608, which is still quite large :)
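The free-list scheme from this answer, including the "one block at a time" optimization, can be sketched roughly as follows. This is an in-memory stand-in for what the answer keeps in a database; the class name FreeList, its members, and the block-size parameter are hypothetical names for illustration. It is not thread-safe and does not guard against releasing the same value twice.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for the database-backed free list (an assumption;
// the answer stores the list in a database).
public sealed class FreeList
{
    private readonly Queue<long> free = new Queue<long>();
    private readonly long domainSize;
    private readonly long blockSize;
    private long nextUnlisted; // first value not yet placed on the free list

    public FreeList(long domainSize, long blockSize)
    {
        this.domainSize = domainSize;
        this.blockSize = blockSize;
    }

    // Hands out an unused value, listing the next block only when needed.
    public long Acquire()
    {
        if (free.Count == 0) Refill();
        if (free.Count == 0)
            throw new InvalidOperationException("Identifier domain exhausted.");
        return free.Dequeue();
    }

    // Puts a value back on the free list so it can be handed out again.
    public void Release(long value)
    {
        free.Enqueue(value);
    }

    // Lists the next block of values, stopping at the end of the domain.
    private void Refill()
    {
        long end = Math.Min(nextUnlisted + blockSize, domainSize);
        for (; nextUnlisted < end; nextUnlisted++) free.Enqueue(nextUnlisted);
    }
}
```

For a 5-character identifier this might be constructed as new FreeList(1L << 30, 100000). The values come out in sequence, so if the identifiers should look random, some extra scrambling step would be needed on top of this sketch.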

+5

johnny g Jan 12 '10 at 21:54

Store a random alphanumeric string and use it as the short URL. Make it whatever length you think works best for your site and users, e.g. www.yoursite.com/d8f3
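A rough sketch of this idea, assuming an in-memory dictionary in place of the database and a retry loop to deal with collisions (which random strings will eventually produce). RandomCodeStore and its members are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public sealed class RandomCodeStore
{
    private const string Alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";

    // Stand-in for a database table with a unique key on the code (assumption).
    private readonly Dictionary<string, string> urlsByCode =
        new Dictionary<string, string>();

    // Stores the URL under a fresh random code, retrying on collision.
    // Note: this loops forever once every code of that length is taken.
    public string Add(string url, int length)
    {
        while (true)
        {
            string code = NewCode(length);
            if (!urlsByCode.ContainsKey(code))
            {
                urlsByCode.Add(code, url);
                return code;
            }
        }
    }

    public bool TryGet(string code, out string url)
    {
        return urlsByCode.TryGetValue(code, out url);
    }

    // Maps random bytes onto the alphabet. The modulo introduces a slight
    // bias toward the first few characters, acceptable for a sketch.
    private static string NewCode(int length)
    {
        byte[] bytes = new byte[length];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }
        char[] chars = new char[length];
        for (int i = 0; i < length; i++)
        {
            chars[i] = Alphabet[bytes[i] % Alphabet.Length];
        }
        return new string(chars);
    }
}
```

With 4 characters from a 36-character alphabet there are 36^4 = 1,679,616 possible codes, so retries become more frequent as the store fills up.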

0

Rhicke Jan 12 '10 at 21:11

You can use a hash (e.g. CRC32) to create fairly short URLs. You can never get truly "unique" URLs this way, though: you are reducing the data, so there are bound to be collisions.

0

Adam pope Jan 12 '10 at 21:13

Hi, as several other people have told you, if you compress the URL into something small, it is impossible to keep it unique. Instead, you need to create your own code for each URL submitted to you. One (simple) way to do this is to store the submitted URLs in a database, generate a guid field for each one, and take a substring of it, making sure each time that the substring differs from every one registered before.

For example: www.google.com with the guid F9168C5E-CEB2-4faa-B6BF-329BF39FA1E4 → http://www.mysite.com/?q=CEB2

The more characters you use, the more links you can track. In this example you can have 65,536 different links (4 hexadecimal characters).

Hope this helps.

-2

rodrigoelp Jan 12 '10 at 23:32

Anon. · Accepted Answer · 2010-01-12T21:11:43+0000

You cannot "uniquely shorten" arbitrary strings. The Pigeonhole principle and all.

What you want to do (and, AFAIK, what URL-shortening services do) is keep a database of everything submitted, along with the short string used for it. Then you can look it up in the database.

You can generate the short strings simply by incrementing a number and Base64-encoding it each time.
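One way to read "increment a number and Base64-encode it" is sketched below. It converts the counter digit by digit using a URL-safe 64-character alphabet instead of calling Convert.ToBase64String on the raw bytes, which would produce fixed-width, padded output; that substitution, and the names CounterCode, Encode and Decode, are assumptions made for illustration.

```csharp
using System;
using System.Text;

public static class CounterCode
{
    // URL-safe 64-character alphabet; each character carries 6 bits.
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

    // Converts a non-negative counter value to its shortest base-64 string.
    public static string Encode(long id)
    {
        if (id < 0) throw new ArgumentOutOfRangeException(nameof(id));
        if (id == 0) return "A";
        var sb = new StringBuilder();
        while (id > 0)
        {
            sb.Insert(0, Alphabet[(int)(id % 64)]); // least significant digit first
            id /= 64;
        }
        return sb.ToString();
    }

    // Reverses Encode so the database row can be found from the short string.
    public static long Decode(string code)
    {
        long id = 0;
        foreach (char c in code)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0) throw new FormatException("Invalid character in code.");
            id = id * 64 + digit;
        }
        return id;
    }
}
```

The counter would typically be the auto-incremented database ID, so CounterCode.Encode(64) yields "BA" and every value below 2^30 fits in at most 5 characters. The resulting strings are sequential rather than random-looking.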
