Encrypt a string of numeric characters into an alphanumeric string

I have a string of numbers that I would like to shorten for use in a URL. The string always consists of digits only. For example: 9587661771112

Theoretically, encrypting a numeric string into an alphanumeric (0-9a-zA-Z) string should always give the shorter result I want.

I created an algorithm that does the following:

Encryption (string1 = numeric input string, string2 = alphanumeric return string)

  • Take the next two characters from string1 and convert them to a number, for example 95 for the input above.
  • Check whether the number is less than 52 (the combined length of a-z and A-Z):
    • if yes, append "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"[number] to string2 and move forward 2 characters
    • otherwise, append "0123456789"[first digit of the number] to string2 and move forward 1 character

In the next step, the number will be 58, etc.
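To make the steps above concrete, here is a minimal sketch of the untweaked scheme together with a matching decoder. The class name `PairCodec` and its method names are made up for illustration, and the handling of a single leftover digit at the end of the input is an assumption, since the description does not cover that case.

```csharp
using System;
using System.Text;

static class PairCodec
{
    const string Letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Consume two digits when they form a number below 52, otherwise one digit.
    public static string Encode(string digits)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < digits.Length)
        {
            if (i + 1 < digits.Length)
            {
                int pair = (digits[i] - '0') * 10 + (digits[i + 1] - '0');
                if (pair < Letters.Length)
                {
                    sb.Append(Letters[pair]);
                    i += 2;
                    continue;
                }
            }
            sb.Append(digits[i]); // pair too large, or only one digit left (assumed handling)
            i += 1;
        }
        return sb.ToString();
    }

    // A letter expands back to two digits; a digit stands for itself.
    public static string Decode(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            int index = Letters.IndexOf(c);
            if (index >= 0) sb.Append(index.ToString("D2"));
            else sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this sketch, `PairCodec.Encode("9587661771112")` gives `958766r7lm` (10 characters for 13 digits), while thirteen 1s give `llllll1` (7 characters), so the output length depends heavily on which digit pairs happen to fall below 52.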

With some tweaking, the shortest result I could get was: 9587661771112 -> j9UQpjva

My problem is that with this technique, the length of the result can vary dramatically. I also feel that this is not a clean solution to my problem.

So I need an encryption algorithm that converts a string of numbers into a shorter string of uppercase letters, lowercase letters and numbers. It must be decryptable and have a more or less consistent result.

Any idea how to achieve this?


Solution:

    string Chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Digits are written least significant first; Base62To10 reads them in the same order.
    string Base10To62(long N)
    {
        string R = "";
        do  // do/while so that 0 encodes as "0" instead of an empty string
        {
            R += Chars[(int)(N % 62)];
            N /= 62;
        } while (N != 0);
        return R;
    }

    long Base62To10(string N)
    {
        long R = 0;
        int L = N.Length;
        for (int i = 0; i < L; i++)
        {
            R += Chars.IndexOf(N[i]) * (long)Math.Pow(62, i);
        }
        return R;
    }

works like a charm :)

3 answers

Solution:

    // Must be static, because the static methods below use it.
    private static readonly string Chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    private static string Base10To62(string S)
    {
        string R = "";
        var N = long.Parse(S);
        do
        {
            R += Chars[(int)(N % 0x3E)];  // 0x3E == 62
        } while ((N /= 0x3E) != 0);
        return R;
    }

    private static string Base62To10(string S)
    {
        long R = 0;
        int L = S.Length;
        for (int i = 0; i < L; i++)
            R += Chars.IndexOf(S[i]) * (long)(System.Math.Pow(0x3E, i));
        return R.ToString();
    }
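One limitation of the `long`-based version is that `long.Parse` fails for inputs longer than about 18 digits. If longer numeric strings are possible, the same base-62 idea could be sketched with `System.Numerics.BigInteger`. The class name `Base62Big` is hypothetical, and the sketch assumes a non-negative input; like the `long` version, it drops leading zeros in the input.

```csharp
using System;
using System.Numerics;
using System.Text;

static class Base62Big
{
    const string Chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Emits base-62 digits least significant first, like the long-based version.
    // Assumes decimalDigits is a non-negative decimal number.
    public static string Encode(string decimalDigits)
    {
        BigInteger n = BigInteger.Parse(decimalDigits);
        var sb = new StringBuilder();
        do
        {
            sb.Append(Chars[(int)(n % 62)]);
            n /= 62;
        } while (!n.IsZero);
        return sb.ToString();
    }

    // Reads from the last (most significant) character using Horner's rule.
    public static string Decode(string encoded)
    {
        BigInteger n = BigInteger.Zero;
        for (int i = encoded.Length - 1; i >= 0; i--)
        {
            int digit = Chars.IndexOf(encoded[i]);
            if (digit < 0) throw new FormatException("Invalid base-62 character: " + encoded[i]);
            n = n * 62 + digit;
        }
        return n.ToString();
    }
}
```

If leading zeros matter, the length of the original string would have to be stored separately or fixed by convention.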

LINQ version of Base62To10, just for fun:

    // Requires: using System.Linq;
    long Base62To10(string N)
    {
        return N.Select((t, i) => Chars.IndexOf(t) * (long)Math.Pow(62, i)).Sum();
    }

If you can add two more characters to your set to make it a nice even 64, then there is a simple, fast algorithm that I can describe here.

Encode each digit with a three- or four-bit code as follows:

    0: 000
    1: 001
    2: 010
    3: 011
    4: 100
    5: 101
    6: 1100
    7: 1101
    8: 1110
    9: 1111

This is a prefix code, which means you can look at the first three bits to see whether you need the fourth. If the first three bits, read as an integer, are greater than 5, then get one more bit. Decoding therefore goes:

    get three bits as n
    if n < 6
        the result is n + '0'
    else
        n = (n << 1) + one more bit
        the result is n - 6 + '0'

The bits are then simply stored six at a time, each group as one of the 64 valid characters.

This runs into a problem if you do not know in advance how many digits there are: if four or five bits are left unused in the last character, the decoder cannot tell padding from another digit. In that case, the code can be changed like this:

    0:   000
    1:   001
    2:   010
    3:   011
    4:   100
    5:   1010
    6:   1011
    7:   1100
    8:   1101
    9:   1110
    eom: 1111

which takes a few more bits, but provides a unique marker for the end of the message.
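As a rough illustration of the variant with the end-of-message marker, the sketch below packs the code bits six at a time into a 64-character alphabet and decodes them again. The class name `PrefixCodec`, the choice of `$` and `_` as the two extra characters, the most-significant-bit-first packing, and the zero padding of the last character are all assumptions made for this example.

```csharp
using System;
using System.Text;

static class PrefixCodec
{
    // 62 alphanumerics plus '$' and '_' (assumed choice of the two extra characters).
    const string Alphabet =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$_";

    public static string Encode(string digits)
    {
        var sb = new StringBuilder();
        int acc = 0, bits = 0;              // pending bits, most significant first

        void Put(int value, int width)
        {
            acc = (acc << width) | value;
            bits += width;
            while (bits >= 6)               // emit every complete 6-bit group
            {
                bits -= 6;
                sb.Append(Alphabet[(acc >> bits) & 63]);
            }
            acc &= (1 << bits) - 1;         // keep only the bits not yet written
        }

        foreach (char c in digits)
        {
            int d = c - '0';
            if (d < 0 || d > 9) throw new FormatException("Not a digit: " + c);
            if (d < 5) Put(d, 3);           // 0..4 -> 000..100
            else Put(d + 5, 4);             // 5..9 -> 1010..1110
        }
        Put(15, 4);                         // eom -> 1111
        if (bits > 0) Put(0, 6 - bits);     // zero-pad the final character
        return sb.ToString();
    }

    public static string Decode(string text)
    {
        var sb = new StringBuilder();
        int pos = 0;                        // index of the next bit to read

        int Take(int width)
        {
            int value = 0;
            for (int k = 0; k < width; k++, pos++)
            {
                if (pos / 6 >= text.Length) throw new FormatException("Missing eom marker");
                int symbol = Alphabet.IndexOf(text[pos / 6]);
                if (symbol < 0) throw new FormatException("Invalid character");
                value = (value << 1) | ((symbol >> (5 - pos % 6)) & 1);
            }
            return value;
        }

        while (true)
        {
            int n = Take(3);
            if (n < 5) { sb.Append((char)('0' + n)); continue; }
            n = (n << 1) | Take(1);         // codes starting 101, 110, 111 need a fourth bit
            if (n == 15) return sb.ToString();   // eom reached; remaining bits are padding
            sb.Append((char)('0' + n - 5));
        }
    }
}
```

Unlike the base-62 conversion, this keeps leading zeros and works for any number of digits, because each digit is coded independently.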

With the first code, you store an average of 1.76 digits per character. With the second, 1.71 digits per character, minus some overhead for the eom marker that depends on how many digits you encode at a time.
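These averages can be checked with a short calculation, assuming every digit 0-9 is equally likely (an assumption the answer does not state explicitly). Each character carries six bits, so the digits per character are six divided by the average code length:

```latex
\begin{align*}
\text{first code:}  \quad & \frac{6 \cdot 3 + 4 \cdot 4}{10} = 3.4 \text{ bits/digit},
  & \frac{6}{3.4} &\approx 1.76 \text{ digits/character} \\
\text{second code:} \quad & \frac{5 \cdot 3 + 5 \cdot 4}{10} = 3.5 \text{ bits/digit},
  & \frac{6}{3.5} &\approx 1.71 \text{ digits/character}
\end{align*}
```

For comparison, a plain base conversion reaches $\log_{10} 62 \approx 1.79$ digits per character with 62 symbols and $\log_{10} 64 \approx 1.81$ with 64, so the prefix codes trade a little density for digit-by-digit encoding.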

If you really can use only 62 characters, then I will need to think about this a bit.

Update:

A quick look at RFC 1738 indicates that a lot more characters can be used in a URL:

    lowalpha   = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                 "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                 "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                 "y" | "z"
    hialpha    = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
                 "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
                 "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
                 "Y" | "Z"
    alpha      = lowalpha | hialpha
    digit      = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                 "8" | "9"
    safe       = "$" | "-" | "_" | "." | "+"
    extra      = "!" | "*" | "'" | "(" | ")" | ","
    unreserved = alpha | digit | safe | extra

Thus, adding, say, $ and _ to your set will make it 64.
