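You can encode the string in base 40, which is more compact than base 64: each character costs log2(40) ≈ 5.32 bits instead of 6. Twelve such tokens fit in 64 bits (40^12 < 2^64), and three fit in 16 bits (40^3 = 64,000 < 65,536), which is the grouping the code below uses. One of the 40 tokens (index 0, '\0', in the code below) can serve as an end-of-string token to give you the length, since the encoded data no longer uses a whole number of bytes per character.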
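To make the "12 tokens in 64 bits" claim concrete, here is a minimal sketch that packs up to 12 characters into a single `long`. The class name `Base40Long` and the methods `pack`/`unpack` are hypothetical names chosen for illustration; the alphabet is assumed to be the same 40 symbols as in the answer's code ('\0', a-z, 0-9, and "-:."). Because 40^12 is larger than `Long.MAX_VALUE` but smaller than 2^64, the value is treated as unsigned when unpacking.

```java
class Base40Long {
    // Assumed alphabet: index 0 is the end-of-string token, then a-z, 0-9 and "-:."
    static final String ALPHABET = "\0abcdefghijklmnopqrstuvwxyz0123456789-:.";
    static final int BASE = 40;
    static final int MAX_SYMBOLS = 12; // 40^12 < 2^64 < 40^13

    // Packs up to 12 characters into one long, first character in the lowest digit.
    static long pack(String s) {
        if (s.length() > MAX_SYMBOLS) {
            throw new IllegalArgumentException("at most " + MAX_SYMBOLS + " characters");
        }
        long value = 0;
        for (int i = s.length() - 1; i >= 0; i--) {
            int digit = ALPHABET.indexOf(s.charAt(i));
            if (digit <= 0) {
                throw new IllegalArgumentException("unsupported character: " + s.charAt(i));
            }
            // May exceed Long.MAX_VALUE, but always stays below 2^64,
            // so the bit pattern is correct when read as an unsigned value.
            value = value * BASE + digit;
        }
        return value;
    }

    // Reverses pack(); unused high digits are 0, which acts as the end-of-string token.
    static String unpack(long value) {
        StringBuilder sb = new StringBuilder();
        while (value != 0) {
            int digit = (int) Long.remainderUnsigned(value, BASE);
            sb.append(ALPHABET.charAt(digit));
            value = Long.divideUnsigned(value, BASE);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long packed = pack("twitter.com");
        System.out.println(Long.toUnsignedString(packed) + " -> " + unpack(packed));
    }
}
```

Longer strings would need several such 64-bit words, or the three-characters-per-16-bits grouping, which wastes slightly more space but is simpler to stream.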
If you use arithmetic coding, the result can be much smaller, but you need a frequency table giving how often each token occurs (built from a long list of representative examples).
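As a rough illustration of what the frequency table buys you, the sketch below builds a table from sample strings and estimates how many bits an ideal arithmetic coder would need, using the sum of -log2(p) over the symbols. This is only a size estimate under a simple static model, not an arithmetic coder. The class `FrequencyModel`, its method names, and the add-one smoothing are assumptions made for this example; the sample strings are taken from the output shown in this answer.

```java
import java.util.Arrays;
import java.util.List;

class FrequencyModel {
    // Assumed alphabet: index 0 is the end-of-string token, then a-z, 0-9 and "-:."
    static final String ALPHABET = "\0abcdefghijklmnopqrstuvwxyz0123456789-:.";

    static int symbol(char ch) {
        int idx = ALPHABET.indexOf(ch);
        if (idx <= 0) throw new IllegalArgumentException("unsupported character: " + ch);
        return idx;
    }

    // Counts how often each symbol occurs in the samples. Every count starts at 1
    // (add-one smoothing) so that unseen symbols keep a non-zero probability.
    static int[] buildCounts(List<String> samples) {
        int[] counts = new int[ALPHABET.length()];
        Arrays.fill(counts, 1);
        for (String sample : samples) {
            for (int i = 0; i < sample.length(); i++) {
                counts[symbol(sample.charAt(i))]++;
            }
            counts[0]++; // each sample ends with one end-of-string token
        }
        return counts;
    }

    // Cost in bits of one symbol with the given count: -log2(count / total).
    static double cost(int count, long total) {
        return -Math.log((double) count / total) / Math.log(2);
    }

    // Estimated size of s in bits, including the end-of-string token.
    static double estimateBits(String s, int[] counts) {
        long total = 0;
        for (int c : counts) total += c;
        double bits = 0;
        for (int i = 0; i < s.length(); i++) {
            bits += cost(counts[symbol(s.charAt(i))], total);
        }
        return bits + cost(counts[0], total);
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList(
                "twitter.com:2122", "123.211.80.4:2122",
                "my-domain.se:2121", "www.stackoverflow.com:80");
        int[] counts = buildCounts(samples);
        String address = "twitter.com:2122";
        System.out.printf("%s: about %.1f bits%n", address, estimateBits(address, counts));
    }
}
```

With an empty sample list every symbol has probability 1/40, so the estimate falls back to the base-40 cost of log2(40) bits per symbol; the more representative the samples, the further below that the estimate can drop.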
    import java.io.ByteArrayOutputStream;
    import java.io.DataOutputStream;
    import java.util.Arrays;

    class Encoder {
        public static final int BASE = 40;
        // The 40 symbols: '\0' (end of string), a-z, 0-9 and "-:."
        StringBuilder chars = new StringBuilder(BASE);
        // Maps a character code to its symbol index, or -1 if unsupported
        byte[] index = new byte[256];

        {
            chars.append('\0');
            for (char ch = 'a'; ch <= 'z'; ch++) chars.append(ch);
            for (char ch = '0'; ch <= '9'; ch++) chars.append(ch);
            chars.append("-:.");
            Arrays.fill(index, (byte) -1);
            for (byte i = 0; i < chars.length(); i++)
                index[chars.charAt(i)] = i;
        }

        public byte[] encode(String address) {
            try {
                ByteArrayOutputStream baos = new ByteArrayOutputStream();
                DataOutputStream dos = new DataOutputStream(baos);
                // Consume the address three characters at a time
                for (int i = 0; i < address.length(); i += 3) {
                    switch (Math.min(3, address.length() - i)) {
                        case 1:
prints
    twitter.com:2122 (16 chars) encoded is 11 bytes.
    123.211.80.4:2122 (17 chars) encoded is 12 bytes.
    my-domain.se:2121 (17 chars) encoded is 12 bytes.
    www.stackoverflow.com:80 (24 chars) encoded is 16 bytes.
I leave decoding as an exercise. ;)
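For readers who want a starting point for that exercise, here is one possible sketch of a decoder together with a compact encoder it pairs with. It assumes a layout that matches the sizes printed above: each group of three characters is stored as one big-endian 16-bit value with the first character in the lowest base-40 digit, a trailing group of two characters also takes 16 bits, and a single trailing character takes one byte. The class `Base40Codec` and this exact layout are assumptions for illustration, not necessarily what the original encoder does, and the sketch does not validate malformed input.

```java
import java.io.ByteArrayOutputStream;

class Base40Codec {
    // Assumed alphabet: index 0 is the end-of-string token, then a-z, 0-9 and "-:."
    static final String ALPHABET = "\0abcdefghijklmnopqrstuvwxyz0123456789-:.";
    static final int BASE = 40;

    static int symbol(char ch) {
        int idx = ALPHABET.indexOf(ch);
        if (idx <= 0) throw new IllegalArgumentException("unsupported character: " + ch);
        return idx;
    }

    static byte[] encode(String address) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < address.length(); i += 3) {
            int n = Math.min(3, address.length() - i);
            int value = 0;
            // First character of the group ends up in the lowest base-40 digit.
            for (int j = n - 1; j >= 0; j--) {
                value = value * BASE + symbol(address.charAt(i + j));
            }
            if (n == 1) {
                out.write(value);        // single trailing character: one byte
            } else {
                out.write(value >>> 8);  // two or three characters: 16 bits, big-endian
                out.write(value & 0xFF);
            }
        }
        return out.toByteArray();
    }

    static String decode(byte[] data) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        while (pos < data.length) {
            int value;
            if (data.length - pos >= 2) {
                value = ((data[pos] & 0xFF) << 8) | (data[pos + 1] & 0xFF);
                pos += 2;
            } else {
                value = data[pos] & 0xFF; // odd final byte holds one character
                pos += 1;
            }
            // Real characters have digits 1..39, so a remaining value of 0
            // means the group has ended (the end-of-string token).
            while (value != 0) {
                sb.append(ALPHABET.charAt(value % BASE));
                value /= BASE;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"twitter.com:2122", "www.stackoverflow.com:80"}) {
            byte[] encoded = encode(s);
            System.out.println(s + " -> " + encoded.length + " bytes -> " + decode(encoded));
        }
    }
}
```

Reserving index 0 for the end-of-string token is what lets the decoder tell a two-character final group from a three-character one without storing the length separately.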