For those looking for a general solution, these are reasonable criteria:
- The file name should resemble the original string.
- If possible, the encoding should be reversible.
- Collision probability should be minimized.
To do this, we can use a regular expression to match the invalid characters, percent-encode them, and then limit the length of the encoded string.
private static final Pattern PATTERN = Pattern.compile("[^A-Za-z0-9_\\-]");
private static final int MAX_LENGTH = 127;

public static String escapeStringAsFilename(String in){
    StringBuffer sb = new StringBuffer();
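The snippet above stops right after creating the buffer. The following is a minimal sketch of how the whole method might look, not necessarily the original implementation. It assumes each invalid character is written as the `%XX` escapes of its UTF-8 bytes, so that the result can later be read back with `URLDecoder`. The class name `FilenameEscaper` is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FilenameEscaper {
    // Anything outside this conservative whitelist gets percent-encoded.
    private static final Pattern PATTERN = Pattern.compile("[^A-Za-z0-9_\\-]");
    private static final int MAX_LENGTH = 127;

    static String escapeStringAsFilename(String in) {
        StringBuffer sb = new StringBuffer();
        Matcher m = PATTERN.matcher(in);
        while (m.find()) {
            // Assumption: encode the UTF-8 bytes of the matched character,
            // one %XX escape per byte.
            StringBuilder escaped = new StringBuilder();
            for (byte b : m.group().getBytes(StandardCharsets.UTF_8)) {
                escaped.append(String.format("%%%02X", b & 0xFF));
            }
            m.appendReplacement(sb, escaped.toString());
        }
        m.appendTail(sb);
        String encoded = sb.toString();
        // Truncate the string, keeping the head.
        int end = Math.min(encoded.length(), MAX_LENGTH);
        return encoded.substring(0, end);
    }
}
```

With this sketch, `escapeStringAsFilename("a b/c")` returns `a%20b%2Fc`.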
Patterns
The pattern above is based on a conservative subset of the characters allowed by the POSIX specification.
If you want to allow the dot character, use:
private static final Pattern PATTERN = Pattern.compile("[^A-Za-z0-9_\\-\\.]");
Just be careful with strings like "." and "..", which name the current and parent directories.
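One possible way to handle those two names, shown here only as a sketch: if the encoded result is exactly "." or "..", escape the dots after all. The helper name `guardDotNames` and the class `DotNames` are hypothetical.

```java
final class DotNames {
    // Escapes the dots when the whole name is "." or "..";
    // every other name is returned unchanged.
    static String guardDotNames(String encoded) {
        if (encoded.equals(".") || encoded.equals("..")) {
            return encoded.replace(".", "%2E");
        }
        return encoded;
    }
}
```

Since `%2E` decodes back to a dot, the encoding stays reversible.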
If you want to avoid collisions on case-insensitive file systems, exclude the uppercase letters:
private static final Pattern PATTERN = Pattern.compile("[^a-z0-9_\\-]");
Or exclude the lowercase letters:
private static final Pattern PATTERN = Pattern.compile("[^A-Z0-9_\\-]");
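To see why a single-case whitelist helps, here is a small sketch with a hypothetical `escape` helper that takes the pattern as a parameter and, as an assumption, writes each invalid character as the `%XX` escapes of its UTF-8 bytes. With the lowercase-only pattern, names that differ only in case no longer map to names that a case-insensitive file system would treat as equal.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CaseSafeEscaper {
    // Whitelist without uppercase letters, so uppercase input is escaped.
    static final Pattern LOWER_ONLY = Pattern.compile("[^a-z0-9_\\-]");

    // Hypothetical helper: percent-encodes every match of 'invalid'.
    static String escape(String in, Pattern invalid) {
        StringBuffer sb = new StringBuffer();
        Matcher m = invalid.matcher(in);
        while (m.find()) {
            StringBuilder escaped = new StringBuilder();
            for (byte b : m.group().getBytes(StandardCharsets.UTF_8)) {
                escaped.append(String.format("%%%02X", b & 0xFF));
            }
            m.appendReplacement(sb, escaped.toString());
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

Here `escape("Readme", LOWER_ONLY)` gives `%52eadme` while `escape("readme", LOWER_ONLY)` stays `readme`, so the two files cannot clash.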
Instead of using a whitelist, you can choose a blacklist of reserved characters for your specific file system. For example, this regular expression is suitable for FAT32 file systems:
private static final Pattern PATTERN = Pattern.compile("[%\\.\"\\*/:<>\\?\\\\\\|\\+,;=\\[\\]]");
Length
On Android, 127 characters is a safe limit. Many file systems accept up to 255 characters.
If you prefer to keep the tail of the string rather than the head, use:
// Truncate the string.
int start = Math.max(0, encoded.length() - MAX_LENGTH);
return encoded.substring(start, encoded.length());
Decoding
To convert the file name back to the source string, use:
URLDecoder.decode(filename, "UTF-8");
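The two-argument `URLDecoder.decode` declares a checked `UnsupportedEncodingException`, so in practice the call needs a small wrapper. A sketch, with the hypothetical names `FilenameDecoder` and `unescapeFilename`:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class FilenameDecoder {
    // Turns %XX escapes back into the characters they stand for.
    static String unescapeFilename(String filename) {
        try {
            return URLDecoder.decode(filename, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform,
            // so this branch should be unreachable.
            throw new IllegalStateException(e);
        }
    }
}
```

Note that `URLDecoder` also turns a literal "+" into a space, so this only round-trips if the encoding pattern escapes "+", as both the whitelist and the FAT32 blacklist above do.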
Limitations
Because longer strings are truncated, name collisions can occur during encoding, and a truncated name may not decode back to the original string.
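One source of decoding trouble is a cut that lands in the middle of a `%XX` escape, which leaves an incomplete escape that `URLDecoder` rejects. The sketch below, with the hypothetical helper `truncateHead`, drops such a dangling escape after truncating. It does not prevent collisions, and it does not repair a multi-byte character whose escapes were split; it only keeps the name decodable in the simple case.

```java
final class SafeTruncation {
    // Keeps at most maxLength characters from the head and removes
    // a trailing escape that was cut short ("%" or "%X").
    static String truncateHead(String encoded, int maxLength) {
        if (encoded.length() <= maxLength) {
            return encoded;
        }
        String cut = encoded.substring(0, maxLength);
        int lastPercent = cut.lastIndexOf('%');
        // A complete escape is '%' followed by two hex digits.
        if (lastPercent >= 0 && lastPercent > cut.length() - 3) {
            cut = cut.substring(0, lastPercent);
        }
        return cut;
    }
}
```

For example, `truncateHead("ab%20cd", 4)` returns `ab` instead of the undecodable `ab%2`.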