Problem:
I need to compare two hash table implementations (essentially, HashMap against an alternative implementation) and draw a reasonable conclusion.
I am not after 100% accuracy; I just need my assessment to point in the right direction.
I am interested in the difference not only per operation, but mainly in the hash table as a whole.
I do not have a strict speed requirement, so if the alternative implementation is reasonably slower I can accept that, but I do expect/require better memory usage (since one of the hash tables is backed by a primitive table).
What I have done so far:
Initially, I wrote my own "benchmark" with loops and many GC hints to get a feel for the difference, but I read online that using a standard tool is more reliable/suitable.
An example of my approach (HashMapInterface is just a wrapper so I can switch between implementations):
```java
// Cleaned up: the inner loop variable was shadowing `i`, the undefined
// `time` variable is replaced with `total`, and missing braces are added.
int[] keys = new int[10000000];
String[] values = new String[10000000];
for (int i = 0; i < keys.length; ++i) {
    keys[i] = i;
    values[i] = "" + i;
}
if (operation.equals("put")) {
    runOperation(map, keys, values);
}

public static long[] runOperation(HashMapInterface<String> map, int[] keys, String[] values) {
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    long run = 0;
    for (int i = 0; i < 10; ++i) {
        long start = System.currentTimeMillis();
        for (int j = 0; j < keys.length; ++j) {
            map.put(keys[j], values[j]);
        }
        long total = System.currentTimeMillis() - start;
        System.out.println(total / 1000d + " seconds");
        if (total < min) { min = total; }
        if (total > max) { max = total; }
        run += total;
        map = createNewHashMap();   // fresh table for the next round
        hintsToGC();
    }
    return new long[] { min, max, run };
}

public static void hintsToGC() {
    for (int i = 0; i < 20; ++i) {
        System.out.print(". ");
        System.gc();
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

private static HashMapInterface<String> createNewHashMap() {
    if (jdk) {
        return new JDKHashMapWrapper();
    } else {
        return new AlternativeHashMapWrapper();
    }
}

public class JDKHashMapWrapper implements HashMapInterface<String> {
    HashMap<Integer, String> hashMap;

    JDKHashMapWrapper() {
        hashMap = new HashMap<Integer, String>();
    }

    public String put(Integer key, String value) {
        return hashMap.put(key, value);
    }
}
```
(I want to test put, get, containsKey and memory usage.)
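For the memory-usage part, one option is a rough heap-delta probe around building the map. This is a sketch (class and method names are mine, not from the question above), and the per-entry figure is only indicative, since System.gc() is just a hint to the JVM:

```java
import java.util.HashMap;
import java.util.Map;

public class MemoryProbe {
    // Used heap = total heap minus free heap at this instant.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) throws InterruptedException {
        // GC hints before the "before" snapshot, to reduce noise
        for (int i = 0; i < 5; ++i) { System.gc(); Thread.sleep(50); }
        long before = usedHeap();

        int n = 1000000;
        Map<Integer, String> map = new HashMap<>();
        for (int i = 0; i < n; ++i) {
            map.put(i, Integer.toString(i));
        }

        // GC hints again so garbage from the build phase is (hopefully) gone
        for (int i = 0; i < 5; ++i) { System.gc(); Thread.sleep(50); }
        long after = usedHeap();

        System.out.println("approx bytes per entry: " + (after - before) / (double) n);
        // Keep the map reachable so it is not collected before the snapshot.
        System.out.println("entries: " + map.size());
    }
}
```

A heap profiler (or a tool like JOL) would give more trustworthy retained-size numbers than this delta approach.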
Can I trust my approach to produce reasonable measurements?
If not, what is the most suitable tool to use, and how?
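For context on tooling: JMH (the Java Microbenchmark Harness) is the standard choice here, since it handles warmup, forking and dead-code elimination for you. As a plain-JDK approximation of what it does, a harness with explicit warmup rounds, System.nanoTime and a median (rather than min/max) might look like this (a sketch against the JDK HashMap only; PutBench is an illustrative name):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class PutBench {
    // Time one round of puts into a fresh map.
    static long timePuts(Map<Integer, String> map, int[] keys, String[] values) {
        long start = System.nanoTime();
        for (int i = 0; i < keys.length; ++i) {
            map.put(keys[i], values[i]);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 100000;
        int[] keys = new int[n];
        String[] values = new String[n];
        for (int i = 0; i < n; ++i) {
            keys[i] = i;
            values[i] = Integer.toString(i);
        }

        // Warmup rounds let the JIT compile the hot loop before measuring.
        for (int w = 0; w < 5; ++w) {
            timePuts(new HashMap<>(), keys, values);
        }

        int rounds = 10;
        long[] samples = new long[rounds];
        for (int r = 0; r < rounds; ++r) {
            samples[r] = timePuts(new HashMap<>(), keys, values);
        }
        Arrays.sort(samples);
        // The median is more robust than min/max against GC pauses.
        System.out.println("median ms: " + samples[rounds / 2] / 1000000d);
    }
}
```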
Update:
- I also test with random keys (~10M random numbers), generated using SecureRandom.
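A side note on the random keys: SecureRandom is comparatively slow and, unless seeded, not reproducible across runs, so both implementations would see different key sequences. A seeded java.util.Random keeps key generation out of the measurement and makes runs comparable (a sketch; KeyGen and randomKeys are illustrative names, not from the question):

```java
import java.util.Arrays;
import java.util.Random;

public class KeyGen {
    // Same seed -> identical key sequence on every run,
    // so both map implementations get exactly the same workload.
    public static int[] randomKeys(int n, long seed) {
        Random rnd = new Random(seed);
        int[] keys = new int[n];
        for (int i = 0; i < n; ++i) {
            keys[i] = rnd.nextInt();
        }
        return keys;
    }

    public static void main(String[] args) {
        int[] a = randomKeys(1000, 42L);
        int[] b = randomKeys(1000, 42L);
        System.out.println(Arrays.equals(a, b)); // true: runs are reproducible
    }
}
```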
- On every hash table resize, I print the logical size / actual table length to track the load factor.
Update:
For my particular case, where integer keys are also of interest to me, what pitfalls might my approach have?
UPDATE after @dimo414's comments:
Fair enough: a hash table "as a whole" doesn't make sense as a test subject on its own.
What I mean is how the hash table behaves under different loads, both in runtime and in memory consumption.
Every data structure is a trade-off between different operations.
I agree. My trade-off is an acceptable access-time penalty in exchange for lower memory usage.
You need to determine which features you want to test.
1) put(key, value);
2) get(key);
3) containsKey(key);
4) all of the above with a large number of entries in the hash table
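The four points above could be exercised together in a small driver along these lines (a behavioral sketch against the JDK HashMap at growing sizes, not a timing benchmark; OpsCheck is an illustrative name):

```java
import java.util.HashMap;
import java.util.Map;

public class OpsCheck {
    public static void main(String[] args) {
        // 4) repeat the same checks at increasingly large entry counts
        for (int n : new int[] { 1000, 100000, 1000000 }) {
            Map<Integer, String> map = new HashMap<>();
            for (int i = 0; i < n; ++i) {
                map.put(i, Integer.toString(i));                 // 1) put
            }
            boolean ok = map.get(n / 2).equals(Integer.toString(n / 2)) // 2) get
                    && map.containsKey(n - 1)                    // 3) containsKey, hit
                    && !map.containsKey(n);                      // 3) containsKey, miss
            System.out.println(n + " entries ok: " + ok);
        }
    }
}
```

Swapping HashMap for the alternative implementation behind the same wrapper interface would let both be driven by identical workloads.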