As people have pointed out, a GC allocates faster (because it just hands you the next block on its list), but is slower overall (because it has to compact the heap regularly so that allocations stay fast).
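To illustrate why handing out "the next block" is so cheap, here's a minimal bump-allocation sketch in C++ (my own illustration, not any particular GC's code):

```cpp
#include <cstddef>
#include <cstdint>

// Minimal bump-allocation sketch (illustrative, not any specific GC):
// handing out memory is just a pointer increment, which is why GC'd
// runtimes allocate so fast. Individual blocks are never freed; the
// collector reclaims and compacts the whole region instead.
struct BumpRegion {
    std::uint8_t* next; // first unused byte
    std::uint8_t* end;  // one past the last byte of the region

    void* allocate(std::size_t size) {
        if (static_cast<std::size_t>(end - next) < size)
            return nullptr; // region full: this is where a GC would collect
        void* p = next;
        next += size;
        return p;
    }
};
```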
So go for a compromise solution (which is actually pretty damn good):
You create your own heaps, one for each size of object you usually allocate (or 4-byte, 8-byte, 16-byte, 32-byte, etc.). When you need a new piece of memory, you grab the last free "block" on the corresponding heap. Because you are pre-allocating from these heaps, all you have to do when allocating is grab the next free block. This works better than a standard allocator because you are happily wasting memory: if you want to allocate 12 bytes, you use up a whole 16-byte block from the 16-byte heap. You keep a bitmap of used vs. free blocks, so you can allocate quickly without wasting loads of memory and without needing to compact.
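Here's a rough C++ sketch of one such fixed-size heap with a bitmap of used vs. free blocks (the names and sizes are mine, just to illustrate the idea, not a production implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One fixed-size heap: a pre-allocated slab of NumBlocks equally sized
// blocks plus a bitmap with one used/free bit per block. Allocation
// scans for a clear bit; freeing just clears it (O(1), no compaction).
template <std::size_t BlockSize, std::size_t NumBlocks>
class FixedHeap {
    static constexpr std::size_t kWords = (NumBlocks + 63) / 64;
    alignas(std::max_align_t) std::uint8_t storage_[BlockSize * NumBlocks];
    std::uint64_t used_[kWords] = {}; // the used-vs-free bitmap

public:
    void* allocate() {
        for (std::size_t w = 0; w < kWords; ++w) {
            if (used_[w] == ~0ULL) continue;        // word full: skip 64 blocks
            std::size_t bit = 0;
            while (used_[w] & (1ULL << bit)) ++bit; // first free bit in the word
            std::size_t idx = w * 64 + bit;
            if (idx >= NumBlocks) break;            // only padding bits left
            used_[w] |= (1ULL << bit);
            return storage_ + idx * BlockSize;
        }
        return nullptr; // pool exhausted
    }

    void deallocate(void* p) {
        std::size_t idx =
            (static_cast<std::uint8_t*>(p) - storage_) / BlockSize;
        assert(idx < NumBlocks);
        used_[idx / 64] &= ~(1ULL << (idx % 64));   // clear the bit
    }
};
```

A front end would round each request up to the nearest size class (a 12-byte request goes to the 16-byte heap), which is exactly the "wasting memory happily" trade-off described above.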
Also, because you have several heaps, highly parallel systems work much better, since you do not need to lock as often (i.e. you have separate locks for each heap, so threads do not contend for them nearly as much).
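A sketch of the locking side, reusing the FixedHeap above and assuming one lock per size-class heap (the per-class layout and the fallback to ::operator new for large requests are my additions):

```cpp
#include <cstddef>
#include <mutex>
#include <new>

// One heap + one mutex per size class: a thread allocating 8 bytes
// never touches the 32-byte heap's lock, so contention drops compared
// to a single global heap lock.
struct SizeClassAllocator {
    FixedHeap<4, 1024>  h4;  std::mutex m4;
    FixedHeap<8, 1024>  h8;  std::mutex m8;
    FixedHeap<16, 1024> h16; std::mutex m16;
    FixedHeap<32, 1024> h32; std::mutex m32;

    void* allocate(std::size_t size) {
        if (size <= 4)  { std::lock_guard<std::mutex> g(m4);  return h4.allocate(); }
        if (size <= 8)  { std::lock_guard<std::mutex> g(m8);  return h8.allocate(); }
        if (size <= 16) { std::lock_guard<std::mutex> g(m16); return h16.allocate(); }
        if (size <= 32) { std::lock_guard<std::mutex> g(m32); return h32.allocate(); }
        return ::operator new(size); // large requests fall back to the system heap
    }
};
```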
Try it - we used this to replace the standard heap in a very allocation-intensive application, and performance went up significantly.
BTW, the reason standard allocators are slow is that they try not to waste memory - so if you allocate a 5-byte, a 7-byte, and a 32-byte block from the standard heap, it will keep track of those "boundaries". The next time you need to allocate, it will walk through them looking for enough space to give you what you requested. This worked well for low-memory systems, but you only have to look at how much memory most applications use today to see that GC systems go the other way: they try to allocate as quickly as possible while hardly worrying about how much memory is wasted.
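For contrast, here's roughly what that walk looks like, as a simplified first-fit free-list sketch (my own simplification; real allocators add binning, chunk splitting, and coalescing on top of this):

```cpp
#include <cstddef>

// Simplified first-fit search over a free list: each free chunk records
// its size, and the allocator walks the list until it finds one big
// enough. The walk is what makes general-purpose allocation slower than
// grabbing a fixed-size block, but it wastes far less memory.
struct FreeChunk {
    std::size_t size;
    FreeChunk*  next;
};

void* firstFit(FreeChunk*& head, std::size_t size) {
    for (FreeChunk** link = &head; *link; link = &(*link)->next) {
        if ((*link)->size >= size) {
            FreeChunk* found = *link;
            *link = found->next; // unlink; a real allocator would split the remainder
            return found;
        }
    }
    return nullptr; // no chunk big enough
}
```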
gbjbaanb