Here is the code to defragment an array.
    int sparse_to_compact(int *arr, int total, int *is_valid)
    {
        int i = 0;
        int last = total - 1;

        // trim the trailing invalid elements
        for (; last >= 0 && !is_valid[last]; last--);

        // now keep filling each invalid slot with the last valid element
        for (i = 0; i < last; i++) {
            if (is_valid[i])
                continue;
            arr[i] = arr[last];   // move the last valid element into the hole
            is_valid[i] = 1;      // keep the flags in sync, so the trim below
            is_valid[last] = 0;   // stops at slots that were already filled
            last--;
            for (; last >= 0 && !is_valid[last]; last--); // trim invalid elements
        }
        return last + 1; // return the compact length of the array
    }
I copied the code from this answer.
I think a more efficient way is to use a linked list of buckets, where the slots of each bucket are tracked by a bitmask-based memory manager. Something like the following:
    struct elem {
        uint32_t index;
        int x;
    };

Suppose our contents are int values wrapped in a structure called struct elem.
    enum {
        MAX_BUCKET_SIZE = 1024,
        MAX_BITMASK_SIZE = (MAX_BUCKET_SIZE + 63) >> 6,
    };

    struct bucket {
        struct bucket *next;
        uint64_t usage[MAX_BITMASK_SIZE];     // one bit per slot, 1 = in use
        struct elem elem[MAX_BUCKET_SIZE];
    };

A bucket is defined as an array of struct elem plus a usage bitmask with one bit per slot.
    struct bucket_list {
        struct bucket *head;
    } container;
And the bucket list is a linked list containing all the buckets.
So we need to write the memory manager code.
First, we allocate a new bucket whenever no existing bucket has a free slot.
    struct bucket *bk = get_empty_bucket(&container);
    if (!bk) {
        // no bucket with a free slot: allocate one and push it onto the list
        bk = (struct bucket *)malloc(sizeof(struct bucket));
        assert(bk);
        memset(bk->usage, 0, sizeof(bk->usage));
        bk->next = container.head;
        container.head = bk;
    }
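The snippet above calls get_empty_bucket() without showing it. Here is a minimal sketch of what it might look like, assuming it should return the first bucket that still has a free slot, or NULL when every bucket is full. That behaviour is my assumption; the structures are repeated only to make the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    MAX_BUCKET_SIZE = 1024,
    MAX_BITMASK_SIZE = (MAX_BUCKET_SIZE + 63) >> 6,
};

struct elem {
    uint32_t index;
    int x;
};

struct bucket {
    struct bucket *next;
    uint64_t usage[MAX_BITMASK_SIZE];   /* one bit per slot, 1 = in use */
    struct elem elem[MAX_BUCKET_SIZE];
};

struct bucket_list {
    struct bucket *head;
};

/* Assumed behaviour: return the first bucket with at least one free slot,
 * or NULL if the list is empty or every bucket is full. */
struct bucket *get_empty_bucket(struct bucket_list *list)
{
    assert(list);
    for (struct bucket *bk = list->head; bk; bk = bk->next) {
        for (int i = 0; i < MAX_BITMASK_SIZE; i++) {
            if (~bk->usage[i])   /* some bit is zero, so a slot is free */
                return bk;
        }
    }
    return NULL;
}
```

This scans the list linearly, which is fine for a handful of buckets. With many buckets you would probably want to keep the buckets that have free slots on a separate list.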
Now that we have a bucket, we find a free slot in it and store the value there.
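    for (i = 0; i < MAX_BITMASK_SIZE; i++) {
        uint64_t bits = ~bk->usage[i];            // set bits mark free slots
        if (!bits)
            continue;                             // this word is full
        int bit_index = __builtin_ctzll(bits);    // lowest free slot (GCC/Clang builtin)
        int index = (i << 6) + bit_index;
        bk->elem[index].index = index;
        bk->elem[index].x = 34;
        bk->usage[i] |= (uint64_t)1 << bit_index; // mark the slot as used
        break;                                    // one slot is enough
    }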
Removing an element is easy: we just mark its slot as unused by clearing its bit in usage. Once all the elements in a bucket are unused, we can remove the bucket from the linked list.
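The removal step could be sketched as follows. release_elem() is a hypothetical helper name of mine, and I am assuming the buckets were allocated with malloc/calloc so an empty one can be passed to free(). The structures are repeated to keep the sketch self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum {
    MAX_BUCKET_SIZE = 1024,
    MAX_BITMASK_SIZE = (MAX_BUCKET_SIZE + 63) >> 6,
};

struct elem {
    uint32_t index;
    int x;
};

struct bucket {
    struct bucket *next;
    uint64_t usage[MAX_BITMASK_SIZE];   /* one bit per slot, 1 = in use */
    struct elem elem[MAX_BUCKET_SIZE];
};

struct bucket_list {
    struct bucket *head;
};

/* Hypothetical helper: mark slot `index` of `bk` as unused. If the bucket
 * becomes completely empty, unlink it from the list and free it.
 * Returns 1 if the bucket was freed, 0 otherwise. */
int release_elem(struct bucket_list *list, struct bucket *bk, uint32_t index)
{
    assert(list && bk && index < MAX_BUCKET_SIZE);

    /* clear the usage bit: word index >> 6, bit index & 63 */
    bk->usage[index >> 6] &= ~((uint64_t)1 << (index & 63));

    for (int i = 0; i < MAX_BITMASK_SIZE; i++) {
        if (bk->usage[i])
            return 0;                   /* some slot is still in use */
    }

    /* walk with a pointer-to-pointer so the head needs no special case */
    for (struct bucket **pp = &list->head; *pp; pp = &(*pp)->next) {
        if (*pp == bk) {
            *pp = bk->next;
            free(bk);
            return 1;
        }
    }
    return 0;                           /* bucket not on this list: leave it */
}
```

Freeing a bucket as soon as it empties can cause churn if elements are added and removed repeatedly around that boundary; keeping one spare empty bucket around is a common alternative.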
Sometimes we can defragment the buckets, or otherwise optimize them, so that they fit in less space. Alternatively, when we allocate new items, we can prefer more crowded buckets over less crowded ones. When we delete, we can move elements from less crowded buckets into more crowded ones.
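One way to prefer crowded buckets when allocating is to count the set bits in each usage mask and pick the fullest bucket that still has room. This is only a sketch: pick_crowded_bucket() and bucket_count_used() are names I made up, and __builtin_popcountll is a GCC/Clang builtin, so other compilers need a replacement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    MAX_BUCKET_SIZE = 1024,
    MAX_BITMASK_SIZE = (MAX_BUCKET_SIZE + 63) >> 6,
};

struct elem {
    uint32_t index;
    int x;
};

struct bucket {
    struct bucket *next;
    uint64_t usage[MAX_BITMASK_SIZE];   /* one bit per slot, 1 = in use */
    struct elem elem[MAX_BUCKET_SIZE];
};

struct bucket_list {
    struct bucket *head;
};

/* Number of slots currently in use in one bucket. */
static int bucket_count_used(const struct bucket *bk)
{
    int used = 0;
    for (int i = 0; i < MAX_BITMASK_SIZE; i++)
        used += __builtin_popcountll(bk->usage[i]);
    return used;
}

/* Hypothetical policy: return the fullest bucket that still has a free
 * slot, or NULL if the list is empty or every bucket is full. */
struct bucket *pick_crowded_bucket(struct bucket_list *list)
{
    assert(list);
    struct bucket *best = NULL;
    int best_used = -1;
    for (struct bucket *bk = list->head; bk; bk = bk->next) {
        int used = bucket_count_used(bk);
        if (used < MAX_BUCKET_SIZE && used > best_used) {
            best = bk;
            best_used = used;
        }
    }
    return best;
}
```

Filling the fullest buckets first tends to let the sparse ones drain to empty, at which point they can be removed from the list. Recounting the bits on every allocation is wasteful, so a real implementation would likely cache a per-bucket counter.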
For a plain array, you can also remove an element efficiently:
    int remove_element(int *from, int total, int index)
    {
        if (index != (total - 1))
            from[index] = from[total - 1]; // overwrite with the last element
        return total - 1;                  // new length of the array
    }
This is done by replacing the removed item with the last value, so the order of the elements is not preserved.