The fastest way to remove a huge number of elements from an array in C

I have a dynamic array containing thousands of elements or even more so as not to consume a large amount of memory, I can remove unwanted elements from it (i.e. the elements were used and are no longer needed for them), so from the beginning I can highlight smaller memory by evaluating the maximum size needed after deleting items each time.

I use this method, but it takes a very long time to complete, sometimes it takes 30 minutes!

int x, y ; for (x = 0 ; x<number_of_elements_to_remove ; x++){ for (y = 0 ; y<size_of_array; y++ ){ array[y] = array[y+1]; } } 

Is there a faster way than this?

+7
c arrays
source share
4 answers

Instead of deleting items one at a time, with two loops to solve O (n 2 ), you can do one loop with one read and one write index. Go through the array, copying objects along the way:

 int rd = 0, wr = 0; while (rd != size_of_array) { if (keep_element(array[rd])) { array[wr++] = array[rd]; } rd++; } 

At the end of the wr loop, the number of elements stored in the array indicated.

+3
source share

since I noticed that you want to remove only elements from the beginning of the array, try the following:

  int x; for(x = 0 ; x< size_of_array - number_of_elements_to_remove; x++) array[x] = array[number_of_elements_to_remove + x]; 

this way you use one for the loop, which greatly reduces complexity

+1
source share

It seems you are essentially

 int y; for (y = 0; y<size_of_array; y++){ array[y] = array[y+numbre_of_elements_to_remove]; } 

The above should be faster, but there are still some caveats / problems with your code (e.g. access outside the array).

0
source share

Here is the code to defragment an array.

 int sparse_to_compact(int*arr, int total, int*is_valid) { int i = 0; int last = total - 1; // trim the last invalid elements for(; last >= 0 && !is_valid[last]; last--); // trim invalid elements from last // now we keep swapping the invalid with last valid element for(i=0; i < last; i++) { if(is_valid[i]) continue; arr[i] = arr[last]; // swap invalid with the last valid last--; for(; last >= 0 && !is_valid[last]; last--); // trim invalid elements } return last+1; // return the compact length of the array } 

I copied the code from this answer.

I think a more efficient way is to use a list of bucket links. And the buckets are controlled by a bit string memory manager. This is similar to the following:

 struct elem { uint32_t index; /* helper to locate it position in the array */ int x; /* The content/object kept in the array */ } 

Suppose our contents are int and encapsulated in a structure called struct elem .

 enum { MAX_BUCKET_SIZE = 1024, MAX_BITMASK_SIZE = (MAX_BUCKET_SIZE + 63) >> 6, }; struct bucket { struct bucket*next; /* link to the next bucket */ uint64_t usage[MAX_BITMASK_SIZE]; /* track memory usage */ struct elem[MAX_BUCKET_SIZE]; /* the array */ }; 

A bucket is defined as a struct elem array and usage mask.

 struct bucket_list { struct bucket*head; /* dynamically allocated bucket */ }container; 

And the bucket list is a linked list containing all the buckets.

So, we need to write a memory manager code.

First, we need a new bucket that needs to be allocated when necessary.

 struct bucket*bk = get_empty_bucket(&container); if(!bk) { /* no empty bucket */ /* allocate a bucket */ struct bucket*bk = (struct bucket*)malloc(sizeof(struct bucket)); assert(bk); /* cleanup the usage flag */ memset(bk->usage, 0, sizeof(bk->usage)); /* link the bucket */ bk->next = container.head; container.head = bk; } 

Now that we have the bucket, we need to set the value in the array when necessary.

 for(i = 0; i < MAX_BITMASK_SIZE; i++) { uint64_t bits = ~bk.usage[i]; if(!bits) continue; /* no space */ /* get the next empty position */ int bit_index = _builtin_ctzl(bits); int index = (i<<6)+bit_index; /* set the array value */ bk->elem[index].index = index; bk->elem[index].x = 34/* my value */; bk.usage[i] |= 1<<bit_index; /* mark/flag the array element as used */ } 

Removing array elements is easy because they are not used. Now that all the elements in the bucket are not used, we can remove the bucket from the list of links.

Sometimes we can defragment buckets or optimize them to fit a smaller space. Otherwise, when we assign new items, we can choose more crowded buckets over less crowded ones. When we delete, we can change the element from less crowded to more crowded.

You can effectively remove array elements,

 int remove_element(int*from, int total, int index) { if(index != (total-1)) from[index] = from[total-1]; return total; // **DO NOT DECREASE** the total here } 

This is done by replacing the item with the last value.

0
source share

All Articles