How can I gradually reduce the array?

I have a fully populated array of values, and I would like to arbitrarily remove elements from this array with more removed at the far end.

For example, a given input (where a. Denotes a populated index)

............................................ 

I would like something like

 ....... . ... .. . . .. . . 

My first thought was to count the elements, and then iterate over the array, creating a random number somewhere between the current index and the total size of the array, for example:

 if ( mt_rand( 0, $total ) > $total - $current_index ) //remove this element 

however, since this entails creating a random number every time the loop goes around it, it becomes very difficult.

Is there a better way to do this?

+4
source share
3 answers

One easy way is to flip the weighted coin for each record with coin flips more weighted to the end. For example, if the array has size n, for each record you can select a random number from 0 to n-1 and only save the value if the index is less than or equal to a random number. (That is, save each entry with a probability of 1 - index/total .) This has the nice advantage that if you are going to compact your array in any way, and you use a good enough but efficient random number generator (it may be a simple integer hash compared to nonce), it will be pretty fast for memory access.

On the other hand, if you are only obscuring a few elements and not rebuilding the array, you can go with some kind of weighted random number generator, which most often selects numbers that are closer to the end of the index. For example, if you have a random number generator that generates a float with a value of [0,1] (closed or open borders that don't matter much), consider getting such a random float r and squaring it. This will tend to favor lower values. You can fix this by flipping it: 1-r^2 . Of course, you need this to be in your index range from 0 to n - 1 , so take floor(n * (1 - r^2)) as well as round n to n-1 .

There is practically an infinite number of variations for both of these methods.

+2
source

This is probably not the best / most efficient way to do this, but it is the best I can come up with and it works .

The NB codepad example takes a lot of time to execute, but this is because of the pretty-printed loop that I added at the end so you can see that it noticeably works. If you delete the inner loop, the runtime drops to acceptable levels.

 <?php $array = range(0, 99); for ($i = 0, $count = count($array); $i < $count; $i++) { // Get array keys $keys = array_keys($array); // Get a random number between 0 and count($keys) - 1 $rand = mt_rand(0, count($keys) - 1); // Cut $rand elements off the beginning of the keys $keys = array_slice($keys, $rand); // Unset a random key from the remaining keys unset($array[$keys[array_rand($keys)]]); } 
+2
source

This method is not random - it works by defining a function and its inverse. Different functions with different constant coefficients will have different distribution characteristics.

The results are very similar to the pattern, as expected when mapping a continuous function into a discrete structure such as an array.

Here is an example of using a quadratic function. You can try to change the constant.

demo: http://codepad.org/ojU3s9xM

 #as in y = x^2 / 7; function y($x) { return $x * $x / 7; } function x($y) { return 7 * sqrt($y); } $theArray = range(0,100); $size = count($theArray); //use func inverse to find the max value we can input to $y() without going out of array bounds $maximumX = x($size); for ($i=0; $i<$maximumX; $i++) { $index = (int) y($i); //unset the index if it still exists, else, the next greatest index while (!isset($theArray[$index]) && $index < $size) { $index++; } unset($theArray[$index]); } for ($i=0; $i<$size; $i++) { printf("[%-3s]", isset($theArray[$i]) ? $theArray[$i] : ''); } 
+1
source

Source: https://habr.com/ru/post/1414892/


All Articles