How can I randomly select an item in Ocaml?

In my program in OCaml, I need to randomly select a string from a huge set of strings. I have tried two different ways to do this so far, each with little success. First, I saved all the lines in the list, and then accidentally selected an item from the list:

let randomelement l = List.nth l (Random.int (List.length l)) 

But it takes a long time if he selects the 1000th row in the list. Therefore, I put all this into a set, assuming that Set.choose will return a random element from the set. But this does not seem to work. I think I have two questions ... how does Set.choose work, and is there a better way to randomly select an item in Ocaml?

+4
source share
6 answers

If you need the speed of choice, you should use a different container. Why use List with access to O (n) or Set with O (log n) when you can use Array with O (1), i.e. constant time.

To customize your example:

 let randomelement arr = let n = Random.int (Array.length arr) in Array.get arr n;; 
+10
source

Set.choose is an alias for Set.min_elt ; although this may not be in the future.

List.nth sure to suck if you need to do this very often.

The array will work fine, but if you need to add more elements or remove elements, this can be a nightmare for storing books.

Take a look at the versions of the Random Access Lists , which have a fairly optimal time for inserting, deleting, searching, and the cardinal, without being limited to arrays.

When I first encountered this problem, I expanded on the implementation of Set and Map to include random and nth . To expand the module, you need to re-implement the structure and add identification functions to convert between them. I wrote a new module called XSet with the following template:

 module Make (Ord : Set.OrderedType) = struct include Set.Make(Ord) type impl = Empty | Node of impl * elt * impl * int external impl_of_t : t -> impl = "%identity" external t_of_impl : impl -> t = "%identity" ... end 

You will need to use impl_of_t and vice versa to call ordinary Set functions or to call your private messages from the passed arguments - what is passed to your function should be Set.Make implementation of t not impl .

+4
source

No, Set.choose is deterministic, and I think it is listed in the documentation. I think in the current implementation it returns the minimum element.

What data structure is your first rowset stored in? An easy way to get a random element is to count the number of different lines, choose a random number K between them and get the "K-line" that you have. This can be done quite effectively for some data structure (for example, for arrays, this is a constant-time operation), less effective for others.

+3
source

Your question implies that all the time you want to prepare a structure containing strings, and then you will select several times in random order without changing the set.

If this is correct, I would recommend storing them in an array. This gives quick access to a random element (directly, as soon as you select a random index). The sets and the implementation of the list are inconvenient for accessing the ith element after you have chosen a random index i (although for Sets, if you do not mind random selection with bad randomness, you can choose a random path in the binary tree that represents a set. You cannot do this from outside the implementation, but you can copy paste it and add your function to the module).

+3
source

Firstly, there is a List.nth function that can replace your helper function.

 let n = Random.int (List.length lst) in List.nth n lst ;; 

(although that would not be faster since I'm sure this is about the same as your helper function)

As for speeding up this process: if the number of rows you have is fixed, you can use an array that would speed up the process, since the access time for the array is constant.

+2
source

Set.choose ensures that it will select the same item every time for a given set. Which element he selects is not specified, but if you look at the source code for it, you will see that it returns the minimum element.

It is best to use an array. If you want to create an immutable data structure, then I would suggest a keyword map for integers.

+1
source

All Articles