takeSample() function in Spark

I am trying to use the takeSample() function in Spark. Its parameters are the data, the number of samples to take, and the seed. But I do not want to fix a seed; I want to get a different result every time, and I can't work out how to do this. I tried using System.nanoTime as the seed, but it gave an error, I think because the data type did not match. Is there another function similar to takeSample() that can be used without a seed? Or is there another way to call takeSample() so that it gives different results each time?

+6
3 answers

System.nanoTime returns a Long, but the seed expected by takeSample is an Int. Therefore, takeSample(..., System.nanoTime.toInt) should work.
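
For illustration, here is a minimal sketch of what that conversion does. The names SeedDemo and toIntSeed are made up for this example, and rdd in the final comment is assumed to be an existing RDD that is not defined here. Note that toInt keeps only the low 32 bits of the Long, so the resulting seed can be negative, which should not matter for seeding.

```scala
object SeedDemo {
  // Hypothetical helper: narrow a Long timestamp to an Int seed.
  // Long.toInt discards the high 32 bits and keeps the low 32 bits.
  def toIntSeed(t: Long): Int = t.toInt

  def main(args: Array[String]): Unit = {
    val now: Long = System.nanoTime
    val seed: Int = toIntSeed(now)
    println(s"nanoTime = $now, Int seed = $seed")

    // With an existing RDD named rdd (an assumption), the call would be:
    //   rdd.takeSample(false, 10, SeedDemo.toIntSeed(System.nanoTime))
  }
}
```

Because the clock value differs on every call, the seed, and therefore the sample, should differ from run to run.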

+7

System.nanoTime returns a Long, while takeSample expects an Int.
You can use scala.util.Random.nextInt as the seed for takeSample.

+1

As of Spark 1.0.0, the seed parameter is optional. See https://issues.apache.org/jira/browse/SPARK-1438.
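
A minimal sketch of what that looks like, assuming Spark 1.0.0 or later is on the classpath. The object name TakeSampleDemo, the helper sampleTwice, the local master setting, and the sample data are all illustrative choices, not something from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TakeSampleDemo {
  // Hypothetical helper: draw two samples of 5 elements without passing a seed.
  def sampleTwice(sc: SparkContext): (Array[Int], Array[Int]) = {
    val rdd = sc.parallelize(1 to 1000)
    // No seed argument: Spark falls back to a randomly generated default seed.
    val first = rdd.takeSample(withReplacement = false, num = 5)
    val second = rdd.takeSample(withReplacement = false, num = 5)
    (first, second)
  }

  def main(args: Array[String]): Unit = {
    // Local mode and the app name are assumptions made for this example.
    val conf = new SparkConf().setMaster("local[*]").setAppName("takeSample-demo")
    val sc = new SparkContext(conf)
    try {
      val (first, second) = sampleTwice(sc)
      println(first.mkString(", "))
      println(second.mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

The two printed samples will usually differ, since each call gets its own default seed.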

+1
