Quickly select random id from mysql table with millions of inconsistent records

Question

I looked around and there seems to be no easy way to do this. It almost seems easier to just grab a subset of records and do all the randomization in code (Perl). The methods I found online seem aimed at tables of no more than a hundred thousand rows, certainly not millions.

The table I'm working with has 6 million records (and growing). The IDs are auto-incremented, but not every ID is present in the table (there are gaps).

I tried running the LIMIT 1 query that was recommended, but it takes forever to run. Is there a quick way to do this, given that there are gaps in the IDs? I can't just take the max and pick a random number in that range.

Update:

One idea I had was to fetch the max, pick a random limit based on the max, then fetch a range of 10 records from random_limit_1 to random_limit_2 and take the first record found in that range.

Or, if I know the max, is there a way to simply pick, say, the 5th row of the table without knowing its ID, and then just grab the ID of that row?

Update:

This query is somewhat faster-ish. Still not fast enough =/

 SELECT t.id FROM table t JOIN (SELECT FLOOR(MAX(id) * RAND()) AS maxid FROM table) AS tt ON t.id >= tt.maxid LIMIT 1;
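The same idea can also be driven from application code as two small indexed lookups. The following is only a minimal sketch: it uses Python with the standard sqlite3 module as a stand-in for MySQL, and the table name t, the sample IDs, and the function name random_id_by_threshold are all hypothetical. It adds ORDER BY id (not in the query above) so that the row returned is the nearest ID at or above the random threshold.

```python
import random
import sqlite3


def random_id_by_threshold(conn, rng=random):
    """Pick a random threshold in [MIN(id), MAX(id)] and return the first id >= it."""
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM t").fetchone()
    if lo is None:
        return None  # empty table
    threshold = rng.randint(lo, hi)
    # ORDER BY id makes "first match" mean "nearest id at or above the threshold".
    row = conn.execute(
        "SELECT id FROM t WHERE id >= ? ORDER BY id LIMIT 1", (threshold,)
    ).fetchone()
    return row[0]


# Hypothetical sample data: an in-memory table whose ids have gaps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO t (id) VALUES (?)", [(i,) for i in (1, 2, 5, 9, 40, 41, 100)]
)
picked = random_id_by_threshold(conn)
```

Note that this is not uniform: an ID that follows a large gap (100 in the sample data) is returned for every threshold that falls inside the gap, so it is picked more often than its neighbours.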

+8

mysql random recordset

qodeninja Dec 9 '11 at 17:53

3 answers

 SELECT * FROM TABLE ORDER BY RAND() LIMIT 1;

OK, this is slow. If you search for ORDER BY RAND() MySQL, you will find many results saying that it is very slow, and it is. I did a little research and found this alternative: "MySQL rand() slow on large datasets". I hope it works better.

+8

cristian Dec 9 '11 at 17:55

 SELECT ID FROM YourTable ORDER BY RAND() LIMIT 1;

0

Joe Stefanelli Dec 9 '11 at 17:56

user645280 · Accepted Answer · 2011-12-09T18:17:35+0000

Yes, the idea seems good:

 select min(ID), max(ID) from table into @min, @max;
 set @range = @max - @min;
 set @mr = @min + ((@range / 1000) * (rand() * 1000));
 select ID from table where ID >= @mr and ID <= @mr + 1000
 order by rand() limit 1 -- into @result
 ;

The window size can be changed from 1000 to 10000, or to whatever is needed for scaling ...
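A rough sketch of the same windowed approach run from application code, again using Python's standard sqlite3 module as a stand-in for MySQL. The table name t, the sample IDs, the function name random_id_in_window, and the retry loop are assumptions for illustration; the retry is there because a window that lands entirely inside a gap contains no rows.

```python
import random
import sqlite3


def random_id_in_window(conn, window=1000, max_tries=20, rng=random):
    """Pick a random window start in [MIN(id), MAX(id)], then a random id inside the window."""
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM t").fetchone()
    if lo is None:
        return None  # empty table
    for _ in range(max_tries):
        start = rng.randint(lo, hi)
        ids = [
            r[0]
            for r in conn.execute(
                "SELECT id FROM t WHERE id BETWEEN ? AND ?", (start, start + window)
            )
        ]
        if ids:  # an empty list means the window fell into a gap; try again
            return rng.choice(ids)
    return None  # gave up after max_tries empty windows


# Hypothetical sample data: ids 1, 8, 15, ... with small gaps between them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in range(1, 10001, 7)])
picked = random_id_in_window(conn)
```

Only the rows inside one window are read and shuffled, so the cost depends on the window size rather than on the size of the table, as long as ID is indexed.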

EDIT: you can also try the following:

 select ID from table where (ID % 1000) = floor(rand() * 1000) order by rand() limit 1 ;

This splits the table into different groups of rows ...

EDIT 2:

See: What is the best way to select a random row from a table in MySQL?

This is perhaps the fastest way:

 select @row := floor(count(*) * rand()) from some_tbl;
 select some_ID from some_tbl limit @row, 1;

Unfortunately, variables cannot be used in the LIMIT clause, so you have to use a dynamic query, either by building the query string in code or by using PREPARE and EXECUTE. Also, LIMIT n, 1 still has to scan n rows of the table, so on average this is only about twice as fast as the second method above. (It is probably more uniform, though, and it is guaranteed to find a row.)
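A minimal sketch of the "build the query in code" variant, assuming Python with the standard sqlite3 module standing in for MySQL. The table name t, the sample IDs, and the function name random_id_by_offset are hypothetical. The random offset is computed in the client and passed to the second query as a bound parameter; with a MySQL driver you would need to check how it handles parameters in LIMIT.

```python
import random
import sqlite3


def random_id_by_offset(conn, rng=random):
    """Count the rows, pick a random offset, and fetch the single row at that offset."""
    (count,) = conn.execute("SELECT COUNT(*) FROM t").fetchone()
    if count == 0:
        return None  # empty table
    offset = rng.randrange(count)  # uniform over 0 .. count - 1
    row = conn.execute(
        "SELECT id FROM t ORDER BY id LIMIT 1 OFFSET ?", (offset,)
    ).fetchone()
    return row[0]


# Hypothetical sample data: an in-memory table whose ids have gaps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO t (id) VALUES (?)", [(i,) for i in (1, 2, 5, 9, 40, 41, 100)]
)
picked = random_id_by_offset(conn)
```

Because the offset is drawn from the row count rather than from the ID range, gaps in the IDs do not skew the result, and a row is always found as long as the table is not empty and does not shrink between the two queries.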
