Get a random number of rows from a SQL Server table

Question

Get a random number of rows from a SQL Server table

I am trying to get 5 random numbers of rows from a large table (more than 1 million rows) by a quick method.

So far I have tested these SQL queries:

Method 1

Select top 5 customer_id, customer_name from Customer TABLESAMPLE(1000 rows) order by newid()

This method estimates the cost of I / O is 0.0127546 , so it is very fast (non-clustered index scan)

Method 2

 select top 5 customer_id, customer_name from Customer order by newid()

This method estimates the cost of I / O 117.21189 , and the index check non-clustered estimate of the costs of I / O 2.8735 , so this affects performance

Method 3

 select top 5 customer_id, customer_name from Customer order by rand(checksum(*))

This method estimates the cost of I / O 117.212 , and the cost of scanning non-clustered evaluation by index does not match 213.149 , this query is slower than everything because the estimated cost of the subtree is 213.228 , so it is very slow.

UPDATE:

Method 4

 select top 5 customer_id, customer_name, product_id from Customer Join Product on product_id = product_id where (customer_active = 'TRUE') order by checksum(newid())

This approach is better and very fast. All test tests are in order.

Question

How to convert method 4 to LINQ-to-SQL? Thanks

+5

guid sql-server random linq entity-framework

Arbaz abid May 08 '17 at 14:18

source share

2 answers

As stated here is the best way:

 var randomCoustmers = Customers.OrderBy(x => Guid.NewGuid()).Take(5);

0

jitender May 08 '17 at 14:37

source share

CodeNotFound · Accepted Answer · 2017-05-08T14:50:42+0000

If you want to convert Method 2 to Linq To Entities, just use a solution that jitender answers that looks like this:

 var randomCoustmers = context.Customers.OrderBy(x => Guid.NewGuid()).Take(5);

But for method 1 , which is very fast after your benchmarking, you need to do the following C # code because Linq To Entities does not have the LINQ equivalent for this SQL statement TABLESAMPLE(1000 rows) .

 var randomCoustmers = context.Customers.SqlQuery("Select TOP 5 customer_id, customer_name from Customer TABLESAMPLE(1000 rows) order by newid()").ToList();

You can move SQL statements to SQL View or a stored procedure that will accept the number of clients that will be accepted.

UPDATE

For method 4 , which seems very fast (always following your benchmark), you can run the following Linq To Entities:

 var randomCoustmers = context.Customers.OrderBy(c => SqlFunctions.Checksum(Guid.NewGuid()).Take(5);

Entity Framework can convert to SQL all functions that are defined in the SqlFunctions class. These functions have a Checksum function that will do what you want.

If you want to join other tables, you can easily do this with Linq To Entites, so I just simplified my version by DbSets only Customers DbSets .

Get a random number of rows from a SQL Server table

More articles: