Getting a random row through SQLAlchemy

How do I select one (or several) random rows from a table using SQLAlchemy?

+56
python sql database sqlalchemy
Sep 13 '08 at 19:58
8 answers

This is very much a database-specific issue.

I know that PostgreSQL and MySQL can order by a random function, so you can use this in SQLAlchemy:

    from sqlalchemy.sql.expression import func, select, text

    # `Table` stands in for your own table or mapped class
    stmt = select(Table)

    stmt.order_by(func.random())              # for PostgreSQL, SQLite
    stmt.order_by(func.rand())                # for MySQL
    stmt.order_by(text('dbms_random.value'))  # for Oracle

Then you need to limit the query to the number of records you need (for example, using .limit()).

Keep in mind that at least in PostgreSQL, choosing a random record has serious performance problems; here is a good article about it.

+70
Sep 13 '08 at 20:09

If you are using the ORM, the table is small (or you have its row count cached), and you want the solution to be database independent, a really simple approach is:

    import random

    rand = random.randrange(0, session.query(Table).count())
    row = session.query(Table)[rand]

This is cheating slightly, but that's why you use an ORM.

+20
Dec 24 '08 at 3:22

There is a simple, database-independent way to pull a random row. Just use .offset(). There is no need to pull all the rows:

    import random

    query = DBSession.query(Table)
    rowCount = int(query.count())
    randomRow = query.offset(int(rowCount * random.random())).first()

Here Table is your table (or you could put any query there). If you want several rows, you can simply run this several times and make sure that each row is not identical to a previous one.
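
For the several-rows case, one way to avoid re-running the query until the rows differ is to draw distinct offsets up front with random.sample. The sketch below is only an illustration, not part of the answer above: it uses the standard-library sqlite3 module and a made-up items table so that it runs on its own, and the helper name random_rows_by_offset is hypothetical. With SQLAlchemy, each offset would go through query.offset(...).first() instead. Since the row found at a given offset is only well defined with a fixed ordering, the sketch orders by id.

```python
import random
import sqlite3


def random_rows_by_offset(conn, k):
    """Return up to k distinct rows, fetched one at a time by random offset."""
    (row_count,) = conn.execute("SELECT COUNT(*) FROM items").fetchone()
    # Distinct offsets give distinct rows, so no retry loop is needed.
    offsets = random.sample(range(row_count), min(k, row_count))
    rows = []
    for offset in offsets:
        # A stable ORDER BY makes each offset refer to exactly one row.
        row = conn.execute(
            "SELECT id, name FROM items ORDER BY id LIMIT 1 OFFSET ?",
            (offset,),
        ).fetchone()
        rows.append(row)
    return rows


# Hypothetical demo table, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [("item-%d" % i,) for i in range(20)],
)

picked = random_rows_by_offset(conn, 5)
print(picked)
```

This still issues one query per row, so it is probably only reasonable for a handful of rows.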

+13
Feb 16 '13 at 2:19

Here are four different variations, ordered from slowest to fastest. The timeit results are below:

    import random

    from sqlalchemy.sql import func
    from sqlalchemy.orm import load_only


    def simple_random():
        # load every row, then pick one in Python
        return random.choice(model_name.query.all())


    def load_only_random():
        # same, but load only the id column
        return random.choice(model_name.query.options(load_only('id')).all())


    def order_by_random():
        # let the database order the rows randomly and return the first
        return model_name.query.order_by(func.random()).first()


    def optimized_random():
        # skip a random number of rows, computed in the database from COUNT
        return model_name.query.options(load_only('id')).offset(
            func.floor(
                func.random() *
                db.session.query(func.count(model_name.id))
            )
        ).limit(1).all()

timeit results for 10,000 runs on my Macbook against a PostgreSQL table with 300 rows:

    simple_random():     90.09954111799925
    load_only_random():  65.94714171699889
    order_by_random():   23.17819356000109
    optimized_random():  19.87806927999918

You can easily see that using func.random() is much faster than returning all the results to Python and calling random.choice().

In addition, as the size of the table increases, the performance of order_by_random() will degrade significantly, because an ORDER BY requires a full table scan, whereas the COUNT in optimized_random() can use an index.

+8
Nov 07 '15 at 12:55

This is the solution I am using:

    from random import randint

    # query over all rows
    rows_query = session.query(Table)
    # make sure there is at least 1 row
    if rows_query.count() > 0:
        # get a random index into the rows
        rand_index = randint(0, rows_query.count() - 1)
        # use the random index to get a random row
        rand_row = rows_query.all()[rand_index]

0
Mar 14 '17 at 7:32

An extended version of Lukash’s example, for when you need to select multiple rows at random:

    import random

    # You must first select all the values of the primary key field for the table.
    # In some particular cases you can use range(session.query(Table).count()) instead.
    ids = [pk for (pk,) in session.query(Table.primary_key_field).all()]
    # random.sample raises ValueError if asked for more items than exist
    ids_sample = random.sample(ids, min(100, len(ids)))
    rows = session.query(Table).filter(Table.primary_key_field.in_(ids_sample))

So this post just points out that you can use .in_ to select multiple rows at once.
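
As a self-contained illustration of the same idea (sample primary keys in Python, then fetch the rows with IN), here is a sketch that uses the standard-library sqlite3 module instead of a SQLAlchemy session, so it runs without any setup. The items table and the sample_rows_by_id helper are made up for the example; in SQLAlchemy the final query would be the .in_() filter. A couple of rows are deleted first to show that gaps in the ids do not matter with this approach.

```python
import random
import sqlite3


def sample_rows_by_id(conn, k):
    """Pick up to k random primary keys in Python, then fetch those rows with IN."""
    ids = [row[0] for row in conn.execute("SELECT id FROM items")]
    chosen = random.sample(ids, min(k, len(ids)))
    if not chosen:
        return []
    # One "?" placeholder per sampled id.
    placeholders = ", ".join("?" for _ in chosen)
    rows = conn.execute(
        "SELECT id, name FROM items WHERE id IN (%s)" % placeholders,
        chosen,
    ).fetchall()
    # IN does not return rows in sampled order, so shuffle if order matters.
    random.shuffle(rows)
    return rows


# Hypothetical demo table with gaps in the ids, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [("item-%d" % i,) for i in range(20)],
)
conn.execute("DELETE FROM items WHERE id IN (3, 7)")

picked = sample_rows_by_id(conn, 5)
print(picked)
```

Database drivers cap the number of bound parameters per statement, so a very large sample would likely need to be fetched in batches.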

-1
Apr 23 '09 at 16:27

This solution selects a single random row.

It requires that the primary key is named id; it should be, if it isn't already:

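    import random

    # highest id currently in the table
    max_model_id = YourModel.query.order_by(YourModel.id.desc())[0].id
    # assumes ids start at 1; randrange excludes the upper bound, hence the + 1
    random_id = random.randrange(1, max_model_id + 1)
    # get() returns None if the row with that id has been deleted
    random_row = YourModel.query.get(random_id)
    print(random_row)

-1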
Jan 20 '14 at 19:18

There are several ways to do this in SQL, depending on which database is being used.

(I think SQLAlchemy can use all of these anyway.)

MySQL:

 SELECT column FROM table ORDER BY RAND() LIMIT 1 

PostgreSQL:

 SELECT column FROM table ORDER BY RANDOM() LIMIT 1 

MSSQL:

 SELECT TOP 1 column FROM table ORDER BY NEWID() 

IBM DB2:

 SELECT column, RAND() as IDX FROM table ORDER BY IDX FETCH FIRST 1 ROWS ONLY 

Oracle:

 SELECT column FROM (SELECT column FROM table ORDER BY dbms_random.value) WHERE rownum = 1 

However, I do not know of any standard way.
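
To try one of these variants without a database server, the sketch below runs the RANDOM() form against an in-memory SQLite database through Python's standard-library sqlite3 module (SQLite uses RANDOM(), as PostgreSQL does). The items table and the random_names helper are hypothetical and exist only for the demonstration.

```python
import sqlite3

# Hypothetical demo table, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [("item-%d" % i,) for i in range(10)],
)


def random_names(conn, n=1):
    """Return n names chosen by ORDER BY RANDOM() LIMIT n."""
    rows = conn.execute(
        "SELECT name FROM items ORDER BY RANDOM() LIMIT ?",
        (n,),
    ).fetchall()
    return [name for (name,) in rows]


one = random_names(conn)
three = random_names(conn, 3)
print(one, three)
```

Because the whole result is ordered once and then truncated, the rows returned by a single call are distinct.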

-4
Sep 13 '08 at 20:04


