Read some random lines from a file in Python

Can someone show me how I can read a random number of lines from a file in Python?

+6
python
5 answers

Your requirement is a bit vague, so here's a slightly different method (for inspiration, if nothing else):

    from random import random

    lines = [line for line in open("/some/file") if random() >= .5]

Compared to the other solutions, the number of lines returned varies less (it follows a binomial distribution centered on half the total number of lines), but each line is selected independently with a probability of 50%, and only one pass through the file is required.
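The same idea works for any selection probability. Here is a minimal sketch; the helper name `pick_lines` and its parameter `p` are made up for illustration and are not part of the answer above:

```python
import random

def pick_lines(lines, p=0.5):
    """Keep each item of `lines` independently with probability p.

    `pick_lines` and `p` are hypothetical names used for this sketch.
    """
    # `lines` can be any iterable, including an open file object,
    # so the input is consumed in a single pass and only the kept
    # lines are stored.
    return [line for line in lines if random.random() < p]
```

Passing an open file object (for example `pick_lines(f, p=0.25)` inside a `with open(...) as f:` block) streams the file. On average this keeps `p` times the number of lines, and the kept lines stay in file order.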

+16

To get an arbitrary number of lines from your file, you can do something like the following:

    import random

    with open('file.txt') as f:
        lines = random.sample(f.readlines(), 5)

In the above example, 5 lines are returned, but you can easily change that number to whatever you need. You can also replace it with a call to randint() to get a random number of lines in addition to a random selection of lines, but you need to make sure that the sample size is no larger than the number of lines in the file. Depending on your input, this can be trivial or a bit more complicated.
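One way to keep the sample size within bounds is to draw it from the length of the list itself. This is only a sketch, and the function name `random_count_sample` is invented for illustration:

```python
import random

def random_count_sample(lines):
    """Return a random number of randomly chosen lines (possibly none)."""
    # randint() is inclusive on both ends, so k never exceeds len(lines),
    # which is the limit random.sample() enforces.
    k = random.randint(0, len(lines))
    return random.sample(lines, k)
```

With the answer's code, `lines` would be the result of `f.readlines()`. Use `random.randint(1, len(lines))` instead if an empty result is not acceptable, which assumes the file has at least one line.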

Note that the entries in lines may come out in a different order from the one in which they appear in the file.
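If the original order matters, one option is to sample line positions instead of the lines themselves and sort the positions before looking them up. A small sketch, with `sample_in_file_order` as a hypothetical helper name:

```python
import random

def sample_in_file_order(lines, k):
    """Pick k distinct lines at random, returned in their original order."""
    # Sample positions rather than lines, then sort the positions.
    indices = sorted(random.sample(range(len(lines)), k))
    return [lines[i] for i in indices]
```

As with `random.sample()` itself, `k` must not exceed `len(lines)`, otherwise a `ValueError` is raised.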

+12

    import linecache
    import random
    import sys

    # Number of lines to get.
    NUM_LINES_GET = 5

    # Get the number of lines in the file.
    with open('file_name') as f:
        number_of_lines = len(f.readlines())

    if NUM_LINES_GET > number_of_lines:
        print("are you crazy !!!!")
        sys.exit(1)

    # Choose random line numbers (linecache counts from 1) and print those lines.
    for i in random.sample(range(1, number_of_lines + 1), NUM_LINES_GET):
        print(linecache.getline('file_name', i))

    linecache.clearcache()

+2

    import os
    import random

    def getrandfromMem(filename):
        # Small file: read every line and pick one uniformly.
        with open(filename, 'rb') as fd:
            l = fd.readlines()
        # randrange() excludes len(l); randint(0, len(l)) could index past the end.
        pos = random.randrange(len(l))
        return (pos, l[pos])

    def getrandomline2(filename):
        filesize = os.stat(filename).st_size
        if filesize < 4096:  # Seek may not be very useful
            return getrandfromMem(filename)
        with open(filename, 'rb') as fd:
            for _ in range(10):  # Try 10 times
                pos = random.randrange(filesize)
                fd.seek(pos)
                fd.readline()  # Read and ignore the (probably partial) line
                line = fd.readline()
                if line != b'':
                    return (pos, line)
        # Every attempt hit the end of the file; fall back to reading it all.
        return getrandfromMem(filename)

    getrandomline2("shaks12.txt")

0

Assuming the offset is always at the beginning of the file:

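    import random

    with open('/your/file') as f:
        lines = f.read().splitlines()

    # randrange(len(lines)) yields 0 .. len(lines) - 1, so the last line is never included.
    n_lines = random.randrange(len(lines))
    random_lines = lines[:n_lines]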

Note that this will read the entire file into memory.
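If memory is a concern, a two-pass variant can count the lines first and then read only the prefix it needs. This is a sketch rather than a drop-in replacement: the function name `random_prefix` is hypothetical, and here the count is drawn from 0 up to and including the total number of lines, which is a choice made for this example.

```python
import random
from itertools import islice

def random_prefix(path):
    """Return the first n lines of the file, with n chosen at random."""
    # First pass: count the lines without storing them.
    with open(path) as f:
        total = sum(1 for _ in f)
    n_lines = random.randint(0, total)  # 0 .. total, inclusive
    # Second pass: read only the first n_lines lines.
    with open(path) as f:
        return [line.rstrip("\n") for line in islice(f, n_lines)]
```

The trade-off is a second pass over the file in exchange for holding at most `n_lines` lines in memory at once.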

0
