Running BLAST (bl2seq) without creating sequence files

I have a script that executes BLAST queries (bl2seq).

The script works as follows:

  • Get sequence a and sequence b
  • Write sequence a to filea
  • Write sequence b to fileb
  • Run the command 'bl2seq -i filea -j fileb -n blastn'
  • Read the output from STDOUT and parse it
  • Repeat 20 million times
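
For reference, one iteration of the loop above could look roughly like this in Python. This is only a sketch: the helper names (write_fasta, compare_via_files) are made up for illustration, the sequences are assumed to be plain strings, and the bl2seq flags in the usage note are copied from the command above rather than checked against a particular BLAST version.

```python
import os
import subprocess
import tempfile


def write_fasta(path, name, sequence):
    """Write a single sequence to `path` in FASTA format."""
    with open(path, "w") as handle:
        handle.write(">%s\n%s\n" % (name, sequence))


def compare_via_files(seq_a, seq_b, command):
    """Write both sequences to disk, run `command -i filea -j fileb`,
    and return whatever the command printed to STDOUT."""
    with tempfile.TemporaryDirectory() as workdir:
        file_a = os.path.join(workdir, "filea")
        file_b = os.path.join(workdir, "fileb")
        write_fasta(file_a, "a", seq_a)
        write_fasta(file_b, "b", seq_b)
        result = subprocess.run(
            list(command) + ["-i", file_a, "-j", file_b],
            stdout=subprocess.PIPE,
            universal_newlines=True,  # return text rather than bytes
            check=True,               # raise if the command fails
        )
        return result.stdout
```

It would be called as compare_via_files(seq_a, seq_b, ["bl2seq", "-n", "blastn"]), once per pair, so every pair costs two file writes, two file reads, and one process start.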

bl2seq does not support piping. Is there a way to do this without writing to and reading from the hard drive?

I am using Python BTW.

+5
5 answers

, bl2seq .? , - , . bl2seq -, STDOUT , . bl2seq , , -o. .

, Python, , , BioPython.

+1

, , - bash . , Python, ( ). , bl2seq , , .

+4

bl2seq BioPerl? , , . , , Bio::Tools::Run::AnalysisFactory::Pise, . Perl.

bl2seq, . , , .

+1

. .

python!

EDIT: , blast2, .

( )

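def _query(self):
    from subprocess import Popen, PIPE
    # BLAST is assumed to hold the path to the BLAST executable (defined elsewhere).
    pipe = Popen([BLAST,
                  '-p', 'blastn',
                  '-d', self.database,
                  '-m', '8'],
                 stdin=PIPE,
                 stdout=PIPE,
                 universal_newlines=True)  # exchange text, not bytes
    # communicate() sends the query on stdin, closes it, and collects stdout.
    output = pipe.communicate('%s\n' % self.sequence)[0]
    print(output)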

self.database is the name of the database, for example 'nt.fa'; self.sequence is the query sequence.

, . -. XML. github.

, , , , - .

+1

With blast2, from an R script:

....
system("mkfifo seq1")
system("mkfifo seq2")
system("echo sequence1 > seq1", wait = FALSE)
system("echo sequence2 > seq2", wait = FALSE)
system("blast2 -p blastp -i seq1 -j seq2 -m 8", intern = TRUE)
....

This is 2 times slower (!) than writing to and reading from the hard drive!
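
Since the question is about Python, here is a rough Python version of the same named-pipe idea. It is a sketch under several assumptions: it only works on POSIX systems (os.mkfifo), the function names are hypothetical, the command is passed in by the caller, and it assumes the BLAST program reads each input once from start to end (a FIFO cannot be rewound).

```python
import os
import subprocess
import tempfile
import threading


def _feed(path, text):
    # Opening a FIFO for writing blocks until the reader opens it,
    # so each writer runs in its own thread.
    with open(path, "w") as fifo:
        fifo.write(text)


def compare_via_fifos(seq_a, seq_b, command):
    """Run `command -i seq1 -j seq2`, where seq1 and seq2 are named pipes
    fed from memory, and return the command's STDOUT."""
    with tempfile.TemporaryDirectory() as workdir:
        paths = [os.path.join(workdir, name) for name in ("seq1", "seq2")]
        writers = []
        for path, sequence in zip(paths, (seq_a, seq_b)):
            os.mkfifo(path)
            writer = threading.Thread(
                target=_feed, args=(path, sequence + "\n"), daemon=True
            )
            writer.start()
            writers.append(writer)
        result = subprocess.run(
            list(command) + ["-i", paths[0], "-j", paths[1]],
            stdout=subprocess.PIPE,
            universal_newlines=True,  # return text rather than bytes
            check=True,
        )
        for writer in writers:
            writer.join()
        return result.stdout
```

Mirroring the R snippet, it would be called as compare_via_fifos(sequence1, sequence2, ["blast2", "-p", "blastp", "-m", "8"]). It still creates two filesystem entries and one process per pair, which may be part of why the FIFO route was not faster in the timing above.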

+1