DNA nucleotide counting using perl 6

Question

Good afternoon. I am trying to count the number of times each of the letters A, C, G and T occurs in a DNA sequence using Perl 6. I have already tried other approaches; I am just trying to do it differently. Here is some of the code I came up with:

    use v6;

    my $default-input = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC";

    sub MAIN(Str $input = $default-input)
    {
        say "{bag($input.comb)<A C G T>}";
    }

    use v6;

    my $default-input = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC";

    sub MAIN($input = $default-input)
    {
        "{<A C G T>.map({ +$input.comb(/$_/) })}".say;
    }

Data set example
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC

perl6 bioinformatics

Oluwole May 04 '16 at 19:09

2 answers

Brad Gilbert · Answer 1 · 2016-05-06T04:21:08+0000

    multi sub MAIN ( \DNA ) {
        my Int %bag = A => 0, C => 0, G => 0, T => 0;

        # doesn't keep the whole thing in memory
        # like .comb.Bag would have
        for DNA.comb { %bag{$_}++ }
        .say for %bag<A C G T> :p;
    }

    multi sub MAIN ( 'example' ){
        samewith "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC"
    }

    multi sub MAIN ( Bool :STDIN($)! ){
        samewith $*IN
    }

    multi sub MAIN ( Str :filename(:$file)! where .IO.f ){
        samewith $file.IO
    }

    ~$ ./test.p6
    Usage:
      ./test.p6 <DNA>
      ./test.p6 example
      ./test.p6 --STDIN
      ./test.p6 --filename|--file=<Str>

    ~$ ./test.p6 example
    A => 20
    C => 12
    G => 17
    T => 21

    ~$ ./test.p6 --STDIN < test.in
    A => 20
    C => 12
    G => 17
    T => 21

    ~$ ./test.p6 --file=test.in
    A => 20
    C => 12
    G => 17
    T => 21

Matt Oates · Answer 2 · 2016-05-09T13:23:02+0000

Another way is to use the BioInfo modules I am working on, which already have a coercion to Bag :)

    use v6;
    use BioInfo;

    my @sequences = `
    >seqid
    AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
    `;

    for @sequences -> $seq {
        say $seq.Bag;
    }

In the code above, you are importing a special bioinformatics slang that understands string literals between backticks as FASTA literals. DNA, RNA and amino acid sequences are detected automatically, and you get an object of the matching class. That object has its own .Bag method, which does what you want. Besides my own modules, there is also the BioPerl6 project.
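For comparison, here is a minimal sketch of the same Bag-based count in plain Perl 6, without BioInfo. It only illustrates what a Bag of residues looks like for the example data set; it is not how BioInfo implements its .Bag, and the variable names are made up for the example.

```raku
use v6;

# Example data set from the question.
my $dna = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC";

# .comb with no arguments splits the string into single characters;
# .Bag collects them into an immutable mapping of character => count.
my $counts = $dna.comb.Bag;

# Indexing the Bag with several keys at once returns the counts in that order.
# A key that never occurs would simply yield 0.
say $counts<A C G T>.join(' ');
```

Note that this builds the whole list of characters before counting, which is the memory cost the first answer avoids by incrementing a hash inside a loop.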

If you want to read from a file, the following should work for you:

    use v6;
    use BioInfo::Parser::FASTA;
    use BioInfo::IO::FileParser;

    #Spawn an IO thread that parses the file and creates BioInfo::Seq objects on .get
    my $seq_file = BioInfo::IO::FileParser.new(file => 'myseqs.fa', parser => BioInfo::Parser::FASTA);

    #Print the residue counts per file
    while my $seq = $seq_file.get() {
        say $seq.Bag;
    }
