Gamete frequency calculation optimization in populations

I need to optimize the calculation of gamete frequencies in populations.

I have a population npand Neindividuals in each population. Each person is formed by two gametes (male and female). Each gamete contains three genes. Each gene can be 0or 1. Thus, each person is a 2x3 matrix. Each row of the matrix is ​​a gamete provided by one of the parents. The set of individuals in each population can be arbitrary (but always length Ne). For simplicity, initial populations with individuals can be represented as:

Ne = 300; np = 3^7;
(*This table may be arbitrary with the same shape*)
ind = Table[{{0, 0, 0}, {1, 1, 1}}, {np}, {Ne}]

A complete set of all possible gametes:

allGam = Tuples[{0, 1}, 3]

Each person can generate gametes in 8 possible ways with equal probability. These gametes: Tuples@Transpose@ind[[iPop, iInd]](where iPopand iIndare the indices of the population and the individual in this population). I need to calculate the frequencies of gametes generated by individuals for each population.

At this point, my solution is as follows.

First, I convert each person into gametes that he can produce:

gamsInPop = Map[Sequence @@ Tuples@Transpose@# &, ind, {2}]

But a more efficient way to do this is:

gamsInPop = 
 Table[Join @@ Table[Tuples@Transpose@ind[[i, j]], {j, 1, Ne}], {i, 1, np}]

Secondly, I calculate the frequencies of gametes generated, including zero frequencies for gametes, which are possible but absent in the population:

gamFrq = Table[Count[pop, gam]/(8 Ne), {pop, gamInPop}, {gam, allGam}]

A more efficient version of this code:

gamFrq = Total[
   Developer`ToPackedArray[
    gamInPop /. Table[
      allGam[[i]] -> Insert[{0, 0, 0, 0, 0, 0, 0}, 1, i], {i, 1, 
       8}]], {2}]/(8 Ne)

Unfortunately, the code is still too slow. Can someone help me speed it up?

+5
1

:

Clear[getFrequencies];
Module[{t = 
   Developer`ToPackedArray[
     Table[FromDigits[#, 2] & /@ 
         Tuples[Transpose[{
            PadLeft[IntegerDigits[i, 2], 3], 
            PadLeft[IntegerDigits[j, 2], 3]}]], 
       {i, 0, 7}, {j, 0, 7}]
    ]},
   getFrequencies[ind_] :=
    With[{extracted = 
       Partition[
          Flatten@Extract[t, Flatten[ind.(2^Range[0, 2]) + 1, 1]], 
          Ne*8]},
        Map[
         Sort@Join[#, Thread[{Complement[Range[0, 7], #[[All, 1]]], 0}]] &@Tally[#] &, 
         extracted
        ][[All, All, 2]]/(Ne*8)
    ]
]

40 . :

In[372]:= Ne=300;np=3^7;
(*This table may be arbitrary with the same shape*)
inds=Table[{{0,0,0},{1,1,1}},{np},{Ne}];

In[374]:= 
getFrequencies[inds]//Short//Timing
Out[374]= {0.282,{{1/8,1/8,1/8,1/8,1/8,1/8,1/8,1/8},<<2185>>,
{1/8,1/8,1/8,1/8,1/8,1/8,1/8,1/8}}}

In[375]:= 
Timing[
  gamsInPop=Table[Join@@Table[Tuples@Transpose@inds[[i,j]],{j,1,Ne}],{i,1,np}];
  gamFrq=Total[Developer`ToPackedArray[gamsInPop/.Table[allGam[[i]]->
         Insert[{0,0,0,0,0,0,0},1,i],{i,1,8}]],{2}]/(8 Ne)//Short]

Out[375]= {10.563,{{1/8,1/8,1/8,1/8,1/8,1/8,1/8,1/8},<<2185>>,
  {1/8,1/8,1/8,1/8,1/8,1/8,1/8,1/8}}}

, ( ) - ,

In[393]:= fr[[All,{1,5,3,7,2,6,4,8}]] == gamFrq
Out[393]= True

: -, t, : 0 7, , . , , {i,j}, i (), j - , ( {i,j}). , , . :

In[396]:= t//Short[#,5]&
Out[396]//Short= {{{0,0,0,0,0,0,0,0},{0,1,0,1,0,1,0,1},{0,0,2,2,0,0,2,2},
{0,1,2,3,0,1,2,3},{0,0,0,0,4,4,4,4},{0,1,0,1,4,5,4,5},{0,0,2,2,4,4,6,6},
{0,1,2,3,4,5,6,7}},<<6>>,{{7,6,5,4,3,2,1,0},{7,7,5,5,3,3,1,1},{7,6,7,6,3,2,3,2},
<<2>>,{7,7,5,5,7,7,5,5},{7,6,7,6,7,6,7,6},{7,7,7,7,7,7,7,7}}}

() .

Flatten[ind.(2^Range[0, 2]) + 1, 1]] , 1, , t . Extract Flatten Partition . Tally, ( Join[#, Thread[{Complement[Range[0, 7], #[[All, 1]]], 0}]]) Sort . , .

, . . .

+6

All Articles