For large r and b you can use a method called Monte Carlo integration (see, for example, Monte Carlo integration on Wikipedia, and/or SICP chapter 3.1.2) to calculate the sum. For small r and b and significantly different node-failure probabilities p[i], the exact method is superior. The exact definitions of small and large depend on several factors and are best determined experimentally.
Concrete code example: this is a very simple code example (in Python) to demonstrate how such a procedure can work:
    import random

    def montecarlo(p, rb, b, N):
        """Corresponds to the binomial coefficient formula."""
        succ = 0
        # Run N samples
        for i in range(N):
            # Generate a single test case: each of the rb nodes is
            # alive independently with probability p
            alivenum = 0
            for j in range(rb):
                if random.random() < p: alivenum += 1
            # Check the test case: the system works if at least b
            # of the rb nodes are alive
            if alivenum >= b: succ += 1
        # Return the fraction of successful samples
        return float(succ) / N
The function corresponds to the binomial test example above and runs N trials, each checking whether at least b nodes out of r*b nodes survive, where each node is alive with probability p. A few experiments will convince you that you need N in the range of thousands of samples before you get reasonable results, but in principle the complexity is O(N*r*b). The accuracy of the result scales as sqrt(N), i.e. to double your accuracy you need to quadruple N. For sufficiently large r*b this method will clearly be superior.
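For illustration, a hypothetical call (all parameter values below are made up) that also shows the convergence with growing N could look like this:

    # made-up example: 40 nodes in total, at least 8 must be alive,
    # each node alive with probability 0.9
    for N in (10**3, 10**4, 10**5):
        print(N, montecarlo(0.9, 40, 8, N))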
Extensions of the approach: you obviously need to design the test case so that it captures all the properties of the system. You have suggested a couple of extensions, some of which can be implemented easily while others cannot. Let me give you a few suggestions:
1) In the case of distinct but uncorrelated p[i], the changes to the above code are minimal: in the function head you pass an array instead of a single float p, and you replace the line if random.random() < p: alivenum += 1 with
    if random.random() < p[j]: alivenum += 1
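Putting this together, a minimal sketch of the per-node variant (the name montecarlo_indiv is mine; the list p now carries one survival probability per node, so len(p) plays the role of r*b):

    import random

    def montecarlo_indiv(p, b, N):
        """Like montecarlo(), but p is a list with one survival
        probability per node."""
        succ = 0
        for i in range(N):
            alivenum = 0
            for j in range(len(p)):
                if random.random() < p[j]: alivenum += 1
            if alivenum >= b: succ += 1
        return float(succ) / N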
2) In the case of correlated p[i] you need additional information about the system. The situation I referred to in my comment could be something like this:
    A--B--C
    |  |
    D  E
    |
    F--G--H
    |
    J
In this case, A would be the "root node", and a failure of node D would mean an automatic failure, with 100% probability, of nodes F, G, H and J; while a failure of node F would automatically take down G, H and J, etc. At least that is how I meant it in my comment (which is a plausible interpretation, since you talk about a tree-like probability structure in the original question). In this situation you would modify the code so that p refers to a tree structure and for j in ... traverses the tree, skipping the lower branches from the current node as soon as a test fails. The resulting test is still alivenum >= b, as before; a sketch of this is given below.
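A minimal sketch of that idea, assuming the tree is given as nested (probability, children) tuples (this representation is my own choice, not something fixed by the question):

    import random

    def alive_in_tree(node):
        """Count surviving nodes in a subtree; if a node fails its test,
        the whole subtree below it is skipped and counts as dead."""
        p, children = node
        if random.random() >= p:
            return 0
        return 1 + sum(alive_in_tree(c) for c in children)

    def montecarlo_tree(root, b, N):
        """Same success criterion as before: at least b nodes alive."""
        succ = 0
        for i in range(N):
            if alive_in_tree(root) >= b: succ += 1
        return float(succ) / N

    # e.g. a node with survival probability 0.95 and two leaf children:
    # node = (0.95, [(0.9, []), (0.9, [])])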
3) This approach will fail if the network is a cyclic graph that cannot be represented by a tree structure. In that case you first need to generate graph nodes that are either dead or alive, and then run a routing algorithm on the graph to count the number of unique reachable nodes. This does not increase the time complexity of the algorithm, but it obviously increases the code complexity; see the sketch below.
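One possible sketch, assuming the graph is given as an adjacency dict and that "working" means at least b nodes are reachable from a designated root node (both assumptions are mine):

    import random
    from collections import deque

    def montecarlo_graph(adj, p, root, b, N):
        """adj: dict mapping each node to a list of neighbours;
        p: dict mapping each node to its survival probability."""
        succ = 0
        for i in range(N):
            # mark each node dead or alive for this sample
            alive = set(v for v in adj if random.random() < p[v])
            # breadth-first search counts unique reachable alive nodes
            reachable = 0
            if root in alive:
                seen, queue = set([root]), deque([root])
                while queue:
                    v = queue.popleft()
                    reachable += 1
                    for w in adj[v]:
                        if w in alive and w not in seen:
                            seen.add(w)
                            queue.append(w)
            if reachable >= b: succ += 1
        return float(succ) / N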
4) Time dependence is a non-trivial but possible modification if you know the mtbf/mttr (mean time between failures / mean time to repair) of the nodes, since these can give you the probabilities p, for either the tree structure or the uncorrelated linear p[i], as a sum of exponentials. You then have to run the MC procedure at different times with the corresponding results for p; see the sketch below.
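As an illustration of that last point, a sketch under the assumption of exponentially distributed failure and repair times (the standard two-state Markov model; per-node mtbf and mttr are assumed known):

    import math

    def availability(t, mtbf, mttr):
        """Probability that a node is alive at time t, starting from a
        working node, for exponential failure and repair times."""
        lam, mu = 1.0 / mtbf, 1.0 / mttr
        return mu / (lam + mu) + (lam / (lam + mu)) * math.exp(-(lam + mu) * t)

    # run the MC procedure at different times t, e.g. with montecarlo_indiv():
    # p_t = [availability(t, mtbf[j], mttr[j]) for j in range(numnodes)]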
5) If you only have log files (as stated in your last paragraph), this will require a substantial modification of the approach, which is beyond what I can do on this board. The log files would need to be thorough enough that you can reconstruct a model of the network graph (and therefore of the graph of p), as well as the individual values p[i] of all nodes; otherwise the accuracy will be unreliable. The log files would also need to be substantially longer than the failure and repair timescales, an assumption which may be unrealistic on real networks.