Locate the SCC graph (Tarjan algorithm or DFS dual run).
For each SCC, calculate the sum of its node weights; denote this value as PARTIAL-SUM.
SCC iterations in reverse topological order; for each node in each SCC, its SUM will be the sum of all the SUM values ββof the neighboring SCC plus its own PARTIAL-SUM value.
The linear run time is O(E+V) , since the SCC search is linear, the topological type is linear, and the summation is linear, since we visit each SCC no more than once and each branch no more than once.
EDIT
As noted in tzkuzzy's comments, parallel paths create a problem. This is easily solved by simply running the DFS on the SCC chart. At any intersection, we simply take the already visited node DFS tree until we reach the not-completely distorted parent, this pair of nodes (visited below and the ancestor) has two different paths between them, we make a list for each node of such descendants and when summing, we simply subtract their PARTIAL-SUM value.
So if:
u / \ \ / w
Our DFS will take the cross-border connecting the node child element from u to w and return to u (for those familiar with the typical DFS taught in schools, the simplest explanation is that u characterized as the first gray ancestor of w ), therefore we add w to the list that we support on u .
Then, when we summarize each SCC-neighboring SCC, as described, we add an extra step in which we iterate over the list and simply subtract the PARTIAL-SUM values.
DFS itself is still linear. The rollback from node to the ancestor can be linear if we cache the results (thus, we do not cross the same edge more than once). And the additional work in the summation is not more than O(V) , so we did not change the working time.
EDIT
Inclusion-exclusion makes this more difficult than I thought. This solution is not complete and does not work. A simple BFS for each node is more expensive, but simpler and will work.