The core of your question is similar to what allows you to find MST (technically called the optimal branching or minimum cost of arborescence) in a directed graph, and therefore more difficult than finding an MST in an undirected graph.
Both Prim and Kruskal algorithms work due to the cut property. If G = (V, E) is a graph, then for any cut (S, V - S) in G, if there is a tiny edge {u, v} intersecting this cut, then this edge must belong to all MST G. Unfortunately , this property is incorrect in the directed case. Here's a counterexample:
2 A ----> B | | ^ 1 | -99 | | 1 | v | +-----> C
Here the minimum cost of arborescence rooted in A is this:
2 A ----> B | -99 | v C
However, look at the section ({A}, {B, C}). The least expensive edge crossing this section is the edge (A, C), but this edge does not appear in any minimum section and is rooted in A.
Without the cut property, the Prim algorithm and the Kruskal algorithm fail. Try running the Prim algorithm in the graph here, starting with node A as the included node. You add an edge (A, C), then an edge (C, B) giving a suboptimal tree. Now try to run the Kruskal algorithm. First you add an edge (B, C), then add an edge (A, C). Unfortunately, this is actually not arborescence, because it does not have a root node.
The standard arborescences minimum cost search algorithm (Edmonds-Chu) is actually probably closer in spirit to the Boruvka algorithm . The Boruvka algorithm works by simultaneously selecting for each node the least cost edge associated with that node and adding it to the MST candidate. Then you bind all the CCs thus formed into separate nodes and repeat this process until you get your tree.
In this case, while the weights of the edges are different, this algorithm will never introduce a cycle (this is a good exercise to prove it), but this is not so in directional algorithms. The above graph is a good example of this - if you try this, you select (A, C) from A, (C, B) from C and (B, C) from B, forming a cycle (B, C, B). Correction of the Edmonds-Chu algorithm uses work by reducing one of these cycles to one node, and then repeating this process on a reduced graph and not delivering cycles based on the result. In this sense, it is similar to the Bowka algorithm, albeit with corresponding changes.
Hope this helps!