I looked at the Dijkstra's algorithm link that you posted in the comments, and I believe that this is the source of the inefficiency. Inside the inner loop of Dijkstra's algorithm, the implementation uses an extremely unoptimized approach to determine which node to explore next (a linear scan over every node on each step). The problematic code is in two spots. The first is this code, which tries to find the next node to process:
    mini = -1;
    for (i = 1; i <= n; ++i)
        if (!visited[i] && ((mini == -1) || (d[i] < d[mini])))
            mini = i;
Since this code is nested inside a loop that visits each node, the complexity is (as mentioned in the link) O(|V|²), where |V| is the number of nodes. In your case, since |V| is 30,000, this loop runs about nine hundred million iterations in total. This is painfully slow (as you've seen), and there is no reason at all to do this much work.
The second problem is this code, which tries to find which edges in the graph can be used to lower the costs of other nodes:
    for (i = 1; i <= n; ++i)
        if (dist[mini][i])
            if (d[mini] + dist[mini][i] < d[i])
                d[i] = d[mini] + dist[mini][i];
This scans an entire row of the adjacency matrix looking for nodes to consider, which takes O(n) time regardless of how many outgoing edges the node actually has.
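For contrast, here is a minimal sketch of what the relaxation step looks like with an adjacency list instead of a matrix (a hypothetical representation, not your current one; the names `Edge`, `adj`, and `relax` are illustrative). The work becomes proportional to the node's actual out-degree rather than to n:

```cpp
#include <vector>
#include <utility>

// Hypothetical adjacency-list representation: adj[u] holds one
// (neighbor, weight) pair per edge leaving node u, so relaxing
// node u costs O(deg(u)) instead of O(n).
using Edge = std::pair<int, int>;  // (destination node, edge weight)

void relax(int u, const std::vector<std::vector<Edge>>& adj,
           std::vector<long long>& d) {
    for (const auto& [v, w] : adj[u])  // touches only edges that exist
        if (d[u] + w < d[v])
            d[v] = d[u] + w;
}
```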
While you could try to patch this version of Dijkstra's algorithm into a more optimized implementation, I think the right move here is to simply throw this code away and find a better implementation of Dijkstra's algorithm. For example, if you use the pseudocode from the Wikipedia article implemented with a binary heap, you can get Dijkstra's algorithm running in O(|E| log |V|). In your case, this value is just over two million, which is about 450 times better than your current approach. That's a huge difference, and I'd bet that with a better implementation of Dijkstra's algorithm you'll end up with code that runs dramatically faster than before.
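To make that concrete, here is a sketch of a binary-heap version (this assumes an adjacency-list representation rather than your matrix, and uses the common lazy-deletion trick in place of the decrease-key step from Wikipedia's pseudocode; all names here are illustrative, not from your code):

```cpp
#include <vector>
#include <queue>
#include <utility>
#include <limits>

// Sketch of heap-based Dijkstra, O(|E| log |V|). Nodes are 0-based
// and adj[u] lists (neighbor, weight) pairs for edges leaving u.
std::vector<long long> dijkstra(int n, int src,
        const std::vector<std::vector<std::pair<int, int>>>& adj) {
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> d(n, INF);
    d[src] = 0;
    // Min-heap of (distance, node). Instead of a decrease-key
    // operation, we push duplicates and skip stale entries on pop.
    using State = std::pair<long long, int>;
    std::priority_queue<State, std::vector<State>,
                        std::greater<State>> pq;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [du, u] = pq.top(); pq.pop();
        if (du > d[u]) continue;       // stale entry: u was already improved
        for (auto [v, w] : adj[u])     // relax each outgoing edge once
            if (du + w < d[v]) {
                d[v] = du + w;
                pq.push({d[v], v});
            }
    }
    return d;
}
```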
Alternatively, you might consider doing all of the Dijkstra searches in parallel, as pointed out by Jacob Eggers. That can give you an extra speed boost for each processor that you have. Combined with the above (and more critical) fix, this should probably give you a huge performance increase.
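One simple way to parallelize, sketched here with std::thread (the function names are mine, and this assumes each single-source search is fully independent, which holds since each one only writes its own row of results):

```cpp
#include <vector>
#include <thread>
#include <queue>
#include <utility>
#include <limits>
#include <algorithm>

using Adj = std::vector<std::vector<std::pair<int, int>>>;
const long long INF = std::numeric_limits<long long>::max();

// Minimal heap-based single-source Dijkstra, O(|E| log |V|).
std::vector<long long> dijkstra(int n, int src, const Adj& adj) {
    std::vector<long long> d(n, INF);
    d[src] = 0;
    using State = std::pair<long long, int>;
    std::priority_queue<State, std::vector<State>,
                        std::greater<State>> pq;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [du, u] = pq.top(); pq.pop();
        if (du > d[u]) continue;                  // skip stale entries
        for (auto [v, w] : adj[u])
            if (du + w < d[v]) { d[v] = du + w; pq.push({d[v], v}); }
    }
    return d;
}

// Run one search per source node, striping the sources across the
// available hardware threads. Each thread writes only its own rows
// of dist, so no locking is needed.
std::vector<std::vector<long long>> all_pairs_parallel(int n, const Adj& adj) {
    std::vector<std::vector<long long>> dist(n);
    unsigned nthreads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t)
        workers.emplace_back([&dist, &adj, n, nthreads, t] {
            for (int s = (int)t; s < n; s += (int)nthreads)
                dist[s] = dijkstra(n, s, adj);
        });
    for (auto& th : workers) th.join();
    return dist;
}
```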
If you plan on running this algorithm on much denser data sets (where the number of edges approaches |V|²/log |V|), then you might want to switch to the Floyd-Warshall algorithm. Running Dijkstra's algorithm once per node (sometimes called Johnson's algorithm) takes O(|V||E| log |V|) time, whereas Floyd-Warshall takes O(|V|³) time. However, for the data set you mentioned, the graph is sparse enough that running multiple instances of Dijkstra's algorithm should be perfectly fine.
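If you ever do go that route, Floyd-Warshall is only a few lines. Here is a minimal sketch (the INF sentinel value and the in-place matrix layout are my choices for illustration):

```cpp
#include <vector>
#include <algorithm>

// Large sentinel for "no edge"; small enough that INF + INF
// still fits in a long long without overflowing.
const long long INF = 1LL << 60;

// Sketch of Floyd-Warshall, O(|V|^3). dist starts out as the weighted
// adjacency matrix (0 on the diagonal, edge weight where an edge
// exists, INF otherwise) and is updated in place so that on return
// dist[i][j] is the shortest-path distance from i to j.
void floyd_warshall(std::vector<std::vector<long long>>& dist) {
    const int n = (int)dist.size();
    for (int k = 0; k < n; ++k)        // allow k as an intermediate node
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                dist[i][j] = std::min(dist[i][j],
                                      dist[i][k] + dist[k][j]);
}
```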
Hope this helps!