This is a supervised learning problem.
I have a directed acyclic graph (DAG). Each edge has a feature vector X, and each node (vertex) is labeled 0 or 1. The task is to find a cost function w(X) such that the shortest path between any pair of nodes has the highest ratio of 1s to 0s (minimal classification error).
The solution should generalize well. I tried logistic regression, and the learned logistic function predicts the label of a node quite well, given the features of the incoming edge. However, this approach does not take the graph topology into account, so the solution over the whole graph is not optimal. In other words, the logistic function is not a good weight function for the setting above.
Although my problem is not a typical binary classification setup, here is a good introduction to it: http://en.wikipedia.org/wiki/Supervised_learning#How_supervised_learning_algorithms_work
Here are some more details:
- Each feature vector X is a d-dimensional list of real numbers.
- Each edge has a feature vector. That is, given the set of edges E = {e1, e2, ..., en} and the set of feature vectors F = {X1, X2, ..., Xn}, the edge ei is associated with the vector Xi.
- It is possible to find a function f(X) such that f(Xi) gives the probability that the edge ei points to a node labeled 1. An example of such a function is the one I mentioned above, found through logistic regression. However, as noted above, such a function is not optimal.
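To make the baseline concrete, here is a minimal sketch of a logistic f(X) and one possible way to turn it into an edge cost. The parameters `theta` and `bias`, the edge names, and the choice w(X) = -log f(X) are assumptions for illustration only. The negative log is used because it gives non-negative costs that add up along a path, not because it is the optimal w(X) asked about here.

```python
import math

def logistic(theta, bias, x):
    """f(X): estimated probability that an edge with features x points to a node labeled 1."""
    z = bias + sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def neg_log_cost(theta, bias, x, eps=1e-12):
    """Baseline cost w(X) = -log f(X): likely-1 edges are cheap, likely-0 edges are expensive."""
    p = logistic(theta, bias, x)
    return -math.log(max(p, eps))  # eps guards against log(0)

# Hypothetical parameters, as if obtained from a logistic regression fit.
theta = [2.0, -1.0]
bias = 0.0

# Hypothetical edges with d = 2 features each.
features = {"e1": [1.0, 0.0], "e2": [0.0, 1.0]}
costs = {e: neg_log_cost(theta, bias, x) for e, x in features.items()}
```

Under these assumed parameters, e1 gets a lower cost than e2 because f assigns it a higher probability of pointing to a 1-labeled node. This cost still looks at each edge in isolation, which is exactly the limitation described above.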
So the question is: given the graph, a start node, and a finish node, how do I learn the optimal cost function w(X) so that the ratio of 1-labeled to 0-labeled nodes on the shortest path is maximal (minimal classification error)?
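For reference, the objective can be evaluated for any candidate w(X) roughly as follows. This is only a sketch of the evaluation step, not of the learning procedure. The graph, the labels, and the two candidate cost functions are made up, and the score is the fraction of 1-labeled nodes on the path rather than the 1s-to-0s ratio, to avoid dividing by zero.

```python
from collections import defaultdict, deque

def topological_order(nodes, edges):
    """Kahn's algorithm; edges is a list of (u, v, x) triples."""
    indeg = {v: 0 for v in nodes}
    out = defaultdict(list)
    for u, v, _ in edges:
        out[u].append(v)
        indeg[v] += 1
    queue = deque(v for v in nodes if indeg[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return order

def shortest_path(nodes, edges, w, start, finish):
    """Shortest start-to-finish path in a DAG under edge cost w(x); None if unreachable."""
    out = defaultdict(list)
    for u, v, x in edges:
        out[u].append((v, w(x)))
    inf = float("inf")
    dist = {v: inf for v in nodes}
    prev = {}
    dist[start] = 0.0
    # Relax edges in topological order, so each node is final when visited.
    for u in topological_order(nodes, edges):
        if dist[u] == inf:
            continue
        for v, c in out[u]:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
                prev[v] = u
    if dist[finish] == inf:
        return None
    path = [finish]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

def fraction_of_ones(path, labels):
    """Share of nodes on the path that are labeled 1."""
    return sum(labels[v] for v in path) / len(path)

# Made-up example: two routes from s to t, one through a (label 1), one through b (label 0).
nodes = ["s", "a", "b", "t"]
labels = {"s": 1, "a": 1, "b": 0, "t": 1}
edges = [("s", "a", [1.0]), ("a", "t", [1.0]), ("s", "b", [0.0]), ("b", "t", [0.0])]

good_w = lambda x: 1.0 - x[0]  # hypothetical cost that favors the route through a
bad_w = lambda x: x[0]         # hypothetical cost that favors the route through b

good_path = shortest_path(nodes, edges, good_w, "s", "t")
bad_path = shortest_path(nodes, edges, bad_w, "s", "t")
```

Here `good_w` sends the shortest path through a, so every node on it is labeled 1, while `bad_w` sends it through b. What I am looking for is a way to learn w(X) so that this score is high on the shortest path, instead of scoring edges independently.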
Diego