I'm not sure if this is the right exchange site for machine learning questions, but I have seen ML questions here, so I'll try my luck (also posted at http://math.stackexchange.com).
I have training instances that come from different sources, so building a single model does not work well. Is there a known approach for such cases?
An example explains it best. Say I want to classify cancer / non-cancer, and the training data were collected from different population groups. Cases from one population may have a completely different distribution of positive / negative examples than cases from another. I could build a separate model for each population, but the problem is that at test time I do not know which population a test instance comes from.
* All training / test instances have the same set of features, regardless of which population they came from.
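To make the setup concrete, here is a minimal sketch in Python on synthetic data. Everything in it is an assumption for illustration only: the population names `pop_a` and `pop_b`, the positive rates, the feature shift, and the nearest-centroid "model", which just stands in for whatever classifier would really be used.

```python
import random
from collections import defaultdict

random.seed(0)

def make_population(n, pos_rate, shift):
    """Synthetic (features, label) pairs; pos_rate and shift are made-up knobs."""
    data = []
    for _ in range(n):
        y = 1 if random.random() < pos_rate else 0
        # Every instance has the same two features, whatever its population.
        x = [random.gauss(shift + 2.0 * y, 1.0), random.gauss(-shift, 1.0)]
        data.append((x, y))
    return data

# Two hypothetical populations with very different class balance.
train = {
    "pop_a": make_population(500, pos_rate=0.05, shift=0.0),
    "pop_b": make_population(500, pos_rate=0.60, shift=3.0),
}

def fit_centroids(data):
    """Nearest-centroid stand-in model: one mean feature vector per class."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for x, y in data:
        counts[y] += 1
        for i, v in enumerate(x):
            sums[y][i] += v
    return {y: [s / counts[y] for s in sums[y]] for y in counts}

def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))

# One model per population, as described above.
models = {name: fit_centroids(data) for name, data in train.items()}
pos_rates = {name: sum(y for _, y in data) / len(data)
             for name, data in train.items()}

# A test instance arrives WITHOUT a population label,
# so all I can do is ask every model, and they may disagree.
x_test = [2.5, -1.5]
votes = {name: predict(m, x_test) for name, m in models.items()}
print(pos_rates)
print(votes)
```

The per-population models can be trained without trouble, but `votes` shows where I get stuck: without knowing the population of `x_test`, I have no principled way to choose between the models' answers.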