Including feature error in the Random Forest algorithm

Question


I use Random Forest to classify a large number of astronomical objects, and it does a relatively good job. However, I want to improve performance by adding information about the error of each feature.

In astronomy, each dimension (feature) usually has an associated error bar. For example, if I measure the red and blue colors, each color measurement consists of a brightness (in astronomy, the magnitude of a star) and an error, e.g. R = 14 ± 0.2, B = 12 ± 0.15.

I want to figure out how to make Random Forest use the error bars as additional information. Any ideas?

+4

machine-learning random-forest

user1511102 Jul 9 '12 at 6:09


2 answers

vumaasha · Answer 1 · 2013-04-04T08:32:16+0000

Are the error and the color measurement both numerical features? If so, I would just add a new feature that is the product of the two. I believe this is what is called an interaction in R.
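A minimal sketch of that suggestion, assuming each object is stored as a row of measured values with a matching row of errors. The helper name `add_error_features` and the data layout are hypothetical, chosen only for illustration; the numbers are the R and B example from the question.

```python
def add_error_features(values, errors):
    """Return rows holding the values, their errors, and value * error products."""
    rows = []
    for v_row, e_row in zip(values, errors):
        if len(v_row) != len(e_row):
            raise ValueError("each value needs a matching error")
        # Interaction term: product of each measurement and its error
        products = [v * e for v, e in zip(v_row, e_row)]
        rows.append(list(v_row) + list(e_row) + products)
    return rows


# Example from the question: R = 14 +/- 0.2, B = 12 +/- 0.15
values = [[14.0, 12.0]]
errors = [[0.2, 0.15]]
features = add_error_features(values, errors)
```

Each row now has six columns (R, B, R error, B error, R * R error, B * B error) and could be used as input to the classifier in place of the values alone.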

vc273 · Answer 2 · 2013-11-19T07:54:05+0000

One simple thing you can consider is oversampling your data using the error distribution of each variable. That is, you create new examples by taking x + u * sigma, where u is a draw from a normal(0, 1) distribution and sigma is the standard deviation of the error for that variable. It may take a lot of extra samples to properly incorporate the noise (depending on the number of features), but since RFs train pretty fast in parallel, this can be a simple approach. It also has the added benefit of making it easy to include correlated noise in the samples.
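The oversampling step could be sketched roughly as follows, assuming independent Gaussian errors per feature (the correlated case is not covered here). The function name `oversample_with_noise`, the `n_copies` parameter, and the sample magnitudes and labels are all hypothetical.

```python
import random


def oversample_with_noise(X, sigma, y, n_copies=10, seed=0):
    """Create noisy copies of each example: x + u * sigma, with u ~ N(0, 1)."""
    rng = random.Random(seed)
    X_aug, y_aug = [], []
    for x_row, s_row, label in zip(X, sigma, y):
        # Keep the original measurement as well
        X_aug.append(list(x_row))
        y_aug.append(label)
        for _ in range(n_copies):
            # Independent standard normal draw for every feature
            noisy = [x + rng.gauss(0.0, 1.0) * s for x, s in zip(x_row, s_row)]
            X_aug.append(noisy)
            y_aug.append(label)
    return X_aug, y_aug


# Hypothetical R and B magnitudes with their error bars
X = [[14.0, 12.0], [15.1, 13.4]]
sigma = [[0.2, 0.15], [0.3, 0.0]]
y = ["star", "galaxy"]  # hypothetical class labels
X_aug, y_aug = oversample_with_noise(X, sigma, y, n_copies=5)
```

The augmented `X_aug` and `y_aug` could then be passed to the forest for training, for example scikit-learn's `RandomForestClassifier.fit(X_aug, y_aug)`.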
