Making forecasts from a resume

I have a database of many resumes, including structured data on each person's gender, age, address, number of years of study, and many other parameters.

For about 10% of the sample, I also have additional data about a specific action each person took at a particular point in time. For example, Jane took out a mortgage in July 1998, or John began pilot training in January 2007 and received his license in December 2007.

I need an algorithm that gives, for each action, the probability that each person will take it in the future. For example, the probability that Bill will take out a mortgage is 2% in 2011, 3.5% in 2012, and so on.

How do I approach this? Regression analysis? SVM? Neural network? Something else?

Maybe there is even a standard tool / library that I can use with just the default settings?

+4
3 answers

The probability that X happens, given that Y has happened, comes straight from Bayesian inference, I think.
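To make the idea concrete, here is a minimal sketch of Bayes' rule applied to one attribute. All counts are invented for illustration, and the attribute (an age band) and the action (a mortgage) are only assumptions about what the data might contain. A real model would condition on many attributes at once.

```python
def posterior(p_y_given_x, p_x, p_y):
    """Bayes' rule: P(X | Y) = P(Y | X) * P(X) / P(Y)."""
    return p_y_given_x * p_x / p_y

# Hypothetical counts, invented purely for illustration
n_people = 1000          # people whose action history is known
n_mortgage = 50          # of those, people who took a mortgage (X)
n_age_30s = 200          # people aged 30-39 (Y)
n_mortgage_age_30s = 25  # people aged 30-39 who took a mortgage

p_x = n_mortgage / n_people                    # P(X): base rate of the action
p_y = n_age_30s / n_people                     # P(Y): share of the attribute
p_y_given_x = n_mortgage_age_30s / n_mortgage  # P(Y | X)

# P(X | Y): probability of the action for someone with the attribute
p_x_given_y = posterior(p_y_given_x, p_x, p_y)
print(f"P(mortgage | aged 30-39) = {p_x_given_y:.3f}")
```

With these made-up numbers the result matches the direct count, 25 of the 200 people in the age band.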

+1

Lu is right, this is a matter of Bayesian inference.

The best tool / library to solve this is the statistical programming language R (r-project.org).

Take a look at the Bayesian inference packages in R: http://cran.r-project.org/web/views/Bayesian.html

How many people are in the "10% of the sample"? If it is below 100 people or so, I fear that the results of the analysis may not be significant. If it is 1000 or more people, the results will be pretty good (rule of thumb).
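One rough way to see why the sample size matters is to compare the uncertainty of an estimated rate at 100 and at 1000 people. The sketch below uses the normal-approximation (Wald) interval for a proportion, which is my choice for illustration and not something the rule of thumb above is derived from. The 5% action rate is an assumed figure.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Approximate 95% interval for a proportion (normal approximation)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clamp to [0, 1], since a probability cannot leave that range
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Assumed for illustration: 5% of the labelled people took the action
low_100, high_100 = wald_interval(5, 100)
low_1000, high_1000 = wald_interval(50, 1000)

print(f"n=100:  {low_100:.3f} to {high_100:.3f}")
print(f"n=1000: {low_1000:.3f} to {high_1000:.3f}")
```

The interval narrows by a factor of about the square root of 10 when the sample grows tenfold, so the estimate from 100 people is much less precise than the one from 1000.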

I would export the data to R (r-project) and do some data cleansing. Then find someone familiar with R and advanced statistics, who will be able to solve this very quickly. Or try it yourself, but R takes some time to learn at the beginning.

+1

Regarding the choice of tool / library, I suggest you try Weka. It is an open source tool for experimenting with data mining and machine learning. Weka has several tools for reading, processing and filtering your data, as well as forecasting and classification tools.

However, you need a solid foundation in the above fields in order to achieve a useful result.

+1
