I am writing software that analyzes a dataset of past events and should predict, for an ongoing process, both when the next event will happen and what that event will be.
Take a hiring process as an example, where each event occurs at a specific time. At t0 the applicant submits an application, at t1 the HR manager looks at the form and performs basic screening, at t2 the file is forwarded to the technical reviewer, and so on, until the applicant is hired or rejected.
I have a good dataset covering many applicants. Each record holds the event timestamps for one applicant: applicant, date of application, date the application was reviewed by HR, date the application was viewed by the technical reviewer, etc.
What I need is this: given a new applicant and the events recorded for them so far, I want to predict when the next event will happen.
I am evaluating several alternatives. Machine learning algorithms are appealing but may be overkill. Statistical methods such as extrapolation may be a better fit, but the process involves a human factor (human delay) that makes the timing hard to model. So I am not sure which direction to pursue or which libraries are appropriate.
Apache Commons Math seems like a good starting point for extrapolation.
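To make the question concrete, here is a minimal sketch of the simplest baseline I can think of: take the median delay between two consecutive stages over all past applicants and add it to the new applicant's latest event date. It uses only the JDK, not Apache Commons Math. The class name `NextEventBaseline`, its methods, and the sample dates are all hypothetical and made up for illustration. It also assumes every applicant passes through the same ordered stages, which may not hold for real data.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical baseline: predict the next event date from the median
// historical delay between the current stage and the following one.
final class NextEventBaseline {

    // Median of a non-empty list of day counts.
    static double median(List<Long> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("no samples for this stage");
        }
        List<Long> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int n = sorted.size();
        return n % 2 == 1
                ? sorted.get(n / 2)
                : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }

    // Median number of days between stage `stage` and stage `stage + 1`.
    // Each history row holds one applicant's event dates in stage order;
    // rows that have not reached stage + 1 are skipped.
    static double medianDelayDays(List<LocalDate[]> history, int stage) {
        List<Long> delays = new ArrayList<>();
        for (LocalDate[] row : history) {
            if (row.length > stage + 1 && row[stage] != null && row[stage + 1] != null) {
                delays.add(ChronoUnit.DAYS.between(row[stage], row[stage + 1]));
            }
        }
        return median(delays);
    }

    // Predicted date of the next event for an applicant whose known
    // event dates (in stage order) are given in `eventsSoFar`.
    static LocalDate predictNext(List<LocalDate[]> history, LocalDate[] eventsSoFar) {
        int lastStage = eventsSoFar.length - 1;
        long days = Math.round(medianDelayDays(history, lastStage));
        return eventsSoFar[lastStage].plusDays(days);
    }

    public static void main(String[] args) {
        // Made-up sample rows: applied, reviewed by HR, viewed by technical reviewer.
        List<LocalDate[]> history = List.of(
                new LocalDate[] {LocalDate.parse("2024-01-01"), LocalDate.parse("2024-01-04"), LocalDate.parse("2024-01-10")},
                new LocalDate[] {LocalDate.parse("2024-01-02"), LocalDate.parse("2024-01-07"), LocalDate.parse("2024-01-09")},
                new LocalDate[] {LocalDate.parse("2024-01-03"), LocalDate.parse("2024-01-07"), LocalDate.parse("2024-01-11")});

        // New applicant who has only submitted the application so far.
        LocalDate[] applicant = {LocalDate.parse("2024-02-01")};
        System.out.println(predictNext(history, applicant)); // prints 2024-02-05
    }
}
```

I picked the median rather than the mean because I expect a few very long human delays to skew the average, but that is a guess on my part. This also only covers the "when" part of the question, not the "what".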
Any ideas?