Hal Daumé wrote several basic machine learning algorithms during his Ph.D. (he is now an assistant professor and a rising star in the machine learning community).
His webpage has an SVM, a simple decision tree and logistic regression in OCaml. By reading this code, you can get a feel for how machine learning models are implemented in OCaml.
I would also like to mention F#, the new .NET language, which is similar to OCaml. Here is a factor graph model written in F# that analyzes chess game data. This research also has a NIPS publication.
FP is suitable for implementing machine learning and data mining models, but performance is not the main thing you get from it. It is true that FP supports parallel computing better than imperative languages such as C# or Java. But implementing a parallel SVM or decision tree has very little to do with the language! Parallel is parallel. The numerical optimization that machine learning and data mining usually require is typically written imperatively, so doing it in a purely functional style is usually difficult and less efficient. Making these sophisticated algorithms parallel is a very hard task at the algorithm level, not at the language level. If you want to run 100 SVMs in parallel, FP helps. But I do not see the difficulty of running 100 libsvm instances in parallel in C++, not to mention that single-threaded libsvm is more efficient than a not-well-tested Haskell SVM package.
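To make the "100 independent models" case concrete, here is a minimal OCaml sketch. It is only an illustration: `train` is a toy stand-in (a least-squares slope through the origin), not an SVM, and `train`, `datasets` and `models` are made-up names. The point is that each job is a pure function of its own data, so the whole batch is just a map.

```ocaml
(* Toy stand-in for a real trainer: least-squares slope through the origin.
   `train` is a hypothetical name, not part of any library. *)
let train (data : (float * float) list) : float =
  let sxy = List.fold_left (fun acc (x, y) -> acc +. x *. y) 0.0 data in
  let sxx = List.fold_left (fun acc (x, _) -> acc +. x *. x) 0.0 data in
  sxy /. sxx

(* Each inner list is one independent training set (made-up numbers). *)
let datasets = [
  [ (1.0, 2.0); (2.0, 4.0) ];
  [ (1.0, 3.0); (2.0, 6.0) ];
]

(* One model per data set. The jobs share no state, so the order of
   evaluation does not affect the results. *)
let models = List.map train datasets

let () = List.iter (fun w -> Printf.printf "%.1f\n" w) models
```

This version runs sequentially. Because `train` touches no shared state, `List.map` could in principle be swapped for a parallel map without changing the results. That is the part where FP helps, and it is also the easy part in any language.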
Then what do FP languages like F#, OCaml and Haskell give you?
Easy to test your code. FP languages usually have a top-level interpreter; you can test your functions on the fly.
Few mutable states. This means that passing the same parameter to a function always gives the same result, so debugging is easy in FP.
The code is concise. Type inference, pattern matching, closures, etc. You focus more on the domain logic and less on the language part. Therefore, when you write code, your mind is mainly on the programming logic itself.
Writing code in FP is fun.
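As a small illustration of the points above, here is a hedged OCaml sketch of prediction with a tiny decision tree. The `tree` type, `predict` and `example_tree` are hypothetical names made up for this example; they are not taken from Hal Daumé's code or any library.

```ocaml
(* A tiny decision tree over numeric features (illustrative types only). *)
type tree =
  | Leaf of bool                       (* predicted label *)
  | Node of int * float * tree * tree  (* feature index, threshold, left, right *)

(* Pure function: the result depends only on the arguments.
   No type annotations are needed; the compiler infers
   tree -> float array -> bool. *)
let rec predict tree features =
  match tree with
  | Leaf label -> label
  | Node (i, threshold, left, right) ->
    if features.(i) <= threshold then predict left features
    else predict right features

(* A hand-built tree with made-up thresholds. *)
let example_tree =
  Node (0, 2.5, Leaf false, Node (1, 1.0, Leaf false, Leaf true))

let () =
  Printf.printf "%b\n" (predict example_tree [| 3.0; 4.0 |])
```

You can paste the definitions into the top-level interpreter and call `predict` on a few inputs right away. The same tree and the same features always give the same label, which is what makes this kind of code easy to debug.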
Yin Zhu, Feb 22 '10 at 1:50