I use NumPy daily and R nearly as often.
For heavy number crunching, I prefer NumPy to R by a large margin (including R packages such as "Matrix"). I find the syntax cleaner, the function set larger, and the computation faster (although I do not find R slow by any means). NumPy's broadcasting, for example, has no analog in R, as far as I know.
For example, reading in a data set from a CSV file and "normalizing" it for input to an ML algorithm (for example, mean-centering it and then rescaling each dimension) takes only a few lines, starting with the load:
import numpy as NP
data = NP.loadtxt(data1, delimiter=",")   # data1: path to the CSV file; 'data' is a NumPy array
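The centering and rescaling steps described above might look like the following sketch. The in-memory CSV is a made-up stand-in for a real file, and dividing each column by its largest absolute value is just one assumed choice of rescaling; dividing by the standard deviation would work the same way.

```python
import io
import numpy as NP

# Hypothetical stand-in for a CSV file: 3 rows, 2 measurements per row.
csv_text = io.StringIO("1.0,10.0\n2.0,20.0\n3.0,60.0\n")

data = NP.loadtxt(csv_text, delimiter=",")   # 'data' is a 2-D NumPy array

# Mean-center each column. Broadcasting stretches the 1-D array of
# column means across every row of 'data'.
data -= NP.mean(data, axis=0)

# Rescale each column by its largest absolute value (an assumed choice
# of scaling), so every entry ends up in [-1, 1].
data /= NP.max(NP.abs(data), axis=0)
```

Both in-place operations rely on broadcasting: a length-2 array is combined with a 3x2 array without any explicit loop.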
In addition, I find that when coding ML algorithms, I need data structures that I can operate on element-wise and that also understand linear algebra (for example, matrix multiplication, transpose, etc.). NumPy gets this and makes it easy to create these hybrid structures (no operator overloading or subclassing required).
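As a small illustration of that dual nature, here is a sketch with two made-up 2x2 arrays. The same ndarray objects support element-wise arithmetic and linear algebra side by side.

```python
import numpy as NP

A = NP.array([[1.0, 2.0],
              [3.0, 4.0]])
B = NP.array([[0.0, 1.0],
              [1.0, 0.0]])

elementwise = A * B       # multiplies matching entries
matrix_prod = A.dot(B)    # true matrix multiplication
transposed = A.T          # transpose, returned as a view of A
```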
You will not be disappointed with NumPy/SciPy; more likely you will be amazed.
So, a few recommendations, both in general and in particular given the facts in your question:
Install both NumPy and SciPy. As a rough guide, NumPy provides the core data structures (in particular the ndarray) and SciPy (which is actually several times larger than NumPy) provides the domain-specific functions (for example, statistics, signal processing, integration).
Install the repository versions, especially w/r/t NumPy, because the dev version is 2.0. Matplotlib and NumPy are tightly integrated; you can use one without the other, of course, but both are the best in their class among Python libraries. You can get all three with easy_install, which I assume you already have.
NumPy/SciPy have several modules specifically directed at Machine Learning/Statistics, including the Clustering package and the Statistics package.
Other packages are aimed at general computation but make coding ML algorithms much faster, in particular Optimization and Linear Algebra.
There are also the SciKits, which are not included in the base NumPy or SciPy libraries; you need to install them separately. Generally speaking, each SciKit is a set of convenience wrappers that streamline coding in a given domain. The SciKits you are likely to find most relevant are: ann (approximate nearest neighbor) and learn (a set of ML/statistics regression and classification algorithms, for example, Logistic Regression, Multi-Layer Perceptron, Support Vector Machine).
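To give a feel for the clustering, optimization, and linear algebra packages mentioned above, here is a minimal sketch. The specific functions (scipy.cluster.vq.kmeans2, scipy.optimize.minimize, scipy.linalg.solve) are my own picks for illustration, and the data is made up.

```python
import numpy as NP
from scipy import linalg, optimize
from scipy.cluster.vq import kmeans2

# Made-up data: two well-separated groups of 2-D points.
rng = NP.random.RandomState(0)
points = NP.vstack([rng.normal(0.0, 0.1, size=(20, 2)),
                    rng.normal(5.0, 0.1, size=(20, 2))])

# Clustering: k-means with two centroids, initialized from one point
# of each group so the result is deterministic.
centroids, labels = kmeans2(points, points[[0, -1]], minit="matrix")

# Optimization: minimize a simple quadratic whose minimum is at (3, 3).
result = optimize.minimize(lambda w: NP.sum((w - 3.0) ** 2), x0=NP.zeros(2))

# Linear algebra: solve the linear system M x = b.
M = NP.array([[2.0, 0.0],
              [0.0, 4.0]])
b = NP.array([2.0, 8.0])
x = linalg.solve(M, b)
```

All three calls take and return plain ndarrays, so the pieces compose without any conversion code.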
doug