Matrices in Python

Yesterday I needed a matrix type in Python.

The obvious answer to this need would be to use numpy.matrix(), but my additional problem is that I would like the matrix to hold arbitrary values of mixed types, similar to a list. numpy.matrix does not do this. Example:

 >>> numpy.matrix([[1,2,3],[4,"5",6]])
 matrix([['1', '2', '3'],
         ['4', '5', '6']], dtype='|S4')
 >>> numpy.matrix([[1,2,3],[4,5,6]])
 matrix([[1, 2, 3],
         [4, 5, 6]])

As you can see, the contents of a numpy.matrix must be homogeneous. If a string value is present in my initialization, every value is implicitly stored as a string. This is confirmed by accessing single values.

 >>> numpy.matrix([[1,2,3],[4,"5",6]])[1,1]
 '5'
 >>> numpy.matrix([[1,2,3],[4,"5",6]])[1,2]
 '6'

The Python list type, on the other hand, accepts mixed types. You can have a list containing an integer and a string, each keeping its own type. What I need is something like a list, but working as a matrix.

So I had to implement my own type. I had two options for internal implementation: a list containing lists, and dictionaries. Both solutions have disadvantages:

  • A list of lists requires careful synchronization of the sizes of the inner lists. Swapping two rows is easy. Swapping two columns is less simple. Deleting a row is also easy.
  • A dictionary (with a tuple as the key) is slightly better, but you must enforce the bounds of your keys yourself (for example, you cannot insert an element at (5,5) if your matrix is 3x3), and it is more difficult to use when inserting, deleting, or swapping columns or rows.
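
As a rough sketch of the trade-off described in the list above (the helper names are made up for illustration and do not come from any library), swapping rows in a list of lists is a single assignment, while column operations have to touch every row. The dictionary variant needs its bounds checked by hand:

```python
# Minimal sketch: a "matrix" stored as a list of equally sized rows.
# All function names here are hypothetical helpers.

def swap_rows(m, i, j):
    """Swap rows i and j in place."""
    m[i], m[j] = m[j], m[i]

def swap_columns(m, i, j):
    """Swap columns i and j in place; every row must be visited."""
    for row in m:
        row[i], row[j] = row[j], row[i]

def delete_column(m, j):
    """Delete column j from every row so the row lengths stay in sync."""
    for row in m:
        del row[j]

matrix = [[1, 2, 3], [4, "5", 6]]
swap_rows(matrix, 0, 1)
swap_columns(matrix, 0, 2)
delete_column(matrix, 1)

# Dictionary alternative: each cell is keyed by (row, column). The dict
# itself does not enforce bounds, so they are checked explicitly.
n_rows, n_cols = 2, 2
cells = {(r, c): matrix[r][c] for r in range(n_rows) for c in range(n_cols)}

def set_cell(cells, r, c, value):
    """Store a value, rejecting keys outside the matrix dimensions."""
    if not (0 <= r < n_rows and 0 <= c < n_cols):
        raise IndexError("cell (%d, %d) is outside the matrix" % (r, c))
    cells[(r, c)] = value
```

With the dictionary, deleting or inserting a whole row would mean re-keying every cell after it, which is why row and column edits are harder there.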

Edit: clarification. The specific reason I need this functionality is that I am reading CSV files. Once I have collected the values from a CSV file (values that can be strings, integers, or floats), I would like to perform swap, delete, insert, and other operations. For this reason, I need a "matrix list".

My questions:

  • Do you know if there is a Python data type that provides this service (possibly in a library outside the "batteries included" standard library)?
  • Why is this data type not provided in the standard library? Too limited interest, maybe?
  • How would you solve this problem? A dictionary, list, or other smarter solution?
+4
6 answers

You can have heterogeneous types if your dtype is object:

 In [1]: m = numpy.matrix([[1, 2, 3], [4, '5', 6]], dtype=object)
 In [2]: m
 Out[2]:
 matrix([[1, 2, 3],
         [4, 5, 6]], dtype=object)
 In [3]: m[1, 1]
 Out[3]: '5'
 In [4]: m[1, 2]
 Out[4]: 6

I'm not sure what this is good for other than the fancy indexing because, as Don noted, you cannot do math with this matrix.
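
As a hedged illustration of what that indexing buys you, here is a small sketch using a plain object-dtype array (np.array rather than numpy.matrix, which is an assumption on my part) together with np.delete and np.insert for the swap, delete, and insert operations mentioned in the question:

```python
import numpy as np

# Object arrays keep each element's original Python type.
m = np.array([[1, 2, 3], [4, "5", 6]], dtype=object)

# Swap columns 0 and 2 with fancy indexing (returns a new array).
swapped = m[:, [2, 1, 0]]

# Delete row 0 (returns a new array; m is unchanged).
without_first_row = np.delete(m, 0, axis=0)

# Insert a new column before position 1, one value per row.
with_extra_column = np.insert(m, 1, ["a", "b"], axis=1)

print(swapped.tolist())
print(without_first_row.tolist())
print(with_extra_column.tolist())
```

Each of these calls copies the data, so this is convenient rather than fast for large tables.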

+10

I am curious why you want this functionality; as I understand it, the reason for having matrices (in numpy) is primarily to do linear algebra (matrix transformations and so on).

I'm not sure what the mathematical definition of the product of a decimal and a string would be.

For the internals, you probably want to take a look at sparse matrix implementations ( http://www.inf.ethz.ch/personal/arbenz/pycon03_contrib.pdf ). There are many ways to do this (hash, list, linked list), and each has its advantages and disadvantages. If your matrix is not going to have many nulls or zeros, you can rule out the sparse implementations.

+5

Have you looked at the features of numpy.recarray?

For example, here: http://docs.scipy.org/doc/numpy/reference/generated/numpy.recarray.html

It is designed to create arrays with mixed data types.

I do not know whether an array suits your purposes or whether you really need a matrix; I have not worked with numpy matrices. But if an array is good enough, recarray may work.
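
A minimal sketch of the idea, with field names and dtypes chosen purely for illustration. Note that a record array mixes types per column (each field has one dtype), not per individual cell:

```python
import numpy as np

# Field names ("qty", "price", "label") and their dtypes are assumptions
# made up for this example.
records = np.rec.array(
    [(1, 2.5, "abc"), (4, 5.0, "def")],
    dtype=[("qty", "i4"), ("price", "f8"), ("label", "U10")],
)

print(records.label)           # column access by attribute
print(records["price"].sum())  # numeric fields still support math
print(records[0])              # a single row as a record
```

This fits CSV data well when every column has a consistent type, but it would not hold an integer and a string in the same column.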

+3

Check out sympy: it handles polymorphism in its matrices well, and sympy.matrices.Matrix objects offer operations such as col_swap, col_insert, col_del, etc.

 In [2]: import sympy as s
 In [6]: import numpy as np

 In [11]: npM = np.array([[1, 2, 3.0], [4, 4, "abc"]], dtype=object)
 In [12]: npM
 Out[12]:
 [[1 2 3.0]
  [4 4 abc]]

 In [14]: type(npM[0][0])
 Out[14]: <type 'int'>
 In [15]: type(npM[0][2])
 Out[15]: <type 'float'>
 In [16]: type(npM[1][2])
 Out[16]: <type 'str'>


 In [17]: M = s.matrices.Matrix(npM)
 In [18]: M
 Out[18]:
 ⎡1 2 3.0⎤
 ⎢       ⎥
 ⎣4 4 abc⎦


 In [27]: type(M[0,2])
 Out[27]: 
 In [28]: type(M[1,2])
 Out[28]: 

 In [29]: sym = M[1,2]
 In [32]: print sym.name
 abc

 In [34]: sym.n
 Out[34]: 
 In [40]: sym.n(subs={'abc': 45})
 Out[40]: 45.0000000000000

+1

This may be a late answer, but why not use pandas?
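
For what it's worth, here is a short sketch of how the CSV use case from the question might look with a pandas DataFrame. The sample data and column names are invented for the example:

```python
import io

import pandas as pd

# Made-up CSV content standing in for a real file.
csv_text = "name,qty,price\nbolt,10,0.25\nnut,7,0.1\n"
df = pd.read_csv(io.StringIO(csv_text))

# Reorder columns (here: swap "name" and "price").
df = df[["price", "qty", "name"]]

# Delete the row with index label 0.
df = df.drop(index=0)

# Insert a new column at position 1, filled with a constant string.
df.insert(1, "unit", "pcs")

print(df)
```

As with a record array, each column has its own type, which usually matches how CSV data is laid out.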

+1

Have you considered the csv module for working with CSV files?

Python docs for csv module
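
A minimal sketch, assuming the goal is to end up with a list of lists that holds integers, floats, and strings. The convert helper is hypothetical (csv.reader itself always yields strings), and the sample data is made up:

```python
import csv
import io

def convert(text):
    """Try int, then float; fall back to the original string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

# Made-up CSV content standing in for a real file.
csv_text = "1,2,3.0\n4,4,abc\n"
rows = [
    [convert(cell) for cell in row]
    for row in csv.reader(io.StringIO(csv_text))
]

print(rows)
```

The result is a plain list of lists, so row and column operations still have to be written by hand.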

0
