List comprehension and logical indexing

Slowly moving from Matlab to Python ...

I have a list of this form:

list1 = [[1, 2, nan], [3, 7, 8], [1, 1, 1], [10, -1, nan]] 

and another list with the same number of items

 list2 = [1, 2, 3, 4] 

I am trying to extract the elements of list1 that do not contain any nan values, together with the corresponding elements of list2, i.e. the result should be:

    list1_clean = [[3, 7, 8], [1, 1, 1]]
    list2_clean = [2, 3]

In Matlab, this is easy to do with logical indexing.

I have a feeling that some form of list comprehension will do the trick, but I'm stuck at:

 list1_clean = [x for x in list1 if not any(isnan(x))] 

which is obviously useless for list2.

Alternatively, the following attempt at logical indexing does not work ("list indices must be integers, not list"):

    idx = [any(isnan(x)) for x in list1]
    list1_clean = list1[idx]
    list2_clean = list2[idx]

I am sure this is painfully trivial, but I can't figure it out. Any help is appreciated!

+8
python list matrix-indexing
3 answers

You can use zip.

zip pairs up the elements at the same index from each of the iterables passed to it.

    >>> from math import isnan
    >>> list1 = [[1, 2, 'nan'], [3, 7, 8], [1, 1, 1], [10, -1,'nan']]
    >>> list2 = [1, 2, 3, 4]
    >>> out = [(x,y) for x,y in zip(list1,list2) if not any(isnan(float(z)) for z in x)]
    >>> out
    [([3, 7, 8], 2), ([1, 1, 1], 3)]

Now unzip out to get the required output:

    >>> list1_clean, list2_clean = map(list, zip(*out))
    >>> list1_clean
    [[3, 7, 8], [1, 1, 1]]
    >>> list2_clean
    [2, 3]
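For readers on Python 3, the filter-and-unzip steps above can be wrapped in a single helper. This is a minimal sketch and not part of the original answer: the name drop_nan_rows is hypothetical, and it assumes the rows hold real numbers with NaN given as float('nan'). It also handles the case where every row is dropped, in which unpacking zip(*out) into two names would raise a ValueError.

```python
from math import isnan


def drop_nan_rows(rows, labels):
    """Keep each (row, label) pair whose row contains no NaN.

    Hypothetical helper for illustration; assumes numeric rows.
    """
    kept = [(row, label) for row, label in zip(rows, labels)
            if not any(isnan(value) for value in row)]
    if not kept:
        # zip(*[]) yields nothing, so unpacking it into two names would fail.
        return [], []
    clean_rows, clean_labels = map(list, zip(*kept))
    return clean_rows, clean_labels


nan = float("nan")
list1 = [[1, 2, nan], [3, 7, 8], [1, 1, 1], [10, -1, nan]]
list2 = [1, 2, 3, 4]

list1_clean, list2_clean = drop_nan_rows(list1, list2)
print(list1_clean)  # [[3, 7, 8], [1, 1, 1]]
print(list2_clean)  # [2, 3]
```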

Help on zip (Python 2):

    >>> print zip.__doc__
    zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]

    Return a list of tuples, where each tuple contains the i-th element
    from each of the argument sequences. The returned list is truncated
    in length to the length of the shortest argument sequence.

If you need a memory-efficient solution in Python 2, use itertools.izip, which returns an iterator instead of a list. In Python 3 the built-in zip already returns an iterator.
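As a rough illustration of the lazy approach, here is a sketch for Python 3, where the built-in zip is already an iterator. The generator below yields one surviving pair at a time instead of building the whole list of pairs first. The function name iter_clean_pairs is hypothetical, and the rows are assumed to be numeric.

```python
from math import isnan


def iter_clean_pairs(rows, labels):
    """Lazily yield (row, label) pairs whose row contains no NaN."""
    for row, label in zip(rows, labels):
        if not any(isnan(value) for value in row):
            yield row, label


nan = float("nan")
list1 = [[1, 2, nan], [3, 7, 8], [1, 1, 1], [10, -1, nan]]
list2 = [1, 2, 3, 4]

# Pairs are produced on demand, one per loop iteration.
for row, label in iter_clean_pairs(list1, list2):
    print(label, row)
# 2 [3, 7, 8]
# 3 [1, 1, 1]
```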

+6

You can simply do this (isnan here is numpy.isnan, which works element-wise on each row):

    ans = [(x,y) for x,y in zip(list1,list2) if all(~isnan(x))]
    #[(array([ 3., 7., 8.]), 2), (array([ 1., 1., 1.]), 3)]

From there you can extract each set of values:

    l1, l2 = zip(*ans)
    #l1 = (array([ 3., 7., 8.]), array([ 1., 1., 1.]))
    #l2 = (2,3)

In Python 2, consider using the izip function from itertools instead of zip; it returns an iterator, which can save a lot of memory depending on your problem. In Python 3 the built-in zip is already lazy.

Instead of ~ you can use numpy.logical_not(), which may be more readable.

Welcome to Python!

+2

This should work. We check whether a number is NaN using math.isnan.

We add an item to list1_clean and list2_clean only if none of the values in the corresponding sublist of list1 is NaN. To test this, we use the any function, which returns True if any element of the iterable is true.

    >>> list1 = [[1, 2, float('NaN')], [3, 7, 8], [1, 1, 1], [10, -1, float('NaN')]]
    >>> list2 = [1, 2, 3, 4]
    >>> from math import isnan
    >>> list1_clean = [elem for elem in list1 if not any([isnan(element) for element in elem])]
    >>> list1_clean
    [[3, 7, 8], [1, 1, 1]]
    >>> list2_clean = [list2[index] for index, elem in enumerate(list1) if not any([isnan(element) for element in elem])]
    >>> list2_clean
    [2, 3]

To make this more compact, test each sublist only once, and avoid zip, you can do:

    >>> cleanList = [(elem, list2[index]) for index, elem in enumerate(list1) if not any([isnan(element) for element in elem])]
    >>> cleanList
    [([3, 7, 8], 2), ([1, 1, 1], 3)]
    >>> list1_clean = [elem[0] for elem in cleanList]
    >>> list2_clean = [elem[1] for elem in cleanList]
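Since the question asks about Matlab-style logical indexing, the closest standard-library analogue is probably a boolean mask applied with itertools.compress. The sketch below is an addition, not this answerer's method: it computes the mask once and reuses it for both lists. The variable name keep is illustrative, and the rows are assumed to be numeric.

```python
from itertools import compress
from math import isnan

nan = float("nan")
list1 = [[1, 2, nan], [3, 7, 8], [1, 1, 1], [10, -1, nan]]
list2 = [1, 2, 3, 4]

# One boolean per row: True where the row contains no NaN.
keep = [not any(isnan(value) for value in row) for row in list1]

# compress(data, selectors) yields the items of data whose selector is true.
list1_clean = list(compress(list1, keep))
list2_clean = list(compress(list2, keep))

print(keep)         # [False, True, True, False]
print(list1_clean)  # [[3, 7, 8], [1, 1, 1]]
print(list2_clean)  # [2, 3]
```

The mask plays the role of the logical index in Matlab, so the same keep list can filter any number of parallel lists.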

any function →

    any(...)
        any(iterable) -> bool

        Return True if bool(x) is True for any x in the iterable.

isnan function →

    isnan(...)
        isnan(x) -> bool

        Check if float x is not a number (NaN).

0