Filtering a list based on a list of booleans

I have a list of values that I need to filter according to the values in a list of booleans:

 list_a = [1, 2, 4, 6]
 filter = [True, False, True, False]

I create a new filtered list with the following line:

 filtered_list = [i for indx,i in enumerate(list_a) if filter[indx] == True] 

which results in:

 print filtered_list
 [1,4]

The line works, but it looks (to me) a little redundant, and I was wondering if there is a simpler way to achieve the same thing.




Advice

A summary of two good tips given in the answers below:

1- Don't name a list filter as I did, because filter is a built-in function.

2- Don't compare things to True as I did with if filter[idx]==True.., since it is unnecessary. Just using if filter[idx] is enough.
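
To make the two tips concrete, here is a minimal sketch. The names mask, kept and evens are illustrative and not from the question. It also shows one caveat: a truthiness test and == True only agree when the flags are real booleans (or 0/1).

```python
# Illustrative names: `mask` replaces the question's `filter`,
# so the built-in filter() stays usable.
list_a = [1, 2, 4, 6]
mask = [True, False, True, False]

# Tip 2: test the flag directly instead of comparing it to True.
kept = [item for idx, item in enumerate(list_a) if mask[idx]]
print(kept)

# Tip 1: the built-in is still available because it was not shadowed.
evens = list(filter(lambda n: n % 2 == 0, list_a))
print(evens)

# Caveat: the two tests differ for non-boolean flags.
print(2 == True)  # False, because True compares equal to 1, not 2
print(bool(2))    # True, because any non-zero number is truthy
```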

+105
python list numpy
Sep 06 '13 at 20:12
4 answers

You are looking for itertools.compress:

 >>> from itertools import compress
 >>> list_a = [1, 2, 4, 6]
 >>> fil = [True, False, True, False]
 >>> list(compress(list_a, fil))
 [1, 4]
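
For readers wondering what compress does internally, it behaves roughly like the generator expression below. This is a sketch for illustration (compress_sketch is a made-up name), not the actual C implementation.

```python
from itertools import compress


def compress_sketch(data, selectors):
    """Rough pure-Python equivalent of itertools.compress."""
    # zip stops at the shorter input, and so does compress.
    return (item for item, keep in zip(data, selectors) if keep)


list_a = [1, 2, 4, 6]
fil = [True, False, True, False]

print(list(compress_sketch(list_a, fil)))
print(list(compress(list_a, fil)))

# Selectors only need to be truthy or falsy, not actual booleans.
print(list(compress("ABCDEF", [1, 0, 1, 0, 1, 1])))
```

Both calls on list_a give the same result, which is why the list comprehension with zip and compress are interchangeable here.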

Time Comparison (py3.x):

 >>> list_a = [1, 2, 4, 6]
 >>> fil = [True, False, True, False]
 >>> %timeit list(compress(list_a, fil))
 100000 loops, best of 3: 2.58 us per loop
 >>> %timeit [i for (i, v) in zip(list_a, fil) if v]  # winner
 100000 loops, best of 3: 1.98 us per loop

 >>> list_a = [1, 2, 4, 6]*100
 >>> fil = [True, False, True, False]*100
 >>> %timeit list(compress(list_a, fil))  # winner
 10000 loops, best of 3: 24.3 us per loop
 >>> %timeit [i for (i, v) in zip(list_a, fil) if v]
 10000 loops, best of 3: 82 us per loop

 >>> list_a = [1, 2, 4, 6]*10000
 >>> fil = [True, False, True, False]*10000
 >>> %timeit list(compress(list_a, fil))  # winner
 1000 loops, best of 3: 1.66 ms per loop
 >>> %timeit [i for (i, v) in zip(list_a, fil) if v]
 100 loops, best of 3: 7.65 ms per loop

Do not use filter as a variable name; it is a built-in function.
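
The timings above come from IPython's %timeit. A rough way to rerun the comparison as a plain script is sketched below with the standard timeit module. The helper name time_both and the repetition counts are my own choices, and absolute numbers will vary with machine and Python version.

```python
import timeit
from itertools import compress


def time_both(repeat_factor, number=100):
    """Average seconds per call for both approaches on repeated sample data."""
    data = [1, 2, 4, 6] * repeat_factor
    mask = [True, False, True, False] * repeat_factor

    # Sanity check: both approaches must agree before timing them.
    assert list(compress(data, mask)) == [i for i, v in zip(data, mask) if v]

    t_compress = timeit.timeit(
        lambda: list(compress(data, mask)), number=number)
    t_listcomp = timeit.timeit(
        lambda: [i for i, v in zip(data, mask) if v], number=number)
    return t_compress / number, t_listcomp / number


for factor in (1, 100, 10000):
    per_compress, per_listcomp = time_both(factor)
    print(f"x{factor}: compress {per_compress * 1e6:.2f} us, "
          f"list comprehension {per_listcomp * 1e6:.2f} us")
```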

+156
Sep 06 '13 at 20:13

With numpy:

 In [128]: list_a = np.array([1, 2, 4, 6])
 In [129]: filter = np.array([True, False, True, False])
 In [130]: list_a[filter]
 Out[130]: array([1, 4])

Alternatively, see Alex Szatmary's answer if list_a can be a numpy array but filter cannot.

Numpy usually gives you a big speed boost:

 In [133]: list_a = [1, 2, 4, 6]*10000
 In [134]: fil = [True, False, True, False]*10000
 In [135]: list_a_np = np.array(list_a)
 In [136]: fil_np = np.array(fil)
 In [139]: %timeit list(itertools.compress(list_a, fil))
 1000 loops, best of 3: 625 us per loop
 In [140]: %timeit list_a_np[fil_np]
 10000 loops, best of 3: 173 us per loop
+36
Sep 06 '13 at 21:08

Like this:

 filtered_list = [i for (i, v) in zip(list_a, filter) if v] 

Using zip is the "pythonic" way to iterate over multiple sequences in parallel without needing any indexing. Using itertools for such a simple case is a bit overkill ...

One thing you do in your example that you should really stop doing is comparing things to True; this is usually not necessary. Instead of if filter[idx]==True: ... you can simply write if filter[idx]: ...
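
One thing to keep in mind with zip is that it stops at the shorter sequence, so a mask that is too short silently drops the trailing values. A minimal sketch of a guarded version follows; the helper name filter_by_mask is hypothetical and the length check is my own addition, not part of the answer.

```python
def filter_by_mask(values, mask):
    """Keep values[i] where mask[i] is truthy (hypothetical helper)."""
    # zip silently stops at the shorter input, so check lengths first.
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    return [value for value, keep in zip(values, mask) if keep]


print(filter_by_mask([1, 2, 4, 6], [True, False, True, False]))

# Without the check, a short mask silently ignores the last two values:
print([v for v, keep in zip([1, 2, 4, 6], [True, True]) if keep])
```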

+33
Sep 06 '13 at 20:13

To do this using numpy, i.e. if you have an array a instead of list_a:

 a = np.array([1, 2, 4, 6])
 my_filter = np.array([True, False, True, False], dtype=bool)
 a[my_filter]
 > array([1, 4])
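
If the inputs start out as plain Python lists, one way to get to this form is sketched below. It assumes numpy is installed and imported as np; the variable names are illustrative. It also shows why the boolean dtype matters: an integer 0/1 array is interpreted as a list of positions rather than as a mask.

```python
import numpy as np

list_a = [1, 2, 4, 6]
fil = [True, False, True, False]

a = np.asarray(list_a)
mask = np.asarray(fil, dtype=bool)  # force a boolean dtype

filtered = a[mask]
print(filtered.tolist())

# Pitfall: an integer 0/1 array is treated as positions, not as a mask.
int_flags = np.array([1, 0, 1, 0])
print(a[int_flags].tolist())               # picks a[1], a[0], a[1], a[0]
print(a[int_flags.astype(bool)].tolist())  # boolean mask again
```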
+14
Sep 06 '13 at 21:05


