How to search a list of tuples in Python

So, I have a list of tuples, for example:

[(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")] 

I need to search this list for a tuple whose number is equal to something.

So, if I do search(53), it should return the index 2.

Is there an easy way to do this?

+75
python list search tuples
May 26 '10 at 10:45
8 answers
 [i for i, v in enumerate(L) if v[0] == 53] 
+74
May 26 '10 at 10:47 p.m.

You can use a list comprehension:

 >>> a = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]
 >>> [x[0] for x in a]
 [1, 22, 53, 44]
 >>> [x[0] for x in a].index(53)
 2
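
One caveat with the `.index` approach is that `list.index` raises `ValueError` when no tuple has the requested number. A minimal sketch of a wrapper follows; the helper name `search` is borrowed from the question's wording, and returning `None` for a miss is an assumption, not something the question specifies:

```python
def search(pairs, number):
    """Return the index of the first tuple whose first item equals number, or None."""
    keys = [pair[0] for pair in pairs]  # builds a full list of the first items
    try:
        return keys.index(number)
    except ValueError:  # list.index raises ValueError when the value is absent
        return None


a = [(1, "juca"), (22, "james"), (53, "xuxa"), (44, "delicia")]
print(search(a, 53))  # 2
print(search(a, 99))  # None
```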
+46
May 26 '10 at

TL;DR

A generator expression is probably the most efficient and easiest solution to your problem:

 l = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]
 result = next((i for i, v in enumerate(l) if v[0] == 53), None)
 # 2

Explanation

There are several answers that provide a simple solution to this question using list comprehensions. Although these answers are perfectly correct, they are not optimal. Depending on your use case, there can be significant benefits to making a few simple changes.

The main problem I see with using a list comprehension for this use case is that the whole list is processed, even though you only want to find one element.

Python provides a simple construct that is ideal here. It is called a generator expression. Here is an example:

 # Our input list, same as before
 l = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]
 # Call next on our generator expression.
 next((i for i, v in enumerate(l) if v[0] == 53), None)

We can expect this method to perform basically the same as a list comprehension in our trivial example, but what if we work with a larger dataset? That is where the advantage of the generator approach comes into play. Instead of constructing a new list, we use the existing list as our iterable and call next() to get the first item from our generator.
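
The difference in how much of the input each approach consumes can be made visible with a small sketch. The `counting` wrapper below is a hypothetical helper written only for this illustration; it records how many tuples were pulled from the list:

```python
def counting(iterable, counter):
    """Yield items from iterable, recording in counter[0] how many were consumed."""
    for item in iterable:
        counter[0] += 1
        yield item


l = [(1, "juca"), (22, "james"), (53, "xuxa"), (44, "delicia")]

# The list comprehension walks the whole input even after a match is found.
seen_lc = [0]
matches = [i for i, v in enumerate(counting(l, seen_lc)) if v[0] == 22]

# next() on a generator expression stops pulling items at the first match.
seen_gen = [0]
first = next((i for i, v in enumerate(counting(l, seen_gen)) if v[0] == 22), None)

print(matches, seen_lc[0])  # [1] 4
print(first, seen_gen[0])   # 1 2
```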

Let's see how these methods perform on some larger datasets. These are large lists of 10,000,000 + 1 elements, with our target at the beginning (best case) or at the end (worst case). We can verify that a list comprehension performs about equally on both lists:

List comprehensions

"Worst case"

 worst_case = ([(False, 'F')] * 10000000) + [(True, 'T')]
 print([i for i, v in enumerate(worst_case) if v[0] is True])
 # [10000000]
 #          2 function calls in 3.885 seconds
 #
 #    Ordered by: standard name
 #
 #    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 #         1    3.885    3.885    3.885    3.885 so_lc.py:1(<module>)
 #         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

"Best case"

 best_case = [(True, 'T')] + ([(False, 'F')] * 10000000)
 print([i for i, v in enumerate(best_case) if v[0] is True])
 # [0]
 #          2 function calls in 3.864 seconds
 #
 #    Ordered by: standard name
 #
 #    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 #         1    3.864    3.864    3.864    3.864 so_lc.py:1(<module>)
 #         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Generator Expressions

Here is my hypothesis for generators: they will perform significantly better in the best case, and about the same as list comprehensions in the worst case. The gain comes mainly from the generator being evaluated lazily, that is, it only computes what is required to yield a value.

Worst case

 # 10000000
 #          5 function calls in 1.733 seconds
 #
 #    Ordered by: standard name
 #
 #    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 #         2    1.455    0.727    1.455    0.727 so_lc.py:10(<genexpr>)
 #         1    0.278    0.278    1.733    1.733 so_lc.py:9(<module>)
 #         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
 #         1    0.000    0.000    1.455    1.455 {next}

Best case

 best_case = [(True, 'T')] + ([(False, 'F')] * 10000000)
 print(next((i for i, v in enumerate(best_case) if v[0] == True), None))
 # 0
 #          5 function calls in 0.316 seconds
 #
 #    Ordered by: standard name
 #
 #    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 #         1    0.316    0.316    0.316    0.316 so_lc.py:6(<module>)
 #         2    0.000    0.000    0.000    0.000 so_lc.py:7(<genexpr>)
 #         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
 #         1    0.000    0.000    0.000    0.000 {next}

WHAT?! The best case blows away the list comprehension, but I did not expect our worst case to outperform the list comprehension to such an extent. How so? Honestly, I could only speculate without further research. Maybe I'll ask a question about it.

Take all of this with a grain of salt: I haven't done any solid profiling here, just some very simple tests. I would have liked to include some memory profiling, but I am not familiar enough with the tools for that. However, what we see here is enough to conclude that a generator expression is more efficient for this type of list search.
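
For anyone who wants to repeat the comparison, here is a rough sketch using the standard-library `timeit` module instead of the profiler used above. The function names and the smaller list size are assumptions made for this illustration, and the timings will vary by machine and Python version, so no expected numbers are shown:

```python
import timeit

N = 1_000_000  # smaller than the 10,000,000 used above, to keep the run short
best_case = [(True, 'T')] + [(False, 'F')] * N
worst_case = [(False, 'F')] * N + [(True, 'T')]


def with_list_comprehension(data):
    """Collect every matching index; always scans the whole list."""
    return [i for i, v in enumerate(data) if v[0] is True]


def with_generator(data):
    """Return the first matching index, or None; stops at the first match."""
    return next((i for i, v in enumerate(data) if v[0] is True), None)


for name, data in (("best", best_case), ("worst", worst_case)):
    for func in (with_list_comprehension, with_generator):
        seconds = timeit.timeit(lambda: func(data), number=3)
        print(name, func.__name__, round(seconds, 3))
```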

Please note that this is all basic, built-in Python. We do not need to import anything or use special libraries.

And to give credit where credit is due, I first saw this technique in the free Udacity course CS212 with Peter Norvig.

+27
Jun 02 2018-12-12T00:

Your tuples are basically key-value pairs, which is what a Python dict holds, so:

 l = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]
 val = dict(l)[53]

Edit: aha, you say you want the index of (53, "xuxa"). If this is really what you want, you will have to iterate over the original list or perhaps build a more complicated dictionary:

 d = dict((n,i) for (i,n) in enumerate(e[0] for e in l))
 idx = d[53]
+25
May 26 '10 at 22:49

Hmm ... well, the easy way that comes to mind is to turn it into a dict

 d = dict(thelist) 

and access d[53] .

EDIT: Oops, I misread your question the first time. It looks like you actually want the index at which the given number is stored. In that case, try

 d = dict((t[0], i) for i, t in enumerate(thelist))

instead of the plain old dict conversion. Then d[53] will be 2.
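
One thing to watch with an index dictionary is repeated numbers: a plain conversion keeps the last index for each number. A minimal sketch of both behaviours, where the extra tuple (53, "again") is a made-up entry added only to create a duplicate:

```python
thelist = [(1, "juca"), (22, "james"), (53, "xuxa"), (44, "delicia"), (53, "again")]

# Later tuples overwrite earlier ones, so this maps 53 to its last index.
last_index = {t[0]: i for i, t in enumerate(thelist)}

# setdefault only stores a key the first time it is seen, so this keeps the first index.
first_index = {}
for i, t in enumerate(thelist):
    first_index.setdefault(t[0], i)

print(last_index[53])       # 4
print(first_index[53])      # 2
print(first_index.get(99))  # None, since .get avoids a KeyError for a missing number
```

Building the dictionary costs one pass over the list, so this pays off mainly when many lookups are made against the same list.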

+9
May 26 '10 at 10:47 a.m.

Assuming the list can be long and the numbers may repeat, consider using the SortedList type from the Python sortedcontainers module. The SortedList type automatically keeps the tuples in order by number and allows fast searching.

For example:

 from sortedcontainers import SortedList
 sl = SortedList([(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")])
 # Get the index of 53:
 index = sl.bisect((53,))
 # With the index, get the tuple:
 tup = sl[index]

Because it does a binary search, this will be much faster than the list comprehension suggestion. The dictionary suggestion will be faster still, but it will not work if there can be duplicate numbers with different strings.

If there are duplicate numbers with different strings, you need to take one more step:

 end = sl.bisect((53 + 1,))
 results = sl[index:end]

By bisecting for 54, we find the end index for our slice. This will be significantly faster on long lists than the accepted answer.
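
The same bisecting idea can be sketched with the standard-library `bisect` module on a plain sorted list, which is a substitute for sortedcontainers here and not the SortedList API itself. The tuple (53, "abc") is a made-up duplicate added for illustration. Note that the resulting indices refer to positions in the sorted copy, not in the original list:

```python
from bisect import bisect_left

pairs = [(1, "juca"), (22, "james"), (53, "xuxa"), (44, "delicia"), (53, "abc")]
ordered = sorted(pairs)  # binary search requires sorted input

# (53,) sorts before every tuple that starts with 53,
# and (54,) sorts after all of them, so the two bounds bracket the matches.
start = bisect_left(ordered, (53,))
end = bisect_left(ordered, (53 + 1,))

print(start, end)          # 3 5
print(ordered[start:end])  # [(53, 'abc'), (53, 'xuxa')]
```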

+5
Apr 10 '14 at

Another way.

 # list() is needed on Python 3, where zip returns an iterator
 list(zip(*a))[0].index(53)
+1
Jul 23 '13 at 19:55

 [k for k, v in l if v == 'delicia']

Here l is a list of tuples: [(1, "juca"), (22, "james"), (53, "xuxa"), (44, "delicia")]

Instead of converting it to a dict, we use a list comprehension.

It returns the key of each (key, value) pair in the list whose value is 'delicia'.

-1
Apr 24 '17 at 23:52