Understanding this line: list_of_tuples = [(x, y) for x, y, label in data_one]

Question


As you can probably tell, I am a beginner, and I am trying to understand what the "Pythonic way" of writing this function is built on. I know that other threads may partly answer this question, but I do not know what to search for, because I do not understand what is going on here.

The line in the title is what my friend sent me to improve my code, which is:

    import numpy as np

    # load_data:
    def load_data():
        data_one = np.load('/Users/usr/... file_name.npy')
        list_of_tuples = []
        for x, y, label in data_one:
            list_of_tuples.append((x, y))
        return list_of_tuples

    print(load_data())

Improved version:

    import numpy as np

    # load_data:
    def load_data():
        data_one = np.load('/Users/usr.... file_name.npy')
        list_of_tuples = [(x, y) for x, y, label in data_one]
        return list_of_tuples

    print(load_data())

I wonder:

What is going on here?
Is this a better or a worse way? Since this is "Pythonic", I assume it will not work in other languages, so might it be better to get used to the more general way?

+8

python numpy list-comprehension

oba2311 Jul 6 '16 at 16:29


5 answers

Both methods are correct and work. You can probably relate the first approach to how things are done in C and other languages: you run a for loop over all the values and append each result to your list of tuples.

The second method is more Pythonic, but does the same thing. If you look at [(x,y) for x, y, label in data_one] (this is a list comprehension), you will see that it also runs a for loop over the same data, but each iteration produces (x, y), and all of these results form a list. So it achieves the same result.

A third method (added in response to a comment) uses slicing.

I have prepared a small example similar to yours:

    data = [(1, 2, 3), (2, 3, 4), (4, 5, 6)]

    def load_data():
        list_of_tuples = []
        for x, y, label in data:
            list_of_tuples.append((x, y))
        return list_of_tuples

    def load_data_2():
        return [(x, y) for x, y, label in data]

    def load_data_3():
        return [t[:2] for t in data]

They all do the same thing and return [(1, 2), (2, 3), (4, 5)], but their runtimes differ. This is why the list comprehension is the better way to do this.

When I run the first load_data() method, I get:

    %%timeit
    load_data()
    1000000 loops, best of 3: 1.36 µs per loop

When I run the second method load_data_2() , I get:

    %%timeit
    load_data_2()
    1000000 loops, best of 3: 969 ns per loop

When I run the third load_data_3() method, I get:

    %%timeit
    load_data_3()
    1000000 loops, best of 3: 981 ns per loop

The second way, the list comprehension, is the fastest!
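The %%timeit magic above only works inside IPython or Jupyter. As a rough sketch, the same comparison can be reproduced in a plain script with the standard-library timeit module. The number of repetitions below is an arbitrary choice for illustration, and absolute timings will vary by machine and Python version, so no expected numbers are shown.

```python
import timeit

data = [(1, 2, 3), (2, 3, 4), (4, 5, 6)]

def load_data():
    # Explicit loop with append
    list_of_tuples = []
    for x, y, label in data:
        list_of_tuples.append((x, y))
    return list_of_tuples

def load_data_2():
    # List comprehension with tuple unpacking
    return [(x, y) for x, y, label in data]

def load_data_3():
    # List comprehension with slicing
    return [t[:2] for t in data]

# Time each function; number=100000 is an arbitrary choice for this sketch.
timings = {
    fn.__name__: timeit.timeit(fn, number=100000)
    for fn in (load_data, load_data_2, load_data_3)
}

for name, seconds in timings.items():
    print("{}: {:.4f} s for 100000 calls".format(name, seconds))
```

With only three rows the differences are tiny, so it is worth repeating the measurement on data of a realistic size before drawing conclusions.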

+8

Xavier merino Jul 6 '16 at 16:47


The "improved" version uses a list comprehension. This makes the code declarative (describing what you want) rather than imperative (describing how to get what you want).

The advantage of declarative programming is that implementation details are mostly left out, so the underlying classes and data structures can perform the operations in an optimal way. For example, one optimization that the Python interpreter could make in the example above would be to pre-allocate the correct size for the list_of_tuples array, rather than constantly resizing the array during the append() operations.

To get started with list comprehensions, I'll explain how I usually start writing them. For a list L, write something like this:

 output = [x for x in L]

For each element in L, a variable is extracted (the x in the middle) and can be used to build the output list (the x on the left). The expression above effectively does nothing, and output is the same as L. Imperatively, it is akin to:

    output = []
    for x in L:
        output.append(x)

From here, you can recognize that each x is actually a tuple, which can be unpacked using tuple assignment:

 output = [x for x, y, label in L]

This will create a new list containing only the x element from each tuple in the list.

If you want to pack a different tuple into the output list, you just pack it on the left-hand side:

 output = [(x,y) for x, y, label in L]

This is basically what you do in your optimized version.

You can do other useful things with list comprehensions, for example inserting only the values that match a specific condition:

 output = [(x,y) for x, y, label in L if x > 10]
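To make the steps above concrete, here is a small sketch that runs each variant on sample data. The contents of L are made up for illustration; the original answer does not specify them.

```python
# Hypothetical sample data: (x, y, label) triples.
L = [(5, 1, "a"), (12, 2, "b"), (30, 3, "c")]

# Keep only the x element of each tuple.
xs = [x for x, y, label in L]
# xs == [5, 12, 30]

# Pack x and y into a new tuple.
pairs = [(x, y) for x, y, label in L]
# pairs == [(5, 1), (12, 2), (30, 3)]

# Keep only the tuples whose x is greater than 10.
big_pairs = [(x, y) for x, y, label in L if x > 10]
# big_pairs == [(12, 2), (30, 3)]

# The imperative equivalent of the filtered comprehension.
big_pairs_loop = []
for x, y, label in L:
    if x > 10:
        big_pairs_loop.append((x, y))
```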

Here is a useful tutorial on list comprehensions that you might find interesting: http://treyhunner.com/2015/12/python-list-comprehensions-now-in-color/

+5

Lee netherton Jul 6 '16 at 17:05


The behavior is essentially the same. In newer Python interpreters (Python 3), the scope of the variables in a list comprehension is narrower (x cannot be seen outside the comprehension).

    list_of_tuples = []
    for x, y, label in data_one:
        list_of_tuples.append((x, y))

    list_of_tuples = [(x, y) for x, y, label in data_one]
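The scoping difference mentioned above can be checked with a short sketch. It assumes Python 3 (in Python 2 the comprehension variable leaks into the enclosing scope), and the contents of data_one are a made-up stand-in for the real array.

```python
# Hypothetical stand-in for the data loaded in the question.
data_one = [(1, 2, "a"), (3, 4, "b")]

list_of_tuples = [(x, y) for x, y, label in data_one]

# In Python 3 the comprehension's variables live in their own scope,
# so x is not defined out here.
try:
    x
    leaked = True
except NameError:
    leaked = False

# A plain for loop, by contrast, leaves its variables bound afterwards.
for a, b, label in data_one:
    pass
last_row = (a, b, label)  # the values from the final iteration
```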

This kind of operation occurs often enough that Python's developers thought it deserved special syntax. There is a map(fn, iterable) function that does something similar, but I think the list comprehension is clearer.

Python's developers liked this syntax enough to extend it to generators, dictionaries, and sets. Comprehensions also allow nesting and conditional clauses.
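For comparison, here is a sketch of the same operation written with map. The helper first_two and the sample data are hypothetical names introduced only for this illustration.

```python
# Hypothetical stand-in for the data loaded in the question.
data_one = [(1, 2, "a"), (3, 4, "b"), (5, 6, "c")]

def first_two(row):
    # Hypothetical helper: unpack one row and keep only x and y.
    x, y, label = row
    return (x, y)

# In Python 3, map returns a lazy iterator, so list() is needed to get a list.
with_map = list(map(first_two, data_one))

with_comprehension = [(x, y) for x, y, label in data_one]
```

The map version needs a separate function (or a lambda) to do the unpacking, which is part of why the comprehension tends to read more directly here.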
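A brief sketch of those related forms, using made-up sample data in the same (x, y, label) shape as the question:

```python
# Hypothetical stand-in for the data loaded in the question.
data_one = [(1, 2, "a"), (3, 4, "b"), (5, 6, "a")]

# Generator expression: values are produced lazily, no intermediate list.
total_x = sum(x for x, y, label in data_one)

# Dictionary comprehension: map each x to its y.
y_by_x = {x: y for x, y, label in data_one}

# Set comprehension: the distinct labels.
labels = {label for x, y, label in data_one}
```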

Both forms use tuple unpacking: x, y, label in data_one.

What do both of these snippets do? data_one is apparently a list of tuples (or sublists) with 3 elements each. This code creates a new list of 2-element tuples, keeping 2 of the 3 elements. I find that easier to see in the list comprehension.

It would be wise to be familiar with both. Sometimes the operation is too complex to express as a comprehension.

Another feature of the comprehension is that it does not allow side effects (or at least makes them harder to include). In some cases this may be a drawback, but in general it makes the code clearer.

+4

hpaulj Jul 6 '16 at 16:53


This is called a list comprehension. It looks like a loop and can often perform the same task, but it generates a list from the results. The general format is [operation for variable in iterable]. For example,

[x**2 for x in range(4)] will result in [0, 1, 4, 9] .

They can also be more complex (such as yours above), using several functions, variables, and iterations in one list comprehension. For example,

[(x,y) for x in range(5) for y in range(10)] .
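As a sketch of how to read a comprehension with two for clauses: the clauses nest left to right, so the first for is the outer loop and the second is the inner loop.

```python
pairs = [(x, y) for x in range(5) for y in range(10)]

# Equivalent nested loops: the first "for" is the outer loop.
pairs_loop = []
for x in range(5):
    for y in range(10):
        pairs_loop.append((x, y))

# 5 values of x times 10 values of y gives 50 pairs,
# starting at (0, 0) and ending at (4, 9).
count = len(pairs)
```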

You can find more information here.

+3

Alex Rosenfeld Jul 6 '16 at 16:41


piRSquared · Accepted Answer · 2016-07-06T16:50:54+0000

 list_of_tuples = [(x,y) for x, y, label in data_one]

(x, y) is a tuple <- related tutorial.

This is a list comprehension:

      [(x,y) for x, y, label in data_one]
    # ^                                 ^
    # |       ^comprehension syntax^    |
    # begin list                        end list

data_one is an iterable, which a list comprehension requires. Under the covers, comprehensions are loops and have to iterate over something.

x, y, label in data_one tells me that I can "unpack" these three values from each element that the data_one iterable delivers. They act like the variables of a for loop, local to the comprehension, and change at every iteration.

All in all, it says:

Make a list of tuples that look like (x, y), where I get x, y, and label from each element delivered by the iterable data_one. Put each x and y in a tuple inside a list called list_of_tuples. Yes, I know that I "unpacked" label and never used it; I don't care.
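Two practical notes on the unpacking can be sketched as follows. The sample data is made up, and using "_" for the unused value is a common naming convention rather than something the original answer prescribes.

```python
# Hypothetical stand-in for the data loaded in the question.
data_one = [(1, 2, "a"), (3, 4, "b")]

# "_" is a common convention for a value that is unpacked but never used.
list_of_tuples = [(x, y) for x, y, _ in data_one]

# Unpacking requires every element to have exactly three items;
# a two-item element raises ValueError.
try:
    [(x, y) for x, y, label in [(1, 2)]]
    unpack_failed = False
except ValueError:
    unpack_failed = True
```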
