What is the difference between sorted(list) and list.sort()?
list.sort sorts the list in place and returns None. sorted takes any iterable and returns a new, sorted list.
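A minimal demonstration of that difference (the variable names are just for illustration):

```python
data = [3, 1, 2]

new = sorted(data)    # builds and returns a new sorted list
print(new)            # [1, 2, 3]
print(data)           # [3, 1, 2] -- the original is untouched

result = data.sort()  # sorts in place and returns None
print(result)         # None
print(data)           # [1, 2, 3] -- the original is now sorted
```

The None return is deliberate: it signals that the method mutates its receiver rather than producing a value.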
sorted is equivalent to this Python implementation, although the built-in CPython function should be noticeably faster since it is written in C:

def sorted(iterable, key=None):
    new_list = list(iterable)  # make a new list from the iterable
    new_list.sort(key=key)     # sort it in place
    return new_list            # return the new sorted list
When to use which?

- Use list.sort when you don't intend to keep the original order (that way you can reuse the list in place, in the same memory), and when you are the sole owner of the list (if the list is shared by other code and you mutate it, you can introduce bugs wherever that list is used).
- Use sorted when you want to preserve the original order, or when you want a new list that only your local code owns.
Is it possible to retrieve the list's original order after list.sort()?

No - unless you made a copy yourself, that information is lost, because the sort is done in place.
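If you might need the original order back, take a copy before sorting; a minimal sketch:

```python
lst = [3, 1, 2]
backup = lst[:]    # shallow copy taken *before* sorting
lst.sort()
print(lst)         # [1, 2, 3]

lst[:] = backup    # restore the original order in place
print(lst)         # [3, 1, 2]
```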
"Which is faster? And by how much?"
To illustrate the penalty of creating a new list, use the timeit module. Here is our setup:
import timeit

setup = """
import random
lists = [list(range(10000)) for _ in range(1000)]  # list of lists
for l in lists:
    random.shuffle(l)  # shuffle each list
shuffled_iter = iter(lists)  # wrap as iterator so next() yields one at a time
"""
And here are the results for lists of 10,000 randomly ordered integers. As we see, they refute the older myth about the cost of creating the new list:
Python 2.7
>>> timeit.repeat("next(shuffled_iter).sort()", setup=setup, number=1000)
[3.75168503401801, 3.7473005310166627, 3.753129180986434]
>>> timeit.repeat("sorted(next(shuffled_iter))", setup=setup, number=1000)
[3.702025591977872, 3.709248117986135, 3.71071034099441]
Python 3
>>> timeit.repeat("next(shuffled_iter).sort()", setup=setup, number=1000)
[2.797430992126465, 2.796825885772705, 2.7744789123535156]
>>> timeit.repeat("sorted(next(shuffled_iter))", setup=setup, number=1000)
[2.675589084625244, 2.8019039630889893, 2.849375009536743]
After some feedback, I decided another test with different characteristics would be desirable. Here I use the same randomly ordered list of 100,000 integers for each iteration, 10,000 times.
import timeit

setup = """
import random
random.seed(0)
lst = list(range(100000))
random.shuffle(lst)
"""
I interpret the difference for this larger sort as coming from the copy mentioned by Martijn, but it does not dominate to the degree stated in the older, more popular answer here; the increase in time is only about 10%:
>>> timeit.repeat("lst[:].sort()", setup=setup, number=10000)
[572.919036605, 573.1384446719999, 568.5923951]
>>> timeit.repeat("sorted(lst[:])", setup=setup, number=10000)
[647.0584738299999, 653.4040515829997, 657.9457361929999]
I also ran the much smaller sort from above and saw that the new sorted copy still takes about 2% longer for sorts of length 1,000.
Poke also ran his own code; here it is:
from timeit import timeit

setup = '''
import random
random.seed(12122353453462456)
lst = list(range({length}))
random.shuffle(lst)
lists = [lst[:] for _ in range({repeats})]
it = iter(lists)
'''
t1 = 'l = next(it); l.sort()'
t2 = 'l = next(it); sorted(l)'
length = 10 ** 7
repeats = 10 ** 2
print(length, repeats)
for t in t1, t2:
    print(t)
    print(timeit(t, setup=setup.format(length=length, repeats=repeats),
                 number=repeats))
He found a similar result for sorts of length 10,000,000 (run 100 times), but with only about a 5% increase in time. Here is the output:
10000000 100
l = next(it); l.sort()
610.5015971539542
l = next(it); sorted(l)
646.7786222379655
Conclusion:
For a very large list, the copy made by sorted will likely account for the differences, but the sort itself dominates the operation, and organizing your code around these differences would be premature optimization. I would use sorted when I need a new sorted list of the data, and list.sort when I need to sort a list in place, and let that determine my usage.
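One more point worth remembering from the definitions above: sorted accepts any iterable, not just lists, which list.sort can never do. A small sketch:

```python
print(sorted((3, 1, 2)))                  # tuple      -> [1, 2, 3]
print(sorted(x * x for x in (3, 1, 2)))   # generator  -> [1, 4, 9]
print(sorted({"b": 2, "a": 1}))           # dict keys  -> ['a', 'b']
print(sorted("bca"))                      # string     -> ['a', 'b', 'c']
```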