Why aren't collections handled uniformly in Python?

Sets and lists are handled differently in Python, and there seems to be no single way to work with them. For example, adding an element to a set is done with the add method, while for a list it is done with the append method. I know there are different semantics behind this, but there is also common semantics, and often an algorithm that works with a collection cares more about the common features than about the differences. The C++ STL shows that this can work, so why doesn't Python have such a concept?
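For example:

    # The same conceptual operation needs a different method name per container.
    s = set()
    s.add(1)        # sets use add
    l = []
    l.append(1)     # lists use append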

Edit: In C++, I can use an output iterator to store values in an (almost) arbitrary kind of collection, including lists and sets. I can write an algorithm that takes such an iterator as an argument and writes elements to it. The algorithm is then completely agnostic about the type of container (or other device, maybe a file) backing the iterator. If the backing container is a collection that ignores duplicates, that is the caller's decision. My concrete problem is that it has happened to me several times that I used, say, a list for a particular task and later decided that a set was more appropriate. Now I have to change append to add in several places in my code. I'm just wondering why Python has no concept for such cases.
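For illustration, here is a rough Python analogue of the output-iterator idea, using a made-up fill_squares algorithm: the collection's insertion method is passed in as a callable "sink", so the algorithm itself stays container-agnostic, but I still have to choose append or add at every call site:

    # Hypothetical sketch: the algorithm only calls the sink, so it does not care
    # whether the underlying container is a list, a set, or something else.
    def fill_squares(n, sink):
        for i in range(n):
            sink(i * i)

    squares_list = []
    fill_squares(5, squares_list.append)   # list as the "output iterator"

    squares_set = set()
    fill_squares(5, squares_set.add)       # set as the "output iterator"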

+6
python collections
3 answers

The direct answer: this is a design flaw.

You should be able to insert into any container where a generic insert makes sense (e.g. excluding dict) with the same method name. There should be a consistent, common name for insertion, e.g. add , matching set.add and list.append , so you can add to a container without worrying about which kind of container you are inserting into.

Using different names for this operation in different types is gratuitous inconsistency and sets a poor baseline: the library should encourage user-defined containers to expose a consistent API, not provide mostly incompatible APIs for each of the base containers.
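As a sketch of what a consistent insertion API could look like in user code, here is a hypothetical generic_add helper that dispatches on whichever method the container actually provides:

    def generic_add(container, value):
        # Prefer set-style add, fall back to list-style append.
        if hasattr(container, "add"):
            container.add(value)
        elif hasattr(container, "append"):
            container.append(value)
        else:
            raise TypeError("container supports neither add nor append")

    generic_add(set(), 1)
    generic_add([], 1)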

However, in practice this isn't often a problem: most of the time, when a function's result is a list of items, implement it as a generator instead. The caller can then build either of these collections (as far as the function is concerned), or consume the results in other iterable forms:

    def foo():
        yield 1
        yield 2
        yield 3

    # The caller decides what, if anything, to build from the generator.
    s = set(foo())
    l = list(foo())
    results1 = [i*2 for i in foo()]
    results2 = (i*2 for i in foo())
    for r in foo():
        print(r)
+6

add and append are different. Sets are unordered and contain unique elements, while append indicates that the element is always added, and that it is added specifically at the end.
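For example, the two operations behave differently when a duplicate is inserted:

    s = {1, 2}
    s.add(2)        # duplicate is silently ignored
    print(s)        # {1, 2}

    l = [1, 2]
    l.append(2)     # element is always appended at the end
    print(l)        # [1, 2, 2]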

Both sets and lists can be processed as iterables; that is their common semantics, and your algorithms can rely on it freely.

If you have an algorithm that depends on some kind of generic insertion, you simply cannot expect sets, tuples, lists, dicts and strings to behave the same.
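A quick sketch of the distinction, using a throwaway total function: iterating works uniformly across these types, while inserting does not:

    def total(items):
        # Depends only on iteration, so any iterable works: list, set, tuple, ...
        result = 0
        for i in items:
            result += i
        return result

    print(total([1, 2, 3]))   # 6
    print(total({1, 2, 3}))   # 6
    print(total((1, 2, 3)))   # 6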

+4

The actual reason probably has to do only with Python's history.

The built-in set type did not appear until Python 2.4, and it was based on the sets module, which itself wasn't in the standard library until Python 2.3. Obviously, changing the semantics of an established type can break a whole lot of existing code that relies on the original sets module, and language developers generally shy away from breaking existing code without bumping the major version number.

You can blame the original author of the module if you want, but keep in mind that user-defined types and built-in types lived in different universes prior to Python 2.2, which meant you could not directly extend the built-in types, and that probably made module authors feel fine about not supporting consistent collection semantics.

+1
