There are already many good answers, and indeed you have accepted one! But I thought I would add another option. Several people have pointed out that your code could be made more compact with generator expressions or list comprehensions. I am going to propose a hybrid style that uses generator expressions for the initial filtering, while keeping the for loop for the final filter.
The advantage of this style over the style of your original code is that it simplifies the flow of control by eliminating the continue statements. The advantage of this style over a single list comprehension is that it avoids multiple accesses to task.project.identifier in a natural way. It also handles the mutable state (the seen set) transparently, which, in my opinion, is important.
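
    def get_projects_of_tasks(task_list):
        # Lazily pull the project out of each task.
        projects = (task.project for task in task_list)
        # Pair each project with its identifier, skipping tasks without a project.
        ids_projects = ((p.identifier, p) for p in projects if p is not None)
        seen = set()
        unique_projects = []
        for id, p in ids_projects:
            # Keep only the first project seen for each identifier.
            if id not in seen:
                seen.add(id)
                unique_projects.append(p)
        return unique_projects
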
Since these are generator expressions (enclosed in parentheses instead of square brackets), they do not build temporary lists. The first generator expression creates an iterable of projects; you might think of it as applying the line project = task.project from your original code to every task. The second generator expression creates an iterable of (project_id, project) tuples. The if clause at the end filters out None values; (p.identifier, p) is evaluated only if p passes the filter. Together, these two generator expressions eliminate your first two if blocks. The remaining code is essentially the same as yours.
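To make the laziness concrete, here is a small sketch. The Project and Task classes below are hypothetical stand-ins, not the classes from the question; the only assumptions are that a task has a project attribute that may be None, and that a project has an identifier. The counter on Project exists purely to show when identifier is read.

```python
class Project:
    """Hypothetical project that counts every read of .identifier."""
    reads = 0

    def __init__(self, ident):
        self._ident = ident

    @property
    def identifier(self):
        Project.reads += 1
        return self._ident


class Task:
    """Hypothetical task; .project may be None."""
    def __init__(self, project):
        self.project = project


tasks = [Task(Project(1)), Task(None), Task(Project(2)), Task(Project(1))]

projects = (task.project for task in tasks)
ids_projects = ((p.identifier, p) for p in projects if p is not None)

# Creating the generator expressions has not read any identifier yet.
reads_before = Project.reads                  # 0

# Pulling one item touches only the first project.
first_id, first_project = next(ids_projects)
reads_after_one = Project.reads               # 1

# Draining the rest skips the None project without raising AttributeError.
remaining_ids = [i for i, _ in ids_projects]  # [2, 1]
total_reads = Project.reads                   # 3

print(reads_before, reads_after_one, remaining_ids, total_reads)
```

Each identifier is read exactly once per non-None project, and nothing is computed until the for loop (or next) asks for it.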
Also note the great suggestion from Marcin / delnan that you write this as a generator function using yield. This cuts the boilerplate further and reduces your code to its essentials:

    def get_projects_of_tasks(task_list):
        projects = (task.project for task in task_list)
        ids_projects = ((p.identifier, p) for p in projects if p is not None)
        seen = set()
        for id, p in ids_projects:
            if id not in seen:
                seen.add(id)
                yield p

The only drawback, in case this is not obvious, is that a generator can be iterated only once, so if you want to keep the projects around you must pass the result to list:

    projects_of_tasks = list(get_projects_of_tasks(task_list))
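
A quick sketch of why the list call matters. The Project and Task namedtuples are hypothetical stand-ins for the real classes, assumed only to expose identifier and project respectively.

```python
from collections import namedtuple

# Hypothetical stand-ins; only .project and .identifier are assumed.
Project = namedtuple("Project", ["identifier"])
Task = namedtuple("Task", ["project"])


def get_projects_of_tasks(task_list):
    projects = (task.project for task in task_list)
    ids_projects = ((p.identifier, p) for p in projects if p is not None)
    seen = set()
    for id, p in ids_projects:
        if id not in seen:
            seen.add(id)
            yield p


task_list = [Task(Project(1)), Task(None), Task(Project(1)), Task(Project(2))]

# A generator is exhausted after one pass.
gen = get_projects_of_tasks(task_list)
first_pass = [p.identifier for p in gen]    # [1, 2]
second_pass = [p.identifier for p in gen]   # []

# A list can be iterated, indexed, and measured as often as needed.
projects_of_tasks = list(get_projects_of_tasks(task_list))
stored_ids = [p.identifier for p in projects_of_tasks]   # [1, 2]
stored_again = [p.identifier for p in projects_of_tasks] # [1, 2]

print(first_pass, second_pass, stored_ids, len(projects_of_tasks))
```

If the caller only loops over the result once, the generator can be used directly and the list call is unnecessary.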