Imagine that there is a framework that provides a function called logutils.set_up(), which sets up logging according to some configuration.

Logging should be configured as early as possible since warnings issued during library import should not be lost.

Since the old way (if __name__=='__main__':) looks ugly, we use console_scripts entry points to register the main() function.

    # foo/daily_report.py
    from framework import logutils

    logutils.set_up()

    def main():
        ...

My problem is that logutils.set_up() can be called twice:

Imagine there is a second console script that calls logutils.set_up() and also imports daily_report.py.

I could change the framework code so that set_up() does nothing on the second call, but that seems awkward. I would like to avoid this.

How can I be sure that logutils.set_up() runs only once?

+6
7 answers

There are several ways to achieve the goal, each of which has its advantages and disadvantages.

(Some of them overlap with other answers. I do not mean to plagiarize, only to give an exhaustive answer.)


Approach 1: the function should do it

One way to ensure that a function runs only once is to make the function itself stateful, so that it “remembers” that it has already been called. This is more or less what is described by @eestrada and @qarma.

Regarding the implementation, I agree with @qarma that memoization is the easiest and most idiomatic way. There are some simple memoization decorators for Python on the web, and functools.lru_cache is part of the standard library. You can just use it like this:

    import functools

    @functools.lru_cache(maxsize=None)
    def set_up():  # this is your original set_up() function, now decorated
        <...same as before...>

The disadvantage here is that maintaining this state is arguably not the responsibility of set_up, which is just a function. One can argue that it should run twice if it is called twice, and that it is the caller's responsibility to call it only when needed (what if you really do want to run it twice?). The general argument is that a function, in order to be useful and reusable, should not make assumptions about the context in which it is called.

Is this argument correct in your case? You decide.

Another disadvantage is that this can be seen as an abuse of memoization. Memoization is a tool closely tied to functional programming and is meant for pure functions. Memoizing a function means "there is no need to run it again, because we already know the result", not "there is no need to run it again, because it has a side effect we want to avoid".
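A minimal sketch of how the memoized variant behaves, assuming a stand-in set_up() whose body merely records that it ran (the `calls` list and the body are illustrative, not part of the framework in the question):

```python
import functools

calls = []  # records every real execution of the setup body (illustrative)


@functools.lru_cache(maxsize=None)
def set_up():
    # Hypothetical stand-in for the real logutils.set_up().
    calls.append("configured")


set_up()
set_up()  # answered from the cache, so the body does not run again
print(len(calls))  # 1
```

If you ever do need to run the setup again, calling set_up.cache_clear() discards the cached result, which partly softens the "what if I want to run it twice" objection.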

Approach 2: the one you find ugly (if __name__=='__main__')

The most common pythonic way, which you already mention in your question, is the infamous if __name__=='__main__'.

This ensures that the function is called only once, because it is called only from a module named __main__ , and the interpreter ensures that there is only one such module in your process.

It works. There are no complications and no caveats. This is how main code (including setup code) is run in Python. It is considered pythonic simply because it is so damn common in Python (since there are no better ways).
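As a rough sketch of this approach, assuming a stand-in set_up() in place of the framework's logutils.set_up() (the function bodies here are illustrative only):

```python
import logging


def set_up():
    # Hypothetical stand-in for framework.logutils.set_up().
    logging.basicConfig(level=logging.INFO)


def main():
    logging.getLogger(__name__).info("daily report started")
    return 0


if __name__ == "__main__":
    # Runs only when this file is the program entry point,
    # never when another script imports it.
    set_up()
    main()
```

Note that a console_scripts entry point imports the module and calls main() directly, so in that case the guarded block is skipped and the set_up() call would have to live inside main() or in a dedicated launcher script.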

The only drawback is that it is arguably ugly (aesthetics-wise, not code-quality-wise). I admit that I, too, flinched the first few times I saw or wrote it, but it grows on you.

Approach 3: using the python module import mechanism

Python already has a caching mechanism that prevents modules from being imported twice. You can use this mechanism by running the setup code in a new module and then importing it. This is similar to @rll's answer. It's simple:

    # logging_setup.py
    from framework import logutils

    logutils.set_up()

Now every caller can run this by importing a new module:

    # foo/daily_report.py
    import logging_setup  # side effect!

    def main():
        ...

Since the module is imported only once, set_up is called only once.
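The caching can be observed directly. The sketch below writes a throwaway module (the name demo_logging_setup and the runs.log file are made up for the demonstration) whose body appends a line to a file each time it executes, then imports it twice:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module whose body appends one line to a log file
# every time the body is executed.
tmp_dir = Path(tempfile.mkdtemp())
(tmp_dir / "demo_logging_setup.py").write_text(
    "from pathlib import Path\n"
    "with open(Path(__file__).with_name('runs.log'), 'a') as fh:\n"
    "    fh.write('set_up\\n')\n"
)
sys.path.insert(0, str(tmp_dir))
importlib.invalidate_caches()
sys.modules.pop("demo_logging_setup", None)  # start from a clean slate

first = importlib.import_module("demo_logging_setup")
second = importlib.import_module("demo_logging_setup")  # found in sys.modules

run_count = len((tmp_dir / "runs.log").read_text().splitlines())
print(first is second)  # True
print(run_count)        # 1
```

The second import is answered from sys.modules, so the module body, and with it the setup call, is not executed again.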

The disadvantage here is that it violates the principle of “explicit is better than implicit”. That is, if you want to call a function, call it. Running code with side effects at module import time is bad practice.

Approach 4: Monkey Patch

This is by far the worst of the approaches in this answer. Do not use it. But this is still a way to get the job done.

The idea is that if you do not want the function to be called after the first call, you monkey-patch it (read: vandalize it) after the first call.

    from framework import logutils

    logutils.set_up_only_once()

Where set_up_only_once can be implemented as:

    def set_up_only_once():
        # run the actual setup (or nothing if already vandalized):
        set_up()
        # vandalize it so it never walks again; this function lives in
        # framework/logutils.py, so __name__ is that module's key in sys.modules:
        import sys
        sys.modules[__name__].set_up = lambda: None

Disadvantages: your colleagues will hate you.


TL;DR:

The easiest way is to memoize with functools.lru_cache, but it may not be the best solution in terms of code quality. It is up to you whether this solution is good enough in your case.

The safest and most pythonic way, although not easy on the eyes, is to use if __name__=='__main__': ...

+2

I did something similar in my PhD project. I initialize logging in the __init__.py of the module with a basic configuration (see here):

    import logging

    logging.getLogger('modulename').addHandler(logging.NullHandler())
    FORMAT = '%(name)s:%(levelname)s: %(message)s'
    logging.basicConfig(format=FORMAT)

Then, if a configuration file is provided, you override that configuration. As an example (you can find this in the constructor of EvoDevoWorkbench):

 logging.config.fileConfig(config.get('default','logconf'), disable_existing_loggers=False) 

where config.get('default','logconf') is the path to the logging configuration file. Then in any submodule you use the usual idiom:

 log = logging.getLogger(__name__) 

In your specific case, if you configure logging (or call set_up) inside the framework's __init__.py, then it will never be called twice. If you cannot do this, the only ways I can see are either to use the if __name__=='__main__': guard, or to make foo or daily_report a package so that you can put the set_up call into its __init__.py file. Then you can use it as described above.

You can find more detailed information in the documentation.

+1

You can use a singleton. A singleton class is instantiated only once; subsequent calls return the same object instead of creating a new one. This answer explains the different ways of creating a singleton class (all of them are simple).

I personally prefer the metaclass approach. First, you define the Singleton metaclass, as shown below:

    class Singleton(type):
        _instances = {}

        def __call__(cls, *args, **kwargs):
            if cls not in cls._instances:
                cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
            return cls._instances[cls]

then use it as a metaclass as follows:

    class MyClass(metaclass=Singleton):  # Python 2: put __metaclass__ = Singleton in the class body instead
        ...  # the rest of your class as normal

Now, the first time you call MyClass(), it will create an object for you. Subsequent calls will return the same object (sort of like a global class variable!).
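Applied to the logging question, a sketch might look like the following. It uses Python 3 metaclass syntax, and the LoggingSetup class with its init_count counter is a hypothetical wrapper invented here for illustration; the real setup call would go where the counter is incremented:

```python
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Create the instance on the first call only; reuse it afterwards.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class LoggingSetup(metaclass=Singleton):
    init_count = 0  # how many times __init__ actually ran (illustrative)

    def __init__(self):
        # Hypothetical place for the real logutils.set_up() call.
        LoggingSetup.init_count += 1


a = LoggingSetup()
b = LoggingSetup()
print(a is b)                   # True
print(LoggingSetup.init_count)  # 1
```

Because __init__ runs only when the instance is first created, any setup placed there is executed once no matter how many scripts "construct" the class.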

0

It is not difficult to change the set_up code; indeed, that is the only way to know for sure that initialization is performed only once. Here is some black magic for you if you do not want to use an if statement to get the job done:

    # framework/logutils.py

    def _set_up_internal():
        global set_up
        # NOTE: start setup
        return_val = None  # if there is a useful return value
        # NOTE: finish setup
        # clobber global reference with a dummy implementation
        set_up = lambda: return_val  # or return `None` if there is no useful return value
        return return_val  # so the first call returns the same value as later calls

    set_up = _set_up_internal

No explicit check is required, and it guarantees that the setup logic only ever runs once. This is not thread-safe, but I assume that is not a requirement (as it was not mentioned in the question).

0

For completeness, I will add a solution - there are many valid options here, but maybe this fills the gap.

This is a bit verbose, but I feel it is relatively simple and clean:

    # foo/daily_report.py
    from framework import logutils

    if not hasattr(logutils.set_up, "_initiated"):
        logutils.set_up()
        logutils.set_up._initiated = True

    def main():
        pass

This way you are not actively changing the function itself ... or not by much: you are adding an attribute that you then check. Instead of attaching this attribute to the function, you could store it somewhere else, or wrap the initialization entirely in a singleton-like class. But those solutions have already been proposed, if I am not mistaken.

These three lines have to be placed at every set_up call site.

The problem arises if any code calls set_up without setting this attribute (because of a bug, because of an external dependency, whatever). But if that is the case, you have no option other than to change that code or to put the check inside the function itself.

Note: I assume the framework's set_up function is pure Python. I assume this will not work for C extension functions or built-ins, but I have not tested that.

0

FWIW, I think having the logutils package protect itself against multiple calls to configure it is not awkward; it addresses the problem in the right place. Once you have done this, the logutils package is more robust.

Any other solution "outside" logutils is error-prone, because some call sites may be missed.

-1

Just slap @memoize on set_up and then it is only called once :)

-1
