Why python logger's logs are sometimes duplicated and how can we fix it?

Sometimes, you may face an issue that logs of a logger are duplicated. Here is an example code:

import logging

# config the root logger to handle error logs.
logging.basicConfig(level=logging.ERROR)

# issue an error log
logging.error('issued by root logger')

# config a logger to handle verbose logs.
verbose_logger = logging.getLogger('verbose_logger')
verbose_logger.setLevel(logging.DEBUG)
verbose_logger.addHandler(logging.StreamHandler())

# issue a debug log
verbose_logger.debug('issued by verbose_logger')

the output is:

ERROR:root:issued by root logger
issued by verbose_logger
DEBUG:verbose_logger:issued by verbose_logger

In the code example, we only issued two logs, but three logs are printed!

Why?

The problem is that the log "issued by verbose_logger" is printed twice (last two lines of the output above).

According to the logging format, we can tell that the first one (issued by verbose_logger) is printed by verbose_logger and the second one (DEBUG:verbose_logger:issued by verbose_logger) is printed by the root logger.

This strange behavior is related to python logging's underlying mechanism.

Let's check the logging flow on python's official document:

logging flow

Look at the bottom left part of the logging flow. After processing a logger's own handlers, if the logger's propagate is true and it has a parent logger, then the parent logger's handlers will be processed.

Back to our example code, because verbose_logger's propagate propety is true (the default value), and verbose_logger's parent is the root logger, so both verbose_logger's handlers and the root logger's handlers will process the log record. This leads to the duplicated logs.

Notice the logging level

In the code example, root logger's level is ERROR, and verbose_logger's level is DEBUG. But the debug log record issued by verbose_logger is still processed by the root logger's handlers.

The reason is, though the root logger's level is ERROR, the root logger's handler's level is NOTSET!

NOTSET level is lower than all other levels including DEBUG. That means, a NOTSET level handler will handle all logs.

How to fix it

Option 1: DO NOT add handlers to non-root loggers

It's a good practice to add handlers only to root loggers. Here's an example:

import logging

# only add handler to root
logging.root.addHandler(logging.StreamHandler())
logging.root.setLevel(logging.ERROR)

# issue an error log
logging.error('issue by root logger')

# DO NOT add handlers to verbose_logger
verbose_logger = logging.getLogger('verbose_logger')
verbose_logger.setLevel(logging.DEBUG)

# issue a debug log
verbose_logger.debug('issue by verbose_logger')

The output is not duplicated:

issue by root logger
issue by verbose_logger

Option 2: Disable non-root logger's propagate property

Set verbose_logger's propagate property to False, so its parent logger's handlers will not work.

import logging

logging.basicConfig(level=logging.ERROR)
logging.error('issue by root logger')

verbose_logger = logging.getLogger('verbose_logger')
verbose_logger.propagate = False  # disable propagate property
verbose_logger.setLevel(logging.DEBUG)
verbose_logger.addHandler(logging.StreamHandler())

verbose_logger.debug('issue by verbose_logger')

The output is not duplicated:

ERROR:root:issue by root logger
issue by verbose_logger

Understand the logging flow

Now we know, when a logger's log is passed to its parent, the parent's level (and filters) is not checked. Or we can say, the log is passed directly to the parent logger's handlers rather than the parent logger itself.

Why do python's creators make this decision which is a little counter-intuitive?

In my opinion, there are multiple possible reasons.

Reason 1: Logger level and handler level have different roles

  • A logger's level only affect whether it can issue a log record.
  • A handler's level only affect whether it can handle a log record.

When a logger's record is propagated to its parent, the record is already issued, so the parent do not need to check whether the record should be issued cause it's already done by the child logger.

Reason 2: Protect the propagating chain

For example, we have loggers "foo.bar", "foo" and "root". The propagating chain looks like:

foo.bar
↓ (progate to parent)
foo
↓ (progate to parent)
root

Normally, the record is handled in the order of "foo.bar", "foo" and "root".

If logger foo (not its handlers) needs to check the record's level again, it may stop the propagting when foo logger's level is higher than foo.bar logger. Then, root logger will not even receive the record.

Posted on 2022-04-26