changeset 8446:14c7c07b32d8
feature: add thread-local trace_id and trace_reason to logging.
Added trace_id to default logging so that all logs for a given request
share the same trace_id.
This allows correlation of logs across a request.
admin_guide.txt, upgrading.txt:
add docs
update sample configs to include trace_id.
rewrite logging docs in admin_guide. Hopefully they are clearer now.
clean up some stuff in the logging config file docs.
admin.py:
add decorators to run_command to enable trace_id.
change calls to db.commit() to use run_command to get trace_id.
configuration.py:
clean up imports.
update docstrings, comments and inline docs.
add trace_id to default log format.
add function for testing decorated with trace_id.
add support for dumping stack trace in logging.
add check for pytest in sys.modules to enable log propagation when
pytest is running. Otherwise tests fail as the caplog logger doesn't
see the roundup logs.
logcontext.py:
new file to handle thread-local contextvar management.
mailgw.py:
add decorators for trace_id etc.
scripts/roundup_xmlrpc_server.py:
add decorators for trace_id etc.
fix encoding bug turning bytes into a string.
fix command line issue where we can't set encoding. (not sure if
changing encoding via command line even works)
cgi/client.py
decorate two entry points for trace_id etc.
cgi/wsgi_handler.py:
decorate entry point for trace_id etc.
test/test_config.py:
add test for trace_id in new log format.
test various cases for sinfo and errors in formatting msg.
| author | John Rouillard <rouilj@ieee.org> |
|---|---|
| date | Tue, 16 Sep 2025 22:53:00 -0400 |
| parents | 4b6e9a0e13ee |
| children | d06be9346c68 |
| files | CHANGES.txt doc/admin_guide.txt doc/upgrading.txt roundup/admin.py roundup/cgi/client.py roundup/cgi/wsgi_handler.py roundup/configuration.py roundup/logcontext.py roundup/mailgw.py roundup/scripts/roundup_xmlrpc_server.py test/test_config.py |
| diffstat | 11 files changed, 895 insertions(+), 113 deletions(-) [+] |
--- a/CHANGES.txt Mon Sep 08 09:08:14 2025 -0400 +++ b/CHANGES.txt Tue Sep 16 22:53:00 2025 -0400 @@ -44,6 +44,11 @@ or put them in the input buffer for editing. (John Rouillard) - add format to logging section in config.ini. Used to set default logging format. (John Rouillard) +- the default logging format template includes an identifier unique + for a request. This identifier (trace_id) can be used to identify + logs for a specific transaction. Logging also supports a + trace_reason log token with the url for a web request. The logging + format can be changed in config.ini. (John Rouillard) 2025-07-13 2.5.0
--- a/doc/admin_guide.txt Mon Sep 08 09:08:14 2025 -0400 +++ b/doc/admin_guide.txt Tue Sep 16 22:53:00 2025 -0400 @@ -46,98 +46,284 @@ both choosing your "tracker home" and the ``main`` -> ``database`` variable in the tracker's config.ini. +.. _rounduplogging: Configuring Roundup Message Logging =================================== You can control how Roundup logs messages using your tracker's -config.ini file. Roundup uses the standard Python (2.3+) logging -implementation. The config file and ``roundup-server`` provide very -basic control over logging. - -Configuration for "BasicLogging" implementation: - - tracker configuration file specifies the location of a log file - ``logging`` -> ``filename`` - - tracker configuration file specifies the level to log to as - ``logging`` -> ``level`` - - tracker configuration file lets you disable other loggers - (e.g. when running under a wsgi framework) with - ``logging`` -> ``disable_loggers``. - - tracker configuration file can set the log format using - ``logging`` -> ``format``. See :ref:`logFormat` for more info. - - ``roundup-server`` specifies the location of a log file on the command - line - - ``roundup-server`` enable using the standard python logger with - the tag/channel ``roundup.http`` on the command line - -By supplying a standard log config file in ini or json (dictionary) -format, you get more control over the logs. You can set different -levels for logs (e.g. roundup.hyperdb can be set to WARNING while -other Roundup log channels are set to INFO and roundup.mailgw logs at -DEBUG level). You can also send the logs for roundup.mailgw to syslog, -and other roundup logs go to an automatically rotating log file, or -are submitted to your log aggregator over https. - -Configuration for standard "logging" module: - - tracker configuration file specifies the location of a logging - configuration file as ``logging`` -> ``config``. 
- -In both cases, if no logfile is specified then logging will simply be sent -to sys.stderr with only logging of ERROR messages. +config.ini file. Roundup uses the standard Python logging +implementation. The config file and ``roundup-server`` provide +very basic control over logging. + +``roundup-server``'s logging is controlled from the command +line. You can: + +- specify the location of a log file or +- enable logging using the standard Python logging library under + the tag/channel ``roundup.http`` + +Configuration for the "BasicLogging" implementation for your tracker +is done using the settings in the tracker's ``config.ini`` under +the ``logging`` section: + +- ``filename`` setting: specifies the location of a log file +- ``level`` setting: specifies the minimum level to log +- ``disable_loggers`` setting: disable other loggers (e.g. when + running under a wsgi framework) +- ``format`` setting: set the log format template. See + :ref:`logFormat` for more info. + +In either case, if logfile is not specified, logging is sent to +sys.stderr. If level is not set, only ERROR or higher priority +log messages will be reported. + +You can get more control over logging by using the ``config`` +setting in the tracker's ``config.ini`` to supply a log config +file in ini or json (dictionary) format. Using a logging +config file overrides all other logging settings in +``config.ini``. + +Using this, you can set different levels by channel. For example +roundup.hyperdb can be set to WARNING while other Roundup log +channels are set to INFO and the roundup.mailgw channel logs at +the DEBUG level. You can also control the distribution of +logs. For example roundup.mailgw logs to syslog, other channels +log to an automatically rotating log file, or are submitted to +your log aggregator over https. .. _logFormat: Defining the Log Format ----------------------- -Starting with Roundup 2.6 you can specify the logging format.
In the -``logging`` -> ``format`` setting of config.ini you can use any of the -`standard logging LogRecord attributes -<https://docs.python.org/3/library/logging.html#logrecord-attributes>`_. -However you must double any ``%`` format markers. The default value -is:: - - %%(asctime)s %%(levelname)s %%(message)s - -Standard Logging Setup +Starting with Roundup 2.6 you can specify the logging format in +config.ini. The ``logging`` -> ``format`` setting of config.ini +supports all of the `standard logging LogRecord attributes +<https://docs.python.org/3/library/logging.html#logrecord-attributes>`_ +or Roundup logging attributes. However you must double any ``%`` +format markers. The default value is:: + + %%(asctime)s %%(trace_id)s %%(levelname)s %%(message)s + +Roundup Logging Attributes +-------------------------- + +The `logging package has a number of attributes +<https://docs.python.org/3/library/logging.html#logrecord-attributes>`_ +that can be expanded in the format template. In addition to the +ones supplied by Python's logging module, Roundup defines +additional attributes: + + trace_id + a unique string that is generated for each request. It is + unique per thread. + + trace_reason + a string describing the reason for the trace/request. + + * the URL for a web triggered (http, rest, xmlrpc) request + * the email message id for an email triggered request + * the roundup-admin os user and start of command. Only the first + two words in the command are printed, so setting a password + is not leaked to the logs. + + sinfo + the stack traceback information at the time the log call is + made. + + This must be intentionally activated by using the ``extra`` + parameter. For example calling:: + + logging.getLogger('roundup.something').warning( + "I am here\n%(sinfo)s", extra={"sinfo": 2}) + + in the function confirmid() of the file detectors/reauth.py + in your demo tracker will print 2 items on the stack + including the log call.
It results in the following (5 lines + total in the log file):: + + 2025-09-14 23:07:58,668 Cm0ZPlBaklLZ3Mm6hAAgoC WARNING I am here + File "[...]/roundup/hyperdb.py", line 1924, in fireAuditors + audit(self.db, self, nodeid, newvalues) + File "demo/detectors/reauth.py", line 7, in confirmid + logging.getLogger('roundup.something').warning( + + Note that the output does not include the arguments to + ``warning`` because they are on the following line. If you + want arguments to the log call included, they have to be on + the same line. + + Setting ``sinfo`` to an integer value N includes N lines up + the stack ending with the logging call. Setting it to 0 + includes all the lines in the stack ending with the logging + call. + + If the value is less than 0, the stack dump doesn't end at + the logging call but continues to the function that + generates the stack report. So it includes functions inside + the logging module. + + Setting it to a number larger than the stack trace will + print the trace down to the log call. So using ``-1000`` + will print up to 1000 stack frames and start at the function + that generates the stack report. + + Setting ``sinfo`` to a non-integer value ``{"sinfo": None}`` + will produce 5 lines of the stack trace ending at the + logging call. + + pct_char + produces a single ``%`` sign in the log. The usual way of + embedding a percent sign in a formatted string is to double + it like: ``%%``. However when the format string is specified + in the config.ini file percent signs are manipulated. So + ``%%(pct_char)s`` can be used in config.ini to print a + percent sign. + +The default logging template is defined in config.ini in the +``logging`` -> ``format`` setting. It includes the ``trace_id``. +When searching logs, you can use the trace_id to see all the log +messages associated with a request. + +If you want to log from a detector, extension or other code, you +can use these tokens in your messages when calling the logging +functions. 
(Note that doubling ``%`` signs is only required when +defining the log format in a config file, not when defining a +msg.) For example:: + + logging.getLogger('roundup.myextension').error('problem with ' + '%(trace_reason)s') + +will include the url in the message when triggered from the +web. This also works with other log methods: ``warning()``, +``debug()``, .... + +Note you must **not** use positional arguments in your +message. Using:: + + logging.getLogger('roundup.myextension').error( + '%s problem with %(trace_reason)s', "a") + +will not properly substitute the argument. You must use mapping +key based arguments and define the local values as part of the +extra dictionary. For example:: + + logging.getLogger('roundup.myextension').error('%(article)s ' + 'problem with %(trace_reason)s', + extra={"article": some_local_variable}) + +Also if you are logging any data supplied by a user, you must not +log it directly. If the variable ``url`` contains the url typed in +by the user, never use:: + + logger.info(url) + +or:: + + logger.info("Url is %s" % url) + +Use:: + + logger.info("Url is %s", url) + +or:: + + logger.info("Url is %(url)s", extra={"url": url}) + +This prevents printf style tokens in ``url`` from being processed +where it can raise an exception. This could be used to prevent +the log message from being generated. + +More on trace_id +~~~~~~~~~~~~~~~~ + +The trace_id provides a unique token (a UUID4 encoded to make it +shorter) for each transaction in the database. It is unique to +each thread or transaction. A transaction: + + for the web interface is + each web, rest or xmlrpc request + + for the email interface is + each email request. Using pipe mode will generate one + transaction. Using pop/imap etc can generate multiple + transactions, one for each email.
Logging that occurs prior + to processing an email transaction has the default + ``not_set`` value for trace_id + + for the roundup-admin interface is + each command in the interactive interface or on the command + line. Plus one transaction when/if a commit happens on + roundup-admin exit. + +When creating scripts written using the roundup package the entry +point should use the ``@gen_trace_id`` decorator. For example to +decorate the entry point that performs one transaction:: + + from roundup.logcontext import gen_trace_id + + # stuff ... + + @gen_trace_id() + def main(...): + ... + +If your script does multiple processing operations, decorate the entry +point for the processing operation:: + + from roundup.logcontext import gen_trace_id + + @gen_trace_id() + def process_one(thing): + ... + + def main(): + for thing in things: + process_one(thing) + + +Advanced Logging Setup ---------------------- If the settings in config.ini are not sufficient for your logging -requirements, you can specify a full logging configuration in one of -two formats: - +requirements, you can specify a full logging configuration in one +of two formats: + + * `dictConfig format + <https://docs.python.org/3/library/logging.config.html#logging.config.dictConfig>`_ + using json with comment support * `fileConfig format <https://docs.python.org/3/library/logging.config.html#logging.config.fileConfig>`_ in ini style - * `dictConfig format - <https://docs.python.org/3/library/logging.config.html#logging.config.dictConfig>`_ - using json with comment support - -The dictConfig allows more control over configuration including -loading your own log handlers and disabling existing handlers. If you -use the fileConfig format, the ``logging`` -> ``disable_loggers`` flag -in the tracker's config is used to enable/disable pre-existing loggers -as there is no way to do this in the logging config file. 
+ +The dictConfig format allows more control over configuration +including loading your own log handlers and disabling existing +handlers. If you use the fileConfig format, the ``logging`` -> +``disable_loggers`` flag in the tracker's config is used to +enable/disable pre-existing loggers as there is no way to do this +in the logging config file. .. _`dictLogConfig`: dictConfig Based Logging Config ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -dictConfigs are specified in JSON format with support for comments. -The file name in the tracker's config for the ``logging`` -> ``config`` -setting must end with ``.json`` to choose the correct processing. +dictConfigs are specified in JSON format with support for +comments. The file name in the tracker's config for the +``logging`` -> ``config`` setting must end with ``.json`` to +choose the correct processing. Comments have to be in one of two forms based on javascript line comments: -1. A ``//`` possibly indented with whitespace on a line is considereda - a comment and is stripped from the file before being passed to the - json parser. This is a "line comment". - -2. A ``//`` with at least three white space characters before it is - stripped from the end of the line before begin passed to the json - parser. This is an "inline comment". +1. A ``//`` possibly indented with whitespace on a line is + considered a comment and is stripped from the file before + being passed to the json parser. This is a "line comment". + +2. A ``//`` with at least three white space characters before it + is stripped from the end of the line before being passed to + the json parser. This is an "inline comment". Block style comments are not supported. @@ -151,20 +337,23 @@ ................................. Note that this file is not actually JSON format as it includes -comments. However by using javascript style comments, some tools that -expect JSON (editors, linters, formatters) might work with it.
A -command like ``sed -e 's#^\s*//.*##' -e 's#\s*\s\s\s//.*##' -logging.json`` can be used to strip comments for programs that need -it. +comments. However by using javascript style comments, some tools +that treat JSON like javascript (editors, linters, formatters) +might work with it. A command like:: + + sed -e 's#^\s*//.*##' -e 's#\s*\s\s\s//.*##' logging.json + +can be used to strip comments for programs that need it. The config below works with the `Waitress wsgi server <https://github.com/Pylons/waitress>`_ configured to use the -roundup.wsgi channel. It also controls the `TransLogger middleware -<https://github.com/pasteorg/paste>`_ configured to use -roundup.wsgi.translogger, to produce httpd style combined logs. The -log file is specified relative to the current working directory not -the tracker home. The tracker home is the subdirectory demo under the -current working directory. The commented config is:: +roundup.wsgi channel. It also controls the `TransLogger +middleware <https://github.com/pasteorg/paste>`_ configured to +use roundup.wsgi.translogger, to produce httpd style combined +logs. The log file is specified relative to the current working +directory not the tracker home. The tracker home is the +subdirectory demo under the current working directory. The +commented config is:: { "version": 1, // only supported version @@ -173,11 +362,12 @@ "formatters": { // standard format for Roundup messages "standard": { - "format": "%(asctime)s %(ctx_id)s %(levelname)s %(name)s:%(module)s %(msg)s" + "format": "%(asctime)s %(trace_id)s %(levelname)s %(name)s:%(module)s %(msg)s" }, // used for waitress wsgi server to produce httpd style logs "http": { - "format": "%(message)s" + "format": "%(message)s %(trace_id)s" + } }, "handlers": { @@ -254,12 +444,13 @@ fileConfig Based Logging Config ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The file config is an older and more limited method of configuring -logging.
It is described by the `Configuration file format +The file config is an older and more limited method of +configuring logging. It is described by the `Configuration file +format <https://docs.python.org/3/library/logging.config.html#configuration-file-format>`_ -in the Python documentation. The file name in the tracker's config for -the ``logging`` -> ``config`` setting must end with ``.ini`` to choose -the correct processing. +in the Python documentation. The file name in the tracker's +config for the ``logging`` -> ``config`` setting must end with +``.ini`` to choose the correct processing. Example fileConfig LoggingConfig ................................ @@ -343,7 +534,7 @@ keys=basic,plain [formatter_basic] - format=%(asctime)s %(process)d %(name)s:%(module)s.%(funcName)s,%(levelname)s: %(message)s + format=%(asctime)s %(trace_id)s %(process)d %(name)s:%(module)s.%(funcName)s,%(levelname)s: %(message)s datefmt=%Y-%m-%d %H:%M:%S [formatter_plain]
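The two JSON comment rules described in the admin_guide changes above (line comments, and inline comments preceded by at least three whitespace characters) can be sketched as a small standalone loader. This is an illustrative sketch, not Roundup's ``load_config_dict_from_json_file`` implementation; the function name and regular expressions are assumptions.

```python
import json
import re

# Sketch only: mirrors the documented comment rules, not Roundup's code.
# A "line comment" is a line whose first non-whitespace text is "//".
_LINE_COMMENT = re.compile(r'^\s*//.*$')
# An "inline comment" is "//" preceded by at least three whitespace chars.
_INLINE_COMMENT = re.compile(r'\s\s\s+//.*$')


def load_json_with_comments(text):
    """Strip javascript-style line comments, then parse the rest as JSON."""
    cleaned = []
    for line in text.splitlines():
        if _LINE_COMMENT.match(line):
            cleaned.append("")  # keep line numbering stable for error messages
            continue
        cleaned.append(_INLINE_COMMENT.sub("", line))
    return json.loads("\n".join(cleaned))


config = load_json_with_comments(
    '{\n'
    '  // a line comment\n'
    '  "version": 1,   // an inline comment\n'
    '  "disable_existing_loggers": false\n'
    '}'
)
```

Block comments would not survive this treatment, which matches the documentation's note that block style comments are not supported.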
--- a/doc/upgrading.txt Mon Sep 08 09:08:14 2025 -0400 +++ b/doc/upgrading.txt Tue Sep 16 22:53:00 2025 -0400 @@ -108,6 +108,35 @@ Migrating from 2.5.0 to 2.6.0 ============================= +Default Logs Include Unique Request Identifier (info) +----------------------------------------------------- + +The default logging format has been changed from:: + + %(asctime)s %(levelname)s %(message)s + +to:: + + %(asctime)s %(trace_id)s %(levelname)s %(message)s + +So logs now look like:: + + 2025-08-20 03:25:00,308 f6RPbT2s70vvJ2jFb9BQNF DEBUG get user1 cached + +which in the previous format would look like:: + + 2025-08-20 03:25:00,308 DEBUG get user1 cached + +The new format includes ``trace_id``, which is a thread and process +unique identifier for a single request. So you can link together all +of the log lines and determine where a slowdown or other +problem occurred. + +The logging format is now a ``config.ini`` parameter in the +``logging`` section with the name ``format``. You can change it if you +would like the old logging format without having to create a logging +configuration file. See :ref:`rounduplogging` for details. Support authorized changes in your tracker (optional) -----------------------------------------------------
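The format change above can be exercised outside Roundup with a plain ``logging.Filter`` that injects a ``trace_id`` attribute into every record, which is roughly what Roundup's context filter does. The class name ``TraceIdFilter`` and the fixed id value are assumptions for illustration only.

```python
import io
import logging

# Sketch only: inject a trace_id attribute so the new default format
# ("%(asctime)s %(trace_id)s %(levelname)s %(message)s") can render.
class TraceIdFilter(logging.Filter):
    def __init__(self, trace_id="not_set"):
        super().__init__()
        self.trace_id = trace_id

    def filter(self, record):
        # Don't overwrite a trace_id supplied via the extra dict.
        if not hasattr(record, "trace_id"):
            record.trace_id = self.trace_id
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(trace_id)s %(levelname)s %(message)s"))
handler.addFilter(TraceIdFilter("f6RPbT2s70vvJ2jFb9BQNF"))

logger = logging.getLogger("roundup.example")
logger.propagate = False
logger.setLevel(logging.DEBUG)
logger.handlers = [handler]
logger.debug("get user1 cached")

line = stream.getvalue()
```

With the filter installed, the captured line has the shape shown in the upgrading notes: timestamp, trace id, level, message.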
--- a/roundup/admin.py Mon Sep 08 09:08:14 2025 -0400 +++ b/roundup/admin.py Tue Sep 16 22:53:00 2025 -0400 @@ -49,6 +49,7 @@ ) from roundup.exceptions import UsageError from roundup.i18n import _, get_translation +from roundup.logcontext import gen_trace_id, store_trace_reason try: from UserDict import UserDict @@ -2354,6 +2355,8 @@ print(function.__doc__) return 1 + @gen_trace_id() + @store_trace_reason('admin') def run_command(self, args): """Run a single command """ @@ -2599,7 +2602,7 @@ if self.db and self.db_uncommitted: commit = self.my_input(_('There are unsaved changes. Commit them (y/N)? ')) if commit and commit[0].lower() == 'y': - self.db.commit() + self.run_command(["commit"]) # looks like histfile is saved with mode 600 if self.readline and self.history_features('save_history'): @@ -2674,7 +2677,7 @@ self.interactive() else: ret = self.run_command(args) - if self.db: self.db.commit() # noqa: E701 + if self.db: self.run_command(["commit"]) # noqa: E701 return ret finally: if self.db:
--- a/roundup/cgi/client.py Mon Sep 08 09:08:14 2025 -0400 +++ b/roundup/cgi/client.py Tue Sep 16 22:53:00 2025 -0400 @@ -56,10 +56,10 @@ UsageError, ) +from roundup.logcontext import gen_trace_id, store_trace_reason +from roundup.mailer import Mailer, MessageSendError from roundup.mlink_expr import ExpressionError -from roundup.mailer import Mailer, MessageSendError - logger = logging.getLogger('roundup') if not random_.is_weak: @@ -434,6 +434,8 @@ # content-type. precompressed_mime_types = ["image/png", "image/jpeg"] + @gen_trace_id() + @store_trace_reason('client') def __init__(self, instance, request, env, form=None, translator=None): # re-seed the random number generator. Is this is an instance of # random.SystemRandom it has no effect. @@ -581,6 +583,8 @@ self._ = self.gettext = translator.gettext self.ngettext = translator.ngettext + @gen_trace_id() + @store_trace_reason('client_main') def main(self): """ Wrap the real main in a try/finally so we always close off the db. """
--- a/roundup/cgi/wsgi_handler.py Mon Sep 08 09:08:14 2025 -0400 +++ b/roundup/cgi/wsgi_handler.py Tue Sep 16 22:53:00 2025 -0400 @@ -13,6 +13,7 @@ from roundup.anypy.strings import s2b from roundup.cgi import TranslationService from roundup.cgi.client import BinaryFieldStorage +from roundup.logcontext import gen_trace_id, store_trace_reason BaseHTTPRequestHandler = http_.server.BaseHTTPRequestHandler DEFAULT_ERROR_MESSAGE = http_.server.DEFAULT_ERROR_MESSAGE @@ -102,6 +103,8 @@ else: self.preload() + @gen_trace_id() + @store_trace_reason("wsgi") def __call__(self, environ, start_response): """Initialize with `apache.Request` object""" request = RequestHandler(environ, start_response)
--- a/roundup/configuration.py Mon Sep 08 09:08:14 2025 -0400 +++ b/roundup/configuration.py Tue Sep 16 22:53:00 2025 -0400 @@ -11,18 +11,19 @@ import errno import getopt import logging -import logging.config import os import re import smtplib import sys import time +import traceback import roundup.date from roundup.anypy import random_ from roundup.anypy.strings import b2s from roundup.backends import list_backends from roundup.i18n import _ +from roundup.logcontext import gen_trace_id, get_context_info if sys.version_info[0] > 2: import configparser # Python 3 @@ -577,33 +578,36 @@ class LoggingFormatOption(Option): - """Replace escaped % (as %%) with single %. + """Escape/unescape logging format string '%(' <-> '%%(' Config file parsing allows variable interpolation using %(keyname)s. However this is exactly the format that we need for creating a logging format string. So we tell the user to quote the string using %%(...). Then we turn %%( -> %( when - retrieved. + retrieved and turn %( into %%( when saving the file. """ - class_description = ("Allowed value: Python logging module named " - "attributes with % sign doubled.") + class_description = ("Allowed value: Python LogRecord attribute named " + "formats with % *sign doubled*.\n" + "Also you can include the following attributes:\n" + " %%(trace_id)s %%(trace_reason)s and %%(pct_char)s" + ) def str2value(self, value): - """Check format of unquoted string looking for missing specifiers. + """Validate and convert value of logging format string. This does a dirty check to see if a token is missing a - specifier. So "%(ascdate)s %(level) " would fail because of - the 's' missing after 'level)'. But "%(ascdate)s %(level)s" + specifier. So "%%(ascdate)s %%(level) " would fail because of + the 's' missing after 'level)'. But "%%(ascdate)s %%(level)s" would pass. - Note that %(foo)s generates a error from the ini parser - with a less than wonderful message. + Note that '%(foo)s' (i.e. 
unescaped substitution) generates + an error from the ini parser with a less than wonderful message. """ unquoted_val = value.replace("%%(", "%(") # regexp matches all current logging record object attribute names. - scanned_result = re.sub(r'%\([A-Za-z_]+\)\S','', unquoted_val ) + scanned_result = re.sub(r'%\([A-Za-z_]+\)\S', '', unquoted_val) if scanned_result.find('%(') != -1: raise OptionValueError( self, unquoted_val, @@ -618,6 +622,7 @@ """ return value.replace("%(", "%%(") + + class OriginHeadersListOption(Option): """List of space seperated origin header values. @@ -1658,7 +1663,7 @@ "If above 'config' option is set, this option has no effect.\n" "Allowed values: DEBUG, INFO, WARNING, ERROR"), (LoggingFormatOption, "format", - "%(asctime)s %(levelname)s %(message)s", + "%(asctime)s %(trace_id)s %(levelname)s %(message)s", "Format of the logging messages with all '%' signs\n" "doubled so they are not interpreted by the config file."), (BooleanOption, "disable_loggers", "no", @@ -2379,6 +2384,18 @@ def _get_name(self): return self["TRACKER_NAME"] + @gen_trace_id() + def _logging_test(self, sinfo, msg="test %(a)s\n%(sinfo)s", args=None): + """Test method for logging formatting. + + Not used in production. + """ + logger = logging.getLogger('roundup') + if args: + logger.info(msg, *args) + else: + logger.info(msg, extra={"a": "a_var", "sinfo": sinfo}) + def reset(self): Config.reset(self) if self.ext: @@ -2387,6 +2404,177 @@ self.detectors.reset() self.init_logging() + def gather_callstack(self, keep_full_stack=False): + # Locate logging call in stack + stack = traceback.extract_stack() + if keep_full_stack: + last_frame_index = len(stack) + else: + for last_frame_index, frame in enumerate(stack): + # Walk from the top of stack looking for + # "logging" in filename (really + # filepath). Can't use /logging/ as + # windows uses \logging\. + # + # Filtering by looking for "logging" in + # the filename isn't great.
+ if "logging" in frame.filename: + break + if not keep_full_stack: + stack = stack[0:last_frame_index] + return (stack, last_frame_index) + + def context_filter(self, record): + """Add context to record, expand context references in record.msg + Define pct_char as '%' + """ + + # This method can be called multiple times on different handlers. + # However it modifies the record on the first call and the changes + # persist in the record. So we only want to be called once per + # record. + if hasattr(record, "ROUNDUP_CONTEXT_FILTER_CALLED"): + return True + + for name, value in get_context_info(): + if not hasattr(record, name): + setattr(record, name, value) + record.pct_char = "%" + record.ROUNDUP_CONTEXT_FILTER_CALLED = True + + if hasattr(record, "sinfo"): + # sinfo has to be set via extras argument to logging commands + # to activate this. + # + # sinfo value can be: + # non-integer: "", None etc. Print 5 elements of + # stack before logging call + # integer N > 0: print N elements of stack before + # logging call + # 0: print whole stack before logging call + # integer N < 0: undocumented print stack starting at + # extract_stack() below. I.E. do not set bottom of + # stack to the logging call. + # if |N| is greater than stack height, print whole stack. + stack_height = record.sinfo + keep_full_stack = False + + if isinstance(stack_height, int): + if stack_height < 0: + keep_full_stack = True + stack_height = abs(stack_height) + if stack_height == 0: + # None will set value to actual stack height. + stack_height = None + else: + stack_height = 5 + + stack, last_frame_index = self.gather_callstack(keep_full_stack) + + if stack_height is None: + stack_height = last_frame_index + elif stack_height > last_frame_index: + stack_height = last_frame_index # start at frame 0 + + # report the stack info + record.sinfo = "".join( + traceback.format_list( + stack[last_frame_index - stack_height:last_frame_index] + ) + ) + + # if args are present, just return. 
Logging will + # expand the arguments. + if record.args: + return True + + # Since args not present, try formatting msg using + # named arguments. + try: + record.msg = record.msg % record.__dict__ + except (ValueError, TypeError): + # ValueError: means there is a % sign in the msg + # somewhere that is not meant to be a format token. So + # just leave msg unexpanded. + # + # TypeError - a positional format string is being + # handled without setting record.args. E.G. + # .info("result is %f") + # Leave message unexpanded. + pass + return True + + def add_logging_context_filter(self): + """Update log record with contextvar values and expand msg + + The contextvar values are stored as attributes on the log + record object in record.__dict__. They should not exist + when this is called. Do not overwrite them if they do exist + as they can be set in a logger call using:: + + logging.warning("a message", extra = {"trace_id": foo}) + + the extra argument/parameter. + + Attempt to expand msg using the variables in + record.__dict__. This makes:: + + logging.warning("the URL was: %(trace_reason)s") + + work and replaces the ``%(trace_reason)s`` token with the value. + Note that you can't use positional params and named params + together. For example:: + + logging.warning("user = %s and URL was: %(trace_reason)s", user) + + will result in an exception in logging when it formats the + message. + + Also ``%(pct_char)`` is defined to allow the addition of % + characters in the format string as bare % chars can't make + it past the configparser and %% encoded ones run into issue + with the format verifier. + + Calling a logger (.warning() etc.) with the argument:: + + extra={"sinfo": an_int} + + will result in a stack trace starting at the logger call + and going up the stack for `an_int` frames. Using "True" + in place of `an_int` will print only the call to the logger. + + Note that logging untrusted strings in the msg set by user + (untrusted) may be an issue. 
So don't do something like: + + .info("%s" % web_url) + + as web_url could include '%(trace_id)s'. Instead use: + + .info("%(url)s", extra={"url": web_url}) + + Even in the problem case, I think the damage is contained since: + + * data doesn't leak as the log string is not exposed to the + user. + + * the log string isn't executed or used internally. + + * log formatting can raise an exception. But this won't + affect the application as the exception is swallowed in + the logging package. The raw message would be printed by + the fallback logging handler. + + but if it is a concern, make sure user data is added using + the extra dict when calling one of the logging functions. + """ + loggers = [logging.getLogger(name) for name in + logging.root.manager.loggerDict] + # append the root logger as it is not listed in loggerDict + loggers.append(logging.getLogger()) + for logger in loggers: + for hdlr in logger.handlers: + hdlr.addFilter(self.context_filter) + def load_config_dict_from_json_file(self, filename): import json comment_re = re.compile( @@ -2481,14 +2669,15 @@ "Unable to load logging config file. " "File extension must be '.ini' or '.json'.\n" ) - + + self.add_logging_context_filter() return if _file: raise OptionValueError(self.options['LOGGING_CONFIG'], _file, "Unable to find logging config file.") - + _file = self["LOGGING_FILENAME"] # set file & level on the roundup logger logger = logging.getLogger('roundup') @@ -2503,6 +2692,15 @@ logger.removeHandler(hdlr) logger.handlers = [hdlr] logger.setLevel(self["LOGGING_LEVEL"] or "ERROR") + if 'pytest' not in sys.modules: + # logger.propagate is True by default. This is + # needed so that pytest caplog code will work. In + # production leaving it set to True relogs everything + # using the root logger with logging BASIC_FORMAT: + # "%(level)s:%(name)s:%(message)s" + logger.propagate = False + + self.add_logging_context_filter() def validator(self, options): """ Validate options once all options are loaded.
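The ``sinfo`` handling in ``context_filter`` above essentially slices the frame list returned by ``traceback.extract_stack`` and formats it. A minimal standalone sketch of that idea (the function name ``last_frames`` is an assumption; the real code additionally walks the stack to find the logging call):

```python
import traceback


def last_frames(depth=5):
    """Return up to `depth` formatted stack frames, newest last.

    Simplified sketch of gather_callstack/context_filter: the frame
    for this function itself is dropped, and depth <= 0 (or a depth
    larger than the stack) means "the whole remaining stack".
    """
    stack = traceback.extract_stack()[:-1]  # drop last_frames' own frame
    if depth <= 0 or depth > len(stack):
        depth = len(stack)
    return "".join(traceback.format_list(stack[-depth:]))


def caller():
    # Ask for just one frame: the caller() frame itself.
    return last_frames(1)


trace = caller()
```

This mirrors the documented behavior where an integer N selects N frames ending at the logging call, and 0 selects the whole stack.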
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/roundup/logcontext.py Tue Sep 16 22:53:00 2025 -0400 @@ -0,0 +1,208 @@ +import contextvars +import functools +import logging +import os +import uuid + +logger = logging.getLogger("roundup.logcontext") + + +class SimpleSentinel: + """A hack to get a sentinel value where I can define __str__(). + + I was using sentinel = object(). However some code paths + resulted in the sentinel object showing up as an argument + to print or logging.warning|error|debug(...). In this case + seeing "<class 'object'>" in the output isn't useful. + + So I created this class (with slots) as a fast method where + I could control the __str__ representation. + + """ + __slots__ = ("name", "str") + + def __init__(self, name=None, str_value=""): + self.name = name + self.str = str_value + + def __str__(self): + # Generate a string without whitespace. + # Used in logging where whitespace could be + # a field delimiter + return ("%s%s" % ( + self.name + "-" if self.name else "", + self.str)).replace(" ", "_") + + def __repr__(self): + return 'SimpleSentinel(name=%s, str_value="%s")' % ( + self.name, self.str) + + +# store the context variable names in a dict, because +# contextvars.copy_context().items() returns nothing if set() has +# not been called on a context var and I need the contextvar names +# even if they have not been set. +ctx_vars = {} + +# set up sentinel values that will print a suitable error value +# and the context vars they are associated with. +_SENTINEL_ID = SimpleSentinel("trace_id", "not set") +ctx_vars['trace_id'] = contextvars.ContextVar("trace_id", default=_SENTINEL_ID) + + +_SENTINEL_REASON = SimpleSentinel("trace_reason", "missing") +ctx_vars['trace_reason'] = contextvars.ContextVar("trace_reason", + default=_SENTINEL_REASON) + + +def shorten_int_uuid(uuid): + """Encode a UUID integer in a shorter form for display. + + A uuid is long. Make a shorter version that takes less room + in a log line. 
+ """ + + alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890" + result = "" + while uuid: + uuid, t = divmod(uuid, len(alphabet)) + result += alphabet[t] + return result or "0" + + +def gen_trace_id(): + """Decorator to generate a trace id (encoded uuid4) and add to contextvar + + The logging routine uses this to label every log line. All + logs with the same trace_id should be generated from a + single request. + + This decorator is applied to an entry point for a request. + Different methods of invoking Roundup have different entry + points. As a result, this decorator can be called multiple + times as some entry points can traverse another entry point + used by a different invocation method. It will not set a + trace_id if one is already assigned. + + A uuid4() is used as the uuid, but to shorten the log line, + the uuid4 integer is encoded into a 62 character ascii + alphabet (A-Za-z0-9). + + This decorator may produce duplicate (colliding) trace_id's + when used with multiple processes on some platforms where + uuid.uuid4().is_safe is unknown. Probability of a collision + is unknown. + + """ + def decorator(func): + @functools.wraps(func) + def wrapper(*args, **kwargs): + prev = None + trace_id = ctx_vars['trace_id'] + if trace_id.get() is _SENTINEL_ID: + prev = trace_id.set(shorten_int_uuid(uuid.uuid4().int)) + try: + r = func(*args, **kwargs) + finally: + if prev: + trace_id.reset(prev) + return r + return wrapper + return decorator + + +def store_trace_reason(location=None): + """Decorator finds and stores a reason trace was started in contextvar. + + Record the url for a regular web triggered request. + Record the message id for an email triggered request. + Record a roundup-admin command/action for roundup-admin request. + + Because the reason can be stored in different locations + depending on where this is called, it is called with a + location hint to activate the right extraction method. 
+ + If the reason has already been stored (and it's not "missing"), + it tries to extract it again and verifies it's the same as the + stored reason. If it's not the same it logs an error. This + safety check may be removed in a future version of Roundup. + """ + def decorator(func): + @functools.wraps(func) + def wrapper(*args, **kwargs): + nonlocal location + + reason = None + prev_trace_reason = None + trace_reason = ctx_vars['trace_reason'] + stored_reason = trace_reason.get() + + # Fast return path. Not enabled to make sure SANITY + # CHECK below runs. If the CHECK fails, we have a + # bad internal issue: contextvars shared between + # threads, roundup modifying reason within a request, ... + # + # if stored_reason is not _SENTINEL_REASON: + # return func(*args, **kwargs) + + # use location to determine how to extract the reason + if location == "wsgi" and 'REQUEST_URI' in args[1]: + reason = args[1]['REQUEST_URI'] + elif location == "client" and 'REQUEST_URI' in args[3]: + reason = args[3]['REQUEST_URI'] + elif location == "mailgw": + reason = args[1].get_header('message-id', "no_message_id") + elif location == "admin": + reason = "roundup-admin(%s): %s" % (os.getlogin(), args[1][:2]) + elif location.startswith("file://"): + reason = location + elif location == "client_main" and 'REQUEST_URI' in args[0].env: + reason = args[0].env['REQUEST_URI'] + elif location == "xmlrpc-server": + reason = args[0].path + + if reason is None: + pass + elif stored_reason is _SENTINEL_REASON: + # no value stored and reason is not None, update + prev_trace_reason = trace_reason.set(reason) + elif reason != stored_reason: # SANITY CHECK + # Log an error: mismatched reasons should + # never happen. + logger.error("Mismatched reasons: stored: %s, new: %s at %s", + stored_reason, reason, location) + + try: + r = func(*args, **kwargs) + finally: + # reset context var in case thread is reused for + # another request. 
+ if prev_trace_reason: + trace_reason.reset(prev_trace_reason) + return r + return wrapper + return decorator + + +def get_context_info(): + """Return list of context var tuples [(var_name, var_value), ...]""" + + return [(name, ctx.get()) for name, ctx in ctx_vars.items()] + + +# Is returning a dict for this info more pythonic? +def get_context_dict(): + """Return dict of context vars {"var_name": var_value, ...}""" + return {name: ctx.get() for name, ctx in ctx_vars.items()} + +# Dummy no-op implementation of this module: +# +#def noop_decorator(*args, **kwargs): +# def decorator(func): +# return func +# return decorator +# +#def get_context_info(): +# return [ ("trace_id", "noop_trace_id"), +# ("trace_reason", "noop_trace_reason") ] +#gen_trace_id = store_trace_reason = noop_decorator
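For reference, the base-62 shortening used by gen_trace_id maps a uuid4 integer onto the A-Za-z0-9 alphabet, least significant digit first. A round-trip sketch of the same algorithm (the decode helper widen_short_uuid is not part of the patch; it is shown only to demonstrate the encoding is lossless):

```python
# Same digit alphabet as roundup/logcontext.py.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"


def shorten_int_uuid(n):
    # Repeated divmod, emitting the least significant digit first,
    # as logcontext.shorten_int_uuid does.
    result = ""
    while n:
        n, t = divmod(n, len(ALPHABET))
        result += ALPHABET[t]
    return result or "0"


def widen_short_uuid(s):
    # Hypothetical inverse, for illustration only.
    n = 0
    for ch in reversed(s):
        n = n * len(ALPHABET) + ALPHABET.index(ch)
    return n
```

Since 62**22 > 2**128, any 128-bit uuid4 integer encodes in at most 22 base-62 digits, which is what keeps the trace_id field short in a log line.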
--- a/roundup/mailgw.py Mon Sep 08 09:08:14 2025 -0400 +++ b/roundup/mailgw.py Tue Sep 16 22:53:00 2025 -0400 @@ -128,6 +128,7 @@ from roundup.anypy.strings import StringIO, b2s, u2s from roundup.hyperdb import iter_roles from roundup.i18n import _ +from roundup.logcontext import gen_trace_id, store_trace_reason from roundup.mailer import Mailer from roundup.dehtml import dehtml @@ -1646,6 +1647,8 @@ return self.handle_Message(message_from_binary_file(fp, RoundupMessage)) + @gen_trace_id() + @store_trace_reason("mailgw") def handle_Message(self, message): """Handle an RFC822 Message
--- a/roundup/scripts/roundup_xmlrpc_server.py Mon Sep 08 09:08:14 2025 -0400 +++ b/roundup/scripts/roundup_xmlrpc_server.py Tue Sep 16 22:53:00 2025 -0400 @@ -8,8 +8,12 @@ # --- patch sys.path to make sure 'import roundup' finds correct version from __future__ import print_function + +import base64 +import getopt +import os.path as osp +import socket import sys -import os.path as osp thisdir = osp.dirname(osp.abspath(__file__)) rootdir = osp.dirname(osp.dirname(thisdir)) @@ -19,15 +23,14 @@ sys.path.insert(0, rootdir) # --/ +import roundup.instance +from roundup.anypy import urllib_, xmlrpc_ +from roundup.anypy.strings import b2s +from roundup.cgi.exceptions import Unauthorised +from roundup.instance import TrackerError +from roundup.logcontext import gen_trace_id, store_trace_reason +from roundup.xmlrpc import RoundupInstance, translate -import base64, getopt, os, sys, socket -from roundup.anypy import urllib_ -from roundup.xmlrpc import translate -from roundup.xmlrpc import RoundupInstance -import roundup.instance -from roundup.instance import TrackerError -from roundup.cgi.exceptions import Unauthorised -from roundup.anypy import xmlrpc_ SimpleXMLRPCServer = xmlrpc_.server.SimpleXMLRPCServer SimpleXMLRPCRequestHandler = xmlrpc_.server.SimpleXMLRPCRequestHandler @@ -64,7 +67,7 @@ scheme, challenge = authorization.split(' ', 1) if scheme.lower() == 'basic': - decoded = base64.b64decode(challenge) + decoded = b2s(base64.b64decode(challenge)) if ':' in decoded: username, password = decoded.split(':') else: @@ -85,6 +88,8 @@ db.setCurrentUser(username) return db + @gen_trace_id() + @store_trace_reason("xmlrpc-server") def do_POST(self): """Extract username and password from authorization header.""" @@ -149,7 +154,7 @@ elif opt in ['-p', '--port']: port = int(arg) elif opt in ['-e', '--encoding']: - encoding = encoding + encoding = arg tracker_homes = {} for arg in args:
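The b2s change above matters because on Python 3 base64.b64decode returns bytes, so the subsequent `':' in decoded` test against a str raises TypeError unless the value is decoded first. A minimal sketch of the fixed flow, using plain `.decode()` as a stand-in for roundup.anypy.strings.b2s; parse_basic_auth is a hypothetical helper, and unlike the patch it splits with maxsplit=1 so passwords containing ':' survive:

```python
import base64


def parse_basic_auth(challenge):
    # b64decode returns bytes; convert to str before the ':' test
    # and split (the patch uses b2s for this conversion).
    decoded = base64.b64decode(challenge).decode("utf-8")
    if ":" in decoded:
        username, password = decoded.split(":", 1)
    else:
        username, password = decoded, ""
    return username, password
```

Without the decode step, `":" in b"admin:sekrit"` raises `TypeError: a bytes-like object is required, not 'str'`, which is the encoding bug the changeset description mentions.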
--- a/test/test_config.py Mon Sep 08 09:08:14 2025 -0400 +++ b/test/test_config.py Tue Sep 16 22:53:00 2025 -0400 @@ -418,6 +418,10 @@ @pytest.mark.usefixtures("save_restore_logging") class TrackerConfig(unittest.TestCase): + @pytest.fixture(autouse=True) + def inject_fixtures(self, caplog): + self._caplog = caplog + @pytest.fixture(scope="class") def save_restore_logging(self): """Save logger state and try to restore it after all tests in @@ -1009,7 +1013,136 @@ print(cm.exception) self.assertEqual(cm.exception.args[0], self.dirname) + def testFormattedLogging(self): + """Depends on using default logging format with %(trace_id)""" + def find_file_occurances(string): + return len(re.findall(r'\bFile\b', string)) + + config = configuration.CoreConfig() + + config.LOGGING_LEVEL = "DEBUG" + config.init_logging() + + + # format the record and verify the logformat/trace_id. + config._logging_test(None, msg="message") + tuple = self._caplog.record_tuples[0] + self.assertEqual(tuple[1], 20) + self.assertEqual("message", tuple[2]) + logger = logging.getLogger('roundup') + hdlr = logger.handlers[0] + log = hdlr.format(self._caplog.records[0]) + # verify that %(trace_id) was set and substituted + # Note: trace_id is not initialized in this test case + log_parts = log.split() + self.assertRegex(log_parts[2], r'^[A-Za-z0-9]{22}') + self._caplog.clear() + + # the rest check various values of sinfo and msg formatting. 
+ + # sinfo = 1 - one line of stack starting with log call + config._logging_test(1) + tuple = self._caplog.record_tuples[0] + self.assertEqual(tuple[1], 20) + self.assertIn("test a_var\n File", tuple[2]) + self.assertIn("in _logging_test", tuple[2]) + self.assertIn("logger.info(msg, extra=", tuple[2]) + self.assertEqual(find_file_occurances(tuple[2]), 1) + self._caplog.clear() + + # sinfo = None - 5 lines of stack starting with log call + config._logging_test(None) + tuple = self._caplog.record_tuples[0] + self.assertEqual(tuple[1], 20) + self.assertIn("test a_var\n File", tuple[2]) + self.assertIn("in _logging_test", tuple[2]) + self.assertIn("logger.info(msg, extra=", tuple[2]) + self.assertEqual(find_file_occurances(tuple[2]), 5) + self._caplog.clear() + + # sinfo = 0 - whole stack starting with log call + config._logging_test(0) + tuple = self._caplog.record_tuples[0] + self.assertEqual(tuple[1], 20) + self.assertIn("test a_var\n File", tuple[2]) + self.assertIn("in _logging_test", tuple[2]) + self.assertIn("logger.info(msg, extra=", tuple[2]) + # A file in 'pytest' directory should be at the top of stack. + self.assertIn("pytest", tuple[2]) + # no idea how deep the actual stack is, could change with python + # versions, but 3.12 is 33 so .... + self.assertTrue(find_file_occurances(tuple[2]) > 10) + self._caplog.clear() + + # sinfo = -1 - one line of stack starting with extract_stack() + config._logging_test(-1) + tuple = self._caplog.record_tuples[0] + self.assertEqual(tuple[1], 20) + self.assertIn("test a_var\n File", tuple[2]) + # The call to extract_stack should be included as the frame + # at bottom of stack. 
+ self.assertIn("extract_stack()", tuple[2]) + # only one frame included + self.assertEqual(find_file_occurances(tuple[2]), 1) + self._caplog.clear() + + # sinfo = 1000 - whole stack starting with log call 1000>stack size + config._logging_test(1000) + tuple = self._caplog.record_tuples[0] + self.assertEqual(tuple[1], 20) + self.assertIn("test a_var\n File", tuple[2]) + self.assertIn("in _logging_test", tuple[2]) + self.assertIn("logger.info(msg, extra=", tuple[2]) + # A file in 'pytest' directory should be at the top of stack. + self.assertIn("pytest", tuple[2]) + # no idea how deep the actual stack is, could change with python + # versions, but 3.12 is 33 so .... + self.assertTrue(find_file_occurances(tuple[2]) > 10) + self._caplog.clear() + + # sinfo = -1000 - whole stack starting with extract_stack + config._logging_test(-1000) + tuple = self._caplog.record_tuples[0] + self.assertEqual(tuple[1], 20) + self.assertIn("test a_var\n File", tuple[2]) + self.assertIn("in _logging_test", tuple[2]) + self.assertIn("logger.info(msg, extra=", tuple[2]) + # The call to extract_stack should be included as the frame + # at bottom of stack. + self.assertIn("extract_stack()", tuple[2]) + # A file in 'pytest' directory should be at the top of stack. + # no idea how deep the actual stack is, could change with python + # versions, but 3.12 is 33 so .... 
+ self.assertTrue(find_file_occurances(tuple[2]) > 10) + self.assertIn("pytest", tuple[2]) + self._caplog.clear() + + # pass args and compatible message + config._logging_test(None, args=(1,2,3), + msg="one: %s, two: %s, three: %s" + ) + tuple = self._caplog.record_tuples[0] + self.assertEqual(tuple[1], 20) + self.assertEqual('one: 1, two: 2, three: 3', tuple[2]) + self._caplog.clear() + + # error case for incorrect placeholder + config._logging_test(None, msg="%(a)") + tuple = self._caplog.record_tuples[0] + self.assertEqual(tuple[1], 20) + self.assertEqual("%(a)", tuple[2]) + self._caplog.clear() + + # error case for an incompatible format: record.__dict__ is the + # argument and it can't be turned into floating point. + config._logging_test(None, msg="%f") + tuple = self._caplog.record_tuples[0] + self.assertEqual(tuple[1], 20) + self.assertEqual("%f", tuple[2]) + self._caplog.clear() + + def testXhtmlRaisesOptionError(self): self.munge_configini(mods=[ ("html_version = ", "xhtml") ])
