GCCollector / multiprocess mode deadlock #322

@akx

Ref #313, #321.

There's a chance of a deadlock between the GCCollector and multiprocess mode (that is, when the prometheus_multiproc_dir environment variable is set).

For instance, when running our test suite with py.test, the process seems to hang, and pressing Ctrl+C yields this traceback (on 0.4.0):

Traceback (most recent call last):
  File "prometheus_client/gc_collector.py", line 49, in _cb
    latency.labels(gen).observe(delta)
  File "prometheus_client/core.py", line 747, in labels
    self._metrics[labelvalues] = self._wrappedClass(self._name, self._labelnames, labelvalues, **self._kwargs)
  File "prometheus_client/core.py", line 1088, in __init__
    self._sum = _ValueClass(self._type, name, name + '_sum', labelnames, labelvalues)
  File "prometheus_client/core.py", line 636, in __init__
    with lock:

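This looks like a GC collector callback firing while another metric's value object (a _MultiProcessValue) is being modified. That modification holds the Lock() shared by all values created here, and since a Lock() is not reentrant, the callback blocks forever when it tries to take the same lock to create its own value.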
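To illustrate the suspected mechanism, here is a minimal sketch that does not use prometheus_client at all. It assumes the hang is a same-thread re-acquisition of a plain threading.Lock from inside a gc callback; the names `lock`, `outcomes` and `gc_callback` are hypothetical stand-ins, not the library's code. A timeout is used so the sketch reports the failure instead of hanging.

```python
import gc
import threading

lock = threading.Lock()  # non-reentrant, standing in for the shared value lock
outcomes = []            # records whether the callback managed to take the lock


def gc_callback(phase, info):
    # gc invokes each callback twice per collection ("start" and "stop");
    # act only once per collection.
    if phase != "stop":
        return
    # A plain `with lock:` here would block forever if the current thread
    # already holds the lock, so use a timeout to make the problem visible.
    acquired = lock.acquire(timeout=0.1)
    outcomes.append(acquired)
    if acquired:
        lock.release()


gc.callbacks.append(gc_callback)
try:
    with lock:        # stands in for a value object being modified
        gc.collect()  # a collection fires while the lock is held
finally:
    gc.callbacks.remove(gc_callback)

print(False in outcomes)
```

The callback runs synchronously in the thread that triggered the collection, so it cannot get the lock that thread is already holding. Without the timeout, this is the point where the process would hang.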

One possible option might be to use an RLock() instead of a Lock(), but I'm not sure what side effects that might have.
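As a rough sketch of what the RLock() option changes, the snippet below compares the two lock types when the same thread tries to acquire a lock it already holds. The helper `reenter` is hypothetical and only illustrates standard-library behaviour, not anything in prometheus_client.

```python
import threading


def reenter(lock):
    """Acquire `lock`, then try to acquire it again from the same thread."""
    with lock:
        # Timeout so the non-reentrant case returns False instead of hanging.
        acquired = lock.acquire(timeout=0.1)
        if acquired:
            lock.release()
        return acquired


print(reenter(threading.Lock()))   # False
print(reenter(threading.RLock()))  # True
```

One thing to keep in mind with an RLock(): the nested acquisition succeeds while the outer critical section is only partly done, so code running in the callback could observe state that the interrupted update has not finished writing.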
