tentacle: librbd/pwl: fix memory leaks in discard operations #68955

Open
tchaikov wants to merge 1 commit into ceph:tentacle from tchaikov:wip-75077-tentacle

@tchaikov
Contributor

backport tracker: https://tracker.ceph.com/issues/75077


backport of #66876
parent tracker: https://tracker.ceph.com/issues/74972

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

Fix memory leak in librbd persistent write log (PWL) cache discard
operations by properly completing request objects.

ASan reported the following leaks in unittest_librbd:

  Direct leak of 240 byte(s) in 1 object(s) allocated from:
    #0 operator new(unsigned long)
    #1 librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::discard(...)
       /ceph/src/librbd/cache/pwl/AbstractWriteLog.cc:935:5
    #2 TestMockCacheReplicatedWriteLog_discard_Test::TestBody()
       /ceph/src/test/librbd/cache/pwl/test_mock_ReplicatedWriteLog.cc:534:7

  Plus multiple indirect leaks totaling 2,076 bytes through the
  shared_ptr reference chain.

Root cause:

C_DiscardRequest objects were never deleted because their complete()
method was never called. The on_write_persist callback released the
BlockGuard cell but didn't call complete() to trigger self-deletion.

Write requests use WriteLogOperationSet which takes the request as
its on_finish callback, ensuring complete() is eventually called.
Discard requests don't use WriteLogOperationSet and must explicitly
call complete() in their on_write_persist callback.
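
The leak described above can be illustrated with a minimal, self-contained sketch. It assumes a simplified `Context` whose `complete()` runs `finish()` and then deletes the object; `DemoRequest`, `run_without_complete()` and `run_with_complete()` are hypothetical names used only for illustration and are not part of librbd.

```cpp
#include <cassert>
#include <functional>

// Simplified stand-in (an assumption, not the real Ceph class): complete()
// runs finish() and then deletes the object, so a heap-allocated request is
// freed only if someone eventually calls complete() on it.
struct Context {
  virtual ~Context() = default;
  virtual void complete(int r) {
    finish(r);
    delete this;
  }
protected:
  virtual void finish(int r) = 0;
};

// Hypothetical request type that counts live instances.
struct DemoRequest : Context {
  static int live;
  DemoRequest() { ++live; }
  ~DemoRequest() override { --live; }
protected:
  void finish(int) override {}
};
int DemoRequest::live = 0;

// Persist callback that only releases the guard cell: the request is never
// completed, so it is still alive afterwards. Returns the live count.
int run_without_complete() {
  auto* req = new DemoRequest();
  bool cell_released = false;
  std::function<void(int)> on_write_persist = [&](int) {
    cell_released = true;  // cell released, but complete() is not called
  };
  on_write_persist(0);
  assert(cell_released);
  int still_live = DemoRequest::live;
  delete req;  // cleanup for this demo only; the leaking code had no such step
  return still_live;
}

// Persist callback that completes the request, which triggers self-deletion.
int run_with_complete() {
  auto* req = new DemoRequest();
  std::function<void(int)> on_write_persist = [req](int r) {
    req->complete(r);
  };
  on_write_persist(0);
  return DemoRequest::live;
}
```

In this sketch `run_without_complete()` reports one request still alive after the callback, while `run_with_complete()` reports none.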

Solution:

Call discard_req->complete(r) in the on_write_persist callback and
move cell release into finish_req() -- mirroring how C_WriteRequest
handles it. The complete() -> finish() -> finish_req() chain ensures
the cell is released after the user request is completed, preserving
the same ordering as write requests.
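
The ordering of the complete() -> finish() -> finish_req() chain can be sketched as follows. This is an illustration under stated assumptions, not the actual librbd code: `DiscardRequestSketch`, `run_discard_persist()` and the event strings are hypothetical, and only the method names `complete()`, `finish()` and `finish_req()` come from the description above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical request that records each step of the completion chain.
struct DiscardRequestSketch {
  explicit DiscardRequestSketch(std::vector<std::string>& ev) : events(ev) {}

  // Entry point called from the persist callback; deletes the request.
  void complete(int r) {
    finish(r);
    delete this;
  }

private:
  // Private destructor: the object can only be destroyed via complete().
  ~DiscardRequestSketch() { events.push_back("request deleted"); }

  void finish(int r) {
    // The user request is completed first ...
    events.push_back("user request completed r=" + std::to_string(r));
    finish_req(r);
  }

  void finish_req(int) {
    // ... and only then is the guard cell released.
    events.push_back("cell released");
  }

  std::vector<std::string>& events;
};

// Simulates the on_write_persist callback calling complete(r) and returns
// the recorded order of events.
std::vector<std::string> run_discard_persist(int r) {
  std::vector<std::string> events;
  auto* req = new DiscardRequestSketch(events);
  auto on_write_persist = [req](int result) { req->complete(result); };
  on_write_persist(r);
  assert(!events.empty());
  return events;
}
```

The recorded order is user completion, then cell release, then deletion of the request, which is the ordering the fix aims to preserve.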

Test results:
- Before: 2,316 bytes leaked in 15 allocations
- After: 0 bytes leaked
- unittest_librbd discard tests pass successfully with ASan

Fixes: https://tracker.ceph.com/issues/74972
Signed-off-by: Kefu Chai <k.chai@proxmox.com>
(cherry picked from commit fcda31a)
@tchaikov tchaikov requested a review from a team as a code owner May 17, 2026 02:42
@tchaikov tchaikov added this to the tentacle milestone May 17, 2026
@tchaikov tchaikov added the rbd label May 17, 2026
@github-actions github-actions Bot added the releng-audit-pass Release engineering: passed backport verification audit. label May 17, 2026