Conditional recv functions by Frank01001 · Pull Request #296 · libdebug/libdebug

Frank01001 · 2026-01-02T15:33:25Z

This PR introduces two new pipe interaction functions designed to handle non-deterministic outputs.

The new functions, match_recvuntil and match_recverruntil, allow users to specify multiple patterns that the binary may produce. Each function returns the index of the matched pattern along with the consumed bytes.
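The return contract described above can be sketched on a plain byte buffer. This is only an illustration, not the libdebug implementation: `match_until` is a hypothetical standalone helper, and it assumes the consumed bytes include the matched pattern and that the pattern whose match ends first wins.

```python
def match_until(buffer: bytes, patterns: list[bytes]) -> tuple[int, bytes]:
    """Return (index of the pattern whose match ends first, bytes consumed).

    Hypothetical helper: returns (-1, buffer) when no pattern occurs.
    """
    best_index = -1
    best_end = -1
    for index, pattern in enumerate(patterns):
        pos = buffer.find(pattern)
        if pos < 0:
            continue  # this pattern does not occur in the buffer
        end = pos + len(pattern)
        if best_end < 0 or end < best_end:
            best_index, best_end = index, end
    if best_index < 0:
        return -1, buffer
    # Assumption: the consumed bytes include the matched pattern itself.
    return best_index, buffer[:best_end]


index, consumed = match_until(b"Access denied\n> ", [b"Welcome", b"denied"])
```

A caller can then branch on `index` to decide which path the binary took.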

Copilot

Pull request overview

This PR introduces conditional receive functionality for handling non-deterministic outputs from debugged processes. The implementation uses an Aho-Corasick algorithm to efficiently match multiple patterns simultaneously.

Key changes:

- Added match_recvuntil and match_recverruntil functions to PipeManager for pattern matching in stdout/stderr
- Implemented an Aho-Corasick matcher class for efficient multi-pattern searching
- Added comprehensive unit tests for the Aho-Corasick algorithm and integration tests for the conditional receive functionality
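For readers unfamiliar with the technique, a stateful Aho-Corasick matcher can be sketched roughly as follows. This is a minimal illustration and not the `AhoCorasickMatcher` class from the PR: the class name `StreamingAhoCorasick` and its `feed` method are hypothetical, and non-empty patterns are assumed.

```python
from collections import deque


class StreamingAhoCorasick:
    """Illustrative Aho-Corasick automaton that keeps its state between feeds."""

    def __init__(self, patterns: list[bytes]) -> None:
        # Node 0 is the root; each node maps a byte value to a child node.
        self.goto: list[dict[int, int]] = [{}]
        self.fail: list[int] = [0]
        self.out: list[list[int]] = [[]]  # pattern indices ending at each node
        self.state = 0

        # Build the trie of all patterns.
        for index, pattern in enumerate(patterns):
            node = 0
            for byte in pattern:
                if byte not in self.goto[node]:
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                    self.goto[node][byte] = len(self.goto) - 1
                node = self.goto[node][byte]
            self.out[node].append(index)

        # Breadth-first pass to compute failure links and merged outputs.
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for byte, child in self.goto[node].items():
                fallback = self.fail[node]
                while fallback and byte not in self.goto[fallback]:
                    fallback = self.fail[fallback]
                self.fail[child] = self.goto[fallback].get(byte, 0)
                self.out[child] += self.out[self.fail[child]]
                queue.append(child)

    def feed(self, chunk: bytes) -> tuple[int, int]:
        """Return (pattern index, offset just past the match), or (-1, len(chunk))."""
        for offset, byte in enumerate(chunk):
            while self.state and byte not in self.goto[self.state]:
                self.state = self.fail[self.state]
            self.state = self.goto[self.state].get(byte, 0)
            if self.out[self.state]:
                return self.out[self.state][0], offset + 1
        return -1, len(chunk)


matcher = StreamingAhoCorasick([b"Welcome", b"denied"])
first = matcher.feed(b"Access den")  # the pattern is split across two reads
second = matcher.feed(b"ied\n")
```

Because the automaton state survives between calls, a pattern split across two reads is still found without rescanning earlier bytes.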

Reviewed changes

Copilot reviewed 8 out of 11 changed files in this pull request and generated 13 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `libdebug/utils/search_utils.py` | Implements AhoCorasickMatcher class for efficient multi-pattern searching with stateful matching capability |
| `libdebug/commlink/pipe_manager.py` | Adds `match_recvuntil` and `match_recverruntil` methods for conditional receive operations on stdout/stderr |
| `test/scripts/conditional_recv_test.py` | Provides comprehensive unit tests for Aho-Corasick matcher and integration tests for conditional receive functionality |
| `test/srcs/conditional_recv_test.c` | Test binary that generates random conditional output for testing pattern matching |
| `test/binaries/{i386,amd64,aarch64}/conditional_recv_test` | Compiled test binaries for different architectures |
| `test/scripts/__init__.py` | Registers ConditionalRecvTest in the test suite |
| `test/run_suite.py` | Adds ConditionalRecvTest to the fast test suite |
| `docs/basics/running_an_executable.md` | Documents the new `match_recvuntil` and `match_recverruntil` functions in the pipe manager API |
| `newsfragments/296.improvement.md` | Changelog entry describing the new feature |


libdebug/commlink/pipe_manager.py

libdebug/utils/search_utils.py

test/srcs/conditional_recv_test.c

libdebug/commlink/pipe_manager.py

test/scripts/__init__.py

test/scripts/conditional_recv_test.py

Copilot

Pull request overview

Copilot reviewed 8 out of 11 changed files in this pull request and generated 5 comments.


newsfragments/296.improvement.md

libdebug/commlink/pipe_manager.py

test/scripts/__init__.py

io-no

Did you try to benchmark the efficiency of Aho–Corasick with respect to other approaches? I am thinking about the really dumb approach of repeatedly calling bytes.find(pattern) for each pattern, or similar ones. I know what you’re thinking: the theoretical complexity of Aho–Corasick is very low. However, that complexity really manifests in cases of (i) large buffers and (ii) a high number of patterns.

On the other hand, I expect only a few patterns (2 or 3?) and buffers smaller than a single read (4096 bytes) in most cases (and at worst only a few times larger than a single read). Since the implementation allocates a non-negligible number of Python objects, while other “dumb” solutions are C-backed, there might be some surprises, as we have seen in the past with other pattern-searching code in libdebug, where the dumb solution was more efficient than the state-of-the-art algorithm for similar reasons.

I don’t really know the answer. I’m just suggesting that it might be worth running a few tests to ensure that the most common cases are prioritized.

Also, remember we are in 2026… the copyright 😉

PS. really cool stuff, btw
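The comparison suggested above could start from a small timing harness like the one below. It is a sketch only: `naive_first_match` and `time_matcher` are hypothetical names, the buffer and patterns are made-up inputs sized after the single-read case mentioned in the comment, and no timing results are implied.

```python
import timeit


def naive_first_match(buffer: bytes, patterns: list[bytes]) -> tuple[int, int]:
    """Return (pattern index, end offset) of the match that ends first, or (-1, -1)."""
    best = (-1, -1)
    for index, pattern in enumerate(patterns):
        pos = buffer.find(pattern)  # C-backed substring search
        if pos >= 0 and (best[1] < 0 or pos + len(pattern) < best[1]):
            best = (index, pos + len(pattern))
    return best


def time_matcher(matcher, buffer: bytes, patterns: list[bytes], repeat: int = 1000) -> float:
    """Return the average number of seconds per call of matcher(buffer, patterns)."""
    return timeit.timeit(lambda: matcher(buffer, patterns), number=repeat) / repeat


if __name__ == "__main__":
    # Made-up input: roughly one 4096-byte read with the match near the end.
    buffer = b"A" * 4000 + b"Access denied\n"
    patterns = [b"Welcome", b"denied"]
    print(f"naive: {time_matcher(naive_first_match, buffer, patterns):.2e} s/call")
```

Any other matcher with the same `(buffer, patterns)` signature can be passed to `time_matcher` to compare the two approaches on identical input.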

io-no · 2026-01-04T19:26:53Z

libdebug/commlink/pipe_manager.py


+    def _internal_match_recvuntil(
+        self: PipeManager,
+        patterns: list[bytes],


bytes | str

io-no · 2026-01-04T19:30:58Z

libdebug/commlink/pipe_manager.py

+        while pattern_found < 0:
+            open_flag = self._stderr_is_open if stderr else self._stdout_is_open
+
+            if (remaining_time := max(0, end_time - time.time())) == 0:


This check should come after the first search; there is no point in testing whether the timeout has already elapsed before the first iteration has searched anything.

io-no · 2026-01-04T19:32:07Z

libdebug/commlink/pipe_manager.py

+            if (remaining_time := max(0, end_time - time.time())) == 0:
+                raise TimeoutError("Timeout reached")
+
+            if not open_flag:


This one, instead, MUST come after the first search. If data has already been buffered, you are missing it. See _recvonceuntil as an example, where (until := data_buffer.find(delims)) is evaluated at the beginning of the while loop. Or am I missing something?
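The ordering requested in these two comments can be sketched as a loop that searches buffered data before checking the deadline or the stream state. This is an illustration under assumptions, not the PipeManager code: `recv_until_any` and the `read_chunk` callable are hypothetical, and a plain `bytes.find` scan stands in for the matcher.

```python
import time
from typing import Callable


def recv_until_any(
    read_chunk: Callable[[float], bytes],
    buffer: bytearray,
    patterns: list[bytes],
    timeout: float,
) -> tuple[int, bytes]:
    """Search already-buffered data before checking the deadline or reading more.

    read_chunk(remaining) is a hypothetical callable that returns new bytes,
    or b"" once the stream is closed.
    """
    end_time = time.monotonic() + timeout
    while True:
        # 1. Search first, so data buffered by earlier calls is never missed.
        best_index, best_end = -1, -1
        for index, pattern in enumerate(patterns):
            pos = buffer.find(pattern)
            if pos >= 0 and (best_end < 0 or pos + len(pattern) < best_end):
                best_index, best_end = index, pos + len(pattern)
        if best_index >= 0:
            consumed = bytes(buffer[:best_end])
            del buffer[:best_end]  # leave the unconsumed tail for the next call
            return best_index, consumed

        # 2. Only now check the deadline and whether more data can arrive.
        remaining = end_time - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("Timeout reached")
        chunk = read_chunk(remaining)
        if not chunk:
            raise EOFError("Stream closed before any pattern matched")
        buffer += chunk
```

With this ordering, a call with a zero timeout or a closed stream still succeeds when the buffer already holds a match.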

io-no · 2026-01-04T20:00:09Z

libdebug/commlink/pipe_manager.py

+            ):
+                # We will not receive more data, the child process is not running
+                if optional:
+                    return (-1, bytes(matcher.consumed_bytes))


Same here: shouldn't matcher.stateful_search be called first?

MrIndeciso · 2026-01-05T10:53:56Z

I think that we should also support regex-based conditional recv functions too. I expect that some users (cough cough maybe Chino? cough cough) might expect match_[...] to work with regex-like patterns.

I'm also not sure about the names, if I need to call recvuntil with multiple patterns, I think that recvuntil_something is easier to find than something_recvuntil.

Maybe recvuntil_regex and recvuntil_pattern?

Frank01001 · 2026-01-05T11:39:50Z

> I think that we should also support regex-based conditional recv functions too. I expect that some users (cough cough maybe Chino? cough cough) might expect match_[...] to work with regex-like patterns.
>
> I'm also not sure about the names, if I need to call recvuntil with multiple patterns, I think that recvuntil_something is easier to find than something_recvuntil.
>
> Maybe recvuntil_regex and recvuntil_pattern?

I did consider a regex-matching recv as an alternative, but matching a regex on a stream, as is done for the multiple patterns, would require additional dependencies such as Google's RE2 (reimplementing a regex engine is probably outside the scope of libdebug). As an alternative, I can implement a naive approach that just reruns the regex matcher from the built-in Python package on the whole buffer as input accumulates, which would probably be slow. Still, if there's demand for it, we can consider both options.

As for renaming the function, we can do a poll with different options.
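The naive regex approach mentioned in this reply could look roughly like the sketch below. It assumes Python's standard `re` module and an iterable of chunks standing in for the pipe; `recv_until_regex` is a hypothetical name, not a libdebug function.

```python
import re
from typing import Iterable, Optional


def recv_until_regex(
    chunks: Iterable[bytes], pattern: bytes
) -> tuple[Optional[re.Match], bytes]:
    """Accumulate chunks and rerun the regex on the whole buffer after each one."""
    compiled = re.compile(pattern)
    buffer = bytearray()
    for chunk in chunks:
        buffer += chunk
        # Rescans from the start every time: simple, but the work grows
        # with the amount of data accumulated so far.
        match = compiled.search(bytes(buffer))
        if match is not None:
            return match, bytes(buffer[: match.end()])
    return None, bytes(buffer)


match, consumed = recv_until_regex([b"flag{ab", b"c123}\n"], rb"flag\{\w+\}")
```

One caveat of this design: a greedy pattern may match as soon as its shortest form is present, so the result can depend on where the chunk boundaries fall.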

MrIndeciso · 2026-01-05T11:49:15Z

> I did consider the regex matching recv as an alternative, but matching it on a stream like in the case of the multiple patterns would require adding additional dependencies such as Google's RE2 (reimplementing a regex engine is probably outside the scope of libdebug). As an alternative, I can implement a naive approach that just reruns the regex matcher from the builtin python package on the overall buffer as input is accumulated, which would probably be slow. Still, if there's demand for it, we can consider both options.

I think doing it like pwntools does would be fine for now.

Frank01001 · 2026-01-05T13:10:10Z

> Did you try to benchmark the efficiency of Aho–Corasick with respect to other approaches? I am thinking about the really dumb approach of repeatedly calling bytes.find(pattern) for each pattern, or similar ones. I know what you’re thinking: the theoretical complexity of Aho–Corasick is very low. However, that complexity really manifests in cases of (i) large buffers and (ii) a high number of patterns.
>
> On the other hand, I expect only a few patterns (2 or 3?) and buffers smaller than a single read (4096 bytes) in most cases (and at worst only a few times larger than a single read). Since the implementation allocates a non-negligible number of Python objects, while other “dumb” solutions are C-backed, there might be some surprises, as we have seen in the past with other pattern-searching code in libdebug, where the dumb solution was more efficient than the state-of-the-art algorithm for similar reasons.
>
> I don’t really know the answer. I’m just suggesting that it might be worth running a few tests to ensure that the most common cases are prioritized.
>
> Also, remember we are in 2026… the copyright 😉
>
> PS. really cool stuff, btw

Yes, the naive approach appears to be faster on the test binary. I guess we could either stick to the base approach, or switch between the two using a threshold on the number of patterns or a flag that enables or disables Aho-Corasick. I don't have any large use cases in mind right now that I could readily benchmark.

kbrenner-dev · 2026-03-19T14:50:29Z

This would be really handy for CTF exploitation where the binary output changes depending on state or randomization. match_recvuntil with a list of patterns is much cleaner than wrapping everything in try/except blocks or layering timeouts. +1 on this, looking forward to seeing it land.

Frank01001 added 3 commits January 2, 2026 16:26

feat: match_recvuntil and match_recverruntil

c882123

Merge remote-tracking branch 'origin' into conditional-recv

e65468f

chore: added newsfragment for PR #296

c9440c5

Frank01001 self-assigned this Jan 2, 2026

Frank01001 requested a review from Copilot January 2, 2026 15:33

Frank01001 added the enhancement New feature or request label Jan 2, 2026

Copilot started reviewing on behalf of Frank01001 January 2, 2026 15:33 View session

Copilot AI reviewed Jan 2, 2026


fix: implemented requested fixes in Copilot review

e1613f6

Frank01001 requested a review from Copilot January 4, 2026 14:29

Copilot started reviewing on behalf of Frank01001 January 4, 2026 14:29 View session

Copilot AI reviewed Jan 4, 2026


docs: additional changes from second Copilot review

b2c5a6f

Frank01001 marked this pull request as ready for review January 4, 2026 15:20

Frank01001 requested review from MrIndeciso and io-no January 4, 2026 15:20

io-no reviewed Jan 4, 2026



5 participants
