Skip to content

Redesign setup for the deterministic fixture#2304

Draft
Neverlord wants to merge 1 commit intomainfrom
issue/2095-fail-expect-on-skipped-messages
Draft

Redesign setup for the deterministic fixture#2304
Neverlord wants to merge 1 commit intomainfrom
issue/2095-fail-expect-on-skipped-messages

Conversation

@Neverlord
Copy link
Copy Markdown
Member

@Neverlord Neverlord commented Mar 24, 2026

Refactor the local_actor interface to allow the deterministic fixture to call consume directly instead of going through resume. This setup gives the fixture better control over actors and allows us to detect when an actor skips a message instead of processing.

Closes #2095.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request refactors the local_actor execution model to let the deterministic test fixture drive actors by calling consume directly (instead of going through resume), enabling the fixture to detect skipped messages and to manually dispatch scheduler jobs.

Changes:

  • Introduces local_actor::consume and updates actor implementations/callers (scheduled actor, blocking actor, net actor shell, Qt widget, IO servants/managers) to use it.
  • Redesigns the deterministic fixture internals (global mailbox + job queue) and adds APIs/tests for dispatching messages and scheduled jobs deterministically.
  • Adds supporting utilities and adjusts tests to validate skip behavior and the new fixture capabilities.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
libcaf_test/caf/test/fixture/deterministic.hpp Refactors the fixture API and internals (private data, dispatch APIs, evaluator refactor).
libcaf_test/caf/test/fixture/deterministic.cpp Implements global mailbox/job queue, direct consume-driven dispatch, and scheduler integration.
libcaf_test/caf/test/fixture/deterministic.test.cpp Adds coverage for manual job dispatch and skip-detection regression.
libcaf_core/caf/local_actor.hpp Introduces consume_result and the consume(mailbox_element_ptr&) interface.
libcaf_core/caf/scheduled_actor.hpp / .cpp Reworks message handling around consume, updates init/resume logic and metrics/finalization flow.
libcaf_core/caf/blocking_actor.hpp / .cpp Adapts blocking actors to the new consume interface and refactors receive logic.
libcaf_net/caf/net/abstract_actor_shell.hpp / .cpp Adds consume override and reuses it from the message loop.
libcaf_io/caf/io/* Updates broker servants and network managers to drive brokers via consume and ensures context is set.
libcaf_core/caf/mixin/actor_widget.hpp Switches Qt event delivery to set scheduler context + call consume.
libcaf_core/caf/intrusive_ptr_access.hpp / caf/intrusive_ptr.hpp Extracts intrusive pointer refcount policy into a reusable header.
libcaf_core/caf/detail/sync_request_bouncer.* Adds overload to bounce sync requests from mailbox_element_ptr.
libcaf_core/caf/detail/blocking_behavior.* / libcaf_core/CMakeLists.txt Removes the old blocking behavior wrapper implementation.
libcaf_core/caf/*.test.cpp Updates/extends tests for skip behavior and mailbox state expectations.
libcaf_core/caf/abstract_actor.hpp, caf/detail/behavior_stack.hpp Minor API/tidy-ups to support refactor (e.g., is_terminated, noexcept).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 24, 2026

Codecov Report

❌ Patch coverage is 65.92920% with 77 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.71%. Comparing base (c20ce0a) to head (8d035c3).

Files with missing lines Patch % Lines
libcaf_core/caf/scheduled_actor.cpp 61.62% 27 Missing and 6 partials ⚠️
libcaf_core/caf/blocking_actor.cpp 80.00% 7 Missing and 3 partials ⚠️
libcaf_net/caf/net/abstract_actor_shell.cpp 58.33% 7 Missing and 3 partials ⚠️
libcaf_io/caf/io/datagram_servant.cpp 0.00% 8 Missing ⚠️
libcaf_io/caf/io/scribe.cpp 25.00% 5 Missing and 1 partial ⚠️
libcaf_io/caf/io/broker_servant.hpp 75.00% 4 Missing and 1 partial ⚠️
libcaf_io/caf/io/network/manager.cpp 57.14% 3 Missing ⚠️
libcaf_core/caf/actor_system.cpp 0.00% 1 Missing ⚠️
libcaf_core/caf/detail/sync_request_bouncer.cpp 66.66% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2304      +/-   ##
==========================================
+ Coverage   72.63%   72.71%   +0.08%     
==========================================
  Files         638      637       -1     
  Lines       30884    30863      -21     
  Branches     3325     3323       -2     
==========================================
+ Hits        22432    22443      +11     
+ Misses       6566     6535      -31     
+ Partials     1886     1885       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Refactor the `local_actor` interface to allow the deterministic fixture
to call `consume` directly instead of going through `resume`. This setup
gives the fixture better control over actors and allows us to detect
when an actor skips a message instead of processing.
@Neverlord Neverlord force-pushed the issue/2095-fail-expect-on-skipped-messages branch from 8861351 to 8d035c3 Compare March 24, 2026 15:22
@Neverlord Neverlord requested a review from Copilot March 24, 2026 15:41
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// Hence, we need to release one reference count here.
intrusive_ptr_release(ptr);
void schedule(resumable* ptr, uint64_t event_id) override {
global_->entries.emplace_back(intrusive_ptr{ptr, add_ref}, event_id);
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deterministic_scheduler::schedule stores resumable* as intrusive_ptr{ptr, add_ref} without compensating for CAF's existing ref-count bump before scheduling. This diverges from the scheduler contract (callers typically bump once and the scheduler releases once) and will leak a reference for well-behaved callers (e.g., code that calls ref_resumable()/intrusive_ptr_add_ref before schedule/delay). Consider storing the pointer with adopt_ref (or storing the raw pointer and releasing it after resume) so the scheduler owns exactly the caller-provided ref and releases it when the job is dispatched.

Suggested change
global_->entries.emplace_back(intrusive_ptr{ptr, add_ref}, event_id);
global_->entries.emplace_back(intrusive_ptr{ptr, adopt_ref}, event_id);

Copilot uses AI. Check for mistakes.
*dispatched = true;
});
WHEN("scheduling the job") {
sys.scheduler().schedule(job.as_intrusive_ptr().get(), 0);
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test schedules a resumable via sys.scheduler().schedule(job.as_intrusive_ptr().get(), 0) without first transferring an extra reference to the scheduler. CAF schedulers generally require the caller to bump the resumable ref-count before scheduling because the scheduler releases a ref after running the job. Once deterministic_scheduler::schedule follows that contract, this test would become a use-after-free unless it adds the required ref increment before calling schedule.

Suggested change
sys.scheduler().schedule(job.as_intrusive_ptr().get(), 0);
auto iptr = job.as_intrusive_ptr();
iptr->ref();
sys.scheduler().schedule(iptr.get(), 0);

Copilot uses AI. Check for mistakes.
bool deterministic::evaluator_base::dry_run(const strong_actor_ptr& dst) {
if (auto* msg = fix_->private_data_->messages->peek_one(dst)) {
const auto& with_pred = get_message_predicate();
return from_(msg->sender) && !with_pred(msg->payload);
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dry_run is used by the disallow algorithm to detect whether the next mailbox element matches the configured predicates. However, it currently returns from_(sender) && !with_pred(payload), which triggers failures when the sender matches but the payload does not. This inverts the intended meaning for disallow and will miss actually disallowed messages that match the payload predicate. dry_run should return true when the message matches both sender and payload predicates.

Suggested change
return from_(msg->sender) && !with_pred(msg->payload);
return from_(msg->sender) && with_pred(msg->payload);

Copilot uses AI. Check for mistakes.
Comment on lines +845 to +989
scheduled_actor::consume_result
scheduled_actor::consume(mailbox_element_ptr& what) {
CAF_ASSERT(context_ != nullptr);
CAF_ASSERT(what != nullptr);
auto& x = *what;
auto lg = log::core::trace("x = {}", x);
current_element_ = &x;
CAF_LOG_RECEIVE_EVENT(current_element_);
// Wrap the actual body for the function.
// Returns `true` if the message was consumed, `false` when it was skipped.
auto body = [this, &x] {
// Helper function for dispatching a message to a response handler.
using ptr_t = scheduled_actor*;
using fun_t = bool (*)(ptr_t, behavior&, mailbox_element&);
auto ordinary_invoke = [](ptr_t, behavior& f, mailbox_element& in) -> bool {
return f(in.content()) != std::nullopt;
};
auto select_invoke_fun = [&]() -> fun_t { return ordinary_invoke; };
// Short-circuit awaited responses.
// Awaited responses have the highest priority and block all other messages.
if (!awaited_responses_.empty()) {
auto invoke = select_invoke_fun();
auto& pr = awaited_responses_.front();
// Skip all other messages until we receive the currently awaited response
// except for internal (system) messages.
if (x.mid != std::get<0>(pr)) {
// Responses are never internal messages and must be skipped here.
// Otherwise, an error to a response would run into the default handler.
if (x.mid.is_response())
return invoke_message_result::skipped;
if (categorize(x) == message_category::internal) {
log::core::debug("handled system message");
return invoke_message_result::consumed;
return true;
}
return invoke_message_result::skipped;
CAF_LOG_SKIP_EVENT();
return false;
}
auto f = std::move(std::get<1>(pr));
std::get<2>(pr).dispose(); // Stop the timeout.
awaited_responses_.pop_front();
if (!invoke(this, f, x)) {
// try again with error if first attempt failed
if (f(x.payload) == std::nullopt) {
// Invoke again with error if first attempt failed.
auto msg = make_message(
make_error(sec::unexpected_response, std::move(x.payload)));
f(msg);
}
return invoke_message_result::consumed;
return true;
}
// Handle multiplexed responses.
if (x.mid.is_response()) {
auto invoke = select_invoke_fun();
auto mrh = multiplexed_responses_.find(x.mid);
// neither awaited nor multiplexed, probably an expired timeout
if (mrh == multiplexed_responses_.end())
return invoke_message_result::dropped;
auto bhvr = std::move(mrh->second.first);
mrh->second.second.dispose(); // Stop the timeout.
multiplexed_responses_.erase(mrh);
if (!invoke(this, bhvr, x)) {
auto iter = multiplexed_responses_.find(x.mid);
if (iter == multiplexed_responses_.end()) {
// Neither awaited nor multiplexed, probably an expired timeout -> drop.
return true;
}
auto bhvr = std::move(iter->second.first);
iter->second.second.dispose(); // Stop the timeout.
multiplexed_responses_.erase(iter);
if (bhvr(x.payload) == std::nullopt) {
// Invoke again with error if first attempt failed.
log::core::debug("got unexpected_response");
auto msg = make_message(
make_error(sec::unexpected_response, std::move(x.payload)));
bhvr(msg);
}
return invoke_message_result::consumed;
return true;
}
// Dispatch on the content of x.
switch (categorize(x)) {
case message_category::skipped:
return invoke_message_result::skipped;
log::core::debug("skipped message");
CAF_LOG_SKIP_EVENT();
return false;
case message_category::internal:
log::core::debug("handled system message");
return invoke_message_result::consumed;
return true;
case message_category::ordinary: {
// Try to process the message with the current behavior.
detail::default_invoke_result_visitor<scheduled_actor> visitor{this};
if (!bhvr_stack_.empty()) {
auto& bhvr = bhvr_stack_.back();
if (bhvr(visitor, x.payload))
return invoke_message_result::consumed;
if (bhvr(visitor, x.payload)) {
return true;
}
}
// Fall back to default handlers for system messages.
if (x.payload.size() == 1) {
switch (x.payload.type_at(0)) {
default:
break;
case type_id_v<exit_msg>: {
auto& em = x.payload.get_mutable_as<exit_msg>(0);
call_handler(exit_handler_, this, em);
return invoke_message_result::consumed;
return true;
}
case type_id_v<down_msg>: {
auto& dm = x.payload.get_mutable_as<down_msg>(0);
call_handler(down_handler_, this, dm);
return invoke_message_result::consumed;
return true;
}
case type_id_v<node_down_msg>: {
auto& dm = x.payload.get_mutable_as<node_down_msg>(0);
call_handler(node_down_handler_, this, dm);
return invoke_message_result::consumed;
return true;
}
case type_id_v<error>: {
auto& err = x.payload.get_mutable_as<error>(0);
call_handler(error_handler_, this, err);
return invoke_message_result::consumed;
return true;
}
}
}
// No match -> call the default handler.
auto sres = call_handler(default_handler_, this, x.payload);
auto f = detail::make_overload(
[&](auto& x) {
visitor(x);
return invoke_message_result::consumed;
},
[&](skip_t&) { return invoke_message_result::skipped; });
auto f = [&visitor]<class T>(T& res) {
if constexpr (std::is_same_v<T, skip_t>) {
return false;
} else {
visitor(res);
return true;
}
};
return visit(f, sres);
}
}
// Should be unreachable.
detail::critical("categorize() returned an invalid message category");
};
// Post-process the returned value from the function body.
auto result = body();
CAF_LOG_SKIP_OR_FINALIZE_EVENT(result);
return result;
}

/// Tries to consume `x`.
void scheduled_actor::consume(mailbox_element_ptr x) {
switch (consume(*x)) {
default:
break;
case invoke_message_result::skipped:
push_to_cache(std::move(x));
}
}

bool scheduled_actor::activate(scheduler* sched) {
auto lg = log::core::trace("");
CAF_ASSERT(sched != nullptr);
CAF_ASSERT(!getf(is_blocking_flag));
context(sched);
if (getf(is_initialized_flag) && !alive()) {
log::system::warning("activate called on a terminated actor "
"with id {} and name {}",
id(), name());
return false;
}
#ifdef CAF_ENABLE_EXCEPTIONS
try {
#endif // CAF_ENABLE_EXCEPTIONS
if (!getf(is_initialized_flag)) {
if (!initialize(sched)) {
log::core::debug("actor terminated immediately after initialization");
return false;
if (metrics_.mailbox_time) {
// Note: mailbox_time, mailbox_size and processed_messages must be all
// enabled together or not at all.
CAF_ASSERT(metrics_.mailbox_size != nullptr);
CAF_ASSERT(metrics_.processed_messages != nullptr);
auto t0 = std::chrono::steady_clock::now();
auto mbox_time = x.seconds_until(t0);
if (body()) {
set_receive_timeout();
auto terminated = finalize();
telemetry::timer::observe(metrics_.processing_time, t0);
metrics_.mailbox_time->observe(mbox_time);
metrics_.mailbox_size->dec();
metrics_.processed_messages->inc();
if (terminated) {
return consume_result::terminated;
}
CAF_ASSERT(getf(is_initialized_flag));
log::core::debug("initialized actor: name = {}", name());
return consume_result::consumed;
}
#ifdef CAF_ENABLE_EXCEPTIONS
} catch (...) {
log::core::debug("failed to initialize actor due to an exception");
auto eptr = std::current_exception();
quit(call_handler(exception_handler_, this, eptr));
finalize();
return false;
return consume_result::skipped;
}
#endif // CAF_ENABLE_EXCEPTIONS
return true;
}

auto scheduled_actor::activate(scheduler* sched, mailbox_element& x)
-> activation_result {
auto lg = log::core::trace("x = {}", x);
if (!activate(sched))
return activation_result::terminated;
auto res = reactivate(x);
if (res == activation_result::success)
if (body()) {
set_receive_timeout();
return res;
}

auto scheduled_actor::reactivate(mailbox_element& x) -> activation_result {
auto lg = log::core::trace("x = {}", x);
#ifdef CAF_ENABLE_EXCEPTIONS
auto handle_exception = [&](std::exception_ptr eptr) {
auto err = call_handler(exception_handler_, this, eptr);
auto rp = make_response_promise();
rp.deliver(err);
quit(std::move(err));
};
try {
#endif // CAF_ENABLE_EXCEPTIONS
switch (consume(x)) {
case invoke_message_result::dropped:
return activation_result::dropped;
case invoke_message_result::consumed:
bhvr_stack_.cleanup();
if (finalize()) {
log::core::debug("actor finalized");
return activation_result::terminated;
}
return activation_result::success;
case invoke_message_result::skipped:
return activation_result::skipped;
if (finalize()) {
return consume_result::terminated;
}
#ifdef CAF_ENABLE_EXCEPTIONS
} catch (std::exception& e) {
log::core::info("actor died because of an exception, what: {}", e.what());
static_cast<void>(e); // keep compiler happy when not logging
handle_exception(std::current_exception());
} catch (...) {
log::core::info("actor died because of an unknown exception");
handle_exception(std::current_exception());
return consume_result::consumed;
}
finalize();
#endif // CAF_ENABLE_EXCEPTIONS
return activation_result::terminated;
return consume_result::skipped;
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scheduled_actor::consume logs CAF_LOG_RECEIVE_EVENT and logs skip events, but it no longer logs a finalize event for successfully processed messages (previously done via CAF_LOG_SKIP_OR_FINALIZE_EVENT). This creates gaps in trace logging for consumed/dropped messages under CAF_ENABLE_TRACE_LOGGING. Consider calling CAF_LOG_FINALIZE_EVENT() on the paths where body() returns true (including the multiplexed-response "drop" case that currently returns true).

Copilot uses AI. Check for mistakes.
Comment on lines +262 to +266
switch (consume(ptr)) {
case consume_result::consumed:
unstash();
consumed_msg = true;
break;
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking_actor::receive_impl logs skip events but no longer emits CAF_LOG_FINALIZE_EVENT() when a message gets consumed (the old code did). This will lead to missing "FINALIZE" events in trace logging and makes the logging behavior inconsistent with other actor types. Consider logging finalize on the consume_result::consumed path (and on the mid.is_response unexpected-response fallback when that path treats the message as consumed).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

expect should fail if actor leaves message in mailbox

2 participants