Skip to content

Reduce memory copies when processing cloud requests#22493

Draft
stelfrag wants to merge 9 commits into
netdata:masterfrom
stelfrag:aclk_eliminate_buffer_copies
Draft

Reduce memory copies when processing cloud requests#22493
stelfrag wants to merge 9 commits into
netdata:masterfrom
stelfrag:aclk_eliminate_buffer_copies

Conversation

@stelfrag
Copy link
Copy Markdown
Collaborator

@stelfrag stelfrag commented May 16, 2026

Summary
  • Previously the processing used (web_client -> local_buffer -> full_msg).
  • Add aclk_http_msg_v2_direct() which assembles JSON header + separator + HTTP headers + body into a single buffer and transfers ownership to MQTT via the existing freefnc zero-copy path.
  • Drops body copies from 3 to 1, and peak coexistent copies from 3 to 2.
  • The error paths remain the same

Summary by cubic

Reduce memory copies in HTTP API v2 publishing by building the payload once and handing it to MQTT via a zero-copy path. Adds overflow checks, centralized MQTT error handling/ownership, and fixes publish return handling with no on-wire changes.

  • Refactors
    • Add aclk_http_msg_v2_direct(...) to assemble JSON header + separator + HTTP headers + body into one buffer and publish with a free callback.
    • Update http_api_v2() to drop local_buffer; publish from header_output + data.
    • Validate inputs and sizes: require full topic; normalize NULL headers/body lengths to 0; prevent size overflows; handle allocation failures (oversize → 413).
    • Serialize JSON using json_object_to_json_string_length(...) to get length during serialization.
    • Centralize MQTT publish error handling and ownership: normalize return codes; always free payload on non-OK in mqtt_wss_publish5; remove premature frees in mqtt_ng_publish; consolidate free callbacks into freez_aclk_publish_msg; use bool for payload checks; document the single-free invariant to prevent double-free.
    • Initialize packet_id to 0 before publish to avoid uninitialized use.

Written for commit 24c77b0. Summary will update on new commits. Review in cubic

  (web_client -> local_buffer -> full_msg). Add aclk_http_msg_v2_direct()
  which assembles JSON header + separator + HTTP headers + body into a
  single buffer and transfers ownership to MQTT via the existing freefnc
  zero-copy path. Drops body copies from 3 to 1, and peak coexistent
  copies from 3 to 2. No on-wire protocol change; error paths untouched.
@stelfrag stelfrag requested a review from thiagoftsm May 16, 2026 21:27
@stelfrag stelfrag marked this pull request as ready for review May 16, 2026 21:27
Copilot AI review requested due to automatic review settings May 16, 2026 21:27
@stelfrag stelfrag marked this pull request as draft May 16, 2026 21:27
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces memory copies in ACLK HTTP API v2 responses by assembling the MQTT payload in a single buffer (JSON header + separator + HTTP headers + body) and publishing it via the existing zero-copy free-callback path.

Changes:

  • Added aclk_http_msg_v2_direct() to build and publish the full v2 HTTP response payload in one allocation.
  • Updated http_api_v2() to publish directly from header_output and data, removing the intermediate local_buffer.
  • Exposed the new direct-send API in aclk_tx_msgs.h.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/aclk/aclk_tx_msgs.h Exposes the new aclk_http_msg_v2_direct() API.
src/aclk/aclk_tx_msgs.c Implements direct single-buffer assembly + publish for v2 HTTP messages.
src/aclk/aclk_query.c Switches v2 query replies to the direct publish path (removes local_buffer).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/aclk/aclk_tx_msgs.c Outdated
Comment thread src/aclk/aclk_tx_msgs.c
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 3 files

Confidence score: 3/5

  • There is a concrete reliability risk in src/aclk/aclk_tx_msgs.c: the invalid-topic error path can call aclk_http_msg_v2_err() with a NULL payload, and because that helper logs payload with %s, error reporting itself may fail or crash while handling the original problem.
  • A protocol/format regression risk is present in src/aclk/aclk_tx_msgs.c: appending V2_BIN_PAYLOAD_SEPARATOR even when both headers and body are empty diverges from the existing aclk_http_msg_v2 behavior and could break compatibility with consumers expecting the prior framing.
  • Given two medium-severity, fairly high-confidence issues in the same message-construction area, this carries some merge risk until those paths are aligned and guarded.
  • Pay close attention to src/aclk/aclk_tx_msgs.c - NULL payload handling and empty-payload separator behavior can cause runtime failures and message-format inconsistencies.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/aclk/aclk_tx_msgs.c">

<violation number="1" location="src/aclk/aclk_tx_msgs.c:210">
P2: Avoid calling `aclk_http_msg_v2_err()` with a NULL payload on the invalid-topic path; that helper logs the payload through `%s`, so this error path can fail while reporting the original problem.</violation>

<violation number="2" location="src/aclk/aclk_tx_msgs.c:239">
P2: The separator (`V2_BIN_PAYLOAD_SEPARATOR`) is unconditionally appended even when both `http_headers_len` and `body_len` are zero. The existing `aclk_http_msg_v2` path (via `aclk_send_message_with_bin_payload`) omits the separator when there is no binary payload, so this changes the on-wire format for empty-payload messages. Guard the separator with `if (http_headers_len || body_len)` to stay consistent with the existing protocol behavior.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Re-trigger cubic

Comment thread src/aclk/aclk_tx_msgs.c Outdated
Comment thread src/aclk/aclk_tx_msgs.c Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

@stelfrag stelfrag requested a review from Copilot May 17, 2026 19:47
@stelfrag
Copy link
Copy Markdown
Collaborator Author

@cubic-dev-ai review PR

@cubic-dev-ai
Copy link
Copy Markdown
Contributor

cubic-dev-ai Bot commented May 17, 2026

@cubic-dev-ai review PR

@stelfrag I have started the AI code review. It will take a few minutes to complete.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 3 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.
Architecture diagram
sequenceDiagram
    participant WebClient as Web Client
    participant API as http_api_v2()
    participant MsgV2 as aclk_http_msg_v2_direct()
    participant MQTT as MQTT WSS Client
    participant Memory as Memory Allocator

    Note over WebClient,Memory: Cloud Request Processing Flow

    WebClient->>API: HTTP request
    API->>API: Process API, build response
    API->>API: web_client_build_http_header()

    Note over API,MsgV2: NEW: Direct buffer assembly (no local_buffer)

    API->>MsgV2: aclk_http_msg_v2_direct() with headers + body pointers

    alt Invalid topic (missing or not starting with '/')
        MsgV2->>MsgV2: Return 500 via aclk_http_msg_v2_err
    else Valid topic
        MsgV2->>MsgV2: Create JSON header object
        MsgV2->>MsgV2: Serialize JSON header (get length during serialization)
        MsgV2->>Memory: mallocz(total_len) one buffer
        MsgV2->>MsgV2: memcpy: JSON header
        opt Has payload (headers or body)
            MsgV2->>MsgV2: memcpy: V2_BIN_PAYLOAD_SEPARATOR
            MsgV2->>MsgV2: memcpy: HTTP headers (if present)
            MsgV2->>MsgV2: memcpy: body (if present)
        end
        MsgV2->>MQTT: mqtt_wss_publish5 with free callback (zero-copy)

        alt MQTT_ERR_MSG_TOO_BIG
            MQTT-->>MsgV2: Error
            MsgV2->>MsgV2: Send 413 via aclk_http_msg_v2_err
            MsgV2-->>API: Return HTTP_RESP_CONTENT_TOO_LONG
        else Success
            MQTT-->>MsgV2: Published
            MsgV2-->>API: Return http_code
        end
    end

    API-->>WebClient: HTTP response code
    
    Note over Memory: Single allocation,<br/>ownership transferred to MQTT<br/>free'd by freez_aclk_publish5c callback
Loading

Re-trigger cubic

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread src/aclk/aclk_tx_msgs.c
Comment thread src/aclk/aclk_tx_msgs.c Outdated
Comment thread src/aclk/aclk_tx_msgs.c
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comment thread src/aclk/mqtt_websockets/mqtt_wss_client.c Outdated
Comment thread src/aclk/aclk_tx_msgs.c Outdated
Comment thread src/aclk/aclk_tx_msgs.c Outdated
stelfrag added 2 commits May 18, 2026 09:14
- Correct the return code check for `mqtt_ng_publish` in MQTT WebSocket client.
- Consolidate redundant memory free functions (`freez_aclk_publish5{a,b,c}`) into `freez_aclk_publish_msg`.
- Replace `int` with `bool` for clearer payload existence checks.
- Minor cleanup and improvements in message assembly and publishing paths.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment thread src/aclk/mqtt_websockets/mqtt_wss_client.c
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (2)

src/aclk/aclk_tx_msgs.c:46

  • mqtt_wss_publish5() may take ownership of msg and can free it immediately on failure; this function then calls protomsg_to_json(msg, ...) for logging, which can become a use-after-free (and on success it still risks a race if the TX thread frees the payload before logging). Generate the JSON/loggable representation before calling mqtt_wss_publish5(), or log from a copy that remains owned by this function.
    mqtt_wss_publish5(client, (char *)topic, NULL, msg, &freez_aclk_publish_msg, msg_len, MQTT_WSS_PUB_QOS1, &packet_id);

    if (aclklog_enabled) {
        char *json = protomsg_to_json(msg, msg_len, msgname);
        log_aclk_message_bin(json, strlen(json), 1, topic, msgname);

src/aclk/aclk_tx_msgs.c:273

  • On publish failure (rc != MQTT_WSS_OK), the code reports CLOUD_EC_SND_TIMEOUT / "Timeout sending binpacked message", but mqtt_wss_publish5() can fail for non-timeout reasons (e.g., offline/disconnecting, message generation OOM). Consider mapping rc to a more accurate cloud error code/message (or at least adjusting the message) so cloud-side diagnostics aren't misleading.
    if (rc != MQTT_WSS_OK) {
        aclk_http_msg_v2_err(client, topic, msg_id, HTTP_RESP_INTERNAL_SERVER_ERROR, CLOUD_EC_SND_TIMEOUT, CLOUD_EMSG_SND_TIMEOUT, NULL, 0);
        return HTTP_RESP_INTERNAL_SERVER_ERROR;

Comment thread src/aclk/aclk_tx_msgs.c
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment thread src/aclk/aclk_tx_msgs.c
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (2)

src/aclk/mqtt_websockets/mqtt_wss_client.c:878

  • When mqtt_ng_publish() returns MQTT_NG_MSGGEN_MSG_TOO_BIG, it fails before it can attach the topic to any buffer fragment (and therefore cannot call topic_free). If callers ever pass a dynamically allocated topic with a non-NULL topic_free, this path will leak it. Consider freeing the topic when rc == MQTT_NG_MSGGEN_MSG_TOO_BIG (similar to how msg_free is handled).
        if (msg_free)
            msg_free(msg);
        if (rc == MQTT_NG_MSGGEN_MSG_TOO_BIG)
            return MQTT_WSS_ERR_MSG_TOO_BIG;
        return rc;

src/aclk/aclk_tx_msgs.c:261

  • Similarly, body_len > 0 triggers memcpy() without validating that body is non-NULL. Please add a guard/contract check so invalid (body == NULL, body_len > 0) inputs can’t crash the agent.
    if (body_len)
        memcpy(raw + pos, body, body_len);

Comment thread src/aclk/mqtt_websockets/mqtt_wss_client.c
Comment thread src/aclk/aclk_tx_msgs.c
…r respective lengths to 0 to prevent mismatched allocation sizes and maintain consistency during message composition.
@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/aclk/aclk_tx_msgs.c:89

  • aclk_send_message_with_bin_payload() ignores mqtt_wss_publish5() failures other than MQTT_WSS_ERR_MSG_TOO_BIG and always returns 0 on other errors. This can report success to higher layers even when the publish failed (and the payload has already been freed by the publish path). Consider treating rc != MQTT_WSS_OK as an error and propagating an appropriate HTTP error code (or returning rc and letting the caller map it).
    int rc = mqtt_wss_publish5(client, (char*)topic, NULL, full_msg, &freez_aclk_publish_msg, full_msg_len, MQTT_WSS_PUB_QOS1, &packet_id);

    if (rc == MQTT_WSS_ERR_MSG_TOO_BIG)
        return HTTP_RESP_CONTENT_TOO_LONG;

    return 0;

Comment thread src/aclk/aclk_tx_msgs.c
Comment thread src/aclk/mqtt_websockets/mqtt_wss_client.c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants