Skip to content

Fix reply memory leak in retryable command retry loop#13

Open
groeneai wants to merge 1 commit into
ClickHouse:ClickHouse/2.3.0from
groeneai:fix-retryable-cmd-reply-leak
Open

Fix reply memory leak in retryable command retry loop#13
groeneai wants to merge 1 commit into
ClickHouse:ClickHouse/2.3.0from
groeneai:fix-retryable-cmd-reply-leak

Conversation

@groeneai

Copy link
Copy Markdown

Problem

_mongoc_retryable_cmd_run (src/libmongoc/src/mongoc/mongoc-retryable-cmd.c) reuses *reply across retry attempts. Per attempt:

  1. cmd->execute(...) populates reply (for a cursor read this ends in bson_copy_to(&body, reply) in _mongoc_cluster_run_opmsg_recv, allocating reply->own_buffer).
  2. The loop inspects reply for retryability/error labels.
  3. On a retry it calls cmd->select_retry_server(..., reply, ...) while reply still owns the previous attempt's allocation.

select_retry_server (cursor path: _retryable_cursor_commmand_select_retry_server -> mongoc_cluster_stream_for_reads -> _mongoc_cluster_stream_for_optype) treats reply as an out-param and, on its server-selection-failure path, overwrites it (bson_copy_to(&first_reply, reply) / bson_init(reply)) without freeing the prior contents. The loop's own bson_destroy(reply) sits after the if (!server_description) break;, so when selection fails it is never reached and the attempt's reply allocation is leaked.

Trigger: a retryable read error followed by a retry server-selection failure (a rare failover/topology race).

Fix

Destroy and re-init reply immediately before select_retry_server overwrites it, and drop the now-redundant trailing bson_destroy. This matches the existing bson_destroy; bson_init reset idiom in mongoc-cluster.c (OIDC reauth path). The reply-inspection reads all run before the destroy, so retry decisions are unchanged.

Verification

Observed in downstream (ClickHouse) CI as a LeakSanitizer 128-byte direct leak:

Direct leak of 128 byte(s) in 1 object(s) allocated from:
    bson_malloc           src/libbson/src/bson/memory.c:106
    bson_copy_to          src/libbson/src/bson/bson.c:2106
    _mongoc_cluster_run_opmsg_recv  src/libmongoc/src/mongoc/mongoc-cluster.c:3313
    ...
    _mongoc_retryable_cmd_run       src/libmongoc/src/mongoc/mongoc-retryable-cmd.c
    _mongoc_cursor_run_command / _prime / mongoc_cursor_next

Reproduced deterministically with a standalone AddressSanitizer/LeakSanitizer harness that drives the real _mongoc_retryable_cmd_run with mock execute/select_retry_server callbacks matching the ownership pattern above: the leak is reported without this change and is gone with it.

_mongoc_retryable_cmd_run reuses *reply across retry attempts. After an
attempt fails with a retryable error, the loop calls select_retry_server()
with reply while it still owns that attempt's allocation.

Of the three select_retry_server callbacks, only the cursor one
(_retryable_cursor_commmand_select_retry_server) forwards reply to
mongoc_cluster_stream_for_reads as a real out-param. On its
server-selection-failure path, _mongoc_cluster_stream_for_optype re-initializes
reply (bson_init / bson_copy_to) without freeing the prior contents. The loop's own
bson_destroy(reply) sits after the !server_description break, so it is not
reached when selection fails, and the attempt's reply is leaked. The read and
write callbacks (_retryable_read_select_retry_server,
_retryable_write_select_retry_server) ignore reply and pass NULL downstream, so
they never overwrite it and never leak.

Fix it where the bug is: reset reply (destroy + init) inside the cursor
callback, just before it hands reply to stream_for_reads. The retry loop is
left exactly as upstream, so the read and write paths keep returning the failed
first attempt's reply (with its errorLabels and server error fields) to the
caller, which they relied on before.

Observed in ClickHouse CI as a LeakSanitizer 128-byte direct leak:
bson_malloc <- bson_copy_to <- _mongoc_cluster_run_opmsg_recv <- ... <-
_mongoc_retryable_cmd_run, reached through a find cursor on a retryable read.
The trigger (retryable read error followed by a retry server-selection failure)
is a rare failover race.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@groeneai groeneai force-pushed the fix-retryable-cmd-reply-leak branch from 5aad6ab to 6b6b676 Compare June 14, 2026 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant