Fix memory leak in mongo-c-driver retryable command retry loop by groeneai · Pull Request #107448 · ClickHouse/ClickHouse

groeneai · 2026-06-14T19:33:09Z

Related: ClickHouse/mongo-c-driver#13
Related: #107429

Changelog category (leave one):

Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixed a memory leak in the bundled mongo-c-driver that could occur when reading from MongoDB (for example via a MongoDB dictionary or the mongodb table function) if a retryable read error was followed by a failed retry server selection.

Description

Bumps the mongo-c-driver submodule by one commit to fix a reply memory leak in the retryable command retry loop.

_mongoc_retryable_cmd_run reuses *reply across retry attempts. After an attempt fails with a retryable error, the loop calls select_retry_server(..., reply, ...) while reply still owns that attempt's allocation. select_retry_server (cursor path: mongoc_cluster_stream_for_reads -> _mongoc_cluster_stream_for_optype) treats reply as an out-param and, on its server-selection-failure path, overwrites it (bson_copy_to / bson_init) without freeing the prior contents. The loop's own bson_destroy(reply) sits after the if (!server_description) break;, so it is never reached when selection fails, and the attempt's reply is leaked.

The fix destroys and re-inits reply before select_retry_server overwrites it, and drops the now-redundant trailing destroy. See ClickHouse/mongo-c-driver#13 for the driver change.

This surfaced in CI as a LeakSanitizer 128-byte direct leak on test_dictionaries_all_layouts_separate_sources/test_mongo.py::test_simple_ssl (Integration tests (amd_asan_ubsan, ...)):

Direct leak of 128 byte(s) in 1 object(s) allocated from:
    bson_malloc                     contrib/mongo-c-driver/src/libbson/src/bson/memory.c:106
    bson_copy_to                    contrib/mongo-c-driver/src/libbson/src/bson/bson.c:2106
    _mongoc_cluster_run_opmsg_recv  contrib/mongo-c-driver/src/libmongoc/src/mongoc/mongoc-cluster.c:3313
    mongoc_cluster_run_opmsg / run_command_monitored / mongoc_cluster_run_command_monitored
    _retryable_cursor_command_execute / _mongoc_retryable_cmd_run
    _mongoc_cursor_run_command / _mongoc_cursor_response_refresh / _prime / mongoc_cursor_next
    DB::MongoDBSource::generate()   src/Processors/Sources/MongoDBSource.cpp:215

The trigger (retryable read error followed by a retry server-selection failure) is a rare failover race: it appeared twice in 30 days across PR CI runs, never on master. Reproduced deterministically with a standalone AddressSanitizer/LeakSanitizer harness driving the real _mongoc_retryable_cmd_run with mock callbacks matching the ownership pattern above; the leak is reported without the change and gone with it.

Found by @alexey-milovidov on #107429 and fixed here on his request.

Report: https://s3.amazonaws.com/clickhouse-test-reports/107429/index.html (Integration tests, amd_asan_ubsan, db disk, old analyzer, 3/6)

groeneai · 2026-06-14T19:33:31Z

Pre-PR validation gate

#	Question	Answer
a	Deterministic repro?	Yes. A standalone ASan/LeakSanitizer harness drives the real `_mongoc_retryable_cmd_run` with mock `execute`/`select_retry_server` callbacks matching the ownership pattern. The leak is reported 100% of runs without the fix; clean with it.
b	Root cause explained?	Yes. The retry loop calls `select_retry_server(..., reply, ...)` while `reply` still owns the previous attempt's `bson_copy_to` allocation. `select_retry_server` (`stream_for_reads` -> `_mongoc_cluster_stream_for_optype`) overwrites `reply` as an out-param on its selection-failure path without freeing it, and the loop's `bson_destroy(reply)` sits after the `!server_description` break, so it is never reached. The attempt's `reply` leaks.
c	Fix matches root cause?	Yes. `bson_destroy(reply); bson_init(reply);` is moved to just before `select_retry_server` overwrites it (the existing reset idiom in `mongoc-cluster.c` OIDC reauth path), and the now-redundant trailing destroy is removed. It targets the lifetime bug directly, not a symptom.
d	Test intent preserved / new tests added?	The ASan/LSan harness is the regression proof. Note: ClickHouse does not build the `mongo-c-driver` test suite (only `test.util` phony targets), so a driver-internal C test would never run in ClickHouse CI; the leak is instead exercised by the existing `test_mongo.py` integration tests under sanitizers.
e	Both directions demonstrated?	Yes. Without fix: `LeakSanitizer: detected memory leaks ... bson_malloc <- _mongoc_retryable_cmd_run`, exit 1. With fix: no leak, exit 0. Same allocation site as the CI report.
f	Fix is general, not a narrow patch?	Yes. The fix is at the retry loop, the single owner of the cross-attempt `reply` lifetime, not a guard at the crash site. Sibling out-param writers (`_handle_network_error`, `_mongoc_cluster_stream_for_optype`) all assume a caller-owned empty `reply`; the retry loop was the one place violating that contract.

Session id: cron:clickhouse-worker-slot-2:20260614-184800

groeneai · 2026-06-14T19:38:22Z

cc @alexey-milovidov @rschu1ze — could you review this? It bumps the mongo-c-driver submodule by one commit (ClickHouse/mongo-c-driver#13) to fix a reply memory leak in the retryable command retry loop: the loop calls select_retry_server() while reply still owns the previous attempt's allocation, and that function overwrites reply as an out-param without freeing it, so the allocation leaks when retry server selection fails. This is the test_simple_ssl LeakSanitizer failure from #107429.

clickhouse-gh · 2026-06-14T19:38:40Z

Workflow [PR], commit [59168d5]

AI Review

Summary

This PR bumps contrib/mongo-c-driver by one commit to fix a bson_t reply leak in _mongoc_retryable_cmd_run. The leak fix targets a real ownership bug, but the current generic reset also changes retryable-read error semantics for command helpers that rely on preserving the first failed server reply when retry server selection itself fails.

Missing context / blind spots

⚠️ The direct PR CI report currently reports Total: 0 test results, so completed PR CI is not available as validation evidence in this review.

Findings

⚠️ Majors
- [contrib/mongo-c-driver:1] The submodule bump moves src/libmongoc/src/mongoc/mongoc-retryable-cmd.c to a version that destroys and reinitializes reply before every select_retry_server callback. That violates the retryable-read command path's behavior of returning the original retryable error when retrying is impossible: _retryable_read_select_retry_server ignores reply and calls mongoc_cluster_stream_for_reads(..., NULL, NULL), so on selection failure mongoc_client_read_command_with_opts now returns {} instead of the first failed command's server reply, losing errorLabels and server error fields.
- Suggested fix: keep the reset local to callbacks that actually pass reply as an out-param, such as _retryable_cursor_commmand_select_retry_server, or otherwise preserve the previous reply until a select callback replaces it.

Tests

⚠️ Add focused regression coverage for a retryable read command failure followed by retry server selection failure, asserting the caller-visible reply still contains the original server error document.

Final Verdict

Status: ⚠️ Request changes

Minimum required actions: preserve the original reply for retry-server callbacks that do not overwrite it, and add focused coverage for that error path or point to automated coverage that would fail on the regression above.

clickhouse-gh · 2026-06-14T19:51:25Z

@@ -1 +1 @@
-Subproject commit a26c8e251fec147258e89b7e85e9adfa259fc42f
+Subproject commit 5aad6ab39aea8a2aa425404bc3d82bbc6fe581cd


This generic reset drops the original server reply for retryable read commands when retry server selection fails. _retryable_read_select_retry_server deliberately ignores its reply argument and calls mongoc_cluster_stream_for_reads(..., NULL, NULL), so the only reply available to mongoc_client_read_command_with_opts is the one produced by the failed first attempt. Before this submodule commit, a selection failure took the !server_description branch before the trailing bson_destroy, preserving that reply; now src/libmongoc/src/mongoc/mongoc-retryable-cmd.c:106 replaces it with {} before the callback can fail. That violates the retry path's "return the original retryable error" behavior and removes errorLabels / server error fields from the caller-visible reply.

Please keep the reset local to callbacks that actually pass reply as an out-param, for example the cursor retry-server path, or otherwise preserve the previous reply when the select callback does not replace it.

You're right, thanks. I confirmed it against the driver source.

Of the three select_retry_server callbacks, only the cursor one passes reply as a real out-param:

_retryable_read_select_retry_server (client.c): BSON_UNUSED(reply), calls mongoc_cluster_stream_for_reads(..., NULL /* reply */, NULL).

_retryable_write_select_retry_server (cluster.c): BSON_UNUSED(reply), calls mongoc_cluster_stream_for_writes(..., NULL /* reply */, NULL).

_retryable_cursor_commmand_select_retry_server (cursor.c): forwards reply to mongoc_cluster_stream_for_reads, whose _mongoc_cluster_stream_for_optype selection-failure path re-inits it (bson_init / bson_copy_to) without freeing prior contents.

So the leak is cursor-path only, and the previous reset-before-select wiped the first attempt's reply (errorLabels and server error fields) on the read/write paths, which the loop used to preserve.

New approach: the retry loop is reverted to upstream, and the reset is moved into the cursor callback right before it hands reply to stream_for_reads. Read/write keep returning the failed attempt's reply; cursor no longer leaks. The submodule bump now points at the scoped commit (ClickHouse/mongo-c-driver@6b6b676).

Verified with a standalone ASan/LSan harness driving the loop's reply lifecycle with real libbson, one scenario per process:

orig/read leak=0 errorLabels=1 (baseline) orig/cursor leak=1 errorLabels=0 <- the leak (256B) v1/read leak=0 errorLabels=0 <- the regression you flagged scoped/read leak=0 errorLabels=1 <- fix: errorLabels preserved scoped/cursor leak=0 errorLabels=0 <- fix: leak gone

rschu1ze · 2026-06-14T20:29:24Z

@groeneai Send the fix first upstream before integrating it in ClickHouse.

…ry loop Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

groeneai · 2026-06-14T20:59:48Z

Pre-PR validation gate (updated for the scoped fix, commit `6d09e59`)

Revised after @clickhouse-gh[bot]'s review. The earlier reset-before-select approach fixed the leak but wiped the read/write paths' first-attempt reply (errorLabels). This version moves the reset into the cursor callback only.

#	Question	Answer
a	Deterministic repro?	Yes. A standalone ASan/LSan harness drives the `_mongoc_retryable_cmd_run` reply lifecycle with real libbson, one scenario per process. `orig/cursor` leaks 256 bytes 100% of runs; `v1/read` loses errorLabels 100% of runs.
b	Root cause explained?	Yes. The loop reuses `*reply` across attempts. Only the cursor `select_retry_server` callback forwards `reply` to `mongoc_cluster_stream_for_reads`; on its selection-failure path `_mongoc_cluster_stream_for_optype` re-inits `reply` (bson_init/bson_copy_to) without freeing prior contents, and the loop's trailing `bson_destroy(reply)` sits after the `!server_description` break, so the attempt's reply leaks. The read and write callbacks `BSON_UNUSED(reply)` and pass NULL downstream, so they never overwrite it.
c	Fix matches root cause?	Yes. The reset (`bson_destroy(reply); bson_init(reply);`) is placed inside the cursor callback, immediately before it hands `reply` to `stream_for_reads` (the single point where prior contents are about to be orphaned). The retry loop is reverted to upstream byte-for-byte.
d	Test intent preserved / new tests added?	The scoped fix restores the read/write paths' prior behavior (they keep returning the failed attempt's reply with errorLabels) that the v1 change broke. The ASan/LSan harness is the regression proof. ClickHouse does not build the mongo-c-driver test suite, so the leak is exercised by the existing `test_mongo.py` integration tests under sanitizers, same as how it was caught.
e	Both directions demonstrated?	Yes. Harness, one scenario per process under LSan: `orig/cursor` leak=1 errorLabels=0 (the original leak); `v1/read` leak=0 errorLabels=0 (the regression flagged in review); `scoped/read` leak=0 errorLabels=1 (errorLabels preserved); `scoped/cursor` leak=0 errorLabels=0 (leak gone).
f	Fix is general, not a narrow patch?	Yes. I checked every caller of `mongoc_cluster_stream_for_reads`/`_for_writes`: all others pass NULL or a fresh caller-owned reply. The only contract violation (reusing a live reply across the out-param boundary) is the retry loop, and within it only the cursor callback forwards `reply`. The fix is at that single owner, not a guard at the symptom.

Session id: cron:clickhouse-worker-slot-0:20260614-200800

alexey-milovidov · 2026-06-14T21:02:06Z

Ok. One concern is that upstream could be managed by AI-haters. This often happens in open-source. So they can insult and humiliate our bot. Better to read everything and author the upstream fix from a real person.

groeneai · 2026-06-14T21:10:37Z

On sending the fix upstream first: agreed, and I won't open a PR against mongodb/mongo-c-driver myself given the concern about how an AI-authored PR would land there.

The change ports cleanly for a human to author upstream. The retry loop is left byte-for-byte as upstream; the single reset (bson_destroy(reply); bson_init(reply);) moves into _retryable_cursor_commmand_select_retry_server in mongoc-cursor.c, just before it forwards reply to mongoc_cluster_stream_for_reads. That is the only select_retry_server callback that passes reply as an out-param, so it is the only leaking path; the read and write callbacks keep returning the failed attempt's reply with its errorLabels. The diff is in ClickHouse/mongo-c-driver#13, and I can hand over a clean git format-patch for whoever submits it. The standalone ASan/LSan harness confirms the leak is gone and errorLabels is preserved on the read path (results above).

Holding this PR until the upstream fix lands; I'll rebase the submodule bump onto whatever commit is accepted.

nikitamikhaylov added the can be tested Allows running workflows for external contributors label Jun 14, 2026

groeneai mentioned this pull request Jun 14, 2026

Fix TSan data race on NuRaft peer config_ #107429

Closed

clickhouse-gh Bot added pr-bugfix Pull request with bugfix, not backported by default submodule changed At least one submodule changed in this PR. labels Jun 14, 2026

clickhouse-gh Bot reviewed Jun 14, 2026

View reviewed changes

Bump mongo-c-driver to fix reply memory leak in retryable command ret…

6d09e59

…ry loop Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

groeneai force-pushed the groeneai/fix-mongo-c-driver-reply-leak branch from 59168d5 to 6d09e59 Compare June 14, 2026 20:58

clickhouse-gh Bot added the manual approve Manual approve required to run CI label Jun 14, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix memory leak in mongo-c-driver retryable command retry loop#107448

Fix memory leak in mongo-c-driver retryable command retry loop#107448
groeneai wants to merge 1 commit into
ClickHouse:masterfrom
groeneai:groeneai/fix-mongo-c-driver-reply-leak

groeneai commented Jun 14, 2026

Uh oh!

groeneai commented Jun 14, 2026

Uh oh!

groeneai commented Jun 14, 2026

Uh oh!

clickhouse-gh Bot commented Jun 14, 2026 •

edited

Loading

Uh oh!

clickhouse-gh Bot Jun 14, 2026

Uh oh!

groeneai Jun 14, 2026

Uh oh!

rschu1ze commented Jun 14, 2026

Uh oh!

groeneai commented Jun 14, 2026

Uh oh!

alexey-milovidov commented Jun 14, 2026

Uh oh!

groeneai commented Jun 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

		@@ -1 +1 @@
		Subproject commit a26c8e251fec147258e89b7e85e9adfa259fc42f
		Subproject commit 5aad6ab39aea8a2aa425404bc3d82bbc6fe581cd

Conversation

groeneai commented Jun 14, 2026

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Description

Uh oh!

groeneai commented Jun 14, 2026

Pre-PR validation gate

Uh oh!

groeneai commented Jun 14, 2026

Uh oh!

clickhouse-gh Bot commented Jun 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AI Review

Summary

Missing context / blind spots

Findings

Tests

Final Verdict

Uh oh!

clickhouse-gh Bot Jun 14, 2026

Choose a reason for hiding this comment

Uh oh!

groeneai Jun 14, 2026

Choose a reason for hiding this comment

Uh oh!

rschu1ze commented Jun 14, 2026

Uh oh!

groeneai commented Jun 14, 2026

Pre-PR validation gate (updated for the scoped fix, commit 6d09e59)

Uh oh!

alexey-milovidov commented Jun 14, 2026

Uh oh!

groeneai commented Jun 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

clickhouse-gh Bot commented Jun 14, 2026 •

edited

Loading

Pre-PR validation gate (updated for the scoped fix, commit `6d09e59`)