From 6b6b676bdbd46fdb954ed1535893020ef91cce5b Mon Sep 17 00:00:00 2001
From: Groene AI <270696204+groeneai@users.noreply.github.com>
Date: Sun, 14 Jun 2026 19:29:03 +0000
Subject: [PATCH] Fix reply memory leak on retryable cursor server-selection failure

_mongoc_retryable_cmd_run reuses *reply across retry attempts. After an
attempt fails with a retryable error, the loop calls
select_retry_server() with reply while it still owns that attempt's
allocation.

Of the three select_retry_server callbacks, only the cursor one
(_retryable_cursor_commmand_select_retry_server) forwards reply to
mongoc_cluster_stream_for_reads as a real out-param. On its
server-selection-failure path, _mongoc_cluster_stream_for_optype
re-initializes reply (bson_init / bson_copy_to) without freeing the
prior contents. The loop's own bson_destroy(reply) sits after the
!server_description break, so it is not reached when selection fails,
and the attempt's reply is leaked.

The read and write callbacks (_retryable_read_select_retry_server,
_retryable_write_select_retry_server) ignore reply and pass NULL
downstream, so they never overwrite it and never leak.

Fix it where the bug is: reset reply (destroy + init) inside the cursor
callback, just before it hands reply to stream_for_reads. The retry loop
is left exactly as upstream, so the read and write paths keep returning
the failed first attempt's reply (with its errorLabels and server error
fields) to the caller, which they relied on before.

Observed in ClickHouse CI as a LeakSanitizer 128-byte direct leak:

    bson_malloc <- bson_copy_to <- _mongoc_cluster_run_opmsg_recv
      <- ... <- _mongoc_retryable_cmd_run

reached through a find cursor on a retryable read. The trigger
(retryable read error followed by a retry server-selection failure) is a
rare failover race.
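
Illustration only: a minimal, self-contained C sketch of the
overwrite-without-free pattern described in this message, and of the
destroy + init reset that avoids it. toy_reply_t, toy_init, toy_set,
toy_destroy, select_retry_server and run_once are hypothetical stand-ins
for bson_t, bson_init, bson_copy_to, bson_destroy, the cursor callback
and the retry loop. None of them are libmongoc or libbson APIs, and the
control flow is simplified to one failed attempt followed by one failed
retry server selection.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for bson_t: owns at most one heap buffer. */
typedef struct {
   char *data;
} toy_reply_t;

/* Number of buffers allocated by toy_set and not yet freed. */
static int live_allocs = 0;

static void toy_init(toy_reply_t *r)
{
   r->data = NULL;
}

/* Like an out-param copy: overwrites r without freeing what it held. */
static void toy_set(toy_reply_t *r, const char *s)
{
   r->data = malloc(strlen(s) + 1);
   assert(r->data != NULL);
   strcpy(r->data, s);
   live_allocs++;
}

static void toy_destroy(toy_reply_t *r)
{
   if (r->data) {
      free(r->data);
      live_allocs--;
   }
   r->data = NULL;
}

/* Stand-in for the cursor callback on the selection-failure path.
 * reset_first models the destroy + init added by this patch. */
static int select_retry_server(toy_reply_t *reply, int reset_first)
{
   if (reset_first) {
      toy_destroy(reply);
      toy_init(reply);
   }
   toy_set(reply, "selection failed"); /* out-param overwrite */
   return 0;                           /* no server selected */
}

/* One failed attempt, then a failed retry selection.
 * Returns how many buffers are still allocated afterwards. */
static int run_once(int reset_first)
{
   toy_reply_t reply;

   live_allocs = 0;
   toy_init(&reply);
   toy_set(&reply, "first attempt error");   /* attempt fills reply */
   select_retry_server(&reply, reset_first); /* selection fails */
   toy_destroy(&reply); /* caller frees only the current buffer */
   return live_allocs;
}
```

In this sketch, run_once(0) leaves one buffer allocated (the first
attempt's contents are orphaned by the overwrite), while run_once(1)
leaves none.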

Co-Authored-By: Claude Opus 4.8
---
 src/libmongoc/src/mongoc/mongoc-cursor.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/libmongoc/src/mongoc/mongoc-cursor.c b/src/libmongoc/src/mongoc/mongoc-cursor.c
index fc6f3e72f7..55db463e6a 100644
--- a/src/libmongoc/src/mongoc/mongoc-cursor.c
+++ b/src/libmongoc/src/mongoc/mongoc-cursor.c
@@ -745,6 +745,11 @@ _retryable_cursor_commmand_select_retry_server(void *user_data,
 
     mongoc_cursor_t *const cursor = context->cursor;
 
+    /* `reply` still owns the previous attempt's contents, and stream_for_reads
+     * overwrites it as an out-param on server-selection failure without freeing
+     * it first. Reset it here so that allocation is not leaked. */
+    bson_destroy(reply);
+    bson_init(reply);
     *context->server_stream = mongoc_cluster_stream_for_reads(&cursor->client->cluster,
                                                               context->ss_log_context,