Fix SqlTransaction race condition causing 500 on client disconnect in bundles#5553
Draft
mikaelweave wants to merge 1 commit intomainfrom
Draft
Fix SqlTransaction race condition causing 500 on client disconnect in bundles#5553mikaelweave wants to merge 1 commit intomainfrom
mikaelweave wants to merge 1 commit intomainfrom
Conversation
… bundles
Bug #188321: Batch/Transaction bundles fail with 'SqlTransaction has
completed; it is no longer usable' race condition.
Root cause is two interacting bugs:
Bug A - SqlServerFhirDataStore.MergeInternalAsync passes an already-
cancelled CancellationToken to MergeResourcesCommitTransactionAsync in
the error catch block. The cancelled token causes OpenAsync to throw
OperationCanceledException immediately, masking the original exception
and leaving server-side transaction state unclean.
Fix A: Pass CancellationToken.None so the abort call always completes.
Bug B - SqlTransactionScope.Dispose() (healthcare-shared-components)
disposes SqlConnection before SqlTransaction. This zombies the
transaction, causing SqlTransaction.Dispose()'s internal Rollback() to
throw InvalidOperationException('This SqlTransaction has completed').
During C# using-block unwinding this replaces the active exception
(FhirTransactionCancelledException), which BundleHandler then catches
and returns HTTP 500 instead of HTTP 408.
Fix B: Check cancellationToken.IsCancellationRequested in the
IsCompletedTransactionException catch block and throw
FhirTransactionCancelledException (408) instead of
FhirTransactionFailedException (500) when the client cancelled.
Fix C (defense-in-depth): Expand the transaction abort guard in both
sequential (BundleHandler) and parallel (BundleHandlerParallelOperations)
paths to abort on any 4xx/5xx status code, not only when
entry.Response.Outcome != null (outcome is null when client disconnects
mid-stream and the response body is empty).
Also adds a unit test covering the full failure chain:
2-entry transaction bundle + cancelled token + SqlTransaction zombie
exception -> asserts FhirTransactionCancelledException (408).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5553 +/- ##
==========================================
+ Coverage 77.02% 77.39% +0.36%
==========================================
Files 983 985 +2
Lines 36007 36193 +186
Branches 5469 5501 +32
==========================================
+ Hits 27736 28010 +274
+ Misses 6927 6825 -102
- Partials 1344 1358 +14 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes Bug #188321: Batch/Transaction bundles fail with
'SqlTransaction has completed; it is no longer usable'race condition.When a client disconnects mid-request during a sequential transaction bundle, the server was returning HTTP 500 instead of HTTP 408 (Request Timeout).
Root Cause
Two interacting bugs:
Bug A — Cancelled token in error catch block
SqlServerFhirDataStore.MergeInternalAsyncpasses an already-cancelledCancellationTokentoMergeResourcesCommitTransactionAsyncinside the error catch block. The cancelled token causesOpenAsyncto throwOperationCanceledExceptionimmediately, masking the original exception and leaving server-side transaction state unclean.Bug B — SqlTransactionScope.Dispose() zombie chain
SqlTransactionScope.Dispose()disposesSqlConnectionbeforeSqlTransaction. Closing the connection zombies the transaction, so the internalRollback()call inSqlTransaction.Dispose()throwsInvalidOperationException("This SqlTransaction has completed"). During C#using-block unwinding this exception replaces the activeFhirTransactionCancelledException, whichBundleHandlerthen catches and returns HTTP 500.Fixes
Fix A (
SqlServerFhirDataStore.cs): PassCancellationToken.NonetoMergeResourcesCommitTransactionAsyncin the catch block so the server-side abort always completes.Fix B (
BundleHandler.cs): CheckcancellationToken.IsCancellationRequestedin theIsCompletedTransactionExceptioncatch block and throwFhirTransactionCancelledException(408) instead ofFhirTransactionFailedException(500) when the client cancelled.Fix C — defense-in-depth (
BundleHandler.cs,BundleHandlerParallelOperations.cs): Expand the transaction abort guard to fire on any 4xx/5xx status code, not only whenentry.Response.Outcome != null. When a client disconnects mid-stream the response body is empty andOutcomeis null, so the old guard would not abort the transaction.Testing
GivenATransaction_WithCancellationAndSqlTransactionZombied_ReturnsCancelledExceptioncovering the full failure chain: 2-entry transaction bundle + cancelled token + SqlTransaction zombie exception → assertsFhirTransactionCancelledException(408).Related
ADO Bug #188321