Skip to content

httpserver: implemented concurrent connection and request limits;#1404

Open
ajscholl wants to merge 4 commits into
mainfrom
limit-concurrent-requests
Open

httpserver: implemented concurrent connection and request limits;#1404
ajscholl wants to merge 4 commits into
mainfrom
limit-concurrent-requests

Conversation

@ajscholl
Copy link
Copy Markdown
Contributor

This commit adds these settings:

httpserver..concurrency.max_requests (default: 0 = disabled)
httpserver..concurrency.max_connections (default: 0 = disabled)
httpserver..concurrency.overload_status_code (default: 503)
httpserver..concurrency.retry_after (default: 0 = header omitted)

If you configure a connection limit, we will not accept new connections until less than the limit of connections are active. Idle connections are closed to make room for new connections if the limit is reached.

If you configure a request limit, additional request will be immediately be rejected with status 503 (unless a different status is configured) and a Retry-After header is sent if configured.

This commit adds these settings:

httpserver.<name>.concurrency.max_requests (default: 0 = disabled)
httpserver.<name>.concurrency.max_connections (default: 0 = disabled)
httpserver.<name>.concurrency.overload_status_code (default: 503)
httpserver.<name>.concurrency.retry_after (default: 0 = header omitted)

If you configure a connection limit, we will not accept new connections
until less than the limit of connections are active. Idle connections
are closed to make room for new connections if the limit is reached.

If you configure a request limit, additional request will be immediately
be rejected with status 503 (unless a different status is configured)
and a Retry-After header is sent if configured.
@ajscholl ajscholl force-pushed the limit-concurrent-requests branch from 4e71846 to 0a0188c Compare May 26, 2026 07:05
@ajscholl ajscholl force-pushed the limit-concurrent-requests branch from 74d70fd to 5fa0012 Compare May 26, 2026 09:11
Comment thread pkg/http/client_test.go
return client
}

func getRetryAfterClient(t *testing.T, retries int, timeout time.Duration, defaultWaitTime time.Duration) http.Client {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a test for the WithRequestTimeout option?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Comment thread pkg/http/retry_after.go
}

retryAfterHeader := strings.TrimSpace(response.Header().Get(httpHeaders.RetryAfter))
if retryAfterHeader == "" && response.StatusCode() != stdHttp.StatusServiceUnavailable {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in case of a HTTP 200 we're returning a retry duration?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if you need a retry for a 200. You need to add this manually by calling httpClient.AddRetryCondition. If your function requires a retry on a HTTP 200, then yes.

Comment thread pkg/http/retry_after.go
return nil
}

return fmt.Errorf("retry wait duration %s exceeds remaining request time %s", retryAfter, remaining)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it preferred to error out here? Assuming we already had some retries and we're suddenly at 5mins retry where the deadline is only 1min. Won't this log a lot of errors?
On the other hand, I'm not sure if it's better to then set to the remaining instead, that can also be cut short, i.e. if you have two consecutive requests

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the server says it is back in 5 min, then we should not try to send another request before that. If the caller then has to deal with an error, that is something I would prefer instead of keeping the server overloaded. Yes, we might see some errors in the logs, but the main aim should be to get the server restored to capacity automatically without any developer intervention and sending more requests to the server seems to counteract that

Comment thread pkg/httpserver/AGENTS.md
httpserver.default.compression.level: default
httpserver.default.concurrency.max_requests: 0 # disabled; reject excess active requests when > 0
httpserver.default.concurrency.max_connections: 0 # disabled; evict idle keep-alive connections when > 0
httpserver.default.concurrency.overload_status_code: 503
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be 429? Or is the technical overload more important, as it's not a rate-limit per se

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

429 would require the client to send more than one request to be viable. But if you are unlucky, you can get a 503 on the first request to an overloaded server, so 503 seems correct to me in this case.

case l.limit <- struct{}{}:
return nil
case <-l.closed:
return net.ErrClosed
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to consider the application context here, and in case of shutdown reject any incoming connections?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of a shutdown we should close the socket and thus stop accepting new connections. They will then be handled like before during a shutdown, I don't think we need to explicitly handle this here.

idleEntry *list.Element
}

func newConnectionPressureTracker(ctx context.Context, metrics ServerMetricRecorder) *connectionPressureTracker {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to align the names? tracker <-> manager

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

case http.StateIdle:
m.markIdle(conn)
case http.StateHijacked, http.StateClosed:
m.remove(conn)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to handle default here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It handles all possible cases from net/http. How would you handle the default?

return false, nil
}

if err := oldestIdleConn.Close(); err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when this errors, is the connection still closed or is it in a zombie state then?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not 100% sure. But I think it is generally unwise to touch the connection in any way after we tried to close it - if it refers to a file descriptor internally, and that fd got allocated to some other file, closing it again (or doing anything at all with it) might cause some "interesting" things for that other file using the same fd.

"github.com/stretchr/testify/require"
)

func TestConnectionPressureTracker_TracksConnectionOpenedAndClosed(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here, is the test easier if we can directly instantiate with mocks? The recorder approach is a bit bigger module tested than just the server

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactored

func TestConcurrentRequestLimitMiddleware_RejectsWhenLimitReached(t *testing.T) {
entered := make(chan struct{})
release := make(chan struct{})
router := newConcurrentRequestLimitRouter(httpserver.ConcurrencySettings{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s.a. I also noticed that there are no direct integration tests, maybe it's worth having both the unit tests and the recorder tests?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@ajscholl ajscholl force-pushed the limit-concurrent-requests branch from ca913a4 to b5aed33 Compare May 29, 2026 17:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants