httpserver: implemented concurrent connection and request limits;#1404
httpserver: implemented concurrent connection and request limits;#1404ajscholl wants to merge 4 commits into
Conversation
This commit adds these settings: httpserver.<name>.concurrency.max_requests (default: 0 = disabled) httpserver.<name>.concurrency.max_connections (default: 0 = disabled) httpserver.<name>.concurrency.overload_status_code (default: 503) httpserver.<name>.concurrency.retry_after (default: 0 = header omitted) If you configure a connection limit, we will not accept new connections until less than the limit of connections are active. Idle connections are closed to make room for new connections if the limit is reached. If you configure a request limit, additional request will be immediately be rejected with status 503 (unless a different status is configured) and a Retry-After header is sent if configured.
4e71846 to
0a0188c
Compare
…oncurrent requests;
74d70fd to
5fa0012
Compare
…ring the retry-after header;
| return client | ||
| } | ||
|
|
||
| func getRetryAfterClient(t *testing.T, retries int, timeout time.Duration, defaultWaitTime time.Duration) http.Client { |
There was a problem hiding this comment.
Do we need a test for the WithRequestTimeout option?
| } | ||
|
|
||
| retryAfterHeader := strings.TrimSpace(response.Header().Get(httpHeaders.RetryAfter)) | ||
| if retryAfterHeader == "" && response.StatusCode() != stdHttp.StatusServiceUnavailable { |
There was a problem hiding this comment.
So in case of a HTTP 200 we're returning a retry duration?
There was a problem hiding this comment.
Only if you need a retry for a 200. You need to add this manually by calling httpClient.AddRetryCondition. If your function requires a retry on a HTTP 200, then yes.
| return nil | ||
| } | ||
|
|
||
| return fmt.Errorf("retry wait duration %s exceeds remaining request time %s", retryAfter, remaining) |
There was a problem hiding this comment.
Is it preferred to error out here? Assuming we already had some retries and we're suddenly at 5mins retry where the deadline is only 1min. Won't this log a lot of errors?
On the other hand, I'm not sure if it's better to then set to the remaining instead, that can also be cut short, i.e. if you have two consecutive requests
There was a problem hiding this comment.
If the server says it is back in 5 min, then we should not try to send another request before that. If the caller then has to deal with an error, that is something I would prefer instead of keeping the server overloaded. Yes, we might see some errors in the logs, but the main aim should be to get the server restored to capacity automatically without any developer intervention and sending more requests to the server seems to counteract that
| httpserver.default.compression.level: default | ||
| httpserver.default.concurrency.max_requests: 0 # disabled; reject excess active requests when > 0 | ||
| httpserver.default.concurrency.max_connections: 0 # disabled; evict idle keep-alive connections when > 0 | ||
| httpserver.default.concurrency.overload_status_code: 503 |
There was a problem hiding this comment.
Should this be 429? Or is the technical overload more important, as it's not a rate-limit per se
There was a problem hiding this comment.
429 would require the client to send more than one request to be viable. But if you are unlucky, you can get a 503 on the first request to an overloaded server, so 503 seems correct to me in this case.
| case l.limit <- struct{}{}: | ||
| return nil | ||
| case <-l.closed: | ||
| return net.ErrClosed |
There was a problem hiding this comment.
Do we need to consider the application context here, and in case of shutdown reject any incoming connections?
There was a problem hiding this comment.
In case of a shutdown we should close the socket and thus stop accepting new connections. They will then be handled like before during a shutdown, I don't think we need to explicitly handle this here.
| idleEntry *list.Element | ||
| } | ||
|
|
||
| func newConnectionPressureTracker(ctx context.Context, metrics ServerMetricRecorder) *connectionPressureTracker { |
There was a problem hiding this comment.
Do you want to align the names? tracker <-> manager
| case http.StateIdle: | ||
| m.markIdle(conn) | ||
| case http.StateHijacked, http.StateClosed: | ||
| m.remove(conn) |
There was a problem hiding this comment.
Do we need to handle default here?
There was a problem hiding this comment.
It handles all possible cases from net/http. How would you handle the default?
| return false, nil | ||
| } | ||
|
|
||
| if err := oldestIdleConn.Close(); err != nil { |
There was a problem hiding this comment.
What happens when this errors, is the connection still closed or is it in a zombie state then?
There was a problem hiding this comment.
Not 100% sure. But I think it is generally unwise to touch the connection in any way after we tried to close it - if it refers to a file descriptor internally, and that fd got allocated to some other file, closing it again (or doing anything at all with it) might cause some "interesting" things for that other file using the same fd.
| "github.com/stretchr/testify/require" | ||
| ) | ||
|
|
||
| func TestConnectionPressureTracker_TracksConnectionOpenedAndClosed(t *testing.T) { |
There was a problem hiding this comment.
Also here, is the test easier if we can directly instantiate with mocks? The recorder approach is a bit bigger module tested than just the server
| func TestConcurrentRequestLimitMiddleware_RejectsWhenLimitReached(t *testing.T) { | ||
| entered := make(chan struct{}) | ||
| release := make(chan struct{}) | ||
| router := newConcurrentRequestLimitRouter(httpserver.ConcurrencySettings{ |
There was a problem hiding this comment.
s.a. I also noticed that there are no direct integration tests, maybe it's worth having both the unit tests and the recorder tests?
ca913a4 to
b5aed33
Compare
This commit adds these settings:
httpserver..concurrency.max_requests (default: 0 = disabled)
httpserver..concurrency.max_connections (default: 0 = disabled)
httpserver..concurrency.overload_status_code (default: 503)
httpserver..concurrency.retry_after (default: 0 = header omitted)
If you configure a connection limit, we will not accept new connections until less than the limit of connections are active. Idle connections are closed to make room for new connections if the limit is reached.
If you configure a request limit, additional request will be immediately be rejected with status 503 (unless a different status is configured) and a Retry-After header is sent if configured.