
feat: Add backpressure management config for concurrent requests #315

Closed
matt-codecov wants to merge 2 commits into main from matt/concurrent-reqs-backpressure

Conversation

matt-codecov (Contributor) commented Feb 18, 2026

  • adds a SystemMetrics struct to the service state, which currently houses our in-flight requests counter
  • adds a BackpressureThresholds struct to the config, with a field for max in-flight requests
  • makes /ready return 503 if we are processing more concurrent requests than the config allows

don't love any of the names, bikeshedding welcome

Ref FS-171
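The bullet points above boil down to a small check. Here is a minimal sketch: `SystemMetrics` and `BackpressureThresholds` mirror the structs the PR describes, but the field and function names are illustrative rather than copied from the actual code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for the PR's `SystemMetrics`: houses the in-flight request counter.
struct SystemMetrics {
    in_flight_requests: AtomicUsize,
}

/// Stand-in for the PR's `BackpressureThresholds`: `None` disables the check.
struct BackpressureThresholds {
    in_flight_requests: Option<usize>,
}

/// Returns true when `/ready` should answer 503 because more concurrent
/// requests are in flight than the config allows.
fn should_fail_readiness(metrics: &SystemMetrics, thresholds: &BackpressureThresholds) -> bool {
    match thresholds.in_flight_requests {
        Some(max) => metrics.in_flight_requests.load(Ordering::Relaxed) > max,
        None => false,
    }
}
```

With the threshold unset, the endpoint always reports ready, matching the `None => false` arm quoted later in this review.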

@matt-codecov matt-codecov requested a review from a team as a code owner February 18, 2026 00:36
linear bot commented Feb 18, 2026

ENV OPENSSL_INCLUDE_DIR=/usr/include

WORKDIR /workspace
RUN git config --global --add safe.directory /workspace
matt-codecov (Contributor, Author):
oh whoops, i forget whether this is actually necessary now. might be an artifact from when i was doing hooked_objectstore stuff in the sandbox. i will revert this before merging if it is not necessary

Member:

Or otherwise move to a separate PR, it would be nice to capture the motivation of this change.

cursor bot left a comment:

Cursor Bugbot has reviewed your changes and found 1 potential issue.


merni::gauge!("server.requests.in_flight": count);
});

let (serve_result, _) = tokio::join!(server, emitter);
matt-codecov (Contributor, Author):

i don't think the emitter has an exit condition so this would probably never finish

Member:

That would mean the server couldn't shut down at all right now. I need to refresh my memory, but I think the emitter finishes automatically once the server drops its other handler (in the middleware) and then the emitter has run its last emission.
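The "emitter finishes once the server drops its other handle" behavior described above can be illustrated with a channel-based stand-in (plain `std::sync::mpsc` here, rather than whatever tokio primitives the real emitter uses): the receive loop ends on its own as soon as every sender has been dropped, so joining on the emitter does not hang.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for the metrics emitter: it loops until the server
/// side drops its sender, then returns everything it emitted.
fn spawn_emitter(rx: mpsc::Receiver<u64>) -> thread::JoinHandle<Vec<u64>> {
    thread::spawn(move || {
        let mut emitted = Vec::new();
        // `recv` returns Err once every `Sender` is dropped, so the loop
        // (and with it the emitter task) terminates on its own.
        while let Ok(count) = rx.recv() {
            emitted.push(count); // stand-in for merni::gauge!(...)
        }
        emitted
    })
}
```

Under this assumption, `tokio::join!(server, emitter)` completes once the server future resolves and drops its handle.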

matt-codecov force-pushed the matt/concurrent-reqs-backpressure branch from 3fb2483 to eefe8a5 on February 18, 2026 00:44
matt-codecov force-pushed the matt/concurrent-reqs-backpressure branch from eefe8a5 to a48d6e4 on February 18, 2026 00:50
}

let too_many_requests = match state.config.backpressure_thresholds.in_flight_requests {
Some(max_concurrent_requests) => {

Bug: The /ready health check includes its own request in the in_flight_requests count, causing an off-by-one error in the concurrency limit and allowing one extra request.
Severity: MEDIUM

Suggested Fix

Adjust the comparison in the /ready handler to account for its own request. For example, change the check to state.system_metrics.in_flight_requests.get() > state.max_concurrent_requests + 1 or subtract one from the current count before comparing: (state.system_metrics.in_flight_requests.get() - 1) >= state.max_concurrent_requests.


Location: objectstore-server/src/endpoints/health.rs#L30

Potential issue: The `/ready` health check endpoint checks if the number of in-flight
requests exceeds the `max_concurrent_requests` limit. However, the middleware that
counts in-flight requests increments the counter before the request handler is executed.
This means the `/ready` check's own request is included in the total. When the server
has `max_concurrent_requests - 1` active requests, a `/ready` probe brings the count to
`max_concurrent_requests`. The check `count > max_concurrent_requests` then incorrectly
evaluates to false, causing the server to report as ready and accept one more request
than its configured limit.
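Bugbot's proposed compensation can be sketched as a standalone predicate (names hypothetical; the real handler reads the counter from service state): subtract the probe's own request, which the middleware has already counted, before comparing against the limit.

```rust
/// Hypothetical readiness predicate that compensates for the off-by-one:
/// the counting middleware increments before the handler runs, so the
/// `/ready` request itself is included in `current_in_flight`.
fn too_many_requests(current_in_flight: usize, max_concurrent: usize) -> bool {
    // saturating_sub(1) excludes this probe's own request from the count
    // (and avoids underflow if the counter were ever read as zero).
    current_in_flight.saturating_sub(1) > max_concurrent
}
```

This keeps the original `>` comparison but applies it to the count of requests other than the probe itself.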

lcian changed the title from "feat: add backpressure management config for concurrent requests" to "feat: Add backpressure management config for concurrent requests" on Feb 18, 2026
None => false,
};
if too_many_requests {
tracing::warn!("Serving too many concurrent requests already, failing readiness");
Member:

Note that this will send errors to Sentry given the warn log level.
I think it's fine to keep, at least initially.

jan-auer (Member) left a comment:

Preliminary review.

Since we already have this counter of concurrent requests in our middleware, could we simply inject backpressure there -- as middleware? Or does this intentionally admit more requests until k8s reacts to the readiness endpoint?


/// Definitions for thresholds at which backpressure management or load shedding should kick
/// in.
pub backpressure_thresholds: BackpressureThresholds,
Member:

Would this fit into rate_limits? Or maybe, we rename rate_limits into more generic limits and then have throughput, bandwidth, and concurrency in there?
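One hypothetical shape for that suggested `limits` grouping — every name below is illustrative, not taken from the codebase:

```rust
/// Sketch of a generic `limits` config section that could absorb the
/// existing `rate_limits` alongside the new concurrency threshold.
#[derive(Default)]
struct Limits {
    /// Requests per second (what `rate_limits` covers today).
    throughput: Option<u32>,
    /// Bytes per second.
    bandwidth: Option<u64>,
    /// Max in-flight requests (the `BackpressureThresholds` field from this PR).
    concurrency: Option<usize>,
}
```

Each field being an `Option` keeps every limit individually opt-in, matching how the PR's `in_flight_requests` threshold behaves.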

jan-auer (Member) left a comment:

Please hold on this for now. We're adding more task isolation and internal concurrency limiting in a series of PRs starting with: #322

jan-auer added a commit that referenced this pull request Feb 23, 2026
Adds a web-tier flood protection limit to the objectstore HTTP server.
When the number of in-flight requests reaches `http.max_requests`
(default: 10,000), new requests are rejected immediately with HTTP 503.
`/health` and `/ready` are excluded so orchestration probes always get
through.

The `Http` config struct groups HTTP-layer settings separately from the
service-layer `Service` struct. This keeps concerns cleanly separated
and gives a natural home for future HTTP-level settings (timeouts, body
size limits, etc.). The env var is `OS__HTTP__MAX_REQUESTS`.

**Why direct 503 instead of readiness-based backpressure**: failing
`/ready` under load lets Kubernetes drain the pod, but probe intervals
(5–10s) leave a window of continued overload, recovery wastes capacity
waiting for probe cycles, and multiple pods failing probes
simultaneously concentrates traffic onto remaining pods. A busy pod is
also semantically still *ready* — conflating load with readiness muddies
alerting. Direct rejection responds in milliseconds, recovers instantly,
keeps every pod in the pool, and works in any deployment environment.

Tested with two unit tests (pass-through and rejection/exemption via a
`Notify`-based gating pattern) and an integration test via `TestServer`
with `max_requests = 0` for deterministic coverage without needing
concurrent requests.

Ref FS-171
Closes #315
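The behavior the commit message describes (probe exemption plus an immediate 503 once `http.max_requests` in-flight slots are taken) might look roughly like this at its core, stripped of any web-framework types. The `admit` function, its path matching, and the counter handling are a sketch of the idea, not the actual middleware:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical core of a flood-protection middleware: admit orchestration
/// probes unconditionally, reject everything else with 503 once the
/// in-flight count has reached `max_requests`.
fn admit(path: &str, in_flight: &AtomicUsize, max_requests: usize) -> Result<(), u16> {
    // /health and /ready are exempt so k8s probes always get through.
    if path == "/health" || path == "/ready" {
        return Ok(());
    }
    // Optimistically take a slot; back out and reject if over the limit.
    let previous = in_flight.fetch_add(1, Ordering::SeqCst);
    if previous >= max_requests {
        in_flight.fetch_sub(1, Ordering::SeqCst);
        return Err(503);
    }
    Ok(())
}
```

A real implementation would also decrement the counter when an admitted request completes; rejecting at admission time is what gives the millisecond-scale response and instant recovery the commit message contrasts with probe-cycle latency.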
3 participants