Skip to content

feat(vector): add support for raising file descriptor limits#3311

Open
vparfonov wants to merge 1 commit into
openshift:masterfrom
vparfonov:log9396
Open

feat(vector): add support for raising file descriptor limits#3311
vparfonov wants to merge 1 commit into
openshift:masterfrom
vparfonov:log9396

Conversation

@vparfonov

@vparfonov vparfonov commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Description

Adds annotation-driven support for Vector's VECTOR_RAISE_FD_LIMIT env var, allowing users to opt in to raising file descriptor soft limits at startup. This prevents "Too many open files" errors when Vector monitors large numbers of log files concurrently.

Details

  • new annotation: observability.openshift.io/raise-fd-limit
  • "true" raise fd soft limit to match hard limit
  • "false" (default) no change, matches Vector's default behavior
  • sets VECTOR_RAISE_FD_LIMIT env var on the collector container; Vector reads it natively
  • invalid annotation values are rejected with a status condition on the CLF

Usage

 apiVersion: observability.openshift.io/v1
  kind: ClusterLogForwarder
  metadata:
    name: instance
    annotations:
      observability.openshift.io/raise-fd-limit: "true"

/cc
/assign @jcantril

Links

Summary by CodeRabbit

  • New Features
    • Added observability.openshift.io/raise-fd-limit annotation for ClusterLogForwarder to control whether Vector raises the process file descriptor soft limit at startup (default: false; valid values: true/false).
    • Propagates the setting to the collector environment, falling back to false for invalid values and surfacing a status condition when misconfigured.
  • Documentation
    • Documented the annotation, valid values, defaults, and included a configuration example.
  • Tests
    • Added unit and functional coverage for default, enabled, and invalid-value behavior.

@vparfonov

Copy link
Copy Markdown
Contributor Author

/hold

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

Adds support for the observability.openshift.io/raise-fd-limit annotation across validation, collector configuration, tests, and documentation. New constants and a feature doc are added, and the collector now exposes VECTOR_RAISE_FD_LIMIT.

Changes

Raise-fd-limit annotation support

Layer / File(s) Summary
Contract and docs
internal/constants/annotations.go, api/observability/v1/conditions.go, docs/features/raise_fd_limit.adoc, .github/actions/spelling/patterns.txt
Defines the annotation key plus condition/reason constants, adds feature documentation, and updates spelling patterns for AsciiDoc source blocks.
Validation and conditions
internal/validations/observability/validate_annotations.go, internal/validations/observability/validate.go, internal/validations/observability/validate_annotations_test.go
Adds validation for observability.openshift.io/raise-fd-limit, sets or clears the RaiseFdLimit condition, registers the validator, and covers valid and invalid annotation values in tests.
Collector env wiring
internal/collector/collector.go, internal/collector/collector_test.go
Adds RaiseFdLimit(annotations) and wires VECTOR_RAISE_FD_LIMIT into the collector container, with unit tests for default, valid, and invalid inputs.
Functional coverage
test/functional/misc/raise_fd_limit_test.go
Adds a Ginkgo test suite that deploys the collector and checks the VECTOR_RAISE_FD_LIMIT environment value for both true and false.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 A bunny found a flag in the breeze,
"true" or "false" among the leaves.
The collector hummed, the tests all smiled,
And docs hopped along in a tidy style.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding support for raising file descriptor limits in Vector.
Description check ✅ Passed The description follows the template and covers intent, details, usage, and links, though the /cc assignment is still a placeholder.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@openshift-ci openshift-ci Bot requested review from Clee2691 and jcantrill June 15, 2026 14:11
@openshift-ci

openshift-ci Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: vparfonov
Once this PR has been reviewed and has the lgtm label, please assign xperimental for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 15, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
docs/features/raise_fd_limit.adoc (1)

1-38: 💤 Low value

Consider clarifying the default behavior when the annotation is absent.

The documentation is comprehensive and well-written. One minor suggestion: line 9 states "The Cluster Logging Operator does not enable this feature by default (false)" which could be slightly clearer. Consider rephrasing to explicitly state that when the annotation is absent, the behavior is equivalent to setting it to "false".

For example: "The Cluster Logging Operator does not enable this feature by default. When the annotation is absent, the behavior is equivalent to false, matching Vector's default behavior."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/features/raise_fd_limit.adoc` around lines 1 - 38, Clarify the default
behavior when the annotation is absent in the documentation. In the paragraph
discussing the default behavior of the Cluster Logging Operator (currently
stating "The Cluster Logging Operator does not enable this feature by default
(`false`), matching Vector's default behavior"), rephrase it to explicitly state
that when the observability.openshift.io/raise-fd-limit annotation is absent,
the behavior is equivalent to `false`. The revised text should clearly
communicate that omitting the annotation results in the same behavior as setting
it to `"false"`, making it unambiguous what happens in the absence of the
annotation configuration.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/functional/misc/raise_fd_limit_test.go`:
- Around line 26-32: The test in the It block claiming to verify "with
VECTOR_RAISE_FD_LIMIT env var" does not actually set the
observability.openshift.io/raise-fd-limit annotation or verify the environment
variable is present in the collector. Enhance this test to first set the
annotation to "true" on the forwarder object before calling framework.Deploy(),
then after verifying Vector starts successfully, additionally verify that the
VECTOR_RAISE_FD_LIMIT environment variable is actually set to "true" in the
collector container by examining the environment or pod spec. Consider creating
additional test cases for the annotation set to "false", omitted entirely, and
with an invalid value, verifying the corresponding environment variable values
and any status conditions as appropriate.

---

Nitpick comments:
In `@docs/features/raise_fd_limit.adoc`:
- Around line 1-38: Clarify the default behavior when the annotation is absent
in the documentation. In the paragraph discussing the default behavior of the
Cluster Logging Operator (currently stating "The Cluster Logging Operator does
not enable this feature by default (`false`), matching Vector's default
behavior"), rephrase it to explicitly state that when the
observability.openshift.io/raise-fd-limit annotation is absent, the behavior is
equivalent to `false`. The revised text should clearly communicate that omitting
the annotation results in the same behavior as setting it to `"false"`, making
it unambiguous what happens in the absence of the annotation configuration.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 47d81bf4-7967-4ea7-9515-979bd502d52b

📥 Commits

Reviewing files that changed from the base of the PR and between 218eb4a and f650eb5.

📒 Files selected for processing (9)
  • api/observability/v1/conditions.go
  • docs/features/raise_fd_limit.adoc
  • internal/collector/collector.go
  • internal/collector/collector_test.go
  • internal/constants/annotations.go
  • internal/validations/observability/validate.go
  • internal/validations/observability/validate_annotations.go
  • internal/validations/observability/validate_annotations_test.go
  • test/functional/misc/raise_fd_limit_test.go

Comment thread test/functional/misc/raise_fd_limit_test.go Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
test/functional/misc/raise_fd_limit_test.go (1)

28-37: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Both functional tests bypass the annotation mechanism and do not validate the feature. Both test cases manually inject the VECTOR_RAISE_FD_LIMIT environment variable via DeployWithVisitor, completely bypassing the annotation-to-environment-variable conversion that this PR implements. These tests only verify that manually-set env vars can be read back, not that the observability.openshift.io/raise-fd-limit annotation is processed by RaiseFdLimit() and injected into the collector container.

  • test/functional/misc/raise_fd_limit_test.go#L28-L37: Replace DeployWithVisitor with setting framework.Forwarder.Annotations[constants.AnnotationVectorRaiseFdLimit] = "true" before calling framework.Deploy().
  • test/functional/misc/raise_fd_limit_test.go#L39-L48: Replace DeployWithVisitor with setting framework.Forwarder.Annotations[constants.AnnotationVectorRaiseFdLimit] = "false" before calling framework.Deploy().

Additionally, add test cases for: (1) omitted annotation (verify default "false"), and (2) invalid annotation value (verify default "false" and status condition if applicable).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/functional/misc/raise_fd_limit_test.go` around lines 28 - 37, The tests
in test/functional/misc/raise_fd_limit_test.go are bypassing the annotation
mechanism by manually setting the VECTOR_RAISE_FD_LIMIT environment variable via
DeployWithVisitor instead of validating that the annotation processing works
correctly. At lines 28-37 (anchor), replace the DeployWithVisitor call with
setting framework.Forwarder.Annotations[constants.AnnotationVectorRaiseFdLimit]
= "true" before calling framework.Deploy(). At lines 39-48 (sibling), make the
same change but set the annotation to "false". Then remove the manual
environment variable injection and the RunCommand verification, since the tests
should now verify that the annotation is properly processed by the
RaiseFdLimit() function. Finally, add two additional test cases: one verifying
the default behavior when the annotation is omitted, and another verifying the
behavior when an invalid annotation value is provided.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@test/functional/misc/raise_fd_limit_test.go`:
- Around line 28-37: The tests in test/functional/misc/raise_fd_limit_test.go
are bypassing the annotation mechanism by manually setting the
VECTOR_RAISE_FD_LIMIT environment variable via DeployWithVisitor instead of
validating that the annotation processing works correctly. At lines 28-37
(anchor), replace the DeployWithVisitor call with setting
framework.Forwarder.Annotations[constants.AnnotationVectorRaiseFdLimit] = "true"
before calling framework.Deploy(). At lines 39-48 (sibling), make the same
change but set the annotation to "false". Then remove the manual environment
variable injection and the RunCommand verification, since the tests should now
verify that the annotation is properly processed by the RaiseFdLimit() function.
Finally, add two additional test cases: one verifying the default behavior when
the annotation is omitted, and another verifying the behavior when an invalid
annotation value is provided.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9bb147d9-a0ad-43d8-94d3-3435e69fc064

📥 Commits

Reviewing files that changed from the base of the PR and between f650eb5 and f95b2d9.

📒 Files selected for processing (9)
  • api/observability/v1/conditions.go
  • docs/features/raise_fd_limit.adoc
  • internal/collector/collector.go
  • internal/collector/collector_test.go
  • internal/constants/annotations.go
  • internal/validations/observability/validate.go
  • internal/validations/observability/validate_annotations.go
  • internal/validations/observability/validate_annotations_test.go
  • test/functional/misc/raise_fd_limit_test.go
✅ Files skipped from review due to trivial changes (1)
  • api/observability/v1/conditions.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • internal/validations/observability/validate_annotations_test.go
  • internal/validations/observability/validate.go
  • internal/validations/observability/validate_annotations.go
  • internal/constants/annotations.go
  • internal/collector/collector.go

@vparfonov

Copy link
Copy Markdown
Contributor Author

/hold cancel

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 29, 2026
@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 30, 2026
Add annotation-driven support for Vector's VECTOR_RAISE_FD_LIMIT env var,
enabling the collector to raise file descriptor soft limits at startup.
This prevents "Too many open files" errors when monitoring large numbers
of log files concurrently.

Key changes:
- New annotation: observability.openshift.io/raise-fd-limit (default: false)
- Sets VECTOR_RAISE_FD_LIMIT env var on the collector container
- Vector reads the env var natively, no entrypoint script changes needed
- Input validation rejects invalid values (only "true"/"false" allowed)
- Status condition reports validation errors to the user

Default matches Vector's behavior (false = don't raise fd limits).
Users must explicitly set the annotation to "true" to opt in.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Signed-off-by: Vitalii Parfonov <vparfono@redhat.com>
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 30, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
.github/actions/spelling/patterns.txt (1)

55-62: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Narrow the hyphenated-name ignore pattern.

\b[a-z]+(?:-[a-z]+)+\b will also match common prose like “end-to-end” and “follow-up”, which weakens spell-check coverage far beyond Kubernetes resource names. A tighter allow-list or a more specific resource-name regex would be safer.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/actions/spelling/patterns.txt around lines 55 - 62, Narrow the
hyphenated-name ignore rule in the spelling patterns so it only matches
Kubernetes-style resource names, not general prose. Update the regex entry in
the patterns file near the existing resource-name rule to be more specific (or
replace it with a tighter allow-list) while keeping the other ignore patterns
like the mixed-identifier and pod hash rules intact. Use the existing comment
context around the Kubernetes resource-name pattern to locate the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In @.github/actions/spelling/patterns.txt:
- Around line 55-62: Narrow the hyphenated-name ignore rule in the spelling
patterns so it only matches Kubernetes-style resource names, not general prose.
Update the regex entry in the patterns file near the existing resource-name rule
to be more specific (or replace it with a tighter allow-list) while keeping the
other ignore patterns like the mixed-identifier and pod hash rules intact. Use
the existing comment context around the Kubernetes resource-name pattern to
locate the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9a35f63f-3153-40eb-b24e-11d737942dad

📥 Commits

Reviewing files that changed from the base of the PR and between f95b2d9 and cc5152a.

📒 Files selected for processing (10)
  • .github/actions/spelling/patterns.txt
  • api/observability/v1/conditions.go
  • docs/features/raise_fd_limit.adoc
  • internal/collector/collector.go
  • internal/collector/collector_test.go
  • internal/constants/annotations.go
  • internal/validations/observability/validate.go
  • internal/validations/observability/validate_annotations.go
  • internal/validations/observability/validate_annotations_test.go
  • test/functional/misc/raise_fd_limit_test.go
✅ Files skipped from review due to trivial changes (2)
  • docs/features/raise_fd_limit.adoc
  • api/observability/v1/conditions.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • internal/collector/collector_test.go
  • test/functional/misc/raise_fd_limit_test.go
  • internal/validations/observability/validate_annotations_test.go
  • internal/collector/collector.go
  • internal/validations/observability/validate_annotations.go
  • internal/validations/observability/validate.go

@vparfonov

Copy link
Copy Markdown
Contributor Author

/retest

@openshift-ci

openshift-ci Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

@vparfonov: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant