[WIP] fswatch: add cross-platform filesystem watcher#349

Closed
dims wants to merge 1 commit into
kubernetes:masterfrom
dims:add-fswatch

@dims
Member

@dims dims commented May 7, 2026

A small Kubernetes compatibility layer that removes direct github.com/fsnotify/fsnotify dependencies from the Linux build closure. Linux uses raw inotify(7) (epoll + pipe wakeup) inline; non-Linux platforms wrap fsnotify behind a build-tag-isolated adapter so fsnotify is absent from production Linux binaries.
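
For readers unfamiliar with the "epoll + pipe wakeup" pattern, here is a minimal Linux-only sketch of it. It is not the code in this PR: it uses the standard library syscall package, and the names loop, newLoop, wait, and demo are made up for illustration. The idea is that epoll waits on both the inotify descriptor and the read end of a pipe, so writing one byte to the pipe unblocks a reader that would otherwise sit in epoll_wait forever.

```go
//go:build linux

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// loop holds an epoll instance, an inotify instance, and a pipe whose read
// end exists only to interrupt a blocked epoll_wait. All names are illustrative.
type loop struct{ epfd, ifd, wakeR, wakeW int }

func newLoop() (*loop, error) {
	l := &loop{epfd: -1, ifd: -1, wakeR: -1, wakeW: -1}
	var err error
	if l.ifd, err = syscall.InotifyInit1(syscall.IN_CLOEXEC | syscall.IN_NONBLOCK); err != nil {
		return nil, err
	}
	p := make([]int, 2)
	if err = syscall.Pipe2(p, syscall.O_CLOEXEC|syscall.O_NONBLOCK); err != nil {
		l.close()
		return nil, err
	}
	l.wakeR, l.wakeW = p[0], p[1]
	if l.epfd, err = syscall.EpollCreate1(syscall.EPOLL_CLOEXEC); err != nil {
		l.close()
		return nil, err
	}
	// Register both the inotify fd and the pipe's read end for readability.
	for _, fd := range []int{l.ifd, l.wakeR} {
		ev := syscall.EpollEvent{Events: syscall.EPOLLIN, Fd: int32(fd)}
		if err = syscall.EpollCtl(l.epfd, syscall.EPOLL_CTL_ADD, fd, &ev); err != nil {
			l.close()
			return nil, err
		}
	}
	return l, nil
}

// wait returns true when inotify has events queued, false when the pipe fired.
func (l *loop) wait() (bool, error) {
	events := make([]syscall.EpollEvent, 2)
	n, err := syscall.EpollWait(l.epfd, events, -1)
	for err == syscall.EINTR { // retry if a signal interrupted the wait
		n, err = syscall.EpollWait(l.epfd, events, -1)
	}
	if err != nil {
		return false, err
	}
	for _, ev := range events[:n] {
		if int(ev.Fd) == l.wakeR {
			return false, nil
		}
	}
	return true, nil
}

// close releases every descriptor; closing -1 just returns EBADF.
func (l *loop) close() {
	for _, fd := range []int{l.epfd, l.ifd, l.wakeR, l.wakeW} {
		syscall.Close(fd)
	}
}

// demo creates a file in a watched temp dir, then requests shutdown via the pipe.
func demo() (sawEvent, sawWake bool, err error) {
	l, err := newLoop()
	if err != nil {
		return false, false, err
	}
	defer l.close()
	dir, err := os.MkdirTemp("", "fswatch-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	if _, err = syscall.InotifyAddWatch(l.ifd, dir, syscall.IN_CREATE); err != nil {
		return false, false, err
	}
	if err = os.WriteFile(filepath.Join(dir, "a"), nil, 0o600); err != nil {
		return false, false, err
	}
	if sawEvent, err = l.wait(); err != nil {
		return false, false, err
	}
	// One byte on the write end makes the next wait return immediately.
	if _, err = syscall.Write(l.wakeW, []byte{0}); err != nil {
		return sawEvent, false, err
	}
	ready, err := l.wait()
	return sawEvent, !ready, err
}

func main() {
	fmt.Println(demo())
}
```

The sketch never parses the inotify event buffer; a real reader would read from the inotify descriptor and decode the event records after wait reports readiness.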

WatchDir watches dir non-recursively. If Add(dir) fails, either on the initial call or after the watched directory itself is removed or renamed, WatchDir reports the error through the error handler and keeps calling onChange on the recheck interval, retrying Add(dir) on each tick until the watch is back online. This keeps manifest reloaders from silently stopping on a transient ENOENT/ENOSPC.
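
The retry behaviour described for WatchDir can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's implementation: watchWithRetry and its parameters are hypothetical, and add stands in for the real Add(dir) call so the example runs without a filesystem.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// watchWithRetry is a hypothetical stand-in for the loop described above.
// It calls onChange on every tick and retries add until it succeeds.
func watchWithRetry(ctx context.Context, interval time.Duration,
	add func() error, onChange func(), onError func(error)) {
	watching := false
	tryAdd := func() {
		if err := add(); err != nil {
			onError(err) // report, but keep ticking
			return
		}
		watching = true
	}
	tryAdd()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if ctx.Err() != nil {
				return
			}
			if !watching {
				tryAdd()
			}
			onChange()
		}
	}
}

// demo simulates an add that fails twice (for example with ENOENT) and
// then succeeds, stopping after the fourth onChange call.
func demo() (adds, errs, changes int) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	add := func() error {
		adds++
		if adds < 3 {
			return errors.New("no such file or directory")
		}
		return nil
	}
	onChange := func() {
		changes++
		if changes == 4 {
			cancel()
		}
	}
	watchWithRetry(ctx, 5*time.Millisecond, add, onChange, func(error) { errs++ })
	return adds, errs, changes
}

func main() {
	adds, errs, changes := demo()
	fmt.Printf("adds=%d errs=%d changes=%d\n", adds, errs, changes)
	// prints: adds=3 errs=2 changes=4
}
```

The sketch omits the path where an established watch is lost again; a fuller version would reset watching to false when the watcher reports that the directory was removed or renamed.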

WatchFile watches the parent directory so atomic-rename updates are observed; WithRecheckInterval fires onChange unconditionally each tick to support callers that retry transient apply failures, while filesystem events still go through an lstat-based change check. WithInitialCallback fires after the watch (or fallback poll) is in place to avoid a race window with the first event.
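
As an illustration of what an lstat-based change check might look like, here is a small sketch. The description does not say which fields the PR compares, so the choice below (file identity, size, mode, and modification time) is an assumption, and snapshot, changed, and demo are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// snapshot returns Lstat's result, or nil if the path cannot be stat'ed.
func snapshot(path string) os.FileInfo {
	info, err := os.Lstat(path)
	if err != nil {
		return nil
	}
	return info
}

// changed reports whether two snapshots differ. The compared fields are an
// assumption for this sketch, not necessarily what the PR checks.
func changed(prev, cur os.FileInfo) bool {
	if prev == nil || cur == nil {
		return (prev == nil) != (cur == nil) // appeared or disappeared
	}
	return !os.SameFile(prev, cur) || // replaced by rename: new inode
		prev.Size() != cur.Size() ||
		prev.Mode() != cur.Mode() ||
		!prev.ModTime().Equal(cur.ModTime())
}

// demo checks an untouched file, an atomic-rename update, and a removal.
func demo() (same, afterRename, afterRemove bool, err error) {
	dir, err := os.MkdirTemp("", "lstat-sketch")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)
	target := filepath.Join(dir, "config")
	if err = os.WriteFile(target, []byte("v1"), 0o600); err != nil {
		return
	}
	s1 := snapshot(target)
	same = !changed(s1, snapshot(target))

	// Atomic update: write a sibling temp file, then rename it over the target.
	tmp := filepath.Join(dir, ".config.tmp")
	if err = os.WriteFile(tmp, []byte("version-2"), 0o600); err != nil {
		return
	}
	if err = os.Rename(tmp, target); err != nil {
		return
	}
	s2 := snapshot(target)
	afterRename = changed(s1, s2)

	if err = os.Remove(target); err != nil {
		return
	}
	afterRemove = changed(s2, snapshot(target))
	return
}

func main() {
	fmt.Println(demo())
	// prints: true true true <nil>
}
```

Because the rename step replaces the directory entry rather than writing into the existing file, a watch on the file itself would miss it, which is why the parent directory is the thing being watched.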

Add and Remove hold the lifecycle mutex across the underlying syscall so a concurrent Close cannot tear down the FD mid-call.
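
A minimal sketch of that locking discipline, under the assumption that a single mutex guards both the closed flag and the call that touches the FD. The watcher type here is a toy with a stubbed-out syscall, not the PR's Watcher.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errClosed = errors.New("watcher closed")

// watcher is a toy type: mu guards closed and is held for the whole
// duration of the (stubbed) syscall, so Close cannot run in the middle of it.
type watcher struct {
	mu     sync.Mutex
	closed bool
	log    []string
}

func (w *watcher) Add(syscallStub func()) error {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.closed {
		return errClosed
	}
	syscallStub() // the FD is guaranteed to stay open while this runs
	w.log = append(w.log, "add-done")
	return nil
}

func (w *watcher) Close() {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.closed {
		return
	}
	w.closed = true // a real watcher would close its FDs here
	w.log = append(w.log, "closed")
}

// demo starts an Add that blocks inside the stub, issues a concurrent Close,
// then lets Add finish and records the order in which the two completed.
func demo() ([]string, error) {
	w := &watcher{}
	started := make(chan struct{})
	release := make(chan struct{})
	addDone := make(chan error, 1)
	go func() {
		addDone <- w.Add(func() { close(started); <-release })
	}()
	<-started // Add now holds the lock

	closeDone := make(chan struct{})
	go func() { w.Close(); close(closeDone) }()

	close(release)
	if err := <-addDone; err != nil {
		return nil, err
	}
	<-closeDone
	return w.log, w.Add(func() {})
}

func main() {
	order, err := demo()
	fmt.Println(order, err)
	// prints: [add-done closed] watcher closed
}
```

The trade-off is that Close blocks for as long as an in-flight Add or Remove takes, which is acceptable when the guarded call is a short syscall.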

What type of PR is this?

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Release note:

Add a new fswatch package with a native inotify implementation on Linux that delegates to fsnotify/fsnotify on non-Linux platforms.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 7, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims

@k8s-ci-robot k8s-ci-robot requested review from aojea and danwinship May 7, 2026 15:24
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 7, 2026
@dims
Member Author

dims commented May 7, 2026

xref: kubernetes/kubernetes#138812

@dims
Member Author

dims commented May 7, 2026

xref: kubernetes/kubernetes#138862

A small Kubernetes compatibility layer that removes direct
github.com/fsnotify/fsnotify dependencies from the Linux build closure.
Linux uses raw inotify(7) (epoll + pipe wakeup) inline; non-Linux
platforms wrap fsnotify behind a build-tag-isolated adapter so fsnotify
is absent from production Linux binaries.

Public surface:

  type Watcher
  func NewWatcher() (*Watcher, error)
  (Watcher) Add / Remove / Events / Errors / Close
  type Event { Name, Op } / (Event) Has(Op)
  type Op  ( Create | Write | Remove | Rename | Chmod )
  ErrClosed, ErrNonExistentWatch, ErrEventOverflow

  func WatchFile(ctx, path, onChange, ...FileOption) error
    WithRecheckInterval, WithFallbackPolling,
    WithInitialCallback, WithErrorHandler

  func WatchDir(ctx, dir, onChange, ...DirOption) error
    WithDirRecheckInterval, WithDirErrorHandler

WatchDir watches dir non-recursively; on Add(dir) failure (initial or
self Remove/Rename) it reports through the error handler and keeps
ticking onChange on the recheck interval, retrying Add(dir) on each
tick until the watch comes back online — so manifest reloaders do not
silently stop on transient ENOENT/ENOSPC.

WatchFile watches the parent directory so atomic-rename updates are
observed; WithRecheckInterval fires onChange unconditionally each tick
to support callers that retry transient apply failures, while
filesystem events still go through an lstat-based change check.
WithInitialCallback fires after the watch (or fallback poll) is in
place to avoid a race window with the first event.

Add and Remove hold the lifecycle mutex across the underlying syscall
so a concurrent Close cannot tear down the FD mid-call.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Comment thread go.mod
require github.com/go-logr/logr v1.2.0 // indirect
require (
github.com/go-logr/logr v1.2.0 // indirect
golang.org/x/sys v0.13.0 // indirect
Member

This needs CVE patches pretty often, and it now gets brought into the tree for anyone using k8s.io/utils, which had almost no dependencies before. (IMHO we should really try to factor out go-spew at some point; klog seems reasonable.)

@danwinship
Contributor

ok but why?

If we don't trust fsnotify, then we shouldn't trust it on Windows either, and if we do, then we don't need to get rid of it on Linux...

(Alternatively, to the extent that there's an implied "we don't actually care about Windows", we could just have fswatch_others.go do polling from a goroutine rather than using OS-specific APIs, which then solves Ben's problem.)
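
To make the polling alternative concrete, here is a rough sketch of what a goroutine-based fallback could look like using only the standard library. It is one possible shape, not a proposal for the actual fswatch_others.go; startPolling, snapshotDir, and equal are hypothetical names, and comparing size, mtime, and mode per entry is an assumption.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// snapshotDir maps each entry name to a fingerprint of its size, mtime, and mode.
func snapshotDir(dir string) (map[string]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	snap := make(map[string]string, len(entries))
	for _, e := range entries {
		info, err := e.Info()
		if err != nil {
			continue // entry vanished between ReadDir and Info
		}
		snap[e.Name()] = fmt.Sprintf("%d %d %v", info.Size(), info.ModTime().UnixNano(), info.Mode())
	}
	return snap, nil
}

func equal(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

// startPolling takes the baseline snapshot synchronously, then compares a
// fresh snapshot against it on every tick from a background goroutine.
func startPolling(ctx context.Context, dir string, interval time.Duration, onChange func()) error {
	prev, err := snapshotDir(dir)
	if err != nil {
		return err
	}
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				cur, err := snapshotDir(dir)
				if err != nil {
					continue // directory unreadable right now; try again next tick
				}
				if !equal(prev, cur) {
					prev = cur
					onChange()
				}
			}
		}
	}()
	return nil
}

// demo reports whether creating a file is noticed within two seconds.
func demo() (bool, error) {
	dir, err := os.MkdirTemp("", "poll-sketch")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	notified := make(chan struct{}, 1)
	err = startPolling(ctx, dir, 10*time.Millisecond, func() {
		select {
		case notified <- struct{}{}:
		default:
		}
	})
	if err != nil {
		return false, err
	}
	if err := os.WriteFile(filepath.Join(dir, "manifest.yaml"), []byte("x"), 0o600); err != nil {
		return false, err
	}
	select {
	case <-notified:
		return true, nil
	case <-time.After(2 * time.Second):
		return false, nil
	}
}

func main() {
	fmt.Println(demo())
}
```

Polling trades latency and some periodic I/O for having no OS-specific code at all, which is the dependency argument being made here.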

@dims dims closed this May 10, 2026

4 participants