Meta server returns false ENOENT for existing files under concurrent lookup load #111

@megu-sudo

Environment

  • BeeGFS version: 8.2.2
  • OS (servers): Ubuntu 22.04, kernel 5.15.0-164-generic
  • OS (clients): Ubuntu 22.04, kernel 5.15.0-25-generic
  • Hardware: 5 meta/storage servers (32 cores, 192GB RAM, NVMe, ConnectX-7 100GbE)
  • Clients: 2 (one RDMA via ConnectX-6 Dx, one TCP-only 10GbE)
  • Meta storage: ext4 on NVMe

Description

When a client opens files from 128 concurrent threads that all access the same directory, the metadata server occasionally returns ENOENT for files that definitely exist. The same files open successfully on an immediate retry.

Reproduction

  1. Use a directory containing 84 to 1,400 files (e.g., JPEG images)
  2. Run a multi-threaded application that opens the files listed in a CSV, using 128 threads
  3. Approximately 1 to 2 out of every 9,000 open() calls return ENOENT
  4. The failing file varies randomly between runs; it is never the same file twice
  5. Immediately after the failure, the same file can be opened successfully
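
The steps above could be approximated with a small stand-alone stress tool. The following is only a minimal sketch, not the application that produced the output in this report: the struct and function names are illustrative, and the path list is assumed to have been loaded already (for example from the CSV). Each worker thread takes the next path, calls open(), and on ENOENT retries once immediately so that transient failures can be counted separately from files that are genuinely missing.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Shared state for one stress run (all names here are illustrative). */
struct open_stress {
    const char **paths;         /* files to open, e.g. parsed from the CSV */
    size_t n_paths;
    atomic_size_t next;         /* index of the next path to hand out */
    atomic_size_t opened;       /* open() succeeded on the first attempt */
    atomic_size_t enoent;       /* first attempt failed with ENOENT */
    atomic_size_t recovered;    /* ENOENT failures that opened on retry */
    atomic_size_t other_errors; /* first attempt failed with another errno */
};

/* Worker: pull paths from the shared index until the list is exhausted. */
static void *open_worker(void *arg)
{
    struct open_stress *s = arg;
    for (;;) {
        size_t i = atomic_fetch_add(&s->next, 1);
        if (i >= s->n_paths)
            break;
        int fd = open(s->paths[i], O_RDONLY);
        if (fd >= 0) {
            atomic_fetch_add(&s->opened, 1);
            close(fd);
        } else if (errno == ENOENT) {
            atomic_fetch_add(&s->enoent, 1);
            /* Retry once right away to tell transient from real ENOENT. */
            fd = open(s->paths[i], O_RDONLY);
            if (fd >= 0) {
                atomic_fetch_add(&s->recovered, 1);
                close(fd);
            }
        } else {
            atomic_fetch_add(&s->other_errors, 1);
        }
    }
    return NULL;
}

/* Runs n_threads workers over s->paths. Returns 0 on success, or -1 if
 * not all threads could be started (started threads are still joined). */
static int run_open_stress(struct open_stress *s, int n_threads)
{
    assert(s != NULL && n_threads > 0);
    pthread_t *tids = calloc((size_t)n_threads, sizeof *tids);
    if (tids == NULL)
        return -1;
    int started = 0;
    for (; started < n_threads; started++) {
        if (pthread_create(&tids[started], NULL, open_worker, s) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    free(tids);
    return started == n_threads ? 0 : -1;
}
```

A caller would fill in paths and n_paths, leave the counters zeroed, and call run_open_stress(&s, 128). If the behavior described here is reproduced, enoent and recovered should both be non-zero and equal, since every failing file in this report opened on retry. Build with -pthread.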

Example output (128-thread application reading 9,428 files across multiple directories):
Warning: Failed to open file for reading!
File: /mnt/bee/data/ins_inf/ecu/edualexi/e7d6f1d67ea62cc29eb669e91dd1baf94b5a9f4a.jpg
Files: 9428 Templates: 9427

The file exists and is world-readable:
-rwxrwxrwx+ 1 ben ben 131113 Mar 6 2023 /mnt/bee/data/.../file.jpg

Key findings

  • No client-side communication errors: dmesg on the client shows zero BeeGFS messages during the failure. The meta server sends a clean ENOENT response (not a connection timeout or retry).
  • Happens across multiple meta targets: Failures occur on directories owned by different meta nodes (m:1, m:3, etc.), ruling out a single faulty server.
  • Happens with both RDMA and TCP clients: Reproduced on a 100GbE RDMA client and a 10GbE TCP-only client.
  • Not related to caching: Setting tuneENOENTCacheValidityMS=0, tuneFileSubentryCacheValidityMS=0, and tuneDirSubentryCacheValidityMS=0 on the client does not fix it.
  • Not related to server capacity: Meta servers are 95-99% CPU idle during reproduction. Increasing connMaxInternodeNum (64→256), tuneNumStreamListeners (8→32), tuneNumWorkers (128), and tuneUsePerUserMsgQueues=true did not fix it.
  • Workaround: An LD_PRELOAD shim that retries open() on ENOENT after a 2ms delay succeeds on the first retry every time, indicating that the ENOENT is transient and clears within about 2ms.
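
The retry logic behind such a workaround can be sketched as follows. This is not the shim used in this report, only a minimal illustration under stated assumptions: the function name open_retry_enoent and its parameters are hypothetical, and it is written as an ordinary wrapper around open(). In an actual LD_PRELOAD shim the function would itself be named open and would call the real implementation obtained through dlsym(RTLD_NEXT, "open"); that wiring is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: open() that retries only when errno is ENOENT.
 *
 * max_retries  number of extra attempts after the first failure
 * delay_ms     pause before each retry (the report used 2 ms)
 * retries_used optional out-parameter: how many retries were performed
 *
 * Returns the file descriptor, or -1 with errno set by the last open(). */
static int open_retry_enoent(const char *path, int flags, mode_t mode,
                             int max_retries, long delay_ms,
                             int *retries_used)
{
    assert(path != NULL && max_retries >= 0 && delay_ms >= 0);

    int retries = 0;
    int fd;
    for (;;) {
        fd = open(path, flags, mode);
        /* Stop on success, on any error other than ENOENT, or when the
         * retry budget is exhausted. */
        if (fd >= 0 || errno != ENOENT || retries >= max_retries)
            break;

        struct timespec delay = {
            .tv_sec = delay_ms / 1000,
            .tv_nsec = (delay_ms % 1000) * 1000000L,
        };
        nanosleep(&delay, NULL); /* an early wake-up by a signal is ignored */
        retries++;
    }

    if (retries_used != NULL)
        *retries_used = retries;
    return fd;
}
```

Calling open_retry_enoent(path, O_RDONLY, 0, 1, 2, &used) corresponds to the single 2ms retry described above. Logging used whenever it is non-zero gives a per-file record of the transient failures. A retry of this kind masks the symptom on the client and also delays the error for files that really are missing, so it is a diagnostic aid rather than a fix.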

Expected behavior

The meta server should never return ENOENT for a file that exists, regardless of concurrent lookup load on the same directory.

Suspected cause

Possibly a race condition in the meta server's internal directory entry lookup when multiple concurrent requests access the same directory. The per-directory locking or hash walk may briefly report "not found" while another operation on that directory is in progress.

Labels: bug, new