Lockdep circular-locking warning in beegfs #106

Description

@Asuk4

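Lockdep reports a possible circular locking dependency in the BeeGFS client kernel module. It is triggered by task dd while closing a file. Call chain at the point of the warning (top-down): __x64_sys_close → filp_flush → __FhgfsOps_flush → FhgfsOpsHelper_flushCache → __FhgfsOpsHelper_flushCacheUnlocked → FhgfsOpsHelper_writeStatelessInode → FhgfsInode_referenceHandle → __mutex_lock → lock_acquire → __lock_acquire → check_prev_add → check_noncircular → print_circular_bug. The cycle is an AB-BA inversion between two BeeGFS locks, which lockdep names &this->mutex#9 and &this->rwSem#3. The flush-on-close path takes the rwSem in FhgfsOpsHelper_flushCache and then the mutex in FhgfsInode_referenceHandle. The open path (FhgfsOps_openReferenceHandle → FhgfsInode_referenceHandle) takes the rwSem via down_read while the mutex is already held. Impact: if the two paths interleave as in the scenario lockdep prints below, both tasks can deadlock, stalling every other task that needs either lock on the affected mount.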
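To make the inversion concrete, here is a minimal sketch of how a lock-order graph flags this report. It is a toy model written for this issue, not lockdep's implementation and not BeeGFS code: the class `LockOrderGraph` and its methods are hypothetical, and the only inputs taken from the log are the two lock-class names and the order in which each path acquires them.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy model of a lock-class dependency graph (hypothetical, not kernel code)."""

    def __init__(self):
        # edges[a] holds every lock class that was acquired while a was held
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search: is dst reachable from src?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record that `new` is taken while every lock in `held` is held.

        Returns True if one of the new edges closes a cycle in the graph.
        """
        circular = False
        for h in held:
            # An existing path new -> ... -> h plus the new edge h -> new is a cycle.
            if self._reachable(new, h):
                circular = True
            self.edges[h].add(new)
        return circular


# Lock-class names exactly as lockdep prints them in the log.
MUTEX = "&this->mutex#9"
RWSEM = "&this->rwSem#3"

graph = LockOrderGraph()

# Open path (dependency chain entry #1): rwSem taken while the mutex is held.
open_path_circular = graph.acquire(held=[MUTEX], new=RWSEM)

# Flush-on-close path (entry #0): mutex taken while the rwSem is held.
flush_path_circular = graph.acquire(held=[RWSEM], new=MUTEX)

print(open_path_circular, flush_path_circular)  # prints: False True
```

The first acquisition only records the edge mutex → rwSem. The second tries to add rwSem → mutex, finds the reverse path already in the graph, and reports the cycle. This matches the order of events in the log: the open path ran earlier without a warning, and the splat fired later from the close path.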

Version

  • Linux kernel commit 44331bd6a610
  • BeeGFS version 8.3.0

Backtrace (excerpt)

[  274.037072] WARNING: possible circular locking dependency detected
[  274.037681] 6.19.0-g990dca9c3b7a-dirty #48 Tainted: G           O       
[  274.038338] ------------------------------------------------------
[  274.038938] dd/897 is trying to acquire lock:
[  274.039376] ff1100010971b3c8 (&this->mutex#9){+.+.}-{4:4}, at: FhgfsInode_referenceHandle+0x1c4/0x16a0 [beegfs]
[  274.041228] 
[  274.041228] but task is already holding lock:
[  274.041805] ff1100010971b668 (&this->rwSem#3){++++}-{4:4}, at: FhgfsOpsHelper_flushCache+0x22/0x50 [beegfs]
[  274.043589] 
[  274.043589] which lock already depends on the new lock.
[  274.043589] 
[  274.044367] 
[  274.044367] the existing dependency chain (in reverse order) is:
[  274.045085] 
[  274.045085] -> #1 (&this->rwSem#3){++++}-{4:4}:
[  274.045729]        lock_acquire+0x150/0x2c0
[  274.046201]        down_read+0xa0/0x450
[  274.046623]        FhgfsInode_referenceHandle+0xc48/0x16a0 [beegfs]
[  274.048049]        FhgfsOps_openReferenceHandle+0x3d1/0x820 [beegfs]
[  274.049471]        do_dentry_open+0x69b/0x14d0
[  274.049978]        vfs_open+0x87/0x400
[  274.050373]        path_openat+0x216f/0x3160
[  274.050856]        do_file_open+0x22d/0x480
[  274.051310]        do_sys_openat2+0x106/0x1d0
[  274.051776]        __x64_sys_openat+0x146/0x200
[  274.052247]        do_syscall_64+0x111/0x690
[  274.052697]        entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  274.053293] 
[  274.053293] -> #0 (&this->mutex#9){+.+.}-{4:4}:
[  274.053941]        check_prev_add+0xeb/0xd00
[  274.054387]        __lock_acquire+0x1641/0x2260
[  274.054869]        lock_acquire+0x150/0x2c0
[  274.055320]        __mutex_lock+0x1a4/0x25c0
[  274.055802]        FhgfsInode_referenceHandle+0x1c4/0x16a0 [beegfs]
[  274.057249]        FhgfsOpsHelper_writeStatelessInode+0xc8/0x310 [beegfs]
[  274.058779]        __FhgfsOpsHelper_flushCacheUnlocked+0xfd/0x210 [beegfs]
[  274.060286]        FhgfsOpsHelper_flushCache+0x31/0x50 [beegfs]
[  274.061699]        __FhgfsOps_flush+0x331/0xd70 [beegfs]
[  274.063061]        filp_flush+0x12d/0x1d0
[  274.063497]        __x64_sys_close+0x84/0x120
[  274.063962]        do_syscall_64+0x111/0x690
[  274.064409]        entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  274.064976] 
[  274.064976] other info that might help us debug this:
[  274.064976] 
[  274.065753]  Possible unsafe locking scenario:
[  274.065753] 
[  274.066326]        CPU0                    CPU1
[  274.066785]        ----                    ----
[  274.067229]   lock(&this->rwSem#3);
[  274.067606]                                lock(&this->mutex#9);
[  274.068226]                                lock(&this->rwSem#3);
[  274.068843]   lock(&this->mutex#9);
[  274.069223] 
[  274.069223]  *** DEADLOCK ***
[  274.069223] 
[  274.069803] 1 lock held by dd/897:
[  274.070153]  #0: ff1100010971b668 (&this->rwSem#3){++++}-{4:4}, at: FhgfsOpsHelper_flushCache+0x22/0x50 [beegfs]
[  274.071983] 
[  274.071983] stack backtrace:
[  274.072430] CPU: 7 UID: 0 PID: 897 Comm: dd Tainted: G           O        6.19.0-g990dca9c3b7a-dirty #48 PREEMPT(lazy) 
[  274.072474] Tainted: [O]=OOT_MODULE
[  274.072483] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[  274.072503] Call Trace:
[  274.072520]  <TASK>
[  274.072535]  dump_stack_lvl+0xc6/0x120
[  274.072574]  print_circular_bug+0x2d1/0x400
[  274.072632]  check_noncircular+0x146/0x160
[  274.072691]  check_prev_add+0xeb/0xd00
[  274.072724]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.072758]  ? get_reg+0x128/0x1a0
[  274.072823]  __lock_acquire+0x1641/0x2260
[  274.072866]  lock_acquire+0x150/0x2c0
[  274.072894]  ? FhgfsInode_referenceHandle+0x1c4/0x16a0 [beegfs]
[  274.073722]  ? __pfx___might_resched+0x10/0x10
[  274.073768]  __mutex_lock+0x1a4/0x25c0
[  274.073813]  ? FhgfsInode_referenceHandle+0x1c4/0x16a0 [beegfs]
[  274.074619]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.074654]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.074689]  ? FhgfsInode_referenceHandle+0x1c4/0x16a0 [beegfs]
[  274.075505]  ? __lock_acquire+0x466/0x2260
[  274.075534]  ? __pfx___mutex_lock+0x10/0x10
[  274.075582]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.075615]  ? __lock_acquire+0x466/0x2260
[  274.075649]  ? lock_acquire+0x150/0x2c0
[  274.075676]  ? is_bpf_text_address+0x2a/0x1e0
[  274.075747]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.075781]  ? lock_is_held_type+0x8f/0x100
[  274.075832]  ? FhgfsInode_referenceHandle+0x1c4/0x16a0 [beegfs]
[  274.076640]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.076674]  FhgfsInode_referenceHandle+0x1c4/0x16a0 [beegfs]
[  274.077494]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.077528]  ? filemap_get_folios_tag+0x428/0xab0
[  274.077592]  ? __pfx_FhgfsInode_referenceHandle+0x10/0x10 [beegfs]
[  274.078417]  ? __pfx_filemap_get_folios_tag+0x10/0x10
[  274.078469]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.078503]  ? _raw_spin_unlock+0x23/0x40
[  274.078538]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.078575]  ? writeback_single_inode+0x197/0x10d0
[  274.078641]  FhgfsOpsHelper_writeStatelessInode+0xc8/0x310 [beegfs]
[  274.079466]  ? __pfx_FhgfsOpsHelper_writeStatelessInode+0x10/0x10 [beegfs]
[  274.080283]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.080317]  ? lock_acquire+0x150/0x2c0
[  274.080345]  ? FhgfsOpsHelper_flushCache+0x22/0x50 [beegfs]
[  274.081182]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.081223]  __FhgfsOpsHelper_flushCacheUnlocked+0xfd/0x210 [beegfs]
[  274.082050]  FhgfsOpsHelper_flushCache+0x31/0x50 [beegfs]
[  274.082873]  __FhgfsOps_flush+0x331/0xd70 [beegfs]
[  274.083692]  ? __pfx___FhgfsOps_flush+0x10/0x10 [beegfs]
[  274.084504]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.084551]  ? file_close_fd+0x66/0x80
[  274.084600]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.084633]  ? lock_release+0xc7/0x270
[  274.084658]  ? srso_alias_return_thunk+0x5/0xfbef5
[  274.084691]  ? Logger_getLogLevel+0x12/0x40 [beegfs]
[  274.085502]  ? __pfx_FhgfsOps_flush+0x10/0x10 [beegfs]
[  274.086316]  filp_flush+0x12d/0x1d0
[  274.086359]  __x64_sys_close+0x84/0x120
[  274.086391]  do_syscall_64+0x111/0x690
[  274.086425]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  274.086455] RIP: 0033:0x7f02058b79e0
[  274.086477] Code: 0d 00 00 00 eb b2 e8 0f f8 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 80 3d 01 2c 0e 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c
[  274.086504] RSP: 002b:00007ffcbdcbe8f8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
[  274.086542] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f02058b79e0
[  274.086560] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
[  274.086576] RBP: 00007f02057bc6c0 R08: 0000000000000007 R09: 000055af1e93b010
[  274.086593] R10: 00007f02057d84f0 R11: 0000000000000202 R12: 0000000000000000
[  274.086610] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001
[  274.086647]  </TASK>
[  274.124619] beegfs: dd(897): NodeConn (acquire stream): Connected: beegfs-storage@10.0.0.2:8003 (protocol: TCP)
[  274.147403] dd (897) used greatest stack depth: 20488 bytes left
[  274.805132] beegfs: umount(905): App (stop components): Stopping components...
[  277.671386] beegfs: beegfs_XNodeSyn(879): Deregistration: Node deregistration successful.
[  277.672462] beegfs: umount(905): App (stop): All components stopped.
[  277.685370] beegfs: umount(905): BeeGFS unmounted.
[  282.292078] beegfs: mount(955): Built without NVFS RDMA support.
