bisecting fixing commit since 811218eceeaa7618652e1b8d11caeff67ab42072
building syzkaller on 0655e081f42239d4eca4345ef7293307085f78f5
testing commit 811218eceeaa7618652e1b8d11caeff67ab42072 with gcc (GCC) 8.4.1 20210217
kernel signature: e952150775247d3c4fe6bae14cb3c0be7b9dc8d717963c09d6e5e98f44a3e855
run #0: crashed: kernel BUG in pfkey_send_acquire
run #1: crashed: kernel BUG in corrupted
run #2: crashed: kernel BUG in pfkey_send_acquire
run #3: crashed: kernel BUG in pfkey_send_acquire
run #4: crashed: kernel BUG in pfkey_send_acquire
run #5: crashed: kernel BUG in pfkey_send_acquire
run #6: crashed: kernel BUG in pfkey_send_acquire
run #7: crashed: kernel BUG in pfkey_send_acquire
run #8: crashed: kernel BUG in pfkey_send_acquire
run #9: crashed: kernel BUG in pfkey_send_acquire
run #10: crashed: kernel BUG in pfkey_send_acquire
run #11: crashed: kernel BUG in pfkey_send_acquire
run #12: crashed: kernel BUG in corrupted
run #13: crashed: kernel BUG in pfkey_send_acquire
run #14: OK
run #15: OK
run #16: OK
run #17: OK
run #18: OK
run #19: OK
testing current HEAD ac3af4beac439ebccd17746c9f2fd227e88107aa
testing commit ac3af4beac439ebccd17746c9f2fd227e88107aa with gcc (GCC) 8.4.1 20210217
kernel signature: 11d7cb196abb39491a5fe1e759ecb8550cd20e638383ae724c633655f1bc9530
all runs: OK
# git bisect start ac3af4beac439ebccd17746c9f2fd227e88107aa 811218eceeaa7618652e1b8d11caeff67ab42072
Bisecting: 317 revisions left to test after this (roughly 8 steps)
[2f6f38cb8629488d5c206ca89c88ae007731e380] spi: stm32: properly handle 0 byte transfer
testing commit 2f6f38cb8629488d5c206ca89c88ae007731e380 with gcc (GCC) 8.4.1 20210217
kernel signature: 96696d3b538c0efa28ac2e97a0c1a164ca9942524a395fb36a1f660918030500
all runs: OK
# git bisect bad 2f6f38cb8629488d5c206ca89c88ae007731e380
Bisecting: 158 revisions left to test after this (roughly 7 steps)
[e86f8f885907a7c41d238c4994a34712d958ce03] net/qrtr: restrict user-controlled length in qrtr_tun_write_iter()
testing commit e86f8f885907a7c41d238c4994a34712d958ce03 with gcc (GCC) 8.4.1 20210217
kernel signature: 7975b1a240f733dd2e13300e98175761a43e0af739bbb120fbeeed92474da112
run #0: crashed: no output from test machine
run #1: OK
run #2: OK
run #3: OK
run #4: OK
run #5: OK
run #6: OK
run #7: OK
run #8: OK
run #9: OK
reproducer seems to be flaky
# git bisect good e86f8f885907a7c41d238c4994a34712d958ce03
Bisecting: 79 revisions left to test after this (roughly 6 steps)
[777d796966484f5b2b6245706057a05d1d1b642a] tcp: fix SO_RCVLOWAT related hangs under mem pressure
testing commit 777d796966484f5b2b6245706057a05d1d1b642a with gcc (GCC) 8.4.1 20210217
kernel signature: a91db9508e5fbd85be2eb700b55969ce3d67f9c9d8f5dd9c58862579cc680fca
all runs: OK
# git bisect bad 777d796966484f5b2b6245706057a05d1d1b642a
Bisecting: 39 revisions left to test after this (roughly 5 steps)
[1fc338cde538bc2d73006fffb0ea20fa97fdbd55] random: fix the RNDRESEEDCRNG ioctl
testing commit 1fc338cde538bc2d73006fffb0ea20fa97fdbd55 with gcc (GCC) 8.4.1 20210217
kernel signature: 438d55a2208774c0b3b9af283cf636fccda838b9808dba7710539ffa83603b8e
all runs: OK
# git bisect bad 1fc338cde538bc2d73006fffb0ea20fa97fdbd55
Bisecting: 19 revisions left to test after this (roughly 4 steps)
[9c4a31480b728b706844a47c262d9562e2f86ada] usb: quirks: add quirk to start video capture on ELMO L-12F document camera reliable
testing commit 9c4a31480b728b706844a47c262d9562e2f86ada with gcc (GCC) 8.4.1 20210217
kernel signature: 8aeb0956a36e00ec2ca5371e2422f7f77f806ce105baf1ab177715866a4a619a
run #0: crashed: no output from test machine
run #1: OK
run #2: OK
run #3: OK
run #4: OK
run #5: OK
run #6: OK
run #7: OK
run #8: OK
run #9: OK
run #10: OK
run #11: OK
run #12: OK
run #13: OK
run #14: OK
run #15: OK
run #16: OK
run #17: OK
run #18: OK
run #19: OK
# git bisect good 9c4a31480b728b706844a47c262d9562e2f86ada
Bisecting: 9 revisions left to test after this (roughly 3 steps)
[7496d7034a4e1b715c2baf6fe976bbaf7a361106] cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath.
testing commit 7496d7034a4e1b715c2baf6fe976bbaf7a361106 with gcc (GCC) 8.4.1 20210217
kernel signature: 22a04c805ab592f716e1beed86617f8700a34adda7027043a80642152be0a4e6
all runs: OK
# git bisect bad 7496d7034a4e1b715c2baf6fe976bbaf7a361106
Bisecting: 4 revisions left to test after this (roughly 2 steps)
[7f1ba7ee94ad1392fa4aace6d70cfece4e958ea0] block: add helper for checking if queue is registered
testing commit 7f1ba7ee94ad1392fa4aace6d70cfece4e958ea0 with gcc (GCC) 8.4.1 20210217
kernel signature: ea0ece861cf94094c9a677c6b8679368b7da4d387aedf0e9539fb87c7b59f8a8
run #0: crashed: no output from test machine
run #1: OK
run #2: OK
run #3: OK
run #4: OK
run #5: OK
run #6: OK
run #7: OK
run #8: OK
run #9: OK
run #10: OK
run #11: OK
run #12: OK
run #13: OK
run #14: OK
run #15: OK
run #16: OK
run #17: OK
run #18: OK
run #19: OK
# git bisect good 7f1ba7ee94ad1392fa4aace6d70cfece4e958ea0
Bisecting: 2 revisions left to test after this (roughly 1 step)
[6c63a7be2b11b378f77adfa8dd81e66b0df2795b] block: fix race between switching elevator and removing queues
testing commit 6c63a7be2b11b378f77adfa8dd81e66b0df2795b with gcc (GCC) 8.4.1 20210217
kernel signature: 85583047c8b7d6c9aa140d3c58c6255ebd26f715ee76d153689e93f07aad9a92
all runs: OK
# git bisect bad 6c63a7be2b11b378f77adfa8dd81e66b0df2795b
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[fa137b50f3264a157575413030464c19ab553b0e] block: split .sysfs_lock into two locks
testing commit fa137b50f3264a157575413030464c19ab553b0e with gcc (GCC) 8.4.1 20210217
kernel signature: 05f630018e54ef5d9c230170aba34c9db4147fcbc85b283070cbd9fbb73a4b03
all runs: OK
# git bisect bad fa137b50f3264a157575413030464c19ab553b0e
fa137b50f3264a157575413030464c19ab553b0e is the first bad commit
commit fa137b50f3264a157575413030464c19ab553b0e
Author: Ming Lei
Date:   Tue Aug 27 19:01:48 2019 +0800

    block: split .sysfs_lock into two locks

    commit cecf5d87ff2035127bb5a9ee054d0023a4a7cad3 upstream.

    The kernfs built-in lock of 'kn->count' is held in sysfs .show/.store
    path. Meantime, inside block's .show/.store callback, q->sysfs_lock is
    required.

    However, when mq & iosched kobjects are removed via
    blk_mq_unregister_dev() & elv_unregister_queue(), q->sysfs_lock is held
    too. This way causes AB-BA lock because the kernfs built-in lock of
    'kn-count' is required inside kobject_del() too, see the lockdep warning[1].

    On the other hand, it isn't necessary to acquire q->sysfs_lock for
    both blk_mq_unregister_dev() & elv_unregister_queue() because
    clearing REGISTERED flag prevents storing to 'queue/scheduler'
    from being happened. Also sysfs write(store) is exclusive, so no
    necessary to hold the lock for elv_unregister_queue() when it is
    called in switching elevator path.

    So split .sysfs_lock into two: one is still named as .sysfs_lock for
    covering sync .store, the other one is named as .sysfs_dir_lock
    for covering kobjects and related status change.

    sysfs itself can handle the race between add/remove kobjects and
    showing/storing attributes under kobjects. For switching scheduler
    via storing to 'queue/scheduler', we use the queue flag of
    QUEUE_FLAG_REGISTERED with .sysfs_lock for avoiding the race, then
    we can avoid to hold .sysfs_lock during removing/adding kobjects.

    [1] lockdep warning
        ======================================================
        WARNING: possible circular locking dependency detected
        5.3.0-rc3-00044-g73277fc75ea0 #1380 Not tainted
        ------------------------------------------------------
        rmmod/777 is trying to acquire lock:
        00000000ac50e981 (kn->count#202){++++}, at: kernfs_remove_by_name_ns+0x59/0x72

        but task is already holding lock:
        00000000fb16ae21 (&q->sysfs_lock){+.+.}, at: blk_unregister_queue+0x78/0x10b

        which lock already depends on the new lock.

        the existing dependency chain (in reverse order) is:

        -> #1 (&q->sysfs_lock){+.+.}:
               __lock_acquire+0x95f/0xa2f
               lock_acquire+0x1b4/0x1e8
               __mutex_lock+0x14a/0xa9b
               blk_mq_hw_sysfs_show+0x63/0xb6
               sysfs_kf_seq_show+0x11f/0x196
               seq_read+0x2cd/0x5f2
               vfs_read+0xc7/0x18c
               ksys_read+0xc4/0x13e
               do_syscall_64+0xa7/0x295
               entry_SYSCALL_64_after_hwframe+0x49/0xbe

        -> #0 (kn->count#202){++++}:
               check_prev_add+0x5d2/0xc45
               validate_chain+0xed3/0xf94
               __lock_acquire+0x95f/0xa2f
               lock_acquire+0x1b4/0x1e8
               __kernfs_remove+0x237/0x40b
               kernfs_remove_by_name_ns+0x59/0x72
               remove_files+0x61/0x96
               sysfs_remove_group+0x81/0xa4
               sysfs_remove_groups+0x3b/0x44
               kobject_del+0x44/0x94
               blk_mq_unregister_dev+0x83/0xdd
               blk_unregister_queue+0xa0/0x10b
               del_gendisk+0x259/0x3fa
               null_del_dev+0x8b/0x1c3 [null_blk]
               null_exit+0x5c/0x95 [null_blk]
               __se_sys_delete_module+0x204/0x337
               do_syscall_64+0xa7/0x295
               entry_SYSCALL_64_after_hwframe+0x49/0xbe

        other info that might help us debug this:

         Possible unsafe locking scenario:

               CPU0                    CPU1
               ----                    ----
          lock(&q->sysfs_lock);
                                       lock(kn->count#202);
                                       lock(&q->sysfs_lock);
          lock(kn->count#202);

         *** DEADLOCK ***

        2 locks held by rmmod/777:
         #0: 00000000e69bd9de (&lock){+.+.}, at: null_exit+0x2e/0x95 [null_blk]
         #1: 00000000fb16ae21 (&q->sysfs_lock){+.+.}, at: blk_unregister_queue+0x78/0x10b

        stack backtrace:
        CPU: 0 PID: 777 Comm: rmmod Not tainted 5.3.0-rc3-00044-g73277fc75ea0 #1380
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180724_192412-buildhw-07.phx4
        Call Trace:
         dump_stack+0x9a/0xe6
         check_noncircular+0x207/0x251
         ? print_circular_bug+0x32a/0x32a
         ? find_usage_backwards+0x84/0xb0
         check_prev_add+0x5d2/0xc45
         validate_chain+0xed3/0xf94
         ? check_prev_add+0xc45/0xc45
         ? mark_lock+0x11b/0x804
         ? check_usage_forwards+0x1ca/0x1ca
         __lock_acquire+0x95f/0xa2f
         lock_acquire+0x1b4/0x1e8
         ? kernfs_remove_by_name_ns+0x59/0x72
         __kernfs_remove+0x237/0x40b
         ? kernfs_remove_by_name_ns+0x59/0x72
         ? kernfs_next_descendant_post+0x7d/0x7d
         ? strlen+0x10/0x23
         ? strcmp+0x22/0x44
         kernfs_remove_by_name_ns+0x59/0x72
         remove_files+0x61/0x96
         sysfs_remove_group+0x81/0xa4
         sysfs_remove_groups+0x3b/0x44
         kobject_del+0x44/0x94
         blk_mq_unregister_dev+0x83/0xdd
         blk_unregister_queue+0xa0/0x10b
         del_gendisk+0x259/0x3fa
         ? disk_events_poll_msecs_store+0x12b/0x12b
         ? check_flags+0x1ea/0x204
         ? mark_held_locks+0x1f/0x7a
         null_del_dev+0x8b/0x1c3 [null_blk]
         null_exit+0x5c/0x95 [null_blk]
         __se_sys_delete_module+0x204/0x337
         ? free_module+0x39f/0x39f
         ? blkcg_maybe_throttle_current+0x8a/0x718
         ? rwlock_bug+0x62/0x62
         ? __blkcg_punt_bio_submit+0xd0/0xd0
         ? trace_hardirqs_on_thunk+0x1a/0x20
         ? mark_held_locks+0x1f/0x7a
         ? do_syscall_64+0x4c/0x295
         do_syscall_64+0xa7/0x295
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
        RIP: 0033:0x7fb696cdbe6b
        Code: 73 01 c3 48 8b 0d 1d 20 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 008
        RSP: 002b:00007ffec9588788 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
        RAX: ffffffffffffffda RBX: 0000559e589137c0 RCX: 00007fb696cdbe6b
        RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559e58913828
        RBP: 0000000000000000 R08: 00007ffec9587701 R09: 0000000000000000
        R10: 00007fb696d4eae0 R11: 0000000000000206 R12: 00007ffec95889b0
        R13: 00007ffec95896b3 R14: 0000559e58913260 R15: 0000559e589137c0

    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Greg KH
    Cc: Mike Snitzer
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe
    (jwang:cherry picked from commit cecf5d87ff2035127bb5a9ee054d0023a4a7cad3,
    adjust ctx for 4,19)
    Signed-off-by: Jack Wang
    Signed-off-by: Greg Kroah-Hartman

 block/blk-core.c       |  1 +
 block/blk-mq-sysfs.c   | 12 +++++-----
 block/blk-sysfs.c      | 44 +++++++++++++++++++++++--------------
 block/blk.h            |  2 +-
 block/elevator.c       | 59 +++++++++++++++++++++++++++++++++++++++++---------
 include/linux/blkdev.h |  1 +
 6 files changed, 86 insertions(+), 33 deletions(-)
culprit signature: 05f630018e54ef5d9c230170aba34c9db4147fcbc85b283070cbd9fbb73a4b03
parent signature: ea0ece861cf94094c9a677c6b8679368b7da4d387aedf0e9539fb87c7b59f8a8
Reproducer flagged being flaky
revisions tested: 11, total time: 3h31m35.923042518s (build: 1h22m23.14788755s, test: 2h8m4.876192792s)
first good commit: fa137b50f3264a157575413030464c19ab553b0e block: split .sysfs_lock into two locks
recipients (to): ["axboe@kernel.dk" "bvanassche@acm.org" "gregkh@linuxfoundation.org" "jinpu.wang@cloud.ionos.com" "ming.lei@redhat.com"]
recipients (cc): []