mirror of
https://github.com/SCST-project/scst.git
synced 2026-05-18 11:11:27 +00:00
3d2959d5bf1e567d5536bb40b6edf76b660a4389
The symptom of the crash is that one finds the system deadlocked spinning on scsi_qla_host_t.hardware_lock in qla2x00_enable_tgt_mode with a stack something like this:

crash> bt
PID: 6155   TASK: ffff88006e4bc3c0  CPU: 1   COMMAND: "scst_uid"
 #0 [ffff88007b915b28] machine_kexec at ffffffff8103163b
 #1 [ffff88007b915b88] crash_kexec at ffffffff810b8e52
 #2 [ffff88007b915c58] panic at ffffffff814ed0ab
 #3 [ffff88007b915cd8] spin_bug at ffffffff8127cd46
 #4 [ffff88007b915d18] _raw_spin_lock at ffffffff8127d015
 #5 [ffff88007b915d68] _spin_lock_irqsave at ffffffff814f02e4
 #6 [ffff88007b915d88] qla2x00_enable_tgt_mode at ffffffffa047b672 [qla2xxx]
 #7 [ffff88007b915db8] q2t_host_action at ffffffffa06db6a6 [qla2x00tgt]
 #8 [ffff88007b915df8] q2t_enable_tgt at ffffffffa06db6ea [qla2x00tgt]
 #9 [ffff88007b915e18] scst_process_tgt_enable_store at ffffffffa04f102e [scst]
#10 [ffff88007b915e48] scst_tgt_enable_store_work_fn at ffffffffa04f1176 [scst]
#11 [ffff88007b915e58] scst_process_sysfs_works at ffffffffa04e8bbe [scst]
#12 [ffff88007b915e78] sysfs_work_thread_fn at ffffffffa04e8db5 [scst]
#13 [ffff88007b915ed8] kthread at ffffffff8108f976
#14 [ffff88007b915f48] kernel_thread at ffffffff8100c20a

I was pulling my hair out on this one, because with the spinlock debugging (enhanced to capture the PID along with the task pointer), I figured out that the task (and process) that originally locked the lock was gone! It got really confusing when I added more spinlock debug code to the kernel to detect locks held in the task switching and task/process termination paths -- and didn't catch anything terminating with locks held!

I finally tracked the problem down to two things:

1. When qla24xx_create_vhost creates a new virtual scsi_qla_host_t it does it by copying the physical (aka parent) scsi_qla_host_t. Under the right unlucky conditions, this can happen with the hardware_lock held (the spinlock is embedded in the structure).

2. The code should only be locking the hardware_lock of the physical scsi_qla_host_t, because the lock is associated with the hardware. Unfortunately, quite a few places are not using to_qla_parent to make sure they lock the correct lock. One of those places is qla2x00_enable_tgt_mode. Along with the deadlock, this has the potential to leave the hardware and driver structures in unpredictable states, because the lock isn't always providing serialization.

The fix entails two things:

1. Zeroing the lock after copying the scsi_qla_host_t structure: This won't stop the deadlock, but will enable the spinlock debug code to easily catch anything that misbehaves and locks the wrong lock. I also initialized the other locks because they could have the same problem. I also initialized the list heads, because they could end up holding dangling references. I did not initialize all pointers, because there are quite a few that point to read only data and are OK (and I didn't have time to research all of them).

2. Using to_qla_parent everywhere when locking and the scsi_qla_host_t structure might be virtual. This is a lot of changes, but they are the same thing over and over again.

I did not make an effort to look for scalar or pointer fields that are being picked from the wrong structure. That's getting to be as much pain as merging up to the latest QLogic driver (which would have gotten rid of this problem).

From "Robinson, Herbie" <Herbie.Robinson@stratus.com>

git-svn-id: http://svn.code.sf.net/p/scst/svn/trunk@4420 d57e44dd-8a1f-0410-8b47-8ef2f437770f
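The two problems and the two fixes described above can be illustrated with a small user-space sketch. This is not the driver code: struct toy_host, struct toy_lock and every toy_* function are hypothetical stand-ins invented for illustration, and the "lock" is a plain flag so that the failure shows up as a false return instead of a real spin. Only the general shape (a lock embedded in a structure that is copied wholesale, and a helper that maps a virtual host back to its physical parent) is taken from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel spinlock: just a flag, no spinning. */
struct toy_lock {
    bool locked;
};

/* Hypothetical, heavily simplified stand-in for scsi_qla_host_t. */
struct toy_host {
    struct toy_lock hardware_lock; /* embedded, so a struct copy duplicates it */
    struct toy_host *parent;       /* NULL for the physical host */
};

static inline void toy_lock_init(struct toy_lock *l)
{
    l->locked = false;
}

/* Returns false in the situation where a real spinlock would spin forever. */
static inline bool toy_trylock(struct toy_lock *l)
{
    if (l->locked)
        return false;
    l->locked = true;
    return true;
}

static inline void toy_unlock(struct toy_lock *l)
{
    assert(l->locked); /* unlocking a lock that is not held is a bug */
    l->locked = false;
}

/* Plays the role the text assigns to to_qla_parent: virtual -> physical. */
static inline struct toy_host *toy_to_parent(struct toy_host *h)
{
    return h->parent ? h->parent : h;
}

/* Problem 1: the copy also copies the lock state, held or not. */
static inline void toy_create_vhost_buggy(struct toy_host *vh,
                                          struct toy_host *ph)
{
    *vh = *ph;
    vh->parent = ph;
}

/* Fix 1: re-initialize the embedded lock after the copy. */
static inline void toy_create_vhost_fixed(struct toy_host *vh,
                                          struct toy_host *ph)
{
    *vh = *ph;
    toy_lock_init(&vh->hardware_lock);
    vh->parent = ph;
}
```

If the physical host's lock is held while toy_create_vhost_buggy runs, the copy keeps a "held" lock that nobody will ever release, which matches the symptom of an owner that no longer exists. Fix 2 corresponds to always locking toy_to_parent(h)->hardware_lock rather than h->hardware_lock, so that every caller serializes on the one lock that belongs to the hardware.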
This is the SCST development repository. It contains not a single SCST project, as one might think, but a number of them, divided as follows:

1. SCST core in the scst/ subdirectory

2. Administration utility for SCST core, scstadmin, in scstadmin/

3. Target drivers in their own subdirectories: qla2x00t/, iscsi-scst/, etc.

4. User space programs in the usr/ subdirectory, like fileio_tgt.

5. Various docs in the doc/ subdirectory.

These subprojects are in most cases independent of each other, although some of them depend on the SCST core. They are kept in a single repository only to simplify their development; they are released independently.

Thus, use "make all" only if you really need to build everything. Otherwise build only what you need, for example for iSCSI-SCST:

    make scst scst_install iscsi iscsi_install

For more information about each subproject, see its README file.

Vladislav Bolkhovitin <vst@vlnb.net>, http://scst.sourceforge.net
Languages: C 90.1%, Perl 4.2%, Shell 1.8%, HTML 1.7%, Makefile 1.2%, Other 0.9%