mirror of
https://github.com/SCST-project/scst.git
synced 2026-06-09 23:22:33 +00:00
83745c0a2dbbdb0a5674fc1bb4e4d922bc38904b
The device cleanup loop in dev_user_process_cleanup() spins at ~2 million
iterations per second and never exits, ultimately triggering a kernel soft
lockup. The previous workaround panicked the system after 10,000
iterations.
Root cause (confirmed by instrumentation):
A ucmd gets permanently stuck in ucmd_hash with:
    state        = UCMD_STATE_ON_FREE_SKIPPED (7)
    cmd          = NULL
    ref          = 1
    sent_to_user = 0
The stuck ref=1 is the reference taken by dev_user_alloc_pages() via
ucmd_get() for the first scatter-gather page. It is released only by
dev_user_free_sg_entries() → ucmd_put(), which fires when the SGV pool
*evicts* a cached object. The sequence that prevents this eviction:
1. dev_user_unjam_dev() finds an EXECING command (sent_to_user=1,
   ref=2: alloc + alloc_pages), bumps ref to 3 via ucmd_get_check(),
   then calls dev_user_unjam_cmd().
2. dev_user_unjam_cmd() releases cmd_list_lock and calls
   scst_cmd_done(SCST_CONTEXT_THREAD), which synchronously runs the
   full SCST completion pipeline:
     dev_user_on_free_cmd()
       ucmd->cmd = NULL
       ucmd->state = UCMD_STATE_ON_FREE_SKIPPED  (type == IGNORE)
       dev_user_process_reply_on_free()
         dev_user_free_sgv()
           sgv_pool_free(ucmd->sgv)
             /* SGV cached on pool LRU; dev_user_free_sg_entries()
              * not called; alloc_pages ucmd_get() not balanced */
           ucmd->sgv = NULL
         ucmd_put()            ← ref: 3→2
3. Back in dev_user_unjam_dev(): ucmd_put() ← ref: 2→1.
   ref != 0, so dev_user_free_ucmd() / cmd_remove_hash() are NOT called.
   ucmd remains in ucmd_hash.
4. dev_user_unjam_cmd() also reset sent_to_user=0, so on every subsequent
   pass through dev_user_unjam_dev() the ucmd is counted (res++) but
   skipped (!sent_to_user → continue). dev_user_get_next_cmd() returns
   -EAGAIN (ucmd is not in ready_cmd_list). With cleanup_done=1 and res
   never dropping to 0, the while(1) loop never reaches its exit condition.
The sgv_pool_flush() calls at the TOP of dev_user_unjam_dev() run
BEFORE any commands are unjammed. SGV objects cached during unjamming
are therefore never flushed; dev_user_free_sg_entries() never fires.
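The reference-count walk in steps 1 to 3 can be traced with a small userspace model. This is only a sketch under stated assumptions: `toy_ucmd`, `toy_put()`, `toy_sgv_pool_free()` and `toy_unjam_one()` are hypothetical names invented for illustration, and none of this is SCST code. It models a single unjam of one EXECING command and nothing more.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; every name here is hypothetical, not SCST code. */
struct toy_ucmd {
	int  ref;          /* reference count */
	bool sent_to_user; /* command was handed to the user-space handler */
	bool in_hash;      /* still linked in the (modelled) ucmd_hash */
	bool sgv_cached;   /* SGV object parked on the pool LRU */
};

/* Drop one reference; only the last put removes the ucmd from the hash. */
static void toy_put(struct toy_ucmd *u)
{
	assert(u->ref > 0);
	if (--u->ref == 0)
		u->in_hash = false;
}

/*
 * Models sgv_pool_free() when the object is cached instead of destroyed:
 * the reference taken for the first scatter-gather page is NOT released.
 */
static void toy_sgv_pool_free(struct toy_ucmd *u)
{
	u->sgv_cached = true;
}

/* Steps 1-3 for one EXECING command; returns the final reference count. */
static int toy_unjam_one(struct toy_ucmd *u)
{
	u->ref++;                 /* step 1: get before unjamming, 2 -> 3 */
	u->sent_to_user = false;  /* reset during unjamming (see step 4) */
	toy_sgv_pool_free(u);     /* step 2: SGV cached, no put happens */
	toy_put(u);               /* step 2: put in the completion path, 3 -> 2 */
	toy_put(u);               /* step 3: put back in the caller, 2 -> 1 */
	return u->ref;
}
```

Starting from `ref = 2` with `sent_to_user` and `in_hash` set, `toy_unjam_one()` leaves the model with one reference, still in the hash, and with `sent_to_user` cleared, which matches the stuck state listed under the root cause.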
Fix:
Add sgv_pool_flush() for both pools at the BOTTOM of
dev_user_unjam_dev(), after the spinlock is released. This evicts
all SGV objects cached during unjamming, triggering:
    dev_user_free_sg_entries() → ucmd_put() → dev_user_free_ucmd()
        → cmd_remove_hash()
removing the stuck ucmd from the hash. On the next cleanup-loop iteration
dev_user_unjam_dev() returns res=0 and dev_user_process_cleanup() breaks.
sgv_pool_flush() is fully synchronous (calls sgv_dtor_and_free() inline);
by the time it returns the callbacks have already fired and the ucmd has
already been removed from the hash. No schedule() or sleep is needed.
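The effect of the extra flush can be sketched the same way. The model below is a hedged illustration, not the actual patch: `stuck_ucmd`, `stuck_put()`, `model_sgv_pool_flush()` and `model_count_in_hash()` are hypothetical names, and the flush is reduced to a single cached object whose eviction callback drops the last reference synchronously.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; every name here is hypothetical, not SCST code. */
struct stuck_ucmd {
	int  ref;        /* reference count, 1 in the stuck state */
	bool in_hash;    /* still linked in the (modelled) ucmd_hash */
	bool sgv_cached; /* SGV object still parked on the pool LRU */
};

/* The last put frees the ucmd, which removes it from the hash. */
static void stuck_put(struct stuck_ucmd *u)
{
	assert(u->ref > 0);
	if (--u->ref == 0)
		u->in_hash = false;
}

/*
 * Models a synchronous pool flush: evicting the cached object runs the
 * free callback inline, which balances the reference taken at page
 * allocation. Returns the number of objects evicted.
 */
static int model_sgv_pool_flush(struct stuck_ucmd *u)
{
	if (!u->sgv_cached)
		return 0;
	u->sgv_cached = false;
	stuck_put(u);
	return 1;
}

/* What a later unjam pass would count as res for this one ucmd. */
static int model_count_in_hash(const struct stuck_ucmd *u)
{
	return u->in_hash ? 1 : 0;
}
```

Calling `model_sgv_pool_flush()` on a model in the stuck state (`ref = 1`, cached SGV, in the hash) drops the count to zero before the call returns, so `model_count_in_hash()` reports 0 immediately afterwards. A second flush finds nothing cached and does no further put.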
Overview
This is the source code repository of the SCST project. SCST is a collection of Linux kernel drivers that implement SCSI target functionality. The SCST project includes:
- The SCST core in the scst/ subdirectory.
- A tool for loading, saving and modifying the SCST configuration in the scstadmin/ directory.
- Several SCSI target drivers in the directories iscsi-scst/, qla2x00t/, srpt/, scst_local/ and fcst/.
- User space programs in the usr/ subdirectory, e.g. fileio_tgt.
- Various documentation in the doc/ subdirectory.
Instructions for building and installing SCST are available in the INSTALL.md file.
QLogic target driver
Two QLogic target drivers are included in the SCST project.
The default driver, the newer of the two, is located in the qla2x00t-32gbit directory and supports FC adapters up to 32 Gb/s.
Should anyone wish to switch back to the older driver, which only supports
adapters up to 16 Gb/s, it is located in the qla2x00t directory. To use the
older driver, build SCST with the environment variable QLA_32GBIT=no set.
Vladislav Bolkhovitin <vst@vlnb.net>, http://scst.sourceforge.net
