qla2x00t-32gbit: Delay module unload while fabric scan in progress

System crash seen during load/unload test in a loop.

[105954.384919] RBP: ffff914589838dc0 R08: 0000000000000000 R09: 0000000000000086
[105954.384920] R10: 000000000000000f R11: ffffa31240904be5 R12: ffff914605f868e0
[105954.384921] R13: ffff914605f86910 R14: 0000000000008010 R15: 00000000ddb7c000
[105954.384923] FS:  0000000000000000(0000) GS:ffff9163fec40000(0000) knlGS:0000000000000000
[105954.384925] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[105954.384926] CR2: 000055d31ce1d6a0 CR3: 0000000119f5e001 CR4: 0000000000770ee0
[105954.384928] PKRU: 55555554
[105954.384929] Call Trace:
[105954.384931]  <IRQ>
[105954.384934]  qla24xx_sp_unmap+0x1f3/0x2a0 [qla2xxx]
[105954.384962]  ? qla_async_scan_sp_done+0x114/0x1f0 [qla2xxx]
[105954.384980]  ? qla24xx_els_ct_entry+0x4de/0x760 [qla2xxx]
[105954.384999]  ? __wake_up_common+0x80/0x190
[105954.385004]  ? qla24xx_process_response_queue+0xc2/0xaa0 [qla2xxx]
[105954.385023]  ? qla24xx_msix_rsp_q+0x44/0xb0 [qla2xxx]
[105954.385040]  ? __handle_irq_event_percpu+0x3d/0x190
[105954.385044]  ? handle_irq_event+0x58/0xb0
[105954.385046]  ? handle_edge_irq+0x93/0x240
[105954.385050]  ? __common_interrupt+0x41/0xa0
[105954.385055]  ? common_interrupt+0x3e/0xa0
[105954.385060]  ? asm_common_interrupt+0x22/0x40

The root cause of this was that there was a free (dma_free_attrs) in the
interrupt context.  There was a device discovery/fabric scan in
progress.  A module unload was issued which set the UNLOADING flag.  As
part of the discovery, after receiving an interrupt a work queue was
scheduled (which involved a work to be queued).  Since the UNLOADING
flag is set, the work item was not allocated and the mapped memory had
to be freed.  The free occurred in interrupt context leading to system
crash.  Delay the driver unload until the fabric scan is complete to
avoid the crash.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/202512090414.07Waorz0-lkp@intel.com/
Fixes: 783e0dc4f66a ("qla2xxx: Check for device state before unloading the driver.")
Cc: stable@vger.kernel.org
Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com>
Link: https://patch.msgid.link/20251210101604.431868-8-njavali@marvell.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 8890bf450e0b upstream ]
This commit is contained in:
Gleb Chesnokov
2026-04-09 19:35:30 +03:00
parent 8c6c124d13
commit 9337d33c3c

View File

@@ -1246,7 +1246,8 @@ qla2x00_wait_for_hba_ready(scsi_qla_host_t *vha)
while ((qla2x00_reset_active(vha) || ha->dpc_active ||
ha->flags.mbox_busy) ||
test_bit(FX00_RESET_RECOVERY, &vha->dpc_flags) ||
test_bit(FX00_TARGET_SCAN, &vha->dpc_flags)) {
test_bit(FX00_TARGET_SCAN, &vha->dpc_flags) ||
(vha->scan.scan_flags & SF_SCANNING)) {
if (test_bit(UNLOADING, &base_vha->dpc_flags))
break;
msleep(1000);