mirrors/scst - scst - Anomalous Gitea

mirror of https://github.com/SCST-project/scst.git synced 2026-06-09 23:22:33 +00:00

Author	SHA1	Message	Date
Gleb Chesnokov	ff9d238c3d	scst/include/backport.h: Fix __has_builtin() check for older GCC Fixes: https://github.com/SCST-project/scst/issues/377	2026-06-16 17:40:43 +03:00
Gleb Chesnokov	ccc7eadd0b	scst/include/backport.h: Fix backport for new stable kernels This patch fixes the build against kernel versions >= 6.18.33.	2026-06-16 17:01:15 +03:00
Gleb Chesnokov	6fa70db0e1	nightly build: Update kernel versions Another kernel versions update	2026-06-16 15:23:31 +03:00
tashen	83745c0a2d	scst_user: Fix infinite cleanup loop caused by stale SGV pool reference The device cleanup loop in dev_user_process_cleanup() spins at ~2 million iterations per second and never exits, ultimately triggering a kernel soft lockup. The previous workaround panicked the system after 10,000 iterations. Root cause (confirmed by instrumentation): A ucmd gets permanently stuck in ucmd_hash with: state = UCMD_STATE_ON_FREE_SKIPPED (7) cmd = NULL ref = 1 sent_to_user = 0 The stuck ref=1 is the reference taken by dev_user_alloc_pages() via ucmd_get() for the first scatter-gather page. It is released only by dev_user_free_sg_entries() → ucmd_put(), which fires when the SGV pool evicts a cached object. The sequence that prevents this eviction: 1. dev_user_unjam_dev() finds an EXECING command (sent_to_user=1, ref=2: alloc + alloc_pages), bumps ref to 3 via ucmd_get_check(), then calls dev_user_unjam_cmd(). 2. dev_user_unjam_cmd() releases cmd_list_lock and calls scst_cmd_done(SCST_CONTEXT_THREAD), which synchronously runs the full SCST completion pipeline: dev_user_on_free_cmd() ucmd->cmd = NULL ucmd->state = UCMD_STATE_ON_FREE_SKIPPED (type == IGNORE) dev_user_process_reply_on_free() dev_user_free_sgv() sgv_pool_free(ucmd->sgv) /* SGV cached on pool LRU; dev_user_free_sg_entries() * not called; alloc_pages ucmd_get() not balanced */ ucmd->sgv = NULL ucmd_put() ← ref: 3→2 3. Back in dev_user_unjam_dev(): ucmd_put() ← ref: 2→1. ref != 0, so dev_user_free_ucmd() / cmd_remove_hash() are NOT called. ucmd remains in ucmd_hash. 4. unjam_cmd also reset sent_to_user=0, so on every subsequent pass through dev_user_unjam_dev() the ucmd is counted (res++) but skipped (!sent_to_user → continue). dev_user_get_next_cmd() returns -EAGAIN (ucmd is not in ready_cmd_list). With cleanup_done=1 the while(1) loop has no exit condition. The sgv_pool_flush() calls at the TOP of dev_user_unjam_dev() run BEFORE any commands are unjammed. SGV objects cached during unjamming are therefore never flushed; dev_user_free_sg_entries() never fires. Fix: Add sgv_pool_flush() for both pools at the BOTTOM of dev_user_unjam_dev(), after the spinlock is released. This evicts all SGV objects cached during unjamming, triggering: dev_user_free_sg_entries() → ucmd_put() → dev_user_free_ucmd() → cmd_remove_hash() removing the stuck ucmd from the hash. On the next cleanup-loop iteration dev_user_unjam_dev() returns res=0 and dev_user_process_cleanup() breaks. sgv_pool_flush() is fully synchronous (calls sgv_dtor_and_free() inline); by the time it returns the callbacks have already fired and the ucmd has already been removed from the hash. No schedule() or sleep is needed.	2026-06-08 13:00:20 +03:00
Gleb Chesnokov	3111277776	qla2x00t-32gbit: Use nr_cpu_ids instead of NR_CPUS for qp_cpu_map allocation Change the memory allocation for qp_cpu_map to use the actual number of CPUs ('nr_cpu_ids') instead of the maximum possible CPUs ('NR_CPUS'). This saves memory on systems where the maximum CPU limit is much higher than the active CPU count. Signed-off-by: Li RongQing <lirongqing@baidu.com> Link: https://patch.msgid.link/20260331053245.1839-1-lirongqing@baidu.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 271aeff266c9 upstream ]	2026-05-28 16:58:40 +03:00
Gleb Chesnokov	cc83944d98	qla2x00t-32gbit: Add support to report MPI FW state MPI firmware state was returned as 0. Get MPI FW state to proceed with flash image validation. A new sysfs node 'mpi_fw_state' is added to report MPI firmware state: /sys/class/scsi_host/hostXX/mpi_fw_state Fixes: d74181ca110e ("scsi: qla2xxx: Add bsg interface to support firmware img validation") Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://patch.msgid.link/20260305093337.2007205-1-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 0e124af675eb upstream ]	2026-05-28 16:58:40 +03:00
Gleb Chesnokov	4ec53813b1	scst: Unbreak the RHEL 9.8 and RHEL 10.2 builds Fixes: https://github.com/SCST-project/scst/issues/369 Fixes: https://github.com/SCST-project/scst/issues/355	2026-05-28 15:49:14 +03:00
Gleb Chesnokov	5eb824aea8	scripts/kernel-functions: Handle Rocky kernel rebuild tarballs Rocky Linux kernel rebuilds can append an extra release suffix to the source RPM name. For example, kernel-5.14.0-687.10.1.el9_8.0.1.src.rpm can still contain a linux-5.14.0-687.10.1.el9_8.tar.* source archive. The exact linux-${kver}.tar. match then fails before the source tree can be extracted. Look for the archive name again with a trailing .0.N rebuild suffix removed and keep renaming the extracted tree to the full requested kernel release, so later paths remain unchanged.	2026-05-28 15:49:14 +03:00
Gleb Chesnokov	86a2ee9dc3	scripts/kernel-functions: Validate downloaded RPMs	2026-05-26 18:27:32 +03:00
Gleb Chesnokov	181af5c698	nightly build: Add Rocky linux kernels	2026-05-26 18:27:32 +03:00
Gleb Chesnokov	f700584be4	scripts: Add Rocky Linux kernel support	2026-05-26 18:27:32 +03:00
Brian M	67d2469c9b	scst_dlm: Fix use-after-free by waiting for lkb destruction When scst_dlm_unlock_wait() timed out, the caller could proceed to free the containing storage of @lksb while DLM still held a reference to it in lkb->lkb_lksb. A subsequently delivered AST -- e.g. a recovery-synthesized UNLOCK_REPLY for a departed peer -- would then write sb_status into freed memory. Make scst_dlm_remove_lock() honor its contract on every path: after the convert-to-NL clean-release step, issue dlm_unlock(DLM_LKF_FORCEUNLOCK) and loop on the completion until a destroying CAST (sb_status == -DLM_EUNLOCK) is observed. FORCEUNLOCK ensures that such a CAST arrives even if an earlier operation is still in flight or a peer is departing, since dlm_recover_waiters_pre() synthesizes the reply in the latter case. Non-destroying CASTs from previously canceled converts are logged and the wait continues. Route the per-registrant teardown (scst_dlm_pr_rm_reg_ls), the remote-UA teardown (scst_dlm_rm_rem_ua_ls), and the remaining stack-allocated lksb teardown sites through scst_dlm_remove_lock() so they all observe the destroying CAST before their storage may be reused. As a side benefit, those sites gain the convert-to-NL step that preserves LVB validity on peers per the DLM EX/PW release rule.	2026-05-25 13:38:38 +03:00
Brian M	e2c57de2d5	scst_vdisk: enable bind_alua_state for fileio devices Wire the on_alua_state_change_{start,finish} callbacks into vdisk_file_devtype and expose bind_alua_state as a sysfs attribute and create-time parameter for fileio. The callback bodies were already backing-agnostic; rename them from blockio_* to vdev_* to match. Default bind_alua_state=0 for fileio (vs. 1 for blockio) to preserve existing behavior on upgrade. Adjust the sysfs show function to compare against the per-backing default so scstadmin persists explicit settings correctly.	2026-05-15 13:21:47 +03:00
Brian M	c259c7abb8	scst: park async LUN-replace cleanup until async_lun_replace clears Follow-up to commit `a4a55aab41` ("scst: add async_lun_replace to defer tgt_dev cleanup after LUN replace"), which moved the slow drain of old tgt_devs off the LUN-replace management write path. That defers the drain. It does not defer the free - the asynchronous worker still acquires scst_mutex to call scst_free_tgt_dev, and that function's first action, scst_clear_reservation -> scst_dlm_res_lock, does a DLM round-trip. When the peer node has just died and has not yet been evicted from the lockspace, that round-trip stalls in scst_dlm_lock_wait. With scst_mutex held by the stalled worker, every subsequent LUN-replace management write queues behind it. When async_lun_replace=1, scst_acg_repl_lun() now parks the deferred cleanup of old tgt_devs on a list instead of scheduling it on the workqueue immediately. Writing 0 to the async_lun_replace sysfs knob releases the parked work in a batch. This lets the orchestrating layer hold cleanup until any cluster coordination it depends on (e.g. DLM peer eviction during HA failover) has completed. Module unload calls scst_async_lun_replace_set(false) as a safety net.	2026-05-13 00:27:09 +03:00
Gleb Chesnokov	d18c8fc718	scst_vdisk: Validate vdisk_blockio block size against backend device Validate the configured vdisk_blockio block size against the backend block device during open using bdev_validate_blocksize(). This rejects incompatible configurations early and prevents misaligned I/O from reaching the backend device.	2026-04-16 20:55:42 +03:00
Gleb Chesnokov	a266c02db5	nightly build: Update kernel versions Another kernel versions update	2026-04-14 17:17:39 +03:00
Gleb Chesnokov	ebce50dccd	scst/include/backport.h: Fix UEK8 build This patch fixes the build against UEK8 kernel version 6.12.0-200.74.27.2.el10uek.	2026-04-12 18:37:09 +03:00
Gleb Chesnokov	b97f62a869	nightly build: Update kernel versions Another kernel versions update	2026-04-12 18:37:09 +03:00
Gleb Chesnokov	9003543f45	scst: Port to Linux kernel v7.0 Support for the following changes in the Linux kernel v7.0: - e3b2cf6e5dba ("kernfs: pass struct ns_common instead of const void * for namespace tags")	2026-04-12 15:38:24 +03:00
Gleb Chesnokov	8a3b257c33	scst: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Linux kernel Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci against the SCST tree. This patch doesn't change any functionality.	2026-04-12 15:38:24 +03:00
Gleb Chesnokov	77c1efbc1e	scst/include/backport.h: Add kmalloc_obj() helper family Add compatibility helpers for the kmalloc_obj() family so SCST can use the typed allocation helpers on kernels that do not provide them yet. A following patch will convert existing non-scalar allocations to use these helpers. This patch doesn't change any functionality.	2026-04-12 15:38:24 +03:00
Gleb Chesnokov	5d74814c36	qla2x00t-32gbit: Completely fix fcport double free In qla24xx_els_dcmd_iocb() sp->free is set to qla2x00_els_dcmd_sp_free(). When an error happens, this function is called by qla2x00_sp_release(), when kref_put() releases the first and the last reference. qla2x00_els_dcmd_sp_free() frees fcport by calling qla2x00_free_fcport(). Doing it one more time after kref_put() is a bad idea. Fixes: 82f522ae0d97 ("scsi: qla2xxx: Fix double free of fcport") Fixes: 4895009c4bb7 ("scsi: qla2xxx: Prevent command send on chip reset") Signed-off-by: Vladimir Riabchun <ferr.lambarginio@gmail.com> Signed-off-by: Farhat Abbas <fabbas@cloudlinux.com> Link: https://patch.msgid.link/aYsDln9NFQQsPDgg@vova-pc Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit c0b7da13a04b upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	ba479183c2	qla2x00t-32gbit: Add WQ_PERCPU to alloc_workqueue() Upstream workqueue changes introduce a new WQ_PERCPU flag and plan to switch alloc_workqueue()'s default from per-CPU to unbound. To keep SCST behaviour unchanged across kernels, explicitly request WQ_PERCPU.	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	4007b86ccc	qla2x00t-32gbit: Update version to 10.02.10.100-k Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-13-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 1732d10fa7ed upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	e1f87a3574	qla2x00t-32gbit: Fix bsg_done() causing double free Kernel panic observed on system, [5353358.825191] BUG: unable to handle page fault for address: ff5f5e897b024000 [5353358.825194] #PF: supervisor write access in kernel mode [5353358.825195] #PF: error_code(0x0002) - not-present page [5353358.825196] PGD 100006067 P4D 0 [5353358.825198] Oops: 0002 [#1] PREEMPT SMP NOPTI [5353358.825200] CPU: 5 PID: 2132085 Comm: qlafwupdate.sub Kdump: loaded Tainted: G W L ------- --- 5.14.0-503.34.1.el9_5.x86_64 #1 [5353358.825203] Hardware name: HPE ProLiant DL360 Gen11/ProLiant DL360 Gen11, BIOS 2.44 01/17/2025 [5353358.825204] RIP: 0010:memcpy_erms+0x6/0x10 [5353358.825211] RSP: 0018:ff591da8f4f6b710 EFLAGS: 00010246 [5353358.825212] RAX: ff5f5e897b024000 RBX: 0000000000007090 RCX: 0000000000001000 [5353358.825213] RDX: 0000000000001000 RSI: ff591da8f4fed090 RDI: ff5f5e897b024000 [5353358.825214] RBP: 0000000000010000 R08: ff5f5e897b024000 R09: 0000000000000000 [5353358.825215] R10: ff46cf8c40517000 R11: 0000000000000001 R12: 0000000000008090 [5353358.825216] R13: ff591da8f4f6b720 R14: 0000000000001000 R15: 0000000000000000 [5353358.825218] FS: 00007f1e88d47740(0000) GS:ff46cf935f940000(0000) knlGS:0000000000000000 [5353358.825219] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [5353358.825220] CR2: ff5f5e897b024000 CR3: 0000000231532004 CR4: 0000000000771ef0 [5353358.825221] PKRU: 55555554 [5353358.825222] Call Trace: [5353358.825223] <TASK> [5353358.825224] ? show_trace_log_lvl+0x1c4/0x2df [5353358.825229] ? show_trace_log_lvl+0x1c4/0x2df [5353358.825232] ? sg_copy_buffer+0xc8/0x110 [5353358.825236] ? __die_body.cold+0x8/0xd [5353358.825238] ? page_fault_oops+0x134/0x170 [5353358.825242] ? kernelmode_fixup_or_oops+0x84/0x110 [5353358.825244] ? exc_page_fault+0xa8/0x150 [5353358.825247] ? asm_exc_page_fault+0x22/0x30 [5353358.825252] ? memcpy_erms+0x6/0x10 [5353358.825253] sg_copy_buffer+0xc8/0x110 [5353358.825259] qla2x00_process_vendor_specific+0x652/0x1320 [qla2xxx] [5353358.825317] qla24xx_bsg_request+0x1b2/0x2d0 [qla2xxx] Most routines in qla_bsg.c call bsg_done() only for success cases. However a few invoke it for failure case as well leading to a double free. Validate before calling bsg_done(). Cc: stable@vger.kernel.org Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-12-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit c2c68225b145 upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	d5298a4930	qla2x00t-32gbit: Query FW again before proceeding with login Issue occurred during a continuous reboot test of several thousand iterations specific to a fabric topo with dual mode target where it sends a PLOGI/PRLI and then sends a LOGO. The initiator was also in the process of discovery and sent a PLOGI to the switch. It then queried a list of ports logged in via mbx 75h and the GPDB response indicated that the target was logged in. This caused a mismatch in the states between the driver and FW. Requery the FW for the state and proceed with the rest of discovery process. Fixes: a4239945b8ad ("scsi: qla2xxx: Add switch command to simplify fabric discovery") Cc: stable@vger.kernel.org Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-11-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 42b2dab4340d upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	59015c8dd8	qla2x00t-32gbit: Validate sp before freeing associated memory System crash with the following signature [154563.214890] nvme nvme2: NVME-FC{1}: controller connect complete [154564.169363] qla2xxx [0000:b0:00.1]-3002:2: nvme: Sched: Set ZIO exchange threshold to 3. [154564.169405] qla2xxx [0000:b0:00.1]-ffffff:2: SET ZIO Activity exchange threshold to 5. [154565.539974] qla2xxx [0000:b0:00.1]-5013:2: RSCN database changed – 0078 0080 0000. [154565.545744] qla2xxx [0000:b0:00.1]-5013:2: RSCN database changed – 0078 00a0 0000. [154565.545857] qla2xxx [0000:b0:00.1]-11a2:2: FEC=enabled (data rate). [154565.552760] qla2xxx [0000:b0:00.1]-11a2:2: FEC=enabled (data rate). [154565.553079] BUG: kernel NULL pointer dereference, address: 00000000000000f8 [154565.553080] #PF: supervisor read access in kernel mode [154565.553082] #PF: error_code(0x0000) - not-present page [154565.553084] PGD 80000010488ab067 P4D 80000010488ab067 PUD 104978a067 PMD 0 [154565.553089] Oops: 0000 1 PREEMPT SMP PTI [154565.553092] CPU: 10 PID: 858 Comm: qla2xxx_2_dpc Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.11.1.el9_5.x86_64 #1 [154565.553096] Hardware name: HPE Synergy 660 Gen10/Synergy 660 Gen10 Compute Module, BIOS I43 09/30/2024 [154565.553097] RIP: 0010:qla_fab_async_scan.part.0+0x40b/0x870 [qla2xxx] [154565.553141] Code: 00 00 e8 58 a3 ec d4 49 89 e9 ba 12 20 00 00 4c 89 e6 49 c7 c0 00 ee a8 c0 48 c7 c1 66 c0 a9 c0 bf 00 80 00 10 e8 15 69 00 00 <4c> 8b 8d f8 00 00 00 4d 85 c9 74 35 49 8b 84 24 00 19 00 00 48 8b [154565.553143] RSP: 0018:ffffb4dbc8aebdd0 EFLAGS: 00010286 [154565.553145] RAX: 0000000000000000 RBX: ffff8ec2cf0908d0 RCX: 0000000000000002 [154565.553147] RDX: 0000000000000000 RSI: ffffffffc0a9c896 RDI: ffffb4dbc8aebd47 [154565.553148] RBP: 0000000000000000 R08: ffffb4dbc8aebd45 R09: 0000000000ffff0a [154565.553150] R10: 0000000000000000 R11: 000000000000000f R12: ffff8ec2cf0908d0 [154565.553151] R13: ffff8ec2cf090900 R14: 0000000000000102 R15: ffff8ec2cf084000 [154565.553152] FS: 0000000000000000(0000) GS:ffff8ed27f800000(0000) knlGS:0000000000000000 [154565.553154] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [154565.553155] CR2: 00000000000000f8 CR3: 000000113ae0a005 CR4: 00000000007706f0 [154565.553157] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [154565.553158] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [154565.553159] PKRU: 55555554 [154565.553160] Call Trace: [154565.553162] <TASK> [154565.553165] ? show_trace_log_lvl+0x1c4/0x2df [154565.553172] ? show_trace_log_lvl+0x1c4/0x2df [154565.553177] ? qla_fab_async_scan.part.0+0x40b/0x870 [qla2xxx] [154565.553215] ? __die_body.cold+0x8/0xd [154565.553218] ? page_fault_oops+0x134/0x170 [154565.553223] ? snprintf+0x49/0x70 [154565.553229] ? exc_page_fault+0x62/0x150 [154565.553238] ? asm_exc_page_fault+0x22/0x30 Check for sp being non NULL before freeing any associated memory Fixes: a4239945b8ad ("scsi: qla2xxx: Add switch command to simplify fabric discovery") Cc: stable@vger.kernel.org Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-10-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit b6df15aec8c3 upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	f03731b55e	qla2x00t-32gbit: Free sp in error path to fix system crash System crash seen during load/unload test in a loop, [61110.449331] qla2xxx [0000:27:00.0]-0042:0: Disabled MSI-X. [61110.467494] ============================================================================= [61110.467498] BUG qla2xxx_srbs (Tainted: G OE -------- --- ): Objects remaining in qla2xxx_srbs on __kmem_cache_shutdown() [61110.467501] ----------------------------------------------------------------------------- [61110.467502] Slab 0x000000000ffc8162 objects=51 used=1 fp=0x00000000e25d3d85 flags=0x57ffffc0010200(slab|head|node=1|zone=2|lastcpupid=0x1fffff) [61110.467509] CPU: 53 PID: 455206 Comm: rmmod Kdump: loaded Tainted: G OE -------- --- 5.14.0-284.11.1.el9_2.x86_64 #1 [61110.467513] Hardware name: HPE ProLiant DL385 Gen10 Plus v2/ProLiant DL385 Gen10 Plus v2, BIOS A42 08/17/2023 [61110.467515] Call Trace: [61110.467516] <TASK> [61110.467519] dump_stack_lvl+0x34/0x48 [61110.467526] slab_err.cold+0x53/0x67 [61110.467534] __kmem_cache_shutdown+0x16e/0x320 [61110.467540] kmem_cache_destroy+0x51/0x160 [61110.467544] qla2x00_module_exit+0x93/0x99 [qla2xxx] [61110.467607] ? __do_sys_delete_module.constprop.0+0x178/0x280 [61110.467613] ? syscall_trace_enter.constprop.0+0x145/0x1d0 [61110.467616] ? do_syscall_64+0x5c/0x90 [61110.467619] ? exc_page_fault+0x62/0x150 [61110.467622] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [61110.467626] </TASK> [61110.467627] Disabling lock debugging due to kernel taint [61110.467635] Object 0x0000000026f7e6e6 @offset=16000 [61110.467639] ------------[ cut here ]------------ [61110.467639] kmem_cache_destroy qla2xxx_srbs: Slab cache still has objects when called from qla2x00_module_exit+0x93/0x99 [qla2xxx] [61110.467659] WARNING: CPU: 53 PID: 455206 at mm/slab_common.c:520 kmem_cache_destroy+0x14d/0x160 [61110.467718] CPU: 53 PID: 455206 Comm: rmmod Kdump: loaded Tainted: G B OE -------- --- 5.14.0-284.11.1.el9_2.x86_64 #1 [61110.467720] Hardware name: HPE ProLiant DL385 Gen10 Plus v2/ProLiant DL385 Gen10 Plus v2, BIOS A42 08/17/2023 [61110.467721] RIP: 0010:kmem_cache_destroy+0x14d/0x160 [61110.467724] Code: 99 7d 07 00 48 89 ef e8 e1 6a 07 00 eb b3 48 8b 55 60 48 8b 4c 24 20 48 c7 c6 70 fc 66 90 48 c7 c7 f8 ef a1 90 e8 e1 ed 7c 00 <0f> 0b eb 93 c3 cc cc cc cc 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 [61110.467725] RSP: 0018:ffffa304e489fe80 EFLAGS: 00010282 [61110.467727] RAX: 0000000000000000 RBX: ffffffffc0d9a860 RCX: 0000000000000027 [61110.467729] RDX: ffff8fd5ff9598a8 RSI: 0000000000000001 RDI: ffff8fd5ff9598a0 [61110.467730] RBP: ffff8fb6aaf78700 R08: 0000000000000000 R09: 0000000100d863b7 [61110.467731] R10: ffffa304e489fd20 R11: ffffffff913bef48 R12: 0000000040002000 [61110.467731] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [61110.467733] FS: 00007f64c89fb740(0000) GS:ffff8fd5ff940000(0000) knlGS:0000000000000000 [61110.467734] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [61110.467735] CR2: 00007f0f02bfe000 CR3: 00000020ad6dc005 CR4: 0000000000770ee0 [61110.467736] PKRU: 55555554 [61110.467737] Call Trace: [61110.467738] <TASK> [61110.467739] qla2x00_module_exit+0x93/0x99 [qla2xxx] [61110.467755] ? __do_sys_delete_module.constprop.0+0x178/0x280 Free sp in the error path to fix the crash. Fixes: f352eeb75419 ("scsi: qla2xxx: Add ability to use GPNFT/GNNFT for RSCN handling") Cc: stable@vger.kernel.org Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-9-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 7adbd2b78090 upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	40e2da6311	qla2x00t-32gbit: Delay module unload while fabric scan in progress System crash seen during load/unload test in a loop. [105954.384919] RBP: ffff914589838dc0 R08: 0000000000000000 R09: 0000000000000086 [105954.384920] R10: 000000000000000f R11: ffffa31240904be5 R12: ffff914605f868e0 [105954.384921] R13: ffff914605f86910 R14: 0000000000008010 R15: 00000000ddb7c000 [105954.384923] FS: 0000000000000000(0000) GS:ffff9163fec40000(0000) knlGS:0000000000000000 [105954.384925] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [105954.384926] CR2: 000055d31ce1d6a0 CR3: 0000000119f5e001 CR4: 0000000000770ee0 [105954.384928] PKRU: 55555554 [105954.384929] Call Trace: [105954.384931] <IRQ> [105954.384934] qla24xx_sp_unmap+0x1f3/0x2a0 [qla2xxx] [105954.384962] ? qla_async_scan_sp_done+0x114/0x1f0 [qla2xxx] [105954.384980] ? qla24xx_els_ct_entry+0x4de/0x760 [qla2xxx] [105954.384999] ? __wake_up_common+0x80/0x190 [105954.385004] ? qla24xx_process_response_queue+0xc2/0xaa0 [qla2xxx] [105954.385023] ? qla24xx_msix_rsp_q+0x44/0xb0 [qla2xxx] [105954.385040] ? __handle_irq_event_percpu+0x3d/0x190 [105954.385044] ? handle_irq_event+0x58/0xb0 [105954.385046] ? handle_edge_irq+0x93/0x240 [105954.385050] ? __common_interrupt+0x41/0xa0 [105954.385055] ? common_interrupt+0x3e/0xa0 [105954.385060] ? asm_common_interrupt+0x22/0x40 The root cause of this was that there was a free (dma_free_attrs) in the interrupt context. There was a device discovery/fabric scan in progress. A module unload was issued which set the UNLOADING flag. As part of the discovery, after receiving an interrupt a work queue was scheduled (which involved a work to be queued). Since the UNLOADING flag is set, the work item was not allocated and the mapped memory had to be freed. The free occurred in interrupt context leading to system crash. Delay the driver unload until the fabric scan is complete to avoid the crash. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/202512090414.07Waorz0-lkp@intel.com/ Fixes: 783e0dc4f66a ("qla2xxx: Check for device state before unloading the driver.") Cc: stable@vger.kernel.org Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-8-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 8890bf450e0b upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	ba992ad0ff	qla2x00t-32gbit: Allow recovery for tape devices Tape device doesn't show up after RSCNs. To fix this, remove tape device specific checks which allows recovery of tape devices. Fixes: 44c57f205876 ("scsi: qla2xxx: Changes to support FCP2 Target") Cc: stable@vger.kernel.org Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-7-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit b0335ee4fb94 upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	55b5644f1e	qla2x00t-32gbit: Add bsg interface to support firmware img validation Add new bsg interface to issue MPI passthrough sub command to validate the new flash firmware image partition. Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-6-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit d74181ca110e upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	aaa3a5ccca	qla2x00t-32gbit: Validate MCU signature before executing MBC 03h FC firmware does not come online during an on-the-fly upgrade, i.e. on soft reset. To limit Load flash firmware (i.e. MBC 3) changes, validate the MCU signature before executing MBC 03h. Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-5-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 478b152ab309 upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	889a8c3829	qla2x00t-32gbit: Add load flash firmware mailbox support for 28xxx For 28xxx adapters, the Load flash firmware mailbox loads the operational firmware from flash and also validates the checksum. The driver no longer needs to load the operational firmware, but it still needs to read fwdt from flash to build and allocate the firmware dump template. Remove request_firmware() support for 28xxx adapters. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512031128.XsuvzBv1-lkp@intel.com/ Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-4-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit b99b04b12214 upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	fb7adec212	qla2x00t-32gbit: Add support for 64G SFP speed Incorrect speed info is shown in driver logs for 64G SFP. Add support for 64G SFP speed as per SFF-8472 specification. Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-3-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 21ab087cae50 upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	7df40fe1ba	qla2x00t-32gbit: Add Speed in SFP print information Print additional information about the speed while displaying SFP information. Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251210101604.431868-2-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 7411f1875a60 upstream ]	2026-04-09 21:57:09 +03:00
Gleb Chesnokov	47a371b50f	scst_vdisk: Include <linux/hex.h> for kernels v6.4+ Hex helpers were moved to <linux/hex.h> in kernel v6.4, and kernel v7.0 no longer pulls that header in indirectly, so include it explicitly.	2026-04-08 14:54:31 +03:00
Gleb Chesnokov	eeae35607d	scst: Change the return type of the .queuecommand() callback In clang version 21.1 and later the -Wimplicit-enum-enum-cast warning option has been introduced. This warning is enabled by default and can be used to catch .queuecommand() implementations that return another value than 0 or one of the SCSI_MLQUEUE_* constants. Hence this patch that changes the return type of the .queuecommand() implementations from 'int' into 'enum scsi_qc_status'. No functionality has been changed.	2026-04-08 14:31:45 +03:00
Gleb Chesnokov	a9a57b5eb6	scst_vdisk: Include <linux/kernel.h> for hex2bin() Include <linux/kernel.h> explicitly for hex2bin(), which avoids handling the v6.4 split of hex helpers to <linux/hex.h>.	2026-04-08 14:31:45 +03:00
Gleb Chesnokov	9481c7e3a6	scst_lib: Port to Linux kernel v7.0 Support for the following block layer changes in the Linux kernel v7.0: - 5e2fde1a9433 ("block: pass io_comp_batch to rq_end_io_fn callback")	2026-04-08 14:31:45 +03:00
Brian M	3e16f63043	scst/include/backport.h: backport kstrtobool kstrtobool was introduced in 4.6; older kernels have strtobool with an identical signature.	2026-04-07 23:19:14 +03:00
Brian M	7ab9df00f5	scst/include/backport.h: backport WRITE_ONCE for kernels < 3.19 READ_ONCE was already backported; WRITE_ONCE was not, causing a build failure on 3.10. Add the matching definition following the same pattern.	2026-04-07 23:19:14 +03:00
Brian M	a4a55aab41	scst: add async_lun_replace to defer tgt_dev cleanup after LUN replace When a LUN is replaced, scst_acg_repl_lun polls tgt_dev_cmd_count every 100ms waiting for in-flight commands on the old tgt_devs to drain before freeing them. This is synchronous: the sysfs write to luns/mgmt blocks until the drain completes. If the old device becomes unreachable before the LUN replace (e.g. due to a transport failure), in-flight commands may be stuck in error recovery for up to the transport's recovery timeout, blocking the replace for that entire window. Add a bool module parameter async_lun_replace (default false). When enabled, scst_acg_repl_lun schedules the tgt_dev drain and free on system_wq and returns immediately. Falls back to synchronous behaviour on allocation failure. This is safe because __scst_acg_del_lun removes the old tgt_devs from all session and device lookup paths before the work is scheduled. New commands from the initiator use the new tgt_devs; only in-flight commands still hold references via cmd->tgt_dev, and tgt_dev_cmd_count tracks exactly those. synchronize_rcu ensures no RCU reader holds a stale pointer before scst_free_tgt_dev is called.	2026-04-07 23:19:14 +03:00
Brian M	0731c421fd	iscsi-scstd: Return SVC_UNAVAILABLE while logins are suspended SIGUSR1/SIGUSR2 set/clear logins_suspended. While set, any login attempt is rejected with a retriable Target Error instead of the permanent Initiator Error (TGT_NOT_FOUND) that causes initiators to give up.	2026-03-31 11:11:10 +03:00
Bart Van Assche	08eaa7d5ee	Makefile: Simplify the cov-build target Simplify the Coverity build by always setting the BUILD_2X_MODULE, CONFIG_SCSI_QLA_FC and CONFIG_SCSI_QLA2XXX_TARGET variables. Setting these variables when not building a QLogic driver is safe because these variables only have an impact when building the QLogic drivers. See also commit `5c7fa24031` ("Makefile: Introduce the 'make cov-build'").	2026-03-15 14:26:14 -07:00
Gleb Chesnokov	7d6f9a1f74	.github/workflows: Drop envelope_from from mail action The current action branch fails with "No recipients defined" when envelope_from is set in this workflow. The SMTP username already provides the sender address, so envelope_from is unnecessary here.	2026-03-12 16:17:24 +03:00
Gleb Chesnokov	32c4c6b87b	.github/workflows: Use valid sender address Set a static From header and envelope_from to the configured SMTP account to restore mail notifications.	2026-03-12 15:00:49 +03:00
Gleb Chesnokov	e36e246b22	.github/workflows: Switch actions to default branches Update the Coverity and mail notification workflows to use the current default branches of the corresponding GitHub Actions.	2026-03-12 14:27:39 +03:00
Brian M	f6fc8b7d2a	scst: document pr_state and pr_dump_dir sysfs attributes in README pr_state is a common device attribute for save/restore of Persistent Reservation state. pr_dump_dir is a dev_disk handler attribute that triggers an automatic kernel-side PR state dump at unregistration time.	2026-03-12 13:58:22 +03:00
Brian M	3777c775c7	dev_disk: add pr_dump_dir handler attribute to dump PR state on detach When pr_dump_dir is set to a directory path, each dev_disk device writes its PR state to <dir>/<serial> on detach, using the same text format as the pr_state sysfs attribute. The default is an empty string, which disables the feature entirely. This provides a race-free way to capture PR state at the point of device teardown, after all in-flight commands have completed, for use cases that need to preserve PR state across a device transition. The dump must happen before scst_pr_clear_dev() wipes the in-memory registrant list during device unregistration. To achieve this, a new optional pre_unregister() callback is added to struct scst_dev_type, called from both scst_unregister_device() and scst_unregister_virtual_device() before scst_pr_clear_dev(). The disk handler registers this callback (disk_pre_unregister) to perform the dump at the correct moment. The work is split across two phases to avoid filesystem I/O while scst_mutex is held. disk_pre_unregister() (called under scst_mutex) captures the PR state into a heap buffer and records the destination path. disk_detach() (called after scst_mutex is released) writes the buffer to the filesystem. To carry state between the two phases, dh_priv is changed from a bare serial-number string to a struct disk_dh_priv containing the serial plus the captured dump fields.	2026-03-12 13:58:22 +03:00
Brian M	b92e091999	scst: add pr_state sysfs attribute for PR state save/restore Add a read/write pr_state attribute to scst_device that serializes the current persistent reservation state (generation, reservation type/scope, and all registrants with their transport IDs) to a text format, and restores it from the same format. This provides a stable interface for saving and restoring PR state across device transitions where the in-memory state would otherwise be lost.	2026-03-12 13:58:22 +03:00
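
The two-phase split described in commit 3777c775c7 above (capture the PR state into a heap buffer while the mutex is held, then write it to the filesystem after the mutex is released) can be illustrated with a small user-space sketch. This is not the SCST code: struct demo_dev, the demo_* functions, and the pthread mutex standing in for scst_mutex are all hypothetical names chosen for illustration, and the "state" is just a string rather than the real pr_state text format.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device; not the SCST structure. */
struct demo_dev {
	pthread_mutex_t lock;	/* stands in for scst_mutex */
	char state[64];		/* stands in for the in-memory PR state */
	char *dump_buf;		/* copy captured in phase 1 */
	size_t dump_len;
};

/* Phase 1: caller holds dev->lock. Memory only, no file I/O. */
static int demo_pre_unregister(struct demo_dev *dev)
{
	size_t len = strlen(dev->state);
	char *buf = malloc(len + 1);

	if (!buf)
		return -1;
	memcpy(buf, dev->state, len + 1);
	dev->dump_buf = buf;
	dev->dump_len = len;
	return 0;
}

/* Stands in for the step that wipes the in-memory state. */
static void demo_clear_state(struct demo_dev *dev)
{
	dev->state[0] = '\0';
}

/* Phase 2: caller does NOT hold dev->lock. Writes the captured copy. */
static int demo_detach(struct demo_dev *dev, const char *path)
{
	int res = -1;
	FILE *f;

	if (!dev->dump_buf)
		return 0;	/* nothing was captured */
	f = fopen(path, "w");
	if (f) {
		size_t n = fwrite(dev->dump_buf, 1, dev->dump_len, f);

		if (fclose(f) == 0 && n == dev->dump_len)
			res = 0;
	}
	free(dev->dump_buf);
	dev->dump_buf = NULL;
	dev->dump_len = 0;
	return res;
}

/* Capture before the wipe, under the lock; do the I/O after unlocking. */
static int demo_unregister(struct demo_dev *dev, const char *path)
{
	int res;

	pthread_mutex_lock(&dev->lock);
	res = demo_pre_unregister(dev);
	demo_clear_state(dev);
	pthread_mutex_unlock(&dev->lock);
	if (res != 0)
		return res;
	return demo_detach(dev, path);
}
```

The ordering is the point of the sketch: the copy is taken before the state is cleared, and the potentially slow file write happens only once the lock has been dropped, so other lock holders are never blocked on filesystem I/O.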

9331 Commits