In clang version 21.1 and later the -Wimplicit-enum-enum-cast warning
option has been introduced. This warning is enabled by default and can
be used to catch .queuecommand() implementations that return another
value than 0 or one of the SCSI_MLQUEUE_* constants. Hence this patch
that changes the return type of the .queuecommand() implementations from
'int' into 'enum scsi_qc_status'. No functionality has been changed.
When a LUN is replaced, scst_acg_repl_lun polls tgt_dev_cmd_count every
100ms waiting for in-flight commands on the old tgt_devs to drain before
freeing them. This is synchronous: the sysfs write to luns/mgmt blocks
until the drain completes.
If the old device becomes unreachable before the LUN replace (e.g. due to
a transport failure), in-flight commands may be stuck in error recovery for
up to the transport's recovery timeout, blocking the replace for that
entire window.
Add a bool module parameter async_lun_replace (default false). When
enabled, scst_acg_repl_lun schedules the tgt_dev drain and free on
system_wq and returns immediately. Falls back to synchronous behaviour
on allocation failure.
This is safe because __scst_acg_del_lun removes the old tgt_devs from
all session and device lookup paths before the work is scheduled. New
commands from the initiator use the new tgt_devs; only in-flight commands
still hold references via cmd->tgt_dev, and tgt_dev_cmd_count tracks
exactly those. synchronize_rcu ensures no RCU reader holds a stale
pointer before scst_free_tgt_dev is called.
SIGUSR1/SIGUSR2 set/clear logins_suspended. While set, any login
attempt is rejected with a retriable Target Error instead of the
permanent Initiator Error (TGT_NOT_FOUND) that causes initiators
to give up.
Simplify the Coverity build by always setting the BUILD_2X_MODULE,
CONFIG_SCSI_QLA_FC and CONFIG_SCSI_QLA2XXX_TARGET variables. Setting
these variables when not building a QLogic driver is safe because
these variables only have an impact when building the QLogic drivers.
See also commit 5c7fa24031 ("Makefile: Introduce the 'make cov-build'").
The current action branch fails with "No recipients defined" when
envelope_from is set in this workflow. The SMTP username already
provides the sender address, so envelope_from is unnecessary here.
pr_state is a common device attribute for save/restore of Persistent
Reservation state. pr_dump_dir is a dev_disk handler attribute that
triggers an automatic kernel-side PR state dump at unregistration time.
When pr_dump_dir is set to a directory path, each dev_disk device writes
its PR state to <dir>/<serial> on detach, using the same text format as
the pr_state sysfs attribute. The default is an empty string, which
disables the feature entirely.
This provides a race-free way to capture PR state at the point of device
teardown, after all in-flight commands have completed, for use cases that
need to preserve PR state across a device transition.
The dump must happen before scst_pr_clear_dev() wipes the in-memory
registrant list during device unregistration. To achieve this, a new
optional pre_unregister() callback is added to struct scst_dev_type,
called from both scst_unregister_device() and
scst_unregister_virtual_device() before scst_pr_clear_dev(). The disk
handler registers this callback (disk_pre_unregister) to perform the
dump at the correct moment.
The work is split across two phases to avoid filesystem I/O while
scst_mutex is held. disk_pre_unregister() (called under scst_mutex)
captures the PR state into a heap buffer and records the destination
path. disk_detach() (called after scst_mutex is released) writes the
buffer to the filesystem. To carry state between the two phases, dh_priv
is changed from a bare serial-number string to a struct disk_dh_priv
containing the serial plus the captured dump fields.
Add a read/write pr_state attribute to scst_device that serializes the
current persistent reservation state (generation, reservation type/scope,
and all registrants with their transport IDs) to a text format, and
restores it from the same format.
This provides a stable interface for saving and restoring PR state across
device transitions where the in-memory state would otherwise be lost.
Calling scst_unregister_session(wait=1) from qlt_free_session_done
blocks the qla2xxx_wq worker until session teardown completes, but
teardown requires the TM thread to process SCST_UNREG_SESS_TM while
the TM thread is blocked on scst_mutex held by concurrent session
teardown in the global management thread. Under load this stall
exceeds the hung-task timeout. Switch to wait=0 and wait on
fcport->unreg_done instead, matching the pattern in iscsi-scst.
In qla27xx_copy_fpin_pkt() and qla27xx_copy_multiple_pkt(), the frame_size
reported by firmware is used to calculate the copy length into
item->iocb. However, the iocb member is defined as a fixed-size 64-byte
array within struct purex_item.
If the reported frame_size exceeds 64 bytes, subsequent memcpy calls will
overflow the iocb member boundary. While extra memory might be allocated,
this cross-member write is unsafe and triggers warnings under
CONFIG_FORTIFY_SOURCE.
Fix this by capping total_bytes to the size of the iocb member (64 bytes)
before allocation and copying. This ensures all copies remain within the
bounds of the destination structure member.
Fixes: 875386b98857 ("scsi: qla2xxx: Add Unsolicited LS Request and Response Support for NVMe")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com>
Link: https://patch.msgid.link/20260106205344.18031-1-jiashengjiangcool@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 19bc5f2a6962 upstream ]
A dynamic remove/add storage adapter test hits EEH on PowerPC:
EEH: [c00000000004f77c] __eeh_send_failure_event+0x7c/0x160
EEH: [c000000000048464] eeh_dev_check_failure.part.0+0x254/0x660
EEH: [c000000000934e0c] __pci_read_msi_msg+0x1ac/0x280
EEH: [c000000000100f68] pseries_msi_compose_msg+0x28/0x40
EEH: [c00000000020e1cc] irq_chip_compose_msi_msg+0x5c/0x90
EEH: [c000000000214b1c] msi_domain_set_affinity+0xbc/0x100
EEH: [c000000000206be4] irq_do_set_affinity+0x214/0x2c0
EEH: [c000000000206e04] irq_set_affinity_locked+0x174/0x230
EEH: [c000000000207044] irq_set_affinity+0x64/0xa0
EEH: [c000000000212890] write_irq_affinity.constprop.0.isra.0+0x130/0x150
EEH: [c00000000068868c] proc_reg_write+0xfc/0x160
EEH: [c0000000005adb48] vfs_write+0xf8/0x4e0
EEH: [c0000000005ae234] ksys_write+0x84/0x140
EEH: [c00000000002e994] system_call_exception+0x164/0x310
EEH: [c00000000000bfe8] system_call_vectored_common+0xe8/0x278
The irqbalance daemon kicks in before invoking qla2xxx->slot_reset
during the EEH recovery process.
irqbalance daemon
->irq_set_affinity()
->msi_domain_set_affinity()
->irq_chip_set_affiinity_parent()
->xive_irq_set_affinity()
->pseries_msi_compose_ms()
->__pci_read_msi_msg()
->irq_chip_compose_msi_msg()
In __pci_read_msi_msg(), the first MSI-X vector is set to all F by the
irqbalance daemon. pci_write_msg_msix: index=0, lo=ffffffff hi=fffffff
IRQ balancing is not required during adapter reset.
Enable "IRQ_NO_BALANCING" bit before starting adapter reset and disable
it calling pci_restore_state(). The irqbalance daemon is disabled for
this short period of time (~2s).
Co-developed-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Signed-off-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com>
Link: https://patch.msgid.link/20251028142427.3969819-3-wenxiong@linux.ibm.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit eaea513077cd upstream ]
Use 'bytes' (the return value from af_alg_final) instead of 'res'
(which is 0 after the last successful af_alg_update call) when
copying the digest.
This bug caused the memcpy to copy 0 bytes, resulting in an
uninitialized digest buffer. It also triggered a GCC
-Werror=stringop-overflow warning because 'res' could theoretically
be negative, leading to a huge unsigned size.
The symlink-based %files conditional was added when scstadmin supported
a procfs variant. Procfs support is legacy now, so drop the conditional
and always package the man pages.
The lockless fast path for scst_cmd_set_sn() can cause commands to
lockup in state EXEC_CHECK_SN when there are multiple scst_tgts
accessing the same scst_device, for example two initiators
connected to the two ports of a dual-port QLogic FC HBA in target mode
both reading from the same shared disk. The multithreaded_init_done
value is too low-level for this; it does not take the higher-level
configuration into account.
- Remove the lockless fast path.
- Remove multithreaded_init_done, which enabled/disabled the lockless
fast path.
- Push the locking down into scst_cmd_set_sn(), which will now apply
regardless of set_sn_on_restart_cmd, which matters for mixed-driver
(e.g. iSCSI+qla2xxx) target-mode setups.
- Remove a bunch of comments explaining the rules for the lockless fast
path.
Fixes: https://github.com/SCST-project/scst/issues/333
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
In qla2xxx_process_purls_iocb(), an item is allocated via
qla27xx_copy_multiple_pkt(), which internally calls
qla24xx_alloc_purex_item().
The qla24xx_alloc_purex_item() function may return a pre-allocated item
from a per-adapter pool for small allocations, instead of dynamically
allocating memory with kzalloc().
An error handling path in qla2xxx_process_purls_iocb() incorrectly uses
kfree() to release the item. If the item was from the pre-allocated
pool, calling kfree() on it is a bug that can lead to memory corruption.
Fix this by using the correct deallocation function,
qla24xx_free_purex_item(), which properly handles both dynamically
allocated and pre-allocated items.
Fixes: 875386b98857 ("scsi: qla2xxx: Add Unsolicited LS Request and Response Support for NVMe")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com>
Link: https://patch.msgid.link/20251113151246.762510-1-zilin@seu.edu.cn
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 78b1a242fe61 upstream ]
Upstream workqueue changes introduce a new WQ_PERCPU flag and plan to
switch alloc_workqueue()'s default from per-CPU to unbound
To kepp SCST behaviour unchanged across kernels, this patch makes all
alloc_workqueue() users explicit about whether they want per-CPU or
unbound queues.
Currently if a user enqueue a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the
API.
system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.
Adding system_dfl_wq to encourage its use when unbound work should be
used.
The old system_unbound_wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20251031095643.74246-2-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 49783aca15fb upstream ]
This enables the initiator to abort commands that are stuck pending in
the HW without waiting for a timeout.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
The driver associates two different structs with numeric handles and
passes the handles to the hardware. When the hardware passes the handle
back to the driver, the driver consults a table of void * to convert the
handle back to the struct without checking the type of struct. This can
lead to type confusion if the HBA firmware misbehaves (and some firmware
versions do). So verify the type of struct is what is expected before
using it.
But we can also do better than that. Also verify that the exchange
address of the message sent from the hardware matches the exchange
address of the command being returned. This adds an extra guard against
buggy HBA firmware that returns duplicate messages multiple times (which
has also been seen) in case the driver has reused the handle for a
different command of the same type.
These problems were seen on a QLE2694L with firmware 9.08.02 when
testing SLER / SRR support. The SRR caused the HBA to flood the
response queue with hundreds of bogus entries.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/7c7cb574-fe62-42ae-b800-d136d8dd89ca@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 4f5eb50f7c82 upstream ]
Background: loading qla2xxx with "ql2xtgt_tape_enable=1" enables
Sequence Level Error Recovery (SLER), which is most commonly used for
tape drives. With SLER enabled, if there is a recoverable I/O error
during a SCSI command, a Sequence Retransmission Request (SRR) will be
used to retry the I/O at a low-level completely within the driver
without propagating the error to the upper levels of the SCSI stack.
SRR support was removed in 2017 by commit 2c39b5ca2a8c ("qla2xxx: Remove
SRR code"). Add it back, new and improved.
The old removed SRR code used sequence numbers to correlate the SRR
CTIOs with SRR immediate notify messages. I don't see how that would
work reliably with MSI-X interrupts and multiple queues. So instead use
the exchange address to find the command associated with the immediate
notify (qlt_srr_to_cmd).
The old removed SRR code had a function qlt_check_srr_debug() to
simulate a SRR, but it didn't work for me. Instead I just used fiber
optic attenuators attached to the FC cable to reduce the strength of the
signal and induce errors. Unfortunately this only worked for inducing
SRRs on Data-Out (write) commands, so that is all I was able to test.
The code to build a new scatterlist for a SRR with nonzero offset has
been improved to reduce memory requirements and has been well-tested.
However it does not support protection information.
When a single cmd gets multiple SRRs, the old removed SRR code would
restore the data buffer from the values in cmd->se_cmd before processing
the new SRR. That might be needed if the offset for the new SRR was
lower than the offset for the previous SRR, but I am not sure if that
can happen. In my testing, when a single cmd gets multiple SRRs, the
SRR offset always increases or stays the same. But in case it can
decrease, I added the function qlt_restore_orig_sg(). If this is not
supposed to happen then qlt_restore_orig_sg() can be removed to simplify
the code.
I ran into some HBA firmware bugs with QLE269x, QLE27xx, and QLE28xx
firmware 9.05.xx - 9.08.xx where a SRR would cause the HBA to misbehave
badly. Since SRRs are rare and therefore difficult to test, I figured
it would be worth checking for the buggy firmware and disabling SLER
with a warning instead of letting others run into the same problem on
the rare occasion that they get a SRR. This turned out to be difficult
because the firmware version isn't known in the normal NVRAM config
routine, so I added a second NVRAM config routine that is called after
the firmware version is known.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/654b7181-b79e-40ed-a15b-6d6e441a5d5f@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit c7bd85a7b9c5 upstream ]
- Add the command tag to various messages so that different messages
about the same command can be correlated.
- For CTIO errors (i.e. when the HW reports an error about a cmd), print
the cmd tag, opcode, state, initiator WWPN, and LUN. This info helps
an administrator determine what is going wrong.
- When a command experiences a transport error, log a message when it
is freed. This makes debugging exceptions easier.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/c579987d-5658-41ae-9653-f0e58c9d1880@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 04957d8c9852 upstream ]
struct atio7_fcp_cmnd is a variable-length data structure because of
add_cdb_len, but it is embedded in struct atio_from_isp and copied
around like a fixed-length data structure. For big CDBs > 16 bytes,
get_datalen_for_atio() called on a fixed-length copy of the atio will
access invalid memory.
In some cases this can be fixed by moving the atio to the end of the
data structure and using a variable-length allocation. In other cases
such as allocating struct qla_tgt_cmd, the fixed-length data structures
are preallocated for speed, so in the case that add_cdb_len != 0,
allocate a separate buffer for the CDB. Also add memcpy_atio() as a
safeguard against invalid memory accesses.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/306a9d0b-3c89-42fc-a69c-eebca8171347@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 091719c21d5a upstream ]