Commit Graph

9247 Commits

Author SHA1 Message Date
Tony Battersby
33bfeab7d0 qla2x00t-32gbit: target: Improve checks in qlt_xmit_response / qlt_rdy_to_xfer
Similar fixes to both functions:

qlt_xmit_response:

 - If the cmd cannot be processed, remember to call ->free_cmd() to
   prevent the target-mode midlevel from seeing a cmd lockup.

 - Do not try to send the response if the exchange has been terminated.

 - Check for chip reset once after lock instead of both before and after
   lock.

 - Give errors from qlt_pre_xmit_response() a lower priority to
   compensate for removing the first check for chip reset.

qlt_rdy_to_xfer:

 - Check for chip reset after lock instead of before lock to avoid
   races.

 - Do not try to receive data if the exchange has been terminated.

 - Give errors from qlt_pci_map_calc_cnt() a lower priority to
   compensate for moving the check for chip reset.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/cd6ccd31-33fa-4454-be36-507bf578a546@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 5c50d84798eb upstream ]
2025-12-09 22:33:47 +03:00
Tony Battersby
dd23781350 qla2x00t-32gbit: target: Fix races with aborting commands
sqa_on_hw_pending_cmd_timeout() currently unmaps DMA, sets
outstanding_cmds[h] to NULL, and forces the command to complete.  This
could cause a kernel crash if the HW later accesses the DMA mapping.
It can also cause other problems if outstanding_cmds[h] is reused for a
different command.  Fix by doing this instead:

- In sqa_on_hw_pending_cmd_timeout(), call qlt_send_term_exchange()
  first and then restart the timeout.  After another timeout, reset the
  ISP.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
2025-12-09 22:33:47 +03:00
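Note on the commit above: the two-stage escalation it describes (terminate the exchange on the first timeout, reset the ISP on the second) can be sketched as a small userspace model. All names below are hypothetical stand-ins for illustration; this is not the code in sqa_on_hw_pending_cmd_timeout().

```c
#include <stdbool.h>

/* Hypothetical model types; these are not the driver's real structures. */
enum model_timeout_action {
	MODEL_SEND_TERM_EXCHANGE,	/* first timeout: ask HW to terminate, re-arm timer */
	MODEL_RESET_ISP,		/* second timeout: HW did not respond, reset the chip */
};

struct model_cmd {
	bool term_exchange_sent;	/* assumed flag: set once termination was requested */
};

/* Decide what to do when the hardware-pending timer fires for a command. */
static enum model_timeout_action model_on_hw_timeout(struct model_cmd *cmd)
{
	if (!cmd->term_exchange_sent) {
		cmd->term_exchange_sent = true;
		return MODEL_SEND_TERM_EXCHANGE;
	}
	return MODEL_RESET_ISP;
}
```

The point of the ordering is that the DMA mapping and the outstanding_cmds[] slot stay untouched until the hardware has either answered or been reset.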
Tony Battersby
c704a87271 qla2x00t-32gbit: target: Fix races with aborting commands
cmd->cmd_lock only protects cmd->aborted, but when deciding how to
process a cmd, it is necessary to consider other factors such as
cmd->state and if the chip has been reset, which are protected by
qpair->qp_lock_ptr.  So replace cmd_lock with qp_lock_ptr, which makes
it possible to check additional values and make decisions about what to
do without racing with the CTIO handler and other code.

 - Lock cmd->qpair->qp_lock_ptr when aborting a cmd.

 - Eliminate cmd->cmd_lock and change cmd->aborted to a bitfield since
   it is now protected by qp_lock_ptr just like all the other flags.

 - Add another command state QLA_TGT_STATE_DONE to avoid any possible
   races between qlt_abort_cmd() and tgt_ops->free_cmd().

 - Add the cmd->sent_term_exchg flag to indicate if
   qlt_send_term_exchange() has already been called.

 - Export qlt_send_term_exchange() for SCST so that it can be called
   directly instead of trying to make qlt_abort_cmd() work for both TMR
   abort and HW timeout.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/2c8d03e4-308b-4d5a-a418-a334be23f815@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 17488f139074 upstream ]
2025-12-09 22:33:47 +03:00
Tony Battersby
2c8529b5ba qla2x00t-32gbit: Clear cmds after chip reset
Commit aefed3e5548f ("scsi: qla2xxx: target: Fix offline port handling
and host reset handling") caused two problems:

1. Commands sent to FW got stuck after a chip reset and were never freed,
   as FW is not going to respond to them anymore.

2. BUG_ON(cmd->sg_mapped) in qlt_free_cmd().  Commit 26f9ce53817a
   ("scsi: qla2xxx: Fix missed DMA unmap for aborted commands")
   attempted to fix this, but introduced another bug under different
   circumstances when two different CPUs were racing to call
   qlt_unmap_sg() at the same time: BUG_ON(!valid_dma_direction(dir)) in
   dma_unmap_sg_attrs().

So revert "scsi: qla2xxx: Fix missed DMA unmap for aborted commands" and
partially revert "scsi: qla2xxx: target: Fix offline port handling and
host reset handling" at __qla2x00_abort_all_cmds.

Fixes: aefed3e5548f ("scsi: qla2xxx: target: Fix offline port handling and host reset handling")
Fixes: 26f9ce53817a ("scsi: qla2xxx: Fix missed DMA unmap for aborted commands")
Co-developed-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/0e7e5d26-e7a0-42d1-8235-40eeb27f3e98@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit d46c69a087aa upstream ]
2025-12-09 22:33:47 +03:00
Tony Battersby
61518181f5 qla2x00t-32gbit: target: Fix term exchange when cmd_sent_to_fw == 1
Properly set the nport_handle field of the terminate exchange message.
Previously when this field was not set properly, the term exchange would
fail when cmd_sent_to_fw == 1 but work when cmd_sent_to_fw == 0 (i.e. it
would fail when the HW was actively transferring data or status for the
cmd but work when the HW was idle).  With this change, term exchange
works in any cmd state, which now makes it possible to abort a command
that is locked up in the HW.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/1a221699-969b-4f28-8ea4-395d2f7a7c0a@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit ed382b95f5de upstream ]
2025-12-09 22:33:47 +03:00
Tony Battersby
c89a047a59 qla2x00t-32gbit: target: Improve debug output for term exchange
Print better debug info when terminating a command, and print the
response status from the hardware.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/22f8a0b6-0e24-474d-9f28-9d65c9b7af03@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit c34e373f535e upstream ]
2025-12-09 22:33:47 +03:00
Tony Battersby
635ea86df3 qla2x00t-32gbit: target: Remove code for unsupported hardware
As far as I can tell, CONTINUE_TGT_IO_TYPE and CTIO_A64_TYPE are message
types from non-FWI2 boards (older than ISP24xx), which are not supported
by qla_target.c.  Removing them makes it possible to turn a void * into
the real type and avoid some typecasts.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/cb006628-e321-4e30-a60b-08b37b8685a5@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 9da4e1dcea46 upstream ]
2025-12-09 22:33:47 +03:00
Tony Battersby
6ea8d5e6ae qla2x00t-32gbit: Use reinit_completion on mbx_intr_comp
If a mailbox command completes immediately after
wait_for_completion_timeout() times out, ha->mbx_intr_comp could be left
in an inconsistent state, causing the next mailbox command not to wait
for the hardware.  Fix by reinitializing the completion before use.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/11b6485e-0bfd-4784-8f99-c06a196dad94@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 957aa5974989 upstream ]
2025-12-09 22:33:47 +03:00
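Note on the commit above: the stale-completion problem can be modeled in userspace by treating a completion as a counter of pending complete() calls. The sketch below is a simplified, single-threaded model with hypothetical names; it only mirrors the counting behaviour and is not the kernel implementation.

```c
#include <stdbool.h>

/* Userspace model of a completion: 'done' counts unconsumed complete() calls. */
struct model_completion {
	unsigned int done;
};

static void model_reinit_completion(struct model_completion *c)
{
	c->done = 0;
}

static void model_complete(struct model_completion *c)
{
	c->done++;
}

/* Returns true if a completion was consumed; false stands for a timeout. */
static bool model_wait_for_completion_timeout(struct model_completion *c)
{
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}

/*
 * Replays the scenario: command 1 times out, the hardware completes it late,
 * then command 2 waits.  Returns true if command 2's wait returns at once
 * because of the stale completion left over from command 1.
 */
static bool model_next_cmd_skips_wait(bool reinit_before_use)
{
	struct model_completion comp = { 0 };

	(void)model_wait_for_completion_timeout(&comp);	/* command 1 times out */
	model_complete(&comp);				/* late completion arrives */

	if (reinit_before_use)
		model_reinit_completion(&comp);		/* the fix */
	return model_wait_for_completion_timeout(&comp);	/* command 2 */
}
```

Without the reinit step the second wait consumes the leftover count and returns before the hardware has answered; with it, the wait starts from zero.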
Tony Battersby
41692d22dd qla2x00t-32gbit: Fix lost interrupts with qlini_mode=disabled
When qla2xxx is loaded with qlini_mode=disabled,
ha->flags.disable_msix_handshake is used before it is set, resulting in
the wrong interrupt handler being used on certain HBAs
(qla2xxx_msix_rsp_q_hs() is used when qla2xxx_msix_rsp_q() should be
used).  The only difference between these two interrupt handlers is that
the _hs() version writes to a register to clear the "RISC" interrupt,
whereas the other version does not.  So this bug results in the RISC
interrupt being cleared when it should not be.  This occasionally causes
a different interrupt handler qla24xx_msix_default() for a different
vector to see ((stat & HSRX_RISC_INT) == 0) and ignore its interrupt,
which then causes problems like:

qla2xxx [0000:02:00.0]-d04c:6: MBX Command timeout for cmd 20,
  iocontrol=8 jiffies=1090c0300 mb[0-3]=[0x4000 0x0 0x40 0xda] mb7 0x500
  host_status 0x40000010 hccr 0x3f00
qla2xxx [0000:02:00.0]-101e:6: Mailbox cmd timeout occurred, cmd=0x20,
  mb[0]=0x20. Scheduling ISP abort
(the cmd varies; sometimes it is 0x20, 0x22, 0x54, 0x5a, 0x5d, or 0x6a)

This problem can be reproduced with a 16 or 32 Gbps HBA by loading
qla2xxx with qlini_mode=disabled and running a high IOPS test while
triggering frequent RSCN database change events.

While analyzing the problem I discovered that even with
disable_msix_handshake forced to 0, it is not necessary to clear the
RISC interrupt from qla2xxx_msix_rsp_q_hs() (more below).  So just
completely remove qla2xxx_msix_rsp_q_hs() and the logic for selecting
it, which also fixes the bug with qlini_mode=disabled.

The test below describes the justification for not needing
qla2xxx_msix_rsp_q_hs():

Force disable_msix_handshake to 0:
qla24xx_config_rings():
if (0 && (ha->fw_attributes & BIT_6) && (IS_MSIX_NACK_CAPABLE(ha)) &&
    (ha->flags.msix_enabled)) {

In qla24xx_msix_rsp_q() and qla2xxx_msix_rsp_q_hs(), check:
  (rd_reg_dword(&reg->host_status) & HSRX_RISC_INT)

Count the number of calls to each function with HSRX_RISC_INT set and
the number with HSRX_RISC_INT not set while performing some I/O.

If qla2xxx_msix_rsp_q_hs() clears the RISC interrupt (original code):
qla24xx_msix_rsp_q:    50% of calls have HSRX_RISC_INT set
qla2xxx_msix_rsp_q_hs:  5% of calls have HSRX_RISC_INT set
(# of qla2xxx_msix_rsp_q_hs interrupts) =
    (# of qla24xx_msix_rsp_q interrupts) * 3

If qla2xxx_msix_rsp_q_hs() does not clear the RISC interrupt (patched
code):
qla24xx_msix_rsp_q:    100% of calls have HSRX_RISC_INT set
qla2xxx_msix_rsp_q_hs:   9% of calls have HSRX_RISC_INT set
(# of qla2xxx_msix_rsp_q_hs interrupts) =
    (# of qla24xx_msix_rsp_q interrupts) * 3

In the case of the original code, qla24xx_msix_rsp_q() was seeing
HSRX_RISC_INT set only 50% of the time because qla2xxx_msix_rsp_q_hs()
was clearing it when it shouldn't have been.  In the patched code,
qla24xx_msix_rsp_q() sees HSRX_RISC_INT set 100% of the time, which
makes sense if that interrupt handler needs to clear the RISC interrupt
(which it does).  qla2xxx_msix_rsp_q_hs() sees HSRX_RISC_INT only 9% of
the time, which is just overlap from the other interrupt during the
high IOPS test.

Tested with SCST on:
QLE2742  FW:v9.08.02 (32 Gbps 2-port)
QLE2694L FW:v9.10.11 (16 Gbps 4-port)
QLE2694L FW:v9.08.02 (16 Gbps 4-port)
QLE2672  FW:v8.07.12 (16 Gbps 2-port)
both initiator and target mode

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/56d378eb-14ad-49c7-bae9-c649b6c7691e@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 4f6aaade2a22 upstream ]
2025-12-09 22:33:47 +03:00
Tony Battersby
12e3d15b9d qla2x00t-32gbit: Fix initiator mode with qlini_mode=exclusive
When given the module parameter qlini_mode=exclusive, qla2xxx in
initiator mode is initially unable to successfully send SCSI commands to
devices it finds while scanning, resulting in an escalating series of
resets until an adapter reset clears the issue.  Fix by checking the
active mode instead of the module parameter.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/1715ec14-ba9a-45dc-9cf2-d41aa6b81b5e@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 8f58fc64d559 upstream ]
2025-12-09 22:33:47 +03:00
Tony Battersby
5328814318 qla2x00t-32gbit: Revert "qla2x00t-32gbit: Perform lockless command completion in abort path"
This reverts commit 0367076b0817d5c75dfb83001ce7ce5c64d803a9.

The commit being reverted added code to __qla2x00_abort_all_cmds() to
call sp->done() without holding a spinlock.  But unlike the older code
below it, this new code failed to check sp->cmd_type and just assumed
TYPE_SRB, which results in a jump to an invalid pointer in target-mode
with TYPE_TGT_CMD:

qla2xxx [0000:65:00.0]-d034:8: qla24xx_do_nack_work create sess success
  0000000009f7a79b
qla2xxx [0000:65:00.0]-5003:8: ISP System Error - mbx1=1ff5h mbx2=10h
  mbx3=0h mbx4=0h mbx5=191h mbx6=0h mbx7=0h.
qla2xxx [0000:65:00.0]-d01e:8: -> fwdump no buffer
qla2xxx [0000:65:00.0]-f03a:8: qla_target(0): System error async event
  0x8002 occurred
qla2xxx [0000:65:00.0]-00af:8: Performing ISP error recovery -
  ha=0000000058183fda.
BUG: kernel NULL pointer dereference, address: 0000000000000000
PF: supervisor instruction fetch in kernel mode
PF: error_code(0x0010) - not-present page
PGD 0 P4D 0
Oops: 0010 [#1] SMP
CPU: 2 PID: 9446 Comm: qla2xxx_8_dpc Tainted: G           O       6.1.133 #1
Hardware name: Supermicro Super Server/X11SPL-F, BIOS 4.2 12/15/2023
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90001f93dc8 EFLAGS: 00010206
RAX: 0000000000000282 RBX: 0000000000000355 RCX: ffff88810d16a000
RDX: ffff88810dbadaa8 RSI: 0000000000080000 RDI: ffff888169dc38c0
RBP: ffff888169dc38c0 R08: 0000000000000001 R09: 0000000000000045
R10: ffffffffa034bdf0 R11: 0000000000000000 R12: ffff88810800bb40
R13: 0000000000001aa8 R14: ffff888100136610 R15: ffff8881070f7400
FS:  0000000000000000(0000) GS:ffff88bf80080000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000010c8ff006 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 ? __die+0x4d/0x8b
 ? page_fault_oops+0x91/0x180
 ? trace_buffer_unlock_commit_regs+0x38/0x1a0
 ? exc_page_fault+0x391/0x5e0
 ? asm_exc_page_fault+0x22/0x30
 __qla2x00_abort_all_cmds+0xcb/0x3e0 [qla2xxx_scst]
 qla2x00_abort_all_cmds+0x50/0x70 [qla2xxx_scst]
 qla2x00_abort_isp_cleanup+0x3b7/0x4b0 [qla2xxx_scst]
 qla2x00_abort_isp+0xfd/0x860 [qla2xxx_scst]
 qla2x00_do_dpc+0x581/0xa40 [qla2xxx_scst]
 kthread+0xa8/0xd0
 </TASK>

Then commit 4475afa2646d ("scsi: qla2xxx: Complete command early within
lock") added the spinlock back, because not having the lock caused a
race and a crash.  But qla2x00_abort_srb() in the switch below already
checks for qla2x00_chip_is_down() and handles it the same way, so the
code above the switch is now redundant and still buggy in target-mode.
Remove it.

Cc: stable@vger.kernel.org
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/3a8022dc-bcfd-4b01-9f9b-7a9ec61fa2a3@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit b57fbc88715b upstream ]
2025-12-09 22:33:47 +03:00
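Note on the commit above: the crash came from calling a completion callback without first checking what kind of entry was being aborted. A minimal userspace sketch of dispatching on the type before touching the callback follows; the types and names are hypothetical and much simpler than the driver's srb_t handling.

```c
#include <stddef.h>

/* Hypothetical model types, loosely following TYPE_SRB / TYPE_TGT_CMD. */
enum model_cmd_type { MODEL_TYPE_SRB, MODEL_TYPE_TGT_CMD };

struct model_entry {
	enum model_cmd_type cmd_type;
	void (*done)(struct model_entry *e);	/* only valid for MODEL_TYPE_SRB */
	int completed;
};

static void model_srb_done(struct model_entry *e)
{
	e->completed = 1;
}

/* Abort every entry; returns how many were completed through ->done(). */
static int model_abort_all(struct model_entry *entries, size_t n)
{
	int completed = 0;

	for (size_t i = 0; i < n; i++) {
		struct model_entry *e = &entries[i];

		switch (e->cmd_type) {
		case MODEL_TYPE_SRB:
			e->done(e);
			completed++;
			break;
		case MODEL_TYPE_TGT_CMD:
			/* No ->done() callback in this model; calling it would jump to NULL. */
			break;
		}
	}
	return completed;
}
```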
Gleb Chesnokov
29aa9a3202 scst: Port to Linux kernel v6.19
Support for the following core changes in the Linux kernel v6.19:

  - 15115830c887 ("preempt: Cleanup the macro maze a bit")
2025-12-09 17:00:47 +03:00
Gleb Chesnokov
78d41552b4 scst/include/backport.h: Unbreak build on kernels < 6.13
2025-12-09 16:06:22 +03:00
Gleb Chesnokov
f1faed032e Revert "qla2x00t-32gbit: Fix memcpy() field-spanning write issue"
This reverts commit 6f4b10226b6b1e7d1ff3cdb006cf0f6da6eed71e.

We've been testing this patch and it turns out there is a significant
bug here. This leaks memory and causes a driver hang.

Link: https://lore.kernel.org/linux-scsi/yq1zfajqpec.fsf@ca-mkp.ca.oracle.com/
Signed-off-by: John Meneghini <jmeneghi@redhat.com>
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 285654d58a74 upstream ]
2025-12-09 16:06:22 +03:00
Gleb Chesnokov
7ce251a956 qla2x00t-32gbit: Fix incorrect sign of error code in qla_nvme_xmt_ls_rsp()
Change the error code EAGAIN to -EAGAIN in qla_nvme_xmt_ls_rsp() to
align with qla2x00_start_sp() returning negative error codes or
QLA_SUCCESS, preventing logical errors.

Fixes: 875386b98857 ("scsi: qla2xxx: Add Unsolicited LS Request and Response Support for NVMe")
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Message-ID: <20250905075446.381139-4-rongqianfeng@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 9877c004e9f4 upstream ]
2025-12-09 16:06:22 +03:00
Gleb Chesnokov
394aa1a409 qla2x00t-32gbit: Fix incorrect sign of error code in START_SP_W_RETRIES()
Change the error code EAGAIN to -EAGAIN in START_SP_W_RETRIES() to align
with qla2x00_start_sp() returning negative error codes or QLA_SUCCESS,
preventing logical errors.  Additionally, the '_rval' variable should
store negative error codes to conform to Linux kernel error code
conventions.

Fixes: 9803fb5d2759 ("scsi: qla2xxx: Fix task management cmd failure")
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Message-ID: <20250905075446.381139-3-rongqianfeng@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 1f037e3acda7 upstream ]
2025-12-09 16:06:22 +03:00
Gleb Chesnokov
2cb467baeb qla2x00t-32gbit: edif: Fix incorrect sign of error code
Change the error code EAGAIN to -EAGAIN in qla24xx_sadb_update() and
qla_edif_process_els() to align with qla2x00_start_sp() returning
negative error codes or QLA_SUCCESS, preventing logical errors.

Fixes: 0b3f3143d473 ("scsi: qla2xxx: edif: Add retry for ELS passthrough")
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Message-ID: <20250905075446.381139-2-rongqianfeng@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 066b8f3fa85c upstream ]
2025-12-09 16:06:22 +03:00
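Note on the three error-sign commits above: when a callee returns negative errno values, a retry check against positive EAGAIN never matches. The sketch below is a userspace model with hypothetical helper names; it is not the driver's START_SP_W_RETRIES() macro.

```c
#include <errno.h>

#define MODEL_SUCCESS 0

/* Hypothetical start function: returns 0 on success or a negative errno. */
static int model_start(int attempts_until_ready, int attempt)
{
	return attempt < attempts_until_ready ? -EAGAIN : MODEL_SUCCESS;
}

/* Retry while the callee reports -EAGAIN; returns the final status. */
static int model_start_with_retries(int attempts_until_ready, int max_attempts,
				    int *attempts_made)
{
	int rval = -EAGAIN;
	int attempt = 0;

	while (attempt < max_attempts) {
		rval = model_start(attempts_until_ready, attempt);
		attempt++;
		/*
		 * Testing "rval != EAGAIN" (positive) here would be true for
		 * -EAGAIN as well, so the loop would stop after one attempt.
		 */
		if (rval != -EAGAIN)
			break;
	}
	*attempts_made = attempt;
	return rval;
}
```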
Gleb Chesnokov
68a8eedd22 qla2x00t-32gbit: Use secs_to_jiffies() instead of msecs_to_jiffies()
Use secs_to_jiffies() instead of msecs_to_jiffies() and avoid scaling
'ratov_j' to milliseconds.

No functional changes intended.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://lore.kernel.org/r/20250828161153.3676-2-thorsten.blum@linux.dev
Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit e02436d37a47 upstream ]
2025-12-09 16:06:22 +03:00
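Note on the commit above: for whole seconds the two conversions give the same tick count, which is why no functional change is expected. The sketch below re-implements both in userspace under the assumption that HZ is 250 and divides 1000 evenly; MODEL_HZ and the function names are illustrative, not the kernel definitions.

```c
/* Assumed tick rate for the model; real kernels configure HZ at build time. */
#define MODEL_HZ		250UL
#define MODEL_MSEC_PER_SEC	1000UL

/* Seconds to ticks: a plain multiplication. */
static unsigned long model_secs_to_jiffies(unsigned long secs)
{
	return secs * MODEL_HZ;
}

/* Milliseconds to ticks, rounding up, for the case where HZ divides 1000. */
static unsigned long model_msecs_to_jiffies(unsigned long ms)
{
	unsigned long ms_per_jiffy = MODEL_MSEC_PER_SEC / MODEL_HZ;

	return (ms + ms_per_jiffy - 1) / ms_per_jiffy;
}
```

With these definitions, model_secs_to_jiffies(ratov) equals model_msecs_to_jiffies(ratov * 1000) for any whole number of seconds that does not overflow.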
Gleb Chesnokov
1122dc9f77 qla2x00t-32gbit: Fix memcpy() field-spanning write issue
purex_item.iocb is defined as a 64-element u8 array, but 64 is the
minimum size and it can be allocated larger. This makes it a standard
empty flex array.

This was motivated by field-spanning write warnings during FPIN testing:

https://lore.kernel.org/linux-nvme/20250709211919.49100-1-bgurney@redhat.com/

  >  kernel: memcpy: detected field-spanning write (size 60) of single field
  >  "((uint8_t *)fpin_pkt + buffer_copy_offset)"
  >  at drivers/scsi/qla2xxx/qla_isr.c:1221 (size 44)

I removed the outer wrapper from the iocb flex array, so that it can be
linked to 'purex_item.size' with '__counted_by'.

These changes remove the default minimum 64-byte allocation, requiring
further changes.

  In 'struct scsi_qla_host' the embedded 'default_item' is now followed
  by '__default_item_iocb[QLA_DEFAULT_PAYLOAD_SIZE]' to reserve space
  that will be used as 'default_item.iocb'. This is wrapped using the
  'TRAILING_OVERLAP()' macro helper, which effectively creates a union
  between flexible-array member 'default_item.iocb' and
  '__default_item_iocb'.

  Since 'struct purex_item' now contains a flexible-array member, the
  helper must be placed at the end of 'struct scsi_qla_host' to prevent
  a '-Wflex-array-member-not-at-end' warning.

  'qla24xx_alloc_purex_item()' is adjusted to no longer expect the
  default minimum size to be part of 'sizeof(struct purex_item)', the
  entire flexible array size is added to the structure size for
  allocation.

This also slightly changes the layout of the purex_item struct, as
2 bytes of padding are added between 'size' and 'iocb'. The resulting
size is the same, but iocb is shifted by 2 bytes (the original 'purex_item'
structure was padded at the end, after the 64-byte defined array size).
I don't think this is a problem.

Tested-by: Bryan Gurney <bgurney@redhat.com>
Co-developed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20250813200744.17975-10-bgurney@redhat.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 6f4b10226b6b upstream ]
2025-12-09 16:06:22 +03:00
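Note on the commit above: with a flexible array member, the allocation size is the header size plus the whole payload, with no built-in minimum. The sketch below is a userspace model with a deliberately reduced, hypothetical struct; the real purex_item has more fields, and the kernel uses its own allocators and the '__counted_by' annotation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Reduced model of an item with a trailing payload. */
struct model_purex_item {
	uint16_t size;		/* number of valid bytes in iocb[] */
	uint8_t iocb[];		/* flexible array member, must be last */
};

/* Allocate an item sized for 'size' payload bytes and copy the payload in. */
static struct model_purex_item *model_alloc_item(const void *payload,
						 uint16_t size)
{
	struct model_purex_item *item = malloc(sizeof(*item) + size);

	if (!item)
		return NULL;
	item->size = size;
	if (size)
		memcpy(item->iocb, payload, size);
	return item;
}
```

The caller owns the returned item and releases it with free().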
Gleb Chesnokov
204ef22963 qla2x00t, qla2x00t-32gbit: Update device error_state already after reset
After a Fatal Error has been reported by a device and has been recovered
through a Secondary Bus Reset, AER updates the device's error_state to
pci_channel_io_normal before invoking its driver's ->resume() callback.

By contrast, EEH updates the error_state earlier, namely after resetting
the device and before invoking its driver's ->slot_reset() callback.
Commit c58dc575f3c8 ("powerpc/pseries: Set error_state to
pci_channel_io_normal in eeh_report_reset()") explains in great detail
that the earlier invocation is necessitated by various drivers checking
accessibility of the device with pci_channel_offline() and avoiding
accesses if it returns true.  It returns true for any other error_state
than pci_channel_io_normal.

The device should be accessible already after reset, hence the reasoning
is that it's safe to update the error_state immediately afterwards.

This deviation between AER and EEH seems problematic because drivers
behave differently depending on which error recovery mechanism the
platform uses.  Three drivers have gone so far as to update the
error_state themselves, presumably to work around AER's behavior.

For consistency, amend AER to update the error_state at the same recovery
steps as EEH.  Drop the now unnecessary workaround from the three drivers.

Keep updating the error_state before ->resume() in case ->error_detected()
or ->mmio_enabled() return PCI_ERS_RESULT_RECOVERED, which causes
->slot_reset() to be skipped.  There are drivers doing this even for Fatal
Errors, e.g. mhi_pci_error_detected().

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/4517af6359ffb9d66152b827a5d2833459144e3f.1755008151.git.lukas@wunner.de
[ commit 45bc82563d55 upstream ]
2025-12-09 16:06:22 +03:00
Gleb Chesnokov
b2250f6ead nightly build: Update kernel versions
Another kernel version update
2025-12-09 13:32:41 +03:00
Gleb Chesnokov
3fb16aa624 scst: Unbreak the RHEL 10.1 build
Fixes: https://github.com/SCST-project/scst/issues/317
2025-12-09 11:53:38 +03:00
Gleb Chesnokov
7d0b1d2588 scst: Unbreak the RHEL 9.7 build
Fixes: https://github.com/SCST-project/scst/issues/317
2025-11-19 11:38:42 +03:00
MajorP93
9590762792 debian, scst-dkms: Move the .install file creation to the correct location in install target
* This fixes an issue where the resulting scst-dkms deb package was empty and could not be installed.
* By moving the .install file creation to the install target we ensure:
  - The .install file is generated after the version is set
  - Paths match the actual DKMS source location
  - File contents aren't overwritten by later operations
2025-11-19 10:05:19 +03:00
Gleb Chesnokov
2df209ea5f scstadmin: Fix precedence typo in error propagation
Fix Perl precedence warnings:

  Possible precedence problem between ! and numeric gt (>) at SCST.pm line 980.
  Possible precedence problem between ! and numeric gt (>) at SCST.pm line 1223.
  Possible precedence problem between ! and numeric gt (>) at SCST.pm line 3847.
2025-11-05 11:41:41 +03:00
Gleb Chesnokov
492b6ccbea scstadmin.spec: Install unit into %{_unitdir} and package it
Fixes: https://github.com/SCST-project/scst/issues/323
2025-11-05 11:41:41 +03:00
Ameer Hamza
6c73cc8d2f scst_lib: Port to Linux kernel v6.18
Support for the following block layer and memory management changes in
the Linux kernel v6.18:

  - d86eaa0f3c56 ("block: remove the bi_inline_vecs variable sized array
    from struct bio")
  - 84efbefa26df ("mm: remove nth_page()")
2025-10-30 11:02:08 +03:00
Brian M
0d3c9018af debian, scstadmin: Add systemd scst.service
Add systemd service file when packaging for Debian.  Current
systemd will automatically generate one, but this functionality
will be removed in a future version of systemd.
2025-10-28 18:15:04 +03:00
Gleb Chesnokov
08f5f43df8 fcst: Fix multiple checkpatch warnings
This patch does not change any functionality.
2025-10-21 12:18:36 +03:00
Gleb Chesnokov
dff8aedcaa scst_priv.h: Drop redundant 'extern' from function prototypes
This patch does not change any functionality.
2025-10-21 12:18:36 +03:00
Gleb Chesnokov
3928d6d74f iscsi-scst: Drop redundant 'extern' from function prototypes
This patch does not change any functionality.
2025-10-21 12:18:36 +03:00
Gleb Chesnokov
6a2b3e36e4 ib_srpt: Fix multiple checkpatch warnings
This patch does not change any functionality.
2025-10-21 12:18:36 +03:00
Gleb Chesnokov
3b6e2ed8be scst_local_cmd: Fix multiple checkpatch warnings
This patch does not change any functionality.
2025-10-21 12:18:36 +03:00
Gleb Chesnokov
6ca7f49b8c scst_tg: Fix multiple checkpatch warnings
This patch does not change any functionality.
2025-10-21 12:18:36 +03:00
Gleb Chesnokov
6543c4c316 qla2x00t-32gbit: Remove firmware URL
The historic QLogic firmware URL redirects to a Marvell page that only
provides drivers.

Refer to linux-firmware instead.

Cc: Nilesh Javali <njavali@marvell.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: QLOGIC ML <GR-QLogic-Storage-Upstream@marvell.com>
Cc: LINUX SCSI ML <linux-scsi@vger.kernel.org>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Link: https://lore.kernel.org/r/20250624190926.115009-1-xose.vazquez@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit b152f199fa43 upstream ]
2025-10-01 22:11:28 +03:00
Gleb Chesnokov
0e890f5167 scst: Indent Kconfig help text
Fix indentation of config option's help text by adding leading spaces.
Generally, help text is indented by a couple of spaces beyond the leading
tab <\t> character.  This helps Kconfig parsers read the file without errors.
2025-10-01 22:11:28 +03:00
Gleb Chesnokov
2c69fe018c scst: Use block layer helpers to calculate num of queues
The calculation of the upper limit for queues does not depend solely on
the number of online CPUs; for example, the isolcpus kernel
command-line option must also be considered.

To account for this, the block layer provides a helper function to
retrieve the maximum number of queues. Use it to set an appropriate
upper queue number limit.
2025-10-01 22:11:28 +03:00
Gleb Chesnokov
3960dc87ac qla2x00t-32gbit: Avoid stack frame size warning in qla_dfs
The qla2x00_dfs_tgt_port_database_show() function constructs a fake
fc_port_t object on the stack, which--depending on the configuration--is
large enough to exceed the stack size warning limit:

drivers/scsi/qla2xxx/qla_dfs.c:176:1: error: stack frame size (1392) exceeds limit (1280) in 'qla2x00_dfs_tgt_port_database_show' [-Werror,-Wframe-larger-than]

Rework this function to no longer need the structure but instead call a
custom helper function that just prints the data directly from the
port_database_24xx structure.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250620173232.864179-1-arnd@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 6243146bb019 upstream ]
2025-10-01 22:11:28 +03:00
Gleb Chesnokov
fae9d2a1e5 scst: Unbreak UEK and RHEL builds
Drop OL6/UEK R4 support.
2025-10-01 22:11:28 +03:00
Gleb Chesnokov
5245d1cb15 .github/workflows: Print compilation output and improve annotations
Show the kernel compilation output whenever the run reached the
compilation stage (both pass/fail cases) and add a readable prefix
with the actual filename. Also refactor to use variables for version,
workdir, and output; quote expansions; and switch to titled GitHub
Actions annotations. Drop `-k` and rely on explicit cleanup.
2025-10-01 22:11:28 +03:00
Gleb Chesnokov
59bdb2463c scripts/specialize-patch: treat "#if 0" and "#if 0 && ..." as constant-false
Partial evaluation can yield guards like "+#if 0 && ...". These are false
but bypassed the filter that only matched exact "+#if 0"/"+#elif 0".
Tighten the regex to match the original spacing and catch both forms.

This is the minimal change addressing the bug observed in logs such as:

  (c) +#if 0 && !(1 && defined(FC_PORTSPEED_256GBIT)) ...
  (g2) ... output = 1   <-- wrong

After this change such guards are dropped correctly (output = 0).
2025-10-01 22:11:28 +03:00
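Note on the commit above: the matching rule it describes (treat "+#if 0" and "+#elif 0" as false both on their own and when followed by " && ...") can be modeled without a regex. The sketch below is an illustration in C with hypothetical function names; the actual script uses its own regular expression, which is not reproduced here.

```c
#include <stdbool.h>
#include <string.h>

/* True if the text after the "0" ends the line or continues with " &&". */
static bool model_tail_is_false(const char *rest)
{
	return rest[0] == '\0' || strncmp(rest, " &&", 3) == 0;
}

/* True if a patch line is a preprocessor guard that is constant-false. */
static bool model_guard_is_constant_false(const char *line)
{
	static const char *const prefixes[] = { "+#if 0", "+#elif 0" };

	for (size_t i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
		size_t len = strlen(prefixes[i]);

		if (strncmp(line, prefixes[i], len) == 0)
			return model_tail_is_false(line + len);
	}
	return false;
}
```

A guard such as "+#if 0 || defined(X)" is deliberately not matched, since "0 || ..." is not constant-false.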
Gleb Chesnokov
9543060a2c scripts/generate-kernel-patch: Use krel for qla2xxx patch paths
Fix mismatch where generate-kernel-patch keyed paths by full_kver
(with ‘^’) but in-tree patches were written under krel
(before ‘^’). Derive krel=${full_kver/^*} and use it for qla2xxx path
resolution.
2025-09-30 18:09:11 +03:00
Gleb Chesnokov
90c26fa844 scripts/run-regression-tests: Drop 2.6.x paths and simplify builds
Always include drivers/scsi/qla2xxx in subdirs and extend the local
module build loop to cover both qla2x00t and qla2x00t-32gbit.
2025-09-30 18:09:11 +03:00
Gleb Chesnokov
8180623310 nightly build: Update kernel versions
Another kernel version update
2025-09-30 12:13:22 +03:00
Gleb Chesnokov
420b06472b scst_event: Fix multiple checkpatch warnings
This patch does not change any functionality.
2025-09-23 13:24:49 +03:00
Gleb Chesnokov
c38b254c7a scst_mem: Fix multiple checkpatch warnings
This patch does not change any functionality.
2025-09-23 12:49:00 +03:00
Gleb Chesnokov
6e787cbefd scst_pres: Fix multiple checkpatch warnings
This patch does not change any functionality.
2025-09-23 11:41:48 +03:00
Gleb Chesnokov
10740bb400 scst_targ: Fix multiple checkpatch warnings
This patch does not change any functionality.
2025-09-16 14:19:05 +03:00
Gleb Chesnokov
5072f0ce58 scst_copy_mgr: Fix multiple checkpatch warnings
This patch does not change any functionality.
2025-09-09 12:21:23 +03:00
Tony Battersby
f8bfa638b6 scst: Export scst_tgt->sg_tablesize via sysfs
This value is available in initiator mode via
/sys/class/scsi_host/hostN/sg_tablesize; make it available in target
mode as well.  Userspace code may use it when making decisions about
buffer sizes.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
2025-09-08 16:40:12 +03:00
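Note on the commit above: userspace can read such an attribute like any other sysfs file and use it to bound a buffer size. The sketch below is a minimal illustration with hypothetical helper names; the path of the target-mode attribute is not given in the commit message, and the bytes-per-segment figure is an assumption the caller must supply.

```c
#include <stdio.h>

/*
 * Read one unsigned decimal value from a sysfs-style attribute file.
 * Returns 0 on success and -1 if the file cannot be opened or parsed.
 */
static int model_read_uint_attr(const char *path, unsigned int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%u", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Upper bound on one transfer if each SG entry maps bytes_per_segment bytes. */
static unsigned long model_max_transfer_bytes(unsigned int sg_tablesize,
					      unsigned long bytes_per_segment)
{
	return (unsigned long)sg_tablesize * bytes_per_segment;
}
```

In initiator mode the path would be /sys/class/scsi_host/hostN/sg_tablesize, as stated in the commit message.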