Current driver does unnecessary pause for each session to get to certain
state before allowing the app start call to return. In larger environment,
this introduces a long delay. Originally the delay was meant to
synchronize app and driver. However, the with current implementation the
two sides use various events to synchronize their state.
The same is applied to the authentication failure call.
Link: https://lore.kernel.org/r/20211026115412.27691-6-njavali@marvell.com
Fixes: 4de067e5df12 ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit b492d6a4880f upstream ]
When user uses issue_lip to do link bounce, driver sends additional target
reset to remote device before resetting the link. The target reset would
affect other paths with active I/Os. This patch will remove the unnecessary
target reset.
Link: https://lore.kernel.org/r/20211026115412.27691-4-njavali@marvell.com
Fixes: 5854771e314e ("[SCSI] qla2xxx: Add ISPFX00 specific bus reset routine")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 0b7a9fd934a6 upstream ]
For RSCN of type "Area, Domain, or Fabric", which indicate a portion or
entire fabric was disturbed, current driver does not set the scan_need flag
to indicate a session was affected by the disturbance. This in turn can
lead to I/O timeout and delay of relogin. Hence initiate relogin in the
event of fabric disturbance.
Link: https://lore.kernel.org/r/20211026115412.27691-2-njavali@marvell.com
Fixes: 1560bafdff9e ("scsi: qla2xxx: Use complete switch scan for RSCN events")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit bb2ca6b3f09a upstream ]
Implement ->map queues and use the block layer blk_mq_pci_map_queues
helper for mapping queues to CPUs.
With this mapping minimum 10%+ increase in performance is noticed.
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[ commit 2b2af50ae836 upstream ]
The sgl is freed in the target stack in target_release_cmd_kref() before
calling qlt_free_cmd() but there is an unmap of sgl in qlt_free_cmd() that
causes a panic if sgl is not yet DMA unmapped:
NIP dma_direct_unmap_sg+0xdc/0x180
LR dma_direct_unmap_sg+0xc8/0x180
Call Trace:
ql_dbg_prefix+0x68/0xc0 [qla2xxx] (unreliable)
dma_unmap_sg_attrs+0x54/0xf0
qlt_unmap_sg.part.19+0x54/0x1c0 [qla2xxx]
qlt_free_cmd+0x124/0x1d0 [qla2xxx]
tcm_qla2xxx_release_cmd+0x4c/0xa0 [tcm_qla2xxx]
target_put_sess_cmd+0x198/0x370 [target_core_mod]
transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
tcm_qla2xxx_complete_free+0x6c/0x90 [tcm_qla2xxx]
The sgl may be left unmapped in error cases of response sending. For
instance, qlt_rdy_to_xfer() maps sgl and exits when session is being
deleted keeping the sgl mapped.
This patch removes use-after-free of the sgl and ensures that the sgl is
unmapped for any command that was not sent to firmware.
Link: https://lore.kernel.org/r/20211018122650.11846-1-d.bogdanov@yadro.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 4a8f71014b4d upstream ]
Commit 8c0eb596baa5 ("[SCSI] qla2xxx: Fix a memory leak in an error path of
qla2x00_process_els()"), intended to change:
bsg_job->request->msgcode == FC_BSG_HST_ELS_NOLOGIN
to:
bsg_job->request->msgcode != FC_BSG_RPT_ELS
but changed it to:
bsg_job->request->msgcode == FC_BSG_RPT_ELS
instead.
Change the == to a != to avoid leaking the fcport structure or freeing
unallocated memory.
Link: https://lore.kernel.org/r/20211012191834.90306-2-jgu@purestorage.com
Fixes: 8c0eb596baa5 ("[SCSI] qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Joy Gu <jgu@purestorage.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 7fb223d0ad80 upstream ]
DPC thread gets restricted due to a no-op mailbox, which is a blocking call
and has a high execution frequency. To free up the DPC thread we move no-op
handling to the workqueue. Also, modified qla_do_heartbeat() to send no-op
MBC if we don’t have any active interrupts, but there are still I/Os
outstanding with firmware.
Link: https://lore.kernel.org/r/20210908164622.19240-9-njavali@marvell.com
Fixes: d94d8158e184 ("scsi: qla2xxx: Add heartbeat check")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 3a4e1f3b3a3c upstream ]
Authentication application may be running and in the past tried to probe
driver (app_start) but was unsuccessful. This could be due to the bsg layer
not being ready to service the request. On a successful link up, driver
will use the netlink Link Up event to notify the app to retry the app_start
call.
In another case, app does not poll for new NPIV host. This link up event
would notify app of the presence of a new SCSI host.
Link: https://lore.kernel.org/r/20210908164622.19240-6-njavali@marvell.com
Fixes: 4de067e5df12 ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 527d46e0b014 upstream ]
System crash was seen when I/O was run against an NVMe target and aborts
were occurring.
Crash stack is:
-- relevant crash stack --
BUG: kernel NULL pointer dereference, address: 0000000000000010
:
#6 [ffffae1f8666bdd0] page_fault at ffffffffa740122e
[exception RIP: qla_nvme_abort_work+339]
RIP: ffffffffc0f592e3 RSP: ffffae1f8666be80 RFLAGS: 00010297
RAX: 0000000000000000 RBX: ffff9b581fc8af80 RCX: ffffffffc0f83bd0
RDX: 0000000000000001 RSI: ffff9b5839c6c7c8 RDI: 0000000008000000
RBP: ffff9b6832f85000 R8: ffffffffc0f68160 R9: ffffffffc0f70652
R10: ffffae1f862ffdc8 R11: 0000000000000300 R12: 000000000000010d
R13: 0000000000000000 R14: ffff9b5839cea000 R15: 0ffff9b583fab170
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffffae1f8666be98] process_one_work at ffffffffa6aba184
#8 [ffffae1f8666bed8] worker_thread at ffffffffa6aba39d
#9 [ffffae1f8666bf10] kthread at ffffffffa6ac06ed
The crash was due to a stale SRB structure access after it was aborted.
Fix the issue by removing stale access.
Link: https://lore.kernel.org/r/20210908164622.19240-5-njavali@marvell.com
Fixes: 2cabf10dbbe3 ("scsi: qla2xxx: Fix hang on NVMe command timeouts")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit e6e22e6cc296 upstream ]
Update the Q2T_VERSION_STRING for the qla2x00t-32gbit driver.
Fixes: 0b3aad05 ("scripts/update-version: Also update the qla2x00t-32gbit version number")