Commit 7c15fc8f0d93 ("qla2x00t-32gbit: Enhance driver tracing with
separate tunable and more") introduced the use of the trace.h/trace_events.h
API.
Because older kernel versions are still supported, restrict the enhanced
driver tracing to kernel v5.5 and later.
See also commit 288797871473 ("tracing: Adding new functions for kernel
access to Ftrace instances") # v5.5.
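A restriction like this is typically a compile-time comparison against the kernel version code. The user-space sketch below mirrors the classic KERNEL_VERSION() encoding from <linux/version.h> under a hypothetical SKETCH_ prefix; the helper name is an assumption for illustration, not a driver identifier.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Classic encoding used by KERNEL_VERSION() in <linux/version.h>,
 * redefined under a hypothetical SKETCH_ prefix so it builds in user space.
 */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* v5.5.0 encodes as 0x050500 under this scheme. */
static_assert(SKETCH_KERNEL_VERSION(5, 5, 0) == 0x050500,
	      "unexpected version encoding");

/* Hypothetical helper: true when the enhanced tracing path may be used. */
static bool sketch_has_trace_instances(unsigned int version_code)
{
	return version_code >= SKETCH_KERNEL_VERSION(5, 5, 0);
}
```

In kernel code the same comparison would normally appear in an #if on LINUX_VERSION_CODE rather than in a runtime helper.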
Previously, driver messages were traced as follows:
- only debug messages were logged to the kernel main trace buffer; and
- a message was logged only if the extended logging bits corresponding to
it were off
This has been modified and extended as follows:
- Tracing is now controlled via the ql2xextended_error_logging_ktrace
module parameter. Its bits have the same meaning as those of
ql2xextended_error_logging.
- Tracing uses the "qla2xxx" trace instance, unless creating the instance
fails.
- Tracing is enabled (compile time tunable).
- All driver messages, including debug and log messages, are now traced in
the kernel trace buffer.
Trace messages can be viewed by looking at the qla2xxx instance at:
/sys/kernel/tracing/instances/qla2xxx/trace
Trace tunable that takes the same bit mask as ql2xextended_error_logging
is:
ql2xextended_error_logging_ktrace (default=1)
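As a minimal user-space sketch of viewing the trace, the helper below copies a trace file to an output stream. The path is the one given above; the helper name and return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Path taken from the commit message above. */
#define QLA2XXX_TRACE_PATH "/sys/kernel/tracing/instances/qla2xxx/trace"

/*
 * Hypothetical helper: copy the trace file at `path` to `out`.
 * Returns the number of newline-terminated lines copied, or -1 if the
 * file cannot be opened (for example when the driver is not loaded or
 * tracefs is not mounted).
 */
static long sketch_dump_trace(const char *path, FILE *out)
{
	char line[1024];
	long lines = 0;
	FILE *in;

	assert(path != NULL && out != NULL);

	in = fopen(path, "r");
	if (!in)
		return -1;

	while (fgets(line, sizeof(line), in)) {
		fputs(line, out);
		if (strchr(line, '\n'))
			lines++;
	}
	fclose(in);
	return lines;
}
```

Calling sketch_dump_trace(QLA2XXX_TRACE_PATH, stdout) would print the instance buffer; reading tracefs usually requires root privileges.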
Link: https://lore.kernel.org/r/20220826102559.17474-6-njavali@marvell.com
Suggested-by: Daniel Wagner <dwagner@suse.de>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 8bfc149ba24c upstream ]
This message is helpful to troubleshoot missing LUNs/SAN boot errors. It'd
be nice to log it by default instead of only being enabled with debug.
This user had an accidental/forgotten file modprobe.d/qla2xxx.conf with the
option qlini_mode=disabled, left over from experiments with FC target mode,
and their boot LUN didn't come up because the driver skips the SCSI scan in
that mode.
However, their boot log didn't provide any clues to help understand that.
The issue/message could be figured out with ql2xextended_error_logging, but
it would have been simpler (or even deflected/addressed by the user) if it
had been there by default. It would also help support/triage/deflection
tooling.
Expected change:
scsi host15: qla2xxx
+qla2xxx [0000:3b:00.0]-00fb:15: skipping scsi_scan_host() for non-initiator port
qla2xxx [0000:3b:00.0]-00fb:15: QLogic QLE2692 - QLE2692 Dual Port 16Gb FC to PCIe Gen3 x8 Adapter.
According to:
qla2x00_probe_one()
...
ret = scsi_add_host(...);
...
ql_log(ql_log_info, ...
"skipping scsi_scan_host() for non-initiator port\n");
...
ql_log(ql_log_info, ...
"QLogic %s - %s.\n", ha->model_number, ha->model_desc);
Link: https://lore.kernel.org/r/20220825120159.275051-1-mfo@canonical.com
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit eee8bb4a2b58 upstream ]
Currently qlt_stop_phase1() may fail to call flush_scheduled_work(), because
list_empty() may return true as soon as qlt_sess_work_fn() has called
list_del(). To close this race window, check list_empty() after
calling flush_scheduled_work().
If this patch causes problems, please check commit c4f135d64382
("workqueue: Wrap flush_workqueue() using a macro"). We are in the process
of removing all flush_scheduled_work() calls from the kernel.
Link: https://lore.kernel.org/r/7f24469d-9e39-3398-d851-329b54c0b923@I-love.SAKURA.ne.jp
Tested-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit a4345557527f upstream ]
Any attempt to flush a kernel-global workqueue can deadlock, so stop using
them and introduce scst_event_wq instead.
See also commit c4f135d64382 ("workqueue: Wrap flush_workqueue()
using a macro") # v5.19.
DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:
1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.
2. There is no handling for it in scsi_decide_disposition() so it
results in entering SCSI error handling.
This has qla2xxx use DID_NO_CONNECT because it looks like we hit this error
when we can't find a port. It will give us the same hard error behavior and
it seems to match the error where we can't find the endpoint.
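For readers unfamiliar with the host byte, the sketch below shows how a driver-side result word could carry DID_NO_CONNECT. The host byte occupies bits 16..23 of the SCSI result and DID_NO_CONNECT is 0x01 in the Linux SCSI headers; the SKETCH_ names and helper functions are assumptions for illustration, not the driver's code.

```c
#include <assert.h>

/* Value of DID_NO_CONNECT in the Linux SCSI headers, under a local name. */
#define SKETCH_DID_NO_CONNECT 0x01u

/* The host byte occupies bits 16..23 of the SCSI result word. */
static unsigned int sketch_host_byte(unsigned int result)
{
	return (result >> 16) & 0xff;
}

/* Replace the host byte of `result` with `code`, keeping the other bytes. */
static unsigned int sketch_set_host_byte(unsigned int result, unsigned int code)
{
	assert(code <= 0xff);
	return (result & ~(0xffu << 16)) | (code << 16);
}

/* Hypothetical completion helper for the "port not found" case. */
static unsigned int sketch_fail_no_port(unsigned int result)
{
	return sketch_set_host_byte(result, SKETCH_DID_NO_CONNECT);
}
```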
Link: https://lore.kernel.org/r/20220812010027.8251-7-michael.christie@oracle.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit a965d35c8741 upstream ]
Support for the following block layer changes in the Linux kernel v6.1:
- a4e1d0b76e7b ("block: Change the return type of blk_mq_map_queues() into void")
Support for the following block layer changes in the Linux kernel v6.1:
- de671d6116b5 ("block: change request end_io handler to pass back a return value")
Support for the following vfs file changes in the Linux kernel v6.1:
- 25885a35a720 ("Change calling conventions for filldir_t")
Support for the following dlm changes in the Linux kernel v6.1:
- 12cda13cfd53 ("fs: dlm: remove DLM_LSFL_FS from uapi")
Sending a REQUEST_SENSE with a buffer size of 0 to a LUN that does not
exist causes the following kernel panic:
RIP: 0010:sg_init_table+0x1e/0x30
Call Trace:
scst_alloc_sg+0xc3/0x270 [scst]
scst_set_cmd_error+0x803/0xa40 [scst]
__scst_init_cmd+0x5c3/0xb80 [scst]
scst_cmd_init_done+0x142/0xae0 [scst]
cmnd_rx_start+0x7f5/0x13d0 [iscsi_scst]
isert_pdu_rx+0x54/0x140 [isert_scst]
isert_recv_completion_handler+0x498/0x580 [isert_scst]
isert_poll_cq+0x396/0x800 [isert_scst]
isert_cq_comp_work_cb+0x4a/0x120 [isert_scst]
process_one_work+0x1d1/0x410
worker_thread+0x2b/0x3d0
kthread+0x11a/0x130
ret_from_fork+0x1f/0x40
Hence set bufflen to 18 if a buffer size of 0 was passed, to avoid the
crash.
Reported-by: Lev Vainblat <lev@zadarastorage.com>
Sending an INQUIRY with a buffer size of 0 to a LUN that does not exist
causes the following kernel panic:
RIP: 0010:sg_init_table+0x1e/0x30
Call Trace:
scst_alloc_sg+0xc3/0x270 [scst]
scst_set_cmd_error+0x8c9/0xa80 [scst]
__scst_init_cmd+0x5c3/0xb80 [scst]
scst_cmd_init_done+0x142/0xae0 [scst]
cmnd_rx_start+0x7f5/0x13d0 [iscsi_scst]
isert_pdu_rx+0x54/0x140 [isert_scst]
isert_recv_completion_handler+0x498/0x580 [isert_scst]
isert_poll_cq+0x396/0x800 [isert_scst]
isert_cq_comp_work_cb+0x4a/0x120 [isert_scst]
process_one_work+0x1d1/0x410
worker_thread+0x2b/0x3d0
kthread+0x11a/0x130
ret_from_fork+0x1f/0x40
Hence set bufflen to 36 if a buffer size of 0 was passed, to avoid the
crash.
Reported-by: Lev Vainblat <lev@zadarastorage.com>
Commit 8f394da36a36 ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG")
made the __qlt_24xx_handle_abts() function return early if
tcm_qla2xxx_find_cmd_by_tag() didn't find a command, but it did not free
the memory allocated for the management command.
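The shape of the leak and its fix can be sketched in user space as follows. Everything here is a hypothetical model: malloc-style allocation stands in for the driver's allocator, sketch_find_cmd_by_tag() stands in for tcm_qla2xxx_find_cmd_by_tag(), and a counter tracks outstanding allocations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int sketch_live_allocs; /* outstanding management commands */

struct sketch_mgmt_cmd {
	unsigned int tag;
};

static struct sketch_mgmt_cmd *sketch_alloc(void)
{
	struct sketch_mgmt_cmd *m = calloc(1, sizeof(*m));

	if (m)
		sketch_live_allocs++;
	return m;
}

static void sketch_free(struct sketch_mgmt_cmd *m)
{
	assert(m != NULL);
	free(m);
	sketch_live_allocs--;
}

/* Hypothetical lookup: only tag 42 exists in this model. */
static bool sketch_find_cmd_by_tag(unsigned int tag)
{
	return tag == 42;
}

/* Returns 0 and hands ownership to *out on success, -1 on failure. */
static int sketch_handle_abts(unsigned int tag, struct sketch_mgmt_cmd **out)
{
	struct sketch_mgmt_cmd *mcmd = sketch_alloc();

	if (!mcmd)
		return -1;
	mcmd->tag = tag;

	if (!sketch_find_cmd_by_tag(tag)) {
		sketch_free(mcmd); /* the cleanup the early return lacked */
		return -1;
	}

	*out = mcmd;
	return 0;
}
```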
Link: https://lore.kernel.org/r/20220914024924.695604-1-rafaelmendsr@gmail.com
Fixes: 8f394da36a36 ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 601be20fc6a1 upstream ]
SCST_VERSION_CODE has not been updated since SCST v3.4 because
the update-version script used the wrong file for parsing. Hence fix
the update-version script and SCST_VERSION_CODE.
This partially reverts commit d2b292c3f6fd ("scsi: qla2xxx: Enable ATIO
interrupt handshake for ISP27XX").
For some workloads where the host sends a batch of commands and then
pauses, ATIO interrupt coalesce can cause some incoming ATIO entries to be
ignored for extended periods of time, resulting in slow performance,
timeouts, and aborted commands.
Disable interrupt coalesce and re-enable the dedicated ATIO MSI-X
interrupt.
Link: https://lore.kernel.org/r/97dcf365-89ff-014d-a3e5-1404c6af511c@cybernetics.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 53661ded2460 upstream ]
On some platforms, the current logic of finding a new packet based solely
on its signature pattern can lead to the driver reading stale packets.
Though this is a bug in those platforms, reduce such exposure by reading
packets only up to the IN pointer.
Two module parameters are introduced:
ql2xrspq_follow_inptr:
When set, on newer adapters that have queue pointer shadowing, look for
response packets only until the response queue in pointer.
When reset, response packets are read based on a signature pattern
logic (old way).
ql2xrspq_follow_inptr_legacy:
Like ql2xrspq_follow_inptr, but for those adapters where there is no
queue pointer shadowing.
Link: https://lore.kernel.org/r/20220713052045.10683-5-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit b1f707146923 upstream ]
There is a copy and paste bug here. It should check ".rsp" instead of
".req". The error message is copy and pasted as well so update that too.
Link: https://lore.kernel.org/r/YrK1A/t3L6HKnswO@kili
Fixes: 9c40c36e75ff ("scsi: qla2xxx: edif: Reduce Initiator-Initiator thrashing")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 7c33e477bd88 upstream ]
Clear the wait-for-mailbox-interrupt flag to prevent a stale mailbox completion:
Feb 22 05:22:56 ltcden4-lp7 kernel: qla2xxx [0135:90:00.1]-500a:4: LOOP UP detected (16 Gbps).
Feb 22 05:22:59 ltcden4-lp7 kernel: qla2xxx [0135:90:00.1]-d04c:4: MBX Command timeout for cmd 69, ...
To fix the issue, the driver needs to clear the MBX_INTR_WAIT flag when
purging the mailbox. When the stale mailbox completion does arrive, it will
be dropped.
Link: https://lore.kernel.org/r/20220616053508.27186-11-njavali@marvell.com
Fixes: b6faaaf796d7 ("scsi: qla2xxx: Serialize mailbox request")
Cc: Naresh Bannoth <nbannoth@in.ibm.com>
Cc: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Cc: stable@vger.kernel.org
Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Tested-by: Naresh Bannoth <nbannoth@in.ibm.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit f260694e6463 upstream ]