From 799ddae08511a123489ef5c72799661ed0419e51 Mon Sep 17 00:00:00 2001 From: Yan Burman Date: Sun, 6 Apr 2014 08:13:25 +0000 Subject: [PATCH] Merged revisions 5322-5326,5329-5340,5344-5371,5382-5407 via svnmerge from svn+ssh://yanb123@svn.code.sf.net/p/scst/svn/trunk ........ r5322 | vlnb | 2014-03-05 05:27:21 +0200 (Wed, 05 Mar 2014) | 13 lines scst_vdisk: Make vdisk_nullio size configurable Keep the default size of vdisk_nullio devices at VDISK_NULLIO_SIZE. Add a sysfs attribute 'size' which is the size of a vdisk device in bytes. Make the size of vdisk_nullio devices configurable. Accept "size" and "size_mb" as creation parameters for vdisk_nullio devices. Generate a CAPACITY DATA HAS CHANGED unit attention after size changes. Refuse any attempt to change the size into a number that is not a multiple of the block size. Signed-off-by: Bart Van Assche ........ r5323 | bvassche | 2014-03-06 09:29:00 +0200 (Thu, 06 Mar 2014) | 1 line scst_vdisk: Avoid that smatch complains about unreachable code ........ r5324 | vlnb | 2014-03-07 06:02:39 +0200 (Fri, 07 Mar 2014) | 27 lines PERSISTENT RESERVE IN: Suppress a kernel warning for small output buffer sizes This patch suppresses the following error message and kernel warning: scst: ***ERROR***: Too big response data len 24 (max 8), limiting it to the max (dev iis) Call Trace: [] ? dump_stack+0x41/0x56 [] ? scst_set_resp_data_len+0x82/0xb1 [scst] [] ? scst_pr_read_reservation+0xbf/0xc4 [scst] [] ? scst_persistent_reserve_in_local+0x140/0x1ce [scst] [] ? scst_exec_check_blocking+0x57/0xf1 [scst] [] ? scst_process_active_cmd+0x86c/0x136f [scst] [] ? scst_do_job_active+0x45/0x5b [scst] [] ? scst_cmd_thread+0x218/0x2b7 [scst] [] ? wake_up_bit+0x23/0x23 [] ? scst_cmd_tasklet+0x32/0x32 [scst] [] ? kthread_freezable_should_stop+0x51/0x51 [] ? scst_cmd_tasklet+0x32/0x32 [scst] [] ? kthread+0xab/0xb3 [] ? kthread_freezable_should_stop+0x51/0x51 [] ? ret_from_fork+0x7c/0xb0 [] ? 
kthread_freezable_should_stop+0x51/0x51 Reported-by: Roman Bogdanov Signed-off-by: Bart Van Assche ........ r5325 | bvassche | 2014-03-08 15:30:29 +0200 (Sat, 08 Mar 2014) | 1 line nightly build: Update kernel versions ........ r5326 | vlnb | 2014-03-13 05:50:37 +0200 (Thu, 13 Mar 2014) | 3 lines Update link to Gentoo HOWTO from Jurie Botha ........ r5329 | vlnb | 2014-03-15 03:38:06 +0200 (Sat, 15 Mar 2014) | 3 lines Cleanup ........ r5330 | vlnb | 2014-03-15 03:38:37 +0200 (Sat, 15 Mar 2014) | 3 lines Implement REPORT SUPPORTED TASK MANAGEMENT FUNCTIONS command ........ r5331 | vlnb | 2014-03-15 03:40:57 +0200 (Sat, 15 Mar 2014) | 8 lines scst_vdisk: Remove an unused parameter from vdisk_fsync*() The "struct vdisk_cmd_params *p" parameter is neither used by vdisk_fsync(), vdisk_fsync_blockio() nor by vdisk_fsync_fileio() so remove it. Signed-off-by: Bart Van Assche ........ r5332 | vlnb | 2014-03-15 03:42:06 +0200 (Sat, 15 Mar 2014) | 8 lines vdisk_blockio: Change default vendor name back to "SCST_BIO" In r5316 the default vendor name for vdisk_blockio devices was changed into "SCST_FIO". Change this back into "SCST_BIO". Signed-off-by: Bart Van Assche ........ r5333 | vlnb | 2014-03-15 03:44:35 +0200 (Sat, 15 Mar 2014) | 8 lines vdisk_blockio: Add VERIFY implementation There is already an implementation of the VERIFY command for vdisk_fileio devices. Add an implementation for vdisk_blockio devices. Signed-off-by: Bart Van Assche ........ r5334 | vlnb | 2014-03-15 04:01:17 +0200 (Sat, 15 Mar 2014) | 8 lines scst_vdisk: Implement COMPARE AND WRITE Ensure that COMPARE AND WRITE is executed atomically by serializing all COMPARE AND WRITE commands per device (SCST_SERIALIZED). Signed-off-by: Bart Van Assche ........ r5335 | vlnb | 2014-03-15 04:09:17 +0200 (Sat, 15 Mar 2014) | 5 lines Fix URL to SCST website Signed-off-by: Steven J. Magnani ........ r5336 | vlnb | 2014-03-15 04:13:58 +0200 (Sat, 15 Mar 2014) | 3 lines Cleanup ........ 
r5337 | bvassche | 2014-03-15 08:47:26 +0200 (Sat, 15 Mar 2014) | 1 line scripts/kernel-functions: Kernel 3.13.6 build fix ........ r5338 | bvassche | 2014-03-16 15:38:50 +0200 (Sun, 16 Mar 2014) | 1 line srpt: Minor build process terminology change ........ r5339 | bvassche | 2014-03-18 17:35:13 +0200 (Tue, 18 Mar 2014) | 1 line ib_srpt: Avoid that session logout hangs sporadically ........ r5340 | vlnb | 2014-03-19 06:28:46 +0200 (Wed, 19 Mar 2014) | 8 lines scst/README: Show how to read SCST sysfs attributes Make the behavior of SCST sysfs attributes more clear by adding examples in scst/README of code for reading and writing these attributes. Signed-off-by: Bart Van Assche ........ r5344 | bvassche | 2014-03-20 17:13:50 +0200 (Thu, 20 Mar 2014) | 1 line ib_srpt: Simplify srpt_handle_cmd() ........ r5345 | bvassche | 2014-03-20 17:14:45 +0200 (Thu, 20 Mar 2014) | 5 lines ib_srpt: Micro-optimize I/O context state manipulation All ioctx->state manipulations are serialized per command so it is not necessary to use locking to protect these manipulations. ........ r5346 | bvassche | 2014-03-20 17:15:54 +0200 (Thu, 20 Mar 2014) | 8 lines ib_srpt: Handle GID change events properly The mlx4_core driver generates a GID change event after a port has been changed from IB into Ethernet mode. Avoid that this causes the following error message to appear in the system log: ib_srpt: ***ERROR***: received unrecognized IB event 18 ........ r5347 | bvassche | 2014-03-20 17:16:27 +0200 (Thu, 20 Mar 2014) | 1 line ib_srpt/Makefile: Add kerneldoc target ........ r5348 | bvassche | 2014-03-20 17:17:04 +0200 (Thu, 20 Mar 2014) | 1 line ib_srpt: Fix an error reported by the kerneldoc tool ........ r5349 | bvassche | 2014-03-20 17:18:06 +0200 (Thu, 20 Mar 2014) | 5 lines ib_srpt: Avoid that cmd_wait_list processing triggers command reordering Although harmless for SCSI commands with SIMPLE ordering, avoid that commands received before RTU can get reordered. ........ 
r5350 | bvassche | 2014-03-20 17:18:38 +0200 (Thu, 20 Mar 2014) | 2 lines ib_srpt: Micro-optimize SRP_CMD parsing ........ r5351 | bvassche | 2014-03-20 17:19:05 +0200 (Thu, 20 Mar 2014) | 1 line ib_srpt: Sync information unit memory only once ........ r5352 | bvassche | 2014-03-20 17:19:34 +0200 (Thu, 20 Mar 2014) | 2 lines ib_srpt: Introduce a temporary variable in srpt_handle_new_iu() ........ r5353 | bvassche | 2014-03-20 17:20:55 +0200 (Thu, 20 Mar 2014) | 2 lines ib_srpt: Micro-optimize polling ........ r5354 | bvassche | 2014-03-20 17:22:19 +0200 (Thu, 20 Mar 2014) | 5 lines ib_srpt: Rework multi-channel support Store initiator and target port ID's once per nexus instead of in each channel data structure. ........ r5355 | bvassche | 2014-03-20 17:23:16 +0200 (Thu, 20 Mar 2014) | 2 lines ib_srpt: Simplify channel state management code ........ r5356 | bvassche | 2014-03-20 17:24:18 +0200 (Thu, 20 Mar 2014) | 5 lines ib_srpt: Defer destroying the QP until the TimeWait state has been left This is necessary to avoid that a login gets rejected due to reusing a queue pair number that has not yet been freed by the target side. ........ r5357 | bvassche | 2014-03-20 17:25:34 +0200 (Thu, 20 Mar 2014) | 5 lines ib_srpt: Rework waiting for last WQE After having changed the queue pair state into "error", queue an additional work request instead of waiting for the last WQE event. ........ r5358 | bvassche | 2014-03-20 17:26:48 +0200 (Thu, 20 Mar 2014) | 1 line srpt/session-management.txt: Document how sessions are managed by the ib_srpt driver ........ r5359 | bvassche | 2014-03-20 18:10:19 +0200 (Thu, 20 Mar 2014) | 1 line scst_const.h: Make COMPARE_AND_WRITE definition available for kernel versions 3.6..3.11 ........ r5360 | vlnb | 2014-03-21 03:58:13 +0200 (Fri, 21 Mar 2014) | 3 lines In VERIFY commands BYTCHK 1x is not supported (yet) ........ 
r5361 | bvassche | 2014-03-21 18:29:26 +0200 (Fri, 21 Mar 2014) | 1 line srpt/Makefile: Avoid that the build process depends on source control tools ........ r5362 | vlnb | 2014-03-22 01:12:42 +0200 (Sat, 22 Mar 2014) | 3 lines Fix error recovery of internal commands ........ r5363 | bvassche | 2014-03-24 13:53:00 +0200 (Mon, 24 Mar 2014) | 1 line ib_srpt: Clarify a kernel-doc comment ........ r5364 | bvassche | 2014-03-24 13:55:56 +0200 (Mon, 24 Mar 2014) | 1 line ib_srpt: Add newline at the end of kernel warning statements ........ r5365 | bvassche | 2014-03-24 13:57:39 +0200 (Mon, 24 Mar 2014) | 5 lines ib_srpt: Change the severity level of a log message Make sure that target port state changes get logged even with debugging disabled. ........ r5366 | bvassche | 2014-03-24 13:59:27 +0200 (Mon, 24 Mar 2014) | 5 lines ib_srpt: Clean up srpt_destroy_ch_ib() All callers guarantee that the completion queue is empty so it is not necessary to invoke ib_poll_cq() from inside this function. ........ r5367 | bvassche | 2014-03-24 14:01:07 +0200 (Mon, 24 Mar 2014) | 5 lines ib_srpt: Micro-optimize srpt_adjust_srq_wr_avail() The overhead of atomic_add_return() is lower than that of a spin_lock() / spin_unlock() pair, hence switch to the former. ........ 
r5368 | bvassche | 2014-03-24 14:03:09 +0200 (Mon, 24 Mar 2014) | 20 lines ib_srpt: Fix a kernel warning Avoid that the following (very rare) kernel warning is reported when an ib_srpt target port is disabled while I/O is ongoing: WARNING: CPU: 3 PID: 12259 at srpt/src/ib_srpt.c:3334 srpt_xmit_response+0x165/0x300 [ib_srpt]() Unexpected command state 6 Call Trace: [] dump_stack+0x4e/0x7a [] warn_slowpath_common+0x7d/0xa0 [] warn_slowpath_fmt+0x4c/0x50 [] srpt_xmit_response+0x165/0x300 [ib_srpt] [] scst_xmit_response+0xbc/0x560 [scst] [] scst_process_active_cmd+0x29d/0x7b0 [scst] [] scst_do_job_active+0x89/0x1a0 [scst] [] scst_cmd_thread+0x15f/0x350 [scst] [] kthread+0xed/0x110 [] ret_from_fork+0x7c/0xb0 ---[ end trace 591f7af7d006fc0e ]--- ........ r5369 | bvassche | 2014-03-24 14:04:44 +0200 (Mon, 24 Mar 2014) | 5 lines ib_srpt: Add a kernel warning Invoking srpt_zerolength_write() before the queue pair has reached the error state is a bug, so complain loudly if that happens. ........ r5370 | bvassche | 2014-03-24 14:07:43 +0200 (Mon, 24 Mar 2014) | 7 lines ib_srpt: Avoid waiting for missing error completions Apparently with mlx4 firmware up to and including 2.30.8000 it is not guaranteed that for a QP associated with an SRQ error completions are generated for all pending work requests. Avoid triggering srpt_pending_cmd_timeout() for missing error completions. ........ r5371 | bvassche | 2014-03-25 18:19:00 +0200 (Tue, 25 Mar 2014) | 1 line nightly build: Update kernel versions ........ r5382 | vlnb | 2014-03-26 04:31:52 +0200 (Wed, 26 Mar 2014) | 6 lines Black hole functionality added Scst_mutex intentionally used directly in the sysfs handler, because coming sysfs improvements will allow that. ........ r5383 | vlnb | 2014-03-26 05:18:03 +0200 (Wed, 26 Mar 2014) | 5 lines documentation: Document SCST_SERIALIZED and SCST_STRICTLY_SERIALIZED Signed-off-by: Bart Van Assche ........ r5384 | vlnb | 2014-03-26 05:18:36 +0200 (Wed, 26 Mar 2014) | 3 lines Cleanup ........ 
r5385 | vlnb | 2014-03-26 05:20:04 +0200 (Wed, 26 Mar 2014) | 9 lines scst_local: Remove two superfluous tests The to_scst_lcl_sess() macro is based on container_of() and hence never returns NULL. Hence remove the two tests that compare the result of that macro against NULL. Signed-off-by: Bart Van Assche ........ r5386 | vlnb | 2014-03-26 05:21:25 +0200 (Wed, 26 Mar 2014) | 7 lines iscsi-scst: Introduce ARRAY_SIZE() This patch does not change any functionality. Signed-off-by: Bart Van Assche ........ r5387 | vlnb | 2014-03-26 05:22:16 +0200 (Wed, 26 Mar 2014) | 8 lines scst: Clarify a comment The comment above scst_nexus_loss() is somewhat confusing so change it into something that is more clear. Signed-off-by: Bart Van Assche ........ r5388 | vlnb | 2014-03-26 05:23:57 +0200 (Wed, 26 Mar 2014) | 9 lines scst_vdisk: Fix READ CAPACITY(10) SBC-2 defines the LBA as a 32-bit field that starts at offset 2 and not as a 64-bit field. Reported-by: Mike Christie Signed-off-by: Bart Van Assche ........ r5389 | bvassche | 2014-03-26 13:56:13 +0200 (Wed, 26 Mar 2014) | 4 lines ib_srpt: Clean up srpt_handle_rdma_comp() This patch does not change any functionality. ........ r5390 | bvassche | 2014-03-26 13:56:59 +0200 (Wed, 26 Mar 2014) | 4 lines ib_srpt: Clean up srpt_handle_send_err_comp() This patch does not change any functionality. ........ r5391 | bvassche | 2014-03-26 13:58:25 +0200 (Wed, 26 Mar 2014) | 4 lines ib_srpt: Clean up srpt_handle_rdma_err_comp() This patch does not change any functionality. ........ r5392 | bvassche | 2014-03-26 13:59:37 +0200 (Wed, 26 Mar 2014) | 5 lines ib_srpt: Suppress superfluous error messages Only complain about a missing completion for I/O contexts that are in a state where the ib_srpt driver is waiting for the HCA. ........ 
r5393 | bvassche | 2014-03-26 14:00:43 +0200 (Wed, 26 Mar 2014) | 5 lines ib_srpt: Make srpt_abort_cmd() state checks more strict Complain if srpt_abort_cmd() is called for an I/O context that is being processed by SCST and not by the HCA. ........ r5394 | vlnb | 2014-03-27 00:16:16 +0200 (Thu, 27 Mar 2014) | 3 lines Cosmetics ........ r5395 | vlnb | 2014-03-27 01:51:36 +0200 (Thu, 27 Mar 2014) | 3 lines Reimplement dropping of TM requests in a more reliable manner ........ r5396 | vlnb | 2014-03-27 03:57:08 +0200 (Thu, 27 Mar 2014) | 3 lines Possibility to specify SCSI target device name added ........ r5397 | bvassche | 2014-03-27 10:27:45 +0200 (Thu, 27 Mar 2014) | 6 lines Fix two checkpatch complaints about whitespace Avoid that checkpatch reports the following error message: ERROR: "(foo*)" should be "(foo *)" ........ r5398 | bvassche | 2014-03-27 10:34:27 +0200 (Thu, 27 Mar 2014) | 6 lines Avoid that checkpatch complains that return is not a function Avoid that checkpatch reports the following error message: ERROR: return is not a function, parentheses are not required ........ r5399 | bvassche | 2014-03-27 10:39:53 +0200 (Thu, 27 Mar 2014) | 1 line scripts/generate-kernel-patch: Fix for kernel versions 3.7, 3.10, 3.12 and 3.13 ........ r5400 | vlnb | 2014-03-29 04:07:22 +0300 (Sat, 29 Mar 2014) | 3 lines Cleanup ........ r5401 | bvassche | 2014-04-01 20:46:36 +0300 (Tue, 01 Apr 2014) | 1 line vdisk_blockio: Temporarily disable COMPARE AND WRITE support ........ r5402 | bvassche | 2014-04-02 00:05:31 +0300 (Wed, 02 Apr 2014) | 1 line vdisk_blockio: Fix the (recently enabled) VERIFY command ........ r5403 | bvassche | 2014-04-03 18:58:16 +0300 (Thu, 03 Apr 2014) | 1 line ib_srpt: RHEL 6.5 build fix ........ r5404 | vlnb | 2014-04-04 03:57:30 +0300 (Fri, 04 Apr 2014) | 3 lines Fix typo in scst_report_supported_tm_fns() reported by Steve Magnani ........ 
r5405 | bvassche | 2014-04-04 07:38:33 +0300 (Fri, 04 Apr 2014) | 1 line scripts/specialize-patch: Handle numbers surrounded by parentheses properly ........ r5406 | bvassche | 2014-04-04 08:50:52 +0300 (Fri, 04 Apr 2014) | 1 line scripts/specialize-patch: Rework r5405 ........ r5407 | bvassche | 2014-04-04 08:56:25 +0300 (Fri, 04 Apr 2014) | 1 line nightly build: Update kernel versions ........ git-svn-id: http://svn.code.sf.net/p/scst/svn/branches/iser@5408 d57e44dd-8a1f-0410-8b47-8ef2f437770f --- doc/scst_pg.sgml | 15 +- doc/scst_user_spec.sgml | 2 +- iscsi-scst/kernel/iscsi.c | 34 +- iscsi-scst/kernel/nthread.c | 10 +- nightly/conf/nightly.conf | 6 +- qla2x00t/qla2x00-target/qla2x00t.c | 34 +- scripts/generate-kernel-patch | 8 +- scripts/kernel-functions | 4 +- scripts/specialize-patch | 14 +- scst/README | 66 ++- scst/README_in-tree | 66 ++- scst/include/scst.h | 51 ++- scst/include/scst_const.h | 15 +- scst/src/Makefile | 6 +- scst/src/dev_handlers/scst_vdisk.c | 639 ++++++++++++++++++++++---- scst/src/scst_lib.c | 122 ++++- scst/src/scst_main.c | 2 +- scst/src/scst_pres.c | 10 +- scst/src/scst_sysfs.c | 159 +++++++ scst/src/scst_targ.c | 105 ++++- scst_local/scst_local.c | 8 - srpt/Makefile | 7 +- srpt/README | 2 +- srpt/session-management.txt | 45 ++ srpt/src/ib_srpt.c | 691 ++++++++++++++--------------- srpt/src/ib_srpt.h | 60 ++- www/index.html | 2 +- 27 files changed, 1635 insertions(+), 548 deletions(-) create mode 100644 srpt/session-management.txt diff --git a/doc/scst_pg.sgml b/doc/scst_pg.sgml index e3e899bd2..f690b2536 100644 --- a/doc/scst_pg.sgml +++ b/doc/scst_pg.sgml @@ -1272,7 +1272,20 @@ Where: User space API diff --git a/iscsi-scst/kernel/iscsi.c b/iscsi-scst/kernel/iscsi.c index fb8d153ec..b6e5599e5 100644 --- a/iscsi-scst/kernel/iscsi.c +++ b/iscsi-scst/kernel/iscsi.c @@ -62,7 +62,8 @@ static struct page *dummy_page; static struct scatterlist dummy_sg; static void cmnd_remove_data_wait_hash(struct iscsi_cmnd *cmnd); -static void 
iscsi_send_task_mgmt_resp(struct iscsi_cmnd *req, int status); +static void iscsi_send_task_mgmt_resp(struct iscsi_cmnd *req, int status, + bool dropped); static void iscsi_check_send_delayed_tm_resp(struct iscsi_session *sess); static int cmnd_insert_data_wait_hash(struct iscsi_cmnd *cmnd); static void iscsi_cmnd_init_write(struct iscsi_cmnd *rsp, int flags); @@ -2721,7 +2722,7 @@ static void execute_task_management(struct iscsi_cmnd *req) reject: if (rc != 0) - iscsi_send_task_mgmt_resp(req, status); + iscsi_send_task_mgmt_resp(req, status, false); return; } @@ -3610,7 +3611,8 @@ out: return; } -static void iscsi_send_task_mgmt_resp(struct iscsi_cmnd *req, int status) +static void iscsi_send_task_mgmt_resp(struct iscsi_cmnd *req, int status, + bool drop) { struct iscsi_cmnd *rsp; struct iscsi_task_mgt_hdr *req_hdr = @@ -3622,7 +3624,26 @@ static void iscsi_send_task_mgmt_resp(struct iscsi_cmnd *req, int status) TRACE_ENTRY(); TRACE_MGMT_DBG("TM req %p finished", req); - TRACE(TRACE_MGMT, "iSCSI TM fn %d finished, status %d", fn, status); + TRACE(TRACE_MGMT, "iSCSI TM fn %d finished, status %d, dropped %d", + fn, status, drop); + + if (drop) { + spin_lock(&sess->sn_lock); + sess->tm_active--; + spin_unlock(&sess->sn_lock); + if (fn == ISCSI_FUNCTION_TARGET_COLD_RESET) { + struct iscsi_target *target = req->conn->session->target; + + PRINT_INFO("Closing all connections for target %x at " + "COLD RESET from initiator %s", target->tid, + req->conn->session->initiator_name); + + mutex_lock(&target->target_mutex); + target_del_all_sess(target, 0); + mutex_unlock(&target->target_mutex); + } + goto out_release; + } rsp = iscsi_alloc_rsp(req); rsp_hdr = (struct iscsi_task_rsp_hdr *)&rsp->pdu.bhs; @@ -3691,8 +3712,7 @@ static void iscsi_task_mgmt_fn_done(struct scst_mgmt_cmd *scst_mcmd) int fn = scst_mgmt_cmd_get_fn(scst_mcmd); struct iscsi_cmnd *req = (struct iscsi_cmnd *) scst_mgmt_cmd_get_tgt_priv(scst_mcmd); - int status = - 
iscsi_get_mgmt_response(scst_mgmt_cmd_get_status(scst_mcmd)); + int status = iscsi_get_mgmt_response(scst_mgmt_cmd_get_status(scst_mcmd)); if ((status == ISCSI_RESPONSE_UNKNOWN_TASK) && (fn == SCST_ABORT_TASK)) { @@ -3714,7 +3734,7 @@ static void iscsi_task_mgmt_fn_done(struct scst_mgmt_cmd *scst_mcmd) sBUG_ON(1); break; default: - iscsi_send_task_mgmt_resp(req, status); + iscsi_send_task_mgmt_resp(req, status, scst_mgmt_cmd_dropped(scst_mcmd)); scst_mgmt_cmd_set_tgt_priv(scst_mcmd, NULL); break; } diff --git a/iscsi-scst/kernel/nthread.c b/iscsi-scst/kernel/nthread.c index 04d6875a8..145174009 100644 --- a/iscsi-scst/kernel/nthread.c +++ b/iscsi-scst/kernel/nthread.c @@ -1331,8 +1331,7 @@ static int write_data(struct iscsi_conn *conn) loff_t off = 0; int rest; - sBUG_ON(count > (signed)(sizeof(conn->write_iov) / - sizeof(conn->write_iov[0]))); + sBUG_ON(count > ARRAY_SIZE(conn->write_iov)); retry: oldfs = get_fs(); set_fs(KERNEL_DS); @@ -1367,8 +1366,8 @@ retry: break; goto out_iov; } - sBUG_ON(iop > conn->write_iov + sizeof(conn->write_iov) - /sizeof(conn->write_iov[0])); + sBUG_ON(iop > + conn->write_iov + ARRAY_SIZE(conn->write_iov)); iop->iov_base += rest; iop->iov_len -= rest; } @@ -1626,8 +1625,7 @@ static void init_tx_hdigest(struct iscsi_cmnd *cmnd) digest_tx_header(cmnd); - sBUG_ON(conn->write_iop_used >= - (signed)(sizeof(conn->write_iov)/sizeof(conn->write_iov[0]))); + sBUG_ON(conn->write_iop_used >= ARRAY_SIZE(conn->write_iov)); iop = &conn->write_iop[conn->write_iop_used]; conn->write_iop_used++; diff --git a/nightly/conf/nightly.conf b/nightly/conf/nightly.conf index 1b8d709e1..15253e973 100644 --- a/nightly/conf/nightly.conf +++ b/nightly/conf/nightly.conf @@ -3,16 +3,16 @@ ABT_DETAILS="x86_64" ABT_JOBS=5 ABT_KERNELS=" \ -3.13.5 \ +3.13.9 \ 3.12.13-nc \ 3.11.10-nc \ -3.10.32-nc \ +3.10.36-nc \ 3.9.11-nc \ 3.8.13-nc \ 3.7.10-nc \ 3.6.11-nc \ 3.5.7-nc \ -3.4.81-nc \ +3.4.86-nc \ 3.3.8-nc \ 3.2.53-nc \ 3.1.10-nc \ diff --git 
a/qla2x00t/qla2x00-target/qla2x00t.c b/qla2x00t/qla2x00-target/qla2x00t.c index 4de80917f..720daea09 100644 --- a/qla2x00t/qla2x00-target/qla2x00t.c +++ b/qla2x00t/qla2x00-target/qla2x00t.c @@ -1235,25 +1235,25 @@ static struct q2t_sess *q2t_create_sess(scsi_qla_host_t *ha, fc_port_t *fcport, spin_lock_irq(&pha->hardware_lock); sess = q2t_find_sess_by_port_name_include_deleted(tgt, fcport->port_name); if (sess != NULL) { - TRACE_MGMT_DBG("Double sess %p found (s_id %x:%x:%x, " - "loop_id %d), updating to d_id %x:%x:%x, " - "loop_id %d", sess, sess->s_id.b.domain, - sess->s_id.b.area, sess->s_id.b.al_pa, - sess->loop_id, fcport->d_id.b.domain, - fcport->d_id.b.area, fcport->d_id.b.al_pa, - fcport->loop_id); + TRACE_MGMT_DBG("Double sess %p found (s_id %x:%x:%x, " + "loop_id %d), updating to d_id %x:%x:%x, " + "loop_id %d", sess, sess->s_id.b.domain, + sess->s_id.b.area, sess->s_id.b.al_pa, + sess->loop_id, fcport->d_id.b.domain, + fcport->d_id.b.area, fcport->d_id.b.al_pa, + fcport->loop_id); - if (sess->deleted) - q2t_undelete_sess(sess); + if (sess->deleted) + q2t_undelete_sess(sess); - q2t_sess_get(sess); - sess->s_id = fcport->d_id; - sess->loop_id = fcport->loop_id; - sess->conf_compl_supported = fcport->conf_compl_supported; - if (sess->local && !local) - sess->local = 0; - spin_unlock_irq(&pha->hardware_lock); - goto out; + q2t_sess_get(sess); + sess->s_id = fcport->d_id; + sess->loop_id = fcport->loop_id; + sess->conf_compl_supported = fcport->conf_compl_supported; + if (sess->local && !local) + sess->local = 0; + spin_unlock_irq(&pha->hardware_lock); + goto out; } spin_unlock_irq(&pha->hardware_lock); diff --git a/scripts/generate-kernel-patch b/scripts/generate-kernel-patch index e66b287ef..532572fb4 100755 --- a/scripts/generate-kernel-patch +++ b/scripts/generate-kernel-patch @@ -268,13 +268,13 @@ done scsi_exec_req_fifo_defined=0 scst_io_context=0 for p in scst/kernel/*-${kver}.patch \ - $(if [ ${kver} = 3.7 ] && [ "${1#3.7.}" -ge 10 ]; then + $(if [ 
"${1#3.7.}" != "$1" ] && [ "${1#3.7.}" -ge 10 ]; then echo iscsi-scst/kernel/patches/*-3.7.10.patch; - elif [ ${kver} = 3.10 ] && [ "${1#3.10.}" -ge 30 ]; then + elif [ "${1#3.10.}" != "$1" ] && [ "${1#3.10.}" -ge 30 ]; then echo iscsi-scst/kernel/patches/*-3.10.30.patch; - elif [ ${kver} = 3.12 ] && [ "${1#3.12.}" -ge 11 ]; then + elif [ "${1#3.12.}" != "$1" ] && [ "${1#3.12.}" -ge 11 ]; then echo iscsi-scst/kernel/patches/*-3.12.11.patch; - elif [ ${kver} = 3.13 ] && [ "${1#3.13.}" -ge 3 ]; then + elif [ "${1#3.13.}" != "$1" ] && [ "${1#3.13.}" -ge 3 ]; then echo iscsi-scst/kernel/patches/*-3.13.3.patch; else echo iscsi-scst/kernel/patches/*-${kver}.patch; diff --git a/scripts/kernel-functions b/scripts/kernel-functions index 0bc0584a9..58751563f 100644 --- a/scripts/kernel-functions +++ b/scripts/kernel-functions @@ -175,7 +175,8 @@ Get rid of sparse errors on sk_buff.protocol. EOF fi if [ "${1#3.13}" != "$1" ]; then - patch -f -s -p1 <<'EOF' + if [ "$1" = "3.13" ] || [ "${1#3.13.}" -lt 6 ]; then + patch -f -s -p1 <<'EOF' From 7b4ec8dd7d4ac467e9eee4d49f2c9574d773efbb Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Thu, 16 Jan 2014 10:18:48 +1030 @@ -212,6 +213,7 @@ index 3f2793d..96e45ea 100644 __used \ __attribute__((section("___ksymtab" sec "+" #sym), unused)) \ EOF + fi fi ) rmdir "${tmpdir}" diff --git a/scripts/specialize-patch b/scripts/specialize-patch index f3ae24262..ede8b0a19 100755 --- a/scripts/specialize-patch +++ b/scripts/specialize-patch @@ -151,7 +151,13 @@ function evaluate(stmnt, pattern, arg, op, result) { { last_stmnt = stmnt - pattern = "![[:blank:]]*([0-9]+)" + pattern = "![[:blank:]]*(-*[0-9]+)" + while (match(stmnt, pattern, op) != 0) + { + sub(pattern, op[1] == 0, stmnt) + } + + pattern = "![[:blank:]]*\\([[:blank:]]*(-*[0-9]+)[[:blank:]]*\\)" while (match(stmnt, pattern, op) != 0) { sub(pattern, op[1] == 0, stmnt) @@ -197,6 +203,12 @@ function evaluate(stmnt, pattern, arg, op, result) { sub(pattern, result, stmnt) } + 
pattern="(-*[0-9]+)[[:blank:]]*\\&\\&[[:blank:]]*\\([[:blank:]]*(-*[0-9]+)[[:blank:]]*\\)" + while (match(stmnt, pattern, op) != 0) + { + sub(pattern, (op[1] != 0) && (op[2] != 0), stmnt) + } + pattern="(-*[0-9]+)[[:blank:]]*\\&\\&[[:blank:]]*(-*[0-9]+)" while (match(stmnt, pattern, op) != 0) { diff --git a/scst/README b/scst/README index 9e91bcc65..6234643f1 100644 --- a/scst/README +++ b/scst/README @@ -460,7 +460,38 @@ following entries: complete, an management tool should poll this file. If the operation hasn't yet completed, it will also return EAGAIN. But after it's completed, it will return the result of this operation (0 for success - or -errno for error). + or -errno for error). The following two shell functions show how to do + this: + +# Read the SCST sysfs attribute $1. See also scst/README for more information. +scst_sysfs_read() { + local EAGAIN val + + EAGAIN="Resource temporarily unavailable" + while true; do + if val="$(LC_ALL=C cat "$1" 2>&1)"; then + echo -n "${val%\[key\]}" + return 0 + elif [ "${val/*: }" != "$EAGAIN" ]; then + return 1 + fi + sleep 1 + done +} + +# Write $1 into the SCST sysfs attribute $2. See also scst/README for more +# information. +scst_sysfs_write() { + local EAGAIN status + + EAGAIN="Resource temporarily unavailable" + if status="$(LC_ALL=C; (echo -n "$1" > "$2") 2>&1)"; then + return 0 + elif [ "${status/*: }" != "$EAGAIN" ]; then + return 1 + fi + scst_sysfs_read /sys/kernel/scst_tgt/last_sysfs_mgmt_res >/dev/null +} "Devices" subdirectory contains subdirectories for each SCST devices. @@ -549,7 +580,7 @@ Every target should have at least the following entries: mapping to the corresponding hardware port. It isn't anyhow used by SCST. - - enabled - using this attribute you can enable or disable this target/ + - enabled - using this attribute you can enable or disable this target. It allows to finish configuring it before it starts accepting new connections. 0 by default. 
@@ -561,6 +592,31 @@ Every target should have at least the following entries: can assign the addressing method on per-initiator basis. See also the "Logical unit addressing (LUN)" section in SAM-5 for more information. + - black_hole - if set, all LUNs in the corresponding initiator group, + default target group in this case, start "swallowing" requests from + initiators. Possible values are: + + * 0 - disable black hole mode + + * 1 - immediately abort all coming commands + + * 2 - immediately abort all coming commands and drop all coming TM + commands + + * 3 - immediately abort all coming data transfer commands. + + * 4 - immediately abort all coming data transfer commands and drop all + coming TM commands + + Modes 3 and 4 are the most evil ones, because they are not too well + handled by many initiator OS'es, including Linux, so they may never + recover from it. + + Note, dropping TM commands, i.e. not sending response on them, + implemented not for all target drivers. If it's implemented for your + particular target driver or not, you can find out by checking traces + or the target driver's source code. + - cpu_mask - defines CPU affinity mask for threads serving this target. For threads serving LUNs it is used only for devices with threads_pool_type "per_initiator". @@ -719,7 +775,7 @@ commands by looking inside this file. Each security group's subdirectory contains 2 subdirectories: initiators and luns as well as the following attributes: addr_method, cpu_mask and -io_grouping_type. See above description of them. +io_grouping_type, black_hole. See above description of them. Each "initiators" subdirectory contains list of added to this groups initiator as well as as well as file "mgmt". This file has the following @@ -965,6 +1021,10 @@ Each vdisk_fileio's device has the following attributes in - prod_rev_lvl - PRODUCT REVISION LEVEL as reported via the INQUIRY response. The default value for this field is " 300". 
+ - scsi_device_name - optional SCSI target device name to which this + SCST device belongs to (in SCSI terminology all SCST devices called + Logical Units). See SPC for more info. + - thin_provisioned - contains thin provisioning status of this virtual device. diff --git a/scst/README_in-tree b/scst/README_in-tree index 7ff520b93..f66ef58e6 100644 --- a/scst/README_in-tree +++ b/scst/README_in-tree @@ -322,7 +322,38 @@ following entries: complete, an management tool should poll this file. If the operation hasn't yet completed, it will also return EAGAIN. But after it's completed, it will return the result of this operation (0 for success - or -errno for error). + or -errno for error). The following two shell functions show how to do + this: + +# Read the SCST sysfs attribute $1. See also scst/README for more information. +scst_sysfs_read() { + local EAGAIN val + + EAGAIN="Resource temporarily unavailable" + while true; do + if val="$(LC_ALL=C cat "$1" 2>&1)"; then + echo -n "${val%\[key\]}" + return 0 + elif [ "${val/*: }" != "$EAGAIN" ]; then + return 1 + fi + sleep 1 + done +} + +# Write $1 into the SCST sysfs attribute $2. See also scst/README for more +# information. +scst_sysfs_write() { + local EAGAIN status + + EAGAIN="Resource temporarily unavailable" + if status="$(LC_ALL=C; (echo -n "$1" > "$2") 2>&1)"; then + return 0 + elif [ "${status/*: }" != "$EAGAIN" ]; then + return 1 + fi + scst_sysfs_read /sys/kernel/scst_tgt/last_sysfs_mgmt_res >/dev/null +} "Devices" subdirectory contains subdirectories for each SCST devices. @@ -411,7 +442,7 @@ Every target should have at least the following entries: mapping to the corresponding hardware port. It isn't anyhow used by SCST. - - enabled - using this attribute you can enable or disable this target/ + - enabled - using this attribute you can enable or disable this target. It allows to finish configuring it before it starts accepting new connections. 0 by default. 
@@ -423,6 +454,31 @@ Every target should have at least the following entries: can assign the addressing method on per-initiator basis. See also the "Logical unit addressing (LUN)" section in SAM-5 for more information. + - black_hole - if set, all LUNs in the corresponding initiator group, + default target group in this case, start "swallowing" requests from + initiators. Possible values are: + + * 0 - disable black hole mode + + * 1 - immediately abort all coming commands + + * 2 - immediately abort all coming commands and drop all coming TM + commands + + * 3 - immediately abort all coming data transfer commands. + + * 4 - immediately abort all coming data transfer commands and drop all + coming TM commands + + Modes 3 and 4 are the most evil ones, because they are not too well + handled by many initiator OS'es, including Linux, so they may never + recover from it. + + Note, dropping TM commands, i.e. not sending response on them, + implemented not for all target drivers. If it's implemented for your + particular target driver or not, you can find out by checking traces + or the target driver's source code. + - cpu_mask - defines CPU affinity mask for threads serving this target. For threads serving LUNs it is used only for devices with threads_pool_type "per_initiator". @@ -581,7 +637,7 @@ commands by looking inside this file. Each security group's subdirectory contains 2 subdirectories: initiators and luns as well as the following attributes: addr_method, cpu_mask and -io_grouping_type. See above description of them. +io_grouping_type, black_hole. See above description of them. Each "initiators" subdirectory contains list of added to this groups initiator as well as as well as file "mgmt". This file has the following @@ -823,6 +879,10 @@ Each vdisk_fileio's device has the following attributes in - prod_rev_lvl - PRODUCT REVISION LEVEL as reported via the INQUIRY response. The default value for this field is " 300". 
+ - scsi_device_name - optional SCSI target device name to which this + SCST device belongs (in SCSI terminology all SCST devices are called + Logical Units). See SPC for more info. + - thin_provisioned - contains thin provisioning status of this virtual device. diff --git a/scst/include/scst.h b/scst/include/scst.h index 229720009..378dc589c 100644 --- a/scst/include/scst.h +++ b/scst/include/scst.h @@ -592,6 +592,9 @@ enum scst_exec_context { /* Set if tgt_dev has Unit Attention sense */ #define SCST_TGT_DEV_UA_PENDING 0 +/* Cache of acg->acg_black_hole_type */ +#define SCST_TGT_DEV_BLACK_HOLE 1 + /************************************************************* ** I/O grouping types. Changing them don't forget to change ** the corresponding *_STR values in scst_const.h! @@ -2286,6 +2289,7 @@ struct scst_mgmt_cmd { unsigned int cmd_sn_set:1; /* set, if cmd_sn field is valid */ /* Set if dev handler's task_mgmt_fn_received was called */ unsigned int task_mgmt_fn_received_called:1; + unsigned int mcmd_dropped:1; /* set if mcmd was dropped */ /* * Number of commands to finish before sending response, @@ -2727,6 +2731,33 @@ struct scst_acg { unsigned int tgt_acg:1; +/* Not a black hole */ +#define SCST_ACG_BLACK_HOLE_NONE 0 + +/* Immediately abort all coming commands */ +#define SCST_ACG_BLACK_HOLE_CMD 1 + +/* + * Immediately abort all coming commands and drop all coming TM commands. + * + * CAUTION! With some target drivers it can cause internal resources + * leaks, so don't abuse this option! + */ +#define SCST_ACG_BLACK_HOLE_ALL 2 + +/* Immediately abort all coming data transfer commands */ +#define SCST_ACG_BLACK_HOLE_DATA_CMD 3 + +/* + * Immediately abort all coming data transfer commands and drop all + * coming TM commands. + * + * CAUTION! With some target drivers it can cause internal resources + * leaks, so don't abuse this option!
+ */ +#define SCST_ACG_BLACK_HOLE_DATA_MCMD 4 + volatile int acg_black_hole_type; + /* sysfs release completion */ struct completion *acg_kobj_release_cmpl; @@ -3035,6 +3066,8 @@ int scst_get_cdb_info(struct scst_cmd *cmd); int scst_set_cmd_error_status(struct scst_cmd *cmd, int status); int scst_set_cmd_error(struct scst_cmd *cmd, int key, int asc, int ascq); +int scst_set_cmd_error_and_inf(struct scst_cmd *cmd, int key, int asc, + int ascq, uint64_t information); void scst_set_busy(struct scst_cmd *cmd); void scst_check_convert_sense(struct scst_cmd *cmd); @@ -3685,12 +3718,6 @@ static inline int scst_mgmt_cmd_get_status(struct scst_mgmt_cmd *mcmd) return mcmd->status; } -/* Returns mgmt cmd's TM fn */ -static inline int scst_mgmt_cmd_get_fn(struct scst_mgmt_cmd *mcmd) -{ - return mcmd->fn; -} - static inline void scst_mgmt_cmd_set_status(struct scst_mgmt_cmd *mcmd, int status) { @@ -3700,6 +3727,18 @@ static inline void scst_mgmt_cmd_set_status(struct scst_mgmt_cmd *mcmd, mcmd->status = status; } +/* Returns mgmt cmd's TM fn */ +static inline int scst_mgmt_cmd_get_fn(struct scst_mgmt_cmd *mcmd) +{ + return mcmd->fn; +} + +/* Returns true if mgmt cmd should be dropped, i.e. response not sent */ +static inline bool scst_mgmt_cmd_dropped(struct scst_mgmt_cmd *mcmd) +{ + return mcmd->mcmd_dropped; +} + /* * Called by dev handler's task_mgmt_fn_*() to notify SCST core that mcmd * is going to complete asynchronously. 
diff --git a/scst/include/scst_const.h b/scst/include/scst_const.h index d534f2739..06bf7e923 100644 --- a/scst/include/scst_const.h +++ b/scst/include/scst_const.h @@ -264,12 +264,12 @@ enum scst_cdb_flags { static inline int scst_sense_valid(const uint8_t *sense) { - return ((sense != NULL) && ((sense[0] & 0x70) == 0x70)); + return (sense != NULL) && ((sense[0] & 0x70) == 0x70); } static inline int scst_no_sense(const uint8_t *sense) { - return ((sense != NULL) && (sense[2] == 0)); + return (sense != NULL) && (sense[2] == 0); } static inline int scst_sense_response_code(const uint8_t *sense) @@ -421,6 +421,16 @@ static inline int scst_sense_response_code(const uint8_t *sense) #define UNMAP 0x42 #endif +#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 12, 0) +/* + * From . See also commit + * 1c68cc1626341665a8bd1d2c7dfffd7fc852a79c. + */ +#ifndef COMPARE_AND_WRITE +#define COMPARE_AND_WRITE 0x89 +#endif +#endif + #if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 28) /* * From . See also commit @@ -550,7 +560,6 @@ enum scst_tg_sup { ** Misc SCSI constants *************************************************************/ #define SCST_SENSE_ASC_UA_RESET 0x29 -#define BYTCHK 0x02 #define POSITION_LEN_SHORT 20 #define POSITION_LEN_LONG 32 diff --git a/scst/src/Makefile b/scst/src/Makefile index 5d268577b..07b179724 100644 --- a/scst/src/Makefile +++ b/scst/src/Makefile @@ -1,15 +1,15 @@ # # SCSI target mid-level makefile -# +# # Copyright (C) 2004 - 2014 Vladislav Bolkhovitin # Copyright (C) 2004 - 2005 Leonid Stoljar # Copyright (C) 2007 - 2014 Fusion-io, Inc. -# +# # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation, version 2 # of the License. -# +# # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the diff --git a/scst/src/dev_handlers/scst_vdisk.c b/scst/src/dev_handlers/scst_vdisk.c index e1ce52a4c..c00798e75 100644 --- a/scst/src/dev_handlers/scst_vdisk.c +++ b/scst/src/dev_handlers/scst_vdisk.c @@ -182,12 +182,14 @@ struct scst_vdisk_dev { unsigned int vend_specific_id_set:1; unsigned int prod_id_set:1; /* true if prod_id manually set */ unsigned int prod_rev_lvl_set:1; /* true if prod_rev_lvl manually set */ + unsigned int scsi_device_name_set:1; /* true if scsi_device_name manually set */ unsigned int t10_dev_id_set:1; /* true if t10_dev_id manually set */ unsigned int usn_set:1; /* true if usn manually set */ char t10_vend_id[8 + 1]; char vend_specific_id[32 + 1]; char prod_id[16 + 1]; char prod_rev_lvl[4 + 1]; + char scsi_device_name[256 + 1]; char t10_dev_id[16+8+2]; /* T10 device ID */ char usn[MAX_USN_LEN]; uint8_t inq_vend_specific[MAX_INQ_VEND_SPECIFIC_LEN]; @@ -259,8 +261,7 @@ static enum compl_status_e fileio_exec_write(struct vdisk_cmd_params *p); static void blockio_exec_rw(struct vdisk_cmd_params *p, bool write, bool fua); static int vdisk_blockio_flush(struct block_device *bdev, gfp_t gfp_mask, bool report_error, struct scst_cmd *cmd, bool async); -static enum compl_status_e blockio_exec_verify(struct vdisk_cmd_params *p); -static enum compl_status_e fileio_exec_verify(struct vdisk_cmd_params *p); +static enum compl_status_e vdev_exec_verify(struct vdisk_cmd_params *p); static enum compl_status_e blockio_exec_write_verify(struct vdisk_cmd_params *p); static enum compl_status_e fileio_exec_write_verify(struct vdisk_cmd_params *p); static enum compl_status_e nullio_exec_write_verify(struct vdisk_cmd_params *p); @@ -277,7 +278,8 @@ static enum compl_status_e vdisk_exec_read_toc(struct vdisk_cmd_params *p); static enum compl_status_e vdisk_exec_prevent_allow_medium_removal(struct vdisk_cmd_params *p); static enum compl_status_e vdisk_exec_unmap(struct vdisk_cmd_params *p); static enum compl_status_e vdisk_exec_write_same(struct 
vdisk_cmd_params *p); -static int vdisk_fsync(struct vdisk_cmd_params *p, loff_t loff, +static enum compl_status_e vdisk_exec_caw(struct vdisk_cmd_params *p); +static int vdisk_fsync(loff_t loff, loff_t len, struct scst_device *dev, gfp_t gfp_flags, struct scst_cmd *cmd, bool async); #ifdef CONFIG_SCST_PROC @@ -307,8 +309,14 @@ static int vdisk_unmap_range(struct scst_cmd *cmd, #ifndef CONFIG_SCST_PROC +static ssize_t vdev_sysfs_size_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count); static ssize_t vdev_sysfs_size_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf); +static ssize_t vdev_sysfs_size_mb_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count); +static ssize_t vdev_sysfs_size_mb_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf); static ssize_t vdisk_sysfs_blocksize_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf); static ssize_t vdisk_sysfs_rd_only_show(struct kobject *kobj, @@ -347,6 +355,10 @@ static ssize_t vdev_sysfs_prod_rev_lvl_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count); static ssize_t vdev_sysfs_prod_rev_lvl_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf); +static ssize_t vdev_sysfs_scsi_device_name_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count); +static ssize_t vdev_sysfs_scsi_device_name_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf); static ssize_t vdev_sysfs_t10_dev_id_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count); static ssize_t vdev_sysfs_t10_dev_id_show(struct kobject *kobj, @@ -365,8 +377,16 @@ static ssize_t vdev_zero_copy_show(struct kobject *kobj, static ssize_t vcdrom_sysfs_filename_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count); -static struct kobj_attribute vdev_size_attr = - __ATTR(size_mb, S_IRUGO, 
vdev_sysfs_size_show, NULL); +static struct kobj_attribute vdev_size_ro_attr = + __ATTR(size, S_IRUGO, vdev_sysfs_size_show, NULL); +static struct kobj_attribute vdev_size_rw_attr = + __ATTR(size, S_IWUSR|S_IRUGO, vdev_sysfs_size_show, + vdev_sysfs_size_store); +static struct kobj_attribute vdev_size_mb_ro_attr = + __ATTR(size_mb, S_IRUGO, vdev_sysfs_size_mb_show, NULL); +static struct kobj_attribute vdev_size_mb_rw_attr = + __ATTR(size_mb, S_IWUSR|S_IRUGO, vdev_sysfs_size_mb_show, + vdev_sysfs_size_mb_store); static struct kobj_attribute vdisk_blocksize_attr = __ATTR(blocksize, S_IRUGO, vdisk_sysfs_blocksize_show, NULL); static struct kobj_attribute vdisk_rd_only_attr = @@ -402,6 +422,9 @@ static struct kobj_attribute vdev_prod_id_attr = static struct kobj_attribute vdev_prod_rev_lvl_attr = __ATTR(prod_rev_lvl, S_IWUSR|S_IRUGO, vdev_sysfs_prod_rev_lvl_show, vdev_sysfs_prod_rev_lvl_store); +static struct kobj_attribute vdev_scsi_device_name_attr = + __ATTR(scsi_device_name, S_IWUSR|S_IRUGO, vdev_sysfs_scsi_device_name_show, + vdev_sysfs_scsi_device_name_store); static struct kobj_attribute vdev_t10_dev_id_attr = __ATTR(t10_dev_id, S_IWUSR|S_IRUGO, vdev_sysfs_t10_dev_id_show, vdev_sysfs_t10_dev_id_store); @@ -419,7 +442,8 @@ static struct kobj_attribute vcdrom_filename_attr = vcdrom_sysfs_filename_store); static const struct attribute *vdisk_fileio_attrs[] = { - &vdev_size_attr.attr, + &vdev_size_ro_attr.attr, + &vdev_size_mb_ro_attr.attr, &vdisk_blocksize_attr.attr, &vdisk_rd_only_attr.attr, &vdisk_wt_attr.attr, @@ -434,6 +458,7 @@ static const struct attribute *vdisk_fileio_attrs[] = { &vdev_vend_specific_id_attr.attr, &vdev_prod_id_attr.attr, &vdev_prod_rev_lvl_attr.attr, + &vdev_scsi_device_name_attr.attr, &vdev_t10_dev_id_attr.attr, &vdev_usn_attr.attr, &vdev_inq_vend_specific_attr.attr, @@ -442,7 +467,8 @@ static const struct attribute *vdisk_fileio_attrs[] = { }; static const struct attribute *vdisk_blockio_attrs[] = { - &vdev_size_attr.attr, + 
&vdev_size_ro_attr.attr, + &vdev_size_mb_ro_attr.attr, &vdisk_blocksize_attr.attr, &vdisk_rd_only_attr.attr, &vdisk_wt_attr.attr, @@ -455,6 +481,7 @@ static const struct attribute *vdisk_blockio_attrs[] = { &vdev_vend_specific_id_attr.attr, &vdev_prod_id_attr.attr, &vdev_prod_rev_lvl_attr.attr, + &vdev_scsi_device_name_attr.attr, &vdev_t10_dev_id_attr.attr, &vdev_usn_attr.attr, &vdev_inq_vend_specific_attr.attr, @@ -463,7 +490,8 @@ static const struct attribute *vdisk_blockio_attrs[] = { }; static const struct attribute *vdisk_nullio_attrs[] = { - &vdev_size_attr.attr, + &vdev_size_rw_attr.attr, + &vdev_size_mb_rw_attr.attr, &vdisk_blocksize_attr.attr, &vdisk_rd_only_attr.attr, &vdev_dummy_attr.attr, @@ -472,6 +500,7 @@ static const struct attribute *vdisk_nullio_attrs[] = { &vdev_vend_specific_id_attr.attr, &vdev_prod_id_attr.attr, &vdev_prod_rev_lvl_attr.attr, + &vdev_scsi_device_name_attr.attr, &vdev_t10_dev_id_attr.attr, &vdev_usn_attr.attr, &vdev_inq_vend_specific_attr.attr, @@ -480,12 +509,14 @@ static const struct attribute *vdisk_nullio_attrs[] = { }; static const struct attribute *vcdrom_attrs[] = { - &vdev_size_attr.attr, + &vdev_size_ro_attr.attr, + &vdev_size_mb_ro_attr.attr, &vcdrom_filename_attr.attr, &vdev_t10_vend_id_attr.attr, &vdev_vend_specific_id_attr.attr, &vdev_prod_id_attr.attr, &vdev_prod_rev_lvl_attr.attr, + &vdev_scsi_device_name_attr.attr, &vdev_t10_dev_id_attr.attr, &vdev_usn_attr.attr, &vdev_inq_vend_specific_attr.attr, @@ -499,7 +530,7 @@ static DEFINE_MUTEX(scst_vdisk_mutex); /* * Protects the device attributes t10_vend_id, vend_specific_id, prod_id, - * prod_rev_lvl, t10_dev_id, usn and inq_vend_specific. + * prod_rev_lvl, scsi_device_name, t10_dev_id, usn and inq_vend_specific. 
*/ static DEFINE_RWLOCK(vdisk_serial_rwlock); @@ -632,7 +663,9 @@ static struct scst_dev_type vdisk_null_devtype = { "dummy, " "read_only, " "removable, " - "rotational", + "rotational, " + "size, " + "size_mb", #endif #if defined(CONFIG_SCST_DEBUG) || defined(CONFIG_SCST_TRACING) .default_trace_flags = SCST_DEFAULT_DEV_LOG_FLAGS, @@ -942,17 +975,15 @@ static int vdisk_attach(struct scst_device *dev) dev->dev_rd_only = virt_dev->rd_only; if (!virt_dev->cdrom_empty) { - if (virt_dev->nullio) - err = VDISK_NULLIO_SIZE; - else { + if (!virt_dev->nullio) { res = vdisk_get_file_size(virt_dev->filename, virt_dev->blockio, &err); if (res != 0) goto out; - } - virt_dev->file_size = err; + virt_dev->file_size = err; - TRACE_DBG("size of file: %lld", (long long unsigned int)err); + TRACE_DBG("size of file: %lld", err); + } vdisk_blockio_check_flush_support(virt_dev); vdisk_check_tp_support(virt_dev); @@ -1123,12 +1154,12 @@ static enum compl_status_e vdisk_synchronize_cache(struct vdisk_cmd_params *p) cmd->completed = 1; cmd->scst_cmd_done(cmd, SCST_CMD_STATE_DEFAULT, SCST_CONTEXT_SAME); - vdisk_fsync(p, loff, data_len, dev, cmd->cmd_gfp_mask, NULL, true); + vdisk_fsync(loff, data_len, dev, cmd->cmd_gfp_mask, NULL, true); /* ToDo: vdisk_fsync() error processing */ scst_cmd_put(cmd); res = RUNNING_ASYNC; } else { - vdisk_fsync(p, loff, data_len, dev, cmd->cmd_gfp_mask, cmd, true); + vdisk_fsync(loff, data_len, dev, cmd->cmd_gfp_mask, cmd, true); res = RUNNING_ASYNC; } @@ -1144,7 +1175,7 @@ static enum compl_status_e vdisk_exec_start_stop(struct vdisk_cmd_params *p) TRACE_ENTRY(); - vdisk_fsync(p, 0, virt_dev->file_size, dev, cmd->cmd_gfp_mask, cmd, false); + vdisk_fsync(0, virt_dev->file_size, dev, cmd->cmd_gfp_mask, cmd, false); TRACE_EXIT(); return CMD_SUCCEEDED; @@ -1405,9 +1436,9 @@ static vdisk_op_fn blockio_ops[256] = { [WRITE_VERIFY] = blockio_exec_write_verify, [WRITE_VERIFY_12] = blockio_exec_write_verify, [WRITE_VERIFY_16] = blockio_exec_write_verify, - [VERIFY] = 
blockio_exec_verify, - [VERIFY_12] = blockio_exec_verify, - [VERIFY_16] = blockio_exec_verify, + [VERIFY] = vdev_exec_verify, + [VERIFY_12] = vdev_exec_verify, + [VERIFY_16] = vdev_exec_verify, SHARED_OPS }; @@ -1420,12 +1451,13 @@ static vdisk_op_fn fileio_ops[256] = { [WRITE_10] = fileio_exec_write, [WRITE_12] = fileio_exec_write, [WRITE_16] = fileio_exec_write, + [COMPARE_AND_WRITE] = vdisk_exec_caw, [WRITE_VERIFY] = fileio_exec_write_verify, [WRITE_VERIFY_12] = fileio_exec_write_verify, [WRITE_VERIFY_16] = fileio_exec_write_verify, - [VERIFY] = fileio_exec_verify, - [VERIFY_12] = fileio_exec_verify, - [VERIFY_16] = fileio_exec_verify, + [VERIFY] = vdev_exec_verify, + [VERIFY_12] = vdev_exec_verify, + [VERIFY_16] = vdev_exec_verify, SHARED_OPS }; @@ -1526,6 +1558,7 @@ static bool vdisk_parse_offset(struct vdisk_cmd_params *p, struct scst_cmd *cmd) case WRITE_10: case WRITE_12: case WRITE_16: + case COMPARE_AND_WRITE: fua = (cdb[1] & 0x8); if (fua) { TRACE(TRACE_ORDER, "FUA: loff=%lld, " @@ -2536,6 +2569,23 @@ static enum compl_status_e vdisk_exec_inquiry(struct vdisk_cmd_params *p) int num = 4; buf[1] = 0x83; + + read_lock(&vdisk_serial_rwlock); + i = strlen(virt_dev->scsi_device_name); + if (i > 0) { + /* SCSI target device name */ + buf[num + 0] = 0x3; /* ASCII */ + buf[num + 1] = 0x20 | 0x8; /* Target device SCSI name */ + i += 4 - i % 4; /* align to required 4 bytes */ + scst_copy_and_fill_b(&buf[num + 4], virt_dev->scsi_device_name, i, '\0'); + + buf[num + 3] = i; + num += buf[num + 3]; + + num += 4; + } + read_unlock(&vdisk_serial_rwlock); + /* T10 vendor identifier field format (faked) */ buf[num + 0] = 0x2; /* ASCII */ buf[num + 1] = 0x1; /* Vendor ID */ @@ -2612,6 +2662,7 @@ static enum compl_status_e vdisk_exec_inquiry(struct vdisk_cmd_params *p) buf[1] = 0xB0; buf[3] = 0x3C; buf[4] = 1; /* WSNZ set */ + buf[5] = 0xFF; /* No MAXIMUM COMPARE AND WRITE LENGTH limit */ /* Optimal transfer granuality is PAGE_SIZE */ put_unaligned_be16(max_t(int, 
PAGE_SIZE/dev->block_size, 1), &buf[6]); @@ -3354,7 +3405,7 @@ static enum compl_status_e vdisk_exec_read_capacity(struct vdisk_cmd_params *p) nblocks = virt_dev->nblocks; if ((cmd->cdb[8] & 1) == 0) { - uint64_t lba = get_unaligned_be64(&cmd->cdb[2]); + uint32_t lba = get_unaligned_be32(&cmd->cdb[2]); if (lba != 0) { TRACE_DBG("PMI zero and LBA not zero (cmd %p)", cmd); scst_set_cmd_error(cmd, @@ -3623,7 +3674,7 @@ static enum compl_status_e vdisk_exec_prevent_allow_medium_removal(struct vdisk_ return CMD_SUCCEEDED; } -static int vdisk_fsync_blockio(struct vdisk_cmd_params *p, loff_t loff, +static int vdisk_fsync_blockio(loff_t loff, loff_t len, struct scst_device *dev, gfp_t gfp_flags, struct scst_cmd *cmd, bool async) { @@ -3646,7 +3697,7 @@ static int vdisk_fsync_blockio(struct vdisk_cmd_params *p, loff_t loff, return res; } -static int vdisk_fsync_fileio(struct vdisk_cmd_params *p, loff_t loff, +static int vdisk_fsync_fileio(loff_t loff, loff_t len, struct scst_device *dev, struct scst_cmd *cmd, bool async) { int res; @@ -3697,7 +3748,7 @@ static int vdisk_fsync_fileio(struct vdisk_cmd_params *p, loff_t loff, return res; } -static int vdisk_fsync(struct vdisk_cmd_params *p, loff_t loff, +static int vdisk_fsync(loff_t loff, loff_t len, struct scst_device *dev, gfp_t gfp_flags, struct scst_cmd *cmd, bool async) { @@ -3722,10 +3773,12 @@ static int vdisk_fsync(struct vdisk_cmd_params *p, loff_t loff, goto out; } - if (virt_dev->blockio) - res = vdisk_fsync_blockio(p, loff, len, dev, gfp_flags, cmd, async); + if (virt_dev->nullio) + ; + else if (virt_dev->blockio) + res = vdisk_fsync_blockio(loff, len, dev, gfp_flags, cmd, async); else - res = vdisk_fsync_fileio(p, loff, len, dev, cmd, async); + res = vdisk_fsync_fileio(loff, len, dev, cmd, async); out: TRACE_EXIT_RES(res); @@ -4016,7 +4069,7 @@ restart: out_sync: /* O_DSYNC flag is used for WT devices */ if (p->fua) - vdisk_fsync(p, loff, scst_cmd_get_data_len(cmd), cmd->dev, + vdisk_fsync(loff, 
scst_cmd_get_data_len(cmd), cmd->dev, cmd->cmd_gfp_mask, cmd, false); out: TRACE_EXIT(); @@ -4382,65 +4435,198 @@ out: return res; } -static enum compl_status_e fileio_exec_verify(struct vdisk_cmd_params *p) +#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 24) +static int blockio_end_sync_io(struct bio *bio, unsigned int bytes_done, + int error) +#else +static void blockio_end_sync_io(struct bio *bio, int error) +#endif +{ + struct completion *c = bio->bi_private; + +#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 24) + if (bio->bi_size) + return 1; +#endif + + if (!bio_flagged(bio, BIO_UPTODATE) && error == 0) { + PRINT_ERROR("Not up to date bio with error 0; returning -EIO"); + error = -EIO; + } + + bio->bi_private = (void *)(unsigned long)error; + complete(c); + +#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 24) + return 0; +#else + return; +#endif +} + +static ssize_t blockio_rw_sync(struct scst_vdisk_dev *virt_dev, void *buf, + size_t len, loff_t *loff, unsigned rw) +{ + DECLARE_COMPLETION_ONSTACK(c); + struct block_device *bdev = virt_dev->bdev; + struct bio *bio; + void *p; + int max_nr_vecs, rc; + unsigned bytes, off; + ssize_t ret = -ENOMEM; + + max_nr_vecs = min(bio_get_nr_vecs(bdev), BIO_MAX_PAGES); + +#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 30) + bio = bio_kmalloc(GFP_KERNEL, max_nr_vecs); +#else + bio = bio_alloc(GFP_KERNEL, max_nr_vecs); +#endif + + if (!bio) + goto out; + + bio->bi_rw = rw; + bio->bi_bdev = bdev; + bio->bi_end_io = blockio_end_sync_io; + bio->bi_private = &c; +#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 14, 0) + bio->bi_sector = *loff >> 9; +#else + bio->bi_iter.bi_sector = *loff >> 9; +#endif + for (p = buf; p < buf + len; p += bytes) { + off = offset_in_page(p); + bytes = PAGE_SIZE - off; + rc = bio_add_page(bio, virt_to_page(p), bytes, off); + if (WARN_ON_ONCE(rc < bytes)) + goto free; + } + submit_bio(rw, bio); + wait_for_completion(&c); + ret = (unsigned long)bio->bi_private ? 
: len; + +free: + bio_put(bio); + +out: + return ret; +} + +static ssize_t fileio_read_sync(struct file *fd, void *buf, size_t len, + loff_t *loff) +{ + mm_segment_t old_fs; + ssize_t ret; + + old_fs = get_fs(); + set_fs(get_ds()); + + if (fd->f_op->llseek) + ret = fd->f_op->llseek(fd, *loff, 0/*SEEK_SET*/); + else + ret = default_llseek(fd, *loff, 0/*SEEK_SET*/); + if (ret < 0) + goto out; + + ret = vfs_read(fd, (char __force __user *)buf, len, loff); + +out: + set_fs(old_fs); + + return ret; +} + +static ssize_t fileio_write_sync(struct file *fd, void *buf, size_t len, + loff_t *loff) +{ + mm_segment_t old_fs; + ssize_t ret; + + old_fs = get_fs(); + set_fs(get_ds()); + + if (fd->f_op->llseek) + ret = fd->f_op->llseek(fd, *loff, 0/*SEEK_SET*/); + else + ret = default_llseek(fd, *loff, 0/*SEEK_SET*/); + if (ret < 0) + goto out; + + ret = vfs_write(fd, (char __force __user *)buf, len, loff); + +out: + set_fs(old_fs); + + return ret; +} +static ssize_t vdev_read_sync(struct scst_vdisk_dev *virt_dev, void *buf, + size_t len, loff_t *loff) +{ + if (virt_dev->nullio) + return len; + else if (virt_dev->blockio) + return blockio_rw_sync(virt_dev, buf, len, loff, 0/*read*/); + else + return fileio_read_sync(virt_dev->fd, buf, len, loff); +} + +static ssize_t vdev_write_sync(struct scst_vdisk_dev *virt_dev, void *buf, + size_t len, loff_t *loff) +{ + int rw; + + if (virt_dev->nullio) { + return len; + } else if (virt_dev->blockio) { +#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 36) + rw = REQ_WRITE; +#else + rw = 1 << BIO_RW; +#endif + + return blockio_rw_sync(virt_dev, buf, len, loff, rw); + } else { + return fileio_write_sync(virt_dev->fd, buf, len, loff); + } +} + +static enum compl_status_e vdev_exec_verify(struct vdisk_cmd_params *p) { struct scst_cmd *cmd = p->cmd; loff_t loff = p->loff; - mm_segment_t old_fs; loff_t err; ssize_t length, len_mem = 0; uint8_t *address_sav, *address = NULL; int compare; struct scst_vdisk_dev *virt_dev = cmd->dev->dh_priv; - struct 
file *fd = virt_dev->fd; uint8_t *mem_verify = NULL; int64_t data_len = scst_cmd_get_data_len(cmd); TRACE_ENTRY(); - sBUG_ON(virt_dev->blockio); - - if (vdisk_fsync(p, loff, data_len, cmd->dev, + if (vdisk_fsync(loff, data_len, cmd->dev, cmd->cmd_gfp_mask, cmd, false) != 0) goto out; /* - * Until the cache is cleared prior the verifying, there is not - * much point in this code. ToDo. + * For file I/O, unless the cache is cleared prior the verifying, + * there is not much point in this code. ToDo. * * Nevertherless, this code is valuable if the data have not been read * from the file/disk yet. */ compare = scst_cmd_get_data_direction(cmd) == SCST_DATA_WRITE; - TRACE_DBG("VERIFY with BYTCHK=%d at offset %lld and len %lld\n", + TRACE_DBG("VERIFY with compare %d at offset %lld and len %lld\n", compare, loff, (long long)data_len); - /* SEEK */ - old_fs = get_fs(); - set_fs(get_ds()); - - if (!virt_dev->nullio) { - if (fd->f_op->llseek) - err = fd->f_op->llseek(fd, loff, 0/*SEEK_SET*/); - else - err = default_llseek(fd, loff, 0/*SEEK_SET*/); - if (err != loff) { - PRINT_ERROR("lseek trouble %lld != %lld", - (long long unsigned int)err, - (long long unsigned int)loff); - scst_set_cmd_error(cmd, - SCST_LOAD_SENSE(scst_sense_read_error)); - goto out_set_fs; - } - } - mem_verify = vmalloc(LEN_MEM); if (mem_verify == NULL) { PRINT_ERROR("Unable to allocate memory %d for verify", LEN_MEM); scst_set_busy(cmd); - goto out_set_fs; + goto out; } if (compare) { @@ -4453,11 +4639,7 @@ static enum compl_status_e fileio_exec_verify(struct vdisk_cmd_params *p) len_mem = (length > LEN_MEM) ? 
LEN_MEM : length; TRACE_DBG("Verify: length %zd - len_mem %zd", length, len_mem); - if (!virt_dev->nullio) - err = vfs_read(fd, (char __force __user *)mem_verify, - len_mem, &loff); - else - err = len_mem; + err = vdev_read_sync(virt_dev, mem_verify, len_mem, &loff); if ((err < 0) || (err < len_mem)) { PRINT_ERROR("verify() returned %lld from %zd", (long long unsigned int)err, len_mem); @@ -4469,14 +4651,14 @@ static enum compl_status_e fileio_exec_verify(struct vdisk_cmd_params *p) } if (compare) scst_put_buf(cmd, address_sav); - goto out_set_fs; + goto out_free; } if (compare && memcmp(address, mem_verify, len_mem) != 0) { TRACE_DBG("Verify: error memcmp length %zd", length); scst_set_cmd_error(cmd, SCST_LOAD_SENSE(scst_sense_miscompare_error)); scst_put_buf(cmd, address_sav); - goto out_set_fs; + goto out_free; } length -= len_mem; if (compare) @@ -4494,8 +4676,7 @@ static enum compl_status_e fileio_exec_verify(struct vdisk_cmd_params *p) SCST_LOAD_SENSE(scst_sense_hardw_error)); } -out_set_fs: - set_fs(old_fs); +out_free: if (mem_verify) vfree(mem_verify); @@ -4504,6 +4685,106 @@ out: return CMD_SUCCEEDED; } +/* COMPARE AND WRITE */ +static enum compl_status_e vdisk_exec_caw(struct vdisk_cmd_params *p) +{ + struct scst_cmd *cmd = p->cmd; + struct scst_device *dev = cmd->dev; + struct scst_vdisk_dev *virt_dev = dev->dh_priv; + uint32_t data_len = scst_cmd_get_data_len(cmd); + int length, i; + uint8_t *caw_buf = NULL, *read_buf = NULL; + loff_t loff, read, written; + + if (unlikely(cmd->cdb[1] & 0xE0)) { + TRACE_DBG("%s", "WRPROTECT not supported"); + scst_set_invalid_field_in_cdb(cmd, 1, + SCST_INVAL_FIELD_BIT_OFFS_VALID | 5); + goto out; + } + + /* + * A NUMBER OF LOGICAL BLOCKS field set to zero specifies that no read + * operations shall be performed, no logical block data shall be + * transferred from the Data-Out Buffer, no compare operations shall + * be performed, and no write operations shall be performed. 
This + * condition shall not be considered an error. + */ + if (data_len == 0) + goto out; + + length = scst_get_buf_full(cmd, &caw_buf); + read_buf = vmalloc(data_len); + if (length < 0 || !read_buf) { + PRINT_ERROR("scst_get_buf_full() failed: %d", length); + if (length == -ENOMEM || !read_buf) + scst_set_busy(cmd); + else + scst_set_cmd_error(cmd, + SCST_LOAD_SENSE(scst_sense_hardw_error)); + goto out; + } + + WARN_ON_ONCE(length != 2 * data_len); + + loff = p->loff; + read = vdev_read_sync(virt_dev, read_buf, data_len, &loff); + if (read < data_len) { + PRINT_ERROR("COMPARE AND WRITE / READ returned %lld from %d", + read, data_len); + if (read == -EAGAIN) + scst_set_busy(cmd); + else + scst_set_cmd_error(cmd, + SCST_LOAD_SENSE(scst_sense_read_error)); + goto out; + } + + if (memcmp(caw_buf, read_buf, data_len) != 0) { + for (i = 0; i < data_len && caw_buf[i] == read_buf[i]; i++) + ; + /* + * SBC-3 $5.2: if the compare operation does not indicate a + * match, then terminate the command with CHECK CONDITION + * status with the sense key set to MISCOMPARE and the + * additional sense code set to MISCOMPARE DURING VERIFY + * OPERATION. In the sense data (see 4.18 and SPC-4) the + * offset from the start of the Data-Out Buffer to the first + * byte of data that was not equal shall be reported in the + * INFORMATION field. 
+ */ + scst_set_cmd_error_and_inf(cmd, + SCST_LOAD_SENSE(scst_sense_miscompare_error), + p->loff + i); + goto out; + } + + loff = p->loff; + written = vdev_write_sync(virt_dev, caw_buf + data_len, data_len, + &loff); + if (written < data_len) { + PRINT_ERROR("COMPARE AND WRITE / WRITE wrote %lld / %d", + written, data_len); + if (written == -EAGAIN) + scst_set_busy(cmd); + else + scst_set_cmd_error(cmd, + SCST_LOAD_SENSE(scst_sense_write_error)); + goto out; + } + if (p->fua) + vdisk_fsync(loff, scst_cmd_get_data_len(cmd), cmd->dev, + cmd->cmd_gfp_mask, cmd, false); + +out: + if (read_buf) + vfree(read_buf); + if (caw_buf) + scst_put_buf_full(cmd, caw_buf); + + return CMD_SUCCEEDED; +} + static enum compl_status_e blockio_exec_write_verify(struct vdisk_cmd_params *p) { /* Not yet implemented */ @@ -4511,18 +4792,12 @@ static enum compl_status_e blockio_exec_write_verify(struct vdisk_cmd_params *p) return blockio_exec_write(p); } -static enum compl_status_e blockio_exec_verify(struct vdisk_cmd_params *p) -{ - /* Not yet implemented */ - return CMD_SUCCEEDED; -} - static enum compl_status_e fileio_exec_write_verify(struct vdisk_cmd_params *p) { fileio_exec_write(p); /* O_DSYNC flag is used for WT devices */ if (scsi_status_is_good(p->cmd->status)) - fileio_exec_verify(p); + vdev_exec_verify(p); return CMD_SUCCEEDED; } @@ -4723,8 +4998,7 @@ static int vdev_create(struct scst_dev_type *devt, TRACE_DBG("t10_dev_id %s", virt_dev->t10_dev_id); sprintf(virt_dev->t10_vend_id, "%.*s", - (int)(sizeof(virt_dev->t10_vend_id) - 1), - virt_dev->blockio ? 
SCST_BIO_VENDOR : SCST_FIO_VENDOR); + (int)sizeof(virt_dev->t10_vend_id) - 1, SCST_FIO_VENDOR); sprintf(virt_dev->vend_specific_id, "%.*s", (int)(sizeof(virt_dev->vend_specific_id) - 1), @@ -4736,6 +5010,9 @@ static int vdev_create(struct scst_dev_type *devt, sprintf(virt_dev->prod_rev_lvl, "%.*s", (int)(sizeof(virt_dev->prod_rev_lvl) - 1), SCST_FIO_REV); + sprintf(virt_dev->scsi_device_name, "%.*s", + (int)(sizeof(virt_dev->scsi_device_name) - 1), ""); + scnprintf(virt_dev->usn, sizeof(virt_dev->usn), "%llx", dev_id_num); TRACE_DBG("usn %s", virt_dev->usn); @@ -4898,6 +5175,10 @@ static int vdev_parse_add_dev_params(struct scst_vdisk_dev *virt_dev, virt_dev->thin_provisioned); } else if (!strcasecmp("zero_copy", p)) { virt_dev->zero_copy = !!val; + } else if (!strcasecmp("size", p)) { + virt_dev->file_size = val; + } else if (!strcasecmp("size_mb", p)) { + virt_dev->file_size = val * 1024 * 1024; } else if (!strcasecmp("blocksize", p)) { virt_dev->blk_shift = scst_calc_block_shift(val); if (virt_dev->blk_shift < 9) { @@ -4914,6 +5195,12 @@ static int vdev_parse_add_dev_params(struct scst_vdisk_dev *virt_dev, } } + if (virt_dev->file_size % (1 << virt_dev->blk_shift) != 0) { + PRINT_ERROR("Device size %lld is not a multiple of the block" + " size %d", virt_dev->file_size, + 1 << virt_dev->blk_shift); + res = -EINVAL; + } out: TRACE_EXIT_RES(res); return res; @@ -4999,6 +5286,8 @@ static int vdev_blockio_add_device(const char *device_name, char *params) virt_dev->blockio = 1; virt_dev->wt_flag = DEF_WRITE_THROUGH; + sprintf(virt_dev->t10_vend_id, "%.*s", + (int)sizeof(virt_dev->t10_vend_id) - 1, SCST_BIO_VENDOR); res = vdev_parse_add_dev_params(virt_dev, params, allowed_params); if (res != 0) @@ -5042,7 +5331,7 @@ static int vdev_nullio_add_device(const char *device_name, char *params) int res = 0; static const char *const allowed_params[] = { "read_only", "dummy", "removable", "blocksize", "rotational", - NULL + "size", "size_mb", NULL }; struct scst_vdisk_dev 
*virt_dev; @@ -5055,6 +5344,7 @@ static int vdev_nullio_add_device(const char *device_name, char *params) virt_dev->command_set_version = 0x04C0; /* SBC-3 */ virt_dev->nullio = 1; + virt_dev->file_size = VDISK_NULLIO_SIZE; res = vdev_parse_add_dev_params(virt_dev, params, allowed_params); if (res != 0) @@ -5492,22 +5782,128 @@ out_free: goto out; } -static ssize_t vdev_sysfs_size_show(struct kobject *kobj, - struct kobj_attribute *attr, char *buf) +static int vdev_size_process_store(struct scst_sysfs_work_item *work) +{ + struct scst_device *dev = work->dev; + struct scst_vdisk_dev *virt_dev; + unsigned long long new_size; + int size_shift, res = -EINVAL; + + if (sscanf(work->buf, "%d %lld", &size_shift, &new_size) != 2 || + new_size > (ULONG_MAX >> size_shift)) + goto put; + + new_size <<= size_shift; + + res = scst_suspend_activity(SCST_SUSPEND_TIMEOUT_USER); + if (res) + goto put; + + /* To sync with detach*() functions */ + res = mutex_lock_interruptible(&scst_mutex); + if (res) + goto resume; + + virt_dev = dev->dh_priv; + if (!virt_dev->nullio) { + res = -EPERM; + sBUG(); + } else if (new_size % (1 << virt_dev->blk_shift) == 0) { + virt_dev->file_size = new_size; + virt_dev->nblocks = virt_dev->file_size >> dev->block_shift; + } else { + res = -EINVAL; + } + + mutex_unlock(&scst_mutex); + + if (res == 0) + scst_capacity_data_changed(dev); + +resume: + scst_resume_activity(); + +put: + kobject_put(&dev->dev_kobj); + return res; +} + +static ssize_t vdev_size_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, + size_t count, int size_shift) +{ + struct scst_device *dev = container_of(kobj, struct scst_device, + dev_kobj); + struct scst_sysfs_work_item *work; + char *new_size; + int res = -ENOMEM; + + + new_size = kasprintf(GFP_KERNEL, "%d %.*s", size_shift, (int)count, + buf); + if (!new_size) + goto out; + + res = scst_alloc_sysfs_work(vdev_size_process_store, false, &work); + if (res) + goto out_free; + + work->buf = new_size; + 
work->dev = dev; + + SCST_SET_DEP_MAP(work, &scst_dev_dep_map); + kobject_get(&dev->dev_kobj); + + res = scst_sysfs_queue_wait_work(work); + if (res == 0) + res = count; + +out: + return res; + +out_free: + kfree(new_size); + goto out; +} + +static ssize_t vdev_sysfs_size_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count) +{ + return vdev_size_store(kobj, attr, buf, count, 0); +} + +static ssize_t vdev_size_show(struct kobject *kobj, struct kobj_attribute *attr, + char *buf, int size_shift) { - int pos = 0; struct scst_device *dev; struct scst_vdisk_dev *virt_dev; - - TRACE_ENTRY(); + unsigned long long size; dev = container_of(kobj, struct scst_device, dev_kobj); virt_dev = dev->dh_priv; + size = ACCESS_ONCE(virt_dev->file_size); - pos = sprintf(buf, "%lld\n", virt_dev->file_size / 1024 / 1024); + return sprintf(buf, "%llu\n%s", size >> size_shift, + virt_dev->nullio && size != VDISK_NULLIO_SIZE ? + SCST_SYSFS_KEY_MARK "\n" : ""); +} - TRACE_EXIT_RES(pos); - return pos; +static ssize_t vdev_sysfs_size_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + return vdev_size_show(kobj, attr, buf, 0); +} + +static ssize_t vdev_sysfs_size_mb_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count) +{ + return vdev_size_store(kobj, attr, buf, count, 20); +} + +static ssize_t vdev_sysfs_size_mb_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + return vdev_size_show(kobj, attr, buf, 20); } static ssize_t vdisk_sysfs_blocksize_show(struct kobject *kobj, @@ -6068,6 +6464,66 @@ static ssize_t vdev_sysfs_prod_rev_lvl_show(struct kobject *kobj, return pos; } +static ssize_t vdev_sysfs_scsi_device_name_store(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t count) +{ + struct scst_device *dev; + struct scst_vdisk_dev *virt_dev; + char *p; + int res, len; + + TRACE_ENTRY(); + + dev = container_of(kobj, struct scst_device, dev_kobj); + 
virt_dev = dev->dh_priv; + p = memchr(buf, '\n', count); + len = p ? p - buf : count; + + if (len >= sizeof(virt_dev->scsi_device_name)) { + PRINT_ERROR("SCSI device name is too long (max %zd characters)", + sizeof(virt_dev->scsi_device_name)); + res = -EINVAL; + goto out; + } + + write_lock(&vdisk_serial_rwlock); + sprintf(virt_dev->scsi_device_name, "%.*s", len, buf); + if (strlen(virt_dev->scsi_device_name) > 0) + virt_dev->scsi_device_name_set = 1; + else + virt_dev->scsi_device_name_set = 0; + write_unlock(&vdisk_serial_rwlock); + + res = count; + +out: + TRACE_EXIT_RES(res); + return res; +} + +static ssize_t vdev_sysfs_scsi_device_name_show(struct kobject *kobj, + struct kobj_attribute *attr, + char *buf) +{ + int pos; + struct scst_device *dev; + struct scst_vdisk_dev *virt_dev; + + TRACE_ENTRY(); + + dev = container_of(kobj, struct scst_device, dev_kobj); + virt_dev = dev->dh_priv; + + read_lock(&vdisk_serial_rwlock); + pos = sprintf(buf, "%s\n%s", virt_dev->scsi_device_name, + virt_dev->scsi_device_name_set ? 
SCST_SYSFS_KEY_MARK "\n" : ""); + read_unlock(&vdisk_serial_rwlock); + + TRACE_EXIT_RES(pos); + return pos; +} + static ssize_t vdev_sysfs_t10_dev_id_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { @@ -6502,6 +6958,9 @@ static int vdisk_write_proc(char *buffer, char **start, off_t offset, virt_dev->blockio = 1; /* Bad hack for anyway going out procfs */ virt_dev->vdev_devt = &vdisk_blk_devtype; + sprintf(virt_dev->t10_vend_id, "%.*s", + (int)sizeof(virt_dev->t10_vend_id) - 1, + SCST_BIO_VENDOR); TRACE_DBG("%s", "BLOCKIO"); } else if (!strncmp("REMOVABLE", p, 9)) { p += 9; diff --git a/scst/src/scst_lib.c b/scst/src/scst_lib.c index 6c86a813f..da49e0c15 100644 --- a/scst/src/scst_lib.c +++ b/scst/src/scst_lib.c @@ -150,6 +150,8 @@ static int get_cdb_info_write_same10(struct scst_cmd *cmd, const struct scst_sdbops *sdbops); static int get_cdb_info_write_same16(struct scst_cmd *cmd, const struct scst_sdbops *sdbops); +static int get_cdb_info_compare_and_write(struct scst_cmd *cmd, + const struct scst_sdbops *sdbops); static int get_cdb_info_apt(struct scst_cmd *cmd, const struct scst_sdbops *sdbops); static int get_cdb_info_min(struct scst_cmd *cmd, @@ -966,6 +968,14 @@ static const struct scst_sdbops scst_scsi_op_table[] = { .info_lba_off = 2, .info_lba_len = 8, .info_len_off = 10, .info_len_len = 4, .get_cdb_info = get_cdb_info_lba_8_len_4}, + {.ops = 0x89, .devkey = "O ", + .info_op_name = "COMPARE AND WRITE", + .info_data_direction = SCST_DATA_WRITE, + .info_op_flags = SCST_TRANSFER_LEN_TYPE_FIXED|SCST_WRITE_MEDIUM| + SCST_SERIALIZED, + .info_lba_off = 2, .info_lba_len = 8, + .info_len_off = 13, .info_len_len = 1, + .get_cdb_info = get_cdb_info_compare_and_write}, {.ops = 0x8A, .devkey = "O OO O ", .info_op_name = "WRITE(16)", .info_data_direction = SCST_DATA_WRITE, @@ -1656,6 +1666,38 @@ out: } EXPORT_SYMBOL(scst_set_cmd_error); +int scst_set_cmd_error_and_inf(struct scst_cmd *cmd, int key, int asc, + int ascq, uint64_t 
information) +{ + int res; + + res = scst_set_cmd_error(cmd, key, asc, ascq); + if (res) + goto out; + + switch (cmd->sense[0] & 0x7f) { + case 0x70: + /* Fixed format */ + cmd->sense[0] |= 0x80; /* Information field is valid */ + put_unaligned_be32(information, &cmd->sense[3]); + break; + case 0x72: + /* Descriptor format */ + cmd->sense[7] = 12; /* additional sense length */ + cmd->sense[8 + 0] = 0; /* descriptor type: Information */ + cmd->sense[8 + 1] = 10; /* Additional length */ + cmd->sense[8 + 2] = 0x80; /* VALID */ + put_unaligned_be64(information, &cmd->sense[8 + 4]); + break; + default: + sBUG(); + } + +out: + return res; +} +EXPORT_SYMBOL(scst_set_cmd_error_and_inf); + static void scst_fill_field_pointer_sense(uint8_t *fp_sense, int field_offs, int bit_offs, bool cdb) { @@ -3831,7 +3873,7 @@ found: t->acg_dev->acg->acg_io_grouping_type); } else { res = t; - if (!*(volatile bool*)&res->active_cmd_threads->io_context_ready) { + if (!*(volatile bool *)&res->active_cmd_threads->io_context_ready) { TRACE_DBG("IO context for t %p not yet " "initialized, waiting...", t); msleep(100); @@ -4110,6 +4152,10 @@ static int scst_alloc_add_tgt_dev(struct scst_session *sess, tgt_dev->tgt_dev_rd_only = acg_dev->acg_dev_rd_only || dev->dev_rd_only; tgt_dev->sess = sess; atomic_set(&tgt_dev->tgt_dev_cmd_count, 0); + if (sess->acg->acg_black_hole_type != SCST_ACG_BLACK_HOLE_NONE) + set_bit(SCST_TGT_DEV_BLACK_HOLE, &tgt_dev->tgt_dev_flags); + else + clear_bit(SCST_TGT_DEV_BLACK_HOLE, &tgt_dev->tgt_dev_flags); scst_sgv_pool_use_norm(tgt_dev); @@ -4233,7 +4279,7 @@ out_free: goto out; } -/* No locks supposed to be held, scst_mutex - held */ +/* scst_mutex supposed to be held */ void scst_nexus_loss(struct scst_tgt_dev *tgt_dev, bool queue_UA) { TRACE_ENTRY(); @@ -4531,6 +4577,24 @@ out: return res; } +static void scst_prelim_finish_internal_cmd(struct scst_cmd *cmd) +{ + unsigned long flags; + + TRACE_ENTRY(); + + sBUG_ON(!cmd->internal); + + 
spin_lock_irqsave(&cmd->sess->sess_list_lock, flags); + list_del(&cmd->sess_cmd_list_entry); + spin_unlock_irqrestore(&cmd->sess->sess_list_lock, flags); + + __scst_cmd_put(cmd); + + TRACE_EXIT(); + return; +} + int scst_prepare_request_sense(struct scst_cmd *orig_cmd) { int res = 0; @@ -4734,7 +4798,7 @@ out: return res; out_free_cmd: - __scst_cmd_put(cmd); + scst_prelim_finish_internal_cmd(cmd); out_busy: scst_set_busy(ws_cmd); @@ -6436,8 +6500,16 @@ static int get_cdb_info_fmt(struct scst_cmd *cmd, static int get_cdb_info_verify10(struct scst_cmd *cmd, const struct scst_sdbops *sdbops) { + if (unlikely(cmd->cdb[1] & 4)) { + PRINT_ERROR("VERIFY(10): BYTCHK 1x not supported (dev %s)", + cmd->dev ? cmd->dev->virt_name : NULL); + scst_set_invalid_field_in_cdb(cmd, 1, + 2 | SCST_INVAL_FIELD_BIT_OFFS_VALID); + return 1; + } + cmd->lba = get_unaligned_be32(cmd->cdb + sdbops->info_lba_off); - if (cmd->cdb[1] & BYTCHK) { + if (cmd->cdb[1] & 2) { cmd->bufflen = get_unaligned_be16(cmd->cdb + sdbops->info_len_off); cmd->data_len = cmd->bufflen; cmd->data_direction = SCST_DATA_WRITE; @@ -6455,7 +6527,15 @@ static int get_cdb_info_verify6(struct scst_cmd *cmd, cmd->op_flags |= SCST_LBA_NOT_VALID; cmd->lba = 0; - if (cmd->cdb[1] & BYTCHK) { + if (unlikely(cmd->cdb[1] & 4)) { + PRINT_ERROR("VERIFY(6): BYTCHK 1x not supported (dev %s)", + cmd->dev ? cmd->dev->virt_name : NULL); + scst_set_invalid_field_in_cdb(cmd, 1, + 2 | SCST_INVAL_FIELD_BIT_OFFS_VALID); + return 1; + } + + if (cmd->cdb[1] & 2) { /* BYTCHK 01 */ cmd->bufflen = get_unaligned_be24(cmd->cdb + sdbops->info_len_off); cmd->data_len = cmd->bufflen; cmd->data_direction = SCST_DATA_WRITE; @@ -6470,8 +6550,16 @@ static int get_cdb_info_verify6(struct scst_cmd *cmd, static int get_cdb_info_verify12(struct scst_cmd *cmd, const struct scst_sdbops *sdbops) { + if (unlikely(cmd->cdb[1] & 4)) { + PRINT_ERROR("VERIFY(12): BYTCHK 1x not supported (dev %s)", + cmd->dev ? 
cmd->dev->virt_name : NULL); + scst_set_invalid_field_in_cdb(cmd, 1, + 2 | SCST_INVAL_FIELD_BIT_OFFS_VALID); + return 1; + } + cmd->lba = get_unaligned_be32(cmd->cdb + sdbops->info_lba_off); - if (cmd->cdb[1] & BYTCHK) { + if (cmd->cdb[1] & 2) { /* BYTCHK 01 */ cmd->bufflen = get_unaligned_be32(cmd->cdb + sdbops->info_len_off); if (unlikely(cmd->bufflen & SCST_MAX_VALID_BUFFLEN_MASK)) { PRINT_ERROR("Too big bufflen %d (op %x)", @@ -6492,8 +6580,16 @@ static int get_cdb_info_verify12(struct scst_cmd *cmd, static int get_cdb_info_verify16(struct scst_cmd *cmd, const struct scst_sdbops *sdbops) { + if (unlikely(cmd->cdb[1] & 4)) { + PRINT_ERROR("VERIFY(16): BYTCHK 1x not supported (dev %s)", + cmd->dev ? cmd->dev->virt_name : NULL); + scst_set_invalid_field_in_cdb(cmd, 1, + 2 | SCST_INVAL_FIELD_BIT_OFFS_VALID); + return 1; + } + cmd->lba = get_unaligned_be64(cmd->cdb + sdbops->info_lba_off); - if (cmd->cdb[1] & BYTCHK) { + if (cmd->cdb[1] & 2) { /* BYTCHK 01 */ cmd->bufflen = get_unaligned_be32(cmd->cdb + sdbops->info_len_off); if (unlikely(cmd->bufflen & SCST_MAX_VALID_BUFFLEN_MASK)) { PRINT_ERROR("Too big bufflen %d (op %x)", @@ -6684,6 +6780,15 @@ static int get_cdb_info_write_same16(struct scst_cmd *cmd, return 0; } +static int get_cdb_info_compare_and_write(struct scst_cmd *cmd, + const struct scst_sdbops *sdbops) +{ + cmd->lba = get_unaligned_be64(cmd->cdb + sdbops->info_lba_off); + cmd->data_len = cmd->cdb[sdbops->info_len_off]; + cmd->bufflen = 2 * cmd->data_len; + return 0; +} + /** * get_cdb_info_apt() - Parse ATA PASS-THROUGH CDB. 
* @@ -6799,7 +6904,8 @@ static int get_cdb_info_min(struct scst_cmd *cmd, break; case MI_REPORT_SUPPORTED_TASK_MANAGEMENT_FUNCTIONS: cmd->op_name = "REPORT SUPPORTED TASK MANAGEMENT FUNCTIONS"; - cmd->op_flags |= SCST_WRITE_EXCL_ALLOWED; + cmd->op_flags |= SCST_WRITE_EXCL_ALLOWED | + SCST_LOCAL_CMD | SCST_FULLY_LOCAL_CMD; break; default: break; diff --git a/scst/src/scst_main.c b/scst/src/scst_main.c index 2be0ab797..d3e99cbac 100644 --- a/scst/src/scst_main.c +++ b/scst/src/scst_main.c @@ -1925,7 +1925,7 @@ out_wait: * Wait for io_context gets initialized to avoid possible races * for it from the sharing it tgt_devs. */ - while (!*(volatile bool*)&cmd_threads->io_context_ready) { + while (!*(volatile bool *)&cmd_threads->io_context_ready) { TRACE_DBG("Waiting for io_context for cmd_threads %p " "initialized", cmd_threads); msleep(50); diff --git a/scst/src/scst_pres.c b/scst/src/scst_pres.c index e7210d433..5ab3b6eb1 100644 --- a/scst/src/scst_pres.c +++ b/scst/src/scst_pres.c @@ -2457,7 +2457,7 @@ void scst_pr_read_reservation(struct scst_cmd *cmd, uint8_t *buffer, if (buffer_size < 8) { TRACE_PR("buffer_size too small: %d. 
expected >= 8 " "(buffer %p)", buffer_size, buffer); - goto skip; + goto out; } memset(b, 0, sizeof(b)); @@ -2491,10 +2491,10 @@ void scst_pr_read_reservation(struct scst_cmd *cmd, uint8_t *buffer, size = 24; } - memset(buffer, 0, buffer_size); - memcpy(buffer, b, min(size, buffer_size)); - -skip: +out: + size = min(size, buffer_size); + memcpy(buffer, b, size); + memset(buffer + size, 0, buffer_size - size); scst_set_resp_data_len(cmd, size); TRACE_EXIT(); diff --git a/scst/src/scst_sysfs.c b/scst/src/scst_sysfs.c index c7e08ab7d..8ca303484 100644 --- a/scst/src/scst_sysfs.c +++ b/scst/src/scst_sysfs.c @@ -1756,6 +1756,124 @@ static struct kobj_attribute scst_tgt_io_grouping_type = scst_tgt_io_grouping_type_show, scst_tgt_io_grouping_type_store); +static ssize_t __scst_acg_black_hole_show(struct scst_acg *acg, char *buf) +{ + int res, t = acg->acg_black_hole_type; + + res = sprintf(buf, "%d\n", t); + + return res; +} + +static ssize_t __scst_acg_black_hole_store(struct scst_acg *acg, + const char *buf, size_t count) +{ + int res = 0; + int prev, t; + struct scst_session *sess; + + prev = acg->acg_black_hole_type; + + if ((buf == NULL) || (count == 0)) { + res = 0; + goto out; + } + + mutex_lock(&scst_mutex); + + BUILD_BUG_ON((SCST_ACG_BLACK_HOLE_NONE != 0) || + (SCST_ACG_BLACK_HOLE_CMD != 1) || + (SCST_ACG_BLACK_HOLE_ALL != 2) || + (SCST_ACG_BLACK_HOLE_DATA_CMD != 3) || + (SCST_ACG_BLACK_HOLE_DATA_MCMD != 4)); + switch (buf[0]) { + case '0': + acg->acg_black_hole_type = SCST_ACG_BLACK_HOLE_NONE; + break; + case '1': + acg->acg_black_hole_type = SCST_ACG_BLACK_HOLE_CMD; + break; + case '2': + acg->acg_black_hole_type = SCST_ACG_BLACK_HOLE_ALL; + break; + case '3': + acg->acg_black_hole_type = SCST_ACG_BLACK_HOLE_DATA_CMD; + break; + case '4': + acg->acg_black_hole_type = SCST_ACG_BLACK_HOLE_DATA_MCMD; + break; + default: + PRINT_ERROR("%s: Requested action not understood: %s", + __func__, buf); + res = -EINVAL; + goto out_unlock; + } + + t = 
acg->acg_black_hole_type; + + if (prev == t) + goto out_unlock; + + list_for_each_entry(sess, &acg->acg_sess_list, acg_sess_list_entry) { + int i; + for (i = 0; i < SESS_TGT_DEV_LIST_HASH_SIZE; i++) { + struct list_head *head = &sess->sess_tgt_dev_list[i]; + struct scst_tgt_dev *tgt_dev; + list_for_each_entry(tgt_dev, head, sess_tgt_dev_list_entry) { + if (t != SCST_ACG_BLACK_HOLE_NONE) + set_bit(SCST_TGT_DEV_BLACK_HOLE, &tgt_dev->tgt_dev_flags); + else + clear_bit(SCST_TGT_DEV_BLACK_HOLE, &tgt_dev->tgt_dev_flags); + } + } + } + + PRINT_INFO("Black hole set to %d for ACG %s", t, acg->acg_name); + +out_unlock: + mutex_unlock(&scst_mutex); + +out: + return res; +} + +static ssize_t scst_tgt_black_hole_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + struct scst_acg *acg; + struct scst_tgt *tgt; + + tgt = container_of(kobj, struct scst_tgt, tgt_kobj); + acg = tgt->default_acg; + + return __scst_acg_black_hole_show(acg, buf); +} + +static ssize_t scst_tgt_black_hole_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count) +{ + int res; + struct scst_acg *acg; + struct scst_tgt *tgt; + + tgt = container_of(kobj, struct scst_tgt, tgt_kobj); + acg = tgt->default_acg; + + res = __scst_acg_black_hole_store(acg, buf, count); + if (res != 0) + goto out; + + res = count; + +out: + TRACE_EXIT_RES(res); + return res; +} + +static struct kobj_attribute scst_tgt_black_hole = + __ATTR(black_hole, S_IRUGO | S_IWUSR, + scst_tgt_black_hole_show, scst_tgt_black_hole_store); + static ssize_t __scst_acg_cpu_mask_show(struct scst_acg *acg, char *buf) { int res; @@ -2473,6 +2591,7 @@ static struct attribute *scst_tgt_attrs[] = { &scst_tgt_comment.attr, &scst_tgt_addr_method.attr, &scst_tgt_io_grouping_type.attr, + &scst_tgt_black_hole.attr, &scst_tgt_cpu_mask.attr, &scst_tgt_unknown_cmd_count_attr.attr, &scst_tgt_write_cmd_count_attr.attr, @@ -4285,6 +4404,39 @@ static struct kobj_attribute scst_acg_io_grouping_type = 
scst_acg_io_grouping_type_show, scst_acg_io_grouping_type_store); +static ssize_t scst_acg_black_hole_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + struct scst_acg *acg; + + acg = container_of(kobj, struct scst_acg, acg_kobj); + + return __scst_acg_black_hole_show(acg, buf); +} + +static ssize_t scst_acg_black_hole_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count) +{ + int res; + struct scst_acg *acg; + + acg = container_of(kobj, struct scst_acg, acg_kobj); + + res = __scst_acg_black_hole_store(acg, buf, count); + if (res != 0) + goto out; + + res = count; + +out: + TRACE_EXIT_RES(res); + return res; +} + +static struct kobj_attribute scst_acg_black_hole = + __ATTR(black_hole, S_IRUGO | S_IWUSR, + scst_acg_black_hole_show, scst_acg_black_hole_store); + static ssize_t scst_acg_cpu_mask_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { @@ -4408,6 +4560,13 @@ int scst_acg_sysfs_create(struct scst_tgt *tgt, goto out_del; } + res = sysfs_create_file(&acg->acg_kobj, &scst_acg_black_hole.attr); + if (res != 0) { + PRINT_ERROR("Can't add tgt attr %s for tgt %s", + scst_acg_black_hole.attr.name, tgt->tgt_name); + goto out_del; + } + res = sysfs_create_file(&acg->acg_kobj, &scst_acg_cpu_mask.attr); if (res != 0) { PRINT_ERROR("Can't add tgt attr %s for tgt %s", diff --git a/scst/src/scst_targ.c b/scst/src/scst_targ.c index 697083bf6..705b24263 100644 --- a/scst/src/scst_targ.c +++ b/scst/src/scst_targ.c @@ -972,6 +972,32 @@ out: } #endif + if (unlikely(test_bit(SCST_TGT_DEV_BLACK_HOLE, &cmd->tgt_dev->tgt_dev_flags))) { + struct scst_session *sess = cmd->sess; + bool abort = false; + switch (sess->acg->acg_black_hole_type) { + case SCST_ACG_BLACK_HOLE_CMD: + case SCST_ACG_BLACK_HOLE_ALL: + abort = true; + break; + case SCST_ACG_BLACK_HOLE_DATA_CMD: + case SCST_ACG_BLACK_HOLE_DATA_MCMD: + if (cmd->data_direction != SCST_DATA_NONE) + abort = true; + break; + default: + break; + } + if (abort) 
{ + TRACE_MGMT_DBG("Black hole: aborting cmd %p (op %x, " + "initiator %s)", cmd, cmd->cdb[0], + sess->initiator_name); + spin_lock_irq(&sess->sess_list_lock); + scst_abort_cmd(cmd, NULL, false, false); + spin_unlock_irq(&sess->sess_list_lock); + } + } + TRACE_EXIT_HRES(res); return res; @@ -2036,6 +2062,74 @@ out_unlock_put_not_completed: goto out; } +static int scst_report_supported_tm_fns(struct scst_cmd *cmd) +{ + int res = SCST_EXEC_COMPLETED; + int length, resp_len = 0; + uint8_t *address; + uint8_t buf[16]; + + TRACE_ENTRY(); + + length = scst_get_buf_full_sense(cmd, &address); + TRACE_DBG("length %d", length); + if (unlikely(length <= 0)) + goto out_compl; + + memset(buf, 0, sizeof(buf)); + + buf[0] = 0xD8; /* ATS, ATSS, CTSS, LURS */ + buf[1] = 0; + if ((cmd->cdb[2] & 0x80) == 0) + resp_len = 4; + else { + buf[3] = 0x0C; +#if 1 + buf[4] = 1; /* TMFTMOV */ + buf[6] = 0x80; /* ATTS */ + put_unaligned_be32(300, &buf[8]); /* long timeout - 30 sec. */ + put_unaligned_be32(150, &buf[12]); /* short timeout - 15 sec. 
*/ +#endif + resp_len = 16; + } + + if (length > resp_len) + length = resp_len; + memcpy(address, buf, length); + + scst_put_buf_full(cmd, address); + if (length < cmd->resp_data_len) + scst_set_resp_data_len(cmd, length); + +out_compl: + cmd->completed = 1; + + /* Report the result */ + cmd->scst_cmd_done(cmd, SCST_CMD_STATE_DEFAULT, SCST_CONTEXT_SAME); + + TRACE_EXIT_RES(res); + return res; +} + +static int scst_maintenance_in(struct scst_cmd *cmd) +{ + int res; + + TRACE_ENTRY(); + + switch (cmd->cdb[1] & 0x1f) { + case MI_REPORT_SUPPORTED_TASK_MANAGEMENT_FUNCTIONS: + res = scst_report_supported_tm_fns(cmd); + break; + default: + res = SCST_EXEC_NOT_COMPLETED; + break; + } + + TRACE_EXIT_RES(res); + return res; +} + static int scst_reserve_local(struct scst_cmd *cmd) { int res = SCST_EXEC_NOT_COMPLETED; @@ -2777,6 +2871,7 @@ static scst_local_exec_fn scst_local_fns[256] = { [PERSISTENT_RESERVE_OUT] = scst_persistent_reserve_out_local, [REPORT_LUNS] = scst_report_luns_local, [REQUEST_SENSE] = scst_request_sense_local, + [MAINTENANCE_IN] = scst_maintenance_in, }; static int scst_do_local_exec(struct scst_cmd *cmd) @@ -5397,10 +5492,18 @@ static int scst_clear_task_set(struct scst_mgmt_cmd *mcmd) * >0, if it should be requeued, <0 otherwise */ static int scst_mgmt_cmd_init(struct scst_mgmt_cmd *mcmd) { - int res = 0, rc; + int res = 0, rc, t; TRACE_ENTRY(); + t = mcmd->sess->acg->acg_black_hole_type; + if (unlikely((t == SCST_ACG_BLACK_HOLE_ALL) || + (t == SCST_ACG_BLACK_HOLE_DATA_MCMD))) { + TRACE_MGMT_DBG("Dropping mcmd %p (fn %d, initiator %s)", mcmd, + mcmd->fn, mcmd->sess->initiator_name); + mcmd->mcmd_dropped = 1; + } + switch (mcmd->fn) { case SCST_ABORT_TASK: { diff --git a/scst_local/scst_local.c b/scst_local/scst_local.c index 5953ae38e..e872bdeaa 100644 --- a/scst_local/scst_local.c +++ b/scst_local/scst_local.c @@ -1657,11 +1657,6 @@ static int scst_local_driver_remove(struct device *dev) TRACE_ENTRY(); sess = to_scst_lcl_sess(dev); - if (!sess) { - 
PRINT_ERROR("%s", "Unable to locate sess info"); - return -ENODEV; - } - scsi_remove_host(sess->shost); scsi_host_put(sess->shost); @@ -1726,8 +1721,6 @@ static void scst_local_release_adapter(struct device *dev) TRACE_ENTRY(); sess = to_scst_lcl_sess(dev); - if (sess == NULL) - goto out; /* * At this point the SCSI device is almost gone because the SCSI @@ -1760,7 +1753,6 @@ static void scst_local_release_adapter(struct device *dev) scst_unregister_session(sess->scst_sess, false, scst_local_free_sess); -out: TRACE_EXIT(); return; } diff --git a/srpt/Makefile b/srpt/Makefile index 3a5c1c8c7..2f0e28c96 100644 --- a/srpt/Makefile +++ b/srpt/Makefile @@ -40,6 +40,8 @@ INSTALL_DIR := $(INSTALL_MOD_PATH)/lib/modules/$(KVER)/extra set_var = $(shell { if [ -e "$(1)" ]; then grep -v '^$(2)=' "$(1)"; fi; echo "$(2)=$(3)"; } >/tmp/$(1)-$$$$.tmp && mv /tmp/$(1)-$$$$.tmp $(1)) +SRC_FILES=$(wildcard */*.[ch]) + # The file Modules.symvers has been renamed in the 2.6.18 kernel to # Module.symvers. Find out which name to use by looking in $(KDIR). 
MODULE_SYMVERS:=$(shell if [ -e $(KDIR)/Module.symvers ]; then \ @@ -113,7 +115,7 @@ src/Module.symvers src/Modules.symvers: $(SCST_SYMVERS_DIR)/$(MODULE_SYMVERS) "installed."; \ false; \ else \ - echo " Building against non-OFED InfiniBand kernel headers."; \ + echo " Building against in-tree InfiniBand kernel headers."; \ cp $< $@; \ fi; \ fi @@ -141,4 +143,7 @@ extraclean: clean release-archive: ../scripts/generate-release-archive srpt "$$(sed -n 's/^#define[[:blank:]]DRV_VERSION[[:blank:]]*\"\([^\"]*\)\".*/\1/p' src/ib_srpt.c)" +kerneldoc.html: $(SRC_FILES) + $(KDIR)/scripts/kernel-doc -html $(SRC_FILES) >$@ + .PHONY: all install clean extraclean 2debug 2release 2perf diff --git a/srpt/README b/srpt/README index e15498f84..01297583d 100644 --- a/srpt/README +++ b/srpt/README @@ -411,7 +411,7 @@ Q: Loading the kernel module ib_srpt triggers a kernel panic with a call trace [] system_call_fastpath+0x16/0x1b A: This means that you are using a system on which OFED has been installed but - that ib_srpt has been compiled against the non-OFED kernel headers instead + that ib_srpt has been compiled against the in-tree kernel headers instead of the OFED kernel headers. You can fix this by rebuilding ib_srpt against the OFED kernel headers. The ib_srpt makefile should detect the OFED kernel headers automatically - at least if ib_srpt is built after OFED has been diff --git a/srpt/session-management.txt b/srpt/session-management.txt new file mode 100644 index 000000000..752218787 --- /dev/null +++ b/srpt/session-management.txt @@ -0,0 +1,45 @@ + ib_srpt and session management + ============================== + +The following actions related to SRP sessions can all occur concurrently: +* IB communication manager (CM) invokes srpt_cm_handler(). +* HCA driver invokes the queue pair (QP) completion handler srpt_completion(). +* HCA driver invokes the QP async event handler srpt_qp_event(). +* HCA transfers data between initiator and target via RDMA. 
+* srpt_compl_thread() polls the QP. +* SCST core invokes one of the callback functions defined in srpt_template(). + +The actions that occur over the lifetime of a session are as follows: +- A REQ message is received from the initiator. +- srpt_cm_req_recv() is invoked, allocates a queue pair and creates a + completion thread. +- If the connection request is not accepted, a REJ message is sent and + srpt_close_ch() is invoked. The srpt_close_ch() call causes the completion + thread to stop and the allocated resources to be freed asynchronously. +- If the connection request is accepted, a REP message is sent to the initiator. +- Once an RTU message is received from the initiator, srpt_cm_rtu_recv() is + invoked. That function changes the queue pair state into RTS, the channel + state into CH_LIVE and wakes up the completion thread. +- RDMA communication starts and continues until a DREQ message is either + received or sent. The function ib_send_cm_dreq() can get invoked + either because a target port is disabled or from inside the + srpt_close_session() function. +- After a DREQ has been sent, either a DREP will be received + (srpt_cm_drep_recv()) or the TimeWait state will be reached and then left + (srpt_cm_timewait_exit()). +- srpt_cm_dre[pq]_recv() and srpt_cm_timewait_exit() all invoke + srpt_close_ch(). +- srpt_close_ch() changes the channel state into CH_DISCONNECTING, the + queue pair state into IB_QPS_ERR and queues a zero-length write. +- Upon receipt of the zero-length write completion, the channel state is + changed into CH_DISCONNECTED. This is the last work completion, so once the + channel state has reached CH_DISCONNECTED it is guaranteed that the queue + pair completion handler won't be invoked again and also that the completion + handler won't call wake_up_process(ch->thread) anymore. +- That channel state change causes the polling loop in the completion thread + to stop and also triggers a call of scst_unregister_session(). 
+- Once scst_unregister_session() has finished srpt_unreg_sess() is invoked. +- srpt_unreg_sess() destroys the CM ID and decrements the channel refcount. +- Once the channel refcount drops to zero srpt_free_ch() is invoked which + frees the queue pair and other IB resources. + diff --git a/srpt/src/ib_srpt.c b/srpt/src/ib_srpt.c index 6b9b54fa8..05bcbc4b5 100644 --- a/srpt/src/ib_srpt.c +++ b/srpt/src/ib_srpt.c @@ -181,10 +181,9 @@ static void srpt_unregister_procfs_entry(struct scst_tgt_template *tgt); #endif /*CONFIG_SCST_PROC*/ static void srpt_unmap_sg_to_ib_sge(struct srpt_rdma_ch *ch, struct srpt_send_ioctx *ioctx); -static void srpt_drain_channel(struct ib_cm_id *cm_id); static void srpt_destroy_ch_ib(struct srpt_rdma_ch *ch); -static enum rdma_ch_state srpt_set_ch_state_to_disc(struct srpt_rdma_ch *ch) +static bool srpt_set_ch_state(struct srpt_rdma_ch *ch, enum rdma_ch_state new) { unsigned long flags; enum rdma_ch_state prev; @@ -192,57 +191,7 @@ static enum rdma_ch_state srpt_set_ch_state_to_disc(struct srpt_rdma_ch *ch) spin_lock_irqsave(&ch->spinlock, flags); prev = ch->state; - switch (prev) { - case CH_CONNECTING: - case CH_LIVE: - ch->state = CH_DISCONNECTING; - wake_up_process(ch->thread); - changed = true; - break; - default: - break; - } - spin_unlock_irqrestore(&ch->spinlock, flags); - - return prev; -} - -static bool srpt_set_ch_state_to_draining(struct srpt_rdma_ch *ch) -{ - unsigned long flags; - bool changed = false; - - spin_lock_irqsave(&ch->spinlock, flags); - switch (ch->state) { - case CH_CONNECTING: - case CH_LIVE: - case CH_DISCONNECTING: - ch->state = CH_DRAINING; - wake_up_process(ch->thread); - changed = true; - break; - default: - break; - } - spin_unlock_irqrestore(&ch->spinlock, flags); - - return changed; -} - -/** - * srpt_test_and_set_ch_state() - Test and set the channel state. - * - * Returns true if and only if the channel state has been set to the new state. 
- */ -static bool srpt_test_and_set_ch_state(struct srpt_rdma_ch *ch, - enum rdma_ch_state old, - enum rdma_ch_state new) -{ - unsigned long flags; - bool changed = false; - - spin_lock_irqsave(&ch->spinlock, flags); - if (ch->state == old) { + if (new > prev) { ch->state = new; wake_up_process(ch->thread); changed = true; @@ -341,6 +290,7 @@ static void srpt_event_handler(struct ib_event_handler *handler, case IB_EVENT_PKEY_CHANGE: case IB_EVENT_SM_CHANGE: case IB_EVENT_CLIENT_REREGISTER: + case IB_EVENT_GID_CHANGE: /* Refresh port data asynchronously. */ port_num = event->element.port_num - 1; if (port_num < sdev->device->phys_port_cnt) { @@ -378,8 +328,8 @@ static const char *get_ch_state_name(enum rdma_ch_state s) return "live"; case CH_DISCONNECTING: return "disconnecting"; - case CH_DRAINING: - return "draining"; + case CH_DISCONNECTED: + return "disconnected"; } return "???"; } @@ -389,8 +339,6 @@ static const char *get_ch_state_name(enum rdma_ch_state s) */ static void srpt_qp_event(struct ib_event *event, struct srpt_rdma_ch *ch) { - unsigned long flags; - TRACE_DBG("QP event %d on cm_id=%p sess_name=%s state=%s", event->event, ch->cm_id, ch->sess_name, get_ch_state_name(ch->state)); @@ -408,11 +356,6 @@ static void srpt_qp_event(struct ib_event *event, struct srpt_rdma_ch *ch) case IB_EVENT_QP_LAST_WQE_REACHED: TRACE_DBG("%s, state %s: received Last WQE event.", ch->sess_name, get_ch_state_name(ch->state)); - BUG_ON(!ch->thread); - spin_lock_irqsave(&ch->spinlock, flags); - ch->last_wqe_received = true; - wake_up_process(ch->thread); - spin_unlock_irqrestore(&ch->spinlock, flags); break; default: PRINT_ERROR("received unrecognized IB QP event %d", @@ -926,13 +869,11 @@ static enum srpt_command_state srpt_set_cmd_state(struct srpt_send_ioctx *ioctx, { enum srpt_command_state previous; - BUG_ON(!ioctx); + EXTRACHECKS_BUG_ON(!ioctx); - spin_lock(&ioctx->spinlock); previous = ioctx->state; if (previous != SRPT_STATE_DONE) ioctx->state = new; - 
spin_unlock(&ioctx->spinlock); return previous; } @@ -948,15 +889,13 @@ static bool srpt_test_and_set_cmd_state(struct srpt_send_ioctx *ioctx, { enum srpt_command_state previous; - WARN_ON(!ioctx); - WARN_ON(old == SRPT_STATE_DONE); - WARN_ON(new == SRPT_STATE_NEW); + EXTRACHECKS_BUG_ON(!ioctx); + EXTRACHECKS_BUG_ON(old == SRPT_STATE_DONE); + EXTRACHECKS_BUG_ON(new == SRPT_STATE_NEW); - spin_lock(&ioctx->spinlock); previous = ioctx->state; if (previous == old) ioctx->state = new; - spin_unlock(&ioctx->spinlock); return previous == old; } @@ -986,15 +925,7 @@ static int srpt_post_recv(struct srpt_device *sdev, static int srpt_adjust_srq_wr_avail(struct srpt_rdma_ch *ch, int delta) { - int res; - unsigned long flags; - - spin_lock_irqsave(&ch->spinlock, flags); - ch->sq_wr_avail += delta; - res = ch->sq_wr_avail; - spin_unlock_irqrestore(&ch->spinlock, flags); - - return res; + return atomic_add_return(delta, &ch->sq_wr_avail); } /** @@ -1038,6 +969,25 @@ out: return ret; } +/** + * srpt_zerolength_write() - Perform a zero-length RDMA write. + * + * A quote from the InfiniBand specification: C9-88: For an HCA responder + * using Reliable Connection service, for each zero-length RDMA READ or WRITE + * request, the R_Key shall not be validated, even if the request includes + * Immediate data. + */ +static int srpt_zerolength_write(struct srpt_rdma_ch *ch) +{ + struct ib_send_wr wr, *bad_wr; + + memset(&wr, 0, sizeof(wr)); + wr.opcode = IB_WR_RDMA_WRITE; + wr.wr_id = encode_wr_id(SRPT_RDMA_ZEROLENGTH_WRITE, 0xffffffffUL); + wr.send_flags = IB_SEND_SIGNALED; + return ib_post_send(ch->qp, &wr, &bad_wr); +} + /** * srpt_get_desc_tbl() - Parse the data descriptors of an SRP_CMD request. * @ioctx: Pointer to the I/O context associated with the request. 
@@ -1060,6 +1010,7 @@ static int srpt_get_desc_tbl(struct srpt_send_ioctx *ioctx, struct srp_direct_buf *db; unsigned add_cdb_offset; int ret; + u8 fmt; /* * The pointer computations below will only be compiled correctly @@ -1084,13 +1035,18 @@ static int srpt_get_desc_tbl(struct srpt_send_ioctx *ioctx, * buffer descriptor format, and the highest four bits contain the * DATA-OUT buffer descriptor format. */ - *dir = SCST_DATA_NONE; - if (srp_cmd->buf_fmt & 0xf) + fmt = srp_cmd->buf_fmt; + if (fmt & 0xf) { /* DATA-IN: transfer data from target to initiator (read). */ *dir = SCST_DATA_READ; - else if (srp_cmd->buf_fmt >> 4) + fmt = fmt & 0xf; + } else if (fmt >> 4) { /* DATA-OUT: transfer data from initiator to target (write). */ *dir = SCST_DATA_WRITE; + fmt = fmt >> 4; + } else { + *dir = SCST_DATA_NONE; + } /* * According to the SRP spec, the lower two bits of the 'ADDITIONAL @@ -1098,8 +1054,7 @@ static int srpt_get_desc_tbl(struct srpt_send_ioctx *ioctx, * is four times the value specified in bits 3..7. Hence the "& ~3". 
*/ add_cdb_offset = srp_cmd->add_cdb_len & ~3; - if (((srp_cmd->buf_fmt & 0xf) == SRP_DATA_DESC_DIRECT) || - ((srp_cmd->buf_fmt >> 4) == SRP_DATA_DESC_DIRECT)) { + if (fmt == SRP_DATA_DESC_DIRECT) { ioctx->n_rbuf = 1; ioctx->rbufs = &ioctx->single_rbuf; @@ -1107,8 +1062,7 @@ static int srpt_get_desc_tbl(struct srpt_send_ioctx *ioctx, + add_cdb_offset); memcpy(ioctx->rbufs, db, sizeof(*db)); *data_len = be32_to_cpu(db->len); - } else if (((srp_cmd->buf_fmt & 0xf) == SRP_DATA_DESC_INDIRECT) || - ((srp_cmd->buf_fmt >> 4) == SRP_DATA_DESC_INDIRECT)) { + } else if (fmt == SRP_DATA_DESC_INDIRECT) { idb = (struct srp_indirect_buf *)(srp_cmd->add_data + add_cdb_offset); @@ -1142,7 +1096,11 @@ static int srpt_get_desc_tbl(struct srpt_send_ioctx *ioctx, db = idb->desc_list; memcpy(ioctx->rbufs, db, ioctx->n_rbuf * sizeof(*db)); *data_len = be32_to_cpu(idb->len); + } else if (fmt != 0) { + PRINT_ERROR("Unsupported data format %d\n", fmt); + ret = -EINVAL; } + out: return ret; } @@ -1353,44 +1311,31 @@ static void srpt_put_send_ioctx(struct srpt_send_ioctx *ioctx) * srpt_abort_cmd() - Make SCST stop processing a SCSI command. * @ioctx: I/O context associated with the SCSI command. * @context: Preferred execution context. + * + * Must only be called when the I/O context is in a state where it is waiting + * for the HCA. */ static void srpt_abort_cmd(struct srpt_send_ioctx *ioctx, enum scst_exec_context context) { - struct scst_cmd *scmnd; - enum srpt_command_state state; + struct scst_cmd *scmnd = &ioctx->scmnd; + enum srpt_command_state state = ioctx->state; TRACE_ENTRY(); - BUG_ON(!ioctx); - - /* - * If the command is in a state where the target core is waiting for - * the ib_srpt driver, change the state to the next state. Changing - * the state of the command from SRPT_STATE_NEED_DATA to - * SRPT_STATE_DATA_IN ensures that srpt_xmit_response() will call this - * function a second time. 
- */ - spin_lock(&ioctx->spinlock); - state = ioctx->state; switch (state) { case SRPT_STATE_NEED_DATA: ioctx->state = SRPT_STATE_DATA_IN; break; - case SRPT_STATE_DATA_IN: case SRPT_STATE_CMD_RSP_SENT: case SRPT_STATE_MGMT_RSP_SENT: ioctx->state = SRPT_STATE_DONE; break; default: + WARN_ONCE(true, "%s: unexpected I/O context state %d\n", + __func__, state); break; } - spin_unlock(&ioctx->spinlock); - - if (state == SRPT_STATE_DONE) - goto out; - - scmnd = &ioctx->scmnd; WARN_ON(ioctx != scst_cmd_get_tgt_priv(scmnd)); @@ -1401,11 +1346,7 @@ static void srpt_abort_cmd(struct srpt_send_ioctx *ioctx, case SRPT_STATE_NEW: case SRPT_STATE_DATA_IN: case SRPT_STATE_MGMT: - /* - * Do nothing - defer abort processing until - * srpt_xmit_response() is invoked. - */ - WARN_ON(!scst_cmd_aborted_on_xmit(scmnd)); + case SRPT_STATE_DONE: break; case SRPT_STATE_NEED_DATA: /* SCST_DATA_WRITE - RDMA read error or RDMA read timeout. */ @@ -1429,60 +1370,74 @@ static void srpt_abort_cmd(struct srpt_send_ioctx *ioctx, * management commands. Note: the SCST core frees these * commands immediately after srpt_tsk_mgmt_done() returned. */ - WARN(true, "Unexpected command state %d", state); - break; - default: - WARN(true, "Unexpected command state %d", state); + WARN(true, "Unexpected command state %d\n", state); break; } -out: - ; - TRACE_EXIT(); } +void srpt_on_abort_cmd(struct scst_cmd *cmd) +{ + struct srpt_send_ioctx *ioctx = scst_cmd_get_tgt_priv(cmd); + struct srpt_rdma_ch *ch = ioctx->ch; + + if (ch->state >= CH_DISCONNECTED) { + switch (ioctx->state) { + case SRPT_STATE_NEW: + case SRPT_STATE_DATA_IN: + case SRPT_STATE_MGMT: + case SRPT_STATE_DONE: + /* + * An SCST command thread is busy processing the + * command associated with the I/O context, so wait + * until that processing has finished. 
+ */ + break; + case SRPT_STATE_NEED_DATA: + case SRPT_STATE_CMD_RSP_SENT: + case SRPT_STATE_MGMT_RSP_SENT: + PRINT_ERROR("Cmd %p: IB completion for idx %u has not" + " been received in time (SRPT command state" + " %d)", cmd, ioctx->ioctx.index, + ioctx->state); + srpt_abort_cmd(ioctx, SCST_CONTEXT_THREAD); + break; + } + } +} + /** * srpt_handle_send_err_comp() - Process an IB_WC_SEND error completion. */ static void srpt_handle_send_err_comp(struct srpt_rdma_ch *ch, u64 wr_id, enum scst_exec_context context) { - struct srpt_send_ioctx *ioctx; - enum srpt_command_state state; - struct scst_cmd *scmnd; - u32 index; + u32 index = idx_from_wr_id(wr_id); + struct srpt_send_ioctx *ioctx = ch->ioctx_ring[index]; + enum srpt_command_state state = ioctx->state; srpt_adjust_srq_wr_avail(ch, 1); - index = idx_from_wr_id(wr_id); - ioctx = ch->ioctx_ring[index]; - state = ioctx->state; - scmnd = &ioctx->scmnd; - - EXTRACHECKS_WARN_ON(state != SRPT_STATE_CMD_RSP_SENT - && state != SRPT_STATE_MGMT_RSP_SENT - && state != SRPT_STATE_NEED_DATA - && state != SRPT_STATE_DONE); - - /* - * If SRP_RSP sending failed, undo the ch->req_lim and ch->req_lim_delta - * changes. 
- */ - if (state == SRPT_STATE_CMD_RSP_SENT - || state == SRPT_STATE_MGMT_RSP_SENT) - srpt_undo_inc_req_lim(ch, ioctx->req_lim_delta); switch (state) { - default: + case SRPT_STATE_NEED_DATA: + srpt_abort_cmd(ioctx, context); + break; + case SRPT_STATE_CMD_RSP_SENT: + srpt_undo_inc_req_lim(ch, ioctx->req_lim_delta); srpt_abort_cmd(ioctx, context); break; case SRPT_STATE_MGMT_RSP_SENT: + srpt_undo_inc_req_lim(ch, ioctx->req_lim_delta); srpt_put_send_ioctx(ioctx); break; case SRPT_STATE_DONE: PRINT_ERROR("Received more than one IB error completion" " for wr_id = %u.", (unsigned)index); break; + default: + EXTRACHECKS_WARN_ON(true); + break; } } @@ -1520,12 +1475,11 @@ static void srpt_handle_rdma_comp(struct srpt_rdma_ch *ch, enum srpt_opcode opcode, enum scst_exec_context context) { - struct scst_cmd *scmnd; + struct scst_cmd *scmnd = &ioctx->scmnd; EXTRACHECKS_WARN_ON(ioctx->n_rdma <= 0); srpt_adjust_srq_wr_avail(ch, ioctx->n_rdma); - scmnd = &ioctx->scmnd; if (opcode == SRPT_RDMA_READ_LAST && scmnd) { if (srpt_test_and_set_cmd_state(ioctx, SRPT_STATE_NEED_DATA, SRPT_STATE_DATA_IN)) @@ -1536,7 +1490,7 @@ static void srpt_handle_rdma_comp(struct srpt_rdma_ch *ch, } else if (opcode == SRPT_RDMA_ABORT) { ioctx->rdma_aborted = true; } else { - WARN(true, "scmnd == NULL (opcode %d)", opcode); + WARN(true, "scmnd == NULL (opcode %d)\n", opcode); } } @@ -1548,11 +1502,9 @@ static void srpt_handle_rdma_err_comp(struct srpt_rdma_ch *ch, enum srpt_opcode opcode, enum scst_exec_context context) { - struct scst_cmd *scmnd; - enum srpt_command_state state; + struct scst_cmd *scmnd = &ioctx->scmnd; + enum srpt_command_state state = ioctx->state; - scmnd = &ioctx->scmnd; - state = ioctx->state; switch (opcode) { case SRPT_RDMA_READ_LAST: if (ioctx->n_rdma <= 0) { @@ -1569,6 +1521,12 @@ static void srpt_handle_rdma_err_comp(struct srpt_rdma_ch *ch, __LINE__, state); break; case SRPT_RDMA_WRITE_LAST: + /* + * Note: if an RDMA write error completion is received that + * means that a 
SEND has also been posted. Defer further + * processing of the associated command until the send error + * completion has been received. + */ scst_set_delivery_status(scmnd, SCST_CMD_DELIVERY_ABORTED); break; default: @@ -1688,18 +1646,15 @@ static int srpt_handle_cmd(struct srpt_rdma_ch *ch, scst_data_direction dir; u64 data_len; int ret; - int atomic; BUG_ON(!send_ioctx); srp_cmd = recv_ioctx->ioctx.buf; - atomic = context == SCST_CONTEXT_TASKLET ? SCST_ATOMIC - : SCST_NON_ATOMIC; scmnd = &send_ioctx->scmnd; ret = scst_rx_cmd_prealloced(scmnd, ch->scst_sess, (u8 *) &srp_cmd->lun, sizeof(srp_cmd->lun), srp_cmd->cdb, - sizeof(srp_cmd->cdb), atomic); + sizeof(srp_cmd->cdb), in_interrupt()); if (ret) { PRINT_ERROR("tag 0x%llx: SCST command initialization failed", srp_cmd->tag); @@ -1842,40 +1797,41 @@ static u8 scst_to_srp_tsk_mgmt_status(const int scst_mgmt_status) /** * srpt_handle_new_iu() - Process a newly received information unit. - * @ch: RDMA channel through which the information unit has been received. - * @ioctx: SRPT I/O context associated with the information unit. + * @ch: RDMA channel through which the information unit has been received. + * @recv_ioctx: SRPT I/O context associated with the information unit. + * @context: SCST command processing context. 
*/ -static void srpt_handle_new_iu(struct srpt_rdma_ch *ch, - struct srpt_recv_ioctx *recv_ioctx, - struct srpt_send_ioctx *send_ioctx, - enum scst_exec_context context) +static struct srpt_send_ioctx * +srpt_handle_new_iu(struct srpt_rdma_ch *ch, + struct srpt_recv_ioctx *recv_ioctx, + enum scst_exec_context context) { + struct srpt_send_ioctx *send_ioctx = NULL; struct srp_cmd *srp_cmd; + u8 opcode; BUG_ON(!ch); BUG_ON(!recv_ioctx); + if (unlikely(ch->state == CH_CONNECTING)) + goto push; + ib_dma_sync_single_for_cpu(ch->sport->sdev->device, recv_ioctx->ioctx.dma, srp_max_req_size, DMA_FROM_DEVICE); srp_cmd = recv_ioctx->ioctx.buf; - if (unlikely(ch->state == CH_CONNECTING)) { - list_add_tail(&recv_ioctx->wait_list, &ch->cmd_wait_list); - goto out; + opcode = srp_cmd->opcode; + if (opcode == SRP_CMD || opcode == SRP_TSK_MGMT) { + send_ioctx = srpt_get_send_ioctx(ch); + if (unlikely(!send_ioctx)) + goto push; } - if (srp_cmd->opcode == SRP_CMD || srp_cmd->opcode == SRP_TSK_MGMT) { - if (!send_ioctx) - send_ioctx = srpt_get_send_ioctx(ch); - if (unlikely(!send_ioctx)) { - list_add_tail(&recv_ioctx->wait_list, - &ch->cmd_wait_list); - goto out; - } - } + if (!list_empty(&recv_ioctx->wait_list)) + list_del_init(&recv_ioctx->wait_list); - switch (srp_cmd->opcode) { + switch (opcode) { case SRP_CMD: srpt_handle_cmd(ch, recv_ioctx, send_ioctx, context); break; @@ -1895,14 +1851,20 @@ static void srpt_handle_new_iu(struct srpt_rdma_ch *ch, PRINT_ERROR("Received SRP_RSP"); break; default: - PRINT_ERROR("received IU with unknown opcode 0x%x", - srp_cmd->opcode); + PRINT_ERROR("received IU with unknown opcode 0x%x", opcode); break; } srpt_post_recv(ch->sport->sdev, recv_ioctx); + out: - return; + return send_ioctx; + +push: + if (list_empty(&recv_ioctx->wait_list)) + list_add_tail(&recv_ioctx->wait_list, + &ch->cmd_wait_list); + goto out; } static void srpt_process_rcv_completion(struct ib_cq *cq, @@ -1921,7 +1883,7 @@ static void srpt_process_rcv_completion(struct ib_cq 
*cq, if (unlikely(req_lim < 0)) PRINT_ERROR("req_lim = %d < 0", req_lim); ioctx = sdev->ioctx_ring[index]; - srpt_handle_new_iu(ch, ioctx, NULL, srpt_new_iu_context); + srpt_handle_new_iu(ch, ioctx, srpt_new_iu_context); } else { PRINT_INFO("receiving failed for idx %u with status %d", index, wc->status); @@ -1931,17 +1893,16 @@ static void srpt_process_rcv_completion(struct ib_cq *cq, static void srpt_process_wait_list(struct srpt_rdma_ch *ch) { struct srpt_recv_ioctx *recv_ioctx, *tmp; - struct srpt_send_ioctx *send_ioctx; + + ch->processing_wait_list = true; list_for_each_entry_safe(recv_ioctx, tmp, &ch->cmd_wait_list, wait_list) { - send_ioctx = srpt_get_send_ioctx(ch); - if (!send_ioctx) + if (!srpt_handle_new_iu(ch, recv_ioctx, srpt_new_iu_context)) break; - list_del(&recv_ioctx->wait_list); - srpt_handle_new_iu(ch, recv_ioctx, send_ioctx, - srpt_new_iu_context); } + + ch->processing_wait_list = false; } /** @@ -1976,8 +1937,12 @@ static void srpt_process_send_completion(struct ib_cq *cq, opcode == SRPT_RDMA_ABORT) { srpt_handle_rdma_comp(ch, ch->ioctx_ring[index], opcode, srpt_xmt_rsp_context); + } else if (opcode == SRPT_RDMA_ZEROLENGTH_WRITE) { + WARN_ONCE(true, "%s: QP not in error state\n", + ch->sess_name); + WARN_ON_ONCE(!srpt_set_ch_state(ch, CH_DISCONNECTED)); } else { - WARN(true, "unexpected opcode %d", opcode); + WARN(true, "unexpected opcode %d\n", opcode); } } else { if (opcode == SRPT_SEND) { @@ -1993,40 +1958,58 @@ static void srpt_process_send_completion(struct ib_cq *cq, " and cables.", opcode, index, wc->status); srpt_handle_rdma_err_comp(ch, ch->ioctx_ring[index], opcode, srpt_xmt_rsp_context); + } else if (opcode == SRPT_RDMA_ZEROLENGTH_WRITE) { + WARN_ON_ONCE(!srpt_set_ch_state(ch, CH_DISCONNECTED)); } else if (opcode != SRPT_RDMA_MID) { - WARN(true, "unexpected opcode %d", opcode); + WARN(true, "unexpected opcode %d\n", opcode); } } if (unlikely(!list_empty(&ch->cmd_wait_list) && - ch->state != CH_CONNECTING)) + ch->state != 
CH_CONNECTING && + !ch->processing_wait_list)) srpt_process_wait_list(ch); } -static void srpt_poll(struct srpt_rdma_ch *ch) +static void srpt_process_one_compl(struct srpt_rdma_ch *ch, struct ib_wc *wc) +{ + struct ib_cq *const cq = ch->cq; + + if (opcode_from_wr_id(wc->wr_id) == SRPT_RECV) + srpt_process_rcv_completion(cq, ch, wc); + else + srpt_process_send_completion(cq, ch, wc); +} + +static int srpt_poll(struct srpt_rdma_ch *ch, int budget) { struct ib_cq *const cq = ch->cq; struct ib_wc *const wc = ch->wc; - int i, n; + int i, n, processed = 0; - while ((n = ib_poll_cq(cq, ARRAY_SIZE(ch->wc), wc)) > 0) { - for (i = 0; i < n; i++) { - if (opcode_from_wr_id(wc[i].wr_id) == SRPT_RECV) - srpt_process_rcv_completion(cq, ch, &wc[i]); - else - srpt_process_send_completion(cq, ch, &wc[i]); - } + while ((n = ib_poll_cq(cq, min_t(int, ARRAY_SIZE(ch->wc), budget), + wc)) > 0) { + for (i = 0; i < n; i++) + srpt_process_one_compl(ch, &wc[i]); + budget -= n; + processed += n; } + + return processed; } -static void srpt_process_completion(struct srpt_rdma_ch *ch) +static int srpt_process_completion(struct srpt_rdma_ch *ch, int budget) { struct ib_cq *const cq = ch->cq; + int processed = 0, n = budget; do { - srpt_poll(ch); - } while (ib_req_notify_cq(cq, IB_CQ_NEXT_COMP | - IB_CQ_REPORT_MISSED_EVENTS) > 0); + processed += srpt_poll(ch, n); + n = ib_req_notify_cq(cq, IB_CQ_NEXT_COMP | + IB_CQ_REPORT_MISSED_EVENTS); + } while (n > 0); + + return processed; } /** @@ -2036,7 +2019,6 @@ static void srpt_completion(struct ib_cq *cq, void *ctx) { struct srpt_rdma_ch *ch = ctx; - BUG_ON(!ch->thread); wake_up_process(ch->thread); } @@ -2044,11 +2026,6 @@ static void srpt_free_ch(struct kref *kref) { struct srpt_rdma_ch *ch = container_of(kref, struct srpt_rdma_ch, kref); - /* - * The function call below will wait for the completion handler - * callback to finish and hence ensures that wake_up_process() won't - * be invoked anymore from that callback for the current thread. 
- */ srpt_destroy_ch_ib(ch); kfree(ch); @@ -2066,19 +2043,6 @@ static void srpt_unreg_sess(struct scst_session *scst_sess) sdev, ch->rq_size, ch->max_rsp_size, DMA_TO_DEVICE); - /* - * Note: if a DREQ is received after ch->dreq_received has been read, - * ib_destroy_cm_id() will send a DREP. - * - */ - if (ch->dreq_received) { - if (ib_send_cm_drep(ch->cm_id, NULL, 0) >= 0) - PRINT_INFO("Received DREQ and sent DREP for session %s", - ch->sess_name); - else - PRINT_ERROR("Sending DREP failed"); - } - /* * If the connection is still established, ib_destroy_cm_id() will * send a DREQ. @@ -2099,6 +2063,7 @@ static void srpt_unreg_sess(struct scst_session *scst_sess) static int srpt_compl_thread(void *arg) { + enum { poll_budget = 65536 }; struct srpt_rdma_ch *ch; /* Hibernation / freezing of the SRPT kernel thread is not supported. */ @@ -2107,30 +2072,12 @@ static int srpt_compl_thread(void *arg) ch = arg; BUG_ON(!ch); - set_current_state(TASK_INTERRUPTIBLE); -#if defined(__GNUC__) -#if (__GNUC__ * 100 + __GNUC_MINOR__) <= 406 - /* See also http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52925. */ - barrier(); -#endif -#endif - while (!ch->last_wqe_received && ch->state <= CH_LIVE) { - srpt_process_completion(ch); - schedule(); + while (ch->state < CH_DISCONNECTED) { set_current_state(TASK_INTERRUPTIBLE); - } - set_current_state(TASK_RUNNING); - - /* - * Process all IB (error) completions before invoking - * scst_unregister_session(). 
- */ - for (;;) { - set_current_state(TASK_INTERRUPTIBLE); - srpt_process_completion(ch); - if (atomic_read(&ch->scst_sess->sess_cmd_count) == 0) - break; - schedule_timeout(HZ / 10); + if (srpt_process_completion(ch, poll_budget) >= poll_budget) + cond_resched(); + else + schedule(); } set_current_state(TASK_RUNNING); @@ -2208,7 +2155,7 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch) TRACE_DBG("qp_num = %#x", ch->qp->qp_num); - ch->sq_wr_avail = qp_init->cap.max_send_wr; + atomic_set(&ch->sq_wr_avail, qp_init->cap.max_send_wr); TRACE_DBG("%s: max_cqe= %d max_sge= %d sq_size = %d" " cm_id= %p", __func__, ch->cq->cqe, @@ -2235,15 +2182,8 @@ err_destroy_cq: static void srpt_destroy_ch_ib(struct srpt_rdma_ch *ch) { - TRACE_ENTRY(); - - while (ib_poll_cq(ch->cq, ARRAY_SIZE(ch->wc), ch->wc) > 0) - ; - ib_destroy_qp(ch->qp); ib_destroy_cq(ch->cq); - - TRACE_EXIT(); } /** @@ -2260,7 +2200,6 @@ static bool __srpt_close_ch(struct srpt_rdma_ch *ch) __acquires(&ch->srpt_tgt->spinlock) { struct srpt_tgt *srpt_tgt = ch->srpt_tgt; - enum rdma_ch_state prev_state; int ret; bool was_live; @@ -2268,28 +2207,23 @@ static bool __srpt_close_ch(struct srpt_rdma_ch *ch) lockdep_assert_held(&srpt_tgt->spinlock); #endif - was_live = false; - - prev_state = srpt_set_ch_state_to_disc(ch); - - switch (prev_state) { - case CH_CONNECTING: - case CH_LIVE: - was_live = true; - break; - case CH_DISCONNECTING: - case CH_DRAINING: - break; - } - + was_live = srpt_set_ch_state(ch, CH_DISCONNECTING); if (was_live) { kref_get(&ch->kref); spin_unlock_irq(&srpt_tgt->spinlock); ret = srpt_ch_qp_err(ch); if (ret < 0) - PRINT_ERROR("Setting queue pair in error state" - " failed: %d", ret); + PRINT_ERROR("%s: changing queue pair into error state" + " failed: %d", ch->sess_name, ret); + + ret = srpt_zerolength_write(ch); + if (ret < 0) { + PRINT_ERROR("%s: queuing zero-length write failed: %d", + ch->sess_name, ret); + WARN_ON_ONCE(!srpt_set_ch_state(ch, CH_DISCONNECTED)); + } + kref_put(&ch->kref, 
srpt_free_ch); spin_lock_irq(&srpt_tgt->spinlock); @@ -2310,36 +2244,9 @@ static void srpt_close_ch(struct srpt_rdma_ch *ch) spin_unlock_irq(&srpt_tgt->spinlock); } -/** - * srpt_drain_channel() - Drain a channel by resetting the IB queue pair. - * @cm_id: Pointer to the CM ID of the channel to be drained. - * - * Note: Must be called from inside srpt_cm_handler to avoid a race between - * accessing sdev->spinlock and the call to kfree(sdev) in srpt_remove_one() - * (the caller of srpt_cm_handler holds the cm_id spinlock; srpt_remove_one() - * waits until all target sessions for the associated IB device have been - * unregistered and target session registration involves a call to - * ib_destroy_cm_id(), which locks the cm_id spinlock and hence waits until - * this function has finished). - */ -static void srpt_drain_channel(struct ib_cm_id *cm_id) -{ - struct srpt_rdma_ch *ch; - int ret; - - WARN_ON_ONCE(irqs_disabled()); - - ch = cm_id->context; - if (srpt_set_ch_state_to_draining(ch)) { - ret = srpt_ch_qp_err(ch); - if (ret < 0) - PRINT_ERROR("Setting queue pair in error state" - " failed: %d", ret); - } -} - static void __srpt_close_all_ch(struct srpt_tgt *srpt_tgt) { + struct srpt_nexus *nexus; struct srpt_rdma_ch *ch; #if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 32) @@ -2347,14 +2254,15 @@ static void __srpt_close_all_ch(struct srpt_tgt *srpt_tgt) #endif restart: - list_for_each_entry(ch, &srpt_tgt->rch_list, list) { - if (ch->state >= CH_DISCONNECTING) - continue; - PRINT_INFO("Closing channel %s because target %s has been" - " disabled", ch->sess_name, - srpt_tgt->scst_tgt->tgt_name); - WARN_ON_ONCE(!__srpt_close_ch(ch)); - goto restart; + list_for_each_entry(nexus, &srpt_tgt->nexus_list, entry) { + list_for_each_entry(ch, &nexus->ch_list, list) { + if (ib_send_cm_dreq(ch->cm_id, NULL, 0) < 0) + continue; + PRINT_INFO("Closing channel %s because target %s has" + " been disabled", ch->sess_name, + srpt_tgt->scst_tgt->tgt_name); + goto restart; + } } } @@ 
-2374,6 +2282,48 @@ static struct srpt_tgt *srpt_convert_scst_tgt(struct scst_tgt *scst_tgt) return srpt_tgt; } +/* + * Look up (i_port_id, t_port_id) in srpt_tgt->nexus_list. Create an entry if + * it does not yet exist. + */ +static struct srpt_nexus *srpt_get_nexus(struct srpt_tgt *srpt_tgt, + u8 i_port_id[16], u8 t_port_id[16]) +{ + unsigned long flags; + struct srpt_nexus *nexus = NULL, *tmp_nexus = NULL, *n; + + for (;;) { + spin_lock_irqsave(&srpt_tgt->spinlock, flags); + list_for_each_entry(n, &srpt_tgt->nexus_list, entry) { + if (memcmp(n->i_port_id, i_port_id, 16) == 0 && + memcmp(n->t_port_id, t_port_id, 16) == 0) { + nexus = n; + break; + } + } + if (!nexus && tmp_nexus) { + list_add_tail(&tmp_nexus->entry, &srpt_tgt->nexus_list); + swap(nexus, tmp_nexus); + } + spin_unlock_irqrestore(&srpt_tgt->spinlock, flags); + + if (nexus) + break; + tmp_nexus = kzalloc(sizeof(*nexus), GFP_KERNEL); + if (!tmp_nexus) { + nexus = ERR_PTR(-ENOMEM); + break; + } + INIT_LIST_HEAD(&tmp_nexus->ch_list); + memcpy(tmp_nexus->i_port_id, i_port_id, 16); + memcpy(tmp_nexus->t_port_id, t_port_id, 16); + } + + kfree(tmp_nexus); + + return nexus; +} + #if !defined(CONFIG_SCST_PROC) /** * srpt_enable_target - Set the "enabled" status of a target. @@ -2388,8 +2338,8 @@ static int srpt_enable_target(struct scst_tgt *scst_tgt, bool enable) if (!srpt_tgt) goto out; - TRACE_DBG("%s target %s", enable ? "Enabling" : "Disabling", - scst_tgt->tgt_name); + PRINT_INFO("%s target %s", enable ? "Enabling" : "Disabling", + scst_tgt->tgt_name); spin_lock_irq(&srpt_tgt->spinlock); srpt_tgt->enabled = enable; @@ -2428,10 +2378,11 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, struct srpt_port *const sport = &sdev->port[param->port - 1]; struct srpt_tgt *const srpt_tgt = one_target_per_port ? 
&sport->srpt_tgt : &sdev->srpt_tgt; + struct srpt_nexus *nexus; struct srp_login_req *req; - struct srp_login_rsp *rsp; - struct srp_login_rej *rej; - struct ib_cm_rep_param *rep_param; + struct srp_login_rsp *rsp = NULL; + struct srp_login_rej *rej = NULL; + struct ib_cm_rep_param *rep_param = NULL; struct srpt_rdma_ch *ch = NULL; struct task_struct *thread; u32 it_iu_len; @@ -2485,6 +2436,13 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, be16_to_cpu(*(__be16 *)&sdev->port[param->port - 1].gid.raw[12]), be16_to_cpu(*(__be16 *)&sdev->port[param->port - 1].gid.raw[14])); + nexus = srpt_get_nexus(srpt_tgt, req->initiator_port_id, + req->target_port_id); + if (IS_ERR(nexus)) { + ret = PTR_ERR(nexus); + goto out; + } + ret = -ENOMEM; rsp = kzalloc(sizeof(*rsp), GFP_KERNEL); rej = kzalloc(sizeof(*rej), GFP_KERNEL); @@ -2537,8 +2495,7 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, PRINT_ERROR("Translating pkey %#x failed (%d) - using index 0", be16_to_cpu(param->primary_path->pkey), ret); } - memcpy(ch->i_port_id, req->initiator_port_id, 16); - memcpy(ch->t_port_id, req->target_port_id, 16); + ch->nexus = nexus; ch->sport = sport; ch->srpt_tgt = srpt_tgt; ch->cm_id = cm_id; @@ -2600,7 +2557,7 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, "0x%016llx%016llx", be64_to_cpu(*(__be64 *) &sdev->port[param->port - 1].gid.raw[8]), - be64_to_cpu(*(__be64 *)(ch->i_port_id + 8))); + be64_to_cpu(*(__be64 *)(nexus->i_port_id + 8))); } else { /* * Default behavior: use the initiator port identifier as the @@ -2608,16 +2565,16 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, */ snprintf(ch->sess_name, sizeof(ch->sess_name), "0x%016llx%016llx", - be64_to_cpu(*(__be64 *)ch->i_port_id), - be64_to_cpu(*(__be64 *)(ch->i_port_id + 8))); + be64_to_cpu(*(__be64 *)nexus->i_port_id), + be64_to_cpu(*(__be64 *)(nexus->i_port_id + 8))); } TRACE_DBG("registering session %s", ch->sess_name); BUG_ON(!srpt_tgt->scst_tgt); ret = -ENOMEM; - ch->scst_sess = 
scst_register_session(srpt_tgt->scst_tgt, 0, ch->sess_name, - ch, NULL, NULL); + ch->scst_sess = scst_register_session(srpt_tgt->scst_tgt, 0, + ch->sess_name, ch, NULL, NULL); if (!ch->scst_sess) { rej->reason = cpu_to_be32(SRP_LOGIN_REJ_INSUFFICIENT_RESOURCES); TRACE_DBG("Failed to create SCST session"); @@ -2640,28 +2597,19 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, rsp->rsp_flags = SRP_LOGIN_RSP_MULTICHAN_NO_CHAN; restart: - list_for_each_entry(ch2, &srpt_tgt->rch_list, list) { - if (!memcmp(ch2->i_port_id, req->initiator_port_id, 16) - && param->port == ch2->sport->port - && param->listen_id == ch2->sport->sdev->cm_id) { - if (!__srpt_close_ch(ch2)) - continue; - - PRINT_INFO("Relogin - closed existing channel" - " %s; cm_id = %p", ch2->sess_name, - ch2->cm_id); - - rsp->rsp_flags = - SRP_LOGIN_RSP_MULTICHAN_TERMINATED; - - goto restart; - } + list_for_each_entry(ch2, &nexus->ch_list, list) { + if (ib_send_cm_dreq(ch2->cm_id, NULL, 0) < 0) + continue; + PRINT_INFO("Relogin - closed existing channel %s", + ch2->sess_name); + rsp->rsp_flags = SRP_LOGIN_RSP_MULTICHAN_TERMINATED; + goto restart; } } else { rsp->rsp_flags = SRP_LOGIN_RSP_MULTICHAN_MAINTAINED; } - list_add_tail(&ch->list, &srpt_tgt->rch_list); + list_add_tail(&ch->list, &nexus->ch_list); ch->thread = thread; if (!srpt_tgt->enabled) { @@ -2778,7 +2726,6 @@ out: static void srpt_cm_rej_recv(struct ib_cm_id *cm_id) { PRINT_INFO("Received InfiniBand REJ packet for cm_id %p.", cm_id); - srpt_drain_channel(cm_id); } /** @@ -2793,35 +2740,49 @@ static void srpt_cm_rtu_recv(struct ib_cm_id *cm_id) int ret; ret = srpt_ch_qp_rts(ch, ch->qp); - if (ret == 0 && srpt_test_and_set_ch_state(ch, CH_CONNECTING, - CH_LIVE)) { - wake_up_process(ch->thread); - } else { + if (ret < 0) { + PRINT_ERROR("%s: QP transition to RTS failed", ch->sess_name); srpt_close_ch(ch); + return; } + /* + * Note: calling srpt_close_ch() if the transition to the LIVE state + * fails is not necessary since that means that that 
function has + * already been invoked from another thread. + */ + if (!srpt_set_ch_state(ch, CH_LIVE)) + PRINT_ERROR("%s: Channel transition to LIVE state failed", + ch->sess_name); } static void srpt_cm_timewait_exit(struct ib_cm_id *cm_id) { + struct srpt_rdma_ch *ch = cm_id->context; + PRINT_INFO("Received InfiniBand TimeWait exit for cm_id %p.", cm_id); - srpt_drain_channel(cm_id); + srpt_close_ch(ch); } static void srpt_cm_rep_error(struct ib_cm_id *cm_id) { PRINT_INFO("Received InfiniBand REP error for cm_id %p.", cm_id); - srpt_drain_channel(cm_id); } /** * srpt_cm_dreq_recv() - Process reception of a DREQ message. */ -static void srpt_cm_dreq_recv(struct ib_cm_id *cm_id) +static int srpt_cm_dreq_recv(struct ib_cm_id *cm_id) { struct srpt_rdma_ch *ch = cm_id->context; + int ret; - ch->dreq_received = true; - srpt_set_ch_state_to_disc(ch); + ret = ib_send_cm_drep(cm_id, NULL, 0); + if (ret < 0) + PRINT_ERROR("%s: sending DREP failed", ch->sess_name); + + srpt_close_ch(ch); + + return ret; } /** @@ -2829,8 +2790,10 @@ static void srpt_cm_dreq_recv(struct ib_cm_id *cm_id) */ static void srpt_cm_drep_recv(struct ib_cm_id *cm_id) { + struct srpt_rdma_ch *ch = cm_id->context; + PRINT_INFO("Received InfiniBand DREP message for cm_id %p.", cm_id); - srpt_drain_channel(cm_id); + srpt_close_ch(ch); } /** @@ -2863,7 +2826,7 @@ static int srpt_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event) srpt_cm_rtu_recv(cm_id); break; case IB_CM_DREQ_RECEIVED: - srpt_cm_dreq_recv(cm_id); + ret = srpt_cm_dreq_recv(cm_id); break; case IB_CM_DREP_RECEIVED: srpt_cm_drep_recv(cm_id); @@ -3181,13 +3144,17 @@ static int srpt_perform_rdmas(struct srpt_rdma_ch *ch, wr.num_sge = 0; wr.wr_id = encode_wr_id(SRPT_RDMA_ABORT, ioctx->ioctx.index); wr.send_flags = IB_SEND_SIGNALED; + PRINT_INFO("Trying to abort failed RDMA transfer [%d]", + ioctx->ioctx.index); while (ch->state == CH_LIVE && ib_post_send(ch->qp, &wr, &bad_wr) != 0) { PRINT_INFO("Trying to abort failed RDMA transfer 
[%d]", ioctx->ioctx.index); msleep(1000); } - while (ch->state != CH_DRAINING && !ioctx->rdma_aborted) { + PRINT_INFO("Waiting until RDMA abort finished [%d]", + ioctx->ioctx.index); + while (ch->state < CH_DISCONNECTED && !ioctx->rdma_aborted) { PRINT_INFO("Waiting until RDMA abort finished [%d]", ioctx->ioctx.index); msleep(1000); @@ -3327,7 +3294,6 @@ static int srpt_xmit_response(struct scst_cmd *scmnd) ch = scst_sess_get_tgt_priv(scst_cmd_get_session(scmnd)); BUG_ON(!ch); - spin_lock(&ioctx->spinlock); state = ioctx->state; switch (state) { case SRPT_STATE_NEW: @@ -3335,10 +3301,9 @@ static int srpt_xmit_response(struct scst_cmd *scmnd) ioctx->state = SRPT_STATE_CMD_RSP_SENT; break; default: - WARN(true, "Unexpected command state %d", state); + WARN(true, "Unexpected command state %d\n", state); break; } - spin_unlock(&ioctx->spinlock); if (unlikely(scst_cmd_aborted_on_xmit(scmnd))) { srpt_adjust_req_lim(ch, 0, 1); @@ -3467,7 +3432,8 @@ static int srpt_get_initiator_port_transport_id(struct scst_tgt *tgt, res = 0; tr_id->protocol_identifier = SCSI_TRANSPORTID_PROTOCOLID_SRP; - memcpy(tr_id->i_port_id, ch->i_port_id, sizeof(ch->i_port_id)); + memcpy(tr_id->i_port_id, ch->nexus->i_port_id, + sizeof(tr_id->i_port_id)); *transport_id = (uint8_t *)tr_id; @@ -3527,16 +3493,19 @@ static int srpt_close_session(struct scst_session *sess) { struct srpt_rdma_ch *ch = scst_sess_get_tgt_priv(sess); - srpt_close_ch(ch); + ib_send_cm_dreq(ch->cm_id, NULL, 0); return 0; } -static int srpt_ch_list_empty(struct srpt_tgt *srpt_tgt) +static bool srpt_ch_list_empty(struct srpt_tgt *srpt_tgt) { - int res; + struct srpt_nexus *nexus; + bool res = true; spin_lock_irq(&srpt_tgt->spinlock); - res = list_empty(&srpt_tgt->rch_list); + list_for_each_entry(nexus, &srpt_tgt->nexus_list, entry) + if (!list_empty(&nexus->ch_list)) + res = false; spin_unlock_irq(&srpt_tgt->spinlock); return res; @@ -3547,6 +3516,7 @@ static int srpt_ch_list_empty(struct srpt_tgt *srpt_tgt) */ static int 
srpt_release_sport(struct srpt_tgt *srpt_tgt) { + struct srpt_nexus *nexus, *next_n; struct srpt_rdma_ch *ch; TRACE_ENTRY(); @@ -3565,14 +3535,24 @@ static int srpt_release_sport(struct srpt_tgt *srpt_tgt) PRINT_INFO("%s: waiting for session unregistration ...", srpt_tgt->scst_tgt->tgt_name); spin_lock_irq(&srpt_tgt->spinlock); - list_for_each_entry(ch, &srpt_tgt->rch_list, list) { - PRINT_INFO("%s: state %s; %d commands in progress", - ch->sess_name, get_ch_state_name(ch->state), + list_for_each_entry(nexus, &srpt_tgt->nexus_list, entry) { + list_for_each_entry(ch, &nexus->ch_list, list) { + PRINT_INFO("%s: state %s; %d commands in" + " progress", ch->sess_name, + get_ch_state_name(ch->state), atomic_read(&ch->scst_sess->sess_cmd_count)); + } } spin_unlock_irq(&srpt_tgt->spinlock); } + spin_lock_irq(&srpt_tgt->spinlock); + list_for_each_entry_safe(nexus, next_n, &srpt_tgt->nexus_list, entry) { + list_del(&nexus->entry); + kfree(nexus); + } + spin_unlock_irq(&srpt_tgt->spinlock); + TRACE_EXIT(); return 0; } @@ -3783,6 +3763,7 @@ static struct scst_tgt_template srpt_template = { .close_session = srpt_close_session, .xmit_response = srpt_xmit_response, .rdy_to_xfer = srpt_rdy_to_xfer, + .on_abort_cmd = srpt_on_abort_cmd, .on_hw_pending_cmd_timeout = srpt_pending_cmd_timeout, .on_free_cmd = srpt_on_free_cmd, .task_mgmt_fn_done = srpt_tsk_mgmt_done, @@ -3816,7 +3797,7 @@ static struct scst_proc_data srpt_log_proc_data = { /* Note: the caller must have zero-initialized *@srpt_tgt. 
*/ static void srpt_init_tgt(struct srpt_tgt *srpt_tgt) { - INIT_LIST_HEAD(&srpt_tgt->rch_list); + INIT_LIST_HEAD(&srpt_tgt->nexus_list); init_waitqueue_head(&srpt_tgt->ch_releaseQ); spin_lock_init(&srpt_tgt->spinlock); } @@ -3956,8 +3937,10 @@ static void srpt_add_one(struct ib_device *device) goto err_event; } - for (i = 0; i < sdev->srq_size; ++i) + for (i = 0; i < sdev->srq_size; ++i) { + INIT_LIST_HEAD(&sdev->ioctx_ring[i]->wait_list); srpt_post_recv(sdev, sdev->ioctx_ring[i]); + } WARN_ON(sdev->device->phys_port_cnt > ARRAY_SIZE(sdev->port)); diff --git a/srpt/src/ib_srpt.h b/srpt/src/ib_srpt.h index 5e635c2ff..621ca3926 100644 --- a/srpt/src/ib_srpt.h +++ b/srpt/src/ib_srpt.h @@ -56,6 +56,7 @@ */ #define SRP_SERVICE_NAME_PREFIX "SRP.T10:" +struct srpt_nexus; struct srpt_tgt; enum { @@ -132,6 +133,16 @@ enum { RDMA_COMPL_TIMEOUT_S = 80, }; +#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 1, 0) && \ + !(defined(CONFIG_SUSE_KERNEL) && \ + LINUX_VERSION_CODE >= KERNEL_VERSION(3, 0, 76)) && \ + !(defined(RHEL_MAJOR) && \ + (RHEL_MAJOR -0 > 6 || \ + RHEL_MAJOR -0 == 6 && RHEL_MINOR -0 >= 5)) +/* See also patch "IB/core: Add GID change event" (commit 761d90ed4). */ +enum { IB_EVENT_GID_CHANGE = 18 }; +#endif + enum srpt_opcode { SRPT_RECV, SRPT_SEND, @@ -139,6 +150,7 @@ enum srpt_opcode { SRPT_RDMA_ABORT, SRPT_RDMA_READ_LAST, SRPT_RDMA_WRITE_LAST, + SRPT_RDMA_ZEROLENGTH_WRITE, }; static inline u64 encode_wr_id(enum srpt_opcode opcode, u32 idx) @@ -240,7 +252,7 @@ struct srpt_tsk_mgmt { * @req_lim_delta: Value of the req_lim_delta value field in the latest * SRP response sent. * @tsk_mgmt: SRPT task management function context information. - * @rdma_ius_buf: DMA mapping context information. + * @rdma_ius_buf: Inline rdma_ius buffer for small requests. 
  */
 struct srpt_send_ioctx {
 	struct srpt_ioctx ioctx;
@@ -274,19 +286,20 @@ struct srpt_send_ioctx {
  * @CH_DISCONNECTING: DREQ has been received and waiting for DREP or DREQ has
  *                    been sent and waiting for DREP or channel is being closed
  *                    for another reason.
- * @CH_DRAINING: QP is in ERR state.
+ * @CH_DISCONNECTED: Last WQE has been received.
  */
 enum rdma_ch_state {
 	CH_CONNECTING,
 	CH_LIVE,
 	CH_DISCONNECTING,
-	CH_DRAINING,
+	CH_DISCONNECTED,
 };
 
 /**
  * struct srpt_rdma_ch - RDMA channel.
  * @thread: Kernel thread that processes the IB queues associated with
  *          the channel.
+ * @nexus: I_T nexus this channel is associated with.
  * @cm_id: IB CM ID associated with the channel.
  * @qp: IB queue pair used for communicating over this channel.
  * @cq: IB completion queue for this channel.
@@ -298,8 +311,6 @@ enum rdma_ch_state {
  * @sport: pointer to the information of the HCA port used by this
  *         channel.
  * @srpt_tgt: Target port used by this channel.
- * @i_port_id: 128-bit initiator port identifier copied from SRP_LOGIN_REQ.
- * @t_port_id: 128-bit target port identifier copied from SRP_LOGIN_REQ.
  * @max_ti_iu_len: maximum target-to-initiator information unit length.
  * @req_lim: request limit: maximum number of requests that may be sent
  *           by the initiator without having received a response.
@@ -311,9 +322,8 @@ enum rdma_ch_state {
  * @ioctx_ring: Send I/O context ring.
  * @wc: Work completion array.
  * @state: channel state. See also enum rdma_ch_state.
- * @dreq_received: Whether an IB CM DREQ event has been received.
- * @last_wqe_received: Whether the Last WQE QP event has been received.
- * @list: node for insertion in the srpt_device.rch_list list.
+ * @processing_wait_list: Whether or not cmd_wait_list is being processed.
+ * @list: Entry in srpt_nexus.ch_list;
  * @cmd_wait_list: list of SCST commands that arrived before the RTU event. This
  *                 list contains struct srpt_ioctx elements and is protected
  *                 against concurrent modification by the cm_id spinlock.
@@ -323,6 +333,7 @@ enum rdma_ch_state {
  */
 struct srpt_rdma_ch {
 	struct task_struct *thread;
+	struct srpt_nexus *nexus;
 	struct ib_cm_id *cm_id;
 	struct ib_qp *qp;
 	struct ib_cq *cq;
@@ -330,11 +341,9 @@ struct srpt_rdma_ch {
 	int rq_size;
 	int max_sge;
 	int max_rsp_size;
-	int sq_wr_avail;
+	atomic_t sq_wr_avail;
 	struct srpt_port *sport;
 	struct srpt_tgt *srpt_tgt;
-	u8 i_port_id[16];
-	u8 t_port_id[16];
 	int max_ti_iu_len;
 	int req_lim;
 	int req_lim_delta;
@@ -346,25 +355,38 @@ struct srpt_rdma_ch {
 	struct list_head list;
 	struct list_head cmd_wait_list;
 	uint16_t pkey_index;
-	bool dreq_received;
-	bool last_wqe_received;
+	bool processing_wait_list;
 	struct scst_session *scst_sess;
 	u8 sess_name[40];
 };
 
+/**
+ * struct srpt_nexus - I_T nexus
+ * @entry: srpt_tgt.nexus_list list node.
+ * @ch_list: struct srpt_rdma_ch list. Protected by srpt_tgt.spinlock
+ * @i_port_id: 128-bit initiator port identifier copied from SRP_LOGIN_REQ.
+ * @t_port_id: 128-bit target port identifier copied from SRP_LOGIN_REQ.
+ */
+struct srpt_nexus {
+	struct list_head entry;
+	struct list_head ch_list;
+	u8 i_port_id[16];
+	u8 t_port_id[16];
+};
+
 /**
  * struct srpt_tgt
- * @ch_releaseQ: Enables waiting for removal from rch_list.
- * @spinlock: Protects rch_list.
- * @rch_list: Per-device channel list -- see also srpt_rdma_ch.list.
- * @scst_tgt: SCST target information associated with this HCA.
- * @enabled: Whether or not this SCST target is enabled.
+ * @ch_releaseQ: Enables waiting for removal from nexus_list.
+ * @spinlock: Protects nexus_list.
+ * @nexus_list: Per-device I_T nexus list.
+ * @scst_tgt: SCST target information associated with this HCA.
+ * @enabled: Whether or not this SCST target is enabled.
  */
 struct srpt_tgt {
 	wait_queue_head_t ch_releaseQ;
 	spinlock_t spinlock;
-	struct list_head rch_list;
+	struct list_head nexus_list;
 	struct scst_tgt *scst_tgt;
 	bool enabled;
 };
diff --git a/www/index.html b/www/index.html
index 6bd2cffee..14a457cdb 100644
--- a/www/index.html
+++ b/www/index.html
@@ -149,7 +149,7 @@

Documentation

HTML

PDF

-Gentoo HOWTO

+Gentoo HOWTO

HOWTO For iSCSI-SCST

Gentoo HOWTO For iSCSI-SCST

Alpine Linux HOWTO