diff --git a/iscsi-scst/doc/manpages/iscsi-scstd.8 b/iscsi-scst/doc/manpages/iscsi-scstd.8 index 6b536556e..2ddee95b6 100644 --- a/iscsi-scst/doc/manpages/iscsi-scstd.8 +++ b/iscsi-scst/doc/manpages/iscsi-scstd.8 @@ -160,7 +160,7 @@ Value(s): value Value(s): user->name .RE -"succesfull login by %s" +"successful login by %s" .RS Value(s): user->name .RE diff --git a/qla2x00t/qla2x00-target/README b/qla2x00t/qla2x00-target/README index 129742c28..bedd1fc12 100644 --- a/qla2x00t/qla2x00-target/README +++ b/qla2x00t/qla2x00-target/README @@ -185,6 +185,11 @@ default. N_Port ID Virtualization ------------------------ +Unfortunately, due to severe problems in the original qla2xxx driver, +NPIV is not supported in this version. If you need NPIV, you can use +the previous version 2.1 of this driver. + + N_Port ID Virtualization (NPIV) is a Fibre Channel facility allowing multiple N_Port IDs to share a single physical N_Port. NPIV is fully supported by this driver. You must have 24xx+ ISPs with NPIV-supporting diff --git a/scst/README b/scst/README index 7f78d8159..c15c718f2 100644 --- a/scst/README +++ b/scst/README @@ -450,6 +450,14 @@ following entries: have different IDs and SNs. For instance, VDISK dev handler uses this ID to generate T10 vendor specific identifier and SN of the devices. + - poll_us - if polling is desired, sets how many microseconds each SCST + thread keeps polling its queue after it becomes empty, in the hope that + a new command arrives. In some cases polling can significantly + increase IOPS, especially if CPU low power states are not disabled, + because at high IOPS polling can be cheaper than spending significant + time entering and then exiting CPU low power states, plus the + corresponding context switches. Disabled, i.e. set to 0, by default. + - suspend - globally suspends or releases all SCSI activities on all devices. Useful for mass management, like adding or deleting LUNs. 
Writing to it value v: @@ -2186,13 +2194,6 @@ you, so the resulting performance will, in average, be better (sometimes, much better) than with other SCSI targets. But in some cases you can by manual tuning improve it even more. -If you want to get maximum performance from your target, RHEL/CentOS 5.x -kernels are not recommended on both target and initiators, if you are -using Linux initiators, because those kernels are based on very outdated -2.6.18 kernel, hence, missed >3 years of important improvements in the -kernel's storage area. You should use at least long maintained vanilla -2.6.27.x kernel, although 2.6.29+ would be even better. - Before doing any performance measurements note that performance results are very much dependent from your type of load, so it is crucial that you choose access mode (FILEIO, BLOCKIO, O_DIRECT, pass-through), which @@ -2215,11 +2216,18 @@ In order to get the maximum performance you should: - Disable in Makefile CONFIG_SCST_TRACING and CONFIG_SCST_DEBUG. +Note that by disabling CONFIG_SCST_TRACING and CONFIG_SCST_DEBUG you +also disable many useful SCST diagnostic messages, which can help +significantly in many troubleshooting cases. So you may want to keep +CONFIG_SCST_TRACING enabled; its performance impact is very limited. + IMPORTANT: The development version of SCST in the SVN is optimized for ========= development and bug hunting, not for performance. To reconfigure - it for performance you should run "make 2perf" in the - root of your source code (e.g. trunk/). It will set the above - options as needed. The only option it doesn't set is + + it for performance you should run "make 2perf" or "make + 2release" (to keep CONFIG_SCST_TRACING) in the root of your + source code (e.g. trunk/). It will set the above options as + needed. The only option it doesn't set is CONFIG_SCST_TEST_IO_IN_SIRQ, so, if needed, you should change it manually. 
@@ -2416,6 +2424,17 @@ context might be done on the same CPUs as SSD devices' threads doing data transfers. As the result, those threads won't receive all the processing power of those CPUs and perform worse. +10. If your storage can sustain hundreds of thousands of IOPS, you can +use the poll_us sysfs attribute to set how many microseconds each SCST +thread keeps polling its queue after it becomes empty, in the hope that +a new command arrives. In some cases polling can significantly increase +IOPS, especially if CPU low power states are not disabled, because at +high IOPS polling can be cheaper than spending significant time +entering and then exiting CPU low power states, plus the corresponding +context switches. Polling is disabled by default. The recommended value +to start from is 5-10 us. Then increase or decrease it and check whether +your IOPS go up or down. + Commands suspending takes too long ---------------------------------- @@ -2470,7 +2489,7 @@ issues: 1. Ignore incoming task management (TM) commands. It's fine if there are not too many of them, so average performance isn't hurt and the corresponding device isn't getting put offline, i.e. if the backstorage -isn't too slow. +isn't way too slow. 2. Decrease /sys/block/sdX/device/queue_depth on the initiator in case if it's Linux (see below how) or/and SCST_MAX_TGT_DEV_COMMANDS constant @@ -2502,11 +2521,9 @@ By default, this timeout is 30 or 60 seconds, depending on your distribution. 5. Increase speed of the target's backstorage. -6. Implement in SCST dynamic I/O flow control, so queue depth on the -target is dynamically decreased/increased based on how slow/fast the -backstorage speed comparing to the target link. See "Dynamic I/O flow -control" section on http://scst.sourceforge.net/contributing.html page -for possible implementation idea. +6. 
Implement QoS in SCST, so that the queue depth on the target is +dynamically adjusted and the worst-case latencies seen by initiators are +controlled. Next, consider the case of too slow link between initiator and target, when the initiator tries to simultaneously push N commands to the target @@ -2516,13 +2533,8 @@ command, hence one or more commands in the tail of the queue can not be served on time less than the timeout, so the initiator will decide that they are stuck on the target and will try to recover. -To workaround/fix this issue in this case you can use ways 1, 2, 3, 6 -above or (7): increase speed of the link between target and initiator. -But for some initiators implementations for WRITE commands there might -be cases when target has no way to detect the issue, so dynamic I/O flow -control will not be able to help. In those cases you could also need on -the initiator(s) to either decrease the queue depth (way 2), or increase -the corresponding timeout (way 3). +To work around or fix this issue in this case you can use ways 1, 2, 3 +above or (7): increase the speed of the link between target and initiator. Note, that logged messages about QUEUE_FULL status are quite different by nature. This is a normal work, just SCSI flow control in action. diff --git a/scst/README_in-tree b/scst/README_in-tree index 5159aea88..0bdc68618 100644 --- a/scst/README_in-tree +++ b/scst/README_in-tree @@ -314,6 +314,14 @@ following entries: have different IDs and SNs. For instance, VDISK dev handler uses this ID to generate T10 vendor specific identifier and SN of the devices. + - poll_us - if polling is desired, sets how many microseconds each SCST + thread keeps polling its queue after it becomes empty, in the hope that + a new command arrives. 
In some cases, polling can significantly increase + IOPS, especially if CPU low power states are not disabled, because at + high IOPS polling can be cheaper than spending significant time + entering and then exiting CPU low power states, plus the corresponding + context switches. Disabled, i.e. set to 0, by default. + - suspend - globally suspends or releases all SCSI activities on all devices. Useful for mass management, like adding or deleting LUNs. Writing to it value v: @@ -2042,6 +2050,11 @@ In order to get the maximum performance you should: - Disable in Makefile CONFIG_SCST_TRACING and CONFIG_SCST_DEBUG. +Note that by disabling CONFIG_SCST_TRACING and CONFIG_SCST_DEBUG you +also disable many useful SCST diagnostic messages, which can help +significantly in many troubleshooting cases. So you may want to keep +CONFIG_SCST_TRACING enabled; its performance impact is very limited. + 4. Make sure you have io_grouping_type option set correctly, especially in the following cases: @@ -2230,6 +2243,17 @@ context might be done on the same CPUs as SSD devices' threads doing data transfers. As the result, those threads won't receive all the processing power of those CPUs and perform worse. +10. If your storage can sustain hundreds of thousands of IOPS, you can +use the poll_us sysfs attribute to set how many microseconds each SCST +thread keeps polling its queue after it becomes empty, in the hope that +a new command arrives. In some cases polling can significantly increase +IOPS, especially if CPU low power states are not disabled, because at +high IOPS polling can be cheaper than spending significant time +entering and then exiting CPU low power states, plus the corresponding +context switches. Polling is disabled by default. The recommended value +to start from is 5-10 us. Then increase or decrease it and check whether +your IOPS go up or down. + Commands suspending takes too long ---------------------------------- @@ -2284,7 +2308,7 @@ issues: 1. 
Ignore incoming task management (TM) commands. It's fine if there are not too many of them, so average performance isn't hurt and the corresponding device isn't getting put offline, i.e. if the backstorage -isn't too slow. +isn't way too slow. 2. Decrease /sys/block/sdX/device/queue_depth on the initiator in case if it's Linux (see below how) or/and SCST_MAX_TGT_DEV_COMMANDS constant @@ -2316,11 +2340,9 @@ By default, this timeout is 30 or 60 seconds, depending on your distribution. 5. Increase speed of the target's backstorage. -6. Implement in SCST dynamic I/O flow control, so queue depth on the -target is dynamically decreased/increased based on how slow/fast the -backstorage speed comparing to the target link. See "Dynamic I/O flow -control" section on http://scst.sourceforge.net/contributing.html page -for possible implementation idea. +6. Implement QoS in SCST, so that the queue depth on the target is +dynamically adjusted and the worst-case latencies seen by initiators are +controlled. Next, consider the case of too slow link between initiator and target, when the initiator tries to simultaneously push N commands to the target @@ -2330,13 +2352,8 @@ command, hence one or more commands in the tail of the queue can not be served on time less than the timeout, so the initiator will decide that they are stuck on the target and will try to recover. -To workaround/fix this issue in this case you can use ways 1, 2, 3, 6 -above or (7): increase speed of the link between target and initiator. -But for some initiators implementations for WRITE commands there might -be cases when target has no way to detect the issue, so dynamic I/O flow -control will not be able to help. In those cases you could also need on -the initiator(s) to either decrease the queue depth (way 2), or increase -the corresponding timeout (way 3). +To work around or fix this issue in this case you can use ways 1, 2, 3 +above or (7): increase the speed of the link between target and initiator. 
Note, that logged messages about QUEUE_FULL status are quite different by nature. This is a normal work, just SCSI flow control in action. diff --git a/scst/src/scst_main.c b/scst/src/scst_main.c index 0fa6efcbf..b8affa4a8 100644 --- a/scst/src/scst_main.c +++ b/scst/src/scst_main.c @@ -119,10 +119,12 @@ struct kmem_cache *scst_cmd_cachep; unsigned long scst_trace_flag; #endif -int scst_max_tasklet_cmd = SCST_DEF_MAX_TASKLET_CMD; - unsigned long scst_flags; +unsigned long scst_poll_ns = SCST_DEF_POLL_NS; + +int scst_max_tasklet_cmd = SCST_DEF_MAX_TASKLET_CMD; + struct scst_cmd_threads scst_main_cmd_threads; struct scst_percpu_info scst_percpu_infos[NR_CPUS]; diff --git a/scst/src/scst_priv.h b/scst/src/scst_priv.h index 9698bf8db..6f89563ba 100644 --- a/scst/src/scst_priv.h +++ b/scst/src/scst_priv.h @@ -184,6 +184,9 @@ extern unsigned int scst_setup_id; #define SCST_DEF_MAX_TASKLET_CMD 10 extern int scst_max_tasklet_cmd; +#define SCST_DEF_POLL_NS 0 +extern unsigned long scst_poll_ns; + extern spinlock_t scst_init_lock; extern struct list_head scst_init_cmd_list; extern wait_queue_head_t scst_init_cmd_list_waitQ; diff --git a/scst/src/scst_sysfs.c b/scst/src/scst_sysfs.c index 103ff8a3b..7725036a5 100644 --- a/scst/src/scst_sysfs.c +++ b/scst/src/scst_sysfs.c @@ -7052,6 +7052,53 @@ static struct kobj_attribute scst_max_tasklet_cmd_attr = __ATTR(max_tasklet_cmd, S_IRUGO | S_IWUSR, scst_max_tasklet_cmd_show, scst_max_tasklet_cmd_store); +static ssize_t scst_poll_us_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + int count; + unsigned long t = scst_poll_ns; + + TRACE_ENTRY(); + + t /= 1000; + count = sprintf(buf, "%lu\n%s\n", t, + (scst_poll_ns == SCST_DEF_POLL_NS) + ? 
"" : SCST_SYSFS_KEY_MARK); + + TRACE_EXIT(); + return count; +} + +static ssize_t scst_poll_us_store(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count) +{ + int res; + unsigned long val; + + TRACE_ENTRY(); + + res = kstrtoul(buf, 0, &val); + if (res != 0) { + PRINT_ERROR("kstrtoul() for %s failed: %d ", buf, res); + goto out; + } + + PRINT_INFO("Changed poll_us to %ld us", val); + + val *= 1000; + scst_poll_ns = val; + + res = count; + +out: + TRACE_EXIT_RES(res); + return res; +} + +static struct kobj_attribute scst_poll_us_attr = + __ATTR(poll_us, S_IRUGO | S_IWUSR, scst_poll_us_show, + scst_poll_us_store); + static ssize_t scst_suspend_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf) { @@ -7326,6 +7373,7 @@ static struct attribute *scst_sysfs_root_default_attrs[] = { &scst_threads_attr.attr, &scst_setup_id_attr.attr, &scst_max_tasklet_cmd_attr.attr, + &scst_poll_us_attr.attr, &scst_suspend_attr.attr, #if defined(CONFIG_SCST_DEBUG) || defined(CONFIG_SCST_TRACING) &scst_main_trace_level_attr.attr, diff --git a/scst/src/scst_targ.c b/scst/src/scst_targ.c index 9324b9bf7..ba1e7ebfc 100644 --- a/scst/src/scst_targ.c +++ b/scst/src/scst_targ.c @@ -5623,6 +5623,38 @@ again: thr_locked = false; } + if (scst_poll_ns > 0) { + struct timespec ts; + ktime_t end, kt; + int rc; + + rc = __getnstimeofday(&ts); + if (unlikely(rc != 0)) { + WARN_ON_ONCE(rc); + goto go; + } + + end = timespec_to_ktime(ts); + end = ktime_add_ns(end, scst_poll_ns); + + do { + barrier(); + if (!list_empty(&p_cmd_threads->active_cmd_list) || + !list_empty(&thr->thr_active_cmd_list)) { + TRACE_DBG("Poll successful"); + goto again; + } + cpu_relax(); + rc = __getnstimeofday(&ts); + if (unlikely(rc != 0)) { + WARN_ON_ONCE(rc); + goto go; + } + kt = timespec_to_ktime(ts); + } while (ktime_before(kt, end)); + } + +go: spin_lock_irq(&p_cmd_threads->cmd_list_lock); spin_lock(&thr->thr_cmd_list_lock); } diff --git 
a/scstadmin/scstadmin.procfs/scst-0.8.22/lib/SCST/SCST.pm b/scstadmin/scstadmin.procfs/scst-0.8.22/lib/SCST/SCST.pm index 4476df3df..88e9daa86 100644 --- a/scstadmin/scstadmin.procfs/scst-0.8.22/lib/SCST/SCST.pm +++ b/scstadmin/scstadmin.procfs/scst-0.8.22/lib/SCST/SCST.pm @@ -1293,7 +1293,7 @@ Returns: (boolean) $groupExists =item SCST::SCST->addGroup(); Adds a security group to SCST's configuration. Returns 0 upon success, 1 if -unsuccessfull and 2 if the group already exists. +unsuccessful and 2 if the group already exists. Arguments: (string) $group @@ -1302,11 +1302,11 @@ Returns: (int) $success =item SCST::SCST->removeGroup(); Removes a group from SCST's configuration. Returns 0 upon success, 1 if -unsuccessfull and 2 if group does not exist. +unsuccessful and 2 if group does not exist. =item SCST::SCST->renameGroup(); -Renames an already existing group. Returns 0 upon success, 1 if unsuccessfull +Renames an already existing group. Returns 0 upon success, 1 if unsuccessful or 2 if the new group name already exists. =item SCST::SCST->sgvStats(); @@ -1376,7 +1376,7 @@ Returns: (int) $handler_type =item SCST::SCST->openDevice(); Opens an already existing specified device for the specified device handler. -Returns 0 upon success, 1 if unsuccessfull and 2 if the device already exists. +Returns 0 upon success, 1 if unsuccessful and 2 if the device already exists. Available options for the parameter $options are: WRITE_THROUGH, READ_ONLY, O_DIRECT @@ -1387,7 +1387,7 @@ Returns: (int) $success =item SCST::SCST->closeDevice(); Closes an open device configured for the specified device handler. Returns -0 upon success, 1 if unsuccessfull and 2 of the device does not exist. +0 upon success, 1 if unsuccessful and 2 if the device does not exist. Arguments: (int) $handler, (string) $device, (string) $path @@ -1396,7 +1396,7 @@ Returns: (int) $success =item SCST::SCST->setT10DeviceId(); Changes the T10 device ID for the specified device and handler. 
Returns -0 upon success, 1 if unsuccessfull and 2 of the device does not exist. +0 upon success, 1 if unsuccessful and 2 if the device does not exist. Arguments: (int) $handler, (string) $device, (string) $t10_dev_id @@ -1421,7 +1421,7 @@ Returns: (hash ref) $users =item SCST::SCST->addUser(); Adds the specified user to the specified security group. Returns 0 -upon success, 1 if unsuccessfull and 2 if the user already exists. +upon success, 1 if unsuccessful and 2 if the user already exists. Arguments: (string) $user, (string) $group @@ -1430,7 +1430,7 @@ Returns: (int) $success =item SCST::SCST->removeUser(); Removed the specified user from the specified security group. Returns -0 upon success, 1 if unsuccessfull and 2 if the user does not exist. +0 upon success, 1 if unsuccessful and 2 if the user does not exist. Arguments: (string) $user, (string) $group @@ -1440,7 +1440,7 @@ Returns: (int) $success Moves a user from one group to another. Both groups must be defined and user must already exist in the first group. Returns 0 upon -success, 1 if unsuccessfull and 2 if the user already exists in the +success, 1 if unsuccessful and 2 if the user already exists in the second group. Arguments: (string) $user, (string) $fromGroup, (string) $toGroup @@ -1450,7 +1450,7 @@ Returns: (int) $success =item SCST::SCST->clearUsers(); Removes all users from the specified security group. Returns 0 upon -success or 1 if unsuccessfull. +success or 1 if unsuccessful. Arguments: (string) $group @@ -1495,7 +1495,7 @@ Hash Layout: (string) $device = (int) $lun =item SCST::SCST->assignDeviceToGroup(); Assigns the specified device to the specified security group. Returns -0 upon success, 1 if unsuccessfull and 2 if the device has already +0 upon success, 1 if unsuccessful and 2 if the device has already been assigned to the specified security group. 
Arguments: (string) $device, (string) $group, (int) $lun [, (string) $options] @@ -1506,7 +1506,7 @@ Returns: (int) $success Replaces an already assigned device to the specified lun in a specified security group with $newDevice. Returns 0 upon success, 1 -if unsuccessfull and 2 if the device has already been assigned to +if unsuccessful and 2 if the device has already been assigned to the specified security group. Arguments: (string) $newDevice, (string) $group, (int) $lun [, (string) $options] @@ -1516,7 +1516,7 @@ Returns (int) $success =item SCST::SCST->assignDeviceToHandler(); Assigns specified device to specified handler. Returns 0 upon success, -1 if unsuccessfull and 2 if the specified device is already assigned to +1 if unsuccessful and 2 if the specified device is already assigned to the specified handler. Arguments: (string) $device, (string) $handler @@ -1526,7 +1526,7 @@ Returns: (int) $success =item SCST::SCST->removeDeviceFromGroup(); Removes the specified device from the specified security group. Returns -0 upon success, 1 if unsuccessfull and 2 if the device has not been +0 upon success, 1 if unsuccessful and 2 if the device has not been assigned to the specified security group. Arguments: (string) $device, (string) $group @@ -1536,7 +1536,7 @@ Returns: (int) $success =item SCST::SCST->clearGroupDevices(); Removes all devices from the specified security group. Returns 0 upon -success or 1 if unsuccessfull. +success or 1 if unsuccessful. Arguments: (string) $group