diff --git a/srpt/ChangeLog b/srpt/ChangeLog
index 898ac20af..4a64449ab 100644
--- a/srpt/ChangeLog
+++ b/srpt/ChangeLog
@@ -1,10 +1,36 @@
-Summary of changes in SRPT between versions 1.0.0 and 1.0.1
------------------------------------------------------------
+Version 1.0.1 (not yet released)
+-------------
 
- - Update for kernels up to 2.6.29
+Changes:
+- Performance has been improved. I/O requests are now handled in soft IRQ
+  context instead of on a separate thread, which drastically reduces the
+  number of context switches needed for processing I/O (r594).
+- Added support for building SCST-SRPT against the OFED InfiniBand drivers
+  (r814:838).
+- Compiles now without further patches on RHEL 5.x / CentOS 5.x systems (r638).
+- Compiles now against 2.6.26 and later kernels (r516).
+- Fixed incorrect SCST state used on error path (r557).
+- Fixed memory leak triggered by rejecting SRP login requests (r800).
+- Fixed kernel oops triggered by reception of an asynchronous InfiniBand event.
+  Asynchronous events are triggered by e.g. resetting an InfiniBand switch or
+  reconnecting an InfiniBand cable (r878:880). The call stack of the oops is
+  as follows:
+    queue_work+0x1a/0x20
+    schedule_work+0x16/0x20
+    srpt_event_handler+0xda/0xe0 [ib_srpt]
+    ib_dispatch_event+0x39/0x70 [ib_core]
+    mlx4_ib_process_mad+0x3e6/0x430 [mlx4_ib]
+    ib_post_send_mad+0x374/0x6f0 [ib_mad]
+    ? futex_wake+0x105/0x120
+    ib_umad_write+0x4a8/0x5c0 [ib_umad]
+    vfs_write+0xcb/0x170
+    sys_write+0x50/0x90
+    system_call_fastpath+0x16/0x1b
+- The login information for HCAs with more than two ports is now displayed
+  correctly. Note: no such devices exist yet (r799).
 
- - Fixed incorrect SCST state used on error path
- - Unneeded context switches during commands processing eliminated
-
- - Minor fixes and cleanups
+Version 1.0.0 (released on July 14, 2008)
+-------------
+Almost identical to trunk r440 (only the variables SCST_DIR and EXTRA_CFLAGS
+in the Makefiles are different from trunk r440).
diff --git a/srpt/README b/srpt/README
index da30ce3f8..7f802c0e2 100644
--- a/srpt/README
+++ b/srpt/README
@@ -112,24 +112,27 @@ Notes:
 Performance notes
 -----------------
 
-In some cases, for instance working with SSD devices, which consume 100%
-of a single CPU load for data transfers in their internal threads, to
-maximize IOPS it can be needed to assign for those threads dedicated
-CPUs using Linux CPU affinity facilities. No IRQ processing should be
-done on those CPUs. Check that using /proc/interrupts. See taskset
-command and Documentation/IRQ-affinity.txt in your kernel's source tree
-for how to assign CPU affinity to tasks and IRQs.
+* For latency sensitive applications, using the noop scheduler at the initiator
+  side can give significantly better results than with other schedulers.
 
-The reason for that is that processing of coming commands in SIRQ context
-can be done on the same CPUs as SSD devices' threads doing data
-transfers. As the result, those threads won't receive all the CPU power
-and perform worse.
+* In some cases, for instance working with SSD devices, which consume 100%
+  of a single CPU load for data transfers in their internal threads, to
+  maximize IOPS it can be needed to assign for those threads dedicated
+  CPUs using Linux CPU affinity facilities. No IRQ processing should be
+  done on those CPUs. Check that using /proc/interrupts. See taskset
+  command and Documentation/IRQ-affinity.txt in your kernel's source tree
+  for how to assign CPU affinity to tasks and IRQs.
 
-Alternatively to CPU affinity assignment, you can try to enable SRP
-target's internal thread. It will allows Linux CPU scheduler to better
-distribute load among available CPUs. To enable SRP target driver's
-internal thread you should load ib_srpt module with parameter
-"thread=1".
+  The reason for that is that processing of coming commands in SIRQ context
+  can be done on the same CPUs as SSD devices' threads doing data
+  transfers. As the result, those threads won't receive all the CPU power
+  and perform worse.
+
+  Alternatively to CPU affinity assignment, you can try to enable SRP
+  target's internal thread. It will allow Linux CPU scheduler to better
+  distribute load among available CPUs. To enable SRP target driver's
+  internal thread you should load ib_srpt module with parameter
+  "thread=1".
 
 Send questions about this driver to scst-devel@lists.sourceforge.net, CC:
diff --git a/srpt/ToDo b/srpt/ToDo
index c0d890d2b..9901cb430 100644
--- a/srpt/ToDo
+++ b/srpt/ToDo
@@ -3,94 +3,25 @@
 
 * https://wiki.openfabrics.org/tiki-index.php?page=SRPT+Installation
 
-2. SRPT driver directly uses internal states of SCST core target state
-machine, which is bad, bad, bad and generally not acceptable. Only dev
-handler are allowed to use them. That should be fixed.
+2. The SRPT driver directly uses the internal state of the SCST core target
+   state machine (scmnd->state), which is bad, bad, bad and generally not
+   acceptable. Only dev handlers are allowed to use it. This should be fixed.
 
-3. Analyze why memory usage keeps increasing for repeatedly rejected logins.
-
-Details: openSUSE 11.1, 2.6.29.1 kernel with SCST patches applied (target),
-         SCST trunk r800.
-
-How to reproduce:
-* Run the following command on the target system:
-  while true; do echo "$(date) $(cat /proc/meminfo)"; done | tee memlog.txt
-* Run the following command on the initiator system:
-  for ((i=0;i<100000;i++)); do echo 'id_ext=0002c9030003cca2,ioc_guid=0002c9030003cca2,pkey=ffff,dgid=fe800000000000000002c9030003cca3,service_id=0002c9030003cca3' >/sys/class/infiniband_srp/srp-mlx4_0-1/add_target ; done
-
-Result:
-
-* The value of MemFree was decreasing during this test.
-* The values of Active, Inactive, Active(anon), AnonPages and
-  Committed_AS were all increasing at the same rate as MemFree was
-  decreasing.
-* No other values in /proc/meminfo changed significantly.
+3. Fix the race condition between srpt_refresh_port_work() and
+   srpt_remove_one(). Although the probability that this happens is very low,
+   at least in theory it is possible that srpt_refresh_port_work() gets
+   called for a port after srpt_remove_one() called kfree() on the data
+   structure that contains the work_struct passed to srpt_refresh_port_work().
+   It's not clear to me whether or not letting srpt_remove_one() wait until
+   srpt_refresh_work() has finished can result in a deadlock.
 
-4. Analyze why ib_srpt.ko triggers a kernel oops if ib_srpt is loaded before
-   opensm is started.
-
-Details: openSUSE 11.1, 2.6.29.1 kernel with SCST patches applied, SCST trunk r830.
-
-How to reproduce:
-
-/etc/init.d/scst stop
-/etc/init.d/opensmd stop
-/etc/init.d/openibd stop
-modprobe scst
-modprobe ib_srpt
-/etc/init.d/openibd start
-dmesg -c >/dev/null
-/etc/init.d/opensmd start
-dmesg -c
-
-Result:
-
-ib_srpt: ASYNC event= 17 on device= mlx4_0
-------------[ cut here ]------------
-kernel BUG at kernel/workqueue.c:189!
-invalid opcode: 0000 [#1] SMP
-last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/infiniband_mad/umad0/port
-CPU 0
-Modules linked in: rdma_ucm rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_uverbs ib_umad mlx4_ib ib_srpt scst_vdisk scst ib_cm ib_sa ib_mad ib_core ip6t_LOG ipt_MASQUERADE xt_pkttype xt_TCPMSS xt_tcpudp ipt_LOG xt_limit iptable_nat nf_nat vboxnetflt vboxdrv snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device af_packet ip6t_REJECT nf_conntrack_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq ipv6 fuse loop dm_mod coretemp snd_hda_codec_atihdmi snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore i2c_i801 joydev sr_mod serio_raw i2c_core rtc_cmos button snd_page_alloc hid_belkin cdrom pcspkr mlx4_core rtc_core intel_agp rtc_lib sg usbhid hid raid456 async_xor async_memcpy async_tx xor raid0 sd_mod crc_t10dif ehci_hcd uhci_hcd usbcore edd raid1 ext3 mbcache jbd fan ide_pci_generic ide_core ata_generic ata_piix pata_marvell ahci libata scsi_mod thermal processor thermal_sys hwmon [last unloaded: scst]
-Pid: 9073, comm: opensm Not tainted 2.6.29.1-scst #2 P5Q DELUXE
-RIP: 0010:[] [] queue_work_on+0x56/0x60
-RSP: 0018:ffff8801149dbc48 EFLAGS: 00010003
-RAX: ffff880095c28120 RBX: ffff8801149dbd08 RCX: ffff880095c28118
-RDX: 0000000000000000 RSI: ffff88013ec96400 RDI: 0000000000000000
-RBP: ffff8801149dbc48 R08: 0000000000000000 R09: 0000000000000006
-R10: ffffffff80711480 R11: ffff8801149dbc18 R12: ffff880095c20000
-R13: 0000000000000282 R14: ffff8800932a8040 R15: ffff880138453800
-FS: 00007f8868e91950(0000) GS:ffffffff80697040(0000) knlGS:0000000000000000
-CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
-CR2: 000000000040d3f0 CR3: 0000000093057000 CR4: 00000000000406e0
-DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
-DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
-Process opensm (pid: 9073, threadinfo ffff8801149da000, task ffff8800937a3710)
-Stack:
- ffff8801149dbc58 ffffffff80254bca ffff8801149dbc68 ffffffff80254be6
- ffff8801149dbc88 ffffffffa06dcaca ffff880095c28180 ffff8801149dbd08
- ffff8801149dbcb8 ffffffffa0629b49 ffff8801149dbcb8 0000000000000001
-Call Trace:
- [] queue_work+0x1a/0x20
- [] schedule_work+0x16/0x20
- [] srpt_event_handler+0xda/0xe0 [ib_srpt]
- [] ib_dispatch_event+0x39/0x70 [ib_core]
- [] mlx4_ib_process_mad+0x3e6/0x430 [mlx4_ib]
- [] ib_post_send_mad+0x374/0x6f0 [ib_mad]
- [] ? futex_wake+0x105/0x120
- [] ib_umad_write+0x4a8/0x5c0 [ib_umad]
- [] vfs_write+0xcb/0x170
- [] sys_write+0x50/0x90
- [] system_call_fastpath+0x16/0x1b
-Code: 8b 46 20 48 8b 06 45 85 c0 48 f7 d0 0f 45 3d 12 73 3a 00 48 89 ce 48 63 d7 48 8b 3c d0 e8 e3 fe ff ff ba 01 00 00 00 c9 89 d0 c3 <0f> 0b eb fe 66 0f 1f 44 00 00 55 48 89 e5 53 48 8d 5d b0 48 83
-RIP [] queue_work_on+0x56/0x60
- RSP
----[ end trace 8dc16c5c1664b10b ]---
+4. Find out from which threads the srpt_devices list can be accessed and
+   whether it has to be protected by a spinlock or mutex.
 
-5. rmmod ib_srpt under load crashes.
+5. Fix the issue that 'rmmod ib_srpt' under load hangs.
 
 6. The initiator names supplied to the SCST core contain the target port name,