mirror of
https://github.com/SCST-project/scst.git
synced 2026-05-14 01:01:27 +00:00
- Fixed minor problem in iSCSI-SCST
- Important reference counting and barriers usage cleanups
- Sense buffer made dynamic
- Other minor improvements and cleanups
- Docs updates

git-svn-id: http://svn.code.sf.net/p/scst/svn/trunk@287 d57e44dd-8a1f-0410-8b47-8ef2f437770f
AskingQuestions
@@ -0,0 +1,316 @@
Before asking me directly or posting to the scst-devel mailing list,
make sure that you have read *ALL* relevant documentation files (at
least two README files: one for SCST and one for the target driver you
are using) and *understood* *ALL* that is written there. I personally
very much like working with people who understand what they are doing,
and hate it when somebody tries to use me as a replacement for his brain
and to save his time at the expense of mine. So, in such cases don't be
surprised if your question is ignored or answered in the RTFM style.

In particular, I will refuse to answer any questions about low
performance if you don't *explicitly* write in your question that you
are not using the debug build and that you have ensured (state how you
verified it) that your target and backstorage devices don't share the
same PCI bus.

Another far too frequently asked area is "What do those aborts and
resets, which your target logs from time to time, mean and what should I
do about them?", "Are they related to the I/O stalls I sometimes
experience?" and "Why was my device put offline after them?".

Sorry if the above sounds too harsh. Unfortunately, I have limited
energy and can't waste it on explaining basic concepts and answering
the same questions over and over.

Example of a really bad question:

======================================================================

In our user space driver , i use epoll_wait to wait on multiple file
descriptors for multiple devices. Apparently when i wait on the ioctl in
blocking mode , everything works well , but when i wait on epoll , and
try to attach a target device , i get immediately a "Bad address" error
value from the epoll.

What is the reason ?

======================================================================

It is bad because, apparently, the author was doing something wrong
with epoll, but instead of checking the source code to find out when
the "Bad address" error can be returned and to understand the possible
reasons for it, he expected me to do that for him. He didn't even bother
to look in the kernel log, where, very probably, the reason for the
error was logged.

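As a first step of that kind of self-investigation, an error message can
be mapped back to its symbolic errno name, which is the string one would
then search for in the kernel and driver sources. The following is only
a minimal sketch using the Python standard library; the helper name
errno_names_for_message is made up for this illustration and is not part
of SCST or of any kernel interface.

```python
import errno
import os


def errno_names_for_message(message):
    """Return the symbolic errno names whose strerror() text equals message."""
    return sorted(
        name
        for code, name in errno.errorcode.items()
        if os.strerror(code) == message
    )


if __name__ == "__main__":
    # On Linux, "Bad address" is the strerror() text of EFAULT, so EFAULT
    # is the return code to look for in the source code.
    print(errno_names_for_message("Bad address"))
```

With the errno name in hand, the places in the code that can return it,
and the kernel log messages printed next to them, are much easier to
find.
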
Here are three examples of good questions:

======================================================================

I'm looking for help in understanding the SCST internal architecture
and operation. The problem I'm experiencing now is that SCST seems to
process deferred commands incorrectly in some cases. More specifically,
I'm confused by the 'while' loop in the scst_send_to_midlev function.

As far as I understand, the basic execution path consists of a call to
scst_do_send_to_midlev followed by taking a decision on this command
(continue with this command, reschedule it, or move to the next one);
the decision is stored in 'int res', which is then returned from the
function.

However, if there are deferred commands on the device, the function does
not return but makes another call to scst_do_send_to_midlev, analyzes
the return code again and stores the decision in 'int res', thereby
erasing the decision for the previous command. If scst_send_to_midlev
exits now, it will return the _new_ decision (for the deferred command),
whereas scst_process_active_cmd will think that it is the decision
for the command that was originally passed to scst_send_to_midlev.

For example, this will cause problems in the following situation:
1. scst_send_to_midlev is called with cmd == 0x80000100
2. scst_do_send_to_midlev is called with cmd == 0x80000100
3. scst_do_send_to_midlev returns with SCST_EXEC_COMPLETED
   (in certain scenarios the command is already destroyed at this point)
4. scst_check_deferred_commands finds the deferred cmd == 0x80000200
5. scst_do_send_to_midlev is called with cmd == 0x80000200
6. scst_do_send_to_midlev returns with SCST_EXEC_NEED_THREAD
7. scst_send_to_midlev returns with SCST_CMD_STATE_RES_NEED_THREAD
8. Now, scst_process_active_cmd will try to reschedule command 0x80000100,
   which is already destroyed at this point!

Can anyone on the list confirm my guess? Or should this situation never
happen because of some other condition which I may have missed? Right
now I can't think of any simple method to work around the issue,
i.e. all of my ideas require rewriting a significant part of the code.

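The eight steps above can be replayed with a small toy model. This is
only a sketch of the behaviour as the question describes it, written in
Python for brevity; it is not SCST code and makes no claim about what
the real functions do. The function names mirror those in the question,
RES_CONT_NEXT is a hypothetical placeholder for the "move to the next
one" decision, and the dictionary of execution results stands in for the
real command execution.

```python
from collections import deque

# Result codes named in the question.
SCST_EXEC_COMPLETED = "SCST_EXEC_COMPLETED"
SCST_EXEC_NEED_THREAD = "SCST_EXEC_NEED_THREAD"
SCST_CMD_STATE_RES_NEED_THREAD = "SCST_CMD_STATE_RES_NEED_THREAD"
# Hypothetical placeholder for the "move to the next command" decision.
RES_CONT_NEXT = "RES_CONT_NEXT"


def send_to_midlev(cmd, deferred, exec_result, destroyed):
    """Toy model of the loop; returns (res, command the decision is about)."""
    while True:
        rc = exec_result[cmd]  # stands in for scst_do_send_to_midlev(cmd)
        if rc == SCST_EXEC_COMPLETED:
            destroyed.add(cmd)  # the command may already be gone here
            res = RES_CONT_NEXT
        else:
            res = SCST_CMD_STATE_RES_NEED_THREAD
        # `res` now describes the command executed last, not the original.
        if rc != SCST_EXEC_COMPLETED or not deferred:
            return res, cmd
        cmd = deferred.popleft()  # stands in for scst_check_deferred_commands


def process_active_cmd(cmd, deferred, exec_result):
    """Toy caller that assumes the returned decision is about `cmd`."""
    destroyed = set()
    res, decided_for = send_to_midlev(cmd, deferred, exec_result, destroyed)
    rescheduled = cmd if res == SCST_CMD_STATE_RES_NEED_THREAD else None
    return rescheduled, decided_for, destroyed


if __name__ == "__main__":
    rescheduled, decided_for, destroyed = process_active_cmd(
        0x80000100,
        deque([0x80000200]),
        {0x80000100: SCST_EXEC_COMPLETED, 0x80000200: SCST_EXEC_NEED_THREAD},
    )
    print(hex(rescheduled), hex(decided_for), rescheduled in destroyed)
```

In this model the caller reschedules the original command even though
the decision was taken for the deferred one, which is exactly the
mismatch the question asks the list to confirm or rule out.
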
======================================================================

Hello,

I have two machines (SCST targets) with the following parameters:
- two dual core Xeon CPUs
- QLA2342 FC HBA
- Areca SATA RAID HBA
- Linux 2.6.21.3, running in 64 bit mode with 16G RAM
- SCST trunk version

On the client side there is a Solaris 10 U3 machine, with the same (chip
wise) Qlogic controller.

There is an FC switch between the three machines, and each of the
targets is zoned to the client's port in a one-to-one manner, so HBA
port 1 sees only target 1 and port 2 sees only target 2.

The targets are configured with two large sparse files on XFS (8 TB
each, with dd if=/dev/zero of=file bs=1M count=0 seek=8388608).

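The size implied by that dd invocation can be checked with a little
arithmetic: with count=0 nothing is copied, and the output file is left
at bs * seek bytes. The sketch below is illustrative only; the helper
names dd_seek_size and make_sparse_file are invented here, and the demo
creates a small 64 MiB file rather than an 8 TiB one. Whether the file
really occupies no data blocks depends on the file system supporting
holes.

```python
import os
import tempfile

MIB = 1024 * 1024
TIB = 1024 ** 4


def dd_seek_size(bs, seek):
    """Bytes in a new file after `dd bs=<bs> count=0 seek=<seek>`."""
    return bs * seek


def make_sparse_file(path, size):
    """Create a file of `size` bytes by extending it without writing data."""
    with open(path, "wb") as f:
        f.truncate(size)


if __name__ == "__main__":
    # bs=1M is 1048576 bytes, so seek=8388608 gives 8 TiB.
    print(dd_seek_size(MIB, 8388608) // TIB, "TiB")
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file")
        make_sparse_file(path, 64 * MIB)  # small size, for the demo only
        print(os.path.getsize(path))
```
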
In Solaris I do various tests with SVM (Sun's built-in volume manager)
and multiterabyte UFS. Occasionally, there are some strange write
errors, where the volume manager drops its volumes and without a VM, a
simple UFS fs write can fail too.

I see various errors logged by the kernel (Solaris'), these are some
examples, both with and without SVM:
Jun 21 10:42:14 solaris fctl: [ID 517869 kern.warning] WARNING:
fp(1)::GPN_ID for D_ID=621200 failed
Jun 21 10:42:14 solaris fctl: [ID 517869 kern.warning] WARNING:
fp(1)::N_x Port with D_ID=621200, PWWN=210000e08b944419 disappeared from
fabric
Jun 21 10:42:53 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 10:42:53 solaris SCSI transport failed: reason
'tran_err': retrying command
Jun 21 10:43:06 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 10:43:06 solaris SCSI transport failed: reason 'timeout':
retrying command
Jun 21 10:43:13 solaris scsi: [ID 107833 kern.notice] Device is gone
Jun 21 10:43:13 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 10:43:13 solaris transport rejected fatal error
Jun 21 10:43:13 solaris md_stripe: [ID 641072 kern.warning] WARNING: md:
d10: write error on /dev/dsk/c2t210000E08B944419d0s6
Jun 21 10:43:13 solaris last message repeated 9 times
Jun 21 10:43:13 solaris scsi: [ID 243001 kern.info]
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0 (fcp1):
Jun 21 10:43:13 solaris offlining lun=0 (trace=0), target=621200
(trace=2800004)
Jun 21 10:43:13 solaris ufs: [ID 702911 kern.warning] WARNING: Error
writing master during ufs log roll
Jun 21 10:43:13 solaris ufs: [ID 127457 kern.warning] WARNING: ufs log
for /mnt changed state to Error
Jun 21 10:43:13 solaris ufs: [ID 616219 kern.warning] WARNING: Please
umount(1M) /mnt and run fsck(1M)
Jun 21 11:08:55 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 11:08:55 solaris offline or reservation conflict
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 11:09:41 solaris offline or reservation conflict
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 11:09:41 solaris offline or reservation conflict
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 11:09:41 solaris i/o to invalid geometry
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 11:09:41 solaris offline or reservation conflict
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 11:09:41 solaris i/o to invalid geometry
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 11:09:41 solaris offline or reservation conflict
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 11:09:41 solaris i/o to invalid geometry
Jun 21 11:09:43 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 11:09:43 solaris offline or reservation conflict
Jun 21 11:09:43 solaris scsi: [ID 107833 kern.warning] WARNING:
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
(sd1):
Jun 21 11:09:43 solaris SYNCHRONIZE CACHE command failed (5)

I don't see anything in the dmesg on the target side.

After these errors SCST seems to be dead. I can't unload its modules and
can't communicate with it via /proc.
A simple cat vdisk just waits and waits.

Could you please help? What should I set/collect/send in this case to
help resolve this issue?

======================================================================

Hello,

I am trying to get scst working on an Opteron machine.

After some hours of playing with different kernel versions and different
missing functions, I've stuck with a 2.6.15 and a
drivers/scsi/scsi_lib.c hack from 2.6.14, which contains the
scsi_wait_req. (Linux is a mess, each point release changes something.
How can developers keep up with this?)

Now everything seems to be OK, I could load the modules and such.

I have a setup of two machines connected to each other in an FC-P2P
manner. The two machines have two 2G links between them. On the initiator
side I have FreeBSD, because I know that better and this is what I did
some target mode tests with.

The strange thing is that the loop seems to be only running at 1 Gbps:
[ 61.731265] QLogic Fibre Channel HBA Driver
[ 61.731454] GSI 21 sharing vector 0xD1 and IRQ 21
[ 61.731563] ACPI: PCI Interrupt 0000:06:01.0[A] -> GSI 36 (level, low) -> IRQ 21
[ 61.731821] qla2300 0000:06:01.0: Found an ISP2312, irq 21, iobase 0xffffc20000014000
[ 61.732194] qla2300 0000:06:01.0: Configuring PCI space...
[ 61.732441] qla2300 0000:06:01.0: Configure NVRAM parameters...
[ 61.816885] qla2300 0000:06:01.0: Verifying loaded RISC code...
[ 61.852177] qla2300 0000:06:01.0: Extended memory detected (512 KB)...
[ 61.852294] qla2300 0000:06:01.0: Resizing request queue depth (2048 -> 4096)...
[ 61.852604] qla2300 0000:06:01.0: LIP reset occured (f8e8).
[ 61.852740] qla2300 0000:06:01.0: Waiting for LIP to complete...
[ 62.865911] qla2300 0000:06:01.0: LIP occured (f7f7).
[ 62.866042] qla2300 0000:06:01.0: LOOP UP detected (1 Gbps).
[ 62.866269] qla2300 0000:06:01.0: Topology - (Loop), Host Loop address 0x0
[ 62.868285] scsi0 : qla2xxx
[ 62.868507] qla2300 0000:06:01.0:
[ 62.868507] QLogic Fibre Channel HBA Driver: 8.01.03-k
[ 62.868508] QLogic QLA2312 -
[ 62.868509] ISP2312: PCI-X (100 MHz) @ 0000:06:01.0 hdma+, host#=0, fw=3.03.18 IPX

I did the following:
modprobe qla2x00tgt:

[ 104.988170] qla2x00tgt: no version for "scst_unregister" found: kernel tainted.

echo "open lun0 /data/lun0" >/proc/scsi_tgt/disk_fileio/disk_fileio
[ 169.102877] scst: Device handler disk_fileio for type 0 loaded successfully
[ 169.103002] scst: Device handler cdrom_fileio for type 5 loaded successfully
[ 191.261000] dev_fileio: Attached SCSI target virtual disk lun0 (file="/data/lun0", fs=1000001MB, bs=512, nblocks=2048002048, cyln=1000001)
[ 191.261191] scst: Attached SCSI target mid-level to virtual device lun0 (id 1)

and
echo "add lun0 0" > /proc/scsi_tgt/groups/Default/devices

On the other side a camcontrol rescan all (SCSI rescan) gives me the following with a verbose logging kernel:
Mar 29 18:09:17 blade2 kernel: pass1: <SCST_FIO lun0 093> Fixed Direct Access SCSI-4 device
Mar 29 18:09:17 blade2 kernel: pass1: Serial Number 383
Mar 29 18:09:17 blade2 kernel: pass1: 100.000MB/s transfers
Mar 29 18:09:17 blade2 kernel: da1 at isp0 bus 0 target 0 lun 0
Mar 29 18:09:17 blade2 kernel: da1: <SCST_FIO lun0 093> Fixed Direct Access SCSI-4 device
Mar 29 18:09:17 blade2 kernel: da1: Serial Number 383
Mar 29 18:09:17 blade2 kernel: da1: 100.000MB/s transfers
Mar 29 18:09:17 blade2 kernel: da1: 1024MB (2097152 512 byte sectors: 64H 32S/T 1024C)
Mar 29 18:09:17 blade2 kernel: (probe0:isp0:0:0:1): error 6
Mar 29 18:09:17 blade2 kernel: (probe0:isp0:0:0:1): Unretryable Error
Mar 29 18:09:17 blade2 kernel: isp0: data overrun for command on 0.0.0
Mar 29 18:09:17 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
Mar 29 18:09:17 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:2): error 6
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:2): Unretryable Error
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:3): error 6
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:3): Unretryable Error
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:4): error 6
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:4): Unretryable Error
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:5): error 6
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:5): Unretryable Error
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): error 5
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retries Exausted
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:6): error 6
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:6): Unretryable Error
Mar 29 18:09:19 blade2 kernel: (probe0:isp0:0:0:7): error 6
Mar 29 18:09:19 blade2 kernel: (probe0:isp0:0:0:7): Unretryable Error

The device is there, but I cannot use it.

BTW, the target mode machine (Linux) runs on a dual Opteron in 64 bit
mode, with 8GB of RAM. I've lowered it with mem=800M, but the effect is
the same.

Assuming that the mixed 2.6.14-.15 kernel is at fault, could you please
tell me which version I should use, for which all of the patches will
work?

======================================================================

So, as a bottom line, if you want me to be friendly, don't ask questions
whose answers you can find yourself by simply reading the documentation
and applying a minimal thinking effort.

Also, it is very desirable that you attach to your question the full
kernel log from the target, starting from its boot.

Vladislav Bolkhovitin <vst@vlnb.net>, http://scst.sourceforge.net

iscsi-scst/AskingQuestions
@@ -0,0 +1,316 @@
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retries Exausted
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:6): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:6): Unretryable Error
|
||||
Mar 29 18:09:19 blade2 kernel: (probe0:isp0:0:0:7): error 6
|
||||
Mar 29 18:09:19 blade2 kernel: (probe0:isp0:0:0:7): Unretryable Error
|
||||
|
||||
|
||||
The device is there, but I cannot use it.
|
||||
|
||||
BTW, the target mode machine (Linux) runs on a dual Opteron in 64 bit
|
||||
mode, with 8GB of RAM. I've lowered it with mem=800M, but the effect is
|
||||
the same.
|
||||
|
||||
Assuming that mixed 2.6.14-.15 kernel is the fault, could you please
|
||||
tell me what version should I use, for which all of the patches will
|
||||
work?
|
||||
|
||||
======================================================================
|
||||
|
||||
So, the bottom line: if you want me to be friendly, don't ask questions
whose answers you can find yourself by simply reading the documentation
and a minimal thinking effort.

It is also very desirable that you attach to your question the full
kernel log from the target since it was booted.
|
||||
|
||||
Vladislav Bolkhovitin <vst@vlnb.net>, http://scst.sourceforge.net
|
||||
@@ -111,8 +111,9 @@ Check SCST README file how to tune for the best performance.
 
 If under high load you experience I/O stalls or see in the kernel log
 abort or reset messages, then try to reduce QueuedCommands parameter in
-iscsi-scstd.conf file for the corresponding target. See also SCST README
-file for more details about that issue.
+iscsi-scstd.conf file for the corresponding target to some lower value,
+like 8 (default is 32). See also SCST README file for more details about
+that issue.
 
 Sometimes, when there are communication problems with initiator(s),
 shutting down iSCSI-SCST can take very long time, up to about 10
|
||||
|
||||
@@ -20,7 +20,7 @@
 #include <sys/uio.h>
 #endif
 
-#define ISCSI_VERSION_STRING "0.9.6/0.4.15r145"
+#define ISCSI_VERSION_STRING "0.9.6/0.4.15r147"
 
 /* The maximum length of 223 bytes in the RFC. */
 #define ISCSI_NAME_LEN 256
|
||||
|
||||
@@ -454,12 +454,14 @@ static struct iscsi_cmnd *iscsi_cmnd_create_rsp_cmnd(struct iscsi_cmnd *parent)
 
 static inline struct iscsi_cmnd *get_rsp_cmnd(struct iscsi_cmnd *req)
 {
-	struct iscsi_cmnd *res;
+	struct iscsi_cmnd *res = NULL;
 
 	/* Currently this lock isn't needed, but just in case.. */
 	spin_lock_bh(&req->rsp_cmd_lock);
-	res = list_entry(req->rsp_cmd_list.prev, struct iscsi_cmnd,
-		rsp_cmd_list_entry);
+	if (!list_empty(&req->rsp_cmd_list)) {
+		res = list_entry(req->rsp_cmd_list.prev, struct iscsi_cmnd,
+			rsp_cmd_list_entry);
+	}
 	spin_unlock_bh(&req->rsp_cmd_lock);
 
 	return res;
|
||||
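The list_empty() guard in get_rsp_cmnd() matters because list_entry() on an empty list does not yield NULL: the head's prev pointer refers back to the head itself, so the macro would compute a pointer that is not a real struct iscsi_cmnd. A minimal userspace sketch of the pattern, assuming simplified stand-ins for the <linux/list.h> helpers; struct cmnd and last_rsp() are hypothetical names for illustration only, not the driver's code, and the locking is left out:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's <linux/list.h> primitives. */
struct list_head {
	struct list_head *next, *prev;
};

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Hypothetical, cut-down command: only the fields this example needs. */
struct cmnd {
	unsigned int id;
	struct list_head rsp_cmd_list;       /* head of this request's responses */
	struct list_head rsp_cmd_list_entry; /* link in the parent's list */
};

/* Returns the most recently queued response, or NULL if there is none. */
static struct cmnd *last_rsp(struct cmnd *req)
{
	struct cmnd *res = NULL;

	assert(req != NULL);
	if (!list_empty(&req->rsp_cmd_list))
		res = list_entry(req->rsp_cmd_list.prev, struct cmnd,
				 rsp_cmd_list_entry);
	return res;
}
```

With this shape every caller has to handle a NULL result, which is what the REJECT paths in the later hunks do before calling iscsi_cmnd_init_write().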
@@ -472,6 +474,8 @@ static void iscsi_cmnds_init_write(struct list_head *send, int flags)
 	struct iscsi_conn *conn = rsp->conn;
 	struct list_head *pos, *next;
 
+	sBUG_ON(list_empty(send));
+
 	/*
 	 * If we don't remove hashed req cmd from the hash list here, before
 	 * submitting it for transmittion, we will have a race, when for
|
||||
@@ -618,8 +622,8 @@ static struct iscsi_cmnd *create_status_rsp(struct iscsi_cmnd *req, int status,
 	rsp_hdr->cmd_status = status;
 	rsp_hdr->itt = cmnd_hdr(req)->itt;
 
-	if (status == SAM_STAT_CHECK_CONDITION) {
-		TRACE_DBG("%s", "CHECK_CONDITION");
+	if (SCST_SENSE_VALID(sense_buf)) {
+		TRACE_DBG("%s", "SENSE VALID");
 		/* ToDo: __GFP_NOFAIL ?? */
 		sg = rsp->sg = scst_alloc(PAGE_SIZE, GFP_KERNEL|__GFP_NOFAIL,
 			&rsp->sg_cnt);
|
||||
@@ -920,11 +924,12 @@ static void cmnd_prepare_skip_pdu_set_resid(struct iscsi_cmnd *req)
 	TRACE_DBG("%p", req);
 
 	rsp = get_rsp_cmnd(req);
+	if (rsp == NULL)
+		goto skip;
+
 	rsp_hdr = (struct iscsi_scsi_rsp_hdr *)&rsp->pdu.bhs;
-	if (unlikely(cmnd_opcode(rsp) != ISCSI_OP_SCSI_RSP)) {
-		PRINT_ERROR("Unexpected response command %u", cmnd_opcode(rsp));
-		return;
-	}
+
+	sBUG_ON(cmnd_opcode(rsp) != ISCSI_OP_SCSI_RSP);
 
 	size = cmnd_write_size(req);
 	if (size) {
|
||||
@@ -941,6 +946,8 @@ static void cmnd_prepare_skip_pdu_set_resid(struct iscsi_cmnd *req)
 			rsp_hdr->residual_count = cpu_to_be32(size);
 		}
 	}
+
+skip:
 	req->pdu.bhs.opcode =
 		(req->pdu.bhs.opcode & ~ISCSI_OPCODE_MASK) | ISCSI_OP_SCSI_REJECT;
 
|
||||
@@ -1285,7 +1292,6 @@ static int scsi_cmnd_start(struct iscsi_cmnd *req)
 	if (unlikely(req->scst_state != ISCSI_CMD_STATE_AFTER_PREPROC)) {
 		TRACE_DBG("req %p is in %x state", req, req->scst_state);
 		if (req->scst_state == ISCSI_CMD_STATE_PROCESSED) {
-			/* Response is already prepared */
 			cmnd_prepare_skip_pdu_set_resid(req);
 			goto out;
 		}
|
||||
@@ -1437,7 +1443,8 @@ static void data_out_end(struct iscsi_cmnd *cmnd)
 
 	iscsi_extracheck_is_rd_thread(cmnd->conn);
 
-	if (!(cmnd->conn->ddigest_type & DIGEST_NONE)) {
+	if (!(cmnd->conn->ddigest_type & DIGEST_NONE) &&
+	    !cmnd->ddigest_checked) {
 		cmd_add_on_rx_ddigest_list(req, cmnd);
 		cmnd_get(cmnd);
 	}
|
||||
@@ -1900,12 +1907,16 @@ static void iscsi_cmnd_exec(struct iscsi_cmnd *cmnd)
 		logout_exec(cmnd);
 		break;
 	case ISCSI_OP_SCSI_REJECT:
-		TRACE_MGMT_DBG("REJECT cmnd %p (scst_cmd %p)", cmnd,
-			cmnd->scst_cmd);
-		iscsi_cmnd_init_write(get_rsp_cmnd(cmnd),
-			ISCSI_INIT_WRITE_REMOVE_HASH | ISCSI_INIT_WRITE_WAKE);
+	{
+		struct iscsi_cmnd *rsp = get_rsp_cmnd(cmnd);
+		TRACE_MGMT_DBG("REJECT cmnd %p (scst_cmd %p), rsp %p", cmnd,
+			cmnd->scst_cmd, rsp);
+		if (rsp != NULL)
+			iscsi_cmnd_init_write(rsp, ISCSI_INIT_WRITE_REMOVE_HASH |
+				ISCSI_INIT_WRITE_WAKE);
 		req_cmnd_release(cmnd);
 		break;
+	}
 	default:
 		PRINT_ERROR("unexpected cmnd op %x", cmnd_opcode(cmnd));
 		req_cmnd_release(cmnd);
|
||||
@@ -2281,10 +2292,14 @@ void cmnd_rx_end(struct iscsi_cmnd *cmnd)
 		data_out_end(cmnd);
 		break;
 	case ISCSI_OP_PDU_REJECT:
-		iscsi_cmnd_init_write(get_rsp_cmnd(cmnd),
-			ISCSI_INIT_WRITE_REMOVE_HASH | ISCSI_INIT_WRITE_WAKE);
+	{
+		struct iscsi_cmnd *rsp = get_rsp_cmnd(cmnd);
+		if (rsp != NULL)
+			iscsi_cmnd_init_write(rsp, ISCSI_INIT_WRITE_REMOVE_HASH |
+				ISCSI_INIT_WRITE_WAKE);
 		req_cmnd_release(cmnd);
 		break;
+	}
 	case ISCSI_OP_DATA_REJECT:
 		req_cmnd_release(cmnd);
 		break;
|
||||
|
||||
@@ -237,6 +237,7 @@ struct iscsi_cmnd {
 	unsigned int data_waiting:1;
 	unsigned int force_cleanup_done:1;
 	unsigned int dec_active_cmnds:1;
+	unsigned int ddigest_checked:1;
 #ifdef EXTRACHECKS
 	unsigned int on_rx_digest_list:1;
 	unsigned int release_called:1;
|
||||
|
||||
@@ -622,7 +622,17 @@ static int recv(struct iscsi_conn *conn)
 		break;
 	case RX_CHECK_DDIGEST:
 		conn->read_state = RX_END;
-		if (cmnd_opcode(cmnd) == ISCSI_OP_SCSI_CMD) {
+		if (cmnd->pdu.datasize <= 16*1024) {
+			/* It's cache hot, so let's compute it inline */
+			TRACE_DBG("cmnd %p, opcode %x: checking RX "
+				"ddigest inline", cmnd, cmnd_opcode(cmnd));
+			cmnd->ddigest_checked = 1;
+			rc = digest_rx_data(cmnd);
+			if (unlikely(rc != 0)) {
+				mark_conn_closed(conn);
+				goto out;
+			}
+		} else if (cmnd_opcode(cmnd) == ISCSI_OP_SCSI_CMD) {
 			cmd_add_on_rx_ddigest_list(cmnd, cmnd);
 			cmnd_get(cmnd);
 		} else if (cmnd_opcode(cmnd) != ISCSI_OP_SCSI_DATA_OUT) {
|
||||
@@ -631,12 +641,12 @@ static int recv(struct iscsi_conn *conn)
 			 * specify how to deal with digest errors in this case.
 			 * Is closing connection correct?
 			 */
-			TRACE_DBG("cmnd %p, opcode %x: checking RX "
-				"ddigest inline", cmnd, cmnd_opcode(cmnd));
+			TRACE_DBG("cmnd %p, opcode %x: checking NOP RX "
+				"ddigest", cmnd, cmnd_opcode(cmnd));
 			rc = digest_rx_data(cmnd);
 			if (unlikely(rc != 0)) {
-				conn->read_state = RX_CHECK_DDIGEST;
 				mark_conn_closed(conn);
+				goto out;
 			}
 		}
 		break;
|
||||
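Taken together, the recv(), data_out_end() and struct iscsi_cmnd hunks appear to split data digest verification into two paths: payloads of up to 16 KB are checked at once in the read thread while they are still cache hot, and larger ones are queued for a later check unless the new ddigest_checked bit says the work is already done. A minimal model of that decision, as a sketch only; struct pdu, rx_check_ddigest() and needs_deferred_check() are hypothetical names, and the actual CRC computation is omitted:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INLINE_DIGEST_LIMIT (16 * 1024) /* threshold used in the hunk */

enum digest_path { DIGEST_INLINE, DIGEST_DEFERRED };

/* Hypothetical model of a received PDU; not the driver's struct iscsi_cmnd. */
struct pdu {
	size_t datasize;
	bool ddigest_checked; /* set once the digest was verified inline */
};

/* Receive side: small payloads are verified immediately, large ones later. */
static enum digest_path rx_check_ddigest(struct pdu *p)
{
	assert(p != NULL);
	if (p->datasize <= INLINE_DIGEST_LIMIT) {
		p->ddigest_checked = true;
		return DIGEST_INLINE;
	}
	return DIGEST_DEFERRED;
}

/*
 * Later stage: queue the PDU for a digest check only if data digests are
 * enabled and the inline path has not verified it already.
 */
static bool needs_deferred_check(const struct pdu *p, bool digests_enabled)
{
	assert(p != NULL);
	return digests_enabled && !p->ddigest_checked;
}
```

The flag is what keeps a small PDU from being verified twice: without it, the later stage would queue every PDU whenever digests are enabled.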
|
||||
@@ -1611,7 +1611,7 @@ mpt_xmit_response(struct scst_cmd *scst_cmd)
 
 	TRACE_DBG("rq_result=%x, resp_flags=%x, %x, %d", prm.rq_result,
 		resp_flags, prm.bufflen, prm.sense_buffer_len);
-	if (prm.rq_result != 0)
+	if ((prm.rq_result != 0) && (prm.sense_buffer != NULL))
 		TRACE_BUFFER("Sense", prm.sense_buffer, prm.sense_buffer_len);
 
 	if ((resp_flags & SCST_TSC_FLAG_STATUS) == 0) {
|
||||
|
||||
316
qla2x00t/qla2x00-target/AskingQuestions
Normal file
@@ -0,0 +1,316 @@
|
||||
Before asking me or the scst-devel mailing list any questions, make sure
that you have read *ALL* relevant documentation files (at least 2 README
files: one for SCST and one for the target driver you are using) and
*understood* *ALL* that is written there. I personally very much like
working with people who understand what they are doing, and hate it when
somebody tries to use me as a replacement for his brain and to save his
time at the expense of mine. So, in such cases don't be surprised if
your question is ignored or answered in the RTFM style.

In particular, I will refuse to answer any questions about low
performance unless you *explicitly* write in your question that you
don't use the debug build and that you have ensured (write how) that
your target and backstorage devices don't share the same PCI bus.

Another too frequently asked area is "What do those aborts and resets,
which your target logs from time to time, mean and what should I do
about them?", "Do they relate to the I/O stalls I sometimes experience?"
and "Why was my device put offline after them?".

Sorry if the above sounds too harsh. Unfortunately, I have limited
energy and can't waste it on explaining basic concepts and answering the
same questions over and over.
|
||||
|
||||
Example of a really bad question:
|
||||
|
||||
======================================================================
|
||||
|
||||
In our user space driver , i use epoll_wait to wait on multiple file
|
||||
descriptors for multiple devices. Apparently when i wait on the ioctl in
|
||||
blocking mode , everything works well , but when i wait on epoll , and
|
||||
try to attach a target device , i get immediately a "Bad address" error
|
||||
value from the epoll.
|
||||
|
||||
What is the reason ?
|
||||
|
||||
======================================================================
|
||||
|
||||
It is bad because the author was apparently doing something wrong with
epoll, but instead of checking the source code to find out when the
"Bad address" error can be returned and to understand the possible
reasons for it, he expected me to do that for him. He didn't even bother
to look in the kernel log, where, very probably, the reason for the
error was logged.
|
||||
|
||||
|
||||
Here are three examples of good questions:
|
||||
|
||||
======================================================================
|
||||
|
||||
I'm looking for a help in understanding of SCST internal architecture
|
||||
and operation. The problem I'm experiencing now is that SCST seems to
|
||||
process deferred commands incorrectly in some cases. More specifically,
|
||||
I'm confused with the 'while' loop in scst_send_to_midlev function.
|
||||
|
||||
As far as I understand, the basic execution path consists of a call to
|
||||
scst_do_send_to_midlev followed by taking of a decision on this command
|
||||
(continue with this command, reschedule it, or move to the next one),
|
||||
the decision is stored in 'int res', which is then returned from the
|
||||
function.
|
||||
|
||||
However, if there are deferred commands on the device, the function does
|
||||
not return but makes another call to scst_do_send_to_midlev, analyzes
|
||||
the return code again and stores the decision in 'int res' thereby
|
||||
erasing the decision for the previous command. If scst_send_to_midlev
|
||||
exits now, it will return the _new_ decision (for the deferred command)
|
||||
whereas the scst_process_active_cmd will think that it is the decision
|
||||
for the command that was originally passed to scst_send_to_midlev.
|
||||
|
||||
For example, this will cause problems in the following situation:
1. scst_send_to_midlev is called with cmd == 0x80000100
2. scst_do_send_to_midlev is called with cmd == 0x80000100
3. scst_do_send_to_midlev returns with SCST_EXEC_COMPLETED
   (in certain scenarios the command is already destroyed at this point)
4. scst_check_deferred_commands finds the deferred cmd == 0x80000200
5. scst_do_send_to_midlev is called with cmd == 0x80000200
6. scst_do_send_to_midlev returns with SCST_EXEC_NEED_THREAD
7. scst_send_to_midlev returns with SCST_CMD_STATE_RES_NEED_THREAD
8. Now, the scst_process_active_cmd will try to reschedule command 0x80000100
   which is already destroyed at this point !
|
||||
|
||||
Can anyone on the list confirm my guess? Or, this situation should never
|
||||
happen because of some other condition which I may have missed? Right
|
||||
now I can't think of any of simple methods to work around the issue,
|
||||
i.e. any of my ideas require rewriting significant part of the code.
|
||||
|
||||
======================================================================
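The scenario described in the question above can be modelled in a few lines. This is only a sketch of the control flow as the question describes it, not SCST's actual code: struct cmd, decide(), send_overwriting() and send_tracked() are hypothetical names, and whether the real scst_send_to_midlev behaves this way is exactly what the question asks.

```c
#include <assert.h>
#include <stddef.h>

enum exec_res { EXEC_COMPLETED, EXEC_NEED_THREAD };
enum state_res { RES_CONT_NEXT, RES_NEED_THREAD };

/* Hypothetical command: a canned execution result and a deferred successor. */
struct cmd {
	unsigned int id;
	enum exec_res exec_result;
	struct cmd *deferred; /* next deferred command on the device, or NULL */
};

static enum state_res decide(const struct cmd *c)
{
	return c->exec_result == EXEC_NEED_THREAD ? RES_NEED_THREAD
						  : RES_CONT_NEXT;
}

/* The suspected pattern: one 'res' is reused for every command in the loop. */
static enum state_res send_overwriting(struct cmd *cmd)
{
	enum state_res res = RES_CONT_NEXT;

	while (cmd != NULL) {
		res = decide(cmd); /* overwrites the previous decision */
		if (res == RES_NEED_THREAD)
			break;
		cmd = cmd->deferred;
	}
	return res; /* may describe a deferred command, not the original */
}

/* One possible remedy: also report which command the decision belongs to. */
static enum state_res send_tracked(struct cmd *cmd, struct cmd **needs_thread)
{
	struct cmd *orig = cmd;
	enum state_res first = RES_CONT_NEXT;

	assert(needs_thread != NULL);
	*needs_thread = NULL;
	while (cmd != NULL) {
		enum state_res res = decide(cmd);

		if (cmd == orig)
			first = res;
		if (res == RES_NEED_THREAD) {
			*needs_thread = cmd;
			break;
		}
		cmd = cmd->deferred;
	}
	return first; /* always the decision for the original command */
}
```

With command 0x80000100 completing and deferred command 0x80000200 needing a thread, send_overwriting() reports "need thread" to a caller that only knows about the first command, while send_tracked() returns the first command's own decision and hands back the deferred command separately.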
|
||||
|
||||
Hello,
|
||||
|
||||
I have two machines (SCST targets) with the following parameters:
|
||||
- two dual core Xeon CPUs
|
||||
- QLA2342 FC HBA
|
||||
- Areca SATA RAID HBA
|
||||
- Linux 2.6.21.3, running in 64 bit mode with 16G RAM
|
||||
- SCST trunk version
|
||||
|
||||
On the client side there is a Solaris 10 U3 machine, with the same (chip
|
||||
wise) Qlogic controller.
|
||||
|
||||
There is an FC switch between the three machines, and each of the
|
||||
targets are zoned to the client's port in a one-by-one manner, so HBA
|
||||
port 1 sees only target 1 and port 2 sees only target 2.
|
||||
|
||||
The targets are configured with two large sparse files on XFS (8 TB
|
||||
each, with dd if=/dev/zero of=file bs=1M count=0 seek=8388608).
|
||||
|
||||
In Solaris I do various tests with SVM (Sun's built in volume manager)
|
||||
and multiterabyte UFS. Occasionally, there are some strange write
|
||||
errors, where the volume manager drops its volumes and without a VM, a
|
||||
simple UFS fs write can fail too.
|
||||
|
||||
I see various errors logged by the kernel (Solaris'), these are some
|
||||
examples, both with and without SVM:
|
||||
Jun 21 10:42:14 solaris fctl: [ID 517869 kern.warning] WARNING:
|
||||
fp(1)::GPN_ID for D_ID=621200 failed
|
||||
Jun 21 10:42:14 solaris fctl: [ID 517869 kern.warning] WARNING:
|
||||
fp(1)::N_x Port with D_ID=621200, PWWN=210000e08b944419 disappeared from
|
||||
fabric
|
||||
Jun 21 10:42:53 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 10:42:53 solaris SCSI transport failed: reason
|
||||
'tran_err': retrying command
|
||||
Jun 21 10:43:06 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 10:43:06 solaris SCSI transport failed: reason 'timeout':
|
||||
retrying command
|
||||
Jun 21 10:43:13 solaris scsi: [ID 107833 kern.notice] Device is gone
|
||||
Jun 21 10:43:13 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 10:43:13 solaris transport rejected fatal error
|
||||
Jun 21 10:43:13 solaris md_stripe: [ID 641072 kern.warning] WARNING: md:
|
||||
d10: write error on /dev/dsk/c2t210000E08B944419d0s6
|
||||
Jun 21 10:43:13 solaris last message repeated 9 times
|
||||
Jun 21 10:43:13 solaris scsi: [ID 243001 kern.info]
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0 (fcp1):
|
||||
Jun 21 10:43:13 solaris offlining lun=0 (trace=0), target=621200
|
||||
(trace=2800004)
|
||||
Jun 21 10:43:13 solaris ufs: [ID 702911 kern.warning] WARNING: Error
|
||||
writing master during ufs log roll
|
||||
Jun 21 10:43:13 solaris ufs: [ID 127457 kern.warning] WARNING: ufs log
|
||||
for /mnt changed state to Error
|
||||
Jun 21 10:43:13 solaris ufs: [ID 616219 kern.warning] WARNING: Please
|
||||
umount(1M) /mnt and run fsck(1M)
|
||||
Jun 21 11:08:55 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:08:55 solaris offline or reservation conflict
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris offline or reservation conflict
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris offline or reservation conflict
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris i/o to invalid geometry
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris offline or reservation conflict
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris i/o to invalid geometry
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris offline or reservation conflict
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris i/o to invalid geometry
|
||||
Jun 21 11:09:43 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:43 solaris offline or reservation conflict
|
||||
Jun 21 11:09:43 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:43 solaris SYNCHRONIZE CACHE command failed (5)
|
||||
|
||||
I don't see anything in the dmesg on the target side.
|
||||
|
||||
After these errors SCST seems to be dead. I can't unload its modules and
|
||||
can't communicate it via /proc.
|
||||
A simple cat vdisk just waits and waits.
|
||||
|
||||
Could you please help? What should I set/collect/send in this case to
|
||||
help resolving this issue?
|
||||
|
||||
======================================================================
|
||||
|
||||
Hello,
|
||||
|
||||
I am trying to get scst working on an Opteron machine.
|
||||
|
||||
After some hours, playing with different kernel versions and different
|
||||
missing functions, I've sticked with a 2.6.15 and a
|
||||
drivers/scsi/scsi_lib.c hack from 2.6.14, which contains the
|
||||
scsi_wait_req. (Linux is a mess, each point release changes something.
|
||||
How can developers keep up with this?)
|
||||
|
||||
Now everything seems to be OK, I could load the modules and such.
|
||||
|
||||
I have a setup of two machines connected to each other in an FC-P2P
|
||||
manner. The two machines has two 2G links between them. On the initiator
|
||||
side I have FreeBSD, because I know that better and this is what I did
|
||||
some target mode tests.
|
||||
|
||||
The strange thing is that the loop seems to be only running at 1 Gbps:
|
||||
[ 61.731265] QLogic Fibre Channel HBA Driver
|
||||
[ 61.731454] GSI 21 sharing vector 0xD1 and IRQ 21
|
||||
[ 61.731563] ACPI: PCI Interrupt 0000:06:01.0[A] -> GSI 36 (level, low) -> IRQ 21
|
||||
[ 61.731821] qla2300 0000:06:01.0: Found an ISP2312, irq 21, iobase 0xffffc20000014000
|
||||
[ 61.732194] qla2300 0000:06:01.0: Configuring PCI space...
|
||||
[ 61.732441] qla2300 0000:06:01.0: Configure NVRAM parameters...
|
||||
[ 61.816885] qla2300 0000:06:01.0: Verifying loaded RISC code...
|
||||
[ 61.852177] qla2300 0000:06:01.0: Extended memory detected (512 KB)...
|
||||
[ 61.852294] qla2300 0000:06:01.0: Resizing request queue depth (2048 -> 4096)
|
||||
...
|
||||
[ 61.852604] qla2300 0000:06:01.0: LIP reset occured (f8e8).
|
||||
[ 61.852740] qla2300 0000:06:01.0: Waiting for LIP to complete...
|
||||
[ 62.865911] qla2300 0000:06:01.0: LIP occured (f7f7).
|
||||
[ 62.866042] qla2300 0000:06:01.0: LOOP UP detected (1 Gbps).
|
||||
[ 62.866269] qla2300 0000:06:01.0: Topology - (Loop), Host Loop address 0x0
|
||||
[ 62.868285] scsi0 : qla2xxx
|
||||
[ 62.868507] qla2300 0000:06:01.0:
|
||||
[ 62.868507] QLogic Fibre Channel HBA Driver: 8.01.03-k
|
||||
[ 62.868508] QLogic QLA2312 -
|
||||
[ 62.868509] ISP2312: PCI-X (100 MHz) @ 0000:06:01.0 hdma+, host#=0, fw=3.03.18 IPX
|
||||
|
||||
|
||||
I did the following:
|
||||
modprobe qla2x00tgt:
|
||||
|
||||
[ 104.988170] qla2x00tgt: no version for "scst_unregister" found: kernel tainted.
|
||||
|
||||
echo "open lun0 /data/lun0" >/proc/scsi_tgt/disk_fileio/disk_fileio
|
||||
[ 169.102877] scst: Device handler disk_fileio for type 0 loaded successfully
|
||||
[ 169.103002] scst: Device handler cdrom_fileio for type 5 loaded successfully
|
||||
[ 191.261000] dev_fileio: Attached SCSI target virtual disk lun0 (file="/data/lun0", fs=1000001MB, bs=512, nblocks=2048002048, cyln=1000001)
[ 191.261191] scst: Attached SCSI target mid-level to virtual device lun0 (id 1)
|
||||
|
||||
and
|
||||
echo "add lun0 0" > /proc/scsi_tgt/groups/Default/devices
|
||||
|
||||
On the other side a camcontrol rescan all (SCSI rescan) gives me the following with a verbose logging kernel:
|
||||
Mar 29 18:09:17 blade2 kernel: pass1: <SCST_FIO lun0 093> Fixed Direct Access SCSI-4 device
|
||||
Mar 29 18:09:17 blade2 kernel: pass1: Serial Number 383
|
||||
Mar 29 18:09:17 blade2 kernel: pass1: 100.000MB/s transfers
|
||||
Mar 29 18:09:17 blade2 kernel: da1 at isp0 bus 0 target 0 lun 0
|
||||
Mar 29 18:09:17 blade2 kernel: da1: <SCST_FIO lun0 093> Fixed Direct Access SCSI-4 device
|
||||
Mar 29 18:09:17 blade2 kernel: da1: Serial Number 383
|
||||
Mar 29 18:09:17 blade2 kernel: da1: 100.000MB/s transfers
|
||||
Mar 29 18:09:17 blade2 kernel: da1: 1024MB (2097152 512 byte sectors: 64H 32S/T 1024C)
|
||||
Mar 29 18:09:17 blade2 kernel: (probe0:isp0:0:0:1): error 6
|
||||
Mar 29 18:09:17 blade2 kernel: (probe0:isp0:0:0:1): Unretryable Error
|
||||
Mar 29 18:09:17 blade2 kernel: isp0: data overrun for command on 0.0.0
|
||||
Mar 29 18:09:17 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
|
||||
Mar 29 18:09:17 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:2): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:2): Unretryable Error
|
||||
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:3): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:3): Unretryable Error
|
||||
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:4): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:4): Unretryable Error
|
||||
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:5): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:5): Unretryable Error
|
||||
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): error 5
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retries Exausted
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:6): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:6): Unretryable Error
|
||||
Mar 29 18:09:19 blade2 kernel: (probe0:isp0:0:0:7): error 6
|
||||
Mar 29 18:09:19 blade2 kernel: (probe0:isp0:0:0:7): Unretryable Error
|
||||
|
||||
|
||||
The device is there, but I cannot use it.
|
||||
|
||||
BTW, the target mode machine (Linux) runs on a dual Opteron in 64 bit
|
||||
mode, with 8GB of RAM. I've lowered it with mem=800M, but the effect is
|
||||
the same.
|
||||
|
||||
Assuming that mixed 2.6.14-.15 kernel is the fault, could you please
|
||||
tell me what version should I use, for which all of the patches will
|
||||
work?
|
||||
|
||||
======================================================================
|
||||
|
||||
So, the bottom line: if you want me to be friendly, don't ask questions
whose answers you can find yourself by simply reading the documentation
and a minimal thinking effort.

It is also very desirable that you attach to your question the full
kernel log from the target since it was booted.
|
||||
|
||||
Vladislav Bolkhovitin <vst@vlnb.net>, http://scst.sourceforge.net
|
||||
@@ -4135,8 +4135,9 @@ int qla2xxx_tgt_register_driver(struct qla2x_tgt_initiator *tgt_data,
 
 	ENTER(__func__);
 
-	if ((tgt_data == NULL) || (tgt_data->magic != QLA2X_TARGET_MAGIC))
-	{
+	if ((tgt_data == NULL) || (tgt_data->magic != QLA2X_TARGET_MAGIC)) {
+		printk("***ERROR*** Wrong version of the target driver: %d\n",
+			tgt_data->magic);
 		res = -EINVAL;
 		goto out;
 	}
|
||||
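Note that in the hunk above the printk() sits under a condition that is also true when tgt_data == NULL, so on that path tgt_data->magic would be read through a NULL pointer. One possible way to avoid this is to test the two cases separately. The following is a userspace sketch and not part of the commit: struct tgt_template, TARGET_MAGIC and register_driver() are hypothetical stand-ins, the magic value is made up, and fprintf() replaces printk().

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define TARGET_MAGIC 0x107 /* made-up value standing in for QLA2X_TARGET_MAGIC */

/* Hypothetical stand-in for struct qla2x_tgt_initiator. */
struct tgt_template {
	int magic;
};

static int register_driver(const struct tgt_template *tgt_data)
{
	/* Handle NULL first, so that nothing below dereferences it. */
	if (tgt_data == NULL) {
		fprintf(stderr, "***ERROR*** No target driver data supplied\n");
		return -EINVAL;
	}

	/* Only now is it safe to read and print the magic value. */
	if (tgt_data->magic != TARGET_MAGIC) {
		fprintf(stderr,
			"***ERROR*** Wrong version of the target driver: %d\n",
			tgt_data->magic);
		return -EINVAL;
	}

	return 0;
}
```

Both failure cases still return -EINVAL, so callers see the same result as before; only the diagnostic differs.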
|
||||
316
scst/AskingQuestions
Normal file
@@ -0,0 +1,316 @@
|
||||
Before asking me or the scst-devel mailing list any questions, make sure
that you have read *ALL* relevant documentation files (at least 2 README
files: one for SCST and one for the target driver you are using) and
*understood* *ALL* that is written there. I personally very much like
working with people who understand what they are doing, and hate it when
somebody tries to use me as a replacement for his brain and to save his
time at the expense of mine. So, in such cases don't be surprised if
your question is ignored or answered in the RTFM style.

In particular, I will refuse to answer any questions about low
performance unless you *explicitly* write in your question that you
don't use the debug build and that you have ensured (write how) that
your target and backstorage devices don't share the same PCI bus.

Another too frequently asked area is "What do those aborts and resets,
which your target logs from time to time, mean and what should I do
about them?", "Do they relate to the I/O stalls I sometimes experience?"
and "Why was my device put offline after them?".

Sorry if the above sounds too harsh. Unfortunately, I have limited
energy and can't waste it on explaining basic concepts and answering the
same questions over and over.
|
||||
|
||||
Example of a really bad question:
|
||||
|
||||
======================================================================
|
||||
|
||||
In our user space driver , i use epoll_wait to wait on multiple file
|
||||
descriptors for multiple devices. Apparently when i wait on the ioctl in
|
||||
blocking mode , everything works well , but when i wait on epoll , and
|
||||
try to attach a target device , i get immediately a "Bad address" error
|
||||
value from the epoll.
|
||||
|
||||
What is the reason ?
|
||||
|
||||
======================================================================
|
||||
|
||||
It is bad because the author was apparently doing something wrong with
epoll, but instead of checking the source code to find out when the
"Bad address" error can be returned and to understand the possible
reasons for it, he expected me to do that for him. He didn't even bother
to look in the kernel log, where, very probably, the reason for the
error was logged.
|
||||
|
||||
|
||||
Here are three examples of good questions:
|
||||
|
||||
======================================================================
|
||||
|
||||
I'm looking for a help in understanding of SCST internal architecture
|
||||
and operation. The problem I'm experiencing now is that SCST seems to
|
||||
process deferred commands incorrectly in some cases. More specifically,
|
||||
I'm confused with the 'while' loop in scst_send_to_midlev function.
|
||||
|
||||
As far as I understand, the basic execution path consists of a call to
|
||||
scst_do_send_to_midlev followed by taking of a decision on this command
|
||||
(continue with this command, reschedule it, or move to the next one),
|
||||
the decision is stored in 'int res', which is then returned from the
|
||||
function.
|
||||
|
||||
However, if there are deferred commands on the device, the function does
|
||||
not return but makes another call to scst_do_send_to_midlev, analyzes
|
||||
the return code again and stores the decision in 'int res' thereby
|
||||
erasing the decision for the previous command. If scst_send_to_midlev
|
||||
exits now, it will return the _new_ decision (for the deferred command)
|
||||
whereas the scst_process_active_cmd will think that it is the decision
|
||||
for the command that was originally passed to scst_send_to_midlev.
|
||||
|
||||
For example, this will cause problems in the following situation:
|
||||
1. scst_send_to_midlev is called with cmd == 0x80000100
2. scst_do_send_to_midlev is called with cmd == 0x80000100
3. scst_do_send_to_midlev returns with SCST_EXEC_COMPLETED
(in certain scenarios the command is already destroyed at this point)
4. scst_check_deferred_commands finds the deferred cmd == 0x80000200
5. scst_do_send_to_midlev is called with cmd == 0x80000200
6. scst_do_send_to_midlev returns with SCST_EXEC_NEED_THREAD
7. scst_send_to_midlev returns with SCST_CMD_STATE_RES_NEED_THREAD
8. Now, the scst_process_active_cmd will try to reschedule command 0x80000100
which is already destroyed at this point !
|
||||
|
||||
Can anyone on the list confirm my guess? Or, this situation should never
|
||||
happen because of some other condition which I may have missed? Right
|
||||
now I can't think of any of simple methods to work around the issue,
|
||||
i.e. any of my ideas require rewriting significant part of the code.
|
||||
|
||||
======================================================================
|
||||
|
||||
Hello,
|
||||
|
||||
I have two machines (SCST targets) with the following parameters:
|
||||
- two dual core Xeon CPUs
|
||||
- QLA2342 FC HBA
|
||||
- Areca SATA RAID HBA
|
||||
- Linux 2.6.21.3, running in 64 bit mode with 16G RAM
|
||||
- SCST trunk version
|
||||
|
||||
On the client side there is a Solaris 10 U3 machine, with the same (chip
|
||||
wise) Qlogic controller.
|
||||
|
||||
There is an FC switch between the three machines, and each of the
|
||||
targets are zoned to the client's port in a one-by-one manner, so HBA
|
||||
port 1 sees only target 1 and port 2 sees only target 2.
|
||||
|
||||
The targets are configured with two large sparse files on XFS (8 TB
|
||||
each, with dd if=/dev/zero of=file bs=1M count=0 seek=8388608).
|
||||
|
||||
In Solaris I do various tests with SVM (Sun's built in volume manager)
|
||||
and multiterabyte UFS. Occasionally, there are some strange write
|
||||
errors, where the volume manager drops its volumes and without a VM, a
|
||||
simple UFS fs write can fail too.
|
||||
|
||||
I see various errors logged by the kernel (Solaris'), these are some
|
||||
examples, both with and without SVM:
|
||||
Jun 21 10:42:14 solaris fctl: [ID 517869 kern.warning] WARNING:
|
||||
fp(1)::GPN_ID for D_ID=621200 failed
|
||||
Jun 21 10:42:14 solaris fctl: [ID 517869 kern.warning] WARNING:
|
||||
fp(1)::N_x Port with D_ID=621200, PWWN=210000e08b944419 disappeared from
|
||||
fabric
|
||||
Jun 21 10:42:53 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 10:42:53 solaris SCSI transport failed: reason
|
||||
'tran_err': retrying command
|
||||
Jun 21 10:43:06 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 10:43:06 solaris SCSI transport failed: reason 'timeout':
|
||||
retrying command
|
||||
Jun 21 10:43:13 solaris scsi: [ID 107833 kern.notice] Device is gone
|
||||
Jun 21 10:43:13 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 10:43:13 solaris transport rejected fatal error
|
||||
Jun 21 10:43:13 solaris md_stripe: [ID 641072 kern.warning] WARNING: md:
|
||||
d10: write error on /dev/dsk/c2t210000E08B944419d0s6
|
||||
Jun 21 10:43:13 solaris last message repeated 9 times
|
||||
Jun 21 10:43:13 solaris scsi: [ID 243001 kern.info]
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0 (fcp1):
|
||||
Jun 21 10:43:13 solaris offlining lun=0 (trace=0), target=621200
|
||||
(trace=2800004)
|
||||
Jun 21 10:43:13 solaris ufs: [ID 702911 kern.warning] WARNING: Error
|
||||
writing master during ufs log roll
|
||||
Jun 21 10:43:13 solaris ufs: [ID 127457 kern.warning] WARNING: ufs log
|
||||
for /mnt changed state to Error
|
||||
Jun 21 10:43:13 solaris ufs: [ID 616219 kern.warning] WARNING: Please
|
||||
umount(1M) /mnt and run fsck(1M)
|
||||
Jun 21 11:08:55 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:08:55 solaris offline or reservation conflict
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris offline or reservation conflict
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris offline or reservation conflict
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris i/o to invalid geometry
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris offline or reservation conflict
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris i/o to invalid geometry
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris offline or reservation conflict
|
||||
Jun 21 11:09:41 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:41 solaris i/o to invalid geometry
|
||||
Jun 21 11:09:43 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:43 solaris offline or reservation conflict
|
||||
Jun 21 11:09:43 solaris scsi: [ID 107833 kern.warning] WARNING:
|
||||
/pci@1,0/pci1022,7450@a/pcie11,105@1,1/fp@0,0/disk@w210000e08b944419,0
|
||||
(sd1):
|
||||
Jun 21 11:09:43 solaris SYNCHRONIZE CACHE command failed (5)
|
||||
|
||||
I don't see anything in the dmesg on the target side.
|
||||
|
||||
After these errors SCST seems to be dead. I can't unload its modules and
|
||||
can't communicate it via /proc.
|
||||
A simple cat vdisk just waits and waits.
|
||||
|
||||
Could you please help? What should I set/collect/send in this case to
|
||||
help resolving this issue?
|
||||
|
||||
======================================================================
|
||||
|
||||
Hello,
|
||||
|
||||
I am trying to get scst working on an Opteron machine.
|
||||
|
||||
After some hours, playing with different kernel versions and different
|
||||
missing functions, I've sticked with a 2.6.15 and a
|
||||
drivers/scsi/scsi_lib.c hack from 2.6.14, which contains the
|
||||
scsi_wait_req. (Linux is a mess, each point release changes something.
|
||||
How can developers keep up with this?)
|
||||
|
||||
Now everything seems to be OK, I could load the modules and such.
|
||||
|
||||
I have a setup of two machines connected to each other in an FC-P2P
|
||||
manner. The two machines has two 2G links between them. On the initiator
|
||||
side I have FreeBSD, because I know that better and this is what I did
|
||||
some target mode tests.
|
||||
|
||||
The strange thing is that the loop seems to be only running at 1 Gbps:
|
||||
[ 61.731265] QLogic Fibre Channel HBA Driver
|
||||
[ 61.731454] GSI 21 sharing vector 0xD1 and IRQ 21
|
||||
[ 61.731563] ACPI: PCI Interrupt 0000:06:01.0[A] -> GSI 36 (level, low) -> IRQ 21
|
||||
[ 61.731821] qla2300 0000:06:01.0: Found an ISP2312, irq 21, iobase 0xffffc200
|
||||
00014000
|
||||
[ 61.732194] qla2300 0000:06:01.0: Configuring PCI space...
|
||||
[ 61.732441] qla2300 0000:06:01.0: Configure NVRAM parameters...
|
||||
[ 61.816885] qla2300 0000:06:01.0: Verifying loaded RISC code...
|
||||
[ 61.852177] qla2300 0000:06:01.0: Extended memory detected (512 KB)...
|
||||
[ 61.852294] qla2300 0000:06:01.0: Resizing request queue depth (2048 -> 4096)
|
||||
...
|
||||
[ 61.852604] qla2300 0000:06:01.0: LIP reset occured (f8e8).
|
||||
[ 61.852740] qla2300 0000:06:01.0: Waiting for LIP to complete...
|
||||
[ 62.865911] qla2300 0000:06:01.0: LIP occured (f7f7).
|
||||
[ 62.866042] qla2300 0000:06:01.0: LOOP UP detected (1 Gbps).
|
||||
[ 62.866269] qla2300 0000:06:01.0: Topology - (Loop), Host Loop address 0x0
|
||||
[ 62.868285] scsi0 : qla2xxx
|
||||
[ 62.868507] qla2300 0000:06:01.0:
|
||||
[ 62.868507] QLogic Fibre Channel HBA Driver: 8.01.03-k
|
||||
[ 62.868508] QLogic QLA2312 -
|
||||
[ 62.868509] ISP2312: PCI-X (100 MHz) @ 0000:06:01.0 hdma+, host#=0, fw=3.03.18 IPX
|
||||
|
||||
|
||||
I did the following:
|
||||
modprobe qla2x00tgt:
|
||||
|
||||
[ 104.988170] qla2x00tgt: no version for "scst_unregister" found: kernel tainted.
|
||||
|
||||
echo "open lun0 /data/lun0" >/proc/scsi_tgt/disk_fileio/disk_fileio
|
||||
[ 169.102877] scst: Device handler disk_fileio for type 0 loaded successfully
|
||||
[ 169.103002] scst: Device handler cdrom_fileio for type 5 loaded successfully
|
||||
[ 191.261000] dev_fileio: Attached SCSI target virtual disk lun0 (file="/data/l
|
||||
un0", fs=1000001MB, bs=512, nblocks=2048002048, cyln=1000001)
|
||||
[ 191.261191] scst: Attached SCSI target mid-level to virtual device lun0 (id 1
|
||||
)
|
||||
|
||||
and
|
||||
echo "add lun0 0" > /proc/scsi_tgt/groups/Default/devices
|
||||
|
||||
On the other side a camcontrol rescan all (SCSI rescan) gives me the following with a verbose logging kernel:
|
||||
Mar 29 18:09:17 blade2 kernel: pass1: <SCST_FIO lun0 093> Fixed Direct Access SCSI-4 device
|
||||
Mar 29 18:09:17 blade2 kernel: pass1: Serial Number 383
|
||||
Mar 29 18:09:17 blade2 kernel: pass1: 100.000MB/s transfers
|
||||
Mar 29 18:09:17 blade2 kernel: da1 at isp0 bus 0 target 0 lun 0
|
||||
Mar 29 18:09:17 blade2 kernel: da1: <SCST_FIO lun0 093> Fixed Direct Access SCSI-4 device
|
||||
Mar 29 18:09:17 blade2 kernel: da1: Serial Number 383
|
||||
Mar 29 18:09:17 blade2 kernel: da1: 100.000MB/s transfers
|
||||
Mar 29 18:09:17 blade2 kernel: da1: 1024MB (2097152 512 byte sectors: 64H 32S/T 1024C)
|
||||
Mar 29 18:09:17 blade2 kernel: (probe0:isp0:0:0:1): error 6
|
||||
Mar 29 18:09:17 blade2 kernel: (probe0:isp0:0:0:1): Unretryable Error
|
||||
Mar 29 18:09:17 blade2 kernel: isp0: data overrun for command on 0.0.0
|
||||
Mar 29 18:09:17 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
|
||||
Mar 29 18:09:17 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:2): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:2): Unretryable Error
|
||||
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:3): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:3): Unretryable Error
|
||||
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:4): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:4): Unretryable Error
|
||||
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retrying Command
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:5): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:5): Unretryable Error
|
||||
Mar 29 18:09:18 blade2 kernel: isp0: data overrun for command on 0.0.0
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Data Overrun
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): error 5
|
||||
Mar 29 18:09:18 blade2 kernel: (da1:isp0:0:0:0): Retries Exausted
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:6): error 6
|
||||
Mar 29 18:09:18 blade2 kernel: (probe0:isp0:0:0:6): Unretryable Error
|
||||
Mar 29 18:09:19 blade2 kernel: (probe0:isp0:0:0:7): error 6
|
||||
Mar 29 18:09:19 blade2 kernel: (probe0:isp0:0:0:7): Unretryable Error
|
||||
|
||||
|
||||
The device is there, but I cannot use it.
|
||||
|
||||
BTW, the target mode machine (Linux) runs on a dual Opteron in 64 bit
|
||||
mode, with 8GB of RAM. I've lowered it with mem=800M, but the effect is
|
||||
the same.
|
||||
|
||||
Assuming that mixed 2.6.14-.15 kernel is the fault, could you please
|
||||
tell me what version should I use, for which all of the patches will
|
||||
work?
|
||||
|
||||
======================================================================
|
||||
|
||||
So, as a bottom line, if you want me to be friendly, don't ask questions
whose answers you can find yourself by simply reading the documentation
and making a minimal thinking effort.

It is also very desirable that you attach to your question the full
kernel log from the target since it was booted.
|
||||
|
||||
Vladislav Bolkhovitin <vst@vlnb.net>, http://scst.sourceforge.net
|
||||
41
scst/README
@@ -711,10 +711,14 @@ debug2perf Makefile target.
|
||||
you have no choice, but PCI bus sharing, set in the BIOS PCI latency
|
||||
as low as possible.
|
||||
|
||||
6. If you use VDISK IO module in FILEIO mode, NV_CACHE option will
|
||||
provide you the best performance. But using it make sure you use a good
|
||||
UPS with ability to shutdown the target on the power failure.
|
||||
|
||||
IMPORTANT: If you use on initiator some versions of Windows (at least W2K)
|
||||
========= you can't get good write performance for VDISK FILEIO devices with
|
||||
default 512 bytes block sizes. You could get about 10% of the
|
||||
expected one. This is because of partition alignment, which
|
||||
expected one. This is because of the partition alignment, which
|
||||
is (simplifying) incompatible with how Linux page cache
|
||||
works, so for each write the corresponding block must be read
|
||||
first. Use 4096 bytes block sizes for VDISK devices and you
|
||||
@@ -732,31 +736,38 @@ What if target's backstorage is too slow
|
||||
|
||||
If under high load you experience I/O stalls or see in the kernel log on
|
||||
the target abort or reset messages, then your backstorage is too slow
|
||||
for your target link speed and amount of simultaneously queued commands.
|
||||
Simply processing of one or more commands takes too long, so initiator
|
||||
decides that they are stuck on the target and tries to recover.
|
||||
Particularly, it is known that the default amount of simultaneously
|
||||
queued commands (48) is sometimes too high if you do intensive writes
|
||||
from VMware on a target disk, which uses LVM in the snapshot mode. In
|
||||
this case value like 16 or even 10 depending of your backstorage speed
|
||||
could be more appropriate.
|
||||
comparing with your target link speed and amount of simultaneously
|
||||
queued commands. On some seek intensive workloads even fast disks or
|
||||
RAIDs, which able to serve continuous data stream on 500+ MB/s speed,
|
||||
can be as slow as 0.3 MB/s. So, simply processing of one or more
|
||||
commands takes too long time, hence initiator decides that they are
|
||||
stuck on the target and tries to recover. Particularly, it is known that
|
||||
the default amount of simultaneously queued commands (48) is sometimes
|
||||
too high if you do intensive writes from VMware on a target disk, which
|
||||
uses LVM in the snapshot mode. In this case value like 16 or even 8-10
|
||||
depending of your backstorage speed could be more appropriate.
|
||||
|
||||
Unfortunately, currently SCST lacks dynamic I/O flow control, when the
|
||||
queue depth on the target is dynamically decreased/increased based on
|
||||
how slow/fast the backstorage speed comparing to the target link. So,
|
||||
there are only 4 possible actions, which you can do to workaround or fix
|
||||
there are only 5 possible actions, which you can do to workaround or fix
|
||||
this issue:
|
||||
|
||||
1. Ignore incoming task management (TM) commands. It's fine if there are
|
||||
not too many of them, i.e. if the backstorage isn't too slow.
|
||||
not too many of them, so average performance isn't hurt and the
|
||||
corresponding device isn't put offline, i.e. if the backstorage isn't
|
||||
too slow.
|
||||
|
||||
2. Decrease /sys/block/sdX/device/queue_depth on the initiator in case
|
||||
if it's Linux (see below how) and/or SCST_MAX_TGT_DEV_COMMANDS constant
|
||||
in scst_priv.h file until you stop seeing incoming TM commands.
|
||||
if it's Linux (see below how) or/and SCST_MAX_TGT_DEV_COMMANDS constant
|
||||
in scst_priv.h file until you stop seeing incoming TM commands.
|
||||
ISCSI-SCST driver also has its own iSCSI specific parameter for that.
|
||||
|
||||
3. Increase speed of the target's backstorage.
|
||||
3. Try to avoid such seek intensive workloads.
|
||||
|
||||
4. Implement in SCST the dynamic I/O flow control.
|
||||
4. Increase speed of the target's backstorage.
|
||||
|
||||
5. Implement in SCST the dynamic I/O flow control.
|
||||
|
||||
To decrease device queue depth on Linux initiators run command:
|
||||
|
||||
|
||||
@@ -1097,6 +1097,9 @@ struct scst_cmd
|
||||
/* Set if tgt_sn field is valid */
|
||||
unsigned int tgt_sn_set:1;
|
||||
|
||||
/* Set if cmd is finished */
|
||||
unsigned int finished:1;
|
||||
|
||||
/**************************************************************/
|
||||
|
||||
unsigned long cmd_flags; /* cmd's async flags */
|
||||
@@ -1194,6 +1197,8 @@ struct scst_cmd
|
||||
uint8_t host_status; /* set by low-level driver to indicate status */
|
||||
uint8_t driver_status; /* set by mid-level */
|
||||
|
||||
uint8_t *sense; /* pointer to sense buffer */
|
||||
|
||||
/* Used for storage of target driver private stuff */
|
||||
void *tgt_priv;
|
||||
|
||||
@@ -1206,8 +1211,6 @@ struct scst_cmd
|
||||
*/
|
||||
int orig_sg_cnt, orig_sg_entry, orig_entry_len;
|
||||
|
||||
uint8_t sense_buffer[SCST_SENSE_BUFFERSIZE]; /* sense buffer */
|
||||
|
||||
/* List entry for dev's blocked_cmd_list */
|
||||
struct list_head blocked_cmd_list_entry;
|
||||
|
||||
@@ -2051,13 +2054,13 @@ static inline uint8_t scst_cmd_get_driver_status(struct scst_cmd *cmd)
|
||||
/* Returns pointer to cmd's sense buffer */
|
||||
static inline uint8_t *scst_cmd_get_sense_buffer(struct scst_cmd *cmd)
|
||||
{
|
||||
return cmd->sense_buffer;
|
||||
return cmd->sense;
|
||||
}
|
||||
|
||||
/* Returns cmd's sense buffer length */
|
||||
static inline int scst_cmd_get_sense_buffer_len(struct scst_cmd *cmd)
|
||||
{
|
||||
return sizeof(cmd->sense_buffer);
|
||||
return SCST_SENSE_BUFFERSIZE;
|
||||
}
|
||||
|
||||
/*
|
||||
@@ -2477,6 +2480,10 @@ struct proc_dir_entry *scst_create_proc_entry(struct proc_dir_entry * root,
|
||||
int scst_add_cmd_threads(int num);
|
||||
void scst_del_cmd_threads(int num);
|
||||
|
||||
int scst_alloc_sense(struct scst_cmd *cmd, int atomic);
|
||||
int scst_alloc_set_sense(struct scst_cmd *cmd, int atomic,
|
||||
const uint8_t *sense, unsigned int len);
|
||||
|
||||
void scst_set_sense(uint8_t *buffer, int len, int key,
|
||||
int asc, int ascq);
|
||||
|
||||
@@ -2513,6 +2520,12 @@ int scst_check_mem(struct scst_cmd *cmd);
|
||||
void scst_get(void);
|
||||
void scst_put(void);
|
||||
|
||||
/*
|
||||
* Cmd ref counters
|
||||
*/
|
||||
void scst_cmd_get(struct scst_cmd *cmd);
|
||||
void scst_cmd_put(struct scst_cmd *cmd);
|
||||
|
||||
/*
|
||||
* Allocates and returns pointer to SG vector with data size "size".
|
||||
* In *count returned the count of entries in the vector.
|
||||
|
||||
@@ -105,9 +105,11 @@ enum scst_cmd_queue_type
|
||||
*************************************************************/
|
||||
#define SCST_LOAD_SENSE(key_asc_ascq) key_asc_ascq
|
||||
|
||||
#define SCST_SENSE_VALID(sense) ((((uint8_t *)(sense))[0] & 0x70) == 0x70)
|
||||
#define SCST_SENSE_VALID(sense) ((sense != NULL) && \
|
||||
((((uint8_t *)(sense))[0] & 0x70) == 0x70))
|
||||
|
||||
#define SCST_NO_SENSE(sense) (((uint8_t *)(sense))[2] == 0)
|
||||
#define SCST_NO_SENSE(sense) ((sense != NULL) && \
|
||||
(((uint8_t *)(sense))[2] == 0))
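Since cmd->sense can now be NULL until a sense buffer is allocated, the two macros gain a NULL check. The following user-space sketch (not part of the commit) copies the new macro bodies from this hunk and shows how they behave on a NULL pointer. The helper has_real_sense() is a hypothetical name introduced only for illustration. Note that SCST_NO_SENSE() evaluates to false for NULL, so it seems safest to use it together with SCST_SENSE_VALID(), as the callers in this diff do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* New definitions, copied from the hunk this sketch follows */
#define SCST_SENSE_VALID(sense) ((sense != NULL) && \
		((((uint8_t *)(sense))[0] & 0x70) == 0x70))

#define SCST_NO_SENSE(sense) ((sense != NULL) && \
		(((uint8_t *)(sense))[2] == 0))

/*
 * Hypothetical helper (not in SCST): true only if the buffer exists,
 * carries a valid response code and a non-zero sense key byte.
 */
static int has_real_sense(const uint8_t *sense)
{
	return SCST_SENSE_VALID(sense) && !SCST_NO_SENSE(sense);
}
```

With a NULL pointer the first operand of each macro short-circuits, so the buffer is never dereferenced.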
|
||||
|
||||
static inline int scst_is_ua_sense(const uint8_t *sense)
|
||||
{
|
||||
|
||||
@@ -340,30 +340,29 @@ int tape_done(struct scst_cmd *cmd)
|
||||
if ((status == SAM_STAT_GOOD) || (status == SAM_STAT_CONDITION_MET)) {
|
||||
res = scst_tape_generic_dev_done(cmd, tape_set_block_size);
|
||||
} else if ((status == SAM_STAT_CHECK_CONDITION) &&
|
||||
SCST_SENSE_VALID(cmd->sense_buffer))
|
||||
SCST_SENSE_VALID(cmd->sense))
|
||||
{
|
||||
struct tape_params *params;
|
||||
TRACE_DBG("%s", "Extended sense");
|
||||
if (opcode == READ_6 && !(cmd->cdb[1] & SILI_BIT) &&
|
||||
(cmd->sense_buffer[2] & 0xe0)) { /* EOF, EOM, or ILI */
|
||||
(cmd->sense[2] & 0xe0)) { /* EOF, EOM, or ILI */
|
||||
int TransferLength, Residue = 0;
|
||||
if ((cmd->sense_buffer[2] & 0x0f) == BLANK_CHECK) {
|
||||
cmd->sense_buffer[2] &= 0xcf; /* No need for EOM in this case */
|
||||
if ((cmd->sense[2] & 0x0f) == BLANK_CHECK) {
|
||||
cmd->sense[2] &= 0xcf; /* No need for EOM in this case */
|
||||
}
|
||||
TransferLength = ((cmd->cdb[2] << 16) |
|
||||
(cmd->cdb[3] << 8) | cmd->cdb[4]);
|
||||
/* Compute the residual count */
|
||||
if ((cmd->sense_buffer[0] & 0x80) != 0) {
|
||||
Residue = ((cmd->sense_buffer[3] << 24) |
|
||||
(cmd->sense_buffer[4] << 16) |
|
||||
(cmd->sense_buffer[5] << 8) |
|
||||
cmd->sense_buffer[6]);
|
||||
if ((cmd->sense[0] & 0x80) != 0) {
|
||||
Residue = ((cmd->sense[3] << 24) |
|
||||
(cmd->sense[4] << 16) |
|
||||
(cmd->sense[5] << 8) |
|
||||
cmd->sense[6]);
|
||||
}
|
||||
TRACE_DBG("Checking the sense key "
|
||||
"sn[2]=%x cmd->cdb[0,1]=%x,%x TransLen/Resid %d/%d",
|
||||
(int) cmd->sense_buffer[2],
|
||||
cmd->cdb[0], cmd->cdb[1], TransferLength,
|
||||
Residue);
|
||||
(int) cmd->sense[2], cmd->cdb[0], cmd->cdb[1],
|
||||
TransferLength, Residue);
|
||||
if (TransferLength > Residue) {
|
||||
int resp_data_len = TransferLength - Residue;
|
||||
if (cmd->cdb[1] & SCST_TRANSFER_LEN_TYPE_FIXED) {
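The residual count in tape_done() is read from the INFORMATION field of fixed-format sense data (bytes 3 to 6), and only when the VALID bit (0x80 in byte 0) is set. A stand-alone sketch of that extraction is given here; it is not the driver's code. The function name sense_information() is an assumption, and unsigned arithmetic is used instead of the int shifts in the hunk so that a byte 3 value of 0x80 or above is not shifted into the sign bit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the big-endian INFORMATION field of
 * fixed-format sense data, or 0 when the VALID bit is not set.
 * The caller must pass a buffer of at least 7 bytes.
 */
static uint32_t sense_information(const uint8_t *sense)
{
	if ((sense[0] & 0x80) == 0)
		return 0;

	return ((uint32_t)sense[3] << 24) |
	       ((uint32_t)sense[4] << 16) |
	       ((uint32_t)sense[5] << 8) |
	       (uint32_t)sense[6];
}
```

A caller could then compare the returned value with the transfer length from the CDB, as the surrounding code does with TransferLength and Residue.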
|
||||
|
||||
@@ -1263,9 +1263,12 @@ static int dev_user_process_reply_exec(struct scst_user_cmd *ucmd,
|
||||
|
||||
cmd->status = ereply->status;
|
||||
if (ereply->sense_len != 0) {
|
||||
res = copy_from_user(cmd->sense_buffer,
|
||||
res = scst_alloc_sense(cmd, 0);
|
||||
if (res != 0)
|
||||
goto out_compl;
|
||||
res = copy_from_user(cmd->sense,
|
||||
(void*)(unsigned long)ereply->psense_buffer,
|
||||
min(sizeof(cmd->sense_buffer),
|
||||
min((unsigned int)SCST_SENSE_BUFFERSIZE,
|
||||
(unsigned int)ereply->sense_len));
|
||||
if (res < 0) {
|
||||
PRINT_ERROR("%s", "Unable to get sense data");
|
||||
@@ -1478,9 +1481,12 @@ again:
|
||||
u = NULL;
|
||||
if (!list_empty(cmd_list)) {
|
||||
u = list_entry(cmd_list->next, typeof(*u), ready_cmd_list_entry);
|
||||
|
||||
TRACE_DBG("Found ready ucmd %p", u);
|
||||
list_del(&u->ready_cmd_list_entry);
|
||||
|
||||
EXTRACHECKS_BUG_ON(u->state & UCMD_STATE_JAMMED_MASK);
|
||||
|
||||
if (u->cmd != NULL) {
|
||||
if (u->state == UCMD_STATE_EXECING) {
|
||||
struct scst_user_dev *dev = u->dev;
|
||||
|
||||
@@ -873,13 +873,12 @@ static int vdisk_do_job(struct scst_cmd *cmd)
|
||||
"loff=%Ld, data_len=%Ld, immed=%d", (uint64_t)loff,
|
||||
(uint64_t)data_len, immed);
|
||||
if (immed) {
|
||||
scst_get();
|
||||
scst_cmd_get(cmd);
|
||||
cmd->completed = 1;
|
||||
cmd->scst_cmd_done(cmd, SCST_CMD_STATE_DEFAULT);
|
||||
/* cmd is dead here */
|
||||
vdisk_fsync(thr, loff, data_len, NULL);
|
||||
/* ToDo: vdisk_fsync() error processing */
|
||||
scst_put();
|
||||
scst_cmd_put(cmd);
|
||||
goto out;
|
||||
} else {
|
||||
vdisk_fsync(thr, loff, data_len, cmd);
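In the immediate-completion branch the global scst_get()/scst_put() pair is replaced by a per-command reference, presumably so that the command structure stays allocated while vdisk_fsync() still runs after scst_cmd_done(). The pattern can be sketched in user space as follows. Everything named demo_* is hypothetical, C11 atomics stand in for the kernel's atomic_t, and the freed flag exists only to make the behaviour observable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct demo_cmd {
	atomic_int ref;	/* number of holders of this command */
	int *freed;	/* set to 1 when the command is released */
};

/* Allocate a command owned by one holder */
static struct demo_cmd *demo_cmd_alloc(int *freed_flag)
{
	struct demo_cmd *cmd = malloc(sizeof(*cmd));

	if (cmd == NULL)
		return NULL;
	atomic_init(&cmd->ref, 1);
	cmd->freed = freed_flag;
	*freed_flag = 0;
	return cmd;
}

/* Take an extra reference before handing the command to someone else */
static void demo_cmd_get(struct demo_cmd *cmd)
{
	atomic_fetch_add(&cmd->ref, 1);
}

/* Drop a reference; the last holder frees the command */
static void demo_cmd_put(struct demo_cmd *cmd)
{
	if (atomic_fetch_sub(&cmd->ref, 1) == 1) {
		*cmd->freed = 1;
		free(cmd);
	}
}
```

Taking the extra reference before the completion call means the completion path can drop its own reference without freeing the memory the caller is still using.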
|
||||
@@ -2333,7 +2332,7 @@ static void blockio_exec_rw(struct scst_cmd *cmd, struct scst_vdisk_thr *thr,
|
||||
|
||||
/* +1 to prevent erroneous too early command completion */
|
||||
atomic_set(&blockio_work->bios_inflight, bios+1);
|
||||
smp_mb();
|
||||
smp_wmb();
|
||||
|
||||
while (hbio) {
|
||||
bio = hbio;
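The "+1" in bios_inflight is a bias held by the submitting code: even if every bio completes before the submission loop ends, the counter cannot reach zero until the submitter drops its own count. A minimal user-space sketch of the idea, assuming C11 atomics and hypothetical demo_* names; it does not model the smp_mb()/smp_wmb() choice made in this hunk.

```c
#include <assert.h>
#include <stdatomic.h>

struct demo_work {
	atomic_int inflight;	/* outstanding pieces plus the submitter's bias */
	int completions;	/* how many times "command done" fired */
};

/* Called once per finished piece and once by the submitter at the end */
static void demo_put(struct demo_work *w)
{
	if (atomic_fetch_sub(&w->inflight, 1) == 1)
		w->completions++;
}

/* Submit n pieces; each one is simulated as completing immediately */
static void demo_submit_all(struct demo_work *w, int n)
{
	int i;

	atomic_store(&w->inflight, n + 1);	/* +1 is the submitter's bias */
	for (i = 0; i < n; i++)
		demo_put(w);			/* cannot hit zero yet */
	demo_put(w);				/* drop the bias last */
}
```

Without the bias, the first piece to finish could bring the counter to zero and complete the command while later pieces are still being submitted.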
|
||||
|
||||
@@ -41,6 +41,51 @@ static void scst_free_tgt_dev(struct scst_tgt_dev *tgt_dev);
|
||||
static void scst_check_internal_sense(struct scst_device *dev, int result,
|
||||
uint8_t *sense, int sense_len);
|
||||
|
||||
int scst_alloc_sense(struct scst_cmd *cmd, int atomic)
|
||||
{
|
||||
int res = 0;
|
||||
unsigned long gfp_mask = atomic ? GFP_ATOMIC : (GFP_KERNEL|__GFP_NOFAIL);
|
||||
|
||||
TRACE_ENTRY();
|
||||
|
||||
sBUG_ON(cmd->sense != NULL);
|
||||
|
||||
cmd->sense = mempool_alloc(scst_sense_mempool, gfp_mask);
|
||||
if (cmd->sense == NULL) {
|
||||
PRINT_ERROR("FATAL!!! Sense memory allocation failed (op %x). "
|
||||
"The sense data will be lost!!", cmd->cdb[0]);
|
||||
res = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
|
||||
memset(cmd->sense, 0, SCST_SENSE_BUFFERSIZE);
|
||||
|
||||
out:
|
||||
TRACE_EXIT_RES(res);
|
||||
return res;
|
||||
}
|
||||
|
||||
int scst_alloc_set_sense(struct scst_cmd *cmd, int atomic,
|
||||
const uint8_t *sense, unsigned int len)
|
||||
{
|
||||
int res;
|
||||
|
||||
TRACE_ENTRY();
|
||||
|
||||
res = scst_alloc_sense(cmd, atomic);
|
||||
if (res != 0) {
|
||||
PRINT_BUFFER("Lost sense", sense, len);
|
||||
goto out;
|
||||
}
|
||||
|
||||
memcpy(cmd->sense, sense, min((int)len, (int)SCST_SENSE_BUFFERSIZE));
|
||||
TRACE_BUFFER("Sense set", cmd->sense, SCST_SENSE_BUFFERSIZE);
|
||||
|
||||
out:
|
||||
TRACE_EXIT_RES(res);
|
||||
return res;
|
||||
}
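The two functions above replace the fixed sense_buffer array with a buffer that is allocated only when sense data actually has to be stored. A rough user-space sketch of the same lifecycle follows. It is an illustration under assumptions, not the SCST code: calloc()/free() stand in for the mempool, all demo_* names are hypothetical, and the buffer size of 96 is a placeholder for SCST_SENSE_BUFFERSIZE.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_SENSE_BUFFERSIZE 96	/* placeholder size, an assumption */

struct demo_cmd {
	uint8_t *sense;	/* NULL until sense data is needed */
};

/* Allocate a zeroed sense buffer; at most one per command */
static int demo_alloc_sense(struct demo_cmd *cmd)
{
	assert(cmd->sense == NULL);	/* mirrors sBUG_ON(cmd->sense != NULL) */

	cmd->sense = calloc(1, DEMO_SENSE_BUFFERSIZE);
	return (cmd->sense == NULL) ? -ENOMEM : 0;
}

/* Allocate the buffer and copy in at most DEMO_SENSE_BUFFERSIZE bytes */
static int demo_alloc_set_sense(struct demo_cmd *cmd, const uint8_t *sense,
				size_t len)
{
	int res = demo_alloc_sense(cmd);

	if (res != 0)
		return res;
	if (len > DEMO_SENSE_BUFFERSIZE)
		len = DEMO_SENSE_BUFFERSIZE;
	memcpy(cmd->sense, sense, len);
	return 0;
}

/* Release the buffer when the command is freed */
static void demo_free_sense(struct demo_cmd *cmd)
{
	free(cmd->sense);
	cmd->sense = NULL;
}
```

Commands that finish without an error never pay for the buffer, which is the apparent motivation for making it dynamic.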
|
||||
|
||||
void scst_set_cmd_error_status(struct scst_cmd *cmd, int status)
|
||||
{
|
||||
TRACE_ENTRY();
|
||||
@@ -60,13 +105,23 @@ void scst_set_cmd_error_status(struct scst_cmd *cmd, int status)
|
||||
|
||||
void scst_set_cmd_error(struct scst_cmd *cmd, int key, int asc, int ascq)
|
||||
{
|
||||
int rc;
|
||||
|
||||
TRACE_ENTRY();
|
||||
|
||||
scst_set_cmd_error_status(cmd, SAM_STAT_CHECK_CONDITION);
|
||||
scst_set_sense(cmd->sense_buffer, sizeof(cmd->sense_buffer),
|
||||
key, asc, ascq);
|
||||
TRACE_BUFFER("Sense set", cmd->sense_buffer, sizeof(cmd->sense_buffer));
|
||||
|
||||
rc = scst_alloc_sense(cmd, 1);
|
||||
if (rc != 0) {
|
||||
PRINT_ERROR("Lost sense data (key %x, asc %x, ascq %x)",
|
||||
key, asc, ascq);
|
||||
goto out;
|
||||
}
|
||||
|
||||
scst_set_sense(cmd->sense, SCST_SENSE_BUFFERSIZE, key, asc, ascq);
|
||||
TRACE_BUFFER("Sense set", cmd->sense, SCST_SENSE_BUFFERSIZE);
|
||||
|
||||
out:
|
||||
TRACE_EXIT();
|
||||
return;
|
||||
}
|
||||
@@ -90,11 +145,7 @@ void scst_set_cmd_error_sense(struct scst_cmd *cmd, uint8_t *sense,
|
||||
TRACE_ENTRY();
|
||||
|
||||
scst_set_cmd_error_status(cmd, SAM_STAT_CHECK_CONDITION);
|
||||
|
||||
memset(cmd->sense_buffer, 0, sizeof(cmd->sense_buffer));
|
||||
memcpy(cmd->sense_buffer, sense, min((unsigned long)len,
|
||||
(unsigned long)sizeof(cmd->sense_buffer)));
|
||||
TRACE_BUFFER("Sense set", cmd->sense_buffer, sizeof(cmd->sense_buffer));
|
||||
scst_alloc_set_sense(cmd, 1, sense, len);
|
||||
|
||||
TRACE_EXIT();
|
||||
return;
|
||||
@@ -857,7 +908,7 @@ void scst_free_internal_cmd(struct scst_cmd *cmd)
|
||||
{
|
||||
TRACE_ENTRY();
|
||||
|
||||
scst_cmd_put(cmd);
|
||||
__scst_cmd_put(cmd);
|
||||
|
||||
TRACE_EXIT();
|
||||
return;
|
||||
@@ -897,35 +948,33 @@ out_error:
|
||||
#undef sbuf_size
|
||||
}
|
||||
|
||||
struct scst_cmd *scst_complete_request_sense(struct scst_cmd *cmd)
|
||||
struct scst_cmd *scst_complete_request_sense(struct scst_cmd *req_cmd)
|
||||
{
|
||||
struct scst_cmd *orig_cmd = cmd->orig_cmd;
|
||||
struct scst_cmd *orig_cmd = req_cmd->orig_cmd;
|
||||
uint8_t *buf;
|
||||
int len;
|
||||
|
||||
TRACE_ENTRY();
|
||||
|
||||
if (cmd->dev->handler->dev_done != NULL) {
|
||||
if (req_cmd->dev->handler->dev_done != NULL) {
|
||||
int rc;
|
||||
TRACE_DBG("Calling dev handler %s dev_done(%p)",
|
||||
cmd->dev->handler->name, cmd);
|
||||
rc = cmd->dev->handler->dev_done(cmd);
|
||||
req_cmd->dev->handler->name, req_cmd);
|
||||
rc = req_cmd->dev->handler->dev_done(req_cmd);
|
||||
TRACE_DBG("Dev handler %s dev_done() returned %d",
|
||||
cmd->dev->handler->name, rc);
|
||||
req_cmd->dev->handler->name, rc);
|
||||
}
|
||||
|
||||
sBUG_ON(orig_cmd);
|
||||
|
||||
len = scst_get_buf_first(cmd, &buf);
|
||||
len = scst_get_buf_first(req_cmd, &buf);
|
||||
|
||||
if (scsi_status_is_good(cmd->status) && (len > 0) &&
|
||||
SCST_SENSE_VALID(buf) && (!SCST_NO_SENSE(buf)))
|
||||
{
|
||||
if (scsi_status_is_good(req_cmd->status) && (len > 0) &&
|
||||
SCST_SENSE_VALID(buf) && (!SCST_NO_SENSE(buf))) {
|
||||
PRINT_BUFF_FLAG(TRACE_SCSI, "REQUEST SENSE returned",
|
||||
buf, len);
|
||||
memcpy(orig_cmd->sense_buffer, buf,
|
||||
((int)sizeof(orig_cmd->sense_buffer) > len) ?
|
||||
len : (int)sizeof(orig_cmd->sense_buffer));
|
||||
scst_alloc_set_sense(orig_cmd, scst_cmd_atomic(req_cmd), buf,
|
||||
len);
|
||||
} else {
|
||||
PRINT_ERROR("%s", "Unable to get the sense via "
|
||||
"REQUEST SENSE, returning HARDWARE ERROR");
|
||||
@@ -934,9 +983,9 @@ struct scst_cmd *scst_complete_request_sense(struct scst_cmd *cmd)
|
||||
}
|
||||
|
||||
if (len > 0)
|
||||
scst_put_buf(cmd, buf);
|
||||
scst_put_buf(req_cmd, buf);
|
||||
|
||||
scst_free_internal_cmd(cmd);
|
||||
scst_free_internal_cmd(req_cmd);
|
||||
|
||||
TRACE_EXIT_HRES((unsigned long)orig_cmd);
|
||||
return orig_cmd;
|
||||
@@ -1212,6 +1261,16 @@ void scst_sched_session_free(struct scst_session *sess)
|
||||
return;
|
||||
}
|
||||
|
||||
void scst_cmd_get(struct scst_cmd *cmd)
|
||||
{
|
||||
__scst_cmd_get(cmd);
|
||||
}
|
||||
|
||||
void scst_cmd_put(struct scst_cmd *cmd)
|
||||
{
|
||||
__scst_cmd_put(cmd);
|
||||
}
|
||||
|
||||
struct scst_cmd *scst_alloc_cmd(int gfp_mask)
|
||||
{
|
||||
struct scst_cmd *cmd;
|
||||
@@ -1283,19 +1342,6 @@ void scst_free_cmd(struct scst_cmd *cmd)
|
||||
}
|
||||
#endif
|
||||
|
||||
if (likely(cmd->tgt_dev != NULL)) {
|
||||
atomic_dec(&cmd->tgt_dev->tgt_dev_cmd_count);
|
||||
atomic_dec(&cmd->dev->dev_cmd_count);
|
||||
}
|
||||
|
||||
/*
|
||||
* cmd->mgmt_cmnd can't being changed here, since for that it either
|
||||
* must be on search_cmd_list, or cmd_ref must be taken. Both are
|
||||
* false here.
|
||||
*/
|
||||
if (unlikely(cmd->mgmt_cmnd))
|
||||
scst_complete_cmd_mgmt(cmd, cmd->mgmt_cmnd);
|
||||
|
||||
scst_check_restore_sg_buff(cmd);
|
||||
|
||||
if (unlikely(cmd->internal)) {
|
||||
@@ -1324,6 +1370,12 @@ void scst_free_cmd(struct scst_cmd *cmd)

	scst_release_space(cmd);

	if (unlikely(cmd->sense != NULL)) {
		TRACE_MEM("Releasing sense %p (cmd %p)", cmd->sense, cmd);
		mempool_free(cmd->sense, scst_sense_mempool);
		cmd->sense = NULL;
	}

	if (likely(cmd->tgt_dev != NULL)) {
#ifdef EXTRACHECKS
		if (unlikely(!cmd->sent_to_midlev)) {
@@ -2446,8 +2498,8 @@ void scst_alloc_set_UA(struct scst_tgt_dev *tgt_dev,
	if (sense_len > (int)sizeof(UA_entry->UA_sense_buffer))
		sense_len = sizeof(UA_entry->UA_sense_buffer);
	memcpy(UA_entry->UA_sense_buffer, sense, sense_len);

	set_bit(SCST_TGT_DEV_UA_PENDING, &tgt_dev->tgt_dev_flags);
	smp_mb__after_set_bit();

	TRACE_MGMT_DBG("Adding new UA to tgt_dev %p", tgt_dev);

@@ -2693,6 +2745,9 @@ void scst_block_dev(struct scst_device *dev, int outstanding)
	__scst_block_dev(dev);
	spin_unlock_bh(&dev->dev_lock);

	/* spin_unlock_bh() doesn't provide the necessary memory barrier */
	smp_mb();

	TRACE_MGMT_DBG("Waiting during blocking outstanding %d (on_dev_count "
		"%d)", outstanding, atomic_read(&dev->on_dev_count));
	wait_event(dev->on_dev_waitQ,
@@ -2971,6 +3026,7 @@ void scst_xmit_process_aborted_cmd(struct scst_cmd *cmd)
{
	TRACE_ENTRY();

	smp_rmb();
	if (test_bit(SCST_CMD_ABORTED_OTHER, &cmd->cmd_flags)) {
		if (cmd->completed) {
			/* It's completed and it's OK to return its result */
@@ -2992,11 +3048,13 @@ void scst_xmit_process_aborted_cmd(struct scst_cmd *cmd)
		}
	} else {
		if ((cmd->tgt_dev != NULL) &&
		    scst_is_ua_sense(cmd->sense_buffer)) {
		    scst_is_ua_sense(cmd->sense)) {
			/* This UA delivery is going to fail, so requeue it */
			TRACE_MGMT_DBG("Requeuing UA for aborted cmd %p", cmd);
			scst_check_set_UA(cmd->tgt_dev, cmd->sense_buffer,
				sizeof(cmd->sense_buffer), 1);
			scst_check_set_UA(cmd->tgt_dev, cmd->sense,
				SCST_SENSE_BUFFERSIZE, 1);
			mempool_free(cmd->sense, scst_sense_mempool);
			cmd->sense = NULL;
		}
	}

@@ -3116,7 +3174,7 @@ static void tm_dbg_timer_fn(unsigned long arg)
{
	TRACE_MGMT_DBG("%s", "delayed cmd timer expired");
	tm_dbg_flags.tm_dbg_release = 1;
	smp_mb();
	smp_wmb();
	wake_up_all(tm_dbg_p_cmd_list_waitQ);
}

@@ -3328,7 +3386,7 @@ void tm_dbg_task_mgmt(struct scst_device *dev, const char *fn, int force)
			tm_dbg_delayed_cmds_count);
		tm_dbg_change_state();
		tm_dbg_flags.tm_dbg_release = 1;
		smp_mb();
		smp_wmb();
		if (tm_dbg_p_cmd_list_waitQ != NULL)
			wake_up_all(tm_dbg_p_cmd_list_waitQ);
	} else {

@@ -52,6 +52,12 @@
handlers will not be supported.
#endif

/**
 ** SCST global variables. They are all uninitialized to have their layout in
 ** memory be exactly as specified. Otherwise compiler puts zero-initialized
 ** variable separately from nonzero-initialized ones.
 **/

/*
 * All targets, devices and dev_types management is done under this mutex.
 *
@@ -60,40 +66,42 @@
 * scst_user detach_tgt(), which is called under scst_mutex and calls
 * flush_scheduled_work().
 */
DEFINE_MUTEX(scst_mutex);
struct mutex scst_mutex;

LIST_HEAD(scst_template_list);
LIST_HEAD(scst_dev_list);
LIST_HEAD(scst_dev_type_list);
struct list_head scst_template_list;
struct list_head scst_dev_list;
struct list_head scst_dev_type_list;

spinlock_t scst_main_lock = SPIN_LOCK_UNLOCKED;
spinlock_t scst_main_lock;

struct kmem_cache *scst_mgmt_cachep;
mempool_t *scst_mgmt_mempool;
struct kmem_cache *scst_ua_cachep;
mempool_t *scst_ua_mempool;
struct kmem_cache *scst_sense_cachep;
mempool_t *scst_sense_mempool;
struct kmem_cache *scst_tgtd_cachep;
struct kmem_cache *scst_sess_cachep;
struct kmem_cache *scst_acgd_cachep;

LIST_HEAD(scst_acg_list);
struct list_head scst_acg_list;
struct scst_acg *scst_default_acg;

spinlock_t scst_init_lock = SPIN_LOCK_UNLOCKED;
DECLARE_WAIT_QUEUE_HEAD(scst_init_cmd_list_waitQ);
LIST_HEAD(scst_init_cmd_list);
spinlock_t scst_init_lock;
wait_queue_head_t scst_init_cmd_list_waitQ;
struct list_head scst_init_cmd_list;
unsigned int scst_init_poll_cnt;

struct kmem_cache *scst_cmd_cachep;

#if defined(DEBUG) || defined(TRACING)
unsigned long scst_trace_flag = SCST_DEFAULT_LOG_FLAGS;
unsigned long scst_trace_flag;
#endif

unsigned long scst_flags;
atomic_t scst_cmd_count = ATOMIC_INIT(0);
atomic_t scst_cmd_count;

spinlock_t scst_cmd_mem_lock = SPIN_LOCK_UNLOCKED;
spinlock_t scst_cmd_mem_lock;
unsigned long scst_cur_cmd_mem, scst_cur_max_cmd_mem;
unsigned long scst_max_cmd_mem;

@@ -101,33 +109,33 @@ struct scst_cmd_lists scst_main_cmd_lists;

struct scst_tasklet scst_tasklets[NR_CPUS];

spinlock_t scst_mcmd_lock = SPIN_LOCK_UNLOCKED;
LIST_HEAD(scst_active_mgmt_cmd_list);
LIST_HEAD(scst_delayed_mgmt_cmd_list);
DECLARE_WAIT_QUEUE_HEAD(scst_mgmt_cmd_list_waitQ);
spinlock_t scst_mcmd_lock;
struct list_head scst_active_mgmt_cmd_list;
struct list_head scst_delayed_mgmt_cmd_list;
wait_queue_head_t scst_mgmt_cmd_list_waitQ;

DECLARE_WAIT_QUEUE_HEAD(scst_mgmt_waitQ);
spinlock_t scst_mgmt_lock = SPIN_LOCK_UNLOCKED;
LIST_HEAD(scst_sess_init_list);
LIST_HEAD(scst_sess_shut_list);
wait_queue_head_t scst_mgmt_waitQ;
spinlock_t scst_mgmt_lock;
struct list_head scst_sess_init_list;
struct list_head scst_sess_shut_list;

DECLARE_WAIT_QUEUE_HEAD(scst_dev_cmd_waitQ);
wait_queue_head_t scst_dev_cmd_waitQ;

DEFINE_MUTEX(scst_suspend_mutex);
LIST_HEAD(scst_cmd_lists_list); /* protected by scst_suspend_mutex */
struct mutex scst_suspend_mutex;
struct list_head scst_cmd_lists_list;

static int scst_threads;
struct scst_threads_info_t scst_threads_info;

static int suspend_count;

int scst_virt_dev_last_id = 1; /* protected by scst_mutex */
static int scst_virt_dev_last_id; /* protected by scst_mutex */

/*
 * This buffer and lock are intended to avoid memory allocation, which
 * could fail in improper places.
 */
spinlock_t scst_temp_UA_lock = SPIN_LOCK_UNLOCKED;
spinlock_t scst_temp_UA_lock;
uint8_t scst_temp_UA[SCST_SENSE_BUFFERSIZE];

module_param_named(scst_threads, scst_threads, int, 0);
@@ -235,12 +243,12 @@ out:
	TRACE_EXIT_RES(res);
	return res;

out_m_up:
	mutex_unlock(&m);

out_cleanup:
	scst_cleanup_proc_target_dir_entries(vtt);

out_m_up:
	mutex_unlock(&m);

out_err:
	PRINT_ERROR("Failed to register target template %s", vtt->name);
	goto out;
@@ -1466,7 +1474,6 @@ static void __init scst_print_config(void)
static int __init init_scst(void)
{
	int res = 0, i;
	struct scst_cmd *cmd;
	int scst_num_cpus;

	TRACE_ENTRY();
@@ -1474,14 +1481,13 @@ static int __init init_scst(void)
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,18)
	{
		struct scsi_request *req;
		BUILD_BUG_ON(sizeof(cmd->sense_buffer) !=
		BUILD_BUG_ON(SCST_SENSE_BUFFERSIZE !=
			sizeof(req->sr_sense_buffer));
	}
#else
	{
		struct scsi_sense_hdr *shdr;
		BUILD_BUG_ON((sizeof(cmd->sense_buffer) < sizeof(*shdr)) &&
			(sizeof(cmd->sense_buffer) >= SCST_SENSE_BUFFERSIZE));
		BUILD_BUG_ON(SCST_SENSE_BUFFERSIZE < sizeof(*shdr));
	}
#endif
	{
@@ -1496,6 +1502,34 @@ static int __init init_scst(void)
	BUILD_BUG_ON(SCST_DATA_READ != DMA_FROM_DEVICE);
	BUILD_BUG_ON(SCST_DATA_NONE != DMA_NONE);

	mutex_init(&scst_mutex);
	INIT_LIST_HEAD(&scst_template_list);
	INIT_LIST_HEAD(&scst_dev_list);
	INIT_LIST_HEAD(&scst_dev_type_list);
	spin_lock_init(&scst_main_lock);
	INIT_LIST_HEAD(&scst_acg_list);
	spin_lock_init(&scst_init_lock);
	init_waitqueue_head(&scst_init_cmd_list_waitQ);
	INIT_LIST_HEAD(&scst_init_cmd_list);
#if defined(DEBUG) || defined(TRACING)
	scst_trace_flag = SCST_DEFAULT_LOG_FLAGS;
#endif
	atomic_set(&scst_cmd_count, 0);
	spin_lock_init(&scst_cmd_mem_lock);
	spin_lock_init(&scst_mcmd_lock);
	INIT_LIST_HEAD(&scst_active_mgmt_cmd_list);
	INIT_LIST_HEAD(&scst_delayed_mgmt_cmd_list);
	init_waitqueue_head(&scst_mgmt_cmd_list_waitQ);
	init_waitqueue_head(&scst_mgmt_waitQ);
	spin_lock_init(&scst_mgmt_lock);
	INIT_LIST_HEAD(&scst_sess_init_list);
	INIT_LIST_HEAD(&scst_sess_shut_list);
	init_waitqueue_head(&scst_dev_cmd_waitQ);
	mutex_init(&scst_suspend_mutex);
	INIT_LIST_HEAD(&scst_cmd_lists_list);
	scst_virt_dev_last_id = 1;
	spin_lock_init(&scst_temp_UA_lock);

	spin_lock_init(&scst_main_cmd_lists.cmd_list_lock);
	INIT_LIST_HEAD(&scst_main_cmd_lists.active_cmd_list);
	init_waitqueue_head(&scst_main_cmd_lists.cmd_list_waitQ);
@@ -1525,7 +1559,11 @@ static int __init init_scst(void)

	INIT_CACHEP(scst_mgmt_cachep, scst_mgmt_cmd, out);
	INIT_CACHEP(scst_ua_cachep, scst_tgt_dev_UA, out_destroy_mgmt_cache);
	INIT_CACHEP(scst_cmd_cachep, scst_cmd, out_destroy_ua_cache);
	{
		struct scst_sense { uint8_t s[SCST_SENSE_BUFFERSIZE]; };
		INIT_CACHEP(scst_sense_cachep, scst_sense, out_destroy_ua_cache);
	}
	INIT_CACHEP(scst_cmd_cachep, scst_cmd, out_destroy_sense_cache);
	INIT_CACHEP(scst_sess_cachep, scst_session, out_destroy_cmd_cache);
	INIT_CACHEP(scst_tgtd_cachep, scst_tgt_dev, out_destroy_sess_cache);
	INIT_CACHEP(scst_acgd_cachep, scst_acg_dev, out_destroy_tgt_cache);
@@ -1544,6 +1582,14 @@ static int __init init_scst(void)
		goto out_destroy_mgmt_mempool;
	}

	/* Loosing sense may have fatal consequences, so let's have a big pool */
	scst_sense_mempool = mempool_create(128, mempool_alloc_slab,
		mempool_free_slab, scst_sense_cachep);
	if (scst_sense_mempool == NULL) {
		res = -ENOMEM;
		goto out_destroy_ua_mempool;
	}

	if (scst_max_cmd_mem == 0) {
		struct sysinfo si;
		si_meminfo(&si);
@@ -1558,7 +1604,7 @@ static int __init init_scst(void)

	res = scst_sgv_pools_init(scst_max_cmd_mem, 0);
	if (res != 0)
		goto out_destroy_ua_mempool;
		goto out_destroy_sense_mempool;

	scst_default_acg = scst_alloc_add_acg(SCST_DEFAULT_ACG_NAME);
	if (scst_default_acg == NULL) {
@@ -1611,6 +1657,9 @@ out_free_acg:
out_destroy_sgv_pool:
	scst_sgv_pools_deinit();

out_destroy_sense_mempool:
	mempool_destroy(scst_sense_mempool);

out_destroy_ua_mempool:
	mempool_destroy(scst_ua_mempool);

@@ -1629,6 +1678,9 @@ out_destroy_sess_cache:
out_destroy_cmd_cache:
	kmem_cache_destroy(scst_cmd_cachep);

out_destroy_sense_cache:
	kmem_cache_destroy(scst_sense_cachep);

out_destroy_ua_cache:
	kmem_cache_destroy(scst_ua_cachep);

@@ -1664,9 +1716,11 @@ static void __exit exit_scst(void)

	mempool_destroy(scst_mgmt_mempool);
	mempool_destroy(scst_ua_mempool);
	mempool_destroy(scst_sense_mempool);

	DEINIT_CACHEP(scst_mgmt_cachep);
	DEINIT_CACHEP(scst_ua_cachep);
	DEINIT_CACHEP(scst_sense_cachep);
	DEINIT_CACHEP(scst_cmd_cachep);
	DEINIT_CACHEP(scst_sess_cachep);
	DEINIT_CACHEP(scst_tgtd_cachep);
@@ -1695,6 +1749,8 @@ EXPORT_SYMBOL(scst_set_busy);
EXPORT_SYMBOL(scst_set_cmd_error_status);
EXPORT_SYMBOL(scst_set_cmd_error);
EXPORT_SYMBOL(scst_set_resp_data_len);
EXPORT_SYMBOL(scst_alloc_sense);
EXPORT_SYMBOL(scst_alloc_set_sense);
EXPORT_SYMBOL(scst_set_sense);
EXPORT_SYMBOL(scst_set_cmd_error_sense);

@@ -1739,6 +1795,9 @@ EXPORT_SYMBOL(scst_single_seq_open);
EXPORT_SYMBOL(scst_get);
EXPORT_SYMBOL(scst_put);

EXPORT_SYMBOL(scst_cmd_get);
EXPORT_SYMBOL(scst_cmd_put);

EXPORT_SYMBOL(scst_alloc);
EXPORT_SYMBOL(scst_free);


@@ -138,24 +138,13 @@ static inline int scst_get_context(void)

extern unsigned long scst_max_cmd_mem;

#define SCST_MGMT_CMD_CACHE_STRING "scst_mgmt_cmd"
extern struct kmem_cache *scst_mgmt_cachep;
extern mempool_t *scst_mgmt_mempool;

#define SCST_UA_CACHE_STRING "scst_ua"
extern struct kmem_cache *scst_ua_cachep;
extern mempool_t *scst_ua_mempool;
extern mempool_t *scst_sense_mempool;

#define SCST_CMD_CACHE_STRING "scst_cmd"
extern struct kmem_cache *scst_cmd_cachep;

#define SCST_SESSION_CACHE_STRING "scst_session"
extern struct kmem_cache *scst_sess_cachep;

#define SCST_TGT_DEV_CACHE_STRING "scst_tgt_dev"
extern struct kmem_cache *scst_tgtd_cachep;

#define SCST_ACG_DEV_CACHE_STRING "scst_acg_dev"
extern struct kmem_cache *scst_acgd_cachep;

extern spinlock_t scst_main_lock;
@@ -489,12 +478,12 @@ static inline void scst_sess_put(struct scst_session *sess)
		scst_sched_session_free(sess);
}

static inline void scst_cmd_get(struct scst_cmd *cmd)
static inline void __scst_cmd_get(struct scst_cmd *cmd)
{
	atomic_inc(&cmd->cmd_ref);
}

static inline void scst_cmd_put(struct scst_cmd *cmd)
static inline void __scst_cmd_put(struct scst_cmd *cmd)
{
	if (atomic_dec_and_test(&cmd->cmd_ref))
		scst_free_cmd(cmd);

@@ -106,7 +106,10 @@ static int scst_init_cmd(struct scst_cmd *cmd, int context)
		TRACE_MGMT_DBG("%s", "init cmd list busy");
		goto out_redirect;
	}
	smp_rmb();
	/*
	 * Memory barrier isn't necessary here, because CPU appears to
	 * be self-consistent
	 */

	rc = __scst_init_cmd(cmd);
	if (unlikely(rc > 0))
@@ -1080,13 +1083,8 @@ static void scst_do_cmd_done(struct scst_cmd *cmd, int result,
		scst_set_resp_data_len(cmd, cmd->resp_data_len - resid);
	}

	/*
	 * We checked that rq_sense_len < sizeof(cmd->sense_buffer)
	 * in init_scst()
	 */
	memcpy(cmd->sense_buffer, rq_sense, rq_sense_len);
	memset(&cmd->sense_buffer[rq_sense_len], 0,
		sizeof(cmd->sense_buffer) - rq_sense_len);
	if (cmd->status == SAM_STAT_CHECK_CONDITION)
		scst_alloc_set_sense(cmd, in_irq(), rq_sense, rq_sense_len);

	TRACE(TRACE_SCSI, "result=%x, cmd->status=%x, resid=%d, "
		"cmd->msg_status=%x, cmd->host_status=%x, "
@@ -1549,7 +1547,6 @@ int scst_check_local_events(struct scst_cmd *cmd)
					SCST_LOAD_SENSE(scst_sense_reset_UA));
				/* It looks like it is safe to clear was_reset here */
				dev->scsi_dev->was_reset = 0;
				smp_mb();
				done = 1;
			}
			spin_unlock_bh(&dev->dev_lock);
@@ -1666,6 +1663,7 @@ static inline int scst_local_exec(struct scst_cmd *cmd)
	return res;
}

/* cmd must be additionally referenced to not die inside */
static int scst_do_send_to_midlev(struct scst_cmd *cmd)
{
	int rc = SCST_EXEC_NOT_COMPLETED;
@@ -1686,7 +1684,6 @@ static int scst_do_send_to_midlev(struct scst_cmd *cmd)
	cmd->scst_cmd_done = scst_cmd_done_local;

	rc = scst_pre_exec(cmd);
	/* !! At this point cmd, sess & tgt_dev can be already freed !! */
	if (rc != SCST_EXEC_NOT_COMPLETED) {
		if (rc == SCST_EXEC_COMPLETED)
			goto out;
@@ -1697,7 +1694,6 @@ static int scst_do_send_to_midlev(struct scst_cmd *cmd)
	}

	rc = scst_local_exec(cmd);
	/* !! At this point cmd, sess & tgt_dev can be already freed !! */
	if (rc != SCST_EXEC_NOT_COMPLETED) {
		if (rc == SCST_EXEC_COMPLETED)
			goto out;
@@ -1714,7 +1710,6 @@ static int scst_do_send_to_midlev(struct scst_cmd *cmd)
		TRACE_BUFF_FLAG(TRACE_SND_TOP, "Execing: ", cmd->cdb, cmd->cdb_len);
		cmd->scst_cmd_done = scst_cmd_done_local;
		rc = dev->handler->exec(cmd);
		/* !! At this point cmd, sess & tgt_dev can be already freed !! */
		TRACE_DBG("Dev handler %s exec() returned %d",
			dev->handler->name, rc);
		if (rc == SCST_EXEC_COMPLETED)
@@ -1847,13 +1842,39 @@ out:
	return;
}

static int scst_process_internal_cmd(struct scst_cmd *cmd)
{
	int res = SCST_CMD_STATE_RES_CONT_NEXT, rc;

	TRACE_ENTRY();

	__scst_cmd_get(cmd);

	rc = scst_do_send_to_midlev(cmd);
	if (rc == SCST_EXEC_NEED_THREAD) {
		TRACE_DBG("%s", "scst_do_send_to_midlev() requested "
			"thread context, rescheduling");
		res = SCST_CMD_STATE_RES_NEED_THREAD;
	} else {
		struct scst_device *dev = cmd->dev;
		sBUG_ON(rc != SCST_EXEC_COMPLETED);
		if (dev->scsi_dev != NULL)
			generic_unplug_device(dev->scsi_dev->request_queue);
	}

	__scst_cmd_put(cmd);

	TRACE_EXIT_RES(res);
	return res;
}

static int scst_send_to_midlev(struct scst_cmd **active_cmd)
{
	int res, rc;
	struct scst_cmd *cmd = *active_cmd;
	struct scst_cmd *ref_cmd;
	struct scst_tgt_dev *tgt_dev = cmd->tgt_dev;
	struct scst_device *dev = cmd->dev;
	struct scst_session *sess = cmd->sess;
	typeof(tgt_dev->expected_sn) expected_sn;
	int count;

@@ -1861,22 +1882,9 @@ static int scst_send_to_midlev(struct scst_cmd **active_cmd)

	res = SCST_CMD_STATE_RES_CONT_NEXT;

	__scst_get(0); /* protect dev */
	scst_sess_get(sess); /* protect tgt_dev */

	if (unlikely(cmd->internal || cmd->retry)) {
		rc = scst_do_send_to_midlev(cmd);
		/* !! At this point cmd can be already freed !! */
		if (rc == SCST_EXEC_NEED_THREAD) {
			TRACE_DBG("%s", "scst_do_send_to_midlev() requested "
				"thread context, rescheduling");
			res = SCST_CMD_STATE_RES_NEED_THREAD;
			scst_dec_on_dev_cmd(cmd);
			goto out_put;
		} else {
			sBUG_ON(rc != SCST_EXEC_COMPLETED);
			goto out_unplug;
		}
		res = scst_process_internal_cmd(cmd);
		goto out;
	}

#ifdef MEASURE_LATENCY
@@ -1891,7 +1899,10 @@ static int scst_send_to_midlev(struct scst_cmd **active_cmd)
#endif

	if (unlikely(scst_inc_on_dev_cmd(cmd) != 0))
		goto out_put;
		goto out;

	ref_cmd = cmd;
	__scst_cmd_get(ref_cmd);

	if (unlikely(cmd->queue_type == SCST_CMD_QUEUE_HEAD_OF_QUEUE))
		goto exec;
@@ -1902,9 +1913,10 @@ static int scst_send_to_midlev(struct scst_cmd **active_cmd)
	/* Optimized for lockless fast path */
	if ((cmd->sn != expected_sn) || (tgt_dev->hq_cmd_count > 0)) {
		spin_lock_irq(&tgt_dev->sn_lock);

		tgt_dev->def_cmd_count++;
		smp_mb();
		barrier(); /* to reread expected_sn & hq_cmd_count */

		expected_sn = tgt_dev->expected_sn;
		if ((cmd->sn != expected_sn) || (tgt_dev->hq_cmd_count > 0)) {
			/* We are under IRQ lock, but dev->dev_lock is BH one */
@@ -1924,8 +1936,9 @@ static int scst_send_to_midlev(struct scst_cmd **active_cmd)
					&tgt_dev->deferred_cmd_list);
			}
			spin_unlock_irq(&tgt_dev->sn_lock);
			/* !! At this point cmd can be already freed !! */

			__scst_dec_on_dev_cmd(dev, cmd_blocking);

			goto out_put;
		} else {
			TRACE_SN("Somebody incremented expected_sn %ld, "
@@ -1941,6 +1954,7 @@ exec:
		atomic_t *slot = cmd->sn_slot;
		int inc_expected_sn = !cmd->inc_expected_sn_on_done &&
			cmd->sn_set;

		rc = scst_do_send_to_midlev(cmd);
		if (rc == SCST_EXEC_NEED_THREAD) {
			TRACE_DBG("%s", "scst_do_send_to_midlev() requested "
@@ -1954,15 +1968,22 @@ exec:
			goto out_put;
		}
		sBUG_ON(rc != SCST_EXEC_COMPLETED);
		/* !! At this point cmd can be already freed !! */

		count++;

		if (inc_expected_sn)
			scst_inc_expected_sn(tgt_dev, slot);

		cmd = scst_check_deferred_commands(tgt_dev);
		if (cmd == NULL)
			break;

		if (unlikely(scst_inc_on_dev_cmd(cmd) != 0))
			break;

		__scst_cmd_put(ref_cmd);
		ref_cmd = cmd;
		__scst_cmd_get(ref_cmd);
	}

out_unplug:
@@ -1970,10 +1991,10 @@ out_unplug:
		generic_unplug_device(dev->scsi_dev->request_queue);

out_put:
	scst_sess_put(sess);
	__scst_put();
	__scst_cmd_put(ref_cmd);
	/* !! At this point sess, dev and tgt_dev can be already freed !! */

out:
	TRACE_EXIT_HRES(res);
	return res;
}
@@ -1999,7 +2020,6 @@ static int scst_check_sense(struct scst_cmd *cmd)
			cmd->ua_ignore = 0;
			/* It looks like it is safe to clear was_reset here */
			dev->scsi_dev->was_reset = 0;
			smp_mb();
		}

		dbl_ua_possible = dev->dev_double_ua_possible;
@@ -2015,12 +2035,12 @@ static int scst_check_sense(struct scst_cmd *cmd)
	}

	if (unlikely(cmd->status == SAM_STAT_CHECK_CONDITION) &&
	    SCST_SENSE_VALID(cmd->sense_buffer)) {
		PRINT_BUFF_FLAG(TRACE_SCSI, "Sense", cmd->sense_buffer,
			sizeof(cmd->sense_buffer));
	    SCST_SENSE_VALID(cmd->sense)) {
		PRINT_BUFF_FLAG(TRACE_SCSI, "Sense", cmd->sense,
			SCST_SENSE_BUFFERSIZE);
		/* Check Unit Attention Sense Key */
		if (scst_is_ua_sense(cmd->sense_buffer)) {
			if (cmd->sense_buffer[12] == SCST_SENSE_ASC_UA_RESET) {
		if (scst_is_ua_sense(cmd->sense)) {
			if (cmd->sense[12] == SCST_SENSE_ASC_UA_RESET) {
				if (dbl_ua_possible) {
					if (ua_sent) {
						TRACE(TRACE_MGMT_MINOR, "%s",
@@ -2032,8 +2052,8 @@ static int scst_check_sense(struct scst_cmd *cmd)
						cmd->msg_status = 0;
						cmd->host_status = DID_OK;
						cmd->driver_status = 0;
						memset(cmd->sense_buffer, 0,
							sizeof(cmd->sense_buffer));
						mempool_free(cmd->sense, scst_sense_mempool);
						cmd->sense = NULL;
						cmd->retry = 1;
						cmd->state = SCST_CMD_STATE_SEND_TO_MIDLEV;
						res = 1;
@@ -2053,12 +2073,12 @@ static int scst_check_sense(struct scst_cmd *cmd)
			if (cmd->ua_ignore == 0) {
				if (unlikely(dbl_ua_possible)) {
					__scst_dev_check_set_UA(dev, cmd,
						cmd->sense_buffer,
						sizeof(cmd->sense_buffer));
						cmd->sense,
						SCST_SENSE_BUFFERSIZE);
				} else {
					scst_dev_check_set_UA(dev, cmd,
						cmd->sense_buffer,
						sizeof(cmd->sense_buffer));
						cmd->sense,
						SCST_SENSE_BUFFERSIZE);
				}
			}
		}
@@ -2090,8 +2110,8 @@ static int scst_check_auto_sense(struct scst_cmd *cmd)
	TRACE_ENTRY();

	if (unlikely(cmd->status == SAM_STAT_CHECK_CONDITION) &&
	    (!SCST_SENSE_VALID(cmd->sense_buffer) ||
	     SCST_NO_SENSE(cmd->sense_buffer))) {
	    (!SCST_SENSE_VALID(cmd->sense) ||
	     SCST_NO_SENSE(cmd->sense))) {
		TRACE(TRACE_SCSI|TRACE_MINOR, "CHECK_CONDITION, but no sense: "
			"cmd->status=%x, cmd->msg_status=%x, "
			"cmd->host_status=%x, cmd->driver_status=%x", cmd->status,
@@ -2212,10 +2232,10 @@ static int scst_done_cmd_check(struct scst_cmd *cmd, int *pres)
			struct scst_tgt_dev *tgt_dev_tmp;
			struct scst_device *dev = cmd->dev;

			TRACE(TRACE_SCSI, "Real RESERVE failed lun=%Ld, status=%x",
				(uint64_t)cmd->lun, cmd->status);
			PRINT_BUFF_FLAG(TRACE_SCSI, "Sense", cmd->sense_buffer,
				sizeof(cmd->sense_buffer));
			TRACE(TRACE_SCSI, "Real RESERVE failed lun=%Ld, "
				"status=%x", (uint64_t)cmd->lun, cmd->status);
			PRINT_BUFF_FLAG(TRACE_SCSI, "Sense", cmd->sense,
				SCST_SENSE_BUFFERSIZE);

			/* Clearing the reservation */
			spin_lock_bh(&dev->dev_lock);
@@ -2232,10 +2252,9 @@ static int scst_done_cmd_check(struct scst_cmd *cmd, int *pres)
	/* Check for MODE PARAMETERS CHANGED UA */
	if ((cmd->dev->scsi_dev != NULL) &&
	    (cmd->status == SAM_STAT_CHECK_CONDITION) &&
	    SCST_SENSE_VALID(cmd->sense_buffer) &&
	    scst_is_ua_sense(cmd->sense_buffer) &&
	    (cmd->sense_buffer[12] == 0x2a) &&
	    (cmd->sense_buffer[13] == 0x01)) {
	    SCST_SENSE_VALID(cmd->sense) &&
	    scst_is_ua_sense(cmd->sense) &&
	    (cmd->sense[12] == 0x2a) && (cmd->sense[13] == 0x01)) {
		TRACE(TRACE_SCSI, "MODE PARAMETERS CHANGED UA (lun %Ld)",
			(uint64_t)cmd->lun);
		cmd->state = SCST_CMD_STATE_MODE_SELECT_CHECKS;
@@ -2256,6 +2275,9 @@ static int scst_pre_dev_done(struct scst_cmd *cmd)

	TRACE_ENTRY();

	atomic_dec(&cmd->tgt_dev->tgt_dev_cmd_count);
	atomic_dec(&cmd->dev->dev_cmd_count);

	rc = scst_done_cmd_check(cmd, &res);
	if (rc)
		goto out;
@@ -2310,10 +2332,9 @@ static int scst_mode_select_checks(struct scst_cmd *cmd)
			scst_obtain_device_parameters(dev);
		}
	} else if ((cmd->status == SAM_STAT_CHECK_CONDITION) &&
		    SCST_SENSE_VALID(cmd->sense_buffer) &&
		    scst_is_ua_sense(cmd->sense_buffer) &&
		    (cmd->sense_buffer[12] == 0x2a) &&
		    (cmd->sense_buffer[13] == 0x01)) {
		    SCST_SENSE_VALID(cmd->sense) &&
		    scst_is_ua_sense(cmd->sense) &&
		    (cmd->sense[12] == 0x2a) && (cmd->sense[13] == 0x01)) {
		if (atomic) {
			TRACE_DBG("%s", "MODE PARAMETERS CHANGED UA: thread "
				"context required");
@@ -2636,24 +2657,34 @@ static int scst_finish_cmd(struct scst_cmd *cmd)

	atomic_dec(&cmd->sess->sess_cmd_count);

	cmd->finished = 1;
	smp_mb(); /* to sync with scst_abort_cmd() */

	if (unlikely(test_bit(SCST_CMD_ABORTED, &cmd->cmd_flags))) {
		unsigned long flags;

		TRACE_MGMT_DBG("Aborted cmd %p finished (cmd_ref %d, "
			"scst_cmd_count %d)", cmd, atomic_read(&cmd->cmd_ref),
			atomic_read(&scst_cmd_count));

		spin_lock_irqsave(&scst_mcmd_lock, flags);
		if (cmd->mgmt_cmnd)
			scst_complete_cmd_mgmt(cmd, cmd->mgmt_cmnd);
		spin_unlock_irqrestore(&scst_mcmd_lock, flags);
	}

	if (unlikely(cmd->delivery_status != SCST_CMD_DELIVERY_SUCCESS)) {
		if ((cmd->tgt_dev != NULL) &&
		    scst_is_ua_sense(cmd->sense_buffer)) {
		    scst_is_ua_sense(cmd->sense)) {
			/* This UA delivery failed, so requeue it */
			TRACE_MGMT_DBG("Requeuing UA for delivery failed cmd "
				"%p", cmd);
			scst_check_set_UA(cmd->tgt_dev, cmd->sense_buffer,
				sizeof(cmd->sense_buffer), 1);
			scst_check_set_UA(cmd->tgt_dev, cmd->sense,
				SCST_SENSE_BUFFERSIZE, 1);
		}
	}

	scst_cmd_put(cmd);
	__scst_cmd_put(cmd);

	res = SCST_CMD_STATE_RES_CONT_NEXT;

@@ -2693,6 +2724,10 @@ static void scst_cmd_set_sn(struct scst_cmd *cmd)
			goto ordered;
#endif
		if (likely(tgt_dev->num_free_sn_slots >= 0)) {
			/*
			 * atomic_inc_return() implies memory barrier to sync
			 * with scst_inc_expected_sn()
			 */
			if (atomic_inc_return(tgt_dev->cur_sn_slot) == 1) {
				tgt_dev->curr_sn++;
				TRACE_SN("Incremented curr_sn %ld",
@@ -2888,6 +2923,10 @@ static void scst_do_job_init(void)
	TRACE_ENTRY();

restart:
	/*
	 * There is no need for read barrier here, because we don't care where
	 * this check will be done.
	 */
	susp = test_bit(SCST_FLAG_SUSPENDED, &scst_flags);
	if (scst_init_poll_cnt > 0)
		scst_init_poll_cnt--;
@@ -2924,6 +2963,7 @@ restart:
			 * in case of simultaneous such calls anyway.
			 */
			TRACE_MGMT_DBG("Deleting cmd %p from init cmd list", cmd);
			smp_wmb();
			list_del(&cmd->cmd_list_entry);
			spin_unlock(&scst_init_lock);

@@ -2942,6 +2982,7 @@ restart:
		goto restart;
	}

	/* It isn't really needed, but let's keep it */
	if (susp != test_bit(SCST_FLAG_SUSPENDED, &scst_flags))
		goto restart;

@@ -3018,6 +3059,8 @@ void scst_process_active_cmd(struct scst_cmd *cmd, int context)
		switch (cmd->state) {
		case SCST_CMD_STATE_PRE_PARSE:
			res = scst_pre_parse(cmd);
			EXTRACHECKS_BUG_ON(res ==
				SCST_CMD_STATE_RES_NEED_THREAD);
			break;

		case SCST_CMD_STATE_DEV_PARSE:
@@ -3050,6 +3093,8 @@ void scst_process_active_cmd(struct scst_cmd *cmd, int context)

		case SCST_CMD_STATE_PRE_DEV_DONE:
			res = scst_pre_dev_done(cmd);
			EXTRACHECKS_BUG_ON(res ==
				SCST_CMD_STATE_RES_NEED_THREAD);
			break;

		case SCST_CMD_STATE_MODE_SELECT_CHECKS:
@@ -3062,6 +3107,8 @@ void scst_process_active_cmd(struct scst_cmd *cmd, int context)

		case SCST_CMD_STATE_PRE_XMIT_RESP:
			res = scst_pre_xmit_response(cmd);
			EXTRACHECKS_BUG_ON(res ==
				SCST_CMD_STATE_RES_NEED_THREAD);
			break;

		case SCST_CMD_STATE_XMIT_RESP:
@@ -3070,6 +3117,8 @@ void scst_process_active_cmd(struct scst_cmd *cmd, int context)

		case SCST_CMD_STATE_FINISHED:
			res = scst_finish_cmd(cmd);
			EXTRACHECKS_BUG_ON(res ==
				SCST_CMD_STATE_RES_NEED_THREAD);
			break;

		default:
@@ -3281,13 +3330,11 @@ out:
	return res;
}

/* No locks */
/* scst_mcmd_lock supposed to be held and IRQ off */
void scst_complete_cmd_mgmt(struct scst_cmd *cmd, struct scst_mgmt_cmd *mcmd)
{
	TRACE_ENTRY();

	spin_lock_irq(&scst_mcmd_lock);

	TRACE_MGMT_DBG("cmd %p completed (tag %llu, mcmd %p, "
		"mcmd->cmd_wait_count %d)", cmd, cmd->tag, mcmd,
		mcmd->cmd_wait_count);
@@ -3313,8 +3360,6 @@ void scst_complete_cmd_mgmt(struct scst_cmd *cmd, struct scst_mgmt_cmd *mcmd)
			&scst_active_mgmt_cmd_list);
	}

	spin_unlock_irq(&scst_mcmd_lock);

	wake_up(&scst_mgmt_cmd_list_waitQ);

out:
@@ -3368,6 +3413,8 @@ static inline int scst_is_strict_mgmt_fn(int mgmt_fn)
void scst_abort_cmd(struct scst_cmd *cmd, struct scst_mgmt_cmd *mcmd,
	int other_ini, int call_dev_task_mgmt_fn)
{
	unsigned long flags;

	TRACE_ENTRY();

	TRACE(((mcmd != NULL) && (mcmd->fn == SCST_ABORT_TASK)) ? TRACE_MGMT_MINOR : TRACE_MGMT,
@@ -3381,7 +3428,8 @@ void scst_abort_cmd(struct scst_cmd *cmd, struct scst_mgmt_cmd *mcmd,
		clear_bit(SCST_CMD_ABORTED_OTHER, &cmd->cmd_flags);
	}
	set_bit(SCST_CMD_ABORTED, &cmd->cmd_flags);
	smp_mb__after_set_bit();
	/* To sync with cmd->finished set in scst_finish_cmd() */
	smp_mb__after_set_bit();

	if (cmd->tgt_dev == NULL) {
		unsigned long flags;
@@ -3396,8 +3444,8 @@ void scst_abort_cmd(struct scst_cmd *cmd, struct scst_mgmt_cmd *mcmd,
		scst_call_dev_task_mgmt_fn(mcmd, cmd->tgt_dev, 1);
	}

	if (mcmd) {
		unsigned long flags;
	spin_lock_irqsave(&scst_mcmd_lock, flags);
	if ((mcmd != NULL) && !cmd->finished) {
		/*
		 * Delay the response until the command's finish in
		 * order to guarantee that "no further responses from
@@ -3418,12 +3466,16 @@ void scst_abort_cmd(struct scst_cmd *cmd, struct scst_mgmt_cmd *mcmd,
		}
#endif
		sBUG_ON(cmd->mgmt_cmnd);
		spin_lock_irqsave(&scst_mcmd_lock, flags);

		mcmd->cmd_wait_count++;
		spin_unlock_irqrestore(&scst_mcmd_lock, flags);
		/* cmd can't die here or sess_list_lock already taken */

		/*
		 * cmd can't die here or sess_list_lock already taken and cmd is
		 * in the search list
		 */
		cmd->mgmt_cmnd = mcmd;
	}
	spin_unlock_irqrestore(&scst_mcmd_lock, flags);

	tm_dbg_release_cmd(cmd);

@@ -3724,7 +3776,7 @@ static int scst_mgmt_cmd_init(struct scst_mgmt_cmd *mcmd)
			spin_unlock_irq(&sess->sess_list_lock);
			goto out;
		}
		scst_cmd_get(cmd);
		__scst_cmd_get(cmd);
		spin_unlock_irq(&sess->sess_list_lock);
		TRACE_MGMT_DBG("Cmd %p for tag %llu (sn %ld, set %d, "
			"queue_type %x) found, aborting it", cmd, mcmd->tag,
@@ -3748,7 +3800,7 @@ static int scst_mgmt_cmd_init(struct scst_mgmt_cmd *mcmd)
		}
		res = scst_set_mcmd_next_state(mcmd);
		mcmd->cmd_to_abort = NULL; /* just in case */
		scst_cmd_put(cmd);
		__scst_cmd_put(cmd);
		break;
	}

@@ -4673,10 +4725,8 @@ void scst_unregister_session(struct scst_session *sess, int wait,
	spin_lock_irqsave(&scst_mgmt_lock, flags);

	sess->unreg_done_fn = unreg_done_fn;
	if (wait) {
	if (wait)
		sess->shutdown_compl = pc;
		smp_mb();
	}
#ifdef CONFIG_LOCKDEP
	else
		sess->shutdown_compl = NULL;
