From fec8d2459cb6d6550950eb55c9de8f9be94eccee Mon Sep 17 00:00:00 2001
From: Bart Van Assche
Date: Tue, 13 Jan 2015 08:55:46 +0000
Subject: [PATCH] qla2x00t: Copy entire SCST sense buffer to q2x ctio

There seems to be a bug in passing sense information to QLA HBAs, where
the last 2 bytes of the sense data (ASC, ASCQ) are not copied to the
low level sense buffer. We encountered this in ESX, which relies on
these 2 bytes to parse the MISCOMPARE sense code (0xE, 0x1D, 0x00).

Below is a simple test that reproduces the issue. During vMotion
operations (where VMs are moved from one host to another), the same bug
may cause the operation to fail, leaving the VM in an inconsistent
state.

The test I ran to verify that we are indeed missing the bytes is the
following:
1. Create a SCST based device
2. Expose the device to 2 ESX hosts
3. Format the device as VMFS5, create a test directory
4. From both hosts, I start writing to this directory (no VMs involved,
   just write normal files)

At this stage, both ESX hosts try to gain access to the directory. The
VMFS filesystem contains a per-directory lock which is managed by the
COMPARE AND WRITE command. Each ESX host will attempt to change the VMFS
lock location from unlocked to locked in order to create the new file.
There are bound to be failures (the equivalent of lock contention in a
program), and these are reported by the MISCOMPARE sense code. Upon
these MISCOMPARE errors, the host will retry taking the lock until it
succeeds, and will then proceed to perform the write operation on the
directory.

Due to the bug in copying the sense buffer from the SCST core to the
QLA ctio, only the sense key (0xE) is sent instead of the full sense
code. ESX does not know how to handle this, which results in an I/O
error.

Here are the errors as they appear on the command line:

/vmfs/volumes/54a297c4-ca5af1cc-7f94-002219d20f28/ats_test # ./open_close_test-esx2.sh
./open_close_test-esx2.sh: line 8: can't create ats_fileoptest-esx2_1.txt: Input/output error
./open_close_test-esx2.sh: line 8: can't create ats_fileoptest-esx2_21.txt: Input/output error
./open_close_test-esx2.sh: line 8: can't create ats_fileoptest-esx2_110.txt: Input/output error
./open_close_test-esx2.sh: line 8: can't create ats_fileoptest-esx2_111.txt: Input/output error

In /var/log/vmkernel.log, we can see that the sense information is
incomplete: (0xE, 0x0, 0x0) instead of (0xE, 0x1D, 0x0).

2014-12-30T12:13:20.714Z cpu6:33519)ScsiDeviceIO: 2338: Cmd(0x412e84f957c0) 0x89, CmdSN 0x234d from world 519051 to dev "eui.0024f400d5020007" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x0 0x0.
2014-12-30T12:13:20.766Z cpu6:33519)ScsiDeviceIO: 2338: Cmd(0x412e84f91d00) 0x89, CmdSN 0x2350 from world 519051 to dev "eui.0024f400d5020007" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x0 0x0.
2014-12-30T12:13:20.766Z cpu6:33519)ScsiDeviceIO: 2338: Cmd(0x412e80449fc0) 0x89, CmdSN 0x234f from world 519051 to dev "eui.0024f400d5020007" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x0 0x0.

This patch fixes the issue. With the fix applied, the test runs without
problems (no I/O errors, all files are properly written to the
directory).

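The length arithmetic behind the lost bytes can be sketched as follows.
This is a minimal userspace illustration, not driver code. It assumes a
14-byte fixed-format sense buffer (consistent with the 2 missing bytes
described above) that is zero-padded to 16 bytes. The names SENSE_LEN,
SENSE_BUF_CAP, build_miscompare_sense(), copy_whole_words() and
copy_rounded_up() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SENSE_LEN     14u /* assumed: fixed-format sense data ending at ASCQ */
#define SENSE_BUF_CAP 16u /* assumed: buffer padded to a multiple of four */

/* Fill buf with fixed-format MISCOMPARE sense data (key 0xE, ASC 0x1D). */
static void build_miscompare_sense(uint8_t buf[SENSE_BUF_CAP])
{
	memset(buf, 0, SENSE_BUF_CAP);
	buf[0] = 0x70;  /* response code: current error, fixed format */
	buf[2] = 0x0e;  /* sense key: MISCOMPARE */
	buf[7] = 0x06;  /* additional sense length */
	buf[12] = 0x1d; /* ASC */
	buf[13] = 0x00; /* ASCQ */
}

/* Old behaviour: copy len / 4 whole 32-bit words and drop the remainder. */
static size_t copy_whole_words(uint8_t *dst, const uint8_t *src, size_t len)
{
	size_t bytes = (len / 4) * 4; /* 14 / 4 == 3 words, so 12 bytes */

	memcpy(dst, src, bytes);
	return bytes;
}

/* Fixed behaviour: round the length up to the next multiple of four. */
static size_t copy_rounded_up(uint8_t *dst, const uint8_t *src, size_t len)
{
	size_t bytes = (len + 3) & ~(size_t)3; /* 14 rounds up to 16 */

	assert(bytes <= SENSE_BUF_CAP); /* relies on the padded source buffer */
	memcpy(dst, src, bytes);
	return bytes;
}
```

With a 14-byte buffer the first helper stops after byte 11, so the ASC at
byte 12 stays zero in the destination, which matches the
"0xe 0x0 0x0" lines in the vmkernel log. The second helper copies bytes
12 and 13 as well.
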
Signed-off-by: Shahar Salzman
Reviewed-by: Eran Mann
[bvanassche: simplified implementation]
Signed-off-by: Bart Van Assche

git-svn-id: http://svn.code.sf.net/p/scst/svn/trunk@5965 d57e44dd-8a1f-0410-8b47-8ef2f437770f
---
 qla2x00t/qla2x00-target/qla2x00t.c | 42 +++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/qla2x00t/qla2x00-target/qla2x00t.c b/qla2x00t/qla2x00-target/qla2x00t.c
index 848413564..a92440611 100644
--- a/qla2x00t/qla2x00-target/qla2x00t.c
+++ b/qla2x00t/qla2x00-target/qla2x00t.c
@@ -2679,6 +2679,31 @@ out_unlock_free_unmap:
 	goto out;
 }
 
+/*
+ * Convert sense buffer (byte array) to little endian format as required by
+ * qla24xx firmware.
+ */
+static void q24_copy_sense_buffer_to_ctio(ctio7_status1_entry_t *ctio,
+	uint8_t *sense_buf, unsigned int sense_buf_len)
+{
+	uint32_t *src = (void *)sense_buf;
+	uint32_t *end = (void *)sense_buf + sense_buf_len;
+	uint8_t *p;
+	__be32 *dst = (void *)ctio->sense_data;
+
+	/*
+	 * The sense buffer allocated by scst_alloc_sense() is zero-filled and
+	 * has a length that is a multiple of four. This means that it is safe
+	 * to access the bytes after the end of the sense buffer up to a
+	 * boundary that is a multiple of four.
+	 */
+	for (p = (uint8_t *)end; ((uintptr_t)p & 3) != 0; p++)
+		WARN_ONCE(*p != 0, "sense_buf[%zd] = %d\n", p - sense_buf, *p);
+
+	for ( ; src < end; src++)
+		*dst++ = cpu_to_be32(*src);
+}
+
 static inline int q2t_need_explicit_conf(scsi_qla_host_t *ha,
 	struct q2t_cmd *cmd, int sending_sense)
 {
@@ -2914,7 +2939,6 @@ static void q24_init_ctio_ret_entry(ctio7_status0_entry_t *ctio,
 	ctio->residual = cpu_to_le32(prm->residual);
 	ctio->scsi_status = cpu_to_le16(prm->rq_result);
 	if (scst_sense_valid(prm->sense_buffer)) {
-		int i;
 		ctio1 = (ctio7_status1_entry_t *)ctio;
 		if (q2t_need_explicit_conf(prm->tgt->ha, prm->cmd, 1)) {
 			ctio1->flags |= cpu_to_le16(
@@ -2925,20 +2949,8 @@ static void q24_init_ctio_ret_entry(ctio7_status0_entry_t *ctio,
 		ctio1->flags |= cpu_to_le16(CTIO7_FLAGS_STATUS_MODE_1);
 		ctio1->scsi_status |= cpu_to_le16(SS_SENSE_LEN_VALID);
 		ctio1->sense_length = cpu_to_le16(prm->sense_buffer_len);
-		for (i = 0; i < prm->sense_buffer_len/4; i++)
-			((uint32_t *)ctio1->sense_data)[i] =
-				cpu_to_be32(((uint32_t *)prm->sense_buffer)[i]);
-#if 0
-		if (unlikely((prm->sense_buffer_len % 4) != 0)) {
-			static int q;
-			if (q < 10) {
-				PRINT_INFO("qla2x00t(%ld): %d bytes of sense "
-					"lost", prm->tgt->ha->instance,
-					prm->sense_buffer_len % 4);
-				q++;
-			}
-		}
-#endif
+		q24_copy_sense_buffer_to_ctio(ctio1, prm->sense_buffer,
+			prm->sense_buffer_len);
 	} else {
 		ctio1 = (ctio7_status1_entry_t *)ctio;
 		ctio1->flags &= ~cpu_to_le16(CTIO7_FLAGS_STATUS_MODE_0);
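
For readers without the driver tree at hand, the word-wise conversion done
by the new helper can be approximated in userspace as shown below. This is
a sketch under stated assumptions, not the driver implementation: htonl()
stands in for the kernel's cpu_to_be32(), a plain assert() stands in for
WARN_ONCE(), the loop is index-based instead of pointer-based, and the name
sense_to_be32_words() is hypothetical. As in the patch, the source buffer
is assumed to be readable and zero-filled up to the next multiple of four.

```c
#include <arpa/inet.h> /* htonl(): userspace stand-in for cpu_to_be32() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy len bytes of sense data from src to dst, one 32-bit word at a time,
 * storing each host-order word in big-endian byte order.
 * Assumption: src is zero-filled up to the next multiple of four and dst
 * has room for the rounded-up length.
 * Returns the number of bytes written to dst.
 */
static size_t sense_to_be32_words(uint8_t *dst, const uint8_t *src, size_t len)
{
	size_t padded = (len + 3) & ~(size_t)3;
	size_t i;

	/* The padding bytes are sent to the firmware, so they must be zero. */
	for (i = len; i < padded; i++)
		assert(src[i] == 0);

	for (i = 0; i < padded; i += 4) {
		uint32_t word, be_word;

		memcpy(&word, src + i, sizeof(word)); /* host-order load */
		be_word = htonl(word);                /* host to big endian */
		memcpy(dst + i, &be_word, sizeof(be_word));
	}
	return padded;
}
```

Calling sense_to_be32_words(dst, src, 14) processes four words, so the
word holding bytes 12 and 13 (ASC, ASCQ) is included. On a little-endian
host the bytes inside each word end up reversed in dst, and on a
big-endian host they are copied unchanged.
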