From ca9debd891f6dc0e77833e04aeecf9560b588556 Mon Sep 17 00:00:00 2001 From: Vladislav Bolkhovitin Date: Sat, 13 Jan 2018 03:37:19 +0000 Subject: [PATCH] iscsi-scst: fix a subtle race inside the EXTRACHECKs facility. The system is running an iSCSI write test with header and data digests enabled. On the command that resulted in the assert, all of the write data was sent as immediate and unsolicited. The unsolicited data consisted of a single DataOut PDU and it was the iscsi_cmnd resource allocated for receiving the unsolicited DataOut that caused the assert, not the iscsi_cmnd resource tracking the iSCSI write command. As the Recv thread is finishing up with the iscsi_cmnd struct for receiving the final unsolicited DataOut, SCST is call the iscsi_pre_exec function to restart the command and this routine will process the DataOut iscsi_cmnd as it is queued onto the rx_digest_list of the iscsi_cmnd for Write. ffff8a984150c600 is the iscsi_cmnd allocated to track the iSCSI Write command ffff8a984150d680 is the iscsi_cmnd allocated for receiving the Data-out PDU In the logs below, recv thread 3677 just completed receiving the unsolicited DataOut. [ 977.563046] [3677]: do_recv:808:nr_segs 0, bytes_left 0, res 196608 [ 977.563048] [31728]: iscsi_make_conn_rd_active:351:conn ffff8a97d0df1600, rd_state 2, rd_data_ready 1 [ 977.563065] [3677]: do_recv:808:nr_segs 0, bytes_left 0, res 4 [ 977.563069] [31728]: iscsi_make_conn_rd_active:351:conn ffff8a97d0df1600, rd_state 2, rd_data_ready 1 [ 977.563076] [3677]: cmnd_rx_end:3475:cmnd ffff8a984150d680, opcode 5 [ 977.563078] [31728]: iscsi_make_conn_rd_active:351:conn ffff8a97d0df1600, rd_state 2, rd_data_ready 1 [ 977.563081] [3677]: cmnd_rx_end:3478:Updated last_rcv_time 4295644274 [ 977.563091] [3677]: data_out_end:2455:cmnd ffff8a984150d680, req ffff8a984150c600 [ 977.563093] [3677]: cmd_add_on_rx_ddigest_list:820:Adding RX ddigest cmd ffff8a984150d680 to digest list of req ffff8a984150c600 [ 977.563096] [3677]: cmnd_get:764:cmnd ffff8a984150d680, new cmnd->ref_cnt 2 Here we see no more data is pending to be received on this write, so iscsi_restart_cmnd is called by recv thread which should trigger SCST to start processing the command: [ 977.563106] [3677]: data_out_end:2495:req ffff8a984150c600, FINAL 80, outstanding_r2t 0, r2t_len_to_receive 0, r2t_len_to_send 0 [ 977.563108] [3677]: req_del_from_write_timeout_list:141:Deleting cmd ffff8a984150c600 from conn ffff8a97d0df1600 write_timeout_list [ 977.563111] [3677]: cmnd_remove_data_wait_hash:1606:Deleting cmnd ffff8a984150c600 from the hash (ITT 1e230400) SCST thread 24654 calls the pre execute routine which verifies the data digest on immediate and unsolicited data: [ 977.563157] [24654]: iscsi_pre_exec:1917:Checking digest of RX ddigest cmd ffff8a984150c600 [ 977.563161] [24654]: digest_data:125:req ffff8a984150c600, idx 0, count 16, sg_cnt 64, size 65536, offset 0 [ 977.563170] [24654]: digest_rx_data:219:RX data digest OK for cmd ffff8a984150c600 [ 977.563173] [3677]: cmnd_put:775:cmnd ffff8a984150c600, new ref_cnt 2 [ 977.563176] [24654]: cmd_del_from_rx_ddigest_list:830:Deleting RX digest cmd ffff8a984150c600 from digest list Recv thread calls pre release on iscsi_cmnd allocated for unsolicited DataOut PDU: [ 977.563178] [3677]: req_cmnd_pre_release:820:req ffff8a984150d680 [ 977.563181] [24654]: cmnd_put:775:cmnd ffff8a984150c600, new ref_cnt 1 SCST thread 24654 processing data digest on unsolicited DataOut PDU. Here we have two threads accessing the same iscsi_cmnd struct. SCST thread will set on_rx_digest_list flag to 0. Recv thread will set release_called flag to 1. [ 977.563184] [24654]: iscsi_pre_exec:1917:Checking digest of RX ddigest cmd ffff8a984150d680 [ 977.563186] [24654]: cmd_del_from_rx_ddigest_list:830:Deleting RX digest cmd ffff8a984150d680 from digest list [ 977.563189] [24654]: cmnd_put:775:cmnd ffff8a984150d680, new ref_cnt 0 [ 977.563191] [3677]: conn_get:868:conn ffff8a97d0df1600, new conn_ref_cnt 4 [ 977.563194] [24654]: cmnd_done:589:cmnd ffff8a984150d680 [ 977.563196] BUG at ./iscsi-scst/kernel/iscsi.c:598 (cmnd->on_rx_digest_list) [ 977.563212] ------------[ cut here ]------------ [ 977.563215] kernel BUG at ./iscsi-scst/kernel/iscsi.c:598! Reported and analyzed by Adam Hutchinson git-svn-id: http://svn.code.sf.net/p/scst/svn/trunk@7338 d57e44dd-8a1f-0410-8b47-8ef2f437770f --- iscsi-scst/kernel/iscsi.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/iscsi-scst/kernel/iscsi.h b/iscsi-scst/kernel/iscsi.h index 6c7f0a926..1e95f1410 100644 --- a/iscsi-scst/kernel/iscsi.h +++ b/iscsi-scst/kernel/iscsi.h @@ -404,9 +404,16 @@ struct iscsi_cmnd { */ unsigned int data_out_in_data_receiving:1; unsigned int force_release_done:1; + #ifdef CONFIG_SCST_EXTRACHECKS - unsigned int on_rx_digest_list:1; unsigned int release_called:1; + + /* + * This flag might be cleared from SCST thread after/during RX digest + * calculation in parallel with clearing release_called from the + * read thread, so it must be modifiable inependently. + */ + unsigned long on_rx_digest_list:1 __aligned(sizeof(long)); #endif /*