<!doctype linuxdoc system>

<article>

<title>
SCST technical description
</title>

<author>
<name>Vladislav Bolkhovitin</name>
</author>

<date>
Version 3.0.0 for SCST 3.0.0 and later
</date>

<toc>

<sect>Introduction

<p> SCST is a SCSI target mid-level subsystem for Linux. It provides a
unified, consistent interface between SCSI target drivers, backend device
handlers and the Linux kernel, and simplifies target driver development
as much as possible.

It has the following features:

<itemize>

<item> Very low overhead and fine-grained locking, which allow SCST to
reach the maximum possible performance and scalability, close to the
theoretical limit.

<item> Complete SMP support.

<item> Performs all required pre- and post-processing of incoming
requests and all necessary error recovery.

<item> Emulates the necessary functionality of SCSI host adapters, because
from a remote initiator's point of view SCST acts as a SCSI host with
its own devices. Some of the emulated functions are the following:

	<itemize>

	<item> Generation of necessary UNIT ATTENTIONs, their storage and
	delivery to all connected remote initiators (sessions).

	<item> RESERVE/RELEASE functionality, including Persistent Reservations.

	<item> All types of RESETs and other task management functions.

	<item> The REPORT LUNS command as well as SCSI address space
	management, so that all remote initiators see a consistent address
	space. This is needed because local SCSI devices do not know about
	each other and so cannot answer REPORT LUNS themselves. Additionally,
	SCST responds with an error to all commands addressed to non-existent
	devices and provides access control, so different remote initiators
	can see different sets of devices.

	<item> Other necessary functionality (task attributes, etc.) as
	specified in SAM-2, SPC-2, SAM-3, SPC-3 and other SCSI standards.

	</itemize>

<item> Verifies all incoming requests to ensure reliable and secure
command execution.

<item> The device handler architecture provides extra flexibility by
allowing additional request processing that is completely
independent of target drivers, for example data caching or handling of
device-dependent exceptional conditions.

</itemize>

<sect>Terms and Definitions

<p>
<bf/SCSI initiator device/

A SCSI device that originates service and task management requests to be
processed by a SCSI target device and receives device service and task
management responses from SCSI target devices.

<bf/SCSI target device/

A SCSI device that receives device service and task management requests
for processing and sends device service and task management responses
to SCSI initiator devices or drivers.

<bf/SCST session/

An SCST session is the object that describes the relationship between a
remote initiator and SCST via a target driver. All commands from the
remote initiator are passed to SCST within the session. For example, for
connection-oriented protocols like iSCSI, an SCST session could be mapped
to a TCP connection (as well as to an iSCSI session). An SCST session is
the equivalent of the SCSI I_T nexus object.

<bf/Local SCSI initiator/

A SCSI initiator that is located on the same host as the SCST subsystem.
Examples are the sg and st drivers.

<bf/Remote SCSI initiator/

A SCSI initiator that is located on a host remote from the SCST subsystem
and makes client connections to SCST via SCST target drivers.

<bf/SCSI target driver/

A Linux hardware or logical driver that acts as a SCSI target for remote
SCSI initiators, i.e. accepts remote connections, passes incoming SCSI
requests to SCST and sends SCSI responses from SCST back to their
originators.

<bf/Device (backend) handler driver/

Also known as a "device type specific driver" or "dev handler". An SCST
driver that helps SCST analyze incoming requests, determine parameters
specific to various types of devices, and perform some of the
processing. See below for more details.

<sect>SCST Core Architecture

<p>
SCST accepts commands and passes them to the SCSI mid-level in the same
way as the SCSI high-level drivers (sg, sd, st) do. Figure 1 shows the
interaction between SCST, its drivers and the Linux SCSI subsystem.

<figure>
<eps file="fig1.png">
<img src="fig1.png">
<caption>
	<newline> Interaction between SCST, its drivers and Linux SCSI subsystem.
</caption>
</figure>

<sect> Target drivers

<sect1>struct scst_tgt_template

<p>
To work with SCST, a target driver must register its template in SCST by
calling <bf/scst_register_target_template()/. The template lets SCST know the
target driver's entry points. It is defined as follows:

<verb>
struct scst_tgt_template
{
	int sg_tablesize;
	const char name[SCST_MAX_NAME];

	unsigned unchecked_isa_dma:1;
	unsigned use_clustering:1;
	unsigned no_clustering:1;

	unsigned xmit_response_atomic:1;
	unsigned rdy_to_xfer_atomic:1;

	unsigned no_proc_entry:1;

	int max_hw_pending_time;

	int threads_num;

	int (*release)(struct scst_tgt *tgt);

	int (*xmit_response)(struct scst_cmd *cmd);
	int (*rdy_to_xfer)(struct scst_cmd *cmd);

	void (*on_hw_pending_cmd_timeout) (struct scst_cmd *cmd);

	void (*on_free_cmd) (struct scst_cmd *cmd);

	int (*alloc_data_buf) (struct scst_cmd *cmd);

	void (*preprocessing_done) (struct scst_cmd *cmd);

	int (*pre_exec) (struct scst_cmd *cmd);

	void (*task_mgmt_affected_cmds_done) (struct scst_mgmt_cmd *mgmt_cmd);
	void (*task_mgmt_fn_done)(struct scst_mgmt_cmd *mgmt_cmd);

	int (*report_aen) (struct scst_aen *aen);

	int (*read_proc) (struct seq_file *seq, struct scst_tgt *tgt);
	int (*write_proc) (char *buffer, char **start, off_t offset,
		int length, int *eof, struct scst_tgt *tgt);

	int (*get_initiator_port_transport_id) (struct scst_session *sess,
		uint8_t **transport_id);
}
</verb>

Where:

<itemize>

<item><bf/sg_tablesize/ - allows checking whether scatter/gather can be
used or not and, if yes, sets the maximum supported count of
scatter/gather entries.

<item><bf/name/ - the name of the template. Must be unique to identify
the template. Must be defined.

<item><bf/unchecked_isa_dma/ - true, if this target adapter uses
unchecked DMA onto an ISA bus.

<item><bf/use_clustering/ - true, if this target adapter wants to use
clustering (i.e. a smaller number of merged segments).

<item> <bf/no_clustering/ - true, if this target adapter doesn't support
SG-vector clustering.

<item><bf/xmit_response_atomic/, <bf/rdy_to_xfer_atomic/ - true, if the
corresponding function supports execution in the atomic (non-sleeping)
context.

<item> <bf/no_proc_entry/ - true, if this template doesn't need an entry in /proc.

<item> <bf/max_hw_pending_time/ - the maximum time in seconds a cmd can
stay inside the target hardware, i.e. after rdy_to_xfer() and
xmit_response(), before on_hw_pending_cmd_timeout() is called, if
defined. In the current implementation a cmd will be aborted at a time t
such that max_hw_pending_time <= t < 2*max_hw_pending_time.

<item> <bf/threads_num/ - number of additional threads to add to the pool of
dedicated threads. Used if xmit_response() or rdy_to_xfer() is blocking.
It is the target driver's duty to ensure that no more than that number
of threads are blocked in those functions at any time.

<item><bf/int (*release)(struct scst_tgt *tgt)/ - this function is
intended to free up resources allocated to the device. The function
should return 0 to indicate successful release or a negative value if
there are some issues with the release. In the current version of SCST
the return value is ignored. Must be defined.

<item><bf/int (*xmit_response)(struct scst_cmd *cmd)/ - this
function is equivalent to the SCSI queuecommand(). The target should
transmit the response data and the status in the struct scst_cmd. See
below for details. Must be defined.

<item><bf/int (*rdy_to_xfer)(struct scst_cmd *cmd)/ - this function
informs the driver that the data buffer corresponding to the said command
has now been allocated and it is OK to receive data for this command.
This function is necessary because a SCSI target does not have any
control over the commands it receives. Most lower-level protocols have
a corresponding function which informs the initiator that buffers have
been allocated, e.g. XFER_RDY in Fibre Channel. After the data is actually
received, the low-level driver should call <it/scst_rx_data()/ in order
to continue processing this command. Returns one of the
<it/SCST_TGT_RES_*/ constants, described below. Pay attention to the
"atomic" attribute of the command, which can be obtained via
scst_cmd_atomic(). It is true if the function is called in the atomic
(non-sleeping) context. Must be defined.

<item> <bf/void (*on_hw_pending_cmd_timeout) (struct scst_cmd *cmd)/ -
called if a cmd stays inside the target hardware, i.e. after rdy_to_xfer()
and xmit_response(), for more than max_hw_pending_time. The target
driver is supposed to clean up this command and resume the cmd's processing.

<item><bf/void (*on_free_cmd)(struct scst_cmd *cmd)/ - this function is
called to notify the driver that the command is about to be freed.
Necessary, because for aborted commands xmit_response() might not be
called. Could be called in IRQ context. Must be defined.

<item> <bf/int (*alloc_data_buf) (struct scst_cmd *cmd)/ - this function
allows the target driver to handle data buffer allocations on its own.
The target driver doesn't always have to allocate the buffer in this
function, but if it decides to do so, it must check that
scst_cmd_get_data_buff_alloced() returns 0; otherwise, to avoid double
buffer allocation and memory leaks, alloc_data_buf() shall fail. Returns
0 in case of success, < 0 (preferably -ENOMEM) in case of error, or >
0 if the regular SCST allocation should be done. On successful return,
scst_cmd->tgt_data_buf_alloced will be set by SCST. It is
possible that both the target driver and the dev handler request their
own memory allocation. If the allocation is attempted in atomic context,
i.e. scst_cmd_atomic() is true, and < 0 is returned, this function will
be called again in thread context. Note that the driver will itself have
to handle all relevant details such as scatterlist setup, highmem,
freeing the allocated memory, etc.

<item> <bf/void (*preprocessing_done) (struct scst_cmd *cmd)/ - this
function informs the driver that the data buffer corresponding to the said
command has now been allocated and other preprocessing tasks have been
done. A target driver may need to perform some actions at this stage. After
the target driver has done the needed actions, it shall call
<it/scst_restart_cmd()/ in order to continue processing this command. In case
of preliminary command completion, this function will also be called
before xmit_response(). Called only for commands queued using
scst_cmd_init_stage1_done() instead of scst_cmd_init_done(). Returns
void; the result is expected to be returned using scst_restart_cmd().
This function is expected to be NON-BLOCKING. If it is blocking, consider
setting threads_num to some non-zero number. Pay attention to the "atomic"
attribute of the cmd, which can be obtained via scst_cmd_atomic(). It is
true if the function is called in the atomic (non-sleeping) context.

<item> <bf/int (*pre_exec) (struct scst_cmd *cmd)/ - this function
informs the driver that the said command is about to be executed.
Returns one of the <it/SCST_PREPROCESS_*/ constants. This function is
expected to be NON-BLOCKING. If it is blocking, consider setting
threads_num to some non-zero number.

<item> <bf/void (*task_mgmt_affected_cmds_done) (struct scst_mgmt_cmd
*mgmt_cmd)/ - this function informs the driver that all commands affected
by the corresponding task management function have been completed. No
return value expected. This function is expected to be NON-BLOCKING.
Called without any locks held from a thread context.

<item><bf/void (*task_mgmt_fn_done)(struct scst_mgmt_cmd *mgmt_cmd)/ -
this function informs the driver that a received task management
function has been completed. The completion status can be obtained via
<it/scst_mgmt_cmd_get_status()/. No return value expected. Must be
defined, if the target supports task management functionality.

<item><bf/int (*report_aen) (struct scst_aen *aen)/ - this function is
used for Asynchronous Event Notifications. Returns one of the
<it/SCST_AEN_RES_*/ constants. After the AEN is sent, the target driver
must call <it/scst_aen_done()/ and, optionally,
<it/scst_set_aen_delivery_status()/. This function is expected to be
NON-BLOCKING, but can sleep. This function must be prepared to handle
AENs arriving between the call of scst_unregister_session() for the
corresponding session and the call of the unreg_done_fn() callback, or
before scst_unregister_session() returns, if it is called in the blocking
mode. AENs for such sessions should be ignored. Must be defined, if the
low-level protocol supports AENs.

<item> <bf/int (*read_proc) (struct seq_file *seq, struct scst_tgt
*tgt), int (*write_proc) (char *buffer, char **start, off_t offset,
int length, int *eof, struct scst_tgt *tgt)/ - these functions can be
used to export the driver's statistics and other information to the world
outside the kernel as well as to get some management commands from it.
If the driver needs to create additional files in its /proc
subdirectory, it can use the <it/scst_proc_get_tgt_root()/ function to get
the root proc_dir_entry.

<item> <bf/int (*get_initiator_port_transport_id) (struct scst_session
*sess, uint8_t **transport_id)/ - this function returns in transport_id
the initiator port TransportID corresponding to sess, in the form used
by PR commands, see "Transport Identifiers" in SPC. Space for the
initiator port TransportID must be allocated via kmalloc(). The caller is
supposed to kfree() it when it isn't needed anymore. If sess is NULL,
this function must return the TransportID PROTOCOL IDENTIFIER of this
transport. Returns 0 on success or a negative error code otherwise. Should
be defined, because it's required for Persistent Reservations.

</itemize>

Functions <bf/xmit_response()/ and <bf/rdy_to_xfer()/ are expected to be
non-blocking, i.e. to return immediately and not wait for the actual data
transfer to finish. Blocking in such a function could negatively impact
overall system performance. If blocking is necessary, it is worth
considering creating dedicated thread(s) in the target driver, to which the
commands would be passed and which would perform the blocking operations
instead of SCST. Whether the function is allowed to sleep is defined by
the "atomic" attribute of the cmd, which can be obtained via
<it/scst_cmd_atomic()/ and is true if sleeping is not allowed. In
this case, if the function requires sleeping, it can return
<it/SCST_TGT_RES_NEED_THREAD_CTX/ in order to be called again in the thread
context, where sleeping is allowed.

Functions <bf/task_mgmt_fn_done()/ and <bf/report_aen()/ are recommended
to be non-blocking as well. Blocking there will stop all management
processing for all target drivers in the system (there is only one
management thread in the system).

Functions <bf/xmit_response()/ and <bf/rdy_to_xfer()/ can return the
following result codes:

<itemize>

<item><bf/SCST_TGT_RES_SUCCESS/ - success.

<item><bf/SCST_TGT_RES_QUEUE_FULL/ - the internal device queue is full, retry
later.

<item><bf/SCST_TGT_RES_NEED_THREAD_CTX/ - it is impossible to complete
the requested task in atomic context. The command should be restarted in the
thread context as described above.

<item><bf/SCST_TGT_RES_FATAL_ERROR/ - fatal error, i.e. the driver is unable to
perform the requested operation. If returned by <bf/xmit_response()/, the
command will be destroyed; if returned by <bf/rdy_to_xfer()/,
<bf/xmit_response()/ will be called with <bf/HARDWARE ERROR/ sense data.

</itemize>
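The result codes listed above can be illustrated with a small user-space
sketch. It is not SCST code: struct mock_cmd, its fields and the
MOCK_TGT_RES_* enumerators are hypothetical stand-ins (including their
numeric values), and only the decision order is meant to mirror the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the SCST_TGT_RES_* constants. The numeric values are
 * assumptions made for this sketch, not the ones in the SCST headers. */
enum mock_tgt_res {
	MOCK_TGT_RES_SUCCESS,
	MOCK_TGT_RES_QUEUE_FULL,
	MOCK_TGT_RES_NEED_THREAD_CTX,
	MOCK_TGT_RES_FATAL_ERROR,
};

/* Hypothetical command: only the state this sketch needs. */
struct mock_cmd {
	bool atomic;      /* what scst_cmd_atomic() would report */
	bool needs_sleep; /* sending this response requires sleeping */
	bool hw_ok;       /* the (imaginary) hardware is usable */
	int queue_free;   /* free slots in the (imaginary) hardware queue */
};

/* Mock xmit_response(): picks one result code and never blocks. */
static enum mock_tgt_res mock_xmit_response(struct mock_cmd *cmd)
{
	assert(cmd != NULL);

	if (!cmd->hw_ok)
		return MOCK_TGT_RES_FATAL_ERROR;

	/* Sleeping is not allowed in atomic context: ask to be
	 * called again from a thread. */
	if (cmd->atomic && cmd->needs_sleep)
		return MOCK_TGT_RES_NEED_THREAD_CTX;

	if (cmd->queue_free <= 0)
		return MOCK_TGT_RES_QUEUE_FULL;

	cmd->queue_free--; /* hand the response to the hardware queue */
	return MOCK_TGT_RES_SUCCESS;
}
```

A real driver would start the transfer at the point where the sketch
decrements queue_free, and would report completion later from its
completion handler.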
|
|
|
|
<sect2>More about xmit_response()

<p>
As already mentioned above, xmit_response() should transmit
the response data and the status from the cmd parameter.

Sense data, if any, is contained in the buffer returned by
<it/scst_cmd_get_sense_buffer()/, with the length returned by
<it/scst_cmd_get_sense_buffer_len()/. SCST always works in autosense
mode. If a low-level SCSI driver/device doesn't support autosense mode,
SCST will issue a REQUEST SENSE command, if necessary. Thus, if CHECK
CONDITION is established, the target driver will always see the sense in
the sense buffer and isn't required to request the sense manually.

After the response is completely sent, the target should call the
<it/scst_tgt_cmd_done()/ function in order to allow SCST to free the
command.

Function xmit_response() returns one of the <it/SCST_TGT_RES_*/
constants, described above. Pay attention to the "atomic" attribute of the
cmd, which can be obtained via <it/scst_cmd_atomic()/: it is true if the
function is called in the atomic (non-sleeping) context.

To detect aborted commands, xmit_response() must first check the
return value of <bf/scst_cmd_aborted_on_xmit()/. If it is true,
xmit_response() must call <bf/scst_set_delivery_status(cmd,
SCST_CMD_DELIVERY_ABORTED)/ and terminate further processing by calling
<bf/scst_tgt_cmd_done(cmd, SCST_CONTEXT_SAME)/.
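
The aborted-command check described above might look roughly like the
following user-space sketch. All mock_* names are hypothetical stand-ins
for the SCST functions of similar name; their signatures and the struct
layout are assumptions made only so the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mock_delivery { MOCK_DELIVERY_SUCCESS, MOCK_DELIVERY_ABORTED };

/* Hypothetical command state tracked by this sketch. */
struct mock_cmd {
	bool aborted;                /* set when the command was aborted */
	enum mock_delivery delivery; /* delivery status reported to the core */
	bool done;                   /* "done" was signalled to the core */
	bool sent;                   /* response handed to the transport */
};

/* Stand-in for scst_cmd_aborted_on_xmit(). */
static bool mock_cmd_aborted_on_xmit(const struct mock_cmd *cmd)
{
	return cmd->aborted;
}

/* Stand-in for scst_set_delivery_status(). */
static void mock_set_delivery_status(struct mock_cmd *cmd,
				     enum mock_delivery status)
{
	cmd->delivery = status;
}

/* Stand-in for scst_tgt_cmd_done(). */
static void mock_tgt_cmd_done(struct mock_cmd *cmd)
{
	cmd->done = true;
}

/* Mock xmit_response(): the abort check comes before anything else. */
static int mock_xmit_response(struct mock_cmd *cmd)
{
	assert(cmd != NULL);

	if (mock_cmd_aborted_on_xmit(cmd)) {
		mock_set_delivery_status(cmd, MOCK_DELIVERY_ABORTED);
		mock_tgt_cmd_done(cmd); /* nothing is sent for aborted cmds */
		return 0;
	}

	/* Normal path: start the transfer. A real driver would signal
	 * "done" later, once the response has been completely sent. */
	cmd->sent = true;
	return 0;
}
```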
|
|
|
|
<sect1>Target driver registration functions

<sect2>scst_register_target_template()

<p>
Function <bf/scst_register_target_template()/ is defined as follows:

<verb>
int scst_register_target_template(
	struct scst_tgt_template *vtt)
</verb>

Where:

<itemize>
<item><bf/vtt/ - pointer to the target driver template
</itemize>

Returns 0 on success or an appropriate error code otherwise.

<sect2>scst_register_target()

<p>
Function <bf/scst_register_target()/ is defined as follows:

<verb>
struct scst_tgt *scst_register_target(
	struct scst_tgt_template *vtt)
</verb>

Where:

<itemize>
<item><bf/vtt/ - pointer to the target driver template
</itemize>

Returns the target structure based on template vtt or NULL in case of error.

<sect1>Target driver unregistration functions

<p>
To unregister itself, a target driver should first call
<bf/scst_unregister_target()/ for all its adapters and then call
<bf/scst_unregister_target_template()/ for its template.
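
The ordering described above (template first and targets second on load,
targets first and template last on unload) can be sketched in user space
as follows. The mock_* types and functions are hypothetical stand-ins and
do not reproduce the real SCST signatures or error codes. The assertion
in mock_unregister_template() is this sketch's own way of making the
ordering rule visible.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_TARGETS 4

/* Hypothetical template and target bookkeeping. */
struct mock_template {
	const char *name;
	int registered;
	int nr_targets;
};

struct mock_target {
	struct mock_template *tmpl;
	int in_use;
};

static struct mock_target mock_targets[MOCK_MAX_TARGETS];

static int mock_register_template(struct mock_template *t)
{
	if (t->name == NULL || t->registered)
		return -1;
	t->registered = 1;
	t->nr_targets = 0;
	return 0;
}

static struct mock_target *mock_register_target(struct mock_template *t)
{
	int i;

	if (!t->registered)
		return NULL;
	for (i = 0; i < MOCK_MAX_TARGETS; i++) {
		if (!mock_targets[i].in_use) {
			mock_targets[i].in_use = 1;
			mock_targets[i].tmpl = t;
			t->nr_targets++;
			return &mock_targets[i];
		}
	}
	return NULL;
}

static void mock_unregister_target(struct mock_target *tgt)
{
	assert(tgt != NULL && tgt->in_use);
	tgt->tmpl->nr_targets--;
	tgt->in_use = 0;
}

static void mock_unregister_template(struct mock_template *t)
{
	/* All targets must already be gone. */
	assert(t->nr_targets == 0);
	t->registered = 0;
}

/* Load and unload a driver with nr_adapters adapters; 0 on success. */
static int mock_driver_load_unload(struct mock_template *t, int nr_adapters)
{
	struct mock_target *tgts[MOCK_MAX_TARGETS];
	int i, n = 0;

	if (nr_adapters < 0 || nr_adapters > MOCK_MAX_TARGETS)
		return -1;
	if (mock_register_template(t) != 0)
		return -1;
	for (i = 0; i < nr_adapters; i++) {
		tgts[n] = mock_register_target(t);
		if (tgts[n] == NULL)
			break;
		n++;
	}
	for (i = 0; i < n; i++)      /* targets first ... */
		mock_unregister_target(tgts[i]);
	mock_unregister_template(t); /* ... template last */
	return n == nr_adapters ? 0 : -1;
}
```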
|
|
|
|
<sect2>scst_unregister_target()

<p>
Function <bf/scst_unregister_target()/ is defined as follows:

<verb>
void scst_unregister_target(
	struct scst_tgt *tgt)
</verb>

Where:

<itemize>
<item><bf/tgt/ - pointer to the target driver structure
</itemize>

<sect2>scst_unregister_target_template()

<p>
Function <bf/scst_unregister_target_template()/ is defined as follows:

<verb>
void scst_unregister_target_template(
	struct scst_tgt_template *vtt)
</verb>

Where:

<itemize>
<item><bf/vtt/ - pointer to the target driver template
</itemize>

<sect>Device specific drivers (backend device handlers)

<p> Device specific drivers are add-ons for SCST that help SCST
analyze incoming requests, determine parameters specific to various
types of devices, and actually execute the specified SCSI commands.
Device handlers are intended for the following:

<itemize>

<item>To get the data transfer length and direction directly from the CDB
and the current device's configuration, exactly as an end-target SCSI
device does. This serves two purposes:

	<itemize>

	<item> Improves security and reliability by not trusting the data
	supplied by the remote initiator via the SCSI low-level protocol.

	<item> Some low-level SCSI protocols don't provide the data transfer
	length and direction, so that information can be obtained only
	directly from the CDB and the current device's configuration. For
	example, for tape devices, getting the data transfer size might
	require knowing the block size setting.

	</itemize>

<item>To execute commands.

<item>To process some exceptional conditions, like ILI on tape devices.

<item>To initialize incoming commands with some device-specific
parameters, like a timeout value.

<item>To allow some additional device-specific pre- and post-processing
of commands or alternative execution, like copying data from the system
cache, and to do that completely independently of target drivers.

</itemize>
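
As an illustration of the first point in the list above, the following
user-space sketch derives the transfer length and direction from a CDB
and the device's block size. It handles only TEST UNIT READY (opcode
0x00), READ(10) (0x28) and WRITE(10) (0x2A). The mock_* names and the
result structure are hypothetical and are not part of the SCST API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical direction codes; "read" means target to initiator. */
enum mock_data_dir {
	MOCK_DATA_NONE,
	MOCK_DATA_READ,
	MOCK_DATA_WRITE,
};

struct mock_parse_result {
	enum mock_data_dir dir;
	uint32_t bufflen; /* transfer length in bytes */
};

/*
 * Derive direction and byte count from the CDB and the block size.
 * Returns 0 on success, -1 for a short or unsupported CDB.
 */
static int mock_parse_cdb(const uint8_t *cdb, size_t cdb_len,
			  uint32_t block_size,
			  struct mock_parse_result *out)
{
	assert(cdb != NULL && out != NULL);

	if (cdb_len < 1)
		return -1;

	switch (cdb[0]) {
	case 0x00: /* TEST UNIT READY: no data phase */
		out->dir = MOCK_DATA_NONE;
		out->bufflen = 0;
		return 0;
	case 0x28: /* READ(10) */
	case 0x2A: /* WRITE(10) */
		if (cdb_len < 10)
			return -1;
		out->dir = (cdb[0] == 0x28) ? MOCK_DATA_READ
					    : MOCK_DATA_WRITE;
		/* Bytes 7..8 hold the big-endian length in blocks. */
		out->bufflen = ((((uint32_t)cdb[7]) << 8) | cdb[8])
			       * block_size;
		return 0;
	default:
		return -1;
	}
}
```

Because the byte count depends on block_size, the same CDB yields a
different buffer length on a device configured with a different block
size, which is why the device's current configuration is needed in
addition to the CDB.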
|
|
|
|
Device handlers are considered to be part of SCST, so they can directly
access any fields in SCST's structures as well as use the corresponding
functions.

Without an appropriate device handler SCST hides devices of this type from
remote initiators and returns <bf/HARDWARE ERROR/ sense data to any
requests to them.

<sect1>Structure <bf/scst_dev_type/

<p>
Structure <bf/scst_dev_type/ is defined as follows:

<verb>
struct scst_dev_type
{
	char name[];
	int type;

	unsigned parse_atomic:1;
	unsigned alloc_data_buf_atomic:1;
	unsigned dev_done_atomic:1;

	unsigned no_proc:1;

	unsigned pr_cmds_notifications:1;

	int threads_num;
	enum scst_dev_type_threads_pool_type threads_pool_type;

	int (*attach) (struct scst_device *dev);
	void (*detach) (struct scst_device *dev);

	int (*attach_tgt) (struct scst_tgt_device *tgt_dev);
	void (*detach_tgt) (struct scst_tgt_device *tgt_dev);

	int (*parse) (struct scst_cmd *cmd);
	int (*alloc_data_buf) (struct scst_cmd *cmd);
	int (*exec) (struct scst_cmd *cmd);
	int (*dev_done) (struct scst_cmd *cmd);
	void (*on_free_cmd) (struct scst_cmd *cmd);

	int (*task_mgmt_fn) (struct scst_mgmt_cmd *mgmt_cmd,
		struct scst_tgt_dev *tgt_dev);

	int (*read_proc) (struct seq_file *seq, struct scst_dev_type *dev_type);
	int (*write_proc) (char *buffer, char **start, off_t offset,
		int length, int *eof, struct scst_dev_type *dev_type);
}
</verb>

Where:

<itemize>

<item><bf/name/ - the name of the device handler. Must be defined and
unique.

<item><bf/type/ - SCSI type of the supported device. Must be defined.

<item><bf/parse_atomic/, <bf/alloc_data_buf_atomic/,
<bf/dev_done_atomic/ - true, if the corresponding callback supports
execution in the atomic (non-sleeping) context.

<item> <bf/no_proc/ - true, if no /proc files should be automatically
created by SCST for this dev handler.

<item> <bf/pr_cmds_notifications/ - should be set if the device wants to
receive notifications of Persistent Reservation commands (PR OUT only).
Note: the notifications will not be sent if the command failed.

<item> <bf/threads_num/ - sets the number of threads in this handler's
devices' thread pools. If 0, no threads will be created; if <0,
creation of the thread pools is prohibited. Also pay attention to
<it/threads_pool_type/ below.

<item> <bf/threads_pool_type/ - thread pool type. Valid only if
threads_num > 0. Possible values:

	<itemize>

	<item> <bf/SCST_THREADS_POOL_PER_INITIATOR/ - each initiator
	will have a dedicated thread pool

	<item> <bf/SCST_THREADS_POOL_SHARED/ - all connected initiators will use
	a shared thread pool

	</itemize>

<item><bf/int (*attach) (struct scst_device *dev)/ - called when a new
device is being attached to the device handler

<item><bf/void (*detach) (struct scst_device *dev)/ - called when a
device is being detached from the device handler

<item><bf/int (*attach_tgt) (struct scst_tgt_device *tgt_dev)/ - called
when a new tgt_dev (session) is being attached to the device handler

<item><bf/void (*detach_tgt) (struct scst_tgt_device *tgt_dev)/ - called
when a tgt_dev (session) is being detached from the device handler

<item><bf/int (*parse) (struct scst_cmd *cmd)/ - called to parse the CDB
from the cmd and initialize
<it/cmd->bufflen/ and <it/cmd->data_direction/ (both - REQUIRED). Returns the
command's <it/next state/ or <it/SCST_CMD_STATE_DEFAULT/, if the next default
state should be used, or <it/SCST_CMD_STATE_NEED_THREAD_CTX/ if the function
is called in atomic context, but requires sleeping, or <it/SCST_CMD_STATE_STOP/
if the command should not be further processed for now. In the
SCST_CMD_STATE_NEED_THREAD_CTX case the function will be called again in the
thread context, where sleeping is allowed. Pay attention to the "atomic"
attribute of the cmd, which can be obtained via scst_cmd_atomic(). It is
true if the function is called in the atomic (non-sleeping) context. Must be
defined.

<item><bf/int (*alloc_data_buf) (struct scst_cmd *cmd)/ - this function
allows the dev handler to handle data buffer allocations on its own. Returns
the command's <it/next state/ or <it/SCST_CMD_STATE_DEFAULT/, if the
next default state should be used, or
<it/SCST_CMD_STATE_NEED_THREAD_CTX/ if the function is called in atomic
context, but requires sleeping, or <it/SCST_CMD_STATE_STOP/ if the
command should not be further processed for now. In the
SCST_CMD_STATE_NEED_THREAD_CTX case the function will be called again in the
thread context, where sleeping is allowed. Pay attention to the "atomic"
attribute of the cmd, which can be obtained via scst_cmd_atomic(). It is
true if the function is called in the atomic (non-sleeping) context.

<item> <bf/int (*exec) (struct scst_cmd *cmd)/ - called to execute the CDB.
Useful, for instance, to implement data caching. The result of CDB
execution is reported via the <it/cmd->scst_cmd_done()/ callback.

Returns:
<itemize>

<item> <bf/SCST_EXEC_COMPLETED/ - the cmd is done, go to other ones

<item> <bf/SCST_EXEC_NOT_COMPLETED/ - the cmd should be sent to the SCSI
mid-level.
</itemize>

If this function executes commands synchronously, consider setting up
dedicated threads by setting <it/threads_num/ > 0.

Optional; if not set, the commands will be sent directly to the SCSI
device.

<bf/If this function is implemented, scst_check_local_events() shall be
called inside it just before the actual command's execution./

<item><bf/int (*dev_done) (struct scst_cmd *cmd)/ - called to notify the
device handler about the result of the command's execution and to perform
some post-processing. If the <it/parse()/ function is called, dev_done() is
<it/guaranteed/ to be called as well. The command's fields
<it/tgt_resp_flags/ and <it/resp_data_len/ should be set by this
function, but SCST offers good defaults. Pay attention to the "atomic"
attribute of the command, which can be obtained via scst_cmd_atomic(). It is
true if the function is called in the atomic (non-sleeping) context.
Returns the command's <it/next state/ or <it/SCST_CMD_STATE_DEFAULT/, if
the next default state should be used, or
<it/SCST_CMD_STATE_NEED_THREAD_CTX/ if the function is called in atomic
context, but requires sleeping. In the last case, the function will be
called again in the thread context, where sleeping is allowed.

<item><bf/void (*on_free_cmd) (struct scst_cmd *cmd)/ - called to notify the
device handler that the command is about to be freed. Could be called in
IRQ context.

<item><bf/int (*task_mgmt_fn) (struct scst_mgmt_cmd *mgmt_cmd, struct
scst_tgt_dev *tgt_dev)/ - called to execute a task management command.
Returns:

	<itemize>

	<item><bf/SCST_MGMT_STATUS_SUCCESS/ - the command completed
	successfully, no further actions required

	<item><bf/SCST_MGMT_STATUS_*/ - the command failed,
	no further actions required

	<item><bf/SCST_DEV_TM_NOT_COMPLETED/ - the regular standard actions
	for the command should be done

	</itemize>

<bf/NOTE/: for <bf/SCST_ABORT_TASK/ it is called under a spinlock!

<item> <bf/int (*read_proc) (struct seq_file *seq, struct scst_dev_type
*dev_type), int (*write_proc) (char *buffer, char **start, off_t offset,
int length, int *eof, struct scst_dev_type *dev_type)/ - these functions can be
used to export the driver's statistics and other information to the world
outside the kernel as well as to get some management commands from it.
If the driver needs to create additional files in its /proc
subdirectory, it can use the <it/scst_proc_get_dev_type_root()/ function to
get the root proc_dir_entry.

</itemize>
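
The two exec() outcomes described above can be sketched in user space
with a toy one-entry read cache. Everything here is hypothetical: the
mock_* types, the single-byte "data", and the completed flag that stands
in for reporting the result through the command's done callback. The
required scst_check_local_events() call is deliberately left out, since
its behaviour is not described in this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for SCST_EXEC_COMPLETED and SCST_EXEC_NOT_COMPLETED. */
enum mock_exec_res {
	MOCK_EXEC_COMPLETED,
	MOCK_EXEC_NOT_COMPLETED,
};

/* Hypothetical read command for a single block. */
struct mock_cmd {
	uint32_t lba;   /* block being read */
	uint8_t data;   /* one byte stands in for the data buffer */
	bool completed; /* result has been reported to the core */
};

/* Hypothetical one-entry cache owned by the dev handler. */
struct mock_cache {
	bool valid;
	uint32_t lba;
	uint8_t data;
};

/* Mock exec(): serve the read from the cache when possible. */
static enum mock_exec_res mock_exec(struct mock_cmd *cmd,
				    const struct mock_cache *cache)
{
	assert(cmd != NULL && cache != NULL);

	if (cache->valid && cache->lba == cmd->lba) {
		cmd->data = cache->data;
		cmd->completed = true; /* handler finished the command */
		return MOCK_EXEC_COMPLETED;
	}

	/* Cache miss: let the core pass the command on. */
	return MOCK_EXEC_NOT_COMPLETED;
}
```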
|
|
|
|
<sect1>Device specific drivers registration

<sect2> scst_register_dev_driver()

<p>
To work with SCST, a device specific driver must register itself in SCST by
calling <bf/scst_register_dev_driver()/. It is defined as follows:

<verb>
int scst_register_dev_driver(
	struct scst_dev_type *dev_type)
</verb>

Where:

<itemize>
<item><bf/dev_type/ - device specific driver's description structure
</itemize>

The function returns 0 on success or an appropriate error code otherwise.

<sect2> scst_register_virtual_device()

<p>
To create a virtual device, a device handler must register it in SCST by
calling <bf/scst_register_virtual_device()/. It is defined as follows:

<verb>
int scst_register_virtual_device(
	struct scst_dev_type *dev_handler,
	const char *dev_name)
</verb>

Where:

<itemize>

<item><bf/dev_handler/ - device specific driver's description structure

<item> <bf/dev_name/ - the new device's name, a NULL-terminated string. Must be unique
among all virtual devices in the system.

</itemize>

The function returns the ID assigned to the device on success, or a negative
value otherwise.

All local real SCSI devices will be registered and unregistered by the
|
|
SCST core automatically, so pass-through dev handlers don't have to
|
|
worry about it.
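
The ID-based contract of virtual device registration (a unique name goes
in, an ID or a negative value comes out, and the ID is later used to
unregister) can be sketched in user space as follows. The mock_*
functions, the fixed-size table and the choice of IDs starting at 1 are
assumptions of this sketch, not SCST behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_MAX_VDEVS 8

/* Hypothetical registry: slot i holds the name of the device with ID i+1. */
static const char *mock_vdev_names[MOCK_MAX_VDEVS];

/* Returns a positive ID on success or a negative value on failure. */
static int mock_register_virtual_device(const char *dev_name)
{
	int i, free_slot = -1;

	if (dev_name == NULL)
		return -1;
	for (i = 0; i < MOCK_MAX_VDEVS; i++) {
		if (mock_vdev_names[i] == NULL) {
			if (free_slot < 0)
				free_slot = i;
		} else if (strcmp(mock_vdev_names[i], dev_name) == 0) {
			return -1; /* names must be unique */
		}
	}
	if (free_slot < 0)
		return -1; /* table is full */
	mock_vdev_names[free_slot] = dev_name;
	return free_slot + 1;
}

/* Releases the slot that belongs to the given ID. */
static void mock_unregister_virtual_device(int id)
{
	assert(id >= 1 && id <= MOCK_MAX_VDEVS);
	mock_vdev_names[id - 1] = NULL;
}
```

A handler would typically store the returned ID in its per-device data,
so that it can pass it back when the device is removed.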
|
|
|
|
|
|
<sect1>Device specific drivers unregistration
|
|
|
|
<sect2> scst_unregister_virtual_device()
|
|
|
|
<p>
|
|
Virtual devices unregistered by calling
|
|
<bf/scst_unregister_virtual_device()/. It is defined as the following:
|
|
|
|
<verb>
|
|
void scst_unregister_virtual_device(
|
|
int id)
|
|
</verb>
|
|
|
|
Where:
|
|
|
|
<itemize>
|
|
<item><bf/id/ - the device's ID, returned by the registration function.
|
|
</itemize>
|
|
|
|
<sect2> scst_unregister_dev_driver()
|
|
|
|
<p>
|
|
A device specific driver is unregistered by calling
|
|
<bf/scst_unregister_dev_driver()/. It is defined as the following:
|
|
|
|
<verb>
void scst_unregister_dev_driver(
	struct scst_dev_type *dev_type)
</verb>
|
|
|
|
Where:
|
|
|
|
<itemize>
|
|
<item><bf/dev_type/ - device specific driver's description structure
|
|
</itemize>
|
|
|
|
<sect>SCST sessions
|
|
|
|
<sect1>SCST sessions registration
|
|
|
|
<p>
|
|
When a target driver determines that it needs to create a new SCST session
(for example, on receiving a new TCP connection), it should call
<bf/scst_register_session()/, which is defined as follows:
|
|
|
|
<verb>
struct scst_session *scst_register_session(
	struct scst_tgt *tgt,
	int atomic,
	const char *initiator_name,
	void *tgt_priv,
	void *result_fn_data,
	void (*result_fn) (
		struct scst_session *sess,
		void *data,
		int result))
</verb>
|
|
|
|
Where:
|
|
|
|
<itemize>
|
|
|
|
<item><bf/tgt/ - target
|
|
|
|
<item><bf/atomic/ - true if the function is called in atomic context
|
|
|
|
<item><bf/initiator_name/ - remote initiator's name, any NULL-terminated
string, e.g. iSCSI name, which is used as the key to find the appropriate
access control group. Can be NULL, in which case the "default" group is used. The
groups are set up via the /proc interface.
|
|
|
|
<item> <bf/tgt_priv/ - pointer to target driver's private data
|
|
|
|
<item><bf/result_fn_data/ - data that will be used as the second
parameter for the <bf/result_fn()/ function
|
|
|
|
<item><bf/result_fn/ - pointer to the function that will be
|
|
asynchronously called when session initialization finishes. Can be NULL.
|
|
Parameters:
|
|
|
|
<itemize>
|
|
|
|
<item><bf/sess/ - session
|
|
|
|
<item><bf/data/ - the data supplied by the target driver to scst_register_session()
|
|
|
|
<item><bf/result/ - session initialization result, 0 on success or
|
|
appropriate error code otherwise
|
|
|
|
</itemize>
|
|
|
|
</itemize>
|
|
|
|
Session creation and initialization is a complex task that may need to
sleep, so it cannot be fully done in interrupt context. Therefore, if
scst_register_session() is called from atomic context, the "bottom half"
of the work is done in SCST thread context. In this case
scst_register_session() returns a session that is not completely
initialized, but the target driver can already supply commands to it via
scst_rx_cmd(). Processing of those commands is delayed inside SCST until
the session initialization is finished, and then restarted. The target
driver is notified about the end of the session initialization via
<it/result_fn()/. On success the target driver does not have to do
anything. If the initialization fails, the target driver must ensure
that no new commands are being sent or will be sent to SCST after
result_fn() returns. All commands already sent to SCST for the failed
session will be returned in <it/xmit_response()/ with BUSY status. In
case of failure the driver shall call <it/scst_unregister_session()/
inside result_fn(); it will NOT be called automatically.
|
|
|
|
Thus, scst_register_session() can be safely called from IRQ context.
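
The failure handling required from result_fn() can be sketched in user
space as follows. This is only an illustration: struct sketch_drv_sess
and sketch_unregister_session() are hypothetical stand-ins invented for
this sketch (a real driver would call scst_unregister_session() with its
real arguments), and struct scst_session is left opaque.

```c
#include <stddef.h>

struct scst_session;	/* opaque in this sketch */

/* Hypothetical per-session driver state, passed as result_fn_data. */
struct sketch_drv_sess {
	int accept_cmds;		/* 1 while new commands may be sent to SCST */
	int unregister_requested;	/* set once the driver asked to unregister */
};

/* Stand-in for scst_unregister_session(), so the sketch builds in user space. */
static void sketch_unregister_session(struct scst_session *sess,
	struct sketch_drv_sess *ds)
{
	(void)sess;
	ds->unregister_requested = 1;
}

/* Has the same prototype as the result_fn callback described above. */
static void sketch_result_fn(struct scst_session *sess, void *data, int result)
{
	struct sketch_drv_sess *ds = data;

	if (result == 0)
		return;	/* success: nothing to do */

	/* Failure: stop feeding commands, then unregister explicitly. */
	ds->accept_cmds = 0;
	sketch_unregister_session(sess, ds);
}
```

The flag is cleared before the unregistration is requested, so that no
new command can be submitted for the failed session after result_fn()
returns.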
|
|
|
|
<sect1>SCST sessions unregistration
|
|
|
|
<p>
|
|
SCST session unregistration is basically the same, except that instead of
the atomic parameter there is a <bf/wait/ parameter.
|
|
|
|
<verb>
void scst_unregister_session(
	struct scst_session *sess,
	int wait,
	void (*unreg_done_fn)(
		struct scst_session *sess))
</verb>
|
|
|
|
Where:
|
|
|
|
<itemize>
|
|
|
|
<item><bf/sess/ - session to be unregistered
|
|
|
|
<item><bf/wait/ - if true, instructs SCST to wait until all commands that
are currently being executed in the session have finished. Otherwise, the target
driver should be prepared to receive <it/xmit_response()/ for the
session after scst_unregister_session() returns.
|
|
|
|
<item><bf/unreg_done_fn/ - pointer to the function that will be
|
|
asynchronously called when the last session's command finishes and the
|
|
session is about to be completely freed. Can be NULL. Parameter:
|
|
|
|
<itemize>
|
|
|
|
<item><bf/sess/ - session
|
|
|
|
</itemize>
|
|
|
|
</itemize>
|
|
|
|
All outstanding commands will be finished regularly. After
scst_unregister_session() has returned, no new commands may be sent to SCST
via scst_rx_cmd(). Also, the caller must ensure that no scst_rx_cmd() or
scst_rx_mgmt_fn_*() is called in parallel with
scst_unregister_session().

Function scst_unregister_session() can be called before result_fn() of
scst_register_session() has been called, i.e. during the session
registration/initialization.
|
|
|
|
|
|
<sect>Commands processing and interaction between SCST core and its drivers
|
|
|
|
<p>
|
|
Consider a simplified command processing example. It assumes that the target
driver doesn't need its own memory allocation, i.e. it does not define the
alloc_data_buf() callback. An example of such a target driver is qla2x00t.

Command processing by SCST starts when the target driver calls
<bf/scst_rx_cmd()/. This function returns an SCST command. Then the
target driver finishes the command's initialization, for example by
storing necessary target driver specific data there, and calls
<bf/scst_cmd_init_done()/, telling SCST that it can start processing the command.
Then SCST translates the command's LUN to a local device and determines the
command's data direction and required data buffer size by calling the
appropriate device handler's <bf/parse()/ callback function. Then:
|
|
|
|
<itemize>
|
|
|
|
<item>If the command requires no data transfer, it will be passed to
SCSI mid-level directly or via device handler's <bf/exec()/ callback.

<item>If the command is a <it/READ/ command (data to the remote/local initiator),
the necessary space will be allocated and then the command will be passed
to SCSI mid-level directly or via device handler's <bf/exec()/ callback.

<item>If the command is a <it/WRITE/ command (data from the remote/local initiator),
the necessary space will be allocated, then the target's <bf/rdy_to_xfer()/
callback will be called, telling the target that the space is ready and
it can start the data transfer. When all the data have been received, the
target will call <bf/scst_rx_data()/, and the command will be passed
to SCSI mid-level directly or via device handler's <bf/exec()/ callback.
|
|
|
|
</itemize>
|
|
|
|
When the command is finished by SCSI mid-level, device handler's
|
|
<bf/dev_done()/ callback is called to notify it about the command's
|
|
completion. Then in order to send its response the target's
|
|
<bf/xmit_response()/ callback is called. When the response, including
|
|
data, if any, is transmitted, the target will call
|
|
<bf/scst_tgt_cmd_done()/ to tell SCST that it can free the command and
|
|
its data buffer.
|
|
|
|
Then, during the command's deallocation, the device handler's and the target's
<bf/on_free_cmd()/ callbacks will be called in this order, if set.
|
|
|
|
This sequence is illustrated in Figure 2. To simplify the picture, the sign
"..." means SCST's waiting state for the corresponding command to
complete. During this state SCST and its drivers continue processing
other commands, if there are any. A one-way arrow, for example to
xmit_response(), means that after this function returns, nothing
valuable for the current command will be done and SCST goes to sleep or
to the next command's processing until the corresponding event happens.
|
|
|
|
<figure>
|
|
<eps file="fig2.png">
|
|
<img src="fig2.png">
|
|
<caption>
|
|
<newline> The commands processing flow
|
|
</caption>
|
|
</figure>
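
The order of calls in the simplified flow above can also be written down
as a small user space sketch. It only lists names in order and does not
call SCST. The enum and the function sketch_cmd_flow() are hypothetical
and exist only for this illustration. The "exec" step stands for either
the device handler's exec() callback or direct passing to SCSI mid-level,
and data buffer allocation is omitted.

```c
#include <stddef.h>

enum sketch_dir { SKETCH_DATA_NONE, SKETCH_DATA_READ, SKETCH_DATA_WRITE };

/*
 * Fills steps[] with the names of the functions and callbacks involved,
 * in order, and returns their count. steps[] needs room for 10 entries.
 */
static size_t sketch_cmd_flow(enum sketch_dir dir, const char **steps)
{
	size_t n = 0;

	steps[n++] = "scst_rx_cmd";		/* target driver -> SCST */
	steps[n++] = "scst_cmd_init_done";	/* target driver -> SCST */
	steps[n++] = "parse";			/* SCST -> device handler */
	if (dir == SKETCH_DATA_WRITE) {
		steps[n++] = "rdy_to_xfer";	/* SCST -> target driver */
		steps[n++] = "scst_rx_data";	/* target driver -> SCST */
	}
	steps[n++] = "exec";			/* or SCSI mid-level directly */
	steps[n++] = "dev_done";		/* SCST -> device handler */
	steps[n++] = "xmit_response";		/* SCST -> target driver */
	steps[n++] = "scst_tgt_cmd_done";	/* target driver -> SCST */
	steps[n++] = "on_free_cmd";		/* device handler, then target */
	return n;
}
```

Only WRITE commands add the rdy_to_xfer()/scst_rx_data() pair. READ
commands and commands without data transfer go through the same sequence
of calls.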
|
|
|
|
<sect1>The commands processing functions
|
|
|
|
<sect2>scst_rx_cmd()
|
|
|
|
<p>
|
|
Function <bf/scst_rx_cmd()/ creates a new command and sends it to SCST. It returns
the command on success or NULL otherwise. It is defined as
follows:
|
|
|
|
<verb>
struct scst_cmd *scst_rx_cmd(
	struct scst_session *sess,
	const uint8_t *lun,
	int lun_len,
	const uint8_t *cdb,
	int cdb_len,
	int atomic)
</verb>
|
|
|
|
Where:
|
|
|
|
<itemize>
|
|
|
|
<item><bf/sess/ - SCST's session
|
|
|
|
<item><bf/lun/ - pointer to device's LUN as specified by SAM, without
any byte order translation. Extended addressing method is not supported.
|
|
|
|
<item><bf/lun_len/ - LUN's length
|
|
|
|
<item><bf/cdb/ - SCSI CDB
|
|
|
|
<item><bf/cdb_len/ - CDB's length. Can be up to 64KB long.
|
|
|
|
<item><bf/atomic/ - if true, the command will be allocated with
|
|
GFP_ATOMIC flag, otherwise GFP_KERNEL will be used
|
|
|
|
</itemize>
|
|
|
|
<sect2>scst_cmd_init_done()
|
|
|
|
<p>
|
|
Function <bf/scst_cmd_init_done()/ notifies SCST that the driver finished
|
|
its part of the command initialization, and the command is ready for
|
|
execution. It is defined as the following:
|
|
|
|
<verb>
void scst_cmd_init_done(
	struct scst_cmd *cmd,
	enum scst_exec_context pref_context)
</verb>
|
|
|
|
Where:
|
|
|
|
<itemize>
|
|
|
|
<item><bf/cmd/ - the command
|
|
|
|
<item><bf/pref_context/ - preferred command execution context. See
|
|
<it/SCST_CONTEXT_*/ constants below for details.
|
|
|
|
</itemize>
|
|
|
|
<sect2>scst_rx_data()
|
|
|
|
<p>
|
|
Function <bf/scst_rx_data()/ notifies SCST that the driver received all
|
|
the necessary data and the command is ready for further processing. It
|
|
is defined as the following:
|
|
|
|
<verb>
void scst_rx_data(
	struct scst_cmd *cmd,
	int status,
	enum scst_exec_context pref_context)
</verb>
|
|
|
|
Where:
|
|
|
|
<itemize>
|
|
|
|
<item><bf/cmd/ - the command
|
|
|
|
<item><bf/status/ - completion status, see below.
|
|
|
|
<item><bf/pref_context/ - preferred command execution context. See
|
|
<it/SCST_CONTEXT_*/ constants below for details.
|
|
|
|
</itemize>
|
|
|
|
Parameter <bf/status/ can have one of the following values:
|
|
|
|
<itemize>
|
|
|
|
<item><bf/SCST_RX_STATUS_SUCCESS/ - success
|
|
|
|
<item><bf/SCST_RX_STATUS_ERROR/ - data receiving finished with error, so
|
|
SCST should set the sense and finish the command by calling
|
|
xmit_response()
|
|
|
|
<item><bf/SCST_RX_STATUS_ERROR_SENSE_SET/ - data receiving finished with
|
|
error and the sense is set, so SCST should finish the command by calling
|
|
xmit_response()
|
|
|
|
<item><bf/SCST_RX_STATUS_ERROR_FATAL/ - data receiving finished with
a fatal error, so SCST should finish the command without calling
xmit_response(). In this case the driver must free all data associated
with the command before calling scst_rx_data().
|
|
|
|
</itemize>
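
What each status implies for the target driver can be summarized in a
small user space sketch. The enum below is a local stand-in for the
SCST_RX_STATUS_* constants with arbitrary values, and the helper
functions are hypothetical names used only for this illustration.

```c
/* Local stand-ins for SCST_RX_STATUS_*; the values are arbitrary here. */
enum sketch_rx_status {
	SKETCH_RX_STATUS_SUCCESS,
	SKETCH_RX_STATUS_ERROR,
	SKETCH_RX_STATUS_ERROR_SENSE_SET,
	SKETCH_RX_STATUS_ERROR_FATAL,
};

/* 1 if the driver should still expect xmit_response() for the command. */
static int sketch_expects_xmit_response(enum sketch_rx_status status)
{
	return status != SKETCH_RX_STATUS_ERROR_FATAL;
}

/* 1 if the driver must have set the sense before calling scst_rx_data(). */
static int sketch_driver_sets_sense(enum sketch_rx_status status)
{
	return status == SKETCH_RX_STATUS_ERROR_SENSE_SET;
}

/* 1 if the driver must free the command's data before scst_rx_data(). */
static int sketch_driver_frees_first(enum sketch_rx_status status)
{
	return status == SKETCH_RX_STATUS_ERROR_FATAL;
}
```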
|
|
|
|
<sect2>scst_tgt_cmd_done()
|
|
|
|
<p>
|
|
Function <bf/scst_tgt_cmd_done()/ notifies SCST that the driver has sent
the data and/or response. It must not be called if there was an error
and xmit_response() returned something other than SCST_TGT_RES_SUCCESS.
It is defined as follows:
|
|
|
|
<verb>
void scst_tgt_cmd_done(
	struct scst_cmd *cmd,
	enum scst_exec_context pref_context)
</verb>
|
|
|
|
Where:
|
|
<itemize>
|
|
|
|
<item><bf/cmd/ - the command
|
|
|
|
<item><bf/pref_context/ - preferred command execution context. See
|
|
<it/SCST_CONTEXT_*/ constants below for details.
|
|
|
|
</itemize>
|
|
|
|
<sect1>The commands processing context
|
|
|
|
<p>
|
|
Execution context is often a major problem in kernel driver
development, because many contexts, like IRQ context, greatly limit
the available functionality and therefore require additional complex code in
order to pass processing to a less restrictive context. SCST does its best to
undertake most of the context handling.

At initialization time SCST creates, for internal command processing,
as many threads as there are processors in the system, or as many as specified by
the user via the <bf/scst_threads/ module parameter. Similarly, as many tasklets
are created as there are processors in the system.
|
|
|
|
Each command can be processed in one of four contexts:
|
|
|
|
<enum>
|
|
<item>Directly, i.e. in the caller's context, without limitations
|
|
<item>Directly atomically, i.e. with sleeping forbidden
|
|
<item>In the SCST's internal threads
|
|
<item>In the SCST's per processor tasklets
|
|
</enum>
|
|
|
|
The target driver sets this context via the pref_context parameter of SCST
functions. Additionally, the target template's <it/xmit_response_atomic/
and <it/rdy_to_xfer_atomic/ flags have direct influence on the context.
If one of them is false, the corresponding function will never be called
in atomic context and, if necessary, the command will be rescheduled
to one of the SCST's threads.

In some circumstances SCST can change the preferred context to a less
restrictive one, for example, for a large data buffer allocation, if
there is not enough GFP_ATOMIC memory.
|
|
|
|
<sect2>Preferred context constants
|
|
|
|
<p>
|
|
There are the following preferred context constants:
|
|
|
|
<itemize>
|
|
|
|
<item><bf/SCST_CONTEXT_DIRECT/ - sets direct command processing (i.e.
regular function calls in the current context), sleeping is allowed, no
context restrictions. Supposed to be used when calling from thread
context where no locks are held and the driver's architecture allows
sleeping without performance degradation or anything like that.

<item><bf/SCST_CONTEXT_DIRECT_ATOMIC/ - sets direct command processing
(i.e. regular function calls in the current context), sleeping is not
allowed. Supposed to be used when calling from thread context where
locks are held, when calling from softirq context, or when the driver's
architecture does not allow sleeping without performance degradation or
anything like that.
|
|
|
|
<item><bf/SCST_CONTEXT_TASKLET/ - tasklet or thread context required for
|
|
the command processing. Supposed to be used when calling from IRQ
|
|
context.
|
|
|
|
<item><bf/SCST_CONTEXT_THREAD/ - thread context required for the
|
|
command processing. Supposed to be used if the driver's architecture
|
|
does not allow using any of above.
|
|
|
|
<item> <bf/SCST_CONTEXT_SAME/ - context is the same as it was in the
previous call of the corresponding callback. For example, if a dev
handler's exec() reads data synchronously, this value should be used for
scst_cmd_done(). The same is true if scst_tgt_cmd_done() is called directly
from the target driver's xmit_response(). Not allowed in
scst_cmd_init_done() and scst_cmd_init_stage1_done().
|
|
|
|
</itemize>
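
The guidance above on which constant suits which calling situation can
be condensed into a user space sketch. The enum mirrors the names of the
SCST_CONTEXT_* constants but is a local stand-in, and struct
sketch_caller with all of its fields is a hypothetical description of
the caller, invented for this illustration.

```c
/* Local stand-ins for the SCST_CONTEXT_* constants. */
enum sketch_context {
	SKETCH_CONTEXT_DIRECT,
	SKETCH_CONTEXT_DIRECT_ATOMIC,
	SKETCH_CONTEXT_TASKLET,
	SKETCH_CONTEXT_THREAD,
};

/* Hypothetical description of where the driver is calling from. */
struct sketch_caller {
	int in_irq;		/* hard IRQ context */
	int in_softirq;		/* softirq context */
	int holds_locks;	/* thread context, but with locks held */
	int sleep_is_costly;	/* sleeping would hurt the driver's performance */
	int needs_thread;	/* none of the other contexts fits the driver */
};

static enum sketch_context sketch_pick_context(const struct sketch_caller *c)
{
	if (c->needs_thread)
		return SKETCH_CONTEXT_THREAD;
	if (c->in_irq)
		return SKETCH_CONTEXT_TASKLET;
	if (c->in_softirq || c->holds_locks || c->sleep_is_costly)
		return SKETCH_CONTEXT_DIRECT_ATOMIC;
	return SKETCH_CONTEXT_DIRECT;	/* plain thread context, may sleep */
}
```

SCST_CONTEXT_SAME is left out on purpose, since it depends on the
previous call rather than on the caller's current situation.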
|
|
|
|
<sect1>SCST commands' processing states
|
|
|
|
<p>
|
|
There are the following processing states, which an SCST command passes
through during execution and which could be returned by device handler's
<bf/parse()/ and <bf/dev_done()/ (but not all states are allowed to be
returned):
|
|
|
|
<itemize>
|
|
|
|
<item><bf/SCST_CMD_STATE_INIT_WAIT/ - the command is created, but
<it/scst_cmd_init_done()/ has not been called yet
|
|
|
|
<item><bf/SCST_CMD_STATE_INIT/ - LUN translation (i.e. <it/cmd->tgt_dev/
|
|
assignment) state
|
|
|
|
<item><bf/SCST_CMD_STATE_PARSE/ - device handler's <it/parse()/ is going
|
|
to be called
|
|
|
|
<item><bf/SCST_CMD_STATE_PREPARE_SPACE/ - allocation of the command's
|
|
data buffer
|
|
|
|
<item> <bf/SCST_CMD_STATE_PREPROCESSING_DONE_CALLED/ - waiting for scst_restart_cmd()
|
|
|
|
<item><bf/SCST_CMD_STATE_RDY_TO_XFER/ - target driver's
|
|
<it/rdy_to_xfer()/ is going to be called
|
|
|
|
<item><bf/SCST_CMD_STATE_DATA_WAIT/ - waiting for data from the initiator
|
|
(until <it/scst_rx_data()/ called)
|
|
|
|
<item> <bf/SCST_CMD_STATE_TGT_PRE_EXEC/ - target driver's
|
|
<it/pre_exec()/ is going to be called
|
|
|
|
<item><bf/SCST_CMD_STATE_SEND_FOR_EXEC/ - the command is going to be
|
|
sent for execution
|
|
|
|
<item><bf/SCST_CMD_STATE_EXECUTING/ - waiting for the command's execution
|
|
finish
|
|
|
|
<item> <bf/SCST_CMD_STATE_LOCAL_EXEC/ - the command is being checked if
|
|
it should be executed locally
|
|
|
|
<item> <bf/SCST_CMD_STATE_REAL_EXEC/ - the command is ready for execution
|
|
|
|
<item> <bf/SCST_CMD_STATE_REAL_EXECUTING/ - waiting for CDB's execution
|
|
finish
|
|
|
|
<item> <bf/SCST_CMD_STATE_PRE_DEV_DONE/ - internal post-exec checks
|
|
|
|
<item> <bf/SCST_CMD_STATE_MODE_SELECT_CHECKS/ - internal MODE SELECT
|
|
pages related checks
|
|
|
|
<item><bf/SCST_CMD_STATE_DEV_DONE/ - device handler's <it/dev_done()/ is
|
|
going to be called
|
|
|
|
<item> <bf/SCST_CMD_STATE_PRE_XMIT_RESP/ - checks before target driver's
|
|
<it/xmit_response()/ is called
|
|
|
|
<item><bf/SCST_CMD_STATE_XMIT_RESP/ - target driver's
|
|
<it/xmit_response()/ is going to be called
|
|
|
|
<item><bf/SCST_CMD_STATE_XMIT_WAIT/ - waiting for the data/response
transmission to finish (until <it/scst_tgt_cmd_done()/ is called)

<item><bf/SCST_CMD_STATE_FINISHED/ - the command has finished and is going to be
freed
|
|
|
|
</itemize>
|
|
|
|
|
|
<sect>Task management functions
|
|
|
|
<p>
|
|
There are the following task management functions supported:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/SCST_ABORT_TASK/ - this is <it/ABORT_TASK/ SAM task
|
|
management function. Aborts the specified task (command).
|
|
|
|
<item> <bf/SCST_ABORT_TASK_SET/ - this is <it/ABORT_TASK_SET/ SAM task
|
|
management function. Aborts all tasks (commands) in the specified
|
|
session.
|
|
|
|
<item> <bf/SCST_CLEAR_ACA/ - this is <bf/CLEAR_ACA/ SAM task management
|
|
function. Currently does nothing.
|
|
|
|
<item> <bf/SCST_CLEAR_TASK_SET/ - this is <bf/CLEAR_TASK_SET/ SAM task
|
|
management function. Clears task set of commands on the specified
|
|
device or session.
|
|
|
|
<item> <bf/SCST_LUN_RESET/ - this is <bf/LUN_RESET/ SAM task management
|
|
function. Resets specified device.
|
|
|
|
<item> <bf/SCST_TARGET_RESET/ - this is <bf/TARGET_RESET/ SAM task management
|
|
function. Resets all devices visible in this session.
|
|
|
|
<item> <bf/SCST_NEXUS_LOSS_SESS/ - SCST extension. Notifies about I_T
|
|
nexus loss event in the corresponding session. Aborts all tasks there,
|
|
resets the reservation, if any, and sets up the I_T Nexus loss UA.
|
|
|
|
<item> <bf/SCST_ABORT_ALL_TASKS_SESS/ - SCST extension. Aborts all
|
|
tasks in the corresponding session.
|
|
|
|
<item> <bf/SCST_NEXUS_LOSS/ - SCST extension. Notifies about I_T nexus
|
|
loss event. Aborts all tasks in all sessions of the tgt, resets the
|
|
reservations, if any, and sets up the I_T Nexus loss UA.
|
|
|
|
<item> <bf/SCST_ABORT_ALL_TASKS/ - SCST extension. Aborts all tasks in
|
|
all sessions of the tgt.
|
|
|
|
</itemize>
|
|
|
|
All task management functions return completion status via
<it/task_mgmt_fn_done()/ when the affected SCSI commands (tasks) are
actually aborted, i.e. guaranteed never to be executed at any later time.
|
|
|
|
<sect1>scst_rx_mgmt_fn_tag()
|
|
|
|
<p>
|
|
Function <bf/scst_rx_mgmt_fn_tag()/ tells SCST to perform the specified
|
|
task management function, based on the command's tag. Can be used only
|
|
for <it/SCST_ABORT_TASK/.
|
|
|
|
It is defined as the following:
|
|
|
|
<verb>
int scst_rx_mgmt_fn_tag(
	struct scst_session *sess,
	int fn,
	uint32_t tag,
	int atomic,
	void *tgt_priv)
</verb>
|
|
|
|
Where:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/sess/ - the session, on which the command should be performed.
|
|
|
|
<item> <bf/fn/ - task management function, one of the constants above.
|
|
|
|
<item> <bf/tag/ - the command's tag.
|
|
|
|
<item> <bf/atomic/ - true if the function is called in atomic context.

<item> <bf/tgt_priv/ - pointer to the target driver specific data, can
be retrieved in task_mgmt_fn_done() via <it/scst_mgmt_cmd_get_tgt_priv()/
function.
|
|
|
|
</itemize>
|
|
|
|
Returns 0 if the command was successfully created and scheduled for
|
|
execution, error code otherwise. On success, the completion status of
|
|
the command will be reported asynchronously via task_mgmt_fn_done()
|
|
driver's callback.
|
|
|
|
<sect1>scst_rx_mgmt_fn_lun()
|
|
|
|
<p>
|
|
Function <bf/scst_rx_mgmt_fn_lun()/ tells SCST to perform the specified
|
|
task management function, based on the LUN. Currently it can be used for
|
|
any function, except <it/SCST_ABORT_TASK/.
|
|
|
|
It is defined as the following:
|
|
|
|
<verb>
int scst_rx_mgmt_fn_lun(
	struct scst_session *sess,
	int fn,
	const uint8_t *lun,
	int lun_len,
	int atomic,
	void *tgt_priv)
</verb>
|
|
|
|
Where:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/sess/ - the session, on which the command should be performed.
|
|
|
|
<item> <bf/fn/ - task management function, one of the constants above.
|
|
|
|
<item> <bf/lun/ - LUN, the format is the same as for <bf/scst_rx_cmd()/.
|
|
|
|
<item> <bf/lun_len/ - LUN's length.
|
|
|
|
<item> <bf/atomic/ - true if the function is called in atomic context.

<item> <bf/tgt_priv/ - pointer to the target driver specific data, can
be retrieved in task_mgmt_fn_done() via <it/scst_mgmt_cmd_get_tgt_priv()/
function.
|
|
|
|
</itemize>
|
|
|
|
Returns 0 if the command was successfully created and scheduled for
|
|
execution, error code otherwise. On success, the completion status of
|
|
the command will be reported asynchronously via task_mgmt_fn_done()
|
|
driver's callback.
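
The split between the two entry points can be expressed as a short user
space sketch. The enums are local stand-ins with arbitrary values, and
sketch_tm_entry_for() is a hypothetical helper used only for this
illustration.

```c
/* Local stand-ins for a few of the task management function constants. */
enum sketch_tm_fn {
	SKETCH_ABORT_TASK,
	SKETCH_ABORT_TASK_SET,
	SKETCH_CLEAR_TASK_SET,
	SKETCH_LUN_RESET,
	SKETCH_TARGET_RESET,
};

enum sketch_tm_entry {
	SKETCH_TM_BY_TAG,	/* use scst_rx_mgmt_fn_tag() */
	SKETCH_TM_BY_LUN,	/* use scst_rx_mgmt_fn_lun() */
};

/* ABORT TASK is addressed by tag, every other function by LUN. */
static enum sketch_tm_entry sketch_tm_entry_for(enum sketch_tm_fn fn)
{
	return fn == SKETCH_ABORT_TASK ? SKETCH_TM_BY_TAG : SKETCH_TM_BY_LUN;
}
```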
|
|
|
|
Possible status constants which can be returned by
|
|
<bf/scst_mgmt_cmd_get_status()/:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/SCST_MGMT_STATUS_SUCCESS/ - success
|
|
|
|
<item> <bf/SCST_MGMT_STATUS_TASK_NOT_EXIST/ - requested task does not exist
|
|
|
|
<item> <bf/SCST_MGMT_STATUS_LUN_NOT_EXIST/ - requested LUN does not exist
|
|
|
|
<item> <bf/SCST_MGMT_STATUS_FN_NOT_SUPPORTED/ - requested TM function
is not supported.
|
|
|
|
<item> <bf/SCST_MGMT_STATUS_REJECTED/ - TM function rejected.
|
|
|
|
<item> <bf/SCST_MGMT_STATUS_FAILED/ - TM function failed.
|
|
|
|
</itemize>
|
|
|
|
<sect>SGV cache<label id="sgv_cache">
|
|
|
|
<p>
|
|
SCST SGV cache is a memory management subsystem in SCST. One could call it
a "memory pool", but the Linux kernel already has a mempool interface,
which serves different purposes. The SGV cache provides the SCST core, target
drivers and backend dev handlers with facilities to allocate, build and cache
SG vectors for data buffers. Its main advantage is the caching
facility: it doesn't free each vector that is no longer used back to
the system, but keeps it for a while (possibly indefinitely) to let it
be reused by the next consecutive command. This makes it possible to:
|
|
|
|
<itemize>
|
|
|
|
<item> Reduce commands processing latencies and, hence, improve performance;
|
|
|
|
<item> Make commands processing latencies predictable, which is essential
|
|
for RT applications.
|
|
|
|
</itemize>
|
|
|
|
The freed SG vectors are kept by the SGV cache either for some (possibly
indefinite) time, or, optionally, until the system needs more memory and
asks to free some using the set_shrinker() interface. The SGV cache also
makes it possible to:
|
|
|
|
<itemize>
|
|
|
|
<item> Cluster pages together. "Cluster" means merging adjacent pages into a
single SG entry. This yields fewer SG entries in the resulting SG
vector, which improves the performance of handling it and allows
working with bigger buffers on hardware with limited SG capabilities.
|
|
|
|
<item> Set custom page allocator functions. For instance, scst_user device
|
|
handler uses this facility to eliminate unneeded mapping/unmapping of
|
|
user space pages and avoid unneeded IOCTL calls for buffers allocations.
|
|
In fileio_tgt application, which uses a regular malloc() function to
|
|
allocate data buffers, this facility allows ~30% less CPU load and
|
|
considerable performance increase.
|
|
|
|
<item> Prevent each initiator, or all initiators together, from allocating too
much memory and DoSing the target. Consider 10 initiators, each of which
has access to 10 devices. Any of them can queue up to 64 commands per device, and each
command can transfer up to 1MB of data. So at peak all of them together can allocate
up to 10*10*64*1MB = 6400MB (about 6.25GB) of memory for data buffers. This amount must be
limited somehow, and the SGV cache performs this function.
|
|
|
|
</itemize>
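
The worst case from the last item can be checked with a few lines of
user space C. The function name and its parameters are hypothetical and
only restate the numbers of the example, with 1MB taken as 1024*1024
bytes.

```c
/* Worst-case amount of data buffer memory, in bytes, for the example above. */
static unsigned long long sketch_peak_buffer_bytes(unsigned int initiators,
	unsigned int devices_per_initiator, unsigned int queue_depth,
	unsigned long long max_xfer_bytes)
{
	return (unsigned long long)initiators * devices_per_initiator *
		queue_depth * max_xfer_bytes;
}
```

With 10 initiators, 10 devices each, 64 queued commands and 1MB per
command this gives 6400 commands in flight and 6400MB of buffers.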
|
|
|
|
<sect1> Implementation
|
|
|
|
<p>
|
|
From the implementation point of view the SGV cache is a simple extension of the kmem
cache. It can work in 2 modes:
|
|
|
|
<enum>
|
|
|
|
<item> With fixed size buffers.
|
|
|
|
<item> With a set of power of 2 size buffers. In this mode each SGV cache
(struct sgv_pool) has SGV_POOL_ELEMENTS (11 currently) kmem caches.
Each of those kmem caches keeps SGV cache objects (struct sgv_pool_obj)
corresponding to SG vectors with size of order X pages. For instance, a
request to allocate 4 pages will be served from kmem cache&lsqb;2&rsqb;, since the
order of the number of requested pages is 2. If later a request to
allocate 11KB comes, the same SG vector with 4 pages will be reused (see
below). On average this mode has less memory overhead than
the fixed size buffers mode.
|
|
|
|
</enum>
|
|
|
|
Consider how the SGV cache works in the set of buffers mode. When a
request to allocate a new SG vector comes, sgv_pool_alloc() via
sgv_get_obj() checks if there is already a cached vector of that
order. If yes, that vector will be reused and its length, if
necessary, will be modified to match the requested size. In the above
example of a request for an 11KB buffer, the 4 pages vector will be reused and
modified using trans_tbl to contain 3 pages, and the last entry will be
modified to contain the requested length - 2*PAGE_SIZE. If there is no
cached object, a new sgv_pool_obj will be allocated from the
corresponding kmem cache, chosen by the order of the number of requested
pages. Then that vector will be filled with pages and returned.
|
|
|
|
In the fixed size buffers mode the SGV cache works similarly, except
that it always allocates a buffer of the predefined fixed size. I.e.
even for a 4K request the whole buffer of the predefined size, say, 1MB,
will be used.

In both modes, if the size of a request exceeds the maximum buffer size
allowed for caching, the requested buffer will be allocated, but not
cached.
|
|
|
|
Freed cached sgv_pool_obj objects are actually freed to the system
|
|
either by the purge work, which is scheduled once in 60 seconds, or in
|
|
sgv_shrink() called by system, when it's asking for memory.
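
The arithmetic behind the 4 pages and 11KB examples can be sketched in
user space as follows. The sketch assumes 4KB pages (the kernel's
PAGE_SIZE depends on the architecture), and all names in it are
hypothetical. It shows only the size computations, not the cache itself.

```c
/* Assumption of this sketch: 4KB pages. */
#define SKETCH_PAGE_SIZE 4096u

/* Number of whole pages needed to hold size bytes, rounded up. */
static unsigned int sketch_pages_for(unsigned int size)
{
	return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Smallest order such that (1 << order) pages can hold size bytes. */
static unsigned int sketch_order_for(unsigned int size)
{
	unsigned int pages = sketch_pages_for(size);
	unsigned int order = 0;

	while ((1u << order) < pages)
		order++;
	return order;
}

/* Length of the last used page once a vector is trimmed to size bytes. */
static unsigned int sketch_last_entry_len(unsigned int size)
{
	unsigned int pages = sketch_pages_for(size);

	if (pages == 0)
		return 0;
	return size - (pages - 1) * SKETCH_PAGE_SIZE;
}
```

Under this assumption a 4 pages request and an 11KB request both have
order 2, the 11KB request uses 3 pages, and its last page carries
11KB - 2*4KB = 3KB.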
|
|
|
|
<sect1> Interface
|
|
|
|
<sect2> sgv_pool *sgv_pool_create()
|
|
|
|
<p>
|
|
<verb>
struct sgv_pool *sgv_pool_create(
	const char *name,
	enum sgv_clustering_types clustered, int single_alloc_pages,
	bool shared, int purge_interval)
</verb>
|
|
|
|
This function creates and initializes an SGV cache. It has the following
|
|
arguments:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/name/ - the name of the SGV cache
|
|
|
|
<item> <bf/clustered/ - sets type of the pages clustering. The type can be:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/sgv_no_clustering/ - no clustering performed.
|
|
|
|
<item> <bf/sgv_tail_clustering/ - a page will only be merged with the latest
|
|
previously allocated page, so the order of pages in the SG will be
|
|
preserved
|
|
|
|
<item> <bf/sgv_full_clustering/ - free merging of pages at any place in
|
|
the SG is allowed. This mode usually provides the best merging
|
|
rate.
|
|
|
|
</itemize>
|
|
|
|
<item> <bf/single_alloc_pages/ - if 0, then the SGV cache will work in the set of
|
|
power 2 size buffers mode. If >0, then the SGV cache will work in the
|
|
fixed size buffers mode. In this case single_alloc_pages sets the
|
|
size of each buffer in pages.
|
|
|
|
<item> <bf/shared/ - sets whether the SGV cache can be shared between devices or not.
Cache sharing is allowed only between devices created inside the same
address space. If an SGV cache is shared, each subsequent call of
sgv_pool_create() with the same cache name will not create a new cache,
but instead return a reference to the existing one.

<item> <bf/purge_interval/ - sets the cache purging interval. I.e. an SG buffer
will be freed if it has been unused for time t, where purge_interval <= t <
2*purge_interval. If purge_interval is 0, the default interval
will be used (60 seconds). If purge_interval <0, automatic
purging will be disabled. Shrinking on the system's demand will also
be disabled.
|
|
|
|
</itemize>
|
|
|
|
Returns the resulting SGV cache or NULL in case of any error.
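
The difference between the clustering types can be illustrated with a
simplified user space model. It is not the SGV cache code. Pages are
represented by page frame numbers, an SG entry by a run of adjacent
pages, and all names are hypothetical. The model assumes that tail
clustering merges a page only when it directly follows the last entry,
and that full clustering may attach a page to either end of any existing
entry.

```c
#include <stddef.h>

/* One SG entry of this model: a run of physically adjacent pages. */
struct sketch_sg_entry {
	unsigned long first_pfn;	/* first page frame number of the run */
	unsigned long nr_pages;		/* number of pages in the run */
};

enum sketch_clustering {
	SKETCH_NO_CLUSTERING,
	SKETCH_TAIL_CLUSTERING,
	SKETCH_FULL_CLUSTERING,
};

/* Appends pfn to the end of entry e if it is adjacent; returns 1 if merged. */
static int sketch_merge_after(struct sketch_sg_entry *e, unsigned long pfn)
{
	if (e->first_pfn + e->nr_pages != pfn)
		return 0;
	e->nr_pages++;
	return 1;
}

/* Prepends pfn to entry e if it is adjacent; returns 1 if merged. */
static int sketch_merge_before(struct sketch_sg_entry *e, unsigned long pfn)
{
	if (pfn + 1 != e->first_pfn)
		return 0;
	e->first_pfn = pfn;
	e->nr_pages++;
	return 1;
}

/*
 * Builds SG entries for pages allocated in pfns[] order and returns the
 * number of entries used. sg[] must have room for n entries.
 */
static size_t sketch_build_sg(const unsigned long *pfns, size_t n,
	enum sketch_clustering type, struct sketch_sg_entry *sg)
{
	size_t used = 0, i, j;

	for (i = 0; i < n; i++) {
		int merged = 0;

		if (type == SKETCH_TAIL_CLUSTERING && used > 0) {
			merged = sketch_merge_after(&sg[used - 1], pfns[i]);
		} else if (type == SKETCH_FULL_CLUSTERING) {
			for (j = 0; j < used && !merged; j++)
				merged = sketch_merge_after(&sg[j], pfns[i]) ||
					 sketch_merge_before(&sg[j], pfns[i]);
		}
		if (!merged) {
			sg[used].first_pfn = pfns[i];
			sg[used].nr_pages = 1;
			used++;
		}
	}
	return used;
}
```

For pages allocated in the order 10, 11, 20, 12 this model needs 4
entries without clustering, 3 with tail clustering and 2 with full
clustering, at the price of no longer preserving the allocation order in
the last case.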
|
|
|
|
<sect2> void sgv_pool_del()
|
|
|
|
<p>
|
|
<verb>
void sgv_pool_del(
	struct sgv_pool *pool)
</verb>
|
|
|
|
This function deletes the corresponding SGV cache. If the cache is
|
|
shared, it will decrease its reference counter. If the reference counter
|
|
reaches 0, the cache will be destroyed.
|
|
|
|
<sect2> void sgv_pool_flush()
|
|
|
|
<p>
|
|
<verb>
void sgv_pool_flush(
	struct sgv_pool *pool)
</verb>
|
|
|
|
This function flushes, i.e. frees, all the cached entries in the SGV
|
|
cache.
|
|
|
|
<sect2> void sgv_pool_set_allocator()
|
|
|
|
<p>
|
|
<verb>
void sgv_pool_set_allocator(
	struct sgv_pool *pool,
	struct page *(*alloc_pages_fn)(struct scatterlist *sg, gfp_t gfp, void *priv),
	void (*free_pages_fn)(struct scatterlist *sg, int sg_count, void *priv));
</verb>
|
|
|
|
This function sets a custom page allocator for the SGV cache. For
instance, scst_user uses such an allocator to supply the cache with pages
mapped from user space.
|
|
|
|
<bf/alloc_pages_fn()/ has the following parameters:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/sg/ - SG entry, to which the allocated page should be added.
|
|
|
|
<item> <bf/gfp/ - the allocation GFP flags
|
|
|
|
<item> <bf/priv/ - pointer to a private data supplied to sgv_pool_alloc()
|
|
|
|
</itemize>
|
|
|
|
This function should return the allocated page or NULL, if no page was
|
|
allocated.
|
|
|
|
|
|
<bf/free_pages_fn()/ has the following parameters:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/sg/ - SG vector to free
|
|
|
|
<item> <bf/sg_count/ - number of SG entries in the sg
|
|
|
|
<item> <bf/priv/ - pointer to a private data supplied to the
|
|
corresponding sgv_pool_alloc()
|
|
|
|
</itemize>
|
|
|
|
<sect2> struct scatterlist *sgv_pool_alloc()
|
|
|
|
<p>
|
|
<verb>
struct scatterlist *sgv_pool_alloc(
	struct sgv_pool *pool,
	unsigned int size,
	gfp_t gfp_mask,
	int flags,
	int *count,
	struct sgv_pool_obj **sgv,
	struct scst_mem_lim *mem_lim,
	void *priv)
</verb>
|
|
|
|
This function allocates an SG vector from the SGV cache. It has the
|
|
following parameters:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/pool/ - the cache to alloc from
|
|
|
|
<item> <bf/size/ - size of the resulting SG vector in bytes
|
|
|
|
<item> <bf/gfp_mask/ - the allocation mask
|
|
|
|
<item> <bf/flags/ - the allocation flags. The following flags are possible and
|
|
can be set using OR operation:
|
|
|
|
<enum>
|
|
|
|
<item> <bf/SGV_POOL_ALLOC_NO_CACHED/ - the SG vector must not be cached.
|
|
|
|
<item> <bf/SGV_POOL_NO_ALLOC_ON_CACHE_MISS/ - don't do an allocation on a
|
|
cache miss.
|
|
|
|
<item> <bf/SGV_POOL_RETURN_OBJ_ON_ALLOC_FAIL/ - return an empty SGV object,
|
|
i.e. without the SG vector, if the allocation can't be completed.
|
|
For instance, because SGV_POOL_NO_ALLOC_ON_CACHE_MISS flag set.
|
|
|
|
</enum>
|
|
|
|
<item> <bf/count/ - the resulting count of SG entries in the resulting SG vector.
|
|
|
|
<item> <bf/sgv/ - the resulting SGV object. It should be used to free the
|
|
resulting SG vector.
|
|
|
|
<item> <bf/mem_lim/ - memory limits, see below.
|
|
|
|
<item> <bf/priv/ - pointer to the private data for this allocation. This pointer will
be supplied to alloc_pages_fn() and free_pages_fn() and can be
retrieved by sgv_get_priv().
|
|
|
|
</itemize>
|
|
|
|
This function returns a pointer to the resulting SG vector or NULL in case
of any error.
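
The interaction of the three flags can be pictured with a small userspace model. This is only a sketch of one reading of the flag descriptions above, not the SCST implementation: the flag values, the enum and the function sgv_model_alloc() are assumptions made for illustration, and out-of-memory failures are ignored.

```c
#include <stdbool.h>

/* Flag values are assumptions for this sketch; real code takes them
 * from the SCST headers. */
#define SGV_POOL_ALLOC_NO_CACHED          1
#define SGV_POOL_NO_ALLOC_ON_CACHE_MISS   2
#define SGV_POOL_RETURN_OBJ_ON_ALLOC_FAIL 4

/* Hypothetical outcomes of an allocation request in this model. */
enum sgv_model_result {
	SGV_MODEL_FROM_CACHE,  /* SG vector served from the cache */
	SGV_MODEL_NEW_VECTOR,  /* SG vector newly allocated */
	SGV_MODEL_EMPTY_OBJ,   /* SGV object returned without an SG vector */
	SGV_MODEL_NULL         /* nothing returned */
};

/* Decide the outcome from the flags and from whether the cache has a
 * suitable vector. The cache is bypassed entirely for NO_CACHED. */
enum sgv_model_result sgv_model_alloc(int flags, bool cache_hit)
{
	if (flags & SGV_POOL_ALLOC_NO_CACHED)
		return SGV_MODEL_NEW_VECTOR;
	if (cache_hit)
		return SGV_MODEL_FROM_CACHE;
	if (!(flags & SGV_POOL_NO_ALLOC_ON_CACHE_MISS))
		return SGV_MODEL_NEW_VECTOR;
	if (flags & SGV_POOL_RETURN_OBJ_ON_ALLOC_FAIL)
		return SGV_MODEL_EMPTY_OBJ;
	return SGV_MODEL_NULL;
}
```

In this model a caller that passes both SGV_POOL_NO_ALLOC_ON_CACHE_MISS and SGV_POOL_RETURN_OBJ_ON_ALLOC_FAIL gets an empty SGV object on a cache miss and can fill it later.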
|
|
|
|
<sect2> void sgv_pool_free()
|
|
|
|
<p>
|
|
<verb>
|
|
void sgv_pool_free(
|
|
struct sgv_pool_obj *sgv,
|
|
struct scst_mem_lim *mem_lim)
|
|
</verb>
|
|
|
|
This function frees a previously allocated SG vector referenced by the SGV
cache object sgv.
|
|
|
|
<sect2> void *sgv_get_priv()
|
|
|
|
<p>
|
|
<verb>
|
|
void *sgv_get_priv(
|
|
struct sgv_pool_obj *sgv)
|
|
</verb>
|
|
|
|
This function returns the allocation private data for the SGV
cache object sgv. The private data is set by sgv_pool_alloc().
|
|
|
|
<sect2> void scst_init_mem_lim()
|
|
|
|
<p>
|
|
<verb>
|
|
void scst_init_mem_lim(
|
|
struct scst_mem_lim *mem_lim)
|
|
</verb>
|
|
|
|
This function initializes the memory limits structure mem_lim according to
the current system configuration. This structure should later be used
to track and limit the memory allocated by one or more SGV caches.
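
The idea of a shared memory limit can be sketched in userspace as follows. The structure, its field names and the helper functions are hypothetical stand-ins, not the SCST definitions, and the sketch is not thread-safe.

```c
#include <stdbool.h>

/* Userspace stand-in for a memory limits object; names are assumptions. */
struct mem_lim_model {
	long alloced_pages;      /* pages currently accounted */
	long max_allowed_pages;  /* ceiling fixed at initialization */
};

/* Counterpart of the initialization step: start empty with a fixed ceiling. */
void mem_lim_model_init(struct mem_lim_model *lim, long max_pages)
{
	lim->alloced_pages = 0;
	lim->max_allowed_pages = max_pages;
}

/* Account for an allocation; refuse it if the limit would be exceeded. */
bool mem_lim_model_charge(struct mem_lim_model *lim, long pages)
{
	if (lim->alloced_pages + pages > lim->max_allowed_pages)
		return false;
	lim->alloced_pages += pages;
	return true;
}

/* Give pages back when an SG vector is freed. */
void mem_lim_model_uncharge(struct mem_lim_model *lim, long pages)
{
	lim->alloced_pages -= pages;
}
```

Because the same object can be passed to allocations from several caches, the limit applies to their combined usage.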
|
|
|
|
|
|
<sect1> Runtime information and statistics
|
|
|
|
<p>
|
|
SGV cache runtime information and statistics are available in
|
|
<it>/proc/scsi_tgt/sgv</it>.
|
|
|
|
|
|
<sect> Target driver qla2x00t
|
|
|
|
<p>
|
|
Target driver qla2x00t allows QLogic 2xxx based adapters to be used in
target (server) mode.
|
|
|
|
It consists of two parts:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/qla2xxx/ - the patched initiator driver from the Linux kernel, which,
among other things, performs all the initialization and
shutdown tasks.
|
|
|
|
<item> <bf/qla2x00tgt/ - target mode add-on for the changed qla2xxx
|
|
|
|
</itemize>
|
|
|
|
The initiator driver qla2xxx was changed:
|
|
|
|
<itemize>
|
|
|
|
<item> To provide support for the target mode add-on via a set of
|
|
exported callbacks
|
|
|
|
<item> To provide extra info and management interface in the driver's
|
|
sysfs interface (attributes target_mode_enabled, ports_database, etc.)
|
|
|
|
<item> To fix some problems uncovered during target mode development and
|
|
usage.
|
|
|
|
</itemize>
|
|
|
|
The changes are relatively small (a patch of a few thousand lines) and local.
|
|
|
|
The changed qla2xxx can still work as an initiator only. A mode in which
a host acts as initiator and target simultaneously is supported as
well.
|
|
|
|
Since the firmware interface for 24xx+ chips is fundamentally different from
that of earlier versions, qla2x00t effectively contains two separate drivers sharing
some common processing.
|
|
|
|
<sect1> Driver initialization
|
|
|
|
<p>
|
|
On initialization, qla2x00tgt registers its SCST template tgt2x_template
with the SCST core. During template registration the SCST core calls the
detect() callback, which is the function q2t_target_detect().
|
|
|
|
In this function qla2x00tgt registers its callbacks in qla2xxx by
calling qla2xxx_tgt_register_driver(), which
stores a pointer to the callbacks being registered in the variable qla_target.
|
|
|
|
Then q2t_target_detect() calls qla2xxx_add_targets(), which calls the
qla_target.tgt_host_action() callback with the ADD_TARGET action for
each known local FC port (HBA instance). q2t_host_action() then calls
q2t_add_target(), which registers an SCST target for this FC port.
|
|
|
|
If a new FC port is hot-added later, qla2x00_probe_one() also calls
qla_target.tgt_host_action() with the ADD_TARGET action for every new
local port.
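
The registration flow described in this section follows a common callback-table pattern, which can be sketched in userspace as below. All names carrying the model_ prefix are hypothetical and only mirror the roles of the functions named above; integer port numbers stand in for HBA instances.

```c
#include <stddef.h>

enum model_action { MODEL_ADD_TARGET, MODEL_REMOVE_TARGET };

/* Callback table handed over by the target add-on. */
struct model_tgt_callbacks {
	void (*tgt_host_action)(int port, enum model_action action);
};

/* Initiator driver side: remembers the registered callbacks,
 * playing the role of the variable qla_target. */
static const struct model_tgt_callbacks *model_qla_target;

/* Add-on side: number of targets currently registered. */
static int model_registered_targets;

void model_tgt_register_driver(const struct model_tgt_callbacks *cbs)
{
	model_qla_target = cbs;
}

/* Called for a newly probed (for example hot-added) local port. */
void model_probe_one(int port)
{
	if (model_qla_target != NULL)
		model_qla_target->tgt_host_action(port, MODEL_ADD_TARGET);
}

/* Called once after registration for all ports known so far. */
void model_add_targets(const int *ports, size_t count)
{
	for (size_t i = 0; i < count; i++)
		model_probe_one(ports[i]);
}

/* Add-on side handler: registers or unregisters one target. */
void model_host_action(int port, enum model_action action)
{
	(void)port; /* a real handler would look up the HBA instance */
	if (action == MODEL_ADD_TARGET)
		model_registered_targets++;
	else if (model_registered_targets > 0)
		model_registered_targets--;
}

const struct model_tgt_callbacks model_addon_callbacks = {
	.tgt_host_action = model_host_action,
};
```

The initiator driver never calls into the add-on directly. It only goes through the stored table, which is why it keeps working as a plain initiator when no add-on is registered.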
|
|
|
|
|
|
<sect1> Driver unload
|
|
|
|
<p>
|
|
When a local FC port is being removed, the Linux kernel calls
qla2x00_remove_one(), which then calls qla_target.tgt_host_action() with
the REMOVE_TARGET action.
|
|
|
|
Then q2t_host_action() calls q2t_remove_target(), which unregisters the
corresponding SCST target from SCST. During unregistration the SCST core calls the
release() callback of tgt2x_template, which is q2t_target_release().
|
|
|
|
Then q2t_target_release() calls q2t_target_stop(), which
marks this target as stopped by setting the flag tgt_stop.
While this flag is set, all commands arriving from initiators are
refused.
|
|
|
|
Then q2t_target_stop() schedules deletion of all sessions of the target.
|
|
|
|
Then q2t_target_stop() waits until all outstanding commands have finished and
all sessions have been deleted.
|
|
|
|
Then q2t_target_stop(), if necessary, calls qla2x00_disable_tgt_mode(),
which disables target mode of the corresponding
HBA and resets it. qla2x00_disable_tgt_mode() then waits until the reset
has finished.
|
|
|
|
Then q2t_target_stop() returns and then q2t_target_release() frees the
|
|
target.
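
The stop sequence above boils down to "set a flag, refuse new work, wait for the old work to drain". A minimal single-threaded model of that ordering is given below. The structure and functions are hypothetical, session deletion is modelled as immediate, and no locking is shown, whereas a real driver must synchronize these steps.

```c
#include <stdbool.h>

/* Hypothetical model of the state that the stop sequence looks at. */
struct model_tgt {
	bool tgt_stop;         /* set once the target is being stopped */
	int outstanding_cmds;  /* commands still being processed */
	int sessions;          /* sessions not yet deleted */
};

/* Returns true if the command was accepted, false if it was refused. */
bool model_accept_cmd(struct model_tgt *tgt)
{
	if (tgt->tgt_stop)
		return false;
	tgt->outstanding_cmds++;
	return true;
}

/* A previously accepted command has completed. */
void model_cmd_done(struct model_tgt *tgt)
{
	if (tgt->outstanding_cmds > 0)
		tgt->outstanding_cmds--;
}

/* First steps of stopping: mark the target and drop its sessions. */
void model_target_stop_begin(struct model_tgt *tgt)
{
	tgt->tgt_stop = true;
	tgt->sessions = 0; /* deletion is immediate in this model */
}

/* The condition the stop routine waits for before it returns. */
bool model_target_quiesced(const struct model_tgt *tgt)
{
	return tgt->tgt_stop && tgt->outstanding_cmds == 0 &&
	       tgt->sessions == 0;
}
```

Setting the flag before waiting matters: otherwise new commands could keep arriving and the wait might never end.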
|
|
|
|
|
|
If module qla2x00tgt is being unloaded, q2t_exit() first takes
q2t_unreg_rwsem for writing. Taking it is necessary to make sure that
q2t_host_action() will not be active during qla2x00tgt unload.
|
|
|
|
Then q2t_exit() calls scst_unregister_target_template() for
|
|
tgt2x_template, which then in a loop will unregister all QLA SCST targets
|
|
from SCST as described above.
|
|
|
|
|
|
<sect1> Enabling target mode
|
|
|
|
<p>
|
|
When a command to enable target mode is received,
qla_target.tgt_host_action() is called with the action ENABLE_TARGET_MODE. Then
q2t_host_action() goes over all discovered remote ports of the target being
enabled and adds SCST sessions for all of them.
|
|
|
|
Then it calls qla2x00_enable_tgt_mode(), which enables target mode of
the corresponding HBA and resets it. qla2x00_enable_tgt_mode() then
waits until the reset has finished.
|
|
|
|
During the reset, the firmware initialization functions detect that target mode
is enabled and initialize the firmware accordingly.
|
|
|
|
|
|
<sect1> Disabling target mode
|
|
|
|
<p>
|
|
When a command to disable target mode is received,
qla_target.tgt_host_action() is called with the action DISABLE_TARGET_MODE. Then
q2t_host_action() calls q2t_target_stop(), which proceeds as described above.
|
|
|
|
|
|
<sect1> SCST sessions management
|
|
|
|
<p>
|
|
As required by SCSI and FC standards, each remote initiator FC port
has a corresponding SCST session.
|
|
|
|
Since qla2xxx is not intended to maintain a database of remote
initiator FC ports as strictly as target mode needs, qla2x00t uses a mixed
approach to SCST session management, in which both qla2xxx and the QLogic
firmware generate events and information about currently active remote
FC ports.
|
|
|
|
Remote FC port management also has to handle FC and loop IDs that change
after fabric events, so it needs to constantly monitor the FC and loop IDs
of the registered FC ports. This is implemented by a check in
q2t_create_sess() whether the FC port being registered already has an SCST session
and by q2t_check_fcport_exist() in q2t_del_sess_work_fn(). See below for
more info.
|
|
|
|
Interaction with qla2xxx is implemented using the tgt_fc_port_added() and
tgt_fc_port_deleted() callbacks of qla_target.
|
|
|
|
Callback tgt_fc_port_added() is called by qla2xxx when the target driver
detects a new remote FC port. The function q2t_fc_port_added() assigned to it checks if
an SCST session already exists for this remote FC port and, if not,
creates it.
|
|
|
|
Callback tgt_fc_port_deleted() is called by qla2xxx when it deletes a
remote FC port from its database. The function q2t_fc_port_deleted() assigned to it
checks if an SCST session exists for this remote FC port and, if
so, schedules it for deletion.
|
|
|
|
Driver qla2x00tgt has two types of SCST sessions: local and non-local.
Sessions created by q2t_fc_port_added() are non-local. Local sessions are
created when qla2x00tgt receives a command from a remote initiator for which
there is no known remote FC port and, hence, no SCST session. Local sessions
are created in tgt->sess_work (q2t_sess_work_fn()) by calling
q2t_make_local_sess(). All commands received from remote initiators for
local sessions are delayed until the sessions are created.
|
|
|
|
To minimize the effect of FC fabric events on initiators, qla2x00tgt doesn't
immediately delete SCST sessions scheduled for deletion, but instead
delays their deletion for some time. If during this time a command from an unknown
remote initiator is received, q2t_make_local_sess()/q2t_create_sess()
first check if a session for this initiator already exists and, if so,
undelete it and then reuse it after updating its s_id and loop_id to the new values.
|
|
|
|
If a session is not reused during the deletion delay, then
q2t_del_sess_work_fn() asks the firmware's internal database whether it knows
the corresponding remote FC port. If so, the session is undeleted
and its s_id and loop_id are updated to the new values. If not, the session is
deleted.
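
The delayed deletion scheme can be summarized by the following userspace sketch. It is a simplified model, not the driver's code: the structure, the functions and the use of a port name as the key that identifies an initiator are assumptions made for illustration, and the firmware database lookup is reduced to a boolean argument.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a session that can be scheduled for deletion. */
struct model_sess {
	uint64_t port_name;  /* identity of the remote initiator port */
	uint32_t s_id;       /* FC ID, may change after fabric events */
	uint16_t loop_id;    /* loop ID, may change after fabric events */
	bool deleted;        /* scheduled for deletion, delay running */
	bool freed;          /* finally deleted */
};

void model_sess_schedule_deletion(struct model_sess *s)
{
	s->deleted = true;
}

/* A command arrived during the delay: undelete and refresh the IDs. */
bool model_sess_reuse(struct model_sess *s, uint64_t port_name,
		      uint32_t s_id, uint16_t loop_id)
{
	if (s->freed || s->port_name != port_name)
		return false;
	s->deleted = false;
	s->s_id = s_id;
	s->loop_id = loop_id;
	return true;
}

/* The deletion delay expired; fw_knows_port stands for the result of
 * asking the firmware database about the remote port. */
void model_sess_delay_expired(struct model_sess *s, bool fw_knows_port,
			      uint32_t fw_s_id, uint16_t fw_loop_id)
{
	if (!s->deleted)
		return; /* the session was reused in the meantime */
	if (fw_knows_port) {
		s->deleted = false;
		s->s_id = fw_s_id;
		s->loop_id = fw_loop_id;
	} else {
		s->freed = true;
	}
}
```

The point of the delay is that a short fabric disturbance changes only the IDs of a session, while the session itself, and whatever state SCST keeps for it, survives.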
|
|
|
|
|
|
<sect1> Handling stuck commands
|
|
|
|
<p>
|
|
Driver qla2x00tgt defines in tgt2x_template the callback
on_hw_pending_cmd_timeout for handling stuck commands, implemented by the
q2t_on_hw_pending_cmd_timeout() function, with the max_hw_pending_time
timeout set to Q2T_MAX_HW_PENDING_TIME (60 seconds). If the firmware
doesn't return a reply for one or more IOCBs of the corresponding SCST
command within this time, the SCST core calls this callback.
|
|
|
|
In this callback all the stuck commands are forcibly finished.
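
As an illustration of the timeout idea only, the sketch below scans a set of commands and forcibly finishes those that have been pending in the hardware for too long. The structure, the function and the time representation (plain seconds) are hypothetical; the real callback works on SCST commands and driver state.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_HW_PENDING_TIME 60 /* seconds, as quoted above */

/* Hypothetical record of a command handed to the hardware. */
struct model_hw_cmd {
	long sent_to_hw_at; /* time the command was passed to the firmware */
	bool finished;      /* reply received or forcibly finished */
};

/* Forcibly finish every command pending longer than the limit and
 * return how many commands were finished this way. */
size_t model_finish_stuck_cmds(struct model_hw_cmd *cmds, size_t count,
			       long now)
{
	size_t forced = 0;

	for (size_t i = 0; i < count; i++) {
		if (cmds[i].finished)
			continue;
		if (now - cmds[i].sent_to_hw_at > MODEL_MAX_HW_PENDING_TIME) {
			cmds[i].finished = true;
			forced++;
		}
	}
	return forced;
}
```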
|
|
|
|
<appendix>
|
|
|
|
<sect> Debugging and troubleshooting
|
|
|
|
<p>
|
|
SCST core and its drivers provide extensive debugging and logging
facilities suitable for catching and analyzing problems of virtually any level
of complexity.
|
|
|
|
Depending on the amount of debugging and logging facilities available, there
are 3 types of builds:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/release/ - has a basic amount of logging, suitable for basic
tracing. Extra checking is disabled in this mode. This is the default
mode.
|
|
|
|
<item> <bf/debug/ - has the full amount of logging and extrachecks enabled.
The binary code is slower and much bigger, but suitable for advanced
tracing and debugging. Also, more logging is enabled by
default in this mode.
|
|
|
|
<item> <bf/perf/ - has all logging and extrachecks disabled. Intended for
performance measurements, including measurements of the overhead introduced
by the logging and extrachecks facilities.
|
|
|
|
</itemize>
|
|
|
|
Switching between build modes is done by running "make x2y", where "x" is the
current build mode and "y" is the desired build mode. For instance, to switch
from release to debug mode you should run "make release2debug".
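
The naming rule for the make targets is simple enough to express as a helper. The function below is a hypothetical convenience for scripts or tools, not part of SCST; it only composes the target name from the three build modes listed above.

```c
#include <stdio.h>
#include <string.h>

/* Returns nonzero if mode is one of the three build modes named above. */
static int model_is_build_mode(const char *mode)
{
	return strcmp(mode, "release") == 0 ||
	       strcmp(mode, "debug") == 0 ||
	       strcmp(mode, "perf") == 0;
}

/* Compose the "x2y" make target for switching from current to desired.
 * Returns 0 on success, -1 on an unknown mode, identical modes or a
 * buffer that is too small. */
int model_switch_target(char *buf, size_t len, const char *current,
			const char *desired)
{
	int n;

	if (!model_is_build_mode(current) || !model_is_build_mode(desired) ||
	    strcmp(current, desired) == 0)
		return -1;
	n = snprintf(buf, len, "%s2%s", current, desired);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For the pair ("release", "debug") the buffer receives release2debug, the target quoted above.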
|
|
|
|
<sect1> Logging levels management
|
|
|
|
<p>
|
|
Logging levels are managed using the "trace_level" file located in the
driver's proc interface subdirectory. Each SCST driver has it, except in
the perf build mode. For instance, for SCST core it's located in
/proc/scsi_tgt/. For qla2x00t it's located in /proc/scsi_tgt/qla2x00tgt/.
|
|
|
|
Reading this file shows the currently enabled logging levels.

You can change them by writing to this file, for example:
|
|
|
|
<verb>
# echo "add scsi" >/proc/scsi_tgt/trace_level
</verb>
|
|
|
|
The following commands are available:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/add trace_level/ - adds (enables) the corresponding trace level
|
|
|
|
<item> <bf/del trace_level/ - deletes (disables) the corresponding trace level
|
|
|
|
<item> <bf/set mask/ - sets all trace levels at once using a mask, e.g.
0x1538
|
|
|
|
<item> <bf/all/ - enables all trace levels
|
|
|
|
<item> <bf/none/ - disables all trace levels
|
|
|
|
<item> <bf/default/ - sets all trace levels to the default value
|
|
|
|
<item> <bf/dump_prs dev_name/ - dumps Persistent Reservations states for
|
|
device "dev_name"
|
|
|
|
</itemize>
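
From a program, issuing one of these commands amounts to writing a line of text to the trace_level file. The helper below is a minimal sketch with a hypothetical name. It takes the file path as a parameter, so the caller decides which driver's trace_level file is addressed, and it appends a newline the way echo does.

```c
#include <stdio.h>

/* Write one command line (for example "add scsi") to the given
 * trace_level file. Returns 0 on success and -1 on any I/O error. */
int model_write_trace_cmd(const char *path, const char *cmd)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (f == NULL)
		return -1;
	ok = fprintf(f, "%s\n", cmd) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Writing to files under /proc/scsi_tgt normally requires root privileges, so a failure return should be expected when the caller lacks them or SCST is not loaded.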
|
|
|
|
The following trace levels are common for all drivers:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/function/ - enables printing the corresponding function names
for each logged message
|
|
|
|
<item> <bf/line/ - enables printing the corresponding source line numbers
for each logged message
|
|
|
|
<item> <bf/pid/ - enables printing PIDs of the corresponding processes
|
|
or threads for each logged message
|
|
|
|
<item> <bf/scsi/ - enables logging of processed SCSI commands and their
|
|
processing results
|
|
|
|
<item> <bf/mgmt/ - enables logging of processed Task Management functions
|
|
|
|
<item> <bf/minor/ - enables logging of minor events, like unknown SCSI
commands or differences between buffer lengths encoded in CDBs and
expected transfer values
|
|
|
|
<item> <bf/out_of_mem/ - enables logging of out of memory events
|
|
|
|
<item> <bf/entryexit/ - enables logging of functions entry and exit. Not
|
|
available in the release build.
|
|
|
|
<item> <bf/mem/ - enables logging of memory allocation and freeing. Not
|
|
available in the release build.
|
|
|
|
<item> <bf/debug/ - enables various debug logging messages. Not
|
|
available in the release build.
|
|
|
|
<item> <bf/buff/ - enables logging of the contents of various buffers. Not
|
|
available in the release build.
|
|
|
|
<item> <bf/sg/ - enables logging of SG vector manipulations. Not
|
|
available in the release build.
|
|
|
|
<item> <bf/mgmt_dbg/ - enables debug logging of Task Management
|
|
functions processing. Not available in the release build.
|
|
|
|
<item> <bf/special/ - enables logging of "special" events. Intended to
temporarily enable logging of some debug messages without enabling the
whole "debug" level. Not available in the release build.
|
|
|
|
</itemize>
|
|
|
|
The following trace levels are additionally available for SCST core:
|
|
|
|
<itemize>
|
|
|
|
<item> <bf/scsi_serializing/ - enables logging of SCSI command task
attribute processing (SIMPLE, ORDERED, etc.). Not available in the
release build.
|
|
|
|
<item> <bf/retry/ - enables logging of retries of the rdy_to_xfer() and
xmit_response() target driver callbacks. Not available in the release
build.
|
|
|
|
<item> <bf/recv_bot/, <bf/send_bot/, <bf/recv_top/, <bf/send_top/ -
|
|
enable logging of command buffers at various processing stages. Not
available in the release build.
|
|
|
|
</itemize>
|
|
|
|
<sect1> Preparing a debug kernel
|
|
|
|
<p>
|
|
SCST can produce a huge amount of log output, which the default kernel
configuration can't cope with, so the kernel needs some extra adjustments.
|
|
|
|
For that you should change, in lib/Kconfig.debug or init/Kconfig
depending on your kernel version, the range of LOG_BUF_SHIFT from "12 21" to "12 25".
|
|
|
|
Then you should set CONFIG_LOG_BUF_SHIFT to 25 in your .config.
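
The kernel sizes its log buffer as 2 to the power of CONFIG_LOG_BUF_SHIFT bytes, so a value of 25 gives a 32 MiB buffer, while 21 gives only 2 MiB. The helper names below are hypothetical and serve only to make this arithmetic and the range check explicit.

```c
#include <stdbool.h>

/* Size in bytes of a log buffer configured with the given shift. */
unsigned long model_log_buf_bytes(unsigned int shift)
{
	return 1UL << shift;
}

/* Check a shift value against a Kconfig-style "min max" range. */
bool model_shift_allowed(unsigned int shift, unsigned int min,
			 unsigned int max)
{
	return shift >= min && shift <= max;
}
```

With a range of 12 to 21 the value 25 is rejected, which is why the range has to be widened before the value can be set.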
|
|
|
|
Also, the Linux kernel has a lot of helpful debug facilities, like lockdep,
which allows catching various deadlocks, or memory allocation debugging.
It is recommended to enable them during SCST debugging.
|
|
|
|
The following options are recommended to be enabled (availability depends
on your kernel version): CONFIG_SLUB_DEBUG, CONFIG_PRINTK_TIME,
|
|
CONFIG_MAGIC_SYSRQ, CONFIG_DEBUG_FS, CONFIG_DEBUG_KERNEL,
|
|
CONFIG_DEBUG_SHIRQ, CONFIG_DETECT_SOFTLOCKUP, CONFIG_DETECT_HUNG_TASK,
|
|
CONFIG_SLUB_DEBUG_ON, CONFIG_SLUB_STATS, CONFIG_DEBUG_PREEMPT,
|
|
CONFIG_DEBUG_RT_MUTEXES, CONFIG_DEBUG_PI_LIST, CONFIG_DEBUG_SPINLOCK,
|
|
CONFIG_DEBUG_MUTEXES, CONFIG_DEBUG_LOCK_ALLOC, CONFIG_PROVE_LOCKING,
|
|
CONFIG_LOCKDEP, CONFIG_LOCK_STAT, CONFIG_DEBUG_SPINLOCK_SLEEP,
|
|
CONFIG_STACKTRACE, CONFIG_DEBUG_BUGVERBOSE, CONFIG_DEBUG_VM,
|
|
CONFIG_DEBUG_VIRTUAL, CONFIG_DEBUG_WRITECOUNT, CONFIG_DEBUG_MEMORY_INIT,
|
|
CONFIG_DEBUG_LIST, CONFIG_DEBUG_SG, CONFIG_DEBUG_NOTIFIERS,
|
|
CONFIG_FRAME_POINTER, CONFIG_FAULT_INJECTION, CONFIG_FAILSLAB,
|
|
CONFIG_FAIL_PAGE_ALLOC, CONFIG_FAIL_MAKE_REQUEST,
|
|
CONFIG_FAIL_IO_TIMEOUT, CONFIG_FAULT_INJECTION_DEBUG_FS,
|
|
CONFIG_FAULT_INJECTION_STACKTRACE_FILTER.
|
|
|
|
<sect1> Preparing logging subsystem
|
|
|
|
<p>
|
|
It is recommended that the system logger daemon on the target be configured:
|
|
|
|
<itemize>
|
|
|
|
<item> To store kernel logs in separate files on the fastest disk you
have. It is better if this disk is dedicated to logging or, at
least, doesn't contain your LUNs' data.
|
|
|
|
<item> To write the kernel logs to the disk in an asynchronous manner, i.e.
without calling fsync() after each written message. Usually, you can
achieve this by adding a '-' sign before the corresponding file path in
your syslog daemon's configuration file, like:
|
|
|
|
<verb>
kern.* -/var/log/kern.log
</verb>
|
|
|
|
</itemize>
|
|
|
|
<sect1> Decoding OOPS messages
|
|
|
|
<p>
|
|
You can decode an OOPS message to the corresponding line in a C file
using the gdb "l" command. For example, suppose an OOPS message has the line:
|
|
|
|
<verb>
|
|
[<ffffffff88646174>&rsqb :iscsi_scst:iscsi_extracheck_is_rd_thread+0x94/0xb0
|
|
</verb>
|
|
|
|
You can decode it by:
|
|
|
|
<verb>
|
|
$ gdb iscsi-scst.ko
|
|
(gdb) l *iscsi_extracheck_is_rd_thread+0x94
|
|
</verb>
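
When many frames have to be decoded, the gdb command can be derived from the OOPS frame text mechanically. The function below is a hypothetical helper, not part of SCST. It assumes a frame of the form shown above, with an optional module prefix delimited by colons and a trailing function size after the slash, and it assumes that gdb, having loaded the module file, resolves the bare symbol name.

```c
#include <stdio.h>
#include <string.h>

/* Turn ":module:symbol+0xOFF/0xSIZE" (or "symbol+0xOFF/0xSIZE") into
 * the gdb command "l *symbol+0xOFF". Returns 0 on success, -1 if the
 * frame has no symbol+offset part or the output buffer is too small. */
int model_oops_frame_to_gdb(const char *frame, char *out, size_t len)
{
	const char *sym = strrchr(frame, ':');
	const char *end;
	int written;

	/* Skip the module prefix, if there is one. */
	sym = (sym != NULL) ? sym + 1 : frame;

	/* Cut off the function size that follows the slash. */
	end = strchr(sym, '/');
	if (end == NULL)
		end = sym + strlen(sym);

	if (end == sym || memchr(sym, '+', (size_t)(end - sym)) == NULL)
		return -1;

	written = snprintf(out, len, "l *%.*s", (int)(end - sym), sym);
	return (written < 0 || (size_t)written >= len) ? -1 : 0;
}
```

The module name that the helper strips is still useful to the person decoding the frame, because it tells which .ko file to open in gdb.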
|
|
|
|
For that the corresponding module (iscsi-scst.ko) should be built with
debug info. But modules do not always have debug info built in. To
work around this you can add the "-g" flag in the corresponding Makefile
(without changing anything else!) or enable building the kernel with debug
info in .config using "make menuconfig". Then rebuild only the .o
file you need.
|
|
|
|
For instance, to decode an OOPS in mm/filemap.c in the kernel you need to
enable building the kernel with debug info in .config and then run:
|
|
|
|
<verb>
|
|
$ make mm/filemap.o
|
|
...
|
|
$ gdb mm/filemap.o
|
|
</verb>
|
|
|
|
</article>
|