@@ -54,9 +54,9 @@ being actively used, nor going to implement it in the future.</p>
 
 <p> MC/S is done on the iSCSI level, while MPIO is done on the higher
 level. Hence, all MPIO infrastructure is shared among all SCSI
-transports, including Fibre Channel and SAS. </p>
+transports, including Fibre Channel, SAS, etc. </p>
 
-<p> MC/S was designed at time, when most OS'es didn't have OS level
+<p> MC/S was designed at time, when most OS'es didn't have standard OS level
 multipath. Instead, each vendor had its own implementation, which
 created huge interoperability problems. So, one of the goals of MC/S was
 to address this issue and standardize the multipath area. But
@@ -85,9 +85,9 @@ Consequently, all reservations and other SCSI states as well as other
 initiators connected to the device remain unaffected.</p>
 
 <p>For MPIO failover recovery is much more complicated. This is because
-it involves transferring outstanding commands and SCSI states from one
+it involves transfer of all outstanding commands and SCSI states from one
 I_T Nexus to another. The first thing, which initiator will do for
-failover recovery is to abort all outstanding commands on the faulted
+that is to abort all outstanding commands on the faulted
 I_T Nexus. There are 2 approaches for that: CLEAR TASK SET and LUN RESET
 task management functions. </p>
 
@@ -105,7 +105,7 @@ RESET resets all SCSI settings for all connected initiators to the
 initial state and, if device had reservation from any initiator, it will
 be cleared.
 
-<p>But the harm will be minimal:</p>
+<p>But the harm is minimal:</p>
 
 <ul>
 <li><span> With TAS bit set on Control Mode page, all the aborted commands will
@@ -131,12 +131,12 @@ retry all the aborted commands. On a properly configured system it
 should be less than few seconds, which is well acceptable on practice.
 If Linux storage stack improved to allow to abort all submitted to it
 commands (currently only wait for their completion is possible), then
-time to abort all commands can be decreased to a fraction of second. </p>
+time to abort all the commands can be decreased to a fraction of second. </p>
 
 <h2>Performance</h2>
 
 <p>At first, neither MC/S, nor MPIO can improve performance if there is
-only one SCSI command sent to target at time. For instance, for tape
+only one SCSI command sent to target at time. For instance, in case of tape
 backup and restore. Both MC/S and MPIO work on the commands level, so
 can't split data transfers for a single command over several links. Only
 bonding can improve performance in this case, because it works on the
@@ -149,9 +149,9 @@ link was submitted earlier. Delays in links processing can change
 commands order in the place where target receives them.</p>
 
 <p>Since initiators usually send commands in the optimal for performance
-order, reordering can hurt performance. But this can happen only with
-naive implementation, which can't recover the optimal commands execution
-order. Currently Linux is not naive and quite good on it. See, for
+order, reordering can somehow hurt performance. But this can happen only with
+naive target implementation, which can't recover the optimal commands execution
+order. Currently Linux is not naive and quite good on this area. See, for
 instance, section "SEQUENTIAL ACCESS OVER MPIO" in <a
 href="vl_res.txt">those measurements</a>. Don't look at the absolute
 numbers, look at %% of performance improvement using the second link.
@@ -161,11 +161,11 @@ possible maximum.</p>
 <p>If free commands reorder is forbidden for a device, either
 by use of ORDERED tag, or if the Queue Algorithm Modifier in the Control
 Mode Page is set to 0, then MPIO will have to maintain commands order by
-sending them over only a single link. But on practice this case is
+sending commands over only a single link. But on practice this case is
 really rare and 99.(9)% of OS'es and applications allow free commands
 reorder and it is enabled by default.</p>
 
-<p>However strictly preserving commands order as MC/S does has a
+<p>From other side, strictly preserving commands order as MC/S does has a
 downside as well. It can lead to so called "commands ordering
 bottleneck", when newer commands have to wait before one or more older
 commands get executed, although it would be better for performance to
@@ -201,14 +201,13 @@ for instance,
 task? Better to spend this effort on improving MPIO. Simply, MC/S is
 done on the wrong level. No surprise then that no Open Source OS'es
 neither support, nor going to implement it. Moreover, when back to 2005
-there was an attempt to add MC/S in Linux, it was rejected. See for more details:
+there was an attempt to add MC/S in Linux, it was rejected. See for more details
 <a href="http://article.gmane.org/gmane.linux.scsi/15769">http://article.gmane.org/gmane.linux.scsi/15769</a>
 and <a href="http://article.gmane.org/gmane.linux.scsi/16301">http://article.gmane.org/gmane.linux.scsi/16301</a>.
 </p>
 
 <p>But for sake of completeness, there are marginal cases, where MPIO
-can't be used or will not provide any benefit, but MC/S can provide
-benefits:</p>
+can't be used or will not provide any benefit, but MC/S can be successful:</p>
 
 <ol>
 <li><span>When strict commands order is required.</span></li>
@@ -233,13 +232,13 @@ and backup applications one or both can be true. But on practice:</p>
 limitation of legacy tape drives, which support only implicit
 address commands, not of MPIO. Modern tape drives and backup
 applications can use explicit address commands, which you can
-abort and then retry, hence compatible with MPIO.</span></li>
+abort and then retry, hence they are compatible with MPIO.</span></li>
 
 </ul>
 
-<p>If in future SCSI standards have possibility to group several I_T nexuses
+<p>If in future SCSI standards gain possibility to group several I_T nexuses
 with possibilities to reassign commands between them and preserve commands
-order among them, the above the only advantages of MC/S over MPIO will be
+order among them, the above minor advantages of MC/S over MPIO will be
 removed and, hence, all investments in it will be voided.</p>
 
 </div>