Commit Graph

887 Commits

Author SHA1 Message Date
Avi Kivity
14c1b17105 storage_service: fix construct_range_to_endpoint_map with semi-infinite range
After the conversion to nonwrapping ranges, construct_range_to_endpoint_map()
may be called with semi-infinite token ranges, but it does not expect this,
calling nonwrapping_range::end()->value() unconditionally.

Fix by checking whether this is a semi-infinite range on the right, and
replace ->value() by maximum_token() instead.

Fixes `nodetool describering` (once more).
Message-Id: <1478983010-29630-1-git-send-email-avi@scylladb.com>
2016-11-14 11:39:48 +01:00
Avi Kivity
2670e46f3e storage_service: deinline most methods
Most inline methods in storage_service are too large to be inlined, and
just increase compile time.  De-inline them.
2016-11-12 21:12:28 +02:00
Gleb Natapov
93f068bd44 storage_proxy: fix speculation target selection logic
Current speculation target selection logic has several bugs in multi-dc
setup. It may select a non local target for CL=LOCAL and it may select
more than one target to speculate, one of which is non local.

Examples:

1. Two dataceneters: DC1 RF 2, DC2 RF 2  and read with LOCAL_QUORUM.

In this scenario db::filter_for_query() will return both replicas from
local DC and speculation target selection logic will peek one one which
will be in different DC.

2. Two dataceneters: DC1 RF 2, DC2 RF 2  and read with LOCAL_ONE + RRD.DC_LOCAL

In this scenario db::filter_for_query() will return all nodes in local DC and
there already be enough nodes to speculate, but current logic will add
one node from non local dc as a speculation target.

The patch below fixed both of those scenarios.

Message-Id: <20161103154637.GS7766@scylladb.com>
2016-11-08 18:32:47 +01:00
Avi Kivity
767cfb4fe9 storage_service: fix range wrapping in describe_ring even more
Commit 8fca1887c2 ("storage_service: fix range wrapping in
describe_ring") fixed incorrect range wrapping code for describe_ring,
but fails when the number of endpoints for a token is greater than one,
because the endpoints are stored in an unordered vector.

Fix by comparing the endpoints in a way that ignores their order.
Message-Id: <1478460826-15923-1-git-send-email-avi@scylladb.com>
2016-11-07 16:18:20 +01:00
Avi Kivity
8fca1887c2 storage_service: fix range wrapping in describe_ring
describe_ring() tries to re-wrap the ranges, but fails because the ranges
are not sorted.  Adjust the code not to rely on sorting.
Message-Id: <1478198630-27483-1-git-send-email-avi@scylladb.com>
2016-11-04 10:48:14 +00:00
Avi Kivity
b3299d5bc3 storage_proxy: simplify range queries
Instead of asking a shard for cmd->partition_limit and cmd->row_limit,
just ask it for the number of partitions and rows still needed to
satisfy the query.  This removes the need to trim the shard's result.
2016-11-03 19:10:20 +02:00
Avi Kivity
a668e575f6 storage_proxy: execute multi-partition query sequentially over shards
Since every shard might cause the row_limit quota to be satisfied, every
shard might be the last one we need.  Hence it is better to process shards
sequentially, stopping if the quota is reached or the range is exhausted.

The original code tried to yield to reduce latency, but this is now
unnecessary, as we're doing a lot less work per iteration (if it becomes
necessary, we should do it on the replica shard, not the coordinating shard).
2016-11-03 19:10:20 +02:00
Avi Kivity
a35136533d Convert ring_position and token ranges to be nonwrapping
Wrapping ranges are a pain, so we are moving wrap handling to the edges.

Since cql can't generate wrapping ranges, this means thrift and the ring
maintenance code; also range->ring transformations need to merge the first
and last ranges.

Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>
2016-11-02 21:04:11 +02:00
Vlad Zolotarov
f75a350a8f service::storage_proxy: use global_trace_state_ptr when using invoke_on
When trace_state may migrate to a different shard a global_trace_state_ptr
has to be used.

This patch completes the patch below:

commit 7e180c7bd3
Author: Vlad Zolotarov <vladz@cloudius-systems.com>
Date:   Tue Sep 20 19:09:27 2016 +0300

    tracing: introduce the tracing::global_trace_state_ptr class

Fixes #1770

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1476993537-27388-1-git-send-email-vladz@cloudius-systems.com>
2016-10-21 11:34:13 +03:00
Avi Kivity
63f053e9b7 storage_proxy: fix mutation reordering with wrapping ranges
If we have a range query involving a wrapping range (i.e., from thrift),
and mutations from both halves of the result are involved, then
we will return the results in the wrong order (and potentially the wrong
partitions) since we order by token, so the results from the second half
of the wrapping range end up before the first.

Fix by splitting the two queries, and merging the second half with lower
priority compared to the first half.

Note: this will be fixed in a better way once we have the sharding iterator,
as then we can query sequentially.

Fixes #1761.
Message-Id: <1476262693-30162-1-git-send-email-avi@scylladb.com>
2016-10-12 15:59:16 +02:00
Duarte Nunes
01ab2081cd storage_service: Implement get_splits() function
This patch implements the get_splits() function in storage_service,
used to split a particular token range in slices of approximately the
specified size, using the sample keys and estimates of the CF's
sstables.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-10-10 22:32:08 +02:00
Raphael S. Carvalho
9c59ccc52a storage_service: improve log message for refresh
'No new SSTables were found for keyspace1.standard1' was printed
if user uploaded new sstables to upload dir instead, and that is
confusing. We should instead print that if new sstables weren't
found in both cf and cf/upload dirs.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <90386f6255407697434213227ae7ff0de7464f99.1475535203.git.raphaelsc@scylladb.com>
2016-10-06 18:26:32 +03:00
Avi Kivity
c94fb1bf12 build: reduce inclusions of messaging_service.hh
Remove inclusions from header files (primary offender is fb_utilities.hh)
and introduce new messaging_service_fwd.hh to reduce rebuilds when the
messaging service changes.

Message-Id: <1475584615-22836-1-git-send-email-avi@scylladb.com>
2016-10-05 11:46:49 +03:00
Paweł Dziepak
7599ef6fde query_pager: fix splitting range at the end bound
Currently, the code responsible for calculating ranges for the next
request could produce a wrap-around partition range. For example, if the
original range was (unimportant, A] and the last partition key A then
the output range would be (A, A].

This patch adds checks to make sure that in such cases the range is
removed.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1475497244-2790-1-git-send-email-pdziepak@scylladb.com>
2016-10-03 19:33:42 +02:00
Vlad Zolotarov
7e180c7bd3 tracing: introduce the tracing::global_trace_state_ptr class
This object, similarly to a global_schema_ptr, allows to dynamically
create the trace_state_ptr objects on different shards in a context
of the original tracing session.

This object would create a secondary tracing session object from the
original trace_state_ptr object when a trace_state_ptr object is needed
on a "remote" shard, similarly to what we do when we need it on a remote
Node.

Fixes #1678
Fixes #1647

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1474387767-21910-1-git-send-email-vladz@cloudius-systems.com>
2016-10-02 11:31:37 +03:00
Paweł Dziepak
eb1fcf3ecc query_pagers: fix clustering key range calculation
Paging code assumes that clustering row range [a, a] contains only one
row which may not be true. Another problem is that it tries to use
range<> interface for dealing with clustering key ranges which doesn't
work because of the lack of correct comparator.

Refs #1446.
Fixes #1684.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1475236805-16223-1-git-send-email-pdziepak@scylladb.com>
2016-09-30 17:32:59 +02:00
Duarte Nunes
a36888f3cb storage_service: Convert token through partitioner
This patch ensures we use the partitioner to convert a token to
sstring instead of casting.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1475179683-28552-1-git-send-email-duarte@scylladb.com>
2016-09-30 10:54:26 +02:00
Asias He
f377a3b7ac streaming: Fail streaming sessions during shutdown
Fixes repair_additional_test.py:RepairAdditionalTest.repair_kill_3_test

The test does:

- Insert data on node1 only
- Insert data on node2 only
- Run repair on node1 and stop node1
  once "starting user-requested repair" is seen

The repair shutdown code may wait for the stream session to complete for
a very long time if node 1 finishes sending data to node2 and is waiting
for node2 to send data to it, when node1 is stopped. The stream session
will not be closed in this case until stream session _keep_alive_timeout
(10 minutes) expires. Instead of waiting for the stream_session keep
alive timer to expire, we can fail all the stream sessions during
shutdown.

Before 1 - The bad case (repair shutdown will last for 10 minutes):

  INFO  2016-09-21 16:23:56,617 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Executing streaming plan for repair-in
  INFO  2016-09-21 16:23:56,617 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Starting streaming to 127.0.0.2
  INFO  2016-09-21 16:23:56,617 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Beginning stream session with 127.0.0.2
  INFO  2016-09-21 16:23:56,618 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Prepare completed with 127.0.0.2. Receiving 1, sending 0
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - Stop transport: stop_gossiping done
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - Thrift server stopped
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - CQL server stopped
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - messaging_service stopped
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Stop transport: shutdown messaging_service done
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Stop transport: auth shutdown
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Stop transport: done
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Drain on shutdown: stop_transport done
  INFO  2016-09-21 16:23:58,626 [shard 0] tracing - Asked to shut down
  INFO  2016-09-21 16:23:58,626 [shard 0] tracing - Tracing is down
  INFO  2016-09-21 16:23:58,626 [shard 1] tracing - Asked to shut down
  INFO  2016-09-21 16:23:58,626 [shard 1] tracing - Tracing is down
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Drain on shutdown: tracing is stopped
  INFO  2016-09-21 16:23:58,669 [shard 0] storage_service - Drain on shutdown: flush column_families done
  INFO  2016-09-21 16:23:58,669 [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
  INFO  2016-09-21 16:23:58,669 [shard 0] storage_service - Drain on shutdown: done
  INFO  2016-09-21 16:23:58,669 [shard 0] repair - Starting shutdown of repair
  INFO  2016-09-21 16:25:56,624 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] The session 0x600021516c00 made no progress with peer 127.0.0.2

Before 2 - The good case:

  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Executing streaming plan for repair-in
  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Starting streaming to 127.0.0.2
  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Beginning stream session with 127.0.0.2
  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Prepare completed with 127.0.0.2. Receiving 1, sending 0
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - Stop transport: stop_gossiping done
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - Thrift server stopped
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - CQL server stopped
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
  INFO  2016-09-21 16:18:34,155 [shard 0] messaging_service - Retry verb=19 to 127.0.0.2:0, retry=10: rpc::closed_error (connection is closed)
  WARN  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] COMPLETE_MESSAGE for 127.0.0.2 has failed: rpc::closed_error (connection is closed)
  WARN  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Streaming error occurred
  INFO  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Session with 127.0.0.2 is complete, state=FAILED
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - messaging_service stopped
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Stop transport: shutdown messaging_service done
  INFO  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] bytes_sent = 0, bytes_received = 245000
  WARN  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Stream failed, peers={127.0.0.2}
  WARN  2016-09-21 16:18:34,155 [shard 0] repair - repair's stream failed: streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:18:34,155 [shard 0] repair - repair 1 failed - streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Stop transport: auth shutdown
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Stop transport: done
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Drain on shutdown: stop_transport done
  INFO  2016-09-21 16:18:34,155 [shard 0] tracing - Asked to shut down
  INFO  2016-09-21 16:18:34,155 [shard 0] tracing - Tracing is down
  INFO  2016-09-21 16:18:34,156 [shard 1] tracing - Asked to shut down
  INFO  2016-09-21 16:18:34,156 [shard 1] tracing - Tracing is down
  INFO  2016-09-21 16:18:34,156 [shard 0] storage_service - Drain on shutdown: tracing is stopped
  INFO  2016-09-21 16:18:34,199 [shard 0] storage_service - Drain on shutdown: flush column_families done
  INFO  2016-09-21 16:18:34,199 [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
  INFO  2016-09-21 16:18:34,199 [shard 0] storage_service - Drain on shutdown: done
  INFO  2016-09-21 16:18:34,199 [shard 0] repair - Starting shutdown of repair
  INFO  2016-09-21 16:18:34,199 [shard 0] repair - Completed shutdown of repair
  INFO  2016-09-21 16:18:34,199 [shard 0] compaction_manager - Asked to stop
  INFO  2016-09-21 16:18:34,199 [shard 1] compaction_manager - Asked to stop

After:

  INFO  2016-09-21 16:06:21,684 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Executing streaming plan for repair-in
  INFO  2016-09-21 16:06:21,684 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Starting streaming to 127.0.0.2
  INFO  2016-09-21 16:06:21,684 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Beginning stream session with 127.0.0.2
  INFO  2016-09-21 16:06:21,685 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Prepare completed with 127.0.0.2. Receiving 1, sending 0
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - Stop transport: stop_gossiping done
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - Thrift server stopped
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - CQL server stopped
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - messaging_service stopped
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: shutdown messaging_service done
  INFO  2016-09-21 16:06:23,688 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Session with 127.0.0.2 is complete, state=FAILED
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - stream_manager stopped
  INFO  2016-09-21 16:06:23,688 [shard 1] storage_service - stream_manager stopped
  INFO  2016-09-21 16:06:23,688 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] bytes_sent = 0, bytes_received = 25725
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: shutdown stream_manager done
  WARN  2016-09-21 16:06:23,688 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Stream failed, peers={127.0.0.2}
  WARN  2016-09-21 16:06:23,688 [shard 0] repair - repair's stream failed: streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:06:23,688 [shard 0] repair - repair 1 failed - streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: auth shutdown
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: done
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Drain on shutdown: stop_transport done
  INFO  2016-09-21 16:06:23,688 [shard 0] tracing - Asked to shut down
  INFO  2016-09-21 16:06:23,688 [shard 0] tracing - Tracing is down
  INFO  2016-09-21 16:06:23,688 [shard 1] tracing - Asked to shut down
  INFO  2016-09-21 16:06:23,688 [shard 1] tracing - Tracing is down
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Drain on shutdown: tracing is stopped
  INFO  2016-09-21 16:06:23,774 [shard 0] storage_service - Drain on shutdown: flush column_families done
  INFO  2016-09-21 16:06:23,774 [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
  INFO  2016-09-21 16:06:23,774 [shard 0] storage_service - Drain on shutdown: done
  INFO  2016-09-21 16:06:23,774 [shard 0] repair - Starting shutdown of repair
  INFO  2016-09-21 16:06:23,774 [shard 0] repair - Completed shutdown of repair
  INFO  2016-09-21 16:06:23,774 [shard 0] compaction_manager - Asked to stop
  INFO  2016-09-21 16:06:23,774 [shard 1] compaction_manager - Asked to stop
2016-09-26 06:29:40 +08:00
Duarte Nunes
f4cf2f2aef tracing: Make trace_state_ptr argument required
This patch makes the optional trace_state_ptr arguments introduced in
previous patches mandatory where possible. Functions which are called
internally don't have a trace context, so for those we keep the
argument's default value for convenience.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-09-01 12:04:32 +02:00
Duarte Nunes
46b86ff801 storage_proxy: Pass along trace_state for queries
This patch changes the storage_proxy so it passed along a
trace_state_ptr to the layers below, when querying locally or
receiving a remote query request.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-09-01 12:04:32 +02:00
Glauber Costa
4310635bae move estimated histogram to utils
Nothing sstable-specific in it, really.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-08-31 15:13:23 -04:00
Glauber Costa
ffc2131c51 decouple estimated_histogram from sstables
There is nothing really that fundamentally ties the estimated histogram to
sstables. This patch gets rid of the few incidental ties. They are:

 - the namespace name, which is now moved to utils. Users inside sstables/
   now need to add a namespace prefix, while the ones outside have to change
   it to the right one
 - sstables::merge, which has a very non-descriptive name to begin with, is
   changed to a more descriptive name that can live inside utils/
 - the disk_types.hh include has to be removed - but it had no reason to be
   here in the first place.

Todo, is to actually move the file outside sstables/. That is done in a separate
step for clarity.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-08-31 15:13:23 -04:00
Duarte Nunes
39e0fb1260 storage_proxy: Support multiple partition ranges
This patch adds the ability to query multiple partition ranges. This
is needed since 55f2cf1626, where we
started unwrapping partition ranges in Thrift.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1472474594-15368-1-git-send-email-duarte@scylladb.com>
2016-08-30 17:43:40 +03:00
Avi Kivity
fb3a83a811 Merge "Slow query logging" from Vlad
"This series introduces a "slow query logging" feature that
allows logging the queries that take more than a specified
threshold time to complete.

Once such a query detected, it will be logged in a system_traces.node_slow_log table.

In addition all trace for that query that have been collected on a Coordinator
are going to be written as well.

If the handling time on a replica in the context of a query takes more than (the same) threshold
they are going to be written too.

The raw in a node_slow_log contains a session_id of a corresponding tracing session,
thereby allowing the user to query the system_traces tables for the corresponding trace
records.

The schema of the node_slow_log table is as follows:

CREATE TABLE system_traces.node_slow_log (
    node_ip inet,
    shard int,
    session_id uuid,
    date timestamp,
    start_time timeuuid,
    command text,
    duration int,
    parameters map<text, text>,
    source_ip inet,
    table_names set<text>,
    username text,
    PRIMARY KEY (start_time, node_ip, shard))
    WITH default_time_to_live = 86400

where
 - node_ip: IP of the coordinator Node.
 - shard: shard ID on a Coordinator where the query was handled.
 - session_id: ID of a corresponding tracing session.
 - date: a time when the query has began.
 - start_time: a time-based UUID for this query (needed for a primary key mostly).
 - command: a query string.
 - duration: a time it took to handle this query (in microseconds).
 - parameters: a map of query parameters (like in system_traces.sessions).
 - source_ip: IP of a Client that sent this query.
 - table_names: a set of "<keyspace>.<table name>" strings representing column
                families used in this query.
 - username: a user name used for this query.

The good thing is that most of the data we needed is already
collected by the regular tracing framework. The only missing ones
are a username and tables' names. So, this series makes the framework collect them too.

The whole feature is integrated in the Tracing framework. The main
changes to the framework that were made are as follows:
 - Store the constant capabilities of the tracing session in an enum_set, e.g.:
  - primary/secondary.
  - write on close.
 - Introduce two new capabilities to a tracing session of a specific query:
  - full tracing: collect all traces for this query (as it is before this series).
  - log slow query: log this query if its duration is above the threshold.
  These two capabilities may be defined independently.
 - Add the logic that handles the "log slow query"-only case:
  - Build the parameters<sstring, sstring> map only if the "duration" is above
    the given threshold.
  - The same about writing the trace entries.
 - In a not-only "log slow query" case:
  - Write the node_slow_log entry.
  - Extend the trace_info struct to pass slow query threshold and TTL to the replica
    Node.

In addition to above this series add the capability to configure the slow query logging
threshold and a TTL for the node_slow_log records.

The heaviest patch in the series is the last one. The series contains a few cosmetic (renaming)
patches that are meant to align the naming of the existing methods with the ones the last one
is going to add."
2016-08-29 13:11:36 +03:00
Gleb Natapov
a2cdddb795 storage_proxy: forward mutation write with correct timeout value
Now that mutation handler knows how much time is left for mutation
write to be handled it can use this knowledge to set correct timeout
for forwarded mutations.

Message-Id: <20160828080637.GE9243@scylladb.com>
2016-08-29 13:06:36 +03:00
Vlad Zolotarov
8609900621 tracing: introduce trace_state capabilities bit field
- Instead of keeping separate booleans introduce a trace_state_props_set enum_set and
     pass it around instead of separate booleans.
   - Change the trace_info to hold this value in addition to write_on_close. Initialize
     a corresponding bit in an enum_set based on a write_on_close value in a trace_info
     constructor for a backward compatibility.
   - Separate a trace_state constructor into two:
      - For a primary session object.
      - For a secondary session object.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-23 18:34:36 +03:00
Vlad Zolotarov
b40a819d1e tracing::trace_state: rename: get_session_id() -> session_id()
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-23 17:58:42 +03:00
Duarte Nunes
3275fabe53 storage_proxy: Short circuit query without clustering ranges
This patch makes the storage_proxy return an empty result when the
query doesn't define any clustering ranges (default or specific).

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-15 14:48:57 +00:00
Piotr Jastrzebski
f212a6cfcb Fix after free access bug in storage proxy
Due to speculative reads we can't guarantee that all
fibers started by storage_proxy::query will be finished
by the time the method returns a result.

We need to make sure that no parameter passed to this
method ever changes.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <31952e323e599905814b7f378aafdf779f7072b8.1471005642.git.piotr@scylladb.com>
2016-08-12 16:34:43 +02:00
Duarte Nunes
54ad038aa6 storage_proxy: Enforce partition_limit
This patch enforces the partition_limit at the
mutation_result_merger.

Ref #693

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-02 21:17:06 +00:00
Paweł Dziepak
0f902738f0 Revert "storage_proxy: Enforce partition_limit"
This reverts commit 141ea49e05.

There was a confusion around the meaning of "partition limit".
Parts of our code interpreted it just as "maximum number of partitions".
This is also how Cassandra behaves.

However, the other parts of the code, including data query, interpreted
it as "maximum number of live partitions" or otherwise skipped dead
partitions resulting in #1447.

A decision has been made to stick to the "maximum number of live
partitions" interpretation everywhere. The consequences are, among
others, that the patch reverted by this one is no longer correct.

While, the actual series fixing the interpretations of partition limit
and getting rid of the confusion is yet to come, the purpose of this
revert is to make backporting easier (as the patch being reverted
hasn't made it to branch-1.3 yet).
2016-08-02 16:53:01 +01:00
Duarte Nunes
141ea49e05 storage_proxy: Enforce partition_limit
This patch enforces the partition_limit at the
mutation_result_merger.

Ref #693

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1470065526-3174-1-git-send-email-duarte@scylladb.com>
2016-08-02 10:10:43 +01:00
Duarte Nunes
7d1b7e8da3 storage_service: Fix get_range_to_address_map_in_local_dc
This patch fixes a couple of bugs in
get_range_to_address_map_in_local_dc.

Fixes #1517

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1469782666-21320-1-git-send-email-duarte@scylladb.com>
2016-07-29 11:11:07 +02:00
Vlad Zolotarov
57b58cad8e SELECT tracing instrumentation: improve inter-nodes communication stages messages
Add/fix "sending to"/"received from" messages.

With this patch the single key select trace with a data on an external node
looks as follows:

Tracing session: 65dbfcc0-4f51-11e6-8dd2-000000000001

 activity                                                                                                                        | timestamp                  | source    | source_elapsed
---------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
                                                                                                              Execute CQL3 query | 2016-07-21 17:42:50.124000 | 127.0.0.2 |              0
                                                                                                   Parsing a statement [shard 1] | 2016-07-21 17:42:50.124127 | 127.0.0.2 |             --
                                                                                                Processing a statement [shard 1] | 2016-07-21 17:42:50.124190 | 127.0.0.2 |             64
 Creating read executor for token 2309717968349690594 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE [shard 1] | 2016-07-21 17:42:50.124229 | 127.0.0.2 |            103
                                                                            read_data: sending a message to /127.0.0.1 [shard 1] | 2016-07-21 17:42:50.124234 | 127.0.0.2 |            108
                                                                           read_data: message received from /127.0.0.2 [shard 1] | 2016-07-21 17:42:50.124358 | 127.0.0.1 |             14
                                                          read_data handling is done, sending a response to /127.0.0.2 [shard 1] | 2016-07-21 17:42:50.124434 | 127.0.0.1 |             89
                                                                               read_data: got response from /127.0.0.1 [shard 1] | 2016-07-21 17:42:50.124662 | 127.0.0.2 |            536
                                                                                  Done processing - preparing a result [shard 1] | 2016-07-21 17:42:50.124695 | 127.0.0.2 |            569
                                                                                                                Request complete | 2016-07-21 17:42:50.124580 | 127.0.0.2 |            580

Fixes #1481

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1469112271-22818-1-git-send-email-vladz@cloudius-systems.com>
2016-07-21 19:46:43 +03:00
Vlad Zolotarov
7c590295ef SELECT instrumentation: add a nice trace point
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:59 +03:00
Vlad Zolotarov
b36b69c1d6 service::storage_proxy: remove a default value for a tracing::trace_state_ptr parameter
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:59 +03:00
Vlad Zolotarov
baa6496816 service::storage_proxy: READ instrumentation: store trace state object in abstract_read_executor
Having a trace_state_ptr in the storage_proxy level is needed to trace code bits in this level.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:59 +03:00
Vlad Zolotarov
962bddf8fe transport: CQL tracing: instrument a BATCH command
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:58 +03:00
Vlad Zolotarov
be88074f47 service::query_state: get rid of begin_tracing()
Use tracing::begin() directly.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:58 +03:00
Vlad Zolotarov
982d301178 service::client_state: add a const version of get_trace_state()
tracing::begin() requires a non-const version, tracing::trace()
requires a const version.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:58 +03:00
Vlad Zolotarov
da56aa4256 service::client_state: rename: trace_state_ptr() -> get_trace_state()
Rename the method for consistency with other classes methods returning
the same value.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:58 +03:00
Vlad Zolotarov
4c16df9e4c service: instrument MUTATE flow with tracing
Store the trace state in the abstract_write_response_handler.
Instrument send_mutation RPC to receive an additional
rpc::optional parameter that will contain optional<trace_info>
value.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:58 +03:00
Vlad Zolotarov
952dc8a3d4 query_state: add get_trace_state() method
Adding this method allows to use tracing helper functions
and remove the no longer needed accessors in the query_state.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:58 +03:00
Vlad Zolotarov
0552ffcd17 service/storage_proxy: tracing: adjust the existing SELECT instrumentation with the new trace() interface
From now on trace_state::trace() is able to receive the sprint-ready
format string with the arguments that will be applied only during
the flush event.

This patch also optimizes the way the source address is evaluated -
do it only once instead of twice if tracing is requested.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:58 +03:00
Vlad Zolotarov
0689843e79 tracing::trace_state: add method to set the session's "params" map entries
Sometimes we want to be able to set "params" map after we
started a tracing session, e.g. when the parameters values,
like a consistency level parsed from the "options" part of a binary frame,
are available only after some heavy part of a flow we would like
to trace.

This patch includes the following changes:

   - No longer pass a map to the begin().
   - Limit the parameters to the known set.
   - Define a method to set each such parameter and save its
     value till the final sstring->sstring map is created.
   - Construct the final sstring->sstring map in the destructor of the trace_state
     object in order to defer all the formatting to be after the traced flow.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:58 +03:00
Vlad Zolotarov
a5022a09a4 tracing: use 'write' instead of 'flush' and 'store' for consistency with seastar's API
In names of functions and variables:
s/flush_/write_/
s/store_/write_/

In a i_tracing_backend_helper:
s/flush()/kick()/

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:57 +03:00
Duarte Nunes
3c389ba871 client_state: Add has_schema_access function
This function is similar to has_column_family_access, but skips
validating if the specified keyspace and column family names map to a
valid schema, as it already takes one as an argument.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-07-17 17:38:23 +00:00
Tomasz Grabiec
c97871d95c migration_manager: Uncomment logging for keysapce drop
Message-Id: <1468413673-6899-1-git-send-email-tgrabiec@scylladb.com>
2016-07-13 13:42:23 +01:00
Paweł Dziepak
85c092c56c storage_service: add LARGE_PARTITIONS_FEATURE
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-13 09:51:23 +01:00
Asias He
e0949a8f4f storage_service: Exit shadow round state if it fails
If a node fails to talk to any seed node, shadow round will fail. We
should exit shadow round state before we continue.

This issue is spotted by
consistency_test.TestConsistency.data_query_digest_test dtest.
Message-Id: <ba0613532a69bac369ca316ab61d907b320c8e68.1467963674.git.asias@scylladb.com>
2016-07-08 10:05:07 +01:00