Commit Graph

236 Commits

Author SHA1 Message Date
Pekka Enberg
12d99bd282 service/migration_manager: Migration listener hooks
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:51 +03:00
Pekka Enberg
5b3f7f8091 service: Convert MigrationListener to C++
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-04 15:08:24 +03:00
Pekka Enberg
44b53be4a1 service: Import IMigrationListener.java
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-04 13:17:05 +03:00
Shlomi Livne
b8bcf6d6e7 Fix a bug in mutation response processing in smp
Current code passes the storage_proxy instance of the shard on which the
message was received instead of using the storage_proxy instance of the
shard that sent the request.

Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
2015-08-04 11:23:49 +03:00
Pekka Enberg
a3c95235e6 migration_manager: Make stateful with sharded<>
In preparation for adding listener state to migration manager, use
sharded<> for migration manager.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-04 11:23:23 +03:00
Tomasz Grabiec
51b3bf38bb storage_proxy: Introduce make_local_reader() 2015-08-03 15:21:40 +02:00
Tomasz Grabiec
19851fb8db storage_proxy: Move stop() implementation to source file 2015-08-03 11:52:23 +02:00
Glauber Costa
438a3f8619 storage_proxy: move speculative_retry type to schema
It will be stored in the schema, so move it there, where it belongs.  We'll need
to do more than just store a type, so provide a class that encapsulates it.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-29 18:44:59 -04:00
Asias He
04e1cbfac7 storage_service: Enable check_for_endpoint_collision
We do a shadow gossip round to check if the ip address is used by
another node.
2015-07-29 15:41:52 +08:00
Avi Kivity
34131b22dd Merge "Debuggability improvements" from Tomasz 2015-07-28 12:47:33 +03:00
Tomasz Grabiec
c2664c0d46 storage_proxy: Add query logging on log_level::trace 2015-07-28 11:31:08 +02:00
Pekka Enberg
3803e0ed5b exceptions: Move request_timeout_exception to exceptions.hh
Now that consistency level dependency issues are sorted out, move
request_timeout_exception to exceptions.hh.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-28 10:06:18 +03:00
Pekka Enberg
0b8c67ed79 exceptions: Move unavailable_exception to exceptions.hh
Move unavailable_exception to exceptions.hh, where the other CQL transport
level exceptions are defined.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-28 10:06:18 +03:00
Pekka Enberg
a4225fa7d0 service/storage_proxy: Use proper read_timeout_exception
Use the shiny new read_timeout_exception that handles CQL transport
protocol encoding properly.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-27 17:49:52 +03:00
Amnon Heiman
893410b08d storage_proxy: setting the write timers
This patch sets the write timers: histogram, timeout and unavailable.

For the histogram a latency is needed. For that the latency object is
used.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 10:57:40 +03:00
Amnon Heiman
c317e61f6d Adding histograms to storage_proxy
The storage proxy needs to collect statistics about read, write and
range. For that the ihistogram object was added to its stats object.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 10:57:40 +03:00
Tomasz Grabiec
e5feff5d71 dht: ring_position: Switch to total ordering
range::is_wrap_around() and range::contains() rely on total ordering
on values to work properly. Current ring_position_comparator was only
imposing a weak ordering (token positions equal to all key positions
with that token).

range::before() and range::after() can't work for weak ordering. If
the bound is exclusive, we don't know if user-provided token position
is inside or outside.

Also, is_wrap_around() can't properly detect wrap around in all
cases. Consider this case:

 (1) ]A; B]
 (2) [A; B]

For A = (tok1) and B = (tok1, key1), (1) is a wrap around and (2) is
not. Without total ordering between A and B, range::is_wrap_around() can't
tell that.

I think the simplest solution is to define a total ordering on
ring_position by making token positions positioned either before or
after all keys with that token.
2015-07-24 16:08:41 +02:00
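
The ordering described in the commit above can be illustrated with a small sketch. It is not the actual ring_position or ring_position_comparator code: `ring_pos`, its fields, and `less()` are hypothetical names, tokens are assumed to be plain 64-bit integers, and keys are assumed to be strings. The point it shows is that a token position gets a fixed rank before or after every key with the same token, so any two distinct positions compare unequal.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for a ring position: a token plus an optional key.
// A position without a key is a "token position"; after_keys chooses whether
// it sorts before or after every key that shares its token.
struct ring_pos {
    int64_t token;
    std::optional<std::string> key;  // empty => token position
    bool after_keys = false;         // only meaningful when key is empty

    // Rank within one token: -1 before all keys, 0 a key, +1 after all keys.
    int rank() const {
        if (key) {
            return 0;
        }
        return after_keys ? 1 : -1;
    }
};

// Strict total order: compare tokens, then ranks, then key contents.
inline bool less(const ring_pos& a, const ring_pos& b) {
    if (a.token != b.token) {
        return a.token < b.token;
    }
    if (a.rank() != b.rank()) {
        return a.rank() < b.rank();
    }
    if (a.key && b.key) {
        return *a.key < *b.key;
    }
    return false;  // same token, same rank, no keys: equal positions
}
```

With A = (tok1) placed before all keys and B = (tok1, key1), `less(A, B)` holds and `less(B, A)` does not, which is the information a wrap-around check needs to tell ]A; B] and [A; B] apart.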
Pekka Enberg
b912c888f1 service/migration_manager: Fix logging by de-thread-localizing loggers
The migration_manager and migration_task logging is currently not
visible in the logs. Fix that by de-thread-localizing both loggers.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-23 16:16:14 +03:00
Gleb Natapov
e41880fc66 drop reference to storage_proxy from read executors
Now that storage_proxy can be accessed globally, there is no need to
waste memory storing a pointer to it in each read executor.
2015-07-23 12:32:21 +03:00
Gleb Natapov
00d4974cec storage_proxy: add logger
Uncomment existing logging points.
2015-07-23 12:32:21 +03:00
Gleb Natapov
f122ee39b9 storage_proxy: return proper error codes to transport layer
Transport layer expects to get error code in an exception of type
exceptions::cassandra_exception. Fix code to use it as a base for
all user visible exceptions and put correct error code there.
2015-07-23 12:32:21 +03:00
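
A minimal sketch of the pattern this commit describes: one base exception type that carries an error code, from which user-visible exceptions derive, so that the layer above can catch the base and read the code. All class and function names below are hypothetical and do not reproduce exceptions::cassandra_exception. The numeric values are the Unavailable, Write_timeout and Read_timeout error codes of the CQL native protocol.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// Error codes as defined by the CQL native protocol.
enum class exception_code : int32_t {
    unavailable   = 0x1000,
    write_timeout = 0x1100,
    read_timeout  = 0x1200,
};

// Hypothetical base class: every user-visible error carries its code.
class cassandra_exception_sketch : public std::runtime_error {
    exception_code _code;
public:
    cassandra_exception_sketch(exception_code code, const std::string& msg)
        : std::runtime_error(msg), _code(code) {}
    exception_code code() const noexcept { return _code; }
};

// Hypothetical derived exception that fixes the code for read timeouts.
struct read_timeout_sketch : cassandra_exception_sketch {
    read_timeout_sketch()
        : cassandra_exception_sketch(exception_code::read_timeout,
                                     "Operation timed out") {}
};

// What a transport layer could do: run a request and, if it fails with a
// user-visible error, return the code to encode in the error response.
template <typename F>
std::optional<int32_t> error_code_of(F&& request) {
    try {
        request();
        return std::nullopt;
    } catch (const cassandra_exception_sketch& e) {
        return static_cast<int32_t>(e.code());
    }
}
```

Catching by the base type means new user-visible exceptions only need to derive from it and pass the right code; the catching side stays unchanged.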
Avi Kivity
8870bf1bf8 Merge "Handling of non-full partition range queries" from Tomasz 2015-07-22 15:18:02 +03:00
Pekka Enberg
55858137e0 utils: Clean up runtime::get_uptime() API
Return a std::chrono::steady_clock::duration and switch the caller in
migration manager to also use proper C++ durations.

Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 14:56:52 +03:00
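
To illustrate why returning a std::chrono duration is cleaner than a raw integer, here is a sketch under stated assumptions: the namespace `runtime_sketch`, the `boot_time()` helper and the `past_delay()` check are hypothetical and are not the code in utils. Only the return type, std::chrono::steady_clock::duration, comes from the commit message.

```cpp
#include <cassert>
#include <chrono>

namespace runtime_sketch {

using clock = std::chrono::steady_clock;

// Hypothetical: the time point captured on first use stands in for boot time.
inline clock::time_point boot_time() {
    static const clock::time_point t = clock::now();
    return t;
}

// Uptime as a typed duration; the unit is part of the type.
inline clock::duration get_uptime() {
    const auto start = boot_time();  // read first so the result is never negative
    return clock::now() - start;
}

} // namespace runtime_sketch

// A caller can compare against any duration unit without manual conversion.
inline bool past_delay(std::chrono::steady_clock::duration uptime,
                       std::chrono::milliseconds delay) {
    return uptime >= delay;
}
```

A caller can then write `past_delay(runtime_sketch::get_uptime(), std::chrono::seconds(60))` and let the library handle the unit conversion.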
Gleb Natapov
150c28e941 storage_proxy: do not ignore connection errors
Do nothing about them for now. Read will eventually fail on timeout.
2015-07-22 13:44:47 +03:00
Gleb Natapov
98fae1a010 storage_proxy: handle read timeout 2015-07-22 13:44:46 +03:00
Avi Kivity
a547b881a7 Merge "Schema pull" from Pekka
"This series enables the schema pull functionality. It's used to
synchronize schema at node startup for schema changes that happened in
the cluster while the node was down.

Node 1:

  # Node 2 is not running.

  [penberg@nero apache-cassandra-2.1.7]$ ./bin/cqlsh --no-color 127.0.0.1
  Connected to Test Cluster at 127.0.0.1:9042.
  [cqlsh 5.0.1 | Cassandra 2.2.0 | CQL spec 3.2.0 | Native protocol v3]
  Use HELP for help.
  cqlsh> CREATE KEYSPACE keyspace3 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
  cqlsh> SELECT * FROM system.schema_keyspaces;

   keyspace_name | durable_writes | strategy_class                             | strategy_options
  ---------------+----------------+--------------------------------------------+----------------------------
       keyspace3 |           True |                             SimpleStrategy | {"replication_factor":"1"}
          system |           True | org.apache.cassandra.locator.LocalStrategy |                         {}

  (2 rows)
  cqlsh> SELECT key, schema_version FROM system.local;

   key   | schema_version
  -------+--------------------------------------
   local | c3a18ddc-80c5-3a25-b82d-57178a318771

  (1 rows)

Node 2:

  # Node 2 is started.

  [penberg@nero apache-cassandra-2.1.7]$ ./bin/cqlsh --no-color 127.0.0.2
  Connected to Test Cluster at 127.0.0.2:9042.
  [cqlsh 5.0.1 | Cassandra 2.2.0 | CQL spec 3.2.0 | Native protocol v3]
  Use HELP for help.
  cqlsh> SELECT * FROM system.schema_keyspaces;

   keyspace_name | durable_writes | strategy_class                             | strategy_options
  ---------------+----------------+--------------------------------------------+----------------------------
       keyspace3 |           True |                             SimpleStrategy | {"replication_factor":"1"}
          system |           True | org.apache.cassandra.locator.LocalStrategy |                         {}

  (2 rows)
  cqlsh> SELECT key, schema_version FROM system.local;

   key   | schema_version
  -------+--------------------------------------
   local | c3a18ddc-80c5-3a25-b82d-57178a318771

  (1 rows)"
2015-07-22 13:32:47 +03:00
Gleb Natapov
b737a85f08 storage_proxy: fix get_live_sorted_endpoints()
remove_if() does not really remove anything.

Fixes #33
2015-07-22 12:14:40 +02:00
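
For readers unfamiliar with this pitfall: std::remove_if only moves the elements to keep to the front of the range and returns the new logical end; the container keeps its size until erase() is called. The sketch below shows the usual erase-remove idiom. The helper name `live_endpoints` and the `is_alive` predicate are hypothetical, and the actual fix in get_live_sorted_endpoints() may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: return only the endpoints for which is_alive() holds.
// is_alive is an assumed predicate, not the real failure detector API.
template <typename Pred>
std::vector<std::string> live_endpoints(std::vector<std::string> eps, Pred is_alive) {
    // remove_if shifts the kept elements forward and returns the new end...
    auto new_end = std::remove_if(eps.begin(), eps.end(),
        [&](const std::string& e) { return !is_alive(e); });
    // ...and erase() is what actually shrinks the vector.
    eps.erase(new_end, eps.end());
    return eps;
}
```

Without the erase() call, the vector would keep its original size and the tail elements would be left in a valid but unspecified state.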
Pekka Enberg
fa02929165 service/storage_service: Wire up schema pull
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 13:08:18 +03:00
Pekka Enberg
f1d5b9c4ae service/migration_manager: Convert scheduleSchemaPull() to C++
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 13:08:18 +03:00
Pekka Enberg
5033858551 service/storage_service: Fix schema version announce in prepare_to_join()
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Pekka Enberg
d4d2e2fa0e service/storage_proxy: add get_storage_proxy() helpers
Make storage proxy a singleton and add helpers to look up a reference.
This makes it easier to convert code from Origin in areas such as
storage service.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Pekka Enberg
c6dc61eab4 service: Convert MigrationTask to C++
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Pekka Enberg
03a43678af service: Import MigrationTask.java
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Pekka Enberg
d81b04f356 service/storage_proxy: Wire up MIGRATION_REQUEST messages
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Tomasz Grabiec
6882e6ca50 storage_proxy: Fix range splitting when split point is excluded by lower bound
If we had a range (x; ...] then x is excluded, but token iterator was
initialized with x. The splitting loop would exit prematurely because
it would detect that the token is outside the range.

The fix is to teach ring_range() to recognize this and always give
tokens which are not smaller than the range's lower bound.
2015-07-22 10:27:48 +02:00
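
One way to picture the fix, as a sketch only: when the lower bound x is exclusive, the first candidate split token has to come strictly after x, while an inclusive bound may start at x itself. The function `first_split_point` and the sorted vector of integer tokens are assumptions made for illustration; the real ring_range() operates on the token metadata and is not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical: pick the first ring token that may serve as a split point for
// a range whose lower bound is `start`. sorted_ring must be sorted ascending.
inline std::vector<int64_t>::const_iterator
first_split_point(const std::vector<int64_t>& sorted_ring,
                  int64_t start, bool start_inclusive) {
    if (start_inclusive) {
        // [x; ...]: x itself is inside the range.
        return std::lower_bound(sorted_ring.cbegin(), sorted_ring.cend(), start);
    }
    // (x; ...]: skip x, it is outside the range.
    return std::upper_bound(sorted_ring.cbegin(), sorted_ring.cend(), start);
}
```

Starting the iteration at x for an exclusive bound is what made the splitting loop see an out-of-range token immediately and stop early.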
Tomasz Grabiec
c219c3676b storage_proxy: Avoid unnecessary copy of a partition_range 2015-07-22 10:27:48 +02:00
Tomasz Grabiec
abdb36a0f6 storage_proxy: Fix ordering comparator used during merging and reconciliation 2015-07-22 10:27:48 +02:00
Tomasz Grabiec
9b52f5bf2b storage_proxy: Make range splitting never produce a wrap around range
Origin has no notion of a maximum token so a range without upper bound
is represented as (x; min]. The splitting code is supposed to produce
only non-wrapping ranges, but (x; min] looks like a wrapping range, so
database code which consumes it would have to special-case for it. A
simpler solution is to change the splitting code to never produce a
wrapping range.
2015-07-22 10:27:48 +02:00
Tomasz Grabiec
f557951554 storage_proxy: Encapsulate access to range tokens
The wrappers also take care of cases when a bound is undefined, and
return the minimum or maximum token respectively, which fixes undefined
behavior in get_restricted_ranges().

An alternative solution would be to make partition_range use a different
type of range, one where bounds are always specified. However, it's
not worth introducing a new range type just for those few users.
2015-07-22 10:27:48 +02:00
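
The idea of such wrappers can be sketched as follows. Everything here is hypothetical: `token` is assumed to be a 64-bit integer, `token_range` is a simplified range with optional bounds, and `start_token()` / `end_token()` are illustrative names rather than the functions added by the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

using token = int64_t;  // assumption: tokens are plain 64-bit integers

struct bound {
    token value;
    bool inclusive;
};

// Simplified range: either bound may be absent (unbounded on that side).
struct token_range {
    std::optional<bound> start;
    std::optional<bound> end;
};

constexpr token minimum_token = std::numeric_limits<token>::min();
constexpr token maximum_token = std::numeric_limits<token>::max();

// Wrappers that never dereference a missing bound.
inline token start_token(const token_range& r) {
    return r.start ? r.start->value : minimum_token;
}

inline token end_token(const token_range& r) {
    return r.end ? r.end->value : maximum_token;
}
```

Callers that go through the wrappers cannot accidentally read an absent bound, which is the class of undefined behavior the commit message mentions.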
Tomasz Grabiec
5587d2ce9a storage_proxy: Simplify access to storage_proxy::response_id_type
storage_proxy::storage_proxy::response_id_type
 -> storage_proxy::response_id_type
2015-07-22 10:27:48 +02:00
Tomasz Grabiec
0b0ea04958 range: Remove start_value() and end_value()
It's easy to miss that they may be undefined. start() and end(), which
return optional<bound> const&, make it clear.
2015-07-22 10:27:47 +02:00
Asias He
547d9c347e storage_service: Simplify do_update_system_peers_table
Using template helper.
2015-07-22 11:22:18 +03:00
Asias He
7ee4ae2ff7 storage_service: Use logger.error instead of print 2015-07-21 22:56:24 +08:00
Asias He
94e34cde64 storage_service: Drop ss_debug 2015-07-21 22:56:24 +08:00
Asias He
72a8c27b56 storage_service: Enable logger 2015-07-21 22:56:24 +08:00
Asias He
344d8e95d1 storage_service: Update system.peers table
Before:

   cqlsh> SELECT * from system.peers ;
   cqlsh:system> SELECT * FROM system.peers ;
    peer | data_center | host_id | preferred_ip | rack | release_version | rpc_address | schema_version | tokens
   ------+-------------+---------+--------------+------+-----------------+-------------+----------------+--------
   (0 rows)

After:

   cqlsh> SELECT * from system.peers ;
    peer      | data_center | host_id                              | preferred_ip | rack  | release_version | rpc_address | schema_version | tokens
   -----------+-------------+--------------------------------------+--------------+-------+-----------------+-------------+----------------+--------
    127.0.0.2 | datacenter1 | 7daa116a-03d0-4623-8084-f213701f3136 |         null | rack1 |      urchin_1_0 |   127.0.0.2 |           null |   null
   (1 rows)
2015-07-21 17:10:56 +08:00
Asias He
d8a281e811 storage_service: Use broadcast_address as broadcast_rpc_address for now
Until we can get it from the config system.

It is better than an empty address.
2015-07-21 17:00:15 +08:00
Shlomi Livne
026f6b7e2b Fix a race condition in node state for bootstrapped nodes
Booting a bootstrapped node had a race condition between setting and
advertising the state as bootstrapped (call to
storage_service::bootstrap) and setting and advertising the
state as normal (call to storage_service::set_tokens) - as such a node
could get into a state in which it was "stuck" in bootstrap mode.

Following this patch you must wait for 5 seconds to have the cluster in
a stable state.

Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
2015-07-20 17:53:57 +03:00
Gleb Natapov
c58aea07c3 remove storage_proxy::query_local()
Use storage_proxy::query() instead.
2015-07-20 12:06:00 +02:00
Gleb Natapov
640e65b947 storage_proxy: Reconciliation. The beginning
The patch introduces reconciliation code. The same code is supposed to
work for both range and single key queries. Handling of raw_limit,
short reads and read repairs is still very much missing.

--
v1->v2:
  - call live_row_count() only once.
2015-07-16 19:23:58 +03:00