Commit Graph

222 Commits

Author SHA1 Message Date
Amnon Heiman
893410b08d storage_proxy: setting the write timers
This patch set the write timmers: histogram, timeout and unavailable.

For the histogram a latency is needed. For that the latency object is
used.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 10:57:40 +03:00
Amnon Heiman
c317e61f6d Adding histogrms to storage_proxy
The storage proxy needs to collect statistics about read, write and
range. For that the ihistogram object was added to its stats object.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 10:57:40 +03:00
Tomasz Grabiec
e5feff5d71 dht: ring_position: Switch to total ordering
range::is_wrap_around() and range::contains() rely on total ordering
on values to work properly. Current ring_position_comparator was only
imposing a weak ordering (token positions equal to all key positions
with that token).

range::before() and range::after() can't work for weak ordering. If
the bound is exclusive, we don't know if user-provided token position
is inside or outside.

Also, is_wrap_around() can't properly detect wrap around in all
cases. Consider this case:

 (1) ]A; B]
 (2) [A; B]

For A = (tok1) and B = (tok1, key1), (1) is a wrap around and (2) is
not. Without total ordering between A and B, range::is_wrap_around() can't
tell that.

I think the simplest soution is to define a total ordering on
ring_position by making token positions positioned either before or
after all keys with that token.
2015-07-24 16:08:41 +02:00
Pekka Enberg
b912c888f1 service/migration_manager: Fix logging by de-thread-localizing loggers
The migration_manager and migration_task logging is currently not
visible in the logs. Fix that by de-thread-localizing both loggers.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-23 16:16:14 +03:00
Gleb Natapov
e41880fc66 drop reference to storage_proxy from read executors
Now when storage_proxy can be accessed globally there is no need to
waste memory to save its pointer in each read executor.
2015-07-23 12:32:21 +03:00
Gleb Natapov
00d4974cec storage_proxy: add logger
Uncomment exiting logging points.
2015-07-23 12:32:21 +03:00
Gleb Natapov
f122ee39b9 storage_proxy: return proper error codes to transport layer
Transport layer expects to get error code in an exception of type
exceptions::cassandra_exception. Fix code to use it as a base for
all user visible exceptions and put correct error code there.
2015-07-23 12:32:21 +03:00
Avi Kivity
8870bf1bf8 Merge "Handling of non-full partition range queries" from Tomasz 2015-07-22 15:18:02 +03:00
Pekka Enberg
55858137e0 utils: Clean up runtime::get_uptime() API
Return a std::chrono::steady_clock::duration and switch the caller in
migration manager to also use proper C++ durations.

Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 14:56:52 +03:00
Gleb Natapov
150c28e941 storage_proxy: do not ignore connection errors
Do nothing about them for now. Read will eventually fail on timeout.
2015-07-22 13:44:47 +03:00
Gleb Natapov
98fae1a010 storage_proxy: handle read timeout 2015-07-22 13:44:46 +03:00
Avi Kivity
a547b881a7 Merge "Schema pull" from Pekka
"This series enables the schema pull functionality. It's used to
synchronize schema at node startup for schema changes that happened in
the cluster while the node was down.

Node 1:

  # Node 2 is not running.

  [penberg@nero apache-cassandra-2.1.7]$ ./bin/cqlsh --no-color 127.0.0.1
  Connected to Test Cluster at 127.0.0.1:9042.
  [cqlsh 5.0.1 | Cassandra 2.2.0 | CQL spec 3.2.0 | Native protocol v3]
  Use HELP for help.
  cqlsh> CREATE KEYSPACE keyspace3 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
  cqlsh> SELECT * FROM system.schema_keyspaces;

   keyspace_name | durable_writes | strategy_class                             | strategy_options
  ---------------+----------------+--------------------------------------------+----------------------------
       keyspace3 |           True |                             SimpleStrategy | {"replication_factor":"1"}
          system |           True | org.apache.cassandra.locator.LocalStrategy |                         {}

  (2 rows)
  cqlsh> SELECT key, schema_version FROM system.local;

   key   | schema_version
  -------+--------------------------------------
   local | c3a18ddc-80c5-3a25-b82d-57178a318771

  (1 rows)

Node 2:

  # Node 2 is started.

  [penberg@nero apache-cassandra-2.1.7]$ ./bin/cqlsh --no-color 127.0.0.2
  Connected to Test Cluster at 127.0.0.2:9042.
  [cqlsh 5.0.1 | Cassandra 2.2.0 | CQL spec 3.2.0 | Native protocol v3]
  Use HELP for help.
  cqlsh> SELECT * FROM system.schema_keyspaces;

   keyspace_name | durable_writes | strategy_class                             | strategy_options
  ---------------+----------------+--------------------------------------------+----------------------------
       keyspace3 |           True |                             SimpleStrategy | {"replication_factor":"1"}
          system |           True | org.apache.cassandra.locator.LocalStrategy |                         {}

  (2 rows)
  cqlsh> SELECT key, schema_version FROM system.local;

   key   | schema_version
  -------+--------------------------------------
   local | c3a18ddc-80c5-3a25-b82d-57178a318771

  (1 rows)"
2015-07-22 13:32:47 +03:00
Gleb Natapov
b737a85f08 storage_proxy: fix get_live_sorted_endpoints()
remove_if() does not really removes anything.

Fixes #33
2015-07-22 12:14:40 +02:00
Pekka Enberg
fa02929165 service/storage_service: Wire up schema pull
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 13:08:18 +03:00
Pekka Enberg
f1d5b9c4ae service/migration_manager: Convert scheduleSchemaPull() to C++
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 13:08:18 +03:00
Pekka Enberg
5033858551 service/storage_service: Fix schema version announce in prepare_to_join()
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Pekka Enberg
d4d2e2fa0e service/storage_proxy: add get_storage_proxy() helpers
Make storage proxy a singleton and add helpers to look up a reference.
This makes it easier to convert code from Origin in areas such as
storage service.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Pekka Enberg
c6dc61eab4 service: Convert MigrationTask to C++
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Pekka Enberg
03a43678af service: Import MigrationTask.java
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Pekka Enberg
d81b04f356 service/storage_proxy: Wire up MIGRATION_REQUEST messages
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-22 11:57:00 +03:00
Tomasz Grabiec
6882e6ca50 storage_proxy: Fix range splitting when split point is exlcluded by lower bound
If we had a range (x; ...] then x is excluded, but token iterator was
initialized with x. The splitting loop would exit prematurely because
it would detect that the token is outside the range.

The fix is to teach ring_range() to recognize this and always give
tokens which are not smaller than the range's lower bound.
2015-07-22 10:27:48 +02:00
Tomasz Grabiec
c219c3676b storage_proxy: Avoid unnecessary copy of a partition_range 2015-07-22 10:27:48 +02:00
Tomasz Grabiec
abdb36a0f6 storage_proxy: Fix ordering comparator used during merging and reconciliation 2015-07-22 10:27:48 +02:00
Tomasz Grabiec
9b52f5bf2b storage_proxy: Make range splitting never produce a wrap around range
Origin has no notion of a maximum token so a range without upper bound
is represented as (x; min]. The splitting code is supposed to produce
only non-wrapping ranges, but (x; min] looks like a wrapping range, so
database code which consumes it would have to special-case for it. A
simpler solution is to change the splitting code to never produce a
wrapping range.
2015-07-22 10:27:48 +02:00
Tomasz Grabiec
f557951554 storage_proxy: Encapsulate access to range tokens
The wrappers also take care of cases when each bound is undefined, and
return mimimum or maximum token respectively, which fixes undefined
behavior in get_restricted_ranges().

Alternative solution would be to make partition_range use a different
type of range, the one where bounds are always specified. However it's
not worth introducing a new range type just for those few users.
2015-07-22 10:27:48 +02:00
Tomasz Grabiec
5587d2ce9a storage_proxy: Simplify access to storage_proxy::response_id_type
storage_proxy::storage_proxy::response_id_type
 -> storage_proxy::response_id_type
2015-07-22 10:27:48 +02:00
Tomasz Grabiec
0b0ea04958 range: Remove start_value() and end_value()
It's easy to miss that they may be undefined. start() and end(), which
return optional<bound> const&, make it clear.
2015-07-22 10:27:47 +02:00
Asias He
547d9c347e storage_service: Simplfy do_update_system_peers_table
Using template helper.
2015-07-22 11:22:18 +03:00
Asias He
7ee4ae2ff7 storage_service: Use logger.error instead of print 2015-07-21 22:56:24 +08:00
Asias He
94e34cde64 storage_service: Drop ss_debug 2015-07-21 22:56:24 +08:00
Asias He
72a8c27b56 storage_service: Enable logger 2015-07-21 22:56:24 +08:00
Asias He
344d8e95d1 storage_service: Update system.peers table
Before:

   cqlsh> SELECT * from system.peers ;
   cqlsh:system> SELECT * FROM system.peers ;
    peer | data_center | host_id | preferred_ip | rack | release_version | rpc_address | schema_version | tokens
   ------+-------------+---------+--------------+------+-----------------+-------------+----------------+--------
   (0 rows)

After:

   cqlsh> SELECT * from system.peers ;
    peer      | data_center | host_id                              | preferred_ip | rack  | release_version | rpc_address | schema_version | tokens
   -----------+-------------+--------------------------------------+--------------+-------+-----------------+-------------+----------------+--------
    127.0.0.2 | datacenter1 | 7daa116a-03d0-4623-8084-f213701f3136 |         null | rack1 |      urchin_1_0 |   127.0.0.2 |           null |   null
   (1 rows)
2015-07-21 17:10:56 +08:00
Asias He
d8a281e811 storage_service: Use broadcast_address as broadcast_rpc_address for now
Until we can get it from the config system.

It is better than a empty address.
2015-07-21 17:00:15 +08:00
Shlomi Livne
026f6b7e2b Fix a race condition in node state for bootstrapped nodes
Booting a bootstrapped node had a race condition between setting and
advertising the state as boot strapped (call to
storage_service::bootstrap) and setting and advertising the
state as normal (call to storage_service::set_tokens) - as such a node
could get into a state in which it was "stuck" in bootstrap mode.

Following this patch you must wait for 5 seconds to have the cluster in
a stable state.

Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
2015-07-20 17:53:57 +03:00
Gleb Natapov
c58aea07c3 remove storage_proxy::query_local()
Use storage_proxy::query() instead.
2015-07-20 12:06:00 +02:00
Gleb Natapov
640e65b947 storage_proxy: Reconciliation. The beginning
The patch introduces reconciliation code. The same code suppose to be
working for both range and single key queries. Handling of raw_limit,
short reads and read repairs is still very much missing.

--
v1->v2:
  - call live_row_count() only once.
2015-07-16 19:23:58 +03:00
Pekka Enberg
c090ac076f service/migration_manager: Convert passiveAnnounce() to C++
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-16 14:53:31 +03:00
Pekka Enberg
b417405f48 service/migration_manager: Enable logging
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-16 14:53:31 +03:00
Pekka Enberg
822d3cab55 service/storage_service: Make value_factory public
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-16 14:53:31 +03:00
Avi Kivity
c74e36c30e Merge branch 'master' of github.com:cloudius-systems/urchin into db
Conflicts:
	message/messaging_service.cc
	message/messaging_service.hh
2015-07-16 12:51:19 +03:00
Avi Kivity
1d4805236b messaging_service: don't include config.hh in .hh
config.hh changes rapidly, so don't force lots of recompiles by including it.

Need to place seed_provider_type in namespace scope, so we can forward
declare it for that.
2015-07-16 12:26:02 +03:00
Asias He
b07ae77585 messaging_service: Add wrapper for READ_DIGEST verb 2015-07-16 17:23:26 +08:00
Asias He
c19f5edefe messaging_service: Add wrapper for READ_MUTATION_DATA verb 2015-07-16 17:23:26 +08:00
Asias He
85c6dc0b7e messaging_service: Add wrapper for READ_DATA verb 2015-07-16 17:23:26 +08:00
Asias He
0707c15000 messaging_service: Add wrapper for MUTATION_DONE verb 2015-07-16 17:23:26 +08:00
Asias He
196c7d6f2b messaging_service: Add wrapper for MUTATION verb 2015-07-16 17:23:22 +08:00
Asias He
49b80535d9 messaging_service: Add wrapper for DEFINITIONS_UPDATE verb 2015-07-16 17:19:51 +08:00
Asias He
3e9d88e67c migration_manager: Do not use make_lw_shared in push_schema_mutation
The code does:

    auto fm = make_lw_shared<std::vector<frozen_mutation>>(schema.begin(), schema.end());
    return net::get_local_messaging_service().send_message_oneway(net::messaging_verb::DEFINITIONS_UPDATE,
         std::move(id), std::move(fm));

    ms.register_handler(net::messaging_verb::DEFINITIONS_UPDATE, [this] (std::vector<frozen_mutation> m) {
    ...
    }

We should not send a lw_shared_ptr, but std::vector<frozen_mutation>.

However, from Gleb:

   Well, this is not a bug, this is really cool (and to be honest
   unintended) feature of RPC. It is smart enough to detect that it is a
   smart pointer to an object and dereference it. But in this particular
   case there is not justification to use shared_ptr in the first place.

So, drop the lw_shared_ptr anyway.
2015-07-16 17:19:51 +08:00
Gleb Natapov
ed0faffd12 storage_proxy: beginning of read implementation for ranges queries 2015-07-15 12:41:38 +03:00
Gleb Natapov
6dc5b5ad76 storage_proxy: add resolver classes facilitate query resolving
The responsibility of the classes is to collect all results and do
reconciliation if needed.
2015-07-15 12:41:38 +03:00