5432 Commits

Author SHA1 Message Date
Avi Kivity
2d0dfcd491 Merge "Add config yaml file" from Shlomi 2015-08-05 13:58:51 +03:00
Avi Kivity
3aad7c9c19 Merge "Migration manager cleanups" from Pekka
"Clean up various issues that I've stumbled across in the migration
manager code."
2015-08-05 13:48:21 +03:00
Pekka Enberg
d743f6df50 service/migration_manager: Fix error handling in announce()
Propagate exceptions from migration_manager::announce() to the callers.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
Pekka Enberg
0793a7849f service/migration_manager: Fix logger name
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
Pekka Enberg
c281e6f1c3 service/migration_manager: Use get_local_gossiper()
Use the get_local_gossiper() helper instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
Pekka Enberg
d70091e1fe service/migration_manager: Remove isReadyForBootstrap()
Remove the commented-out isReadyForBootstrap(). We don't have a
StageManager, nor will we, so drop the function.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
Pekka Enberg
dd9d178502 service/migration_manager: Rename MIGRATION_DELAY_IN_MSEC
Rename "MIGRATION_DELAY_IN_MSEC" to "migration_delay" as the unit of
time is already clear from the type.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
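A minimal sketch of the idea behind the rename in dd9d178502, assuming the constant is a std::chrono duration. The value 60000 and the helper migration_delay_in_seconds() are hypothetical and only illustrate that the type, not the name, carries the unit.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical value; the real constant is defined in migration_manager
// and is not shown in this log.
constexpr std::chrono::milliseconds migration_delay{60000};

// Because the unit is part of the type, conversions are explicit and
// checked by the compiler instead of being implied by a name suffix.
constexpr std::chrono::seconds migration_delay_in_seconds() {
    return std::chrono::duration_cast<std::chrono::seconds>(migration_delay);
}
```

A caller that needs a different unit converts with std::chrono::duration_cast rather than dividing by a magic number.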
Pekka Enberg
feb6b7d316 service/migration_manager: Remove storage proxy arguments
Use get_storage_proxy() and get_local_storage_proxy() helpers under the
hood to simplify migration manager API users.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
Nadav Har'El
34b1cc42cd Initial repair support
This patch adds the beginning of node repair support. Repair is initiated
on a node using the REST API. For example, to repair all the column families
in the "try1" keyspace, you can use:

curl -X GET --header "Content-Type: application/json" --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/repair_async/try1"

I tested that the repair already works (it exchanges mutations with all other
replicas, and successfully repairs them), so I think it can be committed,
but more work is needed to complete it:

 1. Repair options are not yet supported (range repair, sequential/parallel
    repair, choice of hosts, datacenters and column families, etc.).

 2. *All* the data of the keyspace is exchanged - Merkle Trees (or an
    alternative optimization) and partial data exchange haven't been
    implemented yet.

 3. Full repair for nodes with multiple separate ranges is not yet
    implemented correctly. E.g., consider 10 nodes with vnodes and RF=2,
    so each vnode's range has a different host as a replica, so we need
    to exchange each key range separately with a different remote host.

 4. Our repair operation returns a numeric operation id (like Origin),
    but we don't yet provide any means to use this id to check on ongoing
    repairs like Origin allows.

 5. Error handling, logging, etc., need to be improved.

 6. SMP nodes (with multiple shards) should work correctly (thanks to
    Asias's latest patch for SMP mutation streaming) but haven't been
    tested.

 7. Incremental repair is not supported (see
    http://www.datastax.com/dev/blog/more-efficient-repairs)

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-08-05 13:26:36 +03:00
Avi Kivity
9217dbc82a Merge "streaming shard support" from Asias 2015-08-05 13:17:39 +03:00
Pekka Enberg
a49e16a762 service/migration_manager: Remove ifdef'd code
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:16:45 +03:00
Avi Kivity
5316abd3a6 Merge seastar upstream
* seastar eb2197d...c09488e (1):
  > rpc: fix bug allowing parallel writing into the same connection
2015-08-05 13:15:01 +03:00
Avi Kivity
55ca295154 Merge "Initial CQL event support" from Pekka
"This series implements initial support for CQL events. We introduce a
migration_listener hook in the migration manager, as well as an event
notifier in the CQL server that is built on top of it to send out the
events via the CQL binary protocol. We also wire up create keyspace
events to the system so subscribed clients are notified when a new
keyspace is created.

There's still more work to be done to support all the events. That
requires some work to restructure existing code so it's better to merge
this initial series now and avoid future code conflicts."
2015-08-05 12:56:37 +03:00
Avi Kivity
749347232a Merge seastar upstream
* seastar 6de00be...eb2197d (1):
  > net::dpdk: call rte_pktmbuf_reset() when recycling the mbuf

Fixes #66.
2015-08-05 12:52:45 +03:00
Pekka Enberg
618ba067bf database: Wire up create keyspace listener hook
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:52 +03:00
Pekka Enberg
05c23c7f73 database: Add create_keyspace_on_all() helper
Add a create_keyspace_on_all() helper which is needed for sending just
one event notification per created keyspace, not one per shard.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:52 +03:00
Pekka Enberg
66f7f05eef transport/server: Wire up event notifier to process_register()
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:52 +03:00
Pekka Enberg
07c8f0b1ac transport/server: Add read_string_list() helper
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:51 +03:00
Pekka Enberg
27068fe912 transport/server: Event notifier
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:51 +03:00
Pekka Enberg
6e29d51c0a transport/server: Add write_schema_change_event() helper
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:51 +03:00
Pekka Enberg
12d99bd282 service/migration_manager: Migration listener hooks
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:51 +03:00
Avi Kivity
a9ce60dc28 Merge "storage_service fixes" from Nadav 2015-08-05 11:47:28 +03:00
Avi Kivity
b92f36965e Merge seastar upstream
* seastar 947619e...6de00be (4):
  > net: prevent tcp from fragmenting packet headers
  > net: use malloc() in internal packet allocations
  > core/memory: Fix compilation of debug-mode version of stats()
  > memory: Expose more statistics over collectd
2015-08-05 11:42:55 +03:00
Asias He
d6c31a2668 system_keyspace: Fix more execute_cql using inet_address
We should pass inet_address.addr().

With this, tokens in system.peers are updated correctly.

(1 rows)
cqlsh> SELECT tokens from system.peers;

 tokens
------------------------------------------------------------------------
 {'-5463187748725106974', '8051017138680641610', '8833112506891013468'}

(1 rows)
2015-08-05 15:58:55 +08:00
Asias He
cb6fbee68b storage_service: Wire up remove_endpoint 2015-08-05 15:45:33 +08:00
Asias He
65e19203b0 storage_service: Enable more debug print 2015-08-05 15:32:38 +08:00
Asias He
f7be7c9cee storage_service: Wire up update_tokens in handle_state_normal 2015-08-05 15:29:32 +08:00
Asias He
1b7b199bdf system_keyspace: Fix remove_endpoint
I got this error when passing inet_address to it:

boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::bad_any_cast>
> (boost::bad_any_cast: failed conversion using boost::any_cast)
2015-08-05 15:29:32 +08:00
Asias He
8ccf7665e9 storage_service: Enable call to _token_metadata.remove_from_moving
This function is available now.
2015-08-05 15:29:32 +08:00
Asias He
3cb68f05f5 storage_service: Use update_normal_tokens to update tokens
Assume we have 3 tokens,

  {ee 36 d0 3e e8 6c 35 b1 , c5 5b 00 4a 1d 77 4e 50 , b9 b2 a1 0a 16 0d 76 8e }

With this

   for (auto t : tokens) {
       _token_metadata.update_normal_token(t, get_broadcast_address());
   }

Only the last token is inserted.

With this

   _token_metadata.update_normal_tokens(tokens, get_broadcast_address());

All 3 tokens are inserted correctly.
2015-08-05 15:29:32 +08:00
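One plausible explanation for the behaviour described in 3cb68f05f5, not confirmed by the commit message: if the single-token call is a thin wrapper that replaces the endpoint's whole token set (as Origin's TokenMetadata does), then each call in the loop discards the tokens inserted before it. The class token_map_sketch below is a hypothetical, simplified stand-in for token_metadata, with strings in place of tokens and addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_set>

// Hypothetical, simplified stand-in for token_metadata: token -> endpoint.
class token_map_sketch {
    std::map<std::string, std::string> _token_to_endpoint;
public:
    // Replaces the endpoint's entire token set with `tokens`.
    void update_normal_tokens(const std::unordered_set<std::string>& tokens,
                              const std::string& endpoint) {
        // Drop every token currently owned by this endpoint.
        for (auto it = _token_to_endpoint.begin(); it != _token_to_endpoint.end();) {
            if (it->second == endpoint) {
                it = _token_to_endpoint.erase(it);
            } else {
                ++it;
            }
        }
        for (const auto& t : tokens) {
            _token_to_endpoint[t] = endpoint;
        }
    }

    // Assumed to be a wrapper around the batch call with a one-element set,
    // so it also removes the endpoint's previous tokens.
    void update_normal_token(const std::string& token, const std::string& endpoint) {
        update_normal_tokens(std::unordered_set<std::string>{token}, endpoint);
    }

    std::size_t token_count(const std::string& endpoint) const {
        std::size_t n = 0;
        for (const auto& entry : _token_to_endpoint) {
            if (entry.second == endpoint) {
                ++n;
            }
        }
        return n;
    }
};
```

Under that assumption, calling update_normal_token() three times leaves one token for the endpoint, while a single update_normal_tokens() call with all three leaves three.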
Raphael S. Carvalho
1a3604f3c2 sstables: add a comment describing some sstable fields.
The reason is that the reader may think that these fields store
some statistics information about a sstable just loaded, but
they are only used when writing a new sstable.
Now I'm starting to see the value of having a sstable class for
a sstable loaded and another one for a sstable being created
(that's what Origin does).

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-08-05 10:24:02 +03:00
Avi Kivity
52dea3ac02 Merge "storage_service update" from Asias
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-08-04 18:13:54 +03:00
Tomasz Grabiec
eba7121bf5 Merge branch 'pdziepak/select-distinct/v2' from seastar-dev.git
From Pawel:

This series fixes SELECT DISTINCT statements. Previously, we relied on the
existence of a static row to get proper results. That obviously doesn't work
when there is no static row in the partition. The solution is to introduce
a new partition_slice option, distinct, which indicates that the only
important information is the static row and whether the partition
exists.
2015-08-04 17:00:56 +02:00
Avi Kivity
f393ba5410 Merge "Improve range queries"
(or rather, improve them in the future when they use make_local_reader)

Since shard data is now disjoint, read shards in order rather than
concurrently.
2015-08-04 17:24:53 +03:00
Avi Kivity
8d050b679a db: improve make_local_reader()
Instead of merging shard data using make_combined_reader(), take advantage
of the fact that shard data is disjoint, and use make_joining_reader().
This removes the need to sort the partitions as they are being read.
2015-08-04 17:11:39 +03:00
Avi Kivity
951eef2945 mutation_reader: add make_lazy_reader
Construct the reader on first use.  Useful with make_joining_reader().
2015-08-04 16:55:31 +03:00
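A minimal sketch of the "construct on first use" idea in 951eef2945. The reader type here is a hypothetical simplification (a callable that yields int keys and returns false at end of stream); the real mutation_reader interface is asynchronous and is not shown in this log.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical simplified reader: writes the next key to `out`,
// returns false once the stream is exhausted.
using reader = std::function<bool(int& out)>;

// Wraps a factory so the underlying reader is only built when the
// first read is attempted.
reader make_lazy_reader(std::function<reader()> factory) {
    return [factory = std::move(factory), underlying = reader()](int& out) mutable {
        if (!underlying) {
            underlying = factory();  // built once, on first use
        }
        return underlying(out);
    };
}
```

This pairs naturally with a reader that consumes its inputs one after another: inputs that are never reached are never constructed.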
Paweł Dziepak
b4d967fcf2 tests/cql3: add tests for SELECT DISTINCT
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-08-04 15:39:57 +02:00
Paweł Dziepak
7a7919a62e cql3: set properly partition_slice for SELECT DISTINCT
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-08-04 15:39:54 +02:00
Paweł Dziepak
8a0d21b8b8 query: support option distinct in partition_slice
In the case of SELECT DISTINCT statements we are not interested in
clustering keys at all. The only important information is whether the
partition key exists and what is in the static row (if it exists).

Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-08-04 15:39:42 +02:00
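A minimal sketch of what a distinct flag on a slice could mean, under the assumptions stated in 8a0d21b8b8. All types and the function query_sketch() are hypothetical and use strings in place of real keys and cells; they are not the project's partition_slice or query API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified partition: a key, an optional static row,
// and some clustering rows.
struct partition_sketch {
    std::string key;
    bool has_static_row;
    std::string static_value;
    std::vector<std::string> clustering_rows;
};

// Hypothetical stand-in for the slice options.
struct slice_options_sketch {
    bool distinct = false;
};

struct result_row_sketch {
    std::string key;
    std::string static_value;      // empty when there is no static row
    std::string clustering_value;  // empty for DISTINCT results
};

std::vector<result_row_sketch> query_sketch(const std::vector<partition_sketch>& partitions,
                                            slice_options_sketch opts) {
    std::vector<result_row_sketch> rows;
    for (const auto& p : partitions) {
        const std::string s = p.has_static_row ? p.static_value : std::string();
        if (opts.distinct) {
            // One row per existing partition, whether or not it has a static row.
            rows.push_back({p.key, s, ""});
            continue;
        }
        for (const auto& c : p.clustering_rows) {
            rows.push_back({p.key, s, c});
        }
    }
    return rows;
}
```

With distinct set, a partition that has clustering rows but no static row still produces exactly one result row, which is the case the earlier approach missed.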
Paweł Dziepak
71e7d3bc20 partition_slice: add distinct option
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-08-04 15:38:36 +02:00
Nadav Har'El
1d4c1eda51 ninja: add "clean" target
This patch adds a "ninja clean", better than the current "ninja -t clean".

Ninja's "ninja -t clean" is a nice trick, designed to save the Makefile writer
the tedious chore of listing the targets to remove, by automatically gathering
this list. But our build system, following OSv's one, actually uses a much
cooler (and better) trick: All build files are generated in a single
subdirectory, "build/", and cleaning the build products is as simple as
"rm -rf build".

So this patch adds a target, "ninja clean", which does exactly this (rm -rf
build). "ninja clean" is not only easier to type than "ninja -t clean", it
also has one important benefit: When the ninja rules change, "ninja -t clean"
doesn't remember to delete now-defunct targets, and they stay behind. On my
build machine, "ninja -t clean" left behind almost a gigabyte of old crap.
Moreover, when the ninja file changes drastically (as it changed a few days
ago), not cleaning up everything can even cause new builds to break - e.g.,
when something was previously a file and now needs to be a directory.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-08-04 16:00:09 +03:00
Avi Kivity
318cc489c8 mutation_reader: add make_joining_reader()
Reads from provided readers, in order (assumes provided readers are
disjoint and in required order).
2015-08-04 15:49:09 +03:00
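A minimal sketch of the joining idea in 318cc489c8: drain the inputs one after another instead of merging them. The reader type and make_vector_reader() are hypothetical simplifications over int keys; the real mutation_reader interface is asynchronous and is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical simplified reader: writes the next key to `out`,
// returns false once the stream is exhausted.
using reader = std::function<bool(int& out)>;

// Helper for the sketch: a reader over an in-memory vector of keys.
reader make_vector_reader(std::vector<int> keys) {
    return [keys = std::move(keys), pos = std::size_t{0}](int& out) mutable {
        if (pos == keys.size()) {
            return false;
        }
        out = keys[pos++];
        return true;
    };
}

// Reads each input to exhaustion before moving to the next one.
// The output is only ordered if the inputs are disjoint and already
// supplied in the required order; nothing is compared or sorted here.
reader make_joining_reader(std::vector<reader> readers) {
    return [readers = std::move(readers), current = std::size_t{0}](int& out) mutable {
        while (current < readers.size()) {
            if (readers[current](out)) {
                return true;
            }
            ++current;  // this input is exhausted, move on
        }
        return false;
    };
}
```

Unlike a merging reader, this never compares keys across inputs, which is why it relies on the disjointness assumption stated in the commit message.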
Asias He
2aec155f50 storage_service: Add debug print to dump token to endpoint map 2015-08-04 20:39:33 +08:00
Asias He
c42df5b40d token_metadata: Enable remove_from_moving 2015-08-04 20:39:33 +08:00
Asias He
2250123654 storage_service: Enable code for remove_endpoint in handle_state_normal 2015-08-04 20:39:33 +08:00
Asias He
9dc155a8ef token_metadata: Make get_endpoint_for_host_id return optional 2015-08-04 20:39:32 +08:00
Asias He
66f5cfaf39 token_metadata: Add remove_endpoint 2015-08-04 20:39:32 +08:00
Asias He
ba1a8c5ad7 storage_service: Enable debug print for tokens 2015-08-04 20:39:32 +08:00
Asias He
e572101433 to_string: Support print std::unordered_set
With this, we can do:

std::unordered_set<dht::token> tokens;
logger.debug("tokens = {}", tokens);
2015-08-04 20:39:32 +08:00
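A minimal sketch of how an unordered set can be rendered as a string, in the spirit of e572101433. The function name to_string_sketch() and the "{a, b}" output format are assumptions; the project's actual to_string helper and its format are not shown in this log.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <unordered_set>

// Hypothetical helper: renders the elements as "{a, b, c}".
// Element order follows the set's iteration order, which is unspecified.
template <typename T>
std::string to_string_sketch(const std::unordered_set<T>& items) {
    std::ostringstream out;
    out << "{";
    bool first = true;
    for (const auto& item : items) {
        if (!first) {
            out << ", ";
        }
        out << item;
        first = false;
    }
    out << "}";
    return out.str();
}
```

Any element type with an operator<< overload works with this sketch, so log statements can format a whole set in one argument.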
Asias He
a7b9a8faed storage_service: Remove debug print for tokens in on_join 2015-08-04 20:26:34 +08:00