5447 Commits

Author SHA1 Message Date
Asias He
93de64a061 storage_service: Add helper to get property and friends
They are used but we don't support them yet. Add stub helpers for now.
2015-08-06 15:23:51 +08:00
Glauber Costa
c2eca19737 sstable_test: fix check_toc_func
We are currently failing the sstable test. The reason is that we use the store()
function for test purposes, and that function does not store the TOC component.
It was removed by Aviccident in 3a5e3c88.

Because that function is only used for testing purposes, it doesn't need to write
the Index and Data components: we can then remove them from the list.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-06 10:11:55 +03:00
Glauber Costa
fb34ac7f65 database: fix scan_dir
When probing for the type, I have made the classical mistake of using
as a parameter part of a structure that is moved into the capture. That
is what broke our tests.

But also, when stat'ing, de.name will give us only the component relative to
the current path. We need to add the directory so the stat will succeed.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-06 10:10:56 +03:00
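Note on the "classical mistake" described in fb34ac7f65: a minimal sketch, assuming the bug had the usual shape where one argument reads a member of an object while another argument moves that same object into a lambda capture. C++ does not specify the order in which function arguments are evaluated, so the member may be read after the move. The names `dir_entry`, `probe` and `probe_entry` are hypothetical and are not taken from the actual patch.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the directory scanner's entry type.
struct dir_entry {
    std::string name;
};

// Takes a value plus a callable that owns the entry.
template <typename Func>
std::string probe(std::string name, Func func) {
    return name + "|" + func();
}

// Risky form (not compiled here):
//     probe(de.name, [de = std::move(de)] { return de.name; });
// If the capture is initialized before de.name is copied, the first
// argument is read from a moved-from object.

// Safe form: copy what is needed into a local before the move happens.
std::string probe_entry(dir_entry de) {
    std::string name = de.name;  // sequenced before the call below
    return probe(std::move(name), [de = std::move(de)] { return de.name; });
}
```

With the local copy in place, `probe_entry(dir_entry{"Data.db"})` yields the same name on both sides of the separator regardless of how the compiler orders argument evaluation.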
Glauber Costa
ece8f01d06 database: make sure a type is present.
Our directory scanner currently requires a type to be passed, and we have a
FIXME saying that we should stat when there is none. In some filesystems,
in particular, XFS, getdents won't return a type, meaning we should manually
probe it.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-05 22:52:07 +03:00
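Note on ece8f01d06 and the stat path fix in fb34ac7f65: the fallback can be sketched with plain POSIX calls. This is only an illustration under assumptions, not the Seastar-based code from the patches. `entry_type`, `type_from_mode` and `resolve_type` are hypothetical names; `DT_UNKNOWN`, `DT_REG`, `DT_DIR` and `stat()` are the standard dirent/stat facilities.

```cpp
#include <cassert>
#include <dirent.h>
#include <sys/stat.h>
#include <string>

// Hypothetical, simplified classification of a directory entry.
enum class entry_type { regular, directory, other };

inline entry_type type_from_mode(mode_t mode) {
    if (S_ISREG(mode)) return entry_type::regular;
    if (S_ISDIR(mode)) return entry_type::directory;
    return entry_type::other;
}

// Resolve the type of `name` inside `dir`. When the filesystem reports
// DT_UNKNOWN, fall back to stat() on the full path: the entry name is
// relative to `dir`, so stat(name) alone would look in the wrong place.
inline entry_type resolve_type(const std::string& dir, const std::string& name,
                               unsigned char d_type) {
    switch (d_type) {
    case DT_REG:
        return entry_type::regular;
    case DT_DIR:
        return entry_type::directory;
    case DT_UNKNOWN: {
        struct stat st;
        const std::string path = dir + "/" + name;
        if (::stat(path.c_str(), &st) != 0) {
            return entry_type::other;  // assumption: treat errors as "other"
        }
        return type_from_mode(st.st_mode);
    }
    default:
        return entry_type::other;
    }
}
```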
Pekka Enberg
464c96d901 db/schema_tables: Wire up CF creation CQL event
The code in merge_tables() is a twisted maze of tricks that is hard to
restructure so that event notification can be done cleanly like with
keyspaces.

The problem there is that we need to run a bunch of database operations
for the merging that really need to happen on all the shards.  To fix
the issue, let's cheat a little and simply only run CQL event
notification on cpu zero.

This seems to fix cluster schema propagation issues in urchin-dtest. I
can now run TestSimpleCluster.simple_create_insert_select_test without
any additional delays inserted into the test code.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 17:33:59 +03:00
Avi Kivity
891b0fe6fc Merge seastar upstream
* seastar c09488e...6f1dd3c (3):
  > net: make udp send more robust wrt. errors
  > net: remove packet constructors with template Deleter parameter
  > memory: Support for discovering allocator's address range
2015-08-05 17:23:32 +03:00
Pekka Enberg
e5ca713e72 transport/server: Fix schema change event encoding
We also need to encode the event type in the response message. Fixes the
following dtest breakage:

  cassandra.connection: ERROR: Error decoding response from Cassandra. opcode: 000c; message contents: '\x83\x00\xff\xff\x0c\x00\x00\x00\x17\x00\x07CREATED\x00\x08KEYSPACE\x00\x02ks'
  Traceback (most recent call last):
    File "/usr/lib64/python2.7/site-packages/cassandra/connection.py", line 431, in process_msg
      flags, opcode, body, self.decompressor)
    File "/usr/lib64/python2.7/site-packages/cassandra/protocol.py", line 123, in decode_response
      msg = msg_class.recv_body(body, protocol_version, user_type_map)
    File "/usr/lib64/python2.7/site-packages/cassandra/protocol.py", line 803, in recv_body
      raise NotSupportedError('Unknown event type %r' % event_type)
  NotSupportedError: Unknown event type u'CREATED'

Reported-by: Shlomi Livne <shlomi@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 15:16:50 +03:00
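Note on e5ca713e72: the rejected message body in the trace starts directly with "CREATED", so the driver read it as the event type. A rough sketch of a body that leads with the event type follows, assuming the native protocol v3 layout for a keyspace-level change (event type, change type, target, keyspace, each as a [string] with a 2-byte big-endian length). `write_string` and `encode_schema_change_event` are hypothetical names and are not the project's write_schema_change_event() helper.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append a protocol [string]: 2-byte big-endian length, then the bytes.
inline void write_string(std::string& out, const std::string& s) {
    const auto n = static_cast<std::uint16_t>(s.size());
    out.push_back(static_cast<char>(n >> 8));
    out.push_back(static_cast<char>(n & 0xff));
    out += s;
}

// Body of an EVENT message for a keyspace-level schema change.
inline std::string encode_schema_change_event(const std::string& change,
                                              const std::string& keyspace) {
    std::string body;
    write_string(body, "SCHEMA_CHANGE");  // event type, missing in the failing message
    write_string(body, change);           // e.g. "CREATED"
    write_string(body, "KEYSPACE");       // target of the change
    write_string(body, keyspace);         // e.g. "ks"
    return body;
}
```

The failing body was 0x17 (23) bytes long; with the 15-byte event type string prepended, the same event occupies 38 bytes.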
Avi Kivity
522f23b830 Merge "Schema table cleanups" from Pekka
"Clean up the schema table code. Be explicit that we don't support
Cassandra 3.0 and eliminate some dead code."
2015-08-05 15:09:59 +03:00
Avi Kivity
6b2be41df0 tests: give cql_test_env a directory
While supposedly running in memory, looks like it still wants a data
directory.  Give it one.
2015-08-05 15:05:50 +03:00
Avi Kivity
c720cddc5c tests: mv tests/urchin/* -> tests/
Now that seastar is in a separate repository, we can use the tests/
directory.
2015-08-05 14:16:52 +03:00
Raphael S. Carvalho
3ddb9be984 db: fix compaction on an empty column family
When forcing a compaction on a column family with no sstables, an
assert will fail because there are no sstables to be compacted.
This problem is fixed by ignoring a compaction request when no
sstable is provided.

Fixes #61.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-08-05 14:04:22 +03:00
Takuya ASADA
a431c49731 .spec file and scripts to build RPM for Fedora
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-08-05 14:00:46 +03:00
Avi Kivity
2d0dfcd491 Merge "Add config yaml file" from Shlomi 2015-08-05 13:58:51 +03:00
Pekka Enberg
45e5eff544 db/schema_tables: Remove ifdef'd code
We already have all_tables() function converted and there's really no
use for compile() unless we switch to using CQL to create the schema
tables.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:56:50 +03:00
Pekka Enberg
a355c83c6c db/schema_tables.hh: Remove obsolete comment
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:56:49 +03:00
Pekka Enberg
99a80050e3 db: Rename legacy_schema_tables to schema_tables
There's nothing legacy about it so rename legacy_schema_tables to
schema_tables. The naming comes from a Cassandra 3.x development branch
which is not relevant for us in the near future.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:56:47 +03:00
Avi Kivity
3aad7c9c19 Merge "Migration manager cleanups" from Pekka
"Clean up various issues that I've stumbled across in the migration
manager code."
2015-08-05 13:48:21 +03:00
Pekka Enberg
d743f6df50 service/migration_manager: Fix error handling in announce()
Propagate exceptions from migration_manager::announce() to the callers.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
Pekka Enberg
0793a7849f service/migration_manager: Fix logger name
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
Pekka Enberg
c281e6f1c3 service/migration_manager: Use get_local_gossiper()
Use the get_local_gossiper() helper instead of open-coding it.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
Pekka Enberg
d70091e1fe service/migration_manager: Remove isReadyForBootstrap()
Remove commented out isReadyForBootstrap(). We don't have a StageManager,
nor will we, so drop the function.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
Pekka Enberg
dd9d178502 service/migration_manager: Rename MIGRATION_DELAY_IN_MSEC
Rename "MIGRATION_DELAY_IN_MSEC" to "migration_delay" as the unit of
time is already clear from the type.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
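Note on dd9d178502: the point that the unit is carried by the type can be illustrated with std::chrono. This is a sketch only; the 60000 ms value and the `migration_delay_in_seconds` helper are assumptions for illustration and do not come from the commit.

```cpp
#include <cassert>
#include <chrono>

// The type states the unit, so the name does not have to.
// The value is a placeholder, not the project's setting.
constexpr std::chrono::milliseconds migration_delay{60000};

// Conversions are explicit and checked by the type system.
inline std::chrono::seconds migration_delay_in_seconds() {
    return std::chrono::duration_cast<std::chrono::seconds>(migration_delay);
}
```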
Pekka Enberg
feb6b7d316 service/migration_manager: Remove storage proxy arguments
Use get_storage_proxy() and get_local_storage_proxy() helpers under the
hood to simplify migration manager API users.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:31:46 +03:00
Nadav Har'El
34b1cc42cd Initial repair support
This patch adds the beginning of node repair support. Repair is initiated
on a node using the REST API, for example to repair all the column families
in the "try1" keyspace, you can use:

curl -X GET --header "Content-Type: application/json" --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/repair_async/try1"

I tested that the repair already works (exchanges mutations with all other
replicas, and successfully repairs them), so I think it can be committed,
but it will need more work to be complete:

 1. Repair options are not yet supported (range repair, sequential/parallel
    repair, choice of hosts, datacenters and column families, etc.).

 2. *All* the data of the keyspace is exchanged - Merkle Trees (or an
    alternative optimization) and partial data exchange haven't been
    implemented yet.

 3. Full repair for nodes with multiple separate ranges is not yet
    implemented correctly. E.g., consider 10 nodes with vnodes and RF=2,
    so each vnode's range has a different host as a replica, so we need
    to exchange each key range separately with a different remote host.

 4. Our repair operation returns a numeric operation id (like Origin),
    but we don't yet provide any means to use this id to check on ongoing
    repairs like Origin allows.

 5. Error handling, logging, etc., need to be improved.

 6. SMP nodes (with multiple shards) should work correctly (thanks to
    Asias's latest patch for SMP mutation streaming) but haven't been
    tested.

 7. Incremental repair is not supported (see
    http://www.datastax.com/dev/blog/more-efficient-repairs)

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-08-05 13:26:36 +03:00
Avi Kivity
9217dbc82a Merge "streaming shard support" from Asias 2015-08-05 13:17:39 +03:00
Pekka Enberg
a49e16a762 service/migration_manager: Remove ifdef'd code
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:16:45 +03:00
Avi Kivity
5316abd3a6 Merge seastar upstream
* seastar eb2197d...c09488e (1):
  > rpc: fix bug allowing parallel writing into the same connection
2015-08-05 13:15:01 +03:00
Avi Kivity
55ca295154 Merge "Initial CQL event support" from Pekka
"This series implements initial support for CQL events. We introduce
migration_listener hook in migration manager as well as event notifier
in the CQL server that's built on top of it to send out the events via
CQL binary protocol. We also wire up create keyspace events to the
system so subscribed clients are notified when a new keyspace is
created.

There's still more work to be done to support all the events. That
requires some work to restructure existing code so it's better to merge
this initial series now and avoid future code conflicts."
2015-08-05 12:56:37 +03:00
Avi Kivity
749347232a Merge seastar upstream
* seastar 6de00be...eb2197d (1):
  > net::dpdk: call rte_pktmbuf_reset() when recycling the mbuf

Fixes #66.
2015-08-05 12:52:45 +03:00
Pekka Enberg
618ba067bf database: Wire up create keyspace listener hook
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:52 +03:00
Pekka Enberg
05c23c7f73 database: Add create_keyspace_on_all() helper
Add a create_keyspace_on_all() helper which is needed for sending just
one event notification per created keyspace, not one per shard.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:52 +03:00
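Note on 05c23c7f73: the idea of creating the keyspace on every shard while notifying listeners only once can be modelled without Seastar. This is a simplified, single-threaded sketch; `shard_model` and `create_keyspace_on_all_model` are hypothetical names, and the real helper is asynchronous.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical per-shard state: just the list of known keyspaces.
struct shard_model {
    std::vector<std::string> keyspaces;
};

// Apply the creation to every shard, then notify exactly once,
// instead of notifying from inside the per-shard step.
inline void create_keyspace_on_all_model(
        std::vector<shard_model>& shards, const std::string& name,
        const std::function<void(const std::string&)>& notify) {
    for (auto& shard : shards) {
        shard.keyspaces.push_back(name);
    }
    notify(name);  // one event per created keyspace, not one per shard
}
```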
Pekka Enberg
66f7f05eef transport/server: Wire up event notifier to process_register()
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:52 +03:00
Pekka Enberg
07c8f0b1ac transport/server: Add read_string_list() helper
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:51 +03:00
Pekka Enberg
27068fe912 transport/server: Event notifier
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:51 +03:00
Pekka Enberg
6e29d51c0a transport/server: Add write_schema_change_event() helper
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:51 +03:00
Pekka Enberg
12d99bd282 service/migration_manager: Migration listener hooks
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 11:50:51 +03:00
Avi Kivity
a9ce60dc28 Merge "storage_service fixes" from Nadav 2015-08-05 11:47:28 +03:00
Avi Kivity
b92f36965e Merge seastar upstream
* seastar 947619e...6de00be (4):
  > net: prevent tcp from fragmenting packet headers
  > net: use malloc() in internal packet allocations
  > core/memory: Fix compilation of debug-mode version of stats()
  > memory: Expose more statistics over collectd
2015-08-05 11:42:55 +03:00
Asias He
d6c31a2668 system_keyspace: Fix more execute_cql using inet_address
We should pass inet_address.addr().

With this, tokens in system.peers are updated correctly.

(1 rows)
cqlsh> SELECT tokens from system.peers;

 tokens
------------------------------------------------------------------------
 {'-5463187748725106974', '8051017138680641610', '8833112506891013468'}

(1 rows)
2015-08-05 15:58:55 +08:00
Asias He
cb6fbee68b storage_service: Wire up remove_endpoint 2015-08-05 15:45:33 +08:00
Asias He
65e19203b0 storage_service: Enable more debug print 2015-08-05 15:32:38 +08:00
Asias He
f7be7c9cee storage_service: Wire up update_tokens in handle_state_normal 2015-08-05 15:29:32 +08:00
Asias He
1b7b199bdf system_keyspace: Fix remove_endpoint
I got this error if I pass inet_address to it.

boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::bad_any_cast>
> (boost::bad_any_cast: failed conversion using boost::any_cast)
2015-08-05 15:29:32 +08:00
Asias He
8ccf7665e9 storage_service: Enable call to _token_metadata.remove_from_moving
This function is available now.
2015-08-05 15:29:32 +08:00
Asias He
3cb68f05f5 storage_service: Use update_normal_tokens to update tokens
Assume we have 3 tokens,

  {ee 36 d0 3e e8 6c 35 b1 , c5 5b 00 4a 1d 77 4e 50 , b9 b2 a1 0a 16 0d 76 8e }

With this

   for (auto t : tokens) {
       _token_metadata.update_normal_token(t, get_broadcast_address());
   }

Only the last token is inserted.

With this

   _token_metadata.update_normal_tokens(tokens, get_broadcast_address());

All 3 tokens are inserted correctly.
2015-08-05 15:29:32 +08:00
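Note on 3cb68f05f5: one plausible explanation for the behaviour described, offered as an assumption rather than a reading of the actual code, is that the single-token update replaces all tokens already owned by the endpoint (as Origin's TokenMetadata does). The toy model below reproduces the symptom under that assumption; `token_metadata_model` and its members are hypothetical, and tokens are plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

class token_metadata_model {
    std::map<std::string, std::string> _token_to_endpoint;
public:
    // Replaces every token currently owned by `endpoint` with `tokens`.
    void update_normal_tokens(const std::vector<std::string>& tokens,
                              const std::string& endpoint) {
        for (auto it = _token_to_endpoint.begin(); it != _token_to_endpoint.end();) {
            if (it->second == endpoint) {
                it = _token_to_endpoint.erase(it);
            } else {
                ++it;
            }
        }
        for (const auto& t : tokens) {
            _token_to_endpoint[t] = endpoint;
        }
    }

    // Assumed behaviour: a one-element call to the plural version, so each
    // call drops the endpoint's previously inserted tokens.
    void update_normal_token(const std::string& token, const std::string& endpoint) {
        update_normal_tokens({token}, endpoint);
    }

    std::size_t token_count(const std::string& endpoint) const {
        std::size_t n = 0;
        for (const auto& kv : _token_to_endpoint) {
            if (kv.second == endpoint) ++n;
        }
        return n;
    }
};
```

In this model, calling update_normal_token() in a loop over three tokens leaves one token for the endpoint, while a single update_normal_tokens() call leaves all three.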
Raphael S. Carvalho
1a3604f3c2 sstables: add a comment describing some sstable fields.
The reason is that the reader may think that these fields store
some statistics information about a sstable just loaded, but
they are only used when writing a new sstable.
Now I'm starting to see the value of having a sstable class for
a sstable loaded and another one for a sstable being created
(that's what Origin does).

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-08-05 10:24:02 +03:00
Avi Kivity
52dea3ac02 Merge "storage_service update" from Asias
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-08-04 18:13:54 +03:00
Tomasz Grabiec
eba7121bf5 Merge branch 'pdziepak/select-distinct/v2' from seastar-dev.git
From Pawel:

This series fixes SELECT DISTINCT statements. Previously, we relied on the
existence of a static row to get proper results. That obviously doesn't work
when there is no static row in the partition. The solution for that is
to introduce a new option to partition_slice: distinct, which informs that
the only important information is the static row and whether the partition
exists.
2015-08-04 17:00:56 +02:00
Avi Kivity
f393ba5410 Merge "Improve range queries"
(or rather, improve them in the future when they use make_local_reader)

Since shard data is now disjoint, read shards in order rather than
concurrently.
2015-08-04 17:24:53 +03:00
Avi Kivity
8d050b679a db: improve make_local_reader()
Instead of merging shard data using make_combined_reader(), take advantage
of the fact that shard data is disjoint, and use make_joining_reader().
This removes the need to sort the partitions as they are being read.
2015-08-04 17:11:39 +03:00
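Note on 8d050b679a: the difference between combining and joining can be shown on plain integer keys. This is a sketch under the assumption that each shard's keys are sorted and that every key on one shard precedes every key on the next. `read_combined` and `read_joined` are hypothetical names and do not reproduce the reader interfaces in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

using shard_data = std::vector<std::vector<int>>;

// Combining: k-way merge that must compare keys across all shards.
inline std::vector<int> read_combined(const shard_data& shards) {
    using cursor = std::tuple<int, std::size_t, std::size_t>;  // key, shard, position
    std::priority_queue<cursor, std::vector<cursor>, std::greater<cursor>> heap;
    for (std::size_t s = 0; s < shards.size(); ++s) {
        if (!shards[s].empty()) {
            heap.emplace(shards[s][0], s, std::size_t{0});
        }
    }
    std::vector<int> out;
    while (!heap.empty()) {
        const cursor top = heap.top();
        heap.pop();
        out.push_back(std::get<0>(top));
        const std::size_t s = std::get<1>(top);
        const std::size_t next = std::get<2>(top) + 1;
        if (next < shards[s].size()) {
            heap.emplace(shards[s][next], s, next);
        }
    }
    return out;
}

// Joining: with disjoint, ordered shards, reading them one after another
// already yields sorted output, with no cross-shard comparisons.
inline std::vector<int> read_joined(const shard_data& shards) {
    std::vector<int> out;
    for (const auto& shard : shards) {
        out.insert(out.end(), shard.begin(), shard.end());
    }
    return out;
}
```

Both functions return the same sequence when the disjointness assumption holds; only the joined version skips the sorting work.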