7340 Commits

Author SHA1 Message Date
Avi Kivity
f3afe3e876 allocation_strategy: constify migrate_fn
Since abstract_type will be providing our migrate_fn, they must be const,
and indeed a migration does not change the migration function.
2015-11-13 17:13:07 +02:00
Avi Kivity
a4a776e66c cql3: add operation::make_cell() helpers
atomic_cell will soon become type-aware, so add helpers to class operation
that can supply the type, as it is available in operation::column.type.

(the type will be used in following patches)
2015-11-13 17:13:07 +02:00
Avi Kivity
79f7431a03 db: change collection_mutation::{one,view} not to use nested classes
Nested classes cannot be forward-declared, so change the naming
not to use them.  Follows atomic_cell{,_view}.
2015-11-13 17:13:07 +02:00
Avi Kivity
3fcb7add2e types: fix concrete_type::native_type_move()
The source is modified during a move, and so must not be const.
2015-11-13 17:13:07 +02:00
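The reasoning in 3fcb7add2e can be illustrated with a minimal sketch. This is not Scylla's code: `payload` and both helper functions are hypothetical names. It shows that calling std::move on a const source silently selects the copy constructor, so the source is left untouched.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that records whether it has been moved from.
struct payload {
    int value;
    bool moved_from = false;
    explicit payload(int v) : value(v) {}
    payload(const payload& other) : value(other.value) {}
    payload(payload&& other) noexcept : value(other.value) {
        other.value = 0;          // a move modifies its source
        other.moved_from = true;
    }
};

// std::move(src) yields `const payload&&`, which can only bind to the
// copy constructor: the "move" degrades into a copy.
inline bool source_modified_when_const(const payload& src) {
    payload dst(std::move(src));
    (void)dst;
    return src.moved_from;
}

// With a non-const source the move constructor is selected.
inline bool source_modified_when_mutable(payload& src) {
    payload dst(std::move(src));
    (void)dst;
    return src.moved_from;
}
```

Only the non-const variant reports that the source was modified, which is why a move helper has to take its source as non-const.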
Avi Kivity
68a902ad0c data_value: add constructor from bool
schema_tables manages some boolean columns stored in system tables; it
dynamically creates them from C++ values.  But as we lacked bool->data_value
conversion, the C++ value was converted to an int32_type.  Somehow this didn't
cause any problems, but with some pending patches I have, it does.

Add a bool->data_value converting constructor to fix this.
2015-11-13 17:13:07 +02:00
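A minimal sketch of how a bool can end up typed as a 32-bit integer, assuming the cause is ordinary C++ overload resolution (the commit message does not spell out the mechanism). `kind`, `value_without_bool` and `value_with_bool` are hypothetical stand-ins, not Scylla's data_value.

```cpp
#include <cstdint>

enum class kind { int32, boolean };

// Before: only an int32_t constructor exists, so a bool argument is
// promoted to int and the value is tagged as int32.
struct value_without_bool {
    kind type;
    constexpr value_without_bool(std::int32_t) : type(kind::int32) {}
};

// After: a dedicated bool constructor is an exact match for a bool
// argument and wins overload resolution.
struct value_with_bool {
    kind type;
    constexpr value_with_bool(std::int32_t) : type(kind::int32) {}
    constexpr value_with_bool(bool) : type(kind::boolean) {}
};
```

With the extra constructor, `value_with_bool(true)` is tagged as a boolean while integer arguments still select the int32_t overload.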
Avi Kivity
47499dcf18 data_value: make conversion from bytes explicit
Since bytes is a very generic value that is returned from many calls,
it is easy to pass it by mistake to a function expecting a data_value,
and to get a wrong result.  It is impossible for the data_value constructor
to know if the argument is a genuine bytes variable, a data_value of another
type, but serialized, or some other serialized data type.

To prevent misuse, make the data_value(bytes) constructor
(and the complementary data_value(optional<bytes>) constructor) explicit.
2015-11-13 17:12:29 +02:00
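The effect of the `explicit` keyword described in 47499dcf18 can be sketched as follows. The `bytes` alias and the `value` class are hypothetical stand-ins for Scylla's types, used only to show the language rule.

```cpp
#include <string>
#include <type_traits>
#include <utility>

using bytes = std::string;  // stand-in for a generic byte string (assumption)

// Hypothetical value type with an explicit constructor from bytes.
struct value {
    bytes serialized;
    explicit value(bytes b) : serialized(std::move(b)) {}
};

// A function expecting a value: passing a plain `bytes` no longer compiles,
// the caller has to write consume(value(b)) and so state the intent.
inline std::size_t consume(const value& v) {
    return v.serialized.size();
}
```

Here `consume(value(b))` compiles while `consume(b)` is rejected by the compiler, which is the kind of accidental conversion the commit wants to rule out.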
Tomasz Grabiec
f3f2bf0b44 schema: Move definitions to source file 2015-11-12 13:50:01 +02:00
Avi Kivity
6a9ed4a4eb Merge seastar upstream
* seastar 5c10d3e...20bf03b (5):
  > do not re-throw exception to get to an exception pointer
  > Adding timeout counter to the rpc
  > configure.py: support for pkg-config before release 0.28
  > future: don't forget to warn about ignored exception
  > tutorial: continue network API section
2015-11-12 11:19:52 +02:00
Asias He
6aa5bfe59f range_streamer: Add virtual destructor to i_source_filter
Found by debug build

==10190==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x602000084430 in thread T0:
  object passed to delete has wrong type:
  size of the allocated type:   16 bytes;
  size of the deallocated type: 8 bytes.
    #0 0x7fe244add512 in operator delete(void*, unsigned long) (/lib64/libasan.so.2+0x9a512)
    #1 0x3c674fe in std::default_delete<dht::range_streamer::i_source_filter>::operator()(dht::range_streamer::i_source_filter*)
       const /usr/include/c++/5.1.1/bits/unique_ptr.h:76
    #2 0x3c60584 in std::unique_ptr<dht::range_streamer::i_source_filter, std::default_delete<dht::range_streamer::i_source_filter> >::~unique_ptr()
       /usr/include/c++/5.1.1/bits/unique_ptr.h:236
    #3 0x3c7ac22 in void __gnu_cxx::new_allocator<std::unique_ptr<dht::range_streamer::i_source_filter,
       std::default_delete<dht::range_streamer::i_source_filter> > >::destroy<std::unique_ptr<dht::range_streamer::i_source_filter,
       std::default_delete<dht::range_streamer::i_source_filter> > >(std::unique_ptr<dht::range_streamer::i_source_filter,
       std::default_delete<dht::range_streamer::i_source_filter> >*) /usr/include/c++/5.1.1/ext/new_allocator.h:124
...
2015-11-12 11:19:22 +02:00
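The class of bug reported by AddressSanitizer in 6aa5bfe59f can be sketched in isolation. `source_filter` and `exclude_one` are hypothetical names, not the real dht::range_streamer types. Deleting a derived object through a base pointer is only well defined when the base class has a virtual destructor.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical interface. Without the virtual destructor, destroying a
// derived object through std::unique_ptr<source_filter> is undefined
// behaviour (ASan reports it as new-delete-type-mismatch).
struct source_filter {
    virtual ~source_filter() = default;
    virtual bool should_include(int endpoint) const = 0;
};

// Hypothetical implementation that is larger than the base class.
struct exclude_one final : source_filter {
    int excluded;
    bool* destroyed;  // set to true when the derived destructor runs
    exclude_one(int e, bool* flag) : excluded(e), destroyed(flag) {}
    ~exclude_one() override { *destroyed = true; }
    bool should_include(int endpoint) const override { return endpoint != excluded; }
};

inline std::unique_ptr<source_filter> make_filter(int excluded, bool* destroyed) {
    return std::make_unique<exclude_one>(excluded, destroyed);
}
```

When the returned pointer goes out of scope, the derived destructor runs and the deallocation uses the size of the derived type.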
Avi Kivity
47c7dd96c5 Merge "Aggregate paging support" from Calle
"This adds repeated paged querying to do aggregate queries (similar to
origin). Uses "batched" paging."

Fixes #549
2015-11-11 20:47:14 +02:00
Calle Wilund
fdc549cd47 select_statement: Handle aggregate queries
Fixes #549.

Being clinically absent-minded, aggregate query support (i.e. count(...))
was left out of the "paging" change set.

This adds repeated paged querying to do aggregate queries (similar to
origin). Uses "batched" paging.
2015-11-11 18:41:47 +01:00
Calle Wilund
cc88763961 query_pager: Add method for repeated queries
A fetch_page method that, instead of returning a full result set, adds rows
to a pre-existing one. For "batching".
2015-11-11 18:40:14 +01:00
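The "batched" paging idea behind cc88763961 and fdc549cd47 can be sketched with a toy pager. Everything here is hypothetical (`toy_pager`, `count_rows`, rows modelled as plain ints); only the shape follows the commit messages: each fetch appends to an existing result, and the aggregate is computed once the pager is exhausted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pager over an in-memory row list.
class toy_pager {
    const std::vector<int>& _rows;
    std::size_t _pos = 0;
public:
    explicit toy_pager(const std::vector<int>& rows) : _rows(rows) {}

    bool is_exhausted() const { return _pos >= _rows.size(); }

    // Appends up to page_size rows to an existing result instead of
    // returning a fresh result set.
    void fetch_page(std::vector<int>& out, std::size_t page_size) {
        assert(page_size > 0);
        const std::size_t end = std::min(_rows.size(), _pos + page_size);
        out.insert(out.end(), _rows.begin() + _pos, _rows.begin() + end);
        _pos = end;
    }
};

// A count(...)-style aggregate computed by paging repeatedly into one batch.
inline std::size_t count_rows(const std::vector<int>& rows, std::size_t page_size) {
    toy_pager pager(rows);
    std::vector<int> batch;
    while (!pager.is_exhausted()) {
        pager.fetch_page(batch, page_size);
    }
    return batch.size();
}
```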
Amnon Heiman
1b369be663 compaction_strategy should accept both class name and full class name
For compatibility reasons, compaction_strategy should accept both class
name strategy and the full class name that includes the package name.

In origin the result name depends on the configuration; we cannot mimic
that as we are using an enum for the type.

So currently the returned class name remains the class itself; we can
consider changing it in the future.

If the name is org.apache.cassandra.db.compaction.Name then it will be
compared as Name.

The error message was modified to report the name it was given.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

Fixes #545
2015-11-11 15:31:39 +02:00
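The name matching described in 1b369be663 can be sketched as a prefix strip. This is an illustration rather than the actual patch: `short_class_name` is a hypothetical helper, and only the package prefix is taken from the commit message.

```cpp
#include <string_view>

// Package prefix quoted in the commit message.
constexpr std::string_view package_prefix = "org.apache.cassandra.db.compaction.";

// Returns the bare class name: the prefix is dropped when present,
// otherwise the name is returned unchanged.
constexpr std::string_view short_class_name(std::string_view name) {
    return name.substr(0, package_prefix.size()) == package_prefix
        ? name.substr(package_prefix.size())
        : name;
}
```

Both spellings then compare equal after normalisation, so a single lookup on the short name covers the two accepted forms.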
Avi Kivity
b95ff5c09e Merge "Commitlog format change: add scylla "magic"" from Calle
"Slight file format change for commitlog segments, now including
a scylla "marker". Allows for fast-fail if trying to load an
Origin segment.

WARNING: This changes the file format, and there is no good way for me to
check if a CL is "old" scylla, or Origin (since "version" is the same). So
either "old" scylla files also fail, or we never fail (until later, and
worse). Thus, if upgrading from an older version to this patch, ensure that
all commit logs have been cleaned out first."
2015-11-11 11:42:47 +02:00
Avi Kivity
dcc7302312 Merge "Paging support" from Calle
Fixes #355

"Implements query paging similar to origin. If driver sets a "page size" in
a query, and we cannot know that we will not exceed this limit in a single
query, the query is performed using a "pager" object, which, using modified
partition ranges and query limits, keeps track of returned rows to "page"
through the results.

Implementation structure sort of mimics the origin design, even though it
is maybe a little bit overkill for us (currently). On the other hand, it
does not really hurt.

This implementation is tested using the "paging_test" subset in dtest.
It passes all tests except:

* test_paging_using_secondary_indexes
* test_paging_using_secondary_indexes_with_static_cols
* test_failure_threshold_deletions

The first two because we don't have secondary indexes yet, the last
because the test depends on "tombstone_failure_threshold" in origin.

Potential todo: Currently the pager object does not shortcut result
building fully when page limit is exceeded. Could save a little work
here, but probably not very significant."
2015-11-11 10:45:41 +02:00
Avi Kivity
5aecf210e2 Merge "gossip shutdown fix + streaming fix" from Asias
Fixes: #540 #542
2015-11-11 10:27:43 +02:00
Asias He
efda753c0c token_metadata: Implement pending_endpoints_for
It is used in storage_proxy::create_write_response_handler. The second
argument should be keyspace name instead of the keyspace class.

Refs: #539
2015-11-11 09:41:21 +02:00
Calle Wilund
43712a583d commitlog_replayer: Special case exception from "old/origin file"
And write some nice informative stuff.
2015-11-10 17:14:22 +01:00
Calle Wilund
85b8d65374 commitlog: Change file format to include magic marker
Allows us to fail fast if someone tries to replay an Origin commit log.

WARNING: This changes the file format, and there is no good way for me to
check if a CL is "old" scylla, or Origin (since "version" is the same). So
either "old" scylla files also fail, or we never fail (until later, and
worse). Thus, if upgrading from an older version to this patch, ensure that
all commit logs have been cleaned out first.
2015-11-10 17:11:06 +01:00
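A minimal sketch of a fail-fast magic check like the one 85b8d65374 describes. The marker value, its position at the start of the buffer and its big-endian encoding are all assumptions made for illustration; the commit message does not give the real layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Made-up marker value ('S' 'C' 'Y' 'L'); the real one is not stated here.
constexpr std::uint32_t hypothetical_magic = 0x5343594cu;

// Reads a 32-bit big-endian integer from four bytes.
constexpr std::uint32_t read_be32(const unsigned char* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// True only if the buffer is long enough and starts with the marker.
// A replayer can reject the file immediately when this returns false.
constexpr bool has_magic(const unsigned char* data, std::size_t size) {
    return size >= sizeof(std::uint32_t) && read_be32(data) == hypothetical_magic;
}
```

Because files written before the format change also lack the marker, a check of this kind cannot tell them apart from Origin files, which is the trade-off the warning above spells out.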
Glauber Costa
72573d0b46 storage proxy: be more vocal about timeouts
If a timeout happens, we should log it. Trace level is really
not adequate for this.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-11-10 16:41:22 +02:00
Calle Wilund
ecd7674867 select_statement: Paging
Check if paging might be needed, and if so, use a paging object.
Similar to origin, but without all the filters.
2015-11-10 13:16:06 +01:00
Calle Wilund
13189c3176 query_pagers: Paging implementation
* Static query method to determine if paging might be required
  (very conservative - almost all queries will be paged, methinks).
* Static factory method for pager
* Actual pager implementation

Pager object uses three variables to keep track of paging state:
1.) Last partition key - partition key of the last partition processed
    -> next partition to start processing
2.) Last clustering key, i.e. row offset within the last partition,
    i.e. how far we got last time
3.) Max remaining - max rows to process further, i.e. initial limit -
    processed so far

Partition ranges are modified/removed so that we begin with "Last key",
if present. (Or end with, in the case of reversed processing)

A counting visitor then keeps count of rows to include in processing.
2015-11-10 13:16:05 +01:00
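The three state variables listed in 13189c3176 can be sketched with a toy in-memory table. All names are hypothetical and the data model (integer partition and clustering keys in a std::map) is an assumption; reversed processing and the real range rewriting are not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <utility>
#include <vector>

using table = std::map<int, std::vector<int>>;  // partition key -> sorted clustering keys
using row = std::pair<int, int>;                // (partition key, clustering key)

struct paging_state {
    std::optional<int> last_pkey;  // partition to resume from
    std::optional<int> last_ckey;  // how far we got inside that partition
    std::size_t max_remaining;     // initial limit minus rows returned so far
};

inline std::vector<row> fetch_page(const table& t, paging_state& st, std::size_t page_size) {
    std::vector<row> page;
    const std::size_t limit = std::min(page_size, st.max_remaining);
    // Begin with the last partition seen, if any.
    auto it = st.last_pkey ? t.lower_bound(*st.last_pkey) : t.begin();
    for (; it != t.end() && page.size() < limit; ++it) {
        const bool resuming = st.last_pkey && it->first == *st.last_pkey;
        for (int ck : it->second) {
            if (resuming && st.last_ckey && ck <= *st.last_ckey) {
                continue;  // already returned in an earlier page
            }
            if (page.size() == limit) {
                break;
            }
            page.emplace_back(it->first, ck);
        }
    }
    if (!page.empty()) {
        st.last_pkey = page.back().first;
        st.last_ckey = page.back().second;
    }
    st.max_remaining -= page.size();
    return page;
}
```

Calling `fetch_page` repeatedly with the same state object walks through the rows page by page and stops once the overall limit is used up.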
Calle Wilund
bef3604f5a query_pager: Pure virtual interface
Basic interface for paging control objects.

We probably do not need virtual behaviour for paging, but on the other
hand it does not really cost much, and it keeps a nice symmetry with
origin.
2015-11-10 13:12:34 +01:00
Calle Wilund
284b10cabe Make partition_slice::row_ranges multiplex on partition
Allows for having more than one clustering row range set, depending on
PK queried (although right now limited to one - which happens to be exactly
the number of multiplexing paging needs... What a coincidence...)

Encapsulates the row_ranges member in a query function, and if needed holds
ranges outside the default one in an extra object.

Query result::builder::add_partition now fetches the correct row range for
the partition, and this is the range used in subsequent iteration.
2015-11-10 13:12:33 +01:00
Calle Wilund
d8cafa8dec cql_server::connection::read_options: Deserialize paging state 2015-11-10 13:12:33 +01:00
Calle Wilund
545d3151e2 paging_state implementation
Note: serial format blob is different compared to origin, due to Scylla's
different internal architecture. I.e. we query actual rows.
But drivers etc ignore the content of the blob, it is opaque.
2015-11-10 13:12:33 +01:00
Calle Wilund
7e2569c680 result_set: Make "paging_state" a pointer to const member
Paging state is immutable. Constness is nice.
2015-11-10 13:12:33 +01:00
Calle Wilund
4a1a17defc cql3::selection: Move result set building visitor to result_set_builder
Allows its use (and partial override - hint hint) in more than one
place.
2015-11-10 13:12:33 +01:00
Calle Wilund
820ba3540b result_view: Add static helper method to do the dispatch from select
Again, makes it easier to use the same optimization/technique at more than
one location.
2015-11-10 13:12:33 +01:00
Calle Wilund
23b6240dad cql3::selection: Fix some constness correctness 2015-11-10 13:12:33 +01:00
Calle Wilund
369e09459c cql3::result_set: Add non-const metadata getter. 2015-11-10 13:12:33 +01:00
Calle Wilund
0fa543800a data_output: Template "blob" writers (bytes*) to allow for varying "size" type 2015-11-10 13:12:33 +01:00
Calle Wilund
9ee8204993 data_input: Fix missing bounds check 2015-11-10 13:12:33 +01:00
Asias He
19a6dfcfd0 streaming: stream_session print stream_session_state properly 2015-11-10 15:39:34 +08:00
Asias He
7506d57dec streaming: Add operator<< for stream_session_state 2015-11-10 15:39:34 +08:00
Asias He
860c7aff37 streaming: Print plan_id in logger 2015-11-10 15:39:34 +08:00
Asias He
d2e5d13e69 streaming: Set state to STREAMING only if we really have data to send 2015-11-10 15:39:34 +08:00
Asias He
fcf7486d4c streaming: Improve state transition log for maybe_completed and complete 2015-11-10 15:39:34 +08:00
Asias He
72a7a6bd9b streaming: session close
Currently, there are multiple places where we can close a session, which
makes the close code path hard to follow. Remove the call to maybe_completed
in follower_start_sent to simplify closing a bit.

- stream_session::follower_start_sent -> maybe_completed()
- stream_session::receive_task_completed -> maybe_completed()
- stream_session::transfer_task_completed -> maybe_completed()
- on receive of the COMPLETE_MESSAGE -> complete()
2015-11-10 15:39:34 +08:00
Asias He
cadf8b1484 streaming: Handle stream_plan with no range added
If no ranges are added for either sending or receiving, the stream
plan is empty. Return a ready future immediately.
2015-11-10 15:39:34 +08:00
Asias He
13934140f6 streaming: Remove bogus file size info for prepare completed
Scylla does not stream sstable files directly. The file size info is
incorrect, so let's get rid of it.
2015-11-10 15:39:34 +08:00
Asias He
ac1977486d gossip: Fix shutdown
nodetool decommission node 127.0.0.2, on node 127.0.0.1, I saw:

DEBUG [shard 0] gossip - failure_detector: Forcing conviction of 127.0.0.1
TRACE [shard 0] gossip - convict ep=127.0.0.1, phi=8, is_alive=1, is_dead_state=0
TRACE [shard 0] gossip - marking as down 127.0.0.1
INFO [shard 0] gossip - inet_address 127.0.0.1 is now DOWN
DEBUG [shard 0] storage_service - on_dead endpoint=127.0.0.1

This is wrong since the argument for send_gossip_shutdown should be the
node being shut down instead of the live node.
2015-11-10 15:39:34 +08:00
Avi Kivity
fee9688ae8 Merge "Fixes for collections of collections" from Paweł
"These are some fixes for removing items from collections of frozen
collections."
2015-11-10 09:38:13 +02:00
Paweł Dziepak
e494b2c1a0 tests/cql: add tests for collections of collections
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-11-09 16:37:50 +01:00
Paweł Dziepak
ee182f39a5 cql3/sets: simplify sets::discarder
Since the introduction of sets::element_discarder sets::discarder is
always given a set, never a single value.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-11-09 15:24:25 +01:00
Paweł Dziepak
0b0cef2457 cql3/maps: do not assume that the term type is constants::value
Fixes #516.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-11-09 15:18:07 +01:00
Paweł Dziepak
101ee1affd cql3/sets: add element_discarder
Currently sets::discarder is used by both set difference and removal of
a single element operations. To distinguish between them the discarder
checks whether the provided value is a set or something else. However, this
won't work if a set of frozen sets is created.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-11-09 14:47:09 +01:00
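The ambiguity behind 101ee1affd can be sketched with standard containers. These are hypothetical helpers built on std::set, not the cql3 classes. In a set whose elements are themselves sets, "is the argument a set?" no longer tells set difference apart from single-element removal, so the two operations need separate entry points.

```cpp
#include <cassert>
#include <set>

using int_set = std::set<int>;
using set_of_sets = std::set<int_set>;

// Set difference: remove every member of `to_remove` from `target`.
inline void discard_set(set_of_sets& target, const set_of_sets& to_remove) {
    for (const auto& element : to_remove) {
        target.erase(element);
    }
}

// Single-element removal: the element happens to be a set itself,
// so inspecting the argument's type cannot pick the right operation.
inline void discard_element(set_of_sets& target, const int_set& element) {
    target.erase(element);
}
```

Removing the element {1, 2} and subtracting the set {{1}, {2}} give different results on the same target, although both arguments are sets.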
Tomasz Grabiec
488528d1cd range: Fix range::subtract() for some cases
Was not handling wrap-around cases like this properly:

 (8, 3) - (2, 1)
2015-11-09 14:20:31 +02:00
Glauber Costa
8e0ad183b9 dist: AMI: add discard mount option
Crucial on SSDs, innocuous (hopefully) on HDDs. Let's add it to make peace
with our inner selves.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-11-09 13:31:01 +02:00
Asias He
d622fe867e gossip: Pass const ref if possible
It is clear that we will not change the parameter.
2015-11-09 13:01:37 +02:00