Commit Graph

5191 Commits

Author SHA1 Message Date
Avi Kivity
2e745bebad Merge "use compaction strategy options" from Raphael 2015-07-27 17:06:43 +03:00
Nadav Har'El
e24d6c21d9 range: Use std::declval in std::hash<>
Use C++11's std::declval<T>() instead of my ad-hoc scary-looking
idiom *(T*)nullptr.

Both techniques produce an object of type T which is only useful for
unevaluated contexts, only inspecting an object's type and not its value.
For example, in decltype() expressions.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-07-27 16:46:16 +03:00
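The two idioms compared in the commit above can be sketched side by side. This is a minimal illustration, not the code from the patch: the alias names `hash_result_old` and `hash_result_new` and the helper `same_result_type()` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <utility>

// Older idiom: "dereference" a null pointer purely to name a value of type T.
// Inside decltype() the expression is never evaluated, but it looks alarming.
template <typename T>
using hash_result_old = decltype(std::hash<T>()(*(T*)nullptr));

// C++11 idiom: std::declval<T>() names a T&& and may only appear in
// unevaluated contexts such as decltype().
template <typename T>
using hash_result_new = decltype(std::hash<T>()(std::declval<T>()));

// Both spellings deduce the same type; only the readability differs.
inline bool same_result_type() {
    return std::is_same<hash_result_old<int>, hash_result_new<int>>::value;
}

inline void demo_declval() {
    assert(same_result_type());
    assert((std::is_same<hash_result_new<int>, std::size_t>::value));
}
```

Neither expression is ever executed, so no null pointer is actually dereferenced in the older form; `std::declval` simply states the intent directly.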
Nadav Har'El
cc64e46425 Add equality operator for range
The operator== is needed when actually using a hash table - the hash
function is not enough.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-07-27 15:55:42 +03:00
Nadav Har'El
afa3d8c2c8 Fix errors in hash function for range
Amazing how many errors a short piece of code can have, without the
compiler complaining at all. The magic of templates :-)

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-07-27 15:55:41 +03:00
Nadav Har'El
1399087753 Allow range<T> as hash-table key
Some methods in storage_service.cc want to return an
unordered_set<query::range<dht::token>>. This patch adds the missing
hash function for a query::range<T> to make it usable as a hash-table key.

The hash function we used is a trivial linear combination of the range's
start and end hash function - the same function used by Cassandra's
AbstractBounds.hashCode() so it is probably "good enough".

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-07-27 14:44:33 +03:00
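The commits above add a hash function and then an equality operator so that a range can serve as a hash-table key. A rough sketch of that combination follows. `simple_range` is a hypothetical, simplified stand-in for `query::range<T>` (it ignores optional and inclusive/exclusive bounds), and the multiplier 31 is an assumption in the style of Java `hashCode()` implementations, not a value taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical simplified range with two plain bounds.
template <typename T>
struct simple_range {
    T start;
    T end;
};

// Equality is needed in addition to the hash: unordered containers use it
// to tell apart keys that land in the same bucket.
template <typename T>
bool operator==(const simple_range<T>& a, const simple_range<T>& b) {
    return a.start == b.start && a.end == b.end;
}

namespace std {
template <typename T>
struct hash<simple_range<T>> {
    size_t operator()(const simple_range<T>& r) const {
        // Trivial linear combination of the two bound hashes.
        // The multiplier 31 is an assumption made for this sketch.
        return 31 * hash<T>()(r.start) + hash<T>()(r.end);
    }
};
}

inline void demo_range_hash() {
    std::unordered_set<simple_range<int>> seen;
    seen.insert(simple_range<int>{1, 5});
    seen.insert(simple_range<int>{1, 5});   // duplicate, rejected via operator==
    seen.insert(simple_range<int>{5, 1});   // a different range
    assert(seen.size() == 2);
}
```

Without `operator==` the `unordered_set` instantiation would not compile once an insert is attempted, which matches the follow-up commit's remark that the hash function alone is not enough.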
Avi Kivity
c017a0aa81 Merge "Fix bug in row_cache::populate()" from Tomasz
"This should fix the problem Glauber was seeing with query from system tables
returning incorrect data."
2015-07-27 14:38:29 +03:00
Avi Kivity
579ff00553 Merge "Adding histogram and latency stats to storage_proxy" from Amnon
"The series adds a histogram object similar to the yammer histogram and a
latency helper object to calculate latency. It then adds the histogram to the
storage_proxy and sets it in the code and the API for the write path."
2015-07-27 14:33:26 +03:00
Tomasz Grabiec
3c79525530 tests: row_cache_test: Test harder 2015-07-27 13:27:35 +02:00
Tomasz Grabiec
31d28c5ccf row_cache: Fix bug in populate()
lower_bound() of course can return an iterator to an entry which has a
different key.
2015-07-27 12:31:42 +02:00
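The pitfall named in the commit above is easy to reproduce with a standard container. The sketch below uses a `std::map` as a hypothetical stand-in for the cache's ordered container; `find_exact` is an illustrative helper, not a function from the code base.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical cache keyed by int.
using cache = std::map<int, std::string>;

// Returns the cached value for `key`, or nullptr when the key is absent.
inline const std::string* find_exact(const cache& c, int key) {
    auto it = c.lower_bound(key);   // first entry whose key is >= `key`
    // lower_bound() may land on a different key (or on end()), so the key
    // must be compared before the entry is treated as a hit.
    if (it == c.end() || it->first != key) {
        return nullptr;
    }
    return &it->second;
}

inline void demo_lower_bound() {
    cache c{{10, "ten"}, {30, "thirty"}};
    assert(c.lower_bound(20)->first == 30);   // not the key that was asked for
    assert(find_exact(c, 20) == nullptr);
    assert(*find_exact(c, 30) == "thirty");
}
```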
Asias He
74b281b92a gossip: Fix QUARANTINE_DELAY initialization
Dependencies between static variables don't work if they're in different
translation units.

I see in gossiper's constructor, QUARANTINE_DELAY is still 0.

Make it a function. It is nicer to make it inline, but I don't want to
pull storage_service.hh into gossiper.hh.
2015-07-27 11:29:13 +03:00
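The fix described above, replacing a dependent static variable with a function, might look roughly like the following. The names `ring_delay()` and `quarantine_delay()`, the 30-second value, and the factor of two are assumptions made for illustration; the commit only says that QUARANTINE_DELAY was turned into a function.

```cpp
#include <cassert>
#include <chrono>

// Imagine this lives in another translation unit. A function-local static is
// initialized on first use, so callers never observe it before it is set.
inline std::chrono::milliseconds ring_delay() {
    static const std::chrono::milliseconds delay{30000};   // assumed value
    return delay;
}

// Computing the dependent value inside a function avoids relying on the
// unspecified initialization order of statics across translation units.
inline std::chrono::milliseconds quarantine_delay() {
    return ring_delay() * 2;   // assumed relationship
}

inline void demo_quarantine_delay() {
    assert(quarantine_delay() == std::chrono::milliseconds(60000));
    assert(quarantine_delay().count() != 0);   // never the zero seen in the bug
}
```

With a namespace-scope `static` instead, the dependent value could be computed while the other variable is still zero-initialized, which is the symptom the commit reports.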
Tomasz Grabiec
fadaa41c63 Merge tag 'gdb/initial/v1' from seastar-dev.git
From Avi:

Add pretty-print support for sstring, uuid; add commands to query
where database objects, keyspace objects, and column_family objects are
located.
2015-07-27 10:22:12 +02:00
Asias He
3e766750a8 git: Ignore a dirty submodule tree 2015-07-27 10:14:02 +03:00
Avi Kivity
182d5ab798 memtable: fix memory leak
Since memtable::partitions is now an intrusive_set, it must be cleared
explicitly, or memory is leaked.
2015-07-26 20:01:50 +03:00
Avi Kivity
261df2c40f Merge seastar upstream 2015-07-26 19:05:33 +03:00
Avi Kivity
72cc2c0115 gdb: list database, keyspace, column_family
(gdb) scylla databases
    0 (database*)0x60000009c900
    1 (database*)0x60100011bf00
(gdb) scylla keyspaces
    0 "system"             (keyspace*)0x6000000cacf8
    1 "system"             (keyspace*)0x601000114018
(gdb) scylla column_families
    0 "5a1ff267-ace0-3f12-8563-cfae6103c65e" "system"/"sstable_activity"                   (column_family*)0x60000010bf88
    0 "b4dbb7b4-dc49-3fb5-b3bf-ce6e434832ca" "system"/"compaction_history"                 (column_family*)0x60000010c148
    0 "55d76438-4e55-3f8b-9f6e-676d4af3976d" "system"/"range_xfers"                        (column_family*)0x60000010c4c8
    0 "59dfeaea-8db2-3341-91ef-109974d81484" "system"/"peer_events"                        (column_family*)0x60000010c688
    0 "37f71aca-7dc2-383b-a706-72528af04d4f" "system"/"peers"                              (column_family*)0x60000010c848
    0 "7ad54392-bcdd-35a6-8417-4e047860b377" "system"/"local"                              (column_family*)0x60000010ca08
    0 "b7b7f0c2-fd0a-3410-8c05-3ef614bb7c2d" "system"/"paxos"                              (column_family*)0x60000010cbc8
    0 "d1b675fe-2b50-3ca4-8e49-c0f81989dcad" "system"/"schema_functions"                   (column_family*)0x60000010d488
    0 "296e9c04-9bec-3085-827d-c17d3df2122a" "system"/"schema_columns"                     (column_family*)0x60000010d9c8
    0 "9f5c6374-d485-3229-9a0a-5094af9ad1e3" "system"/"IndexInfo"                          (column_family*)0x60000010d108
    0 "0359bc71-7123-3ee1-9a4a-b9dfb11fc125" "system"/"schema_triggers"                    (column_family*)0x60000010d808
    0 "0290003c-977e-397c-ac3e-fdfdc01d626b" "system"/"batchlog"                           (column_family*)0x60000010cd88
    0 "3aa75225-4f82-350b-8d5c-430fa221fa0a" "system"/"schema_usertypes"                   (column_family*)0x60000010d648
    0 "55080ab0-5d9c-3886-90a4-acb25fe1f77b" "system"/"compactions_in_progress"            (column_family*)0x60000010c308
    0 "b0f22357-4458-3cdb-9631-c43e59ce3676" "system"/"schema_keyspaces"                   (column_family*)0x60000010db88
    0 "45f5b360-24bc-3f83-a363-1034ea4fa697" "system"/"schema_columnfamilies"              (column_family*)0x60000010e008
    0 "a5fc57fc-9d6c-3bfd-a3fc-01ad54686fea" "system"/"schema_aggregates"                  (column_family*)0x60000010d2c8
    0 "2666e205-73ef-38b3-90fe-fecf96e8f0c7" "system"/"hints"                              (column_family*)0x60000010cf48
    1 "5a1ff267-ace0-3f12-8563-cfae6103c65e" "system"/"sstable_activity"                   (column_family*)0x601000121f88
    1 "b4dbb7b4-dc49-3fb5-b3bf-ce6e434832ca" "system"/"compaction_history"                 (column_family*)0x601000122148
    1 "55d76438-4e55-3f8b-9f6e-676d4af3976d" "system"/"range_xfers"                        (column_family*)0x6010001224c8
    1 "59dfeaea-8db2-3341-91ef-109974d81484" "system"/"peer_events"                        (column_family*)0x601000122688
    1 "37f71aca-7dc2-383b-a706-72528af04d4f" "system"/"peers"                              (column_family*)0x601000122848
    1 "7ad54392-bcdd-35a6-8417-4e047860b377" "system"/"local"                              (column_family*)0x601000122a08
    1 "b7b7f0c2-fd0a-3410-8c05-3ef614bb7c2d" "system"/"paxos"                              (column_family*)0x601000122bc8
    1 "d1b675fe-2b50-3ca4-8e49-c0f81989dcad" "system"/"schema_functions"                   (column_family*)0x601000123488
    1 "296e9c04-9bec-3085-827d-c17d3df2122a" "system"/"schema_columns"                     (column_family*)0x6010001239c8
    1 "9f5c6374-d485-3229-9a0a-5094af9ad1e3" "system"/"IndexInfo"                          (column_family*)0x601000123108
    1 "0359bc71-7123-3ee1-9a4a-b9dfb11fc125" "system"/"schema_triggers"                    (column_family*)0x601000123808
    1 "0290003c-977e-397c-ac3e-fdfdc01d626b" "system"/"batchlog"                           (column_family*)0x601000122d88
    1 "3aa75225-4f82-350b-8d5c-430fa221fa0a" "system"/"schema_usertypes"                   (column_family*)0x601000123648
    1 "55080ab0-5d9c-3886-90a4-acb25fe1f77b" "system"/"compactions_in_progress"            (column_family*)0x601000122308
    1 "b0f22357-4458-3cdb-9631-c43e59ce3676" "system"/"schema_keyspaces"                   (column_family*)0x601000123b88
    1 "45f5b360-24bc-3f83-a363-1034ea4fa697" "system"/"schema_columnfamilies"              (column_family*)0x601000124008
    1 "a5fc57fc-9d6c-3bfd-a3fc-01ad54686fea" "system"/"schema_aggregates"                  (column_family*)0x6010001232c8
    1 "2666e205-73ef-38b3-90fe-fecf96e8f0c7" "system"/"hints"                              (column_family*)0x601000122f48
2015-07-26 18:54:13 +03:00
Avi Kivity
e8dbfdb56b gdb: pretty-print uuids 2015-07-26 18:54:12 +03:00
Avi Kivity
53856bf2d1 Add gdb pretty-printing script 2015-07-26 18:54:11 +03:00
Avi Kivity
a9d37f0e20 debug: store database pointer in a static variable
This makes it easily accessible to a debugger
2015-07-26 18:54:08 +03:00
Amnon Heiman
01aacbeacc API: Adding the histogram implementation to storage_proxy
This adds the implementation of the histogram for the storage proxy.
After this patch the following URLs will be available:
/storage_proxy/metrics/read/latency/histogram
/storage_proxy/metrics/range/latency/histogram
/storage_proxy/metrics/write/latency/histogram

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 11:03:31 +03:00
Amnon Heiman
429f7d2b20 API: Adding the histogram stats definition to storage_proxy
This adds the read, write and range histograms to the storage_proxy.
It adds the following commands:
get_read_metrics_latency_histogram
get_range_metrics_latency_histogram
get_write_metrics_latency_histogram

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 11:03:30 +03:00
Amnon Heiman
130d8a7cc6 API: generalize the sum helper functions and add histogram support
This patch generalizes the sum helper function to accept any field, as
long as it supports the + operator and can be parsed as JSON.

It also adds a function to sum histograms. It does so by adding the
total, adding the sum, setting the min and max, setting the average
and variance, and combining the samples.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 11:03:24 +03:00
Amnon Heiman
4908222d6a Adding utils.json Swagger definition file
The utils file will hold general modules, that need to be used by
multiple modules.

As a start, it holds the histogram definition.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 10:58:45 +03:00
Amnon Heiman
893410b08d storage_proxy: setting the write timers
This patch sets the write timers: histogram, timeout and unavailable.

For the histogram a latency is needed. For that the latency object is
used.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 10:57:40 +03:00
Amnon Heiman
c317e61f6d Adding histograms to storage_proxy
The storage proxy needs to collect statistics about read, write and
range. For that the ihistogram object was added to its stats object.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 10:57:40 +03:00
Amnon Heiman
b2c5e2a7cc utils: adding the latency object
The latency object is used to simplify calculating latencies. It uses a
start and stop time_point so the latency can be queried multiple times.

The start needs to be done explicitly and not in the constructor to
allow reuse of the object.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 10:57:32 +03:00
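A helper of the kind described in the commit above could be sketched as follows. The class name `latency_timer` and its method names are hypothetical; only the ideas of storing both time points and starting explicitly come from the commit message.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical sketch of a latency helper.
class latency_timer {
    using clock = std::chrono::steady_clock;
    clock::time_point _start{};
    clock::time_point _stop{};
public:
    // start() is explicit rather than done in the constructor, so the same
    // object can be reused for several measurements.
    latency_timer& start() { _start = clock::now(); return *this; }
    latency_timer& stop() { _stop = clock::now(); return *this; }
    // Both time points are stored, so the result can be queried repeatedly.
    clock::duration latency() const { return _stop - _start; }
};

inline void demo_latency() {
    latency_timer t;
    t.start();
    t.stop();
    auto first = t.latency();
    assert(first >= std::chrono::nanoseconds(0));
    assert(t.latency() == first);   // a second query returns the same value
    t.start().stop();               // reuse the same object
    assert(t.latency() >= std::chrono::nanoseconds(0));
}
```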
Amnon Heiman
2b584ec2ec Adding the histogram object
The histogram object is equivalent to the Histogram used in Origin. It
collects multiple values about the data:
Count, Min, Max, Sum, variance and the sum of squares that are used for
std calculation.

It also contains a sample of the last n elements, that are stored in a
circular buffer.

The histogram is used by the API to report histogram statistics.

As the API does not support unsigned integers, the count is signed.

Typically the base type of the histogram is int64_t, so ihistogram was
defined as such.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-07-26 10:55:14 +03:00
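The statistics listed in the commit above can be illustrated with a small sketch. Everything here is an assumption apart from what the commit states (a signed count, min, max, sum, sum of squares, a circular sample buffer, and an `ihistogram` based on `int64_t`): the struct name `basic_histogram`, the method `mark()`, and the population-variance formula are illustrative choices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical histogram sketch; names are illustrative only.
template <typename T>
struct basic_histogram {
    int64_t count = 0;                          // signed, as the commit notes
    T min = std::numeric_limits<T>::max();
    T max = std::numeric_limits<T>::lowest();
    T sum = 0;
    double sum_of_squares = 0;
    std::vector<T> sample;                      // circular buffer of the last n values
    std::size_t capacity;

    explicit basic_histogram(std::size_t n) : capacity(n) {}

    void mark(T value) {
        if (sample.size() < capacity) {
            sample.push_back(value);
        } else if (capacity > 0) {
            // Buffer is full: overwrite the oldest slot.
            sample[static_cast<std::size_t>(count) % capacity] = value;
        }
        if (value < min) { min = value; }
        if (value > max) { max = value; }
        sum += value;
        sum_of_squares += static_cast<double>(value) * static_cast<double>(value);
        ++count;
    }

    double mean() const { return count ? static_cast<double>(sum) / count : 0.0; }

    // Population variance derived from the sum of squares (an assumed formula).
    double variance() const {
        if (count == 0) { return 0.0; }
        double m = mean();
        return sum_of_squares / count - m * m;
    }
};

using ihistogram = basic_histogram<int64_t>;

inline void demo_histogram() {
    ihistogram h(3);
    for (int64_t v : {1, 2, 3, 4}) { h.mark(v); }
    assert(h.count == 4 && h.min == 1 && h.max == 4 && h.sum == 10);
    assert((h.sample == std::vector<int64_t>{4, 2, 3}));   // 1 was overwritten
    assert(h.variance() == 1.25);
}
```

Keeping the sum of squares alongside the sum lets the variance, and from it the standard deviation, be computed without storing every value.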
Avi Kivity
68ac0f1562 thrift: fix collectd type name
collectd rejects "thrift_requests" because it doesn't exist in its types.db.
Replace with "total_requests".
2015-07-26 10:32:26 +03:00
Avi Kivity
095c2f2920 Merge "Fixes for partition_range model" from Tomasz
"range::is_wrap_around() will not work with current ring_position, because it
relies on total ordering. Same for range::contains(). Currently ring_position
is weakly ordered. This series fixes this problem by making ring_position
totally ordered.

Another problem fixed by this series is handling of wrap-around ranges. In
Origin, ]x; x] is treated as a wrap around range covering whole ring."
2015-07-25 17:47:40 +03:00
Avi Kivity
73dfa66b8a remove "reversed_type.hh"
Not used (and fix one accidental use).
2015-07-25 17:34:56 +03:00
Avi Kivity
347cec2922 Merge "Support clustering order" from Glauber
"This is the work required to support the clustering order statement.

There is no work needed from the sstables side, because it should just
write whatever we are given by the memtable layer, in the order established
by the memtable layer.

So the work is basically done in the types-level, to support "reversed types",
one concept from Origin we were missing"
2015-07-25 17:09:30 +03:00
Avi Kivity
092ca04f54 Merge "fixes for compaction" from Raphael 2015-07-25 13:43:53 +03:00
Glauber Costa
7fdb21ae8c sstable_test: test clustering order
If we revert the type of the clustering key, which is what would happen if we
defined the table as with clustering order by (cl desc), we expect the
clustering keys to be in descending order on disk.

There is no work needed for sstables for that to happen. But we should still
verify that this is indeed the case.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-24 22:55:12 -04:00
Glauber Costa
21fc542af1 create_table_statement: revert reversed types
We have the information that they should be reverted, but we are not yet
reverting them. Go ahead and do it

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-24 22:55:12 -04:00
Glauber Costa
5306c70f3f create_table statement: sanity test defined_order
Code-conversion, mainly

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-24 22:55:12 -04:00
Glauber Costa
3c0982a01f create_table_statement: adjust _defined_ordering
First of all, we should abide by our convention of prepending member names with
a '_'. (That is the underline character, it just looks like a face)

But more importantly, because we will be searching its contents frequently, a
helper function is provided.

Note that obviously a hash is better suited for this: but because we do need to
keep the fields in the order they are inserted, a vector really is the best
choice for that.

A table is not expected to have a lot of clustering keys. So this search should
be cheap. If it turns out to be a problem, we can adjust later.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-24 22:55:12 -04:00
Glauber Costa
051aed33f9 compound: invert order for byte comparison of reversed types
Reversed types can be byte_comparable, but in this case we should
invert the order of the comparison.

One alternative here, of course, would be to just declare all reversed types
non-byte comparable. That would definitely be safer, but at the expense of
always having more expensive comparisons for inverted orders.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-24 22:55:12 -04:00
Glauber Costa
e169cd8e4f type_test: test reversed types
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-24 22:55:12 -04:00
Glauber Costa
7669563900 type_parser: support reversed types
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-24 22:55:08 -04:00
Glauber Costa
1047f36a9a types: support reversed type
A reversed type has an underlying type to which it is equal in every aspect.
Except that it will compare differently: it compares in the reverse order
of its base type.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-24 22:55:08 -04:00
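The idea of a reversed type, identical to its base type except for comparison order, can be sketched with a small wrapper. `int_type` and `reversed` are hypothetical names for this illustration and do not reflect the actual type hierarchy.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical base type with a three-way comparison.
struct int_type {
    static int compare(int a, int b) { return (a > b) - (a < b); }
};

// Behaves like Base, except that it compares in the reverse order.
template <typename Base>
struct reversed {
    template <typename V>
    static int compare(const V& a, const V& b) {
        return Base::compare(b, a);   // swap the arguments to flip the order
    }
};

inline void demo_reversed() {
    std::vector<int> v{2, 9, 4};
    std::sort(v.begin(), v.end(), [](int a, int b) {
        return reversed<int_type>::compare(a, b) < 0;
    });
    assert((v == std::vector<int>{9, 4, 2}));   // descending order
}
```

Sorting clustering keys with such a comparator yields the descending on-disk order that the clustering-order test in this series expects.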
Raphael S. Carvalho
15bbb71b7b db: handle compaction exception outside keep doing
Otherwise, we would needlessly handle it twice.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-07-24 19:12:34 -03:00
Raphael S. Carvalho
5f89f80ae5 Revert "db: dont rethrow exceptions for termination of compaction fiber"
Actually we should rethrow exceptions because they are needed for
keep_doing() to finish. Otherwise, the future _compaction_done
will never be resolved.

This reverts commit 89698b0d1c.
2015-07-24 19:07:47 -03:00
Tomasz Grabiec
45b4471a0e tests: Introduce test for query::partition_range 2015-07-24 16:08:41 +02:00
Tomasz Grabiec
5e03dea65d range: Fix is_wrap_around()
In Origin, dht.Range() with equal values is considered a full wrap
around. Make our range<> recognize this.

So we have:

 ]x; x] - wrap around, full ring
 [x; x[ - wrap around, full ring
 ]x; x[ - wrap around, excluding x
 [x; x] - not wrap around, only x included
2015-07-24 16:08:41 +02:00
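The four cases listed in the commit above can be checked with a small sketch. `bound` and `wrapping_range` are hypothetical simplifications over `int` values; the real `range<>` also supports missing bounds, which are ignored here.

```cpp
#include <cassert>

// Hypothetical simplified bound and range.
struct bound {
    int value;
    bool inclusive;
};

struct wrapping_range {
    bound start;
    bound end;
};

inline bool is_wrap_around(const wrapping_range& r) {
    if (r.start.value != r.end.value) {
        return r.start.value > r.end.value;   // the ordinary wrap-around case
    }
    // Equal values: only [x; x] (both bounds inclusive) is a normal range
    // that contains exactly x. The other three forms wrap around.
    return !(r.start.inclusive && r.end.inclusive);
}

inline void demo_wrap_around() {
    const int x = 7;
    assert(is_wrap_around({{x, false}, {x, true}}));    // ]x; x]
    assert(is_wrap_around({{x, true}, {x, false}}));    // [x; x[
    assert(is_wrap_around({{x, false}, {x, false}}));   // ]x; x[
    assert(!is_wrap_around({{x, true}, {x, true}}));    // [x; x]
}
```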
Tomasz Grabiec
1b7ab4f639 range: Introduce unwrap() 2015-07-24 16:08:41 +02:00
Tomasz Grabiec
1c95f646ae range: Make before() and after() public 2015-07-24 16:08:41 +02:00
Tomasz Grabiec
4d06c2aa1d Move to_partition_range() adaptor to global scope
It should be moved to i_partitioner.hh, but to do that range<> has to
be first moved out of query-request.hh to break cyclic dependency.
I didn't want to cause conflicts with in-flight patches to range<>.
2015-07-24 16:08:41 +02:00
Tomasz Grabiec
e5feff5d71 dht: ring_position: Switch to total ordering
range::is_wrap_around() and range::contains() rely on total ordering
on values to work properly. Current ring_position_comparator was only
imposing a weak ordering (token positions equal to all key positions
with that token).

range::before() and range::after() can't work for weak ordering. If
the bound is exclusive, we don't know if user-provided token position
is inside or outside.

Also, is_wrap_around() can't properly detect wrap around in all
cases. Consider this case:

 (1) ]A; B]
 (2) [A; B]

For A = (tok1) and B = (tok1, key1), (1) is a wrap around and (2) is
not. Without total ordering between A and B, range::is_wrap_around() can't
tell that.

I think the simplest solution is to define a total ordering on
ring_position by making token positions positioned either before or
after all keys with that token.
2015-07-24 16:08:41 +02:00
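One way to picture the total ordering proposed above is a position that carries a token plus a marker saying whether it sits before all keys, on a specific key, or after all keys of that token. The sketch below is an assumption-laden illustration: `ring_pos`, its `kind` enum, and the use of `int` tokens and `std::string` keys are hypothetical. It also shows `less_compare()` and `equals()` derived from `tri_compare()`, as in the neighbouring commit.

```cpp
#include <cassert>
#include <string>

// Hypothetical simplified ring position.
struct ring_pos {
    enum class kind { before_all_keys = -1, key = 0, after_all_keys = 1 };
    int token;
    kind where;
    std::string key;   // meaningful only when where == kind::key
};

// Three-way comparison defining a total order: token first, then the
// before/key/after marker, then the key itself.
inline int tri_compare(const ring_pos& a, const ring_pos& b) {
    if (a.token != b.token) {
        return a.token < b.token ? -1 : 1;
    }
    int wa = static_cast<int>(a.where);
    int wb = static_cast<int>(b.where);
    if (wa != wb) {
        return wa < wb ? -1 : 1;
    }
    if (a.where != ring_pos::kind::key) {
        return 0;   // two token-only positions on the same side are equal
    }
    int c = a.key.compare(b.key);
    return (c > 0) - (c < 0);
}

inline bool less_compare(const ring_pos& a, const ring_pos& b) { return tri_compare(a, b) < 0; }
inline bool equals(const ring_pos& a, const ring_pos& b) { return tri_compare(a, b) == 0; }

inline void demo_total_order() {
    ring_pos a{1, ring_pos::kind::before_all_keys, ""};
    ring_pos b{1, ring_pos::kind::key, "key1"};
    ring_pos c{1, ring_pos::kind::after_all_keys, ""};
    assert(less_compare(a, b) && less_compare(b, c));
    assert(!equals(a, b));   // a token position no longer equals a key position
}
```

Because a token-only position is now strictly before or strictly after every key with that token, the A = (tok1), B = (tok1, key1) example from the commit has a definite answer.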
Tomasz Grabiec
2e845140e2 dht: ring_position: Implement less_compare() and equals() using tri_compare() 2015-07-24 16:08:41 +02:00
Tomasz Grabiec
7b30e8fcff dht: ring_position: Move definitions out of line 2015-07-24 16:08:41 +02:00
Tomasz Grabiec
1c562fcdb3 Merge tag 'asias/gossip/rejoin/v2'
From Asias: "
This patchset fixes:

- node rejoin issue
  Start node 1 and node 2, kill node 2, restart node 2.
  Now, node 1 can talk to node 2 correctly.

- node mark dead issue

- failure_detector sampling
"
2015-07-24 14:09:20 +03:00