53948 Commits

Author SHA1 Message Date
Tomasz Grabiec
7eb0da3ed8 Merge tag 'asias/range_streamer_use_after_free_gossip_print/v1' from seastar-dev.git
range_streamer use after free fix and gossip print cleanups from
Asias.
2015-10-23 14:43:20 +02:00
Asias He
a63871f024 gossip: Print application_state name instead of number
Plus print cleanup.
2015-10-23 17:03:18 +08:00
Asias He
687d88dd0c gossip: Add operator<< for application_state 2015-10-23 16:32:36 +08:00
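A minimal sketch of the idea behind the two gossip printing commits above: give an enum a stream operator so logs show a name rather than a number. The enum `app_state`, its members, and the helpers `to_name` and `format_state` are hypothetical stand-ins, not the real application_state definition.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, reduced enum for illustration only.
enum class app_state { status, load, schema };

// Map each enumerator to a readable name; values outside the enum
// fall through to "UNKNOWN".
inline const char* to_name(app_state s) {
    switch (s) {
    case app_state::status: return "STATUS";
    case app_state::load:   return "LOAD";
    case app_state::schema: return "SCHEMA";
    }
    return "UNKNOWN";
}

// Stream operator: printing an app_state now emits its name.
inline std::ostream& operator<<(std::ostream& os, app_state s) {
    return os << to_name(s);
}

// Convenience helper that formats a state through the stream operator.
inline std::string format_state(app_state s) {
    std::ostringstream out;
    out << s;
    return out.str();
}
```

With this in place, any code that streams the enum into a log message picks up the readable name without further changes.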
Asias He
80f1b9781a gossip: Simplify make_token_string with boost::adaptors::transformed 2015-10-23 16:13:31 +08:00
Asias He
1ba20f4efd storage_service: Enable single_datacenter_filter in rebuild 2015-10-23 16:13:30 +08:00
Asias He
b172146223 range_streamer: Introduce single_datacenter_filter 2015-10-23 16:13:30 +08:00
Asias He
a934e31379 storage_service: Fix use after free in rebuild
streamer is a stack variable, it is gone when the function returns.
Fix it using a shared pointer.
2015-10-23 16:13:30 +08:00
Asias He
98b34ecc67 boot_strapper: Fix use after free
streamer is a stack variable, it is gone when the function returns.
Fix it using a shared pointer.

Fixes #489
2015-10-23 16:13:30 +08:00
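The pattern described in the two use-after-free fixes above can be sketched as follows, assuming a toy task queue in place of real asynchronous continuations. `toy_streamer`, `task_queue`, and `start_streaming` are hypothetical names for illustration and do not reflect the actual range_streamer code.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a streamer object.
struct toy_streamer {
    std::string description;
    int ranges_streamed = 0;
};

// Toy deferred-task queue standing in for asynchronous continuations.
using task_queue = std::vector<std::function<int()>>;

// Schedules work that runs after this function has returned.
// A stack-allocated streamer captured by reference would dangle here.
// Capturing the shared_ptr by value keeps the object alive until the
// queued task itself is destroyed.
inline std::weak_ptr<toy_streamer> start_streaming(task_queue& queue) {
    auto streamer = std::make_shared<toy_streamer>();
    streamer->description = "Bootstrap";
    queue.push_back([streamer] {
        streamer->ranges_streamed += 1;
        return streamer->ranges_streamed;
    });
    // Return a non-owning observer so callers can check the lifetime.
    return std::weak_ptr<toy_streamer>(streamer);
}
```

The returned weak_ptr is only there to make the lifetime observable: it stays valid while the task is queued and expires once the queue drops the task.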
Asias He
e2391b02da storage_service: Add debug info for get_load_map 2015-10-23 16:13:30 +08:00
Avi Kivity
2c16d1f980 Merge "fix nodetool gossipinfo and status" from Asias
"- Implement get_load_map using load_broadcaster
- Fix token in nodetool gossipinfo"
2015-10-23 10:34:13 +03:00
Asias He
bc8d3f0d24 storage_service: Implement get_load_map using load_broadcaster
$nodetool -p 7199 status

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns    Host ID Rack
UN  127.0.0.1  60711      3       ?  78ce8d76-69f8-44fc-a5ae-b932eb4f641b  rack1
UN  127.0.0.2  66237      3       ?  b1eaaee6-c334-4a65-bc0c-95af49bca0b3  rack1
2015-10-23 11:53:46 +08:00
Asias He
a137006d27 main: Set load_broadcaster for storage_service during startup
Note only storage_service on shard 0 will access load_broadcaster
2015-10-23 11:28:06 +08:00
Asias He
992390feda storage_service: Store load_broadcaster into storage_service
It is needed by get_load_map.
2015-10-23 11:25:51 +08:00
Asias He
270e713fea Revert "API: Workaround for load_map"
This reverts commit 497e403387.
2015-10-23 10:45:54 +08:00
Asias He
6b7ca7e334 gossip: Cleanup versioned_value class 2015-10-23 10:06:19 +08:00
Asias He
5556861ac0 gossip: Fix nodetool gossipinfo
Use dht::global_partitioner().{to_sstring, from_sstring} to handle
tokens.

Before:
$ nodetool -p 7199  gossipinfo
127.0.0.1
  generation:1445528714
  heartbeat:105
  0:NORMAL,TOKENS
  2:c1e6c9ef-22c8-3f93-bca1-ea5810bafd36
  3:datacenter1
  4:rack1
  5:2.1.8
  8:127.0.0.1
  11:0
  12:d24aef9c-fc4e-4290-92f4-1fd317b8883a

After:
$ nodetool -p 7199  gossipinfo
127.0.0.1
  generation:1445528714
  heartbeat:105
  0:NORMAL,-3524784140453853209;2276970246802708341;-4108982606669659076
  2:c1e6c9ef-22c8-3f93-bca1-ea5810bafd36
  3:datacenter1
  4:rack1
  5:2.1.8
  8:127.0.0.1
  11:0
  12:d24aef9c-fc4e-4290-92f4-1fd317b8883a
2015-10-23 09:40:59 +08:00
Raphael S. Carvalho
19f2dc9ef9 import Downsampling.java
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2015-10-22 14:44:40 -02:00
Gleb Natapov
5b97604735 load_broadcaster: fix linkage error in debug mode
Also move it to service namespace.
2015-10-22 18:18:05 +02:00
Amnon Heiman
14482d4496 API: Add read, write, range estimated histogram implementation for storage_proxy
This patch adds the implementation for the read, write and range
estimated histogram and total latency.

After this patch the following url will be available:
/storage_proxy/metrics/read/estimated_histogram/
/storage_proxy/metrics/read
/storage_proxy/metrics/write/estimated_histogram/
/storage_proxy/metrics/write
/storage_proxy/metrics/range/estimated_histogram/
/storage_proxy/metrics/range

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-10-22 18:54:45 +03:00
Amnon Heiman
e1168c0a34 API: Add estimated histogram to storage_proxy
This patch closes the gap between the storage_proxy read, write and range
metrics and the API.

For each of the metrics there will be a histogram, estimated histogram
and total.

The patch contains the definitions for the following:
get_read_estimated_histogram
get_read_latency
get_write_estimated_histogram
get_write_latency
get_range_estimated_histogram
get_range_latency

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>

need merge storage_proxy
2015-10-22 18:54:45 +03:00
Amnon Heiman
7b8c557f30 storage_service: Add estimated histogram for read, write and range
This patch adds an estimated histogram for read, write and range to the
proxy_service stats.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-10-22 18:54:45 +03:00
Amnon Heiman
a071cdce66 API: Add getters to storage_proxy timers timeout
This patch exposes the configured timeout values of the timers.
The timers return their values in seconds; the swagger definition
file was modified to reflect the change.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-10-22 18:54:11 +03:00
Tomasz Grabiec
5bbc902eec mutation_partition: Drop now unnecessary unconst() usage
This change was actually promised by
f74c665671.
2015-10-22 17:12:03 +02:00
Tomasz Grabiec
c7be350961 mutation_partition: Rename reversion_traits to reversal_traits
As pointed out by Nadav, 'reversion' is from 'revert', 'reversal' is
from 'reverse'.
2015-10-22 18:09:07 +03:00
Tomasz Grabiec
f74c665671 mutation_partition: Add non-const-qualified version of range() and use it 2015-10-22 18:09:07 +03:00
Asias He
6c48ce065e storage_service: Make comment clearer in get_changed_ranges_for_leaving
We are not removing the range. We calculate which nodes are currently
responsible for the range and which nodes will be responsible for it after
the change. We only need to stream data to (new nodes - current nodes). E.g.,

Assume we have node 1 and node 2 in the cluster, RF=2. If we remove node2:

Range (3c 25 fa 7e d2 2a 26 b4 , 81 2a a7 32 29 e5 3a 7c ],
current_replica_endpoints={127.0.0.1, 127.0.0.2} new_replica_endpoints={127.0.0.1}

Range (3c 25 fa 7e d2 2a 26 b4 , 81 2a a7 32 29 e5 3a 7c ] already in all replicas

no data will be streamed to node 1 since it already has it.
2015-10-22 18:08:03 +03:00
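The "new node - current node" rule from the commit message above is a plain set difference. A minimal sketch, assuming endpoints are represented as strings; `endpoint_set` and `endpoints_to_stream_to` are hypothetical names, not the storage_service API.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

using endpoint_set = std::set<std::string>;

// Endpoints that become replicas for a range only after the topology
// change: new_replicas minus current_replicas. Only these need data
// streamed to them, since the others already hold the range.
inline endpoint_set endpoints_to_stream_to(const endpoint_set& current_replicas,
                                           const endpoint_set& new_replicas) {
    endpoint_set result;
    std::set_difference(new_replicas.begin(), new_replicas.end(),
                        current_replicas.begin(), current_replicas.end(),
                        std::inserter(result, result.end()));
    return result;
}
```

For the example in the message, current replicas {127.0.0.1, 127.0.0.2} and new replicas {127.0.0.1} yield an empty set, so nothing is streamed.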
Avi Kivity
a699bc20bc Merge "Adding the stream metrics API" from Amnon
"This series adds the stream metrics API, the swagger definitions are based on
the StreamMetrics class in origin."
2015-10-22 17:18:01 +03:00
Amnon Heiman
5323b29699 API: Add compaction history to the API
This patch adds a definition and a stub for the compaction history. The
implementation should read from the compaction history table and return
an array of results.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-10-22 17:16:58 +03:00
Avi Kivity
5d37ee29f8 Merge "Adding log level support to the API" from Amnon
"This series adds an API to get and set the log level.
After this series it will be possible to use the following URLs:
GET/POST:
/system/logger
/system/logger/{name}"
2015-10-22 17:15:41 +03:00
Avi Kivity
b4e5b9dcf1 Merge "Add support to nodetool describecluster"
"This series adds the functionality that is required for nodetool
describecluster

It uses the gossiper to get the cluster name and the partitioner.  The
describe_schema_versions functionality is missing and a workaround is used so
the command would work.

After this series an example for nodetool describecluster:
./bin/nodetool describecluster
Cluster Information:
	Name: Test Cluster
	Snitch: org.apache.cassandra.locator.SimpleSnitch
	Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
	Schema versions:
		127.0.0.1: [48c4e6c8-5d6a-3800-9a3a-517d3f7b2f26]"
2015-10-22 17:11:30 +03:00
Gleb Natapov
ece6c68288 convert loadBroadcaster 2015-10-22 17:10:20 +03:00
Avi Kivity
8362622eb8 Merge "Support Ubuntu 14.04LTS" from Takuya 2015-10-22 17:03:04 +03:00
Avi Kivity
0129e42b06 Merge "Mutation diff" from Paweł
"This series adds code for computing mutation_partition difference.
For mutations A and B:

diffA = A.difference(B);
diffB = B.difference(A);
AB = A.apply(B);

diffA is the minimal mutation that when applied to B makes it equal
to AB and diffB is the minimal mutation that applied to A results in AB.

Fixes #430."
2015-10-22 16:38:25 +03:00
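The diff property stated in the merge message above can be illustrated with a toy model, assuming a mutation is just a map from cell name to a timestamped value and that the higher timestamp wins on merge (ties broken by value). `cell`, `toy_mutation`, `toy_apply`, and `toy_difference` are hypothetical names; the real mutation_partition is considerably richer.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy cell: a value with a write timestamp.
struct cell {
    std::int64_t timestamp;
    std::string value;
    bool operator==(const cell& o) const {
        return timestamp == o.timestamp && value == o.value;
    }
};

using toy_mutation = std::map<std::string, cell>;

// True when a strictly supersedes b under the toy reconciliation rule.
inline bool wins(const cell& a, const cell& b) {
    return a.timestamp > b.timestamp ||
           (a.timestamp == b.timestamp && a.value > b.value);
}

// Merge b into a copy of a, keeping the winning cell for each name.
inline toy_mutation toy_apply(toy_mutation a, const toy_mutation& b) {
    for (const auto& entry : b) {
        auto it = a.find(entry.first);
        if (it == a.end()) {
            a.emplace(entry.first, entry.second);
        } else if (wins(entry.second, it->second)) {
            it->second = entry.second;
        }
    }
    return a;
}

// Cells of a that b lacks or that supersede b's cell: the minimal set
// that, applied to b, yields the same result as merging a and b.
inline toy_mutation toy_difference(const toy_mutation& a, const toy_mutation& b) {
    toy_mutation diff;
    for (const auto& entry : a) {
        auto it = b.find(entry.first);
        if (it == b.end() || wins(entry.second, it->second)) {
            diff.emplace(entry.first, entry.second);
        }
    }
    return diff;
}
```

Under these assumptions, applying `toy_difference(a, b)` to `b` and applying `toy_difference(b, a)` to `a` both give the same result as `toy_apply(a, b)`.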
Avi Kivity
f7087da054 Merge "GET methods for snapshots" from Glauber
"The snapshots API needs to expose GET methods so people can
query information on them. Now that taking snapshots is supported,
this relatively simple series implements get_snapshot_details, a
column family method, and wires that up through the storage_service."
2015-10-22 15:23:45 +03:00
Calle Wilund
05de462fa9 commitlog: Make flush/segment delete slightly more defensive + test tolerant
Fix for (mainly) test failures (use-after free)
I.e. test case test_commitlog_delete_when_over_disk_limit causes
use-after free because test shuts down before a pending flush is done,
and the segment manager is actually gone -> crash writing stats.
Now, we could make the stats a shared pointer, but we should never
allow an operation to outlive the segment_manager.
In normal op, we _almost_ guarantee this with the shutdown() call,
but technically, we could have a flush continuation trailing somewhere.

* Make sure we never delete segments from segment_manager until they are
  fully flushed
* Make test disposal method "clear" be more defensive in flushing and
  clearing out segments
2015-10-22 15:19:24 +03:00
Avi Kivity
9dc8d98146 Merge "Make mutation queries respect reversed order" from Tomasz
"Affects CQL statements like the following one:

   select * from table order by ck desc;

Fixes #480."
2015-10-22 15:08:19 +03:00
Takuya ASADA
a03056a915 dist: add dependency packages and build script for ubuntu 2015-10-22 11:55:50 +00:00
Takuya ASADA
e0604b066a dist: add debian/ directory to build .deb package for Ubuntu 2015-10-22 11:55:50 +00:00
Takuya ASADA
592b64e478 dist: mount hugetlbfs on ubuntu 2015-10-22 11:55:50 +00:00
Takuya ASADA
830041d6be dist: share scripts both on redhat and ubuntu 2015-10-22 11:55:50 +00:00
Avi Kivity
a2577fefb9 Merge "Enable decommission support" from Asias
"Tested with:

- start node 1
- insert value
- start node 2
- insert value
- decommission node2

I can see from the log that the data range belonging to node2 is streamed to
node1, a cqlsh query on node1 returns all the data, and node2 is not in the
live node list from node1's view."
2015-10-22 14:44:24 +03:00
Avi Kivity
5f3a46eabb Merge "load_new_sstables" from Glauber
"This patchset implements load_new_sstables, allowing one to move tables inside the
data directory of a CF, and then call "nodetool refresh" to start using them.

Keep in mind that for Cassandra, this is deemed an unsafe operation:
https://issues.apache.org/jira/browse/CASSANDRA-6245

For us it is still something we should not recommend, unless the CF is totally
empty and not yet used, but we can do a much better job on the safety front.

To guarantee that, the process works in four steps:

1) All writes to this specific column family are disabled. This is a horrible thing to
   do, because dirty memory can grow much more than desired during this. Throughout
   this implementation, we will try to keep the time during which the writes are disabled
   to its bare minimum.

   While disabling the writes, each shard will tell us about the highest generation number
   it has seen.

2) We will scan all tables that we haven't seen before. Those are any tables found in the
   CF datadir, that are higher than the highest generation number seen so far. We will link
   them to new generation numbers that are sequential to the ones we have so far, and end up
   with a new generation number that is returned to the next step

3) The generation number computed in the previous step is now propagated to all CFs, which
   guarantees that all further writes will pick generation numbers that won't conflict with
   the existing tables. Right after doing that, the writes are resumed.

4) The tables we found in step 2 are passed on to each of the CFs. They can now load those
   tables while operations to the CF proceed normally."
2015-10-22 13:42:24 +03:00
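The generation bookkeeping in steps 1 to 3 above can be sketched as follows, assuming generations are plain integers and that the generations found on disk are distinct. `renumbering` and `plan_new_generations` are hypothetical names; the sketch only plans the mapping and does not touch files or pause writes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Result of the scan: which on-disk generation maps to which new
// generation, and the highest generation in use afterwards.
struct renumbering {
    std::map<std::int64_t, std::int64_t> old_to_new;
    std::int64_t highest_generation;
};

inline renumbering plan_new_generations(const std::vector<std::int64_t>& highest_per_shard,
                                        std::vector<std::int64_t> generations_on_disk) {
    // Step 1: each shard reports the highest generation it has seen.
    std::int64_t highest = 0;
    for (std::int64_t g : highest_per_shard) {
        highest = std::max(highest, g);
    }
    // Step 2: tables above that mark are new; give them sequential
    // generations that continue from the highest one seen so far.
    std::sort(generations_on_disk.begin(), generations_on_disk.end());
    renumbering plan{{}, highest};
    for (std::int64_t g : generations_on_disk) {
        if (g > highest) {
            plan.old_to_new[g] = ++plan.highest_generation;
        }
    }
    // Step 3 would propagate plan.highest_generation so that later
    // writes pick generations that cannot collide with these.
    return plan;
}
```

For shards reporting 3, 5 and 4 and on-disk generations 1, 5, 9 and 20, this plans 9 -> 6 and 20 -> 7, and returns 7 as the new highest generation.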
Avi Kivity
2b0a504cbc Merge "Adding row cache statistics to column family" from Amnon
"This series adds row cache statistics to the column family that will be exposed
via the API."
2015-10-22 13:38:01 +03:00
Paweł Dziepak
740e2166c5 tests/mutation: add test for mutation diff
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-22 12:08:53 +02:00
Paweł Dziepak
f78a80dfa3 mutation_partition: add method for computing difference
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-22 12:08:53 +02:00
Paweł Dziepak
85edc3de07 mutation_partition: compute row difference
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-22 12:08:53 +02:00
Paweł Dziepak
75df23dd3c types: add collection_type_impl::difference()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-22 12:08:53 +02:00
Paweł Dziepak
4440f9b85b mutation_partition: add row_marker::is_live()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-22 12:08:53 +02:00
Paweł Dziepak
2aa96eb00f mutation_partition: add insert_row()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-22 12:08:53 +02:00
Paweł Dziepak
a064181d7c mutation_partition: add row::with_both_ranges()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-22 12:08:53 +02:00