Commit Graph

6765 Commits

Avi Kivity
77daa7a59f Merge "Flush queue ordering" from Calle
"Adds a small utility queue and uses it to enforce memtable flush ordering,
such that a flush may _run_ unchecked, but its "post" operation may only
execute once all "lower numbered" (i.e. lower replay position) post ops
have finished.

This means that:
a.) Callbacks to commitlog are now guaranteed to fulfill ordering criteria
b.) Calling column_family::flush() and waiting for the result will also
     wait for any previously initiated flushes to finish. But not those
     initiated _after_."
2015-10-14 15:13:24 +03:00
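The ordering rule quoted above can be illustrated with a minimal single-threaded sketch (hypothetical names, not scylla's actual flush_queue): each operation registers under an ordered key, standing in for a replay position, and its "post" step runs only after every lower-keyed post has completed.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <vector>

// Executed "post" operations, recorded here for demonstration.
std::vector<int> g_order;

class ordered_post_queue {
    std::map<int, std::function<void()>> _pending;  // key -> post (null until op done)
public:
    void start(int key) {
        _pending.emplace(key, nullptr);
    }
    // Called when the op finishes: store its post, then drain in key
    // order while the lowest-keyed op has its post ready.
    void finish(int key, std::function<void()> post) {
        _pending[key] = std::move(post);
        while (!_pending.empty() && _pending.begin()->second) {
            auto run = std::move(_pending.begin()->second);
            _pending.erase(_pending.begin());
            run();
        }
    }
};
```

Finishing key 2 before key 1 defers post 2 until post 1 has run, matching guarantee (a) above.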
Calle Wilund
012ab24469 column_family: Add flush queue object to act as ordering guarantee 2015-10-14 14:07:40 +02:00
Calle Wilund
d8658d4536 flush_queue_test
Small test to verify at least some integrity of the util in question
2015-10-14 14:07:39 +02:00
Calle Wilund
4540036a01 Add "flush_queue" helper structure
Small utility to order operation->post operation
so that the "post" step is guaranteed to only be run
when all "post"-ops for lower valued keys (T) have been completed

This is a generalized utility mainly to be testable.
2015-10-14 14:07:38 +02:00
Avi Kivity
ed2481db7f Merge branch-0.10 2015-10-14 15:03:19 +03:00
Takuya ASADA
eb1924a4e4 dist: fix file not found error on centos_dep/build_dependency.sh
We don't have boost.diff, and don't need it, so revert to rpmbuild --rebuild.

Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-10-14 14:12:46 +03:00
Asias He
c58ae5432c storage_service: Fix nodetool info returning wrong gossiper status
Before:
$ nodetool info
ID                     : a5adfbbf-cfd8-4c88-ab6b-6a34ccc2857c
Gossip active          : false

After:
$ nodetool info
ID                     : a5adfbbf-cfd8-4c88-ab6b-6a34ccc2857c
Gossip active          : true

Fix #354.
2015-10-14 12:37:51 +03:00
Avi Kivity
1a439f2259 Merge seastar upstream
* seastar 78e3924...a2523ae (7):
  > core: fix pipe unread
  > Merge 'xfs-extents'
  > Merge "separate-dma-alignment"
  > output_stream: wait for stream to be taken out of poller in case final flush returns exception.
  > reactor: Use more widely compatible xfs include
  > readme: Add xfslibs-dev to Ubuntu deps
  > pipe: add unread() operation
2015-10-14 11:49:11 +03:00
Avi Kivity
cd6054253c Merge "fix consumer parser" from Raphael
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
2015-10-13 18:53:27 +03:00
Raphael S. Carvalho
d05a5fbeb4 tests: add testcase for bug on consumer parser
problem described by commit:
3926748594

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-10-13 11:49:57 -03:00
Raphael S. Carvalho
3926748594 sstable: fix consumer parser
The first problem is the while loop around the code that processes the prestate.
That's wrong because there may be a need to read more data before continuing
to process a prestate.
The second problem is the code's assumption that a prestate will be processed
at once, after which it unconditionally processes the current state.
Both problems are likely to surface when reading a large buffer, because more
than one read may be required.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-10-13 11:45:10 -03:00
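The bug pattern being fixed can be shown with a toy accumulator (hypothetical names, not the real sstable consumer): a parser that wants N bytes may receive them split across several input buffers, so the prestate must be able to report "need more input" rather than assume it completes in a single pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct chunked_reader {
    std::vector<uint8_t> partial;  // prestate: bytes gathered so far
    size_t wanted = 0;

    void start(size_t n) {
        partial.clear();
        wanted = n;
    }

    // Consume from one input buffer; returns true once all `wanted`
    // bytes are in `partial`, false if another buffer is still needed.
    bool feed(const uint8_t*& data, size_t& len) {
        size_t take = std::min(wanted - partial.size(), len);
        partial.insert(partial.end(), data, data + take);
        data += take;
        len -= take;
        return partial.size() == wanted;
    }
};
```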
Avi Kivity
c1cfec3e5a dht: mark boot_strapper logger static
Otherwise it violates the ODR and causes link errors.
2015-10-13 16:42:42 +03:00
Tomasz Grabiec
11eb9a6260 range: Fix range::contains()
A range open-ended on both sides should contain all wrapping ranges.

Spotted by Avi.
2015-10-13 12:14:54 +03:00
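The invariant can be modeled with a toy type (hypothetical, not scylla's actual range<T>): a range open-ended on both sides covers the whole token ring, so contains() must return true even for a range that wraps around.

```cpp
#include <cassert>
#include <optional>

struct toy_range {
    std::optional<int> start, end;  // nullopt = open-ended on that side
    bool wraps() const { return start && end && *end < *start; }
    bool contains(const toy_range& other) const {
        if (!start && !end) {
            return true;  // the fix: the full ring contains everything
        }
        // remaining bound comparisons elided in this sketch
        return false;
    }
};
```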
Nadav Har'El
39f70a043d main: don't warn twice about the same directory
I was mildly annoyed by seeing two warnings about the same directory not
being XFS, when the sstable directory and the commitlog directory are the
same one (I don't know if this is typical, but this is what I do in all
my tests...). So I wrote this trivial patch to make sure not to test the
same directory twice.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-10-13 11:23:55 +03:00
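The deduplication described could look like the following sketch (hypothetical helper): collect the directories to check while skipping any path already seen, so a shared sstable/commitlog directory is only tested (and warned about) once.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

std::vector<std::string> dirs_to_check(const std::vector<std::string>& dirs) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> out;
    for (const auto& d : dirs) {
        if (seen.insert(d).second) {  // insert() reports whether d was new
            out.push_back(d);
        }
    }
    return out;
}
```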
Avi Kivity
eed11583cb Merge "Add node support" from Asias
"With this, a new node can stream data from existing nodes when it joins the cluster.

I tested with the following:

  1) start node 1

  2) insert data into node 1

  3) start node 2

I can see from the logger that data is streamed correctly from node 1
to node 2."
2015-10-13 11:06:11 +03:00
Asias He
cf9d9e2ced boot_strapper: Enable range_streamer code in bootstrap
Add code to actually stream data from other nodes during bootstrap.

I tested with the following:

   1) start node 1

   2) insert data into node 1

   3) start node 2

I can see from the logger that data is streamed correctly from node 1
to node 2.
2015-10-13 15:54:18 +08:00
Asias He
0d1e5c3961 boot_strapper: Add debug info for get_bootstrap_tokens
Print the token generated for the node who is bootstrapping.
2015-10-13 15:48:53 +08:00
Asias He
e92a759dad locator: Add debug info for abstract_replication_strategy::get_address_ranges
It is useful to check the relations between a token and its primary range.
2015-10-13 15:46:36 +08:00
Asias He
dc02d76aee range_streamer: Implement fetch_async
It is used by boot_strapper::bootstrap() in the bootstrap process to start
the streaming.
2015-10-13 15:45:56 +08:00
Asias He
887c0a36ec range_streamer: Implement add_ranges
It is used by boot_strapper::bootstrap() in the bootstrap process.
2015-10-13 15:45:56 +08:00
Asias He
1c1f9bed09 range_streamer: Implement use_strict_sources_for_ranges 2015-10-13 15:45:55 +08:00
Asias He
43d4e62b5a range_streamer: Implement add_source_filter 2015-10-13 15:45:55 +08:00
Asias He
d47ea88aa8 range_streamer: Implement get_all_ranges_with_strict_sources_for 2015-10-13 15:45:55 +08:00
Asias He
84de936e43 range_streamer: Implement get_all_ranges_with_sources_for 2015-10-13 15:45:55 +08:00
Asias He
944e28cd6c range_streamer: Implement get_range_fetch_map 2015-10-13 15:45:55 +08:00
Asias He
d986a4d875 range_streamer: Add constructor 2015-10-13 15:45:55 +08:00
Asias He
1d6c081766 range_streamer: Add i_source_filter and failure_detector_source_filter
It is used to filter out unwanted nodes.
2015-10-13 15:45:55 +08:00
Asias He
c8b9a6fa06 dht: Convert RangeStreamer to C++ 2015-10-13 15:45:55 +08:00
Asias He
b95521194e dht: Import dht/RangeStreamer 2015-10-13 15:45:55 +08:00
Asias He
b1c92f377d token_metadata: Add get_pending_ranges
One version returns only the ranges
   std::vector<range<token>>

Another version returns a map
   std::unordered_map<range<token>, std::unordered_set<inet_address>>
which is converted from
   std::unordered_multimap<range<token>, inet_address>

They are needed by token_metadata::pending_endpoints_for,
storage_service::get_all_ranges_with_strict_sources_for and
storage_service::decommission.
2015-10-13 15:45:55 +08:00
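The conversion the message describes can be sketched with toy types, with int and string standing in for range<token> and inet_address: collapse the multimap into a map of sets, grouping every endpoint pending for the same range.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

std::unordered_map<int, std::unordered_set<std::string>>
group_pending(const std::unordered_multimap<int, std::string>& mm) {
    std::unordered_map<int, std::unordered_set<std::string>> out;
    for (const auto& [range_key, endpoint] : mm) {
        out[range_key].insert(endpoint);
    }
    return out;
}
```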
Asias He
b860f6a393 token_metadata: Add get_pending_ranges_mm
Helper for get_pending_ranges.
2015-10-13 15:45:55 +08:00
Asias He
c96d826fe0 token_metadata: Introduce _pending_ranges member 2015-10-13 15:45:55 +08:00
Asias He
da072b8814 token_metadata: Remove duplicated sortedTokens
It is implemented already.
2015-10-13 15:45:55 +08:00
Asias He
d820c83141 locator: Add abstract_replication_strategy::get_pending_address_ranges
Given the current token_metadata and the new token which will be
inserted into the ring after bootstrap, calculate the ranges this new
node will be responsible for.

This is needed by boot_strapper::bootstrap().
2015-10-13 15:45:55 +08:00
Asias He
1adb27e283 token_metadata: Add clone_only_token_map
Needed by get_pending_address_ranges.
2015-10-13 15:45:55 +08:00
Asias He
527edd69ae locator: Add abstract_replication_strategy::get_range_addresses
Needed by range_streamer::get_all_ranges_with_sources_for.
2015-10-13 15:45:55 +08:00
Asias He
044dcf43de locator: Add abstract_replication_strategy::get_address_ranges
Needed by get_pending_address_ranges.
2015-10-13 15:45:55 +08:00
Asias He
3d0d02816d token_metadata: Add get_primary_ranges_for and get_primary_range_for
Given tokens, return the ranges the tokens represent. For example, with t1
and t2, it returns the ranges:

(token before t1, t1]
(token before t2, t2]
2015-10-13 15:45:55 +08:00
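The rule quoted above can be sketched on a toy ring of integer tokens (hypothetical helper, not the token_metadata API): the primary range of t is (predecessor of t, t], wrapping around at the ring edge.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

std::pair<int, int> primary_range_for(const std::vector<int>& sorted_ring, int t) {
    auto it = std::find(sorted_ring.begin(), sorted_ring.end(), t);
    // The predecessor is the previous token, or the last one when t is first.
    int pred = (it == sorted_ring.begin()) ? sorted_ring.back() : *(it - 1);
    return {pred, t};  // stands for the interval (pred, t]
}
```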
Asias He
542b1394d7 token_metadata: Add get_predecessor
It is used to get the previous token of this token in the ring.
2015-10-13 15:45:55 +08:00
Asias He
ddfd417c13 locator: Make calculate_natural_endpoints take extra token_metadata parameter
When adding/removing a node, we need to use a temporary token_metadata
with pending tokens.
2015-10-13 15:45:55 +08:00
Asias He
7959c12073 stream_session: Support the case where column_families is empty
An empty column_families means to get all the column families.
2015-10-13 15:44:59 +08:00
Tomasz Grabiec
a383f91b68 range: Implement range::contains() which takes another range 2015-10-13 15:44:36 +08:00
Pekka Enberg
2ed34b0e96 Merge seastar upstream
* seastar 1995676...78e3924 (5):
  > fix output stream batching
  > rpc: server connection shutdown fix
  > doc: add Seastar tutorial
  > resource: increase default reserve memory
  > http client: moved http_response_parser.rl from apps/seawreck into http directory

Adjust transport/server.cc for the demise of output_stream::batch_flush()
scylla-0.10
2015-10-12 16:12:35 +03:00
Avi Kivity
0498cebc58 Merge seastar upstream
* seastar c2e86d5...78e3924 (2):
  > fix output stream batching
  > rpc: server connection shutdown fix

Adjust transport/server.cc for the demise of output_stream::batch_flush()
2015-10-12 14:00:40 +03:00
Avi Kivity
b8c8473505 Merge seastar upstream
* seastar 1995676...c2e86d5 (3):
  > doc: add Seastar tutorial
  > resource: increase default reserve memory
  > http client: moved http_response_parser.rl from apps/seawreck into http directory
2015-10-12 10:29:10 +03:00
Amnon Heiman
6fd3c81db5 keyspace clean up should be a POST not a GET 2015-10-11 15:51:56 +03:00
Avi Kivity
e252475e67 Merge "locator: Adding EC2Snitch" from Vlad
"This series adds EC2Snitch.

Since both GossipingPropertyFileSnitch and the EC2SnitchXXX family of snitches
use the same property file, it was logical to share the corresponding
code. Most of this series does just that... "
2015-10-11 14:55:26 +03:00
Glauber Costa
f03480c054 avoid exception when processing caching_options
While trying to debug an unrelated bug, I was annoyed by the fact that parsing
caching options kept throwing exceptions all the time. Those exceptions have no
reason to happen: we try to convert the value to a number, and if we fail we
fall back to one of the two blessed strings.

We could just as easily test for those strings beforehand and avoid all of
that.

While we're at it, the exception message should show the value of "r", not "k".

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-11 14:53:55 +03:00
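The control-flow change described might look like this sketch (hypothetical names and blessed strings): test for the known string values first, and only then attempt the numeric conversion, instead of letting a failed conversion throw and using the exception as the fallback path.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

std::optional<int> parse_caching_value(const std::string& r) {
    if (r == "ALL" || r == "NONE") {
        return std::nullopt;  // a blessed string, not a row count
    }
    try {
        return std::stoi(r);
    } catch (const std::exception&) {
        // Per the message: report the offending value "r" itself.
        throw std::invalid_argument("invalid caching value: " + r);
    }
}
```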
Avi Kivity
2b87c8c372 build: allow defaulting test binaries not to strip debug information
Useful for continuous integration, which has enough disk space.
2015-10-11 12:25:20 +03:00
Glauber Costa
12ac9a1fbd do not calculate truncation time independently
Currently, we are calculating truncated_at during truncate() independently for
each shard. It will work if we're lucky, but it is fairly easy to trigger cases
in which each shard will end up with a slightly different time.

The main problem here is that this time is used as the snapshot name when auto
snapshots are enabled. Prior to my last fixes, this would just generate two
separate directories in this case, which is wrong but not severe.

But after the fix, this means that both shards will wait for one another to
synchronize and this will hang the database.

Fix this by making sure that the truncation time is calculated before
invoke_on_all in all needed places.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-09 17:39:47 +03:00
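The shape of the fix can be sketched as follows (hypothetical names; a plain loop stands in for seastar's invoke_on_all): the timestamp is computed once, before fanning out to the shards, so every shard records the identical truncation time instead of each sampling the clock independently.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

std::vector<int64_t> truncate_all(int n_shards) {
    // Computed once, outside the per-shard work.
    auto truncated_at = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::system_clock::now().time_since_epoch()).count();
    std::vector<int64_t> per_shard_times;
    for (int shard = 0; shard < n_shards; ++shard) {
        per_shard_times.push_back(truncated_at);  // same value everywhere
    }
    return per_shard_times;
}
```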