scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 19:46:48 +00:00

Author	SHA1	Message	Date
Avi Kivity	2d933c62ec	thrift: capture "this" explicitly in lambda C++20 deprecates capturing this in default-copy lambdas ([=]), with good reason. Move to explicit captures to avoid any ambiguity and reduce warning spew. Message-Id: <20200517151023.754906-1-avi@scylladb.com>	2020-05-18 10:24:00 +03:00
Rafael Ávila de Espíndola	311fbe2f0a	repair: Make sure sinks are always closed In a recent next failure I got the following backtrace #3 0x00007efd71251a66 in __GI___assert_fail (assertion=assertion@entry=0x2d0c00 "this->_con->get()->sink_closed()", file=file@entry=0x32c9d0 "./seastar/include/seastar/rpc/rpc_impl.hh", line=line@entry=795, function=function@entry=0x270360 "seastar::rpc::sink_impl<Serializer, Out>::~sink_impl() [with Serializer = netw::serializer; Out = {repair_row_on_wire_with_cmd}]") at assert.c:101 #4 0x0000000001f5d2c3 in seastar::rpc::sink_impl<netw::serializer, repair_row_on_wire_with_cmd>::~sink_impl (this=<optimized out>, __in_chrg=<optimized out>) at ./seastar/include/seastar/core/future.hh:312 #5 0x0000000001f5d2f4 in seastar::shared_ptr_count_for<seastar::rpc::sink_impl<netw::serializer, repair_row_on_wire_with_cmd> >::~shared_ptr_count_for (this=0x60100075b680, __in_chrg=<optimized out>) at ./seastar/include/seastar/core/shared_ptr.hh:463 #6 seastar::shared_ptr_count_for<seastar::rpc::sink_impl<netw::serializer, repair_row_on_wire_with_cmd> >::~shared_ptr_count_for (this=0x60100075b680, __in_chrg=<optimized out>) at ./seastar/include/seastar/core/shared_ptr.hh:463 #7 0x000000000240f2e6 in seastar::shared_ptr<seastar::rpc::sink<repair_row_on_wire_with_cmd>::impl>::~shared_ptr (this=0x601003118590, __in_chrg=<optimized out>) at ./seastar/include/seastar/core/future.hh:427 #8 seastar::rpc::sink<repair_row_on_wire_with_cmd>::~sink (this=0x601003118590, __in_chrg=<optimized out>) at ./seastar/include/seastar/rpc/rpc_types.hh:270 #9 <lambda(auto:134&)>::<lambda(const seastar::rpc::client_info&, uint64_t, seastar::rpc::source<repair_hash_with_cmd>)>::<lambda(std::__exception_ptr::exception_ptr)>::~<lambda> (this=0x601003118570, __in_chrg=<optimized out>) at repair/row_level.cc:2059 This patch changes a few functions to use finally to make sure the sink is always closed. 
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200515202803.60020-1-espindola@scylladb.com>	2020-05-18 08:13:42 +03:00
Avi Kivity	beaeda5234	database: remove variadic future from query() and query_mutations() Variadic futures are deprecated; replace with future<std::tuple<...>>. Tests: unit (dev)	2020-05-17 18:45:38 +02:00
Nadav Har'El	4cf44ddbdf	docs: update alternator.md Some statements made in docs/alternator/alternator.md on having a single keyspace, or recommending a DNS setup, are not up-to-date. So fix them. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200517132444.9422-1-nyh@scylladb.com>	2020-05-17 18:38:13 +02:00
Nadav Har'El	1b807a5018	alternator test: better recognition that Alternator failed to boot The test/alternator/run script starts Scylla to be tested. It waits until CQL is responsive and if Scylla dies earlier, recognizes the failure immediately. This is useful so we see boot errors immediately instead of waiting for the first test to timeout and fail. However, Scylla starts the Alternator service after CQL. So it is possible that after the "run" script found CQL to be up, Alternator couldn't start (e.g., bad configuration parameters) and Scylla is shut down, and instead of recognizing this situation, we start the actual test. The fix is simple: don't start the tests until verifying that Alternator is up. We verify this using the trivial healthcheck request (which is nothing more than an HTTP GET request). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200517125851.8484-1-nyh@scylladb.com>	2020-05-17 18:33:27 +02:00
Nadav Har'El	2b9437076f	README.md: update instructions for building docker image The instructions in README.md about building a docker image start with "cd dist/docker", but it actually needs to be "cd dist/docker/redhat". Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200517152815.15346-1-nyh@scylladb.com>	2020-05-17 18:29:55 +03:00
Tzach Livyatan	82dfab0a54	Fix a link to contributor-agreement in the CONTRIBUTING page	2020-05-17 14:15:49 +03:00
Avi Kivity	513faa5c71	Merge 'Use http Stream for describe ring' from Amnon " This series changes the describe_ring API to use HTTP stream instead of serializing the results and send it as a single buffer. While testing the change I hit a 4-year-old issue inside service/storage_proxy.cc that causes a use after free, so I fixed it along the way. Fixes #6297 " * amnonh-stream_describe_ring: api/storage_service.cc: stream result of token_range storage_service: get_range_to_address_map prevent use after free	2020-05-17 14:05:26 +03:00
Amnon Heiman	7c4562d532	api/storage_service.cc: stream result of token_range The result of the get token range API can become big, which can cause large allocations and stalls. This patch replaces the implementation so that it streams the results using the HTTP stream capabilities instead of serializing them and sending one big buffer. Fixes #6297 Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2020-05-17 13:56:05 +03:00
Amnon Heiman	69a46d4179	storage_service: get_range_to_address_map prevent use after free The implementation of get_range_to_address_map has a default behaviour: when given an empty keyspace, it uses the first non-system keyspace (first here basically means just some keyspace). The current implementation has two issues. First, it uses a reference to a string that is held on the stack of another function. In other words, there's a use after free, and it is not clear why we never hit it. Second, it calls get_non_system_keyspaces twice. Though this is not a bug, it's redundant (get_non_system_keyspaces uses a loop, so calling that function does have a cost). This patch solves both issues. First, it changes the implementation to hold a string instead of a reference to a string. Second, it stores the results from get_non_system_keyspaces and reuses them, which is more efficient and holds the returned values on the local stack. Fixes #6465 Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2020-05-17 13:53:13 +03:00
Dejan Mircevski	8db7e4cc96	cql: Add test for invalid unbounded DELETE In `add40d4e59`, we relaxed the prohibition of unbounded DELETE and stopped testing the failure message. But there are still scenarios when unbounded DELETE is prohibited, so add a test to ensure we continue to catch it where appropriate. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-05-17 12:28:36 +03:00
Avi Kivity	b155eef726	Merge "allow early aborts through abort sources." from Glauber " The shutdown process of compaction manager starts with an explicit call from the database object. However, that can only happen once everything is already initialized. This works well today, but I am soon to change the resharding process to operate before the node is fully ready. One can still stop the database in this case, but reshardings will have to finish before the abort signal is processed. This patch passes the existing abort source to the construction of the compaction_manager and subscribes to it. If the abort source is triggered, the compaction manager will react to it firing and all compactions it manages will be stopped. We still want the database object to be able to wait for the compaction manager, since the database is the object that owns the lifetime of the compaction manager. To make that possible we'll use a future that is returned from stop(): no matter what triggered the abort, either an early abort during initial resharding or a database-level event like drain, everything will shut down in the right order. The abort source is passed to the database, which is responsible for constructing the compaction manager. Tests: unit (debug), manual start+stop, manual drain + stop, previously failing dtests. "	2020-05-17 11:49:00 +03:00
Avi Kivity	777d5e88c3	types: support altering fixed-size integer types to varint Fixed-size integer types are legal varints - both are serialized as two's complement in network byte order. So tinyint, shortint, int, and bigint can all be interpreted as varints. Change is_compatible_with() to reflect that. Message-Id: <20200516115143.28690-2-avi@scylladb.com>	2020-05-17 11:31:00 +03:00
Avi Kivity	ff57e4d9a5	types: make short and byte types value-compatible with varint The short and byte types are two's complement network byte order, just like varint (except fixed size) and so varint can read them just fine. Mark them as value compatible like int32_type and long_type. A unit test is added. Message-Id: <20200516115143.28690-1-avi@scylladb.com>	2020-05-17 11:31:00 +03:00
Benny Halevy	a96087165a	hints: get_device_id: use seastar file_stat This avoids potential use-after-move, since undefined c++ sequencing order may std::move(f) in the lambda capture before evaluating f.stat(). Also, this makes use of a more generic library function that doesn't require to open and hold on to the file in the application. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200514152054.162168-1-bhalevy@scylladb.com>	2020-05-15 10:11:45 +02:00
Asias He	b2c4d9fdbc	repair: Fix race between write_end_of_stream and apply_rows Consider: n1, n2, n1 is the repair master, n2 is the repair follower. === Case 1 === 1) n1 sends missing rows {r1, r2} to n2 2) n2 runs apply_rows_on_follower to apply rows, e.g., {r1, r2}, r1 is written to sstable, r2 is not written yet, r1 belongs to partition 1, r2 belongs to partition 2. It yields after row r1 is written. data: partition_start, r1 3) n1 sends repair_row_level_stop to n2 because error has happened on n1 4) n2 calls wait_for_writer_done() which in turn calls write_end_of_stream() data: partition_start, r1, partition_end 5) Step 2 resumes to apply the rows. data: partition_start, r1, partition_end, partition_end, partition_start, r2 === Case 2 === 1) n1 sends missing rows {r1, r2} to n2 2) n2 runs apply_rows_on_follower to apply rows, e.g., {r1, r2}, r1 is written to sstable, r2 is not written yet, r1 belongs to partition 1, r2 belongs to partition 2. It yields after partition_start for r2 is written but before _partition_opened is set to true. data: partition_start, r1, partition_end, partition_start 3) n1 sends repair_row_level_stop to n2 because error has happened on n1 4) n2 calls wait_for_writer_done() which in turn calls write_end_of_stream(). Since _partition_opened[node_idx] is false, partition_end is skipped, end_of_stream is written. data: partition_start, r1, partition_end, partition_start, end_of_stream This causes unbalanced partition_start and partition_end in the stream written to sstables. To fix, serialize the write_end_of_stream and apply_rows with a semaphore. Fixes: #6394 Fixes: #6296 Fixes: #6414	2020-05-14 18:15:01 +03:00
Pekka Enberg	96e35f841c	docs/redis: API reference documentation The Redis API in Scylla only supports a small subset of the Redis commands. Let's document what we support so people have the right expectations when they try it out.	2020-05-14 17:33:39 +03:00
Benny Halevy	0d4b93b11d	sstable: fix potential use-after-move sites Avoid `f(s).then([s = std::move(s)] {})` patterns, where the move into the lambda capture may potentially be sequenced by the compiler before passing `s` to function `f`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200514131701.140046-1-bhalevy@scylladb.com>	2020-05-14 16:06:07 +02:00
Nadav Har'El	f3fd976120	docs, alternator: improve description of status of global tables support The existing text did not explain what happens if additional DCs are added to the cluster, so this patch improves the explanation of the status of our support for global tables, including that issue. Fixes #6353 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200513175908.21642-1-nyh@scylladb.com>	2020-05-14 08:03:16 +02:00
Glauber Costa	7423ccc318	compaction_manager: allow early aborts through abort sources. The shutdown process of compaction manager starts with an explicit call from the database object. However, that can only happen once everything is already initialized. This works well today, but I am soon to change the resharding process to operate before the node is fully ready. One can still stop the database in this case, but reshardings will have to finish before the abort signal is processed. This patch passes the existing abort source to the construction of the compaction_manager and subscribes to it. If the abort source is triggered, the compaction manager will react to it firing and all compactions it manages will be stopped. We still want the database object to be able to wait for the compaction manager, since the database is the object that owns the lifetime of the compaction manager. To make that possible we'll use a future that is returned from stop(): no matter what triggered the abort, either an early abort during initial resharding or a database-level event like drain, everything will shut down in the right order. The abort source is passed to the database, which is responsible for constructing the compaction manager. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-05-13 16:51:25 -04:00
Glauber Costa	45dc9cc6e5	compaction_manager: carve out a drain method We want stop() to be callable just once. Having the compaction manager stopped twice is a potential indication that something is wrong. Still, there are places where we want to stop all ongoing compactions and prevent new ones from running - like the drain operation. Today the only operation that allows for cancellation of all existing compactions is stop(). To unweave this, we will split those two things. A drain operation is carved out, and it should be safe to call it many times. The compaction manager is usable after this, and new compactions can even be sent if it happens to be enabled again (we currently don't do that). A stop operation, which includes a drain, will only be allowed once. After a stop() the compaction_manager object is no longer usable. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-05-13 16:51:25 -04:00
Glauber Costa	e29701ca1c	compaction_manager: expand state to be able to differentiate between enabled and stopped We are having many issues with the stop code in the compaction_manager. Part of the reason is that the "stopped" state has its meaning overloaded to indicate both "compaction manager is not accepting compactions" and "compaction manager is not ready or destructed". In a later step we could default to enabled-at-start, but right now we maintain current behavior to minimize noise. It is only possible to stop the compaction manager once. It is possible to enable / disable the compaction manager many times. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-05-13 16:51:25 -04:00
Nadav Har'El	62c00a3f17	merge: Use time window compaction strategy for CDC Log table Merged pull request https://github.com/scylladb/scylla/pull/6427 by Piotr Jastrzębski: CDC Log is a time series so it makes sense to use time window compaction strategy for it. Our support for time series is limited so we make sure that we don't create more than 24 sstables. If TTL is configured to 0, meaning data does not expire, we don't use time window compaction strategy. This PR also sets gc_grace_seconds to 0 when TTL is not set to 0.	2020-05-13 14:36:43 +03:00
Benny Halevy	94a558e9a8	test.py: print test command line and env to log Print the test command line and the UBSAN and ASAN env settings to the log so the run can be easily reproduced (optionally with providing --random-seed=XXX that is printed by scylla unit tests when they start). Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200513110959.32015-1-bhalevy@scylladb.com>	2020-05-13 14:27:15 +03:00
Raphael S. Carvalho	c06cdcdb3c	table: Don't allow a shared SSTable to be selected for regular compaction After commit `88d2486fca`, removal of shared SSTables is not atomic anymore. They can be first removed from the list of shared SSTables and only later be removed from the SSTable set. That list is used to filter out shared SSTables from regular compaction candidates. So it can happen that regular compaction picks up a shared SSTable as a candidate after it was removed from that list but before it was removed from the set. To fix this, let's only remove a shared SSTable from the aforementioned list after it was successfully removed from the SSTable set, so that a shared SSTable cannot be selected for regular compaction anymore. Fixes #6439. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200512175224.114487-1-raphaelsc@scylladb.com>	2020-05-13 10:43:48 +03:00
Avi Kivity	fc5568167b	tests: like_matcher_test: adjust for C++20 char8_t C++20 makes string literals defined with u8"my string" as using a new type char8_t. This is sensible, as plain char might not have 8 bits, but conflicts with our bytes type. Adjust by having overloads that cast back to char*. This limits us to environments where char is 8 bits, but this is already a restriction we have. Reviewed-by: Dejan Mircevski <dejan@scylladb.com> Message-Id: <20200512101646.127688-1-avi@scylladb.com>	2020-05-13 09:37:39 +03:00
Avi Kivity	33fda05388	counters: change deprecated std::is_pod<> to replacement C++20 deprecates std::is_pod<> in favor of the easier-to-type std::is_standard_layout<> && std::is_trivial<>. Change to the recommendation in order to avoid a flood of warnings. Reviewed-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200512092200.115351-1-avi@scylladb.com>	2020-05-13 09:36:52 +03:00
Avi Kivity	2afd40fe6f	tracing: use correct std::memory_order_* scoping std::memory_order is an unscoped enum, and so does not need its members to be prefixed with std::memory_order::, just std::. This used to work, but in C++20 it no longer does. Use the standard way to name these constants, which works in both C++17 and C++20. Reviewed-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200512092408.115649-1-avi@scylladb.com>	2020-05-13 09:36:23 +03:00
Avi Kivity	8d4bdc49f1	tests: sstable_run_based_compaction_strategy_for_tests: adjust for C++20 pass-by-value in std::accumulate C++20 changed the parameter to the binary operation function in std::accumulate() to be passed by value (quite sensibly). Adjust the code to be compatible by using a #if. This will be removed once we switch over to C++20. Message-Id: <20200512105427.142423-1-avi@scylladb.com>	2020-05-12 20:41:16 +02:00
Avi Kivity	74c1db7f59	tests: like_matcher_test: add casts for utf8 string literals C++20 makes string literals defined with u8"foo" return a new char8_t. This is sensible but is noisy for us. Cast them to plain const char. Message-Id: <20200512104751.137816-1-avi@scylladb.com>	2020-05-12 20:41:02 +02:00
Avi Kivity	07061f9a00	duration: adjust for C++20 char8_t type C++20 makes string literals defined with u8"blah" return a new char8_t type, which is sensible but noisy here. Adjust for it by dropping an unneeded u8 in one place, and adding a cast in another. Message-Id: <20200512104515.137459-1-avi@scylladb.com>	2020-05-12 20:40:30 +02:00
Avi Kivity	89ea879ba9	storage_proxy: adjust for C++20 std::accumulate() pass-by-value C++20 passes the input to the binary operation by value (which is sensible), but is not compatible with C++17. Add some #if logic to support both methods. We can remove the logic when we fully transition to C++20. Message-Id: <20200512101355.127333-1-avi@scylladb.com>	2020-05-12 20:39:21 +02:00
Tomasz Grabiec	df4b698309	Merge "Add more defenses against empty keys" from Botond In theory we shouldn't have empty keys in the database, as we validate all keys that enter the database via CQL with `validation::validate_cql_keys()`, which will reject empty keys. In this context, empty means a single-component key, with its only component being empty. Yet recently we've seen empty keys appear in a cluster and wreak havoc on it, as they will cause the memtable flush to fail due to the sstable summary rejecting the empty key. This will cause an infinite loop, where Scylla keeps retrying to flush the memtable and failing. The intermediate consequence of this is that the node cannot be shut down gracefully. The indirect consequence is possible data loss, as commitlog files cannot be replayed as they just re-insert the empty key into the memtable and the infinite flush retry circle starts all over again. A workaround is to move problematic commitlog files away, allowing the node to start up. This can however lead to data loss, if multiple replicas had to move away commitlogs that contain the same data. To prevent the node getting into an unusable state and subsequent data loss, extend the existing defenses against invalid (empty) keys to the commitlog replay, which will now ignore them during replay. Fixes: #6106 * denesb/empty-keys/v5: commitlog_replayer: ignore entries with invalid keys test: lib/sstable_utils: add make_keys_for_shard validation: add is_cql_key_invalid() validation: validate_cql_key(): make key parameter a `partition_key_view` partition_key_view: add validate method	2020-05-12 20:36:40 +02:00
Avi Kivity	72172effc8	transport: stop using boost::bimap<> We use boost::bimap for bi-directional conversion from protocol type encodings to type objects. Unfortunately, boost::bimap isn't C++20-ready. Fortunately, we only used one direction of the bimap. Replace with plain old std::unordered_map<>. Message-Id: <20200512103726.134124-1-avi@scylladb.com>	2020-05-12 18:55:26 +03:00
Botond Dénes	74b020ad05	main: run redis service in the statement scheduling group Like all the other API services (CQL, thrift and alternator). Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200512145631.104051-1-bdenes@scylladb.com>	2020-05-12 18:01:27 +03:00
Piotr Dulikowski	0c5ac0da98	hinted handoff: remove discarded hint positions from rps_set Related commit: `85d5c3d` When attempting to send a hint, an exception might occur that results in that hint being discarded (e.g. keyspace or table of the hint was removed). When such an exception is thrown, position of the hint will already be stored in rps_set. We are only allowed to retain positions of hints that failed to be sent and needed to be retried later. Dropping a hint is not an error, therefore its position should be removed from rps_set - but current logic does not do that. Because of that bug, hint files with many discardable hints might cause rps_set to grow large when the file is replayed. Furthermore, leaving positions of such hints in rps_set might cause more hints than necessary to be re-sent if some non-discarded hints fail to be sent. This commit fixes the problem by removing positions of discarded hints from rps_set. Fixes #6433	2020-05-12 15:13:59 +02:00
Avi Kivity	05e19078f6	storage_proxy: replace removed std::not1() by replacement std::not_fn() C++17 deprecated std::not1() and C++20 removed it; replace with its successor. Message-Id: <20200512101205.127046-1-avi@scylladb.com>	2020-05-12 14:05:03 +03:00
Avi Kivity	e774ee06ed	Update seastar submodule * seastar e708d1df3a...92365e7b87 (11): > tests: distributed_test: convert to SEASTAR_TEST_CASE > Merge "Avoid undefined behavior on future self move assignments" from Rafael > Merge "C++20 support" from Avi > optimized_optional: don't use experimental C++ features > tests: scheduling_group_test: verify that later() doesn't modify the current group > tests: demos: coroutine_demo: add missing include for open_file_dma() > rpc: minor documentation improvements > rpc: Assert that sinks are closed > Merge "Fix most tests under valgrind" from Rafael > distributed_test: Fix it on slow machines > rpc_test: Make sure we always flush and close the sink loading_shard_values.hh: added missing include for gcc6-concepts.hh, exposed by the submodule update. Frozen toolchain updated for the new valgrind dependency.	2020-05-12 14:04:16 +03:00
Botond Dénes	6083ed668b	commitlog_replayer: ignore entries with invalid keys When replaying the commitlog, pass keys to `validation::validate_cql_key()`. Discard entries which fail validation and warn about it in the logs. This prevents invalid keys from getting into the system, possibly failing the commitlog replay and the successful boot of the node, preventing the node from recovering data.	2020-05-12 12:07:21 +03:00
Botond Dénes	e0f5ef5ef0	test: lib/sstable_utils: add make_keys_for_shard A variant of make_keys() which creates keys for the requested shard. As this version is more generic than the existing local_shards_only variant, the former is reimplemented on top of the latter.	2020-05-12 12:07:21 +03:00
Botond Dénes	dd76e8c8de	validation: add is_cql_key_invalid()	2020-05-12 12:07:00 +03:00
Botond Dénes	95bf3a75de	validation: validate_cql_key(): make key parameter a `partition_key_view` This is more general than the previous `const partition_key&` and allows for passing keys obtained from the likes of `frozen_mutation` that only have a view of the key. While at it also change the schema parameter from schema_ptr to const schema&. No need to pass a shared pointer.	2020-05-12 12:07:00 +03:00
Botond Dénes	84c47c4228	partition_key_view: add validate method We want to be able to pass `partition_key_view` to `validation::validate_cql_key()`. As the latter wants to call `validate()` on the key, replicate `partition_key::validate()` in `partition_key_view`.	2020-05-12 12:07:00 +03:00
Asias He	b744dba75a	repair: Abort the queue in write_end_of_stream in case of error In write_end_of_stream, it does: 1) Write write_partition_end 2) Write empty mutation_fragment_opt If 1) fails, 2) will be skipped, the consumer of the queue will wait for the empty mutation_fragment_opt forever. Found this issue when injecting random exceptions between 1) and 2). Refs #6272 Refs #6248	2020-05-12 10:50:52 +02:00
Avi Kivity	f1fde537a9	Merge 'Support Snapshot of multiple tables' from Amnon This series adds support for taking a snapshot of multiple tables. Fixes #6333 * amnonh-snapshot_keyspace_table: api/storage_service.cc: Snapshot, support multiple tables service/storage_service: Take snapshot of multiple tables	2020-05-12 11:34:09 +03:00
Piotr Jastrzebski	49b6010cb4	cdc: Use time window compaction strategy for CDC Log table CDC Log is a time series with data TTLed by default to 24 hours so it makes sense to use for it a time window compaction. A window size is adjusted to the TTL configured for CDC Log so that no more than 24 sstables will be created. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-12 07:53:40 +02:00
Glauber Costa	70a89ab4ab	compaction: do not assume I/O priority class We shouldn't assume the I/O priority class for compactions. For instance, if we are dealing with offstrategy compactions we may want to use the maintenance group priority for them. For now, all compactions are put in the compaction class. Rewrite compactions (scrub, cleanup) could be maintenance, but we don't have clear access to the database object at this time to derive the equivalent CPU priority. This is planned to be changed in the future, and when we do change it, we'll adjust. Same goes for resharding: while we could change it at this point, we'd risk memory pressure since resharding is run online and sstables are shared until resharding is done. When we move it to offline execution we'll do it with maintenance priority. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200512002233.306538-3-glauber@scylladb.com>	2020-05-12 08:23:19 +03:00
Glauber Costa	4234538292	compaction: pass descriptor all the way down to compaction object. To do that - and still avoid a copy - we need to add some fields to the compaction object that are exclusive to regular_compaction. Still, not only this simplifies the code, resharding and regular compaction look more and more alike. This is done now in preparation for another patch that will add more information to the descriptor. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200512002233.306538-2-glauber@scylladb.com>	2020-05-12 08:23:19 +03:00
Piotr Sarna	5f2eadce09	alternator: wait for schema agreement after table creation In order to be sure that all nodes acknowledged that a table was created, the CreateTable request will now only return after seeing that schema agreement was reached. Rationale: alternator users check if the table was created by issuing a DescribeTable request, and assume that the table was correctly created if it returns nonempty results. However, our current implementation of DescribeTable returns local results, which is not enough to judge if all the other nodes acknowledge the new table. CQL drivers are reported to always wait for schema agreement after issuing DDL-changing requests, so there should be no harm in waiting a little longer for alternator's CreateTable as well. Fixes #6361 Tests: alternator(local)	2020-05-11 21:51:12 +03:00
Piotr Jastrzebski	0cd0775a27	cdc: Set CDC Log gc_grace_seconds to 0 Data in CDC Log is TTLed and we want to remove it as soon as it expires. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-11 17:59:52 +02:00
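
Several of the commits above (2d933c62ec, 2afd40fe6f, 33fda05388, 05e19078f6) move code to spellings that are accepted by both C++17 and C++20. The following is a minimal sketch of those four spellings, not code taken from the patches. The names `shard_counter`, `plain_pair`, `is_even` and `count_odd` are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <functional>
#include <type_traits>
#include <vector>

// Hypothetical type, used only to illustrate the spellings.
struct shard_counter {
    int base = 10;
    std::atomic<int> hits{0};

    // 2d933c62ec: name `this` in the capture list instead of relying on
    // [=] to capture it implicitly (deprecated in C++20).
    int add_base(const std::vector<int>& v) {
        int sum = 0;
        std::for_each(v.begin(), v.end(),
                      [this, &sum](int x) { sum += x + base; });
        return sum;
    }

    // 2afd40fe6f: std::memory_order_relaxed is valid in C++17 and C++20;
    // std::memory_order::memory_order_relaxed is not valid in C++20.
    void record_hit() { hits.fetch_add(1, std::memory_order_relaxed); }
};

// 33fda05388: the replacement for the deprecated std::is_pod<>.
struct plain_pair { int a; int b; };
static_assert(std::is_standard_layout_v<plain_pair> &&
              std::is_trivial_v<plain_pair>);

inline bool is_even(int x) { return x % 2 == 0; }

// 05e19078f6: std::not_fn (C++17) replaces std::not1, removed in C++20.
inline long count_odd(const std::vector<int>& v) {
    return std::count_if(v.begin(), v.end(), std::not_fn(is_even));
}
```

Unlike `std::not1`, `std::not_fn` accepts any callable, including plain functions and lambdas, so no adaptor typedefs are needed.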
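
Commits 777d5e88c3 and ff57e4d9a5 rest on the observation that fixed-size integers and varints share one encoding: two's complement in network byte order. The sketch below illustrates that claim with a single decoder that reads 1-, 2-, 4- and 8-byte values alike. It is not Scylla's implementation: `decode_be_twos_complement` is a hypothetical helper, it is limited to 8 bytes (a real varint is arbitrary precision), and treating empty input as 0 is an assumption made here for simplicity.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: decode a two's-complement, big-endian integer of
// 1 to 8 bytes. The byte count does not need to be known in advance, which
// is why a varint reader can consume tinyint/shortint/int/bigint bytes.
inline std::int64_t decode_be_twos_complement(const std::vector<std::uint8_t>& bytes) {
    if (bytes.empty()) {
        return 0;  // assumption for this sketch only
    }
    // Pre-fill with the sign bit of the most significant byte (sign extension).
    std::uint64_t acc = (bytes[0] & 0x80) ? ~std::uint64_t{0} : std::uint64_t{0};
    for (std::uint8_t b : bytes) {
        acc = (acc << 8) | b;
    }
    // Modular conversion; well-defined since C++20 and the behaviour of
    // mainstream compilers before that.
    return static_cast<std::int64_t>(acc);
}
```

For example, the one-byte value `0xFF` and the four-byte value `0xFF 0xFF 0xFF 0xFF` both decode to -1, because sign extension makes the width irrelevant.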
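
The `char8_t` commits (fc5568167b, 74c1db7f59, 07061f9a00) and the `std::accumulate()` commits (8d4bdc49f1, 89ea879ba9) deal with two further C++20 changes. The sketch below shows one way to write code that compiles under both standards. It is an illustration under stated assumptions, not the approach of the patches, which use overloads and `#if` logic. The accumulator here is a copyable `std::string`, and `join_words` and `from_utf8_literal` are hypothetical names.

```cpp
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// The binary operation takes the accumulator by value, so it accepts both
// the lvalue passed by C++17 and the moved-from value passed by C++20.
inline std::string join_words(const std::vector<std::string>& words) {
    return std::accumulate(words.begin(), words.end(), std::string{},
        [](std::string acc, const std::string& w) {
            if (!acc.empty()) {
                acc += ' ';
            }
            acc += w;
            return acc;
        });
}

// u8"..." is an array of char in C++17 and of char8_t in C++20; the cast
// back to const char* is valid in both. Like the commit message notes,
// this assumes an environment where char is 8 bits.
inline std::string from_utf8_literal() {
    return reinterpret_cast<const char*>(u8"caf\u00e9");
}
```

`from_utf8_literal()` returns five bytes, since U+00E9 encodes to two bytes in UTF-8.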

22091 Commits