scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-05 14:33:08 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	b761e558b5	system_keyspace: Keep local_cache reference on board For now it's a reference, but all users of the cache will be eventually switched into using system_keyspace. In cql-test-env cache starting happens earlier than it was before, but that's OK, it just initializes empty instances. In main cache starts at the same time as before patching. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-16 13:57:59 +03:00
Pavel Emelyanov	1bcb6c13a5	system_keyspace: Move minimal_setup into start Start happens at exactly the same place. One thing to take care of is that it happens on all shards. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-16 13:57:59 +03:00
Pavel Emelyanov	7ef69b8189	system_keyspace: Make sharded object The db::system_keyspace was made a class some time ago, time to create a standard sharded<> object out of it. It needs query processor and database. None of those dependencies is started early enough, so the object for now starts in two steps -- early instances creation and late start. The instances will carry qctx and local_cache on board and all the services that need those two will depend on system-keyspace. Its start happens at exactly the same place where system_keyspace::setup happens thus any service that will use system_keyspace will be on the same safe side as it is now. In the further future the system_keyspace will be equipped with its own query processor backed by local replica database instance, instead of the whole storage proxy as it is now. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-16 13:57:59 +03:00
Mikołaj Sielużycki	a8cb7bf677	readers: Make result_collector use queue reader handle v2. Transitively modifies streaming_virtual_table::as_mutation_source, along with the tests. Closes #10223	2022-03-15 17:02:28 +02:00
Botond Dénes	61028ad718	evictable_reader: avoid preemption pitfall around waiting for readmission Permits have to wait for re-admission after having been evicted. This happens via `reader_permit::maybe_wait_readmission()`. The user of this method -- the evictable reader -- uses it to re-wait admission when the underlying reader was evicted. There is one tricky scenario however, when the underlying reader is created for the first time. When the evictable reader is part of a multishard query stack, the created reader might in fact be a resumed, saved one. These readers are kept in an inactive state until actually resumed. The evictable reader shares its permit with the to-be-resumed reader so it can check whether it has been evicted while saved and needs to wait for readmission before being resumed. In this flow it is critical that there is no preemption point between this check and actually resuming the reader, because if there is, the reader might end up actually recreated, without having waited for readmission first. To help avoid this situation, the existing `maybe_wait_readmission()` is split into two methods: * `bool reader_permit::needs_readmission()` * `future<> reader_permit::wait_for_readmission()` The evictable reader can now ensure there is no preemption point between `needs_readmission()` and resuming the reader. Fixes: #10187 Tests: unit(release) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20220315105851.170364-1-bdenes@scylladb.com>	2022-03-15 14:37:22 +02:00
Avi Kivity	39504dea61	Merge "Convert result builders to v2" from Botond " Namely the query result writer and the reconcilable result builder, used for building results for regular queries and mutation queries (used in read repair) respectively. With this, there are no users left for the v1 output of the compactor, so we remove that, making the compactor v2 all-the-way (and simpler). This means that for regular queries, a downgrade phase is eliminated completely, as regular queries don't store range tombstone in their result, so no need to convert them. Tests: unit(dev, release, debug) " * 'result-builders-v2/v1' of https://github.com/denesb/scylla: reconcilable_result_builder: remove v1 support query_result_builder: remove v1 support mutation_compactor: drop v1 related code-paths mutation_compactor: drop v1 support altogether from the API tree: migrate to the v2 consumer APIs test/boost/mutation_test: remove v1 specific test code querier: switch to v2 compactor output reconcilable_result_builder: add v2 support query_result_writer: add v2 support query_result_builder: make consume(range_tombstone) noop	2022-03-15 14:32:58 +02:00
Mikołaj Sielużycki	7ce0d380d4	readers: Update tests to use make_queue_reader_v2. Closes #10220	2022-03-15 13:56:50 +02:00
Piotr Dulikowski	5d7b2c6515	utils/result_try: prevent exceptions from being caught multiple times The `result_try` and `result_futurize_try` are supposed to handle both failed results and exceptions in a way similar to a try..catch block. In order to catch exceptions, the metaprogramming machinery invokes the fallible code inside a stack of try..catch blocks, each one of them handling one exception. This is done instead of creating a single try..catch block, as to my knowledge it is not possible to create a try..catch block with the number of "catch" clauses depending on a variadic template parameter pack. Unfortunately, a "try" with multiple "catches" is not functionally equivalent to a "try block stack". Consider the following code: try { try { return execute_try_block(); } catch (const derived_exception&) { // 1 } } catch (const base_exception&) { // 2 } If `execute_try_block` throws `derived_exception` and the (1) catch handler rethrows this exception, it will also be handled in (2), which is not the same behavior as if the try..catch stack was "flat". This causes wrong behavior in `result_try` and `result_futurize_try`. The following snippet has the same, wrong behavior as the previous one: return utils::result_try([&] { return execute_try_block(); }, utils::result_catch<derived_exception>([&] (const auto&& ex) { // 1 }), utils::result_catch<base_exception>([&] (const auto&& ex) { // 2 }); This commit fixes the problem by adding a boolean flag which is set just before a catch handler is executed. If another catch handler is accidentally matched due to exception rethrow, the catch handler is skipped and exception is automatically rethrown. Tests: unit(dev, debug) Fixes: #10211 Closes #10216	2022-03-15 11:42:42 +02:00
Benny Halevy	e5538cf52e	test: mutation_write_test: test_timestamp_based_splitting_mutation_writer: no need to downgrade reader to v1 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220315083425.2786228-2-bhalevy@scylladb.com>	2022-03-15 11:41:11 +02:00
Benny Halevy	90edddd7e3	everywhere: use make_flat_mutation_reader_from_mutations_v2 Rather than upgrade_to_v2(make_flat_mutation_reader_from_mutations) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220315083425.2786228-1-bhalevy@scylladb.com>	2022-03-15 11:41:10 +02:00
Nadav Har'El	189ff5414f	test/cql-pytest: implement test_tools.py without run-script cooperation In commit `afab1a97c6`, we added test_tools.py - tests for the various tools embedded in the Scylla executable. These tests need to know where the Scylla executable is, and also where its sstables are stored. For this, the commit added two test parameters - "--scylla-path" and "--workdir" - with which the "run" script communicated this knowledge to the test. However, that implementation meant that these tests only work if the test was run via the test/cql-pytest/run script - they won't work if the user ran Scylla/pytest manually, or through some other script not passing these options. This patch drops the "--scylla-path" and "--workdir" parameters, and instead the test figures out this information on its own: 1. To find the Scylla executable, we begin by looking (using the local_process_id(cql) function from the previous patch) for a local process which listens to our CQL connection, and then find the executable's path using /proc. 2. To find the Scylla data directory (which is what we really need, not workdir which is just a shortcut to set all directories!), we retrieve this configuration from the system.config table through CQL. I tested that test_tools.py now works not only through test/cql-pytest/run but also if I run Scylla manually and then run "pytest test_tools.py" without any extra parameters. Fixes #10209 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220314151125.2737815-2-nyh@scylladb.com>	2022-03-14 20:25:22 +02:00
Nadav Har'El	8ed0909cc3	test/cql-pytest: add mechanism and example of testing Scylla log messages Generally, cql-pytest tests do not, and should not rely on looking up messages in the Scylla log file: Relying on such messages makes it impossible to run the same test against Cassandra or even a remotely-installed Scylla, and the tests tend to break when logging (which is not considered part of our API) changes. Moreover, usually what our dtests achieve by looking at the log - e.g., figuring out when some event has happened - can be achieved through official CQL APIs, and this is what normal users do anyway (users don't normally dig through the log to figure out when their operation completed). However, sometimes we do want to write a test to confirm that during a certain operation, a certain log message gets written to Scylla's log. A desire to do this was raised by @fruch and @soyacz, so in this patch I provide a mechanism to do this, and a trivial example - which checks that a "Creating ..." message appears on the log whenever a table is created, and "Dropping ..." when the table is deleted. As is explained in detail in comments in the patch, Scylla's log file is found automatically, without relying on Scylla's runner (such as the script test/cql-pytest/run) communicating to the test where the log file is. If the log file can't be found - e.g., we're testing a remote Scylla, or if this isn't Scylla, the tests are skipped. I would like all logfile-testing tests to be in the same file, test_logs.py. As I explained above, I think it is a mistake for general tests to check the log file just because they can. I think that the only tests that should use the log file are tests deliberately written to check what gets logged - and those can be collected in the same file. As part of this patch, we add the utility function local_process_id(cql) to find (if we can) the local process which listens to the connection "cql". This utility function will later be useful in more places - for example test_tools.py needs to find Scylla's executable. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220314151125.2737815-1-nyh@scylladb.com>	2022-03-14 20:25:20 +02:00
Lukasz Sojka	c65f1c3b47	test/cql-pytest: add warnings test The CQL client should return warnings when a batch exceeds a certain size. This test verifies that the response contains them. Test covers issue: https://github.com/scylladb/scylla/issues/10196 Signed-off-by: Lukasz Sojka <lukasz.sojka@scylladb.com> Closes #10197	2022-03-14 19:49:06 +02:00
Raphael S. Carvalho	fce9d869b4	compaction: Move table::compact_sstables() into compaction manager Table submits compaction request into manager, which in turn calls back table to run the compaction when the time has come, i.e.: table -> compaction manager -> table -> execute compaction But manager should not rely on table to run compaction, as compaction execution procedure sits one layer below the manager and should be accessed directly by it, i.e: table -> compaction manager -> execute compaction This makes code easier to understand and update_compaction_history() can now be noop for unit tests using table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220311023410.250149-1-raphaelsc@scylladb.com>	2022-03-14 15:39:23 +02:00
Botond Dénes	964d9e033d	Merge "raft_group_registry: drain_on_shutdown" from Benny Halevy " This series hardens raft_group_registry::stop_servers and uses it to drain_on_shutdown, called before the database is stopped in cql_test_env. (Not needed for main). raft_group_registry deferred_stop is introduced right after the service is started to make sure it's properly stopped even if there's an exception at any point while starting. Test: unit(dev) " * tag 'raft_group_registry-drain_on_shutdown-v1' of https://github.com/bhalevy/scylla: cql_test_env: raft_group_registry::drain_on_shutdown before stopping the database raft_group_registry: harden stop_servers raft_group_registry: delete unused _shutdown_gate	2022-03-14 14:22:46 +02:00
Mikołaj Sielużycki	1d84a254c0	flat_mutation_reader: Split readers by file and remove unnecessary includes. The flat_mutation_reader files were conflated and contained multiple readers, which were not strictly necessary. Splitting improves iterative compilation times, as touching rarely used readers doesn't recompile large chunks of the codebase. Total compilation times are also improved, as the sizes of flat_mutation_reader.hh and flat_mutation_reader_v2.hh have been reduced and those files are included by many files in the codebase. With changes: real 29m14.051s, user 168m39.071s, sys 5m13.443s. Without changes: real 30m36.203s, user 175m43.354s, sys 5m26.376s. Closes #10194	2022-03-14 13:20:25 +02:00
Benny Halevy	26b1be0b8f	test: lib: random_mutation_generator: accept optional random seed Provide an easy way to instrument a particular test case to use a given random number seed (that's currently already printed to the test log). Refs #5349 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210907114537.3464004-1-bhalevy@scylladb.com>	2022-03-14 13:09:36 +02:00
Benny Halevy	8481852c91	cql_test_env: raft_group_registry::drain_on_shutdown before stopping the database We're currently stopping raft_gr before shutting the database down, but we fail to do that if anything goes wrong before that, e.g. if distributed_loader::init_non_system_keyspaces fails. This change splits drain_on_shutdown out of stop() to stop the raft groups before the database is stopped and does the rest in a deferred_stop placed right after the raft_gr registry is started. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-14 11:49:44 +02:00
Nadav Har'El	383aa326cc	cql-pytest: translate Cassandra's tests for BATCH operations This is a translation of Cassandra's CQL unit test source file validation/operations/BatchTest.java into our cql-pytest framework. This test file includes 13 tests for various types of BATCH operations. All tests pass on Scylla - no known or new bugs were reproduced. Two of the tests involve very slow testing of TTLs, so after verifying they work I marked them "skip" for now (we can always turn them on later, perhaps after reducing the length or number of the sleeps). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220313121634.2611423-1-nyh@scylladb.com>	2022-03-14 09:43:02 +01:00
Botond Dénes	924ff6a503	mutation_compactor: drop v1 support altogether from the API Fully mechanical change. Drop all v1 types, template types. Internal code is left unchanged, will be made v2 only in the next patch.	2022-03-11 09:24:05 +02:00
Botond Dénes	eacdfb2cb7	test/boost/mutation_test: remove v1 specific test code From test_compactor_range_tombstone_spanning_many_pages, preparing for the retirement of the v1 output of the compactor.	2022-03-11 09:24:05 +02:00
Botond Dénes	0b5217052d	querier: switch to v2 compactor output The change is mostly mechanical: update all compactor instances to the _v2 variant and update all call-sites, of which there are not that many. As a consequence of this patch, queries -- both single-partition and range-scans -- now do the v2->v1 conversion in the consumers, instead of in the compactor.	2022-03-11 09:24:05 +02:00
Botond Dénes	7e0b51ff23	Merge 'Overhaul compaction_manager::task' from Benny Halevy The series overhauls the compaction_manager::task design and implementation by properly layering the functionality between the compaction_manager that deals with generic task execution, and the per-task business logic that is defined in a set of classes derived from the generic task class. While at it, the series introduces `task::state` and a set of helper functions to manage it to prevent leaks in the statistics, fixing #9974. Two more stats counters were exposed: `completed_tasks` and a new `postponed_tasks`. Test: sstable_compaction_test Dtest: compaction_test.py compaction_additional_test.py Fixes #9974 Closes #10122 * github.com:scylladb/scylla: compaction_manager: use coroutine::switch_to compaction_manager::task: drop _compaction_running compaction_manager: move per-type logic to derived task compaction_manager: task: add state enum compaction_manager: task: add maybe_retry compaction_manager: reevaluate_postponed_compactions: mark as noexcept compaction_manager: define derived task types compaction_manager: register_metrics: expose postponed_compactions compaction_manager: register_metrics: expose failed_compactions compaction_manager: register_metrics: expose _stats.completed_tasks compaction: add documentation for compaction_type to string conversions compaction: expose to_string(compaction_type) compaction_manager: task: standardize task description in log messages compaction_manager: refactor can_proceed compaction_manager: pass compaction_manager& to task ctor compaction_manager: use shared_ptr<task> rather than lw_shared_ptr compaction_manager: rewrite_sstables: acquire _maintenance_ops_sem once compaction_manager: use compaction_state::lock only to synchronize major and regular compaction	2022-03-10 13:33:56 +02:00
Benny Halevy	a2a5e530f0	compaction_manager: move per-type logic to derived task Move the business logic into the task specific classes. Separating initialization during task construction, from the compaction_done task, moved into a do_run() method, and in some cases moving a lambda function that was called per table (as in rewrite_sstables) into a private method of the derived class. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:20:01 +02:00
Mikołaj Sielużycki	5920349357	row_cache: Make row_cache reader from sstables compacting. Reading data from sstables without compacting first puts unnecessary pressure on the cache. The mutation streams need to be resolved anyway before passing to subsequent consumers, so it's better to do it as close to the source as possible. Fixes: #3568 Closes #10188	2022-03-10 11:40:10 +02:00
Benny Halevy	72162ed653	compaction_manager: define derived task types Turn task into a class, defining a clear hierarchy of private, protected, and public methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 11:35:35 +02:00
Botond Dénes	32e9809e9c	test/boost/mutation_test: migrate to compact_for_mutation_v2	2022-03-10 09:16:33 +02:00
Botond Dénes	959483a2dc	test: migrate to the v2 variant of the sstable writer API	2022-03-10 09:16:33 +02:00
Benny Halevy	20a8609392	compaction_manager: task: standardize task description in log messages Define task::describe and use it via operator<< to print the task metadata to the log in a standard way. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 08:39:18 +02:00
Benny Halevy	33b2731a4a	compaction_manager: pass compaction_manager& to task ctor And use it to get the compaction state of the table to compact. It will be used in a later patch to manage the task state from task methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 08:39:17 +02:00
Benny Halevy	20067b1050	compaction_manager: use shared_ptr<task> rather than lw_shared_ptr Prepare for defining per compaction type tasks derived from compaction_manager::task. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 08:39:17 +02:00
Botond Dénes	105bf8888a	sstables: convert mx writer to v2 The sstables::sstable class has two methods for writing sstables: 1) sstable_writer get_writer(...); 2) future<> write_components(flat_mutation_reader, ...); (1) directly exposes the writer type, so we have to update all users of it (there are not that many) in this same patch. We defer updating users of (2) to follow-up commits.	2022-03-10 07:03:49 +02:00
Botond Dénes	2057db54ab	test/boost/mutation_test: test_compactor_range_tombstone_spanning_many_pages extend to check v2 output too	2022-03-10 07:03:49 +02:00
Botond Dénes	7a37e30310	mutation_reader: convert compacting reader to v2 Its input was already a v2 reader, now it is itself also a v2 reader. With this commit, compaction.cc is finally v2 all-the-way.	2022-03-10 07:03:46 +02:00
Botond Dénes	6544da342a	test/lib/mutation_source_test: log name of each run_mutation_source() Although we have a log in run_mutation_reader_tests(), it is useful to know where it was called from, when trying to find the test scenario that failed.	2022-03-10 06:46:46 +02:00
Avi Kivity	e1c326a5ba	Merge "Convert multishard writer to v2" from Botond " Also convert the foreign_reader used by it in the process. Tests: unit(dev) " * 'multishard-writer-v2/v1' of https://github.com/denesb/scylla: mutation_writer/multishard_writer: remove now unused v1 factory overloads test/boost/mutation_writer_test: test the v2 variant of distribute_reader_and_consume_on_shards() flat_mutation_reader: add v2 variant of make_generating_reader() mutation_reader: multishard_writer: migrate implementation to v2 mutation_reader: convert foreign_reader to v2 streaming/consumer: convert to v2 mutation_writer/multishard_writer: add v2 variant of distribute_reader_and_consume_on_shards()	2022-03-09 19:28:05 +02:00
Tomasz Grabiec	8fa704972f	loading_cache: Make invalidation take immediate effect There are two issues with current implementation of remove/remove_if: 1) If it happens concurrently with get_ptr(), the latter may still populate the cache using value obtained from before remove() was called. remove() is used to invalidate caches, e.g. the prepared statements cache, and the expected semantic is that values calculated from before remove() should not be present in the cache after invalidation. 2) As long as there is any active pointer to the cached value (obtained by get_ptr()), the old value from before remove() will be still accessible and returned by get_ptr(). This can make remove() have no effect indefinitely if there is persistent use of the cache. One of the user-perceived effects of this bug is that some prepared statements may not get invalidated after a schema change and still use the old schema (until next invalidation). If the schema change was modifying UDT, this can cause statement execution failures. CQL coordinator will try to interpret bound values using old set of fields. If the driver uses the new schema, the coordinator will fail to process the value with the following exception: User Defined Type value contained too many fields (expected 5, got 6) The patch fixes the problem by making remove()/remove_if() erase old entries from _loading_values immediately. The predicate-based remove_if() variant has to also invalidate values which are concurrently loading to be safe. The predicate cannot be evaluated on values which are not ready. This may invalidate some values unnecessarily, but I think it's fine. Fixes #10117 Message-Id: <20220309135902.261734-1-tgrabiec@scylladb.com>	2022-03-09 16:13:07 +02:00
Nadav Har'El	397dd64dea	test/cql-pytest: avoid "run" warnings caused by pytest bug This patch gets rid of annoying pytest configuration warnings when running test/cql-pytest/run. These started to happen after commit `afab1a97c6`, due to a pytest bug: In that commit, we added new "--scylla-path" and "--workdir" parameters to our pytest tests, and test/cql-pytest/run started passing them, and test/cql-pytest/run sometest runs pytest as: pytest --host something --workdir somedir --scylla-path somepath sometest Pytest wants to find a configuration file (pytest.ini or tox.ini) in the directory where the tests live, but its logic to find that directory is buggy: It (_pytest/config/findpaths.py::determine_setup()) looks at the command line for directory names, and looks for config files in these directories or any of their parents. It ignores parameters beginning with "-", but in our case the various arguments like "--scylla-path" are each followed by another option, and this one is not ignored! So instead of looking for the config file in sometest's parent directories (and finding test/cql-pytest/pytest.ini), pytest sees the directory given after "scylla-path", and finds the completely irrelevant tox.ini there - and uses that, which (depending what you have installed) can generate warnings. The solution is to change the run script to use "--scylla-path=..." as one parameter instead of "--scylla-path ..." as two parameters. When it's just one parameter, the pytest determine_setup() logic skips it entirely, and finds just the actual test directory. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220309132726.2311721-1-nyh@scylladb.com>	2022-03-09 15:37:08 +02:00
Piotr Sarna	05b66102e9	test: add a case for LIKE operator on a descending order column This case is a regression test for issue #10181, where it turned out that a clustering column with descending order is not properly recognized as a string. This test case used to fail with: cassandra.InvalidRequest: Error from server: code=2200 [Invalid query] message="LIKE is allowed only on string types, which b is not" ...until it got fixed by the previous commit.	2022-03-09 08:56:22 +01:00
Nadav Har'El	c8152e78d7	Merge 'CQL3: fromJson accepts string as bool' from Jadw1 The problem was incompatibility with Cassandra, which accepts bool as a string in `fromJson()` UDF. The difference between Cassandra and Scylla now is that Scylla accepts whitespace around the word in the string, while Cassandra doesn't. Both are case insensitive. Fixes: https://github.com/scylladb/scylla/issues/7915 Closes #10134 * github.com:scylladb/scylla: CQL3/pytest: Updating test_json CQL3: fromJson accepts string as bool	2022-03-08 16:27:40 +02:00
Avi Kivity	1622995900	Merge 'Allow empty partition keys in views' from Nadav Har'El Cassandra generally does not allow empty strings as partition keys (note, by the way, that empty strings are allowed as clustering keys, as well as in individual components of a compound partition key). However, Cassandra does allow empty strings in _regular_ columns - and those regular columns can be indexed by a secondary index, or become an empty partition-key column in a materialized view. As noted in issues #9375 and #9364 and verified in a few xfailing cql-pytest tests, Scylla didn't allow these cases - and this patch series fixes that. Before the last patch in this series finally enables empty-string partition keys in materialized views, we first need to solve a couple of bugs in our code related to handling empty partition keys: The first patch fixes issue #10178 - a bug in `key_view::tri_compare()` where comparing two empty keys returned a random result instead of "equal". The second patch fixes issue #9352: our tokenizer has an inconsistency where for an empty string key, two variants of the same function return different results: 1. One variant `murmur3_partitioner::get_token(bytes_view key)` returned `minimum_token()` for the empty string. 2. Another variant `murmur3_partitioner::get_token(const schema& s, partition_key_view key)` did not have this special case, and called the normal hash-function calculation on the empty string (the resulting token is 0). Variant 2 was an unintentional bug, because Cassandra always does what variant 1 does. So the "obvious" fix here would be to fix variant 2 to do what variant 1 does. Nevertheless, we decided to do the opposite: Change variant 1 to match variant 2. The reasoning is as follows: The `minimum_token()` is `token{token::kind::before_all_keys, 0 }` - it's not a real token. Since we intend in this patch to allow real data to exist with the empty key, we need this real data to have a real token. For example, this token needs to be located on the token ring (so the empty-key partition will have replicas) and also belong to one of the shards, and it's not clear that `minimum_token()` will be handled correctly in this context. After changing the token of the empty string to 0, we note that some places in the code assume that `dht::decorated_key(dht::minimum_token(), partition_key::make_empty())` is a legal decorated key. However, as far as I can tell, none of these places actually assume that the partition-key part (the `make_empty()`) really matches the token - this decorated key is only used to start an iteration (ignoring this key itself) or to indicate a non-existent key (in modern code `std::optional` should be used for that). While normally changing the token of a key is a big faux-pas, which can result in old data no longer being readable, in this case this change is safe because: 1. Scylla previously disallowed empty partition keys (in both base tables and views), so we cannot have had such a partition key saved in any sstable. 3. Cassandra does allow empty partition keys in _views_ and _secondary indexes_, but we do not support migrating sstables of those into Scylla - users are expected to only migrate the base table and then re-create the view or index. So however Cassandra writes those empty-key partitions, we don't care. The third patch finally fixes the materialized views implementation to not drop view rows with an empty-string partition key (#9375). This means we basically revert commit `ec8960df45` - which fixed #3262 by disallowing empty partition keys in views, whereas this patch fixes the same problem by handling the empty partition keys correctly. The fix for the secondary index bug (#9364) comes "for free" because it is based on materialized views. We already had xfailing test cases for empty strings in materialized views and indexes, and after this series they begin to pass so the "xfail" mark is removed. The series also adds additional test cases that validate additional corner cases discovered during the debugging. Fixes #9352 Fixes #9364 Fixes #9375 Fixes #10178 Closes #10170 * github.com:scylladb/scylla: compound_compat.hh: add missing methods of iterator materialized views: allow empty strings in views and indexes murmur3: fix inconsistent token for empty partition key compound_compat.hh: fix bug iterating on empty singular key	2022-03-08 15:55:55 +02:00
Nadav Har'El	ef43531fb6	materialized views: allow empty strings in views and indexes Although Cassandra generally does not allow empty strings as partition keys (note they are allowed as clustering keys!), it does allow empty strings in regular columns to be indexed by a secondary index, or to become an empty partition-key column in a materialized view. As noted in issues #9375 and #9364 and verified in a few xfailing cql-pytest tests, Scylla didn't allow these cases - and this patch fixes that. The patch mostly removes unnecessary code: In one place, code prevented an sstable with an empty partition key from being written. Another piece of removed code was a function is_partition_key_empty() which the materialized-view code used to check whether the view's row will end up with an empty partition key, which was supposedly forbidden. But in fact, such keys should have been allowed like they are allowed in Cassandra and required for the secondary-index implementation, and the entire function wasn't necessary. Note that the removed function is_partition_key_empty() was NOT required for the "IS NOT NULL" feature of materialized views - this continues to work as expected after this patch, and we add another test to confirm it. Being null and being an empty string are two different things. This patch also removes a part of a unit test which enshrined the wrong behavior. After this patch we are left with one interesting difference from Cassandra: Though Cassandra allows a user to create a view row with an empty-string partition key, and this row is fully visible when scanning the view, this row can not be queried individually because "WHERE v=''" is forbidden when v is the partition key (of the view). Scylla does not reproduce this anomaly - and such point query does work in Scylla after this patch. We add a new test to check this case, and mark it "cassandra_bug", i.e., it's a Cassandra behavior which we consider wrong and don't want to emulate. This patch relies on #9352 and #10178 having been fixed in previous patches, otherwise the WHERE v='' does not work when reading from sstables. We add to the already existing tests we had for empty materialized-views keys a lookup with WHERE v='' which failed before fixing those two issues. Fixes #9364 Fixes #9375 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-03-08 15:34:26 +02:00
Nadav Har'El	f8807e24f4	compound_compat.hh: fix bug iterating on empty singular key When iterating over a compound key with legacy_compound_view<>, when the key is "singular" (i.e., a single column) we need to iterate over just the component's actual bytes - without the two length bytes or end-of-component byte. In particular, when the component is an empty string, the iteration should return zero bytes. In other words, we should have begin() == end(). Unfortunately, this is not what happened - for an empty singular key, the iterator returned for begin() was slightly different from end() - so code using this iterator would not know there is nothing to iterate. So in this patch we fix begin() and end() to return the same thing if we have an empty singular key. The bug in legacy_compound_view<> (which we fix here) caused a bug in sstables::key_view::tri_compare(const schema& s, partition_key_view other), causing it to return wrong results when comparing two empty keys. As a result we were unable to retrieve a partition with an empty key from the sstable index. So this patch is necessary to fix support for empty-string keys in sstables (part of issue #9375). This patch also includes a unit-test for this bug. We test it in the context of sstables::key_view::tri_compare(), where it was first discovered, and also test the legacy_compound_view itself. The included test used to fail in both places before this patch, and pass after it. Fixes #10178 Refs #9375 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-03-08 14:14:18 +02:00
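The invariant described in f8807e24f4 can be illustrated with a minimal Python sketch. It is not the Scylla implementation: the helper name `legacy_bytes` is hypothetical, and the multi-component layout (two-byte big-endian length, the component bytes, then a zero end-of-component byte) is an assumption made for illustration. The point is only that a singular key yields its raw bytes, so an empty singular key yields nothing to iterate over.

```python
import struct


def legacy_bytes(components, singular):
    """Return the bytes a legacy-format view would iterate over.

    Hypothetical helper for illustration only. The framing used for
    multi-component keys is an assumption of this sketch.
    """
    if singular:
        # A singular key is iterated as the component's bytes alone:
        # no length prefix and no end-of-component byte.
        (only,) = components
        return bytes(only)
    out = bytearray()
    for component in components:
        out += struct.pack(">H", len(component))  # two length bytes
        out += component                          # the component itself
        out.append(0)                             # end-of-component byte
    return bytes(out)


# An empty singular key produces an empty range, the analogue of
# begin() == end() in the commit message.
empty_singular = legacy_bytes([b""], singular=True)
```

Code that walks `empty_singular` sees zero bytes, so comparing two empty singular keys finds them equal, which is the behavior the commit message says the sstable index lookup relies on.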
Benny Halevy	a085ef74ff	atomic_cell: compare_atomic_cell_for_merge: compare ttl if expiry is equal Following up on `a57c087c89`, compare_atomic_cell_for_merge should compare the ttl value in the reverse order since, when comparing two cells that are identical in all attributes but their ttl, we want to keep the cell with the smaller ttl value rather than the larger ttl, since it was written at a later (wall-clock) time, and so would remain longer after it expires, until purged after gc_grace seconds. Fixes #10173 Test: mutation_test.test_cell_ordering, unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220302154328.2400717-1-bhalevy@scylladb.com> Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220306091913.106508-1-bhalevy@scylladb.com>	2022-03-07 11:05:30 +02:00
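The tie-break described in a085ef74ff can be sketched as follows. This is a simplified illustration rather than the real compare_atomic_cell_for_merge: the `Cell` class and `compare_expiring_cells` function are hypothetical, only the expiry and ttl fields are modeled, and letting the later expiry win is an assumption of the sketch. What the sketch takes from the commit message is that, when expiry is equal, the cell with the smaller ttl is preferred because it was written later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    expiry: int  # absolute expiry time, in seconds
    ttl: int     # time-to-live the cell was written with, in seconds

    @property
    def write_time(self):
        # A cell written at time t with a given ttl expires at t + ttl.
        return self.expiry - self.ttl


def compare_expiring_cells(left, right):
    """Return 1 if left wins the merge, -1 if right wins, 0 if tied."""
    if left.expiry != right.expiry:
        # Assumption of this sketch: the later expiry wins.
        return 1 if left.expiry > right.expiry else -1
    if left.ttl != right.ttl:
        # Reverse order on ttl: the smaller ttl was written later.
        return 1 if left.ttl < right.ttl else -1
    return 0


older = Cell(expiry=1000, ttl=300)  # written at time 700
newer = Cell(expiry=1000, ttl=100)  # written at time 900
```

With these two cells, `compare_expiring_cells(newer, older)` prefers `newer`. The two cells no longer compare as equal, which is the property a57c087c89 introduced to stop repair from repeatedly trying to reconcile them.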
Gleb Natapov	108e7fcc4e	raft: enter candidate state immediately when starting a singleton cluster When a node starts it does not immediately become a candidate, since it waits to learn about an already existing leader and randomizes the time at which it becomes a candidate, to prevent dueling candidates if several nodes are started simultaneously. If a cluster consists of only one node there is no point in waiting before becoming a candidate, though, because neither of the two cases above can happen. This patch checks whether the node belongs to a singleton cluster where the node itself is the only voting member and, if so, becomes a candidate immediately. This reduces the start time of single-node clusters, which are often used in testing. Message-Id: <YiCbQXx8LPlRQssC@scylladb.com>	2022-03-04 20:30:52 +01:00
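The check described in 108e7fcc4e can be sketched in a few lines of Python. The function names and the string state labels are hypothetical and do not mirror Scylla's raft code. The sketch assumes the configuration is available as a collection of voting member ids.

```python
def is_singleton_voter(my_id, voters):
    """True when this node is the only voting member of the configuration."""
    return set(voters) == {my_id}


def initial_state(my_id, voters):
    # In a singleton cluster there is no existing leader to learn about and
    # no other node to duel with, so the node can campaign at once.
    # Otherwise it starts as a follower and waits for a randomized timeout.
    return "candidate" if is_singleton_voter(my_id, voters) else "follower"
```

A node started with an empty configuration, as the joining servers in 1c5ab5d80c are, is not a singleton voter under this check and therefore still starts as a follower.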
Kamil Braun	1c5ab5d80c	test: raft: randomized_nemesis_test: when setting up clusters, only create the first server with singleton configuration When setting up clusters in regression tests, a bunch of servers were created, each starting with a singleton configuration containing itself. This is wrong: servers joining an existing cluster should be started with an empty configuration. It 'worked' because the first server, which we wait for to become a leader before creating the other servers, managed to override the logs and configurations of other servers before they became leaders in their configurations. But if we want to change the logic so that servers in single-server clusters elect themselves as leaders immediately, things start to break. So fix the bug. Message-Id: <20220303100344.6932-1-kbraun@scylladb.com>	2022-03-04 20:29:19 +01:00
Jadw1	213dace26e	CQL3/pytest: Updating test_json Referring to issue #7915, Cassandra also works with unprepared statements. A `fromJson()` call was missing, so the test was inserting a string into a boolean column.	2022-03-04 14:18:42 +01:00
Benny Halevy	a57c087c89	atomic_cell: compare_atomic_cell_for_merge: compare ttl if expiry is equal Unlike atomic_cell_or_collection::equals, compare_atomic_cell_for_merge currently returns std::strong_ordering::equal if two cells are equal in every way except their ttls. The problem with that is that the cells' hashes are different and this will cause repair to keep trying to repair discrepancies caused by the ttl being different. This may be triggered by e.g. the spark migrator, which computes the ttl from the expiry time by subtracting the current time from the expiry time. If the cell is migrated multiple times at different times, it will generate cells that have the same expiry (by design) but different ttl values. Fixes #10156 Test: mutation_test.test_cell_ordering, unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220302154328.2400717-1-bhalevy@scylladb.com>	2022-03-03 15:27:16 +02:00
Nadav Har'El	3d0bd523b5	Merge 'CQL3: fromJson out of range integer cause as error' from Jadw1 Passing an integer that exceeds the corresponding type's bounds to `fromJson()` was causing silent overflow, e.g. inserting `fromJson('2147483648')` into an `int` column stored `-2147483648`. Now, this will cause marshal_exception. All integer types are tested against their bounds. Tests referring to issue https://github.com/scylladb/scylla/issues/7914 in `test/cql-pytest/cassandra_tests/validation/entities/json_test.py` won't pass because the expected error messages differ from the thrown ones. I was wondering what the message should be, because expected messages in tests aren't consistent, for instance: - bigint overflow expects `Expected a bigint value, but got a` message - short overflow expects `Unable to make short from` message For now the message is `Value {} out of bound`. Fixes: https://github.com/scylladb/scylla/issues/7914 Closes #10145 * github.com:scylladb/scylla: CQL3/pytest: Updating test_json CQL3: fromJson out of range integer cause as error	2022-03-03 13:46:16 +02:00
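The bounds check described in 3d0bd523b5 can be sketched in Python. This is an illustration only: `int_from_json`, `INTEGER_BOUNDS` and the `MarshalException` class are hypothetical stand-ins, and the message for non-integer input is invented for the sketch. The `Value {} out of bound` message and the `fromJson('2147483648')` example come from the commit message. The bounds are those of 8-, 16-, 32- and 64-bit signed integers.

```python
import json


class MarshalException(Exception):
    """Stand-in for the marshal_exception named in the commit message."""


# Inclusive bounds of the signed integer widths behind each CQL type.
INTEGER_BOUNDS = {
    "tinyint": (-2**7, 2**7 - 1),
    "smallint": (-2**15, 2**15 - 1),
    "int": (-2**31, 2**31 - 1),
    "bigint": (-2**63, 2**63 - 1),
}


def int_from_json(text, cql_type):
    """Parse a JSON integer and reject values outside the type's bounds."""
    value = json.loads(text)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise MarshalException(f"Expected an integer, got {text!r}")
    low, high = INTEGER_BOUNDS[cql_type]
    if not low <= value <= high:
        # Raise instead of silently wrapping around.
        raise MarshalException(f"Value {value} out of bound")
    return value
```

Calling `int_from_json("2147483648", "int")` raises `MarshalException` rather than returning a wrapped value, while the same text is accepted for `bigint`.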
Jadw1	742efc4992	CQL3/pytest: Updating test_json Added test for bigint overflow.	2022-03-02 15:36:09 +01:00


2868 Commits