scylladb

Author	SHA1	Message	Date
Michael Litvak	fb18b95b3c	test/boost/view_schema_test.cc: fix race in wait_until_built create the view waiter before creating the view, otherwise if the waiter is created after the view is built we may lose the notification.	2025-07-01 13:20:19 +03:00
Kefu Chai	7ff0d7ba98	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22857	2025-02-15 20:32:22 +02:00
Dawid Pawlik	7554e55c2c	test/boost: add vector type cql_env boost tests These tests check serialization and deserialization (including JSON), basic inserts and selects, aggregate functions, element validation, vector usage in user defined types and functions. test_vector_between_user_types is a translated Apache Cassandra test to check if it is handled properly internally.	2025-01-28 21:14:49 +01:00
Takuya ASADA	03461d6a54	test: compile unit tests into a single executable To reduce test executable size and speed up compilation time, compile unit tests into a single executable. Here is a file size comparison of the unit test executable: - Before applying the patch $ du -h --exclude='.o' --exclude='.o.d' build/release/test/boost/ build/debug/test/boost/ 11G build/release/test/boost/ 29G build/debug/test/boost/ - After applying the patch $ du -h --exclude='.o' --exclude='.o.d' build/release/test/boost/ build/debug/test/boost/ 5.5G build/release/test/boost/ 19G build/debug/test/boost/ It reduces executable sizes by 5.5GB on release, and by 10GB on debug. Closes #9155 Closes scylladb/scylladb#21443	2024-12-22 19:14:09 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Dawid Mędrek	926eaf8fe9	test/boost/view_schema_test: Improve comments in test_view_update_generating_writetime In this commit, we elaborate on the semantics of generating view updates for each case the test goes through so that the reader less familiar with the logic has an easier time understanding it.	2024-11-24 22:48:15 +01:00
Dawid Mędrek	2d12acd09a	test/boost/view_schema_test.cc: Improve checks in test_view_update_generating_writetime We modify the checks in the test to obtain full information whenever a failure happens. Before this change, we compared the number of view updates one-by-one. As a result, when the first check failed, we didn't learn anything about the other two. Now we always compare them all at once. A negative impact of this commit is that if one of the lambdas throws an exception, we don't learn ANYTHING. However, a lambda throwing an exception is a more appalling problem than the comparison failing, and we DO learn about it in such a situation; so we accept that cost.	2024-11-24 22:48:13 +01:00
Dawid Mędrek	fb62fc6061	test/boost/view_schema_test.cc: Split test cases in test_view_update_generating_writetime We split some of the test cases so it's clearer what's going on in the test. Also, if a bug happens in the future, it should be easier to reason about it when it corresponds to exactly one CQL statement instead of possibly two.	2024-11-24 22:47:27 +01:00
Dawid Mędrek	f913ae571f	db/view: Don't generate view updates for unselected columns The semantics of Scylla's materialized views may vary depending on how their primary keys correspond to the base table's one. One of the differences is how we handle writes to columns in the base table that are not selected by a view: * Case 1: The view's PK is a permutation of the base table's PK: Since the view's primary key cannot be changed in an update, a row in the view remains alive as long as the corresponding row in the base table is alive. The tricky part comes when the base table has columns that are NOT selected by the view. CQL3 used to not allow for defining a table that didn't have any other columns besides its primary key. Also, when inserting a row into a table, it was mandatory to provide at least one value aside from the primary key. At some point it changed [1] and the implementation of the solution relied on the notion of the row marker. Putting the details aside, consider the following scenario: (i) the base table has a primary key consisting of columns c_1, ..., c_k, and it has regular columns rc_1, ..., rc_n, (ii) the primary key of an MV defined on that table consists of a permutation of c_1, ..., c_k. The MV doesn't select at least one of the regular columns of the base table. Without loss of generality, let that unselected column be rc_1. (iii) the base table has a row R whose only non-null value is the one in the regular column rc_1. Now, what will R correspond to in the MV? The base table doesn't have a row marker, but all of its regular columns in the MV will be NULLs. That's NOT allowed. To solve that problem, all unselected columns have corresponding virtual columns in the MV; the only information they provide is whether there is a value in the base table or not. This way, the MV knows if a row is still alive or not. 
For that reason, we send view updates to virtual columns in the following cases: (i) the value in the column changes from NULL to a value, i.e. it's created, (ii) the value in the column exists, but its TTL has been updated. * Case 2: The view's PK has one more column than the base table's one: Since the primary key of the view has a regular column C from the base table, it is guaranteed that if there's a row in the MV, the corresponding row in the base table can remain alive: since C is part of the view's PK, it must have a value, so the row in the base table has a value in C too. The problem with virtual columns from the previous case doesn't manifest in this one. The liveness of the cell in C determines the liveness of the whole row in the view. The semantics gets more complex, but the conclusion is this: in case 1, virtual columns exist and we may need to generate view updates for them, while in case 2 virtual columns do NOT exist and so we don't generate view updates for them. What changes in this patch is that we adjust the code accordingly. If a view has a regular column from the base table as part of its primary key, we no longer emit view updates when we change a column unselected by that view. It is purely an OPTIMIZATION change. [1]: https://issues.apache.org/jira/browse/CASSANDRA-4361 Fixes scylladb/scylladb#21652 Closes scylladb/scylladb#21653	2024-11-24 19:01:28 +02:00
Dawid Mędrek	af4afc84ec	test/boost/view_schema_test.cc: Increase TTL in test_view_update_generating_writetime The auxiliary function `eventually()` (defined in `test/lib/eventually.hh`) tries to execute a passed function. If it throws, `eventually()` sleeps for `2^#previous_attempts` milliseconds and tries to perform it again. The default limit of attempts is 17. In `test_view_update_generating_writetime`, right before the last test case, we perform: ```cql UPDATE t USING TTL 10 AND TIMESTAMP 8 SET g=40 WHERE k=1 AND c=1; ``` The test case itself executes: ```cql SELECT WRITETIME(g) FROM t; ``` and asserts that the result of the query is equal to 8, i.e. it corresponds to the timestamp of the last write to the table `t`. However, if the test case keeps failing, then during its 14th attempt (so after sleeping for at least `2^14 - 1` milliseconds, which amounts to about 16 seconds), we'll observe the following error: ``` [Exception] - std::runtime_error: Expected row not found: [0000000000000008] not in {result_message::rows {row: null}} ``` The reason behind it is that the specified TTL is too short. 10 seconds will have already passed before the 14th attempt, so the value in the column `g` will be `NULL` again. In particular, the `WRITETIME(g)` will no longer be equal to `8`. To solve that issue, we change the TTL in the CQL statement to 300. The time spent on 17 loops of `eventually()` amounts to about `2^18 - 1` milliseconds, which is about 263 seconds. That's why setting the TTL to 300 seconds should be enough to prevent the error from occurring.	2024-11-19 13:02:34 +01:00
Dawid Mędrek	5ca0cc4e85	test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime Before these changes, we didn't wait for the materialized views to finish building before writing to the base table. That led to generating an additional view update, which, in turn, led to test failures. The scenario corresponding to the summary above looked like this: 1. The test creates an empty table and MVs on it. 2. The view builder starts, but it doesn't finish immediately. 3. The test performs mutations to the base table. Since the views already exist, view updates are generated. 4. Finally, the view builder finishes. It notices that the base table has a row, so it generates a view update for it because it doesn't notice that we already have data in the view. We solve it by explicitly waiting for both views to finish building and only then start writing to the base table. Fixes scylladb/scylladb#20889	2024-11-19 12:51:22 +01:00
Wojciech Mitros	272e80fe0a	node_update_backlog: divide adding and fetching backlogs Currently, we only update the backlogs in node_update_backlog at the same time when we're fetching them. This is done using storage_proxy's method get_view_update_backlog, which is confusing because it's a getter with side-effects. Additionally, we don't always want to update the backlog when we're reading it (as in gossip which is only on shard 0) and we don't always want to read it when we're updating it (when we're not handling any writes but the backlog drops due to background work finishing). This patch divides the node_view_backlog::add_fetch as well as the storage_proxy::get_view_update_backlog both into two methods; one for updating and one for reading the backlog. This patch only replaces the places where we're currently using the view backlog getter, more situations where we should get/update the backlog should be considered in a following patch.	2024-06-06 10:45:13 +02:00
Kefu Chai	5ca9a46a91	test/lib: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18515	2024-05-05 23:31:48 +03:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Jan Ciolek	7f0c64a69d	test: remove invalid IS NOT NULL restrictions from tests The IS NOT NULL restriction is currently supported only in the CREATE MATERIALIZED VIEW statements. These restrictions work correctly for columns that are part of the view's primary key, but they're silently ignored on other columns. The following commits will forbid placing the IS NOT NULL restriction on columns that aren't a part of the view's primary key. The tests have to be modified in order to pass, because some of them have a useless IS NOT NULL restriction on regular columns that don't belong to the view's primary key. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-05-17 15:38:03 +02:00
Kefu Chai	c37f4e5252	treewide: use fmt::join() when appropriate now that fmtlib provides fmt::join(). see https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view there is no need to reinvent the wheel. so in this change, the homebrew join() is replaced with fmt::join(). as fmt::join() returns a join_view(), this could improve the performance under certain circumstances where the fully materialized string is not needed. please note, the goal of this change is to use fmt::join(), and this change does not intend to improve the performance of existing implementation based on "operator<<" unless the new implementation is much more complicated. we will address the unnecessarily materialized strings in a follow-up commit. some noteworthy things related to this change: * unlike the existing `join()`, `fmt::join()` returns a view. so we have to materialize the view if what we expect is a `sstring` * `fmt::format()` does not accept a view, so we cannot pass the return value of `fmt::join()` to `fmt::format()` * fmtlib does not format a typed pointer, i.e., it does not format, for instance, a `const std::string*`. but operator<<() always prints a typed pointer. so if we want to format a typed pointer, we either need to cast the pointer to `void*` or use `fmt::ptr()`. * fmtlib is not able to pick up the overload of `operator<<(std::ostream& os, const column_definition* cd)`, so we have to use a wrapper class of `maybe_column_definition` for printing a pointer to `column_definition`. since the overload is only used by the two overloads of `statement_restrictions::add_single_column_parition_key_restriction()`, the operator<< for `const column_definition*` is dropped. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 20:34:18 +08:00
Nadav Har'El	73e258fc34	materialized views: verify CLUSTERING ORDER BY clause Cassandra is very strict in the CLUSTERING ORDER BY clause which it allows when creating a materialized view - if it appears, it must list all the clustering columns of the view. Scylla is less strict - a subset of the clustering columns may be specified. But Scylla was too lenient - a user could specify non-clustering columns and even non-existent columns and Scylla would not fail the MV creation. This patch fixes that - with it MV creation fails if anything besides clustering columns is listed in CLUSTERING ORDER BY. An xfailing test we had for this case no longer fails after this patch so its xfail mark is removed. We also add a few more corner cases to the tests. This patch also fixes one C++ test which had exactly the error that this patch detects - the test author tried to use the partition key, instead of the clustering key, in CLUSTERING ORDER BY (this error had no effect because the specified order, "asc", was the default anyway). Fixes #10767 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12885	2023-02-27 15:09:42 +02:00
Raphael S. Carvalho	3c5afb2d5c	test: Enable Scylla test command line options for boost tests We have enabled the command line options without changing a single line of code, we only had to replace old include with scylla_test_case.hh. Next step is to add x-log-compaction-groups options, which will determine the number of compaction groups to be used by all instantiations of replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Avi Kivity	ae3a360725	database: Move database, keyspace, table classes to replica/ directory The database, keyspace, and table classes represent the replica-only part of the objects after which they are named. Reading from a table doesn't give you the full data, just the replica's view, and it is not consistent since reconciliation is applied on the coordinator. As a first step in acknowledging this, move the related files to a replica/ subdirectory.	2022-01-06 17:07:30 +02:00
Avi Kivity	27a2c74b64	test: replace seastar::sprint() with fmt::format() sprint() is obsolete.	2021-10-27 17:02:00 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Tomasz Grabiec	617ccc5408	tests: mv: Test dropping columns from base table Reproduces #7061.	2020-08-20 14:53:07 +02:00
Pavel Emelyanov	8618a02815	migration_manager: Remove db/schema_tables.hh inclusion into header The schema_tables.hh -> migration_manager.hh couple seems to work as one of "single header for everything" creating big blot for many seemingly unrelated .hh's. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-17 17:54:43 +03:00
Pavel Emelyanov	86c712a340	test: Split view_schema_test Detach partition_key and clustering_key ones into own files. The resulting 2 tests run ~4 minutes each, the leftover ones complete within 11 minutes. The same -- the goal to run out of 14 minutes is reached, further splitting needs more thinking than just wildcarding. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-03-16 20:27:45 +03:00
Avi Kivity	dcab666d52	cql3: query_processor: reduce #includes query_processor is a central class, so reducing its includes can reduce dependencies treewide. This patch removes includes for parsed_statement, cf_statement, and untyped_result_set and fixes up the rest of the tree to include what it lacks as a result of these removals.	2020-02-09 12:24:24 +02:00
Rafael Ávila de Espíndola	bd93a0af52	types: Return bytes_opt from data_value::serialize Since a data_value can contain a null value, returning bytes from serialize() was losing information as it was mapping null to empty. This also introduces a serialize_nonnull that still returns bytes, but results in an internal error if called with a null value. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-01-29 14:04:59 -08:00
Konstantin Osipov	1c8736f998	tests: move all test source files to their new locations 1. Move tests to test (using singular seems to be a convention in the rest of the code base) 2. Move boost tests to test/boost, other (non-boost) unit tests to test/unit, tests which are expected to be run manually to test/manual. Update configure.py and test.py with new paths to tests.	2019-12-16 17:47:42 +03:00

29 Commits