scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 06:05:53 +00:00

Author	SHA1	Message	Date
Patryk Jędrzejczak	16b0eeb3d6	test: ManagerClient: servers_add: specify consistent-topology-changes assumption ManagerClient.servers_add can be called only if the cluster uses consistent topology changes. We add this specification to the leading comment.	2024-01-02 12:19:31 +01:00
Kefu Chai	f4bd86384b	install.sh: use a temporary file when packaging scylla.yaml we create a default `scylla.yaml` on the fly in `install.sh`. but the path to the temporary file holding the default yaml file is hardwired to `/tmp/scylla.yaml`. this works fine if only a single `install.sh` runs at any given time. but if multiple `install.sh` processes run in parallel, these packaging jobs could step on each other when they create and remove the `scylla.yaml`. in this change, because of a limitation of `installconfig`, which always considers the "dest" parameter a directory, `mktemp` is used for creating a parent directory of the temporary file. Fixes #16591 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16592	2024-01-01 21:50:29 +02:00
Kefu Chai	48b8544a63	.git: add skip more words and directories we use "ue" as short for "update_expressions", before we change our minds and use a more readable name, let's add "ue" to the "ignore_word_list" option of codespell. also, use absolute paths in the "skip" option, as absolute paths are also used by codespell's own github workflow. and we are still observing that the codespell github workflow shows misspelling errors in our "test/" directory even though we have it listed in "skip". so this change should silence them as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16593	2024-01-01 14:32:16 +02:00
Avi Kivity	8ba0decda5	Merge 'System.peers: enforce host_id' from Benny Halevy The HOST_ID is already written to system.peers since inception pretty much (See https://github.com/scylladb/scylladb/pull/16376#discussion_r1429248185 for details). However, it is written to the table using an individual CQL query and so it is not set atomically with other columns. If scylla crashes or even hits an exception before updating the host_id, then system.peers might be left in an inconsistent state, and in particular without a HOST_ID value. This series makes sure that HOST_ID is written to system.peers and uses it to "seal" the record by upserting it in a single CQL BATCH query when adding the state for new nodes. On the read side, skip rows that have no HOST_ID state in system.peers, assuming they are incomplete, i.e. scylla got an exception or crashed while writing them, so they can't be trusted. With that change we can assume that endpoint state loaded from system.peers will always have a valid host_id.
Refs https://github.com/scylladb/scylladb/pull/15903 Closes scylladb/scylladb#16376 * github.com:scylladb/scylladb: gms: endpoint_state: change application_state_map to std::unordered_map system_keyspace: update_peer_info: drop single-column overloads storage_service: drop do_update_system_peers_table storage_service: on_change: fixup indentation endpoint_state subscriptions: batch on_change notification everywhere: drop before_change subscription system_keyspace: load_tokens/peers/host_ids: enforce presence of host_id system_keyspace: drop update_tokens(endpoint, tokens) overload storage_service: seal peer info with host_id storage_service: update_peer_info: pass peer_info to sys_ks gms: endpoint_state: define application_state_map system_keyspace: update_peer_info: use struct peer_info for all optional values query_processor: execute_internal: support unset values types: add data_value_list system_keyspace: get rid of update_cached_values storage_service: do not update peer info for this node	2023-12-31 21:22:04 +02:00
Benny Halevy	cdd5605d81	gms: endpoint_state: change application_state_map to std::unordered_map State changes are processed as a batch and there is no reason to maintain them as an ordered map. Instead, use a std::unordered_map that is more efficient. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	c520fc23f0	system_keyspace: update_peer_info: drop single-column overloads They are no longer used. Instead, all callers now pass peer_info. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	0e5a666e6f	storage_service: drop do_update_system_peers_table It is no longer used after previous patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	13d395fa6a	storage_service: on_change: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	ad8a9104d8	endpoint_state subscriptions: batch on_change notification Rather than calling on_change for each particular application_state, pass an endpoint_state::map_type with all changed states, to be processed as a batch. In particular, this allows storage_service::on_change to update_peer_info once for all changed states. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	1d07a596bf	everywhere: drop before_change subscription None of the subscribers is doing anything before_change. This is done before changing `on_change` in the following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	7670f60b83	system_keyspace: load_tokens/peers/host_ids: enforce presence of host_id Skip rows that have no host_id to make sure the node state we load always has a valid host_id. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	74159bb5ae	system_keyspace: drop update_tokens(endpoint, tokens) overload It is unused now after the previous patch to update_peer_info in one call. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	2075c85b70	storage_service: seal peer info with host_id When adding a peer via update_peer_info, insert all columns in a single query using system_keyspace::peer_info. This ensures that `host_id` is inserted along with all other app states, so we can rely on it when loading the peer info after restart. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	eb4cd388ce	storage_service: update_peer_info: pass peer_info to sys_ks Use the newly added system_keyspace::peer_info to pass a struct of all optional system.peers members to system_keyspace::update_peer_info. Add `get_peer_info_for_update` to construct said struct from the endpoint state. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	5abf556399	gms: endpoint_state: define application_state_map Have a central definition for the map held in the endpoint_state (before changing it to std::unordered_map). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:34 +02:00
Benny Halevy	b2735d47f7	system_keyspace: update_peer_info: use struct peer_info for all optional values Define struct peer_info holding optional values for all system.peers columns, allowing the caller to update any column. Pass the values as std::vector<std::optional<data_value>> to query_processor::execute_internal. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:37:30 +02:00
Benny Halevy	6123dc6b09	query_processor: execute_internal: support unset values Add overloads for execute_internal and friends accepting a vector of optional<data_value>. The caller can pass nullopt for any unset value. The vector of optionals is translated internally to `cql3::raw_value_vector_with_unset` by `make_internal_options`. This path will be called by system_keyspace::update_peer_info for updating a subset of the system.peers columns. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:21:35 +02:00
Benny Halevy	328ce23c78	types: add data_value_list data_value_list is a wrapper around std::initializer_list<data_value>. Use it for passing values to `cql3::query_processor::execute_internal` and friends. A following patch will add a std::variant for data_value_or_unset and extend data_value_list to support unset values. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 18:17:27 +02:00
Konstantin Osipov	246da8884a	test.py: override SCYLLA_* env keys test.py inherits its env from the user, which is the right thing: some python modules, e.g. logging, do accept env-based configuration. However, test.py also starts subprocesses, i.e. tests, which start scylladb instances. And when the instance is started without an explicit configuration file, SCYLLA_CONF from user environment can be used. If this scylla.conf contains funny parameters, e.g. unsupported configuration options, the tests may break in an unexpected way. Avoid this by resetting the respective env keys in test.py. Fixes gh-16583 Closes scylladb/scylladb#16577	2023-12-31 13:02:49 +02:00
Benny Halevy	85b3232086	system_keyspace: get rid of update_cached_values It's a no-op. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 10:10:51 +02:00
Benny Halevy	f64ecc2edf	storage_service: do not update peer info for this node system_keyspace had a hack to skip update_peer_info for the local node, and then to remove an entry for the local node in system.peers if `update_tokens(endpoint, ...)` was called for this node. This change unhacks system_keyspace by considering update of system.peers with the local address as an internal error and fixing the call sites that do that. Fixes #16425 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-31 10:10:51 +02:00
Patryk Jędrzejczak	f1dea4bc8a	storage_proxy: do not fence reads and writes to local tables Fencing is necessary only for reads and writes to non-local tables. Moreover, fencing a read or write to a local table can cause an error on the bootstrapping node. It is explained in the comment in storage_proxy::get_fence. A scenario described in the comment has been reported in scylladb/scylladb#16423. A write to the local RAFT table failed because of fencing, and it killed server_impl::io_fiber. Fixes scylladb/scylladb#16423 Closes scylladb/scylladb#16525	2023-12-28 19:34:27 +02:00
Nadav Har'El	91636f6d21	test/cql-pytest: reproducer of slightly too strict parser of timestamp Scylla refuses the timestamp format "2014-01-01 12:15:45.0000000Z" that has 6 digits of precision for the fractional second, and only allows 3 digits of precision. This restriction makes sense - after all CQL timestamp columns (note - this is NOT "using timestamp"!) only have millisecond precision. Nevertheless, Cassandra does not have this restriction and does allow these over-precise timestamps. In this patch we add a test that demonstrates this difference. Curiously, in the past Scylla generated this forbidden timestamp format when outputting the timestamp to a string (e.g. toJson()), which it then couldn't read back! This was issue #16575. Today Scylla no longer generates this forbidden timestamp format. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#16576	2023-12-28 19:01:25 +02:00
Takuya ASADA	7275b614aa	scylla_util.py: wait for apt operation on other processes apt_install() / apt_uninstall() may fail if a background process, such as unattended-upgrades, is running an apt operation. To avoid this, we need to add two things: 1. For apt-get install / remove, we need to pass the option "DPkg::Lock::Timeout=-1" to wait for the dpkg lock. 2. For apt-get update, there is no option to wait for the cache lock. Therefore, we need to implement a retry loop that waits for apt-get update to succeed. Fixes #16537 Closes scylladb/scylladb#16561	2023-12-28 19:00:36 +02:00
Takuya ASADA	331d9ce788	install.sh: fix scylla-server.service failure on nonroot mode On `3da346a86d`, we moved AmbientCapabilities to scylla-server.service, but it causes "Operation not permitted" on nonroot mode. It is because nonroot user does not have enough privilege to set capabilities, we need to disable the parameter on nonroot mode. Closes scylladb/scylladb#16574	2023-12-27 20:52:17 +02:00
Avi Kivity	6394854f04	Merge 'Some cleanups in tests for tablets + MV ' from Nadav Har'El This small series improves two things in the multi-node tests for tablet support in materialized views: 1. The test for Alternator LSI, which "sometimes" could reproduce the bug by creating 10-node cluster with a random tablet distribution, is replaced by a reliable 2-node cluster which controls the tablet distribution. The new test also confirms that tablets are actually enabled in Alternator (reviewers of the original test noted it would be easy to pass the test if tablets were accidentally not enabled... :-)). 2. Simplify the tablet lookup code in the test to not go through a "table id", and lookup the table's (or view's) name directly (requires a full-table scan of the tablets table, but that's entirely reasonable in a test). The third patch in this series also fixes a comment typo discovered in a previous review. Closes scylladb/scylladb#16440 * github.com:scylladb/scylladb: materialized views: fix typo in comment test_mv_tablets: simplify lookup of tablets alternator, tablets: improve Alternator LSI tablets test	2023-12-27 20:18:14 +02:00
Gleb Natapov	e31f6893af	storage_service: topology coordinator: fix accessing outdated node in case of barrier failure When metadata barrier fails a guard is released and node becomes outdated. Failure handling path needs to re-take the guard and re-create the node before continuing. Fixes: #16568 Message-ID: <ZYxEm+SaBeFcRT8E@scylladb.com>	2023-12-27 18:40:10 +02:00
Avi Kivity	3ce0576a31	Merge 'Sanitize keyspace_metadata creation' from Pavel Emelyanov The number of arguments needed to create a ks metadata object is pretty large, and there are many different ways it can be, and is, created across the code. This set simplifies it for the most typical patterns. closes: #16447 closes: #16449 Closes scylladb/scylladb#16565 * github.com:scylladb/scylladb: schema_tables: Use new_keyspace() sugar keyspace_metadata: Drop vector-of-schemas argument from new_keyspace() keyspace_metadata: Add default value for new_keyspace's durable_writes keyspace_metadata: Pack constructors with default arguments	2023-12-27 17:15:04 +02:00
Botond Dénes	1647b29cba	tools/schema_loader: add db::config parameter to all load methods So that a single centrally managed db::config instance can be shared by all code requiring it, instead of creating local instances where needed. This is required to load schema from encrypted schema-tables, and it also helps memory consumption a bit (db::config consumes a lot of memory). Fixes: #16480 Closes scylladb/scylladb#16495	2023-12-27 16:28:38 +02:00
Nadav Har'El	e6dc9bca0d	Merge 'Profile dumping rest api support' from Eliran Sinvani This change is motivated by wanting to have code coverage reporting support. Currently the only way to get a profile dump in ScyllaDB is stopping it with SIGTERM, however, this doesn't suit all cases, more specifically: 1. In dtest, when some of the tests intentionally abruptly kill a node 2. In test.py, where we would like to distinguish (at least for now), graceful shutdown of ScyllaDB testing and teardown procedures (which currently kills the nodes). This mini series adds two changes: 1. It adds the support for profile dumping in ScyllaDB with rest api ('/system/dump_profile') 2. It adds the support for this API in test.py and also adds a call for it as part of the node stop procedure in a permissive way that will not fail the teardown or test if the call doesn't succeed for whatever reason - after this change, all current test.py suites except for pylib_test (expected) dump profiles if instrumented and will be able to participate in coverage reporting. Refs #16323 Closes scylladb/scylladb#16557 * github.com:scylladb/scylladb: test.py: Dump coverage profile before killing a node rest api: Add an api for profile dumping	2023-12-27 12:06:39 +02:00
Eliran Sinvani	e49b3ffc89	test.py: Dump coverage profile before killing a node Up until now the only way to get a coverage profile was to shut down the ScyllaDB nodes gracefully (using SIGTERM), this means that the coverage profile was lost for every node that was killed abruptly (SIGKILL). This in turn would have required us to shut down all nodes gracefully which is not something we set out to do. Here we use the rest API for dumping the coverage profile which will cause the most minimal impact possible on the test runs. If the dumping fails (because the node doesn't support the API or because of a real error in dumping) we ignore it as it is not part of the system we would like to test. Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2023-12-27 07:17:26 +02:00
Eliran Sinvani	4c60804c4c	rest api: Add an api for profile dumping As part of code coverage support we need to work with dumped profiles for ScyllaDB executables. Those profiles are created on two occasions: 1. When an application exits normally (which will trigger __llvm_dump_profile registered in the exit hooks). 2. For ScyllaDB commit `d7b524cf10` introduced a manual call to __llvm_dump_profile upon receiving a SIGTERM signal. This commit adds a third option, a rest API to dump the profile. In addition the target file is logged and the counters are reset, which enables incremental dumping of the profile. Except for logging, if the executable is not instrumented, this API call becomes a no-op so it bears minimal risk in keeping it in our releases. Specifically for code coverage, the gain will be that we will not be required to change the entire test run to shut down clusters gracefully and this will cause minimal effect to the actual test behavior. The change was tested by manually triggering the API with and without instrumentation as well as re-triggering it with write permissions for the profile file disabled (to test fault tolerance). Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2023-12-27 07:06:54 +02:00
Nadav Har'El	fc71c34597	Merge 'select statement: verify EXECUTE permissions only for non native functions' from Eliran Sinvani Commit `62458b8e4f` introduced the enforcement of EXECUTE permissions of functions in cql select. However, according to the reference in #12869, the permissions should be enforced only on UDFs and UDAs. The code does not distinguish between the two so the permissions are unintentionally enforced also on native functions. This commit introduces the distinction and only enforces the permissions on non native functions. Fixes #16526 Manually verified (before and after change) with the reproducer supplied in #16526 and also with the `min` and `max` native functions. Also added test that checks for regression on native functions execution and verified that it fails on authorization before the fix and passes after the fix. Closes scylladb/scylladb#16556 * github.com:scylladb/scylladb: test.py: Add test for native functions permissions select statement: verify EXECUTE permissions only for non native functions	2023-12-26 18:14:21 +02:00
Pavel Emelyanov	129196db98	schema_tables: Use new_keyspace() sugar The create_keyspace_from_schema_partition code creates ks metadata without schemas and user-types. There's new_keyspace() convenience helper for such cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-26 13:26:58 +03:00
Pavel Emelyanov	a1ad2571fc	keyspace_metadata: Drop vector-of-schemas argument from new_keyspace() It's only testing code that wants to call new_keyspace with existing schemas, all the other callers either construct the ks metadata directly, or use convenience new_keyspace with explicitly empty schemas. By and large it's nicer if new_keyspace() doesn't require this argument. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-26 13:00:44 +03:00
Pavel Emelyanov	ffdafe4024	keyspace_metadata: Add default value for new_keyspace's durable_writes Almost all callers call new_keyspace with durable writes ON, so it's worth having default value for it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-26 11:47:37 +03:00
Pavel Emelyanov	9ab0065796	keyspace_metadata: Pack constructors with default arguments There's a cascade of keyspace_metadata constructors each adding one default argument to the previous one. All this can be expressed shorter with the help of native default argument Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-26 11:41:01 +03:00
Eliran Sinvani	a336550041	test.py: Add test for native functions permissions Native functions (non UDF/UDA functions), should be usable even if a user is not granted EXECUTE permissions on them. This is a regression test that was added following: https://github.com/scylladb/scylladb/issues/16526 Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2023-12-26 10:27:04 +02:00
Eliran Sinvani	cac79977d6	select statement: verify EXECUTE permissions only for non native functions Commit `62458b8e4f` introduced the enforcement of EXECUTE permissions of functions in cql select. However, according to the reference in #12869, the permissions should be enforced only on UDFs and UDAs. The code does not distinguish between the two so the permissions are unintentionally enforced also on native functions. This commit introduces the distinction and only enforces the permissions on non native functions. Fixes #16526 Manually verified (before and after change) with the reproducer supplied in #16526 and also with the `min` and `max` native functions. Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2023-12-26 10:27:04 +02:00
Avi Kivity	3968fc11bf	Merge 'cql: fix regression in SELECT * GROUP BY' from Nadav Har'El This short series fixes a regression from Scylla 5.2 to Scylla 5.4 in "SELECT * GROUP BY" - this query was supposed to return just a single row from each partition (the first one in clustering order), but after the expression rewrite started to wrongly return all rows. The series also includes a regression test that verifies that this query doesn't work correctly before this series, but works with this patch - and also works as expected in Scylla 5.2 and in Cassandra. Fixes #16531. Closes scylladb/scylladb#16559 * github.com:scylladb/scylladb: test/cql-pytest: check that most aggregators don't take "*" cql-pytest: add reproducer for GROUP BY regression cql: fix regression in SELECT * GROUP BY	2023-12-25 19:53:55 +02:00
Avi Kivity	3da346a86d	Merge 'Drop CentOS7 specific codes' from Takuya ASADA Since we decided to drop CentOS7 support from latest version of Scylla, now we can drop CentOS7 specific codes from packaging scripts and setup scripts. Related scylladb/scylla-enterprise#3502 Closes scylladb/scylladb#16365 * github.com:scylladb/scylladb: scylla-server.service: switch deprecated PermissionsStartsOnly to ExecStartPre=+ dist: drop legacy control group parameters scylla-server.slice: Drop workaround for MemorySwapMax=0 bug dist: move AmbientCapabilities to scylla-server.service Revert "scylla_setup: add warning for CentOS7 default kernel" [avi: CentOS 7 reached EOL on June 2024]	2023-12-25 18:25:05 +02:00
Kefu Chai	68c98d2203	build: cmake: link against boost static when --static-boost is specified `--static-boost` is an option provided by `configure.py`. this option is not used by our CI or building scripts. but in order to be compatible with the existing behavior of `configure.py`, let's support this option when building with CMake. `Boost_USE_STATIC_LIBS` is a cmake variable supported by CMake's FindBoost and Boost's own `BoostConfig.cmake`. see https://cmake.org/cmake/help/latest/module/FindBoost.html#other-variables by default boost is linked via its shared libraries. by setting this variable, we link boost's static libraries. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16545	2023-12-25 18:23:49 +02:00
Avi Kivity	da022ca4e8	Merge 'build: cmake: add "mode_list" target ' from Kefu Chai scylla uses build modes like "debug" and "release" to differentiate different build modes. while we intend to use the typical build configurations / build types used by CMake like "Debug" and "RelWithDebInfo" for naming CMAKE_CONFIGURATION_TYPES and CMAKE_BUILD_TYPE. the former is used for naming the build directory and for the preprocessor macro named "SCYLLA_BUILD_MODE". `test.py` and scylladb's CI are designed based on the naming of build directory. in which, `test.py` lists the build modes using the dedicated build target named `list_modes`, which is added by `configure.py`. so, in this change, the target is added to CMake as well. the variables of "scylla_build_mode" defined by the per-mode configuration are collected and printed by the `list_modes`. because, by default, CMake generates a target for each build configuration when a multi-config generator is used. but we only want to print the build mode for a single time when "list_modes" is built. so a "BYPRODUCTS" is deliberately added for the target, and the path of this "BYPRODUCTS" is named without the "$<CONFIG>" in its path. Closes scylladb/scylladb#16532 * github.com:scylladb/scylladb: build: cmake: add "mode_list" target build: cmake: define scylla_build_mode	2023-12-25 18:20:34 +02:00
Kefu Chai	4a817f8a2a	data_dictionary: use insert_or_assign() when appropriate when compiling clang-18 in "release" mode, `assert()` is optimized out. so `i` is not used. and clang complains like: ``` /home/kefu/dev/scylladb/data_dictionary/user_types_metadata.hh:29:14: error: unused variable 'i' [-Werror,-Wunused-variable] 29 | auto i = _user_types.find(type->_name); | ^ ``` in this change, we use `i` as the hint for the insertion, for two reasons: - silence the warning. - avoid looking up in the unordered_map twice with the same key. `type` is not moved away when being passed to `insert_or_assign()`, because otherwise, `type->_name` could be referencing a moved-away shared_ptr, because the order of evaluating a function's parameter is not determined. since `type` is a shared_ptr, the overhead is negligible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16530	2023-12-25 18:18:20 +02:00
Takuya ASADA	0b894a7cac	locator::ec2_snitch: change retry logic to exponential backoff Since Amazon recommends using exponential backoff when retrying AWS API calls, we should switch the retry logic in ec2_snitch accordingly. see https://docs.aws.amazon.com/general/latest/gr/api-retries.html Related to #12160 Closes scylladb/scylladb#13442	2023-12-25 18:17:23 +02:00
Yaron Kaikov	8917947f29	build_docker: Add `description` and `summary` labels Adding description and summary labels to our docker images per @tzach and @mykaul request, Closes scylladb/scylladb#16419	2023-12-25 18:14:56 +02:00
Pavel Emelyanov	ac3dd4bf5d	test: Coroutinize some secondary_index_test cases Now they are long then-chains that are hard to read Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#16547	2023-12-25 18:08:19 +02:00
Nadav Har'El	55317666c6	test/cql-pytest: check that most aggregators don't take "*" Although you can "SELECT COUNT(*)", this has special handling in the CQL parser (it is converted into a special row-counting request) and you can't give "*" to other aggregators - e.g., "SELECT SUM(*)". This patch includes a simple test that confirms this. I wanted to check this in relation to the previous patch, which did, sort of, a "SELECT $$first$$(*)" - a syntax which this test shows wouldn't have actually worked if we tried it. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-12-25 17:53:42 +02:00
Nadav Har'El	e2773b4a3a	cql-pytest: add reproducer for GROUP BY regression test/cql-pytest/test_group_by.py has tests that verify that requests like SELECT p,c1,c2,v FROM tbl WHERE p=0 GROUP BY p work as expected - the "GROUP BY p" means in this case that we should only return the first row in the p=0 partition. As a user discovered, it turns out that the almost identical request: SELECT * FROM tbl WHERE p=0 GROUP BY p Doesn't work the same - before the fix in the previous patch, it erroneously returned all rows in p=0, not just the first one. The test in this patch demonstrates this - it fails on Scylla 5.4, passes on Scylla 5.2 and on Cassandra - and passes when the fix from the previous patch is used. This patch includes another tiny test, to check the interaction of GROUP BY with filtering. This second test passes on Scylla - but I want it in anyway because it is yet another interaction that might break (the user that reported #16531 also had filtering, and I was worried it might have been related). Refs #16531 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-12-25 17:53:42 +02:00
Nadav Har'El	1aea2136c8	cql: fix regression in SELECT * GROUP BY Recently, the expression-rewrite effort changed the way that GROUP BY is implemented. Usually GROUP BY involves an aggregation function (e.g., if you want a separate SUM per partition). But there's also a query like SELECT p, c1, c2, v FROM tbl GROUP BY p This query is supposed to return one row - the first row in clustering order - per group (in this case, partition). The expression rewrite re-implemented this feature by introducing a new internal aggregator, first(), which returns the first aggregated value. The above query is rewritten into: SELECT first(p), first(c1), first(c2), first(v) FROM tbl GROUP BY p This case works correctly, and we even have a regression test for it. But unfortunately the rewrite broke the following query: SELECT * FROM tbl GROUP BY p Note the "*" instead of the explicit list of columns. In our implementation, a selection of "*" looks like an empty selection, and it didn't get the "first()" treatment and it remained a "SELECT *" - and wrongly returned all rows instead of just the first one in each partition. This was a regression - it worked correctly in Scylla 5.2 (and also in Cassandra) - see the next patch for a regression test. In this patch we fix this regression. When there is a GROUP BY, the "*" is rewritten to the appropriate list of all visible columns and then gets the first() treatment, so it will return only the first row as expected. The next patch will be a test that confirms the bug and its fix. Fixes #16531 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-12-25 17:52:57 +02:00
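
The GROUP BY behaviour described in commit 1aea2136c8 (one row per group, the first in clustering order, when no aggregation function is given) can be illustrated with a small Python sketch. This is only a model of the expected result, not Scylla's implementation: the helper name `first_row_per_group` and the sample rows are hypothetical, and the rows are assumed to arrive already ordered by partition key and clustering key.

```python
from itertools import groupby
from operator import itemgetter


def first_row_per_group(rows, key):
    """Return the first row of each consecutive group of `rows` sharing `key`.

    Models the expected result of SELECT * FROM tbl GROUP BY <key> when every
    selected column is wrapped in a first()-style aggregator.
    """
    return [next(group) for _, group in groupby(rows, key=itemgetter(key))]


# Hypothetical rows, ordered by partition key p and then clustering key c1.
rows = [
    {"p": 0, "c1": 0, "v": "a"},
    {"p": 0, "c1": 1, "v": "b"},
    {"p": 1, "c1": 0, "v": "c"},
]
result = first_row_per_group(rows, "p")
```

With these rows, the regression described above would correspond to getting all three rows back instead of one row per value of `p`.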
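
Commit 0b894a7cac switches the ec2_snitch retry logic to exponential backoff. The snitch itself is C++ and its actual parameters are not shown in this log, so the following is only a generic Python sketch of the technique: the function name `retry_with_backoff`, the default delays, and the choice of `OSError` as the retryable error are all assumptions made for illustration. Jitter is omitted to keep the example short.

```python
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call `call()` until it succeeds, doubling the wait after each failure.

    The delay starts at `base_delay` and is capped at `max_delay`. The last
    failure is re-raised once `max_attempts` is exhausted.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except OSError:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

The `sleep` parameter exists only so that a caller or test can substitute a fake sleeper and observe the delays without actually waiting.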
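
Commit 246da8884a keeps `SCYLLA_*` variables from the user's environment from leaking into the instances started by test.py. The log does not show how test.py resets those keys, so the sketch below shows just one possible approach, dropping them from the environment passed to a subprocess. The helper name `scrubbed_env` is hypothetical.

```python
import os


def scrubbed_env(env=None, prefix="SCYLLA_"):
    """Return a copy of the environment without variables starting with `prefix`.

    Uses os.environ when `env` is not given. The original mapping is left
    untouched.
    """
    source = os.environ if env is None else env
    return {name: value for name, value in source.items()
            if not name.startswith(prefix)}
```

A caller would then pass the result as the `env` argument of `subprocess.run` or `subprocess.Popen`, so that a stray `SCYLLA_CONF` does not affect the child process.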


40459 Commits