scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 03:45:11 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	5abddc8568	Merge "Testing performance of different collections" from Pavel Emelyanov There's a perf_bptree test that compares B+ tree collection with std::set and std::map ones. There will come more, also the "patterns" to compare are not just "fill with keys" and "drain to empty", so here's the perf_collection test, that measures timings of - fill with keys - drain key by key - empty with .clear() call - full scan with iterator - insert-and-remove of a single element for currently used collections - std::set - std::map - intrusive_set_external_comparator - bplus::tree * https://github.com/xemul/scylla/tree/br-perf-collection-test: test: Generalize perf_bptree into perf_collection perf_collection: Clear collection between itartions perf_collection: Add intrusive_set_external_comparator perf_collection: Add test for single element insertion perf_collection: Add test for destruction with .clear() perf_collection: Add test for full scan time	2020-11-03 13:42:54 +02:00
Piotr Wojtczak	caa3c471c0	Validate ascii values when creating from CQL Although the code for it existed already, the validation function hasn't been invoked properly. This change fixes that, adding a validating check when converting from text to specific value type and throwing a marshal exception if some characters are not ASCII. Fixes #5421 Closes #7532	2020-11-02 16:47:32 +02:00
Pavel Emelyanov	364ddab148	test: Do not dump test log onto terminal When unit tests fail, test.py dumps their output on the screen. It is impossible to read this output from the terminal, all the more so as the logs are saved in the testlog/ directory anyway. At the same time the names of the failed tests are all left _before_ these logs, and if the terminal history is not large enough, it becomes quite annoying to find the names out. The proposal is not to spoil the terminal with raw logs -- just names and summaries. Logs themselves are at testlog/$mode/$name_of_the_test.log Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20201031154518.22257-1-xemul@scylladb.com>	2020-11-02 15:42:34 +02:00
Tomasz Grabiec	ba42e7fcc5	multishard_mutation_query: Propagate mutation_reader::forwarding flag Otherwise all readers will be created with the default forwarding::yes. This inhibits some optimizations (e.g. results in more sstable read-ahead). It will also be problematic when we introduce mutation sources which don't support forwarding::yes in the future. Message-Id: <1604065206-3034-1-git-send-email-tgrabiec@scylladb.com>	2020-11-02 15:24:36 +02:00
Avi Kivity	eb861e68e9	build: switch to clang as the default compiler Clang brings us working support for coroutines, which are needed for Raft and for code simplification. perf_simple_query as well as full system tests show no significant performance regression. Test: unit(dev, release, debug) Closes #7531	2020-11-02 14:18:13 +02:00
Nadav Har'El	ffbd487c86	Merge 'alternator::streams: Use end-of-record info in get_records' from Calle Wilund Fixes #7496 Since cdc log now has an end-of-batch/record marker that tells us explicitly that we've read the last row of a change, we can use this instead of timestamp checks + limit extra to ensure we have complete records. Note that this does not try to fulfill the user query limit exactly. To do this we would need to add a loop and potentially re-query if the queried rows are not enough. But that is a separate exercise, and superbly suited for coroutines! Closes #7498 * github.com:scylladb/scylla: alternator::streams: Reduce the query limit depending on cdc opts alternator::streams: Use end-of-record info in get_records	2020-11-02 13:34:00 +02:00
Tomasz Grabiec	2dfc5f1ee5	Merge "Cleanup gossiper endpoint interface" from Benny This series cleans up the gossiper endpoint_state interface marking methods const and const noexcept where possible. To achieve that, endpoint_state::get_status was changed to return a string_view rather than a sstring so it won't need to allocate memory. Also, the get_cluster_name and get_partitioner_name were changed to return a const sstring& rather than sstring so they won't need to allocate memory. The motivation for the series stems from #7339 where an exception in get_host_id within a storage_service notification handler, called from seastar::defer crashed the server. With this series, get_host_id may still throw exceptions on logical error, but not from calling get_application_state_ptr. Refs #7339 Test: unit(dev) * tag 'gossiper-endpoint-noexcept-v2': gossiper: mark trivial methods noexcept gossiper: get_cluster_name, get_partitioner_name: make noexcept gossiper: get_gossip_status: return string_view and make noexcept gms/endpoint_state: mark methods using get_status noexcept gms/endpoint_state: get_status: return string_view and make noexcept gms/endpoint_state: mark get_application_state_ptr and is_cql_ready noexcept gms/endpoint_state: mark trivial methods noexcept gms/heart_beat_state: mark methods noexcept gms/versioned_value: mark trivial methods noexcept gms/version_generator: mark get_next_version noexcept fb_utilities.hh: mark methods noexcept messaging: msg_addr: mark methods noexcept gms/inet_address: mark methods noexcept	2020-11-02 12:30:30 +01:00
Avi Kivity	7a3376907e	Merge 'improvements for GCE image' from Bentsi when logging in to the GCE instance that is created from the GCE image it takes 10 seconds to understand that we are not running on AWS. Also, some unnecessary debug logging messages are printed: ``` bentsi@bentsi-G3-3590:~/devel/scylladb$ ssh -i ~/.ssh/scylla-qa-ec2 bentsi@35.196.8.86 Warning: Permanently added '35.196.8.86' (ECDSA) to the list of known hosts. Last login: Sun Nov 1 22:14:57 2020 from 108.128.125.4 _____ _ _ _____ ____ / ____\| \| \| \| \| __ \\| _ \ \| (___ ___ _ _\| \| \| __ _\| \| \| \| \|_) \| \___ \ / __\| \| \| \| \| \|/ _` \| \| \| \| _ < ____) \| (__\| \|_\| \| \| \| (_\| \| \|__\| \| \|_) \| \|_____/ \___\|\__, \|_\|_\|\__,_\|_____/\|____/ __/ \| \|___/ Version: 666.development-0.20201101.6be9f4938 Nodetool: nodetool help CQL Shell: cqlsh More documentation available at: http://www.scylladb.com/doc/ By default, Scylla sends certain information about this node to a data collection server. For information, see http://www.scylladb.com/privacy/ WARNING:root:Failed to grab http://169.254.169.254/latest/... WARNING:root:Failed to grab http://169.254.169.254/latest/... Initial image configuration failed! To see status, run 'systemctl status scylla-image-setup' [bentsi@artifacts-gce-image-jenkins-db-node-aa57409d-0-1 ~]$ ``` this PR fixes this Closes #7523 * github.com:scylladb/scylla: scylla_util.py: remove unnecessary logging scylla_util.py: make is_aws_instance faster scylla_util.py: added ability to control sleep time between retries in curl()	2020-11-02 12:32:25 +02:00
Piotr Sarna	b66c285f94	schema_tables: fix fixing old secondary index schemas Old secondary index schemas did not have their idx_token column marked as computed, and there already exists code which updates them. Unfortunately, the fix itself contains an error and doesn't fire if computed columns are not yet supported by the whole cluster, which is a very common situation during upgrades. Fixes #7515 Closes #7516	2020-11-02 12:30:20 +02:00
Takuya ASADA	100127bc02	install.sh: allow --packaging with nonroot mode Since scylla-ccm wants to skip systemctl, we need to support --packaging in nonroot mode too. Related: #7187	2020-11-02 12:07:14 +02:00
Calle Wilund	7c8f457bab	alternator::streams: Reduce the query limit depending on cdc opts Avoid querying much more than needed. Since we have exact row markers now, this is more safe to do.	2020-11-02 08:37:27 +00:00
Calle Wilund	c79108edbb	alternator::streams: Use end-of-record info in get_records Fixes #7496 Since cdc log now has an end-of-batch/record marker that tells us explicitly that we've read the last row of a change, we can use this instead of timestamp checks + limit extra to ensure we have complete records. Note that this does not try to fulfill the user query limit exactly. To do this we would need to add a loop and potentially re-query if the queried rows are not enough. But that is a separate exercise, and superbly suited for coroutines!	2020-11-02 08:35:36 +00:00
Avi Kivity	b6f8bb6b77	tools/toolchain: update maintainer instructions The instructions are updated for multiarch images (images that can be used on x86 and ARM machines). Additionally, - docker is replaced with podman, since that is now used by developers. Docker is still supported for developers, but the image creation instructions are only tested with podman. - added instructions about updating submodules - `--format docker` is removed. It is not necessary with more recent versions of docker. Closes #7521	2020-11-02 10:29:54 +02:00
Avi Kivity	3993498fb4	connection_notifier: prevent link errors due to variables defined in header connection_notifier.hh defines a number of template-specialized variables in a header. This is illegal since you're allowed to define something multiple times if it's a template, but not if it's fully specialized. gcc doesn't care but clang notices and complains. Fix by defining the variables as inline variables, which are allowed to have definitions in multiple translation units. Closes #7519	2020-11-02 10:28:55 +02:00
Avi Kivity	83b3d3d1d1	test: increase timeout to 12000 seconds to account for slow ARM cores Some ARM cores are slow, and trip our current timeout of 3000 seconds in debug mode. Quadrupling the timeout is enough to make debug-mode tests pass on those machines. Since the timeout's role is to catch rare infinite loops in unsupervised testing, increasing the timeout has no ill effect (other than to delay the report of the failure). Closes #7518	2020-11-02 10:28:14 +02:00
Piotr Sarna	ed047d54bf	Merge 'alternator: fix combination of filter and projection' from Nadav The main goal of this series is to fix issue #6951 - a Query (or Scan) with a combination of filtering and projection parameters produced wrong results if the filter needs some attributes which weren't projected. This series also adds new tests for various corner cases of this issue. These new tests also pass after this fix, or still fail because of some other missing feature (namely, nested attributes). These additional tests will be important if we ever want to refactor or optimize this code, because they exercise some rare corner code paths at the intersection of filtering and projection. This series also fixes some additional problems related to this issue, like combining old and new filtering/projection syntaxes (should be forbidden), and even one fix to a wrong comment. Closes #7328 * github.com:scylladb/scylla: alternator test: tests for nested attributes in FilterExpression alternator test: fix comment alternator tests: additional tests for filter+projection combination alternator: forbid combining old and new-style parameters alternator: fix query with both projection and filtering	2020-11-02 07:28:41 +01:00
Bentsi Magidovich	2866f2d65d	scylla_util.py: remove unnecessary logging when calling curl and an exception is raised we can see unnecessary log messages that we can't control. For example when used in scylla_login we can see the following messages: WARNING:root:Failed to grab http://169.254.169.254/latest/... WARNING:root:Failed to grab http://169.254.169.254/latest/... Initial image configuration failed! To see status, run 'systemctl status scylla-image-setup'	2020-11-02 01:13:44 +03:00
Bentsi Magidovich	a62237f1c6	scylla_util.py: make is_aws_instance faster when used for example in scylla_login we need to understand that we are not running on AWS faster than 10 seconds	2020-11-02 00:11:21 +03:00
Bentsi Magidovich	83a8550a5f	scylla_util.py: added ability to control sleep time between retries in curl()	2020-11-01 22:39:19 +03:00
Avi Kivity	b45c933036	tools: toolchain: update for gcc-10.2.1-6.fc33.x86_64	2020-11-01 19:18:00 +02:00
Avi Kivity	d626563fe3	Update seastar submodule * seastar 57b758c2f9...a62a80ba1d (1): > thread: increase stack size in debug mode	2020-11-01 19:16:59 +02:00
Benny Halevy	e4614d4836	gossiper: mark trivial methods noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:47 +02:00
Benny Halevy	1ba4c84ae2	gossiper: get_cluster_name, get_partitioner_name: make noexcept These methods can return a const sstring& rather than allocating a sstring. And with that they can be marked noexcept. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:29 +02:00
Benny Halevy	11a8912093	gossiper: get_gossip_status: return string_view and make noexcept Change get_gossip_status to return string_view, and with that it can be noexcept now that it doesn't allocate memory via sstring. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Benny Halevy	126e486fde	gms/endpoint_state: mark methods using get_status noexcept Now that get_status returns string_view, just compare it with a const char* rather than making a sstring out of it, and consequently, can be marked noexcept. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Benny Halevy	6b9191b6c2	gms/endpoint_state: get_status: return string_view and make noexcept get_status doesn't need to allocate a sstring, it can just return a std::string_view to the status string, if found. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Benny Halevy	232c665bab	gms/endpoint_state: mark get_application_state_ptr and is_cql_ready noexcept Although std::map::find is not guaranteed to be noexcept, it depends on the comparator used and in this case comparing application_state is noexcept. Therefore, we can safely mark get_application_state_ptr noexcept. is_cql_ready depends on get_application_state_ptr and otherwise handles any exception from boost::lexical_cast so it can be marked noexcept as well. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Benny Halevy	5d8e2c038b	gms/endpoint_state: mark trivial methods noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Benny Halevy	d4c364507e	gms/heart_beat_state: mark methods noexcept Now that get_next_version() is noexcept, update_heart_beat can be noexcept too. All others are trivially noexcept. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Benny Halevy	68a2920201	gms/versioned_value: mark trivial methods noexcept Also, versioned_value::compare_to() can be marked const. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Benny Halevy	c295f521b9	gms/version_generator: mark get_next_version noexcept It is trivially so. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Benny Halevy	87c3fd9cd8	fb_utilities.hh: mark methods noexcept Now that gms::inet_address assignment is marked as noexcept. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Benny Halevy	e28d80ec0c	messaging: msg_addr: mark methods noexcept Based on gms::inet_address. With that, gossiper::get_msg_addr can be marked noexcept (and const while at it). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Benny Halevy	232fc19525	gms/inet_address: mark methods noexcept Based on the corresponding net::inet_address calls. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-01 16:46:18 +02:00
Avi Kivity	6be9f49380	cql3: expression: switch from range_bound to interval_bound to avoid clang class template argument deduction woes Clang does not implement P1814R0 (class template argument deduction for alias templates), so it can't deduce the template arguments for range_bound, but it can for interval_bound, so switch to that. Using the modern name rather than the compatibility alias is preferred anyway. Closes #7422	2020-11-01 13:19:44 +02:00
Nadav Har'El	deaa141aea	docs/isolation.md: fix list of IO priority classes In commit `de38091827` the two IO priority classes streaming_read and streaming_write were merged into just one. The document docs/isolation.md leaves a lot to be desired (hint, hint, to anyone reading this who can write content!) but let's at least not have incorrect information there. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20201101102220.2943159-1-nyh@scylladb.com>	2020-11-01 12:27:06 +02:00
Avi Kivity	46612fe92b	Merge 'Add debug context to views out of sync' from Piotr Sarna This series adds more context to debugging information in case a view gets out of sync with its base table. A test was conducted manually, by: 1. creating a table with a secondary index 2. manually deleting computed column information from system_schema.computed_columns 3. restarting the target node 4. trying to write to the index Here's what's logged right after the index metadata is loaded from disk: ``` ERROR 2020-10-30 12:30:42,806 [shard 0] view - Column idx_token in view ks.t_c_idx_index was not found in the base table ks.t ERROR 2020-10-30 12:30:42,806 [shard 0] view - Missing idx_token column is caused by an incorrect upgrade of a secondary index. Please recreate index ks.t_c_idx_index to avoid future issues. ``` And here's what's logged during the actual failure - when Scylla notices that there exists a column which is not computed, but it's also not found in the base table: ``` ERROR 2020-10-30 12:31:25,709 [shard 0] storage_proxy - exception during mutation write to 127.0.0.1: seastar::internal::backtraced<std::runtime_error> (base_schema(): operation unsupported when initialized only for view reads. Missing column in the base table: idx_token Backtrace: 0x1d14513 0x1d1468b 0x1d1492b 0x109bbad 0x109bc97 0x109bcf4 0x1bc4370 0x1381cd3 0x1389c38 0xaf89bf 0xaf9b20 0xaf1654 0xaf1afe 0xb10525 0xb10ad8 0xb10c3a 0xaaefac 0xabf525 0xabf262 0xac107f 0x1ba8ede 0x1bdf749 0x1be338c 0x1bfe984 0x1ba73fa 0x1ba77a4 0x9ea2c8 /lib64/libc.so.6+0x27041 0x9d11cd -------- seastar::lambda_task<seastar::execution_stage::flush()::{lambda()#1}> ``` Hopefully, this information will make it much easier to solve future problems with out-of-sync views. 
Tests: unit(dev) Fixes #7512 Closes #7513 * github.com:scylladb/scylla: view: add printing missing base column on errors view: simplify creating base-dependent info for reads only view: fix typo: s/dependant/dependent view: add error logs if a view is out of sync with its base	2020-11-01 11:09:58 +02:00
Piotr Wojtczak	2150c0f7a2	cql: Check for timestamp correctness in USING TIMESTAMP statements In certain CQL statements it's possible to provide a custom timestamp via the USING TIMESTAMP clause. Those values are accepted in microseconds, however, there's no limit on the timestamp (apart from type size constraint) and providing a timestamp in a different unit like nanoseconds can lead to creating an entry with a timestamp way ahead in the future, thus compromising the table. To avoid this, this change introduces a sanity check for modification and batch statements that raises an error when a timestamp of more than 3 days into the future is provided. Fixes #5619 Closes #7475	2020-11-01 11:01:24 +02:00
Piotr Sarna	35887bf88b	view: add printing missing base column on errors When an out-of-sync view is attempted to be used in a write operation, the whole operation needs to be aborted with an error. After this patch, the error contains more context - namely, the missing column.	2020-10-31 12:22:07 +01:00
Piotr Sarna	ef3470fa34	view: simplify creating base-dependent info for reads only The code which created base-dependent info for materialized views can be expressed with fewer branches. Also, the constructor which takes a single parameter is made explicit.	2020-10-31 12:22:07 +01:00
Piotr Sarna	71b28d69b3	view: fix typo: s/dependant/dependent	2020-10-31 12:22:07 +01:00
Piotr Sarna	669e2ada92	view: add error logs if a view is out of sync with its base When Scylla finds out that a materialized view contains columns which are not present in the base table (and they are not computed), it now presents comprehensible errors in the log.	2020-10-31 12:22:07 +01:00
Avi Kivity	1734205315	Update seastar submodule * seastar 6973080cd1...57b758c2f9 (11): > http: handle 'match all' rule correctly > http: add missing HTTP methods > memory: remove unused lambda capture in on_allocation_failure() > Support seastar allocator when seastar::alien is used > Merge "make timer related functions noexcept" from Benny > script: update dependecy packages for centos7/8 > tutorial: add linebreak between sections > doc: add nav for the second last chap > doc: add nav bar at the bottom also > doc: rename add_prologue() to add_nav_to_body() > Wrong name used in an example in mini tutorial.	2020-10-30 09:49:47 +02:00
Avi Kivity	27125a45b2	test: switch lsa-related tests (imr_test and double_decker_test) to seastar framework An upcoming change in Seastar only initializes the Seastar allocator in reactor threads. This causes imr_test and double_decker_test to fail: 1. Those tests rely on LSA working 2. LSA requires the Seastar allocator 3. Seastar is not initialized, so the Seastar allocator is not initialized. Fix by switching to the Seastar test framework, which initializes Seastar. Closes #7486	2020-10-30 08:06:04 +02:00
Avi Kivity	8a8589038c	test: increase quota for tests to 6GB test.py estimates the amount of memory needed per test in order not to overload the machine, but it underestimates badly and so machines with many cores but not a lot of memory fail the tests (in debug mode principally) due to running out of memory. Increase the estimate from 2GB per test to 6GB. Closes #7499	2020-10-30 08:04:40 +02:00
Avi Kivity	24097eee11	test: sstable_3_x_test: reduce stack usage in thread-local storage initialization gcc collects all the initialization code for thread-local storage and puts it in one giant function. In combination with debug mode, this creates a very large stack frame that overflows the stack on aarch64. Work around the problem by placing each initializer expression in its own function, thus reusing the stack. Closes #7509	2020-10-30 08:03:44 +02:00
Piotr Grabowski	e96ef0d629	tests: Cleanup select_statement_utils Add additional comments to select_statement_utils, fix formatting, add missing #pragma once and introduce set_internal_paging_size_guard to set internal_paging in RAII fashion. Closes #7507	2020-10-29 15:25:02 +01:00
Asias He	d47033837a	gossiper: Use dedicated gossip scheduling group Gossip currently runs inside the default (main) scheduling group. It is fine to run inside default scheduling group. From time to time, we see many tasks in main scheduling group and we suspect gossip. It is best we can move gossip to a dedicated scheduling group, so that we can catch bugs that leak tasks to main group more easily. After this patch, we can check: scylla_scheduler_time_spent_on_task_quota_violations_ms{group="gossip",shard="0"} Fixes: #7154 Tests: unit(dev)	2020-10-29 12:53:37 +02:00
Avi Kivity	bd73898a5c	dist: redhat: don't pull in kernel package We require a kernel that is at least 3.10.0-514, because older kernels have an XFS related bug that causes data corruption. However this Requires: clause pulls in a kernel even in Docker installation, where it (and especially the associated firmware) occupies a lot of space. Change to a Conflicts: instead. This prevents installation when the really old kernel is present, but doesn't pull it in for the Docker image. Closes #7502	2020-10-29 12:44:22 +02:00
Piotr Sarna	8c645f74ce	Merge 'select_statement: Fix aggregate results on indexed selects (timeouts fixed) ' from Piotr Grabowski Overview Fixes #7355. Before this changes, there were a few invalid results of aggregates/GROUP BY on tables with secondary indexes (see below). Unfortunately, it still does NOT fix the problem in issue #7043. Although this PR moves forward fixing of that issue, there is still a bug with `TOKEN(...)` in `WHERE` clauses of indexed selects that is not addressed in this PR. It will be fixed in my next PR. It does NOT fix the problems in issues #7432, #7431 as those are out-of-scope of this PR and do not affect the correctness of results (only return a too large page). GROUP BY (first commit) Before the change, `GROUP BY` `SELECT`s with some `WHERE` restrictions on an indexed column would return invalid results (same grouped column values appearing multiple times): ``` CREATE TABLE ks.t(pk int, ck int, v int, PRIMARY KEY(pk, ck)); CREATE INDEX ks_t on ks.t(v); INSERT INTO ks.t(pk, ck, v) VALUES (1, 2, 3); INSERT INTO ks.t(pk, ck, v) VALUES (1, 4, 3); SELECT pk FROM ks.t WHERE v=3 GROUP BY pk; pk ---- 1 1 ``` This is fixed by correctly passing `_group_by_cell_indices` to `result_set_builder`. Fixes the third failing example from issue #7355. Paging (second commit) Fixes two issues related to improper paging on indexed `SELECT`s. As those two issues are closely related (fixing one without fixing the other causes invalid results of queries), they are in a single commit (second commit). The first issue is that when using `slice.set_range`, the existing `_row_ranges` (which specify clustering key prefixes) are not taken into account. 
This caused the wrong rows to be included in the result, as the clustering key bound was set to a half-open range: ``` CREATE TABLE ks.t(a int, b int, c int, PRIMARY KEY ((a, b), c)); CREATE INDEX kst_index ON ks.t(c); INSERT INTO ks.t(a, b, c) VALUES (1, 2, 3); INSERT INTO ks.t(a, b, c) VALUES (1, 2, 4); INSERT INTO ks.t(a, b, c) VALUES (1, 2, 5); SELECT COUNT() FROM ks.t WHERE c = 3; count ------- 2 ``` The second commit fixes this issue by properly trimming `row_ranges`. The second fixed problem is related to setting the `paging_state` to `internal_options`. It was improperly set to the value just after reading from index, making the base query start from invalid `paging_state`. The second commit fixes this issue by setting the `paging_state` after both index and base table queries are done. Moreover, the `paging_state` is now set based on `paging_state` of index query and the results of base table query (as base query can return more rows than index query). The second commit fixes the first two failing examples from issue #7355. Tests (fourth commit) Extensively tests queries on tables with secondary indices with aggregates and `GROUP BY`s. Tests three cases that are implemented in `indexed_table_select_statement::do_execute` - `partition_slices`, `whole_partitions` and (non-`partition_slices` and non-`whole_partitions`). As some of the issues found were related to paging, the tests check scenarios where the inserted data is smaller than a page, larger than a page and larger than two pages (and some in-between page boundaries scenarios). I found all those parameters (case of `do_execute`, number of inserted rows) to have an impact of those fixed bugs, therefore the tests validate a large number of those scenarios. Configurable internal_paging_size (third commit) Before this change, internal `page_size` when doing aggregate, `GROUP BY` or nonpaged filtering queries was hard-coded to `DEFAULT_COUNT_PAGE_SIZE` (10,000). 
This change adds new internal_paging_size variable, which is configurable by `set_internal_paging_size` and `reset_internal_paging_size` free functions. This functionality is only meant for testing purposes. Closes #7497 github.com:scylladb/scylla: tests: Add secondary index aggregates tests select_statement: Introduce internal_paging_size select_statement: Fix paging on indexed selects select_statement: Fix GROUP BY on indexed select	2020-10-29 08:30:16 +01:00


24112 Commits