scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 12:17:02 +00:00

Author	SHA1	Message	Date
Dejan Mircevski	f9b00a4318	cql: Fix mixed selection with GROUP BY GROUP BY is currently supported by simple_selection, the class used when all selectors are simple. But when selectors are mixed, we use selection_with_processing, which does not yet support GROUP BY. This patch fixes that. It also adapts one testcase in filtering_test to the new behavior of simple_selector. The test currently expects the last value seen, but simple_selector now outputs the first value seen. (More details: the WHERE clause implicitly selects the columns it references, and unit tests are forced to provide expected values for these columns. The user-visible result is unchanged in the test; users never see the WHERE column values due to filtering in cql::transport, outside unit tests.) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-05-14 12:50:39 -04:00
Dejan Mircevski	06e3b36164	cql: Allow mixing of aggregate and simple selectors Scylla currently rejects SELECT statements with both simple and aggregate selectors, but Cassandra allows them. This patch brings parity to Scylla. Fixes #4447. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-05-14 10:34:02 -04:00
Glauber Costa	a23531ebd5	Support AWS i3en instances AWS just released their new instances, the i3en instances. The instance is verified already to work well with scylla, the only adjustments that we need is advertise that we support it, and pre-fill the disk information according to the performance numbers obtained by running the instance. Fixes #4486 Branches: 3.1 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190508170831.6003-1-glauber@scylladb.com>	2019-05-08 20:09:44 +03:00
Avi Kivity	a86fdeb02b	Merge "Implement GROUP BY" from Dejan " Cassandra has supported GROUP BY in SELECT statements since 2016 (v3.10), while ScyllaDB currently treats it as a syntax error. To achieve parity with Cassandra in this important bit of functionality, this patch adds full support for GROUP BY, from parsing to validation to implementation to testing. " * 'groupby-implPP' of https://github.com/dekimir/scylla: Implement grouping in selection processing Propagate GROUP BY indices to result_set_builder Process GROUP BY columns into select_statement Parse GROUP BY clause, store column identifiers	2019-05-08 18:35:12 +03:00
Dejan Mircevski	d51e4a589d	Implement grouping in selection processing Make result_set_builder obey its _group_by_cell_indices by recognizing group boundaries and resetting the selectors. Also make simple_selectors work correctly when grouping. Fixes #2206. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-05-08 11:05:36 -04:00
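
The group-boundary handling described in d51e4a589d can be pictured with a minimal Python sketch. This is not Scylla's C++ code: `group_rows`, its parameters, and the single COUNT-style aggregate are hypothetical names chosen for illustration, and rows are assumed to arrive ordered so that each group is contiguous. The simple selector keeps the first value seen in each group, in line with the behavior noted in f9b00a4318.

```python
def group_rows(rows, group_by_indices, simple_index):
    """Emit (first simple value, row count) per contiguous group of rows."""
    results = []
    current_key = None
    first_value = None
    count = 0
    for row in rows:
        key = tuple(row[i] for i in group_by_indices)
        if count and key != current_key:
            # Group boundary: flush the finished group and reset the selectors.
            results.append((first_value, count))
            count = 0
        if count == 0:
            # First row of a new group: remember its key and first value.
            current_key = key
            first_value = row[simple_index]
        count += 1
    if count:
        results.append((first_value, count))
    return results
```

With rows `[(1, "a"), (1, "b"), (2, "c")]` grouped on column 0, the sketch yields one result per distinct value of that column.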
Dejan Mircevski	c3929aee3a	Propagate GROUP BY indices to result_set_builder Ensure that the indices recorded in select_statement are passed to result_set_builder when one is created for processing the cell values. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-05-08 10:10:10 -04:00
Dejan Mircevski	274a77f45e	Process GROUP BY columns into select_statement Validate raw GROUP BY identifiers and translate them into a select_statement member. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-05-08 10:10:10 -04:00
Dejan Mircevski	e1fb414805	Parse GROUP BY clause, store column identifiers Extend the grammar file with GROUP BY, collect the column identifiers, and store them in raw::select_statement. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-05-08 10:09:22 -04:00
Avi Kivity	ab3f044daa	Revert "Merge "gc_clock: Fix hashing to be backwards-compatible" from Tomasz" This reverts commit `dcb263b36b`, reversing changes made to `a6759dc6aa`. schema_change_test fails consistently on master with it.	2019-05-08 16:19:38 +03:00
JP-Reddy	56420dc650	scylla_io_setup: TypeError in iotune_args array from scylla_io_setup script Whenever the iotune_args array uses "--smp", it needs cpudata.smp() which returns an integer instead of a string. So when iotune_args is passed to subprocess.check_call(), it actually throws "TypeError: expected str, bytes or os.PathLike object, not int" but "%s did not pass validation tests, it may not be on XFS..." is shown as the exception. Even though the user inputs correct arguments, it might still throw an error and confuse the user that he/she has not passed the right arguments. One simple fix is to use str(cpudata.smp()) instead of cpudata.smp(). Signed-off-by: JP-Reddy <guthijp.reddy@gmail.com> Message-Id: <20190406070118.48477-1-guthijp.reddy@gmail.com>	2019-05-07 20:13:54 +03:00
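
The failure mode in 56420dc650 is easy to reproduce outside Scylla. The sketch below is only an illustration under assumptions: `build_args` is a hypothetical helper, and the child command is a trivial Python invocation standing in for iotune. It shows that every element of an argument list handed to `subprocess.check_call()` must be a string (or bytes or a path), so an integer CPU count has to be converted first.

```python
import subprocess
import sys

def build_args(smp):
    """Build an argument list; str() keeps the integer smp value acceptable."""
    # Hypothetical stand-in command: a Python child that does nothing.
    return [sys.executable, "-c", "pass", "--smp", str(smp)]

def build_args_buggy(smp):
    """Same list without the conversion; an int is left in the list."""
    return [sys.executable, "-c", "pass", "--smp", smp]

def raises_type_error(args):
    """Return True if subprocess rejects the argument list with TypeError."""
    try:
        subprocess.check_call(args)
    except TypeError:
        return True
    return False
```

Calling `raises_type_error(build_args_buggy(4))` reports the TypeError that the original script masked with a misleading validation message.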
Paweł Dziepak	8a16cbc50d	Merge "treewide: adjust for gcc 9" from Avi " gcc 9 complains a lot about pessimizing moves, narrowing conversions, and has tighter deduction rules, plus other nice warnings. Fix problems found by it, and make some non-problems compile without warnings. " * tag 'gcc9/v1' of https://github.com/avikivity/scylla: types: fix pessimizing moves thrift: fix pessimizing moves tests: fix pessimizing moves tests: cql_query_test: silence narrowing conversion warning test: cql_auth_syntax_test: fix ambiguity due to parser uninitialized<T> table: fix potentially wrong schema when reading from zero sstables storage_proxy: fix pessimizing moves memtable: fix pessimizing moves IDL: silence narrowing conversion in bool serializer compaction: fix pessimizing moves cache: fix pessimizing moves locator: fix pessimizing moves database: fix pessimizing moves cql: fix pessimizing moves cql parser: fix conversion from uninitalized<T> to optional<T> with gcc 9	2019-05-07 12:19:29 +01:00
Avi Kivity	43867fe618	types: fix pessimizing moves Remove pessimizing moves, as reported by gcc 9.	2019-05-07 10:01:36 +03:00
Avi Kivity	1b760297f5	thrift: fix pessimizing moves Remove pessimizing moves, as reported by gcc 9.	2019-05-07 10:01:15 +03:00
Avi Kivity	0ff6e48e77	tests: fix pessimizing moves Remove pessimizing moves, as reported by gcc 9.	2019-05-07 10:00:58 +03:00
Avi Kivity	b60d58d6bd	tests: cql_query_test: silence narrowing conversion warning Make it explicit to gcc 9 that the conversion to bool is intended.	2019-05-07 09:59:44 +03:00
Avi Kivity	5636b621a7	test: cql_auth_syntax_test: fix ambiguity due to parser uninitialized<T> gcc 9 is unable to decide whether to call role_name's copy or move constructor. Help it by casting.	2019-05-07 09:58:21 +03:00
Avi Kivity	add20eb9a6	table: fix potentially wrong schema when reading from zero sstables We use the schema during creation of the mutation_source rather than during the query itself. Likely they're the same, and since no rows are returned from a zero-sstable query, harmless. But gcc 9 complains. Fix by using the query's schema.	2019-05-07 09:56:30 +03:00
Avi Kivity	985a30a01c	storage_proxy: fix pessimizing moves Remove pessimizing moves, as reported by gcc 9.	2019-05-07 09:56:09 +03:00
Avi Kivity	fd3c493961	memtable: fix pessimizing moves Remove pessimizing moves, as reported by gcc 9.	2019-05-07 09:55:53 +03:00
Avi Kivity	17c268cd55	IDL: silence narrowing conversion in bool serializer bool serializers are now aliases to int8_t serializers, but gcc 9 complains about narrowing conversions, due to the path int8_t -> int -> bool. A bad narrowing conversion here cannot happen in practice, but massage the code a little to silence it.	2019-05-07 09:28:24 +03:00
Avi Kivity	d7cbd3dc61	compaction: fix pessimizing moves Remove pessimizing moves, as reported by gcc 9.	2019-05-07 09:28:12 +03:00
Avi Kivity	9c7eb95f78	cache: fix pessimizing moves Remove pessimizing moves, as reported by gcc 9.	2019-05-07 09:27:50 +03:00
Avi Kivity	c42d59d805	locator: fix pessimizing moves Remove pessimizing moves, as reported by gcc 9.	2019-05-07 09:27:27 +03:00
Avi Kivity	96a0073929	database: fix pessimizing moves Remove pessimizing moves, as reported by gcc 9.	2019-05-07 09:26:58 +03:00
Avi Kivity	03e9cdbfb0	cql: fix pessimizing moves Remove pessimizing moves, as reported by gcc 9.	2019-05-07 09:26:20 +03:00
Avi Kivity	c26ec176dd	cql parser: fix conversion from uninitalized<T> to optional<T> with gcc 9 We use uninitialized<T> (wrapping an optional<T>) to adjust to the parser's way of laying out the code, but this fails with gcc 9 (presumably for the correct reasons) when converting from uninitialized<T> back to optional<T>. Add a conversion operator to make it build.	2019-05-07 09:21:22 +03:00
Dejan Mircevski	0ea6df2cd1	tests: Add predicates for checking exception messages Many tests verify exception messages. Currently, they do so via verbose lambdas or inner functions that hide test-failure locations. This patch adds utilities for quick creation of message-checking tests and replaces existing ad-hoc methods with these new utilities. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Message-Id: <20190506210006.124645-1-dejan@scylladb.com>	2019-05-07 07:11:07 +03:00
Avi Kivity	dcb263b36b	Merge "gc_clock: Fix hashing to be backwards-compatible" from Tomasz " Commit `d0f9e00` changed the representation of the gc_clock::duration from int32_t to int64_t. Mutation hashing uses appending_hash<gc_clock::time_point>, which by default feeds duration::count() into the hasher. duration::rep changed from int32_t to int64_t, which changes the value of the hash. This affects schema digest and query digests, resulting in mismatches between nodes during a rolling upgrade. Fixes #4460. Branches: 3.1 " * tag 'fix-gc_clock-digest-v1' of github.com:tgrabiec/scylla: tests: Add test which verifies that schema digest stays the same tests: Add sstables for the schema digest test gc_clock: Fix hashing to be backwards-compatible	2019-05-07 07:04:40 +03:00
Tomasz Grabiec	8019634dba	tests: Add test which verifies that schema digest stays the same	2019-05-06 18:43:43 +02:00
Tomasz Grabiec	1f2995c8c5	tests: Add sstables for the schema digest test Generated by running test_schema_digest_does_not_change with regenerate set to true.	2019-05-06 18:43:43 +02:00
Tomasz Grabiec	549d0eb2f3	gc_clock: Fix hashing to be backwards-compatible Commit `d0f9e00` changed the representation of the gc_clock::duration from int32_t to int64_t. Mutation hashing uses appending_hash<gc_clock::time_point>, which by default feeds duration::count() into the hasher. duration::rep changed from int32_t to int64_t, which changes the value of the hash. This affects schema digest and query digests, resulting in mismatches between nodes during a rolling upgrade. Fixes #4460.	2019-05-06 18:43:43 +02:00
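
Why a wider integer type changes a digest can be shown with a small Python sketch. It does not use Scylla's `appending_hash` or its hasher: SHA-256, little-endian encoding, and the name `digest_of_count` are assumptions made for illustration. The point is only that the same numeric value produces different input bytes, and therefore a different hash, when encoded as 32 bits versus 64 bits.

```python
import hashlib
import struct

def digest_of_count(count, fmt):
    """Hash the encoding of `count` produced by the struct format `fmt`."""
    return hashlib.sha256(struct.pack(fmt, count)).hexdigest()

# The same duration count, encoded with two different widths.
old_digest = digest_of_count(3600, "<i")  # 4 bytes, like an int32_t rep
new_digest = digest_of_count(3600, "<q")  # 8 bytes, like an int64_t rep

# The digests differ even though the numeric value is identical.
digests_match = old_digest == new_digest
```

Nodes that feed different widths into the hasher would therefore disagree on digests, which is the mismatch the commit describes for rolling upgrades.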
Avi Kivity	a6759dc6aa	Update seastar submodule * seastar 4cdccae...f73690e (16): > sstring: silence technically correct but unhelpful warning in sstring move ctor > cmake: add a seastar_supports_flag function > future: Fix build with libc++'s non-trivially-constructible std::tuple<> > Revert "Make sure all allocations are properly bytes aligned" > Merge "future: simplify future_state management" from Rafael > Make sure all allocations are properly bytes aligned > util/log: use correct clock type > core/reactor: don't assume system_clock::duration is in nanoseconds > Merge "Optimize the future_state move constructor" from Rafael > rpc: don't use boost/variant.hpp directly > core/memory: Omit [[gnu::leaf]] attribute on clang > Fix build with std::filesystem > Merge "Fix clang build and tests" from Rafael > cmake: Move ) out of quotes > Merge "Fix some bugs found by (or perhaps in) gcc 9" by Avi > Deduplicate Seastar dependencies management in CMake scripts	2019-05-06 19:17:37 +03:00
Gleb Natapov	1d851a3892	messaging: catch an error that sending of CLIENT_ID may return Avoid a warning about unhandled exception. Message-Id: <20190506122718.GL21208@scylladb.com>	2019-05-06 18:13:51 +03:00
Glauber Costa	79a5351651	scylla-housekeeping: timeout eventually scylla-housekeeping always wants to run in the installation to check if we are running the latest version. This happens regardless of whether or not we said yes or no to the housekeeping scylla_setup question - as that question only deals with whether or not we want to do this through a timer. It is fine to try to run scylla-housekeeping, as long as we time it out. The current code doesn't. The naive solution is to add a timeout parameter to urllib.request.open. However, that timeout is not respected and in my tests I saw real timeouts up to four times higher than the timeout we set. For a reasonable 5s timeout, this means a 20s real timeout, which can lead to a very bad user experience. This seems to be a known problem with this module according to a quick Google search. This patch then takes a slightly more complex solution and uses multiprocess to enforce a well-defined user-visible timeout. Fixes #3980 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190506122335.5707-1-glauber@scylladb.com>	2019-05-06 17:37:59 +03:00
Gleb Natapov	b8188e1e2f	storage_proxy: avoid copying of a topology and endpoint array in batchlog code batchlog make copies of topology and endpoint array in batchlog endpoint choosing code. There is a remark that at least endpoint copy is deliberate because Cassandra code has it. We do not have to follow. Our endpoint calculation code is atomic, so we can use a reference. Message-Id: <20190506115815.GK21208@scylladb.com>	2019-05-06 17:36:50 +03:00
Raphael S. Carvalho	ef5681486f	compaction: do not unconditionally delete a new sstable in interrupted compaction After incremental compaction, new sstables may have already replaced old sstables at any point. Meaning that a new sstable is in-use by table and an old sstable is already deleted when compaction itself is UNFINISHED. Therefore, we should NEVER delete a new sstable unconditionally for an interrupted compaction, or data loss could happen. To fix it, we'll only delete new sstables that didn't replace anything in the table, meaning they are unused. Found the problem while auditing the code. Fixes #4479. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20190506134723.16639-1-raphaelsc@scylladb.com>	2019-05-06 16:55:36 +03:00
Avi Kivity	1c65ba6e66	Use correct scylla_tables schema for removing version column Mutations carry their schema, so use that instead of bring in a global schema, which may change as features are added. Message-Id: <20190505132542.6472-1-avi@scylladb.com>	2019-05-06 13:51:08 +02:00
Paweł Dziepak	51e98e0e11	tests/perf_fast_forward: report average number of aio operations perf_fast_forward is used to detect performance regressions. The two main metrics used for this are fragments per second and the number of IO operations. The former is a median of several runs, but the latter is just the actual number of asynchronous IO operations performed in the run that happened to be picked as a median frag/s-wise. There's not always a direct correlation between frag/s and aio, and the latter can vary, which makes it hard to compare. In order to make this easier a new metric was introduced: "average aio" which reports the average number of asynchronous IO operations performed in a run. This should produce much more stable results and therefore make the comparison more meaningful. Message-Id: <20190430134401.19238-1-pdziepak@scylladb.com>	2019-05-06 11:47:31 +02:00
Piotr Sarna	cf8d2a5141	Revert "view: cache is_index for view pointer" This reverts commit `dbe8491655`. Caching the value was not done in a correct manner, which resulted in longevity tests failures. Fixes #4478 Branches: 3.1 Message-Id: <762ca9db618ca2ed7702372fbafe8ecd193dcf4d.1557129652.git.sarna@scylladb.com>	2019-05-06 11:45:46 +03:00
Benny Halevy	d9136f96f3	commitlog: descriptor: skip leading path from filename std::regex_match of the leading path may run out of stack with long paths in debug build. Using rfind instead to look up the last '/' in the pathname and skip the leading path if found. Fixes #4464 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20190505144133.4333-1-bhalevy@scylladb.com>	2019-05-05 17:51:56 +03:00
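
The rfind approach from d9136f96f3 amounts to a constant-stack string search. Below is a minimal Python sketch of the same idea; the function name `skip_leading_path` and the example path are hypothetical, and the actual patch is C++ using `std::string::rfind`.

```python
def skip_leading_path(pathname):
    """Return the part of `pathname` after the last '/', or all of it if none."""
    pos = pathname.rfind("/")
    if pos == -1:
        # No directory component: the name is already bare.
        return pathname
    return pathname[pos + 1:]
```

Because the search is a single backward scan, its cost does not depend on how deeply nested the path is, unlike a regular expression that has to match the whole leading path.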
Benny Halevy	3a2fa82d6e	time_window_backlog_tracker: fix use after free Fixes #4465 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20190430094209.13958-1-bhalevy@scylladb.com>	2019-05-05 12:47:51 +03:00
Glauber Costa	47d04e49e8	scylla_setup: respect user's decision not to call housekeeping The setup script asks the user whether or not housekeeping should be called, and in the first time the script is executed this decision is respected. However if the script is invoked again, that decision is not respected. This is because the check has the form: if (housekeeping_cfg_file_exists) { version_check = ask_user(); } if (version_check) { do_version_check() } else { dont_do_it() } When it should have the form: if (housekeeping_cfg_file_exists) { version_check = ask_user(); if (version_check) { do_version_check() } else { dont_do_it() } } (Thanks python) This is problematic in systems that are not connected to the internet, since housekeeping will fail to run and crash the setup script. Fixes #4462 Branches: master, branch-3.1 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190502034211.18435-1-glauber@scylladb.com>	2019-05-02 18:46:41 +03:00
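
The two control-flow shapes quoted in 47d04e49e8 can be written out as a small Python sketch. It mirrors only the pseudocode in the commit message, not the real scylla_setup script: the function names, the `cfg_condition` flag, and the assumption that `version_check` starts out as `True` are all hypothetical.

```python
def buggy_flow(cfg_condition, ask_user, version_check=True):
    """The second `if` sits outside the first, so it always runs."""
    actions = []
    if cfg_condition:
        version_check = ask_user()
    if version_check:
        actions.append("version_check")
    else:
        actions.append("skipped")
    return actions

def fixed_flow(cfg_condition, ask_user, version_check=True):
    """The decision is only acted on when the user was actually asked."""
    actions = []
    if cfg_condition:
        version_check = ask_user()
        if version_check:
            actions.append("version_check")
        else:
            actions.append("skipped")
    return actions
```

In the buggy shape, a run in which the question is never asked still performs the version check based on the stale default, which is the behavior the patch removes.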
Glauber Costa	99c00547ad	make scylla_util OS detection robust against empty lines Newer versions of RHEL ship the os-release file with newlines in the end, which our script was not prepared to handle. As such, scylla_setup would fail. This patch makes our OS detection robust against that. Fixes #4473 Branches: master, branch-3.1 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190502152224.31307-1-glauber@scylladb.com>	2019-05-02 18:33:35 +03:00
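
A parser that tolerates blank lines, as 99c00547ad requires, might look like the following sketch. It is not the code from scylla_util: `parse_os_release` is a hypothetical name, and skipping comment lines and lines without `=` is an extra assumption beyond what the commit describes.

```python
def parse_os_release(text):
    """Parse KEY=VALUE lines, ignoring blank lines, comments, and malformed lines."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            # Trailing empty lines end up here instead of raising an error.
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip("\"'")
    return info
```

Feeding it content that ends in several newlines gives the same result as content with none.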
Paweł Dziepak	cf451f0e62	Merge "gdb: Fixes and improvements to memory analysis" from Tomasz " One of the fixes is for incorrect recognition of memory pages as belonging or not belonging to small allocation pools in some cases. Also, compensates for https://github.com/scylladb/seastar/issues/608 in "scylla memory", which improves accuracy of the small allocation pool report. Fixes "scylla task_histogram" to not look into pages which do not belong to live small allocation pool spans. Fixes #4367 Fixes #4368 " * tag 'gdb-fix-span-qualification-v2' of github.com:tgrabiec/scylla: gdb: Print size of large allocations in 'scylla ptr' gdb: Fix 'scylla ptr' for free pages gdb: Set is_live and offset for large allocations properly in 'scylla ptr' gdb: Fix 'scylla ptr' misqualifying pointers gdb: Make 'scylla memory' show unused memory in small pools gdb: Fix small pool memory usage reporting in 'scylla memory' gdb: Switch 'scylla memory' to use the span_checker to find large spans gdb: Switch task_histogram to use the span_checker gdb: Introduce span_checker	2019-05-02 14:25:30 +01:00
Gleb Natapov	95c6d19f6c	batchlog_manager: fix array out of bound access endpoint_filter() function assumes that each bucket of std::unordered_multimap contains elements with the same key only, so its size can be used to know how many elements with a particular key are there. But this is not the case, elements with multiple keys may share a bucket. Fix it by counting keys in other way. Fixes #3229 Message-Id: <20190501133127.GE21208@scylladb.com>	2019-05-01 17:30:11 +03:00
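
The wrong assumption behind 95c6d19f6c, that a hash bucket holds only one key, can be illustrated with a toy model in Python. This is not `std::unordered_multimap`: the hash function `key % bucket_count`, the names `bucket_sizes` and `key_counts`, and the sample keys are all hypothetical choices that force a collision.

```python
from collections import Counter

def bucket_sizes(keys, bucket_count):
    """Count elements per bucket, using the toy hash key % bucket_count."""
    return Counter(key % bucket_count for key in keys)

def key_counts(keys):
    """Count elements per key, independent of how buckets are laid out."""
    return Counter(keys)

# Keys 1 and 9 share bucket 1 when there are 8 buckets.
keys = [1, 1, 9, 9, 9, 2]
size_of_bucket_for_key_1 = bucket_sizes(keys, 8)[1 % 8]
elements_with_key_1 = key_counts(keys)[1]
```

Here the bucket holds five elements while only two of them have key 1, so sizing an array from the bucket size and indexing it by per-key position can run out of bounds.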
Nadav Har'El	2710f382de	secondary index: expand test of secondary-index and UPDATE requests The existing unit test test_secondary_index_contains_virtual_columns reproduced a bug (issue #4144) with indexing of primary-key columns, but we only actually tested clustering columns. In issue #4471 there was a question whether we may still have a bug when indexing of partition-key columns. This patch adds a test that verifies that we don't, and this case works well too. Refs #4144 Refs #4471 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190501113500.25900-1-nyh@scylladb.com>	2019-05-01 12:53:23 +01:00
Nadav Har'El	a45b6e41a0	materialized views and secondary index: sometimes allow dropping base columns Until this patch, dropping columns from a table was completely forbidden if this table has any materialized views or secondary indexes. However, this is excessively harsh, and not compatible with Cassandra which does allow dropping columns from a base table which has a secondary index on other columns. This incompatibility was raised in the following Stackoverflow question: https://stackoverflow.com/questions/55757273/error-while-dropping-column-from-a-table-with-secondary-index-scylladb/55776490 In this patch, we allow dropping a base table column if none of its materialized views needs this column. Columns selected by a view (as regular or key columns) are needed by it, of course, but when virtual columns are used (namely, there is a view with same key columns as the base), all columns are needed by the view, so unfortunately none of the columns may be dropped. After this patch, when a base-table column cannot be dropped because one of the materialized views needs it, the error message will look like: exceptions::invalid_request_exception: Cannot drop column a from base table ks.cf: a materialized view cf_a_idx_index needs this column. This patch also includes extensive testing for the cases where dropping columns are now allowed, and not allowed. The secondary-index tests are especially interesting, because they demonstrate that now usually (when a non-key column is being indexed) dropping columns will be allowed, which is what originally bothered the Stackoverflow user. Fixes #4448. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190429214805.2972-1-nyh@scylladb.com>	2019-04-30 12:13:10 +01:00
Nadav Har'El	92d5f61ba5	cql: support single-value IN restriction wherever EQ restriction is supported There are several places where IN restrictions are not currently supported, especially in queries involving a secondary index. However, when the IN restriction has just a single value, it is nothing more than an equality restriction and can be converted into one and be supported. So this patch does exactly this. Note that Cassandra does this conversion since August 2016, and therefore supports the special case of single-value IN even where general IN is not supported. So it's important for Cassandra compatibility that we do this conversion too. This patch also includes a test with two queries involving a secondary index that were previously disallowed because of the "IN" on the primary key or the indexed column - and are now allowed when the IN restriction has just a single value. A third query tested is not related to secondary indexes, but confirms we don't break multi-column single-value IN queries. Fixes #4455. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190428160317.23328-1-nyh@scylladb.com>	2019-04-30 12:13:06 +01:00
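
The conversion described in 92d5f61ba5 is conceptually a small rewrite step. The sketch below models a restriction as a plain `(column, operator, values)` tuple; that representation and the name `normalize_restriction` are assumptions for illustration and do not reflect Scylla's restriction classes.

```python
def normalize_restriction(column, operator, values):
    """Turn a single-value IN into EQ; leave every other restriction unchanged."""
    if operator == "IN" and len(values) == 1:
        # `c IN (v)` selects exactly the rows that `c = v` selects.
        return (column, "EQ", values)
    return (column, operator, values)
```

After this step, any code path that accepts EQ restrictions also accepts the single-value IN form, while multi-value IN is left as it was.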
Tomasz Grabiec	1adcb3637e	Merge "multishard reader: fix handling of non strictly monotonous positions" from Botond The shard readers of the multishard reader assumed that the positions in the data stream are strictly monotonous. This assumption is invalid. Range tombstones can have positions that they can share with other range tombstones and/or a clustering row. The effect of this false assumption was that when the shard reader was evicted such that the last seen fragment was a range tombstone, when recreated it would skip any unseen fragments that have the same position as that of the last seen range tombstone. Fixes: #4418 Branches: master, 3.0, 2019.1 Tests: unit(dev) * https://github.com/denesb/scylla.git multishard_reader_handle_non_strictly_monotonous_positions/v4: multishard_combining_reader: shard_reader::remote_reader extract fill-buffer logic into do_fill_buffer() mutlishard_combining_reader: reorder shard_reader::remote_reader::do_fill_buffer() code position_in_partition_view: add region() accessor multishard_combining_reader: fix handling of non-strictly monotonous positions flat_mutation_reader: add flat_mutation_reader_from_mutations() overload with range and slice flat_mutation_reader: add make_flat_mutation_reader_from_fragments() overload with range and slice tests: add unit test for multishard reader correctly handling non-strictly monotonous positions	2019-04-30 12:35:28 +02:00
Tomasz Grabiec	077c639e42	Merge "Simplify the result_set_row API" from Rafael Currently null and missing values are treated differently. Missing values throw no_such_column. Null values return nullptr, std::nullopt or throw null_column_value. The api is a bit confusing since a function returning a std::optional either returns std::nullopt or throws depending on why there is no value. With this patch series only get_nonnull throws and there is only one exception type. * https://github.com/espindola/scylla.git espindola/merge-null-and-missing-v2: query-result-set: merge handling of null and missing values Remove result_set_row::has Return a reference from get_nonnull	2019-04-30 11:06:29 +02:00


18610 Commits