scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 17:40:34 +00:00

Author	SHA1	Message	Date
Benny Halevy	749371c2b0	mutation: consume_clustering_fragments: shuffle emit_rt logic around To prepare for a following patch that will get rid of the cookie.reversed_range_tombstones list. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 16:44:23 +03:00
Benny Halevy	0e21073c38	mutation: consume, consume_gently: simplify partition_start logic Concentrate the logic in a single (!cookie.partition_start_consumed) block Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 15:49:12 +03:00
Benny Halevy	d661b84d51	mutation: consume_clustering_fragments: pass iterators to mutation_consume_cookie ctor and set crs and rts only in the block where they are used, so we can get rid of reversed_range_tombstones. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 15:30:36 +03:00
Benny Halevy	f1b7a1a6f1	mutation: consume_clustering_fragments: keep the reversed schema in cookie Rather than reversing the schema on every call just keep the potentially reversed schema in cookie. Otherwise, cookie.schema was write only. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 15:30:36 +03:00
Benny Halevy	a230ea0019	mutation: clustering_iterators: get rid of current_rt It is currently write-only. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 15:30:16 +03:00
Benny Halevy	017f9b4131	mutation_test: test_mutation_consume_position_monotonicity: test also consume_gently Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 14:43:52 +03:00
Nadav Har'El	055340ae39	cql-pytest: increase more timeouts In commit `7eda6b1e90`, we increased the request_timeout parameter used by cql-pytest tests from the default of 10 seconds to 120 seconds. 10 seconds was usually more than enough for finishing any Scylla request, but it turned out that in some extreme cases of a debug build running on an extremely over-committed machine, the default timeout was not enough. Recently, in issue #11289 we saw additional cases of timeouts which the request_timeout setting did not solve. It turns out that the Python CQL driver has two additional timeout settings - connect_timeout and control_connection_timeout, which default to 5 seconds and 2 seconds respectively. I believe that most of the timeouts in issue #11289 come from the control_connection_timeout setting - by changing it to a tiny number (e.g., 0.0001) I got the same error messages as those reported in #11289. The default of that timeout - 2 seconds - is certainly low enough to be reached on an extremely over-committed machine. So this patch significantly increases both connect_timeout and control_connection_timeout to 60 seconds. We don't care that this timeout is ridiculously large - under normal operations it will never be reached. There is no code which loops for this amount of time, for example. Refs #11289 (perhaps even Fixes, we'll need to see that the test errors go away). NOTE: This patch only changes test/cql-pytest/util.py, which is only used by the cql-pytest test suite. We have multiple other test suites which copied this code, and those test suites might need fixing separately. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11295	2022-08-16 19:11:59 +03:00
Kamil Braun	08842444b4	gms: gossiper: include nodes with empty feature sets when calculating enabled features Right now, if there's a node for which we don't know the features supported by this node (they are neither persisted locally, nor gossiped by that node), we would skip this node in calculating the set of enabled features and potentially enable a feature which shouldn't be enabled - because that node may not know it. We should only enable a feature when we know that all nodes have upgraded and know the feature. This bug caused us problems when we tried to move RAFT out of experimental. There are dtests such as `partitioner_tests.py` in which nodes would enable features prematurely, which caused the Raft upgrade procedure to break (the procedure starts only when all nodes upgrade and announce that they know the SUPPORTS_RAFT cluster feature). Closes #11225	2022-08-16 19:07:41 +03:00
Piotr Sarna	cf30d4cbcf	Merge 'Secondary index of collection columns' from Nadav Har'El This pull request introduces global secondary-indexing for non-frozen collections. The intent is to enable such queries: ``` CREATE TABLE test(id int, somemap map<int, int>, somelist list<int>, someset set<int>, PRIMARY KEY(id)); CREATE INDEX ON test(keys(somemap)); CREATE INDEX ON test(values(somemap)); CREATE INDEX ON test(entries(somemap)); CREATE INDEX ON test(values(somelist)); CREATE INDEX ON test(values(someset)); -- index on test(c) is the same as index on (values(c)) CREATE INDEX IF NOT EXISTS ON test(somelist); CREATE INDEX IF NOT EXISTS ON test(someset); CREATE INDEX IF NOT EXISTS ON test(somemap); SELECT * FROM test WHERE someset CONTAINS 7; SELECT * FROM test WHERE somelist CONTAINS 7; SELECT * FROM test WHERE somemap CONTAINS KEY 7; SELECT * FROM test WHERE somemap CONTAINS 7; SELECT * FROM test WHERE somemap[7] = 7; ``` We use the familiar materialized views (MVs) here. Scylla treats all the collections the same way - they're a list of pairs (key, value). In case of sets, the value type is a dummy one. In case of lists, the key type is TIMEUUID. When describing the design, I will ignore the fact that there is more than one collection type. Suppose that the columns in the base table were as follows: ``` pkey int, ckey1 int, ckey2 int, somemap map<int, text>, PRIMARY KEY(pkey, ckey1, ckey2) ``` The MV schema is as follows (the names of columns which are not the same as in base might be different). All the columns here form the primary key. ``` -- for index over entries indexed_coll (int, text), idx_token long, pkey int, ckey1 int, ckey2 int -- for index over keys indexed_coll int, idx_token long, pkey int, ckey1 int, ckey2 int -- for index over values indexed_coll text, idx_token long, pkey int, ckey1 int, ckey2 int, coll_keys_for_values_index int ``` The reason for the last additional column is that the values from a collection might not be unique.
Fixes #2962 Fixes #8745 Fixes #10707 This patch does not implement local secondary indexes for collection columns: Refs #10713. Closes #10841 * github.com:scylladb/scylladb: test/cql-pytest: un-xfail yet another passing collection-indexing test secondary index: fix paging in map value indexing test/cql-pytest: test for paging with collection values index cql, view: rename and explain bytes_with_action cql, index: make collection indexing a cluster feature test/cql-pytest: failing tests for oversized key values in MV and SI cql: fix secondary index "target" when column name has special characters cql, index: improve error messages cql, index: fix default index name for collection index test/cql-pytest: un-xfail several collecting indexing tests test/cql-pytest/test_secondary_index: verify that local index on collection fails. docs/design-notes/secondary_index: add `VALUES` to index target list test/cql-pytest/test_secondary_index: add randomized test for indexes on collections cql-pytest/cassandra_tests/.../secondary_index_test: fix error message in test ported from Cassandra cql-pytest/cassandra_tests/.../secondary_index_on_map_entries,select_test: test ported from Cassandra is expected to fail, since Scylla assumes that comparison with null doesn't throw error, just evaluates to false. Since it's not a bug, but expected behavior from the perspective of Scylla, we don't mark it as xfail. 
test/boost/secondary_index_test: update for non-frozen indexes on collections test/cql-pytest: Uncomment collection indexes tests that should be working now cql, index: don't use IS NOT NULL on collection column cql3/statements/select_statement: for index on values of collection, don't emit duplicate rows cql/expr/expression, index/secondary_index_manager: needs_filtering and index_supports_expression rewrite to accommodate for indexes over collections cql3, index: Use entries() indexes on collections for queries cql3, index: Use keys() and values() indexes on collections for queries. types/tuple: Use std::begin() instead of .begin() in tuple_type_impl::build_value_fragmented cql3/statements/index_target: throw exception to signalize that we didn't miss returning from function db/view/view.cc: compute view_updates for views over collections view info: has_computed_column_depending_on_base_non_primary_key column_computation: depends_on_non_primary_key_column schema, index/secondary_index_manager: make schema for index-induced mv index/secondary_index_manager: extract keys, values, entries types from collection cql3/statements/: validate CREATE INDEX for index over a collection cql3/statements/create_index_statement,index_target: rewrite index target for collection column_computation.hh, schema.cc: collection_column_computation column_computation.hh, schema.cc: compute_value interface refactor Cql.g, treewide: support cql syntax `INDEX ON table(VALUES(collection))`	2022-08-16 14:18:51 +02:00
Nadav Har'El	fbb0b66d0c	test/cql-pytest: fix run's "--ssl" option Commit `23acc2e848` broke the "--ssl" option of test/cql-pytest/run (which makes Scylla - and cqlpytest - use SSL-encrypted CQL). The problem was that there was a confusion between the "ssl" module (Python's SSL support) and a new "ssl" variable. A rename and a missing "import" solves the breakage. We never noticed this because Jenkins does not run cql-pytest/run with --ssl (actually, it no longer runs cql-pytest/run at all). It is still a useful option for checking SSL-related problems in Scylla and Seastar. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11292	2022-08-16 12:29:05 +02:00
Kamil Braun	4e35e62597	Merge 'Raft test topology part 3' from Alecco Test schema changes when there was an underlying topology change. - per test case checks of cluster health and cycling - helper class to do cluster manager API requests - tests can perform topology changes: stop/start/restart servers - modified clusters are marked dirty and discarded after the test case - cql connection is updated per topology change and per cluster change Closes #11266 * github.com:scylladb/scylladb: test.py: test topology and schema changes test.py: ClusterManager API mark cluster dirty test.py: call before/after_test for each test case test.py: handle driver connection in ManagerClient test.py: ClusterManager API and ManagerClient test.py: improve topology docstring	2022-08-16 11:00:26 +02:00
Avi Kivity	afa7960926	Merge 'database: evict all inactive reads for table when detaching table' from Botond Dénes Currently, when detaching the table from the database, we force-evict all queriers for said table. This series broadens the scope of this force-evict to include all inactive reads registered at the semaphore. This ensures that any regular inactive read "forgotten" for any reason in the semaphore, will not end up in said readers accessing a dangling table reference when destroyed later. Fixes: https://github.com/scylladb/scylladb/issues/11264 Closes #11273 * github.com:scylladb/scylladb: querier: querier_cache: remove now unused evict_all_for_table() database: detach_column_family(): use reader_concurrency_semaphore::evict_inactive_reads_for_table() reader_concurrency_semaphore: add evict_inactive_reads_for_table()	2022-08-15 19:05:59 +03:00
Botond Dénes	d56dcb842c	db/virtual_table: add virtual destructor to virtual_table It should have had one, since derived instances are stored and destroyed via the base class. The only reason this hasn't caused bugs yet is that derived instances happen to not have any non-trivial members yet. Closes #11293	2022-08-15 16:58:05 +03:00
Avi Kivity	73d4930815	Merge 'test/lib: various improvements to sstable test env' from Botond Dénes A mixed bag of improvements developed as part of another PR (https://github.com/scylladb/scylladb/pull/10736). Said PR was closed so I'm submitting these improvements separately. Closes #11294 * github.com:scylladb/scylladb: test/lib: move convenience table config factory to sstable_test_env test/lib/sstable_test_env: move members to impl struct test/lib/sstable_utils: use test_env::do_with_async()	2022-08-15 16:57:01 +03:00
Botond Dénes	92e5f438a4	querier: querier_cache: remove now unused evict_all_for_table()	2022-08-15 14:16:41 +03:00
Botond Dénes	2b1eb6e284	database: detach_column_family(): use reader_concurrency_semaphore::evict_inactive_reads_for_table() Instead of querier_cache::evict_all_for_table(). The new method covers all queriers and, in addition, any other inactive reads registered on the semaphore. In theory, by the time we detach a table, no regular inactive reads should be in the semaphore anymore, but if any remain, we had better evict them before the table is destroyed, since they might attempt to access it when destroyed later.	2022-08-15 14:16:41 +03:00
Botond Dénes	e55ccbde8f	reader_concurrency_semaphore: add evict_inactive_reads_for_table() Allowing for evicting all inactive reads that belong to a certain table.	2022-08-15 14:16:41 +03:00
Botond Dénes	c8ef356859	test/lib: move convenience table config factory to sstable_test_env All users of `column_family_test_config()`, get the semaphore parameter for it from `sstable_test_env`. It is clear that the latter serves as the storage space for stable objects required by the table config. This patch just enshrines this fact by moving the config factory method to `sstable_test_env`, so it can just get what it needs from members.	2022-08-15 11:23:59 +03:00
Botond Dénes	c0e017e0f7	test/lib/sstable_test_env: move members to impl struct All present members of sstable_test_env are std::unique_ptr<>:s because they require stable addresses. This makes their handling somewhat awkward. Move all of them into an internal `struct impl` and make that member a unique ptr.	2022-08-15 11:20:09 +03:00
Botond Dénes	a9f296ed47	test/lib/sstable_utils: use test_env::do_with_async() Instead of manually instantiating test_env.	2022-08-15 11:19:27 +03:00
Botond Dénes	a9573b84c5	Merge 'commitlog: Revert/modify `fac2bc4` - do footprint add in delete' from Calle Wilund Fixes #11184 Fixes #11237 In prev (broken) fix for https://github.com/scylladb/scylladb/issues/11184 we added the footprint for left-over files (replay candidates) to disk footprint on commitlog init. This effectively prevents us from creating segments iff we have tight limits. Since we nowadays do quite a bit of inserts _before_ commitlog replay (system.local, but...) we can end up in a situation where we deadlock start because we cannot get to the actual replay that will eventually free things. Another, not thought through, consequence is that we add a single footprint to _all_ commitlog shard instances - even though only shard 0 will get to actually replay + delete (i.e. drop footprint). So shards 1-X would all be either locked out or performance degraded. Simplest fix is to add the footprint in delete call instead. This will lock out segment creation until delete call is done, but this is fast. Also ensures that only replay shard is involved. To further emphasize this, don't store segments found on init scan in all shard instances, instead retrieve (based on low time-pos for current gen) when required. This changes very little, but we at last don't store pointless string lists in shards 1 to X, and also we can potentially ask for the list twice. More to the point, goes better hand-in-hand with the semantics of "delete_segments", where any file sent in is considered candidate for recycling, and included in footprint. Closes #11251 * github.com:scylladb/scylladb: commitlog: Make get_segments_to_replay on-demand commitlog: Revert/modify `fac2bc4` - do footprint add in delete	2022-08-15 09:10:32 +03:00
Botond Dénes	8f10413087	Merge 'doc: describe specifying workload attributes with service levels' from Anna Stuchlik Fix https://github.com/scylladb/scylladb/issues/11197 This PR adds a new page where specifying workload attributes with service levels is described and adds it to the menu. Also, I had to fix some links because of the warnings. Closes #11209 * github.com:scylladb/scylladb: doc: remove the redundant space from index doc: update the syntax for defining service level attributes doc: rewording doc: update the links to fix the warnings doc: add the new page to the toctree doc: add the description of specifying workload attributes with service levels doc: add the definition of workloads to the glossary	2022-08-15 07:14:28 +03:00
Nadav Har'El	c8b5c3595e	Merge 'cql3: select_statement: coroutinize indexed_table_select_statement::do_execute_base_query()' from Avi Kivity Increase readability in preparation for managing topology with effective_replication_map (continuing `69aea59d9`). Closes #11290 * github.com:scylladb/scylladb: cql3: select_statement: improve loop termination condition in indexed_table_select_statement::do_execute_base_query() cql3: select_statement: reindent indexed_table_select_statement::do_execute_base_query() cql3: select_statement: coroutinize indexed_table_select_statement::do_execute_base_query() cql3: select_statement: de-result_wrap indexed_table_select_statement::do_execute_base_query()	2022-08-14 23:26:06 +03:00
Nadav Har'El	4a4231ea53	Merge 'storage_proxy: coroutinize some counter mutate functions' from Avi Kivity In preparation for effective_replication_map hygiene, convert some counter functions to coroutines to simplify the changes. Closes #11291 * github.com:scylladb/scylladb: storage_proxy: mutate_counters_on_leader: coroutinize storage_proxy: mutate_counters: coroutinize storage_proxy: mutate_counters: reorganize error handling	2022-08-14 23:16:42 +03:00
Avi Kivity	8070cdbbf9	storage_proxy: mutate_counters_on_leader: coroutinize Simplify ahead of refactoring for consistent effective_replication_map.	2022-08-14 17:36:58 +03:00
Avi Kivity	6e330d98d2	storage_proxy: mutate_counters: coroutinize Simplify ahead of refactoring for consistent effective_replication_map. This is probably a pessimization of the error case, but the error case will be terrible in any case unless we resultify it.	2022-08-14 17:28:46 +03:00
Avi Kivity	105b066ff7	storage_proxy: mutate_counters: reorganize error handling Move the error handling function where it's used so the code is more straightforward. Due to some std::move()s later, we must still capture the schema early.	2022-08-14 17:13:22 +03:00
Avi Kivity	fbaa280acd	cql3: select_statement: improve loop termination condition in indexed_table_select_statement::do_execute_base_query() Move the termination condition to the front of the loop so it's clear why we're looping and when we stop. It's less than perfectly clean since we widen the scope of some variables (from loop-internal to loop-carried), but IMO it's clearer.	2022-08-14 15:40:45 +03:00
Avi Kivity	60c7c11c96	cql3: select_statement: reindent indexed_table_select_statement::do_execute_base_query() Reindent after coroutinization. No functional changes.	2022-08-14 15:35:36 +03:00
Avi Kivity	492dc6879e	cql3: select_statement: coroutinize indexed_table_select_statement::do_execute_base_query() It's much easier to maintain this way. Since it uses ranges_to_vnodes, it interacts with topology and needs integration into effective_replication_map management. The patch leaves bad indentation and an infinite-looking loop in the interest of minimization, but that will be corrected later. Note, the test for `!r.has_value()` was eliminated since it was short-circuited by the test for `!rqr.has_value()` returning from the coroutine rather than propagating an error.	2022-08-14 15:31:45 +03:00
Avi Kivity	973034978c	cql3: select_statement: de-result_wrap indexed_table_select_statement::do_execute_base_query() We use result_wrap() in two places, but that makes coroutinizing the containing function a little harder, since it's composed of more lambdas. Remove the wrappers, gaining a bit of performance in the error case.	2022-08-14 15:22:18 +03:00
Kamil Braun	b4c5b79f5e	db: system_distributed_keyspace: don't call `on_internal_error` in `check_exists` The function `check_exists` checks whether a given table exists, giving an error otherwise. It previously used `on_internal_error`. `check_exists` is used in some old functions that insert CDC metadata to CDC tables. These tables are no longer used in newer Scylla versions (they were replaced with other tables with different schema), and this function is no longer called. The table definitions were removed and these tables are no longer created. They will only exist in clusters that were upgraded from old versions of Scylla (4.3) through a sequence of upgrades. If you tried to upgrade from a very old version of Scylla which had neither the old nor the new tables to a modern version, say from 4.2 to 5.0, you would get `on_internal_error` from this `check_exists` function. Fortunately: 1. we don't support such upgrade paths 2. `on_internal_error` in production clusters does not crash the system, only throws. The exception would be caught, printed, and the system would run (just without CDC - until you finished upgrade and called the proper nodetool command to fix the CDC module). Unfortunately, there is a dtest (`partitioner_tests.py`) which performs an unsupported upgrade scenario - it starts Scylla from Cassandra (!) work directories, which is like upgrading from a very old version of Scylla. This dtest was not failing due to another bug which masked the problem. When we try to fix the bug - see #11225 - the dtest starts hitting the assertion in `check_exists`. Because it's a test, we configure `on_internal_error` to crash the system. The point of this commit is to not crash the system in this rare scenario which happens only in some weird tests. We now throw `std::runtime_error` instead of calling `on_internal_error`. In the dtest, we already ignore the resulting CDC error appearing in the logs (see scylladb/scylla-dtest#2804).
Together with this change, we'll be able to fix the #11225 bug and pass this test. Closes #11287	2022-08-14 13:12:03 +03:00
Nadav Har'El	329068df99	test/cql-pytest: un-xfail yet another passing collection-indexing test After collection indexing has been implemented, yet another test which failed because of #2962 now passes. So remove the "xfail" marker. Refs #2962 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	f6f18b187a	secondary index: fix paging in map value indexing When indexing a map column's values, if the same value appears more than once, the same row will appear in the index more than once. We had code that removed these duplicates, but this deduplication did not work across page boundaries. We had two xfailing tests to demonstrate this bug. In this patch we fix this bug by looking at the page's start and not generating the same row again, thereby getting the same deduplication we had inside pages - now across pages. The previously-xfailing tests now pass, and their xfail tag is removed. I also added another test, for the case where the base table has only partition keys without clustering keys. This second test is important because the code path for the partition-key-only case is different, and the second test exposed a bug in it as well (which is also fixed in this patch). Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	dc445b9a73	test/cql-pytest: test for paging with collection values index If a map has several keys with the same value, then the "values(m)" index must remember all of them as matching the same row - because later we may remove one of these keys from the map but the row would still need to match the value because of the remaining keys. We already had a test (test_index_map_values) that although the same row appears more than once for this value, when we search for this value the result only returns the row once. Under the hood, Scylla does find the same value multiple times, but then eliminates the duplicate matched row and returns it only once. But there is a complication, that this de-duplication does not easily span paging. So in this patch we add a test that checks that paging does not cause the same row to be returned more than once. Unfortunately, this test currently fails on Scylla so it is marked "xfail". It passes on Cassandra. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	5d556115a1	cql, view: rename and explain bytes_with_action The structure "bytes_with_action" was very hard to understand because of its mysterious and general-sounding name, and no comments. In this patch I add a large comment explaining its purpose, and rename it to a more suitable name, view_key_and_action, which suggests that each such object is about one view key (where to add a view row), and an additional "action" that we need to take beyond adding the view row. This is the best I can do to make this code easier to understand without completely reorganizing it. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	8b00c91c13	cql, index: make collection indexing a cluster feature Prevent a user from creating a secondary index on a collection column if the cluster has any nodes which don't support this feature. Such nodes will not be able to correctly handle requests related to this index, so better not allow creating one. Attempting to create an index on a collection before the entire cluster supports this feature will result in the error: Indexing of collection columns not supported by some older nodes in this cluster. Please upgrade them. Tested by manually disabling this feature in feature_service.cc and seeing this error message during collection indexing test. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	aa86f808a6	test/cql-pytest: failing tests for oversized key values in MV and SI In issue #9013, we noticed that if a value larger than 64 KB is indexed, the write fails in a bad way, and we fixed it. But the test we wrote when fixing that issue already suggested that something was still wrong: Cassandra failed the write cleanly, with an InvalidRequest, while Scylla failed with a mysterious WriteFailure (with a relevant error message only in the log). This patch adds several xfailing tests which demonstrate what's still wrong. This is also summarized in issue #8627: 1. A write of an oversized value to an indexed column returns the wrong error message. 2. The same problem also exists when indexing a collection, and the indexed key or value is oversized. 3. The situation is even less pleasant when adding an index to a table with pre-existing data and an oversized value. In this case, the view building will fail on the bad row, and never finish. 4. We have exactly the same bugs not just with indexes but also with materialized views. Interestingly, Cassandra has similar bugs in materialized views as well (but not in the secondary index case, where Cassandra does behave as expected). Refs #8627. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	2c244c6e09	cql: fix secondary index "target" when column name has special characters Unfortunately, we encode the "target" of a secondary index in one of three ways: 1. It can be just a column name 2. It can be a string like keys(colname) - for the new type of collection indexes introduced in this series. 3. It can be a JSON map ({ ... }). This form is used for local indexes. The code parsing this target - target_parser::parse() - must not confuse these different formats. Before this patch, if the column name contains special characters like braces or parentheses (this is allowed in CQL syntax, via quoting), we can confuse case 1, 2, and 3: A column named "keys(colname)" will be confused for case 2, and a column named "{123}" will be confused with case 3. This problem can break indexing of some specially-crafted column names - as reproduced by test_secondary_index.py::test_index_quoted_names. The solution adopted in this patch is that the column name in case 1 should be escaped somehow so it cannot possibly be confused with either case 2 or 3. The way we chose is to convert the column name to CQL (with column_definition::as_cql_name()). In other words, if the column name contains non-alphanumeric characters, it is wrapped in quotes and also quotes are doubled, as in CQL. The result of this can't be confused with case 2 or 3, neither of which may begin with a quote. This escaping is not the minimal we could have done, but incidentally it is exactly what Cassandra does as well, so I used it as well. This change is mostly backward compatible: Already-existing indexes will still have unescaped column names stored for their "target" string, and the unescaping code will see they are not wrapped in quotes, and not change them. Backward compatibility will only fail on existing indexes on columns whose names begin and end with the quote character - but this case is extremely unlikely.
This patch illustrates how un-ideal our index "target" encoding is, but isn't what made it un-ideal. We should not have used three different formats for the index target - the third representation (JSON) should have sufficed. However, the two other representations are identical to Cassandra's, so using them when we can has its compatibility advantages. The patch makes test_secondary_index.py::test_index_quoted_names pass. Fixes #10707. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	56204a3794	cql, index: improve error messages Before this patch, trying to create an index on entries(x) where x is not a map results in an error message: Cannot create index on index_keys_and_values of column x The string "index_keys_and_values" is strange - Cassandra prints the easier to understand string "entries()" - which better corresponds to what the user actually did. It turns out that this string "index_keys_and_values" comes from an elaborate set of variables and functions spanning multiple source files, used to convert our internal target_type variable into such a string. But although this code was called "index_option" and sounded very important, it was actually used just for one thing - error messages! So in this patch we drop the entire "index_option" abstraction, replacing it by a static trivial function defined exactly where it's used (create_index_statement.cc), which prints a target type. While at it, we print "entries()" instead of "index_keys_and_values" ;-) After this patch, the test_secondary_index.py::test_index_collection_wrong_type finally passes (the previous patch fixed the default table names it assumes, and this patch fixes the expected error messages), so its "xfail" tag is removed. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	84461f1827	cql, index: fix default index name for collection index When creating an index "CREATE INDEX ON tbl(keys(m))", the default name of the index should be tbl_m_idx - with just "m". The current code incorrectly used the default name tbl_m_keys_idx, so this patch adds a test (which passes on Cassandra, and after this patch also on Scylla) and fixes the default name. It turns out that the default index name was based on a mysterious index_target::as_string(), which printed the target "keys(m)" as "m_keys" without explaining why it was so. This method was actually used only in three places, and all of them wanted just the column name, without the "_keys" suffix! So in this patch we rename the mysterious as_string() to column_name(), and use this function instead. Now that the default index name uses column_name() and gets just column_name(), the correct default index name is generated, and the test passes. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	94ba03a4d6	test/cql-pytest: un-xfail several collecting indexing tests After the previous patches implemented collection indexing, several tests in test/cql-pytest/test_secondary_index.py that were marked with "xfail" started to pass - so here we remove the xfail. Only three collection indexing tests continue to xfail: test_secondary_index.py::test_index_collection_wrong_type test_secondary_index.py::test_index_quoted_names (#10707) test_secondary_index.py::test_local_secondary_index_on_collection (#10713) Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Michał Radwański	2690ecd65d	test/cql-pytest/test_secondary_index: verify that local index on collection fails. Collection indexing is being tracked by #2962. Global secondary index over collection is enabled by #10123. Leave this test to track this behaviour. Related issue: #10713	2022-08-14 10:29:52 +03:00
Michał Radwański	1d852a9c7f	docs/design-notes/secondary_index: add `VALUES` to index target list A new secondary index target is being supported, which is `VALUES(v)`.	2022-08-14 10:29:52 +03:00
Michał Radwański	25f4c905f5	test/cql-pytest/test_secondary_index: add randomized test for indexes on collections	2022-08-14 10:29:52 +03:00
Michał Radwański	2a8289c101	cql-pytest/cassandra_tests/.../secondary_index_test: fix error message in test ported from Cassandra	2022-08-14 10:29:52 +03:00
Michał Radwański	fb476702a7	cql-pytest/cassandra_tests/.../secondary_index_on_map_entries,select_test: test ported from Cassandra is expected to fail, since Scylla assumes that comparison with null doesn't throw error, just evaluates to false. Since it's not a bug, but expected behavior from the perspective of Scylla, we don't mark it as xfail.	2022-08-14 10:29:52 +03:00
Michał Radwański	f572051ee9	test/boost/secondary_index_test: update for non-frozen indexes on collections	2022-08-14 10:29:52 +03:00
Karol Baryła	9e377b2824	test/cql-pytest: Uncomment collection indexes tests that should be working now	2022-08-14 10:29:52 +03:00
Nadav Har'El	67990d2170	cql, index: don't use IS NOT NULL on collection column When the secondary-index code builds a materialized view on column x, it adds "x IS NOT NULL" to the where-clause of the view, as required. However, when we index a collection column, we index individual pieces of the collection (keys, values), not the entire collection, so checking if the entire collection is null does not make sense. Moreover, for a collection column x, "x IS NOT NULL" currently doesn't work and throws errors when evaluating that expression when data is written to the table. The solution used in this patch is to simply avoid adding the "x IS NOT NULL" when creating the materialized view for a collection index. Everything works just fine without it. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00

32713 Commits