scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 18:10:39 +00:00

Author	SHA1	Message	Date
Kefu Chai	f916286b25	index: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16892	2024-01-21 16:52:25 +02:00
Kefu Chai	0ae81446ef	./: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16766	2024-01-17 16:30:14 +02:00
Botond Dénes	cf188f40b9	index: s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:50:41 -04:00
Marcin Maliszkiewicz	bcbaccc143	rjson: avoid copy constructors in from_string calls when possible This function anyway copies the value so no need to do extra copy.	2023-01-16 15:15:26 +01:00
Nadav Har'El	2c244c6e09	cql: fix secondary index "target" when column name has special characters Unfortunately, we encode the "target" of a secondary index in one of three ways: 1. It can be just a column name 2. It can be a string like keys(colname) - for the new type of collection indexes introduced in this series. 3. It can be a JSON map ({ ... }). This form is used for local indexes. The code parsing this target - target_parser::parse() - must not confuse these different formats. Before this patch, if the column name contains special characters like braces or parentheses (this is allowed in CQL syntax, via quoting), we can confuse case 1, 2, and 3: A column named "keys(colname)" will be confused with case 2, and a column named "{123}" will be confused with case 3. This problem can break indexing of some specially-crafted column names - as reproduced by test_secondary_index.py::test_index_quoted_names. The solution adopted in this patch is that the column name in case 1 should be escaped somehow so it cannot possibly be confused with either case 2 or 3. The way we chose is to convert the column name to CQL (with column_definition::as_cql_name()). In other words, if the column name contains non-alphanumeric characters, it is wrapped in quotes and also quotes are doubled, as in CQL. The result of this can't be confused with case 2 or 3, neither of which may begin with a quote. This escaping is not the minimal we could have done, but incidentally it is exactly what Cassandra does as well, so I used it as well. This change is mostly backward compatible: Already-existing indexes will still have unescaped column names stored for their "target" string, and the unescaping code will see they are not wrapped in quotes, and not change them. Backward compatibility will only fail on existing indexes on columns whose names begin and end with the quote character - but this case is extremely unlikely. This patch illustrates how un-ideal our index "target" encoding is, but isn't what made it un-ideal. We should not have used three different formats for the index target - the third representation (JSON) should have sufficed. However, the two other representations are identical to Cassandra's, so using them when we can has its compatibility advantages. The patch makes test_secondary_index.py::test_index_quoted_names pass. Fixes #10707. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Nadav Har'El	56204a3794	cql, index: improve error messages Before this patch, trying to create an index on entries(x) where x is not a map results in an error message: Cannot create index on index_keys_and_values of column x The string "index_keys_and_values" is strange - Cassandra prints the easier to understand string "entries()" - which better corresponds to what the user actually did. It turns out that this string "index_keys_and_values" comes from an elaborate set of variables and functions spanning multiple source files, used to convert our internal target_type variable into such a string. But although this code was called "index_option" and sounded very important, it was actually used just for one thing - error messages! So in this patch we drop the entire "index_option" abstraction, replacing it by a static trivial function defined exactly where it's used (create_index_statement.cc), which prints a target type. While at it, we print "entries()" instead of "index_keys_and_values" ;-) After this patch, the test_secondary_index.py::test_index_collection_wrong_type finally passes (the previous patch fixed the default table names it assumes, and this patch fixes the expected error messages), so its "xfail" tag is removed. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-08-14 10:29:52 +03:00
Michał Radwański	cbe33f8d7a	cql3/statements/: validate CREATE INDEX for index over a collection Allow CQL like this: CREATE INDEX idx ON table(some_map); CREATE INDEX idx ON table(KEYS(some_map)); CREATE INDEX idx ON table(VALUES(some_map)); CREATE INDEX idx ON table(ENTRIES(some_map)); CREATE INDEX idx ON table(some_set); CREATE INDEX idx ON table(VALUES(some_set)); CREATE INDEX idx ON table(some_list); CREATE INDEX idx ON table(VALUES(some_list)); This is needed to support creating indexes on collections.	2022-08-14 10:29:13 +03:00
Michał Radwański	166afd46b5	Cql.g, treewide: support cql syntax `INDEX ON table(VALUES(collection))` Brings support of cql syntax `INDEX ON table(VALUES(collection))`, even though there is still no support for indexes over collections. Previously, index_target::target_type::values was referring to values of a regular (non-collection) column. Rename it to `regular_values`. Fixes #8745.	2022-08-14 10:29:13 +03:00
Avi Kivity	5937b1fa23	treewide: remove empty comments in top-of-files After `fcb8d040` ("treewide: use Software Package Data Exchange (SPDX) license identifiers"), many dual-licensed files were left with empty comments on top. Remove them to avoid visual noise. Closes #10562	2022-05-13 07:11:58 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Nadav Har'El	5e52858295	rjson, alternator: rename set() functions to add() The rjson::set() sounds like it can set any member of a JSON object (i.e., map), but that's not true :-( It calls the RapidJson function AddMember() so it can only add a member to an object which doesn't have a member with the same name (i.e., key). If it is called with a key that already has a value, the result may have two values for the same key, which is ill-formed and can cause bugs like issue #9542. So in this patch we begin by renaming rjson::set() and its variant to rjson::add() - to suggest to its user that this function only adds members, without checking if they already exist. After this rename, I was left with dozens of calls to the set() functions that need to be changed to either add() - if we're sure that the object cannot already have a member with the same name - or to replace() if it might. The vast majority of the set() calls were starting with an empty item and adding members with fixed (string constant) names, so these can be trivially changed to add(). It turns out that all other set() calls - except the one fixed in issue #9542 - can also use add() because there are various "excuses" why we know the member names will be unique. A typical example is a map with column-name keys, where we know that the column names are unique. I added comments in front of such non-obvious uses of add() which are safe. Almost all uses of rjson except a handful are in Alternator, so I verified that all Alternator test cases continue to pass after this patch. Fixes #9583 Refs #9542 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211104152540.48900-1-nyh@scylladb.com>	2021-11-04 16:35:38 +01:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Solodovnikov	e0749d6264	treewide: some random header cleanups Eliminate not used includes and replace some more includes with forward declarations where appropriate. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-06-06 19:18:49 +03:00
Piotr Sarna	4cb79f04b0	treewide: replace libjsoncpp usage with rjson In order to eventually switch to a single JSON library, most of the libjsoncpp usage is dropped in favor of rjson. Unfortunately, one usage still remains: test/utils/test_repl utility heavily depends on the exact textual format of its output JSON files, so replacing a library results in all tests failing because of differences in formatting. It is possible to force rjson to print its documents in the exact matching format, but that's left for later, since the issue is not critical. It would be nice though if our test suite compared JSON documents with a real JSON parser, since there are more differences - e.g. libjsoncpp keeps children of the object sorted, while rapidjson uses an unordered data structure. This change should cause no change in semantics, it strives just to replace all usage of libjsoncpp with rjson.	2020-07-03 10:27:23 +02:00
Piotr Sarna	757419b524	index: add serialization function for index targets Since target_parser is responsible for deserializing target strings, the function that serializes them belongs in the same class.	2019-03-20 10:51:26 +01:00
Piotr Sarna	2fcae3d0ec	index: add parsing target column name from local index targets When (re)creating a local index, the target string needs to be used to parse out the actual indexed column: "(base_pk_part1,base_pk_part2,base_pk_part3),actual_indexed_column". This column is later used to determine if an index should be applied to a SELECT statement.	2019-03-20 10:20:24 +01:00
Piotr Sarna	de5e5ee1a5	index: add checking if serialized target implies local index This utility enables checking whether the specified target indicates a local index, even before the base table schema is known.	2019-03-20 10:20:24 +01:00
Piotr Sarna	5672edc149	index: enable parsing multi-key targets Parsing index targets that consist of partition key columns followed by clustering key columns is enabled.	2019-03-20 10:20:24 +01:00
Piotr Sarna	9782381dd4	index: move target parser code to .cc file It will be useful later when expanding the implementation.	2019-03-20 10:20:24 +01:00
Nadav Har'El	21d7507b74	secondary index: move stuff out of db/index directory The db/index directory contains just a few lines of code that exists there for historical reasons. It's confusing that we have both db/index and index/ directory related to secondary-indexing. This patch moves what little is still in db/index/ to index/. In the future we should probably get rid of the "secondary_index" class we had there, but for now, let's at least not have a whole new directory for it. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180501101246.21143-1-nyh@scylladb.com>	2018-05-01 13:21:24 +03:00

20 Commits