scylladb

Author	SHA1	Message	Date
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Piotr Jastrzebski	76d7c761d1	schema: Stop using deprecated constructor This is another boring patch. One of schema constructors has been deprecated for many years now but was used in several places anyway. Usage of this constructor could lead to data corruption when using MX sstables because this constructor does not set schema version. MX reading/writing code depends on schema version. This patch replaces all the places the deprecated constructor is used with schema_builder equivalent. The schema_builder sets the schema version correctly. Fixes #8507 Test: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <4beabc8c942ebf2c1f9b09cfab7668777ce5b384.1622357125.git.piotr@scylladb.com>	2021-05-30 11:58:27 +03:00
Nadav Har'El	de938eba8c	Reduce dependency on header utils/rjson.hh If utils/rjson.hh is modified, 300 (!) source files get recompiled. This is frustrating for anyone working on this header file (like me). Moreover - utils/rjson.hh includes the large rapidjson header files (rapidjson is a header-only library!), slowing the compilation of all these 300 files. It turns out most includers of utils/rjson.hh get it because column_computation.hh includes it. But the fact that column computations are serialized as JSON is an internal implementation detail that the users of this header don't need to know - and they care even less that this JSON implementation uses utils/rjson.hh. So in this patch column_computation.hh no longer includes rjson.hh, and no longer exposes a method taking a rjson::value that was never used outside the implementation. After this patch, touching utils/rjson.hh only recompiles 21 files. Refs #1 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210422183526.114366-1-nyh@scylladb.com>	2021-04-25 13:20:51 +03:00
Eliran Sinvani	9162748b18	materialized views: create view schemas with proper base table reference. Newly created view schemas don't always have their base info, this is bad since such schemas don't support read nor write. This leaves us vulnerable to a race condition where there is an attempt to use this schema for read or write. Here we initialize the base reference and also reconfigure the view to conform to the new computed column type, which makes it usable for write and not only reads. We do it for views created in the migration manager following announcements and also for copied schemas.	2021-03-07 12:50:42 +02:00
Kamil Braun	bf115e7d69	schema_tables: put schema tables on shard 0 We use a custom sharder for all schema tables: every table under the `system_schema` keyspace, plus `system.scylla_table_schema_history`. This sharder puts all data on shard 0. To achieve this, we hardcode the sharder in initial schema object definitions. Furthermore - since the sharder is not stored inside schema mutations yet - whenever we deserialize schema objects from mutations, we modify the sharder based on the schema's keyspace and table names. A regression test is added to ensure no one forgets to set the special sharder for newly added schema tables. This test assumes that all newly added schema tables will end up in the `system_schema` keyspace (other tables may go unnoticed, unfortunately). Closes #7947	2021-01-28 13:28:22 +02:00
Amos Kong	6b1659ee80	schema.cc/describe: fix invalid compaction options in schema There is a typo in schema.cql of snapshot, lack of comma after compaction strategy. It will fail to restore schema by the file. AND compaction = {'class': 'SizeTieredCompactionStrategy''max_compaction_threshold': '32'} map_as_cql_param() function has a `first` parameter to smartly add comma, the compaction_strategy_options is always not the first. Fixes #7741 Signed-off-by: Amos Kong <amos@scylladb.com> Closes #7734	2020-12-06 17:40:05 +02:00
Piotr Grabowski	9fc2dc59b8	column_computation: add token_column_computation Introduce new token_column_computation class which is intended to replace legacy_token_column_computation. The new column computation returns token as long_type, which means that it will be ordered according to signed comparison (not unsigned comparison of bytes), which is the correct ordering of tokens.	2020-11-04 12:02:42 +01:00
Piotr Grabowski	b1350af951	token_column_computation: rename as legacy Rename token_column_computation to legacy_token_column_computation, as it will be replaced with new column_computation. The reason is that this computation returns bytes, but all tokens in Scylla can now be represented by int64_t. Moreover, returning bytes causes invalid token ordering as bytes comparison is done in an unsigned way (not signed as int64_t). See issue: https://github.com/scylladb/scylla/issues/7443	2020-11-04 12:00:18 +01:00
Pavel Solodovnikov	aa4c359cff	column_mapping_entry: extract == and != operators Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20201016123638.99534-1-pa.solodovnikov@scylladb.com>	2020-10-16 14:59:50 +02:00
Pavel Solodovnikov	81cf11f8a0	schema: add equality operator for `column_mapping` class Add a comparator for column mappings that will be used later in unit-tests to check whether two column mappings match or not. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2020-10-15 19:24:44 +03:00
Avi Kivity	3daa49f098	Merge "materialized views: Fix undefined behavior on base table schema changes" from Tomasz " The view_info object, which is attached to the schema object of the view, contains a data structure called "base_non_pk_columns_in_view_pk". This data structure contains column ids of the base table so is valid only for a particular version of the base table schema. This data structure is used by materialized view code to interpret mutations of the base table, those coming from base table writes, or reads of the base table done as part of view updates or view building. The base table schema version of that data structure must match the schema version of the mutation fragments, otherwise we hit undefined behavior. This may include aborts, exceptions, segfaults, or data corruption (e.g. writes landing in the wrong column in the view). Before this patch, we could get schema version mismatch here after the base table was altered. That's because the view schema did not change when the base table was altered. Another problem was that view building was using the current table's schema to interpret the fragments and invoke view building. That's incorrect for two reasons. First, fragments generated by a reader must be accessed only using the reader's schema. Second, base_non_pk_columns_in_view_pk of the recorded view ptrs may no longer match the current base table schema, which is used to generate the view updates. Part of the fix is to extract base_non_pk_columns_in_view_pk into a third entity called base_dependent_view_info, which changes both on base table schema changes and view schema changes. It is managed by a shared pointer so that we can take immutable snapshots of it, just like with schema_ptr. When starting the view update, the base table schema_ptr and the corresponding base_dependent_view_info have to match. So we must obtain them atomically, and base_dependent_view_info cannot change during update. 
Also, whenever the base table schema changes, we must update base_dependent_view_infos of all attached views (atomically) so that it matches the base table schema. Fixes #7061. Tests: - unit (dev) - [v1] manual (reproduced using scylla binary and cqlsh) " * tag 'mv-schema-mismatch-fix-v2' of github.com:tgrabiec/scylla: db: view: Refactor view_info::initialize_base_dependent_fields() tests: mv: Test dropping columns from base table db: view: Fix incorrect schema access during view building after base table schema changes schema: Call on_internal_error() when out of range id is passed to column_at() db: views: Fix undefined behavior on base table schema changes db: views: Introduce has_base_non_pk_columns_in_view_pk()	2020-08-26 17:37:52 +03:00
Tomasz Grabiec	d64d60f576	schema: Call on_internal_error() when out of range id is passed to column_at() Improves debuggability because backtrace is attached. Before, plain std::out_of_range exception was thrown.	2020-08-20 14:53:07 +02:00
Rafael Ávila de Espíndola	f0e4e5b85a	schema: Make some functions static This just make it easier to see that they are file local helpers. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:05:31 -07:00
Rafael Ávila de Espíndola	6363716799	schema: Pass an rvalue to set_compaction_strategy_options This produces less code and makes sure every caller moves the value. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:02:35 -07:00
Rafael Ávila de Espíndola	527c1ab546	schema: Move set_compaction_strategy_options out of line Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:02:13 -07:00
Pavel Solodovnikov	9aa4712270	lwt: introduce `paxos_grace_seconds` per-table option to set paxos ttl Previously system.paxos TTL was set as max(3h, gc_grace_seconds). Introduce new per-table option named `paxos_grace_seconds` to set the amount of seconds which are used to TTL data in paxos tables when using LWT queries against the base table. Default value is equal to `DEFAULT_GC_GRACE_SECONDS`, which is 10 days. This change allows to easily test various issues related to paxos TTL. Fixes #6284 Tests: unit (dev, debug) Co-authored-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200816223935.919081-1-pa.solodovnikov@scylladb.com>	2020-08-17 16:44:14 +02:00
Avi Kivity	061ec49a6c	Merge "Improve error reporting on invalid internal schema access" from Tomasz " Contains several fixes which improve debuggability in situations where too large column ids are passed to column definition loop methods. " * 'schema-range-check-fix' of github.com:tgrabiec/scylla: schema: Add table name and schema version to error messages schema: Use on_internal_error() for range check errors schema: Fix off-by-one in column range check schema: Make range checks for regular and static columns the same as for clustering columns	2020-08-16 17:48:48 +03:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously, the `count` function was often used in various ways. `contains` not only expresses the intent of the code better but also does it in a more unified way. This commit replaces all the occurrences of `count` with `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Tomasz Grabiec	db1c8c439a	schema: Add table name and schema version to error messages	2020-08-14 14:35:09 +02:00
Tomasz Grabiec	817c2e0508	schema: Use on_internal_error() for range check errors	2020-08-14 14:35:09 +02:00
Tomasz Grabiec	43d503102b	schema: Fix off-by-one in column range check We'd fail in std::vector::at() instead. Let's catch all invalid accesses, as intended.	2020-08-14 14:34:51 +02:00
Tomasz Grabiec	b41f2c719b	schema: Make range checks for regular and static columns the same as for clustering columns	2020-08-14 14:34:51 +02:00
Piotr Jastrzebski	80e3923b3c	codebase wide: replace find(...) != end() with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously the code pattern looked like: <collection>.find(<element>) != <collection>.end() In C++20 the same can be expressed with: <collection>.contains(<element>) This is not only more concise but also expresses the intent of the code more clearly. This commit replaces all the occurrences of the old pattern with the new approach. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>	2020-08-11 13:28:50 +03:00
Rafael Ávila de Espíndola	efeaded427	Everywhere: Add a make_shared_schema helper This replaces a lot of make_lw_shared(schema(...)) with make_shared_schema(...). This makes it easier to drop a dependency on the differences between seastar::make_shared and std::make_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-21 10:33:49 -07:00
Rafael Ávila de Espíndola	ad6d65dbbd	Everywhere: Explicitly instantiate make_shared seastar::make_shared has a constructor taking a T&&. There is no such constructor in std::make_shared: https://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared This means that we have to move from make_shared(T(...) to make_shared<T>(...) If we don't want to depend on the idiosyncrasies of seastar::make_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-21 10:33:49 -07:00
Calle Wilund	3376209718	cdc::schema: Make extensions explicitly settable from builder To make non-cql cdc schema options a reality.	2020-07-15 08:21:34 +00:00
Piotr Sarna	4cb79f04b0	treewide: replace libjsoncpp usage with rjson In order to eventually switch to a single JSON library, most of the libjsoncpp usage is dropped in favor of rjson. Unfortunately, one usage still remains: test/utils/test_repl utility heavily depends on the exact textual format of its output JSON files, so replacing a library results in all tests failing because of differences in formatting. It is possible to force rjson to print its documents in the exact matching format, but that's left for later, since the issue is not critical. It would be nice though if our test suite compared JSON documents with a real JSON parser, since there are more differences - e.g. libjsoncpp keeps children of the object sorted, while rapidjson uses an unordered data structure. This change should cause no change in semantics, it strives just to replace all usage of libjsoncpp with rjson.	2020-07-03 10:27:23 +02:00
Pavel Emelyanov	f045cec586	snap: Get rid of storage_service reference in schema.cc Now that the snapshot stopping is correctly handled, we may pull the database reference all the way down to the schema::describe(). One tricky place is in table::snapshot() -- the local db reference is pulled through an smp::submit_to call, but thanks to the shard checks in the place where it is needed the db is still "local" Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-26 20:28:25 +03:00
Piotr Sarna	911dee5417	schema: add has_column utility function With this simple helper function, a code snippet in alternator can be transformed from try-catch to a simple condition. Message-Id: <553debf4e91c0511566e53e2c8a5e8e6ee6552e2.1592233511.git.sarna@scylladb.com>	2020-06-15 23:55:06 +03:00
Pavel Solodovnikov	f6e765b70f	cql3: pass `column_specification` via lw_shared_ptr `column_specification` class is marked as "final": it's safe to use non-polymorphic pointer "lw_shared_ptr" instead of a more generic "shared_ptr". tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200427084016.26068-1-pa.solodovnikov@scylladb.com>	2020-04-27 12:47:42 +03:00
Piotr Sarna	9c15604659	treewide: deprecate passing explicit order in schema building In order to avoid confusion with regard to whose responsibility it is to sort the key columns (see #5856), the interface which allows adding columns to the builder with explicit column id is moved to a private function. An internal with_column_ordered() overload is maintained to be used for internal operations, but it's encouraged to use simpler with_column() in new code. Fixes #6235 Tests: unit(dev)	2020-04-19 16:19:17 +03:00
Botond Dénes	a4aa753f0f	schema: schema(): use std::stable_sort() to sort key columns When multiple key columns (clustering or partition) are passed to the schema constructor, all having the same column id, the expectation is that these columns will retain the order in which they were passed to `schema_builder::with_column()`. Currently however this is not guaranteed as the schema constructor sorts key columns by column id with `std::sort()`, which doesn't guarantee that equally comparing elements retain their order. This can be an issue for indexes, the schemas of which are built independently on each node. If there is any room for variance in the key column order, this can result in different nodes having incompatible schemas for the same index. The fix is to use `std::stable_sort()` which guarantees that the order of equally comparing elements won't change. This is a suspected cause of #5856, although we don't have hard proof. Fixes: #5856 Signed-off-by: Botond Dénes <bdenes@scylladb.com> [avi: upgraded "Refs" to "Fixes", since we saw that std::sort() becomes unstable at 17 elements, and the failing schema had a clustering key with 23 elements] Message-Id: <20200417121848.1456817-1-bdenes@scylladb.com>	2020-04-19 13:42:44 +03:00
Piotr Jastrzebski	e72696a8e6	sharding_info: rename the class to sharder Also rename all variables that were named si or sinfo to sharder. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	2e850421a0	i_partitioner: remove embedded sharding_info sharding_info embedded into partitioner is no longer used anywhere and can be removed. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	7bd2b8d73f	schema: make it possible to set sharding_info per schema Previously schema::get_sharding_info was obtaining sharding_info from the partitioner but we want to remove sharding_info from the partitioner so we need a place in schema to store it there instead. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	c5d0887471	schema_builder: remove unused with_partitioner_for_tests_only After previous patches that switched some tests to use sharding_info instead of i_partitioner, we now don't need with_partitioner_for_tests_only and the function can be removed. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	8d81a2498f	schema: add get_sharding_info At the moment, we have a single sharding logic per node but we want to be able to set it per table in the future. To make it easy to change in the future sharding_info will be managed inside schema and all the other code will access it through schema::get_sharding_info function. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 09:35:27 +02:00
Nadav Har'El	35d95d6887	merge: Add postimage implementation Merged pull request https://github.com/scylladb/scylla/pull/5996 from Calle Wilund: Fixes #4992 Implements post-image support by synthesizing it from pre-image + delta. Post-image data differs from the delta data in two ways: 1.) It merges non-atomics into an actual result value 2.) It contains all columns of the row, not just those affected by the update. For a non-atomic field, the post-image value of a column is either the pre-image or the delta (maybe null) Tested by adding post-image checks to pre-image test and collection/udt tests	2020-03-16 13:42:07 +02:00
Calle Wilund	ca7046256f	schema: Add "columns" accessor for columns by kind To prevent switch-code everywhere.	2020-03-16 09:21:06 +00:00
Piotr Jastrzebski	5bbb826c49	schema: drop optional from _partitioner field Always set the field to the default value if no table specific partitioner has been set. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:21 +01:00
Piotr Jastrzebski	924ed7bb1c	make_multishard_combining_reader: stop taking partitioner The function already takes schema so there's no need for it to take partitioner. It can be obtained using schema::get_partitioner Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	22daa262ee	partitioner: move default_partitioner to schema.cc Make it inaccessible to other compilation units. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	57b69fb804	schema: include partitioner name in scylla tables mutation There are two results of this patch: 1. New partitioner name column is persisted on node's disk in scylla_tables 2. New partitioner name column is included into schema digest This is achieved by including this new column in scylla tables mutation. For that we: 1. Add partitioner name to the result of make_scylla_tables_mutation. If table does not have a specific partitioner set and uses default partitioner then we don't include the name of such default partitioner. Only the name of custom partitioner is added if a table has one. 2. In create_table_from_mutations we check whether scylla tables mutation has a partitioner name set. If so then we use it as a parameter for schema_builder. Note that previous patches have ensured that this new column will be included into schema digest only after the whole cluster supports per table partitioners. Before that, during rolling upgrade, new partitioner name column is hidden and not shared with other nodes. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	1d6cec1b0a	schema: make it possible to set custom partitioner schema_builder::with_partitioner can be used now to set custom partitioner on a table. If no such partitioner is set, global partitioner is still used. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	54d24553bb	schema: get_partitioner return const& Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-06 13:33:53 +01:00
Piotr Dulikowski	861c7b5626	schema: get cdc options from schema extensions Removes logic responsible for setting cdc_options from dedicated column in scylla_tables, and uses the "cdc" schema extension instead.	2020-03-05 16:11:21 +01:00
Rafael Ávila de Espíndola	151f5e723f	Pass string_view to the schema constructor This moves string copies from the callers of the constructor to the implementation. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-02-28 17:04:12 -08:00
Rafael Ávila de Espíndola	9ab2346e7f	Pass string_view to the schema_builder constructor With this we don't need to construct a sstring just to construct a schema_builder. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-02-28 08:36:27 -08:00
Piotr Jastrzebski	406f42e012	schema: reduce number of global_partitioner() calls Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:15 +01:00
Piotr Jastrzebski	9b95153136	schema: add get_partitioner() The plan is to remove dht::global_partitioner() and use schema::get_partitioner() instead. This will allow a usage of per schema/table partitioner instead of a single global partitioner everywhere. Initially schema::get_partitioner will call dht::global_partitioner. After all the calls to dht::global_partitioner are switched to schema::get_partitioner, the ability to set per schema partitioner will be implemented. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:04:41 +01:00
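
Commits b1350af951 and 9fc2dc59b8 in the log above note that comparing tokens as bytes orders them as unsigned values, whereas tokens are int64_t and need signed ordering. The following is a minimal C++ sketch of that mismatch, not Scylla code. It assumes a big-endian byte serialization, which the log does not specify, and the names token_bytes, to_big_endian, bytes_less, signed_less and check_token_ordering are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical byte form of a token (assumed big-endian, 8 bytes).
using token_bytes = std::array<std::uint8_t, 8>;

// Serialize a signed 64-bit token, most significant byte first.
token_bytes to_big_endian(std::int64_t token) {
    const auto u = static_cast<std::uint64_t>(token);
    token_bytes out{};
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<std::uint8_t>(u >> (56 - 8 * i));
    }
    return out;
}

// Byte-wise lexicographic comparison: each byte is compared as unsigned.
bool bytes_less(std::int64_t a, std::int64_t b) {
    return to_big_endian(a) < to_big_endian(b);
}

// Signed comparison: the ordering the log describes as correct for tokens.
bool signed_less(std::int64_t a, std::int64_t b) {
    return a < b;
}

// Usage: the two orderings agree for non-negative tokens and disagree
// as soon as one token is negative.
void check_token_ordering() {
    assert(signed_less(-1, 1));   // -1 sorts before 1 as a signed integer
    assert(!bytes_less(-1, 1));   // but 0xFF.. sorts after 0x00.. as bytes
    assert(bytes_less(1, 2));     // non-negative values order the same way
}
```

In this sketch a negative token has its top bit set, so its first byte is at least 0x80 and it sorts after every non-negative token under byte comparison.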


196 Commits
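
Commit a4aa753f0f in the log above switches key-column sorting from std::sort() to std::stable_sort() so that columns sharing a column id keep the order in which they were added. The following is a minimal C++ sketch of that guarantee, not the actual schema constructor. The type key_column and the functions make_equal_id_columns, sorted_key_names and check_stable_order are hypothetical stand-ins. The count of 23 columns mirrors the failing clustering key mentioned in the commit note.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a key column definition (not Scylla's type).
struct key_column {
    std::string name;
    std::uint32_t id;  // several columns may carry the same id
};

// Build n columns named c0..c(n-1) that all share id 0.
std::vector<key_column> make_equal_id_columns(std::size_t n) {
    std::vector<key_column> cols;
    cols.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        cols.push_back({"c" + std::to_string(i), 0});
    }
    return cols;
}

// Sort by id only. std::stable_sort guarantees that elements comparing
// equal keep their relative input order; std::sort gives no such guarantee.
std::vector<std::string> sorted_key_names(std::vector<key_column> cols) {
    std::stable_sort(cols.begin(), cols.end(),
                     [](const key_column& a, const key_column& b) {
                         return a.id < b.id;
                     });
    std::vector<std::string> names;
    names.reserve(cols.size());
    for (const auto& c : cols) {
        names.push_back(c.name);
    }
    return names;
}

// Usage: 23 equal-id columns come back in insertion order.
void check_stable_order() {
    const auto names = sorted_key_names(make_equal_id_columns(23));
    for (std::size_t i = 0; i < names.size(); ++i) {
        assert(names[i] == "c" + std::to_string(i));
    }
}
```

With std::sort() in place of std::stable_sort() the result for equal ids is unspecified, so two nodes building the same schema independently could, in principle, end up with different key column orders.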