scylladb

Author	SHA1	Message	Date
Avi Kivity	3daa49f098	Merge "materialized views: Fix undefined behavior on base table schema changes" from Tomasz " The view_info object, which is attached to the schema object of the view, contains a data structure called "base_non_pk_columns_in_view_pk". This data structure contains column ids of the base table so is valid only for a particular version of the base table schema. This data structure is used by materialized view code to interpret mutations of the base table, those coming from base table writes, or reads of the base table done as part of view updates or view building. The base table schema version of that data structure must match the schema version of the mutation fragments, otherwise we hit undefined behavior. This may include aborts, exceptions, segfaults, or data corruption (e.g. writes landing in the wrong column in the view). Before this patch, we could get schema version mismatch here after the base table was altered. That's because the view schema did not change when the base table was altered. Another problem was that view building was using the current table's schema to interpret the fragments and invoke view building. That's incorrect for two reasons. First, fragments generated by a reader must be accessed only using the reader's schema. Second, base_non_pk_columns_in_view_pk of the recorded view ptrs may no longer match the current base table schema, which is used to generate the view updates. Part of the fix is to extract base_non_pk_columns_in_view_pk into a third entity called base_dependent_view_info, which changes both on base table schema changes and view schema changes. It is managed by a shared pointer so that we can take immutable snapshots of it, just like with schema_ptr. When starting the view update, the base table schema_ptr and the corresponding base_dependent_view_info have to match. So we must obtain them atomically, and base_dependent_view_info cannot change during update.
Also, whenever the base table schema changes, we must update base_dependent_view_infos of all attached views (atomically) so that it matches the base table schema. Fixes #7061. Tests: - unit (dev) - [v1] manual (reproduced using scylla binary and cqlsh) " * tag 'mv-schema-mismatch-fix-v2' of github.com:tgrabiec/scylla: db: view: Refactor view_info::initialize_base_dependent_fields() tests: mv: Test dropping columns from base table db: view: Fix incorrect schema access during view building after base table schema changes schema: Call on_internal_error() when out of range id is passed to column_at() db: views: Fix undefined behavior on base table schema changes db: views: Introduce has_base_non_pk_columns_in_view_pk()	2020-08-26 17:37:52 +03:00
Tomasz Grabiec	d64d60f576	schema: Call on_internal_error() when out of range id is passed to column_at() Improves debuggability because backtrace is attached. Before, plain std::out_of_range exception was thrown.	2020-08-20 14:53:07 +02:00
Rafael Ávila de Espíndola	f0e4e5b85a	schema: Make some functions static This just makes it easier to see that they are file local helpers. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:05:31 -07:00
Rafael Ávila de Espíndola	6363716799	schema: Pass an rvalue to set_compaction_strategy_options This produces less code and makes sure every caller moves the value. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:02:35 -07:00
Rafael Ávila de Espíndola	527c1ab546	schema: Move set_compaction_strategy_options out of line Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:02:13 -07:00
Pavel Solodovnikov	9aa4712270	lwt: introduce `paxos_grace_seconds` per-table option to set paxos ttl Previously system.paxos TTL was set as max(3h, gc_grace_seconds). Introduce new per-table option named `paxos_grace_seconds` to set the number of seconds used as the TTL for data in paxos tables when using LWT queries against the base table. Default value is equal to `DEFAULT_GC_GRACE_SECONDS`, which is 10 days. This change makes it easy to test various issues related to paxos TTL. Fixes #6284 Tests: unit (dev, debug) Co-authored-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200816223935.919081-1-pa.solodovnikov@scylladb.com>	2020-08-17 16:44:14 +02:00
Avi Kivity	061ec49a6c	Merge "Improve error reporting on invalid internal schema access" from Tomasz " Contains several fixes which improve debuggability in situations where too large column ids are passed to column definition loop methods. " * 'schema-range-check-fix' of github.com:tgrabiec/scylla: schema: Add table name and schema version to error messages schema: Use on_internal_error() for range check errors schema: Fix off-by-one in column range check schema: Make range checks for regular and static columns the same as for clustering columns	2020-08-16 17:48:48 +03:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously, the `count` function was often used in various ways. `contains` not only expresses the intent of the code better but also does so in a more uniform way. This commit replaces all occurrences of `count` with `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Tomasz Grabiec	db1c8c439a	schema: Add table name and schema version to error messages	2020-08-14 14:35:09 +02:00
Tomasz Grabiec	817c2e0508	schema: Use on_internal_error() for range check errors	2020-08-14 14:35:09 +02:00
Tomasz Grabiec	43d503102b	schema: Fix off-by-one in column range check We'd fail in std::vector::at() instead. Let's catch all invalid accesses, as intended.	2020-08-14 14:34:51 +02:00
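The log does not show the diff for this fix, so the following is only a minimal sketch of the kind of off-by-one it describes, assuming the check used `>` where `>=` was intended. The function `column_at_checked` and the use of `std::logic_error` are hypothetical stand-ins; the actual code reports through on_internal_error().

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a schema's column lookup; not Scylla's real code.
const std::string& column_at_checked(const std::vector<std::string>& columns, std::size_t id) {
    // An off-by-one variant would test `id > columns.size()`, which lets
    // id == columns.size() slip through to the container's own bounds check.
    if (id >= columns.size()) {
        throw std::logic_error("column id " + std::to_string(id) +
                               " >= " + std::to_string(columns.size()));
    }
    return columns[id];
}
```

With the `>=` comparison, every invalid id is caught by the explicit check, so the error message can carry whatever context the caller wants to attach.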
Tomasz Grabiec	b41f2c719b	schema: Make range checks for regular and static columns the same as for clustering columns	2020-08-14 14:34:51 +02:00
Piotr Jastrzebski	80e3923b3c	codebase wide: replace find(...) != end() with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously the code pattern looked like: <collection>.find(<element>) != <collection>.end() In C++20 the same can be expressed with: <collection>.contains(<element>) This is not only more concise but also expresses the intent of the code more clearly. This commit replaces all occurrences of the old pattern with the new approach. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>	2020-08-11 13:28:50 +03:00
Rafael Ávila de Espíndola	efeaded427	Everywhere: Add a make_shared_schema helper This replaces a lot of make_lw_shared(schema(...)) with make_shared_schema(...). This makes it easier to drop a dependency on the differences between seastar::make_shared and std::make_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-21 10:33:49 -07:00
Rafael Ávila de Espíndola	ad6d65dbbd	Everywhere: Explicitly instantiate make_shared seastar::make_shared has a constructor taking a T&&. There is no such constructor in std::make_shared: https://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared This means that we have to move from make_shared(T(...)) to make_shared<T>(...) if we don't want to depend on the idiosyncrasies of seastar::make_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-21 10:33:49 -07:00
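As a rough illustration of the portable form this commit moves to, here is a sketch using only std::make_shared. The `widget` type and `make_widget` function are hypothetical; the T&& form is described in a comment only, based on what the commit message states.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type used only for this example.
struct widget {
    std::string name;
    int size;
    widget(std::string n, int s) : name(std::move(n)), size(s) {}
};

std::shared_ptr<widget> make_widget() {
    // Portable form: the type is named explicitly and the arguments are
    // forwarded to widget's constructor.
    return std::make_shared<widget>("w", 3);
    // By contrast, std::make_shared(widget("w", 3)) does not compile,
    // because std::make_shared cannot deduce T from its arguments.
}
```

Naming the type explicitly also constructs the object in place, without first building a temporary and moving from it.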
Calle Wilund	3376209718	cdc::schema: Make extensions explicitly settable from builder To make non-cql cdc schema options a reality.	2020-07-15 08:21:34 +00:00
Piotr Sarna	4cb79f04b0	treewide: replace libjsoncpp usage with rjson In order to eventually switch to a single JSON library, most of the libjsoncpp usage is dropped in favor of rjson. Unfortunately, one usage still remains: test/utils/test_repl utility heavily depends on the exact textual format of its output JSON files, so replacing a library results in all tests failing because of differences in formatting. It is possible to force rjson to print its documents in the exact matching format, but that's left for later, since the issue is not critical. It would be nice though if our test suite compared JSON documents with a real JSON parser, since there are more differences - e.g. libjsoncpp keeps children of the object sorted, while rapidjson uses an unordered data structure. This change should cause no change in semantics, it strives just to replace all usage of libjsoncpp with rjson.	2020-07-03 10:27:23 +02:00
Pavel Emelyanov	f045cec586	snap: Get rid of storage_service reference in schema.cc Now that snapshot stopping is correctly handled, we may pull the database reference all the way down to the schema::describe(). One tricky place is in table::snapshot() -- the local db reference is pulled through an smp::submit_to call, but thanks to the shard checks in the place where it is needed the db is still "local". Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-26 20:28:25 +03:00
Piotr Sarna	911dee5417	schema: add has_column utility function With this simple helper function, a code snippet in alternator can be transformed from try-catch to a simple condition. Message-Id: <553debf4e91c0511566e53e2c8a5e8e6ee6552e2.1592233511.git.sarna@scylladb.com>	2020-06-15 23:55:06 +03:00
Pavel Solodovnikov	f6e765b70f	cql3: pass `column_specification` via lw_shared_ptr `column_specification` class is marked as "final": it's safe to use non-polymorphic pointer "lw_shared_ptr" instead of a more generic "shared_ptr". tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200427084016.26068-1-pa.solodovnikov@scylladb.com>	2020-04-27 12:47:42 +03:00
Piotr Sarna	9c15604659	treewide: deprecate passing explicit order in schema building In order to avoid confusion with regard to whose responsibility it is to sort the key columns (see #5856), the interface which allows adding columns to the builder with explicit column id is moved to a private function. An internal with_column_ordered() overload is maintained to be used for internal operations, but it's encouraged to use simpler with_column() in new code. Fixes #6235 Tests: unit(dev)	2020-04-19 16:19:17 +03:00
Botond Dénes	a4aa753f0f	schema: schema(): use std::stable_sort() to sort key columns When multiple key columns (clustering or partition) are passed to the schema constructor, all having the same column id, the expectation is that these columns will retain the order in which they were passed to `schema_builder::with_column()`. Currently, however, this is not guaranteed, as the schema constructor sorts key columns by column id with `std::sort()`, which doesn't guarantee that equally comparing elements retain their order. This can be an issue for indexes, the schemas of which are built independently on each node. If there is any room for variance in the key column order, this can result in different nodes having incompatible schemas for the same index. The fix is to use `std::stable_sort()` which guarantees that the order of equally comparing elements won't change. This is a suspected cause of #5856, although we don't have hard proof. Fixes: #5856 Signed-off-by: Botond Dénes <bdenes@scylladb.com> [avi: upgraded "Refs" to "Fixes", since we saw that std::sort() becomes unstable at 17 elements, and the failing schema had a clustering key with 23 elements] Message-Id: <20200417121848.1456817-1-bdenes@scylladb.com>	2020-04-19 13:42:44 +03:00
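The ordering guarantee this fix relies on can be sketched as follows. The `key_column` struct and `sort_key_columns` function are hypothetical stand-ins, not Scylla's actual column definition or constructor logic.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a key column definition.
struct key_column {
    std::string name;
    int id;  // several columns may share the same id
};

// Sorts by id only. std::stable_sort guarantees that columns comparing
// equal (same id) keep the order in which they were added.
std::vector<key_column> sort_key_columns(std::vector<key_column> cols) {
    std::stable_sort(cols.begin(), cols.end(),
                     [](const key_column& a, const key_column& b) {
                         return a.id < b.id;
                     });
    return cols;
}
```

Because the comparator looks only at `id`, std::sort would be free to reorder columns that share an id; std::stable_sort is required by the standard to preserve their insertion order regardless of how many elements there are.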
Piotr Jastrzebski	e72696a8e6	sharding_info: rename the class to sharder Also rename all variables that were named si or sinfo to sharder. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	2e850421a0	i_partitioner: remove embedded sharding_info sharding_info embedded into partitioner is no longer used anywhere and can be removed. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	7bd2b8d73f	schema: make it possible to set sharding_info per schema Previously schema::get_sharding_info was obtaining sharding_info from the partitioner but we want to remove sharding_info from the partitioner so we need a place in schema to store it there instead. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	c5d0887471	schema_builder: remove unused with_partitioner_for_tests_only After previous patches that switched some tests to use sharding_info instead of i_partitioner, we now don't need with_partitioner_for_tests_only and the function can be removed. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	8d81a2498f	schema: add get_sharding_info At the moment, we have a single sharding logic per node but we want to be able to set it per table in the future. To make it easy to change in the future sharding_info will be managed inside schema and all the other code will access it through schema::get_sharding_info function. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 09:35:27 +02:00
Nadav Har'El	35d95d6887	merge: Add postimage implementation Merged pull request https://github.com/scylladb/scylla/pull/5996 from Calle Wilund: Fixes #4992 Implements post-image support by synthesizing it from pre-image + delta. Post-image data differs from the delta data in two ways: 1.) It merges non-atomics into an actual result value 2.) It contains all columns of the row, not just those affected by the update. For a non-atomic field, the post-image value of a column is either the pre-image or the delta (maybe null) Tested by adding post-image checks to pre-image test and collection/udt tests	2020-03-16 13:42:07 +02:00
Calle Wilund	ca7046256f	schema: Add "columns" accessor for columns by kind To prevent switch-code everywhere.	2020-03-16 09:21:06 +00:00
Piotr Jastrzebski	5bbb826c49	schema: drop optional from _partitioner field Always set the field to the default value if no table specific partitioner has been set. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:21 +01:00
Piotr Jastrzebski	924ed7bb1c	make_multishard_combining_reader: stop taking partitioner The function already takes schema so there's no need for it to take partitioner. It can be obtained using schema::get_partitioner Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	22daa262ee	partitioner: move default_partitioner to schema.cc Make it inaccessible to other compilation units. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	57b69fb804	schema: include partitioner name in scylla tables mutation There are two results of this patch: 1. New partitioner name column is persisted on node's disk in scylla_tables 2. New partitioner name column is included into schema digest This is achieved by including this new column in scylla tables mutation. For that we: 1. Add partitioner name to the result of make_scylla_tables_mutation. If table does not have a specific partitioner set and uses default partitioner then we don't include the name of such default partitioner. Only the name of custom partitioner is added if a table has one. 2. In create_table_from_mutations we check whether scylla tables mutation has a partitioner name set. If so then we use it as a parameter for schema_builder. Note that previous patches have ensured that this new column will be included into schema digest only after the whole cluster supports per table partitioners. Before that, during rolling upgrade, new partitioner name column is hidden and not shared with other nodes. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	1d6cec1b0a	schema: make it possible to set custom partitioner schema_builder::with_partitioner can be used now to set custom partitioner on a table. If no such partitioner is set, global partitioner is still used. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	54d24553bb	schema: get_partitioner return const& Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-06 13:33:53 +01:00
Piotr Dulikowski	861c7b5626	schema: get cdc options from schema extensions Removes logic responsible for setting cdc_options from dedicated column in scylla_tables, and uses the "cdc" schema extension instead.	2020-03-05 16:11:21 +01:00
Rafael Ávila de Espíndola	151f5e723f	Pass string_view to the schema constructor This moves string copies from the callers of the constructor to the implementation. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-02-28 17:04:12 -08:00
Rafael Ávila de Espíndola	9ab2346e7f	Pass string_view to the schema_builder constructor With this we don't need to construct a sstring just to construct a schema_builder. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-02-28 08:36:27 -08:00
Piotr Jastrzebski	406f42e012	schema: reduce number of global_partitioner() calls Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:15 +01:00
Piotr Jastrzebski	9b95153136	schema: add get_partitioner() The plan is to remove dht::global_partitioner() and use schema::get_partitioner() instead. This will allow a usage of per schema/table partitioner instead of a single global partitioner everywhere. Initially schema::get_partitioner will call dht::global_partitioner. After all the calls to dht::global_partitioner are switched to schema::get_partitioner, the ability to set per schema partitioner will be implemented. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:04:41 +01:00
Piotr Jastrzebski	9c55e5be13	partitioner: remove token_to_bytes i_partitioner::token_to_bytes is just a call to token::data and does not depend on partitioner at all. It is possible to convert token to bytes without having access to partitioner. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-05 09:31:32 +01:00
Amnon Heiman	82367b325a	schema: Add a describe method This patch adds a describe method to a table schema. It acts similarly to a DESCRIBE cql command that is implemented in a CQL driver. The method supports tables, secondary indexes, local indexes, and materialized views. relates to: #4192 Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2020-01-15 15:06:00 +02:00
Konstantin Osipov	6159c012db	schema: pre-allocate the bitset of column_set The number of columns is usually small, and avoiding a resize speeds up bit manipulation functions.	2019-11-13 11:41:51 +03:00
Konstantin Osipov	191acec7ab	schema: rename column_mask to column_set Since it contains a precise set of columns, it's more accurate to call it a set, not a mask. Besides, the name column_mask is already used for column options on storage level.	2019-11-13 11:41:30 +03:00
Kamil Braun	b38b8af0f2	schema: generalize compound_name to UDTs.	2019-10-25 12:04:44 +02:00
Nadav Har'El	631846a852	CDC: Implement minimal version that logs only primary key of each change Merge a patch series from Piotr Jastrzębski (haaawk): This PR introduces CDC in its minimal version. It is possible now to create a table with CDC enabled or to enable/disable CDC on existing table. There is a management of CDC log and description related to enabling/disabling CDC for a table. For now only primary key of the changed data is logged. To be able to co-locate cdc streams with related base table partitions it was necessary to propagate the information about the number of shards per node. This was done through gossip. There is an assumption that all the nodes use the same value for sharding_ignore_msb_bits. If it does not hold we would have to gossip sharding_ignore_msb_bits around together with the number of shards. Fixes #4986. Tests: unit(dev, release, debug)	2019-10-20 11:41:01 +03:00
Piotr Jastrzebski	ca9536a771	schema: add _cdc_options field Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-10-17 10:55:31 +02:00
Konstantin Osipov	c0f0ab5edd	lwt: introduce column mask Introduce a bitset container which can be used to compute all columns used in a query. Add a partition_slice constructor which uses the bitset.	2019-10-16 22:40:55 +03:00
Konstantin Osipov	fa73421198	lwt: introduce column_definition::ordinal_id Make sure every column in the schema, be it a column of partition key, clustering key, static or regular one, has a unique ordinal identifier. This makes it easy to compute the set of columns used in a query, as well as index row cells. Allow to get column definition in schema by ordinal id.	2019-10-16 15:46:25 +03:00
Rafael Ávila de Espíndola	096de10eee	types: Remove abstract_type::equals All types are interned, so we can just compare the pointers. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-08-14 10:02:00 -07:00


186 Commits