scylladb

Author	SHA1	Message	Date
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	ae3a360725	database: Move database, keyspace, table classes to replica/ directory The database, keyspace, and table classes represent the replica-only part of the objects after which they are named. Reading from a table doesn't give you the full data, just the replica's view, and it is not consistent since reconciliation is applied on the coordinator. As a first step in acknowledging this, move the related files to a replica/ subdirectory.	2022-01-06 17:07:30 +02:00
Asias He	a8ad385ecd	repair: Get rid of the gc_grace_seconds The gc_grace_seconds is a very fragile and broken design inherited from Cassandra. Deleted data can be resurrected if cluster wide repair is not performed within gc_grace_seconds. This design pushes the job of keeping the database consistent onto the user. In practice, it is very hard to guarantee repair is performed within gc_grace_seconds all the time. For example, repair workload has the lowest priority in the system and can be slowed down by the higher priority workload, so that there is no guarantee when a repair can finish. A gc_grace_seconds value that used to work might not work after data volume grows in a cluster. Users might want to avoid running repair during a specific period where latency is the top priority for their business. To solve this problem, an automatic mechanism to protect against data resurrection is proposed and implemented. The main idea is to remove the tombstone only after the range that covers the tombstone is repaired. In this patch, a new table option tombstone_gc is added. The option is used to configure tombstone gc mode. For example: 1) GC a tombstone after gc_grace_seconds cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'timeout'} ; This is the default mode: if no tombstone_gc option is specified by the user, the old gc_grace_seconds based gc will be used. 2) Never GC a tombstone cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'disabled'}; 3) GC a tombstone immediately cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'immediate'}; 4) GC a tombstone after repair cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair'}; In addition to the 'mode' option, another option 'propagation_delay_in_seconds' is added. It defines the maximum time a write could possibly be delayed before it eventually arrives at a node. A new gossip feature TOMBSTONE_GC_OPTIONS is added. The new tombstone_gc option can only be used after the whole cluster supports the new feature. A mixed cluster works with no problem. Tests: compaction_test.py, ninja test Fixes #3560 [avi: resolve conflicts vs data_dictionary]	2022-01-04 19:48:14 +02:00
Botond Dénes	4dea339e0c	schema_builder: add a constructor providing make_shared_schema semantics make_shared_schema() is often used to create a schema that is then passed to schema_builder to modify it further. This is wasteful as the schema is built just to be disassembled and rebuilt again. To replace this wasteful pattern we provide a schema_builder constructor that has the same signature as `make_shared_schema()`, allowing follow-up modifications on the schema before it is fully built.	2021-11-05 11:41:04 +02:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Kamil Braun	bf115e7d69	schema_tables: put schema tables on shard 0 We use a custom sharder for all schema tables: every table under the `system_schema` keyspace, plus `system.scylla_table_schema_history`. This sharder puts all data on shard 0. To achieve this, we hardcode the sharder in initial schema object definitions. Furthermore - since the sharder is not stored inside schema mutations yet - whenever we deserialize schema objects from mutations, we modify the sharder based on the schema's keyspace and table names. A regression test is added to ensure no one forgets to set the special sharder for newly added schema tables. This test assumes that all newly added schema tables will end up in the `system_schema` keyspace (other tables may go unnoticed, unfortunately). Closes #7947	2021-01-28 13:28:22 +02:00
Rafael Ávila de Espíndola	6363716799	schema: Pass an rvalue to set_compaction_strategy_options This produces less code and makes sure every caller moves the value. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:02:35 -07:00
Rafael Ávila de Espíndola	527c1ab546	schema: Move set_compaction_strategy_options out of line Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:02:13 -07:00
Pavel Solodovnikov	9aa4712270	lwt: introduce `paxos_grace_seconds` per-table option to set paxos ttl Previously system.paxos TTL was set as max(3h, gc_grace_seconds). Introduce new per-table option named `paxos_grace_seconds` to set the amount of seconds which are used to TTL data in paxos tables when using LWT queries against the base table. Default value is equal to `DEFAULT_GC_GRACE_SECONDS`, which is 10 days. This change allows to easily test various issues related to paxos TTL. Fixes #6284 Tests: unit (dev, debug) Co-authored-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200816223935.919081-1-pa.solodovnikov@scylladb.com>	2020-08-17 16:44:14 +02:00
Calle Wilund	3376209718	cdc::schema: Make extensions explicitly settable from builder To make non-cql cdc schema options a reality.	2020-07-15 08:21:34 +00:00
Piotr Sarna	911dee5417	schema: add has_column utility function With this simple helper function, a code snippet in alternator can be transformed from try-catch to a simple condition. Message-Id: <553debf4e91c0511566e53e2c8a5e8e6ee6552e2.1592233511.git.sarna@scylladb.com>	2020-06-15 23:55:06 +03:00
Piotr Sarna	9c15604659	treewide: deprecate passing explicit order in schema building In order to avoid confusion with regard to whose responsibility it is to sort the key columns (see #5856), the interface which allows adding columns to the builder with explicit column id is moved to a private function. An internal with_column_ordered() overload is maintained to be used for internal operations, but it's encouraged to use simpler with_column() in new code. Fixes #6235 Tests: unit(dev)	2020-04-19 16:19:17 +03:00
Piotr Jastrzebski	e72696a8e6	sharding_info: rename the class to sharder Also rename all variables that were named si or sinfo to sharder. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	2e850421a0	i_partitioner: remove embedded sharding_info sharding_info embedded into partitioner is no longer used anywhere and can be removed. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	7bd2b8d73f	schema: make it possible to set sharding_info per schema Previously schema::get_sharding_info was obtaining sharding_info from the partitioner but we want to remove sharding_info from the partitioner so we need a place in schema to store it there instead. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	c5d0887471	schema_builder: remove unused with_partitioner_for_tests_only After previous patches that switched some tests to use sharding_info instead of i_partitioner, we now don't need with_partitioner_for_tests_only and the function can be removed. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Avi Kivity	ee9df91a76	Merge "Allow setting partitioner per table" from Piotr " This PR makes it possible to enable the usage of different partitioner for each table. If no table-specific partitioner is set for a given table then a default partitioner is used. The PR is composed of the following parts: - Introduction of schema::get_partitioner that still returns dht::global_partitioner - Replacement of all the usage of dht::global_partitioner with schema::get_partitioner - Making it possible to set table-specific partitioner in a schema_builder - Remove all the places that were setting default partitioner except for main.cc (mostly tests) - Move default partitioner from i_partitioner to schema.cc and hide it from the rest of the codebase - Remove dht::global_partitioner After this PR there's no such thing as global partitioner at all. There is only a default partitioner but it still has to be accessed through schema::get_partitioner. There are some intermediate states in which i_partitioner is stored as shared_ptr in the schema but the final version keeps it by const&. The PR does not enable per table partitioner end-to-end. Just the internals of the single node are covered. I still have to deal with: - Making sure a table has the same partitioner on each node - Allowing user to set up a table-specific partitioner on table - Signal driver about what partitioner is used by a given table - Persist partitioner info for each table that does not use default partitioner. Fixes #5493 Tests: unit(dev, release, debug), dtest(byo) " * 'per_table_partitioner' of https://github.com/haaawk/scylla: schema: drop optional from _partitioner field make_multishard_combining_reader: stop taking partitioner split_range_to_single_shard: stop taking partitioner as argument tests: remove unused murmur3 includes partitioner: move default_partitioner to schema.cc partitioner: hide dht::default_partitioner schema: include partitioner name in scylla tables mutation schema: make it possible to set custom partitioner scylla_tables: add partitioner column schema_features: add PER_TABLE_PARTITIONERS feature features: add PER_TABLE_PARTITIONERS feature	2020-03-16 11:13:47 +02:00
Kamil Braun	aa72a1c556	cql3: when altering table, keep old values of unchanged extensions When the user performed alter ks.t with compaction = {...} the values of most other options, which were not specified in the statement, e.g. compression, were left unchanged. That wasn't true for extension options however: for example, the "cdc" option was removed. This commit fixes the behavior to keep the old values of extension options not specified in the alter statement.	2020-03-15 17:45:30 +02:00
Piotr Jastrzebski	924ed7bb1c	make_multishard_combining_reader: stop taking partitioner The function already takes schema so there's no need for it to take partitioner. It can be obtained using schema::get_partitioner Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	1d6cec1b0a	schema: make it possible to set custom partitioner schema_builder::with_partitioner can be used now to set custom partitioner on a table. If no such partitioner is set, global partitioner is still used. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Dulikowski	861c7b5626	schema: get cdc options from schema extensions Removes logic responsible for setting cdc_options from dedicated column in scylla_tables, and uses the "cdc" schema extension instead.	2020-03-05 16:11:21 +01:00
Rafael Ávila de Espíndola	9ab2346e7f	Pass string_view to the schema_builder constructor With this we don't need to construct a sstring just to construct a schema_builder. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-02-28 08:36:27 -08:00
Kamil Braun	bd42b10df1	cdc: rename cdc/cdc.{hh,cc} to cdc/log.{hh,cc} To increase modularity, making it easier to find what is where and maintain. The 'log' module (cdc/log.{hh,cc}) is responsible for updating CDC log tables when base table writes are performed. The 'generation' module (cdc/generation.{hh,cc}) handles stream generation changes in response to topology change events. cdc/metadata.{hh,cc} contains a helper class which holds the currently used generation of streams. It is used by both aforementioned modules: 'log' queries it, while 'generation' updates it.	2020-01-30 11:10:39 +01:00
Gleb Natapov	16e0fc4742	schema: allow schema to be marked as 'always sync to commitlog' All writes that use this schema will be immediately persisted to storage.	2020-01-15 12:15:42 +02:00
Piotr Jastrzebski	8df942a320	schema_builder: handle schema::_cdc_options Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-10-17 10:55:31 +02:00
Piotr Sarna	a1100e3737	schema: allow marking columns as computed in schema builder In order to be able to transform legacy materialized view definitions, builder is now able to mark an existing column as computed.	2019-07-19 11:58:41 +02:00
Piotr Sarna	491b7a817f	schema: add computed info to column definition Some columns may represent not user-provided values, but ones computed from other columns. Currently an example is token column used in secondary indexes to provide proper ordering. In order to avoid hardcoding special cases in execution stage, optional additional information for computed columns is stored in column definition.	2019-07-19 11:47:46 +02:00
Duarte Nunes	fa2b0384d2	Replace std::experimental types with C++17 std version. Replace stdx::optional and stdx::string_view with the C++ std counterparts. Some instances of boost::variant were also replaced with std::variant, namely those that called seastar::visit. Scylla now requires GCC 8 to compile. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20190108111141.5369-1-duarte@scylladb.com>	2019-01-08 13:16:36 +02:00
Paweł Dziepak	43e0201ec6	schema_builder: make member function names less confusing Right now, schema_builder member functions have names that very poorly convey the actions that are performed by them. This is made even worse by some overloads which drastically change the semantics. For example: schema_builder() .with_column("v1", /* ... */) .without_column("v1", removal_timestamp); Creates a column "v1" and adds information that there was a column with that name that was removed at 'removal_timestamp'. schema_builder() .with_column("v1") .without_column(utf8_type->decompose("v1")); This adds column "v1" and then immediately removes it. In order to clean up this mess the names were changed so that: * with_/without_ functions only add information to the schema (e.g. info that a column was removed, but without removing a column of that name if one exists) * functions whose names start with a verb actually perform that action, e.g. the new remove_column() removes the column (and adds information that it used to exist) as in the second example.	2018-11-22 11:30:31 +00:00
Nadav Har'El	0a1d93138d	schema: add "view virtual" flag to schema's column_definition In this patch we add a flag, "view virtual", that we can set on a column defined in a schema. In following patches, we will add such virtual columns to materialized views to allow view rows to remain alive despite having no data (refs #3362). After this patch, the "view virtual" flag exists in our in-memory representation of the schema, but is not persisted to disk - we will fix this in the next patch. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-08-16 15:23:09 +03:00
Piotr Sarna	0513dc17a1	schema: add clearing indexes to schema builder This commit adds 'without_indexes()' method to builder, used to clear all previous index declarations from schema definition.	2018-05-22 21:10:51 +02:00
Calle Wilund	3ab760b375	schema: Add opaque type to represent extensions A virtual opaque object meant to represent the "extensions" mapping in schema_tables::tables/views	2018-02-07 10:11:45 +00:00
Duarte Nunes	7eecda3a61	schema: Support compaction enabled attribute Fixes #2547 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170721132206.3037-1-duarte@scylladb.com>	2017-07-21 15:38:45 +02:00
Tomasz Grabiec	a9237c1666	schema: Revert back to the 1.7 layout of static compact tables in memory We are using C* 3.x compatible layout in schema tables but want to keep using the 1.7 layout in memory for compatibility during rolling upgrade. This patch switches the schema and schema_builder classes back to the old layout. Translation of layout happens when converting to/from schema mutations. Notable changes: 1) Includes a revert of commit `6260f31e08` "thrift: Update CQL mapping of static CFs". 2) Brings back the "default_validation_class" schema attribute. In v3 it can be derived from column definitions, but in v2 it can't, so we have to store it. 3) legacy_schema_migrator and schema_builder don't have to do conversions to v3, this is now handled by the v3_columns class. schema_builder works with the same layout as schema, that is v2. 4) Includes a revert of commit `66991a7ccb` "v3 schema test fixes" Fixes #2555.	2017-07-19 09:52:15 +02:00
Tomasz Grabiec	49e21b3b8e	schema_builder: Add factory method for default_names	2017-07-17 09:40:06 +02:00
Calle Wilund	6c8b5fc09d	schema_tables: Use v3 schema tables and formats Switches system/schema_* for system_schema/*, updates schema/schema builder and uses to hold/expect v3 style info (i.e. types & dropped).	2017-05-10 16:44:48 +00:00
Calle Wilund	1c328a4166	schema_builder: Add helper to generate unique column names akin to origin	2017-05-10 16:44:48 +00:00
Pekka Enberg	06564afedb	schema: Kill index_info class It's no longer used. Indices are managed by the index_metadata class.	2017-05-08 10:19:34 +03:00
Pekka Enberg	830591b092	schema: Remove add_default_index_names() from schema_builder class The add_default_index_names() is part of the old and incomplete secondary index implementation in Scylla. Drop it as it's no longer used.	2017-05-04 14:59:12 +03:00
Pekka Enberg	62fba73a05	schema_builder: Add index_metadata support	2017-05-04 14:59:11 +03:00
Duarte Nunes	a64c47f315	schema: Move raw_view_info outside of raw_schema In preparation of an upcoming patch, where the schema won't directly store the raw_view_info. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-03-15 15:38:31 +01:00
Duarte Nunes	82ce8eedbd	schema: Add view_info field This patch adds a view_info optional field to the schema. Its presence indicates the schema represents a materialized view. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-12-20 13:06:11 +00:00
Duarte Nunes	5995aebf39	schema_builder: Ensure dense tables have compact col This patch ensures that when the schema is dense, regardless of compact_storage being set, the single regular column is translated into a compact column. This fixes an issue where Thrift dynamic column families are translated to a dense schema with a regular column, instead of a compact one. Since a compact column is also a regular column (e.g., for purposes of querying), no further changes are required. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1470062410-1414-1-git-send-email-duarte@scylladb.com>	2016-08-02 14:49:13 +02:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Tomasz Grabiec	41d475d9c0	schema_builder: Fluentize property setters	2016-02-22 20:23:29 +01:00
Paweł Dziepak	84840c1c98	schema: keep track of removed collections Cassandra disallows adding a column with the same name as a collection that existed in the past in that table if the types aren't compatible. To enforce that Scylla needs to keep track of all collections that ever existed in the column family. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-18 08:34:29 +01:00
Paweł Dziepak	da0f999123	schema_builder: add with_altered_column_type() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-11 10:34:54 +01:00
Paweł Dziepak	9807ddd158	schema_builder: add with_column_rename() Columns that are part of the primary key can be renamed. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-11 10:34:54 +01:00
Paweł Dziepak	42dc4ce715	schema: keep track of dropped columns Knowing which columns were dropped (and when) is important to prevent the data from the dropped ones reappearing if a new column is added with the same name. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-11 10:34:53 +01:00
Tomasz Grabiec	f58c2dec1e	schema: Make schema objects versioned The version needs to change value not only on structural changes but also temporal. This is needed for nodes to detect if the version they see was already synchronized with or not even if it has the same structure as the past versions. We also need to end up with the same version on all nodes when schema changes are commuted. For regular mutable schemas version will be calculated from underlying mutations when schema is announced. For static schemas of system keyspace it is calculated by hashing scylla version and column id, because we don't have mutations at the time of building the schema.	2016-01-08 21:10:26 +01:00


77 Commits