scylladb

Author	SHA1	Message	Date
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Asias He	a8ad385ecd	repair: Get rid of the gc_grace_seconds The gc_grace_seconds is a very fragile and broken design inherited from Cassandra. Deleted data can be resurrected if cluster wide repair is not performed within gc_grace_seconds. This design pushes the job of making the database consistency to the user. In practice, it is very hard to guarantee repair is performed within gc_grace_seconds all the time. For example, repair workload has the lowest priority in the system which can be slowed down by the higher priority workload, so that there is no guarantee when a repair can finish. A gc_grace_seconds value that is used to work might not work after data volume grows in a cluster. Users might want to avoid running repair during a specific period where latency is the top priority for their business. To solve this problem, an automatic mechanism to protect data resurrection is proposed and implemented. The main idea is to remove the tombstone only after the range that covers the tombstone is repaired. In this patch, a new table option tombstone_gc is added. The option is used to configure tombstone gc mode. For example: 1) GC a tombstone after gc_grace_seconds cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'timeout'} ; This is the default mode. If no tombstone_gc option is specified by the user. The old gc_grace_seconds based gc will be used. 2) Never GC a tombstone cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'disabled'}; 3) GC a tombstone immediately cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'immediate'}; 4) GC a tombstone after repair cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair'}; In addition to the 'mode' option, another option 'propagation_delay_in_seconds' is added. It defines the max time a write could possibly delay before it eventually arrives at a node. A new gossip feature TOMBSTONE_GC_OPTIONS is added. The new tombstone_gc option can only be used after the whole cluster supports the new feature. A mixed cluster works with no problem. Tests: compaction_test.py, ninja test Fixes #3560 [avi: resolve conflicts vs data_dictionary]	2022-01-04 19:48:14 +02:00
Avi Kivity	247f2b69d5	Merge "system tables: create the schema more efficiently" from Botond " System tables currently almost uniformly use a pattern like this to create their schema: return schema_builder(make_shared_schema(...)) // [...] .with_version(...) .build(...); This pattern is very wasteful because it first creates a schema, then dismantles it just to recreate it again. This series abolishes this pattern without much churn by simply adding a constructor to schema builder that takes identical parameters to `make_shared_schema()`, then simply removing `make_shared_schema()` from these users, who now build a schema builder object directly and build the schema only once. Tests: unit(dev) " * 'schema-builder-make-shared-schema-ctor/v1' of https://github.com/denesb/scylla: treewide: system tables: don't use make_shared_schema() for creating schemas schema_builder: add a constructor providing make_shared_schema semantics schema_builder: without_column(): don't assume column_specification exists schema: add static variant of column_name_type()	2021-11-07 18:23:22 +02:00
Botond Dénes	e991604918	schema: make private constructor invokable via make_lw_shared The schema has a private constructor, which means it can't be constructed with `make_lw_shared()` even by classes which are otherwise able to invoke the private constructor themselves. This results in such classes (`schema_builder`) resorting to building a local schema object, then invoking `make_lw_shared()` with the schema's public move constructor. Moving a schema is not cheap at all however, so each `schema_builder::build()` call results in two expensive schema construction operations. We could make `make_lw_shared()` a friend of `schema` to resolve this, but then we'd de-facto open the private constructor to the world. Instead this patch introduces a private tag type, which is added to the private constructor, which is then made public. Everybody can invoke the constructor but only friends can create the private tag instance required to actually call it. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20211105085940.359708-1-bdenes@scylladb.com>	2021-11-07 12:51:09 +02:00
Botond Dénes	d3833c5978	schema: add static variant of column_name_type() So schema_builder can use it too (without a schema instance at hand).	2021-11-05 11:41:04 +02:00
Raphael S. Carvalho	4950ce539c	schema: replace outdated comment on default compaction strategy Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20211104210043.199156-1-raphaelsc@scylladb.com>	2021-11-05 00:35:41 +02:00
Botond Dénes	3f4f408bcf	schema: add get_reversed() A variant of make_reversed() which goes through the schema registry, teaching the schema to the registry if necessary. This effectively caches the result of the reversing and as an added bonus double reversing yields the very same schema C++ object that was the starting point. Closes #9365	2021-09-22 18:55:25 +03:00
Botond Dénes	f200c8104a	schema: introduce make_reversed() `make_reversed()` creates a schema identical to the schema instance it is called on, with clustering order reversed. To distinguish the reverse schema from the original one, the node-id part of its version UUID is bit-flipped. This ensures that reversing a schema twice will result in the identical schema to the original one (although a different C++ object). This reversed schema will be used in reversed reads, so intermediate layers can be ignorant of the fact that the read happens in reverse.	2021-09-09 11:49:05 +03:00
Botond Dénes	9a9b58e67b	schema: add a transforming copy constructor Taking a transform functor, which is executed after the raw schema is copied, but before the derived fields are computed (rebuild()).	2021-09-09 11:49:05 +03:00
Asias He	6350a19f73	compaction: Move compaction_strategy.hh to compaction dir The top dir is a mess. Move compaction_strategy.hh and compaction_strategy_type.hh to the new home.	2021-08-07 08:06:37 +08:00
Pavel Solodovnikov	76bea23174	treewide: reduce header interdependencies Use forward declarations wherever possible. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Closes #8813	2021-06-07 15:58:35 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Solodovnikov	e0749d6264	treewide: some random header cleanups Eliminate not used includes and replace some more includes with forward declarations where appropriate. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-06-06 19:18:49 +03:00
Piotr Jastrzebski	76d7c761d1	schema: Stop using deprecated constructor This is another boring patch. One of schema constructors has been deprecated for many years now but was used in several places anyway. Usage of this constructor could lead to data corruption when using MX sstables because this constructor does not set schema version. MX reading/writing code depends on schema version. This patch replaces all the places the deprecated constructor is used with schema_builder equivalent. The schema_builder sets the schema version correctly. Fixes #8507 Test: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <4beabc8c942ebf2c1f9b09cfab7668777ce5b384.1622357125.git.piotr@scylladb.com>	2021-05-30 11:58:27 +03:00
Pavel Solodovnikov	fff7ef1fc2	treewide: reduce boost headers usage in scylla header files `dev-headers` target is also ensured to build successfully. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-05-20 01:33:18 +03:00
Pavel Solodovnikov	aa4c359cff	column_mapping_entry: extract == and != operators Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20201016123638.99534-1-pa.solodovnikov@scylladb.com>	2020-10-16 14:59:50 +02:00
Pavel Solodovnikov	81cf11f8a0	schema: add equality operator for `column_mapping` class Add a comparator for column mappings that will be used later in unit-tests to check whether two column mappings match or not. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2020-10-15 19:24:44 +03:00
Pavel Solodovnikov	9aa4712270	lwt: introduce `paxos_grace_seconds` per-table option to set paxos ttl Previously system.paxos TTL was set as max(3h, gc_grace_seconds). Introduce new per-table option named `paxos_grace_seconds` to set the amount of seconds which are used to TTL data in paxos tables when using LWT queries against the base table. Default value is equal to `DEFAULT_GC_GRACE_SECONDS`, which is 10 days. This change allows to easily test various issues related to paxos TTL. Fixes #6284 Tests: unit (dev, debug) Co-authored-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200816223935.919081-1-pa.solodovnikov@scylladb.com>	2020-08-17 16:44:14 +02:00
Nadav Har'El	7e01ae089e	cdc: avoid including cdc/cdc_options.hh everywhere Before this patch, modifying cdc/cdc_options.hh required recompiling 264 source files. This is because this header file was included by a couple other header files - most notably schema.hh, where a forward declaration would have been enough. Only the handful of source files which really need to access the CDC options should include "cdc/cdc_options.hh" directly. After this patch, modifying cdc/cdc_options.hh requires only 6 source files to be recompiled. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200813070631.180192-1-nyh@scylladb.com>	2020-08-16 14:41:47 +03:00
Rafael Ávila de Espíndola	efeaded427	Everywhere: Add a make_shared_schema helper This replaces a lot of make_lw_shared(schema(...)) with make_shared_schema(...). This makes it easier to drop a dependency on the differences between seastar::make_shared and std::make_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-21 10:33:49 -07:00
Pavel Emelyanov	f045cec586	snap: Get rid of storage_service reference in schema.cc Now when the snapshot stopping is correctly handled, we may pull the database reference all the way down to the schema::describe(). One tricky place is in table::snapshot() -- the local db reference is pulled through an smp::submit_to call, but thanks to the shard checks in the place where it is needed the db is still "local" Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-26 20:28:25 +03:00
Glauber Costa	44a0e40cb2	compaction: move compaction_strategy_type to its own header I just hit a circularity in header inclusion that I traced back to the fact that schema.hh includes compaction_strategy.hh. schema.hh is in turn included in lots of places, so a circularity is not hard to come by. The schema header really only needs to know about the compaction_type, so it can inform schema users about it. Following the trend in header cleanups, I am moving that to a separate header which will both break the circularity and make sure we include less stuff that is not needed. With this change, Scylla fails to compile due to a new missing forward declaration at index/secondary_index_manager.hh, so this is fixed. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200527172203.915936-1-glauber@scylladb.com>	2020-05-29 08:14:27 +03:00
Pekka Enberg	ed0d00f51e	Revert "Revert "schema: Default dc_local_read_repair_chance to zero"" This reverts commit `43b488a7bc`. The commit was originally reverted because a dtest was sensitive to the value. The dtest is fixed now, so let's revert the revert as requested by Glauber.	2020-05-21 08:05:13 +03:00
Pavel Solodovnikov	f6e765b70f	cql3: pass `column_specification` via lw_shared_ptr `column_specification` class is marked as "final": it's safe to use non-polymorphic pointer "lw_shared_ptr" instead of a more generic "shared_ptr". tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200427084016.26068-1-pa.solodovnikov@scylladb.com>	2020-04-27 12:47:42 +03:00
Pekka Enberg	43b488a7bc	Revert "schema: Default dc_local_read_repair_chance to zero" This reverts commit `fdd2d9de3d` because it breaks one heat-weighted load balancing dtest: FAIL: heat_weighted_load_balancing_cl_QUORUM_test (heat_weighted_load_balancing_test.HeatWeightedLB) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/penberg/src/scylla/scylla-dtest/heat_weighted_load_balancing_test.py", line 182, in heat_weighted_load_balancing_cl_QUORUM_test self.run_heat_weighted_load_balancing('QUORUM') File "/home/penberg/src/scylla/scylla-dtest/heat_weighted_load_balancing_test.py", line 165, in run_heat_weighted_load_balancing self.verify_metrics(metrics, cached=False) File "/home/penberg/src/scylla/scylla-dtest/heat_weighted_load_balancing_test.py", line 73, in verify_metrics mean_avg, node_mean_avg, key)) AssertionError: 19.0 not found in range(3, 13) : Cache difference between nodes is less then expected: 6469.6/328.2, metric scylla_storage_proxy_coordinator_reads_local_node I am reverting because it's a test issue, and we should bring this commit back once the test is fixed. Gleb Natapov explains: "dtest result directly depends on replicas we contact. Glauber's patch make us contacts less replicas, so numbers differ."	2020-04-02 13:43:29 +03:00
Glauber Costa	fdd2d9de3d	schema: Default dc_local_read_repair_chance to zero dc_local_read_repair_chance is a legacy of old times: Cassandra itself now defaults to zero, and we should look into that too. Most serious production clusters are either repaired through our asynchronous repair, or don't need repair at all. Synchronous read repair can help things converging, but it implies an impact at query time. For clusters that are on an asynchronous repair schedule this should not be needed. Fixes #6109 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200331183418.21452-1-glauber@scylladb.com>	2020-04-01 08:27:49 +02:00
Piotr Jastrzebski	e72696a8e6	sharding_info: rename the class to sharder Also rename all variables that were named si or sinfo to sharder. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	92cdc21123	schema: remove incorrect comment partitioner is actually part of schema digest and is stored locally in internal tables. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	7bd2b8d73f	schema: make it possible to set sharding_info per schema Previously schema::get_sharding_info was obtaining sharding_info from the partitioner but we want to remove sharding_info from the partitioner so we need a place in schema to store it there instead. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 18:42:33 +02:00
Piotr Jastrzebski	8d81a2498f	schema: add get_sharding_info At the moment, we have a single sharding logic per node but we want to be able to set it per table in the future. To make it easy to change in the future sharding_info will be managed inside schema and all the other code will access it through schema::get_sharding_info function. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-30 09:35:27 +02:00
Nadav Har'El	35d95d6887	merge: Add postimage implementation Merged pull request https://github.com/scylladb/scylla/pull/5996 from Calle Wilund: Fixes #4992 Implements post-image support by synthesizing it from pre-image + delta. Post-image data differs from the delta data in two ways: 1.) It merges non-atomics into an actual result value 2.) It contains all columns of the row, not just those affected by the update. For a non-atomic field, the post-image value of a column is either the pre-image or the delta (maybe null) Tested by adding post-image checks to pre-image test and collection/udt tests	2020-03-16 13:42:07 +02:00
Calle Wilund	ca7046256f	schema: Add "columns" accessor for columns by kind To prevent switch-code everywhere.	2020-03-16 09:21:06 +00:00
Piotr Jastrzebski	5bbb826c49	schema: drop optional from _partitioner field Always set the field to the default value if no table specific partitioner has been set. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:21 +01:00
Piotr Jastrzebski	22daa262ee	partitioner: move default_partitioner to schema.cc Make it inaccessible to other compilation units. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	57b69fb804	schema: include partitioner name in scylla tables mutation There are two results of this patch: 1. New partitioner name column is persisted on node's disk in scylla_tables 2. New partitioner name column is included into schema digest This is achieved by including this new column in scylla tables mutation. For that we: 1. Add partitioner name to the result of make_scylla_tables_mutation. If table does not have a specific partitioner set and uses default partitioner then we don't include the name of such default partitioner. Only the name of custom partitioner is added if a table has one. 2. In create_table_from_mutations we check whether scylla tables mutation has a partitioner name set. If so then we use it as a parameter for schema_builder. Note that previous patches have ensured that this new column will be included into schema digest only after the whole cluster supports per table partitioners. Before that, during rolling upgrade, new partitioner name column is hidden and not shared with other nodes. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	1d6cec1b0a	schema: make it possible to set custom partitioner schema_builder::with_partitioner can be used now to set custom partitioner on a table. If no such partitioner is set, global partitioner is still used. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Piotr Jastrzebski	54d24553bb	schema: get_partitioner return const& Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-06 13:33:53 +01:00
Piotr Dulikowski	861c7b5626	schema: get cdc options from schema extensions Removes logic responsible for setting cdc_options from dedicated column in scylla_tables, and uses the "cdc" schema extension instead.	2020-03-05 16:11:21 +01:00
Rafael Ávila de Espíndola	151f5e723f	Pass string_view to the schema constructor This moves string copies from the callers of the constructor to the implementation. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-02-28 17:04:12 -08:00
Piotr Jastrzebski	9b95153136	schema: add get_partitioner() The plan is to remove dht::global_partitioner() and use schema::get_partitioner() instead. This will allow a usage of per schema/table partitioner instead of a single global partitioner everywhere. Initially schema::get_partitioner will call dht::global_partitioner. After all the calls to dht::global_partitioner are switched to schema::get_partitioner, the ability to set per schema partitioner will be implemented. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:04:41 +01:00
Nadav Har'El	9953a33354	merge "Adding a schema file when creating a snapshot" Merged pull request https://github.com/scylladb/scylla/pull/5294 from Amnon Heiman: To use a snapshot we need a schema file that is similar to the result of running cql DESCRIBE command. The DESCRIBE is implemented in the cql driver so the functionality needs to be re-implemented inside scylla. This series adds a describe method to the schema file and uses it when doing a snapshot. There are different approaches to handling materialized views and secondary indexes. This implementation creates each schema.cql file in its own relevant directory, so the schema for a materialized view, for example, will be placed in the snapshot directory of the table of that view. Fixes #4192	2020-01-16 12:05:50 +02:00
Amnon Heiman	82367b325a	schema: Add a describe method This patch adds a describe method to a table schema. It acts similar to a DESCRIBE cql command that is implemented in a CQL driver. The method supports tables, secondary indexes, local indexes, and materialized views. relates to: #4192 Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2020-01-15 15:06:00 +02:00
Gleb Natapov	16e0fc4742	schema: allow schema to be marked as 'always sync to commitlog' All writes that use this schema will be immediately persisted to storage.	2020-01-15 12:15:42 +02:00
Calle Wilund	2787b0c4f8	cdc: Move "options" to separate header to avoid too much header inclusion cdc should not contaminate the whole universe.	2019-12-09 12:12:09 +00:00
Konstantin Osipov	6159c012db	schema: pre-allocate the bitset of column_set The number of columns is usually small, and avoiding a resize speeds up bit manipulation functions.	2019-11-13 11:41:51 +03:00
Konstantin Osipov	e95d675567	schema: introduce schema::all_columns_count() schema::all_columns_count() will be used to reserve memory of the column_set bitmask.	2019-11-13 11:41:42 +03:00
Konstantin Osipov	191acec7ab	schema: rename column_mask to column_set Since it contains a precise set of columns, it's more accurate to call it a set, not a mask. Besides, the name column_mask is already used for column options on storage level.	2019-11-13 11:41:30 +03:00
Nadav Har'El	631846a852	CDC: Implement minimal version that logs only primary key of each change Merge a patch series from Piotr Jastrzębski (haaawk): This PR introduces CDC in its minimal version. It is possible now to create a table with CDC enabled or to enable/disable CDC on existing table. There is a management of CDC log and description related to enabling/disabling CDC for a table. For now only primary key of the changed data is logged. To be able to co-locate cdc streams with related base table partitions it was needed to propagate the information about the number of shards per node. This was done through gossip. There is an assumption that all the nodes use the same value for sharding_ignore_msb_bits. If it does not hold we would have to gossip sharding_ignore_msb_bits around together with the number of shards. Fixes #4986. Tests: unit(dev, release, debug)	2019-10-20 11:41:01 +03:00
Piotr Jastrzebski	ca9536a771	schema: add _cdc_options field Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-10-17 10:55:31 +02:00
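
Commit e991604918 in the log above describes making a constructor public while guarding it with a private tag type, so that a shared-pointer factory can construct the object in place without opening construction to everyone. The following is only a minimal sketch of that idiom, not the actual ScyllaDB code. It assumes C++17 and uses `std::make_shared` as a stand-in for Seastar's `make_lw_shared()`; the names `table_schema`, `table_schema_builder`, `private_tag`, and `example_usage` are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the real schema class (illustrative names only).
class table_schema {
    // Private tag type: only table_schema and its friends can name it.
    // The explicit constructor stops outsiders from passing a bare `{}`
    // (this relies on C++17 or later, where the tag is not an aggregate).
    struct private_tag {
        explicit private_tag() = default;
    };
    friend class table_schema_builder;

    std::string _name;
public:
    // Public, so std::make_shared can call it, but unusable without a tag.
    table_schema(private_tag, std::string name) : _name(std::move(name)) {}

    const std::string& name() const { return _name; }
};

class table_schema_builder {
    std::string _name;
public:
    explicit table_schema_builder(std::string name) : _name(std::move(name)) {}

    // Constructs the object directly inside the shared allocation:
    // no local table_schema is built and then moved.
    std::shared_ptr<const table_schema> build() const {
        return std::make_shared<table_schema>(table_schema::private_tag{}, _name);
    }
};

// Small usage example: the builder is the only way to obtain an instance.
inline void example_usage() {
    table_schema_builder builder("ks.cf");
    std::shared_ptr<const table_schema> s = builder.build();
    assert(s->name() == "ks.cf");
    // table_schema direct({}, "ks.cf");  // would not compile outside friends
}
```

The tag costs nothing at run time. Its only job is to move the access check from the constructor itself to the ability to create the tag, which is what lets a generic factory function perform the construction.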


233 Commits