scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-25 09:11:10 +00:00

Author	SHA1	Message	Date
Calle Wilund	5105e9f5e1	cdc::log: Missing "preimage" check in row deletion pre-image Fixes #6561 Pre-image generation in row deletion case only checked if we had a pre-image result set row. But that can be from post-image. Also check actual existence of the pre-image CK. Message-Id: <20200608132804.23541-1-calle@scylladb.com>	2020-06-09 10:56:41 +03:00
Kamil Braun	013330199d	cdc/storage_proxy: keep cdc_service alive in storage_proxy operations storage_proxy is never deinitialized, so it may have still used cdc_service after its destructor was called. This fixes the problem by cdc_service inheriting from async_sharded_service and storage_proxy calling shared_from_this on the service whenever it uses it. cdc_service inherits from async_sharded_service and not simply from enable_shared_from_this, because there might be other services that cdc_service depends on. Assuming that these services are deinitialized after cdc_service (as they should), i.e. after stop() is called on cdc_service, making cdc_service async_sharded_service will keep their deinitialization code from being called until all references to cdc_service disappear (async_sharded_service keeps stop() from returning until this happens). Some more improvements should be possible through some refactoring: 1. Make augment_mutation_call a free function, not a member of cdc_service: it doesn't need any state that cdc_service has. db_context can be passed down from storage_proxy when it calls the function. 2. Remove the storage_proxy -> cdc_service reference. storage_proxy only needs augment_mutation_call, which would not be a part of the service. This would also get rid of the proxy -> cdc -> proxy reference cycle that we have now, and would allow storage_proxy to be safely deinitialized after cdc_service. 3. Maybe we could even remove the cdc_service -> storage_proxy reference. Is it really needed?	2020-06-08 13:25:51 +03:00
Kamil Braun	a1e235b1a4	CDC: Don't split collection tombstone away from base update Overwriting a collection cell using timestamp T is a process with following steps: 1. inserting a row marker (if applicable) with timestamp T; 2. writing a collection tombstone with timestamp T-1; 3. writing the new collection value with timestamp T. Since CDC does clustering of the operations by timestamp, this would result in 3 separate calls to `transform` (in case of INSERT, or 2 - in the case of UPDATE), which seems excessive, especially when pre-/postimage is enabled. This patch makes collection tombstones being treated as if they had the same TS as the base write and thus they are processed in one call to `transform` (as long as TTLs are not used). Also, `cdc_test` had to be updated in places that relied on former splitting strategy. Fixes #6084	2020-06-07 17:09:05 +03:00
Kamil Braun	d89b7a0548	cdc: rename CDC description tables Commit `968177da04` has changed the schema of cdc_topology_description and cdc_description tables in the system_distributed keyspace. Unfortunately this was a backwards-incompatible change: these tables would always be created, irrespective of whether or not "experimental" was enabled. They just wouldn't be populated with experimental=off. If the user now tries to upgrade Scylla from a version before this change to a version after this change, it will work as long as CDC is protected by the experimental flag and the flag is off. However, if we drop the flag, or if the user turns experimental on, weird things will happen, such as nodes refusing to start because they try to populate cdc_topology_description while assuming a different schema for this table. The simplest fix for this problem is to rename the tables. This fix must get merged in before CDC goes out of experimental. If the user upgrades his cluster from a pre-rename version, he will simply have two garbage tables that he is free to delete after upgrading. sstables and digests need to be regenerated for schema_digest_test since this commit effectively adds new tables to the system_distributed keyspace. This doesn't result in schema disagreement because the table is announced to all nodes through the migration manager.	2020-06-05 09:59:16 +02:00
Piotr Sarna	9a4394327a	Merge 'CDC: Disallowed CDC for tables with counter column(s)' from Juliusz. CDC for counters is unimplemented as of now, therefore any attempt to enable CDC log on counter table needs to be clearly disallowed. This patch does exactly this. The check whether schema has counter columns is performed in `cdc_service::impl` in: - `on_before_create_column_family`, - `on_before_update_column_family` and, if so, results in `invalid_request_exception` thrown. Fixes #6553 * jul-stas-6553-disallow-cdc-for-counters: test/cql: Check that CDC for counters is disallowed CDC: Disallowed CDC for tables with counter column(s)	2020-06-05 07:46:53 +02:00
Juliusz Stasiewicz	3a079cf21b	CDC: Disallowed CDC for tables with counter column(s) Until we get implementation of CDC for counters, we explicitly disallow it. The check is performed in `cdc_service::impl` in: - `on_before_create_column_family`, - `on_before_update_column_family` and results in `invalid_request_exception` thrown.	2020-06-03 18:29:36 +02:00
Piotr Dulikowski	97cb2892b2	cdc: include information about all PKs in trace This fixes a bug in CDC mutation augmentation logic. A lambda that is called for each partition key in a batch captures a trace state pointer, but moves it out after being called for the first time. This caused CDC tracing information to be included only for one of the partition keys of the batch. Fixes #6575	2020-06-03 11:07:57 +02:00
Juliusz Stasiewicz	f2cedbc228	cdc: Remove assert that bootstrap_tokens is nonempty	2020-05-29 12:23:08 +02:00
Kamil Braun	7a98db2ab3	cdc: set ttl column in log rows which update only collections	2020-05-27 08:40:05 +03:00
Piotr Jastrzebski	cd33b9f406	cdc: Tune expired sstables check frequency CDC Log is a time series which uses time window compaction with some time window. Data is TTLed with the same value. This means that sstable won't become fully expired more often than once per time window duration. This patch sets expired_sstable_check_frequency_seconds compaction strategy parameter to half of the time window. Default value of this parameter is 10 minutes which in most cases won't be a good fit. By default, we set TTL to 24h and time window to 1h. This means that with a default value of the parameter we would be checking every 10 minutes but new expired sstable would appear only every 60 minutes. The parameter is set to half of the time window duration because it's the expected time we have to wait for sstable to become fully expired. Half of the time we will wait longer and half of the time we will wait shorter. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-18 16:49:19 +03:00
Nadav Har'El	62c00a3f17	merge: Use time window compaction strategy for CDC Log table Merged pull request https://github.com/scylladb/scylla/pull/6427 by Piotr Jastrzębski: CDC Log is a time series so it makes sense to use time window compaction strategy for it. Our support for time series is limited so we make sure that we don't create more than 24 sstables. If TTL is configured to 0, meaning data does not expire, we don't use time window compaction strategy. This PR also sets gc_grace_seconds to 0 when TTL is not set to 0.	2020-05-13 14:36:43 +03:00
Piotr Jastrzebski	49b6010cb4	cdc: Use time window compaction strategy for CDC Log table CDC Log is a time series with data TTLed by default to 24 hours so it makes sense to use for it a time window compaction. A window size is adjusted to the TTL configured for CDC Log so that no more than 24 sstables will be created. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-12 07:53:40 +02:00
Piotr Jastrzebski	0cd0775a27	cdc: Set CDC Log gc_grace_seconds to 0 Data in CDC Log is TTLed and we want to remove it as soon as it expires. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-11 17:59:52 +02:00
Avi Kivity	76d21a0c22	Merge 'Make it possible to turn caching off per table and stop caching CDC Log' from Piotr J. " We inherited from Origin a `caching` table parameter. It's a map of named caching parameters. Before this PR two caching parameters were expected: `keys` and `rows_per_partition`. So far we have been ignoring them. This PR adds a new caching parameter called `enabled` which can be set to `true` or `false` and controls the usage of the cache for the table. By default, it's set to `true` which reflects Scylla behavior before this PR. This new capability is used to disable caching for CDC Log table. It is desirable because CDC Log entries are not expected to be read often. They also put much more pressure on memory than entries in Base Table. This is caused by the fact that some writes to Base Table can override previous writes. Every write to CDC Log is unique and does not invalidate any previous entry. Fixes #6098 Fixes #6146 Tests: unit(dev, release), manual " * haaawk-dont_cache_cdc: cdc: Don't cache CDC Log table table: invalidate disabled cache on memtable flush table: Add cache_enabled member function cf_prop_defs: persist caching_options in schema property_definitions: add get that returns variant feature: add PER_TABLE_CACHING feature caching_options: add enabled parameter	2020-05-10 15:39:42 +03:00
Avi Kivity	6f1a8cfeea	Merge 'Use special partitioner for CDC Log' from Piotr " CDC has to create CDC streams that are co-located with corresponding BaseTable data. This is not always easy. Especially for small vnodes. This PR introduces new partitioner which allows us to easily find such stream ids that the stream belongs to a given vnode and shard. The idea is that a partitioner accepts only keys that are a blob composed of two int64 numbers. The first number is the token of the key. Tests: unit(dev), dtests(CDC) " * haaawk-cdc_partitioner: cdc:use CDCPartitioner for CDC Log dht: Add find_first_token_for_shard dht: use long_token in token::to_int64 cdc: add CDCPartitioner stream_id: add token_from_bytes static function i_partitioner: Stop distinguishing whether keys order is preserved	2020-05-06 20:29:27 +03:00
Piotr Jastrzebski	e3dd78b68f	cdc: Don't cache CDC Log table CDC writes are not expected to be read multiple times so it makes little sense to cache them. Moreover, CDC Log puts much bigger pressure on memory usage than Base Table because some updates to the Base Table override existing data while related CDC Log updates are always a new entry in a memtable. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-06 18:39:01 +02:00
Piotr Sarna	be5d3f4733	Merge 'A bunch of refactors in versioned_value and gossiper' from Kamil 1. Remove the `versioned_value::factory` class, it didn't add any value. It just forced us to create an object for making `versioned_value`s, for no sensible reason. 2. Move some `versioned_value` deserialization code (string -> internal data structures) into the versioned_value module. Previously, it was scattered all around the place. 3. Make `gossiper::get_seeds` const and return a const reference. I needed these refactors for a PR I was preparing to fix an issue with CDC. The attempt of fixing the issue failed (I'm trying something different now), but the refactors might be useful anyway. * kbr--vv-refactor: gossiper: make `get_seeds` method const and return a const ref versioned_value: remove versioned_value::factory class gms: move TOKENS string deserialization code into versioned_value	2020-04-28 10:27:45 +02:00
Juliusz Stasiewicz	d37b3f34f1	cdc: fix the "NoHostAvailable" client error when CL is not met This commit resolves the client-observable effect of CDC read consistencies. I wrapped the preimage's SELECT query in try-catch to intercept the `unavailable_exception`, which led to misleading `NoHostAvailable` in Python and Java drivers. Now client gets a new error code and a message specific to the issue of CL not being met by the preimage query. Fixes #5746	2020-04-27 13:56:57 +02:00
Piotr Jastrzebski	0416d70c9f	cdc:use CDCPartitioner for CDC Log This will allow deterministic stream_id generation and would remove the risk of not being able to generate a stream id for some vnode. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-04-22 18:25:51 +02:00
Piotr Jastrzebski	7884eada1a	cdc: add CDCPartitioner This is a special partitioner that will be used by CDC Log. It works only with partition key that is blob composed of two ints. The first int is a token this partitioner will map the key to. The second int is there to make it possible to create multiple keys that are different from each other but map to the same token. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-04-21 15:50:22 +02:00
Piotr Jastrzebski	330cd162f0	stream_id: add token_from_bytes static function This function will be used by CDCPartitioner to extract token from partition key. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-04-21 15:50:22 +02:00
Juliusz Stasiewicz	c70311f73e	cdc: CL for preimage select is calculated from base write CL CL of LOCAL_QUORUM used to be hardcoded into CDC preimage query and led to an error when number of replicas was lower than CL would require. The solution here is to link the CLs of writes to base table with the CLs of CDC reads, so the client will get the (limited) control over the consistency of preimage SELECTs (instead of getting error every time). The algorithm is as follows: 1. If write that caused CDC activity was done with CL = ANY, then do preimage read with CL = ONE. 2. If write that caused CDC activity was done with CL = ALL, then do preimage read with CL = QUORUM. 3. SERIAL and LOCAL_SERIAL writes cause preimage read with QUORUM and LOCAL_QUORUM, respectively. 4. In other cases do preimage read with the same CL as base write.	2020-04-21 14:33:36 +02:00
Kamil Braun	113384b6f8	gms: move TOKENS string deserialization code into versioned_value And do the same with CDC_STREAMS_TIMESTAMP. The code that took a list of tokens represented as a string inside versioned_value (for gossiping) and deserialized it into an `unordered_set<dht::token>` lived in the storage_service module, while the code that did the serializing (set -> string) lived in versioned_value. There was a similar situation with the CDC generation timestamp. To increase maintainability and reusability, the deserialization code is now placed next to the serialization code in versioned_value. Furthermore, the `make_full_token_string`, `make_token_string`, and `make_cdc_streams_timestamp_string` (serialization functions) are moved out of versioned_value::factory and made static methods of versioned_value instead.	2020-04-20 12:57:13 +02:00
Piotr Dulikowski	ff80b7c3e2	cdc: do not change frozen list type in cdc log table For a column of type `frozen<list<T>>` in base table, a corresponding column of type `frozen<map<timeuuid, T>>` is created in cdc log. Although a similar change of type takes place in case of non-frozen lists, this is unneeded in case of frozen lists - frozen collections are atomic, therefore there is no need for complicated type that will be able to represent a column update that depends on its previous value (e.g. appending elements to the end of the list). Moreover, only cdc log table creation logic performs this type change for frozen lists. The logic of `transformer::transform`, which is responsible for creation of mutations to cdc log, assumes that atomic columns will have their types unchanged in cdc log table. It simply copies new value of the column from original mutation to the cdc log mutation. A serialized frozen list might be copied to a field that is of frozen map type, which may cause the field to become impossible to deserialize. This patch causes frozen list base table columns to have a corresponding column in cdc log with the same type. A test is added which asserts that the type of cdc log columns is not changed in the case of frozen base columns. Tests: unit(dev) Fixes #6172	2020-04-14 09:44:22 +02:00
Calle Wilund	65a6ebbd73	cdc: Postimage must check iff we have (pre-)image row data for non-touched columns Fixes #6143 When doing post-image generation, we also write values for columns not in delta (actual update), based on data selected in pre-image row. However, if we are doing initial update/insert with only a subset of columns, when the pre-image result set is nil, this cannot be done. Adds check to non-touched column post-image code. Also uses the pre-image value extractor to handle non-atomic sets properly. Tests updated.	2020-04-08 13:48:54 +02:00
Calle Wilund	532a8634c6	cdc::log: Only generate pre/post-image when enabled Fixes #6073 The logic with pre/post image was tangled into looking at "rs" and would cause pre-image info to be stored even if only post-image data was enabled. Now only generate keys (and rows for them) iff explicitly enabled. And only generate pre-image key iff we have pre-image data.	2020-03-24 15:32:30 +00:00
Calle Wilund	881ebe192b	cdc::log: Handle non-atomic column assignments broken into two Fixes #6070 When mutation splitting was added, non-atomic column assignments were broken into two invocation of transform. This means the second (actual data assignment) does not know about the tombstone in first one -> postimage is created as if we were _adding_ to the collection, not replacing it. While not pretty, we can handle this knowing that we always get invoked in timestamp order -> tombstone first, then assign. So we simply keep track of non-atomic columns deleted across calls and filter out preimage data post this. Added test cases for all non-atomics	2020-03-24 14:07:13 +00:00
Nadav Har'El	f1aaa91e21	merge: add metrics Merged pull request https://github.com/scylladb/scylla/pull/6030 from Piotr Dulikowski: Adds CDC-related metrics. Following counters are added, both for total and failed operations: Total number of CDC operations that did/did not perform splitting, Total number of CDC operations that touched a particular mutation part. Total number of preimage selects. Fixes #6002. Tests: unit(dev, debug) * 'cdc-metrics' of github.com:piodul/scylla: storage_proxy: track CDC operations in LWT flow storage_proxy: track CDC operations in logged batches storage_proxy: track CDC operations in standard flow storage_proxy: add cdc tracker hooks to write response handlers storage_proxy: move "else if" remainder into "else" block cdc: create an operation_result_tracker object cdc: add an object for tracking progress of cdc mutations cdc: count touched mutation parts in transformer::transform cdc: track preimage selects in metrics cdc: register metric counters cdc: fix non-atomic updates in splitting	2020-03-23 21:55:58 +02:00
Piotr Dulikowski	5a5cc57878	cdc: create an operation_result_tracker object An `operation_result_tracker` object is now returned as a second return value from the `augment_mutation_call` function.	2020-03-23 14:05:25 +01:00
Piotr Dulikowski	1b92cbeabe	cdc: add an object for tracking progress of cdc mutations CDC metrics, apart from tracking "total" metrics for all performed CDC operations, also track metrics for "failed" operations. Because the result of the CDC operation depends on whether all CDC mutations were written successfully by storage_proxy, checking for failure and incrementing appropriate counters is deferred after all write response handlers finish. The `cdc::operation_result_tracker` object was created for that purpose. It contains all the details needed to accurately update the metrics based on what actually happened in the `augment_mutation_call` function, and holds a flag which tells if any of write response handlers failed. This object is supposed to be referenced by write response handlers for CDC mutations created after the same `augment_mutation_call`. After all write response handlers are destroyed, the destructor of `operation_result_tracker` will update appropriate metrics. Actual creating and attaching this object to write response handlers will be done in subsequent commits.	2020-03-23 14:05:25 +01:00
Piotr Dulikowski	98e5fdc7ac	cdc: count touched mutation parts in transformer::transform Modifies the transformer::transform so that it also returns a set of flags indicating what parts of the mutation (e.g. rows, tombstones, collections, etc.) were processed during transforming.	2020-03-23 14:05:25 +01:00
Piotr Dulikowski	53570d8657	cdc: track preimage selects in metrics This commit causes preimage select counter to be increased after performing this operation.	2020-03-23 14:05:25 +01:00
Piotr Dulikowski	e7062de02b	cdc: register metric counters This patch defines a CDC metrics object and registers all of its counters. storage_proxy is chosen as the owner of the metrics object. Because in subsequent commits it will become possible for CDC metrics to be updated after a write operation ends, and because the cdc_service has shorter lifetime than storage_proxy, we could risk a use-after-free if we placed this object inside cdc_service.	2020-03-23 14:05:25 +01:00
Piotr Dulikowski	338e473946	cdc: fix non-atomic updates in splitting This patch fixes a bug in mutation splitting logic of CDC. In the part that handles updates of non-atomic clustering columns, the column definition was fetched from a static column of the same id instead of the actual definition of the clustering column. It could cause the value to be written to a wrong column. Tests: unit(dev)	2020-03-23 13:47:23 +01:00
Piotr Dulikowski	a693e6ff6c	cdc: fix non-atomic updates in splitting This patch fixes a bug in mutation splitting logic of CDC. In the part that handles updates of non-atomic clustering columns, the schema for serializing that column was looked up incorrectly in the table schema - instead of a `regular_column`, a `static_column` was looked up. Due to how the `column_at` function works, a correct schema was always returned if the table had no static columns. Therefore, in order for this bug to manifest, a table with a static column and a regular column with non-atomic collection was needed.	2020-03-23 10:20:24 +01:00
Botond Dénes	e0284bb9ee	treewide: add missing headers and/or forward declarations	2020-03-23 09:29:45 +02:00
Piotr Dulikowski	3bfb044bf1	cdc: do not create cdc$deleted columns for pks and cks Primary key and clustering key column should not have a corresponding "cdc$deleted_<name>" column in cdc log table, because it does not make sense to delete such a column from a row. Fixes: #6049 Tests: unit(dev)	2020-03-21 07:33:23 +01:00
Piotr Dulikowski	59727fb34b	cdc: remove result_callback The `result_callback` was a callback returned by `augment_mutation_call` that was supposed to be used in the CDC postimage implementation. Because CDC postimage was implemented without using this callback, and currently a no-op function is always returned, this callback can safely be removed.	2020-03-19 14:55:07 +02:00
Nadav Har'El	35d95d6887	merge: Add postimage implementation Merged pull request https://github.com/scylladb/scylla/pull/5996 from Calle Wilund: Fixes #4992 Implements post-image support by synthesizing it from pre-image + delta. Post-image data differs from the delta data in two ways: 1.) It merges non-atomics into an actual result value 2.) It contains all columns of the row, not just those affected by the update. For a non-atomic field, the post-image value of a column is either the pre-image or the delta (maybe null) Tested by adding post-image checks to pre-image test and collection/udt tests	2020-03-16 13:42:07 +02:00
Calle Wilund	0a3383c090	cdc: Add postimage implementation Fixes #4992 Implements post-image support by synthesizing it from pre-image + delta. Post-image data differs from the delta data in two ways: 1.) It merges non-atomics into an actual result value 2.) It contains _all_ columns of the row, not just those affected by the update. For a non-atomic field, the post-image value of a column is either the pre-image or the delta (maybe null) Tested by adding post-image checks to pre-image test and collection/udt tests	2020-03-16 09:21:06 +00:00
Piotr Dulikowski	b1e8170bf9	cdc: add tracing Adds information about the stages of CDC mutation augmentation to tracing sessions.	2020-03-15 11:54:10 +01:00
Calle Wilund	5c743bfd53	cdc: rename inner "process_cells" to avoid confusion Two lambdas should not share name in same function.	2020-03-09 13:06:32 +00:00
Piotr Dulikowski	5f652e58c1	cdc: allow dropping manually created tables with cdc log suffix The is_log_for_some_table function incorrectly assumed that database::find_schema would return a null pointer in case the queried schema does not exist. This patch fixes that, and now this function checks for existence of the schema using database::has_schema. Tests: unit(dev)	2020-03-09 12:17:13 +01:00
Nadav Har'El	6febd4199e	merge: cdc: on row delete, show the whole row as preimage Merged pull request https://github.com/scylladb/scylla/pull/5980 by Piotr Jastrzębski, based on https://github.com/scylladb/scylla/pull/5976 by Juliusz Stasiewicz: "If base mutation has at least one row tombstone, its preimage log entry displays all the base columns." Fixes #5709 Tests: unit(dev)	2020-03-08 14:54:59 +02:00
Juliusz Stasiewicz	68071d35ce	cdc: on row delete display the entire row as preimage If base mutation has at least one row tombstone, its preimage log entry is constructed from all the base columns. Fixes #5709 Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-08 12:11:07 +01:00
Piotr Dulikowski	0e413efb48	cdc: correct static row preimage for case with no clustering row In case a static and a clustering row is written at the same time, but a clustering row with given key was not present, the preimage query was incorrectly configured and no rows were returned. This resulted in an empty preimage, while a preimage for static row should be present. This patch fixes this and now the static row is correctly written to cdc log in the case above. Tests: unit(dev)	2020-03-08 09:25:45 +01:00
Juliusz Stasiewicz	e2b76fd559	cdc: move the extractor of `pirow` columns into separate method Because it will be used more than once.	2020-03-06 17:54:42 +01:00
Piotr Dulikowski	f317283578	cdc: disallow creating nested CDC logs This change disallows creating CDC log tables for already existing CDC log tables. CDC logs nested in that way are not really useful and do not work at the moment, therefore disallowing their creation prevents confusion.	2020-03-06 10:47:13 +01:00
Piotr Dulikowski	38b7f1ad45	unit tests: register cdc extension before tests In the following commits, using cdc in tests will require registering cdc extension explicitly in db config.	2020-03-05 16:11:20 +01:00
Piotr Dulikowski	0f4f48ef76	cdc: construct cdc_options directly inside cdc_extension Instead of storing a raw map of options inside `cdc_extension`, the extension now converts them into `cdc_options` directly on construction. This removes the need to construct `cdc_options` object multiple times.	2020-03-05 16:09:44 +01:00
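
Illustration for commit c70311f73e (CL for preimage select is calculated from base write CL). The four rules listed in that commit message can be read as a small mapping function. The sketch below is not the code from the commit: the `consistency_level` enum and the name `preimage_read_cl` are hypothetical stand-ins introduced here, and only the mapping itself is taken from the message.

```cpp
// Hypothetical stand-in for the consistency levels named in the commit
// message; this is NOT Scylla's actual enum.
enum class consistency_level {
    ANY, ONE, TWO, THREE, QUORUM, ALL,
    LOCAL_QUORUM, EACH_QUORUM, SERIAL, LOCAL_SERIAL, LOCAL_ONE
};

// Hypothetical helper: maps the CL of the base-table write to the CL used
// for the CDC preimage read, following rules 1-4 of the commit message.
constexpr consistency_level preimage_read_cl(consistency_level write_cl) {
    switch (write_cl) {
    case consistency_level::ANY:          // rule 1: ANY -> ONE
        return consistency_level::ONE;
    case consistency_level::ALL:          // rule 2: ALL -> QUORUM
        return consistency_level::QUORUM;
    case consistency_level::SERIAL:       // rule 3: SERIAL -> QUORUM
        return consistency_level::QUORUM;
    case consistency_level::LOCAL_SERIAL: // rule 3: LOCAL_SERIAL -> LOCAL_QUORUM
        return consistency_level::LOCAL_QUORUM;
    default:                              // rule 4: reuse the write CL
        return write_cl;
    }
}
```

Under this reading, a write at LOCAL_ONE triggers a preimage read at LOCAL_ONE, while a write at ALL triggers a preimage read at QUORUM.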

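Illustration for commits 49b6010cb4 and cd33b9f406 (time window compaction for the CDC Log). The messages say the window is sized from the TTL so that no more than 24 sstables are created, and that the expired-sstable check period is half of the window. A rough sketch of that arithmetic follows. The struct and function names are hypothetical, and the rounding (ceiling division by 24) is an assumption; the commit messages do not state how the real code rounds.

```cpp
#include <chrono>

// Hypothetical container for the two derived compaction parameters.
struct twcs_params {
    std::chrono::seconds window;               // time window size
    std::chrono::seconds expired_check_period; // expired sstable check period
};

// Hypothetical helper. Assumes ttl > 0; per commit 62c00a3f17 a TTL of 0
// means data never expires and time window compaction is not used at all.
constexpr twcs_params cdc_log_twcs_params(std::chrono::seconds ttl) {
    // Ceiling division (an assumption) keeps the number of windows
    // covering one TTL period at 24 or fewer.
    const std::chrono::seconds window((ttl.count() + 23) / 24);
    // Half a window is the expected wait for an sstable to fully expire.
    return twcs_params{window, window / 2};
}
```

With the default TTL of 24 hours this gives a 1 hour window and a 30 minute check period, matching the figures quoted in commit cd33b9f406.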

481 Commits