scylladb

Author	SHA1	Message	Date
Calle Wilund	05851578d4	alternator::streams: Report streams as not ready until CDC stream id:s are available Refs #6864 When booting a clean scylla, CDC stream ID:s will not be available until a ring delay time period has passed. Before this, writing to a CDC enabled table will fail hard. For alternator (and its tests), we can report the stream(s) for tables as not yet available (ENABLING) until such time as id:s are computed. v2: Keep storage service ref in executor	2020-08-03 20:34:15 +03:00
Nadav Har'El	2dcb6294da	merge: cdc: New delta modes: `off`, `keys`, `full` Merged pull request https://github.com/scylladb/scylla/pull/6914 by Juliusz Stasiewicz: The goal is to have finer control over CDC "delta" rows, i.e.: disable them totally (mode off); record only base PK+CK columns (mode keys); make them behave as usual (mode full, default). The editing of log rows is performed at the stage of finishing CDC mutation. Fixes #6838 tests: Added CQL test for `delta mode` cdc: Implementations of `delta_mode::off/keys` cdc: Infrastructure for controlling `delta_mode`	2020-08-03 14:10:15 +03:00
Botond Dénes	92a7b16cba	query: read_command: add max_result_size This field will replace max size which is currently passed once per established rpc connection via the CLIENT_ID verb and stored as an auxiliary value on the client_info. For now it is unused, but we update all sites creating a read command to pass the correct value to it. In the next patch we will phase out the old max size and use this field to pass max size on each verb instead.	2020-07-28 18:00:29 +03:00
Botond Dénes	8992bcd1f8	query: read_command: use tagged ints for limit ctor params The convenience constructor of read_command now has two integer parameters next to each other. In the next patch we intend to add another one. This is a recipe for disaster, so to avoid mistakes this patch converts these parameters to tagged integers. This makes sure callers pass what they meant to pass. As a matter of fact, while fixing up call-sites, I already found several ones passing `query::max_partitions` to the `row_limit` parameter. No harm done yet, as `query::max_partitions` == `query::max_rows` but this shows just how easy it is to mix up parameters with the same type.	2020-07-28 18:00:29 +03:00
Juliusz Stasiewicz	9e4247090f	cdc: Implementations of `delta_mode::off/keys` At the stage of `finish`ing CDC mutation, deltas are removed (mode `off`) or edited to keep only PK+CK of the base table (mode `keys`). Fixes #6838	2020-07-27 19:05:47 +02:00
Juliusz Stasiewicz	c05128d217	cdc: Infrastructure for controlling `delta_mode` The goal is to have finer control over CDC "delta" rows, i.e.: - disable them totally (mode `off`); - record only PK+CK (mode `keys`); - make them behave as usual (mode `full`, default). This commit adds the necessary infrastructure to `cdc_options`.	2020-07-27 19:00:06 +02:00
Kamil Braun	12e2891c60	cdc: if ring_delay == 0, don't add delay to newly created generation If ring_delay == 0, something fishy is going on, e.g. single-node tests are being performed. In this case we want the CDC generation to start operating immediately. There is no need to wait until it propagates to the cluster. You should not use ring_delay == 0 in production. Fixes https://github.com/scylladb/scylla/issues/6864.	2020-07-22 16:06:09 +03:00
Pavel Emelyanov	757a7145b9	headers: Remove mutation.hh from trace_state.hh Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-17 17:40:23 +03:00
Piotr Dulikowski	e2462bce3b	cdc: fix a corner case inside get_base_table It is legal for a user to create a table with name that has a _scylla_cdc_log suffix. In such case, the table won't be treated as a cdc log table, and does not require a corresponding base table to exist. During refactoring done as a part of initial implementation of Alternator streams (#6694), `is_log_for_some_table` started throwing when trying to check a name like `X_scylla_cdc_log` when there was no table with name `X`. Previously, it just returned false. The exception originates inside `get_base_table`, which tries to return the base table schema, not checking for its existence - which may throw. It makes more sense for this function to return nullptr in such case (it already does when provided log table name does not have the cdc log suffix), so this patch adds an explicit check and returns nullptr when necessary. A similar oversight happened before (see #5987), so this patch also adds a comment which explains why existence of `X_scylla_cdc_log` does not imply existence of `X`. Fixes: #6852 Refs: #5724, #5987	2020-07-16 16:38:48 +03:00
Calle Wilund	3376209718	cdc::schema: Make extensions explicitly settable from builder To make non-cql cdc schema options a reality.	2020-07-15 08:21:34 +00:00
Calle Wilund	0158f6473b	cdc: Add stream ids structure with time and expiration For reading the topology tables from within scylla.	2020-07-15 08:10:23 +00:00
Calle Wilund	331aa7c501	cdc: Add "is_cdc_metacolumn_name" predicate To sift column names	2020-07-15 08:10:23 +00:00
Calle Wilund	8a728ce618	cdc: Add get_base_table helper	2020-07-15 08:10:23 +00:00
Calle Wilund	8f462e8606	CDC::log: Add `base_name` helper To extract base table name from CDC log table name.	2020-07-15 08:10:23 +00:00
Piotr Dulikowski	ad811a48bf	cdc: fix indentation	2020-07-08 15:36:41 +02:00
Piotr Dulikowski	20b236d27d	cdc: don't update partition state when not needed In some cases, tracking the state of processed rows inside `transformer` is not needed at all. We don't need to do it if either: - Preimage and postimage are disabled for the table, - Only preimage is enabled and we are processing the last timestamp. This commit disables updating the state in the cases listed above.	2020-07-08 15:36:41 +02:00
Piotr Dulikowski	246f8da6f6	cdc: implement pre/postimage persistence Moves responsibility for generating pre/postimage rows from the "process_change" method to "produce_preimage" and "produce_postimage". This commit actually affects the contents of generated CDC log mutations. Added a unit test that verifies more complicated cases with CQL BATCH.	2020-07-08 15:36:41 +02:00
Piotr Dulikowski	24b50ffbc8	cdc: add interface for producing pre/postimages Introduces new methods to the change_processor interface that will cause it to produce pre/postimage rows for requested clustering key, or for static row. Introduces logic in split.cc responsible for calling pre/postimage methods of the change_processor interface. This does not have any effect on generated CDC log mutations yet, because the transformer class has empty implementations in place of those methods.	2020-07-08 15:36:41 +02:00
Piotr Dulikowski	761c59d92a	cdc: load preimage query result into partition state fields Instead of looking up preimage data directly from the raw preimage query results, use the raw results to populate current partition state data, and read directly from the current partition state.	2020-07-08 15:36:41 +02:00
Piotr Dulikowski	946354ee74	cdc: introduce fields for keeping partition state Introduces data structures that will be used for keeping the current state of processed rows: _clustering_row_states, and _static_row_state.	2020-07-08 15:36:41 +02:00
Piotr Dulikowski	bb587a93be	cdc: rename set_pk_columns -> allocate_new_log_row The new name better describes what this function does.	2020-07-08 15:36:41 +02:00
Piotr Dulikowski	82ddeb1992	cdc: track batch_no inside transformer Move tracking of batch_no inside the transformer.	2020-07-08 15:36:41 +02:00
Piotr Dulikowski	7b47f84965	cdc: move cdc$time generation to transformer Generate the timeuuid on the transformer side, which allows to simplify the change_processor interface.	2020-07-08 15:36:41 +02:00
Piotr Dulikowski	7691568b0a	cdc: move find_timestamp to split.cc The function is no longer used in log.cc, so instead it is moved to split.cc. Removed declaration of the function from the log.hh header, because it is not used elsewhere - apart from testing code, but it already declared find_timestamp in the cdc_test.cc file.	2020-07-08 15:36:40 +02:00
Piotr Dulikowski	51d97be0b3	cdc: introduce change_processor interface This allows for a more refined use of the transformer by the for_each_change function (now named "process_changes_with_splitting"). The change_processor interface exposes two methods so far: begin_timestamp, and process_change (previously named "transform"). By separating those two and exposing them, process_changes_with_splitting can cause the transformer to generate less CDC log mutations - only one for each timestamp in the batch.	2020-07-08 15:36:40 +02:00
Piotr Dulikowski	f907cab156	cdc: remove redundant schema arguments from cdc functions A `mutation` object already has a reference to its schema. It does not make sense to call functions changed in this commit with a different schema.	2020-07-08 15:36:40 +02:00
Piotr Dulikowski	fa00ea996a	cdc: move management of generated mutations inside transformer CDC log mutations are now stored inside `transformer`, and only moved to the final set of mutations at the end of `transformer`'s lifetime.	2020-07-08 15:36:40 +02:00
Piotr Dulikowski	76a323a02d	cdc: move preimage result set into a field of transformer Instead of passing the preimage result set in each `transform` call, it is now assigned to a field, and `transform` uses that field.	2020-07-08 15:36:40 +02:00
Piotr Dulikowski	79eabc04a8	cdc: keep ts and tuuid inside transformer Adds a `begin_timestamp` method which tells the `transformer` to start using the following timestamp and timeuuid when generating new log row mutations.	2020-07-08 15:36:40 +02:00
Piotr Dulikowski	3c01b3c41d	cdc: track touched parts of mutations inside transformer Moves tracking of the "touched parts" statistics inside the transformer class. This commit is the first of multiple commits in this series which move parts of the state used in CDC log row generation inside the `transformer` class. There is a lot of state being passed to `transformer` each time its methods are called, which could be as well tracked by the `transformer` itself. This will result in a nicer interface and will allow us to generate less CDC log mutations which give the same result.	2020-07-08 15:36:40 +02:00
Piotr Dulikowski	027d20c654	cdc: always include preimage for affected rows This changes the current algorithm so that the preimage row will not be skipped if the corresponding row was not present in preimage query results.	2020-07-08 15:36:40 +02:00
Piotr Sarna	4cb79f04b0	treewide: replace libjsoncpp usage with rjson In order to eventually switch to a single JSON library, most of the libjsoncpp usage is dropped in favor of rjson. Unfortunately, one usage still remains: test/utils/test_repl utility heavily depends on the exact textual format of its output JSON files, so replacing a library results in all tests failing because of differences in formatting. It is possible to force rjson to print its documents in the exact matching format, but that's left for later, since the issue is not critical. It would be nice though if our test suite compared JSON documents with a real JSON parser, since there are more differences - e.g. libjsoncpp keeps children of the object sorted, while rapidjson uses an unordered data structure. This change should cause no change in semantics, it strives just to replace all usage of libjsoncpp with rjson.	2020-07-03 10:27:23 +02:00
Juliusz Stasiewicz	8628ede009	cdc: Fix segfault when stream ID key is too short When a token is calculated for stream_id, we check that the key is exactly 16 bytes long. If it's not - `minimum_token` is returned and client receives empty result. This used to be the expected behavior for empty keys; now it's extended to keys of any incorrect length. Fixes #6570	2020-06-17 18:19:37 +03:00
Nadav Har'El	86a4dfcd29	merge: api: Command to check and repair cdc streams Merged pull request https://github.com/scylladb/scylla/pull/6551 from Juliusz Stasiewicz: The command regenerates streams when: generations corresponding to a gossiped timestamp cannot be fetched from system_distributed table, or when generation token ranges do not align with token metadata. In such case the streams are regenerated and new timestamp is gossiped around. The returned JSON is always empty, regardless of whether streams needed regeneration or not. Fixes #6498 Accompanied by: scylladb/scylla-jmx#109, scylladb/scylla-tools-java#172	2020-06-15 14:17:35 +03:00
Calle Wilund	5105e9f5e1	cdc::log: Missing "preimage" check in row deletion pre-image Fixes #6561 Pre-image generation in row deletion case only checked if we had a pre-image result set row. But that can be from post-image. Also check actual existence of the pre-image CK. Message-Id: <20200608132804.23541-1-calle@scylladb.com>	2020-06-09 10:56:41 +03:00
Kamil Braun	013330199d	cdc/storage_proxy: keep cdc_service alive in storage_proxy operations storage_proxy is never deinitialized, so it may have still used cdc_service after its destructor was called. This fixes the problem by cdc_service inheriting from async_sharded_service and storage_proxy calling shared_from_this on the service whenever it uses it. cdc_service inherits from async_sharded_service and not simply from enable_shared_from_this, because there might be other services that cdc_service depends on. Assuming that these services are deinitialized after cdc_service (as they should), i.e. after stop() is called on cdc_service, making cdc_service async_sharded_service will keep their deinitialization code from being called until all references to cdc_service disappear (async_sharded_service keeps stop() from returning until this happens). Some more improvements should be possible through some refactoring: 1. Make augment_mutation_call a free function, not a member of cdc_service: it doesn't need any state that cdc_service has. db_context can be passed down from storage_proxy when it calls the function. 2. Remove the storage_proxy -> cdc_service reference. storage_proxy only needs augment_mutation_call, which would not be a part of the service. This would also get rid of the proxy -> cdc -> proxy reference cycle that we have now, and would allow storage_proxy to be safely deinitialized after cdc_service. 3. Maybe we could even remove the cdc_service -> storage_proxy reference. Is it really needed?	2020-06-08 13:25:51 +03:00
Kamil Braun	a1e235b1a4	CDC: Don't split collection tombstone away from base update Overwriting a collection cell using timestamp T is a process with the following steps: 1. inserting a row marker (if applicable) with timestamp T; 2. writing a collection tombstone with timestamp T-1; 3. writing the new collection value with timestamp T. Since CDC does clustering of the operations by timestamp, this would result in 3 separate calls to `transform` (in case of INSERT, or 2 - in the case of UPDATE), which seems excessive, especially when pre-/postimage is enabled. This patch makes collection tombstones be treated as if they had the same TS as the base write and thus they are processed in one call to `transform` (as long as TTLs are not used). Also, `cdc_test` had to be updated in places that relied on the former splitting strategy. Fixes #6084	2020-06-07 17:09:05 +03:00
Kamil Braun	d89b7a0548	cdc: rename CDC description tables Commit `968177da04` has changed the schema of cdc_topology_description and cdc_description tables in the system_distributed keyspace. Unfortunately this was a backwards-incompatible change: these tables would always be created, irrespective of whether or not "experimental" was enabled. They just wouldn't be populated with experimental=off. If the user now tries to upgrade Scylla from a version before this change to a version after this change, it will work as long as CDC is protected by the experimental flag and the flag is off. However, if we drop the flag, or if the user turns experimental on, weird things will happen, such as nodes refusing to start because they try to populate cdc_topology_description while assuming a different schema for this table. The simplest fix for this problem is to rename the tables. This fix must get merged in before CDC goes out of experimental. If the user upgrades his cluster from a pre-rename version, he will simply have two garbage tables that he is free to delete after upgrading. sstables and digests need to be regenerated for schema_digest_test since this commit effectively adds new tables to the system_distributed keyspace. This doesn't result in schema disagreement because the table is announced to all nodes through the migration manager.	2020-06-05 09:59:16 +02:00
Piotr Sarna	9a4394327a	Merge 'CDC: Disallowed CDC for tables with counter column(s)' from Juliusz. CDC for counters is unimplemented as of now, therefore any attempt to enable CDC log on counter table needs to be clearly disallowed. This patch does exactly this. The check whether schema has counter columns is performed in `cdc_service::impl` in: - `on_before_create_column_family`, - `on_before_update_column_family` and, if so, results in `invalid_request_exception` thrown. Fixes #6553 * jul-stas-6553-disallow-cdc-for-counters: test/cql: Check that CDC for counters is disallowed CDC: Disallowed CDC for tables with counter column(s)	2020-06-05 07:46:53 +02:00
Juliusz Stasiewicz	3a079cf21b	CDC: Disallowed CDC for tables with counter column(s) Until we get implementation of CDC for counters, we explicitly disallow it. The check is performed in `cdc_service::impl` in: - `on_before_create_column_family`, - `on_before_update_column_family` and results in `invalid_request_exception` thrown.	2020-06-03 18:29:36 +02:00
Piotr Dulikowski	97cb2892b2	cdc: include information about all PKs in trace This fixes a bug in CDC mutation augmentation logic. A lambda that is called for each partition key in a batch captures a trace state pointer, but moves it out after being called for the first time. This caused CDC tracing information to be included only for one of the partition keys of the batch. Fixes #6575	2020-06-03 11:07:57 +02:00
Juliusz Stasiewicz	f2cedbc228	cdc: Remove assert that bootstrap_tokens is nonempty	2020-05-29 12:23:08 +02:00
Kamil Braun	7a98db2ab3	cdc: set ttl column in log rows which update only collections	2020-05-27 08:40:05 +03:00
Piotr Jastrzebski	cd33b9f406	cdc: Tune expired sstables check frequency CDC Log is a time series which uses time window compaction with some time window. Data is TTLed with the same value. This means that sstable won't become fully expired more often than once per time window duration. This patch sets expired_sstable_check_frequency_seconds compaction strategy parameter to half of the time window. Default value of this parameter is 10 minutes which in most cases won't be a good fit. By default, we set TTL to 24h and time window to 1h. This means that with a default value of the parameter we would be checking every 10 minutes but new expired sstable would appear only every 60 minutes. The parameter is set to half of the time window duration because it's the expected time we have to wait for sstable to become fully expired. Half of the time we will wait longer and half of the time we will wait shorter. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-18 16:49:19 +03:00
Nadav Har'El	62c00a3f17	merge: Use time window compaction strategy for CDC Log table Merged pull request https://github.com/scylladb/scylla/pull/6427 by Piotr Jastrzębski: CDC Log is a time series so it makes sense to use time window compaction strategy for it. Our support for time series is limited so we make sure that we don't create more than 24 sstables. If TTL is configured to 0, meaning data does not expire, we don't use time window compaction strategy. This PR also sets gc_grace_seconds to 0 when TTL is not set to 0.	2020-05-13 14:36:43 +03:00
Piotr Jastrzebski	49b6010cb4	cdc: Use time window compaction strategy for CDC Log table CDC Log is a time series with data TTLed by default to 24 hours so it makes sense to use for it a time window compaction. A window size is adjusted to the TTL configured for CDC Log so that no more than 24 sstables will be created. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-12 07:53:40 +02:00
Piotr Jastrzebski	0cd0775a27	cdc: Set CDC Log gc_grace_seconds to 0 Data in CDC Log is TTLed and we want to remove it as soon as it expires. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-11 17:59:52 +02:00
Avi Kivity	76d21a0c22	Merge 'Make it possible to turn caching off per table and stop caching CDC Log' from Piotr J. " We inherited from Origin a `caching` table parameter. It's a map of named caching parameters. Before this PR two caching parameters were expected: `keys` and `rows_per_partition`. So far we have been ignoring them. This PR adds a new caching parameter called `enabled` which can be set to `true` or `false` and controls the usage of the cache for the table. By default, it's set to `true` which reflects Scylla behavior before this PR. This new capability is used to disable caching for CDC Log table. It is desirable because CDC Log entries are not expected to be read often. They also put much more pressure on memory than entries in Base Table. This is caused by the fact that some writes to Base Table can override previous writes. Every write to CDC Log is unique and does not invalidate any previous entry. Fixes #6098 Fixes #6146 Tests: unit(dev, release), manual " * haaawk-dont_cache_cdc: cdc: Don't cache CDC Log table table: invalidate disabled cache on memtable flush table: Add cache_enabled member function cf_prop_defs: persist caching_options in schema property_definitions: add get that returns variant feature: add PER_TABLE_CACHING feature caching_options: add enabled parameter	2020-05-10 15:39:42 +03:00
Avi Kivity	6f1a8cfeea	Merge 'Use special partitioner for CDC Log' from Piotr " CDC has to create CDC streams that are co-located with corresponding BaseTable data. This is not always easy. Especially for small vnodes. This PR introduces new partitioner which allows us to easily find such stream ids that the stream belongs to a given vnode and shard. The idea is that a partitioner accepts only keys that are a blob composed of two int64 numbers. The first number is the token of the key. Tests: unit(dev), dtests(CDC) " * haaawk-cdc_partitioner: cdc:use CDCPartitioner for CDC Log dht: Add find_first_token_for_shard dht: use long_token in token::to_int64 cdc: add CDCPartitioner stream_id: add token_from_bytes static function i_partitioner: Stop distinguishing whether keys order is preserved	2020-05-06 20:29:27 +03:00
Piotr Jastrzebski	e3dd78b68f	cdc: Don't cache CDC Log table CDC writes are not expected to be read multiple times so it makes little sense to cache them. Moreover, CDC Log puts much bigger pressure on memory usage than Base Table because some updates to the Base Table override existing data while related CDC Log updates are always a new entry in a memtable. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-06 18:39:01 +02:00


165 Commits