scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 10:00:35 +00:00

Author	SHA1	Message	Date
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Asias He	a8ad385ecd	repair: Get rid of the gc_grace_seconds The gc_grace_seconds is a very fragile and broken design inherited from Cassandra. Deleted data can be resurrected if cluster wide repair is not performed within gc_grace_seconds. This design pushes the job of making the database consistency to the user. In practice, it is very hard to guarantee repair is performed within gc_grace_seconds all the time. For example, repair workload has the lowest priority in the system which can be slowed down by the higher priority workload, so that there is no guarantee when a repair can finish. A gc_grace_seconds value that is used to work might not work after data volume grows in a cluster. Users might want to avoid running repair during a specific period where latency is the top priority for their business. To solve this problem, an automatic mechanism to protect data resurrection is proposed and implemented. The main idea is to remove the tombstone only after the range that covers the tombstone is repaired. In this patch, a new table option tombstone_gc is added. The option is used to configure tombstone gc mode. For example: 1) GC a tombstone after gc_grace_seconds cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'timeout'} ; This is the default mode. If no tombstone_gc option is specified by the user. The old gc_grace_seconds based gc will be used. 2) Never GC a tombstone cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'disabled'}; 3) GC a tombstone immediately cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'immediate'}; 4) GC a tombstone after repair cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair'}; In addition to the 'mode' option, another option 'propagation_delay_in_seconds' is added. It defines the max time a write could possibly delay before it eventually arrives at a node. A new gossip feature TOMBSTONE_GC_OPTIONS is added. The new tombstone_gc option can only be used after the whole cluster supports the new feature. A mixed cluster works with no problem. Tests: compaction_test.py, ninja test Fixes #3560 [avi: resolve conflicts vs data_dictionary]	2022-01-04 19:48:14 +02:00
Pavel Solodovnikov	83862d9871	db: save supported features after passing gossip feature check Move saving features to `system.local#supported_features` to the point after passing all remote feature checks in the gossiper, right before joining the ring. This makes `system.local#supported_features` column to store advertised feature set. Leave a comment in the definition of `system.local` schema to reflect that. Since the column value is not actually used anywhere for now, it shouldn't affect any tests or alter the existing behavior. Later, we can optimize the gossip communication between nodes in the cluster, removing the feature check altogether in some cases (since the column value should now be monotonic). Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-12-23 12:48:37 +03:00
Pavel Solodovnikov	96799a72d9	db: add `save_local_supported_features` function This is a utility function for writing the supported feature set to the `system.local` table. Will be used to move the corresponding part from `system_keyspace::setup_version` to the gossiper after passing remote feature check, effectively making `system.local#supported_features` store the advertised features (which already passed the feature check). Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-12-20 13:31:52 +03:00
Avi Kivity	078f69c133	Merge "raft: (service) implement group 0 as a service" from Kostja " To ensure consistency of schema and topology changes, Scylla needs a linearizable storage for this data available at every member of the database cluster. The series introduces such storage as a service, available to all Scylla subsystems. Using this service, any other internal service such as gossip or migrations (schema) could persist changes to cluster metadata and expect this to be done in a consistent, linearizable way. The series uses the built-in Raft library to implement a dedicated Raft group, running on shard 0, which includes all members of the cluster (group 0), adds hooks to topology change events, such as adding or removing nodes of the cluster, to update group 0 membership, ensures the group is started when the server boots. The state machine for the group, i.e. the actual storage for cluster-wide information still remains a stub. Extending it to actually persist changes of schema or token ring is subject to a subsequent series. Another Raft related service was implemented earlier: Raft Group Registry. The purpose of the registry is to allow Scylla have an arbitrary number of groups, each with its own subset of cluster members and a relevant state machine, sharing a common transport. Group 0 is one (the first) group among many. " * 'raft-group-0-v12' of github.com:scylladb/scylla-dev: raft: (server) improve tracing raft: (metrics) fix spelling of waiters_awaken raft: make forwarding optional raft: (service) manage Raft configuration during topology changes raft: (service) break a dependency loop raft: (discovery) introduce leader discovery state machine system_keyspace: mark scylla_local table as always-sync commitlog system_keyspace: persistence for Raft Group 0 id and Raft Server Id raft: add a test case for adding entries on follower raft: (server) allow adding entries/modify config on a follower raft: (test) replace virtual with override in derived class raft: (server) fix a typo in exception message raft: (server) implement id() helper raft: (server) remove apply_dummy_entry() raft: (test) fix missing initialization in generator.hh	2021-11-30 16:24:51 +02:00
Avi Kivity	ec775ba292	Merge "Remove more gms::get(_local)?_gossiper() calls" from Pavel E " This set covers simple but diverse cases: - cache hitrace calculator - repair - system keyspace (virtual table) - dht code - transport event notifier All the places just require straightforward arguments passing. And a reparation in transport -- event notifier needs a backref to the owning server. Remaining after this set is the snitch<->gossiper interaction and the cache hitrate app state update from table code. tests: unit(dev) " * 'br-unglobal-gossiper-cont' of https://github.com/xemul/scylla: transport: Use server gossiper in event notifier transport: Keep backreference from event_notifier transport: Keep gossiper on server dht: Pass gossiper to range_streamer::add_ranges dht: Pass gossiper argument to bootstrap system_keyspace: Keep gossiper on cluster_status_table code: Carry gossiper down to virtual tables creation repair: Use local gossiper reference cache_hitrate_calculator: Keep reference on gossiper	2021-11-28 14:18:28 +02:00
Pavel Solodovnikov	1365e2f13e	gms: feature_service: re-enable features on node startup Re-enable previously persisted enabled features on node startup. The features list to be enabled is read from `system.local#enabled_features`. In case an unknown feature is encountered, the node fails to boot with an exception, because that means the node is doing a prohibited downgrade procedure. Features should be enabled before commitlog starts replaying since some features affect storage (for example, when determining used sstable format). This patch implements a part of solution proposed by Tomek in https://github.com/scylladb/scylla/issues/4458. Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-11-28 14:18:24 +02:00
Konstantin Osipov	30e3227e0b	system_keyspace: mark scylla_local table as always-sync commitlog It is infrequently updated (typically once at start) but stores critical state for this instance survival (Raft Group 0 id, Raft server id, sstables format), so always write it to commit log in sync mode.	2021-11-25 11:50:38 +03:00
Konstantin Osipov	fd295850fe	system_keyspace: persistence for Raft Group 0 id and Raft Server Id Implement system_keyspace helpers to persist Raft Group 0 id and Raft Server id. Do not use coroutines in a template function to work around https://bugs.llvm.org/show_bug.cgi?id=50345	2021-11-25 11:50:38 +03:00
Pavel Emelyanov	ef1960d034	code: Carry gossiper down to virtual tables creation One of the tables needs gossiper and uses global one. This patch prepares the fix by patching the main -> register_virtual_tables stack with the gossiper reference. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-25 10:52:55 +03:00
Pavel Emelyanov	947e4c9a10	code: Push db::config down to virtual tables The db::config reference is available on the database, which can be get from the virtual_table itself. The problem is that it's a const refernece, while system.config will be updateable and will need non-const reference. Adding non-const get_config() on the database looks wrong. The database shouldn't be used as config provider, even the const one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-11 16:39:34 +03:00
Botond Dénes	200e2fad4d	db/system_keyspace: propagate distributed<> database and storage_service to register_virtual_tables() As some virtual tables will need the distributed versions of these.	2021-11-05 15:42:41 +02:00
Pavel Emelyanov	4b4ce015aa	system-keyspace: Keep UUID value when saving The set_local_host_id() accepts UUID references and starts to save it in local keyspace and in all shards' local cache. Before it was coroutinized the UUID was copied on captures and survived, after it it remains references. The problem is that callers pass local variables as arguments that go away "really soon". Fix it to accept UUID as value, it's short enough for safe and painless copy. fixes: #9425 tests: dtest.ReplaceAddress_rbo_enabled.replace_node_diff_ip(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20211004145421.32137-1-xemul@scylladb.com>	2021-10-04 18:21:44 +03:00
Pavel Emelyanov	9f5fd8b5c0	system_keyspace: Keep local_host_id on local_cache Some places in the code want to have future-less access to the host id, now they do it all by themselves. Local cache seems to be a better place (for the record -- some time ago the "better place" argument justified cached host id relocation from the storage_service onto the database). While at it -- add the future-less getter for the host_id to be used further. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 10:54:38 +03:00
Pavel Emelyanov	beb345c00a	code: Rename get_local_host_id() into load_...() There will appear the future-less method which better deserves the get_ prefix, so give the existing method the load_ one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-30 10:33:57 +03:00
Avi Kivity	e9ae9279e8	system_keyspace: reindent after conversion to class Conversion to class left indentation in ruins, but that can be easily fixed. 'git diff -w' reports no changes. Closes #9339	2021-09-14 08:49:24 +03:00
Avi Kivity	e70b9d4835	system_keyspace: convert from namespace to class All the namespace scope functions in system_keyspace have no place to store context, so they must store their context in global variables. This prevents conversion of those global variables to constructor-provided depdendencies. Take the first step towards providing a place to store the context by converting system_keyspace to a class. All the functions are static, so no context is yet available, but we can de-static-ify them incrementally in the future and store the context in class members. Indentation is a mess, but can be easily fixed later.	2021-09-13 15:14:14 +03:00
Avi Kivity	115d6d8d4c	system_keyspace: prepare forward-declared members In anticipation of making system_keyspace a class instead of a namespace, rename any member that is currently forward-declared, since one can't forward-declare a class member. Each member is taken out of the system_keyspace namespace and gains a system_keyspace prefix. Aliases are added to reduce code churn. The result isn't lovely, but can be adjusted later.	2021-09-13 15:11:26 +03:00
Avi Kivity	c6ce81d6a0	system_keyspace: rearrange legacy subnamespace Merge two fragments together, in anticipation of making 'legacy' s struct instead of a namespace (when system_keyspace is a class, we can't nest a namespace inside it).	2021-09-13 15:10:15 +03:00
Avi Kivity	6d379ae6f9	system_keyspace: remove outdated java code This code has been rewritten and not removed, or is not needed. Remove it to reduce clutter.	2021-09-13 15:08:57 +03:00
Pavel Solodovnikov	8d3c0ee9b6	raft: new schema for storing raft snapshots Previously, the layout for storing raft snapshot descriptors contained a `config` field, which had `blob` data type. That means `raft::configuration` for the snapshot was serialized as a whole in binary form. It's convenient to implement and is the most compact form of representing the data, but: 1. Hard to debug due to the need to de-serialize the data. 2. Plants a time bomb wrt. changing data layout and also the documentation in the future. Remove the `config` field from `system.raft_snapshots` and extract it to a separate `system.raft_config` table to store the data in exploded form. Also, modify the schema of `system.raft_snapshots` table in the following way: add a `server_id` field as a part of composite partition key ((group_id, server_id)) to be able to start multiple raft servers belonging to one raft group on the same scylla node. Rename `id` field in `raft_snapshots` to `snapshot_id` so it's self-documenting. Rename `snapshot_id` from clustering key since a given server can have only one snapshot installed at a time. Note that the `raft::server_address` stucture contains an opaque `info` member, which is `bytes`, but in the `raft_config` table we use `ip_addr inet` field, instead. We always know that the corresponding member field is going to contain an IP address (either v4 or v6) of a given raft server. So, now the snapshots schema looks like this: CREATE TABLE raft_snapshots ( group_id timeuuid, server_id uuid, snapshot_id uuid, idx int, term int, -- no `config` field here, moved to `raft_config` table PRIMARY KEY ((group_id, server_id)) ) CREATE TABLE raft_config ( group_id timeuuid, my_server_id uuid, server_id uuid, disposition text, -- can be either 'CURRENT` or `PREVIOUS' can_vote bool, ip_addr inet, PRIMARY KEY ((group_id, my_server_id), server_id, disposition) ); This way it's much easier to extend the schema with new fields, very easy to debug and inspect via CQL, and it's much more descriptive in terms of self-documentation. Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-08-27 09:24:46 +03:00
Pavel Solodovnikov	c0854a0f62	raft: create system tables only when `raft` experimental feature is set Also introduce a tiny function to return raft-enabled db config for cql testing. Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20210826091432.279532-1-pa.solodovnikov@scylladb.com>	2021-08-26 12:21:12 +03:00
Juliusz Stasiewicz	f8067d938d	storage_service: Pass the reference down to system_keyspace According to the policy of avoiding globals.	2021-07-20 14:18:24 +02:00
Pavel Solodovnikov	76bea23174	treewide: reduce header interdependencies Use forward declarations wherever possible. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Closes #8813	2021-06-07 15:58:35 +03:00
Avi Kivity	a55b434a2b	treewide: extent copyright statements to present day	2021-06-06 19:18:49 +03:00
Kamil Braun	4658adbe18	tree-wide: introduce cdc::generation_id_v2 This is a new type of CDC generation identifiers. Compared to old IDs, additionally to the timestamp it contains an UUID. These new identifiers will allow a safer and more efficient algorithm of introducing new generations into a cluster (introduced in a later commit). For now, nodes keep using the old identifier format when creating new generations and whenever they learn about a new CDC generation from gossip they assume that it also is stored in the v1 format. But they do know how to (de)serialize the second format and how to persist new identifiers in local tables.	2021-05-24 17:50:21 +02:00
Kamil Braun	99fd2244a3	tree-wide: introduce cdc::generation_id type This is a follow-up to the previous commit. Each CDC generation has a timestamp which denotes a logical point in time when this generation starts operating. That same timestamp is used to identify the CDC generation. We use this identification scheme to exchange CDC generations around the cluster. However, the fact that a generation's timestamp is used as an ID for this generation is an implementation detail of the currently used method of managing CDC generations. Places in the code that deal with the timestamp, e.g. functions which take it as an argument (such as handle_cdc_generation) are often interested in the ID aspect, not the "when does the generation start operating" aspect. They don't care that the ID is a `db_clock::time_point`. They may sometimes want to retrieve the time point given the ID (such as do_handle_cdc_generation when it calls `cdc::metadata::insert`), but they don't care about the fact that the time point actually IS the ID. In the future we may actually change the specific type of the ID if we modify the generation management algorithms. This commit is an intermediate step that will ease the transition in the future. It introduces a new type, `cdc::generation_id`. Inside it contains the timestamp, so: 1. if a piece of code doesn't care about the timestamp, it just passes the ID around 2. if it does care, it can simply access it using the `get_ts` function. The fact that `get_ts` simply accesses the ID's only field is an implementation detail. Using the occasion, we change the `do_handle_cdc_generation_intercept...` function to be a standard function, not a coroutine. It turns out that - depending on the shape of the passed-in argument - the function would sometimes miscompile (the compiled code would not copy the argument to the coroutine frame).	2021-04-07 13:47:13 +02:00
Kamil Braun	e486e0f759	tree-wide: rename "cdc streams timestamp" to "cdc generation id" Each CDC generation always has a timestamp, but the fact that the timestamp identifies the generation is an implementation detail. We abstract away from this detail by using a more generic naming scheme: a generation "identifier" (whatever that is - a timestamp or something else). It's possible that a CDC generation will be identified by more than a timestamp in the (near) future. The actual string gossiped by nodes in their application state is left as "CDC_STREAMS_TIMESTAMP" for backward compatibility. Some stale comments have been updated.	2021-04-06 13:15:31 +02:00
Kamil Braun	1019ff07cb	db: system_keyspace: group cdc functions in single place	2021-04-06 13:15:31 +02:00
Kamil Braun	9bdd000e97	cdc: rewrite streams to the new description table Nodes automatically ensure that the latest CDC generation's list of streams is present in the streams description table. When a new generation appears, we only need to update the table for this generation; old generations are already inserted. However, we've changed the description table (from `cdc_streams_descriptions` to `cdc_streams_descriptions_v2`). The existing mechanism only ensures that the latest generation appears in the new description table. This commit adds an additional procedure that rewrites the older generations as well, if we find that it is necessary to do so (i.e. when some CDC log tables may contain data in these generations).	2021-02-18 11:44:59 +01:00
Gleb Natapov	d8345c67d9	Consolidate system and non system keyspace creation The code that creates system keyspace open code a lot of things from database::create_keyspace(). The patch makes create_keyspace() suitable for both system and non system keyspaces and uses it to create system keyspaces as well. Message-Id: <20210209160506.1711177-1-gleb@scylladb.com>	2021-02-09 17:18:04 +01:00
Pavel Solodovnikov	cf5b8c4b79	raft: create `system.raft` and `system.raft_snapshots` tables System raft table will be used as a backend storage for implementing raft persistence module in Scylla. It combines both raft log, persisted vote and term, and snapshot info. The table is partitioned by group id, thus allowing multi-raft operation. The rest of the table structure mirrors the fields of corresponding core raft structures defined in `raft.hh`, such as `raft::log_entry`. The raft table stores the only the latest snapshot id while the actual snapshot will be available in a separate table called `system.raft_snapshots`. The schema of `raft_snapshots` mirrors the fields of `raft::snapshot` structure. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-01-29 01:59:04 +03:00
Piotr Sarna	f293c59a46	system_keyspace: migrate helper functions to string_view Functions for checking if the keyspace is system/internal were based on sstring references, which is impractical compared to string views and may lead to unnecessary creation of sstring instances.	2021-01-04 09:47:01 +01:00
Pavel Emelyanov	fea4a5492f	system-keyspace: Remove dead code Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20201123151453.27341-1-xemul@scylladb.com>	2020-11-23 17:16:15 +02:00
Pavel Emelyanov	689fd029a1	query-context: Remove database from qctx No users of qctx::db are left. One global database reference less. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	fb20d9cd1e	system-keyspace: Remove dead code Not called anywhare. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	78298ec776	init: Use local messaging reference in main There are few places that initialize db and system_ks and need the messaging service. Pass the reference to it from main instead of using the global helpers. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-19 13:08:12 +03:00
Calle Wilund	30a700c5b0	system_keyspace: Remove support for legacy truncation records Fixes #6341 Since scylla no longer supports upgrading from a version without the "new" (dedicated) truncation record table, we can remove support for these and the migtration thereof. Make sure the above holds whereever this is committed. Note that this does not remove the "truncated_at" field in system.local.	2020-08-03 17:16:26 +03:00
Benny Halevy	e39fbe1849	compaction: move compaction uuid generation to compaction_info We'd like to use the same uuid both for printing compaction log messages and to update compaction_history. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-07-16 13:55:23 +03:00
Pavel Emelyanov	3c2066bd78	system_keyspace: Cleanup setup() from storage_service Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-05-25 14:17:31 +03:00
Gleb Natapov	97af6bb0bd	lwt: make load_paxos_state to take partition_key_view instead of a deference Some caller have partition_key_view, but not partition_key, so thy need to create a temporary and copy just to pass a reference. Change it by accepting a view.	2020-04-22 13:51:43 +03:00
Gleb Natapov	8a408ac5a8	lwt: remove entries from system.paxos table after successful learn stage The learning stage of PAXOS protocol leaves behind an entry in system.paxos table with the last learned value (which can be large). In case not all participants learned it successfully next round on the same key may complete the learning using this info. But if all nodes learned the value the entry does not serve useful purpose any longer. The patch adds another round, "prune", which is executed in background (limited to 1000 simultaneous instances) and removes the entry in case all nodes replied successfully to the "learn" round. It uses the ballot's timestamp to do the deletion, so not to interfere with the next round. Since deletion happens very close to previous writes it will likely happen in memtable and will never reach sstable, so that reduces memtable flush and compaction overhead. Fixes #5779 Message-Id: <20200330154853.GA31074@scylladb.com>	2020-03-30 21:02:14 +03:00
Pavel Emelyanov	4fa12f2fb8	header: De-bloat schema.hh The header sits in many other headers, but there's a handy schema_fwd.hh that's tiny and contains needed declarations for other headers. So replace shema.hh with schema_fwd.hh in most of the headers (and remove completely from some). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200303102050.18462-1-xemul@scylladb.com>	2020-03-03 11:34:00 +01:00
Pavel Emelyanov	6050c559a3	storage_service: Move get_local_tokens wrapper This wrapper just makes sure the system_keyspace::get_saved_tokens reports non empty result. Move them close together. As a side effect -- get rid of penultimate global storage_service reference from size_estimates_virtual_reader (the last one will be removed soon). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-10 20:54:31 +03:00
Avi Kivity	bed61b96a2	Merge "Move features from storage- into feature-service" from Pavel " There's a lot of code around that needs storage service purely to get the specific feature value (cluster_supports_<something> calls). This creates several circular dependencies, e.g. storage_service <-> migration_manager one and database <-> storage_servuce. Also features sit on storage_service, but register themselfs on the feature_service and the former subscribes on them back which also looks strange. I propose to keep all the features on feature_service, this keeps the latter intependent from other components, makes it possible to break one of the mentioned circle dependencyand heavily relax the other. Also the set helps us fighting the globals and, after it, the feature_service can be safely stopped at the very last moment. Tests: unit(dev), manual debug build start-stop " * 'br-features-to-service-5' of https://github.com/xemul/scylla: gossiper: Avoid string merge-split for nothing features: Stop on shutdown storage_service: Remove helpers storage_service: Prepare to switch from on-board feature helpers cql3: Check feature in .validate database: Use feature service storage_proxy: Use feature service migration_manager: Use feature service start: Pass needed feature as argument into migrate_truncation_records features: Unfriend storage_service features: Simplify feature registration features: Introduce known_feature_set features: Move disabled features set from storage_service features: Move schema_features helper features: Move all features from storage_service to feature_service storage_service: Use feature_config from _feature_service features: Add feature_config storage_service: Kill set_disabled_features gms: Move features stuff into own .cc file migration_manager: Move some fns into class	2020-02-09 19:22:07 +02:00
Pavel Emelyanov	74fd3466b5	start: Pass needed feature as argument into migrate_truncation_records As a nice side-effect this stops using global storage service instance by this function. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-03 15:16:23 +03:00
Kamil Braun	7fa30f6f34	db: add a system.cdc_local table with CDC generation timestamp This will be used to persist CDC streams generation timestamp proposed by a joining node in case the node crashes or restarts, similarly to the way tokens are persisted. The get_saved_cdc_streams_timestamp method retrieves the generation timestamp from the system table. It will be used by a restarting node. The update_cdc_streams_timestamp method saves CDC stream generation timestamp of the calling node in the system table. A joining node will persist the timestamp before it proposes it to other nodes.	2020-01-30 11:10:08 +01:00
Gleb Natapov	0fc48515d8	paxos: mark paxos table schema as "always sync" We want all writes to paxos table to be persisted on a storage before declared completed.	2020-01-15 12:15:42 +02:00
Piotr Sarna	36ec43a262	Merge "add table with connected cql clients" from Juliusz This change introduces system.clients table, which provides information about CQL clients connected. PK is the client's IP address, CK consists of outgoing port number and client_type (which will be extended in future to thrift/alternator/redis). Table supplies also shard_id and username. Other columns, like connection_stage, driver_name, driver_version..., are currently empty but exist for C* compatibility and future use. This is an ordinary table (i.e. non-virtual) and it's updated upon accepting connections. This is also why C*'s column request_count was not introduced. In case of abrupt DB stop, the table should not persist, so it's being truncated on startup. Resolves #4820	2020-01-14 10:01:07 +02:00
Piotr Jastrzebski	c08e6985cd	cdc: allow cluster rolling upgrade Addition of cdc column in scylla_tables changes how schema digests are calculated, and affect the ABI of schema update messages (adding a column changes other columns' indexes in frozen_mutation). To fix this, extend the schema_tables mechanism with support for the cdc column, and adjust schemas and mutations to remove that column when sending schemas during upgrade. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-01-05 14:39:23 +02:00

1 2 3

135 Commits