scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-18 13:52:07 +00:00

Author	SHA1	Message	Date
Botond Dénes	c7c5817808	Merge 'Improve timestamp heuristics for tombstone garbage collection' from Benny Halevy When purging a regular tombstone, consult the min_live_timestamp, if available. This is safe since we don't need to protect dead data from resurrection, as it is already dead. For shadowable_tombstones, consult the min_memtable_live_row_marker_timestamp, if available, otherwise fall back to the min_live_timestamp. If we see in a view table a shadowable tombstone with time T, then in any row where the row marker's timestamp is higher than T the shadowable tombstone is completely ignored and it doesn't hide any data in any column, so the shadowable tombstone can be safely purged without any effect or risk of resurrecting any deleted data. In other words, rows which might cause problems for purging a shadowable tombstone with time T are rows with row markers older than or equal to T. So to know if a whole sstable can cause problems for a shadowable tombstone of time T, we need to check if the sstable's oldest row marker (and not oldest column) is older than or equal to T. And the same check applies similarly to the memtable. If both extended timestamp statistics are missing, fall back to the legacy (and inaccurate) min_timestamp. 
Fixes scylladb/scylladb#20423 Fixes scylladb/scylladb#20424 > [!NOTE] > no backport needed at this time > We may consider backport later on after given some soak time in master/enterprise > since we do see tombstone accumulation in the field under some materialized views workloads Closes scylladb/scylladb#20446 * github.com:scylladb/scylladb: cql-pytest: add test_compaction_tombstone_gc sstable_compaction_test: add mv_tombstone_purge_test sstable_compaction_test: tombstone_purge_test: test that old deleted data do not inhibit tombstone garbage collection sstable_compaction_test: tombstone_purge_test: add testlog debugging sstable_compaction_test: tombstone_purge_test: make_expiring: use next_timestamp sstable, compaction: add debug logging for extended min timestamp stats compaction: get_max_purgeable_timestamp: use memtable and sstable extended timestamp stats compaction: define max_purgeable_fn tombstone: can_gc_fn: move declaration to compaction_garbage_collector.hh sstables: scylla_metadata: add ext_timestamp_stats compaction_group, storage_group, table_state: add extended timestamp stats getters sstables, memtable: track live timestamps memtable_encoding_stats_collector: update row_marker: do nothing if missing	2024-09-13 08:56:51 +03:00
Benny Halevy	4de4af954f	sstables: scylla_metadata: add ext_timestamp_stats Store and retrieve the optional extended timestamp statistics (min_live_timestamp and min_live_row_marker_timestamp) in the scylla_metadata component. Note that there is no need for a cluster feature to store those attributes since the scylla_metadata on-disk format is extensible so that old sstables can be read by new versions, seeing the extra stats is missing, and new sstables can be read by old versions that ignore unknown scylla metadata section types. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-09-10 19:05:57 +03:00
Botond Dénes	de81388edb	Merge 'commitlog: Handle oversized entries' from Calle Wilund Refs #18161 Yet another approach to dealing with large commitlog submissions. We handle an oversized single mutation by adding yet another entry type: fragmented. In this case we only add a fragment (aha) of the data that needs storing into each entry, along with metadata to correlate and reconstruct the full entry on replay. Because these fragmented entries are spread over N segments, we also need to add references from the first segment in a chain to the subsequent ones. These are released once we clear the relevant cf_id count in the base. * This approach has the downside that due to how serialization etc works w.r.t. mutations, we need to create an intermediate buffer to hold the full serialized target entry. This is then incrementally written into entries of < max_mutation_size, successively requesting more segments. On replay, when encountering a fragment chain, the fragment is added to a "state", i.e. a mapping of currently processing frag chains. Once we've found all fragments and concatenated the buffers into a single fragmented one, we can issue a replay callback as usual. Note that a replay caller will need to create and provide such a state object. Old signature replay function remains for tests and such. This approach bumps the file format (docs to come). To ensure "atomicity" we both force synchronization, and should the whole op fail, we restore segment state (rewinding), thus discarding all data we wrote. 
Closes scylladb/scylladb#19472 * github.com:scylladb/scylladb: commitlog/database: Make some commitlog options updatable + add feature listener features/config: Add feature for fragmented commitlog entries docs: Add entry on commitlog file format v4 commitlog_test: Add more oversized cases commitlog_replayer: Replay segments in order created commitlog_replayer: Use replay state to support fragmented entries commitlog_replayer: coroutinize partly commitlog: Handle oversized entries	2024-09-10 17:15:46 +03:00
Avi Kivity	c3e19425bd	Merge 'docs/dev/docker-hub.md: refresh aio-max-nr calculation' from Laszlo Ersek ~~~ What we have today in "docs/dev/docker-hub.md" on "aio-max-nr" dates back to scylla commit `f4412029f4` ("docs/docker-hub.md: add quickstart section with --smp 1", 2020-09-22). Problems with the current language: - The "65K" claim as default value on non-production systems is wrong; "fs/aio.c" in Linux initializes "aio_max_nr" to 0x10000, which is 64K. - The section in question uses equal signs (=) incorrectly. The intent was probably to say "which means the same as", but that's not what equality means. - In the same section, the relational operator "<" is bogus. The available AIO count must be at least as high (>=) as the requested AIO count. - Clearer names should be used; adjust_max_networking_aio_io_control_blocks() in "src/core/reactor.cc" sets a great example: - "reactor::max_aio" should be called "storage_iocbs", - "detect_aio_poll" should be called "preempt_iocbs", - "reactor_backend_aio::max_polls" should be called "network_iocbs". - The specific value 10000 for the last one ("network_iocbs") is not correct in scylla's context. It is correct as the Seastar default, but scylla has used 50000 since commit `2cfc517874` ("main, test: adjust number of networking iocbs", 2021-07-18). Rewrite the section to address these problems. See also: - https://github.com/scylladb/scylladb/issues/5981 - https://github.com/scylladb/seastar/pull/2396 - https://github.com/scylladb/scylladb/pull/19921 Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com> ~~~ No need for backporting; the documentation being refreshed targets developers as audience, not end-users. Closes scylladb/scylladb#20398 * github.com:scylladb/scylladb: docs/dev/docker-hub.md: refresh aio-max-nr calculation docs/dev/docker-hub.md: strip trailing whitespace	2024-09-09 15:04:38 +03:00
Laszlo Ersek	53524974db	docs/dev/maintainer.md: clarify "Updating submodule references" Before the introduction of "scripts/refresh-submodules.sh", there was indeed some manual work for the maintainer to do, hence "publish your work" must have sounded correct. Today, the phrase "publish your work" sounds confusing. Commit `71da4e6e79` ("docs: Document sync-submodules.sh script in maintainer.md", 2020-06-18) should have arguably reworded the last step of the submodule refresh procedure; let's do it now. Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com> Closes scylladb/scylladb#20333	2024-09-05 13:57:32 +03:00
Calle Wilund	9bf452c7a0	docs: Add entry on commitlog file format v4	2024-09-03 16:38:28 +00:00
Laszlo Ersek	cd0819e3ed	docs/dev/docker-hub.md: refresh aio-max-nr calculation What we have today in "docs/dev/docker-hub.md" on "aio-max-nr" dates back to scylla commit `f4412029f4` ("docs/docker-hub.md: add quickstart section with --smp 1", 2020-09-22). Problems with the current language: - The "65K" claim as default value on non-production systems is wrong; "fs/aio.c" in Linux initializes "aio_max_nr" to 0x10000, which is 64K. - The section in question uses equal signs (=) incorrectly. The intent was probably to say "which means the same as", but that's not what equality means. - In the same section, the relational operator "<" is bogus. The available AIO count must be at least as high (>=) as the requested AIO count. - Clearer names should be used; adjust_max_networking_aio_io_control_blocks() in "src/core/reactor.cc" sets a great example: - "reactor::max_aio" should be called "storage_iocbs", - "detect_aio_poll" should be called "preempt_iocbs", - "reactor_backend_aio::max_polls" should be called "network_iocbs". - The specific value 10000 for the last one ("network_iocbs") is not correct in scylla's context. It is correct as the Seastar default, but scylla has used 50000 since commit `2cfc517874` ("main, test: adjust number of networking iocbs", 2021-07-18). Rewrite the section to address these problems. See also: - https://github.com/scylladb/scylladb/issues/5981 - https://github.com/scylladb/seastar/pull/2396 - https://github.com/scylladb/scylladb/pull/19921 Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>	2024-09-03 12:10:59 +02:00
Laszlo Ersek	15738d14ce	docs/dev/docker-hub.md: strip trailing whitespace Strip trailing whitespace. Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>	2024-09-03 12:00:28 +02:00
Kefu Chai	28b5471c01	docs/dev/maintainer.md: fix formatting * in the "Backporting Seastar commits" section, there's a single quote instead of a backtick in this line, so fix it. * add backticks around `refresh-submodules.sh`, which is a filename. * correct the command line setting a git config option, because `git-config` does not support this command line syntax, ```console $ git config --global diff.conflictstyle = diff3 $ git config --global get diff.conflictstyle = $ git config --global diff.conflictstyle diff3 $ git config --global get diff.conflictstyle diff3 ``` quote from git-config(1) > ``` > git config set [<file-option>] [--type=<type>] [--all] [--value=<value>] [--fixed-value] <name> <value> > ``` * stop using the deprecated mode of the `git-config` command, and use subcommand instead. as git-config(1) puts: > git config <name> <value> [<value-pattern>] > Replaced by git config set [--value=<pattern>] <name> <value>. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#20328	2024-09-01 22:24:01 +03:00
Aleksandra Martyniuk	3d78172328	api: task_manager: add operation to get ttl	2024-08-29 13:53:39 +02:00
Pavel Emelyanov	245cc852dd	docs: Document the new backup method Add the new /storage_service/backup endpoint to object_storage.md as yet another way to use S3 from Scylla.	2024-08-22 19:47:06 +03:00
Pavel Emelyanov	8949d73cd9	docs: Update object_storage.md with AWS_ environment Commit `51c53d8db6` made it possible to configure object storage endpoint creds via environment. Mention this in the docs.	2024-08-22 14:08:21 +03:00
Pavel Emelyanov	d3f9865d2f	docs: Restructure object_storage.md Currently the doc assumes that object storage can only be used to keep sstables on it. It's going to change, restructure the doc to allow for more usage scenarios.	2024-08-22 14:08:21 +03:00
Łukasz Paszkowski	f4ca734ccb	reverse-reads.md: Drop legacy reverse format information	2024-08-13 10:07:12 +02:00
Pavel Emelyanov	f0f28cf685	docs: Extend debugging with info about exploring ELF notes When debugging coredumps some (small, but useful) information is hidden in the notes of the core ELF file. Add some words about the fact that it exists, what it includes and the thing that is always forgotten -- the way to get one Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#19962	2024-08-05 09:49:52 +03:00
Kefu Chai	35394c3f9a	docs/dev: fix a typo remove the extraneous "is". Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#19902	2024-07-30 10:46:25 +03:00
Aleksandra Martyniuk	d04159e7de	docs: describe virtual tasks	2024-07-23 13:35:02 +02:00
Avi Kivity	3fc4e23a36	forward_service: rename to mapreduce_service forward_service is nondescriptive and misnamed, as it does more than forward requests. It's a classic map/reduce algorithm (and in fact one of its parameters is "reducer"), so name it accordingly. The name "forward" leaked into the wire protocol for the messaging service RPC isolation cookie, so it's kept there. It's also maintained in the name of the logger (for "nodetool setlogginglevel") for compatibility with tests. Closes scylladb/scylladb#19444	2024-07-03 19:29:47 +03:00
Andrei Chekun	b6aabca9a7	Add documentation how to use allure reporting Add documentation how to install and basic usage example of the allure reporting tool. Fix typo test/README.md Related: scylladb/qa-tasks#1665 Depends on: scylladb/scylladb#18169 Closes scylladb/scylladb#18710	2024-07-01 16:21:50 +02:00
Avi Kivity	fdc1449392	treewide: rename flat_mutation_reader_v2 to mutation_reader flat_mutation_reader_v2 was introduced in a pair of commits in 2021: `e3309322c3` "Clone flat_mutation_reader related classes into v2 variants" `08b5773c12` "Adapt flat_mutation_reader_v2 to the new version of the API" as a replacement for flat_mutation_reader, using range_tombstone_change instead of range_tombstone to represent range tombstones. See those commits for more information. The transition was incremental; the last use of the original flat_mutation_reader was removed in 2022 in commit `026f8cc1e7` "db: Use mutation_partition_v2 in mvcc" In turn, flat_mutation_reader was introduced in 2017 in commit `748205ca75` "Introduce flat_mutation_reader" To transition from a mutation_reader that nested rows within a partition in a separate stream, to a flat reader that streamed partitions and rows in the same stream. Here, we reclaim the original name and rename the awkward flat_mutation_reader_v2 to mutation_reader. Note that mutation_fragment_v2 remains since we still use the original for compatibility, sometimes. Some notes about the transition: - files were also renamed. In one case (flat_mutation_reader_test.cc), the rename target already existed, so we rename to mutation_reader_another_test.cc. - a namespace 'mutation_reader' with two definitions existed (in mutation_reader_fwd.hh). Its contents were folded into the mutation_reader class. As a result, a few #includes had to be adjusted. Closes scylladb/scylladb#19356	2024-06-21 07:12:06 +03:00
Kefu Chai	ad649be1bf	treewide: drop thrift support thrift support was deprecated since ScyllaDB 5.2 > Thrift API - legacy ScyllaDB (and Apache Cassandra) API is > deprecated and will be removed in followup release. Thrift has > been disabled by default. so let's drop it. in this change, * thrift protocol support is dropped * all references to thrift support in document are dropped * the "thrift_version" column in system.local table is preserved for backward compatibility, as we could load from an existing system.local table which still contains this column, so we need to write this column as well. * "/storage_service/rpc_server" is only preserved for backward compatibility with java-based nodetool. * `rpc_port` and `start_rpc` options are preserved, but they are marked as "Unused". so that the new release of scylladb can consume existing scylla.yaml configurations which might contain these settings. by making them deprecated, users will get warned, and can update their configurations before we actually remove them in the next major release. Fixes #3811 Fixes #18416 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-06-07 06:44:59 +08:00
Aleksandra Martyniuk	beef77a778	docs: describe task folding	2024-05-31 10:40:04 +02:00
Aleksandra Martyniuk	8a72324ff1	docs: add docs to task manager Closes scylladb/scylladb#18967	2024-05-30 09:05:02 +03:00
Pavel Emelyanov	e74a4b038f	Merge 'tablets: alter keyspace' from Piotr Smaron This change supports changing replication factor in tablets-enabled keyspaces. This covers both increasing and decreasing the number of tablets replicas through first building topology mutations (`alter_keyspace_statement.cc`) and then tablets/topology/schema mutations (`topology_coordinator.cc`). For the limitations of the current solution, please see the docs changes attached to this PR. Fixes: #16129 Closes scylladb/scylladb#16723 * github.com:scylladb/scylladb: test: Do not check tablets mutations on nodes that don't have them test: Fix the way tablets RF-change test parses mutation_fragments test/tablets: Unmark RF-changing test with xfail docs: document ALTER KEYSPACE with tablets Return response only when tablets are reallocated cql-pytest: Verify RF is changes by at most 1 when tablets on cql3/alter_keyspace_statement: Do not allow for change of RF by more than 1 Reject ALTER with 'replication_factor' tag Implement ALTER tablets KEYSPACE statement support Parameterize migration_manager::announce by type to allow executing different raft commands Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks Extend system.topology with 3 new columns to store data required to process alter ks global topo req Allow query_processor to check if global topo queue is empty Introduce new global topo `keyspace_rf_change` req New raft cmd for both schema & topo changes Add storage service to query processor tablets: tests for adding/removing replicas tablet_allocator: make load_balancer_stats_manager configurable by name	2024-05-29 14:17:51 +03:00
Piotr Smaron	59d3fd615f	Extend system.topology with 3 new columns to store data required to process alter ks global topo req Because ALTER KS will result in creating a global topo req, we'll have to pass the req data to topology coordinator's state machine, and the easiest way to do it is through system.topology table, which is going to be extended with 3 extra columns carrying all the data required to execute ALTER KS from within topology coordinator.	2024-05-28 13:55:11 +02:00
Kefu Chai	61b5bfae6d	docs: fix typos in dev documents these typos were identified by codespell. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18871	2024-05-27 12:28:34 +03:00
Marcin Maliszkiewicz	2ab143fb40	db: auth: move auth tables to system keyspace Separate keyspace which also behaves as system brings little benefit while creating some compatibility problems like schema digest mismatch during rollback. So we decided to move auth tables into system keyspace. Fixes https://github.com/scylladb/scylladb/issues/18098 Closes scylladb/scylladb#18769	2024-05-26 22:30:42 +03:00
Nadav Har'El	dcd26d8a16	Merge 'docs: update isolation.md' from Botond Dénes Update `docs/dev/isolation.md`: * Update the list of scheduling groups * Remove IO priority groups (they were folded into scheduling groups) * Add section on RPC isolation Closes scylladb/scylladb#18749 * github.com:scylladb/scylladb: docs: isolation.md: add section on RPC call isolation docs: isolation.md: remove mention of IO priority groups docs: isolation.md: update scheduling group list, add aliases	2024-05-21 11:46:57 +03:00
Botond Dénes	11fa79a537	docs: isolation.md: add section on RPC call isolation	2024-05-21 03:12:22 -04:00
Pavel Emelyanov	159e44d08a	test.py: Make it possible to avoid wildcard test names matching There's a nasty scenario when this searching plays a bad joke. When CI picks up a new branch and notices that a test had changed, it spawns a custom job with test.py --repeat 100 $changed_test_name in it. Next, when the test.py tries opt-in test name matching, it uses the wildcard search and can pick up extra unwanted tests into the run. To solve this, the case-selection syntax is extended. Now if the caller specifies `suite/test::` as test, the test file is selected by exact name match, but the specific test-case is not selected, the `` makes it run all cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18704	2024-05-20 15:50:47 +02:00
Botond Dénes	936a7e282b	docs: isolation.md: remove mention of IO priority groups They were folded into CPU scheduling groups, which now apply to both CPU and IO.	2024-05-20 03:33:24 -04:00
Botond Dénes	8f61468322	docs: isolation.md: update scheduling group list, add aliases	2024-05-20 03:30:04 -04:00
Tomasz Grabiec	eb3a22d5a8	docs: Document tablet sharding vs tablet replica placement	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	10a4903d0c	dht: Deprecate old sharder API: shard_of/next_shard/token_for_next_shard Require users to specify whether we want shard for reads or for writes by switching to appropriate non-deprecated variant. For example, shard_of() can be replaced with shard_for_reads() or shard_for_writes(). The next_shard/token_for_next_shard APIs have only for-reads variant, and the act of switching will be a testimony to the fact that the code is valid for intra-node migration.	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	6946ad2a45	sharding: Prepare for intra-node-migration Tablet sharder is adjusted to handle intra-migration where a tablet can have two replicas on the same host. For reads, sharder uses the read selector to resolve the conflict. For writes, the write selector is used. The old shard_of() API is kept to represent shard for reads, and new method is introduced to query the shards for writing: shard_for_writes(). All writers should be switched to that API, which is not done in this patch yet. The request handler on replica side acts as a second-level coordinator, using sharder to determine routing to shards. A given sharder has a scope of a single topology version, a single effective_replication_map_ptr, which should be kept alive during writes.	2024-05-16 00:28:46 +02:00
Tomasz Grabiec	b5bb46357b	docs: Document sharder use for tablets	2024-05-16 00:28:46 +02:00
Tomasz Grabiec	82b34d34d8	tablets: Introduce tablet transition kind for intra-node migration We need a separate transition kind for intra node migration so that we don't have to recover this information from replica set in an expensive way. This information is needed in the hot path - in effective_replication_map, to not return the pending tablet replica to the coordinator. From its perspective, replica set is not transitional. The transition will also be used to alter the behavior of the sharder. When not in intra-node migration, the sharder should advertise the shard which is either in the previous or next replica set. During intra-node migration, that's not possible as there may be two such shards. So it will return the shard according to the current read selector.	2024-05-16 00:28:46 +02:00
Aleksandra Martyniuk	561fb1dd09	service: move to cleanup stage if allow_write_both_read_old fails If allow_write_both_read_old tablet transition stage fails, move to cleanup_target stage before reverting migration. It's a preparation for further patches which deallocate storage group of a tablet during cleanup.	2024-05-10 14:56:38 +02:00
Yaniv Michael Kaul	124064844f	docs/dev/object_storage.md: convert example AWS keys to be more innocent Someone thought that they actually represent real keys (the 'EXAMPLE' in their name was not enough). Converted them to be, as clearly as can be, example data. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#18565	2024-05-09 08:26:43 +03:00
Botond Dénes	96a7ed7efb	Merge 'sstables: add dead row count when issuing warning to system.large_partitions' from Ferenc Szili This is the second half of the fix for issue #13968. The first half is already merged with PR #18346 Scylla issues warnings for partitions containing more rows than a configured threshold. The warning is issued by inserting a row into the `system.large_partitions` table. This row contains the information about the partition for which the warning is issued: keyspace, table, sstable, partition key and size, compaction time and the number of rows in the partition. A previous PR #18346 also added range tombstone count to this row. This change adds a new counter for dead rows to the large_partitions table. This change also adds cluster feature protection for writing into these new counters. This is needed in case a cluster is in the process of being upgraded to this new version, after which an upgraded node writes data with the new schema into `system.large_partitions`, and finally a node is then rolled back to an old version. This node will then revert the schema to the old version, but the written sstables will still contain data with the new counters, causing any readers of this table to throw errors when they encounter these cells. This is an enhancement, and backporting is not needed. Fixes #13968 Closes scylladb/scylladb#18458 * github.com:scylladb/scylladb: sstable: added test for counting dead rows sstable: added docs for system.large_partitions.dead_rows sstable: added cluster feature for dead rows and range tombstones sstable: write dead_rows count to system.large_partitions sstable: added counter for dead rows	2024-05-09 08:26:43 +03:00
Ferenc Szili	8e9771d010	sstable: added docs for system.large_partitions.dead_rows	2024-05-07 15:44:33 +02:00
Piotr Dulikowski	64ba620dc2	Merge 'hinted handoff: Use host IDs instead of IPs in the module' from Dawid Mędrek This pull request introduces host ID in the Hinted Handoff module. Nodes are now identified by their host IDs instead of their IPs. The conversion occurs on the boundary between the module and `storage_proxy.hh`, but aside from that, IPs have been erased. The changes take into considerations that there might still be old hints, still identified by IPs, on disk – at start-up, we map them to host IDs if it's possible so that they're not lost. Refs scylladb/scylladb#6403 Fixes scylladb/scylladb#12278 Closes scylladb/scylladb#15567 * github.com:scylladb/scylladb: docs: Update Hinted Handoff documentation db/hints: Add endpoint_downtime_not_bigger_than() db/hints: Migrate hinted handoff when cluster feature is enabled db/hints: Handle arbitrary directories in resource manager db/hints: Start using hint_directory_manager db/hints: Enforce providing IP in get_ep_manager() db/hints: Introduce hint_directory_manager db/hints/resource_manager: Update function description db/hints: Coroutinize space_watchdog::scan_one_ep_dir() db/hints: Expose update lock of space watchdog db/hints: Add function for migrating hint directories to host ID db/hints: Take both IP and host ID when storing hints db/hints: Prepare initializing endpoint managers for migrating from IP to host ID db/hints: Migrate to locator::host_id db/hints: Remove noexcept in do_send_one_mutation() service: Add locator::host_id to on_leave_cluster service: Fix indentation db/hints: Fix indentation	2024-05-06 09:58:18 +02:00
Raphael S. Carvalho	62b1cfa89c	topology_coordinator: Fix synchronization of tablet split with other concurrent ops Finalization of tablet split was only synchronizing with migrations, but that's not enough as we want to make sure that all processes like repair completes first as they might hold erm and therefore will be working with a "stale" version of token metadata. For synchronization to work properly, handling of tablet split finalize will now take over the state machine, when possible, and execute a global token metadata barrier to guarantee that update in topology by split won't cause problems. Repair for example could be writing a sstable with stale metadata, and therefore, could generate a sstable that spans multiple tablets. We don't want that to happen, therefore we need the barrier. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#18380	2024-04-30 19:23:28 +02:00
Dawid Medrek	bf802e99eb	docs: Update Hinted Handoff documentation We briefly explain the process of migration of Hinted Handoff to host IDs, the rationale for it, consequences, and possible side effects.	2024-04-28 01:22:59 +02:00
Avi Kivity	c2b8ca7d71	Merge 'cql3: statements: change default tombstone_gc mode for tablets' from Aleksandra Martyniuk Repair may miss some tablets that migrated across nodes. So if tombstones expire after some timeout, then we can have data resurrection. Set default tombstone_gc mode to "repair" for tables which use tablets (if repair is required). Fixes: #16627. Closes scylladb/scylladb#18013 * github.com:scylladb/scylladb: test: check default value of tombstone_gc test: topology: move some functions to util.py cql3: statements: change default tombstone_gc mode for tablets	2024-04-25 19:18:37 +03:00
Aleksandra Martyniuk	58f72f9019	cql3: statements: change default tombstone_gc mode for tablets Currently, if tombstone_gc mode isn't specified for a table, then "timeout" is used by default. With tablets, running "nodetool repair -pr" may miss a tablet if it migrated across the nodes. Then, if we expire tombstones for ranges that weren't repaired, we may get data resurrection. Set default tombstone_gc mode value for DDLs that don't specify it. It's set to "repair" for tables which use tablets unless they use local replication strategy or rf = 1. Otherwise it's set to "timeout".	2024-04-24 10:42:10 +02:00
Ferenc Szili	c528597a84	sstables: add docs changes for system.large_partitions This commit updates the documentation changes for the new column range_tombstones in system.large_partitions	2024-04-22 15:25:41 +02:00
Anna Stuchlik	a3481a4566	doc: document the system_auth_v2 feature This commit includes updates related to replacing system_auth with system_auth_v2. - The keyspace name system_auth is renamed to system_auth_v2. - The procedures are updated to account for system_auth_v2. - No longer required system_auth RF changes are removed from procedures. - The information is added that if the consistent topology updates feature was not enabled upon upgrade from 5.4, there are limitations or additional steps to do (depending on the procedure). The files with that kind of information are to be found in _common folders and included as needed. - The upgrade guide has been updated to reflect system_auth_v2 and related impacts. Closes scylladb/scylladb#18077	2024-04-18 18:33:49 +02:00
Botond Dénes	2cb5dcabf7	docs/dev/maintainer.md: document another exceptions to rule no.0 Maintainers are also allowed to commit their own backport PR. They are allowed to backport their own code, opening a PR to get a CI run for a backport doesn't change this. Closes scylladb/scylladb#17727	2024-04-03 09:51:19 +03:00
Kefu Chai	d1e8d89ae2	doc: topology-over-raft: add transition_state to node state diagram in order to help the developers to understand the transitions of `node_state` and the `transition_state` on each of the `node_state`, in this change, the nested state machine diagram is added to the node state diagram. please note, instead of trying to merge similar states like bootstrapping and replacing into a single state, we keep them as separate ones, and replicate the nested state machine diagram in them as well, to be more clear. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18025	2024-03-27 12:16:35 +01:00
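
The timestamp selection described in merge commit c7c5817808 above can be sketched roughly as follows. This is only an illustration of the rule stated in the commit message, not ScyllaDB's implementation: the class `TimestampStats` and the functions `purge_threshold` and `can_purge` are hypothetical names, timestamps are plain integers, and other purge conditions (such as the GC grace period) are deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TimestampStats:
    """Timestamp statistics of one sstable or memtable (hypothetical model)."""
    min_timestamp: int                                   # legacy, inaccurate
    min_live_timestamp: Optional[int] = None             # extended stat
    min_live_row_marker_timestamp: Optional[int] = None  # extended stat


def purge_threshold(stats: TimestampStats, shadowable: bool) -> int:
    """Pick the timestamp a tombstone must be older than to be purgeable."""
    # Shadowable tombstones only matter for rows whose row marker is
    # older than or equal to the tombstone, so prefer the row-marker stat.
    if shadowable and stats.min_live_row_marker_timestamp is not None:
        return stats.min_live_row_marker_timestamp
    # Dead data needs no protection, so the oldest *live* timestamp is enough.
    if stats.min_live_timestamp is not None:
        return stats.min_live_timestamp
    # Extended statistics missing: fall back to the legacy value.
    return stats.min_timestamp


def can_purge(tombstone_ts: int, sources: Iterable[TimestampStats],
              shadowable: bool = False) -> bool:
    """A tombstone is purgeable only if every source passes the check."""
    return all(tombstone_ts < purge_threshold(s, shadowable) for s in sources)
```

With statistics of `min_timestamp=5`, `min_live_timestamp=20` and `min_live_row_marker_timestamp=30`, a regular tombstone at time 25 is kept, while a shadowable tombstone at time 25 may be purged.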


182 Commits
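
The default `tombstone_gc` mode rule from commit 58f72f9019 in the log above can be restated as a small function. This is a minimal sketch of the rule as worded in the commit message; the function name, its parameters, and the use of the string "LocalStrategy" to denote the local replication strategy are assumptions made for illustration and do not mirror the actual C++ code.

```python
def default_tombstone_gc_mode(uses_tablets: bool,
                              replication_strategy: str,
                              rf: int) -> str:
    """Return the tombstone_gc mode used when a DDL does not specify one."""
    # Repair is not required for the local replication strategy or rf = 1
    # (assumption: the strategy is identified by the name "LocalStrategy").
    repair_required = replication_strategy != "LocalStrategy" and rf > 1
    # Tablet-based tables that require repair default to "repair";
    # everything else keeps the previous default, "timeout".
    if uses_tablets and repair_required:
        return "repair"
    return "timeout"
```

For example, a tablets-enabled table with a replication factor of 3 defaults to "repair", while a vnode-based table keeps "timeout".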