This PR implements a procedure that upgrades existing clusters to use
raft-based topology operations. The procedure does not start
automatically, it must be triggered manually by the administrator after
making sure that no topology operations are currently running.
Upgrade is triggered by sending `POST
/storage_service/raft_topology/upgrade` request. This causes the
topology coordinator to start who drives the rest of the process: it
builds the `system.topology` state based on information observed in
gossip and tells all nodes to switch to raft mode. Then, topology
coordinator runs normally.
Upgrade progress is tracked in a new static column `upgrade_state` in
`system.topology`.
The procedure also serves as an extension to the current recovery
procedure on raft. The current recovery procedure requires restarting
nodes in a special mode which disables raft, perform `nodetool
removenode` on the dead nodes, clean up some state on the nodes and
restart them so that they automatically rebuild the group 0. Raft
topology fits into existing procedure by falling back to legacy topology
operations after disabling raft. After rebuilding the group 0, upgrade
needs to be triggered again.
Because upgrade is manual and it might not be convenient for
administrators to run it right after upgrading the cluster, we allow the
cluster to operate in legacy topology operations mode until upgrade,
which includes allowing new nodes to join. In order to allow it, nodes
now ask the cluster about the mode they should use to join before
proceeding by using a new `JOIN_NODE_QUERY` RPC.
The procedure is explained in more detail in `topology-over-raft.md`.
Fixes: https://github.com/scylladb/scylladb/issues/15008Closesscylladb/scylladb#17077
* github.com:scylladb/scylladb:
test/topology_custom: upgrade/recovery tests for topology on raft
cdc/generation_service: in legacy mode, fall back to raft tables
system_keyspace: add read_cdc_generation_opt
cdc/generation_service: turn off gossip notifications in raft topo mode
cql_test_env: move raft_topology_change_enabled var earlier
group0_state_machine: pull snapshot after raft topology feature enabled
storage_service: disable persistent feature enabler on upgrade
storage_service: replicate raft features to system.peers
storage_service: gossip tokens and cdc generation in raft topology mode
API: add api for triggering and monitoring topology-on-raft upgrade
storage_service: infer which topology operations to use on startup
storage_service: set the topology kind value based on group 0 state
raft_group0: expose link to the upgrade doc in the header
feature_service: fall back to checking legacy features on startup
storage_service: add fiber for tracking the topology upgrade progress
gms: feature_service: add SUPPORTS_CONSISTENT_TOPOLOGY_CHANGES
topology_coordinator: implement core upgrade logic
topology_coordinator: extract top-level error handling logic
storage_service: initialize discovery leader's state earlier
topology_coordinator: allow for custom sharding info in prepare_and_broadcast_cdc_generation_data
topology_coordinator: allow for custom sharding info in prepare_new_cdc_generation_data
topology_coordinator: remove outdated fixme in prepare_new_cdc_generation_data
topology_state_machine: introduce upgrade_state
storage_service: disallow topology ops when upgrade is in progress
raft_group0_client: add in_recovery method
storage_service: introduce join_node_query verb
raft_group0: make discover_group0 public
raft_group0: filter current node's IP in discover_group0
raft_group0: remove my_id arg from discover_group0
storage_service: make _raft_topology_change_enabled more advanced
docs: document raft topology upgrade and recovery
In one of the previous patches, we changed the `rollback_to_normal`
state from a node state to a transition state. We document it
in this patch. The node state wasn't documented, so there is
nothing to remove.
The motivation for tablet resizing is that we want to keep the average tablet size reasonable, such that load rebalancing can remain efficient. Too large tablet makes migration inefficient, therefore slowing down the balancer.
If the avg size grows beyond the upper bound (split threshold), then balancer decides to split. Split spans all tablets of a table, due to power-of-two constraint.
Likewise, if the avg size decreases below the lower bound (merge threshold), then merge takes place in order to grow the avg size. Merge is not implemented yet, although this series lays foundation for it to be impĺemented later on.
A resize decision can be revoked if the avg size changes and the decision is no longer needed. For example, let's say table is being split and avg size drops below the target size (which is 50% of split threshold and 100% of merge one). That means after split, the avg size would drop below the merge threshold, causing a merge after split, which is wasteful, so it's better to just cancel the split.
Tablet metadata gains 2 new fields for managing this:
resize_type: resize decision type, can be either of "merge", "split", or "none".
resize_seq_number: a sequence number that works as the global identifier of the decision (monotonically increasing, increased by 1 on every new decision emitted by the coordinator).
A new RPC was implemented to pull stats from each table replica, such that load balancer can calculate the avg tablet size and know the "split status", for a given table. Avg size is aggregated carefully while taking RF of each DC into account (which might differ).
When a table is done splitting its storage, it loads (mirror) the resize_seq_number from tablet metadata into its local state (in another words, my split status is ready). If a table is split ready, coordinator will see that table's seq number is the same as the one in tablet metadata. Helps to distinguish stale decisions from the latest one (in case decisions are revoked and re-emited later on). Also, it's aggregated carefully, by taking the minimum among all replicas, so coordinator will only update topology when all replicas are ready.
When load balancer emits split decision, replicas will listen to need to split with a "split monitor" that is awakened once a table has replication metadata updated and detects the need for split (i.e. resize_type field is "split").
The split monitor will start splitting of compaction groups (using mechanism introduced here: 081f30d149) for the table. And once splitting work is completed, the table updates its local state as having completed split.
When coordinator pulls the split status of all replicas for a table via RPC, the balancer can see whether that table is ready for "finalizing" the decision, which is about updating tablet metadata to split each tablet into two. Once table replicas have their replication metadata updated with the new tablet count, they can update appropriately their set of compaction groups (that were previously split in the preparation step).
Fixes#16536.
Closesscylladb/scylladb#16580
* github.com:scylladb/scylladb:
test/topology_experimental_raft: Add tablet split test
replica: Bypass reshape on boot with tablets temporarily
replica: Fix table::compaction_group_for_sstable() for tablet streaming
test/topology_experimental_raft: Disable load balancer in test fencing
replica: Remap compaction groups when tablet split is finalized
service: Split tablet map when split request is finalized
replica: Update table split status if completed split compaction work
storage_service: Implement split monitor
topology_cordinator: Generate updates for resize decisions made by balancer
load_balancer: Introduce metrics for resize decisions
db: Make target tablet size a live-updateable config option
load_balancer: Implement resize decisions
service: Wire table_resize_plan into migration_plan
service: Introduce table_resize_plan
tablet_mutation_builder: Add set_resize_decision()
topology_coordinator: Wire load stats into load balancer
storage_service: Allow tablet split and migration to happen concurrently
topology_coordinator: Periodically retrieve table_load_stats
locator: Introduce topology::get_datacenter_nodes()
storage_service: Implement table_load_stats RPC
replica: Expose table_load_stats in table
replica: Introduce storage_group::live_disk_space_used()
locator: Introduce table_load_stats
tablets: Add resize decision metadata to tablet metadata
locator: Introduce resize_decision
In one of the previous patches, we changed the `left_token_ring`
state from a node state to a transition state. We document it
in this patch. The node state wasn't documented, so there is
nothing to remove.
This implements the ability in load balancer to emit split or merge
requests, cancel ongoing ones if they're no longer needed, and
also finalize those that are ready for the topology changes.
That's all based on average tablet size, collected by coordinator
from all nodes, and split and merge thresholds.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The new metadata describes the ongoing resize operation (can be either
of merge, split or none) that spans tablets of a given table.
That's managed by group0, so down nodes will be able to see the
decision when they come back up and see the changes to the
metadata.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
New tablet replicas are allocated and rebuilt synchronously with node
operations. They are safely rebuilt from all existing replicas.
The list of ignored nodes passed to node operations is respected.
Tablet scheduler is responsible for scheduling tablet rebuilding transition which
changes the replicas set. The infrastructure for handling decommission
in tablet scheduler is reused for this.
Scheduling is done incrementally, respecting per-shard load
limits. Rebuilding transitions are recognized by load calculation to
affect all tablet replicas.
New kind of tablet transition is introduced called "rebuild" which
adds new tablet replica and rebuilds it from existing replicas. Other
than that, the transition goes through the same stages as regular
migration to ensure safe synchronization with request coordinators.
In this PR we simply stream from all tablet replicas. Later we should
switch to calling repair to avoid sending excessive amounts of data.
Fixes https://github.com/scylladb/scylladb/issues/16690.
Closesscylladb/scylladb#16894
* github.com:scylladb/scylladb:
tests: tablets: Add tests for removenode and replace
tablets: Add support for removenode and replace handling
topology_coordinator: tablets: Do not fail in a tight loop
topology_coordinator: tablets: Avoid warnings about ignored failured future
storage_service, topology: Track excluded state in locator::topology
raft topology: Introduce param-less topology::get_excluded_nodes()
raft topology: Move get_excluded_nodes() to topology
tablets: load_balancer: Generalize load tracking
tablets: Introduce get_migration_streaming_info() which works on migration request
tablets: Move migration_to_transition_info() to tablets.hh
tablets: Extract get_new_replicas() which works on migraiton request
tablets: Move tablet_migration_info to tablets.hh
tablets: Store transition kind per tablet
Add `docs/dev/code-coverage.md` with explanations about how to work with
the different tools added for coverage reporting and cli options added
to `configure.py` and `test.py`
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Tablets metadata is quite expensive to generate (each data_value is
an allocation), so an old driver (without support for tablets) will
generate huge amounts of such notifications. This commit adds a way
to negotiate generation of the notification: a new driver will ask
for them, and an old driver won't get them. It uses the
OPTIONS/SUPPORTED/STARTUP protocol described in native_protocol_v4.spec.
Closesscylladb/scylladb#16611
Previously, the tablet information was sent to the drivers
in two pieces within the custom_payload. We had information
about the replicas under the `tablet_replicas` key and token range
information under `token_range`. These names were quite generic
and might have caused problems for other custom_payload users.
Additionally, dividing the information into two pieces raised
the question of what to do if one key is present while the other
is missing.
This commit changes the serialization mechanism to pack all information
under one specific name, `tablets-routing-v1`.
From: Sylwia Szunejko <sylwia.szunejko@scylladb.com>
Closesscylladb/scylladb#16148
When performing a schema change through group 0, extend the schema mutations with a version that's persisted and then used by the nodes in the cluster in place of the old schema digest, which becomes horribly slow as we perform more and more schema changes (#7620).
If the change is a table create or alter, also extend the mutations with a version for this table to be used for `schema::version()`s instead of having each node calculate a hash which is susceptible to bugs (#13957).
When performing a schema change in Raft RECOVERY mode we also extend schema mutations which forces nodes to revert to the old way of calculating schema versions when necessary.
We can only introduce these extensions if all of the cluster understands them, so protect this code by a new cluster/schema feature, `GROUP0_SCHEMA_VERSIONING`.
Fixes: #7620Fixes: #13957
---
This is a reincarnation of PR scylladb/scylladb#15331. The previous PR was reverted due to a bug it unmasked; the bug has now been fixed (scylladb/scylladb#16139). Some refactors from the previous PR were already merged separately, so this one is a bit smaller.
I have checked with @Lorak-mmk's reproducer (https://github.com/Lorak-mmk/udt_schema_change_reproducer -- many thanks for it!) that the originally exposed bug is no longer reproducing on this PR, and that it can still be reproduced if I revert the aforementioned fix on top of this PR.
Closesscylladb/scylladb#16242
* github.com:scylladb/scylladb:
docs: describe group 0 schema versioning in raft docs
test: add test for group 0 schema versioning
feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode
schema_tables: don't delete `version` cell from `scylla_tables` mutations from group 0
migration_manager: add `committed_by_group0` flag to `system.scylla_tables` mutations
schema_tables: use schema version from group 0 if present
migration_manager: store `group0_schema_version` in `scylla_local` during schema changes
system_keyspace: make `get/set_scylla_local_param` public
feature_service: add `GROUP0_SCHEMA_VERSIONING` feature
Tablet streaming involves asynchronous RPCs to other replicas which transfer writes. We want side-effects from streaming only within the migration stage in which the streaming was started. This is currently not guaranteed on failure. When streaming master fails (e.g. due to RPC failing), it can be that some streaming work is still alive somewhere (e.g. RPC on wire) and will have side-effects at some point later.
This PR implements tracking of all operations involved in streaming which may have side-effects, which allows the topology change coordinator to fence them and wait for them to complete if they were already admitted.
The tracking and fencing is implemented by using global "sessions", created for streaming of a single tablet. Session is globally identified by UUID. The identifier is assigned by the topology change coordinator, and stored in system.tablets. Sessions are created and closed based on group0 state (tablet metadata) by the barrier command sent to each replica, which we already do on transitions between stages. Also, each barrier waits for sessions which have been closed to be drained.
The barrier is blocked only if there is some session with work which was left behind by unsuccessful streaming. In which case it should not be blocked for long, because streaming process checks often if the guard was left behind and stops if it was.
This mechanism of tracking is fault-tolerant: session id is stored in group0, so coordinator can make progress on failover. The barriers guarantee that session exists on all replicas, and that it will be closed on all replicas.
Closesscylladb/scylladb#15847
* github.com:scylladb/scylladb:
test: tablets: Add test for failed streaming being fenced away
error_injection: Introduce poll_for_message()
error_injection: Make is_enabled() public
api: Add API to kill connection to a particular host
range_streamer: Do not block topology change barriers around streaming
range_streamer, tablets: Do not keep token metadata around streaming
tablets: Fail gracefully when migrating tablet has no pending replica
storage_service, api: Add API to disable tablet balancing
storage_service, api: Add API to migrate a tablet
storage_service, raft topology: Run streaming under session topology guard
storage_service, tablets: Use session to guard tablet streaming
tablets: Add per-tablet session id field to tablet metadata
service: range_streamer: Propagate topology_guard to receivers
streaming: Always close the rpc::sink
storage_service: Introduce concept of a topology_guard
storage_service: Introduce session concept
tablets: Fix topology_metadata_guard holding on to the old erm
docs: Document the topology_guard mechanism
Prototype implementation of format suggested/requested by @avikivity:
Divides segments into disk-write-alignment sized pages, each tagged with segment ID + CRC of data content.
When read, we both verify sector integrity (CRC) to detect corruption, as well as matching ID read with expected one.
If the latter mismatches we have a prematurely terminated segment (read truncation), which, depending on whether the CL is
written in batch or periodic mode, as well as explicit sync, can mean data loss.
Note: all-zero pages are treated as kosher, both to align with newly allocated segments, as well as fully terminated (zero-page) ones.
Note: This is a preview/RFC - the rest of the file format is not modified. At least parts of entry CRC could probably be removed, but I have not done so yet (needs some thinking).
Note: Some slight abstraction breaks in impl. and probably less than maximal efficiency.
v2:
* Removed entry CRC:s in file format.
* Added docs on format v3
* Added one more test for recycling-truncation
v3:
* Fixed typos in size calc and docs
* Changed sect metadata order
* Explicit iter type
Closesscylladb/scylladb#15494
* github.com:scylladb/scylladb:
commitlog_test: Add test for replaying large-ish mutation
commitlog_test: Add additional test for segmnent truncation
docs: Add docs on commitlog format 3
commitlog: Remove entry CRC from file format
commitlog: Implement new format using CRC:ed sectors
commitlog: Add iterator adaptor for doing buffer splitting into sub-page ranges
fragmented_temporary_buffer: Add const iterator access to underlying buffers
commitlog_replayer: differentiate between truncated file and corrupt entries
This patch series adds error handling for streaming failure during
topology operations instead of an infinite retry. If streaming fails the
operation is rolled back: bootstrap/replace nodes move to left and
decommissioned/remove nodes move back to normal state.
* 'gleb/streaming-failure-rollback-v4' of github.com:scylladb/scylla-dev:
raft: make sure that all operation forwarded to a leader are completed before destroying raft server
storage_service: raft topology: remove code duplication from global_tablet_token_metadata_barrier
tests: add tests for streaming failure in bootstrap/replace/remove/decomission
test/pylib: do not stop node if decommission failed with an expected error
storage_service: raft topology: fix typo in "decommission" everywhere
storage_service: raft topology: add streaming error injection
storage_service: raft topology: do not increase topology version during CDC repair
storage_service: raft topology: rollback topology operation on streaming failure.
storage_service: raft topology: load request parameters in left_token_ring state as well
storage_service: raft topology: do not report term_changed_error during global_token_metadata_barrier as an error
storage_service: raft topology: change global_token_metadata_barrier error handling to try/catch
storage_service: raft topology: make global_token_metadata_barrier node independent
storage_service: raft topology: split get_excluded_nodes from exec_global_command
storage_service: raft topology: drop unused include_local and do_retake parameters from exec_global_command which are always true
storage_service: raft topology: simplify streaming RPC failure handling
- s/aws_key/aws_access_key_id/
- s/aws_secret/aws_secret_access_key/
- s/aws_token/aws_session_token/
rename them to more popular names, these names are also used by
boto's API. this should improve the readability and consistency.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
in hope to lower the bar to testing object store.
* add language specifier for better readability of the document.
to highlight the config with YAML syntax
* add more specific comment on the AWS related settings
* explain that endpoint should match in the CREATE KEYSPACE
statement and the one defined by the YAML configuration.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15433
Load balancer will recognize decommissioning nodes and will
move tablet replicas away from such nodes with highest priority.
Topology changes have now an extra step called "tablet draining" which
calls the load balancer. The step will execute tablet migration track
as long as there are nodes which require draining. It will not do regular
load balancing.
If load balancer is unable to find new tablet replicas, because RF
cannot be met or availability is at risk due to insufficient node
distribution in racks, it will throw an exception. Currently, topology
change will retry in a loop. We should make this error cause topology
change to be aborted. There is no infrastructure for
aborts yet, so this is not implemented.
Closes#15197
* github.com:scylladb/scylladb:
tablets, raft topology: Add support for decommission with tablets
tablet_allocator: Compute load sketch lazily
tablet_allocator: Set node id correctly
tablet_allocator: Make migration_plan a class
tablets: Implement cleanup step
storage_service, tablets: Prevent stale RPCs from running beyond their stage
locator: Introduce tablet_metadata_guard
locator, replica: Add a way to wait for table's effective_replication_map change
storage_service, tablets: Extract do_tablet_operation() from stream_tablet()
raft topology: Add break in the final case clause
raft topology: Fix SIGSEGV when trace-level logging is enabled
raft topology: Set node state in topology
raft topology: Always set host id in topology
Load balancer will recognize decommissioning nodes and will
move tablet replicas away from such nodes with highest priority.
Topology changes have now an extra step called "tablet draining" which
calls the load balancer. The step will execute tablet migration track
as long as there are nodes which require draining. It will not do regular
load balancing.
If load balancer is unable to find new tablet replicas, because RF
cannot be met or availability is at risk due to insufficient node
distribution in racks, it will throw an exception. Currently, topology
change will retry in a loop. We should make this error cause topology
change to be paused so that admin becomes aware of the problem and
issues an abort on the topology change. There is no infrastructure for
aborts yet, so this is not implemented.
we get the path object storage config like:
```c++
db::config::get_conf_sub("object_storage.yaml").native()
```
so, the default path should be $SCYLLA_CONF/object_storage.yaml.
in this change, it is corrected.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15406
We change the type of IDs in CDC_GENERATIONS_V3 to timeuuid to
give them a time-based order. We also change how we initialize
them so that the new CDC generation always has the highest ID.
This is the last step to enabling the efficient clearing of
obsolete CDC generation data.
Additionally, we change the types of current_cdc_generation_uuid,
new_cdc_generation_data_uuid and the second values of the elements
in unpublished_cdc_generations to timeuuid, so that they match id
in CDC_GENERATIONS_V3.
After moving the creation of uuid out of
make_new_generation_description, this function only calls the
topology_description_generator's constructor and its generate
method. We could remove this function, but we instead simplify
the code by removing the topology_description_generator class.
We can do this refactor because make_new_generation_description
is the only place using it. We inline its generate method into
make_new_generation_description and turn its private methods into
static functions.
We make CDC_GENERATIONS_V3 single-partition by adding the key
column and changing the clustering key from range_end to
(id, range_end). This is the first step to enabling the efficient
clearing of obsolete CDC generation data, which we need to prevent
Raft-topology snapshots from endlessly growing as we introduce new
generations over time. The next step is to change the type of the id
column to timeuuid. We do it in the following commits.
After making CDC_GENERATIONS_V3 single-partition, there is no easy
way of preserving the num_ranges column. As it is used only for
sanity checking, we remove it to simplify the implementation.
In the following commits, we replace the
topology::transition_state::publish_cdc_generation state with
a background fiber that continually publishes committed CDC
generations. To make these generations accessible to the
topology coordinator, we store them in the new column of
system.topology -- unpublished_cdc_generations.
Add a document describing in detail how to use clang's "-ftime-trace"
option, and the ClangBuildAnalyzer tool, to find the source files,
header files and templates which slow down Scylla's build the most.
I've used this tool in the past to reduce Scylla build time - see
commits:
fa7a302130 (reduced 6.5%)
f84094320d (reduced 0.1%)
6ebf32f4d7 (reduced 1%)
d01e1a774b (reduced 4%)
I'm hoping that documenting how to use this tool will allow other
developers to suggest similar commits.
Refs #1.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#15209
We add support for `--ignore-dead-nodes` in `raft_removenode` and
`--ignore-dead-nodes-for-replace` in `raft_replace`. For now, we allow
passing only host ids of the ignored nodes. Supporting IPs is currently
impossible because `raft_address_map` doesn't provide a mapping from IP
to a host id.
The main steps of the implementation are as follows:
- add the `ignore_nodes` column to `system.topology`,
- set the `ignore_nodes` value of the topology mutation in `raft_removenode` and `raft_replace`,
- extend `service::request_param` with alternative types that allow storing a set of ids of the ignored nodes,
- load `ignore_nodes` from `system.topology` into `request_param` in `system_keyspace::load_topology_state`,
- add `ignore_nodes` to `exclude_nodes` in `topology_coordinator::exec_global_command`,
- pass `ignore_nodes` to `replace_with_repair` and `remove_with_repair` in `storage_service::raft_topology_cmd_handler`.
Additionally, we add `test_raft_ignore_nodes.py` with two tests that verify the added changes.
Fixes#15025Closes#15113
* github.com:scylladb/scylladb:
test: add test_raft_ignore_nodes
test: ManagerClient.remove_node: allow List[HostId] for ignore_dead
raft topology: pass ignore_nodes to {replace, remove}_with_repair
raft topology: exec_global_command: add ignore_nodes to exclude_nodes
raft topology: exec_global_command: change type of exclude_nodes
topology_state_machine: extend request_param with a set of raft ids
raft topology: set ignore_nodes in raft_removenode and raft_replace
utils: introduce split_comma_separated_list
raft topology: add the ignore_nodes column to system.topology
In the following commits, we add support for --ignore-dead-nodes
in raft_removenode and --ignore-dead-nodes-for-replace in
raft_replace. To make these request parameters accessible for the
topology coordinator, we store them in the new ignore_nodes
column of system.topology.
The system.tablets table stores replica sets as a CQL set type,
which is sorted. This means that if, in a tablet replica set
[n1, n2, n3] n2 is replaced with n4, then on reload we'll see
[n1, n3, n4], changing the relative position of n3 from the third
replica to the second.
The relative position of replicas in a replica set is important
for materialized views, as they use it to pair base replicas with
view replicas. To prepare for materialized views using tablets,
change the persistent data type to list, which preserves order.
The code that generates new replica sets already preserves order:
see locator::replace_replica().
While this changes the system schema, tablets are an experimental
feature so we don't need to worry about upgrades.
Closes#15111
After this change, the load balancer can make progress with active
migrations. If the algorithm is called with active tablet migrations
in tablet metadata, those are treated by load balancer as if they were
already completed. This allows the algorithm to incrementally make
decision which when executed with active migrations will produce the
desired result.
Overload of shards is limited by the fact that the algorithm tracks
streaming concurrency on both source and target shards of active
migrations and takes concurrency limit into account when producing new
migrations.
The coordinator executes the load balancer on edges of tablet state
machine stransitions. This allows new migrations to be started as soon
as tablets finish streaming.
The load balancer is also continuously invoked as long as it produces
a non-empty plan. This is in order to saturate the cluster with
streaming. A single make_plan() call is still not saturating, due
to the way algorithm is implemented.
and add docs/dev/timestamp-conflict-resolution.md
to document the details of the conflict resolution algorithm.
Refs scylladb/scylladb#14063
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Containing two tables, describing all the possible operations seen in user, system and streaming semaphore diagnostics dumps.
Closes#14171
* github.com:scylladb/scylladb:
docs/dev/reader-concurrency-semaphore.md: add section about operations
docs/dev/reader-concurrency-semaphore.md: switch to # headers markings
reader_concurrency_semaphore: s/description/operation/ in diagnostics dumps