and add docs/dev/timestamp-conflict-resolution.md
to document the details of the conflict resolution algorithm.
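To give a rough idea of the topic, here is a minimal, generic sketch of last-write-wins
cell reconciliation; the cell struct and compare_for_merge() below are made up for
illustration, and the precise tie-breaking rules are what the new document describes:

```
// Illustrative sketch only -- not the actual Scylla code; see
// docs/dev/timestamp-conflict-resolution.md for the real rules.
#include <cstdint>
#include <string>

struct cell {
    int64_t timestamp;   // write timestamp
    bool live;           // false if the cell is a tombstone
    std::string value;   // empty for tombstones
};

// Returns the version of the cell that wins the conflict.
const cell& compare_for_merge(const cell& a, const cell& b) {
    if (a.timestamp != b.timestamp) {
        return a.timestamp > b.timestamp ? a : b;  // higher timestamp wins
    }
    if (a.live != b.live) {
        return a.live ? b : a;                     // on a timestamp tie, the tombstone wins
    }
    return b.value < a.value ? a : b;              // last resort: compare the values
}
```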
Refs scylladb/scylladb#14063
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It contains two tables describing all the possible operations seen in the user, system and streaming semaphore diagnostics dumps.
Closes #14171
* github.com:scylladb/scylladb:
docs/dev/reader-concurrency-semaphore.md: add section about operations
docs/dev/reader-concurrency-semaphore.md: switch to # headers markings
reader_concurrency_semaphore: s/description/operation/ in diagnostics dumps
* indent the nested paragraphs of list items
* use a table to format the time sequence for better readability
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14016
Not all tables in the system keyspace are volatile. Among other things,
system.sstables and system.tablets are persisted using sstables, like
regular user tables, so move them into the section that lists the
other regular tables.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
We changed the type of the generation column in system.sstables
from bigint to timeuuid in 74e9e6dd1a,
but that change failed to update the document accordingly. So let's
update the document to reflect the change.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #13994
`check_and_repair_cdc_streams` is an existing API which you can use when the
current CDC generation is suboptimal, e.g. after you decommissioned a node, the
current generation has more stream IDs than you need. In that case you can run
`nodetool checkAndRepairCdcStreams` to create a new generation with fewer
streams.
It also works when you change the number of shards on some node. We don't
automatically introduce a new generation in that case, but you can use
`checkAndRepairCdcStreams` to create a new generation with restored
shard-colocation.
This PR implements the API on top of raft topology; it was originally
implemented using gossiper. It uses the `commit_cdc_generation` topology
transition state and a new `publish_cdc_generation` state to create new CDC
generations in a cluster without any nodes changing their `node_state`s in the
process.
Closes #13683
* github.com:scylladb/scylladb:
docs: update topology-over-raft.md
test: topology_experimental_raft: test `check_and_repair_cdc` API
raft topology: implement `check_and_repair_cdc_streams` API
raft topology: implement global request handling
raft topology: introduce `prepare_new_cdc_generation_data`
raft_topology: `get_node_to_work_on_opt`: return guard if no node found
raft topology: remove `node_to_work_on` from `commit_cdc_generation` transition
raft topology: separate `publish_cdc_generation` state
raft topology: non-node-specific `exec_global_command`
raft topology: introduce `start_operation()`
raft topology: non-node-specific `topology_mutation_builder`
topology_state_machine: introduce `global_topology_request`
topology_state_machine: use `uint16_t` for `enum_class`es
raft topology: make `new_cdc_generation_data_uuid` topology-global
It was already outdated before this PR.
Describe the version of the topology state machine implemented in this PR.
Fix some typos and make it proper markdown so it renders nicely on
GitHub etc.
Update the `Generation switching` section: most of the existing
description landed in the `Gossiper-based topology changes` subsection, and
a new subsection was added to describe Raft group 0 based topology
changes. Marked as WIP - we expect further development in this area
soon.
The existing gossiper-based description was also updated a bit.
Greatly expand on the details of how the semaphore works.
Organize the content into thematic chapters to improve navigation.
Improve formatting while at it.
The names of these states have been the source of confusion ever since
they were introduced. Give them names which better reflect their true
meaning and leave less room for misinterpretation. The changes are:
* active/unused -> active
* active/used -> active/need_cpu
* active/blocked -> active/await
Hopefully the new names do a better job at conveying what these states
really mean:
* active - a regular admitted permit, which is active (as opposed to
an inactive permit).
* active/need_cpu - an active permit which was marked as needing CPU for
the read to make progress. This permit prevents admission of new
permits while it is in this state.
* active/await - a former active/need_cpu permit, which has to wait on
I/O or a remote shard. While in this state, it doesn't block the
admission of new permits (pending other criteria such as resource
availability).
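To make the mapping concrete, a hypothetical sketch of the renamed states as a plain
enum (for illustration only; the actual reader_permit implementation is more involved):

```
// Hypothetical illustration of the renamed permit states -- not the real
// reader_concurrency_semaphore code.
enum class permit_state {
    active,           // admitted and running; does not block anything by itself
    active_need_cpu,  // needs CPU to make progress; blocks admission of new permits
    active_await,     // waiting on I/O or a remote shard; does not block admission
};
```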
The schema is

    CREATE TABLE system.sstables (
        location text,
        generation bigint,
        format text,
        status text,
        uuid uuid,
        version text,
        PRIMARY KEY (location, generation)
    )
A sample entry looks like:

    location | generation | format | status | uuid | version
    ---------------------------------------------------------------------+------------+--------+--------+--------------------------------------+---------
    /data/object_storage_ks/test_table-d096a1e0ad3811ed85b539b6b0998182 | 2 | big | sealed | d0a743b0-ad38-11ed-85b5-39b6b0998182 | me
The uuid field points to the "folder" on the storage where the sstable
components are. Like this:

    s3
    `- test_bucket
       `- f7548f00-a64d-11ed-865a-0c1fbc116bb3
          `- Data.db
           - Index.db
           - Filter.db
           - ...
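For illustration, a made-up helper showing how a component's object key could be
composed from the bucket name, the sstable's uuid ("folder") and the component name;
this is not the actual Scylla code:

```
// Hypothetical helper, for illustration only.
#include <string>

std::string component_object_key(const std::string& bucket,
                                 const std::string& sstable_uuid,
                                 const std::string& component) {
    // e.g. "test_bucket/f7548f00-a64d-11ed-865a-0c1fbc116bb3/Data.db"
    return bucket + "/" + sstable_uuid + "/" + component;
}
```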
It's not very nice that the whole /var/lib/... path is in fact used as
the location; it needs PR #12707 to fix this place.
Also, the "status" part is not yet fully functional; it only supports
three options:
- creating -- the same as a TemporaryTOC file existing on disk
- sealed -- the default state
- deleting -- the analogy of the deletion log on disk
The latter needs support from the distributed_loader, which is not yet
there. In fact, distributed_loader also needs to be patched to actually
select entries from this table on load. It also needs the mentioned
PR #12707 to support staging and quarantine sstables.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The topology state machine will track all the nodes in a cluster,
their state, properties (topology, tokens, etc) and requested actions.
Node state can be one of these:
none - the node is not yet in the cluster
bootstrapping - the node is currently bootstrapping
decommissioning - the node is being decommissioned
removing - the node is being removed
replacing - the node is replacing another node
normal - the node is working normally
rebuild - the node is being rebuilt
left - the node has left the cluster
Nodes in state left are never removed from the state.
Tokens can also be in one of the following states:
write_both_read_old - writes go to both the new and the old replicas, but reads
                      still come from the old replicas
write_both_read_new - writes still go to both the old and the new replicas, but
                      reads come from the new replicas
owner - the tokens are owned by the node, and reads and writes go to the new
        replica set only
Tokens that need to be moved start in the 'write_both_read_old' state. After the entire
cluster learns about it, streaming starts. After the streaming, the tokens move
to the 'write_both_read_new' state, and again the whole cluster needs to learn about it
and make sure no reads started before that point exist in the system.
After that the tokens may move to the 'owner' state.
topology_request is the field through which a topology operation request
can be issued to a node. A request is one of the topology operations
currently supported: join, leave, replace or remove.
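As a rough illustration, the states and requests described above could be rendered as
plain enums like in the hypothetical sketch below; the actual definitions live in the
topology state machine code and may differ:

```
// Hypothetical sketch only -- names taken from the description above.
enum class node_state {
    none,             // not yet in the cluster
    bootstrapping,
    decommissioning,
    removing,
    replacing,
    normal,
    rebuild,
    left,             // nodes in this state are never removed
};

enum class token_state {
    write_both_read_old,  // writes go to old and new replicas, reads to old
    write_both_read_new,  // writes go to old and new replicas, reads to new
    owner,                // reads and writes go to the new replica set only
};

enum class topology_request {
    join,
    leave,
    replace,
    remove,
};
```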
Until now, the instructions on generating wasm files and using them
for Scylla UDFs were stored in docs/dev, so they were not visible
on the docs website. Now that the Rust helper library for UDFs
is ready, and we're inviting users to try it out, we should also
make the rest of the Wasm UDF documentation readily available
for the users.
Closes #13139
When the WASM UDFs were first introduced, the LANGUAGE required in
the CQL statements to use them was "xwasm", because the ABI for the
UDFs was still not specified and changes to it could be backwards
incompatible.
Now, the ABI is stabilized, but if backwards incompatible changes
are made in the future, we will add a new ABI version for them, so
the name "xwasm" is no longer needed and we can finally
change it to "wasm".
Closes #13089
The WASM UDF implementation has changed since the last time the docs
were written. In particular, the Rust helper library has been
released, and using it should be the recommended method.
Some decisions that were only experimental at the start were also
"set in stone", so we should refer to them as such.
The docs also contain some code examples. This patch adds tests for
these examples to make sure that they are neither wrong nor misleading.
Closes #12941
Closes #12071
* github.com:scylladb/scylladb:
docs/dev: building.md: mention node-exporter packages
docs/dev: building.md: replace `dev` with `<mode>` in list of debs
The developer documentation in `building.md` suggested running unit tests with the `./tools/toolchain/dbuild test` command; however, this command only invokes the `test` shell command, which immediately returns with status `1`:
```
[piotrs@new-host scylladb]$ ./tools/toolchain/dbuild test
[piotrs@new-host scylladb]$ echo $?
1
```
This was probably an unintended mistake, and what the author really meant was invoking `dbuild ninja test`.
Closes #12890
This extracts information which was there in row_cache.md, but is
relevant to MVCC in general.
It also makes adaptations and reflects the upcoming changes in this
series related to switching to the new mutation_partition_v2 model:
- continuity in evictable snapshots can now overlap. This is needed
to represent range tombstone information, which is linked to
continuity information.
- description of range tombstone representation was added
Rename `system.raft_config` to `system.raft_snapshot_config` to make it clearer
what the table stores.
Remove the `my_server_id` partition key column from
`system.raft_snapshot_config` and a corresponding column from
`system.raft_snapshots` which would store the Raft server ID of the local node.
It's unnecessary: all servers running on a given node in different groups will
use the same ID - the Raft ID of the node, which is equal to its Host ID. There
will never be multiple servers of a single Raft group running on the same node.
Closes #12513
* github.com:scylladb/scylladb:
db: system_keyspace: remove (my_)server_id column from RAFT_SNAPSHOTS and RAFT_SNAPSHOT_CONFIG
db: system_keyspace: rename 'raft_config' to 'raft_snapshot_config'
Leave the guide for manual opening in, though; the script might not work
in all cases.
Also update the version example; we changed what development versions
look like.
Closes #12511
Make it clear that the table stores the snapshot configuration, which is
not necessarily the currently operating configuration (the last one
appended to the log).
In the future we plan to have a separate virtual table for showing the
currently operating configuration; perhaps we will call it
`system.raft_config`.
Currently, we call cargo build every time we build scylla, even
when no rust files have been changed.
This is avoided by adding a depfile to the ninja rule for the rust
library.
The dep-info file is generated by default during cargo build,
but it uses the full paths of all dependencies that it includes,
and we use relative paths. This is fixed by specifying
CARGO_BUILD_DEP_INFO_BASEDIR='.', which makes it so the current
path is subtracted from all generated paths.
Instead of using 'always' when specifying when to run the cargo
build, a dependency on Cargo.lock is added in addition to the
depfile. As a result, the rust files are recompiled not only
when the source files included in the depfile are modified,
but also when some rust dependency is updated.
Cargo may put an old cached file as the result of the build even
when the Cargo.lock was recently updated. Because of that, the
build result may be older than the Cargo.lock file even
if the build was just performed. This may cause ninja to rebuild
the file every following time. To avoid this, we 'touch' the
build result, so that its last modification time is up to date.
Because the dependency on Cargo.lock was added, the new command
for the build does not modify it. Instead, the developer must
update it when modifying the dependencies - the docs are updated
to reflect that.
Closes #12489
Fixes #12508
This commit removes consume_in_reverse::legacy_half_reverse, an option
once used to indicate that the given key ranges are sorted descending,
based on the clustering key of the start of the range, and that the
range tombstones inside a partition would be sorted (descending, as all
the mutation fragments would) according to their end (but range
tombstones would still be stored according to their start bound).
As it turns out, mutation::consume, when called with the legacy_half_reverse
option, produces an invalid fragment stream, one where all the range
tombstone changes come after all the clustering rows. This was not an
issue, since when constructing results from the query, Scylla would not
pass the tombstones to the client, but instead compact data beforehand.
In this commit, the consume_in_reverse::legacy_half_reverse is removed,
along with all the uses.
As for the swap out in mutation_partition.cc in query_mutation and
to_data_query_result:
The downstream was not prepared to deal with legacy_half_reverse.
mutation::consume contains
```
if (reverse == consume_in_reverse::yes) {
while (!(stop_opt = consume_clustering_fragments<consume_in_reverse::yes>(_ptr->_schema, partition, consumer, cookie, is_preemptible::yes))) {
co_await yield();
}
} else {
while (!(stop_opt = consume_clustering_fragments<consume_in_reverse::no>(_ptr->_schema, partition, consumer, cookie, is_preemptible::yes))) {
co_await yield();
}
}
```
So why did it work at all? to_data_query_result deals with a single slice.
The used consumer (compact_for_query_v2) compacts away the range tombstone
changes, and thus the only difference between the consume_in_reverse::no
and consume_in_reverse::yes was that one was ordered increasing wrt. ckeys
and the second one was ordered decreasing. This property is maintained if
we swap out for the consume_in_reverse::yes format.
Refs: #12353
Closes #12453
* github.com:scylladb/scylladb:
mutation{,_consumer,_partition}: remove consume_in_reverse::legacy_half_reverse
mutation_partition_view: treat query::partition_slice::option::reversed in to_data_query_result as consume_in_reverse::yes
mutation: move consume_in_reverse def to mutation_consumer.hh
Currently, the rust build system in Scylla creates a separate
static library for each included rust package. This could cause
duplicate symbol issues when linking against multiple libraries
compiled from rust.
This issue is fixed in this patch by creating a single static library
to link against, which combines all rust packages implemented in
Scylla.
The Cargo.lock for the combined build is now tracked, so that all
users of the same scylla version also use the same versions of
imported rust modules.
Additionally, the rust package implementation and usage
docs are modified to be compatible with the build changes.
This patch also adds a new header file 'rust/cxx.hh' that contains
definitions of additional rust types available in c++.
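As a rough, hypothetical illustration of what the new header enables (the Rust-side
function below is made up; only the rust:: types come from 'rust/cxx.hh'):

```
// Hypothetical illustration only: consuming a Rust-implemented function
// from C++ using the types defined in rust/cxx.hh.
#include <string>
#include "rust/cxx.hh"   // rust::String, rust::Str, rust::Vec<T>, ...

// Made-up declaration of a function implemented in one of the Rust packages
// that are linked in through the single combined static library.
rust::String rust_greeting(rust::Str name);

std::string greet(const std::string& name) {
    rust::String out = rust_greeting(rust::Str(name));
    return std::string(out.data(), out.size());
}
```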
Add instructions on how to backport a feature to an older version of Scylla.
It contains detailed step-by-step instructions so that people unfamiliar with the intricacies of Scylla's repository organization can easily get the hang of it.
This is the guide I wish I had when I had to do my first backport.
I put it in backport.md because that looks like the file responsible for this sort of information.
For a moment I thought about `CONTRIBUTING.md`, but this is a really short file with general information, so it doesn't really fit there. Maybe in the future there will be some sort of unification (see #12126)
Closes #12138
* github.com:scylladb/scylladb:
dev/docs: add additional git pull to backport docs
docs/dev: add a note about cherry-picking individual commits
docs/dev: use 'is merged into' instead of 'becomes'
docs/dev: mention that new backport instructions are for the contributor
docs/dev: Add backport instructions for contributors
The diagnostics dumped by the reader concurrency semaphore are a pretty
common sight in logs as soon as a node becomes problematic. The reason
is that the reader concurrency semaphore acts as the canary in the coal
mine: it is the first to start screaming when the node or workload is
unhealthy. This patch adds documentation of the content of the
diagnostics and how to diagnose common problems based on it.
Fixes: #10471
Closes #11970