scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 03:56:42 +00:00

Author	SHA1	Message	Date
Anna Stuchlik	7f7ab3ae3e	doc: fix the broken Glossary link Fixes https://github.com/scylladb/scylladb/issues/13805 This commit fixes the redirection required by moving the Glossary page from the top of the page tree to the Reference section. As the change was only merged to master (not to branch-5.2), it is not working for version 5.2, which is now the latest stable version. For this reason, "stable" in the path must be replaced with "master". Closes #13847	2023-05-11 10:30:59 +03:00
Anna Stuchlik	4898a20ae9	doc: add troubleshooting for failed schema sync Fixes https://github.com/scylladb/scylladb/issues/12133 This commit adds a Troubleshooting article to support users when schema sync failed on their cluster. Closes #13709	2023-05-10 14:01:36 +03:00
Anna Stuchlik	c64109d8c7	doc: add driver support for Serverless Fixes https://github.com/scylladb/scylladb/issues/13453 This is V2 of https://github.com/scylladb/scylladb/pull/13710/. This commit adds: - the information about which ScyllaDB drivers support ScyllaDB Cloud Serverless. - language and organization improvements to the ScyllaDB CQL Drivers page. Closes #13825	2023-05-09 20:43:22 +03:00
Anna Stuchlik	98e1d7a692	doc: add the Elixir driver to the docs This commit adds the link to the Elixir driver to the list of the third-party drivers. The driver actively supports ScyllaDB. This is v2 of https://github.com/scylladb/scylladb/pull/13701 Closes #13806	2023-05-08 15:36:35 +03:00
Anna Stuchlik	27b0dff063	doc: make branch-5.2 latest and stable This commit changes the configuration in the conf.py file to make branch-5.2 the latest version and remove it from the list of unstable versions. As a result, the docs for version 5.2 will become the default for users accessing the ScyllaDB Open Source documentation. This commit should be merged as soon as version 5.2 is released. Closes #13681	2023-05-05 11:11:17 +03:00
Pavel Emelyanov	0b18e3bff9	doc: Add a document describing how to configure S3 backend Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-05-03 20:23:38 +03:00
Botond Dénes	a93e5698b0	Merge 'Adding MindsDB integration to Docs' from Guy Shtub @annastuchlik please review Closes #13691 * github.com:scylladb/scylladb: adding documentation for integration with MindsDB adding documentation for integration with MindsDB	2023-04-28 11:47:10 +03:00
Guy Shtub	c4664f9b66	adding documentation for integration with MindsDB	2023-04-27 13:13:19 +03:00
Guy Shtub	7e35a07f93	adding documentation for integration with MindsDB	2023-04-27 13:12:38 +03:00
Kamil Braun	30cc07b40d	Merge 'Introduce tablets' from Tomasz Grabiec This PR introduces an experimental feature called "tablets". Tablets are a way to distribute data in the cluster, which is an alternative to the current vnode-based replication. Vnode-based replication strategy tries to evenly distribute the global token space shared by all tables among nodes and shards. With tablets, the aim is to start from a different side. Divide resources of replica-shard into tablets, with a goal of having a fixed target tablet size, and then assign those tablets to serve fragments of tables (also called tablets). This will allow us to balance the load in a more flexible manner, by moving individual tablets around. Also, unlike with vnode ranges, tablet replicas live on a particular shard on a given node, which will allow us to bind raft groups to tablets. Those goals are not yet achieved with this PR, but it lays the ground for this. Things achieved in this PR: - You can start a cluster and create a keyspace whose tables will use tablet-based replication. This is done by setting `initial_tablets` option: ``` CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 3, 'initial_tablets': 8}; ``` All tables created in such a keyspace will be tablet-based. Tablet-based replication is a trait, not a separate replication strategy. Tablets don't change the spirit of replication strategy, it just alters the way in which data ownership is managed. In theory, we could use it for other strategies as well like EverywhereReplicationStrategy. Currently, only NetworkTopologyStrategy is augmented to support tablets. 
- You can create and drop tablet-based tables (no DDL language changes) - DML / DQL work with tablet-based tables Replicas for tablet-based tables are chosen from tablet metadata instead of token metadata Things which are not yet implemented: - handling of views, indexes, CDC created on tablet-based tables - sharding is done using the old method, it ignores the shard allocated in tablet metadata - node operations (topology changes, repair, rebuild) are not handling tablet-based tables - not integrated with compaction groups - tablet allocator piggy-backs on tokens to choose replicas. Eventually we want to allocate based on current load, not statically Closes #13387 * github.com:scylladb/scylladb: test: topology: Introduce test_tablets.py raft: Introduce 'raft_server_force_snapshot' error injection locator: network_topology_strategy: Support tablet replication service: Introduce tablet_allocator locator: Introduce tablet_aware_replication_strategy locator: Extract maybe_remove_node_being_replaced() dht: token_metadata: Introduce get_my_id() migration_manager: Send tablet metadata as part of schema pull storage_service: Load tablet metadata when reloading topology state storage_service: Load tablet metadata on boot and from group0 changes db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata() migration_notifier: Introduce before_drop_keyspace() migration_manager: Make prepare_keyspace_drop_announcement() return a future<> test: perf: Introduce perf-tablets test: Introduce tablets_test test: lib: Do not override table id in create_table() utils, tablets: Introduce external_memory_usage() db: tablets: Add printers db: tablets: Add persistence layer dht: Use last_token_of_compaction_group() in split_token_range_msb() locator: Introduce tablet_metadata dht: Introduce first_token() dht: Introduce next_token() storage_proxy: Improve trace-level logging locator: token_metadata: Fix confusing comment on ring_range() 
dht, storage_proxy: Abstract token space splitting Revert "query_ranges_to_vnodes_generator: fix for exclusive boundaries" db: Exclude keyspace with per-table replication in get_non_local_strategy_keyspaces_erms() db: Introduce get_non_local_vnode_based_strategy_keyspaces() service: storage_proxy: Avoid copying keyspace name in write handler locator: Introduce per-table replication strategy treewide: Use replication_strategy_ptr as a shorter name for abstract_replication_strategy::ptr_type locator: Introduce effective_replication_map locator: Rename effective_replication_map to vnode_effective_replication_map locator: effective_replication_map: Abstract get_pending_endpoints() db: Propagate feature_service to abstract_replication_strategy::validate_options() db: config: Introduce experimental "TABLETS" feature db: Log replication strategy for debugging purposes db: Log full exception on error in do_parse_schema_tables() db: keyspace: Remove non-const replication strategy getter config: Reformat	2023-04-27 09:40:18 +02:00
Anna Stuchlik	c7df168059	doc: move Glossary to the Reference section This commit moves the Glossary page to the Reference section. In addition, it adds the redirection so that there are no broken links because of this change and fixes a link to a subsection of Glossary. Closes #13664	2023-04-27 07:03:55 +03:00
Anna Stuchlik	1ce50faf02	doc: remove reduntant information about versions Fixes https://github.com/scylladb/scylladb/issues/13578 Now that the documentation is versioned, we can remove the .. versionadded:: and .. versionchanged:: information (especially that the latter is hard to maintain and now outdated), as well as the outdated information about experimental features in very old releases. This commit removes that information and nothing else. Closes #13680	2023-04-26 17:20:52 +03:00
Aleksandra Martyniuk	725110a035	docs: clarify the meaning of cfhistogram's sstable column Closes #13669	2023-04-26 16:19:23 +03:00
Tomasz Grabiec	9d786c1ebc	db: tablets: Add persistence layer	2023-04-24 10:49:37 +02:00
Maxim Korolyov	002bdd7ae7	doc: add jaeger integration docs Closes #13490	2023-04-24 08:26:53 +03:00
Chang Chen Chien	c25a718008	docs: fix typo in using-scylla/local-secondary-indexes.rst Closes #13607	2023-04-24 06:56:19 +03:00
Tomasz Grabiec	bd0b299322	Merge 'Manage CDC generations when bootstrapping nodes using Raft Group 0 topology coordinator' from Kamil Braun Introduce a new table `CDC_GENERATIONS_V3` (`system.cdc_generations_v3`). The table schema is a copy-paste of the `CDC_GENERATIONS_V2` schema. The difference is that V2 lives in `system_distributed_keyspace` and writes to it are distributed using regular `storage_proxy` replication mechanisms based on the token ring. The V3 table lives in `system_keyspace` and any mutations written to it will go through group 0. Extend the `TOPOLOGY` schema with new columns: - `new_cdc_generation_data_uuid` will be stored as part of a bootstrapping node's `ring_slice`, it stores UUID of a newly introduced CDC generation which is used as partition key for the `CDC_GENERATIONS_V3` table to access this new generation's data. It's a regular column, meaning that every row (corresponding to a node) will have its own. - `current_cdc_generation_uuid` and `current_cdc_generation_timestamp` together form the ID of the newest CDC generation in the cluster. (the uuid is the data key for `CDC_GENERATIONS_V3`, the timestamp is when the CDC generation starts operating). Those are static columns since there's a single newest CDC generation. When topology coordinator handles a request for node to join, calculate a new CDC generation using the bootstrapping node's tokens, translate it to mutation format, and insert this mutation to the CDC_GENERATIONS_V3 table through group 0 at the same time we assign tokens to the node in Raft topology. The partition key for this data is stored in the bootstrapping node's `ring_slice`. After inserting new CDC generation data , we need to pick a timestamp for this generation and commit it, telling all nodes in the cluster to start using the generation for CDC log writes once their clocks cross that timestamp. We introduce a separate step to the bootstrap saga, before `write_both_read_old`, called `commit_cdc_generation`. 
In this step, the coordinator takes the `new_cdc_generation_data_uuid` stored in a bootstrapping node's `ring_slice` - which serves as the key to the table where the CDC generation data is stored - and combines it with a timestamp which it generates a bit into the future (as in old gossiper-based code, we use 2 * ring_delay, by default 1 minute). This gives us a CDC generation ID which we commit into the topology state as the `current_cdc_generation_id` while switching the saga to the next step, `write_both_read_old`. Once a new CDC generation is committed to the cluster by the topology coordinator, we also need to publish it to the user-facing description tables so CDC applications know which streams to read from. This uses regular distributed table writes underneath (tables living in the `system_distributed` keyspace) so it requires `token_metadata` to be nonempty. We need a hack for the case of bootstrapping the first node in the cluster - turning the tokens into normal tokens earlier in the procedure in `token_metadata`, but this is fine for the single-node case since no streaming is happening. When a node notices that a new CDC generation was introduced in `storage_service::topology_state_load`, it updates its internal data structures that are used when coordinating writes to CDC log tables. We include the current CDC generation data in topology snapshot transfers. Some fixes and refactors included. 
Closes #13385 * github.com:scylladb/scylladb: docs: cdc: describe generation changes using group 0 topology coordinator cdc: generation_service: add a FIXME cdc: generation_service: add legacy_ prefix for gossiper-based functions storage_service: include current CDC generation data in topology snapshots db: system_keyspace: introduce `query_mutations` with range/slice storage_service: hold group 0 apply mutex when reading topology snapshot service: raft_group0_client: introduce `hold_read_apply_mutex` storage_service: use CDC generations introduced by Raft topology raft topology: publish new CDC generation to the user description tables raft topology: commit a new CDC generation on node bootstrap raft topology: create new CDC generation data during node bootstrap service: topology_state_machine: make topology::find const db: system_keyspace: small refactor of `load_topology_state` cdc: generation: extract pure parts of `make_new_generation` outside db: system_keyspace: add storage for CDC generations managed by group 0 service: topology_state_machine: better error checking for state name (de)serialization service: raft: plumbing `cdc::generation_service&` cdc: generation: `get_cdc_generation_mutations`: take timestamp as parameter cdc: generation: make `topology_description_generator::get_sharding_info` a parameter sys_dist_ks: make `get_cdc_generation_mutations` public sys_dist_ks: move find_schema outside `get_cdc_generation_mutations` sys_dist_ks: move mutation size threshold calculation outside `get_cdc_generation_mutations` service/raft: group0_state_machine: signal topology state machine in `load_snapshot`	2023-04-21 18:11:27 +02:00
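The `commit_cdc_generation` step described in the commit above can be sketched roughly as follows. This is an illustrative Python model, not the ScyllaDB implementation: the class and function names are hypothetical, and the 30-second `ring_delay` default is an assumption inferred from the message's "2 * ring_delay, by default 1 minute".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from uuid import UUID, uuid4

# Assumption: ring_delay defaults to 30 s, so that 2 * ring_delay is the
# "1 minute" default mentioned in the commit message.
DEFAULT_RING_DELAY = timedelta(seconds=30)


@dataclass(frozen=True)
class CdcGenerationId:
    """Hypothetical model of a CDC generation ID: data key plus start time."""
    data_uuid: UUID       # partition key into the CDC_GENERATIONS_V3 table
    timestamp: datetime   # when the generation starts operating


def commit_cdc_generation(new_cdc_generation_data_uuid: UUID,
                          now: datetime,
                          ring_delay: timedelta = DEFAULT_RING_DELAY) -> CdcGenerationId:
    """Combine the data UUID with a timestamp placed a bit into the future."""
    return CdcGenerationId(new_cdc_generation_data_uuid, now + 2 * ring_delay)


def generation_is_active(gen: CdcGenerationId, local_clock: datetime) -> bool:
    """A node starts using the generation once its clock crosses the timestamp."""
    return local_clock >= gen.timestamp


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    gen = commit_cdc_generation(uuid4(), now)
    print(gen.timestamp - now)             # the lead time, 2 * ring_delay
    print(generation_is_active(gen, now))  # not yet active at commit time
```

Placing the timestamp in the future gives every node time to learn about the new generation before it has to start using it for CDC log writes.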
Anna Stuchlik	a68b976c91	doc: document `tombstone_gc` as not experimental The tombstone_gc was documented as experimental in version 5.0. It is no longer experimental in version 5.2. This commit updates the information about the option. Closes #13469	2023-04-21 14:43:25 +02:00
Kamil Braun	88aff50e8b	docs: cdc: describe generation changes using group 0 topology coordinator Update the `Generation switching` section: most of the existing description landed in `Gossiper-based topology changes` subsection, and a new subsection was added to describe Raft group 0 based topology changes. Marked as WIP - we expect further development in this area soon. The existing gossiper-based description was also updated a bit.	2023-04-20 16:36:41 +02:00
Warren Krewenki	73eaebe338	Remove visible :orphan: The text `:orphan:` was showing up in the scylla.yaml documentation with no context. Closes #13524	2023-04-20 08:24:48 +03:00
Anna Stuchlik	3d25edf539	doc: remove the sequential repair option from docs Fixes https://github.com/scylladb/scylladb/issues/12132 The sequential repair mode is not supported. This commit removes the incorrect information from the documentation. Closes #13544	2023-04-18 09:45:48 +03:00
Anna Stuchlik	da7a75fe7e	doc: remove in-memory tables from OSS docs Related: https://github.com/scylladb/scylladb/issues/13119 This commit removes the information about in-memory tables from the Open Source documentation, as it is an Enterprise-only feature. Closes #13496	2023-04-17 16:00:09 +03:00
Botond Dénes	b8e47569e6	Merge 'doc: extend the information about the recommended RF on the Tracing page' from Anna Stuchlik Fixes https://github.com/scylladb/scylla-doc-issues/issues/823. This PR extends the note on the Tracing page to explain what is meant by setting the RF to ALL and adds a link for reference. Closes #12418 * github.com:scylladb/scylladb: docs: add an explanation to recommendation in the Note box doc: extend the information about the recommended RF on the Tracing page	2023-04-17 13:28:19 +03:00
Anna Stuchlik	2d2d92cf18	docs: add an explanation to recommendation in the Note box	2023-04-17 11:39:06 +02:00
Pavel Emelyanov	c501163f95	Merge 'reader_permit: give better names to active* states' from Botond Dénes The names of these states have been the source of confusion ever since they were introduced. Give them names which better reflects their true meaning and gives less room for misinterpretation. The changes are: * active/unused -> active * active/used -> active/need_cpu * active/blocked -> active/await Hopefully the new names do a better job at conveying what these states really mean: * active - a regular admitted permit, which is active (as opposed to an inactive permit). * active/need_cpu - an active permit which was marked as needing CPU for the read to make progress. This permit prevents admission of new permits while it is in this state. * active/await - a former active/need_cpu permit, which has to wait on I/O or a remote shard. While in this state, it doesn't block the admission of new permits (pending other criteria such as resource availability). Closes #13482 * github.com:scylladb/scylladb: docs/dev/reader-concurrency-semaphore.md: expand on how the semaphore works reader_permit: give better names to active* states	2023-04-14 20:39:05 +03:00
Tomasz Grabiec	952b455310	Merge ' tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes scylla-sstable currently has two ways to obtain the schema: * via a `schema.cql` file. * load schema definition from memory (only works for system tables). This meant that for most cases it was necessary to export the schema into a CQL format and write it to a file. This is very flexible. The sstable can be inspected anywhere, it doesn't have to be on the same host where it originates from. Yet in many cases the sstable is inspected on the same host where it originates from. In these cases, the schema is readily available in the schema tables on disk and it is plain annoying to have to export it into a file, just to quickly inspect an sstable file. This series solves this annoyance by providing a mechanism to load schemas from the on-disk schema tables. Furthermore, an auto-detect mechanism is provided to detect the location of these schema tables based on the path of the sstable, but if that fails, the tool checks the usual locations of the scylla data dir, the scylla configuration file and even looks for environment variables that tell the location of these. The old methods are still supported. In fact, if a schema.cql is present in the working directory of the tool, it is preferred over any other method, allowing for an easy force-override. If the auto-detection magic fails, an error is printed to the console, advising the user to turn on debug level logging to see what went wrong. A comprehensive test is added which checks all the different schema loading mechanisms. The documentation is also updated to reflect the changes. This change breaks the backward-compatibility of the command-line API of the tool, as `--system-schema` is now just a flag, the keyspace and table names are supplied separately via the new `--keyspace` and `--table` options. 
I don't think this will break anybody's workflow as this tool is still lightly used, exactly because of the annoying way the schema has to be provided. Hopefully after this series, this will change. Example: ``` $ ./build/dev/scylla sstable dump-data /var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine/me-1-big-Data.db {"sstables":{"/var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine//me-1-big-Data.db":[{"key":{"token":"-3485513579396041028","raw":"000400000000","value":"0"},"clustering_elements":[{"type":"clustering-row","key":{"raw":"","value":""},"marker":{"timestamp":1677837047297728},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1677837047297728,"value":"0"}}}]}]}} ``` As seen above, subdirectories like quarantine, staging etc. are also supported. Fixes: https://github.com/scylladb/scylladb/issues/10126 Closes #13448 * github.com:scylladb/scylladb: test/cql-pytest: test_tools.py: add tests for schema loading test/cql-pytest: add no_autocompaction_context docs: scylla-sstable.rst: remove accidentally added copy-pasta docs: scylla-sstable.rst: remove paragraph with schema limitations docs: scylla-sstable.rst: update schema section test/cql-pytest: nodetool.py: add flush_keyspace() tools/scylla-sstable: reform schema loading mechanism tools/schema_loader: add load_schema_from_schema_tables() db/schema_tables: expose types schema	2023-04-14 16:46:26 +02:00
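The commit above says the schema location is auto-detected from the path of the sstable, and that subdirectories such as quarantine and staging are handled. A minimal sketch of that idea in Python follows. It is not the tool's actual logic: the function name is hypothetical, and the `<data dir>/<keyspace>/<table>-<32 hex digits>/...` layout is an assumption read off the example path in the message.

```python
import re
from pathlib import PurePosixPath
from typing import Optional, Tuple

# Assumption: a table directory is named "<table>-<32 lowercase hex digits>",
# as in the example path "tbl2-d55ba230b9a811ed9ae8495671e9e4f8".
_TABLE_DIR = re.compile(r"^(?P<table>.+)-(?P<uuid>[0-9a-f]{32})$")


def guess_keyspace_and_table(sstable_path: str) -> Optional[Tuple[str, str]]:
    """Return (keyspace, table) guessed from an sstable path, or None."""
    parts = PurePosixPath(sstable_path).parts
    # Walk upwards from the file's parent directory, so that intermediate
    # subdirectories such as "quarantine" or "staging" are simply skipped.
    for i in range(len(parts) - 2, 0, -1):
        match = _TABLE_DIR.match(parts[i])
        if match:
            return parts[i - 1], match.group("table")
    return None


if __name__ == "__main__":
    path = ("/var/lib/scylla/data/ks/"
            "tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine/me-1-big-Data.db")
    print(guess_keyspace_and_table(path))
```

Returning `None` rather than raising leaves room for the fallbacks the message mentions (data directory, configuration file, environment variables), which this sketch does not model.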
Botond Dénes	edc75f51ff	docs/dev/reader-concurrency-semaphore.md: expand on how the semaphore works Greatly expand on the details of how the semaphore works. Organize the content into thematic chapters to improve navigation. Improve formatting while at it.	2023-04-14 08:51:24 -04:00
Botond Dénes	943ae7fc69	reader_permit: give better names to active* states The names of these states have been the source of confusion ever since they were introduced. Give them names which better reflects their true meaning and gives less room for misinterpretation. The changes are: * active/unused -> active * active/used -> active/need_cpu * active/blocked -> active/await Hopefully the new names do a better job at conveying what these states really mean: * active - a regular admitted permit, which is active (as opposed to an inactive permit). * active/need_cpu - an active permit which was marked as needing CPU for the read to make progress. This permit prevents admission of new permits while it is in this state. * active/await - a former active/need_cpu permit, which has to wait on I/O or a remote shard. While in this state, it doesn't block the admission of new permits (pending other criteria such as resource availability).	2023-04-14 08:40:46 -04:00
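The renaming described in the commit above can be summarized as a small lookup table. The Python below is only an illustrative model of the three states and of the admission rule stated in the message; the enum, the mapping, and the helper name are hypothetical and are not taken from the ScyllaDB sources.

```python
from enum import Enum
from typing import Iterable


class PermitState(Enum):
    """The three active* states, using the new names from the commit message."""
    ACTIVE = "active"
    ACTIVE_NEED_CPU = "active/need_cpu"
    ACTIVE_AWAIT = "active/await"


# Old name -> new state, exactly as listed in the commit message.
OLD_TO_NEW = {
    "active/unused": PermitState.ACTIVE,
    "active/used": PermitState.ACTIVE_NEED_CPU,
    "active/blocked": PermitState.ACTIVE_AWAIT,
}


def blocks_admission(permits: Iterable[PermitState]) -> bool:
    """Only a permit that needs CPU prevents admission of new permits.

    Other admission criteria mentioned in the message (such as resource
    availability) are deliberately left out of this sketch.
    """
    return any(state is PermitState.ACTIVE_NEED_CPU for state in permits)


if __name__ == "__main__":
    print(OLD_TO_NEW["active/blocked"].value)
    print(blocks_admission([PermitState.ACTIVE, PermitState.ACTIVE_AWAIT]))
```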
Anna Stuchlik	989a75b2f7	doc: update the metrics between 5.2 and 2023.1 Related: https://github.com/scylladb/scylla-enterprise/issues/2794 This commit adds the information about the metric changes in version 2023.1 compared to version 5.2. This commit is part of the 5.2-to-2023.1 upgrade guide and must be backported to branch-5.2. Closes #13506	2023-04-14 08:23:53 +03:00
Botond Dénes	b7a4304b69	docs: scylla-sstable.rst: remove accidentally added copy-pasta	2023-04-12 03:14:43 -04:00
Botond Dénes	1673f10f7a	docs: scylla-sstable.rst: remove paragraph with schema limitations The above file contained a paragraph explaining the limitations of `scylla-sstable.rst` w.r.t. automatically finding the schema. This no longer applies so remove it.	2023-04-12 03:14:43 -04:00
Botond Dénes	9f9beef8fd	docs: scylla-sstable.rst: update schema section With the recent changes to the ways schema can be provided to the tool.	2023-04-12 03:14:43 -04:00
Anna Stuchlik	2921059ebb	doc: add a disclaimer about unsupported upgrade Fixes https://github.com/scylladb/scylla-enterprise/issues/2805 This commit adds the disclaimer that an upgrade by replacing the cluster nodes with nodes with a different release is not supported. Closes #13445	2023-04-11 10:47:39 +03:00
Pavel Emelyanov	08e9046d07	system_keyspace: Add ownership table The schema is CREATE TABLE system.sstables ( location text, generation bigint, format text, status text, uuid uuid, version text, PRIMARY KEY (location, generation) ) A sample entry looks like: location \| generation \| format \| status \| uuid \| version ---------------------------------------------------------------------+------------+--------+--------+--------------------------------------+--------- /data/object_storage_ks/test_table-d096a1e0ad3811ed85b539b6b0998182 \| 2 \| big \| sealed \| d0a743b0-ad38-11ed-85b5-39b6b0998182 \| me The uuid field points to the "folder" on the storage where the sstable components are. Like this: s3 `- test_bucket `- f7548f00-a64d-11ed-865a-0c1fbc116bb3 `- Data.db - Index.db - Filter.db - ... It's not very nice that the whole /var/lib/... path is in fact used as location, it needs the PR #12707 to fix this place. Also, the "status" part is not yet fully functional, it only supports three options: - creating -- the same as TemporaryTOC file exists on disk - sealed -- default state - deleting -- the analogy for the deletion log on disk The latter needs support from the distributed_loader, which is not yet there. In fact, distributed_loader also needs to be patched to actually select entries from this table on load. Also it needs the mentioned PR #12707 to support staging and quarantine sstables. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:28 +03:00
Kamil Braun	c2a2996c2b	docs: cleaning up after failed membership change After a failed topology operation, like bootstrap / decommission / removenode, the cluster might contain a garbage entry in either token ring or group 0. This entry can be cleaned-up by executing removenode on any other node, pointing to the node that failed to bootstrap or leave the cluster. Document this procedure, including a method of finding the host ID of a garbage entry. Add references in other documents. Fixes: #13122 Closes #13186	2023-04-06 13:48:37 +02:00
Yaron Kaikov	c80ab78741	doc: update supported os for 2022.1 ubuntu22.04 is already supported on both `5.0` and `2022.1` updating the table Closes #13340	2023-04-05 06:43:58 +03:00
Nadav Har'El	aeabfcb93f	Merge 'Revert scylla sstable schema improvements' from Botond Dénes This PR reverts the scylla sstable schema loading improvements as they fail in CI every other run. I am already working on fixes for these but I am not sure I understand all the failures so it is best to revert and re-post the series later. Fixes: #13404 Fixes: #13410 Closes #13419 * github.com:scylladb/scylladb: Revert "Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes" Revert "tools/schema_loader: don't require results from optional schema tables"	2023-04-04 18:22:14 +03:00
Anna Stuchlik	447ce58da5	doc: update Raft doc for versions 5.2 and 2023.1 Fixes https://github.com/scylladb/scylladb/issues/13345 Fixes https://github.com/scylladb/scylladb/issues/13421 This commit updates the Raft documentation page to be up to date in versions 5.2 and 2023.1. - Irrelevant information about previous releases is removed. - Some information is clarified. - Mentions of version 5.2 are either removed (if possible) or version 2023.1 is added. Closes #13426	2023-04-04 15:15:56 +02:00
Anna Stuchlik	595325c11b	doc: add upgrade guide from 5.2 to 2023.1 Related: https://github.com/scylladb/scylla-enterprise/issues/2770 This commit adds the upgrade guide from ScyllaDB Open Source 5.2 to ScyllaDB Enterprise 2023.1. This commit does not cover metric updates (the metrics file has no content, which needs to be added in another PR). As this is an upgrade guide, this commit must be merged to master and backported to branch-5.2 and branch-2023.1 in scylla-enterprise.git. Closes #13294	2023-04-04 08:24:00 +03:00
Botond Dénes	54c0a387a2	Revert "Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes" This reverts commit `32fff17e19`, reversing changes made to `164afe14ad`. This series proved to be problematic, the new test introduced by it failing quite often. Revert it until the problems are tracked down and fixed.	2023-04-03 13:54:00 +03:00
Kefu Chai	c24a9600af	docs: dev: correct a typo s/By expending/By expanding/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13392	2023-03-31 17:19:08 +03:00
Tomasz Grabiec	4d6443e030	Merge 'Schema commitlog separate dir' from Gusev Petr The commitlog api originally implied that the commitlog_directory would contain files from a single commitlog instance. This is checked in segment_manager::list_descriptors, if it encounters a file with an unknown prefix, an exception occurs in `commitlog::descriptor::descriptor`, which is logged with the `WARN` level. A new schema commitlog was added recently, which shares the filesystem directory with the main commitlog. This causes warnings to be emitted on each boot. This patch solves the warnings problem by moving the schema commitlog to a separate directory. In addition, the user can employ the new `schema_commitlog_directory` parameter to move the schema commitlog to another disk drive. This is expected to be released in 5.3. As #13134 (raft tables->schema commitlog) is also scheduled for 5.3, and it already requires a clean rolling restart (no cl segments to replay), we don't need to specifically handle upgrade here. Fixes: #11867 Closes #13263 * github.com:scylladb/scylladb: commitlog: use separate directory for schema commitlog schema commitlog: fix commitlog_total_space_in_mb initialization	2023-03-30 23:48:58 +02:00
Petr Gusev	0152c000bb	commitlog: use separate directory for schema commitlog The commitlog api originally implied that the commitlog_directory would contain files from a single commitlog instance. This is checked in segment_manager::list_descriptors, if it encounters a file with an unknown prefix, an exception occurs in commitlog::descriptor::descriptor, which is logged with the WARN level. A new schema commitlog was added recently, which shares the filesystem directory with the main commitlog. This causes warnings to be emitted on each boot. This patch solves the warnings problem by moving the schema commitlog to a separate directory. In addition, the user can employ the new schema_commitlog_directory parameter to move the schema commitlog to another disk drive. By default, the schema commitlog directory is nested in the commitlog_directory. This can help avoid problems during an upgrade if the commitlog_directory in the custom scylla.yaml is located on a separate disk partition. This is expected to be released in 5.3. As #13134 (raft tables->schema commitlog) is also scheduled for 5.3, and it already requires a clean rolling restart (no cl segments to replay), we don't need to specifically handle upgrade here. Fixes: #11867	2023-03-30 21:55:50 +04:00
Nadav Har'El	32fff17e19	Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes `scylla-sstable` currently has two ways to obtain the schema: * via a `schema.cql` file. * load schema definition from memory (only works for system tables). This meant that for most cases it was necessary to export the schema into a `CQL` format and write it to a file. This is very flexible. The sstable can be inspected anywhere, it doesn't have to be on the same host where it originates from. Yet in many cases the sstable is inspected on the same host where it originates from. In these cases, the schema is readily available in the schema tables on disk and it is plain annoying to have to export it into a file, just to quickly inspect an sstable file. This series solves this annoyance by providing a mechanism to load schemas from the on-disk schema tables. Furthermore, an auto-detect mechanism is provided to detect the location of these schema tables based on the path of the sstable, but if that fails, the tool checks the usual locations of the scylla data dir, the scylla configuration file and even looks for environment variables that tell the location of these. The old methods are still supported. In fact, if a `schema.cql` is present in the working directory of the tool, it is preferred over any other method, allowing for an easy force-override. If the auto-detection magic fails, an error is printed to the console, advising the user to turn on debug level logging to see what went wrong. A comprehensive test is added which checks all the different schema loading mechanisms. The documentation is also updated to reflect the changes. This change breaks the backward-compatibility of the command-line API of the tool, as `--system-schema` is now just a flag, the keyspace and table names are supplied separately via the new `--keyspace` and `--table` options. 
I don't think this will break anybody's workflow as this tool is still lightly used, exactly because of the annoying way the schema has to be provided. Hopefully after this series, this will change. Example: ``` $ ./build/dev/scylla sstable dump-data /var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine/me-1-big-Data.db {"sstables":{"/var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine//me-1-big-Data.db":[{"key":{"token":"-3485513579396041028","raw":"000400000000","value":"0"},"clustering_elements":[{"type":"clustering-row","key":{"raw":"","value":""},"marker":{"timestamp":1677837047297728},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1677837047297728,"value":"0"}}}]}]}} ``` As seen above, subdirectories like `quarantine`, `staging` etc. are also supported. Fixes: https://github.com/scylladb/scylladb/issues/10126 Closes #13075 * github.com:scylladb/scylladb: docs/operating-scylla/admin-tools: scylla-sstable.rst: update schema section test/cql-pytest: test_tools.py: add test for schema loading test/cql-pytest: nodetool.py: add flush_keyspace() tools/scylla-sstable: reform schema loading mechanism tools/schema_loader: add load_schema_from_schema_tables() db/schema_tables: expose types schema	2023-03-30 09:35:59 +03:00
Kefu Chai	11cea36c12	docs: dev: write mathematical expressions in LaTeX for better readability Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13341	2023-03-29 15:07:14 +03:00
David Garcia	f45c4983db	docs: update theme 1.4 Closes #13346	2023-03-29 06:56:27 +03:00
Anna Stuchlik	4435b8b6f1	doc: elaborate on Scylla admin REST API - V2 This is V2 of https://github.com/scylladb/scylladb/pull/11849 This commit adds more information about ScyllaDB's REST API, including an example for Docker and a screenshot of the Swagger UI. Co-authored-by: Tzach Livyatan <tzach.livyatan@gmail.com> Closes #13331	2023-03-28 08:27:09 +03:00
David Garcia	70ce1b2002	docs: Separate conf.py docs: update github actions docs: fix Makefile tabs Update docs-pr.yaml Update Makefile Closes #13323	2023-03-27 13:42:58 +03:00
Anna Stuchlik	1cfea1f13c	doc: remove incorrect info about BYPASS CACHE Fixes https://github.com/scylladb/scylladb/issues/13106 This commit removes the information that BYPASS CACHE is an Enterprise-only feature and replaces that info with the link to the BYPASS CACHE description. Closes #13316	2023-03-26 18:13:17 +03:00
Tzach Livyatan	46e6c639d9	docs: minor improvments to the Raft Handling Failures and recovery procedure sections Closes #13292	2023-03-24 18:17:36 +01:00

857 Commits