scylladb

Author	SHA1	Message	Date
Botond Dénes	c7c5817808	Merge 'Improve timestamp heuristics for tombstone garbage collection' from Benny Halevy When purging regular tombstone consult the min_live_timestamp, if available. This is safe since we don't need to protect dead data from resurrection, as it is already dead. For shadowable_tombstones, consult the min_memtable_live_row_marker_timestamp, if available, otherwise fallback to the min_live_timestamp. If we see in a view table a shadowable tombstone with time T, then in any row where the row marker's timestamp is higher than T the shadowable tombstone is completely ignored and it doesn't hide any data in any column, so the shadowable tombstone can be safely purged without any effect or risk resurrecting any deleted data. In other words, rows which might cause problems for purging a shadowable tombstone with time T are rows with row markers older or equal T. So to know if a whole sstable can cause problems for shadowable tombstone of time T, we need to check if the sstable's oldest row marker (and not oldest column) is older or equal T. And the same check applies similarly to the memtable. If both extended timestamp statistics are missing, fallback to the legacy (and inaccurate) min_timestamp. 
Fixes scylladb/scylladb#20423 Fixes scylladb/scylladb#20424 > [!NOTE] > no backport needed at this time > We may consider backport later on after given some soak time in master/enterprise > since we do see tombstone accumulation in the field under some materialized views workloads Closes scylladb/scylladb#20446 * github.com:scylladb/scylladb: cql-pytest: add test_compaction_tombstone_gc sstable_compaction_test: add mv_tombstone_purge_test sstable_compaction_test: tombstone_purge_test: test that old deleted data do not inhibit tombstone garbage collection sstable_compaction_test: tombstone_purge_test: add testlog debugging sstable_compaction_test: tombstone_purge_test: make_expiring: use next_timestamp sstable, compaction: add debug logging for extended min timestamp stats compaction: get_max_purgeable_timestamp: use memtable and sstable extended timestamp stats compaction: define max_purgeable_fn tombstone: can_gc_fn: move declaration to compaction_garbage_collector.hh sstables: scylla_metadata: add ext_timestamp_stats compaction_group, storage_group, table_state: add extended timestamp stats getters sstables, memtable: track live timestamps memtable_encoding_stats_collector: update row_marker: do nothing if missing	2024-09-13 08:56:51 +03:00
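The fallback order described in the merge message above can be sketched roughly as follows. This is an illustrative Python model, not ScyllaDB's C++ implementation: the `TimestampStats` class and the `purge_threshold` and `can_purge` helpers are hypothetical names, the field names are borrowed from the commit text, and the gc-grace expiry check that real compaction also performs is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimestampStats:
    """Hypothetical container for the per-sstable (or per-memtable) stats."""
    min_timestamp: int                                   # legacy, always present
    min_live_timestamp: Optional[int] = None             # extended, may be missing
    min_live_row_marker_timestamp: Optional[int] = None  # extended, may be missing


def purge_threshold(stats: TimestampStats, shadowable: bool) -> int:
    """Pick the timestamp a tombstone must be older than to be purgeable."""
    # Shadowable tombstones only matter for rows whose row marker is
    # older than or equal to the tombstone, so prefer the row-marker stat.
    if shadowable and stats.min_live_row_marker_timestamp is not None:
        return stats.min_live_row_marker_timestamp
    # Dead data needs no protection from resurrection, so live data only.
    if stats.min_live_timestamp is not None:
        return stats.min_live_timestamp
    # Both extended stats are missing: legacy (and inaccurate) fallback.
    return stats.min_timestamp


def can_purge(tombstone_ts: int, stats: TimestampStats, shadowable: bool = False) -> bool:
    """True if no tracked data is old enough to be covered by the tombstone."""
    return tombstone_ts < purge_threshold(stats, shadowable)
```

With extended stats present, a regular tombstone at timestamp 10 is purgeable when the oldest live write is at 20, even if older dead data exists at timestamp 5; with only the legacy `min_timestamp` it would be kept.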
Aleksandra Martyniuk	59fba9016f	docs: operating-scylla: add task manager docs Admin-facing documentation of task manager. Closes scylladb/scylladb#20209	2024-09-12 16:42:28 +03:00
Avi Kivity	ed7d352e7d	Merge 'Validate checksums for uncompressed SSTables' from Nikos Dragazis This PR introduces a new file data source implementation for uncompressed SSTables that will be validating the checksum of each chunk that is being read. Unlike for compressed SSTables, checksum validation for uncompressed SSTables will be active for scrub/validate reads but not for normal user reads to ensure we will not have any performance regression. It consists of: * A new file data source for uncompressed SSTables. * Integration of checksums into SSTable's shareable components. The validation code loads the component on demand and manages its lifecycle with shared pointers. * A new `integrity_check` flag to enable the new file data source for uncompressed SSTables. The flag is currently enabled only through the validation path, i.e., it does not affect normal user reads. * New scrub tests for both compressed and uncompressed SSTables, as well as improvements in the existing ones. * A change in JSON response of `scylla validate-checksums` to report if an uncompressed SSTable cannot be validated due to lack of checksums (no `CRC.db` in `TOC.txt`). Refs #19058. New feature, no backport is needed. 
Closes scylladb/scylladb#20207 * github.com:scylladb/scylladb: test: Add test to validate SSTables with no checksums tools: Fix typo in help message of scylla validate-checksums sstables: Allow validate_checksums() to report missing checksums test: Add test for concurrent scrub/validate operations test: Add scrub/validate tests for uncompressed SSTables test/lib: Add option to create uncompressed random schemas test: Add test for scrub/validate with file-level corruption test: Check validation errors in scrub tests sstables: Enable checksum validation for uncompressed SSTables sstables: Expose integrity option via crawling mutation readers sstables: Expose integrity option via data_consume_rows() sstables: Add option for integrity check in data streams sstables: Remove unused variable sstables: Add checksum in the SSTable components sstables: Introduce checksummed file data source implementation sstables: Replace assert with on_internal_error	2024-09-11 23:09:45 +03:00
Nikos Dragazis	5c0a7f706b	sstables: Allow validate_checksums() to report missing checksums Change the return type of `sstable::validate_checksums()` from binary (valid/invalid) to a ternary (valid/invalid/no_checksums). The third status represents uncompressed SSTables without a CRC component (no entry for CRC.db in the TOC). Also, change the JSON response of `sstable validate-checksums` to expose the new status. Replace the boolean value for valid/invalid checksums with an object that contains two boolean keys: one that indicates if the SSTable has checksums, and one that indicates if the checksums are valid or not. The second key is optional and appears only if the SSTable has checksums. Finally, update the documentation to reflect the changes in the API. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2024-09-11 13:12:39 +03:00
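To make the binary-to-ternary change above concrete, here is a small Python sketch of how the three statuses could map onto the described JSON object. The enum, the helper, and in particular the key names `has_checksums` and `valid_checksums` are assumptions chosen for illustration; the commit message only says there are two boolean keys, the second of which is optional.

```python
import json
from enum import Enum


class ChecksumStatus(Enum):
    """Hypothetical mirror of the ternary validation result."""
    VALID = "valid"
    INVALID = "invalid"
    NO_CHECKSUMS = "no_checksums"  # uncompressed SSTable without CRC.db in the TOC


def to_json_object(status: ChecksumStatus) -> dict:
    """Build the per-SSTable object; key names are assumed, not confirmed."""
    if status is ChecksumStatus.NO_CHECKSUMS:
        # The validity key is omitted when there is nothing to validate.
        return {"has_checksums": False}
    return {
        "has_checksums": True,
        "valid_checksums": status is ChecksumStatus.VALID,
    }


if __name__ == "__main__":
    for status in ChecksumStatus:
        print(status.value, json.dumps(to_json_object(status)))
```

A consumer of such output would check the first key before reading the second, since the second is absent for SSTables without checksums.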
Benny Halevy	4de4af954f	sstables: scylla_metadata: add ext_timestamp_stats Store and retrieve the optional extended timestamp statistics (min_live_timestamp and min_live_row_marker_timestamp) in the scylla_metadata component. Note that there is no need for a cluster feature to store those attributes since the scylla_metadata on-disk format is extensible so that old sstables can be read by new versions, seeing the extra stats is missing, and new sstables can be read by old versions that ignore unknown scylla metadata section types. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-09-10 19:05:57 +03:00
Kefu Chai	3f8f1d7274	tools/scylla-sstable: print warning when running shard-of with tablets the subcommand of "shard-of" does not support tablets yet. so let's print out an error message, instead of printing the mapping assuming that the sstables are distributed based on token only. this commit also adds two more command line options to this subcommand, so that user is required to specify either "--vnodes" or "--tablets" to instruct the tool how the cluster distributes the tokens across nodes and their shards. this helps to minimize the surprise of user. this change prepares for the succeeding changes to implement the tablets support. the corresponding test is updated accordingly so that it only exercises the "shard-of" subcommand without tablets. we will test it with tablets enabled in a succeeding change. Refs scylladb/scylladb#16488 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-08-15 15:49:55 +08:00
Tzach Livyatan	91401f7da5	docs: Update Scylla to ScyllaDB in all RST docs files v3 Closes scylladb/scylladb#19578	2024-07-01 18:04:21 +02:00
Kefu Chai	ad649be1bf	treewide: drop thrift support thrift support was deprecated since ScyllaDB 5.2 > Thrift API - legacy ScyllaDB (and Apache Cassandra) API is > deprecated and will be removed in followup release. Thrift has > been disabled by default. so let's drop it. in this change, * thrift protocol support is dropped * all references to thrift support in document are dropped * the "thrift_version" column in system.local table is preserved for backward compatibility, as we could load from an existing system.local table which still contains this column, so we need to write this column as well. * "/storage_service/rpc_server" is only preserved for backward compatibility with java-based nodetool. * `rpc_port` and `start_rpc` options are preserved, but they are marked as "Unused". so that the new release of scylladb can consume existing scylla.yaml configurations which might contain these settings. by making them deprecated, user will be able to get warned, and update their configurations before we actually remove them in the next major release. Fixes #3811 Fixes #18416 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-06-07 06:44:59 +08:00
Tzach Livyatan	4930095d39	Docs: Fix link from scylla-sstable.rst to /architecture/sstable/ Fix https://github.com/scylladb/scylladb/issues/18096 Closes scylladb/scylladb#18097	2024-03-29 10:48:24 +02:00
Yaniv Kaul	a2ac80340f	Typo: pint -> print Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#17804	2024-03-14 15:50:35 +02:00
Mikołaj Grzebieluch	cb17b4ac59	docs: maintenance socket: add section about accessing maintenance socket Closes scylladb/scylladb#17701	2024-03-11 20:25:00 +02:00
Tzach Livyatan	dafc83205b	Docs: rename the select-from-mutation-fragments page name Closes scylladb/scylladb#17456	2024-03-06 10:32:56 +02:00
Anna Stuchlik	37237407f6	doc: remove info about outdated versions This PR removes information about outdated versions, including disclaimers and information when a given feature was added. Now that the documentation is versioned, information about outdated versions is unnecessary (and makes the docs harder to read). Fixes https://github.com/scylladb/scylladb/issues/12110 Closes scylladb/scylladb#17430	2024-02-20 19:32:13 +02:00
Avi Kivity	93af3dd69b	Merge 'Maintenance socket: set filesystem permissions to 660' from Mikołaj Grzebieluch Set filesystem permissions for the maintenance socket to 660 (previously it was 755) to allow a scyllaadm's group to connect. Split the logic of creating sockets into two separate functions, one for each case: when it is a regular cql controller or used by maintenance_socket. Fixes https://github.com/scylladb/scylladb/issues/16487. Closes scylladb/scylladb#17113 * github.com:scylladb/scylladb: maintenance_socket: add option to set owning group transport/controller: get rid of magic number for socket path's maximal length transport/controller: set unix_domain_socket_permissions for maintenance_socket transport/controller: pass unix_domain_socket_permissions to generic_server::listen transport/controller: split configuring sockets into separate functions	2024-02-20 15:09:54 +02:00
Anna Stuchlik	4f8f183736	doc: remove SSTable2json from the docs This commit removes the SSTable2json documentation, as well as the links to the removed page. In addition, it adds a redirection for that page to prevent 404. Fixes https://github.com/scylladb/scylladb/issues/17204 Closes scylladb/scylladb#17340	2024-02-20 08:43:27 +02:00
Mikołaj Grzebieluch	182cfebe40	maintenance_socket: add option to set owning group Option `maintenance-socket-group` sets the owning group of the maintenance socket. If not set, the group will be the same as the user running the scylla node.	2024-02-19 10:21:00 +01:00
Tzach Livyatan	902733cd7e	Docs: rename doc page from REST to Admin REST API Closes scylladb/scylladb#17334	2024-02-14 13:49:54 +02:00
Mikołaj Grzebieluch	9c07a189e8	docs: add maintenance mode documentation	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	81ef9fc91e	docs: add cqlsh usage to maintenance socket documentation After https://github.com/scylladb/scylla-cqlsh/pull/67, the user can use cqlsh to connect to the node by maintenance socket.	2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch	2c34d9fcd8	docs: update maintenance socket documentation to use WhiteListRoundRobinPolicy After https://github.com/scylladb/python-driver/pull/287, the user can use WhiteListRoundRobinPolicy to connect to the node by maintenance socket.	2024-01-25 14:52:24 +01:00
Botond Dénes	7bb3ed7f23	docs/operating-scylla: scylla-sstable.rst: fix checksum list Add empty line before list of different checksums in validate-checksums's description. Otherwise the list is not rendered. Closes scylladb/scylladb#16401	2024-01-24 16:34:13 +01:00
Kamil Braun	6fcaec75db	Merge 'Add maintenance socket' from Mikołaj Grzebieluch It enables interaction with the node through CQL protocol without authentication. It gives full-permission access. The maintenance socket is available by Unix domain socket with file permissions `755`, thus it is not accessible from outside of the node and from other POSIX groups on the node. It is created before the node joins the cluster. To set up the maintenance socket, use the `maintenance-socket` option when starting the node.
* If set to `ignore` maintenance socket will not be created.
* If set to `workdir` maintenance socket will be created in `<node's workdir>/cql.m`.
* Otherwise maintenance socket will be created in the specified path.
The default value is `ignore`.
* With python driver
```python
from cassandra.cluster import Cluster
from cassandra.connection import UnixSocketEndPoint
from cassandra.policies import HostFilterPolicy, RoundRobinPolicy

socket = "<node's workdir>/cql.m"
cluster = Cluster([UnixSocketEndPoint(socket)],
                  # Driver tries to connect to other nodes in the cluster, so we need to filter them out.
                  load_balancing_policy=HostFilterPolicy(RoundRobinPolicy(), lambda h: h.address == socket))
session = cluster.connect()
```
Merge note: apparently cqlsh does not support unix domain sockets; it will have to be fixed in a follow-up. Closes scylladb/scylladb#16172 * github.com:scylladb/scylladb: test.py: add maintenance socket test test.py: enable maintenance socket in tests by default docs: add maintenance socket documentation main: add maintenance socket main: refactor initialization of cql controller and auth service auth/service: don't create system_auth keyspace when used by maintenance socket cql_controller: maintenance socket: fix indentation cql_controller: add option to start maintenance socket db/config: add maintenance_socket_enabled bool class auth: add maintenance_socket_role_manager db/config: add maintenance_socket variable	2023-12-20 19:04:40 +02:00
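The three cases of the `maintenance-socket` option described in the merge message above can be summarized with a short Python sketch. The function name and signature are hypothetical and only model the documented behavior; they are not part of ScyllaDB or of the Python driver.

```python
import posixpath
from typing import Optional


def maintenance_socket_path(option: str, workdir: str) -> Optional[str]:
    """Resolve where the maintenance socket would be created.

    Hypothetical helper modelling the documented option values:
    `ignore` (the default), `workdir`, or an explicit path.
    """
    if option == "ignore":
        return None  # no maintenance socket is created
    if option == "workdir":
        return posixpath.join(workdir, "cql.m")
    return option  # any other value is taken as the socket path itself
```

The resolved path is what a client would then pass to `UnixSocketEndPoint` in the driver snippet from the merge message.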
Mikołaj Grzebieluch	21b3ba4927	docs: add maintenance socket documentation	2023-12-18 17:58:13 +01:00
Kefu Chai	273ee36bee	tools/scylla-sstable: add `scylla sstable shard-of` command when migrating to the uuid-based identifiers, the mapping from the integer-based generation to the shard-id is preserved. we used to have "gen % smp_count" for calculating the shard which is responsible to host a given sstable. despite that this is not a documented behavior, this is handy when we try to correlate an sstable to a shard, typically when looking at a performance issue. in this change, a new subcommand is added to expose the connection between the sstable and its "owner" shards. Fixes #16343 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16345	2023-12-15 11:36:45 +02:00
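The legacy mapping mentioned in the commit above, "gen % smp_count", is simple enough to show directly. The Python function below is only an illustration of that arithmetic for integer generations; its name is hypothetical, and it does not model how `scylla sstable shard-of` itself determines the owner shards.

```python
def legacy_owner_shard(generation: int, smp_count: int) -> int:
    """Shard that hosted an sstable with an integer-based generation.

    Models only the undocumented `gen % smp_count` convention quoted in
    the commit message; `smp_count` is the number of shards on the node.
    """
    if smp_count <= 0:
        raise ValueError("smp_count must be positive")
    return generation % smp_count


if __name__ == "__main__":
    # Generations 12..15 on a 4-shard node land on shards 0..3.
    for gen in range(12, 16):
        print(gen, legacy_owner_shard(gen, 4))
```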
Yaniv Kaul	862909ee4f	Typos: fix typos in documentation Using codespell, went over the docs and fixed some typos. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#16275	2023-12-07 11:10:17 +02:00
Botond Dénes	5fb0d667cb	tools/scylla-sstable: always read scylla.yaml Currently, scylla.yaml is read conditionally, if either the user provided `--scylla-yaml-file` command line parameter, or if deducing the data dir location from the sstable path failed. We want the scylla.yaml file to be always read, so that when working with encrypted files (enterprise), scylla-sstable can pick up the configuration for the encryption. This patch makes scylla-sstable always attempt to read the scylla.yaml file, whether the user provided a location for it or not. When not, the default location is used (also considering the `SCYLLA_CONF` and `SCYLLA_HOME` environment variables). Failing to find the scylla.yaml file is not considered an error. The rationale is that the user will discover this if they attempt to do an operation that requires this anyway. There is a debug-level log about whether it was successfully read or not. Fixes: #16132 Closes scylladb/scylladb#16174	2023-12-05 15:06:29 +02:00
Kefu Chai	48340380dd	scylla-sstable: print "validate" result in JSON instead of printing the result of the "validate" subcommand in a free-style plain text, let's print it using JSON. for two reasons: 1. it is simpler to consume the output with other tools and tests. 2. more consistent with other commands. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16105	2023-11-22 17:44:07 +02:00
Kefu Chai	2392b6a179	doc: start unordered list with an empty line otherwise, sphinx would render them as a single block instead of as an unordered list. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15504	2023-09-21 14:35:09 +03:00
Kefu Chai	2f17b76df7	docs/operating-scylla/admin-tools: add note on deprecating sstabledump sstabledump is deprecated in place of `scylla sstable` commands. so let's reflect this in the document. Fixes #15020 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15021	2023-08-24 08:31:29 +03:00
Anna Stuchlik	b5c4d13e36	doc: update the Seastar Perftune page This commit updates the description of perftune.py. It is based on the information in the reported issue (below), the contents of help for perftune.py, and the input from @vladzcloudius. Fixes https://github.com/scylladb/scylladb/issues/14233 Closes #14879	2023-08-21 10:23:30 +03:00
Kefu Chai	7275b8967c	docs: add sstablemetadata to operating-scylla/admin-tools to note that sstablemetadata is being deprecated and encourage user to switch over to the native tools. Fixes #15020 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15040	2023-08-17 18:48:46 +03:00
Botond Dénes	718f57c510	docs/operating-scylla/admin-tools: add documentation for the SELECT * FROM MUTATION_FRAGMENTS() statement	2023-07-19 01:28:28 -04:00
Anna Stuchlik	a93fd2b162	doc: fix internal links Fixes https://github.com/scylladb/scylladb/issues/14490 This commit fixes multiple links that were broken after the documentation was published (but not in the preview) due to incorrect syntax. I've fixed the syntax to use the :docs: and :ref: directive for pages and sections, respectively. Closes #14664	2023-07-14 18:32:47 +03:00
Anna Stuchlik	e7bb86e0f1	doc: fix broken links on the Scylla SStable page	2023-06-30 12:00:59 +02:00
Botond Dénes	e92b71c451	docs/operating-scylla/admin-tools: scylla-sstable: document scrub operation	2023-06-16 06:20:14 -04:00
Anna Stuchlik	1ce50faf02	doc: remove redundant information about versions Fixes https://github.com/scylladb/scylladb/issues/13578 Now that the documentation is versioned, we can remove the .. versionadded:: and .. versionchanged:: information (especially that the latter is hard to maintain and now outdated), as well as the outdated information about experimental features in very old releases. This commit removes that information and nothing else. Closes #13680	2023-04-26 17:20:52 +03:00
Botond Dénes	b7a4304b69	docs: scylla-sstable.rst: remove accidentally added copy-pasta	2023-04-12 03:14:43 -04:00
Botond Dénes	1673f10f7a	docs: scylla-sstable.rst: remove paragraph with schema limitations The above file contained a paragraph explaining the limitations of `scylla-sstable.rst` w.r.t. automatically finding the schema. This no longer applies so remove it.	2023-04-12 03:14:43 -04:00
Botond Dénes	9f9beef8fd	docs: scylla-sstable.rst: update schema section With the recent changes to the ways schema can be provided to the tool.	2023-04-12 03:14:43 -04:00
Botond Dénes	54c0a387a2	Revert "Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes" This reverts commit `32fff17e19`, reversing changes made to `164afe14ad`. This series proved to be problematic, the new test introduced by it failing quite often. Revert it until the problems are tracked down and fixed.	2023-04-03 13:54:00 +03:00
Botond Dénes	b6682ad607	docs/operating-scylla/admin-tools: scylla-sstable.rst: update schema section With the recent changes to the ways schema can be provided to the tool.	2023-03-24 11:41:40 -04:00
Botond Dénes	e5071fdeab	tools/scylla-sstable: add script operation Loads the script from the specified path, then feeds the mutation fragment stream to it. For now only Lua scripts are supported for the simple reason that Lua is easy to write bindings for, it is simple and lightweight and more importantly we already have Lua included in the Scylla binary as it is used as the implementation language for UDF/UDA. We might consider WASM support in the future, but for now we don't have any language support in WASM available.	2023-01-09 09:46:57 -05:00
Botond Dénes	9713a5c314	tool/scylla-sstable: move documentation online The inline-help of operations will only contain a short summary of the operation and the link to the online documentation. The move is not a straightforward copy-paste. First and foremost because we move from simple markdown to RST. Informal references are also replaced with proper RST links. Some small edits were also done on the texts. The intent is the following: * the inline help serves as a quick reference for what the operation does and what flags it has; * the online documentation serves as the full reference manual, explaining all details;	2022-12-15 04:10:21 -05:00
Botond Dénes	3cf7afdf95	docs: scylla-sstable.rst: add sstable content section Provides a link to the architecture/sstable page for more details on the sstable format itself. It also describes the mutation-fragment stream, the parts of it that is relevant to the sstable operations. The purpose of this section is to provide a target for links that want to point to a common explanation on the topic. In particular, we will soon move the detailed documentation of the scylla-sstable operations into this file and we want to have a common explanation of the mutation fragment stream that these operations can point to.	2022-12-15 04:10:21 -05:00
Botond Dénes	641fb4c8bb	docs: scylla-{sstable,types}.rst: drop Syntax section In both files, the section hierarchy is as follows: Usage Syntax Sections with actual content This scheme uses up 3 levels of hierarchy, leaving not much room to expand the sections with actual content with subsections of their own. Remove the Syntax level altogether, directly embedding the sections with content under the Usage section.	2022-12-15 04:03:00 -05:00
Botond Dénes	c35cee7e2b	docs: scylla-types.rst: add mention of per-operation --help	2022-12-06 14:47:28 +02:00
Botond Dénes	4f9799ce4f	tools/scylla-types: add serialize operation Takes human readable values and converts them to serialized hex encoded format. Only regular atomic types are supported for now, no collection/UDT/tuple support, not even in frozen form.	2022-12-06 14:46:53 +02:00
Botond Dénes	15452730fb	tools/scylla-types: s/print/deserialize/ operation Soon we will have a serialize operation. Rename the current print operation to deserialize in preparation to that. We want the two operations (serialize and deserialize) to reflect their relation in their names too.	2022-12-06 14:45:30 +02:00
Botond Dénes	f98e6552b4	docs: scylla-types.rst: document tokenof and shardof These new actions were added recently but without the accompanying documentation change. Make up for this now.	2022-12-06 14:45:30 +02:00
Botond Dénes	30c047cae6	docs: scylla-types.rst: fix typo in compare operation description	2022-12-06 14:45:23 +02:00


62 Commits