Commit Graph

90 Commits

Author SHA1 Message Date
Botond Dénes
584f4e467e tools/scylla-sstable: introduce the dump-schema command
There is a limited number of ways to obtain the schema of a table:
1) Use DESCRIBE TABLE in cqlsh
2) Find the schema definition in the code (for system tables)
3) Ask support/user to provide schema
4) Piece together the schema definition from the system tables

Option (1) is the most convenient but requires access to live cluster.
(2) is limited to system tables only.
When investigating issues for customers, we have to rely on (3) and this
often adds communication round-trips and delays. (4) requires knowledge
of ScyllaDB internals and access to system tables.

The new dump-schema commands provides a convenient way to obtain the
schema of tables, given that there is access to either an sstable or the
system tables. It can dump the schema of system tables without either.

Closes scylladb/scylladb#26433
2025-11-25 20:32:36 +03:00
Anna Stuchlik
724dc1e582 doc: fix the info about object storage
This commit fixes the information about object storage:

- Object storage configuration is no longer marked as experimental.
- Redundant information has been removed from the description.
- Information related to object storage for SStabels has been removed
  as the feature is not working.

Fixes https://github.com/scylladb/scylladb/issues/26985

Closes scylladb/scylladb#26987
2025-11-24 17:16:33 +03:00
Botond Dénes
e404dd7cf0 tools/scylla-sstable: add cql support to write operation
Add new --input-format command line argument. Possible values are json
(current) and cql (new -- added in this patch).
When --input-format=cql (new default), the input-file is expected to
contain CQL INSERT, UPDATE or DELETE statements, separated by semicolon.
The input file can contain any number of statements, in any order. The
statements will be executed and applied to a memtable, which is then
flushed to create an sstable with the content generated from the
statement. The memtable's size is capped at 1MiB, if it reaches this
size, it is flushed and recreated. Consequently, multiple sstables can
be created from a single scylla-sstable write --input-format=cql
operation.
2025-10-13 18:10:40 +03:00
Botond Dénes
34cc7aafae tools/scylla-sstable: introduce the upgrade command
An offline, scylla-sstable variant of nodetool upgradesstables command.
Applies latest (or selected) sstable version and latest schema.

Closes scylladb/scylladb#26109
2025-09-27 16:53:14 +03:00
Botond Dénes
edaf67edcb tools/scylla-sstable: remove writetime-histogram command
This command was written for an investigation and was used exactly once.
This would have been a perfect candidate for the (also rarely used)
scylla-sstable script command, but it didn't exist yet.
Drop this command from the tool, such super-specific commands should be
written as sstable-scripts nowadays, which is what we will do if we ever
need this again.

Closes scylladb/scylladb#26062
2025-09-18 12:05:54 +03:00
Botond Dénes
839056b648 docs/operating-scylla: scylla-sstable.rst: update write docs
scylla-sstable write (and scrub) moved to UUID generations in
514f59d157, but said patch forgot to
update the docs. This is fixed here.

Closes scylladb/scylladb#25965
2025-09-18 07:39:50 +03:00
Ran Regev
6eca083137 remove redis documentation
As part of removing redis from Scylla source tree.
This commit removes all related documentation.
Following commit remove the code itself.

Signed-off-by: Ran Regev <ran.regev@scylladb.com>
2025-08-20 17:53:23 +03:00
Anna Stuchlik
d303edbc39 doc: remove copyright from Cassandra Stress
This commit removes the Apache copyright note from the Cassandra Stress page.

It's a follow up to https://github.com/scylladb/scylladb/pull/21723, which missed
that update (see https://github.com/scylladb/scylladb/pull/21723#discussion_r1944357143).

Cassandra Stress is a separate tool with separate repo with the docs, so the copyright
information on the page is incorrect.

Fixes https://github.com/scylladb/scylladb/issues/23240

Closes scylladb/scylladb#24219
2025-05-26 09:35:30 +02:00
Kefu Chai
1ab2b7e7a0 tree: fix misspellings
these two misspellings were flagged by codespell.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#23357
2025-03-19 09:13:20 +02:00
Ernest Zaslavsky
112b4c8764 docs: Update Scylla Tools Documentation for S3 SSTable Support
Updated the Scylla Tools documentation to include changes related to
the enhanced support for S3-stored SSTables. This update ensures that
the documentation accurately reflects the latest functionality and
improvements.
2025-03-09 09:50:37 +02:00
Avi Kivity
28906c9261 Merge 'scylla-sstable: introduce the query command' from Botond Dénes
The scylla-sstable dump-* command suite has proven invaluable  in many investigations. In certain cases however, I found that `dump-data` is quite cumbersome. An example would be trying to find certain values in an sstable, or trying to read the content of system tables when a node is down. For these cases, `dump-data`  is very cumbersome: one has to trudge through tons of uninteresting metadata and do compaction in their heads. This PR introduces the new scylla-sstable query command, specifically targeted at situations like this: it allows executing queries on sstables, exposing to the user all the power of CQL, to tailor the output as they see fit.

Select everything from a table:

    $ scylla sstable query --system-schema /path/to/data/system_schema/keyspaces-*/*-big-Data.db
     keyspace_name                 | durable_writes | replication
    -------------------------------+----------------+-------------------------------------------------------------------------------------
            system_replicated_keys |           true |                         ({class : org.apache.cassandra.locator.EverywhereStrategy})
                       system_auth |           true |   ({class : org.apache.cassandra.locator.SimpleStrategy}, {replication_factor : 1})
                     system_schema |           true |                              ({class : org.apache.cassandra.locator.LocalStrategy})
                system_distributed |           true |   ({class : org.apache.cassandra.locator.SimpleStrategy}, {replication_factor : 3})
                            system |           true |                              ({class : org.apache.cassandra.locator.LocalStrategy})
                                ks |           true | ({class : org.apache.cassandra.locator.NetworkTopologyStrategy}, {datacenter1 : 1})
                     system_traces |           true |   ({class : org.apache.cassandra.locator.SimpleStrategy}, {replication_factor : 2})
     system_distributed_everywhere |           true |                         ({class : org.apache.cassandra.locator.EverywhereStrategy})

Select everything from a single SSTable, use the JSON output (filtered through [jq](https://jqlang.github.io/jq/) for better readability):

    $ scylla sstable query --system-schema --output-format=json /path/to/data/system_schema/keyspaces-*/me-3gm7_127s_3ndxs28xt4llzxwqz6-big-Data.db | jq
    [
      {
        "keyspace_name": "system_schema",
        "durable_writes": true,
        "replication": {
          "class": "org.apache.cassandra.locator.LocalStrategy"
        }
      },
      {
        "keyspace_name": "system",
        "durable_writes": true,
        "replication": {
          "class": "org.apache.cassandra.locator.LocalStrategy"
        }
      }
    ]

Select a specific field in a specific partition using the command-line:

    $ scylla sstable query --system-schema --query "select replication from scylla_sstable.keyspaces where keyspace_name='ks'" ./scylla-workdir/data/system_schema/keyspaces-*/*-Data.db
     replication
    -------------------------------------------------------------------------------------
     ({class : org.apache.cassandra.locator.NetworkTopologyStrategy}, {datacenter1 : 1})

Select a specific field in a specific partition using ``--query-file``:

    $ echo "SELECT replication FROM scylla_sstable.keyspaces WHERE keyspace_name='ks';" > query.cql
    $ scylla sstable query --system-schema --query-file=./query.cql ./scylla-workdir/data/system_schema/keyspaces-*/*-Data.db
     replication
    -------------------------------------------------------------------------------------
     ({class : org.apache.cassandra.locator.NetworkTopologyStrategy}, {datacenter1 : 1})

New functionality: no backport needed.

Closes scylladb/scylladb#22007

* github.com:scylladb/scylladb:
  docs/operating-scylla: document scylla-sstable query
  test/cqlpy/test_tools.py: add tests for scylla-sstable query
  test/cqlpy/test_tools.py: make scylla_sstable() return table name also
  scylla-sstable: introduce the query command
  tools/utils: get_selected_operation(): use std::string for operation_options
  utils/rjson: streaming_writer: add RawValue()
  cql3/type_json: add to_json_type()
  test/lib/cql_test_env: introduce do_with_cql_env_noreentrant_in_thread()
2025-03-06 13:42:45 +02:00
Anna Stuchlik
d0a48c5661 doc: remove the reference to the 6.2 version
This commit removes the OSS version name, which is irrelevant
and confusing for 2025.1 and later users.
Also, it updates the warning to avoid specifying the release
when the deprecated feature will be removed.

Fixes https://github.com/scylladb/scylladb/issues/22839

Closes scylladb/scylladb#22936
2025-02-24 15:02:11 +02:00
Dusan Malusev
4e6ea232d2 docs: add instruction for installing cassandra-stress
Signed-off-by: Dusan Malusev <dusan.malusev@scylladb.com>

Closes scylladb/scylladb#21723
2025-02-19 09:25:16 +02:00
Botond Dénes
2e062e4e10 docs/operating-scylla: document scylla-sstable query 2025-02-18 07:37:05 -05:00
Aleksandra Martyniuk
683176d3db tasks: add shard, start_time, and end_time to task_stats
task_stats contains short info about a task. To get a list of task_stats
in the module, one needs to request /task_manager/list_module_tasks/{module}.

To make identification and navigation between tasks easier, extend
task_stats to contain shard, start_time, and end_time.

Closes scylladb/scylladb#22351
2025-02-04 12:11:24 +02:00
Aleksandra Martyniuk
18cc79176a api: task_manager: do not unregister tasks on get_status
Currently, /task_manager/task_status_recursive/{task_id} and
/task_manager/task_status/{task_id} unregister queries task if it
has already finished.

The status should not disappear after being queried. Do not unregister
finished task when its status or recursive status is queried.
2025-01-27 11:23:45 +01:00
Aleksandra Martyniuk
e37d1bcb98 api: task_manager: add /task_manager/drain
In the following patches, get_status won't be unregistering finished
tasks. However, tests need a functionality to drop a task, so that
they could manipulate only with the tasks for operations that were
invoked by these tests.

Add /task_manager/drain/{module} to unregister all finished tasks
from the module. Add respective nodetool command.
2025-01-27 11:23:45 +01:00
Geoff Montee
c8ca2bd212 docs: operating-scylla/admin-tools/virtual-tables.rst: fix link to virtual tables
Closes scylladb/scylladb#22198
2025-01-14 08:45:49 +02:00
Botond Dénes
f899f0e411 tools/scylla-sstable: dump-statistics: fix handling of {min,max}_column_names
Said fields in statistics are of type
`disk_array<uint32_t, disk_string<uint16_t>>` and currently are handled
as array of regular strings. However these fields store exploded
clustering keys, so the elements store binary data and converting to
string can yield invalid UTF-8 characters that certain JSON parsers (jq,
or python's json) can choke on. Fix this by treating them as binary and
using `to_hex()` to convert them to string. This requires some massaging
of the json_dumper: passing field offset to all visit() methods and
using a caller-provided disk-string to sstring converter to convert disk
strings to sstring, so in the case of statistics, these fields can be
intercepted and properly handled.

While at it, the type of these fields is also fixed in the
documentation.

Before:

    "min_column_names": [
      "��Z���\u0011�\u0012ŷ4^��<",
      "�2y\u0000�}\u007f"
    ],
    "max_column_names": [
      "��Z���\u0011�\u0012ŷ4^��<",
      "}��B\u0019l%^"
    ],

After:

    "min_column_names": [
      "9dd55a92bc8811ef12c5b7345eadf73c",
      "80327900e2827d7f"
    ],
    "max_column_names": [
      "9dd55a92bc8811ef12c5b7345eadf73c",
      "7df79242196c255e"
    ],

Fixes: #22078

Closes scylladb/scylladb#22225
2025-01-13 09:19:04 +03:00
Botond Dénes
f55dc71c3f Merge 'Use checksummed input streams in validate_checksums()' from Nikos Dragazis
With commits ed7d352e7d and bb1867c7c7, we now have input streams for both compressed and uncompressed SSTables that provide seamless checksum and digest checking. The code for these was based on `validate_checksums()`, which implements its own validation logic over raw streams. This has led to some duplicate code.

This PR deduplicates the uncompressed case by modifying `validate_checksums()` to use a checksummed input stream instead of a raw stream. The same cannot be done for compressed SSTables though. The reason is that `validate_checksums()` needs to examine the whole data file, even if an invalid chunk is encountered. In the checksummed case we support that by offloading the error handling logic from the data source via a function parameter. In the compressed data source we cannot do that because it needs to return decompressed data and decompression may fail if the data are invalid.

This PR also enables `validate_checksums()` to partially verify SSTables with just the per-chunk checksums if the digest is missing.

In more detail, this PR consists of:
* Port of some integrity checks from `do_validate_uncompressed()` to the checksummed data source. It should now be able to detect corruption due to truncated or appended chunks (expected number of chunks is retrieved from the CRC component).
* Introduction of `error_handler` parameter in checksummed data source and `data_stream()`.
* Refactoring of `validate_checksums()`. The JSON response of `sstable validate-checksums` was also modified to report a missing digest.
*  Tests for `validate_checksums()` against SSTables with truncated data, appended data, invalid digests, or no digest.

Refs #19058.

This PR is a hybrid of cleanup and feature. No backport is needed.

Closes scylladb/scylladb#20933

* github.com:scylladb/scylladb:
  tools/scylla-sstable: Rename valid_checksums -> valid
  test: Check validate_checksums() with missing digest
  sstables: Allow validate_checksums() to report missing digests
  sstables: Refactor validate_checksums() to use checksummed data stream
  sstables: Add error_handler parameter to data_stream()
  sstables: Add error handler in checksummed data source
  sstables: Check for excessive chunks in checksummed data source
  sstables: Check for premature EOF in checksummed data source
  test: test_validate_checksums: Check SSTable with invalid digest
  test: test_validate_checksums: Check SSTable with appended data
  test: test_validate_checksums: Complement test for truncated SSTable
2024-12-04 10:46:18 +02:00
Botond Dénes
ff90a77f5b scylla-sstable: revamp schema sources
Demote --scylla-data-dir and --scylla-yaml-file to schema source
helpers, rather than schema source in themselves. This practically means
that when these options are used, they won't define where the tool will
attempt to load the schema from, they will just be helpers to help locate
the schema, for whichever schema source the tool was instructed to use
(or left to choose).
--scylla-data-dir and --scylla-yaml-file being schema sources were
problematic with encryption at rest and for S3 support (not yet
implemented). With encryption, the tool needs access to the
configuration, so --scylla-yaml-file is often used to provide the path
to the configuration file, which contains encryption configuration,
needed for the tool to decrypt the sstable. Currently, using this option
implies forcing the tool to read the schema from the schema tables,
which is a problematic option for tests -- Scylla might be compacting a
schema sstable and this will make the tool fail to load the schema.
Demoting these options the schema helpers, allows providing them, while
at the same time having the option to use a different schema-source.

To allow the user to force the tool to load the schema from the schema
tables, a new --schema-tables option is added. Similarly, a
--sstable-schema option is introduced to force the tool to load the
schema from the sstable itself.

With this, each 4 schema source now has an option to force the use of
said schema source. There are various helper options to be used along
with these.

The documentation as well as the tests are updated with the changes.
The schema related documentation gets an rather extensive facelift
because it was a bit out-of-date and incomplete.

Fixes: scylladb/scylladb#20534

Closes scylladb/scylladb#21678
2024-11-28 18:36:09 +02:00
Botond Dénes
ccb433d767 Merge 'tasks: add api_task_ttl for tasks started with API' from Aleksandra Martyniuk
When users start an operation asynchronously with API, they are expected to check the operation's status. Hence, the status should be kept in task manager for reasonable time after the operation is done. The operations that are started internally usually don't need to stay in task manager for that long.

Add api_task_ttl that will be used for tasks started with API. By default it's 1 hour. The time for which non-API tasks stay in task manager isn't changed.

Fixes: #21499.
Refs: #21425.

No backport needed - previous versions may use task_ttl

Closes scylladb/scylladb#21505

* github.com:scylladb/scylladb:
  test: add test to check user_task_ttl
  tasks: api: move make_task method
  docs: nodetool: update backup and restore commands docs
  docs: update task manager docs
  nodetool: add nodetool tasks user-ttl command
  node_ops: use user task ttl for node ops virtual task
  tasks: use user_task_ttl for tasks started by user
  api: task_manager: add /task_manager/user_ttl to get and set user task ttl
  tasks: add task_manager::task::is_user_task method
  tasks: keep updateable_value of task_ttl in task manager
  db: config: add user_task_ttl_seconds named value
2024-11-27 09:57:57 +02:00
Aleksandra Martyniuk
3b86150e88 docs: update task manager docs 2024-11-26 09:57:41 +01:00
Nikos Dragazis
29ce29db33 tools/scylla-sstable: Rename valid_checksums -> valid
The `sstable validate-checksums` tool provides the validation result
via the `valid_checksums` key in its JSON response.

The name can be misleading as it refers to both the per-chunk checksums
and the digest (full checksum). We use the terms "digest" and
"full checksum" interchangeably.

Replace with the word "valid" to avoid confusion.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2024-11-25 16:04:58 +02:00
Nikos Dragazis
636524bde1 sstables: Allow validate_checksums() to report missing digests
Currently, `validate_checksums()` expects the SSTable to have a digest
component and fails immediately otherwise. This is suboptimal since data
integrity verification could still be carried out partially via checksum
checking.

Lift this restriction by allowing the function to perform checksum
checking in any case, and treat digest checking as best effort. Add a
separate boolean flag in the response to indicate the presence or
absence of the digest component, so that the user can deduce if a valid
result involved digest checking or not.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2024-11-25 16:04:57 +02:00
Botond Dénes
75ccb9f266 docs: sstableloader.rst: add deprecation notice
The java tools (including sstableloader) are deprecated and slated for
removal in the next ScyllaDB release. Add a notice about this to the
sstableloader page.
2024-11-22 03:39:48 -05:00
Botond Dénes
f22f022e16 docs: admin-tools: update deprecation notice for sstable{dump,metadata}
These two tools already have a deprecation notice, since ScyllaDB 5.4.
Now we have a target release for the actual removal of these tools, so
update the deprecation notice to reflect that.
2024-11-22 03:39:48 -05:00
Benny Halevy
2268912589 docs: add documentation for scylla_identifier
Commit 3a12ad96c7
added an sstable_identifier uuid to the SSTable
scylla_metadata component, however it was
under-documented and this patch adds the missing
documentation for the sstable component format,
and to the scylla sstable tool documentation.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#21221
2024-10-28 08:18:08 +02:00
Botond Dénes
c7c5817808 Merge 'Improve timestamp heuristics for tombstone garbage collection' from Benny Halevy
When purging regular tombstone consult the min_live_timestamp, if available.
This is safe since we don't need to protect dead data from resurrection, as it is already dead.

For shadowable_tombstones, consult the min_memtable_live_row_marker_timestamp,
if available, otherwise fallback to the min_live_timestamp.

If we see in a view table a shadowable tombstone with time T, then in any row where the row marker's timestamp is higher than T the shadowable tombstone is completely ignored and it doesn't hide any data in any column, so the shadowable tombstone can be safely purged without any effect or risk resurrecting any deleted data.

In other words, rows which might cause problems for purging a shadowable tombstone with time T are rows with row markers older or equal T. So to know if a whole sstable can cause problems for shadowable tombstone of time T, we need to check if the sstable's oldest row marker (and not oldest column) is older or equal T. And the same check applies similarly to the memtable.

If both extended timestamp statistics are missing, fallback to the legacy (and inaccurate) min_timestamp.

Fixes scylladb/scylladb#20423
Fixes scylladb/scylladb#20424

> [!NOTE]
> no backport needed at this time
> We may consider backport later on after given some soak time in master/enterprise
> since we do see tombstone accumulation in the field under some materialized views workloads

Closes scylladb/scylladb#20446

* github.com:scylladb/scylladb:
  cql-pytest: add test_compaction_tombstone_gc
  sstable_compaction_test: add mv_tombstone_purge_test
  sstable_compaction_test: tombstone_purge_test: test that old deleted data do not inhibit tombstone garbage collection
  sstable_compaction_test: tombstone_purge_test: add testlog debugging
  sstable_compaction_test: tombstone_purge_test: make_expiring: use next_timestamp
  sstable, compaction: add debug logging for extended min timestamp stats
  compaction: get_max_purgeable_timestamp: use memtable and sstable extended timestamp stats
  compaction: define max_purgeable_fn
  tombstone: can_gc_fn: move declaration to compaction_garbage_collector.hh
  sstables: scylla_metadata: add ext_timestamp_stats
  compaction_group, storage_group, table_state: add extended timestamp stats getters
  sstables, memtable: track live timestamps
  memtable_encoding_stats_collector: update row_marker: do nothing if missing
2024-09-13 08:56:51 +03:00
Aleksandra Martyniuk
59fba9016f docs: operating-scylla: add task manager docs
Admin-facing documentation of task manager.

Closes scylladb/scylladb#20209
2024-09-12 16:42:28 +03:00
Avi Kivity
ed7d352e7d Merge 'Validate checksums for uncompressed SSTables' from Nikos Dragazis
This PR introduces a new file data source implementation for uncompressed SSTables that will be validating the checksum of each chunk that is being read. Unlike for compressed SSTables, checksum validation for uncompressed SSTables will be active for scrub/validate reads but not for normal user reads to ensure we will not have any performance regression.

It consists of:
* A new file data source for uncompressed SSTables.
* Integration of checksums into SSTable's shareable components. The validation code loads the component on demand and manages its lifecycle with shared pointers.
* A new `integrity_check` flag to enable the new file data source for uncompressed SSTables. The flag is currently enabled only through the validation path, i.e., it does not affect normal user reads.
* New scrub tests for both compressed and uncompressed SSTables, as well as improvements in the existing ones.
* A change in JSON response of `scylla validate-checksums` to report if an uncompressed SSTable cannot be validated due to lack of checksums (no `CRC.db` in `TOC.txt`).

Refs #19058.

New feature, no backport is needed.

Closes scylladb/scylladb#20207

* github.com:scylladb/scylladb:
  test: Add test to validate SSTables with no checksums
  tools: Fix typo in help message of scylla validate-checksums
  sstables: Allow validate_checksums() to report missing checksums
  test: Add test for concurrent scrub/validate operations
  test: Add scrub/validate tests for uncompressed SSTables
  test/lib: Add option to create uncompressed random schemas
  test: Add test for scrub/validate with file-level corruption
  test: Check validation errors in scrub tests
  sstables: Enable checksum validation for uncompressed SSTables
  sstables: Expose integrity option via crawling mutation readers
  sstables: Expose integrity option via data_consume_rows()
  sstables: Add option for integrity check in data streams
  sstables: Remove unused variable
  sstables: Add checksum in the SSTable components
  sstables: Introduce checksummed file data source implementation
  sstables: Replace assert with on_internal_error
2024-09-11 23:09:45 +03:00
Nikos Dragazis
5c0a7f706b sstables: Allow validate_checksums() to report missing checksums
Change the return type of `sstable::validate_checksums()` from binary
(valid/invalid) to a ternary (valid/invalid/no_checksums). The third
status represents uncompressed SSTables without a CRC component (no
entry for CRC.db in the TOC).

Also, change the JSON response of `sstable validate-checksums` to expose
the new status. Replace the boolean value for valid/invalid checksums
with an object that contains two boolean keys: one that indicates if the
SSTable has checksums, and one that indicates if the checksums are valid
or not. The second key is optional and appears only if the SSTable has
checksums.

Finally, update the documentation to reflect the changes in the API.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2024-09-11 13:12:39 +03:00
Benny Halevy
4de4af954f sstables: scylla_metadata: add ext_timestamp_stats
Store and retrieve the optional extended timestamp statistics
(min_live_timestamp and min_live_row_marker_timestamp)
in the scylla_metadata component.

Note that there is no need for a cluster feature to
store those attributes since the scylla_metadata
on-disk format is extensible so that old sstables
can be read by new versions, seeing the extra stats
is missing, and new sstables can be read by old
versions that ignore unknown scylla metadata section types.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-09-10 19:05:57 +03:00
Kefu Chai
3f8f1d7274 tools/scylla-sstable: print warning when running shard-of with tablets
the subcommand of "shard-of" does not support tablets yet. so let's
print out an error message, instead of printing the mapping assuming
that the sstables are distributed based on token only.

this commit also adds two more command line options to this subcommand,
so that user is required to specify either "--vnodes" or "--tablets"
to instruct the tool how the cluster distributes the tokens across nodes
and their shards. this helps to minimize the suprise of user.

this change prepares for the succeeding changes to implement the tablets
support.

the corresponding test is updated accordingly so that it only exercises
the "shard-of" subcommand without tablets. we will test it with tablets
enabled in a succeeding change.

Refs scylladb/scylladb#16488
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-08-15 15:49:55 +08:00
Tzach Livyatan
91401f7da5 docs: Update Scylla to ScyllaDB in *all* RST docs files v3
Closes scylladb/scylladb#19578
2024-07-01 18:04:21 +02:00
Kefu Chai
ad649be1bf treewide: drop thrift support
thrift support was deprecated since ScyllaDB 5.2

> Thrift API - legacy ScyllaDB (and Apache Cassandra) API is
> deprecated and will be removed in followup release. Thrift has
> been disabled by default.

so let's drop it. in this change,

* thrift protocol support is dropped
* all references to thrift support in document are dropped
* the "thrift_version" column in system.local table is
  preserved for backward compatibility, as we could load
  from an existing system.local table which still contains
  this clolumn, so we need to write this column as well.
* "/storage_service/rpc_server" is only preserved for
  backward compatibility with java-based nodetool.
* `rpc_port` and `start_rpc` options are preserved, but
  they are marked as "Unused". so that the new release
  of scylladb can consume existing scylla.yaml configurations
  which might contain these settings. by making them
  deprecated, user will be able get warned, and update
  their configurations before we actually remove them
  in the next major release.

Fixes #3811
Fixes #18416
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-07 06:44:59 +08:00
Tzach Livyatan
4930095d39 Docs: Fix link fro scylla-sstable.rst to /architecture/sstable/
Fix https://github.com/scylladb/scylladb/issues/18096

Closes scylladb/scylladb#18097
2024-03-29 10:48:24 +02:00
Yaniv Kaul
a2ac80340f Typo: pint -> print
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#17804
2024-03-14 15:50:35 +02:00
Mikołaj Grzebieluch
cb17b4ac59 docs: maintenance socket: add section about accessing maintenance socket
Closes scylladb/scylladb#17701
2024-03-11 20:25:00 +02:00
Tzach Livyatan
dafc83205b Docs: rename the select-from-mutation-fragments page name
Closes scylladb/scylladb#17456
2024-03-06 10:32:56 +02:00
Anna Stuchlik
37237407f6 doc: remove info about outdated versions
This PR removes information about outdated versions, including disclaimers and information when a given feature was added.
Now that the documentation is versioned, information about outdated versions is unnecessary (and makes the docs harder to read).

Fixes https://github.com/scylladb/scylladb/issues/12110

Closes scylladb/scylladb#17430
2024-02-20 19:32:13 +02:00
Avi Kivity
93af3dd69b Merge 'Maintenance socket: set filesystem permissions to 660' from Mikołaj Grzebieluch
Set filesystem permissions for the maintenance socket to 660 (previously it was 755) to allow a scyllaadm's group to connect.
Split the logic of creating sockets into two separate functions, one for each case: when it is a regular cql controller or used by maintenance_socket.

Fixes https://github.com/scylladb/scylladb/issues/16487.

Closes scylladb/scylladb#17113

* github.com:scylladb/scylladb:
  maintenance_socket: add option to set owning group
  transport/controller: get rid of magic number for socket path's maximal length
  transport/controller: set unix_domain_socket_permissions for maintenance_socket
  transport/controller: pass unix_domain_socket_permissions to generic_server::listen
  transport/controller: split configuring sockets into separate functions
2024-02-20 15:09:54 +02:00
Anna Stuchlik
4f8f183736 doc: remove SSTable2json from the docs
This commit removes the SSTable2json documentation,
as well as the links to the removed page.

In addition, it adds a redirection for that page
to prevent 404.

Fixes https://github.com/scylladb/scylladb/issues/17204

Closes scylladb/scylladb#17340
2024-02-20 08:43:27 +02:00
Mikołaj Grzebieluch
182cfebe40 maintenance_socket: add option to set owning group
Option `maintenance-socket-group` sets the owning group of the maintenance socket.
If not set, the group will be the same as the user running the scylla node.
2024-02-19 10:21:00 +01:00
Tzach Livyatan
902733cd7e Docs: rename doc page from REST tp Admin REST API
Closes scylladb/scylladb#17334
2024-02-14 13:49:54 +02:00
Mikołaj Grzebieluch
9c07a189e8 docs: add maintenance mode documentation 2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch
81ef9fc91e docs: add cqlsh usage to maintenance socket documentation
After https://github.com/scylladb/scylla-cqlsh/pull/67, the user can use
cqlsh to connect to the node by maintenance socket.
2024-01-25 15:27:53 +01:00
Mikołaj Grzebieluch
2c34d9fcd8 docs: update maintenance socket documentation to use WhiteListRoundRobinPolicy
After https://github.com/scylladb/python-driver/pull/287, the user can use
WhiteListRoundRobinPolicy to connect to the node by maintenance socket.
2024-01-25 14:52:24 +01:00
Botond Dénes
7bb3ed7f23 docs/operating-scylla: scylla-sstable.rst: fix checksum list
Add empty line before list of different checksums in
validate-checksums's description. Otherwise the list is not rendered.

Closes scylladb/scylladb#16401
2024-01-24 16:34:13 +01:00
Kamil Braun
6fcaec75db Merge 'Add maintenance socket' from Mikołaj Grzebieluch
It enables interaction with the node through CQL protocol without authentication. It gives full-permission access.
The maintenance socket is available by Unix domain socket with file permissions `755`, thus it is not accessible from outside of the node and from other POSIX groups on the node.
It is created before the node joins the cluster.

To set up the maintenance socket, use the `maintenance-socket` option when starting the node.

* If set to `ignore` maintenance socket will not be created.
* If set to `workdir` maintenance socket will be created in `<node's workdir>/cql.m`.
* Otherwise maintenance socket will be created in the specified path.

The default value is `ignore`.

* With python driver

```python
from cassandra.cluster import Cluster
from cassandra.connection import UnixSocketEndPoint
from cassandra.policies import HostFilterPolicy, RoundRobinPolicy

socket = "<node's workdir>/cql.m"
cluster = Cluster([UnixSocketEndPoint(socket)],
                  # Driver tries to connect to other nodes in the cluster, so we need to filter them out.
                  load_balancing_policy=HostFilterPolicy(RoundRobinPolicy(), lambda h: h.address == socket))
session = cluster.connect()
```

Merge note: apparently cqlsh does not support unix domain sockets; it
will have to be fixed in a follow-up.

Closes scylladb/scylladb#16172

* github.com:scylladb/scylladb:
  test.py: add maintenance socket test
  test.py: enable maintenance socket in tests by default
  docs: add maintenance socket documentation
  main: add maintenance socket
  main: refactor initialization of cql controller and auth service
  auth/service: don't create system_auth keyspace when used by maintenance socket
  cql_controller: maintenance socket: fix indentation
  cql_controller: add option to start maintenance socket
  db/config: add maintenance_socket_enabled bool class
  auth: add maintenance_socket_role_manager
  db/config: add maintenance_socket variable
2023-12-20 19:04:40 +02:00