Commit Graph

341 Commits

Author SHA1 Message Date
Anna Stuchlik
7eb1dc2ae5 doc: document the option to run ScyllaDB in Docker on macOS
This commit adds a description of a workaround to create a multi-node ScyllaDB cluster
with Docker on macOS.

Refs https://github.com/scylladb/scylladb/issues/16806
See https://forum.scylladb.com/t/running-3-node-scylladb-in-docker/1057/4

Closes scylladb/scylladb#20857
2024-10-01 14:58:58 +03:00
Pavel Emelyanov
ae76481444 Merge 'treewide: add "table" parameter to "backup" API ' from Kefu Chai
with this parameter, "backup" API can backup the given table, this
enables it to be a drop-in replacement of existing rclone API used by
scylla manager.

Fixes https://github.com/scylladb/scylladb/issues/20636

---

this change is a part of the efforts to bring the native backup/restore to scylla, no need to backprt.

Closes scylladb/scylladb#20661

* github.com:scylladb/scylladb:
  backup_task: fix the indent
  treewide: add "table" parameter to "backup" API
2024-09-25 10:53:38 +03:00
Kefu Chai
d663b6c13b treewide: add "table" parameter to "backup" API
with this parameter, "backup" API can backup the given table, this
enables it to be a drop-in replacement of existing rclone API used by
scylla manager.

in this change:

* api/storage_service: add "table" parameter to "backup" API.
* snapshot_ctl: compose the full path of the snapshot directory in
  `snapshot_ctl::start_backup`. since we have all the information
  for composing the snapshot directory, and what the `backup_task_impl`
  class is interested is but the snapshot directory, we just pass
  the path to it instead the individual components of the directory.
* backup_task_impl: instead of scan the whole keyspace recursively,
  only scan the specified snapshot directory.

Fixes scylladb/scylladb#20636
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-25 09:11:26 +08:00
Avi Kivity
d16ea0afd6 Merge 'cql3: Extend DESC SCHEMA by auth and service levels' from Dawid Mędrek
Auth has been managed via Raft since Scylla 6.0. Restoring data
following the usual procedure (1) is error-prone and so a safer
method must have been designed and implemented. That's what
happens in this PR.

We want to extend `DESC SCHEMA` by auth and service levels
to provide a safe way to backup and restore those two components.
To realize that, we change the meaning of `DESC SCHEMA WITH INTERNALS`
and add a new "tier": `DESC SCHEMA WITH INTERNALS AND PASSWORDS`.

* `DESC SCHEMA` -- no change, i.e. the statement describes the current
  schema items such as keyspaces, tables, views, UDTs, etc.
* `DESC SCHEMA WITH INTERNALS` -- does the same as the previous tier
  and also describes auth and service levels. No information about
  passwords is returned.
* `DESC SCHEMA WITH INTERNALS AND PASSWORDS` -- does the same
  as the previous tier and also includes information about the salted
  hashes corresponding to the passwords of roles.

To restore existing roles, we extend the `CREATE ROLE` statement
by allowing to use the option `WITH SALTED HASH = '[...]'`.

---

Implementation strategy:

* Add missing things/adjust existing ones that will be used later.
* Implement creating a role with salted hash.
* Add tests for creating a role with salted hash.
* Prepare for implementing describe functionality of auth and service levels.
* Implement describe functionality for elements of auth and service levels.
* Extend the grammar.
* Add tests for describe auth and service levels.
* Add/update documentation.

---

(1): https://opensource.docs.scylladb.com/stable/operating-scylla/procedures/backup-restore/restore.html
In case the link stops working, restoring a schema was realised
by managing raw files on disk.

Fixes scylladb/scylladb#18750
Fixes scylladb/scylladb#18751
Fixes scylladb/scylladb#20711

Closes scylladb/scylladb#20168

* github.com:scylladb/scylladb:
  docs: Update user documentation for backup and restore
  docs/dev: Add documentation for DESC SCHEMA
  test: Add tests for describing auth and service levels
  cql3/functions/user_function: Remove newline character before and after UDF body
  cql3: Implement DESCRIBE SCHEMA WITH INTERNALS AND PASSWORDS
  auth: Implement describing auth
  auth/authenticator: Add member functions for querying password hash
  service/qos/service_level_controller: Describe service levels
  data_dictionary: Remove keyspace_element.hh
  treewide: Start using new overloads of describe
  treewide: Fix indentation in describe functions
  treewide: Return create statement optionally in describe functions
  treewide: Add new describe overloads to implementations of data_dictionary::keyspace_element
  treewide: Start using schema::ks_name() instead of schema::keyspace_name()
  cql3: Refactor `description`
  cql3: Move description to dedicated files
  test: Add tests for `CREATE ROLE WITH SALTED HASH`
  cql3/statements: Restrict CREATE ROLE WITH SALTED HASH
  auth: Allow for creating roles with SALTED HASH
  types: Introduce a function `cql3_type_name_without_frozen()`
  cql3/util: Accept std::string_view rather than const sstring&
2024-09-24 21:44:32 +03:00
Dawid Mędrek
5e1d7f109a docs/dev: Add documentation for DESC SCHEMA
We add documentation for developers addressing
`DESCRIBE SCHEMA`. It covers the following aspects
of it:

* motivation,
* synopsis of the solution,
* implementation of the solution,

as well as a few subsections explaining the details:

* restoring process and its side effects,
* restoring roles with passwords,
* list of statements generated by `DESC SCHEMA`
  with examples,
* implementation details.
2024-09-24 14:18:01 +02:00
Michał Jadwiszczak
d7945eea2a docs/dev/service_levels: replace unspecified workload type with NULL
`unspecified` workload type is an internal value and it's not exposed to
user via CQL.
Default value for workload type from user's perspective is `NULL`.

Fixes scylladb/scylladb#20780
2024-09-24 11:43:29 +03:00
Pavel Emelyanov
eb22c2a8c8 Merge 'reader_concurrency_semaphore: improve the diagnostics dump' from Botond Dénes
* Also dump diagnostics when a read times out while active (not queued).
* Add the "Trigger permit" line, containing the details of the permit which caused the diagnostics dump (by e.g. timing out).
* Add the "Identified bottleneck(s)" line, containing the identified bottlenecks which lead to permits being queued. This line is missing if no such bottleneck can be identified.
* Document the new features, as well as the stat dump, which was added some time ago.

Example of the new dump format:
```
INFO  2024-09-12 08:09:48,046 [shard  0:main] reader_concurrency_semaphore - Semaphore reader_concurrency_semaphore_dump_reader_diganostics with 8/10 count and 106192275/32768 memory resources: timed out, dumping permit diagnostics:
Trigger permit: count=0, memory=0, table=ks.tbl0, operation=mutation-query, state=waiting_for_admission
Identified bottleneck(s): memory

permits count   memory  table/operation/state
3       2       26M     *.*/push-view-updates-2/active
3       2       16M     ks.tbl1/push-view-updates-1/active
1       1       15M     ks.tbl2/push-view-updates-1/active
1       0       13M     ks.tbl1/multishard-mutation-query/active
1       0       12M     ks.tbl0/push-view-updates-1/active
1       1       10M     ks.tbl3/push-view-updates-2/active
1       1       6060K   ks.tbl3/multishard-mutation-query/active
2       1       1930K   ks.tbl0/push-view-updates-2/active
1       0       1216K   ks.tbl0/multishard-mutation-query/active
6       0       0B      ks.tbl1/shard-reader/waiting_for_admission
3       0       0B      *.*/data-query/waiting_for_admission
9       0       0B      ks.tbl0/mutation-query/waiting_for_admission
2       0       0B      ks.tbl2/shard-reader/waiting_for_admission
4       0       0B      ks.tbl0/shard-reader/waiting_for_admission
9       0       0B      ks.tbl0/data-query/waiting_for_admission
7       0       0B      ks.tbl3/mutation-query/waiting_for_admission
5       0       0B      ks.tbl1/mutation-query/waiting_for_admission
2       0       0B      ks.tbl2/mutation-query/waiting_for_admission
8       0       0B      ks.tbl1/data-query/waiting_for_admission
1       0       0B      *.*/mutation-query/waiting_for_admission
26      0       0B      permits omitted for brevity

96      8       101M    total

Stats:
permit_based_evictions: 0
time_based_evictions: 0
inactive_reads: 0
total_successful_reads: 0
total_failed_reads: 0
total_reads_shed_due_to_overload: 0
total_reads_killed_due_to_kill_limit: 0
reads_admitted: 1
reads_enqueued_for_admission: 82
reads_enqueued_for_memory: 0
reads_admitted_immediately: 1
reads_queued_because_ready_list: 0
reads_queued_because_need_cpu_permits: 82
reads_queued_because_memory_resources: 0
reads_queued_because_count_resources: 0
reads_queued_with_eviction: 0
total_permits: 97
current_permits: 96
need_cpu_permits: 0
awaits_permits: 0
disk_reads: 0
sstables_read: 0
```

Fixes: https://github.com/scylladb/scylladb/issues/19535

Improvement, no backport needed.

Closes scylladb/scylladb#20545

* github.com:scylladb/scylladb:
  docs/dev/reader-concurrency-semaphore.md: update the documentation on diagnostics dumps
  test/boost/reader_concurrency_semaphore_test: test the new diagnostics functionality
  reader_concurrency_semaphore: add bottleneck self-diagnosis to diagnosis dump
  reader_concurrency_semaphore: include trigger permit in diagnostic dump
  reader_concurrency_semaphore: propagate permit to do_dump_reader_permit_diagnostics()
  reader_concurrency_semaphore: use consistent exception type for timeout
  reader_concurrency_semaphore: dump diagnostics when non-waiting reader times out
2024-09-18 14:06:05 +03:00
Ernest Zaslavsky
924325fd25 treewide: add "prefix" parameter to backup API
Allow the caller to pass the prefix when performing backup and restore

Fixes scylladb/scylladb#20335

Closes scylladb/scylladb#20413
2024-09-18 08:25:00 +03:00
Botond Dénes
c7c5817808 Merge 'Improve timestamp heuristics for tombstone garbage collection' from Benny Halevy
When purging regular tombstone consult the min_live_timestamp, if available.
This is safe since we don't need to protect dead data from resurrection, as it is already dead.

For shadowable_tombstones, consult the min_memtable_live_row_marker_timestamp,
if available, otherwise fallback to the min_live_timestamp.

If we see in a view table a shadowable tombstone with time T, then in any row where the row marker's timestamp is higher than T the shadowable tombstone is completely ignored and it doesn't hide any data in any column, so the shadowable tombstone can be safely purged without any effect or risk resurrecting any deleted data.

In other words, rows which might cause problems for purging a shadowable tombstone with time T are rows with row markers older or equal T. So to know if a whole sstable can cause problems for shadowable tombstone of time T, we need to check if the sstable's oldest row marker (and not oldest column) is older or equal T. And the same check applies similarly to the memtable.

If both extended timestamp statistics are missing, fallback to the legacy (and inaccurate) min_timestamp.

Fixes scylladb/scylladb#20423
Fixes scylladb/scylladb#20424

> [!NOTE]
> no backport needed at this time
> We may consider backport later on after given some soak time in master/enterprise
> since we do see tombstone accumulation in the field under some materialized views workloads

Closes scylladb/scylladb#20446

* github.com:scylladb/scylladb:
  cql-pytest: add test_compaction_tombstone_gc
  sstable_compaction_test: add mv_tombstone_purge_test
  sstable_compaction_test: tombstone_purge_test: test that old deleted data do not inhibit tombstone garbage collection
  sstable_compaction_test: tombstone_purge_test: add testlog debugging
  sstable_compaction_test: tombstone_purge_test: make_expiring: use next_timestamp
  sstable, compaction: add debug logging for extended min timestamp stats
  compaction: get_max_purgeable_timestamp: use memtable and sstable extended timestamp stats
  compaction: define max_purgeable_fn
  tombstone: can_gc_fn: move declaration to compaction_garbage_collector.hh
  sstables: scylla_metadata: add ext_timestamp_stats
  compaction_group, storage_group, table_state: add extended timestamp stats getters
  sstables, memtable: track live timestamps
  memtable_encoding_stats_collector: update row_marker: do nothing if missing
2024-09-13 08:56:51 +03:00
Botond Dénes
f834ad81e0 docs/dev/reader-concurrency-semaphore.md: update the documentation on diagnostics dumps
The part of the document which explains diagnostics dumps was due for an
update. It was missing an explanation on the dumped stats and it
also needs to explain the "Problematic permit" and "Identified
bottleneck(s)".
2024-09-12 08:31:25 -04:00
Benny Halevy
4de4af954f sstables: scylla_metadata: add ext_timestamp_stats
Store and retrieve the optional extended timestamp statistics
(min_live_timestamp and min_live_row_marker_timestamp)
in the scylla_metadata component.

Note that there is no need for a cluster feature to
store those attributes since the scylla_metadata
on-disk format is extensible so that old sstables
can be read by new versions, seeing the extra stats
is missing, and new sstables can be read by old
versions that ignore unknown scylla metadata section types.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-09-10 19:05:57 +03:00
Botond Dénes
de81388edb Merge 'commitlog: Handle oversized entries' from Calle Wilund
Refs #18161

Yet another approach to dealing with large commitlog submissions.

We handle oversize single mutation by adding yet another entry
typo: fragmented. In this case we only add a fragment (aha) of
the data that needs storing into each entry, along with metadata
to correlate and reconstruct the full entry on replay.

Because these fragmented entries are spread over N segments, we
also need to add references from the first segment in a chain
to the subsequent ones. These are released once we clear the
relevant cf_id count in the base.
                 *
This approach has the downside that due to how serialization etc
works w.r.t. mutations, we need to create an intermediate buffer
to hold the full serialized target entry. This is then incrementally
written into entries of < max_mutation_size, successively requesting
more segments.

On replay, when encountering a fragment chain, the fragment is
added to a "state", i.e. a mapping of currently processing
frag chains. Once we've found all fragments and concatenated
the buffers into a single fragmented one, we can issue a
replay callback as usual.

Note that a replay caller will need to create and provide such
a state object. Old signature replay function remains for tests
and such.

This approach bumps the file format (docs to come).

To ensure "atomicity" we both force synchronization, and should
the whole op fail, we restore segment state (rewinding), thus
discarding data all we wrote.

Closes scylladb/scylladb#19472

* github.com:scylladb/scylladb:
  commitlog/database: Make some commitlog options updatable + add feature listener
  features/config: Add feature for fragmented commitlog entries
  docs: Add entry on commitlog file format v4
  commitlog_test: Add more oversized cases
  commitlog_replayer: Replay segments in order created
  commitlog_replayer: Use replay state to support fragmented entries
  commitlog_replayer: coroutinize partly
  commitlog: Handle oversized entries
2024-09-10 17:15:46 +03:00
Avi Kivity
c3e19425bd Merge 'docs/dev/docker-hub.md: refresh aio-max-nr calculation' from Laszlo Ersek
~~~
What we have today in "docs/dev/docker-hub.md" on "aio-max-nr" dates back
to scylla commit f4412029f4 ("docs/docker-hub.md: add quickstart section
with --smp 1", 2020-09-22). Problems with the current language:

- The "65K" claim as default value on non-production systems is wrong;
  "fs/aio.c" in Linux initializes "aio_max_nr" to 0x10000, which is 64K.

- The section in question uses equal signs (=) incorrectly. The intent was
  probably to say "which means the same as", but that's not what equality
  means.

- In the same section, the relational operator "<" is bogus. The available
  AIO count must be at least as high (>=) as the requested AIO count.

- Clearer names should be used;
  adjust_max_networking_aio_io_control_blocks() in "src/core/reactor.cc"
  sets a great example:

  - "reactor::max_aio" should be called "storage_iocbs",

  - "detect_aio_poll" should be called "preempt_iocbs",

  - "reactor_backend_aio::max_polls" should be called "network_iocbs".

- The specific value 10000 for the last one ("network_iocbs") is not
  correct in scylla's context. It is correct as the Seastar default, but
  scylla has used 50000 since commit 2cfc517874 ("main, test: adjust
  number of networking iocbs", 2021-07-18).

Rewrite the section to address these problems.

See also:
- https://github.com/scylladb/scylladb/issues/5981
- https://github.com/scylladb/seastar/pull/2396
- https://github.com/scylladb/scylladb/pull/19921

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
~~~

No need for backporting; the documentation being refreshed targets developers as audience, not end-users.

Closes scylladb/scylladb#20398

* github.com:scylladb/scylladb:
  docs/dev/docker-hub.md: refresh aio-max-nr calculation
  docs/dev/docker-hub.md: strip trailing whitespace
2024-09-09 15:04:38 +03:00
Laszlo Ersek
53524974db docs/dev/maintainer.md: clarify "Updating submodule references"
Before the introduction of "scripts/refresh-submodules.sh", there was
indeed some manual work for the maintainer to do, hence "publish your
work" must have sounded correct. Today, the phrase "publish your work"
sounds confusing.

Commit 71da4e6e79 ("docs: Document sync-submodules.sh script in
maintainer.md", 2020-06-18) should have arguably reworded the last step of
the submodule refresh procedure; let's do it now.

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>

Closes scylladb/scylladb#20333
2024-09-05 13:57:32 +03:00
Calle Wilund
9bf452c7a0 docs: Add entry on commitlog file format v4 2024-09-03 16:38:28 +00:00
Laszlo Ersek
cd0819e3ed docs/dev/docker-hub.md: refresh aio-max-nr calculation
What we have today in "docs/dev/docker-hub.md" on "aio-max-nr" dates back
to scylla commit f4412029f4 ("docs/docker-hub.md: add quickstart section
with --smp 1", 2020-09-22). Problems with the current language:

- The "65K" claim as default value on non-production systems is wrong;
  "fs/aio.c" in Linux initializes "aio_max_nr" to 0x10000, which is 64K.

- The section in question uses equal signs (=) incorrectly. The intent was
  probably to say "which means the same as", but that's not what equality
  means.

- In the same section, the relational operator "<" is bogus. The available
  AIO count must be at least as high (>=) as the requested AIO count.

- Clearer names should be used;
  adjust_max_networking_aio_io_control_blocks() in "src/core/reactor.cc"
  sets a great example:

  - "reactor::max_aio" should be called "storage_iocbs",

  - "detect_aio_poll" should be called "preempt_iocbs",

  - "reactor_backend_aio::max_polls" should be called "network_iocbs".

- The specific value 10000 for the last one ("network_iocbs") is not
  correct in scylla's context. It is correct as the Seastar default, but
  scylla has used 50000 since commit 2cfc517874 ("main, test: adjust
  number of networking iocbs", 2021-07-18).

Rewrite the section to address these problems.

See also:
- https://github.com/scylladb/scylladb/issues/5981
- https://github.com/scylladb/seastar/pull/2396
- https://github.com/scylladb/scylladb/pull/19921

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-09-03 12:10:59 +02:00
Laszlo Ersek
15738d14ce docs/dev/docker-hub.md: strip trailing whitespace
Strip trailing whitespace.

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>
2024-09-03 12:00:28 +02:00
Kefu Chai
28b5471c01 docs/dev/maintainer.md: fix formatting
* in the "Backporting Seastar commits" section, there's a single quote
  instead of a backtick in this line, so fix it.
* add backticks around `refresh-submodules.sh`, which is a filename.
* correct the command line setting a git config option, because `git-config`
  does not support this command line syntax,

  ```console
  $ git config --global diff.conflictstyle = diff3
  $ git config --global get diff.conflictstyle
  =
  $ git config --global diff.conflictstyle diff3
  $ git config --global get diff.conflictstyle
  diff3
  ```

  quote from git-config(1)

  > ```
  > git config set [<file-option>] [--type=<type>] [--all] [--value=<value>] [--fixed-value] <name> <value>
  > ```
* stop using the deprecated mode of the `git-config` command, and use
  subcommand instead. as git-config(1) puts:

  > git config <name> <value> [<value-pattern>]
  >   Replaced by git config set [--value=<pattern>] <name> <value>.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20328
2024-09-01 22:24:01 +03:00
Aleksandra Martyniuk
3d78172328 api: task_manager: add operation to get ttl 2024-08-29 13:53:39 +02:00
Pavel Emelyanov
245cc852dd docs: Document the new backup method
Add the new /storage_service/backup endpoint to object_storage.md as yet
another way to use S3 from Scylla.
2024-08-22 19:47:06 +03:00
Pavel Emelyanov
8949d73cd9 docs: Update object_storage.md with AWS_ environment
Commit 51c53d8db6 made it possible to configure object storage endpoint
creds via environment. Mention this in the docs.
2024-08-22 14:08:21 +03:00
Pavel Emelyanov
d3f9865d2f docs: Restructure object_storage.md
Currently the doc assumes that object storage can only be used to keep
sstables on it. It's going to change, restructure the doc to allow for
more usage scenarios.
2024-08-22 14:08:21 +03:00
Łukasz Paszkowski
f4ca734ccb reverse-reads.md: Drop legacy reverse format information 2024-08-13 10:07:12 +02:00
Pavel Emelyanov
f0f28cf685 docs: Extend debugging with info about exploring ELF notes
When debugging coredumps some (small, but useful) information is hidden
in the notes of the core ELF file. Add some words about it exists, what
it includes and the thing that is always forgotten -- the way to get one

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#19962
2024-08-05 09:49:52 +03:00
Kefu Chai
35394c3f9a docs/dev: fix a typo
remove the extraneous "is".

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19902
2024-07-30 10:46:25 +03:00
Aleksandra Martyniuk
d04159e7de docs: describe virtual tasks 2024-07-23 13:35:02 +02:00
Avi Kivity
3fc4e23a36 forward_service: rename to mapreduce_service
forward_service is nondescriptive and misnamed, as it does more than
forward requests. It's a classic map/reduce algorithm (and in fact one
of its parameters is "reducer"), so name it accordingly.

The name "forward" leaked into the wire protocol for the messaging
service RPC isolation cookie, so it's kept there. It's also maintained
in the name of the logger (for "nodetool setlogginglevel") for
compatibility with tests.

Closes scylladb/scylladb#19444
2024-07-03 19:29:47 +03:00
Andrei Chekun
b6aabca9a7 Add documentation how to use allure reporting
Add documentation how to install and basic usage example of the allure reporting tool.
Fix typo test/README.md

Related: scylladb/qa-tasks#1665

Depends on: scylladb/scylladb#18169

Closes scylladb/scylladb#18710
2024-07-01 16:21:50 +02:00
Avi Kivity
fdc1449392 treewide: rename flat_mutation_reader_v2 to mutation_reader
flat_mutation_reader_v2 was introduced in a pair of commits in 2021:

  e3309322c3 "Clone flat_mutation_reader related classes into v2 variants"
  08b5773c12 "Adapt flat_mutation_reader_v2 to the new version of the API"

as a replacement for flat_mutation_reader, using range_tombstone_change
instead of range_tombstone to represent represent range tombstones. See
those commits for more information.

The transition was incremental; the last use of the original
flat_mutation_reader was removed in 2022 in commit

  026f8cc1e7 "db: Use mutation_partition_v2 in mvcc"

In turn, flat_mutation_reader was introduced in 2017 in commit

  748205ca75 "Introduce flat_mutation_reader"

To transition from a mutation_reader that nested rows within
a partition in a separate stream, to a flat reader that streamed
partitions and rows in the same stream.

Here, we reclaim the original name and rename the awkward
flat_mutation_reader_v2 to mutation_reader.

Note that mutation_fragment_v2 remains since we still use the original
for compatibilty, sometimes.

Some notes about the transition:

 - files were also renamed. In one case (flat_mutation_reader_test.cc), the
   rename target already existed, so we rename to
    mutation_reader_another_test.cc.

 - a namespace 'mutation_reader' with two definitions existed (in
   mutation_reader_fwd.hh). Its contents was folded into the mutation_reader
   class. As a result, a few #includes had to be adjusted.

Closes scylladb/scylladb#19356
2024-06-21 07:12:06 +03:00
Kefu Chai
ad649be1bf treewide: drop thrift support
thrift support was deprecated since ScyllaDB 5.2

> Thrift API - legacy ScyllaDB (and Apache Cassandra) API is
> deprecated and will be removed in followup release. Thrift has
> been disabled by default.

so let's drop it. in this change,

* thrift protocol support is dropped
* all references to thrift support in document are dropped
* the "thrift_version" column in system.local table is
  preserved for backward compatibility, as we could load
  from an existing system.local table which still contains
  this clolumn, so we need to write this column as well.
* "/storage_service/rpc_server" is only preserved for
  backward compatibility with java-based nodetool.
* `rpc_port` and `start_rpc` options are preserved, but
  they are marked as "Unused". so that the new release
  of scylladb can consume existing scylla.yaml configurations
  which might contain these settings. by making them
  deprecated, user will be able get warned, and update
  their configurations before we actually remove them
  in the next major release.

Fixes #3811
Fixes #18416
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-07 06:44:59 +08:00
Aleksandra Martyniuk
beef77a778 docs: describe task folding 2024-05-31 10:40:04 +02:00
Aleksandra Martyniuk
8a72324ff1 docs: add docs to task manager
Closes scylladb/scylladb#18967
2024-05-30 09:05:02 +03:00
Pavel Emelyanov
e74a4b038f Merge 'tablets: alter keyspace' from Piotr Smaron
This change supports changing replication factor in tablets-enabled keyspaces.
This covers both increasing and decreasing the number of tablets replicas through
first building topology mutations (`alter_keyspace_statement.cc`) and then
tablets/topology/schema mutations (`topology_coordinator.cc`).
For the limitations of the current solution, please see the docs changes attached to this PR.

Fixes: #16129

Closes scylladb/scylladb#16723

* github.com:scylladb/scylladb:
  test: Do not check tablets mutations on nodes that don't have them
  test: Fix the way tablets RF-change test parses mutation_fragments
  test/tablets: Unmark RF-changing test with xfail
  docs: document ALTER KEYSPACE with tablets
  Return response only when tablets are reallocated
  cql-pytest: Verify RF is changes by at most 1 when tablets on
  cql3/alter_keyspace_statement: Do not allow for change of RF by more than 1
  Reject ALTER with 'replication_factor' tag
  Implement ALTER tablets KEYSPACE statement support
  Parameterize migration_manager::announce by type to allow executing different raft commands
  Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks
  Extend system.topology with 3 new columns to store data required to process alter ks global topo req
  Allow query_processor to check if global topo queue is empty
  Introduce new global topo `keyspace_rf_change` req
  New raft cmd for both schema & topo changes
  Add storage service to query processor
  tablets: tests for adding/removing replicas
  tablet_allocator: make load_balancer_stats_manager configurable by name
2024-05-29 14:17:51 +03:00
Piotr Smaron
59d3fd615f Extend system.topology with 3 new columns to store data required to process alter ks global topo req
Because ALTER KS will result in creating a global topo req, we'll have
to pass the req data to topology coordinator's state machine, and the
easiest way to do it is through sytem.topology table, which is going to
be extended with 3 extra columns carrying all the data required to
execute ALTER KS from within topology coordinator.
2024-05-28 13:55:11 +02:00
Kefu Chai
61b5bfae6d docs: fix typos in dev documents
these typos were identified by codespell.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18871
2024-05-27 12:28:34 +03:00
Marcin Maliszkiewicz
2ab143fb40 db: auth: move auth tables to system keyspace
Separate keyspace which also behaves as system brings
little benefit while creating some compatibility problems
like schema digest mismatch during rollback. So we decided
to move auth tables into system keyspace.

Fixes https://github.com/scylladb/scylladb/issues/18098

Closes scylladb/scylladb#18769
2024-05-26 22:30:42 +03:00
Nadav Har'El
dcd26d8a16 Merge 'docs: update isolation.md' from Botond Dénes
Update `docs/dev/isolation.d`:
* Update the list of scheduling groups
* Remove IO priority groups (they were folded into scheduling groups)
* Add section on RPC isolation

Closes scylladb/scylladb#18749

* github.com:scylladb/scylladb:
  docs: isolation.md: add section on RPC call isolation
  docs: isolation.md: remove mention of IO priority groups
  docs: isolation.md: update scheduling group list, add aliases
2024-05-21 11:46:57 +03:00
Botond Dénes
11fa79a537 docs: isolation.md: add section on RPC call isolation 2024-05-21 03:12:22 -04:00
Pavel Emelyanov
159e44d08a test.py: Make it possible to avoid wildcard test names matching
There's a nasty scenario when this searching plays bad joke.

When CI picks up a new branch and notices, that a test had changed, it
spawns a custom job with test.py --repeat 100 $changed_test_name in
it. Next, when the test.py tries opt-in test name matching, it uses the
wildcard search and can pick up extra unwanted tests into the run.

To solve this, the case-selection syntax is extended. Now if the caller
specifies `suite/test::*` as test, the test file is selected by exact
name match, but the specific test-case is not selected, the `*` makes it
run all cases.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18704
2024-05-20 15:50:47 +02:00
Botond Dénes
936a7e282b docs: isolation.md: remove mention of IO priority groups
They were folded into CPU scheduling groups, which now apply to both CPU
and IO.
2024-05-20 03:33:24 -04:00
Botond Dénes
8f61468322 docs: isolation.md: update scheduling group list, add aliases 2024-05-20 03:30:04 -04:00
Tomasz Grabiec
eb3a22d5a8 docs: Document tablet sharding vs tablet replica placement 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
10a4903d0c dht: Deprecate old sharder API: shard_of/next_shard/token_for_next_shard
Require users to specify whether we want shard for reads or for writes
by switching to appropriate non-deprecated variant.

For example, shard_of() can be replaced with shard_for_reads() or
shard_for_writes().

The next_shard/token_for_next_shard APIs have only for-reads variant,
and the act of switching will be a testimony to the fact that the code
is valid for intra-node migration.
2024-05-16 00:28:47 +02:00
Tomasz Grabiec
6946ad2a45 sharding: Prepare for intra-node-migration
Tablet sharder is adjusted to handle intra-migration where a tablet
can have two replicas on the same host. For reads, sharder uses the
read selector to resolve the conflict. For writes, the write selector
is used.

The old shard_of() API is kept to represent shard for reads, and new
method is introduced to query the shards for writing:
shard_for_writes(). All writers should be switched to that API, which
is not done in this patch yet.

The request handler on replica side acts as a second-level
coordinator, using sharder to determine routing to shards. A given
sharder has a scope of a single topology version, a single
effective_replication_map_ptr, which should be kept alive during
writes.
2024-05-16 00:28:46 +02:00
Tomasz Grabiec
b5bb46357b docs: Document sharder use for tablets 2024-05-16 00:28:46 +02:00
Tomasz Grabiec
82b34d34d8 tablets: Introduce tablet transition kind for intra-node migration
We need a separate transition kind for intra node migration so that we
don't have to recover this information from replica set in an
expensive way. This information is needed in the hot path - in
effective_replicaiton_map, to not return the pending tablet replica to
the coordinator. From its perspective, replica set is not
transitional.

The transition will also be used to alter the behavior of the
sharder. When not in intra-node migration, the sharder should
advertise the shard which is either in the previous or next replica
set. During intra-node migration, that's not possible as there may be
two such shards. So it will return the shard according to the current
read selector.
2024-05-16 00:28:46 +02:00
Aleksandra Martyniuk
561fb1dd09 service: move to cleanup stage if allow_write_both_read_old fails
If allow_write_both_read_old tablet transition stage fails, move
to cleanup_target stage before reverting migration.

It's a preparation for further patches which deallocate storage
group of a tablet during cleanup.
2024-05-10 14:56:38 +02:00
Yaniv Michael Kaul
124064844f docs/dev/object_stroage.md: convert example AWS keys to be more innocent
Someone thought that they actually represent real keys (the 'EXAMPLE' in their name was not enough).
Converted them to be as clear as can be, example data.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#18565
2024-05-09 08:26:43 +03:00
Botond Dénes
96a7ed7efb Merge 'sstables: add dead row count when issuing warning to system.large_partitions' from Ferenc Szili
This is the second half of the fix for issue #13968. The first half is already merged with PR #18346

Scylla issues warnings for partitions containing more rows than a configured threshold. The warning is issued by inserting a row into the `system.large_partitions` table. This row contains the information about the partition for which the warning is issued: keyspace, table, sstable, partition key and size, compaction time and the number of rows in the partition. A previous PR #18346 also added range tombstone count to this row.

This change adds a new counter for dead rows to the large_partitions table.

This change also adds cluster feature protection for writing into these new counters. This is needed in case a cluster is in the process of being upgraded to this new version, after which an upgraded node writes data with the new schema into `system.large_partitions`, and finally a node is then rolled back to an old version. This node will then revert the schema to the old version, but the written sstables will still contain data with the new counters, causing any readers of this table to throw errors when they encounter these cells.

This is an enhancement, and backporting is not needed.

Fixes #13968

Closes scylladb/scylladb#18458

* github.com:scylladb/scylladb:
  sstable: added test for counting dead rows
  sstable: added docs for system.large_partitions.dead_rows
  sstable: added cluster feature for dead rows and range tombstones
  sstable: write dead_rows count to system.large_partitions
  sstable: added counter for dead rows
2024-05-09 08:26:43 +03:00
Ferenc Szili
8e9771d010 sstable: added docs for system.large_partitions.dead_rows 2024-05-07 15:44:33 +02:00