Commit Graph

38442 Commits

Author SHA1 Message Date
Patryk Jędrzejczak
b044ee535f test: add test_raft_ignore_nodes
We add two tests verifying that --ignore-dead-nodes in
raft_removenode and --ignore-dead-nodes-for-replace in
raft_replace are handled correctly.

We need a 7-cluster to have a Raft majority. Therefore, these
tests are quite slow, and we want to run them only in the dev mode.
2023-08-22 14:19:21 +02:00
Patryk Jędrzejczak
6818d13f7d test: ManagerClient.remove_node: allow List[HostId] for ignore_dead
ManagerClient.remove_node allows passing ignore_dead only as
List[IPAddress]. However, raft_removenode currently supports
only host ids. To write a test that passes ignore_dead to
ManagerClient.remove_node in the Raft topology mode, we allow
passing ignore_dead as List[HostId].

Note that we don't want to use List[IPAddress | HostId] because
mixing IP addresses and host ids fails anyway. See
ss::remove_node.set(...) in api::set_storage_service.
2023-08-22 14:19:09 +02:00
Patryk Jędrzejczak
26ad527666 raft topology: pass ignore_nodes to {replace, remove}_with_repair
To properly stream ranges during the removenode or replace
operation in the Raft topology mode, we pass IPs of the ignored
nodes to replace_with_repair and remove_with_repair in
storage_service::raft_topology_cmd_handler.
2023-08-22 14:18:39 +02:00
Patryk Jędrzejczak
e685182290 raft topology: exec_global_command: add ignore_nodes to exclude_nodes
We add ignore_nodes to exclude_nodes in exec_global_command
to ignore nodes marked as dead by --ignore-dead-nodes for
raft_removenode and --ignore-dead-nodes-for-replace for
raft_replace.
2023-08-22 14:18:37 +02:00
Patryk Jędrzejczak
5ebee35f99 raft topology: exec_global_command: change type of exclude_nodes
We extend exclude_nodes in exec_global_command with ignore_nodes
in the next commit. Since we already use std::unordered_set to
store ids of the ignored nodes and their number is unknown, we
change the type of exclude_nodes from utils::small_vector to
std::unordered_set.
2023-08-22 14:17:55 +02:00
Patryk Jędrzejczak
1f57d80ba1 topology_state_machine: extend request_param with a set of raft ids
We add two new alternative types to service::request_param:
removenode_param and replace_param. They allow storing the list
of ignored nodes loaded from the ignore_nodes column of
system.topology. We also remove the raft::server_id type because
it has been only used by the replace operation.
2023-08-22 14:17:37 +02:00
Patryk Jędrzejczak
7d3dc306eb raft topology: set ignore_nodes in raft_removenode and raft_replace
To handle --ignore-dead-nodes in raft_removenode and
--ignore-dead-nodes-for-replace in raft_replace, we set the
ignore_nodes value of the topology mutation in these functions. In
the following commits, we ensure that the topology coordinator
properly makes use of it.
2023-08-22 14:13:51 +02:00
Patryk Jędrzejczak
0beabdc6ba utils: introduce split_comma_separated_list
Three places handle comma-separated lists similarly:
- ss::remove_node.set(...) in api::set_storage_service,
- storage_service::parse_node_list,
- storage_service::is_repair_based_node_ops_enabled.
In the next commit, the fourth place that needs the same logic
appears -- storage_service::raft_replace. It needs to load
and parse the --ignore-dead-nodes-for-replace param from config.

Moreover, the code in is_repair_based_node_ops_enabled is
different and doesn't seem right. We swap '\"' and '\'' with ' '
but don't do anything with it afterward.

To avoid code duplication and fix is_repair_based_node_ops_enabled,
we introduce the new function utils::split_comma_separated_list.

This change has a small side effect on logging. For example,
ignore_nodes_strs in storage_service::parse_node_list might be
printed in a slightly different form.
2023-08-22 10:30:36 +02:00
Patryk Jędrzejczak
16f5db8af2 raft topology: add the ignore_nodes column to system.topology
In the following commits, we add support for --ignore-dead-nodes
in raft_removenode and --ignore-dead-nodes-for-replace in
raft_replace. To make these request parameters accessible for the
topology coordinator, we store them in the new ignore_nodes
column of system.topology.
2023-08-22 10:30:12 +02:00
Avi Kivity
a4e7f9bed0 docs: cql: split DML page into one page per statement
The DML page is quite long (21 screenfuls on my monitor); split
it into one page per statement to make it more digestible.

The sections that are common to multiple statement are kept
in the main DML page, and references to them are added.

Closes #15053
2023-08-20 17:14:32 +03:00
Kefu Chai
12d6ec5a18 config: respect --log-with-color 1
scylladb overrides some of seastar logging related options with its
own options by applying them with `logging::apply_settings()`. but
we fail to inherit `with_color` from Seastar as we are using the
designated initializer, so the unspecified members are zero initialized.
that's why we always have logging message in black and white even
if scylla is running in a tty and `--log-with-color 1` is specified.

so, make the debugging life more colorful, let's inherit the option
from Seastar, and apply it when setting logging related options.

see also 29e09a3292

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15076
2023-08-20 13:47:43 +03:00
Tomasz Grabiec
bd8bb5d4b1 Merge 'Wire tablet into compaction group' from Raphael "Raph" Carvalho
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.

There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.

```
INFO  2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO  2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf
```

Closes #14863

* github.com:scylladb/scylladb:
  Kill scylla option to configure number of compaction groups
  replica: Wire tablet into compaction group
  token_metadata: Add this_host_id to topology config
  replica: Switch to chunked_vector for storing compaction groups
  replica: Generate group_id for compaction_group on demand
2023-08-18 15:17:17 +02:00
Avi Kivity
1901475598 Merge 'config: mark "experimental" option unused and cleanups' from Kefu Chai
in this series, the "experimental" option is marked `Unused` as it has been marked deprecated for almost 2 years since scylla 4.6. and use `experimental_features` to specify the used experimental features explicitly.

Closes #14948

* github.com:scylladb/scylladb:
  config: remove unused namespace alias
  config: use std::ranges when appropriate
  config: drop "experimental" option
  test: disable 'enable_user_defined_functions' if experimental_features does not include udf
  test: pylib: specify experimental_features explicitly
2023-08-17 20:42:02 +03:00
Kefu Chai
7275b8967c docs: add sstablemetadata to operating-scylla/admin-tools
to note that sstablemetadata is being deprecated and encourage
user to switch over to the native tools.

Fixes #15020
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15040
2023-08-17 18:48:46 +03:00
Avi Kivity
e91256a621 Merge 'build: cmake: fix the build of rpm/deb from submodules' from Kefu Chai
in this series, the build of rpm and deb from submodules is fixed:

1. correct the path of reloc package
2. add the dependency of reloc package to deb/rpm build targets

Closes #15062

* github.com:scylladb/scylladb:
  build: cmake: correct reloc_pkg's path
  build: cmake: build rpm/deb from reloc_pkg
2023-08-17 17:58:49 +03:00
Pavel Emelyanov
3ed5b00ba2 Merge 's3/client: generate config file for tests and cleanups' from Kefu Chai
before this change, object_store/test_basic.py create a config file
for specifying the object storage settings, and pass the path of this
file as the argument of `--object-storage-config-file` option when
running scylla. we have the same requirement when testing scylla
with minio server, where we launch a minio server and manually
create a the config file and feed it to scylla.

to ease the preparation work, let's consolidate by creating the
config file in `minio_server.py`, so it always creates the config
file and put it in its tempdir. since object_store/test_basic.py
can also run against an S3 bucket, the fixture implemented
object_store/conftest.py is updated accordingly to reuse the
helper exposed by MinioServer to create the config file when it
is not available.

Closes #15064

* github.com:scylladb/scylladb:
  s3/client: avoid hardwiring env variables names
  s3/client: generate config file for tests
2023-08-17 16:39:23 +03:00
Gleb Natapov
4ffc39d885 cql3: Extend the scope of group0_guard during DDL statement execution
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.

Fixes: #13942

Message-ID: <ZNsynXayKim2XAFr@scylladb.com>
2023-08-17 15:52:48 +03:00
Kefu Chai
6788903fd6 db: config: mark config class final
in 34c3688017, we added a virtual function
to `config_file`, and we new and delete pointer pointing to a
`db::config` instance with `unique_ptr<>`. this makes the compiler
nervous, as deleting a pointer pointing to an instance of non-final
class with virtual function could lead to leak, if this pointer actually
points to a derived class of this non-final class. so, in order to
silence the warning and to prevent potential problem in future, let's
mark `db::config` final.

the warning from Clang 16 looks like:

```
In file included from /home/kefu/dev/scylladb/test/lib/test_services.cc:10:
In file included from /home/kefu/dev/scylladb/test/lib/test_services.hh:25:
In file included from /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/memory:78:
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: error: delete called on non-final 'db::config' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor]
        delete __ptr;
        ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<db::config>::operator()' requested here
          get_deleter()(std::move(__ptr));
          ^
/home/kefu/dev/scylladb/test/lib/test_services.cc:189:16: note: in instantiation of member function 'std::unique_ptr<db::config>::~unique_ptr' requested here
    auto cfg = std::make_unique<db::config>();
               ^
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15071
2023-08-17 13:43:16 +03:00
Kefu Chai
fc6b8d4040 s3/client: avoid hardwiring env variables names
instead of hardwiring the names in multiple places, let's just
keep them in a single place as variables, and reference them by
these variables instead of their values.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-17 16:06:55 +08:00
Kefu Chai
ec7fa3628c s3/client: generate config file for tests
before this change, object_store/test_basic.py create a config file
for specifying the object storage settings, and pass the path of this
file as the argument of `--object-storage-config-file` option when
running scylla. we have the same requirement when testing scylla
with minio server, where we launch a minio server and manually
create a the config file and feed it to scylla.

to ease the preparation work, let's consolidate by creating the
config file in `minio_server.py`, so it always creates the config
file and put it in its tempdir. since object_store/test_basic.py
can also run against an S3 bucket, the fixture implemented
object_store/conftest.py is updated accordingly to reuse the
helper exposed by MinioServer to create the config file when it
is not available.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-17 16:06:55 +08:00
Raphael S. Carvalho
b578d6643f Kill scylla option to configure number of compaction groups
The option was introduced to bootstrap the project. It's still
useful for testing, but that translates into maintaining an
additional option and code that will not be really used
outside of testing. A possible option is to later map the
option in boost tests to initial_tablets, which may yield
the same effect for testing.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:53 -03:00
Raphael S. Carvalho
cc60598368 replica: Wire tablet into compaction group
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.

There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.

INFO  2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO  2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf

There's a need for compaction_group_manager, as table will still support
"tabletless" mode, and we don't want to sprinkle ifs here and there,
to support both modes. It's not really a manager (it's not even supposed
to store a state), but I couldn't find a better name.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:53 -03:00
Raphael S. Carvalho
5d1f60439a token_metadata: Add this_host_id to topology config
The motivation is that token_metadata::get_my_id() is not available
early in the bootstrap process, as raft topology is pulled later
than new tables are registered and created, and this node is added
to topology even later.

To allow creation of compaction groups to retrieve "my id" from
token metadata early, initialization will now feed local id
into topology config which is immutable for each node anyway.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:44 -03:00
Piotr Smaroń
34c3688017 db: config: add live_updatable_config_params_changeable_via_cql option
If `live_updatable_config_params_changeable_via_cql` is set to true, configuration parameters defined with `liveness::LiveUpdate` option can be updated in the runtime with CQL, i.e. by updating `system.config` virtual table.
If we don't want any configuration parameter to be changed in the
runtime by updating `system.config` virtual table, this option should be
set to false. This option should be set to false for e.g. cloud users,
who can only perform CQL queries, and should not be able to change
scylla's configuration on the fly.

Current implemenatation is generic, but has a small drawback - messages
returned to the user can be not fully accurate, consider:
```
cqlsh> UPDATE system.config SET value='2' WHERE name='task_ttl_in_seconds';
WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="option is not live-updateable" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
```
where `task_ttl_in_seconds` has been defined with
`liveness::LiveUpdate`, but because `live_updatable_config_params_changeable_via_cql` is set to
`false` in `scylla.yaml,` `task_ttl_in_seconds` cannot be modified in the
runtime by updating `system.config` virtual table.

Fixes #14355

Closes #14382
2023-08-16 17:56:27 +03:00
Aleksandra Martyniuk
e9d94894f1 compaction: release resources of compaction executors
Before compaction task executors started inheriting from
compaction_task_impl, they were destructed immediately after
compaction finished. Destructors of executors and their
fields performed actions that affected global structures and
statistics and had impact on compaction process.

Currently, task executors are kept in memory much longer, as their
are tracked by task manager. Thus, destructors are not called just
after the compaction, which results in compaction stats not being
updated, which causes e.g. infinite cleanup loop.

Add release_resources() method which is called at the end
of compaction process and does what destructors used to.

Fixes: #14966.
Fixes: #15030.

Closes #15005
2023-08-16 15:51:17 +03:00
Kefu Chai
564522c4a8 s3/test: remove tempdir if log does not exists
should have been use `ignore_errors=True` to ignore
the error. this issue has not poped up, because
we haven't run into the case where the log file
does not exist.

this was a regression introduced by
d4ee84ee1e

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15063
2023-08-16 15:11:00 +03:00
Kefu Chai
32c26624bf build: cmake: correct reloc_pkg's path
before this change, the filename in path of reloc package looks like:
tools-scylla-5.4.0~dev-0.20230816.2eb6dc57297e.noarch.tar.gz
but it should have been:
scylla-tools-5.4.0~dev-0.20230816.2eb6dc57297e.noarch.tar.gz
so, when repackaging the reloc tarball to rpm / deb, the scripts
just fails to find the reloc tarball and fail.

after this change, the filename is corrected to match with the one
generated using `build_reloc.sh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-16 16:15:23 +08:00
Kefu Chai
a19c7fa8d5 build: cmake: build rpm/deb from reloc_pkg
before this change, dist-${name}-rpm and dist-${name}-deb targets
do not depend on the corresponding reloc pkg from which these
prebuilt packages are created. so these scripts fail if the reloc
package does not exist.

to address this problem, the reloc package is added as their
dependencies.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-16 16:15:23 +08:00
Avi Kivity
e8f3b073c3 Merge 'Maintain sstable state explicitly' from Pavel Emelyanov
An sstable can be in one of several states -- normal, quarantined, staging, uploading. Right now this "state" is hard-wired into sstable's path, e.g. quarantined sstable would sit in e.g. /var/lib/data/ks-cf-012345/quarantine/ directory. Respectively, there's a bunch of directory names constexprs in sstables.hh defining each "state". Other than being confusing, this approach doesn't work well with S3 backend. Additionally, there's snapshot subdir that adds to the confusion, because snapshot is not quite a state.

This PR converts "state" from constexpr char* directories names into a enum class and patches the sstable creation, opening and state-changing API to use that enum instead of parsing the path.

refs: #13017
refs: #12707

Closes #14152

* github.com:scylladb/scylladb:
  sstable/storage: Make filesystem storage with initial state
  sstable: Maintain state
  sstable: Make .change_state() accept state, not directory string
  sstable: Construct it with state
  sstables_manager: Remove state-less make_sstable()
  table: Make sstables with required state
  test: Make sstables with upload state in some cases
  tools: Make sstables with normal state
  table: Open-code sstables making streaming helpers
  tests: Make sstables with normal state by default
  sstable_directory: Make sstable with required state
  sstable_directory: Construct with state
  distributed_loader: Make sstable with desired state when populating
  distributed_loader: Make sstable with upload state when uploading
  sstable: Introduce state enum
  sstable_directory: Merge verify and g.c. calls
  distributed_loader: Merge verify and gc invocations
  sstable/filesystem: Put underscores to dir members
  sstable/s3: Mark make_s3_object_name() const
  sstable: Remove filename(dir, ...) method
2023-08-15 17:44:06 +03:00
Avi Kivity
5949623e0d Merge 'sstable_set: maintain bytes on disk' from Benny Halevy
and use that in compaction_group, rather than
respective accumulators of its own.

This is part of of larger series to make cache updates exception safe.

Refs #14043

Closes #15052

* github.com:scylladb/scylladb:
  sstable_set: maintain total bytes_on_disk
  sstable_set: insert, erase: return status
2023-08-15 17:32:12 +03:00
Kefu Chai
64ed0127d7 s3/client: retry if minio server fails to start
there is a small time window after we find a free port and before
the minio server listens on that port, if another server sneaked
in the time window and listen on that port, minio server can
still fail to start even there might be free port for it.

so, in this change, we just retry with a random port for a fixed
number of times until the minio server is able to serve.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15042
2023-08-15 16:17:47 +03:00
Raphael S. Carvalho
d3f71ae4ee replica: Switch to chunked_vector for storing compaction groups
We aim for a large number of tablets, therefore let's switch
to chunked_vector to avoid large contiguous allocs.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-15 09:04:05 -03:00
Raphael S. Carvalho
2590eec352 replica: Generate group_id for compaction_group on demand
There are a few good reasons for this change.
1) compaction_group doesn't have to be aware of # of groups
2) thinking forward to dynamic tablets, # of groups cannot be
statically embedded in group id, otherwise it gets stale.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-15 09:04:05 -03:00
Raphael S. Carvalho
9400b79658 gce_snitch: Fix use-after-move in load_config()
The use-after-move is not very harmful as it's only used when
handling exception. So user would be left with a bogus message.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #15054
2023-08-15 10:23:57 +03:00
Kefu Chai
c82f1d2f57 tools/scylla-sstable: dump column_desc as an object
before this change, `scylla sstable dump-statistics` prints the
"regular_columns" as a list of strings, like:

```
        "regular_columns": [
          "name",
          "clustering_order",
          "type_name",
          "org.apache.cassandra.db.marshal.UTF8Type",
          "name",
          "column_name_bytes",
          "type_name",
          "org.apache.cassandra.db.marshal.BytesType",
          "name",
          "kind",
          "type_name",
          "org.apache.cassandra.db.marshal.UTF8Type",
          "name",
          "position",
          "type_name",
          "org.apache.cassandra.db.marshal.Int32Type",
          "name",
          "type",
          "type_name",
          "org.apache.cassandra.db.marshal.UTF8Type"
        ]
```

but according
https://opensource.docs.scylladb.com/stable/operating-scylla/admin-tools/scylla-sstable.html#dump-statistics,

> $SERIALIZATION_HEADER_METADATA := {
>     "min_timestamp_base": Uint64,
>     "min_local_deletion_time_base": Uint64,
>     "min_ttl_base": Uint64",
>     "pk_type_name": String,
>     "clustering_key_types_names": [String, ...],
>     "static_columns": [$COLUMN_DESC, ...],
>     "regular_columns": [$COLUMN_DESC, ...],
> }
>
> $COLUMN_DESC := {
>     "name": String,
>     "type_name": String
> }

"regular_columns" is supposed to be a list of "$COLUMN_DESC".
the same applies to "static_columnes". this schema makes sense,
as each column should be considered as a single object which
is composed of two properties. but we dump them like a list.

so, in this change, we guard each visit() call of `json_dumper()`
with `StartObject()` and `EndObject()` pair, so that each column
is printed as an object.

after the change, "regular_columns" are printed like:
```
        "regular_columns": [
          {
            "name": "clustering_order",
            "type_name": "org.apache.cassandra.db.marshal.UTF8Type"
          },
          {
            "name": "column_name_bytes",
            "type_name": "org.apache.cassandra.db.marshal.BytesType"
          },
          {
            "name": "kind",
            "type_name": "org.apache.cassandra.db.marshal.UTF8Type"
          },
          {
            "name": "position",
            "type_name": "org.apache.cassandra.db.marshal.Int32Type"
          },
          {
            "name": "type",
            "type_name": "org.apache.cassandra.db.marshal.UTF8Type"
          }
        ]
```

Fixes #15036
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15037
2023-08-15 08:22:51 +03:00
Avi Kivity
d57a951d48 Revert "cql3: Extend the scope of group0_guard during DDL statement execution"
This reverts commit 70b5360a73. It generates
a failure in group0_test .test_concurrent_group0_modifications in debug
mode with about 4% probability.

Fixes #15050
2023-08-15 00:26:45 +03:00
Patryk Jędrzejczak
e7077da12d replica: reduce the size limit of the schema commitlog
The size of the schema commitlog is incorrectly set to 10 TB. To
avoid wasting space, we reduce it to 2 * schema commitlog segment
size.

Closes #14946
2023-08-14 20:41:15 +02:00
Benny Halevy
f54ab48273 sstable_set: maintain total bytes_on_disk
and use that in compaction_group, rather than
respective accumulators of its own.

bytes_on_disk is implemented by each sstable_set_impl
and is update on insert and erase (whether directly
into the sstable_set_impl or via the sstable_set).

Although compound_sstable_set doesn't implement
insert and erase, it override `bytes_on_disk()` to return
the sum of all the underlying `sstable_set::bytes_on_disk()`.

Also, added respective unit tests for `partitioned_sstable_set`
and `time_series_sstable_set`, that test each type's
bytes_on_disk, including cloning of the set, and the
`compound_sstable_set` bytes_on_disk semantics.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-14 21:07:27 +03:00
Benny Halevy
9f77a32805 compaction_manager: run_offstrategy_compaction: retrieve owned_ranges from compaction_state
perform_offstrategy is called from try_perform_cleanup
when there are sstables in the maintenance set that require
cleanup.

The input sstables are inserted into the compaction_state
`sstables_requiring_cleanup` and `try_perform_cleanup`
expects offstrategy compaction to clean them up along
with reshape compaction.

Otherwise, the maintenance sstables that require cleanup
are not cleaned up by cleanup compaction, since
the reshape output sstable(s) are not analyzed again
after reshape compaction, where that would insert
the output sstable(s) into `sstables_requiring_cleanup`
and trigger their cleanup in the subsequent cleanup compaction.

The latter method is viable too, but it is less effficient
since we can do reshape+cleanup in one pass, vs.
reshape first and cleanup later.

Fixes scylladb/scylladb#15041

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15043
2023-08-14 18:37:34 +03:00
Avi Kivity
1937a5c1dd docs: cql: document the relative priority of SELECT clauses
Document how SELECT clauses are considered. For example, given the query

    SELECT * FROM tab WHERE a = 3 LIMIT 1

We'll get different results if we first apply the WHERE clause then LIMIT
the result set, or if we first LIMIT there result set and then apply the
WHERE clause.

Closes #14990
2023-08-14 17:40:37 +03:00
Benny Halevy
2dc9ef17be sstable_set: insert, erase: return status
To be used for maintaining disk_space_used
in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-14 17:10:39 +03:00
Aleksandra Martyniuk
7a28cc60ec compaction: ignore future explicitly
discard_result ignores only successful futures. Thus, if
perform_compaction<regular_compaction_task_executor> call fails,
a failure is considered abandoned, causing tests to fail.

Explicitly ignore failed future.

Fixes: #14971.

Closes #15000
2023-08-14 16:41:15 +03:00
Pavel Emelyanov
296eb61432 sstable/storage: Make filesystem storage with initial state
The filesystem storage driver uses different paths depending on sstable
state. It's possible to keep only table directory _and_ state on it and
construct this path on demand when needed, but it's faster to keep full
path onboard. All the more so it's only exported outside via .prefix()
call which is for logs only, but still

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:40:44 +03:00
Pavel Emelyanov
5c39d61b62 sstable: Maintain state
This means -- keep state on sstable, change it with change_state() call
and (!) fix the is_<state>() helpers not to check storage->prefix()

nit: mark requires_view_building() noexcept while at it

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:40:44 +03:00
Pavel Emelyanov
b06917f235 sstable: Make .change_state() accept state, not directory string
Pretty cosmetic change, but it will allow S3 to finally support moving
sstables between states (after this patch it still doesn't)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:40:44 +03:00
Pavel Emelyanov
ef25352412 sstable: Construct it with state
This just moves make_path() call from outside of sstable::sstable()
inside it. Later it will be moved even further. Also, now sstable can
know its state and keep it (next patch)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:28:54 +03:00
Pavel Emelyanov
1f247f0b05 sstables_manager: Remove state-less make_sstable()
Now all callers specify the state they want their sstables in explicitly
and the old API can be removed

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:28:54 +03:00
Pavel Emelyanov
fdfec474ae table: Make sstables with required state
By default it's created with normal state, but there are some places
that need to put it into staging. Do it with new state enum

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:28:54 +03:00
Pavel Emelyanov
855f2b4b86 test: Make sstables with upload state in some cases
As was mentione in the previous patch, there are few places in tests
that put sstables in upload/ subdir and they really mean it. Those need
to use sstables manager/directory API directly (already) and specify the
state explicitly (this patch)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 15:28:54 +03:00
Pavel Emelyanov
9e752ca6ab tools: Make sstables with normal state
Just like tests, tool open sstable by its full path and doesn't make any
assumptions about sstable state

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-14 14:56:02 +03:00