Commit Graph

38453 Commits

Author SHA1 Message Date
Kefu Chai
63b32cbdb4 tasks: s/stoppping/stopping/
fix a typo

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15103
2023-08-21 22:28:38 +03:00
Eliran Sinvani
eb368f9f6e internal_keyspace extension: enhance the semantics also to flushes
Commit 7c8c020 introduced a new type of keyspace, an internal keyspace.
It defined the semantics for this internal keyspace, which is
somewhat of a hybrid between a system and a user keyspace.

Here we extend the semantics to also include flushes, meaning that
flushes will be done using the system dirty_memory_manager. This is
in order to allow interdependencies between internal tables and user
tables and prevent deadlocks.

One example of such a deadlock is our `replicated_key_provider`
encryption on the enterprise version. The deadlock occurs because in some
circumstances, an encrypted user table flush is dependent upon the
`encrypted_keys` table being flushed, but since the requests are
serialized, we get a deadlock.

Tests: unit tests dev + debug
The deadlock dtest reproducer:
encryption_at_rest_test.py::TestEncryptionAtRest::test_reboot

Fixes #14529

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>

Closes #14547
2023-08-21 18:17:05 +03:00
Avi Kivity
ce43effc21 Merge "fix rebuild with consistent topology management" From Gleb Natapov
"
The series fixes a bogus assertion during topology state load and adds a
test that runs rebuild to make sure the code will not regress again.

Fixes #14958
"

* 'gleb/rebuilding_fix_v1' of github.com:scylladb/scylla-dev:
  test: add rebuild test
  system_keyspace: fix assertion for missing transition_state
2023-08-21 16:00:42 +03:00
Kefu Chai
8cc215db96 test: randomized_nemesis_test: do not brace around scalars
The `-Wbraced-scalar-init` warning option of Clang and GCC warns
about superfluous use of braces, like:
```
/home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:2187:32: error: braces around scalar initializer [-Werror,-Wbraced-scalar-init]
            .snapshot_threshold{1},
                               ^~~

```
usually, this does not hurt, but by taking the braces out, we have
a more readable piece of code, and fewer warnings.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15086
2023-08-21 15:57:06 +03:00
Aleksandra Martyniuk
e0ce711e4f compaction: do not swallow compaction_stopped_exception for reshape
Loop in shard_reshaping_compaction_task_impl::run relies on whether
sstables::compaction_stopped_exception is thrown from run_custom_job.
The exception is swallowed for each type of compaction
in compaction_manager::perform_task.

Rethrow the exception in perform_task for reshape compaction.

Fixes: #15058.

Closes #15067
2023-08-21 12:41:55 +03:00
Vlad Zolotarov
e13a2b687d scylla_raid_setup: make --online-discard argument useful
This argument was dead since its introduction and 'discard' was
always configured regardless of its value.
This patch allows actually configuring things using this argument.
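
A minimal sketch of what honoring the flag could look like. The option parsing, the `build_mount_options` helper, and the `noatime` option are assumptions made for illustration; they are not taken from the actual `scylla_raid_setup` script.

```python
import argparse


def build_mount_options(online_discard: bool) -> str:
    # Hypothetical helper: 'discard' is added only when requested,
    # instead of unconditionally as described in the bug.
    options = ["noatime"]  # illustrative base option, an assumption
    if online_discard:
        options.append("discard")
    return ",".join(options)


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # Assumed shape of the flag: 1 enables online discard, 0 disables it.
    parser.add_argument("--online-discard", type=int, choices=[0, 1], default=1)
    return parser.parse_args(argv)


enabled = build_mount_options(bool(parse_args(["--online-discard", "1"]).online_discard))
disabled = build_mount_options(bool(parse_args(["--online-discard", "0"]).online_discard))
```

The point of the sketch is only that the parsed value reaches the code that assembles the mount options.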

Fixes #14963

Closes #14964
2023-08-21 12:21:23 +03:00
Anna Stuchlik
b5c4d13e36 doc: update the Seastar Perftune page
This commit updates the description of perftune.py.
It is based on the information in the reported issue (below),
the contents of help for perftune.py, and the input from
@vladzcloudius.

Fixes https://github.com/scylladb/scylladb/issues/14233

Closes #14879
2023-08-21 10:23:30 +03:00
Anna Stuchlik
57e86b05f1 doc: fix the outdated Networking section
Fixes https://github.com/scylladb/scylla-docs/issues/2467

This commit updates the Networking section. The scope is:
- Removing the outdated content, including the reference to
  the super outdated posix_net_conf.sh script.
- Adding the guidelines provided by @vladzcloudius.
- Adding the reference to the documentation for
  the perftune.py script.

Closes #14859
2023-08-21 10:17:37 +03:00
Petr Gusev
9176a3341a test_topology_smp: more logs for debug/aarch64
The test is flaky on CI in debug builds
on aarch64 (#14752), here we sprinkle more
logs for debug/aarch64 hoping it'll help to
debug it.

Ref #14752

Closes #14822
2023-08-21 10:03:09 +03:00
Kefu Chai
adfc139a74 tools/scylla-sstable: path::parent_path() when appropriate
in load_sstables(), `sst_path` is already an instance of `std::filesystem::path`,
so there is no need to cast it to `std::filesystem::path`. also,
`path.remove_filename()` returns something like
"system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f/", with the
trailing slash. when we get a component's path in `sstable::filename`,
we always add a "/" in between the `dir` and the filename, so this'd
end up with two slashes in the path like:

"/var/scylla/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f//mc-2-big-Data.db"

so, in order to remove the duplicated slash, let's just use
`path.parent_path()` here.
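
The string effect described above can be mimicked outside of C++. The sketch below is a Python analogue only: `component_path` is a hypothetical stand-in for the "always add a slash" behaviour, and the slicing and `posixpath.dirname` calls stand in for `remove_filename()` and `parent_path()` respectively.

```python
import posixpath


def component_path(dir_name: str, filename: str) -> str:
    # Mirrors the described behaviour: a "/" is always added
    # between the directory and the filename.
    return dir_name + "/" + filename


sst_path = (
    "/var/scylla/data/system_schema/"
    "columns-24101c25a2ae3af787c1b40ee1aca33f/mc-2-big-Data.db"
)

# Analogue of remove_filename(): keep everything up to and including the last "/".
with_trailing_slash = sst_path[: sst_path.rfind("/") + 1]
# Analogue of parent_path(): the directory without a trailing slash.
without_trailing_slash = posixpath.dirname(sst_path)

doubled = component_path(with_trailing_slash, "mc-2-big-Data.db")
clean = component_path(without_trailing_slash, "mc-2-big-Data.db")
```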

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15035
2023-08-21 09:28:03 +03:00
Benny Halevy
0f54e24519 migration_notifier: get schema_ptr by value
To prevent use-after-free as seen in
https://github.com/scylladb/scylladb/issues/15097
where a temp schema_ptr retrieved from a global_schema_ptr
got destroyed when the notification function yielded.

Capturing the schema_ptr on the coroutine frame
is inexpensive since it's a shared ptr and it makes sure
that the schema remains valid throughout the coroutine
lifetime.

Fixes scylladb/scylladb#15097

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15098
2023-08-20 21:36:57 +03:00
David Garcia
e23d9cd7eb docs: Autogenerate db/config.cc docs
Update layout

docs: remove output param

docs: generate cc properties on build

docs: track cc file on change

rm: note dependency

docs: clean _data

Fixes #8424.

Closes #14973
2023-08-20 21:27:37 +03:00
Kefu Chai
1aa01d63d4 test: randomized_nemesis_test: mark direct_fd_{pinger,clock} final
`raft_server` in test/raft/randomized_nemesis_test.cc manages
instances of direct_fd_pinger and direct_fd_clock with unique_ptr<>.
this unique_ptr<> deletes these managed instances using delete.
but these two classes have virtual functions without having a
virtual destructor, so the compiler feels nervous when deleting them:
in theory, these pointers could be pointing to instances of classes
derived from them, and deleting those through a base pointer could
lead to a leak.

so to silence the warning and to prevent potential issues, let's
just mark these two classes final.

this should address the warning like:

```
In file included from /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:9:
In file included from /home/kefu/dev/scylladb/seastar/include/seastar/core/reactor.hh:24:
In file included from /home/kefu/dev/scylladb/seastar/include/seastar/core/aligned_buffer.hh:24:
In file included from /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/memory:78:
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: error: delete called on non-final 'direct_fd_pinger<int>' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor]
        delete __ptr;
        ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<direct_fd_pinger<int>>::operator()' requested here
          get_deleter()(std::move(__ptr));
          ^
/home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1400:5: note: in instantiation of member function 'std::unique_ptr<direct_fd_pinger<int>>::~unique_ptr' requested here
    ~raft_server() {
    ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: note: in instantiation of member function 'raft_server<ExReg>::~raft_server' requested here
        delete __ptr;
        ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<raft_server<ExReg>>::operator()' requested here
          get_deleter()(std::move(__ptr));
          ^
/home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1704:24: note: in instantiation of member function 'std::unique_ptr<raft_server<ExReg>>::~unique_ptr' requested here
            ._server = nullptr,
                       ^
/home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1742:19: note: in instantiation of member function 'environment<ExReg>::new_node' requested here
        auto id = new_node(first, std::move(cfg));
                  ^
/home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:2113:39: note: in instantiation of member function 'environment<ExReg>::new_server' requested here
        auto leader_id = co_await env.new_server(true);
                                      ^
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15084
2023-08-20 21:26:08 +03:00
Avi Kivity
4db5d8dd56 Merge 'build: cmake: support Coverage and Sanitize build modes' from Kefu Chai
to mirror the build modes supported by `configure.py`.

Closes #15085

* github.com:scylladb/scylladb:
  build: cmake: support Coverage and Sanitize build modes
  build: cmake: error out if specified build type is unknown
2023-08-20 21:25:21 +03:00
Pavel Emelyanov
6bc30f1944 system_keyspace: De-bloat .setup() from messing with system.local
On boot several manipulations with system.local are performed.

1. The host_id value is selected from it with key = local

   If not found, system_keyspace generates a new host_id, inserts the
   new value into the table and returns back

2. The cluster_name is selected from it with key = local

   Then it's system_keyspace that either checks that the name matches
   the one from db::config, or inserts the db::config value into the
   table

3. The row with key = local is updated with various info like versions,
   listen, rpc and bcast addresses, dc, rack, etc. Unconditionally

All three steps are scattered over main, p.1 is called directly, p.2 and
p.3 are executed via system_keyspace::setup() that happens rather late.
Also there's some touch of this table from the cql_test_env startup code.

The proposal is to collect this setup into one place and execute it
early -- as soon as the system.local table is populated. This frees the
system_keyspace code from the logic of selecting host id and cluster
name leaving it to main and keeps it with only select/insert work.
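
The three steps above, collected into one early setup call, could be sketched roughly as follows. This is an illustration only: the `setup_system_local` function, the dict standing in for the `system.local` table, and the field names are all hypothetical.

```python
import uuid

LOCAL_KEY = "local"


def setup_system_local(table: dict, configured_cluster_name: str, node_info: dict) -> uuid.UUID:
    row = table.setdefault(LOCAL_KEY, {})

    # Step 1: select host_id; generate and store a new one if it is missing.
    if "host_id" not in row:
        row["host_id"] = uuid.uuid4()

    # Step 2: check the stored cluster name against the configured one,
    # or store the configured name if none is present yet.
    stored_name = row.get("cluster_name")
    if stored_name is None:
        row["cluster_name"] = configured_cluster_name
    elif stored_name != configured_cluster_name:
        raise RuntimeError(
            f"saved cluster name {stored_name!r} != configured name {configured_cluster_name!r}"
        )

    # Step 3: unconditionally update the row with node information.
    row.update(node_info)
    return row["host_id"]
```

With this shape, the table-access layer only needs plain select and insert operations, while the decision logic lives in one place.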

refs: #2795

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #15082
2023-08-20 21:24:31 +03:00
Tomasz Grabiec
1552044615 storage_service, tablets: Fix corrupting tablet metadata on migration concurrent with table drop
Tablet migration may execute a global token metadata barrier before
executing updates of system.tablets. If table is dropped while the
barrier is happening, the updates will bring back rows for migrated
tablets in a table which is no longer there. This will cause tablet
metadata loading to fail with error:

 missing_column (missing column: tablet_count)

Like in this log line:

storage_service - raft topology: topology change coordinator fiber got error raft::stopped_error (Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1206): std::_Nested_exception<std::runtime_error> (Failed to read tablet metadata): missing_column (missing column: tablet_count)")

The fix is to read and execute the updates in a single group0 guard
scope, and move execution of the barrier later. We cannot now generate
updates in the same handle_tablet_migration() step if the barrier needs to
be executed, so we reuse the mechanism for two-step stage transition
which we already have for handling of streaming. The next pass will
notice that the barrier is not needed for a given tablet and will
generate the stage update.

Fixes #15061

Closes #15069
2023-08-20 21:17:57 +03:00
Avi Kivity
a4e7f9bed0 docs: cql: split DML page into one page per statement
The DML page is quite long (21 screenfuls on my monitor); split
it into one page per statement to make it more digestible.

The sections that are common to multiple statements are kept
in the main DML page, and references to them are added.

Closes #15053
2023-08-20 17:14:32 +03:00
Kefu Chai
12d6ec5a18 config: respect --log-with-color 1
scylladb overrides some of seastar logging related options with its
own options by applying them with `logging::apply_settings()`. but
we fail to inherit `with_color` from Seastar as we are using the
designated initializer, so the unspecified members are zero initialized.
that's why we always have logging messages in black and white even
if scylla is running in a tty and `--log-with-color 1` is specified.

so, to make the debugging life more colorful, let's inherit the option
from Seastar, and apply it when setting logging related options.

see also 29e09a3292

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15076
2023-08-20 13:47:43 +03:00
Tomasz Grabiec
bd8bb5d4b1 Merge 'Wire tablet into compaction group' from Raphael "Raph" Carvalho
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.

There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.

```
INFO  2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO  2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf
```

Closes #14863

* github.com:scylladb/scylladb:
  Kill scylla option to configure number of compaction groups
  replica: Wire tablet into compaction group
  token_metadata: Add this_host_id to topology config
  replica: Switch to chunked_vector for storing compaction groups
  replica: Generate group_id for compaction_group on demand
2023-08-18 15:17:17 +02:00
Kefu Chai
9fa0b9b75b build: cmake: support Coverage and Sanitize build modes
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-18 14:17:12 +08:00
Kefu Chai
3c3fb03b01 build: cmake: error out if specified build type is unknown
this should help the developer to understand what build types are
supported if the specified one is unknown.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-18 14:17:12 +08:00
Avi Kivity
1901475598 Merge 'config: mark "experimental" option unused and cleanups' from Kefu Chai
in this series, the "experimental" option is marked `Unused`, as it has been deprecated for almost 2 years, since scylla 4.6. the tests now use `experimental_features` to specify the used experimental features explicitly.

Closes #14948

* github.com:scylladb/scylladb:
  config: remove unused namespace alias
  config: use std::ranges when appropriate
  config: drop "experimental" option
  test: disable 'enable_user_defined_functions' if experimental_features does not include udf
  test: pylib: specify experimental_features explicitly
2023-08-17 20:42:02 +03:00
Kefu Chai
7275b8967c docs: add sstablemetadata to operating-scylla/admin-tools
to note that sstablemetadata is being deprecated and encourage
users to switch over to the native tools.

Fixes #15020
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15040
2023-08-17 18:48:46 +03:00
Avi Kivity
e91256a621 Merge 'build: cmake: fix the build of rpm/deb from submodules' from Kefu Chai
in this series, the build of rpm and deb from submodules is fixed:

1. correct the path of reloc package
2. add the dependency of reloc package to deb/rpm build targets

Closes #15062

* github.com:scylladb/scylladb:
  build: cmake: correct reloc_pkg's path
  build: cmake: build rpm/deb from reloc_pkg
2023-08-17 17:58:49 +03:00
Pavel Emelyanov
3ed5b00ba2 Merge 's3/client: generate config file for tests and cleanups' from Kefu Chai
before this change, object_store/test_basic.py creates a config file
for specifying the object storage settings, and passes the path of this
file as the argument of `--object-storage-config-file` option when
running scylla. we have the same requirement when testing scylla
with minio server, where we launch a minio server and manually
create the config file and feed it to scylla.

to ease the preparation work, let's consolidate by creating the
config file in `minio_server.py`, so it always creates the config
file and puts it in its tempdir. since object_store/test_basic.py
can also run against an S3 bucket, the fixture implemented in
object_store/conftest.py is updated accordingly to reuse the
helper exposed by MinioServer to create the config file when it
is not available.

Closes #15064

* github.com:scylladb/scylladb:
  s3/client: avoid hardwiring env variables names
  s3/client: generate config file for tests
2023-08-17 16:39:23 +03:00
Gleb Natapov
4ffc39d885 cql3: Extend the scope of group0_guard during DDL statement execution
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
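
The control flow described here can be sketched in Python as follows. This is only an illustration of the ordering (guard first, then access check, validation, and execution, with a retry on concurrent modification); the class names, the `acquire_guard` callable, and the retry bound are hypothetical and are not taken from the patch.

```python
class ConcurrentModification(Exception):
    """Stand-in for service::group0_concurrent_modification."""


class Statement:
    def take_guard(self, acquire_guard):
        # Statements that do not alter the schema return no guard.
        return None

    def check_access(self, guard):
        pass

    def validate(self, guard):
        pass

    def execute(self, guard):
        raise NotImplementedError


class SchemaAlteringStatement(Statement):
    def take_guard(self, acquire_guard):
        # Schema altering statements return a group0 guard.
        return acquire_guard()


def process(statement, acquire_guard, max_retries=10):
    # max_retries is an assumption; the commit message does not state a bound.
    for _ in range(max_retries):
        # The guard is taken before check_access() and validate(),
        # so both run under the same guard as execute().
        guard = statement.take_guard(acquire_guard)
        try:
            statement.check_access(guard)
            statement.validate(guard)
            return statement.execute(guard)
        except ConcurrentModification:
            continue  # take a fresh guard and start over
    raise RuntimeError("gave up after repeated concurrent modifications")
```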

Fixes: #13942

Message-ID: <ZNsynXayKim2XAFr@scylladb.com>
2023-08-17 15:52:48 +03:00
Kefu Chai
6788903fd6 db: config: mark config class final
in 34c3688017, we added a virtual function
to `config_file`, and we create and delete a `db::config` instance
through a `unique_ptr<>`. this makes the compiler
nervous, as deleting a pointer pointing to an instance of a non-final
class with virtual functions could lead to a leak, if this pointer actually
points to a derived class of this non-final class. so, in order to
silence the warning and to prevent potential problems in the future, let's
mark `db::config` final.

the warning from Clang 16 looks like:

```
In file included from /home/kefu/dev/scylladb/test/lib/test_services.cc:10:
In file included from /home/kefu/dev/scylladb/test/lib/test_services.hh:25:
In file included from /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/memory:78:
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: error: delete called on non-final 'db::config' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor]
        delete __ptr;
        ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<db::config>::operator()' requested here
          get_deleter()(std::move(__ptr));
          ^
/home/kefu/dev/scylladb/test/lib/test_services.cc:189:16: note: in instantiation of member function 'std::unique_ptr<db::config>::~unique_ptr' requested here
    auto cfg = std::make_unique<db::config>();
               ^
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15071
2023-08-17 13:43:16 +03:00
Kefu Chai
fc6b8d4040 s3/client: avoid hardwiring env variables names
instead of hardwiring the names in multiple places, let's just
keep them in a single place as variables, and reference them by
these variables instead of their values.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-17 16:06:55 +08:00
Kefu Chai
ec7fa3628c s3/client: generate config file for tests
before this change, object_store/test_basic.py creates a config file
for specifying the object storage settings, and passes the path of this
file as the argument of `--object-storage-config-file` option when
running scylla. we have the same requirement when testing scylla
with minio server, where we launch a minio server and manually
create the config file and feed it to scylla.

to ease the preparation work, let's consolidate by creating the
config file in `minio_server.py`, so it always creates the config
file and puts it in its tempdir. since object_store/test_basic.py
can also run against an S3 bucket, the fixture implemented in
object_store/conftest.py is updated accordingly to reuse the
helper exposed by MinioServer to create the config file when it
is not available.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-17 16:06:55 +08:00
Raphael S. Carvalho
b578d6643f Kill scylla option to configure number of compaction groups
The option was introduced to bootstrap the project. It's still
useful for testing, but that translates into maintaining an
additional option and code that will not be really used
outside of testing. A possible option is to later map the
option in boost tests to initial_tablets, which may yield
the same effect for testing.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:53 -03:00
Raphael S. Carvalho
cc60598368 replica: Wire tablet into compaction group
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.

There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.

INFO  2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO  2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf

There's a need for compaction_group_manager, as table will still support
"tabletless" mode, and we don't want to sprinkle ifs here and there,
to support both modes. It's not really a manager (it's not even supposed
to store a state), but I couldn't find a better name.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:53 -03:00
Raphael S. Carvalho
5d1f60439a token_metadata: Add this_host_id to topology config
The motivation is that token_metadata::get_my_id() is not available
early in the bootstrap process, as raft topology is pulled later
than new tables are registered and created, and this node is added
to topology even later.

To allow creation of compaction groups to retrieve "my id" from
token metadata early, initialization will now feed local id
into topology config which is immutable for each node anyway.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:44 -03:00
Piotr Smaroń
34c3688017 db: config: add live_updatable_config_params_changeable_via_cql option
If `live_updatable_config_params_changeable_via_cql` is set to true, configuration parameters defined with the `liveness::LiveUpdate` option can be updated at runtime with CQL, i.e. by updating the `system.config` virtual table.
If we don't want any configuration parameter to be changed at
runtime by updating the `system.config` virtual table, this option should be
set to false. This option should be set to false for e.g. cloud users,
who can only perform CQL queries, and should not be able to change
scylla's configuration on the fly.

The current implementation is generic, but has a small drawback: messages
returned to the user may not be fully accurate. Consider:
```
cqlsh> UPDATE system.config SET value='2' WHERE name='task_ttl_in_seconds';
WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="option is not live-updateable" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
```
where `task_ttl_in_seconds` has been defined with
`liveness::LiveUpdate`, but because `live_updatable_config_params_changeable_via_cql` is set to
`false` in `scylla.yaml`, `task_ttl_in_seconds` cannot be modified at
runtime by updating the `system.config` virtual table.
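
A small model of the behaviour described above, showing why the error text can be misleading. The `Config` class, its methods, and the `some_static_option` parameter are hypothetical; only the option name, `task_ttl_in_seconds`, and the error text come from the message.

```python
class ConfigError(Exception):
    pass


class Config:
    def __init__(self, changeable_via_cql: bool):
        # Models live_updatable_config_params_changeable_via_cql.
        self.changeable_via_cql = changeable_via_cql
        # name -> (value, defined with LiveUpdate?)
        self._params = {
            "task_ttl_in_seconds": ("10", True),
            "some_static_option": ("x", False),  # hypothetical non-live option
        }

    def get(self, name: str) -> str:
        return self._params[name][0]

    def set_via_cql(self, name: str, value: str) -> None:
        _, live = self._params[name]
        # Both "not live-updatable" and "CQL updates disabled" end up with
        # the same generic error, which is the drawback noted above.
        if not (live and self.changeable_via_cql):
            raise ConfigError("option is not live-updateable")
        self._params[name] = (value, live)
```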

Fixes #14355

Closes #14382
2023-08-16 17:56:27 +03:00
Aleksandra Martyniuk
e9d94894f1 compaction: release resources of compaction executors
Before compaction task executors started inheriting from
compaction_task_impl, they were destructed immediately after
compaction finished. Destructors of executors and their
fields performed actions that affected global structures and
statistics and had impact on compaction process.

Currently, task executors are kept in memory much longer, as they
are tracked by task manager. Thus, destructors are not called just
after the compaction, which results in compaction stats not being
updated, which causes e.g. infinite cleanup loop.

Add release_resources() method which is called at the end
of compaction process and does what destructors used to.
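
The idea of moving cleanup out of the destructor can be sketched as below. This is a Python analogue with hypothetical names (`CompactionStats`, `Executor`, the `task_manager` list); it only shows that an explicit, idempotent release keeps the statistics correct even while something else still holds a reference to the executor.

```python
class CompactionStats:
    def __init__(self):
        self.active = 0


class Executor:
    def __init__(self, stats: CompactionStats):
        self._stats = stats
        self._released = False
        stats.active += 1

    def release_resources(self):
        # Idempotent: safe to call more than once.
        if self._released:
            return
        self._released = True
        self._stats.active -= 1

    def run(self, job):
        try:
            return job()
        finally:
            # Called at the end of the compaction, regardless of how long
            # the executor object itself stays alive afterwards.
            self.release_resources()


stats = CompactionStats()
task_manager = []              # keeps executors alive after they finish
executor = Executor(stats)
task_manager.append(executor)
result = executor.run(lambda: "compacted")
```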

Fixes: #14966.
Fixes: #15030.

Closes #15005
2023-08-16 15:51:17 +03:00
Kefu Chai
564522c4a8 s3/test: remove tempdir if log does not exists
we should have used `ignore_errors=True` to ignore
the error. this issue has not popped up, because
we haven't run into the case where the log file
does not exist.
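
For reference, the effect of `ignore_errors=True` on `shutil.rmtree` can be seen in a few lines. Which call in the test needed the flag is not shown here, so the temporary directory below is purely illustrative.

```python
import os
import shutil
import tempfile

tempdir = tempfile.mkdtemp()
shutil.rmtree(tempdir)  # the directory is gone now

# Without ignore_errors, removing a missing path raises.
try:
    shutil.rmtree(tempdir)
    raised = False
except FileNotFoundError:
    raised = True

# With ignore_errors=True the same call is a silent no-op.
shutil.rmtree(tempdir, ignore_errors=True)
```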

this was a regression introduced by
d4ee84ee1e

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15063
2023-08-16 15:11:00 +03:00
Kefu Chai
32c26624bf build: cmake: correct reloc_pkg's path
before this change, the filename in path of reloc package looks like:
tools-scylla-5.4.0~dev-0.20230816.2eb6dc57297e.noarch.tar.gz
but it should have been:
scylla-tools-5.4.0~dev-0.20230816.2eb6dc57297e.noarch.tar.gz
so, when repackaging the reloc tarball to rpm / deb, the scripts
just fail to find the reloc tarball.

after this change, the filename is corrected to match with the one
generated using `build_reloc.sh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-16 16:15:23 +08:00
Kefu Chai
a19c7fa8d5 build: cmake: build rpm/deb from reloc_pkg
before this change, dist-${name}-rpm and dist-${name}-deb targets
do not depend on the corresponding reloc pkg from which these
prebuilt packages are created. so these scripts fail if the reloc
package does not exist.

to address this problem, the reloc package is added as their
dependencies.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-16 16:15:23 +08:00
Avi Kivity
e8f3b073c3 Merge 'Maintain sstable state explicitly' from Pavel Emelyanov
An sstable can be in one of several states -- normal, quarantined, staging, uploading. Right now this "state" is hard-wired into sstable's path, e.g. quarantined sstable would sit in e.g. /var/lib/data/ks-cf-012345/quarantine/ directory. Respectively, there's a bunch of directory names constexprs in sstables.hh defining each "state". Other than being confusing, this approach doesn't work well with S3 backend. Additionally, there's snapshot subdir that adds to the confusion, because snapshot is not quite a state.

This PR converts "state" from constexpr char* directory names into an enum class and patches the sstable creation, opening and state-changing API to use that enum instead of parsing the path.

refs: #13017
refs: #12707

Closes #14152

* github.com:scylladb/scylladb:
  sstable/storage: Make filesystem storage with initial state
  sstable: Maintain state
  sstable: Make .change_state() accept state, not directory string
  sstable: Construct it with state
  sstables_manager: Remove state-less make_sstable()
  table: Make sstables with required state
  test: Make sstables with upload state in some cases
  tools: Make sstables with normal state
  table: Open-code sstables making streaming helpers
  tests: Make sstables with normal state by default
  sstable_directory: Make sstable with required state
  sstable_directory: Construct with state
  distributed_loader: Make sstable with desired state when populating
  distributed_loader: Make sstable with upload state when uploading
  sstable: Introduce state enum
  sstable_directory: Merge verify and g.c. calls
  distributed_loader: Merge verify and gc invocations
  sstable/filesystem: Put underscores to dir members
  sstable/s3: Mark make_s3_object_name() const
  sstable: Remove filename(dir, ...) method
2023-08-15 17:44:06 +03:00
Avi Kivity
5949623e0d Merge 'sstable_set: maintain bytes on disk' from Benny Halevy
and use that in compaction_group, rather than
respective accumulators of its own.
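
A minimal sketch of a set that keeps its own byte total, with `insert` and `erase` reporting whether they changed anything. The class and its fields are hypothetical; they only illustrate why a returned status helps keep the accumulator consistent.

```python
class SSTableSet:
    def __init__(self):
        self._sizes = {}        # sstable name -> bytes on disk
        self.bytes_on_disk = 0  # maintained total

    def insert(self, name: str, size: int) -> bool:
        # Returns False if the sstable was already present,
        # in which case the total is left untouched.
        if name in self._sizes:
            return False
        self._sizes[name] = size
        self.bytes_on_disk += size
        return True

    def erase(self, name: str) -> bool:
        # Returns False if there was nothing to erase.
        size = self._sizes.pop(name, None)
        if size is None:
            return False
        self.bytes_on_disk -= size
        return True
```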

This is part of a larger series to make cache updates exception safe.

Refs #14043

Closes #15052

* github.com:scylladb/scylladb:
  sstable_set: maintain total bytes_on_disk
  sstable_set: insert, erase: return status
2023-08-15 17:32:12 +03:00
Kefu Chai
64ed0127d7 s3/client: retry if minio server fails to start
there is a small time window after we find a free port and before
the minio server listens on that port. if another server sneaks
in during that window and listens on that port, the minio server can
still fail to start even though there might be free ports for it.

so, in this change, we just retry with a random port for a fixed
number of times until the minio server is able to serve.
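
The retry loop could be sketched like this. The `start_with_retry` function, its parameters, the port range, and the use of `OSError` to signal a failed start are assumptions for illustration; the real `minio_server.py` may detect failure differently.

```python
import random


def start_with_retry(start_server, max_attempts=5, port_range=(1024, 65535), rng=random):
    """Try to start a server on a random port, retrying a fixed number of times.

    `start_server(port)` is a caller-supplied callable that raises OSError
    if the server cannot come up on that port.
    """
    last_error = None
    for _ in range(max_attempts):
        port = rng.randint(*port_range)
        try:
            start_server(port)
            return port
        except OSError as e:
            # Someone else may have grabbed the port in the meantime; try another.
            last_error = e
    raise RuntimeError(f"server failed to start after {max_attempts} attempts") from last_error
```

Picking a fresh random port on each attempt avoids retrying the same port that was just taken.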

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15042
2023-08-15 16:17:47 +03:00
Raphael S. Carvalho
d3f71ae4ee replica: Switch to chunked_vector for storing compaction groups
We aim for a large number of tablets, therefore let's switch
to chunked_vector to avoid large contiguous allocs.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-15 09:04:05 -03:00
Raphael S. Carvalho
2590eec352 replica: Generate group_id for compaction_group on demand
There are a few good reasons for this change.
1) compaction_group doesn't have to be aware of # of groups
2) thinking forward to dynamic tablets, # of groups cannot be
statically embedded in group id, otherwise it gets stale.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-15 09:04:05 -03:00
Raphael S. Carvalho
9400b79658 gce_snitch: Fix use-after-move in load_config()
The use-after-move is not very harmful as it's only used when
handling an exception. So the user would be left with a bogus message.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #15054
2023-08-15 10:23:57 +03:00
Kefu Chai
c82f1d2f57 tools/scylla-sstable: dump column_desc as an object
before this change, `scylla sstable dump-statistics` prints the
"regular_columns" as a list of strings, like:

```
        "regular_columns": [
          "name",
          "clustering_order",
          "type_name",
          "org.apache.cassandra.db.marshal.UTF8Type",
          "name",
          "column_name_bytes",
          "type_name",
          "org.apache.cassandra.db.marshal.BytesType",
          "name",
          "kind",
          "type_name",
          "org.apache.cassandra.db.marshal.UTF8Type",
          "name",
          "position",
          "type_name",
          "org.apache.cassandra.db.marshal.Int32Type",
          "name",
          "type",
          "type_name",
          "org.apache.cassandra.db.marshal.UTF8Type"
        ]
```

but according to
https://opensource.docs.scylladb.com/stable/operating-scylla/admin-tools/scylla-sstable.html#dump-statistics,

> $SERIALIZATION_HEADER_METADATA := {
>     "min_timestamp_base": Uint64,
>     "min_local_deletion_time_base": Uint64,
>     "min_ttl_base": Uint64,
>     "pk_type_name": String,
>     "clustering_key_types_names": [String, ...],
>     "static_columns": [$COLUMN_DESC, ...],
>     "regular_columns": [$COLUMN_DESC, ...],
> }
>
> $COLUMN_DESC := {
>     "name": String,
>     "type_name": String
> }

"regular_columns" is supposed to be a list of "$COLUMN_DESC".
the same applies to "static_columns". this schema makes sense,
as each column should be considered as a single object which
is composed of two properties. but we dump them as a flat list.

so, in this change, we guard each visit() call of `json_dumper()`
with a `StartObject()` and `EndObject()` pair, so that each column
is printed as an object.

after the change, "regular_columns" are printed like:
```
        "regular_columns": [
          {
            "name": "clustering_order",
            "type_name": "org.apache.cassandra.db.marshal.UTF8Type"
          },
          {
            "name": "column_name_bytes",
            "type_name": "org.apache.cassandra.db.marshal.BytesType"
          },
          {
            "name": "kind",
            "type_name": "org.apache.cassandra.db.marshal.UTF8Type"
          },
          {
            "name": "position",
            "type_name": "org.apache.cassandra.db.marshal.Int32Type"
          },
          {
            "name": "type",
            "type_name": "org.apache.cassandra.db.marshal.UTF8Type"
          }
        ]
```
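
For scripts that still hold output produced before this change, here is a
minimal sketch of regrouping the old flat list into the object form shown
above. `pair_up_columns` is a hypothetical helper, not part of
scylla-sstable, and it assumes the old output always repeats the keys
"name" and "type_name" in that order.

```python
def pair_up_columns(flat):
    """Group a flat ["name", X, "type_name", Y, ...] list into objects."""
    if len(flat) % 4 != 0:
        raise ValueError("expected groups of four strings per column")
    columns = []
    for i in range(0, len(flat), 4):
        name_key, name, type_key, type_name = flat[i:i + 4]
        # Assumption: keys appear exactly as in the old dump format.
        if (name_key, type_key) != ("name", "type_name"):
            raise ValueError(f"unexpected keys at index {i}")
        columns.append({"name": name, "type_name": type_name})
    return columns
```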

Fixes #15036
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15037
2023-08-15 08:22:51 +03:00
Avi Kivity
d57a951d48 Revert "cql3: Extend the scope of group0_guard during DDL statement execution"
This reverts commit 70b5360a73. It generates
a failure in group0_test.test_concurrent_group0_modifications in debug
mode with about 4% probability.

Fixes #15050
2023-08-15 00:26:45 +03:00
Patryk Jędrzejczak
e7077da12d replica: reduce the size limit of the schema commitlog
The size of the schema commitlog is incorrectly set to 10 TB. To
avoid wasting space, we reduce it to 2 * schema commitlog segment
size.

Closes #14946
2023-08-14 20:41:15 +02:00
Benny Halevy
f54ab48273 sstable_set: maintain total bytes_on_disk
and use that in compaction_group, rather than
respective accumulators of its own.

bytes_on_disk is implemented by each sstable_set_impl
and is updated on insert and erase (whether directly
into the sstable_set_impl or via the sstable_set).

Although compound_sstable_set doesn't implement
insert and erase, it overrides `bytes_on_disk()` to return
the sum of all the underlying `sstable_set::bytes_on_disk()`.

Also, added respective unit tests for `partitioned_sstable_set`
and `time_series_sstable_set`, that test each type's
bytes_on_disk, including cloning of the set, and the
`compound_sstable_set` bytes_on_disk semantics.
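
The accounting described above can be sketched roughly as follows. This is
an illustrative Python model, not the C++ implementation: the class names,
the name-to-size dict and the boolean return values are assumptions made
for the sketch.

```python
class SSTableSet:
    """Toy set that keeps a running total of bytes on disk."""

    def __init__(self):
        self._sizes = {}          # sstable name -> size in bytes
        self._bytes_on_disk = 0

    def insert(self, name, size):
        if name in self._sizes:
            return False          # already present, total unchanged
        self._sizes[name] = size
        self._bytes_on_disk += size
        return True

    def erase(self, name):
        size = self._sizes.pop(name, None)
        if size is None:
            return False          # not present, total unchanged
        self._bytes_on_disk -= size
        return True

    def bytes_on_disk(self):
        return self._bytes_on_disk

    def clone(self):
        # The copy carries the same total as the original.
        other = SSTableSet()
        other._sizes = dict(self._sizes)
        other._bytes_on_disk = self._bytes_on_disk
        return other


class CompoundSSTableSet:
    """Has no insert/erase of its own; sums the underlying sets."""

    def __init__(self, sets):
        self._sets = list(sets)

    def bytes_on_disk(self):
        return sum(s.bytes_on_disk() for s in self._sets)
```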

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-14 21:07:27 +03:00
Benny Halevy
9f77a32805 compaction_manager: run_offstrategy_compaction: retrieve owned_ranges from compaction_state
perform_offstrategy is called from try_perform_cleanup
when there are sstables in the maintenance set that require
cleanup.

The input sstables are inserted into the compaction_state
`sstables_requiring_cleanup` and `try_perform_cleanup`
expects offstrategy compaction to clean them up along
with reshape compaction.

Otherwise, the maintenance sstables that require cleanup
are not cleaned up by cleanup compaction, since
the reshape output sstable(s) are not analyzed again
after reshape compaction, where that would insert
the output sstable(s) into `sstables_requiring_cleanup`
and trigger their cleanup in the subsequent cleanup compaction.

The latter method is viable too, but it is less efficient
since we can do reshape+cleanup in one pass, vs.
reshape first and cleanup later.

Fixes scylladb/scylladb#15041

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15043
2023-08-14 18:37:34 +03:00
Avi Kivity
1937a5c1dd docs: cql: document the relative priority of SELECT clauses
Document how SELECT clauses are considered. For example, given the query

    SELECT * FROM tab WHERE a = 3 LIMIT 1

We'll get different results if we first apply the WHERE clause then LIMIT
the result set, or if we first LIMIT the result set and then apply the
WHERE clause.
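
A small sketch of why the order matters, using plain Python lists in place
of a table. The rows and the two helper functions are made up for
illustration and do not model how the query is actually executed.

```python
rows = [{"a": 1}, {"a": 3}, {"a": 3}]


def where_then_limit(rows, pred, limit):
    # Filter first, then truncate the filtered result.
    return [r for r in rows if pred(r)][:limit]


def limit_then_where(rows, pred, limit):
    # Truncate first, then filter whatever survived the truncation.
    return [r for r in rows[:limit] if pred(r)]


def is_three(row):
    return row["a"] == 3


filtered_first = where_then_limit(rows, is_three, 1)
limited_first = limit_then_where(rows, is_three, 1)
```

With these rows, filtering first yields one matching row, while limiting
first yields nothing because the only row kept by the limit has a = 1.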

Closes #14990
2023-08-14 17:40:37 +03:00
Benny Halevy
2dc9ef17be sstable_set: insert, erase: return status
To be used for maintaining disk_space_used
in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-14 17:10:39 +03:00