Commit Graph

38464 Commits

Author SHA1 Message Date
Benny Halevy
d666fbfe8f gossiper: run: no need to replicate live_endpoints
As asias@scylladb.com noticed, after the previous
patch that calls replicate_live_endpoints_on_change
in mutate_live_and_unreachable_endpoints, _live_endpoints
are always updated on all shards when they change,
so there's no need anymore to replicate them here.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 11:57:32 +03:00
Benny Halevy
2c27297dbd gossiper: fold update_live_endpoints_version into replicate_live_endpoints_on_change
We want to propagate any change to _live_endpoints
to all shards.  Currently we just update the `_live_endpoints_version`
and `replicate_live_endpoints_on_change` propagates the
change some undetermined time in the future.

To rely on `_live_endpoints` for gossiper::is_alive,
that may be called on any shard, we want to propagate
the change to all shards as soon as it happens.

Use `mutate_live_and_unreachable_endpoints` to update
_live_endpoints and/or _unreachable_endpoints safely,
under `lock_endpoint_update_semaphore`. It is responsible
for incrementing _live_endpoints_version and
calling `replicate_live_endpoints_on_change` to
propagate the change to all shards.
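
The flow described above can be sketched roughly as follows. This is a simplified, single-process model and not the gossiper code: `endpoint_registry`, `shard_view` and `mutate` are hypothetical names, a `std::mutex` stands in for the endpoint update semaphore, and a vector of copies stands in for the per-shard replicas.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_set>
#include <vector>

using endpoint_set = std::unordered_set<std::string>;

// Hypothetical per-shard replica of the endpoint sets.
struct shard_view {
    endpoint_set live;
    endpoint_set unreachable;
    uint64_t version = 0;
};

// Hypothetical owner of the authoritative sets (the role shard 0 plays).
class endpoint_registry {
    std::mutex _update_lock;          // stands in for the update semaphore
    endpoint_set _live;
    endpoint_set _unreachable;
    uint64_t _version = 0;
    std::vector<shard_view> _shards;  // stands in for the other shards

    // Copy the authoritative state to every replica.
    void replicate() {
        for (auto& s : _shards) {
            s.live = _live;
            s.unreachable = _unreachable;
            s.version = _version;
        }
    }
public:
    explicit endpoint_registry(size_t shard_count) : _shards(shard_count) {
        assert(shard_count > 0);
    }

    // Apply `func` under the lock, bump the version, replicate right away.
    void mutate(const std::function<void(endpoint_set&, endpoint_set&)>& func) {
        std::lock_guard<std::mutex> guard(_update_lock);
        func(_live, _unreachable);
        ++_version;
        replicate();
    }

    // Average O(1) lookup against the replica of the given shard.
    bool is_alive(size_t shard, const std::string& ep) const {
        return _shards.at(shard).live.count(ep) > 0;
    }

    uint64_t version(size_t shard) const { return _shards.at(shard).version; }
};
```

Because replication happens inside `mutate`, a lookup on any replica sees the change as soon as `mutate` returns, which is the property the commit message asks for.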

Refs scylladb/scylladb#15089
Refs scylladb/scylladb#15088

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 11:56:41 +03:00
Benny Halevy
86ccc1f49b gossiper: add mutate_live_and_unreachable_endpoints
To be used for safely modifying _live_endpoints
and/or _unreachable_endpoints and replicating the
new version to all shards.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 11:56:37 +03:00
Benny Halevy
a14e5ab8a3 gossiper: reset_endpoint_state_map: clear also shadow endpoint sets
If we don't clear them, there is a slight chance
that the next update will make `_live_endpoints` or `_unreachable_endpoints`
equal to their shadow counterparts and prevent an update
in `replicate_live_endpoints_on_change`.

Fixes #15003

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 09:17:19 +03:00
Benny Halevy
0cc0a95543 gossiper: reset_endpoint_state_map: clear live/unreachable endpoints on all shards
Not only on the calling shard (shard 0).
Essentially this change folds `update_live_endpoints_version`
into `reset_endpoint_state_map`.

Acquire the _endpoint_update_semaphore to serialize
this with `replicate_live_endpoints_on_change`.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 09:17:19 +03:00
Benny Halevy
c45868e3bc gossiper: functions that change _live_endpoints must be called on shard 0
`update_live_endpoints_version` and functions that call it
must be called on shard 0, since it updates the authoritative
`_live_endpoints` and `_live_endpoints_version`.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 09:17:19 +03:00
Benny Halevy
b0b1c8ae6e gossiper: add lock_endpoint_update_semaphore
Add a private helper to acquire the _endpoint_update_semaphore
before calling replicate_live_endpoints_on_change.

Must be called on shard 0.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 09:17:19 +03:00
Benny Halevy
18881bc89d gossiper: make _live_endpoints an unordered_set
It is more efficient to maintain as an unordered_set
and it will be used in a following patch
to determine is_alive(endpoint) in O(1) on average.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 09:17:19 +03:00
Benny Halevy
97061cc3b8 endpoint_state: use gossiper::is_alive externally
Before we remove endpoint_state::_is_alive to rely
solely on gossiper::_live_endpoints.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-08-22 09:06:09 +03:00
Avi Kivity
23be6f0336 tablets: change persistent type of replica set from set to list
The system.tablets table stores replica sets as a CQL set type,
which is sorted. This means that if, in a tablet replica set
[n1, n2, n3] n2 is replaced with n4, then on reload we'll see
[n1, n3, n4], changing the relative position of n3 from the third
replica to the second.

The relative position of replicas in a replica set is important
for materialized views, as they use it to pair base replicas with
view replicas. To prepare for materialized views using tablets,
change the persistent data type to list, which preserves order.

The code that generates new replica sets already preserves order:
see locator::replace_replica().
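
A small sketch of the ordering difference, using standard containers as stand-ins for the CQL set and list types. `replace_in_set` and `replace_in_list` are hypothetical helpers written for this illustration; they are not `locator::replace_replica()`.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Sorted-set semantics: the replacement lands at its sorted position,
// so other replicas may shift.
std::vector<std::string> replace_in_set(std::set<std::string> replicas,
                                        const std::string& from,
                                        const std::string& to) {
    replicas.erase(from);
    replicas.insert(to);
    return std::vector<std::string>(replicas.begin(), replicas.end());
}

// List semantics: the replacement takes the slot of the replaced replica,
// so every other replica keeps its position.
std::vector<std::string> replace_in_list(std::vector<std::string> replicas,
                                         const std::string& from,
                                         const std::string& to) {
    std::replace(replicas.begin(), replicas.end(), from, to);
    return replicas;
}
```

With [n1, n2, n3] and n2 replaced by n4, the set variant yields [n1, n3, n4] while the list variant yields [n1, n4, n3], keeping n3 as the third replica.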

While this changes the system schema, tablets are an experimental
feature so we don't need to worry about upgrades.

Closes #15111
2023-08-21 22:55:14 +02:00
Nadav Har'El
18e8e62798 cql-pytest: translate Cassandra's tests for SELECT with LIMIT
This is a translation of Cassandra's CQL unit test source file
validation/operations/SelectLimitTest.java into our cql-pytest framework.

The tests reproduce two already-known bugs:

Refs #9879:  Using PER PARTITION LIMIT with aggregate functions should
             fail as Invalid query
Refs #10357: Spurious static row returned from query with filtering,
             despite not matching filter

And also helped discover two new issues:

Refs #15099: Incorrect sort order when combining IN, and ORDER BY
Refs #15109: PER PARTITION LIMIT should be rejected if SELECT DISTINCT
             is used

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #15114
2023-08-21 22:29:11 +03:00
Kefu Chai
63b32cbdb4 tasks: s/stoppping/stopping/
fix a typo

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15103
2023-08-21 22:28:38 +03:00
Eliran Sinvani
eb368f9f6e internal_keyspace extension: enhance the semantics also to flushes
Commit 7c8c020 introduced a new type of keyspace, an internal keyspace.
It defined the semantics for this internal keyspace, which is
somewhat of a hybrid between a system and a user keyspace.

Here we extend the semantics to include also flushes, meaning that
flushes will be done using the system dirty_memory_manager. This is
in order to allow interdependencies between internal tables and user
tables and prevent deadlocks.

One example of such a deadlock is our `replicated_key_provider`
encryption on the enterprise version. The deadlock occurs because in some
circumstances, an encrypted user table flush is dependent upon the
`encrypted_keys` table being flushed but since the requests are
serialized, we get a deadlock.

Tests: unit tests dev + debug
The deadlock dtest reproducer:
encryption_at_rest_test.py::TestEncryptionAtRest::test_reboot

Fixes #14529

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>

Closes #14547
2023-08-21 18:17:05 +03:00
Avi Kivity
ce43effc21 Merge "fix rebuild with consistent topology management" From Gleb Natapov
"
The series fixes a bogus assertion during topology state load and adds a
test that runs rebuild to make sure the code will not regress again.

Fixes #14958
"

* 'gleb/rebuilding_fix_v1' of github.com:scylladb/scylla-dev:
  test: add rebuild test
  system_keyspace: fix assertion for missing transition_state
2023-08-21 16:00:42 +03:00
Kefu Chai
8cc215db96 test: randomized_nemesis_test: do not brace around scalars
Clang and GCC's warning option of `-Wbraced-scalar-init` warns
at seeing superfluous use of braces, like:
```
/home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:2187:32: error: braces around scalar initializer [-Werror,-Wbraced-scalar-init]
            .snapshot_threshold{1},
                               ^~~

```
usually, this does not hurt. but by taking the braces out, we have
a more readable piece of code, and fewer warnings.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15086
2023-08-21 15:57:06 +03:00
Aleksandra Martyniuk
e0ce711e4f compaction: do not swallow compaction_stopped_exception for reshape
Loop in shard_reshaping_compaction_task_impl::run relies on whether
sstables::compaction_stopped_exception is thrown from run_custom_job.
The exception is swallowed for each type of compaction
in compaction_manager::perform_task.

Rethrow the exception in perform_task for reshape compaction.

Fixes: #15058.

Closes #15067
2023-08-21 12:41:55 +03:00
Vlad Zolotarov
e13a2b687d scylla_raid_setup: make --online-discard argument useful
This argument was dead since its introduction and 'discard' was
always configured regardless of its value.
This patch allows actually configuring things using this argument.

Fixes #14963

Closes #14964
2023-08-21 12:21:23 +03:00
Anna Stuchlik
b5c4d13e36 doc: update the Seastar Perftune page
This commit updates the description of perftune.py.
It is based on the information in the reported issue (below),
the contents of help for perftune.py, and the input from
@vladzcloudius.

Fixes https://github.com/scylladb/scylladb/issues/14233

Closes #14879
2023-08-21 10:23:30 +03:00
Anna Stuchlik
57e86b05f1 doc: fix the outdated Networking section
Fixes https://github.com/scylladb/scylla-docs/issues/2467

This commit updates the Networking section. The scope is:
- Removing the outdated content, including the reference to
  the super outdated posix_net_conf.sh script.
- Adding the guidelines provided by @vladzcloudius.
- Adding the reference to the documentation for
  the perftune.py script.

Closes #14859
2023-08-21 10:17:37 +03:00
Petr Gusev
9176a3341a test_topology_smp: more logs for debug/aarch64
The test is flaky on CI in debug builds
on aarch64 (#14752), here we sprinkle more
logs for debug/aarch64 hoping it'll help to
debug it.

Ref #14752

Closes #14822
2023-08-21 10:03:09 +03:00
Kefu Chai
adfc139a74 tools/scylla-sstable: path::parent_path() when appropriate
in load_sstables(), `sst_path` is already an instance of `std::filesystem::path`,
so there is no need to cast it to `std::filesystem::path`. also,
`path.remove_filename()` returns something like
"system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f/", with the
trailing slash. when we get a component's path in `sstable::filename`,
we always add a "/" in between the `dir` and the filename, so this'd
end up with two slashes in the path like:

"/var/scylla/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f//mc-2-big-Data.db"

so, in order to remove the duplicated slash, let's just use
`path.parent_path()` here.
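
the difference between the two calls can be seen in a small standalone sketch. `join_with_slash` is a hypothetical helper that only mimics the "dir + '/' + filename" concatenation described above, and the shortened paths are made up for illustration.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for code that always puts a "/" between
// the directory and the file name.
std::string join_with_slash(const fs::path& dir, const std::string& filename) {
    return dir.string() + "/" + filename;
}

// remove_filename() strips the last component but keeps the trailing separator.
std::string dir_via_remove_filename(fs::path sst_path) {
    assert(sst_path.has_filename());
    return sst_path.remove_filename().string();
}

// parent_path() returns the directory without a trailing separator.
std::string dir_via_parent_path(const fs::path& sst_path) {
    return sst_path.parent_path().string();
}
```

for "/data/ks/cf-1234/mc-2-big-Data.db", the first helper returns "/data/ks/cf-1234/" and the second returns "/data/ks/cf-1234", so only the second avoids the doubled slash after joining.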

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15035
2023-08-21 09:28:03 +03:00
Benny Halevy
0f54e24519 migration_notifier: get schema_ptr by value
To prevent use-after-free as seen in
https://github.com/scylladb/scylladb/issues/15097
where a temp schema_ptr retrieved from a global_schema_ptr
gets destroyed when the notification function yields.

Capturing the schema_ptr on the coroutine frame
is inexpensive since it's a shared ptr and it makes sure
that the schema remains valid throughout the coroutine
lifetime.
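
The same lifetime rule can be illustrated without coroutines. The sketch below is an analogy only: a queue of deferred callbacks stands in for "the function yields and resumes later", and `schema`, `pending` and `notify_later` are hypothetical names, not the migration_notifier API.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

struct schema {
    std::string name;
};
using schema_ptr = std::shared_ptr<const schema>;

// Deferred work that runs after the caller's temporaries are gone.
std::vector<std::function<std::string()>> pending;

// Taking `s` by value and capturing it by value copies the shared pointer
// into the deferred task, so the schema stays alive until the task has run.
// Holding only a reference to a caller's temporary would dangle instead.
void notify_later(schema_ptr s) {
    assert(s != nullptr);
    pending.push_back([s] { return s->name; });
}
```

Copying a shared pointer costs one reference count increment, which is why passing it by value is considered cheap here.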

Fixes scylladb/scylladb#15097

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #15098
2023-08-20 21:36:57 +03:00
David Garcia
e23d9cd7eb docs: Autogenerate db/config.cc docs
Update layout

docs: remove output param

docs: generate cc properties on build

docs: track cc file on change

rm: note dependency

docs: clean _data

Fixes #8424.

Closes #14973
2023-08-20 21:27:37 +03:00
Kefu Chai
1aa01d63d4 test: randomized_nemesis_test: mark direct_fd_{pinger,clock} final
`raft_server` in test/raft/randomized_nemesis_test.cc manages
instances of direct_fd_pinger and direct_fd_clock with unique_ptr<>.
this unique_ptr<> deletes these managed instances using delete.
these two classes have virtual functions but no virtual
destructor, so the compiler feels nervous when deleting them:
in theory, these pointers could be pointing to instances of derived
classes, and deleting them could lead to a leak.

so to silence the warning and to prevent potential issues, let's
just mark these two classes final.
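
a minimal sketch of the pattern, with hypothetical class names (`pinger_iface`, `test_pinger`) that are not the ones in the test. the class owned by `unique_ptr<>` has virtual functions but no virtual destructor; marking it `final` tells the compiler that no derived object can hide behind the pointer.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical interface with a protected, non-virtual destructor:
// objects are never deleted through a pointer to the interface.
struct pinger_iface {
    virtual int ping() = 0;
protected:
    ~pinger_iface() = default;
};

// `final` rules out further derivation, so deleting a test_pinger*
// always runs the complete destructor.
struct test_pinger final : pinger_iface {
    int ping() override { return 1; }
};

std::unique_ptr<test_pinger> make_pinger() {
    return std::make_unique<test_pinger>();
}

static_assert(std::is_final<test_pinger>::value, "test_pinger must stay final");
```

without `final`, deleting a derived object through a `test_pinger*` would be undefined behavior, which is the case the warning guards against.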

this should address the warning like:

```
In file included from /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:9:
In file included from /home/kefu/dev/scylladb/seastar/include/seastar/core/reactor.hh:24:
In file included from /home/kefu/dev/scylladb/seastar/include/seastar/core/aligned_buffer.hh:24:
In file included from /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/memory:78:
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: error: delete called on non-final 'direct_fd_pinger<int>' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor]
        delete __ptr;
        ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<direct_fd_pinger<int>>::operator()' requested here
          get_deleter()(std::move(__ptr));
          ^
/home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1400:5: note: in instantiation of member function 'std::unique_ptr<direct_fd_pinger<int>>::~unique_ptr' requested here
    ~raft_server() {
    ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: note: in instantiation of member function 'raft_server<ExReg>::~raft_server' requested here
        delete __ptr;
        ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<raft_server<ExReg>>::operator()' requested here
          get_deleter()(std::move(__ptr));
          ^
/home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1704:24: note: in instantiation of member function 'std::unique_ptr<raft_server<ExReg>>::~unique_ptr' requested here
            ._server = nullptr,
                       ^
/home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1742:19: note: in instantiation of member function 'environment<ExReg>::new_node' requested here
        auto id = new_node(first, std::move(cfg));
                  ^
/home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:2113:39: note: in instantiation of member function 'environment<ExReg>::new_server' requested here
        auto leader_id = co_await env.new_server(true);
                                      ^
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15084
2023-08-20 21:26:08 +03:00
Avi Kivity
4db5d8dd56 Merge 'build: cmake: support Coverage and Sanitize build modes' from Kefu Chai
to mirror the build modes supported by `configure.py`.

Closes #15085

* github.com:scylladb/scylladb:
  build: cmake: support Coverage and Sanitize build modes
  build: cmake: error out if specified build type is unknown
2023-08-20 21:25:21 +03:00
Pavel Emelyanov
6bc30f1944 system_keyspace: De-bloat .setup() from messing with system.local
On boot several manipulations with system.local are performed.

1. The host_id value is selected from it with key = local

   If not found, system_keyspace generates a new host_id, inserts the
   new value into the table and returns back

2. The cluster_name is selected from it with key = local

   Then it's system_keyspace that either checks that the name matches
   the one from db::config, or inserts the db::config value into the
   table

3. The row with key = local is updated with various info like versions,
   listen, rpc and bcast addresses, dc, rack, etc. Unconditionally

All three steps are scattered over main, p.1 is called directly, p.2 and
p.3 are executed via system_keyspace::setup() that happens rather late.
Also there's some touch of this table from the cql_test_env startup code.

The proposal is to collect this setup into one place and execute it
early -- as soon as the system.local table is populated. This frees the
system_keyspace code from the logic of selecting host id and cluster
name leaving it to main and keeps it with only select/insert work.

refs: #2795

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #15082
2023-08-20 21:24:31 +03:00
Tomasz Grabiec
1552044615 storage_service, tablets: Fix corrupting tablet metadata on migration concurrent with table drop
Tablet migration may execute a global token metadata barrier before
executing updates of system.tablets. If table is dropped while the
barrier is happening, the updates will bring back rows for migrated
tablets in a table which is no longer there. This will cause tablet
metadata loading to fail with error:

 missing_column (missing column: tablet_count)

Like in this log line:

storage_service - raft topology: topology change coordinator fiber got error raft::stopped_error (Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1206): std::_Nested_exception<std::runtime_error> (Failed to read tablet metadata): missing_column (missing column: tablet_count)")

The fix is to read and execute the updates in a single group0 guard
scope, and move execution of the barrier later. We cannot now generate
updates in the same handle_tablet_migration() step if barrier needs to
be executed, so we reuse the mechanism for two-step stage transition
which we already have for handling of streaming. The next pass will
notice that the barrier is not needed for a given tablet and will
generate the stage update.

Fixes #15061

Closes #15069
2023-08-20 21:17:57 +03:00
Avi Kivity
a4e7f9bed0 docs: cql: split DML page into one page per statement
The DML page is quite long (21 screenfuls on my monitor); split
it into one page per statement to make it more digestible.

The sections that are common to multiple statements are kept
in the main DML page, and references to them are added.

Closes #15053
2023-08-20 17:14:32 +03:00
Kefu Chai
12d6ec5a18 config: respect --log-with-color 1
scylladb overrides some of seastar logging related options with its
own options by applying them with `logging::apply_settings()`. but
we fail to inherit `with_color` from Seastar as we are using the
designated initializer, so the unspecified members are zero initialized.
that's why we always have logging message in black and white even
if scylla is running in a tty and `--log-with-color 1` is specified.

so, to make the debugging life more colorful, let's inherit the option
from Seastar, and apply it when setting logging related options.
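
the zero-initialization rule behind this bug can be sketched with a hypothetical settings struct; the field names below are illustrative and not the Seastar ones. the sketch uses positional aggregate initialization, where omitted trailing members without default member initializers are value-initialized. members omitted from a C++20 designated initializer list are treated the same way.

```cpp
#include <cassert>

// Hypothetical settings struct, for illustration only.
struct log_settings {
    int default_level;
    bool stdout_enabled;
    bool with_color;
};

// Only the first two members are given: `with_color` silently becomes false,
// whatever the caller had configured.
log_settings make_settings_dropping_color(int level) {
    return log_settings{level, true};
}

// The fix in spirit: carry the existing value over explicitly.
log_settings make_settings_keeping_color(int level, bool with_color) {
    return log_settings{level, true, with_color};
}
```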

see also 29e09a3292

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15076
2023-08-20 13:47:43 +03:00
Tomasz Grabiec
bd8bb5d4b1 Merge 'Wire tablet into compaction group' from Raphael "Raph" Carvalho
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.

There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.

```
INFO  2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO  2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf
```

Closes #14863

* github.com:scylladb/scylladb:
  Kill scylla option to configure number of compaction groups
  replica: Wire tablet into compaction group
  token_metadata: Add this_host_id to topology config
  replica: Switch to chunked_vector for storing compaction groups
  replica: Generate group_id for compaction_group on demand
2023-08-18 15:17:17 +02:00
Kefu Chai
9fa0b9b75b build: cmake: support Coverage and Sanitize build modes
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-18 14:17:12 +08:00
Kefu Chai
3c3fb03b01 build: cmake: error out if specified build type is unknown
this should help the developer to understand what build types are
supported if the specified one is unknown.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-18 14:17:12 +08:00
Avi Kivity
1901475598 Merge 'config: mark "experimental" option unused and cleanups' from Kefu Chai
in this series, the "experimental" option is marked `Unused` as it has been marked deprecated for almost 2 years since scylla 4.6. and use `experimental_features` to specify the used experimental features explicitly.

Closes #14948

* github.com:scylladb/scylladb:
  config: remove unused namespace alias
  config: use std::ranges when appropriate
  config: drop "experimental" option
  test: disable 'enable_user_defined_functions' if experimental_features does not include udf
  test: pylib: specify experimental_features explicitly
2023-08-17 20:42:02 +03:00
Kefu Chai
7275b8967c docs: add sstablemetadata to operating-scylla/admin-tools
to note that sstablemetadata is being deprecated and encourage
users to switch over to the native tools.

Fixes #15020
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15040
2023-08-17 18:48:46 +03:00
Avi Kivity
e91256a621 Merge 'build: cmake: fix the build of rpm/deb from submodules' from Kefu Chai
in this series, the build of rpm and deb from submodules is fixed:

1. correct the path of reloc package
2. add the dependency of reloc package to deb/rpm build targets

Closes #15062

* github.com:scylladb/scylladb:
  build: cmake: correct reloc_pkg's path
  build: cmake: build rpm/deb from reloc_pkg
2023-08-17 17:58:49 +03:00
Pavel Emelyanov
3ed5b00ba2 Merge 's3/client: generate config file for tests and cleanups' from Kefu Chai
before this change, object_store/test_basic.py creates a config file
for specifying the object storage settings, and passes the path of this
file as the argument of `--object-storage-config-file` option when
running scylla. we have the same requirement when testing scylla
with minio server, where we launch a minio server and manually
create the config file and feed it to scylla.

to ease the preparation work, let's consolidate by creating the
config file in `minio_server.py`, so it always creates the config
file and puts it in its tempdir. since object_store/test_basic.py
can also run against an S3 bucket, the fixture implemented in
object_store/conftest.py is updated accordingly to reuse the
helper exposed by MinioServer to create the config file when it
is not available.

Closes #15064

* github.com:scylladb/scylladb:
  s3/client: avoid hardwiring env variables names
  s3/client: generate config file for tests
2023-08-17 16:39:23 +03:00
Gleb Natapov
4ffc39d885 cql3: Extend the scope of group0_guard during DDL statement execution
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
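
The control flow described above can be sketched as a plain retry loop. Everything in this sketch is a simplified stand-in: `group0_guard`, `concurrent_modification`, `statement` and `run_statement` are hypothetical types written for illustration, the real code is asynchronous, and the `max_attempts` bound is an assumption that the commit message does not mention.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <stdexcept>

// Hypothetical stand-ins for the guard and the exception type.
struct group0_guard {
    int generation;
};
struct concurrent_modification : std::runtime_error {
    concurrent_modification() : std::runtime_error("concurrent modification") {}
};

struct statement {
    bool alters_schema;
    // Stands for check_access() + validate() + execute() run back to back.
    std::function<void(const std::optional<group0_guard>&)> body;
};

// Take the guard before any of the three steps and retry the whole
// sequence with a fresh guard on a concurrent modification.
// Returns the number of attempts that were needed.
int run_statement(const statement& stmt, int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        std::optional<group0_guard> guard;
        if (stmt.alters_schema) {
            guard = group0_guard{attempt};  // only DDL statements take a guard
        }
        try {
            stmt.body(guard);
            return attempt;
        } catch (const concurrent_modification&) {
            // The old guard is dropped here; the next iteration takes a new one.
        }
    }
    throw std::runtime_error("too many concurrent modifications");
}
```

Placing the loop around all three steps means that access checks and validation are repeated against the fresh schema state on every retry.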

Fixes: #13942

Message-ID: <ZNsynXayKim2XAFr@scylladb.com>
2023-08-17 15:52:48 +03:00
Kefu Chai
6788903fd6 db: config: mark config class final
in 34c3688017, we added a virtual function
to `config_file`, and we allocate and delete a `db::config`
instance through `unique_ptr<>`. this makes the compiler
nervous, as deleting a pointer to an instance of a non-final
class with virtual functions could lead to a leak, if this pointer actually
points to a derived class of this non-final class. so, in order to
silence the warning and to prevent potential problems in future, let's
mark `db::config` final.

the warning from Clang 16 looks like:

```
In file included from /home/kefu/dev/scylladb/test/lib/test_services.cc:10:
In file included from /home/kefu/dev/scylladb/test/lib/test_services.hh:25:
In file included from /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/memory:78:
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: error: delete called on non-final 'db::config' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor]
        delete __ptr;
        ^
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<db::config>::operator()' requested here
          get_deleter()(std::move(__ptr));
          ^
/home/kefu/dev/scylladb/test/lib/test_services.cc:189:16: note: in instantiation of member function 'std::unique_ptr<db::config>::~unique_ptr' requested here
    auto cfg = std::make_unique<db::config>();
               ^
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15071
2023-08-17 13:43:16 +03:00
Kefu Chai
fc6b8d4040 s3/client: avoid hardwiring env variables names
instead of hardwiring the names in multiple places, let's just
keep them in a single place as variables, and reference them by
these variables instead of their values.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-17 16:06:55 +08:00
Kefu Chai
ec7fa3628c s3/client: generate config file for tests
before this change, object_store/test_basic.py creates a config file
for specifying the object storage settings, and passes the path of this
file as the argument of `--object-storage-config-file` option when
running scylla. we have the same requirement when testing scylla
with minio server, where we launch a minio server and manually
create the config file and feed it to scylla.

to ease the preparation work, let's consolidate by creating the
config file in `minio_server.py`, so it always creates the config
file and puts it in its tempdir. since object_store/test_basic.py
can also run against an S3 bucket, the fixture implemented in
object_store/conftest.py is updated accordingly to reuse the
helper exposed by MinioServer to create the config file when it
is not available.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-17 16:06:55 +08:00
Raphael S. Carvalho
b578d6643f Kill scylla option to configure number of compaction groups
The option was introduced to bootstrap the project. It's still
useful for testing, but that translates into maintaining an
additional option and code that will not be really used
outside of testing. A possible option is to later map the
option in boost tests to initial_tablets, which may yield
the same effect for testing.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:53 -03:00
Raphael S. Carvalho
cc60598368 replica: Wire tablet into compaction group
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.

There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.

INFO  2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO  2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf

There's a need for compaction_group_manager, as table will still support
"tabletless" mode, and we don't want to sprinkle ifs here and there,
to support both modes. It's not really a manager (it's not even supposed
to store a state), but I couldn't find a better name.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:53 -03:00
Raphael S. Carvalho
5d1f60439a token_metadata: Add this_host_id to topology config
The motivation is that token_metadata::get_my_id() is not available
early in the bootstrap process, as raft topology is pulled later
than new tables are registered and created, and this node is added
to topology even later.

To allow creation of compaction groups to retrieve "my id" from
token metadata early, initialization will now feed local id
into topology config which is immutable for each node anyway.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:44 -03:00
Piotr Smaroń
34c3688017 db: config: add live_updatable_config_params_changeable_via_cql option
If `live_updatable_config_params_changeable_via_cql` is set to true, configuration parameters defined with `liveness::LiveUpdate` option can be updated in the runtime with CQL, i.e. by updating `system.config` virtual table.
If we don't want any configuration parameter to be changed in the
runtime by updating `system.config` virtual table, this option should be
set to false. This option should be set to false for e.g. cloud users,
who can only perform CQL queries, and should not be able to change
scylla's configuration on the fly.

Current implementation is generic, but has a small drawback - messages
returned to the user may not be fully accurate, consider:
```
cqlsh> UPDATE system.config SET value='2' WHERE name='task_ttl_in_seconds';
WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="option is not live-updateable" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
```
where `task_ttl_in_seconds` has been defined with
`liveness::LiveUpdate`, but because `live_updatable_config_params_changeable_via_cql` is set to
`false` in `scylla.yaml`, `task_ttl_in_seconds` cannot be modified at
runtime by updating the `system.config` virtual table.

Fixes #14355

Closes #14382
2023-08-16 17:56:27 +03:00
Aleksandra Martyniuk
e9d94894f1 compaction: release resources of compaction executors
Before compaction task executors started inheriting from
compaction_task_impl, they were destructed immediately after
compaction finished. Destructors of executors and their
fields performed actions that affected global structures and
statistics and had impact on compaction process.

Currently, task executors are kept in memory much longer, as they
are tracked by task manager. Thus, destructors are not called just
after the compaction, which results in compaction stats not being
updated, which causes e.g. infinite cleanup loop.

Add release_resources() method which is called at the end
of compaction process and does what destructors used to.
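
A rough sketch of the idea, with hypothetical types (`compaction_stats`, `task_executor`, `task_manager`) that only model the lifetime problem. Keeping the destructor as an idempotent fallback is an assumption of this sketch, not something the commit message states.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical global statistics touched by every executor.
struct compaction_stats {
    int active = 0;
};

class task_executor {
    compaction_stats& _stats;
    bool _released = false;
public:
    explicit task_executor(compaction_stats& s) : _stats(s) { ++_stats.active; }

    // Does what the destructor used to do; safe to call more than once.
    void release_resources() {
        if (!_released) {
            assert(_stats.active > 0);
            --_stats.active;
            _released = true;
        }
    }

    // The work itself is elided; the point is the explicit release at the end.
    void run() { release_resources(); }

    ~task_executor() { release_resources(); }
};

// Hypothetical tracker that keeps finished executors alive for later inspection.
struct task_manager {
    std::vector<std::shared_ptr<task_executor>> tracked;
};
```

The statistics are corrected as soon as `run()` finishes, even though the tracker still holds the executor and its destructor has not run.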

Fixes: #14966.
Fixes: #15030.

Closes #15005
2023-08-16 15:51:17 +03:00
Kefu Chai
564522c4a8 s3/test: remove tempdir if log does not exists
we should have used `ignore_errors=True` to ignore
the error. this issue has not popped up, because
we haven't run into the case where the log file
does not exist.

this was a regression introduced by
d4ee84ee1e

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15063
2023-08-16 15:11:00 +03:00
Kefu Chai
32c26624bf build: cmake: correct reloc_pkg's path
before this change, the filename in path of reloc package looks like:
tools-scylla-5.4.0~dev-0.20230816.2eb6dc57297e.noarch.tar.gz
but it should have been:
scylla-tools-5.4.0~dev-0.20230816.2eb6dc57297e.noarch.tar.gz
so, when repackaging the reloc tarball to rpm / deb, the scripts
just fail to find the reloc tarball.

after this change, the filename is corrected to match with the one
generated using `build_reloc.sh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-16 16:15:23 +08:00
Kefu Chai
a19c7fa8d5 build: cmake: build rpm/deb from reloc_pkg
before this change, dist-${name}-rpm and dist-${name}-deb targets
do not depend on the corresponding reloc pkg from which these
prebuilt packages are created. so these scripts fail if the reloc
package does not exist.

to address this problem, the reloc package is added as their
dependencies.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-16 16:15:23 +08:00
Avi Kivity
e8f3b073c3 Merge 'Maintain sstable state explicitly' from Pavel Emelyanov
An sstable can be in one of several states -- normal, quarantined, staging, uploading. Right now this "state" is hard-wired into sstable's path, e.g. quarantined sstable would sit in e.g. /var/lib/data/ks-cf-012345/quarantine/ directory. Respectively, there's a bunch of directory names constexprs in sstables.hh defining each "state". Other than being confusing, this approach doesn't work well with S3 backend. Additionally, there's snapshot subdir that adds to the confusion, because snapshot is not quite a state.

This PR converts "state" from constexpr char* directory names into an enum class and patches the sstable creation, opening and state-changing API to use that enum instead of parsing the path.
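
A minimal sketch of what an explicit state could look like. The enumerator names, the subdirectory strings other than "quarantine", and the `sstable` struct are assumptions made for this illustration and may differ from the actual code.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical enum following the states listed above.
enum class sstable_state { normal, quarantine, staging, upload };

// Only a filesystem-backed storage needs to turn the state into a directory;
// another backend could record the state differently.
std::string state_to_subdir(sstable_state s) {
    switch (s) {
    case sstable_state::normal:     return "";
    case sstable_state::quarantine: return "quarantine";
    case sstable_state::staging:    return "staging";
    case sstable_state::upload:     return "upload";
    }
    throw std::invalid_argument("unknown sstable_state");
}

// Hypothetical sstable that carries its state instead of parsing it from a path.
struct sstable {
    std::string table_dir;
    sstable_state state;

    void change_state(sstable_state to) { state = to; }

    std::string storage_dir() const {
        const std::string sub = state_to_subdir(state);
        return sub.empty() ? table_dir : table_dir + "/" + sub;
    }
};
```

With the state held as a value, callers compare enumerators rather than directory strings, and the mapping to a path lives in one place.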

refs: #13017
refs: #12707

Closes #14152

* github.com:scylladb/scylladb:
  sstable/storage: Make filesystem storage with initial state
  sstable: Maintain state
  sstable: Make .change_state() accept state, not directory string
  sstable: Construct it with state
  sstables_manager: Remove state-less make_sstable()
  table: Make sstables with required state
  test: Make sstables with upload state in some cases
  tools: Make sstables with normal state
  table: Open-code sstables making streaming helpers
  tests: Make sstables with normal state by default
  sstable_directory: Make sstable with required state
  sstable_directory: Construct with state
  distributed_loader: Make sstable with desired state when populating
  distributed_loader: Make sstable with upload state when uploading
  sstable: Introduce state enum
  sstable_directory: Merge verify and g.c. calls
  distributed_loader: Merge verify and gc invocations
  sstable/filesystem: Put underscores to dir members
  sstable/s3: Mark make_s3_object_name() const
  sstable: Remove filename(dir, ...) method
2023-08-15 17:44:06 +03:00
Avi Kivity
5949623e0d Merge 'sstable_set: maintain bytes on disk' from Benny Halevy
and use that in compaction_group, rather than
respective accumulators of its own.
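
A rough sketch of the approach, assuming a hypothetical `sized_set` keyed by name. It is not the sstable_set interface; it only shows why `insert` and `erase` returning a status lets the container keep the running total itself.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical container that tracks the total size of its members.
class sized_set {
    std::map<std::string, uint64_t> _items;  // name -> bytes on disk
    uint64_t _bytes_on_disk = 0;
public:
    // Returns true only if the item was actually added;
    // the total changes only in that case.
    bool insert(const std::string& name, uint64_t bytes) {
        const bool inserted = _items.emplace(name, bytes).second;
        if (inserted) {
            _bytes_on_disk += bytes;
        }
        return inserted;
    }

    // Returns true only if the item was present and removed.
    bool erase(const std::string& name) {
        auto it = _items.find(name);
        if (it == _items.end()) {
            return false;
        }
        assert(_bytes_on_disk >= it->second);
        _bytes_on_disk -= it->second;
        _items.erase(it);
        return true;
    }

    uint64_t bytes_on_disk() const { return _bytes_on_disk; }
};
```

A caller no longer needs a separate accumulator that could drift from the set when an insert or erase turns out to be a no-op.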

This is part of a larger series to make cache updates exception safe.

Refs #14043

Closes #15052

* github.com:scylladb/scylladb:
  sstable_set: maintain total bytes_on_disk
  sstable_set: insert, erase: return status
2023-08-15 17:32:12 +03:00