Commit Graph

37135 Commits

Author SHA1 Message Date
Alexey Novikov
ffd4fcceec Alternator: return full table description on return of DeleteTable
The DeleteTable operation in Alternator should return a TableDescription
object describing the table which has just been deleted, similar to what
DescribeTable returns.

Fixes scylladb#11472

Closes #11628
2023-06-04 21:00:26 +03:00
Israel Fruchter
1ce739b020 Update tools/cqlsh submodule
* tools/cqlsh 8769c4c2...6e1000f1 (5):
  > build: erase uid/gid information from tar archives
  > Add github action to update the dockerhub description
  > cqlsh: Add extension handler for "scylla_encryption_options"
  > requirements.txt: update python-driver==3.26.0
  > Add support for arm64 docker image

Closes #13878
2023-06-04 19:56:52 +03:00
Kefu Chai
3cd9aa1448 build: cmake: build .wat from source files
we compile .wat files from .rs and .c source files since
6d89d718d9.
these .wat are used by test/cql-pytest/test_wasm.py . let's update
the CMake building system accordingly so these .wat files can also
be generated using the "wasm" target. since the ctest system is
not used, this change should allow us to perform this test manually.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14126
2023-06-04 14:55:38 +03:00
Aarav Arora
a12d2d5f16 fix: keyspace spell
Closes #14121
2023-06-04 13:48:43 +03:00
Kefu Chai
421331a20b test.py: consolidate multiple runs of the same test
before this change, when consolidating boost's XML logger files,
we practically just concatenated all the tests' logger files into a single
one. sometimes, we run the tests multiple times, and these runs share
the same TestSuite and TestCase tags. this has two consequences:

1. there is a chance that a test has both successful and failed
   runs. but jenkins' "Test Results" page cannot identify the failed
   run; it just picks a random run when one clicks for the details of
   the run, as it takes the TestCase's name as part of its identifier,
   and we have multiple of them if the argument passed to the --repeat
   option is greater than 1 -- this is the case when we promote the
   "next" branch.
2. the testReport page of Jenkins' xUnit plugin created for the "next"
   job is 3 times as large as the one for the regular "scylla-ci" run,
   as all tests are repeated 3 times. but what we really care about is
   the history of a certain test, not a certain run of it.

in this change, we just pick a representative run of a test if it is
repeated multiple times and add a "Message" tag for including the
summary of the runs. this should address the problems above:

1. the failed tests always stand out so we can always pinpoint them with
   Jenkins's "Test Results" page.
2. the tests are deduped by name.
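
As a rough illustration of the consolidation described above, the sketch below groups runs by test name and keeps one representative outcome plus a summary message. It works on plain `(name, passed)` tuples rather than Boost's XML log, and `consolidate_runs` is a hypothetical name, not the function in test.py.

```python
from collections import defaultdict


def consolidate_runs(runs):
    """Collapse repeated runs of each test into one representative entry.

    `runs` is an iterable of (test_name, passed) tuples, one per run.
    This input shape is an assumption made for the sketch.
    """
    outcomes_by_name = defaultdict(list)
    for name, passed in runs:
        outcomes_by_name[name].append(passed)

    consolidated = {}
    for name, outcomes in outcomes_by_name.items():
        failed = outcomes.count(False)
        # Any failing run makes the representative a failure, so it stands out.
        representative_passed = failed == 0
        message = f"{len(outcomes)} run(s), {failed} failed"
        consolidated[name] = (representative_passed, message)
    return consolidated
```

Each test name then appears once in the report, and a test with at least one failing run is reported as failed.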

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14069
2023-06-04 13:15:46 +03:00
Konstantin Osipov
b39ca97919 consistent_cluster_management: make the default
As per our roll out plan, make consistent_cluster_management (aka Raft
for schema changes) the default going forward. It means all
clusters which upgrade from the previous version and don't have
`consistent_cluster_management` explicitly set in scylla.yaml will begin
upgrading to Raft once all nodes in the cluster have moved to the new
version.

Fixes #13980

Closes #13984
2023-06-02 09:05:09 +02:00
Kamil Braun
34b60ba82b test_tablets: use run_async instead of execute
Don't block the thread: blocking prevents concurrent tests from running
during this time. Use the dedicated `run_async`.

Also to silence `mypy` which complains that `manager.cql` is `Optional`
(so in theory might be `None`, e.g. after `driver_close`), use
`manager.get_cql()`.
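
The effect of blocking the event loop can be seen in a generic asyncio sketch. It does not use the driver's API: `time.sleep` stands in for a synchronous `execute` call and `asyncio.sleep` for an awaited asynchronous query, and all function names here are hypothetical.

```python
import asyncio
import time


async def blocking_query(log, name):
    log.append(f"{name} start")
    time.sleep(0.01)  # synchronous call: the event loop cannot run anything else
    log.append(f"{name} end")


async def awaited_query(log, name):
    log.append(f"{name} start")
    await asyncio.sleep(0.01)  # awaiting yields control back to the event loop
    log.append(f"{name} end")


async def run_two(query):
    """Run two copies of `query` concurrently and return the event order."""
    log = []
    await asyncio.gather(query(log, "a"), query(log, "b"))
    return log


blocking_order = asyncio.run(run_two(blocking_query))
awaited_order = asyncio.run(run_two(awaited_query))
```

In the blocking variant task `a` finishes before `b` even starts, while in the awaited variant both tasks start before either one ends.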

Closes #14109
2023-06-01 18:05:05 +02:00
Kamil Braun
8be69fc3a0 Merge 'Initialize group0 server on boot before allowing incoming requests' from Gleb
The series includes mostly cleanups and one bug fix.

The fix is for the race where messages that need to access group0 server are arriving
before the server is initialized.

* 'gleb/group0-sp-mm-race-v2' of github.com:scylladb/scylla-dev:
  service: raft: fix typo
  service: raft: split off setup_group0_if_exist from setup_group0
  storage_service: do not allow override_decommission flag if consistent cluster management is enabled
  storage_service: fix indentation after the previous patch
  storage_service: co-routinize storage_service::join_cluster() function
  storage_service: do not reload topology from peers table if topology over raft is enabled
  storage_service: optimize debug logging code in case debug log is not enabled
2023-06-01 17:37:58 +02:00
Kamil Braun
297c75c6d8 storage_service: wait for schema agreement during initial boot
In production environments the Scylla boot procedure includes various
sleeps such as 'ring delay' and 'waiting for gossip to settle'. We
disable those sleeps in test.py tests and we'd also like to disable
them, if possible, in dtests.

Unfortunately, disabling the sleeps causes problems with schema: a
bootstrapping node creates its own versions of distributed keyspaces and
tables (such as `system_distributed`) because it doesn't first wait for
gossip to settle, during which it would usually pull existing schemas of
those keyspaces/tables from existing nodes. This may cause schema
disagreement for the whole duration of the bootstrap procedure (the
other nodes don't pull schema from a bootstrapping node; pulls are only
allowed once it becomes NORMAL), which causes the bootstrapping node to
constantly pull schema in attempts to synchronize, which doesn't work
because it's the other nodes which don't have schema mutations, not this
node. Even when the bootstrapping node finishes, the existing nodes
won't automatically pull schema from that node - only once we perform
another schema change a pull will be triggered.

The continuous pulls and the lack of schema synchronization until manual
schema change cause problems in tests. For example we observed the test
timing out in debug mode because bootstrap took too long due to the node
having to perform ~700 schema pulls (it attempts to synchronize schema
on each range repair). There's also potential for permanent schema
divergence, although I haven't seen this yet - in my experiments, once
the existing nodes pull from the new node, schema would always converge.

In any case, the safe and robust solution is to ensure that the
bootstrapping node pulls schema from existing nodes early in the boot
procedure. Then it won't try to create its own versions of the
distributed keyspaces/tables because it'll see they are already present
in the cluster.

In fact there already is `storage_service::wait_for_ring_to_settle`
which is supposed to wait until schema is in agreement before
proceeding.

However, this schema agreement wait relied on an earlier wait at the
beginning of the function - for a node to show up in gossiper
(otherwise, if we're the only node in gossiper, the schema agreement
wait trivially finishes immediately).

Unfortunately, this wait would time out after `ring_delay` and proceed,
even if no other node was observed, instead of throwing an error...

To make it safe, modify the logic so if we time out, we refuse to
bootstrap. To make it work in tests which set `ring_delay` to 0, make it
independent of `ring_delay` - just set the timeout to 5 minutes.

Fixes #14065
Fixes #14073

Closes #14105
2023-06-01 13:24:43 +03:00
Petr Gusev
0415ac3d5f test_secondary_index_collections: change insert/create index order
Secondary index creation is asynchronous, meaning it
takes time for existing data to be reflected within
the index. However, new data added after the
index is created should appear in it immediately.

The test consisted of two parts. The first created
a series of indexes for one table, added
test data to the table, and then ran a series of checks.
In the second part, several new indexes were added to
the same table, and checks were made to make sure that
already existing data would appear in them. This
last part was flaky.

The patch just moves the index creation statements
from the second part to the first.

Fixes: #14076

Closes #14090
2023-05-31 23:30:57 +03:00
Nadav Har'El
0e602159b9 storage_service: avoid excessive delay in wait_for_ring_to_settle()
The function storage_service::wait_for_ring_to_settle() is called when
bootstrapping a new node in an existing cluster, and it's supposed to
wait until the caller has the right schema - to allow the bootstrap
to start (the bootstrap needs to copy all existing tables from other
nodes).

The code of this function mostly checks in-memory structures in the
gossiper and migration manager, and if they aren't ready, sleeps and
tries again (until a timeout of "ring_delay_ms"). Today we sleep a
whole second between each try, but that's excessive - the checks are
very cheap, and we can do them much more often, so we can stop the
loop much closer to when the schema becomes available.

This patch changes the sleep from 1 second to 10 milliseconds.
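
The polling pattern described above can be sketched as follows. This is Python rather than the C++ of the real function, and `wait_until`, `is_ready` and the parameter names are assumptions made for illustration only.

```python
import time


def wait_until(is_ready, timeout_s, poll_interval_s=0.01):
    """Poll `is_ready` until it returns True or `timeout_s` elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while not is_ready():
        if time.monotonic() >= deadline:
            return False
        # The check is cheap, so a short sleep lets the loop exit
        # soon after the condition becomes true.
        time.sleep(poll_interval_s)
    return True
```

With a 1 second interval the caller can overshoot the moment the condition becomes true by up to a second; with 10 milliseconds the overshoot is bounded by roughly that interval.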

The benefit of this patch is not huge - on average I measured about
0.25 seconds saving on adding a node to a cluster. But I don't see
any downside either.

Noticed while looking into Refs #14073

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #14101
2023-05-31 17:49:38 +02:00
Benny Halevy
bda3705974 test/lib: test_reader_conversions: always close reader
read_mutation_from_flat_mutation_reader might throw
so we need to close the reader returned from
ms.make_fragment_v1_stream also on the error
path to avoid the internal error abort when the
reader is destroyed while opened.

Fixes #14098

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #14099
2023-05-31 17:49:38 +02:00
Kamil Braun
8b718db42f Update seastar submodule
* seastar aff87d5b...afe39231 (12):
  > rpc: Fix formatting after previous patches
  > rpc: Introduce server::shutdown()
  > rpc: Wait for server socket to stop before killing conns
  > rpc: Document server::stop() method
  > util: remove unused #include
  > rpc: rpc_types: make `connection_id` a class
  > tests: rpc_test: simple test for connection aborting
  > rpc: introduce `server::abort_connection(connection_id)`
  > treewide: add C++ modules support
  > rpc: remove `connection::_server` field
  > rpc: add `server&` and `connection_id` to `client_info`
  > rpc: rpc_types: move `connection_id` definition before `client_info`
2023-05-31 17:49:38 +02:00
Gleb Natapov
dcfd224e8b service: raft: fix typo
2023-05-31 11:01:33 +03:00
Gleb Natapov
f26179cd27 service: raft: split off setup_group0_if_exist from setup_group0
Currently setup_group0 is responsible for starting the existing group0 on
restart, or for creating a new one and joining the cluster with it during
bootstrap. We want to create the server for the existing group0 earlier,
before we start to accept messages, because some messages may assume that
the server exists already. For that we split creation of the existing group0
server into a separate function and call it on restart before the messaging
service starts accepting messages.

Fixes: #13887
2023-05-31 11:00:41 +03:00
Gleb Natapov
acc035b504 storage_service: do not allow override_decommission flag if consistent cluster management is enabled
If consistent cluster management is enabled it is not possible to
restart a decommissioned node since it will not be part of group0.
2023-05-31 10:40:42 +03:00
Raphael S. Carvalho
23443e0574 compaction: Fix incremental compaction for sstable cleanup
After c7826aa910, sstable runs are cleaned up together.

The procedure which executes cleanup was holding reference to all
input sstables, such that it could later retry the same cleanup
job on failure.

Turns out it was not taking into account that incremental compaction
will exhaust the input set incrementally.

Therefore cleanup is affected by the 100% space overhead.

To fix it, cleanup will now have the input set updated, by removing
the sstables that were already cleaned up. On failure, cleanup
will retry the same job with the remaining sstables that weren't
exhausted by incremental compaction.
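
A minimal sketch of the retry behaviour described above, assuming a hypothetical `clean_one` callback that cleans a single sstable and may raise. The real code is C++ and works on sstable runs; this only shows the idea of shrinking the input set as work completes.

```python
def cleanup_with_retry(sstables, clean_one, max_attempts=3):
    """Clean up each sstable, retrying only the ones not yet processed.

    `clean_one` is a hypothetical callback that cleans a single sstable
    and may raise on failure.
    """
    remaining = list(sstables)
    for attempt in range(1, max_attempts + 1):
        try:
            while remaining:
                clean_one(remaining[0])
                # Drop the entry once it is done so it is neither held
                # nor retried for the rest of the job.
                remaining.pop(0)
            return
        except Exception:
            if attempt == max_attempts:
                raise
```

Because finished inputs are removed immediately, a retry after a failure resumes with the unprocessed sstables only.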

New unit test reproduces the failure, and passes with the fix.

Fixes #14035.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #14038
2023-05-31 06:46:12 +03:00
Pavel Emelyanov
66ccc14fcb scylla-gdb: Add commitlog command
The command prints the segment_manager address, because it's the manager
that is of interest, not the db::commitlog itself. It also prints out all
found segments, just for convenience: segments are in a vector of
shared pointers and it's handy to have object addresses instantly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #14088
2023-05-30 22:55:18 +03:00
Avi Kivity
bb361f41d8 Merge 'RPC: add [[ref]] attribute to heavy parameters' from Gusev Petr
By default `idl-compiler.py` emits code to pass parameters by value. There was an attribute `[[ref]]`, which makes it use `const&`, but it was not used systematically and in many cases parameters were redundantly copied. In this PR, all `verb` directives have been reviewed and the `[[ref]]` attribute has been added where it makes sense.

The parameters [are serialised synchronously](https://github.com/scylladb/seastar/blob/master/include/seastar/rpc/rpc_impl.hh#L471) so there should be no lifetime issues. This was not the case before, but the behaviour changed in [this commit](3942546d41). Now it's not a problem to get an object by reference when using `send_` methods.

Fixes: #12504

Closes #14003

* github.com:scylladb/scylladb:
  tracing::trace_info: pass by ref
  storage_proxy: pass inet_address_vector_replica_set by ref
  raft: add [[ref]] attribute
  repair: add [[ref]] attribute
  forward_request: add [[ref]] attribute
  storage_proxy: paxos:: add [[ref]] attribute
  storage_proxy: read_XXX:: make read_command [[ref]]
  storage_proxy: hint_mutation:: make frozen_mutation [[ref]]
  storage_proxy: mutation:: make frozen_mutation [[ref]]
2023-05-30 16:37:24 +03:00
Kefu Chai
037113f752 reloc: raise if rmtree fails
occasionally, we are observing build failures like:
```
17:20:54  FAILED: build/release/dist/tar/scylla-debuginfo-5.4.0~dev-0.20230522.5b2687e11800.x86_64.tar.gz
17:20:54  dist/debuginfo/scripts/create-relocatable-package.py --mode release 'build/release/dist/tar/scylla-debuginfo-5.4.0~dev-0.20230522.5b2687e11800.x86_64.tar.gz'
17:20:54  Traceback (most recent call last):
17:20:54    File "/jenkins/workspace/scylla-master/scylla-ci/scylla/dist/debuginfo/scripts/create-relocatable-package.py", line 60, in <module>
17:20:54      os.makedirs(f'build/{SCYLLA_DIR}')
17:20:54    File "<frozen os>", line 225, in makedirs
17:20:54  FileExistsError: [Errno 17] File exists: 'build/scylla-debuginfo-package'
```

to understand the root cause better, instead of swallowing the error,
let's raise the exception if it is not caused by a non-existing directory.
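
a minimal sketch of that idea, assuming a hypothetical helper named `remove_tree`; the actual code in the packaging script may be structured differently:

```python
import shutil


def remove_tree(path):
    """Remove `path` recursively; a missing directory is not an error."""
    try:
        shutil.rmtree(path)
    except FileNotFoundError:
        # Nothing to remove: this is the only failure we tolerate.
        pass
    # Any other OSError (permissions, busy files, ...) propagates, so the
    # build log shows why the directory could not be removed.
```

by contrast, `shutil.rmtree(path, ignore_errors=True)` hides every failure, which leaves a later `os.makedirs()` to fail with the less helpful `FileExistsError`.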

a similar change was applied to scripts/create-relocatable-package.py
in a0b8aa9b13, which was correct per se.
but the original intention was to understand the root cause of the
failure when packaging scylla-debuginfo-*.tar.gz, which is created
by the dist/debuginfo/scripts/create-relocatable-package.py.

so, in this change, the change is ported to this script.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14082
2023-05-30 15:39:24 +03:00
Botond Dénes
9a1e5784b0 Merge 'Use table_info in compaction' from Aleksandra Martyniuk
In compaction logs table is identified by {keyspace}.{table_id}.

Instead, table name should be used in run_on_existing_tables
logs. To do so, task manager's compaction tasks use table_info
instead of table_id.

Keyspace argument is copied to run_on_existing_tables
to ensure it's alive.

Closes #13816

* github.com:scylladb/scylladb:
  compaction: use table_info in compaction tasks
  api: move table_info to schema/schema_fwd.hh
2023-05-30 15:10:47 +03:00
Kefu Chai
82cac8e7cf treewide: s/std::source_location/seastar::compat::source_location/
CWG 2631 (https://cplusplus.github.io/CWG/issues/2631.html) reports
an issue on how the default argument is evaluated. this problem is
more obvious when it comes to how `std::source_location::current()`
is evaluated as a default argument. but not all compilers have the
same behavior, see https://godbolt.org/z/PK865KdG4.

notably, clang-15 evaluates the default argument at the callee
site. so we need to check the capability of compiler and fall back
to the one defined by util/source_location-compat.hh if the compiler
suffers from CWG 2631. and clang-16 implemented CWG2631 in
https://reviews.llvm.org/D136554. But unfortunately, this change
was not backported to clang-15.

before switching over to clang-16, to use std::source_location::current()
as the default parameter and get the behavior defined by CWG2631,
we have to use the compatibility layer provided by Seastar. otherwise
we always end up having the source_location at the callee side, which
is not interesting under most circumstances.

so in this change, all places using the idiom of passing
std::source_location::current() as the default parameter are changed
to use seastar::compat::source_location::current(). despite that
we have `#include "seastarx.h"` for opening the seastar namespace,
to disambiguate the "namespace compat" defined somewhere in scylladb,
the fully qualified name of
`seastar::compat::source_location::current()` is used.

see also 09a3c63345, where we used
std::source_location as an alias of std::experimental::source_location
if it was available. but this does not apply to the settings of our
current toolchain, where we have GCC-12 and Clang-15.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14086
2023-05-30 15:10:12 +03:00
Petr Gusev
3a88c7769f tracing::trace_info: pass by ref
sizeof(std::optional<tracing::trace_info>) == 64 bytes,
so it should be more efficient.
2023-05-30 14:32:10 +04:00
Petr Gusev
48600049fc storage_proxy: pass inet_address_vector_replica_set by ref
sizeof(inet_address_vector_replica_set) == 96 bytes and
it has complex move constructor.
2023-05-30 14:04:53 +04:00
Pavel Emelyanov
577cd96da8 scripts: Fix options iteration in open-coredump.sh
When run like 'open-coredump.sh --help' the options parsing loop doesn't
run because $# == 1 and [ $# -gt 1 ] evaluates to false.

The simplest fix is to parse -h|--help on its own as the options parsing
loop assumes that there's core-file argument present.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #14075
2023-05-30 12:25:01 +03:00
Petr Gusev
896e3bb425 raft: add [[ref]] attribute
2023-05-30 13:14:19 +04:00
Petr Gusev
4ff1adaef9 repair: add [[ref]] attribute
2023-05-30 13:14:19 +04:00
Petr Gusev
282d66d15d forward_request: add [[ref]] attribute
2023-05-30 13:14:19 +04:00
Petr Gusev
db4030f792 storage_proxy: paxos:: add [[ref]] attribute
read_command, partition_key and paxos::proposal
are marked with [[ref]]. partition_key contains
dynamic allocations and can be big. proposal
contains frozen_mutation, so it also
contains dynamic allocations.

The call sites are fine, they already pass
by reference.
2023-05-30 13:14:19 +04:00
Petr Gusev
f2cba20945 storage_proxy: read_XXX:: make read_command [[ref]]
We had redundant copies at the call sites of
these methods. Class read_command does not
contain dynamic allocations, but it's quite
big by itself (368 bytes).
2023-05-30 13:14:19 +04:00
Petr Gusev
ffb4e39e40 storage_proxy: hint_mutation:: make frozen_mutation [[ref]]
We had a redundant copy in hint_mutation::apply_remotely.
This frozen_mutation is dynamically allocated and
can be arbitrarily large.
2023-05-30 13:14:19 +04:00
Petr Gusev
5adbb6cde2 storage_proxy: mutation:: make frozen_mutation [[ref]]
We had a redundant copy in receive_mutation_handler
forward_fn callback. This frozen_mutation is
dynamically allocated and can be arbitrarily large.

Fixes: #12504
2023-05-30 13:14:19 +04:00
Tzach Livyatan
e655060429 Remove Ubuntu 18.04 support from 5.2
Ubuntu [18.04 will soon be out of standard support](https://ubuntu.com/blog/18-04-end-of-standard-support), and can be removed from the 5.2 supported list
https://github.com/scylladb/scylla-pkg/issues/3346

Closes #13529
2023-05-30 11:12:17 +03:00
Aleksandra Martyniuk
f48b57e7b9 compaction: use table_info in compaction tasks
Task manager compaction tasks need table names for logs.
Thus, compaction tasks store table infos instead of table ids.

get_table_ids function is deleted as it isn't used anywhere.
2023-05-30 09:58:55 +02:00
Aleksandra Martyniuk
4206139e5a api: move table_info to schema/schema_fwd.hh
table_info is moved from api/storage_service.hh to schema/schema_fwd.hh
so that it could be used in task manager's tasks.
2023-05-30 09:57:21 +02:00
Avi Kivity
ffce6d94fc Merge 'service: storage_proxy: make hint write handlers cancellable' from Kamil Braun
The `view_update_write_response_handler` class, which is a subclass of
`abstract_write_response_handler`, was created for a single purpose:
to make it possible to cancel a handler for a view update write,
which means we stop waiting for a response to the write, timing out
the handler immediately. This was done to solve an issue with node
shutdown hanging because it was waiting for a view update to finish;
view updates were configured with 5 minute timeout. See #3966, #4028.

Now we're having a similar problem with hint updates causing shutdown
to hang in tests (#8079).

`view_update_write_response_handler` implements cancelling by adding
itself to an intrusive list which we then iterate over to timeout each
handler when we shutdown or when gossiper notifies `storage_proxy`
that a node is down.

To make it possible to reuse this algorithm for other handlers, move
the functionality into `abstract_write_response_handler`. We inherit
from `bi::list_base_hook` so it introduces small memory overhead to
each write handler (2 pointers) which was only present for view update
handlers before. But those handlers are already quite large, the
overhead is small compared to their size.
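
The mechanism can be sketched in a few lines of Python, with a plain `set` standing in for the intrusive list. `WriteHandler` and `cancel_all` are hypothetical names for illustration and do not mirror the C++ classes.

```python
class WriteHandler:
    """Toy write handler that registers itself so it can be cancelled."""

    def __init__(self, registry, kind):
        self.kind = kind
        self.state = "pending"
        self._registry = registry
        registry.add(self)  # every handler type registers, not only view updates

    def complete(self):
        self.state = "done"
        self._registry.discard(self)

    def timeout(self):
        self.state = "timed out"
        self._registry.discard(self)


def cancel_all(registry, should_cancel=lambda handler: True):
    """Time out every registered handler accepted by `should_cancel`."""
    for handler in list(registry):  # copy: timeout() mutates the registry
        if should_cancel(handler):
            handler.timeout()
```

Moving the registration into the base class is what lets a caller select hint writes, view update writes, or everything with one predicate.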

Use this new functionality to also cancel hint write handlers when we
shutdown. This fixes #8079.

Closes #14047

* github.com:scylladb/scylladb:
  test: reproducer for hints manager shutdown hang
  test: pylib: ScyllaCluster: generalize config type for `server_add`
  test: pylib: scylla_cluster: add explicit timeout for graceful server stop
  service: storage_proxy: make hint write handlers cancellable
  service: storage_proxy: rename `view_update_handlers_list`
  service: storage_proxy: make it possible to cancel all write handler types
2023-05-30 01:36:50 +03:00
Avi Kivity
27f7cc4032 Revert "Merge 'cql: update permissions when creating/altering a function/keyspace' from Wojciech Mitros"
This reverts commit 52e4edfd5e, reversing
changes made to d2d53fc1db. The associated test
fails with about 10% probability, which blocks other work.

Fixes #13919
Reopens #13747
2023-05-29 23:03:25 +03:00
Botond Dénes
a35758607a Update tools/java submodule
* tools/java eb3c43f8...0cbfeb03 (1):
  > nodetool: add `--primary-replica-only` option to `refresh`
2023-05-29 23:03:25 +03:00
Botond Dénes
fc24685b4d Update tools/jmx submodule
* tools/jmx 1fd23b60...d1077582 (1):
  > Support `--primary-replica-only` option from `nodetool refresh`
2023-05-29 23:03:25 +03:00
Pavel Emelyanov
b0525e20d5 main: Ignore sleep_aborted exception in main
When scylla starts it may go to sleep along the way before the "serving"
message appears. If SIGINT is sent at that time the whole thing unrolls
and the main code ends up catching the sleep_aborted exception, printing
the error in logs and exiting with non-zero code. However, that's not an
error, just the start was interrupted earlier than it was expected by
the stop_signal thing.

fixes: #12898

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #14034
2023-05-29 23:03:25 +03:00
Avi Kivity
2303f08eea utils: logalloc: correct asan_interface.h location
It's a system header, so it deserves angle brackets.

Closes #14036
2023-05-29 23:03:25 +03:00
Benny Halevy
c685ef9e71 partitioned_sstable_set: insert: return early if sst is already in the set
Currently, partitioned_sstable_set::insert may erase a sstable
from the set inadvertently, if an exception is thrown while
(re-)inserting it.

To prevent that, simply return early after detecting that
insertion didn't take place, based on the unordered_set::insert
result.
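
The hazard and the fix can be illustrated with a small Python sketch. `TrackedSet` and its `index_fn` hook are hypothetical; they stand in for a set that maintains a secondary structure whose update may throw.

```python
class TrackedSet:
    """Toy stand-in for a set that also maintains a secondary index."""

    def __init__(self, index_fn):
        self._all = set()
        self._index_fn = index_fn  # hypothetical hook; may raise

    def insert(self, item):
        if item in self._all:
            # Already present: return before the rollback path below
            # could remove an entry this call did not add.
            return False
        self._all.add(item)
        try:
            self._index_fn(item)
        except Exception:
            self._all.discard(item)  # undo only our own insertion
            raise
        return True

    def __contains__(self, item):
        return item in self._all
```

Without the early return, a failed re-insert would run the rollback and drop an item that was legitimately in the set beforehand.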

This issue is theoretical, as there are no known cases
of re-inserting sstables into the partitioned sstable set.

Fixes #14060

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #14061
2023-05-29 23:03:25 +03:00
Aleksandra Martyniuk
24864e39dd compaction: delete unnecessary sequence number incrementations
Task manager's tasks that have a parent task inherit the sequence number
from their parent. Thus they do not need to have a new sequence number
generated as it will be overwritten anyway.

Closes #14045
2023-05-29 23:03:25 +03:00
Kefu Chai
c00f4af5d4 build: cmake: link auth against libcrypt
libxcrypt is used by auth subsystem, for instance, `crypt_r()` provided
by this library is used by passwords.cc. so let's link against it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14030
2023-05-29 23:03:24 +03:00
Benny Halevy
774a10017c backlog_controller: destroy _update_timer before _current_backlog
The _update_timer callback calls adjust() that
depends on _current_backlog and currently, _current_backlog is
destroyed before _update_timer.

This is benign since there are no preemption points in
the destructor, but it's more correct and elegant
to destroy the timer first, before other members it depends on.

Fixes #14056

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #14057
2023-05-29 23:03:24 +03:00
Kefu Chai
a0b8aa9b13 create-relocatable-package.py: raise if rmtree fails
occasionally, we are observing build failures like:
```
17:20:54  FAILED: build/release/dist/tar/scylla-debuginfo-5.4.0~dev-0.20230522.5b2687e11800.x86_64.tar.gz
17:20:54  dist/debuginfo/scripts/create-relocatable-package.py --mode release 'build/release/dist/tar/scylla-debuginfo-5.4.0~dev-0.20230522.5b2687e11800.x86_64.tar.gz'
17:20:54  Traceback (most recent call last):
17:20:54    File "/jenkins/workspace/scylla-master/scylla-ci/scylla/dist/debuginfo/scripts/create-relocatable-package.py", line 60, in <module>
17:20:54      os.makedirs(f'build/{SCYLLA_DIR}')
17:20:54    File "<frozen os>", line 225, in makedirs
17:20:54  FileExistsError: [Errno 17] File exists: 'build/scylla-debuginfo-package'
```

to understand the root cause better, instead of swallowing the error,
let's raise the exception it is not caused by non-existing directory.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13978
2023-05-29 23:03:24 +03:00
Avi Kivity
2cef3350af Merge 'Initialize/destroy ks/cf directories with explicit class methods' from Pavel Emelyanov
This set encapsulates ks/cf directories creation and deletion into keyspace and table classes methods. This is needed to facilitate making the storage initialization storage-type aware in the future. Also this makes the replica/ code less involved in formatting sstables' directory path by hand.

refs: #13020
refs: #12707

Closes #14048

* github.com:scylladb/scylladb:
  keyspace: Introduce init_storage()
  keyspace: Remove column_family_directory()
  table: Introduce destroy_storage()
  table: Simplify init_storage()
  table: Coroutinize init_storage()
  table: Relocate ks.make_directory_for_column_family()
  distributed_loader: Use cf.dir() instead of ks.column_family_directory()
  test: Don't create directory for system tables in cql_test_env
2023-05-29 23:03:24 +03:00
Kefu Chai
55ee0e2724 build: preserve $libs when linking a single testing executable
if we just want to build a single test and scylla executables, we
might want to use `configure.py` like:

./configure.py --mode debug --compiler clang++ --with scylla --with test/boost/database_test

which generates `build.ninja` for us, with following rules:

build $builddir/debug/test/boost/database_test_g: link.debug ... | $builddir/debug/seastar/libseastar.so
$builddir/debug/seastar/libseastar_testing.so
   libs = $seastar_libs_debug $libs -lthrift -lboost_system $seastar_testing_libs_debug
   libs = $seastar_libs_debug

but the last line prevents database_test_g from linking against
the third-party libraries like libabsl, which could have been
pulled in by $libs, because the second assignment expression just
makes the value of `libs` identical to that of `seastar_libs_debug`,
and that variable does not include the libraries which are only
used by scylla. so we could run into link failures with the
`build.ninja` generated with this command line, like:
```
FAILED: build/debug/test/boost/database_test_g
...
ld.lld: error: undefined symbol: seastar::testing::entry_point(int, char**)
>>> referenced by scylla_test_case.hh:22 (./test/lib/scylla_test_case.hh:22)
>>>               build/debug/test/boost/database_test.o:(main)
...
ld.lld: error: undefined symbol: boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)
>>> referenced by database_test.cc:298 (test/boost/database_test.cc:298)
>>>               build/debug/test/boost/database_test.o:(require_exist(seastar::basic_sstring<char, unsigned int, 15u, true> const&, bool))
...
```

with this change, the extra assignment expression is dropped. this
should not cause any regression, as f'$seastar_libs_{mode}' has
been included as a part of `local_libs` before the grand if-then-else
block in the for loop before this `f.write()` statement.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14041
2023-05-29 23:03:24 +03:00
Kefu Chai
74dd6dc185 Revert "test: string_format_test: don't compare std::string with sstring"
This reverts commit 3c54d5ec5e.

The reverted change fixed the FTBFS of the test in question with Clang 16,
which rightly stopped converting the LHS of `"hello" == sstring{"hello"}` to
the type acceptable by the member operator even though we have a
constructor for this conversion, like

class sstring {
public:
  sstring(const char*);
  bool operator==(const sstring&) const;
  bool operator!=(const sstring&) const;
};

because we have an operator!=, as per the draft of C++ standard
https://eel.is/c++draft/over.match.oper#4 :

> A non-template function or function template F named operator==
> is a rewrite target with first operand o unless a search for the
> name operator!= in the scope S from the instantiation context of
> the operator expression finds a function or function template
> that would correspond ([basic.scope.scope]) to F if its name were
> operator==, where S is the scope of the class type of o if F is a
> class member, and the namespace scope of which F is a member
> otherwise.

in 397f4b51c3, the seastar submodule was
updated, and we now have a dedicated overload for the `const char*`
case, so the compiler is able to compile expressions like
`"hello" == sstring{"hello"}` in C++20.

so, in this change, the workaround is reverted.

Closes #14040
2023-05-29 23:03:24 +03:00
Benny Halevy
26705ba6af partitioned_sstable_set: erase empty runs
When erasing a sstable, first check if its run_id
exists in _all_runs; otherwise do nothing in
that respect. Then, if the run becomes empty
when erasing the last sstable (and it could have been
a single-sstable run from the get-go), erase the run
from `_all_runs`.
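
The bookkeeping can be sketched as follows, assuming `all_runs` is a plain mapping from run id to a set of sstables. That layout and the name `erase_sstable` are assumptions for illustration, not the C++ data structure.

```python
def erase_sstable(all_runs, run_id, sstable):
    """Remove `sstable` from its run; drop the run once it is empty.

    `all_runs` maps run_id -> set of sstables (an assumed layout).
    """
    run = all_runs.get(run_id)
    if run is None:
        return  # unknown run: nothing to do for run tracking
    run.discard(sstable)
    if not run:
        # The last sstable of the run is gone, so the run goes too.
        del all_runs[run_id]
```

Erasing the only sstable of a single-sstable run therefore leaves no empty entry behind.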

Fixes #14052

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #14054
2023-05-29 23:03:24 +03:00