The DeleteTable operation in Alternator should return a TableDescription
object describing the table which has just been deleted, similar to what
DescribeTable returns.
Fixes scylladb#11472
Closes #11628
* tools/cqlsh 8769c4c2...6e1000f1 (5):
> build: erase uid/gid information from tar archives
> Add github action to update the dockerhub description
> cqlsh: Add extension handler for "scylla_encryption_options"
> requirements.txt: update python-driver==3.26.0
> Add support for arm64 docker image
Closes #13878
we compile .wat files from .rs and .c source files since
6d89d718d9.
these .wat are used by test/cql-pytest/test_wasm.py . let's update
the CMake building system accordingly so these .wat files can also
be generated using the "wasm" target. since the ctest system is
not used, this change should allow us to perform this test manually.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14126
before this change, when consolidating boost's XML logger files,
we practically just concatenated all the tests' logger files into a single
one. sometimes we run the tests multiple times, and these runs share
the same TestSuite and TestCase tags. this has two consequences:
1. there is a chance that a test has both successful and failed
runs, but jenkins' "Test Results" page cannot identify the failed
run. it just picks a random run when one clicks for the details of
the run, as it takes the TestCase's name as part of its identifier,
and we have multiple of them if the argument passed to the --repeat
option is greater than 1 -- this is the case when we promote the
"next" branch.
2. the testReport page of Jenkins' xUnit plugin created for the "next"
job is 3 times as large as the one for the regular "scylla-ci" run,
as all tests are repeated 3 times. but what we really care about is
the history of a certain test, not a certain run of it.
in this change, we just pick a representative run of a test if it is
repeated multiple times and add a "Message" tag for including the
summary of the runs. this should address the problems above:
1. the failed tests always stand out so we can always pinpoint them with
Jenkins's "Test Results" page.
2. the tests are deduped by their names.
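the selection described above can be sketched roughly as follows. this is
only an illustration, not the code in test.py: the function name, the
(name, passed) tuple format and the wording of the summary message are
assumptions, and so is the rule that a failed run is preferred as the
representative whenever one exists.

```python
from collections import defaultdict


def pick_representative(runs):
    """Collapse repeated runs of the same test into one entry per test name.

    `runs` is a list of (test_name, passed) tuples, one per run (hypothetical
    format). The result maps each name to (passed, message), where `passed`
    is False if any run failed and `message` summarizes all the runs.
    """
    outcomes_by_name = defaultdict(list)
    for name, passed in runs:
        outcomes_by_name[name].append(passed)

    summary = {}
    for name, outcomes in outcomes_by_name.items():
        failed = outcomes.count(False)
        message = f"{len(outcomes)} runs, {failed} failed"
        summary[name] = (failed == 0, message)
    return summary


runs = [("test_a", True), ("test_a", False), ("test_a", True), ("test_b", True)]
summary = pick_representative(runs)
```

with this input, `test_a` is reported once as failed and `test_b` once as
passed, no matter how many times each of them was repeated.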
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14069
As per our roll out plan, make consistent_cluster_management (aka Raft
for schema changes) the default going forward. It means all
clusters which upgrade from the previous version and don't have
`consistent_cluster_management` explicitly set in scylla.yaml will begin
upgrading to Raft once all nodes in the cluster have moved to the new
version.
Fixes #13980
Closes #13984
Don't block the thread which prevents concurrent tests from running
during this time. Use the dedicated `run_async`.
Also to silence `mypy` which complains that `manager.cql` is `Optional`
(so in theory might be `None`, e.g. after `driver_close`), use
`manager.get_cql()`.
Closes #14109
The series includes mostly cleanups and one bug fix.
The fix is for the race where messages that need to access group0 server are arriving
before the server is initialized.
* 'gleb/group0-sp-mm-race-v2' of github.com:scylladb/scylla-dev:
service: raft: fix typo
service: raft: split off setup_group0_if_exist from setup_group0
storage_service: do not allow override_decommission flag if consistent cluster management is enabled
storage_service: fix indentation after the previous patch
storage_service: co-routinize storage_service::join_cluster() function
storage_service: do not reload topology from peers table if topology over raft is enabled
storage_service: optimize debug logging code in case debug log is not enabled
In production environments the Scylla boot procedure includes various
sleeps such as 'ring delay' and 'waiting for gossip to settle'. We
disable those sleeps in test.py tests and we'd also like to disable
them, if possible, in dtests.
Unfortunately, disabling the sleeps causes problems with schema: a
bootstrapping node creates its own versions of distributed keyspaces and
tables (such as `system_distributed`) because it doesn't first wait for
gossip to settle, during which it would usually pull existing schemas of
those keyspaces/tables from existing nodes. This may cause schema
disagreement for the whole duration of the bootstrap procedure (the
other nodes don't pull schema from a bootstrapping node; pulls are only
allowed once it becomes NORMAL), which causes the bootstrapping node to
constantly pull schema in attempts to synchronize, which doesn't work
because it's the other nodes which don't have schema mutations, not this
node. Even when the bootstrapping node finishes, the existing nodes
won't automatically pull schema from that node - only once we perform
another schema change a pull will be triggered.
The continuous pulls and the lack of schema synchronization until manual
schema change cause problems in tests. For example we observed the test
timing out in debug mode because bootstrap took too long due to the node
having to perform ~700 schema pulls (it attempts to synchronize schema
on each range repair). There's also potential for permanent schema
divergence, although I haven't seen this yet - in my experiments, once
the existing nodes pull from the new node, schema would always converge.
In any case, the safe and robust solution is to ensure that the
bootstrapping node pulls schema from existing nodes early in the boot
procedure. Then it won't try to create its own versions of the
distributed keyspaces/tables because it'll see they are already present
in the cluster.
In fact there already is `storage_service::wait_for_ring_to_settle`
which is supposed to wait until schema is in agreement before
proceeding.
However, this schema agreement wait relied on an earlier wait at the
beginning of the function - for a node to show up in gossiper
(otherwise, if we're the only node in gossiper, the schema agreement
wait trivially finishes immediately).
Unfortunately, this wait would time out after `ring_delay` and proceed,
even if no other node was observed, instead of throwing an error...
To make it safe, modify the logic so if we time out, we refuse to
bootstrap. To make it work in tests which set `ring_delay` to 0, make it
independent of `ring_delay` - just set the timeout to 5 minutes.
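The intended behaviour can be sketched in Python as below. This is a
simplified model, not the code in `storage_service`: the function name, the
predicate argument and the polling interval are assumptions made for the
illustration; only the 5 minute timeout and the "refuse to bootstrap on
timeout" rule come from the description above.

```python
import time


def wait_for_other_node(sees_other_node, timeout_s=300.0, poll_s=0.01):
    """Poll until `sees_other_node()` is true, or raise after `timeout_s`.

    The default timeout is the 5 minutes mentioned above and does not depend
    on ring_delay. `poll_s` is an assumed polling interval.
    """
    deadline = time.monotonic() + timeout_s
    while not sees_other_node():
        if time.monotonic() >= deadline:
            # Proceeding here would let the node create its own versions of
            # the distributed tables, so fail instead of continuing.
            raise TimeoutError("no other node seen in gossip; refusing to bootstrap")
        time.sleep(poll_s)
```

The important difference from the old behaviour is in the timeout branch:
it raises instead of silently falling through to the schema agreement wait.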
Fixes #14065
Fixes #14073
Closes #14105
Secondary index creation is asynchronous, meaning it
takes time for existing data to be reflected within
the index. However, new data added after the
index is created should appear in it immediately.
The test consisted of two parts. The first created
a series of indexes for one table, added
test data to the table, and then ran a series of checks.
In the second part, several new indexes were added to
the same table, and checks were made to make sure that
already existing data would appear in them. This
last part was flaky.
The patch just moves the index creation statements
from the second part to the first.
Fixes: #14076
Closes #14090
The function storage_service::wait_for_ring_to_settle() is called when
bootstrapping a new node in an existing cluster, and it's supposed to
wait until the caller has the right schema - to allow the bootstrap
to start (the bootstrap needs to copy all existing tables from other
nodes).
The code of this function mostly checks in-memory structures in the
gossiper and migration manager, and if they aren't ready, sleeps and
tries again (until a timeout of "ring_delay_ms"). Today we sleep a
whole second between each try, but that's excessive - the checks are
very cheap, and we can do them much more often, so we can stop the
loop much closer to when the schema becomes available.
This patch changes the sleep from 1 second to 10 milliseconds.
The benefit of this patch is not huge - on average I measured about
0.25 seconds saving on adding a node to a cluster. But I don't see
any downside either.
Noticed while looking into Refs #14073
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #14101
read_mutation_from_flat_mutation_reader might throw
so we need to close the reader returned from
ms.make_fragment_v1_stream also on the error
path to avoid the internal error abort when the
reader is destroyed while opened.
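The pattern can be modelled in Python as below. This is only an analogy for
the C++ code: the `Reader` class and `read_mutation` helper are hypothetical
stand-ins, and the real fix deals with futures rather than a `finally` block.
What it shows is that the reader is closed on both the success path and the
error path.

```python
class Reader:
    """Hypothetical stand-in for a reader that must be closed before destruction."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def read_mutation(reader, read_fn):
    # read_fn may throw, so close the reader on the error path as well.
    try:
        return read_fn(reader)
    finally:
        reader.close()


def failing_read(reader):
    raise RuntimeError("injected read failure")


reader = Reader()
try:
    read_mutation(reader, failing_read)
except RuntimeError:
    pass  # the error still propagates, but the reader was closed first
```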
Fixes #14098
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #14099
Currently setup_group0 is responsible for starting the existing group0 on
restart, or for creating a new one and joining the cluster with it during
bootstrap. We want to create the server for the existing group0 earlier,
before we start to accept messages, because some messages may assume that
the server exists already. For that we split creation of the existing group0
server into a separate function and call it on restart before the messaging
service starts accepting messages.
Fixes: #13887
After c7826aa910, sstable runs are cleaned up together.
The procedure which executes cleanup was holding reference to all
input sstables, such that it could later retry the same cleanup
job on failure.
Turns out it was not taking into account that incremental compaction
will exhaust the input set incrementally.
Therefore cleanup is affected by the 100% space overhead.
To fix it, cleanup will now have the input set updated, by removing
the sstables that were already cleaned up. On failure, cleanup
will retry the same job with the remaining sstables that weren't
exhausted by incremental compaction.
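The idea can be sketched in Python as below. It is a simplified model, not
the compaction manager code: the function names, the per-sstable callback
and the use of `RuntimeError` as the failure type are assumptions made for
the illustration.

```python
def cleanup_with_retry(sstables, clean_one, max_attempts=3):
    """Clean up all sstables, retrying on failure with what is left.

    Each sstable is dropped from `pending` as soon as it has been cleaned,
    so no reference to an exhausted input is kept around, and a retry only
    sees the sstables that were not cleaned yet.
    """
    pending = list(sstables)
    for attempt in range(max_attempts):
        try:
            while pending:
                clean_one(pending[0])
                pending.pop(0)  # release the input once it is exhausted
            return
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise
```

If cleaning sstable "b" fails once in the input ["a", "b", "c"], the retry
resumes from "b" and never revisits "a".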
New unit test reproduces the failure, and passes with the fix.
Fixes #14035.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes #14038
The command prints the segment_manager address, because it's the manager
that's of interest, not the db::commitlog itself. It also prints out all
found segments for convenience -- segments are in a vector of
shared pointers and it's handy to have object addresses instantly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #14088
By default `idl-compiler.py` emits code to pass parameters by value. There was an attribute `[[ref]]`, which makes it use `const&`, but it was not used systematically and in many cases parameters were redundantly copied. In this PR, all `verb` directives have been reviewed and the `[[ref]]` attribute has been added where it makes sense.
The parameters [are serialised synchronously](https://github.com/scylladb/seastar/blob/master/include/seastar/rpc/rpc_impl.hh#L471) so there should be no lifetime issues. This was not the case before, but the behaviour changed in [this commit](3942546d41). Now it's not a problem to get an object by reference when using `send_` methods.
Fixes: #12504
Closes #14003
* github.com:scylladb/scylladb:
tracing::trace_info: pass by ref
storage_proxy: pass inet_address_vector_replica_set by ref
raft: add [[ref]] attribute
repair: add [[ref]] attribute
forward_request: add [[ref]] attribute
storage_proxy: paxos:: add [[ref]] attribute
storage_proxy: read_XXX:: make read_command [[ref]]
storage_proxy: hint_mutation:: make frozen_mutation [[ref]]
storage_proxy: mutation:: make frozen_mutation [[ref]]
occasionally, we are observing build failures like:
```
17:20:54 FAILED: build/release/dist/tar/scylla-debuginfo-5.4.0~dev-0.20230522.5b2687e11800.x86_64.tar.gz
17:20:54 dist/debuginfo/scripts/create-relocatable-package.py --mode release 'build/release/dist/tar/scylla-debuginfo-5.4.0~dev-0.20230522.5b2687e11800.x86_64.tar.gz'
17:20:54 Traceback (most recent call last):
17:20:54 File "/jenkins/workspace/scylla-master/scylla-ci/scylla/dist/debuginfo/scripts/create-relocatable-package.py", line 60, in <module>
17:20:54 os.makedirs(f'build/{SCYLLA_DIR}')
17:20:54 File "<frozen os>", line 225, in makedirs
17:20:54 FileExistsError: [Errno 17] File exists: 'build/scylla-debuginfo-package'
```
to understand the root cause better, instead of swallowing the error,
let's raise the exception if it is not caused by a non-existing directory.
a similar change was applied to scripts/create-relocatable-package.py
in a0b8aa9b13, which was correct per se.
but the original intention was to understand the root cause of the
failure when packaging scylla-debuginfo-*.tar.gz, which is created
by the dist/debuginfo/scripts/create-relocatable-package.py.
so, in this change, the change is ported to this script.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14082
In compaction logs, a table is identified by {keyspace}.{table_id}.
Instead, the table name should be used in run_on_existing_tables
logs. To do so, task manager's compaction tasks use table_info
instead of table_id.
The keyspace argument is copied to run_on_existing_tables
to ensure it stays alive.
Closes #13816
* github.com:scylladb/scylladb:
compaction: use table_info in compaction tasks
api: move table_info to schema/schema_fwd.hh
CWG 2631 (https://cplusplus.github.io/CWG/issues/2631.html) reports
an issue on how the default argument is evaluated. this problem is
more obvious when it comes to how `std::source_location::current()`
is evaluated as a default argument. but not all compilers have the
same behavior, see https://godbolt.org/z/PK865KdG4.
notably, clang-15 evaluates the default argument at the callee
site. so we need to check the capability of the compiler and fall back
to the one defined by util/source_location-compat.hh if the compiler
suffers from CWG 2631. clang-16 implemented CWG2631 in
https://reviews.llvm.org/D136554, but unfortunately, this change
was not backported to clang-15.
before switching over to clang-16, to use std::source_location::current()
as the default parameter and get the behavior defined by CWG2631,
we have to use the compatibility layer provided by Seastar. otherwise
we always end up having the source_location at the callee side, which
is not interesting under most circumstances.
so in this change, all places using the idiom of passing
std::source_location::current() as the default parameter are changed
to use seastar::compat::source_location::current(). despite that
we have `#include "seastarx.h"` for opening the seastar namespace,
to disambiguate the "namespace compat" defined somewhere in scylladb,
the fully qualified name of
`seastar::compat::source_location::current()` is used.
see also 09a3c63345, where we used
std::source_location as an alias of std::experimental::source_location
if it was available. but this does not apply to the settings of our
current toolchain, where we have GCC-12 and Clang-15.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14086
When run like 'open-coredump.sh --help' the options parsing loop doesn't
run because $# == 1 and [ $# -gt 1 ] evaluates to false.
The simplest fix is to parse -h|--help on its own as the options parsing
loop assumes that there's a core-file argument present.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #14075
read_command, partition_key and paxos::proposal
are marked with [[ref]]. partition_key contains
dynamic allocations and can be big. proposal
contains frozen_mutation, so it also
contains dynamic allocations.
The call sites are fine, they already pass
by reference.
We had redundant copies at the call sites of
these methods. Class read_command does not
contain dynamic allocations, but it's quite
big by itself (368 bytes).
We had a redundant copy in the receive_mutation_handler
forward_fn callback. This frozen_mutation is
dynamically allocated and can be arbitrarily large.
Fixes: #12504
Task manager compaction tasks need table names for logs.
Thus, compaction tasks store table infos instead of table ids.
get_table_ids function is deleted as it isn't used anywhere.
The `view_update_write_response_handler` class, which is a subclass of
`abstract_write_response_handler`, was created for a single purpose:
to make it possible to cancel a handler for a view update write,
which means we stop waiting for a response to the write, timing out
the handler immediately. This was done to solve issue with node
shutdown hanging because it was waiting for a view update to finish;
view updates were configured with 5 minute timeout. See #3966, #4028.
Now we're having a similar problem with hint updates causing shutdown
to hang in tests (#8079).
`view_update_write_response_handler` implements cancelling by adding
itself to an intrusive list which we then iterate over to timeout each
handler when we shutdown or when gossiper notifies `storage_proxy`
that a node is down.
To make it possible to reuse this algorithm for other handlers, move
the functionality into `abstract_write_response_handler`. We inherit
from `bi::list_base_hook` so it introduces small memory overhead to
each write handler (2 pointers) which was only present for view update
handlers before. But those handlers are already quite large, the
overhead is small compared to their size.
Use this new functionality to also cancel hint write handlers when we
shut down. This fixes #8079.
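The mechanism can be modelled in Python as below. This is a rough sketch and
not the `storage_proxy` code: the class names, the `kind` field and the
predicate passed to `cancel_all` are assumptions, and a plain Python list
stands in for the intrusive list.

```python
class HandlerRegistry:
    """Stands in for the intrusive list of cancellable write handlers."""

    def __init__(self):
        self._handlers = []

    def register(self, handler):
        self._handlers.append(handler)

    def cancel_all(self, should_cancel=lambda handler: True):
        # Iterate over a copy: timing out a handler unlinks it from the list.
        for handler in list(self._handlers):
            if should_cancel(handler):
                handler.timed_out = True
                self._handlers.remove(handler)

    def __len__(self):
        return len(self._handlers)


class WriteHandler:
    """Hypothetical base handler: every handler type links itself in."""

    def __init__(self, registry, kind):
        self.kind = kind  # e.g. "view_update", "hint", "regular"
        self.timed_out = False
        registry.register(self)


registry = HandlerRegistry()
view = WriteHandler(registry, "view_update")
hint = WriteHandler(registry, "hint")
regular = WriteHandler(registry, "regular")

# On shutdown: stop waiting for view update and hint writes.
registry.cancel_all(lambda h: h.kind in ("view_update", "hint"))
```

Because the hook lives in the base handler, choosing which handler types to
cancel becomes a matter of the predicate rather than of the class hierarchy.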
Closes #14047
* github.com:scylladb/scylladb:
test: reproducer for hints manager shutdown hang
test: pylib: ScyllaCluster: generalize config type for `server_add`
test: pylib: scylla_cluster: add explicit timeout for graceful server stop
service: storage_proxy: make hint write handlers cancellable
service: storage_proxy: rename `view_update_handlers_list`
service: storage_proxy: make it possible to cancel all write handler types
This reverts commit 52e4edfd5e, reversing
changes made to d2d53fc1db. The associated test
fails with about 10% probability, which blocks other work.
Fixes #13919
Reopens #13747
When scylla starts it may go to sleep along the way before the "serving"
message appears. If SIGINT is sent at that time the whole thing unrolls
and the main code ends up catching the sleep_aborted exception, printing
the error in logs and exiting with non-zero code. However, that's not an
error, just the start was interrupted earlier than it was expected by
the stop_signal thing.
fixes: #12898
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #14034
Currently, partitioned_sstable_set::insert may erase a sstable
from the set inadvertently, if an exception is thrown while
(re-)inserting it.
To prevent that, simply return early after detecting that
insertion didn't take place, based on the unordered_set::insert
result.
This issue is theoretical, as there are no known cases
of re-inserting sstables into the partitioned sstable set.
Fixes #14060
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #14061
Task manager's tasks that have a parent task inherit the sequence number
from their parent. Thus they do not need to have a new sequence number
generated, as it would be overwritten anyway.
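A minimal Python sketch of the resulting rule is shown below. The names and
the dict-based task representation are hypothetical and are not the task
manager's API; the sketch only shows that a fresh number is drawn for
top-level tasks and that children consume none.

```python
import itertools

# Hypothetical global sequence generator, starting at 1.
_next_sequence_number = itertools.count(1)


def make_task(parent=None):
    """Create a task; children reuse the parent's sequence number."""
    if parent is not None:
        sequence_number = parent["sequence_number"]
    else:
        sequence_number = next(_next_sequence_number)
    return {"sequence_number": sequence_number, "parent": parent}


root = make_task()
child = make_task(parent=root)
other_root = make_task()
```

Here `child` shares the number of `root`, and `other_root` gets the very
next number, because creating `child` did not draw one from the generator.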
Closes #14045
libxcrypt is used by auth subsystem, for instance, `crypt_r()` provided
by this library is used by passwords.cc. so let's link against it.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14030
The _update_timer callback calls adjust() that
depends on _current_backlog and currently, _current_backlog is
destroyed before _update_timer.
This is benign since there are no preemption points in
the destructor, but it's more correct and elegant
to destroy the timer first, before other members it depends on.
Fixes #14056
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #14057
occasionally, we are observing build failures like:
```
17:20:54 FAILED: build/release/dist/tar/scylla-debuginfo-5.4.0~dev-0.20230522.5b2687e11800.x86_64.tar.gz
17:20:54 dist/debuginfo/scripts/create-relocatable-package.py --mode release 'build/release/dist/tar/scylla-debuginfo-5.4.0~dev-0.20230522.5b2687e11800.x86_64.tar.gz'
17:20:54 Traceback (most recent call last):
17:20:54 File "/jenkins/workspace/scylla-master/scylla-ci/scylla/dist/debuginfo/scripts/create-relocatable-package.py", line 60, in <module>
17:20:54 os.makedirs(f'build/{SCYLLA_DIR}')
17:20:54 File "<frozen os>", line 225, in makedirs
17:20:54 FileExistsError: [Errno 17] File exists: 'build/scylla-debuginfo-package'
```
to understand the root cause better, instead of swallowing the error,
let's raise the exception if it is not caused by a non-existing directory.
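the idea can be sketched as below. this is not the script itself: the helper
name is hypothetical, and it assumes that the error being swallowed came from
removing a stale directory before `os.makedirs()` is called. only a missing
directory is tolerated; any other failure (permissions, a busy file, ...)
propagates, so the real cause shows up in the build log.

```python
import os
import shutil
import tempfile


def recreate_dir(path):
    """Remove `path` if it exists, then create it empty."""
    try:
        shutil.rmtree(path)
    except FileNotFoundError:
        # Nothing to remove: this is the only error we expect and ignore.
        pass
    os.makedirs(path)


# Usage: the second call has to remove the stale directory first.
base = tempfile.mkdtemp()
target = os.path.join(base, "scylla-debuginfo-package")
recreate_dir(target)
open(os.path.join(target, "stale"), "w").close()
recreate_dir(target)
remaining = os.listdir(target)
shutil.rmtree(base)
```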
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #13978
This set encapsulates ks/cf directories creation and deletion into keyspace and table classes methods. This is needed to facilitate making the storage initialization storage-type aware in the future. Also this makes the replica/ code less involved in formatting sstables' directory path by hand.
refs: #13020
refs: #12707
Closes #14048
* github.com:scylladb/scylladb:
keyspace: Introduce init_storage()
keyspace: Remove column_family_directory()
table: Introduce destroy_storage()
table: Simplify init_storage()
table: Coroutinize init_storage()
table: Relocate ks.make_directory_for_column_family()
distributed_loader: Use cf.dir() instead of ks.column_family_directory()
test: Don't create directory for system tables in cql_test_env
if we just want to build a single test and scylla executables, we
might want to use `configure.py` like:
    ./configure.py --mode debug --compiler clang++ --with scylla --with test/boost/database_test
which generates `build.ninja` for us, with following rules:
    build $builddir/debug/test/boost/database_test_g: link.debug ... | $builddir/debug/seastar/libseastar.so
        $builddir/debug/seastar/libseastar_testing.so
      libs = $seastar_libs_debug $libs -lthrift -lboost_system $seastar_testing_libs_debug
      libs = $seastar_libs_debug
but the last line prevents database_test_g from linking against
the third-party libraries like libabsl, which could have been
pulled in by $libs, because the second assignment expression just
makes the value of `libs` identical to that of `seastar_libs_debug`,
and that library does not include the libraries which are only
used by scylla. so we could run into link failure with the
`build.ninja` generated with this command line, like:
```
FAILED: build/debug/test/boost/database_test_g
...
ld.lld: error: undefined symbol: seastar::testing::entry_point(int, char**)
>>> referenced by scylla_test_case.hh:22 (./test/lib/scylla_test_case.hh:22)
>>> build/debug/test/boost/database_test.o:(main)
...
ld.lld: error: undefined symbol: boost::unit_test::unit_test_log_t::set_checkpoint(boost::unit_test::basic_cstring<char const>, unsigned long, boost::unit_test::basic_cstring<char const>)
>>> referenced by database_test.cc:298 (test/boost/database_test.cc:298)
>>> build/debug/test/boost/database_test.o:(require_exist(seastar::basic_sstring<char, unsigned int, 15u, true> const&, bool))
...
```
with this change, the extra assignment expression is dropped. this
should not cause any regression, as f'$seastar_libs_{mode}' has
been included as a part of `local_libs` before the grand if-then-else
block in the for loop before this `f.write()` statement.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14041
This reverts commit 3c54d5ec5e.
The reverted change fixed the FTBFS of the test in question with Clang 16,
which rightly stopped converting the LHS of `"hello" == sstring{"hello"}` to
the type acceptable by the member operator even if we have a
constructor for this conversion, like
    class sstring {
    public:
        sstring(const char*);
        bool operator==(const sstring&) const;
        bool operator!=(const sstring&) const;
    };
because we have an operator!=, as per the draft of C++ standard
https://eel.is/c++draft/over.match.oper#4 :
> A non-template function or function template F named operator==
> is a rewrite target with first operand o unless a search for the
> name operator!= in the scope S from the instantiation context of
> the operator expression finds a function or function template
> that would correspond ([basic.scope.scope]) to F if its name were
> operator==, where S is the scope of the class type of o if F is a
> class member, and the namespace scope of which F is a member
> otherwise.
in 397f4b51c3, the seastar submodule was
updated, and it now has a dedicated overload for the `const char*`
case. so the compiler is able to compile expressions like
`"hello" == sstring{"hello"}` in C++20 now.
so, in this change, the workaround is reverted.
Closes #14040
When erasing an sstable, first check if its run_id
exists in _all_runs; if it does not, do nothing in
that respect. Then, if the run becomes empty
when erasing the last sstable (and it could have been
a single-sstable run from the get-go), erase the run
from `_all_runs`.
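The bookkeeping can be sketched in Python as below. It is a simplified
model, not the sstable set code: a dict of sets stands in for `_all_runs`,
and the function name and arguments are assumptions made for the
illustration.

```python
def erase_sstable(all_runs, run_id, sstable):
    """Remove `sstable` from its run and drop the run once it is empty.

    `all_runs` maps a run id to the set of sstables in that run.
    """
    run = all_runs.get(run_id)
    if run is None:
        # Unknown run id: nothing to do for the run bookkeeping.
        return
    run.discard(sstable)
    if not run:
        # That was the last sstable of the run, possibly its only one.
        del all_runs[run_id]


all_runs = {"run-1": {"sst-a", "sst-b"}, "run-2": {"sst-c"}}
erase_sstable(all_runs, "run-2", "sst-c")    # single-sstable run disappears
erase_sstable(all_runs, "run-1", "sst-a")    # run-1 keeps sst-b
erase_sstable(all_runs, "run-9", "sst-z")    # unknown run id is ignored
```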
Fixes #14052
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #14054