Task manager tasks should be created with make_task method since
it properly sets information about child-parent relationship
between tasks. Though, sometimes we may want to keep additional
task data in classes inheriting from task_manager::task::impl.
Doing it with existing make_task method makes it impossible since
implementation objects are created internally.
The commit adds a new make_task that allows to provide a task
implementation pointer created by caller. All the fields except
for the one connected with children and parent should be set before.
parent_data struct contains info that is common for each task,
not only in parent-child relationship context. To use it this way
without confusion, its name is changed to task_info.
In order to be able to widely and comfortably use task_info,
it is moved from tasks/task_manager.hh to tasks/types.hh
and slightly extended.
It is convenient to create many different tasks implementations
representing more and more specific parts of the operation in
a module. Presenting all of them through the api makes it cumbersome
for user to navigate and track, though.
Flag internal is added to task_manager::task::impl so that the tasks
could be filtered before they are sent to user.
After calling filter_for_query() the extra_replica to speculate to may
be left default-initialized which is :0 ipv6 address. Later below this
address is used as-is to check if it belongs to the same DC or not which
is not nice, as :0 is not an address of any existing endpoint.
Recent move of dc/rack data onto topology made this place reveal itself
by emitting the internal error due to :0 not being present on the
topology's collection of endpoints. Prior to this move the dc filter
would count :0 as belonging to "default_dc" datacenter which may or may
not match with the dc of the local node.
The fix is to explicitly tell set extra_replica from unset one.
fixes: #11825
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#11833
On most of the software distribution tar.gz, it has sub-directory to contain
everything, to prevent extract contents to current directory.
We should follow this style on our unified package too.
To do this we need to increment relocatable package version to '3.0'.
Fixes#8349Closes#8867
We added UUID device file existance check on #11399, we expect UUID
device file is created before checking, and we wait for the creation by
"udevadm settle" after "mkfs.xfs".
However, we actually getting error which says UUID device file missing,
it probably means "udevadm settle" doesn't guarantee the device file created,
on some condition.
To avoid the error, use var-lib-scylla.mount to wait for UUID device
file is ready, and run the file existance check when the service is
failed.
Fixes#11617Closes#11666
Since the end bound is exclusive, the end position should be
before_key(), not after_key().
Affects only tests, as far as I know, only there we can get an end
bound which is a clustering row position.
Would cause failures once row cache is switched to v2 representation
because of violated assumptions about positions.
Introduced in 76ee3f029cCloses#11823
We should scan all sstables in the table directory and its
subdirectories to determine the highest sstable version and generation
before using it for creating new sstables (via reshard or reshape).
Otherwise, the generations of new sstables created when populating staging (via reshard or reshape) may collide with generations in the base directory, leading to https://github.com/scylladb/scylladb/issues/11789
Refs scylladb/scylladb#11789
Fixes scylladb/scylladb#11793
Closes#11795
* github.com:scylladb/scylladb:
distributed_loader: populate_column_family: reindent
distributed_loader: coroutinize populate_column_family
distributed_loader: table_population_metadata: start: reindent
distributed_loader: table_population_metadata: coroutinize start_subdir
distributed_loader: table_population_metadata: start_subdir: reindent
distributed_loader: pre-load all sstables metadata for table before populating it
As described in issue #11801, we saw in Alternator when a GSI has both partition and sort keys which were non-key attributes in the base, cases where updating the GSI-sort-key attribute to the same value it already had caused the entire GSI row to be deleted.
In this series fix this bug (it was a bug in our materialized views implementation) and add a reproducing test (plus a few more tests for similar situations which worked before the patch, and continue to work after it).
Fixes#11801Closes#11808
* github.com:scylladb/scylladb:
test/alternator: add test for issue 11801
MV: fix handling of view update which reassign the same key value
materialized views: inline used-once and confusing function, replace_entry()
The storage_service::stop() calls repair_service::abort_repair_node_ops() but at that time the sharded<repair_service> is already stopped and call .local() on it just crashes.
The suggested fix is to remove explicit storage_service -> repair_service kick. Instead, the repair_infos generated for the sake of node-ops are subscribed on the node_ops_meta_data's abort source and abort themselves automatically.
fixes: #10284Closes#11797
* github.com:scylladb/scylladb:
repair: Remove ops_uuid
repair: Remove abort_repair_node_ops() altogether
repair: Subscribe on node_ops_info::as abortion
repair: Keep abort source on node_ops_info
repair: Pass node_ops_info arg to do_sync_data_using_repair()
repair: Mark repair_info::abort() noexcept
node_ops: Remove _aborted bit
node_ops: Simplify construction of node_ops_metadata
main: Fix message about repair service starting
The test wants to see that no allocations larger than 128k are present,
but sets the warning threshold to exactly 128k. Due to an off-by-one in
Seastar, this went unnoticed. However, now that the off-by-one in Seastar
is fixed [1], this test starts to fail.
Fix by setting the warning threshold to 128k + 1.
[1] 429efb5086Closes#11817
The vector(initializer_list<T>) constructor copies the T since
initializer_list is read-only. Move the mutation instead.
This happens to fix a use-after-return on clang 15 on aarch64.
I'm fairly sure that's a miscompile, but the fix is worthwhile
regardless.
Closes#11818
This PR adds some unit tests for the `expr::evaluate()` function.
At first I wanted to add the unit tests as part of #11658, but their size grew and grew, until I decided that they deserve their own pull request.
I found a few places where I think it would be better to behave in a different way, but nothing serious.
Closes#11815
* github.com:scylladb/scylladb:
test/boost: move expr_test_utils.hh to .hh and .cc in test/lib
cql3: expr: Add unit tests for bind_variable validation of collections
cql3: expr: Add test for subscripted list and map
cql3: expr: Add test for usertype_constructor
cql3: expr: Add test for tuple_constructor
cql3: expr: Add tests for evaluation of collection constructors
cql3: expr: Add tests for evaluation of column_values and bind_variables
cql3: expr: Add constant evaluation tests
test/boost: Add expr_test_utils.hh
cql3: Add ostream operator for raw_value
cql3: add is_empty_value() to raw_value and raw_value_view
expr_test_utils.hh was a header file with helper methods for
expression tests. All functions were inline, because I didn't
know how to create and link a .cc file in test/boost.
Now the header is split into expr_test_utils.hh and expr_test_utils.cc
and moved to test/lib, which is designed to keep this kind of files.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
"
Snitch was the junction of several services' deps because it was the
holder of endpoint->dc/rack mappings. Now this information is all on
topology object, so snitch can be finally made main-local
"
* 'br-deglobalize-snitch' of https://github.com/xemul/scylla:
code: Deglobalize snitch
tests: Get local reference on global snitch instance once
gossiper: Pass current snitch name into checker
snitch: Add sharded<snitch_ptr> arg to reset_snitch()
api: Move update_snitch endpoint
api: Use local snitch reference
api: Unset snitch endpoints on stop
storage_service: Keep local snitch reference
system_keyspace: Don't use global snitch instance
snitch: Add const snitch_ptr::operator->()
These cannot be meaningfully define for a vector value like resources.
To prevent instinctive misuse, remove them. Operator bool is replaced
with `non_zero()` which hopefully better expresses what to expected.
The comparison operator is just removed and inlined into its own user,
which actually help said user's readability.
Closes#11813
evaluating a bind variable should validate
collection values.
Test that bound collection values are validated,
even in case of a nested collection.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Test that evaluate(tuple_constructor) works
as expected.
It was necessary to implement a custom function
for serializing tuples, because some tests
require the tuple to contain unset_value
or an empty value, which is impossible
to express using the exisiting code.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Test that evaluate(collection_constructor) works as expected.
Added a bunch of utility methods for creating
collection values to expr_test_utils.hh.
I was forced to write custom serialization of
collections. I tried to use data_value,
but it doesn't allow to express unset_value
and empty values.
The custom serialization isnt actually used
in this specific commit, but it's needed
in the following ones.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
All uses of snitch not have their own local referece. The global
instance can now be replaced with the one living in main (and tests)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Some tests actively use global snitch instance. This patch makes each
test get a local reference and use it everywhere. Next patch will
replace global instance with local one
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Gossiper makes sure local snitch name is the same as the one of other
nodes in the ring. It now gets global snitch to get the name, this patch
passes the name as an argument, because the caller (storage_service) has
snitch instance local reference
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method replaces snitch instance on the existing sharded<snitch_ptr>
and the "existing" is nowadays the global instance. This patch changes
it to use local reference passed from API code
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's now living in storage_service.cc, but non-global snitch is
available in endpoint_snitch.cc so move the endpoint handler there
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The snitch/name endpoint needs snitch instance to get the name from.
Also the storage_service/reset_snitch endpoint will also need snitch
instance to call reset on.
This patch carries local snitch reference all thw way through API setup
and patches the get_name() call. The reset_snitch() will come in the
next patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Some time soon snitch API handlers will operate on local snitch
reference capture, so those need to be unset before the target local
variable variable goes away
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Storage service uses snitch in several places:
- boot
- snitch-reconfigured subscription
- preferred IP reconnection
At this point it's worth adding storage_service->snitch explicit
dependency and patch the above to use local reference
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are two places to patch: .start() and .setup() and both only need
snitch to get local dc/rack from, nothing more. Thus both can live with
the explicit argument for now
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Until now, authentication in alternator served only two purposes:
- refusing clients without proper credentials
- printing user information with logs
After this series, this user information is passed to lower layers, which also means that users are capable of attaching service levels to roles, and this service level configuration will be effective with alternator requests.
tests: manually by adding more debug logs and inspecting that per-service-level timeout value was properly applied for an authenticated alternator user
Fixes#11379Closes#11380
* github.com:scylladb/scylladb:
alternator: propagate authenticated user in client state
client_state: add internal constructor with auth_service
alternator: pass auth_service and sl_controller to server
This reverts commit df8e1da8b2, reversing
changes made to 4ff204c028. It causes
a crash in debug mode on aarch64 (likely a coroutine miscompile).
Fixes#11809.
The return from DescribeTable which describes GSIs and LSIs is missing
the Projection field. We do not yet support all the settings Projection
(see #5036), but the default which we support is ALL, and DescribeTable
should return that in its description.
Fixes#11470Closes#11693
To prevent stalls due to large number of tables.
Fixes scylladb/scylladb#11574
Closes#11689
* github.com:scylladb/scylladb:
schema_tables: merge_tables_and_views reindent
schema_tables: limit paralellism
Allowing to change the total or initial resources the semaphore has. After calling `set_resources()` the semaphore will look like as if it was created with the specified amount of resources when created.
Use the new method in `replica::database::revert_initial_system_read_concurrency_boost()` so it doesn't lead to strange semaphore diagnostics output. Currently the system semaphore has 90/100 count units when there are no reads against it, which has led to some confusion.
I also plan on using the new facility in enterprise.
Closes#11772
* github.com:scylladb/scylladb:
replica/database: revert initial boost to system semaphore with set_resources()
reader_concurrency_semaphore: add set_resources()
When moving a SSTable from staging to base dir, we reused the generation
under the assumption that no SSTable in base dir uses that same
generation. But that's not always true.
When reshaping staging dir, reshape compaction can pick a generation
taken by a SSTable in base dir. That's because staging dir is populated
first and it doesn't have awareness of generations in base dir yet.
When that happens, view building will fail to move SSTable in staging
which shares the same generation as another in base dir.
We could have played with order of population, populating base dir
first than staging dir, but the fragility wouldn't be gone. Not
future proof at all.
We can easily make this safe by picking a new generation for the SSTable
being moved from staging, making sure no clash will ever happen.
Fixes#11789.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#11790
This patch adds a test reproducing issue #11801, and confirming that
the previous patch fixed it. Before the previous patch, the test passed
on DynamoDB but failed on Alternator.
The patch also adds four more passing tests which demonstrate that
issue #11801 only happened in the very specific case where:
1. A GSI has two key attributes which weren't key attributes in the
base, and
2. An update sets the second of those attributes to the same value
which it already had.
This bug was originally discovered and explained by @fee-mendes.
Refs #11801.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
We should scan all sstables in the table directory and its
subdirectories to determine the highest sstable version and generation
before using it for creating new sstables (via reshard or reshape).
Fixesscylladb/scylladb#11793
Note: table_population_metadata::start_subdir is called
in a seastar thread to facilitate backporting to old versions
that do not support coroutines yet.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>