When users create a table using the Alternator API, they can decide if the billing is PROVISIONED of PAY_PER_REQUEST.
If the billing is set to PROVISIONED, they need to set the ProvisionedThroughput ReadCapacityUnits (RCU) and WriteCapacityUnits (WCU).
This series adds support for getting and setting the ProvisionedThroughput. The values will be stored as table extension tags.
Following how TTL is stored within the Alternator, we will use ```system:rcu_attribute``` and ```system:wcu_attribute``` for the labels.
The series adds a test that sets ProvisionedThroughput and validates that it gets the value back. It was tested with both Alternator and AWS.
This series is part of the effort to monitor, limit, and bill Alternator operations.
New code, no need to backport.
Closesscylladb/scylladb#20056
* github.com:scylladb/scylladb:
docs/alternator/compatibility.md: explain the consumed capacity provisioned
Add test/alternator/test_provisioned_throughput.py
test/alternator/util.py: Allow override BillingMode
alternator/executor.cc: Store ProvisionedThroughput
Before 17f4a151ce the node was marked as
been replaced in join_group0 state, before it actually joins the group0,
so by the time it actually joins and starts transferring snapshot/log no
traffic is sent to it. The commit changed this to mark the node as
being replaced after the snapshot/log is already transferred so we can
get the traffic to the node while it sill did not caught up with a
leader and this may causes problems since the state is not complete.
Mark the node as being replaced earlier, but still add the new node to
the topology later as the commit above intended.
Fixes: scylladb/scylladb#20629
Need to be backported since this is a regression
Closesscylladb/scylladb#20743
* github.com:scylladb/scylladb:
test: amend test_replace_reuse_ip test to check that there is no stale writes after snapshot transfer starts
topology coordinator:: mark node as being replaced earlier
topology coordinator: do metadata barrier before calling finish_accepting_node() during replace
with this parameter, "backup" API can backup the given table, this
enables it to be a drop-in replacement of existing rclone API used by
scylla manager.
Fixes https://github.com/scylladb/scylladb/issues/20636
---
this change is a part of the efforts to bring the native backup/restore to scylla, no need to backprt.
Closesscylladb/scylladb#20661
* github.com:scylladb/scylladb:
backup_task: fix the indent
treewide: add "table" parameter to "backup" API
Currently, node ops tasks type is retrieved from topology_request
without any change. Use respective node operation name instead.
Closesscylladb/scylladb#20671
The test performs consecutive schema changes in RECOVERY mode. The
second change relies on the first. However the driver might route the
changes to different servers and we don't have group 0 to guarantee
linearizability. We must rely on the first change coordinator to push
the schema mutations to other servers before returning, but that only
happens when it sees other servers as alive when doing the schema
change. It wasn't guaranteed in the test. Fix this.
Fixesscylladb/scylladb#20791
Should be backported to all branches containing this test to reduce
flakiness.
Closesscylladb/scylladb#20792
with this parameter, "backup" API can backup the given table, this
enables it to be a drop-in replacement of existing rclone API used by
scylla manager.
in this change:
* api/storage_service: add "table" parameter to "backup" API.
* snapshot_ctl: compose the full path of the snapshot directory in
`snapshot_ctl::start_backup`. since we have all the information
for composing the snapshot directory, and what the `backup_task_impl`
class is interested is but the snapshot directory, we just pass
the path to it instead the individual components of the directory.
* backup_task_impl: instead of scan the whole keyspace recursively,
only scan the specified snapshot directory.
Fixesscylladb/scylladb#20636
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Auth has been managed via Raft since Scylla 6.0. Restoring data
following the usual procedure (1) is error-prone and so a safer
method must have been designed and implemented. That's what
happens in this PR.
We want to extend `DESC SCHEMA` by auth and service levels
to provide a safe way to backup and restore those two components.
To realize that, we change the meaning of `DESC SCHEMA WITH INTERNALS`
and add a new "tier": `DESC SCHEMA WITH INTERNALS AND PASSWORDS`.
* `DESC SCHEMA` -- no change, i.e. the statement describes the current
schema items such as keyspaces, tables, views, UDTs, etc.
* `DESC SCHEMA WITH INTERNALS` -- does the same as the previous tier
and also describes auth and service levels. No information about
passwords is returned.
* `DESC SCHEMA WITH INTERNALS AND PASSWORDS` -- does the same
as the previous tier and also includes information about the salted
hashes corresponding to the passwords of roles.
To restore existing roles, we extend the `CREATE ROLE` statement
by allowing to use the option `WITH SALTED HASH = '[...]'`.
---
Implementation strategy:
* Add missing things/adjust existing ones that will be used later.
* Implement creating a role with salted hash.
* Add tests for creating a role with salted hash.
* Prepare for implementing describe functionality of auth and service levels.
* Implement describe functionality for elements of auth and service levels.
* Extend the grammar.
* Add tests for describe auth and service levels.
* Add/update documentation.
---
(1): https://opensource.docs.scylladb.com/stable/operating-scylla/procedures/backup-restore/restore.html
In case the link stops working, restoring a schema was realised
by managing raw files on disk.
Fixesscylladb/scylladb#18750Fixesscylladb/scylladb#18751Fixesscylladb/scylladb#20711Closesscylladb/scylladb#20168
* github.com:scylladb/scylladb:
docs: Update user documentation for backup and restore
docs/dev: Add documentation for DESC SCHEMA
test: Add tests for describing auth and service levels
cql3/functions/user_function: Remove newline character before and after UDF body
cql3: Implement DESCRIBE SCHEMA WITH INTERNALS AND PASSWORDS
auth: Implement describing auth
auth/authenticator: Add member functions for querying password hash
service/qos/service_level_controller: Describe service levels
data_dictionary: Remove keyspace_element.hh
treewide: Start using new overloads of describe
treewide: Fix indentation in describe functions
treewide: Return create statement optionally in describe functions
treewide: Add new describe overloads to implementations of data_dictionary::keyspace_element
treewide: Start using schema::ks_name() instead of schema::keyspace_name()
cql3: Refactor `description`
cql3: Move description to dedicated files
test: Add tests for `CREATE ROLE WITH SALTED HASH`
cql3/statements: Restrict CREATE ROLE WITH SALTED HASH
auth: Allow for creating roles with SALTED HASH
types: Introduce a function `cql3_type_name_without_frozen()`
cql3/util: Accept std::string_view rather than const sstring&
To fix a race between split and repair here c1de4859d8, a new sstable
generated during streaming can be split before being attached to the sstable
set. That's to prevent an unsplit sstable from reaching the set after the
tablet map is resized.
So we can think this split is an extension of the sstable writer. A failure
during split means the new sstable won't be added. Also, the duration of split
is also adding to the time erm is held. For example, repair writer will only
release its erm once the split sstable is added into the set.
This single-sstable split is going through run_custom_job(), which serializes
with other maintenance tasks. That was a terrible decision, since the split may
have to wait for ongoing maintenance task to finish, which means holding erm
for longer. Additionally, if split monitor decides to run split on the entire
compaction group, it can cause single-sstable split to be aborted since the
former wants to select all sstables, propagating a failure to the streaming
writer.
That results in new sstable being leaked and may cause problems on restart,
since the underlying tablet may have moved elsewhere or multiple splits may
have happened. We have some fragility today in cleaning up leaked sstables on
streaming failure, but this single-sstable split made it worse since the
failure can happen during normal operation, when there's e.g. no I/O error.
It makes sense to kill run_custom_job() usage, since the single-sstable split
is offline and an extension of sstable writing, therefore it makes no sense to
serialize with maintenance tasks. It must also inherit the sched group of the
process writing the new sstable. The inheritance happens today, but is fragile.
Fixes#20626.
Closesscylladb/scylladb#20737
* github.com:scylladb/scylladb:
tablet: Fix single-sstable split when attaching new unsplit sstables
replica: Fix tablet split execute after restart
In the current scenario, We check if a node being removed is normal
on the node initiating the removenode request. However, we don't have a
similar check on the topology coordinator. The node being removed could be
normal when we initiate the request, but it doesn't have to be normal when
the topology coordinator starts handling the request.
For example, the topology coordinator could have removed this node while handling
another removenode request that was added to the request queue earlier.
This commit intends to fix this issue by adding more checks in the enqueuing phase
and return errors for duplicate requests for node removal.
This PR fixes a bug. Hence we need to backport it.
Fixes: scylladb/scylladb#20271Closesscylladb/scylladb#20500
We add tests verifying the following features work correctly:
* describing auth: roles, role grants, granting permissions on
resources,
* describing service levels: creating them and attaching to roles.
Add a README.md in test/boost, giving a short introduction to what this
directory is and what kind of tests it contains, and how to run individual
tests.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#20550
This patch doesn't yet change how schema merging works but it prepares the ground for it by simplifying the code and separating merging logic into its own unit.
It consists of:
- minor cleanups of unused code
- moving code into separate file
- simplifying merge_keyspaces code
More detailed explanation in per commit messages.
Relates scylladb/scylladb#19153Closesscylladb/scylladb#19687
* github.com:scylladb/scylladb:
db: schema_applier: simplify merge_keyspaces function
db: schema_applier: remove unnecessary read in merge_keyspaces
db: schema_tables: move scylla specific code into create keyspace function
db: move schema merging code into a separate unit
db: schema_tables: export some schema management functions
replica: remove unused table_selector forward declaration
db: remove unused flush arg from do_merge_schema func
db: remove unused read_arg_values function
Change order of functions: firstly remount, then change ownership for
cgroup. It was not failing before because with privileged mode, it will
mount cgroups as RW, but it's better to have this check if behavior will
change.
Closesscylladb/scylladb#20676
The test configures write timeout to much smaller value to make the test
run faster since for some writes sleep is inserted to hit the timeout,
but it makes aarch64 debug flaky since timeout happens when it should
not because of a natural slowness.
Fixesscylladb/scylladb#20515Closesscylladb/scylladb#20744
This one is aimed at giving tests the ability to call private methods of class sstable. Some of the wrappers in the test class wrap public methods and can be removed.
Closesscylladb/scylladb#20614
* github.com:scylladb/scylladb:
test: Remove sstables::test::binary_search()
test: Remove sstables::test::move_summary()
test: Remove sstables::test::read_toc()
test: Remove sstables::test::get_summary()
test: Remove sstables::test::get_statistics()
test: Remove sstables::test::data_read()
It's mostly self containted and it's easier to
maintain reasonably sized files. Also splitting
better shows boundaries between schema and
schema merging code.
This PR addresses multiple issues with alternator batch metrics:
1. Rename the metrics to scylla_alternator_batch_item_count with op=BatchGetItem/BatchWriteItem
2. The batch size calculation was wrong and didn't count all items in the batch.
3. Add a test to validate that the metrics values increase by the correct value (not just increase). This also requires an addition to the testing to validate ops of different metrics and an exact value change.
Needs backporting to allow the monitoring to use the correct metrics names.
Fixes#20571Closesscylladb/scylladb#20646
* github.com:scylladb/scylladb:
alternator:test_metrics test metrics for batch item count
alternator:test_metrics Add validating the increased value
alternator: Fix item counting in batch operations
Alterntor rename batch item count metrics
This commit addresses an issue where accessing the raw pointer of the
schema instance within `table::_schema` using `table.schema._p` was
unreliable.
before this change, `_p` was of type `lw_shared_ptr_counter_base*`, a
type-erased smart pointer, preventing direct casting to the underlying
schema pointer. but we still cast it to `schema*` anyway. this led to
a gdb.MemoryError when dereferencing the deduced pointer:
but the type of `_p` is `lw_shared_ptr_counter_base*`, which is a
type erased smart pointer, and it cannot be casted directly to the
under pointer pointing to a `schema` instance. this results in:
```
Traceback (most recent call last):
File "/home/avi/scylla/test/scylla_gdb/../../scylla-gdb.py", line 5554, in invoke
self.print_key_type(seastar_lw_shared_ptr(schema['_clustering_key_type']).get().dereference(), 'clustering')
File "/home/avi/scylla/test/scylla_gdb/../../scylla-gdb.py", line 5533, in print_key_type
key_type = seastar_shared_ptr(key_type).get().dereference()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gdb.MemoryError: Cannot access memory at address 0x4000079656b0078
```
when we are dereferencing the raw pointer deduced this way.
in this change,
* we use the wrapper of `seastar_lw_shared_ptr` to safely obtain the
raw pointer.
* reenable this test previously disabled by 3d781c4f
tested using
```console
$ SCYLLA=/home/kefu/dev/scylladb/master/build/release/scylla \
test/scylla_gdb/run -o junit_suite_name=scylla_gdb test_misc.py::test_schema
```
on an up-to-date fedora 40 installation.
Refs 3d781c4fFixesscylladb/scylladb#20741
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20746
Move all of the blatantly restriction-related expression utilities
to statement_restrictions.cc.
Some are so blatant as to include the word "restriction" in their name.
Others are just so specialized that they cannot be used for anything else.
The motivation is that further refactoring will be simplified if it can
happen within the same module, as there will not be a need to prove
it has no effect elsewhere.
Most of the declarations are made non-public (in .cc file) to limit
proliferation. A few are needed for tests or in select_statement.cc
and so are kept public.
Other than that, the only changes are namespace qualifications and
removal of a now-duplicate definition ("inclusive").
Closesscylladb/scylladb#20732
* tools/java e505a6d3bb...5b0e274f12 (1):
> Merge 'build.xml: install and use java-11 when building' from Kefu Chai
Updates to clang 18.1.8 + LLVM patch to match Fedora 40.
New optimized clang build generated and stored in
https://devpkg.scylladb.com/clang/clang-18.1.8-x86_64.tar.gzhttps://devpkg.scylladb.com/clang/clang-18.1.8-aarch64.tar.gz
Due to the loss of the jmx submodule, we no longer install java-11-openjdk.
We add it in install-dependencies.sh here to compensate, pending a better
solution.
tools/java submodule updated to remove build failure where Java 8
was selected instead of Java 11.
The scylla_gdb test suite was disabled due to a regression in gdb 15,
which is brought in by the toolchain update [1].
[1] https://github.com/scylladb/scylladb/issues/20741.
let's assume there are 2 nodes, n1, n2. n1 is the coordinator.
1) n1 emits split
2) n1 and n2 complete split work
3) n1 becomes aware all replicas are ready for split
4) n2 restarts, but places split sstable into main group[1]
5) n1 executes split
6) n2 handles split completion, but see the main group is not empty
[1]: During split, main group should only contain unsplit sstables.
If all sstables are split, main must be empty.
This is a result of replica not setting storage group to split mode on restart
(using tablet map) and therefore sstables are incorrectly placed on main group.
The fix is about looking at tablet map and setting group to split mode before
sstables are populated into it.
Refs #20626.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
We continue removing `data_dictionary::keyspace_element`.
In this commit, we start using the overloads returning
`cql3::description` in places where the methods specified
by `data_dictionary::keyspace_element` were used.
We're going to remove the interface `data_dictionary::keyspace_element`.
As `schema::keyspace_name()` is an implementation of one of the methods
specified by that interface, we replace its uses by `schema::ks_name()`.
`schema::keyspace_name()` was an alias for it, so no semantic change
has occured.
The introduced function returns the actual name
of the type represented by `abstract_type`.
It circumvents name processing like wrapping a type
within `frozen<>` or using Cassandra's syntax.
We add the function to be able to describe UDFs
in the upcoming commits that require that their
arguments not be `frozen<>`.
We also test the implementation.
For the benefit of running test.py inside CI, we recently added to
test/cql-pytest and test/alternator the knowledge of which "Scylla mode"
(--mode) and "run number" is running (--run_id), although these concepts
are alien to these two test frameworks (remember that those test frameworks
can also run tests against unknown versions of Scylla or even our competitors'
implementations).
One unfortunate result of this change is that now if you run a test by
using pytest directly (or test/*/run) instead of test.py, for example:
$ cd test/alternator
$ pytest --aws test_item.py::test_basic_string_put_and_get
The test's success or failure reports the ugly name
test_item.py::test_basic_string_put_and_get.no_mode.1
This unnecessary "no_mode.1" come from the the default values for --mode
and --run_id, respectively. But there is no reason for these silly
defaults. In this patch we change these defaults to None, and when they
are None, they aren't tacked onto the test's name.
This patch shouldn't affect running tests through test.py, because
test.py always sets the --mode and --run_id options, and doesn't leave
them as the default.
Fixes#20512
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#20513
Allow to specify service level used in select statement `SELECT ... USING SERVICE LEVEL sl_name`.
In OSS, this only affects statement's timeout.
In case both service level and timeout are specified `SELECT ... USING SERVICE LEVEL sl_name AND TIMEOUT 1h`, the timeout has higher priority as statement's timeout.
Fixesscylladb/scylladb#18471Closesscylladb/scylladb#20523
* github.com:scylladb/scylladb:
test/cql-pytest: add test for `SELECT ... USING SERVICE LEVEL`
cql3/Cql.g: extend grammar to allow `SELECT ... USING SERVICE LEVEL`
cql3/statements/select_statement: use service level timeout
cql3/attributes: add service level name field
qos/service_level_controller: add method to check if service level exists in cache
The statement_restrictions class started life in the object-oriented style - an
object that interacts with its environment via mutators and is observed via
observers.
This is however not suitable for its objective: to analyze the WHERE clause,
select a query plan, and partition the WHERE clause atoms to the various
parts demanded by the query plan (read_command and filters). Furthermore,
the object oriented style makes it hard to work with as you can only call some
observers after the related mutators were called.
Fix this by transforming the code info a more functional style: we call
a function that returns an immutable statement_restrictions object that
can only be observed. This makes it easier to further change in the future,
as changes will not have to consider interaction with the environment.
No backport as this is a refactoring
Closesscylladb/scylladb#20672
* github.com:scylladb/scylladb:
cql3: statement_restrictions: use functional style
cql3: statement_restrictions: calculate the index only once
cql3: statement_restrictions: make it a const object
use `seastar::handle_signal()` instead of `reactor::handle_signal()`.
in a recent change in seastar (c3e826ad1197f2610138f3bcfaeb0b458f8fb799),
the later was marked as deprecated in favor of the former, so let's
use the recommended API.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20695
instead grouping tests with different parameters, let's parameterize
them using `BOOST_DATA_TEST_CASE()`, simpler this way. and the tests
can be more structured.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20697
During a query execution, the query can be re-bounced to another shard if the requested data is located there. Previous implementation assumed that the shard cannot be changed after first re-bounce, however with the introduction of Tablets, data could be migrated to another shard after the query was already re-bounced, causing a failure of the query execution. To avoid this issue, the query is re-bounced as needed until it is executed on the correct shard.
Fixes#15465Closesscylladb/scylladb#20493
* github.com:scylladb/scylladb:
cql_server: Add a test for multiple query msg rebounces.
cql_server::connection: process: rebounce msg if needed
cql_server::connection: process: co-routinize connection::process_on_shard
cql_server: connection: process: fixup indentation
cql_server: connection: process_on_shard: drop permit parameter
transport: server: pass bounce_to_shard as foreign shared ptr
cql_server: connection: process: add template concept for process_fn
cql_server: move process_fn_return_type to class definition
* Also dump diagnostics when a read times out while active (not queued).
* Add the "Trigger permit" line, containing the details of the permit which caused the diagnostics dump (by e.g. timing out).
* Add the "Identified bottleneck(s)" line, containing the identified bottlenecks which lead to permits being queued. This line is missing if no such bottleneck can be identified.
* Document the new features, as well as the stat dump, which was added some time ago.
Example of the new dump format:
```
INFO 2024-09-12 08:09:48,046 [shard 0:main] reader_concurrency_semaphore - Semaphore reader_concurrency_semaphore_dump_reader_diganostics with 8/10 count and 106192275/32768 memory resources: timed out, dumping permit diagnostics:
Trigger permit: count=0, memory=0, table=ks.tbl0, operation=mutation-query, state=waiting_for_admission
Identified bottleneck(s): memory
permits count memory table/operation/state
3 2 26M *.*/push-view-updates-2/active
3 2 16M ks.tbl1/push-view-updates-1/active
1 1 15M ks.tbl2/push-view-updates-1/active
1 0 13M ks.tbl1/multishard-mutation-query/active
1 0 12M ks.tbl0/push-view-updates-1/active
1 1 10M ks.tbl3/push-view-updates-2/active
1 1 6060K ks.tbl3/multishard-mutation-query/active
2 1 1930K ks.tbl0/push-view-updates-2/active
1 0 1216K ks.tbl0/multishard-mutation-query/active
6 0 0B ks.tbl1/shard-reader/waiting_for_admission
3 0 0B *.*/data-query/waiting_for_admission
9 0 0B ks.tbl0/mutation-query/waiting_for_admission
2 0 0B ks.tbl2/shard-reader/waiting_for_admission
4 0 0B ks.tbl0/shard-reader/waiting_for_admission
9 0 0B ks.tbl0/data-query/waiting_for_admission
7 0 0B ks.tbl3/mutation-query/waiting_for_admission
5 0 0B ks.tbl1/mutation-query/waiting_for_admission
2 0 0B ks.tbl2/mutation-query/waiting_for_admission
8 0 0B ks.tbl1/data-query/waiting_for_admission
1 0 0B *.*/mutation-query/waiting_for_admission
26 0 0B permits omitted for brevity
96 8 101M total
Stats:
permit_based_evictions: 0
time_based_evictions: 0
inactive_reads: 0
total_successful_reads: 0
total_failed_reads: 0
total_reads_shed_due_to_overload: 0
total_reads_killed_due_to_kill_limit: 0
reads_admitted: 1
reads_enqueued_for_admission: 82
reads_enqueued_for_memory: 0
reads_admitted_immediately: 1
reads_queued_because_ready_list: 0
reads_queued_because_need_cpu_permits: 82
reads_queued_because_memory_resources: 0
reads_queued_because_count_resources: 0
reads_queued_with_eviction: 0
total_permits: 97
current_permits: 96
need_cpu_permits: 0
awaits_permits: 0
disk_reads: 0
sstables_read: 0
```
Fixes: https://github.com/scylladb/scylladb/issues/19535
Improvement, no backport needed.
Closesscylladb/scylladb#20545
* github.com:scylladb/scylladb:
docs/dev/reader-concurrency-semaphore.md: update the documentation on diagnostics dumps
test/boost/reader_concurrency_semaphore_test: test the new diagnostics functionality
reader_concurrency_semaphore: add bottleneck self-diagnosis to diagnosis dump
reader_concurrency_semaphore: include trigger permit in diagnostic dump
reader_concurrency_semaphore: propagate permit to do_dump_reader_permit_diagnostics()
reader_concurrency_semaphore: use consistent exception type for timeout
reader_concurrency_semaphore: dump diagnostics when non-waiting reader times out
The main goal of this PR is to fix a bug (#20619) in the alternator_enforce_authorization=false setting - which didn't do its job (i.e, _don't_ check permissions) when authorization is configured in CQL but not wanted in Alternator.
The series also a few smaller bugs in the code that were discovered while debugging the main issue:
1. A potential use-after-free (that didn't seem to hit us in practice) is fixed.
2. A confusing error message (that was also reported in #20619) is improved.
3. Make the alternator_enforce_authorization live-updatable. There was no reason why it shouldn't be, and as this series needs to make this flag available to more code, let's just do it properly and assume the flag is live-updatable.
Because the RBAC feature has not been backported to any open-source branches, neither should these fixes. But if some private branch received a backport of the RBAC feature, it should get these fixes too.
Fixes#20619.
Closesscylladb/scylladb#20640
* github.com:scylladb/scylladb:
alternator: make alternator_enforce_authorization live-updateable
alternator: fix alternator_enforce_authorization=false
alternator: improve error message when unauthenticated
alternator: avoid use-after-free in RBAC
* seastar ec5da7a6...69f88e2f (38):
> build: s/Sanitizers_COMPILER_OPTIONS/Sanitizers_COMPILE_OPTIONS
> test: Update httpd test with request/reply body writing sugar
> http: Add sugar to request and response body writers
> utils: Add util::write_to_stream() helper
> seastar-addr2line: adjust llvm termination regex
> README.md: add Crimson project
> rpc: conditionally use fmt::runtime() based on SEASTAR_LOGGER_COMPILE_TIME_FMT
> build: check the combination of Sanitizers
> tls: clear session ticket before releasing
> print: remove dead code
> doc/lambda-coroutine-fiasco: reword for better readability
> rpc: fix compilation error caused by fmt::runtime()
> tutorial: explain the use case of rethrow_exception and coroutine::exception
> reactor: print more informative error when io_submit fails
> README.md: note GitHub discussions
> prometheus: `fmt::print` to stringstream directly
> doc: add document for testing with seastar
> seastar/testing: only include used headers
> test: Add abortable http client test cases
> http/client: Add abortable make_request() API method
> http/client: Abort established connections
> http/client: Handle abort source in pool wait
> http/client: Add abort source to factory::make() method
> http/client: Pass abort_source here and there
> http/client: Idnentation fix after previous patch
> http/client: Merge some continuations explicitly
> signal: add seastar signal api
> httpd: remove unused prometheus structs
> print: use fmtlib's fmt::format_string in format()
> rpc: do not use seastar::format() in rpc logger
> treewide: s/format/seastar::format/
> prometheus: sanitize label value for text protocol
> tests: unit test prometheus wire format
> io-tester: Introduce batches to rate-based submission
> io-tester: Generalize issueing request and collecting its result
> io-tester: Cancel intent once
> io-tester: Dont carry rps/parallelism variables over lambdas
> io-tester: Simplify in-flight management
The breaking changes in the seastar submodule necessitate corresponding
modifications in our code. These changes must be implemented together in
a single commit to maintain consistency. So that each commit is buildable.
following changes are included in addition to seastar submodule update:
* instead of passing a `const char*` for the format string, pass a
templated `fmt::format_string<...>`, this depends on the
`seastar::format()` change in seastar.
* explicitly call `fmt::runtime()` if the format string is not a
consteval expression. this depends on the `seastar::format()` change
in seastar. as `seastar::format()` does not accept a plain
`const char*` which is not constexpr anymore.
* pass abort_source to `dns_connection_factory::make()`. this depends on
the change in seastar, which added a `abort_source*` argument to
the pure virtual member function of `connection_factory::make()`.
* call call {fmt,seastar}::format() explicitly. this is a follow up of
3e84d43f, which takes care of all places where we should call
`fmt::format()` and `seastar::format()` explicitly to disambiguate the
`format()` call. but more `format()` call made their way into the source
tree after 3e84d43f. so we need fix them as well.
* include used header in tests
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Update seastar submodule
Please enter the commit message for your changes. Lines starting
Closesscylladb/scylladb#20649
This patch adds tests for the batch operations item count.
The tests validate that the metrics tracking the number of items
processed in a batch increase by the correct amount.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
The `check_increases_operation` now allows override the checked metric.
Additionally, a custom validation value can now be passed, which make it
possible to validate the amount by which a value has changed, rather
than just validating that the value increased.
The default behavior of validating that values have increased remains
unchanged, ensuring backward compatibility.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Most of the analysis of the WHERE clause is done in statement_restrictions. It determines
what parts to use for the primary or secondary index, and what parts to use for filtering.
The difficult part is that it has a very wide interface. After construction, the user must pick
the correct bits from many public functions. There are subtle interactions between them
that are hard to untangle.
This series simplifies the interface as it is used for selection filtering. In the end, only
two public functions are used, both returning expressions: one for the partition-level
filtering, one for the clustering row level filtering.
In the end, the WHERE clause is factored into three parts:
- one part goes into the read_command of the primary or secondary index
- another part (that references only partition key columns and static key columns) is used to filter entire partitions
- another part (that currently references only clustering key columns and regular columns, but one day may reference other columns) is used to filter clustering rows
Refactoring, no backport.
Closesscylladb/scylladb#20487
* github.com:scylladb/scylladb:
cql3: statement_restrictions: drop accessors for single-column key restrictions
cql3: selection: adjust indentation
cql3: selection: delete empty loop
cql3: statement_restrictions, selection: fold multi-column restrictions into row-level filter
cql3: statement_restrictions, selection: merge clustering key filter and regular columns filter
cql3: statement_restrictions, selection: merge partition key filter and static columns filter
cql3: selection: filter regular and static rows as a single expression each
cql3: statement_restrictions: collect regular column and static column filters into single expressions
cql3: selection: filter clustering key as a single expression
cql3: statement_restrictions: expose filter for clustering key
cql3: selection: filter partition key as a single expression
cql3: statement_restrictions: expose filter for partition key
cql3: statement_restrictions: remove relations used for indexing from filtering
cql3: statement_restrictions: bail out of find_idx if !_uses_secondary_index
cql3: statement_restrictions, modification_statement: pass correct value of check_indexes
cql3: statement_restrictions: correct mismatched clustering/partition restrictions references
cql3: statement_restrictions: precalculate get_column_defs_for_filtering()
cql3: selection: do_filter(): push static/regular row glue to higher level
In https://github.com/scylladb/scylladb/pull/18729, we introduced a new statement tenant `$maintenance`, but the change wasn't protected by any cluster feature.
This wasn't a problem for OSS, since unknown isolation cookie just uses default scheduling group. However, in enterprise that leads to creating a service level on not-upgraded nodes, which may end up in an error if user create maximum number of service levels.
This patch adds a cluster feature to guard adding the new tenant. It's done in the way to handle two upgrade scenarios:
- version without `$maintenance` tenant -> version with `$maintenance` tenant guarded by a feature
- version with `$maintenance` tenant but not guarded by a feature -> version with `$maintenance` tenant guarded by a feature
The PR adds `enabled` flag to statement tenants.
This way, when the tenant is disabled, it cannot be used to create a connection, but it can be used to accept an incoming connection.
The `$maintenance` tenant is added to the config as disabled and it gets enabled once the corresponding feature is enabled.
Fixesscylladb/scylladb#20070
Refs scylladb/scylla-enterprise#4403Closesscylladb/scylladb#19802
* github.com:scylladb/scylladb:
message/messaging_service: guard adding maintenance tenant under cluster feature
message/messaging_service: add feature_service dependency
message/messaging_service: add `enabled` flag to statement tenants
Instead of a constructor, use a new function
analyze_statement_restrictions() as the entry point. It returns an
immutable statement_restrictions object.
This opens the door to returning a variant, with each arm of the variant
corresponding to a different query plan.
The test emulates several LWT(Lightweight Transaction) query rebounces. Currently, the code
that processes queries does not expect that a query may be rebounced more than once.
It was impossible with the VNodes, but with intruduction of the Tablets, data can be moved
between shards by the balancer thus a query can be rebounced to different shards multiple times.
When the configuration has alternator_enforce_authorization=false,
Alternator should not do authentication (check which user signed each
request) nor authorization (check if that user has permissions to do
each operation).
Our implementation forgot to disable the authorization checks when
it's configured to false. The (incorrect) assumption was that when
alternator_enforce_authorization is configured to false, the CQL
'authenticator' and 'authorizer' configuration is also disabled -
so the authorization checks will be no-ops. But we can't assume
that: Users are free to configure 'authenticator' and 'authorizer'
for use in CQL, and then set alternator_enforce_authorization=false
just for Alternator.
So this patch adds a new test for this case - when we have
authenticator=PasswordAuthenticator, authorizer=CassandraAuthorizer
but alternator_enforce_authorization=false, and fixes it to work
correctly.
The heart of the fix is trivial: the `verify_*_permission()` functions
just need to check the alternator_enforce_authorization and return
immediately when false. The bigger part of this change is to get the
alternator_enforce_authorization into the "executor" object and then
to pass it into the verify calls.
Although alternator_enforce_authorization is not YET live updatable,
this code is prepared for the future that it may become live
updatable, so the executor object saves not the boolean value of
this flag, but a live-updatable reference to it.
Fixes#20619
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This PR adds the possibility to gather resource consumption metrics. The collected metrics can be used to compare performance before and after specific changes aimed at increasing performance. Currently, this functionality works only in manual mode, and this is just raw data. Later on, these metrics can be used in Jupyter notebook to analyze and visualize how the resources are used and can provide the insight on how to improve it. This PR is a first insight after gathering these metrics.
Add the possibility to gather resource consumption for the test.py execution. SQLite DB will be created with different performance metrics that will allow comparing the resource consumption between changes.
The DB will be in the tmp directory that by default set to testlog. Across the runs, the DB will not be deleted, so each new run will just add information to the existing DB.
Parameter --get-metrics was added to switch on or off the metrics gathering. By default, it's switched on.
Closes: scylladb/qa-tasks#1666Closes: scylladb/qa-tasks#1707Closesscylladb/scylladb#19881
The `consume*()` variants just forward the call to the `_impl` method with the same name. The latter, being a member of `::impl`, will bypass the top level `fill_buffer()`, etc. methods and thus will never call `set_close_required()`. Do this in the top-level `consume*()` methods instead, to ensure a reader, on which only `consume*()` is called, and then is destroyed, will complain as it should (and abort).
Only one place was found in core code, which didn't close the reader: `split_mutation() in `mutation/mutation.cc` and this reader is the "from-mutation" one which has no real close routine. All other places were in tests. All this is to say, there were no real bugs uncovered by this PR.
Fixes#16520
Improvement, no backport required.
Closesscylladb/scylladb#16522
* github.com:scylladb/scylladb:
readers/flat_mutation_reader_v2: call set_close_required() from consume*()
test/boost/sstable_compaction_test: close reader after use
test/boost/repair_test: close reader after use
mutation/mutation: split_mutation(): close reader after use
"crawling" is a little bit obscure in this context. so let's rename this class to reflect the fact that this reader only reads the entire content of the sstable.
both crawling reader for kl and mx formats are renamed. also, in order to be consistent, all "crawling reader" in variable names are updated as well.
---
it's a cleanup, hence no need to backport.
Closesscylladb/scylladb#20599
* github.com:scylladb/scylladb:
sstable: s/crawling_sstable_mutation_reader/sstable_full_scan_reader
sstable/mx/reader: add comment for mx_crawling_sstable_mutation_reader
New sstables for a table are created by the table::make_sstable() method. The method then calls sstables_manager::make_sstable() and passes there a path to component files which, in turn, sits on table::config. Since some time ago having an on-disk path for an sstable had become optional, as sstables could be put on S3 storage without local paths involved. In that case the aforementioned "path" is ~~ab~~used as a key in the system.sstables registry, that references a record with information used to retrieve URLs of sstables' objects.
This PR removes the "path" argument from sstables_manager::make_sstable() and its sstable_sdirectory peer. The details of sstables' location are moved onto storage_options and depend on storage type. For now in both storage types this location is still the good-old $datadir/$keyspace/$table-$uuid string. S3 storage needs to be patched more to use more elegant "location" value.
Eventually the `table::config::{datadir|all_datadirs}` will be removed, this PR is the step towards it.
closes: #12707Closesscylladb/scylladb#20542
* github.com:scylladb/scylladb:
table: Use storage options to clean the storage
sstables/storage: Re-use ocally generated vector of paths
sstables/storage: Visit options once to initialize storage
sstables_manager: Return table storage options when initalizing storage
sstables/storage: Fix indentation after previous patch
table: Move datadirs initialization parallelism to storage level
sstables/storage: Split the visitor's overloaded functor
restore: Don't use table_dir to construct sstable_directory
sstable_directory: Remove table_dir field
sstable_directory: Use options details in lister
sstables_manager: Remove table_dir from make_sstable()
sstables: Remove table_dir from sstable constructor
sstables/storage: Remove sstring dir from make_storage()
sstables/storage: Use options to construct
tests: Properly initialize storage options with "dir"
distributed_loader: Create S3 options with prefix for restore
storage_options: Add special-purpose local options maker
storage_options: Keep local path / s3 prefix onboard
table: Get another options when initializing storage
Allow create_pending_deletion_log to delete a bunch of sstables
potentially resides in different prefixes (e.g. in the base directory
and under staging/).
The motivation arises from table::cleanup_tablet that calls compaction_group::cleanup on all cg:s via cleanup_compaction_groups. Cleanup, in turn, calls delete_sstables_atomically on all sstables in the compaction_group, in all states, including the normal state as well as staging - hence the requirement to support deleting sstables in different sub-directories.
Also, apparently truncate calls delete_atomically for all sstables too, via table::discard_sstables, so if it happened to be executed during view update generation, i.e. when there are sstables in staging, it should hit the assertion failure reported in https://github.com/scylladb/scylladb/issues/18862 as well (although I haven't seen it yet, but I see no reason why it would happen). So the issue was apparently present since the initial implementation of the pending_delete_log. It's just that with tablet migration it is more likely to be hit.
Fixesscylladb/scylladb#18862
Needs backport to 6.0 since tablets require this capability
Closesscylladb/scylladb#19555
* github.com:scylladb/scylladb:
sstable_directory: create_pending_deletion_log: place pending_delete log under the base directory
sstables: storage: keep base directory in base class
sstables: storage: define opened_directory in header file
sstable_directory: use only dirlog