The current test in boost/cql_query_large_test::test_large_data only checks whether notifications for large rows and cells are written into the system keyspace. It doesn't check this for partitions.
This change adds this check for partitions.
In test_exception_safety_of_update_from_memtable, we have a potential
throw from external_updater.
external_updater is supposed to be infallible.
Scylla currently aborts when an external_updater throws, so a throw from
there just fails the test.
This isn't intended. We aren't testing external_updater in this test.
Fixes#18163Closesscylladb/scylladb#18171
Before the patch selection of auth version depended
on consistent topology feature but during raft recovery
procedure this feature is disabled so we need to persist
the version somewhere to not switch back to v1 as this
is not supported.
During recovery auth works in read-only mode, writes
will fail.
Fixes https://github.com/scylladb/scylladb/issues/17736Closesscylladb/scylladb#18039
* github.com:scylladb/scylladb:
auth: keep auth version in scylla_local
auth: coroutinize service::start
We can cache tablet map in erm, to avoid looking it up on every write for
getting write replicas. We do that in tablet_sharder, but not in tablet
erm. Tablet map is immutable in the context of a given erm, so the
address of the map is stable during erm lifetime.
This caught my attention when looking at perf diff output
(comparing tablet and vnode modes).
It also helps when erm is called again on write completion for
checking locality, used for forwarding info to the driver if needed.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#18158
They result in poor distribution and poor cardinality, interfering with
tests which want to generate N partitions or rows.
Fixes: #17821Closesscylladb/scylladb#17856
Maintainers are also allowed to commit their own backport PR. They are
allowed to backport their own code, opening a PR to get a CI run for a
backport doesn't change this.
Closesscylladb/scylladb#17727
Upgrading raft topology is an important api call
that should be logged.
When failed, it is also important to log the
exception to get better visibility into why
the call failed.
Closesscylladb/scylladb#18143
* github.com:scylladb/scylladb:
api: storage_service: upgrade_to_raft_topology: fixup indentation
api: storage_service: upgrade_to_raft_topology: add logging
The vector in question is populted from the content of another map, so
its size is known in advance
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#18155
Before the patch selection of auth version depended
on consistent topology feature but during raft recovery
procedure this feature is disabled so we need to persist
the version somewhere to not switch back to v1 as this
is not supported.
During recovery auth works in read-only mode, writes
will fail.
Upgrading raft topology is an important api call
that should be logged.
When failed, it is also important to log the
exception to get better visibility into why
the call failed.
Indentation will be fixed in the next patch.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
we should include used header, to avoid compilation failures like:
```
cql3/statements/select_statement.cc:229:79: error: no member named 'filter' in namespace 'std::ranges::views'
for (const auto& used_function : used_functions | std::ranges::views::filter(not_native)) {
~~~~~~~~~~~~~~~~~~~~^
1 error generated.`
```
if some of the included header drops its own `#include <optional>`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18145
It now happens in initialize_virtual_tables(), but this function is split into sub-calls and iterates over virtual tables map several times to do its work. This PR squashes it into a straightforward code which is shorter and, hopefully, easier to read.
Closesscylladb/scylladb#18133
* github.com:scylladb/scylladb:
virtual_tables: Open-code install_virtual_readers_and_writers()
virtual_tables: Move readers setup loop into add_table()
virtual_tables: Move tables creation loop into add_table()
virtual_tables: Make add_tablet() a coroutine
virtual_tables: Open-code register_virtual_tables()
Currently, we load the repair history during boot up. If the number of
repair history entries is high, it might take a while to load them.
In my test, to load 10M entries, it took around 60 seconds.
It is not a must to load the entries during boot up. It is better to
load them in the background to speed up the boot time.
Fixes#17993Closesscylladb/scylladb#17994
* github.com:scylladb/scylladb:
repair: Load repair history in background
repair: Abort load_history process in shutdown
We currently support the sync-label only in OSS. Since Scylla-enterprise
get all the commits from OSS repo, the sync-label is running and failing
during checkout (since it's a private repo and should have different
configuration)
For now, let's limit the workflows for oss repo
Closesscylladb/scylladb#18142
Added support to track and limit the memory usage by sstable components. A reclaimable component of an SSTable is one from which memory can be reclaimed. SSTables and their managers now track such reclaimable memory and limit the component memory usage accordingly. A new configuration variable defines the memory reclaim threshold. If the total memory of the reclaimable components exceeds this limit, memory will be reclaimed to keep the usage under the limit. This PR considers only the bloom filters as reclaimable and adds support to track and limit them as required.
The feature can be manually verified by doing the following :
1. run a single-node single-shard 1GB cluster
2. create a table with bloom-filter-false-positive-chance of 0.001 (to intentionally cause large bloom filter)
3. populate with tiny partitions
4. watch the bloom filter metrics get capped at 100MB
The default value of the `components_memory_reclaim_threshold` config variable which controls the reclamation process is `.1`. This can also be reduced further during manual tests to easily hit the threshold and verify the feature.
Fixes#17747Closesscylladb/scylladb#17771
* github.com:scylladb/scylladb:
test_bloom_filter.py: disable reclaiming memory from components
sstable_datafile_test: add tests to verify auto reclamation of components
test/lib: allow overriding available memory via test_env_config
sstables_manager: support reclaiming memory from components
sstables_manager: store available memory size
sstables_manager: add variable to track component memory usage
db/config: add a new variable to limit memory used by table components
sstable_datafile_test: add testcase to verify reclamation from sstables
sstables: support reclaiming memory from components
This patch makes the get_description.py script easier to use by the
documentation automation:
1. The script is now a library.
2. You can choose the output of the script, currently supported pipee
and yml.
You can still call the from the command line, like before, but you can
also calls it from another python script.
For example the folowing python script would generate the documentation
for the metrics description of the ./alternator/ttl.cc file.
```
import get_description
metrics = get_description.get_metrics_from_file("./alternator/ttl.cc", "scylla", get_description.get_metrics_information("metrics-config.yml"))
get_description.write_metrics_to_file("out.yaml", metrics, "yml")
```
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Closesscylladb/scylladb#18136
Coredumps coming from CI are produced by a commit, which is not
available in the scylla.git repository, as CI runs on a merge commit
between the main branch (master or enterprise) and the tested PR branch.
Currently the script will attempt to checkout this commit and will fail
as the commit hash is unrecognized.
To work around this, add a --ci flag, which when used, will force the
main branch to be checked out, instead of the commit hash.
Closesscylladb/scylladb#18023
This reverts commit 97b203b1af.
since Seastar provides the formatter, it's not necessary to vendor it in
scylladb anymore.
Refs #13245Closesscylladb/scylladb#18114
storage_group_id_for_token() was only needed from within
tablet_storage_group_manager, so we can kill
table::storage_group_id_for_token().
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#18134
Currently, we load the repair history during boot up. If the number of
repair history entries is high, it might take a while to load them.
In my test, to load 10M entries, it took around 60 seconds.
It is not a must to load the entries during boot up. It is better to
load them in the background to speed up the boot time.
Fixes#17993
Disabled reclaiming memory from sstable components in the testcase as it
interferes with the false positive calculation.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Reclaim memory from the SSTable that has the most reclaimable memory if
the total reclaimable memory has crossed the threshold. Only the bloom
filter memory is considered reclaimable for now.
Fixes#17747
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
The available memory size is required to calculate the reclaim memory
threshold, so store that within the sstables manager.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
sstables_manager::_total_reclaimable_memory variable tracks the total
memory that is reclaimable from all the SSTables managed by it.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
A new configuration variable, components_memory_reclaim_threshold, has
been added to configure the maximum allowed percentage of available
memory for all SSTable components in a shard. If the total memory usage
exceeds this threshold, it will be reclaimed from the components to
bring it back under the limit. Currently, only the memory used by the
bloom filters will be restricted.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Added support to track total memory from components that are reclaimable
and to reclaim memory from them if and when required. Right now only the
bloom filters are considered as reclaimable components but this can be
extended to any component in the future.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
It's pretty short already and is naturally a "part" of
initialize_virtual_tables(). Neither it installs writers any longer.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Similarly to previous patch, after virtual tables are registered the
registry is iterated over to install virtual readers onto each entry.
Again, this can happen at the time of registering, no need in dedicated
loop for that.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Once virtual_tables map is populated, it's iterated over to create
replica::table entries for each virtual table. This can be done in the
same place where the virtual table is created, no need in dedicated loop
for it nowadays.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's naturally a "part" of initialize_virtual_tables(). Further patching
gets possible with it being open-coded.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
if a pull request's cover letter is empty, `pr.body` is None. in that
case we should not try to pass it to `re.findall()` as the "string"
parameter. otherwise, we'd get
```
TypeError: expected string or bytes-like object, got 'NoneType'
```
so, in this change, we just return an empty list if the PR in question
has an empty cover letter.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18125
The cluster manager library doesn't set the asan/ubsan options
to abort on error and create core dumps; this makes debugging much
harder.
Fix by preparing the environment correctly.
Fixesscylladb/scylladb#17510Closesscylladb/scylladb#17511
what we need is but a script, so instead of checkout the whole repo,
with all history for all tags and branches, let's just checkout
a single file. faster this way.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18126
when adding a label to a PR request we keep getting the following error
message:
```
Traceback (most recent call last):
File "/home/runner/work/scylladb/scylladb/.github/scripts/sync_labels.py", line 93, in <module>
main()
File "/home/runner/work/scylladb/scylladb/.github/scripts/sync_labels.py", line 89, in main
sync_labels(repo, args.number, args.label, args.action, args.is_issue)
File "/home/runner/work/scylladb/scylladb/.github/scripts/sync_labels.py", line 74, in sync_labels
target.add_to_labels(label)
File "/usr/lib/python3/dist-packages/github/Issue.py", line 321, in add_to_labels
headers, data = self._requester.requestJsonAndCheck(
File "/usr/lib/python3/dist-packages/github/Requester.py", line 353, in requestJsonAndCheck
return self.__check(
File "/usr/lib/python3/dist-packages/github/Requester.py", line 378, in __check
raise self.__createException(status, responseHeaders, output)
github.GithubException.GithubException: 403 {"message": "Resource not accessible by integration", "documentation_url": "https://docs.github.com/rest/issues/labels#add-labels-to-an-issue"}
```
Based on
https://docs.github.com/en/actions/security-guides/automatic-token-authentication#permissions-for-the-github_token.
The maximum access for pull requests from public forked repositories is
set to `read`
Switching to `pull_request_target` to solve it
Fixes: https://github.com/scylladb/scylladb/issues/18102Closesscylladb/scylladb#18052
The read_field is std::optional<View>. The raw_value::make_value()
accepts managed_bytes_opt which is std::optional<manager_bytes>.
Finally, there's std::optional<T>::optional(std::optional<U>&&)
move constructor (and its copy-constructor peer).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#18128
The schema_builder::build() method creates a copy of raw schema
internaly in a hope that builder will be updated and be asked to build
the resulting schema again (e.g. alternator uses this).
However, there are places that build schema using temporary object once
in a `return schema_builder().with_...().build()` manner. For those
invocations copying raw schema is just waste of cycles.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#18094
currently, our homebrew formatter formats `std::map` like
```
{{k1, v1}, {k2, v2}}
```
while {fmt} formats a map like:
```
{k1: v1, k2: v2}
```
and if the type of key/value is string, {fmt} quotes it, so a
compaction strategy option is formatted like
```
{"max_threshold": "1"}
```
before switching the formatter to the ones supported by {fmt},
let's update the test to match with the new format. this should
reduce the overhead of reviewing the change of switching the
formatter. we can revert this change, and use a simpler approach
after the change of formatter lands.
Closesscylladb/scylladb#18058
* github.com:scylladb/scylladb:
test/cql-pytest: match error message formated using {fmt}
test/cql-pytest: extract scylla_error() for not allowed options test
before this change, `reclaim_timer::report()` calls
```c++
fmt::format(", at {}", current_backtrace())
```
which allocates a `std::string` on heap, so it can fail and throw. in
that case, `std::terminate()` is called. but at that moment, the reason
why `reclaim_timer::report()` gets called is that we fail to reclaim
memory for the caller. so we are more likely to run into this issue. anyway,
we should not allocate memory in this path.
in this change, a dedicated printer is created so that we don't format
to a temporary `std::string`, and instead write directly to the buffer
of logger. this avoids the memory allocation.
Fixes#18099
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18100