If something happens during nod adding to the cluster, it will not be registered as a part of the cluster. This leads to situations during log gathering that logs for a such node will be missing.
Incorrect passing of the artifacts_dir_url parameter from test.py to pytest leads to the situation when it will pass None as a string and pytest will generate incorrect URL.
In CI test always executed with option --repeat=3 that leads to generate 3 test results with the same name. Junit plugin in CI cannot distinguish correctly the difference between these results. In case when we have two passes and one fail, the link to test result will sometimes be redirected to the incorrect one because the test name is the same. To fix this ReportPlugin added that will be responsible to modify the test case name during junit report generation adding to the test name mode and run id.
Fixes: https://github.com/scylladb/scylladb/issues/17851
Fixes: https://github.com/scylladb/scylladb/issues/15973Closesscylladb/scylladb#19235
* github.com:scylladb/scylladb:
[test.py] Add uniqueness to the test name
[test.py] Refactor alternator, nodetool, rest_api
these two arguments are critical when tablets are enabled.
Fixes https://github.com/scylladb/scylladb/issues/19296
---
6.0 is the first release with tablets support. and `nodetool ring` is an important tool to understand the data distribution. so we need to backport this document change to 6.0
Closesscylladb/scylladb#19297
* github.com:scylladb/scylladb:
doc: document `keyspace` and `table` for `nodetool ring`
doc: replace tab with space
this change is created in the same spirit of 1186ddef16, which updated the rule for generating the stripped dist pkg, but it failed to update the one for generating the unstripped dist pkg. what's why we have build failure when the workflow is looking for the unstripped tar.gz:
```
08:02:47 ++ ls /jenkins/workspace/scylla-master/scylla-ci/scylla/build/RelWithDebInfo/dist/tar/scylla-unstripped-6.1.0~dev-0.20240613.d5bdddaeb40b.x86_64.tar.gz
08:02:47 ls: cannot access '/jenkins/workspace/scylla-master/scylla-ci/scylla/build/RelWithDebInfo/dist/tar/scylla-unstripped-6.1.0~dev-0.20240613.d5bdddaeb40b.x86_64.tar.gz': No such file or directory`
```
so, in this change, we fix the path.
Refs #2717
---
* cmake related change, hence no need to backport.
Closesscylladb/scylladb#19290
* github.com:scylladb/scylladb:
build: cmake: use per-mode path for building unstripped_dist_pkg
build: cmake: use path to be compatible with CI
when `--use-cmake` option is passed to `configure.py`,
- before this change, all modes are selected if no
`--mode` options are passed to `configure.py`.
- after this change, only the modes whose `build_by_default` is
`True` are selected, if no `--mode` options are specfied.
the new behavior matches the existing behavior. otherwise,
`ninja -C build mode_list` would list the mode which
is not built by default.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19292
utils/chunked_vector::reserve_partial: fix usage in callers
The method reserve_partial(), when used as documented, quits before the
intended capacity can be reserved fully. This can lead to overallocation
of memory in the last chunk when data is inserted to the chunked vector.
The method itself doesn't have any bug but the way it is being used by
the callers needs to be updated to get the desired behaviour.
Instead of calling it repeatedly with the value returned from the
previous call until it returns zero, it should be repeatedly called with
the intended size until the vector's capacity reaches that size.
This PR updates the method comment and all the callers to use the
right way.
Fixes#19254Closesscylladb/scylladb#19279
* github.com:scylladb/scylladb:
utils/large_bitset: remove unused includes identified by clangd
utils/large_bitset: use thread::maybe_yield()
test/boost/chunked_managed_vector_test: fix testcase tests_reserve_partial
utils/lsa/chunked_managed_vector: fix reserve_partial()
utils/chunked_vector: return void from reserve_partial and make_room
test/boost/chunked_vector_test: fix testcase tests_reserve_partial
utils/chunked_vector::reserve_partial: fix usage in callers
before this change, we use runtime format string to format error
messages. but it does not have the compile time format check.
if we pass arguments which are not formattable, {fmt} throws at
runtime, instead of error out at compile-time. this could be very
annoying, because we format error messages at the error handling
path. but if user ends up seeing an exception for {fmt} instead
of a nice error message, it would be far from helpful.
in this change, we
- use compile-time format string
- fix two caller sites, where we pass `std::exception_ptr` to
{fmt}, but `std::exception_ptr` is not formattable by {fmt} at
the time of writing. we do have operator<< based formatter for
it though. so we delegate to `fmt::streamed` to format it.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19294
This PR fixes two problems with the `expected_error`
parameter in `server_add` and `servers_add`.
1. It didn't work in `server_add` if the cluster was empty
because of an incorrect attempt to connect the driver.
2. It didn't work in `servers_add` completely because the
`seeds` parameter was handled incorrectly.
This PR only adds improvements in the testing framework,
no need to backport it.
Closesscylladb/scylladb#19255
* github.com:scylladb/scylladb:
test: manager_client, scylla_cluster: fix type annotations in add_servers
test: manager_client: don't connect driver after failed server_{add, start}
test: scylla_cluster: pass seeds to add_servers
On each shard of each node we store the view update backlogs of
other nodes to, depending on their size, delay responses to incoming
writes, lowering the load on these nodes and helping them get their
backlog to normal if it were too high.
These backlogs are propagated between nodes in two ways: the first
one is adding them to replica write responses. The seconds one
is gossiping any changes to the node's backlog every 1s. The gossip
becomes useful when writes stop to some node for some time and we
stop getting the backlog using the first method, but we still want
to be able to select a proper delay for new writes coming to this
node. It will also be needed for the mv admission control.
Currently, the backlog is gossiped from shard 0, as expected.
However, we also receive the backlog only on shard 0 and only
update this shard's backlogs for the other node. Instead, we'd
want to have the backlogs updated on all shards, allowing us
to use proper delays also when requests are received on shards
different than 0.
This patch changes the backlog update code, so that the backlogs
on all shards are updated instead. This will only be performed
up to once per second for each other node, and is done with
a lower priority, so it won't severly impact other work.
Fixes: scylladb/scylladb#19232Closesscylladb/scylladb#19268
In CI test always executed with option --repeat=3 that leads to generate 3 test results with the same name. Junit plugin in CI cannot distinguish correctly the difference between these results. In case when we have two passes and one fail, the link to test result will sometimes be redirected to the incorrect one because the test name is the same.
To fix this ReportPlugin added that will be responsible to modify the test case name during junit report generation adding to the test name mode and run id.
Fixes: https://github.com/scylladb/scylladb/issues/17851
Fixes: https://github.com/scylladb/scylladb/issues/15973
For various reasons, a view building write may fail. When that
happens, the view building should not finish until these writes
are successfully retried and they should not interfere with any
writes that are performed to the base table while the view is
building.
The test introduced in this patch confirms that this is the case.
Refs scylladb/scylladb#19261Closesscylladb/scylladb#19263
Update the maximum size tested by the testcase. The test always created
only one chunk as the maximum size tested by it (1 << 12 = 4KB) was less
than the default max chunk size (12.8 KB). So, use twice the
max_chunk_capacity as the test size distribution upper limit to verify
that partial_reserve can reserve multiple chunks.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Fix the method comment and return types of chunked_managed_vector's
reserve_partial() similar to chunked_vector's reserve_partial() as it
has the same issues mentioned in #19254. Also update the usage in the
chunked_managed_vector_test.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Since reserve_partial does not depend on the number of remaining
capacity to be reserved, there is no need to return anything from it and
the make_room method.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Fix the usage of reserve_partial in the testcase. Also update the
maximum chunk size used by the testcase. The test always created only
one chunk as the maximum size tested by it (1 << 12 = 4KB) was less
than the default max chunk size (128 KB). So, use smaller chunk size,
512 bytes, to verify that partial_reserve can reserve multiple chunks.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
in 57c408ab, we dropped operator<< for `parsed::path`, but we forgot
to drop the friend declaration for it along with the operator. so in
this change, let's drop the friend declaration.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19287
these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning.
also, add api to iwyu github workflow's CLEANER_DIR, to prevent future violations.
---
it's a cleanup, hence no need to backport.
Closesscylladb/scylladb#19269
* github.com:scylladb/scylladb:
.github: add api to iwyu's CLEANER_DIR
api: do not include unused headers
`before this change, we use "scylla" as the dependecy of
unstripped_dist_pkg, but that's implies the scylla built with the
default mode. if the build rules is generated using the
multi-config generator, the default mode does not necessarily
identical to the current `$<CONFIG>`, so let's be more explicit.
otherwise, we could run into built failure like
```
FAILED: dist/RelWithDebInfo/scylla-unstripped-6.1.0~dev-0.20240614.5f36888e7fbd.x86_64.tar.gz /jenkins/workspace/scylla-master/scylla-ci/scylla/build/dist/RelWithDebInfo/scylla-unstripped-6.1.0~dev-0.20240614.5f36888e7fbd.x86_64.tar.gz
cd /jenkins/workspace/scylla-master/scylla-ci/scylla && scripts/create-relocatable-package.py --build-dir /jenkins/workspace/scylla-master/scylla-ci/scylla/build/RelWithDebInfo --node-exporter-dir /jenkins/workspace/scylla-master/scylla-ci/scylla/build/node_exporter --debian-dir /jenkins/workspace/scylla-master/scylla-ci/scylla/build/debian /jenkins/workspace/scylla-master/scylla-ci/scylla/build/dist/RelWithDebInfo/scylla-unstripped-6.1.0~dev-0.20240614.5f36888e7fbd.x86_64.tar.gz
ldd: /jenkins/workspace/scylla-master/scylla-ci/scylla/build/RelWithDebInfo/scylla: No such file or directory
Traceback (most recent call last):
File "/jenkins/workspace/scylla-master/scylla-ci/scylla/scripts/create-relocatable-package.py", line 109, in <module>
libs.update(ldd(exe))
^^^^^^^^
File "/jenkins/workspace/scylla-master/scylla-ci/scylla/scripts/create-relocatable-package.py", line 37, in ldd
for ldd_line in subprocess.check_output(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ldd', '/jenkins/workspace/scylla-master/scylla-ci/scylla/build/RelWithDebInfo/scylla']' returned non-zero exit status 1.
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this change is created in the same spirit of 1186ddef16, which
updated the rule for generating the stripped dist pkg, but it
failed to update the one for generating the unstripped dist pkg.
what's why we have build failure when the workflow is looking for
the unstripped tar.gz:
```
08:02:47 ++ ls /jenkins/workspace/scylla-master/scylla-ci/scylla/build/RelWithDebInfo/dist/tar/scylla-unstripped-6.1.0~dev-0.20240613.d5bdddaeb40b.x86_64.tar.gz
08:02:47 ls: cannot access '/jenkins/workspace/scylla-master/scylla-ci/scylla/build/RelWithDebInfo/dist/tar/scylla-unstripped-6.1.0~dev-0.20240613.d5bdddaeb40b.x86_64.tar.gz': No such file or directory`
```
so, in this change, we fix the path.
Refs #2717
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
backport not needed, these are just cleanups.
Closesscylladb/scylladb#19260
* github.com:scylladb/scylladb:
replica: simplify perform_cleanup_compaction()
replica: return storage_group by reference on storage_group_for*()
replica: devirtualize storage_group_of()
Before work on tablets was completed, it was noticed that — due to some missing pieces of implementation — Scylla doesn't properly close sstables for migrated-away tablets. Because of this, disk space wasn't being reclaimed properly.
Since the missing pieces of implementation were added, the problem should be gone now. This patch adds a test which was used to reproduce the problem earlier. It's expected to pass now, validating that the issue was fixed.
Should be backported to branch-6.0, because the tested problem was also affecting that branch.
Fixes#16946Closesscylladb/scylladb#18906
* github.com:scylladb/scylladb:
test_tablets: add test_tablet_storage_freeing
test: pylib: add get_sstables_disk_usage()
with tablets, we're expected to have a worst of ~100 tablets in a given
table and shard, so let's avoid linear search when looking for the
memtable_list in a range scan. we're bounded by ~100 elements, so
shouldn't be a big problem, but it's an inefficiency we can easily
get rid of.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#19286
The method reserve_partial(), when used as documented, quits before the
intended capacity can be reserved fully. This can lead to overallocation
of memory in the last chunk when data is inserted to the chunked vector.
The method itself doesn't have any bug but the way it is being used by
the callers needs to be updated to get the desired behaviour.
Instead of calling it repeatedly with the value returned from the
previous call until it returns zero, it should be repeatedly called with
the intended size until the vector's capacity reaches that size.
This commit updates the method comment and all the callers to use the
right way.
Fixes#19254
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
As it turns out, each sstable carries its own schema in its serialization header (Statistics component). This schema is incomplete -- the names of the key columns are not stored, just their type. Static and regular columns do have names and types stored however. This bare-bones schema is enough to parse and display the content of the sstable. Another thing missing is schema options (the stuff after the `WITH` keyword, except the clustering order). The only options stored are the compression options (in the CompressionInfo component), this is actually needed to read the Data component.
This series adds a new method to `tools/schema_loader.cc` to extract the schema stored in the sstable itself. This new schema load method is used as the last fall-back for obtaining the schema, in case scylla-sstable is trying to autodetect the schema of the sstable. Although, right now this bare-bones schema is enough for everything scylla-sstable does, it is more future proof to stick to the "full" schema if possible, so this new method is the last resort for now.
Fixes: https://github.com/scylladb/scylladb/issues/17869
Fixes: https://github.com/scylladb/scylladb/issues/18809
New functionality, no backport needed.
Closesscylladb/scylladb#19169
* github.com:scylladb/scylladb:
tools/scylla-sstable: log loaded schema with trace level
tools/scylla-sstable: load schema from the sstable as fallback
tools/schema_loader: introduce load_schema_from_sstable()
test/lib/random_schema: remove assert on min number of regular columns
sstables: introduce load_metadata()
This patch includes extensive testing for what happens to an ongoing
paged query when a secondary index is suddenly added or dropped.
Issue #18992 was opened suggesting that this would be broken, and indeed
the tests included here show that it is indeed broken.
The four tests included in this patch are heavily commented to explain
what they are testing and why, but here is a short summary of what is
being tested by each of them:
1. A paged query filtering on v=17 continues correctly even if an
index is created on v.
2. A paged query filtering on v1 and v2 where v2 is indexed,
continues correctly even if an index is created on v1 (remember
that Scylla prefers to use the first index mentioned in the query).
3. A paged query using an index on v continues correctly even if that
index is deleted.
4. However, if the query doesn't say "ALLOW FILTERING", it cannot
be continued after the index is deleted.
All these tests pass on Cassandra, but all of them except the fourth
fail on Scylla, reproducing issue #18992. Somewhat to my suprise, the
failure of the query in all the failed tests is silent (i.e., trying to
fetch the next page just fetches nothing and says the iteration is done).
I was expecting more dramatic failures ("marshaling error" messages,
crashes, etc.) but didn't get them.
Refs #18992
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#19000
The schema of the sstable can be interesting, so log it with trace
level. Unfortunately, this is not the nice CQL statement we are used to
(that requires a database object), but the not-nearly-so-nice CFMetadata
printout. Still, it is better then nothing.
When auto-detecting the schema of the sstable, if all other methods
failed, load the schema from the sstable's serialization header. This
schema is incomplete. It is just enough to parse and display the content
of the sstable. Although parsing and displaying the content of the
sstable is all scylla-sstable does, it is more future-compatible to us
the full schema when possible. So the always-available but minimal
schema that each sstable has on itself, is used just as a fallback.
The test which tested the case when all schema load attempts fail,
doesn't work now, because loading the serialization header always
succeeds. So convert this test into two positive tests, testing the
serialization header schema fallback instead.
Allows loading the schema from an sstable's serialization header. This
schema is incomplete, but it is enough to parse and display the content
of the sstable.
Change the format of sync points to use host ID instead of IPs, to be consistent with the use of host IDs in hinted handoff module.
Introduce sync point v3 format which is the same as v2 except it stores host IDs instead of IPs.
The decoding supports both formats with host IDs and IPs, so a sync point contains now a variant of either types, and in the case of new type the translation is avoided.
Fixes#18653Closesscylladb/scylladb#19134
* github.com:scylladb/scylladb:
db/hints: migrate sync point to host ID
db/hints: rename sync point structures with _v1 suffix to _v1_v2
those functions cannot return nullptr, will throw when group is not
found, so better return ref instead.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
If adding or starting a server fails expectedly, there is no reason
to update or connect the driver. Moreover, before this patch, we
couldn't use `server_add` and `servers_add` with `expected_error`
if the cluster was empty. After expected bootstrap failures, we
tried to connect the driver, which rightfully failed on
`assert len(hosts) > 0` in `cluster_con`.