backport not needed, these are just cleanups.
Closesscylladb/scylladb#19260
* github.com:scylladb/scylladb:
replica: simplify perform_cleanup_compaction()
replica: return storage_group by reference on storage_group_for*()
replica: devirtualize storage_group_of()
Before work on tablets was completed, it was noticed that — due to some missing pieces of implementation — Scylla doesn't properly close sstables for migrated-away tablets. Because of this, disk space wasn't being reclaimed properly.
Since the missing pieces of implementation were added, the problem should be gone now. This patch adds a test which was used to reproduce the problem earlier. It's expected to pass now, validating that the issue was fixed.
Should be backported to branch-6.0, because the tested problem was also affecting that branch.
Fixes#16946Closesscylladb/scylladb#18906
* github.com:scylladb/scylladb:
test_tablets: add test_tablet_storage_freeing
test: pylib: add get_sstables_disk_usage()
with tablets, we're expected to have a worst of ~100 tablets in a given
table and shard, so let's avoid linear search when looking for the
memtable_list in a range scan. we're bounded by ~100 elements, so
shouldn't be a big problem, but it's an inefficiency we can easily
get rid of.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#19286
As it turns out, each sstable carries its own schema in its serialization header (Statistics component). This schema is incomplete -- the names of the key columns are not stored, just their type. Static and regular columns do have names and types stored however. This bare-bones schema is enough to parse and display the content of the sstable. Another thing missing is schema options (the stuff after the `WITH` keyword, except the clustering order). The only options stored are the compression options (in the CompressionInfo component), this is actually needed to read the Data component.
This series adds a new method to `tools/schema_loader.cc` to extract the schema stored in the sstable itself. This new schema load method is used as the last fall-back for obtaining the schema, in case scylla-sstable is trying to autodetect the schema of the sstable. Although, right now this bare-bones schema is enough for everything scylla-sstable does, it is more future proof to stick to the "full" schema if possible, so this new method is the last resort for now.
Fixes: https://github.com/scylladb/scylladb/issues/17869
Fixes: https://github.com/scylladb/scylladb/issues/18809
New functionality, no backport needed.
Closesscylladb/scylladb#19169
* github.com:scylladb/scylladb:
tools/scylla-sstable: log loaded schema with trace level
tools/scylla-sstable: load schema from the sstable as fallback
tools/schema_loader: introduce load_schema_from_sstable()
test/lib/random_schema: remove assert on min number of regular columns
sstables: introduce load_metadata()
This patch includes extensive testing for what happens to an ongoing
paged query when a secondary index is suddenly added or dropped.
Issue #18992 was opened suggesting that this would be broken, and indeed
the tests included here show that it is indeed broken.
The four tests included in this patch are heavily commented to explain
what they are testing and why, but here is a short summary of what is
being tested by each of them:
1. A paged query filtering on v=17 continues correctly even if an
index is created on v.
2. A paged query filtering on v1 and v2 where v2 is indexed,
continues correctly even if an index is created on v1 (remember
that Scylla prefers to use the first index mentioned in the query).
3. A paged query using an index on v continues correctly even if that
index is deleted.
4. However, if the query doesn't say "ALLOW FILTERING", it cannot
be continued after the index is deleted.
All these tests pass on Cassandra, but all of them except the fourth
fail on Scylla, reproducing issue #18992. Somewhat to my suprise, the
failure of the query in all the failed tests is silent (i.e., trying to
fetch the next page just fetches nothing and says the iteration is done).
I was expecting more dramatic failures ("marshaling error" messages,
crashes, etc.) but didn't get them.
Refs #18992
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#19000
The schema of the sstable can be interesting, so log it with trace
level. Unfortunately, this is not the nice CQL statement we are used to
(that requires a database object), but the not-nearly-so-nice CFMetadata
printout. Still, it is better then nothing.
When auto-detecting the schema of the sstable, if all other methods
failed, load the schema from the sstable's serialization header. This
schema is incomplete. It is just enough to parse and display the content
of the sstable. Although parsing and displaying the content of the
sstable is all scylla-sstable does, it is more future-compatible to us
the full schema when possible. So the always-available but minimal
schema that each sstable has on itself, is used just as a fallback.
The test which tested the case when all schema load attempts fail,
doesn't work now, because loading the serialization header always
succeeds. So convert this test into two positive tests, testing the
serialization header schema fallback instead.
Allows loading the schema from an sstable's serialization header. This
schema is incomplete, but it is enough to parse and display the content
of the sstable.
Change the format of sync points to use host ID instead of IPs, to be consistent with the use of host IDs in hinted handoff module.
Introduce sync point v3 format which is the same as v2 except it stores host IDs instead of IPs.
The decoding supports both formats with host IDs and IPs, so a sync point contains now a variant of either types, and in the case of new type the translation is avoided.
Fixes#18653Closesscylladb/scylladb#19134
* github.com:scylladb/scylladb:
db/hints: migrate sync point to host ID
db/hints: rename sync point structures with _v1 suffix to _v1_v2
those functions cannot return nullptr, will throw when group is not
found, so better return ref instead.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Some time ago it turned out that if unrecognized feature name is met in scylla.yaml, the whole experimental features list is ignored, but scylla continues to boot. There's UNUSED feature which is the proper way to deprecate a feature, and this PR improves its handling in several ways.
1. The recently removed "tablets" feature is partially brought back, but marked as UNUSED
2. Any UNUSED features met while parsing are printed into logs
3. The enum_option<> helper is enlightened along the way
refs: #18968Closesscylladb/scylladb#19230
* github.com:scylladb/scylladb:
config: Mark tablets feature as unused
main: Warn unused features
enum_option: Carry optional key on board
enum_option: Remove on-board _map member
The API endpoints are registered for particular services (with rare exceptions), and once the corresponding service is ready, its endpoints section can be registered too. Same but reversed is for shutdown, and it's automatic with deferred actions.
refs: #2737Closesscylladb/scylladb#19208
* github.com:scylladb/scylladb:
main: Register task manager API next to task manager itself
main: Register messaging API next to messaging service
main: Register repair API next to repair service
its declaration was removed in 84a9d2fa, which failed to remove
the implementation from .cc file.
in this change, let's remove operator<< for role_or_anonymous
completely.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19243
If we receive a message in the same term but from a different leader
than we expect, we print:
```
Got append request/install snapshot/read_quorum from an unexpected leader
```
For some reason the message did not include the details (who the leader
was and who the sender was) which requires almost zero effort and might
be useful for debugging. So let's include them.
Ref: scylladb/scylla-enterprise#4276Closesscylladb/scylladb#19238
Commit 882b2f4e9f (cql3, schema_tables: Generalize function creation)
erroneously says that optional<context> is not suitable for future<>
type, but in fact it is.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#19204
since gcc-13 is packaged by ppa:ubuntu-toolchain-r, and GCC-13 was
released 1 year ago, let's use it instead. less warnings, as the
standard library from GCC-13 is more standard compliant.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19162
to be prepared for changes from clang, and enjoy the new warnings/errors from this compiler.
* it is an improvement in our CI, no need to backport.
Closesscylladb/scylladb#19164
* github.com:scylladb/scylladb:
.github: add workflow to build with clang nightly
.github: rename clang-tidy-matcher.json to clang-matcher.json
Commit 47dbf23773 (Rework view services and system-distributed-keyspace
dependencies) made streaming and repair services depend on view builder,
but missed the fact that the builder itself starts much later.
Move view builder earlier, that's safe, no activity is started upon
that, real building is kicked much later when invoke_on_all(start)
happens.
Other than than, start system distributed keyspace earlier, which also
looks safe, as it's also started "for real" later, by storage service
when it joins the ring.
fixes: #19133
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#19250
upgrade/_common are document fragments included by other documents.
but quite a few the documents previously including these fragments
were removed. but we didn't remove these fragments along with them.
in this change, we drop them.
Fixes#19245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19251
before this change, when performing memtable_test, we expect that
the memtables of ks.cf is the only memtables being flushed. and
we inject 4 failures in the code path of flush, and wait until 4
of them are triggered. but in the background, `dirty_memory_manager`
performs flush on all tables when necessary. so, the total number of
failures is not necessary the total number of failures triggered
when flushing ks.cf, some of them could be triggered when flushing
system tables. that's why we have sporadict test failures from
this test. as we might check `t.min_memtable_timestamp()` too soon.
after this change, we increase `unspooled_dirty_soft_limit` setting,
in order to disable `dirty_memory_manager`, so that the only flush
is performed by the test.
Fixes https://github.com/scylladb/scylladb/issues/19034
---
the issue applies to both 5.4 and 6.0, and this issue hurts the CI stability, hence we should backport it.
Closesscylladb/scylladb#19252
* github.com:scylladb/scylladb:
test: memtable_test: increase unspooled_dirty_soft_limit
test: memtable_test: replace BOOST_ASSERT with BOOST_REQURE
before this change, when performing memtable_test, we expect that
the memtables of ks.cf is the only memtables being flushed. and
we inject 4 failures in the code path of flush, and wait until 4
of them are triggered. but in the background, `dirty_memory_manager`
performs flush on all tables when necessary. so, the total number of
failures is not necessary the total number of failures triggered
when flushing ks.cf, some of them could be triggered when flushing
system tables. that's why we have sporadict test failures from
this test. as we might check `t.min_memtable_timestamp()` too soon.
after this change, we increase `unspooled_dirty_soft_limit` setting,
in order to disable `dirty_memory_manager`, so that the only flush
is performed by the test.
Fixes#19034
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we verify the behavior of design under test using
`BOOST_ASSERT()`, which is a wrapper around `assert()`, so if a test
fails, the test just aborts. this is not very helpful for postmortem
debugging.
after this change, we use `BOOST_REQUIRE` macro for verifying the
behavior, so that Boost.Test prints out the condition if it does not
hold when we test it.
Refs #19034
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
in this changeset, we tighten the clang-include-cleaner workflow, and address the warnings in two more subdirectories in the source tree.
* it's a cleanup, no need to backport
Closesscylladb/scylladb#19155
* github.com:scylladb/scylladb:
.github: add alternator to iwyu's CLEANER_DIR
alternator: do not include unused headers
.github: change severity to error in clang-include-cleaner
exceptions: do not include unused headers
since we stopped using the generic container formatters which in turn
use operator<< for formatting the elemements. we can drop more
operator<< operators.
so, in this change, we drop operator<< for proposal.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19156
ConfigParser.readfp was deprecated in Python 3.2 and removed in Python 3.12.
Under Fedora 40, the container fails to launch because it cannot parse its
configuration.
Fix by using the newer read_file().
Closesscylladb/scylladb#19236
This config item is propagated to the table object via table::config. Although the field in `table::config`, used to propagate the value, was `utils::updateable_value<T>`, it was assigned a constant and so the live-update chain was broken.
This series fixes this and adds a test which fails before the patch and passes after. The test needed new test infrastructure, around the failure injection api, namely the ability to exfiltrate the value of internal variable. This infrastructure is also added in this series.
Fixes: https://github.com/scylladb/scylladb/issues/18674
- [x] This patch has to be backported because it fixes broken functionality
Closesscylladb/scylladb#18705
* github.com:scylladb/scylladb:
test/topology_custom: add test for enable_compacting_data_for_streaming_and_repair live-update
test/pylib: rest_client: add get_injection()
api/error_injection: add getter for error_injection
utils/error_injection: add set_parameter()
replica/database: fix live-update enable_compacting_data_for_streaming_and_repair
in 44e85c7d, we remove coverage compiling options from the cflags
when building abseil. but in 535f2b21, these options were brought
back as parts of cxx_flags.
so we need to remove them again from cxx_flags.
Fixes#19219
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19220
This features used to be there for a while, but then it was removed by
83d491af02. This patch partially takes it
back, but maps to UNUSED, so that if met in config, it's warned, but
other features are parsed as well.
refs: #18968
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When seeing an UNUSED feature -- print it into log. This is where the
enum_option::key is in use. The thing is that experimental features map
different unused feature names into the single UNUSED feature enum
value, so once the feature is parsed its configured name only persists
in the option's key member (saved by previous patch).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It facilitates option formatting, but the main purpose is to be able to
find out the exact keys, not values, later (see next patch).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The map in question is immutable and can obtained from the Mapper type
at any time, there's no need in keeping its copy on each enum_option
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Change the format of sync points to use host ID instead of IPs, to be
consistent with the use of host IDs in hinted handoff module.
Introduce sync point v3 format which is the same as v2 except it stores
host IDs instead of IPs.
The encoding of sync points now always uses the new v3 format with host
IDs.
The decoding supports both formats with host IDs and IPs, so a sync point
contains now a variant of either types, and in the case of the new
format the translation from IP to host ID is avoided.
* tools/java 88809606c8...01ba3c196f (3):
> Revert "build: don't add nonexistent directory 'lib' to relocatable packages"
> build: run antlr in a separate process
> build: don't add nonexistent directory 'lib' to relocatable packages
Allow external code to obtain information about an error injection
point, including whether it is enabled, and importantly, what its
parameters are. Together with the `set_parameter()` added in the
previous patch, this allows tests to read out the values of internal
parameters, via a set_parameter() injection point.
Allow injection points to write values into the parameter map, which
external code can then examine. This allows exfiltrating the values if
internal variables, to be examined by tests, without exposing these
variables via an "official" path.
Most of the time this test spends waiting for a node to die. Helps 3x times
Was
real 9m21,950s
user 1m11,439s
sys 1m26,022s
Now
real 3m37,780s
user 0m58,439s
sys 1m13,698s
refs: #17764
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#19222