There is no need to start a thread for the status_checker
and can be implemented using a background coroutine.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
When closing _lower_bound and *_upper_bound
in the final close() call, they are currently left with
an engaged current_list member.
If the index_reader uses a _local_index_cache,
it is evicted with evict_gently which will, rightfully,
see the respective pages as referenced, and they won't be
evicted gently (only later when the index_reader is destroyed).
Reset index_bound.current_list on close(index_bound&)
to free up the reference.
Ref #12271
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#12370
Since [repair: Always use run_replace_ops](2ec1f719de), nodes no longer publish HIBERNATE state so we don't need to support handling it.
Replace is now always done using node operations (using repair or streaming).
so nodes are never expected to change status to HIBERNATE.
Therefore storage_service:handle_state_replacing is not needed anymore.
This series gets rid of it and updates documentation related to STATUS:HIBERNATE respectively.
Fixes#12330Closes#12349
* github.com:scylladb/scylladb:
docs: replace-dead-node: get rid of hibernate status
storage_service: get rid of handle_state_replacing
We already check is remote's node topology is missing before creating a
connection, but local node topology can be missing too when we will use
raft to manage it. Raft needs to be able to create connections before
topology is knows.
Message-Id: <20221228144944.3299711-7-gleb@scylladb.com>
database_test in timing out because it's having to run the tests calling
do_with_cql_env_and_compaction_groups 3x, one for each compaction group
setting. reduce it to 2 settings instead of 3 if running in debug mode.
Refs #12396.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#12421
This lets us carry fewer things and rely on the distribution
for maintenance.
The frozen toolchain is updated. Incidental updates include clang 15.0.6,
and pytest that doesn't need workarounds.
Closes#12397
This reverts commit ac2e2f8883. It causes
a regression ("std::bad_variant_access in load_view_build_progress").
Commit 2978052113 (a reindent) is also reverted as part of
the process.
Fixes#12395
In commit acfa180766 we added to
test/cql-pytest a mechanism to detect when Scylla crashes in the middle
of a test function - in which case we report the culprit test and exit
immediately to avoid having a hundred more tests report that they failed
as well just because Scylla was down.
However, if Scylla was *never* up - e.g., if the user ran "pytest" without
ever running Scylla - we still report hundreds of tests as having failed,
which is confusing and not helpful.
So with this patch, if a connection cannot be made to Scylla at all,
the test exits immediately, explaining what went wrong, not blaming
any specific test:
$ pytest
...
! _pytest.outcomes.Exit: Cannot connect to Scylla at --host=localhost --port=9042 !
============================ no tests ran in 0.55s =============================
Beyond being a helpful reminder for a developer who runs "pytest" without
having started Scylla first (or using test/cql-pytest/run or test.py to
start Scylla easily), this patch is also important when running tests
through test.py if it reuses an instance of Scylla that crashed during an
earlier pytest file's run.
This patch does not fix test.py - it can still try to run pytest with
a dead Scylla server without checking. But at least with this patch
pytest will notice this problem immediately and won't report hundreds of
test functions having failed. The only report the user will see will be
the last test which crashed Scylla, which will make it easier to find
this failure without being hidden between hundreds of spurious failures.
Fixes#12360
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12401
This patch enables offstrategy compaction for all classic streaming
based node ops. We can use this method because tables are streamed one
after another. As long as there is still streamed data for a given
table, we update the automatic trigger timer. When all the streaming has
finished, the trigger timer will timeout and fire the offstrategy
compaction for the given table.
I checked with this patch, rebuild is 3X faster. There was no compaction
in the middle of the streaming. The streamed sstables are compacted
together after streaming is done.
Time Before:
INFO 2022-11-25 10:06:08,213 [shard 0] range_streamer - Rebuild
succeeded, took 67 seconds, nr_ranges_remaining=0
Time After:
INFO 2022-11-25 09:42:50,943 [shard 0] range_streamer - Rebuild
succeeded, took 23 seconds, nr_ranges_remaining=0
Compaciton Before:
88 sstables were written -> 88 sstables were added into main set
Compaction After:
88 sstables written -> after offstretegy 2 sstables were added into main seet
Closes#11848
invoke_on_task is used in translation units where its definition is not
visible, yet it has no explicit instantiations. If the compiler always
decides to inline the definition, not to instantiate it implicitly,
linking invoke_on_task will fail. (It happened to me when I turned up
inline-threshold). Fix that.
Closes#12387
In very slow debug builds the default driver timeouts are too low and
tests might fail. Bump up the values to more reasonable time.
These timeout values are the same as used in topology tests.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Closes#12405
When the mutation compactor has all the rows it needs for a page, it
saves the decision to stop in a member flag: _stop.
For single partition queries, the mutation compactor is kept alive
across pages and so it has a method, start_new_page() to reset its state
for the next page. This method didn't clear the _stop flag. This meant
that the value set at the end of the previous could cause the new page
and subsequently the entire query to be stopped prematurely.
This can happen if the new page starts with a row that is covered by a
higher level tombstone and is completely empty after compaction.
Reset the _stop flag in start_new_page() to prevent this.
This commit also adds a unit test which reproduces the bug.
Fixes: #12361Closes#12384
On some docker instance configuration, hostname resolution does not
work, so our script will fail on startup because we use hostname -i to
construct cqlshrc.
To prevent the error, we can use --rpc-address or --listen-address
for the address since it should be same.
Fixes#12011Closes#12115
In case a table is dropped, we should ignore it in the repair_updater,
since we can not update off strategy trigger for a dropped table.
Refs #12373Closes#12388
Every 1 hour, compaction manager will submit all registered table_state
for a regular compaction attempt, all without yielding.
This can potentially cause a reactor stall if there are 1000s of table
states, as compaction strategy heuristics will run on behalf of each,
and processing all buckets and picking the best one is not cheap.
This problem can be magnified with compaction groups, as each group
is represented by a table state.
This might appear in dashboard as periodic stalls, every 1h, misleading
the investigator into believing that the problem is caused by a
chronological job.
This is fixed by piggybacking on compaction reevaluation loop which
can yield between each submission attempt if needed.
Fixes#12390.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#12391
They are currently missing from the printout
when the a table is created, but they are determinal
to understanding the mode with which tombstones are to
be garbage-collected in the table. gcGraceSeconds alone
is no longer enough since the introduction of
tombstone_gc_option in a8ad385ecd.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#12381
In issue #10767, concerned were raised that the CLUSTERING ORDER BY
clause is handled incorrectly in a CREATE MATERIALIZED VIEW definition.
The tests in this patch try to explore the different ways in which
CLUSTERING ORDER BY can be used in CREATE MATERIALIZED VIEW and allows
us to compare Scylla's behaivor to Cassandra, and to common sense.
The tests discover that the CLUSTERING ORDER BY feature in materialized
views generally works as expected, but there are *three* differences
between Scylla and Cassandra in this feature. We consider two differences
to be bugs (and hence the test is marked xfail) and one a Scylla extension:
1. When a base table has a reverse-order clustering column and this
clustering column is used in the materialized view, in Cassandra
the view's clustering order inherits the reversed order. In Scylla,
the view's clustering order reverts to the default order.
Arguably, both behaviors can be justified, but usually when in doubt
we should implement Cassandra's behavior - not pick a different
behavior, even if the different behavior is also reasonable. So
this test (test_mv_inherit_clustering_order()) is marked "xfail",
and a new issue was created about this difference: #12308.
If we want to fix this behavior to match Cassandra's we should also
consider backward compatibility - what happens if we change this
behavior in Scylla now, after we had the opposite behavior in
previous releases? We may choose to enshrine Scylla's Cassandra-
incompatible behavior here - and document this difference.
2. The CLUSTERING ORDER BY should, as its name suggests, only list
clustering columns. In Scylla, specifying other things, like regular
columns, partition-key columns, or non-existent columns, is silently
ignored, whereas it should result in an Invalid Request error (as it
does in Cassandra). So test_mv_override_clustering_order_error()
is marked "xfail".
This is the difference already discovered in #10767.
3. When a materialized view has several clustering columns, Cassandra
requires that a CLUSTERING ORDER BY clause, if present, must specify
the order of all of *all* clustering columns. Scylla, in contrast,
allows the user to override the order of only *some* of these columns -
and the rest get the default order. I consider this to be a
legitimate Scylla extension, and not a compatibility bug, so marked
the test with "scylla_only", and no issue was opened about it.
Refs #10767
Refs #12308
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12307
This patch adds a scylla_inject_error(), a context manager which tests
can use to temporarily enable some error injection while some test
code is running. It can be used to write tests that artificially
inject certain errors instead of trying to reach the elaborate (and
often requiring precise timing or high amounts of data) situation where
they occur naturally.
The error-injection API is Scylla-specific (it uses the Scylla REST API)
and does not work on "release"-mode builds (all other modes are supported),
so when Cassandra or release-mode build are being tested, the test which
uses scylla_inject_error() gets skipped.
Example usage:
```python
from rest_api import scylla_inject_error
with scylla_inject_error(cql, "injection_name", one_shot=True):
# do something here
...
```
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12264
Retrieves the configuration item with the given name and prints its
value as well as its metadata.
Example:
(gdb) scylla get-config-value compaction_static_shares
value: 100, type: "float", source: SettingsFile, status: Used, live: MustRestart
Closes#12362
* github.com:scylladb/scylladb:
scylla-gdb.py: add scylla get-config-value gdb command
scylla-gdb.py: extract $downcast_vptr logic to standalone method
test: scylla-gdb/run: improve diagnostics for failed tests
These options have been nonsense since 2017.
--pie and --so are ignored, --static disables (sic!) static linking of
libraries.
Remove them.
Closes#12366
Retrieves the configuration item with the given name and prints its
value as well as its metadata.
Example:
(gdb) scylla get-config-value compaction_static_shares
value: 100, type: "float", source: SettingsFile, status: Used, live: MustRestart
Due to an oversight, the local index cache isn't evicted gently
when _upper_bound existed. This is a source of reactor stalls.
Fix that.
Fixes#12271Closes#12364
Currently the scylla tools (`scylla-types` and `scylla-sstable`) have documentation in two places: high level documentation can be found at `docs/operating-scylla/admin-tools/scylla-{types,sstable}.rst`, while low level, more detailed documentation is embedded in the tool itself. This is especially pronounced for `scylla-sstable`, which only has a short description of its operations online, all details being found only in the command-line help.
We want to move away from this model, such that all documentation can be found online, with the command-line help being reserved to documenting how the various switches and flags work, on top of a short description of the operation and a link to the detailed online docs.
Closes#12284
* github.com:scylladb/scylladb:
tool/scylla-sstable: move documentation online
docs: scylla-sstable.rst: add sstable content section
docs: scylla-{sstable,types}.rst: drop Syntax section
Allows static configuration of number of compaction groups per table per shard.
To bootstrap the project, config option x_log2_compaction_groups was added which controls both number of groups and partitioning within a shard.
With a value of 0 (default), it means 1 compaction group, therefore all tokens go there.
With a value of 3, it means 8 compaction groups, and 3 most-significant-bits of tokens being used to decide which group owns the token.
And so on.
It's still missing:
- integration with repair / streaming
- integration with reshard / reshape.
perf/perf_simple_query --smp 1 --memory 1G
BEFORE
-----
median 61358.55 tps ( 71.1 allocs/op, 12.2 tasks/op, 56375 insns/op, 0 errors)
median 61322.80 tps ( 71.1 allocs/op, 12.2 tasks/op, 56391 insns/op, 0 errors)
median 61058.58 tps ( 71.1 allocs/op, 12.2 tasks/op, 56386 insns/op, 0 errors)
median 61040.94 tps ( 71.1 allocs/op, 12.2 tasks/op, 56381 insns/op, 0 errors)
median 61118.40 tps ( 71.1 allocs/op, 12.2 tasks/op, 56379 insns/op, 0 errors)
AFTER
-----
median 61656.12 tps ( 71.1 allocs/op, 12.2 tasks/op, 56486 insns/op, 0 errors)
median 61483.29 tps ( 71.1 allocs/op, 12.2 tasks/op, 56495 insns/op, 0 errors)
median 61638.05 tps ( 71.1 allocs/op, 12.2 tasks/op, 56494 insns/op, 0 errors)
median 61726.09 tps ( 71.1 allocs/op, 12.2 tasks/op, 56509 insns/op, 0 errors)
median 61537.55 tps ( 71.1 allocs/op, 12.2 tasks/op, 56491 insns/op, 0 errors)
Closes#12139
* github.com:scylladb/scylladb:
test: mutation_test: Test multiple compaction groups
test: database_test: Test multiple compaction groups
test: database_test: Adapt it to compaction groups
db: Add config for setting static number of compaction groups
replica: Introduce static compaction groups
test: sstable_test: Stop referencing single compaction group
api: compaction_manager: Stop a compaction type for all groups
api: Estimate pending tasks on all compaction groups
api: storage_service: Run maintenance compactions on all compaction groups
replica: table: Adapt assertion to compaction groups
replica: database: stop and disable compaction on behalf of all groups
replica: Introduce table::parallel_foreach_table_state()
replica: disable auto compaction on behalf of all groups
replica: table: Rework compaction triggers for compaction groups
replica: Adapt table::get_sstables_including_compacted_undeleted() to compaction groups
replica: Adapt table::rebuild_statistics() to compaction groups
replica: table: Perform major compaction on behalf of all groups
replica: table: Perform off-strategy compaction on behalf of all groups
replica: table: Perform cleanup compaction on behalf of all groups
replica: Extend table::discard_sstables() to operate on all compaction groups
replica: table: Create compound sstable set for all groups
replica: table: Set compaction strategy on behalf of all groups
replica: table: Return min memtable timestamp across all groups
replica: Adapt table::stop() to compaction groups
replica: Adapt table::clear() to compaction groups
replica: Adapt table::can_flush() to compaction groups
replica: Adapt table::flush() to compaction groups
replica: Introduce parallel_foreach_compaction_group()
replica: Adapt table::set_schema() to compaction groups
replica: Add memtables from all compaction groups for reads
replica: Add memtable_count() method to compaction_group
replica: table: Reserve reader list capacity through a callback
replica: Extract addition of memtables to reader list into a new function
replica: Adapt table::occupancy() to compaction groups
replica: Adapt table::active_memtable() to compaction groups
replica: Introduce table::compaction_groups()
replica: Preparation for multiple compaction groups
scylla-gdb: Fix backward compatibility of scylla_memtables command
Said mechanism broke tools and tests to some extent: the read it executes on sstable load time means that if the sstable is broken enough to fail this read, it will fail to load, preventing diagnostic tools to load it and examine it and preventing tests from producing broken sstables for testing purposes.
Closes#12359
* github.com:scylladb/scylladb:
sstables: allow bypassing first/last position metadata loading
sstables: sstable::{load,open_data}(): fix indentation
sstables: coroutinize sstable::open_data()
sstables: sstable::open_data(): use clear_gently() to clear token ranges
sstables: coroutinize sstable::load()
Type of the id of node operations is changed from utils::UUID
to node_ops_id. This way the id of node operations would be easily
distinguished from the ids of other entities.
Closes#11673
* seastar 3a5db04197...3db15b5681 (27):
> build: get the full path of c-ares
> build: unbreak pkgconfig output
> http: Add 206 Partial Content response code
> http: Carry integer content_length on reply
> tls_test: drop duplicated includes
> tls_test: remove duplicated test case
> reactor: define __NR_pidfd_open if not defined
> sockets: Wait on socket peer closing the connection
> tcp: Close connection when getting RST from server
> Merge 'Enhance rpc tester with delays, timeouts and verbosity' from Pavel Emelyanov
> Merge 'build: use pkg_search_module(.. IMPORTED_TARGET ..) ' from Kefu Chai
> build: define GnuTLS_{LIBRARIES, INCLUDE_DIRS} only if GnuTLS is found
> build: use pkg_search_module(.. IMPORTED_TARGET ..)
> addr2line: extend asan regex
> abort_source: move-assign operator: call base class unlink
> coroutine: correct syntax error in doxygen comment
> demo: Extend http connection demo with https
> test: temporarily disable warning for tests triggering warnings
> tests/unit/coroutine: Include <ranges>
> sstring: Document why sstring exists at all
> test: log error when read/write to pipe fails
> test: use executables in /bin
> tests: spawn_test: use BOOST_CHECK_EQUAL() for checking equality of temporary_buffer
> docker: bump up to clang {14,15} and gcc {11,12}
> shared_ptr: ignore false alarm from GCC-12
> build: check for fix of CWG2631
> circleci: use versioned container image
Closes#12355
In the past we had issue #7933 where very long strings of consecutive
tombstones caused Alternator's paging to take an unbounded amount of
time and/or memory for a single page. This issue was fixed (by commit
e9cbc9ee85) but the two tests we had
reproducing that issue were left with the "xfail" mark.
They were also marked "veryslow" - each taking about 100 seconds - so
they didn't run by default so nobody noticed they started to pass.
In this patch I make these tests much faster (taking less than a second
together), confirm that they pass - and remove the "xfail" mark and
improve their descriptions.
The trick to making these tests faster is to not create a million
tombstones like we used to: We now know that after string of just 10,000
tombstones ('query_tombstone_page_limit') the page should end, so
we can check specifically this number. The story is more complicated for
partition tombstones, but there too it should be a multiple of
query_tombstone_page_limit. To make the tests even faster, we change
run.py to lower the query_tombstone_page_limit from the default 10,000
to 1000. The tests work correctly even without this change, but they are
ten times faster with it.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12350
* Update Nixpkgs base
* Clarify some comments
* Get rid of custom-packaged cxxbridge (it's now present in Nixpkgs as
cxx-rs)
* Add missing libraries (libdeflate, libxcrypt)
* Fix expected hash of the gdb patch
* Fix a couple of small build problems
Fixes#12259Closes#12346
* github.com:scylladb/scylladb:
build: fix Nix devenv
cql3: mark several private fields as maybe_unused
configure.py: link with more abseil libs
* Update Nixpkgs base
* Clarify some comments
* Get rid of custom-packaged cxxbridge (it's now present in Nixpkgs as
cxx-rs)
* Add missing libraries (libdeflate, libxcrypt)
* Fix expected hash of the gdb patch
* Bump Python driver to 3.25.20-scylla
Fixes#12259
Because they are indeed unused -- they are initialized, passed down
through some layers, but not actually used. No idea why only Clang 12
in debug mode in Nix devenv complains about it, though.
Specifically libabsl_strings{,_internal}.a.
This fixes failure to link tests in the Nix devenv; since presumably
all is good in other setups, it must be something weird having to do
with inlining?
The extra linked libraries shouldn't hurt in any case.
Extends mutation_test to run the tests with more than one
compaction group, in addition to a single one (default).
Piggyback on existing tests. Avoids duplication.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>