We report virtual memory used, but that's not a real accounting
of the actual memory used. Use the correct real_memory_used() instead.
Note that this isn't a recent regression and was probably broken forever.
However nobody looks at this measure (and it's usually close to the
correct value) so nobody noticed.
Since it's so minor, I didn't bother filing an issue.
Before 95f31f37c1 ("Merge 'dirty_memory_manager: simplify
region_group' from Avi Kivity"), we had two region_group
objects, one _real_region_group and another _virtual_region_group,
each with a set of "soft" and "hard" limits and related functions
and members.
In 95f31f37c1, we merged _real_region_group into _virtual_region_group,
but unfortunately the _real_region_group members received the "hard"
prefix when they got merged. This overloads the meaning of "hard" -
is it related to soft/hard limit or is it related to the real/virtual
distinction?
This patch applies some renaming to restore consistency. Anything
that came from _virtual_region_group now has "virtual" in its name.
Anything that came from _real_region_group now has "real" in its name.
The terms are still pretty bad but at least they are consistent.
region_group evolved as a tree, each node of which contains some
regions (memtables). Each node has some constraints on memory, and
can start flushing and/or stop allocation into its memtables and those
below it when those constraints are violated.
Today, the tree has exactly two nodes, only one of which can hold memtables.
However, all the complexity of the tree remains.
This series applies some mechanical code transformations that remove
the tree structure and all the excess functionality, leaving a much simpler
structure behind.
Before:
- a tree of region_group objects
- each with two parameters: soft limit and hard limit
- but only two instances ever instantiated
After:
- a single region_group object
- with three parameters - two from the bottom instance, one from the top instance
Closes #11570
* github.com:scylladb/scylladb:
dirty_memory_manager: move third memory threshold parameter of region_group constructor to reclaim_config
dirty_memory_manager: simplify region_group::update()
dirty_memory_manager: fold region_group::notify_hard_pressure_relieved into its callers
dirty_memory_manager: clean up region_group::do_update_hard_and_check_relief()
dirty_memory_manager: make do_update_hard_and_check_relief() a member of region_group
dirty_memory_manager: remove accessors around region_group::_under_hard_pressure
dirty_memory_manager: merge memory_hard_limit into region_group
dirty_memory_manager: rename members in memory_hard_limit
dirty_memory_manager: fold do_update() into region_group::update()
dirty_memory_manager: simplify memory_hard_limit's do_update
dirty_memory_manager: drop soft limit / soft pressure members in memory_hard_limit
dirty_memory_manager: de-template do_update(region_group_or_memory_hard_limit)
dirty_memory_manager: adjust soft_limit threshold check
dirty_memory_manager: drop memory_hard_limit::_name
dirty_memory_manager: simplify memory_hard_limit configuration
dirty_memory_manager: fold region_group_reclaimer into {memory_hard_limit,region_group}
dirty_memory_manager: stop inheriting from region_group_reclaimer
dirty_memory_manager: test: unwrap region_group_reclaimer
dirty_memory_manager: change region_group_reclaimer configuration to a struct
dirty_memory_manager: convert region_group_reclaimer to callbacks
dirty_memory_manager: consolidate region_group_reclaimer constructors
dirty_memory_manager: rename {memory_hard_limit,region_group}::notify_relief
dirty_memory_manager: drop unused parameter to memory_hard_limit constructor
dirty_memory_manager: drop memory_hard_limit::shutdown()
dirty_memory_manager: split region_group hierarchy into separate classes
dirty_memory_manager: extract code block from region_group::update
dirty_memory_manager: move more allocation_queue functions out of region_group
dirty_memory_manager: move some allocation queue related function definitions outside class scope
dirty_memory_manager: move region_group::allocating_function and related classes to new class allocation_queue
dirty_memory_manager: remove support for multiple subgroups
The view builder builds the views from a given base table in
view_builder::batch_size batches of rows. After processing this many
rows, it suspends so the view builder can switch to building views for
other base tables in the name of fairness. When resuming the build step
for a given base table, it reuses the reader used previously (also
serving the role of a snapshot, pinning sstables read from). The
compactor however is created anew. As the reader can be in the middle of
a partition, the view builder injects a partition start into the
compactor to prime it for continuing the partition. This however only
included the partition-key, crucially missing any active tombstones:
partition tombstone or -- since the v2 transition -- active range
tombstone. This can result in base rows covered by either of these being
resurrected and the view builder generating view updates for them.
This patch solves this by using the detach-state mechanism of the
compactor which was explicitly developed for situations like this (in
the range scan code) -- resuming a read with the readers kept but the
compactor recreated.
Also included are two test cases reproducing the problem, one with a
range tombstone, the other with a partition tombstone.
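The failure mode can be illustrated with a heavily simplified model. This is only a sketch in Python, not Scylla's compactor API: `compact_batch`, the tuple-based fragments, and the timestamp rule are assumptions made for illustration, and only a partition tombstone is modelled.
```python
def compact_batch(fragments, tombstone_ts=None):
    """Compact one batch of fragments from a single partition.

    fragments: ("tombstone", ts) or ("row", ts, key) tuples (hypothetical model).
    tombstone_ts: timestamp of the tombstone still active from earlier batches.
    Returns (live_row_keys, tombstone_ts_to_carry_over).
    """
    live = []
    for frag in fragments:
        if frag[0] == "tombstone":
            ts = frag[1]
            tombstone_ts = ts if tombstone_ts is None else max(tombstone_ts, ts)
        else:
            _, ts, key = frag
            # A row survives only if it is newer than the active tombstone.
            if tombstone_ts is None or ts > tombstone_ts:
                live.append(key)
    return live, tombstone_ts


partition = [("tombstone", 10), ("row", 5, "a"), ("row", 5, "b"), ("row", 12, "c")]
first, second = partition[:2], partition[2:]

live_first, state = compact_batch(first)
# Fresh compactor without the carried-over state: row "b" is resurrected.
resurrected, _ = compact_batch(second)
# Fresh compactor primed with the detached state: row "b" stays dead.
correct, _ = compact_batch(second, state)
```
In this model, carrying `state` across the batch boundary plays the role of the detached compactor state.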
Fixes: #11668
Closes #11671
* abseil 9e408e05...7f3c0d78 (193):
> Allows absl::StrCat to accept types that implement AbslStringify()
> Merge pull request #1283 from pateldeev:any_inovcable_rename_true
> Cleanup: SmallMemmove nullify should also be limited to 15 bytes
> Cleanup: implement PrependArray and PrependPrecise in terms of InlineData
> Cleanup: Move BitwiseCompare() to InlineData, and make it layout independent.
> Change kPower10Table bounds to be half-open
> Cleanup some InlineData internal layout specific details from cord.h
> Improve the comments on the implementation of format hooks adl tricks.
> Expand LogEntry method docs.
> Documentation: Remove an obsolete note about the implementation of `Cord`.
> `absl::base_internal::ReadLongFromFile` should use `O_CLOEXEC` and handle interrupts to `read`
> Allows absl::StrFormat to accept types which implement AbslStringify()
> Add common_policy_traits - a subset of hash_policy_traits that can be shared between raw_hash_set and btree.
> Split configuration related to cycle clock into separate headers
> Fix -Wimplicit-int-conversion and -Wsign-conversion warnings in btree.
> Implement Eisel-Lemire for from_chars<float>
> Import of CCTZ from GitHub.
> Adds support for "%v" in absl::StrFormat and related functions for bool values. Note that %v prints bool values as "true" and "false" rather than "1" and "0".
> De-pointerize LogStreamer::stream_, and fix move ctor/assign preservation of flags and other stream properties.
> Explicitly disallows modifiers for use with %v.
> Change the macro ABSL_IS_TRIVIALLY_RELOCATABLE into a type trait - absl::is_trivially_relocatable - and move it from optimization.h to type_traits.h.
> Add sparse and string copy constructor benchmarks for hash table.
> Make BTrees work with custom allocators that recycle memory.
> Update the readme, and (internally) fix some export processes to better keep it up-to-date going forward.
> Add the fact that CHECK_OK exits the program to the comment of CHECK_OK.
> Adds support for "%v" in absl::StrFormat and related functions for numeric types, including integer and floating point values. Users may now specify %v and have the format specifier deduced. Integer values will print according to %d specifications, unsigned values will use %u, and floating point values will use %g. Note that %v does not work for `char` due to ambiguity regarding the intended output. Please continue to use %c for `char`.
> Implement correct move constructor and assignment for absl::strings_internal::OStringStream, and mark that class final.
> Add more options for `BM_iteration` in order to see better picture for choosing trade off for iteration optimizations.
> Change `EndComparison` benchmark to not measure iteration. Also added `BM_Iteration` separately.
> Implement Eisel-Lemire for from_chars<double>
> Add `-llog` to linker options when building log_sink_set in logging internals.
> Apply clang-format to btree.h.
> Improve failure message: tell the values we don't like.
> Increase the number of per-ObjFile program headers we can expect.
> Fix "unsafe narrowing" warnings in absl, 8/n.
> Fix format string error with an explicit cast
> Add a case to detect when the Bazel compiler string is explicitly set to "gcc", instead of just detecting Bazel's default "compiler" string.
> Fix "unsafe narrowing" warnings in absl, 10/n.
> Fix "unsafe narrowing" warnings in absl, 9/n.
> Fix stacktrace header includes
> Add a missing dependency on :raw_logging_internal
> CMake: Require at least CMake 3.10
> CMake: install artifacts reflect the compiled ABI
> Fixes bug so that `%v` with modifiers doesn't compile. `%v` is not intended to work with modifiers because the meaning of modifiers is type-dependent and `%v` is intended to be used in situations where the type is not important. Please continue using if `%s` if you require format modifiers.
> Convert algorithm and container benchmarks to cc_binary
> Merge pull request #1269 from isuruf:patch-1
> InlinedVector: Small improvement to the max_size() calculation
> CMake: Mark hash_testing as a public testonly library, as it is with Bazel
> Remove the ABSL_HAVE_INTRINSIC_INT128 test from pcg_engine.h
> Fix ClangTidy warnings in btree.h and btree_test.cc.
> Fix log StrippingTest on windows when TCHAR = WCHAR
> Refactors checker.h and replaces recursive functions with iterative functions for readability purposes.
> Refactors checker.h to use if statements instead of ternary operators for better readability.
> Import of CCTZ from GitHub.
> Workaround for ASAN stack safety analysis problem with FixedArray container annotations.
> Rollback of fix "unsafe narrowing" warnings in absl, 8/n.
> Fix "unsafe narrowing" warnings in absl, 8/n.
> Changes mutex profiling
> InlinedVector: Correct the computation of max_size()
> Adds support for "%v" in absl::StrFormat and related functions for string-like types (support for other builtin types will follow in future changes). Rather than specifying %s for strings, users may specify %v and have the format specifier deduced. Notably, %v does not work for `const char*` because we cannot be certain if %s or %p was intended (nor can we be certain if the `const char*` was properly null-terminated). If you have a `const char*` you know is null-terminated and would like to work with %v, please wrap it in a `string_view` before using it.
> Fixed header guards to match style guide conventions.
> Typo fix
> Added some more no_test.. tags to build targets for controlling testing.
> Remove includes which are not used directly.
> CMake: Add an option to build the libraries that are used for writing tests without requiring Abseil's tests be built (default=OFF)
> Fix "unsafe narrowing" warnings in absl, 7/n.
> Fix "unsafe narrowing" warnings in absl, 6/n.
> Release the Abseil Logging library
> Switch time_state to explicit default initialization instead of value initialization.
> spinlock.h: Clean up includes
> Fix minor typo in absl/time/time.h comment: "ToDoubleNanoSeconds" -> "ToDoubleNanoseconds"
> Support compilers that are unknown to CMake
> Import of CCTZ from GitHub.
> Change bit_width(T) to return int rather than T.
> Import of CCTZ from GitHub.
> Merge pull request #1252 from jwest591:conan-fix
> Don't try to enable use of ARM NEON intrinsics when compiling in CUDA device mode. They are not available in that configuration, even if the host supports them.
> Fix "unsafe narrowing" warnings in absl, 5/n.
> Fix "unsafe narrowing" warnings in absl, 4/n.
> Import of CCTZ from GitHub.
> Update Abseil platform support policy to point to the Foundational C++ Support Policy
> Import of CCTZ from GitHub.
> Add --features=external_include_paths to Bazel CI to ignore warnings from dependencies
> Merge pull request #1250 from jonathan-conder-sm:gcc_72
> Merge pull request #1249 from evanacox:master
> Import of CCTZ from GitHub.
> Merge pull request #1246 from wxilas21:master
> remove unused includes and add missing std includes for absl/status/status.h
> Sort INTERNAL_DLL_TARGETS for easier maintenance.
> Disable ABSL_HAVE_STD_IS_TRIVIALLY_ASSIGNABLE for clang-cl.
> Map the absl::is_trivially_* functions to their std impl
> Add more SimpleAtod / SimpleAtof test coverage
> debugging: handle alternate signal stacks better on RISCV
> Revert change "Fix "unsafe narrowing" warnings in absl, 4/n.".
> Fix "unsafe narrowing" warnings in absl, 3/n.
> Fix "unsafe narrowing" warnings in absl, 4/n.
> Fix "unsafe narrowing" warnings in absl, 2/n.
> debugging: honour `STRICT_UNWINDING` in RISCV path
> Fix "unsafe narrowing" warnings in absl, 1/n.
> Add ABSL_IS_TRIVIALLY_RELOCATABLE and ABSL_ATTRIBUTE_TRIVIAL_ABI macros for use with clang's __is_trivially_relocatable and [[clang::trivial_abi]].
> Merge pull request #1223 from ElijahPepe:fix/implement-snprintf-safely
> Fix frame pointer alignment check.
> Fixed sign-conversion warning in code.
> Import of CCTZ from GitHub.
> Add missing include for std::unique_ptr
> Do not re-close files on EINTR
> Renamespace absl::raw_logging_internal to absl::raw_log_internal to match (upcoming) non-raw logging namespace.
> Check for negative return values from ReadFromOffset
> Use HTTPS RFC URLs, which work regardless of the browser's locale.
> Avoid signedness change when casting off_t
> Internal Cleanup: removing unused internal function declaration.
> Make Span complain if constructed with a parameter that won't outlive it, except if that parameter is also a span or appears to be a view type.
> any_invocable_test: Re-enable the two conversion tests that used to fail under MSVC
> Add GetCustomAppendBuffer method to absl::Cord
> debugging: add hooks for checking stack ranges
> Minor clang-tidy cleanups
> Support [[gnu::abi_tag("xyz")]] demangling.
> Fix -Warray-parameter warning
> Merge pull request #1217 from anpol:macos-sigaltstack
> Undo documentation change on erase.
> Improve documentation on erase.
> Merge pull request #1216 from brjsp:master
> string_view: conditional constexpr is no longer needed for C++14
> Make exponential_distribution_test a bigger test (timeout small -> moderate).
> Move Abseil to C++14 minimum
> Revert commit f4988f5bd4176345aad2a525e24d5fd11b3c97ea
> Disable C++11 testing, enable C++14 and C++20 in some configurations where it wasn't enabled
> debugging: account for differences in alternate signal stacks
> Import of CCTZ from GitHub.
> Run flaky test in fewer configurations
> AnyInvocable: Move credits to the top of the file
> Extend visibility of :examine_stack to an upcoming Abseil Log.
> Merge contiguous mappings from the same file.
> Update versions of WORKSPACE dependencies
> Use ABSL_INTERNAL_HAS_SSE2 instead of __SSE2__
> PR #1200: absl/debugging/CMakeLists.txt: link with libexecinfo if needed
> Update GCC floor container to use Bazel 5.2.0
> Update GoogleTest version used by Abseil
> Release absl::AnyInvocable
> PR #1197: absl/base/internal/direct_mmap.h: fix musl build on mips
> absl/base/internal/invoke: Ignore bogus warnings on GCC >= 11
> Revert GoogleTest version used by Abseil to commit 28e1da21d8d677bc98f12ccc7fc159ff19e8e817
> Update GoogleTest version used by Abseil
> explicit_seed_seq_test: work around/disable bogus warnings in GCC 12
> any_test: expand the any emplace bug suppression, since it has gotten worse in GCC 12
> absl::Time: work around bogus GCC 12 -Wrestrict warning
> Make absl::StdSeedSeq an alias for std::seed_seq
> absl::Optional: suppress bogus -Wmaybe-uninitialized GCC 12 warning
> algorithm_test: suppress bogus -Wnonnull warning in GCC 12
> flags/marshalling_test: work around bogus GCC 12 -Wmaybe-uninitialized warning
> counting_allocator: suppress bogus -Wuse-after-free warning in GCC 12
> Prefer to fallback to UTC when the embedded zoneinfo data does not contain the requested zone.
> Minor wording fix in the comment for ConsumeSuffix()
> Tweak the signature of status_internal::MakeCheckFailString as part of an upcoming change
> Fix several typos in comments.
> Reformulate documentation of ABSL_LOCKS_EXCLUDED.
> absl/base/internal/invoke.h: Use ABSL_INTERNAL_CPLUSPLUS_LANG for language version guard
> Fix C++17 constexpr storage deprecation warnings
> Optimize SwissMap iteration by another 5-10% for ARM
> Add documentation on optional flags to the flags library overview.
> absl: correct the stack trace path on RISCV
> Merge pull request #1194 from jwnimmer-tri:default-linkopts
> Remove unintended defines from config.h
> Ignore invalid TZ settings in tests
> Add ABSL_HARDENING_ASSERTs to CordBuffer::SetLength() and CordBuffer::IncreaseLengthBy()
> Fix comment typo about absl::Status<T*>
> In b-tree, support unassignable value types.
> Optimize SwissMap for ARM by 3-8% for all operations
> Release absl::CordBuffer
> InlinedVector: Limit the scope of the maybe-uninitialized warning suppression
> Improve the compiler error by removing some noise from it. The "deleted" overload error is useless to users. By passing some dummy string to the base class constructor we use a valid constructor and remove the unintended use of the deleted default constructor.
> Merge pull request #714 from kgotlinux:patch-2
> Include proper #includes for POSIX thread identity implementation when using that implementation on MinGW.
> Rework NonsecureURBGBase seed sequence.
> Disable tests on some platforms where they currently fail.
> Fixed typo in a comment.
> Rollforward of commit ea78ded7a5f999f19a12b71f5a4988f6f819f64f.
> Add an internal helper for logging (upcoming).
> Merge pull request #1187 from trofi:fix-gcc-13-build
> Merge pull request #1189 from renau:master
> Allow for using b-tree with `value_type`s that can only be constructed by the allocator (ignoring copy/move constructors).
> Stop using sleep timeouts for Linux futex-based SpinLock
> Automated rollback of commit f2463433d6c073381df2d9ca8c3d8f53e5ae1362.
> time.h: Use uint32_t literals for calls to overloaded MakeDuration
> Fix typos.
> Clarify the behaviour of `AssertHeld` and `AssertReaderHeld` when the calling thread doesn't hold the mutex.
> Enable __thread on Asylo
> Add implementation of is_invocable_r to absl::base_internal for C++ < 17, define it as alias of std::is_invocable_r when C++ >= 17
> Optimize SwissMap iteration for aarch64 by 5-6%
> Fix detection of ABSL_HAVE_ELF_MEM_IMAGE on Haiku
> Don’t use generator expression to build .pc Libs lines
> Update Bazel used on MacOS CI
> Import of CCTZ from GitHub.
Closes #11687
To recap: the Nix devenv ({default,shell,flake}.nix and friends) in Scylla is a nicer (for those who consider it so, that is) alternative to dbuild: a completely deterministic build environment without Docker.
In theory we could support much more (creating installable packages, container images, various deployment affordances, etc. -- Nix is, among other things, a kind of parallel-to-everything-else devops realm) but there is clearly no demand and besides duplicating the work the release team is already doing (and doing just fine, needless to say) would be pointless and wasteful.
This PR reflects the accumulated changes that I have been carrying locally for the past year or so. The version currently in master _probably_ can still build Scylla, but that Scylla certainly would not pass unit tests.
What the previous paragraph seems to mean is, apparently I'm the only active user of Nix devenv for Scylla. Which, in turn, presents some obvious questions for the maintainers:
- Does this need to live in the Scylla source at all? (The changes to non-Nix-specific parts are minimal and unobtrusive, but they are still changes)
- If it's left in, who is going to maintain it going forward, should more users somehow appear? (I'm perfectly willing to fix things up when alerted, but no timeliness guarantees)
Closes #9557
* github.com:scylladb/scylladb:
nix: add README.md
build: improvements & upgrades to Nix dev environment
build: allow setting SCYLLA_RELEASE from outside
The combination of coroutines and continuations in shard_reader_v2::do_fill_buffer() is hard to read and modify.
Closes #11665
* github.com:scylladb/scylladb:
readers/multishard: restore shard_reader_v2::do_fill_buffer() indentation
readers/multishard: convert shard_reader_v2::do_fill_buffer() to a pure coroutine
Include the unique test name (the unique name distinguishes between different test repeats) and the test case name where possible. Improve printing of clusters: include the cluster name and stopped servers. Fix some logging calls and add new ones.
Examples:
```
------ Starting test test_topology ------
```
became this:
```
------ Starting test test_topology.1::test_add_server_add_column ------
```
This:
```
INFO> Leasing Scylla cluster {127.191.142.1, 127.191.142.2, 127.191.142.3} for test test_add_server_add_column
```
became this:
```
INFO> Leasing Scylla cluster ScyllaCluster(name: 02cdd180-40d1-11ed-8803-3c2c30d32d96, running: {127.144.164.1, 127.144.164.2, 127.144.164.3}, stopped: {}) for test test_topology.1::test_add_server_add_column
```
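A string representation producing the new format could look roughly like this. `ClusterSketch` is a hypothetical stand-in, not the actual class in test/pylib; only the output format is taken from the example above.
```python
class ClusterSketch:
    """Hypothetical stand-in for a cluster object; only printing is modelled."""

    def __init__(self, name, running, stopped):
        self.name = name
        self.running = list(running)
        self.stopped = list(stopped)

    def __str__(self):
        running = ", ".join(self.running)
        stopped = ", ".join(self.stopped)
        # Doubled braces emit literal '{' and '}' around each server list.
        return (f"ScyllaCluster(name: {self.name}, "
                f"running: {{{running}}}, stopped: {{{stopped}}})")
```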
Closes #11677
* github.com:scylladb/scylladb:
test/pylib: scylla_cluster: improve cluster printing
test/pylib: don't pass test_case_name to after-test endpoint
test/pylib: scylla_cluster: track current test case name and print it
test.py: pass the unique test name (e.g. `test_topology.1`) to cluster manager
test/pylib: scylla_cluster: pass the test case name to `before_test`
test/pylib: use "test_case_name" variable name when talking about test cases
reclaim_timer uses a coarse clock, but does not account for
the measurement error introduced by that -- it can falsely
report reclaims as stalls, even if they are shorter by a full
coarse clock tick from the requested threshold
(blocked-reactor-notify-ms).
Notably, if the stall threshold happens to be smaller than or equal to the coarse
clock resolution, Scylla's log gets spammed with false stall reports.
The resolution of coarse clocks in Linux is 1/CONFIG_HZ. This is
typically equal to 1 ms or 4 ms, and stall thresholds of this order
can occur in practice.
Eliminate false positives by requiring the measured reclaim duration to
be at least 1 clock tick longer than the configured threshold for it to
be considered a stall.
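The adjusted check can be sketched as follows. This is an illustration in Python rather than the actual C++ code; `is_stall` and its parameter names are hypothetical.
```python
def is_stall(measured_ms: float, threshold_ms: float, tick_ms: float) -> bool:
    """Report a stall only if the measured duration exceeds the configured
    threshold by at least one coarse clock tick (hypothetical helper)."""
    return measured_ms >= threshold_ms + tick_ms
```
For example, with a 4 ms tick and a 1 ms threshold, a reclaim measured at 4 ms is no longer reported, since the true duration may have been well below the threshold.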
Fixes #10981
Closes #11680
"
This series adds the long-awaited transition of our auto-generation
code to irq_cpu_mask instead of 'mode' in perftune.yaml.
And then it fixes a regression in scylla_prepare perftune.yaml
auto-generation logic.
"
* 'scylla_prepare_fix_regression-v1' of https://github.com/vladzcloudius/scylla:
scylla_prepare + scylla_cpuset_setup: make scylla_cpuset_setup idempotent without introducing regressions
scylla_prepare: stop generating 'mode' value in perftune.yaml
* Add some more useful stuff to the shell environment, so it actually
works for debugging & post-mortem analysis.
* Wrap ccache & distcc transparently (distcc will be used unless
NODISTCC is set to a non-empty value in the environment; ccache will
be used if CCACHE_DIR is not empty).
* Package the Scylla Python driver (instead of the C* one).
* Catch up to misc build/test requirements (including optional) by
requiring or custom-packaging: wasmtime 0.29.0, cxxbridge,
pytest-asyncio, liburing.
* Build statically-linked zstd in a saner and more idiomatic fashion.
* In pure builds (where sources lack Git metadata), derive
SCYLLA_RELEASE from source hash.
* Refactor things for more parameterization.
* Explicitly stub out installPhase (seeing that "nix build" succeeds
up to installPhase means we didn't miss any dependencies).
* Add flake support.
* Add copious comments.
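The ccache/distcc selection rule from the list above can be sketched as follows. This is an illustration in Python, not the Nix code itself; `compiler_wrappers` is a hypothetical name.
```python
def compiler_wrappers(env):
    """Decide which wrappers apply, given a mapping of environment variables.

    distcc is used unless NODISTCC is set to a non-empty value;
    ccache is used only if CCACHE_DIR is non-empty.
    """
    return {
        "distcc": not env.get("NODISTCC", ""),
        "ccache": bool(env.get("CCACHE_DIR", "")),
    }
```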
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
The extant logic for deriving the value of SCYLLA_RELEASE from the
source tree makes the following assumptions:
* The tree being built includes Git metadata.
* The value of `date` is trustworthy and interesting.
* There are no uncommitted changes (those relevant to building,
anyway).
The above assumptions are either irrelevant or problematic in pure
build environments (such as the sandbox set up by `nix-build`):
* Pure builds use cleaned-up sources with all timestamps reset to Unix
time 0. Those cleaned-up sources are saved (in the Nix store, for
example) and content-hashed, so leaving the (possibly huge) Git
metadata increases the time to copy the sources and wastes disk
space (in fact, Nix in flake mode strips `.git` unconditionally).
* Pure builds run in a sandbox where time is, likewise, reset to Unix
time 0, so the output of `date` is neither informative nor useful.
Now, the only build step that uses Git metadata in the first place is
the SCYLLA_RELEASE value derivation logic. So, essentially, answering
the question "is the Git metadata needed to build Scylla" is a matter
of definition, and is up to us. If we elect to ignore Git metadata
and current time, we can derive SCYLLA_RELEASE value from the content
hash of the cleaned-up tree, regardless of the way that tree was
arrived at.
This change makes it possible to skip the derivation of SCYLLA_RELEASE
value from Git metadata and current time by way of setting
SCYLLA_RELEASE in the environment.
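The override can be sketched as follows. This is a Python illustration, not the actual build script; `scylla_release` and `derive_from_git` are hypothetical names, and treating an empty value as unset is an assumption.
```python
import os


def scylla_release(derive_from_git, env=None):
    """Return SCYLLA_RELEASE from the environment if set, else derive it.

    derive_from_git: callable standing in for the Git/date based derivation.
    env: mapping of environment variables (defaults to os.environ).
    """
    env = os.environ if env is None else env
    release = env.get("SCYLLA_RELEASE")
    if release:  # assumption: an empty value counts as "not set"
        return release
    return derive_from_git()
```
A pure build could then export a value derived from the content hash of the source tree, and the Git-based path would never run.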
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
We notice there are two separate conditions controlling a call to
a single outcome, notify_pressure_relief(). Merge them into a single
boolean variable.
It started life as something shared between memory_hard_limit and
region_group, but now that they are back being the same thing, we
can make it a member again.
The two classes always have a 1:1 or 0:1 relationship, and
so we can just move all the members of memory_hard_limit
into region_group, with the functions that track the relationship
(memory_hard_limit::{add,del}()) removed.
The 0:1 relationship is maintained by initializing the
hard limit parameter with std::numeric_limits<size_t>::max().
The _hard_total_memory variable is always checked if it is
greater than this parameter in order to do anything, and
with this default it can never be.
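The effect of that default can be sketched as follows. This is a Python illustration of the idea, not the C++ code; `SIZE_MAX` assumes a 64-bit size_t and `over_hard_limit` is a hypothetical name.
```python
SIZE_MAX = 2**64 - 1  # stand-in for std::numeric_limits<size_t>::max(), assuming 64-bit size_t


def over_hard_limit(hard_total_memory: int, hard_limit: int = SIZE_MAX) -> bool:
    """True if the tracked total exceeds the hard limit.

    With the default limit, no value representable in size_t can exceed it,
    so the hard-limit logic is effectively disabled.
    """
    return hard_total_memory > hard_limit
```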
In preparation for merging memory_hard_limit into region_group,
disambiguate similarly named members by adding the word "hard" in
random places.
memory_hard_limit and region_group are candidates for merging
because they constantly reference each other, and memory_hard_limit
does very little by itself.
do_fill_buffer() is an eclectic mix of coroutines and continuations.
That makes it hard to follow what is running sequentially and
concurrently.
Convert it into a pure coroutine by changing internal continuations
to lambda coroutines. These lambda coroutines are guarded with
seastar::coroutine::lambda. Furthermore, a future that is co_awaited
is converted to immediate co_await (without an intermediate future),
since seastar::coroutine::lambda only works if the coroutine is awaited
in the same statement it is defined on.
Print the cluster name and stopped servers in addition to the running
servers.
Fix a logging call which tried to print a server in place of a cluster
and even at that it failed (the server didn't have a hostname yet so it
printed as an empty string). Add another logging call.
Use `_before_test` calls to track the current test case name.
Concatenate it with the unique test name like this:
`test_topology.1::test_add_server_add_column`, and print it
instead of the test case name.
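The tracking could look roughly like this. `CurrentTestName` is a hypothetical helper, not the actual test/pylib code; only the `::` concatenation format comes from the description above.
```python
class CurrentTestName:
    """Hypothetical tracker for the name printed in log messages."""

    def __init__(self, unique_test_name: str):
        self.unique_test_name = unique_test_name
        self.current = unique_test_name

    def before_test(self, test_case_name: str) -> None:
        # Combine the unique test name with the test case about to run.
        self.current = f"{self.unique_test_name}::{test_case_name}"
```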
We pass the test case name to `after_test` - so make it consistent.
Arguably, the test case name is more useful (as it's more precise) than
the test name.
Reduce the false dependencies on db/large_data_handler.hh by
not including it from commonly used header files, and rather including
it only in the source files that actually need it.
This is in preparation for https://github.com/scylladb/scylladb/issues/11449
Closes #11654
* github.com:scylladb/scylladb:
test: lib: do not include db/large_data_handler.hh in test_service.hh
test: lib: move sstable test_env::impl ctor out of line
sstables: do not include db/large_data_handler.hh in sstables.hh
api/column_family: add include db/system_keyspace.hh
The generator was first setting the marker then applied tombstones.
The marker was set like this:
row.marker() = random_row_marker();
Later, when shadowable tombstones were applied, they were compacted
with the marker as expected.
However, the key for the row was chosen randomly in each iteration and
there are multiple keys set, so there was a possibility of a key clash
with an earlier row. This could override the marker without applying
any tombstones, which is conditional on random choice.
This could generate rows with markers uncompacted with shadowable tombstones.
This broke row_cache_test::test_concurrent_reads_and_eviction on
comparison between expected and read mutations. The latter was
compacted because it went through an extra merge path, which compacts
the row.
Fix by making sure there are no key clashes.
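One way to rule out key clashes is to draw the keys without replacement. The sketch below is in Python and only illustrates the idea; the actual C++ generator may use a different mechanism, and `unique_row_keys` is a hypothetical name.
```python
import random


def unique_row_keys(rng: random.Random, key_space: int, n_rows: int) -> list:
    """Pick n_rows distinct keys from range(key_space).

    random.sample draws without replacement, so no key can repeat and
    a later row can never overwrite the marker of an earlier one.
    """
    return rng.sample(range(key_space), n_rows)
```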
Closes #11663
The `server_remove` function did a very weird thing: it shut down a
server and made the framework 'forget' about it. From the point of view
of the Scylla cluster and the driver the server was still there.
Replace the function's body with `raise NotImplementedError`. In the
future it can be replaced with an implementation that calls
`removenode` on the Scylla cluster.
Remove `test_remove_server_add_column` from `test_topology`. It
effectively does the same thing as `test_stop_server_add_column`, except
that the framework also 'forgets' about the stopped server. This could
lead to weird situations because the forgotten server's IP could be
reused in another test that was running concurrently with this test.
Closes #11657
It was needed for defining and referencing nop_lp_handler
and in sstable_3_x_test for testing the large_data_handler.
Remove the include from the commonly used header file
to reduce the false dependencies on large_data_handler.hh
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The logic to reject explicit snapshots of views/indexes was improved in aa127a2dbb. However, we never implemented auto-snapshot of
views/indexes when taking a snapshot of the base table.
This is implemented in this patch.
The implementation is built on top of
ba42852b0e
so it would be hard to backport to 5.1 or earlier
releases.
Fixes#11612
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #11616
* github.com:scylladb/scylladb:
database: automatically take snapshot of base table views
api: storage_service: reject snapshot of views in api layer
For db::system_keyspace::load_view_build_progress, which is currently
indirectly satisfied via sstables/sstables.hh ->
db/large_data_handler.hh
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Due to issue #11567, Alternator does not yet support adding a GSI to an
existing table via UpdateTable with the GlobalSecondaryIndexUpdates
parameter.
However, currently, we print a misleading error message in this case,
complaining about the AttributeDefinitions parameter. This parameter
is also required with GlobalSecondaryIndexUpdates, but it's not the
main problem, and the user is likely to be confused why the error message
points to that specific parameter and what it means that this parameter
is claimed to be "not supported" (while it is supported, in CreateTable).
With this patch, we report that GlobalSecondaryIndexUpdates is not
supported.
This patch does not fix the unsupported feature - it just improves
the error message saying that it's not supported.
Refs #11567
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #11650
When walking through the ranges, we should yield to prevent stalls. We
do a similar yield in other node operations.
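The pattern is analogous to cooperative yielding in any single-threaded event loop. The sketch below uses Python's asyncio purely as an analogy; it is not the Seastar code, and `walk_ranges`, `handle`, and `yield_every` are hypothetical names.
```python
import asyncio


async def walk_ranges(ranges, handle, yield_every=1):
    """Process each range, periodically yielding to the event loop."""
    for i, r in enumerate(ranges, 1):
        handle(r)
        if i % yield_every == 0:
            # Give other tasks a chance to run, so a long walk does not
            # monopolize the loop.
            await asyncio.sleep(0)
```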
Fix a stall in 5.1.dev.20220724.f46b207472a3 with build-id
d947aaccafa94647f71c1c79326eb88840c5b6d2
```
!INFO | scylla[6551]: Reactor stalled for 10 ms on shard 0. Backtrace:
0x4bbb9d2 0x4bba630 0x4bbb8e0 0x7fd365262a1f 0x2face49 0x2f5caff
0x36ca29f 0x36c89c3 0x4e3a0e1
```
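The yielding pattern described above can be illustrated with a minimal sketch. The actual fix lives in Scylla's C++ code and is not shown in this log; the snippet below is only a Python asyncio analog, and `walk_ranges` and `yield_every` are hypothetical names introduced for illustration.

```python
import asyncio


async def walk_ranges(ranges, yield_every=64):
    """Walk all ranges, giving up the event loop every `yield_every` items.

    Hypothetical analog of yielding inside a long loop so that other
    tasks on the same thread are not starved.
    """
    processed = 0
    for index, _range in enumerate(ranges, start=1):
        processed += 1  # stand-in for the real per-range work
        if index % yield_every == 0:
            # Cooperative yield: lets other ready tasks run before we continue.
            await asyncio.sleep(0)
    return processed


async def demo():
    ticks = 0

    async def ticker():
        # Background task that can only advance when walk_ranges yields.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0)

    task = asyncio.create_task(ticker())
    processed = await walk_ranges(range(1000), yield_every=100)
    task.cancel()
    return processed, ticks
```

Without the periodic yield, the background task in `demo()` would not run at all until the walk finished, which is the asyncio counterpart of a reactor stall.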
Fixes #11146
Closes #11160
Extend the cql3 truncate statement to accept attributes,
similar to modification statements.
To achieve that we define cql3::statements::raw::truncate_statement
derived from raw::cf_statement, and implement its pure virtual
prepare() method to make a prepared truncate_statement.
The latter is no longer derived from raw::cf_statement,
and just stores a schema_ptr to get to the keyspace and column_family.
`test_truncate_using_timeout` cql-pytest was added to test
the new USING TIMEOUT feature.
Fixes#11408
Also, update the truncate-statement section of docs/cql/ddl.rst accordingly.
Closes#11409
* github.com:scylladb/scylladb:
docs: cql-extensions: add TRUNCATE to USING TIMEOUT section.
docs: cql: ddl: add support for TRUNCATE USING TIMEOUT
cql3, storage_proxy: add support for TRUNCATE USING TIMEOUT
cql3: selectStatement: restrict to USING TIMEOUT in grammar
cql3: deleteStatement: restrict to USING TIMEOUT|TIMESTAMP in grammar
The series contains fixes for the system.large_* log warnings and the respective documentation.
This prepares the way for adding a new system.large_collections table (see #11449).
Fixes #11620
Fixes #11621
Fixes #11622
The respective fixes should be backported to different release branches, based on the patches they depend on (mentioned in each issue).
Closes#11623
* github.com:scylladb/scylladb:
docs: adjust to sstable base name
docs: large-partition-table: adjust for additional rows column
docs: debugging-large-partition: update log warning example
db/large_data_handler: print static cell/collection description in log warning
db/large_data_handler: separate pk and ck strings in log warning with delimiter
Fix the type of `create_server`, rename `topology_for_class` to `get_cluster_factory`, and simplify the suite definitions and the parameters passed to `get_cluster_factory`.
Closes#11590
* github.com:scylladb/scylladb:
test.py: replace `topology` with `cluster_size` in Topology tests
test.py: rename `topology_for_class` to `get_cluster_factory`
test/pylib: ScyllaCluster: fix create_server parameter type
The test was disabled due to a bug in the Python driver which caused the
driver not to reconnect after a node was restarted (see
scylladb/python-driver#170).
Introduce a workaround for that bug: we simply create a new driver
session after restarting the nodes. Reenable the test.
Closes#11641
Extended the query language to support bind variables, which are bound in the
execution stage, before creating a raft command.
Adjusted `test_broadcast_tables.py` to prepare statements at the beginning of the test.
Fixed a small bug in `strongly_consistent_modification_statement::check_access`.
Closes#11525
Before this patch we could get an OOM if we
received several big commands. The number of
commands was small, but their total size
in bytes was large.
snapshot_trailing_size is needed to guarantee
progress. Without this limit the fsm could
get stuck if the size of the next item is greater than
max_log_size - (size of trailing entries).
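The progress argument above can be sketched as follows. This is a simplified model, not the raft server's code: the class `ByteLimitedLog` and its methods are hypothetical, and only the names `max_log_size` and `snapshot_trailing_size` come from the message. It assumes a snapshot trims the log down to at most `snapshot_trailing_size` bytes.

```python
from collections import deque


class ByteLimitedLog:
    """Toy log limited by total bytes rather than by entry count."""

    def __init__(self, max_log_size, snapshot_trailing_size):
        if snapshot_trailing_size >= max_log_size:
            raise ValueError("trailing size must be below the log size limit")
        self.max_log_size = max_log_size
        self.snapshot_trailing_size = snapshot_trailing_size
        self.entries = deque()
        self.size = 0  # total bytes currently held

    def append(self, payload):
        """Return True if stored, False if the caller must wait (backpressure)."""
        # An entry larger than this could never fit, even right after a
        # snapshot, so accepting it into a wait queue would block forever.
        if len(payload) > self.max_log_size - self.snapshot_trailing_size:
            raise ValueError("entry can never fit in the log")
        if self.size + len(payload) > self.max_log_size:
            return False
        self.entries.append(payload)
        self.size += len(payload)
        return True

    def snapshot(self):
        """Drop the oldest entries until at most the trailing size remains."""
        while self.size > self.snapshot_trailing_size:
            self.size -= len(self.entries.popleft())
```

Because a snapshot leaves at most `snapshot_trailing_size` bytes behind, every entry that passes the size check in `append` is guaranteed to fit after the next snapshot, so a rejected append can always be retried successfully.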
Closes#11397
* github.com:scylladb/scylladb:
raft replication_test, make backpressure test to do actual backpressure
raft server, shrink_to_fit on log truncation
raft server, release memory if add_entry throws
raft server, log size limit in bytes
When there are errors starting the first cluster(s), the server logs are needed. So move `.start()` to the `try` block in `test.py` (out of `asynccontextmanager`).
While there, make `ScyllaClusterManager.start()` idempotent.
Closes#11594
* github.com:scylladb/scylladb:
test.py: fix ScyllaClusterManager start/stop
test.py: fix topology init error handling
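The two changes in this series (starting inside the `try` block and an idempotent `start()`) can be sketched roughly as below. This is not the actual `test.py` code: `ClusterManager`, `run_suite`, and the `fail_on_start` flag are hypothetical stand-ins for illustration only.

```python
import asyncio


class ClusterManager:
    """Hypothetical stand-in for a cluster manager with an idempotent start()."""

    def __init__(self, fail_on_start=False):
        self.is_running = False
        self.start_calls = 0  # number of real startup attempts
        self.log = []
        self.fail_on_start = fail_on_start

    async def start(self):
        if self.is_running:
            return  # idempotent: a second call does nothing
        self.start_calls += 1
        self.log.append("starting")
        if self.fail_on_start:
            raise RuntimeError("server failed to boot")
        self.is_running = True

    async def stop(self):
        self.log.append("stopped")
        self.is_running = False


async def run_suite(manager):
    try:
        # start() is inside the try block, so a startup failure reaches the
        # handler below, where the collected logs are still available.
        await manager.start()
        await manager.start()
        return "ok"
    except RuntimeError as exc:
        return f"failed: {exc}; logs={manager.log}"
    finally:
        await manager.stop()
```

If `start()` were called before entering the `try` block, a startup error would skip both the handler and the cleanup, and the logs needed to diagnose the failure would be lost.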
We don't want to keep memory we don't use; shrink_to_fit guarantees that.
In fact, boost::deque frees up memory when items are deleted, so this change has little effect at the moment, but it may pay off if we change the container in the future.