Commit Graph

33280 Commits

Author SHA1 Message Date
Benny Halevy
2c4ff71d2b db: large_data_handler: dynamically update config thresholds
Make the various large data thresholds live-updateable,
and construct the observers and updaters in
cql_table_large_data_handler to dynamically update
the base large_data_handler class threshold members.

Fixes #11685

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-05 10:53:40 +03:00
Benny Halevy
6d582054c0 utils/updateable_value: add transforming_value_updater
Automatically updates a value from a utils::updateable_value,
where the two can be of different types.
An optional transform function can apply an additional transformation
when updating the value, for example multiplying it by a factor for
unit conversion.

To be used for auto-updating the large data thresholds
from the db::config.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-05 10:52:49 +03:00
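
The updater described in 6d582054c0 can be illustrated with a small Python sketch. This is not Scylla's C++ code: `UpdateableValue`, `TransformingValueUpdater`, `Handler`, and the MB-to-bytes conversion are hypothetical names and units chosen only to show the pattern of observing a source value and assigning a transformed copy to a target.

```python
from typing import Callable, Generic, List, Optional, TypeVar

S = TypeVar("S")


class UpdateableValue(Generic[S]):
    """Hypothetical stand-in for a live-updateable config value."""

    def __init__(self, value: S) -> None:
        self._value = value
        self._observers: List[Callable[[S], None]] = []

    def get(self) -> S:
        return self._value

    def set(self, value: S) -> None:
        # Store the new value, then notify every registered observer.
        self._value = value
        for notify in self._observers:
            notify(value)

    def observe(self, callback: Callable[[S], None]) -> None:
        self._observers.append(callback)


class TransformingValueUpdater:
    """Keeps a target in sync with a source value, optionally transforming it."""

    def __init__(self, source: UpdateableValue, assign: Callable,
                 transform: Optional[Callable] = None) -> None:
        self._assign = assign
        # Without a transform, the source value is passed through unchanged.
        self._transform = transform if transform is not None else (lambda v: v)
        self._update(source.get())      # initialize the target
        source.observe(self._update)    # follow later updates

    def _update(self, value) -> None:
        self._assign(self._transform(value))


class Handler:
    """Hypothetical consumer holding a threshold in bytes."""

    def __init__(self) -> None:
        self.cell_threshold_bytes = 0


threshold_mb = UpdateableValue(1)   # assumed: the config value is in MB
handler = Handler()
updater = TransformingValueUpdater(
    threshold_mb,
    lambda v: setattr(handler, "cell_threshold_bytes", v),
    lambda mb: mb * 1024 * 1024,    # unit conversion: MB to bytes
)
initial_bytes = handler.cell_threshold_bytes
threshold_mb.set(4)                 # a live config update
updated_bytes = handler.cell_threshold_bytes
```

The target is written once at construction and again on every `set()`, so the consumer never has to poll the config value.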
Benny Halevy
46ebffcc93 db/large_data_handler: cql_table_large_data_handler: record large_collections
When the large_collection_detection cluster feature is enabled,
select the internal_record_large_cells_and_collections method
to record the large collection cell, storing also the collection_elements
column.

We want to do that only when the cluster feature is enabled
to facilitate rollback in case rolling upgrade is aborted,
otherwise system.large_cells won't be backward compatible
and will have to be deleted manually.

Delete the sstable from system.large_cells if it contains
elements_in_collection above threshold.

Closes #11449

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:10 +03:00
Benny Halevy
3f8bba202f db/large_data_handler: pass ref to feature_service to cql_table_large_data_handler
For recording collection_elements of large_collections when
the large_collection_detection feature is enabled.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:10 +03:00
Benny Halevy
dc4e7d8e01 db/large_data_handler: cql_table_large_data_handler: move ctor out of line
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:09 +03:00
Benny Halevy
f4c3070002 docs: large-rows-large-cells-tables: fix typos
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:09 +03:00
Benny Halevy
2f49eebb04 db/system_keyspace: add collection_elements column to system.large_cells
And bump the schema version offset since the new schema
should be distinguishable from the previous one.

Refs scylladb/scylladb#11660

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:08 +03:00
Benny Halevy
9ad41c700e gms/feature_service: add large_collection_detection cluster feature
And a corresponding db::schema_feature::SCYLLA_LARGE_COLLECTIONS

We want to enable the schema change supporting collection_elements
only when all nodes are upgraded so that we can roll back
if the rolling upgrade process is aborted.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:07 +03:00
Benny Halevy
9eeb8f2971 test: sstable_3_x_test: add test_sstable_too_many_collection_elements
Test that collections with too many elements are detected properly.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:07 +03:00
Benny Halevy
3c11937b00 test: lib: simple_schema: add support for optional collection column
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:06 +03:00
Benny Halevy
7b5f2d2e53 test: lib: simple_schema: build schema in ctor body
Rather than when initializing _s.

Prepare for adding an optional collection column.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:06 +03:00
Benny Halevy
db01641a44 test: lib: simple_schema: cql: define s1 as static only if built this way
Keep the with_static ctor parameter as a private member
to be used by the cql() method to define s1 either as static or not.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:05 +03:00
Benny Halevy
6dadca2648 db/large_data_handler: maybe_record_large_cells: consider collection_elements
Detect large_collections when the number of collection_elements
is above the configured threshold.

Next step would be to record the number of collection_elements
in the system.large_cells table, when the respective
cluster feature is enabled.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:05 +03:00
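
The detection rule described in 6dadca2648 amounts to checking two independent thresholds. The sketch below is a hedged Python illustration, not the C++ `maybe_record_large_cells` code: `LargeDataThresholds` and `is_large_cell` are hypothetical names, and the strict "greater than" comparison is an assumption based on the wording "above the configured threshold".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LargeDataThresholds:
    """Hypothetical holder for the two configured thresholds."""
    cell_size_bytes: int
    collection_elements: int


def is_large_cell(cell_size_bytes: int, collection_elements: int,
                  thresholds: LargeDataThresholds) -> bool:
    """Report a cell if it is too big in bytes OR holds too many elements."""
    too_big = cell_size_bytes > thresholds.cell_size_bytes
    too_many = collection_elements > thresholds.collection_elements
    return too_big or too_many


limits = LargeDataThresholds(cell_size_bytes=1024, collection_elements=100)
small_but_crowded = is_large_cell(64, 101, limits)    # many tiny elements
big_but_sparse = is_large_cell(2048, 3, limits)       # few large elements
unremarkable = is_large_cell(64, 3, limits)
```

A collection of many tiny elements can stay under the byte threshold, which is why the element count needs its own check.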
Benny Halevy
27ee75c54e db/large_data_handler: debug cql_table_large_data_handler::delete_large_data_entries
Log at debug level when deleting a large data entry
from the system table.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:04 +03:00
Benny Halevy
7dead10742 sstables: mx/writer: pass collection_elements to writer::maybe_record_large_cells
And update the sstable elements_in_collection
stats entry.

Next step would be to forward it to
large_data_handler().maybe_record_large_cells().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:41:58 +03:00
Benny Halevy
54ab038825 sstables: mx/writer: add large_data_type::elements_in_collection
Add a new large_data_stats type and entry for keeping
the collection_elements_count_threshold and the maximum value
of collection_elements.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:41:56 +03:00
Benny Halevy
a107f583fd db/large_data_handler: get the collection_elements_count_threshold
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:31:11 +03:00
Benny Halevy
167ec84eeb db/config: add compaction_collection_elements_count_warning_threshold
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:31:10 +03:00
Benny Halevy
5e88e6267e test: sstable_3_x_test: add test_sstable_write_large_cell
based on cell size threshold.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:31:09 +03:00
Benny Halevy
3980415d97 test: sstable_3_x_test: pass cell_threshold_bytes to large_data_handler
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:31:09 +03:00
Benny Halevy
3eb4cda8ea test: sstable_3_x_test: large_data_handler: prepare callback for testing large_cells
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:31:08 +03:00
Benny Halevy
0a9d3f24e6 test: sstable_3_x_test: large_data tests: use BOOST_REQUIRE_[GL]T
This way, the boost infrastructure prints the offending values
if the test assertion fails.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:31:07 +03:00
Benny Halevy
9668dd0e2d test: sstable_3_x_test: test_sstable_log_too_many_rows: use tests::random
So that it is reproducible based on the test random seed.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:30:51 +03:00
Kamil Braun
114419d6ab service/raft: raft_group0_client: read on-disk and in-memory group0 upgrade state atomically
`set_group0_upgrade_state` writes the on-disk state first, then
in-memory state second, both under a write lock.
`get_group0_upgrade_state` would only take the lock if the in-memory
state was `use_pre_raft_procedures`.

If there's an external observer who watches the on-disk state to decide
whether Raft upgrade finished yet, the following could happen:
1. The node wrote `use_post_raft_procedures` to disk but didn't update
   the in-memory state yet, which is still `synchronize`.
2. The external client reads the table and sees that the state is
   `use_post_raft_procedures`, and deduces that upgrade has finished.
3. The external client immediately tries to perform a schema change. The
   schema change code calls `get_group0_upgrade_state` which does not
   take the read lock and returns `synchronize`. The schema change gets
   denied because schema changes are not allowed in `synchronize`.

Make sure that `get_group0_upgrade_state` cannot execute in-between
writing to disk and updating the in-memory state by always taking the
read lock before reading the in-memory state. As it was before, it will
immediately drop the lock if the state is not `use_pre_raft_procedures`.

This is useful for upgrade tests, which read the on-disk state to decide
whether upgrade has finished and often try to perform a schema change
immediately afterwards.

Closes #11672
2022-10-03 19:04:16 +02:00
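
The race fixed in 114419d6ab and its remedy can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: `UpgradeStateHolder`, `read_on_disk`, and the `between_writes` hook are hypothetical, and a plain mutex replaces the read/write lock of the real code (which also drops the lock immediately when the state is not `use_pre_raft_procedures`).

```python
import threading
from typing import Callable, List, Optional


class UpgradeStateHolder:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._on_disk = "synchronize"
        self._in_memory = "synchronize"

    def set_state(self, state: str,
                  between_writes: Optional[Callable[[], None]] = None) -> None:
        with self._lock:
            self._on_disk = state            # written first
            if between_writes is not None:
                between_writes()             # test hook: the window of the race
            self._in_memory = state          # written second

    def read_on_disk(self) -> str:
        # The external observer reads the table without any lock.
        return self._on_disk

    def get_state(self) -> str:
        # The fix: always take the lock before reading the in-memory state,
        # so this cannot run between the two writes in set_state().
        with self._lock:
            return self._in_memory


holder = UpgradeStateHolder()
observed: List[str] = []
readers: List[threading.Thread] = []


def external_client() -> None:
    # Sees the new on-disk state and immediately asks for the in-memory one.
    if holder.read_on_disk() == "use_post_raft_procedures":
        reader = threading.Thread(
            target=lambda: observed.append(holder.get_state()))
        reader.start()
        readers.append(reader)


holder.set_state("use_post_raft_procedures", between_writes=external_client)
for reader in readers:
    reader.join()
```

Because the reader blocks on the lock until both writes are done, it can only observe `use_post_raft_procedures`, never the stale `synchronize`.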
Kamil Braun
67ee6500e3 service/raft: raft_group_registry: pass direct_fd_pinger by reference
It was passed to `raft_group_registry::direct_fd_proxy` by value. That
is a bug, we want to pass a reference to the instance that is living
inside `gossiper`.

Fortunately this bug didn't cause problems, because the pinger is only
used for one function, `get_address`, which looks up an address in a map
and if it doesn't find it, accesses the map that lives inside
`gossiper` on shard 0 (and then caches it in the local copy).

Explicitly delete the copy constructor of `direct_fd_pinger` so this
doesn't happen again.

Closes #11661
2022-10-03 16:40:35 +02:00
Tomasz Grabiec
9dae2b9c02 Merge 'mutation_fragment_stream_validator: various API improvements' from Botond Dénes
The low-level `mutation_fragment_stream_validator` gets `reset()` methods that until now only the high-level `mutation_fragment_stream_validating_filter` had.
Active tombstone validation is pushed down to the low level validator.
The low level validator, which was a pain to use until now due to being very fussy on which subset of its API one used, is made much more robust, not requiring the user to stick to a subset of its API anymore.

Closes #11614

* github.com:scylladb/scylladb:
  mutation_fragment_stream_validator: make interface more robust
  mutation_fragment_stream_validator: add reset() to validating filter
  mutation_fragment_stream_validator: move active tombstone validation into low level validator
2022-10-03 16:23:46 +02:00
Botond Dénes
95f31f37c1 Merge 'dirty_memory_manager: simplify region_group' from Avi Kivity
region_group evolved as a tree, each node of which contains some
regions (memtables). Each node has some constraints on memory, and
can start flushing and/or stop allocation into its memtables and those
below it when those constraints are violated.

Today, the tree has exactly two nodes, only one of which can hold memtables.
However, all the complexity of the tree remains.

This series applies some mechanical code transformations that remove
the tree structure and all the excess functionality, leaving a much simpler
structure behind.

Before:
 - a tree of region_group objects
 - each with two parameters: soft limit and hard limit
 - but only two instances ever instantiated
After:
 - a single region_group object
 - with three parameters - two from the bottom instance, one from the top instance

Closes #11570

* github.com:scylladb/scylladb:
  dirty_memory_manager: move third memory threshold parameter of region_group constructor to reclaim_config
  dirty_memory_manager: simplify region_group::update()
  dirty_memory_manager: fold region_group::notify_hard_pressure_relieved into its callers
  dirty_memory_manager: clean up region_group::do_update_hard_and_check_relief()
  dirty_memory_manager: make do_update_hard_and_check_relief() a member of region_group
  dirty_memory_manager: remove accessors around region_group::_under_hard_pressure
  dirty_memory_manager: merge memory_hard_limit into region_group
  dirty_memory_manager: rename members in memory_hard_limit
  dirty_memory_manager: fold do_update() into region_group::update()
  dirty_memory_manager: simplify memory_hard_limit's do_update
  dirty_memory_manager: drop soft limit / soft pressure members in memory_hard_limit
  dirty_memory_manager: de-template do_update(region_group_or_memory_hard_limit)
  dirty_memory_manager: adjust soft_limit threshold check
  dirty_memory_manager: drop memory_hard_limit::_name
  dirty_memory_manager: simplify memory_hard_limit configuration
  dirty_memory_manager: fold region_group_reclaimer into {memory_hard_limit,region_group}
  dirty_memory_manager: stop inheriting from region_group_reclaimer
  dirty_memory_manager: test: unwrap region_group_reclaimer
  dirty_memory_manager: change region_group_reclaimer configuration to a struct
  dirty_memory_manager: convert region_group_reclaimer to callbacks
  dirty_memory_manager: consolidate region_group_reclaimer constructors
  dirty_memory_manager: rename {memory_hard_limit,region_group}::notify_relief
  dirty_memory_manager: drop unused parameter to memory_hard_limit constructor
  dirty_memory_manager: drop memory_hard_limit::shutdown()
  dirty_memory_manager: split region_group hierarchy into separate classes
  dirty_memory_manager: extract code block from region_group::update
  dirty_memory_manager: move more allocation_queue functions out of region_group
  dirty_memory_manager: move some allocation queue related function definitions outside class scope
  dirty_memory_manager: move region_group::allocating_function and related classes to new class allocation_queue
  dirty_memory_manager: remove support for multiple subgroups
2022-10-03 13:22:47 +03:00
Botond Dénes
5621cdd7f9 db/view/view_builder: don't drop partition and range tombstones when resuming
The view builder builds the views from a given base table in
view_builder::batch_size batches of rows. After processing this many
rows, it suspends so the view builder can switch to building views for
other base tables in the name of fairness. When resuming the build step
for a given base table, it reuses the reader used previously (also
serving the role of a snapshot, pinning sstables read from). The
compactor however is created anew. As the reader can be in the middle of
a partition, the view builder injects a partition start into the
compactor to prime it for continuing the partition. This however only
included the partition-key, crucially missing any active tombstones:
partition tombstone or -- since the v2 transition -- active range
tombstone. This can result in base rows covered by either of these being
resurrected and the view builder generating view updates for them.
This patch solves this by using the detach-state mechanism of the
compactor which was explicitly developed for situations like this (in
the range scan code) -- resuming a read with the readers kept but the
compactor recreated.
Also included are two test cases reproducing the problem, one with a
range tombstone, the other with a partition tombstone.

Fixes: #11668

Closes #11671
2022-10-03 11:28:22 +03:00
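
The failure mode fixed in 5621cdd7f9 can be reproduced with a toy model. The Python sketch below is not the view builder: the fragment tuples, `Compactor`, `CompactorState`, `detach_state`, and `build_view` are hypothetical simplifications, with integer timestamps and `0` meaning "no tombstone". It only shows why re-creating the compactor from the partition key alone resurrects deleted rows, while carrying over the detached state does not.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

NO_TOMBSTONE = 0


@dataclass
class CompactorState:
    partition_key: str
    partition_tombstone: int = NO_TOMBSTONE
    range_tombstone: int = NO_TOMBSTONE


class Compactor:
    def __init__(self, state: Optional[CompactorState] = None) -> None:
        self._state = state

    def consume(self, fragment: Tuple) -> Optional[str]:
        """Return the row key if the row is live, otherwise None."""
        kind = fragment[0]
        if kind == "partition_start":
            _, key, tombstone = fragment
            self._state = CompactorState(key, partition_tombstone=tombstone)
        elif kind == "range_tombstone_change":
            self._state.range_tombstone = fragment[1]
        elif kind == "row":
            _, key, timestamp = fragment
            deleted_at = max(self._state.partition_tombstone,
                             self._state.range_tombstone)
            if timestamp > deleted_at:
                return key
        return None

    def detach_state(self) -> CompactorState:
        return self._state


def build_view(fragments: List[Tuple], batch_size: int,
               keep_tombstones: bool) -> List[str]:
    compactor = Compactor()
    live_rows: List[str] = []
    rows_in_batch = 0
    for fragment in fragments:          # the reader is kept across batches
        row = compactor.consume(fragment)
        if row is not None:
            live_rows.append(row)
        if fragment[0] == "row":
            rows_in_batch += 1
            if rows_in_batch == batch_size:
                # Suspend and resume: the compactor is created anew.
                state = compactor.detach_state()
                if not keep_tombstones:
                    # Old behaviour: only the partition key is injected.
                    state = CompactorState(state.partition_key)
                compactor = Compactor(state)
                rows_in_batch = 0
    return live_rows


stream = [
    ("partition_start", "pk", NO_TOMBSTONE),
    ("range_tombstone_change", 10),     # deletes everything written before 10
    ("row", "r1", 5), ("row", "r2", 5), ("row", "r3", 5), ("row", "r4", 5),
    ("row", "r5", 20),                  # written after the deletion
]
resurrected = build_view(stream, batch_size=2, keep_tombstones=False)
correct = build_view(stream, batch_size=2, keep_tombstones=True)
```

With the key-only restart, `r3` and `r4` are treated as live after the first suspension; with the detached state carried over, only `r5` survives.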
Avi Kivity
2c744628ae Update abseil submodule
* abseil 9e408e05...7f3c0d78 (193):
  > Allows absl::StrCat to accept types that implement AbslStringify()
  > Merge pull request #1283 from pateldeev:any_inovcable_rename_true
  > Cleanup: SmallMemmove nullify should also be limited to 15 bytes
  > Cleanup: implement PrependArray and PrependPrecise in terms of InlineData
  > Cleanup: Move BitwiseCompare() to InlineData, and make it layout independent.
  > Change kPower10Table bounds to be half-open
  > Cleanup some InlineData internal layout specific details from cord.h
  > Improve the comments on the implementation of format hooks adl tricks.
  > Expand LogEntry method docs.
  > Documentation: Remove an obsolete note about the implementation of `Cord`.
  > `absl::base_internal::ReadLongFromFile` should use `O_CLOEXEC` and handle interrupts to `read`
  > Allows absl::StrFormat to accept types which implement AbslStringify()
  > Add common_policy_traits - a subset of hash_policy_traits that can be shared between raw_hash_set and btree.
  > Split configuration related to cycle clock into separate headers
  > Fix -Wimplicit-int-conversion and -Wsign-conversion warnings in btree.
  > Implement Eisel-Lemire for from_chars<float>
  > Import of CCTZ from GitHub.
  > Adds support for "%v" in absl::StrFormat and related functions for bool values. Note that %v prints bool values as "true" and "false" rather than "1" and "0".
  > De-pointerize LogStreamer::stream_, and fix move ctor/assign preservation of flags and other stream properties.
  > Explicitly disallows modifiers for use with %v.
  > Change the macro ABSL_IS_TRIVIALLY_RELOCATABLE into a type trait - absl::is_trivially_relocatable - and move it from optimization.h to type_traits.h.
  > Add sparse and string copy constructor benchmarks for hash table.
  > Make BTrees work with custom allocators that recycle memory.
  > Update the readme, and (internally) fix some export processes to better keep it up-to-date going forward.
  > Add the fact that CHECK_OK exits the program to the comment of CHECK_OK.
  > Adds support for "%v" in absl::StrFormat and related functions for numeric types, including integer and floating point values. Users may now specify %v and have the format specifier deduced. Integer values will print according to %d specifications, unsigned values will use %u, and floating point values will use %g. Note that %v does not work for `char` due to ambiguity regarding the intended output. Please continue to use %c for `char`.
  > Implement correct move constructor and assignment for absl::strings_internal::OStringStream, and mark that class final.
  > Add more options for `BM_iteration` in order to see better picture for choosing trade off for iteration optimizations.
  > Change `EndComparison` benchmark to not measure iteration. Also added `BM_Iteration` separately.
  > Implement Eisel-Lemire for from_chars<double>
  > Add `-llog` to linker options when building log_sink_set in logging internals.
  > Apply clang-format to btree.h.
  > Improve failure message: tell the values we don't like.
  > Increase the number of per-ObjFile program headers we can expect.
  > Fix "unsafe narrowing" warnings in absl, 8/n.
  > Fix format string error with an explicit cast
  > Add a case to detect when the Bazel compiler string is explicitly set to "gcc", instead of just detecting Bazel's default "compiler" string.
  > Fix "unsafe narrowing" warnings in absl, 10/n.
  > Fix "unsafe narrowing" warnings in absl, 9/n.
  > Fix stacktrace header includes
  > Add a missing dependency on :raw_logging_internal
  > CMake: Require at least CMake 3.10
  > CMake: install artifacts reflect the compiled ABI
  > Fixes bug so that `%v` with modifiers doesn't compile. `%v` is not intended to work with modifiers because the meaning of modifiers is type-dependent and `%v` is intended to be used in situations where the type is not important. Please continue using `%s` if you require format modifiers.
  > Convert algorithm and container benchmarks to cc_binary
  > Merge pull request #1269 from isuruf:patch-1
  > InlinedVector: Small improvement to the max_size() calculation
  > CMake: Mark hash_testing as a public testonly library, as it is with Bazel
  > Remove the ABSL_HAVE_INTRINSIC_INT128 test from pcg_engine.h
  > Fix ClangTidy warnings in btree.h and btree_test.cc.
  > Fix log StrippingTest on windows when TCHAR = WCHAR
  > Refactors checker.h and replaces recursive functions with iterative functions for readability purposes.
  > Refactors checker.h to use if statements instead of ternary operators for better readability.
  > Import of CCTZ from GitHub.
  > Workaround for ASAN stack safety analysis problem with FixedArray container annotations.
  > Rollback of fix "unsafe narrowing" warnings in absl, 8/n.
  > Fix "unsafe narrowing" warnings in absl, 8/n.
  > Changes mutex profiling
  > InlinedVector: Correct the computation of max_size()
  > Adds support for "%v" in absl::StrFormat and related functions for string-like types (support for other builtin types will follow in future changes). Rather than specifying %s for strings, users may specify %v and have the format specifier deduced. Notably, %v does not work for `const char*` because we cannot be certain if %s or %p was intended (nor can we be certain if the `const char*` was properly null-terminated). If you have a `const char*` you know is null-terminated and would like to work with %v, please wrap it in a `string_view` before using it.
  > Fixed header guards to match style guide conventions.
  > Typo fix
  > Added some more no_test.. tags to build targets for controlling testing.
  > Remove includes which are not used directly.
  > CMake: Add an option to build the libraries that are used for writing tests without requiring Abseil's tests be built (default=OFF)
  > Fix "unsafe narrowing" warnings in absl, 7/n.
  > Fix "unsafe narrowing" warnings in absl, 6/n.
  > Release the Abseil Logging library
  > Switch time_state to explicit default initialization instead of value initialization.
  > spinlock.h: Clean up includes
  > Fix minor typo in absl/time/time.h comment: "ToDoubleNanoSeconds" -> "ToDoubleNanoseconds"
  > Support compilers that are unknown to CMake
  > Import of CCTZ from GitHub.
  > Change bit_width(T) to return int rather than T.
  > Import of CCTZ from GitHub.
  > Merge pull request #1252 from jwest591:conan-fix
  > Don't try to enable use of ARM NEON intrinsics when compiling in CUDA device mode. They are not available in that configuration, even if the host supports them.
  > Fix "unsafe narrowing" warnings in absl, 5/n.
  > Fix "unsafe narrowing" warnings in absl, 4/n.
  > Import of CCTZ from GitHub.
  > Update Abseil platform support policy to point to the Foundational C++ Support Policy
  > Import of CCTZ from GitHub.
  > Add --features=external_include_paths to Bazel CI to ignore warnings from dependencies
  > Merge pull request #1250 from jonathan-conder-sm:gcc_72
  > Merge pull request #1249 from evanacox:master
  > Import of CCTZ from GitHub.
  > Merge pull request #1246 from wxilas21:master
  > remove unused includes and add missing std includes for absl/status/status.h
  > Sort INTERNAL_DLL_TARGETS for easier maintenance.
  > Disable ABSL_HAVE_STD_IS_TRIVIALLY_ASSIGNABLE for clang-cl.
  > Map the absl::is_trivially_* functions to their std impl
  > Add more SimpleAtod / SimpleAtof test coverage
  > debugging: handle alternate signal stacks better on RISCV
  > Revert change "Fix "unsafe narrowing" warnings in absl, 4/n.".
  > Fix "unsafe narrowing" warnings in absl, 3/n.
  > Fix "unsafe narrowing" warnings in absl, 4/n.
  > Fix "unsafe narrowing" warnings in absl, 2/n.
  > debugging: honour `STRICT_UNWINDING` in RISCV path
  > Fix "unsafe narrowing" warnings in absl, 1/n.
  > Add ABSL_IS_TRIVIALLY_RELOCATABLE and ABSL_ATTRIBUTE_TRIVIAL_ABI macros for use with clang's __is_trivially_relocatable and [[clang::trivial_abi]].
  > Merge pull request #1223 from ElijahPepe:fix/implement-snprintf-safely
  > Fix frame pointer alignment check.
  > Fixed sign-conversion warning in code.
  > Import of CCTZ from GitHub.
  > Add missing include for std::unique_ptr
  > Do not re-close files on EINTR
  > Renamespace absl::raw_logging_internal to absl::raw_log_internal to match (upcoming) non-raw logging namespace.
  > Check for negative return values from ReadFromOffset
  > Use HTTPS RFC URLs, which work regardless of the browser's locale.
  > Avoid signedness change when casting off_t
  > Internal Cleanup: removing unused internal function declaration.
  > Make Span complain if constructed with a parameter that won't outlive it, except if that parameter is also a span or appears to be a view type.
  > any_invocable_test: Re-enable the two conversion tests that used to fail under MSVC
  > Add GetCustomAppendBuffer method to absl::Cord
  > debugging: add hooks for checking stack ranges
  > Minor clang-tidy cleanups
  > Support [[gnu::abi_tag("xyz")]] demangling.
  > Fix -Warray-parameter warning
  > Merge pull request #1217 from anpol:macos-sigaltstack
  > Undo documentation change on erase.
  > Improve documentation on erase.
  > Merge pull request #1216 from brjsp:master
  > string_view: conditional constexpr is no longer needed for C++14
  > Make exponential_distribution_test a bigger test (timeout small -> moderate).
  > Move Abseil to C++14 minimum
  > Revert commit f4988f5bd4176345aad2a525e24d5fd11b3c97ea
  > Disable C++11 testing, enable C++14 and C++20 in some configurations where it wasn't enabled
  > debugging: account for differences in alternate signal stacks
  > Import of CCTZ from GitHub.
  > Run flaky test in fewer configurations
  > AnyInvocable: Move credits to the top of the file
  > Extend visibility of :examine_stack to an upcoming Abseil Log.
  > Merge contiguous mappings from the same file.
  > Update versions of WORKSPACE dependencies
  > Use ABSL_INTERNAL_HAS_SSE2 instead of __SSE2__
  > PR #1200: absl/debugging/CMakeLists.txt: link with libexecinfo if needed
  > Update GCC floor container to use Bazel 5.2.0
  > Update GoogleTest version used by Abseil
  > Release absl::AnyInvocable
  > PR #1197: absl/base/internal/direct_mmap.h: fix musl build on mips
  > absl/base/internal/invoke: Ignore bogus warnings on GCC >= 11
  > Revert GoogleTest version used by Abseil to commit 28e1da21d8d677bc98f12ccc7fc159ff19e8e817
  > Update GoogleTest version used by Abseil
  > explicit_seed_seq_test: work around/disable bogus warnings in GCC 12
  > any_test: expand the any emplace bug suppression, since it has gotten worse in GCC 12
  > absl::Time: work around bogus GCC 12 -Wrestrict warning
  > Make absl::StdSeedSeq an alias for std::seed_seq
  > absl::Optional: suppress bogus -Wmaybe-uninitialized GCC 12 warning
  > algorithm_test: suppress bogus -Wnonnull warning in GCC 12
  > flags/marshalling_test: work around bogus GCC 12 -Wmaybe-uninitialized warning
  > counting_allocator: suppress bogus -Wuse-after-free warning in GCC 12
  > Prefer to fallback to UTC when the embedded zoneinfo data does not contain the requested zone.
  > Minor wording fix in the comment for ConsumeSuffix()
  > Tweak the signature of status_internal::MakeCheckFailString as part of an upcoming change
  > Fix several typos in comments.
  > Reformulate documentation of ABSL_LOCKS_EXCLUDED.
  > absl/base/internal/invoke.h: Use ABSL_INTERNAL_CPLUSPLUS_LANG for language version guard
  > Fix C++17 constexpr storage deprecation warnings
  > Optimize SwissMap iteration by another 5-10% for ARM
  > Add documentation on optional flags to the flags library overview.
  > absl: correct the stack trace path on RISCV
  > Merge pull request #1194 from jwnimmer-tri:default-linkopts
  > Remove unintended defines from config.h
  > Ignore invalid TZ settings in tests
  > Add ABSL_HARDENING_ASSERTs to CordBuffer::SetLength() and CordBuffer::IncreaseLengthBy()
  > Fix comment typo about absl::Status<T*>
  > In b-tree, support unassignable value types.
  > Optimize SwissMap for ARM by 3-8% for all operations
  > Release absl::CordBuffer
  > InlinedVector: Limit the scope of the maybe-uninitialized warning suppression
  > Improve the compiler error by removing some noise from it. The "deleted" overload error is useless to users. By passing some dummy string to the base class constructor we use a valid constructor and remove the unintended use of the deleted default constructor.
  > Merge pull request #714 from kgotlinux:patch-2
  > Include proper #includes for POSIX thread identity implementation when using that implementation on MinGW.
  > Rework NonsecureURBGBase seed sequence.
  > Disable tests on some platforms where they currently fail.
  > Fixed typo in a comment.
  > Rollforward of commit ea78ded7a5f999f19a12b71f5a4988f6f819f64f.
  > Add an internal helper for logging (upcoming).
  > Merge pull request #1187 from trofi:fix-gcc-13-build
  > Merge pull request #1189 from renau:master
  > Allow for using b-tree with `value_type`s that can only be constructed by the allocator (ignoring copy/move constructors).
  > Stop using sleep timeouts for Linux futex-based SpinLock
  > Automated rollback of commit f2463433d6c073381df2d9ca8c3d8f53e5ae1362.
  > time.h: Use uint32_t literals for calls to overloaded MakeDuration
  > Fix typos.
  > Clarify the behaviour of `AssertHeld` and `AssertReaderHeld` when the calling thread doesn't hold the mutex.
  > Enable __thread on Asylo
  > Add implementation of is_invocable_r to absl::base_internal for C++ < 17, define it as alias of std::is_invocable_r when C++ >= 17
  > Optimize SwissMap iteration for aarch64 by 5-6%
  > Fix detection of ABSL_HAVE_ELF_MEM_IMAGE on Haiku
  > Don’t use generator expression to build .pc Libs lines
  > Update Bazel used on MacOS CI
  > Import of CCTZ from GitHub.

Closes #11687
2022-10-03 11:06:37 +03:00
Botond Dénes
f4540ef0d6 Merge 'Upgrade nix devenv' from Michael Livshin
To recap: the Nix devenv ({default,shell,flake}.nix and friends) in Scylla is a nicer (for those who consider it so, that is) alternative to dbuild: a completely deterministic build environment without Docker.

In theory we could support much more (creating installable packages, container images, various deployment affordances, etc. -- Nix is, among other things, a kind of parallel-to-everything-else devops realm) but there is clearly no demand and besides duplicating the work the release team is already doing (and doing just fine, needless to say) would be pointless and wasteful.

This PR reflects the accumulated changes that I have been carrying locally for the past year or so.  The version currently in master _probably_ can still build Scylla, but that Scylla certainly would not pass unit tests.

What the previous paragraph seems to mean is, apparently I'm the only active user of Nix devenv for Scylla.  Which, in turn, presents some obvious questions for the maintainers:

- Does this need to live in the Scylla source at all?  (The changes to non-Nix-specific parts are minimal and unobtrusive, but they are still changes)
- If it's left in, who is going to maintain it going forward, should more users somehow appear?  (I'm perfectly willing to fix things up when alerted, but no timeliness guarantees)

Closes #9557

* github.com:scylladb/scylladb:
  nix: add README.md
  build: improvements & upgrades to Nix dev environment
  build: allow setting SCYLLA_RELEASE from outside
2022-10-03 09:40:09 +03:00
Botond Dénes
2041744132 Merge 'readers/multishard: don't mix coroutines and continuations in do_fill_buffer()' from Avi Kivity
The combination is hard to read and modify.

Closes #11665

* github.com:scylladb/scylladb:
  readers/multishard: restore shard_reader_v2::do_fill_buffer() indentation
  readers/multishard: convert shard_reader_v2::do_fill_buffer() to a pure coroutine
2022-10-03 06:51:20 +03:00
Nadav Har'El
b8f8eb8710 Merge 'Improve test.py logging' from Kamil Braun
Include the unique test name (the unique name distinguishes between different test repeats) and the test case name where possible. Improve printing of clusters: include the cluster name and stopped servers. Fix some logging calls and add new ones.

Examples:
```
------ Starting test test_topology ------
```
became this:
```
------ Starting test test_topology.1::test_add_server_add_column ------
```

This:
```
INFO> Leasing Scylla cluster {127.191.142.1, 127.191.142.2, 127.191.142.3} for test test_add_server_add_column
```
became this:
```
INFO> Leasing Scylla cluster ScyllaCluster(name: 02cdd180-40d1-11ed-8803-3c2c30d32d96, running: {127.144.164.1, 127.144.164.2, 127.144.164.3}, stopped: {}) for test test_topology.1::test_add_server_add_column
```

Closes #11677

* github.com:scylladb/scylladb:
  test/pylib: scylla_cluster: improve cluster printing
  test/pylib: don't pass test_case_name to after-test endpoint
  test/pylib: scylla_cluster: track current test case name and print it
  test.py: pass the unique test name (e.g. `test_topology.1`) to cluster manager
  test/pylib: scylla_cluster: pass the test case name to `before_test`
  test/pylib: use "test_case_name" variable name when talking about test cases
2022-10-02 20:48:50 +03:00
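
The log formats quoted in the 'Improve test.py logging' merge above can be approximated as follows. This is a hedged sketch, not the test/pylib code: `ClusterSketch` and `unique_test_case` are hypothetical names, and sorting the addresses is an assumption made so that the output is deterministic.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ClusterSketch:
    """Hypothetical stand-in for the cluster object being printed."""
    name: str
    running: Set[str] = field(default_factory=set)
    stopped: Set[str] = field(default_factory=set)

    def __str__(self) -> str:
        def fmt(addresses: Set[str]) -> str:
            return "{" + ", ".join(sorted(addresses)) + "}"
        return (f"ScyllaCluster(name: {self.name}, "
                f"running: {fmt(self.running)}, stopped: {fmt(self.stopped)})")


def unique_test_case(test_name: str, repeat: int, test_case_name: str) -> str:
    """Combine the unique test name (name plus repeat) with the test case."""
    return f"{test_name}.{repeat}::{test_case_name}"


cluster = ClusterSketch(
    name="02cdd180-40d1-11ed-8803-3c2c30d32d96",
    running={"127.144.164.1", "127.144.164.2", "127.144.164.3"},
)
case = unique_test_case("test_topology", 1, "test_add_server_add_column")
line = f"Leasing Scylla cluster {cluster} for test {case}"
```

Printing stopped servers separately makes it visible from the log alone which nodes a test has taken down.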
Pavel Emelyanov
2b8636a2a9 storage_proxy.hh: Remove unused headers
Add needed forward declarations and fix indirect inclusions in some .ccs

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11679
2022-10-02 20:48:50 +03:00
Michał Chojnowski
4563cbe595 logalloc: prevent false positives in reclaim_timer
reclaim_timer uses a coarse clock, but does not account for
the measurement error introduced by that -- it can falsely
report reclaims as stalls, even if they are a full coarse clock tick
shorter than the requested threshold
(blocked-reactor-notify-ms).

Notably, if the stall threshold happens to be smaller than or equal to the
coarse clock resolution, Scylla's log gets spammed with false stall reports.
The resolution of coarse clocks in Linux is 1/CONFIG_HZ. This is
typically equal to 1 ms or 4 ms, and stall thresholds of this order
can occur in practice.

Eliminate false positives by requiring the measured reclaim duration to
be at least 1 clock tick longer than the configured threshold for it to
be considered a stall.
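
The rule above can be illustrated with a minimal Python sketch. It is not the logalloc code: the function names, the microsecond units, and the exact form of the old comparison (`>=`) are assumptions made for illustration only.

```python
def coarse_elapsed_us(start_us: int, end_us: int, tick_us: int) -> int:
    """Elapsed time as seen by a clock that only advances once per tick."""
    return (end_us // tick_us - start_us // tick_us) * tick_us


def is_stall_naive(measured_us: int, threshold_us: int) -> bool:
    # Assumed shape of the old check: it ignores the clock's measurement error.
    return measured_us >= threshold_us


def is_stall(measured_us: int, threshold_us: int, tick_us: int) -> bool:
    # Require the measurement to exceed the threshold by at least one tick.
    return measured_us >= threshold_us + tick_us


TICK_US = 4000        # 4 ms coarse clock resolution (CONFIG_HZ=250)
THRESHOLD_US = 4000   # hypothetical stall threshold of 4 ms

# A reclaim that really took 200 us but happened to straddle a tick boundary.
short_reclaim = coarse_elapsed_us(3900, 4100, TICK_US)
# A reclaim that really took 9000 us.
long_reclaim = coarse_elapsed_us(0, 9000, TICK_US)

print(short_reclaim, is_stall_naive(short_reclaim, THRESHOLD_US))  # false positive
print(short_reclaim, is_stall(short_reclaim, THRESHOLD_US, TICK_US))
print(long_reclaim, is_stall(long_reclaim, THRESHOLD_US, TICK_US))
```

The short reclaim is measured as a full tick, so the naive check reports it; the stricter check ignores it while still reporting the long one.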

Fixes #10981

Closes #11680
2022-10-02 13:41:40 +03:00
Avi Kivity
372eadf542 Merge "perftune related improvements in scylla_* scripts" from Vlad Zolotarov
"
This series adds the long-awaited transition of our auto-generation
code to irq_cpu_mask instead of 'mode' in perftune.yaml.

And then it fixes a regression in scylla_prepare perftune.yaml
auto-generation logic.
"

* 'scylla_prepare_fix_regression-v1' of https://github.com/vladzcloudius/scylla:
  scylla_prepare + scylla_cpuset_setup: make scylla_cpuset_setup idempotent without introducing regressions
  scylla_prepare: stop generating 'mode' value in perftune.yaml
2022-10-02 13:25:13 +03:00
Michael Livshin
d178ac17dc nix: add README.md
Signed-off-by: Michael Livshin <repo@cmm.kakpryg.net>
2022-10-02 12:26:02 +03:00
Michael Livshin
7bd13be3f2 build: improvements & upgrades to Nix dev environment
* Add some more useful stuff to the shell environment, so it actually
  works for debugging & post-mortem analysis.

* Wrap ccache & distcc transparently (distcc will be used unless
  NODISTCC is set to a non-empty value in the environment; ccache will
  be used if CCACHE_DIR is not empty).

* Package the Scylla Python driver (instead of the C* one).

* Catch up to misc build/test requirements (including optional) by
  requiring or custom-packaging: wasmtime 0.29.0, cxxbridge,
  pytest-asyncio, liburing.

* Build statically-linked zstd in a saner and more idiomatic fashion.

* In pure builds (where sources lack Git metadata), derive
  SCYLLA_RELEASE from source hash.

* Refactor things for more parameterization.

* Explicitly stub out installPhase (seeing that "nix build" succeeds
  up to installPhase means we didn't miss any dependencies).

* Add flake support.

* Add copious comments.
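
The ccache/distcc selection rule from the list above can be sketched in Python. This is only an illustration of the two environment checks; the real logic lives in the Nix expressions, and the function name and the wrapper ordering here are assumptions.

```python
def compiler_wrappers(env: dict) -> list:
    """Return the compiler wrappers that would be enabled for a given environment."""
    wrappers = []
    # ccache is used only if CCACHE_DIR is not empty.
    if env.get("CCACHE_DIR", ""):
        wrappers.append("ccache")
    # distcc is used unless NODISTCC is set to a non-empty value.
    if not env.get("NODISTCC", ""):
        wrappers.append("distcc")
    return wrappers


print(compiler_wrappers({}))
print(compiler_wrappers({"NODISTCC": "1"}))
print(compiler_wrappers({"CCACHE_DIR": "/tmp/ccache"}))
```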

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-10-02 11:47:16 +03:00
Michael Livshin
839d8f40e6 build: allow setting SCYLLA_RELEASE from outside
The extant logic for deriving the value of SCYLLA_RELEASE from the
source tree makes these assumptions:

* The tree being built includes Git metadata.

* The value of `date` is trustworthy and interesting.

* There are no uncommitted changes (those relevant to building,
  anyway).

The above assumptions are either irrelevant or problematic in pure
build environments (such as the sandbox set up by `nix-build`):

* Pure builds use cleaned-up sources with all timestamps reset to Unix
  time 0.  Those cleaned-up sources are saved (in the Nix store, for
  example) and content-hashed, so leaving the (possibly huge) Git
  metadata increases the time to copy the sources and wastes disk
  space (in fact, Nix in flake mode strips `.git` unconditionally).

* Pure builds run in a sandbox where time is, likewise, reset to Unix
  time 0, so the output of `date` is neither informative nor useful.

Now, the only build step that uses Git metadata in the first place is
the SCYLLA_RELEASE value derivation logic.  So, essentially, answering
the question "is the Git metadata needed to build Scylla" is a matter
of definition, and is up to us.  If we elect to ignore Git metadata
and current time, we can derive SCYLLA_RELEASE value from the content
hash of the cleaned-up tree, regardless of the way that tree was
arrived at.

This change makes it possible to skip the derivation of SCYLLA_RELEASE
value from Git metadata and current time by way of setting
SCYLLA_RELEASE in the environment.
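
A minimal Python sketch of the override described above. The actual build script is not shown in this message, so the function names are hypothetical, and treating an empty SCYLLA_RELEASE as unset is an assumption.

```python
def scylla_release(env: dict, derive_from_git) -> str:
    """Prefer SCYLLA_RELEASE from the environment; otherwise derive it."""
    value = env.get("SCYLLA_RELEASE", "")
    if value:
        # Set from outside, e.g. computed from the content hash of the sources.
        return value
    # Fall back to the Git-metadata-and-date derivation (a stand-in callable here).
    return derive_from_git()


def fake_git_derivation() -> str:
    # Hypothetical placeholder for the existing Git/date-based logic.
    return "0.20221002.abcdef"


print(scylla_release({"SCYLLA_RELEASE": "0.deadbeef"}, fake_git_derivation))
print(scylla_release({}, fake_git_derivation))
```

A pure build would export SCYLLA_RELEASE before invoking the build, so the Git-based branch is never reached.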

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-10-02 11:47:16 +03:00
Avi Kivity
17b1cb4434 dirty_memory_manager: move third memory threshold parameter of region_group constructor to reclaim_config
Place it alongside the other parameters.
2022-09-30 22:17:37 +03:00
Avi Kivity
ecf30ee469 dirty_memory_manager: simplify region_group::update()
We notice there are two separate conditions controlling a call to
a single outcome, notify_pressure_relief(). Merge them into a single
boolean variable.
2022-09-30 22:15:45 +03:00
Avi Kivity
230fff299a dirty_memory_manager: fold region_group::notify_hard_pressure_relieved into its callers
It is trivial.
2022-09-30 22:11:01 +03:00
Avi Kivity
12b81173b9 dirty_memory_manager: clean up region_group::do_update_hard_and_check_relief()
Remove synthetic "rg" local.
2022-09-30 22:09:09 +03:00
Avi Kivity
e1bad8e883 dirty_memory_manager: make do_update_hard_and_check_relief() a member of region_group
It started life as something shared between memory_hard_limit and
region_group, but now that they are back being the same thing, we
can make it a member again.
2022-09-30 22:04:26 +03:00
Avi Kivity
6b21c10e9e dirty_memory_manager: remove accessors around region_group::_under_hard_pressure
It is now only accessed from within the class, so the
accessors don't help anything.
2022-09-30 21:59:46 +03:00
Avi Kivity
6a02bb7c2b dirty_memory_manager: merge memory_hard_limit into region_group
The two classes always have a 1:1 or 0:1 relationship, and
so we can just move all the members of memory_hard_limit
into region_group, with the functions that track the relationship
(memory_hard_limit::{add,del}()) removed.

The 0:1 relationship is maintained by initializing the
hard limit parameter with std::numeric_limits<size_t>::max().
The _hard_total_memory variable must be greater than this parameter
for anything to be done, and with this default it can never be.
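
The "no hard limit" default can be sketched in Python as follows. This is an illustration, not the region_group code: the class and method names are hypothetical, and math.inf stands in for std::numeric_limits<size_t>::max().

```python
import math

# Stand-in for std::numeric_limits<size_t>::max(): no total can exceed it.
NO_HARD_LIMIT = math.inf


class RegionGroupSketch:
    def __init__(self, hard_limit=NO_HARD_LIMIT):
        self.hard_limit = hard_limit
        self.hard_total_memory = 0

    def update_hard(self, delta: int) -> None:
        self.hard_total_memory += delta

    def under_hard_pressure(self) -> bool:
        # With the default limit this comparison is never true,
        # which models the 0:1 case without a separate object.
        return self.hard_total_memory > self.hard_limit


unlimited = RegionGroupSketch()
unlimited.update_hard(10**12)
limited = RegionGroupSketch(hard_limit=1000)
limited.update_hard(1001)
print(unlimited.under_hard_pressure(), limited.under_hard_pressure())
```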
2022-09-30 21:59:38 +03:00
Avi Kivity
45ab24e43d dirty_memory_manager: rename members in memory_hard_limit
In preparation for merging memory_hard_limit into region_group,
disambiguate similarly named members by adding the word "hard" in
random places.

memory_hard_limit and region_group are candidates for merging
because they constantly reference each other, and memory_hard_limit
does very little by itself.
2022-09-30 21:47:33 +03:00
Avi Kivity
aca96c4103 readers/multishard: restore shard_reader_v2::do_fill_buffer() indentation
2022-09-30 19:19:51 +03:00
Avi Kivity
b08196f3b3 readers/multishard: convert shard_reader_v2::do_fill_buffer() to a pure coroutine
do_fill_buffer() is an eclectic mix of coroutines and continuations.
That makes it hard to follow what is running sequentially and
concurrently.

Convert it into a pure coroutine by changing internal continuations
to lambda coroutines. These lambda coroutines are guarded with
seastar::coroutine::lambda. Furthermore, a future that is co_awaited
is converted to immediate co_await (without an intermediate future),
since seastar::coroutine::lambda only works if the coroutine is awaited
in the same statement it is defined on.
2022-09-30 19:19:48 +03:00
Kamil Braun
b2cf610567 test/pylib: scylla_cluster: improve cluster printing
Print the cluster name and stopped servers in addition to the running
servers.

Fix a logging call that tried to print a server in place of a cluster
and failed even at that (the server didn't have a hostname yet, so it
printed as an empty string). Add another logging call.
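
A minimal sketch of a string form with the cluster name, running servers, and stopped servers. The output layout mirrors the log line quoted in the merge description for this series; the class name and attribute names are hypothetical and are not taken from scylla_cluster.py.

```python
class ClusterSketch:
    def __init__(self, name, running, stopped):
        self.name = name
        self.running = list(running)
        self.stopped = list(stopped)

    def __str__(self) -> str:
        running = ", ".join(self.running)
        stopped = ", ".join(self.stopped)
        # Doubled braces in an f-string emit literal '{' and '}'.
        return (f"ScyllaCluster(name: {self.name}, "
                f"running: {{{running}}}, stopped: {{{stopped}}})")


cluster = ClusterSketch("c1", ["127.0.0.1", "127.0.0.2"], [])
print(f"Leasing Scylla cluster {cluster} for test test_topology.1")
```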
2022-09-30 17:00:05 +02:00
Kamil Braun
05ed3769dd test/pylib: don't pass test_case_name to after-test endpoint
It's redundant now: the manager tracks the current test case using
before-test endpoint calls.
2022-09-30 16:41:45 +02:00