33283 Commits

Author SHA1 Message Date
Raphael S. Carvalho
e56bfecd8d sstable_compaction_test: Use column_family_for_tests::as_table_state() instead
That's important for multiple compaction groups. Once replica::table
supports multiple groups, there will be no table::as_table_state(),
so for testing table with a single group, we'll be relying on
column_family_for_tests::as_table_state().

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-10-05 21:37:19 -03:00
Raphael S. Carvalho
5a028ca4dc test: Don't expose compound set in column_family_for_tests
The compound set shouldn't be exposed in main_sstables() because
once we complete the switch to column_family_for_tests::table_state,
compaction may try to remove or add elements to its
set snapshot, and the compound set doesn't allow either operation.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-10-05 21:37:19 -03:00
Raphael S. Carvalho
b16d6c55b1 test: Implement column_family_for_tests::table_state::is_auto_compaction_disabled_by_user()
Needed once we switch to column_family_for_tests::table_state, so unit
tests relying on the correct value will still work.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-10-05 21:37:19 -03:00
Raphael S. Carvalho
a6d24a763a sstable_compaction_test: Merge table_state_for_test into column_family_for_tests
This change will make table_state_for_test the table_state of
column_family_for_tests. Today, a unit test has to keep a reference
to them both and logically couple them, but that's error prone.

This change is also important when replica::table supports multiple
compaction groups, so unit tests won't have to directly reference
the table_state of table, but rather use the one managed by
column_family_for_tests.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-10-05 21:37:19 -03:00
Raphael S. Carvalho
6a0eabd17a sstable_compaction_test: use table_state_for_test itself in fully_expired_sstables()
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-10-05 21:37:19 -03:00
Raphael S. Carvalho
a6affea008 sstable_compaction_test: Switch to table_state in compact_sstables()
The switch is important once we have multiple compaction groups,
as a single table may own several groups. There will no longer be
a replica::table::as_table_state().

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-10-05 21:37:19 -03:00
Raphael S. Carvalho
2aa6518486 sstable_compaction_test: Reduce boilerplate by switching to column_family_for_tests
Lots of boilerplate is reduced, and will also help to complete the
switch from replica::table to compaction::table_state in the unit
tests.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-10-05 21:37:18 -03:00
Botond Dénes
4c13328788 Merge 'Return all sstables in table::get_sstable_set()' from Raphael "Raph" Carvalho
This fixes a regression introduced by 1e7a444, where table::get_sstable_set() doesn't expose all sstables, but only the ones in the main set. That causes users of the interface, such as get_sstables_by_partition_key() (used by the API to return the list of sstable names containing a particular key), to miss files in the maintenance set.

Fixes https://github.com/scylladb/scylladb/issues/11681.

Closes #11682

* github.com:scylladb/scylladb:
  replica: Return all sstables in table::get_sstable_set()
  sstables: Fix cloning of compound_sstable_set
2022-10-05 06:55:50 +03:00
Pavel Emelyanov
2c1ef0d2b7 sstables.hh: Remove unused headers
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11709
2022-10-04 23:37:07 +02:00
Raphael S. Carvalho
827750c142 replica: Return all sstables in table::get_sstable_set()
get_sstable_set() as its name implies is not confined to the main
or maintenance set, nor to a specific compaction group, so let's
make it return the compound set which spans all groups, meaning
all sstables tracked by a table will be returned.

This is a regression introduced in 1e7a444. It affects the API
to return sstable list containing a partition key, as sstables
in maintenance would be missed, fooling users of the API like
tools that could trust the output.

Each compaction group is returning the main and maintenance set
in table_state's main_sstable_set() and maintenance_sstable_set(),
respectively.

Fixes #11681.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-10-04 10:43:27 -03:00
Raphael S. Carvalho
eddf32b94c sstables: Fix cloning of compound_sstable_set
The intention was that its clone() would actually clone the content
of an existing set into a new one, but the current impl is actually
moving the sets instead of copying them. So the original set
becomes invalid. Luckily, this problem isn't triggered as we're
not exposing the compound set in the table's interface, so the
compound_sstable_set::clone() method isn't being called.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-10-04 10:43:25 -03:00
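The difference between a clone() that moves and one that copies can be illustrated with a minimal sketch. The types below (`simple_set`, `compound_set`, `clone_by_move`, `clone_by_copy`, `make_example`) are hypothetical stand-ins, not Scylla's actual classes; the sketch only assumes that a compound set holds shared references to its member sets.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a set of sstables; not Scylla's real type.
using simple_set = std::vector<std::string>;

// Hypothetical compound set holding shared references to its member sets.
struct compound_set {
    std::vector<std::shared_ptr<simple_set>> sets;

    // Buggy variant: moves the member list out, leaving *this empty.
    compound_set clone_by_move() {
        return compound_set{std::move(sets)};
    }

    // Intended variant: copies every member set; *this stays intact.
    compound_set clone_by_copy() const {
        compound_set out;
        for (const auto& s : sets) {
            out.sets.push_back(std::make_shared<simple_set>(*s));
        }
        return out;
    }

    // Total number of elements across all member sets.
    std::size_t total() const {
        std::size_t n = 0;
        for (const auto& s : sets) {
            n += s->size();
        }
        return n;
    }
};

// Builds a compound set with a two-element "main" set and a one-element
// "maintenance" set.
compound_set make_example() {
    auto main_set = std::make_shared<simple_set>(simple_set{"a", "b"});
    auto maintenance_set = std::make_shared<simple_set>(simple_set{"c"});
    return compound_set{{main_set, maintenance_set}};
}
```

With this sketch, calling `clone_by_copy()` leaves the original usable, while `clone_by_move()` leaves it without any member sets, which is the kind of invalidation the commit message describes.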
Felipe Mendes
f67bb43a7a locator: ec2_snitch: IMDSv2 support
Access to AWS Metadata may be configured in three distinct ways:
   1 - Optional HTTP tokens and HTTP endpoint enabled: the default, as it works today
   2 - Required HTTP tokens and HTTP endpoint enabled: support for this is entirely missing today
   3 - HTTP endpoint disabled: this effectively forbids the use of Ec2Snitch or Ec2MultiRegionSnitch

This commit makes the 2nd option the default, which is not only the option AWS recommends but is also entirely compatible with the 1st option.
In addition, we now validate the HTTP response when querying the IMDS server. Therefore, should an HTTP 403 be received, Scylla will
properly notify users of what they are doing incorrectly in their setup.

The commit was tested under the following circumstances (covering all 3 variants):
 - Ec2Snitch: IMDSv2 optional & required, and HTTP server disabled.
 - Ec2MultiRegionSnitch: IMDSv2 optional & required, and HTTP server disabled.

Refs: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html
      https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-options.html
      https://github.com/scylladb/scylladb/issues/9987
Fixes: https://github.com/scylladb/scylladb/issues/10490
Closes: https://github.com/scylladb/scylladb/issues/10490

Closes #11636
2022-10-04 15:48:42 +03:00
Kamil Braun
c200ae2228 Merge 'test.py topology Scylla REST API client' from Alecco
- Separate `aiohttp` client code
- Helper to access Scylla server REST API
- Use helper both in `ScyllaClusterManager` (test.py process) and `ManagerClient` (pytest process)
- Add `removenode` and `decommission` operations.

Closes #11653

* github.com:scylladb/scylladb:
  test.py: Scylla REST methods for topology tests
  test.py: rename server_id to server_ip
  test.py: HTTP client helper
  test.py: topology pass ManagerClient instead of...
  test.py: delete unimplemented remove server
  test.py: fix variable name ssl name clash
2022-10-04 11:50:18 +02:00
Botond Dénes
169a8a66f2 compatible_ring_position_or_view: make it cheap to copy
This class exists for one purpose only: to serve as glue code between
dht::ring_position and boost::icl::interval_map. The latter requires
that keys in its intervals are:
* default constructible
* copyable
* have standalone compare operations

For this reason we have to wrap `dht::ring_position` in a class,
together with a schema to provide all this. This is
`compatible_ring_position`. There is one further requirement by code
using the interval map: it wants to do lookups without copying the
lookup key(s). To solve this, we came up with
`compatible_ring_position_or_view` which is a union of a key or a key
view + schema. As we recently found out, boost::icl copies its keys **a
lot**. It seems to assume these keys are cheap to copy and carelessly
copies them around even when iterating over the map. But
`compatible_ring_position_or_view` is not cheap to copy as it copies a
`dht::ring_position` which allocates, and it does that via an
`std::optional` and `std::variant` to add insult to injury.
This patch makes said class cheap to copy, by getting rid of the variant
and storing the `dht::ring_position` via a shared pointer. The view is
stored separately and either points to the ring position stored in the
shared pointer or to an outside ring position (for lookups).

Fixes: #11669

Closes #11670
2022-10-04 12:00:21 +03:00
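The layout described above (owned position behind a shared pointer, plus a separately stored view) might look roughly like the following sketch. `ring_position` here is a simplified stand-in for `dht::ring_position`, and `position_or_view` with its members is a hypothetical name; the schema that the real class carries is omitted.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-in for dht::ring_position: copying it allocates.
struct ring_position {
    std::string token;
};

// Sketch of a key that is cheap to copy: the owned position lives behind a
// shared pointer, and a raw pointer (the "view") refers either to it or to
// an external position that is used only for lookups.
class position_or_view {
    std::shared_ptr<const ring_position> _owned; // null for lookup-only keys
    const ring_position* _view = nullptr;
public:
    position_or_view() = default; // default constructible, as the map requires

    // Owning key: stores the position once; copies share the same object.
    explicit position_or_view(ring_position pos)
        : _owned(std::make_shared<const ring_position>(std::move(pos)))
        , _view(_owned.get()) {}

    // Non-owning key: the caller keeps `pos` alive for the whole lookup.
    explicit position_or_view(const ring_position* pos) : _view(pos) {}

    const ring_position* position() const { return _view; }
    bool owns() const { return static_cast<bool>(_owned); }
};

// Standalone comparison. Both keys are assumed to refer to a position;
// a default-constructed key must not be compared.
inline bool operator<(const position_or_view& a, const position_or_view& b) {
    return a.position()->token < b.position()->token;
}
```

Copying an owning key in this sketch only bumps a reference count and copies a pointer, and a lookup key never copies the position at all.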
Piotr Dulikowski
51f813d89b storage_proxy: update rate limited reads metric when coordinator rejects
The decision to reject a read operation can either be made by replicas,
or by the coordinator. In the second case, the

  scylla_storage_proxy_coordinator_read_rate_limited

metric was not incremented, but it should have been. This commit fixes the issue.

Fixes: #11651

Closes #11694
2022-10-04 10:33:58 +03:00
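As a rough illustration of the intended accounting, the sketch below counts a rejected read no matter which side made the decision. `read_stats`, `rejection_source` and `account_read` are hypothetical names chosen for this example and do not reflect the actual storage_proxy code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counter holder; the real metric is exported as
// scylla_storage_proxy_coordinator_read_rate_limited.
struct read_stats {
    std::uint64_t read_rate_limited = 0;
};

// Who decided to reject the read, if anyone.
enum class rejection_source { none, replica, coordinator };

// Counts a rejected read regardless of whether a replica or the
// coordinator made the decision.
void account_read(read_stats& stats, rejection_source source) {
    if (source != rejection_source::none) {
        ++stats.read_rate_limited;
    }
}
```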
Pavel Emelyanov
9cd1f777a5 database.hh: Remove unused headers
Use forward declarations when needed

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11667
2022-10-04 09:01:38 +03:00
Botond Dénes
5fd4b1274e Merge 'compaction_manager: Don't let ENOSPC throw out of ::stop() method' from Pavel Emelyanov
The seastar defer_stop() helper is cool, but it forwards any exception from the .stop() towards the caller. In case the caller is main(), the exception causes Scylla to abort(). This fires, for example, in compaction_manager::stop() when it steps on ENOSPC.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11662

* github.com:scylladb/scylladb:
  compaction_manager: Swallow ENOSPCs in ::stop()
  exceptions: Mark storage_io_error::code() with noexcept
2022-10-04 08:54:22 +03:00
Nadav Har'El
3a30fbd56c test/alternator: fix timeout in flaky test test_ttl_stats
The test `test_metrics.py::test_ttl_stats` tests the metrics associated
with Alternator TTL expiration events. It normally finishes in less than a
second (the TTL scanning is configured to run every 0.5 seconds), so we
arbitrarily set a 60 second timeout for this test to allow for extremely
slow test machines. But in some extreme cases even this was not enough -
in one case we measured the TTL scan to take 63 seconds.

So in this patch we increase the timeout in this test from 60 seconds
to 120 seconds. We already did the same change in other Alternator TTL
tests in the past - in commit 746c4bd.

Fixes #11695

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11696
2022-10-04 08:50:51 +03:00
Kamil Braun
114419d6ab service/raft: raft_group0_client: read on-disk and in-memory group0 upgrade atomically
`set_group0_upgrade_state` writes the on-disk state first, then
in-memory state second, both under a write lock.
`get_group0_upgrade_state` would only take the lock if the in-memory
state was `use_pre_raft_procedures`.

If there's an external observer who watches the on-disk state to decide
whether Raft upgrade finished yet, the following could happen:
1. The node wrote `use_post_raft_procedures` to disk but didn't update
   the in-memory state yet, which is still `synchronize`.
2. The external client reads the table and sees that the state is
   `use_post_raft_procedures`, and deduces that upgrade has finished.
3. The external client immediately tries to perform a schema change. The
   schema change code calls `get_group0_upgrade_state` which does not
   take the read lock and returns `synchronize`. The schema change gets
   denied because schema changes are not allowed in `synchronize`.

Make sure that `get_group0_upgrade_state` cannot execute in-between
writing to disk and updating the in-memory state by always taking the
read lock before reading the in-memory state. As it was before, it will
immediately drop the lock if the state is not `use_pre_raft_procedures`.

This is useful for upgrade tests, which read the on-disk state to decide
whether upgrade has finished and often try to perform a schema change
immediately afterwards.

Closes #11672
2022-10-03 19:04:16 +02:00
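The locking rule can be sketched with standard-library primitives. This is only an approximation under stated assumptions: `std::shared_mutex` stands in for the lock actually used, `upgrade_state_store` and its members are hypothetical names, and the on-disk table is modelled as a plain member variable. Unlike the real getter, this sketch does not distinguish the `use_pre_raft_procedures` case.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

enum class upgrade_state {
    use_pre_raft_procedures,
    synchronize,
    use_post_raft_procedures,
};

// Hypothetical holder of both copies of the upgrade state.
class upgrade_state_store {
    mutable std::shared_mutex _lock;
    upgrade_state _on_disk = upgrade_state::use_pre_raft_procedures;
    upgrade_state _in_memory = upgrade_state::use_pre_raft_procedures;
public:
    // Writer: updates disk first, then memory, both under the write lock.
    void set(upgrade_state s) {
        std::unique_lock<std::shared_mutex> guard(_lock);
        _on_disk = s;    // stands in for the write to the on-disk table
        _in_memory = s;
    }

    // Reader: always takes the read lock, so it cannot observe the moment
    // between the two assignments in set().
    upgrade_state get() const {
        std::shared_lock<std::shared_mutex> guard(_lock);
        return _in_memory;
    }

    // What an external observer of the on-disk table would see.
    upgrade_state on_disk() const {
        std::shared_lock<std::shared_mutex> guard(_lock);
        return _on_disk;
    }
};
```

Because `get()` waits for any in-progress `set()`, a caller that has already seen the new value on disk cannot then read a stale in-memory value.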
Alejo Sanchez
abf1425ad4 test.py: Scylla REST methods for topology tests
Provide a helper client for Scylla REST requests. Use it on both
ScyllaClusterManager (e.g. remove node, test.py process) and
ManagerClient (e.g. get uuid, pytest process).

For now keep using IPs as key in ScyllaCluster, but this will be changed
to UUID -> IP in the future. So, for now, pass both independently. Note
the UUID must be obtained from the server before stopping it.

Refresh client driver connection when decommissioning or removing
a node.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2022-10-03 19:01:03 +02:00
Alejo Sanchez
86c752c2a0 test.py: rename server_id to server_ip
In ScyllaCluster currently servers are tracked by the host IP. This is
not the host id (UUID). Fix the variable name accordingly.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2022-10-03 19:01:03 +02:00
Alejo Sanchez
a7a0b446f0 test.py: HTTP client helper
Split aiohttp client to a shared helper file.

While there, move aiohttp session setup back to constructors. When there
were teardown issues it looked like it could be caused by aiohttp session
being created outside a coroutine. But this is proven not to be the case
after recent fixes. So move it back to the ManagerClient constructor.

On the other hand, create a close() coroutine to stop the aiohttp session.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2022-10-03 19:01:03 +02:00
Alejo Sanchez
41dbdf0f70 test.py: topology pass ManagerClient instead of...
cql connection

When there are topology changes, the driver needs to be updated. Instead
of passing the CassandraCluster.Connection, pass the ManagerClient
instance which manages the driver connection inside of it.

Remove workaround for test_raft_upgrade.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2022-10-03 19:00:47 +02:00
Alejo Sanchez
0c3a06d0d7 test.py: delete unimplemented remove server
Delete the unused, unimplemented, and broken version of remove server.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2022-10-03 18:57:38 +02:00
Alejo Sanchez
98bc4c198f test.py: fix variable name ssl name clash
Change variable ssl to use_ssl to avoid clash with ssl module.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2022-10-03 18:57:38 +02:00
Pavel Emelyanov
d22b130af1 compaction_manager: Swallow ENOSPCs in ::stop()
When being stopped, the compaction manager may hit ENOSPC. This is not a
reason to fail the stop process with an abort; it is better to log a
warning and proceed as if nothing happened.

refs: #11245

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-03 18:54:48 +03:00
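The general pattern of swallowing only ENOSPC during shutdown can be sketched as follows. This uses `std::system_error` rather than Scylla's own exception types, and `stop_ignoring_enospc` is a hypothetical helper written for this example; any other error is still rethrown.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>
#include <iostream>
#include <system_error>

// Runs a stop routine and swallows "no space left on device" errors,
// logging a warning instead. Any other error still propagates.
// Returns true if the routine completed cleanly, false if ENOSPC was
// swallowed.
bool stop_ignoring_enospc(const std::function<void()>& do_stop) {
    try {
        do_stop();
        return true;
    } catch (const std::system_error& e) {
        if (e.code() == std::errc::no_space_on_device) {
            std::cerr << "warning: ENOSPC while stopping: " << e.what() << '\n';
            return false;
        }
        throw; // not ENOSPC: let the caller decide
    }
}
```

A caller such as a shutdown path could wrap its stop step in this helper so that a full disk produces a log line rather than an unhandled exception.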
Pavel Emelyanov
7ba1f551f3 exceptions: Mark storage_io_error::code() with noexcept
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-03 18:50:06 +03:00
Kamil Braun
67ee6500e3 service/raft: raft_group_registry: pass direct_fd_pinger by reference
It was passed to `raft_group_registry::direct_fd_proxy` by value. That
is a bug, we want to pass a reference to the instance that is living
inside `gossiper`.

Fortunately this bug didn't cause problems, because the pinger is only
used for one function, `get_address`, which looks up an address in a map
and if it doesn't find it, accesses the map that lives inside
`gossiper` on shard 0 (and then caches it in the local copy).

Explicitly delete the copy constructor of `direct_fd_pinger` so this
doesn't happen again.

Closes #11661
2022-10-03 16:40:35 +02:00
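Deleting the copy constructor turns an accidental pass-by-value into a compile error. The sketch below shows the idea with hypothetical `pinger` and `proxy` classes that only loosely mirror `direct_fd_pinger` and `direct_fd_proxy`; the member names are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <utility>

// Hypothetical pinger-like service with an address cache.
class pinger {
    std::unordered_map<int, std::string> _addresses;
public:
    pinger() = default;
    // Copying is forbidden so that every user shares the one instance.
    pinger(const pinger&) = delete;
    pinger& operator=(const pinger&) = delete;

    void set_address(int id, std::string addr) { _addresses[id] = std::move(addr); }
    std::size_t known() const { return _addresses.size(); }
};

// Hypothetical consumer: holds a reference, never its own copy.
class proxy {
    pinger& _pinger;
public:
    explicit proxy(pinger& p) : _pinger(p) {}
    pinger& get() { return _pinger; }
};

// A constructor taking `pinger` by value would no longer compile when
// called with an existing instance.
static_assert(!std::is_copy_constructible<pinger>::value,
              "pinger must not be copyable");
```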
Tomasz Grabiec
9dae2b9c02 Merge 'mutation_fragment_stream_validator: various API improvements' from Botond Dénes
The low-level `mutation_fragment_stream_validator` gets `reset()` methods that until now only the high-level `mutation_fragment_stream_validating_filter` had.
Active tombstone validation is pushed down to the low level validator.
The low level validator, which was a pain to use until now due to being very fussy on which subset of its API one used, is made much more robust, not requiring the user to stick to a subset of its API anymore.

Closes #11614

* github.com:scylladb/scylladb:
  mutation_fragment_stream_validator: make interface more robust
  mutation_fragment_stream_validator: add reset() to validating filter
  mutation_fragment_stream_validator: move active tombstone validation into low level validator
2022-10-03 16:23:46 +02:00
Botond Dénes
95f31f37c1 Merge 'dirty_memory_manager: simplify region_group' from Avi Kivity
region_group evolved as a tree, each node of which contains some
regions (memtables). Each node has some constraints on memory, and
can start flushing and/or stop allocation into its memtables and those
below it when those constraints are violated.

Today, the tree has exactly two nodes, only one of which can hold memtables.
However, all the complexity of the tree remains.

This series applies some mechanical code transformations that remove
the tree structure and all the excess functionality, leaving a much simpler
structure behind.

Before:
 - a tree of region_group objects
 - each with two parameters: soft limit and hard limit
 - but only two instances ever instantiated
After:
 - a single region_group object
 - with three parameters - two from the bottom instance, one from the top instance

Closes #11570

* github.com:scylladb/scylladb:
  dirty_memory_manager: move third memory threshold parameter of region_group constructor to reclaim_config
  dirty_memory_manager: simplify region_group::update()
  dirty_memory_manager: fold region_group::notify_hard_pressure_relieved into its callers
  dirty_memory_manager: clean up region_group::do_update_hard_and_check_relief()
  dirty_memory_manager: make do_update_hard_and_check_relief() a member of region_group
  dirty_memory_manager: remove accessors around region_group::_under_hard_pressure
  dirty_memory_manager: merge memory_hard_limit into region_group
  dirty_memory_manager: rename members in memory_hard_limit
  dirty_memory_manager: fold do_update() into region_group::update()
  dirty_memory_manager: simplify memory_hard_limit's do_update
  dirty_memory_manager: drop soft limit / soft pressure members in memory_hard_limit
  dirty_memory_manager: de-template do_update(region_group_or_memory_hard_limit)
  dirty_memory_manager: adjust soft_limit threshold check
  dirty_memory_manager: drop memory_hard_limit::_name
  dirty_memory_manager: simplify memory_hard_limit configuration
  dirty_memory_manager: fold region_group_reclaimer into {memory_hard_limit,region_group}
  dirty_memory_manager: stop inheriting from region_group_reclaimer
  dirty_memory_manager: test: unwrap region_group_reclaimer
  dirty_memory_manager: change region_group_reclaimer configuration to a struct
  dirty_memory_manager: convert region_group_reclaimer to callbacks
  dirty_memory_manager: consolidate region_group_reclaimer constructors
  dirty_memory_manager: rename {memory_hard_limit,region_group}::notify_relief
  dirty_memory_manager: drop unused parameter to memory_hard_limit constructor
  dirty_memory_manager: drop memory_hard_limit::shutdown()
  dirty_memory_manager: split region_group hierarchy into separate classes
  dirty_memory_manager: extract code block from region_group::update
  dirty_memory_manager: move more allocation_queue functions out of region_group
  dirty_memory_manager: move some allocation queue related function definitions outside class scope
  dirty_memory_manager: move region_group::allocating_function and related classes to new class allocation_queue
  dirty_memory_manager: remove support for multiple subgroups
2022-10-03 13:22:47 +03:00
Botond Dénes
5621cdd7f9 db/view/view_builder: don't drop partition and range tombstones when resuming
The view builder builds the views from a given base table in
view_builder::batch_size batches of rows. After processing this many
rows, it suspends so the view builder can switch to building views for
other base tables in the name of fairness. When resuming the build step
for a given base table, it reuses the reader used previously (also
serving the role of a snapshot, pinning sstables read from). The
compactor however is created anew. As the reader can be in the middle of
a partition, the view builder injects a partition start into the
compactor to prime it for continuing the partition. This however only
included the partition-key, crucially missing any active tombstones:
partition tombstone or -- since the v2 transition -- active range
tombstone. This can result in base rows covered by either of these being
resurrected and the view builder generating view updates for them.
This patch solves this by using the detach-state mechanism of the
compactor which was explicitly developed for situations like this (in
the range scan code) -- resuming a read with the readers kept but the
compactor recreated.
Also included are two test cases reproducing the problem, one with a
range tombstone, the other with a partition tombstone.

Fixes: #11668

Closes #11671
2022-10-03 11:28:22 +03:00
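The effect of resuming with and without the active tombstones can be shown with a toy model. Everything here is a simplification made for illustration: `compactor`, `compactor_state` and their members are hypothetical, tombstones are reduced to a single timestamp each, and `-1` is used to mean "no tombstone".

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>

using timestamp = std::int64_t;

// Hypothetical snapshot of what a compactor knows in the middle of a
// partition. A tombstone value of -1 means "none".
struct compactor_state {
    std::string partition_key;
    timestamp partition_tombstone = -1;
    timestamp active_range_tombstone = -1;
};

// Toy compactor; is_live() and set_range_tombstone() assume a partition
// has been started or a state has been supplied.
class compactor {
    std::optional<compactor_state> _state;
public:
    compactor() = default;
    // Resume from a previously detached state.
    explicit compactor(compactor_state s) : _state(std::move(s)) {}

    void start_partition(std::string key, timestamp partition_tombstone = -1) {
        _state = compactor_state{std::move(key), partition_tombstone, -1};
    }
    void set_range_tombstone(timestamp t) { _state->active_range_tombstone = t; }

    // A row written at `ts` is live only if no active tombstone covers it.
    bool is_live(timestamp ts) const {
        return ts > _state->partition_tombstone
            && ts > _state->active_range_tombstone;
    }

    // Hands the mid-partition state over so a new compactor can continue.
    compactor_state detach_state() {
        compactor_state s = std::move(*_state);
        _state.reset();
        return s;
    }
};
```

In this model, a compactor rebuilt from the detached state keeps treating covered rows as dead, whereas one primed only with the partition key would consider them live again.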
Avi Kivity
2c744628ae Update abseil submodule
* abseil 9e408e05...7f3c0d78 (193):
  > Allows absl::StrCat to accept types that implement AbslStringify()
  > Merge pull request #1283 from pateldeev:any_inovcable_rename_true
  > Cleanup: SmallMemmove nullify should also be limited to 15 bytes
  > Cleanup: implement PrependArray and PrependPrecise in terms of InlineData
  > Cleanup: Move BitwiseCompare() to InlineData, and make it layout independent.
  > Change kPower10Table bounds to be half-open
  > Cleanup some InlineData internal layout specific details from cord.h
  > Improve the comments on the implementation of format hooks adl tricks.
  > Expand LogEntry method docs.
  > Documentation: Remove an obsolete note about the implementation of `Cord`.
  > `absl::base_internal::ReadLongFromFile` should use `O_CLOEXEC` and handle interrupts to `read`
  > Allows absl::StrFormat to accept types which implement AbslStringify()
  > Add common_policy_traits - a subset of hash_policy_traits that can be shared between raw_hash_set and btree.
  > Split configuration related to cycle clock into separate headers
  > Fix -Wimplicit-int-conversion and -Wsign-conversion warnings in btree.
  > Implement Eisel-Lemire for from_chars<float>
  > Import of CCTZ from GitHub.
  > Adds support for "%v" in absl::StrFormat and related functions for bool values. Note that %v prints bool values as "true" and "false" rather than "1" and "0".
  > De-pointerize LogStreamer::stream_, and fix move ctor/assign preservation of flags and other stream properties.
  > Explicitly disallows modifiers for use with %v.
  > Change the macro ABSL_IS_TRIVIALLY_RELOCATABLE into a type trait - absl::is_trivially_relocatable - and move it from optimization.h to type_traits.h.
  > Add sparse and string copy constructor benchmarks for hash table.
  > Make BTrees work with custom allocators that recycle memory.
  > Update the readme, and (internally) fix some export processes to better keep it up-to-date going forward.
  > Add the fact that CHECK_OK exits the program to the comment of CHECK_OK.
  > Adds support for "%v" in absl::StrFormat and related functions for numeric types, including integer and floating point values. Users may now specify %v and have the format specifier deduced. Integer values will print according to %d specifications, unsigned values will use %u, and floating point values will use %g. Note that %v does not work for `char` due to ambiguity regarding the intended output. Please continue to use %c for `char`.
  > Implement correct move constructor and assignment for absl::strings_internal::OStringStream, and mark that class final.
  > Add more options for `BM_iteration` in order to see better picture for choosing trade off for iteration optimizations.
  > Change `EndComparison` benchmark to not measure iteration. Also added `BM_Iteration` separately.
  > Implement Eisel-Lemire for from_chars<double>
  > Add `-llog` to linker options when building log_sink_set in logging internals.
  > Apply clang-format to btree.h.
  > Improve failure message: tell the values we don't like.
  > Increase the number of per-ObjFile program headers we can expect.
  > Fix "unsafe narrowing" warnings in absl, 8/n.
  > Fix format string error with an explicit cast
  > Add a case to detect when the Bazel compiler string is explicitly set to "gcc", instead of just detecting Bazel's default "compiler" string.
  > Fix "unsafe narrowing" warnings in absl, 10/n.
  > Fix "unsafe narrowing" warnings in absl, 9/n.
  > Fix stacktrace header includes
  > Add a missing dependency on :raw_logging_internal
  > CMake: Require at least CMake 3.10
  > CMake: install artifacts reflect the compiled ABI
  > Fixes bug so that `%v` with modifiers doesn't compile. `%v` is not intended to work with modifiers because the meaning of modifiers is type-dependent and `%v` is intended to be used in situations where the type is not important. Please continue using `%s` if you require format modifiers.
  > Convert algorithm and container benchmarks to cc_binary
  > Merge pull request #1269 from isuruf:patch-1
  > InlinedVector: Small improvement to the max_size() calculation
  > CMake: Mark hash_testing as a public testonly library, as it is with Bazel
  > Remove the ABSL_HAVE_INTRINSIC_INT128 test from pcg_engine.h
  > Fix ClangTidy warnings in btree.h and btree_test.cc.
  > Fix log StrippingTest on windows when TCHAR = WCHAR
  > Refactors checker.h and replaces recursive functions with iterative functions for readability purposes.
  > Refactors checker.h to use if statements instead of ternary operators for better readability.
  > Import of CCTZ from GitHub.
  > Workaround for ASAN stack safety analysis problem with FixedArray container annotations.
  > Rollback of fix "unsafe narrowing" warnings in absl, 8/n.
  > Fix "unsafe narrowing" warnings in absl, 8/n.
  > Changes mutex profiling
  > InlinedVector: Correct the computation of max_size()
  > Adds support for "%v" in absl::StrFormat and related functions for string-like types (support for other builtin types will follow in future changes). Rather than specifying %s for strings, users may specify %v and have the format specifier deduced. Notably, %v does not work for `const char*` because we cannot be certain if %s or %p was intended (nor can we be certain if the `const char*` was properly null-terminated). If you have a `const char*` you know is null-terminated and would like to work with %v, please wrap it in a `string_view` before using it.
  > Fixed header guards to match style guide conventions.
  > Typo fix
  > Added some more no_test.. tags to build targets for controlling testing.
  > Remove includes which are not used directly.
  > CMake: Add an option to build the libraries that are used for writing tests without requiring Abseil's tests be built (default=OFF)
  > Fix "unsafe narrowing" warnings in absl, 7/n.
  > Fix "unsafe narrowing" warnings in absl, 6/n.
  > Release the Abseil Logging library
  > Switch time_state to explicit default initialization instead of value initialization.
  > spinlock.h: Clean up includes
  > Fix minor typo in absl/time/time.h comment: "ToDoubleNanoSeconds" -> "ToDoubleNanoseconds"
  > Support compilers that are unknown to CMake
  > Import of CCTZ from GitHub.
  > Change bit_width(T) to return int rather than T.
  > Import of CCTZ from GitHub.
  > Merge pull request #1252 from jwest591:conan-fix
  > Don't try to enable use of ARM NEON intrinsics when compiling in CUDA device mode. They are not available in that configuration, even if the host supports them.
  > Fix "unsafe narrowing" warnings in absl, 5/n.
  > Fix "unsafe narrowing" warnings in absl, 4/n.
  > Import of CCTZ from GitHub.
  > Update Abseil platform support policy to point to the Foundational C++ Support Policy
  > Import of CCTZ from GitHub.
  > Add --features=external_include_paths to Bazel CI to ignore warnings from dependencies
  > Merge pull request #1250 from jonathan-conder-sm:gcc_72
  > Merge pull request #1249 from evanacox:master
  > Import of CCTZ from GitHub.
  > Merge pull request #1246 from wxilas21:master
  > remove unused includes and add missing std includes for absl/status/status.h
  > Sort INTERNAL_DLL_TARGETS for easier maintenance.
  > Disable ABSL_HAVE_STD_IS_TRIVIALLY_ASSIGNABLE for clang-cl.
  > Map the absl::is_trivially_* functions to their std impl
  > Add more SimpleAtod / SimpleAtof test coverage
  > debugging: handle alternate signal stacks better on RISCV
  > Revert change "Fix "unsafe narrowing" warnings in absl, 4/n.".
  > Fix "unsafe narrowing" warnings in absl, 3/n.
  > Fix "unsafe narrowing" warnings in absl, 4/n.
  > Fix "unsafe narrowing" warnings in absl, 2/n.
  > debugging: honour `STRICT_UNWINDING` in RISCV path
  > Fix "unsafe narrowing" warnings in absl, 1/n.
  > Add ABSL_IS_TRIVIALLY_RELOCATABLE and ABSL_ATTRIBUTE_TRIVIAL_ABI macros for use with clang's __is_trivially_relocatable and [[clang::trivial_abi]].
  > Merge pull request #1223 from ElijahPepe:fix/implement-snprintf-safely
  > Fix frame pointer alignment check.
  > Fixed sign-conversion warning in code.
  > Import of CCTZ from GitHub.
  > Add missing include for std::unique_ptr
  > Do not re-close files on EINTR
  > Renamespace absl::raw_logging_internal to absl::raw_log_internal to match (upcoming) non-raw logging namespace.
  > Check for negative return values from ReadFromOffset
  > Use HTTPS RFC URLs, which work regardless of the browser's locale.
  > Avoid signedness change when casting off_t
  > Internal Cleanup: removing unused internal function declaration.
  > Make Span complain if constructed with a parameter that won't outlive it, except if that parameter is also a span or appears to be a view type.
  > any_invocable_test: Re-enable the two conversion tests that used to fail under MSVC
  > Add GetCustomAppendBuffer method to absl::Cord
  > debugging: add hooks for checking stack ranges
  > Minor clang-tidy cleanups
  > Support [[gnu::abi_tag("xyz")]] demangling.
  > Fix -Warray-parameter warning
  > Merge pull request #1217 from anpol:macos-sigaltstack
  > Undo documentation change on erase.
  > Improve documentation on erase.
  > Merge pull request #1216 from brjsp:master
  > string_view: conditional constexpr is no longer needed for C++14
  > Make exponential_distribution_test a bigger test (timeout small -> moderate).
  > Move Abseil to C++14 minimum
  > Revert commit f4988f5bd4176345aad2a525e24d5fd11b3c97ea
  > Disable C++11 testing, enable C++14 and C++20 in some configurations where it wasn't enabled
  > debugging: account for differences in alternate signal stacks
  > Import of CCTZ from GitHub.
  > Run flaky test in fewer configurations
  > AnyInvocable: Move credits to the top of the file
  > Extend visibility of :examine_stack to an upcoming Abseil Log.
  > Merge contiguous mappings from the same file.
  > Update versions of WORKSPACE dependencies
  > Use ABSL_INTERNAL_HAS_SSE2 instead of __SSE2__
  > PR #1200: absl/debugging/CMakeLists.txt: link with libexecinfo if needed
  > Update GCC floor container to use Bazel 5.2.0
  > Update GoogleTest version used by Abseil
  > Release absl::AnyInvocable
  > PR #1197: absl/base/internal/direct_mmap.h: fix musl build on mips
  > absl/base/internal/invoke: Ignore bogus warnings on GCC >= 11
  > Revert GoogleTest version used by Abseil to commit 28e1da21d8d677bc98f12ccc7fc159ff19e8e817
  > Update GoogleTest version used by Abseil
  > explicit_seed_seq_test: work around/disable bogus warnings in GCC 12
  > any_test: expand the any emplace bug suppression, since it has gotten worse in GCC 12
  > absl::Time: work around bogus GCC 12 -Wrestrict warning
  > Make absl::StdSeedSeq an alias for std::seed_seq
  > absl::Optional: suppress bogus -Wmaybe-uninitialized GCC 12 warning
  > algorithm_test: suppress bogus -Wnonnull warning in GCC 12
  > flags/marshalling_test: work around bogus GCC 12 -Wmaybe-uninitialized warning
  > counting_allocator: suppress bogus -Wuse-after-free warning in GCC 12
  > Prefer to fallback to UTC when the embedded zoneinfo data does not contain the requested zone.
  > Minor wording fix in the comment for ConsumeSuffix()
  > Tweak the signature of status_internal::MakeCheckFailString as part of an upcoming change
  > Fix several typos in comments.
  > Reformulate documentation of ABSL_LOCKS_EXCLUDED.
  > absl/base/internal/invoke.h: Use ABSL_INTERNAL_CPLUSPLUS_LANG for language version guard
  > Fix C++17 constexpr storage deprecation warnings
  > Optimize SwissMap iteration by another 5-10% for ARM
  > Add documentation on optional flags to the flags library overview.
  > absl: correct the stack trace path on RISCV
  > Merge pull request #1194 from jwnimmer-tri:default-linkopts
  > Remove unintended defines from config.h
  > Ignore invalid TZ settings in tests
  > Add ABSL_HARDENING_ASSERTs to CordBuffer::SetLength() and CordBuffer::IncreaseLengthBy()
  > Fix comment typo about absl::Status<T*>
  > In b-tree, support unassignable value types.
  > Optimize SwissMap for ARM by 3-8% for all operations
  > Release absl::CordBuffer
  > InlinedVector: Limit the scope of the maybe-uninitialized warning suppression
  > Improve the compiler error by removing some noise from it. The "deleted" overload error is useless to users. By passing some dummy string to the base class constructor we use a valid constructor and remove the unintended use of the deleted default constructor.
  > Merge pull request #714 from kgotlinux:patch-2
  > Include proper #includes for POSIX thread identity implementation when using that implementation on MinGW.
  > Rework NonsecureURBGBase seed sequence.
  > Disable tests on some platforms where they currently fail.
  > Fixed typo in a comment.
  > Rollforward of commit ea78ded7a5f999f19a12b71f5a4988f6f819f64f.
  > Add an internal helper for logging (upcoming).
  > Merge pull request #1187 from trofi:fix-gcc-13-build
  > Merge pull request #1189 from renau:master
  > Allow for using b-tree with `value_type`s that can only be constructed by the allocator (ignoring copy/move constructors).
  > Stop using sleep timeouts for Linux futex-based SpinLock
  > Automated rollback of commit f2463433d6c073381df2d9ca8c3d8f53e5ae1362.
  > time.h: Use uint32_t literals for calls to overloaded MakeDuration
  > Fix typos.
  > Clarify the behaviour of `AssertHeld` and `AssertReaderHeld` when the calling thread doesn't hold the mutex.
  > Enable __thread on Asylo
  > Add implementation of is_invocable_r to absl::base_internal for C++ < 17, define it as alias of std::is_invocable_r when C++ >= 17
  > Optimize SwissMap iteration for aarch64 by 5-6%
  > Fix detection of ABSL_HAVE_ELF_MEM_IMAGE on Haiku
  > Don’t use generator expression to build .pc Libs lines
  > Update Bazel used on MacOS CI
  > Import of CCTZ from GitHub.

Closes #11687
2022-10-03 11:06:37 +03:00
Botond Dénes
f4540ef0d6 Merge 'Upgrade nix devenv' from Michael Livshin
To recap: the Nix devenv ({default,shell,flake}.nix and friends) in Scylla is a nicer (for those who consider it so, that is) alternative to dbuild: a completely deterministic build environment without Docker.

In theory we could support much more (creating installable packages, container images, various deployment affordances, etc. -- Nix is, among other things, a kind of parallel-to-everything-else devops realm), but there is clearly no demand, and besides, duplicating the work the release team is already doing (and doing just fine, needless to say) would be pointless and wasteful.

This PR reflects the accumulated changes that I have been carrying locally for the past year or so.  The version currently in master _probably_ can still build Scylla, but that Scylla certainly would not pass unit tests.

What the previous paragraph seems to mean is, apparently I'm the only active user of Nix devenv for Scylla.  Which, in turn, presents some obvious questions for the maintainers:

- Does this need to live in the Scylla source at all?  (The changes to non-Nix-specific parts are minimal and unobtrusive, but they are still changes)
- If it's left in, who is going to maintain it going forward, should more users somehow appear?  (I'm perfectly willing to fix things up when alerted, but no timeliness guarantees)

Closes #9557

* github.com:scylladb/scylladb:
  nix: add README.md
  build: improvements & upgrades to Nix dev environment
  build: allow setting SCYLLA_RELEASE from outside
2022-10-03 09:40:09 +03:00
Botond Dénes
2041744132 Merge 'readers/multishard: don't mix coroutines and continuations in the do_fill_buffer()' from Avi Kivity
The combination is hard to read and modify.

Closes #11665

* github.com:scylladb/scylladb:
  readers/multishard: restore shard_reader_v2::do_fill_buffer() indentation
  readers/multishard: convert shard_reader_v2::do_fill_buffer() to a pure coroutine
2022-10-03 06:51:20 +03:00
Nadav Har'El
b8f8eb8710 Merge 'Improve test.py logging' from Kamil Braun
Include the unique test name (the unique name distinguishes between different test repeats) and the test case name where possible. Improve printing of clusters: include the cluster name and stopped servers. Fix some logging calls and add new ones.

Examples:
```
------ Starting test test_topology ------
```
became this:
```
------ Starting test test_topology.1::test_add_server_add_column ------
```

This:
```
INFO> Leasing Scylla cluster {127.191.142.1, 127.191.142.2, 127.191.142.3} for test test_add_server_add_column
```
became this:
```
INFO> Leasing Scylla cluster ScyllaCluster(name: 02cdd180-40d1-11ed-8803-3c2c30d32d96, running: {127.144.164.1, 127.144.164.2, 127.144.164.3}, stopped: {}) for test test_topology.1::test_add_server_add_column
```

Closes #11677

* github.com:scylladb/scylladb:
  test/pylib: scylla_cluster: improve cluster printing
  test/pylib: don't pass test_case_name to after-test endpoint
  test/pylib: scylla_cluster: track current test case name and print it
  test.py: pass the unique test name (e.g. `test_topology.1`) to cluster manager
  test/pylib: scylla_cluster: pass the test case name to `before_test`
  test/pylib: use "test_case_name" variable name when talking about test cases
2022-10-02 20:48:50 +03:00
Pavel Emelyanov
2b8636a2a9 storage_proxy.hh: Remove unused headers
Add needed forward declarations and fix indirect inclusions in some .ccs

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11679
2022-10-02 20:48:50 +03:00
Michał Chojnowski
4563cbe595 logalloc: prevent false positives in reclaim_timer
reclaim_timer uses a coarse clock, but does not account for
the measurement error introduced by that -- it can falsely
report reclaims as stalls, even if they are shorter by a full
coarse clock tick from the requested threshold
(blocked-reactor-notify-ms).

Notably, if the stall threshold happens to be smaller than or equal to coarse
clock resolution, Scylla's log gets spammed with false stall reports.
The resolution of coarse clocks in Linux is 1/CONFIG_HZ. This is
typically equal to 1 ms or 4 ms, and stall thresholds of this order
can occur in practice.

Eliminate false positives by requiring the measured reclaim duration to
be at least 1 clock tick longer than the configured threshold for it to
be considered a stall.
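
The rule above can be sketched as follows. This is a minimal illustration, not the actual reclaim_timer code: the helper name `is_stall` and its parameters are hypothetical, and `tick` stands for the coarse clock resolution (1/CONFIG_HZ).

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not Scylla's reclaim_timer API).
// measured:  reclaim duration as read from a coarse clock
// threshold: configured stall threshold (blocked-reactor-notify-ms)
// tick:      resolution of the coarse clock, e.g. 1 ms or 4 ms
inline bool is_stall(std::chrono::nanoseconds measured,
                     std::chrono::nanoseconds threshold,
                     std::chrono::nanoseconds tick) {
    assert(tick > std::chrono::nanoseconds::zero());
    // A coarse clock can overstate a duration by up to one tick, so a
    // reclaim counts as a stall only if it exceeds the threshold by at
    // least that margin.
    return measured >= threshold + tick;
}
```

With a 4 ms tick and a 4 ms threshold, a reclaim measured at 7 ms is not reported, while one measured at 8 ms is.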

Fixes #10981

Closes #11680
2022-10-02 13:41:40 +03:00
Avi Kivity
372eadf542 Merge "perftune related improvements in scylla_* scripts" from Vlad Zolotarov
"
This series adds a long-awaited transition of our auto-generation
code to irq_cpu_mask instead of 'mode' in perftune.yaml.

And then it fixes a regression in scylla_prepare perftune.yaml
auto-generation logic.
"

* 'scylla_prepare_fix_regression-v1' of https://github.com/vladzcloudius/scylla:
  scylla_prepare + scylla_cpuset_setup: make scylla_cpuset_setup idempotent without introducing regressions
  scylla_prepare: stop generating 'mode' value in perftune.yaml
2022-10-02 13:25:13 +03:00
Michael Livshin
d178ac17dc nix: add README.md
Signed-off-by: Michael Livshin <repo@cmm.kakpryg.net>
2022-10-02 12:26:02 +03:00
Michael Livshin
7bd13be3f2 build: improvements & upgrades to Nix dev environment
* Add some more useful stuff to the shell environment, so it actually
  works for debugging & post-mortem analysis.

* Wrap ccache & distcc transparently (distcc will be used unless
  NODISTCC is set to a non-empty value in the environment; ccache will
  be used if CCACHE_DIR is not empty).

* Package the Scylla Python driver (instead of the C* one).

* Catch up to misc build/test requirements (including optional) by
  requiring or custom-packaging: wasmtime 0.29.0, cxxbridge,
  pytest-asyncio, liburing.

* Build statically-linked zstd in a saner and more idiomatic fashion.

* In pure builds (where sources lack Git metadata), derive
  SCYLLA_RELEASE from source hash.

* Refactor things for more parameterization.

* Explicitly stub out installPhase (seeing that "nix build" succeeds
  up to installPhase means we didn't miss any dependencies).

* Add flake support.

* Add copious comments.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-10-02 11:47:16 +03:00
Michael Livshin
839d8f40e6 build: allow setting SCYLLA_RELEASE from outside
The extant logic for deriving the value of SCYLLA_RELEASE from the
source tree makes the following assumptions:

* The tree being built includes Git metadata.

* The value of `date` is trustworthy and interesting.

* There are no uncommitted changes (those relevant to building,
  anyway).

The above assumptions are either irrelevant or problematic in pure
build environments (such as the sandbox set up by `nix-build`):

* Pure builds use cleaned-up sources with all timestamps reset to Unix
  time 0.  Those cleaned-up sources are saved (in the Nix store, for
  example) and content-hashed, so leaving the (possibly huge) Git
  metadata increases the time to copy the sources and wastes disk
  space (in fact, Nix in flake mode strips `.git` unconditionally).

* Pure builds run in a sandbox where time is, likewise, reset to Unix
  time 0, so the output of `date` is neither informative nor useful.

Now, the only build step that uses Git metadata in the first place is
the SCYLLA_RELEASE value derivation logic.  So, essentially, answering
the question "is the Git metadata needed to build Scylla" is a matter
of definition, and is up to us.  If we elect to ignore Git metadata
and current time, we can derive SCYLLA_RELEASE value from the content
hash of the cleaned-up tree, regardless of the way that tree was
arrived at.

This change makes it possible to skip the derivation of SCYLLA_RELEASE
value from Git metadata and current time by way of setting
SCYLLA_RELEASE in the environment.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-10-02 11:47:16 +03:00
Avi Kivity
17b1cb4434 dirty_memory_manager: move third memory threshold parameter of region_group constructor to reclaim_config
Place it along the other parameters.
2022-09-30 22:17:37 +03:00
Avi Kivity
ecf30ee469 dirty_memory_manager: simplify region_group::update()
We notice there are two separate conditions controlling a call to
a single outcome, notify_pressure_relief(). Merge them into a single
boolean variable.
2022-09-30 22:15:45 +03:00
Avi Kivity
230fff299a dirty_memory_manager: fold region_group::notify_hard_pressure_relieved into its callers
It is trivial.
2022-09-30 22:11:01 +03:00
Avi Kivity
12b81173b9 dirty_memory_manager: clean up region_group::do_update_hard_and_check_relief()
Remove synthetic "rg" local.
2022-09-30 22:09:09 +03:00
Avi Kivity
e1bad8e883 dirty_memory_manager: make do_update_hard_and_check_relief() a member of region_group
It started life as something shared between memory_hard_limit and
region_group, but now that they are back being the same thing, we
can make it a member again.
2022-09-30 22:04:26 +03:00
Avi Kivity
6b21c10e9e dirty_memory_manager: remove accessors around region_group::_under_hard_pressure
It is now only accessed from within the class, so the
accessors don't help anything.
2022-09-30 21:59:46 +03:00
Avi Kivity
6a02bb7c2b dirty_memory_manager: merge memory_hard_limit into region_group
The two classes always have a 1:1 or 0:1 relationship, and
so we can just move all the members of memory_hard_limit
into region_group, with the functions that track the relationship
(memory_hard_limit::{add,del}()) removed.

The 0:1 relationship is maintained by initializing the
hard limit parameter with std::numeric_limits<size_t>::max().
The _hard_total_memory variable is always checked if it is
greater than this parameter in order to do anything, and
with this default it can never be.
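
A minimal sketch of the sentinel idea, assuming a much-simplified class: `region_group_sketch`, `add()` and `under_hard_pressure()` are hypothetical names for illustration, not the real region_group interface.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical, simplified stand-in for region_group after the merge.
class region_group_sketch {
    std::size_t _hard_limit;
    std::size_t _hard_total_memory = 0;
public:
    // Defaulting the limit to the largest size_t models the "0:1" case:
    // a group that has no hard limit at all.
    explicit region_group_sketch(
            std::size_t hard_limit = std::numeric_limits<std::size_t>::max())
        : _hard_limit(hard_limit) {}

    void add(std::size_t bytes) {
        // Guard against wrap-around in this sketch.
        assert(bytes <= std::numeric_limits<std::size_t>::max() - _hard_total_memory);
        _hard_total_memory += bytes;
    }

    // No size_t value is greater than the default limit, so an
    // unlimited group never reports hard pressure.
    bool under_hard_pressure() const {
        return _hard_total_memory > _hard_limit;
    }
};
```

The design choice is that no separate "has a hard limit" flag is needed: the comparison itself is always false for the default value.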
2022-09-30 21:59:38 +03:00
Avi Kivity
45ab24e43d dirty_memory_manager: rename members in memory_hard_limit
In preparation for merging memory_hard_limit into region_group,
disambiguate similarly named members by adding the word "hard" in
random places.

memory_hard_limit and region_group are candidates for merging
because they constantly reference each other, and memory_hard_limit
does very little by itself.
2022-09-30 21:47:33 +03:00
Avi Kivity
aca96c4103 readers/multishard: restore shard_reader_v2::do_fill_buffer() indentation
2022-09-30 19:19:51 +03:00