Commit Graph

34814 Commits

Author SHA1 Message Date
Alejo Sanchez
cf3b8d7edc pytest/topology: check snapshot transfer
Test snapshot transfer by reducing the snapshot threshold on the initial
servers (3 and 1 trailing).

The test then creates a table and performs 3 extra schema changes (adding
a column each time), triggering at least 2 snapshots.

It then brings a new server into the cluster, which will get the schema
through a snapshot.

Finally, the test stops the initial servers and verifies that the table
schema is up to date on the new server.
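
A minimal sketch of that flow, assuming hypothetical `manager` fixture
methods (`running_servers`, `server_add`, `server_stop_gracefully`) modeled
on test/pylib rather than the actual API:

```
# Hedged sketch only; helper names are assumptions, not the real pylib API.
import pytest

@pytest.mark.asyncio
async def test_schema_snapshot_transfer(manager):
    initial = await manager.running_servers()        # the 3 initial servers
    cql = manager.cql
    await cql.run_async("CREATE KEYSPACE ks WITH replication = "
                        "{'class': 'SimpleStrategy', 'replication_factor': 3}")
    await cql.run_async("CREATE TABLE ks.t (pk int PRIMARY KEY)")
    for i in range(3):                                # extra schema changes
        await cql.run_async(f"ALTER TABLE ks.t ADD c{i} int")

    new_server = await manager.server_add()          # gets schema via snapshot

    for s in initial:                                 # stop initial servers
        await manager.server_stop_gracefully(s.server_id)

    # (the real test re-points the driver at the new server before querying)
    rows = await cql.run_async(
        "SELECT column_name FROM system_schema.columns "
        "WHERE keyspace_name = 'ks' AND table_name = 't'")
    assert {r.column_name for r in rows} >= {"pk", "c0", "c1", "c2"}
```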

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2023-02-07 16:09:07 +01:00
Alejo Sanchez
346d02b477 raft conf error injection for snapshot
To trigger the snapshot limit behavior, provide an error injection that can
be set as one-shot.

Note that this effectively changes the configuration and there is no revert.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2023-02-03 22:33:33 +01:00
Alejo Sanchez
9ceb6aba81 test/pylib: one-shot error injection helper
The existing helper with an async context manager only worked for non-one-shot
error injections. Fix it and add another helper for one-shot injections that
does not need a context manager.

Fix tests using the previous helper.
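
A rough sketch of the two flavors of helper, assuming a hypothetical REST
client and error-injection endpoint (names and paths are assumptions, not
the actual pylib code):

```
# Hedged sketch; rest_client and the endpoint details are assumptions.
from contextlib import asynccontextmanager

async def inject_error_one_shot(rest_client, node_ip, injection):
    # One-shot injections disable themselves after firing once,
    # so no context manager (and no explicit disable) is needed.
    await rest_client.post(f"/v2/error_injection/injection/{injection}",
                           host=node_ip, params={"one_shot": "true"})

@asynccontextmanager
async def inject_error(rest_client, node_ip, injection):
    # Persistent (non one-shot) injection, disabled again on exit.
    await rest_client.post(f"/v2/error_injection/injection/{injection}",
                           host=node_ip, params={"one_shot": "false"})
    try:
        yield
    finally:
        await rest_client.delete(f"/v2/error_injection/injection/{injection}",
                                 host=node_ip)
```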

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2023-02-02 16:37:21 +01:00
Kefu Chai
53366db6c6 build: disable Seastar's io_uring backend again
this partially reverts 49157370bc

according to the reports in #12173, at least two developers ran into
test failures correlated with the latest Seastar change, which enables
the io_uring backend by default. they are using linux kernels 6.0.12 and
6.1.7. it's also reported that reverting commit
eedca15f16c3b6eae3d3d8af9510624a93f5d186 in seastar helps; that very
commit enables io_uring by default. although we are not able to identify
the exact root cause of the failures in #12173 at this moment, ruling
out potential io_uring problems should help with further investigation.

in this change, the io_uring backend is disabled when building Seastar.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12689
2023-02-01 17:36:07 +02:00
Avi Kivity
b4559a6992 Update seastar submodule
* seastar 943c09f869...ef24279f03 (6):
  > Merge 'util/print_safe, reactor: use concept for type constraints and refactory ' from Kefu Chai
  > Right align the memory diagnostics
  > Merge 'Add an API for the metrics layer to manipulate metrics dynamically.' from Amnon Heiman
  > semaphore: assert no outstanding units when moved
  > build: do not populate package registry by default
  > build: stop detecting concepts support

Closes #12695
2023-02-01 17:19:49 +02:00
Kamil Braun
40142a51d0 test: topology: wait for token ring/group 0 consistency after decommission
One of the tests checked for immediate consistency right after a decommission
operation finished, but it turns out that after decommission it may also take
some time for the token ring to be updated on other nodes. Replace the check
with a wait.

Also do the wait in another test that performs a sequence of
decommissions. We won't attempt to start another decommission until
every node learns that the previously decommissioned node has left.
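
The wait can be pictured as a simple polling loop along these lines
(illustrative only; the helper names are hypothetical, not the suite's
actual utilities):

```
# Hedged sketch of waiting for the token ring to converge after decommission.
import asyncio, time

async def wait_for_token_ring_consistency(nodes, deadline):
    while True:
        views = [await n.get_token_ring_host_ids() for n in nodes]  # hypothetical
        if all(v == views[0] for v in views):
            return
        if time.time() > deadline:
            raise TimeoutError("token ring did not converge in time")
        await asyncio.sleep(0.5)
```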

Closes #12686
2023-02-01 16:49:22 +02:00
Raphael S. Carvalho
1b2140e416 compaction: Fix inefficiency when updating LCS backlog tracker
The LCS backlog tracker uses the STCS tracker for L0. It turns out the LCS
tracker calls the STCS tracker's replace_sstables() with empty arguments
even when *only* higher levels (> 0) had sstables replaced.
This unnecessary call causes the STCS tracker to recompute the L0
backlog, yielding the same value as before.

As LCS has a fragment size of 0.16G on higher levels, we may be
updating the tracker multiple times during incremental compaction,
which operates on sstables on higher levels.

The inefficiency is fixed by updating the STCS tracker only if an
L0 sstable is being added to or removed from the table.

This may also fix quadratic behavior during boot or refresh,
as new sstables are loaded one by one.
Higher levels have a substantially higher number of sstables,
so updating the STCS tracker only when level 0 changes significantly
reduces the number of times the L0 backlog is recomputed.
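
The guard can be summarized with the following Python-style pseudocode (the
real change is in the C++ leveled backlog tracker; names here are
illustrative):

```
# Illustrative pseudocode only -- not the actual C++ implementation.
def replace_sstables(self, old_ssts, new_ssts):
    l0_removed = [s for s in old_ssts if s.level == 0]
    l0_added = [s for s in new_ssts if s.level == 0]
    # Touch the STCS (L0) tracker only when L0 actually changed; otherwise
    # its backlog would just be recomputed to the same value as before.
    if l0_removed or l0_added:
        self.stcs_tracker.replace_sstables(l0_removed, l0_added)
    # Higher levels keep being accounted for as before.
```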

Refs #12499.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #12676
2023-02-01 15:19:07 +02:00
Michael Hollander
5d1e40bc18 Added missing full stop to SimpleSnitch paragraph
Closes #12692
2023-02-01 13:21:49 +02:00
Nadav Har'El
132af20057 Merge 'test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests' from Kamil Braun
`ScyllaClusterManager` is used to run a sequence of test cases from
a single test file. Between two consecutive tests, if the previous test
left the cluster 'dirty', meaning the cluster cannot be reused, it would
free up space in the pool (using `steal`), stop the cluster, then get a
new cluster from the pool.

Between the `steal` and the `get`, a concurrent test run (with its own
instance of `ScyllaClusterManager`) would start, because there was free
space in the pool.

This resulted in undesirable behavior when we ran tests with
`--repeat X` for a large `X`: we would start with e.g. 4 concurrent
runs of a test file, because the pool size was 4. As soon as one of the
runs freed up space in the pool, we would start another concurrent run.
Soon we'd end up with 8 concurrent runs. Then 16 concurrent runs. And so
on. We would have a large number of concurrent runs, even though the
original 4 runs didn't finish yet. All of these concurrent runs would
compete waiting on the pool, and waiting for space in the pool would
take longer and longer (the duration is linear w.r.t. the number of
concurrent competing runs). Tests would then time out because they would
have to wait too long.

Fix that by using the new `replace_dirty` function introduced to the
pool. This function frees up space by returning a dirty cluster and then
immediately takes it away to be used for a new cluster. Thanks to this,
we will only have at most as many concurrent runs as the pool size. For
example with --repeat 8 and pool size 4, we would run 4 concurrent runs
and start the 5th run only when one of the original 4 runs finishes,
then the 6th run when a second run finishes and so on.

The fix is preceded by a refactor that replaces `steal` with `put(is_dirty=True)`
and a `destroy` function passed to the pool (now the pool is responsible
for stopping the cluster and releasing its IPs).

Fixes #11757

Closes #12549

* github.com:scylladb/scylladb:
  test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests
  test/pylib: pool: introduce `replace_dirty`
  test/pylib: pool: replace `steal` with `put(is_dirty=True)`
2023-02-01 12:37:39 +02:00
Nadav Har'El
681a066923 test/pylib: put UNIX-domain socket in /tmp
The "cluster manager" used by the topology test suite uses a UNIX-domain
socket to communicate between the cluster manager and the individual tests.
The socket is currently located in the test directory but there is a
problem: In Linux the length of the path used as a UNIX-domain socket
address is limited to just a little over 100 bytes. In Jenkins run, the
test directory names are very long, and we sometimes go over this length
limit and the result is that test.py fails creating this socket.

In this patch we simply put the socket in /tmp instead of the test
directory. We only need to do this change in one place - the cluster
manager, as it already passes the socket path to the individual tests
(using the "--manager-api" option).

Tested by cloning Scylla in a very long directory name.
A test like ./test.py --mode=dev test_concurrent_schema fails before
this patch, and passes with it.
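
A sketch of the idea, with hypothetical helper names (the real change lives
in the test/pylib cluster manager):

```
# Hedged sketch: keep the UNIX-domain socket path short by placing it in /tmp,
# well below the ~108-byte sun_path limit on Linux.
import os, tempfile

def make_manager_socket_path():
    sock_dir = tempfile.mkdtemp(prefix="manager-", dir="/tmp")
    return os.path.join(sock_dir, "api")   # e.g. /tmp/manager-ab12cd/api
```

The resulting path is then handed to the individual tests via the
"--manager-api" option exactly as before.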

Fixes #12622

Closes #12678
2023-02-01 12:37:35 +03:00
Botond Dénes
325246ab2a Merge 'doc: fix the service name from "scylla-enterprise-server" to "scylla-server"' from Anna Stuchlik
Related https://github.com/scylladb/scylladb/issues/12658.

This fixes the bug in the upgrade guides for the released versions.

Closes #12679

* github.com:scylladb/scylladb:
  doc: fix the service name in the upgrade guide for patch releases versions 2022
  doc: fix the service name in the upgrade guide from 2021.1 to 2022.1
2023-02-01 12:37:35 +03:00
Anna Stuchlik
2be131da83 doc: fixes https://github.com/scylladb/scylladb/issues/12672, fix the redirects to the Cloud docs
Closes #12673
2023-02-01 12:37:35 +03:00
Botond Dénes
d8073edbb7 Merge 'cql3, locator: call fmt::format_to() explicitly and include used headers' from Kefu Chai
these fixes address the FTBFS of scylla with GCC-13.

Closes #12669

* github.com:scylladb/scylladb:
  cql3/stats: include the used header.
  cql3, locator: call fmt::format_to() explicitly
2023-02-01 12:37:35 +03:00
Pavel Emelyanov
d065f9f82e sstables: The generation_type is not formattable
If TOC writing hits a TOC file conflict, it tries to throw an exception
with the sstable generation in it. However, generation_type is not
formattable at all, let alone with the {:d} option.

This bug generates an obscure 'fmt::v9::format_error (invalid type
specifier)' error in an unknown location, making debugging hard.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #12671
2023-02-01 12:37:35 +03:00
Kefu Chai
186ceea009 cql3/selection: construct string_view using char* not size
before this change, we constructed an sstring from a comma expression,
which evaluates to the return value of `name.size()`, but what we
expect is `sstring(const char*, size_t)`.

in this change:

* instead of passing only the size of the string_view,
  both its address and size are used
* `std::string_view` is constructed instead of sstring, for better
  performance, as we don't need to perform a deep copy

the issue is reported by GCC-13:

```
In file included from cql3/selection/selectable.cc:11:
cql3/selection/field_selector.hh:83:60: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
        auto sname = sstring(reinterpret_cast<const char*>(name.begin(), name.size()));
                                                           ^~~~~~~~~~
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12666
2023-02-01 12:37:35 +03:00
David Garcia
616bf26422 docs: add opensource flag
Closes #12656
2023-02-01 12:37:35 +03:00
Anna Stuchlik
11a59bcc76 doc: fix the service name in the upgrade guide for patch releases versions 2022 2023-01-31 11:04:21 +01:00
Anna Stuchlik
71ae644d40 doc: fix the service name in the upgrade guide from 2021.1 to 2022.1 2023-01-31 10:46:44 +01:00
Kefu Chai
58b4dc5b9a cql3/stats: include the used header.
otherwise `uint64_t` won't be found when compiling with GCC-13.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-01-30 21:50:23 +08:00
Kefu Chai
ccc03dd1ec cql3, locator: call fmt::format_to() explicitly
since format_to() is defined in both the fmt and std namespaces,
without specifying which one to use, we'd fail to build with a
standard library that implements std::format_to(). yes, we are
`using namespace std` somewhere.

this change should address the FTBFS with GCC-13.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-01-30 21:50:11 +08:00
Warren Krewenki
8655a8be19 docs: Update suggested AWS instance types in benchmark tips
The list of suggested instances had a misspelling of c5d, and didn't include the i4i instances recommended by https://www.scylladb.com/2022/05/09/scylladb-on-the-new-aws-ec2-i4i-instances-twice-the-throughput-lower-latency/

Closes #12664
2023-01-30 14:10:18 +02:00
Botond Dénes
c927eea1d5 Merge 'table: trim ranges for compaction group cleanup' from Benny Halevy
This series contains the following changes for trimming the ranges passed to compaction group cleanup down to the token ranges owned by the compaction group.

table: compaction_group_for_token: use signed arithmetic
Fixes #12595

table: make_compaction_groups: calculate compaction_group token ranges
table: perform_cleanup_compaction: trim owned ranges on compaction_group boundaries
Fixes #12594

Closes #12598

* github.com:scylladb/scylladb:
  table: perform_cleanup_compaction: trim owned ranges on compaction_group boundaries
  table: make_compaction_groups: calculate compaction_group token ranges
  dht: range_streamer: define logger as static
2023-01-30 13:11:28 +02:00
Anna Stuchlik
64cc4c8515 docs: fixes https://github.com/scylladb/scylladb/issues/12654, update the links to the Download Center
Closes #12655
2023-01-30 12:45:20 +02:00
Michał Chojnowski
fa7e904cd6 commitlog: fix total_size_on_disk accounting after segment file removal
Currently, segment file removal first calls `f.remove_file()` and
does `total_size_on_disk -= f.known_size()` later.
However, `remove_file()` resets `known_size` to 0, so in effect
the freed space is not accounted for.

`total_size_on_disk` is not just a metric. It is also responsible
for deciding whether a segment should be recycled -- it is recycled
only if `total_size_on_disk - known_size < max_disk_size`.
Therefore this bug has dire performance consequences:
if `total_size_on_disk - known_size` ever exceeds `max_disk_size`,
the recycling of commitlog segments will stop permanently, because
`total_size_on_disk - known_size` will never go back below
`max_disk_size` due to the accounting bug. All new segments from this
point will be allocated from scratch.

The bug was uncovered by a QA performance test. It isn't easy to trigger --
it took the test 7 hours of constant high load to step into it.
However, the fact that the effect is permanent, and degrades the
performance of the cluster silently, makes the bug potentially quite severe.

The bug can be easily spotted with Prometheus as infinitely rising
`commitlog_total_size_on_disk` on the affected shards.
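
The fix amounts to reading the size before removing the file, roughly like
this (pseudocode; the actual code is C++ in the commitlog):

```
# Illustrative ordering only -- remove_file() resets known_size to 0,
# so the size must be captured before the removal.
async def delete_segment_file(f, commitlog_state):
    size = f.known_size()                       # capture before it is reset
    await f.remove_file()
    commitlog_state.total_size_on_disk -= size  # freed space is now accounted
```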

Fixes #12645

Closes #12646
2023-01-30 12:20:04 +02:00
Avi Kivity
5d914adcef Merge 'view: row_lock: lock_ck: find or construct row_lock under partition lock' from Benny Halevy
Since we're potentially searching the row_lock in parallel to acquiring the read_lock on the partition, we're racing with row_locker::unlock that may erase the _row_locks entry for the same clustering key, since there is no lock to protect it up until the partition lock has been acquired and the lock_partition future is resolved.

This change moves the code to search for or allocate the row lock _after_ the partition lock has been acquired to make sure we're synchronously starting the read/write lock function on it, without yielding, to prevent this use-after-free.

This adds an allocation for copying the clustering key in advance, which wasn't needed before when the lock for it was already found, but view building is not on the hot path so we can tolerate that.

This is required on top of 5007ded2c1 as seen in https://github.com/scylladb/scylladb/issues/12632 which is closely related to #12168 but demonstrates a different race causing use-after-free.

Fixes #12632

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #12639

* github.com:scylladb/scylladb:
  view: row_lock: lock_ck: try_emplace row_lock entry
  view: row_lock: lock_ck: find or construct row_lock under partition lock
2023-01-29 18:38:14 +02:00
Warren Krewenki
2b7a7e52f4 docs: Missing closing quote in example query
Closes #12663
2023-01-29 11:50:11 +02:00
Botond Dénes
84a69b6adb db/view/view_update_check: check_needs_view_update_path(): filter out non-member hosts
We currently don't clean up the system_distributed.view_build_status
table after nodes are removed. This can cause a false-positive check for
whether view update generation is needed for streaming.
The proper fix is to clean up this table, but that will be more
involved, and even when done, it might not be immediate. So until then,
and to be on the safe side, filter out entries belonging to unknown
hosts from said table.

Fixes: #11905
Refs: #11836

Closes #11860
2023-01-27 17:12:45 +03:00
Botond Dénes
e2c9cdb576 mutation_compactor: only pass consumed range-tombstone-change to validator
Currently all consumed range tombstone changes are unconditionally
forwarded to the validator, even if they are shadowed by a higher-level
tombstone and/or are purgeable. This can result in a situation where a
range tombstone change was seen by the validator but not passed to the
consumer. The validator expects the range tombstone change to be closed
by end-of-partition, but the end fragment won't come as the tombstone was
dropped, resulting in a false-positive validation failure.
Fix by passing to the validator only those tombstones that are actually
passed to the consumer too.

Fixes: #12575

Closes #12578
2023-01-27 14:03:45 +01:00
Nadav Har'El
b99b83acdd docs/alternator: fix links to open issues
The docs/alternator/compatibility.md file links to various open issues
on unimplemented features. One of the links was to an already-closed
issue. Replace it by a link to an open issue that was missing.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #12649
2023-01-27 14:29:57 +02:00
Pavel Emelyanov
1f9f819c8c table: Remove unused column_family_directory() overload
There's another overload that accepts an explicit basedir as its first
argument, and that one is used by the rest of the code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #12643
2023-01-27 14:17:41 +02:00
Nadav Har'El
f873884b50 test/alternator: unskip test which works on modern Scylla
We had one test, test_gsi.py::test_gsi_identical, that didn't work on KA/LA
sstables due to #6157, so it was skipped. Today, Scylla no longer supports
writing these old sstable formats, so the test can never find itself
running on these versions, so it should pass. And indeed it does, and the
"skip" marker can be removed.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #12651
2023-01-27 14:10:07 +02:00
Botond Dénes
d358d4d9e9 Merge 'Configure sstable_test_env with tempdir' from Pavel Emelyanov
Today's sstable_test_env starts with a default-configured db::config and, thus, sstables_manager. Test cases that run in this env always create a tempdir on their own to store sstable files in. Upcoming patches make the sstables manager and friends fully control the data-dir path in order to support object storage for sstables in a nice way, and this behavior of the tests upsets that ongoing work.

That said, this PR configures sstable_test_env with a tempdir and pins down the test cases using it to stick to that directory, rather than to a custom one.

Closes #12641

* github.com:scylladb/scylladb:
  test: Use tempdir from sstable_test_env
  test: Add tmpdir to sstable test env
  test: Keep db::config as unique pointer
2023-01-27 13:59:12 +02:00
Avi Kivity
df09bf2670 tools: toolchain: dbuild: pass NOFILE limit from host to container
The leak sanitizer has a bug [1] where, if it detects a leak, it
forks something, and before that, it closes all files (instead of
using close_range like a good citizen).

Docker tends to create containers with the NOFILE limit (number of
open files) set to 1 billion.

The resulting 1 billion close() system calls is incredibly slow.

Work around that problem by passing the host NOFILE limit.

[1] https://github.com/llvm/llvm-project/issues/59112

Closes #12638
2023-01-27 13:56:35 +02:00
Benny Halevy
d2893f93cb view: row_lock: lock_ck: try_emplace row_lock entry
Use the same method as the two-level lock at the
partition level.  try_emplace will either use
an existing entry, if found, or create a new
entry otherwise.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-27 13:51:48 +02:00
Benny Halevy
4b5e324ecb view: row_lock: lock_ck: find or construct row_lock under partition lock
Since we're potentially searching the row_lock in parallel to acquiring
the read_lock on the partition, we're racing with row_locker::unlock
that may erase the _row_locks entry for the same clustering key, since
there is no lock to protect it up until the partition lock has been
acquired and the lock_partition future is resolved.

This change moves the code to search for or allocate the row lock
_after_ the partition lock has been acquired to make sure we're
synchronously starting the read/write lock function on it, without
yielding, to prevent this use-after-free.

This adds an allocation for copying the clustering key in advance
even if a row_lock entry already exists, which wasn't needed before.
It only slows us down (a bit) when there is contention and the lock
already existed when we want to take it. In the fast path there
is no contention, and then the code already had to create the lock
and copy the key. In any case, the penalty of copying the key once
is tiny compared to the rest of the work that view updates are doing.

This is required on top of 5007ded2c1 as
seen in https://github.com/scylladb/scylladb/issues/12632
which is closely related to #12168 but demonstrates a different race
causing use-after-free.

Fixes #12632

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-27 13:51:46 +02:00
Kamil Braun
fa9cf81af2 test: topology: verify that group 0 and token ring are consistent
After topology changes like removing a node, verify that the set of
group 0 members and the set of token ring members are the same.

Modify `get_token_ring_host_ids` to only return NORMAL members. The
previous version which used the `/storage_service/host_id` endpoint
might have returned non-NORMAL members as well.

Fixes: #12153

Closes #12619
2023-01-27 14:21:14 +03:00
Avi Kivity
f719de3357 Update seastar submodule
* seastar d41af8b59...943c09f86 (20):
  > reactor: disable io_uring on older kernels if not enough lockable memory is available
  > demos/tcp_sctp_client_demo: use user-defined literal for sizes
  > core/units: add user-defined literal for IEC prefixes
  > core/units: include what we use
  > coroutine/exception: do not include core/coroutine.hh
  > seastar/coroutine: drop std-coroutine.hh
  > core/bitops.hh: add type constraits to templates
  > apps/iotune: s/condition == false/!condition/
  > core/metrics_api: s/promehteus/prometheus/
  > reactor: make io_uring the default backend if available
  > tests: connect_test: use 127.0.0.1 for connect refused test
  > reactor: use aio to implement reactor_backend_uring::read()
  > future: schedule: get_available_state_ref under SEASTAR_DEBUG
  > rpc: client_info: add retrieve_auxiliary_opt
  > Merge 'Make http requests with content-length header and generated body' from Pavel Emelyanov
  > Merge 'Ensure logger doesn't allocate' from Travis Downs
  > http, httpd: optimize header field assignment
  > sstring: operator<< std::unordered_map: delete stray space char
  > Dump memory diagnostics at error level on abort
  > Fix CLI help for memory diagnostics dump

Closes #12650
2023-01-26 22:19:24 +02:00
Botond Dénes
d7ed92bb42 Merge 'Reduce the number of table::make_sstable() overloads' from Pavel Emelyanov
There are several helpers to make an sstable for a table, and the two with the most arguments are only used by tests. This PR leaves the table with just one arg-less call, thus making it easier to patch further.

Closes #12636

* github.com:scylladb/scylladb:
  table: Shrink sstables making API
  tests: Use sstables manager to make sstables
  distributed_loader: Add helpers to make sstables for reshape/reshard
2023-01-26 14:25:21 +02:00
Kamil Braun
5eadea301e Merge 'pytest: start after ungraceful stop' from Alecco
If a server is stopped suddenly (i.e. not gracefully), schema tables might
be in an inconsistent state. Add a test case and enable the Scylla
configuration option (force_schema_commit_log) to handle this.
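
The added scenario boils down to something like this (hypothetical manager
helper names, sketched after test/pylib):

```
# Hedged sketch; server_stop/server_start signatures are assumptions.
import pytest

@pytest.mark.asyncio
async def test_start_after_ungraceful_stop(manager):
    server = (await manager.running_servers())[0]
    await manager.server_stop(server.server_id)    # hard stop, not graceful
    await manager.server_start(server.server_id)   # must recover schema tables
```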

Fixes #12218

Closes #12630

* github.com:scylladb/scylladb:
  pytest: test start after ungraceful stop
  test.py: enable force_schema_commit_log
2023-01-26 12:08:33 +01:00
Kamil Braun
3eabe04f5d test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests
`ScyllaClusterManager` is used to run a sequence of test cases from
a single test file. Between two consecutive tests, if the previous test
left the cluster 'dirty', meaning the cluster cannot be reused, it would
put the old cluster to the pool with `is_dirty=True`, then get a new
cluster from the pool.

Between the `put` and the `get`, a concurrent test run (with its own
instance of `ScyllaClusterManager`) would start, because there was free
space in the pool.

This resulted in undesirable behavior when we ran tests with
`--repeat X` for a large `X`: we would start with e.g. 4 concurrent
runs of a test file, because the pool size was 4. As soon as one of the
runs freed up space in the pool, we would start another concurrent run.
Soon we'd end up with 8 concurrent runs. Then 16 concurrent runs. And so
on. We would have a large number of concurrent runs, even though the
original 4 runs didn't finish yet. All of these concurrent runs would
compete waiting on the pool, and waiting for space in the pool would
take longer and longer (the duration is linear w.r.t. the number of
concurrent competing runs). Tests would then time out because they would
have to wait too long.

Fix that by using the new `replace_dirty` function introduced to the
pool. This function frees up space by returning a dirty cluster and then
immediately takes it away to be used for a new cluster. Thanks to this,
we will only have at most as many concurrent runs as the pool size. For
example with --repeat 8 and pool size 4, we would run 4 concurrent runs
and start the 5th run only when one of the original 4 runs finishes,
then the 6th run when a second run finishes and so on.

Fixes #11757
2023-01-26 11:58:00 +01:00
Kamil Braun
b5ef57ecc2 test/pylib: pool: introduce replace_dirty
Used to atomically return a dirty object to the pool and then use the
space freed by this object to get another object. Unlike
`put(is_dirty=True)` followed by `get`, a concurrent waiter cannot take
away our space from us.

A piece of `get` was refactored into a private function `_build_and_get`,
which is also used in `replace_dirty`.
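
Conceptually, `replace_dirty` might look like the sketch below (a
hypothetical simplification of the pool; only the happy path is shown, not
the actual pylib code):

```
# Hedged sketch of the idea behind replace_dirty.
import asyncio

class Pool:
    def __init__(self, size, build, destroy):
        self._build, self._destroy = build, destroy
        self._sem = asyncio.Semaphore(size)   # free slots in the pool
        self._free = []                       # clean, reusable objects

    async def get(self):
        await self._sem.acquire()             # may let a concurrent waiter in
        return await self._build_and_get()

    async def replace_dirty(self, obj):
        # Destroy the dirty object and reuse its slot immediately,
        # without releasing the semaphore in between, so no concurrent
        # waiter can take the freed space away from us.
        await self._destroy(obj)
        return await self._build_and_get()

    async def _build_and_get(self):
        return self._free.pop() if self._free else await self._build()
```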
2023-01-26 11:58:00 +01:00
Kamil Braun
858803cc2c test/pylib: pool: replace steal with put(is_dirty=True)
The pool usage was kind of awkward previously: if the user of a pool
decided that a previously borrowed object should no longer be used,
it was their responsibility to destroy the object (releasing associated
resources and so on) and then call `steal()` on the pool to free space
for a new object.

Change the interface. Now the `Pool` constructor obtains a `destroy`
function in addition to the `build` function. The user calls the
function `put` to return both objects that are still usable and those
that aren't. For the latter, they set `is_dirty=True`. The pool will
'destroy' the object with the provided function, which could mean e.g.
releasing associated resources.

For example, instead of:
```
if self.cluster.is_dirty:
    self.clusters.stop()
    self.clusters.release_ips()
    self.clusters.steal()
else:
    self.clusters.put(self.cluster)
```
we can now use:
```
self.clusters.put(self.cluster, is_dirty=self.cluster.is_dirty)
```
(assuming that `self.clusters` is a pool constructed with a `destroy`
function that stops the cluster and releases its IPs.)

Also extend the interface of the context manager obtained by
`instance()` - the user must now pass a flag `dirty_on_exception`. If
the context manager exits due to an exception and that flag was `True`,
the object will be considered dirty. The dirty flag can also be set
manually on the context manager. For example:
```
async with (cm := pool.instance(dirty_on_exception=True)) as server:
    cm.dirty = await run_test(test, server)
    # It will also be considered dirty if run_test throws an exception
```
2023-01-26 11:58:00 +01:00
Pavel Emelyanov
dd307d8a42 test: Use tempdir from sstable_test_env
The test cases in sstable_directory_test use a temporary directory that
differs from the one the sstables manager is started with. Fix that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-01-26 11:47:06 +03:00
Pavel Emelyanov
0c3799db71 test: Add tmpdir to sstable test env
This adds the test/lib's tmpdir instance _and_ configures the
data_file_directories with this path. This makes sure the sstables manager
and the rest of the test use the same directory for sstables. For now
it doesn't change anything, but it helps upcoming patches.

(A neat side effect of this change is that sstable_test_env is now
 configured the same way as cql_test_env is.)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-01-26 11:47:06 +03:00
Pavel Emelyanov
9f4efd6b6f table: Shrink sstables making API
Currently there are four helpers; this patch reduces them to just two, and
one of them becomes private to the table, thus making the API small and neat
(and easy to patch further).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-01-26 10:47:39 +03:00
Pavel Emelyanov
fd559f3b81 tests: Use sstables manager to make sstables
This test uses two many-args helpers from the table class to create sstables
with the desired parameters. The table API in question is not used by any
other code but these few places, so it's better to open-code it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-01-26 10:47:39 +03:00
Pavel Emelyanov
bfddfb8927 distributed_loader: Add helpers to make sstables for reshape/reshard
This kills two birds with one stone. First, it factors out (quite a lot
of) common arguments that are passed to table.make_sstable(). Second, it
makes the helpers call the sstables manager with extended args, making it
possible to remove those wrappers from the table class later.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-01-26 10:47:39 +03:00
Botond Dénes
ba26770376 tools/schema_loader: data_dictionary_impl:try_find_table(): also check ks name
Although the number of keyspaces should mostly be 1 here, and thus the
chance of two tables from different keyspaces colliding is minuscule, it
is not zero. Better safe than sorry, so match the keyspace name too
when looking up a table.

Closes #12627
2023-01-25 22:04:07 +02:00
Raphael S. Carvalho
87ee547120 table: Fix quadratic behavior when inserting sstables into tracker on schema change
Each time the backlog tracker is informed about a new or old sstable, it
will recompute the static part of the backlog, whose complexity is
proportional to the total number of sstables.
On schema change, we're calling backlog_tracker::replace_sstables()
for each existing sstable, which therefore produces O(N ^ 2) complexity.

Fixes #12499.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #12593
2023-01-25 21:57:33 +02:00
Botond Dénes
bdd4b25c61 scylla-gdb.py: scylla memory: remove 'sstable reads' from semaphore names
This phrase is inaccurate and unnecessary. We know all lines in the
printout are for reads and they are semaphores: no need to repeat this
information on each line.
Example:

  Read Concurrency Semaphores:
    read:              0/100,             0/     41901096, queued: 0
    streaming:         0/ 10,             0/     41901096, queued: 0
    system:            0/ 10,             0/     41901096, queued: 0

Closes #12633
2023-01-25 21:55:27 +02:00