Commit Graph

32646 Commits

Author SHA1 Message Date
Pavel Emelyanov
db01be31c6 Backport 'Merge 'Enlighten messaging_service::shutdown()''
This includes seastar update titled
  'Merge 'Split rpc::server stop into two parts''

Includes backport of #12244 fix

* br-5.1-backport-ms-shutdown:
  messaging_service: Shutdown rpc server on shutdown
  messaging_service: Generalize stop_servers()
  messaging_service: Restore indentation after previous patch
  messaging_service: Coroutinize stop()
  messaging_service: Coroutinize stop_servers()
  messaging: Shutdown on stop() if it wasn't shut down earlier
  Update seastar submodule

refs: #14031
2023-06-14 09:28:56 +03:00
Pavel Emelyanov
87531915d9 messaging_service: Shutdown rpc server on shutdown
The RPC server now has a lighter .shutdown() method that just does what
m.s. shutdown() needs, so call it. On stop call regular stop to finalize
the stopping process

backport: The messaging_service::shutdown() had conflict due to missing
          e147681d85 commit

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-06-14 09:28:23 +03:00
Pavel Emelyanov
4075daf96d messaging_service: Generalize stop_servers()
Make it do_with_servers() and make it accept method to call and message
to print. This gives the ability to reuse this helper in next patch

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-06-14 09:28:23 +03:00
Pavel Emelyanov
b27c5567fa messaging_service: Restore indentation after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-06-14 09:28:23 +03:00
Pavel Emelyanov
8877f0b28a messaging_service: Coroutinize stop()
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-06-14 09:28:23 +03:00
Pavel Emelyanov
fabc7df720 messaging_service: Coroutinize stop_servers()
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-06-14 09:28:23 +03:00
Tomasz Grabiec
bdafe2b98c messaging: Shutdown on stop() if it wasn't shut down earlier
All rpc::client objects have to be stopped before they are
destroyed. Currently this is done in
messaging_service::shutdown(). The cql_test_env does not call
shutdown() currently. This can lead to use-after-free on the
rpc::client object, manifesting like this:

Segmentation fault on shard 0.
Backtrace:
column_mapping::~column_mapping() at schema.cc:?
db::cql_table_large_data_handler::internal_record_large_cells(sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long) const at ./db/large_data_handler.cc:180
operator() at ./db/large_data_handler.cc:123
 (inlined by) seastar::future<void> std::__invoke_impl<seastar::future<void>, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long>(std::__invoke_other, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*&&, column_definition const&, unsigned long&&, unsigned long&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:61
 (inlined by) std::enable_if<is_invocable_r_v<seastar::future<void>, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long>, seastar::future<void> >::type std::__invoke_r<seastar::future<void>, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long>(db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*&&, column_definition const&, unsigned long&&, unsigned long&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:114
 (inlined by) std::_Function_handler<seastar::future<void> (sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long), db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1>::_M_invoke(std::_Any_data const&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const*&&, column_definition const&, unsigned long&&, unsigned long&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:290
std::function<seastar::future<void> (sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long)>::operator()(sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long) const at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:591
 (inlined by) db::cql_table_large_data_handler::record_large_cells(sstables::sstable const&, sstables::key const&, clustering_key_prefix const*, column_definition const&, unsigned long, unsigned long) const at ./db/large_data_handler.cc:175
seastar::rpc::log_exception(seastar::rpc::connection&, seastar::log_level, char const*, std::__exception_ptr::exception_ptr) at ./build/release/seastar/./seastar/src/rpc/rpc.cc:109
operator() at ./build/release/seastar/./seastar/src/rpc/rpc.cc:788
operator() at ./build/release/seastar/./seastar/include/seastar/core/future.hh:1682
 (inlined by) void seastar::futurize<seastar::future<void> >::satisfy_with_result_of<seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14>(seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&)#1}::operator()(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&) const::{lambda()#1}>(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14>(seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&)#1}::operator()(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&) const::{lambda()#1}&&) at ./build/release/seastar/./seastar/include/seastar/core/future.hh:2134
 (inlined by) operator() at ./build/release/seastar/./seastar/include/seastar/core/future.hh:1681
 (inlined by) seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14>(seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void*, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>::run_and_dispose() at ./build/release/seastar/./seastar/include/seastar/core/future.hh:781
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at ./build/release/seastar/./seastar/src/core/reactor.cc:2319
 (inlined by) seastar::reactor::run_some_tasks() at ./build/release/seastar/./seastar/src/core/reactor.cc:2756
seastar::reactor::do_run() at ./build/release/seastar/./seastar/src/core/reactor.cc:2925
seastar::reactor::run() at ./build/release/seastar/./seastar/src/core/reactor.cc:2808
seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at ./build/release/seastar/./seastar/src/core/app-template.cc:265
seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at ./build/release/seastar/./seastar/src/core/app-template.cc:156
operator() at ./build/release/seastar/./seastar/src/testing/test_runner.cc:75
 (inlined by) void std::__invoke_impl<void, seastar::testing::test_runner::start_thread(int, char**)::$_0&>(std::__invoke_other, seastar::testing::test_runner::start_thread(int, char**)::$_0&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:61
 (inlined by) std::enable_if<is_invocable_r_v<void, seastar::testing::test_runner::start_thread(int, char**)::$_0&>, void>::type std::__invoke_r<void, seastar::testing::test_runner::start_thread(int, char**)::$_0&>(seastar::testing::test_runner::start_thread(int, char**)::$_0&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:111
 (inlined by) std::_Function_handler<void (), seastar::testing::test_runner::start_thread(int, char**)::$_0>::_M_invoke(std::_Any_data const&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:290
std::function<void ()>::operator()() const at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:591
 (inlined by) seastar::posix_thread::start_routine(void*) at ./build/release/seastar/./seastar/src/core/posix.cc:73

Fix by making sure that shutdown() is called prior to destruction.

Fixes #12244

Closes #12276
2023-06-14 09:28:23 +03:00
Pavel Emelyanov
d78bc60a74 Update seastar submodule
* seastar 09063faa...8d7cc312 (1):
  > rpc: Introduce server::shutdown()

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-06-14 09:21:20 +03:00
Raphael S. Carvalho
97985a68a1 compaction: Fix incremental compaction for sstable cleanup
After c7826aa910, sstable runs are cleaned up together.

The procedure which executes cleanup was holding reference to all
input sstables, such that it could later retry the same cleanup
job on failure.

Turns out it was not taking into account that incremental compaction
will exhaust the input set incrementally.

Therefore cleanup is affected by the 100% space overhead.

To fix it, cleanup will now have the input set updated, by removing
the sstables that were already cleaned up. On failure, cleanup
will retry the same job with the remaining sstables that weren't
exhausted by incremental compaction.

New unit test reproduces the failure, and passes with the fix.

Fixes #14035.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #14038

(cherry picked from commit 23443e0574)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #14195
2023-06-13 09:57:59 +03:00
Avi Kivity
c1278994d8 Merge 'multishard_mutation_query: make reader_context::lookup_readers() exception safe' from Botond Dénes
With regards to closing the looked-up querier if an exception is thrown. In particular, this requires closing the querier if a semaphore mismatch is detected. Move the table lookup above the line where the querier is looked up, to avoid having to handle the exception from it. As a consequence of closing the querier on the error path, the lookup lambda has to be made a coroutine. This is sad, but this is executed once per page, so its cost should be insignificant when spread over an
entire page worth of work.

Also add a unit test checking that the mismatch is detected in the first place and that readers are closed.

Fixes: #13784

Closes #13790

* github.com:scylladb/scylladb:
  test/boost/database_test: add unit test for semaphore mismatch on range scans
  partition_slice_builder: add set_specific_ranges()
  multishard_mutation_query: make reader_context::lookup_readers() exception safe
  multishard_mutation_query: lookup_readers(): make inner lambda a coroutine

(cherry picked from commit 1c0e8c25ca)
2023-06-08 05:12:12 -04:00
Michał Chojnowski
c33fb41802 data_dictionary: fix forgetting of UDTs on ALTER KEYSPACE
Due to a simple programming oversight, one of keyspace_metadata
constructors is using empty user_types_metadata instead of the
passed one. Fix that.

Fixes #14139

Closes #14143

(cherry picked from commit 1a521172ec)
2023-06-06 21:53:03 +03:00
Kamil Braun
bca4bf6c11 auth: don't use infinite timeout in default_role_row_satisfies query
A long long time ago there was an issue about removing infinite timeouts
from distributed queries: #3603. There was also a fix:
620e950fc8. But apparently some queries
escaped the fix, like the one in `default_role_row_satisfies`.

With the right conditions and timing this query may cause a node to hang
indefinitely on shutdown. A node tries to perform this query after it
starts. If we kill another node which is required to serve this query
right before that moment, the query will hang; when we try to shutdown
the querying node, it will wait for the query to finish (it's a
background task in auth service), which it never does due to infinite
timeout.

Use the same timeout configuration as other queries in this module do.

Fixes #13545.

Closes #14134

(cherry picked from commit f51312e580)
2023-06-06 19:39:55 +03:00
Anna Mikhlin
cf08b19dad release: prepare for 5.1.12 scylla-5.1.12 2023-06-05 18:13:42 +03:00
Vlad Zolotarov
0d5751b4b6 scylla_prepare: correctly handle a former 'MQ' mode
Fixes a regression introduced in 80917a1054:
"scylla_prepare: stop generating 'mode' value in perftune.yaml"

When cpuset.conf contains a "full" CPU set the negation of it from
the "full" CPU set is going to generate a zero mask as a irq_cpu_mask.
This is an illegal value that will eventually end up in the generated
perftune.yaml, which in line will make the scylla service fail to start
until the issue is resolved.

In such a case a irq_cpu_mask must represent a "full" CPU set mimicking
a former 'MQ' mode.

\Fixes scylladb/scylladb#11701
Tested:
 - Manually on a 2 vCPU VM in an 'auto-selection' mode.
 - Manually on a large VM (48 vCPUs) with an 'MQ' manually
   enforced.
Message-Id: <20221004004237.2961246-1-vladz@scylladb.com>

(cherry picked from commit 8195dab92a)
2023-06-04 19:26:04 +03:00
Vlad Zolotarov
1c7cdf68d6 scylla_prepare + scylla_cpuset_setup: make scylla_cpuset_setup idempotent without introducing regressions
This patch fixes the regression introduced by 3a51e78 which broke
a very important contract: perftune.yaml should not be "touched"
by Scylla scriptology unless explicitly requested.

And a call for scylla_cpuset_setup is such an explicit request.

The issue that the offending patch was intending to fix was that
cpuset.conf was always generated anew for every call of
scylla_cpuset_setup - even if a resulting cpuset.conf would come
out exactly the same as the one present on the disk before tha call.

And since the original code was following the contract mentioned above
it was also deleting perftune.yaml every time too.
However, this was just an unavoidable side-effect of that cpuset.conf
re-generation.

The above also means that if scylla_cpuset_setup doesn't write to cpuset.conf
we should not "touch" perftune.yaml and vise versa.

This patch implements exactly that together with reverting the dangerous
logic introduced by 3a51e78.

\Fixes scylladb/scylladb#11385
\Fixes scylladb/scylladb#10121

(cherry picked from commit c538cc2372)
2023-06-04 19:25:41 +03:00
Vlad Zolotarov
8fc0591f98 scylla_prepare: stop generating 'mode' value in perftune.yaml
Modern perftune.py supports a more generic way of defining IRQ CPUs:
'irq_cpu_mask'.

This patch makes our auto-generation code create a perftune.yaml
that uses this new parameter instead of using outdated 'mode'.

As a side effect, this change eliminates the notion of "incorrect"
value in cpuset.conf - every value is valid now as long as it fits into
the 'all' CPU set of the specific machine.

Auto-generated 'irq_cpu_mask' is going to include all bits from 'all'
CPU mask except those defined in cpuset.conf.

\Fixes scylladb/scylladb#9903

(cherry picked from commit 80917a1054)
2023-06-04 19:25:27 +03:00
Pavel Emelyanov
a0fa2d043d Update seastar submodule
* seastar a6389d17...09063faa (1):
  > rpc: Wait for server socket to stop before killing conns

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-30 20:03:52 +03:00
Botond Dénes
50a3cc6b90 compatible_ring_position_or_view: make it cheap to copy
This class exists for one purpose only: to serve as glue code between
dht::ring_position and boost::icl::interval_map. The latter requires
that keys in its intervals are:
* default constructible
* copyable
* have standalone compare operations

For this reason we have to wrap `dht::ring_position` in a class,
together with a schema to provide all this. This is
`compatible_ring_position`. There is one further requirement by code
using the interval map: it wants to do lookups without copying the
lookup key(s). To solve this, we came up with
`compatible_ring_position_or_view` which is a union of a key or a key
view + schema. As we recently found out, boost::icl copies its keys **a
lot**. It seems to assume these keys are cheap to copy and carelessly
copies them around even when iterating over the map. But
`compatible_ring_position_or_view` is not cheap to copy as it copies a
`dht::ring_position` which allocates, and it does that via an
`std::optional` and `std::variant` to add insult to injury.
This patch make said class cheap to copy, by getting rid of the variant
and storing the `dht::ring_position` via a shared pointer. The view is
stored separately and either points to the ring position stored in the
shared pointer or to an outside ring position (for lookups).

Fixes: #11669

Closes #11670

(cherry picked from commit 169a8a66f2)
2023-05-25 17:30:40 +03:00
Botond Dénes
cfa8fa1d77 Merge 'Backport compaction reevaluation fixes to branch-5.1' from Raphael "Raph" Carvalho
Fixes #13429.
Fixes #12390.
Fixes #13430.

Closes #14009

* github.com:scylladb/scylladb:
  compaction: Make compaction reevaluation actually periodic
  compaction_manager: Fix reactor stalls during periodic submissions
  compaction_manager: reindent postponed_compactions_reevaluation()
  compaction_manager: coroutinize postponed_compactions_reevaluation()
  compaction_manager: make postponed_compactions_reevaluation() return a future
  replica: Reevaluate regular compaction on off-strategy completion
2023-05-25 07:55:17 +03:00
Raphael S. Carvalho
6cdd5ccabd compaction: Make compaction reevaluation actually periodic
The manager intended to periodically reevaluate compaction need for
each registered table. But it's not working as intended.
The reevaluation is one-off.

This means that compaction was not kicking in later for a table, with
low to none write activity, that had expired data 1 hour from now.

Also make sure that reevaluation happens within the compaction
scheduling group.

Fixes #13430.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 156ac0a67a)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-05-23 21:30:47 -03:00
Raphael S. Carvalho
204baa0c1e compaction_manager: Fix reactor stalls during periodic submissions
Every 1 hour, compaction manager will submit all registered table_state
for a regular compaction attempt, all without yielding.

This can potentially cause a reactor stall if there are 1000s of table
states, as compaction strategy heuristics will run on behalf of each,
and processing all buckets and picking the best one is not cheap.
This problem can be magnified with compaction groups, as each group
is represented by a table state.

This might appear in dashboard as periodic stalls, every 1h, misleading
the investigator into believing that the problem is caused by a
chronological job.

This is fixed by piggybacking on compaction reevaluation loop which
can yield between each submission attempt if needed.

Fixes #12390.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #12391

(cherry picked from commit 67ebd70e6e)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-05-23 21:18:36 -03:00
Avi Kivity
3556d2b4e8 compaction_manager: reindent postponed_compactions_reevaluation()
(cherry picked from commit d2b1d2f695)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-05-23 21:18:18 -03:00
Avi Kivity
6b699c9667 compaction_manager: coroutinize postponed_compactions_reevaluation()
So much nicer.

(cherry picked from commit 1669025736)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-05-23 21:17:58 -03:00
Avi Kivity
316ea63ea0 compaction_manager: make postponed_compactions_reevaluation() return a future
postponed_compactions_reevaluation() runs until compaction_manager is
stopped, checking if it needs to launch new compactions.

Make it return a future instead of stashing its completion somewhere.
This makes is easier to convert it to a coroutine.

(cherry picked from commit d2c44cba77)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-05-23 21:17:26 -03:00
Raphael S. Carvalho
bafde878ba replica: Reevaluate regular compaction on off-strategy completion
When off-strategy compaction completes, regular compaction is not triggered.

If off-strategy output causes the table's SSTable set to not conform the strategy
goal, it means that read and space amplification will be suboptimal until the next
compaction kicks in, which can take undefinite amount of time (e.g. when active
memtable is flushed).

Let's reevaluate compaction on main SSTable set when off-strategy ends.

Fixes #13429.

Backport note: conflict is around compaction_group vs table.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 2652b41606)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-05-23 21:06:14 -03:00
Yaron Kaikov
88eeab7838 release: prepare for 5.1.11 2023-05-22 15:14:22 +03:00
Raphael S. Carvalho
f6230b5eec sstables: Fix use-after-move when making reader in reverse mode
static report:
sstables/mx/reader.cc:1705:58: error: invalid invocation of method 'operator*' on object 'schema' while it is in the 'consumed' state [-Werror,-Wconsumed]
            legacy_reverse_slice_to_native_reverse_slice(*schema, slice.get()), pc, std::move(trace_state), fwd, fwd_mr, monitor);

Fixes #13394.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 213eaab246)
2023-05-15 20:27:51 +03:00
Raphael S. Carvalho
737285d342 db/view/build_progress_virtual_reader: Fix use-after-move
use-after-free in ctor, which potentially leads to a failure
when locating table from moved schema object.

static report
In file included from db/system_keyspace.cc:51:
./db/view/build_progress_virtual_reader.hh:202:40: warning: invalid invocation of method 'operator->' on object 's' while it is in the 'consumed' state [-Wconsumed]
                _db.find_column_family(s->ks_name(), system_keyspace::v3::SCYLLA_VIEWS_BUILDS_IN_PROGRESS),

Fixes #13395.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 1ecba373d6)
2023-05-15 20:26:17 +03:00
Raphael S. Carvalho
f1ee68e128 index/built_indexes_virtual_reader.hh: Fix use-after-move
static report:
./index/built_indexes_virtual_reader.hh:228:40: warning: invalid invocation of method 'operator->' on object 's' while it is in the 'consumed' state [-Wconsumed]
                _db.find_column_family(s->ks_name(), system_keyspace::v3::BUILT_VIEWS),

Fixes #13396.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit f8df3c72d4)
2023-05-15 20:24:52 +03:00
Raphael S. Carvalho
7d4abd9e64 replica: Fix use-after-move in table::make_streaming_reader
Variant used by
streaming/stream_transfer_task.cc:        , reader(cf.make_streaming_reader(cf.schema(), std::move(permit_), prs))

as full slice is retrieved after schema is moved (clang evaluates
left-to-right), the stream transfer task can be potentially working
on a stale slice for a particular set of partitions.

static report:
In file included from replica/dirty_memory_manager.cc:6:
replica/database.hh:706:83: error: invalid invocation of method 'operator->' on object 'schema' while it is in the 'consumed' state [-Werror,-Wconsumed]
        return make_streaming_reader(std::move(schema), std::move(permit), range, schema->full_slice());

Fixes #13397.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 04932a66d3)
2023-05-15 20:22:15 +03:00
Asias He
0b4b5c21ad tombstone_gc: Fix gc_before for immediate mode
The immediate mode is similar to timeout mode with gc_grace_seconds
zero. Thus, the gc_before returned should be the query_time instead of
gc_clock::time_point::max in immediate mode.

Setting gc_before to gc_clock::time_point::max, a row could be dropped
by compaction even if the ttl is not expired yet.

The following procedure reproduces the issue:

- Start 2 nodes

- Insert data

```
CREATE KEYSPACE ks2a WITH REPLICATION = { 'class' : 'SimpleStrategy',
'replication_factor' : 2 };
CREATE TABLE ks2a.tb (pk int, ck int, c0 text, c1 text, c2 text, PRIMARY
KEY(pk, ck)) WITH tombstone_gc = {'mode': 'immediate'};
INSERT into ks2a.tb (pk,ck, c0, c1, c2) values (10 ,1, 'x', 'y', 'z')
USING TTL 1000000;
INSERT into ks2a.tb (pk,ck, c0, c1, c2) values (20 ,1, 'x', 'y', 'z')
USING TTL 1000000;
INSERT into ks2a.tb (pk,ck, c0, c1, c2) values (30 ,1, 'x', 'y', 'z')
USING TTL 1000000;
```

- Run nodetool flush and nodetool compact

- Compaction drops all data

```
~128 total partitions merged to 0.
```

Fixes #13572

Closes #13800

(cherry picked from commit 7fcc403122)
2023-05-15 10:34:16 +03:00
Takuya ASADA
3c7d2a3284 scylla_kernel_check: suppress verbose iotune messages
Stop printing verbose iotune messages while the check, just print error
message.

Fixes #13373.

Closes #13362

(cherry picked from commit 160c184d0b)
2023-05-14 21:26:15 +03:00
Benny Halevy
d7e65a1a0a view: view_builder: start: demote sleep_aborted log error
This is not really an error, so print it in debug log_level
rather than error log_level.

Fixes #13374

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #13462

(cherry picked from commit cc42f00232)
2023-05-14 21:22:21 +03:00
Raphael S. Carvalho
73d80d55d1 Fix use-after-move when initializing row cache with dummy entry
Courtersy of clang-tidy:
row_cache.cc:1191:28: warning: 'entry' used after it was moved [bugprone-use-after-move]
_partitions.insert(entry.position().token().raw(), std::move(entry), dht::ring_position_comparator{_schema});
^
row_cache.cc:1191:60: note: move occurred here
_partitions.insert(entry.position().token().raw(), std::move(entry), dht::ring_position_comparator{_schema});
^
row_cache.cc:1191:28: note: the use and move are unsequenced, i.e. there is no guarantee about the order in which they are evaluated
_partitions.insert(entry.position().token().raw(), std::move(entry), dht::ring_position_comparator{*_schema});

The use-after-move is UB, as for it to happen, depends on evaluation order.

We haven't hit it yet as clang is left-to-right.

Fixes #13400.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #13401

(cherry picked from commit d2d151ae5b)
2023-05-14 21:02:52 +03:00
Anna Mikhlin
37538f00f5 release: prepare for 5.1.10 scylla-5.1.11 scylla-5.1.10 2023-05-08 22:11:10 +03:00
Botond Dénes
1cb11e7e2f Update seastar submodule
* seastar 84858fde...a6389d17 (2):
  > core/on_internal_error: always log error with backtrace
  > on_internal_error: refactor log_error_and_backtrace

Fixes: #13786
2023-05-08 10:36:55 +03:00
Marcin Maliszkiewicz
f4200098ce db: view: use deferred_close for closing staging_sstable_reader
When consume_in_thread throws the reader should still be closed.

Related https://github.com/scylladb/scylla-enterprise/issues/2661

Closes #13398
Refs: scylladb/scylla-enterprise#2661
Fixes: #13413

(cherry picked from commit 99f8d7dcbe)
2023-05-08 09:45:54 +03:00
Botond Dénes
f751613924 Merge 'service:forward_service: use long type instead of counter in function mocking' from Michał Jadwiszczak
Aggregation query on counter column is failing because forward_service is looking for function with counter as an argument and such function doesn't exist. Instead the long type should be used.

Fixes: #12939

Closes #12963

* github.com:scylladb/scylladb:
  test:boost: counter column parallelized aggregation test
  service:forward_service: use long type when column is counter

(cherry picked from commit 61e67b865a)
2023-05-07 14:29:33 +03:00
Michał Jadwiszczak
b38d56367f test/boost/cql_query_test: enable parallelized_aggregation
Run tests for parallelized aggregation with
`enable_parallelized_aggregation` set always to true, so the tests work
even if the default value of the option is false.

Closes #12409

(cherry picked from commit 83bb77b8bb)

Ref #12939.
2023-05-07 14:29:33 +03:00
Anna Stuchlik
991f4ab104 doc: remove the sequential repair option from docs
Fixes https://github.com/scylladb/scylladb/issues/12132

The sequential repair mode is not supported. This commit
removes the incorrect information from the documentation.

Closes #13544

(cherry picked from commit 3d25edf539)
2023-05-07 14:29:33 +03:00
Nadav Har'El
c72058199f cql: fix empty aggregation, and add more tests
This patch fixes #12475, where an aggregation (e.g., COUNT(*), MIN(v))
of absolutely no partitions (e.g., "WHERE p = null" or "WHERE p in ()")
resulted in an internal error instead of the "zero" result that each
aggregator expects (e.g., 0 for COUNT, null for MIN).

The problem is that normally our aggregator forwarder picks the nodes
which hold the relevant partition(s), forwards the request to each of
them, and then combines these results. When there are no partitions,
the query is sent to no node, and we end up with an empty result set
instead of the "zero" results. So in this patch we recognize this
case and build those "zero" results (as mentioned above, these aren't
always 0 and depend on the aggregation function!).

The patch also adds two tests reproducing this issue in a fairly general
way (e.g., several aggregators, different aggregation functions) and
confirming the patch fixes the bug.

The test also includes two additional tests for COUNT aggregation, which
uncovered an incompatibility with Cassandra which is still not fixed -
so these tests are marked "xfail":

Refs #12477: Combining COUNT with GROUP by results with empty results
             in Cassandra, and one result with empty count in Scylla.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #12715

(cherry picked from commit 3ba011c2be)
2023-05-04 14:18:06 +03:00
Botond Dénes
c69e1a4e12 readers: evictable_reader: skip progress guarantee when next pos is partition start
The evictable reader must ensure that each buffer fill makes forward
progress, i.e. the last fragment in the buffer has a position larger
than the last fragment from the last buffer-fill. Otherwise, the reader
could get stuck in an infinite loop between buffer fills, if the reader
is evicted in-between.
The code guranteeing this forward change has a bug: when the next
expected position is a partition-start (another partition), the code
would loop forever, effectively reading all there is from the underlying
reader.
To avoid this, add a special case to ignore the progress guarantee loop
altogether when the next expected position is a partition start. In this
case, progress is garanteed anyway, because there is exactly one
partition-start fragment in each partition.

Fixes: #13491

Closes #13563

(cherry picked from commit 72003dc35c)
2023-05-02 21:26:52 +03:00
Anna Stuchlik
62c737aa49 doc: fixes https://github.com/scylladb/scylladb/issues/12964, removes the information that the CDC options are experimental
Closes #12973

(cherry picked from commit 4dd1659d0b)
2023-04-27 21:06:58 +03:00
Raphael S. Carvalho
edeec94a89 replica: Fix undefined behavior in table::generate_and_propagate_view_updates()
Undefined behavior because the evaluation order is undefined.

With GCC, where evaluation is right-to-left, schema will be moved
once it's forwarded to make_flat_mutation_reader_from_mutations_v2().

The consequence is that memory tracking of mutation_fragment_v2
(for tracking only permit used by view update), which uses the schema,
can be incorrect. However, it's more likely that Scylla will crash
when estimating memory usage for row, which access schema column
information using schema::column_at(), which in turn asserts that
the requested column does really exist.

Fixes #13093.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #13092

(cherry picked from commit 3fae46203d)
2023-04-27 20:49:39 +03:00
Anna Stuchlik
95a63edea8 doc: remove incorrect info about BYPASS CACHE
Fixes https://github.com/scylladb/scylladb/issues/13106

This commit removes the information that BYPASS CACHE
is an Enterprise-only feature and replaces that info
with the link to the BYPASS CACHE description.

Closes #13316

(cherry picked from commit 1cfea1f13c)
2023-04-27 20:49:39 +03:00
Kefu Chai
e04eef29a8 dist/redhat: enforce dependency on %{release} also
* tools/python3 bf6e892...4b04b46 (1):
  > dist: redhat: provide only a single version

s/%{version}/%{version}-%{release}/ in `Requires:` sections.

this enforces the runtime dependencies of exactly the same
releases between scylla packages.

Fixes #13222
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
(cherry picked from commit 7165551fd7)
2023-04-27 20:49:39 +03:00
Nadav Har'El
c14ccceec9 test/rest_api: fix flaky test for toppartitions
The REST test test_storage_service.py::test_toppartitions_pk_needs_escaping
was flaky. It tests the toppartition request, which unfortunately needs
to choose a sampling duration in advance, and we chose 1 second which we
considered more than enough - and indeed typically even 1ms is enough!
but very rarely (only know of only one occurance, in issue #13223) one
second is not enough.

Instead of increasing this 1 second and making this test even slower,
this patch takes a retry approach: The tests starts with a 0.01 second
duration, and is then retried with increasing durations until it succeeds
or a 5-seconds duration is reached. This retry approach has two benefits:
1. It de-flakes the test (allowing a very slow test to take 5 seconds
instead of 1 seconds which wasn't enough), and 2. At the same time it
makes a successful test much faster (it used to always take a full
second, now it takes 0.07 seconds on a dev build on my laptop).

A *failed* test may, in some cases, take 10 seconds after this patch
(although in some other cases, an error will be caught immediately),
but I consider this acceptable - this test should pass, after all,
and a failure indicates a regression and taking 10 seconds will be
the last of our worries in that case.

Fixes #13223.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13238

(cherry picked from commit c550e681d7)
2023-04-27 20:49:39 +03:00
Nadav Har'El
29e3cd80fb test/alternator: increase CQL connection timeout
This patch increases the connection timeout in the get_cql_cluster()
function in test/cql-pytest/run.py. This function is used to test
that Scylla came up, and also test/alternator/run uses it to set
up the authentication - which can only be done through CQL.

The Python driver has 2-second and 5-second default timeouts that should
have been more than enough for everybody (TM), but in #13239 we saw
that in one case it apparently wasn't enough. So to be extra safe,
let's increase the default connection-related timeouts to 60 seconds.

Note this change only affects the Scylla *boot* in the test/*/run
scripts, and it does not affect the actual tests - those have different
code to connect to Scylla (see cql_session() in test/cql-pytest/util.py),
and we already increased the timeouts there in #11289.

Fixes #13239

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13291

(cherry picked from commit 4fdcee8415)
2023-04-27 20:49:39 +03:00
Tomasz Grabiec
067f94d3f8 direct_failure_detector: Avoid throwing exceptions in the success path
sleep_abortable() is aborted on success, which causes sleep_aborted
exception to be thrown. This causes scylla to throw every 100ms for
each pinged node. Throwing may reduce performance if happens often.

Also, it spams the logs if --logger-log-level exception=trace is enabled.

Avoid by swallowing the exception on cancellation.

Fixes #13278.

Closes #13279

(cherry picked from commit 99cb948eac)
2023-04-27 20:49:35 +03:00
Avi Kivity
f2f9f26b79 Update seastar submodule
* seastar acdf7dca9b...84858fde99 (1):
  > http: request_parser: fix grammar ambiguity in field_content

Fixes #12468

Related upstream commit: 42575340ba
2023-04-27 17:43:21 +03:00