Commit Graph

43495 Commits

Author SHA1 Message Date
Gleb Natapov
87beebeed0 paxos: do not signal semaphore if it was not acquired
The guard signals a semaphore during destruction if it is marked as
locked, but currently it may be marked as locked even if locking failed.
Fix this by using semaphore_units instead of managing the locked flag
manually.

Fixes: https://github.com/scylladb/scylladb/issues/19698
2024-07-16 12:32:25 +03:00
Kefu Chai
9a71543fd2 github: always use the tools/toolchain/image for lint workflows
instead of hardwiring the toolchain image in github workflows, read it
from `tools/toolchain/image`. a dedicated reusable workflow is added to
read from this file, and expose its content with an output parameter.

also, switch iwyu.yaml workflow to this image, more maintainable this
way. please note, before this change, we are also using the latest
stable build of clang, and since fedora 40 is also using the clang 18,
so the behavior is not change. but with this change, we don't have
the flexibility of using other clang versions provided
https://apt.llvm.org in future.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19655
2024-07-10 23:45:35 +03:00
Avi Kivity
65a7fc9902 Merge 'transport, service: move definition of destructors into .cc' from Kefu Chai
this changeset includes two changes:

- service: move storage_service::~storage_service() into .cc
- transport: move the cql_server::~cql_server() into .cc

they intends to address the compile failures when building scylladb with clang-19. clang-19 is more picky when generating the defaulted destructors with incomplete types. but its behavior makes sense regarding to standard compliance. so let's update accordingly.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#19668

* github.com:scylladb/scylladb:
  transport: move the cql_server::~cql_server() into .cc
  service: move storage_service::~storage_service() into .cc
2024-07-10 23:43:16 +03:00
Kefu Chai
06ba523818 sstable: extract file_writer out
`sstables::write()` has multiple overloads, which are defined in
`sstables/writer.hh`. two of these overloads are template functions,
which have a template parameter named `W`, which has a type constraint
requiring it to fulfill the `Writer` concept. but in `types.hh`, when
the compiler tries to instantiate the template function with signature
of `write(sstable_version_types v, W& out, const T& t)` with
`file_writer` as the template parameter of `w`, `file_writer` is only
forward-declared using `class file_writer` in the same header file, so
this type is still an incomplete type at that moment. that's why the
compiler is not able to determine if `file_writer` fulfills the
constraint or not. actually, the declaration of `file_writer` is located
in `sstables/writer.hh`, which in turn includes `types.hh`. so they
form a cyclic dependency.

in this change, in order to break this cycle, we extract file_writer out
into a separate header file, so that both `sstables/writer.hh` and
`sstables/types.hh` can include it. this address the build failure.

Fixes scylladb/scylladb#19667
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19669
2024-07-10 23:32:47 +03:00
Michał Chojnowski
fdd8b03d4b scylla-gdb.py: add $coro_frame()
Adds a convenience function for inspecting the coroutine frame of a given
seastar task.

Short example of extracting a coroutine argument:

```
(gdb) p *$coro_frame(seastar::local_engine->_current_task)
$1 = {
  __resume_fn = 0x2485f80 <sstables::parse(schema const&, sstables::sstable_version_types, sstables::random_access_reader&, sstables::statistics&)>,
  ...
  PointerType_7 = 0x601008e67880,
  ...
  __coro_index = 0 '\000'
  ...
(gdb) p $downcast_vptr($->PointerType_7)
  $2 = (schema *) 0x601008e67880
```

Closes scylladb/scylladb#19479
2024-07-10 21:46:27 +03:00
Avi Kivity
45e27c0da2 config, enum_option: allow round-trip string conversion
The default configuration for replication_strategy_warn_list is
["SimpleStrategy"], but one cannot set this via CQL:

cqlsh> select * from system.config where name = 'replication_strategy_warn_list';

 name                           | source  | type                      | value
--------------------------------+---------+---------------------------+--------------------
 replication_strategy_warn_list | default | replication strategy list | ["SimpleStrategy"]

(1 rows)
cqlsh> update system.config set value  = '[NetworkTopologyStrategy]' where name = 'replication_strategy_warn_list';
cqlsh> select * from system.config where name = 'replication_strategy_warn_list';

 name                           | source | type                      | value
--------------------------------+--------+---------------------------+-----------------------------
 replication_strategy_warn_list |    cql | replication strategy list | ["NetworkTopologyStrategy"]

(1 rows)
cqlsh> update system.config set value  = '["NetworkTopologyStrategy"]' where name = 'replication_strategy_warn_list';
WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed for system.config - received 0 responses and 1 failures from 1 CL=ONE." info={'consistency': 'ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 1}

Fix by allowing quotes in enum_set parsing.

Bug present since 8c464b2ddb ("guardrails: restrict
replication strategy (RS)", 6.0).

Fixes #19604.

Closes scylladb/scylladb#19605
2024-07-10 20:39:01 +03:00
Yaron Kaikov
e33126fc3e .github/script/label_promoted_commit.py: add label only if ref is PR
we got a failure during check-commit action:
```
Run python .github/scripts/label_promoted_commits.py --commit_before_merge 30e82a81e8 --commit_after_merge f31d5e3204 --repository scylladb/scylladb --ref refs/heads/master

Commit sha is: d5a149fc01
Commit sha is: 415457be2b
Commit sha is: d3b1ccd03a
Commit sha is: 1fca341514
Commit sha is: f784be6a7e
Commit sha is: 80986c17c3
Commit sha is: 492d0a5c86
Commit sha is: 7b3f55a65f
Commit sha is: 78d6471ce4
Commit sha is: 7a69d9070f
Commit sha is: a9e985fcc9
master branch, pr number is: 19213
Traceback (most recent call last):
  File "/home/runner/work/scylladb/scylladb/.github/scripts/label_promoted_commits.py", line 87, in <module>
    main()
  File "/home/runner/work/scylladb/scylladb/.github/scripts/label_promoted_commits.py", line 81, in main
    pr = repo.get_pull(pr_number)
  File "/usr/lib/python3/dist-packages/github/Repository.py", line 2746, in get_pull
    headers, data = self._requester.requestJsonAndCheck(
  File "/usr/lib/python3/dist-packages/github/Requester.py", line 353, in requestJsonAndCheck
    return self.__check(
  File "/usr/lib/python3/dist-packages/github/Requester.py", line 378, in __check
    raise self.__createException(status, responseHeaders, output)
github.GithubException.UnknownObjectException: 404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/pulls/pulls#get-a-pull-request", "status": "404"}
Error: Process completed with exit code 1.
```

The reason for this failure is since in one of the promoted commits
(a9e985fcc9) had a reference of `Closes`
to an issue.

Fixes: https://github.com/scylladb/scylladb/issues/19677

Closes scylladb/scylladb#19678
2024-07-10 15:27:12 +03:00
Botond Dénes
9bdcba7a46 Merge 'conf: scylla.yaml: update documentation for tablets' from Benny Halevy
Tablets are no longer in experimental_features since 83d491a, so remove them from the experimental_features section documentation.

Also, expand the documentation for the `enable_tablets` option.

Fixes #19456

Needs backport to 6.0

Closes scylladb/scylladb#19516

* github.com:scylladb/scylladb:
  conf: scylla.yaml: enable_tablets: expand documentation
  conf: scylla.yaml: remove tablets from experimental_features doc comment
2024-07-10 14:32:40 +03:00
Nadav Har'El
c6cffe36dd Merge 'cql: forbid having counter columns in tablets tables' from Piotr Smaron
Counter updates break under tablet migration (#18180), and for this reason counters need to be disabled until the problem is fixed. It's enough to forbid creating a table with counters, as altering a table without counters already cannot result in the table having counters:
1) Adding a counter column to a table without counters:
```
cqlsh> ALTER TABLE temp.cf ADD (col_name counter);
ConfigurationException: Cannot add a counter column (col_name) in a non counter column family
```
2) Altering a column to be of the counter type:
```
cqlsh> ALTER TABLE temp.cf ALTER col_name TYPE counter;
ConfigurationException: Cannot change col_name from type int to type counter: types are incompatible.
```

Fixes: #19449
Fixes: https://github.com/scylladb/scylladb/issues/18876

Need to backport to 6.0, as this is broken there.

Closes scylladb/scylladb#19518

* github.com:scylladb/scylladb:
  doc: add notes to feature pages which don't support tablets
  cql: adjust warning about tablets
  cql: forbid having counter columns in tablets tables
2024-07-10 10:18:30 +03:00
Kefu Chai
7e4e685964 transport: move the cql_server::~cql_server() into .cc
because transport/server.cc has the complete definition of event_notifier, the
compiler can default-generate the destructor of `cql_server` with the necessary
information. otherwise, clang-19 would fail to build, like:

```
FAILED: CMakeFiles/scylla.dir/Dev/main.cc.o
/home/kefu/.local/bin/clang++ -DBOOST_PROGRAM_OPTIONS_DYN_LINK -DBOOST_PROGRAM_OPTIONS_NO_LIB -DDEVEL -DFMT_SHARED -DSCYLLA_BUILD_MODE=dev -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Dev\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -I/home/kefu/dev/scylladb/build -isystem /home/kefu/dev/scylladb/build/rust -isystem /home/kefu/dev/scylladb/abseil -O2 -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -MD -MT CMakeFiles/scylla.dir/Dev/main.cc.o -MF CMakeFiles/scylla.dir/Dev/main.cc.o.d -o CMakeFiles/scylla.dir/Dev/main.cc.o -c /home/kefu/dev/scylladb/main.cc
In file included from /home/kefu/dev/scylladb/main.cc:11:
In file included from /usr/include/yaml-cpp/yaml.h:10:
In file included from /usr/include/yaml-cpp/parser.h:11:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/memory:78:
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unique_ptr.h:91:16: error: invalid application of 'sizeof' to an incomplete type 'cql_transport::cql_server::event_notifier'
   91 |         static_assert(sizeof(_Tp)>0,
      |                       ^~~~~~~~~~~
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unique_ptr.h:398:4: note: in instantiation of member function 'std::default_delete<cql_transport::cql_server::event_notifier>::operator()' requested here
  398 |           get_deleter()(std::move(__ptr));
      |           ^
/home/kefu/dev/scylladb/transport/server.hh:135:7: note: in instantiation of member function 'std::unique_ptr<cql_transport::cql_server::event_notifier>::~unique_ptr' requested here
  135 | class cql_server : public seastar::peering_sharded_service<cql_server>, public generic_server::server {
      |       ^
/home/kefu/dev/scylladb/transport/server.hh:135:7: note: in implicit destructor for 'cql_transport::cql_server' first required here
/home/kefu/dev/scylladb/transport/server.hh:149:11: note: forward declaration of 'cql_transport::cql_server::event_notifier'
  149 |     class event_notifier;
      |           ^
1 error generated.
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-07-10 12:52:51 +08:00
Kefu Chai
79ffde063a service: move storage_service::~storage_service() into .cc
as repair/repair.cc has the complete definition of node_ops_meta_data, the
compiler can default-generate the destructor of `storage_service` with the necessary
information. otherwise, clang-19 would fail to build, like:

```
FAILED: repair/CMakeFiles/repair.dir/Dev/repair.cc.o
/home/kefu/.local/bin/clang++ -DDEVEL -DFMT_SHARED -DSCYLLA_BUILD_MODE=dev -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Dev\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -O2 -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -MD -MT repair/CMakeFiles/repair.dir/Dev/repair.cc.o -MF repair/CMakeFiles/repair.dir/Dev/repair.cc.o.d -o repair/CMakeFiles/repair.dir/Dev/repair.cc.o -c /home/kefu/dev/scylladb/repair/repair.cc
In file included from /home/kefu/dev/scylladb/repair/repair.cc:9:
In file included from /home/kefu/dev/scylladb/repair/repair.hh:11:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/unordered_map:41:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unordered_map.h:33:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable.h:35:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable_policy.h:34:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/tuple:38:
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_pair.h:291:11: error: field has incomplete type 'service::node_ops_meta_data'
  291 |       _T2 second;                ///< The second member
      |           ^
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/ext/aligned_buffer.h:93:28: note: in instantiation of template class 'std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>' requested here
   93 |     : std::aligned_storage<sizeof(_Tp), __alignof__(_Tp)>
      |                            ^
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable_policy.h:334:43: note: in instantiation of template class '__gnu_cxx::__aligned_buffer<std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>>' requested here
  334 |       __gnu_cxx::__aligned_buffer<_Value> _M_storage;
      |                                           ^
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable_policy.h:373:7: note: in instantiation of template class 'std::__detail::_Hash_node_value_base<std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>>' requested here
  373 |     : _Hash_node_value_base<_Value>
      |       ^
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable.h:1662:21: note: in instantiation of template class 'std::__detail::_Hash_node_value<std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>, false>' requested here
 1662 |                         ._M_bucket_index(declval<const __node_value_type&>(),
      |                                          ^
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unordered_map.h:109:11: note: in instantiation of member function 'std::_Hashtable<utils::tagged_uuid<node_ops_id_tag>, std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>, std::allocator<std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>>, std::__detail::_Select1st, std::equal_to<utils::tagged_uuid<node_ops_id_tag>>, std::hash<utils::tagged_uuid<node_ops_id_tag>>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>>::~_Hashtable' requested here
  109 |     class unordered_map
      |           ^
/home/kefu/dev/scylladb/service/storage_service.hh:109:7: note: forward declaration of 'service::node_ops_meta_data'
  109 | class node_ops_meta_data;
      |       ^
In file included from /home/kefu/dev/scylladb/repair/repair.cc:9:
In file included from /home/kefu/dev/scylladb/repair/repair.hh:11:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/unordered_map:41:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unordered_map.h:33:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable.h:35:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable_policy.h:34:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/tuple:38:
In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_pair.h:60:
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-07-10 12:52:51 +08:00
Piotr Smaron
531659f8dc doc: add notes to feature pages which don't support tablets
There's already a page which lists which features are not working with
tablets: architecture/tablets.html#limitations-and-unsupported-features,
but it's also helpful for users to be warned about this when visiting a
specific feature doc page.
2024-07-09 18:18:05 +02:00
Avi Kivity
f31d5e3204 Merge 'repair/streaming: enable toggling tombstone gc with a config item' from Botond Dénes
We currently disable tombstone GC for compaction done on the read path of streaming and repair, because those expired tombstones can still prevent data resurrection. With time-based tombstone GC, missing a repair for long enough can cause data resurrection because a tombstone is potentially GC'd before it could be spread to every node by repair. So repair disseminating these expired tombstones helps clusters which missed repair for long enough. It is not a guarantee because compaction could have done the GC itself, but it is better than nothing.
This last resort is getting less important with repair-based tombstone GC. Furthermore, we have seen this cause huge repair amplification in a cluster, where expired tombstones triggered repair replicating otherwise identical rows.

This series makes tombstone GC on the streaming/repair compaction path configurable with a config item. This new config item defaults to `false` (current behaviour), setting it to `true`, will enable tombstone GC.

Fixes: https://github.com/scylladb/scylladb/issues/19015

Not a regression, no backport needed

Closes scylladb/scylladb#19016

* github.com:scylladb/scylladb:
  test/topology_custom/test_repair: add test for enable_tombstone_gc_for_streaming_and_repair
  replica/table: maybe_compact_for_streaming(): toggle tombstone GC based on the control flag
  replica: propagate enable_tombstone_gc_for_streaming_and_repair to maybe_compact_for_streaming()
  db/config: introduce enable_tombstone_gc_for_streaming_and_repair
2024-07-09 19:04:11 +03:00
Piotr Smaron
5bfabff9a0 cql: adjust warning about tablets
Made it shorter, simpler and mentioned also that counters aren't
supported with tablets.

Fixes: #18876
2024-07-09 18:01:37 +02:00
Piotr Smaron
c70f321c6f cql: forbid having counter columns in tablets tables
Counter updates break under tablet migration (#18180), and for this
reason they need to be disabled until the problem is fixed.
It's enough to forbid creating a table with counters, as altering a
table without counters already cannot result in the table having
counters:
1) Adding a counter column to a table without counters:
```
cqlsh> ALTER TABLE temp.cf ADD (col_name counter);
ConfigurationException: Cannot add a counter column (col_name) in a non counter column family
```
2) Altering a column to be of the counter type:
```
cqlsh> ALTER TABLE temp.cf ALTER col_name TYPE counter;
ConfigurationException: Cannot change col_name from type int to type counter: types are incompatible.
```

Fixes: #19449
2024-07-09 18:01:31 +02:00
Patryk Wrobel
a89e3d10af code-cleanup: add missing header guards
The following command had been executed to get the
list of headers that did not contain '#pragma once':
'grep -rnw . -e "#pragma once" --include *.hh -L'

This change adds missing include guard to headers
that did not contain any guard.

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>

Closes scylladb/scylladb#19626
2024-07-09 18:31:35 +03:00
Takuya ASADA
cae999c094 toolchain: change optimized clang install method to standard one
Previously optimized clang installation was not used standard build
script, it overwrites preinstalled Fedora's clang binaries instead.
However this breaks on clang-18.1.8, since libLTO versioning convention.
To avoid such problem, let's switch to standard installation method and
swith install prefix to /usr/local.

Fixes #19203

Closes scylladb/scylladb#19505
2024-07-09 14:22:42 +03:00
Tomasz Grabiec
252110bc54 Merge 'mutation_partition_v2: in apply_monotonically(), avoid bad_alloc on sentinel insertion' from Michał Chojnowski
apply_monotonically() is run with reclaim disabled. So with some bad luck,
sentinel insertion might fail with bad_alloc even on a perfectly healthy node.
We can't deal with the failure of sentinel insertion, so this will result in a
crash.

This patch prevents the spurious OOM by reserving some memory (1 LSA segment)
and only making it available right before the critical allocations.

Fixes https://github.com/scylladb/scylladb/issues/19552

Closes scylladb/scylladb#19617

* github.com:scylladb/scylladb:
  mutation_partition_v2: in apply_monotonically(), avoid bad_alloc on sentinel insertion
  logalloc: add hold_reserve
  logalloc: generalize refill_emergency_reserve()
2024-07-09 13:09:01 +02:00
Anna Stuchlik
948459b1ac doc: replace a link on the CDC+Kafka page
This commit replaces a link to the installation section with a link to the getting started section.

Closes scylladb/scylladb#19658
2024-07-09 12:35:43 +03:00
Michael Litvak
ed33e59714 storage_proxy: remove response handler if no targets
When writing a mutation, it might happen that there are no live targets
to send the mutation to, yet the request can be satisfied. For example,
when writing with CL=ANY to a dead node, the request is completed by
storing a local hint.

Currently, in that case, a write response handler is created for the
request and it remains active until it timeouts because it is not
removed anywhere, even though the write is completed successfuly after
storing the hint. The response handler should be removed usually when
receiving responses from all targets, but in this case there are no
targets to trigger the removal.

In this commit we check if we don't have live targets to send the
mutation to. If so, we remove the response handler immediately.

Fixes scylladb/scylladb#19529

Closes scylladb/scylladb#19586
2024-07-09 12:11:05 +03:00
Kamil Braun
98c18d8904 Merge 'Add API for read barrier' from Emil Maskovsky
Introduce REST API for triggering a read barrier.

This is to make sure the database schema is up to date on the node where
the read barrier is triggered. One of the use cases is the database
backup via the Scylla Manager, which requires that the schema backed up
is matching the data or newer (data can be migrated, but an older schema
would cause issues).

Fixes scylladb/scylladb#19213

Closes scylladb/scylladb#19597

* github.com:scylladb/scylladb:
  raft: add the read barrier REST API
  raft: use `raft_timeout` in trigger_snapshot
  raft: use bad_param_exception for consistency
  test: raft: verify schema updated after read barrier
2024-07-09 10:58:21 +02:00
Kefu Chai
6af989782c test: sstable_directory_test: use THREADSAFE_BOOST_REQUIRE_EQUAL when appropriate
for better debugging experience.

before this change, we have

```
fatal error: in "sstable_directory_test_generation_sanity": critical check sst->generation() == sst1->generation() has failed
```
after this change, we have

```
fatal error: in "sstable_directory_test_generation_sanity": critical
check sst->generation() == sst1->generation() has failed [3ghm_0ntw_29vj625yegw7jodysc != 3ghm_0ntw_29vj625yegw7jodysd]
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19639
2024-07-09 10:54:23 +03:00
Kefu Chai
30e82a81e8 test: do not define boost_test_print_type() for types with operator<<
before this change, we provide `boost_test_print_type()` for all types
which can be formatted using {fmt}. these types includes those who
fulfill the concept of range, and their element can be formatted using
{fmt}. if the compilation unit happens to include `fmt/ranges.h`.
the ranges are formatted with `boost_test_print_type()` as well. this
is what we expect. in other words, we use {fmt} to format types which
do not natively support {fmt}, but they fulfill the range concept.

but `boost::unit_test::basic_cstring` is one of them

- it can be formatted using operator<<, but it does not provide
  fmt::format specialization
- it fulfills the concept of range
- and its element type is `char const`, which can be formatted using
  {fmt}

that's why it's formatted like:

```
test/boost/sstable_directory_test.cc(317): fatal error: in "sstable_directory_test_generation_sanity": critical check ['s', 's', 't', '-', '>', 'g', 'e', 'n', 'e', 'r', 'a', 't', 'i', 'o', 'n', '(', ')', ' ', '=', '=', ' ', 's', 's', 't', '1', '-', '>', 'g', 'e', 'n', 'e', 'r', 'a', 't', 'i', 'o', 'n', '(', ')'] has failed`
```

where the string is formatted as a sequence-alike container. this
is far from readable.

so, in this change, we do not define `boost_test_print_type()` for
the types which natively support `operator<<` anymore. so they can
be printed with `operator<<` when  boost::test prints them.

Fixes scylladb/scylladb#19637
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19638
2024-07-09 10:34:37 +03:00
Botond Dénes
9544c364be scylla-gdb.py: introduce scylla large-objects
The equivalent of small-objects, but for large objects (spans).
Allows listing object of a large-class, and therefore investigating a
run-away class, by attempting to identify the owners of the objects in
it.

Written to investigate #16493

Closes scylladb/scylladb#16711
2024-07-09 10:21:09 +03:00
Emil Maskovsky
a9e985fcc9 raft: add the read barrier REST API
This will allow to trigger the read barrier directly via the API,
instead of doing work-arounds (like dropping a non-existent table).

The intended use-case is in the Scylla Manager, to make sure that
the database schema is up to date after the data has been backed up
and before attempting to backup the database schema.

The database schema in particular is being backed up just on a single
node, which might not yet have the schema at least as new as the data
(data can be migrated to a newer schema, but not a vice-versa).

The read barrier issued on the node should ensure that the node should
have the schema at least as new as the data or newer.

Closes #19213
2024-07-08 18:16:27 +02:00
Emil Maskovsky
7a69d9070f raft: use raft_timeout in trigger_snapshot
Migrate the "trigger_snapshot" to use the standardized `raft_timeout` approach.
2024-07-08 18:13:31 +02:00
Michał Chojnowski
78d6471ce4 mutation_partition_v2: in apply_monotonically(), avoid bad_alloc on sentinel insertion
apply_monotonically() is run with reclaim disabled. So with some bad luck,
sentinel insertion might fail with bad_alloc even on a perfectly healthy node.
We can't deal with the failure of sentinel insertion, so this will result in a
crash.

This patch prevents the spurious OOM by reserving some memory (1 LSA segment)
and only making it available right before the critical allocations.

Fixes scylladb/scylladb#19552
2024-07-08 16:08:27 +02:00
Michał Chojnowski
7b3f55a65f logalloc: add hold_reserve
mutation_partition_v2::apply_monotonically() needs to perform some allocations
in a destructor, to ensure that the invariants of the data structure are
restored before returning. But it is usually called with reclaiming disabled,
so the allocations might fail even in a perfectly healthy node with plenty of
reclaimable memory.

This patch adds a mechanism which allows to reserve some LSA memory (by
asking the allocator to keep it unused) and make it available for allocation
right when we need to guarantee allocation success.
2024-07-08 16:08:27 +02:00
Wojciech Przytuła
691e245152 storage_proxy: fix uninitialized LWT contention counter
When debugging the issue of high LWT contention metric, we (the
drivers team) discovered that at least 3 drivers (Go, Java, Rust)
cause high numbers in that metrics in LWT workloads - we doubted that
all those drivers route LWT queries badly. We tried to understand that
metric and its semantics. It took 3 people over 10 hours to figure out
what it is supposed to count.

People from core team suspected that it was the drivers sending
requests to different shards, causing contention. Then we ran the
workload against a single node single shard cluster... and observed
contention. Finally, we looked into the Scylla code and saw it.

**Uninitialized stack value.**

The core member was shocked. But we, the drivers people, felt we always
knew it. It's yet another time that we are blamed for a server-side
issue. We rebuilt scylla with the variable initialized to 0 and the
metric kept being 0.

To prevent such errors in the future, let's consider some lints that
warn against uninitialized variables. This is such an obvious feature
of e.g. Rust, and yet this has shown to be cause a painful bug in 2024.

Closes scylladb/scylladb#19625
2024-07-08 16:55:46 +03:00
Emil Maskovsky
492d0a5c86 raft: use bad_param_exception for consistency
Replace the `std::runtime_error` by the `bad_param_exception` that is used in other places.
2024-07-08 14:31:11 +02:00
Takuya ASADA
cbf33aba5c scylla_coredump_setup: install systemd-coredump before has_zstd()
On Ubuntu/Debian, we have to install systemd-coredump before
running has_ztd(), since it detect ZSTD support by running coredumpctl.

Move pkg_install('systemd-coredump') to the head of the script.

Fixes #19643

Closes scylladb/scylladb#19648
2024-07-08 15:04:34 +03:00
Kefu Chai
229250ef3e .github: use scylla-toolchain for newer fmt
in cccec07581, we started using a featured introduced by {fmt} v10.
but we are still using the {fmt} cooked using seastar, and it is
9.1.0, so this breaks the build when running the clang-tidy workflow.

in this change, instead of building on ubuntu jammy, we use the
scylladb/scylla-toolchain image based on fedora 40, which provides
{fmt} v10.2.1. since we are have clang 18 in fedora 40, this change
does not sacrifice anything.

after this change, clang-tidy workflow should be back to normal.

Fixes scylladb/scylladb#19621
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19628
2024-07-08 11:14:02 +02:00
Emil Maskovsky
80986c17c3 test: raft: verify schema updated after read barrier
Regression test for #19213.
2024-07-08 10:50:32 +02:00
Piotr Dulikowski
3c535641fd Merge 'service/storage_proxy: Add metrics keeping track of incoming hints' from Dawid Mędrek
Although Scylla already exposes metrics keeping track of various information related to hinted handoff, all of them correspond to either storing or sending hints. However, when debugging, it's also crucial to be aware of how many hints are coming to a given node and what their size is. Unfortunately, the existing metrics are not enough to obtain that information.

This PR introduces the following new metrics:

* `sent_bytes_total` – the total size of the hints that have been sent from a given shard,
* `received_hints_total` – the total number of hints that a given shard has received,
* `received_hints_bytes_total` – the total size of the hints a given shard has received.

It also renames `hints_manager_sent` to `hints_manager_sent_total` to avoid conflicts of prefixes between that metric and `sent_bytes_total` in tests.

Fixes scylladb/scylladb#10987

Closes scylladb/scylladb#18976

* github.com:scylladb/scylladb:
  db/hints: Add a metric for the size of sent hints
  service/storage_proxy: Add metrics for received hints
2024-07-08 10:29:53 +02:00
Botond Dénes
56c194e52c Merge 'compaction: not include unused headers' from Kefu Chai
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#19581

* github.com:scylladb/scylladb:
  .github: add compaction to iwyu's CLEANER_DIR
  compaction: not include unused headers
2024-07-08 10:03:51 +03:00
Israel Fruchter
32e6725b8e Update tools/cqlsh submodule
* tools/cqlsh 73bdbeb0...86a280a1 (1):
  > remove cassandra from the shiv package

Ref: scylladb/scylla-cqlsh#96

Closes scylladb/scylladb#19558
2024-07-08 10:00:59 +03:00
Michael Litvak
407274e828 view: drain view builder before database
The view builder is doing write operations to the database.
In order for the view builder to shutdown gracefully without errors, we
need to ensure the database can handle writes while it is drained.
The commit changes the drain order, so that view builder is drained
before the database shuts down.

Fixes scylladb/scylladb#18929

Closes scylladb/scylladb#19609
2024-07-05 22:17:40 +03:00
Botond Dénes
103bd8334a service/paxos/paxos_state: restore resilience against dropped tables
Recently, the code in paxos_state::prepare(), paxos_state::accept() and
paxos_state::learn() was coroutinized by 58912c2cc1, 887a5a8f62 and
2b7acdb32c respectively. This introduced a regression: the latency
histogram updater code, was moved from a finally() to a defer(). Unlike
the former, the latter runs in a noexcept context so the possible
replica::no_such_column_family raised from the latency update code now
crashes the node, instead of failing just the paxos operation as before.
Fix by only updating the latency histogram if the table still exists.

Fixes: scylladb/scylladb#19620

Closes scylladb/scylladb#19623
2024-07-05 14:58:11 +02:00
Anna Stuchlik
8759dfae96 doc: add Run in Docker page to the documentation
The page was missing from the docs. I created the page based on
the information in the download center (which will be closed down soon)
and other ScyllaDB resources.

Closes scylladb/scylladb#19577
2024-07-04 20:20:03 +03:00
Dawid Medrek
0e1cb0dc73 db/hints: Add logging when ignoring hint directories
In 2446cce, we stopped trying to attempt to create
endpoint managers for invalid hint directories
even when their names represented IP addresses or
host IDs. In this commit, we add logging informing
the user about it.

Refs scylladb/scylladb#19173

Closes scylladb/scylladb#19618
2024-07-04 20:14:52 +03:00
Botond Dénes
155acbb306 reader_concurrency_semaphore: execution_loop(): move maybe_admit_waiters() to the inner loop
Now that the CPU concurency limit is configurable, new reads might be
ready to execute right after the current one was executed. So move the
poll for admitting new reads into the inner loop, to prevent the
situation where the inner loop yields and a concurrent
do_wait_admission() finds that there are waiters (queued because at the
time they arrived to the semaphore, the _ready_list was not empty) but it
is is possible to admit a new read. When this happens the semaphore will
dump diagnostics to help debug the apparent contradiction, which can
generate a lot of log spam. Moving the poll into the inner loop prevents
the false-positive contradiction detection from firing.

Refs: scylladb/scylladb#19017

Closes scylladb/scylladb#19600
2024-07-04 17:47:52 +03:00
Avi Kivity
0626e0487d Merge 'Add copy on write to functions schema code' from Marcin Maliszkiewicz
This is the first patch from series which would allow us to unify raft command code. Property we want to achieve is that all modifications performed by a single raft command can be made visible atomically. This helps to exclude accidental dependencies across subsystem updates and make easier to reason about state.

Here we alter functions schema code so that changes are first applied to a copy of declared functions and then made visible atomically. Later work will apply similar strategy to the whole schema.

Relates scylladb/scylladb#19153

Closes scylladb/scylladb#19598

* github.com:scylladb/scylladb:
  cql3: functions: make modification functions accessible only via batch class
  db: replica: batch functions schema modifications
  cql3: functions: introduce class for batching functions modifications
  cql3: functions: make functions class non-static
  cql3: functions: remove reduntant class access specifiers
  cql3: functions: remove unused java snippet
2024-07-04 17:40:23 +03:00
Anna Stuchlik
822a58f964 doc: remove support for Debian 10
This PR removes support for Debian 10, which reached end of life on June 30, 2024.

Refs https://github.com/scylladb/scylla-enterprise/issues/4377

Closes scylladb/scylladb#19616
2024-07-04 17:24:57 +03:00
Marcin Maliszkiewicz
3f1c2fecc2 cql3: functions: make modification functions accessible only via batch class
This is to assure that all the code is using batching
2024-07-04 13:10:26 +02:00
Marcin Maliszkiewicz
32fe101f9d db: replica: batch functions schema modifications
Before each function change was immediately visible as
during event notification logic yielded.

Now we first gather the modifications and then commit them.

Further work will broaden the scope of atomicity to the whole
schema and even across other subsystems.
2024-07-04 13:10:26 +02:00
Michał Chojnowski
f784be6a7e logalloc: generalize refill_emergency_reserve()
In the next patch, we will want to do the thing as
refill_emergency_reserve() does, just with a quantity different
than _emergency_reserve_max. So we split off the shareable part
to a new function, and use it to implement refill_emergency_reserve().
2024-07-04 12:19:01 +02:00
Pavel Emelyanov
9a654730a7 tablet_allocator: Put more info into failed-to-drain exception
When balancer fails to find a node to balance drained tablets into, it
throws an exception with tablet id and node id, but it's also good to
know more details about the balancing state that lead to failure

refs: #19504

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#19588
2024-07-04 12:18:50 +02:00
Marcin Maliszkiewicz
4d937c5a17 cql3: functions: introduce class for batching functions modifications
It will hold a temporary shallow copy of declared functions.
Then each modification adds/removes/replaces stored function object.
At the end change is commited by moving temporary copy to the
main functions class instance.
2024-07-04 12:14:36 +02:00
Nadav Har'El
96dff367f8 Merge 'storage_proxy: update view update backlog on correct shard when writing' from Wojciech Mitros
This series is another approach of https://github.com/scylladb/scylladb/pull/18646 and https://github.com/scylladb/scylladb/pull/19181. In this series we only change where the view backlog gets
updated - we do not assure that the view update backlog returned in a response is necessarily the backlog
that increased due to the corresponding write, the returned backlog may be outdated up to 10ms. Because
 this series does not include this change, it's considerably less complex and it doesn't modify the common
write patch, so no particular performance considerations were needed in that context. The issue being fixed
is still the same, the full description can be seen below.

When a replica applies a write on a table which has a materialized view
it generates view updates. These updates take memory which is tracked
by `database::_view_update_concurrency_sem`, separate on each shard.
The fraction of units taken from the semaphore to the semaphore limit
is the shard's view update backlog. Based on these backlogs, we want
to estimate how busy a node is with its view updates work. We do that
by taking the max backlog across all shards.
To avoid excessive cross-shard operations, the node's (max) backlog isn't
calculated each time we need it, but up to 1 time per 10ms (the `_interval`) with an optimization where the backlog of the calculating shard is immediately up-to-date (we don't need cross-shard operations for it):
```
update_backlog node_update_backlog::fetch() {
    auto now = clock::now();
    if (now >= _last_update.load(std::memory_order_relaxed) + _interval) {
        _last_update.store(now, std::memory_order_relaxed);
        auto new_max = boost::accumulate(
                _backlogs,
                update_backlog::no_backlog(),
                [] (const update_backlog& lhs, const per_shard_backlog& rhs) {
                    return std::max(lhs, rhs.load());
                });
        _max.store(new_max, std::memory_order_relaxed);
        return new_max;
    }
    return std::max(fetch_shard(this_shard_id()), _max.load(std::memory_order_relaxed));
}
```
For the same reason, even when we do calculate the new node's backlog,
we don't read from the `_view_update_concurrency_sem`. Instead, for
each shard we also store a update_backlog atomic which we use for
calculation:
```
    struct per_shard_backlog {
        // Multiply by 2 to defeat the prefetcher
        alignas(seastar::cache_line_size * 2) std::atomic<update_backlog> backlog = update_backlog::no_backlog();
        need_publishing need_publishing = need_publishing::no;

        update_backlog load() const {
            return backlog.load(std::memory_order_relaxed);
        }
    };
 std::vector<per_shard_backlog> _backlogs;
```
Due to this distinction, the update_backlog atomic need to be updated
separately, when the `_view_update_concurrency_sem` changes.
This is done by calling `storage_proxy::update_view_update_backlog`, which reads the `_view_update_concurrency_sem` of the shard (in `database::get_view_update_backlog`)
and then calls node`_update_backlog::add` where the read backlog
is stored in the atomic:
```
void storage_proxy::update_view_update_backlog() {
    _max_view_update_backlog.add(get_db().local().get_view_update_backlog());
}
void node_update_backlog::add(update_backlog backlog) {
    _backlogs[this_shard_id()].backlog.store(backlog, std::memory_order_relaxed);
    _backlogs[this_shard_id()].need_publishing = need_publishing::yes;
}
```
For this implementation of calculating the node's view update backlog to work,
we need the atomics to be updated correctly when the semaphores of corresponding
shards change.

The main event where the view update backlog changes is an incoming write
request. That's why when handling the request and preparing a response
we update the backlog calling `storage_proxy::get_view_update_backlog` (also
because we want to read the backlog and send it in the response):
backlog update after local view updates (`storage_proxy::send_to_live_endpoints` in `mutate_begin`)
```
 auto lmutate = [handler_ptr, response_id, this, my_address, timeout] () mutable {
     return handler_ptr->apply_locally(timeout, handler_ptr->get_trace_state())
             .then([response_id, this, my_address, h = std::move(handler_ptr), p = shared_from_this()] {
         // make mutation alive until it is processed locally, otherwise it
         // may disappear if write timeouts before this future is ready
         got_response(response_id, my_address, get_view_update_backlog());
     });
 };
backlog update after remote view updates (storage_proxy::remote::handle_write)

 auto f = co_await coroutine::as_future(send_mutation_done(netw::messaging_service::msg_addr{reply_to, shard}, trace_state_ptr,
         shard, response_id, p->get_view_update_backlog()));
```
Now assume that on a certain node we have a write request received on shard A,
which updates a row on shard B (A!=B). As a result, shard B will generate view
updates and consume units from its `_view_update_concurrency_sem`, but will
not update its atomic in `_backlogs` yet. Because both shards in the example
are on the same node, shard A will perform a local write calling `lmutate` shown
above. In the `lmutate` call, the `apply_locally` will initiate the actual write on
shard B and the `storage_proxy::update_view_update_backlog` will be called back
on shard A. In no place will the backlog atomic on shard B get updated even
though it increased in size due to the view updates generated there.
Currently, what we calculate there doesn't really matter - it's only used for the
MV flow control delays, so currently, in this scenario, we may only overload
a replica causing failed replica writes which will be later retried as hints. However,
when we add MV admission control, the calculated backlog will be the difference
between an accepted and a rejected request.

Fixes: https://github.com/scylladb/scylladb/issues/18542

Without admission control (https://github.com/scylladb/scylladb/pull/18334), this patch doesn't affect much, so I'm marking it as backport/none

Closes scylladb/scylladb#19341

* github.com:scylladb/scylladb:
  test: add test for view backlog not being updated on correct shard
  test: move auxiliary methods for waiting until a view is built to util
  mv: update view update backlog when it increases on correct shard
2024-07-04 11:40:09 +03:00
Marcin Maliszkiewicz
16b770ff1a cql3: functions: make functions class non-static
This is done to ease code reuse in the following commit.
It'd also help should we ever want properly mount functions
class to schema object instead of static storage.
2024-07-04 10:24:57 +02:00