Commit Graph

34995 Commits

Author SHA1 Message Date
Botond Dénes
abb7ae4309 Update ./tools/jmx submodule
* tools/jmx f21550e...50909d6 (1):
  > scylla-apiclient: drop hk2-locator dependency

Fixes: scylladb/scylla-jmx#231
2024-01-10 14:22:14 +02:00
Botond Dénes
2820c63734 Update tools/java submodule
* tools/java d7ec9bf45f...79fa02d8a3 (2):
  > build.xml: update scylla-driver-core to 3.11.5.1
  > treewide: update "guava" package

Fixes: scylla-tools-java#365
Fixes: scylla-tools-java#343

Closes #16693
2024-01-10 08:19:43 +02:00
Nadav Har'El
ac0056f4bc Merge 'Fix partition estimation with TWCS tables during streaming' from Raphael "Raph" Carvalho
TWCS tables require partition estimation adjustment as incoming streaming data can be segregated into the time windows.

Turns out we had two problems in this area that leads to suboptimal bloom filters.

1) With off-strategy enabled, data segregation is postponed, but partition estimation was adjusted as if segregation wasn't postponed. Solved by not adjusting estimation if segregation is postponed.
2) With off-strategy disabled, data segregation is not postponed, but streaming didn't feed any metadata into partition estimation procedure, meaning it had to assume the max windows input data can be segregated into (100). Solved by using schema's default TTL for a precise estimation of window count.

For the future, we want to dynamically size filters (see https://github.com/scylladb/scylladb/issues/2024), especially for TWCS that might have SSTables that are left uncompacted until they're fully expired, meaning that the system won't heal itself in a timely manner through compaction on a SSTable that had partition estimation really wrong.

Fixes https://github.com/scylladb/scylladb/issues/15704.

Closes scylladb/scylladb#15938

* github.com:scylladb/scylladb:
  streaming: Improve partition estimation with TWCS
  streaming: Don't adjust partition estimate if segregation is postponed

(cherry picked from commit 64d1d5cf62)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #16672
2024-01-08 09:06:43 +02:00
Calle Wilund
aaa25e1a78 Commitlog replayer: Range-check skip call
Fixes #15269

If segment being replayed is corrupted/truncated we can attempt skipping
completely bogues byte amounts, which can cause assert (i.e. crash) in
file_data_source_impl. This is not a crash-level error, so ensure we
range check the distance in the reader.

v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will
    ensure we log the fact that things are broken

Closes scylladb/scylladb#15270

(cherry picked from commit 6ffb482bf3)
2024-01-05 09:19:45 +02:00
Beni Peled
c57a0a7a46 release: prepare for 5.2.13 scylla-5.2.13 2024-01-03 17:48:59 +02:00
Botond Dénes
740ba3ac2a tools/schema_loader: read_schema_table_mutation(): close the reader
The reader used to read the sstables was not closed. This could
sometimes trigger an abort(), because the reader was destroyed, without
it being closed first.
Why only sometimes? This is due to two factors:
* read_mutation_from_flat_mutation_reader() - the method used to extract
  a mutation from the reader, uses consume(), which does not trigger
  `set_close_is_required()` (#16520). Due to this, the top-level
  combined reader did not complain when destroyed without close.
* The combined reader closes underlying readers who have no more data
  for the current range. If the circumstances are just right, all
  underlying readers are closed, before the combined reader is
  destoyed. Looks like this is what happens for the most time.

This bug was discovered in SCT testing. After fixing #16520, all
invokations of `scylla-sstable`, which use this code would trigger the
abort, without this patch. So no further testing is required.

Fixes: #16519

Closes scylladb/scylladb#16521

(cherry picked from commit da033343b7)
2023-12-31 18:13:10 +02:00
Gleb Natapov
76c3dda640 storage_service: register schema version observer before joining group0 and starting gossiper
The schema version is updated by group0, so if group0 starts before
schema version observer is registered some updates may be missed. Since
the observer is used to update node's gossiper state the gossiper may
contain wrong schema version.

Fix by registering the observer before starting group0 and even before
starting gossiper to avoid a theoretical case that something may pull
schema after start of gossiping and before the observer is registered.

Fixes: #15078

Message-Id: <ZOYZWhEh6Zyb+FaN@scylladb.com>
(cherry picked from commit d1654ccdda)
2023-12-20 11:14:27 +01:00
Kamil Braun
287546923e Merge 'db: hints: add checksum to sync_point encoding' from Patryk Jędrzejczak
Fixes #9405

`sync_point` API provided with incorrect sync point id might allocate
crazy amount of memory and fail with `std::bad_alloc`.

To fix this, we can check if the encoded sync point has been modified
before decoding. We can achieve this by calculating a checksum before
encoding, appending it to the encoded sync point, and compering it with
a checksum calculated in `db::hints::decode` before decoding.

Closes #14534

* github.com:scylladb/scylladb:
  db: hints: add checksum to sync point encoding
  db: hints: add the version_size constant

(cherry picked from commit eb6202ef9c)

The only difference from the original merge commit is the include
path of `xx_hasher.hh`. On branch 5.2, this file is in the root
directory, not `utils`.

Closes #16458
2023-12-19 17:39:50 +02:00
Botond Dénes
c0dab523f9 Update tools/java submodule
* tools/java e2aad6e3a0...d7ec9bf45f (1):
  > Merge "build: take care of old libthrift" from Piotr Grabowski

Fixes: scylladb/scylla-tools-java#352

Closes #16464
2023-12-19 17:37:27 +02:00
Michael Huang
5499f7b5a8 cdc: use chunked_vector for topology_description entries
Lists can grow very big. Let's use a chunked vector to prevent large contiguous
allocations.
Fixes: #15302.

Closes scylladb/scylladb#15428

(cherry picked from commit 62a8a31be7)
2023-12-19 13:43:23 +01:00
Piotr Grabowski
7055ac45d1 test: use more frequent reconnection policy
The default reconnection policy in Python Driver is an exponential
backoff (with jitter) policy, which starts at 1 second reconnection
interval and ramps up to 600 seconds.

This is a problem in tests (refs #15104), especially in tests that restart
or replace nodes. In such a scenario, a node can be unavailable for an
extended period of time and the driver will try to reconnect to it
multiple times, eventually reaching very long reconnection interval
values, exceeding the timeout of a test.

Fix the issue by using a exponential reconnection policy with a maximum
interval of 4 seconds. A smaller value was not chosen, as each retry
clutters the logs with reconnection exception stack trace.

Fixes #15104

Closes #15112

(cherry picked from commit 17e3e367ca)
2023-12-19 13:43:23 +01:00
Gleb Natapov
4ff29d1637 raft: drop assert in server_impl::apply_snapshot for a condition that may happen
server_impl::apply_snapshot() assumes that it cannot receive a snapshots
from the same host until the previous one is handled and usually this is
true since a leader will not send another snapshot until it gets
response to a previous one. But it may happens that snapshot sending
RPC fails after the snapshot was sent, but before reply is received
because of connection disconnect. In this case the leader may send
another snapshot and there is no guaranty that the previous one was
already handled, so the assumption may break.

Drop the assert that verifies the assumption and return an error in this
case instead.

Fixes: #15222

Message-ID: <ZO9JoEiHg+nIdavS@scylladb.com>
(cherry picked from commit 55f047f33f)
2023-12-19 13:43:23 +01:00
Alexey Novikov
6bcf9e6631 When add duration field to UDT check whether this UDT is used in some clustering key
Having values of the duration type is not allowed for clustering
columns, because duration can't be ordered. This is correctly validated
when creating a table but do not validated when we alter the type.

Fixes #12913

Closes scylladb/scylladb#16022

(cherry picked from commit bd73536b33)
2023-12-19 06:58:41 -05:00
Takuya ASADA
74dd8f08e3 dist: fix local-fs.target dependency
systemd man page says:

systemd-fstab-generator(3) automatically adds dependencies of type Before= to
all mount units that refer to local mount points for this target unit.

So "Before=local-fs.taget" is the correct dependency for local mount
points, but we currently specify "After=local-fs.target", it should be
fixed.

Also replaced "WantedBy=multi-user.target" with "WantedBy=local-fs.target",
since .mount are not related with multi-user but depends local
filesystems.

Fixes #8761

Closes scylladb/scylladb#15647

(cherry picked from commit a23278308f)
2023-12-19 13:15:00 +02:00
Botond Dénes
68507ed4d9 Merge '[Backport 5.2] Shard of shard repair task impl' from Aleksandra Martyniuk
Shard id is logged twice in repair (once explicitly, once added by logger).
Redundant occurrence is deleted.

shard_repair_task_impl::id (which contains global repair shard)
is renamed to avoid further confusion.

Fixes: https://github.com/scylladb/scylladb/issues/12955

Closes #16439

* github.com:scylladb/scylladb:
  repair: rename shard_repair_task_impl::id
  repair: delete redundant shard id from logs
2023-12-19 10:28:57 +02:00
Botond Dénes
46a29e9a02 Merge 'alternator: fix isolation of concurrent modifications to tags' from Nadav Har'El
Alternator's implementation of TagResource, UntagResource and UpdateTimeToLive (the latter uses tags to store the TTL configuration) was unsafe for concurrent modifications - some of these modifications may be lost. This short series fixes the bug, and also adds (in the last patch) a test that reproduces the bug and verifies that it's fixed.

The cause of the incorrect isolation was that we separately read the old tags and wrote the modified tags. In this series we introduce a new function, `modify_tags()` which can do both under one lock, so concurrent tag operations are serialized and therefore isolated as expected.

Fixes #6389.

Closes #13150

* github.com:scylladb/scylladb:
  test/alternator: test concurrent TagResource / UntagResource
  db/tags: drop unsafe update_tags() utility function
  alternator: isolate concurrent modification to tags
  db/tags: add safe modify_tags() utility functions
  migration_manager: expose access to storage_proxy

(cherry picked from commit dba1d36aa6)

Closes #16453
2023-12-19 10:19:31 +02:00
Botond Dénes
23fd6939eb Merge '[Backport to 5.2] gossiper: mark_alive: use deferred_action to unmark pending' from Benny Halevy
Backport the following patches to 5.2:
- gossiper: mark_alive: enter background_msg gate (#14791)
- gossiper: mark_alive: use deferred_action to unmark pending (#14839)

Closes #16452

* github.com:scylladb/scylladb:
  gossiper: mark_alive: use deferred_action to unmark pending
  gossiper: mark_alive: enter background_msg gate
2023-12-19 09:06:37 +02:00
Botond Dénes
1cf499cfea Update tools/java submodule
* tools/java 80701efa8d...e2aad6e3a0 (2):
  > build: update logback dependency
  > build: update `netty` dependency

Fixes: https://github.com/scylladb/scylla-tools-java/issues/363
Fixes: https://github.com/scylladb/scylla-tools-java/issues/364

Closes #16444
2023-12-18 18:19:20 +02:00
Nadav Har'El
91e05dc646 cql: fix SELECT toJson() or SELECT JSON of time column
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes #7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#16121

(cherry picked from commit 8d040325ab)
2023-12-18 18:19:20 +02:00
Benny Halevy
a2009c4a8c gossiper: mark_alive: use deferred_action to unmark pending
Make sure _pending_mark_alive_endpoints is unmarked in
any case, including exceptions.

Fixes #14839

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #14840

(cherry picked from commit 1e7e2eeaee)
2023-12-18 14:44:22 +02:00
Benny Halevy
999a6bfaae gossiper: mark_alive: enter background_msg gate
The function dispatch a background operation that must be
waited on in stop().

\Fixes scylladb/scylladb#14791

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 868e436901)
2023-12-18 14:42:52 +02:00
Kefu Chai
faef786c88 reloc: strip.sh: always generate symbol list with posix format
we compare the symbols lists of stripped ELF file ($orig.stripped) and
that of the one including debugging symbols ($orig.debug) to get a
an ELF file which includes only the necessary bits as the debuginfo
($orig.minidebug).

but we generate the symbol list of stripped ELF file using the
sysv format, while generate the one from the unstripped one using
posix format. the former is always padded the symbol names with spaces
so that their the length at least the same as the section name after
we split the fields with "|".

that's why the diff includes the stuff we don't expect. and hence,
we have tons of warnings like:

```
objcopy: build/node_exporter/node_exporter.keep_symbols:4910: Ignoring rubbish found on this line
```

when using objcopy to filter the ELF file to keep only the
symbols we are interested in.

so, in this change

* use the same format when dumping the symbols from unstripped ELF
  file
* include the symbols in the text area -- the code, by checking
  "T" and "t" in the dumped symbols. this was achieved by matching
  the lines with "FUNC" before this change.
* include the the symbols in .init data section -- the global
  variables which are initialized at compile time. they could
  be also interesting when debugging an application.

Fixes #15513
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15514

(cherry picked from commit 50c937439b)
2023-12-18 13:58:14 +02:00
Michał Chojnowski
7e9bdef8bb row_cache: when the constructor fails, clear _partitions in the right allocator
If the constructor of row_cache throws, `_partitions` is cleared in the
wrong allocator, possibly causing allocator corruption.

Fix that.

Fixes #15632

Closes scylladb/scylladb#15633

(cherry picked from commit 330d221deb)
2023-12-18 13:55:16 +02:00
Michael Huang
af38b255c8 cql3: Fix invalid JSON parsing for JSON objects with ASCII keys
For JSON objects represented as map<ascii, int>, don't treat ASCII keys
as a nested JSON string. We were doing that prior to the patch, which
led to parsing errors.

Included the error offset where JSON parsing failed for
rjson::parse related functions to help identify parsing errors
better.

Fixes: #7949

Signed-off-by: Michael Huang <michaelhly@gmail.com>

Closes scylladb/scylladb#15499

(cherry picked from commit 75109e9519)
2023-12-18 13:45:57 +02:00
Kefu Chai
c4b699525a sstables: throw at seeing invalid chunk_len
before this change, when running into a zero chunk_len, scylla
crashes with `assert(chunk_size != 0)`. but we can do better than
printing a backtrace like:
```
scylla: sstables/compress.cc:158: void
sstables::compression::segmented_offsets::init(uint32_t): Assertion `chunk_size != 0' failed.
```
so, in this change, a `malformed_sstable_exception` is throw in place
of an `assert()`, which is supposed to verify the programming
invariants, not for identifying corrupted data file.

Fixes #15265
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15264

(cherry picked from commit 1ed894170c)
2023-12-18 13:29:02 +02:00
Nadav Har'El
3a24b8c435 sstables: stop warning when auto-snapshot leaves non-empty directory
When a table is dropped, we delete its sstables, and finally try to delete
the table's top-level directory with the rmdir system call. When the
auto-snapshot feature is enabled (this is still Scylla's default),
the snapshot will remain in that directory so it won't be empty and will
cannot be removed. Today, this results in a long, ugly and scary warning
in the log:

```
WARN  2023-07-06 20:48:04,995 [shard 0] sstable - Could not remove table directory "/tmp/scylla-test-198265/data/alternator_alternator_Test_1688665684546/alternator_Test_1688665684546-4238f2201c2511eeb15859c589d9be4d/snapshots": std::filesystem::__cxx11::filesystem_error (error system:39, filesystem error: remove failed: Directory not empty [/tmp/scylla-test-198265/data/alternator_alternator_Test_1688665684546/alternator_Test_1688665684546-4238f2201c2511eeb15859c589d9be4d/snapshots]). Ignored.
```

It is bad to log as a warning something which is completely normal - it
happens every time a table is dropped with the perfectly valid (and even
default) auto-snapshot mode. We should only log a warning if the deletion
failed because of some unexpected reason.

And in fact, this is exactly what the code **tried** to do - it does
not log a warning if the rmdir failed with EEXIST. It even had a comment
saying why it was doing this. But the problem is that in Linux, deleting
a non-empty directory does not return EEXIST, it returns ENOTEMPTY...
Posix actually allows both. So we need to check both, and this is the
only change in this patch.

To confirm this that this patch works, edit test/cql-pytest/run.py and
change auto-snapshot from 0 to 1, run test/alternator/run (for example)
and see many "Directory not empty" warnings as above. With this patch,
none of these warnings appear.

Fixes #13538

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #14557

(cherry picked from commit edfb89ef65)
2023-12-18 13:26:40 +02:00
Kefu Chai
9e9a488da3 streaming: cast the progress to a float before formatting it
before this change, we format a `long` using `{:f}`. fmtlib would
throw an exception when actually formatting it.

so, let's make the percentage a float before formatting it.

Fixes #14587
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14588

(cherry picked from commit 1eb76d93b7)
2023-12-18 13:16:58 +02:00
Aleksandra Martyniuk
614d15b9f6 repair: rename shard_repair_task_impl::id
shard_repair_task_impl::id stores global repair id. To avoid confusion
with the task id, the field is renamed to global_repair_id.

(cherry picked from commit d889a599e8)
2023-12-18 12:08:00 +01:00
Aleksandra Martyniuk
fc2799096f repair: delete redundant shard id from logs
In repair shard id is logged twice. Delete repeated occurence.

(cherry picked from commit f7c88edec5)
2023-12-18 12:03:26 +01:00
Petr Gusev
b9178bd853 hints: send_one_hint: extend the scope of file_send_gate holder
The problem was that the holder in with_gate
call was released too early. This happened
before the possible call to on_hint_send_failure
in then_wrapped. As a result, the effects of
on_hint_send_failure (segment_replay_failed flag)
were not visible in send_one_file after
ctx_ptr->file_send_gate.close(), so we could decide
that the segment was sent in full and delete
it even if sending of some hints led to errors.

Fixes #15110

(cherry picked from commit 9fd3df13a2)
2023-12-18 13:03:23 +02:00
Kefu Chai
12aacea997 compound_compat: do not format an sstring with {:d}
before this change, we format a sstring with "{:d}", fmtlib would throw
`fmt::format_error` at runtime when formatting it. this is not expected.

so, in this change, we just print the int8_t using `seastar::format()`
in a single pass. and with the format specifier of `#02x` instead of
adding the "0x" prefix manually.

Fixes #14577
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14578

(cherry picked from commit 27d6ff36df)
2023-12-18 12:47:16 +02:00
Kefu Chai
df30f66bfa tools/scylla-sstable: dump column_desc as an object
before this change, `scylla sstable dump-statistics` prints the
"regular_columns" as a list of strings, like:

```
        "regular_columns": [
          "name",
          "clustering_order",
          "type_name",
          "org.apache.cassandra.db.marshal.UTF8Type",
          "name",
          "column_name_bytes",
          "type_name",
          "org.apache.cassandra.db.marshal.BytesType",
          "name",
          "kind",
          "type_name",
          "org.apache.cassandra.db.marshal.UTF8Type",
          "name",
          "position",
          "type_name",
          "org.apache.cassandra.db.marshal.Int32Type",
          "name",
          "type",
          "type_name",
          "org.apache.cassandra.db.marshal.UTF8Type"
        ]
```

but according
https://opensource.docs.scylladb.com/stable/operating-scylla/admin-tools/scylla-sstable.html#dump-statistics,

> $SERIALIZATION_HEADER_METADATA := {
>     "min_timestamp_base": Uint64,
>     "min_local_deletion_time_base": Uint64,
>     "min_ttl_base": Uint64",
>     "pk_type_name": String,
>     "clustering_key_types_names": [String, ...],
>     "static_columns": [$COLUMN_DESC, ...],
>     "regular_columns": [$COLUMN_DESC, ...],
> }
>
> $COLUMN_DESC := {
>     "name": String,
>     "type_name": String
> }

"regular_columns" is supposed to be a list of "$COLUMN_DESC".
the same applies to "static_columnes". this schema makes sense,
as each column should be considered as a single object which
is composed of two properties. but we dump them like a list.

so, in this change, we guard each visit() call of `json_dumper()`
with `StartObject()` and `EndObject()` pair, so that each column
is printed as an object.

after the change, "regular_columns" are printed like:
```
        "regular_columns": [
          {
            "name": "clustering_order",
            "type_name": "org.apache.cassandra.db.marshal.UTF8Type"
          },
          {
            "name": "column_name_bytes",
            "type_name": "org.apache.cassandra.db.marshal.BytesType"
          },
          {
            "name": "kind",
            "type_name": "org.apache.cassandra.db.marshal.UTF8Type"
          },
          {
            "name": "position",
            "type_name": "org.apache.cassandra.db.marshal.Int32Type"
          },
          {
            "name": "type",
            "type_name": "org.apache.cassandra.db.marshal.UTF8Type"
          }
        ]
```

Fixes #15036
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15037

(cherry picked from commit c82f1d2f57)
2023-12-18 12:26:36 +02:00
Michał Sala
2427bda737 forward_service: introduce shutdown checks
This commit introduces a new boolean flag, `shutdown`, to the
forward_service, along with a corresponding shutdown method. It also
adds checks throughout the forward_service to verify the value of the
shutdown flag before retrying or invoking functions that might use the
messaging service under the hood.

The flag is set before messaging service shutdown, by invoking
forward_service::shutdown in main. By checking the flag before each call
that potentially involves the messaging service, we can ensure that the
messaging service is still operational. If the flag is false, indicating
that the messaging service is still active, we can proceed with the
call. In the event that the messaging service is shutdown during the
call, appropriate exceptions should be thrown somewhere down in called
functions, avoiding potential hangs.

This fix should resolve the issue where forward_service retries could
block the shutdown.

Fixes #12604

Closes #13922

(cherry picked from commit e0855b1de2)
2023-12-18 12:25:25 +02:00
Petr Gusev
27adf340ef storage_proxy: mutation:: make frozen_mutation [[ref]]
We had a redundant copy in receive_mutation_handler
forward_fn callback. This frozen_mutation is
dynamically allocated and can be arbitrary large.

Fixes: #12504
(cherry picked from commit 5adbb6cde2)
2023-12-18 12:20:40 +02:00
Botond Dénes
5c33c9d6a6 Merge 'thrift: return address in listen_addresses() only after server is ready' from Marcin Maliszkiewicz
This is used for readiness API: /storage_service/rpc_server and the fix prevents from returning 'true' prematurely.

Some improvement for readiness was added in a51529dd15 but thrift implementation wasn't fully done.

Fixes https://github.com/scylladb/scylladb/issues/12376

Closes #13319

* github.com:scylladb/scylladb:
  thrift: return address in listen_addresses() only after server is ready
  thrift: simplify do_start_server() with seastar:async

(cherry picked from commit 9a024f72c4)
2023-12-18 12:20:40 +02:00
Kamil Braun
9aaaa66981 Merge 'cql3: fix a few misformatted printouts of column names in error messages' from Nadav Har'El
Fix a few cases where instead of printing column names in error messages, we printed weird stuff like ASCII codes or the address of the name.

Fixes #13657

Closes #13658

* github.com:scylladb/scylladb:
  cql3: fix printing of column_specification::name in some error messages
  cql3: fix printing of column_definition::name in some error messages

(cherry picked from commit a29b8cd02b)
2023-12-18 09:55:37 +02:00
Avi Kivity
b21ec82894 Merge 'Do not yield while traversing the gossiper endpoint state map' from Benny Halevy
This series introduces a new gossiper method: get_endpoints that returns a vector of endpoints (by value) based on the endpoint state map.

get_endpoints is used here by gossiper and storage_service for iterations that may preempt
instead of iterating direction over the endpoint state map (`_endpoint_state_map` in gossiper or via `get_endpoint_states()`) so to prevent use-after-free that may potentially happen if the map is rehashed while the function yields causing invalidation of the loop iterators.

\Fixes #13899

\Closes #13900

* github.com:scylladb/scylladb:
  storage_service: do not preempt while traversing endpoint_state_map
  gossiper: do not preempt while traversing endpoint_state_map

(cherry picked from commit d2d53fc1db)

Closes #16431
2023-12-18 09:35:42 +02:00
Yaron Kaikov
5052890ae8 release: prepare for 5.2.12 scylla-5.2.12 2023-12-17 14:28:03 +02:00
Kefu Chai
0da3453f95 db: schema_tables: capture reference to temporary value by value
`clustering_key_columns()` returns a range view, and `front()` returns
the reference to its first element. so we cannot assume the availability
of this reference after the expression is evaluated. to address this
issue, let's capture the returned range by value, and keep the first
element by reference.

this also silences warning from GCC-13:

```
/home/kefu/dev/scylladb/db/schema_tables.cc:3654:30: error: possibly dangling reference to a temporary [-Werror=dangling-reference]
 3654 |     const column_definition& first_view_ck = v->clustering_key_columns().front();
      |                              ^~~~~~~~~~~~~
/home/kefu/dev/scylladb/db/schema_tables.cc:3654:79: note: the temporary was destroyed at the end of the full expression ‘(& v)->view_ptr::operator->()->schema::clustering_key_columns().boost::iterator_range<__gnu_cxx::__normal_iterator<const column_definition*, std::vector<column_definition> > >::<anonymous>.boost::iterator_range_detail::iterator_range_base<__gnu_cxx::__normal_iterator<const column_definition*, std::vector<column_definition> >, boost::iterators::random_access_traversal_tag>::<anonymous>.boost::iterator_range_detail::iterator_range_base<__gnu_cxx::__normal_iterator<const column_definition*, std::vector<column_definition> >, boost::iterators::bidirectional_traversal_tag>::<anonymous>.boost::iterator_range_detail::iterator_range_base<__gnu_cxx::__normal_iterator<const column_definition*, std::vector<column_definition> >, boost::iterators::incrementable_traversal_tag>::front()’
 3654 |     const column_definition& first_view_ck = v->clustering_key_columns().front();
      |                                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
```

Fixes #13720
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13721

(cherry picked from commit 135b4fd434)
2023-12-15 13:55:57 +02:00
Benny Halevy
6d7b2bc02f sstables: compressed_file_data_source_impl: get: throw malformed_sstable_exception on premature eof
Currently, the reader might dereference a null pointer
if the input stream reaches eof prematurely,
and read_exactly returns an empty temporary_buffer.

Detect this condition before dereferencing the buffer
and sstables::malformed_sstable_exception.

Fixes #13599

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #13600

(cherry picked from commit 77b70dbdb7)
2023-12-15 13:54:42 +02:00
Wojciech Mitros
119c8279dd rust: update wasmtime dependency
The previous version of wasmtime had a vulnerability that possibly
allowed causing undefined behavior when calling UDFs.

We're directly updating to wasmtime 8.0.1, because the update only
requires a slight code modification and the Wasm UDF feature is
still experimental. As a result, we'll benefit from a number of
new optimizations.

Fixes #13807

Closes #13804

(cherry picked from commit 6bc16047ba)
2023-12-15 13:54:42 +02:00
Michał Chojnowski
3af6dfe4ac database: fix reads_memory_consumption for system semaphore
The metric shows the opposite of what its name suggests.
It shows available memory rather than consumed memory.
Fix that.

Fixes #13810

Closes #13811

(cherry picked from commit 0813fa1da0)
2023-12-15 13:54:42 +02:00
Eliran Sinvani
0230798db3 use_statement: Covert an exception to a future exception
The use statement execution code can throw if the keyspace is
doesn't exist, this can be a problem for code that will use
execute in a fiber since the exception will break the fiber even
if `then_wrapped` is used.

Fixes #14449

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>

Closes scylladb/scylladb#14394

(cherry picked from commit c5956957f3)
2023-12-15 13:54:42 +02:00
Botond Dénes
64503a7137 Merge 'mutation_query: properly send range tombstones in reverse queries' from Michał Chojnowski
reconcilable_result_builder passes range tombstone changes to _rt_assembler
using table schema, not query schema.
This means that a tombstone with bounds (a; b), where a < b in query schema
but a > b in table schema, will not be emitted from mutation_query.

This is a very serious bug, because it means that such tombstones in reverse
queries are not reconciled with data from other replicas.
If *any* queried replica has a row, but not the range tombstone which deleted
the row, the reconciled result will contain the deleted row.

In particular, range deletes performed while a replica is down will not
later be visible to reverse queries which select this replica, regardless of the
consistency level.

As far as I can see, this doesn't result in any persistent data loss.
Only in that some data might appear resurrected to reverse queries,
until the relevant range tombstone is fully repaired.

This series fixes the bug and adds a minimal reproducer test.

Fixes #10598

Closes scylladb/scylladb#16003

* github.com:scylladb/scylladb:
  mutation_query_test: test that range tombstones are sent in reverse queries
  mutation_query: properly send range tombstones in reverse queries

(cherry picked from commit 65e42e4166)
2023-12-14 12:53:07 +02:00
Yaron Kaikov
b013877629 build_docker.sh: Upgrade package during creation and remove sshd service
When scanning our latest docker image using `trivy` (command: `trivy
image docker.io/scylladb/scylla-nightly:latest`), it shows we have OS
packages which are out of date.

Also removing `openssh-server` and `openssh-client` since we don't use
it for our docker images

Fixes: https://github.com/scylladb/scylladb/issues/16222

Closes scylladb/scylladb#16224

(cherry picked from commit 7ce6962141)

Closes #16360
2023-12-11 10:57:16 +02:00
Botond Dénes
33d2da94ab reader_concurrency_semaphore: execution_loop(): trigger admission check when _ready_list is empty
The execution loop consumes permits from the _ready_list and executes
them. The _ready_list usually contains a single permit. When the
_ready_list is not empty, new permits are queued until it becomes empty.
The execution loops relies on admission checks triggered by the read
releasing resouces, to bring in any queued read into the _ready_list,
while it is executing the current read. But in some cases the current
read might not free any resorces and thus fail to trigger an admission
check and the currently queued permits will sit in the queue until
another source triggers an admission check.
I don't yet know how this situation can occur, if at all, but it is
reproducible with a simple unit test, so it is best to cover this
corner-case in the off-chance it happens in the wild.
Add an explicit admission check to the execution loop, after the
_ready_list is exhausted, to make sure any waiters that can be admitted
with an empty _ready_list are admitted immediately and execution
continues.

Fixes: #13540

Closes #13541

(cherry picked from commit b790f14456)
2023-12-07 16:04:55 +02:00
Paweł Zakrzewski
dac69be4a4 auth: fix error message when consistency level is not met
Propagate `exceptions::unavailable_exception` error message to the
client such as cqlsh.

Fixes #2339

(cherry picked from commit 400aa2e932)
2023-12-07 14:49:47 +02:00
Botond Dénes
763e583cf2 Merge 'row_cache: abort on exteral_updater::execute errors' from Benny Halevy
Currently the cache updaters aren't exception safe
yet they are intended to be.

Instead of allowing exceptions from
`external_updater::execute` escape `row_cache::update`,
abort using `on_fatal_internal_error`.

Future changes should harden all `execute` implementations
to effectively make them `noexcept`, then the pure virtual
definition can be made `noexcept` to cement that.

\Fixes scylladb/scylladb#15576

\Closes scylladb/scylladb#15577

* github.com:scylladb/scylladb:
  row_cache: abort on exteral_updater::execute errors
  row_cache: do_update: simplify _prev_snapshot_pos setup

(cherry picked from commit 4a0f16474f)

Closes scylladb/scylladb#16256
2023-12-07 09:16:42 +02:00
Nadav Har'El
b331b4a4bb Backport fixes for nodetool commands with Alternator GSI in the database
Fixes #16153

* java e716e1bd1d...80701efa8d (1):
  > NodeProbe: allow addressing table name with colon in it

/home/nyh/scylla/tools$ git submodule summary jmx | cat
* jmx bc4f8ea...f21550e (3):
  > ColumnFamilyStore: only quote table names if necessary
  > APIBuilder: allow quoted scope names
  > ColumnFamilyStore: don't fail if there is a table with ":" in its name

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #16296
2023-12-06 10:48:49 +02:00
Anna Stuchlik
d9448a298f doc: fix rollback in the 4.6-to-5.0 upgrade guide
This commit fixes the rollback procedure in
the 4.6-to-5.0 upgrade guide:
- The "Restore system tables" step is removed.
- The "Restore the configuration file" command
  is fixed.
- The "Gracefully shutdown ScyllaDB" command
  is fixed.

In addition, there are the following updates
to be in sync with the tests:

- The "Backup the configuration file" step is
  extended to include a command to backup
  the packages.
- The Rollback procedure is extended to restore
  the backup packages.
- The Reinstallation section is fixed for RHEL.

Refs https://github.com/scylladb/scylladb/issues/11907

This commit must be backported to branch-5.4, branch-5.2, and branch-5.1

Closes scylladb/scylladb#16155

(cherry picked from commit 1e80bdb440)
2023-12-05 15:10:21 +02:00