Commit Graph

39478 Commits

Aleksandra Martyniuk
5e665cd7fb test: test drop table on receiver side during streaming
(cherry picked from commit 2ea5d9b623)
2024-02-26 13:00:58 +01:00
Aleksandra Martyniuk
b770be8f78 streaming: fix indentation
(cherry picked from commit b08f539427)
2024-02-26 12:53:38 +01:00
Aleksandra Martyniuk
b5ff9a2bf8 streaming: handle no_such_column_family from remote node gracefully
If no_such_column_family is thrown on the remote node, the streaming
operation fails, as the type of the exception cannot be determined.

Use repair::with_table_drop_silenced in streaming to continue
operation if a table was dropped.

(cherry picked from commit 219e1eda09)
2024-02-26 10:15:32 +01:00
Aleksandra Martyniuk
0da3772d50 repair: add methods to skip dropped table
Schema propagation is async so one node can see the table while on
the other node it is already dropped. So, if the nodes stream
the table data, the latter node throws no_such_column_family.
The exception is propagated to the other node, but its type is lost,
so the operation fails on the other node.

Add a method which waits until all Raft changes are applied and then
checks whether a given table exists.

Add a function which uses the above to determine whether an operation
failed because of a dropped table (e.g. on the remote node, where the exact
exception type is unknown). If so, the exception isn't rethrown.

(cherry picked from commit 5202bb9d3c)
2024-02-26 10:10:37 +01:00
Nadav Har'El
72e804306c mv: fix missing view deletions in some cases of range tombstones
For efficiency, if a base-table update generates many view updates that
go to the same partition, they are collected as one mutation. If this
mutation grows too big it can lead to memory exhaustion, so since
commit 7d214800d0 we split the output
mutation into mutations no longer than 100 rows (max_rows_for_view_updates)
each.

This patch fixes a bug where this split was done incorrectly when
the update involved range tombstones, a bug which was discovered by
a user in a real use case (#17117).

Range tombstones are read in two parts, a beginning and an end, and the
code could split the processing between these two parts, with the result
that some of the range tombstones in the update could be missed - and the
view could miss some deletions that happened in the base table.

This patch fixes the code in two places to avoid breaking up the
processing between range tombstones:

1. The counter "_op_count" that decides where to break the output mutation
   should only be incremented when adding rows to this output mutation.
   The existing code strangely incremented it on every read (!?) which
   resulted in the counter being incremented on every *input* fragment,
   and in particular could reach the limit 100 between two range
   tombstone pieces.

2. Moreover, the length of output was checked in the wrong place...
   The existing code could get to 100 rows, not check at that point,
   read the next input - half a range tombstone - and only *then*
   check that we reached 100 rows and stop. The fix is to calculate
   the number of rows in the right place - exactly when it's needed,
   not before the step.
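The two fixes can be sketched together in an illustrative Python model (the fragment kinds, `MAX_ROWS`, and `build_chunks` are assumptions, not the actual C++ code): only output rows advance the counter, and the cut happens only at a row boundary, so a tombstone's begin/end halves always land in the same chunk.

```python
MAX_ROWS = 100  # stands in for max_rows_for_view_updates

def build_chunks(fragments, is_row):
    """Group view-update fragments into chunks of at most MAX_ROWS rows.

    Only output *rows* count toward the limit (fix 1), and the limit is
    checked exactly where a cut is possible, never between the two halves
    of a range tombstone (fix 2). is_row(f) tells rows from tombstone parts.
    """
    chunks, current, rows = [], [], 0
    for frag in fragments:
        if rows >= MAX_ROWS and is_row(frag):
            # cut only at a row boundary, so tombstone begin/end stay together
            chunks.append(current)
            current, rows = [], 0
        current.append(frag)
        if is_row(frag):
            rows += 1
    if current:
        chunks.append(current)
    return chunks
```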

The first change needs more justification: the old code, which incremented
_op_count on every input fragment and not just output fragments, did not
fit the stated goal of its introduction - to avoid large allocations.
In one test it resulted in breaking up the output mutation to chunks of
25 rows instead of the intended 100 rows. But, maybe there was another
goal, to stop the iteration after 100 *input* rows and avoid the possibility
of stalls if there are no output rows? It turns out the answer is no -
we don't need this _op_count increment to avoid stalls: The function
build_some() uses `co_await on_results()` to run one step of processing
one input fragment - and `co_await` always checks for preemption.
I verified that indeed no stalls happen by using the existing test
test_long_skipped_view_update_delete_with_timestamp. It generates a
very long base update where all the view updates go to the same partition,
but all but the last few updates don't generate any view updates.
I confirmed that the fixed code loops over all these input rows without
increasing _op_count and without generating any view update yet, but it
does NOT stall.

This patch also includes two tests reproducing this bug and confirming
it's fixed, and also two additional tests for breaking up long deletions
that I wanted to make sure don't fail after this patch (they don't).

By the way, this fix would have also fixed issue #12297 - which we
fixed a year ago in a different way. That issue happened when the code
went through 100 input rows without generating *any* output rows,
and incorrectly concluded that there's no view update to send.
With this fix, the code no longer stops generating the view
update just because it saw 100 input rows - it would have waited
until it generated 100 output rows in the view update (or the
input is really done).

Fixes #17117

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#17164

(cherry picked from commit 14315fcbc3)
2024-02-22 15:04:28 +02:00
Avi Kivity
384a0628b0 Merge 'cdc: metadata: allow sending writes to the previous generations' from Patryk Jędrzejczak
Before this PR, writes to the previous CDC generations would
always be rejected. After this PR, they will be accepted if the
write's timestamp is greater than `now - generation_leeway`.
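The acceptance rule stated above can be sketched like this (a hedged Python model; the function name and the leeway value are illustrative, not Scylla's actual code):

```python
from datetime import datetime, timedelta, timezone

GENERATION_LEEWAY = timedelta(seconds=5)  # illustrative value, not the real default

def accept_cdc_write(write_ts, current_generation_ts, now):
    """A write whose timestamp falls before the current generation's start
    targets a previous generation; accept it only while it is newer than
    now - GENERATION_LEEWAY."""
    if write_ts >= current_generation_ts:
        return True                       # targets the current generation
    return write_ts > now - GENERATION_LEEWAY
```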

This change was proposed around 3 years ago. The motivation was
to improve user experience. If a client generates timestamps by
itself and its clock is desynchronized with the clock of the node
the client is connected to, there could be a period during
generation switching when writes fail. We didn't consider this
problem critical because the client could simply retry a failed
write with a higher timestamp. Eventually, it would succeed. This
approach is safe because these failed writes cannot have any side
effects. However, it can be inconvenient. Writing to previous
generations was proposed to improve it.

The idea was rejected 3 years ago. Recently, it turned out that
there is a case when the client cannot retry a write with the
increased timestamp. It happens when a table uses CDC and LWT,
which makes timestamps permanent. Once Paxos commits an entry
with a given timestamp, Scylla will keep trying to apply that entry
until it succeeds, with the same timestamp. Applying the entry
involves writing to the CDC log table. If it fails, we get stuck.
It's a major bug with an unknown perfect solution.

Allowing writes to previous generations for `generation_leeway` is
a probabilistic fix that should solve the problem in practice.

Apart from this change, this PR adds tests for it and updates
the documentation.

This PR is sufficient to enable writes to the previous generations
only in the gossiper-based topology. The Raft-based topology
needs some adjustments in loading and cleaning CDC generations.
These changes won't interfere with the changes introduced in this
PR, so they are left for a follow-up.

Fixes scylladb/scylladb#7251
Fixes scylladb/scylladb#15260

Closes scylladb/scylladb#17134

* github.com:scylladb/scylladb:
  docs: using-scylla: cdc: remove info about failing writes to old generations
  docs: dev: cdc: document writing to previous CDC generations
  test: add test_writes_to_previous_cdc_generations
  cdc: generation: allow increasing generation_leeway through error injection
  cdc: metadata: allow sending writes to the previous generations

(cherry picked from commit 9bb4482ad0)

Backport note: in tests, replaced `servers_add` with loop of `server_add`
2024-02-22 12:44:24 +01:00
Wojciech Mitros
435000ee70 rust: update dependencies
The currently used versions of the "time" and "rustix" dependencies
had minor security vulnerabilities.
In this patch:
- the "rustix" crate is updated
- the "chrono" crate that we depend on was not compatible
with the version of the "time" crate that had fixes, so
we updated the "chrono" crate, which actually removed the
dependency on "time" completely.
Both updates were performed using "cargo update" on the
relevant package and the corresponding version.

Refs #15772

Closes scylladb/scylladb#17407
2024-02-19 22:12:13 +02:00
Anna Stuchlik
e691604823 doc: remove Enterprise OS support from Open Source
With this commit:
- The information about ScyllaDB Enterprise OS support
  is removed from the Open Source documentation.
- The information about ScyllaDB Open Source OS support
  is moved to the os-support-info file in the _common folder.
- The os-support-info file is included in the os-support page
  using the scylladb_include_flag directive.

This update employs the solution we added with
https://github.com/scylladb/scylladb/pull/16753.
It allows content to be added to a page dynamically,
depending on the opensource/enterprise flag.

Refs https://github.com/scylladb/scylladb/issues/15484

Closes scylladb/scylladb#17310

(cherry picked from commit ef1468d5ec)
2024-02-19 11:16:19 +02:00
Lakshmi Narayanan Sreethar
46098c5a0e replica/database: quiesce compaction before closing system tables during shutdown
During shutdown, as all system tables are closed in parallel, there is a
possibility of a race condition between compaction stoppage and the
closure of the compaction_history table. So, quiesce all the compaction
tasks before attempting to close the tables.

Fixes #15721

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#17218

(cherry picked from commit 3b7b315f6a)
2024-02-19 09:45:17 +02:00
Anna Stuchlik
b2fe98bfc6 doc: add missing redirections
This commit adds the missing redirections
to the pages whose source files were
previously stored in the install-scylla folder
and were moved to another location.

Closes scylladb/scylladb#17367

(cherry picked from commit e132ffdb60)
2024-02-19 09:18:14 +02:00
Botond Dénes
e4526449a1 query: do not kill unpaged queries when they reach the tombstone-limit
The reason we introduced the tombstone-limit
(query_tombstone_page_limit), was to allow paged queries to return
incomplete/empty pages in the face of large tombstone spans. This works
by cutting the page after the tombstone-limit amount of tombstones were
processed. If the read is unpaged, it is killed instead. This was a
mistake. First, it doesn't really make sense: the reason we introduced
the tombstone limit was to allow paged queries to process large
tombstone spans without timing out. It does not help unpaged queries.
Furthermore, the tombstone limit can kill internal queries done on
behalf of user queries, because all our internal queries are unpaged.
This can cause denial of service.

So in this patch we disable the tombstone-limit for unpaged queries
altogether; they are allowed to continue even after having processed the
configured limit of tombstones.
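The resulting behavior can be sketched as follows (an illustrative Python model with a tiny limit; the names and fragment representation are assumptions):

```python
QUERY_TOMBSTONE_PAGE_LIMIT = 4   # tiny illustrative value, not the real default

def read_page(fragments, paged):
    """After this patch: a paged read cuts a (possibly empty) short page once
    the tombstone limit is reached, while an unpaged read processes everything
    regardless of how many tombstones it sees."""
    rows, tombstones = [], 0
    for frag in fragments:
        if frag == "tombstone":
            tombstones += 1
            if paged and tombstones >= QUERY_TOMBSTONE_PAGE_LIMIT:
                return rows, True     # short page; the client will ask for more
        else:
            rows.append(frag)
    return rows, False
```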

Fixes: #17241

Closes scylladb/scylladb#17242

(cherry picked from commit f068d1a6fa)
2024-02-15 12:50:09 +02:00
Jenkins Promoter
c44bb1544d Update ScyllaDB version to: 5.4.4 2024-02-14 16:23:48 +02:00
Avi Kivity
fcfcd6d35a Regenerate frozen toolchain
For gnutls 3.8.3.

Fixes #17285.

Closes scylladb/scylladb#17291
2024-02-12 19:39:28 +02:00
Pavel Emelyanov
cf42ca0c2a Update seastar submodule
* seastar 95a38bb0...9d44e5eb (1):
  > Merge "Slowdown IO scheduler based on dispatched/completed ratio" into branch-5.4

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
scylla-5.4.3 scylla-5.4.3-candidate
2024-02-09 12:21:47 +03:00
Botond Dénes
62d8c7274a Merge 'Fix mintimeuuid() call that could crash Scylla' from Nadav Har'El
This PR fixes a bug where certain calls to the `mintimeuuid()` CQL function with large negative timestamps could crash Scylla. It turns out we already had protections in place against very large positive timestamps, but very negative timestamps could still cause bugs.

The actual fix in this series is just a few lines, but the bigger effort was improving the test coverage in this area. I added tests for the "date" type (the original reproducer for this bug used totimestamp() which takes a date parameter), and also reproducers for this bug directly, without totimestamp() function, and one with that function.

Finally this PR also replaces the assert(), which made this molehill-of-a-bug into a mountain, with a throw.
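A sketch of the hardening idea, in Python: convert a Unix-epoch millisecond timestamp to the 60-bit timeuuid tick count, raising a normal exception (instead of tripping an assert) when it does not fit. The epoch offset is the standard RFC 4122 constant; the exact bounds checked by Scylla are not shown here, so treat this as an illustration only.

```python
# 100ns ticks between the UUID epoch (1582-10-15) and the Unix epoch (RFC 4122)
UUID_EPOCH_OFFSET_100NS = 0x01B21DD213814000

def timeuuid_ticks(ts_millis):
    """Convert a millisecond timestamp to a 60-bit timeuuid tick count,
    throwing (not asserting) when the result is out of range."""
    ticks = ts_millis * 10_000 + UUID_EPOCH_OFFSET_100NS
    if not 0 <= ticks < 1 << 60:
        raise ValueError(f"timestamp {ts_millis} ms out of timeuuid range")
    return ticks
```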

Fixes #17035

Closes scylladb/scylladb#17073

* github.com:scylladb/scylladb:
  utils: replace assert() by on_internal_error()
  utils: add on_internal_error with common logger
  utils: add a timeuuid minimum, like we had maximum
  test/cql-pytest: tests for "date" type

(cherry picked from commit 2a4b991772)
2024-02-07 13:47:55 +02:00
Botond Dénes
8080c15d7a Merge '[Backport 5.4] Raft snapshot fixes' from Kamil Braun
Backports required to fix https://github.com/scylladb/scylladb/issues/16683 in 5.4:
- add an API to trigger Raft snapshot
- use the API when we restart and see that the existing snapshot is at index 0, to trigger a new one --- in order to fix broken deployments that already bootstrapped with index-0 snapshot (we may get such deployments by upgrading from 5.2)

Closes scylladb/scylladb#17123

* github.com:scylladb/scylladb:
  test_raft_snapshot_request: fix flakiness (again)
  test_raft_snapshot_request: fix flakiness
  Merge 'raft_group0: trigger snapshot if existing snapshot index is 0' from Kamil Braun
  Merge 'Add an API to trigger snapshot in Raft servers' from Kamil Braun
2024-02-07 11:52:51 +02:00
Botond Dénes
8398f361cd Merge 'doc: add the 5.4-to-2024.1 upgrade guide' from Anna Stuchlik
This PR:
- Adds the upgrade guide from ScyllaDB Open Source 5.4 to ScyllaDB Enterprise 2024.1. Note: The need to include the "Restore system tables" step in rollback has been confirmed; see https://github.com/scylladb/scylladb/issues/11907#issuecomment-1842657959.
- Removes the 5.1-to-2022.2 upgrade guide (unsupported versions).

Fixes https://github.com/scylladb/scylladb/issues/16445

Closes scylladb/scylladb#16887

* github.com:scylladb/scylladb:
  doc: fix the OSS version number
  doc: metric updates between 2024.1. and 5.4
  doc: remove the 5.1-to-2022.2 upgrade guide
  doc: add the 5.4-to-2024.1 upgrade guide

(cherry picked from commit edb983d165)
2024-02-06 12:24:26 +02:00
Anna Stuchlik
dba6070794 doc: add 2024.1 to the OSS vs. Enterprise matrix
This commit adds the information that
ScyllaDB Enterprise 2024.1 is based
on ScyllaDB Open Source 5.4
to the OSS vs. Enterprise matrix.

Closes scylladb/scylladb#16880

(cherry picked from commit a462b914cb)
2024-02-05 14:13:14 +02:00
Jenkins Promoter
0a6a52e08c Update ScyllaDB version to: 5.4.3 2024-02-04 20:35:41 +02:00
Michał Chojnowski
25c0510015 row_cache: update _prev_snapshot_pos even if apply_to_incomplete() is preempted
Commit e81fc1f095 accidentally broke the control
flow of row_cache::do_update().

Before that commit, the body of the loop was wrapped in a lambda.
Thus, to break out of the loop, `return` was used.

The bad commit removed the lambda, but didn't update the `return` accordingly.
Thus, since the commit, the statement doesn't just break out of the loop as
intended, but also skips the code after the loop, which updates `_prev_snapshot_pos`
to reflect the work done by the loop.

As a result, whenever `apply_to_incomplete()` (the `updater`) is preempted,
`do_update()` fails to update `_prev_snapshot_pos`. It remains in a
stale state, until `do_update()` runs again and either finishes or
is preempted outside of `updater`.
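The control-flow difference can be sketched in Python (a toy model; the names are illustrative): breaking out of the loop, unlike returning from it, still reaches the bookkeeping that records progress.

```python
class CacheState:
    def __init__(self):
        self.applied = 0
        self.prev_snapshot_pos = 0

def do_update(partitions, state, preempt_at=None):
    """On preemption the loop is exited with 'break' (the bad commit
    effectively returned here), so the statement after the loop still
    records how far the update got."""
    for i, _ in enumerate(partitions):
        if i == preempt_at:
            break                     # preempted: stop, but fall through below
        state.applied += 1
    state.prev_snapshot_pos = state.applied   # must run even when preempted
```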

If we read a partition processed by `do_update()` but not covered by
`_prev_snapshot_pos`, we will read stale data (from the previous snapshot),
which will be remembered in the cache as the current data.

This results in outdated data being returned by the replica.
(And perhaps something worse if range tombstones are involved.
I didn't investigate this possibility in depth.)

Note: for queries with CL>1, occurrences of this bug are likely to be hidden
by reconciliation, because the reconciled query will only see stale data if
the queried partition is affected by the bug on *all* queried replicas
at the time of the query.

Fixes #16759

Closes scylladb/scylladb#17138

(cherry picked from commit ed98102c45)
2024-02-04 14:46:26 +02:00
Kamil Braun
311e31b36f test_raft_snapshot_request: fix flakiness (again)
At the end of the test, we wait until a restarted node receives a
snapshot from the leader, and then verify that the log has been
truncated.

To check the snapshot, the test used the `system.raft_snapshots` table,
while the log is stored in `system.raft`.

Unfortunately, the two tables are not updated atomically when Raft
persists a snapshot (scylladb/scylladb#9603). We first update
`system.raft_snapshots`, then `system.raft` (see
`raft_sys_table_storage::store_snapshot_descriptor`). So after the wait
finishes, there's no guarantee the log has been truncated yet -- there's
a race between the test's last check and Scylla doing that last delete.

But we can check the snapshot using `system.raft` instead of
`system.raft_snapshots`, as `system.raft` has the latest ID. And since
1640f83fdc, storing that ID and truncating
the log in `system.raft` happens atomically.

Closes scylladb/scylladb#17106

(cherry picked from commit c911bf1a33)
2024-02-02 13:02:30 +01:00
Kamil Braun
6a6a4fde79 test_raft_snapshot_request: fix flakiness
Add workaround for scylladb/python-driver#295.

Also, an assert made at the end of the test was false; it is fixed, with
an appropriate comment added.

(cherry picked from commit 74bf60a8ca)
2024-02-02 13:02:30 +01:00
Botond Dénes
390414c99e Merge 'raft_group0: trigger snapshot if existing snapshot index is 0' from Kamil Braun
The persisted snapshot index may be 0 if the snapshot was created in an
older version of Scylla, which means a snapshot transfer won't be
triggered to a bootstrapping node. Commands present in the log may not
cover all schema changes --- group 0 might have been created through the
upgrade procedure, on a cluster with existing schema. So a
deployment with index=0 snapshot is broken and we need to fix it. We can
use the new `raft::server::trigger_snapshot` API for that.
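The startup check amounts to this (a hedged sketch; `trigger_snapshot` is the API named in the message, while `snapshot_index` and the return value are assumptions for illustration):

```python
def maybe_trigger_group0_snapshot(server):
    """An index-0 snapshot means the log may not cover all schema changes,
    so request a fresh snapshot via the trigger_snapshot API."""
    if server.snapshot_index() == 0:
        server.trigger_snapshot()
        return True
    return False
```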

Also add a test.

Fixes scylladb/scylladb#16683

Closes scylladb/scylladb#17072

* github.com:scylladb/scylladb:
  test: add test for fixing a broken group 0 snapshot
  raft_group0: trigger snapshot if existing snapshot index is 0

(cherry picked from commit 181f68f248)
2024-02-02 13:02:30 +01:00
Botond Dénes
26b812067b Merge 'Add an API to trigger snapshot in Raft servers' from Kamil Braun
This allows the user of `raft::server` to cause it to create a snapshot
and truncate the Raft log (leaving no trailing entries; in the future we
may extend the API to specify number of trailing entries left if
needed). In a later commit we'll add a REST endpoint to Scylla to
trigger group 0 snapshots.

One use case for this API is to create group 0 snapshots in Scylla
deployments which upgraded to Raft in version 5.2 and started with an
empty Raft log with no snapshot at the beginning. This causes problems:
e.g. when a new node bootstraps into the cluster, it will not receive a
snapshot that would contain both schema and group 0 history, which would
then lead to inconsistent schema state and trigger assertion failures as
observed in scylladb/scylladb#16683.

In 5.4 the logic of initial group 0 setup was changed to start the Raft
log with a snapshot at index 1 (ff386e7a44)
but a problem remains with these existing deployments coming from 5.2:
we need a way to trigger a snapshot in them (other than performing 1000
arbitrary schema changes).

Another potential use case in the future would be to trigger snapshots
based on external memory pressure in tablet Raft groups (for strongly
consistent tables).

The PR adds the API to `raft::server` and a HTTP endpoint that uses it.

In a follow-up PR, we plan to modify group 0 server startup logic to automatically
call this API if it sees that no snapshot is present yet (to automatically
fix the aforementioned 5.2 deployments once they upgrade.)

Closes scylladb/scylladb#16816

* github.com:scylladb/scylladb:
  raft: remove `empty()` from `fsm_output`
  test: add test for manual triggering of Raft snapshots
  api: add HTTP endpoint to trigger Raft snapshots
  raft: server: add `trigger_snapshot` API
  raft: server: track last persisted snapshot descriptor index
  raft: server: framework for handling server requests
  raft: server: inline `poll_fsm_output`
  raft: server: fix indentation
  raft: server: move `io_fiber`'s processing of `batch` to a separate function
  raft: move `poll_output()` from `fsm` to `server`
  raft: move `_sm_events` from `fsm` to `server`
  raft: fsm: remove constructor used only in tests
  raft: fsm: move trace message from `poll_output` to `has_output`
  raft: fsm: extract `has_output()`
  raft: pass `max_trailing_entries` through `fsm_output` to `store_snapshot_descriptor`
  raft: server: pass `*_aborted` to `set_exception` call

(cherry picked from commit d202d32f81)

Backport note: the HTTP API is only started if raft_group_registry is
started.
2024-02-02 12:35:46 +01:00
Tzach Livyatan
e83c4cc75c Update link to sizing / pricing calc
Closes scylladb/scylladb#17015

(cherry picked from commit 06a9a925a5)
2024-01-29 14:32:56 +02:00
Avi Kivity
df1843311a Merge 'Invalidate prepared statements for views when their schema changes.' from Eliran Sinvani
When a base table is altered, so are the views that might refer to the
added column (which includes "SELECT *" views and also views that might
need to use this column for row lifetime, i.e. virtual columns).
However, the query processor implementation for view change notifications
was an empty function.
Since views are tables, the query processor needs to at least treat them
as such (and maybe, in the future, also do some MV-specific stuff).
This commit adds a call to `on_update_column_family` from within
`on_update_view`.
The side effect, as of today, is that prepared statements for views
which changed due to a base table change will be invalidated.
Fixes https://github.com/scylladb/scylladb/issues/16392

This series also adds a test which fails without this fix and passes when the fix is applied.

Closes scylladb/scylladb#16897

* github.com:scylladb/scylladb:
  Add test for mv prepared statements invalidation on base alter
  query processor: treat view changes at least as table changes

(cherry picked from commit 5810396ba1)
2024-01-23 19:34:10 +02:00
Anna Stuchlik
fcaae2ea78 doc: remove upgrade for unsupported versions
This commit removes the upgrade guides
from ScyllaDB Open Source to Enterprise
for versions we no longer support.

In addition, it removes a link to
one of the removed pages from
the Troubleshooting section (the link is
redundant).

(cherry picked from commit 0ad3ef4c55)

Closes scylladb/scylladb#16913
2024-01-22 16:45:36 +02:00
David Garcia
a1b6edd5d3 docs: dynamic include based on flag
docs: extend include options

Closes scylladb/scylladb#16753

(cherry picked from commit f555a2cb05)
2024-01-19 10:14:56 +02:00
Botond Dénes
6c625e8cd3 Merge '[Backport 5.4] tasks: compaction: drop regular compaction tasks after they are finished' from Aleksandra Martyniuk
Make compaction tasks internal. Drop all internal tasks without parents
immediately after they are done.

Fixes: https://github.com/scylladb/scylladb/issues/16735
Refs: https://github.com/scylladb/scylladb/issues/16694.

Closes scylladb/scylladb#16798

* github.com:scylladb/scylladb:
  compaction: make regular compaction tasks internal
  tasks: don't keep internal root tasks after they complete
2024-01-17 09:34:08 +02:00
Anna Stuchlik
10df72ed04 doc: remove Serverless from the Drivers page
This commit removes the information about ScyllaDB Cloud Serverless,
which is no longer valid.

(cherry picked from commit 758284318a)

Closes scylladb/scylladb#16805
scylla-5.4.2-candidate scylla-5.4.2
2024-01-17 09:01:40 +02:00
Botond Dénes
d4788406d4 readers/multishard: evictable_reader::fast_forward_to(): close reader on exception
When the reader is currently paused, it is resumed, fast-forwarded, then
paused again. The fast forwarding part can throw and this will lead to
destroying the reader without it being closed first.
Add a try-catch surrounding this part in the code. Also mark
`maybe_pause()` and `do_pause()` as noexcept, to make it clear why
that part doesn't need to be in the try-catch.
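The shape of the fix, sketched in Python (method names are illustrative stand-ins for the C++ reader API): if fast-forwarding throws, close the reader before propagating, so it is never destroyed while still open.

```python
def fast_forward_paused(reader, position):
    """Resume a paused reader, fast-forward it, and pause it again,
    closing it on the way out if fast-forwarding throws."""
    reader.resume()
    try:
        reader.fast_forward(position)
    except Exception:
        reader.close()    # avoid destroying an unclosed reader
        raise
    reader.pause()        # pausing is noexcept, safe outside the try block
```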

Fixes: #16606

Closes scylladb/scylladb#16630

(cherry picked from commit 204d3284fa)
2024-01-16 16:56:57 +02:00
Aleksandra Martyniuk
081a36e34f compaction: make regular compaction tasks internal
Regular compaction tasks are internal.

Adjust test_compaction_task accordingly: modify test_regular_compaction_task,
and delete test_running_compaction_task_abort (relying on regular compaction),
whose checks are already achieved by test_not_created_compaction_task_abort.
Rename the latter.

(cherry picked from commit 6b87778ef2)
2024-01-16 11:15:41 +01:00
Aleksandra Martyniuk
c0c7de8fd1 tasks: don't keep internal root tasks after they complete
(cherry picked from commit 6b2b384c83)
2024-01-16 10:53:16 +01:00
Botond Dénes
aee9947f6c Merge '[Branch 5.4]: Major compaction: flush commitlog by forcing new active segment and flushing all tables' from Kefu Chai
Major compaction already flushes each table to make
sure it considers any mutations that are present in the
memtable for the purpose of tombstone purging.
See 64ec1c6ec6

However, tombstone purging may be inhibited by data
in commitlog segments based on `gc_time_min` in the
`tombstone_gc_state` (See f42eb4d1ce).

Flushing all sstables in the database releases
all references to commitlog segments and thereby
maximizes the potential for tombstone purging,
which is typically the reason for running major compaction.

However, flushing all tables too frequently might
result in tiny sstables.  Since, when flushing all
keyspaces using `nodetool flush`, the `force_keyspace_compaction`
API is invoked for each keyspace successively, we need a mechanism
to prevent too-frequent flushes by major compaction.

Hence a `compaction_flush_all_tables_before_major_seconds` interval
configuration option is added (defaults to 24 hours).

In the case that not all tables are flushed prior
to major compaction, we revert to the old behavior of
flushing each table in the keyspace before major-compacting it.
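The throttle described above can be sketched as a small policy object (a hedged Python model; the class and method names are assumptions, only the 24-hour default comes from the message):

```python
FLUSH_ALL_TABLES_INTERVAL_SECONDS = 24 * 3600   # the 24h default described above

class MajorCompactionFlushPolicy:
    """Flush every table in the database before a major compaction, but at
    most once per configured interval; otherwise fall back to the old
    per-keyspace flush."""
    def __init__(self):
        self._last_flush_all = None

    def should_flush_all_tables(self, now):
        if (self._last_flush_all is None
                or now - self._last_flush_all >= FLUSH_ALL_TABLES_INTERVAL_SECONDS):
            self._last_flush_all = now
            return True      # flush everything, maximizing tombstone purging
        return False         # flush only the keyspace being major-compacted
```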

Fixes scylladb/scylladb#15777

Closes scylladb/scylladb#15820

To address the conflict, the following change is also included in this changeset:

tools/scylla-nodetool: implement the cleanup command

The --jobs command-line argument is accepted but ignored, just like the
current nodetool does.

Refs: scylladb/scylladb#15588

Closes scylladb/scylladb#16756

* github.com:scylladb/scylladb:
  docs: nodetool: flush: enrich examples
  docs: nodetool: compact: fix example
  api: add /storage_service/compact
  api: add /storage_service/flush
  tools/scylla-nodetool: implement the flush command
  compaction_manager: flush_all_tables before major compaction
  database: add flush_all_tables
  api: compaction: add flush_memtables option
  test/nodetool: jmx: fix path to scripts/scylla-jmx
  scylla-nodetool, docs: improve optional params documentation
  tools/scylla-nodetool: extract keyspace/table parsing
  tools/scylla-nodetool: implement the cleanup command
  test/nodetool: rest_api_mock: add more options for multiple requests
2024-01-16 11:49:06 +02:00
Anna Stuchlik
6fdfec5282 doc: remove support for CentOS 7
This commit removes support for CentOS 7
from the docs.

The change applies to version 5.4, so it
must be backported to branch-5.4.

Refs https://github.com/scylladb/scylla-enterprise/issues/3502

In addition, this commit removes the information
about Amazon Linux and Oracle Linux, which was unnecessarily added
without request; there's no clarity over which versions
should be documented.

Closes scylladb/scylladb#16279

(cherry picked from commit af1405e517)
2024-01-16 09:57:30 +02:00
Tomasz Grabiec
50a5c5379a test: Drop tablets test
The feature will not be enabled on 5.4 so there is no point in testing it.

Closes scylladb/scylladb#16780
2024-01-15 17:02:07 +02:00
Tomasz Grabiec
938b993331 Merge 'Fix a few rare bugs in row cache' from Michał Chojnowski
This is a loose collection of fixes to rare row cache bugs flushed out by running test_concurrent_reads_and_eviction several million times. See individual commits for details.

Fixes #15483

Closes scylladb/scylladb#15945

* github.com:scylladb/scylladb:
  partition_version: fix violation of "older versions are evicted first" during schema upgrades
  cache_flat_mutation_reader: fix a broken iterator validity guarantee in ensure_population_lower_bound()
  cache_flat_mutation_reader: fix a continuity loss in maybe_update_continuity()
  cache_flat_mutation_reader: fix continuity losses during cache population races with reverse reads
  partition_snapshot_row_cursor: fix a continuity loss in ensure_entry_in_latest() with reverse reads
  cache_flat_mutation_reader: fix some cache mispopulations with reverse reads
  cache_flat_mutation_reader: fix a logic bug in ensure_population_lower_bound() with reverse reads
  cache_flat_mutation_reader: never make an unlinked last dummy continuous

(cherry picked from commit 6bcf3ac86c)
2024-01-15 16:47:56 +02:00
Botond Dénes
7971abb8e3 Update tools/java submodule
* tools/java 6e4b6f6c...84636d6a (1):
  > Update JNA dependency to 5.14.0

Fixes: https://github.com/scylladb/scylla-tools-java/issues/371
2024-01-15 15:50:14 +02:00
Aleksandra Martyniuk
65fb562ae3 tasks: keep task's children in list
If a std::vector is resized, its iterators and references may
get invalidated. While task_manager::task::impl::_children's
iterators are avoided throughout the code, references to its
elements are being used.

Since the children vector does not need random access to its
elements, change its type to std::list<foreign_task_ptr>, whose
iterators and references aren't invalidated on element insertion.

Fixes: #16380.

Closes scylladb/scylladb#16381

(cherry picked from commit 9b9ea1193c)
2024-01-15 12:49:19 +02:00
Benny Halevy
97a9f1dc7b docs: nodetool: flush: enrich examples
Provide 3 examples, like in the nodetool/compact page:
global, per-keyspace, per-table.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 310ff20e1e)
2024-01-12 15:57:39 +08:00
Benny Halevy
7f629df6fd docs: nodetool: compact: fix example
It looks like `nodetool compact standard1` is meant
to show how to compact a specified table, not a keyspace.
Note that the previous example line is for a keyspace.
So fix the table compaction example to:
`nodetool compact keyspace1 standard1`

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit d32b90155a)
2024-01-12 15:57:39 +08:00
Benny Halevy
3ff8051532 api: add /storage_service/compact
For major compacting all tables in the database.
The advantage of this api is that `commitlog->force_new_active_segment`
happens only once in `database::flush_all_tables` rather than
once per keyspace (when `nodetool compact` translates to
a sequence of `/storage_service/keyspace_compaction` calls).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit b12b142232)
2024-01-12 15:57:39 +08:00
Benny Halevy
e5dcef32ef api: add /storage_service/flush
For flushing all tables in the database.
The advantage of this api is that `commitlog->force_new_active_segment`
happens only once in `database::flush_all_tables` rather than
once per keyspace (when `nodetool flush` translates to
a sequence of `/storage_service/keyspace_flush` calls).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 1b576f358b)
2024-01-12 15:57:39 +08:00
Botond Dénes
199cfd0784 tools/scylla-nodetool: implement the flush command
(cherry picked from commit f5083f66f5)
2024-01-12 15:57:39 +08:00
Benny Halevy
5d88e997ef compaction_manager: flush_all_tables before major compaction
Major compaction already flushes each table to make
sure it considers any mutations that are present in the
memtable for the purpose of tombstone purging.
See 64ec1c6ec6

However, tombstone purging may be inhibited by data
in commitlog segments based on `gc_time_min` in the
`tombstone_gc_state` (See f42eb4d1ce).

Flushing all sstables in the database releases
all references to commitlog segments and thereby
maximizes the potential for tombstone purging,
which is typically the reason for running major compaction.

However, flushing all tables too frequently might
result in tiny sstables.  Since, when flushing all
keyspaces using `nodetool flush`, the `force_keyspace_compaction`
API is invoked for each keyspace successively, we need a mechanism
to prevent too-frequent flushes by major compaction.

Hence a `compaction_flush_all_tables_before_major_seconds` interval
configuration option is added (defaults to 24 hours).

In the case that not all tables are flushed prior
to major compaction, we revert to the old behavior of
flushing each table in the keyspace before major-compacting it.

Fixes scylladb/scylladb#15777

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 66ba983fe0)
2024-01-12 15:57:39 +08:00
Benny Halevy
7bb6386c14 database: add flush_all_tables
Flush all tables after calling force_new_active_segment
on the commitlog, to make sure all commitlog segments can
get recycled.

Otherwise, due to "false sharing", rarely-written tables
might inhibit recycling of the commitlog segments they reference.

After f42eb4d1ce, leaving segments referenced
would prevent compaction from purging some tombstones based on
the min_gc_time.

To be used in the next patch by major compaction.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit be763bea34)
2024-01-12 15:57:39 +08:00
Benny Halevy
993e6997c0 api: compaction: add flush_memtables option
When flushing is done externally, e.g. by running
`nodetool flush` prior to `nodetool compact`,
flush_memtables=false can be passed to skip flushing
of tables right before they are major-compacted.

This is useful to prevent creation of small sstables
due to excessive memtable flushing.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 1fd85bd37b)
2024-01-12 15:57:39 +08:00
Benny Halevy
8b487be054 test/nodetool: jmx: fix path to scripts/scylla-jmx
The current implementation makes no sense.

Like `nodetool_path`, base the default `jmx_path`
on the assumption that the test is run using, e.g.
```
(cd test/nodetool; pytest --nodetool=cassandra test_compact.py)
```

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 7f860d612a)
2024-01-12 15:57:39 +08:00
Benny Halevy
346e883dfc scylla-nodetool, docs: improve optional params documentation
Document the behavior if no keyspace is specified
or no table(s) are specified for a given keyspace.

Fixes scylladb/scylladb#16032

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 9324363e55)
2024-01-12 15:57:39 +08:00
Botond Dénes
ceffbdf832 tools/scylla-nodetool: extract keyspace/table parsing
Having to extract 1 keyspace and N tables from the command-line is
proving to be a common pattern among commands. Extract this into a
method, so the boilerplate can be shared. Add a forward-looking
overload as well, which will be used in the next patch.

(cherry picked from commit f082cc8273)
2024-01-12 15:57:39 +08:00