Commit Graph

38153 Commits

Author SHA1 Message Date
Pavel Emelyanov
ec4040496b system_keyspace: Reuse container() and _db member for flushing
The set_scylla_local_param_as() wants to flush replica::database on all
shards. For that it uses smp::invoke_on_all() and qctx, but since the
method is now non-static one for system_keyspace it can enjoy usiing
container().invoke_on_all() and this->_db (on target shard)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-07-31 16:02:21 +03:00
Pavel Emelyanov
1ac4b7d2fe system_keyspace: Make [gs]et_scylla_local_param_as() class methods
These are now two .cc-local templatized helpers, but they are only
called by system_keyspace:: non-static methods, so can be such as well

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-07-31 16:02:18 +03:00
Pavel Emelyanov
04b12d24fd system_keyspace: De-static [gs]et_scylla_local_param()
All same-class callers are now non-static methods of system_keyspace,
all external callers do it via an object at hand.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-07-31 16:02:18 +03:00
Nadav Har'El
04e5082d52 alternator: limit expression length and recursion depth
DynamoDB limits of all expressions (ConditionExpression, UpdateExpression,
ProjectionExpression, FilterExpression, KeyConditionExpression) to just
4096 bytes. Until now, Alternator did not enforce this limit, and we had
an xfailing test showing this.

But it turns out that not enforcing this limit can be dangerous: The user
can pass arbitrarily-long and arbitrarily nested expressions, such as:

    a<b and (a<b and (a<b and (a<b and (a<b and (a<b and (...))))))

or
    (((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((

and those can cause recursive algorithms in Alternator's parser and
later when applying expressions to recurse very deeply, overflow the
stack, and crash.

This patch includes new tests that demonstrate how Scylla crashes during
parsing before enforcing the 4096-byte length limit on expressions.
The patch then enforces this length limit, and these tests stop crashing.
We also verify that deeply-nested expressions shorter than the 4096-byte
limit are apparently short enough for our recursion ability, and work
as expected.

Unforuntately, running these tests many times showed that the 4096-byte
limit is not low enough to avoid all crashes so this patch needs to do
more:

The parsers created by ANTLR are recursive, and there is no way to limit
the depth of their recursion (i.e., nothing like YACC's YYMAXDEPTH).
Very deep recursion can overflow the stack and crash Scylla. After we
limited the length of expression strings to 4096 bytes this was *almost*
enough to prevent stack overflows. But unfortunetely the tests revealed
that even limited to 4096 bytes, the expression can sometimes recurse
too deeply: Consider the expression "((((((....((((" with 4000 parentheses.
To realize this is a syntax error, the parser needs to do a recursive
call 4000 times. Or worse - because of other Antlr limitations (see rants
in comments in expressions.g) it's actually 12000 recursive calls, and
each of these calls have a pretty large frame. In some cases, this
overflows the stack.

The solution used in this patch is not pretty, but works. We add to rules
in alternator/expressions.g that recurse (there are two of those - "value"
and "boolean_expression") an integer "depth" parameter, which we increase
when the rule recurses. Moreover, we add a so-called predicate
"{depth<MAX_DEPTH}?" that stops the parsing when this limit is reached.
When the parsing is stopped, the user will see a special kind of parse
error, saying "expression nested too deeply".

With this last modification to expressions.g, the tests for deeply-nested but
still-below-4096-bytes expressions
(test_limits.py::test_deeply_nested_expression_*) would not fail sporadically
as they did without it.

While adding the "expression nested too deeply" case, I also made the
general syntax-error reporting in Alternator nicer: It no longer prints
the internal "expression_syntax_error" type name (an exception type will
only be printed if some sort of unexpected exception happens), and it
prints the character position where the syntax error (or too deep
nested expression) was recognized.

Fixes #14473

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #14477
2023-07-31 08:57:54 +03:00
Botond Dénes
a637ddd09c Merge 'cql: add missing functions for the COUNTER column type' from Nadav Har'El
We have had support for COUNTER columns for quite some time now, but some functionality was left unimplemented - various internal and CQL functions resulted in "unimplemented" messages when used, and the goal of this series is to fix those issues. The primary goal was to add the missing support for CASTing counters to other types in CQL (issue #14501), but we also add the missing CQL  `counterasblob()` and `blobascounter()` functions (issue #14742).

As usual, the series includes extensive functional tests for these features, and one pre-existing test for CAST that used to fail now begins to pass.

Fixes #14501
Fixes #14742

Closes #14745

* github.com:scylladb/scylladb:
  test/cql-pytest: test confirming that casting to counter doesn't work
  cql: support casting of counter to other types
  cql: implement missing counterasblob() and blobascounter() functions
  cql: implement missing type functions for "counters" type
2023-07-31 08:55:45 +03:00
Nadav Har'El
b55b8f29b9 test/cql-pytest: test confirming that casting to counter doesn't work
In the previous patch we implemented CAST operations from the COUNTER
type to various other types. We did not implement the reverse cast,
from different types to the counter type. Should we? In this patch
we add a test that shows we don't need to bother - Cassandra does not
support such casts, so it's fine that we don't too - and indeed the
test shows we don't support them.
It's not a useful operation anyway.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-07-30 20:16:25 +03:00
Nadav Har'El
b513bba201 cql: support casting of counter to other types
We were missing support in the "CAST(x AS type)" function for the counter
type. This patch adds this support, as well as extensive testing that it
works in Scylla the same as Cassandra.

We also un-xfail an existing test translated from Cassandra's unit
test. But note that this old test did not cover all the edge-cases that
the new test checks - some missing cases in the implementation were
not caught by the old test.

Fixes #14501

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-07-30 20:16:25 +03:00
Nadav Har'El
c1762750ed cql: implement missing counterasblob() and blobascounter() functions
Code in functions.cc creates the different TYPEasblob() and blobasTYPE()
functions for all type names TYPE. The functions for the "counter" type
were skipped, supposedly because "counters are not supported yet". But
counters are supported, so let's add the missing functions.

The code fix is trivial, the tests that verify that the result behaves
like Cassandra took more work.

After this patch, unimplemented::cause::COUNTERS is no longer used
anywhere in the code. I wanted to remove it, but noticed that
unimplemented::cause is a graveyard of unused causes, so decided not
to remove this one either. We should clean it up in a separate patch.

Fixes #14742

Also includes tests for tangently-related issues:
Refs #12607
Refs #14319

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-07-30 20:16:25 +03:00
Nadav Har'El
d9c2cd3024 cql: implement missing type functions for "counters" type
types.cc had eight of its functions unimplemented for the "counters"
types, throwing an "unimplemented::cause::COUNTERS" when used.
A ninth function (validate) was unimplemented for counters but did not
even throw.
Many code paths did not use any of these functions so didn't care, but
some do - e.g., the silly do-nothing "SELECT CAST(c AS counter)" when
c is already a counter column, which causes this operation to fail.

When the types.cc code encounters a counter value, it is (if I understand
it correctly) already a single uint64_t ("long_type") value, so we fall
back to the long_type implementation of all the functions. To avoid mistakes,
I simply copied the reversed_type implementation for all these functions -
whereas the reversed_type implementation falls back to using the underlying
type, the counter_type implementation always falls back to long_type.

After this patch, "SELECT CAST(c AS counter)" for a counter column works.
We'll introduce a test that verifies this (and other things) in a later
patch in this series.

The following patches will also need more of these functions to be
implemented correctly (e.g., blobascounter() fails to validate the size
of the input blob if the validate function isn't implemented for the
counter type).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-07-30 20:16:25 +03:00
Avi Kivity
accd6271bc Merge 'tools: introduce tool_app_template and migrate all tools to it' from Botond Dénes
The scaffolding required to have a working scylla tool app, is considerable, leading to a large amount of boilerplate code in each such app.  This logic is also very similar across the two tool apps we have and would presumably be very similar in any future app. This PR extracts this logic into `tools/utils.hh` and introduces `tool_app_template`, which is similar to `seastar::app_template`  in that it centralizes all the option handling and more in a single class, that each tool has to just instantiate and then call `run()` to run the app.
This cuts down on the repetition and boilerplate in our current tool apps and make prototyping new tool apps much easier.

Closes #14855

* github.com:scylladb/scylladb:
  tools/utils.hh: remove unused headers
  tools/utils: make get_selected_operation() and configure_tool_mode() private
  tools/utils.hh: de-template get_selected_operation()
  tools/scylla-types: migrate to tools_app_template
  tools/scylla-types: prepare for migration to tool_app_template
  tools/scylla-sstable.cc: fix indentation
  tools/scylla-sstables: migrate to tool_app_template
  tools/scylla-sstables: prepare for migration to tool_app_template
  tools: extract tool app skeleton to utils.hh
2023-07-30 18:31:10 +03:00
Pavel Emelyanov
b8d1c7fc0b sstables-format-selector: Add and use system_keyspace dependency
The selector keeps selected format in system.local and uses static
db::system_keyspace::(get|set)_scylla_local_param() helpers to access
it. The helpers are turning into non-static so the selector should call
those on system_keyspace object, not class

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #14871
2023-07-30 18:12:16 +03:00
Avi Kivity
1c3d22b717 build: update frozen toolchain to Fedora 38
This refreshes clang to 16.0.6 and libstdc++ to 13.1.1.

compiler-rt, libasan, and libubsan are added to install-dependencies.sh
since they are no longer pulled in as depdendencies.

Closes #13730
2023-07-30 03:08:48 +03:00
Avi Kivity
14dee7a946 Revert "build: build with -O0 if Clang >= 16 is used"
This reverts commit fb05fddd7d. After
1554b5cb61 ("Update seastar submodule"),
which fixed a coroutine bug in Seastar, it is no longer necessary.

Also revert the related "build: drop the warning on -O0 might fail tests"
(894039d444).
2023-07-29 08:07:04 +03:00
Avi Kivity
1554b5cb61 Update seastar submodule
* seastar c0e618bbb...0784da876 (11):
  > Revert "metrics: Remove registered_metric::operator()"
  > build: use new behavior defined by CMP0127
  > build: pass -DBOOST_NO_CXX98_FUNCTION_BASE to C++ compiler
  > coroutine: fix a use-after-free in maybe_yield
Ref #13730.
  > Merge 'sstring: add more accessors' from Kefu Chai
  > Merge 'semaphore: semaphore_units: return units when reassigned' from Benny Halevy
  > metrics: do not define defaulted copy assignment operator
  > HTTP headers in http_response are now case insensitive
  > rpc: Make server._proto a reference
  > Merge 'Cleanup class metrics::registered_metrics' from Pavel Emelyanov
  > core: undefine fallthrough to fix compilation error

Closes #14862
2023-07-28 23:45:30 +03:00
Tomasz Grabiec
4e9d95d78c Merge 'Compact data before streaming' from Botond Dénes
Currently, streaming and repair processes and sends data as-is. This is wasteful: streaming might be sending data which is expired or covered by tombstones, taking up valuable bandwidth and processing time. Repair additionally could be exposed to artificial differences, due to different nodes being in different states of compactness.
This PR adds opt-in compaction to `make_streaming_reader()`, then opts in all users. The main difference being in how these choose the current compaction time to use:
* Load'n'stream and streaming uses the current time on the local node.
* Repair uses a centrally chosen compaction time, generated on the repair master and propagated to al repair followers. This is to ensure all repair participants work with the exact state of compactness.

 Importantly, this compaction does *not* purge tombstones (tombstone GC is disabled completely).

Fixes: https://github.com/scylladb/scylladb/issues/3561

Closes #14756

* github.com:scylladb/scylladb:
  replica: make_[multishard_]streaming_reader(): make compaction_time mandatory
  repair/row_level: opt in to compacting the stream
  streaming: opt-in to compacting the stream
  sstables_loader: opt-in for compacting the stream
  replica/table: add optional compacting to make_multishard_streaming_reader()
  replica/table: add optional compacting to make_streaming_reader()
  db/config: add config item for enabling compaction for streaming and repair
  repair: log the error which caused the repair to fail
  readers: compacting_reader: use compact_mutation_state::abandon_current_partition()
  mutation/mutation_compactor: allow user to abandon current partition
2023-07-28 16:42:13 +02:00
Pavel Emelyanov
24fdd4297b schema_tables: Use query_processor argument in save_system_schema()
... instead of global qctx. The now used qctx->execute_cql() just calls
the query_processor::execute_internal with cache_internal::yes

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #14874
2023-07-28 15:55:16 +02:00
Kefu Chai
cc2bbde8f1 test: use BOOST_CHECK_EQUAL when appropriate in compaction_manager_basic_test
compaction_manager_basic_test checks the stats of compaction_manager to
verify that there are no ongoing or pending compactions after the triggering
the compaction and waiting for its completion. but in #14865, there are
still active compaction(s) after the compaction_manager's stats shows there
is at least one task completed.

to understand this issue better, let's use `BOOST_CHECK_EQUAL()` instead
of `BOOST_REQUIRE()`, so that the test does not error out when the check
fails, and we can have better understanding of the status when the test
fails.

Refs #14865
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14872
2023-07-28 15:45:07 +03:00
Botond Dénes
1eca60fe10 tools/utils.hh: remove unused headers 2023-07-28 08:41:34 -04:00
Botond Dénes
cbcb20f0f9 tools/utils: make get_selected_operation() and configure_tool_mode() private
Their only user is in tools/utils.cc, so move them there, into an
anonymous namespace.
2023-07-28 08:41:34 -04:00
Botond Dénes
fc0c87002c tools/utils.hh: de-template get_selected_operation()
It now has a single user, so it doesn't have to be a template.
For now, make the method inline, so it can stay in the header. It will
be moved to utils.cc in the next patch.
2023-07-28 08:41:16 -04:00
Botond Dénes
8caf258539 tools/scylla-types: migrate to tools_app_template
Discard the locally coded app skeleton and reuse the tool app template
instead. Reduces boilerplate greatly.
2023-07-28 08:30:53 -04:00
Botond Dénes
68a452be00 tools/scylla-types: prepare for migration to tool_app_template
Make options more declarative and create a local reference to
app.configuration() in the main lambda. To faciliate further patching.
2023-07-28 08:30:53 -04:00
Botond Dénes
7598c23359 tools/scylla-sstable.cc: fix indentation
Broken in the previous patch.
2023-07-28 08:30:53 -04:00
Botond Dénes
d082622ab9 tools/scylla-sstables: migrate to tool_app_template
Removing a great amount of boilerplate, streamlinging the main method.
2023-07-28 08:30:53 -04:00
Botond Dénes
092650b20b tools/scylla-sstables: prepare for migration to tool_app_template
Make options more declarative. To facilitate further patching.
2023-07-28 08:30:53 -04:00
Botond Dénes
89d7d80fce tools: extract tool app skeleton to utils.hh
The skeleton of the two existing scylla-native tools (scylla-types and
scylla-sstable) is very similar. By skeleton, I mean all the boilerplate
around creating and configuring a seastar::app_template, representing
operations/command and their options, and presenting and selecting
these.
To facilitate code-sharing and quick development of any new tools,
extract this skeleton from scylla-sstable.cc into tools/utils.hh,
in the form of a new tool_app_template, which wraps a
seastar::app_template and centralizes all the boilerplate logic in a
single place. The extracted code is not a simple copy-paste, although
many elements are simply copied. The original code is not removed yet.
2023-07-28 08:30:53 -04:00
Botond Dénes
3a51053e66 Merge 'De-static system_keyspace::*_group0_* methods' from Pavel Emelyanov
These are users of global `qctx` variable or call `(get|set)_scylla_local_param(_as)?` which, in turn, also reference the `qctx`. Unfortunately, the latter(s) are still in use by other code and cannot be marked non-static in this PR

Closes #14869

* github.com:scylladb/scylladb:
  system_keyspace: De-static set_raft_group0_id()
  system_keyspace: De-static get_raft_group0_id()
  system_keyspace: De-static get_last_group0_state_id()
  system_keyspace: De-static group0_history_contains()
  raft: Add system_keyspace argument to raft_group0::join_group0()
2023-07-28 14:53:22 +03:00
Kefu Chai
df041c7dc8 build: cmake: add missing source file
TLS certificate authenticator registers itself using a
`class_registrator`. that's why CMake is able to build without
compiling this source file. but for the sake of completeness, and
to be sync with configure.py, let's add it to CMake.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14866
2023-07-28 14:30:58 +03:00
Pavel Emelyanov
d311784721 system_keyspace: De-static set_raft_group0_id()
The caller is group0 code with sys_ks local variable

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-07-28 13:13:59 +03:00
Pavel Emelyanov
7837bc7d5a system_keyspace: De-static get_raft_group0_id()
The callers are in group0 code that have sys_ks local variable/argument

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-07-28 13:13:11 +03:00
Pavel Emelyanov
26dd7985a8 system_keyspace: De-static get_last_group0_state_id()
The caller is raft_group0_client with sys.ks. dependency reference and
group0_state_machine with raft_group0_client exporing its sys.ks.

This makes it possible to instantly drop one more qctx reference

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-07-28 13:12:04 +03:00
Pavel Emelyanov
3de0efd32c system_keyspace: De-static group0_history_contains()
The caller is raft_group0_client with sys.ks. dependency reference.
This allows to drop one qctx reference right at once

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-07-28 13:11:08 +03:00
Pavel Emelyanov
0dbe83ce89 raft: Add system_keyspace argument to raft_group0::join_group0()
The method will need one to access db::system_keyspace methods. The
sys.ks. is at hand and in use in both callers

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-07-28 13:10:24 +03:00
Avi Kivity
d73a393670 main: increase Seastar reactor task quota in debug mode
Debug mode is so slow that the work:poll ratio decreases, leading
to even more slowness as more polling is done for the same amount
of work.

Increase the task quota to recover some performance.

Ref #14752.

Closes #14820
2023-07-28 10:34:18 +03:00
Avi Kivity
cf81eef370 Merge 'schema_mutations, migration_manager: Ignore empty partitions in per-table digest' from Tomasz Grabiec
Schema digest is calculated by querying for mutations of all schema
tables, then compacting them so that all tombstones in them are
dropped. However, even if the mutation becomes empty after compaction,
we still feed its partition key. If the same mutations were compacted
prior to the query, because the tombstones expire, we won't get any
mutation at all and won't feed the partition key. So schema digest
will change once an empty partition of some schema table is compacted
away.

Tombstones expire 7 days after schema change which introduces them. If
one of the nodes is restarted after that, it will compute a different
table schema digest on boot. This may cause performance problems. When
sending a request from coordinator to replica, the replica needs
schema_ptr of exact schema version request by the coordinator. If it
doesn't know that version, it will request it from the coordinator and
perform a full schema merge. This adds latency to every such request.
Schema versions which are not referenced are currently kept in cache
for only 1 second, so if request flow has low-enough rate, this
situation results in perpetual schema pulls.

After ae8d2a550d (5.2.0), it is more liekly to
run into this situation, because table creation generates tombstones
for all schema tables relevant to the table, even the ones which
will be otherwise empty for the new table (e.g. computed_columns).

This change inroduces a cluster feature which when enabled will change
digest calculation to be insensitive to expiry by ignoring empty
partitions in digest calculation. When the feature is enabled,
schema_ptrs are reloaded so that the window of discrepancy during
transition is short and no rolling restart is required.

A similar problem was fixed for per-node digest calculation in
c2ba94dc39e4add9db213751295fb17b95e6b962. Per-table digest calculation
was not fixed at that time because we didn't persist enabled features
and they were not enabled early-enough on boot for us to depend on
them in digest calculation. Now they are enabled before non-system
tables are loaded so digest calculation can rely on cluster features.

Fixes #4485.

Manually tested using ccm on cluster upgrade scenarios and node restarts.

Closes #14441

* github.com:scylladb/scylladb:
  test: schema_change_test: Verify digests also with TABLE_DIGEST_INSENSITIVE_TO_EXPIRY enabled
  schema_mutations, migration_manager: Ignore empty partitions in per-table digest
  migration_manager, schema_tables: Implement migration_manager::reload_schema()
  schema_tables: Avoid crashing when table selector has only one kind of tables
2023-07-28 00:01:33 +03:00
Anna Stuchlik
8ee6f6ecb6 doc: add the requirement to upgrade drivers
This commit adds a requirement to upgrade ScyllaDB
drivers before upgrading ScyllaDB.

The requirement to upgrade the Monitoring Stack
has been moved to the new section so that
both prerequisites are documented together.

NOTE: The information is added to the 5.2-to-5.3
upgrade guide because all future upgrade guides
will be based on this one (as it's the latest one).

If 5.3 is released, this commit should be backported
to branch-5.3.

Refs https://github.com/scylladb/scylladb/issues/13958

Closes #14771
2023-07-27 15:21:38 +02:00
Patryk Jędrzejczak
b81a6037f1 test: pylib: ensure ScyllaCluster.add_server does not start a second cluster
If the cluster isn't empty and all servers are stopped, calling
ScyllaCluster.add_server can start a new cluster. That's because
ScyllaCluster._seeds uses the running servers to calculate the
seed node list, so if all nodes are down, the new node would
select only itself as a seed, starting a new cluster.

As a single ScyllaCluster should describe a single cluster, we
make ScyllaCluster.add_server fail when called on a non-empty
cluster with all its nodes stopped.

Closes #14804
2023-07-27 13:27:23 +02:00
Botond Dénes
7351c8424d mutation/mutation_rebuilder: add comment about validity of returned mutation reference
Closes #14853
2023-07-27 12:13:46 +03:00
Alexey Novikov
ff721ec3e3 make timestamp string format cassandra compatible
when we convert timestamp into string it must look like: '2017-12-27T11:57:42.500Z'
it concerns any conversion except JSON timestamp format
JSON string has space as time separator and must look like: '2017-12-27 11:57:42.500Z'
both formats always contain milliseconds and timezone specification

Fixes #14518
Fixes #7997

Closes #14726
2023-07-27 12:01:09 +03:00
Botond Dénes
b599f15b26 replica: make_[multishard_]streaming_reader(): make compaction_time mandatory
Now that all users have opted in unconditionally, there is no point in
keeping this optional. Make it mandatory to make sure there are no
opt-out by mistake.
The global override via enable_compacting_data_for_streaming_and_repair
config item still remains, allowing compaction to be force turned-off.
2023-07-27 04:57:52 -04:00
Botond Dénes
fdaf908967 repair/row_level: opt in to compacting the stream
Using a centrally generated compaction-time, generated on the repair
master and propagated to all repair followers. For repair it is
imperative that all participants use the exact same compaction time,
otherwise there can be artificial differences between participants,
generating unnecessary repair activity.
If a repair follower doesn't get a compaction-time from the repair
master, it uses a locally generated one. This is no worse than the
previous state of each node being on some undefined state of compaction.
2023-07-27 04:57:50 -04:00
Botond Dénes
5452fd1ce4 streaming: opt-in to compacting the stream
Use locally generated compaction time on each node. This could lead to
different nodes making different decisions on what is expired or not.
But this is already the case for streaming, as what exactly is expired
depends on when compaction last run.
2023-07-27 03:22:11 -04:00
Botond Dénes
5a73c3374e sstables_loader: opt-in for compacting the stream
No point in loading expired/covered data.
2023-07-27 03:22:11 -04:00
Botond Dénes
2f8d77e97b replica/table: add optional compacting to make_multishard_streaming_reader()
Doing to make_multishard_streaming_reader() what the previous commit did
to make_streaming_reader(). In fact, the new compaction_time parameter
is simply forwarded to the make_streaming_reader() on the shard readers.

Call sites are updated, but none opt in just yet.
2023-07-27 03:22:11 -04:00
Botond Dénes
42b0dd5558 replica/table: add optional compacting to make_streaming_reader()
Opt-in is possible by passing an engaged `compaction_time`
(gc_clock::time_point) to the method. When this new parameter is
disengaged, no compaction happens.
Note that there is a global override, via the
enable_compacting_data_for_streaming_and_repair config item, which can
force-disable this compaction.
Compaction done on the output of the streaming reader does *not*
garbage-collect tombstones!

All call-sites are adjusted (the new parameter is not defaulted), but
none opt in yet. This will be done in separate commit per user.
2023-07-27 03:22:11 -04:00
Botond Dénes
9e3987fc96 db/config: add config item for enabling compaction for streaming and repair
Compacting can greatly reduce the amount of data to be processed by
streaming and repair, but with certain data shapes, its effectiveness
can be reduced and its CPU overhead might outweight the benefits. This
should very rarely be the case, but leave an off switch in case
this becomes a problem in a deployment.
Not wired yet.
2023-07-27 03:22:11 -04:00
Botond Dénes
a22446afe0 repair: log the error which caused the repair to fail
Instead of just a boolean _failed flag, persist the error message of the
exception which caused the repair to fail, and include it in the log
message announcing the failure.
2023-07-27 03:22:11 -04:00
Botond Dénes
ac44efea11 readers: compacting_reader: use compact_mutation_state::abandon_current_partition()
When next_partition() or fast_forward_to() is called. Instead of trying
to simulate a properly closed partition by injecting synthetic mutation
fragments to properly close it.
2023-07-27 02:50:44 -04:00
Botond Dénes
326c3b92e5 mutation/mutation_compactor: allow user to abandon current partition
Currently, the compactor requires a valid stream and thus abandoning a
partition in the middle was not possible. This causes some complications
for the compacting reader, which implements methods such as
`next_partition()` which is possibly called in the middle of a
partition. In this case the compacting reader attempts to close the
partition properly by inserting a synthetic partition-end fragment into
the stream. This is not enough however as it doesn't close any range
tombstone changes that might be active. Instead of piling on more
complexity, add an API to the compactor which allows abandoning the
current partition.
2023-07-27 02:50:44 -04:00
Kefu Chai
1b7bde2e9e compaction_manager: use range in compacting_sstable_registration
simpler than the "begin, end" iterator pair. and also tighten the
type constraints, now require the value type to be
sstables::shared_sstable. this matches what we are expecting in
the implementation.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14678
2023-07-27 09:40:20 +03:00