Halted background fibers render raft server effectively unusable, so
report this explicitly to the clients.
Fix: #11352Closes#11370
* github.com:scylladb/scylladb:
raft server, status metric
raft server, abort group0 server on background errors
raft server, provide a callback to handle background errors
raft server, check aborted state on public server public api's
Task manager for observing and managing long-running, asynchronous tasks in Scylla
with the interface for the user. It will allow listing of tasks, getting detailed
task status and progression, waiting for their completion, and aborting them.
The task manager will be configured with a “task ttl” that determines how long
the task status is kept in memory after the task completes.
At first it will support repair and compaction tasks, and possibly more in the future.
Currently:
Sharded `task_manager` is started in `main.cc` where it is further passed
to `http_context` for the purpose of user interface.
Task manager's tasks are implemented in two two layers: the abstract
and the implementation one. The latter is a pure virtual class which needs
to be overriden by each module. Abstract layer provides the methods that
are shared by all modules and the access to module-specific methods.
Each module can access task manager, create and manage its tasks through
`task_manager::module` object. This way data specific to a module can be
separated from the other modules.
User can access task manager rest api interface to track asynchronous tasks.
The available options consist of:
- getting a list of modules
- getting a list of basic stats of all tasks in the requested module
- getting the detailed status of the requested task
- aborting the requested task
- waiting for the requested task to finish
To enable testing of the provided api, test specific task implementation and module
are provided. Their lifetime can be simulated with the standalone test api.
These components are compiled and the tests are run in all but release build modes.
Fixes: #9809Closes#11216
* github.com:scylladb/scylladb:
test: task manager api test
task_manager: test api layer implementation
task_manager: add test specific classes
task_manager: test api layer
task_manager: api layer implementation
task_manager: api layer
task_manager: keep task_manager reference in http_context
start sharded task manager
task_manager: create task manager object
The implementation of a test api that helps testing task manager
api. It provides methods to simulate the operations that can happen
on modules and theirs task. Through the api user can: register
and unregister the test module and the tasks belonging to the module,
and finish the tasks with success or custom error.
The test api that helps testing task manager api. It can be used
to simulate the operations that can happen on modules and theirs
task. Through the api user can: register and unregister the test
module and the tasks belonging to the module, and finish the tasks
with success or custom error.
The implementation of a task manager api layer. It provides
methods to list the modules registered in task_manager, list
tasks belonging to the given module, abort, wait for or retrieve
a status of the given task.
The task manager api layer. It can be used to list the modules
registered in task_manager, list tasks belonging to the given
module, abort, wait for or retrieve a status of the given task.
Implementation of a task manager that allows tracking
and managing asynchronous tasks.
The tasks are represented by task_manager::task class providing
members common to all types of tasks. The methods that differ
among tasks of different module can be overriden in a class
inheriting from task_manager::task::impl class. Each task stores
its status containing parameters like id, sequence number, begin
and end time, state etc. After the task finishes, it is kept
in memory for configurable time or until it is unregistered.
Tasks need to be created with make_task method.
Each module is represented by task_manager::module type and should
have an access to task manager through task_manager::module methods.
That allows to easily separate and collectively manage data
belonging to each module.
We decided to extend `cql_statement` hierarchy with `strongly_consistent_modification_statement`
and `strongly_consistent_select_statement`. Statements operating on
system.broadcast_kv_store will be compiled to these new subclasses if
BROADCAST_TABLES flag is enabled.
If the query is executed on a shard other than 0 it's bounced to that shard.
In broadcast tables, raft command contains a whole program to be executed.
Sending and parsing on each node entire CQL statement is inefficient,
thus we decided to compile it to an intermediate language which can be
easily serializable.
This patch adds a definition of such a language. For now, only the following
types of statements can be compiled:
* select value where key = CONST from system.broadcast_kv_store;
* update system.broadcast_kv_store set value = CONST where key = CONST;
* update system.broadcast_kv_store set value = CONST where key = CONST if value = CONST;
where CONST is string literal.
Patch 765d2f5e46 did not
evaluate the #if SCYLLA_BUILD_MODE directives properly
and it always matched SCYLLA_BULD_MODE == release.
This change fixes that by defining numerical codes
for each build mode and using macro expansion to match
the define SCYLLA_BUILD_MODE against these codes.
Also, ./configure.py was changes to pass SCYLLA_BUILD_MODE
to all .cc source files, and makes sure it is defined
in build_mode.hh.
Support was added for coverage build mode,
and an #error was added if SCYLLA_BUILD_MODE
was not recognized by the #if ladder directives.
Additional checks verifying the expected SEASTAR_DEBUG
against SCYLLA_BUILD_MODE were added as well,
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#11387
Currently SCYLLA_BULD_MODE is defined as a string by the cxxflags
generated by configure.py. This is not very useful since one cannot use
it in a @if preprocessor directive.
Instead, use -DSCYLLA_BULD_MODE=release, for example, and define a
SCYLLA_BULD_MODE_STR as the dtirng representation of it.
In addition define the respective
SCYLLA_BUILD_MODE_{RELEASE,DEV,DEBUG,SANITIZE} macros that can be easily
used in @ifdef (or #ifndef :)) for conditional compilation.
The planned use case for it is to enable a task_manager test module only
in non-release modes.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#11357
Provides separate control over debuginfo for perf tests
since enabling --tests-debuginfo affects both today
causing the Jenkins archives of perf tests binaries to
inflate considerably.
Refs https://github.com/scylladb/scylla-pkg/issues/3060
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#11337
Currently messaging_service.o takes the longest of all core objects to
compile. For a full build of build/release/scylla, with current ninja
scheduling, on a 32-hyperthread machine, the last ~16% of the total
build time is spent just waiting on messaging_service.o to finish
compiling.
Moving the file to the top of the list makes ninja start its compilation
early and gets rid of that single-threaded tail, improving the total build
time.
Closes#11255
When starting `Build` job we have a situation when `x86` and `arm`
starting in different dates causing the all process to fail
As suggested by @avikivity , adding a date-stamp parameter and will pass
it through downstream jobs to get one release for each job
Ref: scylladb/scylla-pkg#3008
Closes#11234
Calling WebAssembly UDFs requires wasmtime instance. Creating such an instance is expensive,
but these instances can be reused for subsequent calls of the same UDF on various inputs.
This patch introduces a way of reusing wasmtime instances: a wasm instance cache.
The cache stores a wasmtime instance for each UDF and scheduling group. The instances are
evicted using LRU strategy and their size is based on the size of their wasm memories.
The instances stored in the cache are also dropped when the UDF is dropped itself. For that reason,
the first patch modifies the current implementation of UDF dropping, so that the instance dropping may be added
later. The patch also removes the need of compiling the UDF again when dropping it.
The second patch contains the implementation and use of the new cache. The cache is implemented
in `lang/wasm_instance_cache.hh` and the main ways of using it are the `run_script` methods from `wasm.hh`
The third patch adds tests to `test_wasm.py` that check the correctness and performance of the new
cache. The tests confirm the instance reuse, size limits, instance eviction after timeout and after dropping the UDF.
Closes#10306
* github.com:scylladb/scylladb:
wasm: test instances reuse
wasm: reuse UDF instances
schema_tables: simplify merge_functions and avoid extra compilation
This series is the first step in the effort to reduce the number of metrics reported by Scylla.
The series focuses on the per-table metrics.
The combination of histograms, per-tables, and per shard makes the number of metrics in a cluster explode.
The following series uses multiple tools to reduce the number of metrics.
1. Multiple metrics should only be reported for the user tables and the condition that checked it was not updated when more non-user keyspaces were added.
2. Second, instead of a histogram, per table, per shard, it will report a summary per table, per shard, and a single histogram per node.
3. Histograms, summaries, and counters will be reported only if they are used (for example, the cas-related metrics will not be reported for tables that are not using cas).
Closes#11058
* github.com:scylladb/scylla:
Add summary_test
database: Reduce the number of per-table metrics
replica/table.cc: Do not register per-table metrics for system
histogram_metrics_helper.hh: Add to_metrics_summary function
Unified histogram, estimated_histogram, rates, and summaries
Split the timed_rate_moving_average into data and timer
utils/histogram.hh: should_sample should use a bitmask
estimated_histogram: add missing getter method
The to_metrics_summary is a helper function that create a metrics type
summary from a timed_rate_moving_average_with_summary object.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
logalloc manages regions of log-structured allocated memory, and region_groups
containing such regions and other region_groups. region_groups were introduced
for accounting purposes - first to limit the amount of memory in memtables, then to
match new dirty memory allocation rate with memtable flushing rate so we never
hit a situation where allocation rate exceeded flush rate, and we exceed our limit.
The problem is that the abstraction is very weak - if we want to change anything
in memtable flush control we'll need to change region_groups too - and also
expensive to maintain.
The solution is to break the abstraction and move region_groups to memtable
dirty memory management code. Instead introduce a new, simpler abstraction,
the region_listener, which communicates changes in region memory consumption
to an external piece of code, which can then choose to do with it what it likes.
The long term plan is to completely remove region_groups and fold them into dirty_memory_manager:
- make each memtable a region_listener so it gets called back after size changes
- make memtables inform their dirty_memory_manager about the size to dirty_memory_manager can decide to throttle writes and which memtable to pick to flush
Closes#10839
* github.com:scylladb/scylla:
logalloc: drop region_impl public accessors
logalloc, dirty_memory_manager: move size-tracking binomial heap out of logalloc
logalloc: relax lifetime rules around region_listener
logalloc, dirty_memory_manager: move region_group and associated code
logalloc: expose tracker_reclaimer_lock
logalloc: reimplement tracker_reclaim_lock to avoid using hidden classes
logalloc: reduce friendship between region and region_group
logalloc: decouple region_group from region
memtable: stop using logalloc::region::group() to test for flushed memtables
This pull request introduces a "synchronous mode" for global views. In this mode, all view updates are applied synchronously as if the view was local.
Marking view as a synchronous one can be done using `CREATE MATERIALIZED VIEW` and `ALTER MATERIALIZED VIEW`. E.g.:
```cql
ALTER MATERIALIZED VIEW ks.v WITH synchronous_updates = true;
```
Marking view as a synchronous one was done using tags (originally used by alternator). No big modifications in the view's code were needed.
Fixes: https://github.com/scylladb/scylla/issues/10545Closes#11013
* github.com:scylladb/scylla:
cql-pytest: extend synchronous mv test with new cases
cql-pytest: allow extra parameters in new_materialized_view
docs: add a paragraph on view synchronous updates
test/boost/cql_query_test: add test setting synchronous updates property
test: cql-pytest: add a test for synchronous mode materialized views
db: view: react to synchronous updates tag
cql3: statements: cf_prop_defs: apply synchronous updates tag
alternator, db: move the tag code to db/tags
cql3: statements: add a synchronous_updates property
region_group is an abstraction that allows accounting for groups of
regions, but the cost/benefit ratio of maintaining the abstraction
is poor. Each time we need to change decision algorithm of memtable
flushing (admittedly rarely), we need to distill that into an abstraction
for region_groups and then use it. An example is virtual regions groups;
we wanted to account for the partially flushed memtables and had to
invent region groups to stand in their place.
Rather than continuing to invest in the abstraction, break it now
and move it to the memtable dirty memory manager which is responsible
for making those decisions. The relevant code is moved to
dirty_memory_manager.hh and dirty_memory_manager.cc (new file), and
a new unit test file is added as well.
A downside of the change is that unit testing will be more difficult.
Tags are a useful mechanism that could be used outside of alternator
namespace. My motivation to move tags_extension and other utilities to
db/tags/ was that I wanted to use them to mark "synchronous mode" views.
I have extracted `get_tags_of_table`, `find_tag` and `update_tags`
method to db/tags/utils.cc and moved alternator/tags_extension.hh to
db/tags/.
The signature of `get_tags_of_table` was changed from `const
std::map<sstring, sstring>&` to `const std::map<sstring, sstring>*`
Original behavior of this function was to throw an
`alternator::api_error` exception. This was undesirable, as it
introduced a dependency on the alternator module. I chose to change it
to return a potentially null value, and added a wrapper function to the
alternator module - `get_tags_of_table_or_throw` to keep the previous
throwing behavior.
When executing a wasm UDF, most of the time is spent on
setting up the instance. To minimize its cost, we reuse
the instance using wasm::instance_cache.
This patch adds a wasm instance cache, that stores
a wasmtime instance for each UDF and scheduling group.
The instances are evicted using LRU strategy. The
cache may store some entries for the UDF after evicting
the instance, but they are evicted when the corresponding
UDF is dropped, which greatly limits their number.
The size of stored instances is estimated using the size
of their WASM memories. In order to be able to read the
size of memory, we require that the memory is exported
by the client.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Currently, we use following naming convention for relocatable package
filename:
${package_name}-${arch}-package-${version}.${release}.tar.gz
But this is very different with Linux standard packaging system such as
.rpm and .deb.
Let's align the convention to .rpm style, so new convention should be:
${package_name}-${version}-${release}.${arch}.tar.gz
Closes#9799Closes#10891
* tools/java de8289690e...d0143b447c (1):
> build_reloc.sh: rename relocatable packages
* tools/jmx fe351e8...06f2735 (1):
> build_reloc.sh: rename relocatable packages
* tools/python3 e48dcc2...bf6e892 (1):
> reloc/build_reloc.sh: rename relocatable packages
This is part of support installing executables from PIP package,
now we support installing executable from PIP package but it will
install under /opt/scylladb/python3/bin.
To call these commands without speciying full path, we also need to install
symlink to /usr/bin.
To do this, we need new list which specifies command name for symlink.
Closes#10748
This PR introduces improvements to `expr::to_restriction` and prepares the validation part for restriction classes removal.
`expr::to_restriction` is currently used to take a restriction from the WHERE clause, prepare it, perform some validation checks and finally convert it to an instance of the restriction class.
Soon we will get rid of the restriction class.
In preparation for that `expr::to_restriction` is split into two independent parts:
* The part that prepares and validates a binary_operator
* The part that converts a binary_operator to restriction
Thanks to this split getting rid of restriction class will be painless, we will just stop using the second part.
`to_restriction.cc` is replaced by `restrictions.hh/cc`. In the future we can put all the restriction expressions code there to avoid clutter in `expression.hh/cc`.
This change made it much easier to fix#10631, so I did that as well.
Fixes: #10631Closes#10979
* github.com:scylladb/scylla:
cql-pytest: Test that IS NOT only accepts NULL
cql-pytest: Enable testInvalidCollectionNonEQRelation
cql3: Move single element IN restrictions handling
cql3: Check for disallowed operators early
cql3: Simplify adding restrictions
cql3: Reorganize to_restriction code
cql3: Fix IS NOT NULL check in to_restriction
cql3: Swap order of arguments in error message
expr::to_restriction is currently used to
take a restriction from the WHERE clause,
prepare it, perform some validation checks
and finally convert it to an instance of
the restriction class.
Soon we will get rid of the restriction class.
In preparation for that expr::to_restriction
is split into two independent parts:
* The part that prepares and validates a binary_operator
* The part that converts a binary_operator to restriction
Thanks to this split getting rid of restriction class
will be painless, we will just stop using the
second part.
This commit splits expr::to_restriction into two functions;
* validate_and_prepare_new_restriction
* convert_to_restriction
that handle each of those parts.
All helper validation methods in the anonymous namespace
are copied from the to_restriction.cc file.
to_restriction.cc isn't the best filename for the new functionality,
so it has been renamed to restrictions.hh/cc.
In the future all the code regarding restrictions could be
put there to reduce clutter in expression.hh/cc
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Introduces a utility function which allows obtaining a pointer to the
exception data held behind an std::exception_ptr if the data matches the
requested type. It can be used to implement manual but concise
try..catch chains.
The `try_catch` has the best performance when used with libstdc++ as it
uses the stdlib specific functions for simulating a try..catch without
having to actually throw. For other stdlibs, the implementation falls
back to a throw surrounded by an actual try..catch.
Currently, for users who have permissions_cache configs set to very high
values (and thus can't wait for the configured times to pass) having to restart
the service every time they make a change related to permissions or
prepared_statements cache (e.g. Adding a user and changing their permissions)
can become pretty annoying.
This patch series make permissions_validity_in_ms, permissions_update_interval_in_ms
and permissions_cache_max_entries live updateable so that restarting the
service is not necessary anymore for these cases.
It also adds an API for flushing the cache to make it easier for users who
don't want to modify their permissions_cache config.
branch: https://github.com/igorribeiroduarte/scylla/tree/make_permissions_cache_live_updateable
CI: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1005/
dtests: https://github.com/igorribeiroduarte/scylla-dtest/tree/test_permissions_cache
* https://github.com/igorribeiroduarte/scylla/make_permissions_cache_live_updateable:
loading_cache_test: Test loading_cache::reset and loading_cache::update_config
api: Add API for resetting authorization cache
authorization_cache: Make permissions cache and authorized prepared statements cache live updateable
auth_prep_statements_cache: Make aut_prep_statements_cache accept a config struct
utils/loading_cache.hh: Add update_config method
utils/loading_cache.hh: Rename permissions_cache_config to loading_cache_config and move it to loading_cache.hh
utils/loading_cache.hh: Add reset method
For cases where we have very high values set to permissions_cache validity and
update interval (E.g.: 1 day), whenever a change to permissions is made it's
necessary to update scylla config and decrease these values, since waiting for
all this time to pass wouldn't be viable.
This patch adds an API for resetting the authorization cache so that changing
the config won't be mandatory for these cases.
Usage:
$ curl -X POST http://localhost:10000/authorization_cache/reset
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
Currently, we use the last row in the query result set as the position where the query is continued from on the next page. Since only live rows make it into query result set, this mandates the query to be stopped on a live row on the replica, lest any dead rows or tombstones processed after the live rows, would have to be re-processed on the next page (and the saved reader would have to be thrown away due to position mismatch). This requirement of having to stop on a live row is problematic with datasets which have lots of dead rows or tombstones, especially if these form a prefix. In the extreme case, a query can time out before it can process a single live row and the data-set becomes effectively unreadable until compaction gets rid of the tombstones.
This series prepares the way for the solution: it allows the replica to determine what position the query should continue from on the next page. This position can be that of a dead row, if the query stopped on a dead row. For now, the replica supplies the same position that would have been obtained with looking at the last row in the result set, this series merely introduces the infrastructure for transferring a position together with the query result, and it prepares the paging logic to make use of this position. If the coordinator is not prepared for the new field, it will simply fall-back to the old way of looking at the last row in the result set. As I said for now this is still the same as the content of the new field so there is no problem in mixed clusters.
Refs: https://github.com/scylladb/scylla/issues/3672
Refs: https://github.com/scylladb/scylla/issues/7689
Refs: https://github.com/scylladb/scylla/issues/7933
Tests: manual upgrade test.
I wrote a data set with:
```
./scylla-bench -mode=write -workload=sequential -replication-factor=3 -nodes 127.0.0.1,127.0.0.2,127.0.0.3 -clustering-row-count=10000 -clustering-row-size=8096 -partition-count=1000
```
This creates large, 80MB partitions, which should fill many pages if read in full. Then I started a read workload:
```
./scylla-bench -mode=read -workload=uniform -replication-factor=3 -nodes 127.0.0.1,127.0.0.2,127.0.0.3 -clustering-row-count=10000 -duration=10m -rows-per-request=9000 -page-size=100
```
I confirmed that paging is happening as expected, then upgraded the nodes one-by-one to this PR (while the read-load was ongoing). I observed no read errors or any other errors in the logs.
Closes#10829
* github.com:scylladb/scylla:
query: have replica provide the last position
idl/query: add last_position to query_result
mutlishard_mutation_query: propagate compaction state to result builder
multishard_mutation_query: defer creating result builder until needed
querier: use full_position instead of ad-hoc struct
querier: rely on compactor for position tracking
mutation_compactor: add current_full_position() convenience accessor
mutation_compactor: s/_last_clustering_pos/_last_pos/
mutation_compactor: add state accessor to compact_mutation
introduce full_position
idl: move position_in_partition into own header
service/paging: use position_in_partition instead of clustering_key for last row
alternator/serialization: extract value object parsing logic
service/pagers/query_pagers.cc: fix indentation
position_in_partition: add to_string(partition_region) and parse_partition_region()
mutation_fragment.hh: move operator<<(partition_region) to position_in_partition.hh
Adds the per_partition_rate_limit_test.cc file. Currently, it only
contains a test which verifies that the feature correctly switches off
rate limiting for internal queries (!allow_limit || internal sg).
Adds the new `per_partition_rate_limit` schema extension. It has two
parameters: `max_writes_per_second` and `max_reads_per_second`.
In the future commits they will control how many operations of given
type are allowed for each partition in the given table.
Introduces the rate_limiter, a replica-side data structure meant for
tracking the frequence with which each partition is being accessed
(separately for reads and writes) and deciding whether the request
should be accepted and processed further or rejected.
The limiter is implemented as a statically allocated hashmap which keeps
track of the frequency with which partitions are accessed. Its entries
are incremented when an operation is admitted and are decayed
exponentially over time.
If a partition is detected to be accessed more than its limit allows,
requests are rejected with a probability calculated in such a way that,
on average, the number of accepted requests is kept at the limit.
The structure currently weights a bit above 1MB and each shard is meant
to keep a separate instance. All operations are O(1), including the
periodic timer.
Introduces `replica::rate_limit_exception` - an exceptions that is
supposed to be thrown/returned on the replica side when the request is
rejected due to the exceeding the per-partition rate limit.
Additionally, introduces the `exception_variant` type which allows to
transport the new exception over RPC while preserving the type
information. This will be useful in later commits, as the coordinator
will have to know whether a replica has failed due to rate limit being
exceeded or another kind of error.
The `exception_variant` currently can only either hold "other exception"
(std::monostate) or the aforementioned `rate_limit_exception`, but can
be extended in a backwards-compatible way in the future to be able to
hold more exceptions that need to be handled in a different way.
The most time-consuming part is invoking "ninja -t compdb", and there
is no need to repeat that for every mode.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Closes#10733
In several places, configure.py uses unsorted sets which results in
its output being in different order every time - both a different
order of targets, and a different order in dependencies of each
target.
This is both strange, and annoying when trying to debug configure.py
and trying to understand when, if at all, its output changes.
So in this patch, we use "sorted(...)" in the right places that
are needed to guarantee a fixed order. This fixed order is alphabetical,
but that's not the goal of this patch - the goal is to ensure a fixed
order.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
In commit 9cc9facbea, I fixed issue #4706.
That issue about what happens when interrupting a rebuild of build.ninja
(which happens automatically when you run "ninja" after configure.py
changed). We don't want to leave behind a half-built build.ninja,
or leave it deleted.
The solution in that commit was for configure.py to build a temporary file
(build.ninja.tmp), and only as the very last step rename it build.ninja.
Unfortunately, since that time, we added more last steps after what
used to be that very last step :-(
If this new code running after the rename takes a noticable amount of
time, and if the user is unlucky enough to interrupt it during that
time, ninja will see a modified output file (build.ninja) and a failed
rule, and will delete the output file!
The solution is to move the rename out of configure.py. Instead, we
add a "--out=filename" option to configure.py which allows it to write
directly to a different file name, not build.ninja. When rebuilding
build.ninja, the rule will now call configure.py with "--out=build.ninja.new"
and then rename it back to build.ninja. Any failure or interrupt at any
stage of configure.py will leave build.ninja untouched, so ninja will
not delete it - it will just delete the temporary build.ninja.new.
Fixes#4706 (again)
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This series is split from another, bigger RFC series which provides
manual remedies to deal with inconsistencies between the base table
and its views. This part deals with ghost rows by providing a statement
which fetches view rows from a given range, then reads its corresponding
rows from the base table (cl=ALL), and finally removes rows which were
not present in the base table at all, qualifying them as ghost rows.
Motivations for introducing such a statement:
* in case of detected inconsistencies, it can be used to fix
materialized views without recreating them from scratch, which can
take days and generates lots of throughput
* a tool which periodically scrubs a materialized view can be easily
created on top of this statement, especially that it's possible
to remove ghost rows from a user-defined view token range;
This series comes with a unit test.
The reason for digging up this series is because it's still possible to end up with ghost rows in certain rather improbable scenarios, and we lack a way of fixing them without rebuilding the whole view. For instance, in case of a failed synchronous update to a local view, the user will be notified that the query failed, but a ghost row can be created nonetheless. The pruning statement introduced in this series would allow healing the failure locally, without rebuilding the whole view.
Tests: unit(dev)
Closes#10426
* github.com:scylladb/scylla:
docs: add a paragraph on PRUNE MATERIALIZED VIEW statement
service,test: add a test case for error during pruning
tests: add ghost row deletion test case
cql3: enable ghost row deletion via CQL
cql3: add a statement for deleting ghost rows
cql3: convert is_json statement parameter to enum
pager: add ghost row deleting pager
db,view: add delete ghost rows visitor
Writing into the group0 raft group on a client side involves locking
the state machine, choosing a state id and checking for its presence
after operation completes. The code that does it resides now in the
migration manager since the currently it is the only user of group0. In
the near future we will have more client for group0 and they all will
have to have the same logic, so the patch moves it to a separate class
raft_group0_client that any future user of group0 can use to write
into it.
Message-Id: <YoYAJwdTdbX+iCUn@scylladb.com>
In order to expose the API for deleting ghost rows from a view,
a CQL statement is created. It is loosely based on select_statement,
as its first step is to select view table rows.
Functionality of the relation class has been replaced by
expr::to_restriction.
Relation and all classes deriving from it can now be removed.
Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
Add a function that will be used to convert expressions
received from the parser to restrictions.
Currently parser creates relations with expressions inside
and then those relations are converted to restrictions.
Once this function is implemented we will be able to skip
creating relations altogether and convert straight from
expression to restriction. This will allow us to remove
the relation class.
Further functionality will be implemented in the following commits.
This commit implements converting single column relations to expressions.
The code is mostly taken from functions in single_column_relation.hh,
because we are replicating functionality of the functions called
single_column_relation::new_XX_restriction.
Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
Recently Abseil started to ask to enable ABSL_PROPAGATE_CXX_STD,
warning that it will do so itself in the future. Do so, and
specify that we use C++20 to avoid inconsistencies.
Closes#10563