This series refactors `table::snapshot` and moves the responsibility
to flush the table before taking the snapshot to the caller.
`flush_on_all` and `snapshot_on_all` helpers are added to replica::database
(by making it a peering_sharded_service) and upper layers,
including api and snapshot-ctl now call it instead of calling cf.snapshot directly.
With that, error are handed in table::snapshot and propagated
back to the callers.
Failure to allocate the `snapshot_manager` object is fatal,
similar to failure to allocate a continuation, since we can't
coordinate across the shards without it.
Test: unit(dev), rest_api(debug)
Fixes#10500Closes#10513
* github.com:scylladb/scylla:
table: snapshot: handle errors
table: snapshot: get rid of skip_flush param
database: truncate: skip flush when taking snapshot
test: rest_api: storage_service: verify_snapshot_details: add truncate
database: snapshot_on_all: flush before snapshot if needed
table: make snapshot method private
database: add snapshot_on_all
snapshot-ctl: run_snapshot_modify_operation: reject views and secondary index using the schema
snapshot-ctl: refactor and coroutinize take_snapshot / take_column_family_snapshot
api: storage_service: increase visibility of snapshot ops in the log
api: storage_service: coroutinize take_snapshot and del_snapshot
api: storage_service: take_snapshot: improve api help messages
test: rest_api: storage_service: add test_storage_service_snapshot
database: add flush_on_all variants
test: rest_api: add test_storage_service_flush
Turn table::snapshot into a coroutine,
catch exceptions, and return them to the caller.
Make sure that coordination across shards
would not break even if any of the shards hits
an error, by always signaling semaphores other
shards wait on.
All errors except for failing to allocate
the snapshot_manager objects are caught
and propagated back.
Failing to allocate the snapshot_manager is fatal
similar to failing to allocate a continuation
since we can't coordinate across the shards without it,
so abort that fails.
Fixes#10500
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
database::truncate already flushes the table
on auto_snapshot so there is never a reason
to flush it again in table::snapshot.
Note that cf.can_flush() is false only if memtables
are empty so there nothing to flush or there is
is no seal_immediate_fn and then table::snapshot
wouldn't be able to flush either.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
flush_on_all shards before taking the snapshot if !skip_flush
so we can get rid of flushing in table::snapshot.
Refs #10500
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
And move the logic from snapshot-ctl down to the
replica::database layer.
A following patch will move the flush phase
from the replica::table::snapshot layer
out to the caller.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Detecting a secondary index by checking for a dot
in the table name is wrong as tables generated by Alternator
may contain a dot in their name.
Instead detect bot hmaterialized view and secondary indexes
using the schema()->is_view() method.
Fixes#10526
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
snapshot operations over the api are rare
but they contain significant state on disk in the
form of sstables hard-linked to the snapshot directories.
Also, we've seen snapshot operations hang in the field,
requiring a core dump to analyse the issue,
while there were no records in the log indicating
when previous snapshot operations were last executed.
This change promotes logging to info level
when take_snapshot and del_snapshot start,
and logs errors if in case they fail.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Test the snapshot operations via the rest api.
Added test/rest_api/rest_util.py with
new_test_snapshot that creates a new test snapshot
and automagically deletes it when the `with` block
if exited, similar to new_test_keyspace and new_test_table.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Allowing bind markers in collection literals is a change which causes minor differences in behavior between Scylla and Cassandra. Despite such an undesirable effect, I think allowing them is a good idea because it makes [refactoring work made by cvybhu](https://github.com/scylladb/scylla/pull/10409) easier - 469d03f8c2.
Also, making Scylla accept a superset of valid Cassandra cql expressions does not make us less compatible (maybe apart from test suit compatibility).
Closes#10457
* github.com:scylladb/scylla:
test/boost: cql_query_test: allow bound variables in test_list_of_tuples_with_bound_var
test/boost: cql_query_test: test bound variables in collection literals
cql3: expr: do not allow unset values inside collections
cql3: expr: prepare_expr: allow bind markers in collection literals
"
There's a cql_type_parser::parse() method that needs to get user
types for a keyspace by its name. For this it uses the global
storage proxy instance as a place to get database from. This set
introduces an abstract user_types_storage helper object that's
responsible in providing the user types for the caller.
This helper, in turn, is provided to the parse() method by the
database itself or by the schema_ctxt object that needs parse()
to unfreeze schemas and doesn't have database at those times.
This removes one more get_storage_proxy() call.
"
* 'br-user-types-storage' of https://github.com/xemul/scylla:
cql_type_parser: Require user_types_storage& in parse()
schame_tables: Add db/ctxt args here and there
user_types: Carry storage on database and schema_ctxt
data_dictionary: Introduce user types storage
When pull_github_pr.sh uses git cherry-pick to merge a single-patch
pull request, this cherry-pick can fail. A typical example is trying
to merge a patch that has actually already been merged in the past,
so cherry-pick reports that the patch, after conflict resolution,
is empty.
When cherry-pick fails, it leaves the working directory in an annoying
mid-cherry-pick state, and today the user needs to manually call
"git cherry-pick --abort" to return to the normal state. The script
should it automatically - so this is what we do in this patch.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
When repair_meta stops it does so in the background and reports back
a shared future into whose shared promise peer it resolves that
background activity. There's a shorter way to forward a future result
into another, even shared, promise. And this method doesn't need to
discard a future.
tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/253
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The lcs at those places are explicitly start()ed beforehand. The
is_start() check is necessary when using the latency_counter with a
histogram that may or may not start the counter (this is the case
in several class table methods).
tests: unit(dev)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
propagate_replacement() is an internal function that shouldn't be in
the public interface. No one besides an unit test for incremental
compaction needs it. In the future, I want to revisit incremental
compaction unit test to stop using it and only rely on public
interfaces
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20220506171647.81063-1-raphaelsc@scylladb.com>
"
There's a enpoint->state map member of the gossiper class. First
ugly thing about it is that the member is public.
Next, there's a whole bunch of helpers around that map that export
various bits of information from it. All of those helpers reshard
to shard-0 to read from the state mape ignoring the fact that the
map is replicated on all shards internally. Also, some of those
helpers effectively duplicate each other for no real gain. Finally,
most of them are specific to api/ code, and open-coding them often
makes api/ handlers shorter and simpler.
This set removes the unused, api-only or trivial state map accessors
and marks the state map itself private (underscore prefix included).
tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/233/
"
* 'br-gossiper-sanitize-api-2' of https://github.com/xemul/scylla:
gossiper: Add underscores to new private members
code: Indentation fix after previous patch
gossiper, code: Relax get_up/down/all_counters() helpers
api: Fix indentation after previous patch
gossiper, api: Remove get_arrival_samples()
gossiper, api: Remove get/set phi convict threshold helpers
gossiper, api: Move get_simple_states() into API code
gossiper: In-line std::optional<> get_endpoint_state_for_endpoint() overload
gossiper, api: Remove get_endpoint_state() helpers
gossiper: Make state and locks maps private
gossiper: Remove dead code
Attempting to call advance_to() on the index, after it is positioned at EOF, can result in an assert failure, because the operation results in an attempt to move backwards in the index-file (to read the last index page, which was already read). This only happens if the index cache entry belonging to the last index page is evicted, otherwise the advance operation just looks-up said entry and returns it. To prevent this, we add an early return conditioned on eof() to all the partition-level advance-to methods.
A regression unit test reproducing the above described crash is also added.
Fixes: #10403Closes#10491
* github.com:scylladb/scylla:
sstables/index_reader: short-circuit fast-forward-to when at EOF
test/lib/random_schema: add a simpler overload for fixed partition count
One user observed this assertion fail, but it's an extremely rare event.
The root cause - interlacing of processing STARTUP and OPTIONS messages -
is still there, but now it's harmless enough to leave it as is.
Fixes#10487Closes#10503
This series fixes a few issue on the table truncate path:
- "memtable_list: safely futurize clear_and_add"
- reinstates an async version of table::clear_and_add, just safe against #10421
- a unit test reproducing #10421 was added to make sure the new version is indeed safe.
- "table: clear: serialize with ongoing flush" fixes#10423
- a unit test reproducing #10423 was added
Fixes#10281Fixes#10423
Test: unit(dev), database_test. test_truncate_without_snapshot_during_{writes,flushes} (debug)
Closes#10424
* github.com:scylladb/scylla:
test: database_test: add test_truncate_without_snapshot_during_writes
memtable_list: safely futurize clear_and_add
table: clear: serialize with ongoing flush
primitive_consumer::read_bytes() destroys and creates a vector for every value it reads.
This happens for every cell.
We can save a bit of work by reusing the vector.
Closes#10512
* github.com:scylladb/scylla:
sstables: consumer: reuse the fragmented_temporary_buffer in read_bytes()
utils: fragmented_temporary_buffer: add release()
Dtest triggers the problem by:
1) creating table with LCS
2) disabling regular compaction
3) writing a few sstables
4) running maintenance compaction, e.g. cleanup
Once the maintenance compaction completes, disengaged optional _last_compacted_keys
triggers an exception in notify_completion().
_last_compacted_keys is used by regular for its round-robin file picking
policy. It stores the last compacted key for each level. Meaning it's
irrelevant for any other compaction type.
Regular compaction is responsible for initializing it when it runs for
the first time to pick files. But with it disabled, notify_completion()
will find it uninitialized, therefore resulting in bad_optional_access.
To fix this, the procedure is skipped if _last_compacted_keys is
disengaged. Regular compaction, once re-enabled, will be able to
fill _last_compacted_keys by looking at metadata of the files.
compaction_test.py::TestCompaction::test_disable_autocompaction_doesnt_
block_user_initiated_compactions[CLEANUP-LeveledCompactionStrategy]
now passes.
Fixes#10378.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#10508
SSTable was moved into descriptor, so on failure, it couldn't be used
without resulting in a segfault. Fix it by not moving sst, and changing
signature to make it explicit we don't want to move the content.
Fixes#10505.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes#10506
These helpers count elements in the endpoint state map. It makes sense
to keep them in gossiper API, but it's worth removing the wrappers that
do invoke_on(0). This makes code shorter.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The API method in question just tries to scan the state map. There's no
need in doing invoke_on(0) and in a separate helper method in gossiper,
the creation of the json return value can happen in the API handler.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method helps updating enpoint state in handle_major_state_change by
returning a copy of an endpoint state that's kept while the map's entry
is being replaced with the new state. It can be replaced with a shorter
code.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are two of them -- one to do invoke_on(0) the other one to get the
needed data. The former one is not needed -- the scanned endpoint state
map is replicated accross shards and is the same everywhere. The latter
is not needed, because there's only one user of it -- the API -- which
can work with the existing gossiper API.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Locks are not needed outside gossiper, state map is sometimes read from,
but there a const getter for such cases. Both methods now desrve the
underbar prefix, but it doesn't come with this short patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This series futurizes two synchronous functions used for data reconciliation:
`data_read_resolver::resolve` and `to_data_query_result` and does so
by introducing lower-level asynchronous infrastructure:
`mutation_partition_view::accept_gently`,
`frozen_mutation::unfreeze_gently` and `frozen_mutation::consume_gently`,
and `mutation::consume_gently`.
This trades some cycles on this cold path to prevent known reactor stalls.
Fixes#2361Fixes#10038Closes#10482
* github.com:scylladb/scylla:
mutation: add consume_gently
frozen_mutation: add consume_gently
query: coroutinize to_data_query_result
frozen_mutation: add unfreeze_gently
mutation_partition_view: add accept_gently methods
storage_proxy: futurize data_read_resolver::resolve
Attempting to call advance_to() on the index, after it is positioned at
EOF, can result in an assert failure, because the operation results in
an attempt to move backwards in the index-file (to read the last index
page, which was already read). This only happens if the index cache
entry belonging to the last index page is evicted, otherwise the advance
operation just looks-up said entry and returns it.
To prevent this, we add an early return conditioned on eof() to all the
partition-level advance-to methods.
A regression unit test reproducing the above described crash is also
added.
Currently, adding a cluster feature requires editing several files and
repeating the new feature name several times. This series reduces
the boilerplate to a single line (for non-experimental features), and
perhaps three for experimental features.
Closes#10488
* github.com:scylladb/scylla:
gms: feature_service: remove variable/helper function duplication
gms: feature: make `operator bool` implicit
gms: feature_service: remove feature variable duplication in enable()
gms: feature_service: remove feature variable declaration/definition duplication
gms: features: de-quadruplicate active feature names
gms: features: de-quadruplicate deprecated feature names
gms: feature_service: avoid duplicating feature names when listing known features