All the cases in this test also run mutation source tests, and the
case with the single-fragment buffer takes several times longer to
execute than the others.
Splitting this single case so that each mutation source test flavour
runs in its own case improves the test parallelism.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The test_database_with_data_in_sstables_is_a_mutation_source case runs
the mutation source tests in one go. The problem is that on each step
a whole new ks:cf is created, which takes the majority of the test's time.
At the end of the day this case is the slowest one in the suite, being
up to two times longer (depending on mode) than the #2 on this list.
This patch splits the case into 4 so that each mutation source flavour
is run in a separate case.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are 4 flavours of mutation source tests that are all run
sequentially -- plain, reversed, and the upgrade/downgrade ones that
check v1<->v2 conversions.
This patch splits them all into individual calls so that tests
may have dedicated cases for each flavour. By default they are
all run as before.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
The storage_service is involved in the cdc_generation_service guts
more than needed:
- the bool _for_testing bit is cdc-only
- there's an API-only cdc_generation_service getter
- cdc::generation_service startup code partially sits in storage_service's
This patch cleans up most of the above, leaving only the startup
_cdc_gen_id on board.
tests: unit(dev)
refs: #2795
"
* 'br-storage-service-vs-cdc-2' of https://github.com/xemul/scylla:
api: Use local sharded<cdc::generation_service> reference
main: Push cdc::generation_service via API
storage_service: Ditch for_testing boolean
cdc: Replace db::config with generation_service::config
cdc: Drop db::config from description_generator
cdc: Remove all arguments from maybe_rewrite_streams_descriptions
cdc: Move maybe_rewrite_streams_descriptions into after_join
cdc: Squash two methods into one
cdc: Turn make_new_cdc_generation a service method
cdc: Remove ring-delay arg from make_new_cdc_generation
cdc: Keep database reference on generation_service
The end_point_hints_manager's field _last_written_rp is initialized in
its regular constructor, but is not copied in the move constructor.
Because the move constructor is always invoked when creating a new
endpoint manager, the _last_written_rp field is effectively always
default-initialized, and is thus set to the zero value.
This can cause the following erroneous situation to occur:
- Node A accumulates hints towards B.
- Sync point is created at A.
It will be used later to wait for currently accumulated hints.
- Node A is restarted.
The endpoint manager A->B is created with a bogus value in
_last_written_rp (it is set to zero).
- Node A replays its hints but does not write any new ones.
- A hint flush occurs.
If there are no hint segments on disk after the flush, the endpoint
manager sets its last sent position to the last written position,
which is by design. However, the last written position holds an
incorrect value, so the last sent position also becomes incorrect
and too low.
- Try to wait for the sync point created earlier.
The sync point waiting mechanism waits until last sent hint position
reaches or goes past the position encoded in the sync point, but it
will not happen because the last sent position is incorrect.
The above bug can be (sometimes) reproduced in
hintedhandoff_sync_point_api_test dtest.
Now, the _last_written_rp field is properly initialized in the move
constructor, which prevents the bug described above.
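The pitfall can be shown with a minimal sketch (illustrative names, not Scylla's actual types): a member set in the regular constructor but forgotten in a hand-written move constructor silently reverts to its default-initialized zero value after a move.

```cpp
#include <cassert>
#include <utility>

// Buggy version: the move constructor does not carry _last_written_rp
// over, so the moved-to object keeps the in-class default of zero.
struct buggy_manager {
    long _last_written_rp = 0;
    explicit buggy_manager(long rp) : _last_written_rp(rp) {}
    buggy_manager(buggy_manager&&) noexcept {}  // bug: field not moved over
};

// Fixed version: the move constructor explicitly carries the field.
struct fixed_manager {
    long _last_written_rp = 0;
    explicit fixed_manager(long rp) : _last_written_rp(rp) {}
    fixed_manager(fixed_manager&& o) noexcept
        : _last_written_rp(o._last_written_rp) {}  // fix: copy the field
};

// Helpers returning the field's value in the move-constructed object.
long after_move_buggy(long rp) {
    buggy_manager a(rp);
    buggy_manager b(std::move(a));
    return b._last_written_rp;
}

long after_move_fixed(long rp) {
    fixed_manager a(rp);
    fixed_manager b(std::move(a));
    return b._last_written_rp;
}
```

With the buggy move constructor, `after_move_buggy(42)` yields 0 rather than 42, which is exactly how the bogus last written position appeared after restart.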
Fixes: #9320
Closes #9426
We must abort the environment before the ticker as the environment may
require time to keep advancing during abort in order for all operations
to finish, e.g. operations that can finish only due to timeout.
Currently such operations may cause the test to hang indefinitely
at the end.
The test requires a small modification to ensure that
`delivery_queue::push` is not called after the queue was aborted.
Message-Id: <20210930143539.157727-1-kbraun@scylladb.com>
"This series removes layer violation in compaction, and also
simplifies compaction manager and how it interacts with compaction
procedure."
* 'compaction_manager_layer_violation_fix/v4' of github.com:raphaelsc/scylla:
compaction: split compaction info and data for control
compaction_manager: use task when stopping a given compaction type
compaction: remove start_size and end_size from compaction_info
compaction_manager: introduce helpers for task
compaction_manager: introduce explicit ctor for task
compaction: kill sstables field in compaction_info
compaction: kill table pointer in compaction_info
compaction: simplify procedure to stop ongoing compactions
compaction: move management of compaction_info to compaction_manager
compaction: move output run id from compaction_info into task
The script assumes the remote name is "origin", a fair
assumption, but not universally true. Read it from configuration
instead of guessing it.
Closes #9423
Since May 2020 empty strings are allowed in DynamoDB as attribute values
(see the announcement in [1]). However, they are still not allowed as keys.
We had tests that they are not allowed in keys of an LSI or GSI, but missed
tests that they are not allowed as keys (partition or sort key) of base
tables. This patch adds these missing tests.
These tests pass - we already had code that checked for empty keys and
generated an appropriate error.
Note that for compatibility with DynamoDB, Alternator will forbid empty
strings as keys even though Scylla *does* support this possibility
(Scylla always supported empty strings as clustering key, and empty
partition keys will become possible with issue #9352).
[1] https://aws.amazon.com/about-aws/whats-new/2020/05/amazon-dynamodb-now-supports-empty-values-for-non-key-string-and-binary-attributes-in-dynamodb-tables/
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211003122842.471001-1-nyh@scylladb.com>
The gist of this series is splitting `compaction_abort_exception` from `compaction_stop_exception`
and their respective error messages to differentiate between compaction being stopped due to e.g. shutdown
or api event vs. compaction aborting due to scrub validation error.
While at it, clean up the existing retry logic related to `compaction_stop_exception`.
Test: unit(dev)
Dtest: nodetool_additional_test.py:TestNodetool.{{scrub,validate}_sstable_with_invalid_fragment_test,{scrub,validate}_ks_sstable_with_invalid_fragment_test,{scrub,validate}_with_one_node_expect_data_loss_test} (dev, w/ https://github.com/scylladb/scylla-dtest/pull/2267)
Closes#9321
* github.com:scylladb/scylla:
compaction: split compaction_aborted_exception from compaction_stopped_exception
compaction_manager: maybe_stop_on_error: rely on retry=false default
compaction_manager: maybe_stop_on_error: sync return value with error message.
compaction: drop retry parameter from compaction_stop_exception
compaction_manager: move errors stats accounting to maybe_stop_on_error
Just like EBS disks for EC2, we want to use persistent disks on GCE.
We won't recommend using them, but we still need to support them.
Related scylladb/scylla-machine-image#215
Closes#9395
* seastar e6db0cd587...0ba6c36cc3 (6):
> semaphore: add try_get_units
> build: adjust compilation for libfmt 8+
> alloc_failure_injector: add explicit zero-initialization
> Change wakeup() from private to public in reactor.hh
> app-template: separate seastar options into --seastar-help
> files: Don't ignore FS info for read-only files
The pull_github_pr.sh script is preferred over the colorful GitHub
buttons, because it's designed to always assign proper authors.
It also works for both single- and multi-patch series, which
makes the merging process more universal.
Message-Id: <b982b650442456b988e1cea59aa5ad221207b825.1633101849.git.sarna@scylladb.com>
by setting _alloc_count initially to 0.
The _alloc_count hasn't been explicitly initialized. As the allocator has
usually been an automatic variable, _alloc_count initially had some
unspecified contents. This probably means that cases where the first few
allocations passed and a later one failed may never have been
tested. The good thing is that most of the users have been moved to
the Seastar failure injector, which (by accident) has been correct.
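A minimal sketch of the fix, with illustrative names: a counter member without an explicit initializer holds indeterminate contents when the injector is an automatic variable, while a default member initializer makes the behavior deterministic from the first allocation on.

```cpp
#include <cassert>

// Illustrative failure injector: count allocations and fail once the
// configured point is reached.
struct failure_injector {
    // Before the fix this was `unsigned long _alloc_count;` -- i.e.
    // unspecified contents for automatic variables. Now it is
    // zero-initialized in-class.
    unsigned long _alloc_count = 0;
    unsigned long _fail_at = 0;  // 0 means "never fail"

    // Count the allocation and report whether it should fail.
    bool should_fail() {
        ++_alloc_count;
        return _fail_at != 0 && _alloc_count >= _fail_at;
    }
};

// Returns how many allocations pass before the first injected failure.
unsigned long passing_allocs(unsigned long fail_at) {
    failure_injector f;
    f._fail_at = fail_at;
    unsigned long passed = 0;
    while (!f.should_fail()) {
        ++passed;
    }
    return passed;
}
```

With the initializer in place, `passing_allocs(3)` deterministically lets exactly two allocations through; with garbage in `_alloc_count` the first failure point was unpredictable.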
Closes#9420
The tracing code assumes that the query_option_names and query_option_values
vectors always have the same length as the prepared_statements vector,
but this is not always true. E.g. if one of the statements in a batch is
incorrect, a discrepancy is created between the number of prepared
statements and the number of bound names and values, which currently
leads to a segmentation fault.
To overcome the problem, all three vectors are merged into a single
vector, which makes size mismatches impossible.
Tested manually with code that triggers a failure while executing
a batch statement, because the Python driver performs driver-side
validation and thus it's hard to create a test case which triggers
the problem.
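The shape of the fix can be sketched as follows (illustrative types, not the actual tracing code): instead of three parallel vectors whose sizes can drift apart when one statement in a batch fails to prepare, a single vector of records is used, so a size mismatch is impossible by construction.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One record per prepared statement: the statement together with its
// own bound option names and values.
struct prepared_statement_entry {
    std::string statement;
    std::vector<std::string> option_names;
    std::vector<std::string> option_values;
};

// Tracing iterates one vector; every statement carries its own names
// and values, so no index can run past the end of a sibling vector.
std::size_t traced_options(const std::vector<prepared_statement_entry>& entries) {
    std::size_t n = 0;
    for (const auto& e : entries) {
        // Both per-entry vectors are always valid here, no matter how
        // many statements in the batch failed to prepare.
        n += e.option_names.size() + e.option_values.size();
    }
    return n;
}
```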
closes: #9221
In order to ease future extensions to the information being sent
by the service level configuration change API, we pack the additional
parameters (other than the service level options) to the interface into a
structure. This will allow easy expansion in the future if more
parameters need to be sent to the observer.
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
compaction_info must only contain data to be exported to the
outside world, whereas compaction_data will contain data for
controlling compaction behavior and stats which change as
compaction progresses.
This separation makes the interface clearer, also allowing for
future improvements like removing direct references to table
in compaction.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
compaction_info will eventually only be used for exporting data about
ongoing compactions, so task must be used instead.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
These stats aren't used in the compaction stats API and can therefore
be removed. end_size is added to compaction_result (needed for
updating history) and start_size can be calculated in advance.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Today, compactions are tracked by both _compactions and _tasks,
where _compactions refers to actual ongoing compaction tasks,
whereas _tasks refers to manager tasks which are responsible for
spawning new compactions, retrying them on failure, etc.
As each task can only have one ongoing compaction at a time,
let's move compaction into task, such that manager won't have to
look at both when deciding to do something like stopping a task.
So stopping a task becomes simpler, and duplication is naturally
gone.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Today, compaction calls the compaction manager to register / deregister
the compaction_info it creates.
This is a layer violation: the manager sits one layer above
compaction, so the manager should be responsible for managing compaction
info.
From now on, compaction_info will be created and managed by
compaction_manager. compaction will only have a reference to info,
which it can use to update the world about compaction progress.
This will allow compaction_manager to be simplified as info can be
coupled with its respective task, allowing duplication to be removed
and layer violation to be fixed.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
This run id is used to track partial runs that are being written to.
Let's move it from info into task, as this is not external info,
but rather state that belongs to compaction_manager.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
There will be unbounded growth of the pending task list if tasks are
submitted faster than they are retired. That can potentially happen if
memtables are frequently flushed too early. It was observed that this
unbounded growth caused task queue violations, as the queue gets filled
with tons of tasks being reevaluated. By avoiding duplicates in the
pending task list for a given table T, growth is no longer unbounded
and consequently reevaluation is no longer aggressive.
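The dedup idea can be sketched like this (illustrative types, not the actual compaction manager): submissions are checked against a set of already-pending tables, so the pending list holds at most one reevaluation entry per table regardless of how often flushes resubmit.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>

// Pending reevaluations with per-table deduplication.
struct pending_reevaluations {
    std::deque<int> queue;  // table ids awaiting reevaluation, in order
    std::set<int> queued;   // table ids already present in the queue

    // Returns true if the table was actually enqueued, false if it was
    // already pending and the duplicate was dropped.
    bool submit(int table_id) {
        if (!queued.insert(table_id).second) {
            return false;  // duplicate: avoid unbounded growth
        }
        queue.push_back(table_id);
        return true;
    }

    std::size_t pending() const { return queue.size(); }
};

// Helper: submit the same table many times and report the queue depth.
std::size_t depth_after_resubmits(int table_id, int times) {
    pending_reevaluations p;
    for (int i = 0; i < times; ++i) {
        p.submit(table_id);
    }
    return p.pending();
}
```

However many times a table is resubmitted, the queue depth for it stays at one, which is what bounds the growth.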
Refs #9331.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210930125718.41243-1-raphaelsc@scylladb.com>
This avoids messing with the storage service in this API call. The next
patch will make use of the passed reference.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Nowadays it purely controls whether or not to inject delays into
timestamp generation by cdc. The same effect can be achieved by
configuring the cdc::generation_service directly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This is to push the service towards the general idea that each
component should have its own config, with db::config staying
in main.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
All of them are references taken from 'this'; since the function is
now a generation_service method, it can use 'this' directly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The generation service already has all it needs to do this. It
keeps storage_service smaller and less aware of cdc internals.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The recently introduced make_new_generation() method just calls
another one, passing more this->... members as arguments. Simplify
the flow by teaching the latter to use 'this' directly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It has everything needed on board. Only two arguments are required -- the
bootstrap tokens and whether or not to inject a delay.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There seems to be a problem with libwasmtime.a dependency on aarch64,
causing occasional segfaults during tests - specifically, tests
which exercise the path for halting wasm execution due to fuel
exhaustion. As a temporary measure, wasm is disabled on this
architecture to unblock the flow.
Refs #9387
Closes #9414
The --replace-token and --replace-node options were added some time ago,
but have never been used since then: they are just parsed, after which
the node immediately aborts.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210930102222.16294-1-xemul@scylladb.com>
"
Most of the code gets local_host_id by querying the system keyspace.
There's one place (counters) that wants a future-less getter and
caches the host id on the database for that purpose (it used to be
cached on storage_service some time ago).
This set relocates the value to the local cache and frees the startup
code from the need to mess with the database for setting it. It also
cuts the hints->qctx hidden dependency.
tests: unit(dev)
"
* 'br-host-id-in-local-cache' of https://github.com/xemul/scylla:
storage_proxy: Use future-less local_host_id getting
database: Get local host id from system_keyspace
system_keyspace: Keep local_host_id on local_cache
code: Rename get_local_host_id() into load_...()
system_keyspace: Coroutinize get_/set_local_host_id
The methods in question are called from the API handlers which
are registered after start (and after host id load and cache),
so they can safely be switched.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Some places in the code want future-less access to the host id, and
now they all do it by themselves. The local cache seems to be a
better place (for the record -- some time ago the same "better place"
argument justified relocating the cached host id from storage_service
onto the database).
While at it -- add the future-less getter for the host_id to be
used further.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This patch adds a reproducer for issue #7586 - that Alternator queries
(Query) operating in reverse order (ScanIndexForward = false) are
artificially limited to 100 MB partitions because of their memory use.
This test generates a partition over 100 MB in size and then tries various
reverse queries on it - with or without Limit, starting at the end or
the middle of the partition. The test currently fails when a reverse query
refuses to operate on such a large partition - the log reports this:
ERROR ... Memory usage of reversed read exceeds hard limit of 104857600
(configured via max_memory_for_unlimited_query_hard_limit), while reading
partition K1H6ON3A1C
With yet-uncommitted reverse-scan improvements, the test proceeds further,
but still fails where we test that a reverse query with Limit not
explicitly specified should still be limited to a certain size (e.g. 1MB)
and cannot return the entire 100 MB partition in one response.
Please note that this is not a comprehensive test for Scylla's reverse
scan implementation: In particular we do not have separate tests for
reverse scan's implementation on different sources - memtables, sstables,
or the cache. Nor do we check all sorts of edge cases. We assume that
Scylla's reverse scan implementation will have its own unit tests
elsewhere that will check these things - and this test can focus on the
Alternator use case.
This test is marked "xfail" because it still fails on Alternator. It is
marked "veryslow" because it's a (relatively) slow test, taking multiple
seconds to set up the 100 MB partition. So run the test with the
pytest options "--runxfail --runveryslow" to see how it fails.
Refs #7586
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210930063700.407511-1-nyh@scylladb.com>
A future-less method will appear that better deserves the get_
prefix, so give the existing method the load_ one.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>