Commit Graph

28511 Commits

Author SHA1 Message Date
Pavel Emelyanov
b742e6cbb6 test: Split multishard combining reader case
All the cases in this test also run mutation source tests and the
case with single-fragment buffer takes times more time to execute
than the others.

Splitting this single case so that it runs mutation source tests
flavours in different cases improves the test parallelizm.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-10-05 11:57:02 +03:00
Pavel Emelyanov
30075094ac test: Split database test case
The test_database_with_data_in_sstables_is_a_mutation_source case runs
the mutation source tests in one go. The problem is that on each step
a whole new ks:cf is created which takes the majority of the tests time.
In the end of the day this case is the slowest one in the suite being
up to two times longer (depending on mode) than the #2 on this list.

This patch splits the case into 4 so that each mutation source flavor
is run in separate case.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-10-05 11:53:18 +03:00
Pavel Emelyanov
1e09a2c925 test: Split run_mutation_source_tests
There are 4 flavours of mutation source tests that are all ran
sequentially -- plain, reversed and upgrade/downgrade ones that
check v1<->v2 conversions.

This patch splits them all into individual calls so that some
tests may want to have dedicated cases for each. "By default" they
are all run as they were.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-10-05 11:51:43 +03:00
Avi Kivity
148a12f3da Merge "Keep storage_service less aware of cdc internals" from Pavel E
"
The storage_service is involved in the cdc_generation_service guts
more than needed.

 - the bool _for_testing bit is cdc-only
 - there's API-only cdc_generation_service getter
 - cdc_g._s. startup code partially sits in s._s. one

This patch cleans most of the above leaving only the startup
_cdc_gen_id on board.

tests: unit(dev)
refs: #2795

"

* 'br-storage-service-vs-cdc-2' of https://github.com/xemul/scylla:
  api: Use local sharded<cdc::generation_service> reference
  main: Push cdc::generation_service via API
  storage_service: Ditch for_testing boolean
  cdc: Replace db::config with generation_service::config
  cdc: Drop db::config from description_generator
  cdc: Remove all arguments from maybe_rewrite_streams_descriptions
  cdc: Move maybe_rewrite_streams_descriptions into after_join
  cdc: Squash two methods into one
  cdc: Turn make_new_cdc_generation a service method
  cdc: Remove ring-delay arg from make_new_cdc_generation
  cdc: Keep database reference on generation_service
2021-10-04 14:56:05 +03:00
Piotr Dulikowski
6093c2378b hints: assign _last_written_rp in ep manager's move constructor
The end_point_hints_manager's field _last_written_rp is initialized in
its regular constructor, but is not copied in the move constructor.
Because the move constructor is always involved when creating a new
endpoint manager, the _last_written_rp field is effectively always
initialized with the zero-argument constructor, and is set to the zero
value.

This can cause the following erroneous situation to occur:

- Node A accumulates hints towards B.
- Sync point is created at A.
  It will be used later to wait for currently accumulated hints.
- Node A is restarted.
  The endpoint manager A->B is created which has bogus value in the
  _last_written_rp (it is set to zero).
- Node A replays its hints but does not write any new ones.
- A hint flush occurs.
  If there are no hint segments on disk after flush, the endpoint
  manager sets its last sent position to the last written position,
  which is by design. However, the last written position has incorrect
  value, so the last sent position also becomes incorrect and too low.
- Try to wait for the sync point created earlier.
  The sync point waiting mechanism waits until last sent hint position
  reaches or goes past the position encoded in the sync point, but it
  will not happen because the last sent position is incorrect.

The above bug can be (sometimes) reproduced in
hintedhandoff_sync_point_api_test dtest.

Now, the _last_written_rp field is properly initialized in the move
constructor, which prevents the bug described above.

Fixes: #9320
Closes #9426
2021-10-04 13:21:34 +02:00
Kamil Braun
b2f33b3e0b test: raft: randomized_nemesis_test: abort environment before ticker
We must abort the environment before the ticker as the environment may
require time to keep advancing during abort in order for all operations
to finish, e.g. operations that can finish only due to timeout.
Currently such operations may cause the test to hang indefinitely
at the end.

The test requires a small modification to ensure that
`delivery_queue::push` is not called after the queue was aborted.
Message-Id: <20210930143539.157727-1-kbraun@scylladb.com>
2021-10-04 12:31:26 +02:00
Avi Kivity
1bac93e075 Merge "simplifications and layer violation fix for compaction manager" from Raphael
"This series removes layer violation in compaction, and also
simplifies compaction manager and how it interacts with compaction
procedure."

* 'compaction_manager_layer_violation_fix/v4' of github.com:raphaelsc/scylla:
  compaction: split compaction info and data for control
  compaction_manager: use task when stopping a given compaction type
  compaction: remove start_size and end_size from compaction_info
  compaction_manager: introduce helpers for task
  compaction_manager: introduce explicit ctor for task
  compaction: kill sstables field in compaction_info
  compaction: kill table pointer in compaction_info
  compaction: simplify procedure to stop ongoing compactions
  compaction: move management of compaction_info to compaction_manager
  compaction: move output run id from compaction_info into task
2021-10-04 13:09:31 +03:00
Avi Kivity
93b765f655 scripts/pull_github_pr.sh: don't guess git remote name
The script assumes the remote name is "origin", a fair
assumption, but not universally true. Read it from configuration
instead of guessing it.

Closes #9423
2021-10-04 12:32:39 +03:00
Nadav Har'El
414b672e22 test/alternator: verify that empty-string keys are NOT allowed
Since May 2020 empty strings are allowed in DynamoDB as attribute values
(see announcment in [1]). However, they are still not allowed as keys.

We had tests that they are not allowed in keys of LSI or GSI, but missed
tests that they are not allowed as keys (partition or sort key) of base
tables. This patch add these missing tests.

These tests pass - we already had code that checked for empty keys and
generated an appropriate error.

Note that for compatibility with DynamoDB, Alternator will forbid empty
strings as keys even though Scylla *does* support this possibility
(Scylla always supported empty strings as clustering key, and empty
partition keys will become possible with issue #9352).

[1] https://aws.amazon.com/about-aws/whats-new/2020/05/amazon-dynamodb-now-supports-empty-values-for-non-key-string-and-binary-attributes-in-dynamodb-tables/

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211003122842.471001-1-nyh@scylladb.com>
2021-10-04 08:40:43 +02:00
Botond Dénes
61e7d3de90 Merge 'Cleanup compaction_stop_exception' from Benny Halevy
The gist of this series is splitting `compaction_abort_exception` from `compaction_stop_exception`
and their respective error messages to differentiate between compaction being stopped due to e.g. shutdown
or api event vs. compaction aborting due to scrub validation error.

While at it, cleanup the existing retry logic related to `compaction_stop_exception`.

Test: unit(dev)
Dtest: nodetool_additional_test.py:TestNodetool.{{scrub,validate}_sstable_with_invalid_fragment_test,{scrub,validate}_ks_sstable_with_invalid_fragment_test,{scrub,validate}_with_one_node_expect_data_loss_test} (dev, w/ https://github.com/scylladb/scylla-dtest/pull/2267)

Closes #9321

* github.com:scylladb/scylla:
  compaction: split compaction_aborted_exception from compaction_stopped_exception
  compaction_manager: maybe_stop_on_error: rely on retry=false default
  compaction_manager: maybe_stop_on_error: sync return value with error message.
  compaction: drop retry parameter from compaction_stop_exception
  compaction_manager: move errors stats accounting to maybe_stop_on_error
2021-10-04 07:27:11 +03:00
Takuya ASADA
9c830297ac scylla_util.py: add persistent disk support for GCE
Just like EBS disks for EC2, we want to use persistent disk on GCE.
We won't recommend to use it, but still need to support it.

Related scylladb/scylla-machine-image#215

Closes #9395
2021-10-03 17:58:18 +03:00
Takuya ASADA
d87b80ad14 scylla_util.py: add persistent disk support for Azure Just like EBS disks for EC2, we want to use persistent disk on Azure. We won't recommend to use it, but still need to support it.
Related https://github.com/scylladb/scylla-machine-image/issues/218

Closes #9417
2021-10-03 17:56:31 +03:00
Avi Kivity
adcd5a69d6 Update seastar submodule
* seastar e6db0cd587...0ba6c36cc3 (6):
  > semaphore: add try_get_units
  > build: adjust compilation for libfmt 8+
  > alloc_failure_injector: add explicit zero-initialization
  > Change wakeup() from private to public in reactor.hh
  > app-template: separate seastar options into --seastar-help
  > files: Don't ignore FS info for read-only files
2021-10-03 13:14:43 +03:00
Piotr Sarna
1d353bd6e7 docs: mention scripts/pull_github_pr.sh
The pull_github_pr.sh script is preferred over colorful github
buttons, because it's designed to always assign proper authors.
It also works for both single- and multi-patch series, which
makes the merging process more universal.
Message-Id: <b982b650442456b988e1cea59aa5ad221207b825.1633101849.git.sarna@scylladb.com>
2021-10-03 10:19:26 +03:00
Michał Radwański
0d5a2067ad test/lib/failure_injecting_allocation_strategy: remove UB...
by setting _alloc_count initially to 0.

The _alloc_count hasn't been explicitely specified. As the allocator has
been usually an automatic variable, _alloc_count had initially some
unspecified contents. This probalby means that cases where the first few
allocations passed and the later one failed, might haven't ever been
tested. Good thing is that most of the users have been transferred to
the Seastar failure injector, which (by accident) has been correct.

Closes #9420
2021-10-01 13:25:05 +02:00
Pavel Emelyanov
85d86cc85f scripts: Fix origin repo URL parsing
It assumes that the origin URL is git@ one while it can be
the https:// one as well.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20211001082116.7214-1-xemul@scylladb.com>
2021-10-01 13:22:06 +02:00
Piotr Sarna
ec52e05eab tracing: unify prepared statement info into a single struct
The tracing code assumes that query_option_names and query_option_values
vectors always have the same length as the prepared_statements vector,
but it's not true. E.g. if one of the statements in a batch is
incorrect, it will create a discrepancy between the number of prepared
statements and the number of bound names and values, which currently
leads to a segmentation fault.
To overcome the problem, all three vectors are integrated into a single
vector, which makes size mismatches impossible.

Tested manually with code that triggers a failure while executing
a batch statement, because the Python driver performs driver-side
validation and thus it's hard to create a test case which triggers
the problem.

closes: #9221
2021-10-01 10:57:38 +03:00
Eliran Sinvani
c38ceafdcf Service Level Controller: Add an extention point to the API (#9374)
In order to ease future extensions to the information being sent
by the service level configuration change API, we pack the additional
parameters (other the the service level options) to the interface in a
structure. This will allow an easy expansion in the future if more
parameters needs to be sent to the observer.i

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2021-10-01 10:20:28 +03:00
Raphael S. Carvalho
9067a13eac compaction: split compaction info and data for control
compaction_info must only contain info data to be exported to the
outside world, whereas compaction_data will contain data for
controlling compaction behavior and stats which change as
compaction progresses.
This separation makes the interface clearer, also allowing for
future improvements like removing direct references to table
in compaction.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-30 13:16:57 -03:00
Raphael S. Carvalho
87ce0c5d43 compaction_manager: use task when stopping a given compaction type
compaction_info will eventually only be used for exporting data about
ongoing compactions, so task must be used instead.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-30 13:16:52 -03:00
Raphael S. Carvalho
cbd78be2dd compaction: remove start_size and end_size from compaction_info
those stats aren't used in compaction stats API and therefore they
can be removed. end_size is added to compaction_result (needed for
updating history) and start_size can be calculated in advance.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-30 13:16:45 -03:00
Raphael S. Carvalho
18f703e94b compaction_manager: introduce helpers for task
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-30 13:16:41 -03:00
Raphael S. Carvalho
d4572a1bb5 compaction_manager: introduce explicit ctor for task
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-30 13:16:37 -03:00
Raphael S. Carvalho
38df9c68f8 compaction: kill sstables field in compaction_info
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-30 13:16:33 -03:00
Raphael S. Carvalho
90cfe895d4 compaction: kill table pointer in compaction_info
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-30 13:16:29 -03:00
Raphael S. Carvalho
4ce745e0b6 compaction: simplify procedure to stop ongoing compactions
Today, compactions are tracked by both _compactions and _tasks,
where _compactions refer to actual ongoing compaction tasks,
whereas _tasks refer to manager tasks which is responsible for
spawning new compactions, retry them on failure, etc.
As each task can only have one ongoing compaction at a time,
let's move compaction into task, such that manager won't have to
look at both when deciding to do something like stopping a task.

So stopping a task becomes simpler, and duplication is naturally
gone.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-30 13:16:21 -03:00
Raphael S. Carvalho
efed06e2e4 compaction: move management of compaction_info to compaction_manager
Today, compaction is calling compaction manager to register / deregister
the compaction_info created by it.

This is a layer violation because manager sits one layer above
compaction, so manager should be responsible for managing compaction
info.

From now on, compaction_info will be created and managed by
compaction_manager. compaction will only have a reference to info,
which it can use to update the world about compaction progress.

This will allow compaction_manager to be simplified as info can be
coupled with its respective task, allowing duplication to be removed
and layer violation to be fixed.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-30 13:15:00 -03:00
Raphael S. Carvalho
1f5b17fdc5 compaction: move output run id from compaction_info into task
this run id is used to track partial runs that are being written to.
let's move it from info into task, as this is not an external info,
but rather one that belongs to compaction_manager.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-30 13:13:20 -03:00
Raphael S. Carvalho
52302c3238 compaction_manager: prevent unbounded growth of pending tasks
There will be unbounded growth of pending tasks if they are submitted
faster than retiring them. That can potentially happen if memtables
are frequently flushed too early. It was observed that this unbounded
growth caused task queue violations as the queue will be filled
with tons of tasks being reevaluated. By avoiding duplication in
pending task list for a given table T, growth is no longer unbounded
and consequently reevaluation is no longer aggressive.

Refs #9331.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210930125718.41243-1-raphaelsc@scylladb.com>
2021-09-30 16:49:52 +03:00
Pavel Emelyanov
037135316e api: Use local sharded<cdc::generation_service> reference
And remove the getter from storage_service.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 16:04:12 +03:00
Pavel Emelyanov
5d8e05e7ae main: Push cdc::generation_service via API
This is not to mess with storage service in this API call. Next
patch will make use of the passed reference.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 16:04:12 +03:00
Pavel Emelyanov
f669fbd230 storage_service: Ditch for_testing boolean
Nowadays it purely controls whether or not to inject delays into
timestamps generation by cdc. The same effect can be achieved by
configuring the cdc::generation_service directly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 16:04:12 +03:00
Pavel Emelyanov
db623c5f64 cdc: Replace db::config with generation_service::config
This is to push the service towards general idea that each
component should have its own config and db::config to stay
in main.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 16:04:12 +03:00
Pavel Emelyanov
b879d3f3a5 cdc: Drop db::config from description_generator
It only needs one for murmur3_partitioner_ignore_msb_bits value,
provide it directly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 16:04:12 +03:00
Pavel Emelyanov
2e7364b94f cdc: Remove all arguments from maybe_rewrite_streams_descriptions
All of them are references taken from 'this', since the function is
the generation_service method it can use 'this' directly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 16:04:12 +03:00
Pavel Emelyanov
6fe31d8eac cdc: Move maybe_rewrite_streams_descriptions into after_join
The generation service already has all it needs to do it. This
keeps storage_service smaller and less aware about cdc internals.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 15:34:03 +03:00
Pavel Emelyanov
3b51c5c96a cdc: Squash two methods into one
The recently introduced make_new_generation() method just calls
another one by passing more this->... stuff as arguments. Relax
the flow by teaching the latter to use 'this' directly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 15:34:03 +03:00
Pavel Emelyanov
7a7a87f24a cdc: Turn make_new_cdc_generation a service method
It has everything needed onboard. Only two arguments are required -- the
booststrap tokens and whether or not to inject a delay.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 15:34:03 +03:00
Pavel Emelyanov
b867a19da1 cdc: Remove ring-delay arg from make_new_cdc_generation
It already has the db::config from where to get one (and even
this will change soon).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 15:34:03 +03:00
Pavel Emelyanov
5e2a049266 cdc: Keep database reference on generation_service
The service effectively depends on it when rewrites streams
descriptions.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 15:34:03 +03:00
Piotr Sarna
e2fe8559ca configure: temporarily disable wasm support for aarch64
There seems to be a problem with libwasmtime.a dependency on aarch64,
causing occasional segfaults during tests - specifically, tests
which exercise the path for halting wasm execution due to fuel
exhaustion. As a temporary measure, wasm is disabled on this
architecture to unblock the flow.

Refs #9387

Closes #9414
2021-09-30 14:57:04 +03:00
Pavel Emelyanov
bbcf671276 config: Remove unused replacing options
The --replace-token and --replace-node were added some time ago, but
have never been used since then, just parsed and immediatelly aborted.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210930102222.16294-1-xemul@scylladb.com>
2021-09-30 14:56:04 +03:00
Piotr Jastrzebski
79de151158 cache_tracker: remove unused parameter from on_remove
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <f66ad391d86963b43b2a01e957887ea597e591e8.1632992165.git.piotr@scylladb.com>
2021-09-30 13:03:13 +03:00
Avi Kivity
83237894b7 Merge "Keep local_host_id in local_cache" from Pavel E
"
Most of the code gets local_host_id by querying system keyspace.
There's one place (counters) that want future-less getter and
that caches host id on database for that (it used to be cached on
storage_service some time ago).

This set relocates the value on local cache and frees the starting
code from the need to mess with database for setting it. Also
this cuts hints->qctx hidden dependency.

tests: unit(dev)
"

* 'br-host-id-in-local-cache' of https://github.com/xemul/scylla:
  storage_proxy: Use future-less local_host_id getting
  database: Get local host id from system_keyspace
  system_keyspace: Keep local_host_id on local_cache
  code: Rename get_local_host_id() into load_...()
  system_keyspace: Coroutinize get_/set_local_host_id
2021-09-30 12:38:52 +03:00
Pavel Emelyanov
8a93a6de78 storage_proxy: Use future-less local_host_id getting
The methods in question are called from the API handlers which
are registered after start (and after host id load and cache),
so they can safely be switched.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 10:55:20 +03:00
Pavel Emelyanov
e9002e1e61 database: Get local host id from system_keyspace
It's now cached on database itself, and it can be removed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 10:55:20 +03:00
Pavel Emelyanov
9f5fd8b5c0 system_keyspace: Keep local_host_id on local_cache
Some places in the code want to have future-less access to the
host id, now they do it all by themselves. Local cache seems to
be a better place (for the record -- some time ago the "better
place" argument justified cached host id relocation from the
storage_service onto the database).

While at it -- add the future-less getter for the host_id to be
used further.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 10:54:38 +03:00
Nadav Har'El
1edcc3a218 test/alternator: add test for reverse queries
This patch adds a reproducer for issue #7586 - that Alternator queries
(Query) operating in reverse order (ScanIndexForward = false) are
artificially limited to 100 MB partitions because of their memory use.

This test generates a partition over 100 MB in size and then tries various
reverse queries on it - with or without Limit, starting at the end or
the middle of the partition. The test currently fails when a reverse query
refuses to operate on such a large partition - the log reports this:

  ERROR ... Memory usage of reversed read exceeds hard limit of 104857600
  (configured via max_memory_for_unlimited_query_hard_limit), while reading
  partition K1H6ON3A1C

With yet-uncommitted reverse-scan improvements, the test proceeds further,
but still fails where we test that a reverse query with Limit not
explicitly specified should still be limited to a certain size (e.g. 1MB)
and cannot return the entire 100 MB partition in one response.

Please note that this is not a comprehensive test for Scylla's reverse
scan implementation: In particular we do not have separate tests for
reverse scan's implementation on different sources - memtables, sstables,
or the cache. Nor do we check all sorts of edge cases. We assume that
Scylla's reverse scan implementation will have its own unit tests
elsewhere that will check these things - and this test can focus on the
Alternator use case.

This test is marked "xfail" because it still fails on Alternator. It is
marked "veryslow" because it's a (relatively) slow test, taking multiple
seconds to set up the 100 MB partition. So run the test with the
pytest options "--runxfail --runveryslow" to see how it fails.

Refs #7586

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210930063700.407511-1-nyh@scylladb.com>
2021-09-30 09:34:39 +02:00
Pavel Emelyanov
beb345c00a code: Rename get_local_host_id() into load_...()
There will appear the future-less method which better deserves
the get_ prefix, so give the existing method the load_ one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 10:33:57 +03:00
Pavel Emelyanov
e49dc4ed0d system_keyspace: Coroutinize get_/set_local_host_id
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-30 10:33:57 +03:00