Don't call get_datacenter(ep) without first checking
has_endpoint(ep), since the former may abort
on internal error if the endpoint is not listed
in topology.
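
A minimal sketch of the guarded lookup, assuming a simplified stand-in for the topology class. `toy_topology`, `add_endpoint` and `datacenter_of` are hypothetical names used only for illustration; only `has_endpoint` and `get_datacenter` come from the message above, and their real signatures may differ.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the real topology class; only the two calls
// named in the commit message are modelled.
class toy_topology {
    std::unordered_map<std::string, std::string> _dc_by_endpoint;
public:
    void add_endpoint(const std::string& ep, const std::string& dc) {
        _dc_by_endpoint[ep] = dc;
    }
    bool has_endpoint(const std::string& ep) const {
        return _dc_by_endpoint.count(ep) != 0;
    }
    // Precondition: has_endpoint(ep). The real function may abort on an
    // unknown endpoint; this toy version throws std::out_of_range instead.
    const std::string& get_datacenter(const std::string& ep) const {
        return _dc_by_endpoint.at(ep);
    }
};

// Guarded lookup: check has_endpoint() first and return std::nullopt
// for an endpoint that is not listed in the topology.
std::optional<std::string> datacenter_of(const toy_topology& topo, const std::string& ep) {
    if (!topo.has_endpoint(ep)) {
        return std::nullopt;
    }
    return topo.get_datacenter(ep);
}
```

Returning an optional is just one way to surface the "not listed" case; a caller could equally skip the endpoint or fall back to a default.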
Refs #11870
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #12054
Returns an unordered set of datacenter names
to be used by network_topology_replication_strategy
and for ks_prop_defs.
The set is kept in sync with _dc_endpoints.
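
One possible way to keep such a set in sync is sketched below. `toy_dc_index` and its `add_endpoint`/`remove_endpoint`/`get_datacenters` methods are hypothetical, and dropping a datacenter name once its last endpoint is removed is an assumption about the intended behavior, not something this message states.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical illustration: a per-datacenter endpoint map plus a set of
// datacenter names that is updated on every change to the map.
class toy_dc_index {
    std::unordered_map<std::string, std::unordered_set<std::string>> _dc_endpoints;
    std::unordered_set<std::string> _datacenters;
public:
    void add_endpoint(const std::string& dc, const std::string& ep) {
        _dc_endpoints[dc].insert(ep);
        _datacenters.insert(dc);
    }
    void remove_endpoint(const std::string& dc, const std::string& ep) {
        auto it = _dc_endpoints.find(dc);
        if (it == _dc_endpoints.end()) {
            return;
        }
        it->second.erase(ep);
        // Assumption: a datacenter with no endpoints left is forgotten.
        if (it->second.empty()) {
            _dc_endpoints.erase(it);
            _datacenters.erase(dc);
        }
    }
    // Cheap accessor: no need to rebuild the set from the map on each call.
    const std::unordered_set<std::string>& get_datacenters() const noexcept {
        return _datacenters;
    }
};
```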
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #12023
Since we moved all IaaS code to scylla-machine-image, we no longer need
the AMI variable in the sysconfig file or the --ami parameter in the setup scripts,
and /etc/scylla/ami_disabled was never used.
So let's drop all of them from Scylla core.
Related to scylladb/scylla-machine-image#61
Closes #12043
When started, the sstable_directory is constructed with a bunch of booleans that control how its process_sstable_dir method works. It's shorter and simpler to pass these booleans directly into the method, all the more so because another flag is already passed this way.
Closes #12005
* github.com:scylladb/scylladb:
sstable_directory: Move all RAII booleans onto flags
sstable_directory: Convert sort-sstables argument to flags struct
sstable_directory: Drop default filter
The PR introduces top level repair tasks representing repair and node operations
performed with repair. The actions performed as a part of these operations are
moved to corresponding tasks' run methods.
Also a small change to repair module is added.
Closes #11869
* github.com:scylladb/scylladb:
repair: define run for data_sync_repair_task_impl
repair: add data_sync_repair_task_impl
tasks: repair: add noexcept to task impl constructor
repair: define run for user_requested_repair_task_impl
repair: add user_requested_repair_task_impl
repair: allow direct access to max_repair_memory_per_range
When processing a query, we keep a pointer to an effective_replication_map.
In a couple of places we used the latest topology instead of the one held by the effective_replication_map
that the query uses. That might lead to inconsistencies if, for example, a node is removed from the topology by a decommission that runs concurrently with the query.
This change gets the topology& from the e_r_m in those cases.
Fixes #12050
Closes #12051
* github.com:scylladb/scylladb:
storage_proxy: pass topology& to sort_endpoints_by_proximity
storage_proxy: pass topology& to is_worth_merging_for_range_query
Currently the ctor of said class always allocates as it copies the
provided name string and it creates a new name via format().
We want to avoid this, now that the validator is used on the read path.
So defer creating the formatted name to when we actually want to log
something, which is either when log level is debug or when an error is
found. We don't care about performance in either case, but we do care
about it on the happy path.
Further to the above, provide a constructor for string literal names and
when this is used, don't copy the name string, just save a view to it.
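
A rough sketch of the two ideas (formatting the name lazily, and storing a view for string literals). The class `toy_validator_name`, its methods and the `validator(...)` output format are hypothetical and do not mirror the real validator. Note that the array-reference constructor also matches non-literal `char` arrays, so a real implementation would need to guard against that.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <variant>

class toy_validator_name {
    // Either a view onto a string literal (no allocation) or an owned copy.
    std::variant<std::string_view, std::string> _name;
public:
    // String-literal constructor: a literal has static storage duration,
    // so saving a view to it is safe and avoids the copy.
    template <std::size_t N>
    explicit toy_validator_name(const char (&literal)[N])
        : _name(std::in_place_type<std::string_view>, literal, N - 1) {}

    // General constructor: takes ownership of the caller's string.
    explicit toy_validator_name(std::string owned)
        : _name(std::in_place_type<std::string>, std::move(owned)) {}

    bool owns_copy() const noexcept {
        return std::holds_alternative<std::string>(_name);
    }

    std::string_view name() const {
        return std::visit([](const auto& n) { return std::string_view(n); }, _name);
    }

    // Built only on demand (debug logging or error reporting), so the
    // happy path never pays for the formatting allocation.
    std::string full_name() const {
        return "validator(" + std::string(name()) + ")";
    }
};
```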
Refs: #11174
Closes #12042
Contains fixes requested in the issue (and some tiny extras), together with analysis why they don't affect the users (see commit messages).
Fixes [#11800](https://github.com/scylladb/scylladb/issues/11800)
Closes #11926
* github.com:scylladb/scylladb:
alternator: add maybe_quote to secondary indexes 'where' condition
test/alternator: correct xfail reason for test_gsi_backfill_empty_string
test/alternator: correct indentation in test_lsi_describe
alternator: fix wrong 'where' condition for GSI range key
There's a bunch of booleans that control the behavior of sstable
directory scanning. Currently they are described as verbose
bool_class<>-es and are passed to sstable_directory at construction time.
However, these are not used outside of .process_sstable_dir() method and
moving them onto recently added flags struct makes the code much
shorter (29 insertions(+), 121 deletions(-))
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The sstable_directory::process_sstable_dir() accepts a boolean to
control its behavior when collecting sstables. Turn this boolean into a
structure of flags. The intention is to extend this flags set in the
future (next patch).
This boolean is true almost everywhere, but one place sets it to false in a
"verbose" manner, like this:
    bool sort_sstables_according_to_owner = false;
    process_sstable_dir(directory, sort_sstables_according_to_owner).get();
The local variable is not used after the call. Using designated initializers
solves the verbosity in a nicer manner.
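
For illustration, a flags struct with C++20 designated initializers could look roughly like this. `toy_process_flags`, `toy_process_sstable_dir` and the `another_flag` member are hypothetical placeholders; only the `sort_sstables_according_to_owner` name is taken from the snippet above.

```cpp
#include <string>

// Hypothetical flags struct; members default to the common case so that
// callers only spell out what they change.
struct toy_process_flags {
    bool sort_sstables_according_to_owner = true;
    bool another_flag = false;  // placeholder for flags added later
};

// Toy stand-in for process_sstable_dir(): it only reports which mode was
// requested, it does not touch any files.
std::string toy_process_sstable_dir(const std::string& dir, toy_process_flags flags) {
    return dir + (flags.sort_sstables_according_to_owner
                      ? ": sorted by owner"
                      : ": left as found");
}

// The call site names the flag inline; no local bool variable is needed.
std::string example_call() {
    return toy_process_sstable_dir("/data", { .sort_sstables_according_to_owner = false });
}
```

Adding another member to the struct later does not change existing call sites, which is the extension path the message hints at.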
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's used as default argument for .reshape() method, but callers specify
it explicitly. At the same time the filter is simple enough and is only
used in one place so that the caller can just use explicit lambda.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It mustn't use the latest topology that may differ from the
one used by the query as it may be missing nodes
(e.g. after concurrent decommission).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It mustn't use the latest topology that may differ from the
one used by the query as it may be missing nodes
(e.g. after concurrent decommission).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This bug doesn't affect anything; the reason is described in the commit:
'alternator: fix wrong 'where' condition for GSI range key'.
But it's theoretically correct to escape those key names and
the difference can be observed via CQL's describe table. Before
the patch the 'where' condition is missing one double quote in the variable
name, making it mismatched with the corresponding column name.
Otherwise, I think, the assert is not executed in the loop. And I am not sure why the lsi variable can be bound
to anything. When I tested, it was pointing to the last element in lsis...
This bug doesn't manifest in a visible way to the user.
Adding the index to an existing table via GlobalSecondaryIndexUpdates is not supported
so we don't need to consider what could happen for empty values of index range key.
After the index is added, the only interesting thing the user can do is omit
the value (null or empty are not allowed, see test_gsi_empty_value and
test_gsi_null_value).
In practice, regardless of the 'where' condition, the underlying materialized
view code skips row updates with missing keys, as per this comment:
'If one of the key columns is missing, set has_new_row = false
meaning that after the update there will be no view row'.
That's why the added test passes both before and after the patch.
But it's still useful to include it to exercise those code paths.
Fixes #11800
This patch includes a translation of several additional small test files
from Cassandra's CQL unit test directory cql3/validation/operations.
All tests included here pass on both Cassandra and Scylla, so they did
not discover any new Scylla bugs, but can be useful in the future as
regression tests.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #12045
Take advantage of the facts that both the owned ranges
and the initial non_owned_ranges (derived from the set of sstables)
are deoverlapped and sorted by start token to turn
the calculation of the final non_owned_ranges from
quadratic to linear.
Fixes #11922
Closes #11903
* github.com:scylladb/scylladb:
dht: optimize subtract_ranges
compaction: refactor dht::subtract_ranges out of get_ranges_for_invalidation
compaction_manager: needs_cleanup: get first/last tokens from sstable decorated keys
In order to support different storage kinds for sstable files (e.g. -- s3) it's needed to localize all the places that manipulate files on a POSIX filesystem so that custom storage could implement them in its own way. This set moves the deletion log manipulations to the sstable_directory.cc, which already "knows" that it works over a directory.
Closes #12020
* github.com:scylladb/scylladb:
sstables: Delete log file in replay_pending_delete_log()
sstables: Move deletion log manipulations to sstable_directory.cc
sstables: Open-code delete_sstables() call
sstables: Use fs::path in replay_pending_delete_log()
sstables: Indentation fix after previous patch
sstables: Coroutinize replay_pending_delete_log
sstables: Read pending delete log with one line helper
sstables: Dont write pending log with file_writer
Take advantage of the fact that both ranges and
ranges_to_subtract are deoverlapped and sorted by
start token to reduce the calculation complexity from
quadratic to linear.
Fixes #11922
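
The linear approach can be sketched as a single merge-like pass over both lists. This is an illustration under simplifying assumptions, not the actual dht::subtract_ranges: ranges are half-open integer intervals `[first, second)` rather than token ranges, and `toy_range` and `toy_subtract_ranges` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Half-open interval [first, second) over integers; a simplification of
// token ranges, which also have inclusive/exclusive and unbounded ends.
using toy_range = std::pair<std::int64_t, std::int64_t>;

// Both inputs must be sorted by start and non-overlapping (deoverlapped).
// Each input element is visited a bounded number of times, so the cost is
// linear in ranges.size() + to_subtract.size().
std::vector<toy_range> toy_subtract_ranges(const std::vector<toy_range>& ranges,
                                           const std::vector<toy_range>& to_subtract) {
    std::vector<toy_range> result;
    std::size_t j = 0;  // first subtrahend that can still affect the output
    for (const auto& r : ranges) {
        std::int64_t cur = r.first;  // start of the not-yet-covered part of r
        // Subtrahends ending at or before cur cannot affect r or any later range.
        while (j < to_subtract.size() && to_subtract[j].second <= cur) {
            ++j;
        }
        // Walk the subtrahends that start before r ends.
        while (j < to_subtract.size() && to_subtract[j].first < r.second) {
            if (to_subtract[j].first > cur) {
                result.emplace_back(cur, to_subtract[j].first);
            }
            cur = std::max(cur, to_subtract[j].second);
            if (cur >= r.second) {
                break;  // this subtrahend may also overlap the next range: keep j
            }
            ++j;
        }
        if (cur < r.second) {
            result.emplace_back(cur, r.second);
        }
    }
    return result;
}
```

For example, subtracting `{[2,4), [8,22), [25,26)}` from `{[0,10), [20,30)}` leaves `{[0,2), [4,8), [22,25), [26,30)}`.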
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The algorithm is generic and can be used elsewhere.
Add a unit test for the function before it gets
optimized in the following patch.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Currently, the function is inefficient in two ways:
1. unnecessary copy of first/last keys to automatic variables
2. redecorating the partition keys with the schema passed to
needs_cleanup.
We can just use the tokens from the sstable first/last decorated keys.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The deletion log concept uses the fact that files are on a POSIX
filesystem. Support for another storage type will have to reimplement
this place, so keep the FS-specific code in _directory.cc file.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's not used by any other code, and to be used it requires the caller to
transform TOC file names by prepending the sstable directory to them. Things
get shorter and simpler when the helper code is merged into the caller.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's called by a code that has fs::path at hand and internally uses
helpers that need fs::path too, so no need to convert it back and forth.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's a wrapper over output_stream with offset tracking and the tracking
is not needed to generate a log file. As a bonus of switching back we
get a stream.write(sstring) sugar.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Fix https://github.com/scylladb/scylladb/issues/11598
This PR adds the troubleshooting article submitted by @syuu1228 in the deprecated _scylla-docs_ repo, with https://github.com/scylladb/scylla-docs/pull/4152.
I copied and reorganized the content and rewrote it a little according to the RST guidelines so that the page renders correctly.
@syuu1228 Could you review this PR to make sure that my changes didn't distort the original meaning?
Closes #11626
* github.com:scylladb/scylladb:
doc: apply the feedback to improve clarity
doc: add the link to the new Troubleshooting section and replace Scylla with ScyllaDB
doc: add the new page to the toctree
doc: add a troubleshooting article about the missing configuration files
We had an xfailing test that reproduced a case where Alternator tried
to report an error when the request was too long, but the boto library
didn't see this error and threw a "Broken Pipe" error instead. It turns
out that this wasn't a Scylla bug but rather a bug in urllib3, which
overzealously reported a "Broken Pipe" instead of trying to read the
server's response. It turns out this issue was already fixed in
https://github.com/urllib3/urllib3/pull/1524
and on modern installations the test that used to fail now passes
and reports "XPASS".
So in this patch we remove the "xfail" tag, and skip the test if
running an old version of urllib3.
Fixes #8195
Closes #12038
Fragment reordering and fragment dropping bugs have been plaguing us since forever. To fight them we added a validator to the sstable write path to prevent really messed up sstables from being written.
This series adds validation to the mutation compactor. This will cover reads and compaction among others, hopefully ridding us of such bugs on the read path too.
This series fixes some benign looking issues found by unit tests after the validator was added -- although how benign a producer emitting two partition-ends is depends entirely on how the consumer reacts to it, so no such bug is actually benign.
Fixes: https://github.com/scylladb/scylladb/issues/11174
Closes #11532
* github.com:scylladb/scylladb:
mutation_compactor: add validator
mutation_fragment_stream_validator: add a 'none' validation level
test/boost/mutation_query_test: test_partition_limit: sort input data
querier: consume_page(): use partition_start as the sentinel value
treewide: use ::for_partition_end() instead of ::end_of_partition_tag_t{}
treewide: use ::for_partition_start() instead of ::partition_start_tag_t{}
position_in_partition: add for_partition_{start,end}()
Adds unit tests for the function `expr::prepare_expression`.
Three minor bugs were found by these tests, all fixed in this PR.
1. When preparing a map, the type for tuple constructor was taken from an unprepared tuple, which has `nullptr` as its type.
2. Preparing an empty nonfrozen list or set resulted in `null`, but preparing a map didn't. Fixed this inconsistency.
3. Preparing a `bind_variable` with `nullptr` receiver was allowed. The `bind_variable` ended up with a `nullptr` type, which is incorrect. Changed it to throw an exception.
Closes #11941
* github.com:scylladb/scylladb:
test preparing expr::usertype_constructor
expr_test: test that prepare_expression checks style_type of collection_constructor
expr_test: test preparing expr::collection_constructor for map
prepare_expr: make preparing nonfrozen empty maps return null
prepare_expr: fix a bug in map_prepare_expression
expr_test: test preparing expr::collection_constructor for set
expr_test: test preparing expr::collection_constructor for list
expr_test: test preparing expr::tuple_constructor
expr_test: test preparing expr::untyped_constant
expr_test_utils: add make_bigint_raw/const
expr_test_utils: add make_tinyint_raw/const
expr_test: test preparing expr::bind_variable
cql3: prepare_expr: forbid preparing bind_variable without a receiver
expr_test: test preparing expr::null
expr_test: test preparing expr::cast
expr_test_utils: add make_receiver
expr_test_utils: add make_smallint_raw/const
expr_test: test preparing expr::token
expr_test: test preparing expr::subscript
expr_test: test preparing expr::column_value
expr_test: test preparing expr::unresolved_identifier
expr_test_utils: mock data_dictionary::database
We had a test that used to fail because of issue #8745. But this issue
was already fixed, and we forgot to remove the "xfail" marker. The test
now passes, so let's remove the xfail marker.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #12039
Add new guide for upgrading 5.1 to 5.2.
In this new upgrade doc, include additional steps for enabling
Raft using the `consistent_cluster_management` flag. Note that we don't
have this flag yet but it's planned to replace the experimental flag in
5.2.
In the "Raft in ScyllaDB" document, add sections about:
- enabling Raft in existing clusters in Scylla 5.2,
- verifying that the internal Raft upgrade procedure finishes
successfully,
- recovering from a stuck Raft upgrade procedure or from a majority loss
situation.
Fix some problems in the documentation, e.g. it is not possible to
enable Raft in an existing cluster in 5.0, but the documentation claimed
that it is.
Follow-up items:
- if we decide for a different name for `consistent_cluster_management`,
use that name in the docs instead
- update the warnings in Scylla to link to the Raft doc
- mention Enterprise versions once we know the numbers
- update the appropriate upgrade docs for Enterprise versions
once they exist
Closes #11910
* github.com:scylladb/scylladb:
docs: describe the Raft upgrade and recovery procedures
docs: add upgrade guide 5.1 -> 5.2
We recently (in 7fbad8de87) made sure all admission paths can trigger the eviction of inactive reads. As reader eviction happens in the background, a mechanism was added to make sure only a single eviction fiber is running at any given time. This mechanism, however, had a preemption point between stopping the fiber and releasing the evict lock. This gave an opportunity for either new waiters or inactive readers to be added without the fiber acting on them. Since the fiber still held the lock, it also prevented other eviction fibers from starting. This could create a situation where the semaphore could admit new reads by evicting inactive ones, but it still has waiters. Since an empty waitlist is also an admission criterion, once one waiter is wrongly added, many more can accumulate.
This series fixes this by ensuring the lock is released in the instant the fiber decides there is no more work to do.
It also fixes the assert failure on recursive eviction and adds a detection to the inactive/waiter contradiction.
Fixes: #11923
Refs: #11770
Closes #12026
* github.com:scylladb/scylladb:
reader_concurrency_semaphore: do_wait_admission(): detect admission-waiter anomaly
reader_concurrency_semaphore: evict_readers_in_the_background(): eliminate blind spot
reader_concurrency_semaphore: do_detach_inactive_read(): do a complete detach
This series contains a mixed bag of improvements to `scylla sstable dump-data`. These improvements are mostly aimed at making the json output clearer, getting rid of any ambiguities.
Closes #12030
* github.com:scylladb/scylladb:
tools/scylla-sstable: traverse sstables in argument order
tools/scylla-sstable: dump-data docs: s/clustering_fragments/clustering_elements
tools/scylla-sstable: dump-data/json: use Null instead of "<unknown>"
tools/scylla-sstable: dump-data/json: use more uniform format for collections
tools/scylla-sstable: dump-data/json: make cells easier to parse
The framework has recently started using a separate set of unique IDs to
identify servers, but the log file and workdir are still named using the
last part of the IP address.
This is confusing: the test logs sometimes don't provide the IP addr
(only the ID), and even if they do, the reader of the test log may not
know that they need to look at the last part of the IP to find the
node's log/workdir.
Also using ID will be necessary if we want to reuse IP addresses (e.g.
during node replace, or simply not to run out of IP addresses during
testing).
So use the ID instead to name the workdir and log file.
Also, when starting a test case, print the used cluster. This will make
it easier to map server IDs to their IP addresses when browsing through
the test logs.
Closes #12018
* github.com:scylladb/scylladb:
test/pylib: manager_client: print used cluster when starting test case
test/pylib: scylla_cluster: use server ID to name workdir and log file, not IP address