During the investigation of scylladb/scylladb#20282, it was discovered that implementations of speculating read executors have undefined behavior when called with an incorrect number of read replicas. This PR introduces two levels of condition checking:
- Condition checking in speculating read executors for the number of replicas.
- Checking the consistency of the Effective Replication Map in filter_for_query(): the map is considered incorrect if the list of replicas contains a node from a data center whose replication factor is 0.
Please note: This PR does not fix the issue found in scylladb/scylladb#20282; it only adds condition checks to prevent undefined behavior in cases of inconsistent inputs.
Refs scylladb/scylladb#20625
As this issue applies to the releases versions and can affect clients, we need backports to 6.0, 6.1, 6.2.
(cherry picked from commit 132358dc92)
(cherry picked from commit ae23d42889)
(cherry picked from commit ad93cf5753)
(cherry picked from commit 8db6d6bd57)
(cherry picked from commit c373edab2d)
Refs #20851Closesscylladb/scylladb#21068
* github.com:scylladb/scylladb:
Add conditions checking for get_read_executor
Avoid an extra call to block_for in db::filter_for_query.
Improve code readability in consistency_level.cc and storage_proxy.cc
tools: Add build_info header with functions providing build type information
tests: Add tests for alter table with RF=1 to RF=0
During split prepare phase, there will be more than 1 compaction group with
overlapping token range for a given replica.
Assume tablet 1 has sstable A containing deleted data, and sstable B containing
a tombstone that shadows data in A.
Then split starts:
1) sstable B is split first, and moved from main (unsplit) group to a
split-ready group
2) now compaction runs in split-ready group before sstable A is split
tombstone GC logic today only looks at underlying group, so compaction is step
2 will discard the deleted data in A, since it belongs to another group (the
unsplit one), and so the tombstone can be purged incorrectly.
To fix it, compaction will now work with all uncompacting sstables that belong
to the same replica, since tombstone GC requires all sstables that possibly
contain shadowed data to be available for correct decision to be made.
Fixes#20044.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 93815e0649)
A new header provides `constexpr` functions to retrieve build
type information: `get_build_type()`, `is_release_build()`,
and `is_debug_build()`. These functions are useful when adding
changes that should be enabled at compile time only for
specific build types.
(cherry picked from commit ae23d42889)
A dialect is a different way to interpret the same CQL statement.
Examples:
- how duplicate bind variable names are handled (later in this series)
- whether `column = NULL` in LWT can return true (as is now) or
whether it always returns NULL (as in SQL)
Currently, dialect is an empty structure and will be filled in later.
It is passed to query_processor methods that also accept a CQL string,
and from there to the parser. It is part of the prepared statement cache
key, so that if the dialect is changed online, previous parses of the
statement are ignored and the statement is prepared again.
The patch is careful to pick up the dialect at the entry point (e.g.
CQL protocol server) so that the dialect doesn't change while a statement
is parsed, prepared, and cached.
(cherry picked from commit d69bf4f010)
It is unsafe to restrict the sync nodes for repair to the source data center if it has too low replication factor in network_topology_replication_strategy, or if other nodes in that DC are ignored.
Also, this change restricts the usage of source_dc to `network_topology` and `everywhere_topology`
strategies, as with simple replication strategy
there is no guarantee that there would be any
more replicas in that data center.
Fixes#16826
Reproducer submitted as https://github.com/scylladb/scylla-dtest/pull/3865
It fails without this fix and passes with it.
* Requires backport to live versions. Issue hit in the filed with 2022.2.14
(cherry picked from commit 8b1877f3ca)
(cherry picked from commit 0419b1d522)
(cherry picked from commit b5d0ab092c)
(cherry picked from commit 9729dd21c3)
(cherry picked from commit 8665eef98c)
(cherry picked from commit 5f655e41e3)
Refs #16827Closesscylladb/scylladb#20228
* github.com:scylladb/scylladb:
raft_rebuild: propagate source_dc force option to rebuild_option
repair: do_rebuild_replace_with_repair: use source_dc only when safe
repair: replace_with_repair: pass the replace_node downstream
repair: replace_with_repair: pass ignore_nodes as a set of host_id:s
repair: replace_rebuild_with_repair: pass ks_erms from caller
nodetool: rebuild: add force option
Add and use utils::optional_param to pass source_dc
To be used to force usage of source_dc, even
when it is unsafe for rebuild.
Update docs and add test/nodetool/test_rebuild.py
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 0419b1d522)
Counter updates break under tablet migration (#18180), and for this reason counters need to be disabled until the problem is fixed. It's enough to forbid creating a table with counters, as altering a table without counters already cannot result in the table having counters:
1) Adding a counter column to a table without counters:
```
cqlsh> ALTER TABLE temp.cf ADD (col_name counter);
ConfigurationException: Cannot add a counter column (col_name) in a non counter column family
```
2) Altering a column to be of the counter type:
```
cqlsh> ALTER TABLE temp.cf ALTER col_name TYPE counter;
ConfigurationException: Cannot change col_name from type int to type counter: types are incompatible.
```
Fixes: #19449
Fixes: https://github.com/scylladb/scylladb/issues/18876
Need to backport to 6.0, as this is broken there.
Closesscylladb/scylladb#19518
* github.com:scylladb/scylladb:
doc: add notes to feature pages which don't support tablets
cql: adjust warning about tablets
cql: forbid having counter columns in tablets tables
Counter updates break under tablet migration (#18180), and for this
reason they need to be disabled until the problem is fixed.
It's enough to forbid creating a table with counters, as altering a
table without counters already cannot result in the table having
counters:
1) Adding a counter column to a table without counters:
```
cqlsh> ALTER TABLE temp.cf ADD (col_name counter);
ConfigurationException: Cannot add a counter column (col_name) in a non counter column family
```
2) Altering a column to be of the counter type:
```
cqlsh> ALTER TABLE temp.cf ALTER col_name TYPE counter;
ConfigurationException: Cannot change col_name from type int to type counter: types are incompatible.
```
Fixes: #19449
The following command had been executed to get the
list of headers that did not contain '#pragma once':
'grep -rnw . -e "#pragma once" --include *.hh -L'
This change adds missing include guard to headers
that did not contain any guard.
Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
Closesscylladb/scylladb#19626
Previously optimized clang installation was not used standard build
script, it overwrites preinstalled Fedora's clang binaries instead.
However this breaks on clang-18.1.8, since libLTO versioning convention.
To avoid such problem, let's switch to standard installation method and
swith install prefix to /usr/local.
Fixes#19203Closesscylladb/scylladb#19505
in 3c7af287, cqlsh's reloc package was marked as "noarch", and its
filename was updated accordingly in `configure.py`, so let's update
the CMake building system accordingly.
this change should address the build failure of
```
08:48:14 [3325/4124] Generating ../Debug/dist/tar/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz
08:48:14 FAILED: Debug/dist/tar/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz /jenkins/workspace/scylla-master/scylla-ci/scylla/build/Debug/dist/tar/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz
08:48:14 cd /jenkins/workspace/scylla-master/scylla-ci/scylla/build/dist && /usr/bin/cmake -E copy /jenkins/workspace/scylla-master/scylla-ci/scylla/tools/cqlsh/build/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz /jenkins/workspace/scylla-master/scylla-ci/scylla/build/Debug/dist/tar/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz
08:48:14 Error copying file "/jenkins/workspace/scylla-master/scylla-ci/scylla/tools/cqlsh/build/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz" to "/jenkins/workspace/scylla-master/scylla-ci/scylla/build/Debug/dist/tar/scylla-cqlsh-6.1.0~dev-0.20240629.60955ead75ef.noarch.tar.gz".
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19544
podman does not allocate a tty by default, so without `-t` or `--tty`,
one cannot use a functional terminal when interacting with the
container. that what one can expect when running `dbuild -i --`, and
we are greeted with :
```
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
```
after this change, one can enjoy the good-old terminal as usual
after being dropped to the container provided by `dbuild -i --`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19550
this change updates the cqlsh submodule:
* tools/cqlsh/ ba83aea3...73bdbeb0 (4):
> install.sh: replace tab with spaces
> define the the debug packge is empty
> tests: switch from using cqlsh bash to the test the python file
> package python driver as wheels
it also includes follow change to package cqlsh as a regular
rpm instead of as a "noarch" rpm:
so far cqlsh bundles the python-driver in, but only as source.
meaning the package wasn't architecture, and also didn't
have the libev eventloop compiled in.
Since from python 3.12 and up, that would mean we would
fallback into asyncio eventloop (which still exprimental)
or into error (once we'll sync with the driver upstream)
so to avoid those, we are change the packaging of cqlsh
to be architecture specific, and get cqlsh compiled, and bundle
all of it's requirements as per architecture installed bundle of wheels.
using `shiv`, i.e. one file virtualenv that we'll be packing
into our artifacts
Ref: https://github.com/scylladb/scylla-cqlsh/issues/90
Ref: https://github.com/scylladb/scylla-cqlsh/pull/91
Ref: https://github.com/linkedin/shivClosesscylladb/scylladb#19385
* tools/cqlsh ba83aea...242876c (1):
> Merge 'package python driver as wheels' from Israel Fruchter
Update tools/cqlsh/ submodule
in which, the change of `define the the debug packge is empty`
should address the build failure like
```
Processing files: scylla-cqlsh-debugsource-6.1.0~dev-0.20240624.c7748f60c0bc.aarch64
error: Empty %files file /jenkins/workspace/scylla-master/next/scylla/tools/cqlsh/build/redhat/BUILD/scylla-cqlsh/debugsourcefiles.list
RPM build errors:
Empty %files file /jenkins/workspace/scylla-master/next/scylla/tools/cqlsh/build/redhat/BUILD/scylla-cqlsh/debugsourcefiles.list
```
Closesscylladb/scylladb#19473
flat_mutation_reader_v2 was introduced in a pair of commits in 2021:
e3309322c3 "Clone flat_mutation_reader related classes into v2 variants"
08b5773c12 "Adapt flat_mutation_reader_v2 to the new version of the API"
as a replacement for flat_mutation_reader, using range_tombstone_change
instead of range_tombstone to represent represent range tombstones. See
those commits for more information.
The transition was incremental; the last use of the original
flat_mutation_reader was removed in 2022 in commit
026f8cc1e7 "db: Use mutation_partition_v2 in mvcc"
In turn, flat_mutation_reader was introduced in 2017 in commit
748205ca75 "Introduce flat_mutation_reader"
To transition from a mutation_reader that nested rows within
a partition in a separate stream, to a flat reader that streamed
partitions and rows in the same stream.
Here, we reclaim the original name and rename the awkward
flat_mutation_reader_v2 to mutation_reader.
Note that mutation_fragment_v2 remains since we still use the original
for compatibilty, sometimes.
Some notes about the transition:
- files were also renamed. In one case (flat_mutation_reader_test.cc), the
rename target already existed, so we rename to
mutation_reader_another_test.cc.
- a namespace 'mutation_reader' with two definitions existed (in
mutation_reader_fwd.hh). Its contents was folded into the mutation_reader
class. As a result, a few #includes had to be adjusted.
Closesscylladb/scylladb#19356
Add allure-pytest pip dependency to be able to use it for generating the allure report later.
Main benefits of the allure report:
1. Group test failures
2. Possibility to attach log files to she test itself
3. Timeline of test run
4. Test description on the report
5. Search by test name or tag
[avi: regenerate toolchain]
Closesscylladb/scylladb#19335
before this change, we use runtime format string to format error
messages. but it does not have the compile time format check.
if we pass arguments which are not formattable, {fmt} throws at
runtime, instead of error out at compile-time. this could be very
annoying, because we format error messages at the error handling
path. but if user ends up seeing an exception for {fmt} instead
of a nice error message, it would be far from helpful.
in this change, we
- use compile-time format string
- fix two caller sites, where we pass `std::exception_ptr` to
{fmt}, but `std::exception_ptr` is not formattable by {fmt} at
the time of writing. we do have operator<< based formatter for
it though. so we delegate to `fmt::streamed` to format it.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19294
The schema of the sstable can be interesting, so log it with trace
level. Unfortunately, this is not the nice CQL statement we are used to
(that requires a database object), but the not-nearly-so-nice CFMetadata
printout. Still, it is better then nothing.
When auto-detecting the schema of the sstable, if all other methods
failed, load the schema from the sstable's serialization header. This
schema is incomplete. It is just enough to parse and display the content
of the sstable. Although parsing and displaying the content of the
sstable is all scylla-sstable does, it is more future-compatible to us
the full schema when possible. So the always-available but minimal
schema that each sstable has on itself, is used just as a fallback.
The test which tested the case when all schema load attempts fail,
doesn't work now, because loading the serialization header always
succeeds. So convert this test into two positive tests, testing the
serialization header schema fallback instead.
Allows loading the schema from an sstable's serialization header. This
schema is incomplete, but it is enough to parse and display the content
of the sstable.
* tools/java 88809606c8...01ba3c196f (3):
> Revert "build: don't add nonexistent directory 'lib' to relocatable packages"
> build: run antlr in a separate process
> build: don't add nonexistent directory 'lib' to relocatable packages
* tools/cqlsh c8158555...0d58e5ce (6):
> cqlsh.py: fix server side describe after login command
> cqlsh: try server-side DESCRIBE, then client-side
> Refactor tests to accept both client and server side describe
> github actions: support testing with enterprise release
> Add the tab-completion support of SERVICE_LEVEL statements
> reloc/build_reloc.sh: don't use `--no-build-isolation`
Closesscylladb/scylladb#18990
Separate keyspace which also behaves as system brings
little benefit while creating some compatibility problems
like schema digest mismatch during rollback. So we decided
to move auth tables into system keyspace.
Fixes https://github.com/scylladb/scylladb/issues/18098Closesscylladb/scylladb#18769
this change was created in the same spirit of 505900f18f. because
we are deprecating the operator<< for vector and unorderd_map in
Seastar, some tests do not compile anymore if we disable these
operators. so to be prepared for the change disabling them, let's
include test/lib/test_utils.hh for accessing the printer dedicated
for Boost.test. and also '#include <fmt/ranges.h>' when necessary,
because, in order to format the ranges using {fmt}, we need to
use fmt/ranges.h.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
User-defined types can depend on each other, creating directed acyclic graph.
In order to support restoring schema from `DESC SCHEMA`, UDTs should be
ordered topologically, not alphabetically as it was till now.
This patch changes the way UDTs are ordered in `DESC SCHEMA`/`DESC KEYSPACE <ks>` statements, so the output can be safely copy-pasted to restore the schema.
Fixes#18539Closesscylladb/scylladb#18302
* github.com:scylladb/scylladb:
test/cql-pytest/test_describe: add test for UDTs ordering
cql3/statements/describe_statement: UDTs topological sorting
cql3/statements/describe_statement: allow to skip alphabetical sorting
types: add a method to get all referenced user types
db/cql_type_parser: use generic topological sorting
db/cql_type_parses: futurize raw_builder::build()
test/boost: add test for topological sorting
utils: introduce generic topological sorting algorithm
This feature corrected how we store the token in secondary indexes. It
was introduced in 7ff72b0ba5 (2020; 4.4) and can now be assumed present
everywhere. Note that we still support indexes created with the old format.
Currently, invoking `nodetool ring` on a tablet keyspace fails with an error, because it doesn't pass the required table parameter to `/storage_service/ownership/{keyspace}`. Further to this, the command will currently always output the vnode ring, regardless of the keyspace and table parameter. This series fixes this, adding tablet support to `/storage_service/tokens_endpoint`, which will now return the tablet ring (tablet token -> tablet primary replica mapping) if the new keyspace and table parameters are provided.
`nodetool status` also gets a touch-up, to provide the tablet ring's token count (the tablet count) when invoked with a tablet keyspace and table.
Fixes: #17889Fixes: #18474
- [x] ** native-nodetool is new functionality, no backport is needed **
Closesscylladb/scylladb#18608
* github.com:scylladb/scylladb:
test/nodetool: make test pass with cassandra nodetool
tools/scylla-nodetool: status: fix token count for tablets
tools/scylla-nodetool: add tablet support to ring command
api/storage_service: add tablet support for /storage_service/tokens_endpoint
service/storage_service: introduce get_tablet_to_endpoint_map()
locator/tablets: introduce the primary replica concept
Currently, all documentation links that feature anywhere in the help output of scylla-nodetool, are hard-coded to point to the documentation of the latest stable release. As our documentation is version and product (open-source or enterprise) specific, this is not correct. This PR addresses this, by generating documentation links such that they point to the documentation appropriate for the product and version of the scylladb release.
Fixes: https://github.com/scylladb/scylladb/issues/18276
- [x] the native nodetool is a new feature, no backport needed
Closesscylladb/scylladb#18476
* github.com:scylladb/scylladb:
tools/scylla-nodetool: make doc link version-specific
release: introduce doc_link()
build: pass scylla product to release.cc
Currently, the token count column is always based on the vnodes, which
makes no sense for tablet keyspaces. If a tablet keyspace is provided as
the keyspace argument, don't print the vnode token count. If the user
provided a table argument as well, print the tablet count, otherwise
print "?".
Add a table parameter. Pass both keyspace and table (when provided) to
the /storage_service/tokens_endpoint API endpoint, so that the returned
(and printed) token ring is that of the table's tablets, not the vnode
ring.
Also pass the table param to the ownership API, which will complain if
this param is missing for a tablet keyspace.
* tools/cqlsh e5f5eafd...c8158555 (11):
> cqlshlib/sslhandling: fix logic of `ssl_check_hostname`
> cqlshlib/sslhandling.py: don't use empty userkey/usercert
> Dockerfile: noninteractive isn't enough for answering yet on apt-get
> fix cqlsh version print
> cqlshlib/sslhandling: change `check_hostname` deafult to False
> Introduce new ssl configuration for disableing check_hostname
> set the hostname in ssl_options.server_hostname when SSL is used
> issue-73 Fixed a bug where username and password from the credentials file were ignored.
> issue-73 Fixed a bug where username and password from the credentials file were ignored.
> issue-73
> github actions: update `cibuildwheel==v2.16.5`
Fixes: scylladb/scylladb#18590Closesscylladb/scylladb#18591
Generate documentation link, such that they point to the documentation
page, which is appropriate to the current product (open-source or
enterprise) and version. The documentation links are generated by a new
function and the documentation links are injected into the description
of nodetool command via fmt::format().
in {fmt} version 10.0.0, it has a regression, which dropped the
formatter for `char *`, even it does format `const char*`, as the
latter is convertible to
`fmt::stirng_view`.
and this issue was addressed in 10.1.0 using 616a4937, which adds
the formatter for `Char *` back, where `Char` is a template parameter.
but we do need to print `vector<char*>`, so, to address the build
failure with {fmt} version 10.0.0, which is shipped along with
fedora 39. let's backport this formatter.
Fixes#18503
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18505
in Lua 5.3, lua_resume() only accepts three parameters, while in Lua 5.4,
this function accepts four parameters. so in order to be compatible with
Lua 5.3, we should not pass the 4th parameter to this function.
a macro is defined to conditionally pass this parameter based on the
Lua's version.
see https://www.lua.org/manual/5.3/manual.html#lua_resume
Refs 5b5b8b3264
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18450
On be3776ec2a, we changed outdir to
absolute path.
This causes "unknown target" error when we build Scylla using the relative
path something like "ninja build/dev/scylla", since the target name
become absolte path.
Revert the change to able to build with the relative path.
Also, change optimized_clang.sh to use relative path for --builddir,
since we reference "../../$builddir/SCYLLA-*-FILE" when we build
submodule, it won't work with absolute path.
Fixes#18321Closesscylladb/scylladb#18338
We move consistent cluster management out of experimental and
make it the default for new clusters in 6.0. In code, we make the
`consistent-topology-changes` flag unused and assumed to be true.
In 6.0, the topology upgrade procedure will be manual and
voluntary, so some clusters will still be using the gossip-based
topology even though they support the raft-based topology.
Therefore, we need to continue testing the gossip-based topology.
This is possible by using the `force-gossip-topology-changes` flag
introduced in scylladb/scylladb#18284.
Ref scylladb/scylladb#17802Closesscylladb/scylladb#18285
* github.com:scylladb/scylladb:
docs: raft.rst: update after removing consistent-topology-changes
treewide: fix indentation after the previous patch
db: config: make consistent-topology-changes unused
test: lib: single_node_cql_env: restart a node in noninitial run_in_thread calls
test: test_read_required_hosts: run with force-gossip-topology-changes
storage_service: join_cluster: replace force_gossip_based_join with force-gossip-topology-changes
storage_service: join_token_ring: fix finish_setup_after_join calls
* tools/java b810e8b00e...4ee15fd9ea (1):
> install.sh: don't install nodetool into /usr/bin
Add a bin/nodetool and install it to bin/ in install.sh. This script
simply forwards to scylla nodetool and it is the replacement for the
Java nodetool, which is dropped from the java-tools's install.sh, in the
submodule update also included in this patch.
With this change, we now hardwire the usage of the native nodetool, as
*the* nodetool, with the intermediary nodetool wrapper script removed
from the picture.
Bash completion was copied from the java tools repository and it is now
installed by the scylla package, together with nodetool.
The Java nodetool is still available as as a fall-back, in case the
native nodetool has problems, at the path of
/opt/scylladb/share/cassandra/bin/nodetool.
Testing
I tested upgrades on a DEB and RPM distro: Ubuntu and Fedora.
First I installed scylla-5.4, then I installed the packages for this PR.
On Ubuntu, I had to use dpkg -i --auto-deconfigure, otherwise, dpkg would
refuse to install the new packages because they break the old ones. No
extra flags were required on Fedora.
In both cases, /usr/bin/nodetool was changed from a thunk calling the
Java nodetool (from 5.4) to the native launcher script from this PR.
/opt/scylladb/share/cassandra/bin/nodetool remained in place and still
works after the upgrade.
I also verified that --nonroot installs also work. Nodetool works both
when called with an absolute path, or when ~/scylladb/bin is added to
$PATH.
Fixes: #18226Fixes: #17412Closesscylladb/scylladb#18255
[avi: reset submodule to actual hash we ended up with]