this mirrors what we have in `configure.py`, to build the CqlParser with `-O1`
and disable `-fsanitize-address-use-after-scope` when compiling CqlParser.cc
in order to prevent the compiler from emitting code which uses large amount of stack
space at the runtime.
Closesscylladb/scylladb#15819
* github.com:scylladb/scylladb:
build: cmake: avoid using large amount stack of when compiling parser
build: cmake: s/COMPILE_FLAGS/COMPILE_OPTIONS/
There are some schema modifications performed automatically (during bootstrap, upgrade etc.) by Scylla that are announced by multiple calls to `migration_manager::announce` even though they are logically one change. Precisely, they appear in:
- `system_distributed_keyspace::start`,
- `redis:create_keyspace_if_not_exists_impl`,
- `table_helper::setup_keyspace` (for the `system_traces` keyspace).
All these places contain a FIXME telling us to `announce` only once. There are a few reasons for this:
- calling `migration_manager::announce` with Raft is quite expensive -- taking a `read_barrier` is necessary, and that requires contacting a leader, which then must contact a quorum,
- we must implement a retrying mechanism for every automatic `announce` if `group0_concurrent_modification` occurs to enable support for concurrent bootstrap in Raft-based topology. Doing it before the FIXMEs mentioned above would be harder, and fixing the FIXMEs later would also be harder.
This PR fixes the first two FIXMEs and improves the situation with the last one by reducing the number of the `announce` calls to two. Unfortunately, reducing this number to one requires a big refactor. We can do it as a follow-up to a new, more specific issue. Also, we leave a new FIXME.
Fixing the first two FIXMEs required enabling the announcement of a keyspace together with its tables. Until now, the code responsible for preparing mutations for a new table could assume the existence of the keyspace. This assumption wasn't necessary, but removing it required some refactoring.
Fixes#15437Closesscylladb/scylladb#15594
* github.com:scylladb/scylladb:
table_helper: announce twice in setup_keyspace
table_helper: refactor setup_table
redis: create_keyspace_if_not_exists_impl: fix indentation
redis: announce once in create_keyspace_if_not_exists_impl
db: system_distributed_keyspace: fix indentation
db: system_distributed_keyspace: announce once in start
tablet_allocator: update on_before_create_column_family
migration_listener: add parameter to on_before_create_column_family
alternator: executor: use new prepare_new_column_family_announcement
alternator: executor: introduce create_keyspace_metadata
migration_manager: add new prepare_new_column_family_announcement
Topology coordinator should handle failures internally as long as it
remains to be the coordinator. The raft state monitor is not in better
position to handle any errors thrown by it, all it can do it to restart
the coordinator. The series makes topology_coordinator::run handle all
the errors internally and mark the function as noexcept to not leak
error handling complexity into the raft state monitor.
* 'gleb/15728-fix' of github.com:scylladb/scylla-dev:
storage_service: raft topology: mark topology_coordinator::run function as noexcept
storage_service: raft topology: do not throw error from fence_previous_coordinator()
it is printed when pytest passes it down as a fixture as part of
the logging message. it would help with debugging a object_store test.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15817
The function handled all exceptions internally. By making it noexcept we
make sure that the caller (raft_state_monitor_fiber) does not need
handle any exceptions from the topology coordinator fiber.
Throwing error kills the topology coordinator monitor fiber. Instead we
retry the operation until it succeeds or the node looses its leadership.
This is fine before for the operation to succeed quorum is needed and if
the quorum is not available the node should relinquish its leadership.
Fixes#15728
this mirrors what we have in `configure.py`, to build the CqlParser with -O1
and disable sanitize-address-use-after-scope when compiling CqlParser.cc
in order to prevent the compiler from emitting code which uses large amount of stack
at the runtime.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we create a new UUID for a new sstable managed by the s3_storage, and we use the string representation of UUID defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for naming the objects stored on s3_storage. but this representation is not what we are using for storing sstables on local filesystem when the option of "uuid_sstable_identifiers_enabled" is enabled. instead, we are using a base36-based representation which is shorter.
to be consistent with the naming of the sstables created for local filesystem, and more importantly, to simplify the interaction between the local copy of sstables and those stored on object storage, we should use the same string representation of the sstable identifier.
so, in this change:
1. instead of creating a new UUID, just reuse the generation of the sstable for the object's key.
2. do not store the uuid in the sstable_registry system table. As we already have the generation of the sstable for the same purpose.
3. switch the sstable identifier representation from the one defined by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the base36-based one (implemented by fmt::formatter<sstables::generation_type>)
Fixes#14175
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#14406
* github.com:scylladb/scylladb:
sstable: remove _remote_prefix from s3_storage
sstable: switch to uuid identifier for naming S3 sstable objects
this series extracts the the env variables related functions out and remove unused `import`s for better readability.
Closesscylladb/scylladb#15796
* github.com:scylladb/scylladb:
test/pylib: remove duplicated imports
test/pylib: extract the env variable printing into MinIoServer
test/pylib: extract _set_environ() out
default_compaction_progress_monitor returns a reference to a static
object. So, it should be read-only, but its users need to modify it.
Delete default_compaction_progress_monitor and use one's own
compaction_progress_monitor instance where it's needed.
Closesscylladb/scylladb#15800
This commit enables publishing documentation
from branch-5.4. The docs will be published
as UNSTABLE (the warning about version 5.4
being unstable will be displayed).
Closesscylladb/scylladb#15762
The log-structured allocator maintains memory reserves to so that
operations using log-strucutured allocator memory can have some
working memory and can allocate. The reserves start small and are
increased if allocation failures are encountered. Before starting
an operation, the allocator first frees memory to satisfy the reserves.
One problem is that if the reserves are set to a high value and
we encounter a stall, then, first, we have no idea what value
the reserves are set to, and second, we have no idea what operation
caused the reserves to be increased.
We fix this problem by promoting the log reports of reserve increases
from DEBUG level to INFO level and by attaching a stack trace to
those reports. This isn't optimal since the messages are used
for debugging, not for informing the user about anything important
for the operation of the node, but I see no other way to obtain
the information.
Ref #13930.
Closesscylladb/scylladb#15153
Said method is called in an allocating section, which will re-try the enclosed lambda on allocation failure. `read_context()` however moves the permit parameter so on the second and later calls, the permit will be in a moved-from state, triggering a `nullptr` dereference and therefore a segfault.
We already have a unit test (`test_exception_safety_of_reads` in `row_cache_test.cc`) which was supposed to cover this, but:
* It only tests range scans, not single partition reads, which is a separate path.
* Turns out allocation failure tests are again silently broken (no error is injected at all). This is because `test/lib/memtable_snapshot_source.hh` creates a critical alloc section which accidentally covers the entire duration of tests using it.
Fixes: #15578Closesscylladb/scylladb#15614
* github.com:scylladb/scylladb:
test/boost/row_cache_test: test_exception_safety_of_reads: also cover single-partition reads
test/lib/memtable_snapshot_source: disable critical alloc section while waiting
row_cache: make_reader_opt(): make make_context() reentrant
Major compaction semantics is that all data of a table will be compacted
together, so user can expect e.g. a recently introduced tombstone to be
compacted with the data it shadows.
Today, it can happen that all data in maintenance set won't be included
for major, until they're promoted into main set by off-strategy.
So user might be left wondering why major is not having the expected
effect.
To fix this, let's perform off-strategy first, so data in maintenance
set will be made available by major. A similar approach is done for
data in memtable, so flush is performed before major starts.
The only exception will be data in staging, which cannot be compacted
until view building is done with it, to avoid inconsistency in view
replicas.
The serialization in comapaction manager of reshape jobs guarantee
correctness if there's an ongoing off-strategy on behalf of the
table.
Fixes#11915.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#15792
This patch adds a reproducer for a minor compatibility between Scylla's
and Cassandra's handling of a prepared statement when a bind marker with
the same name is used more than once, e.g.,
```
SELECT * FROM tbl WHERE p=:x AND c=:x
```
It turns out that Scylla tells the driver that there is only one bind
marker, :x, whereas Cassandra tells the driver that there are two bind
markers, both named :x. This makes no different if the user passes
a map `{'x': 3}`, but if the user passes a tuple, Scylla accepts only
`(3,)` (assigning both bind markers the same value) and Cassandra
accepts only `(3,3)`.
The test added in this patch demonstrates this incompatibility.
It fails on Scylla, passes on Cassandra, and is marked "xfail".
Refs #15559
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#15564
Exception thrown from row_level_repair::run does not show the root
cause of a failure making it harder to debug.
Add the internal exception contents to runtime_error message.
After the change the log will mention the real cause (last line), e.g.:
repair - repair[92db0739-584b-4097-b6e2-e71a66e40325]: 33 out of 132 ranges failed,
keyspace=system_distributed, tables={cdc_streams_descriptions_v2, cdc_generation_timestamps,
view_build_status, service_levels}, repair_reason=bootstrap, nodes_down_during_repair={}, aborted_by_user=false,
failed_because=seastar::nested_exception: std::runtime_error (Failed to repair for keyspace=system_distributed,
cf=cdc_streams_descriptions_v2, range=(8720988750842579417,+inf))
(while cleaning up after seastar::abort_requested_exception (abort requested))
Closesscylladb/scylladb#15770
This PR improves the way of how handling failures is documented and accessible to the user.
- The Handling Failures section is moved from Raft to Troubleshooting.
- Two new topics about failure are added to Troubleshooting with a link to the Handling Failures page (Failure to Add, Remove, or Replace a Node, Failure to Update the Schema).
- A note is added to the add/remove/replace node procedures to indicate that a quorum is required.
See individual commits for more details.
Fixes https://github.com/scylladb/scylladb/issues/13149Closesscylladb/scylladb#15628
* github.com:scylladb/scylladb:
doc: add a note about Raft
doc: add the quorum requirement to procedures
doc: add more failure info to Troubleshooting
doc: move Handling Failures to Troubleshooting
instead of appending the options to the CMake variables, use the command to do this. simpler this way. and the bonus is that the options are de-duplicated.
Closesscylladb/scylladb#15797
* github.com:scylladb/scylladb:
build: cmake: use add_link_options() when appropriate
build: cmake: use add_compile_options() when appropriate
this series is one of the steps to remove global statements in `configure.py`.
not only the script is more structured this way, this also allows us to quickly identify the part which should/can be reused when migrating to CMake based building system.
Refs #15379Closesscylladb/scylladb#15780
* github.com:scylladb/scylladb:
build: update modeval using a dict
build: pass args.test_repeat and args.test_timeout explicitly
build: pull in jsoncpp using "pkgs"
build: build: extract code fragments into functions
instead of appending to CMAKE_EXE_LINKER_FLAGS*, use
add_link_options() to add more options. as CMAKE_EXE_LINKER_FLAGS*
is a string, and typically set by user, let's use add_link_options()
instead.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
instead of appending to CMAKE_CXX_FLAGS, use add_compile_options()
to add more options. as CMAKE_CXX_FLAGS is a string, and typically
set by user, let's use add_compile_options() instead, the options
added by this command will be added before CMAKE_CXX_FLAGS, and
will have lower priority.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
since we use the sstable.generation() for the remote prefix of
the key of the object for storing the sstable component, there is
no need to set remote_prefix beforehand.
since `s3_storage::ensure_remote_prefix()` and
`system_kesypace::sstables_registry_lookup_entry()` are not used
anymore, they are removed.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we create a new UUID for a new sstable managed
by the s3_storage, and we use the string representation of UUID
defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for
naming the objects stored on s3_storage. but this representation is
not what we are using for storing sstables on local filesystem when
the option of "uuid_sstable_identifiers_enabled" is enabled. instead,
we are using a base36-based representation which is shorter.
to be consistent with the naming of the sstables created for local
filesystem, and more importantly, to simplify the interaction between
the local copy of sstables and those stored on object storage, we should
use the same string representation of the sstable identifier.
so, in this change:
1. instead of creating a new UUID, just reuse the generation of the
sstable for the object's key.
2. do not store the uuid in the sstable_registry system table. As
we already have the generation of the sstable for the same purpose.
3. switch the sstable identifier representation from the one defined
by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the
base36-based one (implemented by
fmt::formatter<sstables::generation_type>)
4. enable the `uuid_sstable_identifers` cluster feature if it is
enabled in the `test_env_config`, so that it the sstable manager
can enable the uuid-based uuid when creating a new uuid for
sstable.
5. throw if the generation of sstable is not UUID-based when
accessing / manipulating an sstable with S3 storage backend. as
the S3 storage backend now relies on this option. as, otherwise
we'd have sstables with key like s3://bucket/number/basename, which
is just unable to serve as a unique id for sstable if the bucket is
shared across multiple tables.
Fixes#14175
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The following new commands are implemented:
* stop
* compactionhistory
All are associated with tests. All tests (both old and new) pass with both the scylla-native and the cassandra nodetool implementation.
Refs: https://github.com/scylladb/scylladb/issues/15588Closesscylladb/scylladb#15649
* github.com:scylladb/scylladb:
tools/scylla-nodetool: implement compactionhistory command
tools/scylla-nodetool: implement stop command
mutation/json: extract generic streaming writer into utils/rjson.hh
test/nodetool: rest_api_mock.py: add support for error responses
This is the continuation of 8c03eeb85d
Registering API handlers for services need to
* get the service to handle requests via argument, not from http context (http context, in turn, is going not to depend on anything)
* unset the handlers on stop so that the service is not used after it's stopped (and before API server is stopped)
This makes task manager handlers work this way
Closesscylladb/scylladb#15764
* github.com:scylladb/scylladb:
api: Unset task_manager test API handlers
api: Unset task_manager API handlers
api: Remove ctx->task_manager dependency
api: Use task_manager& argument in test API handlers
api: Push sharded<task_manager>& down the test API set calls
api: Use task_manager& argument in API handlers
api: Push sharded<task_manager>& down the API set calls
instead of updating `modes` in with global statements, update it in
a function. for better readablity. and to reduce the statements in
global scope.
Refs #15379
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this change adds "jsoncpp" dependency using "pkgs". simpler this
way. it also helps to remove more global statements.
Refs #15379
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this change extract `get_warnings_options()` out. it helps to
remove more global statements.
Refs #15379
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The options parameter is redundant. We always use
`_metadata->strategy_options()` and
`keyspace::create_replication_strategy` already assumes that
`_metadata` is set by using its other fields.
Closesscylladb/scylladb#15776
memtable_snapshot_source starts a background fiber in its constructor,
which compacts LSA memory in a loop. The loop's inside is covered with a
critical alloc section. It also contains a wait on a condition variable
and in its present form the critical section also covers the wait,
effectively turning off allocation failure injection for any test using
the memtable_snapshot_source.
This patch disables the critical alloc section while the loop waits on
the condition variable.
Said lambda currently moves the permit parameter, so on the second and
later calls it will possibly run into use-after-move. This can happen if
the allocating section below fails and is re-tried.
the java related build dependencies are installed by
* tools/java/install-dependencies.sh
* tools/jmx/install-dependencies.sh
respectively. and the parent `install-dependencies.sh` always
invoke these scripts, so there is no need to repeat them in the
parent `install-dependenceies.sh` anymore.
in addition to dedup the build deps, this change also helps to
reduce the size of build dependencies. as by default, `dnf`
install the weak deps, unless `-setopt=install_weak_deps=False`
is passed to it, so this change also helps to reduce the traffic
and foot print of the installed packages for building scylla.
see also 9dddad27bf
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15473
test.py calls `uninstall()` and `stop()` concurrently from exit
artifacts, and `uninstall()` internally calls `stop()`. This leads to
premature releasing of IP addresses from `uninstall()` (returning IPs to
the pool) while the servers using those IPs are still stopping. Then a
server might obtain that IP from the pool and fail to start due to
"Address already in use".
Put a lock around the body of `stop()` to prevent that.
Fixes: scylladb/scylladb#15755Closesscylladb/scylladb#15763
before this change, we print
marshaling error: Value not compatible with type org.apache.cassandra.db.marshal.AsciiType: '...'
but the wording is not quite user friendly, it is a mapping of the
underlying implementation, user would have difficulty understanding
"marshaling" and/or "org.apache.cassandra.db.marshal.AsciiType"
when reading this error message.
so, in this change
1. change the error message to:
Invalid ASCII character in string literal: '...'
which should be more straightforward, and easier to digest.
2. update the test accordingly
please note, the quoted non-ASCII string is preserved instead of
being printed in hex, as otherwise user would not be able to map it
with his/her input.
Refs #14320
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15678