the temporary directory holding the log file collecting the scylla
subprocess's output is specified by the test itself, and it is
`test_tempdir`. but unfortunately, cql-pytest/run.py is not aware
of this. so `cleanup_all()` is not able to print out the logging
messages at exit. as, please note, cql-pytest/run.py always
collect "log" file under the directory created using `pid_to_dir()`
where pid is the spawned subprocesses. but `object_store/run` uses
the main process's pid for its reusable tempdir.
so, with this change, we also register a cleanup func to printout
the logging message when the test exits.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13647
when reading this source code, there are a handful issues reported by my flycheck plugin. none of them is critical, but better off fixing them.
Closes#13612
* github.com:scylladb/scylladb:
test: object_store: specify timeout
test: object_store: s/exit/sys.exit/
test: object_store: do not declare a global variable for read
test: object_store: remove unused imports
`os.P_NOWAIT` is supposed to be used in spawn calls, while `os.WNOHANG`
is used as in the options parameter passed to wait calls. fortunately,
`P_NOWAIT` is defined as "1" in CPython, and `os.WNOHANG` is defined
as "1" in linux kernel. that's why the existing implementation works.
but we should not rely on this coincidence. so, in this change,
`os.P_NOWAIT` is replaced with `os.WNOHANG` for correctness and for
better readability.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13646
This series adds handling of null std::unique_ptr to utils::clear_gently
and handling of std::optional and seastar::optimized_optional (both engaged and disengaged cases).
Also, unit tests were added to tests the above cases.
Fixes#13636Closes#13638
* github.com:scylladb/scylladb:
utils: clear_gently: add variants for optional values
utils: clear_gently: do not clear null unique_ptr
All users of global proxy are gone (*), proxy can be made fully main/cql_test_env local.
(*) one test case still needs it, but can get it via cql_test_env
Closes#13616
* github.com:scylladb/scylladb:
code: Remove global proxy
schema_change_test: Use proxy from cql_test_env
test: Carry proxy reference on cql_test_env
this is the first step to the uuid-based generation identifier. the goal is to encapsulate the generation related logic in generator, so its consumers do not have to understand the difference between the int64_t based generation and UUID v1 based generation.
this commit should not change the behavior of existing scylla. it just allows us to derive from `generation_generator` so we can have another generator which generates UUID based generation identifier.
Closes#13073
* github.com:scylladb/scylladb:
replica, test: create generation id using generator
sstables: add generation_generator
test: sstables: use generate_n for generating ids for testing
This series cleans up the generation and value types used in gms / gossiper.
Currently we use a blend of int, int32_t, and int64_t around messaging.
This change defines gms::generation_type and gms::version_type as int32_t
and add check in non-release modes that the respective int64 value passed over messaging do not overflow 32 bits.
Closes#12966
* github.com:scylladb/scylladb:
gossiper: version_generator: add {debug_,}validate_gossip_generation
gms: gossip_digest: use generation_type and version_type
gms: heart_beat_state: use generation_type and version_type
gms: versioned_value: use version_type
gms: version_generator: define version_type and generation_type strong types
utils: move generation-number to gms
utils: add tagged_integer
gms: versioned_value: make members private
scylla-gdb: add get_gms_versioned_value
gms: versioned_value: delete unused compare_to function
gms: gossip_digest: delete unused compare_to function
Implement clear_gently for std:;optional<T>
and seastar::optimized_optional<T> and respective
unit tests.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Otherwise the null pointer is dereferenced.
Add a unit test reproducing the issue
and testing this fix.
Fixes#13636
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The only reason why it's there (right next to compaction_fwd.hh) is
because the database::table_truncate_state subclass needs the definition
of compaction_manager::compaction_reenabler subclass.
However, the former sub is not used outside of database.cc and can be
defined in .cc. Keeping it outside of the header allows dropping the
compaction_manager.hh from database.hh thus greatly reducing its fanout
over the code (from ~180 indirect inclusions down to ~20).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13622
Derived from utils::tagged_integer, using different tags,
the types are incompatible with each other and require explicit
typecasting to- and from- their value type.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
A generic template for defining strongly typed
integer types.
Use it here to replace raft::internal::tagged_uint64.
Will be used for defining gms generation and version
as strong and distinguishable types in following patches.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Introduce a new table `CDC_GENERATIONS_V3` (`system.cdc_generations_v3`).
The table schema is a copy-paste of the `CDC_GENERATIONS_V2` schema. The
difference is that V2 lives in `system_distributed_keyspace` and writes to it
are distributed using regular `storage_proxy` replication mechanisms based on
the token ring. The V3 table lives in `system_keyspace` and any mutations
written to it will go through group 0.
Extend the `TOPOLOGY` schema with new columns:
- `new_cdc_generation_data_uuid` will be stored as part of a bootstrapping
node's `ring_slice`, it stores UUID of a newly introduced CDC
generation which is used as partition key for the `CDC_GENERATIONS_V3`
table to access this new generation's data. It's a regular column,
meaning that every row (corresponding to a node) will have its own.
- `current_cdc_generation_uuid` and `current_cdc_generation_timestamp`
together form the ID of the newest CDC generation in the cluster.
(the uuid is the data key for `CDC_GENERATIONS_V3`, the timestamp is
when the CDC generation starts operating). Those are static columns
since there's a single newest CDC generation.
When topology coordinator handles a request for node to join, calculate a new
CDC generation using the bootstrapping node's tokens, translate it to mutation
format, and insert this mutation to the CDC_GENERATIONS_V3 table through group 0
at the same time we assign tokens to the node in Raft topology. The partition
key for this data is stored in the bootstrapping node's `ring_slice`.
After inserting new CDC generation data , we need to pick a timestamp for this
generation and commit it, telling all nodes in the cluster to start using the
generation for CDC log writes once their clocks cross that timestamp.
We introduce a separate step to the bootstrap saga, before
`write_both_read_old`, called `commit_cdc_generation`. In this step, the
coordinator takes the `new_cdc_generation_data_uuid` stored in a bootstrapping
node's `ring_slice` - which serves as the key to the table where the CDC
generation data is stored - and combines it with a timestamp which it generates
a bit into the future (as in old gossiper-based code, we use 2 * ring_delay, by
default 1 minute). This gives us a CDC generation ID which we commit into the
topology state as the `current_cdc_generation_id` while switching the saga to
the next step, `write_both_read_old`.
Once a new CDC generation is committed to the cluster by the topology
coordinator, we also need to publish it to the user-facing description tables so
CDC applications know which streams to read from.
This uses regular distributed table writes underneath (tables living in the
`system_distributed` keyspace) so it requires `token_metadata` to be nonempty.
We need a hack for the case of bootstrapping the first node in the cluster -
turning the tokens into normal tokens earlier in the procedure in
`token_metadata`, but this is fine for the single-node case since no streaming
is happening.
When a node notices that a new CDC generation was introduced in
`storage_service::topology_state_load`, it updates its internal data structures
that are used when coordinating writes to CDC log tables.
We include the current CDC generation data in topology snapshot transfers.
Some fixes and refactors included.
Closes#13385
* github.com:scylladb/scylladb:
docs: cdc: describe generation changes using group 0 topology coordinator
cdc: generation_service: add a FIXME
cdc: generation_service: add legacy_ prefix for gossiper-based functions
storage_service: include current CDC generation data in topology snapshots
db: system_keyspace: introduce `query_mutations` with range/slice
storage_service: hold group 0 apply mutex when reading topology snapshot
service: raft_group0_client: introduce `hold_read_apply_mutex`
storage_service: use CDC generations introduced by Raft topology
raft topology: publish new CDC generation to the user description tables
raft topology: commit a new CDC generation on node bootstrap
raft topology: create new CDC generation data during node bootstrap
service: topology_state_machine: make topology::find const
db: system_keyspace: small refactor of `load_topology_state`
cdc: generation: extract pure parts of `make_new_generation` outside
db: system_keyspace: add storage for CDC generations managed by group 0
service: topology_state_machine: better error checking for state name (de)serialization
service: raft: plumbing `cdc::generation_service&`
cdc: generation: `get_cdc_generation_mutations`: take timestamp as parameter
cdc: generation: make `topology_description_generator::get_sharding_info` a parameter
sys_dist_ks: make `get_cdc_generation_mutations` public
sys_dist_ks: move find_schema outside `get_cdc_generation_mutations`
sys_dist_ks: move mutation size threshold calculation outside `get_cdc_generation_mutations`
service/raft: group0_state_machine: signal topology state machine in `load_snapshot`
we only need to declare a variable with `global` when we need to
write to it, but if we just want to read it, there is no need to
declare it. because the way how python looks up for a variable
when reading from it enables python to find the global variables
(and apparently the functions!). but when we assign a variable in
python, the interpreter would have to tell in which scope the
variable lives. by default the local scope is used, and a new
variable is added to `locals()`.
but in this case, we just read from it. so no need to add the
`global` statement.
see also https://docs.python.org/3/reference/simple_stmts.html#global
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
reuse generation_generator for generating generation identifiers for
less repeatings. also, add allow update generator to update its
lastest known generation id.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this the standard library offers
`std::lexicographical_compare_threeway()`, and we never uses the
last two addition parameters which are not provided by
`std::lexicographical_compare_threeway()`. there is no need to have
the homebrew version of trichotomic compare function.
in this change,
* all occurrences of `lexicographical_tri_compare()` are replaced
with `std::lexicographical_compare_threeway()`.
* ``lexicographical_tri_compare()` is dropped.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13615
No code needs global proxy anymore. Keep on-stack values in main and
cql_test_env and keep the pointer on debug:: namespace.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There's one place where test case calls for storage proxy and currently
does it via global refernece. Time to switch it to cql_test_env's one.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
All sharded<> services are created by cql_test_env on the stack. The
cql_test_env() is then used to keep references on some of them and to
export them to test cases via its methods. Proxy is missing on that
exportable list, but will be needed, so add one.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Move gms::arrival_window to api/failure_detector which is its only user.
and get rid of the rest, which is not used, now that we use direct_failure_detector instead.
TODO: integare direct_failure_detector with failure_detector api.
Closes#13576
* github.com:scylladb/scylladb:
gms: get rid of unused failure_detector
api: failure_detector: remove false dependency on failure_detector::arrival_window
test: rest_api: add test_failure_detector
Fixes a problem when raft-based topology is enabled, which loads
topology from storage. It starts by clearing topology and then adding
nodes one by one. Before this patch, this violates internal invariant
of topology object which puts the local node as the first node. This
would manifest by triggering an assert in topology::pop_node() which
throws if popping the node at index 0 in order to keep the information
about local node around. This is normally prevented by a check in
topology::remove_node() which avoid calling pop_node() if removing the
local node. But since there is no node which is marked as local, this
check allows the first node to be popped.
To fix the problem I lift the invariant that local node is always in
_nodes. We still have information about local node in config. Instead
of keeping it in _nodes, we recognize it as part of indexing. We also
allow removing the local node like a regular node.
The path which reloads topology works correctly after this, the local
node will be recognized when (if) it is added to the topology.
Fixes#13495Closes#13498
* github.com:scylladb/scylladb:
locator: topology: Fix move assignment
locator: topology: Add printer
tests: topology: Test that topology clearing preserves information about local node
locator: topology: Recognize local node as part of indexing it
locator: topology: Fix get_location(ep) for local node
locator: topology: Fix typo
locator: topology: Preserve config when cloning
so we don't need to keep a `prev_gen` around, this also prepares for
the coming change to use generation generator.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `component_type` without the help of `operator<<`.
the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.
also, please note, to enable fmtlib to format `std::set<component_type>`
in `test/boost/sstable_3_x_test.cc` , we need to include
`<fmt/ranges.h>` in that source file.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13598
this change replaces all occurrences of `boost::lexical_cast<std::string>`
in the source tree with `fmt::to_string()`. for couple reasons:
* `boost::lexical_cast<std::string>` is longer than `fmt::to_string()`,
so the latter is easier to parse and read.
* `boost::lexical_cast<std::string>` creates a stringstream under the
hood, so it can use the `operator<<` to stringify the given object.
but stringstream is known to be less performant than fmtlib.
* we are migrating to fmtlib based formatting, see #13245. so
using `fmt::to_string()` helps us to remove yet another dependency
on `operator<<`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13611
The legacy failure_detector is now unused and can be removed.
TODO: integare direct_failure_detector with failure_detector api.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This patch contains tests reproducing issue #13601 and the corresponding
Cassandra issue CASSANDRA-18470. These issues are about what the AVG
aggregation does for arbitrary-precision "decimal" numbers - the tests
we add here show examples where the current behavior doesn't make sense:
The problem is that "decimal" has arbitrary precision - so, should an
average of 1/3 be returned as 0.3 or 0.33333333333333333? This is not
specified, so Scylla (and Cassandra) decided to pick the result precision
based on the input precision. In particular, the average of 1 and 2
is returned as 2 (zero digits after the decimal point, like in the
inputs) instead of the expected 1.5. Arguably this isn't useful behavior.
The test adds a second test which fails on Cassandra, but does pass
on Scylla: Cassandra returns as the average of 1, 2, 2, 3 the integer 1
whereas the correct average is 2 (and Scylla returns it correctly).
The reason why this bug is even worse on Cassandra is that Scylla's AVG
only loses precision when dividing the sum and count, but Cassandra
tries to maintain only the average, and loses precision at every step.
Refs #13601
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#13603
Fixes a problem when raft-based topology is enabled, which loads
topology from storage. It starts by clearing topology and then adding
nodes one by one. Before this patch, this violates internal invariant
of topology object which puts the local node as the first node. This
would manifest by triggering an assert in topology::pop_node() which
throws if popping the node at index 0 in order to keep the information
about local node around. This is normally prevented by a check in
topology::remove_node() which avoid calling pop_node() if removing the
local node. But since there is no node which is marked as local, this
check allows the first node to be popped.
To fix the problem I lift the invariant that local node is always in
_nodes. We still have information about local node in config. Instead
of keeping it in _nodes, we recognize it as part of indexing. We also
allow removing the local node like a regular node.
The path which reloads topology works correctly after this, the local
node will be recognized when (if) it is added to the topology.
Fixes#13495
The tests in question are using MINIO_SERVER_ADDRESS environment variable to export minio server address from pylib to test cases. Also they use hard-coded public bucket name. Both plays badly with AWS S3, the former due to MINIO_... in its name and the latter because public bucket name can be any.
So this PR puts address and public bucket name into S3_..._FOR_TEST environment variables and fixes output stream closure on failure while at it.
Detached from #13493Closes#13546
* github.com:scylladb/scylladb:
s3/test: Rename MINIO_SERVER_ADDRESS environment variable
s3/test: Keep public bucket name in environment
s3/test: Fix upload stream closure
test/lib: Add getenv_safe() helper
This `with` context is supposed to disable, then re-enable
autocompaction for the given keyspaces, but it used the wrong API for
it, it used the column_family/autocompaction API, which operates on
column families, not keyspaces. This oversight led to a silent failure
because the code didn't check the result of the request.
Both are fixed in this patch:
* switch to use `storage_service/auto_compaction/{keyspace}` endpoint
* check the result of the API calls and report errors as exceptions
Fixes: #13553Closes#13568
server to see other servers after start/restart
When starting/restarting a server, provide a way to wait for the server
to see at least n other servers.
Also leave the implementation methods available for manual use and
update previous tests, one to wait for a specific server to be seen, and
one to wait for a specific server to not be seen (down).
Fixes#13147
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
Closes#13438
The logger instancewas removed in a previous commit but it is used in
the wrapper helper. Add it back.
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
This short PR fixes a bug in SUM() aggregation where if the data contains +Inf and -Inf the returned sum should be NaN but we returned an error instead. This is a recent regression uncovered by a dtest (see issue #13551), but in the first patch we add additional tests in the cql-pytest framework which reproduce this bug and explore various other areas (wrongly) implicated by the failing dtest.
Fixes#13551Closes#13564
* github.com:scylladb/scylladb:
cql3: allow SUM() aggregation to result in a NaN
test/cql-pytest: add tests for data casts and inf in sums
Using it the pylib minio code export minio address for tests. This
creates unneeded WTFs when running the test over AWS S3, so it's better
to rename to variable not to mention MINIO at all.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Local test.py runs minio with the public 'testbucket' bucket and all
test cases know that. This series adds an ability to run tests over real
S3 so the bucket name should be configurable.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
If multipart upload fails for some reason the output stream remains not
closed and the respective assertion masquerades the original failure.
Fix that by closing the stream in all cases.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The helper is like ::getenv() but checks if the variable exists and
throws descriptive exception. So instead of
fatal error: in "...": std::logic_error: basic_string: construction from null is not valid
one could get something like
fatal error: in "...": std::logic_error: Environment variable ... not set
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When floating-point data contains +Inf and -Inf, the sum is NaN.
Our SUM() aggregation calculated this sum correctly, but then instead
of returning it, complained that the sum overflowed by narrowing.
This was a false positive: The sum() finalizer wanted to test that no
precision was lost when casting the accumulator to the result type,
so checked that the result before and after the cast are the same.
But specifically for NaN, it is never equal to anything - not even
to itself. This check is wrong for floating point, but moreover -
isn't even necessary when the two types (accumulator type and result
type) are identical so in this patch we skip it in this case.
Note that in the current code, a different accumulator and result type
is only used in the case of integer types; When accumulating floating
point sums, the same type is used, so the broken check will be avoided.
The test for this issue starts to pass with this patch, so the xfail
tag is removed.
Fixes#13551
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This patch adds tests to reproduce issue #13551. The issue, discovered
by a dtest (cql_cast_test.py), claimed that either cast() or sum(cast())
from varint type broke. So we add two tests in cql-pytest:
1. A new test file, test_cast_data.py, for testing data casts (a
CAST (...) as ... in a SELECT), starting with testing casts from
varint to other types.
The test uncovers a lot of interesting cases (it is heavily
commented to explain these cases) but nothing there is wrong
and all tests pass on Scylla.
2. An xfailing test for sum() aggregate of +Inf and -Inf. It turns out
that this caused #13551. In Cassandra and older Scylla, the sum
returned a NaN. In Scylla today, it generates a misleading
error message.
As usual, the tests were run on both Cassandra (4.1.1) and Scylla.
Refs #13551.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>