When we have an filter such as "WHERE m[2] = 3" (where m is a map
column), if a row had a null value for m, our expression evaluation
code incorrectly dereferences an unset optional, and continued
processing the result of this dereference which resulted in undefined
behavior - sometimes we were lucky enough to get "marshaling error"
but other times Scylla crashed.
The fix is trivial - just check before dereferencing the optional value
of the map. We return null in that case, which means that we consider
the result of null[2] to be null. I think this is a reasonable approach
and fits our overall approach of making null dominate expressions (e.g.,
the value of "null < 2" is also null).
The test test_filtering.py::test_filtering_null_map_with_subscript,
which used to frequently fail with marshaling errors or crashes, now
passes every time so its "xfail" mark is removed.
Fixes#10417
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The test test_null.py::test_map_subscript_null turned out to reproduce
multiple bugs related to using map subscripts in filtering expressions.
One was issue #10361 (m[null] resulted in a bizarre error) or #10399
(m[null] resulted in a crash), and a different issue was #10401 (m[2]
resulted in a bizarre error or a crash if m itself was null). Moreover,
the same test uncovered different bugs depending how it was run - alone
or with other tests - because it was using a shared table.
In this patch we introduce two separate tests in test_filtering.py
which are designed to reproduce these separate bugs instead of mixing
them into one test. The new tests also cover a few more corners which
the previous test (which focused on nulls) missed - such as UNSET_VALUE.
The two new tests (and the old test_map_subscript_null) pass on
Cassandra so still assume that the Cassandra behavior - that m[null]
should be an error - is the correct behavior. We may want to change
the desired behavior (e.g., to decide that m[null] be null, not an
error), and change the tests accordingly later - but for now the
tests follow Cassandra's behavior exactly, and pass on Cassandra
and fail on Scylla (so are marked xfail).
The bugs reproduced by these tests involve randomness or reading
uninitialized memory, so these tests sometimes pass, sometimes fail,
and sometimes even crash (as reported in #10399 and #10401). So to
reproduce these bugs run the tests multiple times. For example:
test/cql-pytest/run --count 100 --runxfail
test_filtering.py::test_filtering_null_map_with_subscript
Refs #10361
Refs #10399
Refs #10401
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
"
cache_flat_mutation_reader gets a native v2 implementation. The
underlying mutation representation is not changed: range deletions are
still stored as v1 range_tombstones in mutation_partition. These are
converted to range tombstone changes during reading.
This allows for separating the change of a native v2 reader
implementation and a native v2 in-memory storage format, enabling the
two to be done at separate times and incrementally.
This means there is still conversion ingoing when reading from cache and
when populating, but when reading from underlying, the stream can now be
passed through as-is without conversions.
Also, any future v2 related changes to the in-memory storage will now be
limited to the cache reader implementation itself.
In the process, the non-forwarding reader, whose only user is the cache,
is also converted to v2.
"
Performance results reported by Botond:
"
build/release/test/perf/perf_simple_query -c1 -m2G --flush --
duration=20
BEFORE
median 130421.76 tps ( 71.1 allocs/op, 12.1 tasks/op, 47462
insns/op)
median absolute deviation: 319.64
maximum: 131028.33
minimum: 127502.55
AFTER
median 133297.41 tps ( 64.1 allocs/op, 12.2 tasks/op, 45406
insns/op)
median absolute deviation: 2964.24
maximum: 137581.56
minimum: 123739.4
Getting rid of those upgrade/downgrade was good for allocs and ops.
Curiously there is a 0.1 rise in number of tasks though.
"
* 'row-cache-readers-v2/v1' of https://github.com/denesb/scylla:
row_cache: update reader implementations to v2
range_tombstone_change_generator: flush(): add end_of_range
readers/nonforwardable: convert to v2
read_context: fix indentation
read_context: coroutinize move_to_next_partition()
row_cache: cache_entry::read(): return v2 reader
row_cache: return v2 readers from make_reader*()
readers/delegating_v2: s/make_delegating_reader_v2/make_delegating_reader/
cache_flat_mutation_reader gets a native v2 implementation. The
underlying mutation representation is not changed: range deletions are
still stored as v1 range_tombstones in mutation_partition. These are
converted to range tombstone changes during reading.
This allows for separating the change of a native v2 reader
implementation and a native v2 in-memory storage format, enabling the
two to be done at separate times and incrementally.
Allowing to flush all range tombstone changes, including those that have
a position equal to the passed in upper bound, when finishing off a
read-range, e.g. a clustering range from a slice.
It has a single user, the row cache, which for now has to
upgrade/downgrade around the nonforwardable reader, but this will go
away in the next patches when the row cache readers are converted to v2
proper.
The patchset embeds the mutation_fragment upgrading logic from v1 to v2 into the mutation_fragment_queue. This way the mutation fragments coming to the mutation_fragment_queue can be v1, but the underlying query_reader receives mutation_fragment_v2, eliminating the last usage of query_reader (v1). The last commit removes query_reader, query_reader_handle and associated factory functions.
tests: unit(dev), dtest(incremental_repair_test, read_repair_test, repair_additional_test, repair_test)
Closes#10371
* github.com:scylladb/scylla:
readers: Remove queue_reader v1 and associated code.
repair: Make mutation_fragment_queue internally upgrade fragments to v2
repair: Make mutation_fragment_queue::impl a seastar::shared_ptr
It makes mutation_fragment_queue copyable and makes the pointer to
pending mutation fragments in next commit stable. This allows moving the
mutation_fragment_queue without breaking the underlying
upgrading_consumer.
And adjust callers. The factory functions just sprinkle upgrade_to_v2()
on returned readers for now.
One test in row_cache_test.cc had to be disabled, because the upgrade to
v2 wrapper we now have over cache readers doesn't allow it to directly
control the reader's buffer size and so the test fails. There is a FIXME
left in the test code and the test will be re-enabled once a native v2
reader implementation allows us to get rid of the upgrade wrapper.
It turns out that Cassandra does not allow IN restrictions together with
filtering, except, curiously, when the restriction is on a clustering key.
There is no real reason for this limitation - the error message even says
it is not *yet* supported.
Scylla, on the other hand, does support this case. Of course it's not
enough that we support it - we need to support it correctly... But we don't
have a full regression test that this support is correct - in
filtering_test.cc we test it with clustering and regular columns - but not
partition key columns.
So this patch adds a simple cql-pytest test that this sort of filtering
works in Scylla correctly for partition, clustering and regular columns
(and also confirms that these cases don't work, yet, on Cassandra).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220420075553.1008062-1-nyh@scylladb.com>
The method in question is called in the branch where the replace address
is checked to be present, no need in extra explicit check.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Both the if is_replacing()/else branches call gossiper wating method as
their first steps. Can be done once.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are three modes in there -- bootstrap, b.s. with RBNO and b.s. for
replacing. All three are checked two times in a row, but can be done
once.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This call just places a boolean flag on all. It won't hurt if it lasts
while the node is performing pre-bootstrap checks, but it allows making
the whole method less branchy.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The do_with() means we have an unconditional allocation, so we can
justify the coroutine's allocation (replacing it). Meanwhile,
coroutine::parallel_for_each() reduces an allocation if mutate_locally()
blocks.
Closes#10387
gcc 12 checks some things that clang doesn't, resulting in compile errors.
This series fixes some of theses issues, but still builds (and tests) with clang.
Unfortunately, we still don't have a clean gcc build due to an outstanding bug [1].
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98056Closes#10386
* github.com:scylladb/scylla:
build: disable warnings that cause false-positive errors with gcc 12
utils: result_loop: remove invalid and incorrect constraint
service: forward_service: avoid using deprecated std::bind1st and std::not1
repair: explicityl ignore tombstone gc update response
treewide: abort() after switch in formatters
db: view: explicitly ignore unused result
compaction: leveled_compaction_strategy: avoid compares between signed and unsigned
compaction_manager: compaction_reenabler: disambiguate compaction_state
api: avoid function specialization in req_param
alternator: ttl: avoid specializing class templates in non-namespace scope
alternator: executor: fix signed/unsigned comparison in is_big()
Currently, rpc handlers are all lambdas inside
storage_proxy::init_messaging_service(). This means any stack trace
refers to storage_proxy::init_messaging_service::lambda#n instead of
a meaningful function name, and it makes init_messaging_service()
very intimidating.
Fix that by moving all such lambdas to regular member functions. The
first two patches remove unnecessary captures to make it easy; the
final patch coverts the lambdas to member functions.
Closes#10388
* github.com:scylladb/scylla:
storage_proxy: convert rpc handlers from lambdas to member functions
storage_proxy: don't capture messaging_service in server callbacks
storage_proxy: don't capture migration_manager in server callbacks
We currently does not able to get any error message from subprocess when we specified capture_output=True on subprocess.run().
This is because CalledProcessError does not print stdout/stderr when it raised, and we don't catch the exception, we just let python to cause Traceback.
Result of that, we only able to know exit status and failed command but
not able to get stdout/stderr.
This is problematic especially working on perftune.py bug, since the
script should caused Traceback but we never able to see it.
To resolve this, add wrapper function "out()" for capture output, and
print stdout/stderr with error message inside the function.
Fixes#10390Closes#10391
Checking a concept in a requires-expression requires an additional
requires keyword. Moreover, the constraint is incorrect (at least
all callers pass a T, not a result<T>), so remove it.
Found by gcc 12.
It is typical in switch statements to select on an enum type and
rely on the compliler to complain if an enum value was missed. But
gcc isn't satisified since the enum could have a value outside the
declared list. Call abort() in this impossible situation to pacify
it.
Function specializations are not allowed (you're supposed to use
overloads), but clang appears to allow them.
Here, we can't use an overload since the type doesn't appear in the
parameter list. Use a constraint instead.
The C++ standard disallows class template specialization in non-namespace
scopes. Clang apparently allows it as an extension.
Fix by not using a template - there are just two specializations and
no generic implementation. Use regular classes and std::conditional_t
to choose between the two.
Signed/unsigned comparisons are subject to C promotion rules. In is_big()
in this case the comparison is safe, but gcc warns. Use a cast to silence
the warning.
The sign/unsigned mix and int/size_t size differences still look bad, it
would be good to revisit this code, but that is left for another patch.
Series 59d56a3fd7 introduced
an accidental backward incompatible regression by adding
a column to system_schema.keyspaces and then not even using
it for anything. It's a leftover from the original hackathon
implementation and should never reach master in the first place.
Fortunately, the series isn't part of any stable release yet.
Fixes#10376
Tests: manual, verifying that the system_schema.keyspaces table
no longer contains the extraneous column.
Closes#10377
Currently, rpc handlers are all lambdas inside
storage_proxy::init_messaging_service(). This means any stack trace
refers to storage_proxy::init_messaging_service::lambda#n instead of
a meaningful function name, and it makes init_messaging_service()
very intimidating.
Fix that by moving all such lambdas to regular member functions.
This is easy now that they don't capture anything except `this`,
which we provide during registration via std::bind_front().
A few #includes and forward declarations had to be added to
storage_proxy.hh. This is unfortunate, but can only be solved
by splitting storage_proxy into a client part and a server part.
We'd like to make the server callbacks member functions, rather
than lambdas, so we need to eliminate their captures. This patch
eliminats 'ms' by referringn to the already existing member '_messaging'
instead.
We'd like to make the server callbacks member functions, rather
than lambdas, so we need to eliminate their captures. This patch
eliminates 'mm' by making it a member variable and capturing 'this'
instead. In one case 'mm' was used by a handle_write() intermediate
lambda so we have to make that non-static and capture it too.
uninit_messaging_service() clears the member variable to preserve
the same lifetime 'mm' had before, in case that's important.
* seastar acf7e3523b...5e86362704 (10):
> Merge "Respect taskset-configured cpumask" from Pavel E
Ref #9505.
> rpc_tester: Run CPU hogs on server side too
> std-coroutine: include <coroutine> for LLVM-15
> Revert "Merge "tests: perf: measure coroutines performance" from Benny"
> test: perf_tests: remove [[gnu::always_inline]] attribute from coroutine perf tests
> Merge "tests: perf: measure coroutines performance" from Benny
> Merge "Extend RPC tester" from Pavel E
> rpc: Mark connection trivial getters const noexcept
> seastar-addr2line: Allow use of llvm-addr2line as the command
> file: append_challenged_posix_file: Serialize allocate() to not block concurrent reads or writes
The code would advertise the USES_RAFT feature when the SUPPORTS_RAFT
feature was enabled through a listener registered on the SUPPORTS_RAFT
feature.
This would cause a deadlock:
1. `gossiper::add_local_application_state(SUPPORTED_FEATURES, ...)`
locks the gossiper (it's called for the first time from sstables
format selector).
2. The function calls `on_change` listeners.
3. One of the listeners is the one for SUPPORTS_RAFT.
4. The listener calls
`gossiper::add_local_application_state(SUPPORTED_FEATURES, ...)`.
5. This tries to lock the gossiper.
In turn, depending on timing, this could hang the startup procedure,
which calls `add_local_application_state` multiple times at various
points, trying to take the lock inside gossiper.
This prevents us from testing raft / group 0, new schema change
procedures that use group 0, etc.
For now, simply remove the code that advertises the USES_RAFT feature.
Right now the feature has no other effect on the system than just
becoming enabled. In fact, it's possible that we don't need this second
feature at all (SUPPORTS_RAFT may be enough), but that's
work-in-progress. If needed, it will be easy to bring the enabling code
back (in a fixed form that doesn't cause a deadlock). We don't remove
the feature definitions yet just in case.
Refs: #10355
We start the memory threshold guard (that enables large memory allocation
warnings post-boot) but don't wait for it. I can't imagine it can hurt,
but it does carry a FIXME label.
Closes#10375