Commit Graph

39859 Commits

Author SHA1 Message Date
Kefu Chai
43fd63e28c clocks-impl: format time_point using fmt
instead of relying on the operator<<() of an opaque type, use fmtlib
to print a timepoint for better support of new fmtlib which dropped
the default-generated formatter for types with operator<<().

Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16116
2023-11-22 17:44:07 +02:00
Nadav Har'El
242a4b23c0 Merge 'tests: Skip unnecessary sleeps in cql_test_env teardown' from Tomasz Grabiec
This PR contains two patches which get rid of unnecessary sleeps on cql_test_env teardown greatly reducing run time of tests.

Reduces run time of `build/dev/test/boost/schema_change_test` from 90s to 6s.

Closes scylladb/scylladb#16111

* github.com:scylladb/scylladb:
  test: cql_test_env: Interrupt all components on cql_test_env teardown
  tests: cql_test_env: Skip gossip shutdown sleep
2023-11-22 17:44:07 +02:00
Anna Stuchlik
3751acce42 doc: fix rollback in the 5.2-to-5.4 upgrade guide
This commit fixes the rollback procedure in
the 5.2-to-5.4 upgrade guide:
- The "Restore system tables" step is removed.
- The "Restore the configuration file" command
  is fixed.
- The "Gracefully shutdown ScyllaDB" command
  is fixed.

In addition, there are the following updates
to be in sync with the tests:

- The "Backup the configuration file" step is
  extended to include a command to backup
  the packages.
- The Rollback procedure is extended to restore
  the backup packages.
- The Reinstallation section is fixed for RHEL.

Also, I've removed the optional step to enable
consistent schema management from the list of
steps - the appropriate section has already
been removed, but it remained in the procedure
description, which was misleading.

Refs https://github.com/scylladb/scylladb/issues/11907

This commit must be backported to branch-5.4

Closes scylladb/scylladb#16114
2023-11-22 17:44:07 +02:00
Takuya ASADA
b97df92d76 scylla_setup: stop aborting on old kernel warning when non-interactive mode
On non-interactive mode setup, RHEL/CentOS7 old kernel check causes
"Setup aborted", this is not what we want.
We should keep warning but proceed setup, so default value of the kernel
check should be True, since it will automatically applied on
non-interactive mode.

Fixes #16045

Closes scylladb/scylladb#16100
2023-11-22 17:44:07 +02:00
Botond Dénes
b1a76ebb93 Merge 'Sanitize storage service init/deinit sequences' from Pavel Emelyanov
Currently storage service starts too early and its initialization is split into several steps. This PR makes storage service start "late enough" and makes its initialization (minimally required before joining cluster) happen in on place.

refs: #2795
refs: #2737

Closes scylladb/scylladb#16103

* github.com:scylladb/scylladb:
  storage_service: Drop (un)init_messaging_service_part() pair
  storage_service: Init/Deinit RPC handlers in constructor/stop
  storage_service: Dont capture container() on RPC handler
  storage_service: Use storage_service::_sys_dist_ks in some places
  storage_service: Add explicit dependency on system dist. keyspace
  storage_service: Rurn query processor pointer into reference
  storage_service: Add explicity query_processor dependency
  main: Start storage service later
2023-11-22 17:44:07 +02:00
Tomasz Grabiec
93ee7b7df9 test: cql_test_env: Interrupt all components on cql_test_env teardown
This should interrupt all sleeps in component teardown.

Before this patch, there was a 1s sleep on gossiper shutdown, which I
don't know where it comes from. After the patch there is no such
sleep.
2023-11-21 12:22:32 +01:00
Tomasz Grabiec
7f3a74efab tests: cql_test_env: Skip gossip shutdown sleep
Removes unnecessary 2s sleep on each cql test env teardown.
2023-11-21 12:22:24 +01:00
Botond Dénes
65e42e4166 Merge 'mutation_query: properly send range tombstones in reverse queries' from Michał Chojnowski
reconcilable_result_builder passes range tombstone changes to _rt_assembler
using table schema, not query schema.
This means that a tombstone with bounds (a; b), where a < b in query schema
but a > b in table schema, will not be emitted from mutation_query.

This is a very serious bug, because it means that such tombstones in reverse
queries are not reconciled with data from other replicas.
If *any* queried replica has a row, but not the range tombstone which deleted
the row, the reconciled result will contain the deleted row.

In particular, range deletes performed while a replica is down will not
later be visible to reverse queries which select this replica, regardless of the
consistency level.

As far as I can see, this doesn't result in any persistent data loss.
Only in that some data might appear resurrected to reverse queries,
until the relevant range tombstone is fully repaired.

This series fixes the bug and adds a minimal reproducer test.

Fixes #10598

Closes scylladb/scylladb#16003

* github.com:scylladb/scylladb:
  mutation_query_test: test that range tombstones are sent in reverse queries
  mutation_query: properly send range tombstones in reverse queries
2023-11-21 09:19:14 +02:00
Kefu Chai
691f7f6edb util: do not use variable length array
vla (variable length array) is an extension in GCC and Clang. and
it is not part of the C++ standard.

so let's avoid using it if possible, for better standard compliant.
it's also more consistent with other places where we calculate the size
of an array of T in the same source file.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16084
2023-11-20 23:02:41 +02:00
Nadav Har'El
0fd10690d4 Merge 'When creating S3-backed keyspace, check the endpoint instantly' from Pavel Emelyanov
Currently CREATE KEYSPACE ... WITH STORAGE = { 'type' = 'S3' ... } will create keyspace even if the backend configuration is "invalid" in the sense that the requested endpoint is not known to scylla via object_storage.yaml config file. The first time after that when this misconfiguration will reveal itself is when flushing a memtable (see #15635), but it's good to know the endpoint is not configured earlier than that.

fixes: #15074

Closes scylladb/scylladb#16038

* github.com:scylladb/scylladb:
  test: Add validation of misconfigured storage creation
  sstables: Throw early if endpoint for keyspace is not configured
  replica: Move storage options validation to sstables manager
  test/cql-pytest/test_keyspaces: Move DESCRIBE case to object store
  sstables: Add has_endpoint_client() helper to manager
2023-11-20 21:12:48 +02:00
Kefu Chai
9a3c7cd768 build: cmake: drop Seastar_OptimizationLevel_*
in this change,

* all `Seastar_OptimizationLevel_*` are dropped.
* mode.Sanitize.cmake:
    s/CMAKE_CXX_FLAGS_COVERAGE/CMAKE_CXX_FLAGS_SANITIZE/
* mode.Dev.cmake:
    s/CMAKE_CXX_FLAGS_RELEASE/CMAKE_CXX_FLAGS_DEV/

Seastar_OptimizationLevel_* variables have nothing to do with
Seastar, and they introduce unnecessary indirection. the function
of `update_cxx_flags()` already requires an option name for this
parameter, so there is no need to have a name for it.

the cached entry of `Seastar_OptimizationLevel_DEBUG` is also
dropped, if we really need to have knobs which can be configured
by user, we should define them in a more formal way. at this
moment, this is not necessary. so drop it along with this
variable.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16059
2023-11-20 19:26:54 +02:00
Botond Dénes
6e9850067b Merge 'Make test-only write_memtable_to_sstable() overloads shorter' from Pavel Emelyanov
There are three of them, one is used by core, another by tests and the third one passes arguments between those two. And the ..._for_tests() helper in test utils. This PR leaves only one for tests out of three.

Closes scylladb/scylladb#16068

* github.com:scylladb/scylladb:
  tests: Shorten the write_memtable_to_sstable_for_test()
  replica: Squash two write_memtable_to_sstable()
  replica: Coroutinize one of write_memtable_to_sstable() overloads
2023-11-20 16:05:06 +02:00
Pavel Emelyanov
9b16c298e9 test: Add validation of misconfigured storage creation
In an attempt to create a non-local keyspace with unknown endpoint,
there should pop up the configuration exception.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 15:25:58 +03:00
Pavel Emelyanov
2bf1e2a294 sstables: Throw early if endpoint for keyspace is not configured
When a keyspace is created it initiaizes the storage for it and
initialization of S3 storage is the good place to check if the endpoint
for the storage is configured at all.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 15:25:58 +03:00
Pavel Emelyanov
f2a99ad30a replica: Move storage options validation to sstables manager
Currently the cql statement .validate() callback is responsible for
checking if the non-local storage options are allowed with the
respective feature. Next patch will need to extend this check to also
validate the details of the provided storage options, but doing it at
cql level doesn't seem correct -- it's "too far" from query processor
down to sstables manager.

Good news is that there's a lower-level validation of the new keyspace,
namely the database::validate_new_keyspace() call. Move the storage
options validation into sstables manager, while at it, reimplement it
as a visitor to facilitate further extentions and plug the new
validation to the aforementioned database::validate_new_keyspace().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 15:24:59 +03:00
Botond Dénes
f53961248d gms,service: add a feature to protect the usage of allow_mutation_read_page_without_live_row
allow_mutation_read_page_without_live_row is a new option in the
partition_slice::option option set. In a mixed clusters, old nodes
possibly don't know this new option, so its usage must be protected by a
cluster feature. This patch does just that.

Fixes: #15795

Closes scylladb/scylladb#15890
2023-11-20 13:03:55 +01:00
Botond Dénes
935065fd8d Update tools/java submodule
* tools/java b776096d...8485bef3 (2):
  > dist: Require jre-11-headless in from rpm
  > dist: remove duplicated java-headless from "Requires"
2023-11-20 13:55:55 +02:00
Pavel Emelyanov
b31b51ae90 test/cql-pytest/test_keyspaces: Move DESCRIBE case to object store
We're going to ban creation of a keyspace with S3 type in case the
requested endpoint is not configured. The problem is that this test case
of cql-pytest needs such keyspace to be created and in order to provide
the object storage configuration we'd need to touch the generic scylla
cluster management which is an overill for generic cql-pytest case.

Simpler solution is to make object_store test suite perform all the
S3-related checks, including the way DESCRIBE for S3-backed ks works.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 14:31:08 +03:00
Pavel Emelyanov
2c31cd7817 sstables: Add has_endpoint_client() helper to manager
It's the get_endpoint_client() peer that only checks the client
presense. To be used by next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 14:31:08 +03:00
Pavel Emelyanov
8ae751a3ff tests: Shorten the write_memtable_to_sstable_for_test()
The wrapper just calls the test-only core write_memtable_to_sstable()
overload, tests can do it on their own.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 14:27:57 +03:00
Pavel Emelyanov
1d7d2dff50 replica: Squash two write_memtable_to_sstable()
There are three of them and one acts purely as arguments passer between
other two.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 14:27:57 +03:00
Pavel Emelyanov
e9826858a9 replica: Coroutinize one of write_memtable_to_sstable() overloads
Simpler to read and patch further this way

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 14:27:57 +03:00
Pavel Emelyanov
f4626f6b8e storage_service: Drop (un)init_messaging_service_part() pair
It's no longer needed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 13:59:08 +03:00
Pavel Emelyanov
c42c13e658 storage_service: Init/Deinit RPC handlers in constructor/stop
All the services that need to register RPC handlers do it in service
constructor or .start() method. Unregistration happens in .stop().
Storage service explicitly (de)initializes its RPC handlers in dedicated
calls, but there's no point in that. The handlers' accessibility is
determined by messaging service start_lister/shutdown, handlers
themselves can be registered any time before it and unregistered any
time after it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 13:57:07 +03:00
Pavel Emelyanov
40cb9dd66f storage_service: Dont capture container() on RPC handler
The handlers are about to be initialized from inside storage_service
constructor. At that time container() is not yet available and its
invalid to capture it on handlers' lambda. Fortunately, there's only one
handler that does it, other handlers capture 'this' and call container()
explicitly. This patch fixes the remaining one to do the same.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 13:55:56 +03:00
Pavel Emelyanov
cc76f03f63 storage_service: Use storage_service::_sys_dist_ks in some places
The main goal here is to drop sys.dist.ks argument from the
init_messaging_service call to make future patching simpler. While doing
it it turned out that the argument was needed to be passed all the way
down to the mark_existing_views_as_built(), so this patch also dropes
this argument from this whole call-trace.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 13:53:55 +03:00
Pavel Emelyanov
4df5af931a storage_service: Add explicit dependency on system dist. keyspace
This effectively reverts bc051387c5 (storage_service: Remove sys_dist_ks
from storage_service dependencies) since now storage service needs the
sys. disk. ks not only cluster join time. Next patch will make more use
of it as well.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 13:52:42 +03:00
Pavel Emelyanov
a7f23930cb storage_service: Rurn query processor pointer into reference
It's non-nullptr all the time after previous patch and can be a
reference instead

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 13:52:04 +03:00
Pavel Emelyanov
e59544674a storage_service: Add explicity query_processor dependency
It's now set via a dedicated call that happens after query processor is
started. Now query processor is started before storage service and the
latter can get the q.p. local reference via constructor.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 13:51:09 +03:00
Pavel Emelyanov
6ee8e7a031 main: Start storage service later
The storage service is top-level service which depends on many other
services. Recently (see d42685d0cb storage_service: Load tablet
metadata on boot and from group0 changes) it also got implicit
dependency on query processor, but it still starts too early for
explicit reference on q.p.

This patch moves storage service start to later times. This is possible
because storage service is not explicitly needed by any other component
start/init in between its old and new start places. Also, cql_test_ent
starts storage service "that late" too.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-20 13:48:30 +03:00
Nadav Har'El
5752dc875b Merge 'Materialize_views: don't construct global_schema_ptr from views schemas that lacks base information' from Eliran Sinvani
This miniset addresses two potential conversions to `global_schema_ptr` of incomplete materialized views schemas.
One of them was completely unnecessary and also is a "chicken and an egg" problem where on the sync schema procedure itself a view schema was converted to `global_schema_ptr` solely for the purposes of logging. This can create a
"hickup" in the materialized views updates if they are comming from a node with a different mv schema.
The reason why sometimes a synced schema can have no base info is because of deactivision and reactivision of the schema inside the `schema_registry` which doesn't restore the base information due to lack of context.
When a schema is synced the problem becomes easy since we can just use the latest base information from the database.

Fixes #14011

Closes scylladb/scylladb#14861

* github.com:scylladb/scylladb:
  migration manager: fix incomplete mv schemas returned from get_schema_for_write
  migration_manager: do not globalize potentially incomplete schema
2023-11-20 11:54:01 +02:00
Pavel Emelyanov
3471f30b58 view_update_generator: Unplug from database later
Patch 967ebacaa4 (view_update_generator: Move abort kicking to
do_abort()) moved unplugging v.u.g from database from .stop() to
.do_abort(). The latter call happens very early on stop -- once scylla
receives SIGINT. However, database may still need v.u.g. plugged to
flush views.

This patch moves unplug to later, namely to .stop() method of v.u.g.
which happens after database is drained and should no longer continue
view updates.

fixes: #16001

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#16091
2023-11-20 11:47:55 +02:00
Botond Dénes
fd11eeeaa3 Merge 'dist/redhat: drop unnecessary variables and tags' from Kefu Chai
this is a cleanup in `scylla.spec`.

Closes scylladb/scylladb#16097

* github.com:scylladb/scylladb:
  dist/redhat: group sub-package preambles together
  dist/redhat: drop unused `defines` variable
  dist/redhat: remove tags for subpackage which are same as main preamble
2023-11-20 11:46:56 +02:00
Kefu Chai
71f352896d dist/redhat: group sub-package preambles together
group sections like `%build` and `%install` together, to improve
the readability of the spec recipe.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-11-20 12:19:33 +08:00
Kefu Chai
3f108629b9 dist/redhat: drop unused defines variable
this variable was introduced in 6d7d0231. back then, we were still
building the binaries in .spec, but we've switched to the relocatable
package now, so there is no need to use keep these compilation related
flags anymore.

in this change, the `defines` variable is dropped.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-11-20 12:19:33 +08:00
Kefu Chai
d69b4838ea dist/redhat: remove tags for subpackage which are same as main preamble
this is a cleanup.

if a subpackage is licensed under a different license from the one
specified in the main preamble, we need to use a distinct License
tag on a per-subpackage basis. but if it is licensed with the
identical license, it is not necessary. since all three
subpackages of "*-{server, conf, kernel-conf}" are licensed under
AGPLv3, there is no need to repeat the "License:" tag in their
own preamble section.

the same applies to the "URL" tag.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-11-20 12:19:33 +08:00
Eliran Sinvani
63631257db migration manager: fix incomplete mv schemas returned from
get_schema_for_write

Sometimes a view registry can get deactivated inside the schema
registry, this happens due to dactivating and reactivating the registry
entry which doesn't rebuild the base table information in the view.
This error is later caught when trying to convert the schema into a
`global_schema_ptr`, however, the real bug here is that not all schemas
returned from `get_schema_for_write` are suitable for write because the
mv schemas can be incomplete.
This commit changes the aforementioned function in order to fix the bug.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2023-11-20 06:07:20 +02:00
Piotr Grabowski
321459ec51 install-dependencies.sh: update node_exporter to 1.7.0
Update node_exporter to 1.7.0.

The previous version (1.6.1) was flagged by security scanners (such as
Trivy) with HIGH-severity CVE-2023-39325. 1.7.0 release fixed that
problem.

[Botond: regenerate frozen toolchain]

Fixes #16085

Closes scylladb/scylladb#16086

Closes scylladb/scylladb#16090
2023-11-19 18:15:44 +02:00
Calle Wilund
6ffb482bf3 Commitlog replayer: Range-check skip call
Fixes #15269

If segment being replayed is corrupted/truncated we can attempt skipping
completely bogues byte amounts, which can cause assert (i.e. crash) in
file_data_source_impl. This is not a crash-level error, so ensure we
range check the distance in the reader.

v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will
    ensure we log the fact that things are broken

Closes scylladb/scylladb#15270
2023-11-19 17:44:55 +02:00
Eliran Sinvani
562403b82f migration_manager: do not globalize potentially incomplete schema
There was a case where maybe sync function of a materialized view could
fail to sync if the view version was old. This is because adding the
base information to the view is only relevant until the record is
synced. This triggers an internal error in the `global_schem_ptr`
constructor.
The conversion to global pointer in that case was solely for logging
purposes so instead, we pass the pieces of information needed for the
logging itself.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2023-11-19 14:13:01 +02:00
Botond Dénes
eb674128ca Merge 'treewide: do not mark return value const if this has no effect ' from Kefu Chai
this change is a cleanup to add `-Wignore-qualifiers` when building the tree.

to mark a return value without value semantics has no effect. these
`const` specifier useless. so let's drop them.

and, if we compile the tree with `-Wignore-qualifiers`, the compiler
would warn like:

```
/home/kefu/dev/scylladb/schema/schema.hh:245:5: error: 'const' type qualifier on return type has no effect [-Werror,-Wignored-qualifiers]
  245 |     const index_metadata_kind kind() const;
      |     ^~~~~
```
so this change also silences the above warnings.

Closes scylladb/scylladb#16083

* github.com:scylladb/scylladb:
  build: enable -Wignore-qualifiers
  treewide: do not mark return value const if this has no effect
2023-11-17 15:35:20 +02:00
Kefu Chai
781b7de502 build: enable -Wignore-qualifiers
`-Wignore-qualifiers` is included by -Wextra. but we are not there yet,
with this change, we can keep the changes introducing -Wignore-qualifiers
warnings out of the repo, before applying `-Wextra`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-11-17 17:49:47 +08:00
Kefu Chai
15bfa09454 treewide: do not mark return value const if this has no effect
this change is a cleanup.

to mark a return value without value semantics has no effect. these
`const` specifier useless. so let's drop them.

and, if we compile the tree with `-Wignore-qualifiers`, the compiler
would warn like:

```
/home/kefu/dev/scylladb/schema/schema.hh:245:5: error: 'const' type qualifier on return type has no effect [-Werror,-Wignored-qualifiers]
  245 |     const index_metadata_kind kind() const;
      |     ^~~~~
```
so this change also silences the above warnings.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-11-17 17:46:19 +08:00
Tomasz Grabiec
6bcf3ac86c Merge 'Fix a few rare bugs in row cache' from Michał Chojnowski
This is a loose collection of fixes to rare row cache bugs flushed out by running test_concurrent_reads_and_eviction several million times. See individual commits for details.

Fixes #15483

Closes scylladb/scylladb#15945

* github.com:scylladb/scylladb:
  partition_version: fix violation of "older versions are evicted first" during schema upgrades
  cache_flat_mutation_reader: fix a broken iterator validity guarantee in ensure_population_lower_bound()
  cache_flat_mutation_reader: fix a continuity loss in maybe_update_continuity()
  cache_flat_mutation_reader: fix continuity losses during cache population races with reverse reads
  partition_snapshot_row_cursor: fix a continuity loss in ensure_entry_in_latest() with reverse reads
  cache_flat_mutation_reader: fix some cache mispopulations with reverse reads
  cache_flat_mutation_reader: fix a logic bug in ensure_population_lower_bound() with reverse reads
  cache_flat_mutation_reader: never make an unlinked last dummy continuous
2023-11-16 23:48:17 +01:00
Michał Chojnowski
9ccd4ea416 partition_version: fix violation of "older versions are evicted first" during schema upgrades
A schema upgrade appends a MVCC version B after an existing version A.

The last dummy in B is added to the front of LRU,
so it will be evicted after the entries in A.

This alone doesn't quite violate the "older versions are evicted first" rule,
because the new last dummy carries no information. But apply_monotonically
generally assumes that entries on the same position have the obvious
eviction order, even if they carry no information. Thus, after the merge,
the rule can become broken.

The proposed fix is as follows:

- In the case where A is merged into B, the merged last dummy
  inherits the link of A.
- The merging of B into anything is prevented until its merge with A is finished.

This is relatively hacky, because it still involves a state that
goes against some natural expectations granted by the "older versions..."
rule. A less hacky fix would be to ensure that the new dummy is inserted
into a proper place in the eviction order to begin with.

Or, better yet, we could eliminate the rule altogether.
Aside from being very hard to maintain, it also prevents the introduction
of any eviction algorithm other than LRU.
2023-11-16 19:01:18 +01:00
Michał Chojnowski
2aac8690c7 cache_flat_mutation_reader: fix a broken iterator validity guarantee in ensure_population_lower_bound()
ensure_population_lower_bound() guarantees that _last_row is valid or null.

However, it fails to provide this guarantee in the special rare case when
`_population_range_starts_before_all_rows == true` and _last_row is non-null.

(This can happen in practice if there is a dummy at before_all_clustering_rows
and eviction makes the `(before_all_clustering_rows, ...)` interval
discontinous. When the interval is read in this state, _last_row will point to
the dummy, while _population_range_starts_before_all_rows will still be true.)

In this special case, `ensure_population_lower_bound()` does not refresh
`_last_row`, so it can be non-null but invalid after the call.
If it is accessed in this state, undefined behaviour occurs.
This was observed to happen in a test,
in the `read_from_underlying() -- maybe_drop_last_entry()` codepath.

The proposed fix is to make the meaning of _population_range_starts_before_all_rows
closer to its real intention. Namely: it's supposed to handle the special case of a
left-open interval, not the case of an interval starting at -inf.
2023-11-16 19:01:18 +01:00
Michał Chojnowski
0dcf91491e cache_flat_mutation_reader: fix a continuity loss in maybe_update_continuity()
To reflect the final range tombstone change in the populated range,
maybe_update_continuity() might insert a dummy at `before_key(_next_row.table_position())`.

But the relevant logic breaks down in the special case when that position is
equal to `_last_row.position()`. The code treats the dummy as a part of
the (_last_row, _next_row) range, but this is wrong in the special case.

This can lead to inconsistent state. For example, `_last_row` can be wrongly made
continuous, or its range tombstone can be wrongly nulled.

The proposed fix is to only modify the dummy if it was actually inserted.
If it had been inserted beforehand (which is true in the special case, because
of the `ensure_population_lower_bound()` call earlier), then it's already in a
valid state and doesn't need changes.
2023-11-16 19:01:18 +01:00
Michał Chojnowski
6601c778dd cache_flat_mutation_reader: fix continuity losses during cache population races with reverse reads
Cache population routines insert new row entries.

In non-reverse reads, the new entries (except for the lower bound of the query
range) are filled with the correct continuity and range tombstones immediately
after insertion, because that information has already arrived from underlying.
when the entries are inserted.

But in reverse reads, it's the interval *after* the newly-inserted entry
that's made continuous. The continuity information in the new entries isn't
filled. When two population routines race, the one which comes later can
punch holes in the continuity left by the first routine, which can break
the "older versions are evicted first" rule and revert the affected
interval to an older version.

To fix this, we must make sure that inserting new row entries doesn't
change the total continuity of the version.
2023-11-16 19:01:18 +01:00
Michał Chojnowski
47299d6b06 partition_snapshot_row_cursor: fix a continuity loss in ensure_entry_in_latest() with reverse reads
The FIXME comment claims that setting continity isn't very important in this
place, but in fact this is just wrong.

If two calls to read_from_underlying() get into a race, the one which finishes
later can call ensure_entry_in_latest() on a position which lies inside a
continuous interval in the newest version. If we don't take care to preserve
the total continuity of the version, this can punch a hole in the continuity of the
newest version, potentially reverting the affected interval to an older version.

Fix that.
2023-11-16 19:01:18 +01:00
Michał Chojnowski
b5988fb389 cache_flat_mutation_reader: fix some cache mispopulations with reverse reads
`_last_row` is in table schema, but it is sometimes compared with positions in
query schema. This leads to unexpected behaviour when reverse reads
are used.
The previous patch fixed one such case, which was affecting correctness.

As far as I can tell, the three cases affected by this patch aren't
a correctness problem, but can cause some intervals to fail to be made continuous.
(And they won't be cached even if the same read is repeated many times).
2023-11-16 19:01:18 +01:00