Commit Graph

36593 Commits

Author SHA1 Message Date
Michał Chojnowski
80c8a6d0e6 partition_version: make partition_entry::upgrade() gentle
Preceding commits in this patch series have extended the MVCC
mechanism to allow for versions with different schemas
in the same entry/snapshot, with on-the-fly and background
schema upgrades to the most recent version in the chain.

Given that, we can perform gentle schema upgrades by simply
adding an empty version with the target schema to the front
of the entry.

This patch is intended to be the first and only behaviour-changing patch in the
series. Previous patches added code paths for multi-schema snapshots, but never
exercised them, because before this patch two different schemas within a single
MVCC chain never happened. This patch makes it happen and thus exercises all the
code in the series up until now.

Fixes #2577
2023-05-04 03:35:15 +02:00
Michał Chojnowski
fe576f8f29 partition_version: handle multi-schema snapshots in merge_partition_versions
Each partition_version is allowed to have a different schema now.
As of this patch, all versions reachable from a snapshot/entry always
have the same schema, but this will change in an upcoming patch.
This commit prepares merge_partition_versions() for that.

See code comments added in this patch for a detailed description.

The design chosen in this patch requires adding a bit of information to
partition_version. Due to alignment, it results in a regrettable waste of 8
bytes per partition. If we want, we can recover that in the future by squeezing
the bit into some free bit in other fields, for example the highest or lowest
bits of one of the pointers in partition_version.

After this patch, MVCC should be prepared for replacing the atomic schema
upgrade() of cache/memtable entries with a gentle upgrade().
2023-05-04 03:35:15 +02:00
Michał Chojnowski
152b4cd4c2 mutation_partition_v2: handle schema upgrades in apply_monotonically()
To avoid reactor stalls during schema upgrades of memtable and cache entries,
we want to do them interruptibly, not atomically. To achieve that, we want
to reuse the existing gentle version merging mechanism. If we generalize
version merging algorithms to handle `mutation_partition`s with different
schemas, a schema upgrade will boil down simply to adding a new empty MVCC
version with the new schema.

In a previous patch, we already generalized the cursor to upgrade rows
on the fly when reading.
But we still have to generalize the other MVCC algorithm: the merging of
superfluous mutation_partition_v2 objects. This patch modifies the two-version
merging algorithm: apply_monotonically(). The next patch will update its caller,
merge_partition_versions() to make of use the updated apply_monotonically()
properly.
2023-05-04 03:35:15 +02:00
Michał Chojnowski
0273101890 partition_version: remove the unused "from" argument in partition_entry::upgrade()
partition_entry now contains a reference to its schema, so it doesn't have to
be supplied by the caller anymore.
2023-05-04 02:37:30 +02:00
Michał Chojnowski
fc4b812e62 row_cache_test: prepare test_eviction_after_schema_change for gentle schema upgrades
The upcoming schema upgrade change will perform the schema upgrade by adding
a new version (with the new schema) to the partition entry.

To clean a multi-version entry, eviction is not enough - the versions have
to be merged and/or cleared first. drain() does just that.
2023-05-04 02:37:30 +02:00
Michał Chojnowski
db6a35e3a8 partition_version: handle multi-schema entries in partition_entry::squashed
An upcoming patch will enable multiple schemas within a single entry,
after the entry is upgraded.
partition_entry::squashed isn't prepared for that yet.
This patch prepares it.
2023-05-04 02:37:30 +02:00
Michał Chojnowski
5f68409934 partition_snapshot_row_cursor: handle multi-schema snapshots
To support gentle schema upgrades, each version has its own schema.
Currently this facility is unused, and the schema is equal for
all versions in a snapshot. But in upcoming commits this will change.

In the new design, after an entry upgrade, there will be a transitional
period where two versions with different schemas will coexist in a snapshot.
Eventually, these versions will be merged by mutation_cleaner into one
version with the current schema, but until then reads have to merge
multi-schema snapshots on the fly.

This commit implements in the cursor support for per-version schemas.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
f4e853b32d partiton_version: prepare partition_snapshot::squashed() for multi-schema snapshots
When in upcoming patches we allow multiple schema versions within a single
snapshot, reads will have to upgrade rows on the fly.
This also applies to squashed()
2023-05-04 02:37:29 +02:00
Michał Chojnowski
a2e3cf7463 partition_version: prepare partition_snapshot::static_row() for multi-schema snapshots
When in upcoming patches we allow multiple schema versions within a single
snapshot, reads will have to upgrade rows on the fly.
This also applies to the static row.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
94e4dc3d8d partition_version: add a logalloc::region argument to partition_entry::upgrade()
The argument is currently unused, but will be further propagated to
add_version() in an upcoming patch.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
98dfe3355e memtable: propagate the region to memtable_entry::upgrade_schema()
Adds a logalloc::region argument to upgrade_schema().
It's currently unused, but will be further propagated to
partition_entry::upgrade() in an upcoming patch.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
effd1fe70f mutation_partition: add an upgrading variant of lazy_row::apply()
A helper which will be used during upcoming changes to
mutation_partition_v2::apply_monotonically(), which will extend it to merging
versions with different schemas.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
dce1b3e820 mutation_partition: add an upgrading variant of rows_entry::rows_entry
A helper which will be used in upcoming commits.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
2fe25a5aa2 mutation_partition: switch an apply() call to apply_monotonically() 2023-05-04 02:37:29 +02:00
Michał Chojnowski
a34c5e410f mutation_partition: add an upgrading variant of rows_entry::apply_monotonically()
A helper which will be used during upcoming changes to
mutation_partition_v2::apply_monotonically(), which will extend it to merging
versions with different schemas.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
333e65447c mutation_fragment: add an upgrading variant of clustering_row::apply()
It will be used during upcoming changes in partition_snapshot_row_cursor
to prepare it for multi-schema snapshots.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
b488e4d541 mutation_partition: add an upgrading variant of row::row
It will be used in upcoming commits.

A factory function is used, rather than an actual constructor,
because we want to delegate the (easy) case of equal schemas
to the existing single-schema constructor.
And that's impossible (at least without invoking a copy/move
constructor) to do with only constructors.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
caaf0bd6bf partition_version: remove _schema from partition_entry::operator<<
operator<< accepts a schema& and a partition_entry&. But since the latter
now contains a reference to its schema inside, the former is redundant.
Remove it.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
f6e11c95e2 partition_version: remove the schema argument from partition_entry::read()
partition_entry now contains a reference to its schema, so it no longer
needs to be supplied by the caller.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
4e4ae43a84 memtable: remove _schema from memtable_entry
After adding a _schema field to each partition version,
the field in memtable_entry is redundant. It can be always recovered
from the latest version. Remove it.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
d999e46fa5 row_cache: remove _schema from cache_entry
After adding a _schema field to each partition version,
the field in cache_entry is redundant. It can be always recovered
from the latest version. Remove it.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
d7d6449a8f partition_version: remove the _schema field from partition_snapshot
After adding a _schema field to each partition version,
the field in partition_snapshot is redundant. It can be always recovered
from the latest version. Remove it.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
1d01a4a168 partition_version: add a _schema field to partition_version
Currently, partition_version does not reference its schema.
All partition_version reachable from a entry/snapshot have the same schema,
which is referenced in memtable_entry/cache_entry/partition_snapshot.

To enable gentle schema upgrades, we want to use the existing background
version merging mechanism. To achieve that, we will move the schema reference
into partition_version, and we will allow neighbouring MVCC versions to have
different schemas, and we will merge them on-the-fly during reads and
persistently during background version merges.
This way, an upgrade will boil down to adding a new empty version with
the new schema.

This patch adds the _schema field to partition_version and propagates
the schema pointer to it from the version's containers (entry/snapshot).
Subsequent patches will remove the schema references from the containers,
because they are now redundant.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
bc6a07a16a mutation_partition: change schema_ptr to schema& in mutation_partition::difference
Cosmetic change. See the preceding commit for details.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
a70c5704df mutation_partition: change schema_ptr to schema& in mutation_partition constructor
Cosmetic change. See the preceding commit for details.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
781514acfe mutation_partition_v2: change schema_ptr to schema& in mutation_partition_v2 constructor
We don't have a convention for when to pass `schema_ptr` and and when to pass
`const schema&` around.
In general, IMHO the natural convention for such a situation is to pass the
shared pointer if the callee might extend the lifetime of shared_ptr,
and pass a reference otherwise. But we convert between them willy-nilly
through shared_from_this().

While passing a reference to a function which actually expects a shared_ptr
can make sense (e.g. due to the fact that smart pointers can't be passed in
registers), the other way around is rather pointless.

This patch takes one occurence of that and modifies the parameter to a reference.

Since enable_shared_from_this makes shared pointer parameters and reference
parameters interchangeable, this is a purely cosmetic change.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
021b345832 mutation_partition: add upgrading variants of row::apply()
They will be used in upcoming patches which introduce incremental schema upgrades.

Currently, these variants always copy cells during upgrade.
This could be optimized in the future by adding a way to move them instead.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
4214f8d0de partition_version: update the comment to apply_to_incomplete()
The comment refers to "other", but it means "pe". Fix that.

The patch also adds a bit of context to the mutation_partition jargon
("evictability" and "continuity"), by reminding how it relates
to the concrete abstractions: memtable and cache.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
49a02b08de mutation_partition_v2: clean up variants of apply()
Most variants of apply() and apply_monotonically() in mutation_partition_v2
are leftovers from mutation_partition, and are unused. Thus they only
add confusion and maintenance burden. Since we will be modifying
apply_monotonically() in upcoming patches, let's clean them up, lest
the variants become stale.

This patch removes all unused variants of apply() and apply_monotonically()
and "manually inlines" the variants which aren't used often enough to carry
their own weight.

In the end, we are left with a single apply_monotonically() and two convenience
apply() helpers.

The single apply_monotonically() accepts two schema arguments. This facility
is unimplemented and unused as of this patch - the two arguments are always
the same - but it will be implemented and used in later parts of the series.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
88a0871729 mutation_partition: remove apply_weak()
apply_weak is just an alias for apply(), and most of its variants
are dead code. Get rid of it.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
38d9241c30 mutation_partition_v2: remove a misleading comment in apply_monotonically()
The comment suggests that the order of sentinel insertion is meaningful because
of the resulting eviction order. But the sentinels are added to the tracker
with the two-argument version of insert(), which inserts the second argument
into the LRU right before the (more recent) first argument.
Thus the eviction order of sentinels is decided explicitly, and it doesn't
rely on insertion order.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
42c7bc0391 row_cache_test: add schema changes to test_concurrent_reads_and_eviction
Reads with multiple schema verions have a different code path now,
so add schema changes to the test, to test these paths too.
2023-05-04 02:37:29 +02:00
Michał Chojnowski
fb8ae3cca4 mutation_partition: fix mixed-schema apply()
In some mixed-schema apply helpers for tests, the source mutation
is accidentally copied with the target schema. Fix that.

Nothing seems to be currently affected by this bug; I found it when it
was triggered by a new test I was adding.
2023-05-04 02:37:29 +02:00
Kefu Chai
113fb32019 compaction: disambiguate format_to()
we should always qualify `format_to` with its namespace. otherwise
we'd have following failure when compiling with libstdc++ from GCC-13:

```
/home/kefu/dev/scylladb/compaction/table_state.hh:65:16: error: call to 'format_to' is ambiguous
        return format_to(ctx.out(), "{}.{} compaction_group={}", s->ks_name(), s->cf_name(), t.get_group_id());
               ^~~~~~~~~
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13760
2023-05-03 20:33:18 +03:00
Botond Dénes
48b9f31a08 Merge 'db, sstable: use generation_type instead of its value when appropriate' from Kefu Chai
in this series, we try to use `generation_type` as a proxy to hide the consumers from its underlying type. this paves the road to the UUID based generation identifier. as by then, we cannot assume the type of the `value()` without asking `generation_type` first. better off leaving all the formatting and conversions to the `generation_type`. also, this series changes the "generation" column of sstable registry table to "uuid", and convert the value of it to the original generation_type when necessary, this paves the road to a world with UUID based generation id.

Closes #13652

* github.com:scylladb/scylladb:
  db: use uuid for the generation column in sstable registry table
  db, sstable: add operator data_value() for generation_type
  db, sstable: print generation instead of its value
2023-05-03 09:04:54 +03:00
Nadav Har'El
b5f28e2b55 Merge 'Add S3 support to sstables::test_env' from Pavel Emelyanov
Currently there are only 2 tests for S3 -- the pure client test and compound object_store test that launches scylla, creates s3-backed table and CQL-queries it. At the same time there's a whole lot of small unit test for sstables functionality, part of it can run over S3 storage too.

This PR adds this support and patches several test cases to use it. More test cases are to come later on demand.

fixes: #13015

Closes #13569

* github.com:scylladb/scylladb:
  test: Make resharding test run over s3 too
  test: Add lambda to fetch bloom filter size
  test: Tune resharding test use of sstable::test_env
  test: Make datafile test case run over s3 too
  test: Propagate storage options to table_for_test
  test: Add support for s3 storage_options in config
  test: Outline sstables::test_env::do_with_async()
  test: Keep storage options on sstable_test_env config
  sstables: Add and call storage::destroy()
  sstables: Coroutinize sstable::destroy()
2023-05-02 21:48:05 +03:00
Botond Dénes
72003dc35c readers: evictable_reader: skip progress guarantee when next pos is partition start
The evictable reader must ensure that each buffer fill makes forward
progress, i.e. the last fragment in the buffer has a position larger
than the last fragment from the last buffer-fill. Otherwise, the reader
could get stuck in an infinite loop between buffer fills, if the reader
is evicted in-between.
The code guranteeing this forward change has a bug: when the next
expected position is a partition-start (another partition), the code
would loop forever, effectively reading all there is from the underlying
reader.
To avoid this, add a special case to ignore the progress guarantee loop
altogether when the next expected position is a partition start. In this
case, progress is garanteed anyway, because there is exactly one
partition-start fragment in each partition.

Fixes: #13491

Closes #13563
2023-05-02 16:19:32 +03:00
Botond Dénes
7baa2d9cb2 Merge 'Cleanup range printing' from Benny Halevy
This mini-series cleans up printing of ranges in utils/to_string.hh

It generalizes the helper function to work on a std::ranges::range,
with some exceptions, and adds a helper for boost::transformed_range.

It also changes the internal interface by moving `join` the the utils namespace
and use std::string rather than seastar::sstring.

Additional unit tests were added to test/boost/json_test

Fixes #13146

Closes #13159

* github.com:scylladb/scylladb:
  utils: to_string: get rid of utils::join
  utils: to_string: get rid of to_string(std::initializer_list)
  utils: to_string: get rid of to_string(const Range&)
  utils: to_string: generalize range helpers
  test: add string_format_test
  utils: chunked_vector: add std::ranges::range ctor
2023-05-02 14:55:18 +03:00
Botond Dénes
d6ed5bbc7e Merge 'alternator: fix validation of numbers' magnitude and precision' from Nadav Har'El
DynamoDB limits the allowed magnitude and precision of numbers - valid
decimal exponents are between -130 and 125 and up to 38 significant
decimal digitst are allowed. In contrast, Scylla uses the CQL "decimal"
type which offers unlimited precision. This can cause two problems:

1. Users might get used to this "unofficial" feature and start relying
    on it, not allowing us to switch to a more efficient limited-precision
    implementation later.

2. If huge exponents are allowed, e.g., 1e-1000000, summing such a
    number with 1.0 will result in a huge number, huge allocations and
    stalls. This is highly undesirable.

This series adds more tests in this area covering additional corner cases,
and then fixes the issue by adding the missing verification where it's
needed. After the series, all 12 tests in test/alternator/test_number.py now pass.

Fixes #6794

Closes #13743

* github.com:scylladb/scylladb:
  alternator: unit test for number magnitude and precision function
  alternator: add validation of numbers' magnitude and precision
  test/alternator: more tests for limits on number precision and magnitude
  test/alternator: reproducer for DoS in unlimited-precision addition
2023-05-02 14:33:36 +03:00
Kefu Chai
74e9e6dd1a db: use uuid for the generation column in sstable registry table
* change the "generation" column of sstable registry table from
  bigint to uuid
* from helper to convert UUID back to the original generation

in the long run, we encourage user to use uuid based generation
identifier. but in the transition period, both bigint based and uuid
based identifiers are used for the generation. so to cater both
needs, we use a hackish way to store the integer into UUID. to
differentiate the was-integer UUID from the geniune UUID, we
check the UUID's most_significant_bits. because we only support
serialize UUID v1, so if the timestamp in the UUID is zero,
we assume the UUID was generated from an integer when converting it
back to a generation identififer.

also, please note, the only use case of using generation as a
column is the sstable_registry table, but since its schema is fixed,
we cannot store both a bigint and a UUID as the value of its
`generation` column, the simpler way forward is to use a single type
for the generation. to be more efficient and to preserve the type of
the generation, instead of using types like ascii string or bytes,
we will always store the generation as a UUID in this table, if the
generation's identifier is a int64_t, the value of the integer will
be used as the least significant bits of the UUID.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-02 19:23:22 +08:00
Nadav Har'El
ed34f3b5e4 cql-pytest: translate Cassandra's test for LWT with collections
This is a translation of Cassandra's CQL unit test source file
validation/operations/InsertUpdateIfConditionTest.java into our cql-pytest
framework.

This test file checks various LWT conditional updates which involve
collections or UDTs (there is a separate test file for LWT conditional
updates which do not involve collections, which I haven't translated
yet).

The tests reproduce one known bug:

Refs #5855:  lwt: comparing NULL collection with empty value in IF
             condition yields incorrect results

And also uncovered three previously-unknown bugs:

Refs #13586: Add support for CONTAINS and CONTAINS KEY in LWT expressions
Refs #13624: Add support for UDT subfields in LWT expression
Refs #13657: Misformatted printout of column name in LWT error message

Beyond those bona-fide bugs, this test also demonstrates several places
where we intentionally deviated from Cassandra's behavior, forcing me
to comment out several checks. These deviations are known, and intentional,
but some of them are undocumented and it's worth listing here the ones
re-discovered by this test:

1. On a successful conditional write, Cassandra returns just True, Scylla
   also returns the old contents of the row. This difference is officially
   documented in docs/kb/lwt-differences.rst.
2. Scylla allows the test "l = [null]" or "s = {null}" with this weird
   null element (the result is false), whereas Cassandra prints an error.
3. Scylla allows "l[null]" or "m[null]" (resulting in null), Cassandra
   prints an error.
4. Scylla allows a negative list index, "l[-2]", resulting in null.
   Cassandra prints an error in this case.
5. Cassandra allows in "IF v IN (?, ?)" to bind individual values to
   UNSET_VALUE and skips them, Scylla treats this as an error. Refs #13659.
6. Scylla allows "IN null" (the condition just fails), Cassandra prints
   an error in this case.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13663
2023-05-02 11:53:58 +03:00
Pavel Emelyanov
d4a72de406 test: Make resharding test run over s3 too
Now when the test case and used lib/utils code is using storage-agnostic
approach, it can be extended to run over S3 storage as well.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:46:23 +03:00
Pavel Emelyanov
2601c58278 test: Add lambda to fetch bloom filter size
The resharding test compares bloom filter sizes before and after reshard
runs. For that it gets the filter on-disk filename and stat()s it. That
won't work with S3 as it doesn't have its accessable on-disk files.

Some time ago there existed the storage::get_stats() method, but now
it's gone. The new s3::client::get_object_stat() is coming, but it will
take time to switch to it. For now, generalize filter size fetching into
a local lambda. Next patch will make a stub in it for S3 case, and once
the get_object_stat() is there we'll be able to smoothly start using it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:43:26 +03:00
Kefu Chai
135b4fd434 db: schema_tables: capture reference to temporary value by value
`clustering_key_columns()` returns a range view, and `front()` returns
the reference to its first element. so we cannot assume the availability
of this reference after the expression is evaluated. to address this
issue, let's capture the returned range by value, and keep the first
element by reference.

this also silences warning from GCC-13:

```
/home/kefu/dev/scylladb/db/schema_tables.cc:3654:30: error: possibly dangling reference to a temporary [-Werror=dangling-reference]
 3654 |     const column_definition& first_view_ck = v->clustering_key_columns().front();
      |                              ^~~~~~~~~~~~~
/home/kefu/dev/scylladb/db/schema_tables.cc:3654:79: note: the temporary was destroyed at the end of the full expression ‘(& v)->view_ptr::operator->()->schema::clustering_key_columns().boost::iterator_range<__gnu_cxx::__normal_iterator<const column_definition*, std::vector<column_definition> > >::<anonymous>.boost::iterator_range_detail::iterator_range_base<__gnu_cxx::__normal_iterator<const column_definition*, std::vector<column_definition> >, boost::iterators::random_access_traversal_tag>::<anonymous>.boost::iterator_range_detail::iterator_range_base<__gnu_cxx::__normal_iterator<const column_definition*, std::vector<column_definition> >, boost::iterators::bidirectional_traversal_tag>::<anonymous>.boost::iterator_range_detail::iterator_range_base<__gnu_cxx::__normal_iterator<const column_definition*, std::vector<column_definition> >, boost::iterators::incrementable_traversal_tag>::front()’
 3654 |     const column_definition& first_view_ck = v->clustering_key_columns().front();
      |                                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
```

Fixes #13720
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13721
2023-05-02 11:42:43 +03:00
Pavel Emelyanov
76594bf72b test: Tune resharding test use of sstable::test_env
The test case in question spawns async context then makes the test_env
instance on the stack (and stopper for it too). There's helper for the
above steps, better to use them.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:30:03 +03:00
Pavel Emelyanov
439c8770aa test: Make datafile test case run over s3 too
Most of the sstable_datafile test cases are capable of running with S3
storage, so this patch makes the simplest of them do it. Patching the
rest from this file is optional, because mostly the cases test how the
datafile data manipulations work without checking the files
manipulations. So even if making them all run over S3 is possible, it
will just increase the testing time w/o real test of the storage driver.

So this patch makes one test case run over local and S3 storages, more
patches to update more test cases with files manipulations are yet to
come.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:30:03 +03:00
Pavel Emelyanov
f7df238545 test: Propagate storage options to table_for_test
Teach table_for_tests use any storage options, not just local one. For
now the only user that passes non-local options is sstables::test_env.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:30:03 +03:00
Pavel Emelyanov
fa1de16f30 test: Add support for s3 storage_options in config
When the sstable test case wants to run over S3 storage it needs to
specify that in test config by providing the S3 storage options. So
first thing this patch adds is the helper that makes these options based
on the env left by minio launcher from test.py.

Next, in order to make sstables_manager work with S3 it needs the
plugged system keyspace which, in turn, needs query processor, proxy,
database, etc. All this stuff lives in cql_test_env, so the test case
running with S3 options will run in a sstables::test_env nested inside
cql_test_env. The latter would also need to plug its system keyspace to
the former's sstables manager and turn the experimental feature ON.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:30:03 +03:00
Nadav Har'El
57ffbcbb22 cql3: fix spurious token names in syntax error messages
We have known for a long time (see issue #1703) that the quality of our
CQL "syntax error" messages leave a lot to be desired, especially when
compared to Cassandra. This patch doesn't yet bring us great error
messages with great context - doing this isn't easy and it appears that
Antlr3's C++ runtime isn't as good as the Java one in this regard -
but this patch at least fixes **garbage** printed in some error messages.

Specifically, when the parser can deduce that a specific token is missing,
it used to print

    line 1:83 missing ')' at '<missing '

After this patch we get rid of the meaningless string '<missing ':

    line 1:83 : Missing ')'

Also, when the parser deduced that a specific token was unneeded, it
used to print:

    line 1:83 extraneous input ')' expecting <invalid>

Now we got rid of this silly "<invalid>" and write just:

    line 1:83 : Unexpected ')'

Refs #1703. I didn't yet marked that issue "fixed" because I think a
complete fix would also require printing the entire misparsed line and the
point of the parse failure. Scylla still prints a generic "Syntax Error"
in most cases now, and although the character number (83 in the above
example) can help, it's much more useful to see the actual failed
statement and where character 83 is.

Unfortunately some tests enshrine buggy error messages and had to be
fixed. Other tests enshrined strange text for a generic unexplained
error message, which used to say "  : syntax error..." (note the two
spaces and elipses) and after this patch is " : Syntax error". So
these tests are changed. Another message, "no viable alternative at
input" is deliberately kept unchanged by this patch so as not to break
many more tests which enshrined this message.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13731
2023-05-02 11:23:58 +03:00
Pavel Emelyanov
1e03733e8c test: Outline sstables::test_env::do_with_async()
It's growing larger, better to keep it in .cc file

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-05-02 11:15:45 +03:00