31912 Commits

Author SHA1 Message Date
Michael Livshin
0eefbfa3cc utils: logalloc: add arithmetic operations to segment_pool::stats
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
Michael Livshin
3fced65542 utils: logalloc: have reclaim timers detect being nested
Make sure that inner timers don't waste CPU measuring anything.
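
A minimal sketch of how a scoped timer could detect that it is nested, assuming a per-thread depth counter. The class name `scoped_timer` and its members are hypothetical and are not the actual logalloc `reclaim_timer` code:

```cpp
#include <chrono>

// Hypothetical RAII timer; names are illustrative, not logalloc's.
class scoped_timer {
    using clock = std::chrono::steady_clock;
    // Number of scoped_timer objects currently alive on this thread.
    static inline thread_local int _depth = 0;
    bool _outermost;
    clock::time_point _start{};
public:
    scoped_timer() : _outermost(_depth++ == 0) {
        if (_outermost) {
            _start = clock::now();  // only the outermost timer reads the clock
        }
    }
    ~scoped_timer() { --_depth; }
    scoped_timer(const scoped_timer&) = delete;
    scoped_timer& operator=(const scoped_timer&) = delete;

    // True if this timer is the one doing the measuring.
    bool measuring() const { return _outermost; }

    // Elapsed time for the outermost timer; zero for nested ones.
    clock::duration elapsed() const {
        return _outermost ? clock::now() - _start : clock::duration::zero();
    }
};
```

With this shape, a timer constructed while another one is alive on the same thread skips the clock reads entirely.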

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
76ca93b779 utils: logalloc: add more reclaim_timers
Measure stalls at higher resolution.

Refs #6189

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
42db63d012 utils: logalloc: move reclaim_timer to compact_and_evict_locked
track compact_and_evict_locked timing from
all call paths, not only from compact_and_evict.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
fd2b4a4b7d utils: logalloc: pull reclaim_timer definition forward
So it can be used in functions defined earlier in the source file
in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
33785d261e utils: logalloc: reclaim_timer make tracker optional
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
acd82d3b25 utils: logalloc: reclaim_timer: print backtrace if stall detected
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
239992f16c utils: logalloc: reclaim_timer: get call site name
Before adding even more call sites, print the call site
name in the report.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
c4d64c3bf7 utils: logalloc: reclaim_timer: rename set_result
Rename set_result to set_memory_released
to make it clearer what the result means.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
5ce0038e6a utils: logalloc: reclaim_timer: rename _reserve_segments member
Rename reclaim_timer::_reserve_segments to _segments_to_release
as it is clearer and more suitable for later patches
that will add reclaim_timers in more functions.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
c34d1a7705 utils: logalloc: reclaim_timer round up microseconds
Better to report 29000 us than 28999 us.
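
As an illustration only: the commit message does not say how the rounding is done, so this sketch assumes `std::chrono::ceil` and uses hypothetical helper names to contrast rounding up with a truncating conversion:

```cpp
#include <chrono>

// Hypothetical helper: convert to whole microseconds, rounding up.
constexpr std::chrono::microseconds round_up_to_us(std::chrono::nanoseconds d) {
    return std::chrono::ceil<std::chrono::microseconds>(d);
}

// Truncating conversion, shown for comparison.
constexpr std::chrono::microseconds truncate_to_us(std::chrono::nanoseconds d) {
    return std::chrono::duration_cast<std::chrono::microseconds>(d);
}

// 28999.001 us rounds up to 29000 us, but truncates to 28999 us.
static_assert(round_up_to_us(std::chrono::nanoseconds(28'999'001)).count() == 29000);
static_assert(truncate_to_us(std::chrono::nanoseconds(28'999'001)).count() == 28999);
```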

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Nadav Har'El
f5ff687b64 Merge 'cql3: Reorganize expr::to_restriction' from Jan Ciołek
This PR introduces improvements to `expr::to_restriction` and prepares the validation part for restriction classes removal.

`expr::to_restriction` is currently used to take a restriction from the WHERE clause, prepare it, perform some validation checks and finally convert it to an instance of the restriction class.

Soon we will get rid of the restriction class.

In preparation for that `expr::to_restriction` is split into two independent parts:
* The part that prepares and validates a binary_operator
* The part that converts a binary_operator to restriction

Thanks to this split getting rid of restriction class will be painless, we will just stop using the second part.

`to_restriction.cc` is replaced by `restrictions.hh/cc`. In the future we can put all the restriction expressions code there to avoid clutter in `expression.hh/cc`.

This change made it much easier to fix #10631, so I did that as well.

Fixes: #10631

Closes #10979

* github.com:scylladb/scylla:
  cql-pytest: Test that IS NOT only accepts NULL
  cql-pytest: Enable testInvalidCollectionNonEQRelation
  cql3: Move single element IN restrictions handling
  cql3: Check for disallowed operators early
  cql3: Simplify adding restrictions
  cql3: Reorganize to_restriction code
  cql3: Fix IS NOT NULL check in to_restriction
  cql3: Swap order of arguments in error message
2022-07-12 00:26:34 +03:00
Avi Kivity
53e0dc7530 bytes_ostream: base on managed_bytes
bytes_ostream is an incremental builder for a discontiguous byte container.
managed_bytes is a non-incremental (size must be known up front) byte
container, that is also compatible with LSA. So far, conversion between
them involves copying. This is unfortunate, since query_result is generated
as a bytes_ostream, but is later converted to managed_bytes (today, this
is done in cql3::expr::get_non_pk_values() and
compound_view_wrapper::explode()). If the two types could be made compatible,
we could use managed_bytes_view instead of creating new objects and avoid
a copy. It's also nicer to have one less vocabulary type.

This patch makes bytes_ostream use managed_bytes' internal representation
(blob_storage instead of bytes_ostream::chunk) and provides a conversion
to managed_bytes. All bytes_ostream users are left in place, but the goal
is to make bytes_ostream a write-only type with the only observer a conversion
to managed_bytes.

It turns out to be relatively simple. The internal representations were
already similar. I made blob_storage::ref_type self-initializing to
reduce churn (good practice anyway) and added a private constructor
to managed_bytes for the conversion.

Note that bytes_ostream can only be used to construct a non-LSA managed_bytes,
but LSA uses of managed_bytes are very strictly controlled (the entry
points to memtable and cache) so that's not a problem.

A unit test is added.

Closes #10986
2022-07-12 00:23:29 +03:00
Pavel Emelyanov
5526738794 view: Fix trace-state pointer use after move
It's moved into .mutate_locally() but is still captured and used in its
continuation. It works only because a moved-from pointer looks like
nullptr and all the tracing code checks for it to be non-null.
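
The pattern can be sketched with `std::shared_ptr` standing in for the trace-state pointer (an assumption; the real pointer type, `trace_state` contents and `consume` below are hypothetical). A moved-from `std::shared_ptr` is guaranteed to be empty, which is why null checks hide the bug:

```cpp
#include <memory>
#include <string>
#include <utility>

struct trace_state { std::string session; };        // hypothetical stand-in
using trace_state_ptr = std::shared_ptr<trace_state>;

// Takes ownership of the pointer, like a callee that the pointer is moved into.
inline bool consume(trace_state_ptr p) { return p != nullptr; }

// The bug pattern: the pointer is moved away, then inspected afterwards.
inline bool pointer_is_null_after_move() {
    auto tr = std::make_shared<trace_state>();
    consume(std::move(tr));
    return tr == nullptr;   // true: later tracing calls are silently skipped
}

// One possible fix: take a copy for the later use before moving.
inline bool copy_survives_move() {
    auto tr = std::make_shared<trace_state>();
    auto for_continuation = tr;
    consume(std::move(tr));
    return for_continuation != nullptr;
}
```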

tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1266/
       (CI job failed on post-actions thus it's red)

Fixes #11015

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20220711134152.30346-1-xemul@scylladb.com>
2022-07-11 17:20:51 +03:00
Avi Kivity
34886ce1a1 Merge 'Allow regular compaction during major' from Benny Halevy
After acquiring the _compaction_state write lock,
select all sstables using get_candidates and register them
as compacting, then unlock the _compaction_state lock
to let regular compaction run in parallel.

Also, run major compaction in maintenance scheduling group.
We should separate the scheduling groups used for major compaction
from the regular compaction scheduling group so that
the latter can be affected by the backlog tracker in case
backlog accumulates during a long running major compaction.

Fixes #10961

Closes #10984

* github.com:scylladb/scylla:
  compaction_manager: major_compaction_task: run in maintenance scheduling group
  compaction_manager: allow regular compaction to run in parallel to major
2022-07-11 17:11:51 +03:00
Jan Ciolek
012f7d5b1a cql-pytest: Test that IS NOT only accepts NULL
The IS_NOT operator can only be used during materialized view creation
and it can only be used to express IS NOT NULL.
Trying to write something like IS NOT 42 should cause an error.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:16 +02:00
Jan Ciolek
22e605f823 cql-pytest: Enable testInvalidCollectionNonEQRelation
The wrong error message has been fixed and
now the test passes.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:16 +02:00
Jan Ciolek
38e115edf7 cql3: Move single element IN restrictions handling
Restrictions like
col IN (1)
get converted to
col = 1
as an optimization/simplification.

This used to be done in prepare_binary_operator,
but it fits way better inside of
validate_and_prepare_new_restriction.

When it was being done in prepare_binary_operator
the conversion happened before validation checks
and the error messages would describe an equality
restriction despite the user making an IN restriction.

Now the conversion happens after all validation
is finished, which ensures that all checks are
being done on the original expression.
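
A toy sketch of the ordering, not the actual cql3 code: the `restriction` struct, `oper` enum, `validate` and `validate_and_prepare` below are simplified, hypothetical stand-ins. Validation runs on the original IN restriction, and only then is the single-element IN simplified to an equality:

```cpp
#include <stdexcept>
#include <string>
#include <vector>

enum class oper { EQ, IN };

struct restriction {
    std::string column;
    oper op;
    std::vector<int> values;
};

// Hypothetical validation step: reject IN where it is not allowed,
// naming the operator the user actually wrote.
inline void validate(const restriction& r, bool in_allowed) {
    if (r.op == oper::IN && !in_allowed) {
        throw std::invalid_argument("IN restriction is not supported on column " + r.column);
    }
}

// Validate first, then simplify `col IN (x)` to `col = x`.
inline restriction validate_and_prepare(restriction r, bool in_allowed) {
    validate(r, in_allowed);
    if (r.op == oper::IN && r.values.size() == 1) {
        r.op = oper::EQ;
    }
    return r;
}
```

If the simplification ran first, `validate` would see an EQ restriction and could not report an error about IN.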

Fixes: #10631

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:16 +02:00
Jan Ciolek
cb504b2d6e cql3: Check for disallowed operators early
Move checking for disallowed operators
earlier in the code flow.
This is needed to pass some tests that
expect one error message instead of the other.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:16 +02:00
Jan Ciolek
62155846bc cql3: Simplify adding restrictions
The code that adds restrictions in statement_restrictions.cc
is unnecessarily convoluted.

The code to handle IS NOT NULL is actually repeated twice,
once in the constructor and once in add_is_not_restriction.
I missed this when I originally modified this code.
There is no need to keep duplicate code, we can just
use the new add_is_not_restriction.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:16 +02:00
Jan Ciolek
debd7399fd cql3: Reorganize to_restriction code
expr::to_restriction is currently used to
take a restriction from the WHERE clause,
prepare it, perform some validation checks
and finally convert it to an instance of
the restriction class.

Soon we will get rid of the restriction class.

In preparation for that expr::to_restriction
is split into two independent parts:
* The part that prepares and validates a binary_operator
* The part that converts a binary_operator to restriction

Thanks to this split, getting rid of the restriction class
will be painless: we will just stop using the
second part.

This commit splits expr::to_restriction into two functions:
* validate_and_prepare_new_restriction
* convert_to_restriction
that handle each of those parts.

All helper validation methods in the anonymous namespace
are copied from the to_restriction.cc file.

to_restriction.cc isn't the best filename for the new functionality,
so it has been renamed to restrictions.hh/cc.
In the future all the code regarding restrictions could be
put there to reduce clutter in expression.hh/cc

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:16 +02:00
Jan Ciolek
5be574fe51 cql3: Fix IS NOT NULL check in to_restriction
expr::to_restriction performs a check to see if
the restriction is of form: `col IS NOT NULL`

There is a mistake in this check.
It uses is<null>(prepared_binop.rhs)
to determine if the right hand side of binary operator
is a null, but the binary operator is already prepared.

During preparation expr::null is converted to expr::constant
and that wouldn't be detected by this check.

The check has been changed to check for null constant instead
of expr::null.
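
The mistake can be reproduced in a small model, assuming an expression is a variant of an unprepared `null` and a prepared `constant`. These types and the `prepare`, `old_check` and `new_check` helpers are hypothetical simplifications, not the real cql3 definitions:

```cpp
#include <optional>
#include <variant>

struct null {};                       // unprepared NULL literal
struct constant {                     // prepared value; empty means NULL
    std::optional<int> value;
    bool is_null() const { return !value.has_value(); }
};
using expression = std::variant<null, constant>;

// Preparation turns the NULL literal into a null constant.
inline expression prepare(const expression& e) {
    if (std::holds_alternative<null>(e)) {
        return constant{std::nullopt};
    }
    return e;
}

// Old check: looks for the unprepared alternative, so it misses prepared NULLs.
inline bool old_check(const expression& rhs) {
    return std::holds_alternative<null>(rhs);
}

// New check: looks for a constant that holds NULL.
inline bool new_check(const expression& rhs) {
    const auto* c = std::get_if<constant>(&rhs);
    return c != nullptr && c->is_null();
}
```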

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:15 +02:00
Jan Ciolek
4142b27d85 cql3: Swap order of arguments in error message
The error message displays two arguments in
a specific order, but the tests actually
expect them to be swapped.

Swap the arguments to match the expected
error messages in tests.

It wasn't detected earlier because the
check was never reached, but this will change
soon in the following commits.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:13 +02:00
Nadav Har'El
a504d120d0 Merge 'docs: migrate the docs from the scylla-docs repo' from Anna Stuchlik
This PR migrates the ScyllaDB end-user documentation from the [scylla-docs](https://github.com/scylladb/scylla-docs/) repository, according to the [migration plan](https://docs.google.com/document/d/15yBf39j15hgUVvjeuGR4MCbYeArqZrO1ir-z_1Urc6A/edit?usp=sharing). All the files are added to the `docs` subfolder.

 **This PR does not cover any content changes.**

How to test this PR:

1. Go to `scylla/docs`.
2. Run `make preview`. The docs should build without any warnings.
3. Open http://127.0.0.1:5500/ in your browser. You should see the documentation landing page:
![image](https://user-images.githubusercontent.com/37244380/177358869-af9f1b78-e528-4d0d-9479-cc69e25f3b67.png)

Closes #10976

* github.com:scylladb/scylla:
  doc: fix errors -fix the indent in the conf.py file
  doc: fix the path to Alternator
  doc: fix errors - add Alternator to the toctree
  doc: fix errors- update the conf.py file
  doc: fix errors - remove the CNAME file
  doc: add the CNAME and robots files
  doc: move index and README from scylla-docs repo
  doc: move the documentation from the scylla-docs repo
  doc: remove the old index file
2022-07-11 15:26:06 +03:00
Avi Kivity
957bf48eb2 Merge 'Don't throw exceptions on the replica side when handling single partition reads and writes' from Piotr Dulikowski
This PR gets rid of exception throws/rethrows on the replica side for writes and single-partition reads. This goal is achieved without using `boost::outcome` but rather by replacing the parts of the code which throw with appropriate seastar idioms and by introducing two helper functions:

1. `try_catch` allows inspecting the type and value behind an `std::exception_ptr`. When libstdc++ is used, this function does not need to throw the exception and avoids the very costly unwind process. This is based on the "How to catch an exception_ptr without even try-ing" proposal mentioned in https://github.com/scylladb/scylla/issues/10260.

This function can replace the current `try..catch` chains which inspect the exception type and account for it in the metrics.

Example:

```c++
// Before
try {
    std::rethrow_exception(eptr);
} catch (std::runtime_error& ex) {
    // 1
} catch (...) {
    // 2
}

// After
if (auto* ex = try_catch<std::runtime_error>(eptr)) {
    // 1
} else {
    // 2
}
```

2. `make_nested_exception_ptr` which is meant to be a replacement for `std::throw_with_nested`. Unlike the original function, it does not require an exception being currently thrown and does not throw itself - instead, it takes the nested exception as an `std::exception_ptr` and produces another `std::exception_ptr` itself.

Apart from the above, seastar idioms such as `make_exception_future`, `co_await as_future`, `co_return coroutine::exception()` are used to propagate exceptions without throwing. This brings the number of exception throws to zero for single partition reads and writes (tested with scylla-bench, --mode=read and --mode=write).

Results from `perf_simple_query`:

```
Before (719724e4df):
  Writes:
    Normal:
      127841.40 tps ( 56.2 allocs/op,  13.2 tasks/op,   50042 insns/op,        0 errors)
    Timeouts:
      94770.81 tps ( 53.1 allocs/op,   5.1 tasks/op,   78678 insns/op,  1000000 errors)
  Reads:
    Normal:
      138902.31 tps ( 65.1 allocs/op,  12.1 tasks/op,   43106 insns/op,        0 errors)
    Timeouts:
      62447.01 tps ( 49.7 allocs/op,  12.1 tasks/op,  135984 insns/op,   936846 errors)

After (d8ac4c02bfb7786dc9ed30d2db3b99df09bf448f):
  Writes:
    Normal:
      127359.12 tps ( 56.2 allocs/op,  13.2 tasks/op,   49782 insns/op,        0 errors)
    Timeouts:
      163068.38 tps ( 52.1 allocs/op,   5.1 tasks/op,   40615 insns/op,  1000000 errors)
  Reads:
    Normal:
      151221.15 tps ( 65.1 allocs/op,  12.1 tasks/op,   43028 insns/op,        0 errors)
    Timeouts:
      192094.11 tps ( 41.2 allocs/op,  12.1 tasks/op,   33403 insns/op,   960604 errors)
```

Closes #10368

* github.com:scylladb/scylla:
  database: avoid rethrows when handling exceptions from commitlog
  database: convert throw_commitlog_add_error to use make_nested_exception_ptr
  utils: add make_nested_exception_ptr
  storage_proxy: don't rethrow when inspecting replica exceptions on write path
  database: don't rethrow rate_limit_exception
  storage_proxy: don't rethrow the exception in abstract_read_resolver::error
  utils/exceptions.cc: don't rethrow in is_timeout_exception
  utils/exceptions: add try_catch
  utils: add abi/eh_ia64.hh
  storage_proxy: don't rethrow exceptions from replicas when accounting read stats
  message: get rid of throws in send_message{,_timeout,_abortable}
  database/{query,query_mutations}: don't rethrow read semaphore exceptions
2022-07-11 14:01:41 +03:00
Anna Stuchlik
0bf99b25c9 doc: fix errors -fix the indent in the conf.py file
2022-07-11 12:31:59 +02:00
Anna Stuchlik
ae9ed315d1 doc: fix the path to Alternator
2022-07-11 12:31:59 +02:00
Anna Stuchlik
a1d9f0f0c8 doc: fix errors - add Alternator to the toctree
2022-07-11 12:31:30 +02:00
Anna Stuchlik
81949bbc7a doc: fix errors- update the conf.py file
2022-07-11 12:18:47 +02:00
Anna Stuchlik
2e95bd0ed1 doc: fix errors - remove the CNAME file
2022-07-11 12:17:33 +02:00
Anna Stuchlik
7b5dfde56a doc: add the CNAME and robots files
2022-07-11 12:16:53 +02:00
Anna Stuchlik
8d86dfa929 doc: move index and README from scylla-docs repo
2022-07-11 12:14:40 +02:00
Anna Stuchlik
6e97b83b60 doc: move the documentation from the scylla-docs repo
2022-07-11 12:14:02 +02:00
Anna Stuchlik
bb41457f73 doc: remove the old index file
2022-07-11 12:12:15 +02:00
Nadav Har'El
cc69177dcc config: fix printing of experimental feature list
Recently we noticed a regression where with certain versions of the fmt
library,

   SELECT value FROM system.config WHERE name = 'experimental_features'

returns string numbers, like "5", instead of feature names like "raft".

It turns out that the fmt library keeps changing its overload resolution
order when there are several ways to print something. For enum_option<T> we
happen to have two conflicting ways to print it:
  1. We have an explicit operator<<.
  2. We have an *implicit* converter to the type held by T.

We were hoping that the operator<< always wins. But in fmt 8.1, there is
special logic that if the type is convertible to an int, this is used
before operator<<()! For experimental_features_t, the type held in it was
an old-style enum, so it is indeed convertible to int.

The solution I used in this patch is to replace the old-style enum
in experimental_features_t by the newer and more recommended "enum class",
which does not have an implicit conversion to int.

I could have fixed it in other ways, but it wouldn't have been much
prettier. For example, dropping the implicit converter would require
us to change a bunch of switch() statements over enum_option (and
not just experimental_features_t, but other types of enum_option).

Going forward, all uses of enum_option should use "enum class", not
"enum". tri_mode_restriction_t was already using an enum class, and
now so does experimental_features_t. I changed the examples in the
comments to also use "enum class" instead of enum.

This patch also adds to the existing experimental_features test a
check that the feature names are words that are not numbers.
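
The language rule behind this fix can be checked in isolation. The enum names below are hypothetical and only illustrate that an unscoped enum converts implicitly to int while an "enum class" does not:

```cpp
#include <type_traits>

enum old_feature { OLD_UDF, OLD_RAFT };      // old-style (unscoped) enum
enum class new_feature { UDF, RAFT };        // scoped enum

// An unscoped enum converts implicitly to int, so a formatter that prefers
// integer conversions may print it as a number.
static_assert(std::is_convertible_v<old_feature, int>);

// A scoped enum has no implicit conversion to int.
static_assert(!std::is_convertible_v<new_feature, int>);
```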

Fixes #11003.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11004
2022-07-11 09:17:30 +02:00
Nadav Har'El
4a4d9ec9c0 cql-pytest: remove "xfail" mark from two passing tests
Fix two cql-pytest tests that have been "XPASS"ing (unexpectedly passing)
by removing the "xfail" (expecting failure) mark from them:

One test was for an issue that has already been fixed (refs #10081).
The second test was a translated Cassandra test that should never
have failed because it doesn't trigger the issue that supposedly failed
it (that test sets a large value for a non-indexed column, so doesn't
trigger the problem we have with large values in an indexed column).

Closes #11006
2022-07-11 08:34:19 +03:00
Nadav Har'El
0a71151bc4 test/cql-pytest: avoid deprecation message
When running test/cql-pytest, pytest prints one warning at the end:

   /home/nyh/scylla/test/cql-pytest/test_secondary_index.py:82: DeprecationWarning: ResultSet indexing support will be removed in 4.0.
   Consider using ResultSet.one() to get a single row.
   assert any([index_name in event.description for event in cql.execute(query, trace=True).get_query_trace().events])

So in this patch I do exactly what the warning recommends - use one().

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11002
2022-07-11 08:01:23 +03:00
Nadav Har'El
2581b54ea0 test/{alternator,redis}: stop using deprecated "distutils" package
Python has deprecated the distutils package. In several places in the
Alternator and Redis test suites, we used distutils.version to check if
the library is new enough for running the test (and skip the test if
it's too old). On new versions of Python, we started getting deprecation
warnings such as:

    DeprecationWarning: The distutils package is deprecated and slated for
    removal in Python 3.12. Use setuptools or check PEP 632 for potential
    alternatives

PEP 632 recommends using packaging.version instead of distutils.version,
and indeed it works well. After applying this patch, Alternator and
Redis test runs no longer end in silly deprecation warnings.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11007
2022-07-11 08:00:45 +03:00
Benny Halevy
7e2d2cf1c1 table: snapshot: coroutine::return_exception_ptr
Otherwise, we lose the returned exception_ptr type.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11000
2022-07-10 17:56:24 +03:00
Nadav Har'El
2437a42b64 Merge 'cql-pytest: add test_permissions.py' from Piotr Sarna
This new test suite is expected to gather all kinds of permissions
tests - granting, revoking, authorizing, and so on.
Right now it contains a single minimal test which ensures that
the default superuser can be granted applicable permissions,
which they already have anyway.

The test suite added in this pull request will also be useful
when developing #10633 - permissions for UDF/UDA infrastructure.

Closes #10991

* github.com:scylladb/scylla:
  cql-pytest: add initial permissions test suite
  cql-pytest: enable CassandraAuthorizer for Scylla and Cassandra
2022-07-10 09:30:10 +03:00
cvybhu
80dda2bb97 cql3: expr: Fix handling reversed types in limits()
There was a bug which caused incorrect results of limits()
for columns with reversed clustering order.
Such columns have reversed_type as their type and this
needs to be taken into account when comparing them.

It was introduced in 6d943e6cd0.
This commit replaced uses of get_value_comparator
with type_of. The difference between them is that
get_value_comparator applied ->without_reversed()
on the result type.

Because the type was reversed, comparisons like
1 < 2 evaluated to false.
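
A toy model of the effect, assuming a type object that flips its comparison when reversed. The `int_type` struct is hypothetical; only the name `without_reversed` comes from the description above:

```cpp
// Hypothetical stand-in for a column type that may be reversed.
struct int_type {
    bool reversed = false;

    // Ordering used for clustering: flipped when the type is reversed.
    constexpr bool less(int a, int b) const {
        return reversed ? b < a : a < b;
    }

    // The underlying type with natural ordering.
    constexpr int_type without_reversed() const {
        return int_type{false};
    }
};

// Comparing values through the reversed type gives the inverted answer...
static_assert(!int_type{true}.less(1, 2));
// ...while stripping the reversal first restores the natural order.
static_assert(int_type{true}.without_reversed().less(1, 2));
```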

This caused the test testIndexOnKeyWithReverseClustering
to fail, but sadly it wasn't caught by CI because
the CI itself has a bug that makes it skip some tests.
The test passes now, although it has to be run manually
to check that.

Fixes: #10918

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>

Closes #10994
2022-07-10 09:24:06 +03:00
Nadav Har'El
a7fa29bceb cross-tree: fix header file self-sufficiency
Scylla's coding standard requires that each header is self-sufficient,
i.e., it includes whatever other headers it needs - so it can be included
without having to include any other header before it.

We have a test for this, "ninja dev-headers", but it isn't run very
frequently, and it turns out our code deviated from this requirement
in a few places. This patch fixes those places, and after it
"ninja dev-headers" succeeds again.

Fixes #10995

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #10997
2022-07-08 12:59:14 +03:00
Avi Kivity
3b20407f25 Merge 'db: Avoid memtable flush latency on schema merge' from Tomasz Grabiec
Currently, applying schema mutations involves flushing all schema
tables so that on restart commit log replay is performed on top of
latest schema (for correctness). The downside is that schema merge is
very sensitive to fdatasync latency. Flushing a single memtable
involves many syncs, and we flush several of them. It was observed to
take as long as 30 seconds on GCE disks under some conditions.

This patch changes the schema merge to rely on a separate commit log
to replay the mutations on restart. This way it doesn't have to wait
for memtables to be flushed. It has to wait for the commitlog to be
synced, but this cost is well amortized.

We put the mutations into a separate commit log so that schema can be
recovered before replaying user mutations. This is necessary because
regular writes have a dependency on schema version, and replaying on
top of latest schema satisfies all dependencies. Without this, we
could get loss of writes if we replay a write which depends on the
latest schema on top of old schema.

Also, if we have a separate commit log for schema we can delay schema
parsing for after the replay and avoid complexity of recognizing
schema transactions in the log and invoking the schema merge logic.

I reproduced bad behavior locally on my machine with a tired (high latency)
SSD disk, load driver remote. Under high load, I saw table alter (server-side part) taking
up to 10 seconds before. After the patch, it takes up to 200 ms (50:1 improvement).
Without load, it is 300ms vs 50ms.

Fixes #8272
Fixes #8309
Fixes #1459

Closes #10333

* github.com:scylladb/scylla:
  config: Introduce force_schema_commit_log option
  config: Introduce unsafe_ignore_truncation_record
  db: Avoid memtable flush latency on schema merge
  db: Allow splitting initialization of system tables
  db: Flush system.scylla_local on change
  migration_manager: Do not drop system.IndexInfo on keyspace drop
  Introduce SCHEMA_COMMITLOG cluster feature
  frozen_mutation: Introduce freeze/unfreeze helpers for vectors of mutations
  db/commitlog: Improve error messages in case of unknown column mapping
  db/commitlog: Fix error format string to print the version
  db: Introduce multi-table atomic apply()
2022-07-07 16:03:50 +03:00
Benny Halevy
acae3cc223 treewide: stop use of deprecated coroutine::make_exception
Convert most use sites from `co_return coroutine::make_exception`
to `co_await coroutine::return_exception{,_ptr}` where possible.

In cases this is done in a catch clause, convert to
`co_return coroutine::exception`, generating an exception_ptr
if needed.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #10972
2022-07-07 15:02:16 +03:00
Piotr Sarna
2e61e50e97 cql-pytest: add initial permissions test suite
This new test suite is expected to gather all kinds of permissions
tests - granting, revoking, authorizing, and so on.
Right now it contains a single minimal test which ensures that
the default superuser can be granted applicable permissions,
which they already have anyway.
2022-07-07 13:45:26 +02:00
Piotr Sarna
1dc116f4dc cql-pytest: enable CassandraAuthorizer for Scylla and Cassandra
In order to be able to test permissions, an authorizer different
than AllowAllAuthorizer (default) must be set.
CassandraAuthorizer is thus enabled - it works on default user/password
pair, so it doesn't introduce any regressions to the test suite.
2022-07-07 13:45:26 +02:00
Avi Kivity
bfc521ee9c Merge "Activate compaction_throughput_mb_per_sec option" from Pavel E
"
The option controls the IO bandwidth of the compaction sched class.
It's now set to be 16MB/s, but is unused. This set makes it 0 by
default (which means unlimited), live-updateable and plugs it to the
seastar sched group IO throttling.

branch: https://github.com/xemul/scylla/tree/br-compaction-throttling-3
tests: unit(dev),
       v2: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1010/ ,
       v2: manual config update
"

* 'br-compaction-throttling-3-a' of https://github.com/xemul/scylla:
  compaction_manager: Add compaction throughput limit
  updateable_value: Support dummy observing
  serialized_action: Allow being observer for updateable_value
  config: Tune the config option
2022-07-07 13:14:07 +03:00
Tomasz Grabiec
6622e3369a config: Introduce force_schema_commit_log option
2022-07-06 22:08:56 +02:00
Tomasz Grabiec
b8d20335a4 config: Introduce unsafe_ignore_truncation_record
The node now refuses to boot if schema tables were truncated.
This adds a config option to ignore truncation records as a
workaround if user truncated them manually.
2022-07-06 22:08:56 +02:00
Tomasz Grabiec
6b316f267f db: Avoid memtable flush latency on schema merge
Currently, applying schema mutations involves flushing all schema
tables so that on restart commit log replay is performed on top of
latest schema (for correctness). The downside is that schema merge is
very sensitive to fdatasync latency. Flushing a single memtable
involves many syncs, and we flush several of them. It was observed to
take as long as 30 seconds on GCE disks under some conditions.

This patch changes the schema merge to rely on a separate commit log
to replay the mutations on restart. This way it doesn't have to wait
for memtables to be flushed. It has to wait for the commitlog to be
synced, but this cost is well amortized.

We put the mutations into a separate commit log so that schema can be
recovered before replaying user mutations. This is necessary because
regular writes have a dependency on schema version, and replaying on
top of latest schema satisfies all dependencies. Without this, we
could get loss of writes if we replay a write which depends on the
latest schema on top of old schema.

Also, if we have a separate commit log for schema we can delay schema
parsing for after the replay and avoid complexity of recognizing
schema transactions in the log and invoking the schema merge logic.

One complication with this change is that replay_position markers are
commitlog-domain specific and cannot cross domains. They are recorded
in various places which survive node restart: sstables are annotated
with the maximum replay position, and they are present inside
truncation records. The former annotation is used by "truncate"
operation to drop sstables. To prevent old replay positions from being
interpreted in the context of the new schema commitlog domain, the
change refuses to boot if there are truncation records, and also
prohibits truncation of schema tables.

The boot sequence needs to know whether the cluster feature associated
with this change was enabled on all nodes. Features are stored in
system.scylla_local. Because we need to read it before initializing
schema tables, the initialization of tables now has to be split into
two phases. The first phase initializes all system tables except
schema tables, and later we initialize schema tables, after reading
stored cluster features.

The commitlog domain is switched only when all nodes are upgraded, and
only after new node is restarted. This is so that we don't have to add
risky code to deal with hot-switching of the commitlog domain. Cold
switching is safer. This means that after upgrade there is a need for
yet another rolling restart round.

Fixes #8272
Fixes #8309
Fixes #1459
2022-07-06 22:08:56 +02:00