Commit Graph

32449 Commits

Author SHA1 Message Date
Nadav Har'El
c8bb147f84 Merge 'cql3: don't ignore other restrictions when a multi column restriction is present during filtering' from Jan Ciołek
When filtering with multi column restriction present all other restrictions were ignored.
So a query like:
`SELECT * FROM WHERE pk = 0 AND (ck1, ck2) < (0, 0) AND regular_col = 0 ALLOW FILTERING;`
would ignore the restriction `regular_col = 0`.

This was caused by a bug in the filtering code:
2779a171fc/cql3/selection/selection.cc (L433-L449)

When multi column restrictions were detected, the code checked if they are satisfied and returned immediately.
This is fixed by returning only when these restrictions are not satisfied. When they are satisfied the other restrictions are checked as well to ensure all of them are satisfied.

This code was introduced back in 2019, when fixing #3574.
Perhaps back then it was impossible to mix multi column and regular columns and this approach was correct.

Fixes: #6200
Fixes: #12014

Closes #12031

* github.com:scylladb/scylladb:
  cql-pytest: add a reproducer for #12014, verify that filtering multi column and regular restrictions works
  boost/restrictions-test: uncomment part of the test that passes now
  cql-pytest: enable test for filtering combined multi column and regular column restrictions
  cql3: don't ignore other restrictions when a multi column restriction is present during filtering

(cherry picked from commit 2d2034ea28)
2022-11-21 14:02:33 +02:00
Kamil Braun
dc92ec4c8b docs: a single 5.0 -> 5.1 upgrade guide
There were 4 different pages for upgrading Scylla 5.0 to 5.1 (and the
same is true for other version pairs, but I digress) for different
environments:
- "ScyllaDB Image for EC2, GCP, and Azure"
- Ubuntu
- Debian
- RHEL/CentOS

THe Ubuntu and Debian pages used a common template:
```
.. include:: /upgrade/_common/upgrade-guide-v5-ubuntu-and-debian-p1.rst
.. include:: /upgrade/_common/upgrade-guide-v5-ubuntu-and-debian-p2.rst
```
with different variable substitutions.

The "Image" page used a similar template, with some extra content in the
middle:
```
.. include:: /upgrade/_common/upgrade-guide-v5-ubuntu-and-debian-p1.rst
.. include:: /upgrade/_common/upgrade-image-opensource.rst
.. include:: /upgrade/_common/upgrade-guide-v5-ubuntu-and-debian-p2.rst
```

The RHEL/CentOS page used a different template:
```
.. include:: /upgrade/_common/upgrade-guide-v4-rpm.rst
```

This was an unmaintainable mess. Most of the content was "the same" for
each of these options. The only content that must actually be different
is the part with package installation instructions (e.g. calls to `yum`
vs `apt-get`). The rest of the content was logically the same - the
differences were mistakes, typos, and updates/fixes to the text that
were made in some of these docs but not others.

In this commit I prepare a single page that covers the upgrade and
rollback procedures for each of these options. The section dependent on
the system was implemented using Sphinx Tabs.

I also fixed and changed some parts:

- In the "Gracefully stop the node" section:
Ubuntu/Debian/Images pages had:

```rst
.. code:: sh

   sudo service scylla-server stop
```

RHEL/CentOS pages had:
```rst
.. code:: sh

.. include:: /rst_include/scylla-commands-stop-index.rst
```

the stop-index file contained this:
```rst
.. tabs::

   .. group-tab:: Supported OS

      .. code-block:: shell

         sudo systemctl stop scylla-server

   .. group-tab:: Docker

      .. code-block:: shell

         docker exec -it some-scylla supervisorctl stop scylla

      (without stopping *some-scylla* container)
```

So the RHEL/CentOS version had two tabs: one for Scylla installed
directly on the system, one for Scylla running in Docker - which is
interesting, because nothing anywhere else in the upgrade documents
mentions Docker.  Furthermore, the RHEL/CentOS version used `systemctl`
while the ubuntu/debian/images version used `service` to stop/start
scylla-server.  Both work on modern systems.

The Docker option is completely out of place - the rest of the upgrade
procedure does not mention Docker. So I decided it doesn't make sense to
include it. Docker documentation could be added later if we actually
decide to write upgrade documentation when using Docker...  Between
`systemctl` and `service` I went with `service` as it's a bit
higher-level.

- Similar change for "Start the node" section, and corresponding
  stop/start sections in the Rollback procedure.

- To reuse text for Ubuntu and Debian, when referencing "ScyllaDB deb
  repo" in the Debian/Ubuntu tabs, I provide two separate links: to
  Debian and Ubuntu repos.

- the link to rollback procedure in the RPM guide (in 'Download and
  install the new release' section) pointed to rollback procedure from
  3.0 to 3.1 guide... Fixed to point to the current page's rollback
  procedure.

- in the rollback procedure steps summary, the RPM version missed the
  "Restore system tables" step.

- in the rollback procedure, the repository links were pointing to the
  new versions, while they should point to the old versions.

There are some other pre-existing problems I noticed that need fixing:

- EC2/GCP/Azure option has no corresponding coverage in the rollback
  section (Download and install the old release) as it has in the
  upgrade section. There is no guide for rolling back 3rd party and OS
  packages, only Scylla. I left a TODO in a comment.
- the repository links assume certain Debian and Ubuntu versions (Debian
  10 and Ubuntu 20), but there are more available options (e.g. Ubuntu
  22). Not sure how to deal with this problem. Maybe a separate section
  with links? Or just a generic link without choice of platform/version?

Closes #11891

(cherry picked from commit 0c7ff0d2cb)

Backport notes:
Funnily, the 5.1 branch did not have the upgrade guide to 5.1 at all. It
was only in `master`. So the backport does not remove files, only adds
new ones.
I also had to add:
- an additional link in the upgrade-opensource index to the 5.1 upgrade
  page (it was already in upstream `master` when the cherry-picked commit
  was added)
- the list of new metrics, which was also completely missing in
  branch-5.1.

Closes #12034
2022-11-21 13:58:41 +02:00
Tzach Livyatan
8f7e3275a2 Update Alternator Markdown file to use automatic link notation
Closes #11335

(cherry picked from commit 8fc58300ea)
2022-11-21 09:56:10 +02:00
Yaron Kaikov
40a1905a2d release: prepare for 5.1.0-rc5 scylla-5.1.0-rc5 2022-11-19 13:41:28 +02:00
Avi Kivity
4e2c436222 Merge 'doc: add the upgrade guide from 5.0 to 2022.1 on Ubuntu 20.04' from Anna Stuchlik
Ubuntu 22.04 is supported by both ScyllaDB Open Source 5.0 and Enterprise 2022.1.

Closes #11227

* github.com:scylladb/scylladb:
  doc: add the redirects from Ubuntu version specific to version generic pages
  doc: remove version-speific content for Ubuntu and add the generic page to the toctree
  doc: rename the file to include Ubuntu
  doc: remove the version number from the document and add the link to Supported Versions
  doc: add a generic page for Ubuntu
  doc: add the upgrade guide from 5.0 to 2022.1 on Ubuntu 2022.1

(cherry picked from commit d4c986e4fa)
2022-11-18 17:06:00 +02:00
Botond Dénes
68be369f93 Merge 'doc: add the upgrade guide for ScyllaDB image from 2021.1 to 2022.1' from Anna Stuchlik
This PR is related to  https://github.com/scylladb/scylla-docs/issues/4124 and https://github.com/scylladb/scylla-docs/issues/4123.

**New Enterprise Upgrade Guide from 2021.1 to 2022.2**
I've added the upgrade guide for ScyllaDB Enterprise image. In consists of 3 files:
/upgrade/_common/upgrade-guide-v2022-ubuntu-and-debian-p1.rst
upgrade/_common/upgrade-image.rst
 /upgrade/_common/upgrade-guide-v2022-ubuntu-and-debian-p2.rst

**Modified Enterprise Upgrade Guides 2021.1 to 2022.2**
I've modified the existing guides for Ubuntu and Debian to use the same files as above, but exclude the image-related information:
/upgrade/_common/upgrade-guide-v2022-ubuntu-and-debian-p1.rst + /upgrade/_common/upgrade-guide-v2022-ubuntu-and-debian-p2.rst = /upgrade/_common/upgrade-guide-v2022-ubuntu-and-debian.rst

To make things simpler and remove duplication, I've replaced the guides for Ubuntu 18 and 20 with a generic Ubuntu guide.

**Modified Enterprise Upgrade Guides from 4.6 to 5.0**
These guides included a bug: they included the image-related information (about updating OS packages), because a file that includes that information was included by mistake. What's worse, it was duplicated. After the includes were removed, image-related information is no longer included in the Ubuntu and Debian guides (this fixes https://github.com/scylladb/scylla-docs/issues/4123).

I've modified the index file to be in sync with the updates.

Closes #11285

* github.com:scylladb/scylladb:
  doc: reorganize the content to list the recommended way of upgrading the image first
  doc: update the image upgrade guide for ScyllaDB image to include the location of the manifest file
  doc: fix the upgrade guides for Ubuntu and Debian by removing image-related information
  doc: update the guides for Ubuntu and Debian to remove image information and the OS version number
  doc: add the upgrade guide for ScyllaDB image from 2021.1 to 2022.1

(cherry picked from commit dca351c2a6)
2022-11-18 17:05:14 +02:00
Botond Dénes
0f7adb5f47 Merge 'doc: change the tool names to "Scylla SStable" and "Scylla Types"' from Anna Stuchlik
Fix https://github.com/scylladb/scylladb/issues/11393

- Rename the tool names across the docs.
- Update the examples to replace `scylla-sstable` and `scylla-types` with `scylla sstable` and `scylla types`, respectively.

Closes #11432

* github.com:scylladb/scylladb:
  doc: update the tool names in the toctree and reference pages
  doc: rename the scylla-types tool as Scylla Types
  doc: rename the scylla-sstable tool as Scylla SStable

(cherry picked from commit 2c46c24608)
2022-11-18 17:04:37 +02:00
Avi Kivity
82dc8357ef Merge 'Docs: document how scylla-sstable obtains its schema' from Botond Dénes
This is a very important aspect of the tool that was completely missing from the document before. Also add a comparison with SStableDump.
Fixes: https://github.com/scylladb/scylladb/issues/11363

Closes #11390

* github.com:scylladb/scylladb:
  docs: scylla-sstable.rst: add comparison with SStableDump
  docs: scylla-sstable.rst: add section about providing the schema

(cherry picked from commit 2ab5cbd841)
2022-11-18 17:01:17 +02:00
Anna Stuchlik
12a58957e2 doc: fix the upgrade version in the upgrade guide for RHEL and CentOS
Closes #11477

(cherry picked from commit 0dee507c48)
2022-11-18 16:59:44 +02:00
Botond Dénes
3423ad6e38 Merge 'doc: update the default SStable format' from Anna Stuchlik
The purpose of this PR is to update the information about the default SStable format.
It

Closes #11431

* github.com:scylladb/scylladb:
  doc: simplify the information about default formats in different versions
  doc: update the SSTables 3.0 Statistics File Format to add the UUID host_id option of the ME format
  doc: add the information regarding the ME format to the SSTables 3.0 Data File Format page
  doc: fix additional information regarding the ME format on the SStable 3.x page
  doc: add the ME format to the table
  add a comment to remove the information when the documentation is versioned (in 5.1)
  doc: replace Scylla with ScyllaDB
  doc: fix the formatting and language in the updated section
  doc: fix the default SStable format

(cherry picked from commit a0392bc1eb)
2022-11-18 16:58:14 +02:00
Anna Stuchlik
64001719fa doc: remove the section about updating OS packages during upgrade from upgrade guides for Ubunut and Debian (from 4.5 to 4.6)
Closes #11629

(cherry picked from commit c5285bcb14)
2022-11-18 16:56:29 +02:00
AdamStawarz
cc3d368bc8 Update tombstones-flush.rst
change syntax:

nodetool compact <keyspace>.<mytable>;
to
nodetool compact <keyspace> <mytable>;

Closes #11904

(cherry picked from commit 6bc455ebea)
2022-11-18 16:52:06 +02:00
Botond Dénes
d957b0044b Merge 'doc: improve the documentation landing page ' from Anna Stuchlik
This PR introduces the following changes to the documentation landing page:

- The " New to ScyllaDB? Start here!" box is added.
- The "Connect your application to Scylla" box is removed.
- Some wording has been improved.
- "Scylla" has been replaced with "ScyllaDB".

Closes #11896

* github.com:scylladb/scylladb:
  Update docs/index.rst
  doc: replace Scylla with ScyllaDB on the landing page
  doc: improve the wording on the landing page
  doc: add the link to the ScyllaDB Basics page to the documentation landing page

(cherry picked from commit 2b572d94f5)
2022-11-18 16:51:26 +02:00
Botond Dénes
d4ed67bd47 Merge 'doc: cql-extensions.md: improve description of synchronous views' from Nadav Har'El
It was pointed out to me that our description of the synchronous_updates
materialized-view option does not make it clear enough what is the
default setting, or why a user might want to use this option.

This patch changes the description to (I hope) better address these
issues.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11404

* github.com:scylladb/scylladb:
  doc: cql-extensions.md: replace "Scylla" by "ScyllaDB"
  doc: cql-extensions.md: improve description of synchronous views

(cherry picked from commit b9fc504fb2)
2022-11-18 16:44:38 +02:00
Nadav Har'El
0cd6341cae Merge 'doc: document user defined functions (UDFs)' from Anna Stuchlik
This PR is V2 of the[ PR created by @psarna.](https://github.com/scylladb/scylladb/pull/11560).
I have:
- copied the content.
- applied the suggestions left by @nyh.
- made minor improvements, such as replacing "Scylla" with "ScyllaDB", fixing punctuation, and fixing the RST syntax.

Fixes https://github.com/scylladb/scylladb/issues/11378

Closes #11984

* github.com:scylladb/scylladb:
  doc: label user-defined functions as Experimental
  doc: restore the note for the Count function (removed by mistatke)
  doc: document user defined functions (UDFs)

(cherry picked from commit 7cbb0b98bb)
2022-11-18 16:43:53 +02:00
Nadav Har'El
23d8852a82 Merge 'doc: update the "Counting all rows in a table is slow" page' from Anna Stuchlik
Fix https://github.com/scylladb/scylladb/issues/11373

- Updated the information on the "Counting all rows in a table is slow" page.
- Added COUNT to the list of selectors of the SELECT statement (somehow it was missing).
- Added the note to the description of the COUNT() function with a link to the KB page for troubleshooting if necessary. This will allow the users to easily find the KB page.

Closes #11417

* github.com:scylladb/scylladb:
  doc: add a comment to remove the note in version 5.1
  doc: update the information on the Countng all rows page and add the recommendation to upgrade ScyllaDB
  doc: add a note to the description of COUNT with a reference to the KB article
  doc: add COUNT to the list of acceptable selectors of the SELECT statement

(cherry picked from commit 22bb35e2cb)
2022-11-18 16:28:43 +02:00
Aleksandra Martyniuk
88016de43e compaction: request abort only once in compaction_data::stop
compaction_manager::task (and thus compaction_data) can be stopped
because of many different reasons. Thus, abort can be requested more
than once on compaction_data abort source causing a crash.

To prevent this before each request_abort() we check whether an abort
was requested before.

Closes #12004

(cherry picked from commit 7ead1a7857)

Fixes #12002.
2022-11-17 19:15:43 +02:00
Asias He
bdecf4318a gossip: Improve get_live_token_owners and get_unreachable_token_owners
The get_live_token_owners returns the nodes that are part of the ring
and live.

The get_unreachable_token_owners returns the nodes that are part of the ring
and is not alive.

The token_metadata::get_all_endpoints returns nodes that are part of the
ring.

The patch changes both functions to use the more authoritative source to
get the nodes that are part of the ring and call is_alive to check if
the node is up or down. So that the correctness does not depend on
any derived information.

This patch fixes a truncate issue in storage_proxy::truncate_blocking
where it calls get_live_token_owners and get_unreachable_token_owners to
decide the nodes to talk with for truncate operation. The truncate
failed because incorrect nodes were returned.

Fixes #10296
Fixes #11928

Closes #11952

(cherry picked from commit 16bd9ec8b1)
2022-11-17 14:30:43 +02:00
Eliran Sinvani
72bf244ad1 cql: Fix crash upon use of the word empty for service level name
Wrong access to an uninitialized token instead of the actual
generated string caused the parser to crash, this wasn't
detected by the ANTLR3 compiler because all the temporary
variables defined in the ANTLR3 statements are global in the
generated code. This essentialy caused a null dereference.

Tests: 1. The fixed issue scenario from github.
       2. Unit tests in release mode.

Fixes #11774

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <20190612133151.20609-1-eliransin@scylladb.com>

Closes #11777

(cherry picked from commit ab7429b77d)
2022-11-10 20:42:59 +02:00
Botond Dénes
ee82323599 db/view/view_builder: don't drop partition and range tombstones when resuming
The view builder builds the views from a given base table in
view_builder::batch_size batches of rows. After processing this many
rows, it suspends so the view builder can switch to building views for
other base tables in the name of fairness. When resuming the build step
for a given base table, it reuses the reader used previously (also
serving the role of a snapshot, pinning sstables read from). The
compactor however is created anew. As the reader can be in the middle of
a partition, the view builder injects a partition start into the
compactor to prime it for continuing the partition. This however only
included the partition-key, crucially missing any active tombstones:
partition tombstone or -- since the v2 transition -- active range
tombstone. This can result in base rows covered by either of this to be
resurrected and the view builder to generate view updates for them.
This patch solves this by using the detach-state mechanism of the
compactor which was explicitly developed for situations like this (in
the range scan code) -- resuming a read with the readers kept but the
compactor recreated.
Also included are two test cases reproducing the problem, one with a
range tombstone, the other with a partition tombstone.

Fixes: #11668

Closes #11671

(cherry picked from commit 5621cdd7f9)
2022-11-07 11:45:37 +02:00
Alexander Turetskiy
2f78df92ab Alternator: Projection field added to return from DescribeTable which describes GSIs and LSIs.
The return from DescribeTable which describes GSIs and LSIs is missing
the Projection field. We do not yet support all the settings Projection
(see #5036), but the default which we support is ALL, and DescribeTable
should return that in its description.

Fixes #11470

Closes #11693

(cherry picked from commit 636e14cc77)
2022-11-07 10:36:04 +02:00
Takuya ASADA
e2809674d2 locator::ec2_snitch: Retry HTTP request to EC2 instance metadata service
EC2 instance metadata service can be busy, ret's retry to connect with
interval, just like we do in scylla-machine-image.

Fixes #10250

Signed-off-by: Takuya ASADA <syuu@scylladb.com>

Closes #11688

(cherry picked from commit 6b246dc119)
2022-11-06 15:43:06 +02:00
Yaron Kaikov
0295d0c5c8 release: prepare for 5.1.0-rc4 scylla-5.1.0-rc4 2022-11-06 14:49:29 +02:00
Botond Dénes
fa94222662 Merge 'Alternator, MV: fix bug in some view updates which set the view key to its existing value' from Nadav Har'El
As described in issue #11801, we saw in Alternator when a GSI has both partition and sort keys which were non-key attributes in the base, cases where updating the GSI-sort-key attribute to the same value it already had caused the entire GSI row to be deleted.

In this series fix this bug (it was a bug in our materialized views implementation) and add a reproducing test (plus a few more tests for similar situations which worked before the patch, and continue to work after it).

Fixes #11801

Closes #11808

* github.com:scylladb/scylladb:
  test/alternator: add test for issue 11801
  MV: fix handling of view update which reassign the same key value
  materialized views: inline used-once and confusing function, replace_entry()

(cherry picked from commit e981bd4f21)
2022-11-01 13:14:21 +02:00
Pavel Emelyanov
dff7f3c5ba compaction_manager: Swallow ENOSPCs in ::stop()
When being stopped compaction manager may step on ENOSPC. This is not a
reason to fail stopping process with abort, better to warn this fact in
logs and proceed as if nothing happened

refs: #11245

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 15:36:44 +03:00
Pavel Emelyanov
3723713130 exceptions: Mark storage_io_error::code() with noexcept
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 15:36:38 +03:00
Pavel Emelyanov
03f8411e38 table: Handle storage_io_error's ENOSPC when flushing
Commit a9805106 (table: seal_active_memtable: handle ENOSPC error)
made memtable flushing code stand ENOSPC and continue flusing again
in the hope that the node administrator would provide some free space.

However, it looks like the IO code may report back ENOSPC with some
exception type this code doesn't expect. This patch tries to fix it

refs: #11245

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 15:35:26 +03:00
Pavel Emelyanov
0e391d67d1 table: Rewrap retry loop
The existing loop is very branchy in its attempts to find out whether or
not to abort. The "allowed_retries" count can be a good indicator of the
decision taken. This makes the code notably shorter and easier to extend

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 15:35:25 +03:00
Benny Halevy
f76989285e table: seal_active_memtable: handle ENOSPC error
Aborting too soon on ENOSPC is too harsh, leading to loss of
availability of the node for reads, while restarting it won't
solve the ENOSPC condition.

Fixes #11245

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11246
2022-10-13 15:35:19 +03:00
Beni Peled
9deeeb4db1 release: prepare for 5.1.0-rc3 scylla-5.1.0-rc3 2022-10-09 08:36:06 +03:00
Avi Kivity
1f3196735f Update tools/java submodule (cqlsh permissions)
* tools/java ad6764b506...b3959948dd (1):
  > install.sh is using wrong permissions for install cqlsh files

Fixes #11584.
2022-10-04 18:02:03 +03:00
Nadav Har'El
abb6817261 cql: validate bloom_filter_fp_chance up-front
Scylla's Bloom filter implementation has a minimal false-positive rate
that it can support (6.71e-5). When setting bloom_filter_fp_chance any
lower than that, the compute_bloom_spec() function, which writes the bloom
filter, throws an exception. However, this is too late - it only happens
while flushing the memtable to disk, and a failure at that point causes
Scylla to crash.

Instead, we should refuse the table creation with the unsupported
bloom_filter_fp_chance. This is also what Cassandra did six years ago -
see CASSANDRA-11920.

This patch also includes a regression test, which crashes Scylla before
this patch but passes after the patch (and also passes on Cassandra).

Fixes #11524.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11576

(cherry picked from commit 4c93a694b7)
2022-10-04 16:21:48 +03:00
Nadav Har'El
d3fd090429 alternator: return ProvisionedThroughput in DescribeTable
DescribeTable is currently hard-coded to return PAY_PER_REQUEST billing
mode. Nevertheless, even in PAY_PER_REQUEST mode, the DescribeTable
operation must return a ProvisionedThroughput structure, listing both
ReadCapacityUnits and WriteCapacityUnits as 0. This requirement is not
stated in some DynamoDB documentation but is explictly mentioned in
https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_ProvisionedThroughput.html
Also in empirically, DynamoDB returns ProvisionedThroughput with zeros
even in PAY_PER_REQUEST mode. We even had an xfailing test to confirm this.

The ProvisionedThroughput structure being missing was a problem for
applications like DynamoDB connectors for Spark, if they implicitly
assume that ProvisionedThroughput is returned by DescribeTable, and
fail (as described in issue #11222) if it's outright missing.

So this patch adds the missing ProvisionedThroughput structure, and
the xfailing test starts to pass.

Note that this patch doesn't change the fact that attempting to set
a table to PROVISIONED billing mode is ignored: DescribeTable continues
to always return PAY_PER_REQUEST as the billing mode and zero as the
provisioned capacities.

Fixes #11222

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11298

(cherry picked from commit 941c719a23)
2022-10-03 14:26:55 +03:00
Pavel Emelyanov
3e7c57d162 cross-shard-barrier: Capture shared barrier in complete
When cross-shard barrier is abort()-ed it spawns a background fiber
that will wake-up other shards (if they are sleeping) with exception.

This fiber is implicitly waited by the owning sharded service .stop,
because barrier usage is like this:

    sharded<service> s;
    co_await s.invoke_on_all([] {
        ...
        barrier.abort();
    });
    ...
    co_await s.stop();

If abort happens, the invoke_on_all() will only resolve _after_ it
queues up the waking lambdas into smp queues, thus the subseqent stop
will queue its stopping lambdas after barrier's ones.

However, in debug mode the queue can be shuffled, so the owning service
can suddenly be freed from under the barrier's feet causing use after
free. Fortunately, this can be easily fixed by capturing the shared
pointer on the shared barrier instead of a regular pointer on the
shard-local barrier.

fixes: #11303

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11553
2022-10-03 13:20:28 +03:00
Tomasz Grabiec
f878a34da3 test: lib: random_mutation_generator: Don't generate mutations with marker uncompacted with shadowable tombstone
The generator was first setting the marker then applied tombstones.

The marker was set like this:

  row.marker() = random_row_marker();

Later, when shadowable tombstones were applied, they were compacted
with the marker as expected.

However, the key for the row was chosen randomly in each iteration and
there are multiple keys set, so there was a possibility of a key clash
with an earlier row. This could override the marker without applying
any tombstones, which is conditional on random choice.

This could generate rows with markers uncompacted with shadowable tombstones.

This broken row_cache_test::test_concurrent_reads_and_eviction on
comparison between expected and read mutations. The latter was
compacted because it went through an extra merge path, which compacts
the row.

Fix by making sure there are no key clashes.

Closes #11663

(cherry picked from commit 5268f0f837)
2022-10-02 16:44:57 +03:00
Raphael S. Carvalho
eaded57b2e compaction: Properly handle stop request for off-strategy
If user stops off-strategy via API, compaction manager can decide
to give up on it completely, so data will sit unreshaped in
maintenance set, preventing it from being compacted with data
in the main set. That's problematic because it will probably lead
to a significant increase in read and space amplification until
off-strategy is triggered again, which cannot happen anytime
soon.

Let's handle it by moving data in maintenance set into main one,
even if unreshaped. Then regular compaction will be able to
continue from where off-strategy left off.

Fixes #11543.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #11545

(cherry picked from commit a04047f390)
2022-10-02 14:20:17 +03:00
Tomasz Grabiec
25d2da08d1 db: range_tombstone_list: Avoid quadratic behavior when applying
Range tombstones are kept in memory (cache/memtable) in
range_tombstone_list. It keeps them deoverlapped, so applying a range
tombstone which covers many range tombstones will erase existing range
tombstones from the list. This operation needs to be exception-safe,
so range_tombstone_list maintains an undo log. This undo log will
receive a record for each range tombstone which is removed. For
exception safety reasons, before pushing an undo log entry, we reserve
space in the log by calling std::vector::reserve(size() + 1). This is
O(N) where N is the number of undo log entries. Therefore, the whole
application is O(N^2).

This can cause reactor stalls and availability issues when replicas
apply such deletions.

This patch avoids the problem by reserving exponentially increasing
amount of space. Also, to avoid large allocations, switches the
container to chunked_vector.

Fixes #11211

Closes #11215

(cherry picked from commit 7f80602b01)
2022-09-30 00:01:26 +03:00
Botond Dénes
9b1a570f6f sstables: crawling mx-reader: make on_out_of_clustering_range() no-op
Said method currently emits a partition-end. This method is only called
when the last fragment in the stream is a range tombstone change with a
position after all clustered rows. The problem is that
consume_partition_end() is also called unconditionally, resulting in two
partition-end fragments being emitted. The fix is simple: make this
method a no-op, there is nothing to do there.

Also add two tests: one targeted to this bug and another one testing the
crawling reader with random mutations generated for random schema.

Fixes: #11421

Closes #11422

(cherry picked from commit be9d1c4df4)
2022-09-29 23:42:01 +03:00
Piotr Dulikowski
426d045249 exception: fix the error code used for rate_limit_exception
Per-partition rate limiting added a new error type which should be
returned when Scylla decides to reject an operation due to per-partition
rate limit being exceeded. The new error code requires drivers to
negotiate support for it, otherwise Scylla will report the error as
`Config_error`. The existing error code override logic works properly,
however due to a mistake Scylla will report the `Config_error` code even
if the driver correctly negotiated support for it.

This commit fixes the problem by specifying the correct error code in
`rate_limit_exception`'s constructor.

Tested manually with a modified version of the Rust driver which
negotiates support for the new error. Additionally, tested what happens
when the driver doesn't negotiate support (Scylla properly falls back to
`Config_error`).

Branches: 5.1
Fixes: #11517

Closes #11518

(cherry picked from commit e69b44a60f)
2022-09-29 23:39:25 +03:00
Botond Dénes
86dbbf12cc shard_reader: do_fill_buffer(): only update _end_of_stream after buffer is copied
Commit 8ab57aa added a yield to the buffer-copy loop, which means that
the copy can yield before done and the multishard reader might see the
half-copied buffer and consider the reader done (because
`_end_of_stream` is already set) resulting in the dropping the remaining
part of the buffer and in an invalid stream if the last copied fragment
wasn't a partition-end.

Fixes: #11561
(cherry picked from commit 0c450c9d4c)
2022-09-29 19:11:52 +03:00
Pavel Emelyanov
b05903eddd messaging_service: Fix gossiper verb group
When configuring tcp-nodelay unconditionally, messaging service thinks
gossiper uses group index 1, though it had changed some time ago and now
those verbs belong to group 0.

fixes: #11465

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
(cherry picked from commit 2c74062962)
2022-09-29 19:05:11 +03:00
Piotr Sarna
26ead53304 Merge 'Fix mutation commutativity with shadowable tombstone'
from Tomasz Grabiec

This series fixes lack of mutation associativity which manifests as
sporadic failures in
row_cache_test.cc::test_concurrent_reads_and_eviction due to differences
in mutations applied and read.

No known production impact.

Refs https://github.com/scylladb/scylladb/issues/11307

Closes #11312

* github.com:scylladb/scylladb:
  test: mutation_test: Add explicit test for mutation commutativity
  test: random_mutation_generator: Workaround for non-associativity of mutations with shadowable tombstones
  db: mutation_partition: Drop unnecessary maybe_shadow()
  db: mutation_partition: Maintain shadowable tombstone invariant when applying a hard tombstone
  mutation_partition: row: make row marker shadowing symmetric

(cherry picked from commit 484004e766)
2022-09-20 23:21:58 +02:00
Tomasz Grabiec
f60bab9471 test: row_cache: Use more narrow key range to stress overlapping reads more
This makes catching issues related to concurrent access of same or
adjacent entries more likely. For example, catches #11239.

Closes #11260

(cherry picked from commit 8ee5b69f80)
2022-09-20 23:21:54 +02:00
Yaron Kaikov
66f34245fc release: prepare for 5.1.0-rc2 scylla-5.1.0-rc2 2022-09-19 14:35:28 +03:00
Michał Chojnowski
4047528bd9 db: commitlog: don't print INFO logs on shutdown
The intention was for these logs to be printed during the
database shutdown sequence, but it was overlooked that it's not
the only place where commitlog::shutdown is called.
Commitlogs are started and shut down periodically by hinted handoff.
When that happens, these messages spam the log.

Fix that by adding INFO commitlog shutdown logs to database::stop,
and change the level of the commitlog::shutdown log call to DEBUG.

Fixes #11508

Closes #11536

(cherry picked from commit 9b6fc553b4)
2022-09-18 13:33:05 +03:00
Michał Chojnowski
1a82c61452 sstables: add a flag for disabling long-term index caching
Long-term index caching in the global cache, as introduced in 4.6, is a major
pessimization for workloads where accesses to the index are (spacially) sparse.
We want to have a way to disable it for the affected workloads.

There is already infrastructure in place for disabling it for BYPASS CACHE
queries. One way of solving the issue is hijacking that infrastructure.

This patch adds a global flag (and a corresponding CLI option) which controls
index caching. Setting the flag to `false` causes all index reads to behave
like they would in BYPASS CACHE queries.

Consequences of this choice:

- The per-SSTable partition_index_cache is unused. Every index_reader has
  its own, and they die together. Independent reads can no longer reuse the
  work of other reads which hit the same index pages. This is not crucial,
  since partition accesses have no (natural) spatial locality. Note that
  the original reason for partition_index_cache -- the ability to share
  reads for the lower and upper bound of the query -- is unaffected.
- The per-SSTable cached_file is unused. Every index_reader has its own
  (uncached) input stream from the index file, and every
  bsearch_clustered_cursor has its own cached_file, which dies together with
  the cursor. Note that the cursor still can perform its binary search with
  caching. However, it won't be able to reuse the file pages read by
  index_reader. In particular, if the promoted index is small, and fits inside
  the same file page as its index_entry, that page will be re-read.
  It can also happen that index_reader will read the same index file page
  multiple times. When the summary is so dense that multiple index pages fit in
  one index file page, advancing the upper bound, which reads the next index
  page, will read the same index file page. Since summary:disk ratio is 1:2000,
  this is expected to happen for partitions with size greater than 2000
  partition keys.

Fixes #11202

(cherry picked from commit cdb3e71045)
2022-09-18 13:27:46 +03:00
Avi Kivity
3d9800eb1c logalloc: don't crash while reporting reclaim stalls if --abort-on-seastar-bad-alloc is specified
The logger is proof against allocation failures, except if
--abort-on-seastar-bad-alloc is specified. If it is, it will crash.

The reclaim stall report is likely to be called in low memory conditions
(reclaim's job is to alleviate these conditions after all), so we're
likely to crash here if we're reclaiming a very low memory condition
and have a large stall simultaneously (AND we're running in a debug
environment).

Prevent all this by disabling --abort-on-seastar-bad-alloc temporarily.

Fixes #11549

Closes #11555

(cherry picked from commit d3b8c0c8a6)
2022-09-18 13:24:21 +03:00
Karol Baryła
c48e9b47dd transport/server.cc: Return correct size of decompressed lz4 buffer
An incorrect size is returned from the function, which could lead to
crashes or undefined behavior. Fix by erroring out in these cases.

Fixes #11476

(cherry picked from commit 1c2eef384d)
2022-09-07 10:58:30 +03:00
Avi Kivity
2eadaad9f7 Merge 'database: evict all inactive reads for table when detaching table' from Botond Dénes
Currently, when detaching the table from the database, we force-evict all queriers for said table. This series broadens the scope of this force-evict to include all inactive reads registered at the semaphore. This ensures that any regular inactive read "forgotten" for any reason in the semaphore, will not end up in said readers accessing a dangling table reference when destroyed later.

Fixes: https://github.com/scylladb/scylladb/issues/11264

Closes #11273

* github.com:scylladb/scylladb:
  querier: querier_cache: remove now unused evict_all_for_table()
  database: detach_column_family(): use reader_concurrency_semaphore::evict_inactive_reads_for_table()
  reader_concurrency_semaphore: add evict_inactive_reads_for_table()

(cherry picked from commit afa7960926)
2022-09-02 10:41:22 +03:00
Yaron Kaikov
d10aee15e7 release: prepare for 5.1.0-rc1 scylla-5.1.0-rc1 2022-09-02 06:15:05 +03:00