Commit Graph

20122 Commits

Author SHA1 Message Date
Benny Halevy
de314dfe30 tracing: one_session_records: keep local tracing ptr
Similar to trace_state keep shared_ptr<tracing> _local_tracing_ptr
in one_session_records when constructed so it can be used
during shutdown.

Fixes #5243

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 7aef39e400)
2019-12-24 18:42:07 +02:00
Avi Kivity
40a077bf93 database: fix schema use-after-move in make_multishard_streaming_reader
On aarch64, asan detected a use-after-move. It doesn't happen on x86_64,
likely due to different argument evaluation order.

Fix by evaluating full_slice before moving the schema.

Note: I used "auto&&" and "std::move()" even though full_slice()
returns a reference. I think this is safer in case full_slice()
changes, and works just as well with a reference.

Fixes #5419.

(cherry picked from commit 85822c7786)
2019-12-24 18:34:46 +02:00
Pavel Solodovnikov
d488e762cf LWT: Fix required participants calculation for LOCAL_SERIAL CL
Suppose we have a multi-dc setup (e.g. 9 nodes distributed across
3 datacenters: [dc1, dc2, dc3] -> [3, 3, 3]).

When a query that uses LWT is executed with LOCAL_SERIAL consistency
level, the `storage_proxy::get_paxos_participants` function
incorrectly calculates the number of required participants to serve
the query.

In the example above it's calculated to be 5 (i.e. the number of
nodes needed for a regular QUORUM) instead of 2 (for LOCAL_SERIAL,
which is equivalent to LOCAL_QUORUM cl in this case).

This behavior results in an exception being thrown when executing
the following query with LOCAL_SERIAL cl:

INSERT INTO users (userid, firstname, lastname, age) VALUES (0, 'first0', 'last0', 30) IF NOT EXISTS

Unavailable: Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level for cl LOCAL_SERIAL. Requires 5, alive 3" info={'required_replicas': 5, 'alive_replicas': 3, 'consistency': 'LOCAL_SERIAL'}

Tests: unit(dev), dtest(consistency_test.py)

Fixes #5477.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20191216151732.64230-1-pa.solodovnikov@scylladb.com>
(cherry picked from commit c451f6d82a)
2019-12-23 15:21:32 +02:00
Asias He
637d80ffcf repair: Do not return working_row_buf_nr in get combined row hash verb
In commit b463d7039c (repair: Introduce
get_combined_row_hash_response), working_row_buf_nr is returned in
REPAIR_GET_COMBINED_ROW_HASH in addition to the combined hash. It is
scheduled to be part of 3.1 release. However it is not backported to 3.1
by accident.

In order to be compatible between 3.1 and 3.2 repair. We need to drop
the working_row_buf_nr in 3.2 release.

Fixes: #5490
Backports: 3.2
Tests: Run repair in a mixed 3.1 and 3.2 cluster
(cherry picked from commit 7322b749e0)
2019-12-22 15:53:18 +02:00
Hagit Segev
39b17be562 release: prepare for 3.2.rc3 scylla-3.2.rc3 2019-12-15 10:33:13 +02:00
Dejan Mircevski
e54df0585e cql3: Fix needs_filtering() for clustering columns
The LIKE operator requires filtering, so needs_filtering() must check
is_LIKE().  This already happens for partition columns, but it was
overlooked for clustering columns in the initial implementation of
LIKE.

Fixes #5400.

Tests: unit(dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
(cherry picked from commit 27b8b6fe9d)
2019-12-12 14:40:09 +02:00
Avi Kivity
7d113bd1e9 Merge "Add experimental_features option" from Dejan
"
Add --experimental-features -- a vector of features to unlock. Make corresponding changes in the YAML parser.

Fixes #5338
"

* 'vecexper' of https://github.com/dekimir/scylla:
  config: Add `experimental_features` option
  utils: Add enum_option

(cherry picked from commit 63474a3380)
2019-12-12 14:39:42 +02:00
Avi Kivity
2e7cd77bc4 Update seastar submodule
* seastar 8837a3fdf1...8e236efda9 (1):
  > Merge "reactor: fix iocb pool underflow due to unaccounted aio fsync" from Avi

Fixes #5443.
2019-12-12 14:17:32 +02:00
Cem Sancak
0a38d2b0ee Fix DPDK mode in prepare script
Fixes #5455.

(cherry picked from commit 86b8036502)
2019-12-12 14:15:37 +02:00
Piotr Sarna
2ff26d1160 table: Reduce read amplification in view update generation
This commit makes sure that single-partition readers for
read-before-write do not have fast-forwarding enabled,
as it may lead to huge read amplification. The observed case was:
1. Creating an index.
  CREATE INDEX index1  ON myks2.standard1 ("C1");
2. Running cassandra-stress in order to generate view updates.
cassandra-stress write no-warmup n=1000000 cl=ONE -schema \
  'replication(factor=2) compaction(strategy=LeveledCompactionStrategy)' \
  keyspace=myks2 -pop seq=4000000..8000000 -rate threads=100 -errors
  skip-read-validation -node 127.0.0.1;

Without disabling fast-forwarding, single-partition readers
were turned into scanning readers in cache, which resulted
in reading 36GB (sic!) on a workload which generates less
than 1GB of view updates. After applying the fix, the number
dropped down to less than 1GB, as expected.

Refs #5409
Fixes #4615
Fixes #5418

(cherry picked from commit 79c3a508f4)
2019-12-05 22:35:05 +02:00
Calle Wilund
9dd714ae64 commitlog_replayer: Ensure applied frozen_mutation is safe during apply
Fixes #5211

In 79935df959 replay apply-call was
changed from one with no continuation to one with. But the frozen
mutation arg was still just lambda local.

Change to use do_with for this case as well.

Message-Id: <20191203162606.1664-1-calle@scylladb.com>
(cherry picked from commit 56a5e0a251)
scylla-3.2.rc2
2019-12-04 15:02:15 +02:00
Hagit Segev
3980570520 release: prepare for 3.2.rc2 2019-12-03 18:16:37 +02:00
Avi Kivity
9889e553e6 Update seastar submodule
* seastar 6f0ef3251...8837a3fdf (1):
  > shared_future: Fix crash when all returned futures time out

Fixes #5322
2019-11-29 11:47:24 +02:00
Avi Kivity
3e0b09faa1 Update seastar submodule to point to scylla-seastar.git
This allows us to add 3.2 specific patches to Seastar.
2019-11-29 11:45:44 +02:00
Asias He
bc4106ff45 repair: Fix rx_hashes_nr metrics (#5213)
In get_full_row_hashes_with_rpc_stream and
repair_get_row_diff_with_rpc_stream_process_op which were introduced in
the "Repair switch to rpc stream" series, rx_hashes_nr metrics are not
updated correctly.

In the test we have 3 nodes and run repair on node3, we makes sure the
following metrics are correct.

assertEqual(node1_metrics['scylla_repair_tx_hashes_nr'] + node2_metrics['scylla_repair_tx_hashes_nr'],
   	    node3_metrics['scylla_repair_rx_hashes_nr'])
assertEqual(node1_metrics['scylla_repair_rx_hashes_nr'] + node2_metrics['scylla_repair_rx_hashes_nr'],
   	    node3_metrics['scylla_repair_tx_hashes_nr'])
assertEqual(node1_metrics['scylla_repair_tx_row_nr'] + node2_metrics['scylla_repair_tx_row_nr'],
   	    node3_metrics['scylla_repair_rx_row_nr'])
assertEqual(node1_metrics['scylla_repair_rx_row_nr'] + node2_metrics['scylla_repair_rx_row_nr'],
   	    node3_metrics['scylla_repair_tx_row_nr'])
assertEqual(node1_metrics['scylla_repair_tx_row_bytes'] + node2_metrics['scylla_repair_tx_row_bytes'],
   	    node3_metrics['scylla_repair_rx_row_bytes'])
assertEqual(node1_metrics['scylla_repair_rx_row_bytes'] + node2_metrics['scylla_repair_rx_row_bytes'],
            node3_metrics['scylla_repair_tx_row_bytes'])

Tests: repair_additional_test.py:RepairAdditionalTest.repair_almost_synced_3nodes_test
Fixes: #5339
Backports: 3.2
(cherry picked from commit 6ec602ff2c)
2019-11-25 18:18:12 +02:00
Rafael Ávila de Espíndola
df3563c1ae rpmbuild: don't use dwz
By default rpm uses dwz to merge the debug info from various
binaries. Unfortunately, it looks like addr2line has not been updated
to handle this:

// This works
$ addr2line  -e build/release/scylla 0x1234567

$ dwz -m build/release/common.debug build/release/scylla.debug build/release/iotune.debug

// now this fails
$ addr2line -e build/release/scylla 0x1234567

I think the issue is

https://sourceware.org/bugzilla/show_bug.cgi?id=23652

Fixes #5289

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191123015734.89331-1-espindola@scylladb.com>
(cherry picked from commit 8599f8205b)
2019-11-25 11:45:15 +02:00
Rafael Ávila de Espíndola
1c89961c4f commitlog: make sure a file is closed
If allocate or truncate throws, we have to close the file.

Fixes #4877

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191114174810.49004-1-espindola@scylladb.com>
(cherry picked from commit 6160b9017d)
2019-11-24 17:47:08 +02:00
Tomasz Grabiec
85b1a45252 row_cache: Fix abort on bad_alloc during cache update
Since 90d6c0b, cache will abort when trying to detach partition
entries while they're updated. This should never happen. It can happen
though, when the update fails on bad_alloc, because the cleanup guard
invalidates the cache before it releases partition snapshots (held by
"update" coroutine).

Fix by destroying the coroutine first.

Fixes #5327.

Tests:
  - row_cache_test (dev)

Message-Id: <1574360259-10132-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit e3d025d014)
2019-11-24 17:42:41 +02:00
Pekka Enberg
6a847e2242 test.py: Append test repeat cycle to output XML filename
Currently, we overwrite the same XML output file for each test repeat
cycle. This can cause invalid XML to be generated if the XML contents
don't match exactly for every iteration.

Fix the problem by appending the test repeat cycle in the XML filename
as follows:

  $ ./test.py --repeat 3 --name vint_serialization_test --mode dev --jenkins jenkins_test

  $ ls -1 *.xml
  jenkins_test.release.vint_serialization_test.0.boost.xml
  jenkins_test.release.vint_serialization_test.1.boost.xml
  jenkins_test.release.vint_serialization_test.2.boost.xml

Fixes #5303.

Message-Id: <20191119092048.16419-1-penberg@scylladb.com>
(cherry picked from commit 505f2c1008)
2019-11-20 22:26:29 +02:00
Nadav Har'El
10cf0e0d91 merge: row_marker: correct row expiry condition
Merged patch set by Piotr Dulikowski:

This change corrects condition on which a row was considered expired by its
TTL.

The logic that decides when a row becomes expired was inconsistent with the
logic that decides if a single cell is expired. A single cell becomes expired
when expiry_timestamp <= now, while a row became expired when
expiry_timestamp < now (notice the strict inequality). For rows inserted
with TTL, this caused non-key cells to expire (change their values to null)
one second before the row disappeared. Now, row expiry logic uses non-strict
inequality.

Fixes #4263,
Fixes #5290.

Tests:

    unit(dev)
    python test described in issue #5290

(cherry picked from commit 9b9609c65b)
2019-11-20 21:37:16 +02:00
Hagit Segev
8c1474c039 release: prepare for 3.2.rc1 scylla-3.2.rc1 2019-11-19 21:38:26 +02:00
Yaron Kaikov
bb5e9527bb dist/docker: Switch to 3.2 release repository (#5296)
Modify Dockerfile SCYLLA_REPO_URL argument to point to the 3.2 repository.
2019-11-18 11:59:51 +02:00
Nadav Har'El
4dae72b2cd sstables: allow non-traditional characters in table name
The goal of this patch is to fix issue #5280, a rather serious Alternator
bug, where Scylla fails to restart when an Alternator table has secondary
indexes (LSI or GSI).

Traditionally, Cassandra allows table names to contain only alphanumeric
characters and underscores. However, most of our internal implementation
doesn't actually have this restriction. So Alternator uses the characters
':' and '!' in the table names to mark global and local secondary indexes,
respectively. And this actually works. Or almost...

This patch fixes a problem of listing, during boot, the sstables stored
for tables with such non-traditional names. The sstable listing code
needlessly assumes that the *directory* name, i.e., the CF names, matches
the "\w+" regular expression. When an sstable is found in a directory not
matching such regular expression, the boot fails. But there is no real
reason to require such a strict regular expression. So this patch relaxes
this requirement, and allows Scylla to boot with Alternator's GSI and LSI
tables and their names which include the ":" and "!" characters, and in
fact any other name allowed as a directory name.

Fixes #5280.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20191114153811.17386-1-nyh@scylladb.com>
(cherry picked from commit 2fb2eb27a2)
scylla-3.2.rc0
2019-11-17 18:07:52 +02:00
Kamil Braun
1e444a3dd5 sstables: fix sstable file I/O CQL tracing when reading multiple files (#5285)
CQL tracing would only report file I/O involving one sstable, even if
multiple sstables were read from during the query.

Steps to reproduce:

create a table with NullCompactionStrategy
insert row, flush memtables
insert row, flush memtables
restart Scylla
tracing on
select * from table
The trace would only report DMA reads from one of the two sstables.

Kudos to @denesb for catching this.

Related issue: #4908

(cherry picked from commit a67e887dea)
2019-11-17 16:09:35 +02:00
Yaron Kaikov
76906d6134 release: prepare for 3.2.rc0 2019-11-17 16:08:11 +02:00
Pavel Emelyanov
2195edb819 gitignore: Add tags file
This file is generated by ctags utility for navigation, so it
is not to be tracked by git.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20191031221339.19030-1-xemul@scylladb.com>
2019-11-14 16:50:11 +01:00
Gleb Natapov
e0668f806a lwt: change format of partition key serialization for system.paxos table
Serialize provided partition_key in such a way that the serialized value
will hash to the same token as the original key. This way when system.paxos
table is updated the update is shard local.

Message-Id: <20191114135449.GU10922@scylladb.com>
2019-11-14 15:07:16 +01:00
Avi Kivity
19b665ea6b Merge "Correctly handle null/unset frozen collection/UDT columns in INSERT JSON." from Kamil
"
When using INSERT JSON with frozen collection/UDT columns, if the columns were left unspecified or set to null, the statement would create an empty non-null value for these columns instead of using null values as it should have. For example:

cqlsh:b> create table t (k text primary key, l frozen<list<int>>, m frozen<map<int, int>>, s frozen<set<int>>, u frozen<ut>);
cqlsh:b> insert into t JSON '{"k": "insert_json"}';
cqlsh:b> select * from t;
 k                 | l    | m    | s    | u
-------------------+------+------+------+------
       insert_json |     [] |     {} |     {} |

This PR fixes this.
Resolves #5246 and closes #5270.
"

* 'frozen-json' of https://github.com/kbr-/scylla:
  tests: add null/unset frozen collection/UDT INSERT JSON test
  cql3: correctly handle frozen null/unset collection/UDT columns in INSERT JSON
  cql3: decouple execute from term binding in user_type::setter
2019-11-14 15:29:30 +02:00
Avi Kivity
4544aa0b34 Update seastar submodule
* seastar 75e189c6ba...6f0ef32514 (6):
  > Merge "Add named semaphores" from Piotr
  > parallel_for_each_state: pass rvalue reference to add_future
  > future: Pass rvalue to uninitialized_wrapper::uninitialized_set.
  > dependencies: Add libfmt-dev to debian
  > log: Fix logger behavior when logging both to stdout and syslog.
  > README.md: list Scylla among the projects using Seastar
2019-11-14 15:01:18 +02:00
Vladimir Davydov
ab42b72c6d cql: fix SERIAL consistency check for batch statements
If CONSISTENCY is set to SERIAL or LOCAL SERIAL, all write requests must
fail according to Cassandra's documentation. However, batched writes
bypass this check. Fix this.
2019-11-14 12:15:39 +01:00
Vladimir Davydov
25aeefd6f3 cql: fix CAS consistency level validation
This patch resurrects Cassandra's code validating a consistency level
for CAS requests. Basically, it makes CAS requests use a special
function instead of validate_for_write to make error messages more
coherent.

Note, we don't need to resurrect requireNetworkTopologyStrategy as
EACH_QUORUM should work just fine for both CAS and non-CAS writes.
Looks like it is just an artefact of a rebase in the Cassandra
repository.
2019-11-14 12:15:39 +01:00
Avi Kivity
cd075e9132 reloc: do not install dependencies when building the relocatable package
The dependencies are provided by the frozen toolchain. If a dependency
is missing, we must update the toolchain rather than rely on build-time
installation, which is not reproducible (as different package versions
are available at different times).

Luckily "dnf install" does not update an already-installed package. Had
that been a case, none of our builds would have been reproducible, since
packages would be updated to the latest version as of the build time rather
than the version selected by the frozen toolchain.

So, to prevent missing packages in the frozen toolchain translating to
an unreproducible build, remove the support for installing dependencies
from reloc/build_reloc.sh. We still parse the --nodeps option in case some
script uses it.

Fixes #5222.

Tests: reloc/build_reloc.sh.
2019-11-14 09:37:14 +02:00
Gleb Natapov
552c56633e storage_proxy: do not release mutation if not all replies were received
MV backpressure code frees mutation for delayed client replies earlier
to save memory. The commit 2d7c026d6e that
introduced the logic claimed to do it only when all replies are received,
but this is not the case. Fix the code to free only when all replies
are received for real.

Fixes #5242

Message-Id: <20191113142117.GA14484@scylladb.com>
2019-11-13 16:23:19 +02:00
Raphael S. Carvalho
3e70523111 distributed_loader: Release disk space of SSTables deleted by resharding
Resharding is responsible for the scheduling the deletion of sstables
resharded, but it was not refreshing the cache of the shards those
sstables belong to, which means cache was incorrectly holding reference
to them even after they were deleted. The consequence is sstables
deleted by resharding not having their disk space freed until cache
is refreshed by a subsequent procedure that triggers it.

Fixes #5261.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20191107193550.7860-1-raphaelsc@scylladb.com>
2019-11-13 16:03:27 +02:00
Avi Kivity
6aed3b7471 Merge "cql: trivial cleanup" from Vova
* 'cql-trivial-cleanup' of ssh://github.com/scylladb/scylla-dev:
  cql: rename modification_statement::_sets_a_collection to _selects_a_collection
  cql: rename _column_conditions to _regular_conditions
  cql: remove unnecessary optional around prefetch_data
2019-11-13 15:12:10 +02:00
Avi Kivity
1cb9f9bdfe Merge "Use a fixed-size bitset for column set" from Kostja
"
Use a fixed-size, rather than a dynamically growing
bitset for column mask. This avoids unnecessary memory
reallocation in the most common case.
"

* 'column_set' of ssh://github.com/scylladb/scylla-dev:
  schema: pre-allocate the bitset of column_set
  schema: introduce schema::all_columns_count()
  schema: rename column_mask to column_set
2019-11-13 15:08:13 +02:00
Tomasz Grabiec
f68e17eb52 Merge "Partition/row hit/miss counters for memtable write operations" from Piotr D.
Adds per-table metrics for counting partition and row reuse
in memtables. New metrics are as follows:
    - memtable_partition_writes - number of write operations performed
          on partitions in memtables,
    - memtable_partition_hits - number of write operations performed
          on partitions that previously existed in a memtable,
    - memtable_row_writes - number of row write operations performed
          in memtables,
    - memtable_row_hits - number of row write operations that ovewrote
          rows previously present in a memtable.

Tests: unit(release)
2019-11-13 13:11:51 +01:00
Juliusz Stasiewicz
8318a6720a cql3: error msg w/ arg counts for prepared stmts with wrong arg cnt
Fixes #3748. Very small change: added argument count (expectation vs. reality)
to error msg within `invalid_request_exception'.
2019-11-13 13:43:37 +02:00
Nadav Har'El
ccb9038c69 alternator: Implement Expected operators LT and GT
Merged patch series from Dejan Mircevski. Implements the "LT" and "GT"
operators of the Expected update option (i.e., conditional updates),
and enables the pre-existing tests for them.
2019-11-13 12:07:44 +02:00
Konstantin Osipov
6159c012db schema: pre-allocate the bitset of column_set
The number of columns is usually small, and avoiding
a resize speeds up bit manipulation functions.
2019-11-13 11:41:51 +03:00
Konstantin Osipov
e95d675567 schema: introduce schema::all_columns_count()
schema::all_columns_count() will be used to reserve
memory of the column_set bitmask.
2019-11-13 11:41:42 +03:00
Konstantin Osipov
191acec7ab schema: rename column_mask to column_set
Since it contains a precise set of columns, it's more
accurate to call it a set, not a mask. Besides, the name
column_mask is already used for column options on storage
level.
2019-11-13 11:41:30 +03:00
Kamil Braun
d6446e352e tests: add null/unset frozen collection/UDT INSERT JSON test
When using INSERT JSON with null/unspecified frozen collection/UDT
columns, the columns should be set to null.

See #5270.
2019-11-12 18:24:47 +01:00
Vladimir Davydov
8110178e5d cql: rename modification_statement::_sets_a_collection to _selects_a_collection
This is merely to avoid confusion: we use _sets prefix to indicate that
there are operations over static/regular columns (_sets_static_columns,
_sets_regular_columns), but _sets_a_collection is set for both operations
and conditions. So let's rename it to _selects_a_collection and add some
comments.
2019-11-12 20:15:42 +03:00
Vladimir Davydov
a19192950e cql: rename _column_conditions to _regular_conditions
It's weird that modification_statement has _static_conditions for
conditions on static columns and _column_conditions for conditions on
regular columns, as if conditions on static columns are not column
conditions. Let's rename _column_conditions to _regular_conditions to
avoid confusion.
2019-11-12 20:15:35 +03:00
Konstantin Osipov
0ad0369684 cql: remove unnecessary optional around prefetch_data 2019-11-12 20:15:24 +03:00
Kamil Braun
6c04c5bed5 cql3: correctly handle frozen null/unset collection/UDT columns in INSERT JSON
Before this commit, an empty non-null value was created for
frozen collection/UDT columns when an INSERT JSON statement was executed
with the value left unspecified or set to null.
This was incompatible with Cassandra which inserted a null (dead cell).

Fixes #5270.
2019-11-12 18:05:01 +01:00
Kamil Braun
0ad7d71f31 cql3: decouple execute from term binding in user_type::setter
This commit makes it possible to pass a bound value terminal
directly to the setter.
Continuation of commit bfe3c20035.
2019-11-12 18:02:21 +01:00
Takuya ASADA
614ec6fc35 install.sh: drop --pkg option, use .install file on .deb package
--pkg option on install.sh is introduced for .deb packaging since it requires
different install directory for each subpackage.
But we actually able to use "debian/tmp" for shared install directory,
then we can specify file owner of the package using .install files.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20191030203142.31743-1-syuu@scylladb.com>
2019-11-12 16:50:37 +02:00
Piotr Dulikowski
59fbbb993f memtables: add partition/row hit/miss counters
Adds per-table metrics for counting partition and row reuse
in memtables. New metrics are as follows:
    - memtable_partition_writes - number of write operations performed
          on partitions in memtables,
    - memtable_partition_hits - number of write operations performed
          on partitions that previously existed in a memtable,
    - memtable_row_writes - number of row write operations performed
          in memtables,
    - memtable_row_hits - number of row write operations that ovewrote
          rows previously present in a memtable.

Tests: unit(release)
2019-11-12 13:35:41 +01:00