CQL tracing would only report file I/O involving one sstable, even if
multiple sstables were read from during the query.
Steps to reproduce:
create a table with NullCompactionStrategy
insert row, flush memtables
insert row, flush memtables
restart Scylla
tracing on
select * from table
The trace would only report DMA reads from one of the two sstables.
Kudos to @denesb for catching this.
Related issue: #4908
There are ... signs of massive start/stop code rework in the
main() function. While fixing the sub-modules interdependencies
during start/stop I've polished these signs too, so here's the
simplest ones.
Serialize provided partition_key in such a way that the serialized value
will hash to the same token as the original key. This way when system.paxos
table is updated the update is shard local.
Message-Id: <20191114135449.GU10922@scylladb.com>
"
When using INSERT JSON with frozen collection/UDT columns, if the columns were left unspecified or set to null, the statement would create an empty non-null value for these columns instead of using null values as it should have. For example:
cqlsh:b> create table t (k text primary key, l frozen<list<int>>, m frozen<map<int, int>>, s frozen<set<int>>, u frozen<ut>);
cqlsh:b> insert into t JSON '{"k": "insert_json"}';
cqlsh:b> select * from t;
k | l | m | s | u
-------------------+------+------+------+------
insert_json | [] | {} | {} |
This PR fixes this.
Resolves#5246 and closes#5270.
"
* 'frozen-json' of https://github.com/kbr-/scylla:
tests: add null/unset frozen collection/UDT INSERT JSON test
cql3: correctly handle frozen null/unset collection/UDT columns in INSERT JSON
cql3: decouple execute from term binding in user_type::setter
* seastar 75e189c6ba...6f0ef32514 (6):
> Merge "Add named semaphores" from Piotr
> parallel_for_each_state: pass rvalue reference to add_future
> future: Pass rvalue to uninitialized_wrapper::uninitialized_set.
> dependencies: Add libfmt-dev to debian
> log: Fix logger behavior when logging both to stdout and syslog.
> README.md: list Scylla among the projects using Seastar
If CONSISTENCY is set to SERIAL or LOCAL SERIAL, all write requests must
fail according to Cassandra's documentation. However, batched writes
bypass this check. Fix this.
This patch resurrects Cassandra's code validating a consistency level
for CAS requests. Basically, it makes CAS requests use a special
function instead of validate_for_write to make error messages more
coherent.
Note, we don't need to resurrect requireNetworkTopologyStrategy as
EACH_QUORUM should work just fine for both CAS and non-CAS writes.
Looks like it is just an artefact of a rebase in the Cassandra
repository.
The dependencies are provided by the frozen toolchain. If a dependency
is missing, we must update the toolchain rather than rely on build-time
installation, which is not reproducible (as different package versions
are available at different times).
Luckily "dnf install" does not update an already-installed package. Had
that been a case, none of our builds would have been reproducible, since
packages would be updated to the latest version as of the build time rather
than the version selected by the frozen toolchain.
So, to prevent missing packages in the frozen toolchain translating to
an unreproducible build, remove the support for installing dependencies
from reloc/build_reloc.sh. We still parse the --nodeps option in case some
script uses it.
Fixes#5222.
Tests: reloc/build_reloc.sh.
MV backpressure code frees mutation for delayed client replies earlier
to save memory. The commit 2d7c026d6e that
introduced the logic claimed to do it only when all replies are received,
but this is not the case. Fix the code to free only when all replies
are received for real.
Fixes#5242
Message-Id: <20191113142117.GA14484@scylladb.com>
Resharding is responsible for the scheduling the deletion of sstables
resharded, but it was not refreshing the cache of the shards those
sstables belong to, which means cache was incorrectly holding reference
to them even after they were deleted. The consequence is sstables
deleted by resharding not having their disk space freed until cache
is refreshed by a subsequent procedure that triggers it.
Fixes#5261.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20191107193550.7860-1-raphaelsc@scylladb.com>
* 'cql-trivial-cleanup' of ssh://github.com/scylladb/scylla-dev:
cql: rename modification_statement::_sets_a_collection to _selects_a_collection
cql: rename _column_conditions to _regular_conditions
cql: remove unnecessary optional around prefetch_data
"
Use a fixed-size, rather than a dynamically growing
bitset for column mask. This avoids unnecessary memory
reallocation in the most common case.
"
* 'column_set' of ssh://github.com/scylladb/scylla-dev:
schema: pre-allocate the bitset of column_set
schema: introduce schema::all_columns_count()
schema: rename column_mask to column_set
Adds per-table metrics for counting partition and row reuse
in memtables. New metrics are as follows:
- memtable_partition_writes - number of write operations performed
on partitions in memtables,
- memtable_partition_hits - number of write operations performed
on partitions that previously existed in a memtable,
- memtable_row_writes - number of row write operations performed
in memtables,
- memtable_row_hits - number of row write operations that ovewrote
rows previously present in a memtable.
Tests: unit(release)
Merged patch series from Dejan Mircevski. Implements the "LT" and "GT"
operators of the Expected update option (i.e., conditional updates),
and enables the pre-existing tests for them.
Since it contains a precise set of columns, it's more
accurate to call it a set, not a mask. Besides, the name
column_mask is already used for column options on storage
level.
This is merely to avoid confusion: we use _sets prefix to indicate that
there are operations over static/regular columns (_sets_static_columns,
_sets_regular_columns), but _sets_a_collection is set for both operations
and conditions. So let's rename it to _selects_a_collection and add some
comments.
It's weird that modification_statement has _static_conditions for
conditions on static columns and _column_conditions for conditions on
regular columns, as if conditions on static columns are not column
conditions. Let's rename _column_conditions to _regular_conditions to
avoid confusion.
Before this commit, an empty non-null value was created for
frozen collection/UDT columns when an INSERT JSON statement was executed
with the value left unspecified or set to null.
This was incompatible with Cassandra which inserted a null (dead cell).
Fixes#5270.
--pkg option on install.sh is introduced for .deb packaging since it requires
different install directory for each subpackage.
But we actually able to use "debian/tmp" for shared install directory,
then we can specify file owner of the package using .install files.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20191030203142.31743-1-syuu@scylladb.com>
Adds per-table metrics for counting partition and row reuse
in memtables. New metrics are as follows:
- memtable_partition_writes - number of write operations performed
on partitions in memtables,
- memtable_partition_hits - number of write operations performed
on partitions that previously existed in a memtable,
- memtable_row_writes - number of row write operations performed
in memtables,
- memtable_row_hits - number of row write operations that ovewrote
rows previously present in a memtable.
Tests: unit(release)
This change adds a SCYLLA_REPO_URL argument to Dockerfile, which defines
the RPM repository used to install Scylla from.
When building a new Docker image, users can specify the argument by
passing the --build-arg SCYLLA_REPO_URL=<url> option to the docker build
command. If the argument is not specified, the same RPM repository is
used as before, retaining the old default behavior.
We intend to use this in release engineering infrastructure to specify
RPM repositories for nightly builds of release branches (for example,
3.1.x), which are currently only using the stable RPMs.
Code for check_LT(), check_GT(), etc. will be nearly identical, so
factor it out into a single function that takes a comparator object.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
In 1ca9dc5d47, it was established that the correct way to
base64-decode a JSON value is via string_view, rather than directly
from GetString().
This patch adds a base64_decode(rjson::value) overload, which
automatically uses the correct procedure. It saves typing, ensures
correctness (fixing one incorrect call found), and will come in handy
for future EXPECTED comparisons.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
unwrap_number() is now a public function in serialization.hh instead
of a static function visible only in executor.cc.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Merged patch series from Piotr Sarna:
An otherwise empty partition can still have a valid static column.
Filtering didn't take that fact into account and only filtered
full-fledged rows, which may result in non-matching rows being returned
to the client.
Fixes#5248
"type" label is already in use for the counter type ("derive", "gauge",
etc). Using the same label for "cas" / "non-cas" overwrites it. Let's
instead call the new label "conditional" and use "yes" / "no" for its
value, as suggested by Kostja.
Message-Id: <3082b16e4d6797f064d58da95fb4e50b59ab795c.1572451480.git.vdavydov@scylladb.com>
"
In case when a single reader contributes a stream of fragments and keeps winning over other readers, mutation_reader_merger will enter gallop mode, in which it is assumed that the reader will keep winning over other readers. Currently, a reader needs to contribute 3 fragments to enter that mode.
In gallop mode, fragments returned by the galloping reader will be compared with the best fragment from _fragment_heap. If it wins, the fragment is directly returned. Otherwise, gallop mode ends and merging performed as in general case, which involves heap operations.
In current implementation, when the end of partition is encountered while in gallop mode, the gallop mode is ended unconditionally.
A microbenchmark was added in order to test performance of the galloping reader optimization. A combining reader that merges results from four other readers is created. Each sub-reader provides a range of 32 clustering rows that is disjoint from others. All sub-readers return rows from the same partition. An improvement can be observed after introducing the galloping reader optimization.
As for other benchmarks from the "combined" group, results are pretty close to the old ones. The only one that seems to have suffered slightly is combined.many_overlapping.
Median times from a single run of perf_mutation_readers.combined: (1s run duration, 5 runs per benchmark, release mode)
test name before after improvement
one_row 49.070ns 48.287ns 1.60%
single_active 61.574us 61.235us 0.55%
many_overlapping 488.193us 514.977us -5.49%
disjoint_interleaved 57.462us 57.111us 0.61%
disjoint_ranges 56.545us 56.006us 0.95%
overlapping_partitions_disjoint_rows 127.039us 80.849us 36.36%
Same results, normalized per mutation fragment:
test name before after improvement
one_row 16.36ns 16.10ns 1.60%
single_active 109.46ns 108.86ns 0.55%
many_overlapping 216.97ns 228.88ns -5.49%
disjoint_interleaved 102.15ns 101.53ns 0.61%
disjoint_ranges 100.52ns 99.57ns 0.95%
overlapping_partitions_disjoint_rows 246.38ns 156.80ns 36.36%
Tested on AMD Ryzen Threadripper 2950X @ 3.5GHz.
Tests: unit(release)
Fixes#3593.
"
* '3593-combined_reader-gallop-mode' of https://github.com/piodul/scylla:
mutation_reader: gallop mode microbenchmark
mutation_reader: combined reader gallop tests
mutation_reader: gallop mode for combined reader
mutation_reader: refactor prepare_next