perf_fast_forward is used to detect performance regressions. The two
main metrics used for this are fargments per second and the number of
the IO operations. The former is a median of a several runs, but the
latter is just the actual number of asynchronous IO operations performed
in the run that happened to be picked as a median frag/s-wise. There's
no always a direct correlation between frag/s and aio and the latter can
vary which makes the latter hard to compare.
In order to make this easier a new metric was introduced: "average aio"
which reports the average number of asynchronous IO operations performed
in a run. This should produce much more stable results and therefore
make the comparison more meaningful.
Message-Id: <20190430134401.19238-1-pdziepak@scylladb.com>
The existing unit test test_secondary_index_contains_virtual_columns
reproduced a bug (issue #4144) with indexing of primary-key columns,
but we only actually tested clustering columns. In issue #4471 there
was a question whether we may still have a bug when indexing of
*partition-key* columns. This patch adds a test that verifies that
we don't, and this case works well too.
Refs #4144
Refs #4471
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190501113500.25900-1-nyh@scylladb.com>
Until this patch, dropping columns from a table was completely forbidden
if this table has any materialized views or secondary indexes. However,
this is excessively harsh, and not compatible with Cassandra which does
allow dropping columns from a base table which has a secondary index on
*other* columns. This incompatibility was raised in the following
Stackoverflow question:
https://stackoverflow.com/questions/55757273/error-while-dropping-column-from-a-table-with-secondary-index-scylladb/55776490
In this patch, we allow dropping a base table column if none of its
materialized views *needs* this column. Columns selected by a view
(as regular or key columns) are needed by it, of course, but when
virtual columns are used (namely, there is a view with same key columns
as the base), *all* columns are needed by the view, so unfortunately none
of the columns may be dropped.
After this patch, when a base-table column cannot be dropped because one
of the materialized views needs it, the error message will look like:
exceptions::invalid_request_exception: Cannot drop column a from base
table ks.cf: a materialized view cf_a_idx_index needs this column.
This patch also includes extensive testing for the cases where dropping
columns are now allowed, and not allowed. The secondary-index tests are
especially interesting, because they demonstrate that now usually (when
a non-key column is being indexed) dropping columns will be allowed,
which is what originally bothered the Stackoverflow user.
Fixes#4448.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190429214805.2972-1-nyh@scylladb.com>
There are several places were IN restrictions are not currently supported,
especially in queries involving a secondary index. However, when the IN
restriction has just a single value, it is nothing more than an equality
restriction and can be converted into one and be supported. So this patch
does exactly this.
Note that Cassandra does this conversion since August 2016, and therefore
supports the special case of single-value IN even where general IN is not
supported. So it's important for Cassandra compatibility that we do this
conversion too.
This patch also includes a test with two queries involving a secondary
index that were previously disallowed because of the "IN" on the primary
key or the indexed column - and are now allowed when the IN restriction
has just a single value. A third query tested is not related to secondary
indexes, but confirms we don't break multi-column single-value IN queries.
Fixes#4455.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190428160317.23328-1-nyh@scylladb.com>
The shard readers of the multishard reader assumed that the positions in
the data stream are strictly monotonous. This assumption is invalid.
Range tombstones can have positions that they can share with other range
tombstones and/or a clustering row. The effect of this false assumption
was that when the shard reader was evicted such that the last seen
fragment was a range tombstone, when recreated it would skip any unseen
fragments that have the same position as that of the last seen range
tombstone.
Fixes: #4418
Branches: master, 3.0, 2019.1
Tests: unit(dev)
* https://github.com/denesb/scylla.git
multishard_reader_handle_non_strictly_monotonous_positions/v4:
multishard_combining_reader: shard_reader::remote_reader extract
fill-buffer logic into do_fill_buffer()
mutlishard_combining_reader: reorder
shard_reader::remote_reader::do_fill_buffer() code
position_in_partition_view: add region() accessor
multishard_combining_reader: fix handling of non-strictly monotonous
positions
flat_mutation_reader: add flat_mutation_reader_from_mutations()
overload with range and slice
flat_mutation_reader: add make_flat_mutation_reader_from_fragments()
overload with range and slice
tests: add unit test for multishard reader correctly handling
non-strictly monotonous positions
Currently null and missing values are treated differently. Missing
values throw no_such_column. Null values return nullptr, std::nullopt
or throw null_column_value.
The api is a bit confusing since a function returning a std::optional
either returns std::nullopt or throws depending on why there is no
value.
With this patch series only get_nonnull throws and there is only one
exception type.
* https://github.com/espindola/scylla.git espindola/merge-null-and-missing-v2:
query-result-set: merge handling of null and missing values
Remove result_set_row::has
Return a reference from get_nonnull
Nothing seems to differentiate a missing and a null value. This patch
then merges the two exception types and now the only method that
throws is get_nonnull. The other methods return nullptr or
std::nullopt as appropriate.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
To be able to support this new overload, the reader is made
partition-range aware. It will now correctly only return fragments that
fall into the partition-range it was created with. For completeness'
sake and to be able to test it, also implement
`fast_forward_to(const dht::partition_range)`. Slicing is done by
filtering out non-overlapping fragments from the initial list of
fragments. Also add a unit test that runs it through the mutation_source
test suite.
There was nothing keeping the verify lambda alive after the return. It
worked most of the time since the only state kept by the lambda was
a pointer to cql_test_env.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190426203823.15562-1-espindola@scylladb.com>
To be able to run the mutation-source test suite with this reader. In
the next patch, this reader will be used in testing another reader, so
it is important to make sure it works correctly first.
"
Previously we weren't validating elements of collections so it
was possible to add non-UTF-8 string to a column with type
list<text>.
Tests: unit(release)
Fixes#4009
"
* 'haaawk/4009/v5' of github.com:scylladb/seastar-dev:
types: Test correct map validation
types: Test correct in clause validation
types: Test correct tuple validation
types: Test correct set validation
types: Test correct list validation
types: Add test_tuple_elements_validation
types: Add test_in_clause_validation
types: Add test_map_elements_validation
types: Add test_set_elements_validation
types: Add test_list_elements_validation
types: Validate input when tuples
types: Validate input when parsing a set
types: Validate input when parsing a map
types: Validate input when parsing a list
types: Implement validation for tuple
types: Implement validation for set
types: Implement validation for map
types: Implement validation for list
types: Add cql_serialization_format parameter to validate
1. All nodes in the cluster have to support MC_SSTABLE_FEATURE
2. When a node observes that whole cluster supports MC_SSTABLE_FEATURE
then it should start using MC format.
3. Once all shards start to use MC then a node should broadcast that
unbounded range tombstones are now supported by the cluster.
4. Once whole cluster supports unbounded range tombstones we can
start accepting them on CQL level.
tests: unit(release)
Fixes#4205Fixes#4113
* seastar-dev.git dev/haaawk/enable_mc/v11:
system_keyspace: Add scylla_local
system_keyspace: add accessors for SCYLLA_LOCAL
storage_service: add _sstables_format field
feature: add when_enabled callbacks
system_keyspace: add storage_service param to setup
Add sstable format helper methods
Register feature listeners in storage_service
Add service::read_sstables_format
Use read_sstables_format in main.cc
Use _sstables_format to determine current format
Add _unbounded_range_tombstones_feature
Update supported features on format change
Currently, each instanciation of `random_mutation_generator::impl` will
generate a new random seed for itself. Altough these are printed,
mapping back all the printed seeds to the exact source location where it
has to be substituted in is non-trivial. This makes reproducing random
test failures very hard. To solve this problem, use
`tests::random::get_int()` to produce the random seed of the
`random_mutation_generator::impl` instances. This way the seed of all
the mutation generator will be derived from a single "master" seed that
is easily replaced after a test failure, hopefully also leading to
easily reproducible random test failures.
I checked that after substituting in a previously generated master
random seed, all derived seeds were exactly the same.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <0471415938fc27485975ef9213d37d94bff20fd5.1555329062.git.bdenes@scylladb.com>
"
These are patches I wrote while working on UDF/UDA, but IMHO they are
independent improvements and are ready for review.
Tests: unit (debug) dtest (release)
I checked that all tests in
nosetests -v user_types_test.py sstabledump_test.py cqlsh_tests/cqlsh_tests.py
now pass.
"
* 'espindola/udf-uda-refactoring-v3' of https://github.com/espindola/scylla:
Refactor user type merging
cql_type_parser::raw_builder: Allow building types incrementally
cql3: delete dead code
Include missing header
return a const reference from return_type
delete unused var
Add a test on nested user types.
Currently the partition header will always be reported as different when
comparing two mutations. This is because they are prepended with the
"expected: " and "... but got: " texts. This generates unnecessary
noise. Inject a new line between the prefix and the partition-header
proper. This way the partition header will only show up in the diff when
there is an actual difference. The "expected: " and "... but got: "
phrases are still shown as different on the top of the diff but this is
fine as one can immediately see that they are not part of the data and
additionaly they help the reader in determining which part of the diff
is the expected one and which is the actual one.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <29e0f413d248048d7db032224a3fd4180bf1b319.1554909144.git.bdenes@scylladb.com>
The problem happens after a schema change because we fail to properly
remove ongoing compaction, which stopped being tracked, from list that
is used to calculate backlog, so it may happen that a compaction read
monitor (ceases to exist after compaction ends) is used after freed.
Fixes#4410.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20190409024936.23775-1-raphaelsc@scylladb.com>
The comparison of tables before and after mutation is now done by a
generic diff_rows function. The same function will be used for user
defined functions and user defined aggregates.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
There were many calls to read_keyspace_mutation. One in each function
that prepares a mutation for some other schema change.
With this patch they are all moved to a single location.
Tests: unit (dev, debug)
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190328024440.26201-1-espindola@scylladb.com>
"
File must be either owned by the process uid
or have both read and write access to it,
so it could be (hard) linked when sysctl
fs.protected_hardlinks is enabled.
Fixes#3117
"
* 'projects/valid_owner_and_mode/v3-rebased' of https://github.com/bhalevy/scylla:
storage_service: handle load_new_sstables exception
init: validate file ownership and mode.
treewide: use std::filesystem
Rather than {std::experimental,boost,seastar::compat}::filesystem
On Sat, 2019-03-23 at 01:44 +0200, Avi Kivity wrote:
> The intent for seastar::compat was to allow the application to choose
> the C++ dialect and have seastar follow, rather than have seastar choose
> the types and have the application follow (as in your patch).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
"
Variable length integers are used are used extensively by SSTables mc
format. The current deserialisation routine is quite naive in a way that
it reads each byte separately. Since, those vints usually appear inside
much larger buffers, we optimise for such cases, read 8-bytes at once
and then mask out the unneeded parts (as well as fix their order because
big-endian).
Tests: unit(dev).
perf_vint (average time per element when deserializing 1000 vints):
before:
vint.deserialize 69442000 14.400ns 0.000ns 14.399ns 14.400ns
after:
vint.deserialize 241502000 4.140ns 0.000ns 4.140ns 4.140ns
perf_fast_forward (data on /tmp):
large-partition-single-key-slice on dataset large-part-ds1:
before:
range time (s) iterations frags frag/s mad f/s max f/s min f/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu
-> [0, 1] 0.000278 8792 2 7190 119 7367 1960 3 104 2 0 0 1 1 0 0 1 100.0%
-> [1, 100) 0.000344 96 99 288100 4335 307689 193809 2 108 2 0 0 1 1 0 0 1 100.0%
-> (100, 200] 0.000339 13254 100 295263 2824 301734 222725 2 108 2 0 0 1 1 0 0 1 100.0%
after:
range time (s) iterations frags frag/s mad f/s max f/s min f/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu
-> [0, 1] 0.000236 10001 2 8461 59 8718 2261 3 104 2 0 0 1 1 0 0 1 100.0%
-> [1, 100) 0.000285 89 99 347500 2441 355826 215745 2 108 2 0 0 1 1 0 0 1 100.0%
-> (100, 200] 0.000293 14369 100 341302 1512 350123 222049 2 108 2 0 0 1 1 0 0 1 100.0%
"
* tag 'optimise-vint/v2' of https://github.com/pdziepak/scylla:
sstable: pass full length of buffer to vint deserialiser
vint: optimise deserialisation routine
vint: drop deserialize_type structure
tests/vint: reduce test dependencies
tests/perf: add performance test for vint serialisation
"
This series introduce a rudimentary sstables manager
that will be used for making and deleting sstables, and tracking
of thereof.
The motivation for having a sstables manager is detailed in
https://github.com/scylladb/scylla/issues/4149.
The gist of it is that we need a proper way to manage the life
cycle of sstables to solve potential races between compaction
and various consumers of sstables, so they don't get deleted by
compaction while being used.
In addition, we plan to add global statistics methods like returning
the total capacity used by all sstables.
This patchset changes the way class sstable gets the large_data_handler.
Rather than passing it separately for writing the sstable and when deleting
sstables, we provide the large_data_handler when the sstable object is
constructed and then use it when needed.
Refs #4149
"
* 'projects/sstables_manager/v3' of https://github.com/bhalevy/scylla:
sstables: provide large_data_handler to constructor
sstables_manager: default_sstable_buffer_size need not be a function
sstables: introduce sstables_manager
sstables: move shareable_components def to its own header
tests: use global nop_lp_handler in test_services
sstables: compress.hh: add missing include
sstables: reorder entry_descriptor constructor params
sstables: entry_descriptor: get rid of unused ctor
sstables: make load_shared_components a method of sstable
sstables: remove default params from sstable constructor
database: add table::make_sstable helper
distributed_loader: pass column_family to load_sstables_with_open_info
distributed_loader: no need for forward declaration of load_sstables_with_open_info
distributed_loader: reshard: use default params for make_sstable
The goal of the sstables manager is to track and manage sstables life-cycle.
There is a sstable manager instance per database and it is passed to each column-family
(and test environment) on construction.
All sstables created, loaded, and deleted pass through the sstables manager.
The manager will make sure consumers of sstables are in sync so that sstables
will not be deleted while in use.
Refs #4149
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
"
Fixes#4348
v2 changes:
* added a unit test
This miniseries fixes decimal/varint serialization - it did not update
output iterator in all cases, which may lead to overwriting decimal data
if any other value follows them directly in the same buffer (e.g. in a tuple).
It also comes with a reproducing unit test covering both decimals and varints.
Tests: unit (dev)
dtest: json_test.FromJsonUpdateTests.complex_data_types_test
json_test.FromJsonInsertTests.complex_data_types_test
json_test.ToJsonSelectTests.complex_data_types_test
"
* 'fix_varint_serialization_2' of https://github.com/psarna/scylla:
tests: add test for unpacking decimals
types: fix varint and decimal serialization