We don't support non-PK restrictions correctly as explained in commit
3c90607 ("tests/cql_query_test: Fix view creation in
test_duration_restrictions()") and Apache Cassandra doesn't support them
for MVs either. Change some test cases to not rely on them.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20171107165138.3176-1-duarte@scylladb.com>
Fixes #2938.
* 'tgrabiec/fix-range-tombstone-list-exception-safety-v1' of github.com:scylladb/seastar-dev:
tests: range_tombstone_list: Add test for exception safety of apply()
tests: Introduce range_tombstone_list assertions
cache: Make range tombstone merging exception-safe
range_tombstone_list: Introduce apply_monotonically()
range_tombstone_list: Make reverter::erase() exception-safe
range_tombstone_list: Fix memory leaks in case of bad_alloc
mutation_partition: Fix abort in case range tombstone copying fails
managed_bytes: Declare copy constructor as allocation point
Integrate with allocation failure injection framework
We don't support non-PK restrictions correctly as explained in commit
3c90607 ("tests/cql_query_test: Fix view creation in
test_duration_restrictions()") and Apache Cassandra doesn't support them
for MVs either. Disable the tests, but don't remove them because they
will be resurrected once CASSANDRA-13832 is fixed.
Message-Id: <1510052422-3478-1-git-send-email-penberg@scylladb.com>
"This patch series adds support for secondary index queries using the
backing index view that's created when CREATE INDEX statement is
executed.
Example:
-- Create keyspace and table:
CREATE KEYSPACE ks WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 1};
CREATE TABLE ks.users (
userid uuid,
name text,
email text,
country text,
PRIMARY KEY (userid)
);
-- Create secondary indexes:
CREATE INDEX ON ks.users (email);
CREATE INDEX ON ks.users (country);
-- Insert some data:
INSERT INTO ks.users (userid, name, email, country) VALUES (uuid(), 'Bondie Easseby', 'beassebyv@house.gov', 'France');
INSERT INTO ks.users (userid, name, email, country) VALUES (uuid(), 'Demetri Curror', 'dcurrorw@techcrunch.com', 'France');
INSERT INTO ks.users (userid, name, email, country) VALUES (uuid(), 'Langston Paulisch', 'lpaulischm@reverbnation.com', 'United States');
INSERT INTO ks.users (userid, name, email, country) VALUES (uuid(), 'Channa Devote', 'cdevote14@marriott.com', 'Denmark');
-- Query on the secondary-index backed non-primary keys:
SELECT * FROM ks.users WHERE email = 'beassebyv@house.gov';
 userid                               | country | email               | name
--------------------------------------+---------+---------------------+----------------
 022238c8-5213-44b5-959e-4e3e1b032f85 | France  | beassebyv@house.gov | Bondie Easseby
(1 rows)
SELECT * FROM ks.users WHERE country = 'France';
 userid                               | country | email                   | name
--------------------------------------+---------+-------------------------+----------------
2152d85a-61f6-4eab-af4d-e7e7d0872319 | France | beassebyv@house.gov | Bondie Easseby
59fddb6d-bfc9-4636-a9a0-85383fd815ee | France | dcurrorw@techcrunch.com | Demetri Curror
Known limitations:
- Only regular column indexes return results. Indexing primary key
components such as clustering keys returns an empty result set because of
index view query partition key serialization issues that will be fixed
in subsequent patches.
- Secondary index queries are not paginated, which can cause problems
for queries that return a large number of rows.
- Multiple restrictions don't work correctly if one of them is backed by
a secondary-index.
- Only one secondary-indexed restriction per query is supported -- other
restrictions are ignored.
- Compound partition keys are not supported.
- ALLOW FILTERING on non-primary key columns does not work correctly
without secondary index (see issue #2200)."
* 'penberg/cql-2i-queries/v2' of github.com:penberg/scylla:
tests/cql_query_test: Add test case for secondary index queries
cql3: Secondary-index backed select statements
index: Fix index view schema when primary key component is indexed
tests/cql_query_test: Fix view creation in test_duration_restrictions()
cql3/restrictions: Add statement_restrictions::index_restrictions() helper
index: Implement index::supports_expression() for EQ operator
cql3: Make operator_type class non-copyable
index: Fix index::supports_expression() operator parameter type
cql3: Implement statement_restriction index validation
Before the patch we appended and queried at the front. Insert at the
front instead, so that writes and reads overlap. This stresses eviction
and population more.
Message-Id: <1506369562-14892-1-git-send-email-tgrabiec@scylladb.com>
"We currently can't insert row entries at an arbitrary
position_in_partition, but only at full keys and after all keys. If a
query range has bounds such that we would have to insert a dummy entry
at a non-representable position, then information about range continuity
will not be fully populated. In particular, a single-row query for a row
which is not present in sstables will miss again when repeated.
The series fixes the problem by marking the whole query range as continuous
by inserting dummy entries at boundaries when necessary.
Refs #2579."
* tag 'tgrabiec/cache-range-continuity-v2' of github.com:scylladb/seastar-dev:
tests: row_cache: Add test for population of single rows
tests: Add test for population of continuity
tests: mutation_reader_assertions: Introduce produces_compacted()
mutation: Introduce apply(mutation_fragment)
cache: Document invariants of cache_streamed_mutation::_lower_bound
cache_streamed_mutation: Special-case population for singular ranges
query: Introduce is_single_row()
cache_streamed_mutation: Increment mispopulation counter when can't populate due to eviction
cache_streamed_mutation: Override continuity of older versions when populating
cache_streamed_mutation: Mark whole query range as continuous
tests: cache_streamed_mutation: Allow creating expected_row at any position_in_partition
cache_streamed_mutation: Populate continuity when range adjacent to non-latest version rows
cache_streamed_mutation: Avoid lookup in maybe_add_to_cache() in more cases
row_cache: Make read_context::key() valid before reading from underlying starts
mutation_partition: Allow creating rows_entry at any clustered position_in_partition
position_in_partition: Do not use -2 and +2 weights
clustering_ranges_walker: Make contains() drop range tombstones adjacent to query range
mutation_partition: Remove delegating_compare()
mvcc: Print iterators in operator<< for partition_snapshot_row_cursor
mvcc: Introduce partition_snapshot_row_weakref
mvcc: Make the null state of partition_snapshot::change_mark explicit
mvcc: Add partition_snapshot::region() getter
mvcc: Add partition_snapshot::schema() getter
position_in_partition: Introduce before_key()
position_in_partition: Introduce min()
position_in_partition: Introduce for_static_row()
When created, the cache registers several metrics. Since an attempt to
register an already existing metric results in an exception being
thrown, it is no longer possible to have two cache instances at the same
time. This is exactly what happens in memory_footprint: one (useless)
cache object is created through a call to do_with_cql_env(), and then
memory_footprint explicitly creates another one (not a useless one).
The test itself doesn't really need a full CQL environment; the only
reason it was added is so that storage_service is initialised and various
code paths can query for the available cluster features. This can be
done in a much more lightweight way using storage_service_for_tests.
Fixes the memory_footprint failure (until the next time we decide there
is nothing wrong with globals).
Message-Id: <20171102160233.6756-1-pdziepak@scylladb.com>
The materialized view created in test_duration_restriction() restricts
on a non-PK column. Since Scylla's ALLOW FILTERING and secondary index
validation path is broken, once we start to do secondary index queries
the query processor thinks there's a secondary index backing that non-PK
column and fails because it's unable to find such a column.
Fix up the view to only trigger the duration type validation error we're
interested in here.
Before this patch only the ranges between returned row fragments were
marked as continuous. In the extreme case there could be no such
fragments, in which case the next read would miss as well. To avoid
this, mark the whole query range as continuous by inserting dummy
entries when necessary.
Refs #2579.
"When reading a single row it is possible that the read will be satisfied
by just reading from one of the data source candidates. To exploit this
an optimization is employed which sorts data source candidates by their
timestamp and reads mutations from the most recent to the oldest. When
all needed cells are present and their earliest timestamp is still
later than the latest timestamp of the remaining data sources, the read
can be terminated early.
However, this optimization can also backfire: since the data sources are
read sequentially, if all of them eventually have to be read we end up
worse off than without it.
Thus the optimization can be disabled up-front, or enabled to run only
until its efficiency degrades below a certain threshold.
Also counters are added to column-families to make it possible to
observe how well it performs.
Benchmarking
Benchmarking was done with the cache disabled, at a constant op rate of
4k (1/3 of the max op rate on my box), against 3 sstables containing the
same 10000 rows.
1) Optimization turned off (all sstables read in parallel)
latency mean : 1.3 [simple:1.3]
latency median : 1.0 [simple:1.0]
latency 95th percentile : 2.4 [simple:2.4]
latency 99th percentile : 2.9 [simple:2.9]
latency 99.9th percentile : 8.0 [simple:8.0]
latency max : 13.5 [simple:13.5]
2) Optimization turned on, best case (1 of 3 sstables read)
latency mean : 0.6 [simple:0.6]
latency median : 0.6 [simple:0.6]
latency 95th percentile : 1.0 [simple:1.0]
latency 99th percentile : 1.2 [simple:1.2]
latency 99.9th percentile : 4.4 [simple:4.4]
latency max : 13.4 [simple:13.4]
3) Optimization turned on, best case, IN query (1 of 3 sstables read)
latency mean : 0.7 [simple_in:0.7]
latency median : 0.6 [simple_in:0.6]
latency 95th percentile : 1.1 [simple_in:1.1]
latency 99th percentile : 1.4 [simple_in:1.4]
latency 99.9th percentile : 5.4 [simple_in:5.4]
latency max : 16.8 [simple_in:16.8]
4) Optimization turned on, worst case (3 of 3 sstables read sequentially)
latency mean : 2.8 [simple:2.8]
latency median : 2.3 [simple:2.3]
latency 95th percentile : 5.4 [simple:5.4]
latency 99th percentile : 6.5 [simple:6.5]
latency 99.9th percentile : 13.5 [simple:13.5]
latency max : 19.2 [simple:19.2]
5) Optimization turned on, mid case (2 of 3 sstables read sequentially)
latency mean : 1.4 [simple:1.4]
latency median : 1.1 [simple:1.1]
latency 95th percentile : 2.7 [simple:2.7]
latency 99th percentile : 3.2 [simple:3.2]
latency 99.9th percentile : 7.7 [simple:7.7]
latency max : 15.1 [simple:15.1]"
Ref #324
* 'bdenes/optimize_single_row_read_v6' of github.com:denesb/scylla:
Add unit tests for single_key_sstable_reader
Add counters for the single-key reader optimization
Add single_key_parallel_scan_threshold option
single_key_sstable_reader: optimize single-row queries
single_key_sstable_reader: move reading code into its own method
Add selects_only_full_rows() and selects_only_full_rows_with_atomic_columns()
query::full_slice doesn't select any regular or static columns, which
is at odds with the expectations of its users. This patch replaces it
with the schema::full_slice() version.
Refs #2885
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1507732800-9448-2-git-send-email-duarte@scylladb.com>
"This changeset is the first step to flattening mutation_reader.
It then introduces new mutation_fragment types for the partition header
and the end of partition. Using those, a new flat_mutation_reader is
defined. Finally, it introduces converters between the new
flat_mutation_reader and the old mutation_reader."
* 'haaawk/flattened_mutation_reader_v12' of github.com:scylladb/seastar-dev:
Add tests for flat_mutation_reader
Introduce conversion from flat_mutation_reader to mutation_reader
Introduce conversion from mutation_reader to flat_mutation_reader
Introduce flat_mutation_reader
Extract FlattenedConsumer concept using GCC6_CONCEPT
Introduce partition_end mutation_fragment
Introduce a position for end of partition
Introduce partition_start mutation_fragment
Introduce FragmentConsumer
Introduce a position for partition start
streamed_mutation: Extract concepts using GCC6_CONCEPT macro
Those tests run the mutation source test for all sources
using conversions to and from flat_mutation_reader.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
For every finished compaction, we were calculating shards for all
existing tables. With ignore_msb set to 0, it's probably not a big
deal, but if ignore_msb is, say, 12 and LCS is used (meaning possibly
thousands of tables), the operation may stall the reactor for a
considerable amount of time. That's fixed by caching shards.
Fixes #2875.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20171011053424.22308-1-raphaelsc@scylladb.com>
This type of mutation_fragment will be used in new mutation_reader
to signal the beginning of the next partition.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
"This series implements CAST AS functions in Scylla.
It allows using expressions of the form CAST(x AS type) in select statements.
The primary motivation for these functions came from aggregate functions,
because avg(.) gives rounded results for integer columns. Now it is possible
to convert such a column to float/double and obtain floating-point results:
SELECT ... avg(cast(x as double)), ...
Fixes #2280."
* 'danfiala/2280-patch-series-v2' of https://github.com/hagrid-the-developer/scylla:
tests: Add test for CAST AS functions.
cql3: Add support for CAST AS functions to ANTLR grammar.
cql3/selectable: Add selectable::with_cast for CAST AS functions.
cql3/functions: Add support for CAST AS functions.
types: Add support for CAST AS functions.
types: Moved code that implements conversion of types' values to string.
"Currently restricting_mutation_reader restricts mutation_readers on a
count basis. This is inaccurate on multiple levels. The reader might be
a combined_mutation_reader, which might be composed of multiple
individual readers, whose number might change during the lifetime of the
reader. The memory consumption of the readers can vary and may change
during the lifetime of the reader as well.
To remedy this, make the restriction memory-consumption based. The
restricting semaphore is now configured with the amount of memory
(bytes) that its readers are allowed to consume in total. New readers
consume 128k units up-front to account for read-ahead buffers, and then
consume additional units for any buffer (returned
from input_stream<>::read()) they keep around.
Like before, readers already allowed to read will not be blocked;
instead, new readers will be blocked on their first read if all the
units are consumed.
Fixes #2692."
* 'bdenes/restricting_mutation_reader-v5' of https://github.com/denesb/scylla:
Update reader restriction related metrics
Add restricted_reader_test unit test
restricted_mutation_reader: restrict based-on memory consumption
mutation_reader.hh: Move restricted_reader related code
"
The original motivation for the "utils: introduce a loading_shared_values" series was hinted handoff
work, where I needed an on-demand, asynchronously loading key-value container (a map from replica
address to a commitlog instance).
It turned out that we already have the classes that do almost what I needed:
- utils::loading_cache
- sstables::shared_index_lists
Therefore it made sense to find a common ground, unify this functionality and reuse the code both in the classes above and in the
new hinted handoff code.
This series introduces the utils::loading_shared_values that generalizes the sstables::shared_index_lists
API on top of bi::unordered_set with the rehashing logic from the utils::loading_cache triggered by an addition
of an entry to the set (PATCH1).
Then it reworks the sstables::shared_index_lists and utils::loading_cache on top of the new class (PATCH2 and PATCH3).
PATCH4 optimizes the loading_cache for the long timer period use case.
But then we discovered another "customer" for the loading_cache. Apparently our prepared statements
cache had a birth flaw: it was unlimited in size, and unless the corresponding keyspace and/or table
is modified or dropped, the entries are never evicted. We clearly need to limit its size, and it
would also make sense to evict cache entries that haven't been used for long enough.
This seems like a perfect match for utils::loading_cache, except that prepared statements don't need
to be reloaded after they are created.
Patches starting from PATCH5 add the missing functionality to utils::loading_cache (such as making
the "reloading" conditional and adding synchronous methods like find(key)) and then transition the
CQL and Thrift prepared statement caches to utils::loading_cache.
This also fixes #2474."
* 'evict_unused_prepared-v5' of https://github.com/vladzcloudius/scylla:
tests: loading_cache_test: initial commit
cql3::query_processor: implement CQL and Thrift prepared statements caches using cql3::prepared_statements_cache
cql3: prepared statements cache on top of loading_cache
utils::loading_cache: make the size limitation more strict
utils::loading_cache: added static_asserts for checking the callbacks signatures
utils::loading_cache: add a bunch of standard synchronous methods
utils::loading_cache: add the ability to create a cache that would not reload the values
utils::loading_cache: add the ability to work with not-copy-constructable values
utils::loading_cache: add EntrySize template parameter
utils::loading_cache: rework on top of utils::loading_shared_values
sstables::shared_index_list: use utils::loading_shared_values
utils: introduce loading_shared_values
Restrict readers based on their memory consumption, instead of the count
of the top-level readers. To do this an interposer is installed at the
input_stream level which tracks buffers emitted by the stream. This way
we can get an accurate picture of the readers' actual memory
consumption.
New readers will consume 16k units from the semaphore up-front, to
account for their own memory consumption apart from the buffers they
will allocate. Creating the reader is deferred until there are enough
resources to create it. As before, only new readers will be blocked on
an exhausted semaphore; existing readers can continue to work.
"Currently restricting_mutation_reader restricts mutation_readers on a
count basis. This is inaccurate on multiple levels. The reader might be
a combined_mutation_reader, which might be composed of multiple
individual readers, whose number might change during the lifetime of the
reader. The memory consumption of the readers can vary and may change
during the lifetime of the reader as well.
To remedy this, make the restriction memory-consumption based. The
restricting semaphore is now configured with the amount of memory
(bytes) that its readers are allowed to consume in total. New readers
consume 128k units up-front to account for read-ahead buffers, and then
consume additional units for any buffer (returned
from input_stream<>::read()) they keep around.
Like before, readers already allowed to read will not be blocked;
instead, new readers will be blocked on their first read if all the
units are consumed."
Fixes #2692.
* 'bdenes/restricting_mutation_reader-v4' of https://github.com/denesb/scylla:
Update reader restriction related metrics
Add restricted_reader_test unit test
restricted_mutation_reader: restrict based-on memory consumption
mutation_reader.hh: Move restricted_reader related code