"
Fixes #3574
This series adds missing multi-column restrictions filtering to CQL.
The underlying infrastructure already allows checking multi-column
restrictions in a reasonable way, so this series consists of mostly
adding simple interfaces and parameters.
Also, unit test cases for multi-column restrictions are provided.
Tests: unit (dev)
"
* 'add_multi_column_restrictions_filtering_3' of https://github.com/psarna/scylla:
tests: add multi-column filtering tests
cql3: add multi-column restrictions filtering
cql3: add specified is_satisfied_by to multi-column restriction
cql3: rewrite raw loop in is_satisfied_by to boost::any_of
cql3: fix is_satisfied_by for multi-column restrictions
cql3: add missing include to multi-column restriction
Multi-column restrictions need only the schema, the clustering key and
the query options in order to decide if they are satisfied, so an
overloaded function that takes a reduced number of parameters is added.
A multi-column restriction should be satisfied by a value if any of
the ranges contains it; it does not have to be contained in all of them.
Example: SELECT * FROM t WHERE (a,b) IN ((1,2),(1,3))
will operate on two singular ranges: [(1,2),(1,2)] and [(1,3),(1,3)].
It's sufficient for a value to be inside any of these two in order
to satisfy the restriction.
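The any-versus-all semantics above can be sketched in a few lines of
Python. This is only an illustration, not the cql3 implementation: the
function name mirrors is_satisfied_by, but the range representation (a
pair of inclusive tuple bounds) and the tuple comparison are assumptions
made for the example.

```python
# Illustrative model only: a range is a (low, high) pair of inclusive
# clustering-prefix bounds, compared as plain Python tuples.
def is_satisfied_by(ranges, value):
    """Return True if at least one range contains the value."""
    return any(low <= value <= high for low, high in ranges)


# WHERE (a,b) IN ((1,2),(1,3)) yields two singular ranges.
ranges = [((1, 2), (1, 2)), ((1, 3), (1, 3))]

matches_first = is_satisfied_by(ranges, (1, 2))
matches_second = is_satisfied_by(ranges, (1, 3))
matches_none = is_satisfied_by(ranges, (1, 4))
```

Requiring all ranges to contain the value would reject (1,2) here,
because it is outside the second singular range.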
"
As part of implementing the sstables manager and fixing an issue related
to updating large_data_handler on all delete paths, we want to funnel
all sstable creations, loading, and deletions through a manager.
The patchset lays out test infrastructure to funnel these operations
through class sstables::test_env.
In the process, it cleans up numerous call sites in the existing
unit tests that evolved over time.
Refs #4198
Refs #4149
Tests: unit (dev)
"
* 'projects/test_env/v3' of https://github.com/bhalevy/scylla:
tests: introduce sstables::test_env
tests: perf_sstable: rename test_env
tests: sstable_datafile_test: use useable_sst
tests: sstable_test: add write_and_validate_sst helper
tests: sstable_test: add test_using_reusable_sst helper
tests: sstable_test: use reusable_sst where possible
tests: sstable_test: add test_using_working_sst helper
tests: sstable_3_x_test: make_test_sstable
tests: run_sstable_resharding_test: use default parameters to make_sstable
tests: sstables::test::make_test_sstable: reorder params
tests: test_setup: do_with_test_directory is unused
tests: move sstable_resharding_strategy_tests to sstable_reharding_test
tests: move create_token_from_key helpers to test_services
tests: move column_family_for_tests to test_services
dht: move declaration of default_partitioner from sstable_datafile_test to i_partitioner.hh
Allow the --mode argument to ./configure.py and ./test.py to be repeated. This
is to allow continuous integration to configure only debug and release, leaving dev
to developers.
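A repeatable option of this kind is commonly expressed with argparse's
"append" action. The sketch below is not taken from configure.py; the
helper name parse_modes, the dest name and the "no flag means all modes"
fallback are assumptions made for illustration.

```python
import argparse

# Mode names as mentioned in this message; treated here as the full set.
ALL_MODES = ["debug", "release", "dev"]


def parse_modes(argv):
    """Collect every --mode occurrence; fall back to all modes if none given."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--mode", action="append", choices=ALL_MODES,
                        dest="modes", help="build mode, may be repeated")
    args = parser.parse_args(argv)
    return args.modes if args.modes else list(ALL_MODES)


# What a CI invocation with two modes would select.
ci_modes = parse_modes(["--mode", "debug", "--mode", "release"])
```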
Message-Id: <20190214162736.16443-1-avi@scylladb.com>
"
This series introduces PER PARTITION LIMIT to CQL.
The protocol and storage are already capable of applying per-partition limits,
so for non-paged queries the changes are superficial: a variable is parsed
and passed down.
For paged queries and filtering the situation is a little bit more complicated
due to corner cases: results for one partition can be split over 2 or more pages,
filtering may drop rows, etc. To solve these, another variable is added to paging
state - the number of rows already returned from last served partition.
Note that the "last" partition may be stretched over any number of pages, not just the
last one, which is the case especially when filtering is involved.
As a result, per-partition-limiting queries are not eligible for page generator
optimization, because they may need to have their results locally filtered
for extraneous rows (e.g. when the next page asks for per-partition limit 5,
but we already received 4 rows from the last partition, so need just 1 more
from last partition key, but 5 from all next ones).
Tests: unit (dev)
Fixes #2202
"
* 'add_per_partition_limit_3' of https://github.com/psarna/scylla:
tests: remove superficial ignore_order from filtering tests
tests: add filtering with per partition key limit test
tests: publish extract_paging_state and count_rows_fetched
tests: fix order of parameters in with_rows_ignore_order
cql3,grammar: add PER PARTITION LIMIT
idl,service: add persistent last partition row count
cql3: prevent page generator usage for per-partition limit
cql3: add checking for previous partition count to filtering
pager: add adjusting per-partition row limit
cql3: obey per partition limit for filtering
cql3: clean up unneeded limit variables
cql3: obey per partition limit for select statement
cql3: add get_per_partition_limit
cql3: add per_partition_limit to CQL statement
Python 3.6 is the first version whose json.loads() accepts bytes, so
older Python 3 versions fail with the following error:
Traceback (most recent call last):
File "/usr/lib/scylla/scylla-housekeeping", line 175, in <module>
args.func(args)
File "/usr/lib/scylla/scylla-housekeeping", line 121, in check_version
raise e
File "/usr/lib/scylla/scylla-housekeeping", line 116, in check_version
versions = get_json_from_url(version_url + params)
File "/usr/lib/scylla/scylla-housekeeping", line 55, in get_json_from_url
return json.loads(data)
File "/usr/lib64/python3.4/json/__init__.py", line 312, in loads
s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'
To support those older Python versions, decode the bytes read as UTF-8
into a string before calling json.loads().
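A minimal sketch of the conversion, assuming the response body arrives
as bytes. The helper name parse_json_response and the sample payload are
hypothetical and are not the code in scylla-housekeeping.

```python
import json


def parse_json_response(data):
    """Decode a bytes payload as UTF-8 before handing it to json.loads()."""
    if isinstance(data, bytes):
        # json.loads() on Python < 3.6 only accepts str.
        data = data.decode("utf-8")
    return json.loads(data)


# Hypothetical payload, used only to exercise the helper.
parsed = parse_json_response(b'{"version": "3.0"}')
```

Decoding explicitly behaves the same on Python 3.6 and later, so no
version check is needed.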
Fixes #4239
Branches: master, 3.0
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20190218112312.24455-1-amnon@scylladb.com>
When reporting a failure, expected rows were mixed up with received
rows. Also, the message assumed that more rows were received, but there
may just as well be fewer, so now it reports a "different number" of rows.
In order to process paged queries with per-partition limits properly,
paging state needs to keep additional information: the row count of
the last partition returned in the previous run.
That's necessary because the end of previous page and the beginning
of current one might consist of rows with the same partition key
and we need to be able to trim the results to the number indicated
by per-partition limit.
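The trimming described above can be sketched as follows. This is a
simplified model and not the pager code: rows are (partition_key, row)
pairs, and the names trim_page, last_pkey and last_count are invented
for the example.

```python
def trim_page(rows, per_partition_limit, last_pkey=None, last_count=0):
    """Drop rows that exceed the per-partition limit, continuing the count
    carried over from the previous page.

    Returns (kept_rows, last_pkey, last_count) so the caller can store the
    last two values in the paging state for the next page.
    """
    kept = []
    current, count = last_pkey, last_count
    for pkey, row in rows:
        if pkey != current:
            # A new partition starts: reset the per-partition counter.
            current, count = pkey, 0
        if count < per_partition_limit:
            kept.append((pkey, row))
            count += 1
    return kept, current, count


# Page 1 ends with 4 rows of partition "a"; the limit is 5.
page1, pkey, count = trim_page([("a", i) for i in range(4)], 5)
# Page 2 may keep only 1 more row of "a", but up to 5 rows of "b".
page2, pkey, count = trim_page(
    [("a", 4), ("a", 5)] + [("b", i) for i in range(6)], 5, pkey, count)
```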
Paged queries that use per-partition limits cannot use the
page generator optimization, as sometimes the results need
to be filtered for extraneous rows on page breaks.
Filtering now needs to take into account per partition limits as well,
and for that it's essential to be able to compare partition keys
and decide which rows should be dropped - if previous page(s) contained
rows with the same partition key, these need to be taken into
consideration too.
For filtering pagers, per partition limit should be set
to page size every time a query is executed, because some rows
may potentially get dropped from results.
Part of the code is already implemented (counters and hinted-handoff).
Part of the code will probably never be (triggers). And the rest is
the code that estimates number of rows per range to determine query
parallelism, but we implemented exponential growth algorithms instead.
Message-Id: <20190214112226.GE19055@scylladb.com>
"
get_restricted_ranges() is inefficient since it calculates all the
vnodes that cover the requested key ranges in advance, but callers often
use only the first one. Replace the function with a generator interface
that generates the requested number of vnodes on demand.
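The difference between the eager and the on-demand approach can be
sketched with a Python generator. This is a toy model, not the
storage_proxy code: tokens are plain integers, ring wrap-around is
ignored, and vnode_ranges is a hypothetical name.

```python
from itertools import islice


def vnode_ranges(start, end, ring_tokens):
    """Lazily split the range (start, end] at the ring tokens inside it.

    ring_tokens is assumed to be sorted in ascending order. Nothing is
    computed until the caller asks for the next sub-range.
    """
    low = start
    for token in ring_tokens:
        if token <= low:
            continue
        if token >= end:
            break
        yield (low, token)
        low = token
    yield (low, end)


gen = vnode_ranges(0, 100, [10, 20, 30, 200])
first = list(islice(gen, 1))      # a caller that needs only one vnode
next_two = list(islice(gen, 2))   # ask for more only when required
rest = list(gen)
```

A caller that stops after the first sub-range never pays for the
remaining splits.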
"
* 'gleb/query_ranges_to_vnodes_generator' of github.com:scylladb/seastar-dev:
storage_proxy: limit amount of precaclulated ranges by query_ranges_to_vnodes_generator
storage_proxy: remove old get_restricted_ranges() interface
cql3/statements/select_statement: convert index query interface to new query_ranges_to_vnodes_generator interface
tests: convert storage_proxy test to new query_ranges_to_vnodes_generator interface
storage_proxy: convert range query path to new query_ranges_to_vnodes_generator interface
storage_proxy: introduce new query_ranges_to_vnode_generator interface
Give the constant 1024*1024 introduced in an earlier commit a name,
"batch_memory_max", and move it from view.cc to view_builder.hh.
It now resides next to the pre-existing constant that controlled how
many rows were read in each build step, "batch_size".
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190217100222.15673-1-nyh@scylladb.com>
* seastar 11546d4...2313dec (6):
> Deprecate thread_scheduling_group in favor of scheduling_group
> Merge "Fixes for Doxygen documentation" from Jesse
> future: optionally type-erase future::then() and future::then_wrapped
> build: Allow deprecated declarations internally
> rpc: fix insertion of server connections into server's container
> rpc: split BOOST_REQUIRE with long conditions into multiple
read_exactly(), when given a stream that does not contain the amount of data
requested, will loop endlessly, allocating more and more memory as it does, until
it fails with an exception (at which point it will release the memory).
Fix by returning an empty result, like input_stream::read_exactly() (which it
replaces). Add a test case that fails without the fix.
Affected callers are the native transport, commitlog replay, and internal
deserialization.
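The fixed behaviour can be modelled in Python as below. This is only an
analogy for the semantics (stop at end of stream and return an empty
result), not the C++ implementation; the stream is any object with a
read(n) method, such as io.BytesIO.

```python
import io


def read_exactly(stream, n):
    """Return exactly n bytes, or b"" if the stream ends before that."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # End of stream: give up instead of looping forever.
            return b""
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


complete = read_exactly(io.BytesIO(b"abcdef"), 4)
truncated = read_exactly(io.BytesIO(b"ab"), 4)
```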
Fixes #4233.
Branches: master, branch-3.0
Tests: unit(dev)
Message-Id: <20190216150825.14841-1-avi@scylladb.com>
When yum-utils is already installed on Fedora, 'yum install dnf-utils'
causes a conflict and fails.
We should show a descriptive message instead of just letting dnf print
an error message.
Fixes #4215
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190215221103.2379-1-syuu@scylladb.com>
When bootstrapping, a node should wait to have a schema agreement
with its peers, before it can join the ring. This is to ensure it can
immediately accept writes. Failing to reach schema agreement before
joining is not fatal, as the node can pull unknown schemas on writes
on-demand. However, if such a schema contains references to UDFs, the
node will reject writes using it, due to #3760.
To ensure that schema agreement is reached before joining the ring,
`storage_service::join_token_ring()` has two checks. First it checks that
at least one peer was connected previously. For this it compares
`database::get_version()` with `database::empty_version`. The (implied)
assumption is that this will become something other than
`database::empty_version` only after having connected (and pulled
schemas from) at least one peer. This assumption doesn't hold anymore,
as we now set the version earlier in the boot process.
The second check verifies that we have the same schema version as all
known, live peers. This check assumes (since 3e415e2) that we have
already "met" all (or at least some) of our peers and if there is just
one known node (us) it concludes that this is a single-node cluster,
which automatically has schema agreement.
It's easy to see how these two checks will fail. The first fails to
ensure that we have met our peers, and the second wrongfully concludes
that we are a one-node cluster, and hence have schema agreement.
To fix this, modify the first check. Instead of relying on the presence
of a non-empty database version, supposedly implying that we already
talked to our peers, explicitly make sure that we have really talked to
*at least* one other node, before proceeding to the second check, which
will now do the correct thing, actually checking the schema versions.
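The order of the two checks after the fix can be sketched as below.
This is a schematic model for a bootstrapping node only and is not the
storage_service code: the function name, its parameters and the
representation of schema versions as strings are all assumptions, and
the genuine single-node cluster case is not modelled.

```python
def has_schema_agreement(contacted_peers, local_version, live_peer_versions):
    """Model of the two checks performed before joining the ring.

    contacted_peers: set of peers we have really talked to.
    live_peer_versions: mapping of known live peer -> schema version.
    """
    # First check: we must have talked to at least one other node.
    if not contacted_peers:
        return False
    # Second check: every known live peer has our schema version.
    return all(version == local_version
               for version in live_peer_versions.values())


# No peer contacted yet: an empty peer list must not count as agreement.
too_early = has_schema_agreement(set(), "v1", {})
agreed = has_schema_agreement({"n2"}, "v1", {"n2": "v1", "n3": "v1"})
diverged = has_schema_agreement({"n2"}, "v1", {"n2": "v1", "n3": "v2"})
```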
Fixes: #4196
Branches: 3.0, 2.3
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <40b95b18e09c787e31ba6c5519fb64d68b4ca32e.1550228389.git.bdenes@scylladb.com>
The included testcase used to crash because during database::stop() we
would try to update system.large_partition.
There doesn't seem to be an order we can stop the existing services in
cql_test_env that makes this possible.
This patch then adds another step when shutting down a database: first
stop updating system.large_partition.
This means that during shutdown any memtable flush, compaction or
sstable deletion will not be reflected in system.large_partition. This
is hopefully not too bad since the data in the table is TTLed.
This seems to impact only tests, since main.cc calls _exit directly.
Tests: unit (release,debug)
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190213194851.117692-1-espindola@scylladb.com>
-Og is advertised as debug-friendly optimization, both in compile time
and debug experience. It also cuts sstable_mutation_test run time in half:
Changing -O0 to -Og
Before:
real 16m49.441s
user 16m34.641s
sys 0m10.490s
After:
real 8m38.696s
user 8m26.073s
sys 0m10.575s
Message-Id: <20190214205521.19341-1-avi@scylladb.com>
In preparation for providing a default large_data_handler in
a test-standard way.
The buffer_size parameter is reordered and now has a default value,
the same as make_sstable()'s.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>