Commit Graph

18026 Commits

Author SHA1 Message Date
Asias He
02ddfa393e repair: Rename request_sync_boundary to get_sync_boundary
Make it consistent with the row level repair rpc verb.
2019-02-25 15:13:39 +08:00
Avi Kivity
84465c23c4 Merge "Add multi-column restrictions filtering" from Piotr
"
Fixes #3574

This series adds missing multi-column restrictions filtering to CQL.
The underlying infrastructure already allows checking multi-column
restrictions in a reasonable way, so this series consists of mostly
adding simple interfaces and parameters.
Also, unit test cases for multi-column restrictions are provided.

Tests: unit (dev)
"

* 'add_multi_column_restrictions_filtering_3' of https://github.com/psarna/scylla:
  tests: add multi-column filtering tests
  cql3: add multi-column restrictions filtering
  cql3: add specified is_satisfied_by to multi-column restriction
  cql3: rewrite raw loop in is_satisfied_by to boost::any_of
  cql3: fix is_satisfied_by for multi-column restrictions
  cql3: add missing include to multi-column restriction
2019-02-19 14:42:14 +02:00
Piotr Sarna
9432937816 tests: add multi-column filtering tests
Refs #3574
2019-02-19 13:24:25 +01:00
Piotr Sarna
4dc0b0672c cql3: add multi-column restrictions filtering
It's now possible to pass multi-column restrictions
to queries that require filtering.

Fixes #3574
2019-02-19 13:24:25 +01:00
Piotr Sarna
3db526ffe2 cql3: add specified is_satisfied_by to multi-column restriction
Multi-column restrictions need only the schema, clustering key and query
options in order to decide if they are satisfied, so an overloaded
function that takes a reduced number of parameters is added.
2019-02-19 13:24:25 +01:00
Piotr Sarna
16dbc917a4 cql3: rewrite raw loop in is_satisfied_by to boost::any_of 2019-02-19 13:24:12 +01:00
Piotr Sarna
0d675e4419 cql3: fix is_satisfied_by for multi-column restrictions
Multi-column restriction should be satisfied by the value
if any of the ranges contains it, not all of them.
Example: SELECT * FROM t WHERE (a,b) IN ((1,2),(1,3))
will operate on two singular ranges: [(1,2),(1,2)] and [(1,3),(1,3)].
It's sufficient for a value to be inside any of these two in order
to satisfy the restriction.
2019-02-19 13:10:58 +01:00
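The any-versus-all distinction described in commit 0d675e4419 can be illustrated with a minimal Python sketch. This is not the cql3 C++ code: the helper names `in_range` and `is_satisfied_by` and the tuple-based range representation are assumptions made for illustration only.

```python
def in_range(value, lo, hi):
    """Inclusive range check; Python compares tuples lexicographically."""
    return lo <= value <= hi


def is_satisfied_by(value, ranges):
    """A value satisfies the restriction if ANY of the ranges contains it."""
    return any(in_range(value, lo, hi) for lo, hi in ranges)


# (a, b) IN ((1, 2), (1, 3)) turns into two singular ranges.
ranges = [((1, 2), (1, 2)), ((1, 3), (1, 3))]

# Requiring ALL ranges to match could never hold for two distinct singular
# ranges, which is the behaviour the commit message describes as wrong.
all_match = all(in_range((1, 2), lo, hi) for lo, hi in ranges)
```

With these definitions, `(1, 2)` and `(1, 3)` satisfy the restriction while `(1, 4)` does not, and `all_match` is false even for a value that is listed in the IN clause.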
Avi Kivity
934ba7ccb2 Merge "tests: introduce test environment and cleanup sstable tests" from Benny
"
As part of implementing sstables manager and fixing issue related
to updating large_data_handler on all delete paths, we want to funnel
all sstable creations, loading, and deletions through a manager.

The patchset lays out test infrastructure to funnel these operations
through class sstables::test_env.

In the process, it cleans up numerous call sites in the existing
unit tests that evolved over time.

Refs #4198
Refs #4149

Tests: unit (dev)
"

* 'projects/test_env/v3' of https://github.com/bhalevy/scylla:
  tests: introduce sstables::test_env
  tests: perf_sstable: rename test_env
  tests: sstable_datafile_test: use useable_sst
  tests: sstable_test: add write_and_validate_sst helper
  tests: sstable_test: add test_using_reusable_sst helper
  tests: sstable_test: use reusable_sst where possible
  tests: sstable_test: add test_using_working_sst helper
  tests: sstable_3_x_test: make_test_sstable
  tests: run_sstable_resharding_test: use default parameters to make_sstable
  tests: sstables::test::make_test_sstable: reorder params
  tests: test_setup: do_with_test_directory is unused
  tests: move sstable_resharding_strategy_tests to sstable_reharding_test
  tests: move create_token_from_key helpers to test_services
  tests: move column_family_for_tests to test_services
  dht: move declaration of default_partitioner from sstable_datafile_test to i_partitioner.hh
2019-02-19 11:26:42 +02:00
Piotr Sarna
4eecb57a0b cql3: add missing include to multi-column restriction 2019-02-19 10:24:31 +01:00
Tomasz Grabiec
9c6f897731 tools/toolchain/README: Add the "Troubleshooting" section
Message-Id: <1550567863-29404-1-git-send-email-tgrabiec@scylladb.com>
2019-02-19 11:21:02 +02:00
Tzach Livyatan
622361bf1a docs/docker-hub.md: Docker Compose cluster example
This adds a simple example of launching a 3-node Scylla cluster with
Docker Compose.

Signed-off-by: Tzach Livyatan <tzach@scylladb.com>
[ penberg: minor edits ]
Message-Id: <20190213081003.6401-1-tzach@scylladb.com>
2019-02-19 09:52:20 +02:00
Avi Kivity
e37e095432 build: allow configuring and testing multiple modes
Allow the --mode argument to ./configure.py and ./test.py to be repeated. This
is to allow continuous integration to configure only debug and release, leaving dev
to developers.
Message-Id: <20190214162736.16443-1-avi@scylladb.com>
2019-02-18 15:52:25 +00:00
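A repeatable command-line option like the one described in commit e37e095432 can be sketched with the standard `argparse` module. This is only an illustration: the function name `parse_modes`, the mode list, and the fallback to all modes when no `--mode` is given are assumptions, not a copy of what configure.py or test.py do.

```python
import argparse


def parse_modes(argv, all_modes=("debug", "release", "dev")):
    """Collect every --mode occurrence; fall back to all modes (assumed default)."""
    parser = argparse.ArgumentParser()
    # action="append" gathers repeated occurrences of the flag into a list.
    parser.add_argument("--mode", action="append", choices=all_modes, dest="modes")
    args = parser.parse_args(argv)
    return args.modes or list(all_modes)
```

Called as `parse_modes(["--mode", "debug", "--mode", "release"])`, the sketch selects only debug and release, which matches the continuous integration use case named in the commit message.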
Duarte Nunes
6e83457b1b Merge 'Add PER PARTITION LIMIT' from Piotr
"
This series introduces PER PARTITION LIMIT to CQL.
Protocol and storage are already capable of applying per-partition limits,
so for nonpaged queries the changes are superficial - a variable is parsed
and passed down.
For paged queries and filtering the situation is a little bit more complicated
due to corner cases: results for one partition can be split over 2 or more pages,
filtering may drop rows, etc. To solve these, another variable is added to paging
state - the number of rows already returned from last served partition.
Note that the "last" partition may be stretched over any number of pages, not just the
last one, which is the case especially when considering filtering.
As a result, per-partition-limiting queries are not eligible for page generator
optimization, because they may need to have their results locally filtered
for extraneous rows (e.g. when the next page asks for per-partition limit 5,
but we already received 4 rows from the last partition, so need just 1 more
from last partition key, but 5 from all next ones).

Tests: unit (dev)

Fixes #2202
"

* 'add_per_partition_limit_3' of https://github.com/psarna/scylla:
  tests: remove superficial ignore_order from filtering tests
  tests: add filtering with per partition key limit test
  tests: publish extract_paging_state and count_rows_fetched
  tests: fix order of parameters in with_rows_ignore_order
  cql3,grammar: add PER PARTITION LIMIT
  idl,service: add persistent last partition row count
  cql3: prevent page generator usage for per-partition limit
  cql3: add checking for previous partition count to filtering
  pager: add adjusting per-partition row limit
  cql3: obey per partition limit for filtering
  cql3: clean up unneeded limit variables
  cql3: obey per partition limit for select statement
  cql3: add get_per_partition_limit
  cql3: add per_partition_limit to CQL statement
2019-02-18 14:47:11 +00:00
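The paging corner case described in merge 6e83457b1b (4 rows already served from the last partition, limit 5, so only 1 more row is due from it) can be traced with a small Python sketch. It is a simplified model, not the pager or the paging-state format: the function name, the `(partition_key, row)` pairs and the assumption that rows of one partition arrive contiguously are all illustrative.

```python
def apply_per_partition_limit(rows, limit, last_key=None, last_count=0):
    """Trim one page of (partition_key, row) pairs to a per-partition limit.

    last_key / last_count model the extra paging state: the partition served
    last on the previous page and how many of its rows were already returned.
    Returns (kept_rows, new_last_key, new_last_count).
    """
    kept = []
    key, count = last_key, last_count
    for pk, row in rows:
        if pk != key:
            # A new partition starts: reset the running row count.
            key, count = pk, 0
        if count < limit:
            kept.append((pk, row))
            count += 1
    return kept, key, count


# Previous page ended inside partition "a" after returning 4 of its rows.
page = [("a", i) for i in range(3)] + [("b", i) for i in range(6)]
kept, key, count = apply_per_partition_limit(page, 5, last_key="a", last_count=4)
```

Here `kept` holds one row for partition "a" and five for partition "b", and the returned `key` and `count` are what the next page would need to carry forward.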
Amnon Heiman
750b76b1de scylla-housekeeping: Read JSON as UTF-8 string for older Python 3 compatibility
Python 3.6 is the first version in which json.loads() accepts bytes,
so passing bytes causes the following error on older Python 3 versions:

  Traceback (most recent call last):
    File "/usr/lib/scylla/scylla-housekeeping", line 175, in <module>
      args.func(args)
    File "/usr/lib/scylla/scylla-housekeeping", line 121, in check_version
      raise e
    File "/usr/lib/scylla/scylla-housekeeping", line 116, in check_version
      versions = get_json_from_url(version_url + params)
    File "/usr/lib/scylla/scylla-housekeeping", line 55, in get_json_from_url
      return json.loads(data)
    File "/usr/lib64/python3.4/json/__init__.py", line 312, in loads
      s.__class__.__name__))
  TypeError: the JSON object must be str, not 'bytes'

To support those older Python versions, decode the bytes read as UTF-8
strings before calling json.loads().

Fixes #4239
Branches: master, 3.0

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20190218112312.24455-1-amnon@scylladb.com>
2019-02-18 14:52:32 +02:00
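The fix described in commit 750b76b1de amounts to decoding the payload before parsing. A minimal sketch follows, assuming a hypothetical helper named `parse_json_payload`; the actual change in scylla-housekeeping may be structured differently.

```python
import json


def parse_json_payload(data):
    """Parse a JSON payload that may arrive as bytes or as str."""
    if isinstance(data, bytes):
        # json.loads() on Python < 3.6 rejects bytes, so decode explicitly.
        data = data.decode("utf-8")
    return json.loads(data)
```

Decoding explicitly keeps the behaviour the same on Python 3.6 and later, where `json.loads()` would also accept the bytes directly.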
Piotr Sarna
5ad5221ce1 tests: remove superficial ignore_order from filtering tests
Testing filtering with LIMIT used the with_rows_ignore_order function,
while it's better to use the simpler with_rows.
2019-02-18 11:06:44 +01:00
Piotr Sarna
5f67a501ec tests: add filtering with per partition key limit test 2019-02-18 11:06:44 +01:00
Piotr Sarna
a84e237177 tests: publish extract_paging_state and count_rows_fetched
These local lambda functions will be reused, so they are promoted
to static functions.
2019-02-18 11:06:44 +01:00
Piotr Sarna
824e9dc352 tests: fix order of parameters in with_rows_ignore_order
When reporting a failure, expected rows were mixed up with received
rows. Also, the message assumed it received more rows, but it can
just as well be fewer, so now it reports a "different number" of rows.
2019-02-18 11:06:44 +01:00
Piotr Sarna
3e4f065847 cql3,grammar: add PER PARTITION LIMIT
Select statements now allow passing a PER PARTITION LIMIT (?) directive,
which will trim results for each partition accordingly.
2019-02-18 11:06:44 +01:00
Piotr Sarna
acf7bedad4 idl,service: add persistent last partition row count
In order to process paged queries with per-partition limits properly,
paging state needs to keep additional information: the row
count of the last partition returned in the previous run.
That's necessary because the end of previous page and the beginning
of current one might consist of rows with the same partition key
and we need to be able to trim the results to the number indicated
by per-partition limit.
2019-02-18 11:06:44 +01:00
Piotr Sarna
3a2b004f02 cql3: prevent page generator usage for per-partition limit
Paged queries that induce per-partition limits cannot use
page generator optimization, as sometimes the results need
to be filtered for extraneous rows on page breaks.
2019-02-18 11:06:44 +01:00
Piotr Sarna
1dadae212a cql3: add checking for previous partition count to filtering
Filtering now needs to take into account per partition limits as well,
and for that it's essential to be able to compare partition keys
and decide which rows should be dropped - if previous page(s) contained
rows with the same partition key, these need to be taken into
consideration too.
2019-02-18 11:06:43 +01:00
Piotr Sarna
82a3883575 pager: add adjusting per-partition row limit
For filtering pagers, per partition limit should be set
to page size every time a query is executed, because some rows
may potentially get dropped from results.
2019-02-18 10:55:52 +01:00
Piotr Sarna
b965c3778f cql3: obey per partition limit for filtering
Filtering queries now take into account the limit of rows
per single partition provided by the user.
2019-02-18 10:29:34 +01:00
Piotr Sarna
b3aa939cde cql3: clean up unneeded limit variables
Some places extracted a `limit` variable to be captured by lambdas,
but they were not used inside them.
2019-02-18 10:29:34 +01:00
Piotr Sarna
cfb6e9c79c cql3: obey per partition limit for select statement
Select statement now takes into account the limit of rows
per single partition provided by the user.
2019-02-18 10:29:34 +01:00
Piotr Sarna
41b466246e cql3: add get_per_partition_limit 2019-02-18 10:29:34 +01:00
Piotr Sarna
93786a9148 cql3: add per_partition_limit to CQL statement
Select statements can now accept per_partition_limit variable.
2019-02-18 10:29:34 +01:00
Gleb Natapov
b01a659014 storage_proxy: remove old Cassandra code
Part of the code is already implemented (counters and hinted-handoff).
Part of the code will probably never be (triggers). And the rest is
the code that estimates number of rows per range to determine query
parallelism, but we implemented exponential growth algorithms instead.

Message-Id: <20190214112226.GE19055@scylladb.com>
2019-02-18 10:34:55 +02:00
Avi Kivity
a1567b0997 Merge "replace get_restricted_ranges() function with generator interface" from Gleb
"
get_restricted_ranges() is inefficient since it calculates all
vnodes that cover the requested key ranges in advance, but callers often
use only the first one.  Replace the function with a generator interface
that generates the requested number of vnodes on demand.
"

* 'gleb/query_ranges_to_vnodes_generator' of github.com:scylladb/seastar-dev:
  storage_proxy: limit amount of precaclulated ranges by query_ranges_to_vnodes_generator
  storage_proxy: remove old get_restricted_ranges() interface
  cql3/statements/select_statement: convert index query interface to new query_ranges_to_vnodes_generator interface
  tests: convert storage_proxy test to new query_ranges_to_vnodes_generator interface
  storage_proxy: convert range query path to new query_ranges_to_vnodes_generator interface
  storage_proxy: introduce new query_ranges_to_vnode_generator interface
2019-02-18 10:33:54 +02:00
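The motivation for merge a1567b0997 (compute vnode ranges on demand instead of all in advance) can be shown with a Python generator. This is a simplified sketch, not the query_ranges_to_vnodes_generator interface: the function name, integer tokens, a sorted `ring_tokens` list, `start < end`, and the absence of ring wrap-around are all assumptions.

```python
import bisect
import itertools


def ranges_to_vnodes(start, end, ring_tokens):
    """Lazily split the token range (start, end] at ring token boundaries."""
    i = bisect.bisect_right(ring_tokens, start)
    lo = start
    while i < len(ring_tokens) and ring_tokens[i] < end:
        # Each sub-range is produced only when the caller asks for it.
        yield (lo, ring_tokens[i])
        lo = ring_tokens[i]
        i += 1
    yield (lo, end)


ring = [10, 20, 30, 40]
# A caller that needs only the first two sub-ranges never computes the rest.
first_two = list(itertools.islice(ranges_to_vnodes(5, 35, ring), 2))
```

A caller that only wants the first sub-range can call `next()` on the generator once, so the cost is proportional to what is consumed rather than to the size of the ring.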
Avi Kivity
497367f9f7 Revert "build: switch debug mode from -O0 to -Og"
This reverts commit e988521b89. It triggers a bug
in gcc variable tracking, and there are reports it significantly slows down
compilation.
2019-02-17 18:32:28 +02:00
Nadav Har'El
05db7d8957 Materialized views: name the "batch_memory_max" constant
Give the constant 1024*1024 introduced in an earlier commit a name,
"batch_memory_max", and move it from view.cc to view_builder.hh.
It now resides next to the pre-existing constant that controlled how
many rows were read in each build step, "batch_size".

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190217100222.15673-1-nyh@scylladb.com>
2019-02-17 13:28:16 +00:00
Avi Kivity
7b411e30a9 Update seastar submodule
* seastar 11546d4...2313dec (6):
  > Deprecate thread_scheduling_group in favor of scheduling_group
  > Merge "Fixes for Doxygen documentation" from Jesse
  > future: optionally type-erase future::then() and future::then_wrapped
  > build: Allow deprecated declarations internally
  > rpc: fix insertion of server connections into server's container
  > rpc: split BOOST_REQUIRE with long conditions into multiple
2019-02-16 22:27:34 +02:00
Avi Kivity
03531c2443 fragmented_temporary_buffer: fix read_exactly() during premature end-of-stream
read_exactly(), when given a stream that does not contain the amount of data
requested, will loop endlessly, allocating more and more memory as it does, until
it fails with an exception (at which point it will release the memory).

Fix by returning an empty result, like input_stream::read_exactly() (which it
replaces). Add a test case that fails without the fix.

Affected callers are the native transport, commitlog replay, and internal
deserialization.

Fixes #4233.

Branches: master, branch-3.0
Tests: unit(dev)
Message-Id: <20190216150825.14841-1-avi@scylladb.com>
2019-02-16 17:06:19 +00:00
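The behaviour described in commit 03531c2443 (return an empty result when the stream ends early, instead of looping) can be modelled in a few lines of Python. This is an analogy over a blocking file-like object, not the Seastar-based C++ implementation; the function name is reused only for readability.

```python
import io


def read_exactly(stream, n):
    """Read exactly n bytes; return b"" if the stream ends before that."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # Premature end-of-stream: give up instead of looping forever.
            return b""
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


complete = read_exactly(io.BytesIO(b"abcdef"), 4)
truncated = read_exactly(io.BytesIO(b"ab"), 4)
```

The explicit check for an empty chunk is the part that corresponds to the fix: without it, a short stream would keep the loop running with `remaining` never reaching zero.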
Takuya ASADA
af988a5360 install-dependencies.sh: show description when 'yum-utils' package is installed on Fedora
When yum-utils is already installed on Fedora, 'yum install dnf-utils' causes
a conflict and fails.
We should show a descriptive message instead of just triggering a dnf error
message.

Fixes #4215

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190215221103.2379-1-syuu@scylladb.com>
2019-02-16 17:16:18 +02:00
Pekka Enberg
f7cf04ac4b tools/toolchain: Clean up DNF cache from Docker image
Make sure we call "dnf clean all" to remove the DNF cache, which reduces
Docker image size as per the following guidelines:

https://github.com/fedora-cloud/Fedora-Dockerfiles/wiki/Guidelines-for-Creating-Dockerfiles

A freshly built image is 250 MB smaller than the one on Docker Hub:

  <none>                                <none>               b8cafc8ff557        16 seconds ago      1.2 GB
  docker.io/scylladb/scylla-toolchain   fedora-29-20190212   d253d45a964c        3 days ago          1.45 GB

Message-Id: <20190215142322.12466-1-penberg@scylladb.com>
2019-02-16 17:12:10 +02:00
Botond Dénes
2125e99531 service/storage_service: fix pre-bootstrap wait for schema agreement
When bootstrapping, a node should wait to reach schema agreement
with its peers before it can join the ring. This is to ensure it can
immediately accept writes. Failing to reach schema agreement before
joining is not fatal, as the node can pull unknown schemas on writes
on-demand. However, if such a schema contains references to UDFs, the
node will reject writes using it, due to #3760.

To ensure that schema agreement is reached before joining the ring,
`storage_service::join_token_ring()` has two checks. First it checks that
at least one peer was connected previously. For this it compares
`database::get_version()` with `database::empty_version`. The (implied)
assumption is that this will become something other than
`database::empty_version` only after having connected (and pulled
schemas from) at least one peer. This assumption doesn't hold anymore,
as we now set the version earlier in the boot process.
The second check verifies that we have the same schema version as all
known, live peers. This check assumes (since 3e415e2) that we have
already "met" all (or at least some) of our peers and if there is just
one known node (us) it concludes that this is a single-node cluster,
which automatically has schema agreement.
It's easy to see how these two checks will fail. The first fails to
ensure that we have met our peers, and the second wrongfully concludes
that we are a one-node cluster, and hence have schema agreement.

To fix this, modify the first check. Instead of relying on the presence
of a non-empty database version, supposedly implying that we already
talked to our peers, explicitly make sure that we have really talked to
*at least* one other node, before proceeding to the second check, which
will now do the correct thing, actually checking the schema versions.

Fixes: #4196

Branches: 3.0, 2.3

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <40b95b18e09c787e31ba6c5519fb64d68b4ca32e.1550228389.git.bdenes@scylladb.com>
2019-02-15 15:56:46 +01:00
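The two checks discussed in commit 2125e99531 can be summarised in a short Python sketch. It is a deliberately simplified model of the fixed logic: the function name, the set of contacted peers and the version map are hypothetical, and handling of a genuine single-node cluster is left out.

```python
def can_join_ring(local_version, contacted_peers, live_peer_versions):
    """Decide whether schema agreement has been reached before joining.

    contacted_peers: peers this node has really talked to (hypothetical set).
    live_peer_versions: schema version reported by each known live peer.
    """
    # Check 1 (the fixed one): require real contact with at least one peer,
    # rather than inferring it from a non-empty local schema version.
    if not contacted_peers:
        return False
    # Check 2: every known live peer must report our schema version.
    return all(v == local_version for v in live_peer_versions.values())
```

With the first check in place, the second check can no longer pass trivially just because no peers are known yet.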
Rafael Ávila de Espíndola
9cd14f2602 Don't write to system.large_partition during shutdown
The included testcase used to crash because during database::stop() we
would try to update system.large_partition.

There doesn't seem to be an order in which we can stop the existing services in
cql_test_env that makes this possible.

This patch then adds another step when shutting down a database: first
stop updating system.large_partition.

This means that during shutdown any memtable flush, compaction or
sstable deletion will not be reflected in system.large_partition. This
is hopefully not too bad since the data in the table is TTLed.

This seems to impact only tests, since main.cc calls _exit directly.

Tests: unit (release,debug)

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190213194851.117692-1-espindola@scylladb.com>
2019-02-15 10:49:10 +01:00
Avi Kivity
e988521b89 build: switch debug mode from -O0 to -Og
-Og is advertised as debug-friendly optimization, both in compile time
and debug experience. It also cuts sstable_mutation_test run time in half:

Changing -O0 to -Og

Before:

real    16m49.441s
user    16m34.641s
sys    0m10.490s

After:

real    8m38.696s
user    8m26.073s
sys    0m10.575s

Message-Id: <20190214205521.19341-1-avi@scylladb.com>
2019-02-15 08:19:48 +02:00
Benny Halevy
c8f239ff2b tests: introduce sstables::test_env
In preparation for adding sstables_manager we want
to establish an environment for testing sstables.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:37:41 +02:00
Benny Halevy
f9546b23b7 tests: perf_sstable: rename test_env
test_env is going to be a class in the sstables namespace

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:15 +02:00
Benny Halevy
d6cfc1fae5 tests: sstable_datafile_test: use useable_sst
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
2a6b5a7622 tests: sstable_test: add write_and_validate_sst helper
In preparation for sstables::test_env

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
255f05e6c8 tests: sstable_test: add test_using_reusable_sst helper
In preparation for sstables::test_env

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
e11e29a1fc tests: sstable_test: use reusable_sst where possible
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
9d4989f2e8 tests: sstable_test: add test_using_working_sst helper
In preparation for sstables::test_env

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
55aac22b37 tests: sstable_3_x_test: make_test_sstable
Reused for making sstables for test cases.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
3bc1b8b9ff tests: run_sstable_resharding_test: use default parameters to make_sstable
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:22:14 +02:00
Benny Halevy
b0f3f8d766 tests: sstables::test::make_test_sstable: reorder params
In preparation for providing a default large_data_handler in
a test-standard way.

The buffer_size parameter is reordered and now has the same default value
as make_sstable()'s.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:21:36 +02:00
Benny Halevy
bcd3f36a8a tests: test_setup: do_with_test_directory is unused
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-14 22:21:32 +02:00