Commit Graph

2923 Commits

Author SHA1 Message Date
Avi Kivity
6c71eae63f Merge "API: Stream compaction history records" from Amnon
"
get_compaction_history can return a lot of records which will add up to a
big http reply.

This series makes sure it will not create large allocations when
returning the results.

It adds an api to the query_processor to use paged queries with a
consumer function that returns a future, this way we can use the http
stream after each record.

This implementation will prevent large allocations and stalls.

Fixes #4152
"

* 'amnon/compaction_history_stream_v7' of github.com:scylladb/seastar-dev:
  tests/query_processor_test: add query_with_consumer_test
  system_keyspace, api: stream get_compaction_history
  query_processor: query and for_each_cql_result with future
2019-02-05 14:16:36 +02:00
Avi Kivity
ebf179318c Merge "SI: Add virtual columns to underlying MV" from Duarte
"
Virtual columns are MV-specific columns that contribute to the
liveness of view rows. However, we were not adding those columns when
creating an index's underlying MV, causing indexes to miss base rows.

Fixes #4144
Branches: master, branch-3.0
"

Reviewed-by: Nadav Har'El <nyh@scylladb.com>

* 'sec-index/virtual-columns/v1' of https://github.com/duarten/scylla:
  tests/secondary_index_test: Add reproducer for #4144
  index/secondary_index_manager: Add virtual columns to MV
2019-02-05 13:26:45 +02:00
Amnon Heiman
c96c3ce9e8 tests/query_processor_test: add query_with_consumer_test
This patch adds a unit test for querying with a consumer function.

query with consumer uses paging, the tests covers the scenarios where
the number of rows bellow and above the page size, it also test the
option to stop in the middle of reading.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2019-02-05 12:35:53 +02:00
Piotr Sarna
11e6d88ca7 tests: supplement filtering collections with more cases
Filtering test cases for collections are supplemented with
checking whether CONTAINS works correctly for sets and maps.

Message-Id: <4a684152cdcdb65e1415ba5859699cb324312c2b.1548837150.git.sarna@scylladb.com>
2019-02-03 17:19:30 +02:00
Avi Kivity
468f8c7ee7 Merge "Print a warning if a row is too large" from Rafael
"
This is a first step in fixing #3988.
"

* 'espindola/large-row-warn-only-v4' of https://github.com/espindola/scylla:
  Rename large_partition_handler
  Print a warning if a row is too large
  Remove defaut parameter value
  Rename _threshold_bytes to _partition_threshold_bytes
  keys: add schema-aware printing for clustering_key_prefix
2019-02-03 13:57:42 +02:00
Raphael S. Carvalho
930f8caff9 sstables/compaction: Fix segfault when replacing expired sstable in incremental compaction
Fully expired sstable is not added to compacting set, meaning it's not actually
compacted, but it's kept in a list of sstables which incremental compaction
uses to check if any sstable can be replaced.
Incremental compaction was unconditionally removing expired sstable from compacting
set, which led to segfault because end iterator was given.

The fix is about changing sstable_set::erase() behavior to follow standard one
for erase functions which will works if the target element is not present.

Fixes #4085.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20190130163100.5824-1-raphaelsc@scylladb.com>
2019-01-30 16:32:45 +00:00
Avi Kivity
1224cde871 Merge "Make perf_simple_query produce JSON results" from Paweł
"
This series enhances perf_simple_query error reporting by adding an
option of producing a json file containing the results. The format of
that file is very similar to the results produces by perf_fast_forward
in order to ease integration with any tools that may want to interpret
them.

In addition to that perf_simple_query now prints to the standard output
median, median absolute deviation, minimum and maximum of the partial
results, so that there is no need for external scripts to compute those
values.
"

* tag 'perf_simple_query-json/v1' of https://github.com/pdziepak/scylla:
  perf_simple_query: produce json results
  perf_simple_query: calculate and print statistics
  perf: time_parallel: return results of each iteration
  perf_simple_query: take advantage of threads in main()
2019-01-30 17:39:19 +02:00
Paweł Dziepak
6a0ee5dbbf Merge "Simpler fix for the memtable reader's fragment monotonicity violation" from Botond
"
Recently it was discovered that the memtable reader
(partition_snapshot_reader to be more precise) can violate mutation
fragment monotonicity, by remitting range tombstones when those overlap
with more than one ck range of the partition slice.
This was fixed by 7049cd9, however after that fix was merged a much
simpler fix was proposed by Tomek, one that doesn't involve nearly as
much changes to the partition snapshot reader and hences poses less risk
of breaking it.
This mini-series reverts the previous fix, then applies the new, simpler
one.

Refs: #4104
"

* 'partition-snapshot-reader-simpler-fix/v2' of https://github.com/denesb/scylla:
  partition_snapshot_reader: don't re-emit range tombstones overlapping multiple ck ranges
  Revert "partition_snapshot_reader: don't re-emit range tombstones overlapping multiple ck ranges"
2019-01-30 15:24:31 +00:00
Jesse Haber-Kucharsky
b39eac653d Switch to the the CMake-ified Seastar
Committer: Avi Kivity <avi@scylladb.com>
Branch: next

Switch to the the CMake-ified Seastar

This change allows Scylla to be compiled against the `master` branch of
Seastar.

The necessary changes:

- Add `-Wno-error` to prevent a Seastar warning from terminating the
  build

- The new Seastar build system generates the pkg-config files (for
  example, `seastar.pc`) at configure time, so we don't need to invoke
  Ninja to generate them

- The `-march` argument is no longer inherited from Seastar (correctly),
  so it needs to be provided independently

- Define `SEASTAR_TESTING_MAIN` so that the definition of an entry
  point is included for all unit test compilation units

- Independently link Scylla against Seastar's compiled copy of fmt in
  its build directory

- All test files use the (now public) Seastar testing headers

- Add some missing Seastar headers to source files

[avi: regenerate frozen toolchain, adjust seastar submoule]
Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <02141f2e1ecff5cbcd56b32768356c3bf62750c4.1548820547.git.jhaberku@scylladb.com>
2019-01-30 11:17:38 +02:00
Botond Dénes
8d59c36165 partition_snapshot_reader: don't re-emit range tombstones overlapping multiple ck ranges
When entering a new ck range (of the partition-slice), the partition
snapshot reader will apply to its range tombstones stream all the
tombstones that are relevant to the new ck range. When the partition has
range tombstones that overlap with multiple ck ranges, these will be
applied to the range tombstone stream when entering any of the ck ranges
they overlap with. This will result in the violation of the monotonicity
of the mutation fragments emitted by the reader, as these range
tombstones will be re-emitted on each ck range, if the ck range has at
least one clustering row they apply to.
For example, given the following partition:
    rt{[1,10]}, cr{1}, cr{2}, cr{3}...

And a partition-slice with the following ck ranges:
    [1,2], [3, 4]

The reader will emit the following fragment stream:
    rt{[1,10]}, cr{1}, cr{2}, rt{[1,10]}, cr{3}, ...

Note how the range tombstone is emitted twice. In addition to violating
the monotonicity guarantee, this can also result in an explosion of the
number of emitted range tombstones.

Fix by trimming range tombstones to the start of the current ck range,
thus ensuring that they will not violate mutation fragment monotonicity
guarantees.

Refs: #4104

This is a much simpler fix for the above issue, than the already
committed one (7049cd937A). The latter is reverted by the previous
patch and this patch applies the simpler fix.
2019-01-30 10:01:13 +02:00
Duarte Nunes
35c03f41a4 Merge 'Fix multiple contains for one column' from Piotr
"
An error in validating CONTAINS restrictions against collections caused
only the first restriction to be taken into account due to returning
prematurely.
This miniseries provides a fix for that as well as a matching test case.

Tests: unit (release)
Fixes #4161
"

* 'fix_multiple_contains_for_one_column' of https://github.com/psarna/scylla:
  tests: enable CONTAINS tests for filtering
  cql3: remove premature return from is_satisfied_by
  cql3: restore indentation
2019-01-29 11:10:13 +00:00
Piotr Sarna
11aae54cca tests: enable CONTAINS tests for filtering
Tests for filtering with CONTAINS restrictions were not enabled,
so they are now. Also, another case for having two CONTAINS restrictions
for a single column is added.

Refs #4161
2019-01-29 11:47:28 +01:00
Rafael Ávila de Espíndola
625080b414 Rename large_partition_handler
Now that it also handles large rows, rename it to large_data_handler.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-01-28 15:03:14 -08:00
Rafael Ávila de Espíndola
1185138a34 Print a warning if a row is too large
Tests: unit (release)

Refs #3988.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-01-28 15:03:10 -08:00
Paweł Dziepak
335dca54a5 perf_simple_query: produce json results 2019-01-28 16:36:06 +00:00
Paweł Dziepak
7d21c9c31f perf_simple_query: calculate and print statistics 2019-01-28 16:36:06 +00:00
Paweł Dziepak
eb3d80fa2b perf: time_parallel: return results of each iteration 2019-01-28 16:35:33 +00:00
Paweł Dziepak
6a1e1e8454 perf_simple_query: take advantage of threads in main() 2019-01-28 13:21:08 +00:00
Duarte Nunes
aafaf840a2 tests/secondary_index_test: Add reproducer for #4144
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2019-01-27 22:30:34 +00:00
Benny Halevy
36b6a3ebcf tests: add test_distributed_loader_with_incomplete_sstables
Test removal of sstables with temporary TOC file,
with and without temporary sstable directory.

Temporary sstable directories may be empty or still have
leftover components in them.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-27 14:48:24 +02:00
Benny Halevy
64a23ea3bc tests: single_node_cql_env::do_with: use the provided data_file_directories path if available
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-27 14:14:32 +02:00
Benny Halevy
441809094a tests: single_node_cql_env::_data_dir is not used
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-27 14:14:32 +02:00
Asias He
ee0bb0aa94 tests: Drop the unsupported random_read mode in perf_sstable
It is not supported. Remove it.
Message-Id: <fe31e090574be96a9620b6902ceb843699d558d0.1548403105.git.asias@scylladb.com>
2019-01-25 14:24:40 +00:00
Benny Halevy
6efd85ed01 tests: extend time_overflow unit tests
Test also cql select queries with and without bypass cache.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-24 15:55:06 +02:00
Piotr Jastrzebski
ad016a732b Move set_type_impl out of types.hh to types/set.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:56:38 +01:00
Piotr Jastrzebski
b1e1b66732 Move list_type_impl out of types.hh to types/list.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:56:38 +01:00
Piotr Jastrzebski
147cc031db Move map_type_impl out of types.hh to types/map.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:56:38 +01:00
Piotr Jastrzebski
b6b2fdc5be Move tuple_type_impl from types.hh to types/tuple.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:56:38 +01:00
Piotr Jastrzebski
7666e81b51 Decouple database.hh from types/user.hh
This commit declares shared_ptr<user_types_metadata> in
database.hh were user_types_metadata is an incomplete type so
it requires
"Allow to use shared_ptr with incomplete type other than sstable"
to compile correctly.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:55:04 +01:00
Piotr Jastrzebski
e92b4c3dbc Move user_type_impl out of types.hh to types/user.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:04:04 +01:00
Nadav Har'El
76f1fcc346 cql3: really ensure retrieval of columns for filtering
Commit fd422c954e aimed to fix
issue #3803. In that issue, if a query SELECTed only certain columns but
did filtering (ALLOW FILTERING) over other unselected columns, the filtering
didn't work. The fix involved adding the columns being filtered to the set
of columns we read from disk, so they can be filtered.

But that commit included an optimization: If you have clustering keys
c1 and c2, and the query asks for a specific partition key and c1 < 3 and
c2 > 3, the "c1 < 3" part does NOT need to be filtered because it is already
done as a slice (a contiguous read from disk). The committed code erroneously
concluded that both c1 and c2 don't need to be filtered, which was wrong
(c2 *does* need to be read and filtered).

In this patch, we fix this optimization. Previously, we used the "prefix
length", which in the above example was 2 (both c1 and c2 were filtered)
but we need a new and more elaborate function,
num_prefix_columns_that_need_not_be_filtered(), to determine we can only
skip filtering of 1 (c1) and cannot skip the second.

Fixes #4121. This patch also adds a unit test to confirm this.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Message-Id: <20190123131212.6269-1-nyh@scylladb.com>
2019-01-23 15:24:30 +02:00
Avi Kivity
c1dd04986b Merge "Prepare for the switch to CMake-ified Seastar" from Jesse
"
This series prepares for the integration of the `master` branch of
Seastar back into Scylla.

A number of changes to the existing build are necessary to integrate
Seastar correctly, and these are detailed in the individual change
messages.

I tested with and without DPDK, in release and debug mode.

The actual switch is a separate patch.
"

* 'jhk/seastar_cmake/v4' of https://github.com/hakuch/scylla:
  build: Fix link order for DPDK
  tests: Split out `sstable_datafile_test`
  build: Remove unnecessary inclusion
  tests: Fix use-after-free errors in static vars
  build: Remove Seastar internals
  build: Only use Seastar flags from pkg-config
  build: Query Seastar flags using pkg-config
  build: Change parameters for `pkg_config` function
2019-01-23 10:33:00 +02:00
Jesse Haber-Kucharsky
02dd7bcc82 build: Remove unnecessary inclusion 2019-01-22 18:25:01 -05:00
Jesse Haber-Kucharsky
2a62550002 tests: Fix use-after-free errors in static vars
Without these two variables being declared as TLS, executing these two
tests in "debug" mode fail AddressSanitizer's checks.
2019-01-22 18:24:52 -05:00
Benny Halevy
7d0854a1e5 tests: cql_query_test add test_time_overflow
Test 32-bit time overflow scenarios.
Fails without "gc_clock: make 64 bit".

Refs #3353

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-22 15:34:32 +02:00
Benny Halevy
1ccd72f115 sstables: mc: use int64_t for local_deletion_time and ttl
In preparation for changing gc_clock::duration::rep to int64_t.

Refs #3353

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-22 15:34:32 +02:00
Botond Dénes
4e89dea9ea database: don't allow access to global semaphores
Recently we had a bug (#4096) due to a component
(`multishard_mutation_query()`) assuming that all reads used the
semaphore obtainable via `database::user_read_concurrency_sem()`.
This problem revealed that it is plain wrong to allow access to the
shard-global semaphores residing in the database object. Instead all
code wishing to access the relevant semaphore for some read, should do
so via the relevant `table` object, thus guaranteeing that it will get
the correct semaphore, configured for that table.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <4f3a6780eb3240822db34aba7c1ba0a675a96592.1547734212.git.bdenes@scylladb.com>
2019-01-21 16:29:02 +02:00
Tomasz Grabiec
e02baabd62 tests: perf_fast_forward: Introduce --with-compression option
Message-Id: <1547819062-4369-1-git-send-email-tgrabiec@scylladb.com>
2019-01-21 12:18:31 +00:00
Botond Dénes
ff2884f25b Revert "partition_snapshot_reader: don't re-emit range tombstones overlapping multiple ck ranges"
A much simpler and more complete fix was found. Let's revert this before
applying the simpler fix.

This reverts commit 7049cd9374.
2019-01-21 13:56:56 +02:00
Tomasz Grabiec
d7c701d2d1 Merge "Type-erase gratuitous templates with functions" from Avi
Many area of the code are splattered with unneeded templates. This patchset replaces
some of them, where the template parameter is a function object, with an std::function
or noncopyable_function (with a preference towards the latter; but it is not always
possible). As the template is compiled for each instantiation (if the function
object is a lambda) while a function is compiled only once, there are significant
savings in compile time and bloat.

   text    data     bss     dec     hex filename
85160690          42120  284910 85487720        5187068 scylla.before
84824762          42120  284910 85151792        5135030 scylla.after

* https://github.com/avikivity/scylla detemplate/v2:
  api/commitlog: de-template acquire_cl_metric()
  database: de-template do_parse_schema_tables
  database: merge for_all_partitions and for_all_partitions_slow
  hints: de-template scan_for_hints_dirs()
  schema_tables: partially de-template make_map_mutation()
  distributed_loader: de-template
  tests: commitlog_test: de-template
  tests: cql_auth_query_test: de-template
  test: de-template eventually() and eventually_true()
  tests: flush_queue_test: de-template
  hint_test: de-template
  tests: mutation_fragment_test: de-template
  test: mutation_test: de-template
2019-01-21 11:32:22 +01:00
Avi Kivity
1e5c09dbce test: mutation_test: de-template
Replace the with_column_family helper template with an ordinary funciton, to
reduce code bloat.
2019-01-20 15:55:20 +02:00
Avi Kivity
28db56df13 tests: mutation_fragment_test: de-template
The for_each_target() template is called four times, so making it a normal function
reduces a lot of code generation.
2019-01-20 15:55:20 +02:00
Avi Kivity
401684503d hint_test: de-template
While cl_test is duplicated with commitlog_test, at least deduplicate it internally
by converting it to an ordinary function.
2019-01-20 15:55:20 +02:00
Avi Kivity
208b0f80a4 tests: flush_queue_test: de-template
The internal test_propagation template is instantiated many times. Replace
with an oridinary function to reduce bloat. Call sites adjusted to have a
uniform signature.
2019-01-20 15:55:20 +02:00
Avi Kivity
2f36d30572 test: de-template eventually() and eventually_true()
These templates are not trivial and called many times. De-template them to
reduce code bloat.
2019-01-20 15:55:20 +02:00
Avi Kivity
96a8eacc3c tests: cql_auth_query_test: de-template
Replace the with_user() and verify_unauthorized_then_ok() templates with functions.
2019-01-20 15:55:20 +02:00
Avi Kivity
e0b0e18234 tests: commitlog_test: de-template
The cl_test function is called many times, so its contents are bloat. De-template
it so it is compiled only once.
2019-01-20 15:55:20 +02:00
Tomasz Grabiec
c422bfc2c5 tests: perf_fast_forward: Store results for each dataset in separate sub-directory
Otherwise read test results for subsequent datasets will override each other.

Also, rename population test case to not include dataset name, which
is now redundant.

Message-Id: <1547822942-9690-1-git-send-email-tgrabiec@scylladb.com>
2019-01-20 15:38:46 +02:00
Botond Dénes
7049cd9374 partition_snapshot_reader: don't re-emit range tombstones overlapping multiple ck ranges
When entering a new ck range (of the partition-slice), the partition
snapshot reader will apply to its range tombstones stream all the
tombstones that are relevant to the new ck range. When the partition has
range tombstones that overlap with multiple ck ranges, these will be
applied to the range tombstone stream when entering any of the ck ranges
they overlap with. This will result in the violation of the monotonicity
of the mutation fragments emitted by the reader, as these range
tombstones will be re-emitted on each ck range, if the ck range has at
least one clustering row they apply to.
For example, given the following partition:
    rt{[1,10]}, cr{1}, cr{2}, cr{3}...

And a partition-slice with the following ck ranges:
    [1,2], [3, 4]

The reader will emit the following fragment stream:
    rt{[1,10]}, cr{1}, cr{2}, rt{[1,10]}, cr{3}, ...

Note how the range tombstone is emitted twice. In addition to violating
the monotonicity guarantee, this can also result in an explosion of the
number of emitted range tombstones.

Fix by applying only those range tombstones to the range tombstone
stream, that have a position strictly greater than that of the last
emitted clustering row (or range tombstone), when entering a new ck
range.

Fixes: #4104

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <e047af76df75972acb3c32c7ef9bb5d65d804c82.1547916701.git.bdenes@scylladb.com>
2019-01-20 15:38:04 +02:00
Paweł Dziepak
14757d8a83 types: collection_type: drop tombstone if covered by higher-level one
At the moment are inefficiencies in how
collection_type_impl::mutation::compact_and_expire( handles tombstones.
If there is a higher-level tombstone that covers the collection one
(including cases where there is no collection tombstone) it will be
applied to the collection tombstone and present in the compaction
output. This also means that the collection tombstone is never dropped
if fully covered by a higher-level one.

This patch fixes both those problems. After the compaction the
collection tombstone is either unchanged or removed if covered by a
higher-level one.

Fixes #4092.

Message-Id: <20190118174244.15880-1-pdziepak@scylladb.com>
2019-01-20 15:32:34 +02:00