scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 17:40:34 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	d9db79a85d	tests: Switch to seastar's allocation failure injector It catches more allocation sites.	2018-07-17 16:30:01 +02:00
Piotr Sarna	fcfbc804e4	tests: add filtering indexed queries tests Tests covering ALLOW FILTERING usage while using secondary indexes as well are added to cql_query_test. Tests are based on Cassandra's test suite for filtering secondary indexes + some more simple cases.	2018-07-11 18:06:21 +02:00
Piotr Sarna	dcdd8be59c	cql3: make index-related tests less timing dependent Indexes and materialized views take time to build, so checks that rely on that are now wrapped with 'eventually' blocks. Message-Id: <6d3def2bc49b76dda11d7a1c9974a8b3d221003f.1531312518.git.sarna@scylladb.com>	2018-07-11 15:45:52 +03:00
Tomasz Grabiec	1de5177175	tests: row_cache: Fix use-after-scope on partition_range passed to readers The partition_range must outlive the reader. Message-Id: <1531301583-15476-1-git-send-email-tgrabiec@scylladb.com>	2018-07-11 12:39:30 +03:00
Avi Kivity	28621066e6	observable: allow an observable to disconnect() twice without penalty Message-Id: <20180711070754.13286-1-avi@scylladb.com>	2018-07-11 10:15:01 +01:00
Piotr Sarna	559439b6ea	tests: add more ALLOW FILTERING tests More test cases are added to cql_query_test in order to check ALLOW FILTERING clauses more accurately. Message-Id: <4c59c1f3eb01558be992d0596e5423c276087387.1531220558.git.sarna@scylladb.com>	2018-07-10 14:44:33 +03:00
Piotr Jastrzebski	0abdd919c8	sstables: Test reading deleted cells from SST3 Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-07-10 10:03:29 +02:00
Piotr Jastrzebski	f64901fdac	test_uncompressed_compound_ck_read: fix comment Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-07-10 10:03:14 +02:00
Avi Kivity	96737d140f	utils: add observer/observable templates An observable is used to decouple an information producer from a consumer (in the same way as a callback), while allowing multiple consumers (called observers) to coexist and to manage their lifetime separately. Two classes are introduced: observable: a producer class; when an observable is invoked, all observers receive the information; observer: a consumer class; receives information from an observable. Modelled after boost::signals2, with the following changes: - all signals return void; information is passed from the producer to the consumer but not back - thread-unsafe - modern C++ without preprocessor hacks - connection lifetime is always managed rather than leaked by default - renamed to avoid the funky "slot" name Message-Id: <20180709172726.5079-1-avi@scylladb.com>	2018-07-09 18:48:44 +01:00
Duarte Nunes	c126b00793	Merge 'ALLOW FILTERING support' from Piotr " The main idea of this series is to provide a filtering_visitor as a specialised result_set_builder::visitor implementation that keeps restriction info and applies it on query results. Also, since allow_filtering checking is not correct now (e.g. #2025) on select_statement level, this series tries to fix any issues related to it. Still in TODO: * handling CONTAINS relation in single column restriction filtering * handling multi-column restrictions - especially EQ, which can be split into multiple single-column restrictions * more tests - it's never enough; especially esoteric cases like filtering queries which also use secondary indexes, paging tests, etc. Tests: unit (release) " * 'allow_filtering_6' of https://github.com/psarna/scylla: tests: add allow_filtering tests to cql_query_test cql3: enable ALLOW FILTERING service: add filtering_pager cql3: optimize filtering partition keys and static rows cql3: add filtering visitor cql3: move result_set_builder functions to header cql3: amend need_filtering() cql3: add single column primary key restrictions getters cql3: expose single column primary key restrictions cql3: add needs_filtering to primary key restrictions cql3: add simpler single_column_restriction::is_satisfied_by	2018-07-05 10:18:08 +01:00
Piotr Sarna	a7dd02309f	tests: add allow_filtering tests to cql_query_test Test cases for ALLOW FILTERING are added to cql_query_test suite.	2018-07-05 10:50:43 +02:00
Avi Kivity	f4caa418ff	Merge "Fix the "LCS data-loss bug"" from Botond " This series fixes the "LCS data-loss bug" where full scans (and everything that uses them) would miss some small percentage (> 0.001%) of the keys. This could easily lead to permanent data-loss as compaction and decommission both use full scans. `aeffbb673` worked around this bug by disabling the incremental reader selectors (the class identified as the source of the bug) altogether. This series fixes the underlying issue and reverts `aeffbb673`. The root cause of the bug is that the `incremental_reader_selector` uses the current read position to poll for new readers using `sstable_set::incremental_selector::select()`. This means that when the currently open sstables contain no partitions that would intersect with some of the yet unselected sstables, those sstables would be ignored. Solve the problem by not calling `select()` with the current read position and always pass the `next_position` returned in the previous call. This means that the traversal of the sstable-set happens at a pace defined by the sstable-set itself and this guarantees that no sstable will be jumped over. When asked for new readers the `incremental_reader_selector` will now iteratively call `select()` using the `next_position` from the previous `select()` call until it either receives some new, yet unselected sstables, or `next_position` surpasses the read position (in which case `select()` will be tried again later). The `sstable_set::incremental_selector` was not suitable in its present state to support calling `select()` with the `next_position` from a previous call as in some cases it could not make progress due to inclusiveness related ambiguities. So in preparation for the above fix `sstable_set` was updated to work in terms of ring-position instead of tokens. Ring-position can express positions in a much more fine-grained way than token, including positions after/before tokens and keys. 
This allows for a clear expression of `next_position` such that calling `select()` with it guarantees forward progress in the token-space. Tests: unit(release, debug) Refs: #3513 " * 'leveled-missing-keys/v4' of https://github.com/denesb/scylla: tests/mutation_reader_test: combined_mutation_reader_test: use SEASTAR_THREAD_TEST_CASE tests/mutation_reader_test: refactor combined_mutation_reader_test tests/mutation_reader_test: fix reader_selector related tests Revert "database: stop using incremental selectors" incremental_reader_selector: don't jump over sstables mutation_reader: reader_selector: use ring_position instead of token sstables_set::incremental_selector: use ring_position instead of token compatible_ring_position: refactor to compatible_ring_position_view dht::ring_position_view: use token_bound from ring_position i_partitioner: add free function ring-position tri comparator mutation_reader_merger::maybe_add_readers(): remove early return mutation_reader_merger: get rid of _key	2018-07-05 09:33:12 +03:00
Botond Dénes	b32f94d31e	tests/mutation_reader_test: combined_mutation_reader_test: use SEASTAR_THREAD_TEST_CASE	2018-07-04 17:42:37 +03:00
Botond Dénes	77ad085393	tests/mutation_reader_test: refactor combined_mutation_reader_test Make combined_mutation_reader_test more interesting: * Set the levels on the sstables * Arrange the sstables so that they test for the "jump over sstables" bug. * Arrange the sstables so that they test for the "gap between sstables". While at it also make the code more compact.	2018-07-04 17:42:37 +03:00
Botond Dénes	4b57fc9aea	tests/mutation_reader_test: fix reader_selector related tests Don't assume the partition keys use lexical ordering. Add some additional checks.	2018-07-04 17:42:37 +03:00
Botond Dénes	81a03db955	mutation_reader: reader_selector: use ring_position instead of token sstable_set::incremental_selector was migrated to ring position, follow suit and migrate the reader_selector to use ring_position as well. Beyond correctness this also improves efficiency in case of dense tables, avoiding prematurely selecting sstables that share the token but start at different keys, although one could argue that this is a niche case.	2018-07-04 17:42:37 +03:00
Botond Dénes	a8e795a16e	sstables_set::incremental_selector: use ring_position instead of token Currently `sstable_set::incremental_selector` works in terms of tokens. Sstables can be selected with tokens and internally the token-space is partitioned (in `partitioned_sstable_set`, used for LCS) with tokens as well. This is problematic for several reasons. The sub-range sstables cover from the token-space is defined in terms of decorated keys. It is even possible that multiple sstables cover multiple non-overlapping sub-ranges of a single token. The current system is unable to model this and will at best result in selecting unnecessary sstables. The usage of token for providing the next position where the intersecting sstables change [1] causes further problems. Attempting to walk over the token-space by repeatedly calling `select()` with the `next_position` returned from the previous call will quite possibly lead to an infinite loop as a token cannot express inclusiveness/exclusiveness and thus the incremental selector will not be able to make progress when the upper and lower bounds of two neighbouring intervals share the same token with different inclusiveness e.g. [t1, t2](t2, t3]. To solve these problems update incremental_selector to work in terms of ring position. This makes it possible to partition the token-space among sstables at decorated key granularity. It also makes it possible for select() to return a next_position that is guaranteed to make progress. partitioned_sstable_set now builds the internal interval map using the decorated key of the sstables, not just the tokens. incremental_selector::select() now uses `dht::ring_position_view` as both the selector and the next_position. ring_position_view can express positions between keys so it can also include information about inclusiveness/exclusiveness of the next interval guaranteeing forward progress. [1] `sstable_set::incremental_selector::selection::next_position`	2018-07-04 17:42:33 +03:00
Tomasz Grabiec	2ffb621271	Merge "Fix atomic_cell_or_collection::external_memory_usage()" from Paweł After the transition to the new in-memory representation in `aab6b0ee27` 'Merge "Introduce new in-memory representation for cells" from Paweł' atomic_cell_or_collection::external_memory_usage() stopped accounting for the externally stored data. Since it wasn't covered by the unit tests, the bug remained unnoticed until now. This series fixes the memory usage calculation and adds proper unit tests. * https://github.com/pdziepak/scylla.git fix-external-memory-usage/v1: tests/mutation: properly mark atomic_cells that are collection members imr::utils::object: expose size overhead data::cell: expose size overhead of external chunks atomic_cell: add external chunks and overheads to external_memory_usage() tests/mutation: test external_memory_usage()	2018-07-03 14:58:10 +02:00
Botond Dénes	c236a96d7d	tests/cql_query_test: add unit test for querying empty ranges A bug was found recently (#3564) in the paging logic, where the code assumed the queried ranges list is non-empty. This assumption is incorrect as there can be valid (if rare) queries that can result in the ranges list being empty. Add a unit test that executes such a query with paging enabled to detect any future bugs related to assumptions about the ranges list being non-empty. Refs: #3564 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <f5ba308c4014c24bb392060a7e72e7521ff021fa.1530618836.git.bdenes@scylladb.com>	2018-07-03 13:43:17 +01:00
Avi Kivity	eafd16266d	tests: reduce multishard_mutation_test runtime in debug mode Debug mode is so slow that generating 1000 mutations is too much for it. High memory use can also confuse the sanitizers that track each allocation. Reduce mutation count from 1000 to 10 in debug mode.	2018-07-03 12:01:44 +03:00
Avi Kivity	f3da043230	Merge "Make in-memory partition version merging preemptable" from Tomasz " Partition snapshots go away when the last read using the snapshot is done. Currently we will synchronously attempt to merge partition versions on this event. If partitions are large, that may stall the reactor for a significant amount of time, depending on the size of newer versions. Cache update on memtable flush can create especially large versions. The solution implemented in this series is to allow merging to be preemptable, and continue in the background. Background merging is done by the mutation_cleaner associated with the container (memtable, cache). There is a single merging process per mutation_cleaner. The merging worker runs in a separate scheduling group, introduced here, called "mem_compaction". When the last user of a snapshot goes away the snapshot is slid to the oldest unreferenced version first so that the version is no longer reachable from partition_entry::read(). The cleaner will then keep merging preceding (newer) versions into it, until it merges a version which is referenced. The merging is preemptable. If the initial merging is preempted, the snapshot is enqueued into the cleaner, the worker woken up, and merging will continue asynchronously. When memtable is merged with cache, its cleaner is merged with cache cleaner, so any outstanding background merges will be continued by the cache cleaner without disruption. This reduces scheduling latency spikes in tests/perf_row_cache_update for the case of large partition with many rows. For -c1 -m1G I saw them dropping from >23ms to 1-2ms. System-level benchmark using scylla-bench shows a similar improvement. 
" * tag 'tgrabiec/merge-snapshots-gradually-v4' of github.com:tgrabiec/scylla: tests: perf_row_cache_update: Test with an active reader surviving memtable flush memtable, cache: Run mutation_cleaner worker in its own scheduling group mutation_cleaner: Make merge() redirect old instance to the new one mvcc: Use RAII to ensure that partition versions are merged mvcc: Merge partition version versions gradually in the background mutation_partition: Make merging preemtable tests: mvcc: Use the standard maybe_merge_versions() to merge snapshots	2018-07-01 15:32:51 +03:00
Botond Dénes	5fd9c3b9d4	tests/mutation_reader_test: require min shard-count for multishard tests Tests testing different aspects of `foreign_reader` and `multishard_combining_reader` are designed to run with a certain minimum shard count. Running them with any shard count below this minimum makes them useless at best but can even fail them. Refuse to run these tests when the shard count is below the required minimum to avoid an accidental and unnecessary investigation into a false-positive test failure. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <d24159415b6a9d74eafb8355b6e3fba98c1ff7ff.1530274392.git.bdenes@scylladb.com>	2018-07-01 12:44:41 +03:00
Avi Kivity	f73340e6f8	Merge "Index reader and associated types clean-up." from Vladimir " This patchset paves way to support for reading SSTables 3.x index files. It aims at streamlining and tidying up the existing index_reader and helpers and brings no functional or high-level changes. In v3: - do not capture 'found' and just return 'true' in the continuation inside advance_and_check_if_present() - split code that makes the use of advance_upper_past() internal-only into two commits for better readability GitHub URL: https://github.com/argenet/scylla/tree/projects/sstables-30/index_reader_cleanup/v3 Tests: unit {release} Performance tests (perf_fast_forward) did not reveal any noticeable changes. The complete output is below. ======================================== Original code (before the patchset) ======================================== running: large-partition-skips Testing scanning large partition with skips. Reads whole range interleaving reads with skips according to read-skip pattern: read skip time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 1 0 0.336514 1000000 2971642 1000 126956 35 0 0 0 0 0 0 0 99.5% 1 1 1.411239 500000 354299 993 127056 2 0 0 1 1 0 0 0 99.9% 1 8 0.464468 111112 239224 993 127056 2 0 0 1 1 0 0 0 99.8% 1 16 0.330490 58824 177990 993 127056 12 0 0 1 1 0 0 0 99.7% 1 32 0.257010 30304 117910 993 127056 15 0 0 1 1 0 0 0 99.7% 1 64 0.213650 15385 72010 997 127072 268 0 0 3 3 0 0 0 99.5% 1 256 0.159498 3892 24402 993 127056 245 0 0 1 1 0 0 0 95.5% 1 1024 0.088678 976 11006 993 127056 347 0 0 1 1 0 0 0 63.4% 1 4096 0.082627 245 2965 649 22452 389 252 0 1 1 0 0 0 20.0% 64 1 0.411080 984616 2395191 1059 127056 57 1 0 1 1 0 0 0 99.1% 64 8 0.390130 888896 2278461 993 127056 2 0 0 1 1 0 0 0 99.8% 64 16 0.369033 800000 2167828 993 127056 3 0 0 1 1 0 0 0 99.8% 64 32 0.338126 666688 1971714 993 127056 10 0 0 1 1 0 0 0 99.7% 64 64 0.297335 500032 1681711 997 127072 18 0 0 3 3 0 0 0 99.7% 64 256 0.199420 
200000 1002910 993 127056 211 0 0 1 1 0 0 0 99.5% 64 1024 0.113953 58880 516704 993 127056 284 0 0 1 1 0 0 0 64.1% 64 4096 0.094596 15424 163051 687 23684 415 248 0 1 1 0 0 0 23.7% running: large-partition-slicing Testing slicing of large partition: offset read time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 0 1 0.000586 1 1706 3 164 2 1 0 1 1 0 0 0 9.0% 0 32 0.000587 32 54539 3 164 2 1 0 1 1 0 0 0 9.9% 0 256 0.000688 256 372343 4 196 2 1 0 1 1 0 0 0 20.7% 0 4096 0.004320 4096 948185 19 676 10 1 0 1 1 0 0 0 36.7% 500000 1 0.000882 1 1134 5 228 3 2 0 1 1 0 0 0 14.3% 500000 32 0.000881 32 36321 5 228 3 2 0 1 1 0 0 0 14.3% 500000 256 0.000961 256 266386 6 260 3 2 0 1 1 0 0 0 21.9% 500000 4096 0.003127 4096 1309805 21 740 14 2 0 1 1 0 0 0 54.0% running: large-partition-slicing-clustering-keys Testing slicing of large partition using clustering keys: offset read time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 0 1 0.000639 1 1564 3 164 2 0 0 1 1 0 0 0 13.9% 0 32 0.000626 32 51154 3 164 2 0 0 1 1 0 0 0 15.3% 0 256 0.000716 256 357560 4 168 2 0 0 1 1 0 0 0 23.1% 0 4096 0.003681 4096 1112743 16 680 8 1 0 1 1 0 0 0 38.5% 500000 1 0.000966 1 1035 4 424 3 2 0 1 1 0 0 0 12.4% 500000 32 0.000911 32 35121 5 296 3 1 0 1 1 0 0 0 13.1% 500000 256 0.000978 256 261645 5 296 3 1 0 1 1 0 0 0 19.1% 500000 4096 0.003155 4096 1298139 11 744 6 1 0 1 1 0 0 0 44.5% running: large-partition-slicing-single-key-reader Testing slicing of large partition, single-partition reader: offset read time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 0 1 0.000756 1 1323 4 484 2 0 0 1 1 0 0 0 11.3% 0 32 0.000625 32 51174 3 164 2 0 0 1 1 0 0 0 15.5% 0 256 0.000705 256 363337 4 196 2 0 0 1 1 0 0 0 24.3% 0 4096 0.003603 4096 1136829 16 900 8 1 0 1 1 0 0 0 44.4% 500000 1 0.000880 1 1136 5 228 3 3 0 1 1 0 0 0 12.6% 500000 32 0.000882 32 36268 5 228 3 1 0 1 1 0 0 0 14.0% 
500000 256 0.000965 256 265178 6 260 3 1 0 1 1 0 0 0 20.8% 500000 4096 0.003098 4096 1322024 21 740 14 2 0 1 1 0 0 0 54.6% running: large-partition-select-few-rows Testing selecting few rows from a large partition: stride rows time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 1000000 1 0.000631 1 1585 3 164 2 2 0 1 1 0 0 0 15.2% 500000 2 0.000873 2 2291 5 228 3 2 0 1 1 0 0 0 13.2% 250000 4 0.001404 4 2850 9 356 5 4 0 1 1 0 0 0 11.9% 125000 8 0.002878 8 2779 21 740 13 8 0 1 1 0 0 0 15.5% 62500 16 0.005184 16 3087 41 1380 25 16 0 1 1 0 0 0 19.3% 2 500000 0.948899 500000 526926 1040 127056 39 0 0 1 1 0 0 0 99.9% running: large-partition-forwarding Testing forwarding with clustering restriction in a large partition: pk-scan time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu yes 0.001813 2 1103 11 1380 3 8 0 1 1 0 0 0 18.5% no 0.000922 2 2170 5 228 3 1 0 1 1 0 0 0 14.1% running: small-partition-skips Testing scanning small partitions with skips. 
Reads whole range interleaving reads with skips according to read-skip pattern: read skip time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu -> 1 0 1.023396 1000000 977139 1104 139668 12 0 0 2 2 0 0 0 99.7% -> 1 1 2.176794 500000 229696 6200 177660 5109 0 0 5108 7679 0 0 0 69.9% -> 1 8 1.130179 111112 98314 6200 177660 5109 0 0 5108 9647 0 0 0 41.5% -> 1 16 0.972022 58824 60517 6200 177660 5109 0 0 5108 9913 0 0 0 32.0% -> 1 32 0.880783 30304 34406 6201 177664 5110 0 0 5108 10057 0 0 0 25.2% -> 1 64 0.829019 15385 18558 6199 177656 5108 0 0 5107 10135 0 0 0 20.4% -> 1 256 2.248487 3892 1731 5028 168948 3937 0 0 3936 7801 0 0 0 4.6% -> 1 1024 0.342806 976 2847 2076 146948 985 105 0 984 1955 0 0 0 9.3% -> 1 4096 0.088605 245 2765 739 18152 492 246 0 247 490 0 0 0 11.1% -> 64 1 1.796715 984616 548009 6274 177660 5120 0 0 5108 5187 0 0 0 63.1% -> 64 8 1.688994 888896 526287 6200 177660 5109 0 0 5108 5674 0 0 0 61.2% -> 64 16 1.593196 800000 502135 6200 177660 5109 0 0 5108 6143 0 0 0 58.7% -> 64 32 1.438651 666688 463412 6200 177660 5109 0 0 5108 6807 0 0 0 56.5% -> 64 64 1.290205 500032 387560 6200 177660 5109 0 0 5108 7660 0 0 0 49.2% -> 64 256 2.136466 200000 93613 5252 170616 4161 0 0 4160 6267 0 0 0 13.8% -> 64 1024 0.388871 58880 151413 2317 148784 1226 107 0 1225 1844 0 0 0 23.4% -> 64 4096 0.107253 15424 143809 807 19100 562 244 0 321 482 0 0 0 24.2% running: small-partition-slicing Testing slicing small partitions: offset read time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 0 1 0.002773 1 361 3 68 2 0 0 1 1 0 0 0 10.5% 0 32 0.002905 32 11015 3 68 2 0 0 1 1 0 0 0 11.6% 0 256 0.003170 256 80764 4 104 2 0 0 1 1 0 0 0 17.8% 0 4096 0.008125 4096 504095 20 616 11 1 0 1 1 0 0 0 54.1% 500000 1 0.002914 1 343 3 72 2 0 0 1 2 0 0 0 10.7% 500000 32 0.002967 32 10786 3 72 2 0 0 1 2 0 0 0 12.6% 500000 256 0.003338 256 76685 5 112 3 0 0 2 2 0 0 0 17.4% 500000 4096 0.008495 4096 
482141 21 624 12 1 0 2 2 0 0 0 52.3% ======================================== With the patchset ======================================== running: large-partition-skips Testing scanning large partition with skips. Reads whole range interleaving reads with skips according to read-skip pattern: read skip time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 1 0 0.340110 1000000 2940229 1000 126956 42 0 0 0 0 0 0 0 97.5% 1 1 1.401352 500000 356798 993 127056 2 0 0 1 1 0 0 0 99.9% 1 8 0.463124 111112 239918 993 127056 2 0 0 1 1 0 0 0 99.8% 1 16 0.330050 58824 178228 993 127056 11 0 0 1 1 0 0 0 99.7% 1 32 0.255981 30304 118384 993 127056 8 0 0 1 1 0 0 0 99.7% 1 64 0.215160 15385 71505 997 127072 263 0 0 3 3 0 0 0 99.4% 1 256 0.159702 3892 24370 993 127056 239 0 0 1 1 0 0 0 95.6% 1 1024 0.094403 976 10339 993 127056 298 0 0 1 1 0 0 0 58.9% 1 4096 0.082501 245 2970 649 22452 391 252 0 1 1 0 0 0 20.1% 64 1 0.415227 984616 2371272 1059 127056 52 1 0 1 1 0 0 0 99.3% 64 8 0.391556 888896 2270166 993 127056 2 0 0 1 1 0 0 0 99.8% 64 16 0.372075 800000 2150102 993 127056 4 0 0 1 1 0 0 0 99.7% 64 32 0.337454 666688 1975641 993 127056 15 0 0 1 1 0 0 0 99.7% 64 64 0.296345 500032 1687333 997 127072 21 0 0 3 3 0 0 0 99.7% 64 256 0.199221 200000 1003911 993 127056 204 0 0 1 1 0 0 0 99.4% 64 1024 0.118224 58880 498037 993 127056 275 0 0 1 1 0 0 0 61.8% 64 4096 0.095098 15424 162191 687 23684 417 248 0 1 1 0 0 0 23.7% running: large-partition-slicing Testing slicing of large partition: offset read time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 0 1 0.000585 1 1709 3 164 2 1 0 1 1 0 0 0 10.7% 0 32 0.000589 32 54353 3 164 2 1 0 1 1 0 0 0 10.0% 0 256 0.000688 256 372293 4 196 2 1 0 1 1 0 0 0 20.7% 0 4096 0.004336 4096 944562 19 676 10 1 0 1 1 0 0 0 36.9% 500000 1 0.000877 1 1140 5 228 3 2 0 1 1 0 0 0 13.6% 500000 32 0.000883 32 36222 5 228 3 2 0 1 1 0 0 0 14.4% 500000 256 0.000963 256 265804 6 260 3 2 
0 1 1 0 0 0 22.0% 500000 4096 0.003008 4096 1361779 21 740 17 2 0 1 1 0 0 0 56.7% running: large-partition-slicing-clustering-keys Testing slicing of large partition using clustering keys: offset read time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 0 1 0.000623 1 1604 3 164 2 0 0 1 1 0 0 0 13.9% 0 32 0.000624 32 51261 3 164 2 0 0 1 1 0 0 0 14.7% 0 256 0.000714 256 358484 4 168 2 0 0 1 1 0 0 0 22.6% 0 4096 0.003687 4096 1110990 16 680 8 1 0 1 1 0 0 0 38.6% 500000 1 0.000973 1 1028 4 424 3 2 0 1 1 0 0 0 12.1% 500000 32 0.000914 32 35022 5 296 3 1 0 1 1 0 0 0 12.8% 500000 256 0.000986 256 259646 5 296 3 1 0 1 1 0 0 0 19.7% 500000 4096 0.003155 4096 1298122 11 744 6 1 0 1 1 0 0 0 44.5% running: large-partition-slicing-single-key-reader Testing slicing of large partition, single-partition reader: offset read time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 0 1 0.000766 1 1305 4 484 2 0 0 1 1 0 0 0 12.2% 0 32 0.000626 32 51111 3 164 2 0 0 1 1 0 0 0 15.2% 0 256 0.000710 256 360563 4 196 2 0 0 1 1 0 0 0 25.2% 0 4096 0.003963 4096 1033440 16 900 8 1 0 1 1 0 0 0 40.2% 500000 1 0.000877 1 1141 5 228 3 1 0 1 1 0 0 0 12.7% 500000 32 0.000882 32 36272 5 228 3 1 0 1 1 0 0 0 14.2% 500000 256 0.000959 256 266937 6 260 3 1 0 1 1 0 0 0 21.1% 500000 4096 0.003103 4096 1319992 21 740 14 2 0 1 1 0 0 0 53.9% running: large-partition-select-few-rows Testing selecting few rows from a large partition: stride rows time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 1000000 1 0.000631 1 1586 3 164 2 2 0 1 1 0 0 0 13.8% 500000 2 0.000872 2 2295 5 228 3 2 0 1 1 0 0 0 13.4% 250000 4 0.001483 4 2698 9 356 5 4 0 1 1 0 0 0 11.2% 125000 8 0.002894 8 2764 21 740 13 8 0 1 1 0 0 0 15.6% 62500 16 0.005182 16 3087 41 1380 25 16 0 1 1 0 0 0 19.5% 2 500000 0.942943 500000 530255 1040 127056 38 0 0 1 1 0 0 0 99.9% running: large-partition-forwarding Testing forwarding 
with clustering restriction in a large partition: pk-scan time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu yes 0.001807 2 1107 11 1380 3 8 0 1 1 0 0 0 18.9% no 0.000924 2 2165 5 228 3 1 0 1 1 0 0 0 14.1% running: small-partition-skips Testing scanning small partitions with skips. Reads whole range interleaving reads with skips according to read-skip pattern: read skip time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu -> 1 0 1.009953 1000000 990145 1104 139668 11 0 0 2 2 0 0 0 99.7% -> 1 1 2.213846 500000 225851 6200 177660 5109 0 0 5108 7679 0 0 0 70.3% -> 1 8 1.150029 111112 96617 6200 177660 5109 0 0 5108 9647 0 0 0 42.3% -> 1 16 0.989438 58824 59452 6200 177660 5109 0 0 5108 9913 0 0 0 33.2% -> 1 32 0.891590 30304 33989 6201 177664 5110 0 0 5108 10057 0 0 0 26.4% -> 1 64 0.840952 15385 18295 6199 177656 5108 0 0 5107 10135 0 0 0 21.6% -> 1 256 2.247875 3892 1731 5028 168948 3937 0 0 3936 7801 0 0 0 5.0% -> 1 1024 0.345917 976 2821 2076 146948 985 105 0 984 1955 0 0 0 10.0% -> 1 4096 0.088806 245 2759 739 18152 492 246 0 247 490 0 0 0 11.6% -> 64 1 1.821995 984616 540406 6274 177660 5119 0 0 5108 5187 0 0 0 63.9% -> 64 8 1.715052 888896 518291 6200 177660 5109 0 0 5108 5674 0 0 0 61.9% -> 64 16 1.620385 800000 493710 6200 177660 5109 0 0 5108 6143 0 0 0 59.4% -> 64 32 1.464497 666688 455233 6200 177660 5109 0 0 5108 6807 0 0 0 56.9% -> 64 64 1.311386 500032 381300 6200 177660 5109 0 0 5108 7660 0 0 0 50.0% -> 64 256 2.153954 200000 92853 5252 170616 4161 0 0 4160 6267 0 0 0 14.3% -> 64 1024 0.350275 58880 168097 2317 148784 1226 107 0 1225 1844 0 0 0 27.5% -> 64 4096 0.107498 15424 143482 807 19100 562 244 0 321 482 0 0 0 24.5% running: small-partition-slicing Testing slicing small partitions: offset read time (s) frags frag/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 0 1 0.002872 1 348 3 68 2 0 0 1 1 0 0 0 10.2% 0 32 0.002833 32 11297 3 68 
2 0 0 1 1 0 0 0 12.1% 0 256 0.003145 256 81404 4 104 2 0 0 1 1 0 0 0 17.9% 0 4096 0.008110 4096 505079 20 616 12 1 0 1 1 0 0 0 54.4% 500000 1 0.002934 1 341 3 72 2 1 0 1 2 0 0 0 10.6% 500000 32 0.002871 32 11145 3 72 2 0 0 1 2 0 0 0 12.0% 500000 256 0.003216 256 79598 5 112 3 0 0 2 2 0 0 0 18.3% 500000 4096 0.008557 4096 478692 21 624 12 1 0 2 2 0 0 0 51.9% " * 'projects/sstables-30/index_reader_cleanup/v3' of https://github.com/argenet/scylla: sstables: Remove "lower_" from index_reader public methods. sstables: Make index_reader::advance_upper_past() method private. sstables: Stop using index_reader::advance_upper_past() outside the class. sstables: Move promoted_index_block from types.hh to index_entry.hh. sstables: Factor out promoted index into a separate class. sstables: Use std::optional instead of std::experimental optional in index_reader.	2018-07-01 12:30:29 +03:00
Vladimir Krivopalov	b24eb5c11d	sstables: Remove "lower_" from index_reader public methods. The index_reader class public interface has been amended to only deal with the upper bound cursor along with advancing the lower bound. Since the class users can only explicitly operate with the lower bound cursor (take data file position, advance to the next partition, etc), it no longer makes sense to specify that the method operates on the lower bound cursor in its name. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-06-29 11:48:33 -07:00
Paweł Dziepak	e9dffc753c	tests/mutation: test external_memory_usage()	2018-06-28 19:20:23 +01:00
Paweł Dziepak	e69f2c361c	tests/mutation: properly mark atomic_cells that are collection members	2018-06-28 18:00:39 +01:00
Asias He	4050a4b24e	tests: Add test for multishard_writer	2018-06-28 17:20:29 +08:00
Asias He	8eccff1723	tests: Allow random_mutation_generator to generate mutations belonging to a remote shard - make_local_keys returns keys of current shard - make_keys returns keys of current or remote shard	2018-06-28 17:20:28 +08:00
Tomasz Grabiec	0a1aec2bd6	tests: perf_row_cache_update: Test with an active reader surviving memtable flush Exposes latency issues caused by mutation_cleaner lifetime issues, fixed by earlier commits.	2018-06-27 21:51:04 +02:00
Tomasz Grabiec	450985dfee	mvcc: Use RAII to ensure that partition versions are merged Before this patch, maybe_merge_versions() had to be manually called before partition snapshot goes away. That is error prone and makes client code more complicated. Delegate that task to a new partition_snapshot_ptr object, through which all snapshots are published now.	2018-06-27 21:51:04 +02:00
Avi Kivity	e1efda8b0c	Merge "Disable sstable filtering based on min/max clustering key components" from Tomasz " With DateTiered and TimeWindow, there is a read optimization enabled which excludes sstables based on overlap with recorded min/max values of clustering key components. The problem is that it doesn't take into account partition tombstones and static rows, which should still be returned by the reader even if there is no overlap in the query's clustering range. A read which returns no clustering rows can mispopulate cache, which will appear as partition deletion or writes to the static row being lost, until node restart or eviction of the partition entry. There is also a bad interaction between cache population on read and that optimization. When the clustering range of the query doesn't overlap with any sstable, the reader will return no partition markers for the read, which leads cache populator to assume there is no partition in sstables and it will cache an empty partition. This will cause later reads of that partition to miss prior writes to that partition until it is evicted from cache or node is restarted. Disable until a more elaborate fix is implemented. Fixes #3552 Fixes #3553 " * tag 'tgrabiec/disable-min-max-sstable-filtering-v1' of github.com:tgrabiec/scylla: tests: Add test for slicing a mutation source with date tiered compaction strategy tests: Check that database conforms to mutation source database: Disable sstable filtering based on min/max clustering key components	2018-06-27 14:28:27 +03:00
Tomasz Grabiec	4995a8c568	tests: mvcc: Use the standard maybe_merge_versions() to merge snapshots Preparation for switching to background merging.	2018-06-27 12:48:30 +02:00
Tomasz Grabiec	b4879206fb	tests: Add test for slicing a mutation source with date tiered compaction strategy Reproducer for https://github.com/scylladb/scylla/issues/3552	2018-06-26 18:54:44 +02:00
Tomasz Grabiec	826a237c2e	tests: Check that database conforms to mutation source	2018-06-26 18:54:44 +02:00
Avi Kivity	9a7ecdb3b9	Merge "Deglobalise cache_tracker" from Paweł " Cache tracker is a thread-local global object that indirectly depends on the lifetimes of other objects. In particular, a member of cache_tracker: mutation_cleaner may extend the lifetime of a mutation_partition until the cleaner is destroyed. The mutation_partition itself depends on LSA migrators which are thread-local objects. Since there is no direct dependency between LSA migrators and cache_tracker, it is not guaranteed that the former won't be destroyed before the latter. The easiest (barring some unit tests that repeat the same code several billion times) solution is to stop using globals. This series also improves the part of LSA sanitiser that deals with migrators. Fixes #3526. Tests: unit(release) " * tag 'deglobalise-cache-tracker/v1-rebased' of https://github.com/pdziepak/scylla: mutation_cleaner: add disclaimer about mutation_partition lifetime lsa: enhance sanitizer for migrators lsa: formalise migrator id requirements row_cache: deglobalise row cache tracker	2018-06-26 16:38:12 +01:00
Paweł Dziepak	96b0577343	row_cache: deglobalise row cache tracker Row cache tracker has numerous implicit dependencies on other objects (e.g. LSA migrators for data held by mutation_cleaner). The fact that both cache tracker and some of those dependencies are thread local objects makes it hard to guarantee correct destruction order. Let's deglobalise cache tracker and put it in the database class.	2018-06-25 09:37:43 +01:00
Paweł Dziepak	dca68afce6	cql3: add result class So far the only way of returning a result of a CQL query was to build a result_set. An alternative lazy result generator is going to be introduced for the simple cases when no transformations at CQL layer are needed. To do that we need to hide from the users the fact that there are going to be multiple representations of a CQL result.	2018-06-25 09:21:47 +01:00
Paweł Dziepak	3b9ba30497	tests: add test for reusable buffers	2018-06-25 09:21:47 +01:00
Paweł Dziepak	9d140488bd	tests/perf: add performance test for IDL	2018-06-25 09:21:47 +01:00
Paweł Dziepak	fe8dc1fa5c	bytes_ostream: add remove_suffix()	2018-06-25 09:21:47 +01:00
Paweł Dziepak	969219d5bc	tests/random-utils: add missing include	2018-06-25 09:21:47 +01:00
Avi Kivity	cb549c767a	database: rename column_family to table The name "column_family" is both awkward and obsolete. Rename to the modern and accurate "table". An alias is kept to avoid huge code churn. To prevent a One Definition Rule violation, a preexisting "table" type is moved to a new namespace row_cache_stress_test. Tests: unit (release) Message-Id: <20180624065238.26481-1-avi@scylladb.com>	2018-06-24 14:54:46 +03:00
Tomasz Grabiec	2d4177355a	Merge "Support for writing range tombstones to SSTables 3.x" from Vladimir This patchset brings support for writing range tombstones to SSTables 3.x ('mc' format). In SSTables 3.x, range tombstones are represented by so-called range tombstone markers (hereafter RT markers) that denote range tombstone start and end bounds. So each range tombstone is represented in the data file by two ordered RT markers. There are also markers that both close the previous range tombstone and open the new one when two range tombstones are adjacent. This is done to consume less disk space on such occasions. Range tombstones written as RT markers are naturally non-overlapping. * github.com:argenet/scylla projects/sstables-30/write-range-tombstones/v6 range_tombstone_stream: Remove an unused boolean flag. Revert "Add missing enum values to bound_kind." sstables: Move to_deletion_time helper up and make it static. sstables: Write end-of-partition byte before flushing the last index block. sstables: Add support for writing range tombstones in SSTables 3.x format. tests: Add unit test covering simple range tombstone. tests: Add unit test covering adjacent range tombstones. tests: Add test to cover non-adjacent RTs. tests: Add test covering mixed rows and range tombstones. tests: Add test covering SSTables 3.x with many RTs. tests: Add unit test covering overlapping RTs and rows. tests: Add tests writing a range tombstone and a row overlapping with its start. tests: Add tests writing a range tombstone and a row overlapping with its end. tests: Add function that writes from multiple memtable into SSTables. tests: Add test where 2nd range tombstone covers the remainder of the 1st one. tests: Add test writing two non-adjacent range tombstones with same clustering key prefix at their bounds. tests: Add test covering overlapped range tombstones.	2018-06-22 15:47:18 +02:00
Vladimir Krivopalov	ea09cf732d	tests: Add test covering overlapped range tombstones. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-06-20 18:08:36 -07:00
Vladimir Krivopalov	5df3cd1787	tests: Add test writing two non-adjacent range tombstones with same clustering key prefix at their bounds. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-06-20 18:08:36 -07:00
Vladimir Krivopalov	35b90b2d1e	tests: Add test where 2nd range tombstone covers the remainder of the 1st one. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-06-20 18:08:36 -07:00
Vladimir Krivopalov	2f277c29e8	tests: Add function that writes from multiple memtable into SSTables. This comes in handy when we want to test overlapping range tombstones because memtable would otherwise de-overlap them internally. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-06-20 18:08:36 -07:00
Vladimir Krivopalov	41d283fe83	tests: Add tests writing a range tombstone and a row overlapping with its end. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-06-20 18:08:36 -07:00
Vladimir Krivopalov	ff53f601e4	tests: Add tests writing a range tombstone and a row overlapping with its start. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-06-20 18:08:36 -07:00
Vladimir Krivopalov	f552f30d57	tests: Add unit test covering overlapping RTs and rows. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-06-20 18:08:36 -07:00

2414 Commits