scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 20:05:10 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	9da078a18a	tests: logalloc_test: Print debugging info and abort on failure The test started to fail sporadically on jenkins after `7a00dd6985` due to quiesce() timing out. It's not clear though if this is a regression because before the series such timeouts would not cause test failure if the future resolves eventually, timeout was only logged. I was not able to reproduce it on my setup nor on jenkins, so let's add more debugging output and trigger a coredump next time the test fails. Message-Id: <1487089576-27147-1-git-send-email-tgrabiec@scylladb.com>	2017-02-15 14:41:49 +02:00
Tomasz Grabiec	7ec8c4cf54	tests: mutation_source_test: Test that slicing returns only relevant range tombstones	2017-02-13 20:52:50 +01:00
Tomasz Grabiec	2b8bd10dca	tests: Pass all mutation source parameters	2017-02-13 20:52:49 +01:00
Tomasz Grabiec	25dffef6ae	tests: mutation_source_tests: Ensure timestamps are strictly monotonic	2017-02-13 16:19:32 +01:00
Tomasz Grabiec	e6a95fd8cc	tests: streamed_mutation_assertions: Add more expectation methods	2017-02-13 16:19:32 +01:00
Tomasz Grabiec	62843175ea	tests: streamed_mutation_assertions: Make produces_end_of_stream() give better error messages	2017-02-13 16:19:32 +01:00
Paweł Dziepak	4ffe0401ee	test/mutation_source: specify whether to generate counter mutations Tests using random mutation generator should be provided with both counter and non-counter mutations to ensure that both cases are sufficiently covered. However, mixed schemas (with both counter and non-counter columns) are not allowed so the RMG has to be explicitly told whether to use counter or non-counter schema.	2017-02-07 15:17:14 +00:00
Paweł Dziepak	294bf0bb7a	tests/canonical_mutation: don't try to upgrade incompatible schemas Test case test_reading_with_different_schemas uses randomly generated pairs of mutations and tries to upgrade one to the schema of the other. However, there are cases when one schema cannot be upgraded to another, for example, counter and non-counter schemas.	2017-02-07 15:17:14 +00:00
Avi Kivity	b18e54307f	tests: add --operations-per-shard option to perf_simple_query This helps achieve more repeatable runs that can then be compared via the Linux perf tool. The option overrides duration-based testing and runs the test for a specific number of iterations. Message-Id: <20170204172937.8462-1-avi@scylladb.com>	2017-02-06 12:08:04 +01:00
Avi Kivity	7a00dd6985	Merge "Avoid avalanche of tasks after memtable flush" from Tomasz "Before, the logic for releasing writes blocked on dirty worked like this: 1) When region group size changes and it is not under pressure and there are some requests blocked, then schedule request releasing task. 2) Request releasing task, if no pressure, runs one request and if there are still blocked requests, schedules the next request releasing task. If requests don't change the size of the region group, then either some request executes or there is a request releasing task scheduled. The number of scheduled tasks is at most 1, there is a single releasing thread. However, if requests themselves would change the size of the group, then each such change would schedule yet another request releasing thread, growing the task queue size by one. The group size can also change when memory is reclaimed from the groups (e.g. when it contains sparse segments). Compaction may start many request releasing threads due to group size updates. Such behavior is detrimental for performance and stability if there are a lot of blocked requests. This can happen on 1.5 even with modest concurrency because timed out requests stay in the queue. This is less likely on 1.6 where they are dropped from the queue. The releasing of tasks may start to dominate over other processes in the system. When the number of scheduled tasks reaches 1000, polling stops and server becomes unresponsive until all of the released requests are done, which is either when they start to block on dirty memory again or run out of blocked requests. It may take a while to reach pressure condition after memtable flush if it brings virtual dirty much below the threshold, which is currently the case for workloads with overwrites producing sparse regions. I saw this happening in a write workload from issue #2021 where the number of request releasing threads grew into thousands. 
Fix by ensuring there is at most one request releasing thread at a time. There will be one releasing fiber per region group which is woken up when pressure is lifted. It executes blocked requests until pressure occurs." * tag 'tgrabiec/lsa-single-threaded-releasing-v2' of github.com:cloudius-systems/seastar-dev: tests: lsa: Add test for reclaimer starting and stopping tests: lsa: Add request releasing stress test lsa: Avoid avalanche releasing of requests lsa: Move definitions to .cc lsa: Simplify hard pressure notification management lsa: Do not start or stop reclaiming on hard pressure tests: lsa: Adjust to take into account that reclaimers are run synchronously lsa: Document and annotate reclaimer notification callbacks tests: lsa: Use with_timeout() in quiesce()	2017-02-02 17:49:31 +02:00
Piotr Jastrzebski	36b2c4df19	row_cache_test: extend test_mvcc Make the test execute with and without an active reader to memtable that's flushed to cache. This improves the code coverage of MVCC with tests. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <007b6cd1ba7a84ea5675ea82e454bf1adf3b3330.1485954941.git.piotr@scylladb.com>	2017-02-02 13:51:32 +01:00
Paweł Dziepak	8671d8329d	perf_simple_query: add counter tables tests	2017-02-02 10:35:14 +00:00
Paweł Dziepak	99b21fbb86	tests: random_mutation_generator: generate counter cells	2017-02-02 10:35:14 +00:00
Paweł Dziepak	de2acd47c9	tests/sstables: test reading and writing counters	2017-02-02 10:35:14 +00:00
Paweł Dziepak	5905729c4a	sstables: read counter cells	2017-02-02 10:35:14 +00:00
Paweł Dziepak	de698105e4	tests/counter: test apply, difference and freeze	2017-02-02 10:35:14 +00:00
Paweł Dziepak	496b42fcc7	tests: add test for counters	2017-02-02 10:35:13 +00:00
Tomasz Grabiec	2fd339787b	tests: lsa: Add test for reclaimer starting and stopping	2017-02-01 17:41:56 +01:00
Tomasz Grabiec	f943296da0	tests: lsa: Add request releasing stress test	2017-02-01 17:41:55 +01:00
Piotr Jastrzebski	c7e95af0b0	row_cache_test: fix test_mvcc Currently the test does not wait for cache update to finish before carrying on with the checks. This makes the test nondeterministic and purely wrong because checks expect update to be finished. This patch changes the test to wait for update to finish. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <2a99bba24b1628466d3495332b48ef3ccdb43c26.1485862389.git.piotr@scylladb.com>	2017-01-31 11:37:29 +00:00
Tomasz Grabiec	f053b48f7c	tests: lsa: Adjust to take into account that reclaimers are run synchronously	2017-01-30 19:18:07 +01:00
Tomasz Grabiec	ed9ff19467	lsa: Document and annotate reclaimer notification callbacks They are called from region_group::update(), so must be alloc-free and noexcept.	2017-01-30 19:18:07 +01:00
Tomasz Grabiec	2ec6fe415e	tests: lsa: Use with_timeout() in quiesce() The current construct doesn't interrupt the test; the timeout failure is only logged.	2017-01-30 19:18:07 +01:00
Pekka Enberg	be0351b49c	cql3: Introduce raw_value and raw_value_view types Currently, the code is using bytes_opt and bytes_view_opt to represent CQL values, which can hold a value or null. In preparation for supporting a third state, unset value introduced in CQL v4, introduce new raw_value and raw_value_view types and use them instead. The new types are based on boost::variant<> and are capable of holding null, unset values, and blobs that represent a value.	2017-01-26 13:50:04 +02:00
Tomasz Grabiec	2c7902fb2b	Revert "lsa: Reduce reclamation latency" This reverts commit `d61002cc33`. Introduced a regression in row_cache_alloc_stress. The problem is that reclaim_from_evictable() evicts way too much after the refactor due to the stop condition not taking into account how much data was evicted so far and only looking at occupancy of the minimal segment. This may lead to eviction of the whole region.	2017-01-26 10:43:18 +01:00
Duarte Nunes	54a464ae27	random_mutation_generator: Always generate range tombstones Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-01-23 19:02:23 +01:00
Duarte Nunes	a01aa91c82	range_tombstone_list: Add unit tests for difference() Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-01-23 18:14:33 +01:00
Benoît Canet	bcc826cc34	mutation_reader: Short circuit the read path on empty range Add a boolean to short circuit the read path on empty range hoping for some speedup. tested in read write with cs using: cl=QUORUM duration=1m -mode native cql3 -rate threads=700 -node localhost Will do some additional benchmark. Fixes #1056 Signed-off-by: Benoît Canet <benoit@scylladb.com> Message-Id: <20170118194451.16836-1-benoit@scylladb.com>	2017-01-20 10:05:40 +00:00
Tomasz Grabiec	d61002cc33	lsa: Reduce reclamation latency Currently eviction is performed until occupancy of the whole region drops below the 85% threshold. This may take a while if region had high occupancy and is large. We could improve the situation by only evicting until occupancy of the sparsest segment drops below the threshold, as is done by this change. I tested this using a c-s read workload in which the condition triggers in the cache region, with 1G per shard: lsa-timing - Reclamation cycle took 12.934 us. lsa-timing - Reclamation cycle took 47.771 us. lsa-timing - Reclamation cycle took 125.946 us. lsa-timing - Reclamation cycle took 144356 us. lsa-timing - Reclamation cycle took 655.765 us. lsa-timing - Reclamation cycle took 693.418 us. lsa-timing - Reclamation cycle took 509.869 us. lsa-timing - Reclamation cycle took 1139.15 us. The 144ms pause is when large eviction is necessary. The change improves worst case latency. Reclamation time statistics over 30 second period after cache fills up, in microseconds: Before: avg = 1524.283148 stdev = 11021.021118 min = 12.934000 max = 144356.000000 sum = 257603.852000 samples = 169 After: avg = 1317.362414 stdev = 1913.542802 min = 263.935000 max = 19244.600000 sum = 175209.201000 samples = 133 Refs #1634. Message-Id: <1484730859-11969-1-git-send-email-tgrabiec@scylladb.com>	2017-01-19 17:35:36 +02:00
Tomasz Grabiec	ddfee57c97	Replace iostream include with iosfwd in headers Message-Id: <1484656119-8386-4-git-send-email-tgrabiec@scylladb.com>	2017-01-17 14:52:44 +02:00
Paweł Dziepak	e03868c226	tests: run with all features enabled Since `ce083308a1` "random_mutation_generator: Generate RTs by default" random mutation generator produces range tombstones. However, so far the tests were run with all features disabled (because of incomplete initialization of all services) which meant that RANGE_TOMBSTONE feature was not enabled and the code couldn't handle range tombstones that weren't just prefixes. This patch solves the problem by forcing all features to be enabled when tests are run. Message-Id: <20170116103324.22956-1-pdziepak@scylladb.com>	2017-01-16 11:38:45 +01:00
Duarte Nunes	ce083308a1	random_mutation_generator: Generate RTs by default This patch changes the random_mutation_generator so it generates range tombstones by default. This fixes a bug where reversibly applying range tombstones wasn't being tested. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170110164822.28747-1-duarte@scylladb.com>	2017-01-11 09:24:37 +00:00
Avi Kivity	0591303b72	Merge "avoid excessive memory usage during resharding" from Raphael "Intended to reduce memory usage when resharding by sharing sstable components among shards. File descriptors are also shared from now on, meaning that a much smaller number of file descriptors will be used during resharding. Fixes #1951." branch 'excessive_memory_usage_v4' of github.com:raphaelsc/scylla * 'excessive_memory_usage_v4' of github.com:raphaelsc/scylla: db: avoid excessive memory usage during resharding checked_file_impl: add support to dup sstables: group sstable components that can be shared among shards sstables: rename sstable member	2017-01-09 20:43:50 +02:00
Raphael S. Carvalho	68dfcf5256	db: avoid excessive memory usage during resharding After resharding, sstables may be owned by all shards, which means that file descriptors and memory usage for metadata will increase by a factor equal to number of shards. That can easily lead to OOM. SSTable components are immutable, so they can be stored in one shard and shared with others that need it. We use the following formula to decide which shard will open the sstable and share it with the others: (generation % smp::count), which is the inverse of how we calculate generation for new sstables. So if no resharding is performed, everything is shard-local. With this approach, resource usage due to loaded sstables will be evenly distributed among shards. For this approach to work, we now only populate keyspaces from shard 0. It is now solely responsible for iterating through column family dirs. In addition, most of the population functions are now free and take distributed database object as parameter. Fixes #1951. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-01-09 15:24:36 -02:00
Avi Kivity	77cb2b452f	Merge "CQL 3.3.1 support" from Pekka "This patch series adds support for CQL 3.3.1. The changes to CQL are listed here: https://github.com/apache/cassandra/blob/cassandra-2.2/doc/cql3/CQL.textile#changes The following CQL features are already supported by Scylla: - TRUNCATE TABLE alias - Double-dollar string literals - Aggregate functions: MIN, MAX, SUM, and AVG This series adds the following CQL features: - New data types: tinyint, smallint, date, and time - CQL binary protocol v4 (required by the new data types) - Advertise Cassandra 2.2.8 version from Scylla so that drivers correctly detect the presence of CQL 3.3.1 The following CQL features are not supported by Scylla: - Role-based access control (issue #1941) - JSON data type - User-defined functions (UDFs) - User-defined aggregates (UDAs) The following CQL binary protocol v4 changes are not implemented by this series: - Read_failure and Write_failure error codes are not implemented. These error codes are not used by the smart drivers but as they are propagated to application code, we eventually need to wire them up to our storage proxy implementation. - Function_failure error code is only used by user-defined functions and the fromJson function, which are not implemented by Scylla. Fixes #1284." 
* 'penberg/cql-3.3.1/v5' of github.com:cloudius-systems/seastar-dev: version: Bump Cassandra version to 2.2.8 db/schema_tables: Add schema_functions and schema_aggregates tables tests/type_tests: TIME type test cases tests/cql_query_test: TIME type test cases cql3: TIME data type support tests/type_tests: DATE type test cases tests/cql_query_test: DATE type test cases cql3: DATE type support date.h: 64-bit year and days representation licenses: Add utils/date.h license utils/date.h: Import date and time library sources tests/type_tests: TINYINT and SMALLINT type test cases tests/cql_query_test: TINYINT and SMALLINT type test cases cql3: TINYINT and SMALLINT data type support types: Fix integer_type_impl::parse_int() for bytes	2017-01-09 11:54:45 +02:00
Pekka Enberg	10facd7db8	tests/type_tests: TIME type test cases	2017-01-09 10:42:21 +02:00
Pekka Enberg	a49ee9387e	tests/cql_query_test: TIME type test cases	2017-01-09 10:42:20 +02:00
Pekka Enberg	9ceea7bbc4	tests/type_tests: DATE type test cases	2017-01-09 10:42:20 +02:00
Pekka Enberg	f0cbfb9e4f	tests/cql_query_test: DATE type test cases	2017-01-09 10:42:20 +02:00
Raphael S. Carvalho	eed2a7d065	sstables: group sstable components that can be shared among shards We intend to share immutable sstable components among shards to reduce excessive memory usage when resharding shared sstables. This change is about grouping those components into a structure, and using foreign ptr to make sure that the structure will be deleted by whichever shard created it. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-01-06 15:16:19 -02:00
Raphael S. Carvalho	a492f8dfaf	sstables: rename sstable member Rename _components to _recognized_components because _components will be used to name a field with shareable components. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-01-06 15:16:17 -02:00
Avi Kivity	be11b054e1	Merge "Reduce the size of mutation_partition" from Piotr "Reduce the size of mutation_partition by implementing intrusive set using bi::rbtree_algorithms directly and using tree nodes optimized for size. This will reduce the size of mutation_partition by: 24 bytes + <number of cql rows> * 8 bytes This should have a positive impact on performance because mutation_partitions are stored both in memtable and cache. Fixes #742." * 'haaawk/742' of github.com:cloudius-systems/seastar-dev: intrusive_set: rename size() to calculate_size() Make intrusive_set_external_comparator::_value_traits static Implement intrusive set using rbtree_algorithms mutation_partition: make apply_reversibly_intrusive_set nongeneric mutation_partition: take schema in find_row and clustered_row mutation_partition: Extract intrusive set logic to a class. mutation_partition: Replace value_comp with key_comp calls	2017-01-05 17:34:10 +02:00
Piotr Jastrzebski	b159e08764	intrusive_set: rename size() to calculate_size() This hopefully will make it more apparent that the time complexity of this method is O(N) not O(1). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-01-05 12:21:43 +01:00
Piotr Jastrzebski	4bbe05dd47	mutation_partition: take schema in find_row and clustered_row This will allow intrusive set implementation that does not store schema. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-01-05 11:26:03 +01:00
Pekka Enberg	0ea5652354	tests/type_tests: TINYINT and SMALLINT type test cases	2017-01-05 10:57:35 +02:00
Pekka Enberg	41e3327ebc	tests/cql_query_test: TINYINT and SMALLINT type test cases	2017-01-05 10:57:35 +02:00
Pekka Enberg	060841b756	tests/types_test: Fix int32 type string conversion boundary case The test case is interested in the upper boundary of 32-bit integer because we already test the lower boundary in assertions below. The old test passed, of course, but it wasn't very interesting. Message-Id: <1483522773-6008-1-git-send-email-penberg@scylladb.com>	2017-01-04 11:57:02 +01:00
Avi Kivity	868b4d110c	Merge "Fixes for intentional short reads" from Paweł "This patchset contains fixes for the changes introduced in "Query result size limiting". It also improves handling of short data reads. In order to minimise chances of digest mismatch during data queries, replicas that were asked just to return a digest also keep track of the size of the data (in the IDL representation) so that they would stop at the same point nodes doing full data queries would. Moreover, data queries are not affected by per-shard memory limit and the coordinator sends individual result size limits to replicas in order not to depend on hardcoded values. It is still possible to get digest mismatches if the IDL changes (e.g. a new field is added), but, hopefully, that won't be a serious problem." * 'pdziepak/short-read-fixes/v4' of github.com:cloudius-systems/seastar-dev: query: introduce result_memory_accounter::foreign_state storage_proxy: fix short reads in parallel range queries storage_proxy: pass maximum result size to replicas mutation_partition: use result limiter for digest reads query: make result_memory_limiter constants available for linker result_memory_limiter: add accounter for digest reads idl: allow writers to use any output stream result_memory_limiter: split new_read() to new_{data, mutation}_read() idl: is_short_read() was added in 1.6 mutation_partition: honour allowed_short_read for static rows storage_proxy: fix _is_short_read computation storage_proxy: disallow short reads if got no live rows storage_proxy: don't stop after result with no live rows	2016-12-26 10:42:49 +02:00
Avi Kivity	1d9ee358f1	Revert "Merge "Reduce the size of mutation_partition" from Piotr" This reverts commit `aa392810ff`, reversing changes made to a24ff47c637e6a5fd158099b8a65f1191fc2d023; it uses boost::intrusive::detail directly, which it must not, and doesn't compile on all boost versions as a consequence.	2016-12-25 16:07:48 +02:00
Avi Kivity	aa392810ff	Merge "Reduce the size of mutation_partition" from Piotr "Reduce the size of mutation_partition by implementing intrusive set using bi::rbtree_algorithms directly and using tree nodes optimized for size. This will reduce the size of mutation_partition by: 24 bytes + <number of cql rows> * 8 bytes This should have a positive impact on performance because mutation_partitions are stored both in memtable and cache. Fixes #742." * 'haaawk/742' of github.com:cloudius-systems/seastar-dev: intrusive_set: rename size() to calculate_size() Make intrusive_set_external_comparator::_value_traits static Implement intrusive set using rbtree_algorithms mutation_partition: make apply_reversibly_intrusive_set nongeneric mutation_partition: take schema in find_row and clustered_row mutation_partition: Extract intrusive set logic to a class. mutation_partition: Replace value_comp with key_comp calls	2016-12-25 12:56:10 +02:00
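
Commit `68dfcf5256` in the log above picks the shard that opens a shared sstable with the formula (generation % smp::count). The following is only a minimal sketch of that mapping, not the code from the commit: `owner_shard` and `next_generation_for` are hypothetical names, and `shard_count` stands in for `smp::count`. The second helper is just one possible way to produce generations that map back to a given shard; the commit message says only that generation assignment is the inverse of the formula.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the shard that opens an sstable and shares its
// immutable components, following the formula quoted in the commit message.
// `shard_count` is an assumed stand-in for smp::count.
inline unsigned owner_shard(std::int64_t generation, unsigned shard_count) {
    assert(shard_count > 0 && generation >= 0);
    return static_cast<unsigned>(generation % shard_count);
}

// Hypothetical inverse: returns a generation greater than `last` for which
// owner_shard() yields `shard`. It is not necessarily the smallest such value.
inline std::int64_t next_generation_for(std::int64_t last, unsigned shard,
                                        unsigned shard_count) {
    assert(shard < shard_count && last >= 0);
    const std::int64_t base = (last / shard_count + 1) * shard_count;
    return base + shard;
}
```

With generations assigned this way, an sstable written by a shard is opened by the same shard, which matches the commit's remark that everything stays shard-local when no resharding happens.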

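The merge `7a00dd6985` in the log above replaces "one releasing task per size change" with a single releaser per region group. The sketch below is a simplified, synchronous model of that idea in plain C++. It is not Seastar or Scylla code, and `region_group_model` and all of its members are hypothetical names chosen for illustration. A re-entrancy flag ensures that size updates made by released requests do not start additional releasers.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical, synchronous model of a region group with blocked requests.
class region_group_model {
    std::deque<std::function<void()>> _blocked; // requests waiting for memory
    std::ptrdiff_t _used = 0;                   // current group size
    std::ptrdiff_t _threshold;                  // pressure starts at this size
    bool _releasing = false;                    // true while the releaser runs
    std::size_t _releaser_starts = 0;           // bookkeeping for illustration
public:
    explicit region_group_model(std::ptrdiff_t threshold) : _threshold(threshold) {}

    bool under_pressure() const { return _used >= _threshold; }
    std::size_t releaser_starts() const { return _releaser_starts; }

    // Runs the request immediately if possible, otherwise queues it.
    void run_when_memory_available(std::function<void()> req) {
        if (_blocked.empty() && !under_pressure()) {
            req();
        } else {
            _blocked.push_back(std::move(req));
        }
    }

    // Called whenever the group size changes, including from inside requests.
    void update(std::ptrdiff_t delta) {
        _used += delta;
        release_requests();
    }

private:
    // At most one releaser is active; nested calls return immediately.
    void release_requests() {
        if (_releasing || _blocked.empty() || under_pressure()) {
            return;
        }
        _releasing = true;
        ++_releaser_starts;
        while (!_blocked.empty() && !under_pressure()) {
            auto req = std::move(_blocked.front());
            _blocked.pop_front();
            req(); // may call update(); the flag prevents a second releaser
        }
        _releasing = false;
    }
};
```

In this model, lifting pressure once drains the queue in a single loop even when every released request changes the group size, so the number of releasers does not grow with the number of size updates.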
1300 Commits