scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-29 11:10:40 +00:00

Author	SHA1	Message	Date
Avi Kivity	9d8507de09	Merge "Optimize checksum_combine() for CRC32" from Tomek
"
zlib's crc32_combine() is not very efficient. It is faster to re-combine the buffer using crc32(). It's still a substantial amount of work which could be avoided.
This patch introduces a fast implementation of crc32_combine() which uses a different algorithm than zlib. It also utilizes intrinsics for the carry-less multiplication instruction to perform the computation faster. The details of the algorithm can be found in code comments.
Performance results using perf_checksum and a second buffer of length 64 KiB:
  zlib CRC32 combine:    38'851 ns
  libdeflate CRC32:       4'797 ns
  fast_crc32_combine():      11 ns
So the new implementation is 3500x faster than zlib's, and 417x faster than re-checksumming the buffer using libdeflate.
Tested on i7-5960X CPU @ 3.00GHz
Performance was also evaluated using the sstable writer benchmark:
  perf_fast_forward --populate --sstable-format=mc --data-directory /tmp/perf-mc \
      --value-size=10000 --rows 1000000 --datasets small-part
It yielded a 9% improvement in median frag/s (129'055 vs 117'977).
Refs #3874
"
* tag 'fast-crc32-combine-v2' of github.com:tgrabiec/scylla:
  tests: perf_checksum: Test fast_crc32_combine()
  tests: Rename libdeflate_test to checksum_utils_test
  tests: libdeflate: Add more tests for checksum_combine()
  tests: libdeflate: Check both libdeflate and default checksummers
  sstables: Use fast_crc_combine() in the default checksummer
  utils/gz: Add fast implementation of crc32_combine()
  utils/gz: Add pre-computed polynomials
  utils/gz: Import Barrett reduction implementation from libdeflate
  utils: Extract clmul() from crc.hh
(cherry picked from commit `b098b5b987`)	2018-12-08 13:42:43 +02:00
Avi Kivity	b9c046b17b	Merge "Optimize checksum computation for the MC sstable format" from Tomek
"
One part of the improvement comes from replacing zlib's CRC32 with the one from libdeflate, which is optimized for modern architecture and utilizes the PCLMUL instruction.
perf_checksum test was introduced to measure performance of various checksumming operations.
Results for 514 B (relevant for writing with compression enabled):
  test                                  iterations  median     mad        min        max
  crc_test.perf_deflate_crc32_combine   58414       16.711us   3.483ns    16.708us   16.725us
  crc_test.perf_adler_combine           165788278   6.059ns    0.031ns    6.027ns    7.519ns
  crc_test.perf_zlib_crc32_combine      59546       16.767us   26.191ns   16.741us   16.801us
  ---
  crc_test.perf_deflate_crc32_checksum  12705072    83.267ns   4.580ns    78.687ns   98.964ns
  crc_test.perf_adler_checksum          3918014     206.701ns  23.469ns   183.231ns  258.859ns
  crc_test.perf_zlib_crc32_checksum     2329682     428.787ns  0.085ns    428.702ns  510.085ns
Results for 64 KB (relevant for writing with compression disabled):
  test                                  iterations  median     mad        min        max
  crc_test.perf_deflate_crc32_combine   25364       38.393us   17.683ns   38.375us   38.545us
  crc_test.perf_adler_combine           169797143   5.842ns    0.009ns    5.833ns    6.901ns
  crc_test.perf_zlib_crc32_combine      26067       38.663us   95.094ns   38.546us   40.523us
  ---
  crc_test.perf_deflate_crc32_checksum  202821      4.937us    14.426ns   4.912us    5.093us
  crc_test.perf_adler_checksum          44684       22.733us   206.263ns  22.492us   25.258us
  crc_test.perf_zlib_crc32_checksum     18839       53.049us   36.117ns   53.013us   53.274us
The new CRC32 implementation (deflate_crc32) doesn't provide a fast checksum_combine() yet, it delegates to zlib so it's as slow as the latter. Because for CRC32 checksum_combine() is several orders of magnitude slower than checksum(), we avoid calling checksum_combine() completely for this checksummer. We still do it for adler32, which has combine() which is faster than checksum().
SStable write performance was evaluated by running:
  perf_fast_forward --populate --data-directory /tmp/perf-mc \
      --rows=10000000 -c1 -m4G --datasets small-part
Below is a summary of the average frag/s for a memtable flush. Each result is an average of about 20 flushes with stddev of about 4k.
Before:
  [1] MC,lz4: 330'903
  [2] LA,lz4: 450'157
  [3] MC,checksum: 419'716
  [4] LA,checksum: 459'559
After:
  [1'] MC,lz4: 446'917 ([1] + 35%)
  [2'] LA,lz4: 456'046 ([2] + 1.3%)
  [3'] MC,checksum: 462'894 ([3] + 10%)
  [4'] LA,checksum: 467'508 ([4] + 1.7%)
After this series, the performance of the MC format writer is similar to that of the LA format before the series. There seems to be a small but consistent improvement for LA too. I'm not sure why.
"
* tag 'improve-mc-sstable-checksum-libdeflate-v3' of github.com:tgrabiec/scylla:
  tests: perf: Introduce perf_checksum
  tests: Add test for libdeflate CRC32 implementation
  sstables: compress: Use libdeflate for crc32
  sstables: compress: Rename crc32_utils to zlib_crc32_checksummer
  licenses: Add libdeflate license
  Integrate libdeflate with the build system
  Add libdeflate submodule
  sstables: Avoid checksum_combine() for the crc32 checksummer
  sstables: compress: Avoid unnecessary checksum_combine()
  sstables: checksum_utils: Add missing include
(cherry picked from commit `5e759b0c07`)	2018-12-08 13:42:43 +02:00
Duarte Nunes	1953c5fa61	Merge 'Fix filtering with LIMIT' from Piotr " This series adds proper handling of filtering queries with LIMIT. Previously the limit was erroneously applied before filtering, which led to truncated results. To avoid that, paged filtering queries now use an enhanced pager, which remembers how many rows were dropped and uses that information to fetch more pages if the limit is not yet reached. For unpaged filtering queries, paging is done internally, as in the case of aggregations, to avoid keeping huge results in memory. Also, previously, all limited queries used a page size computed from max(page size, limit). That is not good for filtering, because with LIMIT 1 we would then query for rows one-by-one. To avoid that, filtered queries ask for the whole page and the results are truncated afterwards if need be. Tests: unit (release) " * 'fix_filtering_with_limit_2' of https://github.com/psarna/scylla: tests: add filtering with LIMIT test tests: split filtering tests from cql_query_test cql3: add proper handling of filtering with LIMIT service/pager: use dropped_rows to adjust how many rows to read service/pager: virtualize max_rows_to_fetch function cql3: add counting dropped rows in filtering pager (cherry picked from commit `1afda28cf3`)	2018-12-02 12:07:46 +02:00
Botond Dénes	6779b63dfe	tests: add unit test for multishard_mutation_query()	2018-09-03 10:31:44 +03:00
Jesse Haber-Kucharsky	c10fcbf7a5	auth: Add unit tests for password handling This will mean we can make changes more confidently.	2018-08-13 13:24:45 -04:00
Paweł Dziepak	166c9a3b8c	tests: add test for fragmented_temporary_buffer	2018-07-18 12:28:06 +01:00
Paweł Dziepak	b5a72a880b	tests: add basic test for transport requests and responses	2018-07-18 12:28:06 +01:00
Avi Kivity	99d3f0a1b1	tests: add obserable_test to test suite Message-Id: <20180711071131.13702-1-avi@scylladb.com>	2018-07-11 10:15:01 +01:00
Paweł Dziepak	07a429e837	test.py: do not disable human-readable format with --jenkins flag When test.py is run with --jenkins flag Boost UTF is asked to generate an XML file with the test results. This automatically disables the human-readable output printed to stdout. There is no real reason to do so and it is actually less confusing when the Boost UTF messages are in the test output together with Scylla logger messages. Message-Id: <20180704172913.23462-1-pdziepak@scylladb.com>	2018-07-05 09:31:15 +03:00
Botond Dénes	da53ea7a13	tests.py: add --jobs command line parameter Allowing for setting the number of jobs to use for running the tests. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <d58d6393c6271bffc37ab3b5edc37b00ef485d9c.1529433590.git.bdenes@scylladb.com>	2018-07-01 12:26:41 +03:00
Asias He	fd8b7efb99	tests: Add multishard_writer_test to test.py For multishard_writer class testing.	2018-06-28 17:20:29 +08:00
Paweł Dziepak	3b9ba30497	tests: add test for reusable buffers	2018-06-25 09:21:47 +01:00
Avi Kivity	ba5d8717c8	tests: disable reactor stall notifier In case it is interacting badly with ASAN and causing spurious test failures.	2018-06-10 15:55:00 +03:00
Paweł Dziepak	cc76480174	tests: introduce tests for metaprogramming helpers	2018-05-31 10:09:01 +01:00
Avi Kivity	ff3e86888a	tests: report tests as they are completed As each test completes, report it. This prevents a long-running test in the beginning of the list from stalling output. Message-Id: <20180526173517.23078-1-avi@scylladb.com>	2018-05-28 13:58:01 +03:00
Botond Dénes	204f6fd478	test.py: print test args when listing failed tests This can be very helpful when a test only fails when run with some particular arguments. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <dac1f7e23afa904156e65c3bb3c8fd52b7e999ff.1526906955.git.bdenes@scylladb.com>	2018-05-21 17:28:18 +03:00
Botond Dénes	f96084d38e	test.py: add custom seastar flags for mutation_reader_test Use -c3 if possible (if the machine has at least 3 cores).	2018-04-30 17:17:45 +03:00
Botond Dénes	52f0bb0481	test.py: move custom seastar flags for tests declarative	2018-04-30 17:17:45 +03:00
Avi Kivity	13ea1a89b5	Merge "Implement loading sstables in 3.x format" from Piotr " Pass sstable version to parse, write and describe_type methods to make it possible to handle different versions. For now serialization header from 3.x format is ignored. Tests: units (release) " * 'haaawk/sstables3/loading_v4' of ssh://github.com/scylladb/seastar-dev: Add test for loading the whole sstable Add test for loading statistics Add support for 3_x stats metadata Pass sstable version to describe_type Pass sstable version to write methods metadata_type: add Serialization type Pass sstable_version_types to parse methods Add test for reading filter Add test for read_summary sstables 3.x: Add test for reading TOC sstable: Make component_map version dependent sstable::component_type: add operator<< Extract sstable::component_type to separate header Remove unused sstable::get_shared_components sstable_version_types: add mc version	2018-04-24 12:49:41 +03:00
Piotr Jastrzebski	10f9b06145	sstables 3.x: Add test for reading TOC Make sure DigestCRC32 is handled correctly. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-04-24 11:30:26 +02:00
Nadav Har'El	4af2604e76	secondary index: update test.py I forgot that I also need to update test.py for the new test. It's unfortunate that this script doesn't pick up the list of tests automatically (perhaps with a black-list of tests we don't want to run). I wonder if there are additional tests we are forgetting to run. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180424085911.29732-1-nyh@scylladb.com>	2018-04-24 12:11:38 +03:00
Duarte Nunes	cc6c96bc92	tests: Add view_complex_test This patch introduces view_complex_test and adds more test coverage for materialized views. A new file was introduced to avoid making view_schema_test slower. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:03 +01:00
Avi Kivity	28be4ff5da	Revert "Merge "Implement loading sstables in 3.x format" from Piotr" This reverts commit `513479f624`, reversing changes made to `01c36556bf`. It breaks booting. Fixes #3376.	2018-04-23 06:47:00 +03:00
Piotr Jastrzebski	6c2cf40ce8	sstables 3.x: Add test for reading TOC Make sure DigestCRC32 is handled correctly. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-04-22 13:46:12 +02:00
Avi Kivity	2c2175ab34	Merge "Add support for reading variant integers from SSTables" from Piotr " Enhance continuous_data_consumer to use existing vint serialization for reading variant integers from SSTables. Also available at: https://github.com/scylladb/seastar-dev/commits/haaawk/sstables3/unsigned-vint-v6 Tests: units (release) " * 'haaawk/sstables3/unsigned-vint-v6' of ssh://github.com/scylladb/seastar-dev: sstables: add test for continuous_data_consumer::read_unsigned_vint buffer_input_stream: make it possible to specify chunk size Add tests for make_limiting_data_source Introduce make_limiting_data_source sstables: add continuous_data_consumer::read_unsigned_vint Cover serialized_size_from_first_byte in tests core: add unsigned_vint::serialized_size_from_first_byte sstables: add all dependant headers to consumer.hh sstables: add all dependant headers to exceptions.hh core: add #pragma once to vint-serialization.hh	2018-04-17 10:09:38 +03:00
Piotr Jastrzebski	c5dda1c0c9	sstables: add test for continuous_data_consumer::read_unsigned_vint Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-04-16 21:14:34 +02:00
Piotr Jastrzebski	4406d11095	Add tests for make_limiting_data_source Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-04-16 21:00:35 +02:00
Avi Kivity	4c588de70f	tests: apply overprovisioned flag to all tests Some tests escaped the --overprovisioned flag, causing them to compete over cpu 0. Add the flag to all tests. Message-Id: <20180410181606.8341-1-avi@scylladb.com>	2018-04-11 10:48:52 +02:00
Botond Dénes	49128d12cf	Move querier_cache_resource_based_eviction test into querier_cache.cc Turns out do_with_cql_env can be used from within SEASTAR test cases so no reason to have a separate file for a single test case. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <028a28b7d90a3bc5ed4719ce273da05880133c0e.1523432699.git.bdenes@scylladb.com>	2018-04-11 10:55:19 +03:00
Duarte Nunes	9f5cfa76f7	tests/view_build_test: Add tests for view building This is a separate file from view_schema_test because that one is already becoming too long to run; also, having multiple test files means they can be executed in parallel. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-03-27 01:20:11 +01:00
Botond Dénes	0e6aa91269	Fix test.py output and error handling * Don't dump output of failed tests immediately, print the output for failed tests in the end instead. * Fix exception printing in run_test(): don't assume passed in error object is a `bytes` (or bytes-like) object, call the object's str operator instead and let callers encode bytes objects instead. * Don't assume Exception object has an `out` member, use operator str instead to convert it to string. * Don't print progress in run_test() directly because it results in incomprehensible output as the executors race to print to stdout. Leave progress report to the caller who can serialize progress prints. * Automatically detect non-tty stdout and don't try to edit already printed text. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <7bb7e0003ded9b28710250bff851ea849bb99f7d.1522062795.git.bdenes@scylladb.com>	2018-03-26 14:26:45 +03:00
Avi Kivity	601d8f7cff	test: switch boost.test from --log_sink to --logger Upstream fix works only for --logger according to https://github.com/boostorg/test/pull/124 Message-Id: <20180319121520.11110-1-avi@scylladb.com>	2018-03-19 13:26:28 +01:00
Avi Kivity	31b86a46a0	tests: run tests in parallel Launch tests in a concurrent executor with worker count determined by available memory.	2018-03-19 12:17:10 +02:00
Avi Kivity	638611a350	tests: simplify timeout handling The subprocess module can handle timeouts itself, so use this to simplify the module code.	2018-03-19 12:16:58 +02:00
Avi Kivity	95abed020b	tests: don't require crash integrity We don't resume tests after crashes, so no need to spend time waiting for the disk to fsync.	2018-03-19 12:16:58 +02:00
Avi Kivity	b3d8dadf0c	tests: allow sharing the machine with other tests By using the overprovisioned flag, we reduce polling and pinning, so less CPU time is wasted and the scheduler has more options to schedule reactor threads.	2018-03-19 12:16:58 +02:00
Avi Kivity	3d84c8945d	tests: extract seastar options to a separate variable	2018-03-19 12:16:58 +02:00
Avi Kivity	8b1cff90ce	tests: reduce memory for tests If we reduce memory for an individual test, we can run more in parallel.	2018-03-19 12:16:58 +02:00
Avi Kivity	c3750176d8	tests: add "--" unconditionally for boost tests Now that we have a minimum boost version, we don't need to check whether boost requires "--" before test-specific command line arguments. Removing the check speeds up the test a little.	2018-03-19 12:16:58 +02:00
Botond Dénes	c0009750c3	Add unit test for resource based cache eviction Specifically for the reader-permit based eviction. This test lives in a separate executable as it uses with_cql_test_env() and thus needs a main() of its own.	2018-03-13 16:20:50 +02:00
Botond Dénes	c53b6f75c8	Add unit tests for querier_cache	2018-03-13 12:59:45 +02:00
Jesse Haber-Kucharsky	90af3d889a	tests: Rename test for consistency Now we have `cql_auth_query_test` and `cql_auth_syntax_test`.	2018-03-01 12:06:59 -05:00
Jesse Haber-Kucharsky	62bfc3939c	tests: Add CQL syntax tests for access-control These are quick-running tests for verifying the accepted forms of CQL statements (and fragments) related to access-control: users, roles, and permissions. Establishing the allowed forms of statements is helpful for reference, but also makes syntax changes (like those expected in later patches) clearer and more safe.	2018-03-01 11:46:37 -05:00
Avi Kivity	d973445a94	Merge "sstable/schema extensions" from Calle " Adds extension points to schema/sstables to enable hooking in stuff, like, say, something that modifies how sstable disk io works. (Cough, cough, encryption) Extensions are processed as property keywords in CQL. To add an extension, a "module" must register it into the extensions object on boot time. To avoid globals (and yet don't), extensions are reachable from config (and thus from db). Table/view tables already contain an extension element, so we utilize this to persist config. schema_tables tables/views from mutations now require a "context" object (currently only extensions, but abstracted for easier further changes). Because of how schemas currently operate, there is a super lame workaround to allow "schema_registry" access to config and by extension extensions. DB, upon instantiation, calls a thread local global "init" in schema_registry and registers the config. It, in turn, can then call table_from_mutations as required. Includes the (modified) patch to encapsulate compression into objects, mainly because it is nice to encapsulate, and isolate a little. 
" * 'calle/extensions-v5' of github.com:scylladb/seastar-dev: extensions: Small unit test sstables: Process extensions on file open sstables::types: Add optional extensions attribute to scylla metadata sstables::disk_types: Add hash and comparator(sstring) to disk_string schema_tables: Load/save extensions table cql: Add schema extensions processing to properties schema_tables: Require context object in schema load path schema_tables: Add opaque context object config_file_impl: Remove ostream operators main/init: Formalize configurables + add extensions to init call db::config: Add extensions as a config sub-object db::extensions: Configuration object to store various extensions cql3::statements::property_definitions: Use std::variant instead of any sstables: Add extension type for wrapping file io schema: Add opaque type to represent extensions sstables::compress/compress: Make compression a virtual object	2018-02-26 17:15:29 +02:00
Calle Wilund	e75d3dc997	extensions: Small unit test Test basic operation of schema and sstable extensions	2018-02-26 10:43:37 +00:00
Jesse Haber-Kucharsky	1cf6dd85fb	tests: Add basic tests for `enum_set` This is motivated by a small addition to `enum_set` and `super_enum` that follows this patch.	2018-02-14 14:15:59 -05:00
Avi Kivity	432268f582	Merge "branch 'remove_atomic_deletion_manager_v2' of github.com:raphaelsc/scylla" from Raphael "The motivation is that it's no longer needed after the new resharding algorithm, which is solely responsible for working with shared sstables; regular compaction will not work with those! So resharding will schedule deletion of shared sstables once it's certain that the shards that own them have the new unshared sstables. The manager was needed for orchestrating deletion of shared sstables across shards. It brings extra complexity that's no longer needed, and it was also overloading shard 0, but the latter could have been fixed. Tests: - unit: release mode - dtest: resharding_test.py" * 'remove_atomic_deletion_manager_v2' of github.com:raphaelsc/scylla: Remove SSTable's atomic deletion manager Stop using SSTable's atomic deletion manager database: split column_family::rebuild_sstable_list	2018-02-08 19:10:16 +02:00
Raphael S. Carvalho	312bd9ce25	Remove SSTable's atomic deletion manager Not used anymore, can be deleted. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-02-07 22:38:45 -02:00
Duarte Nunes	996e47a6f9	test.py: Increase memory for row_cache_stress_test Cells and rows will require more memory when we start caching the cell hash. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-02-01 00:22:51 +00:00
Botond Dénes	71be2e1d0d	test.py: don't fail if test's exit code is not 0 on --help test.py invokes all test executables once with --help to determine whether it needs a -- to separate scylla args or not. For this check it doesn't matter what exit code the test exits with, so don't fail if it's not 0. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <d05be7c3819349e3b22b6249bb83fbf9269d14cb.1517314408.git.bdenes@scylladb.com>	2018-01-30 14:21:01 +02:00


205 Commits