scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 16:33:35 +00:00

Author	SHA1	Message	Date
Avi Kivity	2eaeb3e4eb	Update swagger-ui submodule Updates to version 2.2.10 with a local change (from Amnon) to support our location. Fixes #3942.	2018-11-27 13:01:02 +02:00
Tomasz Grabiec	17a8a9d13d	gdb: Properly parse unique_ptr in 'scylla lsa' There's no _M_t._M_head_impl any more in the standard library. We now have std_unique_ptr wrapper which abstracts this fact away so use that. Message-Id: <20181126174837.11542-1-tgrabiec@scylladb.com>	2018-11-27 12:32:41 +02:00
Tomasz Grabiec	eecda72175	gdb: Adjust 'scylla lsa' for removal of emergency reserve There's no _emergency_reserve any more. Show _free_segments instead. Message-Id: <20181126174837.11542-2-tgrabiec@scylladb.com>	2018-11-27 12:32:37 +02:00
Avi Kivity	5e759b0c07	Merge "Optimize checksum computation for the MC sstable format" from Tomek "
One part of the improvement comes from replacing zlib's CRC32 with the one from libdeflate, which is optimized for modern architecture and utilizes the PCLMUL instruction.
perf_checksum test was introduced to measure performance of various checksumming operations.
Results for 514 B (relevant for writing with compression enabled):
  test                                  iterations  median     mad        min        max
  crc_test.perf_deflate_crc32_combine   58414       16.711us   3.483ns    16.708us   16.725us
  crc_test.perf_adler_combine           165788278   6.059ns    0.031ns    6.027ns    7.519ns
  crc_test.perf_zlib_crc32_combine      59546       16.767us   26.191ns   16.741us   16.801us
  ---
  crc_test.perf_deflate_crc32_checksum  12705072    83.267ns   4.580ns    78.687ns   98.964ns
  crc_test.perf_adler_checksum          3918014     206.701ns  23.469ns   183.231ns  258.859ns
  crc_test.perf_zlib_crc32_checksum     2329682     428.787ns  0.085ns    428.702ns  510.085ns
Results for 64 KB (relevant for writing with compression disabled):
  test                                  iterations  median     mad        min        max
  crc_test.perf_deflate_crc32_combine   25364       38.393us   17.683ns   38.375us   38.545us
  crc_test.perf_adler_combine           169797143   5.842ns    0.009ns    5.833ns    6.901ns
  crc_test.perf_zlib_crc32_combine      26067       38.663us   95.094ns   38.546us   40.523us
  ---
  crc_test.perf_deflate_crc32_checksum  202821      4.937us    14.426ns   4.912us    5.093us
  crc_test.perf_adler_checksum          44684       22.733us   206.263ns  22.492us   25.258us
  crc_test.perf_zlib_crc32_checksum     18839       53.049us   36.117ns   53.013us   53.274us
The new CRC32 implementation (deflate_crc32) doesn't provide a fast checksum_combine() yet, it delegates to zlib so it's as slow as the latter. Because for CRC32 checksum_combine() is several orders of magnitude slower than checksum(), we avoid calling checksum_combine() completely for this checksummer. We still do it for adler32, which has combine() which is faster than checksum().
SStable write performance was evaluated by running:
  perf_fast_forward --populate --data-directory /tmp/perf-mc \
    --rows=10000000 -c1 -m4G --datasets small-part
Below is a summary of the average frag/s for a memtable flush. Each result is an average of about 20 flushes with stddev of about 4k.
Before:
  [1] MC,lz4: 330'903
  [2] LA,lz4: 450'157
  [3] MC,checksum: 419'716
  [4] LA,checksum: 459'559
After:
  [1'] MC,lz4: 446'917 ([1] + 35%)
  [2'] LA,lz4: 456'046 ([2] + 1.3%)
  [3'] MC,checksum: 462'894 ([3] + 10%)
  [4'] LA,checksum: 467'508 ([4] + 1.7%)
After this series, the performance of the MC format writer is similar to that of the LA format before the series. There seems to be a small but consistent improvement for LA too. I'm not sure why.
"
* tag 'improve-mc-sstable-checksum-libdeflate-v3' of github.com:tgrabiec/scylla:
  tests: perf: Introduce perf_checksum
  tests: Add test for libdeflate CRC32 implementation
  sstables: compress: Use libdeflate for crc32
  sstables: compress: Rename crc32_utils to zlib_crc32_checksummer
  licenses: Add libdeflate license
  Integrate libdeflate with the build system
  Add libdeflate submodule
  sstables: Avoid checksum_combine() for the crc32 checksummer
  sstables: compress: Avoid unnecessary checksum_combine()
  sstables: checksum_utils: Add missing include	2018-11-26 20:10:46 +02:00
Tomasz Grabiec	f1a35b654a	tests: perf: Introduce perf_checksum	2018-11-26 18:59:43 +01:00
Tomasz Grabiec	5b6e3fb5ed	tests: Add test for libdeflate CRC32 implementation	2018-11-26 18:59:42 +01:00
Tomasz Grabiec	bf0164cdaf	sstables: compress: Use libdeflate for crc32 Improves memtable flush performance by 10% in a CPU-bound case. Unlike the zlib implementation, libdeflate is optimized for modern CPUs. It utilizes the PCLMUL instruction.	2018-11-26 18:59:42 +01:00
Tomasz Grabiec	0ac1905f4f	sstables: compress: Rename crc32_utils to zlib_crc32_checksummer	2018-11-26 18:59:42 +01:00
Tomasz Grabiec	ba141a4852	licenses: Add libdeflate license	2018-11-26 18:59:41 +01:00
Tomasz Grabiec	048d569b45	Integrate libdeflate with the build system	2018-11-26 18:59:09 +01:00
Tomasz Grabiec	f704f7bc19	Add libdeflate submodule	2018-11-26 18:57:51 +01:00
Tomasz Grabiec	743cf43847	sstables: Avoid checksum_combine() for the crc32 checksummer checksum_combine() is much slower than re-feeding the buffer to checksum() for the zlib CRC32 checksummer. Introduce Checksum::prefer_combine() to determine this and select more optimal behavior for given checksummer. Improves performance of memtable flush with compression enabled by 30%.	2018-11-26 18:57:33 +01:00
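The choice described in this commit can be illustrated with a small, self-contained sketch. It is not the Scylla code: only the names prefer_combine(), checksum() and checksum_combine() come from the commit messages, while `crc32_sketch`, `adler32_sketch`, `init()` and `extend_full_checksum` are hypothetical names, and both checksums are reimplemented naively here instead of calling zlib or libdeflate. The idea is that each checksummer declares whether combining two checksums is cheaper than re-feeding the chunk, and the caller branches on that at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Naive bitwise CRC-32 (reflected polynomial 0xEDB88320). The sketch gives it
// no combine operation at all, so it reports prefer_combine() == false.
struct crc32_sketch {
    static constexpr bool prefer_combine() { return false; }
    static constexpr uint32_t init() { return 0; }
    static uint32_t checksum(uint32_t prev, std::string_view data) {
        uint32_t crc = ~prev;
        for (unsigned char byte : data) {
            crc ^= byte;
            for (int bit = 0; bit < 8; ++bit) {
                crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
            }
        }
        return ~crc;
    }
};

// Naive Adler-32 with an O(1) combine, so prefer_combine() == true.
struct adler32_sketch {
    static constexpr uint32_t base = 65521;
    static constexpr bool prefer_combine() { return true; }
    static constexpr uint32_t init() { return 1; }
    static uint32_t checksum(uint32_t prev, std::string_view data) {
        uint32_t a = prev & 0xffff;
        uint32_t b = prev >> 16;
        for (unsigned char byte : data) {
            a = (a + byte) % base;
            b = (b + a) % base;
        }
        return (b << 16) | a;
    }
    // Checksum of X+Y from checksum(X), checksum(Y) and the length of Y.
    static uint32_t checksum_combine(uint32_t first, uint32_t second, std::size_t second_len) {
        uint64_t a1 = first & 0xffff, b1 = first >> 16;
        uint64_t a2 = second & 0xffff, b2 = second >> 16;
        uint64_t a = (a1 + a2 + base - 1) % base;
        uint64_t b = (b1 + b2 + (second_len % base) * ((a1 + base - 1) % base)) % base;
        return static_cast<uint32_t>((b << 16) | a);
    }
};

// Extend a running full-file checksum with a chunk whose own checksum is
// already known, using whichever route the checksummer says is cheaper.
template <typename Checksummer>
uint32_t extend_full_checksum(uint32_t full, uint32_t chunk_sum, std::string_view chunk) {
    if constexpr (Checksummer::prefer_combine()) {
        uint32_t combined = Checksummer::checksum_combine(full, chunk_sum, chunk.size());
        // Debug-only sanity check: both routes must agree.
        assert(combined == Checksummer::checksum(full, chunk));
        return combined;
    } else {
        (void)chunk_sum; // the per-chunk checksum is not reused on this route
        return Checksummer::checksum(full, chunk);
    }
}
```

With this shape, `extend_full_checksum<adler32_sketch>` never touches the chunk bytes in release builds, while `extend_full_checksum<crc32_sketch>` re-reads them, which mirrors the trade-off the benchmark numbers in the merge commit motivate.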
Avi Kivity	b351a9fee7	db/repair_decision.hh: add missing #include Message-Id: <20181126154948.2453-1-avi@scylladb.com>	2018-11-26 18:49:08 +01:00
Tomasz Grabiec	88cf1c61ba	sstables: compress: Avoid unnecessary checksum_combine()	2018-11-26 14:31:38 +01:00
Tomasz Grabiec	8372cf7bcc	sstables: checksum_utils: Add missing include	2018-11-26 14:31:38 +01:00
Avi Kivity	c6d700279b	class_registry: introduce a non-static variant of class_registry class_registry's staticness has the usual problem of static classes (loss of dependency information) and prevents us from librarifying Scylla since all objects that define a registration must be linked in. Take a first step against this staticness by defining a nonstatic variant. The static class_registry is then redefined in terms of the nonstatic class. After all uses have been converted, the static variant can be retired. Message-Id: <20181126130935.12837-1-avi@scylladb.com>	2018-11-26 13:30:21 +00:00
Paweł Dziepak	62ea153629	Merge "Check for schema mismatch after dropping dead cells" from Piotr " Previously we were checking for schema incompatibility between current schema and sstable serialization header before reading any data. This isn't the best approach because data in sstable may be already irrelevant due to column drop for example. This patchset moves the check after actual data is read and verified that it has a timestamp new enough to classify it as nonobsolete. Fixes #3924 " * 'haaawk/3924/v3' of github.com:scylladb/seastar-dev: sstables: Enable test_schema_change for MC format sstables3: Throw error on schema mismatch only for live cells sstables: Pass column_info to consume_*_column sstables: Add schema_mismatch to column_info sstables: Store column data type in column_info sstables: Remove code duplication in column_translation	2018-11-26 13:10:18 +00:00
Avi Kivity	9a46ee69d4	doc: fix BYPASS CACHE documentation BYPASS CACHE was mistakenly documenting an earlier version of the patch. Correct it to document the committed version. Message-Id: <20181126125810.9344-1-avi@scylladb.com>	2018-11-26 13:04:52 +00:00
Piotr Jastrzebski	5c86294a56	sstables: Enable test_schema_change for MC format Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-11-26 13:25:23 +01:00
Piotr Jastrzebski	4bdb86c712	sstables3: Throw error on schema mismatch only for live cells Previously we were throwing exception during the creation of column_translation. This wasn't always correct because sometimes column for which the mismatch appeared was already dropped and data present in sstable should be ignored anyway. Fixes #3924 Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-11-26 13:25:10 +01:00
Piotr Sarna	6ab8235369	main: fix deinitialization order for view update generator View update generator should be stopped only after drain_on_shutdown() is performed on storage service. Message-Id: <4d2bda4c73422a2ebf46d6dcd06c95d960839889.1543230849.git.sarna@scylladb.com>	2018-11-26 11:21:37 +00:00
Duarte Nunes	2a371c2689	Merge 'Allow bypassing cache on a per-query basis' from Avi " Some queries are very unlikely to hit cache. Usually this includes range queries on large tables, but other patterns are possible. While the database should adapt to the query pattern, sometimes the user has information the database does not have. By passing this information along, the user helps the database manage its resources more optimally. To do this, this patch introduces a BYPASS CACHE clause to the SELECT statement. A query thus marked will not attempt to read from the cache, and instead will read from sstables and memtables only. This reduces CPU time spent to query and populate the cache, and will prevent the cache from being flooded with data that is not likely to be read again soon. The existing cache disabled path is engaged when the option is selected. Tests: unit (release), manual metrics verification with ccm with and without the BYPASS CACHE clause. Ref #3770. " * tag 'cache-bypass/v2' of https://github.com/avikivity/scylla: doc: document SELECT ... BYPASS CACHE tests: add test for SELECT ... BYPASS CACHE cql: add SELECT ... BYPASS CACHE clause db: add query option to bypass cache	2018-11-26 09:59:40 +00:00
Paweł Dziepak	13385778fd	Merge "Measure performance of dataset population in perf_fast_forward" from Tomasz * tag 'perf-ffwd-dataset-population-v2' of github.com:tgrabiec/scylla: tests: perf_fast_forward: Measure performance of dataset population tests: perf_fast_forward: Record the dataset on which test case was run tests: perf_fast_forward: Introduce the concept of a dataset tests: perf_fast_forward: Introduce make_compaction_disabling_guard() tests: perf_fast_forward: Initialize output manager before population tests: perf_fast_forward: Handle empty test parameter set tests: perf_fast_forward: Extract json_output_writer::write_common_test_group() tests: perf_fast_forward: Factor out access to cfg to a single place per function tests: perf_fast_forward: Extract result_collector tests: perf_fast_forward: Take writes into account in AIO statistics tests: perf_fast_forward: Reorder members tests: perf_fast_forward: Add --sstable-format command line option	2018-11-26 09:45:55 +00:00
Avi Kivity	58033ad3a4	doc: document SELECT ... BYPASS CACHE Add a new cql-extensions.md file and document BYPASS CACHE there.	2018-11-26 11:37:52 +02:00
Avi Kivity	f69401c609	tests: add test for SELECT ... BYPASS CACHE The test verifies that cache read metrics are not incremented during a cache bypass read.	2018-11-26 11:37:52 +02:00
Avi Kivity	ecf3f92ec7	cql: add SELECT ... BYPASS CACHE clause The BYPASS CACHE clause instructs the database not to read from or populate the cache for this query. The new keywords (BYPASS and CACHE) are not reserved.	2018-11-26 11:37:49 +02:00
Takuya ASADA	7740cd2142	dist/common/systemd/scylla-housekeeping-restart.service.mustache: specify correct repo for Debian variants We do specify the correct repo for both Red Hat/Debian variants on -daily, but mistakenly don't for -restart, so do the same on -restart. Fixes #3906 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20181109224509.27380-1-syuu@scylladb.com>	2018-11-26 11:02:25 +02:00
Rafael Ávila de Espíndola	6746907999	Use fully covered switches in continuous_data_consumer do_process_buffer had two unreachable default cases and a long if-else-if chain. This converts the if-else-if chain to a switch and a helper function. This moves the error checking from run time to compile time. If we were to add a 128 bit integer for example, gcc would complain about it missing from the switch. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20181125221451.106067-1-espindola@scylladb.com>	2018-11-25 22:52:11 +00:00
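A minimal sketch of the fully covered switch pattern this commit relies on. The enum `read_state` and the helper `bytes_needed` are hypothetical names, not taken from continuous_data_consumer. Because the switch has no default case, building with gcc's -Wswitch and -Werror turns a newly added enumerator that lacks a case (say, a 128-bit read) into a compile error instead of a run-time surprise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical state enum; each enumerator names the integer width being read.
enum class read_state { read_8, read_16, read_32, read_64 };

// No default case on purpose: with -Wswitch -Werror, adding an enumerator
// without adding a matching case here fails the build.
inline std::size_t bytes_needed(read_state state) {
    switch (state) {
    case read_state::read_8:
        return 1;
    case read_state::read_16:
        return 2;
    case read_state::read_32:
        return 4;
    case read_state::read_64:
        return 8;
    }
    // Only reachable if the value is not a valid enumerator. The assert
    // documents the invariant in debug builds; abort keeps release builds
    // from falling off the end of a non-void function.
    assert(false && "invalid read_state value");
    std::abort();
}
```

A default case would silence -Wswitch, which is why the sketch handles the impossible value after the switch instead of inside it.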
Avi Kivity	b4765af790	Merge "Introduce SSTable-run-based compaction" from Raphael "
This new compaction approach consists of releasing exhausted fragments[1] of a run[2] as compaction proceeds, so decreasing considerably the space requirement. These changes will immediately benefit leveled strategy because it already works with the run concept.
[1] fragment is a sstable composing a run; exhausted means sstable was fully consumed by compaction procedure.
[2] run is a set of non-overlapping sstables which roughly span the entire token range.
Note: Last patch includes an example compaction strategy showing how to work with the interface.
unit tests: all modes passing
dtests: compaction ones passing
"
* 'sstable_run_based_compaction_v10' of github.com:raphaelsc/scylla:
  tests: add example compaction strategy for sstable run based approach
  sstables/compaction: propagate sstable replacement to all compaction of a CF
  sstables: store cf pointer in compaction_info
  tests/sstable_test: add test for compaction replacement of exhausted sstable
  sstables: add sstable's on closed handling
  tests/sstables: add test for sstable run based compaction
  sstables/compaction_manager: prevent partial run from being selected for compaction
  compaction: use same run identifier for sstables generated by same compaction
  sstables: introduce sstable run
  sstables/compaction_manager: release reference to exhausted sstable through callback
  sstables/compaction: stop tracking exhausted input sstable in compaction_read_monitor
  database: do not keep reference to sstable in selector when done selecting
  compaction: share sstable set with incremental reader selector
  sstables/compaction: release space earlier of exhausted input sstables
  sstables: make partitioned sstable set's incremental selector resilient to changes in the set
  database: do not store reference to sstable in incremental selector
  tests/sstables: add run identifier correctness test
  sstables: use a random uuid for sstables without run identifier
  sstables: add run identifier to scylla metadata	2018-11-25 17:20:24 +02:00
Avi Kivity	b835b93ee6	db: add query option to bypass cache With the option enabled, we bypass the cache unconditionally and only read from memtables+sstables. This is useful for analytics queries.	2018-11-25 16:26:08 +02:00
Raphael S. Carvalho	3fa70d6b5f	tests: add example compaction strategy for sstable run based approach Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 20:16:54 -02:00
Raphael S. Carvalho	2058001f94	sstables/compaction: propagate sstable replacement to all compaction of a CF This is needed for parallel compaction to work with sstable run based approach. That's because regular compaction clones a set containing all sstables of its column family. So compaction A can potentially hold a reference to a compacting sstable of compaction B, so preventing compaction B from releasing its exhausted sstable. So all replacements are propagated to all compactions of a given column family, and compactions in turn, including the one which initiated the propagation, will do the replacement. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:30 -02:00
Raphael S. Carvalho	953fdcc867	sstables: store cf pointer in compaction_info motivation is that we need a more efficient way to find compactions that belong to a given column family in compaction list. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:28 -02:00
Raphael S. Carvalho	baf89f0df3	tests/sstable_test: add test for compaction replacement of exhausted sstable Make sure that compaction is capable of releasing exhausted sstable space early in the procedure. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:26 -02:00
Raphael S. Carvalho	824c20b76d	sstables: add sstable's on closed handling Motivation is that it will be useful for catching regression on compaction when releasing early exhausted sstables. That's because sstable's space is only released once it's closed. So this will allow us to write a test case and possibly use it for entities holding exhausted sstable. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:25 -02:00
Raphael S. Carvalho	0085e8371d	tests/sstables: add test for sstable run based compaction Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:23 -02:00
Raphael S. Carvalho	e88d1d54b9	sstables/compaction_manager: prevent partial run from being selected for compaction Filter out sstable belonging to a partial run being generated by an ongoing compaction. Otherwise, that could lead to wrong decisions by the compaction strategy. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:22 -02:00
Raphael S. Carvalho	23884fe9f6	compaction: use same run identifier for sstables generated by same compaction SSTables composing the same run will share the same run identifier. Therefore, a new compaction strategy will be able to get all sstables belonging to the same run from sstable_set, which now keeps track of existing runs. Same UUID is passed to writers of a given compaction. Otherwise, a new UUID is picked for every sstable created by compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:20 -02:00
Raphael S. Carvalho	4f68cb34a6	sstables: introduce sstable run sstable run is a structure that will hold all sstables that have the same run identifier. All sstables belonging to the same run will not overlap with one another. It can be used by compaction strategy to work on runs instead of individual sstables. sstable_set structure which holds all sstables for a given column family will be responsible for providing to its user an interface to work with runs instead of individual sstables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:18 -02:00
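The run concept described here can be pictured with a small sketch. None of these names come from the Scylla sources: `sstable_desc`, `run_map`, `group_by_run` and `is_disjoint_run` are hypothetical, the run identifier is a plain string rather than a UUID type, and tokens are simplified to signed integers. The sketch groups sstable descriptors by run identifier and checks the stated invariant that members of one run do not overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified description of an sstable.
struct sstable_desc {
    uint64_t generation;
    std::string run_id;   // stands in for the run UUID
    int64_t first_token;  // inclusive
    int64_t last_token;   // inclusive
};

using run_map = std::map<std::string, std::vector<sstable_desc>>;

// Group sstables by run identifier; members of each run are sorted by
// their first token so neighbours can be compared directly.
inline run_map group_by_run(const std::vector<sstable_desc>& sstables) {
    run_map runs;
    for (const auto& sst : sstables) {
        assert(sst.first_token <= sst.last_token);
        runs[sst.run_id].push_back(sst);
    }
    for (auto& entry : runs) {
        std::sort(entry.second.begin(), entry.second.end(),
                  [](const sstable_desc& a, const sstable_desc& b) {
                      return a.first_token < b.first_token;
                  });
    }
    return runs;
}

// True if no two members of a (sorted) run share a token.
inline bool is_disjoint_run(const std::vector<sstable_desc>& sorted_members) {
    for (std::size_t i = 1; i < sorted_members.size(); ++i) {
        if (sorted_members[i].first_token <= sorted_members[i - 1].last_token) {
            return false;
        }
    }
    return true;
}
```

A compaction strategy built on such an interface would iterate over the map's values (whole runs) rather than over individual sstables.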
Raphael S. Carvalho	fc92fb955d	sstables/compaction_manager: release reference to exhausted sstable through callback That's important for the reference to sstable to not be kept throughout the compaction procedure, which would break the goal of releasing space during compaction. Manager passes a callback to compaction which calls it whenever there's sstable replacement. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:16 -02:00
Raphael S. Carvalho	3f309ebba9	sstables/compaction: stop tracking exhausted input sstable in compaction_read_monitor Motivation is that we want to release space for exhausted sstable and that will only happen when all references to it are gone and that backlog tracker takes the early replacement into account. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:13 -02:00
Raphael S. Carvalho	3433de3dc0	database: do not keep reference to sstable in selector when done selecting When compacting, we'll create all readers at once and will not select again from incremental selector, meaning the selector will keep all respective sstables in current_sstables, preventing compaction from releasing space as it goes on. The change is about refreshing sstable set's selector such that it will not hold a reference to an exhausted sstable whatsoever. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:12 -02:00
Raphael S. Carvalho	f6df949c1a	compaction: share sstable set with incremental reader selector By doing that, we'll be able to release exhausted sstable from both simultaneously. That's achieved by sharing set containing input sstables with the incremental reader selector and removing exhausted sstables from shared set when the time has come. Step towards reducing disk requirement for compaction by making it delete an sstable whose data is all in a sealed new sstable. For that to happen, all references must be gone. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:10 -02:00
Raphael S. Carvalho	e5a0b05c15	sstables/compaction: release space earlier of exhausted input sstables Currently, compaction only replaces input sstables at end of compaction, meaning compaction must be finished for all the space of those sstables to be released. What we can do instead is to delete earlier some input sstable under some conditions: 1) SStable data should be committed to a new, sealed output sstable, meaning it's exhausted. 2) Exhausted sstable mustn't overlap with a non-exhausted sstable because a tombstone in the exhausted sstable could have been purged and the shadowed data in the non-exhausted one could be resurrected if system crashes. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:07 -02:00
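The two conditions listed in this commit can be expressed as a short predicate. This is a sketch under assumptions, not the compaction code: `input_sstable`, `overlaps` and `releasable_generations` are hypothetical names, tokens are simplified to integers, and the `exhausted` flag is assumed to already mean "all data committed to a sealed output sstable" (condition 1). The function then applies condition 2 by refusing to release an exhausted input that overlaps any input still being read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical view of a compaction input.
struct input_sstable {
    uint64_t generation;
    int64_t first_token;  // inclusive
    int64_t last_token;   // inclusive
    bool exhausted;       // condition 1: fully written to a sealed output
};

// Closed token ranges overlap if each starts before the other ends.
inline bool overlaps(const input_sstable& a, const input_sstable& b) {
    assert(a.first_token <= a.last_token && b.first_token <= b.last_token);
    return a.first_token <= b.last_token && b.first_token <= a.last_token;
}

// Generations of inputs that may be deleted before compaction finishes:
// exhausted, and not overlapping any non-exhausted input (condition 2).
inline std::vector<uint64_t> releasable_generations(const std::vector<input_sstable>& inputs) {
    std::vector<uint64_t> releasable;
    for (const auto& candidate : inputs) {
        if (!candidate.exhausted) {
            continue;
        }
        bool blocked = std::any_of(inputs.begin(), inputs.end(),
                                   [&](const input_sstable& other) {
                                       return !other.exhausted && overlaps(candidate, other);
                                   });
        if (!blocked) {
            releasable.push_back(candidate.generation);
        }
    }
    return releasable;
}
```

In the usage below the first input is exhausted but still overlaps a live one, so only the third is safe to drop early; that is the tombstone-resurrection hazard the commit message warns about.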
Raphael S. Carvalho	ace070c8fc	sstables: make partitioned sstable set's incremental selector resilient to changes in the set The motivation is that compaction may remove a sstable from the set while the incremental selector is alive, and for that to work, we need to invalidate the iterators stored by the selector. We could have added a method to notify it, but there will be a case where the one keeping the set cannot forward the notification to the selector. So it's better for the selector to take care of itself. Change counter approach is used which allows the selector to know when to invalidate the iterators. After invalidation, selector will move the iterator back into its right place by looking for lower bound for current pos. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:05 -02:00
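The change counter approach can be sketched as follows. This is an illustration only: `tracked_set` and `incremental_selector` here are hypothetical toy types over integer keys, not the partitioned sstable set, and the sketch repositions with upper_bound on the last key handed out, which plays the role of the "lower bound for current pos" lookup mentioned in the commit. Every mutation of the set bumps a counter; the selector compares the counter it last saw with the current one and, on mismatch, discards its stored iterator and finds its place again.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <set>

// Hypothetical stand-in for the sstable set: every mutation bumps a counter.
struct tracked_set {
    std::set<int> keys;
    uint64_t change_count = 0;

    void insert(int key) { keys.insert(key); ++change_count; }
    void erase(int key) { keys.erase(key); ++change_count; }
};

// Walks the set in key order and survives concurrent insertions/removals.
class incremental_selector {
    const tracked_set& _set;
    std::set<int>::const_iterator _next;
    uint64_t _seen_changes;
    std::optional<int> _last_returned;
public:
    explicit incremental_selector(const tracked_set& set)
        : _set(set), _next(set.keys.begin()), _seen_changes(set.change_count) {}

    std::optional<int> next() {
        if (_seen_changes != _set.change_count) {
            // The set changed since the last call: treat the stored iterator
            // as invalid and reposition from the last key handed out.
            _next = _last_returned ? _set.keys.upper_bound(*_last_returned)
                                   : _set.keys.begin();
            _seen_changes = _set.change_count;
        }
        // From here on the iterator is known to match the current set.
        assert(_seen_changes == _set.change_count);
        if (_next == _set.keys.end()) {
            return std::nullopt;
        }
        _last_returned = *_next;
        ++_next;
        return _last_returned;
    }
};
```

The selector needs no notification from whoever mutates the set, which matches the commit's point that the owner of the set may be unable to forward one.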
Raphael S. Carvalho	8d11b0bbb4	database: do not store reference to sstable in incremental selector Use sstable generation instead to keep track of read sstables. The motivation is that we'll not keep reference to sstables, so allowing their space on disk to be released as soon they get exhausted. Generation is used because it guarantees uniqueness of the sstable. Reviewed-by: Botond Dénes <bdenes@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:04 -02:00
Raphael S. Carvalho	edc87014c1	tests/sstables: add run identifier correctness test Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:02 -02:00
Raphael S. Carvalho	a66b1954cc	sstables: use a random uuid for sstables without run identifier Older sstables must have an identifier for them to be associated with their own run. Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:01 -02:00
Raphael S. Carvalho	62025fa52c	sstables: add run identifier to scylla metadata It identifies a run which a particular sstable belongs to. Existing sstables will have a random uuid associated with it in memory. UUID is the correct choice because it allows sstables to be exported without having conflicts when using identifier generated by different nodes. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:52:44 -02:00
Rafael Ávila de Espíndola	d18bbe9d45	Remove unreachable default cases. These switches are fully covered. We can be sure they will stay this way because of -Werror and gcc's -Wswitch warning. We can also be sure that we never have an invalid enum value since the state machine values are not read from disk. The patch also removes a superfluous ';'. Message-Id: <20181124020128.111083-1-espindola@scylladb.com>	2018-11-24 09:31:51 +00:00

17049 Commits