scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-12 19:02:12 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	fb15759934	sstables: reader: Do not read the head of the partition when index can be used read_partition() was always called through read_next_partition(), even if we're at the beginning of the read. read_next_partition() is supposed to skip to the next partition. It still works when we're positioned before a partition, it doesn't advance the consumer, but it clears _index_in_current_partition, because it (correctly) assumes it corresponds to the partition we're about to leave, not the one we're about to enter. This means that index lookups we did in the read initializer will be disregarded when reading starts, and we'll always start by reading partition data from the data file. This is suboptimal for reads which are slicing a large partition and don't need to read the front of the partition. Regression introduced in `4b9a34a854`. The fix is to call read_partition() directly when we're positioned at the beginning of the partition. For that purpose a new flag was introduced. test_no_index_reads_when_rows_fall_into_range_boundaries has to be relaxed, because it assumed that slicing reads will read the head of the partition. Refs #3984 Fixes #3992 Tested using: ./build/release/tests/perf/perf_fast_forward_g \ --sstable-format=mc \ --datasets large-part-ds1 \ --run-tests=large-partition-slicing-clustering-keys Before (focus on aio): offset read time (s) frags frag/s mad f/s max f/s min f/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 4000000 1 0.001378 1 726 5 736 102 6 200 4 2 0 1 1 0 0 0 65.8% After: offset read time (s) frags frag/s mad f/s max f/s min f/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 4000000 1 0.001290 1 775 6 788 716 2 136 2 0 0 1 1 0 0 0 69.1%	2018-12-18 11:11:37 +01:00
Tomasz Grabiec	385a4c23fd	sstables: mc: mutation_fragment_filter: Check the fast-forward window first Otherwise the parser will keep consuming and dropping fragments needlessly, rather than giving the user a chance to consume end-of-stream condition, and maybe skip again. Refs #3984	2018-12-18 11:11:37 +01:00
Tomasz Grabiec	62a1afaac9	sstables: mc: writer: Avoid calling unsigned_vint::serialized_size() Rather than adding serialized_size() to the body size before serializing the field, we can serialize the field to _tmp_bufs at the beginning and have the body size automatically account for it.	2018-12-18 11:11:36 +01:00
Duarte Nunes	1f578be187	Merge 'Fix evictable shard reader related issues' from Botond " Recently some additional issues were discovered related to recent changes to the way inactive readers are evicted and making shard readers evictable. One such issue is that the `querier_cache` is not prepared for the querier to be immediately evicted by the reader concurrency semaphore, when registered with it as an inactive read (#3987). The other issue is that the multishard mutation query code was not fully prepared for evicted shard readers being re-created, or failing while being re-created (#3991). This series fixes both of these issues and adds a unit test which covers the first one. I am working on a unit test which would cover the second issue, but it's proving to be a difficult one and I don't want to delay the fixes for these issues any longer as they also affect 3.0. Fixes: #3987 Fixes: #3991 Tests: unit(release, debug) " * 'evictable-reader-related-issues/v2' of https://github.com/denesb/scylla: multishard_mutation_query: reset failed readers to inexistent state multishard_mutation_query: handle missing readers when dismantling multishard_mutation_query: add support for keeping stats for discarded partitions multishard_mutation_query: expect evicted reader state when creating reader multishard_mutation_query: pretty-print the reader state in log messages querier_cache: check that the query wasn't evicted during registering reader_concurrency_semaphore: use the correct types in the constructor reader_concurrency_semaphore: add consume_resources() reader_concurrency_semaphore::inactive_read_handle: add operator bool()	2018-12-17 15:36:23 +00:00
Botond Dénes	b4c3aab4a7	multishard_mutation_query: reset failed readers to inexistent state When attempting to dismantle readers, some of the to-be-dismantled readers might be in a failed state. The code waiting on the reader to stop is expecting failures, however it didn't do anything besides logging the failure and bumping a counter. Code in the lower layers did not know how to deal with a failed reader and would trigger `std::bad_variant_access` when trying to process (save or cleanup) it. To prevent this, reset the state of failed readers to `inexistent_state` so code in the lower layers doesn't attempt to further process them.	2018-12-17 13:18:08 +02:00
Botond Dénes	9cef043841	multishard_mutation_query: handle missing readers when dismantling When dismantling the combined buffer and the compaction state we are no longer guaranteed to have the reader each partition originated from. The reader might have been evicted and not resumed, or resuming it might have failed. In any case we can no longer assume the originating reader of each partition will be present. If a reader no longer exists, discard the partitions that it emitted.	2018-12-17 13:18:08 +02:00
Botond Dénes	438bef333b	multishard_mutation_query: add support for keeping stats for discarded partitions In the next patches we will add code that will have to discard some of the dismantled partitions/fragments/bytes. Prepare the `dismantle_buffer_stats` struct for being able to track the discarded partitions/fragments/bytes in addition to those that were successfully dismantled.	2018-12-17 13:18:08 +02:00
Botond Dénes	ce52436af4	multishard_mutation_query: expect evicted reader state when creating reader Previously readers were created once, so `make_remote_reader()` had a validation to ensure that no reader was created more than once. This validation was done by checking that the reader-state is either `inexistent` or `successful_lookup`. However, with the introduction of pausing shard readers, it is now possible that a reader will have to be created and then re-created several times, but this validation was not updated to expect this. Update the validation so it also expects the reader-state to be `evicted`, the state the reader will be in if it was evicted while paused.	2018-12-17 13:18:08 +02:00
Botond Dénes	1effb1995b	multishard_mutation_query: pretty-print the reader state in log messages	2018-12-17 13:18:08 +02:00
Botond Dénes	5780f2ce7a	querier_cache: check that the query wasn't evicted during registering The reader concurrency semaphore can evict the querier when it is registered as an inactive read. Make the `querier_cache` aware of this so that it doesn't continue to process the inserted querier when this happens. Also add a unit test for this.	2018-12-17 13:18:08 +02:00
Botond Dénes	e1d8237e6b	reader_concurrency_semaphore: use the correct types in the constructor Previously there was a type mismatch for `count` and `memory`, between the actual type used to store them in the class (signed) and the type of the parameters in the constructor (unsigned). Although negative numbers are completely valid for these members, initializing them to negative numbers doesn't make sense, which is why the constructor used unsigned types. This restriction can backfire however when someone intends to give these parameters the maximum possible value, which, when interpreted as a signed value, will be `-1`. What's worse, the caller might not even be aware of this unsigned->signed conversion and be very surprised when they find out. So to prevent surprises, expose the real type of these members, trusting clients to know what they are doing. Also add a `no_limits` constructor, so clients don't have to make sure they don't overflow internal types.	2018-12-17 13:18:08 +02:00
Botond Dénes	dfd649a6b4	reader_concurrency_semaphore: add consume_resources()	2018-12-17 13:18:08 +02:00
Botond Dénes	21b44adbfe	reader_concurrency_semaphore::inactive_read_handle: add operator bool()	2018-12-17 13:18:08 +02:00
Amnon Heiman	571755e117	node-exporter.service: Update command line to fix service startup The upgrade to node_exporter 0.17 commit `09c2b8b48a` ("node_exporter_install: switch to node_exporter 0.17") caused the service to no longer start. Turns out node_exporter broke backwards compatibility of the command line between 0.15 and 0.16. Fix it up. While fixing the command line, all the collectors that are enabled by default were removed. Fixes #3989 Signed-off-by: Amnon Heiman <amnon@scylladb.com> [ penberg@scylladb.com: edit commit message ] Message-Id: <20181213114831.27216-1-amnon@scylladb.com>	2018-12-17 10:22:17 +02:00
Rafael Ávila de Espíndola	4de14e6143	Add tests on broken mc range tombstones. This tests that we diagnose both two consecutive range starts and two consecutive range ends. Message-Id: <20181214212608.95452-1-espindola@scylladb.com>	2018-12-15 13:53:25 +01:00
Avi Kivity	b023e8b45d	Merge " Extract MC sstable writer to a separate compilation unit" from Tomasz " The motivation is to keep code related to each format separate, to make it easier to comprehend and reduce incremental compilation times. Also reduces dependency on sstable writer code by removing writer bits from sstables.hh. The ka/la format writers are still left in sstables.cc, they could be also extracted. " * 'extract-sstable-writer-code' of github.com:tgrabiec/scylla: sstables: Make variadic write() not picked on substitution error sstables: Extract MC format writer to mc/writer.cc sstables: Extract maybe_add_summary_entry() out of components_writer sstables: Publish functions used by writers in writer.hh sstables: Move common write functions to writer.hh sstables: Extract sstable_writer_impl to a header sstables: Do not include writer.hh from sstables.hh sstables: mc: Extract bound_kind_m related stuff into mc/types.hh sstables: types: Extract sstable_enabled_features::all() sstables: Move components_writer to .cc tests: sstable_datafile_test: Avoid dependency on components_writer	2018-12-14 15:05:00 +02:00
Duarte Nunes	224821303c	Merge 'Reduce the dependency on database.hh' from Botond " Working on database.hh or any header that is included in database.hh (of which there are a lot) is a major pain, as each change involves the recompilation of half of our compilation units. Reduce the impact by removing the `#include "database.hh"` directive from as many header files as possible. Many headers can make do with just some forward declarations and don't need to include the entire headers. I also found some headers that included database.hh without actually needing it. Results Before: $ touch database.hh $ ninja build/release/scylla [1/154] CXX build/release/gen/cql3/CqlParser.o After: $ touch database.hh $ ninja build/release/scylla [1/107] CXX build/release/gen/cql3/CqlParser.o " * 'reduce-dependencies-on-database-hh/v2' of https://github.com/denesb/scylla: treewide: remove include database.hh from headers where possible database_fwd.hh: add keyspace fwd declaration service/client_state: de-inline set_keyspace() Move cache_temperature into its own header	2018-12-14 12:24:48 +00:00
Piotr Sarna	63bd43e57e	cql3: add refusing to create an index on static column Secondary indexes on static columns are not yet supported, so creating such index should return an appropriate error. Fixes #3993 Message-Id: <700b0a71e80da52d2d5250edacc12626b55681fa.1544785127.git.sarna@scylladb.com>	2018-12-14 11:15:28 +00:00
Rafael Ávila de Espíndola	f48d54543f	Use read_rows_flat to test broken sstables. The previous code was using mp_row_consumer_k_l to be as close to the tested code as possible. Given that it is testing for an unhandled exception, there is probably more value in moving it to a higher level, easier to use, API. This patch changes it to use read_rows_flat(). Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20181210235016.41133-1-espindola@scylladb.com>	2018-12-14 10:14:28 +01:00
Botond Dénes	1865e5da41	treewide: remove include database.hh from headers where possible Many headers don't really need to include database.hh, the include can be replaced by forward declarations and/or including the actually needed headers directly. Some headers don't need this include at all. Each header was verified to be compilable on its own after the change, by including it into an empty `.cc` file and compiling it. `.cc` files that used to get `database.hh` through headers that no longer include it were changed to include it themselves.	2018-12-14 08:03:57 +02:00
Botond Dénes	efe2b2c75d	database_fwd.hh: add keyspace fwd declaration	2018-12-14 08:03:57 +02:00
Tomasz Grabiec	245a0d953a	tests: cql_test_env: Start the compaction manager Broken in `fee4d2e` Not doing this results in compaction requests being ignored. One effect of this is that perf_fast_forward produces many sstables instead of one. Refs #3984 Refs #3983 Message-Id: <1544719540-10178-1-git-send-email-tgrabiec@scylladb.com>	2018-12-13 18:58:50 +02:00
Piotr Sarna	6743af5dbd	cql3: refuse to create index on COMPACT STORAGE with ck To follow C* compatibility, creating an index on COMPACT STORAGE table should be disallowed not only on base primary keys, but also when the base table contains clustering keys. Message-Id: <ab40c39730aff2e164d11ee5159ff62b8ec9e8e8.1544698186.git.sarna@scylladb.com>	2018-12-13 13:39:12 +00:00
Duarte Nunes	f8878238ed	service/storage_proxy: Embed the expire timer in the response handler Embedding the expire timer for a write response in the abstract_write_response_handler simplifies the code as it allows removing the rh_entry type. It will also make the timeout easily accessible inside the handler, for future patches. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20181213111818.39983-1-duarte@scylladb.com>	2018-12-13 14:25:21 +02:00
Tomasz Grabiec	3889b05d7e	Merge "Tests and small fixes for composite markers" from Rafael * https://github.com/espindola/scylla espindola/add-composite-tests: Remove newline from exception messages. Fix end marker exception message. Add tests for broken start and end composite markers.	2018-12-13 10:29:44 +01:00
Rafael Ávila de Espíndola	51fd880892	Add tests for broken start and end composite markers.	2018-12-13 10:29:44 +01:00
Rafael Ávila de Espíndola	64439f6477	Fix end marker exception message. The code tested the end marker, but the exception mentioned the start marker. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2018-12-13 10:29:44 +01:00
Rafael Ávila de Espíndola	cfd07185b7	Remove newline from exception messages. They are inconsistent with other uses of malformed_sstable_exception and incompatible with adding " in sstable ..." to the message. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2018-12-13 10:29:44 +01:00
Vlad Zolotarov	7da1ac2c2c	large_partition_handler: fix the message We currently detect large partitions - not rows. So this is what we should be reporting. Fixes #3986 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <20181212215506.9879-1-vladz@scylladb.com>	2018-12-13 00:11:27 +00:00
Rafael Ávila de Espíndola	894f07f912	Move default case out of two switches. These switches are fully covered, having the default label disables -Wswitch. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20181212160904.17341-1-espindola@scylladb.com>	2018-12-12 18:20:24 +01:00
Botond Dénes	10336c13fc	service/client_state: de-inline set_keyspace()	2018-12-12 18:14:03 +02:00
Botond Dénes	76fe4ebc18	Move cache_temperature into its own header Some headers need to include database.hh just because of cache_temperature. Move it into its own header so these includes can be removed.	2018-12-12 16:03:45 +02:00
Tomasz Grabiec	0a853b8866	sstables: index_reader: Avoid schema copy in advance_to() Introduced in `7e15e43`. Exposed by perf_fast_forward: running: large-partition-skips on dataset large-part-ds1 Testing scanning large partition with skips. Reads whole range interleaving reads with skips according to read-skip pattern: read skip time (s) frags frag/s (...) 1 0 5.268780 8000000 1518378 1 1 31.695985 4000000 126199 Message-Id: <1544614272-21970-1-git-send-email-tgrabiec@scylladb.com>	2018-12-12 11:33:46 +00:00
Tomasz Grabiec	ff2ad2f6bb	sstables: Make variadic write() not picked on substitution error If write(v, out, x) doesn't match any overload, the variadic write() will be picked, with Rest = {}. The compiler will print error messages about being unable to find write(v, out), which totally obscures the original cause of the mismatch. Make it picked only when there are at least two write() parameters so that debugging compilation errors is actually possible.	2018-12-12 12:07:31 +01:00
Tomasz Grabiec	a14633c6d0	sstables: Extract MC format writer to mc/writer.cc This moves all MC-related writing code to mc/writer.cc: - m_format_write_helpers.hh is dropped - m_format_write_helpers_impl.hh is dropped - sstable_writer_m is moved out of sstables.cc sstable_writer_m is renamed to sstables::mc::writer	2018-12-12 12:07:31 +01:00
Tomasz Grabiec	2636e6b5ab	sstables: Extract maybe_add_summary_entry() out of components_writer So that it can be used from writer implementations, which don't have access to the definition of the components_writer.	2018-12-12 12:07:31 +01:00
Tomasz Grabiec	577e71478d	sstables: Publish functions used by writers in writer.hh	2018-12-12 12:07:31 +01:00
Tomasz Grabiec	faf0ff1843	sstables: Move common write functions to writer.hh They are common for sstable writers of different formats. Note that writer.hh is supposed to be included only by writer implementations, not writer users.	2018-12-12 12:07:31 +01:00
Tomasz Grabiec	3b4ccc85d0	sstables: Extract sstable_writer_impl to a header	2018-12-12 12:07:31 +01:00
Tomasz Grabiec	6e3c9c3e5e	sstables: Do not include writer.hh from sstables.hh It is only needed by writer implementations.	2018-12-12 12:07:05 +01:00
Tomasz Grabiec	bd7e9ad3ab	sstables: mc: Extract bound_kind_m related stuff into mc/types.hh	2018-12-12 12:06:46 +01:00
Tomasz Grabiec	a4721b4d50	sstables: types: Extract sstable_enabled_features::all()	2018-12-12 12:06:45 +01:00
Tomasz Grabiec	90074d0b75	sstables: Move components_writer to .cc	2018-12-12 12:06:45 +01:00
Tomasz Grabiec	eff47a59ee	tests: sstable_datafile_test: Avoid dependency on components_writer It's LA format specific and it's going to become private to sstable.cc	2018-12-12 12:06:22 +01:00
Avi Kivity	fa96e07e6b	build: pass C compiler configuration in relocatable package build Just like we allow customizing the C++ compiler, we should allow customizing the C compiler. Ref #3978 Message-Id: <20181211172821.30830-1-avi@scylladb.com>	2018-12-12 11:45:13 +01:00
Asias He	71c1681f6c	storage_service: Notify NEW_NODE only when a node is new node This is a backport of CASSANDRA-11038. Before this, a restarted node would be reported as a new node with a NEW_NODE cql notification. To fix, only send the NEW_NODE notification when the node was not part of the cluster. Fixes: #3979 Tests: pushed_notifications_test.py:TestPushedNotifications.restart_node_test Message-Id: <453d750b98b5af510c4637db25b629f07dd90140.1544583244.git.asias@scylladb.com>	2018-12-12 07:33:49 +02:00
Juliana Oliveira	5eb76c9bc6	compress: add support for Cassandra's compression parameter This patch adds compatibility for Cassandra's "chunk_size_in_kb", as well as it keeps Scylla's "chunk_size_kb" compression parameter. Fixes #3669 Tests: unit (release) v2: use variable instead of array v3: fix committed files Signed-off-by: Juliana Oliveira <juliana@scylladb.com> Message-Id: <20181211215840.GA7379@shenzou.localdomain>	2018-12-11 23:33:27 +00:00
Nadav Har'El	a0379209e6	secondary indexes: fail attempts to create a CUSTOM INDEX Cassandra supports a "CREATE CUSTOM INDEX" to create a secondary index with a custom implementation. The only custom implementation that Cassandra supports is SASI. But Scylla doesn't support this, or any other custom index implementation. If a CREATE CUSTOM INDEX statement is used, we shouldn't silently ignore the "CUSTOM" tag, we should generate an error. This patch also includes a regression test that "CREATE CUSTOM INDEX" statements with valid syntax fail (before this patch, they succeeded). Fixes #3977 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20181211224545.18349-2-nyh@scylladb.com>	2018-12-11 23:33:02 +00:00
Nadav Har'El	36db4fba23	Fix typo in error message Interestingly, this typo was copied from the original Cassandra source code :-) Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20181211224545.18349-1-nyh@scylladb.com>	2018-12-11 23:32:58 +00:00
Avi Kivity	5b08e91bdb	tools: add SYS_PTRACE capability to dbuild LeakSanitizer uses ptrace, and docker disables ptrace by default. Add it back so tests pass. Message-Id: <20181208112524.19229-1-avi@scylladb.com>	2018-12-11 19:09:12 +00:00
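
The type mismatch described in commit e1d8237e6b (an unsigned constructor parameter stored in a signed member) can be illustrated with a minimal sketch. The struct names and the `no_limits` tag below are hypothetical stand-ins, not the actual `reader_concurrency_semaphore` interface, and the `-1` result assumes a two's complement platform (guaranteed by the language only since C++20).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in: the member is signed, but the constructor
// parameter is unsigned, so large values silently wrap.
struct budget_unsigned_ctor {
    std::int64_t memory;
    explicit budget_unsigned_ctor(std::uint64_t m)
        : memory(static_cast<std::int64_t>(m)) {}
};

// Hypothetical stand-in for the fixed shape: the constructor exposes the
// real (signed) type, and a tag overload asks for "no limit" explicitly.
struct budget_signed_ctor {
    struct no_limits {};
    std::int64_t memory;
    explicit budget_signed_ctor(std::int64_t m) : memory(m) {}
    explicit budget_signed_ctor(no_limits)
        : memory(std::numeric_limits<std::int64_t>::max()) {}
};

// A caller asking for "as much as possible" through the unsigned
// parameter ends up with a negative budget.
inline void demonstrate_wraparound() {
    budget_unsigned_ctor b(std::numeric_limits<std::uint64_t>::max());
    assert(b.memory == -1);
}
```

With the signed constructor, `budget_signed_ctor(budget_signed_ctor::no_limits{})` yields the largest representable budget without the caller having to reason about overflow.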
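
Commit 894f07f912 moves the default case out of fully covered switches so that `-Wswitch` stays effective. A minimal sketch of that pattern follows; the `marker` enum and `to_string()` are illustrative names, not code from the repository.

```cpp
#include <stdexcept>
#include <string>

// Illustrative enum; not taken from the Scylla sources.
enum class marker { none, start, end };

// No `default:` label inside the switch: if an enumerator is added later
// and not handled here, GCC and Clang can report it through -Wswitch
// (enabled by -Wall). A `default:` label would suppress that warning.
inline std::string to_string(marker m) {
    switch (m) {
    case marker::none:  return "none";
    case marker::start: return "start";
    case marker::end:   return "end";
    }
    // Reached only if `m` holds a value that is not a named enumerator.
    throw std::logic_error("invalid marker value");
}
```

The fallback that used to live under `default:` is kept after the switch, so out-of-range values are still diagnosed at run time.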


17372 Commits