scylladb

Author	SHA1	Message	Date
Duarte Nunes	3235c13125	utils/fragmented_temporary_buffer: Correctly implement remove_suffix() The current implementation breaks the invariant that _size_bytes = reduce(_fragments, &temporary_buffer::size) In particular, this breaks algorithms that check the individual segment size. Correctly implement remove_suffix() by destroying superfluous temporary_buffer's and by trimming the last one, if needed. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20190103133523.34937-1-duarte@scylladb.com>	2019-01-03 13:37:01 +00:00
Botond Dénes	021feef513	querier_cache: simplify memory eviction use-after-free fix, add tests Simplify the fix for memory based eviction, introduced by `918d255` so there is no need to massage the counters. Also add a check to `test_memory_based_cache_eviction` which checks for the bug fixed. While at it also add a check to `test_time_based_cache_eviction` for the fix to time based eviction (`e5a0ea3`). Tests: tests/querier_cache:debug Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <c89e2788a88c2a701a2c39f377328e77ac01e3ef.1546515465.git.bdenes@scylladb.com>	2019-01-03 13:44:08 +02:00
Rafael Ávila de Espíndola	28c014351f	Fix crash on corrupt sstable The check in consume_range_tombstone was too late. Before getting to it we would fail an assert calling to_bound_kind. This moves the check earlier and adds a testcase. Tests: unit (release) Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-01-02 17:52:07 -08:00
Avi Kivity	d19660ec0a	Merge "commitlog: Use fragmented buffers for reading entries" from Duarte " Instead of allocating a contiguous temporary_buffer when reading mutations from the commitlog - or hint - replaying, use fragemnted buffers instead. Refs #4020 " * 'commitlog/fragmented-read/v1' of https://github.com/duarten/scylla: db/commitlog: Use fragmented buffers to read entries db/commitlog: Implement skip in terms of input buffer skipping tests/fragmented_temporary_buffer_test: Add unit test for remove_suffix() utils/fragmented_temporary_buffer: Add remove_suffix tests/fragmented_temporary_buffer_test: Add unit test for skip() utils/fragmented_temporary_buffer: Allow skipping in the input stream	2019-01-01 19:08:34 +02:00
Duarte Nunes	b7517183fa	db/commitlog: Use fragmented buffers to read entries Leverage fragmented_temporary_buffer when reading commit log entries, avoiding large allocations. Refs #4020 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-31 13:20:37 +00:00
Duarte Nunes	8379ac6189	tests/fragmented_temporary_buffer_test: Add unit test for remove_suffix() Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-31 13:20:37 +00:00
Duarte Nunes	50dd8b67b2	tests/fragmented_temporary_buffer_test: Add unit test for skip() Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-31 13:20:37 +00:00
Avi Kivity	c180a18dbb	Distribute distributed_loader into its own header and source files distributed_loader is a sizeable fraction of database.cc, so moving it out reduces compile time and improves readability. Message-Id: <20181230200926.15074-1-avi@scylladb.com>	2018-12-31 14:27:27 +02:00
Avi Kivity	7830086317	client_state: change set_keyspace() to accept a single database shard set_keyspace() only needs one shard (it is checking replicated state, not sharded data) so arrange for it to receive only that one shard.	2018-12-29 10:58:39 +02:00
Tomasz Grabiec	7747f2dde3	Merge "nodetool toppartitions" from Rafi & Avi Implementation of nodetool toppartiotion query, which samples most frequest PKs in read/write operation over a period of time. Content: - data_listener classes: mechanism that interfaces with mutation readers in database and table classes, - toppartition_query and toppartition_data_listener classes to implement toppartition-specific query (this interfaces with data_listeners and the REST api), - REST api for toppartitions query. Uses Top-k structure for handling stream summary statistics (based on implementation in C, see #2811). What's still missing: - JMX interface to nodetool (interface customization may be required), - Querying #rows and #bytes (currently, only #partitions is supported). Fixes #2811 https://github.com/avikivity/scylla rafie_toppartitions_v7.1: top_k: whitespace and minor fixes top_k: map template arguments top_k: std::list -> chunked_vector top_k: support for appending top_k results nodetool toppartitions: refactor table::config constructor nodetool toppartitions: data listeners nodetool toppartitions: add data_listeners to database/table nodetool toppartitions: fully_qualified_cf_name nodetool toppartitions: Toppartitions query implementation nodetool toppartitions: Toppartitions query REST API nodetool toppartitions: nodetool-toppartitions script	2018-12-28 16:31:24 +01:00
Rafi Einstein	0bffe5f83e	nodetool toppartitions: add data_listeners to database/table Add data_listeners member to database. Adds data_listeners* to table::config, to be used by table methods to invoke listeners. Install on_read() listener in table::make_reader(). Install on_write() listener in database::apply_in_memory(). Tests: Unit (release) Signed-off-by: Rafi Einstein <rafie@scylladb.com>	2018-12-28 16:45:57 +02:00
Rafi Einstein	aeebe8e86b	top_k: std::list -> chunked_vector Replaced std::list with chunked_vector. Because chunked_vector requires a noexcept move constructor from its value type, change the bad_boy type in the unit test not to throw in the move constructor. Signed-off-by: Rafi Einstein <rafie@scylladb.com>	2018-12-28 16:45:07 +02:00
Avi Kivity	8e2f6d0513	Merge "Fix use-after-free when destroying partition_snapshots in the background"from Tomasz " partition_snapshots created in the memtable will keep a reference to the memtable (as region) and to memtable::_cleaner. As long as the reader is alive, the memtable will be kept alive by partition_snapshot_flat_reader::_container_guard. But after that nothing prevents it from being destroyed. The snapshot can outlive the read if mutation_cleaner::merge_and_destroy() defers its destruction for later. When the read ends after memtable was flushed, the snapshot will be queued in the cache's cleaner, but internally will reference memtable's region and cleaner. This will result in a use-after-free when the snapshot resumes destruction. The fix is to update snapshots's region and cleaner references at the time of queueing to point to the cache's region and cleaner. When memtable is destroyed without being moved to cache there is no problem because the snapshot would be queued into memtable's cleaner, which will be drained on destruction from all snapshots. Introduced in `f3da043` (in >= 3.0-rc1) Fixes #4030. Tests: - mvcc_test (debug) " tag 'fix-snapshot-merging-use-after-free-v1.1' of github.com:tgrabiec/scylla: tests: mvcc: Add test_snapshot_merging_after_container_is_destroyed tests: mvcc: Introduce mvcc_container::migrate() tests: mvcc: Make mvcc_partition move-constructible tests: mvcc: Introduce mvcc_container::make_not_evictable() tests: mvcc: Allow constructing mvcc_container without a cache_tracker mutation_cleaner: Migrate partition_snapshots when queueing for background cleanup mvcc: partition_snapshot: Introduce migrate() mutation_cleaner: impl: Store a back-reference to the owning mutation_cleaner	2018-12-28 12:45:10 +02:00
Tomasz Grabiec	bb1c9cb6f3	tests: mvcc: Add test_snapshot_merging_after_container_is_destroyed	2018-12-28 10:32:39 +01:00
Tomasz Grabiec	4d13dea39a	tests: mvcc: Introduce mvcc_container::migrate()	2018-12-28 10:32:39 +01:00
Tomasz Grabiec	676868ed31	tests: mvcc: Make mvcc_partition move-constructible	2018-12-28 10:32:39 +01:00
Tomasz Grabiec	c6798f7872	tests: mvcc: Introduce mvcc_container::make_not_evictable()	2018-12-28 10:32:39 +01:00
Tomasz Grabiec	1fa00656ea	tests: mvcc: Allow constructing mvcc_container without a cache_tracker Some test cases will need many containers to simulate memtable -> cache transitions, but there can be only one cache_tracker per shard due to metrics. Allow constructing a conatiner without a cache_tracker (and thus non-evictable).	2018-12-28 10:32:39 +01:00
Benny Halevy	f104951928	sstable_test: read_file should open the file read-only Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20181218145156.12716-1-bhalevy@scylladb.com>	2018-12-25 12:02:46 +02:00
Rafael Ávila de Espíndola	f8c81d4d89	tests: sstables: mc: add tests with incompatible schemas In one test the types in the schema don't match the types in the statistics file. In another a column is missing. The patch also updates the exceptions to have more human readable messages. Tests: unit (release) Part of issue #3960. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20181219233046.74229-1-espindola@scylladb.com>	2018-12-25 11:11:54 +02:00
Rafi Einstein	75f21954d4	top_k: whitespace and minor fixes Style and minor logic changes from code review. Signed-off-by: Rafi Einstein <rafie@scylladb.com>	2018-12-20 16:41:33 +02:00
Tomasz Grabiec	2b55ab8c8e	Merge "Add more extensive test for mutation reader fast-forwarding" from Paweł Mutation readers allow fast-forwarding the ranges from which the data is being read. The main user of this feature is cache which, when reading from the underlying reader, may want to skip some data it already has. Unsurprisingly, this adds more complexity to the implementation of the readers and more edge cases the developers need to take care of. While most of the readers were at least to some extent checked in this area those test usually were quite isolated (e.g. one test doing inter-partition fast-forwarding, another doing intra-partition fast-forwarding) and as a consequence didn't cover many corner cases. This patch adds a generic test for fast-forwarding and slicing that covers more complicated scenarios when those operations are combined. Needless to say that did uncover some problems, but fortunately none of them is user-visible. Fixes #3963. Fixes #3997. Tests: unit(release, debug) * https://github.com/pdziepak/scylla.git test-fast-forwarding/v4.1: tests/flat_mutation_reader_assertions: accumulate received tombstones tests/flat_mutation_reader_assertions: add more test messages tests/flat_mutation_reader_assertions: relax has_monotonic_positions() check tests/mutation_readers: do not ignore streamed_mutation::forwarding Revert "mutation_source_test: add option to skip intra-partition fast-forwarding tests" memtable: it is not a single partition read if partition fast-forwaring is enabled sstables: add more tracing in mp_row_consumer_m row_cache: use make_forwardable() to implement streamed_mutation::forwarding row_cache: read is not single-partition if inter-partition forwarding is enabled row_cache: drop support for streamed_mutation::forwarding::yes entirely sstables/mp_row_consumer: position_range end bound is exclusive mutation_fragment_filter: handle streamed_mutation::forwarding::yes properly tests/mutation_reader: reduce sleeping time tests/memtable: fix partition_range use-after-free tests/mutation: fix partition range use-after-free flat_mutation_reader_from_mutations: add overload that accepts a slice and partition range flat_mutation_reader_from_mutations: fix empty range case flat_mutation_reader_from_mutations: destroy all remaining mutations tests/mutation_source: drop dropped column handling test tests/mutation_source: add test for complex fast_forwarding and slicing	2018-12-20 15:05:21 +01:00
Paweł Dziepak	3355d16938	tests/mutation_source: add test for complex fast_forwarding and slicing While we already had tests that verified inter- and intra-partition fast-forwarding as well as slicing, they had quite limited scope and didn't combine those operations. The new test is meant to extensively test these cases.	2018-12-20 13:27:25 +00:00
Paweł Dziepak	26a30375b1	tests/mutation_source: drop dropped column handling test Schema changes are now covered by for_each_schema_change() function. Having some additional tests in run_mutation_source_tests() is problematic when it is used to test intermediate mutation readers because schema changes may be irrelevant for them, which makes the test a waste of time (might be a problem in debug mode) and requires those intermediate reader to use more complex underlying reader that supports schema changes (again, problem in a very slow debug mode).	2018-12-20 13:27:25 +00:00
Paweł Dziepak	93488209de	tests/mutation: fix partition range use-after-free	2018-12-20 13:27:25 +00:00
Paweł Dziepak	e91165d929	tests/memtable: fix partition_range use-after-free	2018-12-20 13:27:25 +00:00
Paweł Dziepak	5db8dacd1f	tests/mutation_reader: reduce sleeping time It is a very bad taste to sleep anywhere in the code. The test should be fixed to explicitly test various orderings between concurrent operations, but before that happens let's at least readuce how much those sleeps slow it down by changing it from milliseconds to microseconds.	2018-12-20 13:27:25 +00:00
Paweł Dziepak	bcb5aed1ef	Revert "mutation_source_test: add option to skip intra-partition fast-forwarding tests" This reverts commit `b36733971b`. That commit made run_mutation_reader_tests() support mutation_sources that do not implement streamed_mutation::forwarding::yes. This is wrong since mutation_sources are not allowed to ignore or otherwise not support that mode. Moreover, there is absolutely no reason for them to do so since there is a make_forwardable() adapter that can make any mutation_reader a forwardable one (at the cost of performance, but that's not always important).	2018-12-20 13:27:25 +00:00
Paweł Dziepak	8706750b9b	tests/mutation_readers: do not ignore streamed_mutation::forwarding It is wrong to silently ignore streamed_mutation::forwarding option which completely changes how the reader is supposed to operate. The best solution is to use make_forwardable() adapter which changes non-forwardable reader to a forwardable one.	2018-12-20 13:27:25 +00:00
Paweł Dziepak	edf2c71701	tests/flat_mutation_reader_assertions: relax has_monotonic_positions() check Since `41ede08a1d` "mutation_reader: Allow range tombstones with same position in the fragment stream" mutation readers emit fragments in non-decreasing order (as opposed to strictly increasing), has_monotonic_posiitons() needs to be updated to allow that.	2018-12-20 13:27:25 +00:00
Paweł Dziepak	787d1ba7b2	tests/flat_mutation_reader_assertions: add more test messages	2018-12-20 13:27:25 +00:00
Paweł Dziepak	593fb936c2	tests/flat_mutation_reader_assertions: accumulate received tombstones Current data model employed by mutation readers doesn't have an unique representation of range tombstones. This complicates testing by making multiple ways of emitting range tombstones and rows equally valid. This patch adds an option to verify mutation readers by checking whether tombstones they emit properly affect the clustered rows regardless of how exactly the tombstones are emitted. The interface of flat_mutation_reader_assertions is extended by adding may_produce_tombstones() that accepts any number of tombstones and accumulates them. Then, produces_row_with_key() accepts an additional argument which is the expected timestamp of the range tombstone that affects that row.	2018-12-20 13:27:25 +00:00
Paweł Dziepak	e6d26a528f	Merge "Optimize slicing sstable readers" from Tomasz " Contains several improvements for fast-forwarding and slicing readers. Mainly for the MC format, but not only: - Exiting the parser early when going out of the fast-forwarding window [MC-format-only] - Avoiding reading of the head of the partition when slicing - Avoiding parsing rows which are going to be skipped [MC-format-only] " * 'sstable-mc-optimize-slicing-reads' of github.com:tgrabiec/scylla: sstables: mc: reader: Skip ignored rows before parsing them sstables: mc: reader: Call _cells.clear() when row ends rather than when it starts sstables: mc: mutation_fragment_filter: Take position_in_partition rather than a clustering_row sstables: mc: reader: Do not call consume_row_marker_and_tombstone() for static rows sstables: mc: parser: Allow the consumer to skip the whole row sstables: continuous_data_consumer: Introduce skip() sstables: continuous_data_consumer: Make position() meaningful inside state_processor::process_state() sstables: mc: parser: Allocate dynamic_bitset once per read instead of once per row sstables: reader: Do not read the head of the partition when index can be used sstables: mc: mutation_fragment_filter: Check the fast-forward window first sstables: mc: writer: Avoid calling unsigned_vint::serialized_size()	2018-12-20 12:48:22 +00:00
Duarte Nunes	776fdd4d1a	service/storage_proxy: Expose local view update backlog The local view update backlog is the max backlog out of the relative memory backlog size and the relative hints backlog size. We leverage the db::view::node_update_backlog class so we can send the max backlog out of the node's shards. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-19 22:38:30 +00:00
Duarte Nunes	6662475dd9	tests/view_schema_test: Add simple test for db::view::node_update_backlog Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-19 22:38:30 +00:00
Rafael Ávila de Espíndola	ff18c837b7	tests: Add missing include in random-utils.hh This file uses std::cout and so should include <iostream>. Found with a patch to seastar that removes some redundant <iostream> includes. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20181218183816.34504-1-espindola@scylladb.com>	2018-12-19 10:52:19 +00:00
Duarte Nunes	a7456db687	Merge 'Simplify natural endpoint calculation' from Calle " Implementation of origin change c000da13563907b99fe220a7c8bde3c1dec74ad5 Modifies network topology calculation, reducing the amount of maps/sets used by applying the knowledge of how many replicas we expect/need per dc and sharing endpoint and rack set (since we cannot have overlaps). Also includes a transposed origin test to ensure new calculation matches the old one. Fixes #2896 " * 'calle/network_topology' of github.com:scylladb/seastar-dev: network_topology_test: Add test to verify new algorith results equals old network_topology_strategy: Simplify calculate_natural_endpoints token_metadata: Add "get_location" ip to dc+rack accessor sequenced_set: Add "insert" method, following std::set semantics	2018-12-19 09:39:29 +00:00
Rafael Ávila de Espíndola	b93d8d863d	Add a test with mismatched timestamps. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20181218035931.3554-1-espindola@scylladb.com>	2018-12-18 11:30:56 +01:00
Tomasz Grabiec	fb15759934	sstables: reader: Do not read the head of the partition when index can be used read_partition() was always called through read_next_partition(), even if we're at the beginning of the read. read_next_partition() is supposed to skip to the next partition. It still works when we're positioned before a partition, it doesn't advance the consumer, but it clears _index_in_current_partition, because it (correctly) assumes it corresponds to the partition we're about to leave, not the one we're about to enter. This means that index lookups we did in the read initializer will be disregarded when reading starts, and we'll always start by reading partition data from the data file. This is suboptimal for reads which are slicing a large partition and don't need to read the front of the partition. Regression introduced in `4b9a34a854`. The fix is to call read_partition() directly when we're positioned at the beginning of the partition. For that purpose a new flag was introduced. test_no_index_reads_when_rows_fall_into_range_boundaries has to be relaxed, because it assumed that slicing reads will read the head of the partition. Refs #3984 Fixes #3992 Tested using: ./build/release/tests/perf/perf_fast_forward_g \ --sstable-format=mc \ --datasets large-part-ds1 \ --run-tests=large-partition-slicing-clustering-keys Before (focus on aio): offset read time (s) frags frag/s mad f/s max f/s min f/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 4000000 1 0.001378 1 726 5 736 102 6 200 4 2 0 1 1 0 0 0 65.8% After: offset read time (s) frags frag/s mad f/s max f/s min f/s aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk cpu 4000000 1 0.001290 1 775 6 788 716 2 136 2 0 0 1 1 0 0 0 69.1%	2018-12-18 11:11:37 +01:00
Calle Wilund	e353a8633a	network_topology_test: Add test to verify new algorith results equals old Transposed from origin unit test. Creates a semi-random topology of racks, dcs, tokens and replication factors and verifies endpoint calculation equals old algo.	2018-12-17 13:10:59 +00:00
Botond Dénes	5780f2ce7a	querier_cache: check that the query wasn't evicted during registering The reader concurrency semaphore can evict the querier when it is registered as an inactive read. Make the `querier_cache` aware of this so that it doesn't continue to process the inserted querier when this happens. Also add a unit test for this.	2018-12-17 13:18:08 +02:00
Botond Dénes	e1d8237e6b	reader_concurrency_semaphore: use the correct types in the constructor Previously there was a type mismatch for `count` and `memory`, between the actual type used to store them in the class (signed) and the type of the parameters in the constructor (unsigned). Although negative numbers are completely valid for these members, initializing them to negative numbers don't make sense, this is why they used unsigned types in the constructor. This restriction can backfire however when someone intends to give these parameters the maximum possible value, which, when interpreted as a signed value will be `-1`. What's worse the caller might not even be aware of this unsigned->signed conversion and be very suprised when they find out. So to prevent surprises, expose the real type of these members, trusting the clients of knowing what they are doing. Also add a `no_limits` constructor, so clients don't have to make sure they don't overflow internal types.	2018-12-17 13:18:08 +02:00
Rafael Ávila de Espíndola	4de14e6143	Add tests on broken mc range tombstones. This tests that we diagnose both two consecutive range starts and two consecutive range ends. Message-Id: <20181214212608.95452-1-espindola@scylladb.com>	2018-12-15 13:53:25 +01:00
Avi Kivity	b023e8b45d	Merge " Extract MC sstable writer to a separate compilation unit" from Tomasz " The motivation is to keep code related to each format separate, to make it easier to comprehend and reduce incremental compilation times. Also reduces dependency on sstable writer code by removing writer bits from sstales.hh. The ka/la format writers are still left in sstables.cc, they could be also extracted. " * 'extract-sstable-writer-code' of github.com:tgrabiec/scylla: sstables: Make variadic write() not picked on substitution error sstables: Extract MC format writer to mc/writer.cc sstables: Extract maybe_add_summary_entry() out of components_writer sstables: Publish functions used by writers in writer.hh sstables: Move common write functions to writer.hh sstables: Extract sstable_writer_impl to a header sstables: Do not include writer.hh from sstables.hh sstables: mc: Extract bound_kind_m related stuff into mc/types.hh sstables: types: Extract sstable_enabled_features::all() sstables: Move components_writer to .cc tests: sstable_datafile_test: Avoid dependency on components_writer	2018-12-14 15:05:00 +02:00
Rafael Ávila de Espíndola	f48d54543f	Use read_rows_flat to test broken sstables. The previous code was using mp_row_consumer_k_l to be as close to the tested code as possible. Given that it is testing for an unhandled exception, there is probably more value in moving it to a higher level, easier to use, API. This patch changes it to use read_rows_flat(). Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20181210235016.41133-1-espindola@scylladb.com>	2018-12-14 10:14:28 +01:00
Tomasz Grabiec	245a0d953a	tests: cql_test_env: Start the compaction manager Broken in `fee4d2e` Not doing this results in compaction requests being ignored. One effect of this is that perf_fast_forward produces many sstables instead of one. Refs #3984 Refs #3983 Message-Id: <1544719540-10178-1-git-send-email-tgrabiec@scylladb.com>	2018-12-13 18:58:50 +02:00
Rafael Ávila de Espíndola	51fd880892	Add tests for broken start and end composite markers.	2018-12-13 10:29:44 +01:00
Tomasz Grabiec	eff47a59ee	tests: sstable_datafile_test: Avoid dependency on components_writer It's LA format specific and it's going to become private to sstable.cc	2018-12-12 12:06:22 +01:00
Nadav Har'El	a0379209e6	secondary indexes: fail attempts to create a CUSTOM INDEX Cassandra supports a "CREATE CUSTOM INDEX" to create a secondary index with a custom implementation. The only custom implementation that Cassandra supports is SASI. But Scylla doesn't support this, or any other custom index implementation. If a CREATE CUSTOM INDEX statement is used, we shouldn't silently ignore the "CUSTOM" tag, we should generate an error. This patch also includes a regression test that "CREATE CUSTOM INDEX" statements with valid syntax fail (before this patch, they succeeded). Fixes #3977 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20181211224545.18349-2-nyh@scylladb.com>	2018-12-11 23:33:02 +00:00
Tomasz Grabiec	1dd2bf52ca	Merge "Add a couple of tests of broken sstables" From Rafael These are the current uninteresting cases I found when looking at malformed_sstable_exception. The existing code is working, just not being tested. * https://github.com/espindola/scylla.git espindola/espindola/broken-sst: Add a broken sstable test. Add a test with mismatched schema.	2018-12-10 19:30:58 +01:00

1 2 3 4 5 ...

2843 Commits