scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-12 19:02:12 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	9d923a61e1	Merge "Fixes to sstable files for non-compound schemas" from Duarte This series mainly fixes issues with the serialization of promoted index entries for non-compound schemas and with the serialization of range tombstones, also for non-compound schemas. We lift the correct cell name writing code into its own function, and direct all users to it. We also ensure backward compatibility with incorrectly generated promoted indexes and range tombstones. Fixes #2995 Fixes #2986 Fixes #2979 Fixes #2992 Fixes #2993 * git@github.com:duarten/scylla.git promoted-index-serialization/v3: sstables/sstables: Unify column name writers sstables/sstables: Don't write index entry for a missing row maker sstables/sstables: Reuse write_range_tombstone() for row tombstones sstables/sstables: Lift index writing for row tombstones sstables/sstables: Leverage index code upon range tombstone consume sstables/sstables: Move out tombstone check in write_range_tombstone() sstables/sstables: A schema with static columns is always compound sstables/sstables: Lift column name writing logic sstables/sstables: Use schema-aware write_column_name() for collections sstables/sstables: Use schema-aware write_column_name() for row marker sstables/sstables: Use schema-aware write_column_name() for static row sstables/sstables: Writing promoted index entry leverages column_name_writer sstables/sstables: Add supported feature list to sstables sstables/sstables: Don't use incorrectly serialized promoted index cql3/single_column_primary_key_restrictions: Implement is_inclusive() cql3/delete_statement: Constrain range deletions for non-compound schemas tests/cql_query_test: Verify range deletion constraints sstables/sstables: Correctly deserialize range tombstones service/storage_service: Add feature for correct non-compound RTs tests/sstable_*: Start the storage service for some cases sstables/sstable_writer: Prepare to control range tombstone serialization sstables/sstables: Correctly 
serialize range tombstones tests/sstable_assertions: Fix monotonicity check for promoted indexes tests/sstable_assertions: Assert a promoted index is empty tests/sstable_mutation_test: Verify promoted index serializes correctly tests/sstable_mutation_test: Verify promoted index repeats tombstones tests/sstable_mutation_test: Ensure range tombstone serializes correctly tests/sstable_datafile_test: Add test for incorrect promoted index tests/sstable_datafile_test: Verify reading of incorrect range tombstones sstables/sstable: Rename schema-oblivious write_column_name() function sstables/sstables: No promoted index without clustering keys tests/sstable_mutation_test: Verify promoted index is not generated sstables/sstables: Optimize column name writing and indexing compound_compat: Don't assume compoundness (cherry picked from commit `bd1efbc25c`) Also added sstables::make_sstable() to preserve source compatibility in tests.	2017-11-30 16:21:13 +02:00
Avi Kivity	f698496ab2	Merge "Fix Scylla upgrades when counters are used" from Paweł "Scylla 1.7.4 and older use incorrect ordering of counter shards, this was fixed in `0d87f3dd7d` ("utils::UUID: operator< should behave as comparison of hex strings/bytes"). However, that patch was not backported to 1.7 branch until very recently. This means that versions 1.7.4 and older emit counter shards in an incorrect order and expect them to be so. This is particularly bad when dealing with imported correct sstables in which case some shards may become duplicated. The solution implemented in this patch is to allow any order of counter shards and automatically merge all duplicates. The code is written in a way so that the correct ordering is expected in the fast path in order not to excessively punish unaffected deployments. A new feature flag CORRECT_COUNTER_ORDER is introduced to allow seamless upgrade from 1.7.4 to later Scylla versions. If that feature is not available Scylla still writes sstables and sends on-wire counters using the old ordering so that it can be correctly understood by 1.7.4, once the flag becomes available Scylla switches to the correct order. Fixes #2752." 
* tag 'fix-upgrade-with-counters/v2' of https://github.com/pdziepak/scylla: tests/counter: verify counter_id ordering counter: check that utils::UUID uses int64_t mutation_partition_serializer: use old counter ordering if necessary mutation_partition_view: do not expect counter shards to be sorted sstables: write counter shards in the order expected by the cluster tests/sstables: add storage_service_for_tests to counter write test tests/sstables: add test for reading wrong-order counter cells sstables: do not expect counter shards to be sorted storage_service: introduce CORRECT_COUNTER_ORDER feature tests/counter: test 1.7.4 compatible shard ordering counters: add helper for retrieving shards in 1.7.4 order tests/counter: add tests for 1.7.4 counter shard order counters: add counter id comparator compatible with Scylla 1.7.4 tests/counter: verify order of counter shards tests/counter: add test for sorting and deduplicating shards counters: add function for sorting and deduplicating counter cells counters: add counter_id::operator> (cherry picked from commit `31706ba989`)	2017-09-05 14:25:36 +03:00
Avi Kivity	e1a3052e76	tests: fix sstable_datafile_test build with boost 1.55 Boost 1.55 accidentally removed support for "range for" on recursive_directory_iterator (previous and latter versions do support it). Use old-style iteration instead. Message-Id: <20170724080128.8824-1-avi@scylladb.com> (cherry picked from commit `c21bb5ae05`)	2017-07-24 11:20:53 +03:00
Duarte Nunes	83cc640c6a	Merge 'Revert back to 1.7 schema layout in memory' from Tomasz "Fixes schema layout incompatibility in a mixed 1.7 and 2.0 cluster (#2555) by reverting back to using the old layout in memory and thus also in across-node requests. We still use the new v3 layout in schema tables (needed by drivers and external tools). Translations happen when converting to/from schema mutations." * tag 'tgrabiec/use-v2-schema-layout-in-memory-v2' of github.com:scylladb/seastar-dev: schema: Revert back to the 1.7 layout of static compact tables in memory schema: Use v3 column layout when converting to/from schema mutations schema: Encapsulate column layout translations in the v3_columns class (cherry picked from commit `1daf1bc4bb`)	2017-07-19 19:49:43 +03:00
Avi Kivity	70be29173a	tests: copy the sstable with an unknown component to the data directory We will be creating links to those sstable's files, and those don't work if the data directory and the test sstable are on different devices. Copying the files to the same directory fixes the problem. Message-Id: <20170716090405.14307-1-avi@scylladb.com> (cherry picked from commit `9116dd91cb`)	2017-07-17 08:47:08 +03:00
Avi Kivity	555621b537	Disentangle memtables from sstables Remove sstable::write_components(memtable), replacing it with a helper. Fixes #2354 Message-Id: <20170624142639.16662-1-avi@scylladb.com>	2017-06-26 09:37:11 +02:00
Raphael S. Carvalho	4bb27cbd6f	lcs: actually prefer oldest sstables of L0 when it falls behind Strategy prefers promoting oldest sstables in L0. Because sort procedure is incorrectly sorting elements in descending order, newest sstables will be promoted first if and only if L0 falls behind (more than 32 sstables). If L0 doesn't fall behind, we'll have all L0 sstables compacted with overlapping ones in L1. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-06-19 20:45:39 -03:00
Nadav Har'El	3018df11b5	Allow reading exactly desired byte ranges and fast_forward_to In commit `c63e88d556`, support was added for fast_forward_to() in data_consume_rows(). Because an input stream's end cannot be changed after creation, that patch ignores the specified end byte, and uses the end of file as the end position of the stream. As a result, even when we want to read a specific byte range (e.g., in the repair code to checksum the partitions in a given range), the code reads an entire 128K buffer around the end byte, or significantly more, with read-ahead enabled. This causes repair to do more than 10 times the amount of I/O it really has to do in the checksumming phase (which in the current implementation, reads small ranges of partitions at a time). This patch has two levels: 1. In the lower level, sstable::data_consume_rows(), which reads all partitions in a given disk byte range, now gets another byte position, "last_end". That can be the range's end, the end of the file, or anything in between the two. It opens the disk stream until last_end, which means (a) we will never read ahead beyond last_end, and (b) fast_forward_to() is not allowed beyond last_end. 2. In the upper level, we add to the various layers of sstable readers, mutation readers, etc., a boolean flag mutation_reader::forwarding, which says whether fast_forward_to() is allowed on the stream of mutations to move the stream to a different partition range. Note that this flag is separate from the existing boolean flag streamed_mutation::forwarding - that one talks about skipping inside a single partition, while the flag we are adding is about switching the partition range being read. Most of the functions that previously accepted streamed_mutation::forwarding now accept also the option mutation_reader::forwarding. The exceptions are functions which are known to read only a single partition and do not support fast_forward_to() to a different partition range. We note that if mutation_reader::forwarding::no is requested, and fast_forward_to() is forbidden, there is no point in reading anything beyond the range's end, so data_consume_rows() is called with last_end as the range's end. But if forwarding::yes is requested, we use the end of the file as last_end, exactly like the code before this patch did. Importantly, we note that the repair's partition reading code, column_family::make_streaming_reader, uses mutation_reader::forwarding::no, while the other existing reading code will use the default forwarding::yes. In the future, we can further optimize the amount of bytes read from disk by replacing forwarding::yes by an actual last partition that may ever be read, and use its byte position as the last_end passed to data_consume_rows. But we don't do this yet, and it's not a regression from the existing code, which also opened the file input stream until the end of the file, and not until the end of the range query. Moreover, such an improvement will not help if the overall range is always very large, in which case not over-reading at its end will not improve performance. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20170619152629.11703-1-nyh@scylladb.com>	2017-06-19 18:31:32 +03:00
Avi Kivity	6e2c9ef9fb	Revert "Allow reading exactly desired byte ranges and fast_forward_to" This reverts commit `317d7fc253` (and also the related `2c57ab84b2`). It causes crashes during range scans, reported by Gleb: "To reproduce I run SELECT * FROM keyspace1.standard1; on typical c-s dataset and 3 node cluster. Backtrace: at /home/gleb/work/seastar/seastar/core/apply.hh:36 rvalue=<unknown type in /home/gleb/work/seastar/build/release/scylla, CU 0x54cf307, DIE 0x55ebf2a>) at /home/gleb/work/seastar/seastar/core/do_with.hh:57 range=std::vector of length 6, capacity 8 = {...}) at /home/gleb/work/seastar/seastar/core/future-util.hh:142 at ./seastar/core/future.hh:890 at /home/gleb/work/seastar/seastar/core/future-util.hh:119 at /home/gleb/work/seastar/seastar/core/future-util.hh:142	2017-06-18 16:10:21 +03:00
Nadav Har'El	317d7fc253	Allow reading exactly desired byte ranges and fast_forward_to In commit `c63e88d556`, support was added for fast_forward_to() in data_consume_rows(). Because an input stream's end cannot be changed after creation, that patch ignores the specified end byte, and uses the end of file as the end position of the stream. As a result, even when we want to read a specific byte range (e.g., in the repair code to checksum the partitions in a given range), the code reads an entire 128K buffer around the end byte, or significantly more, with read-ahead enabled. This causes repair to do more than 10 times the amount of I/O it really has to do in the checksumming phase (which in the current implementation, reads small ranges of partitions at a time). This patch has two levels: 1. In the lower level, sstable::data_consume_rows(), which reads all partitions in a given disk byte range, now gets another byte position, "last_end". That can be the range's end, the end of the file, or anything in between the two. It opens the disk stream until last_end, which means (a) we will never read ahead beyond last_end, and (b) fast_forward_to() is not allowed beyond last_end. 2. In the upper level, we add to the various layers of sstable readers, mutation readers, etc., a boolean flag mutation_reader::forwarding, which says whether fast_forward_to() is allowed on the stream of mutations to move the stream to a different partition range. Note that this flag is separate from the existing boolean flag streamed_mutation::forwarding - that one talks about skipping inside a single partition, while the flag we are adding is about switching the partition range being read. Most of the functions that previously accepted streamed_mutation::forwarding now accept also the option mutation_reader::forwarding. The exceptions are functions which are known to read only a single partition and do not support fast_forward_to() to a different partition range. We note that if mutation_reader::forwarding::no is requested, and fast_forward_to() is forbidden, there is no point in reading anything beyond the range's end, so data_consume_rows() is called with last_end as the range's end. But if forwarding::yes is requested, we use the end of the file as last_end, exactly like the code before this patch did. Importantly, we note that the repair's partition reading code, column_family::make_streaming_reader, uses mutation_reader::forwarding::no, while the other existing reading code will use the default forwarding::yes. In the future, we can further optimize the amount of bytes read from disk by replacing forwarding::yes by an actual last partition that may ever be read, and use its byte position as the last_end passed to data_consume_rows. But we don't do this yet, and it's not a regression from the existing code, which also opened the file input stream until the end of the file, and not until the end of the range query. Moreover, such an improvement will not help if the overall range is always very large, in which case not over-reading at its end will not improve performance. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20170614072122.13473-1-nyh@scylladb.com>	2017-06-15 13:22:46 +01:00
Tomasz Grabiec	f3a6d94398	sstables: Introduce sstable::as_mutation_source() Adaptors extracted from existing testing code. Message-Id: <1495729508-30081-1-git-send-email-tgrabiec@scylladb.com>	2017-05-25 19:30:20 +03:00
Calle Wilund	66991a7ccb	v3 schema test fixes	2017-05-10 16:44:48 +00:00
Duarte Nunes	65d96421da	tests/sstable_datafile_test: Fix regression This patch fixes a regression introduced in `9e88b60`, where the wrong clustering key was being specified. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170509091621.2682-1-duarte@scylladb.com>	2017-05-09 12:18:47 +03:00
Duarte Nunes	9e88b60ef5	mutation: Set cell using clustering_key_prefix Change the clustering key argument in mutation::set_cell from exploded_clustering_prefix to clustering_key_prefix, which allows for some overall code simplification and fewer copies. This mostly affects the cql3 layer. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-04 15:59:50 +02:00
Raphael S. Carvalho	8b0e358d73	tests/sstable_test: fix release-mode compaction_manager_test in release mode, compaction task is active after submitting request because ready future may be scheduled immediately. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20170502171925.9893-1-raphaelsc@scylladb.com>	2017-05-02 20:48:30 +03:00
Raphael S. Carvalho	8dfb5f9c33	tests/sstable_test: fix compaction_manager_test After 'compaction: make major compaction go through compaction manager', the test fails because the task is preempted in debug mode before it reaches the instruction that increases the stat. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20170501183255.6191-1-raphaelsc@scylladb.com>	2017-05-02 09:06:41 +03:00
Duarte Nunes	4e693383f7	mutation_partion: Use row_tombstone This patch replaces the current row tombstone representation by a row_tombstone. The intent of the patch is thus to reify the idea of shadowable tombstones, which up until now we considered all materialized view row tombstones to be. We need to distinguish shadowable from non-shadowable row tombstones to support scenarios such as the following, when inserting into a table with a materialized view: 1. insert into base (p, v1, v2) values (3, 1, 3) using timestamp 1 2. delete from base using timestamp 2 where p = 3 3. insert into base (p, v1) values (3, 1) using timestamp 3 These should yield a view row where v2 is definitely null, but with the current implementation, v2 will pop back with its value v2=3@TS=1, even though it's dead in the base row. This is because the row tombstone inserted at 2) is a shadowable one. This patch only addresses the memory representation of such row_tombstones. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-04-25 11:46:33 +02:00
Avi Kivity	5424aca745	sstable_datafile_test: avoid string_view user-defined literal conversion operator Clang doesn't like it, perhaps because it isn't in the std namespace (it's still in std::experimental).	2017-04-22 22:11:30 +03:00
Raphael S. Carvalho	8a37b279ed	tests: add test for new sstable resharding Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-04-21 17:11:34 -03:00
Raphael S. Carvalho	d82a8dfae0	lcs: restore invariant instead of sending overlapping sst to L0 A large token span sstable may find its way into high level due to resharding, which means the strategy invariant is broken. The invariant is restored by compacting first set of overlapping sstables, meaning that the restoration is done incrementally for multiple overlapping sets. Invariant is restored by regular compaction after resharding puts new unshared sstables into their original level, where level > 0. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-04-21 17:11:09 -03:00
Raphael S. Carvalho	11b74050a1	partitioned_sstable_set: fix quadratic space complexity streaming generates lots of small sstables with large token range, which triggers O(N^2) in space in interval map. level 0 sstables will now be stored in a structure that has O(N) in space complexity and which will be included for every read. Fixes #2287. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20170417185509.6633-1-raphaelsc@scylladb.com>	2017-04-18 13:04:38 +03:00
Raphael S. Carvalho	a6f8f4fe24	compaction: do not write expired cell as dead cell if it can be purged right away When compacting a fully expired sstable, we're not allowing that sstable to be purged because expired cell is unconditionally converted into a dead cell. Why not check if the expired cell can be purged instead using gc before and max purgeable timestamp? Currently, we need two compactions to get rid of a fully expired sstable which cells could have always been purged. look at this sstable with expired cell: { "partition" : { "key" : [ "2" ], "position" : 0 }, "rows" : [ { "type" : "row", "position" : 120, "liveness_info" : { "tstamp" : "2017-04-09T17:07:12.702597Z", "ttl" : 20, "expires_at" : "2017-04-09T17:07:32Z", "expired" : true }, "cells" : [ { "name" : "country", "value" : "1" }, ] now this sstable data after first compaction: [shard 0] compaction - Compacted 1 sstables to [...]. 120 bytes to 79 (~65% of original) in 229ms = 0.000328997MB/s. { ... "rows" : [ { "type" : "row", "position" : 79, "cells" : [ { "name" : "country", "deletion_info" : { "local_delete_time" : "2017-04-09T17:07:12Z" }, "tstamp" : "2017-04-09T17:07:12.702597Z" }, ] now another compaction will actually get rid of data: compaction - Compacted 1 sstables to []. 79 bytes to 0 (~0% of original) in 1ms = 0MB/s. ~2 total partitions merged to 0 NOTE: It's a waste of time to wait for second compaction because the expired cell could have been purged at first compaction because it satisfied gc_before and max purgeable timestamp. Fixes #2249, #2253 Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20170413001049.9663-1-raphaelsc@scylladb.com>	2017-04-13 10:59:19 +03:00
Avi Kivity	5b530aa464	Merge "Use promoted index for skipping in sstable mutation readers" from Tomasz "sstable_streamed_mutation::fast_forward_to() is changed to use promoted index (via index_reader) to optimize skipping in large partitions. In addition to that, sstable mutation_reader is changed to use the index to skip to the next partition. Performance impact was evaluated using newly added tests/perf/perf_fast_forward What's beyond this series: - Using index_reader for single-partition reads as well - Using index_reader for skipping across ranges in clustering restrictions" * tag 'tgrabiec/skip-within-partition-using-index-v2' of github.com:cloudius-systems/seastar-dev: (47 commits) tests: Add performance test for fast forwarding of sstable readers tests: Allow starting cql_test_env on pre-existing data config: Allow specifying source when setting value tests: sstable: Add test for fast forwarding within partition using index sstables: sstable_streamed_mutation: use index in fast_forward_to() sstables: Store parsed promoted index in index_entry sstables: Add trace-level logging for sstable consumption sstables: Define deletion_time earlier sstables: Make parsing throw exception on malformed promoted index tests: Add tests for ordering of position_in_partition relative to composites position_range: Introduce all_clustered_rows() factory method position_in_partition: Introduce for_key()/after_key() factory methods position_in_partition: Add factory methods for positions around all rows position_in_partition: Introduce for_range_start()/for_range_end() position_in_partition: Fix friendship declaration keys: Introduce is_empty() for prefixes position_in_partition: Make comparable with composites types: Enhance lexicographical comparators compound_compat: Accept marker value in serialize_value() compound_compat: Add trichotomic comparator ...	2017-03-29 19:01:12 +03:00
Raphael S. Carvalho	023031b0c8	compaction: lcs: fix functionality to feed starved levels Quick introduction to level starvation: high levels may be left uncompacted (thus starved) for a long time if the user does something that leaves them with little data, such as cleanup or a change of max sstable size (default 160M). Leveled strategy handles this problem as follows: consider we're compacting L1 to L2. If L3 is starved, we look for one of its sstables that is fully contained in the token range of the L1->L2 candidates, so that we won't end up with an overlap in L2. Now the problem: the functionality isn't working properly because the range of candidates is being incorrectly calculated due to an accident when converting the code to C++. It won't cause an overlap because it's actually being more restrictive about which sstable from the starved level can be used. A test case was added to confirm the problem. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20170328223753.15398-1-raphaelsc@scylladb.com>	2017-03-29 18:59:46 +03:00
Tomasz Grabiec	f1aca6d116	tests: sstable: Add test for fast forwarding within partition using index	2017-03-28 18:34:55 +02:00
Tomasz Grabiec	cd295e9926	sstables: Avoid moving an sstable In preparation for adding non-movable members.	2017-03-28 18:10:39 +02:00
Tomasz Grabiec	88ccc99017	tests: sstables: Add test for handling of repeated tombstones	2017-03-10 14:42:22 +01:00
Tomasz Grabiec	124dde30db	sstables: Extract writer parameters into config objects Also enables users to change the default promoted index block size.	2017-03-10 14:42:22 +01:00
Paweł Dziepak	04b80272f2	cell_locker: add metrics for lock acquisition	2017-03-02 09:05:12 +00:00
Tomasz Grabiec	9f63e172fb	tests: compaction_manager_test: Fix abort on exception Message-Id: <1487343901-12745-1-git-send-email-tgrabiec@scylladb.com>	2017-02-17 15:53:55 +00:00
Paweł Dziepak	de2acd47c9	tests/sstables: test reading and writing counters	2017-02-02 10:35:14 +00:00
Benoît Canet	bcc826cc34	mutation_reader: Short circuit the read path on empty range Add a boolean to short circuit the read path on empty range hoping for some speedup. tested in read write with cs using: cl=QUORUM duration=1m -mode native cql3 -rate threads=700 -node localhost Will do some additional benchmark. Fixes #1056 Signed-off-by: Benoît Canet <benoit@scylladb.com> Message-Id: <20170118194451.16836-1-benoit@scylladb.com>	2017-01-20 10:05:40 +00:00
Piotr Jastrzebski	b159e08764	intrusive_set: rename size() to calculate_size() This hopefully will make it more apparent that the time complexity of this method is O(N) not O(1). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-01-05 12:21:43 +01:00
Piotr Jastrzebski	4bbe05dd47	mutation_partition: take schema in find_row and clustered_row This will allow intrusive set implementation that does not store schema. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-01-05 11:26:03 +01:00
Avi Kivity	1d9ee358f1	Revert "Merge "Reduce the size of mutation_partition" from Piotr" This reverts commit `aa392810ff`, reversing changes made to a24ff47c637e6a5fd158099b8a65f1191fc2d023; it uses boost::intrusive::detail directly, which it must not, and doesn't compile on all boost versions as a consequence.	2016-12-25 16:07:48 +02:00
Avi Kivity	aa392810ff	Merge "Reduce the size of mutation_partition" from Piotr "Reduce the size of mutation_partition by implementing intrusive set using bi::rbtree_algorithms directly and using tree nodes optimized for size. This will reduce the size of mutation_partition by: 24 bytes + <number of cql rows> * 8 bytes This should have a positive impact on performance because mutation_partitions are stored both in memtable and cache. Fixes #742." * 'haaawk/742' of github.com:cloudius-systems/seastar-dev: intrusive_set: rename size() to calculate_size() Make intrusive_set_external_comparator::_value_traits static Implement intrusive set using rbtree_algorithms mutation_partition: make apply_reversibly_intrusive_set nongeneric mutation_partition: take schema in find_row and clustered_row mutation_partition: Extract intrusive set logic to a class. mutation_partition: Replace value_comp with key_comp calls	2016-12-25 12:56:10 +02:00
Piotr Jastrzebski	345ed5b6ff	intrusive_set: rename size() to calculate_size() This hopefully will make it more apparent that the time complexity of this method is O(N) not O(1). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-12-23 11:32:13 +01:00
Piotr Jastrzebski	2af6ff68d9	mutation_partition: take schema in find_row and clustered_row This will allow intrusive set implementation that does not store schema. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-12-23 11:29:07 +01:00
Raphael S. Carvalho	fd80499b3d	database: make column_family::add_sstable() private again Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <38226308bee2970a91b0e35370d6a646b85ecfe9.1482459877.git.raphaelsc@scylladb.com>	2016-12-23 11:42:16 +02:00
Tomasz Grabiec	fb8765bef9	tests: sstables: Add more test cases to tombstone_purge_test	2016-12-22 13:24:46 +01:00
Tomasz Grabiec	d841cab02c	tests: sstables: Ensure timestamps are increasing	2016-12-22 13:24:45 +01:00
Tomasz Grabiec	21ade8e4a4	tests: sstables: Simplify tombstone_purge_test - moved to seastar thread - extracted sstable creation and validation logic - reduced code duplication - switched to mutation_reader assertions - used result of compact_sstable() to locate the new sstable - rather than setting gc timestamp in the past, bump the clock before compacting	2016-12-22 13:24:41 +01:00
Asias He	e5485f3ea6	Get rid of query::partition_range Use dht::partition_range instead	2016-12-19 08:09:25 +08:00
Raphael S. Carvalho	548f6066c5	tests: add test for sstable set's incremental selector Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-12-09 16:17:17 -02:00
Raphael S. Carvalho	a16425833c	size_tiered: do not recreate bucket when it goes beyond max threshold Problem will cause size tiered to return small jobs when there are more than max_threshold sstables of similar size. For example, if max_threshold is 32, and there are 36 sstables of similar size, strategy will only return 4 sstables to be compacted. That's because we incorrectly create a new bucket when it meets the max threshold. What we should do is to allow buckets to grow beyond max threshold and trim them when selecting the most suitable one for compaction. Important to mention that estimation for size tiered will now work better when there are more than max_threshold sstables of similar size. Fixes #1901. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <080bad70d6cb86eaf52ac1bdd6765ac47aab5b03.1478316140.git.raphaelsc@scylladb.com>	2016-11-29 16:56:02 +02:00
Duarte Nunes	e680587b8a	sstable_test: Be explicit about uncompressed tables After 7c28ed, the schemas defined in the test became compressed by default. This patch changes the test so that it is explicit about which schemas shouldn't define a compressor. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1478646530-5558-1-git-send-email-duarte@scylladb.com>	2016-11-09 11:21:59 +02:00
Raphael S. Carvalho	53b7b7def3	sstables: handle unrecognized sstable component As in C*, unrecognized sstable components should be ignored when loading a sstable. At the moment, Scylla fails to do so and will not boot as a result. In addition, unknown components should be remembered when moving a sstable or changing its generation. Fixes #1780. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <b7af0c28e5b574fd577a7a1d28fb006ac197aa0a.1478025930.git.raphaelsc@scylladb.com>	2016-11-02 12:44:53 +02:00
Paweł Dziepak	6755a679f6	drop key readers key_readers weren't used since introduction of continuity flag to cache entries. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-10-19 15:29:08 +01:00
Paweł Dziepak	7dcd70124a	tests/sstables: add test for fast forwarding reader Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-10-19 15:29:08 +01:00
Raphael S. Carvalho	a8ab4b8f37	lcs: fix starvation at higher levels When max sstable size is increased, higher levels are suffering from starvation because we decide to compact a given level if the following calculation results in a number greater than 1.001: level_size(L) / max_size_for_level_l(L) Fixes #1720. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-30 14:09:49 -03:00
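
The level-score check described in the last entry above (`a8ab4b8f37`) can be sketched roughly as follows. This is an illustration only, not Scylla's code: the helper names `max_size_for_level` and `level_needs_compaction`, the fanout of 10, and the capacity formula (fanout^level times the max sstable size) are all assumptions; only the ratio level_size(L) / max_size_for_level_l(L) and the 1.001 threshold come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical capacity of a level: fanout^level * max_sstable_size.
// The fanout of 10 is an assumption, not taken from the commit message.
inline uint64_t max_size_for_level(unsigned level, uint64_t max_sstable_size,
                                   uint64_t fanout = 10) {
    uint64_t capacity = max_sstable_size;
    for (unsigned i = 0; i < level; ++i) {
        capacity *= fanout;
    }
    return capacity;
}

// A level is picked for compaction when its score,
// level_size / max_size_for_level, exceeds 1.001.
inline bool level_needs_compaction(uint64_t level_size, unsigned level,
                                   uint64_t max_sstable_size) {
    const uint64_t capacity = max_size_for_level(level, max_sstable_size);
    assert(capacity > 0);
    const double score = static_cast<double>(level_size) /
                         static_cast<double>(capacity);
    return score > 1.001;
}
```

Under these assumptions, an L1 holding 1760 MB with a 160 MB max sstable size has a score of 1.1 and is compacted, whereas after raising the max sstable size to 320 MB the same data scores 0.55 and the level is left alone. That is the kind of starvation the commit message describes.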

122 Commits