scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 16:33:35 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	fc92fb955d	sstables/compaction_manager: release reference to exhausted sstable through callback That's important for the reference to sstable to not be kept throughout the compaction procedure, which would break the goal of releasing space during compaction. Manager passes a callback to compaction which calls it whenever there's sstable replacement. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:16 -02:00
Raphael S. Carvalho	3f309ebba9	sstables/compaction: stop tracking exhausted input sstable in compaction_read_monitor Motivation is that we want to release space for exhausted sstable and that will only happen when all references to it are gone and that backlog tracker takes the early replacement into account. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:13 -02:00
Raphael S. Carvalho	3433de3dc0	database: do not keep reference to sstable in selector when done selecting When compacting, we'll create all readers at once and will not select again from incremental selector, meaning the selector will keep all respective sstables in current_sstables, preventing compaction from releasing space as it goes on. The change is about refreshing sstable set's selector such that it will not hold a reference to an exhausted sstable whatsoever. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:12 -02:00
Raphael S. Carvalho	f6df949c1a	compaction: share sstable set with incremental reader selector By doing that, we'll be able to release exhausted sstable from both simultaneously. That's achieved by sharing set containing input sstables with the incremental reader selector and removing exhausted sstables from shared set when the time has come. Step towards reducing disk requirement for compaction by making it delete an sstable whose data is all in a sealed new sstable. For that to happen, all references must be gone. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:10 -02:00
Raphael S. Carvalho	e5a0b05c15	sstables/compaction: release space earlier of exhausted input sstables Currently, compaction only replaces input sstables at end of compaction, meaning compaction must be finished for all the space of those sstables to be released. What we can do instead is to delete earlier some input sstable under some conditions: 1) SStable data should be committed to a new, sealed output sstable, meaning it's exhausted. 2) Exhausted sstable mustn't overlap with a non-exhausted sstable because a tombstone in the exhausted could have been purged and the shadowed data in non-exhausted could be resurrected if system crashes. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:07 -02:00
Raphael S. Carvalho	ace070c8fc	sstables: make partitioned sstable set's incremental selector resilient to changes in the set The motivation is that compaction may remove a sstable from the set while the incremental selector is alive, and for that to work, we need to invalidate the iterators stored by the selector. We could have added a method to notify it, but there will be a case where the one keeping the set cannot forward the notification to the selector. So it's better for the selector to take care of itself. Change counter approach is used which allows the selector to know when to invalidate the iterators. After invalidation, selector will move the iterator back into its right place by looking for lower bound for current pos. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:05 -02:00
Raphael S. Carvalho	8d11b0bbb4	database: do not store reference to sstable in incremental selector Use sstable generation instead to keep track of read sstables. The motivation is that we'll not keep reference to sstables, so allowing their space on disk to be released as soon as they get exhausted. Generation is used because it guarantees uniqueness of the sstable. Reviewed-by: Botond Dénes <bdenes@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:04 -02:00
Raphael S. Carvalho	edc87014c1	tests/sstables: add run identifier correctness test Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:02 -02:00
Raphael S. Carvalho	a66b1954cc	sstables: use a random uuid for sstables without run identifier Older sstables must have an identifier for them to be associated with their own run. Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:53:01 -02:00
Raphael S. Carvalho	62025fa52c	sstables: add run identifier to scylla metadata It identifies a run which a particular sstable belongs to. Existing sstables will have a random uuid associated with it in memory. UUID is the correct choice because it allows sstables to be exported without having conflicts when using identifier generated by different nodes. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2018-11-24 18:52:44 -02:00
Rafael Ávila de Espíndola	d18bbe9d45	Remove unreachable default cases. These switches are fully covered. We can be sure they will stay this way because of -Werror and gcc's -Wswitch warning. We can also be sure that we never have an invalid enum value since the state machine values are not read from disk. The patch also removes a superfluous ';'. Message-Id: <20181124020128.111083-1-espindola@scylladb.com>	2018-11-24 09:31:51 +00:00
Raphael S. Carvalho	d29482dce8	sstables: deprecate sstable metadata's ancestors The reason for that is that it's not available in sstable format mc, so we can no longer rely on it in common code for the currently supported formats. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20181121170057.20900-1-raphaelsc@scylladb.com>	2018-11-23 19:38:32 +01:00
Tomasz Grabiec	564b328b2e	Merge 'Add tests for schema changes' from Paweł This series adds a generic test for schema changes that generates various schema and data before and after an ALTER TABLE operation. It is then used to check correctness of mutation::upgrade() and sstable readers and lead to the discovery of #3924 and #3925. Fixes #3925. * https://github.com/pdziepak/scylla.git schema-change-test/v3.1 schema_builder: make member function names less confusing converting_mutation_partition_applier: fix collection type changes converting_mutation_partition_applier: do not emit empty collections sstable: use format() instead of sprint() tests/random-utils: make functions and variables inline tests: add models for schemas and data tests: generate schema changes tests/mutation: add test for schema changes tests/sstable: add test for schema changes	2018-11-23 15:11:31 +01:00
Paweł Dziepak	09439cd809	tests/sstable: add test for schema changes for_each_schema_change() is used for testing reading an sstable that was written with a different schema. Because of #3924, for now the mc format is not verified this way.	2018-11-23 12:14:06 +00:00
Paweł Dziepak	dc7f9fea5b	tests/mutation: add test for schema changes	2018-11-23 12:14:06 +00:00
Paweł Dziepak	35f9f424e9	tests: generate schema changes This patch adds for_each_schema_change() functions which generates schemas and data before and after some modification to the schema (e.g. adding a column, changing its type). It can be used to test schema upgrades.	2018-11-23 12:14:06 +00:00
Paweł Dziepak	daee4bd3b8	tests: add models for schemas and data This patch introduces a model of Scylla schemas and data, implemented using simple standard library primitives. It can be used for testing the actual schemas, mutation_partitions, etc. used by the schema by comparing the results of various actions. The initial use case for this model was to test schema changes, but there is no reason why in the future it cannot be extended to test other things as well.	2018-11-23 12:14:06 +00:00
Takuya ASADA	cf0d00b81a	dist/ami: fix 'unknown configuration key: "enhanced_networking"' error while building AMI packer 1.3.2 no longer supports the enhanced_networking directive, so we need to use the new directives ("sriov_support" and "ena_support") to build with the new version. packer provides an automatic configuration file fixing tool, so the new scylla.json is generated by the following command: ./packer/packer fix scylla.json Fixes #3938 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20181123053719.32451-1-syuu@scylladb.com>	2018-11-23 08:15:47 +02:00
Paweł Dziepak	91793c0a43	bytes_ostream: drop appending_hash specialisation appending_hash is used for computing hashes that become part of the binary interface. They cannot change between Scylla versions and the same data needs to always result in the same hash. At the moment, appending_hash<bytes_ostream> doesn't fulfil those requirements since it leaks information how the underlying buffer is fragmented. Fortunately, it has no users so it doesn't cause any compatibility issues. Moreover, bytes_ostream is usually used as an output of some serialisation routine (e.g. frozen_mutation_fragment or CQL response). Those serialisation formats do not guarantee that there is a single representation of a given data and therefore are not fit to be hashed by appending_hash. Removing appending_hash<bytes_ostream> may help preventing such incorrect uses. Message-Id: <20181122163823.12759-1-pdziepak@scylladb.com>	2018-11-22 23:53:54 +00:00
Tomasz Grabiec	fb38f0e9f8	Update seastar submodule * seastar b924495...1fbb633 (3): > rpc: Reduce code duplication > tests: perf: Make do_not_optimize() take the argument by const& > doc: Fix import paths in the tutorial	2018-11-22 23:53:54 +00:00
Paweł Dziepak	2a0e929830	tests/random-utils: make functions and variables inline random-utils.hh is a header which may be included in multiple translation units so all members should be non-static inline to avoid any duplication.	2018-11-22 11:30:31 +00:00
Paweł Dziepak	edb5402a73	sstable: use format() instead of sprint() The format message was using the new style formatting markers ("{}") which are understood by format() but not by sprint() (the latter is basically deprecated).	2018-11-22 11:30:31 +00:00
Paweł Dziepak	1fbe33791d	converting_mutation_partition_applier: do not emit empty collections This patch changes the behaviour of the schema upgrade code so that if all cells and the tombstones of a collection are removed during the upgrade the collection is not emitted (as opposed to emitting an empty one). Both behaviours are valid, but the new one makes it more consistent with how atomic cells are upgraded and how schema upgrades work for sstable readers.	2018-11-22 11:30:31 +00:00
Paweł Dziepak	7b12aaa093	converting_mutation_partition_applier: fix collection type changes ALTER TABLE allows changing the type of a collection to a compatible one. This includes changes from a fixed-sized type to a variable-sized one. If that happens the atomic_cells representing collection elements need to be rewritten so that the value size is included. The logic for rewriting atomic cells already exists (for those that are not collection members) and is reused in this patch. Fixes #3925.	2018-11-22 11:30:31 +00:00
Paweł Dziepak	43e0201ec6	schema_builder: make member function names less confusing Right now, schema_builder member functions have names that very poorly convey the actions that are performed by them. This is made even worse by some overloads which drastically change the semantics. For example: schema_builder() .with_column("v1", /* ... */) .without_column("v1", removal_timestamp); Creates a column "v1" and adds an information that there was a column with that name that was removed at 'removal_timestamp'. schema_builder() .with_column("v1") .without_column(utf8_type->decompose("v1")); This adds column "v1" and then immediately removes it. In order to clean up this mess the names were changed so that: * with_/without_ functions only add information to the schema (e.g. info that a column was removed, but without removing a column of that name if one exists) * functions which names start with a verb actually perform that action, e.g. the new remove_column() removes the column (and adds information that it used to exist) as in the second example.	2018-11-22 11:30:31 +00:00
Benny Halevy	dcd18e2b62	remove exec permission from top_k source files This was introduced by `32525f2694` Cc: Rafi Einstein <rafie@scylladb.com> Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20181121163352.13325-1-bhalevy@scylladb.com>	2018-11-21 18:38:50 +02:00
Gleb Natapov	b4a8802edc	hints: make hints manager more resilient to unexpected directory content Currently if hints directory contains unexpected directories Scylla fails to start with unhandled std::invalid_argument exception. Make the manager ignore malformed files instead and try to proceed anyway. Message-Id: <20181121134618.29936-2-gleb@scylladb.com>	2018-11-21 14:53:03 +00:00
Gleb Natapov	9433d02624	hints: add auxiliary function for scanning high level hints directory We scan hints directory in two places: to search for files to replay and to search for directories to remove after resharding. The code that translates directory name to a shard is duplicated. It is simple now, so not a big issue, but in case it grows it is better to have it in one place. Message-Id: <20181121134618.29936-1-gleb@scylladb.com>	2018-11-21 14:53:03 +00:00
Paweł Dziepak	4aa5d83590	Merge "Optimize sstable writing of the MC format" from Tomasz " Tested with perf_fast_forward from: github.com/tgrabiec/scylla.git perf_fast_forward-for-sst3-opt-write-v1 Using the following command line: build/release/tests/perf/perf_fast_forward_g --populate --sstable-format=mc \ --data-directory /tmp/perf-mc --rows=10000000 -c1 -m4G \ --datasets small-part The average reported flush throughput was (stdev for the averages is around 4k): - for mc before the series: 367848 frag/s - for lc before the series: 463458 frag/s (= mc.before +25%) - for mc after the series: 429276 frag/s (= mc.before +16%) - for lc after the series: 466495 frag/s (= mc.before +26%) Refs #3874. " * tag 'sst3-opt-write-v2' of github.com:tgrabiec/scylla: sstables: mc: Avoid serialization of promoted index when empty sstables: mc: Avoid double serialization of rows tests: sstable 3.x: Do not compare Statistics component utils: Introduce memory_data_sink schema: Optimize column count getters sstables: checksummed_file_data_sink_impl: Bypass output_stream	2018-11-21 13:11:40 +00:00
Tomasz Grabiec	049926bfb8	sstables: mc: Avoid serialization of promoted index when empty calculate_write_size() adds some overhead, even if we're not going to write anything.	2018-11-21 14:04:27 +01:00
Tomasz Grabiec	0a9f5b563a	sstables: mc: Avoid double serialization of rows The old code was serializing the row twice. Once to get the size of its block on disk, which is needed to write the block length, and then to actually write the block. This patch avoids this by serializing once into a temporary buffer and then appending that buffer to the data file writer. I measured about 10% improvement in memtable flush throughput with this for the small-part dataset in perf_fast_forward.	2018-11-21 14:04:27 +01:00
Tomasz Grabiec	8f686af9af	tests: sstable 3.x: Do not compare Statistics component The Statistics component recorded in the test was generated using a buggy version of Scylla, and is not correct. Exposed by fixing the bug in the way statistics are generated. Rather than comparing binary content, we should have explicit checks for statistics.	2018-11-21 14:04:27 +01:00
Tomasz Grabiec	143fd6e1c2	utils: Introduce memory_data_sink	2018-11-21 14:04:27 +01:00
Tomasz Grabiec	789fac9884	schema: Optimize column count getters	2018-11-21 14:04:27 +01:00
Tomasz Grabiec	8e8b96c6ed	sstables: checksummed_file_data_sink_impl: Bypass output_stream We can avoid the data copying by switching from this: sink -> stream -> sink to this: sink -> sink	2018-11-21 14:04:27 +01:00
Avi Kivity	bb85a21a8f	Merge "compress: Restore lz4 as default compressor" from Duarte " Enables sstable compression with LZ4 by default, which was the long-time behavior until a regression turned off compression by default. Fixes #3926 " * 'restore-default-compression/v2' of https://github.com/duarten/scylla: tests/cql_query_test: Assert default compression options compress: Restore lz4 as default compressor tests: Be explicit about absence of compression	2018-11-21 14:20:39 +02:00
Benny Halevy	76b1c184b7	conf: clean up cassandra references in scylla.yaml Indicate the default scylla directories, rather than Cassandra's. Provide links to Scylla documentation where possible, update links to Cassandra documentation otherwise. Clean up a few typos. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20181119141912.28830-1-bhalevy@scylladb.com>	2018-11-21 13:04:24 +02:00
Rafael Ávila de Espíndola	7fa7e9716d	Mention scylla-tools-java and scylla-jmx in HACKING.md I struggled a bit finding out why nodetool was not working, so it might be a good idea to expand the documentation a bit. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20181120233358.25859-1-espindola@scylladb.com>	2018-11-21 12:55:17 +02:00
Tomasz Grabiec	349c9f7a69	HACKING.md: Add a link to the slides about core dump debugging tools Message-Id: <1542793207-1620-1-git-send-email-tgrabiec@scylladb.com>	2018-11-21 11:45:23 +02:00
Michael Munday	53fdde75f6	dht: use little endian byte order explicitly for token hash This avoids a difference between little and big endian systems. We now also calculate a full murmur hash for tokens with less than 8 bytes, however in practice the token size is always 8. Message-Id: <20181120214733.43800-1-mike.munday@ibm.com>	2018-11-21 11:44:29 +02:00
Michael Munday	360374cfde	tests: fix compilation of partitioner_test with boost 1.68 on IBM Z The boost multiprecision library that I am compiling against seems to be missing an overload for the cast to a string. The easy workaround seems to be to call str() directly instead. This also fixes #3922. Message-Id: <20181120215709.43939-1-mike.munday@ibm.com>	2018-11-21 11:43:42 +02:00
Duarte Nunes	9464fffc8c	tests/cql_query_test: Assert default compression options Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-11-20 22:47:27 +00:00
Duarte Nunes	36dc9e3280	compress: Restore lz4 as default compressor Fixes a regression introduced in `74758c87cd`, where tables started to be created without compression by default (before they were created with lz4 by default). Fixes #3926 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-11-20 22:47:27 +00:00
Duarte Nunes	5f64e34fcc	tests: Be explicit about absence of compression Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-11-20 22:47:26 +00:00
Avi Kivity	775b7e41f4	Update seastar submodule * seastar d59fcef...b924495 (2): > build: Fix protobuf generation rules > Merge "Restructure files" from Jesse Includes fixup patch from Jesse: " Update Seastar `#include`s to reflect restructure All Seastar header files are now prefixed with "seastar" and the configure script reflects the new locations of files. Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com> Message-Id: <5d22d964a7735696fb6bb7606ed88f35dde31413.1542731639.git.jhaberku@scylladb.com> "	2018-11-21 00:01:44 +02:00
Takuya ASADA	42baf6a6f7	dist/ami: update packer Update packer to latest version, 1.3.2. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20181031110441.16284-2-syuu@scylladb.com>	2018-11-20 21:29:57 +02:00
Takuya ASADA	b9a42e83ad	dist/ami: enable AMI build log To make it easier to debug AMI build errors, enable logging. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20181031110441.16284-1-syuu@scylladb.com>	2018-11-20 21:29:57 +02:00
Takuya ASADA	72411f95cb	reloc/build_reloc.sh: find ninja-build after executed install-dependencies.sh The build environment may not have ninja-build installed before running install-dependencies.sh, so do it after running the script. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20181031110737.17755-1-syuu@scylladb.com>	2018-11-20 21:29:57 +02:00
Avi Kivity	183c2369f3	Update seastar submodule * seastar a44cedf...d59fcef (10): > dns: Set tcp output stream buffer size to zero explicitly > tests: add libc-ares to travis dependencies > tests: add dns_test to test suite > build: drop bundled c-ares package > prometheus: replace the instance label with an optional one > build: Refactor C++ dialect detection > build: add libatomic to install-dependencies.sh > core: use std::underlying_type for open_flags > core: introduce open_flags::operator& > core: Fix build for `gnu++14`	2018-11-20 21:29:57 +02:00
Tomasz Grabiec	57e25fa0f8	utils: phased_barrier: Make advance_and_await() have strong exception guarantees Currently, when advance_and_await() fails to allocate the new gate object, it will throw bad_alloc and leave the phased_barrier object in an invalid state. Calling advance_and_await() again on it will result in undefined behavior (typically SIGSEGV) because _gate will be disengaged. One place affected by this is table::seal_active_memtable(), which calls _flush_barrier.advance_and_await(). If this throws, subsequent flush attempts will SIGSEGV. This patch rearranges the code so that advance_and_await() has strong exception guarantees. Message-Id: <1542645562-20932-1-git-send-email-tgrabiec@scylladb.com>	2018-11-20 16:15:12 +00:00
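
The strong exception guarantee described in commit 57e25fa0f8 can be illustrated with a small standalone sketch. This is not the Scylla code: `gate_stub`, `make_gate`, `fail_next_alloc` and `barrier_sketch` are hypothetical names invented for the illustration, and the real phased_barrier works with Seastar gates and futures rather than a plain counter. The only point shown is the ordering: allocate the replacement first, and touch the member only once nothing can throw any more.

```cpp
#include <memory>
#include <new>
#include <utility>

// Hypothetical stand-in for a gate object; it only remembers its phase.
struct gate_stub {
    unsigned phase;
};

// Hypothetical switch that makes the next allocation fail, so that the
// failure path can be exercised without exhausting real memory.
static bool fail_next_alloc = false;

std::unique_ptr<gate_stub> make_gate(unsigned phase) {
    if (fail_next_alloc) {
        fail_next_alloc = false;
        throw std::bad_alloc();
    }
    return std::make_unique<gate_stub>(gate_stub{phase});
}

class barrier_sketch {
    std::unique_ptr<gate_stub> _gate = make_gate(0);
public:
    // Strong guarantee: the new gate is allocated before _gate is touched.
    // If make_gate() throws, _gate still holds the old, valid gate.
    std::unique_ptr<gate_stub> advance() {
        auto new_gate = make_gate(_gate->phase + 1); // may throw
        // Nothing below can throw: unique_ptr moves are noexcept.
        return std::exchange(_gate, std::move(new_gate));
    }

    bool engaged() const noexcept { return static_cast<bool>(_gate); }
    unsigned phase() const noexcept { return _gate->phase; }
};
```

A variant that moved `_gate` out first and allocated afterwards would leave `_gate` empty when the allocation throws, which is the disengaged state the commit message describes. With the ordering above, a failed `advance()` can simply be retried.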

16994 Commits