scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-04 14:03:06 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	e224653d70	sstables: update tombstone_histogram for cells with expiration time That tombstone_histogram is used to determine droppable data ratio for a sstable, and unlike C*, we were only updating it for tombstones. We need to update it with expiration time of cells too, if any. Creation time (expiration - ttl) cannot be used because if ttl > gc_grace_seconds, the resulting sstable could be considered worth dropping by tombstone compaction before any data is actually expired. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-06-27 16:50:38 -03:00
Avi Kivity	08488a75e0	dist: tolerate sysctl failures sysctl may fail in a container environment if /proc is not virtualized properly. Fixes #1990 Message-Id: <20170625145930.31619-1-avi@scylladb.com>	2017-06-27 16:11:48 +02:00
Avi Kivity	ff7be8241f	Merge "Fix compilation issues in older environments" from Tomasz * 'tgrabiec/fix-compilation-issues' of github.com:cloudius-systems/seastar-dev: tests: streamed_mutation_test: Avoid using boost::size() on row ranges tests: row_cache: Remove unused method	2017-06-27 16:30:54 +03:00
Tomasz Grabiec	eb844a10e9	tests: streamed_mutation_test: Avoid using boost::size() on row ranges Fails to compile with libboost 1.55.	2017-06-27 15:27:13 +02:00
Tomasz Grabiec	e68925595c	tests: row_cache: Remove unused method	2017-06-27 14:10:37 +02:00
Vlad Zolotarov	6839a50677	db::commitlog: entry_writer add a virtual destructor Add a virtual destructor for a base class commitlog::entry_writer. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1498511180-18391-1-git-send-email-vladz@scylladb.com>	2017-06-27 10:17:10 +03:00
Takuya ASADA	1e86196ed5	dist/debian: unofficial support of Ubuntu non-LTS versions / Debian non-stable versions Currently our build script only supports Ubuntu 14.04/16.04 and Debian 8, this change extends support to Ubuntu non-LTS versions / Debian non-stable versions. Note that this is unofficial support, users should build the package for these distributions themselves. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1498491473-28691-1-git-send-email-syuu@scylladb.com>	2017-06-26 18:55:55 +03:00
Asias He	cc02a62756	repair: Prefer nodes in local dc when streaming When peer nodes have the same partition data, i.e., with the same checksum, we currently choose to stream from any of them randomly. To improve streaming performance, select the peer within the same DC. This patch is supposed to improve repair performance with multiple DC. Message-Id: <c6a345b6e8ed2b59f485e53c865241e463b44507.1498490831.git.asias@scylladb.com>	2017-06-26 18:34:21 +03:00
Avi Kivity	1170f56447	Merge "Speed up gossip dissemination in large cluster" from Asias Fixes #2528. * tag 'asias/gossip_talk_to_more_nodes/v3' of github.com:cloudius-systems/seastar-dev: gossip: Use vector for _live_endpoints gossip: Talk to more live nodes in each gossip round	2017-06-26 17:59:43 +03:00
Asias He	e31d4a3940	gossip: Use vector for _live_endpoints To speed up the random access in get_random_node. Switch to use vector instead of set.	2017-06-26 22:49:59 +08:00
Asias He	437899909d	gossip: Talk to more live nodes in each gossip round In large clusters with multiple DC deployment, it is observed that it takes long delay for gossip update to disseminate in the cluster. To speed up, talk to more live nodes in each gossip round. Fixes #2528	2017-06-26 22:49:59 +08:00
Nadav Har'El	6cf44f6817	Optimize column_family::make_sstable_reader() for one partition This patch does the same thing to column_family::make_sstable_reader() as commit `186f031` did to sstable::as_mutation_source(). Although usually one can fast_forward_to() on the result of a column_family::make_sstable_reader(), earlier we had an optimization where if a single partition was specified, it was read exactly, and fast_forward_to() was NOT allowed. With the mutation_reader::forwarding flag patch, when this flag was on - requesting fast_forward_to() - we disabled this optimization. This makes sense, but is not backward compatible with the code which previously assumes this optimization exists. In particular, column_family::data_query() does a single partition read but does not specify forwarding::no explicitly. So this patch returns this optimization, despite this meaning that we blatantly ignore the fwd_mr flag in that case. Fixes #2524. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20170626141121.30322-1-nyh@scylladb.com>	2017-06-26 17:13:03 +03:00
Avi Kivity	9b21a9bfb6	Merge "Implement partial cache" from Tomasz and Piotr "This series enables cache to keep partial partitions. Reads no longer have to read whole partition from sstables in order to cache the result. The 10MB threshold for partition size in cache is lifted. Known issues: - There is no partial eviction yet, whole partitions are still evicted, and partition snapshots held by active reads are not evictable at all - Information about range continuity is not recorded if that would require inserting a dummy entry, or if previous entry doesn't belong to the latest snapshot - Cache update after memtable flush happening concurrently with reads may inhibit that reads' ability to populate cache (new issue) - Cache update from flushed memtables has partition granularity, so may cause latency problems with large partition - Schema is still tracked per-partition, so after schema changes reads may induce high latency due to whole partition needing to be converted atomically - Range tombstones are repeated in the stream for every range between cache entries they cover (new issue) - Populating scans for both small and large partitions (perf_fast_forward) experienced a 40% reduction of throughput, CPU bound How was this tested: - test.py --mode release - row_cache_stress_test -c1 -m1G - perf_fast_forward, passes except for the test case checking range continuity population which would require inserting a dummy entry (mentioned above) - perf_simple_query (-c1 -m1G --duration 32): before: 90k [ops/s] stdev: 4k [ops/s] after: 94k [ops/s] stdev: 2k [ops/s]" * tag 'tgrabiec/introduce-partial-cache-v8' of github.com:cloudius-systems/seastar-dev: (130 commits) tests: row_cache: Add test_tombstone_merging_in_partial_partition test case tests: Introduce row_cache_stress_test utils: Add helpers for dealing with nonwrapping_range<int> tests: simple_schema: Allow passing the tombstone to make_range_tombstone() tests: simple_schema: Accept value by reference tests: simple_schema: Make add_row() accept optional timestamp tests: simple_schema: Make new_timestamp() public tests: simple_schema: Introduce make_ckeys() tests: simple_schema: Introduce get_value(const clustered_row&) helper tests: simple_schema: Fix comment tests: simple_schema: Add missing include row_cache: Introduce evict() tests: Add cache_streamed_mutation_test tests: mutation_assertions: Allow expecting fragments mutation_fragment: Implement equality check tests: row_cache: Add test for population of random partitions tests: row_cache: Add test for partition tombstone population tests: row_cache: Test reading randomly populated partition tests: row_cache: Add test_single_partition_update() tests: row_cache: Add test_scan_with_partial_partitions ...	2017-06-26 14:54:37 +03:00
Avi Kivity	555621b537	Disentangle memtables from sstables Remove sstable::write_components(memtable), replacing it with a helper. Fixes #2354 Message-Id: <20170624142639.16662-1-avi@scylladb.com>	2017-06-26 09:37:11 +02:00
Avi Kivity	236a8370e4	Remove use of std::random_shuffle() It was removed in C++17. Replace with std::shuffle(). Message-Id: <20170626063809.7563-1-avi@scylladb.com>	2017-06-26 09:36:38 +02:00
Avi Kivity	c4ae2206c7	messaging: respect inter_dc_tcp_nodelay configuration parameter We respect it partially (client side only) for now. Fixes #6. Message-Id: <20170623172048.23103-1-avi@scylladb.com>	2017-06-24 21:49:27 +02:00
Duarte Nunes	2dfd7040eb	CMakeLists.txt: Add boost support Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170623172236.15507-1-duarte@scylladb.com>	2017-06-24 21:49:27 +02:00
Avi Kivity	801b5220d6	Merge seastar upstream * seastar 9e2b7ec...0ab7ae5 (4): > Update fmt submodule > rpc: add options to control tcp_nodelay > core: Fix compilation for older versions of Boost > tests/lowres_clock_test: Fix compilation issues	2017-06-24 20:47:52 +03:00
Tomasz Grabiec	b0bcf2be53	tests: row_cache: Add test_tombstone_merging_in_partial_partition test case	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	23c6f517cb	tests: Introduce row_cache_stress_test Runs readers, updates and eviction concurrently and verifies the following property of reads: - reads see all past writes - reads see no partial writes within a single partition	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	4b4aef789e	utils: Add helpers for dealing with nonwrapping_range<int>	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	5c9f87fb27	tests: simple_schema: Allow passing the tombstone to make_range_tombstone()	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	edf4a3494c	tests: simple_schema: Accept value by reference	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	5f70df472f	tests: simple_schema: Make add_row() accept optional timestamp	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	53867c4328	tests: simple_schema: Make new_timestamp() public	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	51b5814ec2	tests: simple_schema: Introduce make_ckeys()	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	074c67fe4d	tests: simple_schema: Introduce get_value(const clustered_row&) helper	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	8ffc776e06	tests: simple_schema: Fix comment	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	ecacd2e84a	tests: simple_schema: Add missing include	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	b56232b216	row_cache: Introduce evict()	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	c4e8effffa	tests: Add cache_streamed_mutation_test [tgrabiec: - extracted from a larger commit - removed coupling with how cache_streamed_mutation is created (the code went out of sync), used more stable make_reader(). it's simpler too. - replaced false/true literals with is_continuous/is_dummy where appropriate - dropped tests for cache::underlying (class is gone) - reused streamed_mutation_assertions, it has better error messages - fixed the tests to not create tombstones with missing timestamps - relaxed range tombstone assertions to only check information relevant for the query range - print cache on failure for improved debuggability ]	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	44fdee3f2e	tests: mutation_assertions: Allow expecting fragments	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	1f23130b07	mutation_fragment: Implement equality check	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	116bcb8b30	tests: row_cache: Add test for population of random partitions	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	930a1415fe	tests: row_cache: Add test for partition tombstone population	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	9bfece6f82	tests: row_cache: Test reading randomly populated partition	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	0358334579	tests: row_cache: Add test_single_partition_update() [tgrabiec: Extracted from "row_cache: Introduce cache_streamed_mutation"]	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	8bb76e2f12	tests: row_cache: Add test_scan_with_partial_partitions	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	896bf2e5de	Remove unused methods from MVCC Some apply methods were replaced by apply_to_incomplete(). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	6f6575f456	row_cache: Enable partial partition population	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	5a0ae55f6d	Introduce schema_upgrader	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	1828e28bbb	database: Invalidate cache atomically with attaching streaming sstables Not doing so may cause reads to see partial writes, if another update+read happens in between.	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	896196b841	database: Invalidate cache from seal_active_streaming_memtable_immediate() Cache must be synchronized atomically with changing the underlying mutation source, otherwise write atomicity may not hold.	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	7ae40d7045	tests: Add test for update_invalidating()	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	e792220c3a	row_cache: Introduce update_invalidating()	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	c29878f49f	row_cache: Extract memtable walking logic from update() into do_update() So that it can be reused in update_invalidating().	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	6ebfb730ee	partition_entry: Introduce partition_tombstone() getter	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	fb62dfab02	tests: mvcc: Introduce test_schema_upgrade_preserves_continuity	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	164989a574	tests: mvcc: Add test for partition_entry::apply_to_incomplete()	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	e433e68610	partition_entry: Make squashed() and upgrade() work with not fully continuous versions Those methods first create a neutral mutation_partition, and left-fold it with the versions. The problem is that there is no neutral element for static row continuity, the flag from the first addend always wins. We have to copy the flag from the first version to preserve the logical value.	2017-06-24 18:06:11 +02:00
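
The reasoning in commit e224653d70 (why the histogram must record a cell's expiration time rather than its creation time) can be illustrated with a small sketch. This is a simplified model and not Scylla's code: the histogram is assumed to be a plain list of timestamps, an entry is assumed to count as droppable once it is older than `now - gc_grace_seconds`, and `droppable_ratio` and all the numbers are hypothetical names and values chosen for illustration.

```python
# Simplified model (an assumption, not Scylla's implementation): the
# histogram is a plain list of timestamps, and an entry counts as
# droppable once it is older than gc_before = now - gc_grace_seconds.

def droppable_ratio(histogram, now, gc_grace_seconds):
    """Fraction of histogram entries older than the GC cutoff."""
    if not histogram:
        return 0.0
    gc_before = now - gc_grace_seconds
    return sum(1 for t in histogram if t < gc_before) / len(histogram)

# Hypothetical numbers: four cells written at t=0 with ttl > gc_grace_seconds.
write_time = 0
ttl = 1000
gc_grace_seconds = 100
expiration = write_time + ttl
now = 500  # the cells are still live, because now < expiration

# Recording creation time (expiration - ttl) makes live data look droppable.
by_creation = droppable_ratio([expiration - ttl] * 4, now, gc_grace_seconds)

# Recording expiration time keeps the estimate at zero until data has expired.
by_expiration = droppable_ratio([expiration] * 4, now, gc_grace_seconds)

print(by_creation)
print(by_expiration)
```

In this model the creation-time histogram reports every cell as droppable at `now = 500`, although none expires before `t = 1000`, which matches the failure mode the commit message describes.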

12395 Commits