scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 01:20:39 +00:00

Author	SHA1	Message	Date
Duarte Nunes	ad8ff1df7e	sstables: Replace composite class This patch replaces the sstables::composite class with the one in compound_compat.hh. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-07-11 16:55:11 +02:00
Duarte Nunes	0b87d16699	composite: Add unit tests Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-07-11 16:55:11 +02:00
Duarte Nunes	b179d8d378	compound_compat: Parse legacy compound values This patch adds support for parsing legacy compound values by introducing the composite class, a wrapper around a sequence of bytes serialized in the legacy format for compounds. Compound values can be sent through the thrift API. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-07-11 16:55:07 +02:00
Avi Kivity	f126efd7f2	transport: encode user-defined type metadata Right now we fall back to tuples, which confuses the client. Fixes #1443. Reviewed-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1468167120-1945-1-git-send-email-avi@scylladb.com>	2016-07-11 08:51:17 +03:00
Takuya ASADA	d2caa486ba	dist/redhat/centos_dep: disable go and ada language on scylla-gcc package, since ScyllaDB never use them centos-master jenkins job failed at building libgo, but we don't need go language, so let's disable it on scylla-gcc package. Also we never use ada, disable it too. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1468166660-23323-1-git-send-email-syuu@scylladb.com>	2016-07-10 19:12:52 +03:00
Avi Kivity	24e3026e32	Merge "compaction manager refactoring" from Raphael	2016-07-10 17:16:23 +03:00
Tomasz Grabiec	6a1f9a9b97	db: Improve logging Message-Id: <1467997671-16570-1-git-send-email-tgrabiec@scylladb.com>	2016-07-10 16:15:03 +03:00
Avi Kivity	b5bef73ad2	Merge "Avoiding checking bloom filters during compaction" from Tomasz "Checking bloom filters of sstables to compute max purgeable timestamp for compaction is expensive in terms of CPU time. We can avoid calculating it if we're not about to GC any tombstone. This patch changes compacting functions to accept a function instead of ready value for max_purgeable. I verified that bloom filter operations no longer appear on flame graphs during compaction-heavy workload (without tombstones). Refs #1322."	2016-07-10 11:33:41 +03:00
Tomasz Grabiec	8c4b5e4283	db: Avoiding checking bloom filters during compaction Checking bloom filters of sstables to compute max purgeable timestamp for compaction is expensive in terms of CPU time. We can avoid calculating it if we're not about to GC any tombstone. This patch changes compacting functions to accept a function instead of ready value for max_purgeable. I verified that bloom filter operations no longer appear on flame graphs during compaction-heavy workload (without tombstones). Refs #1322.	2016-07-10 09:54:20 +02:00
Tomasz Grabiec	c0233c877d	db: Avoid out-of-memory when flushing cannot keep up memtable_list::seal_on_overflow() is called on each mutation to check if current memtable should be flushed. It will call memtable_list::seal_active_memtable() when that is the case. The number of concurrent seals is guarded by a semaphore, starting from commit `0f64eb7e7d`, and allows at most 4 of them. If there are 4 flushes already pending, every incoming mutation will enqueue a new flush task on the semaphore's wait list, without waiting for it. The wait queue can grow without bounds, eventually leading to out-of-memory. The fix is to seal the memtable immediately to satisfy should_flush() condition, but limit concurrency of actual flushes. This way the wait queue size on the semaphore is limited by memtables pending a flush, which is fairly limited. Message-Id: <1467997652-16513-1-git-send-email-tgrabiec@scylladb.com>	2016-07-10 10:53:51 +03:00
Tomasz Grabiec	74ff30a31a	mutation_reader: Introduce stable_flattened_mutations_consumer adaptor Needed to make compact_mutation class non-movable later. It is used in do_with, so needs to be movable. Will be solved by using this adaptor.	2016-07-09 22:31:28 +02:00
Tomasz Grabiec	fb44f895b2	mutation_reader: Name template parameters after concepts With so many consumer concepts out there, it is confusing to name parameters using generic "Consumer" name, let's name them after (already defined) concepts: CompactedMutationsConsumer, FlattenedConsumer.	2016-07-09 22:31:27 +02:00
Raphael S. Carvalho	ed5e7e6842	compaction: refactor compaction manager Previously, same function was used to handle both regular compaction and cleanup requests. That's bad because a lot of conditions were added for both compaction types to live in the same function. Now, cleanup and regular compaction will live in different functions. They share a lot of code, so helper functions were introduced. This change is also important for user-initiated compaction that will go through compaction manager in the future. Code is also a lot easier to read now. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-08 16:37:53 -03:00
Raphael S. Carvalho	da6a2b429d	compaction: add functions to register and deregister compacting sstables Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-08 16:00:51 -03:00
Raphael S. Carvalho	4d6dce8ec9	compaction: add helper function to get candidates for strategy Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-08 15:06:14 -03:00
Raphael S. Carvalho	e38f66c6fe	database: make certain column family functions const qualified Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-08 15:05:22 -03:00
Raphael S. Carvalho	bfc5376548	compaction: remove gate from compaction manager task There is no longer a need to use gate for regular termination of fiber that runs compaction. Now, we only set task->stopping to true, ask for compaction termination, and wait for its future to resolve. Code is simplified a lot with this change. Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-08 15:05:10 -03:00
Paweł Dziepak	cba996a3ea	Merge "Implement missing functions for byte_ordered_partitioner" from Asias	2016-07-08 10:49:25 +01:00
Asias He	f4389349e4	config: Enable partitioner option Enable --partitioner option so that user can choose partitioner other than the default Murmur3Partitioner. Currently, only Murmur3Partitioner and ByteOrderedPartitioner are supported. When non-supported partitioner is specified, error will be propagated to user.	2016-07-08 17:44:55 +08:00
Asias He	9c27b5c46e	byte_ordered_partitioner: Implement missing describe_ownership and midpoint In order to support ByteOrderedPartitioner, we need to implement the missing describe_ownership and midpoint function in byte_ordered_partitioner class. As a starter, this patch uses a simple node token distance based method to calculate ownership. C* uses a complicated key samples based method. We can switch to what C* does later. Tests are added to tests/partitioner_test.cc. Fixes #1378	2016-07-08 17:44:55 +08:00
Asias He	e0949a8f4f	storage_service: Exit shadow round state if it fails If a node fails to talk to any seed node, shadow round will fail. We should exit shadow round state before we continue. This issue is spotted by consistency_test.TestConsistency.data_query_digest_test dtest. Message-Id: <ba0613532a69bac369ca316ab61d907b320c8e68.1467963674.git.asias@scylladb.com>	2016-07-08 10:05:07 +01:00
Avi Kivity	8dab93a853	sstables: fix low disk utilization with compression and small chunk lengths As Nadav notes we use the chunk length as the buffer size for the compressed stream too. Fix by using it only for the outer (uncompressed) stream; the inner (compressed) stream uses the sstable buffer size, 128 kiB. Fixes #1402. Message-Id: <1467910556-5759-1-git-send-email-avi@scylladb.com> Reviewed-by: Nadav Har'El <nyh@scylladb.com>	2016-07-07 18:13:30 +01:00
Vlad Zolotarov	f2bf453be2	database: revive mutation retry in case of replay_position_reordered_exception The logic that would retry applying a mutation in case of a replay_position_reordered_exception error was broken by a commit `0c31f3e626` Author: Glauber Costa <glauber@scylladb.com> Date: Wed Apr 20 19:09:21 2016 -0400 database: move memtable throttler to the LSA throttler This patch makes it work again. Fixes #1439 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Message-Id: <1467893342-30559-1-git-send-email-vladz@cloudius-systems.com>	2016-07-07 15:00:35 +02:00
Tomasz Grabiec	de429d6a53	Merge branch 'dev/pdziepak/streamed-mutations-streaming/v3' Support for streaming of large partitions from Paweł: This series converts streaming to streaming_mutations so that there is no need to store full mutation in memory in order to send or receive it. The first several patches add a way of estimating mutation fragment memory usage and introduce fragment_and_freeze() which produces a stream of reasonably sized frozen mutations from a single streamed mutation. The second part of this patchset makes sure that streaming mutations in fragments doesn't break isolation guarantees. This is achieved by delaying visibility of sstables produced by streaming until the streaming is completed. However, our current receiving code merges mutations from all streaming plans together thus making it impossible to track which data was received from a particular streaming plan. The solution to that problem is to introduce an additional flag to STREAM_MUTATION verb which informs the receiver whether the mutation is fragmented and care must be taken to preserve isolation. Small mutations behaved as they were, with writes from different stream plans coalesced while big mutations are handled separately for each streaming task.	2016-07-07 13:23:39 +02:00
Paweł Dziepak	d9eb4d8028	streaming: use fragment_and_freeze() to send mutations Commit `206955e4` "streaming: Reduce memory usage when sending mutations" moved streaming mutation limiter from do_send_mutations() to send_mutations(). The reason for that was that send_mutation() did full mutation copies. That's no longer the case and streaming limiter should be moved back to do_send_mutation() in order to provide back pressure to fragment_and_freeze(). Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:36 +01:00
Paweł Dziepak	32a5de7a1f	db: handle receiving fragmented mutations If mutations are fragmented during streaming a special care must be taken so that isolation guarantees are not broken. Mutations received with flag "fragmented" set are applied to a memtable that is used only by that particular streaming task and the sstables created by flushing such memtables are not made visible until the task is complete. Also, in case the streaming fails all data is dropped. This means that fragmented mutations cannot benefit from coalescing of writes from multiple streaming plans, hence separate way of handling them so that there is no loss of performance for small partitions. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	f2ae31711e	streaming: inform CF when streaming fails Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	4031c0ed8f	streaming: pass plan_id to column family for apply and flush plan_id is needed to keep track of the origin of mutations so that if they are fragmented all fragments are made visible at the same time, when that particular streaming plan_id completes. Basically, each streaming plan that sends big (fragmented) mutations is going to have its own memtables and a list of sstables which will get flushed and made visible when that plan completes (or dropped if it fails). Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	51ec7a7285	db: wait for ongoing flushes at end of streaming When flush_streaming_mutations() is called at the end of streaming it is supposed to flush all data and then invalidate cache ranges. However, if there are already some memtable flushes in progress it won't wait for them. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	5bc51821fe	sstables: allow writing unsealed sstables The purpose of this patch is to split the actions of writing sstable and sealing it. As long as the sstable is unsealed it is considered incomplete and is going to be removed on reboot. Such functionality is needed in order to defer visibility of sstables created during streaming until the streaming is complete. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	a7b6c1110f	sstables: do not require seal_sstable() to be run in thread Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	4e34bd4e8a	tests/streamed_mutation: test fragment_and_freeze() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:35 +01:00
Paweł Dziepak	19629e95e2	frozen_mutation: add fragment_and_freeze() fragment_and_freeze() produces a stream of frozen mutations from a single streamed_mutation. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:18:30 +01:00
Paweł Dziepak	820bd6c9bc	streamed_mutation: add mutation_fragment::memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	23d0bfd065	mutation_partition: add row::memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	1d54327afd	atomic_cell_or_collection: add memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	d0ee750cec	keys: add memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	cfa581b426	utils/managed_vector: add memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	703509a1c7	utils/managed_bytes: add memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	a289816b31	streamed_mutation: fix mutation_fragment::consume() return type Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	37bd7230bc	streamed_mutation: add mutation fragment visitor Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Glauber Costa	54ce6221a7	allow the dirty memory manager to be used without a database object Some of our tests don't provide a database object to a CF. Create a default dirty memory manager object that can be used without a database for them. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <872f8c9232ff87d788e271b1db86c814d7a75d9f.1467832713.git.glauber@scylladb.com>	2016-07-07 10:00:43 +01:00
Raphael S. Carvalho	0772d20c60	fix compilation in debug mode build/debug/sstables/compaction_strategy.o: In function `date_tiered_manifest::date_tiered_manifest(std::map<basic_sstring<char, unsigned int, 15u>, basic_sstring<char, unsigned int, 15u>, std::less<basic_sstring<char, unsigned int, 15u> >, std::allocator<std::pair<basic_sstring<char, unsigned int, 15u> const, basic_sstring<char, unsigned int, 15u> > > > const&)': /home/centos/scylla/sstables/date_tiered_compaction_strategy.hh:67: undefined reference to `date_tiered_manifest::DEFAULT_BASE_TIME_SECONDS' That's fixed by moving definition of static constexpr outside the class. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20c16ad71f64900aa5591018bc4e976406cfebb3.1467870383.git.raphaelsc@scylladb.com>	2016-07-07 11:52:37 +03:00
Avi Kivity	9a8788019d	row_cache: fix visitor for boost <= 1.55 Older boosts can't return a future from a visitor (likely lacking support for move-only objects). Supply a dirty hackaround. Message-Id: <1467822548-25940-1-git-send-email-avi@scylladb.com>	2016-07-06 19:55:51 +03:00
Avi Kivity	21031d276b	Merge seastar upstream * seastar c82c36f...9267dfa (6): > app_template: Make run() wait for func when reactor exit is triggered externally > core: Introduce futurize_apply() helper > rpc: make unexpected eof messages more informative > Fix boost version check > reactor: more fix for smp poll with older boost > reactor: fix build on older boost due to spsc_queue::read_available()	2016-07-06 18:14:13 +03:00
Avi Kivity	02530faeb2	compaction: fix tombstones not being garbage collected during compaction `2a46410f4a` changed sstable_list from a map to a set, so it is no longer sorted by generation. The code for finding the list of sstables not being compacted relied on this sort order, and now broke, returning a longer list than needed (including some of the sstables being compacted). As a result, the compaction code preserved the tombstones, incorrectly thinking there was still live data they referenced. Fix by sorting the set explicitly. Fixes #1429. Message-Id: <1467793026-6571-1-git-send-email-avi@scylladb.com>	2016-07-06 10:22:31 +02:00
Asias He	0c56bbe793	gossip: Make get_supported_features and wait_for_feature_on{_all}_node private They are used only inside gossiper itself. Also make the helper get_supported_features(std::unordered_map<gms::inet_address, sstring>) static. Message-Id: <f434c145ad9138084708b60c1d959b84360e47b2.1467775291.git.asias@scylladb.com>	2016-07-06 09:54:56 +03:00
Avi Kivity	ab279a4752	Merge "Add support to date tiered compaction strategy" from Raphael "After this patchset, date tiered compaction strategy is supported by Scylla. For those who don't know what it is about, the following article may help: https://labs.spotify.com/2014/12/18/date-tiered-compaction/ It's also nicely explained here by our wiki page: https://github.com/scylladb/scylla/wiki/SSTable-compaction#date-tiered-compaction Basically, date tiered strategy was developed to help the database perform better when facing a time series workload. Date tiered strategy will work to keep data written at nearly the same time together, such that the number of relevant sstables for a time-based query is relatively low. We still lack support to filter out sstables based on time parameters of a query, but that feature should come ASAP. The following dtests now pass: compaction_test.py:TestCompaction_with_DateTieredCompactionStrategy.compaction_delete_test compaction_test.py:TestCompaction_with_DateTieredCompactionStrategy.compaction_strategy_switching_test Used cassandra-stress with the parameter '-schema compaction\(strategy=DateTieredCompactionStrategy\)' to check stability. Fixes #511."	2016-07-06 09:51:12 +03:00
Avi Kivity	7438c9de5c	Merge "Fix database freeze with load for multiple CFs" from Glauber "Issue 1195 describes a scenario with a fairly easy reproducer in which we can freeze the database. That involves writing simultaneously to multiple CFs, such that the sum of all the memory they are using is larger than the dirty memory limit, without any of them individually being larger than the memtable size. This patchset rewrites the throttling code, including now active flushes so that this situation cannot happen. Fixes #1195"	2016-07-06 09:48:13 +03:00
Raphael S. Carvalho	b5ec4d46c6	tests: add test for date tiered compaction strategy Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 02:11:47 -03:00
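
Note on 8c4b5e4283 / b5bef73ad2: the message says the compacting functions now accept a function instead of a ready value for max_purgeable, so that the expensive bloom filter check is skipped when no tombstone is about to be garbage collected. The idea can be sketched roughly as below. This is a minimal illustration, not Scylla's code: `cell`, `timestamp_type`, `compact()` and the purge rule (drop a tombstone whose timestamp is below max_purgeable) are simplified, hypothetical stand-ins.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-ins; these are not Scylla's real types.
using timestamp_type = std::int64_t;

struct cell {
    timestamp_type timestamp;
    bool is_tombstone;
};

// Copies `in` to the result, dropping tombstones older than the max
// purgeable timestamp. `get_max_purgeable` stands for the expensive
// computation; it is called lazily, at most once, and only if a
// tombstone is actually encountered.
std::vector<cell> compact(const std::vector<cell>& in,
                          const std::function<timestamp_type()>& get_max_purgeable) {
    std::optional<timestamp_type> max_purgeable; // cached after first use
    std::vector<cell> out;
    for (const cell& c : in) {
        if (c.is_tombstone) {
            if (!max_purgeable) {
                max_purgeable = get_max_purgeable();
            }
            if (c.timestamp < *max_purgeable) {
                continue; // tombstone can be garbage collected
            }
        }
        out.push_back(c);
    }
    return out;
}
```

With an input that contains only live cells, the callback is never invoked, which corresponds to the tombstone-free workload described in the commit message.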


9851 Commits
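
Note on 02530faeb2: the message says the code that finds the sstables not being compacted relied on sstable_list being sorted by generation, and that the fix sorts the set explicitly. A rough sketch of that kind of fix follows. It assumes, for illustration only, that sstables are identified by a bare generation number and that the difference is computed with std::set_difference, which requires sorted inputs; `generation_type` and `not_compacted()` are hypothetical names, not Scylla's.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in: an sstable is represented by its generation only.
using generation_type = std::int64_t;

// Returns the generations in `all` that are not in `compacting`.
// std::set_difference requires both ranges to be sorted, so both are
// copied and sorted explicitly instead of relying on container order.
std::vector<generation_type> not_compacted(
        const std::unordered_set<generation_type>& all,
        const std::vector<generation_type>& compacting) {
    std::vector<generation_type> sorted_all(all.begin(), all.end());
    std::vector<generation_type> sorted_compacting(compacting);
    std::sort(sorted_all.begin(), sorted_all.end());
    std::sort(sorted_compacting.begin(), sorted_compacting.end());

    std::vector<generation_type> result;
    std::set_difference(sorted_all.begin(), sorted_all.end(),
                        sorted_compacting.begin(), sorted_compacting.end(),
                        std::back_inserter(result));
    return result;
}
```

Without the explicit sort, the result could include sstables that are in fact being compacted, which matches the symptom in the commit message: a longer list than needed, and tombstones preserved as a consequence.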