scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 20:16:43 +00:00

Author	SHA1	Message	Date
Paweł Dziepak	1d54327afd	atomic_cell_or_collection: add memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	d0ee750cec	keys: add memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	cfa581b426	utils/managed_vector: add memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	703509a1c7	utils/managed_bytes: add memory_usage() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	a289816b31	streamed_mutation: fix mutation_fragment::consume() return type Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Paweł Dziepak	37bd7230bc	streamed_mutation: add mutation fragment visitor Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-07 12:17:25 +01:00
Glauber Costa	54ce6221a7	allow the dirty memory manager to be used without a database object Some of our tests don't provide a database object to a CF. Create a default dirty memory manager object that can be used without a database for them. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <872f8c9232ff87d788e271b1db86c814d7a75d9f.1467832713.git.glauber@scylladb.com>	2016-07-07 10:00:43 +01:00
Raphael S. Carvalho	0772d20c60	fix compilation in debug mode build/debug/sstables/compaction_strategy.o: In function `date_tiered_manifest::date_tiered_manifest(std::map<basic_sstring<char, unsigned int, 15u>, basic_sstring<char, unsigned int, 15u>, std::less<basic_sstring<char, unsigned int, 15u> >, std::allocator<std::pair<basic_sstring<char, unsigned int, 15u> const, basic_sstring<char, unsigned int, 15u> > > > const&)': /home/centos/scylla/sstables/date_tiered_compaction_strategy.hh:67: undefined reference to `date_tiered_manifest::DEFAULT_BASE_TIME_SECONDS' That's fixed by moving definition of static constexpr outside the class. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20c16ad71f64900aa5591018bc4e976406cfebb3.1467870383.git.raphaelsc@scylladb.com>	2016-07-07 11:52:37 +03:00
Avi Kivity	9a8788019d	row_cache: fix visitor for boost <= 1.55 Older boosts can't return a future from a visitor (likely lacking support for move-only objects). Supply a dirty hackaround. Message-Id: <1467822548-25940-1-git-send-email-avi@scylladb.com>	2016-07-06 19:55:51 +03:00
Avi Kivity	21031d276b	Merge seastar upstream * seastar c82c36f...9267dfa (6): > app_template: Make run() wait for func when reactor exit is triggered externally > core: Introduce futurize_apply() helper > rpc: make unexpected eof messages more informative > Fix boost version check > reactor: more fix for smp poll with older boost > reactor: fix build on older boost due to spsc_queue::read_available()	2016-07-06 18:14:13 +03:00
Avi Kivity	02530faeb2	compaction: fix tombstones not being garbage collected during compaction `2a46410f4a` changed sstable_list from a map to a set, so it is no longer sorted by generation. The code for finding the list of sstables not being compacted relied on this sort order, and now broke, returning a longer list than needed (including some of the sstables being compacted). As a result, the compaction code preserved the tombstones, incorrectly thinking there was still live data they referenced. Fix by sorting the set explicitly. Fixes #1429. Message-Id: <1467793026-6571-1-git-send-email-avi@scylladb.com>	2016-07-06 10:22:31 +02:00
Asias He	0c56bbe793	gossip: Make get_supported_features and wait_for_feature_on{_all}_node private They are used only inside gossiper itself. Also make the helper get_supported_features(std::unordered_map<gms::inet_address, sstring>) static. Message-Id: <f434c145ad9138084708b60c1d959b84360e47b2.1467775291.git.asias@scylladb.com>	2016-07-06 09:54:56 +03:00
Avi Kivity	ab279a4752	Merge "Add support to date tiered compaction strategy" from Raphael "After this patchset, date tiered compaction strategy is supported by Scylla. For those who don't know what it is about, the following article may help: https://labs.spotify.com/2014/12/18/date-tiered-compaction/ It's also nicely explained here by our wiki page: https://github.com/scylladb/scylla/wiki/SSTable-compaction#date-tiered-compaction Basically, date tiered strategy was developed to help the database perform better when facing a time series workload. Date tiered strategy will work to keep data written at nearly the same time together, such that the number of relevant sstables for a time-based query is relatively low. We still lacks support to filter out sstables based on time parameters of a query, but that feature should come ASAP. The following dtests now pass: compaction_test.py:TestCompaction_with_DateTieredCompactionStrategy.compaction_delete_test compaction_test.py:TestCompaction_with_DateTieredCompactionStrategy.compaction_strategy_switching_test Used cassandra-stress with the parameter '-schema compaction\(strategy=DateTieredCompactionStrategy\)' to check stability. Fixes #511."	2016-07-06 09:51:12 +03:00
Avi Kivity	7438c9de5c	Merge "Fix database freeze with load for multiple CFs" from Glauber "Issue 1195 describes a scenario with a fairly easy reproducer in which we can freeze the database. That involves writing simultaneously to multiple CFs, such that the sum of all the memory they are using is larger than the dirty memory limit, without not any of them individually being larger than the memtable size. This patchset rewrites the throttling code, including now active flushes so that this situation cannot happen. Fixes #1195"	2016-07-06 09:48:13 +03:00
Raphael S. Carvalho	b5ec4d46c6	tests: add test for date tiered compaction strategy Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 02:11:47 -03:00
Raphael S. Carvalho	b699ef2de3	compaction: wire up date tiered compaction strategy After this commit, date tiered compaction strategy is supported on Scylla. To understand how it works, take a look at our wiki page: https://github.com/scylladb/scylla/wiki/SSTable-compaction#date-tiered-compaction Fixes #511. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 02:11:47 -03:00
Raphael S. Carvalho	e5cc0cc6c4	compaction: implement date tiered compaction strategy This commit is basically about converting Java to C++. Date tiered compaction strategy isn't wired yet. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 02:11:47 -03:00
Raphael S. Carvalho	cab2892866	tests: add test for sstables::get_fully_expired_sstables Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 02:11:47 -03:00
Raphael S. Carvalho	e9076f39be	compaction: implement function to get fully expired sstables Strongly based on org.apache.cassandra.db.compaction. CompactionController.getFullyExpiredSSTables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 02:11:47 -03:00
Raphael S. Carvalho	69b3860662	tests: add test for leveled_manifest::overlapping Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 02:11:45 -03:00
Raphael S. Carvalho	92848efc42	sstables: make overlapping functions static That's needed for a function that will get overlapping sstables to get fully expired ones. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 01:34:34 -03:00
Raphael S. Carvalho	8d38fa49d4	sstables: move code to get uncompacting sstables to a function Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 01:33:55 -03:00
Raphael S. Carvalho	1118cfc51a	tests: test that sstable max_local_deletion_time is properly updated Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 01:13:34 -03:00
Raphael S. Carvalho	cc6c383249	sstables: properly keep track of max local deletion time We weren't updating max local deletion time for cells that contain ttl, or for tombstone cells. If there is a live cell with no ttl, then max local deletion time is supposed to store maximum value, which means that the sstable will not be fully expired later on. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 01:13:24 -03:00
Raphael S. Carvalho	1ecd9bdefc	sstables: fix type of max_local_deletion_time max_local_deletion_time was incorrectly using an unsigned type instead of a signed one. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 01:13:13 -03:00
Raphael S. Carvalho	f9ab94d266	compaction: import DateTieredCompactionStrategy.java File can be found at the following C* directory: src/java/org/apache/cassandra/db/compaction Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-07-06 01:12:49 -03:00
Glauber Costa	b0932ceb04	database: act on LSA pressure notification Issue 1195 describes a scenario with a fairly easy reproducer in which we can freeze the database. That involves writing simultaneously to multiple CFs, such that the sum of all the memory they are using is larger than the dirty memory limit, without not any of them individually being larger than the memtable size. Because we will never reach the individual memtable seal size for any of them, none of them will initiate a flush leading the database to a halt. The LSA has now gained infrastructure that allow us to be notified when pressure conditions mount. What we will do in this case is initiate a flush ourselves. Fixes #1195 Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-07-05 17:46:28 -04:00
Glauber Costa	7169b727ea	move system tables to its own region In the spirit of what we are doing for the read semaphore, this patch moves system writes to its own dirty memory manager. Not only will it make sure that system tables will not be serialized by its own semaphore, but it will also put system tables in its own region group. Moving system tables to its own region group has the advantage that system requests won't be waiting during throttle behind a potentially big queue of user requests, since requests are tended to in FIFO order within the same region group. However, system tables being more controlled and predictable, we can actually go a step further and give them some extra reservation so they may not necessarily block even if under pressure (up to 10 MB more). Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-07-05 17:46:28 -04:00
Glauber Costa	c358947284	database: wrap semaphore and region group into a new dirty memory manager We currently have a semaphore in the column family level that protects us against multiple concurrent sstable flushes. However, storing that semaphore into the CF, not the database, was a (implementation, not design) mistake. One comment in particular makes it quite clear: // Ideally, we'd allow one memtable flush per shard (or per database object), and write-behind // would take care of the rest. But that still has issues, so we'll limit parallelism to some // number (4), that we will hopefully reduce to 1 when write behind works. So I aimed for the shard, but ended up coding it into the CF because that's closer to the flush point - my bad. This patch fixes this while paving the way for active reclaim to take place. It wraps the semaphore and the region group in a new structure, the dirty_memory_manager. The immediate benefit is that we don't need to be passing both the semaphore and the region group downwards in the DB -> CF path. The long term benefit is that we now have a one unified structure that can hold shared flush data in all of the CFs. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-07-05 15:29:04 -04:00
Glauber Costa	d41fcd45d1	memtables: make memtable inherit from region The LSA memory pressure mechanism will let us know which region is the best candidate for eviction when under pressure. We need to somehow then translate region -> memtable -> column family. The easiest way to convert from region to memtable, is having memtable inherit from region. Despite the fact that this requires multiple inheritance, which always raise a flag a bit, the other class we inherit from is enable_shared_from_this, which has a very simple and well defined interface. So I think it is worthy for us to do it. Once we have the memtable, grabing the column family is easy provided we have a database object. We can grab it from the schema. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-07-05 15:05:29 -04:00
Glauber Costa	0c31f3e626	database: move memtable throttler to the LSA throttler The LSA infrastructure, through the use of its region groups, now have a throttler mechanism built-in. This patch converts the current throttlers so that the LSA throttler is used instead. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-07-05 15:05:19 -04:00
Yoav Kleinberger	0ad940bfc3	tools/scyllatop: fix crash due to mouse events due to an urwid-library technicality, some mouse events like scroll or click would crash scyllatop. This patch fixes this problem. closes issue #1396. Signed-off-by: Yoav Kleinberger <yoav@scylladb.com> Message-Id: <1467294117-19218-1-git-send-email-yoav@scylladb.com>	2016-07-05 19:08:55 +03:00
Avi Kivity	cb59e724ee	Merge "Fix enabling sstable read ahead" from Paweł "This series contains remaining changes necessary to safely enable read ahead of sstables. Basically, it makes sure that input_streams are always properly closed (even in case of exception during read)."	2016-07-05 19:04:19 +03:00
Raphael S. Carvalho	e688fc9550	api: provide estimation of pending compaction Use compaction_strategy::estimated_pending_compaction() to provide user with an estimation of number of compaction for strategy to be fully satisfied. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <39b7d91f2525ca38fb2ce9d8885d0c2e727de7ed.1467667054.git.raphaelsc@scylladb.com>	2016-07-05 19:03:12 +03:00
Raphael S. Carvalho	43926026c3	compaction: introduce compaction strategy method to estimate pending compaction At the moment, it's not possible to know how many compaction are needed for compaction strategy to be satisfied. It's not possible to know exactly the number of pending compaction, but the strategy can provide an estimation. For size tiered, it's based on number of sstables in each bucket. By dividing bucket size by max threshold, we get number of compaction needed to compact that single bucket. For leveled, it's about the number of sstables that exceeds the limit in each level. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <e209e52f6159ee274a8358b69961a7c0ce357f7d.1467667054.git.raphaelsc@scylladb.com>	2016-07-05 19:03:11 +03:00
Avi Kivity	76cc6408cd	Merge "feature check for seed node" from Asias ""This series implemnts feature check for seed node.	2016-07-05 19:01:01 +03:00
Asias He	6f69963ef9	system_keyspace: Simplify load_host_ids implementation - Use plain loop instead of do_for_each - Use row.get_as() instead of row.template get_as() Message-Id: <3e108d3a6258c0caaf569eb9c79532d9789ea411.1467703722.git.asias@scylladb.com>	2016-07-05 09:47:21 +02:00
Asias He	3f31be58b6	system_keyspace: Simplify load_tokens implemntation - Use plain loop instead of do_for_each - Use row.get_as() instead of row.template get_as() Message-Id: <f959ace4f30078695d383c849ed4520169228f97.1467703722.git.asias@scylladb.com>	2016-07-05 09:47:21 +02:00
Asias He	5236e7a379	storage_service: Implement feature check for seed node Checking features for seed node is a bit more complicated than non-seed node, because non-seed node can always talk to at least one seed node, seed node may not. In this patch, we distingush new cluster and existing cluster by checking if the system table is empty. We relax the feature check for new cluster because the feature check is mostly useful when upgrading an existing cluster to prevent old node to join new cluster. When talking to a seed node failed during the check, we fallback to the check using features stored in the system table. This makes restarting a seed node when no other seed node is up possible (no other seed node at all, or other seed node is not up yet). I tested the following scenarios. 1) start a completely new seed node in a new cluster * system table is empty, skip the check. 2) start a cluster, restart one seed node, at least one other seed node is up * system table is not empty, check with shadow round, shadow round will * succeed 3) start a cluster, restart one seed node, no other seed node is up * system table is not empty, check with shadow round, shadow round will * fail, fallback to system table check. 4) start a cluster, shutdown all the nodes, start one seed node with new ip address, seed list in yaml is updated with new ip address * system table is not empty, check with shadow round, shadow round will * fail, fallback to system table check	2016-07-05 10:09:54 +08:00
Asias He	bb80362c3f	gossip: Insert with result.end() in get_supported_features It is faster than result.begin(), suggested by Avi.	2016-07-05 10:09:54 +08:00
Asias He	72cb4a228b	gossip: Add to_feature_set helper To convert a "," split feature string to a feature set.	2016-07-05 10:09:54 +08:00
Asias He	1d6c57fb40	gossip: Reduce timeout in shadow round In `3a36ec33db` (gossip: Wait longer for seed node during boot up), we increased the timeout by the factor of 60, i.e., ring_dealy * 60 = 5 seconds * 60 = 5 minutes. In `57ee9676c2` (storage_service: Fix default ring_delay time), we fixed the default ring_dealy to 30 seconds. Now the timeout is 30 * 60 seconds = 30 minutes, which is too long. Make it 5 minues.	2016-07-05 10:09:54 +08:00
Asias He	88f0bb3a7b	gossip: Add check_knows_remote_features To check if this node knows features in std::unordered_map<inet_address, sstring> peer_features_string	2016-07-05 10:09:54 +08:00
Asias He	2b53c50c15	gossip: Add get_supported_features To get features supported by all the nodes listed in the address/feature map.	2016-07-05 10:09:53 +08:00
Asias He	31df4e5316	system_keyspace: Introduce load_peer_features To get the peer features stored in the system.peers table.	2016-07-05 10:09:53 +08:00
Paweł Dziepak	4acf77d755	sstables: drop unused data_stream_at() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-04 18:17:43 +01:00
Paweł Dziepak	2cdf498bbd	sstables: close input stream in sstable::data_read() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-04 18:17:42 +01:00
Paweł Dziepak	8931b939a1	sstables: use finally() to close input streams Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-04 18:17:42 +01:00
Paweł Dziepak	e6ececce7f	Merge seastar upstream Submodule seastar a47f893..c82c36f: > reactor: fix build error > util: lazy_eval: fix compilation errors related to operator<<()s definitions	2016-07-04 18:14:05 +01:00
Duarte Nunes	41843b32c5	thrift: Correctly mark a CF as dense And store whether the comparator is a composite type in the case of dynamic CFs. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1467307688-11059-1-git-send-email-duarte@scylladb.com>	2016-07-04 17:40:53 +02:00

... 37 38 39 40 41 ...

11716 Commits