scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 17:40:34 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	ee84f310d9	move deletion of sstables generated by interrupted compaction This deletion should be handled by sstables::compact_sstables, which is responsible for creating new sstables. It also simplifies the code. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <541206be2e910ab4edb1500b098eb5ebf29c6509.1454110234.git.raphaelsc@scylladb.com>	2016-01-31 12:39:20 +02:00
Glauber Costa	7214649b8a	sstables: const where const is due Some SSTable methods are not marked as const. But they should be. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <72cd3ef0157eb38e7fd48d0c989f2342cbc42f3c.1454103008.git.glauber@scylladb.com>	2016-01-31 12:36:36 +02:00
Raphael S. Carvalho	ba4260ea8f	api: print proper compaction type There are several compaction types, and we should print the correct one when listing ongoing compaction. Currently, we only support compaction types: COMPACTION and CLEANUP. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <c96b1508a8216bf5405b1a0b0f8489d5cc4be844.1453851299.git.raphaelsc@scylladb.com>	2016-01-28 13:47:00 +02:00
Raphael S. Carvalho	45c446d6eb	compaction: pass dht::token by reference Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-27 13:25:41 -02:00
Raphael S. Carvalho	fc541e2f08	compaction: remove code to sort local ranges storage_service::get_local_ranges returns sorted ranges, which are not overlapping nor wrap-around. As a result, there is no need for the consumer to do anything. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-27 13:15:36 -02:00
Glauber Costa	3f94070d4e	use auto&& instead of auto& for priority classes. By Avi's request, who reminds us that auto& is more suited for situations in which we are assigning to the variable in question. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <87c76520f4df8b8c152e60cac3b5fba5034f0b50.1453820373.git.glauber@scylladb.com>	2016-01-26 17:00:20 +02:00
Glauber Costa	b63611e148	mark I/O operations with priority classes After this patch, our I/O operations will be tagged into a specific priority class. The available classes are 5, and were defined in the previous patch: 1) memtable flush 2) commitlog writes 3) streaming mutation 4) SSTable compaction 5) CQL query Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Glauber Costa	8e4bf025ae	sstables: wire priority for read path All the SSTable read path can now take an io_priority. The public functions will take a default parameter which is Seastar's default priority. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Glauber Costa	56c11a8109	sstables: wire priority for write path All variants of write_component now take an io_priority. The public interfaces are by default set to Seastar's default priority. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Glauber Costa	03d5a89b90	sstables: mandate a buffer size parameter for data_stream_at The only user for the default size is data_read, sitting at row.cc. That reader wants to read and process a chunk all at once. So there's really no reason to use the default buffer size - except that this code is old. We should do as we do in other single-key / single-range readers and try to read all at once if possible, by looking at the size we received as a parameter. Cleaning up the data_stream_at interface then comes as a nice side effect. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Pekka Enberg	81996bd10b	Merge "Improvements to compaction manager" from Raphael	2016-01-21 20:54:49 +02:00
Raphael S. Carvalho	bb909798bc	compaction_manager: introduce can_submit Purpose is to reuse code and also make it easier to read. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-21 15:42:23 -02:00
Raphael S. Carvalho	653a07d75d	compaction_manager: introduce signal_less_busy_task Purpose is to reuse code. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-21 15:31:44 -02:00
Raphael S. Carvalho	2164aa8d5b	move compaction manager from /utils to /sstables Compaction manager was initially created at utils because it was more generic, and wasn't only intended for compaction. It was more like a task handler based on futures, but now it's only intended to manage compaction tasks, and thus should be moved elsewhere. /sstables is where compaction code is located. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-21 15:23:05 -02:00
Pekka Enberg	b5833e8002	Merge "Enable incremental backups option" from Vlad "This series moves the "backup" logic into the sstable::write_components() methods, adds a support for enabling backup for sstables flushed in the compaction flow (in addition to a regular flushing flow which had this support already) and enables the "incremental_backups" configuration option." I fixed up a merge conflict with commit `5e953b5` ("Merge "Add support to stop ongoing compaction" from Raphael").	2016-01-21 18:52:07 +02:00
Pekka Enberg	5e953b5e47	Merge "Add support to stop ongoing compaction" from Raphael "stop compaction is about temporarily interrupting all ongoing compaction of a given type. That will also be needed for 'nodetool stop <compaction_type>'. The test was about starting scylla, stressing it, stopping compaction using the API and checking that scylla was able to recover. Scylla will print a message as follows for each compaction that was stopped: ERROR [shard 0] compaction_manager - compaction failed: read exception: std::runtime_error (Compaction for keyspace1/standard1 was deliberately stopped.) INFO [shard 0] compaction_manager - compaction task handler sleeping for 20 seconds"	2016-01-21 18:34:10 +02:00
Vlad Zolotarov	c2ab54e9c7	sstables flushing: enable incremental backup (if requested) Enable incremental backup when sstables are flushed if incremental backup has been requested. It has been enabled in the regular flushing flow before but wasn't in the compaction flow. This patch enables it in both places and does it using a backup capability of sstable::write_components() method(s). Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-01-21 12:13:20 +02:00
Vlad Zolotarov	cb5c66f264	sstable::write_components(): add a 'backup' parameter When 'backup' parameter is TRUE - create backup hard links for a newly written sstables in <sstable dir>/backups/ subdirectory. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-01-21 12:04:45 +02:00
Raphael S. Carvalho	f001bb0f53	sstables: fix make_checksummed_file_output_stream Arguments buffer_size and true were accidentally inverted. GCC wasn't complaining because implicit conversion of bool to int, and vice-versa, is valid. However, this conversion is not very safe because we could accidentally invert parameters. This should fix the last problem with sstable_test. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <9478cd266006fdf8a7bd806f1c612ec9d1297c1f.1453301866.git.raphaelsc@scylladb.com>	2016-01-20 16:01:38 +01:00
Paweł Dziepak	33892943d9	sstables: do not drop row marker when reading mutation Since `581271a243` "sstables: ignore data belonging to dropped columns" we silently drop cells if there is no column in the current schema that they belong to or their timestamp is older than the column dropped_at value. Originally this check was applied to row markers as well which caused them to be always dropped since there is no column in the schema representing these markers. This patch makes sure that the check whether the column is alive is performed only if the cell is not a row marker. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1453289300-28607-1-git-send-email-pdziepak@scylladb.com>	2016-01-20 12:35:41 +01:00
Raphael S. Carvalho	c318f3baa3	sstables: fix sstable::data_stream_at After `63967db8`, offset is ignored when creating a input stream. Found the problem after sstable_test failed recently. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <56ece21ff6e043e224eb2a6e76cdd422b94821b0.1453232689.git.raphaelsc@scylladb.com>	2016-01-20 09:35:57 +02:00
Raphael S. Carvalho	3bd240d9e8	compaction: add ability to stop an ongoing compaction That's needed for nodetool stop, which is called to stop all ongoing compaction. The implementation is about informing an ongoing compaction that it was asked to stop, so the compaction itself will trigger an exception. Compaction manager will catch this exception and re-schedule the compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-19 23:15:18 -02:00
Raphael S. Carvalho	ec4c73d451	compaction: rename compaction_stats to compaction_info compaction_info makes more sense because this structure doesn't only store stats about ongoing compaction. Soon, we will add information to it about whether or not a user asked to stop the respective ongoing compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-19 23:15:18 -02:00
Glauber Costa	63967db8bf	sstables: always use a file_*_stream_options in our readers and writes Instead of using the APIs that explicitly pass things like buffer_size, always use the options instance instead. This will make it easier to pass extra options in the future. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <5b04e60ab469c319a17a522694e5bedf806702fe.1453219530.git.glauber@scylladb.com>	2016-01-19 18:26:37 +02:00
Glauber Costa	c3ac5257b5	sstables: don't repeat file_writer creation all the time When this code was originally written, we used to operate on a generic output_stream. We created a file output stream, and then moved it into the generic object. Many patches and reworks later, we now have a file_writer object, but that pattern was never reworked. So in a couple of places we have something like this: f = file_object acquired by open_file_dma auto out = file_writer(std::move(f), 4096); auto w = make_shared<file_writer>(std::move(out)); The last statement is just totally redundant. make_shared can create an object from its parameters without trouble, so we can just pass the parameter list directly to it. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <c01801a1fdf37f8ea9a3e5c52cd424e35ba0a80d.1453219530.git.glauber@scylladb.com>	2016-01-19 18:26:36 +02:00
Raphael S. Carvalho	0c67b1d22b	compaction: filter out mutation that doesn't belong to shard When compacting sstable, mutation that doesn't belong to current shard should be filtered out. Otherwise, mutation would be duplicated in all shards that share the sstable being compacted. sstable_test will now run with -c1 because arbitrary keys are chosen for sstables to be compacted, so test could fail because of mutations being filtered out. fixes #527. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <1acc2e8b9c66fb9c0c601b05e3ae4353e514ead5.1453140657.git.raphaelsc@scylladb.com>	2016-01-19 10:16:41 +01:00
Pekka Enberg	7d3a3bd201	Merge "column family cleanup support" from Raphael "This patch is intended to add support to column family cleanup, which will make 'nodetool cleanup' possible. Why is this feature needed? Remove irrelevant data from a node that loses part of its token range to a newly added node."	2016-01-18 10:15:05 +02:00
Paweł Dziepak	cfc0a132a9	sstable: handle multi-cell vs atomic incompatibilities Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-15 13:12:40 +01:00
Paweł Dziepak	581271a243	sstables: ignore data belonging to dropped columns Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-15 13:12:40 +01:00
Tomasz Grabiec	ccd609185f	sstables: Add ability to wait for async sstable cleanup tasks This patch adds a function which waits for the background cleanup work which is started from sstable destructors. We wait for those cleanups on reactor exit so that unit tests don't leak. This fixes erratic ASAN complaint about memory leak when running schema_change_test in debug mode: Indirect leak of 64 byte(s) in 1 object(s) allocated from: 0x7fab24413912 in operator new(unsigned long) (/lib64/libasan.so.2+0x99912) 0x1776aeb in make_unique<continuation<future<T>::then_wrapped(Func&&) [with Func = future<T>::handle_exception(Func&&) [with Func = sstables::sstable::~sstable()::<lambda(auto:52)>; T = {}]::<lambda(auto:5&&)>; Result = future<>; T = {}]::<lambda(auto:2&&)> >, future<T>::then_wrapped(Func&&) [with Func = future<T>::handle_exception(Func&&) [with Func = sstables::sstable::~sstable()::<lambda(auto:52)>; T = {}]::<lambda(auto:5&&)>; Result = future<>; T = {}]::<lambda(auto:2&&)> > /usr/include/c++/5.1.1/bits/unique_ptr.h:765 0x1752b69 in schedule<future<T>::then_wrapped(Func&&) [with Func = future<T>::handle_exception(Func&&) [with Func = sstables::sstable::~sstable()::<lambda(auto:52)>; T = {}]::<lambda(auto:5&&)>; Result = future<>; T = {}]::<lambda(auto:2&&)> > /home/tgrabiec/src/scylla2/seastar/core/future.hh:513 0x1711365 in schedule<future<T>::then_wrapped(Func&&) [with Func = future<T>::handle_exception(Func&&) [with Func = sstables::sstable::~sstable()::<lambda(auto:52)>; T = {}]::<lambda(auto:5&&)>; Result = future<>; T = {}]::<lambda(auto:2&&)> > /home/tgrabiec/src/scylla2/seastar/core/future.hh:690 0x16d0474 in then_wrapped<future<T>::handle_exception(Func&&) [with Func = sstables::sstable::~sstable()::<lambda(auto:52)>; T = {}]::<lambda(auto:5&&)>, future<> > /home/tgrabiec/src/scylla2/seastar/core/future.hh:880 0x1696e9c in handle_exception<sstables::sstable::~sstable()::<lambda(auto:52)> > /home/tgrabiec/src/scylla2/seastar/core/future.hh:1012 0x1638ba8 in sstables::sstable::~sstable() sstables/sstables.cc:1619 The leak is about allocations related to close() syscall tasks invoked from sstable destructor, which were not waited for. Message-Id: <1452783887-25244-1-git-send-email-tgrabiec@scylladb.com>	2016-01-15 11:32:15 +02:00
Raphael S. Carvalho	d44a5d1e94	compaction: filter out compacting sstables The implementation is about storing generation of compacting sstables in an unordered set per column family, so before strategy is called, compaction manager will filter out compacting sstables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-12 01:18:29 -02:00
Raphael S. Carvalho	9c13c1c738	compaction: move compaction execution from strategy to manager Currently, compaction strategy is responsible for both getting the sstables selected for compaction and running compaction. Moving the code that runs compaction from strategy to manager is a big improvement, which will also make it possible for the compaction manager to keep track of which sstables are being compacted at any moment. This change will also be needed for cleanup and concurrent compaction on the same column family. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-12 00:04:27 -02:00
Raphael S. Carvalho	ed80ed82ef	sstables: prepare compact_sstables to work with cleanup Cleanup is about rewriting a sstable discarding any keys that are irrelevant, i.e. keys that don't belong to current node. Parameter cleanup was added to compact_sstables. If set to true, irrelevant code such as the one that updates compaction history will be skipped. Logic was also added to discard irrelevant keys. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-11 21:43:40 -02:00
Tomasz Grabiec	4e5a52d6fa	db: Make read interface schema version aware The intent is to make data returned by queries always conform to a single schema version, which is requested by the client. For CQL queries, for example, we want to use the same schema which was used to compile the query. The other node expects to receive data conforming to the requested schema. Interface on shard level accepts schema_ptr, across nodes we use table_schema_version UUID. To transfer schema_ptr across shards, we use global_schema_ptr. Because schema is identified with UUID across nodes, requestors must be prepared for being queried for the definition of the schema. They must hold a live schema_ptr around the request. This guarantees that schema_registry will always know about the requested version. This is not an issue because for queries the requestor needs to hold on to the schema anyway to be able to interpret the results. But care must be taken to always use the same schema version for making the request and parsing the results. Schema requesting across nodes is currently stubbed (throws runtime exception).	2016-01-11 10:34:52 +01:00
Tomasz Grabiec	5184381a0b	memtable: Deconstify memtable in readers We want to upgrade entries on read and for that we need mutating permission.	2016-01-11 10:34:51 +01:00
Avi Kivity	0c755d2c94	db: reduce log spam when ignoring an sstable With 10 sstables/shard and 50 shards, we get ~10*50*50 messages = 25,000 log messages about sstables being ignored. This is not reasonable. Reduce the log level to debug, and move the message to database.cc, because at its original location, the containing function has nothing to do with the message itself. Reviewed-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Message-Id: <1452181687-7665-1-git-send-email-avi@scylladb.com>	2016-01-07 19:23:25 +02:00
Glauber Costa	74fbd8fac0	do not call open_file_dma directly We have an API that wraps open_file_dma which we use in some places, but in many other places we call the reactor version directly. This patch changes the latter to match the former. It will have the added benefit of allowing us to make easier changes to these interfaces if needed. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <29296e4ec6f5e84361992028fe3f27adc569f139.1451950408.git.glauber@scylladb.com>	2016-01-05 10:37:57 +02:00
Avi Kivity	e9400dfa96	Revert "sstable: Initialize super class of malformed_sstable_exception" This reverts commit d69dc32c92d63057edf9f84aa57ca53b2a6e37e4; it does nothing and does not address issue #669.	2016-01-05 10:21:00 +02:00
Benoît Canet	d69dc32c92	sstable: Initialize super class of malformed_sstable_exception This exception was not caught properly as a std::exception by report_failed_future call to report_exception because the superclass std::exception was not initialized. Fixes #669. Signed-off-by: Benoît Canet <benoit@scylladb.com>	2016-01-05 09:54:36 +02:00
Raphael S. Carvalho	b7d36af26f	compaction: fix max_purgeable calculation max_purgeable was being incorrectly calculated because the code that creates vector of uncompacted sstables was wrong. This value is used to determine whether or not a tombstone can be purged. Operand < is supposed to be used instead in the callback passed as third parameter to boost::set_difference. This fix is a step towards closing the issue #676. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-29 09:30:08 +02:00
Vlad Zolotarov	33552829b2	core: use steady_clock where monotonic clock is required Use steady_clock instead of high_resolution_clock where monotonic clock is required. high_resolution_clock is essentially a system_clock (Wall Clock) and therefore may not be assumed monotonic, since Wall Clock may move backwards due to time/date adjustments. Fixes issue #638 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2015-12-27 18:07:53 +02:00
Tomasz Grabiec	0862d2f531	Merge branch 'pdziepak/fix-sstables-key_reader-663/v2' From Paweł: "This series fixes sstables::key_reader not respecting range inclusiveness if the bounds were the keys that were present in the index summary. Fixes #663."	2015-12-18 17:35:09 +01:00
Paweł Dziepak	18b8d7cccc	sstables: respect range inclusiveness in key_reader When choosing a relevant range of buckets it wasn't taken into account whether the range bounds are inclusive or not. That may have resulted in more buckets being read than necessary, which was a condition not expected by the code responsible for looking for relevant keys inside the buckets. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-12-18 17:24:26 +01:00
Pekka Enberg	eeadf601e6	Merge "cleanups and improvements" from Raphael	2015-12-18 13:45:11 +02:00
Pekka Enberg	40e8a9c99c	sstables/compaction: Fix compilation error with GCC 4.9.2 I am sure it's a compiler issue but I am not ready to give up and upgrade just yet: sstables/compaction.cc:307:55: error: converting to ‘std::unordered_map<int, long int>’ from initializer list would use explicit constructor ‘std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>::unordered_map(std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>::size_type, const hasher&, const key_equal&, const allocator_type&) [with _Key = int; _Tp = long int; _Hash = std::hash<int>; _Pred = std::equal_to<int>; _Alloc = std::allocator<std::pair<const int, long int> >; std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>::size_type = long unsigned int; std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>::hasher = std::hash<int>; std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>::key_equal = std::equal_to<int>; std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>::allocator_type = std::allocator<std::pair<const int, long int> >]’ stats->start_size, stats->end_size, {});	2015-12-16 10:03:14 +02:00
Raphael S. Carvalho	193ede68f3	compaction: register and deregister compaction_stats That's important for compaction stats API that will need stats data of each ongoing compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-15 09:50:32 -02:00
Raphael S. Carvalho	1fba394dd0	sstables: store keyspace and cf in compaction_stats The reason behind this change is that we will need ks and cf for the compaction stats API. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-15 09:50:02 -02:00
Raphael S. Carvalho	ac1a67c8bc	sstables: move compaction_stats to header file Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-15 09:49:45 -02:00
Raphael S. Carvalho	a2fb0ec9a3	sstables: update compaction history at the end of compaction When compaction job finishes, call function to update the system table COMPACTION_HISTORY. That's also needed for the compaction history API. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-14 14:20:03 -02:00
Raphael S. Carvalho	0fa194c844	sstables: remove outdated comment Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-14 12:43:53 -02:00

521 Commits