table::flush_streaming_mutations dates back to the days when streamed
data went to memtables. After switching to the new streaming, data goes
directly to sstables during streaming, so the sstables generated in
table::flush_streaming_mutations will be empty.
It is unnecessary to invalidate the cache if no sstables are added. To
avoid unnecessary cache invalidation, which pokes holes in the cache,
skip calling _cache.invalidate() if the sstable list is empty.
The steps are:
- STREAM_MUTATION_DONE verb is sent when streaming is done with old or
new streaming
- table::flush_streaming_mutations is called in the verb handler
- cache is invalidated for the streaming ranges
In summary, this patch will avoid a lot of cache invalidation for
streaming.
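The empty-flush guard can be sketched as below; cache_model and finish_streaming_flush are illustrative names, not Scylla's actual types:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal model of the fix: invalidation is skipped entirely when the
// flush produced no sstables.
struct cache_model {
    int invalidations = 0;
    void invalidate() { ++invalidations; }   // pokes holes in the cache
};

// Returns true if the cache was invalidated.
bool finish_streaming_flush(cache_model& cache,
                            const std::vector<std::string>& new_sstables) {
    if (new_sstables.empty()) {
        return false;        // nothing was added: keep the cache intact
    }
    cache.invalidate();      // only pay the cost when data actually landed
    return true;
}
```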
Backports: 3.0 3.1 3.2
Fixes: #5769
(cherry picked from commit 5e9925b9f0)
When creating a snapshot we need to add a schema.cql file in the
snapshot directory that describes the table in that snapshot.
This patch adds the file using the schema describe method.
get_snapshot_details and manifest_json_filter were modified to ignore
the schema.cql file.
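A sketch of the filtering change, assuming a simple name-based predicate (the real manifest_json_filter operates on directory entries):

```cpp
#include <cassert>
#include <string>

// Hypothetical predicate: snapshot details accounting should skip both
// the manifest and the newly added schema.cql file.
bool include_in_snapshot_details(const std::string& name) {
    return name != "manifest.json" && name != "schema.cql";
}
```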
Fixes #4192
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Hold the _sstable_deletion_sem while moving sstables from the staging directory
so as not to move them under the feet of table::snapshot.
Fixes #5340
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
distributed_loader::probe_file needlessly creates a seastar
thread for it, and the next patch will use it as part of
a parallel_for_each loop to move a list of sstables
(and sync the directories once at the end).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This commit makes sure that single-partition readers for
read-before-write do not have fast-forwarding enabled,
as it may lead to huge read amplification. The observed case was:
1. Creating an index.
CREATE INDEX index1 ON myks2.standard1 ("C1");
2. Running cassandra-stress in order to generate view updates.
cassandra-stress write no-warmup n=1000000 cl=ONE -schema \
'replication(factor=2) compaction(strategy=LeveledCompactionStrategy)' \
keyspace=myks2 -pop seq=4000000..8000000 -rate threads=100 -errors
skip-read-validation -node 127.0.0.1;
Without disabling fast-forwarding, single-partition readers
were turned into scanning readers in cache, which resulted
in reading 36GB (sic!) on a workload which generates less
than 1GB of view updates. After applying the fix, the number
dropped down to less than 1GB, as expected.
Refs #5409
Fixes #4615
Fixes #5418
"
In several cases in distributed testing (dtest) we trigger compaction using nodetool compact assuming that when it is done, it is indeed really done.
However, the way compaction is currently implemented in scylla, it may leave behind some background tasks to delete the old sstables that were compacted.
This commit changes major compaction (triggered via the ss::force_keyspace_compaction api) so that it waits on the background deletes and returns only when they finish.
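The wait can be modeled as draining a queue of deferred deletion tasks before returning (illustrative types, not the actual compaction_manager API):

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Illustrative model: major compaction enqueues old-sstable deletions as
// background tasks and, with this change, drains them before returning.
struct compaction_model {
    std::deque<std::function<void()>> background_deletes;
    int deleted = 0;

    void run_major_compaction() {
        // ...compact..., then schedule deletion of the old sstables
        background_deletes.push_back([this] { ++deleted; });
        // The fix: wait for the background deletes instead of returning early.
        while (!background_deletes.empty()) {
            background_deletes.front()();
            background_deletes.pop_front();
        }
    }
};
```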
Fixes #4909
Tests: unit(dev), nodetool_refresh_with_data_perms_test, test_nodetool_snapshot_during_major_compaction
"
By default, semaphore exceptions bring along very little context:
either that a semaphore was broken or that it timed out.
In order to make debugging easier without introducing significant
runtime costs, a notion of named semaphore is added.
A named semaphore is simply a semaphore with statically defined
name, which is present in its errors, bringing valuable context.
A semaphore defined as:
auto sem = semaphore(0);
will present the following message when it breaks:
"Semaphore broken"
However, a named semaphore:
auto named_sem = named_semaphore(0, named_semaphore_exception_factory{"io_concurrency_sem"});
will present a message with at least some debugging context:
"Semaphore broken: io_concurrency_sem"
It's not much, but it would really help in pinpointing bugs
without having to inspect core dumps.
At the same time, it does not incur any costs for normal
semaphore operations (except for its creation), but instead
only uses more CPU in case an error is actually thrown,
which is considered rare and not to be on the hot path.
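The factory idea boils down to storing a static name at construction and paying for string formatting only on the error path; a rough sketch (the real seastar named_semaphore_exception_factory differs in detail):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Rough sketch of the exception factory: the name is stored once at
// construction; the concatenation happens only when an error is actually
// materialized, keeping the hot path free of extra cost.
struct named_exception_factory {
    std::string name;

    std::runtime_error broken() const {
        return std::runtime_error("Semaphore broken: " + name);
    }
    std::runtime_error timeout() const {
        return std::runtime_error("Semaphore timed out: " + name);
    }
};
```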
Refs #4999
Tests: unit(dev), manual: hardcoding a failure in view building code
CQL tracing would only report file I/O involving one sstable, even if
multiple sstables were read from during the query.
Steps to reproduce:
create a table with NullCompactionStrategy
insert row, flush memtables
insert row, flush memtables
restart Scylla
tracing on
select * from table
The trace would only report DMA reads from one of the two sstables.
Kudos to @denesb for catching this.
Related issue: #4908
Adds per-table metrics for counting partition and row reuse
in memtables. New metrics are as follows:
- memtable_partition_writes - number of write operations performed
on partitions in memtables,
- memtable_partition_hits - number of write operations performed
on partitions that previously existed in a memtable,
- memtable_row_writes - number of row write operations performed
in memtables,
- memtable_row_hits - number of row write operations that overwrote
rows previously present in a memtable.
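The accounting rule for the new counters can be sketched as below (a toy model; in reality the metrics hang off the memtable's partition and row entries):

```cpp
#include <cassert>
#include <set>
#include <string>

// Illustrative accounting: every partition write bumps partition_writes,
// and it also bumps partition_hits when the partition already existed in
// the memtable (the same idea applies to rows).
struct memtable_stats {
    long partition_writes = 0;
    long partition_hits = 0;
};

struct memtable_model {
    std::set<std::string> partitions;
    memtable_stats stats;

    void write_partition(const std::string& key) {
        ++stats.partition_writes;
        if (!partitions.insert(key).second) {
            ++stats.partition_hits;   // partition was already present: reuse
        }
    }
};
```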
Tests: unit(release)
This patch adds the following per table stats:
cas_prepare_latency
cas_propose_latency
cas_commit_latency
They are equivalent to the CasPrepare, CasPropose, CasCommit metrics exposed
by Cassandra.
We don't want to add shared sstables to the table's backlog tracker because:
1) the table's backlog tracker only influences regular compaction
2) shared sstables are never compacted by regular compaction; they're
handled by resharding, which has its own backlog tracker.
Such sstables belong to more than one shard, meaning that currently
they're added to the backlog tracker of all shards that own them.
But the thing is that such sstables end up being resharded on a shard
that may be completely random. So increasing the backlog of all shards
such sstables belong to won't lead to faster resharding. Also, the table's
backlog tracker is supposed to deal only with regular compaction.
Accounting for shared sstables in the table's tracker may lead to an
incorrect speed-up of regular compaction because the controller is not
aware that some relevant part of the backlog is due to pending resharding.
The fix is to ignore sstables that will be resharded, letting the
table's backlog tracker account only for sstables that can be worked on
by regular compaction, and to rely on resharding controlling itself
with its own tracker.
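The resulting selection rule, as a minimal sketch (sstable_info and the free function are illustrative; the real code adds and removes sstables from the tracker incrementally):

```cpp
#include <assert.h>
#include <vector>

// Hypothetical filter mirroring the fix: only sstables fully owned by
// this shard feed the table's (regular-compaction) backlog tracker;
// shared ones are left to resharding's own tracker.
struct sstable_info {
    long bytes;
    bool shared;   // owned by more than one shard => will be resharded
};

long regular_compaction_backlog(const std::vector<sstable_info>& ssts) {
    long backlog = 0;
    for (const auto& s : ssts) {
        if (!s.shared) {
            backlog += s.bytes;   // exclusive sstable: regular compaction's job
        }
    }
    return backlog;
}
```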
NOTE: this doesn't fix the resharding controlling issue completely,
as described in #4952. We'll still need to throttle regular compaction
on behalf of resharding. So subsequent work may be about:
- move resharding to its own priority class, perhaps streaming.
- make resharding's backlog tracker account for sstables in all of
its pending jobs, not only the ongoing ones (currently limited to 1 per shard).
- limit compaction shares when resharding is in progress.
This only fixes the issue in which the controller for regular compaction
shouldn't account for sstables completely exclusive to resharding.
Fixes #5077.
Refs #4952.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20190924022109.17400-1-raphaelsc@scylladb.com>
Affects single-partition reads only.
Refs #5113
When executing a query on the replica we do several things in order to
narrow down the sstable set we read from.
For tables which use LeveledCompactionStrategy, we store sstables in
an interval set and we select only sstables whose partition ranges
overlap with the queried range. Other compaction strategies don't
organize the sstables and will select all sstables at this stage. The
reasoning behind this is that for non-LCS compaction strategies the
sstables' ranges will typically overlap and using interval sets in
this case would not be effective and would result in quadratic (in
sstable count) memory consumption.
The assumption for overlap does not hold if the sstables come from
repair or streaming, which generates non-overlapping sstables.
At a later stage, for single-partition queries, we use the sstables'
bloom filter (kept in memory) to drop sstables which surely don't
contain given partition. Then we proceed to sstable indexes to narrow
down the data file range.
Tables which don't use LCS will do unnecessary I/O to read index pages
for single-partition reads if the partition is outside of the
sstable's range and the bloom filter is ineffective (Refs #5112).
This patch fixes the problem by consulting sstable's partition range
in addition to the bloom filter, so that the non-overlapping sstables
will be filtered out with certainty and not depend on bloom filter's
efficiency.
It's also faster to drop sstables based on the keys than the bloom
filter.
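The combined check can be sketched as below; sstable_span and may_contain are illustrative stand-ins for the sstable's metadata and filter lookup:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Sketch of the read path after the fix: a single-partition read
// consults the sstable's first/last partition keys *and* its bloom
// filter, so a non-overlapping sstable is dropped with certainty even
// when the bloom filter would have answered "maybe".
struct sstable_span {
    std::string first_key, last_key;                 // inclusive key range
    std::function<bool(const std::string&)> bloom;   // may return false positives
};

bool may_contain(const sstable_span& sst, const std::string& key) {
    if (key < sst.first_key || key > sst.last_key) {
        return false;              // certain miss, regardless of bloom filter
    }
    return sst.bloom(key);         // range overlaps: fall back to the filter
}
```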
Tests:
- unit (dev)
- manual using cqlsh
Reviewed-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20190927122505.21932-1-tgrabiec@scylladb.com>
This patch silences the remaining discarded future warnings, those
where it cannot be determined with reasonable confidence that this was
indeed the actual intent of the author, or that the discarding of the
future could lead to problems. For all those places a FIXME is added,
with the intent that these will be soon followed-up with an actual fix.
I deliberately haven't fixed any of these, even if the fix seems
trivial. It is too easy to overlook a bad fix mixed in with so many
mechanical changes.
This patch silences those future discard warnings where it is clear that
discarding the future was actually the intent of the original author,
*and* they did the necessary precautions (handling errors). The patch
also adds some trivial error handling (logging the error) in some
places, which were lacking this, but otherwise look ok. No functional
changes.
"
Fixes #4663.
Fixes #4718.
"
* 'make_cleanup_run_aware_v3' of https://github.com/raphaelsc/scylla:
tests/sstable_datafile_test: Check cleaned sstable is generated with expected run id
table: Make SSTable cleanup run aware
compaction: introduce constants for compaction descriptor
compaction: Make it possible to config the identifier of the output sstable run
table: do not rely on undefined behavior in cleanup_sstables
streaming_reader_lifecycle_policy::create_reader() was ignoring the
partition_slice passed to it and always creating the reader for the
full slice.
That's wrong because create_reader() is called when recreating a
reader after it's evicted. If the reader stopped in the middle of a
partition we need to start from that point. Otherwise, fragments in
the mutation stream will appear duplicated or out of order, violating
assumptions of the consumers.
This was observed to result in repair writing incorrect sstables with
duplicated clustering rows, which results in
malformed_sstable_exception on read from those sstables.
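A minimal model of the bug and fix (partition_slice here is a stand-in carrying just the resume position, not the real reader machinery):

```cpp
#include <cassert>
#include <string>

// Toy stand-in for the slice passed to create_reader(): it records
// where the evicted reader should resume.
struct partition_slice {
    std::string resume_from;   // e.g. last clustering position emitted
};

struct reader {
    partition_slice slice;
};

reader create_reader(const partition_slice& requested) {
    // The fix: honor `requested` instead of a default-constructed full
    // slice, so the recreated reader resumes where the old one stopped.
    return reader{requested};
}
```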
Fixes #4659.
In v2:
- Added an overload without partition_slice to avoid changing existing users which never slice
Tests:
- unit (dev)
- manual (3 node ccm + repair)
Backport: 3.1
Reviewed-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <1563451506-8871-1-git-send-email-tgrabiec@scylladb.com>
The cleanup procedure will move any sstable out of its sstable run
because sstables are cleaned up individually and each receives
a new run identifier, meaning a table may potentially end up with a
new sstable run for each of the sstables cleaned.
SSTable cleanup needs to be run aware, so that the run structure is
not messed up after the operation is done. Given that only one fragment
or another composing an sstable run may need cleanup, it's better to keep
them in their original sstable run.
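The run-aware behavior amounts to propagating the input's run identifier to the cleanup output instead of minting a new one; a toy sketch with hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model: each sstable carries a run id, and cleanup now emits its
// output under the *input's* run id rather than a fresh one, so the
// number of runs does not explode.
struct sstable_meta {
    std::string name;
    std::string run_id;
};

std::vector<sstable_meta> cleanup_sstables(const std::vector<sstable_meta>& in) {
    std::vector<sstable_meta> out;
    for (const auto& s : in) {
        out.push_back({s.name + "-cleaned", s.run_id});  // inherit run id
    }
    return out;
}

std::size_t distinct_runs(const std::vector<sstable_meta>& ssts) {
    std::set<std::string> runs;
    for (const auto& s : ssts) {
        runs.insert(s.run_id);
    }
    return runs.size();
}
```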
Fixes #4663.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
`disable_sstable_write` needs to acquire `_sstable_deletion_sem`
to properly synchronize with background deletions done by
`on_compaction_completion` to ensure no sstables will be created
or deleted during `reshuffle_sstables` after
`storage_service::load_new_sstables` disables sstable writes.
Fixes #4622
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
All `table` instances currently unconditionally allocate a cell locker
for counter cells, though not all need one. Since the lockers occupy
quite a bit of memory (as reported in #4441), it's wasteful to
allocate them when unneeded.
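The fix boils down to allocating the locker conditionally at table construction; a sketch with illustrative types:

```cpp
#include <cassert>
#include <memory>

// Sketch of the fix: allocate the (large) counter cell locker only for
// tables that actually hold counters. cell_locker and table_model are
// illustrative stand-ins.
struct cell_locker { /* large per-table locking structure */ };

struct table_model {
    bool has_counters;
    std::unique_ptr<cell_locker> counter_locker;

    explicit table_model(bool counters) : has_counters(counters) {
        if (has_counters) {
            counter_locker = std::make_unique<cell_locker>();
        }
        // non-counter tables never pay for the locker
    }
};
```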
Fixes #4441.
Tests: unit (dev, debug)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Message-Id: <20190515190910.87931-1-dejan@scylladb.com>
We use the schema during creation of the mutation_source rather than
during the query itself. Likely they're the same, and since no rows
are returned from a zero-sstable query, harmless. But gcc 9 complains.
Fix by using the query's schema.
To prepare for a seastar change that adds an optional file_permissions
parameter to touch_directory and recursive_touch_directory.
This change messes up the call to io_check since the compiler can't
deduce the Func&& argument. Therefore, use a lambda function instead
to wrap the call to {recursive_,}touch_directory.
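A compilable illustration of the deduction problem and the lambda workaround (touch_directory and io_check below are simplified stand-ins for the seastar/scylla originals):

```cpp
#include <cassert>
#include <string>
#include <utility>

int touched = 0;

// Stand-in for the seastar call after the change: it gains an optional
// defaulted permissions parameter.
void touch_directory(const std::string&, int permissions = 0755) {
    (void)permissions;
    ++touched;
}

// Stand-in for io_check: perfectly forwards and invokes a callable.
template <typename Func>
auto io_check(Func&& func) {
    return std::forward<Func>(func)();
}

void make_dir(const std::string& name) {
    // Passing touch_directory directly would require the compiler to pick
    // an overload/arity for Func&&; wrapping the call in a lambda pins the
    // call site instead.
    io_check([&] { return touch_directory(name); });
}
```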
Ref #4395
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190421085502.24729-1-bhalevy@scylladb.com>
Truncate would make disk usage stat go wild because it isn't updated
when sstables are removed in table::discard_sstables(). Let's update
the stat after sstables are removed from the sstable set.
Fixes #3624.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20190328154918.25404-1-raphaelsc@scylladb.com>
"
This series introduces a rudimentary sstables manager
that will be used for making and deleting sstables, and for tracking
thereof.
The motivation for having a sstables manager is detailed in
https://github.com/scylladb/scylla/issues/4149.
The gist of it is that we need a proper way to manage the life
cycle of sstables to solve potential races between compaction
and various consumers of sstables, so they don't get deleted by
compaction while being used.
In addition, we plan to add global statistics methods like returning
the total capacity used by all sstables.
This patchset changes the way class sstable gets the large_data_handler.
Rather than passing it separately for writing the sstable and when deleting
sstables, we provide the large_data_handler when the sstable object is
constructed and then use it when needed.
Refs #4149
"
* 'projects/sstables_manager/v3' of https://github.com/bhalevy/scylla:
sstables: provide large_data_handler to constructor
sstables_manager: default_sstable_buffer_size need not be a function
sstables: introduce sstables_manager
sstables: move shareable_components def to its own header
tests: use global nop_lp_handler in test_services
sstables: compress.hh: add missing include
sstables: reorder entry_descriptor constructor params
sstables: entry_descriptor: get rid of unused ctor
sstables: make load_shared_components a method of sstable
sstables: remove default params from sstable constructor
database: add table::make_sstable helper
distributed_loader: pass column_family to load_sstables_with_open_info
distributed_loader: no need for forward declaration of load_sstables_with_open_info
distributed_loader: reshard: use default params for make_sstable
The goal of the sstables manager is to track and manage sstables life-cycle.
There is an sstables manager instance per database and it is passed to each column family
(and test environment) on construction.
All sstables created, loaded, and deleted pass through the sstables manager.
The manager will make sure consumers of sstables are in sync so that sstables
will not be deleted while in use.
Refs #4149
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
In most cases we make an sstable based on the table schema
and, soon, the large_data_handler.
Encapsulate that in a make_sstable method.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
has_scylla_component is always false before loading the sstable.
Also, return an exceptional future rather than throwing.
Hit with the following dtests:
counter_tests.TestCounters.upgrade_test
counter_tests.TestCountersOnMultipleNodes.counter_consistency_node_*_test
resharding_test.ReshardingTest_nodes?_with_*CompactionStrategy.resharding_counter_test
update_cluster_layout_tests.TestUpdateClusterLayout.increment_decrement_counters_in_threads_nodes_restarted_test
Fixes #4306
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190326084151.18848-1-bhalevy@scylladb.com>
The crash observed in issue #4335 happens because
delete_large_data_entries is passed a deleted name.
Normally we don't get a crash, but a garbage name and we fail to
delete entries from system.large_*.
Adding a test for the fix found another issue that the second patch
in this series fixes.
Tests: unit (dev)
Fixes #4335.
* https://github.com/espindola/scylla guthub/fix-use-after-free-v4:
large_data_handler: Fix a use after destruction
large_data_handler: Make a variable non static
Allow large_data_handler to be stopped twice
Allow table to be stopped twice
Test that large data entries are deleted
"
Fixes #4245
Breaks up "perform_cleanup" into a parameterized "rewrite_sstables"
and implements upgrade + scrub in terms of it.
Both run as a "regular" compaction, but ignore the normal criteria
for compaction and select obsolete/all sstables.
We also ensure all previous compactions are done so we can guarantee
all sstables are rewritten after the command is invoked.
"
* 'calle/upgrade_sstables' of github.com:scylladb/seastar-dev:
api::storage_service: Implement "scrub"
api/storage_service: Implement "upgradesstables"
api::storage_service: Add keyspace + tables helper
compaction_manager: Add perform_sstable_scrub
compaction_manager: Add perform_sstable_upgrade
compaction_manager: break out rewrite_sstables from cleanup
table: parameterize cleanup_sstables
table::load_sstable: fix missing arg in old format counters exception
Properly catch and log the exception in load_new_sstables.
Abort when the exception is caught to keep current behavior.
Seen with migration_test:TestMigration_with_2_1_x.migrate_sstable_with_counter_test
without enable_dangerous_direct_import_of_cassandra_counters.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190301091235.2914-1-bhalevy@scylladb.com>
Right now Cassandra SSTables with counters cannot be imported into
Scylla. The reason for that is that Cassandra changed their counter
representation in their 2.1 version and kept transparently supporting
both representations. We do not support their old representation, nor
is there a sane way to figure out by looking at the data which one is in
use.
For safety, we had made the decision long ago to not import any
tables with counters: if a counter was generated in older Cassandra, we
would misrepresent them.
In this patch, I propose we offer a non-default way to import SSTables
with counters: we can gate it with a flag, and trust that the user knows
what they are doing when flipping it (at their own peril). Cassandra 2.1
is by now pretty old. Many users can safely say they've never used
anything older.
While there are tools like sstableloader that can be used to import
those counters, there are often situations in which directly importing
SSTables is either better, faster, or, worse, the only option left. I
argue that having a flag that allows us to import them when we are sure
it is safe is better than having no option at all.
With this patch I was able to successfully import Cassandra tables with
counters that were generated in Cassandra 2.1, reshard and compact their
SSTables, and read the data back to get the same values in Scylla as in
Cassandra.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190210154028.12472-1-glauber@scylladb.com>
View building uses populate_views to generate and send view updates.
This procedure will now not allow hints to be used to acknowledge
the write. Instead, the whole building step will be retried on failure.
Fixes #3857
Fixes #4039