scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 12:06:44 +00:00

Author	SHA1	Message	Date
Avi Kivity	76d21a0c22	Merge 'Make it possible to turn caching off per table and stop caching CDC Log' from Piotr J. " We inherited from Origin a `caching` table parameter. It's a map of named caching parameters. Before this PR two caching parameters were expected: `keys` and `rows_per_partition`. So far we have been ignoring them. This PR adds a new caching parameter called `enabled` which can be set to `true` or `false` and controls the usage of the cache for the table. By default, it's set to `true` which reflects Scylla behavior before this PR. This new capability is used to disable caching for CDC Log table. It is desirable because CDC Log entries are not expected to be read often. They also put much more pressure on memory than entries in Base Table. This is caused by the fact that some writes to Base Table can override previous writes. Every write to CDC Log is unique and does not invalidate any previous entry. Fixes #6098 Fixes #6146 Tests: unit(dev, release), manual " * haaawk-dont_cache_cdc: cdc: Don't cache CDC Log table table: invalidate disabled cache on memtable flush table: Add cache_enabled member function cf_prop_defs: persist caching_options in schema property_definitions: add get that returns variant feature: add PER_TABLE_CACHING feature caching_options: add enabled parameter	2020-05-10 15:39:42 +03:00
Raphael S. Carvalho	88d2486fca	sstables: Synchronize deletion of SSTables in resharding with other operations Input SSTables of resharding are deleted at the coordinator shard, not at the shards they belong to. We're not acquiring the deletion semaphore before removing those input SSTables from the SSTable set, so it could happen that resharding deletes those SSTables while another operation like snapshot, which acquires the semaphore, finds them deleted. Let's acquire the deletion semaphore so that the input SSTables will only be removed from the set when we're certain that nobody is relying on their existence anymore. Now resharding will only delete input SSTables after they're safely removed from the SSTable set of all shards they belong to. unit: test(dev). Fixes #6328. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200507233636.92104-1-raphaelsc@scylladb.com>	2020-05-10 10:50:32 +03:00
Ivan Prisyazhnyy	84e25e8ba4	api: support table auto compaction control The patch implements: - /storage_service/auto_compaction API endpoint - /column_family/autocompaction/{name} API endpoint Those APIs allow to control and request the status of background compaction jobs for the existing tables. The implementation introduces the table::_compaction_disabled_by_user. Then the CompactionManager checks if it can push the background compaction job for the corresponding table. New members === table::enable_auto_compaction(); table::disable_auto_compaction(); bool table::is_auto_compaction_disabled_by_user() const Test === Tests: unit(sstable_datafile_test autocompaction_control_test), manual $ ninja build/dev/test/boost/sstable_datafile_test $ ./build/dev/test/boost/sstable_datafile_test --run_test=autocompaction_control_test -- -c1 -m2G --overprovisioned --unsafe-bypass-fsync 1 --blocked-reactor-notify-ms 2000000 The test tries to submit a compaction job after playing with autocompaction control table switch. However, there is no reliable way to hook pending compaction task. The code assumed that with_scheduling_group() closure will never preempt execution of the stats check. Revert === Reverts commit `c8247ac`. In previous version the execution sometimes resulted into the following error: test/boost/sstable_datafile_test.cc(1076): fatal error: in "autocompaction_control_test": critical check cm->get_stats().pending_tasks == 1 || cm->get_stats().active_tasks == 1 has failed This version adds a few sstables to the cf, starts the compaction and awaits until it is finished. API change === - `/column_family/autocompaction/` always returned `true` while answering to the question: if the autocompaction disabled (see https://github.com/scylladb/scylla-jmx/blob/master/src/main/java/org/apache/cassandra/db/ColumnFamilyStore.java#L321). now it answers to the question: if the autocompaction for specific table is enabled. The question logic is inverted. The patch to the JMX is required. However, the change is decent because all old values were invalid (it always reported all compactions are disabled). - `/column_family/autocompaction/` got support for POST/DELETE per table Fixes === Fixes #1488 Fixes #1808 Fixes #440 Signed-off-by: Ivan Prisyazhnyy <ivan@scylladb.com> Reviewed-by: Glauber Costa <glauber@scylladb.com>	2020-05-07 16:23:38 +03:00
Avi Kivity	bef8e5e930	Merge "Don't invalidate row cache when adding GC SStable to SSTable Set" from Raphael " Garbage collected SSTables, created by incremental compaction process, are being added to the SSTable set using a function that invalidates row cache using the range of the SSTable itself. That's incorrect because data in GC SSTables come from preexisting SSTables in set, meaning the state of data isn't changed and so no need for invalidation at all. Incorrect invalidation like this is a source of read performance issues. This problem is fixed by including GC SSTables to the descriptor which is used to specify changes to the SSTable set, which is the correct thing to do given that a midway failure could leave the set in an incorrect state. Fixes #5956. Fixes #6275. tests: unit(dev) " * 'fix_issue_5956_v4' of github.com:raphaelsc/scylla: sstables/compaction: Don't invalidate row cache when adding GC SSTable to SSTable set sstables/compaction: Change meaning of compaction_completion_desc input and output fields sstables/compaction: Clean up code around garbage_collected_sstable_writer	2020-05-07 14:10:49 +03:00
Benny Halevy	b2f50224d9	table: database_sstable_write_monitor: revert charges in destructor We must unregister the monitor upon destruction to prevent use-after-free from `compaction_backlog_tracker::backlog` path. This is similar to ~compaction_read_monitor as implemented in commit `ca284174d0` Fixes #6385 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200506214419.569655-1-bhalevy@scylladb.com>	2020-05-07 10:39:39 +02:00
Piotr Jastrzebski	38ede62a02	table: invalidate disabled cache on memtable flush table::update_cache has two branches of its logic. One when caching is enabled and the other when it's disabled. This patch adds unconditional cache invalidation to the second (disabled caching) branch. This is done for two purposes. First and foremost, it gives the guarantee that when we enable the cache later it will be in the right state and will be ready for usage. This is because any memtable flush that would logically invalidate the cache, actually physically does that too now. An additional benefit of this change is that disabled cache will be cleared during the next memtable flush that will happen after turning the switch off. Previously, the cache would also be emptied but it would take more time before all its elements are removed by eviction. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-06 18:39:01 +02:00
Piotr Jastrzebski	1a43849cd2	table: Add cache_enabled member function This function determines cache usage based both on table _config and dynamic schema information. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-05-06 18:39:01 +02:00
Raphael S. Carvalho	8f4458f1d5	sstables/compaction: Change meaning of compaction_completion_desc input and output fields input_sstables is renamed to old_sstables and is about old SSTables that should be deleted and removed from the SSTable set. output_sstables is renamed to new_sstables and is about new SSTable that should be added to the SSTable set, replacing the old ones. This will allow us, for example, to add auxiliary SSTables to SSTable set using the same call which replaces output SSTables by input SSTables in compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-05-05 12:03:08 -03:00
Glauber Costa	70e5252a5d	table: no longer accept online loading of SSTable files in the main directory Loading SSTables from the main directory is possible, to be compatible with Cassandra, but extremely dangerous and not recommended. From the beginning, we recommend using a separate, upload/ directory. In all this time, perhaps due to how the feature's usefulness is reduced in Cassandra due to the possible races, I have never seen anyone coming from Cassandra doing procedures involving refresh at all. Loading SSTables from the main directory forces us to disable writes to the table temporarily until the SSTables are sorted out. If we get rid of this, we can get rid of the disabling of the writes as well. We can't do it now because if we want to be nice to the odd user that may be using refresh through the main directory without our knowledge we should at least error out. This patch, then, does that: it errors out if SSTables are found in the main directory. It will not proceed with the refresh, and direct the user to the upload directory. The main loop in reshuffle_sstables is left in place structurally for now, but most of it is gone. The test for it is deleted. After a period of deprecation we can start ignoring these SSTables and get rid of the lock. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200429144511.13681-1-glauber@scylladb.com>	2020-05-03 08:40:38 +03:00
Rafael Ávila de Espíndola	95ee54f3cc	sstables: Call monitor->write_failed earlier. A writer is destroyed just before consume_in_thread returns, since the adapter takes ownership of it. The problem is that a monitor can keep a reference to a writer_offset_tracker that is owned by that writer. The monitor is accessed periodically via backlog_controller::_update_timer. This means we have to deregister from the list of ongoing writes before the writer is destroyed. If the write fails, the deregistration happens in write_failed, but it is currently called after the writer is destroyed. This patch moves the call to write_failed to the writer destructor as I could not find a convenient location to put it. Since the writer is destroyed in consume_in_thread, we could call it there, but then we also have to update consume. There is a similar problem with the case where the sstable is written correctly. That will be fixed in the next patch. Fixes #6221. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-04-27 08:58:31 -07:00
Rafael Ávila de Espíndola	95acfd1d58	sstables: Add write_failed to the write_monitor interface Only database_sstable_write_monitor needs it so far, but the call needs to be moved earlier, which requires calling it in code paths that don't know about database_sstable_write_monitor. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-04-27 08:58:31 -07:00
Piotr Sarna	c66661c582	table: bypass cache when generating view updates from streaming There's no indication that data needed for generating view updates from staging sstables is going to be immediately useful for the user, and a large amount of it can push hot rows out of the cache, thus deteriorating performance. Fixes #6233 Tests: unit(dev)	2020-04-26 15:43:02 +03:00
Glauber Costa	1f9c37fb5e	view_updating_consumer: move reference to a pointer It is currently not possible to wrap the view_updating_consumer in an std::optional. I intend to do it to allow for compactions to optionally generate view updates. The reason for that is that view_updating_consumer has a reference as a member, which makes the move assignment constructor not be implicitly generated. This patch fixes it by keeping a pointer instead of a reference. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200421123648.8328-1-glauber@scylladb.com>	2020-04-22 10:05:35 +03:00
Piotr Sarna	a6cf0bfa7d	table: switch to correct io_priority for streaming view updates The io_priority parameter used when generating view updates from streaming is used by the sstable reader, so it should use the I/O priority for streaming read operations, not streaming write operations. Fixes #6231 Tests: unit(dev)	2020-04-19 09:56:43 +03:00
Rafael Ávila de Espíndola	3586324a61	sstables: Delete never overwritten methods Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Reviewed-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200417012330.246071-1-espindola@scylladb.com>	2020-04-17 09:16:16 +03:00
Piotr Sarna	71ac6ebcc5	Merge 'prepare the view building generator to work through a compaction' from Glauber There is no reason to read a single SSTable at a time from the staging directory. Moving SSTables from staging directory essentially involves scanning input SSTables and creating new SSTables (albeit in a different directory). We have a mechanism that does that: compactions. In a follow up patch, I will introduce a new specialization of compaction that moves SSTables from staging (potentially compacting them if there are plenty). In preparation for that, some signatures have to be changed and the view_updating_consumer has to be more compaction friendly. Meaning: - Operating with an sstable vector - taking a table reference, not a database Because this code is a bit fragile and the reviewer set is fundamentally different from anything compaction related, I am sending this separately * glommer-view_build: staging: potentially read many SSTables at the same time view_build_test: make sure it works with smp > 1	2020-04-15 18:07:09 +02:00
Glauber Costa	4e6400293e	staging: potentially read many SSTables at the same time There is no reason to read a single SSTable at a time from the staging directory. Moving SSTables from staging directory essentially involves scanning input SSTables and creating new SSTables (albeit in a different directory). We have a mechanism that does that: compactions. In a follow up patch, I will introduce a new specialization of compaction that moves SSTables from staging (potentially compacting them if there are plenty). In preparation for that, some signatures have to be changed and the view_updating_consumer has to be more compaction friendly. Meaning: - Operating with an sstable vector - taking a table reference, not a database Because this code is a bit fragile and the reviewer set is fundamentally different from anything compaction related, I am sending this separately Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-04-15 11:26:44 -04:00
Konstantin Osipov	18b9bb57ac	lwt: rename metrics to match accepted terminology Rename inherited metrics cas_propose and cas_commit to cas_accept and cas_learn respectively. A while ago we made a decision to stick to widely accepted terms for Paxos rounds: prepare, accept, learn. The rest of the code is using these terms, so rename the metrics to avoid confusion/technical debt. While at it, rename a few internal methods and functions. Fixes #6169 Message-Id: <20200414213537.129547-1-kostja@scylladb.com>	2020-04-15 12:20:30 +02:00
Pekka Enberg	c8247aced6	Revert "api: support table auto compaction control" This reverts commit `1c444b7e1e`. The test it adds sometimes fails as follows: test/boost/sstable_datafile_test.cc(1076): fatal error: in "autocompaction_control_test": critical check cm->get_stats().pending_tasks == 1 || cm->get_stats().active_tasks == 1 has failed Ivan is working on a fix, but let's revert this commit to avoid blocking next promotion failing from time to time.	2020-04-11 17:56:02 +03:00
Ivan Prisyazhnyy	1c444b7e1e	api: support table auto compaction control This patch adds API endpoint /column_family/autocompaction/{name} that listen to GET and POST requests to pick and control table background compactions. To implement that the patch introduces "_compaction_disabled_by_user" flag that affects if CompactionManager is allowed to push background compactions jobs into the work. It introduces table::enable_auto_compaction(); table::disable_auto_compaction(); bool table::is_auto_compaction_disabled_by_user() const to control auto compaction state. Fixes #1488 Fixes #1808 Fixes #440 Tests: unit(sstable_datafile_test autocompaction_control_test), manual	2020-04-08 21:18:38 +03:00
Glauber Costa	463d0ab37c	compaction: move rewrite_sstables to the compaction_manager There is no reason why the table code has to be aware of the efforts of rewriting (cleanup, scrub, upgrade) an SSTable versus compacting it. Rewrite is special, because we need to do it one SSTable at a time, without lumping it together. However, the compaction manager is totally capable of doing that itself. If we do that, the special "table::rewrite_sstables" can be killed. This code would maybe be better off as a thread, where we wouldn't need to keep state. However there are some methods like maybe_stop_on_error() that expect a future so I am leaving this be for now. This is a cleanup that can be done later. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200401162722.28780-2-glauber@scylladb.com>	2020-04-06 16:02:30 +03:00
Avi Kivity	88ade3110f	treewide: replace calls to engine().some_api() with some_api() This removes the need to include reactor.hh, a source of compile time bloat. In some places, the call is qualified with seastar:: in order to resolve ambiguities with a local name. Includes are adjusted to make everything compile. We end up having 14 translation units including reactor.hh, primarily for deprecated things like reactor::at_exit(). Ref #1	2020-04-05 12:46:04 +03:00
Glauber Costa	e8801cd77b	compaction: enhance compaction_descriptor with creator and replace function There are many differences between resharding and compaction that are artificial, arising more from the way we ended up implementing it than necessity. This patch attempts to pass the creator and replacer functions through the compaction_descriptor. There is a difference between the creator function for resharding and regular compaction: resharding has to pass the shard number on behalf of which the SSTable is created. However regular compactions can just ignore this. No need to have a special path just for this. After this is done, the constructor for the compaction object can be greatly simplified. In further patches I intend to simplify it a bit further, but some more cleanup has to happen first. To make that happen we have to construct a compaction_descriptor object inside the resharding function. This is temporary: resharding currently works with a descriptor, but at some point that descriptor is lost and broken into pieces to be passed to this function. The overarching goal of this work is exactly to be able to keep that descriptor for as long as possible, which should simplify things a lot. Callers are patched, but there are plenty for sstable_datafile_test.cc. For their benefit, a helper function is provided to keep the previous signature (test only). Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-03-31 19:41:25 -04:00
Rafael Ávila de Espíndola	c5795e8199	everywhere: Replace engine().cpu_id() with this_shard_id() This is a bit simpler and might allow removing a few includes of reactor.hh. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200326194656.74041-1-espindola@scylladb.com>	2020-03-27 11:40:03 +03:00
Nadav Har'El	7922b9eb8f	materialized views: reduce recompilation when db/view/view.hh changes. Before this patch, when db/view/view.hh was modified, 89 source files had to be recompiled. After this patch, this number is down to 5. Most of the irrelevant source files got view.hh by including database.hh, which included view.hh just for the definition of statistics. So in this patch we split the view statistics to a separate header file, view_stats.hh, and database.hh only includes that. A few source files which included only database.hh and also needed view.hh (for materialized-view related functions) now need to include view.hh explicitly. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200319121031.540-1-nyh@scylladb.com>	2020-03-19 15:46:14 +02:00
Piotr Sarna	0c11e07faf	view,table: fix waiting for view updates during building View updates sent as part of the view building process should never be ignored, but `fd49fd7` introduced a bug which may cause exactly that: the updates are mistakenly sent to background, so the view builder will not receive negative feedback if an update failed, which will in turn not cause a retry. Consequently, view building may report that it "finished" building a view, while some of the updates were lost. A simple fix is to restore previous behaviour - all updates triggered by view building are now waited for. Fixes #6038 Tests: unit(dev), dtest: interrupt_build_process_with_resharding_low_to_half_test	2020-03-19 10:50:54 +02:00
Piotr Sarna	fd49fd773c	db,view: move putting view updates to background to mutate_MV Currently, launching view updates as an asynchronous background job is done via not waiting for mutate_MV() future in table::generate_and_propagate_view_updates. That has a big downside, since mutate_MV() handles all view updates for all views of a table, so it's not possible to wait for each view independently. Per-view granularity is required in order to implement synchronous view updates of local views - because then we'll synchronously wait for all views that write to a local node (due to having a matching partition key with the base), while remote view updates will still be sent asynchronously. In order to do that, instead of not waiting for mutate_MV, we do wait for it properly, but instead launch the asynchronous, unwaited-for futures inside mutate_MV. Effectively that means no changes for view updates so far - all updates will be fired in the background. Later, another patch will introduce a way to wait for selected updates to finish.	2020-03-11 09:05:56 +01:00
Piotr Sarna	3b3659e8cd	db,view: drop default parameter for mutate_MV::allow_hints Default parameters are considered harmful, and as part of a cleanup before editing view.cc code, a default value for allow_hints parameter is removed.	2020-03-11 09:05:56 +01:00
Raphael S. Carvalho	3ba3ee2a7b	distributed_loader: trigger regular compaction on resharding completion Regular compaction relies on compaction manager to run compaction jobs until compaction strategy is satisfied. Resharding, on the other hand, is a one-off operation which runs only once in compaction manager, and leaves the sstable set in such a way that the strategy is very likely unsatisfied. We need to trigger regular compaction whenever a resharding job replaces a shared sstable by an unshared sstable, so that compaction will not fall way behind due to lots of new sstables created by resharding process. Fixes #5262. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200217144946.20338-1-raphaelsc@scylladb.com>	2020-03-04 16:08:13 +02:00
Avi Kivity	906784639d	Merge "Clean sstables from using global objects" from Pavel E " This set cleans sstable_writer_config and surrounding sstables code from using global storage_ and feature_ service-s and database by moving the configuration logic onto sstables_manager (that was supposed to do it since `eebc3701a5`). Most of the complexity is hidden around sstable_writer_config creation, this set makes the sstables_manager create this object with an explicit call. All the rest are consequences of this change. Tests: unit(debug), manual start-stop " * 'br-clean-sstables-manager-2' of https://github.com/xemul/scylla: sstables: Move get_highest_supported_format sstables: Remove global get_config() helper sstables: Use manager's config() in .new_sstable_component_file() sstable_writer_config: Extend with more db::config stuff sstables_manager: Don't use global helper to generate writer config sstable_writer_config: Sanitize out some features fields initialization sstable_writer_config: Factor out some field initialization sstables: Generate writer config via manager only sstables: Keep reference on manager test: Re-use existing global sstables_manager table: Pass sstable_writer_config into write_memtable_to_sstable	2020-03-03 18:33:01 +02:00
Avi Kivity	157fe4bd19	Merge "Remove default timeouts" from Botond " Timeouts defaulted to `db::no_timeout` are dangerous. They allow any modifications to the code to drop timeouts and introduce a source of unbounded request queue to the system. This series removes the last such default timeouts from the code. No problems were found, only test code had to be updated. tests: unit(dev) " * 'no-default-timeouts/v1' of https://github.com/denesb/scylla: database: database::query(), database::apply(): remove default timeouts database: table::query(): remove default timeout mutation_query: data_query(): remove default timeout mutation_query: mutation_query(): remove default timeout multishard_mutation_query: query_mutations_on_all_shards(): remove default timeout reader_concurrency_semaphore: wait_admission(): remove default timeout utils/logallog: run_when_memory_available(): remove default timeout	2020-03-01 17:29:17 +02:00
Botond Dénes	8da88e6cb9	mutation_query: data_query(): remove default timeout	2020-02-27 19:02:40 +02:00
Botond Dénes	7bdeec4b00	flat_mutation_reader: make_reversing_reader(): add memory limit If the reversing requires more memory than the limit, the read is aborted. All users are updated to get a meaningful limit, from the respective table object, with the exception of tests of course.	2020-02-27 18:11:54 +02:00
Pavel Emelyanov	7363d56946	sstables: Move get_highest_supported_format The global get_highest_supported_format helper and its declaration are scattered all over the code, so clean this up and prepare the ground for moving _sstables_format from the storage_service onto the sstables_manager (not this set). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-25 14:31:45 +03:00
Pavel Emelyanov	5adce3390c	sstables: Generate writer config via manager only The sstable_writer_config creation looks simple (just declare the struct instance) but behind the scenes references storage and feature services, messes with database config, etc. This patch teaches the sstables_manager to generate the writer config and makes the rest of the code use it. For future safety by-hands creation of the sstable_writer_config is prohibited. The manager is referenced through table-s and sstable-s, but two existing sstables_managers live on database object, and table-s and sstable-s both live shorter than the database, so this reference is safe. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-25 14:31:04 +03:00
Pavel Emelyanov	961f1642c7	table: Pass sstable_writer_config into write_memtable_to_sstable The latter creates the config by hands, but the plan is to create it via sstables_manager. Callers of this helper are the final frontiers where the manager will be safely accessible. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-25 13:54:40 +03:00
Raphael S. Carvalho	f93912f344	Revert "Revert "streaming: Do not invalidate cache if no sstable is added in flush_streaming_mutations"" With #4446 fixed, this commit can be reverted. This reverts commit `454e7e0109`.	2020-02-20 10:55:50 -03:00
Raphael S. Carvalho	fb81f2aa7c	table: Fix stale data being returned due to lack of cache invalidation Row cache needs to be invalidated whenever data in sstables changes. Cleanup removes data from sstables which doesn't belong to the node anymore, which means cache must be invalidated on cleanup. Currently, stale data can be returned when a node re-owns ranges which data are still stored in the node's row cache, because cleanup didn't invalidate the cache. To prevent data that belongs to the node from being purged from the row cache, cleanup will only invalidate the cache with a set of token ranges that will not overlap with any of ranges owned by the node. update_cluster_layout_tests.py:TestUpdateClusterLayout.simple_decommission_node_2_test now passes. Fixes #4446. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-02-20 10:55:50 -03:00
Raphael S. Carvalho	65b4fc8bcd	sstables/compaction: Introduce compaction_completion_desc This descriptor contains all information needed for table to be properly updated on compaction completion. A new member will be added to it soon, which will store ranges to be invalidated in row cache on behalf of cleanup compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-02-19 19:29:32 -03:00
Avi Kivity	454e7e0109	Revert "streaming: Do not invalidate cache if no sstable is added in flush_streaming_mutations" This reverts commit `5e9925b9f0`. It causes data resurrection in simple_decommission_node_2_test. Fixes #5838.	2020-02-18 20:13:10 +02:00
Avi Kivity	6c7aa18238	Merge "Introduce schema::get_partitioner" from Piotr " Introduce schema::get_partitioner and use it instead of dht::global_partitioner. Fixes #5493 Tests: unit(dev, release, debug) " * 'per_table_partitioner_prep' of https://github.com/haaawk/scylla: (35 commits) cdc: stop using partitioners partitioner_test: stop calling set_global_partitioner storage_service: stop calling global_partitioner() mutation_writer_test: stop calling global_partitioner() schema: reduce number of global_partitioner() calls test_services: stop calling global_partitioner() sstable_utils: stop calling global_partitioner() sstable_resharding_test: stop depending on global partitioner sstable_mutation_test: stop calling global_partitioner() sstable_data_file_test: stop calling global_partitioner() random_schema: stop taking partitioner in constructor mutation_reader_test: stop calling global_partitioner() multishard_mutation_query_test: stop calling global_partitioner() row_level repair: stop calling global_partitioner() distribute_reader_and_consume_on_shards: don't take partitioner thrift: reduce global_partitioner() calls binary_search: stop calling global_partitioner() index_entry: stop calling global_partitioner() mc writer: stop calling global_partitioner() sstable: stop calling global_partitioner() ...	2020-02-17 18:12:53 +02:00
Tomasz Grabiec	76d1dd7ec6	Merge "nodetool scrub: implement validation and the skip-corrupted flag " from Botond Nodetool scrub rewrites all sstables, validating their data. If corrupt data is found the scrub is aborted. If the skip-corrupted flag is set, corrupt data is instead logged (just the keys) and skipped. The scrubbing algorithm itself is fairly simple, especially that we already have a mutation stream validator that we can use to validate the data. However currently scrub is piggy-backed on top of cleanup compaction. To implement this flag, we have to make scrub a separate compaction type and propagate down the flag. This required some massaging of the code: * Add support for more than two (cleanup or not) compaction types. * Allow passing custom options for each compaction type. * Allow stopping a compaction without the manager retrying it later. Additionally the validator itself needed some changes to allow different ways to handle errors, as needed by the scrub. Fixes: #5487 * https://github.com/denesb/nodetool-scrub-skip-corrupted/v7: table: cleanup_sstables(): only short-circuit on actual cleanup compaction: compaction_type: add Upgrade compaction: introduce compaction_options compaction: compaction_descriptor: use compaction options instead of cleanup flag compaction_manager: collect all cleanup related logic in perform_cleanup() sstables: compaction_stop_exception: add retry flag mutation_fragment_stream_validator: split into low-level and high-level API compaction: introduce scrub_compaction compaction_manager: scrub: don't piggy-back on upgrade_sstables() test: sstable_datafile_test: add scrub unit test	2020-02-17 15:28:07 +02:00
Piotr Jastrzebski	2d7532f87f	dht: add dht::get_token and replace all calls to dht::global_partitioner().get_token dht::get_token is better because it takes schema and uses it to obtain partitioner instead of using a global partitioner. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:15 +01:00
Piotr Jastrzebski	ca4a89d239	dht: add dht::decorate_key and replace all dht::global_partitioner().decorate_key with dht::decorate_key It is an improvement because dht::decorate_key takes schema and uses it to obtain partitioner instead of using global partitioner as it was before. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:06 +01:00
Piotr Jastrzebski	abd76e566f	dht::shard_of: stop calling global_partitioner() Take const schema& as a parameter of shard_of and use it to obtain partitioner instead of calling global_partitioner(). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:23:16 +01:00
Asias He	5e9925b9f0	streaming: Do not invalidate cache if no sstable is added in flush_streaming_mutations The table::flush_streaming_mutations is used in the days when streaming data goes to memtable. After switching to the new streaming, data goes to sstables directly in streaming, so the sstables generated in table::flush_streaming_mutations will be empty. It is unnecessary to invalidate the cache if no sstables are added. To avoid unnecessary cache invalidating which pokes hole in the cache, skip calling _cache.invalidate() if the sstables is empty. The steps are: - STREAM_MUTATION_DONE verb is sent when streaming is done with old or new streaming - table::flush_streaming_mutations is called in the verb handler - cache is invalidated for the streaming ranges In summary, this patch will avoid a lot of cache invalidation for streaming. Backports: 3.0 3.1 3.2 Fixes: #5769	2020-02-16 11:22:30 +02:00
Pavel Emelyanov	b11cf6e950	cql3/query_processor.hh: Debloat from other headers This gives ~30% less (251 jobs -> 181 jobs) recompile when touching it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200212225828.3374-1-xemul@scylladb.com>	2020-02-16 11:22:30 +02:00
Botond Dénes	8014c7124d	compaction_manager: collect all cleanup related logic in perform_cleanup() Currently the call chain for a cleanup collection looks like this: compaction_manager::perform_cleanup() compaction_manager::rewrite_sstables() table::cleanup_sstables() ... `perform_cleanup()` is essentially empty, immediately deferring to `rewrite_sstables()`. Cleanup related logic is scattered between the latter two methods on the call chain. These methods however recently started serving as generic methods for compactions that want to rewrite each sstable one-by-one, collecting cleanup related ifs in various places. The reason is historic, we first had cleanup, then bolted others on top, trying to share the underlying code as much as possible. It is time this is cleaned up (pun intended). Make `perform_cleanup()` the place where all cleanup related logic is, with the rest of the stack made truly generic.	2020-02-11 17:47:44 +02:00
Botond Dénes	b2dc5d4895	compaction: compaction_descriptor: use compaction options instead of cleanup flag Instead of the restrictive `cleanup` boolean flag, which allows for choosing between only two compaction types, use `compaction_options`, which in addition to allowing any number of compaction types to be selected, also allows seamlessly passing specific options to them.	2020-02-11 17:47:44 +02:00
Botond Dénes	0b53ccaecd	table: cleanup_sstables(): only short-circuit on actual cleanup Currently the cleanup call is short circuited if it is determined that cleanup is not needed for the sstable to-be-cleaned-up. This is undesired because actually not just cleanup uses this routine to rewrite sstables, sstable-upgrade and sstable-scrub also uses it, and they want to go on with the cleanup compaction sstables even if all data in it belongs to the current node. Fix: #5699	2020-02-11 17:47:44 +02:00

124 Commits