scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-12 19:02:12 +00:00

Author	SHA1	Message	Date
Glauber Costa	628dd16519	compaction: deprecate DTCS. Step 1. This patch adds a warning of deprecation to DTCS. In a follow up step, we will start requiring a flag for it to be enabled to make sure users notice. For now we'll just be nice and add a warning for the log watchers. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200224164405.9656-1-glauber@scylladb.com>	2020-02-24 20:26:24 +02:00
Raphael S. Carvalho	e81076b01c	compaction: Implement ranges for cache invalidation on behalf of cleanup This procedure will calculate ranges for cache invalidation by subtracting all owned ranges from the sstables' partition ranges. That's done so as to reduce the size of invalidated ranges. Refs #4446. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-02-20 10:55:49 -03:00
Raphael S. Carvalho	db4c3230f7	compaction: Add ranges for cache invalidation to compaction_completion_desc It will store the ranges to be invalidated in row cache on compaction completion. Intended to be used by cleanup compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-02-19 19:30:35 -03:00
Raphael S. Carvalho	51532b84f8	compaction: Make it possible for a compaction type to customize compaction_completion_desc compaction_completion_desc will eventually store more information that can be customized by the compaction type. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-02-19 19:30:35 -03:00
Raphael S. Carvalho	65b4fc8bcd	sstables/compaction: Introduce compaction_completion_desc This descriptor contains all information needed for the table to be properly updated on compaction completion. A new member will be added to it soon, which will store ranges to be invalidated in row cache on behalf of cleanup compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-02-19 19:29:32 -03:00
Avi Kivity	6c7aa18238	Merge "Introduce schema::get_partitioner" from Piotr " Introduce schema::get_partitioner and use it instead of dht::global_partitioner. Fixes #5493 Tests: unit(dev, release, debug) " * 'per_table_partitioner_prep' of https://github.com/haaawk/scylla: (35 commits) cdc: stop using partitioners partitioner_test: stop calling set_global_partitioner storage_service: stop calling global_partitioner() mutation_writer_test: stop calling global_partitioner() schema: reduce number of global_partitioner() calls test_services: stop calling global_partitioner() sstable_utils: stop calling global_partitioner() sstable_resharding_test: stop depending on global partitioner sstable_mutation_test: stop calling global_partitioner() sstable_data_file_test: stop calling global_partitioner() random_schema: stop taking partitioner in constructor mutation_reader_test: stop calling global_partitioner() multishard_mutation_query_test: stop calling global_partitioner() row_level repair: stop calling global_partitioner() distribute_reader_and_consume_on_shards: don't take partitioner thrift: reduce global_partitioner() calls binary_search: stop calling global_partitioner() index_entry: stop calling global_partitioner() mc writer: stop calling global_partitioner() sstable: stop calling global_partitioner() ...	2020-02-17 18:12:53 +02:00
Tomasz Grabiec	76d1dd7ec6	Merge "nodetool scrub: implement validation and the skip-corrupted flag " from Botond Nodetool scrub rewrites all sstables, validating their data. If corrupt data is found the scrub is aborted. If the skip-corrupted flag is set, corrupt data is instead logged (just the keys) and skipped. The scrubbing algorithm itself is fairly simple, especially that we already have a mutation stream validator that we can use to validate the data. However currently scrub is piggy-backed on top of cleanup compaction. To implement this flag, we have to make scrub a separate compaction type and propagate down the flag. This required some massaging of the code: * Add support for more than two (cleanup or not) compaction types. * Allow passing custom options for each compaction type. * Allow stopping a compaction without the manager retrying it later. Additionally the validator itself needed some changes to allow different ways to handle errors, as needed by the scrub. Fixes: #5487 * https://github.com/denesb/nodetool-scrub-skip-corrupted/v7: table: cleanup_sstables(): only short-circuit on actual cleanup compaction: compaction_type: add Upgrade compaction: introduce compaction_options compaction: compaction_descriptor: use compaction options instead of cleanup flag compaction_manager: collect all cleanup related logic in perform_cleanup() sstables: compaction_stop_exception: add retry flag mutation_fragment_stream_validator: split into low-level and high-level API compaction: introduce scrub_compaction compaction_manager: scrub: don't piggy-back on upgrade_sstables() test: sstable_datafile_test: add scrub unit test	2020-02-17 15:28:07 +02:00
Piotr Jastrzebski	56e3cb8c3a	binary_search: stop calling global_partitioner() Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:15 +01:00
Piotr Jastrzebski	1db437ee91	index_entry: stop calling global_partitioner() Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:15 +01:00
Piotr Jastrzebski	1f866d7001	mc writer: stop calling global_partitioner() Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:15 +01:00
Piotr Jastrzebski	6fe0dcbac4	sstable: stop calling global_partitioner() parse functions now take const schema& which allows them to reach a partitioner. It's safe to take schema by const& because the only caller takes the schema from an sstable object. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:15 +01:00
Piotr Jastrzebski	ca4a89d239	dht: add dht::decorate_key and replace all dht::global_partitioner().decorate_key with dht::decorate_key It is an improvement because dht::decorate_key takes schema and uses it to obtain partitioner instead of using global partitioner as it was before. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:06 +01:00
Piotr Jastrzebski	abd76e566f	dht::shard_of: stop calling global_partitioner() Take const schema& as a parameter of shard_of and use it to obtain partitioner instead of calling global_partitioner(). Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:23:16 +01:00
Piotr Jastrzebski	57e4b7f215	ring_position_range_sharder: stop calling global_partitioner Remove ring_position_range_sharder(nonwrapping_range<ring_position>) which calls another constructor with partitioner obtained with dht::global_partitioner(). Fix all the places the removed constructor was used and obtain partitioner from schema instead. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:19:15 +01:00
Piotr Jastrzebski	dd1120454b	dht: move sharders to a separate header i_partitioner.hh is widely included while sharders are used only in 6 places so there's no need to include them in the whole codebase. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:19:02 +01:00
Pavel Emelyanov	b11cf6e950	cql3/query_processor.hh: Debloat from other headers This gives ~30% less (251 jobs -> 181 jobs) recompile when touching it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200212225828.3374-1-xemul@scylladb.com>	2020-02-16 11:22:30 +02:00
Botond Dénes	26d4c8be95	compaction_manager: scrub: don't piggy-back on upgrade_sstables() Now that we have the necessary infrastructure to do actual scrubbing, don't rely on `upgrade_sstables()` anymore behind the scenes, instead do an actual scrub. Also, use the skip-corrupted flag.	2020-02-13 15:02:37 +02:00
Botond Dénes	33c126e8c0	compaction: introduce scrub_compaction A specialized compaction subclass for executing a scrub compaction. `scrub_compaction` supplies a specialized reader which will validate its input and stop on the first error. If it is configured with `skip_corrupted`, it will instead skip bad data, logging it.	2020-02-13 15:02:37 +02:00
Botond Dénes	1b7725af4b	mutation_fragment_stream_validator: split into low-level and high-level API The low-level validator allows fine-grained validation of different aspects of monotonicity of a fragment stream. It doesn't do any error handling. Since different aspects can be validated with different functions, this allows callers to understand what exactly is invalid. The high-level API is the previous fragment filter one. This is now built on the low-level API. This division allows for advanced use cases where the user of the validator wants to do all error handling and wants to decide exactly what monotonicity to validate. The motivating use-case is scrubbing compaction, added in the next patches.	2020-02-13 15:02:32 +02:00
Botond Dénes	7d3bce403d	sstables: compaction_stop_exception: add retry flag Allow the thrower to communicate that it doesn't want the compaction to be retried later. I know, using exceptions for control flow is very bad, but this is the existing mechanism to stop a compaction and I don't want to invent a new one for this. Also massage the error messages a bit to take the value of this flag into consideration.	2020-02-11 18:38:35 +02:00
Botond Dénes	8014c7124d	compaction_manager: collect all cleanup related logic in perform_cleanup() Currently the call chain for a cleanup compaction looks like this: compaction_manager::perform_cleanup() compaction_manager::rewrite_sstables() table::cleanup_sstables() ... `perform_cleanup()` is essentially empty, immediately deferring to `rewrite_sstables()`. Cleanup related logic is scattered between the latter two methods on the call chain. These methods however recently started serving as generic methods for compactions that want to rewrite each sstable one-by-one, collecting cleanup related ifs in various places. The reason is historic, we first had cleanup, then bolted others on top, trying to share the underlying code as much as possible. It is time this is cleaned up (pun intended). Make `perform_cleanup()` the place where all cleanup related logic is, with the rest of the stack made truly generic.	2020-02-11 17:47:44 +02:00
Botond Dénes	b2dc5d4895	compaction: compaction_descriptor: use compaction options instead of cleanup flag Instead of the restrictive `cleanup` boolean flag, which allows for choosing between only two compaction types, use `compaction_options`, which in addition to allowing any number of compaction types to be selected, also allows seamlessly passing specific options to them.	2020-02-11 17:47:44 +02:00
Botond Dénes	8579bef076	compaction: introduce compaction_options Currently the compaction API is quite restrictive. It offers generic `compact_sstables()` and `reshard_sstables()` methods. The former is the one used by all but resharding, however it only really supports two modes: regular and cleanup. The latter is supported by a semi-hidden `cleanup` flag in `compaction_descriptor`. Actually there are two more compaction types already which are piggy-backed on cleanup: upgrade and scrub. The upper layers distinguish between actual cleanup and "fake" cleanup by an `is_actual_cleanup` flag. The latter two "fake" cleanup compactions cannot be distinguished even by the upper layers. This is terribly confusing and hard to follow, in addition to being restrictive. This worked so far, because upgrade is served quite well by the cleanup compaction type, turning off certain preparations by the above mentioned `is_actual_cleanup` flag. Scrub is barely implemented and just an upgrade behind the scenes. This situation is however preventing really specializing each compaction. Enter `compaction_options`. This variant in disguise is designed to allow passing specific options to each compaction type, and doubles as an enum allowing more than two low-level compaction types. This patch only adds the option class itself, propagating and handling it will be done by the next patches.	2020-02-11 17:47:44 +02:00
Botond Dénes	6bc3b41c20	compaction: compaction_type: add Upgrade Although we currently do support upgrade compaction, it is piggy-backed on top of cleanup compaction. This is soon going to change, so in preparation to that, add an `Upgrade` member to the `compaction_type` enum.	2020-02-11 17:47:44 +02:00
Raphael S. Carvalho	140520ff87	sstables/compaction_manager: add metric for pending compaction tasks We have the compaction_manager.compactions metric for the number of active tasks, but it doesn't account for tasks blocked waiting for an opportunity to run, and they're the problematic ones. Fixes #5254. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200210131929.30981-1-raphaelsc@scylladb.com>	2020-02-10 17:55:02 +01:00
Avi Kivity	bed61b96a2	Merge "Move features from storage- into feature-service" from Pavel " There's a lot of code around that needs storage service purely to get the specific feature value (cluster_supports_<something> calls). This creates several circular dependencies, e.g. storage_service <-> migration_manager one and database <-> storage_service. Also features sit on storage_service, but register themselves on the feature_service and the former subscribes on them back which also looks strange. I propose to keep all the features on feature_service, this keeps the latter independent from other components, makes it possible to break one of the mentioned circular dependencies and heavily relax the other. Also the set helps us fighting the globals and, after it, the feature_service can be safely stopped at the very last moment. Tests: unit(dev), manual debug build start-stop " * 'br-features-to-service-5' of https://github.com/xemul/scylla: gossiper: Avoid string merge-split for nothing features: Stop on shutdown storage_service: Remove helpers storage_service: Prepare to switch from on-board feature helpers cql3: Check feature in .validate database: Use feature service storage_proxy: Use feature service migration_manager: Use feature service start: Pass needed feature as argument into migrate_truncation_records features: Unfriend storage_service features: Simplify feature registration features: Introduce known_feature_set features: Move disabled features set from storage_service features: Move schema_features helper features: Move all features from storage_service to feature_service storage_service: Use feature_config from _feature_service features: Add feature_config storage_service: Kill set_disabled_features gms: Move features stuff into own .cc file migration_manager: Move some fns into class	2020-02-09 19:22:07 +02:00
Pavel Emelyanov	d1775dd701	utils: Move disk-error-handler into it The disk-error-handler is purely auxiliary thing that helps propagating IO errors to the rest of the code. It well deserves not sitting in the root namespace. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200207112443.18475-1-xemul@scylladb.com>	2020-02-09 17:26:52 +02:00
Piotr Jastrzebski	8813a6ca2a	index_reader: avoid copying schema to lambda Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-06 14:10:58 +01:00
Avi Kivity	e719ea1bba	Merge "Fix assert on initialization error" (in large_data_handler) from Rafael " This series fixes an assertion when initialization fails after creating a database. I don't know of a case where that currently happens, but it is easy to cause that when writing a patch and the produced assert is just confusing. " * 'espindola/dont-assert-on-init-error' of https://github.com/espindola/scylla: db: Replace large_data_handler::_stopped with _running db: Move nop_large_data_handler constructor out-of-line db: Move large_data_handler::stop out-of-line	2020-02-05 18:49:11 +02:00
Piotr Jastrzebski	1d1ac476c3	token: remove token_view Now that both token and token_view contain int64_t it makes no sense to keep the view. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-05 09:31:32 +01:00
Piotr Jastrzebski	06dfd16aad	sstables: use copy constructor for tokens instead of manually creating new token from another token internals. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-05 09:31:32 +01:00
Piotr Jastrzebski	05e0451b27	token: change _data to int64_t Previously _data was stored as array of 8 bytes in network byte order. After this change it stores the same value in int64_t in host byte order. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-05 09:31:32 +01:00
Piotr Jastrzebski	b569d127a0	token: change data to array<uint8_t, 8> It is safe to make this change because we support only Murmur3Partitioner, which uses only tokens that are 8 bytes long. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-05 09:30:46 +01:00
Rafael Ávila de Espíndola	5d4671526c	db: Replace large_data_handler::_stopped with _running This is not just a direct flip to a variable with the negated Boolean value. When created, a large_data_handler is not considered to be running, the user has to call start() before it can be used. The advantage of doing this is that if initialization fails and a database is destructed before the large_data_handler is started, the assert database::stop() { assert(!_large_data_handler->running()); is not triggered. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-02-04 21:15:44 -08:00
Pavel Emelyanov	0e62d615ae	storage_service: Prepare to switch from on-board feature helpers There are some places that get global storage_service instance for individual features. In the next patch all these helpers will be removed, so here's the preparation for it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-03 15:16:23 +03:00
Avi Kivity	adb64dc72f	treewide: tighten concepts syntax gcc 10 requires a semicolon after every compound requirement, as per the standard. Add missing semicolons where necessary. Message-Id: <20200129205805.20928-1-avi@scylladb.com>	2020-01-30 14:10:18 +02:00
Botond Dénes	dfc66194c8	index_reader: make the index file tracked Track I/O going to the index file, similarly to how we already track I/O going to the data file.	2020-01-28 08:13:16 +02:00
Botond Dénes	936619a8d3	sstables/continuous_data_consumer: track buffers used for parsing Based on heap profiling, buffers used for storing half-parsed fields are a major contributor to the overall memory consumption of reads. This memory was completely "under the radar" before. Track it by using tracked `temporary_buffer` instances everywhere in `continuous_data_consumer`. As `continuous_data_consumer` is the basis for parsing all index and data files, adding the tracking here automatically covers all data, index and promoted index parsing. I'm almost convinced that there is a better place to store the `permit` than the three places now, but so far I was unable to completely decipher our data/index file parsing class hierarchy.	2020-01-28 08:13:16 +02:00
Botond Dénes	dfc8b2fc45	treewide: replace reader_resource_tracker with reader_permit The former was never really more than a reader_permit with one additional method. Currently using it doesn't even save one from any includes. Now that readers will be using reader_permit we would have to pass down both to mutation_source. Instead get rid of reader_resource_tracker and just use reader_permit. Instead of making it a last and optional parameter that is easy to ignore, make it a first class parameter, right after schema, to signify that permits are now a prominent part of the reader API. This -- mostly mechanical -- patch essentially refactors mutation_source to ask for the reader_permit instead of reader_resource_tracker and updates all usage sites.	2020-01-28 08:13:16 +02:00
Botond Dénes	a74a82d4d2	flat_mutation_reader: mutation_fragment_stream_validator: add name Add a name parameter to the validator, so that the validator can be identified in log messages. Schema identity information is added to the name automatically. This should help pinpoint the problematic place where validation failed. Although at the moment we have a single validator, it still benefits from having a name, as we can now include in it the name of the sstable being written and hence trace the source of the bad data. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200117150616.895878-1-bdenes@scylladb.com>	2020-01-20 11:06:30 +01:00
Raphael S. Carvalho	390c8b9b37	sstables: Move STCS implementation to source file A header-only implementation can potentially create a problem with duplicate symbols. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200107154258.9746-1-raphaelsc@scylladb.com>	2020-01-08 09:55:35 +02:00
Avi Kivity	e5e42672f5	sstables: reduce bloat from sstables::write_simple() sstables::write_simple() has quite a lot of boilerplate which gets replicated into each template instance. Move all of that into a non-template do_write_simple(), leaving only things that truly depend on the component being written in the template, and encapsulating them with a noncopyable_function. An explicit template instantiation was added, since this is used in a header file. Before, it likely worked by accident and stopped working when the template became small enough to inline. Tests: unit (dev) Message-Id: <20200106135453.1634311-1-avi@scylladb.com>	2020-01-07 11:56:11 +01:00
Rafael Ávila de Espíndola	75817d1fe7	sstable: Add checks to help track problems with large_data_handler use after free I can't quite figure out how we were trying to write a sstable with the large data handler already stopped, but the backtrace suggests a good place to add extra checks. This patch adds two checks. One at the start and one at the end of sstable::write_components. The first one should give us better backtraces if the large_data_handler is already stopped. The second one should help catch some race condition. Refs: #5470 Message-Id: <20191231173237.19040-1-espindola@scylladb.com>	2020-01-01 12:03:31 +02:00
Benny Halevy	abda12107f	sstables: move_to_new_dir: add do_sync_dirs param To be used for "batch" move of several sstables from staging to the base directory, allowing the caller to sync the directories once when all are moved rather than for each one of them. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-12-17 12:20:20 +02:00
Benny Halevy	6efef84185	sstable: return future from move_to_new_dir distributed_loader::probe_file needlessly creates a seastar thread for it and the next patch will use it as part of a parallel_for_each loop to move a list of sstables (and sync the directories once at the end). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-12-17 12:20:20 +02:00
Pavel Solodovnikov	2f442f28af	treewide: add const qualifiers throughout the code base	2019-11-26 02:24:49 +03:00
Benny Halevy	f9e93bba38	sstables: compaction: move cleanup parameter to compaction_descriptor Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20191117165806.3234-1-bhalevy@scylladb.com>	2019-11-18 10:52:20 +01:00
Nadav Har'El	2fb2eb27a2	sstables: allow non-traditional characters in table name The goal of this patch is to fix issue #5280, a rather serious Alternator bug, where Scylla fails to restart when an Alternator table has secondary indexes (LSI or GSI). Traditionally, Cassandra allows table names to contain only alphanumeric characters and underscores. However, most of our internal implementation doesn't actually have this restriction. So Alternator uses the characters ':' and '!' in the table names to mark global and local secondary indexes, respectively. And this actually works. Or almost... This patch fixes a problem of listing, during boot, the sstables stored for tables with such non-traditional names. The sstable listing code needlessly assumes that the directory name, i.e., the CF names, matches the "\w+" regular expression. When an sstable is found in a directory not matching such regular expression, the boot fails. But there is no real reason to require such a strict regular expression. So this patch relaxes this requirement, and allows Scylla to boot with Alternator's GSI and LSI tables and their names which include the ":" and "!" characters, and in fact any other name allowed as a directory name. Fixes #5280. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20191114153811.17386-1-nyh@scylladb.com>	2019-11-17 14:27:47 +02:00
Avi Kivity	27ef73f4f1	Merge "Report file I/O in CQL tracing when reading from sstables." from Kamil " Introduce the traced_file class which wraps a file, adding CQL trace messages before and after every operation that returns a future. Use this file to trace reads from SSTable data and index files. Fixes #4908. " * 'traced_file' of https://github.com/kbr-/scylla: sstables: report sstable index file I/O in CQL tracing sstables: report sstable data file I/O in CQL tracing tracing: add traced_file class	2019-10-26 22:53:37 +03:00
Kamil Braun	432ef7c9af	sstables: report sstable index file I/O in CQL tracing Use tracing::make_traced_file when reading from the index file in index_reader.	2019-10-25 14:10:28 +02:00


1965 Commits