scylladb

Author	SHA1	Message	Date
Glauber Costa	e29701ca1c	compaction_manager: expand state to be able to differentiate between enabled and stopped We are having many issues with the stop code in the compaction_manager. Part of the reason is that the "stopped" state has its meaning overloaded to indicate both "compaction manager is not accepting compactions" and "compaction manager is not ready or destructed". In a later step we could default to enabled-at-start, but right now we maintain current behavior to minimize noise. It is only possible to stop the compaction manager once. It is possible to enable / disable the compaction manager many times. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-05-13 16:51:25 -04:00
Glauber Costa	70a89ab4ab	compaction: do not assume I/O priority class We shouldn't assume the I/O priority class for compactions. For instance, if we are dealing with offstrategy compactions we may want to use the maintenance group priority for them. For now, all compactions are put in the compaction class. Rewrite compactions (scrub, cleanup) could be maintenance, but we don't have clear access to the database object at this time to derive the equivalent CPU priority. This is planned to be changed in the future, and when we do change it, we'll adjust. Same goes for resharding: while we could change it at this point, we'd risk memory pressure since resharding is run online and sstables are shared until resharding is done. When we move it to offline execution we'll do it with maintenance priority. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200512002233.306538-3-glauber@scylladb.com>	2020-05-12 08:23:19 +03:00
Raphael S. Carvalho	88d2486fca	sstables: Synchronize deletion of SSTables in resharding with other operations Input SSTables of resharding are deleted at the coordinator shard, not at the shards they belong to. We're not acquiring the deletion semaphore before removing those input SSTables from the SSTable set, so it could happen that resharding deletes those SSTables while another operation like snapshot, which acquires the semaphore, finds them deleted. Let's acquire the deletion semaphore so that the input SSTables will only be removed from the set when we're certain that nobody is relying on their existence anymore. Now resharding will only delete input SSTables after they're safely removed from the SSTable set of all shards they belong to. unit: test(dev). Fixes #6328. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200507233636.92104-1-raphaelsc@scylladb.com>	2020-05-10 10:50:32 +03:00
Calle Wilund	040ffa6e64	distributed_loader: Add concurrency control override for named keyspaces Fixes #6202 Distributed loader sstable opening is gated through the database::sstable_load_concurrency_sem() semaphore (at a concurrency of 3). This is (according to creation comment) to reduce memory footprint during bootstrap, by partially serializing the actual opening of existing sstables. However, in certain versions of the product, there exist circular dependencies between data in some sstables and the ability to actually read others. Thus when gated as above, we can end up with the dependents acquiring the semaphore fully, and once stuck waiting for population of their dependency effectively blocking this from ever happening. Since we probably do not want to remove the concurrency control, and increasing it would only push the problem further away, we solve the issue by adding the ability to mark certain keyspaces as "prioritized" (pre-bootstrap), and allow them to populate outside the normal concurrency control semaphore. Concurrency increase is however limited to one extra sstable per shard and prio keyspace. Message-Id: <20200415102431.20816-1-calle@scylladb.com>	2020-04-27 16:21:13 +03:00
Glauber Costa	05efd6a5e9	resharding: get rid of special reshard_sstables There is a method, reshard_sstables(), whose sole purpose is to call a resharding compaction. There is nothing special about this method: all the information it needs is now present in the compaction_descriptor. This patch extends the compaction_options class to recognize resharding compactions as well, and uses that so that make_compaction() can also create resharding compactions. To make that happen we have to create a compaction_descriptor object in the resharding method. Note however that resharding works by passing an object very close to the compaction_descriptor around. Once this patch is merged, a logical next step is to reuse it, and avoid creating the descriptor right before calling compact_sstables(). Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-03-31 19:57:53 -04:00
Rafael Ávila de Espíndola	c5795e8199	everywhere: Replace engine().cpu_id() with this_shard_id() This is a bit simpler and might allow removing a few includes of reactor.hh. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200326194656.74041-1-espindola@scylladb.com>	2020-03-27 11:40:03 +03:00
Pavel Emelyanov	7363d56946	sstables: Move get_highest_supported_format The global get_highest_supported_format helper and its declaration are scattered all over the code, so clean this up and prepare the ground for moving _sstables_format from the storage_service onto the sstables_manager (not this set). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-25 14:31:45 +03:00
Calle Wilund	af963e76c7	keyspace/distributed_loader: Add wait for (user) keyspace population to finish Allows caller to check/wait for a given user keyspace to finish populating on boot. Can be called at any time, though if called before population starts, it will wait until it either starts and we can determine that the keyspace does not need populating, or population finishes. tests: unit Message-Id: <20200203151712.10003-1-calle@scylladb.com>	2020-02-09 18:56:22 +02:00
Pavel Emelyanov	5cf365d7e7	database: Explicitly pass migration_manager through init_non_system_keyspace This is the last place where database code needs the migration_manager instance to be alive, so now the mutual dependency between these two is gone, only the migration_manager needs the database, but not the vice-versa. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-01-15 14:29:21 +03:00
Benny Halevy	6efef84185	sstable: return future from move_to_new_dir distributed_loader::probe_file needlessly creates a seastar thread for it and the next patch will use it as part of a parallel_for_each loop to move a list of sstables (and sync the directories once at the end). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-12-17 12:20:20 +02:00
Raphael S. Carvalho	3e70523111	distributed_loader: Release disk space of SSTables deleted by resharding Resharding is responsible for scheduling the deletion of resharded sstables, but it was not refreshing the cache of the shards those sstables belong to, which means the cache was incorrectly holding references to them even after they were deleted. The consequence is that sstables deleted by resharding do not have their disk space freed until the cache is refreshed by a subsequent procedure that triggers it. Fixes #5261. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20191107193550.7860-1-raphaelsc@scylladb.com>	2019-11-13 16:03:27 +02:00
Piotr Dulikowski	c04e8c37aa	distributed_loader: populate non-system keyspaces in parallel Before this change, when populating non-system keyspaces, each data directory was scanned and for each entry (keyspace directory), a keyspace was populated. This was done in a serial fashion - populating of one keyspace was not started until the previous one was done. Loading keyspaces in such fashion can introduce unnecessary waiting in case of a large number of keyspaces in one data directory. Population process is I/O intensive and barely uses CPU. This change enables parallel loading of keyspaces per data directory. Populating the next keyspace does not wait for the previous one. A benchmark was performed measuring startup time, with the following setup: - 1 data directory, - 200 keyspaces, - 2 tables in each keyspace, with the following schema: CREATE TABLE tbl (a int, b int, c int, PRIMARY KEY(a, b)) WITH CLUSTERING ORDER BY (b DESC), - 1024 rows in each table, with values (i, 2i, 3i) for i in 0..1023, - ran on 6-core virtual machine running on i7-8750H CPU, - compiled in dev mode, - parameters: --smp 6 --max-io-requests 4 --developer-mode=yes --datadir $DIR --commitlog-directory $DIR --hints-directory $DIR --view-hints-directory $DIR The benchmark tested: - boot time, by comparing timestamp of the first message in log, and timestamp of the following message: "init - Scylla version ... initialization completed." - keyspace population time, by comparing timestamps of messages: "init - loading non-system sstables" and "init - starting view update generator" The benchmark was run 5 times for sequential and parallel version, with the following results: - sequential: boot 31.620s, keyspace population 6.051s - parallel: boot 29.966s, keyspace population 4.360s Keyspace population time decreased by ~27.95%, and overall boot time by about ~5.23%. Tests: unit(release) Fixes #2007	2019-10-10 15:12:23 +03:00
Botond Dénes	136fc856c5	treewide: silence discarded future warnings for questionable discards This patch silences the remaining discarded future warnings, those where it cannot be determined with reasonable confidence that this was indeed the actual intent of the author, or that the discarding of the future could lead to problems. For all those places a FIXME is added, with the intent that these will be soon followed-up with an actual fix. I deliberately haven't fixed any of these, even if the fix seems trivial. It is too easy to overlook a bad fix mixed in with so many mechanical changes.	2019-08-26 19:28:43 +03:00
Botond Dénes	fddd9a88dd	treewide: silence discarded future warnings for legit discards This patch silences those future discard warnings where it is clear that discarding the future was actually the intent of the original author, and they did the necessary precautions (handling errors). The patch also adds some trivial error handling (logging the error) in some places, which were lacking this, but otherwise look ok. No functional changes.	2019-08-26 18:54:44 +03:00
Avi Kivity	1ed3356e0e	main: relax file ownership checks if running under euid 0 During startup, we check that the data files are owned by our euid. But in a container environment, this is impossible to enforce because uid/username mappings are different between the host and the container, and the data files are likely to be mounted from the host. To allow for such environments, relax the checks if euid=0. This both matches what happens in a container (programs run as root) and the kernel access checks (euid 0 can do anything). We can reconsider this when container uid mapping is better developed. Fixes #4823. Fixes #4536.	2019-08-13 14:36:08 +03:00
Glauber Costa	2008d982c3	lister: don't crash the node on failure to remove snapshot lister::rmdir has two in-tree users: clearing snapshots and clearing temporary directories during sstable creation. The way it is currently coded, it wraps the io functions in io_check, which means that failures to remove the directory will crash the database. We recently saw how benign failures crashed a database during clearsnapshot: we had snapshot creation running in parallel, adding more files to the directory that wasn't empty by the time of deletion. I have also seen very often users add files to existing directories by accident, which is another possibility to trigger that. This patch removes the io_check from lister, and moves it to the caller in which we want to be more strict. We still want to be strict about the creation of temporary directories, since users shouldn't be touching that in any way. Fixes #4558 Signed-off-by: Glauber Costa <glauber@scylladb.com>	2019-07-16 13:35:36 -04:00
Benny Halevy	5a99023d4a	treewide: use lambda for io_check of *touch_directory To prepare for a seastar change that adds an optional file_permissions parameter to touch_directory and recursive_touch_directory. This change messes up the call to io_check since the compiler can't derive the Func&& argument. Therefore, use a lambda function instead to wrap the call to {recursive_,}touch_directory. Ref #4395 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20190421085502.24729-1-bhalevy@scylladb.com>	2019-04-21 12:04:39 +03:00
Benny Halevy	9785754e0d	distributed_loader: do not follow symlinks when verifying mode and owner We allow only regular files and directories, so to detect symlinks we must not follow them. Fixes #4375 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20190418051627.9298-1-bhalevy@scylladb.com>	2019-04-19 11:47:40 +03:00
Benny Halevy	3749148339	storage_service: fix handling of load_new_sstables exception ignore_ready_future in load_new_ss_tables broke migration_test:TestMigration_with_*.migrate_sstable_with_counter_test_expect_fail dtests. The java.io.NotSerializableException in nodetool was caused by exceptions that were too long. This fix prints the problematic file names onto the node system log and includes the cause in the resulting exception so as to provide the user with information about the nature of the error. Fixes #4375 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20190331154006.12808-1-bhalevy@scylladb.com>	2019-04-02 11:46:19 +03:00
Benny Halevy	e3f7fe44c0	init: validate file ownership and mode. Files and directories must be owned by the process uid. Files must have read access and directories must have read, write, and execute access. Refs #3117 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-03-28 14:40:12 +02:00
Benny Halevy	223e1af521	sstables: provide large_data_handler to constructor And use it for writing the sstable and/or when deleting it. Refs #4198 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-03-26 16:24:19 +02:00
Benny Halevy	adf8428321	sstables: make load_shared_components a method of sstable and open code its static part in the caller (distributed_loader) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-03-26 16:05:08 +02:00
Benny Halevy	3a17053cb8	database: add table::make_sstable helper In most cases we make a sstable based on the table schema and soon - large_data_handler. Encapsulate that in a make_sstable method. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-03-26 16:05:08 +02:00
Benny Halevy	67f705ae04	distributed_loader: pass column_family to load_sstables_with_open_info Rather than just its schema. In preparation for adding table::make_sstable Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-03-26 16:05:08 +02:00
Benny Halevy	99875ba966	distributed_loader: no need for forward declaration of load_sstables_with_open_info Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-03-26 16:05:08 +02:00
Benny Halevy	7a8ab1d6f1	distributed_loader: reshard: use default params for make_sstable Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-03-26 16:05:08 +02:00
Benny Halevy	564be8b720	distributed_loader::load_new_sstables: handle exceptions in open_sstable Propagate exception to caller. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-03-24 18:25:09 +02:00
Piotr Sarna	986004a959	loader: move uploaded view pending sstables to staging When loading tables uploaded via `nodetool refresh`, they used to be left in upload/ directory if view updates would need to be generated from them. Since view update generation is asynchronous, sstables left in the directory could erroneously get overwritten by the user, who decides to upload another batch of sstables and some of the names collided. To remedy this, uploaded sstables that need view updates are moved to staging/ directory with a unique generation number, where they await view update generation. Fixes #4047	2019-03-20 13:44:29 +01:00
Benny Halevy	1021eb29c9	distributed_loader: fix old format counters exception table::load_sstable: fix missing arg in old format counters exception Properly catch and log the exception in load_new_sstables. Abort when the exception is caught to keep current behavior. Seen with migration_test:TestMigration_with_2_1_x.migrate_sstable_with_counter_test without enable_dangerous_direct_import_of_cassandra_counters. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20190301091235.2914-1-bhalevy@scylladb.com>	2019-03-04 17:36:09 +01:00
Benny Halevy	043673b236	distributed_loader: replay and cleanup pending_delete log files Scan the table's pending_delete sub-directory if it exists. Remove any temporary pending_delete log files to roll back the respective delete_atomically operation. Replay completed pending_delete log files to roll forward the respective delete_atomically operation, and finally delete the log files. Cleanup of temporary sstable directories and pending_delete sstables are done in a preliminary scan phase when populating the column family so that we won't attempt to load the to-be-deleted sstables. Fixes #4082 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-02-22 11:08:22 +02:00
Benny Halevy	ee3ad75492	distributed_loader: populated_column_family: separate temp sst dirs cleanup phase In preparation for replaying pending_delete log files, we would like to first remove any temporary sst dirs and later handle pending_delete log files, and only then populate the column family. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-02-22 11:08:22 +02:00
Glauber Costa	e0bfd1c40a	allow Cassandra SSTables with counters to be imported if they are new enough Right now Cassandra SSTables with counters cannot be imported into Scylla. The reason for that is that Cassandra changed their counter representation in their 2.1 version and kept transparently supporting both representations. We do not support their old representation, nor is there a sane way to figure out by looking at the data which one is in use. For safety, we had made the decision long ago to not import any tables with counters: if a counter was generated in older Cassandra, we would misrepresent them. In this patch, I propose we offer a non-default way to import SSTables with counters: we can gate it with a flag, and trust that the user knows what they are doing when flipping it (at their own peril). Cassandra 2.1 is by now pretty old. Many users can safely say they've never used anything older. While there are tools like sstableloader that can be used to import those counters, there are often situations in which directly importing SSTables is either better, faster, or worse: the only option left. I argue that having a flag that allows us to import them when we are sure it is safe is better than having no option at all. With this patch I was able to successfully import Cassandra tables with counters that were generated in Cassandra 2.1, reshard and compact their SSTables, and read the data back to get the same values in Scylla as in Cassandra. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190210154028.12472-1-glauber@scylladb.com>	2019-02-10 17:50:48 +02:00
Rafael Ávila de Espíndola	625080b414	Rename large_partition_handler Now that it also handles large rows, rename it to large_data_handler. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-01-28 15:03:14 -08:00
Benny Halevy	74ef09a3a2	distributed_loader: populate_column_family should scan directories too To detect and cleanup leftover temporary sstable directories. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-01-27 14:14:32 +02:00
Benny Halevy	bd85975277	sstables: fix is_temp_dir 1. fs::canonical required that the path will exist. and there is no need for fs::canonical here. 2. fs::path::extension will return the leading dot. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-01-27 14:14:32 +02:00
Benny Halevy	c2a5f3b842	distributed_loader: populate_column_family: ignore directories other than sstable::is_temp_dir populate_column_family currently lists only regular files, ignoring all directories. A later patch in this series allows it to also list directories so as to clean up the temporary sstable directories, yet valid sub-directories, like staging|upload|snapshots, may still exist and need to be ignored. Other kinds of handling, like validating recognized sub-directories and halting on unrecognized sub-directories, are possible, yet out of scope for this patch(set). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-01-27 14:14:32 +02:00
Benny Halevy	9bd7b2f4e6	distributed_loader: remove temporary sstable directories only on shard 0 Similar to calling remove_sstable_with_temp_toc later on in populate_column_family(), we need only one thread to do the cleanup work and the existing convention is that it's shard 0. Since lister::rmdir is checking remove_file of all entries (recursively) and the dir itself, doing that concurrently would fail. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-01-27 14:14:32 +02:00
Benny Halevy	bcfb2e509b	distributed_loader: push future returned by rmdir into futures vector	2019-01-27 14:14:32 +02:00
Piotr Sarna	5d76a635ca	distributed_loader: migrate flush_upload_dir to thread Flushing upload dir code suffers from overcomplication, so in order to make it a little bit simpler, it's moved to threaded context. Refs #4118 Message-Id: <232cca077bae7116cfa87de9c9b4ba60efc2a01d.1548077720.git.sarna@scylladb.com>	2019-01-21 15:48:17 +02:00
Tomasz Grabiec	d7c701d2d1	Merge "Type-erase gratuitous templates with functions" from Avi Many areas of the code are splattered with unneeded templates. This patchset replaces some of them, where the template parameter is a function object, with an std::function or noncopyable_function (with a preference towards the latter; but it is not always possible). As the template is compiled for each instantiation (if the function object is a lambda) while a function is compiled only once, there are significant savings in compile time and bloat. text data bss dec hex filename 85160690 42120 284910 85487720 5187068 scylla.before 84824762 42120 284910 85151792 5135030 scylla.after * https://github.com/avikivity/scylla detemplate/v2: api/commitlog: de-template acquire_cl_metric() database: de-template do_parse_schema_tables database: merge for_all_partitions and for_all_partitions_slow hints: de-template scan_for_hints_dirs() schema_tables: partially de-template make_map_mutation() distributed_loader: de-template tests: commitlog_test: de-template tests: cql_auth_query_test: de-template test: de-template eventually() and eventually_true() tests: flush_queue_test: de-template hint_test: de-template tests: mutation_fragment_test: de-template test: mutation_test: de-template	2019-01-21 11:32:22 +01:00
Avi Kivity	baf9480c8d	distributed_loader: de-template distributed_loader has several large templates that can be converted to normal function with the help of noncopyable_function<>, reducing code bloat. One of the lambdas used as an actual argument was adjusted, because the de-templated callee only accepts functions returning a future, while the original accepted both functions returning a future and functions returning void (similar to future::then).	2019-01-20 15:55:20 +02:00
Avi Kivity	6e6372e8d2	Revert "Merge "Type-eaese gratuitous templates with functions" from Avi" This reverts commit `31c6a794e9`, reversing changes made to `4537ec7426`. It causes bad_function_calls in some situations: INFO 2019-01-20 01:41:12,164 [shard 0] database - Keyspace system: Reading CF sstable_activity id=5a1ff267-ace0-3f12-8563-cfae6103c65e version=d69820df-9d03-3cd0-91b0-c078c030b708 INFO 2019-01-20 01:41:13,952 [shard 0] legacy_schema_migrator - Moving 0 keyspaces from legacy schema tables to the new schema keyspace (system_schema) INFO 2019-01-20 01:41:13,958 [shard 0] legacy_schema_migrator - Dropping legacy schema tables INFO 2019-01-20 01:41:14,702 [shard 0] legacy_schema_migrator - Completed migration of legacy schema tables ERROR 2019-01-20 01:41:14,999 [shard 0] seastar - Exiting on unhandled exception: std::bad_function_call (bad_function_call)	2019-01-20 11:32:14 +02:00
Piotr Sarna	3d65eb5d4a	distributed_loader: restore indentation	2019-01-18 10:59:37 +01:00
Piotr Sarna	e50e9b5150	distributed_loader: restore always mutating to level 0 When introducing view update generation path for sstables in /upload directory, mutating these sstables was moved to regular path only. It was wrong, because sstables that need view updates generated from them may still need to be downgraded to LCS level 0, so they won't disrupt LCS assumptions after being loaded. Reported-by: Nadav Har'El <nyh@scylladb.com>	2019-01-18 10:35:20 +01:00
Avi Kivity	b6239134c2	distributed_loader: de-template distributed_loader has several large templates that can be converted to normal function with the help of noncopyable_function<>, reducing code bloat.	2019-01-17 18:56:22 +02:00
Piotr Sarna	0eb703dc80	all: rename view_update_from_staging_generator The new name, view_update_generator, is both more concise and correct, since we now generate from directories other than "/staging".	2019-01-15 17:31:47 +01:00
Piotr Sarna	a5d24e40e0	distributed_loader: fix indentation Bad indentation was introduced in the previous commit.	2019-01-15 17:31:37 +01:00
Piotr Sarna	13c8c84045	service: add generating view updates from uploaded sstables SSTables loaded to the system via /upload dir may sometimes be needed to generate view updates from them (if their table has accompanying views). Fixes #4047	2019-01-15 17:31:37 +01:00
Piotr Sarna	76616f6803	distributed_loader: use proper directory for opening SSTable Previous implementation assumes that each SSTable resides directly in table::datadir directory, while what should actually be used is directory path from SSTable descriptor. This patch prevents a regression when adding staging sstables support for upload/ dir.	2019-01-15 16:47:01 +01:00
Duarte Nunes	b851cb1a9a	distributed_loader: Forbid uploading MV sstables Instead suggest that the views be re-created. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20190103142933.35354-1-duarte@scylladb.com>	2019-01-03 16:31:20 +02:00

51 Commits