scylladb

Author	SHA1	Message	Date
Benny Halevy	4439e5c132	everywhere: cleanup defer.hh includes Get rid of unused includes of seastar/util/{defer,closeable}.hh and add a few that are missing from source files. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-22 21:11:39 +03:00
Tomasz Grabiec	b044db863f	Merge 'db/virtual_table: Streaming tables for large data + describe_ring example table' from Juliusz Stasiewicz This is the 2nd PR in series with the goal to finish the hackathon project authored by @tgrabiec, @kostja, @amnonh and @mmatczuk (improved virtual tables + function call syntax in CQL). This one introduces a new implementation of the virtual tables, the streaming tables, which are suitable for large amounts of data. This PR was created by @jul-stas and @StarostaGit Closes #8961 * github.com:scylladb/scylla: test/boost: run_mutation_source_tests on streaming virtual table system_keyspace: Introduce describe_ring table as virtual_table storage_service: Pass the reference down to system_keyspace endpoint_details: store `_host` as `gms::inet_address` queue_reader: implement next_partition() virtual_tables: Introduce streaming_virtual_table flat_mutation_reader: Add a new filtering reader factory method	2021-07-23 18:05:51 +02:00
Raphael S. Carvalho	aad72289e2	table: Kill load_sstable() That function is dangerously used by distributed loader, as the latter was responsible for invalidating cache for new sstable. load_sstable() is an unsafe alternative to add_sstable_and_update_cache() that should never have been used by the outside world. Instead, let's kill it and make loader use the safe alternative instead. This will also make it easier to make sure that all concurrent updates to sstable set are properly serialized. Additionally, this may potentially reduce the amount of data evicted from the cache, when the sstables being imported have a narrow range, like high level sstables imported from a LCS table. Unlikely but possible. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210721131949.26899-1-raphaelsc@scylladb.com>	2021-07-21 16:21:42 +03:00
Juliusz Stasiewicz	f8067d938d	storage_service: Pass the reference down to system_keyspace According to the policy of avoiding globals.	2021-07-20 14:18:24 +02:00
Raphael S. Carvalho	1924e8d2b6	treewide: Move compaction code into a new top-level compaction dir Since compaction is layered on top of sstables, let's move all compaction code into a new top-level directory. This change will give me extra motivation to remove all layer violations, like sstable calling compaction-specific code, and compaction entanglement with other components like table and storage service. Next steps: - remove all layer violations - move compaction code in sstables namespace into a new one for compaction. - move compaction unit tests into its own file Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210707194058.87060-1-raphaelsc@scylladb.com>	2021-07-07 23:21:51 +03:00
Raphael S. Carvalho	88119a5c81	distributed_loader: Kill table's _sstables_opened_but_not_loaded _sstables_opened_but_not_loaded was needed because the old loader would open sstables from all shards before loading them. In the new loader, introduced with reshape, make_sstables_available() is called on each shard after resharding and reshape finished, so there's no need whatsoever for that mess. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210618200026.1002621-1-raphaelsc@scylladb.com>	2021-06-24 12:03:26 +03:00
Pavel Emelyanov	7396de72b1	distributed_loader: Remove unused load-prio manipulations Mostly this was removed by `6dfeb107` (distributed_loader: remove unused code). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-06-18 20:19:35 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Solodovnikov	fff7ef1fc2	treewide: reduce boost headers usage in scylla header files `dev-headers` target is also ensured to build successfully. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-05-20 01:33:18 +03:00
Kamil Braun	617813ba66	sys_dist_ks: new keyspace for system tables with Everywhere strategy `system_distributed_everywhere` is a new keyspace that uses Everywhere replication strategy. This is useful, for example, when we want to store internal data that should be accessible by every node; the data can be written using CL=ALL (e.g. during node operations such as node bootstrap, which require all nodes to be alive - at least currently) and then read by each node locally using CL=ONE (e.g. during node restarts). Closes #8457	2021-04-19 11:22:57 +03:00
Raphael S. Carvalho	bb9a109c1a	distributed_loader: inform which table is being resharded Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210330163956.60585-1-raphaelsc@scylladb.com>	2021-04-01 13:08:59 +03:00
Gleb Natapov	d8345c67d9	Consolidate system and non system keyspace creation The code that creates system keyspace open-codes a lot of things from database::create_keyspace(). The patch makes create_keyspace() suitable for both system and non system keyspaces and uses it to create system keyspaces as well. Message-Id: <20210209160506.1711177-1-gleb@scylladb.com>	2021-02-09 17:18:04 +01:00
Gleb Natapov	b9a5aff7a6	distributed_loader: drop execute_futures function execute_futures() is just a local reimplementation of when_all_succeed(). Use the latter directly. Message-Id: <20210208114816.GA1658725@scylladb.com>	2021-02-08 13:24:19 +01:00
Avi Kivity	df3ef800c2	Merge 'Introduce load and stream feature' from Asias He storage_service: Introduce load_and_stream === Introduction === This feature extends the nodetool refresh to allow loading arbitrary sstables that do not belong to a node into the cluster. It loads the sstables from disk and calculates the owning nodes of the data and streams to the owners automatically. For example, say the old cluster has 6 nodes and the new cluster has 3 nodes. We can copy the sstables from the old cluster to any of the new nodes and trigger the load and stream process. This can make restores and migrations much easier. === Performance === I managed to get 40MB/s per shard on my build machine. CPU: AMD Ryzen 7 1800X Eight-Core Processor DISK: Samsung SSD 970 PRO 512GB Assume 1TB sstables per node, each shard can do 40MB/s, each node has 32 shards, we can finish the load and stream 1TB of data in 13 mins on each node. 1TB / (40 MB/s per shard * 32 shards) / 60 s per min = 13 mins === Tests === backup_restore_tests.py:TestBackupRestore.load_and_stream_to_new_cluster_test which creates a cluster with 4 nodes and inserts data, then uses load_and_stream to restore to a 2-node cluster. === Usage === curl -X POST "http://{ip}:10000/storage_service/sstables/{keyspace}?cf={table}&load_and_stream=true" === Notes === Btw, with the old nodetool refresh, the node will not pick up the data that does not belong to this node but it will not delete it either. One has to run nodetool cleanup to remove those data manually which is a surprise to me and probably to users as well. With load and stream, the process will delete the sstables once it finishes streaming, so no nodetool cleanup is needed. The name of this feature load and stream follows load and store in CPU world. Fixes #7831 Closes #7846 * github.com:scylladb/scylla: storage_service: Introduce load_and_stream distributed_loader: Add get_sstables_from_upload_dir table: Add make_streaming_reader for given sstables set	2021-01-18 15:08:19 +02:00
Asias He	28007f13f8	distributed_loader: Add get_sstables_from_upload_dir This function scans sstables under the upload directory and returns a list of sstables for each shard. Refs #7831	2021-01-16 20:03:17 +08:00
Piotr Sarna	f293c59a46	system_keyspace: migrate helper functions to string_view Functions for checking if the keyspace is system/internal were based on sstring references, which is impractical compared to string views and may lead to unnecessary creation of sstring instances.	2021-01-04 09:47:01 +01:00
Raphael S. Carvalho	198b87503f	row_cache: allow external updater to decouple preparation from execution External updater may do some preparatory work like constructing a new sstable list, and at the end atomically replace the old list by the new one. Decoupling the preparation from execution will give us the following benefits: - the preparation step can now yield if needed to avoid reactor stalls, as it's been futurized. - the execution step will now be able to provide strong exception guarantees, as it's now decoupled from the preparation step which can be non-exception-safe. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-12-28 13:17:45 -03:00
Benny Halevy	57cc5f6ae1	sstable_directory: use an external load_semaphore Although each sstable_directory limits concurrency using max_concurrent_for_each, there could be a large number of calls to do_for_each_sstable running in parallel (e.g. per keyspace X per table in the distributed_loader). To cap parallelism across sstable_directory instances and concurrent calls to do_for_each_sstable, start a sharded<semaphore> and pass a shared semaphore& to the sstable_directory:s. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-08 11:57:06 +03:00
Benny Halevy	f4269e3a04	distributed_loader: process_upload_dir: use initial_sstable_loading_concurrency Although process_upload_dir is not called when initially loading the tables, but rather from storage_service::load_new_sstables, it can use the same sstable_loading_concurrency, rather than constant `4`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-07 14:45:20 +03:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously `count` function was often used in various ways. `contains` not only expresses the intent of the code better but also does it in a more unified way. This commit replaces all the occurrences of the `count` with the `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Piotr Jastrzebski	80e3923b3c	codebase wide: replace find(...) != end() with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously the code pattern looked like: <collection>.find(<element>) != <collection>.end() In C++20 the same can be expressed with: <collection>.contains(<element>) This is not only more concise but also expresses the intent of the code more clearly. This commit replaces all the occurrences of the old pattern with the new approach. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>	2020-08-11 13:28:50 +03:00
Benny Halevy	78595303f9	sstable: remove_by_toc_name: make static It's not called outside of sstables code anymore. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-09 12:04:36 +03:00
Benny Halevy	d4615f4293	sstables: sstable_version_types: implement operator<=> Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200707061715.578604-1-bhalevy@scylladb.com>	2020-07-08 14:23:11 +03:00
Rafael Ávila de Espíndola	400212e81f	auth: Convert sstring variables in common.hh to constexpr std::string_view This converts the following variables: DEFAULT_SUPERUSER_NAME AUTH_KS USERS_CF AUTH_PACKAGE_NAME Since they are now constexpr they will not be part of any initialization order problems. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-03 12:35:58 -07:00
Raphael S. Carvalho	18880af9ad	distributed_loader: kill unused invoke_shards_with_ptr() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:23:50 -03:00
Raphael S. Carvalho	6dfeb107ae	distributed_loader: remove unused code Remove code no longer used by population procedure. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:21:40 -03:00
Raphael S. Carvalho	39f96a5572	distributed_loader: Don't mutate levels to zero when populating column family Unlike refresh on upload dir, column family population shouldn't mutate level of SSTables to level 0. Otherwise, LCS will have to regenerate all levels by rewriting the data multiple times, hurting a lot the write amplification and consequently the node performance. That's also affecting the time for a node to boot because reshape may be triggered as a result of this. Refs #6695. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200622192502.187532-2-raphaelsc@scylladb.com>	2020-06-23 19:40:18 +03:00
Avi Kivity	de38091827	priority_manager: merge streaming_read and streaming_write classes into one class Streaming is handled by just one group for CPU scheduling, so separating it into read and write classes for I/O is artificial, and inflates the resources we allow for streaming if both reads and writes happen at the same time. Merge both classes into one class ("streaming") and adjust callers. The merged class has 200 shares, so it reduces streaming bandwidth if both directions are active at the same time (which is rare; I think it only happens in view building).	2020-06-22 15:09:04 +03:00
Benny Halevy	a3918bdc96	distributed_loader: reenable verify_owner_and_mode when loading new sstables The call to `verify_owner_and_mode` from `flush_upload_dir` fell between the cracks in `b34c0c2ff6` (distributed_loader: rework uploading of SSTables). It causes https://jenkins.scylladb.com/view/master/job/scylla-master/job/dtest-release/528/testReport/nodetool_additional_test/TestNodetool/nodetool_refresh_with_wrong_upload_modes_test/ to fail like this: ``` /Directory cannot be accessed .* write/ not found in 'Nodetool command '/jenkins/workspace/scylla-master/dtest-release/scylla/.ccm/scylla-repository/7351db7cab7bbf907172940d0bbf8b90afde90ba/scylla-tools-java/bin/nodetool -h 127.0.87.1 -p 7187 refresh -- keyspace1 standard1' failed; exit status: 1; stdout: nodetool: Scylla API server HTTP POST to URL '/storage_service/sstables/keyspace1' failed: Failed to load new sstables: std::filesystem::__cxx11::filesystem_error (error system:13, filesystem error: remove failed: Permission denied [/jenkins/workspace/scylla-master/dtest-release/scylla/.dtest/dtest-rqzo7km7/test/node1/data/keyspace1/standard1-8a57a660b29611eabf0c000000000000/upload/mc-3-big-TOC.txt]) ``` Reenable it in this patch makes the dtest pass again. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200621140439.85843-1-bhalevy@scylladb.com>	2020-06-22 14:03:13 +03:00
Glauber Costa	e40aa042a7	distributed_loader: reshard before the node is made online This patch moves the resharding process to use the new directory_with_sstables_handler infrastructure. There is no longer a clear reshard step, and that just becomes a natural part of populate_column_family. In main.cc, a couple of changes are necessary to make that happen. The first one obviously is to stop calling reshard. We also need to make sure that: - The compaction manager is started much earlier, so we can register resharding jobs with it. - auto compactions are disabled in the populate method, so resharding doesn't have to fight for bandwidth with auto compactions. Now that we are resharding through the sstable_directory, the old resharding code can be deleted. There is also no need to deal with the resharding backlog either, because the SSTables are not yet added to the sstable set at this point. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-18 09:37:18 -04:00
Glauber Costa	b34c0c2ff6	distributed_loader: rework uploading of SSTables Uploading of SSTables is problematic: for historical reasons it takes a lock that may have to wait for ongoing compactions to finish, then it disables writes in the table, and then it goes loading SSTables as if it knew nothing about them. With the sstable_directory infrastructure we can do much better: * we can reshard and reshape the SSTables in place, keeping the number of SSTables in check. Because this is a background process we can be fairly aggressive and set the reshape mode to strict. * we can then move the SSTables directly into the main directory. Because we know they are few in number we can call the more elegant add_sstable_and_invalidate_cache instead of the open coding currently done by load_new_sstables * we know they are not shared (if they were, we resharded them), simplifying the load process even further. The major change after this patch is applied is that all compactions (resharding and reshape) needed to make the SSTables in-strategy are done in the streaming class, which reduces the impact of this operation on the node. When the SSTables are loaded, subsequent reads will not suffer as we will not be adding shared SSTables in potential high numbers, nor will we reshard in the compaction class. There is also no more need for a lock in the upload process so in the fast path where users are uploading a set of SSTables from a backup this should essentially be instantaneous. The lock, as well as the code to disable and enable table writes is removed. A future improvement is to bypass the staging directory too, in which case the reshaping compaction would already generate the view updates. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-18 09:37:18 -04:00
Glauber Costa	4d6aacb265	sstable_directory: add helper to reshape existing unshared sstables Before moving SSTables to the main directory, we may need to reshape them into in-strategy. This patch provides helper code that reshapes the SSTables that are known to be unshared local in the sstable directory, and updates the sstable directory with the result. Reshaping can be made more or less aggressive by passing a reshape mode (relaxed or strict), which will influence the amount of SSTables reshape can tolerate to consider a particular slice of the SSTable set offstrategy. Because the compaction expects an std::vector everywhere, we changed our chunked vector for the unshared sstables to a std::vector so we can more easily pass it around without conversions. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-18 09:37:18 -04:00
Glauber Costa	c4841fa735	compaction: add a size and throughput pretty printer. This is so we don't always use MB. Sometimes it is best to report GB, TB, and their equivalent throughput metrics. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-18 09:37:18 -04:00
Glauber Costa	072d0d3073	distributed_loader.cc: add a helper function to extract the highest SSTable version found Using a map reduce in a shared sstable directory, finds the highest version seen across all shards. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-18 09:00:28 -04:00
Glauber Costa	baa82b3a26	distributed_loader.cc : extract highest_generation_seen code We'll use it in one more other location so extract it to common code. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-18 09:00:27 -04:00
Glauber Costa	9902af894a	compaction_manager: rename run_resharding_job It will be used to run any custom job where the caller provides a function. One such example is indeed resharding, but reshaping SSTables can also fall here. The semaphore is also renamed, and we'll allow only one custom job at a time (across all possible types). We also remove the assumption of the scheduling group. The caller has to have already placed the code in the correct CPU scheduling group. The I/O priority class comes from the descriptor. To make sure that we don't regress, we wrap the entire reshard-at-boot code in the compaction class. Currently the setup would be done in the main group, and the actual resharding in the compaction group. Note that this is temporary, as this code is about to change. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-18 09:00:27 -04:00
Glauber Costa	45f3bc679e	distributed_loader: assume populate_column_families is run in shard 0 This is already the case, since main.cc calls it from shard 0 and relies on it to spread the information to the other shards. We will turn this branch - which is always taken - into an assert for the sake of future-proofing and soon add even more code that relies on this being executed in shard 0. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-18 09:00:27 -04:00
Glauber Costa	1c70a7c54e	upload: use custom error handler for upload directory SSTables created for the upload directory should be using its custom error handler. There is one user of the custom error handler in tree, which is the current upload directory function. As we will use a free function instead of a lambda in our implementation we also use the opportunity to fix it for consistency. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-16 19:42:19 -04:00
Glauber Costa	4025b22d13	distributed_loader: remove self-move assignment By mistake I ended up spilling the lambda capture idiom of x = std::move(x) into the function parameter list, which is invalid. Fix it. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200609141608.103665-1-glauber@scylladb.com>	2020-06-09 17:22:57 +03:00
Glauber Costa	8021d12371	load_new_sstables: reshard before scanning the upload directory In a later patch we will be able to move files directly from upload into the main directory. However for now, for the benefit of doing this incrementally, we will first reshard in place with our new reshard infrastructure. load_new_sstables can then move the SSTables directly, without having to worry about resharding. This has the immediate benefit that the resharding happens: - in the streaming group, without affecting compaction work - without waiting for the current locks (which are held by compactions) in load_new_sstables to release. We could, at this point, just move the SSTables to the main directory right away. I am not doing this in this patch, and opting to keep the rest of upload process unchanged. This will be fixed later when we enable offstrategy compactions: we'll then compact the SSTables generated into the main directory. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-09 09:02:35 -04:00
Glauber Costa	aebd965f0e	distributed_load: initial handling of off-strategy SSTables Off-strategy SSTables are SSTables that do not conform to the invariants that the compaction strategies define. Examples of offstrategy SSTables are SSTables acquired over bootstrap, resharding when the cpu count changes or imported from other databases through our upload directory. This patch introduces a new class, sstable_directory, that will handle SSTables that are present in a directory that is not one of the directories where the table expects its SSTables. There is much to be done to support off-strategy compactions fully. To make sure we make incremental progress, this patch implements enough code to handle resharding of SSTables in the upload directory. SSTables are resharded in place, before we start accessing the files. Later, we will take other steps before we finally move the SSTables into the main directory. But for now, starting with resharding will not only allow us to start small, but it will also allow us to start unleashing much needed cleanups in many places. For instance, once we start resharding on boot before making the SSTables available, we will be able to expunge all places in Scylla where, during normal operations, we have extra handler code for the fact that SSTables could be shared. Tests: a new test is added and it passes in debug mode. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-08 16:06:00 -04:00
Glauber Costa	e48ad3dc23	remove manifest_file filter from table. When we are scanning an sstable directory, we want to filter out the manifest file in most situations. The table class has a filter for that, but it is a static filter that doesn't depend on table for anything. We are better off removing it and putting in another independent location. While it seems wasteful to use a new header just for that, this header will soon be populated with the sstable_directory class. Tests: unit (dev) Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-08 16:06:00 -04:00
Raphael S. Carvalho	8e47f61df7	compaction: Enable tombstone expiration based on the presence of the sstable set For tombstone expiration to proceed correctly without the risk of resurrecting data, the sstable set must be present. Regular compaction and derivatives provide the sstable set, so they're able to expire tombstones with no resurrection risk. Resharding, on the other hand, can run on any shard, not necessarily on the same shard that one of the input sstables belongs to, so it currently cannot provide a sstable set for tombstone expiration to proceed safely. That being said, let's only do expiration based on the presence of the set. This makes room for the sstable set to be fed to compaction via descriptor, allowing even resharding to do expiration. Currently, compaction thinks that sstable set can only come from the table, and that also needs to be changed for further flexibility. It's theoretically possible that a given resharding job will resurrect data if a fully expired SSTable is resharded at a shard which it doesn't belong to. Resharding will have no way to tell that expiring all that data will lead to resurrection because the relevant SSTables are at different shards. This is fixed by checking for fully expired sstables only on presence of the sstable set. Fixes #6600. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200605200954.24696-1-raphaelsc@scylladb.com>	2020-06-07 11:46:48 +03:00
Glauber Costa	e29701ca1c	compaction_manager: expand state to be able to differentiate between enabled and stopped We are having many issues with the stop code in the compaction_manager. Part of the reason is that the "stopped" state has its meaning overloaded to indicate both "compaction manager is not accepting compactions" and "compaction manager is not ready or destructed". In a later step we could default to enabled-at-start, but right now we maintain current behavior to minimize noise. It is only possible to stop the compaction manager once. It is possible to enable / disable the compaction manager many times. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-05-13 16:51:25 -04:00
Glauber Costa	70a89ab4ab	compaction: do not assume I/O priority class We shouldn't assume the I/O priority class for compactions. For instance, if we are dealing with offstrategy compactions we may want to use the maintenance group priority for them. For now, all compactions are put in the compaction class. rewrite compactions (scrub, cleanup) could be maintenance, but we don't have clear access to the database object at this time to derive the equivalent CPU priority. This is planned to be changed in the future, and when we do change it, we'll adjust. Same goes for resharding: while we could at this point change it we'd risk memory pressure since resharding is run online and sstables are shared until resharding is done. When we move it to offline execution we'll do it with maintenance priority. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200512002233.306538-3-glauber@scylladb.com>	2020-05-12 08:23:19 +03:00
Raphael S. Carvalho	88d2486fca	sstables: Synchronize deletion of SSTables in resharding with other operations Input SSTables of resharding are deleted at the coordinator shard, not at the shards they belong to. We're not acquiring deletion semaphore before removing those input SSTables from the SSTable set, so it could happen that resharding deletes those SSTables while another operation like snapshot, which acquires the semaphore, finds them deleted. Let's acquire the deletion semaphore so that the input SSTables will only be removed from the set, when we're certain that nobody is relying on their existence anymore. Now resharding will only delete input SSTables after they're safely removed from the SSTable set of all shards they belong to. unit: test(dev). Fixes #6328. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200507233636.92104-1-raphaelsc@scylladb.com>	2020-05-10 10:50:32 +03:00
Calle Wilund	040ffa6e64	distributed_loader: Add concurrency control override for named keyspaces Fixes #6202 Distributed loader sstable opening is gated through the database::sstable_load_concurrency_sem() semaphore (at a concurrency of 3). This is (according to creation comment) to reduce memory footprint during bootstrap, by partially serializing the actual opening of existing sstables. However, in certain versions of the product, there exist circular dependencies between data in some sstables and the ability to actually read others. Thus when gated as above, we can end up with the dependents acquiring the semaphore fully, and once stuck waiting for population of their dependency effectively blocking this from ever happening. Since we probably do not want to remove the concurrency control, and increasing it would only push the problem further away, we solve the issue by adding the ability to mark certain keyspaces as "prioritized" (pre-bootstrap), and allow them to populate outside the normal concurrency control semaphore. Concurrency increase is however limited to one extra sstable per shard and prio keyspace. Message-Id: <20200415102431.20816-1-calle@scylladb.com>	2020-04-27 16:21:13 +03:00
Glauber Costa	05efd6a5e9	resharding: get rid of special reshard_sstables There is a method, reshard_sstables(), whose sole purpose is to call a resharding compaction. There is nothing special about this method: all the information it needs is now present in the compaction_descriptor. This patch extends the compaction_options class to recognize resharding compactions as well, and uses that so that make_compaction() can also create resharding compactions. To make that happen we have to create a compaction_descriptor object in the resharding method. Note however that resharding works by passing an object very close to the compaction_descriptor around. Once this patch is merged, a logical next step is to reuse it, and avoid creating the descriptor right before calling compact_sstables(). Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-03-31 19:57:53 -04:00
Rafael Ávila de Espíndola	c5795e8199	everywhere: Replace engine().cpu_id() with this_shard_id() This is a bit simpler and might allow removing a few includes of reactor.hh. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200326194656.74041-1-espindola@scylladb.com>	2020-03-27 11:40:03 +03:00
Pavel Emelyanov	7363d56946	sstables: Move get_highest_supported_format The global get_highest_supported_format helper and its declaration are scattered all over the code, so clean this up and prepare the ground for moving _sstables_format from the storage_service onto the sstables_manager (not this set). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-25 14:31:45 +03:00


94 Commits