scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-05 22:43:15 +00:00

Author	SHA1	Message	Date
Kamil Braun	1019ff07cb	db: system_keyspace: group cdc functions in single place	2021-04-06 13:15:31 +02:00
Kamil Braun	3cebe99613	sys_dist_ks: update comment at quorum_if_many The comment mentioned tables that no longer exist: their names have changed some time ago. Update the comment to be name-agnostic. Furthermore, the second part of the comment related to a case of "joining a node without bootstrapping". Fortunately this operation is no longer possible (after #6848 which became part of Scylla 4.3) so we can shorten the comment.	2021-04-06 13:15:31 +02:00
Avi Kivity	82c76832df	treewide: don't include "db/system_distributed_keyspace.hh" from headers This just causes unneeded and slower recompilations. Instead replace with forward declarations, or includes of smaller headers that were incidentally brought in by the one removed. The .cc files that really need it gain the include, but they are few. Ref #1. Closes #8403	2021-04-04 14:00:26 +03:00
Kamil Braun	641040d465	sys_dist_ks: remove dead code (expire_cdc_* functions) These functions were not used anywhere but had to be maintained anyway. When (if) the expiration algorithm actually gets implemented (see issue #7300), the functions can be added back (perhaps they will need to look differently at that time, and it's likely that the `expire` column won't be used in the expiration algorithm in the end anyway).	2021-04-04 13:12:12 +03:00
Kamil Braun	4f3f245188	sys_dist_ks: coroutinize system_distributed_keyspace::start	2021-04-04 13:10:44 +03:00
Tomasz Grabiec	307bd354d2	Merge 'hints: use token_metadata to tell if node has left the ring' from Piotr Dulikowski This PR changes the `can_send` function so that it looks at the `token_metadata` in order to tell if the destination node is in the ring. Previously, gossiper state was used for that purpose and required a relatively complicated condition to check. The new logic just uses `token_metadata::is_member` which reduces complexity of the `can_send` function. Additionally, `storage_service` is slightly modified so that during a removenode operation the `token_metadata` is first updated and only then endpoint lifecycle subscribers are notified. This was done in order to prevent a race just like the one which happened in #5087 - hints manager is a lifecycle subscriber and starts a draining operation when a node is removed, and in order for draining to work correctly, `can_send` should keep returning true for that node. Tests: - unit(dev) - dtest(hintedhandoff_additional_test.py) - dtest(topology_test.py) Closes #8387 * github.com:scylladb/scylla: hints: clarify docstring comment for can_send hints: use token_metadata to tell if node is in the ring hints: slightly reorganize "if" statement in can_send storage_service: release token_metadata lock before notify_left storage_service: notify_left after token_metadata is replicated	2021-04-01 15:51:46 +02:00
Piotr Dulikowski	6a1152ea9b	hints: clarify docstring comment for can_send Now, the docstring comment next to can_send better represents the condition that is checked inside that function. The statement about returning true when destination left the NORMAL state is replaced with a statement about returning true when the destination has left the ring.	2021-04-01 03:58:29 +02:00
Piotr Dulikowski	4f90514247	hints: use token_metadata to tell if node is in the ring Now, instead of looking at the gossiper state to check if the destination node is still in the ring, we are using token_metadata as a source of truth. This results in much simpler code in can_send() as token_metadata has an is_member method which does exactly what we want.	2021-04-01 03:58:29 +02:00
Piotr Dulikowski	e7d9057d0c	hints: slightly reorganize "if" statement in can_send This commit reverses the order of if-else blocks in can_send, which makes it - in my opinion, at least - slightly easier to read.	2021-04-01 03:58:29 +02:00
Piotr Jastrzebski	57c7964d6c	config: ignore enable_sstables_mc_format flag Don't allow users to disable MC sstables format any more. We would like to retire some old cluster features that have been around for years. Namely MC_SSTABLE and UNBOUNDED_RANGE_TOMBSTONES. To do this we first have to make sure that all existing clusters have them enabled. It is impossible to know that unless we stop supporting enable_sstables_mc_format flag. Test: unit(dev) Refs #8352 Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Closes #8360	2021-03-31 12:23:59 +03:00
Calle Wilund	c0666ea89b	commitlog: Fix inner loop condition in allocation pre-fill Fixes #8369 This was originally found (and fixed) by @gleb-cloudius, but the patch set with the fix was reverted at some point, and the fix went away. Now the error remains even in new, nice coroutine code. We check the wrong var in the inner loop of the pre-fill path of allocate_segment_ex, often causing us to generate giant writev:s of more or less the whole file. Not intended. Closes #8370	2021-03-30 12:14:55 +02:00
Nadav Har'El	ccc75bfe2a	Merge 'Disable thrift by default' from Piotr Sarna The Thrift layer is functional, but it's not usually the first-choice protocol for Scylla users, so it's hereby disabled by default. Fixes #8336 Closes #8338 * github.com:scylladb/scylla: docs: mention disabling Thrift by default db,config: disable Thrift by default	2021-03-29 12:48:20 +03:00
Piotr Wojtczak	c1daf2bb24	column_family: Make toppartitions queries more generic Right now toppartitions can only be invoked on one column family at a time. This change introduces a natural extension to this functionality, allowing a list of families to be specified. We provide three ways for filtering in the query parameter "name_list": 1. A specific column family to include in the form "ks:cf" 2. A keyspace, telling the server to include all column families in it. Specified by omitting the cf name, i.e. "ks:" 3. All column families, which is represented by an empty list The list can include any number of entries of forms 1 and 2, in any combination. Fixes #4520 Closes #7864	2021-03-24 17:54:05 +02:00
Pavel Emelyanov	37bec6fb76	commitlog: Open files with append_is_unlikely This open option tells seastar that the file in question will be truncated to the needed size right at once and all the subsequent writes will happen within this size. This hint turns off append optimization in seastar that's not that cheap and helps to save a few CPU cycles. The option was introduced in seastar by 8bec57bc. tests: unit(dev), dtest(commitlog: test_batch_commitlog, test_periodic_commitlog, test_commitlog_replay_on_startup) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210323115409.31215-1-xemul@scylladb.com>	2021-03-24 13:05:33 +02:00
Piotr Sarna	e2443337d9	db,config: disable Thrift by default It will still be possible to use Thrift once it's enabled in the yaml file, but it's better to not open this port by default, since Thrift is definitely not the first choice for Scylla users. Fixes #8336	2021-03-22 10:54:26 +01:00
Avi Kivity	972ea9900c	Merge 'commitlog: Make pre-allocation drop O_DSYNC while pre-filling' from Calle Wilund Refs #7794 Iff we need to pre-fill segment file in O_DSYNC mode, we should drop this for the pre-fill, to avoid issuing flushes until the file is filled. Done by temporarily closing, re-opening in "normal" mode, filling, then re-opening. Closes #8250 * github.com:scylladb/scylla: commitlog: Make pre-allocation drop O_DSYNC while pre-filling commitlog: coroutinize allocate_segment_ex	2021-03-17 09:59:22 +02:00
Calle Wilund	48ca01c3ab	commitlog: Make pre-allocation drop O_DSYNC while pre-filling Refs #7794 Iff we need to pre-fill segment file in O_DSYNC mode, we should drop this for the pre-fill, to avoid issuing flushes until the file is filled. Done by temporarily closing, re-opening in "normal" mode, filling, then re-opening. v2: * More comment v3: * Add missing flush v4: * comment v5: * Split coroutine and fix into separate patches	2021-03-15 09:35:45 +00:00
Calle Wilund	ae3b8e6fdf	commitlog: coroutinize allocate_segment_ex To make further changes here easier to write and read.	2021-03-15 09:35:37 +00:00
Calle Wilund	f44420f2c9	snapshot: Add filter to check for existing snapshot Fixes #8212 Some snapshotting operations call in on a single table at a time. When checking for existing snapshots in this case, we should not bother with snapshots in other tables. Add an optional "filter" to check routine, which if non-empty includes tables to check. Use case is "scrub" which calls with a limited set of tables to snapshot. Closes #8240	2021-03-10 20:21:38 +02:00
Eliran Sinvani	9162748b18	materialized views: create view schemas with proper base table reference. Newly created view schemas don't always have their base info. This is bad since such schemas support neither reads nor writes. This leaves us vulnerable to a race condition where there is an attempt to use this schema for read or write. Here we initialize the base reference and also reconfigure the view to conform to the new computed column type, which makes it usable for write and not only reads. We do it for views created in the migration manager following announcements and also for copied schemas.	2021-03-07 12:50:42 +02:00
Eliran Sinvani	39cd9dae4e	materialized views: Extract fix legacy schema into its own logic We extract the logic for fixing the view schema into its own logic as we will need to use it in more places in the code. This makes 'maybe_update_legacy_secondary_index_mv_schema' redundant since it becomes a two liner wrapper for this logic. We also remove it here and replace the call to it with the equivalent code.	2021-03-07 12:50:42 +02:00
Piotr Sarna	added53b7d	Merge 'hints: use a soft disk space limit in hints commitlog' from Piotr Dulikowski A recent change to the commitlog (`4082f57`) caused its configurable size limit to be strictly enforced - after reaching the limit, new segments wouldn't be allocated until some of the previous segments are freed. This flow can work for the regular commitlog, however the hints commitlog does not delete the segments itself - instead, hints manager recreates its commitlog every 10 seconds, picks up segments left by the previous instance and deletes each segment manually only after all hints are sent out from a segment. Because of the non-standard flow, it is possible that the hints commitlog fills up and stops accepting more hints. Hints manager uses a relatively low limit for each commitlog instance (128MB divided by shard count), so it's not hard to fill it up. What's worse, hints manager tries to acquire file_update_mutex in exclusive mode before re-creating the commitlog, while hints waiting to be written acquire this lock in shared mode - which causes hints flushing to completely deadlock and no more hints be admitted to the commitlog. The queue of hints waiting to be admitted grows very quickly and soon all writes which could result in a hint being generated are rejected with OverloadedException. To solve this problem, it is now possible to bring back the soft disk space limit by setting a flag in commitlog's configuration. Tests: - unit(dev) - wrote hints for 15 minutes in order to see if it gets stuck again Fixes #8137 Closes #8206 * github.com:scylladb/scylla: hints_manager: don't use commitlog hard space limit commitlog: add an option to allow going over size limit	2021-03-04 12:24:05 +01:00
Calle Wilund	5da0129775	system_distributed_keyspace: Add better routine to get latest cdc gen. timestamp Since we have a table of cdc version timestamps, conveniently sorted reversed, we can just query this and get the latest known gen ts.	2021-03-03 15:44:54 +00:00
Calle Wilund	5a69250d7e	system_distributed_keyspace: Fix cdc_get_versioned_streams timestamp range With the new scheme for cdc generation management, one of the last changes was to make the time ordering of the stream timestamps reversed. However, cdc_get_versioned_streams forgot to take this into account when sifting out timestamp ranges for stream retrieval (based on low mark). Fixed by doing reverse iteration.	2021-03-03 15:41:42 +00:00
Piotr Dulikowski	376da49cf4	hints_manager: don't use commitlog hard space limit This commit disables the hard space limit applied by commitlogs created to store hints. The hard limit causes problems for hints because they use small-sized commitlogs to store hints (128MB, currently). Instead of letting the commitlog delete the segments itself, it recreates the commitlog every 10 seconds and manually deletes old segments after all hints are sent out from them. If the 128MB limit is reached, the hints manager will get stuck. A future which puts hint into commitlog holds a shared lock, and commitlog recreation needs to get an exclusive lock, which results in a deadlock. No more hints will be admitted, and eventually we will start rejecting writes with OverloadedException due to too many hints waiting to be admitted to the commitlog. By disabling the hard limit for hints commitlog, the old behavior is brought back - commitlog becomes more conservative with the space used after going over its size limit, but does not block until some of its segments are deleted.	2021-03-02 16:53:50 +01:00
Avi Kivity	5f4bf18387	Revert "Merge 'sstables: add versioning to the sstable_set ' from Wojciech Mitros" This reverts commit `31909515b3`, reversing changes made to `ef97adc72a`. It shows many serious regressions in dtest. Fixes #8197.	2021-03-02 13:21:22 +02:00
Benny Halevy	baf5d05631	storage_service: use atomic_vector for lifecycle_subscribers So it can be modified while walked to dispatch subscribed event notifications. In #8143, there is a race between scylla shutdown and notify_down(), causing use-after-free of cql_server. Using an atomic vector instead and futurizing unregister_subscriber allows deleting from _lifecycle_subscribers while walked using atomic_vector::for_each. Fixes #8143 Test: unit(release) DTest: update_cluster_layout_tests:TestUpdateClusterLayout.add_node_with_large_partition4_test(release) materialized_views_test.py:TestMaterializedViews.double_node_failure_during_mv_insert_4_nodes_test(release) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210224164647.561493-2-bhalevy@scylladb.com>	2021-03-01 20:34:42 +02:00
Avi Kivity	8747c684e0	Merge 'Move timeouts to client state' from Piotr Sarna This series is extracted from #7913 as it may prove useful to other series as well, and #7913 might take a while until it's merged, given that it also depends on other unmerged pull requests. The idea of this series is to move timeouts to the client state, which will allow changing them independently for each session - e.g. by setting per-service-level timeouts and initializing the values from attached service levels (see #7867). Closes #8140 * github.com:scylladb/scylla: treewide: remove timeout config from query options cql3: use timeout config from client state instead of query options cql3: use timeout config from client state instead of query options cql3: use timeout config from client state instead of query options service: add timeout config to client state	2021-03-01 20:34:35 +02:00
Piotr Dulikowski	aa2df75321	commitlog: add an option to allow going over size limit This commit adds an option which, when turned on, allows the commitlog to go over configured size limit. After reaching the limit, commitlog will be more conservative with its usage of the disk space - for example, it won't increase the segment reserve size or reuse recycled segments. Most importantly, it won't block writes until the space used by the commitlog goes down. This change is necessary for hinted handoff to keep its current behavior. Hinted handoff does not let the commitlog free segments itself - instead, it re-creates it every 10 seconds and manually deletes segments after all hints are sent from a segment.	2021-03-01 14:16:05 +01:00
Avi Kivity	31909515b3	Merge 'sstables: add versioning to the sstable_set ' from Wojciech Mitros Currently, the sstable_set in a table is copied before every change to allow accessing the unchanged version by existing sstable readers. This patch changes the sstable_set to a structure that keeps all its versions that are referenced somewhere and provides a way of getting a reference to an immutable version of the set. Each sstable in the set is associated with the versions it is alive in, and is removed when all such versions don't have references anymore. To avoid copying, the object holding all sstables in the set version is changed to a new structure, sstable_list, which was previously an alias for std::unordered_set<shared_sstable>, and which implements most of the methods of an unordered_set, but its iterator uses the actual set with all sstables from all referenced versions and iterates over those sstables that belong to the captured version. The methods that modify the set's contents give a strong exception guarantee by trying to insert new sstables to its containers, and erasing them in the case of a caught exception. To release shared_sstables as soon as possible (i.e. when all references to versions that contain them die), each time a version is removed, all sstables that were referenced exclusively by this version are erased. We are able to find these sstables efficiently by storing, for each version, all sstables that were added and erased in it, and, when a version is removed, merging it with the next one. When a version that adds an sstable gets merged with a version that removes it, this sstable is erased. Fixes #2622 Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com> Closes #8111 * github.com:scylladb/scylla: sstables: add test for checking the latency of updating the sstable_set in a table sstables: move column_family_test class from test/boost to test/lib sstables: use fast copying of the sstable_set instead of rebuilding it sstables: replace the sstable_set with a versioned structure sstables: remove potential ub sstables: make sstable_set constructor less error-prone	2021-03-01 14:16:36 +02:00
Piotr Sarna	7936652322	db,view: improve verbosity of errors coming from view updates The error now contains information about the view table that failed, as well as base and view tokens. Example: view - Error applying view update to 127.0.0.1 (view: ks.testme_v_idx_index, base token: -4069959284402364209, view token: -3248873570005575792): std::runtime_error (manually injected error) Fixes #8177 Closes #8178	2021-03-01 10:46:14 +02:00
Piotr Sarna	c5214eb096	treewide: remove timeout config from query options Timeout config is now stored in each connection, so there's no point in tracking it inside each query as well. This patch removes timeout_config from query_options and follows by removing now unnecessary parameters of many functions and constructors.	2021-02-25 17:20:27 +01:00
Kamil Braun	841f07e9b7	cdc: add config option to disable streams rewriting Rewriting stream descriptions is a long, expensive, and prone-to-failure operation. Due to #8061 it may consume a lot of memory. In general, it may keep failing (and being retried) endlessly, straining the cluster. As a backdoor we add this flag for potential future needs of admins or field engineers. I don't expect it will ever be used, but it won't hurt and may save us some work in the worst case scenario.	2021-02-18 11:44:59 +01:00
Kamil Braun	9bdd000e97	cdc: rewrite streams to the new description table Nodes automatically ensure that the latest CDC generation's list of streams is present in the streams description table. When a new generation appears, we only need to update the table for this generation; old generations are already inserted. However, we've changed the description table (from `cdc_streams_descriptions` to `cdc_streams_descriptions_v2`). The existing mechanism only ensures that the latest generation appears in the new description table. This commit adds an additional procedure that rewrites the older generations as well, if we find that it is necessary to do so (i.e. when some CDC log tables may contain data in these generations).	2021-02-18 11:44:59 +01:00
Kamil Braun	4ef736a0a3	cql3: query_processor: improve internal paged query API The `query_processor::query` method allowed internal paged queries. However, it was quite limited, hardcoding a number of parameters: consistency level, timeout config, page size. This commit does the following improvements: 1. Rename `query` to `query_internal` to make it obvious that this API is supposed to be used for internal queries only 2. Extend the method to take consistency level, timeout config, and page size as parameters 3. Remove unused overloads of `query_internal` 4. Fix a bunch of typos / grammar issues in the docstring	2021-02-18 11:44:59 +01:00
Kamil Braun	67d4e5576d	sys_dist_ks: split CDC streams table partitions into clustered rows Until now, the lists of streams in the `cdc_streams_descriptions` table for a given generation were stored in a single collection. This solution has multiple problems when dealing with large clusters (which produce large lists of streams): 1. large allocations 2. reactor stalls 3. mutations too large to even fit in commitlog segments This commit changes the schema of the table as described in issue #7993. The streams are grouped according to token ranges, each token range being represented by a separate clustering row. Rows are inserted in reasonably large batches for efficiency. The table is renamed to enable easy upgrade. On upgrade, the latest CDC generation's list of streams will be (re-)inserted into the new table. Yet another table is added: one that contains only the generation timestamps clustered in a single partition. This makes it easy for CDC clients to learn about new generations. It also enables an elegant two-phase insertion procedure of the generation description: first we insert the streams; only after ensuring that a quorum of replicas contains them, we insert the timestamp. Thus, if any client observes a timestamp in the timestamps table (even using a ONE query), it means that a quorum of replicas must contain the list of streams.	2021-02-18 11:44:59 +01:00
Kamil Braun	ba920361b3	cdc: use chunked_vector for streams in streams_version The vector may get quite long (say... 1.6M stream IDs). We prevent a large allocation by using utils::chunked_vector.	2021-02-18 11:44:59 +01:00
Kamil Braun	9ae4467970	cdc: remove `streams_version::expired` field This field was not used anywhere.	2021-02-18 11:44:59 +01:00
Kamil Braun	3d7b990300	system_distributed_keyspace: use mutation API to insert CDC streams The `storage_proxy::mutate` low-level API is much more powerful than the CQL API. This power is not needed for this commit but for the next.	2021-02-18 11:44:59 +01:00
Kamil Braun	0df15ca8cc	storage_service: don't use `sys_dist_ks` before it is started It could happen that system_distributed_keyspace was used by storage_service before it was fully started (inside `handle_cdc_generation`), i.e. before sys_dist_ks' `start()` returned (on shard 0). It only checked whether `local_is_initialized()` returns true, so it only ensured that the service is constructed. Currently, sys_dist_ks' `start` only announces migrations, so this was mostly harmless. More concretely: it could result in the node trying to send CQL requests using a table that it didn't yet recognize by calling sys_dist_ks' methods before the `announce_migration` call inside `start` has returned. This would result in an exception; however, the exception would be caught by the caller and the procedure would be retried, succeeding eventually. See `handle_cdc_generation` for details. Still, the initial intention of the code was to wait for the sys_dist_ks service to be fully started before it was used. This commit fixes that.	2021-02-18 11:44:59 +01:00
Botond Dénes	ba7a9d2ac3	imr: switch back to open-coded description of structures Commit `aab6b0ee27` introduced the controversial new IMR format, which relied on a very template-heavy infrastructure to generate serialization and deserialization code via template meta-programming. The promise was that this new format, beyond solving the problems the previous open-coded representation had (working on linearized buffers), will speed up migrating other components to this IMR format, as the IMR infrastructure reduces code bloat, makes the code more readable via declarative type descriptions as well as safer. However, the results were almost the opposite. The template meta-programming used by the IMR infrastructure proved very hard to understand. Developers don't want to read or modify it. Maintainers don't want to see it being used anywhere else. In short, nobody wants to touch it. This commit does a conceptual revert of `aab6b0ee27`. A verbatim revert is not possible because related code evolved a lot since the merge. Also, going back to the previous code would mean we regress as we'd revert the move to fragmented buffers. So this revert is only conceptual, it changes the underlying infrastructure back to the previous open-coded one, but keeps the fragmented buffers, as well as the interface of the related components (to the extent possible). Fixes: #5578	2021-02-16 23:43:07 +01:00
Eliran Sinvani	178ced9014	schema tables: Remove mutations to unknown tables when adapting schema mutations Whenever an alter table occurs, the mutations for the just altered table are sent over to all of the replicas from the coordinator. In a mixed cluster the mutations should be adapted to a specific version of the schema. However, the adaptation that happens today doesn't omit mutations to newly added schema tables, to be more specific, mutations to the `computed_columns` table which doesn't exist for example in version 2019.1 This makes altering a table during a rolling upgrade from 2019.1 to 2020.1 dangerous.	2021-02-11 13:48:55 +02:00
Eliran Sinvani	ff1ba9bc2b	schema tables: Register 'scylla_tables' versions that were sent to other nodes In a mixed cluster there can be a situation where `scylla_tables` needs to be sent over to another node, either because of a schema sync or because the node pulls it when it is referenced by a frozen_mutation. The former is not a problem since the sending node chooses the version to send. However, the latter is problematic since `scylla_tables` versions are not registered anywhere. This commit registers every `scylla_tables` schema version that is used to adapt mutations, since a schema pull for this version might follow.	2021-02-11 13:47:16 +02:00
Wojciech Mitros	e1b494633b	sstables: make sstable_set constructor less error-prone Adding a non-empty set of sstables as the set of all sstables in an sstable_set could cause inconsistencies with the values returned by select_sstable_runs because the _all_runs map would still be initialized empty. For similar reasons, the provided sstable_set_impl should also be empty. Dispel doubts by removing the unordered_set from the constructor, and adding a check of emptiness of the sstable_set_impl. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2021-02-11 11:02:55 +01:00
Gleb Natapov	d8345c67d9	Consolidate system and non system keyspace creation The code that creates system keyspaces open-codes a lot of things from database::create_keyspace(). The patch makes create_keyspace() suitable for both system and non system keyspaces and uses it to create system keyspaces as well. Message-Id: <20210209160506.1711177-1-gleb@scylladb.com>	2021-02-09 17:18:04 +01:00
Avi Kivity	4082f57edc	Merge 'Make commitlog disk limit a hard limit.' from Calle Wilund Refs #6148 Commitlog disk limit was previously a "soft" limit, in that we allowed allocating new segments, even if we were over disk usage max. This would also cause us sometimes to create new segments and delete old ones, if badly timed in needing and releasing segments, in turn causing useless disk IO for pre-allocation/zeroing. This patch set does: * Make limit a hard limit. If we have disk usage > max, we wait for delete or recycle. * Make flush threshold configurable. Default is ask for flush when over 50% usage. (We do not wait for results) * Make flush "partial". We flush X% of the used space (used - thres/2), and make the rp limit accordingly. This means we will try to clear the N oldest segments, not all. I.e. "lighter" flush. Of course, if the CL is wholly dominated by a single CF, this will not really help much. But when > 1 cf is used, it means we can skip those not having unflushed data < req rp. * Force more eager flush/recycle if we're out of segments Note: flush threshold is not exposed in scylla config (yet). Because I am unsure of wording, and even if it should. Note: testing is sparse, esp. in regard to latency/timeouts added in high usage scenarios. While I can fairly easily provoke "stalls" (i.e. forced waiting for segments to free up) with simple C-S, it is hard to say exactly where in a more sane config (I set my limits looow) latencies will start accumulating. Closes #7879 * github.com:scylladb/scylla: commitlog: Force earlier cycle/flush iff segment reserve is empty commitlog: Make segment allocation wait iff disk usage > max commitlog: Do partial (memtable) flushing based on threshold commitlog: Make flush threshold configurable table: Add a flush RP mark to table, and shortcut if not above	2021-02-08 16:44:05 +02:00
Pavel Emelyanov	a05adb8538	database: Remove global storage proxy reference The db::update_keyspace() needs sharded<storage_proxy> reference, but the only caller of it already has it and can pass one as argument. tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210205175611.13464-3-xemul@scylladb.com>	2021-02-08 12:59:46 +01:00
Calle Wilund	c5f6125039	commitlog: Add "add_entries" call to allow inputting N mutations Fixes #7615 Allows N mutations to be written "atomically" (i.e. in the same call). Either all are added to segment, or none. Returns rp_handle vector corresponding to the call vector.	2021-02-02 10:41:08 +00:00
Calle Wilund	5fcc2066ed	commitlog: Make commitlog entries optionally multi-entry Allows writing more than one blob of data using a single "add" call into segment. The old call sites will still just provide a single entry. To ensure we can determine the health of all the entries as a unit, we need to wrap them in a "parent" entry. For this, we bump the commitlog segment format and introduce a magic marker, which if present, means we have entries in entry, totalling "size" bytes. We checksum the entra header, and also checksum the individual checksums of each sub-entry (faster). This is added as a post-word. When parsing/replaying, if v2+ and marker, we have to read all entries + checksums into memory, verify, and _then_ we can actually send the info to caller.	2021-02-02 10:41:08 +00:00
Calle Wilund	6bef3f9cc3	commitlog: Move entry_writer definition to cc file Should not be public/visible	2021-02-02 10:32:44 +00:00
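
The `name_list` filter described in commit c1daf2bb24 above can be illustrated with a minimal Python sketch. This is not Scylla's implementation: the function names `parse_name_list` and `is_selected` are hypothetical, and only the three filter forms ("ks:cf", "ks:", and the empty list meaning "all column families") are taken from the commit message.

```python
def parse_name_list(name_list):
    """Split filter entries into whole-keyspace and single-table selections.

    Hypothetical helper: each entry is either "ks:cf" (one column family)
    or "ks:" (every column family in the keyspace).
    """
    keyspaces, tables = set(), set()
    for entry in name_list:
        ks, sep, cf = entry.partition(":")
        if not sep or not ks:
            raise ValueError(f"expected 'ks:cf' or 'ks:', got {entry!r}")
        if cf:
            tables.add((ks, cf))   # form 1: a specific column family
        else:
            keyspaces.add(ks)      # form 2: all column families in ks
    return keyspaces, tables


def is_selected(name_list, ks, cf):
    """Return True if column family ks.cf is covered by the filter."""
    if not name_list:              # form 3: empty list selects everything
        return True
    keyspaces, tables = parse_name_list(name_list)
    return ks in keyspaces or (ks, cf) in tables
```

For example, `is_selected(["ks1:", "ks2:t"], "ks1", "anything")` is true because the whole keyspace `ks1` is listed, while `is_selected(["ks2:t"], "ks2", "other")` is false.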

1989 Commits