scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-29 11:10:40 +00:00

Author	SHA1	Message	Date
Calle Wilund	e3153dd5b0	Commitlog replayer: Range-check skip call Fixes #15269 If segment being replayed is corrupted/truncated we can attempt skipping completely bogus byte amounts, which can cause assert (i.e. crash) in file_data_source_impl. This is not a crash-level error, so ensure we range check the distance in the reader. v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will ensure we log the fact that things are broken Closes scylladb/scylladb#15270 (cherry picked from commit `6ffb482bf3`)	2024-01-05 09:19:28 +02:00
Marcin Maliszkiewicz	6943447c6a	db: view: run local materialized view mutations on a separate smp service group When base write triggers mv write and it needs to be sent to another shard it used the same service group and we could end up with a deadlock. This fix affects also alternator's secondary indexes. Testing was done using (yet) not committed framework for easy alternator performance testing: https://github.com/scylladb/scylladb/pull/13121. I've changed hardcoded max_nonlocal_requests config in scylla from 5000 to 500 and then ran: ./build/release/scylla perf-alternator-workloads --workdir /tmp/scylla-workdir/ --smp 2 \ --developer-mode 1 --alternator-port 8000 --alternator-write-isolation forbid --workload write_gsi \ --duration 60 --ring-delay-ms 0 --skip-wait-for-gossip-to-settle 0 --continue-after-error true --concurrency 2000 Without the patch when scylla is overloaded (i.e. number of scheduled futures being close to max_nonlocal_requests) after a couple of seconds scylla hangs, cpu usage drops to zero, no progress is made. We can confirm we're hitting this issue by seeing under gdb: p seastar::get_smp_service_groups_semaphore(2,0)._count $1 = 0 With the patch I wasn't able to observe the problem, even with 2x concurrency. I was able to make the process hang with 10x concurrency but I think it's hitting different limit as there wasn't any depleted smp service group semaphore and it was happening also on non mv loads. Fixes https://github.com/scylladb/scylladb/issues/15844 Closes scylladb/scylladb#15845 (cherry picked from commit `020a9c931b`)	2023-11-19 18:47:11 +02:00
Kamil Braun	187e275147	system_keyspace: use system memory for `system.raft` table `system.raft` was using the "user memory pool", i.e. the `dirty_memory_manager` for this table was set to `database::_dirty_memory_manager` (instead of `database::_system_dirty_memory_manager`). This meant that if a write workload caused memory pressure on the user memory pool, internal `system.raft` writes would have to wait for memtables of user tables to get flushed before the write would proceed. This was observed in SCT longevity tests which ran a heavy workload on the cluster and concurrently, schema changes (which underneath use the `system.raft` table). Raft would often get stuck waiting many seconds for user memtables to get flushed. More details in issue #15622. Experiments showed that moving Raft to system memory fixed this particular issue, bringing the waits to reasonable levels. Currently `system.raft` stores only one group, group 0, which is internally used for cluster metadata operations (schema and topology changes) -- so it makes sense to keep using system memory. In the future we'd like to have other groups, for strongly consistent tables. These groups should use the user memory pool. It means we won't be able to use `system.raft` for them -- we'll just have to use a separate table. Fixes: scylladb/scylladb#15622 Closes scylladb/scylladb#15972 (cherry picked from commit `f094e23d84`)	2023-11-16 12:51:03 +01:00
Botond Dénes	fa0f382a82	Merge 'Initialize datadir for system and non-system keyspaces the same way' from Pavel Emelyanov When populating system keyspace the sstable_directory forgets to create upload/ subdir in the tables' datadir because of the way it's invoked from distributed loader. For non-system keyspaces directories are created in table::init_storage() which is self-contained and just creates the whole layout regardless of what. This PR makes system keyspace's tables use table::init_storage() as well so that the datadir layout is the same for all on-disk tables. Test included. fixes: #15708 closes: scylladb/scylla-manager#3603 Closes scylladb/scylladb#15723 * github.com:scylladb/scylladb: test: Add test for datadir/ layout sstable_directory: Indentation fix after previous patch db,sstables: Move storage init for system keyspace to table creation (cherry picked from commit `7f81957437`)	2023-10-25 12:13:03 +03:00
Avi Kivity	f42eb4d1ce	Merge 'Store and propagate GC timestamp markers from commitlog' from Calle Wilund Fixes #14870 (Originally suggested by @avikivity). Use commit log stored GC clock min positions to narrow compaction GC bounds. (Still requires augmented manual flush:es with extensive CL clearing to pass various dtest, but this does not affect "real" execution). Adds a lowest timestamp of GC clock whenever a CF is added to a CL segment the first time. Because GC clock is wall clock time and only connected to TTL (not cell/row timestamps), this gives a fairly accurate view of GC low bounds per segment. This is then (in a rather ugly way) propagated to tombstone_gc_state to narrow the allowed GC bounds for a CF, based on what is currently left in CL. Note: this is a rather unoptimized version - no caching or anything. But even so, should not be excessively expensive, esp. since various other code paths already cache the results. Closes scylladb/scylladb#15060 * github.com:scylladb/scylladb: main/cql_test_env: Augment compaction mgr tombstone_gc_state with CL GC info tombstone_gc_state: Add optional callback to augment GC bounds commitlog: Add keeping track of approximate lowest GC clock for CF entries database: Force new commitlog segment on user initiated flush commitlog: Add helper to force new active segment	2023-10-17 18:27:43 +03:00
Calle Wilund	560d3c17f0	commitlog: Add keeping track of approximate lowest GC clock for CF entries Adds a lowest timestamp of GC clock whenever a CF is added to a CL segment first. Because GC clock is wall clock time and only connected to TTL (not cell/row timestamps), this gives a fairly accurate view of GC low bounds per segment. Includes of course a function to get the all-segment lowest per CF.	2023-10-17 10:26:41 +00:00
Calle Wilund	810d06946f	commitlog: Add helper to force new active segment When called, if active segment holds data, close and replace with pristine one.	2023-10-17 10:26:40 +00:00
Tomasz Grabiec	0aef0f900b	Merge 'truncation records refactorings' from Petr Gusev This PR contains several refactorings related to truncation records handling in `system_keyspace`, `commitlog_replayer` and `table` classes: * drop map_reduce from `commitlog_replayer`, it's sufficient to load truncation records from the null shard; * add a check that `table::_truncated_at` is properly initialized before it's accessed; * move its initialization after `init_non_system_keyspaces` Closes scylladb/scylladb#15583 * github.com:scylladb/scylladb: system_keyspace: drop truncation_record system_keyspace: remove get_truncated_at method table: get_truncation_time: check _truncated_at is initialized database: add_column_family: initialize truncation_time for new tables database: add_column_family: rename readonly parameter to is_new system_keyspace: move load_truncation_times into distributed_loader::populate_keyspace commitlog_replayer: refactor commitlog_replayer::impl::init system_keyspace: drop redundant typedef system_keyspace: drop redundant save_truncation_record overload table: rename cache_truncation_record -> set_truncation_time system_keyspace: get_truncated_position -> get_truncated_positions	2023-10-17 10:55:30 +02:00
Jan Ciolek	940e44f887	db/view: change log level of failed view updates to WARN When a remote view update doesn't succeed there's a log message saying "Error applying view update...". This message had log level ERROR, but it's not really a hard error. View updates can fail for a multitude of reasons, even during normal operation. A failing view update isn't fatal, it will be saved as a view hint and retried later. Let's change the log level to WARN. It's something that shouldn't happen too much, but it's not a disaster either. ERROR log level causes trouble in tests which assume that an ERROR level message means that the test has failed. Refs: https://github.com/scylladb/scylladb/issues/15046#issuecomment-1712748784 For local view updates the log level stays at "ERROR", local view updates shouldn't fail. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes scylladb/scylladb#15640	2023-10-11 18:19:23 +03:00
Avi Kivity	35849fc901	Revert "Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun" This reverts commit `3d4398d1b2`, reversing changes made to `45dfce6632`. The commit causes some schema changes to be lost due to incorrect timestamps in some mutations. More information is available in [1]. Reopens: scylladb/scylladb#7620 Reopens: scylladb/scylladb#13957 Fixes scylladb/scylladb#15530. [1] https://github.com/scylladb/scylladb/pull/15687	2023-10-11 00:32:05 +03:00
Dawid Medrek	6fdca0d3a8	db/hints/manager: Reword comments about state The current comments should be clearer to someone not familiar with the module. This commit also makes them abide by the limit of 120 characters per line.	2023-10-06 13:25:30 +02:00
Dawid Medrek	aa38ea3642	db/hints/manager: Unfriend space_watchdog space_watchdog is a friend of shard hint manager just to be able to execute one of its functions. This commit changes that by unfriending the class and exposing the function.	2023-10-06 13:25:30 +02:00
Dawid Medrek	6cd0153954	db/hints: Remove a redundant alias	2023-10-06 13:25:30 +02:00
Dawid Medrek	ddc385bce0	db/hints: Remove an unused namespace	2023-10-06 13:25:30 +02:00
Dawid Medrek	76d414012b	db/hints: Coroutinize change_host_filter()	2023-10-06 13:25:30 +02:00
Dawid Medrek	09eb30e6f1	db/hints: Coroutinize drain_for() This commit turns the function into a coroutine and makes the code less compact and more readable.	2023-10-06 13:25:30 +02:00
Dawid Medrek	907a572e24	db/hints: Clean up can_hint_for() This commit gets rid of unnecessary additional calls to functions and makes all lines abide by the limit of 120 characters.	2023-10-06 13:25:30 +02:00
Dawid Medrek	596e1f9859	db/hints: Clean up store_hint() This commit makes the function abide by the limit of 120 characters per line.	2023-10-06 13:25:30 +02:00
Dawid Medrek	8a43f94ca6	db/hints: Clean up too_many_in_flight_hints_for() This commit makes the return statement more readable. It also makes the comment abide by the limit of 120 characters per line.	2023-10-06 13:25:30 +02:00
Dawid Medrek	96a5906621	db/hints: Refactor get_ep_manager()	2023-10-06 13:25:30 +02:00
Dawid Medrek	8b591be3c3	db/hints: Coroutinize wait_for_sync_point() This commit coroutinizes the function and adds a comment explaining a non-trivial case.	2023-10-06 13:25:27 +02:00
Dawid Medrek	fee3aafd80	db/hints: Use std::span in calculate_current_sync_point std::span is a lot more flexible than std::vector as it allows for arbitrary contiguous ranges.	2023-10-06 12:36:05 +02:00
Dawid Medrek	64fd4d6323	db/hints: Clean up manager::forbid_hints_for_eps_with_pending_hints()	2023-10-06 12:26:55 +02:00
Dawid Medrek	58cd5c4167	db/hints: Clean up manager::forbid_hints()	2023-10-06 12:26:55 +02:00
Dawid Medrek	f8ed93f5bc	db/hints: Clean up manager::allow_hints()	2023-10-06 12:26:52 +02:00
Dawid Medrek	bfe32bcf89	db/hints: Coroutinize compute_hints_dir_device_id()	2023-10-06 12:18:30 +02:00
Dawid Medrek	8f28eb6522	db/hints: Clean up manager::stop() This commit gets rid of boilerplate in the function, leverages a range pipe and explicit types to make the code more readable, and changes the logs to make it clearer what happens.	2023-10-06 12:18:30 +02:00
Dawid Medrek	a384caece0	db/hints: Clean up manager::start() This commit coroutinizes the function and makes it less compact.	2023-10-06 12:18:30 +02:00
Dawid Medrek	2db97aaf81	db/hints/manager: Clean up the constructor fmt::to_string should be preferred to seastar::format. It's clearer and simpler. Besides that, this commit makes the code abide by the limit of 120 characters per line.	2023-10-06 12:18:30 +02:00
Dawid Medrek	6c10a86791	db/hints: Remove boilerplate drain_lock()	2023-10-06 12:18:30 +02:00
Dawid Medrek	f1f35ba819	db/hints: Let drain_for() return a future Currently, the function doesn't return anything. However, if the future doesn't need to be awaited, the caller can decide that. There is no reason to make that decision in the function itself.	2023-10-06 12:18:25 +02:00
Dawid Medrek	79e1412f14	db/hints: Remove ep_managers_end The methods are redundant and are effectively code boilerplate.	2023-10-06 12:15:04 +02:00
Dawid Medrek	cfbacb29bb	db/hints: Remove find_ep_manager The methods are redundant and are effectively code boilerplate.	2023-10-06 12:15:04 +02:00
Dawid Medrek	1c70a18fc7	db/hints: Use manager as API for hint_endpoint_manager This commit makes with_file_update_mutex() a method of hint_endpoint_manager and introduces db::hints::manager::with_file_update_mutex_for() for accessing it from the outside. This way, hint_endpoint_manager is hidden and no one needs to know about its existence.	2023-10-06 12:15:01 +02:00
Dawid Medrek	d068143b83	db/hints: Don't mark have_ep_manager()'s definition as inline Doing that doesn't allow for external linkage, so it's not accessible from other files.	2023-10-06 11:54:15 +02:00
Dawid Medrek	58249363bc	db/hints: Remove make_directory_initializer() The function is never used. It's not even implemented.	2023-10-06 11:54:15 +02:00
Dawid Medrek	f47a669f75	db/hints/manager: Order constructors This commit orders constructors of db::hints::manager for readability.	2023-10-06 11:54:15 +02:00
Dawid Medrek	4663f72990	db/hints: Move ~manager() and mark it as noexcept The destructor is trivial and there is no reason to keep it in the source file. We mark it as noexcept too.	2023-10-06 11:54:15 +02:00
Dawid Medrek	18a2831186	db/hints: Use reference for storage proxy This commit makes db::hints::manager store service::storage_proxy as a reference instead of a seastar::shared_ptr. The manager is owned by storage proxy, so it only lives as long as storage proxy does. Hence, it makes little sense to store the latter as a shared pointer; in fact, it's very confusing and may be error-prone. The field never changes, so it's safe to keep it as a reference (especially because copy and move constructors of db::hints::manager are both deleted). What's more, we ensure that the hint manager has access to storage proxy as soon as it's created. The same changes were applied to db::hints::resource_manager. The rationale is the same.	2023-10-06 11:54:15 +02:00
Dawid Medrek	3c347cc196	db/hints/manager: Explicitly delete copy constructor This commit explicitly deletes the copy constructor of db::hints::manager and its copy assignment. They're not used in the code, and they should not be.	2023-10-06 11:54:15 +02:00
Dawid Medrek	ee5a5c1661	db/hints: Capitalize constants This is a common convention. Follow it for readability.	2023-10-06 11:54:15 +02:00
Dawid Medrek	fd30bac7b1	db/hints/manager: Hide declarations	2023-10-06 11:54:15 +02:00
Dawid Medrek	4b03cba1bf	db/hints/manager: Move the definitions of static members to the header If the variables are accessible from the outside, it makes sense to also expose their initial values to the user. This commit moves them to the header and marks them as inline.	2023-10-06 11:54:15 +02:00
Dawid Medrek	c3ab28f5e9	db/hints: Move make_dummy() to the header The function is trivial. It can also be marked as noexcept.	2023-10-06 11:54:15 +02:00
Dawid Medrek	5e333f0a52	db/hints: Don't explicitly define ~directory_initializer() The destructor is the default destructor, and it is safe to drop it altogether.	2023-10-06 11:53:02 +02:00
Dawid Medrek	9f215d3cf1	db/hints: Change the order of logging in ensure_created_and_verified() The new logging order seems to make more sense, i.e. we first log that we're creating and validating directories, and only then do we start doing that. The previous order when those actions were reversed didn't match the log's message because the action was already done when we informed the user of it.	2023-10-06 11:14:41 +02:00
Dawid Medrek	4ad3e8d37b	db/hints: Coroutinize ensure_rebalanced()	2023-10-06 11:14:41 +02:00
Dawid Medrek	672cdb5c05	db/hints: Coroutinize ensure_created_and_verified()	2023-10-06 11:14:41 +02:00
Dawid Medrek	a5f14cb130	db/hints: Improve formatting of directory_initializer::impl The implementation class has been divided into clear sections. The indentation has also been adjusted to what is commonly used in the codebase.	2023-10-06 11:14:41 +02:00
Dawid Medrek	500175d738	db/hints: Do not rely on the values of enums These changes move away from relying on specific values of enum variants. The code based on the arithmetic of them is trivial, and there is no reason not to use operator== and operator!= instead. This should make the code less error prone and easier to understand.	2023-10-06 11:14:41 +02:00

3411 Commits