scylladb

Author	SHA1	Message	Date
Botond Dénes	578a092e4a	reader_concurrency_semaphore: wait for all permits to be destroyed in stop() To prevent use-after-free resulting from any permit out-living the semaphore.	2021-06-16 11:29:36 +03:00
Botond Dénes	8c7447effd	mutation_reader: reader_lifecycle_policy::destroy_reader(): require to be called on native shard Currently shard_reader::close() (its caller) goes to the remote shard, copies back all fragments left there to the local shard, then calls `destroy_reader()`, which in the case of the multishard mutation query copies it all back to the native shard. This was required before because `shard_reader::stop()` (`close()`'s predecessor) couldn't wait on `smp::submit_to()`. But close can, so we can get rid of all this back-and-forth and just call `destroy_reader()` on the shard the reader lives on, just like we do with `create_reader()`.	2021-06-16 11:29:35 +03:00
Botond Dénes	4ecf061c90	reader_lifecycle_policy implementations: fix indentation Left broken from the previous patch.	2021-06-16 11:21:38 +03:00
Botond Dénes	a7e59d3e2c	mutation_reader: reader_lifecycle_policy::destroy_reader(): de-futurize reader parameter The shard reader is now able to wait on the stopped reader and pass the already stopped reader to `destroy_reader()`, so we can de-futurize the reader parameter of said method. The shard reader was already patched to pass a ready future so adjusting the call-site is trivial. The most prominent implementation, the multishard mutation query, can now also drop its `_dismantling_gate` which was put in place so it can wait on the background stopping of readers. A consequence of this move is that errors that might happen during the stopping of the reader are now handled in the shard reader, not in every lifecycle policy implementation.	2021-06-16 11:21:38 +03:00
Piotr Jastrzebski	1ed92e37f8	database: Fix warning about deprecated update_shares_for_class usage This patch fixes the following compilation warning: database.cc:430:33: warning: 'update_shares_for_class' is deprecated: Use io_priority_class.update_shares [-Wdeprecated-declarations] _inflight_update = engine().update_shares_for_class(_io_priority, uint32_t(shares)); Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Closes #8751	2021-06-14 10:42:22 +03:00
Benny Halevy	5a8531c4c8	repair: get_sharder_for_tables: throw no_such_column_family Instead of std::runtime_error with a message that resembles no_such_column_family, throw a no_such_column_family given the keyspace and table uuid. The latter can be explicitly caught and handled if needed. Refs #8612 Test: unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210608113605.91292-1-bhalevy@scylladb.com>	2021-06-08 14:45:44 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Solodovnikov	e0749d6264	treewide: some random header cleanups Eliminate not used includes and replace some more includes with forward declarations where appropriate. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-06-06 19:18:49 +03:00
Benny Halevy	f081e651b3	memtable_list: rename request_flush to just flush Now that it returns a future that always waits on pending flushes there is no point in calling it `request_flush`. `flush()` is simpler and better describes its function. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-06-06 09:21:23 +03:00
Benny Halevy	4f20cd3bea	memtable_list: rename seal_active_memtable_immediate to seal_active_memtable Now that there's no more seal_active_memtable_delayed. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-06-06 09:21:23 +03:00
Benny Halevy	82a263f672	database: apply_in_memory: run_when_memory_available under table::run_async Make sure to apply the mutation under the table's _async_gate. Fixes #8790 Test: unit(dev), view_build_test(debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #8794	2021-06-06 09:21:23 +03:00
Benny Halevy	3ad0f156b9	memtable_list: request_flush: wait on pending flushes also when empty() In https://github.com/scylladb/scylla/issues/8609, table::stop() that is called from database::drop_column_family is expected to wait on outstanding flushes by calling _memtable->request_flush(), but the memtable_list is considered empty() at this point as it has a single empty memtable, so request_flush() returns a ready future, without waiting on outstanding flushes. This change replaces the call to request_flush with flush(). Fix that by either returning _flush_coalescing future that resolves when the memtable is sealed, if available, or go through the get_flush_permit and _dirty_memory_manager->flush_one song and dance, even though the memtable is empty(), as the latter waits on pending flushes. Fixes #8609 Test: unit(dev) DTest: alternator_tests.py:AlternatorTest.test_batch_with_auto_snapshot_false(debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210524143438.1056014-1-bhalevy@scylladb.com>	2021-05-25 11:19:51 +02:00
Avi Kivity	50f3bbc359	Merge "treewide: various header cleanups" from Pavel S " The patch set is an assorted collection of header cleanups, e.g: * Reduce number of boost includes in header files * Switch to forward declarations in some places A quick measurement was performed to see if these changes provide any improvement in build times (ccache cleaned and existing build products wiped out). The results are posted below (`/usr/bin/time -v ninja dev-build`) for 24 cores/48 threads CPU setup (AMD Threadripper 2970WX). Before: Command being timed: "ninja dev-build" User time (seconds): 28262.47 System time (seconds): 824.85 Percent of CPU this job got: 3979% Elapsed (wall clock) time (h:mm:ss or m:ss): 12:10.97 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 2129888 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 1402838 Minor (reclaiming a frame) page faults: 124265412 Voluntary context switches: 1879279 Involuntary context switches: 1159999 Swaps: 0 File system inputs: 0 File system outputs: 11806272 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 After: Command being timed: "ninja dev-build" User time (seconds): 26270.81 System time (seconds): 767.01 Percent of CPU this job got: 3905% Elapsed (wall clock) time (h:mm:ss or m:ss): 11:32.36 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 2117608 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 1400189 Minor (reclaiming a frame) page faults: 117570335 Voluntary context switches: 1870631 Involuntary context switches: 1154535 Swaps: 0 File system inputs: 0 File system outputs: 11777280 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 The observed improvement is about 5% of total wall clock time for `dev-build` target. Also, all commits make sure that headers stay self-sufficient, which would help to further improve the situation in the future. " * 'feature/header_cleanups_v1' of https://github.com/ManManson/scylla: transport: remove extraneous `qos/service_level_controller` includes from headers treewide: remove evidently unneeded storage_proxy includes from some places service_level_controller: remove extraneous `service/storage_service.hh` include sstables/writer: remove extraneous `service/storage_service.hh` include treewide: remove extraneous database.hh includes from headers treewide: reduce boost headers usage in scylla header files cql3: remove extraneous includes from some headers cql3: various forward declaration cleanups utils: add missing <limits> header in `extremum_tracking.hh`	2021-05-24 14:24:20 +03:00
Avi Kivity	924f93028a	db: data_listeners: remove unused field _db Remove the unused field and the constructor that populated it.	2021-05-21 20:56:42 +03:00
Pavel Solodovnikov	fff7ef1fc2	treewide: reduce boost headers usage in scylla header files `dev-headers` target is also ensured to build successfully. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-05-20 01:33:18 +03:00
Piotr Sarna	fa53bf5c1e	database: check for conflicting table names for indexes When an index is created without an explicit name, a default name is chosen. However, there was no check if a table with conflicting name already exists. The check is now in place and if any conflicts are found, a new index name is chosen instead.	2021-05-11 15:20:59 +02:00
Botond Dénes	992819b188	database: add get_unlimited_query_max_result_size() Similar to the already existing get_reader_concurrency_semaphore(), this method determines the appropriate max result size for the query class, which is deduced from the current scheduling group. This method shares its scheduling group -> query class association mechanism with the above mentioned semaphore getter.	2021-05-05 13:30:42 +03:00
Botond Dénes	9313acb304	database: get_reader_concurrency_semaphore(): extract query classification logic Into a local function. In the next patch we want to add another method which needs to classify queries based on the current scheduling group, so prepare for sharing this logic.	2021-05-05 10:41:04 +03:00
Benny Halevy	7c7569f0ad	querier_cache: implement stop Close the _closing_gate to wait on background close of dropped queries, and close all remaining queriers. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	2f9cf01aa7	querier_cache: futurize evict api Prepare for futurizing the lower-level inactive reads api. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	57f921de4f	database: streaming_reader_lifecycle_policy: destroy_reader: close inactive reader Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	43bf0f9356	reader_concurrency_semaphore: add stop method In addition to clear_inactive_reads, that's currently called when the database object is destroyed, introduce a stop() method that will: 1. wait on all background closes of inactive_reads. 2. close all present inactive_reads and wait on their close. 3. signal waiters on the wait_list via broken() with a proper exception indicating that the semaphore was closed. In addition, assert in the semaphore's destructor that it has no remaining inactive reads. Stop must be called from whoever owns the r_c_s. Mainly, from database::stop. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	2c1edb1a94	mutation_reader: reader_lifecycle_policy: return future from destroy_reader So we can wait on it from to-be-introduced shard_reader::close(). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Piotr Sarna	2ad09d0bf8	Merge 'treewide: remove inclusions of storage_proxy.hh from headers' from Avi Kivity Reduce rebuilds and build time by removing unnecessary includes. Along the way, improve header sanity. Ref #1. Test: dev-headers, unit(dev). Closes #8524 * github.com:scylladb/scylla: treewide: remove inclusions of storage_proxy.hh from headers storage_proxy: unnest coordinator_query_result treewide: make headers self-sufficient utils: intrusive_btree: add missing #pragma once	2021-04-21 08:22:52 +02:00
Avi Kivity	daeddda7cc	treewide: remove inclusions of storage_proxy.hh from headers storage_proxy.hh is huge and includes many headers itself, so remove its inclusions from headers and re-add smaller headers where needed (and storage_proxy.hh itself in source files that need it). Ref #1.	2021-04-20 21:23:00 +03:00
Botond Dénes	4c3454dd07	database: get_reader_concurrency_semaphore(): make the user semaphore the catch-all Currently said method uses the system semaphore as a catch-all for all scheduling groups it doesn't know about. This is incompatible with the recent forward-porting of the service-level infrastructure as it means that all service level related scheduling groups will fall back to the system scheduling group, which causes two problems: * They will experience much more limited concurrency, as the system semaphore is assigned far fewer count units, to match the much more limited internal traffic. * They compete with internal reads, severely impacting the respective internal processes, potentially causing extreme slowdown, or even deadlock in the case of an internal query executed on behalf of a user query being blocked on the latter. Even if we don't have any custom service level scheduling groups at the moment, it is better to change this such that unknown scheduling groups fall back to using the user semaphore. We don't expect any new internal scheduling group to pop up any time soon (and if they do we can adjust get_reader_concurrency_semaphore() accordingly), but we do expect user scheduling groups to be created in the future, even dynamically. To minimize the chance of the wrong workload being associated with the user semaphore, all statically created scheduling groups are now explicitly listed in `get_reader_concurrency_semaphore()`, to make their association with the respective semaphore explicit and documented. Added a unit test which also checks the correct association for all these scheduling groups. Fixes: #8508 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210420105156.94002-1-bdenes@scylladb.com>	2021-04-20 14:06:25 +03:00
Kamil Braun	617813ba66	sys_dist_ks: new keyspace for system tables with Everywhere strategy `system_distributed_everywhere` is a new keyspace that uses Everywhere replication strategy. This is useful, for example, when we want to store internal data that should be accessible by every node; the data can be written using CL=ALL (e.g. during node operations such as node bootstrap, which require all nodes to be alive - at least currently) and then read by each node locally using CL=ONE (e.g. during node restarts). Closes #8457	2021-04-19 11:22:57 +03:00
Pavel Emelyanov	5ecbc33be5	database.*: Remove unused headers The database.hh is the central recursive-headers knot -- it has ~50 includes. This patch leaves only 34 (it remains the champion though). Similar thing for database.cc. Both changes help the latter compile ~4% faster :) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210414183107.30374-1-xemul@scylladb.com>	2021-04-18 14:03:17 +03:00
Botond Dénes	80a03826e3	database: mutation_query(): use table::mutation_query() Instead of `mutation_query()` from `mutation_query.hh`. The latter is about to be retired as we want to migrate all users to `table::mutation_query()`. As part of this change, move away from `mutation_query_stage` too. This brings the code paths of the two query variants closer together, as they both have an execution stage declared in `database`.	2021-04-09 13:40:27 +03:00
Avi Kivity	82c76832df	treewide: don't include "db/system_distributed_keyspace.hh" from headers This just causes unneeded and slower recompilations. Instead replace with forward declarations, or includes of smaller headers that were incidentally brought in by the one removed. The .cc files that really need it gain the include, but they are few. Ref #1. Closes #8403	2021-04-04 14:00:26 +03:00
Piotr Jastrzebski	57c7964d6c	config: ignore enable_sstables_mc_format flag Don't allow users to disable MC sstables format any more. We would like to retire some old cluster features that have been around for years. Namely MC_SSTABLE and UNBOUNDED_RANGE_TOMBSTONES. To do this we first have to make sure that all existing clusters have them enabled. It is impossible to know that unless we stop supporting enable_sstables_mc_format flag. Test: unit(dev) Refs #8352 Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Closes #8360	2021-03-31 12:23:59 +03:00
Eliran Sinvani	0220786710	database: Fix view schemas in place when loading On restart the view schemas are loaded and might contain old views with an unmarked computed column. We already have code to update the schema, but before we do it we load the view as is. This is not desired since once registered, this view version can be used for writes, which is forbidden since we will spot a non-computed column which is in the view's primary key but not in the base table at all. To solve this, in addition to altering the persistent schema, we fix the view's loaded schema in place. This is safe since a computed column is just involved in generating a value for this column when creating a view update, so the effect of this manipulation stays internal. The second stage of the in-place fixing is to persist the changes made by the in-place fixing so the view is ready for the next node restart, in particular the `computed_columns` table.	2021-03-07 12:57:16 +02:00
Eliran Sinvani	39cd9dae4e	materialized views: Extract fix legacy schema into its own logic We extract the logic for fixing the view schema into its own logic as we will need to use it in more places in the code. This makes 'maybe_update_legacy_secondary_index_mv_schema' redundant since it becomes a two-line wrapper for this logic. We also remove it here and replace the call to it with the equivalent code.	2021-03-07 12:50:42 +02:00
Tomasz Grabiec	761f89e55e	api: Introduce system/drop_sstable_caches RESTful API Evicts objects from caches which reflect sstable content, like the row cache. In the future, it will also drop the page cache and sstable index caches. Unlike lsa/compact, doesn't cause reactor stalls. The old lsa/compact call invokes memory reclamation, which is non-preemptible. It also compacts LSA segments, so does more work. Some use cases don't need to compact LSA segments, just want the row cache to be wiped. Message-Id: <20210301120211.36195-1-tgrabiec@scylladb.com>	2021-03-01 16:13:04 +02:00
Avi Kivity	78d1afeabd	Merge "Use radix tree to store cells on a row" from Pavel E " Current storage of cells in a row is a union of vector and set. The vector holds 5 cell_and_hash's inline, up to 32 ones in the external storage and then it's switched to std::set. Once switched, the whole union becomes a waste of space, as its size is sizeof(vector head) + 5 * sizeof(cell and hash) = 90+ bytes and only 3 pointers from it are used (std::set header). Also the overhead to keep cell_and_hash as a set entry is more than the size of the structure itself. Column ids are 32-bit integers that most likely come sequentially. For this kind of a search key a radix tree (with some care for non-sequential cases) can be beneficial. This set introduces a compact radix tree, that uses 7-bit sub values from the search key to index on each node and compacts the nodes themselves for better memory usage. Then the row::_storage is replaced with the new tree. The most notable result is the memory footprint decrease, for wide rows down to 2x times. The performance of micro-benchmarks is a bit lower for small rows and (!) higher for longer (8+ cells). The numbers are in patch #12 (spoiler: they are better than for v2) v3: - trimmed size of radix down to 7 bits - simplified the nodes layouts, now there are 2 of them (was 4) - enhanced perf_mutation to test N-cells schema - added AVX intra-nodes search for medium-sized nodes - added .clone_from() method that helped to improve perf_mutation - minor - changed functions not to return values via refs-arguments - fixed nested classes to properly use language constructors - renamed index_to to key_t to distinguish from node_index_t - improved recurring variadic templates not to use sentinel argument - use standard concepts v2: - fixed potential mis-compilation due to strict-aliasing violation - added oracle test (radix tree is compared with std::map) - added radix to perf_collection - cosmetic changes (concepts, comments, names) A note on item 1 from v2 changelog. The nodes are no longer packed perfectly, each has grown 3 bytes. But it turned out that when used as cells container most of this growth drowned in lsa alignments. next todo: - aarch64 version of 16-keys node search tests: unit(dev), unit(debug for radix), pref(dev) " 'br-radix-tree-for-cells-3' of https://github.com/xemul/scylla: test/memory_footpring: Print radix tree node sizes row: Remove old storages row: Prepare row::equal for switch row: Prepare row::difference for switch row: Introduce radix tree storage type row-equal: Re-declare the cells_equal lambda test: Add tests for radix tree utils: Compact radix tree array-search: Add helpers to search for a byte in array test/perf_collection: Add callback to check the speed of clone test/perf_mutation: Add option to run with more than 1 columns test/perf_mutation: Prepare to have several regular columns test/perf_mutation: Use builder to build schema	2021-02-18 21:19:14 +02:00
Benny Halevy	92e0e84ee5	database: futurize remove In preparation for futurizing the querier_cache api. Coroutinize drop_column_family while at it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210215101254.480228-61-bhalevy@scylladb.com>	2021-02-17 18:52:53 +02:00
Pavel Emelyanov	1bdfa355ea	row: Remove old storages Now when the 3rd storage type (radix tree) is all in, old storage can be safely removed. The result is: 1. memory footprint sizeof(class row): 112 => 16 bytes sizeof(rows_entry): 126 => 120 bytes the "in cache" value depends on the number of cells: num of cells master patch 1 752 656 2 808 712 3 864 768 4 920 824 5 968 936 6 1136 992 ... 16 1840 1672 17 1904 1992 (+88) 18 1976 2048 (+72) 19 2048 2104 (+56) 20 2120 2160 (+40) 21 2184 2208 (+24) 22 2256 2264 ( +8) 23 2328 2320 ... 32 2960 2808 After 32 cells the storage switches into rbtree with 24-bytes per-cell overhead and the radix tree improvement rocketlaunches 64 7872 6056 128 15040 9512 256 29376 18568 2. perf_mutation test is enhanced by this series and the results differ depending on the number of columns used tps value --column-count master patch 1 59.9k 57.6k (-3.8%) 2 59.9k 57.5k 4 59.8k 57.6k 8 57.6k 57.7k <- eq 16 56.3k 57.6k 32 53.2k 57.4k (+7.9%) A note on this. Last time 1-column test was ~5% worse which was explained by inline storage of 5 cells that's present on current implementation and was absent in radix tree. An attempt to make inline storage for small radix trees resulted in complete loss of memory footprint gain, but gave fraction of percent to perf_mutation performance. So this version doesn't have inline nodes. The 1.2% improvement from v2 surprisingly came from the tree::clone_from() which in v2 was work-around-ed by slow walk+emplace sequence while this version has the optimized API call for cloning. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-02-15 20:35:06 +03:00
Gleb Natapov	d06d21bfae	database: remove add_keyspace() function It is no longer used. Message-Id: <20210209175931.1796263-2-gleb@scylladb.com>	2021-02-10 00:36:02 +01:00
Gleb Natapov	d8345c67d9	Consolidate system and non system keyspace creation The code that creates the system keyspace open-codes a lot of things from database::create_keyspace(). The patch makes create_keyspace() suitable for both system and non system keyspaces and uses it to create system keyspaces as well. Message-Id: <20210209160506.1711177-1-gleb@scylladb.com>	2021-02-09 17:18:04 +01:00
Avi Kivity	4082f57edc	Merge 'Make commitlog disk limit a hard limit.' from Calle Wilund Refs #6148 Commitlog disk limit was previously a "soft" limit, in that we allowed allocating new segments, even if we were over disk usage max. This would also cause us sometimes to create new segments and delete old ones, if badly timed in needing and releasing segments, in turn causing useless disk IO for pre-allocation/zeroing. This patch set does: * Make limit a hard limit. If we have disk usage > max, we wait for delete or recycle. * Make flush threshold configurable. Default is ask for flush when over 50% usage. (We do not wait for results) * Make flush "partial". We flush X% of the used space (used - thres/2), and make the rp limit accordingly. This means we will try to clear the N oldest segments, not all. I.e. "lighter" flush. Of course, if the CL is wholly dominated by a single CF, this will not really help much. But when > 1 cf is used, it means we can skip those not having unflushed data < req rp. * Force more eager flush/recycle if we're out of segments Note: flush threshold is not exposed in scylla config (yet). Because I am unsure of wording, and even if it should. Note: testing is sparse, esp. in regard to latency/timeouts added in high usage scenarios. While I can fairly easily provoke "stalls" (i.e. forced waiting for segments to free up) with simple C-S, it is hard to say exactly where in a more sane config (I set my limits looow) latencies will start accumulating. Closes #7879 * github.com:scylladb/scylla: commitlog: Force earlier cycle/flush iff segment reserve is empty commitlog: Make segment allocation wait iff disk usage > max commitlog: Do partial (memtable) flushing based on threshold commitlog: Make flush threshold configurable table: Add a flush RP mark to table, and shortcut if not above	2021-02-08 16:44:05 +02:00
Pavel Emelyanov	a05adb8538	database: Remove global storage proxy reference The db::update_keyspace() needs sharded<storage_proxy> reference, but the only caller of it already has it and can pass one as argument. tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210205175611.13464-3-xemul@scylladb.com>	2021-02-08 12:59:46 +01:00
Avi Kivity	913d970c64	Merge "Unify inactive readers" from Botond " Currently inactive readers are stored in two different places: * reader concurrency semaphore * querier cache With the latter registering its inactive readers with the former. This is an unnecessarily complex (and possibly surprising) setup that we want to move away from. This series solves this by moving the responsibility of storing inactive reads solely to the reader concurrency semaphore, including all supported eviction policies. The querier cache is now only responsible for indexing queriers and maintaining relevant stats. This makes the ownership of the inactive readers much more clear, hopefully making Benny's work on introducing close() and abort() a little bit easier. Tests: unit(release, debug:v1) " * 'unify-inactive-readers/v2' of https://github.com/denesb/scylla: reader_concurrency_semaphore: store inactive readers directly querier_cache: store readers in the reader concurrency semaphore directly querier_cache: retire memory based cache eviction querier_cache: delegate expiry to the reader_concurrency_semaphore reader_concurrency_semaphore: introduce ttl for inactive reads querier_cache: use new eviction notify mechanism to maintain stats reader_concurrency_semaphore: add eviction notification facility reader_concurrency_semaphore: extract evict code into method evict()	2021-02-03 10:59:04 +02:00
Calle Wilund	c3d95811da	table: Add a flush RP mark to table, and shortcut if not above Adds a second RP to table, marking where we flushed last. If a new flush request comes in that is below this mark, we can skip a second flush. This is to (in future) support incremental CL flush.	2021-01-05 18:16:09 +00:00
Piotr Sarna	aba9772eff	database: migrate find_keyspace to string views ... in order to avoid creating unnecessary sstring instances just to compare strings.	2021-01-04 09:47:01 +01:00
Calle Wilund	71c5dc82df	database: Verify iff we actually are writing memtables to disk in truncate Fixes #7732 When truncating with auto_snapshot on, we try to verify the low rp mark from the CF against the sstables discarded by the truncation timestamp. However, in a scenario like: Fill memtables Flush Truncate with snapshot A Fill memtables some more Truncate Move snapshot A to upload + refresh (load old tables) Truncate The last op will assert, because while we have sstables loaded, which will be discarded now, we did not in fact generate any _new_ ones (since memtables are empty), and the RP we get back from discard is one from an earlier generation set. (Any permutation of events that create the situation "empty memtable" + "non-empty sstables with only old tables" will generate the same error). Added a check that before flushing checks if we actually have any data, and if not, does not uphold the RP relation assert. Closes #7799	2020-12-15 16:24:36 +02:00
Piotr Sarna	cd1e351dc1	table: unify waiting for pending operations In order to reduce code duplication which already caused a bug, waiting for pending operations is now unified with a single helper function.	2020-12-15 13:11:25 +01:00
Piotr Sarna	57d63ca036	database: add waiting for pending streams on table drop We already wait for pending reads and writes, so for completeness we should also wait for all pending stream operations to finish before dropping the table to avoid inconsistencies.	2020-12-15 12:55:45 +01:00
Pavel Emelyanov	62214e2258	database: Have local id arg in transform_counter_updates_to_shards() There are two places that call it -- database code itself and tests. The former already has the local host id, so just pass one. The latter are a bit trickier. Currently they use the value from storage_service created by storage_service_for_tests, but since this version of service doesn't pass through prepare_to_join() the local_host_id value there is default-initialized, so just default-initialize the needed argument in place. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-12-04 15:09:30 +03:00
Pavel Emelyanov	66dcc47571	system-keyspace: Rewrite force_blocking_flush The method is called after query_processor::execute_internal to flush the cf. Encapsulating this flush inside database and getting the database from query_processor allows removing the database reference from the global qctx object. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Avi Kivity	f55b522c1b	database: detect misconfigured unit tests that don't set available_memory available_memory is used to seed many caches and controllers. Usually it's detected from the environment, but unit tests configure it on their own with fake values. If they forget, then the undefined behavior sanitizer will kick in in random places (see `8aa842614a` ("test: gossip_test: configure database memory allocation correctly") for an example). Prevent this early by asserting that available_memory is nonzero. Closes #7612	2020-11-18 08:49:32 +02:00

1401 Commits