scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-13 03:12:13 +00:00

Author	SHA1	Message	Date
Botond Dénes	566e31a5ac	db/view: view_updating_consumer: allow passing custom update pusher So that tests can test the `view_updating_consumer` in isolation, without having to set up the whole database machinery. In addition to less infrastructure setup, this allows more direct checking of mutations pushed for view generation.	2020-07-20 11:23:39 +03:00
Botond Dénes	0166f97096	db/view: view_update_generator: make staging reader evictable The view update generation process creates two readers. One is used to read the staging sstables, the data which needs view updates to be generated for, and another reader for each processed mutation, which reads the current value (pre-image) of each row in said mutation. The staging reader is created first and is kept alive until all staging data is processed. The pre-image reader is created separately for each processed mutation. The staging reader is not restricted, meaning it does not wait for admission on the relevant reader concurrency semaphore, but it does register its resource usage on it. The pre-image reader however is restricted. This creates a situation, where the staging reader possibly consumes all resources from the semaphore, leaving none for the later created pre-image reader, which will not be able to start reading. This will block the view building process meaning that the staging reader will not be destroyed, causing a deadlock. This patch solves this by making the staging reader restricted and making it evictable. To prevent thrashing -- evicting the staging reader after reading only a really small partition -- we only make the staging reader evictable after we have read at least 1MB worth of data from it.	2020-07-20 11:23:39 +03:00
Botond Dénes	84357f0722	db/view: view_updating_consumer: move implementation from table.cc to view.cc table.cc is a very counter-intuitive place for view related stuff, especially if the declarations reside in `db/view/`.	2020-07-20 11:23:39 +03:00
Pavel Emelyanov	8618a02815	migration_manager: Remove db/schema_tables.hh inclusion into header The schema_tables.hh -> migration_manager.hh couple seems to work as one of "single header for everything" creating big bloat for many seemingly unrelated .hh's. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-17 17:54:43 +03:00
Amnon Heiman	ea8d52b11c	row_locking: change estimated histogram with time_estimated_histogram This patch changes the row locking latencies to use time_estimated_histogram. The change consists of changing the histogram definition and changing how values are inserted into the histogram. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2020-07-14 11:17:43 +03:00
Avi Kivity	b0698dfb38	Merge 'Rewrite CQL3 restriction representation' from dekimir " This is the first stage of replacing the existing restrictions code with a new representation. It adds a new class `expression` to replace the existing class `restriction`. Lots of the old code is deleted, though not all -- that will come in subsequent stages. Tests: unit (dev, debug restrictions_test), dtest (next-gating) " * dekimir-restrictions-rewrite: cql3/restrictions: Drop dead code cql3/restrictions: Use free functions instead of methods cql3/restrictions: Create expression objects cql3/restrictions: Add free functions over new classes cql3/restrictions: Add new representation	2020-07-08 10:22:17 +03:00
Dejan Mircevski	37ebe521e3	cql3/restrictions: Use free functions instead of methods Instead of `restriction` class methods, use the new free functions. Specific replacement actions are listed below. Note that class `restrictions` (plural) remains intact -- both its methods and its type hierarchy remain intact for now. Ensure full test coverage of the replacement code with new file test/boost/restrictions_test.cc and some extra testcases in test/cql/*. Drop some existing tests because they codify buggy behaviour (reference #6369, #6382). Drop others because they forbid relation combinations that are now allowed (eg, mixing equality and inequality, comparing to NULL, etc.). Here are some specific categories of what was replaced: - restriction::is_foo predicates are replaced by using the free function find_if; sometimes it is used transitively (see, eg, has_slice) - restriction::is_multi_column is replaced by dynamic casts (recall that the `restrictions` class hierarchy still exists) - utility methods is_satisfied_by, is_supported_by, to_string, and uses_function are replaced by eponymous free functions; note that restrictions::uses_function still exists - restriction::apply_to is replaced by free function replace_column_def - when checking infinite_bound_range_deletions, the has_bound is replaced by local free function bounded_ck - restriction::bounds and restriction::value are replaced by the more general free function possible_lhs_values - using free functions allows us to simplify the multi_column_restriction and token_restriction hierarchies; their methods merge_with and uses_function became identical in all subclasses, so they were moved to the base class - single_column_primary_key_restrictions<clustering_key>::needs_filtering was changed to reuse num_prefix_columns_that_need_not_be_filtered, which uses free functions Fixes #5799. Fixes #6369. Fixes #6371. Fixes #6372. Fixes #6382. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-07-07 23:08:09 +02:00
Botond Dénes	5ebe2c28d1	db/view: view_update_generator: re-balance wait/signal on the register semaphore The view update generator has a semaphore to limit concurrency. This semaphore is waited on in `register_staging_sstable()` and later the unit is returned after the sstable is processed in the loop inside `start()`. This was broken by `4e64002`, which changed the loop inside `start()` to process sstables in per table batches, however didn't change the `signal()` call to return the amount of units according to the number of sstables processed. This can cause the semaphore units to dry up, as the loop can process multiple sstables per table but return just a single unit. This can also block callers of `register_staging_sstable()` indefinitely as some waiters will never be released as under the right circumstances the units on the semaphore can permanently go below 0. In addition to this, `4e64002` introduced another bug: table entries from the `_sstables_with_tables` are never removed, so they are processed every turn. If the sstable list is empty, there won't be any update generated but due to the unconditional `signal()` described above, this can cause the units on the semaphore to grow to infinity, allowing future staging sstable producers to register a huge amount of sstables, causing memory problems due to the amount of sstable readers that have to be opened (#6603, #6707). Both outcomes are equally bad. This patch fixes both issues and modifies the `test_view_update_generator` unit test to reproduce them and hence to verify that this doesn't happen in the future. Fixes: #6774 Refs: #6707 Refs: #6603 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200706135108.116134-1-bdenes@scylladb.com>	2020-07-07 08:53:00 +02:00
Wojciech Mitros	76038b8d8e	view: differentiate identical error messages and change them to warnings Modified log message in view_builder::calculate_shard_build_step to make it distinct from the one in view_builder::execute, changed their logging level to warning, since we're continuing even if we handle an exception. Fixes #4600	2020-07-06 20:50:34 +03:00
Botond Dénes	62c6859b69	db/view: view_update_generator: use partitioned sstable set And pass it to `make_range_sstable_reader()` when creating the reader, thus allowing the incremental selector created therein to exploit the fact that staging sstables are disjoint (in the case of repair and streaming at least). This should reduce the memory consumption of the staging reader considerably when reading from a lot of sstables.	2020-07-06 13:38:23 +03:00
Rafael Ávila de Espíndola	64c8164e6c	everywhere: Update to seastar api v4 (when_all_succeed returning a tuple) We now just need to replace a few calls to then with then_unpack. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200618172100.111147-1-espindola@scylladb.com>	2020-06-23 19:40:18 +03:00
Avi Kivity	de38091827	priority_manager: merge streaming_read and streaming_write classes into one class Streaming is handled by just one group for CPU scheduling, so separating it into read and write classes for I/O is artificial, and inflates the resources we allow for streaming if both reads and writes happen at the same time. Merge both classes into one class ("streaming") and adjust callers. The merged class has 200 shares, so it reduces streaming bandwidth if both directions are active at the same time (which is rare; I think it only happens in view building).	2020-06-22 15:09:04 +03:00
Rafael Ávila de Espíndola	f6e407ecd2	everywhere: Prepare for seastar api v4 (when_all_succeed return value) The seastar api v4 changes the return type of when_all_succeed. This patch adds discard_result when that is the best solution to handle the change. This doesn't do the actual update to v4 since there are still a few issues left to fix in seastar. A patch doing just the update will follow. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200617233150.918110-1-espindola@scylladb.com>	2020-06-18 15:13:56 +03:00
Piotr Sarna	3458bd2e32	db,view: fix outdated comments Some comments still referred to variable names which are no longer up-to-date. Follow-up for #6560. Message-Id: <2b857ccc900dd64f0d9379f5d6c87fd3aaa5d902.1591594042.git.sarna@scylladb.com>	2020-06-08 09:02:10 +03:00
Nadav Har'El	d6626c217a	merge: add error injection to mv Merged pull request https://github.com/scylladb/scylla/pull/6516 from Piotr Sarna: This series adds error injection points to materialized view paths: view update generation from staging sstables; view building; generating view updates from user writes. This series comes with a corresponding dtest pull request which adds some test cases based on error injection. Fixes #6488	2020-06-07 19:23:23 +03:00
Piotr Sarna	b3a6a33487	db,view: ensure that local updates are applied locally In current mutate_MV() code it's possible for a local endpoint to become a target for a network operation. That's the source of occasional `broken promise` benign error messages appearing, since the mutation is actually applied locally, so there's no point in creating a write response handler - the node will not send a response to itself via network. While at it, the code is deduplicated a little bit - with the paths simplified, it's easier to ensure that a local endpoint is never listed as a target for remote network operations. Fixes #5459 Tests: unit(dev), dtest(materialized_views_test.TestMaterializedViews.add_dc_during_mv_insert_test)	2020-06-07 19:10:03 +03:00
Piotr Sarna	76e89efc1a	db,view: add error injection points to view building ... in order to be able to test scenarios with failures.	2020-06-05 09:39:58 +02:00
Piotr Sarna	9d524a7a7e	db,view: add error injection points to view update generator ... in order to be able to test scenarios with failures.	2020-06-05 09:39:58 +02:00
Avi Kivity	0c6bbc84cd	Merge "Classify queries based on their initiator, rather than their target" from Botond " Currently we classify queries as "system" or "user" based on the table they target. The class of a query determines how the query is treated, currently: timeout, limits for reverse queries and the concurrency semaphore. The catch is that users are also allowed to query system tables and when doing so they will bypass the limits intended for user queries. This has caused performance problems in the past, yet the reason we decided to finally address this is that we want to introduce a memory limit for unpaged queries. Internal (system) queries are all unpaged and we don't want to impose the same limit on them. This series uses scheduling groups to distinguish user and system workloads, based on the assumption that user workloads will run in the statement scheduling group, while system workloads will run in the main (or default) scheduling group, or perhaps something else, but in any case not in the statement one. Currently the scheduling group of reads and writes is lost when going through the messaging service, so to be able to use scheduling groups to distinguish user and system reads this series refactors the messaging service to retain this distinction across verb calls. Furthermore, we execute some system reads/writes as part of user reads/writes, such as auth and schema sync. These processes are tagged to run in the main group. This series also centralises query classification on the replica and moves it to a higher level. More specifically, queries are now classified -- the scheduling group they run in is translated to the appropriate query class specific configuration -- on the database level and the configuration is propagated down to the lower layers. Currently this query class specific configuration consists of the reader concurrency semaphore and the max memory limit for otherwise unlimited queries. 
A corollary of the semaphore being selected on the database level is that the read permit is now created before the read starts. A valid permit is now available during all stages of the read, enabling tracking the memory consumption of e.g. the memtable and cache readers. This change aligns nicely with the needs of more accurate reader memory tracking, which also wants a valid permit that is available in every layer. The series can be divided roughly into the following distinct patch groups: * 01-02: Give system read concurrency a boost during startup. * 03-06: Introduce user/system statement isolation to messaging service. * 07-13: Various infrastructure changes to prepare for using read permits in all stages of reads. * 14-19: Propagate the semaphore and the permit from database to the various table methods that currently create the permit. * 20-23: Migrate away from using the reader concurrency semaphore for waiting for admission, use the permit instead. * 24: Introduce `database::make_query_config()` and switch the database methods needing such a config to use it. * 25-31: Get rid of all uses of `no_reader_permit()`. * 32-33: Ban empty permits for good. * 34: querier_cache: use the queriers' permits to obtain the semaphore. Fixes: #5919 Tests: unit(dev, release, debug), dtest(bootstrap_test.py:TestBootstrap.start_stop_test_node), manual testing with a 2 node mixed cluster with extra logging. 
" * 'query-class/v6' of https://github.com/denesb/scylla: (34 commits) querier_cache: get semaphore from querier reader_permit: forbid empty permits reader_permit: fix reader_resources::operator bool treewide: remove all uses of no_reader_permit() database: make_multishard_streaming_reader: pass valid permit to multi range reader sstables: pass valid permits to all internal reads compaction: pass a valid permit to sstable reads database: add compaction read concurrency semaphore view: use valid permits for reads from the base table database: use valid permit for counter read-before-write database: introduce make_query_class_config() reader_concurrency_semaphore: remove wait_admission and consume_resources() test: move away from reader_concurrency_semaphore::wait_admission() reader_permit: resource_units: introduce add() mutation_reader: restricted_reader: work in terms of reader_permit row_cache: pass a valid permit to underlying read memtable: pass a valid permit to the delegate reader table: require a valid permit to be passed to most read methods multishard_mutation_query: pass a valid permit to shard mutation sources querier: add reader_permit parameter and forward it to the mutation_source ...	2020-05-29 10:11:44 +03:00
Piotr Sarna	77e943e9a3	db,views: unify time points used for update generation Until now, view updates were generated with a bunch of random time points, because the interface was not adjusted for passing a single time point. The time points were used to determine whether cells were alive (e.g. because of TTL), so it's better to unify the process: 1. when generating view updates from user writes, a single time point is used for the whole operation 2. when generating view updates via the view building process, a single time point is used for each build step NOTE: I don't see any reliable and deterministic way of writing test scenarios which trigger problems with the old code. After #6488 is resolved and error injection is integrated into view.cc, tests can be added. Fixes #6429 Tests: unit(dev) Message-Id: <f864e965eb2e27ffc13d50359ad1e228894f7121.1590070130.git.sarna@scylladb.com>	2020-05-28 12:56:09 +03:00
Botond Dénes	992e697dd5	view: use valid permits for reads from the base table View update generation involves reading existing values from the base table, which will soon require a valid permit to be passed to it, so make sure we create and pass a valid permit to these reads. We use `database::make_query_class_config()` to obtain the semaphore for the read which selects the appropriate user/system semaphore based on the scheduling group the base table write is running in.	2020-05-28 11:34:35 +03:00
Botond Dénes	cc5137ffe3	table: require a valid permit to be passed to most read methods Now that the most prevalent users (range scan and single partition reads) all pass valid permits we require all users to do so and propagate the permit down towards `make_sstable_reader()`. The plan is to use this permit for restricting the sstable readers, instead of the semaphore the table is configured with. The various `make_streaming_*reader()` overloads keep using the internal semaphores as before, but they also create the permit before the read starts and pass it to `make_sstable_reader()`.	2020-05-28 11:34:35 +03:00
Piotr Sarna	18a37d0cb1	db,view: add tracing to view update generation path In order to improve materialized views' debuggability, tracing points are added to view update generation path. Sample info of an insert statement which resulted in producing local view updates which require read-before-write: activity \| timestamp \| source \| source_elapsed \| client ------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+----------- Execute CQL3 query \| 2020-04-19 12:02:48.420000 \| 127.0.0.1 \| 0 \| 127.0.0.1 Parsing a statement [shard 0] \| 2020-04-19 12:02:48.420674 \| 127.0.0.1 \| -- \| 127.0.0.1 Processing a statement [shard 0] \| 2020-04-19 12:02:48.420753 \| 127.0.0.1 \| 79 \| 127.0.0.1 Creating write handler for token: -6715243485458697746 natural: {127.0.0.1} pending: {} [shard 0] \| 2020-04-19 12:02:48.420815 \| 127.0.0.1 \| 141 \| 127.0.0.1 Creating write handler with live: {127.0.0.1} dead: {} [shard 0] \| 2020-04-19 12:02:48.420824 \| 127.0.0.1 \| 149 \| 127.0.0.1 Executing a mutation locally [shard 0] \| 2020-04-19 12:02:48.420830 \| 127.0.0.1 \| 155 \| 127.0.0.1 View updates for ks.t1 require read-before-write - base table reader is created [shard 0] \| 2020-04-19 12:02:48.420862 \| 127.0.0.1 \| 188 \| 127.0.0.1 Generated 2 view update mutations [shard 0] \| 2020-04-19 12:02:48.420910 \| 127.0.0.1 \| 235 \| 127.0.0.1 Locally applying view update for ks.t1_v_idx_index; base token = -6715243485458697746; view token = -4156302194539278891 [shard 0] \| 2020-04-19 12:02:48.420918 \| 127.0.0.1 \| 243 \| 127.0.0.1 Successfully applied local view update for 127.0.0.1 and 0 remote endpoints [shard 0] \| 2020-04-19 12:02:48.420971 \| 127.0.0.1 \| 297 \| 127.0.0.1 View updates for ks.t1 were generated and propagated [shard 0] \| 2020-04-19 12:02:48.420973 \| 127.0.0.1 \| 299 \| 127.0.0.1 Got a response from /127.0.0.1 [shard 0] \| 
2020-04-19 12:02:48.420988 \| 127.0.0.1 \| 314 \| 127.0.0.1 Delay decision due to throttling: do not delay, resuming now [shard 0] \| 2020-04-19 12:02:48.420990 \| 127.0.0.1 \| 315 \| 127.0.0.1 Mutation successfully completed [shard 0] \| 2020-04-19 12:02:48.420994 \| 127.0.0.1 \| 320 \| 127.0.0.1 Done processing - preparing a result [shard 0] \| 2020-04-19 12:02:48.421000 \| 127.0.0.1 \| 326 \| 127.0.0.1 Request complete \| 2020-04-19 12:02:48.420330 \| 127.0.0.1 \| 330 \| 127.0.0.1 Sample info for remote updates: activity \| timestamp \| source \| source_elapsed \| client --------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+----------- Execute CQL3 query \| 2020-04-26 16:19:47.691000 \| 127.0.0.1 \| 0 \| 127.0.0.1 Parsing a statement [shard 1] \| 2020-04-26 16:19:47.691590 \| 127.0.0.1 \| 6 \| 127.0.0.1 Processing a statement [shard 1] \| 2020-04-26 16:19:47.692368 \| 127.0.0.1 \| 783 \| 127.0.0.1 Creating write handler for token: -3248873570005575792 natural: {127.0.0.3, 127.0.0.2} pending: {} [shard 1] \| 2020-04-26 16:19:47.694186 \| 127.0.0.1 \| 2598 \| 127.0.0.1 Creating write handler with live: {127.0.0.2, 127.0.0.3} dead: {} [shard 1] \| 2020-04-26 16:19:47.694283 \| 127.0.0.1 \| 2699 \| 127.0.0.1 Sending a mutation to /127.0.0.2 [shard 1] \| 2020-04-26 16:19:47.694591 \| 127.0.0.1 \| 3006 \| 127.0.0.1 Sending a mutation to /127.0.0.3 [shard 1] \| 2020-04-26 16:19:47.694862 \| 127.0.0.1 \| 3277 \| 127.0.0.1 Message received from /127.0.0.1 [shard 1] \| 2020-04-26 16:19:47.696358 \| 127.0.0.3 \| 40 \| 127.0.0.1 Message received from /127.0.0.1 [shard 1] \| 2020-04-26 16:19:47.696442 \| 127.0.0.2 \| 32 \| 127.0.0.1 View updates for ks.t require read-before-write - base table reader is created [shard 1] \| 2020-04-26 16:19:47.697762 \| 127.0.0.3 \| 1444 \| 127.0.0.1 View updates for ks.t 
require read-before-write - base table reader is created [shard 1] \| 2020-04-26 16:19:47.698120 \| 127.0.0.2 \| 1710 \| 127.0.0.1 Generated 1 view update mutations [shard 1] \| 2020-04-26 16:19:47.699107 \| 127.0.0.3 \| 2789 \| 127.0.0.1 Sending view update for ks.t_v2_idx_index to 127.0.0.4, with pending endpoints = {}; base token = -3248873570005575792; view token = 1634052884888577606 [shard 1] \| 2020-04-26 16:19:47.699345 \| 127.0.0.3 \| 3027 \| 127.0.0.1 Sending a mutation to /127.0.0.4 [shard 1] \| 2020-04-26 16:19:47.699614 \| 127.0.0.3 \| 3296 \| 127.0.0.1 Generated 1 view update mutations [shard 1] \| 2020-04-26 16:19:47.699824 \| 127.0.0.2 \| 3414 \| 127.0.0.1 Locally applying view update for ks.t_v2_idx_index; base token = -3248873570005575792; view token = 1634052884888577606 [shard 1] \| 2020-04-26 16:19:47.700012 \| 127.0.0.2 \| 3603 \| 127.0.0.1 View updates for ks.t were generated and propagated [shard 1] \| 2020-04-26 16:19:47.700059 \| 127.0.0.3 \| 3741 \| 127.0.0.1 Message received from /127.0.0.3 [shard 1] \| 2020-04-26 16:19:47.700958 \| 127.0.0.4 \| 37 \| 127.0.0.1 Successfully applied local view update for 127.0.0.2 and 0 remote endpoints [shard 1] \| 2020-04-26 16:19:47.701522 \| 127.0.0.2 \| 5112 \| 127.0.0.1 View updates for ks.t were generated and propagated [shard 1] \| 2020-04-26 16:19:47.701615 \| 127.0.0.2 \| 5206 \| 127.0.0.1 Sending mutation_done to /127.0.0.1 [shard 1] \| 2020-04-26 16:19:47.701913 \| 127.0.0.3 \| 5595 \| 127.0.0.1 Mutation handling is done [shard 1] \| 2020-04-26 16:19:47.702489 \| 127.0.0.3 \| 6171 \| 127.0.0.1 Got a response from /127.0.0.3 [shard 1] \| 2020-04-26 16:19:47.702667 \| 127.0.0.1 \| 11082 \| 127.0.0.1 Delay decision due to throttling: do not delay, resuming now [shard 1] \| 2020-04-26 16:19:47.702689 \| 127.0.0.1 \| 11105 \| 127.0.0.1 Mutation successfully completed [shard 1] \| 2020-04-26 16:19:47.702784 \| 127.0.0.1 \| 11200 \| 127.0.0.1 Sending mutation_done to /127.0.0.1 [shard 1] \| 
2020-04-26 16:19:47.703016 \| 127.0.0.2 \| 6606 \| 127.0.0.1 Done processing - preparing a result [shard 1] \| 2020-04-26 16:19:47.703054 \| 127.0.0.1 \| 11470 \| 127.0.0.1 Sending mutation_done to /127.0.0.3 [shard 1] \| 2020-04-26 16:19:47.703720 \| 127.0.0.4 \| 2800 \| 127.0.0.1 Mutation handling is done [shard 1] \| 2020-04-26 16:19:47.704527 \| 127.0.0.4 \| 3607 \| 127.0.0.1 Got a response from /127.0.0.4 [shard 1] \| 2020-04-26 16:19:47.704580 \| 127.0.0.3 \| 8262 \| 127.0.0.1 Delay decision due to throttling: do not delay, resuming now [shard 1] \| 2020-04-26 16:19:47.704606 \| 127.0.0.3 \| 8288 \| 127.0.0.1 Successfully applied view update for 127.0.0.4 and 1 remote endpoints [shard 1] \| 2020-04-26 16:19:47.704853 \| 127.0.0.3 \| 8535 \| 127.0.0.1 Mutation handling is done [shard 1] \| 2020-04-26 16:19:47.706092 \| 127.0.0.2 \| 9682 \| 127.0.0.1 Got a response from /127.0.0.2 [shard 1] \| 2020-04-26 16:19:47.709933 \| 127.0.0.1 \| 18348 \| 127.0.0.1 Request complete \| 2020-04-26 16:19:47.702582 \| 127.0.0.1 \| 11582 \| 127.0.0.1 Tests: unit(dev, debug)	2020-05-18 16:05:23 +02:00
Piotr Sarna	92aadb94e5	treewide: propagate trace state to write path In order to add tracing to places where it can be useful, e.g. materialized view updates and hinted handoff, tracing state is propagated to all applicable call sites.	2020-05-18 16:05:23 +02:00
Piotr Sarna	f48e414eab	db, view: remove duplicate entries from pending endpoints When generating view updates, an endpoint can appear both as a primary paired endpoint for the view update, and as a pending endpoint (due to range movements). In order not to generate the same update twice for the same endpoint, the paired endpoint is removed from the list of pending endpoints if present. Fixes #5459 Tests: unit(dev), dtest(TestMaterializedViews.add_dc_during_mv_insert_test)	2020-05-06 16:42:56 +03:00
Glauber Costa	1f9c37fb5e	view_updating_consumer: move reference to a pointer It is currently not possible to wrap the view_updating_consumer in an std::optional. I intend to do it to allow for compactions to optionally generate view updates. The reason for that is that view_updating_consumer has a reference as a member, which prevents the move assignment operator from being implicitly generated. This patch fixes it by keeping a pointer instead of a reference. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200421123648.8328-1-glauber@scylladb.com>	2020-04-22 10:05:35 +03:00
Glauber Costa	4e6400293e	staging: potentially read many SSTables at the same time There is no reason to read a single SSTable at a time from the staging directory. Moving SSTables from staging directory essentially involves scanning input SSTables and creating new SSTables (albeit in a different directory). We have a mechanism that does that: compactions. In a follow up patch, I will introduce a new specialization of compaction that moves SSTables from staging (potentially compacting them if there are plenty). In preparation for that, some signatures have to be changed and the view_updating_consumer has to be more compaction friendly. Meaning: - Operating with an sstable vector - taking a table reference, not a database Because this code is a bit fragile and the reviewer set is fundamentally different from anything compaction related, I am sending this separately Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-04-15 11:26:44 -04:00
Piotr Sarna	1a9083b342	db,view: guard view builder startup with a semaphore The startup routine performs some bookkeeping operations on views, and so do these events: - on_create_view; - on_drop_view; - on_update_view. Since the above events are guarded with a semaphore, the startup routine should also take the same semaphore - in order to ensure that all bookkeeping operations are serialized. Refs #6094	2020-04-05 11:41:26 +02:00
Piotr Sarna	8da4a5b78c	db,view: nitpick: change & operator to && for booleans Although it's technically correct to use the bitwise and operator on booleans as well, it's slightly confusing for the reader.	2020-04-05 11:41:25 +02:00
Piotr Sarna	e49805b7b8	db,view: remove unneeded implicit capture-by-reference The lambda does not use any other captures, so it does not need to implicitly capture anything by reference.	2020-04-05 11:41:25 +02:00
Piotr Sarna	3f19865493	db,view: fix waiting for a view building future The future was marked with a `FIXME: discarded future`, but there's really no reason not to wait for it, and it was probably meant to be waited for since its implementation.	2020-04-05 11:41:25 +02:00
Botond Dénes	240b5e0594	frozen_schema: key() remove unused schema parameter Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200402092249.680210-1-bdenes@scylladb.com>	2020-04-02 14:43:35 +02:00
Rafael Ávila de Espíndola	c5795e8199	everywhere: Replace engine().cpu_id() with this_shard_id() This is a bit simpler and might allow removing a few includes of reactor.hh. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200326194656.74041-1-espindola@scylladb.com>	2020-03-27 11:40:03 +03:00
Pavel Solodovnikov	adc6a98b59	cql3: return raw::parsed_statement as unique_ptr Change CQL parsing routine to return std::unique_ptr instead of seastar::shared_ptr. This can help reduce redundant shared_ptr copies even further. Make some supplementary changes necessary for this transition: * Remove enable_shared_from_this base class from the following classes: truncate_statement, authorization_statement, authentication_statement: these were previously constructing prepared_statement instance in `prepare` method using `shared_from_this`. Make `prepare` methods implementation of inheriting classes mirror implementation from other statements (i.e. create a shallow copy of the object when preparing into `prepared_statement`; this could be further refactored to avoid copies as much as possible). * Remove unused fields in create_role_statement which led to error while using compiler-generated copy ctor (copying uninitialized bool values via ctor). Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2020-03-23 23:19:21 +03:00
Botond Dénes	e0284bb9ee	treewide: add missing headers and/or forward declarations	2020-03-23 09:29:45 +02:00
Nadav Har'El	7922b9eb8f	materialized views: reduce recompilation when db/view/view.hh changes. Before this patch, when db/view/view.hh was modified, 89 source files had to be recompiled. After this patch, this number is down to 5. Most of the irrelevant source files got view.hh by including database.hh, which included view.hh just for the definition of statistics. So in this patch we split the view statistics to a separate header file, view_stats.hh, and database.hh only includes that. A few source files which included only database.hh and also needed view.hh (for materialized-view related functions) now need to include view.hh explicitly. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200319121031.540-1-nyh@scylladb.com>	2020-03-19 15:46:14 +02:00
Piotr Sarna	0c11e07faf	view,table: fix waiting for view updates during building View updates sent as part of the view building process should never be ignored, but `fd49fd7` introduced a bug which may cause exactly that: the updates are mistakenly sent to background, so the view builder will not receive negative feedback if an update failed, which will in turn not cause a retry. Consequently, view building may report that it "finished" building a view, while some of the updates were lost. A simple fix is to restore previous behaviour - all updates triggered by view building are now waited for. Fixes #6038 Tests: unit(dev), dtest: interrupt_build_process_with_resharding_low_to_half_test	2020-03-19 10:50:54 +02:00
Nadav Har'El	635e6d887c	materialized views: fix corner case of view updates used by Alternator While CQL does not allow creation of a materialized view with more than one base regular column in the view's key, in Alternator we do allow this - both partition and clustering key may be a base regular column. We had a bug in the logic handling this case: If the new base row is missing a value for one of the view key columns, we shouldn't create a view row. Similarly, if the existing base row was missing a value for one of the view key columns, a view row does not exist and doesn't need to be deleted. This was done incorrectly, and made decisions based on just one of the key columns, and the logic is now fixed (and I think, simplified) in this patch. With this patch, the Alternator test which previously failed because of this problem now passes. The patch also includes new tests in the existing C++ unit test test_view_with_two_regular_base_columns_in_key. This test was already supposed to be testing various cases of two-new-key-columns updates, but missed the cases explained above. These new tests failed badly before this patch - some of them had clean write errors, others caused crashes. With this patch, they pass. Fixes #6008. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200312162503.8944-1-nyh@scylladb.com>	2020-03-15 07:57:33 +01:00
Piotr Sarna	2061e6a9cc	db,view: perform local view updates synchronously Local view updates (updates applied to a local node, without remote communication) are from now on performed synchronously - which adds consistency guarantees, as a local write failure will be returned to the client instead of being silently ignored.	2020-03-11 09:05:56 +01:00
Piotr Sarna	fd49fd773c	db,view: move putting view updates to background to mutate_MV Currently, launching view updates as an asynchronous background job is done via not waiting for mutate_MV() future in table::generate_and_propagate_view_updates. That has a big downside, since mutate_MV() handles all view updates for all views of a table, so it's not possible to wait for each view independently. Per-view granularity is required in order to implement synchronous view updates of local views - because then we'll synchronously wait for all views that write to a local node (due to having a matching partition key with the base), while remote view updates will still be sent asynchronously. In order to do that, instead of not waiting for mutate_MV, we do wait for it properly, but instead launch the asynchronous, unwaited-for futures inside mutate_MV. Effectively that means no changes for view updates so far - all updates will be fired in the background. Later, another patch will introduce a way to wait for selected updates to finish.	2020-03-11 09:05:56 +01:00
Piotr Sarna	3b3659e8cd	db,view: drop default parameter for mutate_MV::allow_hints Default parameters are considered harmful, and as part of a cleanup before editing view.cc code, a default value for allow_hints parameter is removed.	2020-03-11 09:05:56 +01:00
Pavel Emelyanov	4fa12f2fb8	header: De-bloat schema.hh The header sits in many other headers, but there's a handy schema_fwd.hh that's tiny and contains needed declarations for other headers. So replace schema.hh with schema_fwd.hh in most of the headers (and remove completely from some). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200303102050.18462-1-xemul@scylladb.com>	2020-03-03 11:34:00 +01:00
Piotr Jastrzebski	76d154dbac	view: stop calling global_partitioner() Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-02-17 10:59:15 +01:00
Piotr Sarna	e93c54e837	db,view: fix generating view updates for partition tombstones The update generation path must track and apply all tombstones, both from the existing base row (if read-before-write was needed) and for the new row. One such path contained an error, because it assumed that if the existing row is empty, then the update can be simply generated from the new row. However, lack of the existing row can also be the result of a partition/range tombstone. If that's the case, it needs to be applied, because it's entirely possible that this partition row also hides the new row. Without taking the partition tombstone into account, creating a future tombstone and inserting an out-of-order write before it in the base table can result in ghost rows in the view table. This patch comes with a test which was proven to fail before the changes. Branches 3.1,3.2,3.3 Fixes #5793 Tests: unit(dev) Message-Id: <8d3b2abad31572668693ab585f37f4af5bb7577a.1581525398.git.sarna@scylladb.com>	2020-02-12 23:16:30 +02:00
Avi Kivity	dcab666d52	cql3: query_processor: reduce #includes query_processor is a central class, so reducing its includes can reduce dependencies treewide. This patch removes includes for parsed_statement, cf_statement, and untyped_result_set and fixes up the rest of the tree to include what it lacks as a result of these removals.	2020-02-09 12:24:24 +02:00
Pavel Emelyanov	e2ec5eecf6	view_update: Do not need storage_proxy The view_update_generator accepts (and keeps) database and storage_proxy, the latter is only needed to initialize the view_updating_consumer which, in turn, only needs it to get database from (to find column family). This can be relaxed by providing the database from _generator to _consumer directly, without using the storage_proxy in between. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200207112427.18419-1-xemul@scylladb.com>	2020-02-07 13:30:01 +02:00
Eliran Sinvani	8cfc2aad57	internalize storage proxy statistics metric registration The storage proxy statistics structure did not contain a method for registering the statistics for metric groups, instead, each user had to register some of the metrics by itself. There is no real reason for separating the metrics registration from the statistics data. There is even less justification for doing this only for part of the stats as is the case for those statistics. This commit internalizes the metrics registration in the storage_proxy stats structures. Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2020-01-30 15:01:40 +01:00
Avi Kivity	17eaf552f0	Merge "Improve the accuracy of reader memory tracking" from Botond " Grab the lowest hanging fruits. This patch-set makes three important changes: * Consume the memory for I/O operations on tracked files, before they are forwarded to the underlying file. * Track memory consumed by buffers created for parsing in `continuous_data_consumer`. As this is the basis for the data, index and promoted index parsers, all three are covered now in this regard. * Track the index file. The remaining, not-so-low hanging fruits in order of gain/cost(performance) ratio: * Track in-memory index lists. * Track in-memory promoted index blocks. * Track reader buffer memory. Note that this ordering might change based on the workload and other environmental factors. Also included in this series is an infrastructure refactoring to make tracking memory easier and involve including lighter headers, as well as a manual test designed to allow testing and experimenting with the effects of changes to the accuracy of the tracking of reader memory consumption. Refs: #4176 Refs: #2778 Tests: unit(dev), manual(sstable_scan_footprint_test) The latter was run as: build/dev/test/manual/sstable_scan_footprint_test -c1 -m2G --reads=4000 --read-concurrency=1 --logger-log-level test=trace --collect-stats --stats-period-ms=20 This will trickle reads until the semaphore blocks, then wait until the wait queue drains before sending new reads. This way we are not testing the effectiveness of the pre-admission estimation (which is terribly optimistic) and instead check that with slowly ramping up read load the semaphore will block on memory preventing OOM. This now runs to completion without a single `std::bad_alloc`. The read concurrency semaphore allows between 15-30 reads, and is always blocked on memory. 
" * 'more-accurate-reader-resource-tracking/v1' of ssh://github.com/denesb/scylla: test/manual/sstable_scan_footprint_test: improve memory consumption diagnostics tests/manual/sstable_scan_footprint_test: use the semaphore to determine read rate tests/manual: Add test measuring memory demand of concurrent sstable reads index_reader: make the index file tracked sstables/continuous_data_consumer: track buffers used for parsing reader_concurrency_semaphore: tracking_file_impl: consume memory speculatively reader_concurrency_semaphore: bye reader_resource_tracker treewide: replace reader_resource_tracer with reader_permit reader_permit: expose make_tracked_temporary_buffer() reader_permit: introduce make_tracked_file() reader_permit: introduce memory_units reader_concurrency_semaphore: mv reader_resources and reader_permit to reader_permit.hh reader_concurrency_semaphore: reader_permit: make it a value type reader_concurrency_semaphore: s/resources/reader_resources/ reader_concurrency_semaphore::reader_permit: move methods out-of-line	2020-01-29 00:11:17 +02:00
Botond Dénes	dfc8b2fc45	treewide: replace reader_resource_tracer with reader_permit The former was never really more than a reader_permit with one additional method. Currently using it doesn't even save one from any includes. Now that readers will be using reader_permit we would have to pass down both to mutation_source. Instead get rid of reader_resource_tracker and just use reader_permit. Instead of making it a last and optional parameter that is easy to ignore, make it a first class parameter, right after schema, to signify that permits are now a prominent part of the reader API. This -- mostly mechanical -- patch essentially refactors mutation_source to ask for the reader_permit instead of reader_resource_tracking and updates all usage sites.	2020-01-28 08:13:16 +02:00
Dejan Mircevski	90b54c8c42	view_info: Drop partition_ranges() The method view_info::partition_ranges() is unused. Also drop the now-dead _partition_ranges data member. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-01-26 12:02:32 +02:00


197 Commits