Commit Graph

207 Commits

Author SHA1 Message Date
Piotr Jastrzebski
01ea159fde codebase wide: use try_emplace when appropriate
C++17 introduced try_emplace for maps to replace a pattern:
if(element not in a map) {
    map.emplace(...)
}

try_emplace is more efficient and results in more concise code.

This commit introduces usage of try_emplace when it's appropriate.
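
A minimal sketch of the before/after pattern, not taken from the patch
itself. The helper names (`insert_if_absent_old`, `insert_if_absent_new`,
`example`) and the map type are assumptions chosen for illustration:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Old pattern: look the key up first, then emplace only if it is absent.
// This performs two lookups when the key is missing.
inline bool insert_if_absent_old(std::map<int, std::string>& m, int key, std::string value) {
    if (m.find(key) == m.end()) {
        m.emplace(key, std::move(value));
        return true;
    }
    return false;
}

// C++17: try_emplace does a single lookup and does not move from its
// arguments when the key already exists.
inline bool insert_if_absent_new(std::map<int, std::string>& m, int key, std::string value) {
    return m.try_emplace(key, std::move(value)).second;
}

// Usage: both helpers leave an existing entry untouched.
inline void example() {
    std::map<int, std::string> m;
    assert(insert_if_absent_old(m, 1, "one"));
    assert(!insert_if_absent_new(m, 1, "uno"));
    assert(m.at(1) == "one");
}
```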

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <4970091ed770e233884633bf6d46111369e7d2dd.1597327358.git.piotr@scylladb.com>
2020-08-16 14:41:09 +03:00
Piotr Jastrzebski
c001374636 codebase wide: replace count with contains
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously, the
`count` function was often used for this in various ways.

`contains` not only expresses the intent of the code better but also
does so in a more uniform way.

This commit replaces all the occurrences of `count` with
`contains`.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>
2020-08-15 20:26:02 +03:00
Nadav Har'El
8135647906 merge: Add metrics to semaphores
Merged pull request https://github.com/scylladb/scylla/pull/7018
by Piotr Sarna:

This series addresses various issues with metrics and semaphores - it mainly adds missing metrics, which makes it possible to see the length of the queues attached to the semaphores. In the case of view building and view update generation, metrics were not present in these services at all, so a first, basic implementation is added.

More precise semaphore metrics would ease the testing and development of load shedding and admission control.

	view_builder: add metrics
	db, view: add view update generator metrics
	hints: track resource_manager sending queue length
	hints: add drain queue length to metrics
	table: add metrics for sstable deletion semaphore
	database: remove unused semaphore
2020-08-12 12:39:59 +03:00
Piotr Sarna
5086a5ca32 view_builder: add metrics
The view builder service lacked metrics, so a basic set of them
is added.
2020-08-11 17:43:53 +02:00
Piotr Sarna
e4d78b60ff db, view: add view update generator metrics
The view update generator completely lacked metrics, so a basic set
of them is now exposed.
2020-08-11 17:43:53 +02:00
Piotr Jastrzebski
80e3923b3c codebase wide: replace find(...) != end() with contains
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
the code pattern looked like:

<collection>.find(<element>) != <collection>.end()

In C++20 the same can be expressed with:

<collection>.contains(<element>)

This is not only more concise but also expresses the intent of the code
more clearly.

This commit replaces all the occurrences of the old pattern with the new
approach.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>
2020-08-11 13:28:50 +03:00
Dejan Mircevski
df20854963 cql3: Move expressions to their own namespace
Move the classes representing CQL expressions (and utility functions
on them) from the `restrictions` namespace to a new namespace `expr`.

Most of the restriction.hh content was moved verbatim to
expression.hh.  Similarly, all expression-related code was moved from
statement_restrictions.cc verbatim to expression.cc.

As suggested in #5763 feedback
https://github.com/scylladb/scylla/pull/5763#discussion_r443210498

Tests: dev (unit)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-08-08 21:03:26 +03:00
Avi Kivity
257c17a87a Merge "Don't depend on seastar::make_(lw_)?shared idiosyncrasies" from Rafael
"
While working on another patch I was getting odd compiler errors
saying that a call to ::make_shared was ambiguous. The reason was that
seastar has both:

template <typename T, typename... A>
shared_ptr<T> make_shared(A&&... a);

template <typename T>
shared_ptr<T> make_shared(T&& a);

The second variant doesn't exist in std::make_shared.

This series drops the dependency in scylla, so that a future change
can make seastar::make_shared a bit more like std::make_shared.
"

* 'espindola/make_shared' of https://github.com/espindola/scylla:
  Everywhere: Explicitly instantiate make_lw_shared
  Everywhere: Add a make_shared_schema helper
  Everywhere: Explicitly instantiate make_shared
  cql3: Add a create_multi_column_relation helper
  main: Return a shared_ptr from defer_verbose_shutdown
2020-08-02 19:51:24 +03:00
Botond Dénes
9eab5bca27 query_*(): use the coordinator specified memory limit for unlimited queries
It is important that all replicas participating in a read use the same
memory limits to avoid artificial differences due to different amounts of
results. The coordinator now passes down its own memory limit for reads,
in the form of max_result_size (or max_size). For unpaged or reverse
queries this has to be used now instead of the locally set
max_memory_unlimited_query configuration item.

To avoid the replicas accidentally using the local limit contained in
the `query_class_config` returned from
`database::make_query_class_config()`, we refactor the latter into
`database::get_reader_concurrency_semaphore()`. Most of its callers were
only interested in the semaphore anyway and those that were
interested in the limit as well should get it from the coordinator
instead, so this refactoring is a win-win.
2020-07-28 18:00:29 +03:00
Rafael Ávila de Espíndola
e15c8ee667 Everywhere: Explicitly instantiate make_lw_shared
seastar::make_lw_shared has an overload taking a T&&. There is no
such overload of std::make_shared:

https://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared

This means that we have to move from

    make_lw_shared(T(...))

to

    make_lw_shared<T>(...)

if we don't want to depend on the idiosyncrasies of
seastar::make_lw_shared.
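
The difference between the two call forms can be sketched with the
standard library alone. `toy::make_shared` below is a hypothetical
stand-in for an overload that deduces T from its argument; it is not
seastar's implementation, and `endpoint` is an invented example type:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

namespace toy {
// Hypothetical extra overload: T is deduced from the argument, so callers
// may write toy::make_shared(T(...)) without naming T explicitly.
template <typename T>
std::shared_ptr<std::decay_t<T>> make_shared(T&& value) {
    return std::make_shared<std::decay_t<T>>(std::forward<T>(value));
}
} // namespace toy

struct endpoint {
    std::string host;
    int port;
    endpoint(std::string h, int p) : host(std::move(h)), port(p) {}
};

// Relies on the extra overload: builds a temporary, then moves it.
inline std::shared_ptr<endpoint> make_old() {
    return toy::make_shared(endpoint("localhost", 9042));
}

// Explicit instantiation: works with std::make_shared and constructs in place.
inline std::shared_ptr<endpoint> make_new() {
    return std::make_shared<endpoint>("localhost", 9042);
}

// Usage: both forms produce an equivalent object.
inline void example() {
    assert(make_old()->host == make_new()->host);
    assert(make_old()->port == make_new()->port);
}
```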

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-07-21 10:33:49 -07:00
Botond Dénes
566e31a5ac db/view: view_updating_consumer: allow passing custom update pusher
So that tests can test the `view_update_consumer` in isolation, without
having to set up the whole database machinery. In addition to less
infrastructure setup, this allows more direct checking of mutations
pushed for view generation.
2020-07-20 11:23:39 +03:00
Botond Dénes
0166f97096 db/view: view_update_generator: make staging reader evictable
The view update generation process creates two readers. One is used to
read the staging sstables, the data which needs view updates to be
generated for, and another reader for each processed mutation, which
reads the current value (pre-image) of each row in said mutation. The
staging reader is created first and is kept alive until all staging data
is processed. The pre-image reader is created separately for each
processed mutation. The staging reader is not restricted, meaning it
does not wait for admission on the relevant reader concurrency
semaphore, but it does register its resource usage on it. The pre-image
reader, however, *is* restricted. This creates a situation where the
staging reader possibly consumes all resources from the semaphore,
leaving none for the later created pre-image reader, which will not be
able to start reading. This blocks the view building process, meaning
that the staging reader will not be destroyed, causing a deadlock.

This patch solves this by making the staging reader restricted and
making it evictable. To prevent thrashing -- evicting the staging reader
after reading only a really small partition -- we only make the staging
reader evictable after we have read at least 1MB worth of data from it.
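
The anti-thrashing rule can be pictured with a small byte counter. This
is only a sketch under assumptions: `staging_read_tracker` and its
methods are hypothetical names, 1MB is taken to mean 1 MiB, and resetting
the counter on eviction is a guess at one reasonable policy:

```cpp
#include <cassert>
#include <cstddef>

// Assumed threshold: the message says "1MB"; 1 MiB is used here.
constexpr std::size_t evictable_after_bytes = 1024 * 1024;

// Hypothetical tracker: the reader may be evicted only after it has made
// enough progress since it was created or last resumed.
struct staging_read_tracker {
    std::size_t bytes_read = 0;

    void on_read(std::size_t n) { bytes_read += n; }
    bool evictable() const { return bytes_read >= evictable_after_bytes; }
    void on_resumed() { bytes_read = 0; } // assumption: progress restarts after eviction
};

// Usage: a small partition alone does not make the reader evictable.
inline void example() {
    staging_read_tracker t;
    t.on_read(512 * 1024);
    assert(!t.evictable());
    t.on_read(512 * 1024);
    assert(t.evictable());
    t.on_resumed();
    assert(!t.evictable());
}
```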
2020-07-20 11:23:39 +03:00
Botond Dénes
84357f0722 db/view: view_updating_consumer: move implementation from table.cc to view.cc
table.cc is a very counter-intuitive place for view related stuff,
especially if the declarations reside in `db/view/`.
2020-07-20 11:23:39 +03:00
Pavel Emelyanov
8618a02815 migration_manager: Remove db/schema_tables.hh inclusion into header
The schema_tables.hh -> migration_manager.hh couple seems to work as one
of the "single header for everything" kind, creating a lot of bloat for
many seemingly unrelated .hh files.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-17 17:54:43 +03:00
Amnon Heiman
ea8d52b11c row_locking: change estimated histogram with time_estimated_histogram
This patch changes the row locking latencies to use
time_estimated_histogram.

The change consists of changing the histogram definition and changing how
values are inserted into the histogram.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2020-07-14 11:17:43 +03:00
Avi Kivity
b0698dfb38 Merge 'Rewrite CQL3 restriction representation' from dekimir
"
This is the first stage of replacing the existing restrictions code with a new representation. It adds a new class `expression` to replace the existing class `restriction`. Lots of the old code is deleted, though not all -- that will come in subsequent stages.

Tests: unit (dev, debug restrictions_test), dtest (next-gating)
"

* dekimir-restrictions-rewrite:
  cql3/restrictions: Drop dead code
  cql3/restrictions: Use free functions instead of methods
  cql3/restrictions: Create expression objects
  cql3/restrictions: Add free functions over new classes
  cql3/restrictions: Add new representation
2020-07-08 10:22:17 +03:00
Dejan Mircevski
37ebe521e3 cql3/restrictions: Use free functions instead of methods
Instead of `restriction` class methods, use the new free functions.
Specific replacement actions are listed below.

Note that class `restrictions` (plural) remains intact -- both its
methods and its type hierarchy remain intact for now.

Ensure full test coverage of the replacement code with new file
test/boost/restrictions_test.cc and some extra testcases in
test/cql/*.

Drop some existing tests because they codify buggy behaviour
(reference #6369, #6382).  Drop others because they forbid relation
combinations that are now allowed (eg, mixing equality and
inequality, comparing to NULL, etc.).

Here are some specific categories of what was replaced:

- restriction::is_foo predicates are replaced by using the free
  function find_if; sometimes it is used transitively (see, eg,
  has_slice)

- restriction::is_multi_column is replaced by dynamic casts (recall
  that the `restrictions` class hierarchy still exists)

- utility methods is_satisfied_by, is_supported_by, to_string, and
  uses_function are replaced by eponymous free functions; note that
  restrictions::uses_function still exists

- restriction::apply_to is replaced by free function
  replace_column_def

- when checking infinite_bound_range_deletions, the has_bound is
  replaced by local free function bounded_ck

- restriction::bounds and restriction::value are replaced by the more
  general free function possible_lhs_values

- using free functions allows us to simplify the
  multi_column_restriction and token_restriction hierarchies; their
  methods merge_with and uses_function became identical in all
  subclasses, so they were moved to the base class

- single_column_primary_key_restrictions<clustering_key>::needs_filtering
  was changed to reuse num_prefix_columns_that_need_not_be_filtered,
  which uses free functions

Fixes #5799.
Fixes #6369.
Fixes #6371.
Fixes #6372.
Fixes #6382.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-07-07 23:08:09 +02:00
Botond Dénes
5ebe2c28d1 db/view: view_update_generator: re-balance wait/signal on the register semaphore
The view update generator has a semaphore to limit concurrency. This
semaphore is waited on in `register_staging_sstable()` and later the
unit is returned after the sstable is processed in the loop inside
`start()`.
This was broken by 4e64002, which changed the loop inside `start()` to
process sstables in per-table batches but didn't change the
`signal()` call to return as many units as the number of sstables
processed. This can cause the semaphore units to dry up, as the
loop can process multiple sstables per table but return just a single
unit. It can also block callers of `register_staging_sstable()`
indefinitely: under the right circumstances the units on the semaphore
can permanently go below 0, so some waiters are never released.
In addition to this, 4e64002 introduced another bug: table entries
are never removed from `_sstables_with_tables`, so they are processed
every turn. If the sstable list is empty, no update is
generated, but due to the unconditional `signal()` described above the
units on the semaphore can grow without bound, allowing
future staging sstable producers to register a huge number of sstables,
causing memory problems due to the number of sstable readers that have
to be opened (#6603, #6707).
Both outcomes are equally bad. This patch fixes both issues and modifies
the `test_view_update_generator` unit test to reproduce them and hence
to verify that this doesn't happen in the future.
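
The unit accounting described above can be modelled with a plain
counter. This is a toy sketch, not the real code: `toy_semaphore` and
`units_after_batch` are hypothetical, and a real semaphore would block
in `wait()` rather than let the count go negative:

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a semaphore's unit count (not seastar::semaphore).
struct toy_semaphore {
    long units;
    void wait(long n = 1) { assert(n >= 0); units -= n; }
    void signal(long n = 1) { assert(n >= 0); units += n; }
};

// Registers `batch_size` sstables (one unit each), then processes them as
// a single batch. With `balanced == false` the batch returns exactly one
// unit regardless of its size, mimicking the imbalance described above.
inline long units_after_batch(long initial_units, std::size_t batch_size, bool balanced) {
    toy_semaphore sem{initial_units};
    for (std::size_t i = 0; i < batch_size; ++i) {
        sem.wait();
    }
    sem.signal(balanced ? static_cast<long>(batch_size) : 1);
    return sem.units;
}

// Usage: a batch of 3 leaks two units; an empty batch gains one.
inline void example() {
    assert(units_after_batch(5, 3, false) == 3);
    assert(units_after_batch(5, 0, false) == 6);
    assert(units_after_batch(5, 3, true) == 5);
    assert(units_after_batch(5, 0, true) == 5);
}
```

Returning as many units as were consumed keeps the count at its initial
value in both the non-empty and the empty case.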

Fixes: #6774
Refs: #6707
Refs: #6603

Tests: unit(dev)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200706135108.116134-1-bdenes@scylladb.com>
2020-07-07 08:53:00 +02:00
Wojciech Mitros
76038b8d8e view: differentiate identical error messages and change them to warnings
Modified the log message in view_builder::calculate_shard_build_step to make it distinct from the one in view_builder::execute, and changed the logging level of both to warning, since execution continues after the exception is handled.

Fixes #4600
2020-07-06 20:50:34 +03:00
Botond Dénes
62c6859b69 db/view: view_update_generator: use partitioned sstable set
And pass it to `make_range_sstable_reader()` when creating the reader,
thus allowing the incremental selector created therein to exploit the
fact that staging sstables are disjoint (in the case of repair and
streaming at least). This should reduce the memory consumption of the
staging reader considerably when reading from a lot of sstables.
2020-07-06 13:38:23 +03:00
Rafael Ávila de Espíndola
64c8164e6c everywhere: Update to seastar api v4 (when_all_succeed returning a tuple)
We now just need to replace a few calls to then with then_unpack.
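
The need for unpacking can be shown with a standard-library analogy.
This sketch does not use seastar; it assumes only what the title states,
namely that the result arrives as a tuple, and uses `std::apply` to play
the role of the unpacking step. All function names are hypothetical:

```cpp
#include <cassert>
#include <string>
#include <tuple>

// A callback written against separate arguments.
inline std::string describe(int count, const std::string& name) {
    return name + ":" + std::to_string(count);
}

// Receiving the tuple as one value: the callback must pick it apart itself.
inline std::string describe_tuple(const std::tuple<int, std::string>& t) {
    return describe(std::get<0>(t), std::get<1>(t));
}

// Unpacking the tuple into separate arguments before calling the callback.
inline std::string describe_unpacked(const std::tuple<int, std::string>& t) {
    return std::apply(describe, t);
}

// Usage: both routes reach the same callback with the same arguments.
inline void example() {
    const std::tuple<int, std::string> result{2, "views"};
    assert(describe_tuple(result) == "views:2");
    assert(describe_unpacked(result) == "views:2");
}
```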

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200618172100.111147-1-espindola@scylladb.com>
2020-06-23 19:40:18 +03:00
Avi Kivity
de38091827 priority_manager: merge streaming_read and streaming_write classes into one class
Streaming is handled by just one group for CPU scheduling, so
separating it into read and write classes for I/O is artificial, and
inflates the resources we allow for streaming if both reads and writes
happen at the same time.

Merge both classes into one class ("streaming") and adjust callers. The
merged class has 200 shares, so it reduces streaming bandwidth if both
directions are active at the same time (which is rare; I think it only
happens in view building).
2020-06-22 15:09:04 +03:00
Rafael Ávila de Espíndola
f6e407ecd2 everywhere: Prepare for seastar api v4 (when_all_succeed return value)
The seastar api v4 changes the return type of when_all_succeed. This
patch adds discard_result where that is the best solution to handle the
change.

This doesn't do the actual update to v4 since there are still a few
issues left to fix in seastar. A patch doing just the update will
follow.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200617233150.918110-1-espindola@scylladb.com>
2020-06-18 15:13:56 +03:00
Piotr Sarna
3458bd2e32 db,view: fix outdated comments
Some comments still referred to variable names which are no longer
up-to-date.

Follow-up for #6560.
Message-Id: <2b857ccc900dd64f0d9379f5d6c87fd3aaa5d902.1591594042.git.sarna@scylladb.com>
2020-06-08 09:02:10 +03:00
Nadav Har'El
d6626c217a merge: add error injection to mv
Merged pull request https://github.com/scylladb/scylla/pull/6516 from
Piotr Sarna:

This series adds error injection points to materialized view paths:

	view update generation from staging sstables;
	view building;
	generating view updates from user writes.

This series comes with a corresponding dtest pull request which adds some
test cases based on error injection.

Fixes #6488
2020-06-07 19:23:23 +03:00
Piotr Sarna
b3a6a33487 db,view: ensure that local updates are applied locally
In the current mutate_MV() code it's possible for a local endpoint
to become a target for a network operation. That's the source
of occasional `broken promise` benign error messages appearing,
since the mutation is actually applied locally, so there's no point
in creating a write response handler - the node will not send a response
to itself via network.
While at it, the code is deduplicated a little bit - with the paths
simplified, it's easier to ensure that a local endpoint is never
listed as a target for remote network operations.

Fixes #5459
Tests: unit(dev),
       dtest(materialized_views_test.TestMaterializedViews.add_dc_during_mv_insert_test)
2020-06-07 19:10:03 +03:00
Piotr Sarna
76e89efc1a db,view: add error injection points to view building
... in order to be able to test scenarios with failures.
2020-06-05 09:39:58 +02:00
Piotr Sarna
9d524a7a7e db,view: add error injection points to view update generator
... in order to be able to test scenarios with failures.
2020-06-05 09:39:58 +02:00
Avi Kivity
0c6bbc84cd Merge "Classify queries based on their initiator, rather than their target" from Botond
"
Currently we classify queries as "system" or "user" based on the table
they target. The class of a query determines how the query is treated,
currently: timeout, limits for reverse queries and the concurrency
semaphore. The catch is that users are also allowed to query system
tables and when doing so they will bypass the limits intended for user
queries. This has caused performance problems in the past, yet the
reason we decided to finally address this is that we want to introduce a
memory limit for unpaged queries. Internal (system) queries are all
unpaged and we don't want to impose the same limit on them.

This series uses scheduling groups to distinguish user and system
workloads, based on the assumption that user workloads will run in the
statement scheduling group, while system workloads will run in the main
(or default) scheduling group, or perhaps something else, but in any
case not in the statement one. Currently the scheduling group of reads
and writes is lost when going through the messaging service, so to be
able to use scheduling groups to distinguish user and system reads this
series refactors the messaging service to retain this distinction across
verb calls. Furthermore, we execute some system reads/writes as part of
user reads/writes, such as auth and schema sync. These processes are
tagged to run in the main group.
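
The classification rule from the paragraph above, reduced to a sketch.
The enum and function names are hypothetical; real scheduling groups are
seastar objects, not an enum:

```cpp
#include <cassert>

// Hypothetical labels for the scheduling group a request runs in.
enum class toy_sched_group { statement, main_group, streaming };

enum class query_class { user, system };

// Classify by initiator: only work running in the statement group counts
// as a user query; everything else is treated as a system query.
constexpr query_class classify(toy_sched_group g) {
    return g == toy_sched_group::statement ? query_class::user : query_class::system;
}

// Usage: the target table plays no part in the decision.
inline void example() {
    assert(classify(toy_sched_group::statement) == query_class::user);
    assert(classify(toy_sched_group::main_group) == query_class::system);
    assert(classify(toy_sched_group::streaming) == query_class::system);
}
```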
This series also centralises query classification on the replica and
moves it to a higher level. More specifically, queries are now
classified -- the scheduling group they run in is translated to the
appropriate query class specific configuration -- on the database level
and the configuration is propagated down to the lower layers.
Currently this query class specific configuration consists of the reader
concurrency semaphore and the max memory limit for otherwise unlimited
queries. A corollary of the semaphore being selected on the database
level is that the read permit is now created before the read starts. A
valid permit is now available during all stages of the read, enabling
tracking the memory consumption of e.g. the memtable and cache readers.
This change aligns nicely with the needs of more accurate reader memory
tracking, which also wants a valid permit that is available in every layer.

The series can be divided roughly into the following distinct patch
groups:
* 01-02: Give system read concurrency a boost during startup.
* 03-06: Introduce user/system statement isolation to messaging service.
* 07-13: Various infrastructure changes to prepare for using read
  permits in all stages of reads.
* 14-19: Propagate the semaphore and the permit from database to the
  various table methods that currently create the permit.
* 20-23: Migrate away from using the reader concurrency semaphore for
  waiting for admission, use the permit instead.
* 24: Introduce `database::make_query_config()` and switch the database
  methods needing such a config to use it.
* 25-31: Get rid of all uses of `no_reader_permit()`.
* 32-33: Ban empty permits for good.
* 34: querier_cache: use the queriers' permits to obtain the semaphore.

Fixes: #5919

Tests: unit(dev, release, debug),
dtest(bootstrap_test.py:TestBootstrap.start_stop_test_node), manual
testing with a 2 node mixed cluster with extra logging.
"
* 'query-class/v6' of https://github.com/denesb/scylla: (34 commits)
  querier_cache: get semaphore from querier
  reader_permit: forbid empty permits
  reader_permit: fix reader_resources::operator bool
  treewide: remove all uses of no_reader_permit()
  database: make_multishard_streaming_reader: pass valid permit to multi range reader
  sstables: pass valid permits to all internal reads
  compaction: pass a valid permit to sstable reads
  database: add compaction read concurrency semaphore
  view: use valid permits for reads from the base table
  database: use valid permit for counter read-before-write
  database: introduce make_query_class_config()
  reader_concurrency_semaphore: remove wait_admission and consume_resources()
  test: move away from reader_concurrency_semaphore::wait_admission()
  reader_permit: resource_units: introduce add()
  mutation_reader: restricted_reader: work in terms of reader_permit
  row_cache: pass a valid permit to underlying read
  memtable: pass a valid permit to the delegate reader
  table: require a valid permit to be passed to most read methods
  multishard_mutation_query: pass a valid permit to shard mutation sources
  querier: add reader_permit parameter and forward it to the mutation_source
  ...
2020-05-29 10:11:44 +03:00
Piotr Sarna
77e943e9a3 db,views: unify time points used for update generation
Until now, view updates were generated with a bunch of random
time points, because the interface was not adjusted for passing
a single time point. The time points were used to determine
whether cells were alive (e.g. because of TTL), so it's better
to unify the process:
1. when generating view updates from user writes, a single time point
   is used for the whole operation
2. when generating view updates via the view building process,
   a single time point is used for each build step
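
Why a single time point matters can be seen in a small liveness check.
This is a sketch with invented names (`toy_cell`, `is_live`,
`count_live`), assuming a cell is live while `now` is before its expiry:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using clock_type = std::chrono::system_clock; // assumption: any clock would do

// Hypothetical cell carrying only the expiry derived from its TTL.
struct toy_cell {
    clock_type::time_point expiry;
};

inline bool is_live(const toy_cell& c, clock_type::time_point now) {
    return now < c.expiry;
}

// All cells are judged against the same `now`, so the answer does not
// depend on how long the loop itself takes.
inline std::size_t count_live(const std::vector<toy_cell>& cells, clock_type::time_point now) {
    std::size_t live = 0;
    for (const auto& c : cells) {
        if (is_live(c, now)) {
            ++live;
        }
    }
    return live;
}

// Usage: the time point is taken once and passed down.
inline void example() {
    using std::chrono::seconds;
    const clock_type::time_point t0{};
    const std::vector<toy_cell> cells{{t0 + seconds(1)}, {t0 - seconds(1)}, {t0 + seconds(5)}};
    assert(count_live(cells, t0) == 2);
    assert(count_live(cells, t0 + seconds(2)) == 1);
}
```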

NOTE: I don't see any reliable and deterministic way of writing
      test scenarios which trigger problems with the old code.
      After #6488 is resolved and error injection is integrated
      into view.cc, tests can be added.

Fixes #6429
Tests: unit(dev)
Message-Id: <f864e965eb2e27ffc13d50359ad1e228894f7121.1590070130.git.sarna@scylladb.com>
2020-05-28 12:56:09 +03:00
Botond Dénes
992e697dd5 view: use valid permits for reads from the base table
View update generation involves reading existing values from the base
table, which will soon require a valid permit to be passed to it, so
make sure we create and pass a valid permit to these reads.
We use `database::make_query_class_config()` to obtain the semaphore for
the read which selects the appropriate user/system semaphore based on
the scheduling group the base table write is running in.
2020-05-28 11:34:35 +03:00
Botond Dénes
cc5137ffe3 table: require a valid permit to be passed to most read methods
Now that the most prevalent users (range scan and single partition
reads) all pass valid permits we require all users to do so and
propagate the permit down towards `make_sstable_reader()`. The plan is
to use this permit for restricting the sstable readers, instead of the
semaphore the table is configured with. The various
`make_streaming_*reader()` overloads keep using the internal semaphores
as before, but they also create the permit before the read starts and pass it to
`make_sstable_reader()`.
2020-05-28 11:34:35 +03:00
Piotr Sarna
18a37d0cb1 db,view: add tracing to view update generation path
In order to improve materialized views' debuggability,
tracing points are added to the view update generation path.

Sample info of an insert statement which resulted in producing
local view updates which require read-before-write:

 activity                                                                                                                           | timestamp                  | source    | source_elapsed | client
------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+-----------
                                                                                                                 Execute CQL3 query | 2020-04-19 12:02:48.420000 | 127.0.0.1 |              0 | 127.0.0.1
                                                                                                      Parsing a statement [shard 0] | 2020-04-19 12:02:48.420674 | 127.0.0.1 |             -- | 127.0.0.1
                                                                                                   Processing a statement [shard 0] | 2020-04-19 12:02:48.420753 | 127.0.0.1 |             79 | 127.0.0.1
                                  Creating write handler for token: -6715243485458697746 natural: {127.0.0.1} pending: {} [shard 0] | 2020-04-19 12:02:48.420815 | 127.0.0.1 |            141 | 127.0.0.1
                                                                   Creating write handler with live: {127.0.0.1} dead: {} [shard 0] | 2020-04-19 12:02:48.420824 | 127.0.0.1 |            149 | 127.0.0.1
                                                                                             Executing a mutation locally [shard 0] | 2020-04-19 12:02:48.420830 | 127.0.0.1 |            155 | 127.0.0.1
                                          View updates for ks.t1 require read-before-write - base table reader is created [shard 0] | 2020-04-19 12:02:48.420862 | 127.0.0.1 |            188 | 127.0.0.1
                                                                                        Generated 2 view update mutations [shard 0] | 2020-04-19 12:02:48.420910 | 127.0.0.1 |            235 | 127.0.0.1
 Locally applying view update for ks.t1_v_idx_index; base token = -6715243485458697746; view token = -4156302194539278891 [shard 0] | 2020-04-19 12:02:48.420918 | 127.0.0.1 |            243 | 127.0.0.1
                                              Successfully applied local view update for 127.0.0.1 and 0 remote endpoints [shard 0] | 2020-04-19 12:02:48.420971 | 127.0.0.1 |            297 | 127.0.0.1
                                                                     View updates for ks.t1 were generated and propagated [shard 0] | 2020-04-19 12:02:48.420973 | 127.0.0.1 |            299 | 127.0.0.1
                                                                                           Got a response from /127.0.0.1 [shard 0] | 2020-04-19 12:02:48.420988 | 127.0.0.1 |            314 | 127.0.0.1
                                                             Delay decision due to throttling: do not delay, resuming now [shard 0] | 2020-04-19 12:02:48.420990 | 127.0.0.1 |            315 | 127.0.0.1
                                                                                          Mutation successfully completed [shard 0] | 2020-04-19 12:02:48.420994 | 127.0.0.1 |            320 | 127.0.0.1
                                                                                     Done processing - preparing a result [shard 0] | 2020-04-19 12:02:48.421000 | 127.0.0.1 |            326 | 127.0.0.1
                                                                                                                   Request complete | 2020-04-19 12:02:48.420330 | 127.0.0.1 |            330 | 127.0.0.1

Sample info for remote updates:

 activity                                                                                                                                                           | timestamp                  | source    | source_elapsed | client
--------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+-----------
                                                                                                                                                 Execute CQL3 query | 2020-04-26 16:19:47.691000 | 127.0.0.1 |              0 | 127.0.0.1
                                                                                                                                      Parsing a statement [shard 1] | 2020-04-26 16:19:47.691590 | 127.0.0.1 |              6 | 127.0.0.1
                                                                                                                                   Processing a statement [shard 1] | 2020-04-26 16:19:47.692368 | 127.0.0.1 |            783 | 127.0.0.1
                                                       Creating write handler for token: -3248873570005575792 natural: {127.0.0.3, 127.0.0.2} pending: {} [shard 1] | 2020-04-26 16:19:47.694186 | 127.0.0.1 |           2598 | 127.0.0.1
                                                                                        Creating write handler with live: {127.0.0.2, 127.0.0.3} dead: {} [shard 1] | 2020-04-26 16:19:47.694283 | 127.0.0.1 |           2699 | 127.0.0.1
                                                                                                                         Sending a mutation to /127.0.0.2 [shard 1] | 2020-04-26 16:19:47.694591 | 127.0.0.1 |           3006 | 127.0.0.1
                                                                                                                         Sending a mutation to /127.0.0.3 [shard 1] | 2020-04-26 16:19:47.694862 | 127.0.0.1 |           3277 | 127.0.0.1
                                                                                                                         Message received from /127.0.0.1 [shard 1] | 2020-04-26 16:19:47.696358 | 127.0.0.3 |             40 | 127.0.0.1
                                                                                                                         Message received from /127.0.0.1 [shard 1] | 2020-04-26 16:19:47.696442 | 127.0.0.2 |             32 | 127.0.0.1
                                                                           View updates for ks.t require read-before-write - base table reader is created [shard 1] | 2020-04-26 16:19:47.697762 | 127.0.0.3 |           1444 | 127.0.0.1
                                                                           View updates for ks.t require read-before-write - base table reader is created [shard 1] | 2020-04-26 16:19:47.698120 | 127.0.0.2 |           1710 | 127.0.0.1
                                                                                                                        Generated 1 view update mutations [shard 1] | 2020-04-26 16:19:47.699107 | 127.0.0.3 |           2789 | 127.0.0.1
 Sending view update for ks.t_v2_idx_index to 127.0.0.4, with pending endpoints = {}; base token = -3248873570005575792; view token = 1634052884888577606 [shard 1] | 2020-04-26 16:19:47.699345 | 127.0.0.3 |           3027 | 127.0.0.1
                                                                                                                         Sending a mutation to /127.0.0.4 [shard 1] | 2020-04-26 16:19:47.699614 | 127.0.0.3 |           3296 | 127.0.0.1
                                                                                                                        Generated 1 view update mutations [shard 1] | 2020-04-26 16:19:47.699824 | 127.0.0.2 |           3414 | 127.0.0.1
                                  Locally applying view update for ks.t_v2_idx_index; base token = -3248873570005575792; view token = 1634052884888577606 [shard 1] | 2020-04-26 16:19:47.700012 | 127.0.0.2 |           3603 | 127.0.0.1
                                                                                                      View updates for ks.t were generated and propagated [shard 1] | 2020-04-26 16:19:47.700059 | 127.0.0.3 |           3741 | 127.0.0.1
                                                                                                                         Message received from /127.0.0.3 [shard 1] | 2020-04-26 16:19:47.700958 | 127.0.0.4 |             37 | 127.0.0.1
                                                                              Successfully applied local view update for 127.0.0.2 and 0 remote endpoints [shard 1] | 2020-04-26 16:19:47.701522 | 127.0.0.2 |           5112 | 127.0.0.1
                                                                                                      View updates for ks.t were generated and propagated [shard 1] | 2020-04-26 16:19:47.701615 | 127.0.0.2 |           5206 | 127.0.0.1
                                                                                                                      Sending mutation_done to /127.0.0.1 [shard 1] | 2020-04-26 16:19:47.701913 | 127.0.0.3 |           5595 | 127.0.0.1
                                                                                                                                Mutation handling is done [shard 1] | 2020-04-26 16:19:47.702489 | 127.0.0.3 |           6171 | 127.0.0.1
                                                                                                                           Got a response from /127.0.0.3 [shard 1] | 2020-04-26 16:19:47.702667 | 127.0.0.1 |          11082 | 127.0.0.1
                                                                                             Delay decision due to throttling: do not delay, resuming now [shard 1] | 2020-04-26 16:19:47.702689 | 127.0.0.1 |          11105 | 127.0.0.1
                                                                                                                          Mutation successfully completed [shard 1] | 2020-04-26 16:19:47.702784 | 127.0.0.1 |          11200 | 127.0.0.1
                                                                                                                      Sending mutation_done to /127.0.0.1 [shard 1] | 2020-04-26 16:19:47.703016 | 127.0.0.2 |           6606 | 127.0.0.1
                                                                                                                     Done processing - preparing a result [shard 1] | 2020-04-26 16:19:47.703054 | 127.0.0.1 |          11470 | 127.0.0.1
                                                                                                                      Sending mutation_done to /127.0.0.3 [shard 1] | 2020-04-26 16:19:47.703720 | 127.0.0.4 |           2800 | 127.0.0.1
                                                                                                                                Mutation handling is done [shard 1] | 2020-04-26 16:19:47.704527 | 127.0.0.4 |           3607 | 127.0.0.1
                                                                                                                           Got a response from /127.0.0.4 [shard 1] | 2020-04-26 16:19:47.704580 | 127.0.0.3 |           8262 | 127.0.0.1
                                                                                             Delay decision due to throttling: do not delay, resuming now [shard 1] | 2020-04-26 16:19:47.704606 | 127.0.0.3 |           8288 | 127.0.0.1
                                                                                    Successfully applied view update for 127.0.0.4 and 1 remote endpoints [shard 1] | 2020-04-26 16:19:47.704853 | 127.0.0.3 |           8535 | 127.0.0.1
                                                                                                                                Mutation handling is done [shard 1] | 2020-04-26 16:19:47.706092 | 127.0.0.2 |           9682 | 127.0.0.1
                                                                                                                           Got a response from /127.0.0.2 [shard 1] | 2020-04-26 16:19:47.709933 | 127.0.0.1 |          18348 | 127.0.0.1
                                                                                                                                                   Request complete | 2020-04-26 16:19:47.702582 | 127.0.0.1 |          11582 | 127.0.0.1

Tests: unit(dev, debug)
2020-05-18 16:05:23 +02:00
Piotr Sarna
92aadb94e5 treewide: propagate trace state to write path
In order to add tracing to places where it can be useful,
e.g. materialized view updates and hinted handoff, tracing state
is propagated to all applicable call sites.
2020-05-18 16:05:23 +02:00
Piotr Sarna
f48e414eab db, view: remove duplicate entries from pending endpoints
When generating view updates, an endpoint can appear both
as a primary paired endpoint for the view update, and as a pending
endpoint (due to range movements). In order not to generate
the same update twice for the same endpoint, the paired endpoint
is removed from the list of pending endpoints if present.

Fixes #5459
Tests: unit(dev),
       dtest(TestMaterializedViews.add_dc_during_mv_insert_test)
2020-05-06 16:42:56 +03:00
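Illustration for f48e414eab: a minimal sketch of the de-duplication step that commit describes, assuming endpoints can be modelled as plain strings. The `endpoint` alias and both function names are hypothetical and are not taken from the Scylla sources.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an endpoint address; not Scylla's real type.
using endpoint = std::string;

// Removes the paired endpoint from the pending list, if it is present,
// so the same view update is not generated twice for one node.
void remove_paired_from_pending(const std::optional<endpoint>& paired,
                                std::vector<endpoint>& pending) {
    if (!paired) {
        return;  // no paired endpoint, nothing to de-duplicate
    }
    pending.erase(std::remove(pending.begin(), pending.end(), *paired),
                  pending.end());
}

// Small usage example: 127.0.0.4 is both paired and pending.
inline std::vector<endpoint> dedup_example() {
    std::vector<endpoint> pending{"127.0.0.2", "127.0.0.4"};
    remove_paired_from_pending(endpoint("127.0.0.4"), pending);
    assert(std::find(pending.begin(), pending.end(), "127.0.0.4") == pending.end());
    return pending;
}
```

After the call, only the endpoints that differ from the paired one remain in the pending list.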
Glauber Costa
1f9c37fb5e view_updating_consumer: move reference to a pointer
It is currently not possible to wrap the view_updating_consumer in an
std::optional. I intend to do it to allow for compactions to optionally
generate view updates.

The reason for that is that view_updating_consumer has a reference as a
member, which prevents the move assignment operator from being implicitly
generated.

This patch fixes it by keeping a pointer instead of a reference.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20200421123648.8328-1-glauber@scylladb.com>
2020-04-22 10:05:35 +03:00
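Illustration for 1f9c37fb5e: a small sketch of why a reference member gets in the way, using two hypothetical stand-in types rather than the real view_updating_consumer. A class with a reference member has no usable implicit assignment operators, so a std::optional holding it cannot be assigned to; holding a pointer restores assignment.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>

struct table {};  // hypothetical stand-in for the real table class

// Reference member: implicit copy/move assignment is defined as deleted.
struct consumer_with_ref {
    table& t;
};

// Pointer member: implicit copy/move assignment works as usual.
struct consumer_with_ptr {
    table* t;
};

static_assert(!std::is_move_assignable_v<consumer_with_ref>);
static_assert(std::is_move_assignable_v<consumer_with_ptr>);

// std::optional only offers assignment when the wrapped type is assignable.
static_assert(!std::is_move_assignable_v<std::optional<consumer_with_ref>>);
static_assert(std::is_move_assignable_v<std::optional<consumer_with_ptr>>);

// With the pointer-based type, an optional can be filled in later on.
inline std::optional<consumer_with_ptr> make_optional_consumer(table& t) {
    std::optional<consumer_with_ptr> c;  // starts empty
    c = consumer_with_ptr{&t};           // assignment compiles
    assert(c.has_value());
    return c;
}
```

The same assignment with `consumer_with_ref` would fail to compile, which is one way the limitation mentioned in the commit message can show up.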
Glauber Costa
4e6400293e staging: potentially read many SSTables at the same time
There is no reason to read a single SSTable at a time from the staging
directory. Moving SSTables from staging directory essentially involves
scanning input SSTables and creating new SSTables (albeit in a different
directory).

We have a mechanism that does that: compactions. In a follow up patch, I
will introduce a new specialization of compaction that moves SSTables
from staging (potentially compacting them if there are plenty).

In preparation for that, some signatures have to be changed and the
view_updating_consumer has to be more compaction friendly. Meaning:
- Operating with an sstable vector
- taking a table reference, not a database

Because this code is a bit fragile and the reviewer set is fundamentally
different from anything compaction related, I am sending this separately.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2020-04-15 11:26:44 -04:00
Piotr Sarna
1a9083b342 db,view: guard view builder startup with a semaphore
The startup routine performs some bookkeeping operations on views,
and so do these events:
 - on_create_view;
 - on_drop_view;
 - on_update_view.
Since the above events are guarded with a semaphore, the startup
routine should also take the same semaphore - in order to ensure
that all bookkeeping operations are serialized.

Refs #6094
2020-04-05 11:41:26 +02:00
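Illustration for 1a9083b342: a rough sketch of the idea that startup and the view events all take the same lock, so that bookkeeping is serialized. This is not the Seastar-based implementation: std::mutex is swapped in for the semaphore, and the class and its members are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the view builder's bookkeeping.
class view_bookkeeper {
    std::mutex _sem;                  // plays the role of the shared semaphore
    std::vector<std::string> _views;  // names of the views being tracked
public:
    // Startup takes the same lock as the event handlers below.
    void start(const std::vector<std::string>& existing) {
        std::lock_guard<std::mutex> guard(_sem);
        _views = existing;
    }
    void on_create_view(const std::string& name) {
        std::lock_guard<std::mutex> guard(_sem);
        _views.push_back(name);
    }
    void on_drop_view(const std::string& name) {
        std::lock_guard<std::mutex> guard(_sem);
        _views.erase(std::remove(_views.begin(), _views.end(), name), _views.end());
    }
    std::size_t view_count() {
        std::lock_guard<std::mutex> guard(_sem);
        return _views.size();
    }
};

// Small usage example: start with one view, add one, drop one.
inline std::size_t bookkeeping_example() {
    view_bookkeeper b;
    b.start({"v1"});
    b.on_create_view("v2");
    b.on_drop_view("v1");
    assert(b.view_count() == 1);
    return b.view_count();
}
```

Because every entry point acquires `_sem`, a startup pass cannot interleave with a concurrent create or drop event in this model.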
Piotr Sarna
8da4a5b78c db,view: nitpick: change & operator to && for booleans
Although it's technically correct to use the bitwise and operator
on booleans as well, it's slightly confusing for the reader.
2020-04-05 11:41:25 +02:00
Piotr Sarna
e49805b7b8 db,view: remove unneeded implicit capture-by-reference
The lambda does not use any other captures, so it does not need to
implicitly capture anything by reference.
2020-04-05 11:41:25 +02:00
Piotr Sarna
3f19865493 db,view: fix waiting for a view building future
The future was marked with a `FIXME: discarded future`, but there's really
no reason not to wait for it, and it was probably meant to be waited for
since its implementation.
2020-04-05 11:41:25 +02:00
Botond Dénes
240b5e0594 frozen_schema: key() remove unused schema parameter
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200402092249.680210-1-bdenes@scylladb.com>
2020-04-02 14:43:35 +02:00
Rafael Ávila de Espíndola
c5795e8199 everywhere: Replace engine().cpu_id() with this_shard_id()
This is a bit simpler and might allow removing a few includes of
reactor.hh.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200326194656.74041-1-espindola@scylladb.com>
2020-03-27 11:40:03 +03:00
Pavel Solodovnikov
adc6a98b59 cql3: return raw::parsed_statement as unique_ptr
Change CQL parsing routine to return std::unique_ptr
instead of seastar::shared_ptr.

This can help reduce redundant shared_ptr copies even further.

Make some supplementary changes necessary for this transition:
 * Remove enable_shared_from_this base class from the following
   classes: truncate_statement, authorization_statement,
   authentication_statement: these were previously constructing
   prepared_statement instance in `prepare` method using
   `shared_from_this`.
   Make `prepare` methods implementation of inheriting classes
   mirror implementation from other statements (i.e.
   create a shallow copy of the object when preparing into
   `prepared_statement`; this could be further refactored
   to avoid copies as much as possible).
 * Remove unused fields in create_role_statement which led to
   error while using compiler-generated copy ctor (copying
   uninitialized bool values via ctor).

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-03-23 23:19:21 +03:00
Botond Dénes
e0284bb9ee treewide: add missing headers and/or forward declarations
2020-03-23 09:29:45 +02:00
Nadav Har'El
7922b9eb8f materialized views: reduce recompilation when db/view/view.hh changes.
Before this patch, when db/view/view.hh was modified, 89 source files had to
be recompiled. After this patch, this number is down to 5.

Most of the irrelevant source files got view.hh by including database.hh,
which included view.hh just for the definition of statistics. So in this
patch we split the view statistics to a separate header file, view_stats.hh,
and database.hh only includes that. A few source files which included
only database.hh and also needed view.hh (for materialized-view related
functions) now need to include view.hh explicitly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200319121031.540-1-nyh@scylladb.com>
2020-03-19 15:46:14 +02:00
Piotr Sarna
0c11e07faf view,table: fix waiting for view updates during building
View updates sent as part of the view building process should never
be ignored, but fd49fd7 introduced a bug which may cause exactly that:
the updates are mistakenly sent to background, so the view builder
will not receive negative feedback if an update failed, which will
in turn not cause a retry. Consequently, view building may report
that it "finished" building a view, while some of the updates were
lost. A simple fix is to restore previous behaviour - all updates
triggered by view building are now waited for.

Fixes #6038
Tests: unit(dev),
dtest: interrupt_build_process_with_resharding_low_to_half_test
2020-03-19 10:50:54 +02:00
Nadav Har'El
635e6d887c materialized views: fix corner case of view updates used by Alternator
While CQL does not allow creation of a materialized view with more than one
base regular column in the view's key, in Alternator we do allow this - both
partition and clustering key may be a base regular column. We had a bug in
the logic handling this case:

If the new base row is missing a value for *one* of the view key columns,
we shouldn't create a view row. Similarly, if the existing base row was
missing a value for *one* of the view key columns, a view row does not
exist and doesn't need to be deleted.  This was done incorrectly, and made
decisions based on just one of the key columns, and the logic is now
fixed (and I think, simplified) in this patch.

With this patch, the Alternator test which previously failed because of
this problem now passes. The patch also includes new tests in the existing
C++ unit test test_view_with_two_regular_base_columns_in_key. This test
was already supposed to be testing various cases of two-new-key-columns
updates, but missed the cases explained above. These new tests failed
badly before this patch - some of them had clean write errors, others
caused crashes. With this patch, they pass.

Fixes #6008.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200312162503.8944-1-nyh@scylladb.com>
2020-03-15 07:57:33 +01:00
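Illustration for 635e6d887c: a simplified sketch of the rule stated in that commit message, namely that a view row exists only when the base row has a value for every view key column. The `base_row` model and the function names are hypothetical; the real code works on mutations and schemas, and also has to consider whether the key values changed.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified base row: maps a column name to its value.
// A column with no value is simply absent from the map.
using base_row = std::map<std::string, std::string>;

// True only if the row has a value for every view key column,
// not just for one of them.
bool has_all_view_key_columns(const base_row& row,
                              const std::vector<std::string>& view_key_columns) {
    return std::all_of(view_key_columns.begin(), view_key_columns.end(),
                       [&row](const std::string& column) {
                           return row.find(column) != row.end();
                       });
}

// A view row is created only when the new base row has all key columns.
bool can_create_view_row(const base_row& new_row,
                         const std::vector<std::string>& view_key_columns) {
    return has_all_view_key_columns(new_row, view_key_columns);
}

// An old view row exists (and may need deleting) only when the existing
// base row had all key columns.
bool old_view_row_exists(const base_row& existing_row,
                         const std::vector<std::string>& view_key_columns) {
    return has_all_view_key_columns(existing_row, view_key_columns);
}

// Small usage example: a row with only one of two key columns set.
inline bool partial_row_example() {
    base_row partial;
    partial["v1"] = "a";
    const std::vector<std::string> key{"v1", "v2"};
    assert(!can_create_view_row(partial, key));
    return old_view_row_exists(partial, key);
}
```

A check that looked at only one key column would treat the partial row above as having a view row, which is the kind of mistake the commit describes.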
Piotr Sarna
2061e6a9cc db,view: perform local view updates synchronously
Local view updates (updates applied to a local node,
without remote communication) are from now on performed
synchronously - which adds consistency guarantees, as a local
write failure will be returned to the client instead of being
silently ignored.
2020-03-11 09:05:56 +01:00
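Illustration for 2061e6a9cc: a toy model of the behaviour described in that commit message, where local view updates run synchronously so that a failure reaches the writer, while remote updates are handed to the background. Plain callables and a vector stand in for Seastar futures; `view_update` and `dispatch_view_updates` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical model of one view update; `apply` throws on failure.
struct view_update {
    bool is_local;
    std::function<void()> apply;
};

// Applies local updates right away, so a failure propagates to the caller,
// and moves remote updates to a background queue the caller does not wait on.
void dispatch_view_updates(std::vector<view_update> updates,
                           std::vector<std::function<void()>>& background) {
    for (auto& u : updates) {
        if (u.is_local) {
            u.apply();  // synchronous: an exception reaches the writer
        } else {
            background.push_back(std::move(u.apply));  // deferred, not run here
        }
    }
}

// Small usage example: a failing local update is reported to the caller.
inline bool local_failure_is_reported() {
    std::vector<std::function<void()>> background;
    std::vector<view_update> updates;
    updates.push_back(view_update{true, [] { throw std::runtime_error("local write failed"); }});
    try {
        dispatch_view_updates(std::move(updates), background);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

In this model a failing remote update would stay unnoticed by the caller until something drains the background queue, which mirrors the asynchronous path left in place for remote view updates.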
Piotr Sarna
fd49fd773c db,view: move putting view updates to background to mutate_MV
Currently, launching view updates as an asynchronous background job
is done via not waiting for mutate_MV() future in
table::generate_and_propagate_view_updates. That has a big downside,
since mutate_MV() handles *all* view updates for *all* views of a table,
so it's not possible to wait for each view independently.
Per-view granularity is required in order to implement synchronous
view updates of local views - because then we'll synchronously
wait for all views that write to a local node (due to having a matching
partition key with the base), while remote view updates will still
be sent asynchronously.
In order to do that, mutate_MV is now waited for properly, and the
asynchronous, unwaited-for futures are launched inside mutate_MV instead.
Effectively that means no changes for view updates so far - all updates
will be fired in the background. Later, another patch will introduce
a way to wait for selected updates to finish.
2020-03-11 09:05:56 +01:00