Commit Graph

21 Commits

Author SHA1 Message Date
Avi Kivity
f70ece9f88 tests: convert sprint() to format()
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().

Mechanically converted with https://github.com/avikivity/unsprint.
2018-11-01 13:16:17 +00:00
Botond Dénes
eb357a385d flat_mutation_reader: make timeout opt-out rather than opt-in
Currently timeout is opt-in, that is, all methods that even have it
default it to `db::no_timeout`. This means that ensuring the timeout is
used where it should be is entirely up to the author and the reviewers of
the code. As humans are notoriously prone to mistakes, this has resulted
in very inconsistent usage of the timeout, with many clients of
`flat_mutation_reader` passing it only to some members and only
on certain call sites. This is small wonder, considering that some core
operations like `operator()()` only recently received a timeout
parameter and others like `peek()` didn't even have one until this
patch. Both of these methods call `fill_buffer()`, which potentially
talks to the lower layers and is supposed to propagate the timeout.
All this makes `flat_mutation_reader`'s timeout effectively useless.

To bring order to this chaos, make the timeout parameter mandatory
on all `flat_mutation_reader` methods that need it. This ensures that
humans now get a reminder from the compiler when they forget to pass the
timeout. Clients can still opt out of passing a timeout by passing
`db::no_timeout` (the previous default value), but this is now
explicit and developers should think before typing it.

There were surprisingly few core call sites to fix up. Where a timeout
was available nearby I propagated it to the reader; where I couldn't,
I passed `db::no_timeout`. Authors of the
latter kind of code (view, streaming and repair are some of the notable
examples) should consider propagating a timeout down if needed.
In the test code (the vast majority of the changes) I just used
`db::no_timeout` everywhere.

Tests: unit(release, debug)

Signed-off-by: Botond Dénes <bdenes@scylladb.com>

Message-Id: <1edc10802d5eb23de8af28c9f48b8d3be0f1a468.1536744563.git.bdenes@scylladb.com>
2018-09-20 11:31:24 +02:00
Tomasz Grabiec
0a1aec2bd6 tests: perf_row_cache_update: Test with an active reader surviving memtable flush
Exposes latency issues caused by mutation_cleaner lifetime issues,
fixed by earlier commits.
2018-06-27 21:51:04 +02:00
Gleb Natapov
98b7f6148b fix regression in perf_row_cache_update test
logalloc must now be initialized explicitly by every test that uses
it.

Message-Id: <20180613093657.GY11809@scylladb.com>
2018-06-13 15:21:20 +03:00
Tomasz Grabiec
25b3641d9e tests: Improve perf_row_cache_update
We now test more kinds of workloads:
 - small partitions with no clustering key
 - large partition with lots of small rows
 - large partition with lots of range tombstones

We also collect statistics about scheduling latency induced by cache
update.

Example output:

Small partitions, no overwrites:
update: 356.809113 [ms], stall: {ticks: 396, min: 0.006867 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.358102 [ms]}, cache: 257/257 [MB] LSA: 257/257 [MB] std free: 83 [MB]
update: 337.542999 [ms], stall: {ticks: 373, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.358102 [ms]}, cache: 514/514 [MB] LSA: 514/514 [MB] std free: 83 [MB]
update: 383.485291 [ms], stall: {ticks: 425, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 771/788 [MB] LSA: 771/788 [MB] std free: 83 [MB]
update: 574.968811 [ms], stall: {ticks: 634, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.629722 [ms], max: 1.955666 [ms]}, cache: 879/917 [MB] LSA: 879/917 [MB] std free: 83 [MB]
update: 411.541138 [ms], stall: {ticks: 455, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.358102 [ms]}, cache: 787/835 [MB] LSA: 787/835 [MB] std free: 83 [MB]
update: 368.491211 [ms], stall: {ticks: 408, min: 0.001332 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 750/790 [MB] LSA: 750/790 [MB] std free: 83 [MB]
update: 343.671967 [ms], stall: {ticks: 380, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 734/769 [MB] LSA: 734/769 [MB] std free: 83 [MB]
update: 320.277283 [ms], stall: {ticks: 357, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 724/753 [MB] LSA: 724/753 [MB] std free: 83 [MB]
update: 310.583282 [ms], stall: {ticks: 344, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 714/740 [MB] LSA: 714/740 [MB] std free: 83 [MB]
update: 303.627106 [ms], stall: {ticks: 338, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.955666 [ms]}, cache: 707/731 [MB] LSA: 707/731 [MB] std free: 83 [MB]
update: 296.742523 [ms], stall: {ticks: 330, min: 0.001332 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 701/724 [MB] LSA: 701/724 [MB] std free: 83 [MB]
update: 286.598541 [ms], stall: {ticks: 319, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 697/719 [MB] LSA: 697/719 [MB] std free: 83 [MB]
update: 288.649323 [ms], stall: {ticks: 321, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 694/715 [MB] LSA: 694/715 [MB] std free: 83 [MB]
update: 282.069916 [ms], stall: {ticks: 314, min: 0.001598 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 692/712 [MB] LSA: 692/712 [MB] std free: 83 [MB]
update: 292.462036 [ms], stall: {ticks: 325, min: 0.001917 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 689/708 [MB] LSA: 689/708 [MB] std free: 83 [MB]
update: 274.390442 [ms], stall: {ticks: 305, min: 0.001332 [ms], 50%: 1.131752 [ms], 90%: 1.131752 [ms], 99%: 1.131752 [ms], max: 1.131752 [ms]}, cache: 687/705 [MB] LSA: 687/705 [MB] std free: 83 [MB]
invalidation: 172.617508 [ms]
Large partition, lots of small rows:
update: 262.132721 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.005722 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 268.650944 [ms]}, cache: 187/188 [MB] LSA: 187/188 [MB] std free: 82 [MB]
update: 281.359467 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 322.381152 [ms]}, cache: 375/376 [MB] LSA: 375/376 [MB] std free: 82 [MB]
update: 287.229065 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 322.381152 [ms]}, cache: 563/564 [MB] LSA: 563/564 [MB] std free: 82 [MB]
update: 1294.816284 [ms], stall: {ticks: 4, min: 0.001917 [ms], 50%: 0.005722 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 1386.179840 [ms]}, cache: 586/625 [MB] LSA: 586/625 [MB] std free: 82 [MB]
update: 845.022461 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.005722 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 962.624896 [ms]}, cache: 439/475 [MB] LSA: 439/475 [MB] std free: 82 [MB]
update: 380.335938 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 386.857376 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
update: 477.234680 [ms], stall: {ticks: 4, min: 0.002760 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
update: 525.955017 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
update: 548.003784 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.006866 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
update: 528.697937 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
update: 609.292603 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.005722 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
update: 575.762451 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 668.489536 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
update: 530.801392 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
update: 535.948364 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
update: 527.143555 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.020501 [ms], 99%: 0.020501 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
update: 521.869202 [ms], stall: {ticks: 4, min: 0.002760 [ms], 50%: 0.004768 [ms], 90%: 0.017084 [ms], 99%: 0.017084 [ms], max: 557.074624 [ms]}, cache: 599/600 [MB] LSA: 599/600 [MB] std free: 82 [MB]
invalidation: 173.069733 [ms]
Large partition, lots of range tombstones:
update: 224.003220 [ms], stall: {ticks: 4, min: 0.001917 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 268.650944 [ms]}, cache: 52/52 [MB] LSA: 52/52 [MB] std free: 82 [MB]
update: 570.882874 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 105/105 [MB] LSA: 105/105 [MB] std free: 82 [MB]
update: 577.249878 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 158/158 [MB] LSA: 158/158 [MB] std free: 82 [MB]
update: 580.239624 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 211/211 [MB] LSA: 211/211 [MB] std free: 82 [MB]
update: 614.187134 [ms], stall: {ticks: 4, min: 0.001917 [ms], 50%: 0.004768 [ms], 90%: 0.011864 [ms], 99%: 0.011864 [ms], max: 668.489536 [ms]}, cache: 264/264 [MB] LSA: 264/264 [MB] std free: 82 [MB]
update: 618.709229 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.003973 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 317/317 [MB] LSA: 317/317 [MB] std free: 82 [MB]
update: 626.943359 [ms], stall: {ticks: 4, min: 0.001598 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 369/370 [MB] LSA: 369/370 [MB] std free: 82 [MB]
update: 602.873474 [ms], stall: {ticks: 4, min: 0.001917 [ms], 50%: 0.003973 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 422/423 [MB] LSA: 422/423 [MB] std free: 82 [MB]
update: 617.522583 [ms], stall: {ticks: 4, min: 0.001598 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 475/475 [MB] LSA: 475/475 [MB] std free: 82 [MB]
update: 627.291138 [ms], stall: {ticks: 4, min: 0.001598 [ms], 50%: 0.004768 [ms], 90%: 0.011864 [ms], 99%: 0.011864 [ms], max: 668.489536 [ms]}, cache: 528/528 [MB] LSA: 528/528 [MB] std free: 82 [MB]
update: 623.720886 [ms], stall: {ticks: 4, min: 0.001598 [ms], 50%: 0.003973 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 581/581 [MB] LSA: 581/581 [MB] std free: 82 [MB]
update: 630.735596 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 668.489536 [ms]}, cache: 634/634 [MB] LSA: 634/634 [MB] std free: 82 [MB]
update: 2776.525635 [ms], stall: {ticks: 4, min: 0.002300 [ms], 50%: 0.004768 [ms], 90%: 0.014237 [ms], 99%: 0.014237 [ms], max: 2874.382592 [ms]}, cache: 687/687 [MB] LSA: 687/687 [MB] std free: 82 [MB]
2018-05-30 12:18:56 +02:00
José Guilherme Vanz
380bc0aa0d Swap arguments order of mutation constructor
Swap the arguments of the mutation constructor to keep it consistent
with the other constructor variants. Refs #3084

Signed-off-by: José Guilherme Vanz <guilherme.sft@gmail.com>
Message-Id: <20180120000154.3823-1-guilherme.sft@gmail.com>
2018-01-21 12:58:42 +02:00
Jesse Haber-Kucharsky
fb0866ca20 Move thread_local declarations out of main.cc
Since `disk-error-handler.hh` defines these global variables `extern`,
it makes sense to declare them in the `disk-error-handler.cc` instead of
`main.cc`.

This means that test files don't have to declare them.

Fixes #2735.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <1eed120bfd9bb3647e03fe05b60c871de2df2a86.1511810004.git.jhaberku@scylladb.com>
2017-11-27 20:27:42 +01:00
Tomasz Grabiec
d22fdf4261 row_cache: Improve safety of cache updates
Cache imposes requirements on how updates to the on-disk mutation source
are made:
  1) each change to the on-disk mutation source must be followed
     by a cache synchronization reflecting that change
  2) the two must be serialized with other synchronizations
  3) the synchronization must have strong failure guarantees (atomicity)

Because of that, sstable list update and cache synchronization must be
done under a lock, and cache synchronization cannot fail to synchronize.

Normally cache synchronization achieves the no-failure property by
wiping the cache (which is noexcept) when a failure is detected. There
are some setup steps, however, which cannot be skipped, e.g. taking a
lock followed by switching the cache to use the new snapshot. Those
truly cannot fail. The lock inside cache synchronizers is redundant,
since the user needs to take it anyway around the combined operation.

In order to make ensuring strong exception guarantees easier, and to
make the cache interface easier to use correctly, this patch moves
the control of the combined update into the cache. This is done by
having cache::update() et al. accept a callback (external_updater)
which is supposed to perform the modification of the underlying mutation
source when invoked.

This is in-line with the layering. Cache is layered on top of the
on-disk mutation source (it wraps it) and reading has to go through
cache. After the patch, modification also goes through cache. This way
more of cache's requirements can be confined to its implementation.

The failure semantics of update() and the other synchronizers needed to
change due to the strong exception guarantees. Now if it fails, it means
the update was not performed, neither to the cache nor to the
underlying mutation source.

The database::_cache_update_sem goes away, serialization is done
internally by the cache.

The external_updater needs to have strong exception guarantees. This
requirement is not new. It is however currently violated in some
places. This patch marks those callbacks as noexcept and leaves a
FIXME. Those should be fixed, but that's not in the scope of this
patch. Aborting is still better than corrupting the state.

Fixes #2754.

Also fixes the following test failure:

  tests/row_cache_test.cc(949): fatal error: in "test_update_failure": critical check it->second.equal(*s, mopt->partition()) has failed

which started to trigger after commit 318423d50b. Thread stack
allocation may fail, in which case we did not do the necessary
invalidation.
2017-09-04 10:04:29 +02:00
Tomasz Grabiec
56e3ce05db row_cache: Don't require presence checker to be supplied externally
The API is simpler and safer this way.
2017-09-04 10:04:29 +02:00
Tomasz Grabiec
8ba6366610 row_cache: Switch to using snapshot_source
Currently, every time the cache needs to create a reader for missing
data, it obtains a reader which is most up to date. That reader includes
writes from later populate phases, for which update() was not yet
called. This will be problematic once we allow partitions to be
partially populated, because different parts of the partition could be
populated by readers seeing different sets of writes, breaking
write atomicity.

The solution will be to always populate given partition using the same
set of writes, using reader created from the current snapshot. The
snapshot changes only on update(), with update() gradually converting
each partition to the new snapshot.
2017-06-24 18:06:11 +02:00
Tomasz Grabiec
78844fa2e5 db: Use incremental selector in partition_presence_checker
This reduces the number of sstables we need to check to only those
whose token range overlaps with the key. Reduces cache update
time. Especially effective with leveled compaction strategy.

Refs #1943.

Incremental selector works with an immutable sstable set, so cache
updates need to be serialized. Otherwise we could mispopulate due to
stale presence information.

Presence checker interface was changed to accept decorated key in
order to gain easy access to the token, which is required by
the incremental selector.
2016-12-19 14:20:58 +01:00
Paweł Dziepak
6755a679f6 drop key readers
key_readers haven't been used since the introduction of the continuity
flag to cache entries.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-10-19 15:29:08 +01:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Benoît Canet
1fb9a48ac5 exception: Optionally shutdown communication on I/O errors.
I/O errors cannot be fixed by Scylla; the only solution
is to shut down the database's communications.

Signed-off-by: Benoît Canet <benoit@scylladb.com>
Message-Id: <1458154098-9977-1-git-send-email-benoit@scylladb.com>
2016-03-17 15:02:52 +02:00
Glauber Costa
15336e7eb7 key_source: turn it into a class
Its definition as a lambda function is inconvenient, because it does not allow
us to use default values for parameters.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-01-25 15:20:38 -05:00
Glauber Costa
58fdae33bd mutation_source: turn it into a class
Its definition as a lambda function is inconvenient, because it does not allow
us to use default values for parameters.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-01-25 15:20:38 -05:00
Tomasz Grabiec
4e5a52d6fa db: Make read interface schema version aware
The intent is to make data returned by queries always conform to a
single schema version, which is requested by the client. For CQL
queries, for example, we want to use the same schema which was used to
compile the query. The other node expects to receive data conforming
to the requested schema.

Interface on shard level accepts schema_ptr, across nodes we use
table_schema_version UUID. To transfer schema_ptr across shards, we
use global_schema_ptr.

Because schema is identified with UUID across nodes, requestors must
be prepared for being queried for the definition of the schema. They
must hold a live schema_ptr around the request. This guarantees that
schema_registry will always know about the requested version. This is
not an issue because for queries the requestor needs to hold on to the
schema anyway to be able to interpret the results. But care must be
taken to always use the same schema version for making the request and
parsing the results.

Schema requesting across nodes is currently stubbed out (it throws a
runtime exception).
2016-01-11 10:34:52 +01:00
Avi Kivity
47499dcf18 data_value: make conversion from bytes explicit
Since bytes is a very generic value that is returned from many calls,
it is easy to pass it by mistake to a function expecting a data_value,
and to get a wrong result.  It is impossible for the data_value
constructor to know whether the argument is a genuine bytes variable, a
serialized data_value of another type, or some other serialized data
type.

To prevent misuse, make the data_value(bytes) constructor
(and the complementary data_value(optional<bytes>) constructor) explicit.
2015-11-13 17:12:29 +02:00
Paweł Dziepak
c1e95dd893 row_cache: pass underlying key_source to row_cache
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-20 20:27:53 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Tomasz Grabiec
bf6062493e tests: Introduce tests/perf_row_cache_update 2015-09-07 09:41:36 +02:00