Commit Graph

77 Commits

Author SHA1 Message Date
Piotr Grabowski
561753fe71 mutation: Improve log print of mutations
Changes format of mutation, mutation_partition
log messages to more human-readable. Fixes #826.
2020-09-04 16:33:25 +02:00
Wojciech Mitros
45215746fe increase the maximum size of query results to 2^64
Currently, we cannot select more than 2^32 rows from a table because we are limited by types of
variables containing the numbers of rows. This patch changes these types and sets new limits.

The new limits take effect while selecting all rows from a table - custom limits of rows in a result
stay the same (2^32-1).

In classes which are being serialized and used in messaging, in order to be able to process queries
originating from older nodes, the top 32 bits of new integers are optional and stay at the end
of the class - if they're absent we assume they equal 0.

The backward compatibility was tested by querying an older node for a paged selection, using the
received paging_state with the same select statement on an upgraded node, and comparing the returned
rows with the result generated for the same query by the older node, additionally checking if the
paging_state returned by the upgraded node contained new fields with correct values. Also verified
if the older node simply ignores the top 32 bits of the remaining rows number when handling a query
with a paging_state originating from an upgraded node by generating and sending such a query to
an older node and checking the paging_state in the reply(using python driver).

Fixes #5101.
2020-08-03 17:32:49 +02:00
Botond Dénes
6660a5df51 result_memory_accounter: remove default constructor
If somebody wants to bypass proper memory accounting they should at
the very least be forced to consider if that is indeed wise and think a
second about the limit they want to apply.
2020-07-28 18:00:29 +03:00
Piotr Jastrzebski
ca4a89d239 dht: add dht::decorate_key
and replace all dht::global_partitioner().decorate_key
with dht::decorate_key

It is an improvement because dht::decorate_key takes schema
and uses it to obtain partitioner instead of using global
partitioner as it was before.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-02-17 10:59:06 +01:00
Piotr Dulikowski
59fbbb993f memtables: add partition/row hit/miss counters
Adds per-table metrics for counting partition and row reuse
in memtables. New metrics are as follows:
    - memtable_partition_writes - number of write operations performed
          on partitions in memtables,
    - memtable_partition_hits - number of write operations performed
          on partitions that previously existed in a memtable,
    - memtable_row_writes - number of row write operations performed
          in memtables,
    - memtable_row_hits - number of row write operations that ovewrote
          rows previously present in a memtable.

Tests: unit(release)
2019-11-12 13:35:41 +01:00
Vladimir Davydov
e0b31dd273 query: add flag to return static row on partition with no rows
A SELECT statement that has clustering key restrictions isn't supposed
to return static content if no regular rows matches the restrictions,
see #589. However, for the CAS statement we do need to return static
content on failure so this patch adds a flag that allows the caller to
override this behavior.
2019-10-28 21:50:44 +03:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Avi Kivity
a71ab365e3 toplevel: convert sprint() to format()
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().

Mechanically converted with https://github.com/avikivity/unsprint.
2018-11-01 13:16:17 +00:00
Paweł Dziepak
637b9a7b3b atomic_cell_or_collection: make operator<< show cell content
After the new in-memory representation of cells was introduced there was
a regression in atomic_cell_or_collection::operator<< which stopped
printing the content of the cell. This makes debugging more incovenient
are time-consuming. This patch fixes the problem. Schema is propagated
to the atomic_cell_or_collection printer and the full content of the
cell is printed.

Fixes #3571.

Message-Id: <20181024095413.10736-1-pdziepak@scylladb.com>
2018-10-24 13:29:51 +03:00
Botond Dénes
eb357a385d flat_mutation_reader: make timeout opt-out rather than opt-in
Currently timeout is opt-in, that is, all methods that even have it
default it to `db::no_timeout`. This means that ensuring timeout is used
where it should be is completely up to the author and the reviewrs of
the code. As humans are notoriously prone to mistakes this has resulted
in a very inconsistent usage of timeout, many clients of
`flat_mutation_reader` passing the timeout only to some members and only
on certain call sites. This is small wonder considering that some core
operations like `operator()()` only recently received a timeout
parameter and others like `peek()` didn't even have one until this
patch. Both of these methods call `fill_buffer()` which potentially
talks to the lower layers and is supposed to propagate the timeout.
All this makes the `flat_mutation_reader`'s timeout effectively useless.

To make order in this chaos make the timeout parameter a mandatory one
on all `flat_mutation_reader` methods that need it. This ensures that
humans now get a reminder from the compiler when they forget to pass the
timeout. Clients can still opt-out from passing a timeout by passing
`db::no_timeout` (the previous default value) but this will be now
explicit and developers should think before typing it.

There were suprisingly few core call sites to fix up. Where a timeout
was available nearby I propagated it to be able to pass it to the
reader, where I couldn't I passed `db::no_timeout`. Authors of the
latter kind of code (view, streaming and repair are some of the notable
examples) should maybe consider propagating down a timeout if needed.
In the test code (the wast majority of the changes) I just used
`db::no_timeout` everywhere.

Tests: unit(release, debug)

Signed-off-by: Botond Dénes <bdenes@scylladb.com>

Message-Id: <1edc10802d5eb23de8af28c9f48b8d3be0f1a468.1536744563.git.bdenes@scylladb.com>
2018-09-20 11:31:24 +02:00
Botond Dénes
d347866664 mutation: add missing assert to mutation from reader
read_mutation_from_flat_mutation_reader's internal adapter can build a
single mutation only and hence can consume only a single partition.
If more than one partitions are pushed down from the producer the
adaptor will very likely crash. To avoid unnecessary investigations add
an assert() to fail early and make it clear what the real problem is.
All other consume_ methods have an assert() already for their
invariants so this is just following suit.
2018-09-03 10:31:44 +03:00
Paweł Dziepak
27014a23d7 treewide: require type info for copying atomic_cell_or_collection 2018-05-31 15:51:11 +01:00
Paweł Dziepak
e9d6fc48ac treewide: require type for creating atomic_cell 2018-05-31 15:51:11 +01:00
Duarte Nunes
6b4b429883 query-result: Introduce class result_options
Introduce class result_options to carry result options through the
request pipeline, which at this point mean the result type and the
digest algorithm. This class allows us to encapsulate the concrete
digest algorithm to use.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-02-01 00:22:50 +00:00
Piotr Jastrzebski
d9cbb9fedc Delete unused mutation_from_streamed_mutation(streamed_mutation_opt)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-01-24 20:56:49 +01:00
Piotr Jastrzebski
759271f866 Delete unused mutation_from_streamed_mutation(streamed_mutation&)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-01-24 20:56:49 +01:00
Piotr Jastrzebski
667ce36981 Move mutation_rebuilder to header
It will be used in tests.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-01-19 08:56:37 +01:00
Piotr Jastrzebski
570703a169 read_mutation_from_flat_mutation_reader: don't take schema_ptr
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-12-21 11:47:07 +01:00
Tomasz Grabiec
b3709047b0 mutation_partition: Extract sliced() from mutation into mutation_partition
So that we can call it on mutation_partition.
2017-12-08 17:50:47 +01:00
Tomasz Grabiec
66990867b8 mutation: Make printout more concise
Before:

{ks.cf key {key: pk{000c706b30303030303030303030}, token:-2018791535786252460} data {mutation_partition:

After:

{ks.cf {key: pk{000c706b30303030303030303030}, token:-2018791535786252460} {mutation_partition:
2017-12-01 10:52:37 +01:00
Tomasz Grabiec
fd7ab5fe99 database: Move operator<<() overloads to appropriate source files 2017-12-01 10:52:37 +01:00
Piotr Jastrzebski
4b58a05053 Introduce read_mutation_from_flat_mutation_reader
This helper method reads a single mutation from
a flat_mutation_reader.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 14:26:10 +01:00
Piotr Jastrzebski
6718ecab82 Make mutation_rebuilder streamed_mutation independent
mutation_rebuilder will be used not only with streamed_mutations
but also with flat_mutation_readers so it's better for it to be
independent from streamed_mutation.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 14:26:10 +01:00
Tomasz Grabiec
749f5770df mutation: Introduce apply(mutation_fragment) 2017-11-02 12:16:17 +01:00
Tomasz Grabiec
1d5d5e26a2 mutation: Introduce sliced() 2017-06-24 18:06:11 +02:00
Piotr Jastrzebski
77f944880c cache: Remove support for wide partitions
This will be handled by row cache now.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-06-24 18:06:11 +02:00
Duarte Nunes
9e88b60ef5 mutation: Set cell using clustering_key_prefix
Change the clustering key argument in mutation::set_cell from
exploded_clustering_prefix to clustering_key_prefix, which allows for
some overall code simplification and fewer copies. This mostly affects
the cql3 layer.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:50 +02:00
Duarte Nunes
b27da688f9 mutation: Remove dead get_cell() function
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170316234843.23130-1-duarte@scylladb.com>
2017-03-17 11:18:23 +02:00
Tomasz Grabiec
cbf4601e31 streamed_mutation: Add non-owning variant of mutation_from_streamed_mutation() 2017-02-23 18:50:53 +01:00
Piotr Jastrzebski
4bbe05dd47 mutation_partition: take schema in find_row and clustered_row
This will allow intrusive set implementation that does not
store schema.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-01-05 11:26:03 +01:00
Avi Kivity
1d9ee358f1 Revert "Merge "Reduce the size of mutation_partition" from Piotr"
This reverts commit aa392810ff, reversing
changes made to a24ff47c637e6a5fd158099b8a65f1191fc2d023; it uses
boost::intrusive::detail directly, which it must not, and doesn't compile on
all boost versions as a consequence.
2016-12-25 16:07:48 +02:00
Piotr Jastrzebski
2af6ff68d9 mutation_partition: take schema in find_row and clustered_row
This will allow intrusive set implementation that does not
store schema.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2016-12-23 11:29:07 +01:00
Asias He
e5485f3ea6 Get rid of query::partition_range
Use dht::partition_range instead
2016-12-19 08:09:25 +08:00
Paweł Dziepak
15de8de9e5 reconcilable_result: keep result_memory_tracker object
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-12-14 14:10:02 +00:00
Paweł Dziepak
948c062e64 mutation_rebuilder: use memory_usage()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-11-18 11:25:36 +00:00
Paweł Dziepak
ef57b9a26f rename memory_usage() to external_memory_usage() where applicable
Renaming the function to external_memory_usage() makes it clear that
sizeof(T) is not included, something that was a source of confusion in
the past.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-11-18 11:25:36 +00:00
Piotr Jastrzebski
9d33948487 mutation_rebuilder: fix fragment size calculation
It wasn't calculating the size of data correctly.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <c03dfff7bf1ca3199991e5864189f98bfa2942ea.1479397736.git.piotr@scylladb.com>
2016-11-17 16:23:42 +00:00
Tomasz Grabiec
ecf85cbffb mutation: Define + operation
It's more convenient to write m1 + m2 in tests than to do more
elaborate constructs with copy constructors and apply().
2016-10-18 11:16:08 +02:00
Piotr Jastrzebski
0d39bb1ad0 Implement mutation_from_streamed_mutation_with_limit
If mutation is bigger than this limit
it won't be read and mutation_from_streamed_mutation
will return empty optional.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2016-07-21 09:35:23 +02:00
Paweł Dziepak
93cc4454a6 streamed_mutation: emit range_tombstones directly
Originally, streamed_mutations guaranteed that emitted tombstones are
disjoint. In order to achieve that two separate objects were produced
for each range tombstone: range_tombstone_begin and range_tombstone_end.

Unfortunately, this forced sstable writer to accumulate all clustering
rows between range_tombstone_begin and range_tombstone_end.

However, since there is no need to write disjoint tombstones to sstables
(see #1153 "Write range tombstones to sstables like Cassandra does") it
is also not necessary for streamed_mutations to produce disjoint range
tombstones.

This patch changes that by making streamed_mutation produce
range_tombstone objects directly.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-13 09:51:18 +01:00
Duarte Nunes
01b18063ea query: Add per-partition row limit
This patch as a per-partition row limit. It ensures both local
queries and the reconciliation logic abide by this limit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-22 09:46:51 +02:00
Paweł Dziepak
48e08fa997 mutation: add mutation_from_streamed_mutation()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:49 +01:00
Avi Kivity
db03295c8a Merge "Fix query digest mismatch" from Tomasz
"Currently data query digest includes cells and tombstones which may have
expired or be covered by higher-level tombstones. This causes digest
mismatch between replicas if some elements are compacted on one of the
nodes and not on others. This mismatch triggers read-repair which doesn't
resolve because mutations received by mutation queries are not differing,
they are compacted already.

The fix adds compacting step before writing and digesting query results by
reusing the algorithm used by mutation query. This is not the most optimal
way to fix this. The compaction step could be folded with the query writing,
there is redundancy in both steps. However such change carries more risk,
and thus was postponed.

perf_simple_query test (cassandra-stress-like partitions) shows regression
from 83k to 77k (7%) ops/s.

Fixes #1165."
2016-04-08 12:13:29 +03:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Tomasz Grabiec
f15c380a4f database: Compact mutations when executing data queries
Currently data query digest includes cells and tombstones which may have
expired or be covered by higher-level tombstones. This causes digest
mismatch between replicas if some elements are compacted on one of the
nodes and not on others. This mismatch triggers read-repair which doesn't
resolve because mutations received by mutation queries are not differing,
they are compacted already.

The fix adds compacting step before writing and digesting query results by
reusing the algorithm used by mutation query. This is not the most optimal
way to fix this. The compaction step could be folded with the query writing,
there is redundancy in both steps. However such change carries more risk,
and thus was postponed.

perf_simple_query test (cassandra-stress-like partitions) shows regression
from 83k to 77k (7%) ops/s.

Fixes #1165.
2016-04-07 19:56:58 +02:00
Tomasz Grabiec
87d7279267 mutation: Add copy assignment operator
We already have a copy constructor, so can have copy assignment as
well.
2016-03-21 18:41:27 +01:00
Paweł Dziepak
82d2a2dccb specify whether query::result, result_digest or both are needed
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-03-11 18:27:13 +00:00
Tomasz Grabiec
4284715ddf Relax includes 2016-02-26 12:26:13 +01:00
Tomasz Grabiec
036974e19b Make mutation interfaces support multiple versions
Schema is tracked in memtable and cache per-entry. Entries are
upgraded lazily on access. Incoming mutations are upgraded to table's
current schema on given shard.

Mutating nodes need to keep schema_ptr alive in case schema version is
requested by target node.
2016-01-11 10:34:51 +01:00
Tomasz Grabiec
f59ec59abc mutation: Implement upgrade()
Converts mutation to a new schema.
2016-01-08 21:10:26 +01:00