Commit Graph

95 Commits

Author SHA1 Message Date
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
ae3a360725 database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.

As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
2022-01-06 17:07:30 +02:00
Avi Kivity
0909e3c17d treewide: remove redundant "x <=> 0" compares
If x is of type std::strong_ordering, then "x <=> 0" is equivalent to
x. These no-ops were inserted during #1449 fixes, but are now unnecessary.
They have potential for harm, since they can hide an accidental of the
type of x to an arithmetic type, so remove them.

Ref #1449.
2021-07-28 13:30:32 +03:00
Pavel Emelyanov
0f53e83a8e range_tombstone_list, code: Mark external_memory_usage noexcept
The range_tombstone_list's method is at the top of the
stack of calls each not throwing anything, so do the
deep-dive noexcept marking.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-27 20:06:53 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Michał Chojnowski
4b60e69e7c keys, compound: take the argument to from_single_value() by reference
Since serialize_value needs to copy the values to a bigger buffer anyway,
there is no point in copying the argument higher in the call chain.
This patch eliminates some pointless copies, for example in
alternator/executor.cc

Closes #8688
2021-05-24 11:20:24 +03:00
Michał Chojnowski
ffdb706984 keys, compound: eliminate some careless copies of shared pointers
Using `auto` copies the shared pointers. We don't want that, so let's use
`const auto&`.

Closes #8686
2021-05-23 12:11:46 +03:00
Michał Chojnowski
23909e91a4 alternator: executor: eliminate some pointless reserializations
There are places where abstract_type::deserialize is called just to pass the
result to compound_wrapper::from_singular, which immediately serializes it
again.

Get rid of this ritual by adding a version of from_singular which takes
a serialized argument.

As a bonus, along the way we eliminate some pointless copies of lw_shared_ptr
and std::shared_ptr caused by two careless uses of `auto`.

Closes #8687
2021-05-23 09:42:09 +03:00
Michał Chojnowski
5a2b492f09 compound: add explode_fragmented
We will use it in the next patches in this series.
2021-04-08 10:02:54 +02:00
Michał Chojnowski
979666075f cql3: expression: use managed_bytes instead of bytes where possible 2021-04-01 10:44:21 +02:00
Michał Chojnowski
0bb959e890 cql3: don't linearize elements of lists, tuples, and user types
This patch switches the type used to store collection elements inside the
intermediate form used in lists::value, tuples::value etc. from bytes
to managed_bytes. After this patch, tuple and list elements are only linearized
in from_serialized, which will be corrected soon.
This commit introduces some additional copies in expression.cc, which
will be dealt with in a future commit.
2021-04-01 10:44:21 +02:00
Avi Kivity
58b7f225ab keys: convert trichotomic comparators to return std::strong_ordering
A trichotomic comparator returning an int an easily be mistaken
for a less comparator as the return types are convertible.

Use the new std::strong_ordering instead.

A caller in cql3's update_parameters.hh is also converted, following
the path of least resistance.

Ref #1449.

Test: unit (dev)

Closes #8323
2021-03-21 09:30:43 +02:00
Michał Chojnowski
85048b349b memtable: fix accounting of managed_bytes in partition_snapshot_accounter
managed_bytes has a small overhead per each fragment. Due to that, managed_bytes
containing the same data can have different total memory usage in different
allocators. The smaller the preferred max allocation size setting is, the more
fragments are needed and the greater total per-fragment overhead is.
In particular, managed_bytes allocated in the LSA could grow in
memory usage when copied to the standard allocator, if the standard allocator
had a preferred max allocation setting smaller than the LSA.

partition_snapshot_accounter calculates the amount of memory used by
mutation fragments in the memtable (where they are allocated with LSA) based
on the memory usage after they are copied to the standard allocator.
This could result in an overestimation, as explained above.
But partition_snapshot_accounter must not overestimate the amount of freed
memory, as doing otherwise might result in OOM situations.

This patch prevents the overaccounting by adding minimal_external_memory_usage():
a new version of external_memory_usage(), which ignores allocator-dependent
overhead. In particular, it includes the per-fragment overhead in managed_bytes
only once, no matter how many fragments there are.
2021-01-15 18:21:13 +01:00
Michał Chojnowski
2e38647a95 keys: update comments after changes and remove an unused method
The comments were outdated after the latest changes (bytes_view vs
managed_bytes_view).
compound_view_wrapper::get_component() is unused, so we remove it.
2021-01-15 14:05:44 +01:00
Michał Chojnowski
dbcf987231 keys, compound: switch from bytes_view to managed_bytes_view
The keys classes (partition_key et al) already use managed_bytes,
but they assume the data is not fragmented and make liberal use
of that by casting to bytes_view. The view classes use bytes_view.

Change that to managed_bytes_view, and adjust return values
to managed_bytes/managed_bytes_view.

The callers are adjusted. In some places linearization (to_bytes())
is needed, but this isn't too bad as keys are always <= 64k and thus
will not be fragmented when out of LSA. We can remove this
linearization later.

The serialize_value() template is called from a long chain, and
can be reached with either bytes_view or managed_bytes_view.
Rather than trace and adjust all the callers, we patch it now
with constexpr if.

operator bytes_view (in keys) is converted to operator
managed_bytes_view, allowing callers to defer or avoid
linearization.
2021-01-08 14:16:08 +01:00
Michał Chojnowski
2d28471a59 utils: managed_bytes: make the constructors from bytes and bytes_view explicit
Conversions from views to owners have no business being implicit.
Besides, they would also cause various ambiguity problems when adding
managed_bytes_view.
2021-01-04 22:22:12 +01:00
Botond Dénes
84c47c4228 partition_key_view: add validate method
We want to be able to pass `partition_key_view` to
`validation::validate_cql_key()`. As the latter wants to call
`validate()` on the key, replicate `partition_key::validate()` in
`partition_key_view`.
2020-05-12 12:07:00 +03:00
Piotr Jastrzebski
9279a679da keys.hh: make it independent from schema.hh
This cuts build dependency keys.hh -> schema.hh

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-01-20 14:25:17 +02:00
Tomasz Grabiec
c6274fdef3 keys: Avoid implicit conversion to partition_key in the hasher of partition_key_view
Message-Id: <1556230107-13557-1-git-send-email-tgrabiec@scylladb.com>
2019-04-26 20:02:35 +03:00
Rafael Ávila de Espíndola
561285488b keys: add schema-aware printing for clustering_key_prefix
For reporting large rows we have to be able to print clustering keys
in addition to partition keys.

Refs #3988.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-01-28 13:01:54 -08:00
Tomasz Grabiec
75cde85349 Merge "Support reading range tombstones" from Piotr and Vladimir
Implement and test support for reading range tombstones in SSTables 3.

Does not yet support reads which are using slicing or fast forwarding.

From github.com/scylladb/seastar-dev.git haaawk/sstables3/tombstones_v11:

Piotr Jastrzebski (5):
  sstables: Add consumer_m::consume_range_tombstone
  sstables: Support null columns in ck
  sstables: Support reading range_tombstones
  sstables: Test reading range_tombstones
  sstables: Add test for RT with non-full key

Vladimir Krivopalov (2):
  sstables: Add operator<< overload for bound_kind_m.
  keys: Add clustering_key_prefix::make_full helper.
2018-08-27 20:43:38 +02:00
Vladimir Krivopalov
8acf4ddb8e keys: Add clustering_key_prefix::make_full helper.
This method fills non-full clustering key with trailing empty values to
make it full.
This can be used for clustering keys of rows in a compact table as,
unlike in regular tables, they can be non-full.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-08-22 12:13:23 +02:00
Duarte Nunes
ce461b06d7 keys: Add factory for an empty clustering_key_prefix_view
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-08-20 21:39:37 +01:00
Avi Kivity
bfd14b4123 keys: schema-aware printing of a partition_key
Add a with_schema() helper to decorate a partition key with its
schema for pretty-printing purposes, and matching operator<<.

This is useful to print partition keys where the operator, who
may not be familiar with the encoding, may see them.
2018-07-17 14:43:12 +03:00
Avi Kivity
2582f53b44 Merge "database and API: Add column_family::get_sstables_by_key" from Amnon
"
This is series is for nodetool getsstables.

This patch is based on:
8daaf9833a

With some minor adjustments because of the code change in sstables.

The idea is to allow searching for all the sstables that contains a
given key.

After this patch if there is a table t1 in keyspace k1 and it has a key
called aa.

curl -X GET "http://localhost:10000/column_family/sstables/by_key/k1%3At1?key=aa"

Will return the list of sstables file names that contains that key.
"

* 'amnon/sstable_for_key_v4' of github.com:scylladb/seastar-dev:
  Add the API implementation to get_sstables_by_key
  api: column_family.json make the get_sstables_for_key doc clearer
  column_family: Add the get_sstables_by_partition_key method
  sstable test: add has_partition_key test
  sstable: Add has_partition_key method
  keys_test: add a test for nodetool_style string
  keys: Add from_nodetool_style_string factory method
2018-06-10 16:53:56 +03:00
Amnon Heiman
c517ee8353 keys: Add from_nodetool_style_string factory method
Based on:
8daaf9833a

This patch adds a from_nodetool_style_string factory method to partition_key.
The string format is follows the nodetool format, that column in the
partition keys are split by ':'.
For example, if a partition key has two column col1 and col2, to get the
partition key that has col1 = val1 and col2 = val2:

val1:val2
2018-05-28 18:09:51 +03:00
Nadav Har'El
433fc6c36e keys.hh: simplify empty clustering-key check
The exploded_clustering_prefix type has a convenient is_empty() method
and an even more convenient "operator bool" shortcut. Unfortunately,
the other clustering prefix types (clustering_key_prefix,
clustering_key_prefix_view) have, for historic reasons, an is_empty
method which takes a schema parameter. That also means they can't
have an "operator bool" shortcut.

But checking if a prefix doesn't really need the schema - all we need to
check is whether the byte representation is empty. The result is simpler
and more efficient code, and easier to use. It is also more consistent -
all clustering-key-related types will have an "operator bool" instead of
just some of them.

To avoid massive code changes, we leave a is_empty(schema) variant, which
simply calls is_empty(). There's already precedent for that - various
methods which have a variant taking schema (and ignoring it) and one
taking nothing.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180521174220.13262-1-nyh@scylladb.com>
2018-05-23 11:46:23 +02:00
Duarte Nunes
5f822e3928 db/view/view_builder: Actually build views
This patch adds the missing view building code to the eponymous class.

We consume from the reader associated with each base table until all
its views are built. If the reader reaches the end and there are
incomplete views, then a view was added while others were being built.
In such cases, we restart the reader to the beginning of the current
token, but not to the beginning of the token range, when the view is
added. Then, when we exhaust the reader, we simply create a new one
for the whole token range, and resume building the pending views.

We aim to be resource-conscious. On a given shard, at any given moment,
we consume at most from one reader. We also strive for fairness, in that
each build step inserts entries for the views of a different base. Each
build step reads and generates updates for batch_size rows. We lack a
controller, which could potentially allow us to go faster (to execute
multiple steps at the same time, or consume more rows per batch), and
also which would apply backpressure, so we could, for example, delay
executing a build step.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-03-27 01:20:11 +01:00
Duarte Nunes
12507fb9ce keys: Replace feed_hash() member function with appending_hash
Replace the feed_hash() member function of partition_key and
clustering_key_prefix with the specialization of appending_hash,
so that we can use the general feed_hash() function.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-02-01 00:22:50 +00:00
Paweł Dziepak
6031b7e587 keys: introduce compound_wrapper::from_exploded_view() 2017-07-26 14:38:27 +01:00
Duarte Nunes
257eaa0d05 compound_view_wrapper: Add tri_compare
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-17 10:33:18 +02:00
Duarte Nunes
9e88b60ef5 mutation: Set cell using clustering_key_prefix
Change the clustering key argument in mutation::set_cell from
exploded_clustering_prefix to clustering_key_prefix, which allows for
some overall code simplification and fewer copies. This mostly affects
the cql3 layer.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:50 +02:00
Duarte Nunes
07e648251b prefix_compound_view_wrapper: Add is_full and is_empty functions
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:50 +02:00
Tomasz Grabiec
212a021fc6 keys: Introduce is_empty() for prefixes 2017-03-28 18:10:39 +02:00
Paweł Dziepak
711bd19f16 keys: add memory_usage()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-11-18 11:25:36 +00:00
Paweł Dziepak
ef57b9a26f rename memory_usage() to external_memory_usage() where applicable
Renaming the function to external_memory_usage() makes it clear that
sizeof(T) is not included, something that was a source of confusion in
the past.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-11-18 11:25:36 +00:00
Paweł Dziepak
eb59b4c4ab keys: disable constructing from generic range
stdx::optional<T> uses quite elaborate std::enable_if_t magic to decide
whether the argument passed to its constructor should be used for a call
T constructor or stdx::optional<T> constructor.

Apparently, with GCC 6.2 having T constructor which accepts any type
confuses that magic and we end up with compile errors.

The solution is to have from_range() method that replaces that
constructor from range. There is also constructor that creates a key
from std::vector<bytes> so that code generated by IDL works as it did
before.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1474550971-15309-1-git-send-email-pdziepak@scylladb.com>
2016-09-24 18:57:01 +03:00
Paweł Dziepak
d0ee750cec keys: add memory_usage()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:17:25 +01:00
Paweł Dziepak
7809adc6ce keys: add compound_wrapper::tri_compare
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:48 +01:00
Tomasz Grabiec
57413618e8 Merge branch 'range-tombstone-v9' from https://github.com/duarten/scylla.git
From Duarte:

This patchset adds the range_tombstone_list data structure,
used to hold a set of disjoint range tombstones, and changes
the internal representation of row tombstones to use that
data structure.

Fixes #1155

[tgrabiec: Added compound_wrapper::make_empty(const schema&) overload
	   to fix compilation failure in tracing code]
2016-06-02 22:17:17 +02:00
Duarte Nunes
6a111fdd01 mutations: Introduce the range_tombstone class
This patch introduces the range_tombstone class, composed of
a [start, end] pair of clustering_key_prefixes, the type
of inclusiveness of each bound, and a tombstone.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:58 +02:00
Duarte Nunes
dc8319ed91 keys: Remove schema argument from make_empty
An empty key is independent of the schema.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:36 +02:00
Piotr Jastrzebski
8307681975 Introduce clustering_ranges type.
It will be used to slice data returned by mutation_readers.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2016-05-16 11:46:09 +02:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Gleb Natapov
51ca3122cf cleanup forward declaration for key types
Message-Id: <20160310075138.GC6117@scylladb.com>
2016-03-10 10:52:19 +01:00
Paweł Dziepak
53858ed9cd keys: remove old-style serializers
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-03-02 09:05:25 +00:00
Tomasz Grabiec
a921479e71 Merge tag '807-v3' from https://github.com/avikivity/scylla
From Avi:

This patchset introduces a linearization context for managed_bytes objects.

Within this context, any scattered managed_bytes (found only in lsa regions,
so limited to memtable and cache) are auto-linearized for the lifetime of
the context.   This ensures that key and value lookups can use fast
contiguous iterators instead of using slow discontiguous iterators (or
crashing, as is the case now).
2016-02-16 14:29:48 +01:00
Tomasz Grabiec
f4e3bd0c00 keys: Introduce partition_key::validate()
So that user doesn't have to play with low-level representations.
2016-02-15 16:53:56 +01:00
Tomasz Grabiec
df5f8e4bfc keys: Avoid unnecessary construction of temporary 'bytes' object
We're now using managed_bytes as main storage, so conversion from
bytes_view to bytes is redundant, we need to convert to managed_bytes
eventualy.
2016-02-15 16:53:56 +01:00
Tomasz Grabiec
6d00e473ac keys: Make constructor from bytes private 2016-02-15 16:53:55 +01:00