Commit Graph

895 Commits

Author SHA1 Message Date
Botond Dénes
ba7a9d2ac3 imr: switch back to open-coded description of structures
Commit aab6b0ee27 introduced the
controversial new IMR format, which relied on a very template-heavy
infrastructure to generate serialization and deserialization code via
template meta-programming. The promise was that this new format, beyond
solving the problems the previous open-coded representation had (working
on linearized buffers), will speed up migrating other components to this
IMR format, as the IMR infrastructure reduces code bloat, makes the code
more readable via declarative type descriptions as well as safer.
However, the results were almost the opposite. The template
meta-programming used by the IMR infrastructure proved very hard to
understand. Developers don't want to read or modify it. Maintainers
don't want to see it being used anywhere else. In short, nobody wants to
touch it.

This commit does a conceptual revert of
aab6b0ee27. A verbatim revert is not
possible because related code evolved a lot since the merge. Also, going
back to the previous code would mean we regress as we'd revert the move
to fragmented buffers. So this revert is only conceptual, it changes the
underlying infrastructure back to the previous open-coded one, but keeps
the fragmented buffers, as well as the interface of the related
components (to the extent possible).

Fixes: #5578
2021-02-16 23:43:07 +01:00
Michał Chojnowski
25a9569cc4 utils: managed_bytes: add a few trivial helper methods
We will use them in the upcoming IMR removal patch.
2021-02-16 23:43:07 +01:00
Michał Chojnowski
3f248ca7cc utils: fragment_range: move FragmentedView helpers to fragment_range.hh
In the upcoming IMR removal patch we will need read_simple() and similar helpers
for FragmentedView outside of types.hh. For now, let's move them to
fragment_range.hh, where FragmentedView is defined. Since it's a widely included
header, we should consider moving them to a more specialized header later.
2021-02-16 21:35:15 +01:00
Michał Chojnowski
8a06a576aa utils: fragment_range: add single_fragmented_mutable_view
We will use it later in the upcoming IMR removal patch.
2021-02-16 21:35:15 +01:00
Michał Chojnowski
7b662b9315 utils: fragment_range: implement FragmentRange for fragment_range
This will allow us to pass FragmentedView instances to places where
FragmentRange is expected.
2021-02-16 21:35:15 +01:00
Michał Chojnowski
f972f90193 utils: mutable_view: add front()
We will use it in the upcoming patches.
2021-02-16 21:35:14 +01:00
Piotr Sarna
aa39130a20 bounded_stats_queue: add missing const qualifiers
Most of the methods of this utility are effectively const.
Message-Id: <ed376ab74b6323cf770cc0a1314edbae0b16111e.1612953601.git.sarna@scylladb.com>
2021-02-10 13:04:35 +02:00
Pavel Emelyanov
2f7c03d84c utils: Intrusive B-tree (with tests)
The design of the tree goes from the row-cache needs, which are

1. Insert/Remove do not invalidate iterators
2. Elements are LSA-manageable
3. Low key overhead
4. External tri-comparator
5. As little actions on insert/remove as possible

With the above the design is

Two types of nodes -- inner and leaf. Both types keep pointer on parent nodes
and N pointers on keys (not keys themselves). Two differences: inner nodes have
array of pointers on kids, leaf nodes keep pointer on the tree (to update left-
and rightmost tree pointers on node move).

Nodes do not keep pointers/references on trees, thus we have O(1) move of any
object, but O(logN) to get the tree size. Fortunately, with big keys-per-node
value this won't result in too many steps.

In turn, the tree has 3 pointers -- root, left- and rightmost leaves. The latter
is for constant-time begin() and end().

Keys are managed by user with the help of embeddable member_hook instance,
which is 1 pointer in size.

The code was copied from the B+ tree one, then heavily reworked, the internal
algorythms turned out to differ quite significantly.

For the sake of mutation_partition::apply_monotonically(), which needs to move
an element from one tree into another, there's a key_grabber helping wrapper
that allows doing this move respecting the exception-safety requirement.

As measured by the perf_collections test the B-tree with 8 keys is faster, than
the std::set, but slower than the B+tree:

            vs set        vs b+tree
   fill:     +13%           -6%
   find:     +23%          -35%

Another neat thing is that 1-key insertion-removal is ~40% faster than
for BST (the same number of allocations, but the key object is smaller,
less pointers to set-up and less instructions to execute when linking
node with root).

v4:
- equip insertion methods with on_alloc_point() calls to catch
  potential exception guarantees violations eariler

- add unlink_leftmost_without_rebalance. The method is borrowed from
  boost intrusive set, and is added to kill two birds -- provide it,
  as it turns out to be popular, and use a bit faster step-by-step
  tree destruction than plain begin+erase loop

v3:
- introduce "inline" root node that is embedded into tree object and in
  which the 1st key is inserted. This greatly improves the 1-key-tree
  performance, which is pretty common case for rows cache

v2:
- introduce "linear" root leaf that grows on demand

  This improves the memory consumption for small trees. This linear node may
  and should over-grow the NodeSize parameter. This comes from the fact that
  there are two big per-key memory spikes on small trees -- 1-key root leaf
  and the first split, when the tree becomes 1-key root with two half-filled
  leaves. If the linear extention goes above NodeSize it can flatten even the
  2nd peak

- mitigate the keys indirection a bit

  Prefetching the keys while doing the intra-node linear scan and the nodes
  while descending the tree gives ~+5% of fill and find

- generalize stress tests for B and B+ trees

- cosmetic changes

TODO:

- fix few inefficincies in the core code (walks the sub-tree twice sometimes)
- try to optimize the leaf nodes, that are not lef-/righmost not to carry
  unused tree pointer on board

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-02-02 09:30:29 +03:00
Raphael S. Carvalho
298d54ceb0 utils/fragment_temporary_buffer: don't push empty fragment if data size is fragment-aligned
last fragment is unconditionally pushed to set of fragments, so if data
size is fragment-aligned, an empty fragment will be needlessly pushed to
the back of the fragment set.

note: i haven't tested if empty fragment at back of set will cause issues,
i think it won't, but this should be avoided anyway.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210129231532.871405-3-raphaelsc@scylladb.com>
2021-01-30 20:54:20 +02:00
Raphael S. Carvalho
e745f1e697 utils/fragmented_temporary_buffer: avoid reallocations by reserving upfront
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210129231532.871405-2-raphaelsc@scylladb.com>
2021-01-30 20:54:20 +02:00
Raphael S. Carvalho
08e838d4b5 utils/fragmented_temporary_buffer: simplify allocate_to_fit()
1) reuse default_fragment_size for knowledge of max fragment size
2) fragments_count is not a good name as it doesn't include last non-full
fragment (if present), so rename it.
3) simplify calculation of last fragment size

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210129231532.871405-1-raphaelsc@scylladb.com>
2021-01-30 20:54:20 +02:00
Pavel Solodovnikov
d14dc030ac utils: add fragmented_temporary_buffer::allocate_to_fit
Introduce `fragmented_temporary_buffer::allocate_to_fit` static
function returning an instance of the buffer of a specified size.

The allocated buffer fragments have a size of at most 128kb.
`bytes_ostream` has the same hard-coded limit, so just use the
same here.

This patch will be later needed for `raft::log_entry` raw data
serialization when writing to the underlying persistent storage.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-01-29 01:59:16 +03:00
Nadav Har'El
49440d67ad Merge: Fix multiple issues with timeuuid type
Merged patch series by Konstantin Osipov:

"These series improve uniqueness of generated timeuuids and change
list append/prepend logic to use client/LWT timestamp in timeuuids
generated for list keys. Timeuuid compare functions are
optimized.

The test coverage is extended for all of the above."

  uuid: add a comment warning against UUID::operator<
  uuid: replace slow versions of timeuiid compare with optimized/tested versions.
  test: add tests for legacy uuid compare & msb monotonicity
  test: add a test case for append/prepend limit
  test: add a test case for monotonicity of timeuuid least significant bits
  uuid: implement optimized timeuuid compare
  test: add a test case for list prepend/append with custom timestamp
  lists: rewrite list prepend to use append machinery
  lists: use query timestamp for list cell values during append
  uuid: fill in UUID node identifier part of UUID
  test: add a CQL test for list append/prepend operations
2021-01-21 13:20:07 +02:00
Konstantin Osipov
e18e2cb9f2 uuid: add a comment warning against UUID::operator< 2021-01-21 13:03:59 +03:00
Konstantin Osipov
56d8d166cb test: add tests for legacy uuid compare & msb monotonicity 2021-01-21 13:03:59 +03:00
Konstantin Osipov
0af3758aff uuid: implement optimized timeuuid compare
Introduce uint64_t based comparator for serialized timeuuids.

Respect Cassandra legacy for timeuuid compare order.

Scylla uses two versions of timeuuid compare:
- one for timeuuid values stored in uuid columns
- a different one for timeuuid values stored in timeuuid columns.

This commit re-implements the implementations of these comparators in
types.cc and deprecates the respective implementations types.cc. They
will be removed in a following patch.

A micro-benchmark at https://github.com/alecco/timeuuid-bench/
shows 2-4x speed up of the new comparators.
2021-01-21 13:03:59 +03:00
Konstantin Osipov
2b8ce83eea lists: use query timestamp for list cell values during append
Scylla list cells are represented internally as a map of
timeuuid => value. To append a new value to a list
the coordinator generates a timeuuid reflecting the current time as key
and adds a value to the map using this key.

Before this patch, Scylla always generated a timeuuid for a new
value, even if the query had a user supplied or LWT timestamp.
This could break LWT linearizability. User supplied timestamps were
ignored.

This is reported as https://github.com/scylladb/scylla/issues/7611

A statement which appended multiple values to a list or a BATCH
generated an own microsecond-resolution timeuuid for each value:

BEGIN BATCH
  UPDATE ... SET a = a + [3]
  UPDATE ... SET a = a + [4]
APPLY BATCH

UPDATE ... SET a = a + [3, 4]

To fix the bug, it's necessary to preserve monotonicity of
timeuuids within a batch or multi-value append, but make sure
they all use the microsecond time, as is set by LWT or user.

To explain the fix, it's first necessary to recall the structure
of time-based UUIDs:

60 bits: time since start of GMT epoch, year 1582, represented
         in 100-nanosecond units
4 bits:  version
14 bits: clock sequence, a random number to avoid duplicates
         in case system clock is adjusted
2 bits:  type
48 bits: MAC address (or other hardware address)

The purpose of clockseq bits is as defined in
https://tools.ietf.org/html/rfc4122#section-4.1.5
is to reduce the probability of UUID collision in case clock
goes back in time or node id changes. The implementation should reset it
whenever one of these events may occur.

Since LWT microsecond time is guaranteed to be
unique by Paxos, the RFC provisioning for clockseq and MAC
slots becomes excessive.

The fix thus changes timeuuid slot content in the following way:
- time component now contains the same microsecond time for all
  values of a statement or a batch. The time is unique and monotonic in
  case of LWT. Otherwise it's most always monotonic, but may not be
  unique if two timestamps are created on different coordinators.
- clockseq component is used to store a sequence number which is
  unique and monotonic for all values within the statement/batch.
- to protect against time back-adjustments and duplicates
  if time is auto-generated, MAC component contains a random (spoof)
  MAC address, re-created on each restart. The address is different
  at each shard.

The change is made for all sources of time: user, generated, LWT.
Conditioning the list key generation algorithm on the source of
time would unnecessarily complicate the code while not increase
quality (uniqueness) of created list keys.

Since 14 bits of clockseq provide us with only 16383 distinct slots
per statement or batch, 3 extra bits in nanosecond part of the time
are used to extend the range to 131071 values per statement/batch.
If the rang is exceeded beyond the limit, an exception is produced.

A twist on the use of clockseq to extend timeuuid uniqueness is
that Scylla, like Cassandra, uses int8 compare to compare lower
bits of timeuuid for ordering. The patch takes this into account
and sign-complements the clockseq value to make it monotonic
according to the legacy compare function.

Fixes #7611

test: unit (dev)
2021-01-21 13:03:59 +03:00
Konstantin Osipov
6d1781be36 uuid: fill in UUID node identifier part of UUID
Before this patch, UUID generation code was not creating
sufficiently unique IDs: the 6 byte node identifier was mostly
empty, i.e. only containing shard id. This could lead to
collisions between queries executed concurrently at different
coordinators, and, since timeuuid is used as key in list append
and prepend operations, lead to lost updates.

To generate a unique node id, the patch uses a combination of
hardware MAC address (or a random number if no hardware address is
available) and the current shard id.

The shard id is mixed into higher bits of MAC, to reduce the
chances on NIC collision within the same network.

With sufficiently unique timeuuids as list cell keys, such updates
are no longer lost, but multi-value update can still be "merged"
with another multi-value update.

E.g. if node A executes SET l = l + [4, 5] and node B executes SET
l  = l + [6, 7], the list value could be any of [4, 5, 6, 7], [4,
6, 5, 7], [6, 4, 5, 7] and so on.

At least we are now less likely to get any value lost.

Fixes #6208.

@todo: initialize UUID subsystem explicitly in main()
and switch to using seastar::engine().net().network_interfaces()

test: unit (dev)
2021-01-21 13:03:53 +03:00
Avi Kivity
4cfaab208e allocation_strategy: set preferred max contiguous allocation to 128k for standard allocations
Now that managed_bytes and its users do not assume that a managed_bytes
instance allocated using standard_allocation_strategy is non-fragmented,
we can set the preferred max contiguous allocation to 128k. This causes
managed_bytes to fragment instances that are larger than this size.

Note that managed_bytes is the only user.

Closes #7943
2021-01-21 11:15:13 +02:00
Michał Chojnowski
85048b349b memtable: fix accounting of managed_bytes in partition_snapshot_accounter
managed_bytes has a small overhead per each fragment. Due to that, managed_bytes
containing the same data can have different total memory usage in different
allocators. The smaller the preferred max allocation size setting is, the more
fragments are needed and the greater total per-fragment overhead is.
In particular, managed_bytes allocated in the LSA could grow in
memory usage when copied to the standard allocator, if the standard allocator
had a preferred max allocation setting smaller than the LSA.

partition_snapshot_accounter calculates the amount of memory used by
mutation fragments in the memtable (where they are allocated with LSA) based
on the memory usage after they are copied to the standard allocator.
This could result in an overestimation, as explained above.
But partition_snapshot_accounter must not overestimate the amount of freed
memory, as doing otherwise might result in OOM situations.

This patch prevents the overaccounting by adding minimal_external_memory_usage():
a new version of external_memory_usage(), which ignores allocator-dependent
overhead. In particular, it includes the per-fragment overhead in managed_bytes
only once, no matter how many fragments there are.
2021-01-15 18:21:13 +01:00
Michał Chojnowski
72ecbd6936 utils: fragment_range: add a fragment iterator for FragmentedView
A stylistic change. Iterators are the idiomatic way to iterate in C++.
2021-01-15 14:05:44 +01:00
Pavel Solodovnikov
eb523d4ac8 utils: remove unused linearization facilities in managed_bytes class
Remove the following bits of `managed_bytes` since they are unused:
* `with_linearized_managed_bytes` function template
* `linearization_context_guard` RAII wrapper class for managing
  `linearization_context` instances.
* `do_linearize` function
* `linearization_context` class

Since there is no more public or private methods in `managed_class`
to linearize the value except for explicit `with_linearized()`,
which doesn't use any of aforementioned parts, we can safely remove
these.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-01-08 14:16:08 +01:00
Avi Kivity
3bf6b78668 utils: managed_bytes: remove linearizing accessors
Accessor that require linearization, such as data(), begin(),
and casting to bytes_view, are no longer used and are now removed.
2021-01-08 14:16:08 +01:00
Michał Chojnowski
bf0ec63e34 utils: managed_bytes: add managed_bytes_view::operator[]
This operator has a single purpose: an easier port of legacy_compound_view
from bytes_view to managed_bytes_view.
It is inefficient and should be removed as soon as legacy_compound_view stops
using operator[].
2021-01-08 14:16:08 +01:00
Michał Chojnowski
778269151a utils: managed_bytes: introduce managed_bytes_view
managed_bytes_view is a non-owning view into managed_bytes.
It can also be implicitly constructed from bytes_view.

It conforms to the FragmentedView concept and is mainly used through that
interface.

It will be used as a replacement for bytes_view occurrences currently
obtained by linearizing managed_bytes.
2021-01-08 14:16:08 +01:00
Michał Chojnowski
cf7d25b98d utils: fragment_range: add serialization helpers for FragmentedMutableView
We will use them to write to managed_bytes_view in an upcoming patch,
to avoid linearization in compound_type::serialize_value.
2021-01-08 14:16:07 +01:00
Michał Chojnowski
4822730752 utils: mutable_view: add substr()
Analogous to bytes_view::substr.
This bit of functionality will be used to implement managed_bytes_mutable_view.
2021-01-08 13:17:46 +01:00
Michał Chojnowski
6c97027f85 utils: fragment_range: add compare_unsigned
We will use it to compare fragmented buffers (mainly managed_bytes_view in
types, compound, and tests) without linearization.
2021-01-04 22:50:45 +01:00
Michał Chojnowski
2d28471a59 utils: managed_bytes: make the constructors from bytes and bytes_view explicit
Conversions from views to owners have no business being implicit.
Besides, they would also cause various ambiguity problems when adding
managed_bytes_view.
2021-01-04 22:22:12 +01:00
Avi Kivity
0f7b6dd180 utils: managed_bytes: introduce with_linearized()
This is a temporary scaffold for weaning ourselves off
linearization. It differs from with_linearized_managed_bytes in
that it does not rely on the environment (linearization_context)
and so is easier to remove.
2020-12-20 15:14:44 +01:00
Avi Kivity
c37e495958 utils: managed_bytes: constrain with_linearized_managed_bytes()
The passed function must be called with a no parameters; document
and enforce it.
2020-12-20 15:14:44 +01:00
Avi Kivity
a1df1b3c34 utils: managed_bytes: avoid internal uses of managed_bytes::data()
We use managed_bytes::data() in a few places when we know the
data is non-fragmented (such as when the small buffer optimization
is in use). We'd like to remove managed_bytes::data() as linearization
is bad, so in preparation for that, replace internal uses of data()
with the equivalent direct access.
2020-12-20 15:14:44 +01:00
Avi Kivity
72a2554a86 utils: managed_bytes: extract do_linearize_pure()
do_linearize() is an impure function as it changes state
in linearization_context. Extract the pure parts into a new
do_linearize_pure(). This will be used to linearize managed_bytes
without a linearization_context, during the transition period where
fragmented and non-fragmented values coexist.
2020-12-20 15:14:44 +01:00
Avi Kivity
a11ecfe231 Merge 'types: don't linearize in validate()' from Michał Chojnowski
A sequel to #7692.

This series gets rid of linearization when validating collections and tuple types. (Other types were already validated without linearizing).
The necessary helpers for reading from fragmented buffers were introduced in #7692. All this series does is put them to use in `validate()`.

Refs: #6138

Closes #7770

* github.com:scylladb/scylla:
  types: add single-fragment optimization in validate()
  utils: fragment_range: add with_simplified()
  cql3: statements: select_statement: remove unnecessary use of with_linearized
  cql3: maps: remove unnecessary use of with_linearized
  cql3: lists: remove unnecessary use of with_linearized
  cql3: tuples: remove unnecessary use of with_linearized
  cql3: sets: remove unnecessary use of with_linearized
  cql3: tuples: remove unnecessary use of with_linearized
  cql3: attributes: remove unnecessary uses of with_linearized
  types: validate lists without linearizing
  types: validate tuples without linearizing
  types: validate sets without linearizing
  types: validate maps without linearizing
  types: template abstract_type::validate on FragmentedView
  types: validate_visitor: transition from FragmentRange to FragmentedView
  utils: fragmented_temporary_buffer: add empty() to FragmentedView
  utils: fragmented_temporary_buffer: don't add to null pointer
2020-12-11 17:33:59 +02:00
Michał Chojnowski
150473f074 types: add single-fragment optimization in validate()
Manipulating fragmented views is costlier that manipulating contiguous views,
so let's detect the common situation when the fragmented view is actually
contiguous underneath, and make use of that.

Note: this optimization is only useful for big types. For trivial types,
validation usually only checks the size of the view.
2020-12-11 09:53:07 +01:00
Michał Chojnowski
e2d17879fc utils: fragment_range: add with_simplified()
Reading from contiguous memory (bytes_view) is significantly simpler
runtime-wise than reading from a fragmented view, due to less state and less
branching, so we often want to convert a fragmented view to a simple view before
processing it, if the fragmented view contains at most one fragment, which is
common. with_simplified() does just that.
2020-12-11 09:53:07 +01:00
Michał Chojnowski
15dbe00e8a types: validate_visitor: transition from FragmentRange to FragmentedView
This will allow us to easily get rid of linearizations when validating
collections and tuples, because the helpers used in validate_aux() already
have FragmentedView overloads.
2020-12-11 09:53:07 +01:00
Michał Chojnowski
3647c0ba47 utils: fragmented_temporary_buffer: add empty() to FragmentedView
It's redundant with size_bytes(), but sometimes empty() is more readable and
reduces churn when replacing other types with FragmentedView.
2020-12-11 09:53:07 +01:00
Michał Chojnowski
b4dd5d3bdb utils: fragmented_temporary_buffer: don't add to null pointer
When fragmented_temporary_buffer::view is created from a bytes_view,
_current is null. In that case, in remove_current(), null pointer offset
happens, and ubsan complains. Fix that.
2020-12-11 09:53:07 +01:00
Michał Chojnowski
60a3cecfea utils: fragment_range: use range-based for loop instead of boost::for_each
We want to pass bytes_ostream to this loop in later commits.
bytes_ostream does not conform to some boost concepts required by
boost::for_each, so let's just use C++'s native loop.
2020-12-07 12:50:36 +01:00
Piotr Sarna
2015988373 Merge 'types: get rid of linearization in deserialize()' from Michał Chojnowski
Citing #6138: > In the past few years we have converted most of our codebase to
work in terms of fragmented buffers, instead of linearised ones, to help avoid
large allocations that put large pressure on the memory allocator.  > One
prominent component that still works exclusively in terms of linearised buffers
is the types hierarchy, more specifically the de/serialization code to/from CQL
format. Note that for most types, this is the same as our internal format,
notable exceptions are non-frozen collections and user types.  > > Most types
are expected to contain reasonably small values, but texts, blobs and especially
collections can get very large. Since the entire hierarchy shares a common
interface we can either transition all or none to work with fragmented buffers.

This series gets rid of intermediate linearizations in deserialization. The next
steps are removing linearizations from serialization, validation and comparison
code.

Series summary:
- Fix a bug in `fragmented_temporary_buffer::view::remove_prefix`. (Discovered
  while testing. Since it wasn't discovered earlier, I guess it doesn't occur in
  any code path in master.)
- Add a `FragmentedView` concept to allow uniform handling of various types of
  fragmented buffers (`bytes_view`, `temporary_fragmented_buffer::view`,
  `ser::buffer_view` and likely `managed_bytes_view` in the future).
- Implement `FragmentedView` for relevant fragmented buffer types.
- Add helper functions for reading from `FragmentedView`.
- Switch `deserialize()` and all its helpers from `bytes_view` to
  `FragmentedView`.
- Remove `with_linearized()` calls which just became unnecessary.
- Add an optimization for single-fragment cases.

The addition of `FragmentedView` might be controversial, because another concept
meant for the same purpose - `FragmentRange` - is already used. Unfortunately,
it lacks the functionality we need. The main (only?) thing we want to do with a
fragmented buffer is to extract a prefix from it and `FragmentRange` gives us no
way to do that, because it's immutable by design. We can work around that by
wrapping it into a mutable view which will track the offset into the immutable
`FragmentRange`, and that's exactly what `linearizing_input_stream` is. But it's
wasteful. `linearizing_input_stream` is a heavy type, unsuitable for passing
around as a view - it stores a pair of fragment iterators, a fragment view and a
size (11 words) to conform to the iterator-based design of `FragmentRange`, when
one fragment iterator (4 words) already contains all needed state, just hidden.
I suggest we replace `FragmentRange` with `FragmentedView` (or something
similar) altogether.

Refs: #6138

Closes #7692

* github.com:scylladb/scylla:
  types: collection: add an optimization for single-fragment buffers in deserialize
  types: add an optimization for single-fragment buffers in deserialize
  cql3: tuples: don't linearize in in_value::from_serialized
  cql3: expr: expression: replace with_linearize with linearized
  cql3: constants: remove unneeded uses of with_linearized
  cql3: update_parameters: don't linearize in prefetch_data_builder::add_cell
  cql3: lists: remove unneeded use of with_linearized
  query-result-set: don't linearize in result_set_builder::deserialize
  types: remove unneeded collection deserialization overloads
  types: switch collection_type_impl::deserialize from bytes_view to FragmentedView
  cql3: sets: don't linearize in value::from_serialized
  cql3: lists: don't linearize in value::from_serialized
  cql3: maps: don't linearize in value::from_serialized
  types: remove unused deserialize_aux
  types: deserialize: don't linearize tuple elements
  types: deserialize: don't linearize collection elements
  types: switch deserialize from bytes_view to FragmentedView
  types: deserialize tuple types from FragmentedView
  types: deserialize set type from FragmentedView
  types: deserialize map type from FragmentedView
  types: deserialize list type from FragmentedView
  types: add FragmentedView versions of read_collection_size and read_collection_value
  types: deserialize varint type from FragmentedView
  types: deserialize floating point types from FragmentedView
  types: deserialize decimal type from FragmentedView
  types: deserialize duration type from FragmentedView
  types: deserialize IP address types from FragmentedView
  types: deserialize uuid types from FragmentedView
  types: deserialize timestamp type from FragmentedView
  types: deserialize simple date type from FragmentedView
  types: deserialize time type from FragmentedView
  types: deserialize boolean type from FragmentedView
  types: deserialize integer types from FragmentedView
  types: deserialize string types from FragmentedView
  types: remove unused read_simple_opt
  types: implement read_simple* versions for FragmentedView
  utils: fragmented_temporary_buffer: implement FragmentedView for view
  utils: fragment_range: add single_fragmented_view
  serializer: implement FragmentedView for buffer_view
  utils: fragment_range: add linearized and with_linearized for FragmentedView
  utils: fragment_range: add FragmentedView
  utils: fragmented_temporary_buffer: fix view::remove_prefix
2020-12-04 09:46:20 +01:00
Michał Chojnowski
fcb258cb01 utils: fragmented_temporary_buffer: implement FragmentedView for view
fragmented_temporary_buffer::view is one of the types we want to directly
deserialize from.
2020-11-27 15:26:13 +01:00
Michał Chojnowski
f6cc2b6a48 utils: fragment_range: add single_fragmented_view
bytes_view is one of the types we want to deserialize from (at least for now),
so we want to be able to pass it to deserialize() after it's transitioned to
FragmentView.

single_fragmented_view is a wrapper implementing FragmentedView for bytes_view.
It's constructed from bytes_view explicitly, because it's typically used in
context where we want to phase linearization (and by extension, bytes_view) out.
2020-11-27 15:26:13 +01:00
Michał Chojnowski
2008c0f62f utils: fragment_range: add linearized and with_linearized for FragmentedView
We would like those helpers to disappear one day but for now we still need them
until everything can handle fragmented buffers.
2020-11-27 15:26:13 +01:00
Michał Chojnowski
fc90bd5190 utils: fragment_range: add FragmentedView
This patch introduces FragmentedView - a concept intented as a general-purpose
interface for fragmented buffers.
Another concept made for this purpose, FragmentedRange, already exists in the
codebase. However, it's unwieldy. The iterator-based design of FragmentRange is
harder to implement and requires more code, but more importantly it makes
FragmentRange immutable.
Usually we want to read the beginning of the buffer and pass the rest of it
elsewhere. This is impossible with FragmentRange.
FragmentedView can do everything FragmentRange can do and more, except for
playing nicely with iterator-based collection methods, but those are useless for
fragmented buffers anyway.
2020-11-27 15:26:13 +01:00
Benny Halevy
157a964a63 locator: extract can_yield to utils/maybe_yield.hh
Move the definition of bool_class can_yield to a standalone
header file and define there a maybe_yield(can_yield) helper.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-11-24 12:23:56 +02:00
Michał Chojnowski
9bceaac44c utils: fragmented_temporary_buffer: fix view::remove_prefix
This piece of logic was wrong for two unrelated reasons:
1. When fragmented_temporary_buffer::view is constructed from bytes_view,
_current is null. When remove_prefix was used on such view, null pointer
dereference happened.
2. It only worked for the first remove_prefix call. A second call would put a
wrong value in _current_position.
2020-11-24 03:05:13 +01:00
Avi Kivity
d612ca78f3 Merge 'Allow changing hinted handoff configuration in runtime' from Piotr Dulikowski
This PR allows changing the hinted_handoff_enabled option in runtime, either by modifying and reloading YAML configuration, or through HTTP API.

This PR also introduces an important change in semantics of hinted_handoff_enabled:
- Previously, hinted_handoff_enabled controlled whether _both writing and sending_ hints is allowed at all, or to particular DCs,
- Now, hinted_handoff_enabled only controls whether _writing hints_ is enabled. Sending hints from disk is now always enabled.

Fixes: #5634
Tests:
- unit(dev) for each commit of the PR
- unit(debug) for the last commit of the PR

Closes #6916

* github.com:scylladb/scylla:
  api: allow changing hinted handoff configuration
  storage_proxy: fix wrong return type in swagger
  hints_manager: implement change_host_filter
  storage_proxy: always create hints manager
  config: plug in hints::host_filter object into configuration
  db/hints: introduce host_filter
  hints/resource_manager: allow registering managers after start
  hints: introduce db::hints::directory_initializer
  directories.cc: prepare for use outside main.cc
2020-11-18 13:41:02 +02:00
Avi Kivity
13c6c90d8c Merge 'Remove std::iterator usage' from Piotr Jastrzębski
std::iterator is deprecated since C++17 so define all the required iterator_traits directly and stop using std::iterator at all.

More context: https://www.fluentcpp.com/2018/05/08/std-iterator-deprecated

Tests: unit(dev)

Closes #7635

* github.com:scylladb/scylla:
  log_heap: Remove std::iterator from hist_iterator
  types: Remove std::iterator from tuple_deserializing_iterator
  types: Remove std::iterator from listlike_partial_deserializing_iterator
  sstables: remove std::iterator from const_iterator
  token_metadata: Remove std::iterator from tokens_iterator
  size_estimates_virtual_reader: Remove std::iterator
  token_metadata: Remove std::iterator from tokens_iterator_impl
  counters: Remove std::iterator from iterators
  compound_compat: Remove std::iterator from iterators
  compound: Remove std::iterator from iterator
  clustering_interval_set: Remove std::iterator from position_range_iterator
  cdc: Remove std::iterator from collection_iterator
  cartesian_product: Remove std::iterator from iterator
  bytes_ostream: Remove std::iterator from fragment_iterator
2020-11-17 19:22:17 +02:00
Piotr Jastrzebski
2fe9d879df log_heap: Remove std::iterator from hist_iterator
std::iterator is deprecated since C++17 so define all the required
iterator_traits directly.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-11-17 16:53:20 +01:00