Commit Graph

22819 Commits

Author SHA1 Message Date
Dejan Mircevski
cc86d915ed configure.py: $mode-test targets depend on scylla
The targets {dev|debug|release}-test run all unit tests, including
alternator/run.  But this test requires the Scylla executable, which
wasn't among the dependencies.  Fix it by adding build/$mode/scylla to
the dependency list.

Fixes #6855.

Tests: `ninja dev-test` after removing build/dev/scylla

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-07-16 16:38:48 +03:00
Piotr Dulikowski
e2462bce3b cdc: fix a corner case inside get_base_table
It is legal for a user to create a table with name that has a
_scylla_cdc_log suffix. In such case, the table won't be treated as a
cdc log table, and does not require a corresponding base table to exist.

During refactoring done as a part of initial implemetation of of
Alternator streams (#6694), `is_log_for_some_table` started throwing
when trying to check a name like `X_scylla_cdc_log` when there was no
table with name `X`. Previously, it just returned false.

The exception originates inside `get_base_table`, which tries to return
the base table schema, not checking for its existence - which may throw.
It makes more sense for this function to return nullptr in such case (it
already does when provided log table name does not have the cdc log
suffix), so this patch adds an explicit check and returns nullptr when
necessary.

A similar oversight happened before (see #5987), so this patch also adds
a comment which explains why existence of `X_scylla_cdc_log` does not
imply existence of `X`.

Fixes: #6852
Refs: #5724, #5987
2020-07-16 16:38:48 +03:00
Asias He
4d7faac350 repair: Add uuid to a repair job
Currently, repair uses an integer to identify a repair job. The repair
id starts from 1 since node restart. As a result, different repair jobs
will have same id across restart.

To make the id more unique across restart, we can use an uuid in
addition to the integer id. We can not drop the use of the integer id
completely since the http api and nodetool use it.

Fixes #6786
2020-07-16 11:03:19 +03:00
Pekka Enberg
8b5121ea0c README.md: Add Slack and Twitter social banners
Add social banners for Slack and Twitter in README that are easy to
find for people new to the project.

Fixes #6538

Message-Id: <20200716070449.630864-1-penberg@scylladb.com>
2020-07-16 10:55:15 +03:00
Pekka Enberg
ed71ebafe5 README.md: Improve contribution section
The markdown syntax in the contribution section is incorrect, which is
why the links appear on the same line.

Improve the contribution section by making it more explicit what the
links are about.

Message-Id: <20200714070716.143768-1-penberg@scylladb.com>
2020-07-16 10:53:12 +03:00
Nadav Har'El
dcf9c888a2 alternator test: disable test_streams.py::test_get_records
This test usually fails, with the following error. Marking it "xfail" until
we can get to the bottom of this.

dynamodb = dynamodb.ServiceResource()
dynamodbstreams = <botocore.client.DynamoDBStreams object at 0x7fa91e72de80>

    def test_get_records(dynamodb, dynamodbstreams):
        # TODO: add tests for storage/transactionable variations and global/local index
        with create_stream_test_table(dynamodb, StreamViewType='NEW_AND_OLD_IMAGES') as table:
            arn = wait_for_active_stream(dynamodbstreams, table)

            p = 'piglet'
            c = 'ninja'
            val = 'lucifers'
            val2 = 'flowers'
>           table.put_item(Item={'p': p, 'c': c, 'a1': val, 'a2': val2})
test_streams.py:316:
...
E           botocore.exceptions.ClientError: An error occurred (Internal Server Error) when calling the PutItem operation (reached max retries: 3): Internal server error: std::runtime_error (cdc::metadata::get_stream: could not find any CDC stream (current time: 2020/07/15 17:26:36). Are we in the middle of a cluster upgrade?)

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-07-16 08:24:25 +03:00
Nadav Har'El
61f52da9b1 merge: Alternator/CDC: Implement streams support
Merged pull request https://github.com/scylladb/scylla/pull/6694
by Calle Wilund:

Implementation of DynamoDB streams using Scylla CDC.

Fixes #5065

Initial, naive implementation insofar that it uses 1:1 mapping CDC stream to
DynamoDB shard. I.e. there are a lot of shards.

Includes tests verified against both local DynamoDB server and actual AWS
remote one.

Note:
Because of how data put is implemented in alternator, currently we do not
get "proper" INSERT labels for first write of data, because to CDC it looks
like an update. The test compensates for this, but actual users might not
like it.
2020-07-16 08:18:25 +03:00
Nadav Har'El
c4497bf770 alternator test: enable experimental CDC
In the script test/alternator/run, which runs Scylla for the Alternator
tests, add the "--experimental-features=cdc" option, to allow us testing
the streams API whose implementation requires the experimenal CDC feature.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-07-16 08:18:09 +03:00
Takuya ASADA
e52ae78f79 reloc: support unified relocatable package
This introduce unified relocatable package, a single tarball to install
all Scylla packages.

Fixes #6626
See scylladb/scylla-pkg#1218
2020-07-15 20:29:31 +03:00
Avi Kivity
afd9b29627 Update tools/jmx and tools/java submodules
* tools/java 50dbf77123...3eca0e3511 (1):
  > config: Avoid checking options and filtering scylla.yaml

* tools/jmx 5820992...aa94fe5 (3):
  > dist: redhat: reduce log spam from unpacking sources when building rpm
  > Merge 'gitignore: fix typo and add scylla-apiclient/target/' from Benny
  > apiclient: Bump Jackson version to 2.10.4
2020-07-15 19:42:47 +03:00
Takuya ASADA
5da8784494 install.sh: support calling install.sh from other directory
On .deb package with new relocatable package format, all files moved to
under scylla/ directory.
So we need to call ./scylla/install.sh on debian/rules, but it does not work
correctly, since install.sh does not support calling from other directory.

To support this, we need to changedir to scylla top directory before
copying files.
2020-07-15 18:55:12 +03:00
Nadav Har'El
09a71ccd84 merge: cql3/restrictions: exclude NULLs from comparison in filtering
Merge pull request https://github.com/scylladb/scylla/pull/6834 by
Juliusz Stasiewicz:

NULLs used to give false positives in GT, LT, GEQ and LEQ ops performed upon
ALLOW FILTERING. That was a consequence of not distinguishing NULL from an
empty buffer.

This patch excludes NULLs on high level, preventing them from entering LHS
of any comparison, i.e. it assumes that any binary operation should return
false whenever the LHS operand is NULL (note: at the moment filters with
RHS NULL, such as ...WHERE x=NULL ALLOW FILTERING, return empty sets anyway).

Fixes #6295

* '6295-do-not-compare-nulls-v2' of github.com:jul-stas/scylla:
  filtering_test: check that NULLs do not compare to normal values
  cql3/restrictions: exclude NULLs from comparison in filtering
2020-07-15 18:32:14 +03:00
Takuya ASADA
9b5f28a2e3 scylla_raid_setup: fix incorrect block device path
To use UUID, we need a tag "UUID=<uuid>".

reference: https://www.freedesktop.org/software/systemd/man/systemd.mount.html
reference: https://man7.org/linux/man-pages/man8/mount.8.html
2020-07-15 18:22:46 +03:00
Tomasz Grabiec
b8531fb885 Merge "Switch partitions cache from BST to B+tree & array" from Pavel E.
The data model is now

        bplus::tree<Key = int64_t, T = array<entry>>

where entry can be cache_entry or memtable_entry.

The whole thing is encapsulated into a collection called "double_decker"
from patch #3. The array<T> is an array of T-s with 0-bytes overhead used
to resolve hash conflicts (patch #2).

branch:
tests: unit(debug)
tests before v7:
        unit(debug) for new collections, memtable and row_cache
        unit(dev) for the rest
        perf(dev)

* https://github.com/xemul/scylla/commits/row-cache-over-bptree-9:
  test: Print more sizes in memory_footprint_test
  memtable: Switch onto B+ rails
  row_cache: Switch partition tree onto B+ rails
  memtable: Count partitions separately
  token: Introduce raw() helper and raw comparator
  row-cache: Use ring_position_comparator in some places
  dht: Detach ring_position_comparator_for_sstables
  double-decker: A combination of B+tree with array
  intrusive-array: Array with trusted bounds
  utils: B+ tree implementation
  test: Move perf measurement helpers into header
2020-07-15 14:54:29 +02:00
Calle Wilund
3b74b9585f cql3::lists: Fix setter_by_uuid not handing null value
Fixes #6828

When using the scylla list index from UUID extension,
null values were not handled properly causing throws
from underlying layer.
2020-07-15 13:52:09 +02:00
Raphael S. Carvalho
7a728803f7 cql3/functions: protect against uninitialized value
impl_count_function doesn't explicitly initialize _count, so its correctness
depends on default initialization. Let's explicitly initialize _count to
make the code future proof.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200714162604.64402-1-raphaelsc@scylladb.com>
2020-07-15 12:38:39 +03:00
Calle Wilund
ac4d0bb144 docs/alternator: Change the streams sectionSmall but updated blurb describing the state of streams inalternator/scylla. 2020-07-15 08:21:34 +00:00
Calle Wilund
76f6fe679a alternator tests: Add streams test
Small set of positive and negative tests of streams
functionality. Verified against DynamoDB and Alternator.
2020-07-15 08:21:34 +00:00
Calle Wilund
cbb70f4af4 executor: "UpdateTable" support for streams
Partial implementation of the "UpdateTable" command.
Supports only enabling/disabling streams.
2020-07-15 08:21:34 +00:00
Calle Wilund
45ee73969d executor: Allow streams specification in CreateTable schema 2020-07-15 08:21:34 +00:00
Calle Wilund
3376209718 cdc::schema: Make extensions expicitly settable from builder
To make non-cql cdc schema options a reality.
2020-07-15 08:21:34 +00:00
Calle Wilund
bbc544748f alternator: Implement GetRecords
Simplistic variant, using 1:1 mapping of scylla stream id <-> shard
2020-07-15 08:21:34 +00:00
Calle Wilund
3756febbf5 alternator: expose describe_single_item and default_timeout
To be able to describe single alternator items from other files.
And query with the default timeout.
2020-07-15 08:10:23 +00:00
Calle Wilund
c45781de1e alternator: Implement GetShardIterator 2020-07-15 08:10:23 +00:00
Calle Wilund
8084b5a9b7 alternator: Implement DescribeStream 2020-07-15 08:10:23 +00:00
Calle Wilund
8fb9b32bd3 alternator: Implement ListStreams command 2020-07-15 08:10:23 +00:00
Calle Wilund
811b531e2d db::config: Add option to set streams confidence window
Option to control the alternator streams CDC query/shard range time
confidence interval, i.e. the period we enforce as timestamp threshold
when reading. The default, 10s, should be sufficient on a normal
cluster, but across DCs:, or with client timestamps or whatever, one might
need a larger window.
2020-07-15 08:10:23 +00:00
Calle Wilund
a9641d4f02 system_distributed_keyspace: Add cdc topology/stream ids reader
To read the full topology (with expired and expirations etc)
from within.
2020-07-15 08:10:23 +00:00
Calle Wilund
0158f6473b cdc: Add stream ids structure with time and expiration
For reading the topology tables from within scylla.
2020-07-15 08:10:23 +00:00
Calle Wilund
331aa7c501 cdc: Add "is_cdc_metacolumn_name" predicate
To sift column names
2020-07-15 08:10:23 +00:00
Calle Wilund
8a728ce618 cdc: Add get_base_table helper 2020-07-15 08:10:23 +00:00
Calle Wilund
8f462e8606 CDC::log: Add base_name helper
To extract base table name from CDC log table name.
2020-07-15 08:10:23 +00:00
Calle Wilund
0708a9971a executor: Add system_distributed_keyspace as parameter/member
Streams implementation will require querying system tables
etc to do its work, thus will need access to this object.
2020-07-15 08:10:23 +00:00
Calle Wilund
e382d79bcd executor: Make some helper and subroutines class-visible
Subroutines needed by (in this case) streams implementation
moved from being file-static to class-static (exported).
To make putting handler routines in separate sources possible.
Because executor.cc is large and slow to compile.
Separation is nice.

Unfortunately, not all methods can be kept class-private,
since unrelated types also use them.

Reviewer suggested to instead place there is a top-level
header for export, i.e. not class-private at all.
I am skipping that for now, mainly because I can't come up
with a good file name. Can be part of a generate refactor
of helper routine organization in executor.
2020-07-15 08:10:23 +00:00
Calle Wilund
8a7b24dea1 alternator::error: Add exception overloads for Dynamo types
Add types exception overloads for ValidationException, ResourceNotFoundException, etc,
to avoid writing explicit error type as string everywhere (with the potential for
spelling errors ever present).
Also allows intellisense etc to complete the exception when coded.
2020-07-15 08:10:23 +00:00
Calle Wilund
699c4d2c7e rjson: Add templated get/set overloads and optional get<T>
To allow immediate json value conversion for types we
have TypeHelper<...>:s for.

Typed opt-get to get both automatic type conversion, _and_
find functionality in one call.
2020-07-15 08:10:23 +00:00
Calle Wilund
72ec525045 rjson: Add exception overloads
To avoid copying error message composing, as well as forcing
said code info rjson.cc.
Also helps caller to determine fault by catch type.
2020-07-15 08:10:23 +00:00
Piotr Sarna
f1c1043701 README: update the alternator paragraph
Since alternator is no longer experimental, its paragraph
in README.md is rephrased to better reflect its current state.

Message-Id: <a89eb70c4350e021ad9d6f684e49f94e4c735c19.1594792604.git.sarna@scylladb.com>
2020-07-15 09:27:35 +03:00
Juliusz Stasiewicz
c25398e8cf filtering_test: check that NULLs do not compare to normal values
Tested operators are: `<` and `>`. Tests all types of NULLs except
`duration` because durations are explicitly not comparable. Values
to compare against were chosen arbitrarily.
2020-07-14 15:37:17 +02:00
Pavel Emelyanov
f8ffc31218 test: Print more sizes in memory_footprint_test
The row cache memory footprint changed after switch to B+
because we no longer have a sole cache_entry allocation, but
also the bplus::data and bplus::node. Knowing their sizes
helps analyzing the footprint changes.

Also print the size of memtable_entry that's now also stored
in B+'s data.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:30:02 +03:00
Pavel Emelyanov
4d2f5f93a4 memtable: Switch onto B+ rails
The change is the same as with row-cache -- use B+ with int64_t token
as key and array of memtable_entry-s inside it.

The changes are:

Similar to those for row_cache:

- compare() goes away, new collection uses ring_position_comparator

- insertion and removal happens with the help of double_decker, most
  of the places are about slightly changed semantics of it

- flags are added to memtable_entry, this makes its size larger than
  it could be, but still smaller than it was before

Memtable-specific:

- when the new entry is inserted into tree iterators _might_ get
  invalidated by double-decker inner array. This is easy to check
  when it happens, so the invalidation is avoided when possible

- the size_in_allocator_without_rows() is now not very precise. This
  is because after the patch memtable_entries are not allocated
  individually as they used to. They can be squashed together with
  those having token conflict and asking allocator for the occupied
  memory slot is not possible. As the closest (lower) estimate the
  size of enclosing B+ data node is used

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:30:02 +03:00
Pavel Emelyanov
174b101a49 row_cache: Switch partition tree onto B+ rails
The row_cache::partitions_type is replaced from boost::intrusive::set
to bplus::tree<Key = int64_t, T = array_trusted_bounds<cache_entry>>

Where token is used to quickly locate the partition by its token and
the internal array -- to resolve hashing conflicts.

Summary of changes in cache_entry:

- compare's goes away as the new collection needs tri-compare one which
  is provided by ring_position_comparator

- when initialized the dummy entry is added with "after_all_keys" kind,
  not "before_all_keys" as it was by default. This is to make tree
  entries sorted by token

- insertion and removing of cache_entries happens inside double_decker,
  most of the changes in row_cache.cc are about passing constructor args
  from current_allocator.construct into double_decker.empace_before()

- the _flags is extended to keep array head/tail bits. There's a room
  for it, sizeof(cache_entry) remains unchanged

The rest fits smothly into the double_decker API.

Also, as was told in the previous patch, insertion and removal _may_
invalidate iterators, but may leave them intact. However, currently
this doesn't seem to be a problem as the cache_tracker ::insert() and
::on_partition_erase do invalidate iterators unconditionally.

Later this can be otimized, as iterators are invalidated by double-decker
only in case of hash conflict, otherwise it doesn't change arrays and
B+ tree doesn't invalidate its.

tests: unit(dev), perf(dev)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:30:02 +03:00
Pavel Emelyanov
dff5eb6f25 memtable: Count partitions separately
The B+ will not have constant-time .size() call, so do it by hands

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:30:02 +03:00
Pavel Emelyanov
ae28814b1c token: Introduce raw() helper and raw comparator
In next patches the entries having token on-board will be
moved onto B+-tree rails. For this the int64_t value of the
token will be used as B+ key, so prepare for this.

One corner case -- the after_all_keys tokens must be resolved
to int64::max value to appear at the "end" of the tree. This
is not the same as "before_all_keys" case, which maps to the
int64::min value which is not allowed for regular tokens. But
for the sake of B+ switch this is OK, the conflicts of token
raw values are explicitly resolved in next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:30:02 +03:00
Pavel Emelyanov
7b2754cf5f row-cache: Use ring_position_comparator in some places
The row cache (and memtable) code uses own comparators built on top
of the ring_position_comparator for collections of partitions. These
collections will be switched from the key less-compare to the pair
of token less-compare + key tri-compare.

Prepare for the switch by generalizing the ring_partition_comparator
and by patching all the non-collections usage of less-compare to use
one.

The memtable code doesn't use it outside of collections, but patch it
anyway as a part of preparations.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:30:02 +03:00
Pavel Emelyanov
1e15c06889 dht: Detach ring_position_comparator_for_sstables
Next patches will generalize ring_position_comparator with templates
to replace cache_entry's and memtable_entry's comparators. The overload
of operator() for sstables has its own implementation, that differs from
the "generic" one, for smoother generalization it's better to detach it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:30:02 +03:00
Pavel Emelyanov
cf1315cde5 double-decker: A combination of B+tree with array
The collection is K:V store

   bplus::tree<Key = K, Value = array_trusted_bounds<V>>

It will be used as partitions cache. The outer tree is used to
quickly map token to cache_entry, the inner array -- to resolve
(expected to be rare) hash collisions.

It also must be equipped with two comparators -- less one for
keys and full one for values. The latter is not kept on-board,
but it required on all calls.

The core API consists of just 2 calls

- Heterogenuous lower_bound(search_key) -> iterator : finds the
  element that's greater or equal to the provided search key

  Other than the iterator the call returns a "hint" object
  that helps the next call.

- emplace_before(iterator, key, hint, ...) : the call construct
  the element right before the given iterator. The key and hint
  are needed for more optimal algo, but strictly speaking not
  required.

  Adding an entry to the double_decker may result in growing the
  node's array. Here to B+ iterator's .reconstruct() method
  comes into play. The new array is created, old elements are
  moved onto it, then the fresh node replaces the old one.

// TODO: Ideally this should be turned into the
// template <typename OuterCollection, typename InnerCollection>
// but for now the double_decker still has some intimate knowledge
// about what outer and inner collections are.

Insertion into this collection _may_ invalidate iterators, but
may leave intact. Invalidation only happens in case of hashing
conflict, which can be clearly seen from the hint object, so
there's a good room for improvement.

The main usage by row_cache (the find_or_create_entry) looks like

   cache_entry find_or_create_entry() {
       bound_hint hint;

       it = lower_bound(decorated_key, &hint);
       if (!hint.found) {
           it = emplace_before(it, decorated_key.token(), hint,
                                 <constructor args>)
       }
       return *it;
  }

Now the hint. It contains 3 booleans, that are

  - match: set to true when the "greater or equal" condition
    evaluated to "equal". This frees the caller from the need
    to manually check whether the entry returned matches the
    search key or the new one should be inserted.

    This is the "!found" check from the above snippet.

To explain the next 2 bools, here's a small example. Consider
the tree containing two elements {token, partition key}:

   { 3, "a" }, { 5, "z" }

As the collection is sorted they go in the order shown. Next,
this is what the lower_bound would return for some cases:

   { 3, "z" } -> { 5, "z" }
   { 4, "a" } -> { 5, "z" }
   { 5, "a" } -> { 5, "z" }

Apparently, the lower bound for those 3 elements are the same,
but the code-flows of emplacing them before one differ drastically.

   { 3, "z" } : need to get previous element from the tree and
                push the element to it's vector's back
   { 4, "a" } : need to create new element in the tree and populate
                its empty vector with the single element
   { 5, "a" } : need to put the new element in the found tree
                element right before the found vector position

To make one of the above decisions the .emplace_before would need
to perform another set of comparisons of keys and elements.
Fortunately, the needed information was already known inside the
lower_bound call and can be reported via the hint.

Said that,

  - key_match: set to true if tree.lower_bound() found the element
    for the Key (which is token). For above examples this will be
    true for cases 3z and 5a.

  - key_tail: set to true if the tree element was found, but when
    comparing values from array the bounding element turned out
    to belong to the next tree element and the iterator was ++-ed.
    For above examples this would be true for case 3z only.

And the last, but not least -- the "erase self" feature. Which is
given only the cache_entry pointer at hands remove it from the
collection. To make this happen we need to make two steps:

1. get the array the entry sits in
2. get the b+ tree node the vectors sits in

Both methods are provided by array_trusted_bounds and bplus::tree.
So, when we need to get iterator from the given T pointer, the algo
looks like

- Walk back the T array untill hitting the head element
- Call array_trusted_bounds::from_element() getting the array
- Construct b+ iterator from obtained array
- Construct the double_decker iterator from b+ iterator and from
  the number of "steps back" from above
- Call double_decker::iterator.erase()

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:29:53 +03:00
Pavel Emelyanov
eb70644c1c intrusive-array: Array with trusted bounds
A plain array of elements that grows and shrinks by
constructing the new instance from an existing one and
moving the elements from it.

Behaves similarly to vector's external array, but has
0-bytes overhead. The array bounds (0-th and N-th
elemements) are determined by checking the flags on the
elements themselves. For this the type must support
getters and setters for the flags.

To remove an element from array there's also a nothrow
option that drops the requested element from array,
shifts the righter ones left and keeps the trailing
unused memory (so called "train") until reconstruction
or destruction.

Also comes with lower_bound() helper that helps keeping
the elements sotred and the from_element() one that
returns back reference to the array in which the element
sits.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:29:49 +03:00
Pavel Emelyanov
95f15ea383 utils: B+ tree implementation
// The story is at
// https://groups.google.com/forum/#!msg/scylladb-dev/sxqTHM9rSDQ/WqwF1AQDAQAJ

This is the B+ version which satisfies several specific requirements
to be suitable for row-cache usage.

1. Insert/Remove doesn't invalidate iterators
2. Elements should be LSA-compactable
3. Low overhead of data nodes (1 pointer)
4. External less-only comparator
5. As little actions on insert/delete as possible
6. Iterator walks the sorted keys

The design, briefly is:

There are 3 types of nodes: inner, leaf and data, inner and leaf
keep build-in array of N keys and N(+1) nodes. Leaf nodes sit in
a doubly linked list. Data nodes live separately from the leaf ones
and keep pointers on them. Tree handler keeps pointers on root and
left-most and right-most leaves. Nodes do _not_ keep pointers or
references on the tree (except 3 of them, see below).

changes in v9:

- explicitly marked keys/kids indices with type aliases
- marked the whole erase/clear stuff noexcept
- disposers now accept object pointer instead of reference
- clear tree in destructor
- added more comments
- style/readability review comments fixed

Prior changes

**
- Add noexcepts where possible
- Restrict Less-comparator constraint -- it must be noexcept
- Generalized node_id
- Packed code for beging()/cbegin()

**
- Unsigned indices everywhere
- Cosmetics changes

**
- Const iterators
- C++20 concepts

**
- The index_for() implmenetation is templatized the other way
  to make it possible for AVX key search specialization (further
  patching)

**
- Insertion tries to push kids to siblings before split

  Before this change insertion into full node resulted into this
  node being split into two equal parts. This behaviour for random
  keys stress gives a tree with ~2/3 of nodes half-filled.

  With this change before splitting the full node try to push one
  element to each of the siblings (if they exist and not full).
  This slows the insertion a bit (but it's still way faster than
  the std::set), but gives 15% less total number of nodes.

- Iterator method to reconstruct the data at the given position

  The helper creates a new data node, emplaces data into it and
  replaces the iterator's one with it. Needed to keep arrays of
  data in tree.

- Milli-optimize erase()
  - Return back an iterator that will likely be not re-validated
  - Do not try to update ancestors separation key for leftmost kid

  This caused the clear()-like workload work poorly as compared to
  std:set. In particular the row_cache::invalidate() method does
  exactly this and this change improves its timing.

- Perf test to measure drain speed
- Helper call to collect tree counters

**
- Fix corner case of iterator.emplace_before()
- Clean heterogenous lookup API
- Handle exceptions from nodes allocations
- Explicitly mark places where the key is copied (for future)
- Extend the tree.lower_bound() API to report back whether
  the bound hit the key or not
- Addressed style/cleanness review comments

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:29:43 +03:00
Pekka Enberg
7ef50d7c71 configure.py: Don't install dependencies when building submodules
Let's pass the "--nodeps" option to "build_reloc.sh" script of the
submodules to avoid the build system running "sudo"...

Reported-by: Piotr Sarna <sarna@scylladb.com>
Reported-by: Pavel Emelyanov <xemul@scylladb.com>
Tested-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200714114340.440781-1-penberg@scylladb.com>
2020-07-14 14:50:59 +03:00