Commit Graph

25907 Commits

Author SHA1 Message Date
Pavel Emelyanov
ae6b677f9a partition_snapshot_row_cursor: Add const consume_row() version
It's the same as the existing one, but doesn't modify
anything (neither the cursor nor the rows_entry's it points to)
and calls the consumer with a const row reference.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 12:18:29 +03:00
Pavel Emelyanov
5e28075ec0 partition_snapshot_row_cursor: Add concept to .consume_row()
Nothing special here

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 12:18:29 +03:00
Pavel Emelyanov
d891cfe6cd partition_snapshot_row_cursor: Don't carry end iterators
The btree's iterator can be checked for having reached the tree's
end without holding the ending iterator itself. This makes the
whole p_s_r_c about 20% smaller (288 bytes -> 224 bytes), since it
no longer keeps 4 extra iterators on board, inside the small
vectors for heap and current_row.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 12:18:29 +03:00
Pavel Emelyanov
4558eb3afc partition_snapshot_row_cursor: Move cells hash creation to reader
Right now call to .row() method may create hash on row's cells.
It's counterintuitive to see a const method that transparently
changes something it points to. Since the only caller of a row()
who knows whether the hash creation is required is the cache
reader, it's better to move the call to prepare_hash() into it.

Other than making the .row() less surprising this also helps to
get rid of the whole method by the next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 12:18:29 +03:00
Pavel Emelyanov
00caf5f219 partition_snapshot_row_cursor: Move read_partition into test
The method in question is a test-only helper, so there's no
need to keep it as a part of the API.

Another reason to move it is that the method is O(number of
rows) and doesn't preempt while looping, whereas users of the
cursor code try hard not to stall the reactor. So even though
this method has meaningful semantics within the class,
it had better be reinvented if it is ever needed in core code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 12:16:13 +03:00
Pavel Emelyanov
9f323355a6 partition_snapshot_row_cursor: Move is_in_latest_version inline
The method is currently defined outside of the class, which
gives the compiler fewer chances to really inline it when needed.

Also, keeping this simple piece of code inline is less code
to read (and compile).

Mark the guy noexcept while at it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 11:45:45 +03:00
Pavel Emelyanov
cc57e35c6a partition_snapshot_row_cursor: Use is_in_latest_version where
appropriate

Checking for _current_row[0].version being 0 (or not being 0)
is better understood if done with a well named existing helper.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 11:45:45 +03:00
Pavel Emelyanov
353a8f66a2 partition_snapshot_row_cursor: Less dereferences in key() method
The valid cursor's key is kept on the _position as well,
but getting it from there is 1 dereference less:

_current_row -(*)-> row -> key
_position -(**)-> std::optional -> key

* iterator's -> is pointer dereference
** std::optional is designed not to be a pointer

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 11:45:45 +03:00
Pavel Emelyanov
353a1306ce partition_snapshot_row_cursor: Update change mark in prepare_heap
The heap's iterators validity is checked with the change mark,
which is updated every time heap is recreated. Factor these
updates out and keep the mark together with the heap it protects.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 11:45:45 +03:00
Pavel Emelyanov
1a1f05f50b partition_snapshot_row_cursor: Clear current row when recreating
The cursor keeps current row in a separate vector of iterators
and reconstructs it in a dedicated method, which _expects_ that
the vector is empty on entry.

It's better to keep the logic of current row construction in one
place.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 11:45:45 +03:00
Pavel Emelyanov
2edd072d27 partition_snapshot_row_cursor: Use btree::lower_bound sugar
When checking if the lower-bound entry matched the search
key it's possible to avoid extra comparison with the help
of the collection used to store the rows (btree).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 11:45:45 +03:00
Pavel Emelyanov
9aee0ad8b3 partition_snapshot_row_cursor: Factor out next() and erase_and_advance()
Both helpers do the same -- advance the cursor to the next row.
The latter may additionally remove the row from the uniquely
owned version.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 11:45:45 +03:00
Pavel Emelyanov
2fb0f7315c partition_snapshot_row_cursor: Relax vector of iterators
The cursor maintains a vector of iterators that correspond to
each of the versions scanned. However, only the iterator in
the latest one is really needed, so the whole vector can be
reduced down to an optional.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 11:45:45 +03:00
Pavel Emelyanov
26e27e27e8 btree: Add operator bool()
The btree's iterators allow for simple checking for '== tree.end()'
condition. For this check neither the tree itself, nor the ending
iterator is required. One just needs to check if the _idx value is
the npos.

One additional change to make it work is required -- when removing
an entry from the inline node the _idx should be set to npos.

This change is, well, a bugfix. An iterator left with 0 in _idx is
treated as a valid one. However, the bug is non-triggerable. If such
an "invalid" iterator is compared against tree.end() the check would
return true, because the tree pointers would coincide.

So this patch adds an operator bool() to btree iterator to facilitate
simpler checking if it reached the end of the collection or not.
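
The idea can be illustrated with a minimal Python sketch. This is not the btree code itself: `NPOS`, `Iterator` and `collect` are hypothetical names chosen for the illustration, and a flat list stands in for the tree. The point is only that the end-of-collection check looks at the iterator's own index and needs neither the container nor an end iterator.

```python
NPOS = -1  # hypothetical sentinel index meaning "points at no key"


class Iterator:
    """Toy iterator over a list; valid only while its index is not NPOS."""

    def __init__(self, keys, idx=0):
        self._keys = keys
        self._idx = idx if 0 <= idx < len(keys) else NPOS

    def __bool__(self):
        # The end check uses only the iterator's own state.
        return self._idx != NPOS

    def key(self):
        return self._keys[self._idx]

    def advance(self):
        if self._idx == NPOS:
            return  # already at the end, stay there
        nxt = self._idx + 1
        self._idx = nxt if nxt < len(self._keys) else NPOS


def collect(it):
    """Walk the iterator to the end without ever comparing against end()."""
    out = []
    while it:
        out.append(it.key())
        it.advance()
    return out
```

A caller can then write `while it:` instead of carrying a second iterator around just to compare against it.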

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 10:05:47 +03:00
Pavel Emelyanov
772fe2b089 clustering_row: Add new .apply() overload
The clustering_row is a wrapper over the deletable_row and
facilitates the apply-creation of the latter from some other
objects. Soon it will accept the deletable_row itself for
apply()-ing.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-04-09 10:05:47 +03:00
Gleb Natapov
b9175edea4 raft: test: check that a server with id zero can be neither created nor added to a config
Message-Id: <20210407134853.1964226-2-gleb@scylladb.com>
2021-04-08 17:07:18 +02:00
Gleb Natapov
fb938a36d4 raft: disallow adding and creating servers with id zero
Id zero has special meaning in the code and cannot be a valid server id.
Message-Id: <20210407134853.1964226-1-gleb@scylladb.com>
2021-04-08 17:07:18 +02:00
Kamil Braun
3687757115 sstables: fix TWCS single key reader sstable filter
The filter passed to `min_position_reader_queue`, which was used by
`clustering_order_reader_merger`, would incorrectly include sstables as
soon as they passed through the PK (bloom) filter, and would include
sstables which didn't pass the PK filter (if they passed the CK
filter). Fortunately this wouldn't cause incorrect data to be returned,
but it would cause sstables to be opened unnecessarily (these sstables
would immediately return eof), resulting in a performance drop. This commit
fixes the filter and adds a regression test which uses statistics to
check how many times the CK filter was invoked.
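
As a rough illustration of the intended filter semantics (a Python sketch, not the reader code): an sstable is selected only if it passes both filters, and the CK filter is consulted only for sstables that already passed the PK filter. `SSTable`, its fields and `select_sstables` are hypothetical names; the counter mirrors the kind of statistic the regression test is described as checking.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    name: str
    may_contain_pk: bool     # outcome of the partition-key (bloom) filter
    overlaps_ck_range: bool  # outcome of the clustering-key filter


def select_sstables(sstables):
    """Return (names of selected sstables, number of CK filter checks)."""
    ck_checks = 0
    selected = []
    for sst in sstables:
        if not sst.may_contain_pk:
            continue  # rejected by the PK filter; CK filter is not consulted
        ck_checks += 1
        if sst.overlaps_ck_range:
            selected.append(sst.name)
    return selected, ck_checks
```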

Fixes #8432.

Closes #8433
2021-04-08 18:03:49 +03:00
Avi Kivity
3a58985674 Merge 'scylla_ntp_setup: detect already installed ntp client' from Takuya ASADA
In the current implementation, we may re-run ntp configuration even if
it is already configured.
Also, the system may be configured with a non-default ntp client; we just
ignore that and configure the default ntp client.

This patch minimizes unnecessary re-configuration of the ntp client.
It runs in the following order:
 1. Check whether an NTP client is already running. If it is, skip setup
 2. Check whether an NTP client is already installed. If it is, use it
 3. If none of the NTP client packages is installed,
    - if it's CentOS, install chrony
    - if it's on other distributions, install systemd-timesyncd

Closes #8431

* github.com:scylladb/scylla:
  scylla_ntp_setup: detect already installed ntp client
  scylla_util.py: return bool value on systemd_unit.is_active()
2021-04-08 17:27:15 +03:00
Takuya ASADA
735c83b27f scylla_ntp_setup: detect already installed ntp client
In the current implementation, we may re-run ntp configuration even if
it is already configured.
Also, the system may be configured with a non-default ntp client; we just
ignore that and configure the default ntp client.

This patch minimizes unnecessary re-configuration of the ntp client.
It runs in the following order:
 1. Check whether an NTP client is already running. If it is, skip setup
 2. Check whether an NTP client is already installed. If it is, use it
 3. If none of the NTP client packages is installed,
    - if it's CentOS, install chrony
    - if it's on other distributions, install systemd-timesyncd
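
The order above can be sketched as a small pure function. This is an illustration only, not the scylla_ntp_setup code: `choose_ntp_action`, its parameters and the `"centos"` distro string are assumptions made for the example.

```python
def choose_ntp_action(running, installed, distro):
    """Return (action, client) following the three-step order.

    running   -- names of NTP clients detected as running
    installed -- names of NTP clients detected as installed
    distro    -- distribution id string (illustrative values only)
    """
    if running:
        return ("skip", running[0])        # step 1: nothing to do
    if installed:
        return ("configure", installed[0])  # step 2: reuse what is there
    if distro == "centos":
        return ("install", "chrony")        # step 3, CentOS
    return ("install", "systemd-timesyncd")  # step 3, everything else
```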

Related with #8344, #8339
2021-04-08 22:52:02 +09:00
Botond Dénes
32ae51dc2c table: query(): fix typo (short_read_allwoed)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210408133018.65692-1-bdenes@scylladb.com>
2021-04-08 16:34:08 +03:00
Tomasz Grabiec
6d6f39a7b3 Merge "fixes for stepdown and quorum check" from Gleb
The series contains code cleanups and fixes for stepdown process
and quorum check code.

Note this is re-send of already posted patches lumped together for
convenience.

* scylla-dev/raft-fixes-v1:
  raft: add test for check quorum on a leader
  raft: fix quorum check code for joint config and non-voting members
  raft: do not hang on waiting for entries on a leader that was removed from a cluster
  raft: add more tracing to stepdown code
  raft: use existing election_elapsed() function instead of redo the calculation
  raft: test: add test case for stepdown process
  raft: check that a node is still the leader after initiating stepdown process
2021-04-08 15:18:52 +02:00
Takuya ASADA
2545d7fd43 scylla_util.py: return bool value on systemd_unit.is_active()
Currently, 'if unit.is_active():' is always True since is_active()
returns its result as a string (active, inactive, unknown).
To avoid such scripting bugs, change the return value to bool.
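
A minimal sketch of the pitfall, with hypothetical stand-in functions rather than the real systemd_unit class: in Python any non-empty string is truthy, so a state string such as "inactive" still satisfies an `if`.

```python
def is_active_str(state):
    # Behaviour as described before the change: hand back the state string.
    return state


def is_active_bool(state):
    # Behaviour after the change: compare and return a real bool.
    return state == "active"


for state in ("active", "inactive", "unknown"):
    # bool() of a non-empty string is True, whatever the string says.
    print(state, bool(is_active_str(state)), is_active_bool(state))
```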
2021-04-08 21:54:05 +09:00
Asias He
a8c90a5848 storage_service: Reject replacing a node that has left the ring
1) start n1, n2, n3

2) decommission n3

3) remove /var/lib/scylla for n3

4) start n4 with the same ip address as n3 to replace n3

5) replace will be successful

If a node has left the ring, we should reject the replace operation.

This patch makes the check during replace operation more strict and
rejects the replace if the node has left the ring.

After the patch, we will see

ERROR 2021-04-07 08:02:14,099 [shard 0] init - Startup failed:
std::runtime_error (Cannot replace_adddress 127.0.0.3 because it has left
the ring, status=LEFT)

Fixes #8419

Closes #8420
2021-04-07 19:42:28 +03:00
Avi Kivity
29a674cd94 test: perf: perf_fast_forward: report allocation rate and tasks
These are more stable than cpu consumed across runs, and impact
performance directly.

Closes #8422
2021-04-07 15:41:43 +02:00
Piotr Sarna
8e808a56d2 Merge 'commitlog: Fix race and edge condition in delete_segments' from Calle Wilund
Fixes #8363
Fixes #8376

Delete segments has two issues when running with size-limited
commit log and strict adherence to said limit.

1.) It uses parallel processing, with deferral. This means that
    the disk usage variables it looks at might not be fully valid
    - i.e. we might have already issued a file delete that will
    reduce disk footprint such that a segment could instead be
    recycled, but since vars are (and should be) only updated
    _post_ delete, we don't know.
2.) It does not take into account edge conditions, when we only
    delete a single segment, and this segment is the border segment
    - i.e. the one pushing us over the limit, yet allocation is
    desperately waiting for recycling. In this case we should
    allow it to live on, and assume that next delete will reduce
    footprint. Note: to ensure exact size limit, make sure
    total size is a multiple of segment size.

If we had an error in recycling (disk rename?), and no elements
are available, we could have waiters hoping they will get segments.
Abort the queue (not permanent, but wakes up waiters), and let them
retry. Since we did deletions instead, disk footprint should allow
for new allocs at least. Or more likely, everything is broken, but
we will at least make more noise.

Closes #8372

* github.com:scylladb/scylla:
  commitlog: Add signalling to recycle queue iff we fail to recycle
  commitlog: Fix race and edge condition in delete_segments
  commitlog: coroutinize delete_segments
  commitlog_test: Add test for deadlock in recycle waiter
2021-04-07 15:13:25 +02:00
Nadav Har'El
0dd6f2db8f Merge 'CDC generations: refactors and improvements' from Kamil Braun
The "most important" major changes are:

1. storage_service: simplify CDC generation management during node replace

Previously, when node A replaced node B, it would obtain B's
generation timestamp from its application state (gossiped by other
nodes) and start gossiping it immediately on bootstrap.

But that's not necessary:
  - if this is the timestamp of the last (current) generation, we would
     obtain it from other nodes anyway (every node gossips the last known
     timestamp),
  - if this is the timestamp of an earlier generation, we would forget
     it immediately and start gossiping the last timestamp (obtained from
     other nodes).

This commit simplifies the bootstrap code (in node-replace case) a bit:
the replacing node no longer attempts to retrieve the CDC generation
timestamp from the node being replaced.

2. tree-wide: introduce cdc::generation_id type

Each CDC generation has a timestamp which denotes a logical point in time
when this generation starts operating. That same timestamp is
used to identify the CDC generation. We use this identification scheme
to exchange CDC generations around the cluster.

However, the fact that a generation's timestamp is used as an ID for
this generation is an implementation detail of the currently used method
of managing CDC generations.

Places in the code that deal with the timestamp, e.g. functions which
take it as an argument (such as handle_cdc_generation) are often
interested in the ID aspect, not the "when does the generation start
operating" aspect. They don't care that the ID is a `db_clock::time_point`.
They may sometimes want to retrieve the time point given the ID (such as
do_handle_cdc_generation when it calls `cdc::metadata::insert`),
but they don't care about the fact that the time point actually IS the ID.

In the future we may actually change the specific type of the ID if we
modify the generation management algorithms.

This commit is an intermediate step that will ease the transition in the
future. It introduces a new type, `cdc::generation_id`. Inside it contains
the timestamp, so:
- if a piece of code doesn't care about the timestamp, it just passes
   the ID around
- if it does care, it can access it using the `get_ts` function.
   The fact that `get_ts` simply accesses the ID's only field is an
   implementation detail.

3. cdc: handle missing generation case in check_and_repair_cdc_streams

check_and_repair_cdc_streams assumed that there is always at least
one generation being gossiped by at least one of the nodes. Otherwise it
would enter undefined behavior.

I'm not aware of any "real" scenario where this assumption wouldn't be
satisfied at the moment where check_and_repair_cdc_streams makes it
except perhaps some theoretical races. But it's best to stay on the safe
side.

---

Additionally the PR does some simplifications, stylistic improvements,
removes some dead code, coroutinizes some functions, uncoroutinizes others
(due to miscompiles), adds additional logging, updates some stale comments.
Read commit messages for more details.

Closes #8283

* github.com:scylladb/scylla:
  cdc: log a message when creating a new CDC generation
  cdc: handle missing generation case in check_and_repair_cdc_streams
  tree-wide: introduce cdc::generation_id type
  tree-wide: rename "cdc streams timestamp" to "cdc generation id"
  cdc: remove some functions from generation.hh
  storage_service: make set_gossip_tokens a static free-function
  db: system_keyspace: group cdc functions in single place
  cdc: get rid of "get_local_streams_timestamp"
  sys_dist_ks: update comment at quorum_if_many
  storage_service: simplify CDC generation management during node replace
2021-04-07 14:49:02 +03:00
Kamil Braun
6525111d21 cdc: log a message when creating a new CDC generation
2021-04-07 13:47:16 +02:00
Kamil Braun
0978155bec cdc: handle missing generation case in check_and_repair_cdc_streams
check_and_repair_cdc_streams assumed that there is always at least
one generation being gossiped by at least one of the nodes. Otherwise it
would enter undefined behavior.

I'm not aware of any "real" scenario where this assumption wouldn't be
satisfied at the moment where check_and_repair_cdc_streams makes it
except perhaps some theoretical races. But it's best to stay on the safe
side.
2021-04-07 13:47:16 +02:00
Kamil Braun
99fd2244a3 tree-wide: introduce cdc::generation_id type
This is a follow-up to the previous commit.

Each CDC generation has a timestamp which denotes a logical point in time
when this generation starts operating. That same timestamp is
used to identify the CDC generation. We use this identification scheme
to exchange CDC generations around the cluster.

However, the fact that a generation's timestamp is used as an ID for
this generation is an implementation detail of the currently used method
of managing CDC generations.

Places in the code that deal with the timestamp, e.g. functions which
take it as an argument (such as handle_cdc_generation) are often
interested in the ID aspect, not the "when does the generation start
operating" aspect. They don't care that the ID is a `db_clock::time_point`.
They may sometimes want to retrieve the time point given the ID (such as
do_handle_cdc_generation when it calls `cdc::metadata::insert`),
but they don't care about the fact that the time point actually IS the ID.

In the future we may actually change the specific type of the ID if we
modify the generation management algorithms.

This commit is an intermediate step that will ease the transition in the
future. It introduces a new type, `cdc::generation_id`. Inside it contains
the timestamp, so:
1. if a piece of code doesn't care about the timestamp, it just passes
   the ID around
2. if it does care, it can simply access it using the `get_ts` function.
   The fact that `get_ts` simply accesses the ID's only field is an
   implementation detail.
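
The shape of such a wrapper can be sketched in Python (the real type is C++; `GenerationId`, the `_ts` field and `forward` are hypothetical names for this illustration, and only `get_ts` comes from the description above):

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class GenerationId:
    """Opaque ID. That it wraps a timestamp is an implementation detail."""
    _ts: datetime


def get_ts(gen_id):
    """The one place that knows how to get the time point out of the ID."""
    return gen_id._ts


def forward(gen_id):
    """Code that only cares about identity just passes the ID around."""
    return gen_id


gid = GenerationId(datetime(2021, 4, 7, tzinfo=timezone.utc))
print(get_ts(forward(gid)))
```

If the ID representation changes later, only `GenerationId` and `get_ts` need to follow; callers like `forward` stay untouched.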

Using the occasion, we change the `do_handle_cdc_generation_intercept...`
function to be a standard function, not a coroutine. It turns out that -
depending on the shape of the passed-in argument - the function would
sometimes miscompile (the compiled code would not copy the argument to the
coroutine frame).
2021-04-07 13:47:13 +02:00
Raphael S. Carvalho
8e0a1ca866 sstable_set: Implement compound_sstable_set's create_single_key_sstable_reader()
The compound set isn't overriding create_single_key_sstable_reader(), so
the default implementation is always called. Although the default impl
provides correct behavior, specialized ones which provide better perf
(currently only available for TWCS) were being ignored.

The compound set impl of the single key reader basically combines the single
key readers of all sets managed by it.

Fixes #8415.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210406205009.75020-1-raphaelsc@scylladb.com>
2021-04-07 12:36:30 +03:00
Nadav Har'El
da11cd99f7 Merge 'Add a (failing) test for picking secondary indexes in order' from Piotr Sarna
Currently the heuristics for picking an index for a query
are not very well defined. It would be best if we used
statistics to pick the index which is likely to perform
the fastest, but for starters we should at least let the user
decide which index to pick by picking the first one by the
order of restrictions passed to the query.
The (failing) test case from this patch shows the expected
results.

Ref: #7969

Closes #8414

* github.com:scylladb/scylla:
  cql-pytest: add a failing test for index picking order
  cql3: add tracing used secondary index
2021-04-07 11:40:37 +03:00
Piotr Sarna
1f7b972db7 cql-pytest: add a failing test for index picking order
Currently the heuristics for picking an index for a query
are not very well defined. It would be best if we used
statistics to pick the index which is likely to perform
the fastest, but for starters we should at least let the user
decide which index to pick by picking the first one by the
order of restrictions passed to the query.
The (failing) test case from this patch shows the expected
results.
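
The expected behaviour (what the failing test asks for, not what the code currently does) could be sketched like this in Python. `pick_index` and the index names are hypothetical.

```python
def pick_index(restricted_columns, indexed_columns):
    """Pick the index of the first restricted column that has one.

    restricted_columns -- column names in the order the restrictions
                          were passed to the query
    indexed_columns    -- dict mapping column name -> index name
    Returns the index name, or None if no restricted column is indexed.
    """
    for col in restricted_columns:
        if col in indexed_columns:
            return indexed_columns[col]
    return None
```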

Ref: #7969
2021-04-07 10:05:00 +02:00
Gleb Natapov
68d73bd4c8 raft: add test for check quorum on a leader
2021-04-07 10:15:33 +03:00
Gleb Natapov
b3cb4f3966 raft: fix quorum check code for joint config and non-voting members
The current leader code checks that most nodes are alive, but this is
incorrect since some nodes may be non-voting and hence should not
cause a leader to step down if dead. It is also incorrect with joint config
since quorum is calculated differently there. Fix it by introducing an
activity_tracker class that knows how to handle all the above details.
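
A minimal Python sketch of the two rules involved, under stated assumptions: it is not the activity_tracker class, and `has_majority` and `quorum_alive` are hypothetical helpers. Non-voting members are left out of the voter sets entirely, and a joint configuration needs a majority in both the current and the previous voter set.

```python
def has_majority(voters, alive):
    """True if more than half of `voters` are in `alive`."""
    voters = set(voters)
    if not voters:
        return True  # assumption: an empty voter set imposes no constraint
    return len(voters & set(alive)) > len(voters) // 2


def quorum_alive(current_voters, alive, previous_voters=None):
    """Quorum check; previous_voters is given only for a joint config."""
    ok = has_majority(current_voters, alive)
    if previous_voters is not None:
        # Joint config: both voter sets must independently have a majority.
        ok = ok and has_majority(previous_voters, alive)
    return ok
```

With voters A, B, C and two dead non-voting members, `quorum_alive({"A", "B", "C"}, {"A", "B"})` is true even though only 2 of 5 nodes are alive, which is the case a plain "most nodes alive" check gets wrong.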
2021-04-07 10:15:33 +03:00
Gleb Natapov
a48a2c454b raft: do not hang on waiting for entries on a leader that was removed from a cluster
If a leader is removed from a cluster it will never know when entries
that it has not committed yet will be committed, so abort the wait in
this case with an uncertainty error.
2021-04-07 10:15:33 +03:00
Gleb Natapov
db03c94692 raft: add more tracing to stepdown code
2021-04-07 10:15:33 +03:00
Gleb Natapov
7dec56721c raft: use existing election_elapsed() function instead of redo the calculation
2021-04-07 10:15:33 +03:00
Gleb Natapov
bdb59307d3 raft: test: add test case for stepdown process
Add a test for the case where the C_new entry is not the last one in a
leader that is being removed from a cluster. In this case a leader will
continue replication even after committing C_new and will start stepdown
process later, when at least one follower is fully synchronized.
2021-04-07 10:15:33 +03:00
Gleb Natapov
3bcd3212e2 raft: check that a node is still the leader after initiating stepdown process
Usually initiation of stepdown process does not immediately depose the
current leader, but if the current leader is no longer part of the
cluster it will happen. We were missing the check after initiating
stepdown process in append reply handling.
2021-04-07 10:15:33 +03:00
Avi Kivity
5109bf8b99 config: relax batch size warning and failure thresholds
We inherited very low thresholds for warning and failing multi-partition
batches, but these warnings aren't useful. The size of a batch in bytes
has no impact on node stability. In fact the warnings can cause more
problems if they flood the log.

Fix by raising the warning threshold to 128 kiB (our magic size)
and the fail threshold to 1 MiB.
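
For scale, a small Python sketch of how the two thresholds partition batch sizes. The numbers come from the message above; `classify_batch`, the constant names and the choice of strict `>` comparisons are assumptions for illustration, not the config code.

```python
KIB = 1024
WARN_THRESHOLD = 128 * KIB   # 128 KiB, per the message above
FAIL_THRESHOLD = 1024 * KIB  # 1 MiB, per the message above


def classify_batch(size_in_bytes):
    """Return "ok", "warn" or "fail" for a batch of the given size."""
    if size_in_bytes > FAIL_THRESHOLD:
        return "fail"
    if size_in_bytes > WARN_THRESHOLD:
        return "warn"
    return "ok"
```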

Fixes #8416.

Closes #8417
2021-04-06 20:56:06 +03:00
Calle Wilund
d734f85280 commitlog: Add signalling to recycle queue iff we fail to recycle
Fixes #8376

If a recycle should fail, we will sort of handle it by deleting
the segment, so no leaks. But if we have waiter(s) on the recycle
queue, we could end up deadlocked/starved because nothing is
incoming there.

This adds an abort of the queue iff we failed and no objects are
available. This will wake up any waiter, and he should retry,
and hopefully at least be able to create a new segment.
We then reset the queue to a new one. So we can go on.

v2:
* Forgot to reset queue
v3:
* Nicer exception handling in allocate_segment_ex
2021-04-06 16:38:14 +00:00
Calle Wilund
15dd76f0c2 commitlog: Fix race and edge condition in delete_segments
Fixes #8363

Delete segments has two issues when running with size-limited
commit log and strict adherence to said limit.

1.) It uses parallel processing, with deferral. This means that
    the disk usage variables it looks at might not be fully valid
    - i.e. we might have already issued a file delete that will
    reduce disk footprint such that a segment could instead be
    recycled, but since vars are (and should be) only updated
    _post_ delete, we don't know.
2.) It does not take into account edge conditions, when we only
    delete a single segment, and this segment is the border segment
    - i.e. the one pushing us over the limit, yet allocation is
    desperately waiting for recycling. In this case we should
    allow it to live on, and assume that next delete will reduce
    footprint. Note: to ensure exact size limit, make sure
    total size is a multiple of segment size.

Fixed by
a.) Doing delete serialized. It is not like being parallel here will
    win us speed awards. And now we can know exact footprint, and
    how many segments we have left to delete
b.) Check if we are a block across the footprint boundary, and people
    might be waiting for a segment. If so, don't delete segment, but
    recycle.

As a follow-up, we should probably instead adjust the commitlog size
limit (per shard) to be a multiple of segment sizes, but there are
risks in that too.
2021-04-06 16:38:14 +00:00
Calle Wilund
d9a9897892 commitlog: coroutinize delete_segments
Because we like cow routines.
2021-04-06 16:38:14 +00:00
Calle Wilund
813694b617 commitlog_test: Add test for deadlock in recycle waiter
Not a very good test, mind you. Nothing to verify, just see if
the test times out. But try to make it at least complete for
failure report.
2021-04-06 16:38:14 +00:00
Piotr Sarna
1c99ed6ced cql3: add tracing used secondary index
The indexed queries will now record which index was chosen
for fetching the base table keys.
Example output:
 activity
------------------------------------------------------------------------------------------------------------------------
                                                                                                    Parsing a statement
                                                                                                 Processing a statement
                                                                  Consulting index my_v2_idx for a single slice of keys
 Creating read executor for token -3248873570005575792 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE
                                                                                            read_data: querying locally
                                               Start querying singular range {{-3248873570005575792, pk{000400000002}}}
                           Querying cache for range {{-3248873570005575792, pk{000400000002}}} and slice {(-inf, +inf)}
                                                                                                       Querying is done
                                                                                   Done processing - preparing a result
2021-04-06 17:16:29 +02:00
Tomasz Grabiec
4b10247a4f Merge "raft: do not assert when receiving unexpected messages in a leader state" from Gleb
* scylla-dev/raft-cleanup-v2:
  raft: test: add test that leader behaves as expected when it gets unexpected messages
  raft: do not assert when receiving unexpected messages in a leader state
  raft: use existing function to check if election timeout elapsed
2021-04-06 16:52:23 +02:00
Konstantin Osipov
c83cf1f965 uuid: switch the API to use std::chrono
A follow up for the patch for #7611. This change was requested
during review and moved out of #7611 to reduce its scope.

The patch switches UUID_gen API from using plain integers to
hold time units to units from std::chrono.

For one, we plan to switch the entire code base to std::chrono units,
to ensure type safety. Secondly, using std::chrono units allows us to
increase code reuse with template metaprogramming and remove a few
of UUID_gen functions that became redundant as a result.

* switch get_time_UUID(), unix_timestamp(), get_time_UUID_raw(),
  min_time_UUID(), max_time_UUID(), create_time_safe() to
  std::chrono
* remove unused variant of from_unix_timestamp()
* remove unused get_time_UUID_bytes(), create_time_unsafe(),
  redundant get_adjusted_timestamp()
* inline get_raw_UUID_bytes()
* collapse two similar implementations of get_time_UUID()
* switch internal constants to std::chrono
* remove unnecessary unique_ptr from UUID_gen::_instance
Message-Id: <20210406130152.3237914-2-kostja@scylladb.com>
2021-04-06 17:12:54 +03:00
Nadav Har'El
91249e9683 Update tools/java submodule
* tools/java 5756445ec7...57eb143119 (1):
  > sstableloader: Handle non-prepared batches with ":" in identifier names

Fixes #8230.
2021-04-06 16:37:03 +03:00
Nadav Har'El
0d0db05cf3 test/alternator: speed up two slow xfailing tests
By far the two slowest Alternator tests when running a development build on
my laptop are
	test_gsi.py::test_gsi_projection_include
and
	test_gsi.py::test_gsi_projection_keys_only
Each of those takes around 3.2 seconds, and the sum of just these two tests is as
much as 10% (!) of all other 600 tests.

The reason why these tests are slow is that they check scanning a GSI
with *projection*. Scylla currently ignores the projection, so the scan
returns the wrong value. Because this is a GSI, which supports only
eventually-consistent reads, we need to retry the read - and did it for
up to 3 seconds!

But this retry only makes sense if the GSI read did not *yet* return
the expected data. But in these xfailing tests, we read a *wrong* item
(with too many attributes) almost immediately, and this should indicate
an immediate failure - no amount of retry would help. So in this patch
we detect this case and fail the test immediately instead of wasting
3 seconds in retries.

On my laptop with dev build, this patch reduces the time to run the
entire Alternator test suite from 70 seconds to 63 seconds.

Also, now that we never just waste time until the timeout, we can
increase it to any number, and in this patch we increase it from 3
seconds to 5.
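
The retry policy described above can be sketched as follows. This is a simplified stand-in, not the test helper from the Alternator suite: `wait_for_item`, its parameters and the convention that `None` means "not visible yet" are assumptions for the example.

```python
import time


def wait_for_item(read, expected, timeout=5.0, interval=0.1):
    """Poll read() until it returns `expected`.

    None means "not visible yet", so retrying can help.
    Any other non-matching value is a wrong result: fail at once,
    because no amount of retrying would fix it.
    """
    deadline = time.monotonic() + timeout
    while True:
        item = read()
        if item == expected:
            return item
        if item is not None:
            raise AssertionError(f"wrong item: {item!r}")
        if time.monotonic() >= deadline:
            raise AssertionError("timed out waiting for item")
        time.sleep(interval)
```

Because a wrong result fails immediately, the timeout only ever costs time when the data has not arrived, so it can be generous.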

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210317183918.1775383-1-nyh@scylladb.com>
2021-04-06 14:49:15 +02:00