Commit Graph

3218 Commits

Author SHA1 Message Date
Lakshmi Narayanan Sreethar
310c5da4bb test/boost/chunked_managed_vector_test: fix testcase tests_reserve_partial
Update the maximum size tested by the testcase. The test always created
only one chunk as the maximum size tested by it (1 << 12 = 4KB) was less
than the default max chunk size (12.8 KB). So, use twice the
max_chunk_capacity as the test size distribution upper limit to verify
that partial_reserve can reserve multiple chunks.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-06-14 13:47:10 +05:30
Lakshmi Narayanan Sreethar
d4f8b91bd6 utils/lsa/chunked_managed_vector: fix reserve_partial()
Fix the method comment and return types of chunked_managed_vector's
reserve_partial() similar to chunked_vector's reserve_partial() as it
has the same issues mentioned in #19254. Also update the usage in the
chunked_managed_vector_test.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-06-14 13:47:10 +05:30
Lakshmi Narayanan Sreethar
29f036a777 test/boost/chunked_vector_test: fix testcase tests_reserve_partial
Fix the usage of reserve_partial in the testcase. Also update the
maximum chunk size used by the testcase. The test always created only
one chunk as the maximum size tested by it (1 << 12 = 4KB) was less
than the default max chunk size (128 KB). So, use smaller chunk size,
512 bytes, to verify that partial_reserve can reserve multiple chunks.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-06-14 13:43:07 +05:30
Avi Kivity
f1819419cc Merge 'scylla-sstable: add method to load the schema from the sstable itself' from Botond Dénes
As it turns out, each sstable carries its own schema in its serialization header (Statistics component). This schema is incomplete -- the names of the key columns are not stored, just their type. Static and regular columns do have names and types stored however. This bare-bones schema is enough to parse and display the content of the sstable. Another thing missing is schema options (the stuff after the `WITH` keyword, except the clustering order). The only options stored are the compression options (in the CompressionInfo component), this is actually needed to read the Data component.

This series adds a new method to `tools/schema_loader.cc` to extract the schema stored in the sstable itself. This new schema load method is used as the last fall-back for obtaining the schema, in case scylla-sstable is trying to autodetect the schema of the sstable. Although, right now this bare-bones schema is enough for everything scylla-sstable does, it is more future proof to stick to the "full" schema if possible, so this new method is the last resort for now.

Fixes: https://github.com/scylladb/scylladb/issues/17869
Fixes: https://github.com/scylladb/scylladb/issues/18809

New functionality, no backport needed.

Closes scylladb/scylladb#19169

* github.com:scylladb/scylladb:
  tools/scylla-sstable: log loaded schema with trace level
  tools/scylla-sstable: load schema from the sstable as fallback
  tools/schema_loader: introduce load_schema_from_sstable()
  test/lib/random_schema: remove assert on min number of regular columns
  sstables: introduce load_metadata()
2024-06-13 12:21:09 +03:00
Botond Dénes
8f2ba03465 tools/schema_loader: introduce load_schema_from_sstable()
Allows loading the schema from an sstable's serialization header. This
schema is incomplete, but it is enough to parse and display the content
of the sstable.
2024-06-13 01:32:17 -04:00
Piotr Dulikowski
0b5a0c969a Merge 'hinted handoff: migrate sync point to host ID' from Michael Litvak
Change the format of sync points to use host ID instead of IPs, to be consistent with the use of host IDs in hinted handoff module.
Introduce sync point v3 format which is the same as v2 except it stores host IDs instead of IPs.
The decoding supports both formats with host IDs and IPs, so a sync point contains now a variant of either types, and in the case of new type the translation is avoided.

Fixes #18653

Closes scylladb/scylladb#19134

* github.com:scylladb/scylladb:
  db/hints: migrate sync point to host ID
  db/hints: rename sync point structures with _v1 suffix to _v1_v2
2024-06-13 06:16:00 +02:00
Kefu Chai
223fba3243 test: memtable_test: increase unspooled_dirty_soft_limit
before this change, when performing memtable_test, we expect that
the memtables of ks.cf is the only memtables being flushed. and
we inject 4 failures in the code path of flush, and wait until 4
of them are triggered. but in the background, `dirty_memory_manager`
performs flush on all tables when necessary. so, the total number of
failures is not necessary the total number of failures triggered
when flushing ks.cf, some of them could be triggered when flushing
system tables. that's why we have sporadict test failures from
this test. as we might check `t.min_memtable_timestamp()` too soon.

after this change, we increase `unspooled_dirty_soft_limit` setting,
in order to disable `dirty_memory_manager`, so that the only flush
is performed by the test.

Fixes #19034
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-12 19:17:27 +08:00
Kefu Chai
2df4e9cfc2 test: memtable_test: replace BOOST_ASSERT with BOOST_REQURE
before this change, we verify the behavior of design under test using
`BOOST_ASSERT()`, which is a wrapper around `assert()`, so if a test
fails, the test just aborts. this is not very helpful for postmortem
debugging.

after this change, we use `BOOST_REQUIRE` macro for verifying the
behavior, so that Boost.Test prints out the condition if it does not
hold when we test it.

Refs #19034
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-12 19:17:27 +08:00
Kefu Chai
4175e02d9d clustering_bounds_comparator: drop operator<< for bound_kind
turns out operator<< for bound_kind is not used anymore, so let's
drop it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#19159
2024-06-11 18:01:06 +02:00
Michael Litvak
afc9a1a8a6 db/hints: migrate sync point to host ID
Change the format of sync points to use host ID instead of IPs, to be
consistent with the use of host IDs in hinted handoff module.
Introduce sync point v3 format which is the same as v2 except it stores
host IDs instead of IPs.
The encoding of sync points now always uses the new v3 format with host
IDs.
The decoding supports both formats with host IDs and IPs, so a sync point
contains now a variant of either types, and in the case of the new
format the translation from IP to host ID is avoided.
2024-06-11 11:07:00 +02:00
Botond Dénes
7b2aad56c4 test/boost/sstable_datafile_test: remove unused semaphores
The tests use the ones from test_env, the explicitely created ones are
unused.

Closes scylladb/scylladb#19167
2024-06-09 20:43:59 +03:00
Piotr Dulikowski
e18aeb2486 Merge 'mv: gossip the same backlog if a different backlog was sent in a response' from Wojciech Mitros
Currently, there are 2 ways of sharing a backlog with other nodes: through
a gossip mechanism, and with responses to replica writes. In gossip, we
check each second if the backlog changed, and if it did we update other
nodes with it. However if the backlog for this node changed on another
node with a write response, the gossiped backlog is currently not updated,
so if after the response the backlog goes back to the value from the previous
gossip round, it will not get sent and the other node will stay with an
outdated backlog - this can be observed in the following scenario:

1. Cluster starts, all nodes gossip their empty view update backlog to one another
2. On node N, `view_update_backlog_broker` (the backlog gossiper) performs an iteration of its backlog update loop, sees no change (backlog has been empty since the start), schedules the next iteration after 1s
3. Within the next 1s, coordinator (different than N) sends a write to N causing a remote view update (which we do not wait for). As a result, node N replies immediately with an increased view update backlog, which is then noted by the coordinator.
4. Still within the 1s, node N finishes the view update in the background, dropping its view update backlog to 0.
5. In the next and following iterations of `view_update_backlog_broker` on N, backlog is empty, as it was in step 2, so no change is seen and no update is sent due to the check
```
auto backlog = _sp.local().get_view_update_backlog();
if (backlog_published && *backlog_published == backlog) {
    sleep_abortable(gms::gossiper::INTERVAL, _as).get();
    continue;
}
```

After this scenario happens, the coordinator keeps an information about an increased view update backlog on N even though it's actually already empty

This patch fixes the issue this by notifying the gossip that a different backlog
was sent in a response, causing it to send an unchanged backlog to other
nodes in the following gossip round.

Fixes: https://github.com/scylladb/scylladb/issues/18461

Similarly to https://github.com/scylladb/scylladb/pull/18646, without admission control (https://github.com/scylladb/scylladb/pull/18334), this patch doesn't affect much, so I'm marking it as backport/none

Tests: manual. Currently this patch only affects the length of MV flow control delay, which is not reliable to base a test on. A proper test will be added when MV admission control is added, so we'll be able to base the test on rejected requests

Closes scylladb/scylladb#18663

* github.com:scylladb/scylladb:
  mv: gossip the same backlog if a different backlog was sent in a response
  node_update_backlog: divide adding and fetching backlogs
2024-06-07 10:20:21 +02:00
Marcin Maliszkiewicz
63e6334a64 raft: rename mutations_collector to group0_batch 2024-06-06 13:26:34 +02:00
Wojciech Mitros
272e80fe0a node_update_backlog: divide adding and fetching backlogs
Currently, we only update the backlogs in node_update_backlog at the
same time when we're fetching them. This is done using storage_proxy's
method get_view_update_backlog, which is confusing because it's a getter
with side-effects. Additionally, we don't always want to update the
backlog when we're reading it (as in gossip which is only on shard 0)
and we don't always want to read it when we're updating it (when we're
not handling any writes but the backlog drops due to background work
finish).

This patch divides the node_view_backlog::add_fetch as well the
storage_proxy::get_view_update_backlog both into two methods; one
for updating and one for reading the backlog. This patch only replaces
the places where we're currently using the view backlog getter, more
situations where we should get/update the backlog should be considered
in a following patch.
2024-06-06 10:45:13 +02:00
Marcin Maliszkiewicz
ac0e164a6b raft: rename announce to commit
Old wording was derived from existing code which
originated from schema code. Name commit better
describes what we do here.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
370a5b547e cql3: raft: attach description to each mutations collector group
This description is readable from raft log table.
Previously single description was provided for the whole
announce call but since it can contain mutations from
various subsystems now description was moved to
add_mutation(s)/add_generator function calls.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
97a5da5965 cql3: auth: use mutation collector for alter role
This is done to achieve single transaction semantics.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
a12c8ebfce cql3: auth: use mutation collector for grant role and revoke role
This is done to achieve single transaction semantics.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
5ba7d1b116 cql3: auth: use mutation collector for drop role and auto-revoke
The main theme of this commit is executing drop
keyspace/table/aggregate/function statements in a single
transaction together with auth auto-revoke logic.
This is the logic which cleans related permissions after
resource is deleted.

It contains serveral parts which couldn't easily be split
into separate commits mainly because mutation collector related
paths can't be mixed together. It would require holding multiple
guards which we don't support. Another reason is that with mutation
collector the changes are announced in a single place, at the end
of statement execution, if we'd announce something in the middle
then it'd lead to raft concurrent modification infinite loop as it'd
invalidate our guard taken at the begining of statement execution.

So this commit contains:

- moving auto-revoke code to statement execution from migration_listener
 * only for auth-v2 flow, to not break the old one
 * it's now executed during statement execution and not merging schemas,
   which means it produces mutations once as it should and not on each
   node separately
 * on_before callback family wasn't used because I consider it much
   less readable code. Long term we want to remove
   auth_migration_listener.

- adding mutation collector to revoke_all
 * auto-revoke uses this function so it had to be changed,
   auth::revoke_all free function wrapper was added as cql3
   layer should not use underlying_authorizer() directly.

- adding mutation collector to drop_role
 * because it depends on revoke_all and we can't mix old and new flows
 * we need to switch all functions auth::drop_role call uses
 * gradual use of previously introduced modify_membership, otherwise
   we would need to switch even more code in this commit
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
01fb43e35f Revert "Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks"
This reverts commit 80ed442be2.

This logic was replaced in previous commit by dynamic cast.
Hopefully even this cast will be eliminated in the future.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
2a6cfbfb33 cql3: auth: use mutation collector for create role statement
This is done to achieve single transaction semantics.

grant_permissions_to_creator is logically part of create role
but its change will be included in following commits
as it spans multiple usages.

Additinally we disabled rollback during create role as
it won't work and is not needed with single transaction logic.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
7e0a801f53 auth: add class for gathering mutations without immediate announce
To achieve write atomicity across different tables we need to announce
mutations in a single transaction. So instead of each function doing
a separate announce we need to collect mutations and announce them once
at the end.
2024-06-04 15:43:04 +02:00
Lakshmi Narayanan Sreethar
3d7d1fa72a db/config.cc: increment components_memory_reclaim_threshold config default
Incremented the components_memory_reclaim_threshold config's default
value to 0.2 as the previous value was too strict and caused unnecessary
eviction in otherwise healthy clusters.

Fixes #18607

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#18964
2024-05-30 18:03:51 +03:00
Pavel Emelyanov
83d491af02 config: Remove experimental TABLETS feature
... and replace it with boolean enable_tablets option. All the places
in the code are patched to check the latter option instead of the former
feature.

The option is OFF by default, but the default scylla.yaml file sets this
to true, so that newly installed clusters turn tablets ON.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18898
2024-05-30 18:03:51 +03:00
Pavel Emelyanov
e74a4b038f Merge 'tablets: alter keyspace' from Piotr Smaron
This change supports changing replication factor in tablets-enabled keyspaces.
This covers both increasing and decreasing the number of tablets replicas through
first building topology mutations (`alter_keyspace_statement.cc`) and then
tablets/topology/schema mutations (`topology_coordinator.cc`).
For the limitations of the current solution, please see the docs changes attached to this PR.

Fixes: #16129

Closes scylladb/scylladb#16723

* github.com:scylladb/scylladb:
  test: Do not check tablets mutations on nodes that don't have them
  test: Fix the way tablets RF-change test parses mutation_fragments
  test/tablets: Unmark RF-changing test with xfail
  docs: document ALTER KEYSPACE with tablets
  Return response only when tablets are reallocated
  cql-pytest: Verify RF is changes by at most 1 when tablets on
  cql3/alter_keyspace_statement: Do not allow for change of RF by more than 1
  Reject ALTER with 'replication_factor' tag
  Implement ALTER tablets KEYSPACE statement support
  Parameterize migration_manager::announce by type to allow executing different raft commands
  Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks
  Extend system.topology with 3 new columns to store data required to process alter ks global topo req
  Allow query_processor to check if global topo queue is empty
  Introduce new global topo `keyspace_rf_change` req
  New raft cmd for both schema & topo changes
  Add storage service to query processor
  tablets: tests for adding/removing replicas
  tablet_allocator: make load_balancer_stats_manager configurable by name
2024-05-29 14:17:51 +03:00
Botond Dénes
d37eca0593 test/boost/mutation_reader_test: compacting_reader_next_partition: fix partition order
The test creates two partitions and passes them through the reader, but
the partitions are out-of-order. This is benign but best to fix it
anyway.
Found after bumping validation level inside the compactor.

Closes scylladb/scylladb#18848
2024-05-28 20:41:54 +03:00
Piotr Smaron
80ed442be2 Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks 2024-05-28 13:55:11 +02:00
Kefu Chai
e42d83dc46 treewide: include used headers
before this change, we rely on `seastar/util/std-compat.hh` to
include the used headers provided by stdandard library. this was
necessary before we moved to a C++20 compliant standard library
implementation. but since Seastar has dropped C++17 support. its
`seastar/util/std-compat.hh` is not responsible for providing these
headers anymore.

so, in this change, we include the used headers directly instead
of relying on `seastar/util/std-compat.hh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18883
2024-05-27 17:34:38 +03:00
Paweł Zakrzewski
c888945354 tablets: tests for adding/removing replicas
Note we're suppressing a UBSanitizer overflow error in UTs. That's
because our linter complains about a possible overflow, which never
happens, but tests are still failing because of it.
2024-05-27 12:48:44 +02:00
Botond Dénes
e0f4d79f3b Merge 'Do not export statement scheduling group from database' from Pavel Emelyanov
Database used to be (and still is in many ways) an object used to get configuration from. Part of the configuration is the set of pre-configured scheduling groups. That's not nice, services should use each other for some real need, not as proxies to configuration. This patch patches the places that explicitly switch to statement group _not_ to use database to get the group itself.

fixes: #17643

Closes scylladb/scylladb#18799

* github.com:scylladb/scylladb:
  database: Don't export statement scheduling group
  test: Use async attrs and cql-test-env scheduling groups
  test: Use get_scheduling_groups() to get scheduling groups
  api: Don't switch sched group to start/stop protocol servers
  main: Don't switch sched group to start protocol servers
  code: Switch to sched group in request_stop_server()
  code: Switch to server sched group in start()
  protocol_server: Keep scheduling group on board
  code: Add scheduling group to controllers
  redis: Coroutinize start() method
2024-05-27 10:48:33 +03:00
Marcin Maliszkiewicz
2ab143fb40 db: auth: move auth tables to system keyspace
Separate keyspace which also behaves as system brings
little benefit while creating some compatibility problems
like schema digest mismatch during rollback. So we decided
to move auth tables into system keyspace.

Fixes https://github.com/scylladb/scylladb/issues/18098

Closes scylladb/scylladb#18769
2024-05-26 22:30:42 +03:00
Kefu Chai
bfe918ac9e test/boost: define fmt::formatter for service_level_controller_test.cc
since we are moving away for operator<< based formatter, more and more
types now only have {fmt} based formatters. the same will apply to the
STL container types after ditching the generic homebrew formatter in
to_string.hh, so to be prepared for the change, let's add the
fmt::formatter for tests as well.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-05-26 12:32:43 +08:00
Kefu Chai
222dbf2ce4 test/boost: include test/lib/test_utils.hh
this change was created in the same spirit of 505900f18f. because
we are deprecating the operator<< for vector and unorderd_map in
Seastar, some tests do not compile anymore if we disable these
operators. so to be prepared for the change disabling them, let's
include test/lib/test_utils.hh for accessing the printer dedicated
for Boost.test. and also '#include <fmt/ranges.h>' when necessary,
because, in order to format the ranges using {fmt}, we need to
use fmt/ranges.h.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-05-26 12:32:43 +08:00
Pavel Emelyanov
2e3a057db1 test: Use get_scheduling_groups() to get scheduling groups
There's such a helper in cql-test-env that other tests use to get sched
groups from. Few other tests (ab)use databse for that, this patch fixes
those remnants.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-24 18:00:01 +03:00
Botond Dénes
5e41dd28c7 Merge 'Sanitize sl controller draining' from Pavel Emelyanov
The sl-controller is stopped in three steps. The first (and instantly the second) is unsubscribing from lifecycle notification and draining. The third is stop itself. First two steps are "out of order" as compared to the desired start-stop sequence of any service, this patch fixes these steps.

After this PR the drain_on_shutdown() (the call that drains the node upon stop) finally becomes clean and tidy and is no longer accompanied by ad-hoc fellow drains/stops/aborts/whatever.

refs: #2737

Closes scylladb/scylladb#18731

* github.com:scylladb/scylladb:
  sl_controller: Remove drain() method
  sl_controller: Move abort kicking into do_abort()
  main,sl_controller: Subscribe for early abort
  main: Unsubscribe sl controller next to subscribing
2024-05-21 17:16:23 +03:00
Avi Kivity
33ec6ccea9 test: boost: chunked_vector_test: include <optional>
std::optional is used but not imported. This fails on libstdc++-14.

Closes scylladb/scylladb#18739
2024-05-21 07:37:11 +03:00
Pavel Emelyanov
8d4c8711fa main,sl_controller: Subscribe for early abort
There's stop-signal in main that fires an abort source on stop. Lots of
other services are subscribed in it, add the sl-controller too. For now
it's a no-op, but next patches will make use of it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-20 21:26:31 +03:00
Avi Kivity
61505d057e Merge 'Sort user-defined types in describe statements' from Michał Jadwiszczak
User-defined types can depend on each other, creating directed acyclic graph.

In order to support restoring schema from `DESC SCHEMA`, UDTs should be
ordered topologically, not alphabetically as it was till now.

This patch changes the way UDTs are ordered in `DESC SCHEMA`/`DESC KEYSPACE <ks>` statements, so the output can be safely copy-pasted to restore the schema.

Fixes #18539

Closes scylladb/scylladb#18302

* github.com:scylladb/scylladb:
  test/cql-pytest/test_describe: add test for UDTs ordering
  cql3/statements/describe_statement: UDTs topological sorting
  cql3/statements/describe_statement: allow to skip alphabetical sorting
  types: add a method to get all referenced user types
  db/cql_type_parser: use generic topological sorting
  db/cql_type_parses: futurize raw_builder::build()
  test/boost: add test for topological sorting
  utils: introduce generic topological sorting algorithm
2024-05-20 16:58:17 +03:00
Avi Kivity
52fe351c31 Merge 'Balance tablets within nodes (intra-node migration)' from Tomasz Grabiec
This is needed to avoid severe imbalance between shards which can
happen when some table grows and is split. The inter-node balance can
be equal, so inter-node migration cannot fix the imbalance. Also, if RF=N
then there is not even a possibility of moving tablets around to fix the imbalance.
The only way to bring the system to balance is to move tablets within the nodes.

The system is not prepared for intra-node migration currently. Request coordination
is host-based, while for intra-node migration it should be (also) shard-based.
The solution employed here is to keep the coordination between nodes as-is,
and for intra-node migration storage_proxy-level coordinator is not aware of
the migration (no pending host). The replica-side request handler will be a
second-level coordinator which routes requests to shards, similar to how
the first-level coordinator routes them to hosts.

Tablet sharder is adjusted to handle intra-migration where a tablet
can have two replicas on the same host. For reads, sharder uses the
read selector to resolve the conflict. For writes, the write selector
is used.

The old shard_of() API is kept to represent shard for reads, and new
method is introduced to query the shards for writing:
shard_for_writes(). All writers should be switched to that API, which
is not done in this patch yet.

The request handler on replica side acts as a second-level
coordinator, using sharder to determine routing to shards. A given
sharder has a scope of a single topology version, a single
effective_replication_map_ptr, which should be kept alive during
writes.

perf-simple-query test results show no signs of regression:

Command: perf-simple-query -c1 -m1G --write --tablets --duration=10

Before:

> 83294.81 tps ( 59.5 allocs/op,  14.3 tasks/op,   53725 insns/op,        0 errors)
> 87756.72 tps ( 59.5 allocs/op,  14.3 tasks/op,   54049 insns/op,        0 errors)
> 86428.47 tps ( 59.6 allocs/op,  14.3 tasks/op,   54208 insns/op,        0 errors)
> 86211.38 tps ( 59.7 allocs/op,  14.3 tasks/op,   54219 insns/op,        0 errors)
> 86559.89 tps ( 59.6 allocs/op,  14.3 tasks/op,   54188 insns/op,        0 errors)
> 86609.39 tps ( 59.6 allocs/op,  14.3 tasks/op,   54117 insns/op,        0 errors)
> 87464.06 tps ( 59.5 allocs/op,  14.3 tasks/op,   54039 insns/op,        0 errors)
> 86185.43 tps ( 59.6 allocs/op,  14.3 tasks/op,   54169 insns/op,        0 errors)
> 86254.71 tps ( 59.6 allocs/op,  14.3 tasks/op,   54139 insns/op,        0 errors)
> 83395.35 tps ( 60.2 allocs/op,  14.4 tasks/op,   54693 insns/op,        0 errors)
>
> median 86428.47 tps ( 59.6 allocs/op,  14.3 tasks/op,   54208 insns/op,        0 errors)
> median absolute deviation: 243.04
> maximum: 87756.72
> minimum: 83294.81
>

After:

> 85523.06 tps ( 59.5 allocs/op,  14.3 tasks/op,   53872 insns/op,        0 errors)
> 89362.47 tps ( 59.6 allocs/op,  14.3 tasks/op,   54226 insns/op,        0 errors)
> 88167.55 tps ( 59.7 allocs/op,  14.3 tasks/op,   54400 insns/op,        0 errors)
> 87044.40 tps ( 59.7 allocs/op,  14.3 tasks/op,   54310 insns/op,        0 errors)
> 88344.50 tps ( 59.6 allocs/op,  14.3 tasks/op,   54289 insns/op,        0 errors)
> 88355.06 tps ( 59.6 allocs/op,  14.3 tasks/op,   54242 insns/op,        0 errors)
> 88725.46 tps ( 59.6 allocs/op,  14.3 tasks/op,   54230 insns/op,        0 errors)
> 88640.08 tps ( 59.6 allocs/op,  14.3 tasks/op,   54210 insns/op,        0 errors)
> 90306.31 tps ( 59.4 allocs/op,  14.3 tasks/op,   54043 insns/op,        0 errors)
> 87343.62 tps ( 59.8 allocs/op,  14.3 tasks/op,   54496 insns/op,        0 errors)
>
> median 88355.06 tps ( 59.6 allocs/op,  14.3 tasks/op,   54242 insns/op,        0 errors)
> median absolute deviation: 1007.41
> maximum: 90306.31
> minimum: 85523.06

Command (reads): perf-simple-query -c1 -m1G  --tablets --duration=10

Before:

> 95860.18 tps ( 63.1 allocs/op,  14.1 tasks/op,   42476 insns/op,        0 errors)
> 97537.69 tps ( 63.1 allocs/op,  14.1 tasks/op,   42454 insns/op,        0 errors)
> 97549.23 tps ( 63.1 allocs/op,  14.1 tasks/op,   42470 insns/op,        0 errors)
> 97511.29 tps ( 63.1 allocs/op,  14.1 tasks/op,   42470 insns/op,        0 errors)
> 97227.32 tps ( 63.1 allocs/op,  14.1 tasks/op,   42471 insns/op,        0 errors)
> 94031.94 tps ( 63.1 allocs/op,  14.1 tasks/op,   42441 insns/op,        0 errors)
> 96978.04 tps ( 63.1 allocs/op,  14.1 tasks/op,   42462 insns/op,        0 errors)
> 96401.70 tps ( 63.1 allocs/op,  14.1 tasks/op,   42473 insns/op,        0 errors)
> 96573.77 tps ( 63.1 allocs/op,  14.1 tasks/op,   42440 insns/op,        0 errors)
> 96340.54 tps ( 63.1 allocs/op,  14.1 tasks/op,   42468 insns/op,        0 errors)
>
> median 96978.04 tps ( 63.1 allocs/op,  14.1 tasks/op,   42462 insns/op,        0 errors)
> median absolute deviation: 571.20
> maximum: 97549.23
> minimum: 94031.94
>

After:

> 99794.67 tps ( 63.1 allocs/op,  14.1 tasks/op,   42471 insns/op,        0 errors)
> 101244.99 tps ( 63.1 allocs/op,  14.1 tasks/op,   42472 insns/op,        0 errors)
> 101128.37 tps ( 63.1 allocs/op,  14.1 tasks/op,   42485 insns/op,        0 errors)
> 101065.27 tps ( 63.1 allocs/op,  14.1 tasks/op,   42465 insns/op,        0 errors)
> 101212.98 tps ( 63.1 allocs/op,  14.1 tasks/op,   42456 insns/op,        0 errors)
> 101413.31 tps ( 63.1 allocs/op,  14.1 tasks/op,   42463 insns/op,        0 errors)
> 101464.92 tps ( 63.1 allocs/op,  14.1 tasks/op,   42466 insns/op,        0 errors)
> 101086.74 tps ( 63.1 allocs/op,  14.1 tasks/op,   42488 insns/op,        0 errors)
> 101559.09 tps ( 63.1 allocs/op,  14.1 tasks/op,   42468 insns/op,        0 errors)
> 100742.58 tps ( 63.1 allocs/op,  14.1 tasks/op,   42491 insns/op,        0 errors)
>
> median 101212.98 tps ( 63.1 allocs/op,  14.1 tasks/op,   42456 insns/op,        0 errors)
> median absolute deviation: 200.33
> maximum: 101559.09
> minimum: 99794.67
>

Fixes #16594

Closes scylladb/scylladb#18026

* github.com:scylladb/scylladb:
  Implement fast streaming for intra-node migration
  test: tablets_test: Test sharding during intra-node migration
  test: tablets_test: Check sharding also on the pending host
  test: py: tablets: Test writes concurrent with migration
  test: py: tablets: Test crash during intra-node migration
  api, storage_service: Introduce API to wait for topology to quiesce
  dht, replica: Remove deprecated sharder APIs
  test: Avoid using deprecated sharded API
  db: do_apply_many() avoid deprecated sharded API
  replica: mutation_dump: Avoid deprecated sharder API
  repair: Avoid deprecated sharder API
  table: Remove optimization which returns empty reader when key is not owned by the shard
  dht: is_single_shard: Avoid deprecated sharder API
  dht: split_range_to_single_shard: Work with static_sharder only
  dht: ring_position_range_sharder: Avoid deprecated sharder APIs
  dht: token: Avoid use of deprecated sharder API by switching to static_sharder
  selective_token_sharder: Avoid use of deprecated sharder API
  docs: Document tablet sharding vs tablet replica placement
  readers/multishard.cc: use shard_for_reads() instead of shard_of()
  multishard_mutation_query.cc: use shard_for_reads() instead of shard_of()
  storage_proxy: Extract common code to apply mutations on many shards according to sharder
  storage_proxy: Prepare per-partition rate-limiting for intra-node migration
  storage_proxy: Avoid shard_of() use in mutate_counter_on_leader_and_replicate()
  storage_proxy: Prepare mutate_hint() for intra-node tablet migration
  commitlog_replayer: Avoid deprecated sharder::shard_of()
  lwt: Avoid deprecated sharder::shard_of()
  compaction: Avoid deprecated sharder::shard_of()
  dht: Extract dht::static_sharder
  replica: Deprecate table::shard_of()
  locator: Deprecate effective_replication_map::shard_of()
  dht: Deprecate old sharder API: shard_of/next_shard/token_for_next_shard
  tests: tablets: py: Add intra-node migration test
  tests: tablets: Test that drained nodes are not balanced internally
  tests: tablets: Add checks of replica set validity to test_load_balancing_with_random_load
  tests: tablets: Verify that disabling balancing results in no intra-node migrations
  tests: tablets: Check that nodes are internally balanced
  tests: tablets: Improve debuggability by showing which rows are missing
  tablets, storage_service: Support intra-node migration in move_tablet() API
  tablet_allocator: Generate intra-node migration plan
  tablet_allocator: Extract make_internode_plan()
  tablet_allocator: Maintain candidate list and shard tablet count for target nodes
  tablet_allocator: Lift apply_load/can_accept_load lambdas to member functions
  tablets, streaming: Implement tablet streaming for intra-node migration
  dht, auto_refreshing_sharder: Allow overriding write selector
  multishard_writer: Handle intra-node migration
  storage_proxy: Handle intra-node tablet migration for writes
  tablets: Get rid of tablet_map::get_shard()
  tablets: Avoid tablet_map::get_shard in cleanup
  tablets: test: Use sharder instead of tablet_map::get_shard()
  tablets: tablet_sharder: Allow working with non-local host
  sharding: Prepare for intra-node-migration
  docs: Document sharder use for tablets
  tablets: Introduce tablet transition kind for intra-node migration
  tests: tablets: Fix use-after-move of skiplist in rebalance_tablets()
  sstables, gdb: Track readers in a linked list
  raft topology: Fix global token metadata barrier to not fence ahead of what is drained
2024-05-20 16:13:01 +03:00
Kefu Chai
40ce52c3cc test: use generic boost_test_print_type()
in this change, we trade the `boost_test_print_type()` overloads
for the generic template of `boost_test_print_type()`, except for
those in the very small tests, which presumably want to keep
themselves relative self-contained.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18727
2024-05-20 12:56:20 +03:00
Avi Kivity
2fbd78c769 feature: grandfather DIGEST_FOR_NULL_VALUES
The DIGEST_FOR_NULL_VALUES feature was added in 21a77612b3 (2020; 4.4)
and can now be assumed to be always present. The hasher which it invoked
is removed.
2024-05-18 00:24:00 +03:00
Avi Kivity
3bead8cea0 feature: grandfather PER_TABLE_PARTITIONERS
The PER_TABLE_PARTITIONERS feature was added in 90df9a44ce (2020; 4.0)
and can now be assumed to be always present. We also remove the associated
schema_feature.
2024-05-18 00:15:07 +03:00
Avi Kivity
6b532fd40b test: schema_change_test: regenerate digest for PER_TABLE_PARTITIONERS
The first digest tested was generated without the PER_TABLE_PARTITIONERS
schema feature. We're about to make that feature mandatory, so we won't
be able (and won't need) to generate a digest without it.

Update the digest to include the feature. Note it wasn't untested before,
we have a test with schema_features::full().
2024-05-18 00:14:43 +03:00
Avi Kivity
c4d8b17f4c test: test_schema_change_digest: drop unneeded reference digests
digests[0] was used by the VIEW_VIRTUAL_COLUMNS feature, which
no longer exists.

digests[1] is the same as digests[2], so drop it.
2024-05-17 20:41:20 +03:00
Avi Kivity
b5f6021a6b feature: grandfather VIEW_VIRTUAL_COLUMNS
The VIEW_VIRTUAL_COLUMNS feature was added in a108df09f9 (2019; 3.1)
and can now be assumed to be always present.

The corresponding schema_feature is removed. Note schema_features are not sent
over the wire. A digest calculation without VIEW_VIRTUAL_COLUMNS is no longer tested.
2024-05-17 20:41:19 +03:00
Nadav Har'El
c7aa47354a Merge 'mutation_fragment_stream_validating_filter: respect validating_level::none' from Botond Dénes
Even when configured to not do any validation at all, the validator still did some. This small series fixes this, and adds a test to check that validation levels in general are respected, and the validator doesn't validate more than it is asked to.

Fixes: #18662

Closes scylladb/scylladb#18667

* github.com:scylladb/scylladb:
  test/boost/mutation_fragment_test.cc: add test for validator validation levels
  mutation: mutation_fragment_stream_validating_filter: fix validation_level::none
  mutation: mutation_fragment_stream_validating_filter: add raises_error ctor parameter
2024-05-16 19:57:49 +03:00
Michał Jadwiszczak
7f04c88395 test/boost: add test for topological sorting 2024-05-16 13:30:03 +02:00
Tomasz Grabiec
a179f37780 test: tablets_test: Test sharding during intra-node migration 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
5f32d2ddb6 test: tablets_test: Check sharding also on the pending host 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
32a191384a test: Avoid using deprecated sharded API
There is not tablet migration in unit tests, so shard_of() can be
safely replaced with shard_for_reads(). Even if it's used for writes.
2024-05-16 00:28:47 +02:00