Commit Graph

378 Commits

Author SHA1 Message Date
Benny Halevy
731a74c71f storage_proxy: pass topology& to sort_endpoints_by_proximity
It mustn't use the latest topology that may differ from the
one used by the query as it may be missing nodes
(e.g. after concurrent decommission).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-11-22 15:02:40 +02:00
Benny Halevy
ab3fc1e069 storage_proxy: pass topology& to is_worth_merging_for_range_query
It mustn't use the latest topology that may differ from the
one used by the query as it may be missing nodes
(e.g. after concurrent decommission).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-11-22 15:01:58 +02:00
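The reasoning behind 731a74c71f and ab3fc1e069 can be illustrated with a small standalone sketch. It is not the storage_proxy code: the `topology` struct, the `dc_of` map, and the `sort_by_proximity` helper are hypothetical stand-ins. The helper receives the topology the query started with, so an endpoint removed by a concurrent decommission is still known while sorting.

```cpp
#include <algorithm>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a topology: maps an endpoint to its datacenter.
struct topology {
    std::map<std::string, std::string> dc_of;
};

// The helper takes the topology explicitly instead of looking up the latest
// one. Endpoints in local_dc sort first; otherwise the order is preserved.
inline void sort_by_proximity(const topology& topo, const std::string& local_dc,
                              std::vector<std::string>& endpoints) {
    std::stable_sort(endpoints.begin(), endpoints.end(),
        [&](const std::string& a, const std::string& b) {
            // at() throws std::out_of_range if topo does not know the endpoint,
            // which is what happens when a newer topology is used by mistake.
            const bool a_local = topo.dc_of.at(a) == local_dc;
            const bool b_local = topo.dc_of.at(b) == local_dc;
            return a_local && !b_local;
        });
}
```

Passing the query's own snapshot keeps the endpoint list and the topology in agreement; passing a newer topology that has already dropped a node makes the lookup fail in this sketch.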
Avi Kivity
a2da08f9f9 storage_proxy: hold effective_replication_map for the duration of a paxos transaction
Luckily, all topology calculations are done in get_paxos_participants(),
so all we have to do is hold the effective_replication_map for the
duration of the transaction, and pass it to get_paxos_participants().
This ensures that the coordinator knows about all in-flight requests
and can fence them from topology changes.
2022-10-13 14:27:26 +03:00
Avi Kivity
69aaa5e131 storage_proxy: move paxos_response_handler class to .cc file
It's not used elsewhere.
2022-10-13 14:27:26 +03:00
Avi Kivity
b2f3934e95 storage_proxy: deinline paxos_response_handler constructor/destructor
They have no business being inline as it's a heavyweight object.
2022-10-13 14:27:26 +03:00
Avi Kivity
94e4ff11be storage_proxy: use consistent effective_replication_map for counter coordinator
Hold the effective_replication_map while talking to the counter leader,
to allow for fencing in the future. The code is somewhat awkward because
the API allows for multiple keyspaces to be in use.

The error code generation, already broken as it doesn't use the correct
table, continues to be broken in that it doesn't use the correct
effective_replication_map, for the same reason.
2022-10-13 14:27:23 +03:00
Avi Kivity
406a046974 storage_proxy: improve consistency in query_partition_key_range{,_concurrent}
query_partition_key_range captures a token_metadata_ptr and uses
it consistently in sequential calls to query_partition_key_range_concurrent
(via tail recursion), but each invocation of
query_partition_key_range_concurrent captures its own
effective_replication_map_ptr. Since these are captured at different times,
they can be inconsistent after the first iteration.

Fix by capturing it once in the caller and propagating it everywhere.
2022-10-13 13:56:52 +03:00
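A minimal sketch of the fix described in 406a046974, under stated assumptions: `replication_map`, `registry`, and the two query functions are hypothetical names, and a version number stands in for the real map contents. The first function fetches a map on every iteration and can observe different versions; the second captures the map once in the caller and propagates it.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for an effective replication map; only a version is modelled.
struct replication_map { int version; };
using replication_map_ptr = std::shared_ptr<const replication_map>;

// Hypothetical registry whose current map can be replaced at any time.
struct registry {
    replication_map_ptr current = std::make_shared<const replication_map>(replication_map{1});
    replication_map_ptr get() const { return current; }
    void topology_change() {
        current = std::make_shared<const replication_map>(replication_map{current->version + 1});
    }
};

// Inconsistent: every iteration captures its own map.
inline std::vector<int> query_per_iteration(registry& r, int iterations) {
    std::vector<int> seen;
    for (int i = 0; i < iterations; ++i) {
        auto erm = r.get();
        seen.push_back(erm->version);
        r.topology_change(); // simulate a topology change between iterations
    }
    return seen;
}

// Consistent: the caller captures the map once and passes it to every iteration.
inline std::vector<int> query_captured_once(registry& r, int iterations) {
    const replication_map_ptr erm = r.get();
    std::vector<int> seen;
    for (int i = 0; i < iterations; ++i) {
        seen.push_back(erm->version);
        r.topology_change(); // the held snapshot is unaffected
    }
    return seen;
}
```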
Avi Kivity
86a48cf12f storage_proxy: use consistent token_metadata with rest of singular read
query_singular() uses get_token_metadata_ptr() and later, in
get_read_executor(), captures the effective_replication_map(). This
isn't a bug, since the two are captured in the same continuation and
are therefore consistent, but a way to ensure it stays so is to capture
the effective_replication_map earlier and derive the token_metadata from
it.
2022-10-13 13:46:04 +03:00
Pavel Emelyanov
2b8636a2a9 storage_proxy.hh: Remove unused headers
Add needed forward declarations and fix indirect inclusions in some .ccs

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11679
2022-10-02 20:48:50 +03:00
Benny Halevy
64140ccf05 cql3, storage_proxy: add support for TRUNCATE USING TIMEOUT
Extend the cql3 truncate statement to accept attributes,
similar to modification statements.

To achieve that we define cql3::statements::raw::truncate_statement
derived from raw::cf_statement, and implement its pure virtual
prepare() method to make a prepared truncate_statement.

The latter, statements::truncate_statement, is no longer derived
from raw::cf_statement, and just stores a schema_ptr to get to the
keyspace and column_family names.

`test_truncate_using_timeout` cql-pytest was added to test
the new USING TIMEOUT feature.

Fixes #11408

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-26 18:30:39 +03:00
Pavel Emelyanov
b6fdea9a79 code: Call sort_endpoints_by_proximity() via topology
The method is about to be moved from snitch to topology, this patch
prepares the rest of the code to use the latter to call it. The
topology's method just calls snitch, but it's going to change in the
next patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-09-05 15:14:01 +03:00
Pavel Emelyanov
642e50f3e3 snitch: Move is_worth_merging_for_range_query to proxy
Proxy is the only place that calls this method. Also the method name
suggests it's not something "generic", but rather an internal logic of
proxy's query processing.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-09-05 15:10:46 +03:00
Benny Halevy
d295d8e280 everywhere: define locator::host_id as a strong tagged_uuid type
So it can be distinguished from other uuid-based
identifiers in the system.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11276
2022-08-12 06:01:44 +03:00
Avi Kivity
01a614fb4d storage_proxy: use consistent replication map on write path
Capture a replication map just once in
abstract_write_handler::_effective_replication_map_ptr and use it
in all write handlers. A few accesses to get the topology still remain;
they will be fixed up in a later patch.
2022-08-11 17:58:42 +03:00
Avi Kivity
f1b0e3d58e storage_proxy: convert get_live{,_sorted}_endpoints() to accept an effective_replication_map
Allow callers to use consistent effective_replication_map:s across calls
by letting the caller select the object to use.
2022-08-11 17:58:42 +03:00
Botond Dénes
2656968db2 service/storage_proxy: propagate last position on digest reads
We want to transmit the last position as determined by the replica on
both result and digest reads. Result reads already do that via the
query::result, but digest reads don't yet as they don't return the full
query::result structure, just the digest field from it. Add the last
position to the digest read's return value and collect these in the
digest resolver, along with the returned digests.
2022-08-10 06:03:37 +03:00
Botond Dénes
1b669cefed service/storage_proxy: add get_tombstone_limit()
To be used by coordinator side code to determine the correct tombstone
limit to pass to read-command (tombstone limit field added in the next
commit). When this limit is non-zero, the replica will start cutting
pages after the tombstone limit is surpassed.
This getter works similarly to `get_max_result_size()`: if the cluster
feature for empty replica pages is set, it will return the value
configured via db::config::query_tombstone_limit. System queries always
use a limit of 0 (unlimited tombstones).
2022-08-09 10:00:40 +03:00
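The selection logic described in 1b669cefed might look roughly like the sketch below. The `config` struct and the function signature are hypothetical. The commit message does not say what is returned when the cluster feature is not set; returning 0 (unlimited) in that case is an assumption.

```cpp
#include <cstdint>

// Hypothetical stand-in for the relevant part of db::config.
struct config {
    std::uint64_t query_tombstone_limit;
};

// Returns the tombstone limit to pass to the read command; 0 means unlimited.
inline std::uint64_t get_tombstone_limit(const config& cfg,
                                         bool empty_replica_pages_enabled,
                                         bool is_system_query) {
    if (is_system_query) {
        return 0; // system queries always use unlimited tombstones
    }
    if (!empty_replica_pages_enabled) {
        return 0; // assumption: without the cluster feature, no limit is applied
    }
    return cfg.query_tombstone_limit;
}
```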
Kamil Braun
0a4e701b50 service: storage_proxy: pass migration_manager* to init_messaging_service
`migration_manager` lifetime is longer than the lifetime of "storage
proxy's messaging service part" - that is, `init_messaging_service` is
called after `migration_manager` is started, and `uninit_messaging_service`
is called before `migration_manager` is stopped. Thus we don't need to
hold an owning pointer to `migration_manager` here.

Later, when `init_messaging_service` actually constructs `remote`,
this will be a reference, not a pointer.

Also observe that `_mm` in `remote` is only used in handlers, and
handlers are unregistered before `_mm` is nullified, which ensures that
handlers are not running when `_mm` is nullified. (This argument shows
why the code made sense regardless of our switch from shared_ptr to raw
ptr).
2022-08-04 12:19:43 +02:00
Kamil Braun
078900042f service: storage_proxy: pass shared_ptr<gossiper> to start_hints_manager
No need to call `_remote->gossiper().shared_from_this()` from within
storage_proxy now.
2022-08-04 12:16:09 +02:00
Kamil Braun
d9d10d87ec service: storage_proxy: establish private section in remote
Only the (un)init, send_*, and `is_alive` functions are public,
plus a getter for gossiper.
2022-08-04 12:16:05 +02:00
Kamil Braun
7364d453dd service: storage_proxy: remove migration_manager pointer
The ownership is passed to `remote`, which now contains
a `shared_ptr<migration_manager>`.
2022-08-04 12:15:36 +02:00
Kamil Braun
eddd3b8226 service: storage_proxy: remove _gossiper field
Access `gossiper` through `_remote`.
Later, all those accesses will handle missing `remote`.

Note that there are also accesses through the `remote()` internal getter.

The plan is as follows:
- direct accesses through `_remote` will be modified to handle missing
  `_remote` (these won't cause an error)
- `remote()` will throw if `_remote` is missing (`remote()` is only used
  for operations which actually need to send a message to a remote node).
2022-08-04 12:15:35 +02:00
Kamil Braun
ab946e392f alternator: ttl: pass gossiper& to expiration_service
This allows us to remove the `gossiper()` getter from `storage_proxy`.
2022-08-04 12:12:43 +02:00
Kamil Braun
3e73de9a40 service: storage_proxy: introduce is_alive helper
A helper is introduced both in `remote` and in `storage_proxy`.

The `storage_proxy` one calls the `remote` one. In the future it will
also handle a missing `remote`. Then it will report only the local node
to be alive and other nodes dead while `remote` is missing.

The change reduces the number of functions using the `_gossiper` field
in `storage_proxy`.
2022-08-04 12:12:41 +02:00
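The plan outlined in eddd3b8226 and 3e73de9a40 can be sketched as follows. This is an illustration of the intended future behaviour, not the actual classes: `proxy`, `init_remote`, `uninit_remote`, and `get_remote` are hypothetical names, and a set of live endpoints stands in for the gossiper.

```cpp
#include <memory>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for the part of the proxy that talks to other nodes.
struct remote {
    std::set<std::string> live;
    bool is_alive(const std::string& ep) const { return live.count(ep) != 0; }
};

class proxy {
    std::string _self;
    std::unique_ptr<remote> _remote; // missing until initialized
public:
    explicit proxy(std::string self) : _self(std::move(self)) {}

    void init_remote(std::set<std::string> live) {
        _remote = std::make_unique<remote>(remote{std::move(live)});
    }
    void uninit_remote() { _remote.reset(); }

    // Direct access through _remote tolerates a missing remote:
    // only the local node is reported alive in that case.
    bool is_alive(const std::string& ep) const {
        if (!_remote) {
            return ep == _self;
        }
        return _remote->is_alive(ep);
    }

    // The getter is for operations that must send a message, so it throws
    // when there is no remote to send through.
    remote& get_remote() {
        if (!_remote) {
            throw std::runtime_error("remote part is not initialized");
        }
        return *_remote;
    }
};
```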
Kamil Braun
2aff2fea00 service: storage_proxy: remove _messaging reference
All uses of `messaging_service&` have been moved to `remote`.
2022-08-02 19:55:12 +02:00
Kamil Braun
cf931c7863 service: storage_proxy: move connection_dropped to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
2203d4fa09 service: storage_proxy: make encode_replica_exception_for_rpc a static function
No need for this ugly template to be part of the `storage_proxy` header.
2022-08-02 19:55:12 +02:00
Kamil Braun
3499bc7731 service: storage_proxy: move handle_write to remote
It is a helper used by `receive_mutation_handler`
and `handle_paxos_learn`.
2022-08-02 19:55:12 +02:00
Kamil Braun
ba88ad8db0 service: storage_proxy: move handle_paxos_prune to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
548767f91e service: storage_proxy: move handle_paxos_accept to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
807c7f32de service: storage_proxy: move handle_paxos_prepare to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
0e431e7c03 service: storage_proxy: move handle_truncate to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
f8c1ba357f service: storage_proxy: move handle_read_digest to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
43997af40f service: storage_proxy: move handle_read_mutation_data to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
80586a0c7e service: storage_proxy: move handle_read_data to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
00c0ee44bd service: storage_proxy: move handle_mutation_failed to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
b9c436c6e0 service: storage_proxy: move handle_mutation_done to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
178536d5d2 service: storage_proxy: move handle_paxos_learn to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
f309886fac service: storage_proxy: move receive_mutation_handler to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
fad14d2094 service: storage_proxy: move handle_counter_mutation to remote
2022-08-02 19:55:12 +02:00
Kamil Braun
93325a220f service: storage_proxy: remove get_local_shared_storage_proxy
Its remaining uses are trivial to remove.

Note: in `handle_counter_mutation` we had this piece of code:
```
            }).then([trace_state_ptr = std::move(trace_state_ptr), &mutations, cl, timeout] {
                auto sp = get_local_shared_storage_proxy();
                return sp->mutate_counters_on_leader(...);
```

Obtaining a `shared_ptr` to `storage_proxy` at this point is
no different from obtaining a regular pointer:
- The pointer is obtained inside `then` lambda body, not in the capture
  list. So if the goal of obtaining a `shared_ptr` here was to keep
  `storage_proxy` alive until the `then` lambda body is executed, that
  goal wasn't achieved because the pointer was obtained too late.
- The `shared_ptr` is destroyed as soon as `mutate_counters_on_leader`
  returns; it's not stored anywhere. So it doesn't prolong the lifetime
  of the service.

I replaced this with a simple capture of `this` in the lambda.
2022-08-02 19:55:12 +02:00
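The lifetime argument in 93325a220f can be demonstrated outside Seastar with plain standard C++. In this sketch `service` and both factory functions are hypothetical, and a `std::weak_ptr` stands in for the global getter so that the "obtained too late" case is observable without undefined behaviour. A `shared_ptr` in the capture list keeps the object alive for as long as the task exists; one obtained inside the body does not.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Hypothetical service whose lifetime is in question.
struct service {
    int value = 42;
};

// Pointer taken in the capture list: the task co-owns the service from the
// moment the task is created until the task itself is destroyed.
inline std::function<int()> make_task_capturing(std::shared_ptr<service> sp) {
    return [sp = std::move(sp)] { return sp->value; };
}

// Pointer taken inside the body: nothing keeps the service alive between
// task creation and task execution.
inline std::function<int()> make_task_late(std::weak_ptr<service> wp) {
    return [wp = std::move(wp)] {
        auto sp = wp.lock(); // obtained only now, destroyed when the body returns
        return sp ? sp->value : -1;
    };
}
```

If the last external owner goes away before the tasks run, the capturing task still sees the service, while the late one finds it already destroyed.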
Kamil Braun
f174645ab5 service: storage_proxy: introduce remote
Move most accesses to `_messaging` to this struct
(functions that send RPCs).
2022-08-02 19:55:10 +02:00
Piotr Dulikowski
3357066387 storage_proxy: resultize return type of get_read_executor
Now, get_read_executor is able to return coordinator exceptions without
throwing them. In an upcoming commit, it will start returning rate limit
exceptions in some cases, and it is preferable to return them without
throwing.
2022-06-22 20:16:49 +02:00
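The idea of "resultizing" a return type, as in 3357066387, is to carry the error in the return value instead of throwing it. The sketch below shows the pattern with `std::variant` (C++17); this is a swapped-in technique for illustration and is not claimed to be the result type storage_proxy uses. All type and parameter names here are hypothetical.

```cpp
#include <string>
#include <variant>

// Hypothetical error and success types.
struct rate_limit_error {
    std::string what;
};
struct read_executor {
    int replicas;
};

// Either a usable executor or the error explaining why there is none.
using executor_result = std::variant<read_executor, rate_limit_error>;

// Returns the error as a value; nothing is thrown on the failure path.
inline executor_result get_read_executor(bool over_limit) {
    if (over_limit) {
        return rate_limit_error{"per-partition rate limit exceeded"};
    }
    return read_executor{3};
}
```

The caller checks which alternative it received, for example with `std::holds_alternative<rate_limit_error>(result)`, and so avoids the cost of stack unwinding on a path that may be hit frequently.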
Piotr Dulikowski
d3d9add219 storage_proxy: add per partition rate limit info to read RPC
Now, the read RPC accepts the per partition rate limit info parameter. It
is passed on to query_result_local(_digest) methods.
2022-06-22 20:16:49 +02:00
Piotr Dulikowski
e8e8ada4b4 storage_proxy: add per partition rate limit info to query_result_local(_digest)
The query_result_local and query_result_local_digest methods were
updated to accept db::per_partition_rate_limit::info structure and pass
it on to database::accept.
2022-06-22 20:16:49 +02:00
Piotr Dulikowski
e6beab3106 storage_proxy: add allow rate limit flag to mutate/mutate_result
Now, mutate/mutate_result accept a flag which decides whether the write
should be rate limited or not.

The new parameter is mandatory and all call sites were updated.
2022-06-22 20:16:49 +02:00
Piotr Dulikowski
1f65c4e001 storage_proxy: add allow rate limit flag to mutate_internal
Now, mutate_internal accepts a flag which decides whether the write
should be rate limited or not.
2022-06-22 20:16:49 +02:00
Piotr Dulikowski
1e4e92ed8b storage_proxy: add allow rate limit flag to mutate_begin
Now, mutate_begin accepts a flag which decides whether the given write
should be rate limited or not.
2022-06-22 20:16:49 +02:00
Piotr Dulikowski
76e95e7ae8 storage_proxy: choose the right per partition rate limit info in write handler
Now, write response handler calculates the appropriate rate limit info
parameter and passes it to the mutation holder.
2022-06-22 20:16:49 +02:00
Piotr Dulikowski
2a7ba76c3e storage_proxy: resultize return types of write handler creation path
The mutate_prepare and create_write_response_handler(_helper) functions
are modified to be able to return exceptions without throwing them. In
an upcoming commit, create_write_response_handler will sometimes return
rate limit exceptions, and it is preferable to return them without
throwing.
2022-06-22 20:16:49 +02:00