scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-22 07:42:16 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	1a045d0cdd	cql_transport: Use shared updateable_timeout_config by reference Pass sharded<updateable_timeout_config>& into cql_transport::controller, which feeds the shard-local instance as a reference into cql_server_config::timeout_config. This drops the per-shard local updateable_timeout_config constructed from db::config inside the controller's sharded_parameter lambda, replacing it with a reference into the shared sharded instance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-24 15:21:31 +03:00
Piotr Smaron	218f8adc8f	transport: add per-service-level cql_requests_serving metric Add a per-scheduling-group gauge that tracks the number of in-flight CQL requests for each service level. The existing scylla_transport_requests_serving metric is a single global per-shard counter; the new metric breaks it down by scheduling group so operators can see which service level contributes the most in-flight requests when debugging latency. The metric is named cql_requests_serving (exposed as scylla_transport_cql_requests_serving) following the cql_ prefix convention used by all other per-scheduling-group transport metrics (cql_requests_count, cql_request_bytes, cql_response_bytes, cql_pending_response_memory). Using a cql_ prefix avoids Prometheus confusion with the global requests_serving metric, which lacks the scheduling_group_name label. The counter is incremented when a request enters process_request() and decremented in the same 'leave' defer block as the global requests_serving, ensuring the request is counted as in-flight until the response is sent.	2026-04-17 15:07:14 +02:00
Avi Kivity	0ae22a09d4	LICENSE: Update to version 1.1 Updated terms of non-commercial use (must be a never-customer).	2026-04-12 19:46:33 +03:00
Marcin Maliszkiewicz	a74665b300	transport: add per-service-level pending response memory metric Track the total memory consumed by responses waiting to be written to the socket, exposed as a per-scheduling-group gauge (cql_pending_response_memory). This complements the response memory accounting added in the previous commits by giving visibility into how much memory each service level is holding in unsent response buffers.	2026-04-01 17:15:28 +02:00
Marcin Maliszkiewicz	a26ca0f5f7	transport: hold memory permit until response write completes Capture the memory permit in the leave lambda's .finally() continuation so that the semaphore units are kept alive until write_response finishes, preventing premature release of memory accounting. This is especially important with slow network and big responses when buffers can accumulate and deplete node's memory.	2026-03-31 14:05:00 +02:00
Piotr Dulikowski	d8b283e1fb	Merge 'Add CQL forwarding for strongly consistent tables' from Wojciech Mitros In this series we add support for forwarding strongly consistent CQL requests to suitable replicas, so that clients can issue reads/writes to any node and have the request executed on an appropriate tablet replica (and, for writes, on the Raft leader). We return the same CQL response as what the user would get while sending the request to the correct replica and we perform the same logging/stats updates on the request coordinator as if the coordinator was the appropriate replica. The core mechanism of forwarding a strongly consistent request is sending an RPC containing the user's cql request frame to the appropriate replica and returning back a ready, serialized `cql_transport::response`. We do this in the CQL server - it is most prepared for handling these types and forwarding a request containing a CQL frame allows us to reuse near-top-level methods for CQL request handling in the new RPC handler (such as the general `process`) For sending the RPC, the CQL server needs to obtain the information about who should it forward the request to. This requires knowledge about the tablet raft group members and leader. We obtain this information during the execution of a `cql3/strong_consistency` statement, and we return this information back to the CQL server using the generalized `bounce_to_shard` `response_message`, where we now store the information about either a shard, or a specific replica to which we should forward to. Similarly to `bounce_to_shard`, we need to handle this `result_message` in a loop - a replica may move during statement execution, or the Raft leader can change. 
We also use it for forwarding strongly consistent writes when we're not a member of the affected tablet raft group - in that case we need to forward the statement twice - once to any replica of the affected tablet, then that replica can find the leader and return this information to the coordinator, which allows the second request to be directed to the leader. This feature also allows passing through exception messages which happened on the target replica while executing the statement. For that, many methods of the `cql_transport::cql_server::connection` for creating error responses needed to be moved to `cql_transport::cql_server`. And for final exception handling on the coordinator, we added additional error info to the RPC response, so that the handling can be performed without having the `result_message::exception` or `exception_ptr` itself. Fixes [SCYLLADB-71](https://scylladb.atlassian.net/browse/SCYLLADB-71) [SCYLLADB-71]: https://scylladb.atlassian.net/browse/SCYLLADB-71?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ Closes scylladb/scylladb#27517 * github.com:scylladb/scylladb: test: add tests for CQL forwarding transport: enable CQL forwarding for strong consistency statements transport: add remote statement preparation for CQL forwarding transport: handle redirect responses in CQL forwarding transport: add exception handling for forwarded CQL requests transport: add basic CQL request forwarding idl: add a representation of client_state for forwarding cql_server: handle query, execute, batch in one case transport: inline process_on_shard in cql_server::process transport: extract process() to cql_server transport: add messaging_service to cql_server transport: add response reconstruction helpers for forwarding transport: generalize the bounce result message for bouncing to other nodes strong consistency: redirect requests to live replicas from the same rack transport: pass foreign_ptr into 
sleep_until_timeout_passes and move it to cql_server transport: extract the error handling from process_request_one transport: move error response helpers from connection to cql_server	2026-03-13 15:03:10 +01:00
Avi Kivity	e2eeef3e01	Merge 'service level: remove remnants of version 1 service level' from Gleb Natapov can_use_effective_service_level_cache() always returns true now, so the function can be dropped entirely and all the code that assumes it may return false can be dropped as well. Also drop async versions of find_effective_service_level and get_user_scheduling_group since they are unused. No need to backport, code removal, Closes scylladb/scylladb#29002 * github.com:scylladb/scylladb: service level: make maybe_update_per_service_level_params synchronous service level: remove unused get_user_scheduling_group function service level: drop async find_effective_service_level service level: remove remnants of version 1 service level	2026-03-12 23:39:41 +02:00
Wojciech Mitros	21a7b036a5	transport: add remote statement preparation for CQL forwarding During forwarding of CQL EXECUTE requests, the target node may not have the prepared statement in its cache. If we do have this statement as a coordinator, instead of returning PREPARED NOT FOUND to the client, we want to prepare the statement ourselves on the target node. For that, we add a new FORWARD_CQL_PREPARE RPC. We use the new RPC after getting the prepared_not_found status during forwarding. When we try to forward a request, we always have the query string (we decide whether to forward based on this query), so we can always use the new RPC when getting the prepared_not_found status. After receiving the response, we try forwarding the EXECUTE request again.	2026-03-12 19:43:35 +01:00
Wojciech Mitros	96a5e1c7ce	transport: handle redirect responses in CQL forwarding During CQL forwarding, when the target node can't handle the request, it will find another node which can execute the request or which knows where the request can be executed. We return this information in responses to CQL forwarding, and in this patch, we add handling of this kind of a response. After getting a redirect response, we retry forwarding to the returned host/shard until success or timeout. This can happen many times during a single request, when we first forward to a replica and later to the coordinator, or when a replica/coordinator migrated while we were performing the forwarding	2026-03-12 19:43:31 +01:00
Wojciech Mitros	8816d3038c	transport: add exception handling for forwarded CQL requests When a forwarded request fails on the remote node, we can't use the exception handling that happens in process_request_one because we don't go through this code path. Instead, we use the previously extracted cql_server::handle_exception handler, which performs all accounting on the forwarded-to node, and which prepares the response. For the read_failure_exception_with_timeout exception, we need to perform the sleep on the source node, so we return the timeout in the forwarding response and use it on the source node to know how long to sleep without any extra calculations. The handle_forward_execute() method is extracted from the inline handler lambda to make the error catching wrapper cleaner.	2026-03-12 19:41:37 +01:00
Wojciech Mitros	23bff5dfef	transport: add basic CQL request forwarding Add the infrastructure for forwarding CQL requests to other nodes. When a process() call results in a node bounce (as opposed to a shard bounce), the coordinator serializes the request and sends it via the FORWARD_CQL_EXECUTE RPC verb to the target node. In this patch we omit several features that allow handling more scenarios that can happen when trying to forward a CQL request, but the RPC request and response are already prepared for them. They will be handled in the following commits.	2026-03-12 19:41:35 +01:00
Wojciech Mitros	b4a7fefe20	cql_server: handle query, execute, batch in one case Currently we perform the same steps when handling query, execute and batch CQL requests. So instead of creating multiple functions performing these steps, we can handle them all in one fallthrough case in cql_server::connection::process_request_one.	2026-03-12 17:48:58 +01:00
Wojciech Mitros	dadb87047c	transport: inline process_on_shard in cql_server::process The process_on_shard method is relatively short, it's only used in the process() method and the Process concept that it uses is as long as the function itself. This area will be made more complex by the following patches for cql forwarding, so we simplify it by inlining process_on_shard in cql_server::process.	2026-03-12 17:48:58 +01:00
Wojciech Mitros	24cdc3a10d	transport: extract process() to cql_server Move process() and process_on_shard() from cql_server::connection to cql_server. The process() method is no longer a template - instead, it takes an opcode parameter and uses get_process_fn_for_opcode() to select the appropriate internal processing function. The process_query, process_execute, and process_batch wrappers on connection now delegate to _server.process() with the appropriate opcode. This refactoring is preparation for CQL request forwarding, where process() will need to be called from a context other than connection (the forwarding RPC handler).	2026-03-12 17:48:57 +01:00
Wojciech Mitros	0e3469e89c	transport: add messaging_service to cql_server The messaging service will be used by cql_server to register RPC handlers for forwarding CQL requests between nodes. We pass it through the controller to cql_server.	2026-03-12 17:48:57 +01:00
Wojciech Mitros	e44820ba1f	transport: generalize the bounce result message for bouncing to other nodes In the following patches, we'll start allowing forwarding requests to strongly consistent tables so that they'll get executed on the suitable tablet Raft group members. For that we'll reuse the approach that we already have for bouncing requests to other shards - we'll try to execute a request locally, and the result of that will be a bounce message with another replica as the target. In this patch we generalize the former bounce_to_shard result message so that it will be able to specify the target of the bounce as another shard or specific replica. We also rename it to result_message::bounce so that it stops implying that only another shard may be its target. Aside from the host_id and the shard, the new message also includes the timeout, because in the service handling the forwarding we won't have access to it, and it's needed for specifying how long we should wait for the forwarded requests. It also includes information on whether this is a write request, so that the correct timeout response is returned if the deadline is exceeded. We will return other hosts in the new bounce message when executing requests to strongly consistent tables when we can't handle the request because we aren't a suitable replica. We can't handle this message yet, so we don't return it anywhere and we still assume that every bounce message is a bounce to the same host.	2026-03-12 17:48:57 +01:00
Wojciech Mitros	309abc44d9	transport: pass foreign_ptr into sleep_until_timeout_passes and move it to cql_server Change sleep_until_timeout_passes() to accept a foreign_ptr<std::unique_ptr<response>>. We can easily create the foreign_ptr for the responses created in the CQL server, but we'll need this when we get responses when forwarding CQL statements - the responses may come from other shards. We also move it from cql_server::connection to cql_server, because for forwarded CQL requests, we'll need to handle it at the cql_server level. The method also loses its const qualifier - the abort_source that we pass into sleep_abortable needs to be non-const. Apparently, we could still use it in a const method of cql_server::connection because we passed it as _server._abort_source which caused the const qualifier to be lost.	2026-03-12 16:03:14 +01:00
Gleb Natapov	f888f2dced	service level: remove remnants of version 1 service level can_use_effective_service_level_cache() always returns true now, so the function can be dropped entirely and all the code that assumes it may return false can be dropped as well.	2026-03-12 12:27:52 +02:00
Marcin Maliszkiewicz	b277d9d9aa	cql3: track CQL parsing memory cost and use it for admission control Use rolling_max_tracker to record gross bytes allocated during each CQL parse. The rolling maximum is then added to the memory estimate for incoming QUERY and PREPARE requests so that the admission control in the CQL transport layer accounts for parsing overhead. The measured memory footprint serves as an upper bound rather than an exact number, but its purpose is to prevent OOMs under loads heavy in unprepared statements. In a benchmark, a node with 1G of memory shows a decrease in non-LSA memory usage from a peak of 320MB (our coordinator budget is 10% of 1G) to 96MB, while tps drops from 1.2 kops to 0.8 kops. The drop in tps is expected, as memory admission kicks in to prevent OOM.	2026-03-12 10:16:10 +01:00
Wojciech Mitros	b1bd206147	transport: extract the error handling from process_request_one When we forward CQL statements, we'll need to handle the errors on the destination node. Only for read_failure_exception_with_timeout exception, we'll still need to wait until timeout passes on the source node. For that we extract the exception handling to a separate method. Additionally, we separate the waiting and all other handling, so that all handling aside from waiting will be reusable after forwarding, and we'll also be able to sleep on the source node if necessary.	2026-03-11 19:40:47 +01:00
Wojciech Mitros	6184b1d5ea	transport: move error response helpers from connection to cql_server These methods are used only in the error handler in the cql server, and outside of 3 cases, they don't need any information from the cql_server::connection. We move them from cql_server::connection to cql_server, so that they can be used in the following patches for methods for CQL request forwarding where we'll have no instance of cql_server::connection on the node forwarded to. After the change the methods require no access to the server's or connection's fields, so we also make them static methods.	2026-03-11 19:40:47 +01:00
Dario Mirovic	d765b5b309	client_state: add _bypass_auth_checks flag Authorization checks were previously skipped based on the _is_internal flag. This couples two concerns: marking client state as internal and bypassing authorization. Introduce _bypass_auth_checks to handle only the authorization bypass. Internal client state sets it to true, preserving current behavior. External client state accepts it as a constructor parameter, defaulting to false. This will allow maintenance socket connections to skip authorization without being marked as internal. Refs SCYLLADB-409	2026-03-03 22:31:35 +01:00
Amnon Heiman	3175540e87	transport/server: to bytes_histogram This patch replaces simple counters with bytes_histogram for tracking CQL request and response sizes, enabling better visibility into message size distribution. Changes: - Replace request_size and response_size metrics with bytes_histogram in cql_sg_stats::request_kind_stats - Per-shard metrics continue to be reported as before - QUERY, EXECUTE, and BATCH operations now report per-node, per-scheduling-group histograms of bytes sent and received, providing detailed insight into these operations Other CQL operations (e.g., PREPARE, OPTIONS) are not included in per-node histogram reporting as they are less performance-critical, but can be added in the future if proven useful. Metrics example: ``` # HELP scylla_transport_cql_request_bytes Counts the total number of received bytes in CQL messages of a specific kind. # TYPE scylla_transport_cql_request_bytes counter scylla_transport_cql_request_bytes{kind="BATCH",scheduling_group_name="sl:default",shard="0"} 129808 scylla_transport_cql_request_bytes{kind="EXECUTE",scheduling_group_name="sl:default",shard="0"} 227409 scylla_transport_cql_request_bytes{kind="PREPARE",scheduling_group_name="sl:default",shard="0"} 631 scylla_transport_cql_request_bytes{kind="QUERY",scheduling_group_name="sl:default",shard="0"} 2809 scylla_transport_cql_request_bytes{kind="QUERY",scheduling_group_name="sl:driver",shard="0"} 4079 scylla_transport_cql_request_bytes{kind="REGISTER",scheduling_group_name="sl:default",shard="0"} 98 scylla_transport_cql_request_bytes{kind="STARTUP",scheduling_group_name="sl:driver",shard="0"} 432 # HELP scylla_transport_cql_request_histogram_bytes A histogram of received bytes in CQL messages of a specific kind and specific scheduling group. 
# TYPE scylla_transport_cql_request_histogram_bytes histogram scylla_transport_cql_request_histogram_bytes_sum{kind="QUERY",scheduling_group_name="sl:driver"} 4079 scylla_transport_cql_request_histogram_bytes_count{kind="QUERY",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="1024.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="2048.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="4096.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="8192.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="16384.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="32768.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="65536.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="131072.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="262144.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="524288.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="1048576.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="2097152.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="4194304.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="8388608.000000",scheduling_group_name="sl:driver"} 57 
scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="16777216.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="33554432.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="67108864.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="134217728.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="268435456.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="536870912.000000",scheduling_group_name="sl:driver"} 57 scylla_transport_cql_request_histogram_bytes_bucket{kind="QUERY",le="1073741824.000000",scheduling_group_name="sl:driver"} 57 ```	2026-01-28 13:53:47 +02:00
Vlad Zolotarov	85adf6bdb1	system.clients: add a client_options column This new column is going to contain all OPTIONS sent in the STARTUP frame of the corresponding CQL session. The new column has a `frozen<map<text, text>>` type, and we are also optimizing the amount of required memory for storing corresponding keys and values by caching them on each shard level. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2025-12-20 12:26:15 -05:00
Vlad Zolotarov	3a54bab193	controller: update get_client_data to use foreign_ptr for client_data get_client_data() is used to assemble `client_data` objects from each connection on each CPU in the context of generation of the `system.clients` virtual table data. After collected, `client_data` objects were std::moved and arranged into a different structure to match the table's sorting requirements. This didn't allow having not-cross-shard-movable objects as fields in the `client_data`, e.g. lw_shared_ptr objects. Since we are planning to add such fields to `client_data` in following patches this patch is solving the limitation above by making get_client_data() return `foreign_ptr<std::unique_ptr<client_data>>` objects instead of naked `client_data` ones. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2025-12-19 11:01:41 -05:00
Piotr Dulikowski	7900aa5319	Merge 'server: fix scheduling group update timing in system.clients' from Alex Dathskovsky Previously, the scheduling_group column was updated during the switch_tenant function, which meant the update occurred only after the tenant change operation completed—updating rows one by one. With this change, the scheduling_group column is now updated before the switch_tenant logic runs, ensuring that the table reflects the correct scheduling groups for all rows as early as possible. fixes: #26060 fixes: #27295 backport: not required this is a minor bug fix. Internal logic worked but the user couldn't see the change in the table if they would read the system.clients table Closes scylladb/scylladb#26404 * github.com:scylladb/scylladb: test: cqlpy: Remove test_switch_tenants and add test in cluster testing. The test needs to run twice, in two separate Scylla runs, using two different modes: gossip and raft. The cluster framework supports this setup, while cqlpy only runs against Scylla instances in raft mode. Therefore, the test was moved from cqlpy to the cluster-based framework. This commit both adds the test in cluster/ and removes the old version in cqlpy/. server: Refactor update_control_connection_scheduling_group functionality This refactoring moves the logic that retrieves the scheduling group for driver_service_level_name out of switch_tenant. This change is possible because the scheduling group for the driver is retrieved from a map (LOOKUP). The lookup function is fully synchronized, non-coroutine, and returns immediately. For that reason, it’s better to perform this lookup outside of the switch_tenant function. server: Refactor scheduling group update functionality. This change generalizes the scheduling-group update functionality and removes some copy-paste code, improving overall readability and maintainability. To achieve this, capturing lambdas were introduced. 
As a result, self-deducing this was added to those lambdas to avoid coroutine-related issues (“coroutine fiasco”). server: Fix switch_tenant problem, When running on a V2 server, service-level data comes from service level cache. Because of this, we can use a synchronized function to get the scheduling group. Since we are transitioning to a Raft-based architecture where all servers will be V2, we can safely implement this fix specifically for that case. This change adds get_cached_user_scheduling_group functionality and moves its usage out of switch_tenant function in update_scheduling_group_v2 usage. server: Add update_service_level_scheduling_group_v1 functions to create placeholder for functionality that will introduce v2 implementation. The new functionality will allow usage of service level cache	2025-12-16 15:39:49 +01:00
Andrzej Jackowski	c2b1b10ca0	service: transport: add CLIENT_ROUTES_CHANGE event Introduce the CLIENT_ROUTES_CHANGE event to let drivers refresh connections when `system.client_routes` is modified. Some deployments (e.g., Private Link) require specific address/port mappings that can change without topology changes and drivers need to adapt promptly to avoid connectivity issues. This new EVENT type carries a change indicator plus the affected `connection_ids` and `host_ids`. The only change value is `UPDATE_NODES`, meaning one or more client routes were inserted, updated, or deleted. Drivers subscribe using the existing events mechanism, so no additional `cql_protocol_extension` key is required. Ref: scylladb/scylla-enterprise#5699	2025-12-15 18:19:37 +01:00
Alex	5579489c4c	server: Refactor scheduling group update functionality. This change generalizes the scheduling-group update functionality and removes some copy-paste code, improving overall readability and maintainability. To achieve this, capturing lambdas were introduced. As a result, self-deducing this was added to those lambdas to avoid coroutine-related issues (“coroutine fiasco”).	2025-12-14 18:46:05 +02:00
Alex	17c9d640fe	server: Fix switch_tenant problem, When running on a V2 server, service-level data comes from service level cache. Because of this, we can use a synchronized function to get the scheduling group. Since we are transitioning to a Raft-based architecture where all servers will be V2, we can safely implement this fix specifically for that case. This change adds get_cached_user_scheduling_group functionality and moves its usage out of switch_tenant function in update_scheduling_group_v2 usage.	2025-12-14 16:27:40 +02:00
Alex	f98af582a7	server: Add update_service_level_scheduling_group_v1 functions to create placeholder for functionality that will introduce v2 implementation. The new functionality will allow usage of service level cache	2025-12-14 16:09:18 +02:00
Andrzej Jackowski	14081d0727	generic_server: transport: start using `sl:driver` for new connections Before this change, new connections were handled in a default scheduling group (`main`), because before the user is authenticated we do not know which service level should be used. With the new `sl:driver` service level, creation of new connections can be moved to `sl:driver`. We switch the service level as early as possible, in `do_accepts`. There is a possibility, that `sl:driver` will not exist yet, for instance, in specific upgrade cases, or if it was removed. Therefore, we also switch to `sl:driver` after a connection is accepted. Refs: scylladb/scylladb#24411	2025-10-08 08:25:12 +02:00
Avi Kivity	1258e7c165	Revert "Merge 'transport: service_level_controller: create and use `driver` service level' from Andrzej Jackowski" This reverts commit `fe7e63f109`, reversing changes made to `b5f3f2f4c5`. It is causing test.py failures around cqlpy. Fixes #26163 Closes scylladb/scylladb#26174	2025-09-22 09:32:46 +03:00
Pavel Emelyanov	a1ea553fe1	code: Replace distributed<> with sharded<> The latter is recommended in seastar, and the former was left as compatibility alias. Latest seastar explicitly marks it as deprecated so once the submodule is updated, compilation logs will explode. Most of the patch is generated with for f in $(git grep -l '\<distributed<[A-Za-z0-9:_]*>') ; do sed -e 's/\<distributed<\([A-Za-z0-9:_]*\)>/sharded<\1>/g' -i $f; done for f in $(git grep -l distributed.hh); do sed -e 's/distributed.hh/sharded.hh/' -i $f ; done and a small manual change in test/perf/perf.hh Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#26136	2025-09-19 12:22:51 +02:00
Andrzej Jackowski	1ad483749a	generic_server: transport: start using `sl:driver` for new connections Before this change, new connections were handled in a default scheduling group (`main`), because before the user is authenticated we do not know which service level should be used. With the new `sl:driver` service level, creation of new connections can be moved to `sl:driver`. We switch the service level as early as possible, in `do_accepts`. There is a possibility, that `sl:driver` will not exist yet, for instance, in specific upgrade cases, or if it was removed. Therefore, we also switch to `sl:driver` after a connection is accepted. Refs: scylladb/scylladb#24411	2025-09-18 09:29:29 +02:00
Sergey Zolotukhin	ea311be12b	generic_server: Two-step connection shutdown. When shutting down in `generic_server`, connections are now closed in two steps. First, only the RX (receive) side is shut down. Then, after all ongoing requests have completed or a timeout has occurred, the connections are fully closed. Fixes scylladb/scylladb#24481	2025-07-28 10:08:06 +02:00
Sergey Zolotukhin	7334bf36a4	transport: cosmetic change, remove extra blanks.	2025-07-28 10:08:06 +02:00
Avi Kivity	3dfdcf7d7a	Merge 'transport: remove throwing `protocol_exception` on connection start' from Dario Mirovic `protocol_exception` is thrown in several places. This has become a performance issue, especially when starting/restarting a server. To alleviate this issue, throwing the exception has to be replaced with returning it as a result or an exceptional future. This PR replaces throws in the `transport/server` module. This is achieved by using result_with_exception, and in some places, where suitable, just by creating and returning an exceptional future. There are four commits in this PR. The first commit introduces tests in `test/cqlpy`. The second commit refactors transport server `handle_error` to not rethrow exceptions. The third commit refactors reusable buffer writer callbacks. The fourth commit replaces throwing `protocol_exception` with returning it. Based on the comments on an issue linked in https://github.com/scylladb/scylladb/issues/24567, the main culprit from the side of protocol exceptions is the invalid protocol version one, so I tested that exception for performance. In order to see if there is a measurable difference, a modified version of the `test_protocol_version_mismatch` Python test is used, with 100'000 runs across 10 processes (not threads, to avoid Python GIL). One test run consisted of 1 warm-up run and 5 measured runs. The first test run was executed on the current code, with throwing protocol exceptions. The second test run was executed on the new code, with returning protocol exceptions. The performance report is in https://github.com/scylladb/scylladb/pull/24738#issuecomment-3051611069. It shows ~10% gains in real, user, and sys time for this test. 
Testing Build: `release` Test file: `test/cqlpy/test_protocol_exceptions.py` Test name: `test_protocol_version_mismatch` (modified for mass connection requests) Test arguments: ``` max_attempts=100'000 num_parallel=10 ``` Throwing `protocol_exception` results: ``` real=1:26.97 user=10:00.27 sys=2:34.55 cpu=867% real=1:26.95 user=9:57.10 sys=2:32.50 cpu=862% real=1:26.93 user=9:56.54 sys=2:35.59 cpu=865% real=1:26.96 user=9:54.95 sys=2:32.33 cpu=859% real=1:26.96 user=9:53.39 sys=2:33.58 cpu=859% real=1:26.95 user=9:56.85 sys=2:34.11 cpu=862% # average ``` Returning `protocol_exception` as `result_with_exception` or an exceptional future: ``` real=1:18.46 user=9:12.21 sys=2:19.08 cpu=881% real=1:18.44 user=9:04.03 sys=2:17.91 cpu=869% real=1:18.47 user=9:12.94 sys=2:19.68 cpu=882% real=1:18.49 user=9:13.60 sys=2:19.88 cpu=883% real=1:18.48 user=9:11.76 sys=2:17.32 cpu=878% real=1:18.47 user=9:10.91 sys=2:18.77 cpu=879% # average ``` This PR replaced `transport/server` throws of `protocol_exception` with returns. There are a few other places where protocol exceptions are thrown, and there are many places where `invalid_request_exception` is thrown. That is out of scope of this single PR, so the PR just refs, and does not resolve issue #24567. Refs: #24567 This PR improves performance in cases when protocol exceptions happen, for example during connection storms. It will require backporting. Closes scylladb/scylladb#24738 * github.com:scylladb/scylladb: test/cqlpy: add cpp exception metric test conditions transport/server: replace protocol_exception throws with returns utils/reusable_buffer: accept non-throwing writer callbacks via result_with_exception transport/server: avoid exception-throw overhead in handle_error test/cqlpy: add protocol_exception tests	2025-07-20 17:42:30 +03:00
Dario Mirovic	5390f92afc	transport/server: replace protocol_exception throws with returns Replace throwing protocol_exception with returning it as a result or an exceptional future in the transport server module. This improves performance, for example during connection storms and server restarts, where protocol exceptions are more frequent. In functions already returning a future, protocol exceptions are propagated using an exceptional future. In functions not already returning a future, result_with_exception is used. A notable change is checking v.failed() before calling v.get() in the process_request function, to avoid throwing in case of an exceptional future. Refs: #24567	2025-07-17 16:54:05 +02:00
Marcin Maliszkiewicz	2f840e51d1	service: pull out update_tablet_metadata from migration_listener It's not a good usage as there is only one non-empty implementation. Also we need to change it further in the following commit which makes it incompatible with listener code.	2025-07-10 10:40:43 +02:00
Pavel Emelyanov	9b178df7dd	transport: Stop using db::config by transport::server Now the server is self-contained in the way it is being configured by the controller. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-07-04 15:40:20 +03:00
Pavel Emelyanov	e2c1484d8d	transport: Keep uninitialized_connections_semaphore_cpu_concurrency on cql_server_config This also repeats previous patch for another updateable_value. The thing here is that this config option is passed further to generic_server, but not used by transport::server itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-07-04 15:40:20 +03:00
Pavel Emelyanov	64ffe67cbd	transport: Move cql_duplicate_bind_variable_names_refer_to_same_variable to cql_server_config Similarly to previous patch -- move yet another updateable_value to let transport::server eventually stop messing with db::config. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-07-04 15:40:14 +03:00
Pavel Emelyanov	b6546ed5ff	transport: Move max_concurrent_requests to struct config This is updateable_value that's initialized from db::config named_value to tackle its shard-unsafety. However, the cql_server_config is created by controller using sharded_parameter() helper, so that it can be safely passed to server. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-07-04 15:35:55 +03:00
Pavel Emelyanov	6075eca168	transport: Use cql_server_config::max_request_size It's duplicated on config and the transport::server that aggregates the config itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-07-04 15:34:53 +03:00
Avi Kivity	cd79a8fc25	Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz" This reverts commit `0b516da95b`, reversing changes made to `30199552ac`. It breaks cluster.random_failures.test_random_failures.test_random_failures in debug mode (at least). Fixes #24513	2025-06-16 22:38:12 +03:00
Tomasz Grabiec	0b516da95b	Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz This change is preparing ground for state update unification for raft bound subsystems. It introduces schema_applier which in the future will become generic interface for applying mutations in raft. Pulling `database::apply()` out of schema merging code will allow to batch changes to subsystems. Future generic code will first call `prepare()` on all implementations, then single `database::apply()` and then `update()` on all implementations, then on each shard it will call `commit()` for all implementations, without preemption so that the change is observed as atomic across all subsystems, and then `post_commit()`. Backport: no, it's a new feature Fixes: https://github.com/scylladb/scylladb/issues/19649 Closes scylladb/scylladb#20853 * github.com:scylladb/scylladb: storage_service: always wake up load balancer on update tablet metadata db: schema_applier: call destroy also when exception occurs db: replica: simplify seeding ERM during shema change db: remove cleanup from add_column_family db: abort on exception during schema commit phase db: make user defined types changes atomic replica: db: make keyspace schema changes atomic db: atomically apply changes to tables and views replica: make truncate_table_on_all_shards get whole schema from table_shards service: split update_tablet_metadata into two phases service: pull out update_tablet_metadata from migration_listener db: service: add store_service dependency to schema_applier service: simplify load_tablet_metadata and update_tablet_metadata db: don't perform move on tablet_hint reference replica: split add_column_family_and_make_directory into steps replica: db: split drop_table into steps db: don't move map references in merge_tables_and_views() db: introduce commit_on_shard function db: access types during schema merge via special storage replica: make non-preemptive keyspace create/update/delete functions 
public replica: split update keyspace into two phases replica: split creating keyspace into two functions db: rename create_keyspace_from_schema_partition db: decouple functions and aggregates schema change notification from merging code db: store functions and aggregates change batch in schema_applier db: decouple tables and views schema change notifications from merging code db: store tables and views schema diff in schema_applier db: decouple user type schema change notifications from types merging code service: unify keyspace notification functions arguments db: replica: decouple keyspace schema change notifications to a separate function db: add class encapsulating schema merging	2025-06-10 13:45:32 +02:00
Marcin Maliszkiewicz	21a5a3c01f	service: pull out update_tablet_metadata from migration_listener It's not a good usage as there is only one non-empty implementation. Also we need to change it further in the following commit which makes it incompatible with listener code.	2025-06-06 08:50:33 +02:00
Piotr Dulikowski	555925c66b	Merge 'generic_server: transport: improve stats counting and shedding' from Marcin Maliszkiewicz The patch removes connection advertising functions and moves the logic to constructors and destructors, providing a more robust way of counting connections. This change was also necessary to allow skipping the connection process function during shedding, as the active connections counter needs to be decremented. The patch doesn't fix any active bug, just improves the flow. Backport: none, it's a cosmetic change Closes scylladb/scylladb#23890 * github.com:scylladb/scylladb: generic_server: make shutdown() return void generic_server: skip connection processing logic after shedding the connection transport: generic_server: remove no longer used connection advertising code transport: move new connection trace logs into connection class ctor/dtor transport: move cql connections counting into connection class ctor/dtor	2025-05-29 12:49:58 +02:00
Marcin Maliszkiewicz	f7e5adaca3	transport: generic_server: remove no longer used connection advertising code	2025-05-27 19:31:09 +02:00
Andrzej Jackowski	086df24555	transport: implement SCYLLA_USE_METADATA_ID support Metadata id was introduced in CQLv5 to make metadata of prepared statement consistent between driver and database. This commit introduces a protocol extension that allows using the same mechanism in CQLv4. This change: - Introduce SCYLLA_USE_METADATA_ID protocol extension for CQLv4 - Introduce METADATA_CHANGED flag in RESULT. The flag comes directly from the CQLv5 binary protocol. In CQLv4, the bit was never used, so we assume it is safe to reuse it. - Implement handling of metadata_id and METADATA_CHANGED in RESULT rows - Implement returning metadata_id in RESULT prepared - Implement reading metadata_id from EXECUTE - Added description of SCYLLA_USE_METADATA_ID in documentation Metadata_id is wrapped in cql_metadata_id_wrapper because we need to distinguish the following situations: - Metadata_id is not supported by the protocol (e.g. CQLv4 without the extension is used) - Metadata_id is supported by the protocol but not set - e.g. PREPARE query is being handled: it doesn't contain metadata_id in the request but the reply (RESULT prepared) must contain metadata_id - Metadata_id is supported by the protocol and set, any number of bytes >= 0 is allowed, according to the CQLv5 protocol specification Fixes scylladb/scylladb#20860	2025-05-14 09:59:16 +02:00
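
The three situations listed in the SCYLLA_USE_METADATA_ID commit message above can be illustrated with a small tri-state value type. This is only a sketch under stated assumptions: `metadata_id_sketch` and its member names are hypothetical and are not the actual `cql_metadata_id_wrapper` from the ScyllaDB tree, whose interface is not shown in this log.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for illustration only; not ScyllaDB's cql_metadata_id_wrapper.
class metadata_id_sketch {
public:
    using bytes = std::vector<std::uint8_t>;

    // Situation 1: the protocol in use has no metadata_id at all
    // (e.g. CQLv4 without the extension).
    static metadata_id_sketch unsupported() {
        return metadata_id_sketch(false, std::nullopt);
    }

    // Situation 2: the protocol supports metadata_id, but the request
    // did not carry one (e.g. while handling PREPARE).
    static metadata_id_sketch supported_unset() {
        return metadata_id_sketch(true, std::nullopt);
    }

    // Situation 3: supported and set; an empty byte string is a valid id.
    static metadata_id_sketch supported_with(bytes id) {
        return metadata_id_sketch(true, std::move(id));
    }

    bool is_supported() const { return _supported; }
    bool has_value() const { return _id.has_value(); }

    // Only meaningful in situation 3.
    const bytes& value() const {
        assert(_id.has_value());
        return *_id;
    }

private:
    metadata_id_sketch(bool supported, std::optional<bytes> id)
        : _supported(supported), _id(std::move(id)) {}

    bool _supported;
    std::optional<bytes> _id;
};
```

Keeping "supported" separate from "has a value" is what lets a zero-length id remain distinguishable from an absent one, which a plain byte string could not express.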


239 Commits