scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 20:27:03 +00:00

Author	SHA1	Message	Date
Kefu Chai	0b69a1badc	transport: cast unaligned<T> to T for formatting it in fmt v10, it does not cast unaligned<T> to T when formatting it, instead it insists on finding a matched fmt::formatter<> specialization for it. that's why we have FTBFS with fmt v10 when printing these packed<T> variables with fmtlib v10. in this change, we just cast them to the underlying types before formatting them. because seastar::unaligned<T> does not provide a method for accessing the raw value, neither does it provide a type alias of the type of the underlying raw value, we have to cast to the type without deducing it from the printed value. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16167	2023-11-27 15:26:13 +02:00
Kefu Chai	a9c1a435ec	result_message: add formatter for result_message::rows before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define a formatter for `cql_transport::messages::result_message::rows` Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16143	2023-11-23 11:12:55 +02:00
Tomasz Grabiec	b06a0078fb	Merge 'Support for sending tablet info to the drivers' from Sylwia Szunejko There is a need for sending tablet info to the drivers so they can be tablet aware. For the best performance we want to get this info lazily only when it is needed. The info is sent when the driver asks about information that a specific tablet contains and the request is directed to the wrong node/shard, so the driver can use that information for every subsequent query. If the query is sent to the wrong node/shard, we want to send the RESULT message with additional information about the tablet (replicas and token range) in custom_payload. Mechanism for sending custom_payload added. Sending custom_payload tested using three node cluster and cqlsh queries. I used RF=1 so choosing wrong node was testable. I also manually tested it with the python-driver and confirmed that the tablet info can be deserialized properly. Automatic tests added. Closes scylladb/scylladb#15410 * github.com:scylladb/scylladb: docs: add documentation about sending tablet info to protocol extensions Add tests for sending tablet info cql3: send tablet if wrong node/shard is used during modification statement cql3: send tablet if wrong node/shard is used during select statement locator: add function to check locality locator: add function to check if host is local transport: add function to add tablet info to the result_message transport: add support for setting custom payload	2023-11-22 17:44:07 +02:00
sylwiaszunejko	93420353f4	transport: add function to add tablet info to the result_message	2023-11-21 15:15:20 +01:00
sylwiaszunejko	75b3dbf7ea	transport: add support for setting custom payload A custom payload can now be added to response_message. If it is set, it will be sent to client and the custom_payload flag will be set. write_string_bytes_map method is added to response class and a missing custom_payload flag is added to cql_frame_flags.	2023-11-21 15:09:36 +01:00
Kefu Chai	15bfa09454	treewide: do not mark return value const if this has no effect this change is a cleanup. marking a by-value return type `const` has no effect. these `const` specifiers are useless, so let's drop them. and, if we compile the tree with `-Wignored-qualifiers`, the compiler would warn like: ``` /home/kefu/dev/scylladb/schema/schema.hh:245:5: error: 'const' type qualifier on return type has no effect [-Werror,-Wignored-qualifiers] 245 | const index_metadata_kind kind() const; | ^~~~~ ``` so this change also silences the above warnings. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-11-17 17:46:19 +08:00
Kefu Chai	efd65aebb2	build: cmake: add check-header target to have feature parity with `configure.py`. we won't need this once we migrate to C++20 modules. but before that day comes, we need to stick with C++ headers. we generate a rule for each .hh file to create a corresponding .cc and then compile it, in order to verify that the header is self-contained. so the number of rules is quite large. to avoid the unnecessary overhead, the check-header target is enabled only if the `Scylla_CHECK_HEADERS` option is enabled. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15913	2023-11-13 10:27:06 +02:00
Botond Dénes	844a0e426f	Merge 'Mark counters with skip when empty' from Amnon Heiman This series marks multiple high cardinality counters with the skip_when_empty flag. After this patch the following counters will not be reported if they were never used: ``` scylla_transport_cql_errors_total scylla_storage_proxy_coordinator_reads_local_node scylla_storage_proxy_coordinator_completed_reads_local_node scylla_transport_cql_errors_total ``` Also marked are the CAS-related CQL operations. Fixes #12751 Closes scylladb/scylladb#13558 * github.com:scylladb/scylladb: service/storage_proxy.cc: mark counters with skip_when_empty cql3/query_processor.cc: mark cas related metrics with skip_when_empty transport/server.cc: mark metric counter with skip_when_empty	2023-09-19 15:02:39 +03:00
Avi Kivity	ab6988c52f	Merge "auth: do not grant permissions to creator without actually creating" from Wojciech Mitros Currently, when creating the table, permissions may be mistakenly granted to the user even if the table is already existing. This can happen in two cases: The query has a IF NOT EXISTS clause - as a result no exception is thrown after encountering the existing table, and the permission granting is not prevented. The query is handled by a non-zero shard - as a result we accept the query with a bounce_to_shard result_message, again without preventing the granting of permissions. These two cases are now avoided by checking the result_message generated when handling the query - now we only grant permissions when the query resulted in a schema_change message. Additionally, a test is added that reproduces both of the mentioned cases. CVE-2023-33972 Fixes #15467. * 'no-grant-on-no-create' of github.com:scylladb/scylladb-ghsa-ww5v-p45p-3vhq: auth: do not grant permissions to creator without actually creating transport: add is_schema_change() method to result_message	2023-09-18 21:47:28 +03:00
Pavel Emelyanov	b42391bfbe	transport: Shutdown server on disablebinary ... and do the real "sharded::stop" in the background. On node shutdown it needs to pick up all dangling background stopping. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:37:48 +03:00
Pavel Emelyanov	bc2d44994a	transport/controller: Coroutinize do_stop_server() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:32:07 +03:00
Pavel Emelyanov	7701aa0789	transport/controller: Coroutinize stop_server() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:32:07 +03:00
Nadav Har'El	548386a0bb	treewide: reduce include of cql_statement.hh ClangBuildAnalyzer reports cql3/cql_statement.hh as being one of the most expensive header files in the project - being included (mostly indirectly) in 129 source files, and costing a total of 844 CPU seconds of compilation. This patch is an attempt, only partially successful, to reduce the number of times that cql_statement.hh is included. It succeeds in lowering the number 129 to 99, but not less :-( One of the biggest difficulties in reducing it further is that query_processor.hh includes a lot of templated code, which needs stuff from cql_statement.hh. The solution should be to un-template the functions in query_processor.hh and move them from the header to a source file, but this is beyond the scope of this patch and query_processor.hh appears problematic in other respects as well. Unfortunately the compilation speedup by this patch is negligible (the `du -bc build/dev/*/.o` metric shows less than 0.01% reduction). Beyond the fact that this patch only removes 30% of the inclusions of this header, it appears that most of the source files that no longer include cql_statement.hh after this patch, included anyway many of the other headers that cql_statement.hh included, so the saving is minimal. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #15212	2023-09-08 13:23:50 +03:00
Amnon Heiman	1abcd4bb11	transport/server.cc: mark metric counter with skip_when_empty This patch marks scylla_transport_cql_errors_total with the skip_when_empty flag. It reduces the metrics overhead for error types that are never reported. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2023-08-23 09:30:35 -04:00
Gleb Natapov	4ffc39d885	cql3: Extend the scope of group0_guard during DDL statement execution Currently we hold group0_guard only during DDL statement's execute() function, but unfortunately some statements access underlying schema state also during check_access() and validate() calls which are called by the query_processor before it calls execute. We need to cover those calls with group0_guard as well and also move retry loop up. This patch does it by introducing new function to cql_statement class take_guard(). Schema altering statements return group0 guard while others do not return any guard. Query processor takes this guard at the beginning of a statement execution and retries if service::group0_concurrent_modification is thrown. The guard is passed to the execute in query_state structure. Fixes: #13942 Message-ID: <ZNsynXayKim2XAFr@scylladb.com>	2023-08-17 15:52:48 +03:00
Avi Kivity	d57a951d48	Revert "cql3: Extend the scope of group0_guard during DDL statement execution" This reverts commit `70b5360a73`. It generates a failure in group0_test .test_concurrent_group0_modifications in debug mode with about 4% probability. Fixes #15050	2023-08-15 00:26:45 +03:00
Gleb Natapov	70b5360a73	cql3: Extend the scope of group0_guard during DDL statement execution Currently we hold group0_guard only during DDL statement's execute() function, but unfortunately some statements access underlying schema state also during check_access() and validate() calls which are called by the query_processor before it calls execute. We need to cover those calls with group0_guard as well and also move retry loop up. This patch does it by introducing new function to cql_statement class take_guard(). Schema altering statements return group0 guard while others do not return any guard. Query processor takes this guard at the beginning of a statement execution and retries if service::group0_concurrent_modification is thrown. The guard is passed to the execute in query_state structure. Fixes: #13942 Message-ID: <ZNSWF/cHuvcd+g1t@scylladb.com>	2023-08-13 14:19:39 +03:00
Kefu Chai	565f5c7380	transport: correct format string when printing logging message we print the stream id in the logging messages, but in this case, we forgot to pass `stream` to `log::debug()`. but the placeholder for `stream` was added. if the underlying fmtlib actually formats the argument with the format string, it would throw. fortunately, we don't enable debug level logging often, guess that's why we haven't spotted this issue yet. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14620	2023-07-13 11:21:43 +03:00
Calle Wilund	20e9619bb1	transport: Try to do early, transport based auth if possible Bypassing the need for an AUTH message+response. I.e. do auth _without_ client having login specified.	2023-06-26 15:00:21 +00:00
Wojciech Mitros	7883a88abd	transport: add is_schema_change() method to result_message In the next patch, we will want to observe when the result message is a schema change and handle it differently than when it is not. This patch adds a helper method for that, which should be more readable than a dynamic_pointer_cast and a comparison with nullptr.	2023-06-26 12:22:14 +02:00
Kefu Chai	c3d91f5190	tracing: drop trace(.., std::string&&) overload this change is a follow-up of `4f5fcb02fd`, the goal is to avoid the programming oversights like ```c++ trace(trace_ptr, "foo {} with {} but {} is {}"); ``` as `trace(const trace_state_ptr& p, const std::string& msg)` is a better match than the templated one, i.e., `trace(const trace_state_ptr& p, fmt::format_string<T...> fmt, T&&... args)`. so we cannot detect this with the compile-time format checking. so let's just drop this overload, and update its callers to use the other overload. The change was suggested by Avi. the example also came from him. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14188	2023-06-10 20:09:35 +03:00
Avi Kivity	26c8470f65	treewide: use #include <seastar/...> for seastar headers We treat Seastar as an external library, so fix the few places that didn't do so to use angle brackets. Closes #14037	2023-06-06 08:36:09 +03:00
Avi Kivity	42a1ced73b	cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt The expression system uses managed_bytes_opt for values, but result_set uses bytes_opt. This means that processing values from the result set in expressions requires a copy. Out of the two, managed_bytes_opt is the better choice, since it prevents large contiguous allocations for large blobs. So we switch result_set to use managed_bytes_opt. Users of the result_set API are adjusted. The db::function interface is not modified to limit churn; instead we convert the types on entry and exit. This will be adjusted in a following patch.	2023-05-07 17:17:36 +03:00
Kefu Chai	b76877fd99	transport: capture reference to temp value by value `current_scheduling_group()` returns a temporary value, and `name()` returns a reference, so we cannot capture the return value by reference, and use the reference after this expression is evaluated. this would cause undefined behavior. so let's just capture it by value. this change also silences the following warning from GCC-13: ``` /home/kefu/dev/scylladb/transport/server.cc:204:11: error: possibly dangling reference to a temporary [-Werror=dangling-reference] 204 | auto& cur_sg_name = current_scheduling_group().name(); | ^~~~~~~~~~~ /home/kefu/dev/scylladb/transport/server.cc:204:56: note: the temporary was destroyed at the end of the full expression ‘seastar::current_scheduling_group().seastar::scheduling_group::name()’ 204 | auto& cur_sg_name = current_scheduling_group().name(); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ ``` Fixes #13719 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13724	2023-05-01 22:40:36 +03:00
Kamil Braun	30cc07b40d	Merge 'Introduce tablets' from Tomasz Grabiec This PR introduces an experimental feature called "tablets". Tablets are a way to distribute data in the cluster, which is an alternative to the current vnode-based replication. Vnode-based replication strategy tries to evenly distribute the global token space shared by all tables among nodes and shards. With tablets, the aim is to start from a different side. Divide resources of replica-shard into tablets, with a goal of having a fixed target tablet size, and then assign those tablets to serve fragments of tables (also called tablets). This will allow us to balance the load in a more flexible manner, by moving individual tablets around. Also, unlike with vnode ranges, tablet replicas live on a particular shard on a given node, which will allow us to bind raft groups to tablets. Those goals are not yet achieved with this PR, but it lays the ground for this. Things achieved in this PR: - You can start a cluster and create a keyspace whose tables will use tablet-based replication. This is done by setting `initial_tablets` option: ``` CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 3, 'initial_tablets': 8}; ``` All tables created in such a keyspace will be tablet-based. Tablet-based replication is a trait, not a separate replication strategy. Tablets don't change the spirit of replication strategy, it just alters the way in which data ownership is managed. In theory, we could use it for other strategies as well like EverywhereReplicationStrategy. Currently, only NetworkTopologyStrategy is augmented to support tablets. 
- You can create and drop tablet-based tables (no DDL language changes) - DML / DQL work with tablet-based tables Replicas for tablet-based tables are chosen from tablet metadata instead of token metadata Things which are not yet implemented: - handling of views, indexes, CDC created on tablet-based tables - sharding is done using the old method, it ignores the shard allocated in tablet metadata - node operations (topology changes, repair, rebuild) are not handling tablet-based tables - not integrated with compaction groups - tablet allocator piggy-backs on tokens to choose replicas. Eventually we want to allocate based on current load, not statically Closes #13387 * github.com:scylladb/scylladb: test: topology: Introduce test_tablets.py raft: Introduce 'raft_server_force_snapshot' error injection locator: network_topology_strategy: Support tablet replication service: Introduce tablet_allocator locator: Introduce tablet_aware_replication_strategy locator: Extract maybe_remove_node_being_replaced() dht: token_metadata: Introduce get_my_id() migration_manager: Send tablet metadata as part of schema pull storage_service: Load tablet metadata when reloading topology state storage_service: Load tablet metadata on boot and from group0 changes db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata() migration_notifier: Introduce before_drop_keyspace() migration_manager: Make prepare_keyspace_drop_announcement() return a future<> test: perf: Introduce perf-tablets test: Introduce tablets_test test: lib: Do not override table id in create_table() utils, tablets: Introduce external_memory_usage() db: tablets: Add printers db: tablets: Add persistence layer dht: Use last_token_of_compaction_group() in split_token_range_msb() locator: Introduce tablet_metadata dht: Introduce first_token() dht: Introduce next_token() storage_proxy: Improve trace-level logging locator: token_metadata: Fix confusing comment on ring_range() 
dht, storage_proxy: Abstract token space splitting Revert "query_ranges_to_vnodes_generator: fix for exclusive boundaries" db: Exclude keyspace with per-table replication in get_non_local_strategy_keyspaces_erms() db: Introduce get_non_local_vnode_based_strategy_keyspaces() service: storage_proxy: Avoid copying keyspace name in write handler locator: Introduce per-table replication strategy treewide: Use replication_strategy_ptr as a shorter name for abstract_replication_strategy::ptr_type locator: Introduce effective_replication_map locator: Rename effective_replication_map to vnode_effective_replication_map locator: effective_replication_map: Abstract get_pending_endpoints() db: Propagate feature_service to abstract_replication_strategy::validate_options() db: config: Introduce experimental "TABLETS" feature db: Log replication strategy for debugging purposes db: Log full exception on error in do_parse_schema_tables() db: keyspace: Remove non-const replication strategy getter config: Reformat	2023-04-27 09:40:18 +02:00
Botond Dénes	3e92bcaa20	Merge 'utils: redesign reusable_buffer' from Michał Chojnowski Common compression libraries work on contiguous buffers. Contiguous buffers are a problem for the allocator. However, as long as they are short-lived, we can avoid the expensive allocations by reusing buffers across tasks. This idea is already applied to the compression of CQL frames, but with some deficiencies. `utils: redesign reusable_buffer` attempts to improve upon it in a few ways. See its commit message for an extended discussion. Compression buffer reuse also happens in the zstd SSTable compressor, but the implementation is misguided. Every `zstd_processor` instance reuses a buffer, but each instance has its own buffer. This is very bad, because a healthy database might have thousands of concurrent instances (because there is one for each sstable reader). Together, the buffers might require gigabytes of memory, and the reuse actually increases memory pressure significantly, instead of reducing it. `zstd: share buffers between compressor instances` aims to improve that by letting a single buffer be shared across all instances on a shard. Closes #13324 * github.com:scylladb/scylladb: zstd: share buffers between compressor instances utils: redesign reusable_buffer	2023-04-27 09:09:09 +03:00
Michał Chojnowski	bf26a8c467	utils: redesign reusable_buffer Large contiguous buffers put large pressure on the allocator and are a common source of reactor stalls. Therefore, Scylla avoids their use, replacing it with fragmented buffers whenever possible. However, the use of large contiguous buffers is impossible to avoid when dealing with some external libraries (e.g. some compression libraries, like LZ4). Fortunately, calls to external libraries are synchronous, so we can minimize the allocator impact by reusing a single buffer between calls. An implementation of such a reusable buffer has two conflicting goals: to allocate as rarely as possible, and to waste as little memory as possible. The bigger the buffer, the more likely that it will be able to handle future requests without reallocation, but also the more memory it ties up. If request sizes are repetitive, the near-optimal solution is to simply resize the buffer up to match the biggest seen request, and never resize down. However, if we anticipate pathologically large requests, which are caused by an application/configuration bug and are never repeated again after they are fixed, we might want to resize down after such pathological requests stop, so that the memory they took isn't tied up forever. The current implementation of reusable buffers handles this by resizing down to 0 every 100'000 requests. This patch attempts to solve a few shortcomings of the current implementation. 1. Resizing to 0 is too aggressive. During regular operation, we will surely need to resize it back to the previous size again. If something is allocated in the hole left by the old buffer, this might cause a stall. We prefer to resize down only after pathological requests. 2. When resizing, the current implementation allocates the new buffer before freeing the old one. This increases allocator pressure for no reason. 3. When resizing up, the buffer is resized to exactly the requested size.
That is, if the current size is 1MiB, following requests of 1MiB+1B and 1MiB+2B will both cause a resize. It's preferable to limit the set of possible sizes so that every reset doesn't tend to cause multiple resizes of almost the same size. The natural set of sizes is powers of 2, because that's what the underlying buddy allocator uses. No waste is caused by rounding up the allocation to a power of 2. 4. The interval of 100'000 uses is both too low and too arbitrary. This is up for discussion, but I think that it's preferable to base the dynamics of the buffer on time, rather than the number of uses. It's more predictable to humans. The implementation proposed in this patch addresses these as follows: 1. Instead of resizing down to 0, we resize to the biggest size seen in the last period. As long as at least one maximal (up to a power of 2) "normal" request appears each period, the buffer will never have to be resized. 2. The capacity of the buffer is always rounded up to the nearest power of 2. 3. The resize down period is no longer measured in number of requests but in real time. Additionally, since a shared buffer in asynchronous code is quite a footgun, some rudimentary refcounting is added to assert that only one reference to the buffer exists at a time, and that the buffer isn't downsized while a reference to it exists. Fixes #13437	2023-04-26 22:09:17 +02:00
Tomasz Grabiec	41e69836fd	db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata()	2023-04-24 10:49:37 +02:00
Kefu Chai	ebf5e138e8	redis,thrift,transport: make timeout_config live-updateable * timeout_config - add `updated_timeout_config` which represents always-updated options backed by `utils::updateable_value<>`. this class is used by servers which need to access the latest timeout related options. the existing `timeout_config` is more like a snapshot of the `updated_timeout_config`. it is used in the use case where we don't need the most updated options or we update the options manually on demand. * redis, thrift, transport: s/timeout_config/updated_timeout_config/ when appropriate. use the improved version of timeout_config where we need to have access to the most-updated version of the timeout options. Fixes #10172 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:17:45 +08:00
Kefu Chai	1cc28679bc	transport: mark cql_server::timeout_config() const this function returns a const reference to member variable, so we can mark it with the `const` specifier for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	d72ab78ffd	transport: drop unused member function since `cql_server::connection::timeout_config()` is used nowhere, let's just drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	c642ca9e73	redis,thrift,transport: initialize _config with std::move(config) instead of copying the `config` parameter, move away from it. this change also prepares for a non-copyable config. if the class of `config` is not copyable, we will not be able to initialize the member variable by copying from the given `config` parameter. after the live-updateable config change, the `_config` member variable will contain instances of utils::observer<>, which is not copyable, but is move-constructible, hence in this change, we just move away from the given `config`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	e0ac2eb770	redis,thrift,transport: pass config via sharded_parameter * pass config via sharded_parameter * initialize config using designated initializer this change paves the road to servers with live-updateable timeout options. before this change, the servers initialize a domain specific combo config, like `redis_server_config`, with the same instance of a timeout_config, and pass the combo config as a ctor parameter to construct each sharded service instance. but this design assumes the value semantics of the config class, say, it should be copyable. but if we want to use utils::updateable_value<> to get updated option values, we would have to postpone the instantiation of the config until the sharded service is about to be initialized. so, in this change, instead of taking a domain specific config created beforehand, all services constructed with a `timeout_config` will take a `sharded_parameter()` for creating the config. also, take this opportunity to initialize the config using designated initializer. for two reasons: * less repetition this way. we don't have to repeat the variable name of the config being initialized for each member variable. * prepare for some member variables which do not have a default constructor. this applies to the timeout_config's updater which will not have a default constructor, as it should be initialized by db::config and a reference to the timeout_config to be updated. we will update the `timeout_config` side in a follow-up commit. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:00 +08:00
Botond Dénes	19560419d2	Merge 'treewide: improve compatibility with gcc 13' from Avi Kivity An assortment of patches that reduce our incompatibilities with the upcoming gcc 13. Closes #13243 * github.com:scylladb/scylladb: transport: correctly format unknown opcode treewide: catch by reference test: raft: avoid confusing string compare utils, types, test: extract lexicographical compare utilities test: raft: fsm_test: disambiguate raft::configuration construction test: reader_concurrency_semaphore_test: handle all enum values repair: fix signed/unsigned compare repair: fix incorrect signed/unsigned compare treewide: avoid unused variables in if statements keys: disambiguate construction from initializer_list<bytes> cql3: expr: fix serialize_listlike() reference-to-temporary with gcc compaction: error on invalid scrub type treewide: prevent redefining names api: task_manager: fix signed/unsigned compare alternator: streams: fix signed/unsigned comparison test: fix some mismatched signed/unsigned comparisons	2023-03-24 15:16:05 +02:00
Vlad Zolotarov	f94bbc5b34	transport: add per-scheduling-group CQL opcode-specific metrics This patch extends a previous patch that added these metrics globally: - cql_requests_count - cql_request_bytes - cql_response_bytes This patch adds a "scheduling_group_name" label to these metrics and changes corresponding counters to be accounted on a per-scheduling-group level. As a bonus this patch also marks all 3 metrics as 'skip_when_empty'. Ref #13061 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <20230321201412.3004845-1-vladz@scylladb.com>	2023-03-22 13:27:48 +02:00
Avi Kivity	19810cfc5e	transport: correctly format unknown opcode gcc allows an enum to contain values outside its members. For extra safety, as this can be user visible, format the unknown opcode and return it.	2023-03-21 15:43:00 +02:00
Avi Kivity	e75009cd49	treewide: catch by reference gcc rightly warns about capturing by value, so capture by reference.	2023-03-21 15:43:00 +02:00
Kefu Chai	21a7c439bb	build: cmake: find Snappy before using it Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 22:53:42 +08:00
Vlad Zolotarov	ae6724f155	transport: refactor CQL metrics This patch reorganizes and extends CQL related metrics. Before this patch we only had counters for specific CQL requests. However, many times we need to reason about the size of CQL queries: corresponding requests and response sizes. This patch adds corresponding metrics: - Arranges all 3 per-opcode statistics counters in a single struct. - Defines a vector of such structs for each CQL opcode. - Adjusts statistics updates accordingly - the code is much simpler now. - Removes old metrics that were accounting some CQL opcodes. - Adds new per-opcode metrics for requests number, request and response sizes: - New metrics are of a derived kind - rate() should be applied to them. - There are 3 new metrics names: - 'cql_requests_count' - 'cql_request_bytes' - 'cql_response_bytes' - New metrics have a per-opcode label - 'kind'. For example: A number of response bytes for an EXECUTE opcode on shard 0 looks as follows: scylla_transport_cql_response_bytes{kind="EXECUTE",shard="0"} Ref #13061 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <20230302154816.299721-1-vladz@scylladb.com>	2023-03-07 12:02:34 +02:00
Kefu Chai	563fbb2d11	build: cmake: extract more subsystem out into its own CMakeLists.txt namely, cdc, compaction, dht, gms, lang, locator, mutation_writer, raft, readers, replica, service, tools, tracing and transport. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 10:15:25 +08:00
Kefu Chai	412953fdd5	compress, transport: do not detect LZ4_compress_default() `LZ4_compress_default()` was introduced in liblz4 v1.7.3, despite that the release note (https://github.com/lz4/lz4/releases/tag/v1.7.3) of v1.7.3 didn't mention this. if we check the commit which added this API, we can find all releases including it: see ``` $ git tag --contains 1b17bf2ab8cf66dd2b740eca376e2d46f7ad7041 lz4-r130 r129 r130 r131 rc129v0 v1.7.3 v1.7.4 v1.7.4.2 v1.7.5 v1.8.0 v1.8.1 v1.8.1.2 v1.8.2 v1.8.3 v1.9.0 v1.9.1 v1.9.2 v1.9.3 v1.9.4 ``` and v1.7.3 was released in Nov 17, 2016. some popular distros releases also package new enough liblz4: - fedora 35 ships lz4-devel 1.9.3, - CentOS 7 ships lz4-devel 1.8.3 - debian 10 ships liblz4-dev 1.8.3 - ubuntu 18.04 ships liblz4-dev r131 so, in this change, we drop the support of liblz4 < 1.7.3 for better code readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12971	2023-02-23 14:39:20 +02:00
Kefu Chai	0cb842797a	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-15 22:57:18 +02:00
Petr Gusev	3263523b54	transport server: fix "request size too large" handling Calling _read_buf.close() doesn't imply eof(), some data may have already been read into kernel or client buffers and will be returned next time read() is called. When the _server._max_request_size limit was exceeded and the _read_buf was closed, the process_request method finished and we started processing the next request in connection::process. The unread data from _read_buf was treated as the header of the next request frame, resulting in "Invalid or unsupported protocol version" error. The existing test_shed_too_large_request was adjusted. It was originally written with the assumption that the data of a large query would simply be dropped from the socket and the connection could be used to handle the next requests. This behaviour was changed in scylladb#8800, now the connection is closed on the Scylla side and can no longer be used. To check there are no errors in this case, we use Scylla metrics, getting them from the Scylla Prometheus API.	2023-02-08 00:07:08 +04:00
Petr Gusev	0904f98ebf	transport server: log failed requests with debug level These logs can be helpful for debugging, e.g. if an error was not handled correctly by the client driver, or another error occurred while handling it.	2023-02-08 00:07:08 +04:00
Petr Gusev	a4cf509c3d	transport server: fix unexpected server errors handling If request processing ended with an error, it is worth sending the error to the client through make_error/write_response. Previously in this case we just wrote a message to the log and didn't handle the client connection in any way. As a result, the only thing the client got in this case was a timeout error. A new test_batch_with_error is added. It is quite difficult to reproduce the error condition in a test, so we use error injection instead. Passing injection_key in the body of the request ensures that the exception will be thrown only for this test request and will not affect other requests that the driver may send in the background. Closes: scylladb#12104	2023-02-08 00:07:02 +04:00
Petr Gusev	bd80a449d5	transport server: log client errors with debug level Ideally, these errors should be transparently delivered to the client, but in practice, due to various flaws/bugs in scylla and/or the driver, they can be lost, which enormously complicates troubleshooting. const socket_address& get_remote_address() is needed for its convenient conversion to string, which includes ip and port.	2023-02-07 13:53:38 +04:00
Avi Kivity	0b418fa7cf	cql3, transport, tests: remove "unset" from value type system The CQL binary protocol introduced "unset" values in version 4 of the protocol. Unset values can be bound to variables, which cause certain CQL fragments to be skipped. For example, the fragment `SET a = :var` will not change the value of `a` if `:var` is bound to an unset value. Unsets, however, are very limited in where they can appear. They can only appear at the top-level of an expression, and any computation done with them is invalid. For example, `SET list_column = [3, :var]` is invalid if `:var` is bound to unset. This causes the code to be littered with checks for unset, and there are plenty of tests dedicated to catching unsets. However, a simpler way is possible - prevent the infiltration of unsets at the point of entry (when evaluating a bind variable expression), and introduce guards to check for the few cases where unsets are allowed. This is what this long patch does. It performs the following: (general) 1. unset is removed from the possible values of cql3::raw_value and cql3::raw_value_view. (external->cql3) 2. query_options is fortified with a vector of booleans, unset_bind_variable_vector, where each boolean corresponds to a bind variable index and is true when it is unset. 3. To avoid churn, two compatibility structs are introduced: cql3::raw_value{,_view}_vector_with_unset, which can be constructed from a std::vector<raw_value{,_view}>, which is what most callers have. They can also be constructed with explicit unset vectors, for the few cases they are needed. (cql3->variables) 4. query_options::get_value_at() now throws if the requested bind variable is unset. This replaces all the throwing checks in expression evaluation and statement execution, which are removed. 5. A new query_options::is_unset() is added for the users that can tolerate unset; though it is not used directly. 6. A new cql3::unset_operation_guard class guards against unsets. 
It accepts an expression, and can be queried whether an unset is present. Two conditions are checked: the expression must be a singleton bind variable, and at runtime it must be bound to an unset value. 7. The modification_statement operations are split into two, via two new subclasses of cql3::operation. cql3::operation_no_unset_support ignores unsets completely. cql3::operation_skip_if_unset checks if an operand is unset (luckily all operations have at most one operand that tolerates unset) and applies unset_operation_guard to it. 8. The various sites that accept expressions or operations are modified to check for should_skip_operation(). These are the loops around operations in update_statement and delete_statement, and the checks for unset in attributes (LIMIT and PER PARTITION LIMIT) (tests) 9. Many unset tests are removed. It's now impossible to enter an unset value into the expression evaluation machinery (there's just no unset value), so it's impossible to test for it. 10. Other unset tests now have to be invoked via bind variables, since there's no way to create an unset cql3::expr::constant. 11. Many tests have their exception message match strings relaxed. Since unsets are now checked very early, we don't know the context where they happen. It would be possible to reintroduce it (by adding a format string parameter to cql3::unset_operation_guard), but it seems not to be worth the effort. Usage of unsets is rare, and it is explicit (at least with the Python driver, an unset cannot be introduced by omission). I tried as an alternative to wrap cql3::raw_value{,_view} (that doesn't recognize unsets) with cql3::maybe_unset_value (that does), but that caused huge amounts of churn, so I abandoned that in favor of the current approach. Closes #12517	2023-01-16 21:10:56 +02:00
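The approach in the commit above (keep "unset" out of the value type, track it in a parallel vector of booleans, throw on access, and let a guard decide whether an operation is skipped) can be illustrated with a small sketch. Everything here is a simplified stand-in, not Scylla's actual code: `raw_value`, `query_options_sketch`, and `unset_guard_sketch` are hypothetical types, and only the method names `get_value_at()` and `is_unset()` are taken from the message.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a raw value: nullopt models NULL; there is no "unset" state.
using raw_value = std::optional<std::string>;

// Hypothetical simplification of query_options: values plus a parallel
// vector of flags, one per bind variable index, true when unset.
class query_options_sketch {
    std::vector<raw_value> _values;
    std::vector<bool> _unset;
public:
    query_options_sketch(std::vector<raw_value> values, std::vector<bool> unset)
        : _values(std::move(values)), _unset(std::move(unset)) {
        assert(_values.size() == _unset.size());
    }
    bool is_unset(std::size_t i) const { return _unset.at(i); }
    // Throws if the bind variable is unset, so evaluation code needs no checks.
    const raw_value& get_value_at(std::size_t i) const {
        if (_unset.at(i)) {
            throw std::invalid_argument("unset bind variable used as a value");
        }
        return _values.at(i);
    }
};

// Hypothetical guard: bind_index is set only when the guarded operand is a
// lone bind variable; any other expression can never be skipped.
struct unset_guard_sketch {
    std::optional<std::size_t> bind_index;
    bool should_skip(const query_options_sketch& qo) const {
        return bind_index.has_value() && qo.is_unset(*bind_index);
    }
};
```

With this shape, a statement loop would ask the guard first and only evaluate the operand when `should_skip()` returns false, which is why the throwing path in `get_value_at()` is reached only on misuse.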
Avi Kivity	7a8a442c1e	transport: drop some dead code around v1 and v2 protocols In `424dbf43f` ("transport: drop cql protocol versions 1 and 2"), we dropped support for protocols 1 and 2, but some code remains that checks for those versions. It is now dead code, so remove it. Closes #12497	2023-01-12 12:52:19 +02:00
Avi Kivity	2739ac66ed	treewide: drop cql_serialization_format Now that we don't accept cql protocol version 1 or 2, we can drop cql_serialization_format everywhere, except when in the IDL (since it's part of the inter-node protocol). A few functions had duplicate versions, one with and one without a cql_serialization_format parameter. They are deduplicated. Care is taken that `partition_slice`, which communicates the cql_serialization_format across nodes, still presents a valid cql_serialization_format to other nodes when transmitting itself and rejects protocol 1 and 2 serialization format when receiving. The IDL is unchanged. One test checking the 16-bit serialization format is removed.	2023-01-03 19:54:13 +02:00
Avi Kivity	424dbf43f3	transport: drop cql protocol versions 1 and 2 Version 3 was introduced in 2014 (Cassandra 2.1) and was supported in the very first version of Scylla (`2a7da21481` "CQL binary protocol"). Cassandra 3.0 (2015) dropped protocols 1 and 2 as well. It's safe enough to drop it now, 9 years after introduction of v3 and 7 years after Cassandra stopped supporting it. Dropping it allows dropping cql_serialization_format, which causes quite a lot of pain, and is probably broken. This will be dropped in the following patch.	2023-01-03 19:47:49 +02:00


555 Commits