scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 01:50:35 +00:00

Author	SHA1	Message	Date
sylwiaszunejko	75b3dbf7ea	transport: add support for setting custom payload A custom payload can now be added to response_message. If it is set, it will be sent to client and the custom_payload flag will be set. write_string_bytes_map method is added to response class and a missing custom_payload flag is added to cql_frame_flags.	2023-11-21 15:09:36 +01:00
Avi Kivity	42a1ced73b	cql3: result_set: switch cell data type from bytes_opt to managed_bytes_opt The expression system uses managed_bytes_opt for values, but result_set uses bytes_opt. This means that processing values from the result set in expressions requires a copy. Out of the two, managed_bytes_opt is the better choice, since it prevents large contiguous allocations for large blobs. So we switch result_set to use managed_bytes_opt. Users of the result_set API are adjusted. The db::function interface is not modified to limit churn; instead we convert the types on entry and exit. This will be adjusted in a following patch.	2023-05-07 17:17:36 +03:00
Michał Chojnowski	bf26a8c467	utils: redesign reusable_buffer Large contiguous buffers put large pressure on the allocator and are a common source of reactor stalls. Therefore, Scylla avoids their use, replacing it with fragmented buffers whenever possible. However, the use of large contiguous buffers is impossible to avoid when dealing with some external libraries (e.g. some compression libraries, like LZ4). Fortunately, calls to external libraries are synchronous, so we can minimize the allocator impact by reusing a single buffer between calls. An implementation of such a reusable buffer has two conflicting goals: to allocate as rarely as possible, and to waste as little memory as possible. The bigger the buffer, the more likely that it will be able to handle future requests without reallocation, but also the more memory it ties up. If request sizes are repetitive, the near-optimal solution is to simply resize the buffer up to match the biggest seen request, and never resize down. However, if we anticipate pathologically large requests, which are caused by an application/configuration bug and are never repeated again after they are fixed, we might want to resize down after such pathological requests stop, so that the memory they took isn't tied up forever. The current implementation of reusable buffers handles this by resizing down to 0 every 100'000 requests. This patch attempts to solve a few shortcomings of the current implementation. 1. Resizing to 0 is too aggressive. During regular operation, we will surely need to resize it back to the previous size again. If something is allocated in the hole left by the old buffer, this might cause a stall. We prefer to resize down only after pathological requests. 2. When resizing, the current implementation allocates the new buffer before freeing the old one. This increases allocator pressure for no reason. 3. When resizing up, the buffer is resized to exactly the requested size. That is, if the current size is 1MiB, following requests of 1MiB+1B and 1MiB+2B will both cause a resize. It's preferable to limit the set of possible sizes so that every reset doesn't tend to cause multiple resizes of almost the same size. The natural set of sizes is powers of 2, because that's what the underlying buddy allocator uses. No waste is caused by rounding up the allocation to a power of 2. 4. The interval of 100'000 uses is both too low and too arbitrary. This is up for discussion, but I think that it's preferable to base the dynamics of the buffer on time, rather than the number of uses. It's more predictable to humans. The implementation proposed in this patch addresses these as follows: 1. Instead of resizing down to 0, we resize to the biggest size seen in the last period. As long as at least one maximal (up to a power of 2) "normal" request appears each period, the buffer will never have to be resized. 2. The capacity of the buffer is always rounded up to the nearest power of 2. 3. The resize down period is no longer measured in number of requests but in real time. Additionally, since a shared buffer in asynchronous code is quite a footgun, some rudimentary refcounting is added to assert that only one reference to the buffer exists at a time, and that the buffer isn't downsized while a reference to it exists. Fixes #13437	2023-04-26 22:09:17 +02:00
Vlad Zolotarov	ae6724f155	transport: refactor CQL metrics This patch reorganizes and extends CQL related metrics. Before this patch we only had counters for specific CQL requests. However, many times we need to reason about the size of CQL queries: corresponding requests and response sizes. This patch adds corresponding metrics: - Arranges all 3 per-opcode statistics counters in a single struct. - Defines a vector of such structs for each CQL opcode. - Adjusts statistics updates accordingly - the code is much simpler now. - Removes old metrics that were accounting some CQL opcodes. - Adds new per-opcode metrics for requests number, request and response sizes: - New metrics are of a derived kind - rate() should be applied to them. - There are 3 new metrics names: - 'cql_requests_count' - 'cql_request_bytes' - 'cql_response_bytes' - New metrics have a per-opcode label - 'kind'. For example: A number of response bytes for an EXECUTE opcode on shard 0 looks as follows: scylla_transport_cql_response_bytes{kind="EXECUTE",shard="0"} Ref #13061 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <20230302154816.299721-1-vladz@scylladb.com>	2023-03-07 12:02:34 +02:00
Avi Kivity	2739ac66ed	treewide: drop cql_serialization_format Now that we don't accept cql protocol version 1 or 2, we can drop cql_serialization_format everywhere, except in the IDL (since it's part of the inter-node protocol). A few functions had duplicate versions, one with and one without a cql_serialization_format parameter. They are deduplicated. Care is taken that `partition_slice`, which communicates the cql_serialization_format across nodes, still presents a valid cql_serialization_format to other nodes when transmitting itself and rejects protocol 1 and 2 serialization format when receiving. The IDL is unchanged. One test checking the 16-bit serialization format is removed.	2023-01-03 19:54:13 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Rafael Ávila de Espíndola	80d969ce31	everywhere: Use uninitialized_string instead of sstring::initialized_later This is just a trivial wrapper over initialized_later when using sstring, but also works when std::string is used. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-03-10 13:17:49 -07:00
Rafael Ávila de Espíndola	caef2ef903	everywhere: Don't assume sstring::begin() and sstring::end() are pointers If we switch to using std::string we have to handle begin and end returning iterators. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-03-10 13:13:48 -07:00
Rafael Ávila de Espíndola	c2c44f4778	transport: Pass a string_view to cql_server::response::write_string With this we don't need to construct a sstring just to call write_string. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-02-28 08:36:27 -08:00
Calle Wilund	4ef940169f	Replace use of "ipv4_addr" with socket_address Allows the various sockets to use ipv6 address binding if so configured.	2019-07-08 14:13:09 +00:00
Avi Kivity	5f79ff0f54	transport: convert sprint() to format() sprint() recently became more strict, throwing on sprint("%s", 5). Replace with the more modern format(). Mechanically converted with https://github.com/avikivity/unsprint.	2018-11-01 13:16:17 +00:00
Vlad Zolotarov	6db90a2e63	tracing: store a query response size Add a new "response_size" column to system_traces.sessions and store a size of an uncompressed response for a traced query. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-08-03 12:29:36 -04:00
Paweł Dziepak	f95bb21d99	transport: extract compression buffers from response class Both compression and decompression code is going to reuse the same pair of reusable buffers.	2018-07-18 12:28:06 +01:00
Paweł Dziepak	24929fd2ce	transport: move response outside of cql_server class	2018-07-18 12:28:06 +01:00
Paweł Dziepak	54d5dc414d	transport: use cql3::result_set visiting interface	2018-06-25 09:21:47 +01:00
Paweł Dziepak	c0e7160625	transport: response: add write_int_placeholder() This allows the response writer to defer writing integers until later time. It will be used by lazy response generator which will know the number of rows in the response only after they are all written.	2018-06-25 09:21:47 +01:00
Paweł Dziepak	88aff8eda8	transport: steal response buffers and make send zero-copy Each response is sent only once, so we can safely steal its buffers and pass them to the output_stream using the zero-copy interface.	2018-06-25 09:21:47 +01:00
Paweł Dziepak	821e6683e3	transport: use reusable_buffer for compression Compression algorithms require us to linearise bytes_ostream. This may cause an excessive number of large allocations. Using reusable_buffers can avoid that.	2018-06-25 09:21:47 +01:00
Paweł Dziepak	a7c4d407ce	transport: response: use bytes_ostream std::vector<char> is not a very good container for incrementally building a response. It may cause excessive copies and allocations. If the response is large it will put more pressure on the memory allocator by requiring the buffer to be contiguous. We already have bytes_ostream which avoids all of these problems, so let's use it.	2018-06-25 09:22:43 +01:00
Paweł Dziepak	c04d38b76b	transport: drop response::make_message()	2018-06-25 09:22:35 +01:00
Paweł Dziepak	12f89299b2	transport: move response to a separate header There are some other translation units which right now are satisfied with the response being an incomplete type. This means that std::unique_ptr can't be used for it. Let's move the class declaration to a header that can be included where needed.	2018-06-25 09:21:47 +01:00
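
Commit 75b3dbf7ea mentions a write_string_bytes_map method and a custom_payload frame flag. As a rough illustration of what such a map looks like on the wire, here is a minimal sketch based on my reading of the CQL binary protocol v4 notation: a [bytes map] is a [short] count followed by pairs of [string] key and [bytes] value, all big-endian, and the custom payload flag is 0x04. The names put_short, put_int, encode_string_bytes_map and byte_vec are hypothetical and are not taken from the Scylla source; the real method writes into the response body rather than returning a vector.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using byte_vec = std::vector<std::uint8_t>;

// Frame header flag bit for "custom payload" (assumed from the CQL v4 spec).
constexpr std::uint8_t custom_payload_flag = 0x04;

// [short]: 2-byte unsigned integer, big-endian.
inline void put_short(byte_vec& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v & 0xff));
}

// [int]: 4-byte signed integer, big-endian.
inline void put_int(byte_vec& out, std::int32_t v) {
    const auto u = static_cast<std::uint32_t>(v);
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<std::uint8_t>((u >> shift) & 0xff));
    }
}

// [bytes map]: a [short] n, then n pairs of <[string] key, [bytes] value>.
// Hypothetical helper; sizes are assumed to fit in their length fields.
inline byte_vec encode_string_bytes_map(const std::map<std::string, byte_vec>& m) {
    byte_vec out;
    put_short(out, static_cast<std::uint16_t>(m.size()));
    for (const auto& [key, value] : m) {
        put_short(out, static_cast<std::uint16_t>(key.size()));  // [string] length
        out.insert(out.end(), key.begin(), key.end());           // [string] body
        put_int(out, static_cast<std::int32_t>(value.size()));   // [bytes] length
        out.insert(out.end(), value.begin(), value.end());       // [bytes] body
    }
    return out;
}
```

A sender following this sketch would set custom_payload_flag in the frame header flags whenever the map is present, which matches the behaviour the commit message describes.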

22 Commits
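
The sizing rules described in commit bf26a8c467 (round capacity up to a power of 2, and once per time period shrink to the biggest size seen in that period) can be sketched as a small policy class. This is only an illustration under stated assumptions: buffer_size_policy, round_up_pow2 and on_request are hypothetical names, the caller passes the current time explicitly, and the refcounting and actual allocation from the patch are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>

// Smallest power of 2 that is >= n (returns 1 for n == 0).
// Overflow for sizes above half the size_t range is not handled in this sketch.
inline std::size_t round_up_pow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

// Hypothetical sizing policy; it only decides capacities, it allocates nothing.
class buffer_size_policy {
    using clock = std::chrono::steady_clock;
    std::size_t _capacity = 0;    // current capacity: 0 or a power of 2
    std::size_t _period_max = 0;  // biggest rounded request in the current period
    clock::duration _period;
    clock::time_point _period_start;
public:
    buffer_size_policy(clock::duration period, clock::time_point now)
        : _period(period), _period_start(now) {}

    // Returns the capacity the buffer should have in order to serve `size` bytes.
    std::size_t on_request(std::size_t size, clock::time_point now) {
        if (now - _period_start >= _period) {
            // Period elapsed: shrink to the biggest size seen during it.
            _capacity = _period_max;
            _period_max = 0;
            _period_start = now;
        }
        const std::size_t rounded = round_up_pow2(size);
        _period_max = std::max(_period_max, rounded);
        _capacity = std::max(_capacity, rounded);  // grow only in power-of-2 steps
        return _capacity;
    }
};
```

With a 1 MiB capacity, requests of 1 MiB + 1 B and 1 MiB + 2 B trigger a single resize to 2 MiB instead of two, and a one-off pathological request stops pinning memory once a full period passes without another request of that size.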