scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-28 18:50:53 +00:00

Author	SHA1	Message	Date
Avi Kivity	cccd2e7fa7	Merge 'Generalize sstables TOC file reading' from Pavel Emelyanov TOC file is read and parsed in several places in the code. All do it differently, and it's worth generalizing this place. To make it happen also fix the S3 readable_file so that it could be used inside file_input_stream. Closes scylladb/scylladb#16175 * github.com:scylladb/scylladb: sstable: Generalize toc file read and parse s3/client: Don't GET object contents on out-of-bound reads s3/client: Cache stats on readable_file	2023-11-29 19:18:31 +02:00
Kefu Chai	c40da20092	utils/pretty_printers: stop using undocumented fmt api format_parse_context::on_error() is an undocumented API in fmt v9 and in fmt v10, see - https://fmt.dev/9.1.0/api.html#_CPPv4I0EN3fmt16basic_format_argE - https://fmt.dev/10.0.0/api.html#_CPPv4I0EN3fmt26basic_format_parse_contextE despite that this API was once used in its document for fmt v10.0.0, see https://fmt.dev/10.0.0/api.html#formatting-user-defined-types. it's still, well, undocumented. so, to have better compatibility, let's use the documented API in place of undocumented one. please note, `throw_format_error()` was still not a public API before 10.1.0, so before that release we have to throw `fmt::format_error` explicitly. so we cannot use it yet during the transitional period. because the class of `fmt::format_error` is defined in `fmt/format.h`, we need to include this header for using it. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16212	2023-11-29 12:49:04 +02:00
Pavel Emelyanov	c5d85bdf79	s3/client: Don't GET object contents on out-of-bound reads If S3 readable file is used inside file input stream, the latter may call its read methods with position that is above file size. In that case server replies with generic http error and the fact that the range was invalid is encoded into reply body's xml. That's not great to catch this via wrong reply status exception and xml parsing all the more so we can know that the read is out-of-bound in advance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-29 12:09:52 +03:00
Pavel Emelyanov	339182287f	s3/client: Cache stats on readable_file S3-based sstables components are immutable, so every time stat is called there's no need to ping server again. But the main intention of this patch is to provide stats for read calls in the next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-29 12:06:54 +03:00
Pavel Emelyanov	210b01a5ce	config: Make object storage config updateable_value_source Now it's a plain updateable_value, but without the ..._source object the updateable_value is just a no-op value holder. For the observers to operate there must be a value source; updating it updates the attached updateable values _and_ notifies the observers. For the config to be the u.v._source, config entries must be comparable to each other, hence the <=> operator for it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-21 16:47:50 +03:00
Pavel Emelyanov	855626f7de	s3/client: Map http exceptions into storage_io_error When an http request resolves with an exception, it makes sense to translate the network exception into a storage exception so that upper layers treat it as some sort of IO error, not, suddenly, an http one. The translation is, for now, pretty simple: - 404 and 3xx -> ENOENT - 403 (forbidden) and 401 (unauthorized) -> EACCES - anything else -> EIO Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-21 16:47:50 +03:00
Pavel Emelyanov	0e9428ab4a	exceptions: Extend storage_io_error construction options To make it possible to construct it with plain errno value and a string Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-21 13:37:52 +03:00
Kefu Chai	691f7f6edb	util: do not use variable length array vla (variable length array) is an extension in GCC and Clang, and it is not part of the C++ standard. so let's avoid using it if possible, for better standard compliance. it's also more consistent with other places where we calculate the size of an array of T in the same source file. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16084	2023-11-20 23:02:41 +02:00
Kefu Chai	12f4f9f481	build: cmake: link against cryptopp::cryptopp instead of linking against cryptopp, we should link against cryptopp::cryptopp. the latter is the target exposed by Findcryptopp.cmake, while the former is just a library name which is not even exposed by any find_package() call. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16060	2023-11-15 17:14:04 +02:00
Pavel Emelyanov	f4fd5c7207	s3/client: Tag pieces of jumbo uploader The jumbo sink is there to upload files that can be potentially larger than 50Gb (10000*5Mb). For that the sink uploads a set of so-called "pieces" -- files up to 50Gb each -- then uses the copy-upload API call to squash the pieces together. After copying, the piece is removed. In case of a crash while uploading, pieces remain in the bucket forever, which is not great. This patch tags pieces with the 'kind=piece' tag in order to tell pieces from regular objects. This can be used, for example, to set up a tag-based lifecycle policy that eventually collects dangling pieces. fixes: #13670 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#16023	2023-11-15 15:32:30 +02:00
Kefu Chai	efd65aebb2	build: cmake: add check-header target to have feature parity with `configure.py`. we won't need this once we migrate to C++20 modules. but before that day comes, we need to stick with C++ headers. we generate a rule for each .hh file to create a corresponding .cc and then compile it, in order to verify that the header is self-contained. so the number of rules is quite large. to avoid the unnecessary overhead, the check-header target is enabled only if the `Scylla_CHECK_HEADERS` option is enabled. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15913	2023-11-13 10:27:06 +02:00
Nadav Har'El	284534f489	Merge 'Nodetool additional commands 4/N' from Botond Dénes This PR implements the following new nodetool commands: * snapshot * drain * flush * disableautocompaction * enableautocompaction All commands come with tests and all tests pass with both the new and the current nodetool implementations. Refs: https://github.com/scylladb/scylladb/issues/15588 Closes scylladb/scylladb#15939 * github.com:scylladb/scylladb: test/nodetool: add README.md tools/scylla-nodetool: implement enableautocompaction command tools/scylla-nodetool: implement disableautocompaction command tools/scylla-nodetool: implement the flush command tools/scylla-nodetool: extract keyspace/table parsing tools/scylla-nodetool: implement the drain command tools/scylla-nodetool: implement the snapshot command test/nodetool: add support for matching approximate query parameters utils/http: make dns_connection_factory::initialize() static	2023-11-08 11:18:35 +02:00
Nadav Har'El	a3621dbd3e	Merge 'Alternator: Support new ReturnValuesOnConditionCheckFailure feature' from Marcin Maliszkiewicz alternator: add support for ReturnValuesOnConditionCheckFailure feature As announced in https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-dynamodb-cost-failed-conditional-writes/, DynamoDB added a new option for write operations (PutItem, UpdateItem, or DeleteItem), ReturnValuesOnConditionCheckFailure, which if set to ALL_OLD returns the current value of the item - but only if a condition check failed. Fixes https://github.com/scylladb/scylladb/issues/14481 Closes scylladb/scylladb#15125 * github.com:scylladb/scylladb: alternator: add support for ReturnValuesOnConditionCheckFailure feature alternator: add ability to send additional fields in api_error	2023-11-07 23:19:51 +02:00
Botond Dénes	b61822900b	utils/http: make dns_connection_factory::initialize() static Said method can out-live the factory instance. This was not a problem because the method takes care to keep everything it needs from `this` alive, by copying it to the coroutine stack. However, the fact that this method can out-live the instance is not obvious, and an unsuspecting developer (me) added a new member (_logger) which was not kept alive. This can cause a use-after-free in the factory. Fix by making initialize() static, forcing the instance to pass all parameters explicitly, and add a comment explaining that this method can out-live the instance.	2023-11-07 04:39:33 -05:00
Kefu Chai	ef023dae44	s3: use rapidxml/rapidxml.hpp as a fallback on debian derivatives librapidxml-dev installs rapidxml.h as rapidxml/rapidxml.hpp, so let's use it as a fallback. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15814	2023-11-01 10:25:40 +03:00
Marcin Maliszkiewicz	b4c77a373d	alternator: add ability to send additional fields in api_error While it may not be explicitly documented, DynamoDB sometimes enriches error messages with additional fields. For instance, when ConditionalCheckFailedException occurs while ReturnValuesOnConditionCheckFailure is set, it will add an Item object; similarly, for TransactionCanceledException it will add a CancellationReasons object. There may be more cases like this, so a generic json field is added to our error class. The change will be used by a future commit implementing the ReturnValuesOnConditionCheckFailure feature.	2023-10-30 15:13:06 +01:00
Botond Dénes	ceb866fa2e	Merge 'Make s3 upload sink PUT small objects' from Pavel Emelyanov When upload-sink is flushed, it may notice that the upload had not yet been started and fall-back to plain PUT in that case. This will make small files uploading much nicer, because multipart upload would take 3 API calls (start, part, complete) in this case fixes: #13014 Closes scylladb/scylladb#15824 * github.com:scylladb/scylladb: test: Add s3_client test for upload PUT fallback s3/client: Add PUT fallback to upload sink	2023-10-25 10:03:46 +03:00
Kefu Chai	f8104b92f8	build: cmake: detect rapidxml we use rapidxml for parsing XML, so let's detect it before using it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15813	2023-10-24 15:12:04 +03:00
Pavel Emelyanov	63f2bdca01	s3/client: Add PUT fallback to upload sink When the non-jumbo sink is flushed and notices that the real upload is not started yet, it may just go ahead and PUT the buffers into the object with a single request. For the jumbo sink the fallback is not implemented as it likely doesn't make any sense -- jumbo sinks are unlikely to produce less than 5Mb of data so it's going to be dead code anyway. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-10-24 10:59:46 +03:00
Avi Kivity	ee9cc450d4	logalloc: report increases of reserves The log-structured allocator maintains memory reserves so that operations using log-structured allocator memory can have some working memory and can allocate. The reserves start small and are increased if allocation failures are encountered. Before starting an operation, the allocator first frees memory to satisfy the reserves. One problem is that if the reserves are set to a high value and we encounter a stall, then, first, we have no idea what value the reserves are set to, and second, we have no idea what operation caused the reserves to be increased. We fix this problem by promoting the log reports of reserve increases from DEBUG level to INFO level and by attaching a stack trace to those reports. This isn't optimal since the messages are used for debugging, not for informing the user about anything important for the operation of the node, but I see no other way to obtain the information. Ref #13930. Closes scylladb/scylladb#15153	2023-10-23 13:37:50 +02:00
Botond Dénes	9231454acd	mutation/json: extract generic streaming writer into utils/rjson.hh This writer is generally useful, not just for writing mutations as json. Make it generally available as well.	2023-10-20 10:04:56 -04:00
Michael Huang	75109e9519	cql3: Fix invalid JSON parsing for JSON objects with ASCII keys For JSON objects represented as map<ascii, int>, don't treat ASCII keys as a nested JSON string. We were doing that prior to the patch, which led to parsing errors. Included the error offset where JSON parsing failed for rjson::parse related functions to help identify parsing errors better. Fixes: #7949 Signed-off-by: Michael Huang <michaelhly@gmail.com> Closes scylladb/scylladb#15499	2023-10-05 22:26:08 +03:00
Avi Kivity	e600f35d1e	Merge 'logalloc, reader_concurrency_semaphore: cooperate on OOM kills' from Botond Dénes Consider the following code snippet: ```c++ future<> foo() { semaphore.consume(1024); } future<> bar() { return _allocating_section([&] { foo(); }); } ``` If the consumed memory triggers the OOM kill limit, the semaphore will throw `std::bad_alloc`. The allocating section will catch this, bump std reserves and retry the lambda. Bumping the reserves will not do anything to prevent the next call to `consume()` from triggering the kill limit. So this cycle will repeat until std reserves are so large that ensuring the reserve fails. At this point LSA gives up and re-throws the `std::bad_alloc`. Beyond the useless time spent on code that is doomed to fail, this also results in expensive LSA compaction and eviction of the cache (while trying to ensure reserves). Prevent this situation by throwing a distinct exception type which is derived from `std::bad_alloc`. Allocating section will not retry on seeing this exception. A test reproducing the bug is also added. Fixes: #15278 Closes scylladb/scylladb#15581 * github.com:scylladb/scylladb: test/boost/row_cache_test: add test_cache_reader_semaphore_oom_kill utils/logalloc: handle utils::memory_limit_reached in with_reclaiming_disabled() reader_concurrency_semaphore: use utils::memory_limit_reached exception utils: add memory_limit_reached exception	2023-10-05 19:47:21 +03:00
Pavel Emelyanov	c4f1929eea	s3: Abort multipart upload if finalize request fails It may happen that wrapping up a multipart upload fails too. However, before sending the request the driver clears the _upload_id field, thus marking the whole process as "all is OK". So if the finalization method fails and throws, the upload context remains on the server side forever. Fix this by keeping the _upload_id set, so even if finalization throws, closing the uploader notices this and calls abort. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#15521	2023-10-03 09:47:33 +03:00
Botond Dénes	c0da6bcfb8	utils/logalloc: handle utils::memory_limit_reached in with_reclaiming_disabled() Said method catches bad-allocs and retries the passed-in function after raising the reserves. This does nothing to help the function succeed if the bad alloc was thrown from the semaphore, because the kill limit was reached. In this case the read should be left to fail and terminate. Now that the semaphore is throwing utils::memory_limit_reached in this case, we can distinguish this case and just re-throw the exception.	2023-09-27 10:28:00 -04:00
Botond Dénes	721ffa319d	utils: add memory_limit_reached exception A distinct exception derived from std::bad_alloc, used in cases when memory didn't really run out, but the process or task reached the memory limit allotted for it. Using a distinct type for this case allows for LSA to correctly react to this case.	2023-09-27 10:26:41 -04:00
Kefu Chai	ac3406e537	utils/s3/creds: rename aws_config member variables - s/key/access_key_id/ - s/secret/secret_access_key/ - s/token/session_token/ so they are more aligned with the AWS document. for instance, in https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#ConstructingTheAuthenticationHeader AWSAccessKeyId is used in the "Authorization" header. this would help with the readability and maintainability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-23 14:28:07 +08:00
Avi Kivity	1da6a939fe	Merge 'Track memory usage of S3 object uploads' from Pavel Emelyanov The S3 uploading sink needs to collect buffers internally before sending them out, because the minimal upload-able part size is 5Mb. When the necessary amount of bytes is accumulated, the part uploading fiber starts in the background. On flush the sink waits for all the fibers to complete and handles failure of any. Uploading parallelism is nowadays limited by the means of the http client max-connections parameter. However, when a part uploading fiber waits for its connection it keeps the 5Mb+ buffers on the request's body, so even though the number of uploading parts is limited, the number of _waiting_ parts is effectively not. This PR adds a shard-wide limiter on the number of background buffers S3 clients (and their http clients) may use. Closes scylladb/scylladb#15497 * github.com:scylladb/scylladb: s3::client: Track memory in client uploads code: Configure s3 clients' memory usage s3::client: Construct client with shared semaphore sstables::storage_manager: Introduce config	2023-09-21 18:24:42 +03:00
Botond Dénes	a0c5dee2aa	utils/logalloc: introduce logalloc::bad_alloc This new exception type inherits from std::bad_alloc and allows logalloc code to add additional information about why the allocation failed. We currently have 3 different throw sites for std::bad_alloc in logalloc.cc and when investigating a coredump produced by --abort-on-lsa-bad-alloc, it is impossible to determine, which throw-site activated last, triggering the abort. This patch fixes that by disambiguating the throw-sites and including it in the error message printed, right before abort. Refs: #15373 Closes scylladb/scylladb#15503	2023-09-21 17:43:53 +03:00
Kefu Chai	0819788207	utils/s3: use structured binding when appropriate and use `sstring::starts_with()`, for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15487	2023-09-21 13:26:49 +03:00
Kefu Chai	c364efb998	utils/s3: auth using AWS_SESSION_TOKEN when accessing AWS resources, users are allowed to use long-term security credentials; they can also use temporary credentials. but if the latter are used, we have to pass a session token along with the keys. see also https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html so, if we want to programmatically get authenticated, we need to set the "x-amz-security-token" header, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#UsingTemporarySecurityCredentials so, in this change, we 1. add another member named `token` in `s3::endpoint_config::aws_config` for storing "AWS_SESSION_TOKEN". 2. populate the setting from "object_storage.yaml" and "$AWS_SESSION_TOKEN" environment variable. 3. set "x-amz-security-token" header if `s3::endpoint_config::aws_config::token` is not empty. this should allow us to test s3 client and s3 object store backend with S3 bucket, with the temporary credentials. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15486	2023-09-21 13:26:11 +03:00
Pavel Emelyanov	e6fe18ca55	s3: Handle piece flushing exception When a piece is uploaded it's first flushed, then upload-copy is issued. Both happen in the background, and if the piece flush call resolves with an exception, the exception remains unhandled. That's OK, since the upload finalization code checks that some pieces didn't complete (for whatever reason) and fails the whole upload; however, the ignored exception is reported in logs. Not nice. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#15491	2023-09-21 10:39:04 +03:00
Kefu Chai	fe4caeb77f	utils/s3/client: do not allocate rapidxml::xml_document on stack as the size of `rapidxml::xml_document` is quite large, let's allocate it on the heap. otherwise GCC 13.2.1 warns us like: ``` utils/s3/client.cc: In function ‘seastar::sstring s3::parse_multipart_copy_upload_etag(seastar::sstring&)’: utils/s3/client.cc:455:9: warning: stack usage is 66208 bytes [-Wstack-usage=] 455 \| sstring parse_multipart_copy_upload_etag(sstring& body) { \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15472	2023-09-21 08:51:08 +03:00
Pavel Emelyanov	fc5306c5e8	s3::client: Track memory in client uploads When uploading an object part, client spawns a background fiber that keeps the buffers with data on the http request's write_body() lambda capture. This generates unbound usage of memory with uploaded buffers which is not nice. Even though s3 client is limited with http's client max-connections parallelism, waiting for the available connection still happens with buffers held in memory. This patch makes the client claim the background memory from the provided semaphore (which, in turn, sits on the shard-wide storage manager instance). Once body writing is complete, the claimed units are returned back to the semaphore allowing for more background writes. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-20 17:50:29 +03:00
Pavel Emelyanov	b299757884	s3::client: Construct client with shared semaphore The semaphore will be used to cap memory consumption by client. This patch makes sure the reference to a semaphore exists as an argument to client's constructor, not more than that. In scylla binary, the semaphore sits on storage_manager. In tests the semaphore is some local object. For now the semaphore is unused and is initialized locked as this patch just pushes the needed argument all the way around, next patches will make use of it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-20 17:50:07 +03:00
Kefu Chai	4d285590f0	utils/config_file: document config_file::value_status add doxygen style comment to document `value_status` members. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15277	2023-09-18 16:20:06 +03:00
Pavel Emelyanov	30959fc9b1	lsa, test: Extend memory footprint test with per-type total sizes When memory footprint test is over it prints total size taken by row cache, memtable and sstables as well as individual objects' sizes. It's also nice to know the details on the row-cache's individual objects. This patch extends the printing with total size of allocated object types according to migrator_fn types. Sample output: mutation footprint: - in cache: 11040928 - in memtable: 9142424 - in sstable: mc: 2160000 md: 2160000 me: 2160000 - frozen: 540 - canonical: 827 - query result: 342 sizeof(cache_entry) = 64 sizeof(memtable_entry) = 64 sizeof(bptree::node) = 288 sizeof(bptree::data) = 72 -- sizeof(decorated_key) = 32 -- sizeof(mutation_partition) = 96 -- -- sizeof(_static_row) = 8 -- -- sizeof(_rows) = 24 -- -- sizeof(_row_tombstones) = 40 sizeof(rows_entry) = 144 sizeof(evictable) = 24 sizeof(deletable_row) = 72 sizeof(row) = 16 radix_tree::inner_node::node_sizes = 48 80 144 272 528 1040 radix_tree::leaf_node::node_sizes = 120 216 416 816 3104 sizeof(atomic_cell_or_collection) = 16 btree::linear_node_size(1) = 24 btree::inner_node_size = 216 btree::leaf_node_size = 120 LSA stats: N18compact_radix_tree4treeI13cell_and_hashjE9leaf_nodeE: 360 N5bplus4dataIl15intrusive_arrayI11cache_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EEE: 5040 N5bplus4nodeIl15intrusive_arrayI11cache_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EEE: 19296 17partition_version: 952416 N11intrusive_b4nodeI10rows_entryXadL_ZNS1_5_linkEEENS1_11tri_compareELm12ELm20ELNS_10key_searchE0ELNS_10with_debugE0EEE: 317472 10rows_entry: 1429056 12blob_storage: 254 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#15434	2023-09-18 11:23:18 +02:00
Avi Kivity	d9a453e72e	Merge 'Introduce a scylla-native nodetool' from Botond Dénes This series introduces a scylla-native nodetool. It is invokable via the main scylla executable as the other native tools we have. It uses the seastar's new `http::client` to connect to the specified node and execute the desired commands. For now a single command is implemented: `nodetool compact`, invokable as `scylla nodetool compact`. Once all the boilerplate is added to create a new tool, implementing a single command is not too bad, in terms of code-bloat. Certainly not as clean as a python implementation would be, but good enough. The advantages of a C++ implementation are that all of us in the core team know C++ and that it is shipped right as part of the scylla executable. Closes #14841 * github.com:scylladb/scylladb: test: add nodetool tests test.py: add ToolTestSuite and ToolTest tools/scylla-nodetool: implement compact operation tools/scylla-nodetool: implement basic scylla_rest_api_client tools: introduce scylla-nodetool utils: export dns_connection_factory from s3/client.cc to http.hh utils/s3/client: pass logger to dns_connection_factory in constructor tools/utils: tool_app_template::run_async(): also detect --help* as --help	2023-09-14 17:20:40 +03:00
Botond Dénes	bf2fad3c00	utils: export dns_connection_factory from s3/client.cc to http.hh So others can use it too. Move headers only used by said class too.	2023-09-14 05:25:14 -04:00
Botond Dénes	17fd57390e	utils/s3/client: pass logger to dns_connection_factory in constructor We want to publish this class in a header so it can be used by others, but it uses the s3 logger. We don't want future users to pollute the s3 logs, so allow users to pass their own loggers to the factory.	2023-09-14 05:25:14 -04:00
Botond Dénes	cc16502691	Merge 'Add metrics to S3 client' from Pavel Emelyanov The added metrics include: - http client metrics, which include the number of connections, the number of active connections and the number of new connections made so far - IO metrics that mimic those for traditional IO -- total number of object read/write ops, total number of get/put/uploaded bytes and individual IO request delay (round-trip, including body transfer time) fixes: #13369 Closes #14494 * github.com:scylladb/scylladb: s3/client: Add IO stats metrics s3/client: Add HTTP client metrics s3/client: Split make_request() s3/client: Wrap http client with struct group_client s3/client: Move client::stats to namespace scope s3/client: Keep part size local variable	2023-09-14 09:49:08 +03:00
Kefu Chai	87088b65b6	util: replace <tab> with spaces to be aligned with seastar's coding-style.md: scylladb uses seastar's coding-style.md. so let's adhere to it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15345	2023-09-11 14:38:46 +03:00
Kefu Chai	ce291f4385	s3/client: do not use deprecated tls::connect() overload seastar has deprecated the overload which accepts `server_name`, let's use the one which accepts `tls::tls_options`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15324	2023-09-08 18:44:45 +03:00
Pavel Emelyanov	308db51306	s3/client: Add IO stats metrics These metrics mimic the existing IO ones -- total number of read operations, total number of read bytes and total read delay. And the same for writing. This patch makes no difference between writing an object with plain PUT vs putting it with multipart uploading. Instead, it "measures" individual IO writes. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Pavel Emelyanov	91235a84cd	s3/client: Add HTTP client metrics Currently an http client has several exported "numbers" regarding the number of transport connections the client uses. This patch exports those via S3 client's per-sched-group metrics and prepares the ground for more metrics in next patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Pavel Emelyanov	08a12cd4a6	s3/client: Split make_request() There will appear another make_request() helper that'll do mostly the same. This split will help to avoid code duplication Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Pavel Emelyanov	4b548dd240	s3/client: Wrap http client with struct group_client The http-client is per-sched-group. Next patch will need to keep metrics per-sched-group too and this sched-group -> http-client map is the good place to put them on. Wrapping struct will allow extending it with metrics Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Pavel Emelyanov	627c1932e4	s3/client: Move client::stats to namespace scope The stats is stats about object, not about client, so it's better if it lives in namespace scope. Also it will avoid conflicts with client stats that will be reported as metrics (later patch) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Pavel Emelyanov	896b582850	s3/client: Keep part size local variable This serves two purposes. First, it fixes a potential use-after-move, since the bufs are moved into the lambda and bufs.size() is called in the same statement with no defined evaluation order. Second, this keeps the 'size' variable alive until the request is complete, thus making it possible to update stats with it (later patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-07 09:25:00 +03:00
Dawid Medrek	c7fe5d7f94	utils/lister: Limit the API of scan_dir() to fs::path Right now, the function allows for passing the path to a file as a seastar::sstring, which is then converted to std::filesystem::path -- implicitly to the caller. However, the function performs I/O, and there is no reason to accept any other type than std::filesystem::path, especially because the conversion is straightforward. Callers can perform it on their own. This commit introduces the more constrained API. Closes #15266	2023-09-05 20:50:42 +03:00
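
The status translation described in commit 855626f7de (404 and 3xx to ENOENT, 401 and 403 to EACCES, anything else to EIO) can be sketched as a small pure function. This is only an illustration under stated assumptions: `http_status_to_errno` is a hypothetical name, and the real patch works on Seastar http exceptions and constructs a `storage_io_error`, which is not reproduced here.

```cpp
#include <cerrno>

// Hypothetical helper, not the actual ScyllaDB function: maps an HTTP status
// code to an errno value following the rules listed in the commit message.
constexpr int http_status_to_errno(int status) {
    if (status == 404 || (status >= 300 && status < 400)) {
        return ENOENT;  // missing object or redirect
    }
    if (status == 401 || status == 403) {
        return EACCES;  // unauthorized or forbidden
    }
    return EIO;         // anything else is reported as a generic I/O error
}

// Compile-time spot checks of the three branches.
static_assert(http_status_to_errno(404) == ENOENT);
static_assert(http_status_to_errno(403) == EACCES);
static_assert(http_status_to_errno(500) == EIO);
```

Keeping the mapping as a pure function makes the policy easy to test separately from the networking code that would call it.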


1544 Commits
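
Commits 721ffa319d and c0da6bcfb8 describe a distinct exception derived from `std::bad_alloc` that the retrying code re-throws instead of retrying. The sketch below shows that pattern in plain C++. It assumes a simplified stand-in for the allocating section: `run_with_retries` and its `max_retries` parameter are hypothetical, and the local `memory_limit_reached` type only mimics `utils::memory_limit_reached`.

```cpp
#include <new>

// Hypothetical stand-in for utils::memory_limit_reached: memory did not
// really run out, but the task hit the limit allotted to it.
struct memory_limit_reached : std::bad_alloc {
    const char* what() const noexcept override { return "memory limit reached"; }
};

// Hypothetical, simplified retry loop. It retries fn() after a plain
// std::bad_alloc (where the real code would raise its reserves), but
// re-throws memory_limit_reached at once, since retrying cannot help.
// Returns the number of attempts it took for fn() to succeed.
template <typename Func>
int run_with_retries(Func&& fn, int max_retries) {
    int attempts = 0;
    for (;;) {
        ++attempts;
        try {
            fn();
            return attempts;
        } catch (const memory_limit_reached&) {
            throw;  // the limit was hit: let the operation fail
        } catch (const std::bad_alloc&) {
            if (attempts > max_retries) {
                throw;  // give up after too many failed attempts
            }
            // A real allocator would increase its reserves here before retrying.
        }
    }
}
```

The order of the catch clauses matters: the derived type has to be caught before `std::bad_alloc`, otherwise the generic handler would swallow it and retry.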