scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 19:35:12 +00:00

Author	SHA1	Message	Date
Michał Chojnowski	6e7e795dfd	cql3: expr: expression: make the argument of to_range a forwarding reference Make to_range able to handle rvalues. We will pass managed_bytes&& to it in the next patch to avoid pointless copying. The public declaration of to_range is changed to a concrete function to avoid having to explicitly instantiate to_range for all possible reference types of clustering_key_prefix.	2021-04-01 10:44:21 +02:00
Michał Chojnowski	0bb959e890	cql3: don't linearize elements of lists, tuples, and user types This patch switches the type used to store collection elements inside the intermediate form used in lists::value, tuples::value etc. from bytes to managed_bytes. After this patch, tuple and list elements are only linearized in from_serialized, which will be corrected soon. This commit introduces some additional copies in expression.cc, which will be dealt with in a future commit.	2021-04-01 10:44:21 +02:00
Michał Chojnowski	fa2749c2a0	cql3: values: add const managed_bytes& constructor to raw_value_view Will be used in the next patch. Separated for clarity.	2021-04-01 10:44:21 +02:00
Michał Chojnowski	8927aaf225	cql3: output managed_bytes instead of bytes in get_with_protocol_version	2021-04-01 10:44:21 +02:00
Michał Chojnowski	aab9509775	types: collection: add versions of pack for fragmented buffers We will need them to port the representation of collection types in cql3/ from bytes to managed_bytes. The version which takes an iterator of `bytes` as an argument will be removed after that transition is complete.	2021-04-01 10:44:21 +02:00
Michał Chojnowski	e9c05582a4	types: add write_collection_{value,size} for managed_bytes_mutable_view We will use them to avoid linearization when going from the intermediate std::vector<bytes> form in cql3/ to the atomic_cell format, by outputting managed_bytes instead of bytes in get_with_protocol_version.	2021-04-01 10:44:21 +02:00
Michał Chojnowski	3387d43a34	cql3: tuples, user_types: avoid linearization in from_serialized() and get() Deserialize from raw_value_view without linearizing and output managed_bytes instead of bytes.	2021-04-01 10:44:20 +02:00
Michał Chojnowski	a10a82da30	types: tuple: add build_value_fragmented A version of build_value which produces fragmented output. We will use it to avoid linearization in tuples::value and user_types::value.	2021-04-01 10:42:07 +02:00
Michał Chojnowski	9777026e71	cql3: update_parameters: add make_cell version for managed_bytes_view We will use it to port the representation of collections in cql3/ from bytes to managed_bytes. The duplicate version for bytes_view will be removed after that transition is complete.	2021-04-01 10:42:07 +02:00
Michał Chojnowski	c2c6b2abfa	cql3: remove operation::make_cell The operation::make_cell functions are useless aliases to methods of update_parameters, and are used interchangeably with them throughout the code. Remove them. Also, remove the now-unused update_parameters::make_cell version for fragmented_temporary_buffer::view.	2021-04-01 10:42:07 +02:00
Michał Chojnowski	463ec1b082	cql3: values: make raw_value fragmented As a part of the effort of removing big, contiguous buffers from the codebase, cql3::raw_value should be made fragmented. Unfortunately the change involves some nontrivial work, because raw_value must be viewable with raw_value_view, and raw_value_view must accommodate both raw_value (that's where we store values in prepared queries) and fragmented_temporary_buffer::view (because that's the type of values coming from the wire). This patch makes raw_value fragmented, by changing the backing type from bytes to managed_bytes. raw_value_view is modified accordingly by changing the backing type from fragmented_temporary_buffer::view to a variant of fragmented_temporary_buffer::view and managed_bytes_view. We have prepared the users of raw_value{_view} for this change in preceding commits.	2021-04-01 10:42:07 +02:00
Michał Chojnowski	5984d6b2ce	cql3: values: remove raw_value_view::operator== It's only used in a single test, and there is no reason why it should ever be used anywhere else. So let's remove it from the public header and move it to that test.	2021-04-01 10:42:07 +02:00
Michał Chojnowski	b9322a6b71	cql3: switch users of cql3::raw_value_view to internals-independent API We want to change the internals of cql3::raw_value{_view}. However, users of cql3::raw_value and cql3::raw_value_view often use them by extracting the internal representation, which will be different after the planned change. This commit prepares us for the change by making all accesses to the value inside cql3::raw_value(_view) be done through helper methods which don't expose the internal representation publicly. After this commit we are free to change the internal representation of raw_value_{view} without messing up their users.	2021-04-01 10:42:04 +02:00
Michał Chojnowski	b3167ac0a6	cql3: values: add an internals-independent API to raw_value_view Currently, raw_value_view is backed by a fragmented_temporary_buffer::view, and many users of this type use it by extracting that internal representation. However, we want to change raw_value_view so that it can be created both from fragmented_temporary_buffer and from managed_bytes, so that we can switch the internals of raw_value from bytes to managed_bytes. To do that we need to prepare all users for that more general representation. This commit adds an API which allows using raw_value_view without accessing its internal representation. In the next commits of this series we will switch all callers who currently depend on that representation to the new API, and then we will remove the old accessors and change the internals.	2021-04-01 10:39:42 +02:00
Michał Chojnowski	45e0ef26d3	utils: managed_bytes: add a managed_bytes constructor from FragmentedView Just for convenience. We will use it in an upcoming patch where we switch the inner representation of cql3::raw_value from bytes to managed_bytes, and we will want to construct managed_bytes from fragmented_temporary_buffer::view.	2021-04-01 10:39:42 +02:00
Michał Chojnowski	4715268e30	utils: managed_bytes: add operator<< and to_hex for managed_bytes We will need them to replace bytes with managed_bytes in some places in an upcoming patch. The change to configure.py is necessary because operator<< links to to_hex in bytes.cc.	2021-04-01 10:39:42 +02:00
Michał Chojnowski	14c4639994	utils: fragment_range: add to_hex	2021-04-01 10:39:42 +02:00
Michał Chojnowski	b6740a01ac	configure: remove unused link dependencies from UUID_test	2021-04-01 10:39:42 +02:00
Avi Kivity	bbec43f9a1	Update tools/java submodule * tools/java ccc4201ded...fb21784b91 (2): > fix: Add dummy implementation of getToppartitions > nodetool: Make toppartitions call the generic endpoint Fixes #4520.	2021-03-31 17:38:03 +03:00
Pavel Emelyanov	887a1b0d3d	tracing: Stop tracing in main's deferred action Tracing is created in two steps and is destroyed in two too. The 2nd step doesn't have the corresponding stop part, so here it is -- defer tracing stop after it was started. But need to keep in mind, that tracing is also shut down on drain, so the stopping should handle this. Fixes #8382 tests: unit(dev), manual(start-stop, aborted-start) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210331092221.1602-1-xemul@scylladb.com>	2021-03-31 12:28:37 +03:00
Piotr Jastrzebski	57c7964d6c	config: ignore enable_sstables_mc_format flag Don't allow users to disable MC sstables format any more. We would like to retire some old cluster features that have been around for years. Namely MC_SSTABLE and UNBOUNDED_RANGE_TOMBSTONES. To do this we first have to make sure that all existing clusters have them enabled. It is impossible to know that unless we stop supporting enable_sstables_mc_format flag. Test: unit(dev) Refs #8352 Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Closes #8360	2021-03-31 12:23:59 +03:00
Avi Kivity	f9244734f9	Update seastar submodule * seastar 48376c76a...72e3baed9 (3): > file: Add RFW_NOWAIT detection case for AuFS > sharded: provide type info on no sharded instance exception > iotune: Estimate accuracy of measurement Added missing include "database.hh" to api/lsa.cc since seastar::sharded<> now needs full type information.	2021-03-31 10:40:04 +03:00
Avi Kivity	de10a74a84	Merge 'types: remove linearization from abstract_type::compare' from Wojciech Mitros This patch is another series on removing big allocations from scylla. The buffers in `compare_visitor` were replaced with `managed_bytes_view`, a similar change was also needed in tuple_deserializing_iterator and listlike_partial_deserializing_iterator, and was applied as well. Tests:unit(dev) Closes #8357 * github.com:scylladb/scylla: types: remove linearization from abstract_type::compare types: replace buffers in tuple_deserializing_iterator with fragmented ones types: make tuple_type_impl::split work with any FragmentedViews types: move read_collection_size/value specialization to header file	2021-03-31 08:50:52 +03:00
Wojciech Mitros	f57fa935a2	types: remove linearization from abstract_type::compare To avoid high latencies caused by large contiguous allocations needed by linearizing, work on fragmented buffers instead. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2021-03-31 06:35:10 +02:00
Wojciech Mitros	daa31be37f	types: replace buffers in tuple_deserializing_iterator with fragmented ones In preparation for removing linearization from abstract_type::compare, add options to avoid linearization in tuple_deserializing_iterator. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2021-03-31 06:35:09 +02:00
Wojciech Mitros	823d4c7529	types: make tuple_type_impl::split work with any FragmentedViews We may want to store a tuple in a fragmented buffer. To split it into a vector of optional bytes, tuple_type_impl::split can be used. To split a contiguous buffer(bytes_view), simply pass single_fragmented_view(bytes_view). Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2021-03-31 06:34:37 +02:00
Piotr Sarna	6a2377a233	Merge 'Fast slow query trace doc' from Ivan Addressed https://github.com/scylladb/scylla/pull/8314#issuecomment-803671234 (write issue: "Tracing: slow query fast mode documentation request") adds a fast slow queries tracing mode documentation to the docs/guide/tracing.md patch to the scylla-doc will be dup-ed after this one merged cc @nyh cc @vladzcloudius Closes #8373 * github.com:scylladb/scylla: tracing: api: fast mode doc improvement tracing: fast slow query tracing mode docs	2021-03-30 17:57:04 +02:00
Ivan Prisyazhnyy	778d9217f3	tracing: api: fast mode doc improvement Signed-off-by: Ivan Prisyazhnyy <ivan@scylladb.com>	2021-03-30 16:22:56 +02:00
Ivan Prisyazhnyy	b3b66fb629	tracing: fast slow query tracing mode docs Signed-off-by: Ivan Prisyazhnyy <ivan@scylladb.com>	2021-03-30 16:22:56 +02:00
Avi Kivity	d2921b5112	Merge 'Clean up > 2-year-old features' from Piotr Sarna Following the work started in `253a7640e`, a new batch of old features is assumed to be always available. They are all still announced via gossip, but the code assumes that the feature is always true, because we only support upgrades from a previous release, and the release window is considerably smaller than 2 years. Features picked this time via `git blame`, along with the date of their introduction: * `fe4afb1aa3` (Asias He 2018-09-05 14:52:10 +0800 109) static const sstring ROW_LEVEL_REPAIR = "ROW_LEVEL_REPAIR"; * `ff5e541335` (Calle Wilund 2019-02-05 13:06:07 +0000 110) static const sstring TRUNCATION_TABLE = "TRUNCATION_TABLE"; * `fefef7b9eb` (Tomasz Grabiec 2019-03-05 19:08:07 +0100 111) static const sstring CORRECT_STATIC_COMPACT_IN_MC = "CORRECT_STATIC_COMPACT_IN_MC"; Tests: unit(dev) Closes #8235 * github.com:scylladb/scylla: sstables,test: remove variables depending on old features gms: make CORRECT_STATIC_COMPACT_IN_MC ft unconditionally true sstables: stop relying on CORRECT_STATIC_COMPACT_IN_MC feature gms: make TRUNCATION_TABLE feature unconditionally true gms: make ROW_LEVEL_REPAIR feature unconditionally true repair: stop relying on ROW_LEVEL_REPAIR feature	2021-03-30 16:13:35 +03:00
Calle Wilund	c0666ea89b	commitlog: Fix inner loop condition in allocation pre-fill Fixes #8369 This was originally found (and fixed) by @gleb-cloudius, but the patch set with the fix was reverted at some point, and the fix went away. Now the error remains even in new, nice coroutine code. We check the wrong var in the inner loop of the pre-fill path of allocate_segment_ex, often causing us to generate giant writev:s of more or less the whole file. Not intended. Closes #8370	2021-03-30 12:14:55 +02:00
Avi Kivity	c2866f46b5	test: relax quota for tests on machines with small page size `8a8589038c` ("test: increase quota for tests to 6GB") increased the quota for tests from 2GB to 6GB. I later found that the increased requirement is related to the page size: Address Sanitizer allocates at least a page per object, and so if the page size is larger the memory requirement is also larger. Make use of this by only increasing the quota if the page size is greater than 4096 (I've only seen 4096 and 65536 in the wild). This allows greater parallelism when the page size is small. Closes #8371	2021-03-30 12:13:42 +02:00
Avi Kivity	8785dd62cb	tests: use kernel page cache Tests are short-lived and use a small amount of data. They are also often run repeatedly, and the data is deleted immediately after the test. This is a good scenario for using the kernel page cache, as it can cache read-only data from test to test, and avoid spilling write data to disk if it is deleted quickly. Acknowledge this by using the new --kernel-page-cache option for tests. This is expected to help on large machines, where the disk can be overloaded. Smaller machines with NVMe disks probably will not see a difference. Closes #8347	2021-03-30 12:04:55 +02:00
Piotr Sarna	6de2691bbd	sstables,test: remove variables depending on old features In order to maintain backward compatibility wrt. cluster features, two boolean variables were kept in sstable writers: - correctly_serialize_non_compound_range_tombstones - correctly_serialize_static_compact_in_mc Since these features are assumed to always be present now, the above variables are no longer needed and can be purged.	2021-03-30 09:37:41 +02:00
Piotr Sarna	e42dee6afb	gms: make CORRECT_STATIC_COMPACT_IN_MC ft unconditionally true The feature is assumed to be true due to being over 2 years old. It's still advertised in gossip, but it's assumed to always be present.	2021-03-30 09:37:13 +02:00
Piotr Sarna	28c9af6fa5	sstables: stop relying on CORRECT_STATIC_COMPACT_IN_MC feature The feature bit is going away because it's over 2 years old, so the code which depended on it becomes unconditional.	2021-03-30 09:37:04 +02:00
Piotr Sarna	08c4350968	gms: make TRUNCATION_TABLE feature unconditionally true Turns out the feature was not used presently. Historically, the commit which removed the support is `30a700c5b0` .	2021-03-30 09:36:45 +02:00
Piotr Sarna	c070178c7e	gms: make ROW_LEVEL_REPAIR feature unconditionally true The feature is assumed to be true due to being over 2 years old. It's still advertised in gossip, but it's assumed to always be present.	2021-03-30 09:36:11 +02:00
Piotr Sarna	80ebedd242	repair: stop relying on ROW_LEVEL_REPAIR feature The feature is going away because it's over 2 years old, so the code which depended on it becomes unconditional.	2021-03-30 09:35:40 +02:00
Avi Kivity	c1badc6317	noexcept_traits: convert enable_if to concepts A little easier to read. Closes #8329	2021-03-30 09:30:23 +02:00
Avi Kivity	405c4e7af1	serializer: replace enable_if in deserialized_bytes_proxy with constraint Simpler to read and understand. Closes #8303	2021-03-30 09:30:06 +02:00
Avi Kivity	7c953f33d5	utils: disk-error-handler: replace enable_if with concepts Simpler, cleaner. We also replace the deprecated std::result_of_t with std::invoke_result_t. Closes #8305	2021-03-30 09:29:46 +02:00
Nadav Har'El	115324f71a	Merge 'Add partial admission control to Thrift frontend' from Piotr Sarna This pull request adds partial admission control to Thrift frontend. The solution is partial mostly because the Thrift layer, aside from allowing Thrift messages, may also be used as a base protocol for CQL messages. Coupling admission control to this one is a little bit more complicated due to how the layer currently works - a Thrift handler, created once per connection, keeps a local `query_state` instance for the occasion of handling CQL requests. However, `query_state` should be kept per query, not per connection, so adding admission control to this aspect of the frontend is left for later. Finally, the way service permits are passed from the server, via the handler factory, handler and then to queries is hacky. I haven't figured out how to force Thrift to pass custom context per query, so the way it works now is by relying on the fact that the server does not yield (in Seastar sense) between having read the request and launching the proper handler. Due to that, it's possible to just store the service permit in the server itself, pass the reference (address) to it down to the handler, and then read it back from the handling code and claim ownership of it. It works, but if anyone has a better idea, please share. Refs #4826 Closes #8313 * github.com:scylladb/scylla: thrift: add support for max_concurrent_requests_per_shard thrift: add metrics for admission control thrift: add a counter for in-flight requests thrift: add a counter for blocked requests thrift: partially add admission control service_permit: add a getter for the number of units held thrift: coroutinize processing a request memory_limiter: add a missing seastarx include	2021-03-29 21:36:50 +03:00
Raphael S. Carvalho	a390f4eb61	sstables: optimize LCS reshape for repair-based operations LCS reshape is currently inefficient for repair-based operation, because the disjoint run of 256 sstables is reshaped into bigger L0 files, which will be then integrated into the main sstable set. On reshape completion, LCS has to compact those big L0 files onto higher levels, until last level is reached, producing bad write amplification. A much better approach is to instead compact that disjoint run into the best possible level L, which can be figured out with: log (base fan_out) of (total_size / max_sstable_size) This compaction will be essentially a copy operation. It's important to do it rather than only mutating the level of sstables because we have to reshape the input run according to LCS parameters like sstable size. For repair-based bootstrap/replace, the input disjoint run is now efficiently reshaped into an ideal level L, so there's no compaction backlog once reshape completes. This behavior will manifest in the log as this: LeveledManifest - Reshaping 256 disjoint sstables in level 0 into level 2 For repair-based decommission/removenode though, which reshape wasn't wired on yet, level L may temporarily hold 2 disjoint runs, which overlap one another, but LCS itself will incrementally merge them through either promotion of L-1 into L, or by detecting overlapping in level L and merging the overlapping sstables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210329171826.42873-1-raphaelsc@scylladb.com>	2021-03-29 20:22:04 +03:00
Botond Dénes	3c54c990ab	test: view_build_test: test_view_update_generator_buffering: fail gracefully Failures in this test typically happen inside the test consumer object. These however don't stop the test as the code invoking the consumer object handles exceptions coming from it. So the test will run to completion and will fail again when comparing the produced output with the expected one. This results in distracting failures. The real problem is not the difference in the output, but the first check that failed, which is however buried in the noise. To prevent this add an "ok" flag which is set to false if the consumer fails. In this case the additional checks are skipped in the end to not generate useless noise. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210326083147.26113-2-bdenes@scylladb.com>	2021-03-29 17:58:28 +03:00
Avi Kivity	a8463cfb37	Merge "reader_permit: signal leaked resources" from Botond " When a permit is destroyed we check if it still holds on to any resources in the destructor. Any resources the permit still holds on are leaked resources, as users should have released these. Currently we just invoke `on_internal_error_noexcept()` to handle this, which -- depending on the configuration -- will result in an error message or an assert. In the former case, the resources will be leaked for good. This mini-series fixes this, by signaling back these resources to the semaphore. This helps avoid an eventual complete dry-up of all semaphore resources and a subsequent complete shutdown of reads. Tests: unit(release, debug) " * 'reader-permit-signal-leaked-resources/v1' of https://github.com/denesb/scylla: reader_permit: signal leaked resources test: test_reader_lifecycle_policy: keep semaphores alive until all ops cease sstables: generate_summary(): extend the lifecycle of the reader concurrency semaphore	2021-03-29 17:57:31 +03:00
Botond Dénes	9e01c4c667	test: view_build_test: test_view_update_generator_buffering: use separate permit for readers Said test has two separate logical readers, but they share the same permit, which is illegal. This didn't cause any problems yet, but soon the semaphore will start to keep score of active/inactive permits which will be confused by such sharing, so have them use separate permits. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210326083147.26113-1-bdenes@scylladb.com>	2021-03-29 17:35:51 +03:00
Takuya ASADA	6f678ab7ff	aws: initialize self._disks['ebs'] when no EBS disks Seems like aws_instance.ebs_disks() causes traceback when no EBS disks available, need to initialize with empty list. Fixes #8365 Closes #8366	2021-03-29 17:21:14 +03:00
Gleb Natapov	13a3cf62bb	raft: move incoming message processing into per state functions Clean up step() function by moving state specific processing into per state functions. This way it is easier to see how each state handles individual messages. No functional changes here. Message-Id: <YGHCiTWjq+L/jVCB@scylladb.com>	2021-03-29 15:48:43 +02:00
Tomasz Grabiec	43fd322856	Merge 'scylla-gdb.py: Add io-queues command' from Piotr Sarna The command can be used to inspect IO queues of a local reactor. Example output: ``` (gdb) scylla io-queues Dev 0: Class: \|shares: \|ptr: -------------------------------------------------------------------------------- "default" \|1 \|(seastar::priority_class_data )0x6000002c6500 "commitlog" \|1000 \|(seastar::priority_class_data )0x6000003ad940 "memtable_flush" \|1000 \|(seastar::priority_class_data )0x6000005cb300 "streaming" \|200 \|(seastar::priority_class_data )0x0 "query" \|1000 \|(seastar::priority_class_data )0x600000718580 "compaction" \|1000 \|(seastar::priority_class_data )0x6000030ef0c0 Max request size: 2147483647 Max capacity: Ticket(weight: 4194303, size: 4194303) Capacity tail: Ticket(weight: 73168384, size: 100561888) Capacity head: Ticket(weight: 77360511, size: 104242143) Resources executing: Ticket(weight: 2176, size: 514048) Resources queued: Ticket(weight: 384, size: 98304) Handles: (1) Class 0x6000005d7278: Ticket(weight: 128, size: 32768) Ticket(weight: 128, size: 32768) Ticket(weight: 128, size: 32768) Pending in sink: (0) ``` Created when debugging a core dump. Turned out not to be immediately useful for this use case, but I'm publishing it since it may come in handy in future investigations. Closes #8362 * github.com:scylladb/scylla: scylla-gdb: add io-queues command scylla-gdb.py: add parsing std::priority_queue scylla-gdb.py: add parsing std::atomic scylla-gdb.py: add parsing std::shared_ptr scylla-db.py: add parsing intrusive_slist	2021-03-29 15:31:48 +02:00
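
The level selection described in the LCS reshape commit above (a390f4eb61), "log (base fan_out) of (total_size / max_sstable_size)", can be illustrated with a small sketch. This is not the ScyllaDB implementation: the function name `ideal_level`, the default fan-out of 10, the 160 MB maximum sstable size, and the choice to round up are all assumptions made for the example.

```python
def ideal_level(total_size: int,
                max_sstable_size: int = 160 * 1024 * 1024,  # assumed default
                fan_out: int = 10) -> int:                   # assumed default
    """Return the smallest level L such that fan_out**L max-sized sstables
    can hold total_size bytes, i.e. ceil(log_fan_out(total / max_size))."""
    if total_size <= 0 or max_sstable_size <= 0 or fan_out < 2:
        raise ValueError("sizes must be positive and fan_out must be >= 2")

    # Number of max-sized sstables needed for the input run (ceiling division).
    sstable_count = -(-total_size // max_sstable_size)

    # Integer loop instead of math.log to avoid floating-point rounding
    # errors at exact powers of fan_out.
    level = 0
    capacity = 1
    while capacity < sstable_count:
        capacity *= fan_out
        level += 1
    return level


if __name__ == "__main__":
    size = 160 * 1024 * 1024
    for count in (1, 10, 11, 1000):
        print(count, "max-sized sstables -> level", ideal_level(count * size))
```

Whether the real code rounds up, rounds down, or clamps the result is not stated in the commit message, so the level printed for a given input may differ from what ScyllaDB logs.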


25803 Commits