scylladb

Author	SHA1	Message	Date
Avi Kivity	31a5378a82	utils: utf8: avoid harmless integer overflow 240 doesn't fit in char without overflow, so cast it explicitly to avoid a clang warning.	2020-09-22 17:24:33 +03:00
Avi Kivity	e12c72ad55	utils: multiprecision_int: disambiguate operator templates by adding overloads We have templates for multiprecision_int for both sides of the operator, for example: template <typename T> bool operator==(const T& x) const and template <typename T> friend bool operator==(const T& x, const multiprecision_int& y) Clang considers them equally satisfying when both operands are multiprecision_int, so provide a disambiguating overload.	2020-09-22 17:24:33 +03:00
Avi Kivity	d1c049b202	utils: error_injection: remove forward-declared function returning auto Clang dislikes forward-declared functions returning auto, so declare the type up front. Functions returning auto are a readability problem anyway. To solve a circular dependency problem (get_local_injector() -> error_injection<> -> get_local_injector()), which is further compounded by problems in using template specializations before they are defined (which is forbidden), the storage for get_local_injector() was moved to error_injection<>, and get_local_injector() is just an accessor. After this, error_injection<> does not depend on get_local_injector().	2020-09-22 17:24:33 +03:00
Avi Kivity	765e632626	utils: bptree: remove redundant and possibly wrong friend declaration Clang complains about befriending a constructor. It's possibly correct. In any case it's redundant, so remove it.	2020-09-22 17:24:33 +03:00
Avi Kivity	c7105019b2	utils: bptree: add missing typename for clang Clang does not implement p0634r3, so we must add more typenames.	2020-09-22 17:24:33 +03:00
Avi Kivity	0d25ea5a67	utils: bloom_calculations: avoid gratuitous conversion to double The conversion to double evokes a complaint about precision loss from clang, and is unneeded anyway, so use integral types throughout.	2020-09-22 17:24:33 +03:00
Avi Kivity	4c93ec8351	utils: updateable_value: fix nullptr_t name nullptr_t's full name is std::nullptr_t. gcc somehow allows plain nullptr_t, but that's not correct. Clang rejects it. Use std::nullptr_t.	2020-09-22 17:24:33 +03:00
Avi Kivity	dcaf4ea4dd	Merge "Fix race in schema version recalculation leading to stale schema version in gossip" from Tomasz " Migration manager installs several cluster feature change listeners. The listeners will call update_schema_version_and_announce() when cluster features are enabled, which does this: return update_schema_version(proxy, features).then([] (utils::UUID uuid) { return announce_schema_version(uuid); }); It first updates the schema version and then publishes it via gossip in announce_schema_version(). It is possible that the announce_schema_version() part of the first schema change will be deferred and will execute after the other four calls to update_schema_version_and_announce(). It will install the old schema version in gossip instead of the more recent one. The fix is to serialize schema digest calculation and publishing. Refs #7200 This problem also brought my attention to initialization code, which could be prone to the same problem. The storage service computes gossiper states before it starts the gossiper. Among them, node's schema version. There are two problems with that. First is that computing the schema version and publishing it is not atomic, so is not safe against concurrent schema changes or schema version recalculations. It will not exclude with recalculate_schema_version() calls, and we could end up with the old (and incorrect) schema version being advertised in gossip. Second problem is that we should not allow the database layer to call into the gossiper layer before it is fully initialized, as this may produce undefined behavior. Maybe we're not doing concurrent schema changes/recalculations now, but it is easy to imagine that this could change for whatever reason in the future. The solution for both problems is to break the cyclic dependency between the database layer and the storage_service layer by having the database layer not use the gossiper at all. 
The database layer publishes schema version inside the database class and allows installing listeners on changes. The storage_service layer asks the database layer for the current version when it initializes, and only after that installs a listener which will update the gossiper. Tests: - unit (dev) - manual (3 node ccm) " * tag 'fix-schema-digest-calculation-race-v1' of github.com:tgrabiec/scylla: db, schema: Hide update_schema_version_and_announce() db, storage_service: Do not call into gossiper from the database layer db: Make schema version observable utils: updateable_value_source: Introduce as_observable() schema: Fix race in schema version recalculation leading to stale schema version in gossip	2020-09-14 12:37:46 +03:00
Tomasz Grabiec	fed89ee23e	utils: updateable_value_source: Introduce as_observable()	2020-09-11 14:42:41 +02:00
Avi Kivity	7ac59dcc98	lsa: decay reserves The log-structured allocator (LSA) reserves memory when performing operations, since its operations are performed with reclaiming disabled and if it runs out, it cannot evict cache to gain more. The amount of memory to reserve is remembered across calls so that it does not have to repeat the fail/increase-reserve/retry cycle for every operation. However, we currently lack decaying the amount to reserve. This means that if a single operation increased the reserve in the distant past, all current operations also require this large reserve. Large reserves are expensive since they can cause large amounts of cache to be evicted. This patch adds reserve decay. The time-to-decay is inversely proportional to reserve size: 10GB/reserve. This means that a 20MB reserve is halved after 500 operations (10GB/20MB) while a 20kB reserve is halved after 500,000 operations (10GB/20kB). So large, expensive reserves are decayed quickly while small, inexpensive reserves are decayed slowly to reduce the risk of allocation failures and exceptions. A unit test is added. Fixes #325.	2020-09-08 15:59:25 +03:00
Piotr Grabowski	ffd8c8c505	utf8: Print invalid UTF-8 character position Add new validate_with_error_position function which returns -1 if data is a valid UTF-8 string or otherwise a byte position of first invalid character. The position is added to exception messages of all UTF-8 parsing errors in Scylla. validate_with_error_position is done in two passes in order to preserve the same performance in common case when the string is valid.	2020-09-07 18:11:21 +03:00
Pavel Emelyanov	812eed27fe	code: Force formatting of pointer in .debug and .trace ... and tests. Printing a pointer in logs is considered bad practice, so the proposal is to keep this explicit (with fmt::ptr) and allow it for .debug and .trace cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Avi Kivity	392e24d199	Merge "Unglobal messaging service" from Pavel E " The messaging service is (as many other services) present in the global namespace and is widely accessed from where needed with global get(_local)?_messaging_service() calls. There's a long-term task to get rid of this globality and make services and components reference each-other and, for and due-to this, start and stop in specific order. This set makes this for the messaging service. The service is very low level and doesn't depend on anything. It's used by gossiper, streaming, repair, migration manager, storage proxy, storage service and API. According to these dependencies the set consists of several parts: patches 1-9 are preparatory, they encapsulate messaging service init/fini stuff in its own module and decouple it from the db::config patch 10-12 introduce local service reference in main and set its init/fini calls at the early stage so that this reference can later be passed to those depending on it patches 13-42 replace global referencing of messaging service from other subsystems with local references initialized from main. patch 43 finalizes tests. patch 44 wraps things up with removing global messaging service instance along with get(_local)?_messaging_service calls. The service's stopping part is deliberately left incomplete (as it is now), the sharded service remains alive, only the instance's stop() method is called (and is empty for a while). Since the messaging service's users still do not stop cleanly, its instances should better continue leaking on exit. Once (if) the seastar gets the helper rpc::has_handlers() method merged the messaging_service::stop() will be able to check if all the verbs had been unregistered (spoiler: not yet, more fixes to come). For debugging purposes the pointer on now-local messaging service instance is kept in service::debug namespace. 
tests: unit(dev) dtest(dev: simple_boot_shutdown, repair, update_cluster_layout) manual start-stop " * 'br-unglobal-messaging-service-2' of https://github.com/xemul/scylla: (44 commits) messaging_service: Unglobal messaging service instance tests: Use own instances of messaging_service storage_service: Use local messaging reference storage_service: Keep reference on sharded messaging service migration_manager: Add messaging service as argument to get_schema_definition migration_manager: Use local messaging reference in simple cases migration_manager: Keep reference on messaging migration_manager: Make push_schema_mutation private non-static method migration_manager: Move get_schema_version verb handling from proxy repair: Stop using global messaging_service references repair: Keep sharded messaging service reference on repair_meta repair: Keep sharded messaging service reference on repair_info repair: Keep reference on messaging in row-level code repair: Keep sharded messaging service in API repair: Unset API endpoints on stop repair: Setup API endpoints in separate helper repair: Push the sharded<messaging_service> reference down to sync_data_using_repair repair: Use existing sharded db reference repair: Mark repair.cc local functions as static streaming: Keep messaging service on send_info ...	2020-08-20 12:20:36 +03:00
Avi Kivity	f6b66456fd	Update seastar submodule Contains patch from Rafael to fix up includes. * seastar c872c3408c...7f7cf0f232 (9): > future: Consider result_unavailable invalid in future_state_base::ignore() > future: Consider result_unavailable invalid in future_state_base::valid() > Merge "future-util: split header" from Benny > docs: corrected some text and code-examples in streaming-rpc docs > future: Reduce nesting in future::then > demos: coroutines: include std-compat.hh > sstring: mark str() and methods using it as noexcept > tls: Add an assert > future: fix coroutine compilation	2020-08-19 17:18:57 +03:00
Pavel Emelyanov	c28aeaee2e	messaging_service: Move initialization to messaging/ Now init_messaging_service() only deals with the messaging service and related internal stuff, so it can sit in its own module. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-19 13:08:12 +03:00
Dejan Mircevski	fb6c011b52	everywhere: Insert space after `switch` Quoth @avikivity: "switch is not a function, and we celebrate that by putting a space after it like other control-flow keywords." https://github.com/scylladb/scylla/pull/7052#discussion_r471932710 Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-18 14:31:04 +03:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously, the `count` function was often used for this in various ways. `contains` not only expresses the intent of the code better but also does so in a more uniform way. This commit replaces all the occurrences of `count` with `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Avi Kivity	4547949420	Merge "Fix repair stalls in get_sync_boundary and apply_rows_on_master_in_thread" from Asias " This patch set fixes stalls in repair that are caused by std::list merge and clear operations during test_latency_read_with_nemesis test. Fixes #6940 Fixes #6975 Fixes #6976 " * 'fix_repair_list_stall_merge_clear_v2' of github.com:asias/scylla: repair: Fix stall in apply_rows_on_master_in_thread and apply_rows_on_follower repair: Use clear_gently in get_sync_boundary to avoid stall utils: Add clear_gently repair: Use merge_to_gently to merge two lists utils: Add merge_to_gently	2020-08-11 14:52:23 +03:00
Asias He	3e8c4a6788	utils: Add clear_gently A helper to clear a list without stall. Refs #6975 Refs #6940	2020-08-11 19:37:47 +08:00
Avi Kivity	41a75f2b99	Merge "make do_io_check path noexcept" from Benny " Make do_io_check and the io_check functions that call it noexcept. Up to sstable_write_io_check and sstable_touch_directory_io_check. Tests: unit (dev) " * tag 'io-check-noexcept-v1' of github.com:bhalevy/scylla: ssstable: io_check functions: make noexcept utils: do_io_check: adjust indentation utils: io_check: make noexcept for future-returning functions	2020-08-11 13:41:20 +03:00
Piotr Jastrzebski	80e3923b3c	codebase wide: replace find(...) != end() with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously the code pattern looked like: <collection>.find(<element>) != <collection>.end() In C++20 the same can be expressed with: <collection>.contains(<element>) This is not only more concise but also expresses the intent of the code more clearly. This commit replaces all the occurrences of the old pattern with the new approach. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>	2020-08-11 13:28:50 +03:00
Asias He	0bf0019eeb	utils: Add merge_to_gently This helper is similar to std::merge but it runs inside a thread and does not stall. Refs #6976	2020-08-11 10:37:34 +08:00
Benny Halevy	e33fc10638	utils: do_io_check: adjust indentation It was broken by the previous patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-06 19:01:18 +03:00
Benny Halevy	fd5b2672c1	utils: io_check: make noexcept for future-returning functions Use futurize_apply to handle any exception the passed function may throw. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-06 19:01:17 +03:00
Pavel Emelyanov	7c20e3ed05	utils: AVX searcher With all the preparations made so far it's now possible to implement the avx-powered search in an array. The array to search in has both -- capacity and size, so searching in it needs to take allocated, but unused tail into account. Two options for that -- limit the number of comparisons "by hands" or keep minimal and impossible value in this tail, scan "capacity" elements, then correct the result with "size" value. The latter approach is up to 50% faster than any (tried) attempt to do the former one. The run-time selection of the array search code is done with the gnu target attribute. It's available since gcc 4.8. For AVX-less platforms the default linear scanner is used. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-06 15:41:31 +03:00
Pavel Emelyanov	35a22ac48a	bptree: Special intra-node key search when possible If the key type is int64_t and the less-comparator is "natural" (i.e. it's literally 'a < b') we may use the SIMD instructions to search for the key on a node. Before doing so, the maybe_key and the searcher should be prepared for that, in particular: 1. maybe_key should set unused keys to the minimal value 2. the searcher for this case should call the gt() helper with primitive types -- int64_t search key and array of int64_t values To tell to B+ code that the key-less pair is such the less-er should define the simplify_key() method converting search keys to int64_t-s. This searcher is selected automatically, if any mismatch happens it silently falls back to default one. Thus also add a static assertion to the row-cache to mitigate this. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-06 15:41:31 +03:00
Pavel Emelyanov	14f0cdb779	bptree: Add lesses to maybe_key template The way maybe_key works will be in-sync with the intra-node searching code and will require to know what the Less type is, so prepare for that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-06 15:41:31 +03:00
Avi Kivity	c97924b8ad	Update seastar submodule util/loading_cache.hh includes adjusted. * seastar 02ad74fa7d...eb452a22a0 (17): > core: add missing include for std::allocator_traits > exceptions: move timed_out_error and factory into its own header file > future: parallel_for_each: add disable_failure_guard for parallel_for_each_state > Merge "Improve file API noexcept correctness" from Rafael > util: Add a with_allocation_failures helper > future: Fix indentation > future: Refactor duplicated try/catch > future: Make set_to_current_exception public > future: Add noexcept to continuation related functions > core: mark timer cancellation functions as noexcept > future: Simplify future::schedule > test: add a case for overwriting exact routes > http: throw on duplicated routes to prevent memory leaks > metrics: Remove the type label > fstream: turn file_data_source_impl's memory corruption bugs into aborts > doc: update tutorial splitting script > reactor_backend: let the reactor know again if any work was done by aio backend	2020-08-04 17:54:45 +03:00
Avi Kivity	4edfdfa78d	Merge 'Build id cleanups' from Benny " Refs #5525 - main: add --build-id option - build_id: mv sources to utils/ - build_id: throw on errors rather than assert - build_id: simplify callback pointer type casting " * bhalevy-build-id-cleanups: build_id: simplify callback pointer type casting build_id: mv sources to utils/ main: add --build-id option	2020-08-03 17:18:09 +03:00
Benny Halevy	9256d2f504	build_id: simplify callback pointer type casting Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-03 15:55:18 +03:00
Benny Halevy	bf6e8f66d9	build_id: mv sources to utils/ The root directory is already overcrowded. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-03 15:55:16 +03:00
Botond Dénes	f4c8163d11	db/config_file.hh: named_value: remove unused members _name and _desc They seem to be just copypasta. Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200803080604.45595-1-bdenes@scylladb.com>	2020-08-03 12:51:16 +03:00
Rafael Ávila de Espíndola	30722b8c8e	logalloc: Add disable_failure_guard during a few tls variable initialization The constructors of these global variables can allocate memory. Since the variables are thread_local, they are initialized at first use. There is nothing we can do if these allocations fail, so use disable_failure_guard. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200729184901.205646-1-espindola@scylladb.com>	2020-07-31 15:49:21 +02:00
Botond Dénes	9faaf46d4b	utils: config_src::add_command_line_options(): drop name and desc args Now that there are no ad-hoc aliases needing to overwrite the name and description parameter of this method, we can drop these and have each config item just use `name()` and `desc()` to access these.	2020-07-28 18:00:29 +03:00
Botond Dénes	003f5e9e54	utils: config: add alias support Allow configuration items to also have an alias, besides the name. This allows easy replacement of configuration items, with newer names, while still supporting the old name for backward compatibility. The alias mechanism takes care of registering both the name and the alias as command line arguments, as well as parsing them from YAML. The command line documentation of the alias will just refer to the name for documentation.	2020-07-28 17:59:51 +03:00
Avi Kivity	5371be71e9	Merge "Reduce fanout of some mutation-related headers" from Pavel E " The set's goal is to reduce the indirect fanout of 3 headers only, but likely affects more. The measured improvement rates are flat_mutation_reader.hh: -80% mutation.hh : -70% mutation_partition.hh : -20% tests: dev-build, 'checkheaders' for changed headers (the tree-wide fails on master) " * 'br-debloat-mutation-headers' of https://github.com/xemul/scylla: headers: Remove flat_mutation_reader.hh from several other headers migration_manager: Remove db/schema_tables.hh inclusion into header storage_proxy: Remove frozen_mutation.hh inclusion storage_proxy: Move paxos/*.hh inclusions from .hh to .cc storage_proxy: Move hint_wrapper from .hh to .cc headers: Remove mutation.hh from trace_state.hh	2020-07-19 19:47:59 +03:00
Pavel Emelyanov	8618a02815	migration_manager: Remove db/schema_tables.hh inclusion into header The schema_tables.hh -> migration_manager.hh couple seems to work as one of "single header for everything" creating big bloat for many seemingly unrelated .hh's. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-17 17:54:43 +03:00
Rafael Ávila de Espíndola	c5405a5268	managed_bytes: Delete dead 'if' If external is true, _u.ptr is not null. An empty managed_bytes uses the internal representation. The current code looks scary, since it seems possible that backref would still point to the old location, which would invite corruption when the reclaimer runs. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Reviewed-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200716233124.521796-1-espindola@scylladb.com>	2020-07-17 11:58:53 +03:00
Nadav Har'El	61f52da9b1	merge: Alternator/CDC: Implement streams support Merged pull request https://github.com/scylladb/scylla/pull/6694 by Calle Wilund: Implementation of DynamoDB streams using Scylla CDC. Fixes #5065 Initial, naive implementation insofar that it uses 1:1 mapping CDC stream to DynamoDB shard. I.e. there are a lot of shards. Includes tests verified against both local DynamoDB server and actual AWS remote one. Note: Because of how data put is implemented in alternator, currently we do not get "proper" INSERT labels for first write of data, because to CDC it looks like an update. The test compensates for this, but actual users might not like it.	2020-07-16 08:18:25 +03:00
Calle Wilund	699c4d2c7e	rjson: Add templated get/set overloads and optional get<T> To allow immediate json value conversion for types we have TypeHelper<...>:s for. Typed opt-get to get both automatic type conversion, _and_ find functionality in one call.	2020-07-15 08:10:23 +00:00
Calle Wilund	72ec525045	rjson: Add exception overloads To avoid copying error message composing, as well as forcing said code info rjson.cc. Also helps caller to determine fault by catch type.	2020-07-15 08:10:23 +00:00
Pavel Emelyanov	4d2f5f93a4	memtable: Switch onto B+ rails The change is the same as with row-cache -- use B+ with int64_t token as key and array of memtable_entry-s inside it. The changes are: Similar to those for row_cache: - compare() goes away, new collection uses ring_position_comparator - insertion and removal happens with the help of double_decker, most of the places are about slightly changed semantics of it - flags are added to memtable_entry, this makes its size larger than it could be, but still smaller than it was before Memtable-specific: - when the new entry is inserted into tree iterators _might_ get invalidated by double-decker inner array. This is easy to check when it happens, so the invalidation is avoided when possible - the size_in_allocator_without_rows() is now not very precise. This is because after the patch memtable_entries are not allocated individually as they used to. They can be squashed together with those having token conflict and asking allocator for the occupied memory slot is not possible. As the closest (lower) estimate the size of enclosing B+ data node is used Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-14 16:30:02 +03:00
Pavel Emelyanov	cf1315cde5	double-decker: A combination of B+tree with array The collection is K:V store bplus::tree<Key = K, Value = array_trusted_bounds<V>> It will be used as partitions cache. The outer tree is used to quickly map token to cache_entry, the inner array -- to resolve (expected to be rare) hash collisions. It also must be equipped with two comparators -- less one for keys and full one for values. The latter is not kept on-board, but it required on all calls. The core API consists of just 2 calls - Heterogenuous lower_bound(search_key) -> iterator : finds the element that's greater or equal to the provided search key Other than the iterator the call returns a "hint" object that helps the next call. - emplace_before(iterator, key, hint, ...) : the call construct the element right before the given iterator. The key and hint are needed for more optimal algo, but strictly speaking not required. Adding an entry to the double_decker may result in growing the node's array. Here to B+ iterator's .reconstruct() method comes into play. The new array is created, old elements are moved onto it, then the fresh node replaces the old one. // TODO: Ideally this should be turned into the // template <typename OuterCollection, typename InnerCollection> // but for now the double_decker still has some intimate knowledge // about what outer and inner collections are. Insertion into this collection _may_ invalidate iterators, but may leave intact. Invalidation only happens in case of hashing conflict, which can be clearly seen from the hint object, so there's a good room for improvement. The main usage by row_cache (the find_or_create_entry) looks like cache_entry find_or_create_entry() { bound_hint hint; it = lower_bound(decorated_key, &hint); if (!hint.found) { it = emplace_before(it, decorated_key.token(), hint, <constructor args>) } return *it; } Now the hint. 
It contains 3 booleans, that are - match: set to true when the "greater or equal" condition evaluated to "equal". This frees the caller from the need to manually check whether the entry returned matches the search key or the new one should be inserted. This is the "!found" check from the above snippet. To explain the next 2 bools, here's a small example. Consider the tree containing two elements {token, partition key}: { 3, "a" }, { 5, "z" } As the collection is sorted they go in the order shown. Next, this is what the lower_bound would return for some cases: { 3, "z" } -> { 5, "z" } { 4, "a" } -> { 5, "z" } { 5, "a" } -> { 5, "z" } Apparently, the lower bound for those 3 elements are the same, but the code-flows of emplacing them before one differ drastically. { 3, "z" } : need to get previous element from the tree and push the element to it's vector's back { 4, "a" } : need to create new element in the tree and populate its empty vector with the single element { 5, "a" } : need to put the new element in the found tree element right before the found vector position To make one of the above decisions the .emplace_before would need to perform another set of comparisons of keys and elements. Fortunately, the needed information was already known inside the lower_bound call and can be reported via the hint. Said that, - key_match: set to true if tree.lower_bound() found the element for the Key (which is token). For above examples this will be true for cases 3z and 5a. - key_tail: set to true if the tree element was found, but when comparing values from array the bounding element turned out to belong to the next tree element and the iterator was ++-ed. For above examples this would be true for case 3z only. And the last, but not least -- the "erase self" feature. Which is given only the cache_entry pointer at hands remove it from the collection. To make this happen we need to make two steps: 1. get the array the entry sits in 2. 
get the b+ tree node the vector sits in Both methods are provided by array_trusted_bounds and bplus::tree. So, when we need to get iterator from the given T pointer, the algo looks like - Walk back the T array until hitting the head element - Call array_trusted_bounds::from_element() getting the array - Construct b+ iterator from obtained array - Construct the double_decker iterator from b+ iterator and from the number of "steps back" from above - Call double_decker::iterator.erase() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-14 16:29:53 +03:00
Pavel Emelyanov	eb70644c1c	intrusive-array: Array with trusted bounds A plain array of elements that grows and shrinks by constructing the new instance from an existing one and moving the elements from it. Behaves similarly to vector's external array, but has 0-bytes overhead. The array bounds (0-th and N-th elements) are determined by checking the flags on the elements themselves. For this the type must support getters and setters for the flags. To remove an element from array there's also a nothrow option that drops the requested element from array, shifts the elements to its right leftwards and keeps the trailing unused memory (so called "train") until reconstruction or destruction. Also comes with lower_bound() helper that helps keep the elements sorted and the from_element() one that returns a reference back to the array in which the element sits. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-14 16:29:49 +03:00
Pavel Emelyanov	95f15ea383	utils: B+ tree implementation // The story is at // https://groups.google.com/forum/#!msg/scylladb-dev/sxqTHM9rSDQ/WqwF1AQDAQAJ This is the B+ version which satisfies several specific requirements to be suitable for row-cache usage. 1. Insert/Remove doesn't invalidate iterators 2. Elements should be LSA-compactable 3. Low overhead of data nodes (1 pointer) 4. External less-only comparator 5. As little actions on insert/delete as possible 6. Iterator walks the sorted keys The design, briefly is: There are 3 types of nodes: inner, leaf and data, inner and leaf keep build-in array of N keys and N(+1) nodes. Leaf nodes sit in a doubly linked list. Data nodes live separately from the leaf ones and keep pointers on them. Tree handler keeps pointers on root and left-most and right-most leaves. Nodes do _not_ keep pointers or references on the tree (except 3 of them, see below). changes in v9: - explicitly marked keys/kids indices with type aliases - marked the whole erase/clear stuff noexcept - disposers now accept object pointer instead of reference - clear tree in destructor - added more comments - style/readability review comments fixed Prior changes - Add noexcepts where possible - Restrict Less-comparator constraint -- it must be noexcept - Generalized node_id - Packed code for beging()/cbegin() - Unsigned indices everywhere - Cosmetics changes - Const iterators - C++20 concepts - The index_for() implmenetation is templatized the other way to make it possible for AVX key search specialization (further patching) - Insertion tries to push kids to siblings before split Before this change insertion into full node resulted into this node being split into two equal parts. This behaviour for random keys stress gives a tree with ~2/3 of nodes half-filled. With this change before splitting the full node try to push one element to each of the siblings (if they exist and not full). 
This slows the insertion a bit (but it's still way faster than the std::set), but gives 15% fewer nodes in total. - Iterator method to reconstruct the data at the given position The helper creates a new data node, emplaces data into it and replaces the iterator's one with it. Needed to keep arrays of data in tree. - Milli-optimize erase() - Return back an iterator that will likely be not re-validated - Do not try to update ancestors separation key for leftmost kid This caused the clear()-like workload work poorly as compared to std::set. In particular the row_cache::invalidate() method does exactly this and this change improves its timing. - Perf test to measure drain speed - Helper call to collect tree counters - Fix corner case of iterator.emplace_before() - Clean heterogeneous lookup API - Handle exceptions from nodes allocations - Explicitly mark places where the key is copied (for future) - Extend the tree.lower_bound() API to report back whether the bound hit the key or not - Addressed style/cleanness review comments Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-14 16:29:43 +03:00
Pavel Emelyanov	3237796e00	region: Mark trivial noexcept methods as such Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-09 14:41:37 +03:00
Pavel Emelyanov	2c4a94aeab	allocation_strategy: Mark returning lambda as noexcept It just calls current_allocator().destroy() which is noexcept Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-09 14:41:23 +03:00
Pavel Emelyanov	a497dfdd0b	allocation_strategy: Mark trivial noexcept methods as such Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-09 14:41:03 +03:00
Avi Kivity	cc891a5de8	Merge "Convert a few uses of sstring to std::string_view" from Rafael " This series converts an API to use std::string_view and then converts a few sstring variables to be constexpr std::string_view. This has the advantage that a constexpr variables cannot be part of any initialization order problem. " * 'espindola/convert-to-constexpr' of https://github.com/espindola/scylla: auth: Convert sstring variables in common.hh to constexpr std::string_view auth: Convert sstring variables in default_authorizer to constexpr std::string_view cql_test_env: Make ks_name a constexpr std::string_view class_registry: Use std::string_view in (un)?qualified_name	2020-07-05 17:08:54 +03:00
Rafael Ávila de Espíndola	a2110e413f	class_registry: Use std::string_view in (un)?qualified_name This gives more flexibility for constructing a qualified_name or unqualified_name. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-03 12:28:14 -07:00
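
The reserve-decay figures quoted in commit 7ac59dcc98 ("lsa: decay reserves") can be checked with a short sketch. This is only the arithmetic from the commit message, not the LSA code: the names `decay_budget_bytes` and `ops_to_halve` are hypothetical, and decimal units (1 GB = 10^9 bytes) are assumed because they reproduce the numbers given there.

```cpp
#include <cstdint>

// Hypothetical constant: the "10GB" from the commit message,
// assuming decimal units (1 GB = 10^9 bytes).
constexpr std::uint64_t decay_budget_bytes = 10000000000ULL;

// Hypothetical helper: number of operations after which a reserve of the
// given size is halved, following time-to-decay = 10GB / reserve.
// reserve_bytes must be non-zero.
constexpr std::uint64_t ops_to_halve(std::uint64_t reserve_bytes) {
    return decay_budget_bytes / reserve_bytes;
}

// The two cases from the commit message, checked at compile time.
static_assert(ops_to_halve(20000000ULL) == 500, "20MB reserve");
static_assert(ops_to_halve(20000ULL) == 500000, "20kB reserve");
```

The inverse proportionality is the point: a reserve 1000 times larger is halved 1000 times sooner, so expensive reserves decay quickly while cheap ones decay slowly.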

816 Commits