scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 18:10:39 +00:00

Author	SHA1	Message	Date
Yaniv Kaul	ae2ab6000a	Typos: fix typos in code Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255	2023-12-05 15:18:11 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Avi Kivity	14a4173f50	treewide: make headers self-sufficient In preparation for some large header changes, fix up any headers that aren't self-sufficient by adding needed includes or forward declarations.	2021-04-20 21:23:00 +03:00
Pavel Emelyanov	2f7c03d84c	utils: Intrusive B-tree (with tests) The design of the tree follows from the row-cache needs, which are 1. Insert/Remove do not invalidate iterators 2. Elements are LSA-manageable 3. Low key overhead 4. External tri-comparator 5. As few actions on insert/remove as possible With the above the design is Two types of nodes -- inner and leaf. Both types keep a pointer to the parent node and N pointers to keys (not keys themselves). Two differences: inner nodes have an array of pointers to kids, leaf nodes keep a pointer to the tree (to update left- and rightmost tree pointers on node move). Nodes do not keep pointers/references on trees, thus we have O(1) move of any object, but O(logN) to get the tree size. Fortunately, with a big keys-per-node value this won't result in too many steps. In turn, the tree has 3 pointers -- root, left- and rightmost leaves. The latter is for constant-time begin() and end(). Keys are managed by the user with the help of an embeddable member_hook instance, which is 1 pointer in size. The code was copied from the B+ tree one, then heavily reworked; the internal algorithms turned out to differ quite significantly. For the sake of mutation_partition::apply_monotonically(), which needs to move an element from one tree into another, there's a key_grabber helping wrapper that allows doing this move respecting the exception-safety requirement. As measured by the perf_collections test the B-tree with 8 keys is faster than std::set, but slower than the B+tree: fill: +13% vs set, -6% vs b+tree; find: +23% vs set, -35% vs b+tree. Another neat thing is that 1-key insertion-removal is ~40% faster than for BST (the same number of allocations, but the key object is smaller, fewer pointers to set up and fewer instructions to execute when linking node with root). v4: - equip insertion methods with on_alloc_point() calls to catch potential exception guarantees violations earlier - add unlink_leftmost_without_rebalance. The method is borrowed from boost intrusive set, and is added to kill two birds -- provide it, as it turns out to be popular, and use a bit faster step-by-step tree destruction than plain begin+erase loop v3: - introduce "inline" root node that is embedded into tree object and in which the 1st key is inserted. This greatly improves the 1-key-tree performance, which is a pretty common case for rows cache v2: - introduce "linear" root leaf that grows on demand This improves the memory consumption for small trees. This linear node may and should over-grow the NodeSize parameter. This comes from the fact that there are two big per-key memory spikes on small trees -- 1-key root leaf and the first split, when the tree becomes 1-key root with two half-filled leaves. If the linear extension goes above NodeSize it can flatten even the 2nd peak - mitigate the keys indirection a bit Prefetching the keys while doing the intra-node linear scan and the nodes while descending the tree gives ~+5% of fill and find - generalize stress tests for B and B+ trees - cosmetic changes TODO: - fix a few inefficiencies in the core code (walks the sub-tree twice sometimes) - try to optimize the leaf nodes that are not left-/rightmost not to carry unused tree pointer on board Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-02-02 09:30:29 +03:00
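
The node layout described in the B-tree commit message (a node holding N pointers to keys rather than the keys themselves, searched with an intra-node linear scan and an external tri-comparator) can be sketched roughly as follows. This is an illustration only, not the code from the commit: `key`, `key_tri_compare`, `node_sketch` and `lower_index` are hypothetical names, and hooks, parent pointers, LSA integration and prefetching are all left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical key type. In the real tree the user's object would embed a
// member_hook; that part is omitted here.
struct key {
    int value;
};

// External tri-comparator (hypothetical): negative, zero or positive result.
struct key_tri_compare {
    int operator()(const key& a, const key& b) const noexcept {
        return (a.value > b.value) - (a.value < b.value);
    }
};

// Simplified leaf-like node: stores pointers to keys, not the keys themselves.
template <std::size_t NodeSize>
struct node_sketch {
    std::array<const key*, NodeSize> keys{};
    std::size_t num_keys = 0;

    // Appends a key pointer; the caller is responsible for keeping order.
    void push_back(const key* k) {
        assert(num_keys < NodeSize);
        keys[num_keys++] = k;
    }

    // Linear scan: index of the first key that is not less than `k`.
    // `match` is set to true only when an equal key was found.
    template <typename Compare>
    std::size_t lower_index(const key& k, Compare cmp, bool& match) const {
        match = false;
        for (std::size_t i = 0; i < num_keys; ++i) {
            const int c = cmp(*keys[i], k);  // one pointer dereference per step
            if (c >= 0) {
                match = (c == 0);
                return i;
            }
        }
        return num_keys;
    }
};
```

Every comparison dereferences a key pointer, which is the "keys indirection" the v2 notes mention mitigating with prefetching; a tri-comparator lets a single call both stop the scan and report an exact match.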

6 Commits