scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 03:56:42 +00:00

Author	SHA1	Message	Date
Jan Ciolek	8d7e35caef	cql3: expr: remove reference to temporary in get_rhs_receiver The function underlying_type() returns a data_type by value, but the code assigned it to a reference. At first I was sure this is an error (assigning temporary value to a reference), but it turns out that this is most likely correct due to C++ lifetime extension rules. I think it's better to avoid such unintuitive tricks. Assigning to value makes it clearer that the code is correct and there are no dangling references. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes #12485	2023-01-10 09:42:49 +02:00
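The lifetime extension rule mentioned in this commit can be shown with a small standalone sketch. The function `make_name()` is a hypothetical stand-in for any function that returns by value (such as `underlying_type()`); it is not Scylla code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a function returning by value.
std::string make_name() { return "int32"; }

// Binding the returned temporary to a const reference extends the
// temporary's lifetime to that of the reference, so this is well-defined.
std::size_t length_via_reference() {
    const std::string& name = make_name();
    return name.size();
}

// Storing the result by value is equally correct and makes the
// ownership obvious to the reader, which is the commit's point.
std::size_t length_via_value() {
    std::string name = make_name();
    return name.size();
}

// Usage: both variants observe the same, still-alive string.
void lifetime_demo() {
    assert(length_via_reference() == length_via_value());
}
```

Both functions are valid C++; the by-value form simply removes the need to reason about lifetime extension.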
Raphael "Raph" Carvalho	407c7fdaf2	docs: Fix command to create a symbolic link to relocatable pkg dir Closes #12481	2023-01-10 07:09:14 +02:00
Kamil Braun	822410c49b	test/pylib: scylla_cluster: release IPs when cluster is no longer needed With sufficiently many test cases we would eventually run out of IP addresses, because IPs (which are leased from a global host registry) would only be released at the end of an entire test suite. In fact we already hit this during next promotions, causing much pain indeed. Release IPs when a cluster, after being marked dirty, is stopped and thrown away. Closes #12482	2023-01-10 06:59:41 +02:00
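The leasing scheme described in this commit can be sketched as follows. The real code is Python test infrastructure; the C++ classes `host_registry` and `cluster` below are hypothetical names used only to illustrate releasing addresses as soon as a cluster is discarded.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pool of leasable addresses.
class host_registry {
    std::vector<std::string> free_;
public:
    explicit host_registry(std::vector<std::string> ips) : free_(std::move(ips)) {}

    // Hand out one address, or nothing if the pool is exhausted.
    std::optional<std::string> lease() {
        if (free_.empty()) {
            return std::nullopt;
        }
        std::string ip = std::move(free_.back());
        free_.pop_back();
        return ip;
    }

    void release(std::string ip) { free_.push_back(std::move(ip)); }
    std::size_t available() const { return free_.size(); }
};

// Hypothetical cluster that gives its addresses back when it is thrown away,
// instead of holding them until the whole test suite ends.
class cluster {
    host_registry& registry_;
    std::vector<std::string> leased_;
public:
    explicit cluster(host_registry& r) : registry_(r) {}

    bool add_node() {
        auto ip = registry_.lease();
        if (!ip) {
            return false;
        }
        leased_.push_back(std::move(*ip));
        return true;
    }

    void stop_and_discard() {
        for (auto& ip : leased_) {
            registry_.release(std::move(ip));
        }
        leased_.clear();
    }
};
```

With a two-address pool, a second cluster can only start nodes after the first one has called `stop_and_discard()`.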
Avi Kivity	e71e1dc964	Merge 'tools/scylla-sstable: add lua scripting support' from Botond Dénes Introduce a new "script" operation, which loads a script from the specified path, then feeds the mutation fragment stream to it. The script can then extract, process and present information from the sstable as it wishes. For now only Lua scripts are supported for the simple reason that Lua is easy to write bindings for, it is simple and lightweight and more importantly we already have Lua included in the Scylla binary as it is used as the implementation language for UDF/UDA. We might consider WASM support in the future, but for now we don't have any language support in WASM available. Example: ```lua function new_stats(key) return { partition_key = key, total = 0, partition = 0, static_row = 0, clustering_row = 0, range_tombstone_change = 0, }; end total_stats = new_stats(nil); function inc_stat(stats, field) stats[field] = stats[field] + 1; stats.total = stats.total + 1; total_stats[field] = total_stats[field] + 1; total_stats.total = total_stats.total + 1; end function on_new_sstable(sst) max_partition_stats = new_stats(nil); if sst then current_sst_filename = sst.filename; else current_sst_filename = nil; end end function consume_partition_start(ps) current_partition_stats = new_stats(ps.key); inc_stat(current_partition_stats, "partition"); end function consume_static_row(sr) inc_stat(current_partition_stats, "static_row"); end function consume_clustering_row(cr) inc_stat(current_partition_stats, "clustering_row"); end function consume_range_tombstone_change(crt) inc_stat(current_partition_stats, "range_tombstone_change"); end function consume_partition_end() if current_partition_stats.total > max_partition_stats.total then max_partition_stats = current_partition_stats; end end function on_end_of_sstable() if current_sst_filename then print(string.format("Stats for sstable %s:", current_sst_filename)); else print("Stats for stream:"); end print(string.format("\t%d 
fragments in %d partitions - %d static rows, %d clustering rows and %d range tombstone changes", total_stats.total, total_stats.partition, total_stats.static_row, total_stats.clustering_row, total_stats.range_tombstone_change)); print(string.format("\tPartition with max number of fragments (%d): %s - %d static rows, %d clustering rows and %d range tombstone changes", max_partition_stats.total, max_partition_stats.partition_key, max_partition_stats.static_row, max_partition_stats.clustering_row, max_partition_stats.range_tombstone_change)); end ``` Running this script will yield the following: ``` $ scylla sstable script --script-file fragment-stats.lua --system-schema system_schema.columns /var/lib/scylla/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f/me-1-big-Data.db Stats for sstable /var/lib/scylla/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f//me-1-big-Data.db: 397 fragments in 7 partitions - 0 static rows, 362 clustering rows and 28 range tombstone changes Partition with max number of fragments (180): system - 0 static rows, 179 clustering rows and 0 range tombstone changes ``` Fixes: https://github.com/scylladb/scylladb/issues/9679 Closes #11649 * github.com:scylladb/scylladb: tools/scylla-sstable: consume_reader(): improve pause heuristics test/cql-pytest/test_tools.py: add test for scylla-sstable script tools: add scylla-sstable-scripts directory tools/scylla-sstable: remove custom operation tools/scylla-sstable: add script operation tools/sstable: introduce the Lua sstable consumer dht/i_partitioner.hh: ring_position_ext: add weight() accessor lang/lua: export Scylla <-> lua type conversion methods lang/lua: use correct lib name for string lib lang/lua: fix type in aligned_used_data (meant to be user_data) lang/lua: use lua_State* in Scylla type <-> Lua type conversions tools/sstable_consumer: more consistent method naming tools/scylla-sstable: extract sstable_consumer interface into own header tools/json_writer: add accessor to 
underlying writer tools/scylla-sstable: fix indentation tools/scylla-sstable: export mutation_fragment_json_writer declaration tools/scylla-sstable: mutation_fragment_json_writer un-implement sstable_consumer tools/scylla-sstable: extract json writing logic from json_dumper tools/scylla-sstable: extract json_writer into its own header tools/scylla-sstable: use json_writer::DataKey() to write all keys tools/scylla-types: fix use-after-free on main lambda captures	2023-01-09 20:54:42 +02:00
Raphael S. Carvalho	05ffb024bb	replica: Kill table::calculate_shard_from_sstable_generation() Inferring shard from generation is long gone. We still use it in some scripts, but that's no longer needed in Scylla, when loading the SSTables, and it also conflicts with ongoing work of UUID-based generations. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12476	2023-01-09 20:17:57 +02:00
Takuya ASADA	548c9e36a1	main: add tcp_timestamps sanity check Check net.ipv4.tcp_timestamps, show warning message when it's not set to 1. Fixes #12144 Closes #12199	2023-01-09 19:08:21 +02:00
Nadav Har'El	d6e6820f33	Merge 'Drop support for cql binary protocols versions 1 and 2' from Avi Kivity The CQL binary protocol version 3 was introduced in 2014. All Scylla versions support it, as do Cassandra versions 2.1 and newer. Versions 1 and 2 have 16-bit collection sizes, while protocol 3 and newer use 32-bit collection sizes. Unfortunately, we implemented support for multiple serialization formats very intrusively, by pushing the format everywhere. This avoids the need to re-serialize (sometimes) but is quite obnoxious. It's also likely to be broken, since it's almost untested and it's too easy to write cql_serialization_format::internal() instead of propagating the client specified value. Since protocols 1 and 2 have been obsolete for 9 years, just drop them. It's easy to verify that they are no longer in use on a running system by examining the `system.clients` table before upgrade. Fixes #10607 Closes #12432 * github.com:scylladb/scylladb: treewide: drop cql_serialization_format cql: modification_statement: drop protocol check for LWT transport: drop cql protocol versions 1 and 2	2023-01-09 18:52:41 +02:00
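The difference between the two serialization formats comes down to the width of the element count written in front of a collection. A minimal sketch, assuming big-endian encoding as used by the CQL binary protocol; `write_be()` and `encode_collection_size()` are hypothetical helper names, not Scylla functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append `value` as `width` big-endian bytes (width is 2 or 4 here).
void write_be(std::vector<std::uint8_t>& out, std::uint32_t value, int width) {
    for (int shift = 8 * (width - 1); shift >= 0; shift -= 8) {
        out.push_back(static_cast<std::uint8_t>(value >> shift));
    }
}

// Protocol versions 1 and 2 wrote a 16-bit element count, so counts above
// 65535 cannot be represented; version 3 and newer write a 32-bit count.
std::vector<std::uint8_t> encode_collection_size(std::uint32_t count, int protocol_version) {
    std::vector<std::uint8_t> out;
    write_be(out, count, protocol_version >= 3 ? 4 : 2);
    return out;
}

// Usage: the same count occupies 2 bytes in the old format and 4 in the new one.
void collection_size_demo() {
    assert(encode_collection_size(3, 2).size() == 2);
    assert(encode_collection_size(3, 3).size() == 4);
}
```

Carrying the protocol version into every serialization call, as this sketch does with `protocol_version`, is the kind of intrusive plumbing the series removes.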
Botond Dénes	bd42da6e69	tools/scylla-sstable: consume_reader(): improve pause heuristics The consume loop had some heuristics in place to determine whether, after pausing, the consumer wishes to skip just the partition or the remaining content of the sstable. These heuristics were flawed, so replace them with a non-heuristic method: track the last consumed fragment and look at it to determine what should be done.	2023-01-09 09:46:57 -05:00
Botond Dénes	1d222220e0	test/cql-pytest/test_tools.py: add test for scylla-sstable script To test the script operation, we use some of the example scripts from the example directory. Namely, dump.lua and slice.lua. These two scripts together have a very good coverage of the entire script API. Testing their functionality therefore also provides a good coverage of the lua bindings. A further advantage is that since both scripts dump output in identical format to that of the data-dump operation, it is trivial to do a comparison against this already tested operation. A targeted test is written for the sstable skip functionality of the consumer API.	2023-01-09 09:46:57 -05:00
Botond Dénes	ace42202df	tools: add scylla-sstable-scripts directory To be the home of example scripts for scylla-sstable. For now only a README.md is added describing the directory's purpose and with links to useful resources. One example script is added in this patch, more will come later.	2023-01-09 09:46:57 -05:00
Botond Dénes	7b40463f29	tools/scylla-sstable: remove custom operation We now have a script operation, the custom operation (poor man's script operation) has no reason to exist anymore.	2023-01-09 09:46:57 -05:00
Botond Dénes	e5071fdeab	tools/scylla-sstable: add script operation Loads the script from the specified path, then feeds the mutation fragment stream to it. For now only Lua scripts are supported for the simple reason that Lua is easy to write bindings for, it is simple and lightweight and more importantly we already have Lua included in the Scylla binary as it is used as the implementation language for UDF/UDA. We might consider WASM support in the future, but for now we don't have any language support in WASM available.	2023-01-09 09:46:57 -05:00
Botond Dénes	9dd5107919	tools/sstable: introduce the Lua sstable consumer The Lua sstable consumer loads a script from the specified path then feeds the mutation fragment stream to the script via the sstable_consumer methods, each method of which the script is allowed to define, effectively overloading the virtual method in Lua. This allows for very wide and flexible customization opportunities for what to extract from sstables and how to process and present them, without the need to recompile the scylla-sstable tool.	2023-01-09 09:46:57 -05:00
Botond Dénes	50b155e706	dht/i_partitioner.hh: ring_position_ext: add weight() accessor	2023-01-09 09:46:57 -05:00
Botond Dénes	8699fe5001	lang/lua: export Scylla <-> lua type conversion methods Currently hidden in lang/lua.cc, declare these in a header so others can use it.	2023-01-09 09:46:57 -05:00
Botond Dénes	e9a52837cf	lang/lua: use correct lib name for string lib AFAIK the mistake had no real consequence, but still it is nicer to have it correct.	2023-01-09 09:46:57 -05:00
Botond Dénes	76663d7774	lang/lua: fix type in aligned_used_data (meant to be user_data)	2023-01-09 09:46:57 -05:00
Botond Dénes	943fc3b6f3	lang/lua: use lua_State* in Scylla type <-> Lua type conversions Instead of the lua_slice_state which is local to this file. We want to reuse the Scylla type <-> Lua type conversion functions but for that they have to use the more generic lua_State*. No functionality or convenience is lost with the switch, the code didn't make use of the other fields bundled in lua_slice_state.	2023-01-09 09:46:57 -05:00
Botond Dénes	8045751867	tools/sstable_consumer: more consistent method naming Use `consume_` consistently across the entire interface, instead of having some methods with `on_` and others with `consume_` prefixes.	2023-01-09 09:46:57 -05:00
Botond Dénes	8e117501ac	tools/scylla-sstable: extract sstable_consumer interface into own header So it can be used in code outside scylla-sstable.cc. This source file is quite large already, and as we have yet another large chunk of code to add, we want to add it in a separate file.	2023-01-09 09:46:57 -05:00
Botond Dénes	9b1c486051	tools/json_writer: add accessor to underlying writer	2023-01-09 09:46:57 -05:00
Botond Dénes	cfb5afbe9b	tools/scylla-sstable: fix indentation Left broken by previous patches.	2023-01-09 09:46:57 -05:00
Botond Dénes	d42b0bb5d5	tools/scylla-sstable: export mutation_fragment_json_writer declaration To json_writer.hh. Method definitions are left in scylla-sstable.cc. Indentation is left broken, will be fixed by the next patch.	2023-01-09 09:46:57 -05:00
Botond Dénes	517135e155	tools/scylla-sstable: mutation_fragment_json_writer un-implement sstable_consumer There is no point in the former implementing said interface. For one it is a futurized interface, which is not needed for something writing to the stdout. Rename the methods to follow the naming convention of rjson writers more closely.	2023-01-09 09:46:57 -05:00
Botond Dénes	0ee1c6ca57	tools/scylla-sstable: extract json writing logic from json_dumper We want to split this class into two parts: one with the actual logic converting mutation fragments to json, and a wrapper over this one, which implements the sstable_consumer interface. As a first step we extract the class as is (no changes) and just forward all calls from the now-empty wrapper to it.	2023-01-09 09:46:57 -05:00
Botond Dénes	55ef0ed421	tools/scylla-sstable: extract json_writer into its own header Other source files will want to use it soon.	2023-01-09 09:46:57 -05:00
Botond Dénes	8623818a8d	tools/scylla-sstable: use json_writer::DataKey() to write all keys This method was renamed from its previous name of PartitionKey. Since in json partition keys and clustering keys look alike, with the only difference being that the former may also have a token, it makes sense to have a single method to write them (with an optional token parameter). This was the case at some point, json_dumper::write_key() taking this role. However at a later point, json_writer::PartitionKey() was introduced and now the code uses both. Standardize on the latter and give it a more generic name.	2023-01-09 09:46:57 -05:00
Botond Dénes	602fca0a12	tools/scylla-types: fix use-after-free on main lambda captures The main lambda of scylla-types, the one passed to app_template::run() was recently made a coroutine. app_template::run() however doesn't keep this lambda alive and hence after the first suspension point, accessing the lambda's captures triggers use-after-free. The simple fix is to convert the coroutine into a continuation chain.	2023-01-09 09:46:57 -05:00
Tomasz Grabiec	f97268d8f2	row_cache: Fix violation of the "oldest version are evicted first" when evicting last dummy Consider the following MVCC state of a partition: v2: ==== <7> [entry2] ==== <9> ===== <last dummy> v1: ================================ <last dummy> [entry1] Where === means a continuous range and --- means a discontinuous range. After two LRU items are evicted (entry1 and entry2), we will end up with: v2: ---------------------- <9> ===== <last dummy> v1: ================================ <last dummy> [entry1] This will cause readers to incorrectly think there are no rows before entry <9>, because the range is continuous in v1, and continuity of a snapshot is a union of continuous intervals in all versions. The cursor will see the interval before <9> as continuous and the reader will produce no rows. This is only temporary, because current MVCC merging rules are such that the flag on the latest entry wins, so we'll end up with this once v1 is no longer needed: v2: ---------------------- <9> ===== <last dummy> ...and the reader will go to sstables to fetch the evicted rows before entry <9>, as expected. The bug is in rows_entry::on_evicted(), which treats the last dummy entry in a special way, and doesn't evict it, and doesn't clear the continuity by omission. The situation is not easy to trigger because it requires certain eviction pattern concurrent with multiple reads of the same partition in different versions, so across memtable flushes. Closes #12452	2023-01-09 16:10:52 +02:00
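The rule that snapshot continuity is the union of the continuous intervals of all versions explains why the evicted range still looked continuous. A minimal sketch, assuming integer clustering positions and half-open intervals; the types `interval` and `version` are hypothetical simplifications, not the row cache's data structures.

```cpp
#include <cassert>
#include <vector>

// Half-open range [lo, hi) of positions that a version marks as continuous.
struct interval {
    int lo;
    int hi;
};

// One MVCC version, reduced to its list of continuous intervals.
using version = std::vector<interval>;

bool continuous_in_version(const version& v, int pos) {
    for (const interval& i : v) {
        if (pos >= i.lo && pos < i.hi) {
            return true;
        }
    }
    return false;
}

// A position is continuous in the snapshot if any version covers it.
bool continuous_in_snapshot(const std::vector<version>& versions, int pos) {
    for (const version& v : versions) {
        if (continuous_in_version(v, pos)) {
            return true;
        }
    }
    return false;
}

// Usage: v1 stays fully continuous after losing its entry, so it masks the
// discontinuity that eviction created in v2 for positions before 9.
void continuity_demo() {
    version v1 = {{0, 100}};
    version v2 = {{9, 100}};
    assert(continuous_in_snapshot({v2, v1}, 5));
}
```

Once v1 is dropped and only v2 remains, position 5 is no longer covered, which matches the commit's note that the wrong result is only temporary.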
Avi Kivity	1bb1855757	Merge 'replica/database: fix read related metrics' from Botond Dénes Sstable read related metrics have been broken for a long time now. First, the introduction of inactive reads (https://github.com/scylladb/scylladb/issues/1865) diluted this metric, as it now also contained inactive reads (contrary to the metric's name). Then, after moving the semaphore in front of the cache (`3d816b7c1`) this metric became completely broken as this metric now contains all kinds of reads: disk, in-memory and inactive ones too. This series aims to remedy this: * `scylla_database_active_reads` is fixed to only include active reads. * `scylla_database_active_reads_memory_consumption` is renamed to `scylla_database_reads_memory_consumption` and its description is brought up-to-date. * `scylla_database_disk_reads` is added to track current reads that have gone to disk. * `scylla_database_sstables_read` is added to track the number of sstables read currently. Fixes: https://github.com/scylladb/scylladb/issues/10065 Closes #12437 * github.com:scylladb/scylladb: replica/database: add disk_reads and sstables_read metrics sstables: wire in the reader_permit's sstable read count tracking reader_concurrency_semaphore: add disk_reads and sstables_read stats replica/database: fix active_reads_memory_consumption_metric replica/database: fix active_reads metric	2023-01-09 12:18:49 +02:00
Pavel Emelyanov	e20738cd7d	azure_snitch: Handle empty zone returned from IMDS Azure metadata API may return empty zone sometimes. If that happens shard-0 gets empty string as its rack, but propagates UNKNOWN_RACK to other shards. Empty zones response should be handled regardless. refs: #12185 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12274	2023-01-09 11:57:45 +02:00
Nadav Har'El	2d845b6244	test/cql-pytest: a test for more than one equality in WHERE Cassandra refuses a request with more than one equality relation to the same column, for example DELETE FROM tbl WHERE partitionKey = ? AND partitionKey = ? It complains that partitionkey cannot be restricted by more than one relation if it includes an Equal. Currently, Scylla doesn't consider such requests an error. Whether or not we should be compatible with Cassandra here is discussed in issue #12472. But as long as we do accept this query, we should be sure we do the right thing: "WHERE p = 1 AND p = 2" should match nothing (not the first, or last, value being tested..), and "WHERE p = 1 AND p = 1" should match the matches of p = 1. This patch adds a test to verify that these requests indeed yield correct results. The test is scylla_only because, as explained above, Cassandra doesn't support this feature at all. Refs #12472 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12473	2023-01-09 11:56:39 +02:00
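The expected semantics of repeated equalities on one column amount to intersecting the restrictions. A minimal sketch of that rule, assuming integer values; `merge_equalities()` is a hypothetical helper and not how Scylla's restriction code is written.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Combine all "p = v" relations on one column. The result is the single
// value every relation agrees on, or nothing if two relations contradict
// each other (in which case the query must match no rows).
std::optional<int> merge_equalities(const std::vector<int>& values) {
    assert(!values.empty());  // at least one relation is required
    for (int v : values) {
        if (v != values.front()) {
            return std::nullopt;
        }
    }
    return values.front();
}

// Usage: "p = 1 AND p = 2" matches nothing, "p = 1 AND p = 1" behaves like "p = 1".
void equality_demo() {
    assert(!merge_equalities({1, 2}).has_value());
    assert(merge_equalities({1, 1}) == 1);
}
```

Picking the first or the last value instead of the intersection is exactly the wrong behavior the test guards against.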
Anna Stuchlik	b61515c871	doc: replace Scylla with ScyllaDB on the menu tree and major links; related: https://github.com/scylladb/scylla-docs/issues/3962 Closes #12456	2023-01-09 08:39:50 +02:00
Avi Kivity	42575340ba	Update seastar submodule * seastar ca586cfb8d...8889cbc198 (14): > http: request_parser: fix grammar ambiguity in field_content Fixes #12468 > sstring: use fold expression to simply copy_str_to() > sstring: use fold expression to simply str_len() > metrics: capture by move in make_function() > metrics: replace homebrew is_callable<> with is_invocable_v<> > reactor: use std::move() to avoid copy. > reactor: remove redundant semicolon. > reactor: use mutable to make std::move() work. > build: install liburing explicitly on ArchLinux. > reactor: use a for loop for submitting ios > metrics: add spaces around '=' > parallel utils: align concept with implementation > reactor: s/resize(0)/clear()/ > reactor: fix a typo in comment Closes #12469	2023-01-08 18:56:00 +02:00
Alejo Sanchez	d632e1aa7a	test/pytest: add missing import, remove unused import Add missed import time and remove unused name import. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #12446	2023-01-08 17:38:46 +02:00
Avi Kivity	5ffe4fee6d	Merge 'Remove legacy half reverse' from Michał Radwański This commit removes consume_in_reverse::legacy_half_reverse, an option once used to indicate that the given key ranges are sorted descending, based on the clustering key of the start of the range, and that the range tombstones inside partition would be sorted (descending, as all the mutation fragments would) according to their end (but range tombstone would still be stored according to their start bound). As it turns out, mutation::consume, when called with legacy_half_reverse option produces invalid fragment stream, one where all the row tombstone changes come after all the clustering rows. This was not an issue, since when constructing results from the query, Scylla would not pass the tombstones to the client, but instead compact data beforehand. In this commit, the consume_in_reverse::legacy_half_reverse is removed, along with all the uses. As for the swap out in mutation_partition.cc in query_mutation and to_data_query_result: The downstream was not prepared to deal with legacy_half_reverse. mutation::consume contains ``` if (reverse == consume_in_reverse::yes) { while (!(stop_opt = consume_clustering_fragments<consume_in_reverse::yes>(_ptr->_schema, partition, consumer, cookie, is_preemptible::yes))) { co_await yield(); } } else { while (!(stop_opt = consume_clustering_fragments<consume_in_reverse::no>(_ptr->_schema, partition, consumer, cookie, is_preemptible::yes))) { co_await yield(); } } ``` So why did it work at all? to_data_query_result deals with a single slice. The used consumer (compact_for_query_v2) compacts-away the range tombstone changes, and thus the only difference between the consume_in_reverse::no and consume_in_reverse::yes was that one was ordered increasing wrt. ckeys and the second one was ordered decreasing. This property is maintained if we swap out for the consume_in_reverse::yes format. 
Refs: #12353 Closes #12453 * github.com:scylladb/scylladb: mutation{,_consumer,_partition}: remove consume_in_reverse::legacy_half_reverse mutation_partition_view: treat query::partition_slice::option::reversed in to_data_query_result as consume_in_reverse::yes mutation: move consume_in_reverse def to mutation_consumer.hh	2023-01-08 15:42:00 +02:00
Botond Dénes	c4688563e3	sstables: track decompressed buffers Convert decompressed temporary buffers into tracked buffers just before returning them to the upper layer. This ensures these buffers are known to the reader concurrency semaphore and it has an accurate view of the actual memory consumption of reads. Fixes: #12448 Closes #12454	2023-01-08 15:34:28 +02:00
Kamil Braun	b77df84543	test: test_topology: make test_nodes_with_different_smp less hacky The test would use a trick to start a separate Scylla cluster from the one provided originally by the test framework. This is not supported by the test framework and may cause unexpected problems. Change the test to perform regular node operations. Instead of starting a fresh cluster of 3 nodes, we join the first of these nodes to the original framework-provided cluster, then decommission the original nodes, then bootstrap the other 2 fresh nodes. Also add some logging to the test. Refs: #12438, #12442 Closes #12457	2023-01-08 15:33:17 +02:00
Avi Kivity	02c9968e73	Merge 'Add WASM UDF implementation in Rust' from Wojciech Mitros This series adds the implementation and usage of rust wasmtime bindings. The WASM UDFs introduced by this patch are interruptable and use memory allocated using the seastar allocator. This series includes #11102 (the first two commits) because #11102 required disabling wasm UDFs completely. This patch disables them in the middle of the series, and enables them again at the end. After this patch, `libwasmtime.a` can be removed from the toolchain. This patch also removes the workaround for #https://github.com/scylladb/scylladb/issues/9387 but it hasn't been tested with ARM yet - if the ARM test causes issues I'll revert this part of the change. Closes #11351 * github.com:scylladb/scylladb: build: remove references to unused c bindings of wasmtime test: assert that WASM allocations can fail without crashing wasm: limit memory allocated using mmap wasm: add configuration options for instance cache and udf execution test: check that wasmtime functions yield wasm: use the new rust bindings of wasmtime rust: add Wasmtime bindings rust: add build profiles more aligned with ninja modes rust: adjust build according to cxxbridge's recommendations tools: toolchain: dbuild: prepare for sharing cargo cache	2023-01-08 15:31:09 +02:00
Nadav Har'El	f5cda3cfc3	test/cql-pytest: add more tests for "timestamp" column type In issue #3668, a discussion spanning several years theorized that several things are wrong with the "timestamp" type. This patch begins by adding several tests that demonstrate that Scylla is in fact behaving correctly, and mostly identically to Cassandra except one esoteric error handling case. However, after eliminating the red herrings, we are left with the real issue that prompted opening #3668, which is a duplicate of issues #2693 and #2694, and this patch also adds a reproducer for that. The issue is that Cassandra 4 added support for arithmetic expressions on values, and durations can be added to timestamps, for example: '2011-02-03 04:05:12.345+0000' - 1d is a valid timestamp - and we don't currently support this syntax. So the new test - which passes on Cassandra 4 and fails on Scylla (or Cassandra 3) is marked xfail. Refs #2693 Refs #2694 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12436	2023-01-08 15:00:49 +02:00
Michał Chojnowski	08b3a9c786	configure: don't reduce parsers' optimization level to 1 in release The line modified in this patch was supposed to increase the optimization levels of parsers in debug mode to 1, because they were too slow otherwise. But as a side effect, it also reduced the optimization level in release mode to 1. This is not a problem for the CQL frontend, because statement preparation is not performance-sensitive, but it is a serious performance problem for Alternator, where it lies in the hot path. Fix this by only applying the -O1 to debug modes. Fixes #12463 Closes #12460	2023-01-06 18:04:36 +02:00
Wojciech Mitros	903c4874d0	build: remove references to unused c bindings of wasmtime Before the changes introducing the new wasmtime bindings we relied on a downloaded static library libwasmtime.a. Now that the bindings are introduced, we do not rely on it anymore, so all references to it can be removed.	2023-01-06 14:07:29 +01:00
Wojciech Mitros	996a942e05	test: assert that WASM allocations can fail without crashing The main source of big allocations in the WASM UDF implementation is the WASM Linear Memory. We do not want Scylla to crash even if a memory allocation for the WASM Memory fails, so we assert that an exception is thrown instead. The wasmtime runtime does not actually fail on an allocation failure (assuming the memory allocator does not abort and returns nullptr instead - which our seastar allocator does). What happens then depends on the failed allocation handling of the code that was compiled to WASM. If the original code threw an exception or aborted, the resulting WASM code will trap. To make sure that we can handle the trap, we need to allow wasmtime to handle SIGILL signals, because that is what is used to carry information about WASM traps. The new test uses a special WASM Memory allocator that fails after n allocations, and the allocations include both memory growth instructions in WASM, as well as growing memory manually using the wasmtime API. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2023-01-06 14:07:29 +01:00
Wojciech Mitros	f05d612da8	wasm: limit memory allocated using mmap The wasmtime runtime allocates memory for the executable code of the WASM programs using mmap and not the seastar allocator. As a result, the memory that Scylla actually uses becomes not only the memory preallocated for the seastar allocator but the sum of that and the memory allocated for executable codes by the WASM runtime. To keep limiting the memory used by Scylla, we measure how much memory the WASM programs use and if they use too much, compiled WASM UDFs (modules) that are currently not in use are evicted to make room. To evict a module it is required to evict all instances of this module (the underlying implementation of modules and instances uses shared pointers to the executable code). For this reason, we add reference counts to modules. Each instance using a module is a reference. When an instance is destroyed, a reference is removed. If all references to a module are removed, the executable code for this module is deallocated. The eviction of a module is actually achieved by eviction of all its references. When we want to free memory for a new module we repeatedly evict instances from the wasm_instance_cache using its LRU strategy until some module loses all its instances. This process may not succeed if the instances currently in use (so not in the cache) use too much memory - in this case the query also fails. Otherwise the new module is added to the tracking system. This strategy may evict some instances unnecessarily, but evicting modules should not happen frequently, and any more efficient solution requires an even bigger intervention into the code.	2023-01-06 14:07:29 +01:00
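The eviction strategy above (instances as references to a module, LRU eviction until a module loses its last reference) can be sketched with standard shared pointers. All names here (`tracker`, `module_code`, `instance_cache`, `make_room()`) are hypothetical; the real implementation lives in the Rust bindings and wasm_instance_cache and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>

// Running total of memory held by compiled module code.
struct tracker {
    std::size_t used = 0;
};

// Executable code of one compiled module. Its size is accounted for while
// at least one reference exists and is subtracted when the last one goes.
struct module_code {
    tracker& mem;
    std::size_t size;
    module_code(tracker& m, std::size_t s) : mem(m), size(s) { mem.used += size; }
    ~module_code() { mem.used -= size; }
    module_code(const module_code&) = delete;
};

// Each cached instance is one reference to its module's code.
using instance = std::shared_ptr<module_code>;

struct instance_cache {
    std::list<instance> lru;  // front = least recently used

    // Evict idle instances until `needed` more bytes fit under `limit`,
    // or until nothing is left to evict. Returns whether there is room.
    bool make_room(tracker& mem, std::size_t needed, std::size_t limit) {
        while (mem.used + needed > limit && !lru.empty()) {
            lru.pop_front();
        }
        return mem.used + needed <= limit;
    }
};
```

As the commit notes, this may evict more instances than strictly necessary: dropping one instance frees nothing until its module's last reference is gone.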
Wojciech Mitros	b8d28a95bf	wasm: add configuration options for instance cache and udf execution Different users may require different limits for their UDFs. This patch allows them to configure the size of their cache of wasm, the maximum size of individual instances stored in the cache, the time after which the instances are evicted, the fuel that all wasm UDFs are allowed to consume before yielding (for the control of latency), the fuel that wasm UDFs are allowed to consume in total (to allow performing longer computations in the UDF without detecting an infinite loop) and the hard limit of the size of UDFs that are executed (to avoid large allocations)	2023-01-06 14:07:27 +01:00
Wojciech Mitros	3214f5c2db	test: check that wasmtime functions yield The new implementation for WASM UDFs allows executing the UDFs in pieces. This commit adds a test asserting that the UDF is in fact divided and that each of the execution segments takes no longer than 1ms.	2023-01-06 14:05:53 +01:00
Wojciech Mitros	3146807192	wasm: use the new rust bindings of wasmtime This patch replaces all dependencies on the wasmtime C++ bindings with our new ones. The wasmtime.hh and wasm_engine.hh files are deleted. The libwasmtime.a library is no longer required by configure.py. The SCYLLA_ENABLE_WASMTIME macro is removed and wasm udfs are now compiled by default on all architectures. In terms of implementation, most of code using wasmtime was moved to the Rust source files. The remaining code uses names from the new bindings (which are mostly unchanged). Most of wasmtime objects are now stored as a rust::Box<>, to make it compatible with rust lifetime requirements. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2023-01-06 14:05:53 +01:00
Wojciech Mitros	50b24cf036	rust: add Wasmtime bindings The C++ bindings provided by wasmtime are lacking a crucial capability: asynchronous execution of the wasm functions. This forces us to stop the execution of the function after a short time to prevent increasing the latency. Fortunately, this feature is implemented in the native language of Wasmtime - Rust. Support for Rust was recently added to scylla, so we can implement the async bindings ourselves, which is done in this patch. The bindings expose all the objects necessary for creating and calling wasm functions. The majority of code implemented in Rust is a translation of code that was previously present in C++. Types exported from Rust are currently required to be defined by the same crate that contains the bridge using them, so wasmtime types can't be exported directly. Instead, for each class that was supposed to be exported, a wrapper type is created, where its first member is the wasmtime class. Note that the members are not visible from C++ anyway, the difference only applies to Rust code. Aside from wasmtime types and methods, two additional types are exported with some associated methods. - The first one is ValVec, which is a wrapper for a rust Vec of wasmtime Vals. The underlying vector is required by wasmtime methods for calling wasm functions. By having it exported we avoid multiple conversions from a Val wrapper to a wasmtime Val, as would be required if we exported a rust Vec of Val wrappers (the rust Vec itself does not require wrappers if the type it contains is already wrapped) - The second one is Fut. This class represents a computation that may or may not be ready. We're currently using it to control the execution of wasm functions from C++. This class exposes one method: resume(), which returns a bool that signals whether the computation is finished or not. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2023-01-06 14:05:53 +01:00
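The way C++ code might drive an object with a `resume()` method can be sketched as a simple loop. This assumes that `resume()` returns true once the computation has finished, which the commit message does not spell out; `countdown_fut` and `run_to_completion()` are hypothetical and do not use the actual bindings.

```cpp
#include <cassert>

// Hypothetical stand-in for the exported Fut: a computation that needs a
// fixed number of resume() calls before it is done.
class countdown_fut {
    int remaining_;
public:
    explicit countdown_fut(int steps) : remaining_(steps) {}

    // Run one slice of work. Assumed convention: true means finished.
    bool resume() {
        if (remaining_ > 0) {
            --remaining_;
        }
        return remaining_ == 0;
    }
};

// Drive a computation to completion, giving control back to the caller's
// scheduler (the `yield` callback) between slices. Returns the slice count.
template <typename Fut, typename Yield>
int run_to_completion(Fut& fut, Yield yield) {
    int slices = 1;
    while (!fut.resume()) {
        yield();
        ++slices;
    }
    return slices;
}

// Usage: a three-slice computation yields twice before finishing.
void resume_demo() {
    countdown_fut fut(3);
    int yields = 0;
    int slices = run_to_completion(fut, [&yields] { ++yields; });
    assert(slices == 3 && yields == 2);
}
```

In Scylla the yield point would hand control back to the reactor; here it is an arbitrary callback so the sketch stays self-contained.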
Wojciech Mitros	33c97de25c	rust: add build profiles more aligned with ninja modes A cargo profile is created for each of the build modes: dev, debug, sanitize, release and coverage. The names of cargo profiles are prefixed by "rust-" because cargo does not allow separate "dev" and "debug" profiles. The main difference between profiles are their optimization levels, they correlate to the levels used in configure.py. The debug info is stripped only in the dev mode, and only this mode uses "incremental" compilation to speed it up.	2023-01-06 14:05:53 +01:00
Wojciech Mitros	4d7858e66d	rust: adjust build according to cxxbridge's recommendations Currently, the rust build system in Scylla creates a separate static library for each included rust package. This could cause duplicate symbol issues when linking against multiple libraries compiled from rust. This issue is fixed in this patch by creating a single static library to link against, which combines all rust packages implemented in Scylla. The Cargo.lock for the combined build is now tracked, so that all users of the same scylla version also use the same versions of imported rust modules. Additionally, the rust package implementation and usage docs are modified to be compatible with the build changes. This patch also adds a new header file 'rust/cxx.hh' that contains definitions of additional rust types available in c++.	2023-01-06 14:05:53 +01:00

34550 Commits