scylladb

Author	SHA1	Message	Date
Ernest Zaslavsky	54aa552af7	treewide: Move type related files to a `type` directory As requested in #22110 , moved the files and fixed other includes and build system. Moved files: - duration.hh - duration.cc - concrete_types.hh Fixes: #22110 This is a cleanup, no need to backport Closes scylladb/scylladb#25088	2025-09-17 17:32:19 +03:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Kefu Chai	3e84d43f93	treewide: use seastar::format() or fmt::format() explicitly before this change, we rely on `using namespace seastar` to use `seastar::format()` without qualifying the `format()` with its namespace. this works fine until we changed the parameter type of format string `seastar::format()` from `const char*` to `fmt::format_string<...>`. this change practically invited `seastar::format()` to the club of `std::format()` and `fmt::format()`, where all members accept a templated parameter as its `fmt` parameter. and `seastar::format()` is not the best candidate anymore. despite that argument-dependent lookup (ADT for short) favors the function which is in the same namespace as its parameter, but `using namespace` makes `seastar::format()` more competitive, so both `std::format()` and `seastar::format()` are considered as the condidates. that is what is happening scylladb in quite a few caller sites of `format()`, hence ADT is not able to tell which function the winner in the name lookup: ``` /__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous 265 \| return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id()); \| ^~~~~~ /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 4290 \| format(format_string<_Args...> __fmt, _Args&&... __args) \| ^ /__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 143 \| format(fmt::format_string<A...> fmt, A&&... a) { \| ^ ``` in this change, we change all `format()` to either `fmt::format()` or `seastar::format()` with following rules: - if the caller expects an `sstring` or `std::string_view`, change to `seastar::format()` - if the caller expects an `std::string`, change to `fmt::format()`. because, `sstring::operator std::basic_string` would incur a deep copy. we will need another change to enable scylladb to compile with the latest seastar. namely, to pass the format string as a templated parameter down to helper functions which format their parameters. to miminize the scope of this change, let's include that change when bumping up the seastar submodule. as that change will depend on the seastar change. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-09-11 23:21:40 +03:00
Pavel Emelyanov	ceebbc5948	lang: Unfriend wasm context from manager The friendship was needed to get engine and instance cache from manager, but there's a shorter way to create cotnext with the info it needs. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-06-07 13:07:05 +03:00
Pavel Emelyanov	783ccc0a74	lang: Don't use db::config to create wasm context The managerr needs to get two "fuel" configurables from db::config in order to create context. Instead of carrying db config from callers, keep the options on existing lang::manager::config and use them. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-06-07 13:07:05 +03:00
Pavel Emelyanov	f277bd89f5	lang: Drop manager::precompile() method It's not helping much any longer. Manager can call wasm:: stuff directly with less code involved. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-06-07 13:07:05 +03:00
Pavel Emelyanov	fe7ff7172d	wasm: Replace startup_context with wasm_config The lang::manager starts with the help of a context because it needs to have std::shared_ptr<> pointg to cross-shard shared wasm engine and runner thread. For that a context is created in advance, that then helps sharing the engine and runner across manager instances. This patch removes the "context" and replaces it with classical manager::config. With it, it's lang::manager who's now responsible for initializing itself. In order to have cross-shard engine and thread pointers, the start() method uses invoke_on_others() facility to share the pointer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-06-07 12:35:57 +03:00
Pavel Emelyanov	f950469af5	lang: Move manager to lang namespace And, while at it, rename local variable to refer to it to as "manager" not "wasm". Query processor and database also have getters named "wasm()", these are not renamed yet to keep patch smaller (and those getters are going to be reworked further anyway). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-06-07 12:35:57 +03:00
Pavel Emelyanov	1dec79e97d	lang: Move wasm::manager to its .cc/.hh files It's going to become a facade in front of both -- wasm and lua, so keep it in files with language independent names. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-06-07 12:35:57 +03:00
Kefu Chai	75be212ab2	lang: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17193	2024-02-07 09:27:39 +02:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Pavel Emelyanov	595c5abbf9	wasm: Shuffle context::context() Add a constructor that builds context out of const manager reference. The existing one needs to get engine and instance cache and does it via query_processor. This change lets removing those exports and finally -- drop the wasm::manager -> cql3::query_processor friendship Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	93cb73fddb	wasm: Add manager::precompile() This is not to make query_processor export alien runner from the wasm::manager Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	d58a2d65b5	wasm: Move stop() out of query_processor When the q.p. stops it also "stops" the wasm manager. Move this call into main. The cql test env doesn't need this change, it stops the whole sharded service which stops instances on its own Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Pavel Emelyanov	243f2217dd	wasm: Make wasm sharded<manager> The wasm::manager is just cql3::wasm_context renamed. It now sits in lang/wasm* and is started as a sharded service in main (and cql test env). This move also needs some headers shuffling, but it's not severe This change is required to make it possible for the wasm::manager to be shared (by reference) between q.p. and replica::database further Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-04 19:47:50 +03:00
Avi Kivity	3e0aacc8b5	db, cql3: functions: pass function parameters as a span instead of a vector Spans are more flexible and can be constructed from any contiguous container (such as small_vector), or a subrange of such a container. This can save allocations, so change the signature to accept a span. Spans cannot be constructed from std::initializer_list, so one such call site is changed to use construct a span directly from the single argument.	2023-04-19 20:38:55 +03:00
Pavel Emelyanov	92318fdeae	Merge 'Initialize Wasm together with query_processor' from Wojciech Mitros The wasm engine is moved from replica::database to the query_processor. The wasm instance cache and compilation thread runner were already there, but now they're also initialized in the query_processor constructor. By moving the initialization to the constructor, we can now be certain that all wasm-related objects (wasm instance cache, compilation thread runner, and wasm engine, which was already passed in the constructor) are initialized when we try to use them because we have to use the query processor to access them anyway. The change is also motivated by the fact that we're planning to take Wasm UDFs out of experimental, after which they should stop getting special treatment. Closes #13311 * github.com:scylladb/scylladb: wasm: move wasm initialization to query_processor constructor wasm: return wasm instance cache as a reference instead of a pointer wasm: move wasm engine to query_processor	2023-03-30 14:30:23 +03:00
Wojciech Mitros	cfd2a4588d	wasm: move wasm initialization to query_processor constructor By moving the initialization to the constructor, we can now be certain that all wasm-related objects (wasm instance cache, compilation thread runner, and wasm engine, which was already passed in the constructor) are initialized when we try to use them because we have to use the query processor to access them anyway. The change is also motivated by the fact that we're planning to take Wasm UDFs out of experimental, after which they should stop getting special treatment.	2023-03-29 14:55:36 +02:00
Wojciech Mitros	c9b701b516	wasm: return wasm instance cache as a reference instead of a pointer In an incoming change, the wasm instance cache will be modified to be owned by the query_processor - it will hold an optional instead of a raw pointer to the cache, so we should stop returning the raw pointer from the getter as well. Consequently, the cache is also stored as a reference in wasm::cache, as it gets the reference from the query_processor. For consistency with the wasm engine and the wasm alien thread runner, the name of the getter is also modified to follow the same pattern.	2023-03-28 18:18:48 +02:00
Wojciech Mitros	f0aa540e00	cql: renice the wasm compilation alien thread The Wasm compilation is a slow, low priority task, so it should not compete with reactor threads or the networking core. To achieve that, we increase the niceness of the thread by 10. An alternative solution would be to set the priority using pthread_setschedparam, but it's not currently feasible, because as long as we're using the SCHED_OTHER policy for our threads, we cannot select any other priority than 0. Closes #13307	2023-03-26 18:38:23 +03:00
Wojciech Mitros	2fd6d495fa	wasm: move compilation to an alien thread The compilation of wasm UDFs is performed by a call to a foreign function, which cannot be divided with yielding points and, as a result, causes long reactor stalls for big UDFs. We avoid them by submitting the compilation task to a non-seastar std::thread, and retrieving the result using seastar::alien. The thread is created at the start of the program. It executes tasks from a queue in an infinite loop. All seastar shards reference the thread through a std::shared_ptr to a `alien_thread_runner`. Considering that the compilation takes a long time anyway, the alien_thread_runner is implemented with focus on simplicity more than on performance. The tasks are stored in an std::queue, reading and writing to it is synchronized using an std::mutex for reading/ writing to the queue, and an std::condition_variable waiting until the queue has elements. When the destructor of the alien runner is called, an std::nullopt sentinel is pushed to the queue, and after all remaining tasks are finished and the sentinel is read, the thread finishes.	2023-03-09 11:54:38 +01:00
Wojciech Mitros	4609a45ce3	wasm: convert compilation to a future After we move the compilation to a alien thread, the completion of the compilation will be signaled by fulfilling a seastar promise. As a result, the `precompile` function will return a future, and because of that, other functions that use the `precompile` functions will also become futures. We can do all the neccessary adjustments beforehand, so that the actual patch that moves the compilation will contain less irrelevant changes.	2023-03-07 14:27:38 +01:00
Wojciech Mitros	b25ee62f75	wasm udf: deserialize counters as integers Currently, because serialize_visitor::operator() is not implemented for counters, we cannot convert a counter returned by a WASM UDF to bytes when returning from wasm::run_script(). We could disallow using counters as WASM UDF return types, but an easier solution which we're already using in Lua UDFs is treating the returned counters as 64-bit integers when deserializing. This patch implements the latter approach and adds a test for it.	2023-02-13 14:24:20 +01:00
Wojciech Mitros	f05d612da8	wasm: limit memory allocated using mmap The wasmtime runtime allocates memory for the executable code of the WASM programs using mmap and not the seastar allocator. As a result, the memory that Scylla actually uses becomes not only the memory preallocated for the seastar allocator but the sum of that and the memory allocated for executable codes by the WASM runtime. To keep limiting the memory used by Scylla, we measure how much memory do the WASM programs use and if they use too much, compiled WASM UDFs (modules) that are currently not in use are evicted to make room. To evict a module it is required to evict all instances of this module (the underlying implementation of modules and instances uses shared pointers to the executable code). For this reason, we add reference counts to modules. Each instance using a module is a reference. When an instance is destroyed, a reference is removed. If all references to a module are removed, the executable code for this module is deallocated. The eviction of a module is actually acheved by eviction of all its references. When we want to free memory for a new module we repeatedly evict instances from the wasm_instance_cache using its LRU strategy until some module loses all its instances. This process may not succeed if the instances currently in use (so not in the cache) use too much memory - in this case the query also fails. Otherwise the new module is added to the tracking system. This strategy may evict some instances unnecessarily, but evicting modules should not happen frequently, and any more efficient solution requires an even bigger intervention into the code.	2023-01-06 14:07:29 +01:00
Wojciech Mitros	b8d28a95bf	wasm: add configuration options for instance cache and udf execution Different users may require different limits for their UDFs. This patch allows them to configure the size of their cache of wasm, the maximum size of indivitual instances stored in the cache, the time after which the instances are evicted, the fuel that all wasm UDFs are allowed to consume before yielding (for the control of latency), the fuel that wasm UDFs are allowed to consume in total (to allow performing longer computations in the UDF without detecting an infinite loop) and the hard limit of the size of UDFs that are executed (to avoid large allocations)	2023-01-06 14:07:27 +01:00
Wojciech Mitros	3146807192	wasm: use the new rust bindings of wasmtime This patch replaces all dependencies on the wasmtime C++ bindings with our new ones. The wasmtime.hh and wasm_engine.hh files are deleted. The libwasmtime.a library is no longer required by configure.py. The SCYLLA_ENABLE_WASMTIME macro is removed and wasm udfs are now compiled by default on all architectures. In terms of implementation, most of code using wasmtime was moved to the Rust source files. The remaining code uses names from the new bindings (which are mostly unchanged). Most of wasmtime objects are now stored as a rust::Box<>, to make it compatible with rust lifetime requirements. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2023-01-06 14:05:53 +01:00
Wojciech Mitros	9281ba3919	wasm: reuse UDF instances When executing a wasm UDF, most of the time is spent on setting up the instance. To minimize its cost, we reuse the instance using wasm::instance_cache. This patch adds a wasm instance cache, that stores a wasmtime instance for each UDF and scheduling group. The instances are evicted using LRU strategy. The cache may store some entries for the UDF after evicting the instance, but they are evicted when the corresponding UDF is dropped, which greatly limits their number. The size of stored instances is estimated using the size of their WASM memories. In order to be able to read the size of memory, we require that the memory is exported by the client. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2022-07-20 18:19:22 +02:00
Benny Halevy	acae3cc223	treewide: stop use of deprecated coroutine::make_exception Convert most use sites from `co_return coroutine::make_exception` to `co_await coroutine::return_exception{,_ptr}` where possible. In cases this is done in a catch clause, convert to `co_return coroutine::exception`, generating an exception_ptr if needed. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10972	2022-07-07 15:02:16 +03:00
Wojciech Mitros	4fd289a78c	wasm: fix freeing in wasm UDFs using WASI To call a UDF that is using WASI, we need to properly configure the wasmtime instance that it will be called on. The configuration was missing from udf_cache::load(), so we add it here. The free function does not return any value, so we should use a calling method that does not expect any returns. This patch adds such a method and uses it. A test that did not pass without this fix and does pass after is added. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com> Closes #10935	2022-07-01 07:57:45 +02:00
Wojciech Mitros	8a9d55d3a1	wasm: add wasm ABI version 2 Because the only available version of wasm ABI did not allow freeing any allocated memory, a new version of the ABI is introduced. In this version, the host is required to export _scylla_malloc and _scylla_free methods, which are later used for the memory management. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2022-03-30 20:49:35 +02:00
Wojciech Mitros	5b60cd1eab	wasm: add WASI handling One of the issues that comes with compiling programs to WebAssembly is the lack of a default implementation of a memory allocator. As a result, the only available solutions to the need of memory allocation are growing the wasm memory for each new allocated segment, or implementing one's own memory allocator. To avoid both of these approaches, for many languages, the user may compile a program to a WASI target. By doing so, the compiler adds default implementations of malloc and free methods, and the user can use them for dynamic memory management. This patch enables executing programs compiled with WASI by enabling it in the wasmtime runtime. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2022-03-30 19:44:36 +02:00
Wojciech Mitros	93b082e8d9	wasm: add _scylla_abi export for specifying abi for wasm udfs Different languages may require different ABIs for passing parameters, etc. This patch adds a requirement for all wasm UDFs to export an _scylla_abi symbol, that is an 32-bit integer with a value specifying the ABI version. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2022-03-30 19:37:11 +02:00
Wojciech Mitros	a7ee3ccf52	wasm: update ABI for passing parameters to wasm UDFs WebAssembly uses 32-bit address space, while also having 64-bit integers as it native types. As a result, when passing size of an object in memory and its address, it can be combined into one 64-bit value. As a bonus, if the object is null, we can signal it by passing -1 as its size. This patch implements handling of this new ABI and adjusts expamples in test_wasm.py. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2022-03-30 17:13:25 +02:00
Wojciech Mitros	7fd81e6dae	wasm: move common code to a separate function Both init_nullable_arg_visitor and, in case of abstract_type, init_arg_visitor were the same method with one difference. The common part was moved to init_abstract_arg, and the difference remained in the operator() method. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2022-03-30 17:13:22 +02:00
Wojciech Mitros	62761a7cf3	wasm: use wasm pages for wasm memory The memory.grow and memory.size wasm methods return the memory size in pages, and memory.size takes its argument in the number of pages. A WebAssembly page has a size of 64KiB, so during memory allocation we have to divide our desired size in bytes by page size and round up. Similarly, when reading memory size we need to multiply the result by 64KiB to get the size in bytes. The change affects current naive allocator for arguments when calling wasm UDFs and the examples in wasm_test.py - both commented code and compiled wasm in text representation. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2022-03-30 17:13:13 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Piotr Sarna	78afd518a8	wasm: add initial WebAssembly runtime implementation The engine is based on wasmtime and is able to: - compile wasm text format to bytecode - run a given compiled function with custom arguments This implementation is missing crucial features, like running on any other types than 32-bit integers. It serves as a skeleton for future full implementation.	2021-09-13 19:03:58 +02:00

37 Commits