Currently, because serialize_visitor::operator() is not implemented
for counters, we cannot convert a counter returned by a WASM UDF
to bytes when returning from wasm::run_script().
We could disallow using counters as WASM UDF return types, but an
easier solution which we're already using in Lua UDFs is treating
the returned counters as 64-bit integers when deserializing. This
patch implements the latter approach and adds a test for it.
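As a rough illustration of treating a counter as a 64-bit integer, the sketch below encodes a value as 8 big-endian bytes. The helper names are hypothetical and the big-endian layout is an assumption for this example; this is not the code added by the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: encode a counter value returned by a UDF the same
// way a 64-bit integer would be encoded (assumed here: 8 bytes, big-endian).
std::array<std::uint8_t, 8> serialize_counter_as_bigint(std::int64_t value) {
    std::array<std::uint8_t, 8> out{};
    auto bits = static_cast<std::uint64_t>(value);
    for (int i = 7; i >= 0; --i) {
        out[i] = static_cast<std::uint8_t>(bits & 0xff);
        bits >>= 8;
    }
    return out;
}

// Hypothetical inverse: read the 8 big-endian bytes back as a 64-bit integer.
std::int64_t deserialize_bigint(const std::array<std::uint8_t, 8>& bytes) {
    std::uint64_t bits = 0;
    for (std::uint8_t b : bytes) {
        bits = (bits << 8) | b;
    }
    return static_cast<std::int64_t>(bits);
}
```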
The wasmtime runtime allocates memory for the executable code of
the WASM programs using mmap and not the seastar allocator. As
a result, the memory that Scylla actually uses becomes not only
the memory preallocated for the seastar allocator but the sum of
that and the memory allocated for executable codes by the WASM
runtime.
To keep limiting the memory used by Scylla, we measure how much
memory the WASM programs use and, if they use too much, compiled
WASM UDFs (modules) that are currently not in use are evicted to
make room.
To evict a module it is required to evict all instances of this
module (the underlying implementation of modules and instances uses
shared pointers to the executable code). For this reason, we add
reference counts to modules. Each instance using a module is a
reference. When an instance is destroyed, a reference is removed.
If all references to a module are removed, the executable code
for this module is deallocated.
The eviction of a module is actually achieved by eviction of all
its references. When we want to free memory for a new module we
repeatedly evict instances from the wasm_instance_cache using its
LRU strategy until some module loses all its instances. This
process may not succeed if the instances currently in use (so not
in the cache) use too much memory - in this case the query also
fails. Otherwise the new module is added to the tracking system.
This strategy may evict some instances unnecessarily, but evicting
modules should not happen frequently, and any more efficient
solution requires an even bigger intervention into the code.
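The eviction strategy described above can be sketched roughly as follows. The class and method names are hypothetical, modules are identified by name only for brevity, and a new module is assumed to be instantiated right after it is added; the real tracking code may be organized quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical tracker for the executable code of compiled modules.
class module_tracker {
    struct module_info {
        std::size_t code_size; // bytes of executable code
        std::size_t refs;      // number of live instances of this module
    };
    std::unordered_map<std::string, module_info> _modules;
    std::list<std::string> _lru_instances; // idle cached instances, front = least recently used
    std::size_t _used = 0;
    std::size_t _limit;
public:
    explicit module_tracker(std::size_t limit) : _limit(limit) {}

    std::size_t used() const { return _used; }
    bool has_module(const std::string& name) const { return _modules.count(name) > 0; }

    // Make room for a new module by evicting idle instances in LRU order.
    // Returns false when the instances that are in use hold too much memory.
    bool add_module(const std::string& name, std::size_t code_size) {
        assert(_modules.count(name) == 0);
        while (_used + code_size > _limit) {
            if (_lru_instances.empty()) {
                return false; // nothing left to evict
            }
            evict_lru_instance();
        }
        _modules[name] = module_info{code_size, 0};
        _used += code_size;
        return true;
    }

    // An idle instance: holds a reference and can be evicted from the cache.
    void cache_instance(const std::string& name) {
        auto it = _modules.find(name);
        assert(it != _modules.end());
        ++it->second.refs;
        _lru_instances.push_back(name);
    }

    // An instance that is in use: holds a reference but is not in the cache.
    void use_instance(const std::string& name) {
        auto it = _modules.find(name);
        assert(it != _modules.end());
        ++it->second.refs;
    }
private:
    void evict_lru_instance() {
        std::string name = _lru_instances.front();
        _lru_instances.pop_front();
        auto it = _modules.find(name);
        assert(it != _modules.end());
        if (--it->second.refs == 0) {
            // Last reference gone: the module's code is deallocated.
            _used -= it->second.code_size;
            _modules.erase(it);
        }
    }
};
```

In this sketch, adding a module fails only when the cache of idle instances is empty and the limit is still exceeded, which corresponds to the case where the query fails.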
Different users may require different limits for their UDFs. This
patch allows them to configure the size of their wasm instance cache,
the maximum size of individual instances stored in the cache, the
time after which the instances are evicted, the fuel that all wasm
UDFs are allowed to consume before yielding (for the control of
latency), the fuel that wasm UDFs are allowed to consume in total
(to allow performing longer computations in the UDF without
detecting an infinite loop) and the hard limit of the size of UDFs
that are executed (to avoid large allocations).
This patch replaces all dependencies on the wasmtime
C++ bindings with our new ones.
The wasmtime.hh and wasm_engine.hh files are deleted.
The libwasmtime.a library is no longer required by
configure.py. The SCYLLA_ENABLE_WASMTIME macro is
removed and wasm udfs are now compiled by default
on all architectures.
In terms of implementation, most of the code using
wasmtime was moved to the Rust source files. The
remaining code uses names from the new bindings
(which are mostly unchanged). Most wasmtime objects
are now stored as a rust::Box<>, to make them compatible
with Rust lifetime requirements.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
When executing a wasm UDF, most of the time is spent on
setting up the instance. To minimize its cost, we reuse
the instance using wasm::instance_cache.
This patch adds a wasm instance cache that stores
a wasmtime instance for each UDF and scheduling group.
The instances are evicted using an LRU strategy. The
cache may store some entries for the UDF after evicting
the instance, but they are evicted when the corresponding
UDF is dropped, which greatly limits their number.
The size of stored instances is estimated using the size
of their WASM memories. In order to be able to read the
size of memory, we require that the memory is exported
by the client.
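A minimal sketch of such a cache is shown below, keyed by UDF name and scheduling group and bounded by the estimated memory size of the stored instances. All names are hypothetical, the scheduling group is reduced to an integer id, and, unlike the cache described above, this sketch drops the whole entry when an instance is evicted.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Stand-in for a wasmtime instance; only its WASM memory size is modeled.
struct cached_instance {
    std::size_t memory_bytes;
};

class instance_cache {
    using key = std::pair<std::string, int>; // (UDF name, scheduling group id)
    struct entry {
        key k;
        cached_instance inst;
    };
    std::list<entry> _lru; // front = least recently used
    std::map<key, std::list<entry>::iterator> _index;
    std::size_t _total = 0;
    std::size_t _max_total;
public:
    explicit instance_cache(std::size_t max_total) : _max_total(max_total) {}

    std::size_t total_size() const { return _total; }

    // Take the cached instance for reuse; it leaves the cache while in use.
    std::optional<cached_instance> take(const std::string& udf, int group) {
        key k{udf, group};
        auto it = _index.find(k);
        if (it == _index.end()) {
            return std::nullopt;
        }
        cached_instance inst = it->second->inst;
        _total -= inst.memory_bytes;
        _lru.erase(it->second);
        _index.erase(it);
        return inst;
    }

    // Return an instance to the cache, evicting in LRU order until it fits.
    void put(const std::string& udf, int group, cached_instance inst) {
        if (inst.memory_bytes > _max_total) {
            return; // too large to be cached at all
        }
        take(udf, group); // drop a stale instance stored under the same key
        while (_total + inst.memory_bytes > _max_total) {
            entry& victim = _lru.front();
            _total -= victim.inst.memory_bytes;
            _index.erase(victim.k);
            _lru.pop_front();
        }
        key k{udf, group};
        _lru.push_back(entry{k, inst});
        _index[k] = std::prev(_lru.end());
        _total += inst.memory_bytes;
    }
};
```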
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Convert most use sites from `co_return coroutine::make_exception`
to `co_await coroutine::return_exception{,_ptr}` where possible.
In cases where this is done in a catch clause, convert to
`co_return coroutine::exception`, generating an exception_ptr
if needed.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #10972
To call a UDF that is using WASI, we need to properly
configure the wasmtime instance that it will be called
on. The configuration was missing from udf_cache::load(),
so we add it here.
The free function does not return any value, so we should use
a calling method that does not expect any returns.
This patch adds such a method and uses it.
A test that did not pass without this fix and does pass after
is added.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Closes #10935
Because the only available version of wasm ABI did not allow
freeing any allocated memory, a new version of the ABI is
introduced. In this version, the UDF module is required to export
_scylla_malloc and _scylla_free methods, which are later used
for the memory management.
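On the UDF side, the two exports could look roughly like the sketch below, assuming a C++ UDF built for a target that provides malloc and free. Only the names _scylla_malloc and _scylla_free come from this patch; the exact signatures shown here are an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Assumed signature: takes the requested size and returns the address of
// the allocated region inside the module's linear memory.
extern "C" void* _scylla_malloc(std::size_t size) {
    return std::malloc(size);
}

// Assumed signature: takes an address previously returned by
// _scylla_malloc and returns nothing.
extern "C" void _scylla_free(void* ptr) {
    std::free(ptr);
}
```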
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
One of the issues that comes with compiling programs to WebAssembly
is the lack of a default implementation of a memory allocator. As
a result, the only ways to allocate memory are
growing the wasm memory for each newly allocated segment, or
implementing one's own memory allocator. To avoid both of these
approaches, for many languages, the user may compile a program to
a WASI target. By doing so, the compiler adds default implementations
of malloc and free methods, and the user can use them for dynamic
memory management.
This patch enables executing programs compiled with WASI by enabling
it in the wasmtime runtime.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Different languages may require different ABIs for passing
parameters, etc. This patch adds a requirement for all wasm
UDFs to export a _scylla_abi symbol that is a 32-bit integer
with a value specifying the ABI version.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
WebAssembly uses 32-bit address space, while also
having 64-bit integers as its native types. As a result,
when passing size of an object in memory and its address,
it can be combined into one 64-bit value. As a bonus,
if the object is null, we can signal it by passing -1 as
its size.
This patch implements handling of this new ABI and adjusts
examples in test_wasm.py.
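The packing can be sketched as below. The layout (size in the upper 32 bits, address in the lower 32 bits) and all names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical description of an object in WASM memory.
struct wasm_span {
    std::uint32_t address;
    std::uint32_t size;
};

// A size of -1 (all bits set in 32 bits) marks a null object.
constexpr std::uint32_t null_size = 0xffffffffu;

// Combine size and address into one 64-bit value.
// Assumed layout: size in the upper half, address in the lower half.
std::int64_t pack(std::optional<wasm_span> s) {
    if (!s) {
        return static_cast<std::int64_t>(std::uint64_t{null_size} << 32);
    }
    assert(s->size != null_size); // reserved for null
    return static_cast<std::int64_t>((std::uint64_t{s->size} << 32) | s->address);
}

// Split a packed value back into size and address; null yields nullopt.
std::optional<wasm_span> unpack(std::int64_t packed) {
    auto bits = static_cast<std::uint64_t>(packed);
    auto size = static_cast<std::uint32_t>(bits >> 32);
    if (size == null_size) {
        return std::nullopt;
    }
    return wasm_span{static_cast<std::uint32_t>(bits & 0xffffffffu), size};
}
```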
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Both init_nullable_arg_visitor and, in case
of abstract_type, init_arg_visitor were
the same method with one difference. The
common part was moved to init_abstract_arg,
and the difference remained in the operator()
method.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
The memory.grow and memory.size wasm methods return
the memory size in pages, and memory.grow takes its
argument in the number of pages. A WebAssembly page
has a size of 64KiB, so during memory allocation
we have to divide our desired size in bytes by page
size and round up. Similarly, when reading memory
size we need to multiply the result by 64KiB to
get the size in bytes.
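The two conversions amount to the following. The function names are hypothetical and only illustrate the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// A WebAssembly page is 64KiB.
constexpr std::uint64_t wasm_page_size = 64 * 1024;

// Number of pages needed to hold the given number of bytes (rounds up).
constexpr std::uint64_t bytes_to_pages(std::uint64_t bytes) {
    return (bytes + wasm_page_size - 1) / wasm_page_size;
}

// Size in bytes of the given number of pages.
constexpr std::uint64_t pages_to_bytes(std::uint64_t pages) {
    return pages * wasm_page_size;
}
```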
The change affects the current naive allocator for
arguments when calling wasm UDFs and the examples
in wasm_test.py - both commented code and compiled
wasm in text representation.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes were applied mechanically with a script, except to
licenses/README.md.
Closes #9937
The engine is based on wasmtime and is able to:
- compile wasm text format to bytecode
- run a given compiled function with custom arguments
This implementation is missing crucial features, like support
for any types other than 32-bit integers. It serves as a skeleton
for a future full implementation.