Commit Graph

24 Commits

Author SHA1 Message Date
Pavel Emelyanov
bebd121936 code: Enlighten wasm headers usage
Now when function context creation is encapsulated in lang::manager,
some .cc files can stop using wasm-specific headers and just go with the
lang/manager.hh one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 13:07:05 +03:00
Pavel Emelyanov
ceebbc5948 lang: Unfriend wasm context from manager
The friendship was needed to get engine and instance cache from manager,
but there's a shorter way to create context with the info it needs.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 13:07:05 +03:00
Pavel Emelyanov
783ccc0a74 lang: Don't use db::config to create wasm context
The manager needs to get two "fuel" configurables from db::config in
order to create context. Instead of carrying db config from callers,
keep the options on existing lang::manager::config and use them.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 13:07:05 +03:00
Pavel Emelyanov
fe7ff7172d wasm: Replace startup_context with wasm_config
The lang::manager starts with the help of a context because it needs to
have std::shared_ptr<> pointing to cross-shard shared wasm engine and
runner thread. For that a context is created in advance, that then helps
sharing the engine and runner across manager instances.

This patch removes the "context" and replaces it with classical
manager::config. With it, it's lang::manager who's now responsible for
initializing itself.

In order to have cross-shard engine and thread pointers, the start()
method uses invoke_on_others() facility to share the pointer.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 12:35:57 +03:00
Pavel Emelyanov
f950469af5 lang: Move manager to lang namespace
And, while at it, rename the local variable referring to it to "manager",
not "wasm". Query processor and database also have getters named
"wasm()", these are not renamed yet to keep patch smaller (and those
getters are going to be reworked further anyway).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 12:35:57 +03:00
Pavel Emelyanov
1dec79e97d lang: Move wasm::manager to its .cc/.hh files
It's going to become a facade in front of both -- wasm and lua, so keep
it in files with language independent names.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-06-07 12:35:57 +03:00
Pavel Emelyanov
595c5abbf9 wasm: Shuffle context::context()
Add a constructor that builds context out of const manager reference.
The existing one needs to get engine and instance cache and does it via
query_processor. This change makes it possible to remove those exports and
finally drop the wasm::manager -> cql3::query_processor friendship.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-04 19:47:50 +03:00
Pavel Emelyanov
56404ee053 wasm: Add manager::remove()
This is one of the users of query_processor's export of wasm::manager's
instance cache. Remove it in advance

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-04 19:47:50 +03:00
Pavel Emelyanov
93cb73fddb wasm: Add manager::precompile()
This is so that query_processor does not have to export the alien runner
from the wasm::manager.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-04 19:47:50 +03:00
Pavel Emelyanov
d58a2d65b5 wasm: Move stop() out of query_processor
When the q.p. stops it also "stops" the wasm manager. Move this call
into main. The cql test env doesn't need this change, it stops the whole
sharded service which stops instances on its own

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-04 19:47:50 +03:00
Pavel Emelyanov
243f2217dd wasm: Make wasm sharded<manager>
The wasm::manager is just cql3::wasm_context renamed. It now sits in
lang/wasm* and is started as a sharded service in main (and cql test
env). This move also needs some headers shuffling, but it's not severe

This change is required to make it possible for the wasm::manager to be
shared (by reference) between q.p. and replica::database further

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-04 19:47:50 +03:00
Avi Kivity
3e0aacc8b5 db, cql3: functions: pass function parameters as a span instead of a vector
Spans are more flexible and can be constructed from any contiguous
container (such as small_vector), or a subrange of such a container.
This can save allocations, so change the signature to accept a span.

Spans cannot be constructed from std::initializer_list, so one such
call site is changed to construct a span directly from the single
argument.
2023-04-19 20:38:55 +03:00
Wojciech Mitros
cfd2a4588d wasm: move wasm initialization to query_processor constructor
By moving the initialization to the constructor, we can now
be certain that all wasm-related objects (wasm instance cache,
compilation thread runner, and wasm engine, which was already
passed in the constructor) are initialized when we try to use
them because we have to use the query processor to access them
anyway.

The change is also motivated by the fact that we're planning
to take Wasm UDFs out of experimental, after which they should
stop getting special treatment.
2023-03-29 14:55:36 +02:00
Wojciech Mitros
c9b701b516 wasm: return wasm instance cache as a reference instead of a pointer
In an upcoming change, the wasm instance cache will be modified to be owned
by the query_processor - it will hold an optional instead of a raw
pointer to the cache, so we should stop returning the raw pointer
from the getter as well.
Consequently, the cache is also stored as a reference in wasm::cache,
as it gets the reference from the query_processor.
For consistency with the wasm engine and the wasm alien thread runner,
the name of the getter is also modified to follow the same pattern.
2023-03-28 18:18:48 +02:00
Wojciech Mitros
2fd6d495fa wasm: move compilation to an alien thread
The compilation of wasm UDFs is performed by a call to a foreign
function, which cannot be divided with yielding points and, as a
result, causes long reactor stalls for big UDFs.
We avoid them by submitting the compilation task to a non-seastar
std::thread, and retrieving the result using seastar::alien.

The thread is created at the start of the program. It executes
tasks from a queue in an infinite loop.

All seastar shards reference the thread through a std::shared_ptr
to an `alien_thread_runner`.

Considering that the compilation takes a long time anyway, the
alien_thread_runner is implemented with focus on simplicity more
than on performance. The tasks are stored in an std::queue; reading
from and writing to it is synchronized using an std::mutex, and an
std::condition_variable is used to wait until the queue has elements.

When the destructor of the alien runner is called, an std::nullopt
sentinel is pushed to the queue, and after all remaining tasks are
finished and the sentinel is read, the thread finishes.
2023-03-09 11:54:38 +01:00
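
The queue, mutex, condition variable and sentinel described in this message can be sketched roughly as follows. This is an illustration using only the standard library, not the Scylla class: the name `thread_runner_sketch` and its `submit()` method are assumptions, and the step that hands results back to a reactor thread through seastar::alien is left out.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for the runner described above.
class thread_runner_sketch {
    using task = std::function<void()>;
    std::queue<std::optional<task>> _queue;  // std::nullopt is the stop sentinel
    std::mutex _mutex;
    std::condition_variable _cv;
    std::thread _thread;  // declared last so the queue exists before the thread starts

    void push(std::optional<task> t) {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _queue.push(std::move(t));
        }
        _cv.notify_one();
    }

    void run() {
        for (;;) {
            std::optional<task> next;
            {
                std::unique_lock<std::mutex> lock(_mutex);
                _cv.wait(lock, [this] { return !_queue.empty(); });
                next = std::move(_queue.front());
                _queue.pop();
            }
            if (!next) {
                return;  // sentinel read: every earlier task has already run
            }
            (*next)();
        }
    }

public:
    thread_runner_sketch() : _thread([this] { run(); }) {}

    // Push the sentinel and wait for the remaining tasks to finish.
    ~thread_runner_sketch() {
        push(std::nullopt);
        _thread.join();
    }

    thread_runner_sketch(const thread_runner_sketch&) = delete;
    thread_runner_sketch& operator=(const thread_runner_sketch&) = delete;

    void submit(task t) { push(std::move(t)); }
};
```

Because the sentinel is queued behind all pending tasks, destroying the runner drains the queue before the thread exits.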
Wojciech Mitros
4609a45ce3 wasm: convert compilation to a future
After we move the compilation to an alien thread, the completion
of the compilation will be signaled by fulfilling a seastar promise.
As a result, the `precompile` function will return a future, and
because of that, other functions that use the `precompile` function
will also return futures.
We can do all the necessary adjustments beforehand, so that the actual
patch that moves the compilation will contain less irrelevant changes.
2023-03-07 14:27:38 +01:00
Kefu Chai
df63e2ba27 types: move types.{cc,hh} into types
they are part of the CQL type system, and are "closer" to types.
let's move them into "types" directory.

the building systems are updated accordingly.

the source files referencing `types.hh` were updated using the following
command:

```
find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} +
```

the source files under sstables include "types.hh", which is
indeed the one located under "sstables", so include "sstables/types.hh"
instead, so it's more explicit.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12926
2023-02-19 21:05:45 +02:00
Wojciech Mitros
f05d612da8 wasm: limit memory allocated using mmap
The wasmtime runtime allocates memory for the executable code of
the WASM programs using mmap and not the seastar allocator. As
a result, the memory that Scylla actually uses becomes not only
the memory preallocated for the seastar allocator but the sum of
that and the memory allocated for executable codes by the WASM
runtime.
To keep limiting the memory used by Scylla, we measure how much
memory the WASM programs use and, if they use too much, compiled
WASM UDFs (modules) that are currently not in use are evicted to
make room.
To evict a module it is required to evict all instances of this
module (the underlying implementation of modules and instances uses
shared pointers to the executable code). For this reason, we add
reference counts to modules. Each instance using a module is a
reference. When an instance is destroyed, a reference is removed.
If all references to a module are removed, the executable code
for this module is deallocated.
The eviction of a module is actually achieved by eviction of all
its references. When we want to free memory for a new module we
repeatedly evict instances from the wasm_instance_cache using its
LRU strategy until some module loses all its instances. This
process may not succeed if the instances currently in use (so not
in the cache) use too much memory - in this case the query also
fails. Otherwise the new module is added to the tracking system.
This strategy may evict some instances unnecessarily, but evicting
modules should not happen frequently, and any more efficient
solution requires an even bigger intervention into the code.
2023-01-06 14:07:29 +01:00
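
The reference-counting and eviction strategy described in this message can be sketched as below. All names (`module_tracker_sketch`, `instantiate`, `release_to_cache`) are hypothetical, and the sketch makes simplifying assumptions: modules are identified by a string, idle instances are kept in a plain list ordered from least to most recently used, and an instance that returned to the cache is never taken out again.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical tracker, not the Scylla implementation.
class module_tracker_sketch {
    struct module_info {
        std::size_t code_size;  // bytes of executable code held by the module
        std::size_t refs;       // number of instances referencing the module
    };
    std::size_t _limit;
    std::size_t _used = 0;
    std::unordered_map<std::string, module_info> _modules;
    std::list<std::string> _idle_lru;  // idle instances by module name, oldest first

    // Remove one reference; the code is freed when the last one goes away.
    void drop_ref(const std::string& name) {
        auto it = _modules.find(name);
        assert(it != _modules.end() && it->second.refs > 0);
        if (--it->second.refs == 0) {
            _used -= it->second.code_size;
            _modules.erase(it);
        }
    }

    // Evict idle instances in LRU order until the new module fits.
    bool make_room(std::size_t code_size) {
        while (_used + code_size > _limit) {
            if (_idle_lru.empty()) {
                return false;  // instances in use hold too much memory
            }
            std::string victim = std::move(_idle_lru.front());
            _idle_lru.pop_front();
            drop_ref(victim);
        }
        return true;
    }

public:
    explicit module_tracker_sketch(std::size_t limit) : _limit(limit) {}

    // Create an in-use instance, loading the module first when needed.
    bool instantiate(const std::string& name, std::size_t code_size) {
        auto it = _modules.find(name);
        if (it == _modules.end()) {
            if (!make_room(code_size)) {
                return false;
            }
            it = _modules.emplace(name, module_info{code_size, 0}).first;
            _used += code_size;
        }
        ++it->second.refs;
        return true;
    }

    // An instance that is no longer in use becomes evictable.
    void release_to_cache(const std::string& name) { _idle_lru.push_back(name); }

    std::size_t used() const { return _used; }
    bool loaded(const std::string& name) const { return _modules.count(name) != 0; }
};
```

As in the message, a failed `make_room()` may already have evicted some instances without freeing enough memory; the sketch accepts that cost for the sake of simplicity.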
Wojciech Mitros
b8d28a95bf wasm: add configuration options for instance cache and udf execution
Different users may require different limits for their UDFs. This
patch allows them to configure the size of their wasm instance cache,
the maximum size of individual instances stored in the cache, the
time after which the instances are evicted, the fuel that all wasm
UDFs are allowed to consume before yielding (for the control of
latency), the fuel that wasm UDFs are allowed to consume in total
(to allow performing longer computations in the UDF without
detecting an infinite loop) and the hard limit of the size of UDFs
that are executed (to avoid large allocations)
2023-01-06 14:07:27 +01:00
Wojciech Mitros
3146807192 wasm: use the new rust bindings of wasmtime
This patch replaces all dependencies on the wasmtime
C++ bindings with our new ones.
The wasmtime.hh and wasm_engine.hh files are deleted.
The libwasmtime.a library is no longer required by
configure.py. The SCYLLA_ENABLE_WASMTIME macro is
removed and wasm udfs are now compiled by default
on all architectures.
In terms of implementation, most of the code using
wasmtime was moved to the Rust source files. The
remaining code uses names from the new bindings
(which are mostly unchanged). Most wasmtime objects
are now stored as a rust::Box<>, to make it compatible
with rust lifetime requirements.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2023-01-06 14:05:53 +01:00
Wojciech Mitros
64c03a2d24 wasm: fix compilation without libwasmtime
Some segments of code using wasmtime were not under an
ifdef SCYLLA_ENABLE_WASMTIME, making Scylla unable to compile
on machines without wasmtime. This patch adds the ifdef where
needed.

Closes #11200
2022-08-03 18:16:02 +03:00
Wojciech Mitros
9281ba3919 wasm: reuse UDF instances
When executing a wasm UDF, most of the time is spent on
setting up the instance. To minimize its cost, we reuse
the instance using wasm::instance_cache.

This patch adds a wasm instance cache, that stores
a wasmtime instance for each UDF and scheduling group.
The instances are evicted using LRU strategy. The
cache may store some entries for the UDF after evicting
the instance, but they are evicted when the corresponding
UDF is dropped, which greatly limits their number.

The size of stored instances is estimated using the size
of their WASM memories. In order to be able to read the
size of memory, we require that the memory is exported
by the client.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-07-20 18:19:22 +02:00
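
A rough sketch of such a cache follows, under stated assumptions: the key is a (UDF name, scheduling group id) pair, an instance is reduced to its memory size estimate, and the names `instance_cache_sketch`, `get` and `remove` are hypothetical rather than the actual wasm::instance_cache interface.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical instance; the real cache stores wasmtime instances.
struct instance_sketch {
    std::size_t memory_size;  // estimate taken from the exported WASM memory
};

class instance_cache_sketch {
    using key = std::pair<std::string, int>;  // (UDF name, scheduling group id)
    struct entry {
        std::optional<instance_sketch> instance;
        std::list<key>::iterator lru_pos;  // meaningful only while instance is set
    };
    std::size_t _max_size;
    std::size_t _size = 0;
    std::map<key, entry> _entries;
    std::list<key> _lru;  // least recently used at the front

    void evict_lru() {
        entry& e = _entries.at(_lru.front());
        _size -= e.instance->memory_size;
        e.instance.reset();  // the now-empty entry stays until the UDF is dropped
        _lru.pop_front();
    }

public:
    explicit instance_cache_sketch(std::size_t max_size) : _max_size(max_size) {}

    // Return the cached instance for the key, creating it when missing.
    instance_sketch& get(const std::string& udf, int group, std::size_t memory_size) {
        key k{udf, group};
        entry& e = _entries[k];
        if (e.instance) {
            _lru.splice(_lru.end(), _lru, e.lru_pos);  // mark as most recently used
            return *e.instance;
        }
        while (!_lru.empty() && _size + memory_size > _max_size) {
            evict_lru();
        }
        e.instance = instance_sketch{memory_size};
        e.lru_pos = _lru.insert(_lru.end(), k);
        _size += memory_size;
        return *e.instance;
    }

    // Called when the UDF is dropped: remove all of its entries.
    void remove(const std::string& udf) {
        for (auto it = _entries.begin(); it != _entries.end();) {
            if (it->first.first != udf) {
                ++it;
                continue;
            }
            if (it->second.instance) {
                _size -= it->second.instance->memory_size;
                _lru.erase(it->second.lru_pos);
            }
            it = _entries.erase(it);
        }
    }

    std::size_t size() const { return _size; }
    std::size_t entry_count() const { return _entries.size(); }
    bool has_instance(const std::string& udf, int group) const {
        auto it = _entries.find(key{udf, group});
        return it != _entries.end() && it->second.instance.has_value();
    }
};
```

Keeping the empty entry after eviction mirrors the behaviour described above: the entry count is bounded by the number of live UDFs and scheduling groups, and entries disappear once the UDF is dropped.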
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Piotr Sarna
78afd518a8 wasm: add initial WebAssembly runtime implementation
The engine is based on wasmtime and is able to:
 - compile wasm text format to bytecode
 - run a given compiled function with custom arguments

This implementation is missing crucial features, like operating
on any types other than 32-bit integers. It serves as a skeleton
for future full implementation.
2021-09-13 19:03:58 +02:00