Move the ::database, ::keyspace, and ::table classes to a new replica
namespace and replica/ directory. This designates objects that only
have meaning on a replica and should not be used on a coordinator
(but note that not all replica-only classes should be in this module,
for example compaction and sstables are lower-level objects that
deserve their own modules).
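In code, the move boils down to the following (a rough sketch, not the full diff):

```cpp
// Rough sketch of the new layout: the classes now live in the replica
// namespace, under the replica/ directory (e.g. replica/database.hh).
namespace replica {

class database;
class keyspace;
class table;

} // namespace replica
```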
The module is imperfect - some additional classes like distributed_loader
should also be moved, but there is only one way to untie Gordian knots.
Closes #9872
* github.com:scylladb/scylla:
replica: move ::database, ::keyspace, and ::table to replica namespace
database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.
As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
This configures seastar to act more appropriately for a tool app, i.e.
not act as if it owns the place, taking over all system resources.
These tools are often run on a developer machine, or even next to a
running scylla instance; we want them to be as unintrusive as
possible.
Also use the new tool mode in the existing tools.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20211220143104.132327-1-bdenes@scylladb.com>
When you run "configure.py", the result is not only the creation of
./build.ninja - it also creates build/<mode>/seastar/build.ninja
and build/<mode>/abseil/build.ninja. After a "rm -r build" (or "ninja
clean"), "ninja" will no longer work because those files are missing
when Scylla's ninja tries to run ninja in those internal projects.
So we need to add a dependency, e.g., that running ninja in Seastar
requires build/<mode>/seastar/build.ninja to exist, and also say
that the rule that (re)runs "configure.py" generates those files.
After this patch,
    configure.py --with-some-parameters --of-your-choice
    rm -r build
    ninja
works - "ninja" will re-run configure.py with the same parameters
when it needs Seastar's or Abseil's build.ninja.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211230133702.869177-1-nyh@scylladb.com>
The gc_grace_seconds mechanism is a very fragile and broken design inherited from
Cassandra. Deleted data can be resurrected if a cluster-wide repair is not
performed within gc_grace_seconds. This design pushes the job of keeping
the database consistent to the user. In practice, it is very hard to
guarantee that repair is performed within gc_grace_seconds all the time. For
example, the repair workload has the lowest priority in the system and can
be slowed down by higher-priority workloads, so there is no
guarantee when a repair can finish. A gc_grace_seconds value that
used to work might not work after the data volume grows in a cluster. Users
might want to avoid running repair during a specific period when
latency is the top priority for their business.
To solve this problem, an automatic mechanism to prevent data
resurrection is proposed and implemented. The main idea is to remove the
tombstone only after the range that covers the tombstone is repaired.
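A minimal sketch of that rule, assuming a hypothetical per-range repair-time lookup:

```cpp
#include <chrono>
#include <optional>

// Illustrative sketch, not the actual Scylla code: in 'repair' mode a
// tombstone may be garbage-collected only if the range covering it was
// repaired after the tombstone was written.
using time_point = std::chrono::system_clock::time_point;

struct tombstone_info {
    time_point deletion_time; // when the tombstone was written
};

bool can_purge(const tombstone_info& t, std::optional<time_point> last_repair_of_range) {
    // No repair recorded for the covering range yet: keep the tombstone.
    return last_repair_of_range && *last_repair_of_range >= t.deletion_time;
}
```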
In this patch, a new table option tombstone_gc is added. The option is
used to configure tombstone gc mode. For example:
1) GC a tombstone after gc_grace_seconds
cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'timeout'} ;
This is the default mode. If no tombstone_gc option is specified by the
user, the old gc_grace_seconds-based GC will be used.
2) Never GC a tombstone
cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'disabled'};
3) GC a tombstone immediately
cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'immediate'};
4) GC a tombstone after repair
cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair'};
In addition to the 'mode' option, another option, 'propagation_delay_in_seconds',
is added. It defines the maximum time a write could possibly be delayed before it
eventually arrives at a node. For example (the value shown is illustrative):
cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair', 'propagation_delay_in_seconds':'3600'};
A new gossip feature TOMBSTONE_GC_OPTIONS is added. The new tombstone_gc
option can only be used after the whole cluster supports the new
feature. A mixed cluster works without problems.
Tests: compaction_test.py, ninja test
Fixes #3560
[avi: resolve conflicts vs data_dictionary]
"
A big problem with scylla tool executables is that they include the
entire scylla codebase and thus they are just as big as the scylla
executable itself, making them impractical to deploy on production
machines. We could try to combat this by selectively including only the
actually needed dependencies but even ignoring the huge churn of
sorting out our dependency hell (which we should do at one point anyway),
some tools may genuinely depend on most of the scylla codebase.
A better solution is to host the tool executables in the scylla
executable itself, switching between the actual main functions to run
in some way. The tools themselves don't contain a lot of code, so
this won't cause any considerable bloat in the size of the scylla
executable itself.
This series does exactly this: it folds all the tool executables into the
scylla one, with main() switching between the actual mains it will
delegate to, based on the argv[1] command line argument. If this is a known
tool name, the respective tool's main will be invoked.
If it is "server", missing or unrecognized, the scylla main is invoked.
Originally this series used argv[0] as the means to select the
main to run. This approach was abandoned in favor of the one described above
for the following reasons:
* No launcher script, hard link, soft link or similar games are needed to
launch a specific tool.
* No packaging needed, all tools are automatically deployed.
* Explicit tool selection, no surprises after renaming scylla to
something else.
* Tools are discoverable via scylla's description.
* Follows the trend set by modern command line multi-command or multi-app
programs, like git.
Fixes: #7801
Tests: unit(dev)
"
* 'tools-in-scylla-exec-v5' of https://github.com/denesb/scylla:
main,tools,configure.py: fold tools into scylla exec
tools: prepare for inclusion in scylla's main
main: add skeleton switching code on argv[1]
main: extract scylla specific code into scylla_main()
The infrastructure is now in place. Remove the proxy main of the tools,
and add appropriate `else if` statements to the executable switch in
main.cc. Also remove the tool applications from the `apps` list and add
their respective sources as dependencies to the main scylla executable.
With this, we now have all tool executables living inside the scylla
main one.
The new module will contain all schema related metadata, detached from
actual data access (provided by the database class). User types are the
first content to be moved to the new module.
It turns out that -O3 enables -fslp-vectorize even if it is
disabled before -O3 on the command line. Rearrange the code
so that -O3 comes before the more specific optimization options.
This PR finally removes the `term` class and replaces it with `expression`.
* There was some trouble with `lwt_cache_id` in `expr::function_call`.
The current code works the following way:
* for each `function_call` inside a `term` that describes a pk restriction, `prepare_context::add_pk_function_call` is called.
* `add_pk_function_call` takes a `::shared_ptr<cql3::functions::function_call>`, sets its `cache_id` and pushes this shared pointer onto a vector of all collected function calls
* Later, when some condition is met, we want to clear the cache ids of all those collected function calls. To do this we iterate through the shared pointers collected in `prepare_context` and clear the cache id for each of them.
This doesn't work with `expr::function_call` because it isn't kept inside a shared pointer.
To solve this I put the `lwt_cache_id` inside a shared pointer and then `prepare_context` collects these shared pointers to cache ids (see the sketch after this list).
I also experimented with doing this without any shared pointers - maybe we could just walk through the expression and clear the cache ids ourselves. But the problem is that expressions are copied all the time: we could clear the cache in one place, but forget about a copy. Doing it with shared pointers more closely matches the original behaviour.
The experiment is on the [term2-pr3-backup-altcache](https://github.com/cvybhu/scylla/tree/term2-pr3-backup-altcache) branch
* `shared_ptr<term>` being `nullptr` could mean:
* It represents a cql value `null`
* That there is no value, like `std::nullopt` (for example in `attributes.hh`)
* That it's a mistake, it shouldn't be possible
A good way to distinguish between optional and mistake is to look for `my_term->bind_and_get()`; then we know that it's not an optional value.
* On the other hand `raw_value` cast to bool means:
* `false` - null or unset
* `true` - some value, maybe empty
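A sketch of the shared-pointer approach from the first point above (types and names are illustrative, not the actual code):

```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <vector>

// Illustrative sketch: the cache id lives behind a shared_ptr, so every
// copy of the expression shares the same id, and clearing it once clears
// it for all copies.
struct function_call_node {
    std::shared_ptr<std::optional<uint8_t>> lwt_cache_id =
        std::make_shared<std::optional<uint8_t>>();
};

struct prepare_context {
    std::vector<std::shared_ptr<std::optional<uint8_t>>> collected_ids;

    void add_pk_function_call(function_call_node& fc, uint8_t id) {
        *fc.lwt_cache_id = id;
        collected_ids.push_back(fc.lwt_cache_id); // remember it for later clearing
    }

    void clear_pk_function_calls_cache() {
        for (auto& id : collected_ids) {
            *id = std::nullopt; // visible through every copied expression
        }
    }
};
```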
I ran a simple benchmark on my laptop to see how performance is affected:
```
build/release/test/perf/perf_simple_query --smp 1 -m 1G --operations-per-shard 1000000 --task-quota-ms 10
```
* On master (a21b1fbb2f) I get:
```
176506.60 tps ( 77.0 allocs/op, 12.0 tasks/op, 45831 insns/op)
median 176506.60 tps ( 77.0 allocs/op, 12.0 tasks/op, 45831 insns/op)
median absolute deviation: 0.00
maximum: 176506.60
minimum: 176506.60
```
* On this branch I get:
```
172225.30 tps ( 75.1 allocs/op, 12.1 tasks/op, 46106 insns/op)
median 172225.30 tps ( 75.1 allocs/op, 12.1 tasks/op, 46106 insns/op)
median absolute deviation: 0.00
maximum: 172225.30
minimum: 172225.30
```
Closes #9481
* github.com:scylladb/scylla:
cql3: Remove remaining mentions of term
cql3: Remove term
cql3: Rename prepare_term to prepare_expression
cql3: Make prepare_term return an expression instead of term
cql3: expr: Add size check to evaluate_set
cql3: expr: Add expr::contains_bind_marker
cql3: expr: Rename find_atom to find_binop
cql3: expr: Add find_in_expression
cql3: Remove term in operations
cql3: Remove term in relations
cql3: Remove term in multi_column_restrictions
cql3: Remove term in term_slice, rename to bounds_slice
cql3: expr: Remove term in expression
cql3: expr: Add evaluate_IN_list(expression, options)
cql3: Remove term in column_condition
cql3: Remove term in select_statement
cql3: Remove term in update_statement
cql3: Use internal cql format in insert_prepared_json_statement cache
types: Add map_type_impl::serialize(range of <bytes, bytes>)
cql3: Remove term in cql3/attributes
cql3: expr: Add constant::view() method
cql3: expr: Implement fill_prepare_context(expression)
cql3: expr: add expr::visit that takes a mutable expression
cql3: expr: Add receiver to expr::bind_variable
Operations of adding or removing a node to Raft configuration
are made idempotent: they do nothing if already done, and
they are safe to resume after a failure.
However, since topology changes are not transactional, if a
bootstrap or removal procedure fails midway, Raft group 0
configuration may go out of sync with topology state as seen by
gossip.
In future we must change gossip to avoid making any persistent
changes to the cluster: all changes to persistent topology state
will be done exclusively through Raft Group 0.
Specifically, instead of persisting the tokens by advertising
them through gossip, the bootstrap will commit a change to a system
table using Raft group 0. nodetool will switch from looking at
gossip-managed tables to consulting with Raft Group 0 configuration
or Raft-managed tables.
Once this transformation is done, naturally, adding a node to Raft
configuration (perhaps as a non-voting member at first) will become the
first persistent change to ring state applied when a node joins;
removing a node from the Raft Group 0 configuration will become the last
action when removing a node.
Until this is done, do our best to avoid a cluster state when
a removed node or a node which addition failed is stuck in Raft
configuration, but the node is no longer present in gossip-managed
system tables. In other words, keep the gossip the primary source of
truth. For this purpose, carefully choose the timing when we
join and leave Raft group 0:
Join Raft group 0 only after we've advertised our tokens, so the
cluster is aware of this node and it's visible in nodetool status,
but before the node state jumps to "normal", i.e. before it accepts
queries. Since the operation is idempotent, invoke it on each
restart.
Remove the node from Group 0 *before* its tokens are removed
from gossip-managed system tables. This guarantees
that if removal from Raft group 0 fails for whatever reason,
the node stays in the ring, so nodetool removenode and
friends are re-tried.
Add tracing.
Introduce a special state machine used to find
a leader of an existing Raft cluster or create
a new cluster.
This state machine should be used when a new
Scylla node has no persisted Raft Group 0 configuration.
The algorithm is initialized with a list of seed
IP addresses, the IP address of this server, and
this server's Raft server id.
The IP addresses are used to construct an initial list of peers.
Then, the algorithm tries to contact each peer (excluding self) from
its peer list and share the peer list with this peer, as well as
get the peer's peer list. If this peer is already part of
some Raft cluster, this information is also shared. On a response
from a peer, the current peer's peer list is updated. The
algorithm stops when all peers have exchanged peer information or
one of the peers responds with the id of a Raft group and the Raft
server address of the group leader.
(If any of the peers fails to respond, the algorithm re-tries
ad infinitum with a timeout).
More formally, the algorithm stops when one of the following is true:
- it finds an instance with initialized Raft Group 0, with a leader
- all the peers have been contacted, and this server's
Raft server id is the smallest among all contacted peers.
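A simplified sketch of the loop (all types and helpers are hypothetical, not Scylla's actual implementation):

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct peer { uint64_t raft_id; std::string ip; };
struct group0_info { uint64_t group_id; std::string leader_ip; };
struct exchange_reply {
    std::vector<peer> peers;                   // the contacted peer's peer list
    std::optional<group0_info> existing_group; // set if the peer knows a Raft group 0
};

exchange_reply exchange_peer_lists(const peer& p, const std::vector<peer>& ours);
void merge_peer_lists(std::vector<peer>& ours, const std::vector<peer>& theirs);
group0_info create_group0(const peer& self);

group0_info discover(std::vector<peer> peers, const peer& self) {
    for (;;) {
        const auto snapshot = peers; // iterate over a copy; merging may grow 'peers'
        for (const auto& p : snapshot) {
            if (p.raft_id == self.raft_id) {
                continue; // exclude self
            }
            auto reply = exchange_peer_lists(p, peers); // retried with a timeout on failure
            if (reply.existing_group) {
                return *reply.existing_group; // a peer already knows group 0 and its leader
            }
            merge_peer_lists(peers, reply.peers);
        }
        const bool smallest_id = std::all_of(peers.begin(), peers.end(),
                [&self] (const peer& p) { return self.raft_id <= p.raft_id; });
        if (peers.size() == snapshot.size() && smallest_id) {
            // All known peers contacted, nothing new learned, and our id is
            // the smallest: this node creates the new cluster.
            return create_group0(self);
        }
    }
}
```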
First, it doesn't test the gossiper, so
it's unclear why we have it at all.
And it doesn't test anything more than what we already test
using cql_test_env.
For testing gossip there is test/manual/gossip.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211122081305.789375-2-bhalevy@scylladb.com>
This PR introduces 4 new virtual tables aimed at replacing nodetool commands, working towards the long-term goal of replacing nodetool completely, at least for cluster information retrieval purposes.
As you may have noticed, most of these replacements are not exact matches. This is on purpose. I feel that the nodetool commands are somewhat chaotic: they might have had a clear plan on what command prints what, but after years of organic development they are a mess of fields that feel like they don't belong. In addition, they are centered on C* terminology which often sounds strange or doesn't make any sense for scylla (off-heap memory, counter cache, etc.).
So in this PR I tried to do a few things:
* Drop all fields that don't make sense for scylla;
* Rename/reformat/rephrase fields that have a corresponding concept in scylla, so that it uses the scylla terminology;
* Group information in tables based on some common theme;
With these guidelines in mind, let's look at the virtual tables introduced in this PR:
* `system.snapshots` - replacement for `nodetool listsnapshots`;
* `system.protocol_servers` - replacement for `nodetool statusbinary` as well as `Thrift active` and `Native Transport active` from `nodetool info`;
* `system.runtime_info` - replacement for `nodetool info`, not an exact match: some fields were removed, some were refactored to make sense for scylla;
* `system.versions` - replacement for `nodetool version`, prints all versions, including build-id;
Closes #9517
* github.com:scylladb/scylla:
test/cql-pytest: add virtual_tables.py
test/cql-pytest: nodetool.py: add take_snapshot()
db/system_keyspace: add versions table
configure.py: move release.cc and build_id.cc to scylla_core
db/system_keyspace: add runtime_info table
db/system_keyspace: add protocol_servers table
service: storage_service: s/client_shutdown_hooks/protocol_servers/
service: storage_service: remove unused unregister_client_shutdown_hook
redis: redis_service: implement the protocol_server interface
alternator: controller: implement the protocol_server interface
transport: controller: implement the protocol_server interface
thrift: controller: implement the protocol_server interface
Add protocol_server interface
db/system_keyspace: add snapshots virtual table
db/virtual_table: remove _db member
db/system_keyspace: propagate distributed<> database and storage_service to register_virtual_tables()
docs/design-notes/system_keyspace.md: add listing of existing virtual tables
docs/guides: add virtual-tables.md
These two files were only added to the scylla executable and some
specific unit tests. As we are about to use the symbols defined in these
files in some scylla_core code, move them there.
compile_commands.json (a.k.a. "compdb",
https://clang.llvm.org/docs/JSONCompilationDatabase.html) is intended
to help stand-alone C-family LSP servers index the codebase as
precisely as possible.
The actively maintained LSP servers with good C++ support are:
- Clangd (https://clangd.llvm.org/)
- CCLS (https://github.com/MaskRay/ccls)
This change causes a successful invocation of configure.py to create a
unified Scylla+Seastar+Abseil compdb for every selected build mode,
and to leave a valid symlink in the source root (if a valid symlink
already exists, it will be left alone).
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Closes #9558
prepare_term now takes an expression and returns a prepared expression.
It should be renamed to prepare_expression.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
node_exporter is packaged with some random uid/gid in the tarball.
When extracting it as an ordinary user this isn't a problem, since
the uid/gid are reset to the current user, but that doesn't happen
under dbuild since `tar` thinks the current user is root. This causes
a problem if one wants to delete the build directory later, since it
becomes owned by some random user (see /etc/subuid).
Reset the uid/gid information so this doesn't happen.
Closes #9579
The warning was disabled during the migration to clang, but now it
appears unnecessary (perhaps clang added support for the attributes
it did not have then). It is valuable for detecting misspelled
attributes, so enable it again.
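For example, with the warning enabled the compiler flags a typo that would otherwise be silently ignored:

```cpp
// Without the warning, the misspelled attribute below is silently dropped;
// with it, the compiler points at the typo.
[[nodiscrad]] int compute_checksum(); // typo: should be [[nodiscard]]
```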
Closes #9480
Our source base drifted away from gcc compatibility; this mostly
restores the ability to build with gcc. An important exception is
coroutines that have an initializer list [1]; this still doesn't work.
We aim to switch back to gcc 11 if/when this gives us better
C++ compatibility and performance.
Test: unit (dev)
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98056
Closes #9459
* github.com:scylladb/scylla:
test: radix_tree_printer: avoid template specialization in class context
test: raft: avoid ignored variable errors
test: reader_concurrency_semaphore_test: isolate from namespace of source_location
test: cql_query_test: drop unused lambda assert_replication_not_contains
test: commitlog_test: don't use deprecated seastar::unaligned_cast
test: adjust signed/unsigned comparisons in loops and boost tests
build: silence some gcc 11 warnings
sstables: processing_result_generator: make coroutine support palatable for C++20 compilers
managed_bytes: avoid compile-time loop in converting constructor
service: service_level_controller: drop unused variable sl_compare
raft: disambiguate promise name in raft::active_read
locator: azure_snitch: use full type name in definition of globals
cql3: statements: create_service_level_statement: don't ignore replace_defaults()
cql3: statement_restrictions: adjust call to std::vector deduction guide
types: remove recursive constraint in deserialize_value
cql3: restrictions: relax constraint on visitor_with_binary_operator_content
treewide: handle switch statements that return
cql3: expr: correct type of captured map value_type
cdc: adjust type of streams_count
alternator: disambiguate attrs_to_get in table_requests
Just cut-and-paste the code into sstables_loader.cc. No other
changes except replacing the storage service logger with its own one.
For now the code stays in the storage_service class, but the next
patch will relocate the code into the sstables_loader one.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Until now reversed queries were implemented inside
`querier::consume_page` (more precisely, inside the free function
`consume_page` used by `querier::consume_page`) by wrapping the
passed-in reader into `make_reversing_reader` and then consuming
fragments from the resulting reversed reader.
The first couple of commits change that by pushing the reversing down below
the `make_combined_reader` call in `table::query`. This allows
working on improving reversing for memtables independently from
reversing for sstables.
We then extend the `index_reader` with functions that allow
reading the promoted index in reverse.
We introduce `partition_reversing_data_source`, which wraps an sstable data
file and returns data buffers with contents of a single chosen partition
as if the rows were stored in reverse order.
We use the reversing source and the extended index reader in
`mx_sstable_mutation_reader` to implement efficient (at least in theory)
reversed single-partition reads.
The patchset disables cache for reversed reads. Fast-forwarding
is not supported in the mx reader for reversed queries at this point.
Details in commit messages. Read the commits in topological order
for best review experience.
Refs: #9134
(not saying "Fixes" because it's only for single-partition queries
without forwarding)
Closes #9281
* github.com:scylladb/scylla:
table: add option to automatically bypass cache for reversed queries
test: reverse sstable reader with random schema and random mutations
sstables: mx: implement reversed single-partition reads
sstables: mx: introduce partition_reversing_data_source
sstables: index_reader: add support for iterating over clustering ranges in reverse
clustering_key_filter: clustering_key_filter_ranges owning constructor
flat_mutation_reader: mention reversed schema in make_reversing_reader docstring
clustering_key_filter: document clustering_key_filter_ranges::get_ranges
This patch adds an implementation of a data source that wraps an sstable
data file and returns data buffers with contents of one partition in the
sstable as if the rows of the partition were present in a reversed
order. In other words, to the user of the source the partition appears
to be reversed. We shall call this an 'intermediary' data source.
As part of the interface of the intermediary source the user is also
given read access to the source's current position over the data file,
and the constructor of the source takes a reference to `index_reader`.
This is necessary because the index operates directly on data file
offsets and we want the user to be able to use the index to skip
sequences of rows.
In order to ask the source to skip a sequence of rows - e.g. when jumping
between clustering ranges - the user must advance the index's upper bound
in reverse (to an earlier position). The source will then notice that
the end position of the index has changed and take appropriate action.
An alternative would be to translate the data positions of
`index_reader` to 'reversed positions' of the intermediary and then use
`skip_to` for skipping, as we do for forward reads. However this
solution would introduce more complexity to `index_reader` and the
intermediary source. One reason for the complexity in the input stream
is that we would have two kinds of skips: a single row skip,
and a skip to a clustering range. We know the offset of the next row,
so we could check that to differentiate them. We would also need to add
information about the position of the first clustering row and the end of
the last one to the index_reader. Skipping by checking the index seems
to be overall simpler.
For simplicity, the intermediary stream always starts with
parsing the partition header and (if present) the static row,
and returning the corresponding bytes as a result of the first
read.
After partition header and static row we must find the last row entry of
the requested range. If the range ends before the partition end (i.e.
there are more row entries after the range) we can use the 'previous
unfiltered size' of the row following the range; otherwise we must scan
the last promoted index block and take its last row.
After finding the data range of the last row, we parse rows
consecutively in reversed order. We must parse the rows partially
to learn their lengths and the positions of previous rows. We're
using constructs similar to those in the sstable parser, but ours only
contains a small part of the parsing coroutine and doesn't perform
any correctness checks. The parser for rows still turned out rather
big, mostly because we can't always deduce the size of the clustering
blocks without reading the block header.
The parser allows reading rows while skipping their bodies also in
non-reversed order, which we are making use of while reading the
last promoted index block.
The intermediary data source has one more utility: reversing range
tombstones. When we read a tombstone bound/boundary, we modify
the data buffer so that the resulting bound/boundary has the reversed
kind (so we don't read ends before starts) and the boundaries have their
before/after timestamps swapped.
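A sketch of the kind-reversal part of that rewriting (enumerator names are illustrative):

```cpp
// Illustrative sketch: when emitting a range tombstone bound in reverse
// order, a start bound becomes the corresponding end bound and vice versa,
// so the reversed stream never sees an end before its start.
enum class bound_kind { excl_end, incl_start, incl_end, excl_start };

constexpr bound_kind reversed(bound_kind k) {
    switch (k) {
    case bound_kind::incl_start: return bound_kind::incl_end;
    case bound_kind::excl_start: return bound_kind::excl_end;
    case bound_kind::incl_end:   return bound_kind::incl_start;
    case bound_kind::excl_end:   return bound_kind::excl_start;
    }
    __builtin_unreachable();
}
```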
There seems to be a problem with libwasmtime.a dependency on aarch64,
causing occasional segfaults during tests - specifically, tests
which exercise the path for halting wasm execution due to fuel
exhaustion. As a temporary measure, wasm is disabled on this
architecture to unblock the flow.
Refs #9387
Closes #9414
... from Nadav Har'El
This small series adds a stub implementation of Alternator's
UpdateTimeToLive and DescribeTimeToLive operations. These operations can
enable, disable, or inquire about, the chosen expiration-time attribute.
Currently, the information about the chosen attribute is only saved,
with no actual expiration of any items taking place.
Because this is an incomplete implementation of this feature, it is not
enabled unless an experimental flag is enabled on all nodes in the
cluster.
See the individual patches for more information on what this series
does.
Refs #5060.
Closes #9345
* github.com:scylladb/scylla:
test/alternator: rename utility function test_table_name()
alternator: stub TTL operations
alternator: make three utility functions in executor.cc non-static
test/alternator: test another corner case of TTL
"
Currently the database start and stop code is quite dispersed and
exists in two slightly different forms - one in main and the
other in cql_test_env. This set unifies both and makes
them both look almost ideal:
sharded<database> db;
db.start(<dependencies>);
auto stop = defer([&db] { db.stop().get(); });
db.invoke_on_all(&database::start).get();
with all (well, most) other mentions of the "db" variable
being arguments for other services' dependencies.
tests: unit(dev, release), unit.cross_shard_barrier(debug)
dtest.simple_boot_shutdown(dev)
refs: #2737
refs: #2795
refs: #5489
"
* 'br-database-teardown-unification-2' of https://github.com/xemul/scylla: (26 commits)
main: Log when database starts
view_update_generator: Register staging sstables in constructor
database, messaging: Delete old connection drop notification
database, proxy: Relocate connection-drop activity
messaging, proxy: Notify connection drops with boost signal
database, tests: Rework recommended format setting
database, sstables_manager: Sow some noexcepts
database: Eliminate unused helpers
database: Merge the stop_database() into database::stop()
database: Flatten stop_database()
database: Equip with cross-shard-barrier
database: Move starting bits into start()
database: Add .start() method
main: Initialize directories before database
main, api: Detach set_server_config from database and move up
main: Shorten commitlog creation
database: Extract commitlog initialization from init_system_keyspace
repair: Shutdown without database help
main: Shift iosched verification upward
database: Remove unused mm arg from init_non_system_keyspaces()
...
This patch adds stubs for the UpdateTimeToLive and DescribeTimeToLive
operations to Alternator. These operations can enable, disable, or inquire
about the chosen expiration-time attribute.
Currently, the information about the chosen attribute is only saved, with
no actual expiration of any items taking place.
Some of the tests for the TTL feature start to pass, so their xfail tag
is removed.
Because this new feature is incomplete, it is not enabled unless
the "alternator-ttl" experimental feature is enabled. Moreover, for
these operations to be allowed, the entire cluster needs to support
this experimental feature, because all nodes need to participate in the
data expiration - if some old nodes don't support Alternator TTL, some
of the data they hold won't get expired... So we don't allow enabling
TTL until all the nodes in the cluster support this feature.
The implementation is in a new source file, alternator/ttl.cc. This
source file will continue to grow as we implement the expiration feature.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The wasm::engine exists as a sharded<> service in main, but it's only
passed by local reference into database on start. There's not much profit
in keeping it at main scope; things get much simpler if the
engine is kept purely on the database.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Add a synchronization facility to let shards wait for each
other to pass through certain points in the code.
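A hypothetical usage sketch, assuming an arrive_and_wait()-style interface (all names are illustrative, not the actual API):

```cpp
#include <seastar/core/coroutine.hh>
#include <seastar/core/future.hh>

// Illustrative barrier interface and per-shard steps.
struct cross_shard_barrier {
    seastar::future<> arrive_and_wait(); // resolves once every shard arrived
};
seastar::future<> do_local_step();
seastar::future<> do_next_step();

seastar::future<> run_step_in_lockstep(cross_shard_barrier& barrier) {
    co_await do_local_step();           // per-shard work
    co_await barrier.arrive_and_wait(); // rendezvous: wait for all shards
    co_await do_next_step();            // safe: every shard passed the point above
}
```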
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.
Enable the warning and fix numerous violations.
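For example:

```cpp
struct base {
    virtual void handle(int x);
};

struct derived : base {
    // Intended as an override, but the parameter type differs (long vs.
    // int), so this declares a new function that hides base::handle(int)
    // instead of overriding it. The warning flags exactly this mismatch.
    virtual void handle(long x);
};
```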
Closes #9347
This series adds very basic support for WebAssembly-based user-defined functions.
This series comes with a basic set of tests which were used to designate a minimal goal for this initial implementation.
Example usage:
```cql
CREATE FUNCTION ks.fibonacci (str text)
RETURNS NULL ON NULL INPUT
RETURNS boolean
LANGUAGE xwasm
AS ' (module
(func $fibonacci (param $n i32) (result i32)
(if
(i32.lt_s (local.get $n) (i32.const 2))
(return (local.get $n))
)
(i32.add
(call $fibonacci (i32.sub (local.get $n) (i32.const 1)))
(call $fibonacci (i32.sub (local.get $n) (i32.const 2)))
)
)
(export "fibonacci" (func $fibonacci))
) '
```
Note that the language is currently called "xwasm" as in "experimental wasm", because its interface is still subject to change in the future.
Closes #9108
* github.com:scylladb/scylla:
docs: add a WebAssembly entry
cql-pytest: add wasm-based tests for user-defined functions
main: add wasm engine instantiation
treewide: add initial WebAssembly support to UDF
wasm: add initial WebAssembly runtime implementation
db: add wasm_engine pointer to database
lang: add wasm_engine service
import wasmtime.hh
lua: move to lang/ directory
cql3: generalize user-defined functions for more languages
The engine is based on wasmtime and is able to:
- compile wasm text format to bytecode
- run a given compiled function with custom arguments
This implementation is missing crucial features, like running
on any types other than 32-bit integers. It serves as a skeleton
for a future full implementation.
'$builddir/release/{scylla_product}-python3-package.tar.gz' in the
dist-python3 target is for compat-python3; we forgot to remove it at 35a14ab.
Fixes #9333
Closes #9334
A tool which can be used to examine the content of sstable(s) and
execute various operations on them. The currently supported operations
are:
* dump - dumps the content of the sstable(s), similar to sstabledump;
* index-dump - dumps the content of the sstable index(es), similar to
scylla-sstable-index;
* writetime-histogram - generates a histogram of all the timestamps in
the sstable(s);
* custom - a hackable operation for the expert user (until scripting
support is implemented);
* validate - validate the content of the sstable(s) with the mutation
fragment stream validator, same as scrub in validate mode;
A utility which can load a schema from a schema.cql file. The file has
to contain all the "dependencies" of the table: keyspace, UDTs, etc.
This will be used by the scylla-sstable-crawler in the next patch.
Pass the --pip-packages option to tools/python3/reloc/build_reloc.sh, and
add scylla-driver to the relocatable python3, which is required for
fix_system_distributed_tables.py.
[avi: regenerate toolchain]
Ref #9040
We have in our Ninja build file various targets which ask to build just
a single build mode. For example, "ninja dev" builds everything in dev
mode - including Scylla, tests, and distribution artifacts - but
shouldn't build anything in other build modes (debug, release, etc.),
even if they were previously configured by configure.py.
However, we had a bug where these build-mode-specific targets
nevertheless compiled *all* configured modes, not just the requested
mode.
The bug was introduced in commit edd54a9463 -
targets "dist-server-compat" and "dist-unified-compat" were introduced,
but instead of having per-build-mode versions of these targets, only
one of each was introduced building all modes. When these new targets
were used in a couple of places in per-build-mode targets, it forced
these targets to build all modes instead of just the chosen one.
The solution is to split the dist-server-compat target into multiple
dist-server-compat-{mode}, and similarly split dist-unified-compat.
The unsplit target is also retained - for use in targets that really
want all build modes.
Fixes #9260.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210829123418.290333-1-nyh@scylladb.com>
A term_raw_expression is a term::raw that holds an expression. It will
be used to incrementally convert the source base to expressions, while
still exposing the result through the common interface of shared_ptr<term::raw>.