In CQL, before a user can create a table, they must create a keyspace to
contain this table and, among other things, specify this keyspace's RF.
But in the DynamoDB API, there is no "create keyspace" operation - the
user just creates a table, and there is no way, and no opportunity,
to specify the requested RF. Presumably, Amazon always uses the same
RF for all tables, most likely 3, although this is not officially
documented anywhere.
The existing code creates the keyspace during Scylla boot, with RF=1.
This RF=1 always works, and is a good choice for a one-node test run,
but was a really bad choice for a real cluster with multiple nodes, so
this patch fixes this choice:
With this patch, the keyspace creation is delayed - it doesn't happen
when the first node of the cluster boots, but only when the user creates
the first table. Presumably, at that time, the cluster is already up,
so at that point we can make the obvious choice automatically: a one-node
cluster will get RF=1, a >=3 node cluster will get RF=3. The choice of
RF is logged - and the choice of RF=1 is considered a warning.
Note that with this patch, keyspace creation is still automatic as it
was before. The user may manually create the keyspace via CQL, to
override this automatic choice. In the future we may also add additional
keyspace configuration options via configuration flags or new REST
requests, and the keyspace management code will also likely change
as we start to support clusters with multiple regions and global
tables. But for now, I think the automatic method is easiest for
users who want to test-drive Alternator without reading lengthy
instructions on how to set up the keyspace.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190820180610.5341-1-nyh@scylladb.com>
In the new code, write and read queries take a "service permit" which they
hold for the duration of the query, to help limit the load on the machine.
Alternator doesn't yet participate in this feature, so for now let's just
use empty_service_permit() meaning the queries don't hold on to any permit.
We can fix this later to use real permits.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
We allow BillingMode to be set to either PAY_PER_REQUEST (the default)
or PROVISIONED, although neither mode is fully implemented: In the former
case the payment isn't accounted, and in the latter case the throughput
limits are not enforced.
But other settings for BillingMode are now refused, and we add a new test
to verify that.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190818122919.8431-1-nyh@scylladb.com>
The alternator tests want to exercise many of the DynamoDB API features,
so they need a recent enough version of the client libraries, boto3
and botocore. In particular, only in botocore 1.12.54, released a year
ago, was support for BillingMode added - and we rely on this to create
pay-per-request tables for our tests.
Instead of letting the user run with an old version of this library and
get dozens of mysterious errors, in this patch we add a test to conftest.py
which cleanly aborts the test if the libraries aren't new enough, and
recommends a "pip" command to upgrade these libraries.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190819121831.26101-1-nyh@scylladb.com>
The DescribeTable operation was currently implemented to return the
minimal information that libraries and applications usually need from
it, namely verifying that some table exists. However, this operation
is actually supposed to return a lot more information fields (e.g.,
the size of the table, its creation date, and more) which we currently
don't return.
This patch adds a new test file, test_describe_table.py, testing all
these additional attributes that DescribeTable is supposed to return.
Several of the tests are marked xfail (expected to fail) because we
did not implement these attributes yet.
The test is exhaustive except for attributes that have to do with four
major features which will be tested together with these features: GSI,
LSI, streams (CDC), and backup/restore.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190816132546.2764-1-nyh@scylladb.com>
Currently Alternator starts all Scylla requests (including both reads
and writes) without any timeout set. Because of bugs and/or network
problems, Requests can theoretically hang and waste Scylla request for
hours, long after the client has given up on them and closed their
connection.
The DynamoDB protocol doesn't let a user specify which timeout to use,
so we should just use something "reasonable", in this patch 10 seconds.
Remember that all DynamoDB read and write requests are small (even scans
just scan a small piece), so 10 seconds should be above and beyond
anything we actually expect to see in practice.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190812105132.18651-1-nyh@scylladb.com>
So far we had the "--alternator-port" option allowing to configure the port
on which the Alternator server listens on, but the server always listened
to any address. It is important to also be able to configure the listen
address - it is useful in tests running several instances of Scylla on
the same machine, and useful in multi-homed machines with several interfaces.
So this patch adds the "--alternator-address" option, defaulting to 0.0.0.0
(to listen on all interfaces). It works like the many other "--*-address"
options that Scylla already has.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190808204641.28648-1-nyh@scylladb.com>
It turns out that recent rjson patches introduced some buggy
tabs instead of spaces due to bad IDE configuration. The indentation
is restored to spaces.
Until now, filtering in alternator was possible only for non-key
column equality relations. This commit adds support for equality
relations for key columns.
Alternator allows passing hash and sort key restrictions
as filters - it is, however, better to incorporate these restrictions
directly into partition and clustering ranges, if possible.
It's also necessary, as optimizations inside restrictions_filter
assume that it will not be fed unneeded rows - e.g. if filtering
is not needed on partition key restrictions, they will not be checked.
Currently the only utility function for getting key bytes
from JSON was to parse a document with the following format:
"key_column_name" : { "key_column_type" : VALUE }.
However, it's also useful to parse only the inner document, i.e.:
{ "key_column_type" : VALUE }.
Three metrics related to filtering are added to alternator:
- total rows read during filtering operations
- rows read and matched by filtering
- rows read and dropped by filtering
Some underlying operations (e.g. paging) make use of cql_stats
structure from CQL3. As such, cql_stats structure is added
to alternator stats in order to gather and use these statistics.
Read-before-write stat counters were already introduced, but the metrics
needs to be added to a metric group as well in order to be available
for users.
This patch adds partial support for GSI (Global Secondary Index) in
Alternator, implemented using a materialized view in Scylla.
This initial version only supports the specific cases of the index indexing
a column which was already part of the base table's key - e.g., indexing
what used to be a sort key (clustering key) in the base table. Indexing
of non-key attributes (which today live in a map) is not yet supported in
this version.
Creation of a table with GSIs is supported, and so is deleting the table.
UpdateTable which adds a GSI to an existing table is not yet supported.
Query and Scan operations on the index are supported.
DescribeTable does not yet list the GSIs as it should.
Seven previously-failing tests now pass, so their "xfail" tag is removed.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190808090256.12374-1-nyh@scylladb.com>
The rapidjson library needs to be used with caution in order to
provide maximum performance and avoid undefined behavior.
Comments added to rjson.hh describe provided methods and potential
pitfalls to avoid.
Message-Id: <ba94eda81c8dd2f772e1d336b36cae62d39ed7e1.1565270214.git.sarna@scylladb.com>
With libjsoncpp we were forced to work around the problem of
non-noexcept constructors by using an intermediate unique pointer.
Objects provided by rapidjson have correct noexcept specifiers,
so the workaround can be dropped.
Profiling alternator implied that JSON parsing takes up a fair amount
of CPU, and as such should be optimized. libjsoncpp is a standard
library for handling JSON objects, but it also proves slower than
rapidjson, which is hereby used instead.
The results indicated that libjsoncpp used roughly 30% of CPU
for a single-shard alternator instance under stress, while rapidjson
dropped that usage to 18% without optimizations.
Future optimizations should include eliding object copying, string copying
and perhaps experimenting with different JSON allocators.
Migrating from libjsoncpp to rapidjson proved to be beneficial
for parsing performance. As a first step, a set of helper functions
is provided to ease the migration process.
error.hh file implicitly assumed that seastar:: namespace is available
when it's included, which is not always the case. To remedy that,
seastar::httpd namespace is used explicitly.
Our CreateTable handler assumed that the function
migration_manager::announce_new_column_family()
returns a failed future if the table already exists. But in some of
our code branches, this is not the case - the function itself throws
instead of returning a failed future. The solution is to use
seastar::futurize_apply() to handle both possibilities (direct exception
or future holding an exception).
This fixes a failure of the test_table.py::test_create_table_already_exists
test case.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This adds a new document, docs/alternator.md, about Alternator.
The scope of this document should be expanded in the future. We begin
here by introducing Alternator and its current compatibility level with
Amazon DynamoDB, but it should later grow to explain the design of Alternator
and how it maps the DynamoDB data model onto Scylla's.
Whether this document should remain a short high-level overview, or a long
and detailed design document, remains an open question.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190805085340.17543-1-nyh@scylladb.com>
We start()ed Alternator's sharded service, but forgot to wait for the
future it returns! So on multi-shard run which is slow enough (e.g, debug
build), we sometimes get to invoke_on_all() before start() had completed,
and fail to initialize the Alternator server. The fix is to just wait for
the future returned by start() - just as similar code in main.cc does.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190804172006.14888-1-nyh@scylladb.com>
The function attrs_type() return a supposedly singleton, but because
it is a seastar::shared_ptr we can't use the same one for multiple
threads, and need to use a separate one per thread.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190804163933.13772-1-nyh@scylladb.com>
The CQL type singletons like utf8_type et al. are separate for separate
shards and cannot be used across shards. So whatever hash tables we use
to find them, also needs to be per-shard. If we fail to do this, we
get errors running the debug build with multiple shards.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190804165904.14204-1-nyh@scylladb.com>
Expand the GSI test suite. The most important new test is
test_gsi_key_not_in_index(), where the index's key includes just one of
the base table's key columns, but not a second one. In this case, the
Scylla implementation will nevertheless need to add the second key column
to the view (as a clustering key), even though it isn't considered a key
column by the DynamoDB API.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190718085606.7763-1-nyh@scylladb.com>
Our ListTables implementation uses get_column_families(), which lists both
base tables and materialized views. We will use materialized views to
implement DynamoDB's secondary indexes, and those should not be listed in
the results of ListTables.
The patch also includes a test for this.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190717133103.26321-2-nyh@scylladb.com>
The list_tables() utility function was used only in test_table.py
but I want to use it elsewhere too (in GSI test) so let's move it
to util.py.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190717133103.26321-1-nyh@scylladb.com>
As in case of set_diff, an exception message in set_sum should include
the user-provided request (ADD) rather than our internal helper function
set_sum.
Although we do not support GSI yet, until now we silently ignored
CreateTable's GSI parameter, and the user wouldn't know the table
wasn't created as intended.
In this patch, GSI is still unsupported, but now CreateTable will
fail with an error message that GSI is not supported.
We need to change some of the tests which test the error path, and
expect an error - but should not consider a table creation error
as the expected error.
After this patch, test_gsi.py still fails all the tests on
Alternator, but much more quickly :-)
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190711161420.18547-1-nyh@scylladb.com>
The test case for adding two sets with common values is added.
This case is a stub, because boto3 transforms the result into a Python
set, which removes duplicates on its own. A proper TODO is left
in order to migrate this case to a lower-level API and check
the returned JSON directly for lack of duplicates.