19595 Commits

Author SHA1 Message Date
Piotr Sarna
4474ceceed alternator-test: enable passing tests
With more GSI features implemented, tests with XPASS status are promoted
to being enabled.

One test case (test_gsi_describe) is partially done as DescribeTable
now contains index names, but we could try providing more attributes
(e.g. IndexSizeBytes and ItemCount from the test case), so the test
is left in the XFAIL state.
2019-09-11 18:01:05 +03:00
Piotr Sarna
f922d6d771 alternator: Add 'mismatch' to serialization error message
In order to match the tests and origin more properly, the error message
for mismatched types is updated so it contains the word 'mismatch'.
2019-09-11 18:01:05 +03:00
Piotr Sarna
9dceea14f9 alternator: add describing GSI in DescribeTable
The DescribeTable response now contains the list of index names
as well. None of the attributes of the list are marked as 'required'
in the documentation, so currently the implementation provides
index names only.
2019-09-11 18:01:05 +03:00
Piotr Sarna
938a06e4c0 alternator: allow adding GSI-related regular columns to schema
In order to be able to create a Global Secondary Index over a regular
column, this column is upgraded from being a map entry to being a full
member of the schema. As such, it's possible to use this column
definition in the underlying materialized view's key.
2019-09-11 18:01:05 +03:00
Piotr Sarna
2a123925ca alternator: add handling regular columns with schema definitions
In order to prepare alternator for adding regular columns to schema,
i.e. in order to create a materialized view over them,
the code is changed so that updating no longer assumes that only keys
are included in the table schema.
2019-09-11 18:01:05 +03:00
Piotr Sarna
befa2fdc80 alternator: start fetching all regular columns
Since in the future we may want to have more regular columns
in alternator tables' schemas, the code is changed accordingly,
so all regular columns will be fetched instead of just the attribute
map.
2019-09-11 18:01:05 +03:00
Piotr Sarna
53044645aa alternator: avoid creating empty collection mutations
If no regular column attributes are passed to PutItem, the attr
collector serializes an empty collection mutation nonetheless
and sends it. It's redundant, so instead, if the attr collector
is empty, the collection does not get serialized and sent to replicas.
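
A minimal Python sketch of the idea described above, not the actual C++ change. The helper names and the dict shapes standing in for mutations are hypothetical.

```python
from typing import Optional


def build_attrs_mutation(attrs: dict) -> Optional[dict]:
    """Return a collection mutation for the attributes, or None if there are none.

    Hypothetical helper: the dict only stands in for a real mutation object.
    """
    if not attrs:
        # Nothing to write: skip serialization so no empty mutation is sent.
        return None
    return {"column": "attrs", "cells": dict(attrs)}


def put_item_mutations(key: dict, attrs: dict) -> list:
    """Collect the mutations this sketch would send to replicas for PutItem."""
    mutations = [{"column": "key", "cells": dict(key)}]
    attrs_mutation = build_attrs_mutation(attrs)
    if attrs_mutation is not None:
        mutations.append(attrs_mutation)
    return mutations
```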
2019-09-11 18:01:05 +03:00
Nadav Har'El
317954fe19 alternator-test: add license blurbs
Add copyright and license blurbs to all alternator-test source files.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190825161018.10358-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
c9eb9d9c76 alternator: update license blurbs
Update all the license blurbs to the one we use in the open-source
Scylla project, licensed under the AGPL.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190825160321.10016-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Piotr Sarna
d6e671b04f alternator: add initial tracing to requests
Each request provides basic tracing information about itself.

Example output from tracing:

cqlsh> select request, parameters from system_traces.sessions
           where session_id = 39813070-c4ea-11e9-8572-000000000000;
 request          | parameters
------------------+-----------------------------------------------------
 Alternator Query | {'query': '{"TableName": "alternator_test_15664",
                    "KeyConditions": {"p": {"AttributeValueList":
                    [{"S": "T0FE0QCS0X"}], "ComparisonOperator": "EQ"}}}'}

cqlsh> select session_id, activity from system_traces.events
           where session_id = 39813070-c4ea-11e9-8572-000000000000;
 session_id                           | activity
--------------------------------------+-----------------------------
 39813070-c4ea-11e9-8572-000000000000 |                    Querying
 39813070-c4ea-11e9-8572-000000000000 | Performing a database query
2019-09-11 18:01:05 +03:00
Piotr Sarna
cb791abb9d alternator: enable query tracing
Probabilistic tracing can be enabled via REST API. Alternator will
from now on create tracing sessions for its operations as well.

Examples:

 # trace around 0.1% of all requests
curl -X POST http://localhost:10000/storage_service/trace_probability?probability=0.001
 # trace everything
curl -X POST http://localhost:10000/storage_service/trace_probability?probability=1
2019-09-11 18:01:05 +03:00
Piotr Sarna
6c8c31bfc9 alternator: add client state
Keeping an instance of client_state is a convenient way of being able
to use tracing for alternator. It's also currently used in paging,
so adding a client state to executor removes the need of keeping
a dummy value.
2019-09-11 18:01:05 +03:00
Piotr Sarna
1ca9dc5d47 alternator: use correct string views in serialization
String views used in JSON serialization should use not only the pointer
returned by rapidjson, but also the string length, as it may contain
\0 characters.
Additionally, one unnecessary copy is elided.
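
The pitfall can be illustrated with a small Python analogy (the real code is C++ and uses rapidjson). Both function names are hypothetical: one mimics a view built from the pointer alone, the other a view built from pointer plus length.

```python
def pointer_only_view(buf: bytes) -> bytes:
    """Mimic a view built from a pointer alone: it stops at the first NUL byte."""
    end = buf.find(b"\0")
    return buf if end == -1 else buf[:end]


def pointer_and_length_view(buf: bytes, length: int) -> bytes:
    """Mimic a view built from a pointer plus an explicit length."""
    return buf[:length]


value = b"ab\0cd"
truncated = pointer_only_view(value)                 # loses everything after \0
complete = pointer_and_length_view(value, len(value))  # keeps the embedded \0
```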
2019-09-11 18:01:05 +03:00
Nadav Har'El
32b898db7b alternator: docs/alternator.md: link to a longer document
Add a link to a longer document (currently, around 40 pages) about
DynamoDB's features and how we implemented or may implement them in
Alternator.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190825121201.31747-2-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
a5c3d11ccb alternator: document choice of RF
After changing the choice of RF in a previous patch, let's update the
relevant part of docs/alternator.md.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190825121201.31747-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
d20ec9f492 alternator: expand docs/alternator.md
Expand docs/alternator.md with new sections about how to run Alternator,
and a very brief introduction to its design.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190818164628.12531-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
9b0ef1a311 alternator: refuse CreateTable if uses unsupported features
If a user tries to create a table with an unsupported feature -
a local secondary index, a user-defined encryption key or supporting
streams (CDC), let's refuse the table creation, so the application
doesn't continue thinking this feature is available to it.

The "Tags" feature is also not supported, but it is more harmless
(it is used mostly for accounting purposes) so we do not fail the
table creation because of it.
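
A rough Python sketch of such a check, assuming the features map to the DynamoDB CreateTable fields LocalSecondaryIndexes, SSESpecification and StreamSpecification. The function name and the exception type are hypothetical, and the real check may inspect the field contents rather than mere presence.

```python
# Request fields treated as unsupported in this sketch. The mapping from the
# features named in the commit message to these field names is an assumption.
UNSUPPORTED_FIELDS = {
    "LocalSecondaryIndexes": "local secondary indexes",
    "SSESpecification": "user-defined encryption keys",
    "StreamSpecification": "streams",
}


class ValidationError(Exception):
    """Hypothetical error type standing in for a validation exception."""


def check_create_table(request: dict) -> None:
    """Raise if the request asks for a feature this sketch does not support."""
    for field, feature in UNSUPPORTED_FIELDS.items():
        if field in request:
            raise ValidationError(f"CreateTable: {feature} are not supported")
    # "Tags" is deliberately not checked: it is ignored rather than refused.
```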

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190818125528.9091-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Piotr Sarna
ab25472034 alternator: migrate to visitor pattern in serialization
Types can now be processed with a visitor pattern, which is neater
than a chain of if statements.
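
Python has no direct counterpart to a C++ visitor, so the sketch below uses single dispatch on the value's type as a stand-in for the same idea: one handler per type instead of an if chain. The function name and the chosen type tags are illustrative assumptions.

```python
from functools import singledispatch


@singledispatch
def to_typed_json(value):
    """Fallback handler: reached only for types with no registered handler."""
    raise TypeError(f"Unsupported type: {type(value).__name__}")


@to_typed_json.register
def _(value: str):
    return {"S": value}


@to_typed_json.register
def _(value: bool):
    # bool is registered separately because it is a subclass of int.
    return {"BOOL": value}


@to_typed_json.register
def _(value: int):
    return {"N": str(value)}
```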
Message-Id: <256429b7593d8ad8dff737d8ddb356991fb2a423.1566386758.git.sarna@scylladb.com>
2019-09-11 18:01:05 +03:00
Piotr Sarna
42d2910f2c alternator: add from_string with raw pointer to rjson
from_string is a family of functions that create rjson values from
strings - now it's extended with accepting raw pointer and size.
Message-Id: <d443e2e4dcc115471202759ecc3641ec902ed9e4.1566386758.git.sarna@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
2f53423a2f alternator: automatically choose RF: 1 or 3
In CQL, before a user can create a table, they must create a keyspace to
contain this table and, among other things, specify this keyspace's RF.

But in the DynamoDB API, there is no "create keyspace" operation - the
user just creates a table, and there is no way, and no opportunity,
to specify the requested RF. Presumably, Amazon always uses the same
RF for all tables, most likely 3, although this is not officially
documented anywhere.

The existing code creates the keyspace during Scylla boot, with RF=1.
This RF=1 always works, and is a good choice for a one-node test run,
but was a really bad choice for a real cluster with multiple nodes, so
this patch fixes this choice:

With this patch, the keyspace creation is delayed - it doesn't happen
when the first node of the cluster boots, but only when the user creates
the first table. Presumably, at that time, the cluster is already up,
so at that point we can make the obvious choice automatically: a one-node
cluster will get RF=1, a >=3 node cluster will get RF=3. The choice of
RF is logged - and the choice of RF=1 is considered a warning.
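
The choice can be sketched in Python as follows. The message only specifies the one-node and >=3-node cases, so treating a two-node cluster like the one-node case is an assumption here, as are the function and logger names.

```python
import logging

logger = logging.getLogger("alternator_sketch")  # hypothetical logger name


def choose_replication_factor(node_count: int) -> int:
    """Pick RF from the number of nodes seen when the first table is created."""
    if node_count >= 3:
        rf = 3
        logger.info("Creating keyspace with RF=%d", rf)
    else:
        # Assumption: anything below three nodes falls back to RF=1.
        rf = 1
        logger.warning("Creating keyspace with RF=%d", rf)
    return rf
```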

Note that with this patch, keyspace creation is still automatic as it
was before. The user may manually create the keyspace via CQL, to
override this automatic choice. In the future we may also add additional
keyspace configuration options via configuration flags or new REST
requests, and the keyspace management code will also likely change
as we start to support clusters with multiple regions and global
tables. But for now, I think the automatic method is easiest for
users who want to test-drive Alternator without reading lengthy
instructions on how to set up the keyspace.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190820180610.5341-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Piotr Sarna
1a1935eb72 alternator-test: add a test for wrong BEGINS_WITH target type
The test ensures that passing an incompatible type to BEGINS_WITH,
e.g. a number, results in a validation error.
Tested both locally and remotely.
Message-Id: <894a10d3da710d97633dd12b6ac54edccc18be82.1566291989.git.sarna@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
b7b998568f alternator: add to CreateTable verification of BillingMode setting
We allow BillingMode to be set to either PAY_PER_REQUEST (the default)
or PROVISIONED, although neither mode is fully implemented: In the former
case the payment isn't accounted, and in the latter case the throughput
limits are not enforced.
But other settings for BillingMode are now refused, and we add a new test
to verify that.
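
A minimal sketch of this validation in Python. The function name is hypothetical, the default follows the message above, and ValueError stands in for whatever validation error the server returns.

```python
ALLOWED_BILLING_MODES = {"PAY_PER_REQUEST", "PROVISIONED"}


def get_billing_mode(request: dict) -> str:
    """Return the requested BillingMode, refusing unknown settings."""
    # Default taken from the commit message above.
    mode = request.get("BillingMode", "PAY_PER_REQUEST")
    if mode not in ALLOWED_BILLING_MODES:
        raise ValueError(f"Unsupported BillingMode: {mode!r}")
    return mode
```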

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190818122919.8431-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
66a2af4f7d alternator-test: require a new-enough boto library
The alternator tests want to exercise many of the DynamoDB API features,
so they need a recent enough version of the client libraries, boto3
and botocore. In particular, only in botocore 1.12.54, released a year
ago, was support for BillingMode added - and we rely on this to create
pay-per-request tables for our tests.

Instead of letting the user run with an old version of this library and
get dozens of mysterious errors, in this patch we add a test to conftest.py
which cleanly aborts the test if the libraries aren't new enough, and
recommends a "pip" command to upgrade these libraries.
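
The version comparison might look roughly like this. It is a sketch that assumes purely numeric dotted versions; the helper names are hypothetical and the actual conftest.py check may differ.

```python
MIN_BOTOCORE = "1.12.54"  # first botocore version with BillingMode support


def parse_version(version: str) -> tuple:
    """Turn '1.12.54' into (1, 12, 54) so versions compare numerically."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_new_enough(installed: str, minimum: str = MIN_BOTOCORE) -> bool:
    """Compare as integer tuples; plain string comparison would rank
    '1.12.9' above '1.12.54'."""
    return parse_version(installed) >= parse_version(minimum)
```

In a conftest.py, a failing check could then call `pytest.exit()` with a message suggesting the upgrade command.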

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190819121831.26101-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
64bf2b29a8 alternator-test: exhaustive tests for DescribeTable operation
The DescribeTable operation is currently implemented to return the
minimal information that libraries and applications usually need from
it, namely verifying that some table exists. However, this operation
is actually supposed to return a lot more information fields (e.g.,
the size of the table, its creation date, and more) which we currently
don't return.

This patch adds a new test file, test_describe_table.py, testing all
these additional attributes that DescribeTable is supposed to return.
Several of the tests are marked xfail (expected to fail) because we
did not implement these attributes yet.

The test is exhaustive except for attributes that have to do with four
major features which will be tested together with these features: GSI,
LSI, streams (CDC), and backup/restore.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190816132546.2764-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
fbd2f5077d alternator: enable timeouts on requests
Currently Alternator starts all Scylla requests (including both reads
and writes) without any timeout set. Because of bugs and/or network
problems, requests can theoretically hang and waste Scylla resources for
hours, long after the client has given up on them and closed their
connection.

The DynamoDB protocol doesn't let a user specify which timeout to use,
so we should just use something "reasonable", in this patch 10 seconds.
Remember that all DynamoDB read and write requests are small (even scans
just scan a small piece), so 10 seconds should be above and beyond
anything we actually expect to see in practice.
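
As an analogy only (Scylla does not use Python's asyncio), a fixed per-request timeout can be sketched like this. The constant and function names are hypothetical.

```python
import asyncio

REQUEST_TIMEOUT_SECONDS = 10.0  # the "reasonable" value chosen in this patch


async def run_with_timeout(operation, timeout: float = REQUEST_TIMEOUT_SECONDS):
    """Await `operation`; raise asyncio.TimeoutError if it takes too long."""
    return await asyncio.wait_for(operation, timeout=timeout)
```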

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190812105132.18651-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
b2bd3bbc1f alternator: add "--alternator-address" configuration parameter
So far we had the "--alternator-port" option to configure the port
on which the Alternator server listens, but the server always listened
on any address. It is important to also be able to configure the listen
address - it is useful in tests running several instances of Scylla on
the same machine, and useful in multi-homed machines with several interfaces.

So this patch adds the "--alternator-address" option, defaulting to 0.0.0.0
(to listen on all interfaces). It works like the many other "--*-address"
options that Scylla already has.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190808204641.28648-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
ea41dd2cf8 alternator: docs/alternator.md more about filtering support
Give more details about what is, and what isn't, currently
supported in filtering of Scan (and Query) results.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190811094425.30951-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Piotr Sarna
88eed415bd alternator: fix indentation
It turns out that recent rjson patches introduced some buggy
tabs instead of spaces due to bad IDE configuration. The indentation
is restored to spaces.
2019-09-11 18:01:05 +03:00
Piotr Sarna
3c11428d8d alternator-test: add QueryFilter validation cases
QueryFilter validation was recently supplemented with non-key column
checks, which are hereby tested.
2019-09-11 18:01:05 +03:00
Piotr Sarna
0e0dc14302 alternator-test: add scan case for key equality filtering
With key equality filtering enabled, a test case for scanning is provided.
2019-09-11 18:01:05 +03:00
Piotr Sarna
f1641caa41 alternator: add filtering for key equality
Until now, filtering in alternator was possible only for non-key
column equality relations. This commit adds support for equality
relations for key columns.
2019-09-11 18:01:05 +03:00
Piotr Sarna
a2828f9daa alternator: add validation to QueryFilter
QueryFilter, according to docs, can only contain non-key attributes.
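
A minimal sketch of such a check in Python. The function name is hypothetical and ValueError stands in for the server's validation error.

```python
def validate_query_filter(query_filter: dict, key_names: set) -> None:
    """Reject a QueryFilter that mentions any key attribute."""
    offending = sorted(set(query_filter) & set(key_names))
    if offending:
        raise ValueError(
            f"QueryFilter can only contain non-key attributes, got: {offending}"
        )
```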
2019-09-11 18:01:05 +03:00
Piotr Sarna
d055658fff alternator: add computing key bounds from filtering
Alternator allows passing hash and sort key restrictions
as filters - it is, however, better to incorporate these restrictions
directly into partition and clustering ranges, if possible.
It's also necessary, as optimizations inside restrictions_filter
assume that it will not be fed unneeded rows - e.g. if filtering
is not needed on partition key restrictions, they will not be checked.
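
The separation can be sketched in Python as below, assuming conditions in the `AttributeValueList` / `ComparisonOperator` form and handling only `EQ` on key columns. The function name is hypothetical, and the real code builds partition and clustering ranges rather than a dict.

```python
def split_filter(conditions: dict, key_names: set):
    """Separate EQ conditions on key columns from the remaining filter.

    Returns (key_bounds, residual): key_bounds maps a key column to the value
    it must equal, residual holds the conditions still needing row filtering.
    """
    key_bounds, residual = {}, {}
    for name, cond in conditions.items():
        if name in key_names and cond.get("ComparisonOperator") == "EQ":
            key_bounds[name] = cond["AttributeValueList"][0]
        else:
            residual[name] = cond
    return key_bounds, residual
```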
2019-09-11 18:01:05 +03:00
Piotr Sarna
9c05051b59 alternator: extract getting key value subfunction
Currently the only utility function for getting key bytes
from JSON parses a document with the following format:
"key_column_name" : { "key_column_type" : VALUE }.
However, it's also useful to parse only the inner document, i.e.:
{ "key_column_type" : VALUE }.
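
The two entry points can be sketched in Python as follows. Function names are hypothetical, and the real functions return serialized key bytes rather than the raw value.

```python
def get_key_from_typed_value(typed_value: dict, expected_type: str):
    """Parse the inner document: {"key_column_type": VALUE}."""
    if len(typed_value) != 1:
        raise ValueError("Expected exactly one type/value pair")
    value_type, value = next(iter(typed_value.items()))
    if value_type != expected_type:
        raise ValueError(
            f"Type mismatch: expected {expected_type}, got {value_type}"
        )
    return value


def get_key_column_value(item: dict, column_name: str, expected_type: str):
    """Parse the outer document: {"key_column_name": {"key_column_type": VALUE}}."""
    if column_name not in item:
        raise ValueError(f"Missing key column: {column_name}")
    # The outer parser delegates to the extracted inner one.
    return get_key_from_typed_value(item[column_name], expected_type)
```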
2019-09-11 18:01:05 +03:00
Piotr Sarna
c84019116a alternator: make make_map_element_restriction static
The function has no outside users and thus does not need to be exposed.
2019-09-11 18:01:05 +03:00
Piotr Sarna
3ee99a89b1 alternator: register filtering metrics
Three metrics related to filtering are added to alternator:
 - total rows read during filtering operations
 - rows read and matched by filtering
 - rows read and dropped by filtering
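
How the three counters relate can be sketched in Python. The class, field and function names below are hypothetical, not the registered metric names.

```python
from dataclasses import dataclass


@dataclass
class FilteringStats:
    total: int = 0    # rows read during filtering operations
    matched: int = 0  # rows read and matched by filtering
    dropped: int = 0  # rows read and dropped by filtering


def filter_rows(rows, predicate, stats: FilteringStats) -> list:
    """Apply `predicate` to each row and bump the counters accordingly."""
    kept = []
    for row in rows:
        stats.total += 1
        if predicate(row):
            stats.matched += 1
            kept.append(row)
        else:
            stats.dropped += 1
    return kept
```

In this sketch `total` always equals `matched + dropped`.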
2019-09-11 18:01:05 +03:00
Piotr Sarna
b3e35dab26 alternator: add bumping filtering stats
When filtering is used in querying or scanning, the number of total
filtered rows is added to stats.
2019-09-11 18:01:05 +03:00
Piotr Sarna
a6d098d3eb alternator: add cql_stats to alternator stats
Some underlying operations (e.g. paging) make use of cql_stats
structure from CQL3. As such, cql_stats structure is added
to alternator stats in order to gather and use these statistics.
2019-09-11 18:01:05 +03:00
Piotr Sarna
3ae54892cd alternator: fix a comment typo
s/Miscellenous/Miscellaneous/g
2019-09-11 18:01:05 +03:00
Piotr Sarna
ccf778578a alternator: register read-before-write stats
Read-before-write stat counters were already introduced, but the metrics
also need to be added to a metric group in order to be available
to users.
2019-09-11 18:01:05 +03:00
Nadav Har'El
6f81d0cb15 alternator: initial support for GSI
This patch adds partial support for GSI (Global Secondary Index) in
Alternator, implemented using a materialized view in Scylla.

This initial version only supports the specific cases of the index indexing
a column which was already part of the base table's key - e.g., indexing
what used to be a sort key (clustering key) in the base table. Indexing
of non-key attributes (which today live in a map) is not yet supported in
this version.

Creation of a table with GSIs is supported, and so is deleting the table.
UpdateTable which adds a GSI to an existing table is not yet supported.
Query and Scan operations on the index are supported.
DescribeTable does not yet list the GSIs as it should.

Seven previously-failing tests now pass, so their "xfail" tag is removed.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190808090256.12374-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Piotr Sarna
33611acf44 alternator: add stats for read-before-write
A simple metric counting how many read-before-writes were executed
is added.
Message-Id: <d8cc1e9d77e832bbdeff8202a9f792ceb4f1e274.1565274797.git.sarna@scylladb.com>
2019-09-11 18:01:05 +03:00
Piotr Sarna
ae59340c15 alternator: complement rjson.hh comments
Some comments in rjson.hh header file were not clear and are hereby
amended.
Message-Id: <7fa4e2cf39b95c176af31fe66f404a6a51a25bec.1565275276.git.sarna@scylladb.com>
2019-09-11 18:01:04 +03:00
Piotr Sarna
5eb583ab09 alternator: remove missing key FIXME
The case for missing key in update_item was already properly fixed
along with migrating from libjsoncpp to rapidjson, but one FIXME
remained in the code by mistake.

Message-Id: <94b3cf53652aa932a661153c27aa2cb1207268c7.1565271432.git.sarna@scylladb.com>
2019-09-11 18:01:04 +03:00
Piotr Sarna
436f806341 alternator: remove decimal_type FIXME
Decimal precision problems were already solved by commit
d5a1854d93c9448b1d22c2d02eb1c46a286c5404, but one FIXME
remained in the code by mistake.

Message-Id: <381619e26f8362a8681b83e6920052919acf1142.1565271198.git.sarna@scylladb.com>
2019-09-11 18:01:04 +03:00
Piotr Sarna
b29b753196 alternator: add comments to rjson
The rapidjson library needs to be used with caution in order to
provide maximum performance and avoid undefined behavior.
Comments added to rjson.hh describe provided methods and potential
pitfalls to avoid.
Message-Id: <ba94eda81c8dd2f772e1d336b36cae62d39ed7e1.1565270214.git.sarna@scylladb.com>
2019-09-11 18:01:04 +03:00
Piotr Sarna
7b02c524d0 alternator: remove a pointer-based workaround for future<json>
With libjsoncpp we were forced to work around the problem of
non-noexcept constructors by using an intermediate unique pointer.
Objects provided by rapidjson have correct noexcept specifiers,
so the workaround can be dropped.
2019-09-11 18:01:04 +03:00
Piotr Sarna
cb29d6485e alternator: migrate to rapidjson library
Profiling alternator indicated that JSON parsing takes up a fair amount
of CPU, and as such should be optimized. libjsoncpp is a standard
library for handling JSON objects, but it also proves slower than
rapidjson, which is hereby used instead.
The results indicated that libjsoncpp used roughly 30% of CPU
for a single-shard alternator instance under stress, while rapidjson
dropped that usage to 18% without optimizations.
Future optimizations should include eliding object copying, string copying
and perhaps experimenting with different JSON allocators.
2019-09-11 18:01:04 +03:00
Piotr Sarna
0fd1354ef9 alternator: add handling rapidjson errors in the server
If a JSON parsing error is encountered, it is transformed
to a validation exception and returned to the user in JSON form.
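
A Python sketch of the same transformation, using the standard `json` module instead of rapidjson. The function name and the `__type` / `message` field names in the error body are assumptions modeled on DynamoDB-style error responses.

```python
import json


def parse_request_body(body: str):
    """Return (parsed, None) on success or (None, error_json) on failure."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        # Turn the parser error into a validation error serialized as JSON.
        error = {
            "__type": "ValidationException",
            "message": f"Parsing JSON failed: {exc.msg}",
        }
        return None, json.dumps(error)
```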
2019-09-11 18:01:04 +03:00
Piotr Sarna
7064b3a2bf alternator: add rapidjson helper functions
Migrating from libjsoncpp to rapidjson proved to be beneficial
for parsing performance. As a first step, a set of helper functions
is provided to ease the migration process.
2019-09-11 18:01:04 +03:00