Compare commits

..

282 Commits

Author SHA1 Message Date
Nadav Har'El
aac47f494b alternator-test: reproduce bug in Expected with EQ of set value
Our implementation of the "EQ" operator in Expected (conditional
operation) just compares the JSON represntation of the values.
This is almost always correct, but unfortunately incorrect for
sets - where we can have two equal sets despite having a
different order.

This patch just adds an (xfailing) test for this bug.

The bug itself can be fixed in the future in one of several ways
including changing the implementation of EQ, or changing the
serialization of sets so they'll always be sorted in the same
way.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190909125147.16484-1-nyh@scylladb.com>
2019-09-10 17:06:41 +03:00
Piotr Sarna
d93df8c184 view: handle multiple regular base columns in view pk
Previous assumption was that there can only be one regular base column
in the view key. The assumption is still correct for tables created
via CQL, but it's internally possible to create a view with multiple
such columns - the new assumption is that if there are multiple columns,
they share their liveness.
Message-Id: <c9dec243ce903d3a922ce077dc274f988bcf5d57.1567604945.git.sarna@scylladb.com>
2019-09-05 19:31:11 +03:00
Nadav Har'El
48e22c2910 alternator: implement the Expected request parameter
In this patch we implement the Expected parameter for the UpdateItem,
PutItem and DeleteItem operations. This parameter allows a conditional
update - i.e., do an update only if the existing value of the item
matches some condition.
This is the older form of conditional updates, but is still used by many
applications, including Amazon's Tic-Tac-Toe demo.

As usual, we do not yet provide isolation guarantees for read-modify-write
operations - the item is simply read before the modification, and there is
no protection against concurrent operation. This will of course need to be
addressed in the future.

The Expected parameter has a relatively large number of variations, and most
of them are supported by this code, except that currenly only two comparison
operators are supported (EQ and BEGINS_WITH) out of the 13 listed in the
documentation. The rest will be implemented later.

This patch also includes comprehensive tests for the Expected feature.
These tests are almost exhaustive, except for one missing part (labled FIXME) -
among the 13 comparison operations, the tests only check the EQ and BEGINS_WITH
operators. We'll later need to add checks to the rest of them as well.
As usual, all the tests pass on Amazon DynamoDB, and after this patch all
of them succeed on Alternator too.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190905125558.29133-1-nyh@scylladb.com>
2019-09-05 17:14:23 +03:00
Piotr Sarna
bbb6689938 alternator: add returning PAY_PER_REQUEST billing mode
In order for Spark jobs to work correctly, a hardcoded PAY_PER_REQUEST
billing mode entry is returned when describing a table with
a DescribeTable request.
Also, one test case in test_describe_table.py is no longer marked XFAIL.
Message-Id: <a4e6d02788d8be48b389045e6ff8c1628240197c.1567688894.git.sarna@scylladb.com>
2019-09-05 16:24:35 +03:00
Nadav Har'El
6c252feb0c alternator: add "getting started document".
Patch series by Eliran Sinvani. It moves Alternator documentation to
a new subdirectory docs/alternator, and adds a new file, getting-started.md,
with instructions on how to install and test Scylla with Alternator.
2019-09-05 14:40:43 +03:00
Nadav Har'El
33d5015edf alternator: update docs/alternator.md on GSI/LSI situation
Update docs/alternator.md on the current level of compatibility of our
GSI and LSI implementation vs. DynamoDB.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190904120730.12615-1-nyh@scylladb.com>
2019-09-05 14:37:51 +03:00
Eliran Sinvani
0be2f17952 Alternator: Add getting started document for alternator
This patch adds a getting started document for alternator,
it explains how to start up a cluster that has an alternator
API port open and how to test that it works using either an
application or some simple and minimal python scripts.
The goal of the document is to get a user to have an up and
running docker based cluster with alternator support in the
shortest time possible.
2019-09-04 16:59:03 +03:00
Eliran Sinvani
a74d5fcce8 move alternator.md to its own directory
As part of trying to make alternator more accessible
to users, we expect more documents to be created so
it seems like a good idea to give all of the alternator
docs their own directory.
2019-09-04 16:47:15 +03:00
Piotr Sarna
8e9e7cc952 alternator-test: add xfail test for GSI with 2 regular columns
When updating the second regular base column that is also a view
key, the code in Scylla will assume it only needs to update an entry
instead of replacing an old one. This leads to inconsitencies
exposed in the test case.
Message-Id: <5dfeb9f61f986daa6e480e9da4c7aabb5a09a4ec.1567599461.git.sarna@scylladb.com>
2019-09-04 15:27:23 +03:00
Amnon Heiman
ce5db951be alternator/executor.cc: Latencies should use steady_clock
To get a correct latency estimations executor should use a higher clock
resolution.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2019-09-04 13:27:29 +03:00
Piotr Sarna
5a9b9bdf4d alternator-test: fix LSI tests
LSI tests are amended, so they no longer needlessly XPASS:
 * two xpassing tests are no longer marked XFAIL
 * there's an additional test for partial projection
   that succeeds on DynamoDB and does not work fine yet in alternator
Message-Id: <0418186cb6c8a91de84837ffef9ac0947ea4e3d3.1567585915.git.sarna@scylladb.com>
2019-09-04 12:25:20 +03:00
Nadav Har'El
518f01099c alternator-test: fix test_describe_endpoints.py for AWS run
The previous patch fixed test_describe_endpoints.py for a local run
without an AWS configuration. But when running with "--aws", we do
need to use that AWS configuration, and this patch fixes this case.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-09-04 12:22:34 +03:00
Nadav Har'El
db617b459c alternator-test: test_describe_endpoints.py without configuring AWS
Even when running against a local Alternator, Boto3 wants to know the
region name, and AWS credentials, even though they aren't actually needed.
For a local run, we can supply garbage values for these settings, to
allow a user who never configured AWS to run tests locally.
Running against "--aws" will, of course, still require the user to
configure AWS.

The previous patch already fixed this for most tests, this patch fixes the
same issue in test_describe_endpoints.py, which had a separate copy of the
problematic code.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-09-04 12:00:42 +03:00
Nadav Har'El
72ac19623a alternator: run local tests without configuring AWS
Even when running against a local Alternator, Boto3 wants to know the
region name, and AWS credentials, even though they aren't actually needed.
For a local run, we can supply garbage values for these settings, to
allow a user who never configured AWS to run tests locally.
Running against "--aws" will, of course, still require the user to
configure AWS.

Also modified the README to be clearer, and more focused on the local
runs.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190708121420.7485-1-nyh@scylladb.com>
2019-09-04 11:54:23 +03:00
Nadav Har'El
eb1cc6610c alternator: add LSI support
Merged series by Piotr Sarna:

This series adds basic LocalSecondaryIndexes support to alternator,
based on previous GlobalSecondaryIndexes work.
It also provides tests in the form of test_lsi.py file.

Note that contrary to Scylla's indexes, this basic LSI implementation
is very similar to GSI and it just denormalizes each column,
so only one table read is involved when reading from the index,
but the capacity overhead is larger than Scylla's indexing.

Tests: alternator(local, remote)

Piotr Sarna (3):
  alternator: add basic LSI support
  alternator-test: bump create table time limit to 200s
  alternator-test: add LSI tests

 alternator-test/test_lsi.py | 336 ++++++++++++++++++++++++++++++++++++
 alternator-test/util.py     |   2 +-
 alternator/executor.cc      |  81 ++++++++-
 3 files changed, 409 insertions(+), 10 deletions(-)
2019-09-04 00:15:58 +03:00
Piotr Sarna
24789b549a alternator-test: add LSI tests
Cases for local secondary indexes are added - loosely based on
test_gsi.py suite.
2019-09-03 18:11:29 +02:00
Piotr Sarna
e11ba56dbc alternator-test: bump create table time limit to 200s
Unfortunately the previous 100s limit proved to be not enough
for creating tables with both local and global indexes attached
to them. Empirically 200s was chosen as a safe default,
as the longest test oscillated around 100s with the deviation of 10s.
2019-09-03 18:05:24 +02:00
Piotr Sarna
a2d68eac4c alternator: add basic LSI support
With this patch, LocalSecondaryIndexes can be added to a table
during its creation. The implementation is heavily shared
with GlobalSecondaryIndexes and as such suffers from the same TODOs:
projections, describing more details in DescribeTable, etc.
2019-09-03 18:05:23 +02:00
Nadav Har'El
c14fc8d854 alternator: rename reserved column name "attrs"
We currently reserve the column name "attrs" for a map of attributes,
so the user is not allowed to use this name as a name of a key.

We plan to lift this reservation in a future patch, but until we do,
let's at least choose a more obscure name to forbid - in this patch ":attrs".
It is even less likely that a user will want to use this specific name
as a column name.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190903133508.2033-1-nyh@scylladb.com>
2019-09-03 16:55:31 +03:00
Piotr Sarna
81dcee8961 alternator: migrate make_map_element_restriction to string view
In order to elide unnecessary copying and allow more copy elision
in the future, make_map_element_restriction helper function
uses string_view instead of a const string reference.
Message-Id: <1a3e82e7046dc40df604ee7fbea786f3853fee4d.1567502264.git.sarna@scylladb.com>
2019-09-03 13:26:13 +03:00
Nadav Har'El
6aa840543c alternator: rename and add metrics
Merged series from Amnon Heiman.
It makes two changes to Alternator's metrics:

1. The different operations counters, instead of being separate metrics,
   become one array (with each operation type becoming a "label")

2. It adds latency histograms for some operations (UpdateItem, GetItem,
   PutItem). This adds the cost of two clock reads to each query, hopefully
   it's tiny.
2019-09-02 16:29:32 +03:00
Nadav Har'El
84598b39b9 alternator: clean error, not a crash, on reserved column name
Currently, we reserve the name ATTRS_COLUMN_NAME ("attrs") - the user
cannot use it as a key column name (key of the base table or GSI or LSI)
because we use this name for the attribute map we add to the schema.

Currently, if the user does attempt to create such a key column, the
result is undefined (sometimes corrupt sstables, sometimes outright crashes).
This patches fixes it to become a clean error, saying that this column name is
currently reserved.

The test test_create_table_special_column_name now cleanly fails, instead
of crashing Scylla, so it is converted from "skip" to "xfail".

Eventually we need to solve this issue completely (e.g., in rare cases
rename columns to allow us to reserve a name like ATTRS_COLUMN_NAME,
or alternatively, instead of using a fixed name ATTRS_COLUMN_NAME pick a
different one different from the key column names). But until we do,
better fail with a clear error instead of a crash.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190901102832.7452-1-nyh@scylladb.com>
2019-09-02 15:39:32 +03:00
Nadav Har'El
287ba6ddde alternator: Add throwing on unsupported expressions
Merged series by Piotr Sarna:

This miniseries makes alternator throw on unsupported expressions
and adds test cases related to this issue.

ests: alternator-test(local, remote)

Piotr Sarna (3):
  alternator: throw on unsupported expressions
  alternator-test: add tests for unsupported expressions
  alternator-test: add initial test_condition_expression file

 alternator-test/test_condition_expression.py | 40 +++++++++++++++++++
 alternator-test/test_query.py                | 42 ++++++++++++++++++++
 alternator/executor.cc                       | 11 +++++
 3 files changed, 93 insertions(+)
 create mode 100644 alternator-test/test_condition_expression.py
2019-09-02 15:32:00 +03:00
Piotr Sarna
a700da216d alternator-test: add initial test_condition_expression file
The file initially consists of a very simple case that succeeds
with `--aws` and expectedly fails without it, because the expression
is not implemented yet.
2019-09-02 14:00:53 +02:00
Piotr Sarna
a6d2117a77 alternator-test: add tests for unsupported expressions
The test cases are marked XFAIL, as their expressions are not yet
supported in alternator. With `--aws`, they pass.
2019-09-02 14:00:53 +02:00
Pekka Enberg
c46458f61d dist/docker: Add support for Alternator
This adds a "alternator-address" and "alternator-port" configuration
options to the Docker image, so people can enable Alternator with
"docker run" with:

  docker run --name some-scylla -d <image> --alternator-port=8080
Message-Id: <20190902110920.19269-1-penberg@scylladb.com>
2019-09-02 14:41:58 +03:00
Piotr Sarna
dc09d96b22 alternator: throw on unsupported expressions
When an unsupported expression parameter is encountered -
KeyConditionExpression, ConditionExpression or FilterExpression
are such - alternator will return an error instead of ignoring
the parameter.
2019-09-02 13:38:10 +02:00
Amnon Heiman
350180125d alternator/executor: update the latencies histogram
This patch update the latencies histogram for get, put, delete and
update.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2019-09-01 18:12:17 +03:00
Amnon Heiman
78a153518a alternator/stats metrics: use labels and estimated histogram
This patch make two chagnes to the alternator stats:
1. It add estimated_histogram for the get, put, update and delete
operation

2. It changes the metrics naming, so the operation will be a label, it
will be easier to handle, perform operation and display in this way.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2019-09-01 18:12:01 +03:00
Nadav Har'El
70d4afc637 alternator_test: mark test_gsi_3 as passing
The test_gsi_3, involving creating a GSI with two key columns which weren't
previously a base key, now passes, so drop the "xfail" marker.

We still have problems with such materialized views, but not in the simple
scenario tested by test_gsi_3.

Later we should create a new test for the scenario which still fails, if
any.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-09-01 11:51:28 +03:00
Piotr Sarna
fea29fdfd1 alternator: allow creating GSI with 2 base regular columns
Creating an underlying materialized view with 2 regular base columns
is risky in Scylla, as second's column liveness will not be correctly
taken into account when ensuring view row liveness.
Still, in case specific conditions are met:
 * the regular base column value is always present in the base row
 * no TTLs are involved
then the materialized view will behave as expected.

Creating a GSI with 2 base regular columns issues a warning,
as it should be performed with care.
Message-Id: <5ce8642c1576529d43ea05e5c4bab64d122df829.1567159633.git.sarna@scylladb.com>
2019-09-01 11:38:56 +03:00
Nadav Har'El
93f1072cee alternator: fix default BillingMode
It is important that BillingMode should default to PROVISIONED, as it
does on DynamoDB. This allows old clients, which don't specify
BillingMode at all, to specify ProvisionedThroughput as allowed with
PROVISIONED.

Also added a test case for this case (where BillingMode is absent).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190829193027.7982-1-nyh@scylladb.com>
2019-09-01 11:25:56 +03:00
Nadav Har'El
edd03312ed alternator: correct error on missing index or table
When querying on a missing index, DynamoDB returns different errors in
case the entire table is missing (ResourceNotFoundException) or the table
exists and just the index is missing (ValidationException). We didn't
make this distinction, and always returned ValidationException, but this
confuses clients that expect ResourceNotFoundException - e.g., Amazon's
Tic-Tac-Toe demo.

This patch adds a test for the first case (the completely missing table) -
we already had a test for the second case - and returns the correct
error codes. As usual the test passes against DynamoDB as well as Alternator,
ensure they behave the same.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190829174113.5558-1-nyh@scylladb.com>
2019-09-01 11:18:59 +03:00
Nadav Har'El
faad21f72b alternator: improve request logging
We needlessly split the trace-level log message for the request to two
messages - one containing just the operation's name, and one with the
parameters. Moreover we printed them in the opposite order (parameters
first, then the operation). So this patch combines them into one log
message.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190829165341.3600-1-nyh@scylladb.com>
2019-09-01 11:17:58 +03:00
Nadav Har'El
298f5caa38 alternator-test: reproduce bug with using "attrs" as key column name
Alternator puts in the Scylla table a column called "attrs" for all the
non-key attributes. If the user happens to choose the same name, "attrs",
for one of the key columns, the result of writing two different columns
with the same name is a mess and corrupt sstables.

This test reproduces this bug (and works against DynamoDB of course).

Because the test doesn't cleanly fail, but rather leaves Scylla in a bad
state from which it can't fully recover, the test is marked as "skip"
until we fix this bug.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190828135644.23248-1-nyh@scylladb.com>
2019-09-01 11:13:31 +03:00
Piotr Sarna
b6662bac35 alternator: remove redundant key checks in UpdateItem
Updating key columns is not allowed in UpdateItem requests,
but the series introducing GSI support for regular columns
also introduced redundant duplicates checks of this kind.
This condition is already checked in resolve_update_path helper function
and existing test_update_expression_cannot_modify_key test makes sure that
the condition is checked.
Message-Id: <00f83ab631f93b263003fb09cd7b055bee1565cd.1567086111.git.sarna@scylladb.com>
2019-08-29 19:55:15 +03:00
Nadav Har'El
707a70235e alternator-test: improve test_update_expression_cannot_modify_key
The test test_update_expression_cannot_modify_key() verifies that an
update expression cannot modify one of the key columns. The existing
test only tried the SET and REMOVE actions - this patch makes the
test more complete by also testing the ADD and DELETE actions.

This patch also makes the expected exception more picky - we now
expect that the exception message contains the word "key" (as it,
indeed, does on both DynamoDB and Alternator). If we get any other
exception, there may be a problem.

The test passed before this patch, and passes now as well - it's just
stricter now.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190829135650.30928-1-nyh@scylladb.com>
2019-08-29 18:05:48 +03:00
Nadav Har'El
c223fafba3 alternator: Fix compound sort key deserialization
Merged patch series from Piotr Sarna:

This miniseries fixes a bug in clustering key deserialization in
alternator and provides a test case for it.
The bug was exposed only if an index (GSI or LSI) was created on a table
with both hash and sort keys defined. The test was checked both locally
and remotely.

Tests: alternator(local, remote)

Piotr Sarna (2):
  alternator: use from_single_value instead of from_singular in ck
  alternator-test: add test case for GSI with both keys

 alternator-test/test_gsi.py | 40 +++++++++++++++++++++++++++++++++++++
 alternator/executor.cc      |  6 +++---
 2 files changed, 43 insertions(+), 3 deletions(-)
2019-08-28 17:55:28 +03:00
Piotr Sarna
7a749a4772 alternator-test: add test case for GSI with both keys
A case which adds a global secondary index on a table with both
hash and sort keys is added.
2019-08-28 16:11:16 +02:00
Piotr Sarna
5fc1e70751 alternator: use from_single_value instead of from_singular in ck
The code previously used clustering_key::from_singular() to compute
a clustering key value. It works fine, but has two issues:
1. involves one redundant deserialization stage compared to
   from_single_value
2. does not work with compound clustering keys, which can appear
   when using indexes
2019-08-28 16:02:20 +02:00
Nadav Har'El
93fbbbc202 alternator: Allow GSI on regular columns
Merged patch series from Piotr Sarna:

This series allows creating Global Secondary Indexes during alternator
table creation. If a regular column is used in the GSI key, it is
included into underlying base table schema together with its type,
as was previously done only for hash and sort keys.
Creating a global index that uses regular base columns for both
hash and sort key is still prohibited, as it's not allowed to create
such materialized views in Scylla and the underlying implementation
of alternator's GSI is based on Scylla's materialized views.

Tests: alternator(local)

Piotr Sarna (8):
  alternator: avoid creating empty collection mutations
  alternator: start fetching all regular columns
  alternator: change attrs column name to :attrs
  alternator: add handling regular columns with schema definitions
  alternator: allow adding GSI-related regular columns to schema
  alternator: add describing GSI in DescribeTable
  alternator: Add 'mismatch' to serialization error message
  alternator-test: enable passing tests

 alternator-test/test_gsi.py |   8 +--
 alternator/executor.cc      | 120 ++++++++++++++++++++++++++++--------
 alternator/executor.hh      |   2 +-
 alternator/serialization.cc |   2 +-
 4 files changed, 99 insertions(+), 33 deletions(-)
2019-08-28 16:58:17 +03:00
Piotr Sarna
cfcf841c0d alternator-test: enable passing tests
With more GSI features implemented, tests with XPASS status are promoted
to being enabled.

One test case (test_gsi_describe) is partially done as DescribeTable
now contains index names, but we could try providing more attributes
(e.g. IndexSizeBytes and ItemCount from the test case), so the test
is left in the XFAIL state.
2019-08-28 15:38:13 +02:00
Piotr Sarna
c0c4d90df9 alternator: Add 'mismatch' to serialization error message
In order to match the tests and origin more properly, the error message
for mismatched types is updated so it contains the word 'mismatch'.
2019-08-28 15:38:13 +02:00
Piotr Sarna
86e80ea320 alternator: add describing GSI in DescribeTable
The DescribeTable request now contains the list of index names
as well. None of the attributes of the list are marked as 'required'
in the documentation, so currently the implementation provides
index names only.
2019-08-28 15:38:09 +02:00
Piotr Sarna
6d62ed213a alternator: allow adding GSI-related regular columns to schema
In order to be able to create a Global Secondary Index over a regular
column, this column is upgraded from being a map entry to being a full
member of the schema. As such, it's possible to use this column
definition in the underlying materialized view's key.
2019-08-28 15:29:35 +02:00
Piotr Sarna
05920f7c3b alternator: add handling regular columns with schema definitions
In order to prepare alternator for adding regular columns to schema,
i.e. in order to create a materialized view over them,
the code is changed so that updating no longer assumes that only keys
are included in the table schema.
2019-08-28 15:12:19 +02:00
Piotr Sarna
9390dd41d0 alternator: start fetching all regular columns
Since in the future we may want to have more regular columns
in alternator tables' schemas, the code is changed accordingly,
so all regular columns will be fetched instead of just the attribute
map.
2019-08-28 11:06:56 +02:00
Piotr Sarna
a795c290f3 alternator: avoid creating empty collection mutations
If no regular column attributes are passed to PutItem, the attr
collector serializes an empty collection mutation nonetheless
and sends it. It's redundant, so instead, if the attr colector
is empty, the collection does not get serialized and sent to replicas.
2019-08-28 11:06:56 +02:00
Nadav Har'El
d4b19f55b2 alternator-test: add license blurbs
Add copyright and license blurbs to all alternator-test source files.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190825161018.10358-1-nyh@scylladb.com>
2019-08-26 14:23:11 +03:00
Nadav Har'El
5e8644413e alternator: update license blurbs
Update all the license blurbs to the one we use in the open-source
Scylla project, licensed under the AGPL.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190825160321.10016-1-nyh@scylladb.com>
2019-08-26 14:22:41 +03:00
Nadav Har'El
40e3b7a0b6 alternator: Add tracing
Merged patch series from Piotr Sarna:

This series adds basic tracing to alternator requests.

Example:

session_id                           | activity
--------------------------------------+---------------------------------------------------------------------------------------
 d5d09eb0-c7e8-11e9-ac0e-000000000000 |                                                                               PutItem
 d5d09eb0-c7e8-11e9-ac0e-000000000000 | Creating write handler for token: 621935170417716715 natural: {127.0.0.1} pending: {}
 d5d09eb0-c7e8-11e9-ac0e-000000000000 |                                Creating write handler with live: {127.0.0.1} dead: {}
 d5d09eb0-c7e8-11e9-ac0e-000000000000 |                                                          Executing a mutation locally
 d5d09eb0-c7e8-11e9-ac0e-000000000000 |                                                        Got a response from /127.0.0.1
 d5d09eb0-c7e8-11e9-ac0e-000000000000 |                          Delay decision due to throttling: do not delay, resuming now
 d5d09eb0-c7e8-11e9-ac0e-000000000000 |                                                       Mutation successfully completed

Ref: docs/tracing.md.

Tests: manual, alternator(local) with tracing enabled via REST API

Piotr Sarna (3):
  alternator: add client state
  alternator: enable query tracing
  alternator: add initial tracing to requests

 alternator/executor.cc | 84 +++++++++++++++++++++++++++++-------------
 alternator/executor.hh | 31 +++++++++-------
 alternator/server.cc   | 38 +++++++++++--------
 3 files changed, 100 insertions(+), 53 deletions(-)
2019-08-26 13:24:14 +03:00
Piotr Sarna
76c3e4e0d6 alternator: add initial tracing to requests
Each request provides basic tracing information about itself.

Example output from tracing:

cqlsh> select request, parameters from system_traces.sessions
           where session_id = 39813070-c4ea-11e9-8572-000000000000;
 request          | parameters
------------------+-----------------------------------------------------
 Alternator Query | {'query': '{"TableName": "alternator_test_15664",
                    "KeyConditions": {"p": {"AttributeValueList":
                    [{"S": "T0FE0QCS0X"}], "ComparisonOperator": "EQ"}}}'}

cqlsh> select session_id, activity from system_traces.events
           where session_id = 39813070-c4ea-11e9-8572-000000000000;
 session_id                           | activity
--------------------------------------+-----------------------------
 39813070-c4ea-11e9-8572-000000000000 |                    Querying
 39813070-c4ea-11e9-8572-000000000000 | Performing a database query
2019-08-26 12:01:29 +02:00
Piotr Sarna
9567691551 alternator: enable query tracing
Probabilistic tracing can be enabled via REST API. Alternator will
from now on create tracing sessions for its operations as well.

Examples:

 # trace around 0.1% of all requests
curl -X POST http://localhost:10000/storage_service/trace_probability?probability=0.001
 # trace everything
curl -X POST http://localhost:10000/storage_service/trace_probability?probability=1
2019-08-26 11:57:44 +02:00
Piotr Sarna
2220c8c681 alternator: add client state
Keeping an instance of client_state is a convenient way of being able
to use tracing for alternator. It's also currently used in paging,
so adding a client state to executor removes the need of keeping
a dummy value.
2019-08-26 11:07:56 +02:00
Piotr Sarna
b2fa0c9c1c alternator: use correct string views in serialization
String views used in JSON serialization should use not only the pointer
returned by rapidjson, but also the string length, as it may contain
\0 characters.
Additionally, one unnecessary copy is elided.
2019-08-26 11:02:13 +02:00
Nadav Har'El
ace1ffc8da alternator: docs/alternator.md: link to a longer document
Add a link to a longer document (currently, around 40 pages) about
DynamoDB's features and how we implemented or may implement them in
Alternator.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190825121201.31747-2-nyh@scylladb.com>
2019-08-26 11:56:34 +03:00
Nadav Har'El
6b8224ef85 alternator: document choice of RF
After changing the choice of RF in a previous patch, let's update the
relevant part of docs/alternator.md.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190825121201.31747-1-nyh@scylladb.com>
2019-08-26 11:56:05 +03:00
Nadav Har'El
2e68eafb22 alternator: expand docs/alternator.md
Expand docs/alternator.md with new sections about how to run Alternator,
and a very brief introduction to its design.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190818164628.12531-1-nyh@scylladb.com>
2019-08-21 17:25:40 +03:00
Nadav Har'El
3044f71438 alternator: refuse CreateTable if uses unsupported features
If a user tries to create a table with a unsupported feature -
a local secondary index, a used-defined encryption key or supporting
streams (CDC), let's refuse the table creation, so the application
doesn't continue thinking this feature is available to it.

The "Tags" feature is also not supported, but it is more harmless
(it is used mostly for accounting purposes) so we do not fail the
table creation because of it.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190818125528.9091-1-nyh@scylladb.com>
2019-08-21 17:25:01 +03:00
Piotr Sarna
0af6ab6fc6 alternator: migrate to visitor pattern in serialization
Types can now be processed with a visitor pattern, which is more neat
than a chain of if statements.
Message-Id: <256429b7593d8ad8dff737d8ddb356991fb2a423.1566386758.git.sarna@scylladb.com>
2019-08-21 17:20:45 +03:00
Piotr Sarna
70ae5cc235 alternator: add from_string with raw pointer to rjson
from_string is a family of function that create rjson values from
strings - now it's extended with accepting raw pointer and size.
Message-Id: <d443e2e4dcc115471202759ecc3641ec902ed9e4.1566386758.git.sarna@scylladb.com>
2019-08-21 16:57:12 +03:00
Nadav Har'El
c9345d8a0e alternator: automatically choose RF: 1 or 3
In CQL, before a user can create a table, they must create a keyspace to
contain this table and, among other things, specify this keyspace's RF.

But in the DynamoDB API, there is no "create keyspace" operation - the
user just creates a table, and there is no way, and no opportunity,
to specify the requested RF. Presumably, Amazon always uses the same
RF for all tables, most likely 3, although this is not officially
documented anywhere.

The existing code creates the keyspace during Scylla boot, with RF=1.
This RF=1 always works, and is a good choice for a one-node test run,
but was a really bad choice for a real cluster with multiple nodes, so
this patch fixes this choice:

With this patch, the keyspace creation is delayed - it doesn't happen
when the first node of the cluster boots, but only when the user creates
the first table. Presumably, at that time, the cluster is already up,
so at that point we can make the obvious choice automatically: a one-node
cluster will get RF=1, a >=3 node cluster will get RF=3. The choice of
RF is logged - and the choice of RF=1 is considered a warning.

Note that with this patch, keyspace creation is still automatic as it
was before. The user may manually create the keyspace via CQL, to
override this automatic choice. In the future we may also add additional
keyspace configuration options via configuration flags or new REST
requests, and the keyspace management code will also likely change
as we start to support clusters with multiple regions and global
tables. But for now, I think the automatic method is easiest for
users who want to test-drive Alternator without reading lengthy
instructions on how to set up the keyspace.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190820180610.5341-1-nyh@scylladb.com>
2019-08-20 21:24:01 +03:00
Piotr Sarna
587b38cd69 alternator-test: add a test for wrong BEGINS_WITH target type
The test ensures that passing a non-compatible type to BEGINS WITH,
e.g. a number, results in a validation error.
Tested both locally and remotely.
Message-Id: <894a10d3da710d97633dd12b6ac54edccc18be82.1566291989.git.sarna@scylladb.com>
2019-08-20 14:52:22 +03:00
Piotr Sarna
7d68d5030d alternator: replace is_byte_order_compatible in BEGINS WITH
Checking if the type is byte-order compatible is more than enough
for BEGINS WITH operator - actually, we just need to check if the type
is compatible with a string.
Message-Id: <27a867cc1fa907ff87e011914e4acbb4f7db0181.1566225556.git.sarna@scylladb.com>
2019-08-19 17:43:12 +03:00
Nadav Har'El
c49e009e3e alternator: use empty_service_permit()
In the new code, write and read queries take a "service permit" which they
hold for the duration of the query, to help limit the load on the machine.

Alternator doesn't yet participate in this feature, so for now let's just
use empty_service_permit() meaning the queries don't hold on to any permit.
We can fix this later to use real permits.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 17:12:08 +03:00
Nadav Har'El
eebb2f0a0f alternator: add to CreateTable verification of BillingMode setting
We allow BillingMode to be set to either PAY_PER_REQUEST (the default)
or PROVISIONED, although neither mode is fully implemented: In the former
case the payment isn't accounted, and in the latter case the throughput
limits are not enforced.
But other settings for BillingMode are now refused, and we add a new test
to verify that.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190818122919.8431-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
fd10eee1ae alternator-test: require a new-enough boto library
The alternator tests want to exercise many of the DynamoDB API features,
so they need a recent enough version of the client libraries, boto3
and botocore. In particular, only in botocore 1.12.54, released a year
ago, was support for BillingMode added - and we rely on this to create
pay-per-request tables for our tests.

Instead of letting the user run with an old version of this library and
get dozens of mysterious errors, in this patch we add a test to conftest.py
which cleanly aborts the test if the libraries aren't new enough, and
recommends a "pip" command to upgrade these libraries.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190819121831.26101-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
28b3819c23 alternator-test: exhaustive tests for DescribeTable operation
The DescribeTable operation was currently implemented to return the
minimal information that libraries and applications usually need from
it, namely verifying that some table exists. However, this operation
is actually supposed to return a lot more information fields (e.g.,
the size of the table, its creation date, and more) which we currently
don't return.

This patch adds a new test file, test_describe_table.py, testing all
these additional attributes that DescribeTable is supposed to return.
Several of the tests are marked xfail (expected to fail) because we
did not implement these attributes yet.

The test is exhaustive except for attributes that have to do with four
major features which will be tested together with these features: GSI,
LSI, streams (CDC), and backup/restore.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190816132546.2764-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
656f62722b alternator: enable timeouts on requests
Currently Alternator starts all Scylla requests (including both reads
and writes) without any timeout set. Because of bugs and/or network
problems, Requests can theoretically hang and waste Scylla request for
hours, long after the client has given up on them and closed their
connection.

The DynamoDB protocol doesn't let a user specify which timeout to use,
so we should just use something "reasonable", in this patch 10 seconds.
Remember that all DynamoDB read and write requests are small (even scans
just scan a small piece), so 10 seconds should be above and beyond
anything we actually expect to see in practice.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190812105132.18651-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
ecb571d7e3 alternator: add "--alternator-address" configuration parameter
So far we had the "--alternator-port" option allowing to configure the port
on which the Alternator server listens on, but the server always listened
to any address. It is important to also be able to configure the listen
address - it is useful in tests running several instances of Scylla on
the same machine, and useful in multi-homed machines with several interfaces.

So this patch adds the "--alternator-address" option, defaulting to 0.0.0.0
(to listen on all interfaces). It works like the many other "--*-address"
options that Scylla already has.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190808204641.28648-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
dd4638d499 alternator: docs/alternator.md more about filtering support
Give more details about what is, and what isn't, currently
supported in filtering of Scan (and Query) results.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190811094425.30951-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Piotr Sarna
aaf559c4f9 alternator: fix indentation
It turns out that recent rjson patches introduced some buggy
tabs instead of spaces due to bad IDE configuration. The indentation
is restored to spaces.
2019-08-19 15:49:52 +03:00
Piotr Sarna
f7d0ca3c92 alternator-test: add QueryFilter validation cases
QueryFilter validation was lately supplemented with non-key column
checks, which is hereby tested.
2019-08-19 15:49:52 +03:00
Piotr Sarna
8394225741 alternator-test: add scan case for key equality filtering
With key equality filtering enabled, a test case for scanning is provided.
2019-08-19 15:49:52 +03:00
Piotr Sarna
091b1b40c2 alternator: add filtering for key equality
Until now, filtering in alternator was possible only for non-key
column equality relations. This commit adds support for equality
relations for key columns.
2019-08-19 15:49:52 +03:00
Piotr Sarna
b914ba11fa alternator: add validation to QueryFilter
QueryFilter, according to docs, can only contain non-key attributes.
2019-08-19 15:49:52 +03:00
Piotr Sarna
1b2b2c7009 alternator: add computing key bounds from filtering
Alternator allows passing hash and sort key restrictions
as filters - it is, however, better to incorporate these restrictions
directly into partition and clustering ranges, if possible.
It's also necessary, as optimizations inside restrictions_filter
assume that it will not be fed unneeded rows - e.g. if filtering
is not needed on partition key restrictions, they will not be checked.
2019-08-19 15:49:52 +03:00
Piotr Sarna
188c6a552a alternator: extract getting key value subfunction
Currently the only utility function for getting key bytes
from JSON was to parse a document with the following format:
"key_column_name" : { "key_column_type" : VALUE }.
However, it's also useful to parse only the inner document, i.e.:
{ "key_column_type" : VALUE }.
2019-08-19 15:49:52 +03:00
Piotr Sarna
b8964ab0ba alternator: make make_map_element_restriction static
The function has no outside users and thus does not need to be exposed.
2019-08-19 15:49:52 +03:00
Piotr Sarna
c4fd846dbb alternator: register filtering metrics
Three metrics related to filtering are added to alternator:
 - total rows read during filtering operations
 - rows read and matched by filtering
 - rows read and dropped by filtering
2019-08-19 15:49:52 +03:00
Piotr Sarna
338b7e9e67 alternator: add bumping filtering stats
When filtering is used in querying or scanning, the number of total
filtered rows is added to stats.
2019-08-19 15:49:52 +03:00
Piotr Sarna
154d1649c6 alternator: add cql_stats to alternator stats
Some underlying operations (e.g. paging) make use of cql_stats
structure from CQL3. As such, cql_stats structure is added
to alternator stats in order to gather and use these statistics.
2019-08-19 15:49:52 +03:00
Piotr Sarna
5620a46024 alternator: fix a comment typo
s/Miscellenous/Miscellaneous/g
2019-08-19 15:49:52 +03:00
Piotr Sarna
fc9744791c alternator: register read-before-write stats
Read-before-write stat counters were already introduced, but the metrics
needs to be added to a metric group as well in order to be available
for users.
2019-08-19 15:49:52 +03:00
Nadav Har'El
e0b01a0233 alternator: initial support for GSI
This patch adds partial support for GSI (Global Secondary Index) in
Alternator, implemented using a materialized view in Scylla.

This initial version only supports the specific cases of the index indexing
a column which was already part of the base table's key - e.g., indexing
what used to be a sort key (clustering key) in the base table. Indexing
of non-key attributes (which today live in a map) is not yet supported in
this version.

Creation of a table with GSIs is supported, and so is deleting the table.
UpdateTable which adds a GSI to an existing table is not yet supported.
Query and Scan operations on the index are supported.
DescribeTable does not yet list the GSIs as it should.

Seven previously-failing tests now pass, so their "xfail" tag is removed.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190808090256.12374-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Piotr Sarna
b3bf2fab2e alternator: add stats for read-before-write
A simple metric counting how many read-before-writes were executed
is added.
Message-Id: <d8cc1e9d77e832bbdeff8202a9f792ceb4f1e274.1565274797.git.sarna@scylladb.com>
2019-08-19 15:49:52 +03:00
Piotr Sarna
05b895ca84 alternator: complement rjson.hh comments
Some comments in rjson.hh header file were not clear and are hereby
amended.
Message-Id: <7fa4e2cf39b95c176af31fe66f404a6a51a25bec.1565275276.git.sarna@scylladb.com>
2019-08-19 15:49:52 +03:00
Piotr Sarna
6b145b59d3 alternator: remove missing key FIXME
The case for missing key in update_item was already properly fixed
along with migrating from libjsoncpp to rapidjson, but one FIXME
remained in the code by mistake.

Message-Id: <94b3cf53652aa932a661153c27aa2cb1207268c7.1565271432.git.sarna@scylladb.com>
2019-08-19 15:49:52 +03:00
Piotr Sarna
76bc30a82d alternator: remove decimal_type FIXME
Decimal precision problems were already solved by commit
d5a1854d93c9448b1d22c2d02eb1c46a286c5404, but one FIXME
remained in the code by mistake.

Message-Id: <381619e26f8362a8681b83e6920052919acf1142.1565271198.git.sarna@scylladb.com>
2019-08-19 15:49:52 +03:00
Piotr Sarna
972474a215 alternator: add comments to rjson
The rapidjson library needs to be used with caution in order to
provide maximum performance and avoid undefined behavior.
Comments added to rjson.hh describe provided methods and potential
pitfalls to avoid.
Message-Id: <ba94eda81c8dd2f772e1d336b36cae62d39ed7e1.1565270214.git.sarna@scylladb.com>
2019-08-19 15:49:52 +03:00
Piotr Sarna
3342ebff22 alternator: stop discarding futures in alternator server
By mistakes, some futures were discarded instead of being chained
in alternator server initialization.
2019-08-19 15:49:52 +03:00
Piotr Sarna
cd2c581c7c alternator: remove a pointer-based workaround for future<json>
With libjsoncpp we were forced to work around the problem of
non-noexcept constructors by using an intermediate unique pointer.
Objects provided by rapidjson have correct noexcept specifiers,
so the workaround can be dropped.
2019-08-19 15:49:52 +03:00
Piotr Sarna
e19a7f908e alternator: migrate to rapidjson library
Profiling alternator implied that JSON parsing takes up a fair amount
of CPU, and as such should be optimized. libjsoncpp is a standard
library for handling JSON objects, but it also proves slower than
rapidjson, which is hereby used instead.
The results indicated that libjsoncpp used roughly 30% of CPU
for a single-shard alternator instance under stress, while rapidjson
dropped that usage to 18% without optimizations.
Future optimizations should include eliding object copying, string copying
and perhaps experimenting with different JSON allocators.
2019-08-19 15:49:52 +03:00
Piotr Sarna
e9f1540de1 alternator: add handling rapidjson errors in the server
If a JSON parsing error is encountered, it is transformed
to a validation exception and returned to the user in JSON form.
2019-08-19 15:49:52 +03:00
Piotr Sarna
eb678ed63a alternator: add rapidjson helper functions
Migrating from libjsoncpp to rapidjson proved to be beneficial
for parsing performance. As a first step, a set of helper functions
is provided to ease the migration process.
2019-08-19 15:49:52 +03:00
Piotr Sarna
ebdd4022cf alternator: add missing namespaces to status_type
error.hh file implicitly assumed that seastar:: namespace is available
when it's included, which is not always the case. To remedy that,
seastar::httpd namespace is used explicitly.
2019-08-19 15:49:52 +03:00
Nadav Har'El
d6a8626e90 alternator: correct catch table-already-exists exception
Our CreateTable handler assumed that the function
migration_manager::announce_new_column_family()
returns a failed future if the table already exists. But in some of
our code branches, this is not the case - the function itself throws
instead of returning a failed future. The solution is to use
seastar::futurize_apply() to handle both possibilities (direct exception
or future holding an exception).

This fixes a failure of the test_table.py::test_create_table_already_exists
test case.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
62858c8466 alternator: add docs/alternator.md
This adds a new document, docs/alternator.md, about Alternator.

The scope of this document should be expanded in the future. We begin
here by introducing Alternator and its current compatibility level with
Amazon DynamoDB, but it should later grow to explain the design of Alternator
and how it maps the DynamoDB data model onto Scylla's.

Whether this document should remain a short high-level overview, or a long
and detailed design document, remains an open question.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190805085340.17543-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Piotr Sarna
3716d23ce4 dependencies: add rapidjson
The rapidjson fast JSON parsing library is used instead of libjsoncpp
in the Alternator subproject.

Message-Id: <a48104dec97c190e3762f927973a08a74fb0c773.1564995712.git.sarna@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
1611b5dd4f alternator: wait for sharded service to start
We start()ed Alternator's sharded service, but forgot to wait for the
future it returns! So on multi-shard run which is slow enough (e.g, debug
build), we sometimes get to invoke_on_all() before start() had completed,
and fail to initialize the Alternator server. The fix is to just wait for
the future returned by start() - just as similar code in main.cc does.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190804172006.14888-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
105533c046 alternator: fix sharing of a seastar::shared_ptr between threads
The function attrs_type() return a supposedly singleton, but because
it is a seastar::shared_ptr we can't use the same one for multiple
threads, and need to use a separate one per thread.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190804163933.13772-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
7c23e23e7d alternator: fix cross-shard use of CQL type objects
The CQL type singletons like utf8_type et al. are separate for separate
shards and cannot be used across shards. So whatever hash tables we use
to find them, also needs to be per-shard. If we fail to do this, we
get errors running the debug build with multiple shards.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190804165904.14204-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
a5057a3b6e alternator-test: some more GSI tests
Expand the GSI test suite. The most important new test is
test_gsi_key_not_in_index(), where the index's key includes just one of
the base table's key columns, but not a second one. In this case, the
Scylla implementation will nevertheless need to add the second key column
to the view (as a clustering key), even though it isn't considered a key
column by the DynamoDB API.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190718085606.7763-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
8d8baccdc4 alternator: ListTables should not list materialized views
Our ListTables implementation uses get_column_families(), which lists both
base tables and materialized views. We will use materialized views to
implement DynamoDB's secondary indexes, and those should not be listed in
the results of ListTables.

The patch also includes a test for this.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190717133103.26321-2-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Nadav Har'El
f8c7a2e0b8 alternator-test: move list_tables to util.py
The list_tables() utility function was used only in test_table.py
but I want to use it elsewhere too (in GSI test) so let's move it
to util.py.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190717133103.26321-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Piotr Sarna
99fd032b1f alternator: make set_sum exception more user-friendly
As in case of set_diff, an exception message in set_sum should include
the user-provided request (ADD) rather than our internal helper function
set_sum.
2019-08-19 15:49:52 +03:00
Piotr Sarna
7984050054 alternator-tests: enable DELETE case for sets
UpdateExpression's case for DELETE operation for sets is enabled.
2019-08-19 15:49:52 +03:00
Piotr Sarna
d7f75b405b alternator: implement set DELETE
UpdateExpression's DELETE operation for set is implemented on top
of set_diff helper function.
2019-08-19 15:49:52 +03:00
Piotr Sarna
1d19934bc6 alternator: add set difference helper function
A function for computing set differene of two sets represented
as JSON is added.
2019-08-19 15:49:52 +03:00
Nadav Har'El
493890c6f6 alternator: fail attempt to create table with GSI
Although we do not support GSI yet, until now we silently ignored
CreateTable's GSI parameter, and the user wouldn't know the table
wasn't created as intended.

In this patch, GSI is still unsupported, but now CreateTable will
fail with an error message that GSI is not supported.

We need to change some of the tests which test the error path, and
expect an error - but should not consider a table creation error
as the expected error.

After this patch, test_gsi.py still fails all the tests on
Alternator, but much more quickly :-)

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190711161420.18547-1-nyh@scylladb.com>
2019-08-19 15:49:52 +03:00
Piotr Sarna
e550a666e3 alternator-test: add stub case for set add duplication
The test case for adding two sets with common values is added.
This case is a stub, because boto3 transforms the result into a Python
set, which removes duplicates on its own. A proper TODO is left
in order to migrate this case to a lower-level API and check
the returned JSON directly for lack of duplicates.
2019-08-19 15:49:52 +03:00
Piotr Sarna
ea62ce67f4 alternator-test: enable tests for ADD operation
Tests for UpdateExpression::ADD are enabled.
2019-08-19 15:49:52 +03:00
Piotr Sarna
7ce0a30766 alternator: add ADD operation
UpdateExpression is now able to perform ADD operation on both numbers
and sets.
2019-08-19 15:49:52 +03:00
Piotr Sarna
d141c3b5bd alternator: add helper function for adding sets
A helper function that allows creating a set sum out of two sets
represented in JSON is added.
2019-08-19 15:49:52 +03:00
Piotr Sarna
a6ca5e19e4 alternator: add unwrap_set
It will be needed later to implement adding sets.
2019-08-19 15:49:51 +03:00
Piotr Sarna
482ac08a45 alternator: add get_item_type_string helper function
It will be useful later for ensuring that parameters for various
functions have matching types.
2019-08-19 15:49:51 +03:00
Nadav Har'El
7efa1aa48a alternator: fix Query verification of appropriate key columns
The Query operation's conditions can be used to search for a particular
hash key or both hash and sort keys - but not any other combinations.
We previously forgot to verify most errors, so in this patch we add
missing verifications - and tests to confirm we fail the query when
DynamoDB does.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190711132720.17248-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
196efbba13 alternator-test: more GSI tests
Add more tests for GSI - tests that DescribeTable describes the GSI,
and test the case of more than one GSI for a base table.

Unfortunately, creating an empty table with two GSIs routinely takes
on DynamoDB more than a full minute (!), so because we now have a
test with two GSIs, I had to increase the timeout in create_test_table().

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190711112911.14703-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Piotr Sarna
19111136c3 alternator-test: enable if_not_exists-related tests
Test cases that relied on the implementation of if_not_exists are
enabled.
2019-08-19 15:49:51 +03:00
Piotr Sarna
87626beaae alternator: implement if_not_exists
The if_not_exists function is implemented on the basis of recently added
read-before write mechanism.
2019-08-19 15:49:51 +03:00
Piotr Sarna
b5bbdc18e4 alternator: rename holds_path to a more generic name
The holds_path() utility function is actually used to check if a value
needs read before write, so its name is changed to more fitting
check_needs_read_before_write.
2019-08-19 15:49:51 +03:00
Nadav Har'El
34833707be alternator: fix bug in collection mutations
Alternator currently keeps an item's attributes inside a map, and we
had a serious bug in the way we build mutations for this map:

We didn't know there was a requirement to build this mutation sorted by
the attribute's name. When we neglect to do this sorting, this confuses
Scylla's merging algorithms, which assume collection cells are thus
sorted, and the result can be duplicate cells in a collection, and the
visible effect is a mutation that seems to be ignored - because both
old and new values exist in the collection.

So this patch includes a new helper class, "attribute_collector", which
helps collect attribute updates (put and del) and extract them in correctly
sorted order. This helper class also eliminates some duplication of
arcane code to create collection cells or deletions of collection cells.

This patch includes a simple test that previously failed, and one xfail
test that failed just because of this bug (this was the test that exposed
this bug). Both tests now succeed.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190709160858.6316-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
cb9cb0e58e alternator-test: exhaustive tests for GSI
This patch adds what is hopefully an exhaustive test suite for the
global secondary indexing (GSI) feature, and all its various
complications and corner cases of how GSIs can be created, deleted,
named, written, read, and more (the tests are heavily documented to
explain what they are testing).

All these tests pass on DynamoDB, and fail on Alternator, so they are
marked "xfail". As we develop the GSI feature in Alternator piece by
piece, we should make these tests start to pass.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190708160145.13865-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
a69455df2b alternator-test: another test for BatchWriteItem
This adds another test for BatchWriteItem: That if one of the operations is
invalid - e.g., has a wrong key type - the entire batch is rejected, and not
none of its operations are done - even the valid ones.

The test succeeds, because we already handle this case correctly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190707134610.30613-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
6435895e2f alternator-test: test UpdateItem's SET with #reference
Test an operation like SET #one = #two, where the RHS has a reference
to a name, rather than the name itself. Also verify that DynamoDB
gives an error if ExpressionAttributeNames includes names not needed
by neither left or right hand side of such assignments.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190708133311.11843-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Piotr Sarna
3c7b07c20c alternator-test: add test for reading key before write
The test case checks if reading keys in order to use their values
in read-before-write updates works fine.
2019-08-19 15:49:51 +03:00
Piotr Sarna
f9063208e9 alternator-test: add test case for nested read-before-write
A test for read-before-write in nested paths (inside a function call
or inside a +/- operator) is added.
2019-08-19 15:49:51 +03:00
Piotr Sarna
99d7a34c00 alternator-test: enable basic read-before-write cases
With unsafe read-before-write implemented, simple cases can be enabled
by removing their xfail flag.
2019-08-19 15:49:51 +03:00
Piotr Sarna
3212167ac4 alternator: fix indentation 2019-08-19 15:49:51 +03:00
Piotr Sarna
33300fd30c alternator: add unsafe read-before-write to update_item
In order to serve update requests that depend on read-before-write,
a proper helper function which fetches the existing item with a given
key from the database is added.
This read-before-write mechanism is not considered safe, because it
provides no linearizability guarantees and offers no synchronization
protection. As such, it should be consider a placeholder that works
fine on a single machine and/or no concurrent access to the same key.
2019-08-19 15:49:51 +03:00
Piotr Sarna
0372ce0649 alternator: add context parameters to calculate_value
The calculate_value utility function is going to need more context
in order to resolve paths present in the right-hand side of update_item
operators: update_info and schema.
2019-08-19 15:49:51 +03:00
Piotr Sarna
43049bbec0 alternator: add allowing key columns when resolving path
Historically, resolving a path checked for key columns, which are not
allowed to be on the left-hand side of the assignment. However, path
resolving will now also be used for right-hand side, where it should
be allowed to use the key value.
2019-08-19 15:49:51 +03:00
Piotr Sarna
80163a67a2 alternator: add optional previous item to calculate_value
In order to implement read-before-write in the future, calculate_value
now accepts an additional parameter: previous_item. If read-before-write
was performed, previous_item will contain an item for the given key
which already exists in the database at the time of the update.
2019-08-19 15:49:51 +03:00
Piotr Sarna
f0448c67b0 alternator: move describe_item implementation up
It will be needed later to add read-before-write to update_item.
2019-08-19 15:49:51 +03:00
Nadav Har'El
94097b98d4 alternator-test: move create_test_table() to util.py
This patch moves the create_test_table() utility function, which creates
a test table with a unique name, from the fixtures (conftest.py) to
util.py. This will allow reusing this function in tests which need to
create tables but not through the existing fixtures. In particular
we will need to do this for GSI (global secondary index) tests
in the next patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190708104438.5830-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
7062b19461 alternator-test: expand tests of duplicate items in BatchWriteItem
The tests we had for BatchWriteItem's refusal to accept duplicate keys
only used test_table_s, with just a hash key. This patch adds tests
for test_table, i.e., a table with both hash and sort keys - to check
that we check duplicates in that case correctly as well.

Moreover, the expanded tests also verify that although identical
keys are not allowed, keys with just one component (hash or sort key)
the same but the other not the same - are fine.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190705191737.22235-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
7af4304592 alternator-test: run local tests without configuring AWS
Even when running against a local Alternator, Boto3 wants to know the
region name, and AWS credentials, even though they aren't actually needed.
For a local run, we can supply garbage values for these settings, to
allow a user who never configured AWS to run tests locally.
Running against "--aws" will, of course, still require the user to
configure AWS.

Also modified the README to be clearer, and more focused on the local
runs.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190708121420.7485-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
f2fac82bf0 alternator-test: don't hardcode us-east-1 region
For "--aws" tests, use the default region chosen by the user in the
AWS configuration (~/.aws/config or environment variable), instead of
hard-coding "us-east-1".

Patch by Pekka Enberg.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190708105852.6313-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Piotr Sarna
30aea6d942 alternator-test: enable precision test for add
With big_decimal-based implementation, the precision test passes.
Message-Id: <6d631a43901a272cb9ebd349cb779c9677ce471e.1562318971.git.sarna@scylladb.com>
2019-08-19 15:49:51 +03:00
Piotr Sarna
e3b1e2860c alternator: allow arithmetics without losing precision
Calculating value represented as 'v1 + v2' or 'v1 - v2' was previously
implemented with a double type, which offers limited precision.
From now on, these computations are based on big_decimal, which
allows returning values without losing precision.
This patch depends on 'add big_decimal arithmetic operators' series.
Message-Id: <f741017fe3d3287fa70618068bdc753bfc903e74.1562318971.git.sarna@scylladb.com>
2019-08-19 15:49:51 +03:00
Piotr Sarna
bb169ff51c alternator-test: enable batch duplication cases
With duplication checks implemented, batch write and delete tests
no longer need to be marked @xfail.
Message-Id: <6c5864607e06e8249101bd711dac665743f78d9f.1562325663.git.sarna@scylladb.com>
2019-08-19 15:49:51 +03:00
Piotr Sarna
76fa348dd9 alternator: add checking for duplicate keys in batches
Batch writes and batch deletes do not allow multiple entries
for the same key. This patch implements checking for duplicated
entries and throws an error if applicable.
Message-Id: <450220ba74f26a0893430cb903e4749f978dfd31.1562325663.git.sarna@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
2933648013 alternator-test: move utility functions to a new "util.py"
Move some common utility functions to a common file "util.py"
instead of repeating them in many test files.

The utility functions include random_string(), random_bytes(),
full_scan(), full_query(), and multiset() (the more general
version, which also supports freezing nested dicts).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190705081013.1796-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
2a2e2a5b3b alternator: use std::visit for reading std::variant
The idiomatic way to use an std::variant depending the type holds is to use
std::visit. This modern API makes it unnecessary to write many boiler-plate
functions to test and cast the type of the variant, and makes it impossible
to forget one of the options. So in this patch we throw out the old ways,
and welcome the new.

Thanks to Piotr Sarna for the idea.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190704205625.20300-1-nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
3135b9b5a5 alternator: support BatchGetItem
This patch adds to Alternator an implementation of the BatchGetItem
operation, which allows to start a number of GetItem requests in parallel
in a single request.

The implementation is almost complete - the only missing feature is the
ability to ask only for non-top-level attributes in ProjectionExpression.
Everything else should work, and this patch also includes tests which,
as usual, pass on DynamoDB and now also on Alternator.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
8d117c0f25 alternator: fix second boot
Amazingly, it appears we never tested booting Alternator a second time :-)

Our initialization code creates a new keyspace, and was supposed to ignore
the error if this keyspace already existed - but we thought the error will
come as an exceptional future, which it didn't - it came as a thrown
exception. So we need to change handle_exception() to a try/catch.

With this patch, I can kill Alternator and it will correctly start again.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
9cb2bf0820 alternator: generate error on spurious key columns
Operations which take a key as parameter, namely GetItem, UpdateItem,
DeleteItem and BatchWriteItem's DeleteRequest, already fail if the given
key is missing one of the nessary key attributes, or has the wrong types
for them. But they should also fail if the given key has spurious
attributes beyond those actually needed in a key.

So this patch adds this check, and tests to confirm that we do these checks
correctly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
ddbcdd9736 alternator: fix PutItem to really replace item.
The PutItem operation, and also the PutRequest of BatchWriteItem, are
supposed to completely replace the item - not to merge the new value with
the previous value. We implemented this wrongly - we just wrote the new
item forgetting a tombstone to remove the old item.

So this patch fixes these operations, and adds tests which confirm the
fix (as usual, these tests pass on DynamoDB, failed on Alternator before
this patch, and pass after the patch).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
c66f32a4ff alternator: add support for DeleteRequest in BatchWriteItem
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
de284b1111 alternator: add DeleteItem
Add support for the DeleteItem operation, which deletes an item.

The basic deletion operation is supported. Still not supported are:

1. Parameters to conditionally delete (ConditionalExpression or Expected)
2. Parameters to return pre-delete content
3. ReturnItemCollectionMetrics (statistics relevant for tables with LSI)

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
a0825043f4 alternator: cleaner error on DeleteRequest
In BatchWriteItem, we currently only support the PutRequest operation.
If a user tries to use DeleteRequest (which we don't support yet), he
will get a bizarre error. Let's test the request type more carefully,
and print a better error message. This will also be the place where
eventually we'll actually implement the DeleteRequest.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
7107923e1e alternator-test: tests for BatchWriteItem
This patch adds more comprehensive tests for the BatchWriteItem operation,
in a new file batch_test.py. The one test we already had for it was also
moved from test_item.py here.

Some of the test still xfail for two reasons:
1. Support for the DeleteRequest operation of BatchWriteItem is missing.
2. Tests that forbid duplicate keys in the same request are missing.

As usual, all tests succeed on DynamoDB, and hopefully (I tried...)
cover all the BatchWriteItem features.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
a6b007b753 alternator: support (most of) ProjectionExpression
DynamoDB has two similar parameters - AttributesToGet and
ProjectionExpression - which are supported by the GetItem, Scan and
Query operations. Until now we supported only the older AttributesToGet,
and this patch adds support to the newer ProjectionExpression.

Besides having a different syntax, the main difference between
AttributesToGet and ProjectionExpression is that the latter also
allows fetching only a specific nested attribute, e.g., a.b[3].c.
We do not support this feature yet, although it would not be
hard to add it: With our current data representation, it means
fetching the top-level attribute 'a', whose value is a JSON, and then
post-filtering it to take out only the '.b[3].c'. We'll do that
later.

This patch also adds more test cases to test_projection_expression.py.
All tests except three which check the nested attributes now pass,
and those three xfail (they succeed on DynamoDB, and fail as expected
on Alternator), reminding us what still needs to be done.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
3909370146 alternator-test: tests for yet-unimplemented ProjectionExpression
Our GetItem, Query and Scan implementations support the AttributesToGet
parameter to fetch only a subset of the attributes, but we don't yet
support the more elaborate ProjectionExpression parameter, which is
similar but has a different syntax and also allows to specify nested
document paths.

This patch adds existive testing of all the ProjectionExpression features.
All these tests pass against DynamoDB, but fail against the current
Alternator so they are marked "xfail". These tests will be helpful for
developing the ProjectionExpression feature.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
0230c04702 alternator-test: more tests for AttributesToGet parameter
The AttributesToGet parameter - saying which attributes to fetch for each
item - is already supported in the GetItem, Query and Scan operations.
However, we only had a test for it for it for Scan. This patch adds
similar tests also for the GetItem and Query operations.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
d5a92727d2 alternator-test: another test for top-level attribute overwrite
Yet another test for overwriting a top-level attribute which contains
a nested document - here, overwriting it by just a string.

This test passes. In the current implementation we don't yet support
updates to specific attribute paths (e.g. a.b[3].c) but we do support
well writing and over-writing top-level attributes.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
042e087066 alternator: initial implementation of "+" and "-" in UpdateExpression
This patch implements the last (finally!) syntactic feature of the
UpdateExpression - the ability to do SET a=val1+val2 (where, as
before, each of the values can be a reference to a value, an
attribute path, or a function call).

The implementation is not perfect: It adds the values as double-precision
numbers, which can lose precision. So the patch adds a new test which
checks that the precision isn't lost - a test that currently fails
(xfail) on Alternator, but passes on DynamoDB. The pre-existing test
for adding small integer now passes on Alternator.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
dcd271393e alternator: support the list_append() function in UpdateExpression
In the previous patch we added function-call support in the UpdateExpression
parser. In this patch we add support for one such function - list_append().
This function takes two values, confirms they are lists, and concatenates
them. After this patch only one function remains unimplemented:
if_not_exists().

We also split the test we already had for list_append() into two tests:
One uses only value references (":val") and passes after this patch.
The second test also uses references to other attributes and will only
work after we start supporting read-modify-write.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Nadav Har'El
f2767d973b alternator: parse more types of values in UpdateExpression
Until this patch, in update expressions like "SET a = :val", we only
allowed the right-hand-side of the assignment to be a reference to a
value stored in the request - like ":val" in the above example.

But DynamoDB also allows the value to be an attribute path (e.g.,
"a.b[3].c", and can also be a function of a bunch of other values.
This patch adds supports for parsing all these value types.

This patch only adds the correct parsing of these additional types of
values, but they are still not supported: reading existing attributes
(i.e., read-modify-write operations) is still not supported, and
none of the two functions which UpdateExpression needs to support
are supported yet. Nevertheless, the parsing is now correct, and the
the "unknown_function" test starts to pass.

Note that DynamoDB allows the right-hand side of an assignment to be
not only a single value, but also value+value and value-value. This
possibility is not yet supported by the parser and will be added
later.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:49:51 +03:00
Piotr Sarna
51a1a01605 alternator-test: add initial filtering test for scans
Currently the only supported case is equality on non-key attributes.
More complex filtering tests are also included in test_query.py.
2019-08-19 15:48:17 +03:00
Piotr Sarna
067f00050e alternator-test: add initial filtering test for query
The test cases verify that equality-based filtering on non-key
attributes works fine. It also contains test stubs for key filtering
and non-equality attribute filtering.
2019-08-19 15:48:17 +03:00
Piotr Sarna
9fc44d2f0e alternator-test: diversify attribute values in filled test table
Filled test table used to have identical non-key attributes for all
rows. These values are now diversified in order to allow writing
filtering test cases.
2019-08-19 15:48:17 +03:00
Piotr Sarna
67c86461cb alternator: add filtering to Query
Query requests now accept QueryFilter parameter.
2019-08-19 15:48:17 +03:00
Piotr Sarna
d018539b07 alternator: enable filtering for Scan
Scans can now accept ScanFilter parameter to perform filtering
on returned rows.
2019-08-19 15:48:17 +03:00
Piotr Sarna
1e488458b2 alternator: add initial filtering implementation
Filtering is currently only implemented for the equality operator
on non-key attributes.
Next steps (TODO) involve:
1. Implementing filtering for key restrictions
2. Implementing non-key attribute filtering for operators other than EQ.
   It, in turn, may involve introducing 'map value restrictions' notion
   to Scylla, since now it only allows equality restrictions on map
   values (alternator attributes are currently kept in a CQL map).
3. Implementing FilterExpression in addition to deprecated QueryFilter
2019-08-19 15:48:17 +03:00
Nadav Har'El
b4b6a377e4 alternator: clean up parsing of attribute-path components
Before this patch, we read either an attribute name like "name" or
a reference to one "#name", as one type of token - NAME.
However, while attribute paths indeed can use either one, in some other
contexts - such as a function name - only "name" is allowed, so we
need to distinguish between two types of tokens: NAME and NAMEREF.

While separating those, I noticed that we incorrectly allowed a "#"
followed by *zero* alphanumeric characters to be considered a NAMEREF,
which it shouldn't. In other words, NAMEREF should have ALNUM+, not ALNUM*.
Same for VALREF, which can't be just a ":" with nothing after it.
So this patch fixes these mistakes, and adds tests for them.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
90b8870a16 alternator: complain about unused values or names in UpdateExpression
DynamoDB complains, and fails an update, if the update contains in
ExpressionAttributeNames or ExpressionAttributeValues names which aren't
used by the expression.

Let's do the same, although sadly this means more work to track which
of the references we've seen and which we haven't.

This patch makes two previously xfail (expected fail) tests become
successful tests on Alternator (they always succeeded against DynamoDB).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
00ed12e56c alternator-test: complete test for UpdateItem's UpdateExpression
The existing tests in test_update_expression.py thoroughly tested the
UpdateExpression features which we currently support. But tests for
features which Alternator *doesn't* yet support were partial.

In this patch, we add a large number of new tests to
test_update_expression.py aiming to cover ALL the features of
UpdateExpression, regardless of whether we already support it in
Alternator or not. Every single feature and esoteric edge-case I could
discover is covered in these tests - and as far as I know these tests
now cover the *entire* UpdateExpression feature. All the tests succeed
on DynamoDB, and confirm our understanding of what DynamoDB actually does
on all these cases.

After this patch, test_update_expression.py is a whopper, with 752 lines of
code and 37 separate test functions. 23 out of these 37 tests are still
"xfail" - they succeed on DynamoDB but fail on Alternator, because of
several features we are still missing. Those missing features include
direct updates of nested attributes, read-modify-write updates (e.g.,
"SET a=b" or "SET a=a+1"), functions (e.g., "SET a = list_append(a, :val)"),
the ADD and DELETE operations on sets, and various other small missing
pieces.

The benefit of this whopper test is two-fold: First, it will allow us
to test our implementation as we continue to fill it (i.e., "test-
driven development"). Second, all these tested edge cases basically
"reverse engineer" how DynamoDB's expression parser is supposed to work,
and we will need this knowledge to implement the still-missing features of
UpdateExpression.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
04610f5299 alternator-test: test for UpdateItem's UpdateExpression
This patch adds an extensive array of tests for UpdateItem's UpdateExpression
support, which was introduced in the previous patch.

The tests include verification of various edge cases of the parser, support
for ":value" and "#name" references, functioning SET and REMOVE operations,
combinations of multiple such operations, and much more.

As usual, all these tests were ran and succeed on DynamoDB, as well as on
Alternator - to confirm Alternator behaves the same as DynamoDB.

There are two tests marked "xfail" (expected to fail), because Alternator
still doesn't support the attribute copy syntax (e.g., "SET a = b",
doing a read-before-write).

There are some additional areas which we don't support - such as the DELETE
and ADD operations or SET with functions - but those areas aren't yet test
in these tests.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
ef58e3fc1e alternator: enable support for UpdateItem's UpdateExpression
For the UpdateItem operation, so far we supported updates via the
AttributeUpdates parameter, specifying which attributes to set or remove
and how. But this parameter is considered deprecated, and DynamoDB supports
a more elaborate way to modify attributes, via an "UpdateExpression".

In the previous patch we added a function to parse such an UpdateExpression,
and in this patch we use the result of this parsing to actually perform
the required updates.

UpdateExpression is only partially supported after this patch. The basic
"SET" and "REMOVE" operations are supported, but various other cases aren't
fully supported and will be fixed in followup patches. The following
patch will add extensive tests to confirm exactly what works correctly
with the new UpdateExpression support.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
21c7e53c1c alternator: add expression parsers
The DynamoDB protocol is based on JSON, and most DynamoDB requests describe
the operation and its parameters via JSON objects such as maps and lists.
However, in some types of requests an "expression" is passed as a single
string, and we need to parse this string. These cases include:
1. Attribute paths, such as "a[3].b.c", are used in projection
 expressions as well as inside other expressions described below.
2. Condition expressions, such as "(NOT (a=b OR c=d)) AND e=f",
 used in conditional updates, filters, and other places.
3. Update expressions, such as "SET #a.b = :x, c = :y DELETE d"

This patch introduces the framework to parse these expressions, and
an implementation of parsing update expressions. These update expressions
will be used in the UpdateItem operation in the next patch.

All these expression syntaxes are very simple: Most of them could be
parsed as regular expressions, or at most a simple hand-written lexical
analyzer and recursive-descent parser. Nevertheless, we decided to specify
these parsers in the same ANTLR3 language already used in the Scylla
project for parsing CQL, hopefully making these parsers easier to reason
about, and easier to change if needed - and reducing the amount of boiler-
plate code.

The parsing of update expressions is most complete except that in SET
actions, only the "path = value" form is supported and not yet forms
forms such as "path1 = path2" (which does read-before-write) or
"path1 = path1 + value" or "path = function(...)".

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
66cad2975f alternator-test: split nested-document tests to new file
We need to write more tests for various case of handling
nested documents and nested attributes. Let's collect them
all in the same test file.

This patch mostly moves existing code, but also adds one
small test, test_nested_document_attribute_write, which
just writes a nested document and reads it back (it's
mostly covered by the existing test_put_and_get_attribute_types,
but is specifically about a nested document).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
bd251e81df alternator-test: make local test the default
We usually run Alternator tests against the local Alternator - testing
against AWS DynamoDB is rarer, and usually just done when writing the
test. So let's make "pytest" without parameters default to testing locally.
To test against AWS, use "pytest --aws" explicitly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
f807e2792d alternator: move related functions to serialization.cc
Existing functions related to serialization and deserialization
are moved to serialization.cc source file.
Message-Id: <fb49a08b05fdfcf7473e6a7f0ac53f6eaedc0144.1559646761.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
d1942091d7 alternator: apply new serialization to reads and writes
Attributes for reads (GetItem, Query, Scan, ...) and writes (PutItem,
UpdateItem, ...) are now serialized and deserialized in binary form
instead of raw JSON, provided that their type is S, B, BOOL or N.
Optimized serialization for the rest of the types will be introduced
as follow-ups.
Message-Id: <6aa9979d5db22ac42be0a835f8ed2931dae208c1.1559646761.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
01ff0d1ba8 alternator: remove usages of to_json()
Introducing to_json() helper to types.hh was an inaccurate idea,
so before the mentioned patch is reverted, each usage of to_json()
is replaced with to_json_string().
Message-Id: <72bfdcdcba02896c92904860773244dd99b7a213.1560260605.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
19c855287c alternator: add simple attribute serialization routines
Attributes used to be written into the database in raw JSON format,
which is far from optimal. This patch introduces more robust
serializationi routines for simple alternator types: S, B, BOOL, N.
Serialization uses the first byte to encode attribute type
and follows with serializing data in binary form.
More complex types (sets, lists, etc.) are currently still
serialized in raw JSON and will be optimized in follow-up patches.
Message-Id: <10955606455bbe9165affb8ac8fba4d9e7c3705f.1559646761.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
4c2c920250 alternator: move error class to a separate header
Error class definitions were previously in server.hh, but they
are separate entities - future .cc files can use the errors without
the need of including server definitions.
Message-Id: <b5689e0f4c9f9183161eafff718f45dd8a61b653.1559646761.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
e6f1841756 configure.py: move alternator source files to separate list
For some unknown reason we put the list of alternator source files
in configure.py inside the "api" list. Let's move it into a separate
list.

We could have just put it in the scylla_core list, but that would cause
frequent and annoying patch conflicts when people add alternator source
files and Scylla core source files concurrently.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
d4ad58f353 alternator: stub support for UpdateItem with UpdateExpression
So far for UpdateItem we only supported the old-style AttributeUpdates
parameter, not the newer UpdateExpression. This patch begins the path
to supporting UpdateExpression. First, trying to use *both* parameters
should result in an error, and this patch does this (and tests this).
Second, passing neither parameters is allowed, and should result in
an *empty* item being created.

Finally, since today we do not yet support UpdateExpression, this patch
will cause UpdateItem to fail if UpdateExpression is used, instead of
silently being ignored as we did so far.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
102034a742 alternator-tests: two simple test for nested documents
This patch adds two simple tests for nested documents, which pass:

test_nested_document_attribute_overwrite() tests what happens when
we UpdateItem a top-level attribute to a dictionary. We already tested
this works on an empty item in a previous test, but now we check what
happens when the attribute already existed, and already was a dictionary,
and now we update it to a new dictionary. In the test attribute a was
{b:3, c:4} and now we update it to {c:5}. The test verifies that the new
dictionary completely replaces the old one - the two are not merged.
The new value of the attribute is just {c:5}, *not* {b:3, c:5}.

The second test verifies that the AttributeUpdates parameter of
UpdateItem cannot be used to update a just a nested attributes.
Any dots in the attribute name are considered an actual dot - not
part of a path of attribute names.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
d914f37a08 alternator-test: test_query.py: change item list comparison
Comparing two lists of items without regard for order is not trivial.
For this reason some tests in test_query.py only compare arrays of sort
keys, and those tests are fine.

But other tests used a trick of converting a list of items into a
of set_of_frozen_elements() and compare this sets. This trick is almost
correct, but it can miss cases where items repeat.

So in this patch, we replace the set_of_frozen_elements() approach by
a similar one using a multiset (set with repetitions) instead of a set.
A multiset in Python is "collections.Counter". This is the same approach
we started to also used in test_scan.py in a recent patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
c63fefd37b alternator: remove unused code
Remove the incomplete and unused function to convert DynamoDB type names
to ScyllaDB type objects:

DynamoDB has a different set of types relevant for keys and for attributes.
We already have a separate function, parse_key_type(), for parsing key
types, and for attributes - we don't currently parse the type names at
all (we just save them as JSON strings), so the function we removed here
wasn't used, and was in fact #if'ed out. It was never completed, and it now
started to decay (the type for numbers is wrong), so we're better off
completely removing it.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
c2cf2ab99b alternator: implement correct "number" type for keys
This patch implements a fully working number type for keys, and now
Alternator fully and correctly supports every key type - strings, byte
arrays, and numbers.

The patch also adds a test which verifies that Scylla correctly sorts
number sort keys, and also correctly retrieves them to the full precision
guaranteed by DynamoDB (38 decimal digits).

The implementation uses Scylla's "decimal" type, which supports arbitrary
precision decimal floating point, and in particular supports the precision
specified by DynamoDB. However, "decimal" is actually over-qualified for
this use, so might not be optimal for the more specific requirements of
DynamoDB. So a FIXME is left to optimize this case in the future.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
41526884f1 alternator-test: test_scan.py: change item list comparison
Comparing two lists of items without regard for order is not trivial.
test_scan.py currently has two ways of doing this, both unsatisfactory:

1. We convert each list to a set via set_of_frozen_elements(), and compare
   the sets. But this comparison can miss cases where items repeat.

2. We use sorted() on the list. This doesn't work on Python 3 because
   it removed the ability to compare (with "<") dictionaries.

So in this patch, we replace both by a new approach, similar to the first
one except we use a multiset (set with repetitions) instead of a set.
A multiset in Python is "collections.Counter".

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
2e703e6b73 alternator-test: drop "test_2_tables" fixture
Creating and deleting tables is the slowest part of our tests,
so we should lower the number of tables our tests create.

We had a "test_2_tables" fixture as a way to create two
tables, but since our tests already create other tables
for testing different key types, it's faster to reuse those
tables - instead of creating two more unused tables.

On my system, a "pytest --local", running all 38 tests
locally, drops from 25 seconds to 20 seconds.

As a bonus, we also have one fewer fixture ;-)

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
ecd585ef59 alternator-text: fix errors in len/length variable name
Also change "xrage" to "range" to appease Python 3

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
cd040d6674 DynamoDB limits the size of hash keys to 2048 bytes, sort keys
to 1024 bytes, and the entire item to 400 KB which therefore also
limits the size of one attribute. This test checks that we can
reach up to these limits, with binary keys and attributes.

The test does *not* check what happens once we exceed these
limits. In such a case, DynamoDB throws an error (I checked that
manually) but Alternator currently simply succeeds. If in the
future we decide to add artificial limits to Alternator as well,
we should add such tests as well.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
a1dbde66fd alternator-test: don't use "len" as a parameter name
"len" is an unfortunate choice for a variable name, in case one
day the implementation may want to call the built-in "len" function.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
67c35cde40 alternator-test: test sort-key ordering - for both string and binary keys
We already have a test for *string* sort-key ordering of items returned
by the Scan operation, and this test adds a similar test for the Query
operation. We verify that items are retrieved in the desired sorted
order (sorted by the aptly-named sort key) and not in creation order
or any other wrong order.

But beyond just checking that Query works as expected (it should,
given it uses the same machinary as Scan), the nice thing about this
test is that it doesn't create a new table - it uses a shared table
and creates one random partition inside it. This makes this test
faster and easier to write (no need for a new fixture), and most
importantly - easily allows us to write similar tests for other
key types.

So this patch also tests the correct ordering of *binary* sort keys.
It helped exposed bugs in previous versions of the binary key implementation.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
18b2f656f2 alternator-test: test item operations with binary keys
Simple tests for item operations (PutItem, GetItem) with binary key instead
of string for the hash and sort keys. We need to be able to store such
keys, and then retrieve them correctly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
90d7e6673e alternator: add support for bytes as key columns
Until now we only supported string for key columns (hash or sort key).
This patch adds support for the bytes type (a.k.a binary or blob) as well.
The last missing type to be supported in keys is the number type.

Note that in JSON, bytes values are represented with base64 encoding,
so we need to decode them before storing the decoded value, and re-encode
when the user retrieves the value. The decoding is important not just
for saving storage space (the encoding is 4/3 the size of the decoded)
but also for correct *sorting* of the binary keys.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
182450623a alternator: add base64 encoding and decoding functions
The DynamoDB API uses base64 encoding to encode binary blobs as JSON
strings. So we need functions to do these conversions.

This code was "inspired" by https://github.com/ReneNyffenegger/cpp-base64
but doesn't actually copy code from it.

I didn't write any specific unit tests for this code, but it will be
exercised and tested in a following patch which tests Alternator's use
of these functions.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
f14c7f8200 alternator-test: add dedicated BEGINS_WITH case to Query
BEGINS_WITH behaves in a special way when a key postfix
consists of <255> bytes. The initial test does not use that
and instead checks UTF-8 characters, but once bytes type
is implemented for keys, it should also test specifically for
corner cases, like strings that consist of <255> byte only.
Message-Id: <fe10d7addc1c9d095f7a06f908701bb2990ce6fe.1558603189.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
b81fbabe37 alternator-test: rename test_query_with_paginator
Paginator is an implementation detail and does not belong in the name,
and thus the test is renamed to test_query_basic_restrictions.
Message-Id: <849bc9d210d0faee4bb8479306654f2a59e18517.1558524028.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
9c934b64f8 alternator: fix string increment for BEGINS_WITH
BEGINS_WITH statement increments a string in order to compute
the upper bound for a clustering range of a query.
Unfortunately, previous implementation was not correct,
as it appended a <0> byte if the last character was <255>,
instead of incrementing a last-but-one character.
If the string contains <255> bytes only, the upper bound
of the returned upper bound is infinite.
Message-Id: <3a569f08f61fca66cc4f5d9e09a7188f6daad578.1558524028.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
4ad39f714f alternator: common get_read_consistency() function
We had several places in the code that need to parse the
ConsistentRead flag in the request. Let's add a function
that does this, and while at it, checks for more error
cases and also returns LOCAL_QUORUM and LOCAL_ONE instead
of QUORUM and ONE.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
f56a0fbcd9 alternator: for writes, use LOCAL_QUORUM instead of QUORUM
As Shlomi suggested in the past, it is more likely that when we
eventually support global tables, we will use LOCAL_QUORUM,
not QUORUM. So let's switch to that now.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
296c2566c5 alternator-test: verify that table with only hash key also works
So far, all of the tests in test_item.py (for PutItem, GetItem, UpdateItem),
were arbitrarily done on a test table with both hash key and sort key
(both with string type). While this covers most of the code paths, we still
need to verify that the case where there is *not* a sort key, also works
fine. E.g., maybe we have a bug where a missing clustering key is handled
incorrectly or an error is incorrectly reported in that case?

But in this patch we add tests for the hash-key-only case, and see that
it already works correctly. No bug :-)

We add a new fixture test_table_s for creating a test table with just
a single string key. Later we'll probably add more of these test tables
for additional key types.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
d760097dad alternator-test: also test for missing part of key
Another type of key type error can be to forget part of the key
(the hash or sort key). Let's test that too (it already works correctly,
no need to patch the code).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
523c5ee159 alternator: gracefully handle wrong key types
When a table has a hash key or sort key of a certain type (this can
be string, bytes, or number), one cannot try to choose an item using
values of different types.

We previously did not handle this case gracefully, and PutItem handled
it particularly bad - writing malformed data to the sstable and basically
hanging Scylla. In this patch we fix the pk_from_json() and ck_from_json()
functions to verify the expected type, and fail gracefully if the user
sent the wrong type.

This patch also adds tests for these failures, for the GetItem, PutItem,
and UpdateItem operations.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
8146eb6027 alternator: correct handling of missing item in GetItem
According to the documentation, trying to GetItem a non-existant item
should result in an empty response - NOT a response with an empty "Item"
map as we do before this patch.

This patch fixes this case, and adds a test case for it. As usual,
we verify that the test case also works on Amazon DynamoDB, to verify
DynamoDB really behaves the way we thik it does.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
725c18bf6c alternator: fix support for empty items
If an empty item (i.e., no attributes except the key) is created, or an item
becomes empty (by deleting its existing attributes), the empty item must be
maintained - it cannot just disappear. To do this in Scylla, we must add a
row marker - otherwise an empty attribute map is not enough to keep the
row alive.

This patch includes 4 test cases for all the various ways an empty item can be
created empty or non-empty item be emptied, and verifies that the empty item
can be correctly retrieved (as usual, to verify that our expectation of
"correctness" is indeed correct, we run the same tests against DynamoDB).
All these 4 tests failed before this patch, and now succeed.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
c8270831ec alternator: remove two unused lines of code
These lines of codes were superfluous and their result unused: the
make_item_mutation() function finds the pk and ck on its own.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
0987916542 alternator: add statistics
his patch adds a statistics framework to Alternator: Executor has (for
each shard) a _stats object which contains counters for various events,
and also is in charge of making these counters visible via Scylla's regular
metrics API (http://localhost:9180/metrics).

This patch includes a counter for each of DynamoDB's operation types,
and we increase the ones we support when handled. We also added counters
for total operations and unsupported operations (operation types we don't
yet handle). In the future we can easily add many more counters: Define
the counter in stats.hh, export it in stats.cc, and increment it in
where relevant in executor.cc (or server.cc).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
ec23a14f82 alternator-test: add initial Query test
The test covers simple restrictions on primary keys.
Message-Id: <2a7119d380a9f8572210571c565feb8168d43001.1558356119.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
7e4f7a20dc alternator: implement basic Query
The implementation covers the following restrictions
 - equality for hash key;
 - equality, <, <=, >, >=, between, begins_with for sort key.
Message-Id: <021989f6d0803674cbd727f9b8b3815433ceeea5.1558356119.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
930234e48e alternator: move do_query to separate function
A fair portion of code from scan() will be used later to implement
query(), so it's extracted as a helper function.
Message-Id: <d3bc163a1cb2032402768fcbc6a447192fba52a4.1558356119.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
9055096021 alternator-test: another edge case for Scan with AttributesToGet
Ask to retrieve only an attribute name which *none* of the items have.
The result should be a silly list of empty items, and indeed it is.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
65934ada59 alternator-test: shorten test_scan.py by reusing full_scan more
Use full_scan() in another test instead of open-coding the scan.
There are two more tests that could have used full_scan(), but
since they seem to be specifically adding more assertions or
using a different API ("paginators"), I decided to leave them
as-is. But new tests should use full_scan().

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
b28d3cdb5d alternator-test: test AttributesToGet parameter in Scan request
This is a short, but extensive, test to the AttributesToGet parameter
to Scan, allowing to select for output only some of the attributes.

The AttributesToGet feature has several non-obvious features. Firstly,
it doesn't require that any key attributes be selected. So since each
item may have different non-key attributes, some scanned items may
be missing some of the selected columns, and some of the items may
even be missing *all* the selected columns - in which case DynamoDB
returns an empty item (and doesn't entirely skip this item). This
test covers all these cases, and it adds yet another item to the
'filled_test_table' fixture, one which has different attributes,
so we can see these issues.

As usual, this test passes in both DynamoDB and Alternator, to
assure we correspond to the *right* behavior, not just what we
think is right.

This test actually exposed a bug in the way our code returned
empty items (items which had none of the selected columns),
a bug which was fixed by the previous patch.

Instead of having yet another copy of table-scanning code, this
patch adds a utility function full_scan(), to scan an entire
table (with optional extra parameters for the scan) and return
the result as an array. We should simply existing tests in
test_scan.py by using this new function.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
cf58acc23f alternator: fix bug in returning an empty item in a Scan
When a Scan selects only certain attributes, and none of the key
attributes are selected, for some of the scanned items *nothing*
will remain to be output, but still Dynamo outputs an empty item
in this case. Our code had a bug where after each item we "moved"
the object leaving behind a null object, not an empty map, so a
completely empty item wasn't output as an empty map as expected,
and resulted in boto3 failing to parse the response.

This simple one-line patch fixes the bug, by resetting the item
to an empty map after moving it out.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
75c3f33a8c alternator: add lookup table for requests
Instead of using a really long if-else chain, requests are now
looked up via a routing table.
Message-Id: <746a34b754c3070aa9cbeaf98a6e7c6781aaee65.1557914794.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
be516a080a alternator-test: migrate filled_test_table to use batches
Filled test table fixture now takes advantage of batch writes
in order to run faster.
Message-Id: <e299cdffa9131d36465481ca3246199502d65e0c.1557914382.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
176b7dfd17 alternator-test: add batch writing test case
Message-Id: <a950799dd6d31db429353d9220b63aa96676a7a7.1557914382.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
30d4b4e689 alternator: add basic BatchWriteItem
The initial implementation only supports PutRequest requests,
without serving DeleteRequest properly.
Message-Id: <451bcbed61f7eb2307ff5722de33c2e883563643.1557914382.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
3b73c49ac8 alternator: improve where DescribeEndpoints gets its information
Instead of blindly returning "localhost:8000" in response to
DescribeEndpoints and for sure causing us problems in the future,
the right thing to do is to return the same domain name which the
user originally used to get to us, be it "localhost:8000" or
"some.domain.name:1234". But how can we know what this domain name
was? Easy - this is why HTTP 1.1 added a mandatory "Host:" header,
and the DynamoDB driver I tested (boto3) adds it as expected,
indeed with the expected value of "localhost:8000" on my local setup.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
b556356a7d alternator-test: test for sort order of items in a single partition
Although different partitions are returned by a Scan in (seemingly)
random order, items in a single partition need to be returned sorted
by their sort key. This adds a test to verify this.

This patch adds to the filled_test_table fixture, which until now
had just one item in each partition, another partition (with the key
"long") with 164 additional items. The test_scan_sort_order_string
test then scans this table, and verifies that the items are really
returned in sorted order.

The sort order is, of course, string order. So we have the first
item with sort key "1", then "10", then "100", then "101", "102",
etc. When we implement numeric keys we'll need to add a version
of this test which uses a numeric clustering key and verifies the
sort order is numeric.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
90c12b4ea3 alternator: fix clustering key setup
Because of a typo, we incorrectly set the table's sort key as a second
partition key column instead of a clustering key column. This has bad
but subtle consequences - such as that the items are *not* sorted
according to the sort key. So in this patch we fix the typo.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
287a986715 alternator: add initial implementation of DescribeEndpoints
DescribeEndpoints is not a very important API (and by default, clients
don't use it) but I wanted to understand how DynamoDB responds to it,
and what better way than to write a test :-)

And then, if we already have a test, let's implement this request in
Scylla as well. This is a silly implementation, which always returns
"localhost:8000". In the future, this will need to be configurable -
we're not supposed here to return *this* server's IP address, but rather
a domain name which can be used to get to all servers.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
9b46c6ac2d alternator: unify and improve TableName field handling
Most of the request types need to a TableName parameter, specifying the
name of the table they operate on. There's a lot of boilerplate code
required to get this table name and verify that it is valid (the parameter
exists, is a string, passes DynamoDB's naming rules, and the table
actually exists), which resulted in a lot of code duplication - and
in some cases missing checks.

So this patch introduces two utility functions, get_table_name()
and get_table(), to fetch a table name or the schema of an existing
table, from the request, with all necessary validation. If validation
fails, the appropriate api_error() is thrown so the user gets the
right error message.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Nadav Har'El
f0436aeecc alternator-test: clean up conftest.py
Remove unused random-string code from conftest.py, and also add a
TODO comment how we should speed up filled_test_table fixture by
using a batch write - when that becomes available in Alternator.
(right now this fixture takes almost 4 seconds to prepare on a local
Alternator, and a whopping 3 minutes (!) to prepare on DynamoDB).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
e3ba65003d alternator-test: add initial scan test
Message-Id: <c28ff1d38930527b299fe34e9295ecd25607398c.1557757402.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
d25a07a6c0 alternator-test: add filled test table fixture
The fixture creates a test table and fills it with random data,
which can be later used for testing reads.
Message-Id: <649a8b8928e1899c5cbd82d65d745a464c1163c8.1557757402.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
426f53bc89 alternator: implement basic scan
The most basic version of Scan request is implemented.
It still contains a list of TODOs, among which the support for Segments
parameter for scan parallelism.
Message-Id: <5d1bfc086dbbe64b3674b0053e58a0439e64909b.1557757402.git.sarna@scylladb.com>
2019-08-19 15:48:17 +03:00
Piotr Sarna
1d85558d47 alternator: lower debug messages verbosity in the HTTP server
The HTTP server still uses WARN log level to log debug messages,
which is way higher than necessary. These messages are degraded
to TRACE level.
Message-Id: <59559277f2548d4046001bebff45ab2d3b7063b5.1557744617.git.sarna@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
1c7b1ac165 alternator-test: simplify test_put_and_get_attribute_types
The test test_put_and_get_attribute_types needlessly named all the
different attributes and their variables, causing a lot of repetition
and chance for mistakes when adding additional attributes to the test.

In this rewrite, we only have a list of items, and automatically build
attributes with them as values (using sequential names for the attributes)
and check we read back the same item (Python's dict equality operator
checks the equality recursively, as expected).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
c4c71989bf alternator-test: test all attribute types
Although we planned to initially support only string types, it turns out
for the attributes (*not* the key), we actually support all types already,
including all scalar types (string, number, bool, binary and null) and
more complex types (list, nested document, and sets).

This adds a tests which PutItem's these types and verifies that we can
retrieve them.

Note that this test deals with top-level attributes only. There is no
attempt to modify only a nested attribute (and with the current code,
it wouldn't work).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
4dad76a6a7 alternator-test: rewrite ListTables test
In our tests, we cannot really assume that ListTables should returns *only*
the tables we created for the test, or even that a page size of 100 will
be enough to list our 3 pages. The issue is that on a shared DynamoDB, or
in hypothetical cases where multiple tests are run in parallel, or previous
tests had catestrophic errors and failed to clean up, we have no idea how
many unrelated tables there are in the system. There may be hundreds of
them.  So every ListTables test will need to use paging.

So in this re-implementation, we begin with a list_tables() utility function
which calls ListTables multiple times to fetch all tables, and return the
resulting list (we assume this list isn't so huge it becomes unreasonable
to hold it in memory). We then use this utility function to fetch the table
list with various page sizes, and check that the test tables we created are
listed in the resulting list.

There's no longer a separate test for "all" tables (really was a page of 100
tables) and smaller pages (1,2,3,4) - we now have just one test that does the
page sizes 1,2,3,4, 50 and 100.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Piotr Sarna
323268e9ab alternator: add tests to ListTables command
Test cases cover both listing appropriate table names
and pagination.
Message-Id: <e7d5f1e5cce10c86c47cdfb4d803149488935ec0.1557402320.git.sarna@scylladb.com>
2019-08-19 15:48:16 +03:00
Piotr Sarna
a441ad9360 alternator-test: add 2 tables fixture
For some tests, more than 1 table is needed, so another fixture
that provided two additional test tables is added.
Message-Id: <75ae9de5cc1bca19594db1f0bc03260f83459380.1557402320.git.sarna@scylladb.com>
2019-08-19 15:48:16 +03:00
Piotr Sarna
d04a5b01c3 alternator: implement ListTables
ListTables is used to extract all table names created so far.
Message-Id: <04f4d804a40ff08a38125f36351e56d7426d2e3d.1557402320.git.sarna@scylladb.com>
2019-08-19 15:48:16 +03:00
Piotr Sarna
4da8171b42 alternator: use trace level for debug messages
In the early development stage, warn level was used for all
debug messages, while it's more appropriate to use 'trace' or 'debug'.
Message-Id: <419ca5a22bc356c6e47fce80b392403cefbee14d.1557402320.git.sarna@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
fbc6f222b8 alternator-test: cleanup in conftest.py
This patch cleans up some comments and reorganizes some functions in
conftest.py, where the test_table fixture was defined. The goal is to
later add additional types of test tables with different schemas (e.g.,
just a partition key, different key types, etc.) without too much
code duplication.

This patch doesn't change anything functional in the tests, and they
still pass ("pytest --local" runs all tests against the local Alternator).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
31cce0323e alternator: make ck_from_json() easier to use
The ck_from_json() utility function is easier to use if it handles
the no-clustering-key case as the callers need them too, instead of
requiring them to handle the no-clustering-key case separately.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Piotr Sarna
f3d1cefe3e alternator: migrate to std::string
Most JSON libraries, including jsoncpp, are based on std::string,
so sstring becomes a source of unneeded copying. The usage of sstring
is only preserved in code that interacts with Scylla API directly.
Message-Id: <691d64c7d71196e33fb0e0847dd8a13704d3cdb2.1557314233.git.sarna@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
8bf6e963f2 alternator: add support for UpdateItem's DELETE operation
So far we supported UpdateItem only with PUT operations - this patch
adds support for DELETE operations, to delete specific attributes from
an item.

Only the case of a missing value is support. DynamoDB also provides
the ability to pass the old value, and only perform the deletion if
the value and/or its type is still up-to-date - but we don't support
this yet and fail such request if it is attempted.

This patch also includes a test for this case in alternator-test/

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
ac65f91b5d alternator-test: add tests for UpdateItem
Add initial tests for UpdateItem. Only the features currently supported
by our code (only string attributes, only "PUT" action) are tested.

As usual, this test (like all others) was tested to pass on both DynamoDB
and Alternator.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
c7d7c1e50d alternator: add initial UpdateItem implementation
Add an initial UpdateItem implementation. As PutItem and GetItem we
are still limited to string attributes. This initial implementation
of UpdateItem implements only the "PUT" action (not "DELETE" and
certainly not "ADD") and not any of the more advanced options.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Piotr Sarna
829f5fe359 alternator: add attrs_column() helper function
Message-Id: <d93ae70ccd27fe31d0bc6915a20d83d7a85342cf.1557223199.git.sarna@scylladb.com>
2019-08-19 15:48:16 +03:00
Piotr Sarna
5f9409af68 alternator: make constant names more explicit
KEYSPACE and ATTRS constants refer to their names, not objects,
so they're named more explicitly.
Message-Id: <14b1f00d625e041985efbc4cbde192bd447cbf03.1557223199.git.sarna@scylladb.com>
2019-08-19 15:48:16 +03:00
Piotr Sarna
f28680ec5c alternator: remove inaccessible return statement
Message-Id: <afaef20e7e110fa23271fb8c3dc40cec0716efb6.1557223199.git.sarna@scylladb.com>
2019-08-19 15:48:16 +03:00
Piotr Sarna
a7175ddd44 alternator: inline keywords
It was decided that all alternator-specific keywords can be inlined
in code instead of defining them as constants.
Message-Id: <6dffb9527cfab2a28b8b95ac0ad614c18027f679.1557223199.git.sarna@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
90f56c32b0 alternator: some cleanups in validate_table_name()
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
2808f7ae3f alternator: clean up api_error() interface
All operation-generated error messages should have the 400 HTTP error
code. It's a real nag to have to type it every time. So make it the
default.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
ac278d67ca alternator-test: test for error on creating an already-existing table
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
53a9567804 alternator: correct error when trying to CreateTable an existing table
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
5a751ddcff alternator: fix return object from PutItem
Without special options, PutItem should return nothing (an empty
JSON result). Previously we had trouble doing this, because instead
of return an empty JSON result, we converted an empty string into
JSON :-) So the existing code had an ugly workaround which worked,
sort of, for the Python driver but not for the Java driver.

The correct fix, in this patch, is to invent a new type json_string
which is a string *already* in JSON and doesn't need further conversion,
so we can use it to return the empty result. PutItem now works from
YCSB's Java driver.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
dae70c892f alternator-test: more examples in README.md
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
4cbb40d5d8 alternator-test: test table name limit of 222 bytes, instead of 255.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
ae779ade37 alternator: limit table names to 222 bytes
Although we would like to allow table names up to 222 bytes, this is not
currently possible because Scylla tacks additional 33 bytes to create
a directory name, and directory names are limited to 255 bytes.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
8839b22277 alternator-test: verify appropriate error when invalid key type is used
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
35bc488f5b alternator: better key type parsing
The supported key types are just S(tring), B(lob), or N(umber).
Other types are valid for attributes, but not for keys, and should
not be accepted. And wrong types used should result in the appropriate
user-visible error.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
c85dcfb71d alternator-test: additional cases of invalid schemas in CreateTable
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
a37b334e16 alternator: better invalid schema detection for CreateTable
To be correct, CreateTable's input parsing need to work in reverse from
what it did: First, the key columns are listed in KeySchema, and then
each of these (and potetially more, e.g., from indexes) need to appear
AttributeDefinitions.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
d52e7de7be alternator-test: tests for CreateTable with bad schema
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
dd884b5552 alternator: better error handling for schema errors in CreateTable
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
6c6c5a37a1 alternator-test: test for PutItem to nonexistant table
We expect to see the right error code, not some "internal error".

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
4ff599b21f alternator: PutItem: appropriate error for a non-existant table
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
6178694b0e alternator-test: add another column to test_basic_string_put_and_get()
Just to make sure our success isn't limited to just a single non-key
attribute, let's add another one.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
700a3bb7be alternator: GetItem should by default returns all the columns, not none
The test

  pytest --local test_item.py::test_basic_string_put_and_get

Now passes.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
eb13166fb5 alternator: change empty return of PutItem
Without any arguments, PutItem should return no data at all. But somehow,
for reasons I don't understand, the boto3 driver gets confused from an
empty JSON thinking it isn't JSON at all. If we return a structure with
an empty "attributes" fields, boto3 is happy.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
325773d65b alternator: add initial implementation of DeleteTable
Add an initial implementation of Delete table, enough for making the

   pytest --local test_table.py::test_create_and_delete_table

Pass.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
4e12a4f212 alternator: on unknown operation, return standard API error
When given an unknown operation (we didn't implement yet many of them...)
we should throw the appropriate api_error, not some random exception.

This allows the client to understand the operation is not supported
and stop retrying - instead of retrying thinking this was a weird
internal error.

For example the test
   pytest --local test_table.py::test_create_and_delete_table

Now fails immediately, saying Unsupported operation DeleteTable.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
3600a096be alternator: fix JSON in DescribeTable response
The structure's name in DescribeTable's output is supposed to be called
"Table", not "TableDescription". Putting in the wrong place caused the
driver's table creation waiters to fail.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
9fbf601f44 alternator: validate table name in CreateTable
validate table name in CreateTable, and if it doesn't fit DynamoDB's
requirement, return the appropriate error as drivers expect.

With this patch, test_table.py::test_create_table_unsupported_names
now passes (albeit with a one minute pause - this a bug with keep-alive
support...).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
7b400913fc alternator-test: test_create_table_unsupported_names minor fix
Check the expected error message to contain just ValidationException
instead of an overly specific text message from DynamoDB, so we aren't
so constraint in our own messages' wording.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
623fd5d10d alternator-test: test for creating table with very long name
Dynamo allows tables names up to 255 characters, but when this is tested on
Alternator, the results are disasterous: mkdir with such a long directory
name fails, Scylla considers this an unrecoverable "I/O error", and exits
the server.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
8c9e88f77d test-table: test DescribeTable on non-existent table
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
4f3b0af6d0 Add "--local" option to run test against local Scylla installation
For example "pytest --local test_item.py"

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
830805e608 test_item.py: basic string put and get test
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
b87057eb08 test_table fixture: be quicker to realize table was created.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
53ac8dfb0a test_table fixture: automatically delete
Automatically delete the test table when the test ends.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
8a19ae8e39 test_item.py: start testing CRUD operations
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
41775a6e5d Start to use "test fixtures"
Start to use "test fixtures" defined in conftest.py: The connection to
the DynamoDB API, and also temporary tables, can be reused between multiple
tests.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
7adde41f55 Add some table tests and README
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
c926f1bb53 alternator: very initial implementation of DescribeTable
This initial implementation is enough to pass a test of getting a
failure for a non-existant table -
test_table.py::test_describe_table_non_existent_table
and to recognize an existing table. But it's still missing a lot
of fields for an existing table (among others, the schema).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
e6ff060944 alternator: errors should be output from server as Dynamo drivers expect
Exceptions from the handlers need to be output in a certain way - as
a JSON with specific fields - as DynamoDB drivers expect them to be.
If a handler throws an alternator::api_error with these specific fields,
they are output, but any other exception is converted into the same
format as an "Internal Error".

After this patch, executor code can throw an alternator::api_error and
the client will receive this error in the right format.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
b5fa4ce3f3 alternator: add alternator::api_error exception type
DynamoDB error messages are returned in JSON format and expect specific
information: Some HTTP error code (often but not always 400), a string
error "type" and a user-readable message. Code that wants to return
user-visible exceptions should use this type, and in the next patch we
will translate it to the appropriate JSON string.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
6e697e4cca alternator: table creation time is in seconds
The "Timestamp" type returned for CreationDateTime can be one of several
things but if it is a number, it is supposed to be the time in *seconds*
since the epoch - not in milliseconds. Returning milliseconds as we
wrongly did causes boto3 (AWS's Python driver) to throw a parse exception
on this response.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Nadav Har'El
b2b7ae4e41 alternator: require alternator-port configuration
Until now, we always opened the Alternator port along with Scylla's
regular ports (CQL etc.). This should really be made optional.

With this patch, by default Alternator does NOT start and does not
open a port. Run Scylla with --alternator-port=8000 to open an Alternator
API port on port 8000, as was the default until now. It's also possible
to set this in scylla.yaml.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-08-19 15:48:16 +03:00
Piotr Sarna
fca1e655b4 alternator: add minimal HTTP interface
The interface works on port 8000 by default and provides
the most basic alternator operations - it's an incomplete
set without validation, meant to allow testing as early as possible.
2019-08-19 15:46:47 +03:00
733 changed files with 6299 additions and 20021 deletions

3
.gitignore vendored
View File

@@ -19,6 +19,3 @@ CMakeLists.txt.user
__pycache__CMakeLists.txt.user
.gdbinit
resources
.pytest_cache
/expressions.tokens
tags

2
.gitmodules vendored
View File

@@ -1,6 +1,6 @@
[submodule "seastar"]
path = seastar
url = ../scylla-seastar
url = ../seastar
ignore = dirty
[submodule "swagger-ui"]
path = swagger-ui

View File

@@ -56,7 +56,7 @@ $ ./configure.py --help
The most important option is:
- `--enable-dpdk`: [DPDK](http://dpdk.org/) is a set of libraries and drivers for fast packet processing. During development, it's not necessary to enable support even if it is supported by your platform.
- `--{enable,disable}-dpdk`: [DPDK](http://dpdk.org/) is a set of libraries and drivers for fast packet processing. During development, it's not necessary to enable support even if it is supported by your platform.
Source files and build targets are tracked manually in `configure.py`, so the script needs to be updated when new files or targets are added or removed.

View File

29
README-DPDK.md Normal file
View File

@@ -0,0 +1,29 @@
Seastar and DPDK
================
Seastar uses the Data Plane Development Kit to drive NIC hardware directly. This
provides an enormous performance boost.
To enable DPDK, specify `--enable-dpdk` to `./configure.py`, and `--dpdk-pmd` as a
run-time parameter. This will use the DPDK package provided as a git submodule with the
seastar sources.
To use your own self-compiled DPDK package, follow this procedure:
1. Setup host to compile DPDK:
- Ubuntu
`sudo apt-get install -y build-essential linux-image-extra-$(uname -r)`
2. Prepare a DPDK SDK:
- Download the latest DPDK release: `wget http://dpdk.org/browse/dpdk/snapshot/dpdk-1.8.0.tar.gz`
- Untar it.
- Edit config/common_linuxapp: set CONFIG_RTE_MBUF_REFCNT and CONFIG_RTE_LIBRTE_KNI to 'n'.
- For DPDK 1.7.x: edit config/common_linuxapp:
- Set CONFIG_RTE_LIBRTE_PMD_BOND to 'n'.
- Set CONFIG_RTE_MBUF_SCATTER_GATHER to 'n'.
- Set CONFIG_RTE_LIBRTE_IP_FRAG to 'n'.
- Start the tools/setup.sh script as root.
- Compile a linuxapp target (option 9).
- Install IGB_UIO module (option 11).
- Bind some physical port to IGB_UIO (option 17).
- Configure hugepage mappings (option 14/15).
3. Run a configure.py: `./configure.py --dpdk-target <Path to untared dpdk-1.8.0 above>/x86_64-native-linuxapp-gcc`.

View File

@@ -38,24 +38,6 @@ Please see [HACKING.md](HACKING.md) for detailed information on building and dev
./build/release/scylla --help
```
## Scylla APIs and compatibility
By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and
Thrift. There is also experimental support for the API of Amazon DynamoDB,
but being experimental it needs to be explicitly enabled to be used. For more
information on how to enable the experimental DynamoDB compatibility in Scylla,
and the current limitations of this feature, see
[Alternator](docs/alternator/alternator.md) and
[Getting started with Alternator](docs/alternator/getting-started.md).
## Documentation
Documentation can be found in [./docs](./docs) and on the
[wiki](https://github.com/scylladb/scylla/wiki). There is currently no clear
definition of what goes where, so when looking for something be sure to check
both.
Seastar documentation can be found [here](http://docs.seastar.io/master/index.html).
User documentation can be found [here](https://docs.scylladb.com/).
## Building Fedora RPM
As a pre-requisite, you need to install [Mock](https://fedoraproject.org/wiki/Mock) on your machine:

View File

@@ -1,7 +1,7 @@
#!/bin/sh
PRODUCT=scylla
VERSION=3.2.5
VERSION=666.development
if test -f version
then

View File

@@ -31,48 +31,3 @@ and ~/.aws/config with the default region to use in the test:
region = us-east-1
```
## HTTPS support
In order to run tests with HTTPS, run pytest with `--https` parameter. Note that the Scylla cluster needs to be provided
with alternator\_https\_port configuration option in order to initialize a HTTPS server.
Moreover, running an instance of a HTTPS server requires a certificate. Here's how to easily generate
a key and a self-signed certificate, which is sufficient to run `--https` tests:
```
openssl genrsa 2048 > scylla.key
openssl req -new -x509 -nodes -sha256 -days 365 -key scylla.key -out scylla.crt
```
If this pair is put into `conf/` directory, it will be enough
to allow the alternator HTTPS server to think it's been authorized and properly certified.
Still, boto3 library issues warnings that the certificate used for communication is self-signed,
and thus should not be trusted. For the sake of running local tests this warning is explicitly ignored.
## Authorization
By default, boto3 prepares a properly signed Authorization header with every request.
In order to confirm the authorization, the server recomputes the signature by using
user credentials (user-provided username + a secret key known by the server),
and then checks if it matches the signature from the header.
Early alternator code did not verify signatures at all, which is also allowed by the protocol.
A partial implementation of the authorization verification can be allowed by providing a Scylla
configuration parameter:
```yaml
alternator_enforce_authorization: true
```
The implementation is currently coupled with Scylla's system\_auth.roles table,
which means that an additional step needs to be performed when setting up Scylla
as the test environment. Tests will use the following credentials:
Username: `alternator`
Secret key: `secret_pass`
With CQLSH, it can be achieved by executing this snipped:
```bash
cqlsh -x "INSERT INTO system_auth.roles (role, salted_hash) VALUES ('alternator', 'secret_pass')"
```
Most tests expect the authorization to succeed, so they will pass even with `alternator_enforce_authorization`
turned off. However, test cases from `test_authorization.py` may require this option to be turned on,
so it's advised.

View File

@@ -43,9 +43,6 @@ if (LooseVersion(botocore.__version__) < LooseVersion('1.12.54')):
def pytest_addoption(parser):
parser.addoption("--aws", action="store_true",
help="run against AWS instead of a local Scylla installation")
parser.addoption("--https", action="store_true",
help="communicate via HTTPS protocol on port 8043 instead of HTTP when"
" running against a local Scylla installation")
# "dynamodb" fixture: set up client object for communicating with the DynamoDB
# API. Currently this chooses either Amazon's DynamoDB in the default region
@@ -62,15 +59,8 @@ def dynamodb(request):
# requires us to specify dummy region and credential parameters,
# otherwise the user is forced to properly configure ~/.aws even
# for local runs.
local_url = 'https://localhost:8043' if request.config.getoption('https') else 'http://localhost:8000'
# Disable verifying in order to be able to use self-signed TLS certificates
verify = not request.config.getoption('https')
# Silencing the 'Unverified HTTPS request warning'
if request.config.getoption('https'):
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
return boto3.resource('dynamodb', endpoint_url=local_url, verify=verify,
region_name='us-east-1', aws_access_key_id='alternator', aws_secret_access_key='secret_pass')
return boto3.resource('dynamodb', endpoint_url='http://localhost:8000',
region_name='us-east-1', aws_access_key_id='whatever', aws_secret_access_key='whatever')
# "test_table" fixture: Create and return a temporary table to be used in tests
# that need a table to work on. The table is automatically deleted at the end.

View File

@@ -1,74 +0,0 @@
# Copyright 2019 ScyllaDB
#
# This file is part of Scylla.
#
# Scylla is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Scylla is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with Scylla. If not, see <http://www.gnu.org/licenses/>.
# Tests for authorization
import pytest
import botocore
from botocore.exceptions import ClientError
import boto3
import requests
# Test that trying to perform an operation signed with a wrong key
# will not succeed
def test_wrong_key_access(request, dynamodb):
print("Please make sure authorization is enforced in your Scylla installation: alternator_enforce_authorization: true")
url = dynamodb.meta.client._endpoint.host
with pytest.raises(ClientError, match='UnrecognizedClientException'):
if url.endswith('.amazonaws.com'):
boto3.client('dynamodb',endpoint_url=url, aws_access_key_id='wrong_id', aws_secret_access_key='').describe_endpoints()
else:
verify = not url.startswith('https')
boto3.client('dynamodb',endpoint_url=url, region_name='us-east-1', aws_access_key_id='whatever', aws_secret_access_key='', verify=verify).describe_endpoints()
# A similar test, but this time the user is expected to exist in the database (for local tests)
def test_wrong_password(request, dynamodb):
print("Please make sure authorization is enforced in your Scylla installation: alternator_enforce_authorization: true")
url = dynamodb.meta.client._endpoint.host
with pytest.raises(ClientError, match='UnrecognizedClientException'):
if url.endswith('.amazonaws.com'):
boto3.client('dynamodb',endpoint_url=url, aws_access_key_id='alternator', aws_secret_access_key='wrong_key').describe_endpoints()
else:
verify = not url.startswith('https')
boto3.client('dynamodb',endpoint_url=url, region_name='us-east-1', aws_access_key_id='alternator', aws_secret_access_key='wrong_key', verify=verify).describe_endpoints()
# A test ensuring that expired signatures are not accepted
def test_expired_signature(dynamodb, test_table):
url = dynamodb.meta.client._endpoint.host
print(url)
headers = {'Content-Type': 'application/x-amz-json-1.0',
'X-Amz-Date': '20170101T010101Z',
'X-Amz-Target': 'DynamoDB_20120810.DescribeEndpoints',
'Authorization': 'AWS4-HMAC-SHA256 Credential=alternator/2/3/4/aws4_request SignedHeaders=x-amz-date;host Signature=123'
}
response = requests.post(url, headers=headers)
assert not response.ok
assert "InvalidSignatureException" in response.text and "Signature expired" in response.text
# A test ensuring that signatures that exceed current time too much are not accepted.
# Watch out - this test is valid only for around next 1000 years, it needs to be updated later.
def test_signature_too_futuristic(dynamodb, test_table):
url = dynamodb.meta.client._endpoint.host
print(url)
headers = {'Content-Type': 'application/x-amz-json-1.0',
'X-Amz-Date': '30200101T010101Z',
'X-Amz-Target': 'DynamoDB_20120810.DescribeEndpoints',
'Authorization': 'AWS4-HMAC-SHA256 Credential=alternator/2/3/4/aws4_request SignedHeaders=x-amz-date;host Signature=123'
}
response = requests.post(url, headers=headers)
assert not response.ok
assert "InvalidSignatureException" in response.text and "Signature not yet current" in response.text

View File

@@ -22,7 +22,7 @@ import boto3
# Test that the DescribeEndpoints operation works as expected: that it
# returns one endpoint (it may return more, but it never does this in
# Amazon), and this endpoint can be used to make more requests.
def test_describe_endpoints(request, dynamodb):
def test_describe_endpoints(dynamodb):
endpoints = dynamodb.meta.client.describe_endpoints()['Endpoints']
# It is not strictly necessary that only a single endpoint be returned,
# but this is what Amazon DynamoDB does today (and so does Alternator).
@@ -34,16 +34,14 @@ def test_describe_endpoints(request, dynamodb):
# send it another describe_endpoints() request ;-) Note that the
# address does not include the "http://" or "https://" prefix, and
# we need to choose one manually.
prefix = "https://" if request.config.getoption('https') else "http://"
verify = not request.config.getoption('https')
url = prefix + address
url = "http://" + address
if address.endswith('.amazonaws.com'):
boto3.client('dynamodb',endpoint_url=url, verify=verify).describe_endpoints()
boto3.client('dynamodb',endpoint_url=url).describe_endpoints()
else:
# Even though we connect to the local installation, Boto3 still
# requires us to specify dummy region and credential parameters,
# otherwise the user is forced to properly configure ~/.aws even
# for local runs.
boto3.client('dynamodb',endpoint_url=url, region_name='us-east-1', aws_access_key_id='alternator', aws_secret_access_key='secret_pass', verify=verify).describe_endpoints()
boto3.client('dynamodb',endpoint_url=url, region_name='us-east-1', aws_access_key_id='whatever', aws_secret_access_key='whatever').describe_endpoints()
# Nothing to check here - if the above call failed with an exception,
# the test would fail.

View File

@@ -56,7 +56,6 @@ def test_update_expression_and_expected(test_table_s):
# and the "false" case, where the condition evaluates to false, so the update
# doesn't happen and we get a ConditionalCheckFailedException instead.
# Tests for Expected with ComparisonOperator = "EQ":
def test_update_expected_1_eq_true(test_table_s):
p = random_string()
test_table_s.update_item(Key={'p': p},
@@ -126,576 +125,6 @@ def test_update_expected_1_eq_false(test_table_s):
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 1}
# Tests for Expected with ComparisonOperator = "NE":
def test_update_expected_1_ne_true(test_table_s):
p = random_string()
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'}})
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NE',
'AttributeValueList': [2]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 1, 'b': 3}
# For NE, AttributeValueList must have a single element
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NE',
'AttributeValueList': [2, 3]}}
)
# If the types are different, this is considered "not equal":
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 4, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NE',
'AttributeValueList': ["1"]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 1, 'b': 4}
# If the attribute does not exist at all, this is also considered "not equal":
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 5, 'Action': 'PUT'}},
Expected={'q': {'ComparisonOperator': 'NE',
'AttributeValueList': [1]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 1, 'b': 5}
def test_update_expected_1_ne_false(test_table_s):
p = random_string()
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'}})
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NE',
'AttributeValueList': [1]}}
)
# Tests for Expected with ComparisonOperator = "LE":
@pytest.mark.xfail(reason="ComparisonOperator=LE in Expected not yet implemented")
def test_update_expected_1_le(test_table_s):
p = random_string()
# LE should work for string, number, and binary type
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
'b': {'Value': 'cat', 'Action': 'PUT'},
'c': {'Value': bytearray('cat', 'utf-8'), 'Action': 'PUT'}})
# true cases:
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'LE',
'AttributeValueList': [2]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'LE',
'AttributeValueList': [1]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'LE',
'AttributeValueList': ['dog']}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'LE',
'AttributeValueList': [bytearray('dog', 'utf-8')]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
# false cases:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'LE',
'AttributeValueList': [0]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'LE',
'AttributeValueList': ['aardvark']}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'LE',
'AttributeValueList': [bytearray('aardvark', 'utf-8')]}}
)
# If the types are different, this is also considered false
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'LE',
'AttributeValueList': ["1"]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
# For LE, AttributeValueList must have a single element
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'LE',
'AttributeValueList': [2, 3]}}
)
# Tests for Expected with ComparisonOperator = "LT":
def test_update_expected_1_lt(test_table_s):
p = random_string()
# LT should work for string, number, and binary type
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
'b': {'Value': 'cat', 'Action': 'PUT'},
'c': {'Value': bytearray('cat', 'utf-8'), 'Action': 'PUT'}})
# true cases:
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'LT',
'AttributeValueList': [2]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'LT',
'AttributeValueList': ['dog']}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'LT',
'AttributeValueList': [bytearray('dog', 'utf-8')]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
# false cases:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'LT',
'AttributeValueList': [1]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'LT',
'AttributeValueList': [0]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'LT',
'AttributeValueList': ['aardvark']}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'LT',
'AttributeValueList': [bytearray('aardvark', 'utf-8')]}}
)
# If the types are different, this is also considered false
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'LT',
'AttributeValueList': ["1"]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
# For LT, AttributeValueList must have a single element
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'LT',
'AttributeValueList': [2, 3]}}
)
# Tests for Expected with ComparisonOperator = "GE":
@pytest.mark.xfail(reason="ComparisonOperator=GE in Expected not yet implemented")
def test_update_expected_1_ge(test_table_s):
p = random_string()
# GE should work for string, number, and binary type
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
'b': {'Value': 'cat', 'Action': 'PUT'},
'c': {'Value': bytearray('cat', 'utf-8'), 'Action': 'PUT'}})
# true cases:
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'GE',
'AttributeValueList': [0]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'GE',
'AttributeValueList': [1]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'GE',
'AttributeValueList': ['aardvark']}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'GE',
'AttributeValueList': [bytearray('aardvark', 'utf-8')]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
# false cases:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'GE',
'AttributeValueList': [3]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'GE',
'AttributeValueList': ['dog']}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'GE',
'AttributeValueList': [bytearray('dog', 'utf-8')]}}
)
# If the types are different, this is also considered false
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'GE',
'AttributeValueList': ["1"]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
# For GE, AttributeValueList must have a single element
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'GE',
'AttributeValueList': [2, 3]}}
)
# Tests for Expected with ComparisonOperator = "GT":
def test_update_expected_1_gt(test_table_s):
p = random_string()
# GT should work for string, number, and binary type
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
'b': {'Value': 'cat', 'Action': 'PUT'},
'c': {'Value': bytearray('cat', 'utf-8'), 'Action': 'PUT'}})
# true cases:
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'GT',
'AttributeValueList': [0]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'GT',
'AttributeValueList': ['aardvark']}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'GT',
'AttributeValueList': [bytearray('aardvark', 'utf-8')]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
# false cases:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'GT',
'AttributeValueList': [3]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'GT',
'AttributeValueList': [1]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'GT',
'AttributeValueList': ['dog']}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'GT',
'AttributeValueList': [bytearray('dog', 'utf-8')]}}
)
# If the types are different, this is also considered false
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'GT',
'AttributeValueList': ["1"]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
# For GE, AttributeValueList must have a single element
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'GT',
'AttributeValueList': [2, 3]}}
)
# Tests for Expected with ComparisonOperator = "NOT_NULL":
def test_update_expected_1_not_null(test_table_s):
# Note that despite its name, the "NOT_NULL" comparison operator doesn't check if
# the attribute has the type "NULL", or an empty value. Rather it is explicitly
# documented to check if the attribute exists at all.
p = random_string()
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
'b': {'Value': 'cat', 'Action': 'PUT'},
'c': {'Value': None, 'Action': 'PUT'}})
# true cases:
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NOT_NULL', 'AttributeValueList': []}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'NOT_NULL', 'AttributeValueList': []}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'NOT_NULL', 'AttributeValueList': []}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
# false cases:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'q': {'ComparisonOperator': 'NOT_NULL', 'AttributeValueList': []}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
# For NOT_NULL, AttributeValueList must be empty
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NOT_NULL', 'AttributeValueList': [2]}}
)
# Tests for Expected with ComparisonOperator = "NULL":
def test_update_expected_1_null(test_table_s):
# Note that despite its name, the "NULL" comparison operator doesn't check if
# the attribute has the type "NULL", or an empty value. Rather it is explicitly
# documented to check if the attribute exists at all.
p = random_string()
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 1, 'Action': 'PUT'},
'b': {'Value': 'cat', 'Action': 'PUT'},
'c': {'Value': None, 'Action': 'PUT'}})
# true cases:
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'q': {'ComparisonOperator': 'NULL', 'AttributeValueList': []}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
# false cases:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NULL', 'AttributeValueList': []}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'NULL', 'AttributeValueList': []}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'NULL', 'AttributeValueList': []}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
# For NULL, AttributeValueList must be empty
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NULL', 'AttributeValueList': [2]}}
)
# Tests for Expected with ComparisonOperator = "CONTAINS":
@pytest.mark.xfail(reason="ComparisonOperator=CONTAINS in Expected not yet implemented")
def test_update_expected_1_contains(test_table_s):
# true cases. CONTAINS can be used for two unrelated things: check substrings
# (in string or binary) and membership (in set or list).
p = random_string()
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 'hello', 'Action': 'PUT'},
'b': {'Value': set([2, 4, 7]), 'Action': 'PUT'},
'c': {'Value': [2, 4, 7], 'Action': 'PUT'},
'd': {'Value': bytearray('hi there', 'utf-8'), 'Action': 'PUT'}})
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': ['ell']}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [4]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
# The CONTAINS documentation uses confusing wording on whether it works
# only on sets, or also on lists. In fact, it does work on lists:
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [4]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
Expected={'d': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [bytearray('here', 'utf-8')]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
# false cases:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': ['dog']}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [1]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [1]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [1]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'q': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [1]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'d': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [bytearray('dog', 'utf-8')]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
# For CONTAINS, AttributeValueList must have just one item, and it must be
# a string, number or binary
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [2, 3]}}
)
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': []}}
)
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'CONTAINS', 'AttributeValueList': [[1]]}}
)
# Tests for Expected with ComparisonOperator = "NOT_CONTAINS":
@pytest.mark.xfail(reason="ComparisonOperator=NOT_CONTAINS in Expected not yet implemented")
def test_update_expected_1_not_contains(test_table_s):
# true cases. NOT_CONTAINS can be used for two unrelated things: check substrings
# (in string or binary) and membership (in set or list).
p = random_string()
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 'hello', 'Action': 'PUT'},
'b': {'Value': set([2, 4, 7]), 'Action': 'PUT'},
'c': {'Value': [2, 4, 7], 'Action': 'PUT'},
'd': {'Value': bytearray('hi there', 'utf-8'), 'Action': 'PUT'}})
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': ['dog']}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [1]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [1]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [1]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 7, 'Action': 'PUT'}},
Expected={'d': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [bytearray('dog', 'utf-8')]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 7
# false cases:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': ['ell']}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [4]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [4]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
Expected={'d': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [bytearray('here', 'utf-8')]}}
)
# Surprisingly, if an attribute does not exist at all, NOT_CONTAINS
# fails, rather than succeeding. This is surprising because it means in
# this case both CONTAINS and NOT_CONTAINS are false, and because "NE" does not
# behave this way (if the attribute does not exist, NE succeeds).
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'q': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [1]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 7
# For NOT_CONTAINS, AttributeValueList must have just one item, and it must be
# a string, number or binary
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [2, 3]}}
)
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': []}}
)
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 17, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'NOT_CONTAINS', 'AttributeValueList': [[1]]}}
)
# Tests for Expected with ComparisonOperator = "BEGINS_WITH":
def test_update_expected_1_begins_with_true(test_table_s):
p = random_string()
test_table_s.update_item(Key={'p': p},
@@ -707,163 +136,38 @@ def test_update_expected_1_begins_with_true(test_table_s):
'AttributeValueList': ['hell']}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 'hello', 'b': 3}
# For BEGINS_WITH, AttributeValueList must have a single element
# For BEGIN_WITH, AttributeValueList must have a single element
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BEGINS_WITH',
Expected={'a': {'ComparisonOperator': 'EQ',
'AttributeValueList': ['hell', 'heaven']}}
)
def test_update_expected_1_begins_with_false(test_table_s):
p = random_string()
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 'hello', 'Action': 'PUT'},
'x': {'Value': 3, 'Action': 'PUT'}})
AttributeUpdates={'a': {'Value': 'hello', 'Action': 'PUT'}})
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BEGINS_WITH',
Expected={'a': {'ComparisonOperator': 'EQ',
'AttributeValueList': ['dog']}}
)
# BEGINS_WITH requires String or Binary operand, giving it a number
# results with a ValidationException (not a normal failed condition):
with pytest.raises(ClientError, match='ValidationException'):
# Although BEGINS_WITH requires String or Binary type, giving it a
# number results not with a ValidationException but rather a
# failed condition (ConditionalCheckFailedException)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BEGINS_WITH',
Expected={'a': {'ComparisonOperator': 'EQ',
'AttributeValueList': [3]}}
)
# However, if we try to compare the attribute to a String or Binary, and
# the attribute value itself is a number, this is just a failed condition:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'b': {'Value': 3, 'Action': 'PUT'}},
Expected={'x': {'ComparisonOperator': 'BEGINS_WITH',
'AttributeValueList': ['dog']}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 'hello', 'x': 3}
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item'] == {'p': p, 'a': 'hello'}
# Tests for Expected with ComparisonOperator = "IN":
def test_update_expected_1_in(test_table_s):
# Some copies of "IN"'s documentation are outright wrong: "IN" checks
# whether the attribute value is in the give list of values. It does NOT
# do the opposite - testing whether certain items are in a set attribute.
p = random_string()
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': set([2, 4, 7]), 'Action': 'PUT'},
'c': {'Value': 3, 'Action': 'PUT'}})
# true cases:
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'IN', 'AttributeValueList': [2, 3, 8]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
# false cases:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'IN', 'AttributeValueList': [1, 2, 4]}}
)
# a bunch of wrong interpretations of what the heck that "IN" does :-(
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'IN', 'AttributeValueList': [2]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'IN', 'AttributeValueList': [1, 2, 4, 7, 8]}}
)
# Strangely, all the items in AttributeValueList must be of the same type,
# we can't check if an item is either the number 3 or the string 'dog',
# although allowing this case as well would have been easy:
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'IN', 'AttributeValueList': [3, 'dog']}}
)
# Empty list is not allowed
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'IN', 'AttributeValueList': []}}
)
# Non-scalar attribute values are not allowed
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'IN', 'AttributeValueList': [[1], [2]]}}
)
# FIXME: need to test many more ComparisonOperator options... See full list in
# description in https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LegacyConditionalParameters.Expected.html
# Tests for Expected with ComparisonOperator = "BETWEEN":
@pytest.mark.xfail(reason="ComparisonOperator=BETWEEN in Expected not yet implemented")
def test_update_expected_1_between(test_table_s):
p = random_string()
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'a': {'Value': 2, 'Action': 'PUT'},
'b': {'Value': 'cat', 'Action': 'PUT'},
'c': {'Value': bytearray('cat', 'utf-8'), 'Action': 'PUT'}})
# true cases:
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [1, 3]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 2
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 3, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [1, 2]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 3
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 4, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [2, 3]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 4
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 5, 'Action': 'PUT'}},
Expected={'b': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': ['aardvark', 'dog']}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 5
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 6, 'Action': 'PUT'}},
Expected={'c': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [bytearray('aardvark', 'utf-8'), bytearray('dog', 'utf-8')]}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 6
# false cases:
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [0, 1]}}
)
with pytest.raises(ClientError, match='ConditionalCheckFailedException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': ['cat', 'dog']}}
)
assert test_table_s.get_item(Key={'p': p}, ConsistentRead=True)['Item']['z'] == 6
# The given AttributeValueList array must contain exactly two items of the
# same type, and in the right order. Any other input is considered a validation
# error:
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': []}})
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [2, 3, 4]}})
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [4, 3]}})
with pytest.raises(ClientError, match='ValidationException'):
test_table_s.update_item(Key={'p': p},
AttributeUpdates={'z': {'Value': 2, 'Action': 'PUT'}},
Expected={'a': {'ComparisonOperator': 'BETWEEN', 'AttributeValueList': [4, 'dog']}})
##############################################################################
# Instead of ComparisonOperator and AttributeValueList, one can specify either
# Value or Exists:
def test_update_expected_1_value_true(test_table_s):

View File

@@ -1,34 +0,0 @@
# Copyright 2019 ScyllaDB
#
# This file is part of Scylla.
#
# Scylla is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Scylla is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with Scylla. If not, see <http://www.gnu.org/licenses/>.
# Tests for the health check
import requests
# Test that a health check can be performed with a GET packet
def test_health_works(dynamodb):
url = dynamodb.meta.client._endpoint.host
response = requests.get(url)
assert response.ok
assert response.content.decode('utf-8').strip() == 'healthy: {}'.format(url.replace('https://', '').replace('http://', ''))
# Test that a health check only works for the root URL ('/')
def test_health_only_works_for_root_path(dynamodb):
url = dynamodb.meta.client._endpoint.host
for suffix in ['/abc', '/..', '/-', '/index.htm', '/health']:
response = requests.get(url + suffix)
assert response.status_code in range(400, 405)

View File

@@ -584,7 +584,7 @@ def test_update_expression_if_not_exists(test_table_s):
# value may itself be a function call - ad infinitum. So expressions like
# list_append(if_not_exists(a, :val1), :val2) are legal and so is deeper
# nesting.
@pytest.mark.xfail(reason="for unknown reason, DynamoDB does not allow nesting list_append")
@pytest.mark.xfail(reason="SET functions not yet implemented")
def test_update_expression_function_nesting(test_table_s):
p = random_string()
test_table_s.update_item(Key={'p': p},

View File

@@ -1,146 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "alternator/error.hh"
#include "log.hh"
#include <string>
#include <string_view>
#include <gnutls/crypto.h>
#include <seastar/util/defer.hh>
#include "hashers.hh"
#include "bytes.hh"
#include "alternator/auth.hh"
#include <fmt/format.h>
#include "auth/common.hh"
#include "auth/password_authenticator.hh"
#include "auth/roles-metadata.hh"
#include "cql3/query_processor.hh"
#include "cql3/untyped_result_set.hh"
namespace alternator {
static logging::logger alogger("alternator-auth");
static hmac_sha256_digest hmac_sha256(std::string_view key, std::string_view msg) {
hmac_sha256_digest digest;
int ret = gnutls_hmac_fast(GNUTLS_MAC_SHA256, key.data(), key.size(), msg.data(), msg.size(), digest.data());
if (ret) {
throw std::runtime_error(fmt::format("Computing HMAC failed ({}): {}", ret, gnutls_strerror(ret)));
}
return digest;
}
static hmac_sha256_digest get_signature_key(std::string_view key, std::string_view date_stamp, std::string_view region_name, std::string_view service_name) {
auto date = hmac_sha256("AWS4" + std::string(key), date_stamp);
auto region = hmac_sha256(std::string_view(date.data(), date.size()), region_name);
auto service = hmac_sha256(std::string_view(region.data(), region.size()), service_name);
auto signing = hmac_sha256(std::string_view(service.data(), service.size()), "aws4_request");
return signing;
}
static std::string apply_sha256(std::string_view msg) {
sha256_hasher hasher;
hasher.update(msg.data(), msg.size());
return to_hex(hasher.finalize());
}
static std::string format_time_point(db_clock::time_point tp) {
time_t time_point_repr = db_clock::to_time_t(tp);
std::string time_point_str;
time_point_str.resize(17);
// strftime prints the terminating null character as well
std::strftime(time_point_str.data(), time_point_str.size(), "%Y%m%dT%H%M%SZ", std::gmtime(&time_point_repr));
time_point_str.resize(16);
return time_point_str;
}
void check_expiry(std::string_view signature_date) {
//FIXME: The default 15min can be changed with X-Amz-Expires header - we should honor it
std::string expiration_str = format_time_point(db_clock::now() - 15min);
std::string validity_str = format_time_point(db_clock::now() + 15min);
if (signature_date < expiration_str) {
throw api_error("InvalidSignatureException",
fmt::format("Signature expired: {} is now earlier than {} (current time - 15 min.)",
signature_date, expiration_str));
}
if (signature_date > validity_str) {
throw api_error("InvalidSignatureException",
fmt::format("Signature not yet current: {} is still later than {} (current time + 15 min.)",
signature_date, validity_str));
}
}
std::string get_signature(std::string_view access_key_id, std::string_view secret_access_key, std::string_view host, std::string_view method,
std::string_view orig_datestamp, std::string_view signed_headers_str, const std::map<std::string_view, std::string_view>& signed_headers_map,
std::string_view body_content, std::string_view region, std::string_view service, std::string_view query_string) {
auto amz_date_it = signed_headers_map.find("x-amz-date");
if (amz_date_it == signed_headers_map.end()) {
throw api_error("InvalidSignatureException", "X-Amz-Date header is mandatory for signature verification");
}
std::string_view amz_date = amz_date_it->second;
check_expiry(amz_date);
std::string_view datestamp = amz_date.substr(0, 8);
if (datestamp != orig_datestamp) {
throw api_error("InvalidSignatureException",
format("X-Amz-Date date does not match the provided datestamp. Expected {}, got {}",
orig_datestamp, datestamp));
}
std::string_view canonical_uri = "/";
std::stringstream canonical_headers;
for (const auto& header : signed_headers_map) {
canonical_headers << fmt::format("{}:{}", header.first, header.second) << '\n';
}
std::string payload_hash = apply_sha256(body_content);
std::string canonical_request = fmt::format("{}\n{}\n{}\n{}\n{}\n{}", method, canonical_uri, query_string, canonical_headers.str(), signed_headers_str, payload_hash);
std::string_view algorithm = "AWS4-HMAC-SHA256";
std::string credential_scope = fmt::format("{}/{}/{}/aws4_request", datestamp, region, service);
std::string string_to_sign = fmt::format("{}\n{}\n{}\n{}", algorithm, amz_date, credential_scope, apply_sha256(canonical_request));
hmac_sha256_digest signing_key = get_signature_key(secret_access_key, datestamp, region, service);
hmac_sha256_digest signature = hmac_sha256(std::string_view(signing_key.data(), signing_key.size()), string_to_sign);
return to_hex(bytes_view(reinterpret_cast<const int8_t*>(signature.data()), signature.size()));
}
future<std::string> get_key_from_roles(cql3::query_processor& qp, std::string username) {
static const sstring query = format("SELECT salted_hash FROM {} WHERE {} = ?",
auth::meta::roles_table::qualified_name(), auth::meta::roles_table::role_col_name);
auto cl = auth::password_authenticator::consistency_for_user(username);
auto timeout = auth::internal_distributed_timeout_config();
return qp.process(query, cl, timeout, {sstring(username)}, true).then_wrapped([username = std::move(username)] (future<::shared_ptr<cql3::untyped_result_set>> f) {
auto res = f.get0();
auto salted_hash = std::optional<sstring>();
if (res->empty()) {
throw api_error("UnrecognizedClientException", fmt::format("User not found: {}", username));
}
salted_hash = res->one().get_opt<sstring>("salted_hash");
if (!salted_hash) {
throw api_error("UnrecognizedClientException", fmt::format("No password found for user: {}", username));
}
return make_ready_future<std::string>(*salted_hash);
});
}
}

View File

@@ -1,46 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <string>
#include <string_view>
#include <array>
#include "gc_clock.hh"
#include "utils/loading_cache.hh"
namespace cql3 {
class query_processor;
}
namespace alternator {
using hmac_sha256_digest = std::array<char, 32>;
using key_cache = utils::loading_cache<std::string, std::string>;
std::string get_signature(std::string_view access_key_id, std::string_view secret_access_key, std::string_view host, std::string_view method,
std::string_view orig_datestamp, std::string_view signed_headers_str, const std::map<std::string_view, std::string_view>& signed_headers_map,
std::string_view body_content, std::string_view region, std::string_view service, std::string_view query_string);
future<std::string> get_key_from_roles(cql3::query_processor& qp, std::string username);
}

View File

@@ -21,14 +21,8 @@
#pragma once
#include <string_view>
#include <string>
#include "bytes.hh"
#include "rjson.hh"
std::string base64_encode(bytes_view);
bytes base64_decode(std::string_view);
inline bytes base64_decode(const rjson::value& v) {
return base64_decode(std::string_view(v.GetString(), v.GetStringLength()));
}

View File

@@ -27,8 +27,6 @@
#include "cql3/constants.hh"
#include <unordered_map>
#include "rjson.hh"
#include "serialization.hh"
#include "base64.hh"
namespace alternator {
@@ -37,17 +35,13 @@ static logging::logger clogger("alternator-conditions");
comparison_operator_type get_comparison_operator(const rjson::value& comparison_operator) {
static std::unordered_map<std::string, comparison_operator_type> ops = {
{"EQ", comparison_operator_type::EQ},
{"NE", comparison_operator_type::NE},
{"LE", comparison_operator_type::LE},
{"LT", comparison_operator_type::LT},
{"GE", comparison_operator_type::GE},
{"GT", comparison_operator_type::GT},
{"IN", comparison_operator_type::IN},
{"NULL", comparison_operator_type::IS_NULL},
{"NOT_NULL", comparison_operator_type::NOT_NULL},
{"BETWEEN", comparison_operator_type::BETWEEN},
{"BEGINS_WITH", comparison_operator_type::BEGINS_WITH},
}; //TODO: CONTAINS
}; //TODO(sarna): NE, IN, CONTAINS, NULL, NOT_NULL
if (!comparison_operator.IsString()) {
throw api_error("ValidationException", format("Invalid comparison operator definition {}", rjson::print(comparison_operator)));
}
@@ -102,165 +96,32 @@ static ::shared_ptr<cql3::restrictions::single_column_restriction::EQ> make_key_
return filtering_restrictions;
}
namespace {
struct size_check {
// True iff size passes this check.
virtual bool operator()(rapidjson::SizeType size) const = 0;
// Check description, such that format("expected array {}", check.what()) is human-readable.
virtual sstring what() const = 0;
};
class exact_size : public size_check {
rapidjson::SizeType _expected;
public:
explicit exact_size(rapidjson::SizeType expected) : _expected(expected) {}
bool operator()(rapidjson::SizeType size) const override { return size == _expected; }
sstring what() const override { return format("of size {}", _expected); }
};
struct empty : public size_check {
bool operator()(rapidjson::SizeType size) const override { return size < 1; }
sstring what() const override { return "to be empty"; }
};
struct nonempty : public size_check {
bool operator()(rapidjson::SizeType size) const override { return size > 0; }
sstring what() const override { return "to be non-empty"; }
};
} // anonymous namespace
// Check that array has the expected number of elements
static void verify_operand_count(const rjson::value* array, const size_check& expected, const rjson::value& op) {
if (!array || !array->IsArray()) {
throw api_error("ValidationException", "With ComparisonOperator, AttributeValueList must be given and an array");
}
if (!expected(array->Size())) {
throw api_error("ValidationException",
format("{} operator requires AttributeValueList {}, instead found list size {}",
op, expected.what(), array->Size()));
}
}
// Check if two JSON-encoded values match with the EQ relation
static bool check_EQ(const rjson::value* v1, const rjson::value& v2) {
return v1 && *v1 == v2;
}
// Check if two JSON-encoded values match with the NE relation
static bool check_NE(const rjson::value* v1, const rjson::value& v2) {
return !v1 || *v1 != v2; // null is unequal to anything.
static bool check_EQ(const rjson::value& v1, const rjson::value& v2) {
return v1 == v2;
}
// Check if two JSON-encoded values match with the BEGINS_WITH relation
static bool check_BEGINS_WITH(const rjson::value* v1, const rjson::value& v2) {
// BEGINS_WITH requires that its single operand (v2) be a string or
// binary - otherwise it's a validation error. However, problems with
// the stored attribute (v1) will just return false (no match).
if (!v2.IsObject() || v2.MemberCount() != 1) {
throw api_error("ValidationException", format("BEGINS_WITH operator encountered malformed AttributeValue: {}", v2));
}
auto it2 = v2.MemberBegin();
if (it2->name != "S" && it2->name != "B") {
throw api_error("ValidationException", format("BEGINS_WITH operator requires String or Binary in AttributeValue, got {}", it2->name));
}
if (!v1 || !v1->IsObject() || v1->MemberCount() != 1) {
static bool check_BEGINS_WITH(const rjson::value& v1, const rjson::value& v2) {
// BEGINS_WITH only supports comparing two strings or two binaries -
// any other combinations of types, or other malformed values, return
// false (no match).
if (!v1.IsObject() || v1.MemberCount() != 1 || !v2.IsObject() || v2.MemberCount() != 1) {
return false;
}
auto it1 = v1->MemberBegin();
auto it1 = v1.MemberBegin();
auto it2 = v2.MemberBegin();
if (it1->name != it2->name) {
return false;
}
if (it1->name != "S" && it1->name != "B") {
return false;
}
std::string_view val1(it1->value.GetString(), it1->value.GetStringLength());
std::string_view val2(it2->value.GetString(), it2->value.GetStringLength());
return val1.substr(0, val2.size()) == val2;
}
// Check if a JSON-encoded value equals any element of an array, which must have at least one element.
static bool check_IN(const rjson::value* val, const rjson::value& array) {
if (!array[0].IsObject() || array[0].MemberCount() != 1) {
throw api_error("ValidationException",
format("IN operator encountered malformed AttributeValue: {}", array[0]));
}
const auto& type = array[0].MemberBegin()->name;
if (type != "S" && type != "N" && type != "B") {
throw api_error("ValidationException",
"IN operator requires AttributeValueList elements to be of type String, Number, or Binary ");
}
if (!val) {
return false;
}
bool have_match = false;
for (const auto& elem : array.GetArray()) {
if (!elem.IsObject() || elem.MemberCount() != 1 || elem.MemberBegin()->name != type) {
throw api_error("ValidationException",
"IN operator requires all AttributeValueList elements to have the same type ");
}
if (!have_match && *val == elem) {
// Can't return yet, must check types of all array elements. <sigh>
have_match = true;
}
}
return have_match;
}
static bool check_NULL(const rjson::value* val) {
return val == nullptr;
}
static bool check_NOT_NULL(const rjson::value* val) {
return val != nullptr;
}
// Check if two JSON-encoded values match with cmp.
template <typename Comparator>
bool check_compare(const rjson::value* v1, const rjson::value& v2, const Comparator& cmp) {
if (!v2.IsObject() || v2.MemberCount() != 1) {
throw api_error("ValidationException",
format("{} requires a single AttributeValue of type String, Number, or Binary",
cmp.diagnostic()));
}
const auto& kv2 = *v2.MemberBegin();
if (kv2.name != "S" && kv2.name != "N" && kv2.name != "B") {
throw api_error("ValidationException",
format("{} requires a single AttributeValue of type String, Number, or Binary",
cmp.diagnostic()));
}
if (!v1 || !v1->IsObject() || v1->MemberCount() != 1) {
return false;
}
const auto& kv1 = *v1->MemberBegin();
if (kv1.name != kv2.name) {
return false;
}
if (kv1.name == "N") {
return cmp(unwrap_number(*v1, cmp.diagnostic()), unwrap_number(v2, cmp.diagnostic()));
}
if (kv1.name == "S") {
return cmp(std::string_view(kv1.value.GetString(), kv1.value.GetStringLength()),
std::string_view(kv2.value.GetString(), kv2.value.GetStringLength()));
}
if (kv1.name == "B") {
return cmp(base64_decode(kv1.value), base64_decode(kv2.value));
}
clogger.error("check_compare panic: LHS type equals RHS type, but one is in {N,S,B} while the other isn't");
return false;
}
struct cmp_lt {
template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs < rhs; }
const char* diagnostic() const { return "LT operator"; }
};
struct cmp_gt {
// bytes only has <
template <typename T> bool operator()(const T& lhs, const T& rhs) const { return rhs < lhs; }
const char* diagnostic() const { return "GT operator"; }
};
// Verify one Expect condition on one attribute (whose content is "got")
// for the verify_expected() below.
// This function returns true or false depending on whether the condition
@@ -281,7 +142,7 @@ static bool verify_expected_one(const rjson::value& condition, const rjson::valu
if (comparison_operator) {
throw api_error("ValidationException", "Cannot combine Value with ComparisonOperator");
}
return check_EQ(got, *value);
return got && check_EQ(*got, *value);
} else if (exists) {
if (comparison_operator) {
throw api_error("ValidationException", "Cannot combine Exists with ComparisonOperator");
@@ -295,32 +156,29 @@ static bool verify_expected_one(const rjson::value& condition, const rjson::valu
if (!comparison_operator) {
throw api_error("ValidationException", "Missing ComparisonOperator, Value or Exists");
}
if (!attribute_value_list || !attribute_value_list->IsArray()) {
throw api_error("ValidationException", "With ComparisonOperator, AttributeValueList must be given and an array");
}
comparison_operator_type op = get_comparison_operator(*comparison_operator);
switch (op) {
case comparison_operator_type::EQ:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_EQ(got, (*attribute_value_list)[0]);
case comparison_operator_type::NE:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_NE(got, (*attribute_value_list)[0]);
case comparison_operator_type::LT:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_compare(got, (*attribute_value_list)[0], cmp_lt{});
case comparison_operator_type::GT:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_compare(got, (*attribute_value_list)[0], cmp_gt{});
if (attribute_value_list->Size() != 1) {
throw api_error("ValidationException", "EQ operator requires one element in AttributeValueList");
}
if (got) {
const rjson::value& expected = (*attribute_value_list)[0];
return check_EQ(*got, expected);
}
return false;
case comparison_operator_type::BEGINS_WITH:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_BEGINS_WITH(got, (*attribute_value_list)[0]);
case comparison_operator_type::IN:
verify_operand_count(attribute_value_list, nonempty(), *comparison_operator);
return check_IN(got, *attribute_value_list);
case comparison_operator_type::IS_NULL:
verify_operand_count(attribute_value_list, empty(), *comparison_operator);
return check_NULL(got);
case comparison_operator_type::NOT_NULL:
verify_operand_count(attribute_value_list, empty(), *comparison_operator);
return check_NOT_NULL(got);
if (attribute_value_list->Size() != 1) {
throw api_error("ValidationException", "BEGINS_WITH operator requires one element in AttributeValueList");
}
if (got) {
const rjson::value& expected = (*attribute_value_list)[0];
return check_BEGINS_WITH(*got, expected);
}
return false;
default:
// FIXME: implement all the missing types, so there will be no default here.
throw api_error("ValidationException", format("ComparisonOperator {} is not yet supported", *comparison_operator));

View File

@@ -49,7 +49,6 @@
#include "utils/big_decimal.hh"
#include "seastar/json/json_elements.hh"
#include <boost/algorithm/cxx11/any_of.hpp>
#include "collection_mutation.hh"
#include <boost/range/adaptors.hpp>
@@ -607,8 +606,8 @@ public:
void del(bytes&& name, api::timestamp_type ts) {
add(std::move(name), atomic_cell::make_dead(ts, gc_clock::now()));
}
collection_mutation_description to_mut() {
collection_mutation_description ret;
collection_type_impl::mutation to_mut() {
collection_type_impl::mutation ret;
for (auto&& e : collected) {
ret.cells.emplace_back(e.first, std::move(e.second));
}
@@ -644,7 +643,7 @@ static mutation make_item_mutation(const rjson::value& item, schema_ptr schema)
}
if (!attrs_collector.empty()) {
auto serialized_map = attrs_collector.to_mut().serialize(*attrs_type());
auto serialized_map = attrs_type()->serialize_mutation_form(attrs_collector.to_mut());
row.cells().apply(attrs_column(*schema), std::move(serialized_map));
}
// To allow creation of an item with no attributes, we need a row marker.
@@ -668,7 +667,6 @@ static db::timeout_clock::time_point default_timeout() {
static future<std::unique_ptr<rjson::value>> maybe_get_previous_item(
service::storage_proxy& proxy,
service::client_state& client_state,
schema_ptr schema,
const rjson::value& item,
bool need_read_before_write,
@@ -692,7 +690,7 @@ future<json::json_return_type> executor::put_item(client_state& client_state, st
mutation m = make_item_mutation(item, schema);
return maybe_get_previous_item(_proxy, client_state, schema, item, has_expected, _stats).then(
return maybe_get_previous_item(_proxy, schema, item, has_expected, _stats).then(
[this, schema, has_expected, update_info = rjson::copy(update_info), m = std::move(m),
&client_state, start_time] (std::unique_ptr<rjson::value> previous_item) mutable {
if (has_expected) {
@@ -744,7 +742,7 @@ future<json::json_return_type> executor::delete_item(client_state& client_state,
mutation m = make_delete_item_mutation(key, schema);
check_key(key, schema);
return maybe_get_previous_item(_proxy, client_state, schema, key, has_expected, _stats).then(
return maybe_get_previous_item(_proxy, schema, key, has_expected, _stats).then(
[this, schema, has_expected, update_info = rjson::copy(update_info), m = std::move(m),
&client_state, start_time] (std::unique_ptr<rjson::value> previous_item) mutable {
if (has_expected) {
@@ -1041,11 +1039,31 @@ static rjson::value set_diff(const rjson::value& v1, const rjson::value& v2) {
return ret;
}
// Check if a given JSON object encodes a number (i.e., it is a {"N": [...]}
// and returns an object representing it.
static big_decimal unwrap_number(const rjson::value& v) {
if (!v.IsObject() || v.MemberCount() != 1) {
throw api_error("ValidationException", "UpdateExpression: invalid number object");
}
auto it = v.MemberBegin();
if (it->name != "N") {
throw api_error("ValidationException",
format("UpdateExpression: expected number, found type '{}'", it->name));
}
if (it->value.IsNumber()) {
return big_decimal(rjson::print(it->value)); // FIXME(sarna): should use big_decimal constructor with numeric values directly
}
if (!it->value.IsString()) {
throw api_error("ValidationException", "UpdateExpression: improperly formatted number constant");
}
return big_decimal(it->value.GetString());
}
// Take two JSON-encoded numeric values ({"N": "thenumber"}) and return the
// sum, again as a JSON-encoded number.
static rjson::value number_add(const rjson::value& v1, const rjson::value& v2) {
auto n1 = unwrap_number(v1, "UpdateExpression");
auto n2 = unwrap_number(v2, "UpdateExpression");
auto n1 = unwrap_number(v1);
auto n2 = unwrap_number(v2);
rjson::value ret = rjson::empty_object();
std::string str_ret = std::string((n1 + n2).to_string());
rjson::set(ret, "N", rjson::from_string(str_ret));
@@ -1053,8 +1071,8 @@ static rjson::value number_add(const rjson::value& v1, const rjson::value& v2) {
}
static rjson::value number_subtract(const rjson::value& v1, const rjson::value& v2) {
auto n1 = unwrap_number(v1, "UpdateExpression");
auto n2 = unwrap_number(v2, "UpdateExpression");
auto n1 = unwrap_number(v1);
auto n2 = unwrap_number(v2);
rjson::value ret = rjson::empty_object();
std::string str_ret = std::string((n1 - n2).to_string());
rjson::set(ret, "N", rjson::from_string(str_ret));
@@ -1316,7 +1334,6 @@ static bool check_needs_read_before_write(const std::vector<parsed::update_expre
// It should be overridden once we can leverage a consensus protocol.
static future<std::unique_ptr<rjson::value>> do_get_previous_item(
service::storage_proxy& proxy,
service::client_state& client_state,
schema_ptr schema,
const partition_key& pk,
const clustering_key& ck,
@@ -1341,7 +1358,7 @@ static future<std::unique_ptr<rjson::value>> do_get_previous_item(
auto cl = db::consistency_level::LOCAL_QUORUM;
return proxy.query(schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit(), client_state)).then(
return proxy.query(schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit())).then(
[schema, partition_slice = std::move(partition_slice), selection = std::move(selection)] (service::storage_proxy::coordinator_query_result qr) {
auto previous_item = describe_item(schema, partition_slice, *selection, std::move(qr.query_result), {});
return make_ready_future<std::unique_ptr<rjson::value>>(std::make_unique<rjson::value>(std::move(previous_item)));
@@ -1350,7 +1367,6 @@ static future<std::unique_ptr<rjson::value>> do_get_previous_item(
static future<std::unique_ptr<rjson::value>> maybe_get_previous_item(
service::storage_proxy& proxy,
service::client_state& client_state,
schema_ptr schema,
const partition_key& pk,
const clustering_key& ck,
@@ -1364,12 +1380,11 @@ static future<std::unique_ptr<rjson::value>> maybe_get_previous_item(
if (!needs_read_before_write) {
return make_ready_future<std::unique_ptr<rjson::value>>();
}
return do_get_previous_item(proxy, client_state, std::move(schema), pk, ck, stats);
return do_get_previous_item(proxy, std::move(schema), pk, ck, stats);
}
static future<std::unique_ptr<rjson::value>> maybe_get_previous_item(
service::storage_proxy& proxy,
service::client_state& client_state,
schema_ptr schema,
const rjson::value& item,
bool needs_read_before_write,
@@ -1380,7 +1395,7 @@ static future<std::unique_ptr<rjson::value>> maybe_get_previous_item(
}
partition_key pk = pk_from_json(item, schema);
clustering_key ck = ck_from_json(item, schema);
return do_get_previous_item(proxy, client_state, std::move(schema), pk, ck, stats);
return do_get_previous_item(proxy, std::move(schema), pk, ck, stats);
}
@@ -1438,7 +1453,7 @@ future<json::json_return_type> executor::update_item(client_state& client_state,
attribute_updates = update_info["AttributeUpdates"];
}
return maybe_get_previous_item(_proxy, client_state, schema, pk, ck, has_update_expression, expression, has_expected, _stats).then(
return maybe_get_previous_item(_proxy, schema, pk, ck, has_update_expression, expression, has_expected, _stats).then(
[this, schema, expression = std::move(expression), has_update_expression, ck = std::move(ck), has_expected,
update_info = rjson::copy(update_info), m = std::move(m), attrs_collector = std::move(attrs_collector),
attribute_updates = rjson::copy(attribute_updates), ts, &client_state, start_time] (std::unique_ptr<rjson::value> previous_item) mutable {
@@ -1563,7 +1578,7 @@ future<json::json_return_type> executor::update_item(client_state& client_state,
}
}
if (!attrs_collector.empty()) {
auto serialized_map = attrs_collector.to_mut().serialize(*attrs_type());
auto serialized_map = attrs_type()->serialize_mutation_form(attrs_collector.to_mut());
row.cells().apply(attrs_column(*schema), std::move(serialized_map));
}
// To allow creation of an item with no attributes, we need a row marker.
@@ -1635,7 +1650,7 @@ future<json::json_return_type> executor::get_item(client_state& client_state, st
auto attrs_to_get = calculate_attrs_to_get(table_info);
return _proxy.query(schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit(), client_state)).then(
return _proxy.query(schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit())).then(
[this, schema, partition_slice = std::move(partition_slice), selection = std::move(selection), attrs_to_get = std::move(attrs_to_get), start_time = std::move(start_time)] (service::storage_proxy::coordinator_query_result qr) mutable {
_stats.api_operations.get_item_latency.add(std::chrono::steady_clock::now() - start_time, _stats.api_operations.get_item_latency._count + 1);
return make_ready_future<json::json_return_type>(make_jsonable(describe_item(schema, partition_slice, *selection, std::move(qr.query_result), std::move(attrs_to_get))));
@@ -1698,7 +1713,7 @@ future<json::json_return_type> executor::batch_get_item(client_state& client_sta
auto selection = cql3::selection::selection::wildcard(rs.schema);
auto partition_slice = query::partition_slice(std::move(bounds), {}, std::move(regular_columns), selection->get_query_options());
auto command = ::make_lw_shared<query::read_command>(rs.schema->id(), rs.schema->version(), partition_slice, query::max_partitions);
future<std::tuple<std::string, std::optional<rjson::value>>> f = _proxy.query(rs.schema, std::move(command), std::move(partition_ranges), rs.cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit(), client_state)).then(
future<std::tuple<std::string, std::optional<rjson::value>>> f = _proxy.query(rs.schema, std::move(command), std::move(partition_ranges), rs.cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit())).then(
[schema = rs.schema, partition_slice = std::move(partition_slice), selection = std::move(selection), attrs_to_get = rs.attrs_to_get] (service::storage_proxy::coordinator_query_result qr) mutable {
std::optional<rjson::value> json = describe_single_item(schema, partition_slice, *selection, std::move(qr.query_result), std::move(attrs_to_get));
return make_ready_future<std::tuple<std::string, std::optional<rjson::value>>>(

View File

@@ -67,7 +67,7 @@ struct from_json_visitor {
bo.write(t.from_string(sstring_view(v.GetString(), v.GetStringLength())));
}
void operator()(const bytes_type_impl& t) const {
bo.write(base64_decode(v));
bo.write(base64_decode(std::string_view(v.GetString(), v.GetStringLength())));
}
void operator()(const boolean_type_impl& t) const {
bo.write(boolean_type->decompose(v.GetBool()));
@@ -177,7 +177,7 @@ bytes get_key_from_typed_value(const rjson::value& key_typed_value, const column
expected_type, column.name_as_text(), it->name.GetString()));
}
if (column.type == bytes_type) {
return base64_decode(it->value);
return base64_decode(it->value.GetString());
} else {
return column.type->from_string(it->value.GetString());
}
@@ -227,22 +227,4 @@ clustering_key ck_from_json(const rjson::value& item, schema_ptr schema) {
return clustering_key::from_exploded(raw_ck);
}
big_decimal unwrap_number(const rjson::value& v, std::string_view diagnostic) {
if (!v.IsObject() || v.MemberCount() != 1) {
throw api_error("ValidationException", format("{}: invalid number object", diagnostic));
}
auto it = v.MemberBegin();
if (it->name != "N") {
throw api_error("ValidationException", format("{}: expected number, found type '{}'", diagnostic, it->name));
}
if (it->value.IsNumber()) {
// FIXME(sarna): should use big_decimal constructor with numeric values directly:
return big_decimal(rjson::print(it->value));
}
if (!it->value.IsString()) {
throw api_error("ValidationException", format("{}: improperly formatted number constant", diagnostic));
}
return big_decimal(it->value.GetString());
}
}

View File

@@ -22,12 +22,10 @@
#pragma once
#include <string>
#include <string_view>
#include "types.hh"
#include "schema.hh"
#include "keys.hh"
#include "rjson.hh"
#include "utils/big_decimal.hh"
namespace alternator {
@@ -60,7 +58,4 @@ rjson::value json_key_column_value(bytes_view cell, const column_definition& col
partition_key pk_from_json(const rjson::value& item, schema_ptr schema);
clustering_key ck_from_json(const rjson::value& item, schema_ptr schema);
// If v encodes a number (i.e., it is a {"N": [...]}, returns an object representing it. Otherwise,
// raises ValidationException with diagnostic.
big_decimal unwrap_number(const rjson::value& v, std::string_view diagnostic);
}

View File

@@ -24,11 +24,10 @@
#include <seastar/http/function_handlers.hh>
#include <seastar/json/json_elements.hh>
#include <seastarx.hh>
#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string/classification.hpp>
#include "error.hh"
#include "rjson.hh"
#include "auth.hh"
#include <cctype>
#include "cql3/query_processor.hh"
static logging::logger slogger("alternator-server");
@@ -38,23 +37,12 @@ namespace alternator {
static constexpr auto TARGET = "X-Amz-Target";
inline std::vector<std::string_view> split(std::string_view text, char separator) {
std::vector<std::string_view> tokens;
inline std::vector<sstring> split(const sstring& text, const char* separator) {
if (text == "") {
return tokens;
return std::vector<sstring>();
}
while (true) {
auto pos = text.find_first_of(separator);
if (pos != std::string_view::npos) {
tokens.emplace_back(text.data(), pos);
text.remove_prefix(pos + 1);
} else {
tokens.emplace_back(text);
break;
}
}
return tokens;
std::vector<sstring> tokens;
return boost::split(tokens, text, boost::is_any_of(separator));
}
// DynamoDB HTTP error responses are structured as follows
@@ -119,140 +107,9 @@ protected:
sstring _type;
};
class health_handler : public handler_base {
virtual future<std::unique_ptr<reply>> handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {
rep->set_status(reply::status_type::ok);
rep->write_body("txt", format("healthy: {}", req->get_header("Host")));
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
}
};
future<> server::verify_signature(const request& req) {
if (!_enforce_authorization) {
slogger.debug("Skipping authorization");
return make_ready_future<>();
}
auto host_it = req._headers.find("Host");
if (host_it == req._headers.end()) {
throw api_error("InvalidSignatureException", "Host header is mandatory for signature verification");
}
auto authorization_it = req._headers.find("Authorization");
if (host_it == req._headers.end()) {
throw api_error("InvalidSignatureException", "Authorization header is mandatory for signature verification");
}
std::string host = host_it->second;
std::vector<std::string_view> credentials_raw = split(authorization_it->second, ' ');
std::string credential;
std::string user_signature;
std::string signed_headers_str;
std::vector<std::string_view> signed_headers;
for (std::string_view entry : credentials_raw) {
std::vector<std::string_view> entry_split = split(entry, '=');
if (entry_split.size() != 2) {
if (entry != "AWS4-HMAC-SHA256") {
throw api_error("InvalidSignatureException", format("Only AWS4-HMAC-SHA256 algorithm is supported. Found: {}", entry));
}
continue;
}
std::string_view auth_value = entry_split[1];
// Commas appear as an additional (quite redundant) delimiter
if (auth_value.back() == ',') {
auth_value.remove_suffix(1);
}
if (entry_split[0] == "Credential") {
credential = std::string(auth_value);
} else if (entry_split[0] == "Signature") {
user_signature = std::string(auth_value);
} else if (entry_split[0] == "SignedHeaders") {
signed_headers_str = std::string(auth_value);
signed_headers = split(auth_value, ';');
std::sort(signed_headers.begin(), signed_headers.end());
}
}
std::vector<std::string_view> credential_split = split(credential, '/');
if (credential_split.size() != 5) {
throw api_error("ValidationException", format("Incorrect credential information format: {}", credential));
}
std::string user(credential_split[0]);
std::string datestamp(credential_split[1]);
std::string region(credential_split[2]);
std::string service(credential_split[3]);
std::map<std::string_view, std::string_view> signed_headers_map;
for (const auto& header : signed_headers) {
signed_headers_map.emplace(header, std::string_view());
}
for (auto& header : req._headers) {
std::string header_str;
header_str.resize(header.first.size());
std::transform(header.first.begin(), header.first.end(), header_str.begin(), ::tolower);
auto it = signed_headers_map.find(header_str);
if (it != signed_headers_map.end()) {
it->second = std::string_view(header.second);
}
}
auto cache_getter = [] (std::string username) {
return get_key_from_roles(cql3::get_query_processor().local(), std::move(username));
};
return _key_cache.get_ptr(user, cache_getter).then([this, &req,
user = std::move(user),
host = std::move(host),
datestamp = std::move(datestamp),
signed_headers_str = std::move(signed_headers_str),
signed_headers_map = std::move(signed_headers_map),
region = std::move(region),
service = std::move(service),
user_signature = std::move(user_signature)] (key_cache::value_ptr key_ptr) {
std::string signature = get_signature(user, *key_ptr, std::string_view(host), req._method,
datestamp, signed_headers_str, signed_headers_map, req.content, region, service, "");
if (signature != std::string_view(user_signature)) {
_key_cache.remove(user);
throw api_error("UnrecognizedClientException", "The security token included in the request is invalid.");
}
});
}
future<json::json_return_type> server::handle_api_request(std::unique_ptr<request>&& req) {
sstring target = req->get_header(TARGET);
std::vector<std::string_view> split_target = split(target, '.');
//NOTICE(sarna): Target consists of Dynamo API version followed by a dot '.' and operation type (e.g. CreateTable)
std::string op = split_target.empty() ? std::string() : std::string(split_target.back());
slogger.trace("Request: {} {}", op, req->content);
return verify_signature(*req).then([this, op, req = std::move(req)] () mutable {
auto callback_it = _callbacks.find(op);
if (callback_it == _callbacks.end()) {
_executor.local()._stats.unsupported_operations++;
throw api_error("UnknownOperationException",
format("Unsupported operation {}", op));
}
//FIXME: Client state can provide more context, e.g. client's endpoint address
// We use unique_ptr because client_state cannot be moved or copied
return do_with(std::make_unique<executor::client_state>(executor::client_state::internal_tag()), [this, callback_it = std::move(callback_it), op = std::move(op), req = std::move(req)] (std::unique_ptr<executor::client_state>& client_state) mutable {
client_state->set_raw_keyspace(executor::KEYSPACE_NAME);
executor::maybe_trace_query(*client_state, op, req->content);
tracing::trace(client_state->get_trace_state(), op);
return callback_it->second(_executor.local(), *client_state, std::move(req));
});
});
}
void server::set_routes(routes& r) {
api_handler* req_handler = new api_handler([this] (std::unique_ptr<request> req) mutable {
return handle_api_request(std::move(req));
});
r.add(operation_type::POST, url("/"), req_handler);
r.add(operation_type::GET, url("/"), new health_handler);
}
//FIXME: A way to immediately invalidate the cache should be considered,
// e.g. when the system table which stores the keys is changed.
// For now, this propagation may take up to 1 minute.
server::server(seastar::sharded<executor>& e)
: _executor(e), _key_cache(1024, 1min, slogger), _enforce_authorization(false)
, _callbacks{
using alternator_callback = std::function<future<json::json_return_type>(executor&, executor::client_state&, std::unique_ptr<request>)>;
std::unordered_map<std::string, alternator_callback> routes{
{"CreateTable", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) {
return e.maybe_create_keyspace().then([&e, &client_state, req = std::move(req)] { return e.create_table(client_state, req->content); }); }
},
@@ -268,42 +125,46 @@ server::server(seastar::sharded<executor>& e)
{"BatchWriteItem", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.batch_write_item(client_state, req->content); }},
{"BatchGetItem", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.batch_get_item(client_state, req->content); }},
{"Query", [] (executor& e, executor::client_state& client_state, std::unique_ptr<request> req) { return e.query(client_state, req->content); }},
} {
};
api_handler* handler = new api_handler([this, routes = std::move(routes)](std::unique_ptr<request> req) -> future<json::json_return_type> {
_executor.local()._stats.total_operations++;
sstring target = req->get_header(TARGET);
std::vector<sstring> split_target = split(target, ".");
//NOTICE(sarna): Target consists of Dynamo API version folllowed by a dot '.' and operation type (e.g. CreateTable)
sstring op = split_target.empty() ? sstring() : split_target.back();
slogger.trace("Request: {} {}", op, req->content);
auto callback_it = routes.find(op);
if (callback_it == routes.end()) {
_executor.local()._stats.unsupported_operations++;
throw api_error("UnknownOperationException",
format("Unsupported operation {}", op));
}
//FIXME: Client state can provide more context, e.g. client's endpoint address
return do_with(executor::client_state::for_internal_calls(), [this, callback_it = std::move(callback_it), op = std::move(op), req = std::move(req)] (executor::client_state& client_state) mutable {
client_state.set_raw_keyspace(executor::KEYSPACE_NAME);
executor::maybe_trace_query(client_state, op, req->content);
tracing::trace(client_state.get_trace_state(), op);
return callback_it->second(_executor.local(), client_state, std::move(req));
});
});
r.add(operation_type::POST, url("/"), handler);
}
future<> server::init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port, std::optional<tls::credentials_builder> creds, bool enforce_authorization) {
_enforce_authorization = enforce_authorization;
if (!port && !https_port) {
return make_exception_future<>(std::runtime_error("Either regular port or TLS port"
" must be specified in order to init an alternator HTTP server instance"));
}
return seastar::async([this, addr, port, https_port, creds] {
try {
_executor.invoke_on_all([] (executor& e) {
return e.start();
}).get();
if (port) {
_control.start().get();
_control.set_routes(std::bind(&server::set_routes, this, std::placeholders::_1)).get();
_control.listen(socket_address{addr, *port}).get();
slogger.info("Alternator HTTP server listening on {} port {}", addr, *port);
}
if (https_port) {
_https_control.start().get();
_https_control.set_routes(std::bind(&server::set_routes, this, std::placeholders::_1)).get();
_https_control.server().invoke_on_all([creds] (http_server& serv) {
return serv.set_tls_credentials(creds->build_server_credentials());
}).get();
_https_control.listen(socket_address{addr, *https_port}).get();
slogger.info("Alternator HTTPS server listening on {} port {}", addr, *https_port);
}
} catch (...) {
slogger.warn("Failed to set up Alternator HTTP server on {} port {}, TLS port {}: {}",
addr, port ? std::to_string(*port) : "OFF", https_port ? std::to_string(*https_port) : "OFF", std::current_exception());
throw;
}
future<> server::init(net::inet_address addr, uint16_t port) {
return _executor.invoke_on_all([] (executor& e) {
return e.start();
}).then([this] {
return _control.start();
}).then([this] {
return _control.set_routes(std::bind(&server::set_routes, this, std::placeholders::_1));
}).then([this, addr, port] {
return _control.listen(socket_address{addr, port});
}).then([addr, port] {
slogger.info("Alternator HTTP server listening on {} port {}", addr, port);
}).handle_exception([addr, port] (std::exception_ptr e) {
slogger.warn("Failed to set up Alternator HTTP server on {} port {}: {}", addr, port, e);
});
}

View File

@@ -24,30 +24,18 @@
#include "alternator/executor.hh"
#include <seastar/core/future.hh>
#include <seastar/http/httpd.hh>
#include <seastar/net/tls.hh>
#include <optional>
#include <alternator/auth.hh>
namespace alternator {
class server {
using alternator_callback = std::function<future<json::json_return_type>(executor&, executor::client_state&, std::unique_ptr<request>)>;
using alternator_callbacks_map = std::unordered_map<std::string_view, alternator_callback>;
seastar::httpd::http_server_control _control;
seastar::httpd::http_server_control _https_control;
seastar::sharded<executor>& _executor;
key_cache _key_cache;
bool _enforce_authorization;
alternator_callbacks_map _callbacks;
public:
server(seastar::sharded<executor>& executor);
server(seastar::sharded<executor>& executor) : _executor(executor) {}
seastar::future<> init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port, std::optional<tls::credentials_builder> creds, bool enforce_authorization);
seastar::future<> init(net::inet_address addr, uint16_t port);
private:
void set_routes(seastar::httpd::routes& r);
future<> verify_signature(const seastar::httpd::request& r);
future<json::json_return_type> handle_api_request(std::unique_ptr<request>&& req);
};
}

View File

@@ -671,6 +671,21 @@
}
]
},
{
"path": "/storage_proxy/metrics/cas_read/condition_not_met",
"operations": [
{
"method": "GET",
"summary": "Get cas read metrics",
"type": "int",
"nickname": "get_cas_read_metrics_condition_not_met",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/read/timeouts",
"operations": [

View File

@@ -26,7 +26,7 @@
#include "sstables/sstables.hh"
#include "utils/estimated_histogram.hh"
#include <algorithm>
#include "db/system_keyspace_view_types.hh"
#include "db/data_listeners.hh"
extern logging::logger apilog;
@@ -53,7 +53,8 @@ std::tuple<sstring, sstring> parse_fully_qualified_cf_name(sstring name) {
return std::make_tuple(name.substr(0, pos), name.substr(end));
}
const utils::UUID& get_uuid(const sstring& ks, const sstring& cf, const database& db) {
const utils::UUID& get_uuid(const sstring& name, const database& db) {
auto [ks, cf] = parse_fully_qualified_cf_name(name);
try {
return db.find_uuid(ks, cf);
} catch (std::out_of_range& e) {
@@ -61,11 +62,6 @@ const utils::UUID& get_uuid(const sstring& ks, const sstring& cf, const database
}
}
const utils::UUID& get_uuid(const sstring& name, const database& db) {
auto [ks, cf] = parse_fully_qualified_cf_name(name);
return get_uuid(ks, cf, db);
}
future<> foreach_column_family(http_context& ctx, const sstring& name, function<void(column_family&)> f) {
auto uuid = get_uuid(name, ctx.db.local());
@@ -75,28 +71,28 @@ future<> foreach_column_family(http_context& ctx, const sstring& name, function<
}
future<json::json_return_type> get_cf_stats(http_context& ctx, const sstring& name,
int64_t column_family_stats::*f) {
int64_t column_family::stats::*f) {
return map_reduce_cf(ctx, name, int64_t(0), [f](const column_family& cf) {
return cf.get_stats().*f;
}, std::plus<int64_t>());
}
future<json::json_return_type> get_cf_stats(http_context& ctx,
int64_t column_family_stats::*f) {
int64_t column_family::stats::*f) {
return map_reduce_cf(ctx, int64_t(0), [f](const column_family& cf) {
return cf.get_stats().*f;
}, std::plus<int64_t>());
}
static future<json::json_return_type> get_cf_stats_count(http_context& ctx, const sstring& name,
utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
return map_reduce_cf(ctx, name, int64_t(0), [f](const column_family& cf) {
return (cf.get_stats().*f).hist.count;
}, std::plus<int64_t>());
}
static future<json::json_return_type> get_cf_stats_sum(http_context& ctx, const sstring& name,
utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
auto uuid = get_uuid(name, ctx.db.local());
return ctx.db.map_reduce0([uuid, f](database& db) {
// Histograms information is sample of the actual load
@@ -112,14 +108,14 @@ static future<json::json_return_type> get_cf_stats_sum(http_context& ctx, const
static future<json::json_return_type> get_cf_stats_count(http_context& ctx,
utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
return map_reduce_cf(ctx, int64_t(0), [f](const column_family& cf) {
return (cf.get_stats().*f).hist.count;
}, std::plus<int64_t>());
}
static future<json::json_return_type> get_cf_histogram(http_context& ctx, const sstring& name,
utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
utils::UUID uuid = get_uuid(name, ctx.db.local());
return ctx.db.map_reduce0([f, uuid](const database& p) {
return (p.find_column_family(uuid).get_stats().*f).hist;},
@@ -130,7 +126,7 @@ static future<json::json_return_type> get_cf_histogram(http_context& ctx, const
});
}
static future<json::json_return_type> get_cf_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
static future<json::json_return_type> get_cf_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
std::function<utils::ihistogram(const database&)> fun = [f] (const database& db) {
utils::ihistogram res;
for (auto i : db.get_column_families()) {
@@ -146,7 +142,7 @@ static future<json::json_return_type> get_cf_histogram(http_context& ctx, utils:
}
static future<json::json_return_type> get_cf_rate_and_histogram(http_context& ctx, const sstring& name,
utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
utils::UUID uuid = get_uuid(name, ctx.db.local());
return ctx.db.map_reduce0([f, uuid](const database& p) {
return (p.find_column_family(uuid).get_stats().*f).rate();},
@@ -157,7 +153,7 @@ static future<json::json_return_type> get_cf_rate_and_histogram(http_context& c
});
}
static future<json::json_return_type> get_cf_rate_and_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
static future<json::json_return_type> get_cf_rate_and_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
std::function<utils::rate_moving_average_and_histogram(const database&)> fun = [f] (const database& db) {
utils::rate_moving_average_and_histogram res;
for (auto i : db.get_column_families()) {
@@ -254,11 +250,12 @@ class sum_ratio {
uint64_t _n = 0;
T _total = 0;
public:
void operator()(T value) {
future<> operator()(T value) {
if (value > 0) {
_total += value;
_n++;
}
return make_ready_future<>();
}
// Returns average value of all registered ratios.
T get() && {
@@ -407,11 +404,11 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_memtable_switch_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx,req->param["name"] ,&column_family_stats::memtable_switch_count);
return get_cf_stats(ctx,req->param["name"] ,&column_family::stats::memtable_switch_count);
});
cf::get_all_memtable_switch_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx, &column_family_stats::memtable_switch_count);
return get_cf_stats(ctx, &column_family::stats::memtable_switch_count);
});
// FIXME: this refers to partitions, not rows.
@@ -456,67 +453,67 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_pending_flushes.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx,req->param["name"] ,&column_family_stats::pending_flushes);
return get_cf_stats(ctx,req->param["name"] ,&column_family::stats::pending_flushes);
});
cf::get_all_pending_flushes.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx, &column_family_stats::pending_flushes);
return get_cf_stats(ctx, &column_family::stats::pending_flushes);
});
cf::get_read.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_count(ctx,req->param["name"] ,&column_family_stats::reads);
return get_cf_stats_count(ctx,req->param["name"] ,&column_family::stats::reads);
});
cf::get_all_read.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_count(ctx, &column_family_stats::reads);
return get_cf_stats_count(ctx, &column_family::stats::reads);
});
cf::get_write.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_count(ctx, req->param["name"] ,&column_family_stats::writes);
return get_cf_stats_count(ctx, req->param["name"] ,&column_family::stats::writes);
});
cf::get_all_write.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_count(ctx, &column_family_stats::writes);
return get_cf_stats_count(ctx, &column_family::stats::writes);
});
cf::get_read_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, req->param["name"], &column_family_stats::reads);
return get_cf_histogram(ctx, req->param["name"], &column_family::stats::reads);
});
cf::get_read_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family_stats::reads);
return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family::stats::reads);
});
cf::get_read_latency.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_sum(ctx,req->param["name"] ,&column_family_stats::reads);
return get_cf_stats_sum(ctx,req->param["name"] ,&column_family::stats::reads);
});
cf::get_write_latency.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_sum(ctx, req->param["name"] ,&column_family_stats::writes);
return get_cf_stats_sum(ctx, req->param["name"] ,&column_family::stats::writes);
});
cf::get_all_read_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, &column_family_stats::writes);
return get_cf_histogram(ctx, &column_family::stats::writes);
});
cf::get_all_read_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_rate_and_histogram(ctx, &column_family_stats::writes);
return get_cf_rate_and_histogram(ctx, &column_family::stats::writes);
});
cf::get_write_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, req->param["name"], &column_family_stats::writes);
return get_cf_histogram(ctx, req->param["name"], &column_family::stats::writes);
});
cf::get_write_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family_stats::writes);
return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family::stats::writes);
});
cf::get_all_write_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, &column_family_stats::writes);
return get_cf_histogram(ctx, &column_family::stats::writes);
});
cf::get_all_write_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_rate_and_histogram(ctx, &column_family_stats::writes);
return get_cf_rate_and_histogram(ctx, &column_family::stats::writes);
});
cf::get_pending_compactions.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -532,11 +529,11 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_live_ss_table_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx, req->param["name"], &column_family_stats::live_sstable_count);
return get_cf_stats(ctx, req->param["name"], &column_family::stats::live_sstable_count);
});
cf::get_all_live_ss_table_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx, &column_family_stats::live_sstable_count);
return get_cf_stats(ctx, &column_family::stats::live_sstable_count);
});
cf::get_unleveled_sstables.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -795,25 +792,25 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_cas_prepare.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
return cf.get_stats().estimated_cas_prepare;
},
utils::estimated_histogram_merge, utils_json::estimated_histogram());
cf::get_cas_prepare.set(r, [] (std::unique_ptr<request> req) {
//TBD
unimplemented();
//auto id = get_uuid(req->param["name"], ctx.db.local());
return make_ready_future<json::json_return_type>(0);
});
cf::get_cas_propose.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
return cf.get_stats().estimated_cas_propose;
},
utils::estimated_histogram_merge, utils_json::estimated_histogram());
cf::get_cas_propose.set(r, [] (std::unique_ptr<request> req) {
//TBD
unimplemented();
//auto id = get_uuid(req->param["name"], ctx.db.local());
return make_ready_future<json::json_return_type>(0);
});
cf::get_cas_commit.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
return cf.get_stats().estimated_cas_commit;
},
utils::estimated_histogram_merge, utils_json::estimated_histogram());
cf::get_cas_commit.set(r, [] (std::unique_ptr<request> req) {
//TBD
unimplemented();
//auto id = get_uuid(req->param["name"], ctx.db.local());
return make_ready_future<json::json_return_type>(0);
});
cf::get_sstables_per_read_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -824,11 +821,11 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_tombstone_scanned_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, req->param["name"], &column_family_stats::tombstone_scanned);
return get_cf_histogram(ctx, req->param["name"], &column_family::stats::tombstone_scanned);
});
cf::get_live_scanned_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, req->param["name"], &column_family_stats::live_scanned);
return get_cf_histogram(ctx, req->param["name"], &column_family::stats::live_scanned);
});
cf::get_col_update_time_delta_histogram.set(r, [] (std::unique_ptr<request> req) {
@@ -846,28 +843,13 @@ void set_column_family(http_context& ctx, routes& r) {
return true;
});
cf::get_built_indexes.set(r, [&ctx](std::unique_ptr<request> req) {
auto [ks, cf_name] = parse_fully_qualified_cf_name(req->param["name"]);
return db::system_keyspace::load_view_build_progress().then([ks, cf_name, &ctx](const std::vector<db::system_keyspace::view_build_progress>& vb) mutable {
std::set<sstring> vp;
for (auto b : vb) {
if (b.view.first == ks) {
vp.insert(b.view.second);
}
}
std::vector<sstring> res;
auto uuid = get_uuid(ks, cf_name, ctx.db.local());
column_family& cf = ctx.db.local().find_column_family(uuid);
res.reserve(cf.get_index_manager().list_indexes().size());
for (auto&& i : cf.get_index_manager().list_indexes()) {
if (vp.find(secondary_index::index_table_name(i.metadata().name())) == vp.end()) {
res.emplace_back(i.metadata().name());
}
}
return make_ready_future<json::json_return_type>(res);
});
cf::get_built_indexes.set(r, [](const_req) {
// FIXME
// Currently there are no index support
return std::vector<sstring>();
});
cf::get_compression_metadata_off_heap_memory_used.set(r, [](const_req) {
// FIXME
// Currently there are no information on the compression

View File

@@ -109,9 +109,9 @@ future<json::json_return_type> map_reduce_cf(http_context& ctx, I init,
}
future<json::json_return_type> get_cf_stats(http_context& ctx, const sstring& name,
int64_t column_family_stats::*f);
int64_t column_family::stats::*f);
future<json::json_return_type> get_cf_stats(http_context& ctx,
int64_t column_family_stats::*f);
int64_t column_family::stats::*f);
}

View File

@@ -74,14 +74,13 @@ void set_compaction_manager(http_context& ctx, routes& r) {
cm::get_pending_tasks_by_table.set(r, [&ctx] (std::unique_ptr<request> req) {
return ctx.db.map_reduce0([&ctx](database& db) {
return do_with(std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>(), [&ctx, &db](std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& tasks) {
return do_for_each(db.get_column_families(), [&tasks](const std::pair<utils::UUID, seastar::lw_shared_ptr<table>>& i) {
table& cf = *i.second.get();
tasks[std::make_pair(cf.schema()->ks_name(), cf.schema()->cf_name())] = cf.get_compaction_strategy().estimated_pending_compactions(cf);
return make_ready_future<>();
}).then([&tasks] {
return std::move(tasks);
});
std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash> tasks;
return do_for_each(db.get_column_families(), [&tasks](const std::pair<utils::UUID, seastar::lw_shared_ptr<table>>& i) {
table& cf = *i.second.get();
tasks[std::make_pair(cf.schema()->ks_name(), cf.schema()->cf_name())] = cf.get_compaction_strategy().estimated_pending_compactions(cf);
return make_ready_future<>();
}).then([&tasks] {
return tasks;
});
}, std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>(), sum_pending_tasks).then(
[](const std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& task_map) {

View File

@@ -81,9 +81,12 @@ void set_storage_proxy(http_context& ctx, routes& r) {
return make_ready_future<json::json_return_type>(0);
});
sp::get_hinted_handoff_enabled.set(r, [&ctx](std::unique_ptr<request> req) {
auto enabled = ctx.db.local().get_config().hinted_handoff_enabled();
return make_ready_future<json::json_return_type>(enabled);
sp::get_hinted_handoff_enabled.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// hinted handoff is not supported currently,
// so we should return false
return make_ready_future<json::json_return_type>(false);
});
sp::set_hinted_handoff_enabled.set(r, [](std::unique_ptr<request> req) {
@@ -247,40 +250,68 @@ void set_storage_proxy(http_context& ctx, routes& r) {
});
});
sp::get_cas_read_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_read_timeouts);
sp::get_cas_read_timeouts.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_read_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_read_unavailables);
sp::get_cas_read_unavailables.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_write_timeouts);
sp::get_cas_write_timeouts.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_write_unavailables);
sp::get_cas_write_unavailables.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_write_unfinished_commit);
sp::get_cas_write_metrics_unfinished_commit.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_contention.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_estimated_histogram(ctx, &proxy::stats::cas_write_contention);
sp::get_cas_write_metrics_contention.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_condition_not_met.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_write_condition_not_met);
sp::get_cas_write_metrics_condition_not_met.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_read_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_read_unfinished_commit);
sp::get_cas_read_metrics_unfinished_commit.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_read_metrics_contention.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_estimated_histogram(ctx, &proxy::stats::cas_read_contention);
sp::get_cas_read_metrics_contention.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_read_metrics_condition_not_met.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_read_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
@@ -351,11 +382,19 @@ void set_storage_proxy(http_context& ctx, routes& r) {
return sum_timer_stats(ctx.sp, &proxy::stats::write);
});
sp::get_cas_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats(ctx.sp, &proxy::stats::cas_write);
//TBD
// FIXME
// cas is not supported yet, so just return empty moving average
return make_ready_future<json::json_return_type>(get_empty_moving_average());
});
sp::get_cas_read_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats(ctx.sp, &proxy::stats::cas_read);
//TBD
// FIXME
// cas is not supported yet, so just return empty moving average
return make_ready_future<json::json_return_type>(get_empty_moving_average());
});
sp::get_view_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

View File

@@ -192,7 +192,7 @@ void set_storage_service(http_context& ctx, routes& r) {
});
ss::get_load.set(r, [&ctx](std::unique_ptr<request> req) {
return get_cf_stats(ctx, &column_family_stats::live_disk_space_used);
return get_cf_stats(ctx, &column_family::stats::live_disk_space_used);
});
ss::get_load_map.set(r, [] (std::unique_ptr<request> req) {
@@ -254,9 +254,6 @@ void set_storage_service(http_context& ctx, routes& r) {
if (column_family.empty()) {
resp = service::get_local_storage_service().take_snapshot(tag, keynames);
} else {
if (keynames.empty()) {
throw httpd::bad_param_exception("The keyspace of column families must be specified");
}
if (keynames.size() > 1) {
throw httpd::bad_param_exception("Only one keyspace allowed when specifying a column family");
}
@@ -307,24 +304,17 @@ void set_storage_service(http_context& ctx, routes& r) {
if (column_families.empty()) {
column_families = map_keys(ctx.db.local().find_keyspace(keyspace).metadata().get()->cf_meta_data());
}
return service::get_local_storage_service().is_cleanup_allowed(keyspace).then([&ctx, keyspace,
column_families = std::move(column_families)] (bool is_cleanup_allowed) mutable {
if (!is_cleanup_allowed) {
return make_exception_future<json::json_return_type>(
std::runtime_error("Can not perform cleanup operation when topology changes"));
return ctx.db.invoke_on_all([keyspace, column_families] (database& db) {
std::vector<column_family*> column_families_vec;
auto& cm = db.get_compaction_manager();
for (auto cf : column_families) {
column_families_vec.push_back(&db.find_column_family(keyspace, cf));
}
return ctx.db.invoke_on_all([keyspace, column_families] (database& db) {
std::vector<column_family*> column_families_vec;
auto& cm = db.get_compaction_manager();
for (auto cf : column_families) {
column_families_vec.push_back(&db.find_column_family(keyspace, cf));
}
return parallel_for_each(column_families_vec, [&cm] (column_family* cf) {
return cm.perform_cleanup(cf);
});
}).then([]{
return make_ready_future<json::json_return_type>(0);
return parallel_for_each(column_families_vec, [&cm] (column_family* cf) {
return cm.perform_cleanup(cf);
});
}).then([]{
return make_ready_future<json::json_return_type>(0);
});
});
@@ -870,7 +860,7 @@ void set_storage_service(http_context& ctx, routes& r) {
});
ss::get_metrics_load.set(r, [&ctx](std::unique_ptr<request> req) {
return get_cf_stats(ctx, &column_family_stats::live_disk_space_used);
return get_cf_stats(ctx, &column_family::stats::live_disk_space_used);
});
ss::get_exceptions.set(r, [](const_req req) {

View File

@@ -22,6 +22,7 @@
#include "atomic_cell.hh"
#include "atomic_cell_or_collection.hh"
#include "types.hh"
#include "types/collection.hh"
/// LSA mirator for cells with irrelevant type
///
@@ -147,6 +148,35 @@ atomic_cell_or_collection::atomic_cell_or_collection(const abstract_type& type,
{
}
static collection_mutation_view get_collection_mutation_view(const uint8_t* ptr)
{
auto f = data::cell::structure::get_member<data::cell::tags::flags>(ptr);
auto ti = data::type_info::make_collection();
data::cell::context ctx(f, ti);
auto view = data::cell::structure::get_member<data::cell::tags::cell>(ptr).as<data::cell::tags::collection>(ctx);
auto dv = data::cell::variable_value::make_view(view, f.get<data::cell::tags::external_data>());
return collection_mutation_view { dv };
}
collection_mutation_view atomic_cell_or_collection::as_collection_mutation() const {
return get_collection_mutation_view(_data.get());
}
collection_mutation::collection_mutation(const collection_type_impl& type, collection_mutation_view v)
: _data(imr_object_type::make(data::cell::make_collection(v.data), &type.imr_state().lsa_migrator()))
{
}
collection_mutation::collection_mutation(const collection_type_impl& type, bytes_view v)
: _data(imr_object_type::make(data::cell::make_collection(v), &type.imr_state().lsa_migrator()))
{
}
collection_mutation::operator collection_mutation_view() const
{
return get_collection_mutation_view(_data.get());
}
bool atomic_cell_or_collection::equals(const abstract_type& type, const atomic_cell_or_collection& other) const
{
auto ptr_a = _data.get();
@@ -201,7 +231,7 @@ size_t atomic_cell_or_collection::external_memory_usage(const abstract_type& t)
size_t external_value_size = 0;
if (flags.get<data::cell::tags::external_data>()) {
if (flags.get<data::cell::tags::collection>()) {
external_value_size = as_collection_mutation().data.size_bytes();
external_value_size = get_collection_mutation_view(_data.get()).data.size_bytes();
} else {
auto cell_view = data::cell::atomic_cell_view(t.imr_state().type_info(), view);
external_value_size = cell_view.value_size();

View File

@@ -221,6 +221,30 @@ public:
friend std::ostream& operator<<(std::ostream& os, const atomic_cell& ac);
};
class collection_mutation_view;
// Represents a mutation of a collection. Actual format is determined by collection type,
// and is:
// set: list of atomic_cell
// map: list of pair<atomic_cell, bytes> (for key/value)
// list: tbd, probably ugly
class collection_mutation {
public:
using imr_object_type = imr::utils::object<data::cell::structure>;
imr_object_type _data;
collection_mutation() {}
collection_mutation(const collection_type_impl&, collection_mutation_view v);
collection_mutation(const collection_type_impl&, bytes_view bv);
operator collection_mutation_view() const;
};
class collection_mutation_view {
public:
atomic_cell_value_view data;
};
class column_definition;
int compare_atomic_cell_for_merge(atomic_cell_view left, atomic_cell_view right);

View File

@@ -34,12 +34,14 @@ template<>
struct appending_hash<collection_mutation_view> {
template<typename Hasher>
void operator()(Hasher& h, collection_mutation_view cell, const column_definition& cdef) const {
cell.with_deserialized(*cdef.type, [&] (collection_mutation_view_description m_view) {
::feed_hash(h, m_view.tomb);
for (auto&& key_and_value : m_view.cells) {
::feed_hash(h, key_and_value.first);
::feed_hash(h, key_and_value.second, cdef);
}
cell.data.with_linearized([&] (bytes_view cell_bv) {
auto ctype = static_pointer_cast<const collection_type_impl>(cdef.type);
auto m_view = ctype->deserialize_mutation_form(cell_bv);
::feed_hash(h, m_view.tomb);
for (auto&& key_and_value : m_view.cells) {
::feed_hash(h, key_and_value.first);
::feed_hash(h, key_and_value.second, cdef);
}
});
}
};

View File

@@ -22,7 +22,6 @@
#pragma once
#include "atomic_cell.hh"
#include "collection_mutation.hh"
#include "schema.hh"
#include "hashing.hh"

View File

@@ -77,23 +77,17 @@ private:
void on_update_view(const sstring& ks_name, const sstring& view_name, bool columns_changed) override {}
void on_drop_keyspace(const sstring& ks_name) override {
// Do it in the background.
(void)_authorizer.revoke_all(
_authorizer.revoke_all(
auth::make_data_resource(ks_name)).handle_exception_type([](const unsupported_authorization_operation&) {
// Nothing.
}).handle_exception([] (std::exception_ptr e) {
log.error("Unexpected exception while revoking all permissions on dropped keyspace: {}", e);
});
}
void on_drop_column_family(const sstring& ks_name, const sstring& cf_name) override {
// Do it in the background.
(void)_authorizer.revoke_all(
_authorizer.revoke_all(
auth::make_data_resource(
ks_name, cf_name)).handle_exception_type([](const unsupported_authorization_operation&) {
// Nothing.
}).handle_exception([] (std::exception_ptr e) {
log.error("Unexpected exception while revoking all permissions on dropped table: {}", e);
});
}

View File

@@ -101,8 +101,8 @@ static future<std::optional<record>> find_record(cql3::query_processor& qp, std:
return std::make_optional(
record{
row.get_as<sstring>(sstring(meta::roles_table::role_col_name)),
row.get_or<bool>("is_superuser", false),
row.get_or<bool>("can_login", false),
row.get_as<bool>("is_superuser"),
row.get_as<bool>("can_login"),
(row.has("member_of")
? row.get_set<sstring>("member_of")
: role_set())});
@@ -203,7 +203,7 @@ future<> standard_role_manager::migrate_legacy_metadata() const {
internal_distributed_timeout_config()).then([this](::shared_ptr<cql3::untyped_result_set> results) {
return do_for_each(*results, [this](const cql3::untyped_result_set_row& row) {
role_config config;
config.is_superuser = row.get_or<bool>("super", false);
config.is_superuser = row.get_as<bool>("super");
config.can_login = true;
return do_with(

View File

@@ -61,7 +61,6 @@ class cache_flat_mutation_reader final : public flat_mutation_reader::impl {
// - _last_row points at a direct predecessor of the next row which is going to be read.
// Used for populating continuity.
// - _population_range_starts_before_all_rows is set accordingly
// - _underlying is engaged and fast-forwarded
reading_from_underlying,
end_of_stream
@@ -100,13 +99,7 @@ class cache_flat_mutation_reader final : public flat_mutation_reader::impl {
// forward progress is not guaranteed in case iterators are getting constantly invalidated.
bool _lower_bound_changed = false;
// Points to the underlying reader conforming to _schema,
// either to *_underlying_holder or _read_context->underlying().underlying().
flat_mutation_reader* _underlying = nullptr;
std::optional<flat_mutation_reader> _underlying_holder;
future<> do_fill_buffer(db::timeout_clock::time_point);
future<> ensure_underlying(db::timeout_clock::time_point);
void copy_from_cache_to_buffer();
future<> process_static_row(db::timeout_clock::time_point);
void move_to_end();
@@ -193,22 +186,23 @@ future<> cache_flat_mutation_reader::process_static_row(db::timeout_clock::time_
return make_ready_future<>();
} else {
_read_context->cache().on_row_miss();
return ensure_underlying(timeout).then([this, timeout] {
return (*_underlying)(timeout).then([this] (mutation_fragment_opt&& sr) {
if (sr) {
assert(sr->is_static_row());
maybe_add_to_cache(sr->as_static_row());
push_mutation_fragment(std::move(*sr));
}
maybe_set_static_row_continuous();
});
return _read_context->get_next_fragment(timeout).then([this] (mutation_fragment_opt&& sr) {
if (sr) {
assert(sr->is_static_row());
maybe_add_to_cache(sr->as_static_row());
push_mutation_fragment(std::move(*sr));
}
maybe_set_static_row_continuous();
});
}
}
inline
void cache_flat_mutation_reader::touch_partition() {
_snp->touch();
if (_snp->at_latest_version()) {
rows_entry& last_dummy = *_snp->version()->partition().clustered_rows().rbegin();
_snp->tracker()->touch(last_dummy);
}
}
inline
@@ -238,36 +232,14 @@ future<> cache_flat_mutation_reader::fill_buffer(db::timeout_clock::time_point t
});
}
inline
future<> cache_flat_mutation_reader::ensure_underlying(db::timeout_clock::time_point timeout) {
if (_underlying) {
return make_ready_future<>();
}
return _read_context->ensure_underlying(timeout).then([this, timeout] {
flat_mutation_reader& ctx_underlying = _read_context->underlying().underlying();
if (ctx_underlying.schema() != _schema) {
_underlying_holder = make_delegating_reader(ctx_underlying);
_underlying_holder->upgrade_schema(_schema);
_underlying = &*_underlying_holder;
} else {
_underlying = &ctx_underlying;
}
});
}
inline
future<> cache_flat_mutation_reader::do_fill_buffer(db::timeout_clock::time_point timeout) {
if (_state == state::move_to_underlying) {
if (!_underlying) {
return ensure_underlying(timeout).then([this, timeout] {
return do_fill_buffer(timeout);
});
}
_state = state::reading_from_underlying;
_population_range_starts_before_all_rows = _lower_bound.is_before_all_clustered_rows(*_schema);
auto end = _next_row_in_range ? position_in_partition(_next_row.position())
: position_in_partition(_upper_bound);
return _underlying->fast_forward_to(position_range{_lower_bound, std::move(end)}, timeout).then([this, timeout] {
return _read_context->fast_forward_to(position_range{_lower_bound, std::move(end)}, timeout).then([this, timeout] {
return read_from_underlying(timeout);
});
}
@@ -308,7 +280,7 @@ future<> cache_flat_mutation_reader::do_fill_buffer(db::timeout_clock::time_poin
inline
future<> cache_flat_mutation_reader::read_from_underlying(db::timeout_clock::time_point timeout) {
return consume_mutation_fragments_until(*_underlying,
return consume_mutation_fragments_until(_read_context->underlying().underlying(),
[this] { return _state != state::reading_from_underlying || is_buffer_full(); },
[this] (mutation_fragment mf) {
_read_context->cache().on_row_miss();

View File

@@ -79,8 +79,7 @@ mutation canonical_mutation::to_mutation(schema_ptr s) const {
if (version == m.schema()->version()) {
auto partition_view = mutation_partition_view::from_view(mv.partition());
mutation_application_stats app_stats;
m.partition().apply(*m.schema(), partition_view, *m.schema(), app_stats);
m.partition().apply(*m.schema(), partition_view, *m.schema());
} else {
column_mapping cm = mv.mapping();
converting_mutation_partition_applier v(cm, *m.schema(), m.partition());

View File

@@ -1,604 +0,0 @@
/*
* Copyright (C) 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include <utility>
#include <algorithm>
#include <seastar/util/defer.hh>
#include <seastar/core/thread.hh>
#include "cdc/cdc.hh"
#include "bytes.hh"
#include "database.hh"
#include "db/config.hh"
#include "dht/murmur3_partitioner.hh"
#include "partition_slice_builder.hh"
#include "schema.hh"
#include "schema_builder.hh"
#include "service/migration_manager.hh"
#include "service/storage_service.hh"
#include "types/tuple.hh"
#include "cql3/statements/select_statement.hh"
#include "cql3/multi_column_relation.hh"
#include "cql3/tuples.hh"
#include "log.hh"
using locator::snitch_ptr;
using locator::token_metadata;
using locator::topology;
using seastar::sstring;
using service::migration_manager;
using service::storage_proxy;
namespace std {
template<> struct hash<std::pair<net::inet_address, unsigned int>> {
std::size_t operator()(const std::pair<net::inet_address, unsigned int> &p) const {
return std::hash<net::inet_address>{}(p.first) ^ std::hash<int>{}(p.second);
}
};
}
using namespace std::chrono_literals;
static logging::logger cdc_log("cdc");
namespace cdc {
using operation_native_type = std::underlying_type_t<operation>;
using column_op_native_type = std::underlying_type_t<column_op>;
sstring log_name(const sstring& table_name) {
static constexpr auto cdc_log_suffix = "_scylla_cdc_log";
return table_name + cdc_log_suffix;
}
sstring desc_name(const sstring& table_name) {
static constexpr auto cdc_desc_suffix = "_scylla_cdc_desc";
return table_name + cdc_desc_suffix;
}
static future<>
remove_log(db_context ctx, const sstring& ks_name, const sstring& table_name) {
try {
return ctx._migration_manager.announce_column_family_drop(
ks_name, log_name(table_name), false);
} catch (exceptions::configuration_exception& e) {
// It's fine if the table does not exist.
return make_ready_future<>();
} catch (...) {
return make_exception_future<>(std::current_exception());
}
}
static future<>
remove_desc(db_context ctx, const sstring& ks_name, const sstring& table_name) {
try {
return ctx._migration_manager.announce_column_family_drop(
ks_name, desc_name(table_name), false);
} catch (exceptions::configuration_exception& e) {
// It's fine if the table does not exist.
return make_ready_future<>();
} catch (...) {
return make_exception_future<>(std::current_exception());
}
}
future<>
remove(db_context ctx, const sstring& ks_name, const sstring& table_name) {
return when_all(remove_log(ctx, ks_name, table_name),
remove_desc(ctx, ks_name, table_name)).discard_result();
}
static future<> setup_log(db_context ctx, const schema& s) {
schema_builder b(s.ks_name(), log_name(s.cf_name()));
b.set_default_time_to_live(gc_clock::duration{s.cdc_options().ttl()});
b.set_comment(sprint("CDC log for %s.%s", s.ks_name(), s.cf_name()));
b.with_column("stream_id", uuid_type, column_kind::partition_key);
b.with_column("time", timeuuid_type, column_kind::clustering_key);
b.with_column("batch_seq_no", int32_type, column_kind::clustering_key);
b.with_column("operation", data_type_for<operation_native_type>());
b.with_column("ttl", long_type);
auto add_columns = [&] (const schema::const_iterator_range_type& columns, bool is_data_col = false) {
for (const auto& column : columns) {
auto type = column.type;
if (is_data_col) {
type = tuple_type_impl::get_instance({ /* op */ data_type_for<column_op_native_type>(), /* value */ type, /* ttl */long_type});
}
b.with_column("_" + column.name(), type);
}
};
add_columns(s.partition_key_columns());
add_columns(s.clustering_key_columns());
add_columns(s.static_columns(), true);
add_columns(s.regular_columns(), true);
return ctx._migration_manager.announce_new_column_family(b.build(), false);
}
static future<> setup_stream_description_table(db_context ctx, const schema& s) {
schema_builder b(s.ks_name(), desc_name(s.cf_name()));
b.set_comment(sprint("CDC description for %s.%s", s.ks_name(), s.cf_name()));
b.with_column("node_ip", inet_addr_type, column_kind::partition_key);
b.with_column("shard_id", int32_type, column_kind::partition_key);
b.with_column("created_at", timestamp_type, column_kind::clustering_key);
b.with_column("stream_id", uuid_type);
return ctx._migration_manager.announce_new_column_family(b.build(), false);
}
// This function assumes setup_stream_description_table was called on |s| before the call to this
// function.
static future<> populate_desc(db_context ctx, const schema& s) {
auto& db = ctx._proxy.get_db().local();
auto desc_schema =
db.find_schema(s.ks_name(), desc_name(s.cf_name()));
auto log_schema =
db.find_schema(s.ks_name(), log_name(s.cf_name()));
auto belongs_to = [&](const gms::inet_address& endpoint,
const unsigned int shard_id,
const int shard_count,
const unsigned int ignore_msb_bits,
const utils::UUID& stream_id) {
const auto log_pk = partition_key::from_singular(*log_schema,
data_value(stream_id));
const auto token = ctx._partitioner.decorate_key(*log_schema, log_pk).token();
if (ctx._token_metadata.get_endpoint(ctx._token_metadata.first_token(token)) != endpoint) {
return false;
}
const auto owning_shard_id = dht::murmur3_partitioner(shard_count, ignore_msb_bits).shard_of(token);
return owning_shard_id == shard_id;
};
std::vector<mutation> mutations;
const auto ts = api::new_timestamp();
const auto ck = clustering_key::from_single_value(
*desc_schema, timestamp_type->decompose(ts));
auto cdef = desc_schema->get_column_definition(to_bytes("stream_id"));
for (const auto& dc : ctx._token_metadata.get_topology().get_datacenter_endpoints()) {
for (const auto& endpoint : dc.second) {
const auto decomposed_ip = inet_addr_type->decompose(endpoint.addr());
const unsigned int shard_count = ctx._snitch->get_shard_count(endpoint);
const unsigned int ignore_msb_bits = ctx._snitch->get_ignore_msb_bits(endpoint);
for (unsigned int shard_id = 0; shard_id < shard_count; ++shard_id) {
const auto pk = partition_key::from_exploded(
*desc_schema, { decomposed_ip, int32_type->decompose(static_cast<int>(shard_id)) });
mutations.emplace_back(desc_schema, pk);
auto stream_id = utils::make_random_uuid();
while (!belongs_to(endpoint, shard_id, shard_count, ignore_msb_bits, stream_id)) {
stream_id = utils::make_random_uuid();
}
auto value = atomic_cell::make_live(*uuid_type,
ts,
uuid_type->decompose(stream_id));
mutations.back().set_cell(ck, *cdef, std::move(value));
}
}
}
return ctx._proxy.mutate(std::move(mutations),
db::consistency_level::QUORUM,
db::no_timeout,
nullptr,
empty_service_permit());
}
future<> setup(db_context ctx, schema_ptr s) {
return seastar::async([ctx = std::move(ctx), s = std::move(s)] {
setup_log(ctx, *s).get();
auto log_guard = seastar::defer([&] { remove_log(ctx, s->ks_name(), s->cf_name()).get(); });
setup_stream_description_table(ctx, *s).get();
auto desc_guard = seastar::defer([&] { remove_desc(ctx, s->ks_name(), s->cf_name()).get(); });
populate_desc(ctx, *s).get();
desc_guard.cancel();
log_guard.cancel();
});
}
db_context db_context::builder::build() {
return db_context{
_proxy,
_migration_manager ? _migration_manager->get() : service::get_local_migration_manager(),
_token_metadata ? _token_metadata->get() : service::get_local_storage_service().get_token_metadata(),
_snitch ? _snitch->get() : locator::i_endpoint_snitch::get_local_snitch_ptr(),
_partitioner ? _partitioner->get() : dht::global_partitioner()
};
}
class transformer final {
public:
using streams_type = std::unordered_map<std::pair<net::inet_address, unsigned int>, utils::UUID>;
private:
db_context _ctx;
schema_ptr _schema;
schema_ptr _log_schema;
utils::UUID _time;
bytes _decomposed_time;
::shared_ptr<const transformer::streams_type> _streams;
const column_definition& _op_col;
clustering_key set_pk_columns(const partition_key& pk, int batch_no, mutation& m) const {
const auto log_ck = clustering_key::from_exploded(
*m.schema(), { _decomposed_time, int32_type->decompose(batch_no) });
auto pk_value = pk.explode(*_schema);
size_t pos = 0;
for (const auto& column : _schema->partition_key_columns()) {
assert (pos < pk_value.size());
auto cdef = m.schema()->get_column_definition(to_bytes("_" + column.name()));
auto value = atomic_cell::make_live(*column.type,
_time.timestamp(),
bytes_view(pk_value[pos]));
m.set_cell(log_ck, *cdef, std::move(value));
++pos;
}
return log_ck;
}
void set_operation(const clustering_key& ck, operation op, mutation& m) const {
m.set_cell(ck, _op_col, atomic_cell::make_live(*_op_col.type, _time.timestamp(), _op_col.type->decompose(operation_native_type(op))));
}
partition_key stream_id(const net::inet_address& ip, unsigned int shard_id) const {
auto it = _streams->find(std::make_pair(ip, shard_id));
if (it == std::end(*_streams)) {
throw std::runtime_error(format("No stream found for node {} and shard {}", ip, shard_id));
}
return partition_key::from_exploded(*_log_schema, { uuid_type->decompose(it->second) });
}
public:
transformer(db_context ctx, schema_ptr s, ::shared_ptr<const transformer::streams_type> streams)
: _ctx(ctx)
, _schema(std::move(s))
, _log_schema(ctx._proxy.get_db().local().find_schema(_schema->ks_name(), log_name(_schema->cf_name())))
, _time(utils::UUID_gen::get_time_UUID())
, _decomposed_time(timeuuid_type->decompose(_time))
, _streams(std::move(streams))
, _op_col(*_log_schema->get_column_definition(to_bytes("operation")))
{}
// TODO: is pre-image data based on query enough. We only have actual column data. Do we need
// more details like tombstones/ttl? Probably not but keep in mind.
mutation transform(const mutation& m, const cql3::untyped_result_set* rs = nullptr) const {
auto& t = m.token();
auto&& ep = _ctx._token_metadata.get_endpoint(
_ctx._token_metadata.first_token(t));
if (!ep) {
throw std::runtime_error(format("No owner found for key {}", m.decorated_key()));
}
auto shard_id = dht::murmur3_partitioner(_ctx._snitch->get_shard_count(*ep), _ctx._snitch->get_ignore_msb_bits(*ep)).shard_of(t);
mutation res(_log_schema, stream_id(ep->addr(), shard_id));
auto& p = m.partition();
if (p.partition_tombstone()) {
// Partition deletion
auto log_ck = set_pk_columns(m.key(), 0, res);
set_operation(log_ck, operation::partition_delete, res);
} else if (!p.row_tombstones().empty()) {
// range deletion
int batch_no = 0;
for (auto& rt : p.row_tombstones()) {
auto set_bound = [&] (const clustering_key& log_ck, const clustering_key_prefix& ckp) {
auto exploded = ckp.explode(*_schema);
size_t pos = 0;
for (const auto& column : _schema->clustering_key_columns()) {
if (pos >= exploded.size()) {
break;
}
auto cdef = _log_schema->get_column_definition(to_bytes("_" + column.name()));
auto value = atomic_cell::make_live(*column.type,
_time.timestamp(),
bytes_view(exploded[pos]));
res.set_cell(log_ck, *cdef, std::move(value));
++pos;
}
};
{
auto log_ck = set_pk_columns(m.key(), batch_no, res);
set_bound(log_ck, rt.start);
// TODO: separate inclusive/exclusive range
set_operation(log_ck, operation::range_delete_start, res);
++batch_no;
}
{
auto log_ck = set_pk_columns(m.key(), batch_no, res);
set_bound(log_ck, rt.end);
// TODO: separate inclusive/exclusive range
set_operation(log_ck, operation::range_delete_end, res);
++batch_no;
}
}
} else {
// should be update or deletion
int batch_no = 0;
for (const rows_entry& r : p.clustered_rows()) {
auto ck_value = r.key().explode(*_schema);
std::optional<clustering_key> pikey;
const cql3::untyped_result_set_row * pirow = nullptr;
if (rs) {
for (auto& utr : *rs) {
bool match = true;
for (auto& c : _schema->clustering_key_columns()) {
auto rv = utr.get_view(c.name_as_text());
auto cv = r.key().get_component(*_schema, c.component_index());
if (rv != cv) {
match = false;
break;
}
}
if (match) {
pikey = set_pk_columns(m.key(), batch_no, res);
set_operation(*pikey, operation::pre_image, res);
pirow = &utr;
++batch_no;
break;
}
}
}
auto log_ck = set_pk_columns(m.key(), batch_no, res);
size_t pos = 0;
for (const auto& column : _schema->clustering_key_columns()) {
assert (pos < ck_value.size());
auto cdef = _log_schema->get_column_definition(to_bytes("_" + column.name()));
res.set_cell(log_ck, *cdef, atomic_cell::make_live(*column.type, _time.timestamp(), bytes_view(ck_value[pos])));
if (pirow) {
assert(pirow->has(column.name_as_text()));
res.set_cell(*pikey, *cdef, atomic_cell::make_live(*column.type, _time.timestamp(), bytes_view(ck_value[pos])));
}
++pos;
}
std::vector<bytes_opt> values(3);
auto process_cells = [&](const row& r, column_kind ckind) {
r.for_each_cell([&](column_id id, const atomic_cell_or_collection& cell) {
auto& cdef = _schema->column_at(ckind, id);
auto* dst = _log_schema->get_column_definition(to_bytes("_" + cdef.name()));
// todo: collections.
if (cdef.is_atomic()) {
column_op op;
values[1] = values[2] = std::nullopt;
auto view = cell.as_atomic_cell(cdef);
if (view.is_live()) {
op = column_op::set;
values[1] = view.value().linearize();
if (view.is_live_and_has_ttl()) {
values[2] = long_type->decompose(data_value(view.ttl().count()));
}
} else {
op = column_op::del;
}
values[0] = data_type_for<column_op_native_type>()->decompose(data_value(static_cast<column_op_native_type>(op)));
res.set_cell(log_ck, *dst, atomic_cell::make_live(*dst->type, _time.timestamp(), tuple_type_impl::build_value(values)));
if (pirow && pirow->has(cdef.name_as_text())) {
values[0] = data_type_for<column_op_native_type>()->decompose(data_value(static_cast<column_op_native_type>(column_op::set)));
values[1] = pirow->get_blob(cdef.name_as_text());
values[2] = std::nullopt;
assert(std::addressof(res.partition().clustered_row(*_log_schema, *pikey)) != std::addressof(res.partition().clustered_row(*_log_schema, log_ck)));
assert(pikey->explode() != log_ck.explode());
res.set_cell(*pikey, *dst, atomic_cell::make_live(*dst->type, _time.timestamp(), tuple_type_impl::build_value(values)));
}
} else {
cdc_log.warn("Non-atomic cell ignored {}.{}:{}", _schema->ks_name(), _schema->cf_name(), cdef.name_as_text());
}
});
};
process_cells(r.row().cells(), column_kind::regular_column);
process_cells(p.static_row().get(), column_kind::static_column);
set_operation(log_ck, operation::update, res);
++batch_no;
}
}
return res;
}
static db::timeout_clock::time_point default_timeout() {
return db::timeout_clock::now() + 10s;
}
future<lw_shared_ptr<cql3::untyped_result_set>> pre_image_select(
service::storage_proxy& proxy,
service::client_state& client_state,
db::consistency_level cl,
const mutation& m)
{
auto& p = m.partition();
if (p.partition_tombstone() || !p.row_tombstones().empty() || p.clustered_rows().empty()) {
return make_ready_future<lw_shared_ptr<cql3::untyped_result_set>>();
}
dht::partition_range_vector partition_ranges{dht::partition_range(m.decorated_key())};
auto&& pc = _schema->partition_key_columns();
auto&& cc = _schema->clustering_key_columns();
std::vector<query::clustering_range> bounds;
if (cc.empty()) {
bounds.push_back(query::clustering_range::make_open_ended_both_sides());
} else {
for (const rows_entry& r : p.clustered_rows()) {
auto& ck = r.key();
bounds.push_back(query::clustering_range::make_singular(ck));
}
}
std::vector<const column_definition*> columns;
columns.reserve(_schema->all_columns().size());
std::transform(pc.begin(), pc.end(), std::back_inserter(columns), [](auto& c) { return &c; });
std::transform(cc.begin(), cc.end(), std::back_inserter(columns), [](auto& c) { return &c; });
query::column_id_vector static_columns, regular_columns;
auto sk = column_kind::static_column;
auto rk = column_kind::regular_column;
// TODO: this assumes all mutations touch the same set of columns. This might not be true, and we may need to do more horrible set operation here.
for (auto& [r, cids, kind] : { std::tie(p.static_row().get(), static_columns, sk), std::tie(p.clustered_rows().begin()->row().cells(), regular_columns, rk) }) {
r.for_each_cell([&](column_id id, const atomic_cell_or_collection&) {
auto& cdef =_schema->column_at(kind, id);
cids.emplace_back(id);
columns.emplace_back(&cdef);
});
}
auto selection = cql3::selection::selection::for_columns(_schema, std::move(columns));
auto partition_slice = query::partition_slice(std::move(bounds), std::move(static_columns), std::move(regular_columns), selection->get_query_options());
auto command = ::make_lw_shared<query::read_command>(_schema->id(), _schema->version(), partition_slice, query::max_partitions);
return proxy.query(_schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), empty_service_permit(), client_state)).then(
[this, partition_slice = std::move(partition_slice), selection = std::move(selection)] (service::storage_proxy::coordinator_query_result qr) -> lw_shared_ptr<cql3::untyped_result_set> {
cql3::selection::result_set_builder builder(*selection, gc_clock::now(), cql_serialization_format::latest());
query::result_view::consume(*qr.query_result, partition_slice, cql3::selection::result_set_builder::visitor(builder, *_schema, *selection));
auto result_set = builder.build();
if (!result_set || result_set->empty()) {
return {};
}
return make_lw_shared<cql3::untyped_result_set>(*result_set);
});
}
};
// This class is used to build a mapping from <node ip, shard id> to stream_id
// It is used as a consumer for rows returned by the query to CDC Description Table
class streams_builder {
const schema& _schema;
transformer::streams_type _streams;
net::inet_address _node_ip = net::inet_address();
unsigned int _shard_id = 0;
api::timestamp_type _latest_row_timestamp = api::min_timestamp;
utils::UUID _latest_row_stream_id = utils::UUID();
public:
streams_builder(const schema& s) : _schema(s) {}
void accept_new_partition(const partition_key& key, uint32_t row_count) {
auto exploded = key.explode(_schema);
_node_ip = value_cast<net::inet_address>(inet_addr_type->deserialize(exploded[0]));
_shard_id = static_cast<unsigned int>(value_cast<int>(int32_type->deserialize(exploded[1])));
_latest_row_timestamp = api::min_timestamp;
_latest_row_stream_id = utils::UUID();
}
void accept_new_partition(uint32_t row_count) {
assert(false);
}
void accept_new_row(
const clustering_key& key,
const query::result_row_view& static_row,
const query::result_row_view& row) {
auto row_iterator = row.iterator();
api::timestamp_type timestamp = value_cast<db_clock::time_point>(
timestamp_type->deserialize(key.explode(_schema)[0])).time_since_epoch().count();
if (timestamp <= _latest_row_timestamp) {
return;
}
_latest_row_timestamp = timestamp;
for (auto&& cdef : _schema.regular_columns()) {
if (cdef.name_as_text() != "stream_id") {
row_iterator.skip(cdef);
continue;
}
auto val_opt = row_iterator.next_atomic_cell();
assert(val_opt);
val_opt->value().with_linearized([&] (bytes_view bv) {
_latest_row_stream_id = value_cast<utils::UUID>(uuid_type->deserialize(bv));
});
}
}
void accept_new_row(const query::result_row_view& static_row, const query::result_row_view& row) {
assert(false);
}
void accept_partition_end(const query::result_row_view& static_row) {
_streams.emplace(std::make_pair(_node_ip, _shard_id), _latest_row_stream_id);
}
transformer::streams_type build() {
return std::move(_streams);
}
};
static future<::shared_ptr<transformer::streams_type>> get_streams(
db_context ctx,
const sstring& ks_name,
const sstring& cf_name,
lowres_clock::time_point timeout,
service::query_state& qs) {
auto s =
ctx._proxy.get_db().local().find_schema(ks_name, desc_name(cf_name));
query::read_command cmd(
s->id(),
s->version(),
partition_slice_builder(*s).with_no_static_columns().build());
return ctx._proxy.query(
s,
make_lw_shared(std::move(cmd)),
{dht::partition_range::make_open_ended_both_sides()},
db::consistency_level::QUORUM,
{timeout, qs.get_permit(), qs.get_client_state()}).then([s = std::move(s)] (auto qr) mutable {
return query::result_view::do_with(*qr.query_result,
[s = std::move(s)] (query::result_view v) {
auto slice = partition_slice_builder(*s)
.with_no_static_columns()
.build();
streams_builder builder{ *s };
v.consume(slice, builder);
return ::make_shared<transformer::streams_type>(builder.build());
});
});
}
future<std::vector<mutation>> append_log_mutations(
db_context ctx,
schema_ptr s,
service::storage_proxy::clock_type::time_point timeout,
service::query_state& qs,
std::vector<mutation> muts) {
auto mp = ::make_lw_shared<std::vector<mutation>>(std::move(muts));
return get_streams(ctx, s->ks_name(), s->cf_name(), timeout, qs).then([ctx, s = std::move(s), mp, &qs](::shared_ptr<transformer::streams_type> streams) mutable {
mp->reserve(2 * mp->size());
auto trans = make_lw_shared<transformer>(ctx, s, std::move(streams));
auto i = mp->begin();
auto e = mp->end();
return parallel_for_each(i, e, [ctx, &qs, trans, mp](mutation& m) {
return trans->pre_image_select(ctx._proxy, qs.get_client_state(), db::consistency_level::LOCAL_QUORUM, m).then([trans, mp, &m](lw_shared_ptr<cql3::untyped_result_set> rs) {
mp->push_back(trans->transform(m, rs.get()));
});
}).then([mp] {
return std::move(*mp);
});
});
}
} // namespace cdc

View File

@@ -1,233 +0,0 @@
/*
* Copyright (C) 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <functional>
#include <optional>
#include <map>
#include <string>
#include <vector>
#include <seastar/core/future.hh>
#include <seastar/core/lowres_clock.hh>
#include <seastar/core/shared_ptr.hh>
#include <seastar/core/sstring.hh>
#include "exceptions/exceptions.hh"
#include "json.hh"
#include "timestamp.hh"
class schema;
using schema_ptr = seastar::lw_shared_ptr<const schema>;
namespace locator {
class snitch_ptr;
class token_metadata;
} // namespace locator
namespace service {
class migration_manager;
class storage_proxy;
class query_state;
} // namespace service
namespace dht {
class i_partitioner;
} // namespace dht
class mutation;
class partition_key;
namespace cdc {
class options final {
bool _enabled = false;
bool _preimage = false;
bool _postimage = false;
int _ttl = 86400; // 24h in seconds
public:
options() = default;
options(const std::map<sstring, sstring>& map) {
if (map.find("enabled") == std::end(map)) {
return;
}
for (auto& p : map) {
if (p.first == "enabled") {
_enabled = p.second == "true";
} else if (p.first == "preimage") {
_preimage = p.second == "true";
} else if (p.first == "postimage") {
_postimage = p.second == "true";
} else if (p.first == "ttl") {
_ttl = std::stoi(p.second);
} else {
throw exceptions::configuration_exception("Invalid CDC option: " + p.first);
}
}
}
std::map<sstring, sstring> to_map() const {
if (!_enabled) {
return {};
}
return {
{ "enabled", _enabled ? "true" : "false" },
{ "preimage", _preimage ? "true" : "false" },
{ "postimage", _postimage ? "true" : "false" },
{ "ttl", std::to_string(_ttl) },
};
}
sstring to_sstring() const {
return json::to_json(to_map());
}
bool enabled() const { return _enabled; }
bool preimage() const { return _preimage; }
bool postimage() const { return _postimage; }
int ttl() const { return _ttl; }
bool operator==(const options& o) const {
return _enabled == o._enabled && _preimage == o._preimage && _postimage == o._postimage && _ttl == o._ttl;
}
bool operator!=(const options& o) const {
return !(*this == o);
}
};
struct db_context final {
service::storage_proxy& _proxy;
service::migration_manager& _migration_manager;
locator::token_metadata& _token_metadata;
locator::snitch_ptr& _snitch;
dht::i_partitioner& _partitioner;
class builder final {
service::storage_proxy& _proxy;
std::optional<std::reference_wrapper<service::migration_manager>> _migration_manager;
std::optional<std::reference_wrapper<locator::token_metadata>> _token_metadata;
std::optional<std::reference_wrapper<locator::snitch_ptr>> _snitch;
std::optional<std::reference_wrapper<dht::i_partitioner>> _partitioner;
public:
builder(service::storage_proxy& proxy) : _proxy(proxy) { }
builder& with_migration_manager(service::migration_manager& migration_manager) {
_migration_manager = migration_manager;
return *this;
}
builder& with_token_metadata(locator::token_metadata& token_metadata) {
_token_metadata = token_metadata;
return *this;
}
builder& with_snitch(locator::snitch_ptr& snitch) {
_snitch = snitch;
return *this;
}
builder& with_partitioner(dht::i_partitioner& partitioner) {
_partitioner = partitioner;
return *this;
}
db_context build();
};
};
/// \brief Sets up CDC related tables for a given table
///
/// This function not only creates CDC Log and CDC Description for a given table
/// but also populates CDC Description with a list of change streams.
///
/// param[in] ctx object with references to database components
/// param[in] schema schema of a table for which CDC tables are being created
seastar::future<> setup(db_context ctx, schema_ptr schema);
// cdc log table operation
enum class operation : int8_t {
// note: these values will eventually be read by a third party, probably not privvy to this
// enum decl, so don't change the constant values (or the datatype).
pre_image = 0, update = 1, row_delete = 2, range_delete_start = 3, range_delete_end = 4, partition_delete = 5
};
// cdc log data column operation
enum class column_op : int8_t {
// same as "operation". Do not edit values or type/type unless you _really_ want to.
set = 0, del = 1, add = 2,
};
/// \brief Deletes CDC Log and CDC Description tables for a given table
///
/// This function cleans up all CDC related tables created for a given table.
/// At the moment, CDC Log and CDC Description are the only affected tables.
/// It's ok if some/all of them don't exist.
///
/// \param[in] ctx object with references to database components
/// \param[in] ks_name keyspace name of a table for which CDC tables are removed
/// \param[in] table_name name of a table for which CDC tables are removed
///
/// \pre This function works correctly no matter if CDC Log and/or CDC Description
/// exist.
seastar::future<>
remove(db_context ctx, const seastar::sstring& ks_name, const seastar::sstring& table_name);
seastar::sstring log_name(const seastar::sstring& table_name);
seastar::sstring desc_name(const seastar::sstring& table_name);
/// \brief For each mutation in the set appends related CDC Log mutation
///
/// This function should be called with a set of mutations of a table
/// with CDC enabled. Returned set of mutations contains all original mutations
/// and for each original mutation appends a mutation to CDC Log that reflects
/// the change.
///
/// \param[in] ctx object with references to database components
/// \param[in] s schema of a CDC enabled table which is being modified
/// \param[in] timeout period of time after which a request is considered timed out
/// \param[in] qs the state of the query that's being executed
/// \param[in] mutations set of changes of a CDC enabled table
///
/// \return set of mutations from input parameter with relevant CDC Log mutations appended
///
/// \pre CDC Log and CDC Description have to exist
/// \pre CDC Description has to be in sync with cluster topology
///
/// \note At the moment, cluster topology changes are not supported
// so the assumption that CDC Description is in sync with cluster topology
// is easy to enforce. When support for cluster topology changes is added
// it has to make sure the assumption holds.
seastar::future<std::vector<mutation>>append_log_mutations(
db_context ctx,
schema_ptr s,
lowres_clock::time_point timeout,
service::query_state& qs,
std::vector<mutation> mutations);
} // namespace cdc

View File

@@ -68,7 +68,7 @@ public:
public:
explicit iterator(const mutation_partition& mp)
: _mp(mp)
, _current(position_in_partition_view(position_in_partition_view::static_row_tag_t()), mp.static_row().get())
, _current(position_in_partition_view(position_in_partition_view::static_row_tag_t()), mp.static_row())
{ }
iterator(const mutation_partition& mp, mutation_partition::rows_type::const_iterator it)

View File

@@ -1,440 +0,0 @@
/*
* Copyright (C) 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "types/collection.hh"
#include "types/user.hh"
#include "concrete_types.hh"
#include "atomic_cell_or_collection.hh"
#include "mutation_partition.hh"
#include "compaction_garbage_collector.hh"
#include "combine.hh"
#include "collection_mutation.hh"
collection_mutation::collection_mutation(const abstract_type& type, collection_mutation_view v)
: _data(imr_object_type::make(data::cell::make_collection(v.data), &type.imr_state().lsa_migrator())) {}
collection_mutation::collection_mutation(const abstract_type& type, bytes_view v)
: _data(imr_object_type::make(data::cell::make_collection(v), &type.imr_state().lsa_migrator())) {}
static collection_mutation_view get_collection_mutation_view(const uint8_t* ptr)
{
auto f = data::cell::structure::get_member<data::cell::tags::flags>(ptr);
auto ti = data::type_info::make_collection();
data::cell::context ctx(f, ti);
auto view = data::cell::structure::get_member<data::cell::tags::cell>(ptr).as<data::cell::tags::collection>(ctx);
auto dv = data::cell::variable_value::make_view(view, f.get<data::cell::tags::external_data>());
return collection_mutation_view { dv };
}
collection_mutation::operator collection_mutation_view() const
{
return get_collection_mutation_view(_data.get());
}
collection_mutation_view atomic_cell_or_collection::as_collection_mutation() const {
return get_collection_mutation_view(_data.get());
}
bool collection_mutation_view::is_empty() const {
return data.with_linearized([&] (bytes_view in) { // FIXME: we can guarantee that this is in the first fragment
auto has_tomb = read_simple<bool>(in);
return !has_tomb && read_simple<uint32_t>(in) == 0;
});
}
template <typename F>
GCC6_CONCEPT(requires std::is_invocable_r_v<const data::type_info&, F, bytes_view&>)
static bool is_any_live(const atomic_cell_value_view& data, tombstone tomb, gc_clock::time_point now, F&& read_cell_type_info) {
return data.with_linearized([&] (bytes_view in) {
auto has_tomb = read_simple<bool>(in);
if (has_tomb) {
auto ts = read_simple<api::timestamp_type>(in);
auto ttl = read_simple<gc_clock::duration::rep>(in);
tomb.apply(tombstone{ts, gc_clock::time_point(gc_clock::duration(ttl))});
}
auto nr = read_simple<uint32_t>(in);
for (uint32_t i = 0; i != nr; ++i) {
auto& type_info = read_cell_type_info(in);
auto vsize = read_simple<uint32_t>(in);
auto value = atomic_cell_view::from_bytes(type_info, read_simple_bytes(in, vsize));
if (value.is_live(tomb, now, false)) {
return true;
}
}
return false;
});
}
bool collection_mutation_view::is_any_live(const abstract_type& type, tombstone tomb, gc_clock::time_point now) const {
return visit(type, make_visitor(
[&] (const collection_type_impl& ctype) {
auto& type_info = ctype.value_comparator()->imr_state().type_info();
return ::is_any_live(data, tomb, now, [&type_info] (bytes_view& in) -> const data::type_info& {
auto key_size = read_simple<uint32_t>(in);
in.remove_prefix(key_size);
return type_info;
});
},
[&] (const user_type_impl& utype) {
return ::is_any_live(data, tomb, now, [&utype] (bytes_view& in) -> const data::type_info& {
auto key_size = read_simple<uint32_t>(in);
auto key = read_simple_bytes(in, key_size);
return utype.type(deserialize_field_index(key))->imr_state().type_info();
});
},
[&] (const abstract_type& o) -> bool {
throw std::runtime_error(format("collection_mutation_view::is_any_live: unknown type {}", o.name()));
}
));
}
template <typename F>
GCC6_CONCEPT(requires std::is_invocable_r_v<const data::type_info&, F, bytes_view&>)
static api::timestamp_type last_update(const atomic_cell_value_view& data, F&& read_cell_type_info) {
return data.with_linearized([&] (bytes_view in) {
api::timestamp_type max = api::missing_timestamp;
auto has_tomb = read_simple<bool>(in);
if (has_tomb) {
max = std::max(max, read_simple<api::timestamp_type>(in));
(void)read_simple<gc_clock::duration::rep>(in);
}
auto nr = read_simple<uint32_t>(in);
for (uint32_t i = 0; i != nr; ++i) {
auto& type_info = read_cell_type_info(in);
auto vsize = read_simple<uint32_t>(in);
auto value = atomic_cell_view::from_bytes(type_info, read_simple_bytes(in, vsize));
max = std::max(value.timestamp(), max);
}
return max;
});
}
api::timestamp_type collection_mutation_view::last_update(const abstract_type& type) const {
return visit(type, make_visitor(
[&] (const collection_type_impl& ctype) {
auto& type_info = ctype.value_comparator()->imr_state().type_info();
return ::last_update(data, [&type_info] (bytes_view& in) -> const data::type_info& {
auto key_size = read_simple<uint32_t>(in);
in.remove_prefix(key_size);
return type_info;
});
},
[&] (const user_type_impl& utype) {
return ::last_update(data, [&utype] (bytes_view& in) -> const data::type_info& {
auto key_size = read_simple<uint32_t>(in);
auto key = read_simple_bytes(in, key_size);
return utype.type(deserialize_field_index(key))->imr_state().type_info();
});
},
[&] (const abstract_type& o) -> api::timestamp_type {
throw std::runtime_error(format("collection_mutation_view::last_update: unknown type {}", o.name()));
}
));
}
collection_mutation_description
collection_mutation_view_description::materialize(const abstract_type& type) const {
collection_mutation_description m;
m.tomb = tomb;
m.cells.reserve(cells.size());
visit(type, make_visitor(
[&] (const collection_type_impl& ctype) {
auto& value_type = *ctype.value_comparator();
for (auto&& e : cells) {
m.cells.emplace_back(to_bytes(e.first), atomic_cell(value_type, e.second));
}
},
[&] (const user_type_impl& utype) {
for (auto&& e : cells) {
m.cells.emplace_back(to_bytes(e.first), atomic_cell(*utype.type(deserialize_field_index(e.first)), e.second));
}
},
[&] (const abstract_type& o) {
throw std::runtime_error(format("attempted to materialize collection_mutation_view_description with type {}", o.name()));
}
));
return m;
}
bool collection_mutation_description::compact_and_expire(column_id id, row_tombstone base_tomb, gc_clock::time_point query_time,
can_gc_fn& can_gc, gc_clock::time_point gc_before, compaction_garbage_collector* collector)
{
bool any_live = false;
auto t = tomb;
tombstone purged_tomb;
if (tomb <= base_tomb.regular()) {
tomb = tombstone();
} else if (tomb.deletion_time < gc_before && can_gc(tomb)) {
purged_tomb = tomb;
tomb = tombstone();
}
t.apply(base_tomb.regular());
utils::chunked_vector<std::pair<bytes, atomic_cell>> survivors;
utils::chunked_vector<std::pair<bytes, atomic_cell>> losers;
for (auto&& name_and_cell : cells) {
atomic_cell& cell = name_and_cell.second;
auto cannot_erase_cell = [&] {
return cell.deletion_time() >= gc_before || !can_gc(tombstone(cell.timestamp(), cell.deletion_time()));
};
if (cell.is_covered_by(t, false) || cell.is_covered_by(base_tomb.shadowable().tomb(), false)) {
continue;
}
if (cell.has_expired(query_time)) {
if (cannot_erase_cell()) {
survivors.emplace_back(std::make_pair(
std::move(name_and_cell.first), atomic_cell::make_dead(cell.timestamp(), cell.deletion_time())));
} else if (collector) {
losers.emplace_back(std::pair(
std::move(name_and_cell.first), atomic_cell::make_dead(cell.timestamp(), cell.deletion_time())));
}
} else if (!cell.is_live()) {
if (cannot_erase_cell()) {
survivors.emplace_back(std::move(name_and_cell));
} else if (collector) {
losers.emplace_back(std::move(name_and_cell));
}
} else {
any_live |= true;
survivors.emplace_back(std::move(name_and_cell));
}
}
if (collector) {
collector->collect(id, collection_mutation_description{purged_tomb, std::move(losers)});
}
cells = std::move(survivors);
return any_live;
}
template <typename Iterator>
static collection_mutation serialize_collection_mutation(
const abstract_type& type,
const tombstone& tomb,
boost::iterator_range<Iterator> cells) {
auto element_size = [] (size_t c, auto&& e) -> size_t {
return c + 8 + e.first.size() + e.second.serialize().size();
};
auto size = accumulate(cells, (size_t)4, element_size);
size += 1;
if (tomb) {
size += sizeof(tomb.timestamp) + sizeof(tomb.deletion_time);
}
bytes ret(bytes::initialized_later(), size);
bytes::iterator out = ret.begin();
*out++ = bool(tomb);
if (tomb) {
write(out, tomb.timestamp);
write(out, tomb.deletion_time.time_since_epoch().count());
}
auto writeb = [&out] (bytes_view v) {
serialize_int32(out, v.size());
out = std::copy_n(v.begin(), v.size(), out);
};
// FIXME: overflow?
serialize_int32(out, boost::distance(cells));
for (auto&& kv : cells) {
auto&& k = kv.first;
auto&& v = kv.second;
writeb(k);
writeb(v.serialize());
}
return collection_mutation(type, ret);
}
collection_mutation collection_mutation_description::serialize(const abstract_type& type) const {
return serialize_collection_mutation(type, tomb, boost::make_iterator_range(cells.begin(), cells.end()));
}
collection_mutation collection_mutation_view_description::serialize(const abstract_type& type) const {
return serialize_collection_mutation(type, tomb, boost::make_iterator_range(cells.begin(), cells.end()));
}
template <typename C>
GCC6_CONCEPT(requires std::is_base_of_v<abstract_type, std::remove_reference_t<C>>)
static collection_mutation_view_description
merge(collection_mutation_view_description a, collection_mutation_view_description b, C&& key_type) {
using element_type = std::pair<bytes_view, atomic_cell_view>;
auto compare = [&] (const element_type& e1, const element_type& e2) {
return key_type.less(e1.first, e2.first);
};
auto merge = [] (const element_type& e1, const element_type& e2) {
// FIXME: use std::max()?
return std::make_pair(e1.first, compare_atomic_cell_for_merge(e1.second, e2.second) > 0 ? e1.second : e2.second);
};
// applied to a tombstone, returns a predicate checking whether a cell is killed by
// the tombstone
auto cell_killed = [] (const std::optional<tombstone>& t) {
return [&t] (const element_type& e) {
if (!t) {
return false;
}
// tombstone wins if timestamps equal here, unlike row tombstones
if (t->timestamp < e.second.timestamp()) {
return false;
}
return true;
// FIXME: should we consider TTLs too?
};
};
collection_mutation_view_description merged;
merged.cells.reserve(a.cells.size() + b.cells.size());
combine(a.cells.begin(), std::remove_if(a.cells.begin(), a.cells.end(), cell_killed(b.tomb)),
b.cells.begin(), std::remove_if(b.cells.begin(), b.cells.end(), cell_killed(a.tomb)),
std::back_inserter(merged.cells),
compare,
merge);
merged.tomb = std::max(a.tomb, b.tomb);
return merged;
}
collection_mutation merge(const abstract_type& type, collection_mutation_view a, collection_mutation_view b) {
return a.with_deserialized(type, [&] (collection_mutation_view_description a_view) {
return b.with_deserialized(type, [&] (collection_mutation_view_description b_view) {
return visit(type, make_visitor(
[&] (const collection_type_impl& ctype) {
return merge(std::move(a_view), std::move(b_view), *ctype.name_comparator());
},
[&] (const user_type_impl& utype) {
return merge(std::move(a_view), std::move(b_view), *short_type);
},
[] (const abstract_type& o) -> collection_mutation_view_description {
throw std::runtime_error(format("collection_mutation merge: unknown type: {}", o.name()));
}
)).serialize(type);
});
});
}
template <typename C>
GCC6_CONCEPT(requires std::is_base_of_v<abstract_type, std::remove_reference_t<C>>)
static collection_mutation_view_description
difference(collection_mutation_view_description a, collection_mutation_view_description b, C&& key_type)
{
collection_mutation_view_description diff;
diff.cells.reserve(std::max(a.cells.size(), b.cells.size()));
auto it = b.cells.begin();
for (auto&& c : a.cells) {
while (it != b.cells.end() && key_type.less(it->first, c.first)) {
++it;
}
if (it == b.cells.end() || !key_type.equal(it->first, c.first)
|| compare_atomic_cell_for_merge(c.second, it->second) > 0) {
auto cell = std::make_pair(c.first, c.second);
diff.cells.emplace_back(std::move(cell));
}
}
if (a.tomb > b.tomb) {
diff.tomb = a.tomb;
}
return diff;
}
collection_mutation difference(const abstract_type& type, collection_mutation_view a, collection_mutation_view b)
{
return a.with_deserialized(type, [&] (collection_mutation_view_description a_view) {
return b.with_deserialized(type, [&] (collection_mutation_view_description b_view) {
return visit(type, make_visitor(
[&] (const collection_type_impl& ctype) {
return difference(std::move(a_view), std::move(b_view), *ctype.name_comparator());
},
[&] (const user_type_impl& utype) {
return difference(std::move(a_view), std::move(b_view), *short_type);
},
[] (const abstract_type& o) -> collection_mutation_view_description {
throw std::runtime_error(format("collection_mutation difference: unknown type: {}", o.name()));
}
)).serialize(type);
});
});
}
template <typename F>
GCC6_CONCEPT(requires std::is_invocable_r_v<std::pair<bytes_view, atomic_cell_view>, F, bytes_view&>)
static collection_mutation_view_description
deserialize_collection_mutation(bytes_view in, F&& read_kv) {
collection_mutation_view_description ret;
auto has_tomb = read_simple<bool>(in);
if (has_tomb) {
auto ts = read_simple<api::timestamp_type>(in);
auto ttl = read_simple<gc_clock::duration::rep>(in);
ret.tomb = tombstone{ts, gc_clock::time_point(gc_clock::duration(ttl))};
}
auto nr = read_simple<uint32_t>(in);
ret.cells.reserve(nr);
for (uint32_t i = 0; i != nr; ++i) {
ret.cells.push_back(read_kv(in));
}
assert(in.empty());
return ret;
}
collection_mutation_view_description
deserialize_collection_mutation(const abstract_type& type, bytes_view in) {
return visit(type, make_visitor(
[&] (const collection_type_impl& ctype) {
// value_comparator(), ugh
auto& type_info = ctype.value_comparator()->imr_state().type_info();
return deserialize_collection_mutation(in, [&type_info] (bytes_view& in) {
// FIXME: we could probably avoid the need for size
auto ksize = read_simple<uint32_t>(in);
auto key = read_simple_bytes(in, ksize);
auto vsize = read_simple<uint32_t>(in);
auto value = atomic_cell_view::from_bytes(type_info, read_simple_bytes(in, vsize));
return std::make_pair(key, value);
});
},
[&] (const user_type_impl& utype) {
return deserialize_collection_mutation(in, [&utype] (bytes_view& in) {
// FIXME: we could probably avoid the need for size
auto ksize = read_simple<uint32_t>(in);
auto key = read_simple_bytes(in, ksize);
auto vsize = read_simple<uint32_t>(in);
auto value = atomic_cell_view::from_bytes(
utype.type(deserialize_field_index(key))->imr_state().type_info(), read_simple_bytes(in, vsize));
return std::make_pair(key, value);
});
},
[&] (const abstract_type& o) -> collection_mutation_view_description {
throw std::runtime_error(format("deserialize_collection_mutation: unknown type {}", o.name()));
}
));
}

View File

@@ -1,124 +0,0 @@
/*
* Copyright (C) 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "utils/chunked_vector.hh"
#include "schema_fwd.hh"
#include "gc_clock.hh"
#include "atomic_cell.hh"
#include "cql_serialization_format.hh"
class abstract_type;
class compaction_garbage_collector;
class row_tombstone;
class collection_mutation;
// An auxiliary struct used to (de)construct collection_mutations.
// Unlike collection_mutation which is a serialized blob, this struct allows to inspect logical units of information
// (tombstone and cells) inside the mutation easily.
struct collection_mutation_description {
tombstone tomb;
// FIXME: use iterators?
// we never iterate over `cells` more than once, so there is no need to store them in memory.
// In some cases instead of constructing the `cells` vector, it would be more efficient to provide
// a one-time-use forward iterator which returns the cells.
utils::chunked_vector<std::pair<bytes, atomic_cell>> cells;
// Expires cells based on query_time. Expires tombstones based on max_purgeable and gc_before.
// Removes cells covered by tomb or this->tomb.
bool compact_and_expire(column_id id, row_tombstone tomb, gc_clock::time_point query_time,
can_gc_fn&, gc_clock::time_point gc_before, compaction_garbage_collector* collector = nullptr);
// Packs the data to a serialized blob.
collection_mutation serialize(const abstract_type&) const;
};
// Similar to collection_mutation_description, except that it doesn't store the cells' data, only observes it.
struct collection_mutation_view_description {
tombstone tomb;
// FIXME: use iterators? See the fixme in collection_mutation_description; the same considerations apply here.
utils::chunked_vector<std::pair<bytes_view, atomic_cell_view>> cells;
// Copies the observed data, storing it in a collection_mutation_description.
collection_mutation_description materialize(const abstract_type&) const;
// Packs the data to a serialized blob.
collection_mutation serialize(const abstract_type&) const;
};
// Given a linearized collection_mutation_view, returns an auxiliary struct allowing the inspection of each cell.
// The struct is an observer of the data given by the collection_mutation_view and doesn't extend its lifetime.
// The function needs to be given the type of stored data to reconstruct the structural information.
collection_mutation_view_description deserialize_collection_mutation(const abstract_type&, bytes_view);
class collection_mutation_view {
public:
atomic_cell_value_view data;
// Is this a noop mutation?
bool is_empty() const;
// Is any of the stored cells live (not deleted nor expired) at the time point `tp`,
// given the later of the tombstones `t` and the one stored in the mutation (if any)?
// Requires a type to reconstruct the structural information.
bool is_any_live(const abstract_type&, tombstone t = tombstone(), gc_clock::time_point tp = gc_clock::time_point::min()) const;
// The maximum of timestamps of the mutation's cells and tombstone.
api::timestamp_type last_update(const abstract_type&) const;
// Given a function that operates on a collection_mutation_view_description,
// calls it on the corresponding description of `this`.
template <typename F>
inline decltype(auto) with_deserialized(const abstract_type& type, F f) const {
return data.with_linearized([&] (bytes_view bv) {
return f(deserialize_collection_mutation(type, std::move(bv)));
});
}
};
// A serialized mutation of a collection of cells.
// Used to represent mutations of collections (lists, maps, sets) or non-frozen user defined types.
// It contains a sequence of cells, each representing a mutation of a single entry (element or field) of the collection.
// Each cell has an associated 'key' (or 'path'). The meaning of each (key, cell) pair is:
// for sets: the key is the serialized set element, the cell contains no data (except liveness information),
// for maps: the key is the serialized map element's key, the cell contains the serialized map element's value,
// for lists: the key is a timeuuid identifying the list entry, the cell contains the serialized value,
// for user types: the key is an index identifying the field, the cell contains the value of the field.
// The mutation may also contain a collection-wide tombstone.
class collection_mutation {
public:
using imr_object_type = imr::utils::object<data::cell::structure>;
imr_object_type _data;
collection_mutation() {}
collection_mutation(const abstract_type&, collection_mutation_view);
collection_mutation(const abstract_type&, bytes_view);
operator collection_mutation_view() const;
};
collection_mutation merge(const abstract_type&, collection_mutation_view, collection_mutation_view);
collection_mutation difference(const abstract_type&, collection_mutation_view, collection_mutation_view);
// Serializes the given collection of cells to a sequence of bytes ready to be sent over the CQL protocol.
bytes serialize_for_cql(const abstract_type&, collection_mutation_view, cql_serialization_format);

View File

@@ -22,7 +22,7 @@
#pragma once
#include "schema.hh"
#include "collection_mutation.hh"
#include "types/collection.hh"
class atomic_cell;
class row_marker;
@@ -31,6 +31,6 @@ class compaction_garbage_collector {
public:
virtual ~compaction_garbage_collector() = default;
virtual void collect(column_id id, atomic_cell) = 0;
virtual void collect(column_id id, collection_mutation_description) = 0;
virtual void collect(column_id id, collection_type_impl::mutation) = 0;
virtual void collect(row_marker) = 0;
};

View File

@@ -125,8 +125,6 @@ struct date_type_impl final : public concrete_type<db_clock::time_point> {
date_type_impl();
};
using timestamp_date_base_class = concrete_type<db_clock::time_point>;
struct timeuuid_type_impl final : public concrete_type<utils::UUID> {
timeuuid_type_impl();
};

View File

@@ -112,9 +112,6 @@ read_request_timeout_in_ms: 5000
# How long the coordinator should wait for writes to complete
write_request_timeout_in_ms: 2000
# how long a coordinator should continue to retry a CAS operation
# that contends with other proposals for the same row
cas_contention_timeout_in_ms: 1000
# phi value that must be reached for a host to be marked down.
# most users should never need to adjust this.
@@ -241,9 +238,7 @@ batch_size_fail_threshold_in_kb: 50
# broadcast_rpc_address: 1.2.3.4
# Uncomment to enable experimental features
# experimental_features:
# - cdc
# - lwt
# experimental: true
# The directory where hints files are stored if hinted handoff is enabled.
# hints_directory: /var/lib/scylla/hints

View File

@@ -285,9 +285,7 @@ scylla_tests = [
'tests/row_cache_stress_test',
'tests/memory_footprint',
'tests/perf/perf_sstable',
'tests/cdc_test',
'tests/cql_query_test',
'tests/user_types_test',
'tests/secondary_index_test',
'tests/json_cql_query_test',
'tests/filtering_test',
@@ -381,7 +379,6 @@ scylla_tests = [
'tests/data_listeners_test',
'tests/truncation_migration_test',
'tests/like_matcher_test',
'tests/enum_option_test',
]
perf_tests = [
@@ -469,7 +466,6 @@ cassandra_interface = Thrift(source='interface/cassandra.thrift', service='Cassa
scylla_core = (['database.cc',
'table.cc',
'atomic_cell.cc',
'collection_mutation.cc',
'hashers.cc',
'schema.cc',
'frozen_schema.cc',
@@ -509,7 +505,6 @@ scylla_core = (['database.cc',
'sstables/partition.cc',
'sstables/compaction.cc',
'sstables/compaction_strategy.cc',
'sstables/leveled_compaction_strategy.cc',
'sstables/compaction_manager.cc',
'sstables/integrity_checked_file_impl.cc',
'sstables/prepended_input_stream.cc',
@@ -518,7 +513,6 @@ scylla_core = (['database.cc',
'transport/event_notifier.cc',
'transport/server.cc',
'transport/messages/result_message.cc',
'cdc/cdc.cc',
'cql3/abstract_marker.cc',
'cql3/attributes.cc',
'cql3/cf_name.cc',
@@ -547,7 +541,6 @@ scylla_core = (['database.cc',
'cql3/statements/schema_altering_statement.cc',
'cql3/statements/ks_prop_defs.cc',
'cql3/statements/modification_statement.cc',
'cql3/statements/cas_request.cc',
'cql3/statements/parsed_statement.cc',
'cql3/statements/property_definitions.cc',
'cql3/statements/update_statement.cc',
@@ -585,10 +578,6 @@ scylla_core = (['database.cc',
'service/priority_manager.cc',
'service/migration_manager.cc',
'service/storage_proxy.cc',
'service/paxos/proposal.cc',
'service/paxos/prepare_response.cc',
'service/paxos/paxos_state.cc',
'service/paxos/prepare_summary.cc',
'cql3/operator.cc',
'cql3/relation.cc',
'cql3/column_identifier.cc',
@@ -727,7 +716,6 @@ scylla_core = (['database.cc',
'tracing/trace_keyspace_helper.cc',
'tracing/trace_state.cc',
'tracing/tracing_backend_registry.cc',
'tracing/traced_file.cc',
'table_helper.cc',
'range_tombstone.cc',
'range_tombstone_list.cc',
@@ -794,7 +782,6 @@ alternator = [
Antlr3Grammar('alternator/expressions.g'),
'alternator/conditions.cc',
'alternator/rjson.cc',
'alternator/auth.cc',
]
idls = ['idl/gossip_digest.idl.hh',
@@ -821,8 +808,6 @@ idls = ['idl/gossip_digest.idl.hh',
'idl/consistency_level.idl.hh',
'idl/cache_temperature.idl.hh',
'idl/view.idl.hh',
'idl/messaging_service.idl.hh',
'idl/paxos.idl.hh',
]
headers = find_headers('.', excluded_dirs=['idl', 'build', 'seastar', '.git'])
@@ -856,6 +841,7 @@ pure_boost_tests = set([
'tests/range_test',
'tests/crc_test',
'tests/checksum_utils_test',
'tests/managed_vector_test',
'tests/dynamic_bitset_test',
'tests/idl_test',
'tests/cartesian_product_test',
@@ -876,7 +862,6 @@ pure_boost_tests = set([
'tests/top_k_test',
'tests/small_vector_test',
'tests/like_matcher_test',
'tests/enum_option_test',
])
tests_not_using_seastar_test_framework = set([
@@ -1087,6 +1072,17 @@ scylla_release = file.read().strip()
extra_cxxflags["release.cc"] = "-DSCYLLA_VERSION=\"\\\"" + scylla_version + "\\\"\" -DSCYLLA_RELEASE=\"\\\"" + scylla_release + "\\\"\""
seastar_flags = []
if args.dpdk:
# fake dependencies on dpdk, so that it is built before anything else
seastar_flags += ['--enable-dpdk']
if args.gcc6_concepts:
seastar_flags += ['--enable-gcc6-concepts']
if args.alloc_failure_injector:
seastar_flags += ['--enable-alloc-failure-injector']
if args.split_dwarf:
seastar_flags += ['--split-dwarf']
# We never compress debug info in debug mode
modes['debug']['cxxflags'] += ' -gz'
# We compress it by default in release mode
@@ -1101,56 +1097,27 @@ seastar_cflags += ' -Wno-error'
if args.target != '':
seastar_cflags += ' -march=' + args.target
seastar_ldflags = args.user_ldflags
seastar_flags += ['--compiler', args.cxx, '--c-compiler', args.cc, '--cflags=%s' % (seastar_cflags), '--ldflags=%s' % (seastar_ldflags),
'--c++-dialect=gnu++17', '--use-std-optional-variant-stringview=1', '--optflags=%s' % (modes['release']['cxx_ld_flags']), ]
libdeflate_cflags = seastar_cflags
zstd_cflags = seastar_cflags + ' -Wno-implicit-fallthrough'
MODE_TO_CMAKE_BUILD_TYPE = {'release' : 'RelWithDebInfo', 'debug' : 'Debug', 'dev' : 'Dev', 'sanitize' : 'Sanitize' }
status = subprocess.call([args.python, './configure.py'] + seastar_flags, cwd='seastar')
def configure_seastar(build_dir, mode):
seastar_build_dir = os.path.join(build_dir, mode, 'seastar')
if status != 0:
print('Seastar configuration failed')
sys.exit(1)
seastar_cmake_args = [
'-DCMAKE_BUILD_TYPE={}'.format(MODE_TO_CMAKE_BUILD_TYPE[mode]),
'-DCMAKE_C_COMPILER={}'.format(args.cc),
'-DCMAKE_CXX_COMPILER={}'.format(args.cxx),
'-DSeastar_CXX_FLAGS={}'.format((seastar_cflags + ' ' + modes[mode]['cxx_ld_flags']).replace(' ', ';')),
'-DSeastar_LD_FLAGS={}'.format(seastar_ldflags),
'-DSeastar_CXX_DIALECT=gnu++17',
'-DSeastar_STD_OPTIONAL_VARIANT_STRINGVIEW=ON',
'-DSeastar_UNUSED_RESULT_ERROR=ON',
]
if args.dpdk:
seastar_cmake_args += ['-DSeastar_DPDK=ON', '-DSeastar_DPDK_MACHINE=wsm']
if args.gcc6_concepts:
seastar_cmake_args += ['-DSeastar_GCC6_CONCEPTS=ON']
if args.split_dwarf:
seastar_cmake_args += ['-DSeastar_SPLIT_DWARF=ON']
if args.alloc_failure_injector:
seastar_cmake_args += ['-DSeastar_ALLOC_FAILURE_INJECTION=ON']
seastar_cmd = ['cmake', '-G', 'Ninja', os.path.relpath('seastar', seastar_build_dir)] + seastar_cmake_args
cmake_dir = seastar_build_dir
if args.dpdk:
# need to cook first
cmake_dir = 'seastar' # required by cooking.sh
relative_seastar_build_dir = os.path.join('..', seastar_build_dir) # relative to seastar/
seastar_cmd = ['./cooking.sh', '-i', 'dpdk', '-d', relative_seastar_build_dir, '--'] + seastar_cmd[4:]
print(seastar_cmd)
os.makedirs(seastar_build_dir, exist_ok=True)
subprocess.check_call(seastar_cmd, shell=False, cwd=cmake_dir)
for mode in build_modes:
configure_seastar('build', mode)
pc = {mode: 'build/{}/seastar/seastar.pc'.format(mode) for mode in build_modes}
pc = {mode: 'build/{}/seastar.pc'.format(mode) for mode in build_modes}
ninja = find_executable('ninja') or find_executable('ninja-build')
if not ninja:
print('Ninja executable (ninja or ninja-build) not found on PATH\n')
sys.exit(1)
def query_seastar_flags(pc_file, link_static_cxx=False):
def query_seastar_flags(seastar_pc_file, link_static_cxx=False):
pc_file = os.path.join('seastar', seastar_pc_file)
cflags = pkg_config(pc_file, '--cflags', '--static')
libs = pkg_config(pc_file, '--libs', '--static')
@@ -1164,6 +1131,8 @@ for mode in build_modes:
modes[mode]['seastar_cflags'] = seastar_cflags
modes[mode]['seastar_libs'] = seastar_libs
MODE_TO_CMAKE_BUILD_TYPE = {'release' : 'RelWithDebInfo', 'debug' : 'Debug', 'dev' : 'Dev', 'sanitize' : 'Sanitize' }
# We need to use experimental features of the zstd library (to use our own allocators for the (de)compression context),
# which are available only when the library is linked statically.
def configure_zstd(build_dir, mode):
@@ -1332,7 +1301,7 @@ with open(buildfile_tmp, 'w') as f:
serializers = {}
thrifts = set()
antlr3_grammars = set()
seastar_dep = 'build/{}/seastar/libseastar.a'.format(mode)
seastar_dep = 'seastar/build/{}/libseastar.a'.format(mode)
for binary in build_artifacts:
if binary in other:
continue
@@ -1361,7 +1330,7 @@ with open(buildfile_tmp, 'w') as f:
if binary in pure_boost_tests:
local_libs += ' ' + maybe_static(args.staticboost, '-lboost_unit_test_framework')
if binary not in tests_not_using_seastar_test_framework:
pc_path = pc[mode].replace('seastar.pc', 'seastar-testing.pc')
pc_path = os.path.join('seastar', pc[mode].replace('seastar.pc', 'seastar-testing.pc'))
local_libs += ' ' + pkg_config(pc_path, '--libs', '--static')
if has_thrift:
local_libs += ' ' + thrift_libs + ' ' + maybe_static(args.staticboost, '-lboost_system')
@@ -1449,18 +1418,18 @@ with open(buildfile_tmp, 'w') as f:
f.write('build $builddir/{mode}/{hh}.o: checkhh.{mode} {hh} || {gen_headers_dep}\n'.format(
mode=mode, hh=hh, gen_headers_dep=gen_headers_dep))
f.write('build build/{mode}/seastar/libseastar.a: ninja | always\n'
f.write('build seastar/build/{mode}/libseastar.a: ninja | always\n'
.format(**locals()))
f.write(' pool = submodule_pool\n')
f.write(' subdir = build/{mode}/seastar\n'.format(**locals()))
f.write(' subdir = seastar/build/{mode}\n'.format(**locals()))
f.write(' target = seastar seastar_testing\n'.format(**locals()))
f.write('build build/{mode}/seastar/apps/iotune/iotune: ninja\n'
f.write('build seastar/build/{mode}/apps/iotune/iotune: ninja\n'
.format(**locals()))
f.write(' pool = submodule_pool\n')
f.write(' subdir = build/{mode}/seastar\n'.format(**locals()))
f.write(' subdir = seastar/build/{mode}\n'.format(**locals()))
f.write(' target = iotune\n'.format(**locals()))
f.write(textwrap.dedent('''\
build build/{mode}/iotune: copy build/{mode}/seastar/apps/iotune/iotune
build build/{mode}/iotune: copy seastar/build/{mode}/apps/iotune/iotune
''').format(**locals()))
f.write('build build/{mode}/scylla-package.tar.gz: package build/{mode}/scylla build/{mode}/iotune build/SCYLLA-RELEASE-FILE build/SCYLLA-VERSION-FILE | always\n'.format(**locals()))
f.write(' pool = submodule_pool\n')
@@ -1481,7 +1450,7 @@ with open(buildfile_tmp, 'w') as f:
rule configure
command = {python} configure.py $configure_args
generator = 1
build build.ninja: configure | configure.py
build build.ninja: configure | configure.py seastar/configure.py
rule cscope
command = find -name '*.[chS]' -o -name "*.cc" -o -name "*.hh" | cscope -bq -i-
description = CSCOPE

View File

@@ -21,9 +21,6 @@
#pragma once
#include "types/user.hh"
#include "concrete_types.hh"
#include "mutation_partition_view.hh"
#include "mutation_partition.hh"
#include "schema.hh"
@@ -38,8 +35,8 @@ class converting_mutation_partition_applier : public mutation_partition_visitor
const column_mapping& _visited_column_mapping;
deletable_row* _current_row;
private:
static bool is_compatible(const column_definition& new_def, const abstract_type& old_type, column_kind kind) {
return ::is_compatible(new_def.kind, kind) && new_def.type->is_value_compatible_with(old_type);
static bool is_compatible(const column_definition& new_def, const data_type& old_type, column_kind kind) {
return ::is_compatible(new_def.kind, kind) && new_def.type->is_value_compatible_with(*old_type);
}
static atomic_cell upgrade_cell(const abstract_type& new_type, const abstract_type& old_type, atomic_cell_view cell,
atomic_cell::collection_member cm = atomic_cell::collection_member::no) {
@@ -52,59 +49,32 @@ private:
return atomic_cell(new_type, cell);
}
}
static void accept_cell(row& dst, column_kind kind, const column_definition& new_def, const abstract_type& old_type, atomic_cell_view cell) {
static void accept_cell(row& dst, column_kind kind, const column_definition& new_def, const data_type& old_type, atomic_cell_view cell) {
if (!is_compatible(new_def, old_type, kind) || cell.timestamp() <= new_def.dropped_at()) {
return;
}
dst.apply(new_def, upgrade_cell(*new_def.type, old_type, cell));
dst.apply(new_def, upgrade_cell(*new_def.type, *old_type, cell));
}
static void accept_cell(row& dst, column_kind kind, const column_definition& new_def, const abstract_type& old_type, collection_mutation_view cell) {
static void accept_cell(row& dst, column_kind kind, const column_definition& new_def, const data_type& old_type, collection_mutation_view cell) {
if (!is_compatible(new_def, old_type, kind)) {
return;
}
cell.data.with_linearized([&] (bytes_view cell_bv) {
auto new_ctype = static_pointer_cast<const collection_type_impl>(new_def.type);
auto old_ctype = static_pointer_cast<const collection_type_impl>(old_type);
auto old_view = old_ctype->deserialize_mutation_form(cell_bv);
cell.with_deserialized(old_type, [&] (collection_mutation_view_description old_view) {
collection_mutation_description new_view;
collection_type_impl::mutation new_view;
if (old_view.tomb.timestamp > new_def.dropped_at()) {
new_view.tomb = old_view.tomb;
}
visit(old_type, make_visitor(
[&] (const collection_type_impl& old_ctype) {
assert(new_def.type->is_collection()); // because is_compatible
auto& new_ctype = static_cast<const collection_type_impl&>(*new_def.type);
auto& new_value_type = *new_ctype.value_comparator();
auto& old_value_type = *old_ctype.value_comparator();
for (auto& c : old_view.cells) {
if (c.second.timestamp() > new_def.dropped_at()) {
new_view.cells.emplace_back(c.first, upgrade_cell(
new_value_type, old_value_type, c.second, atomic_cell::collection_member::yes));
}
}
},
[&] (const user_type_impl& old_utype) {
assert(new_def.type->is_user_type()); // because is_compatible
auto& new_utype = static_cast<const user_type_impl&>(*new_def.type);
for (auto& c : old_view.cells) {
if (c.second.timestamp() > new_def.dropped_at()) {
auto idx = deserialize_field_index(c.first);
assert(idx < new_utype.size() && idx < old_utype.size());
new_view.cells.emplace_back(c.first, upgrade_cell(
*new_utype.type(idx), *old_utype.type(idx), c.second, atomic_cell::collection_member::yes));
}
}
},
[&] (const abstract_type& o) {
throw std::runtime_error(format("not a multi-cell type: {}", o.name()));
for (auto& c : old_view.cells) {
if (c.second.timestamp() > new_def.dropped_at()) {
new_view.cells.emplace_back(c.first, upgrade_cell(*new_ctype->value_comparator(), *old_ctype->value_comparator(), c.second, atomic_cell::collection_member::yes));
}
));
}
if (new_view.tomb || !new_view.cells.empty()) {
dst.apply(new_def, new_view.serialize(*new_def.type));
dst.apply(new_def, new_ctype->serialize_mutation_form(std::move(new_view)));
}
});
}
@@ -130,7 +100,7 @@ public:
const column_mapping_entry& col = _visited_column_mapping.static_column_at(id);
const column_definition* def = _p_schema.get_column_definition(col.name());
if (def) {
accept_cell(_p._static_row.maybe_create(), column_kind::static_column, *def, *col.type(), cell);
accept_cell(_p._static_row, column_kind::static_column, *def, col.type(), cell);
}
}
@@ -138,7 +108,7 @@ public:
const column_mapping_entry& col = _visited_column_mapping.static_column_at(id);
const column_definition* def = _p_schema.get_column_definition(col.name());
if (def) {
accept_cell(_p._static_row.maybe_create(), column_kind::static_column, *def, *col.type(), collection);
accept_cell(_p._static_row, column_kind::static_column, *def, col.type(), collection);
}
}
@@ -161,7 +131,7 @@ public:
const column_mapping_entry& col = _visited_column_mapping.regular_column_at(id);
const column_definition* def = _p_schema.get_column_definition(col.name());
if (def) {
accept_cell(_current_row->cells(), column_kind::regular_column, *def, *col.type(), cell);
accept_cell(_current_row->cells(), column_kind::regular_column, *def, col.type(), cell);
}
}
@@ -169,7 +139,7 @@ public:
const column_mapping_entry& col = _visited_column_mapping.regular_column_at(id);
const column_definition* def = _p_schema.get_column_definition(col.name());
if (def) {
accept_cell(_current_row->cells(), column_kind::regular_column, *def, *col.type(), collection);
accept_cell(_current_row->cells(), column_kind::regular_column, *def, col.type(), collection);
}
}
@@ -177,9 +147,9 @@ public:
// Cells must have monotonic names.
static void append_cell(row& dst, column_kind kind, const column_definition& new_def, const column_definition& old_def, const atomic_cell_or_collection& cell) {
if (new_def.is_atomic()) {
accept_cell(dst, kind, new_def, *old_def.type, cell.as_atomic_cell(old_def));
accept_cell(dst, kind, new_def, old_def.type, cell.as_atomic_cell(old_def));
} else {
accept_cell(dst, kind, new_def, *old_def.type, cell.as_collection_mutation());
accept_cell(dst, kind, new_def, old_def.type, cell.as_collection_mutation());
}
}
};

View File

@@ -524,7 +524,6 @@ usingClauseObjective[::shared_ptr<cql3::attributes::raw> attrs]
*/
updateStatement returns [::shared_ptr<raw::update_statement> expr]
@init {
bool if_exists = false;
auto attrs = ::make_shared<cql3::attributes::raw>();
std::vector<std::pair<::shared_ptr<cql3::column_identifier::raw>, ::shared_ptr<cql3::operation::raw_update>>> operations;
}
@@ -532,14 +531,13 @@ updateStatement returns [::shared_ptr<raw::update_statement> expr]
( usingClause[attrs] )?
K_SET columnOperation[operations] (',' columnOperation[operations])*
K_WHERE wclause=whereClause
( K_IF (K_EXISTS{ if_exists = true; } | conditions=updateConditions) )?
( K_IF conditions=updateConditions )?
{
return ::make_shared<raw::update_statement>(std::move(cf),
std::move(attrs),
std::move(operations),
std::move(wclause),
std::move(conditions),
if_exists);
std::move(conditions));
}
;
@@ -583,7 +581,6 @@ deleteSelection returns [std::vector<::shared_ptr<cql3::operation::raw_deletion>
deleteOp returns [::shared_ptr<cql3::operation::raw_deletion> op]
: c=cident { $op = ::make_shared<cql3::operation::column_deletion>(std::move(c)); }
| c=cident '[' t=term ']' { $op = ::make_shared<cql3::operation::element_deletion>(std::move(c), std::move(t)); }
| c=cident '.' field=ident { $op = ::make_shared<cql3::operation::field_deletion>(std::move(c), std::move(field)); }
;
usingClauseDelete[::shared_ptr<cql3::attributes::raw> attrs]
@@ -1399,9 +1396,8 @@ columnOperation[operations_type& operations]
columnOperationDifferentiator[operations_type& operations, ::shared_ptr<cql3::column_identifier::raw> key]
: '=' normalColumnOperation[operations, key]
| '[' k=term ']' collectionColumnOperation[operations, key, k, false]
| '.' field=ident udtColumnOperation[operations, key, field]
| '[' K_SCYLLA_TIMEUUID_LIST_INDEX '(' k=term ')' ']' collectionColumnOperation[operations, key, k, true]
| '[' k=term ']' specializedColumnOperation[operations, key, k, false]
| '[' K_SCYLLA_TIMEUUID_LIST_INDEX '(' k=term ')' ']' specializedColumnOperation[operations, key, k, true]
;
normalColumnOperation[operations_type& operations, ::shared_ptr<cql3::column_identifier::raw> key]
@@ -1444,38 +1440,31 @@ normalColumnOperation[operations_type& operations, ::shared_ptr<cql3::column_ide
}
;
collectionColumnOperation[operations_type& operations,
shared_ptr<cql3::column_identifier::raw> key,
shared_ptr<cql3::term::raw> k,
bool by_uuid]
specializedColumnOperation[std::vector<std::pair<shared_ptr<cql3::column_identifier::raw>,
shared_ptr<cql3::operation::raw_update>>>& operations,
shared_ptr<cql3::column_identifier::raw> key,
shared_ptr<cql3::term::raw> k,
bool by_uuid]
: '=' t=term
{
add_raw_update(operations, key, make_shared<cql3::operation::set_element>(k, t, by_uuid));
}
;
udtColumnOperation[operations_type& operations,
shared_ptr<cql3::column_identifier::raw> key,
shared_ptr<cql3::column_identifier> field]
: '=' t=term
{
add_raw_update(operations, std::move(key), make_shared<cql3::operation::set_field>(std::move(field), std::move(t)));
}
;
columnCondition[conditions_type& conditions]
// Note: we'll reject duplicates later
: key=cident
( op=relationType t=term { conditions.emplace_back(key, cql3::column_condition::raw::simple_condition(t, {}, *op)); }
( op=relationType t=term { conditions.emplace_back(key, cql3::column_condition::raw::simple_condition(t, *op)); }
| K_IN
( values=singleColumnInValues { conditions.emplace_back(key, cql3::column_condition::raw::in_condition({}, {}, values)); }
| marker=inMarker { conditions.emplace_back(key, cql3::column_condition::raw::in_condition({}, marker, {})); }
( values=singleColumnInValues { conditions.emplace_back(key, cql3::column_condition::raw::simple_in_condition(values)); }
| marker=inMarker { conditions.emplace_back(key, cql3::column_condition::raw::simple_in_condition(marker)); }
)
| '[' element=term ']'
( op=relationType t=term { conditions.emplace_back(key, cql3::column_condition::raw::simple_condition(t, element, *op)); }
( op=relationType t=term { conditions.emplace_back(key, cql3::column_condition::raw::collection_condition(t, element, *op)); }
| K_IN
( values=singleColumnInValues { conditions.emplace_back(key, cql3::column_condition::raw::in_condition(element, {}, values)); }
| marker=inMarker { conditions.emplace_back(key, cql3::column_condition::raw::in_condition(element, marker, {})); }
( values=singleColumnInValues { conditions.emplace_back(key, cql3::column_condition::raw::collection_in_condition(element, values)); }
| marker=inMarker { conditions.emplace_back(key, cql3::column_condition::raw::collection_in_condition(element, marker)); }
)
)
)

View File

@@ -45,7 +45,6 @@
#include "cql3/lists.hh"
#include "cql3/maps.hh"
#include "cql3/sets.hh"
#include "cql3/user_types.hh"
#include "types/list.hh"
namespace cql3 {
@@ -69,22 +68,19 @@ abstract_marker::raw::raw(int32_t bind_index)
::shared_ptr<term> abstract_marker::raw::prepare(database& db, const sstring& keyspace, ::shared_ptr<column_specification> receiver)
{
if (receiver->type->is_collection()) {
if (receiver->type->get_kind() == abstract_type::kind::list) {
return ::make_shared<lists::marker>(_bind_index, receiver);
} else if (receiver->type->get_kind() == abstract_type::kind::set) {
return ::make_shared<sets::marker>(_bind_index, receiver);
} else if (receiver->type->get_kind() == abstract_type::kind::map) {
return ::make_shared<maps::marker>(_bind_index, receiver);
}
assert(0);
auto receiver_type = ::dynamic_pointer_cast<const collection_type_impl>(receiver->type);
if (receiver_type == nullptr) {
return ::make_shared<constants::marker>(_bind_index, receiver);
}
if (receiver->type->is_user_type()) {
return ::make_shared<user_types::marker>(_bind_index, receiver);
if (receiver_type->get_kind() == abstract_type::kind::list) {
return ::make_shared<lists::marker>(_bind_index, receiver);
} else if (receiver_type->get_kind() == abstract_type::kind::set) {
return ::make_shared<sets::marker>(_bind_index, receiver);
} else if (receiver_type->get_kind() == abstract_type::kind::map) {
return ::make_shared<maps::marker>(_bind_index, receiver);
}
return ::make_shared<constants::marker>(_bind_index, receiver);
assert(0);
return shared_ptr<term>();
}
assignment_testable::test_result abstract_marker::raw::test_assignment(database& db, const sstring& keyspace, ::shared_ptr<column_specification> receiver) {

View File

@@ -45,8 +45,6 @@
#include "lists.hh"
#include "maps.hh"
#include <boost/range/algorithm_ext/push_back.hpp>
#include "types/map.hh"
#include "types/list.hh"
namespace {
@@ -63,54 +61,8 @@ void validate_operation_on_durations(const abstract_type& type, const cql3::oper
}
}
int is_satisfied_by(const cql3::operator_type &op, const abstract_type& cell_type,
const abstract_type& param_type, const data_value& cell_value, const bytes& param) {
int rc;
// For multi-cell sets and lists, cell value is represented as a map,
// thanks to collections_as_maps flag in partition_slice. param, however,
// is represented as a set or list type.
// We must implement an own compare of two different representations
// to compare the two.
if (cell_type.is_map() && cell_type.is_multi_cell() && param_type.is_listlike()) {
const listlike_collection_type_impl& list_type = static_cast<const listlike_collection_type_impl&>(param_type);
const map_type_impl& map_type = static_cast<const map_type_impl&>(cell_type);
assert(list_type.is_multi_cell());
// Inverse comparison result since the order of arguments is inverse.
rc = -list_type.compare_with_map(map_type, param, map_type.decompose(cell_value));
} else {
rc = cell_type.compare(cell_type.decompose(cell_value), param);
}
if (op == cql3::operator_type::EQ) {
return rc == 0;
} else if (op == cql3::operator_type::NEQ) {
return rc != 0;
} else if (op == cql3::operator_type::GTE) {
return rc >= 0;
} else if (op == cql3::operator_type::LTE) {
return rc <= 0;
} else if (op == cql3::operator_type::GT) {
return rc > 0;
} else if (op == cql3::operator_type::LT) {
return rc < 0;
}
assert(false);
return false;
}
// Read the list index from key and check that list index is not
// negative. The negative range check repeats Cassandra behaviour.
uint32_t read_and_check_list_index(const cql3::raw_value_view& key) {
// The list element type is always int32_type, see lists::index_spec_of
int32_t idx = read_simple_exactly<int32_t>(to_bytes(key));
if (idx < 0) {
throw exceptions::invalid_request_exception(format("Invalid negative list index {}", idx));
}
return static_cast<uint32_t>(idx);
}
} // end of anonymous namespace
namespace cql3 {
bool
@@ -140,134 +92,7 @@ void column_condition::collect_marker_specificaton(::shared_ptr<variable_specifi
value->collect_marker_specification(bound_names);
}
}
if (_value) {
_value->collect_marker_specification(bound_names);
}
}
bool column_condition::applies_to(const data_value* cell_value, const query_options& options) const {
// Cassandra condition support has a few quirks:
// - only a simple conjunct of predicates is supported "predicate AND predicate AND ..."
// - a predicate can operate on a column or a collection element, which must always be
// on the right side: "a = 3" or "collection['key'] IN (1,2,3)"
// - parameter markers are allowed on the right hand side only
// - only <, >, >=, <=, != and IN predicates are supported.
// - NULLs and missing values are treated differently from the WHERE clause:
// a term or cell in IF clause is allowed to be NULL or compared with NULL,
// and NULL value is treated just like any other value in the domain (there is no
// three-value logic or UNKNOWN like in SQL).
// - empty sets/lists/maps are treated differently when comparing with NULLs depending on
// whether the object is frozen or not. An empty *frozen* set/map/list is not equal to NULL.
// An empty *multi-cell* set/map/list is identical to NULL.
// The code below implements these rules in a way compatible with Cassandra.
// Use a map/list value instead of entire collection if a key is present in the predicate.
if (_collection_element != nullptr && cell_value != nullptr) {
// Checked in column_condition::raw::prepare()
assert(cell_value->type()->is_collection());
const collection_type_impl& cell_type = static_cast<const collection_type_impl&>(*cell_value->type());
cql3::raw_value_view key = _collection_element->bind_and_get(options);
if (key.is_unset_value()) {
throw exceptions::invalid_request_exception(
format("Invalid 'unset' value in {} element access", cell_type.cql3_type_name()));
}
if (key.is_null()) {
throw exceptions::invalid_request_exception(
format("Invalid null value for {} element access", cell_type.cql3_type_name()));
}
if (cell_type.is_map()) {
// If a collection is multi-cell and not frozen, it is returned as a map even if the
// underlying data type is "set" or "list". This is controlled by
// partition_slice::collections_as_maps enum, which is set when preparing a read command
// object. Representing a list as a map<timeuuid, listval> is necessary to identify the list field
// being updated, e.g. in case of UPDATE t SET list[3] = null WHERE a = 1 IF list[3]
// = 'key'
const map_type_impl& map_type = static_cast<const map_type_impl&>(cell_type);
// A map is serialized as a vector of data value pairs.
const std::vector<std::pair<data_value, data_value>>& map = map_type.from_value(*cell_value);
if (column.type->is_map()) {
// We're working with a map *type*, not only map *representation*.
with_linearized(*key, [&map, &map_type, &cell_value] (bytes_view key) {
auto end = map.end();
const auto& map_key_type = *map_type.get_keys_type();
auto less = [&map_key_type](const std::pair<data_value, data_value>& value, bytes_view key) {
return map_key_type.less(map_key_type.decompose(value.first), key);
};
// Map elements are sorted by key.
auto it = std::lower_bound(map.begin(), end, key, less);
if (it != end && map_key_type.equal(map_key_type.decompose(it->first), key)) {
cell_value = &it->second;
} else {
cell_value = nullptr;
}
});
} else if (column.type->is_list()) {
// We're working with a list type, represented as map.
uint32_t idx = read_and_check_list_index(key);
cell_value = idx >= map.size() ? nullptr : &map[idx].second;
} else {
// Syntax like "set_column['key'] = constant" is invalid.
assert(false);
}
} else if (cell_type.is_list()) {
// This is a *frozen* list.
const list_type_impl& list_type = static_cast<const list_type_impl&>(cell_type);
const std::vector<data_value>& list = list_type.from_value(*cell_value);
uint32_t idx = read_and_check_list_index(key);
cell_value = idx >= list.size() ? nullptr : &list[idx];
} else {
assert(false);
}
}
if (_op.is_compare()) {
// <, >, >=, <=, !=
cql3::raw_value_view param = _value->bind_and_get(options);
if (param.is_unset_value()) {
throw exceptions::invalid_request_exception("Invalid 'unset' value in condition");
}
if (param.is_null()) {
if (_op == operator_type::EQ) {
return cell_value == nullptr;
} else if (_op == operator_type::NEQ) {
return cell_value != nullptr;
} else {
throw exceptions::invalid_request_exception(format("Invalid comparison with null for operator \"{}\"", _op));
}
} else if (cell_value == nullptr) {
// The condition parameter is not null, so only NEQ can return true
return _op == operator_type::NEQ;
}
// type::validate() is called by bind_and_get(), so it's safe to pass to_bytes() result
// directly to compare.
return is_satisfied_by(_op, *cell_value->type(), *column.type, *cell_value, to_bytes(param));
}
assert(_op == operator_type::IN);
std::vector<bytes_opt> in_values;
if (_value) {
auto&& lval = dynamic_pointer_cast<multi_item_terminal>(_value->bind(options));
if (!lval) {
throw exceptions::invalid_request_exception("Invalid null value for IN condition");
}
in_values = std::move(lval->get_elements());
} else {
for (auto&& v : _in_values) {
in_values.emplace_back(to_bytes_opt(v->bind_and_get(options)));
}
}
// If cell value is NULL, IN list must contain NULL or an empty set/list. Otherwise it must contain cell value.
if (cell_value) {
return std::any_of(in_values.begin(), in_values.end(), [this, cell_value] (const bytes_opt& value) {
return value.has_value() && is_satisfied_by(operator_type::EQ, *cell_value->type(), *column.type, *cell_value, *value);
});
} else {
return std::any_of(in_values.begin(), in_values.end(), [] (const bytes_opt& value) { return !value.has_value() || value->empty(); });
}
_value->collect_marker_specification(bound_names);
}
::shared_ptr<column_condition>
@@ -275,54 +100,61 @@ column_condition::raw::prepare(database& db, const sstring& keyspace, const colu
if (receiver.type->is_counter()) {
throw exceptions::invalid_request_exception("Conditions on counters are not supported");
}
shared_ptr<term> collection_element_term;
shared_ptr<column_specification> value_spec = receiver.column_specification;
if (_collection_element) {
if (!receiver.type->is_collection()) {
throw exceptions::invalid_request_exception(format("Invalid element access syntax for non-collection column {}",
receiver.name_as_text()));
}
// Pass a correct type specification to the collection_element->prepare(), so that it can
// later be used to validate the parameter type is compatible with receiver type.
shared_ptr<column_specification> element_spec;
auto ctype = static_cast<const collection_type_impl*>(receiver.type.get());
if (ctype->get_kind() == abstract_type::kind::list) {
element_spec = lists::index_spec_of(receiver.column_specification);
value_spec = lists::value_spec_of(receiver.column_specification);
} else if (ctype->get_kind() == abstract_type::kind::map) {
element_spec = maps::key_spec_of(*receiver.column_specification);
value_spec = maps::value_spec_of(*receiver.column_specification);
} else if (ctype->get_kind() == abstract_type::kind::set) {
throw exceptions::invalid_request_exception(format("Invalid element access syntax for set column {}",
receiver.name_as_text()));
if (!_collection_element) {
if (_op == operator_type::IN) {
if (_in_values.empty()) { // ?
return column_condition::in_condition(receiver, _in_marker->prepare(db, keyspace, receiver.column_specification));
}
std::vector<::shared_ptr<term>> terms;
for (auto&& value : _in_values) {
terms.push_back(value->prepare(db, keyspace, receiver.column_specification));
}
return column_condition::in_condition(receiver, std::move(terms));
} else {
throw exceptions::invalid_request_exception(
format("Unsupported collection type {} in a condition with element access", ctype->cql3_type_name()));
validate_operation_on_durations(*receiver.type, _op);
return column_condition::condition(receiver, _value->prepare(db, keyspace, receiver.column_specification), _op);
}
collection_element_term = _collection_element->prepare(db, keyspace, element_spec);
}
if (_op.is_compare()) {
if (!receiver.type->is_collection()) {
throw exceptions::invalid_request_exception(format("Invalid element access syntax for non-collection column {}", receiver.name_as_text()));
}
shared_ptr<column_specification> element_spec, value_spec;
auto ctype = static_cast<const collection_type_impl*>(receiver.type.get());
if (ctype->get_kind() == abstract_type::kind::list) {
element_spec = lists::index_spec_of(receiver.column_specification);
value_spec = lists::value_spec_of(receiver.column_specification);
} else if (ctype->get_kind() == abstract_type::kind::map) {
element_spec = maps::key_spec_of(*receiver.column_specification);
value_spec = maps::value_spec_of(*receiver.column_specification);
} else if (ctype->get_kind() == abstract_type::kind::set) {
throw exceptions::invalid_request_exception(format("Invalid element access syntax for set column {}", receiver.name()));
} else {
abort();
}
if (_op == operator_type::IN) {
if (_in_values.empty()) {
return column_condition::in_condition(receiver,
_collection_element->prepare(db, keyspace, element_spec),
_in_marker->prepare(db, keyspace, value_spec));
}
std::vector<shared_ptr<term>> terms;
terms.reserve(_in_values.size());
boost::push_back(terms, _in_values
| boost::adaptors::transformed(std::bind(&term::raw::prepare, std::placeholders::_1, std::ref(db), std::ref(keyspace), value_spec)));
return column_condition::in_condition(receiver, _collection_element->prepare(db, keyspace, element_spec), terms);
} else {
validate_operation_on_durations(*receiver.type, _op);
return column_condition::condition(receiver, collection_element_term, _value->prepare(db, keyspace, value_spec), _op);
}
if (_op != operator_type::IN) {
throw exceptions::invalid_request_exception(format("Unsupported operator type {} in a condition ", _op));
}
if (_in_marker) {
assert(_in_values.empty());
shared_ptr<term> multi_item_term = _in_marker->prepare(db, keyspace, value_spec);
return column_condition::in_condition(receiver, collection_element_term, multi_item_term, {});
return column_condition::condition(receiver,
_collection_element->prepare(db, keyspace, element_spec),
_value->prepare(db, keyspace, value_spec),
_op);
}
// Both _in_values and in _in_marker can be missing in case of empty IN list: "a IN ()"
std::vector<::shared_ptr<term>> terms;
terms.reserve(_in_values.size());
for (auto&& value : _in_values) {
terms.push_back(value->prepare(db, keyspace, value_spec));
}
return column_condition::in_condition(receiver, collection_element_term, {}, std::move(terms));
}
} // end of namespace cql3
}

View File

@@ -52,18 +52,11 @@ namespace cql3 {
*/
class column_condition final {
public:
// If _collection_element is not zero, this defines the receiver cell, not the entire receiver
// column.
// E.g. if column type is list<string> and expression is "a = ['test']", then the type of the
// column definition below is list<string>. If expression is "a[0] = 'test'", then the column
// object stands for the string cell. See column_condition::raw::prepare() for details.
const column_definition& column;
private:
// For collection, when testing the equality of a specific element, nullptr otherwise.
::shared_ptr<term> _collection_element;
// A literal value for comparison predicates or a multi item terminal for "a IN ?"
::shared_ptr<term> _value;
// List of terminals for "a IN (value, value, ...)"
std::vector<::shared_ptr<term>> _in_values;
const operator_type& _op;
public:
@@ -79,6 +72,41 @@ public:
assert(_in_values.empty());
}
}
static ::shared_ptr<column_condition> condition(const column_definition& def, ::shared_ptr<term> value, const operator_type& op) {
return ::make_shared<column_condition>(def, ::shared_ptr<term>{}, std::move(value), std::vector<::shared_ptr<term>>{}, op);
}
static ::shared_ptr<column_condition> condition(const column_definition& def, ::shared_ptr<term> collection_element,
::shared_ptr<term> value, const operator_type& op) {
return ::make_shared<column_condition>(def, std::move(collection_element), std::move(value),
std::vector<::shared_ptr<term>>{}, op);
}
static ::shared_ptr<column_condition> in_condition(const column_definition& def, std::vector<::shared_ptr<term>> in_values) {
return ::make_shared<column_condition>(def, ::shared_ptr<term>{}, ::shared_ptr<term>{},
std::move(in_values), operator_type::IN);
}
static ::shared_ptr<column_condition> in_condition(const column_definition& def, ::shared_ptr<term> collection_element,
std::vector<::shared_ptr<term>> in_values) {
return ::make_shared<column_condition>(def, std::move(collection_element), ::shared_ptr<term>{},
std::move(in_values), operator_type::IN);
}
static ::shared_ptr<column_condition> in_condition(const column_definition& def, ::shared_ptr<term> in_marker) {
return ::make_shared<column_condition>(def, ::shared_ptr<term>{}, std::move(in_marker),
std::vector<::shared_ptr<term>>{}, operator_type::IN);
}
static ::shared_ptr<column_condition> in_condition(const column_definition& def, ::shared_ptr<term> collection_element,
::shared_ptr<term> in_marker) {
return ::make_shared<column_condition>(def, std::move(collection_element), std::move(in_marker),
std::vector<::shared_ptr<term>>{}, operator_type::IN);
}
bool uses_function(const sstring& ks_name, const sstring& function_name);
public:
/**
* Collects the column specification for the bind variables of this operation.
*
@@ -87,34 +115,592 @@ public:
*/
void collect_marker_specificaton(::shared_ptr<variable_specifications> bound_names);
bool uses_function(const sstring& ks_name, const sstring& function_name);
// Retrieve parameter marker values, if any, find the appropriate collection
// element if the cell is a collection and an element access is used in the expression,
// and evaluate the condition.
bool applies_to(const data_value* cell_value, const query_options& options) const;
// Helper constructor wrapper for "IF col['key'] = 'foo'" or "IF col = 'foo'" */
static ::shared_ptr<column_condition> condition(const column_definition& def, ::shared_ptr<term> collection_element,
::shared_ptr<term> value, const operator_type& op) {
return ::make_shared<column_condition>(def, std::move(collection_element), std::move(value),
std::vector<::shared_ptr<term>>{}, op);
#if 0
public ColumnCondition.Bound bind(QueryOptions options) throws InvalidRequestException
{
boolean isInCondition = operator == Operator.IN;
if (column.type instanceof CollectionType)
{
if (collectionElement == null)
return isInCondition ? new CollectionInBound(this, options) : new CollectionBound(this, options);
else
return isInCondition ? new ElementAccessInBound(this, options) : new ElementAccessBound(this, options);
}
return isInCondition ? new SimpleInBound(this, options) : new SimpleBound(this, options);
}
// Helper constructor wrapper for "IF col IN ... and IF col['key'] IN ... */
static ::shared_ptr<column_condition> in_condition(const column_definition& def, ::shared_ptr<term> collection_element,
::shared_ptr<term> in_marker, std::vector<::shared_ptr<term>> in_values) {
return ::make_shared<column_condition>(def, std::move(collection_element), std::move(in_marker),
std::move(in_values), operator_type::IN);
public static abstract class Bound
{
public final ColumnDefinition column;
public final Operator operator;
protected Bound(ColumnDefinition column, Operator operator)
{
this.column = column;
this.operator = operator;
}
/**
* Validates whether this condition applies to {@code current}.
*/
public abstract boolean appliesTo(Composite rowPrefix, ColumnFamily current, long now) throws InvalidRequestException;
public ByteBuffer getCollectionElementValue()
{
return null;
}
protected boolean isSatisfiedByValue(ByteBuffer value, Cell c, AbstractType<?> type, Operator operator, long now) throws InvalidRequestException
{
ByteBuffer columnValue = (c == null || !c.isLive(now)) ? null : c.value();
return compareWithOperator(operator, type, value, columnValue);
}
/** Returns true if the operator is satisfied (i.e. "value operator otherValue == true"), false otherwise. */
protected boolean compareWithOperator(Operator operator, AbstractType<?> type, ByteBuffer value, ByteBuffer otherValue) throws InvalidRequestException
{
if (value == null)
{
switch (operator)
{
case EQ:
return otherValue == null;
case NEQ:
return otherValue != null;
default:
throw new InvalidRequestException(String.format("Invalid comparison with null for operator \"%s\"", operator));
}
}
else if (otherValue == null)
{
// the condition value is not null, so only NEQ can return true
return operator == Operator.NEQ;
}
int comparison = type.compare(otherValue, value);
switch (operator)
{
case EQ:
return comparison == 0;
case LT:
return comparison < 0;
case LTE:
return comparison <= 0;
case GT:
return comparison > 0;
case GTE:
return comparison >= 0;
case NEQ:
return comparison != 0;
default:
// we shouldn't get IN, CONTAINS, or CONTAINS KEY here
throw new AssertionError();
}
}
protected Iterator<Cell> collectionColumns(CellName collection, ColumnFamily cf, final long now)
{
// We are testing for collection equality, so we need to have the expected values *and* only those.
ColumnSlice[] collectionSlice = new ColumnSlice[]{ collection.slice() };
// Filter live columns, this makes things simpler afterwards
return Iterators.filter(cf.iterator(collectionSlice), new Predicate<Cell>()
{
public boolean apply(Cell c)
{
// we only care about live columns
return c.isLive(now);
}
});
}
}
/**
* A condition on a single non-collection column. This does not support IN operators (see SimpleInBound).
*/
static class SimpleBound extends Bound
{
public final ByteBuffer value;
private SimpleBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
{
super(condition.column, condition.operator);
assert !(column.type instanceof CollectionType) && condition.collectionElement == null;
assert condition.operator != Operator.IN;
this.value = condition.value.bindAndGet(options);
}
public boolean appliesTo(Composite rowPrefix, ColumnFamily current, long now) throws InvalidRequestException
{
CellName name = current.metadata().comparator.create(rowPrefix, column);
return isSatisfiedByValue(value, current.getColumn(name), column.type, operator, now);
}
}
/**
* An IN condition on a single non-collection column.
*/
static class SimpleInBound extends Bound
{
public final List<ByteBuffer> inValues;
private SimpleInBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
{
super(condition.column, condition.operator);
assert !(column.type instanceof CollectionType) && condition.collectionElement == null;
assert condition.operator == Operator.IN;
if (condition.inValues == null)
this.inValues = ((Lists.Marker) condition.value).bind(options).getElements();
else
{
this.inValues = new ArrayList<>(condition.inValues.size());
for (Term value : condition.inValues)
this.inValues.add(value.bindAndGet(options));
}
}
public boolean appliesTo(Composite rowPrefix, ColumnFamily current, long now) throws InvalidRequestException
{
CellName name = current.metadata().comparator.create(rowPrefix, column);
for (ByteBuffer value : inValues)
{
if (isSatisfiedByValue(value, current.getColumn(name), column.type, Operator.EQ, now))
return true;
}
return false;
}
}
/** A condition on an element of a collection column. IN operators are not supported here, see ElementAccessInBound. */
static class ElementAccessBound extends Bound
{
public final ByteBuffer collectionElement;
public final ByteBuffer value;
private ElementAccessBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
{
super(condition.column, condition.operator);
assert column.type instanceof CollectionType && condition.collectionElement != null;
assert condition.operator != Operator.IN;
this.collectionElement = condition.collectionElement.bindAndGet(options);
this.value = condition.value.bindAndGet(options);
}
public boolean appliesTo(Composite rowPrefix, ColumnFamily current, final long now) throws InvalidRequestException
{
if (collectionElement == null)
throw new InvalidRequestException("Invalid null value for " + (column.type instanceof MapType ? "map" : "list") + " element access");
if (column.type instanceof MapType)
{
MapType mapType = (MapType) column.type;
if (column.type.isMultiCell())
{
Cell cell = current.getColumn(current.metadata().comparator.create(rowPrefix, column, collectionElement));
return isSatisfiedByValue(value, cell, mapType.getValuesType(), operator, now);
}
else
{
Cell cell = current.getColumn(current.metadata().comparator.create(rowPrefix, column));
ByteBuffer mapElementValue = cell.isLive(now) ? mapType.getSerializer().getSerializedValue(cell.value(), collectionElement, mapType.getKeysType())
: null;
return compareWithOperator(operator, mapType.getValuesType(), value, mapElementValue);
}
}
// sets don't have element access, so it's a list
ListType listType = (ListType) column.type;
if (column.type.isMultiCell())
{
ByteBuffer columnValue = getListItem(
collectionColumns(current.metadata().comparator.create(rowPrefix, column), current, now),
getListIndex(collectionElement));
return compareWithOperator(operator, listType.getElementsType(), value, columnValue);
}
else
{
Cell cell = current.getColumn(current.metadata().comparator.create(rowPrefix, column));
ByteBuffer listElementValue = cell.isLive(now) ? listType.getSerializer().getElement(cell.value(), getListIndex(collectionElement))
: null;
return compareWithOperator(operator, listType.getElementsType(), value, listElementValue);
}
}
static int getListIndex(ByteBuffer collectionElement) throws InvalidRequestException
{
int idx = ByteBufferUtil.toInt(collectionElement);
if (idx < 0)
throw new InvalidRequestException(String.format("Invalid negative list index %d", idx));
return idx;
}
static ByteBuffer getListItem(Iterator<Cell> iter, int index)
{
int adv = Iterators.advance(iter, index);
if (adv == index && iter.hasNext())
return iter.next().value();
else
return null;
}
public ByteBuffer getCollectionElementValue()
{
return collectionElement;
}
}
static class ElementAccessInBound extends Bound
{
public final ByteBuffer collectionElement;
public final List<ByteBuffer> inValues;
private ElementAccessInBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
{
super(condition.column, condition.operator);
assert column.type instanceof CollectionType && condition.collectionElement != null;
this.collectionElement = condition.collectionElement.bindAndGet(options);
if (condition.inValues == null)
this.inValues = ((Lists.Marker) condition.value).bind(options).getElements();
else
{
this.inValues = new ArrayList<>(condition.inValues.size());
for (Term value : condition.inValues)
this.inValues.add(value.bindAndGet(options));
}
}
public boolean appliesTo(Composite rowPrefix, ColumnFamily current, final long now) throws InvalidRequestException
{
if (collectionElement == null)
throw new InvalidRequestException("Invalid null value for " + (column.type instanceof MapType ? "map" : "list") + " element access");
CellNameType nameType = current.metadata().comparator;
if (column.type instanceof MapType)
{
MapType mapType = (MapType) column.type;
AbstractType<?> valueType = mapType.getValuesType();
if (column.type.isMultiCell())
{
CellName name = nameType.create(rowPrefix, column, collectionElement);
Cell item = current.getColumn(name);
for (ByteBuffer value : inValues)
{
if (isSatisfiedByValue(value, item, valueType, Operator.EQ, now))
return true;
}
return false;
}
else
{
Cell cell = current.getColumn(nameType.create(rowPrefix, column));
ByteBuffer mapElementValue = null;
if (cell != null && cell.isLive(now))
mapElementValue = mapType.getSerializer().getSerializedValue(cell.value(), collectionElement, mapType.getKeysType());
for (ByteBuffer value : inValues)
{
if (value == null)
{
if (mapElementValue == null)
return true;
continue;
}
if (valueType.compare(value, mapElementValue) == 0)
return true;
}
return false;
}
}
ListType listType = (ListType) column.type;
AbstractType<?> elementsType = listType.getElementsType();
if (column.type.isMultiCell())
{
ByteBuffer columnValue = ElementAccessBound.getListItem(
collectionColumns(nameType.create(rowPrefix, column), current, now),
ElementAccessBound.getListIndex(collectionElement));
for (ByteBuffer value : inValues)
{
if (compareWithOperator(Operator.EQ, elementsType, value, columnValue))
return true;
}
}
else
{
Cell cell = current.getColumn(nameType.create(rowPrefix, column));
ByteBuffer listElementValue = null;
if (cell != null && cell.isLive(now))
listElementValue = listType.getSerializer().getElement(cell.value(), ElementAccessBound.getListIndex(collectionElement));
for (ByteBuffer value : inValues)
{
if (value == null)
{
if (listElementValue == null)
return true;
continue;
}
if (elementsType.compare(value, listElementValue) == 0)
return true;
}
}
return false;
}
}
/** A condition on an entire collection column. IN operators are not supported here, see CollectionInBound. */
static class CollectionBound extends Bound
{
private final Term.Terminal value;
private CollectionBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
{
super(condition.column, condition.operator);
assert column.type.isCollection() && condition.collectionElement == null;
assert condition.operator != Operator.IN;
this.value = condition.value.bind(options);
}
public boolean appliesTo(Composite rowPrefix, ColumnFamily current, final long now) throws InvalidRequestException
{
CollectionType type = (CollectionType)column.type;
if (type.isMultiCell())
{
Iterator<Cell> iter = collectionColumns(current.metadata().comparator.create(rowPrefix, column), current, now);
if (value == null)
{
if (operator == Operator.EQ)
return !iter.hasNext();
else if (operator == Operator.NEQ)
return iter.hasNext();
else
throw new InvalidRequestException(String.format("Invalid comparison with null for operator \"%s\"", operator));
}
return valueAppliesTo(type, iter, value, operator);
}
// frozen collections
Cell cell = current.getColumn(current.metadata().comparator.create(rowPrefix, column));
if (value == null)
{
if (operator == Operator.EQ)
return cell == null || !cell.isLive(now);
else if (operator == Operator.NEQ)
return cell != null && cell.isLive(now);
else
throw new InvalidRequestException(String.format("Invalid comparison with null for operator \"%s\"", operator));
}
// make sure we use v3 serialization format for comparison
ByteBuffer conditionValue;
if (type.kind == CollectionType.Kind.LIST)
conditionValue = ((Lists.Value) value).getWithProtocolVersion(Server.VERSION_3);
else if (type.kind == CollectionType.Kind.SET)
conditionValue = ((Sets.Value) value).getWithProtocolVersion(Server.VERSION_3);
else
conditionValue = ((Maps.Value) value).getWithProtocolVersion(Server.VERSION_3);
return compareWithOperator(operator, type, conditionValue, cell.value());
}
static boolean valueAppliesTo(CollectionType type, Iterator<Cell> iter, Term.Terminal value, Operator operator)
{
if (value == null)
return !iter.hasNext();
switch (type.kind)
{
case LIST: return listAppliesTo((ListType)type, iter, ((Lists.Value)value).elements, operator);
case SET: return setAppliesTo((SetType)type, iter, ((Sets.Value)value).elements, operator);
case MAP: return mapAppliesTo((MapType)type, iter, ((Maps.Value)value).map, operator);
}
throw new AssertionError();
}
private static boolean setOrListAppliesTo(AbstractType<?> type, Iterator<Cell> iter, Iterator<ByteBuffer> conditionIter, Operator operator, boolean isSet)
{
while(iter.hasNext())
{
if (!conditionIter.hasNext())
return (operator == Operator.GT) || (operator == Operator.GTE) || (operator == Operator.NEQ);
// for lists we use the cell value; for sets we use the cell name
ByteBuffer cellValue = isSet? iter.next().name().collectionElement() : iter.next().value();
int comparison = type.compare(cellValue, conditionIter.next());
if (comparison != 0)
return evaluateComparisonWithOperator(comparison, operator);
}
if (conditionIter.hasNext())
return (operator == Operator.LT) || (operator == Operator.LTE) || (operator == Operator.NEQ);
// they're equal
return operator == Operator.EQ || operator == Operator.LTE || operator == Operator.GTE;
}
private static boolean evaluateComparisonWithOperator(int comparison, Operator operator)
{
// called when comparison != 0
switch (operator)
{
case EQ:
return false;
case LT:
case LTE:
return comparison < 0;
case GT:
case GTE:
return comparison > 0;
case NEQ:
return true;
default:
throw new AssertionError();
}
}
static boolean listAppliesTo(ListType type, Iterator<Cell> iter, List<ByteBuffer> elements, Operator operator)
{
return setOrListAppliesTo(type.getElementsType(), iter, elements.iterator(), operator, false);
}
static boolean setAppliesTo(SetType type, Iterator<Cell> iter, Set<ByteBuffer> elements, Operator operator)
{
ArrayList<ByteBuffer> sortedElements = new ArrayList<>(elements.size());
sortedElements.addAll(elements);
Collections.sort(sortedElements, type.getElementsType());
return setOrListAppliesTo(type.getElementsType(), iter, sortedElements.iterator(), operator, true);
}
static boolean mapAppliesTo(MapType type, Iterator<Cell> iter, Map<ByteBuffer, ByteBuffer> elements, Operator operator)
{
Iterator<Map.Entry<ByteBuffer, ByteBuffer>> conditionIter = elements.entrySet().iterator();
while(iter.hasNext())
{
if (!conditionIter.hasNext())
return (operator == Operator.GT) || (operator == Operator.GTE) || (operator == Operator.NEQ);
Map.Entry<ByteBuffer, ByteBuffer> conditionEntry = conditionIter.next();
Cell c = iter.next();
// compare the keys
int comparison = type.getKeysType().compare(c.name().collectionElement(), conditionEntry.getKey());
if (comparison != 0)
return evaluateComparisonWithOperator(comparison, operator);
// compare the values
comparison = type.getValuesType().compare(c.value(), conditionEntry.getValue());
if (comparison != 0)
return evaluateComparisonWithOperator(comparison, operator);
}
if (conditionIter.hasNext())
return (operator == Operator.LT) || (operator == Operator.LTE) || (operator == Operator.NEQ);
// they're equal
return operator == Operator.EQ || operator == Operator.LTE || operator == Operator.GTE;
}
}
public static class CollectionInBound extends Bound
{
private final List<Term.Terminal> inValues;
private CollectionInBound(ColumnCondition condition, QueryOptions options) throws InvalidRequestException
{
super(condition.column, condition.operator);
assert column.type instanceof CollectionType && condition.collectionElement == null;
assert condition.operator == Operator.IN;
inValues = new ArrayList<>();
if (condition.inValues == null)
{
// We have a list of serialized collections that need to be deserialized for later comparisons
CollectionType collectionType = (CollectionType) column.type;
Lists.Marker inValuesMarker = (Lists.Marker) condition.value;
if (column.type instanceof ListType)
{
ListType deserializer = ListType.getInstance(collectionType.valueComparator(), false);
for (ByteBuffer buffer : inValuesMarker.bind(options).elements)
{
if (buffer == null)
this.inValues.add(null);
else
this.inValues.add(Lists.Value.fromSerialized(buffer, deserializer, options.getProtocolVersion()));
}
}
else if (column.type instanceof MapType)
{
MapType deserializer = MapType.getInstance(collectionType.nameComparator(), collectionType.valueComparator(), false);
for (ByteBuffer buffer : inValuesMarker.bind(options).elements)
{
if (buffer == null)
this.inValues.add(null);
else
this.inValues.add(Maps.Value.fromSerialized(buffer, deserializer, options.getProtocolVersion()));
}
}
else if (column.type instanceof SetType)
{
SetType deserializer = SetType.getInstance(collectionType.valueComparator(), false);
for (ByteBuffer buffer : inValuesMarker.bind(options).elements)
{
if (buffer == null)
this.inValues.add(null);
else
this.inValues.add(Sets.Value.fromSerialized(buffer, deserializer, options.getProtocolVersion()));
}
}
}
else
{
for (Term value : condition.inValues)
this.inValues.add(value.bind(options));
}
}
public boolean appliesTo(Composite rowPrefix, ColumnFamily current, final long now) throws InvalidRequestException
{
CollectionType type = (CollectionType)column.type;
CellName name = current.metadata().comparator.create(rowPrefix, column);
if (type.isMultiCell())
{
// copy iterator contents so that we can properly reuse them for each comparison with an IN value
List<Cell> cells = newArrayList(collectionColumns(name, current, now));
for (Term.Terminal value : inValues)
{
if (CollectionBound.valueAppliesTo(type, cells.iterator(), value, Operator.EQ))
return true;
}
return false;
}
else
{
Cell cell = current.getColumn(name);
for (Term.Terminal value : inValues)
{
if (value == null)
{
if (cell == null || !cell.isLive(now))
return true;
}
else if (type.compare(((Term.CollectionTerminal)value).getWithProtocolVersion(Server.VERSION_3), cell.value()) == 0)
{
return true;
}
}
return false;
}
}
}
#endif
class raw final {
private:
::shared_ptr<term::raw> _value;
std::vector<::shared_ptr<term::raw>> _in_values;
::shared_ptr<abstract_marker::in_raw> _in_marker;
// Can be nullptr, used with the syntax "IF m[e] = ..." (in which case it's 'e')
// Can be nullptr, only used with the syntax "IF m[e] = ..." (in which case it's 'e')
::shared_ptr<term::raw> _collection_element;
const operator_type& _op;
public:
@@ -130,29 +716,46 @@ public:
, _op(op)
{ }
/** A condition on a column or collection element. For example: "IF col['key'] = 'foo'" or "IF col = 'foo'" */
static ::shared_ptr<raw> simple_condition(::shared_ptr<term::raw> value, ::shared_ptr<term::raw> collection_element,
const operator_type& op) {
/** A condition on a column. For example: "IF col = 'foo'" */
static ::shared_ptr<raw> simple_condition(::shared_ptr<term::raw> value, const operator_type& op) {
return ::make_shared<raw>(std::move(value), std::vector<::shared_ptr<term::raw>>{},
::shared_ptr<abstract_marker::in_raw>{}, std::move(collection_element), op);
::shared_ptr<abstract_marker::in_raw>{}, ::shared_ptr<term::raw>{}, op);
}
/**
* An IN condition on a column or a collection element. IN may contain a list of values or a single marker.
* For example:
* "IF col IN ('foo', 'bar', ...)"
* "IF col IN ?"
* "IF col['key'] IN * ('foo', 'bar', ...)"
* "IF col['key'] IN ?"
*/
static ::shared_ptr<raw> in_condition(::shared_ptr<term::raw> collection_element,
::shared_ptr<abstract_marker::in_raw> in_marker, std::vector<::shared_ptr<term::raw>> in_values) {
return ::make_shared<raw>(::shared_ptr<term::raw>{}, std::move(in_values), std::move(in_marker),
std::move(collection_element), operator_type::IN);
/** An IN condition on a column. For example: "IF col IN ('foo', 'bar', ...)" */
static ::shared_ptr<raw> simple_in_condition(std::vector<::shared_ptr<term::raw>> in_values) {
return ::make_shared<raw>(::shared_ptr<term::raw>{}, std::move(in_values),
::shared_ptr<abstract_marker::in_raw>{}, ::shared_ptr<term::raw>{}, operator_type::IN);
}
/** An IN condition on a column with a single marker. For example: "IF col IN ?" */
static ::shared_ptr<raw> simple_in_condition(::shared_ptr<abstract_marker::in_raw> in_marker) {
return ::make_shared<raw>(::shared_ptr<term::raw>{}, std::vector<::shared_ptr<term::raw>>{},
std::move(in_marker), ::shared_ptr<term::raw>{}, operator_type::IN);
}
/** A condition on a collection element. For example: "IF col['key'] = 'foo'" */
static ::shared_ptr<raw> collection_condition(::shared_ptr<term::raw> value, ::shared_ptr<term::raw> collection_element,
const operator_type& op) {
return ::make_shared<raw>(std::move(value), std::vector<::shared_ptr<term::raw>>{}, ::shared_ptr<abstract_marker::in_raw>{}, std::move(collection_element), op);
}
/** An IN condition on a collection element. For example: "IF col['key'] IN ('foo', 'bar', ...)" */
static ::shared_ptr<raw> collection_in_condition(::shared_ptr<term::raw> collection_element,
std::vector<::shared_ptr<term::raw>> in_values) {
return ::make_shared<raw>(::shared_ptr<term::raw>{}, std::move(in_values), ::shared_ptr<abstract_marker::in_raw>{},
std::move(collection_element), operator_type::IN);
}
/** An IN condition on a collection element with a single marker. For example: "IF col['key'] IN ?" */
static ::shared_ptr<raw> collection_in_condition(::shared_ptr<term::raw> collection_element,
::shared_ptr<abstract_marker::in_raw> in_marker) {
return ::make_shared<raw>(::shared_ptr<term::raw>{}, std::vector<::shared_ptr<term::raw>>{}, std::move(in_marker),
std::move(collection_element), operator_type::IN);
}
::shared_ptr<column_condition> prepare(database& db, const sstring& keyspace, const column_definition& receiver);
};
};
} // end of namespace cql3
}

View File

@@ -85,7 +85,7 @@ assignment_testable::test_result
constants::literal::test_assignment(database& db, const sstring& keyspace, ::shared_ptr<column_specification> receiver)
{
auto receiver_type = receiver->type->as_cql3_type();
if (receiver_type.is_collection() || receiver_type.is_user_type()) {
if (receiver_type.is_collection()) {
return test_result::NOT_ASSIGNABLE;
}
if (!receiver_type.is_native()) {
@@ -166,10 +166,10 @@ constants::literal::prepare(database& db, const sstring& keyspace, ::shared_ptr<
void constants::deleter::execute(mutation& m, const clustering_key_prefix& prefix, const update_parameters& params) {
if (column.type->is_multi_cell()) {
collection_mutation_description coll_m;
collection_type_impl::mutation coll_m;
coll_m.tomb = params.make_tombstone();
m.set_cell(prefix, column, coll_m.serialize(*column.type));
auto ctype = static_pointer_cast<const collection_type_impl>(column.type);
m.set_cell(prefix, column, atomic_cell_or_collection::from_collection_mutation(ctype->serialize_mutation_form(coll_m)));
} else {
m.set_cell(prefix, column, make_dead_cell(params));
}

View File

@@ -173,7 +173,7 @@ public:
marker(int32_t bind_index, ::shared_ptr<column_specification> receiver)
: abstract_marker{bind_index, std::move(receiver)}
{
assert(!_receiver->type->is_collection() && !_receiver->type->is_user_type());
assert(!_receiver->type->is_collection());
}
virtual cql3::raw_value_view bind_and_get(const query_options& options) override {

View File

@@ -37,7 +37,7 @@ namespace cql3 {
cql3_type cql3_type::raw::prepare(database& db, const sstring& keyspace) {
try {
auto&& ks = db.find_keyspace(keyspace);
return prepare_internal(keyspace, *ks.metadata()->user_types());
return prepare_internal(keyspace, ks.metadata()->user_types());
} catch (no_such_keyspace& nsk) {
throw exceptions::invalid_request_exception("Unknown keyspace " + keyspace);
}
@@ -54,10 +54,6 @@ bool cql3_type::raw::references_user_type(const sstring& name) const {
class cql3_type::raw_type : public raw {
private:
cql3_type _type;
virtual sstring to_string() const override {
return _type.to_string();
}
public:
raw_type(cql3_type type)
: _type{type}
@@ -66,7 +62,7 @@ public:
virtual cql3_type prepare(database& db, const sstring& keyspace) {
return _type;
}
cql3_type prepare_internal(const sstring&, user_types_metadata&) override {
cql3_type prepare_internal(const sstring&, lw_shared_ptr<user_types_metadata>) override {
return _type;
}
@@ -78,6 +74,10 @@ public:
return _type.is_counter();
}
virtual sstring to_string() const {
return _type.to_string();
}
virtual bool is_duration() const override {
return _type.get_type() == duration_type;
}
@@ -87,19 +87,6 @@ class cql3_type::raw_collection : public raw {
const abstract_type::kind _kind;
shared_ptr<raw> _keys;
shared_ptr<raw> _values;
virtual sstring to_string() const override {
sstring start = is_frozen() ? "frozen<" : "";
sstring end = is_frozen() ? ">" : "";
if (_kind == abstract_type::kind::list) {
return format("{}list<{}>{}", start, _values, end);
} else if (_kind == abstract_type::kind::set) {
return format("{}set<{}>{}", start, _values, end);
} else if (_kind == abstract_type::kind::map) {
return format("{}map<{}, {}>{}", start, _keys, _values, end);
}
abort();
}
public:
raw_collection(const abstract_type::kind kind, shared_ptr<raw> keys, shared_ptr<raw> values)
: _kind(kind), _keys(std::move(keys)), _values(std::move(values)) {
@@ -123,37 +110,35 @@ public:
return true;
}
virtual cql3_type prepare_internal(const sstring& keyspace, user_types_metadata& user_types) override {
virtual cql3_type prepare_internal(const sstring& keyspace, lw_shared_ptr<user_types_metadata> user_types) override {
assert(_values); // "Got null values type for a collection";
if (!is_frozen() && _values->supports_freezing() && !_values->is_frozen()) {
throw exceptions::invalid_request_exception(
format("Non-frozen user types or collections are not allowed inside collections: {}", *this));
if (!_frozen && _values->supports_freezing() && !_values->_frozen) {
throw exceptions::invalid_request_exception(format("Non-frozen collections are not allowed inside collections: {}", *this));
}
if (_values->is_counter()) {
throw exceptions::invalid_request_exception(format("Counters are not allowed inside collections: {}", *this));
}
if (_keys) {
if (!is_frozen() && _keys->supports_freezing() && !_keys->is_frozen()) {
throw exceptions::invalid_request_exception(
format("Non-frozen user types or collections are not allowed inside collections: {}", *this));
if (!_frozen && _keys->supports_freezing() && !_keys->_frozen) {
throw exceptions::invalid_request_exception(format("Non-frozen collections are not allowed inside collections: {}", *this));
}
}
if (_kind == abstract_type::kind::list) {
return cql3_type(list_type_impl::get_instance(_values->prepare_internal(keyspace, user_types).get_type(), !is_frozen()));
return cql3_type(list_type_impl::get_instance(_values->prepare_internal(keyspace, user_types).get_type(), !_frozen));
} else if (_kind == abstract_type::kind::set) {
if (_values->is_duration()) {
throw exceptions::invalid_request_exception(format("Durations are not allowed inside sets: {}", *this));
}
return cql3_type(set_type_impl::get_instance(_values->prepare_internal(keyspace, user_types).get_type(), !is_frozen()));
return cql3_type(set_type_impl::get_instance(_values->prepare_internal(keyspace, user_types).get_type(), !_frozen));
} else if (_kind == abstract_type::kind::map) {
assert(_keys); // "Got null keys type for a collection";
if (_keys->is_duration()) {
throw exceptions::invalid_request_exception(format("Durations are not allowed as map keys: {}", *this));
}
return cql3_type(map_type_impl::get_instance(_keys->prepare_internal(keyspace, user_types).get_type(), _values->prepare_internal(keyspace, user_types).get_type(), !is_frozen()));
return cql3_type(map_type_impl::get_instance(_keys->prepare_internal(keyspace, user_types).get_type(), _values->prepare_internal(keyspace, user_types).get_type(), !_frozen));
}
abort();
}
@@ -165,18 +150,23 @@ public:
bool is_duration() const override {
return false;
}
virtual sstring to_string() const override {
sstring start = _frozen ? "frozen<" : "";
sstring end = _frozen ? ">" : "";
if (_kind == abstract_type::kind::list) {
return format("{}list<{}>{}", start, _values, end);
} else if (_kind == abstract_type::kind::set) {
return format("{}set<{}>{}", start, _values, end);
} else if (_kind == abstract_type::kind::map) {
return format("{}map<{}, {}>{}", start, _keys, _values, end);
}
abort();
}
};
class cql3_type::raw_ut : public raw {
ut_name _name;
virtual sstring to_string() const override {
if (is_frozen()) {
return format("frozen<{}>", _name.to_string());
}
return _name.to_string();
}
public:
raw_ut(ut_name name)
: _name(std::move(name)) {
@@ -190,7 +180,7 @@ public:
_frozen = true;
}
virtual cql3_type prepare_internal(const sstring& keyspace, user_types_metadata& user_types) override {
virtual cql3_type prepare_internal(const sstring& keyspace, lw_shared_ptr<user_types_metadata> user_types) override {
if (_name.has_keyspace()) {
// The provided keyspace is the one of the current statement this is part of. If it's different from the keyspace of
// the UTName, we reject since we want to limit user types to their own keyspace (see #6643)
@@ -202,10 +192,14 @@ public:
} else {
_name.set_keyspace(keyspace);
}
if (!user_types) {
// bootstrap mode.
throw exceptions::invalid_request_exception(format("Unknown type {}", _name));
}
try {
data_type type = user_types.get_type(_name.get_user_type_name());
if (is_frozen()) {
type = type->freeze();
auto&& type = user_types->get_type(_name.get_user_type_name());
if (!_frozen) {
throw exceptions::invalid_request_exception("Non-frozen User-Defined types are not supported, please use frozen<>");
}
return cql3_type(std::move(type));
} catch (std::out_of_range& e) {
@@ -219,18 +213,14 @@ public:
return true;
}
virtual bool is_user_type() const override {
return true;
virtual sstring to_string() const override {
return _name.to_string();
}
};
class cql3_type::raw_tuple : public raw {
std::vector<shared_ptr<raw>> _types;
virtual sstring to_string() const override {
return format("tuple<{}>", join(", ", _types));
}
public:
raw_tuple(std::vector<shared_ptr<raw>> types)
: _types(std::move(types)) {
@@ -249,8 +239,8 @@ public:
}
_frozen = true;
}
virtual cql3_type prepare_internal(const sstring& keyspace, user_types_metadata& user_types) override {
if (!is_frozen()) {
virtual cql3_type prepare_internal(const sstring& keyspace, lw_shared_ptr<user_types_metadata> user_types) override {
if (!_frozen) {
freeze();
}
std::vector<data_type> ts;
@@ -268,6 +258,10 @@ public:
return t->references_user_type(name);
});
}
virtual sstring to_string() const override {
return format("tuple<{}>", join(", ", _types));
}
};
bool
@@ -280,16 +274,6 @@ cql3_type::raw::is_counter() const {
return false;
}
bool
cql3_type::raw::is_user_type() const {
return false;
}
bool
cql3_type::raw::is_frozen() const {
return _frozen;
}
std::optional<sstring>
cql3_type::raw::keyspace() const {
return std::nullopt;

View File

@@ -60,28 +60,23 @@ public:
bool is_collection() const { return _type->is_collection(); }
bool is_counter() const { return _type->is_counter(); }
bool is_native() const { return _type->is_native(); }
bool is_user_type() const { return _type->is_user_type(); }
data_type get_type() const { return _type; }
const sstring& to_string() const { return _type->cql3_type_name(); }
// For UserTypes, we need to know the current keyspace to resolve the
// actual type used, so Raw is a "not yet prepared" CQL3Type.
class raw {
virtual sstring to_string() const = 0;
protected:
bool _frozen = false;
public:
virtual ~raw() {}
bool _frozen = false;
virtual bool supports_freezing() const = 0;
virtual bool is_collection() const;
virtual bool is_counter() const;
virtual bool is_duration() const;
virtual bool is_user_type() const;
bool is_frozen() const;
virtual bool references_user_type(const sstring&) const;
virtual std::optional<sstring> keyspace() const;
virtual void freeze();
virtual cql3_type prepare_internal(const sstring& keyspace, user_types_metadata&) = 0;
virtual cql3_type prepare_internal(const sstring& keyspace, lw_shared_ptr<user_types_metadata>) = 0;
virtual cql3_type prepare(database& db, const sstring& keyspace);
static shared_ptr<raw> from(cql3_type type);
static shared_ptr<raw> user_type(ut_name name);
@@ -90,6 +85,7 @@ public:
static shared_ptr<raw> set(shared_ptr<raw> t);
static shared_ptr<raw> tuple(std::vector<shared_ptr<raw>> ts);
static shared_ptr<raw> frozen(shared_ptr<raw> t);
virtual sstring to_string() const = 0;
friend std::ostream& operator<<(std::ostream& os, const raw& r);
};

View File

@@ -1,36 +0,0 @@
/*
* Copyright (C) 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "restrictions/restrictions_config.hh"
namespace cql3 {
struct cql_config {
restrictions::restrictions_config restrictions;
};
extern const cql_config default_cql_config;
}

View File

@@ -115,21 +115,4 @@ public:
}
};
// Conditional modification statements and batches
// return a result set and have metadata, while same
// statements without conditions do not.
class cql_statement_opt_metadata : public cql_statement {
protected:
// Result set metadata, may be empty for simple updates and batches
shared_ptr<metadata> _metadata;
public:
using cql_statement::cql_statement;
virtual shared_ptr<const metadata> get_result_metadata() const override {
if (_metadata) {
return _metadata;
}
return make_empty_metadata();
}
};
}

View File

@@ -233,13 +233,13 @@ struct aggregate_type_for {
};
template<>
struct aggregate_type_for<ascii_native_type> {
using type = ascii_native_type::primary_type;
struct aggregate_type_for<simple_date_native_type> {
using type = simple_date_native_type::primary_type;
};
template<>
struct aggregate_type_for<simple_date_native_type> {
using type = simple_date_native_type::primary_type;
struct aggregate_type_for<timestamp_native_type> {
using type = timestamp_native_type::primary_type;
};
template<>

View File

@@ -27,13 +27,10 @@
#include "cql3/sets.hh"
#include "cql3/lists.hh"
#include "cql3/constants.hh"
#include "cql3/user_types.hh"
#include "database.hh"
#include "types/map.hh"
#include "types/set.hh"
#include "types/list.hh"
#include "types/user.hh"
#include "concrete_types.hh"
namespace cql3 {
namespace functions {
@@ -114,17 +111,13 @@ functions::init() {
declare(aggregate_fcts::make_max_function<sstring>());
declare(aggregate_fcts::make_min_function<sstring>());
declare(aggregate_fcts::make_count_function<ascii_native_type>());
declare(aggregate_fcts::make_max_function<ascii_native_type>());
declare(aggregate_fcts::make_min_function<ascii_native_type>());
declare(aggregate_fcts::make_count_function<simple_date_native_type>());
declare(aggregate_fcts::make_max_function<simple_date_native_type>());
declare(aggregate_fcts::make_min_function<simple_date_native_type>());
declare(aggregate_fcts::make_count_function<db_clock::time_point>());
declare(aggregate_fcts::make_max_function<db_clock::time_point>());
declare(aggregate_fcts::make_min_function<db_clock::time_point>());
declare(aggregate_fcts::make_count_function<timestamp_native_type>());
declare(aggregate_fcts::make_max_function<timestamp_native_type>());
declare(aggregate_fcts::make_min_function<timestamp_native_type>());
declare(aggregate_fcts::make_count_function<timeuuid_native_type>());
declare(aggregate_fcts::make_max_function<timeuuid_native_type>());
@@ -470,34 +463,23 @@ function_call::contains_bind_marker() const {
shared_ptr<terminal>
function_call::make_terminal(shared_ptr<function> fun, cql3::raw_value result, cql_serialization_format sf) {
static constexpr auto to_buffer = [] (const cql3::raw_value& v) {
if (v) {
return fragmented_temporary_buffer::view{bytes_view{*v}};
}
return fragmented_temporary_buffer::view{};
};
return visit(*fun->return_type(), make_visitor(
[&] (const list_type_impl& ltype) -> shared_ptr<terminal> {
return make_shared(lists::value::from_serialized(to_buffer(result), ltype, sf));
},
[&] (const set_type_impl& stype) -> shared_ptr<terminal> {
return make_shared(sets::value::from_serialized(to_buffer(result), stype, sf));
},
[&] (const map_type_impl& mtype) -> shared_ptr<terminal> {
return make_shared(maps::value::from_serialized(to_buffer(result), mtype, sf));
},
[&] (const user_type_impl& utype) -> shared_ptr<terminal> {
// TODO (kbraun): write a test for this case when User Defined Functions are implemented
return make_shared(user_types::value::from_serialized(to_buffer(result), utype));
},
[&] (const abstract_type& type) -> shared_ptr<terminal> {
if (type.is_collection()) {
throw std::runtime_error(format("function_call::make_terminal: unhandled collection type {}", type.name()));
}
return make_shared<constants::value>(std::move(result));
if (!dynamic_pointer_cast<const collection_type_impl>(fun->return_type())) {
return ::make_shared<constants::value>(std::move(result));
}
));
auto ctype = static_pointer_cast<const collection_type_impl>(fun->return_type());
fragmented_temporary_buffer::view res;
if (result) {
res = fragmented_temporary_buffer::view(bytes_view(*result));
}
if (ctype->get_kind() == abstract_type::kind::list) {
return make_shared(lists::value::from_serialized(std::move(res), static_pointer_cast<const list_type_impl>(ctype), sf));
} else if (ctype->get_kind() == abstract_type::kind::set) {
return make_shared(sets::value::from_serialized(std::move(res), static_pointer_cast<const set_type_impl>(ctype), sf));
} else if (ctype->get_kind() == abstract_type::kind::map) {
return make_shared(maps::value::from_serialized(std::move(res), static_pointer_cast<const map_type_impl>(ctype), sf));
}
abort();
}
::shared_ptr<term>

View File

@@ -61,16 +61,6 @@ make_now_fct() {
});
}
static int64_t get_valid_timestamp(const data_value& ts_obj) {
auto ts = value_cast<db_clock::time_point>(ts_obj);
int64_t ms = ts.time_since_epoch().count();
auto nanos_since = utils::UUID_gen::make_nanos_since(ms);
if (!utils::UUID_gen::is_valid_nanos_since(nanos_since)) {
throw exceptions::server_exception(format("{}: timestamp is out of range. Must be in milliseconds since epoch", ms));
}
return ms;
}
inline
shared_ptr<function>
make_min_timeuuid_fct() {
@@ -84,7 +74,8 @@ make_min_timeuuid_fct() {
if (ts_obj.is_null()) {
return {};
}
auto uuid = utils::UUID_gen::min_time_UUID(get_valid_timestamp(ts_obj));
auto ts = value_cast<db_clock::time_point>(ts_obj);
auto uuid = utils::UUID_gen::min_time_UUID(ts.time_since_epoch().count());
return {timeuuid_type->decompose(uuid)};
});
}
@@ -94,6 +85,7 @@ shared_ptr<function>
make_max_timeuuid_fct() {
return make_native_scalar_function<true>("maxtimeuuid", timeuuid_type, { timestamp_type },
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
// FIXME: should values be a vector<optional<bytes>>?
auto& bb = values[0];
if (!bb) {
return {};
@@ -102,22 +94,12 @@ make_max_timeuuid_fct() {
if (ts_obj.is_null()) {
return {};
}
auto uuid = utils::UUID_gen::max_time_UUID(get_valid_timestamp(ts_obj));
auto ts = value_cast<db_clock::time_point>(ts_obj);
auto uuid = utils::UUID_gen::max_time_UUID(ts.time_since_epoch().count());
return {timeuuid_type->decompose(uuid)};
});
}
inline utils::UUID get_valid_timeuuid(bytes raw) {
if (!utils::UUID_gen::is_valid_UUID(raw)) {
throw exceptions::server_exception(format("invalid timeuuid: size={}", raw.size()));
}
auto uuid = utils::UUID_gen::get_UUID(raw);
if (!uuid.is_timestamp()) {
throw exceptions::server_exception(format("{}: Not a timeuuid: version={}", uuid, uuid.version()));
}
return uuid;
}
inline
shared_ptr<function>
make_date_of_fct() {
@@ -128,7 +110,7 @@ make_date_of_fct() {
if (!bb) {
return {};
}
auto ts = db_clock::time_point(db_clock::duration(UUID_gen::unix_timestamp(get_valid_timeuuid(*bb))));
auto ts = db_clock::time_point(db_clock::duration(UUID_gen::unix_timestamp(UUID_gen::get_UUID(*bb))));
return {timestamp_type->decompose(ts)};
});
}
@@ -143,7 +125,7 @@ make_unix_timestamp_of_fct() {
if (!bb) {
return {};
}
return {long_type->decompose(UUID_gen::unix_timestamp(get_valid_timeuuid(*bb)))};
return {long_type->decompose(UUID_gen::unix_timestamp(UUID_gen::get_UUID(*bb)))};
});
}
@@ -151,7 +133,7 @@ inline shared_ptr<function>
make_currenttimestamp_fct() {
return make_native_scalar_function<true>("currenttimestamp", timestamp_type, {},
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
return {timestamp_type->decompose(db_clock::now())};
return {timestamp_type->decompose(timestamp_native_type{db_clock::now()})};
});
}
@@ -171,7 +153,7 @@ make_currentdate_fct() {
return make_native_scalar_function<true>("currentdate", simple_date_type, {},
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
auto to_simple_date = get_castas_fctn(simple_date_type, timestamp_type);
return {simple_date_type->decompose(to_simple_date(db_clock::now()))};
return {simple_date_type->decompose(to_simple_date(timestamp_native_type{db_clock::now()}))};
});
}
@@ -194,7 +176,7 @@ make_timeuuidtodate_fct() {
if (!bb) {
return {};
}
auto ts = db_clock::time_point(db_clock::duration(UUID_gen::unix_timestamp(get_valid_timeuuid(*bb))));
auto ts = db_clock::time_point(db_clock::duration(UUID_gen::unix_timestamp(UUID_gen::get_UUID(*bb))));
auto to_simple_date = get_castas_fctn(simple_date_type, timestamp_type);
return {simple_date_type->decompose(to_simple_date(ts))};
});
@@ -229,7 +211,7 @@ make_timeuuidtotimestamp_fct() {
if (!bb) {
return {};
}
auto ts = db_clock::time_point(db_clock::duration(UUID_gen::unix_timestamp(get_valid_timeuuid(*bb))));
auto ts = db_clock::time_point(db_clock::duration(UUID_gen::unix_timestamp(UUID_gen::get_UUID(*bb))));
return {timestamp_type->decompose(ts)};
});
}
@@ -263,14 +245,10 @@ make_timeuuidtounixtimestamp_fct() {
if (!bb) {
return {};
}
return {long_type->decompose(UUID_gen::unix_timestamp(get_valid_timeuuid(*bb)))};
return {long_type->decompose(UUID_gen::unix_timestamp(UUID_gen::get_UUID(*bb)))};
});
}
inline bytes time_point_to_long(const data_value& v) {
return data_value(get_valid_timestamp(v)).serialize();
}
inline
shared_ptr<function>
make_timestamptounixtimestamp_fct() {
@@ -285,7 +263,7 @@ make_timestamptounixtimestamp_fct() {
if (ts_obj.is_null()) {
return {};
}
return time_point_to_long(ts_obj);
return {long_type->decompose(ts_obj)};
});
}
@@ -304,7 +282,7 @@ make_datetounixtimestamp_fct() {
return {};
}
auto from_simple_date = get_castas_fctn(timestamp_type, simple_date_type);
return time_point_to_long(from_simple_date(simple_date_obj));
return {long_type->decompose(from_simple_date(simple_date_obj))};
});
}

View File

@@ -54,14 +54,6 @@ shared_ptr<term>
lists::literal::prepare(database& db, const sstring& keyspace, shared_ptr<column_specification> receiver) {
validate_assignable_to(db, keyspace, receiver);
// In Cassandra, an empty (unfrozen) map/set/list is equivalent to the column being null. In
// other words a non-frozen collection only exists if it has elements. Return nullptr right
// away to simplify predicate evaluation. See also
// https://issues.apache.org/jira/browse/CASSANDRA-5141
if (receiver->type->is_multi_cell() && _elements.empty()) {
return cql3::constants::null_literal::NULL_VALUE;
}
auto&& value_spec = value_spec_of(receiver);
std::vector<shared_ptr<term>> values;
values.reserve(_elements.size());
@@ -124,24 +116,24 @@ lists::literal::to_string() const {
}
lists::value
lists::value::from_serialized(const fragmented_temporary_buffer::view& val, const list_type_impl& type, cql_serialization_format sf) {
lists::value::from_serialized(const fragmented_temporary_buffer::view& val, list_type type, cql_serialization_format sf) {
return with_linearized(val, [&] (bytes_view v) {
return from_serialized(v, type, sf);
});
}
lists::value
lists::value::from_serialized(bytes_view v, const list_type_impl& type, cql_serialization_format sf) {
lists::value::from_serialized(bytes_view v, list_type type, cql_serialization_format sf) {
try {
// Collections have this small hack that validate cannot be called on a serialized object,
// but compose does the validation (so we're fine).
// FIXME: deserializeForNativeProtocol()?!
auto l = value_cast<list_type_impl::native_type>(type.deserialize(v, sf));
auto l = value_cast<list_type_impl::native_type>(type->deserialize(v, sf));
std::vector<bytes_opt> elements;
elements.reserve(l.size());
for (auto&& element : l) {
// elements can be null in lists that represent a set of IN values
elements.push_back(element.is_null() ? bytes_opt() : bytes_opt(type.get_elements_type()->decompose(element)));
elements.push_back(element.is_null() ? bytes_opt() : bytes_opt(type->get_elements_type()->decompose(element)));
}
return value(std::move(elements));
} catch (marshal_exception& e) {
@@ -174,7 +166,7 @@ lists::value::equals(shared_ptr<list_type_impl> lt, const value& v) {
[t = lt->get_elements_type()] (const bytes_opt& e1, const bytes_opt& e2) { return t->equal(*e1, *e2); });
}
const std::vector<bytes_opt>&
std::vector<bytes_opt>
lists::value::get_elements() {
return _elements;
}
@@ -227,7 +219,7 @@ lists::delayed_value::bind(const query_options& options) {
::shared_ptr<terminal>
lists::marker::bind(const query_options& options) {
const auto& value = options.get_value_at(_bind_index);
auto& ltype = static_cast<const list_type_impl&>(*_receiver->type);
auto ltype = static_pointer_cast<const list_type_impl>(_receiver->type);
if (value.is_null()) {
return nullptr;
} else if (value.is_unset_value()) {
@@ -235,8 +227,8 @@ lists::marker::bind(const query_options& options) {
} else {
try {
return with_linearized(*value, [&] (bytes_view v) {
ltype.validate(v, options.get_cql_serialization_format());
return make_shared(value::from_serialized(v, ltype, options.get_cql_serialization_format()));
ltype->validate(v, options.get_cql_serialization_format());
return make_shared(value::from_serialized(v, std::move(ltype), options.get_cql_serialization_format()));
});
} catch (marshal_exception& e) {
throw exceptions::invalid_request_exception(e.what());
@@ -270,11 +262,12 @@ lists::setter::execute(mutation& m, const clustering_key_prefix& prefix, const u
return;
}
if (column.type->is_multi_cell()) {
// Delete all cells first, then append new ones
collection_mutation_view_description mut;
// delete + append
collection_type_impl::mutation mut;
mut.tomb = params.make_tombstone_just_before();
m.set_cell(prefix, column, mut.serialize(*column.type));
auto ctype = static_pointer_cast<const list_type_impl>(column.type);
auto col_mut = ctype->serialize_mutation_form(std::move(mut));
m.set_cell(prefix, column, std::move(col_mut));
}
do_append(value, m, prefix, column, params);
}
@@ -310,10 +303,11 @@ lists::setter_by_index::execute(mutation& m, const clustering_key_prefix& prefix
auto idx = with_linearized(*index, [] (bytes_view v) {
return value_cast<int32_t>(data_type_for<int32_t>()->deserialize(v));
});
auto&& existing_list_opt = params.get_prefetched_list(m.key(), prefix, column);
auto&& existing_list_opt = params.get_prefetched_list(m.key().view(), prefix.view(), column);
if (!existing_list_opt) {
throw exceptions::invalid_request_exception("Attempted to set an element on a list which is null");
}
auto ltype = dynamic_pointer_cast<const list_type_impl>(column.type);
auto&& existing_list = *existing_list_opt;
// we verified that index is an int32_type
if (idx < 0 || size_t(idx) >= existing_list.size()) {
@@ -321,19 +315,16 @@ lists::setter_by_index::execute(mutation& m, const clustering_key_prefix& prefix
idx, existing_list.size()));
}
auto ltype = static_cast<const list_type_impl*>(column.type.get());
const data_value& eidx_dv = existing_list[idx].first;
bytes eidx = eidx_dv.type()->decompose(eidx_dv);
collection_mutation_description mut;
const bytes& eidx = existing_list[idx].key;
list_type_impl::mutation mut;
mut.cells.reserve(1);
if (!value) {
mut.cells.emplace_back(std::move(eidx), params.make_dead_cell());
mut.cells.emplace_back(eidx, params.make_dead_cell());
} else {
mut.cells.emplace_back(std::move(eidx),
params.make_cell(*ltype->value_comparator(), *value, atomic_cell::collection_member::yes));
mut.cells.emplace_back(eidx, params.make_cell(*ltype->value_comparator(), *value, atomic_cell::collection_member::yes));
}
m.set_cell(prefix, column, mut.serialize(*ltype));
auto smut = ltype->serialize_mutation_form(mut);
m.set_cell(prefix, column, atomic_cell_or_collection::from_collection_mutation(std::move(smut)));
}
bool
@@ -353,13 +344,15 @@ lists::setter_by_uuid::execute(mutation& m, const clustering_key_prefix& prefix,
throw exceptions::invalid_request_exception("Invalid null value for list index");
}
auto ltype = static_cast<const list_type_impl*>(column.type.get());
auto ltype = dynamic_pointer_cast<const list_type_impl>(column.type);
collection_mutation_description mut;
list_type_impl::mutation mut;
mut.cells.reserve(1);
mut.cells.emplace_back(to_bytes(*index), params.make_cell(*ltype->value_comparator(), *value, atomic_cell::collection_member::yes));
m.set_cell(prefix, column, mut.serialize(*ltype));
auto smut = ltype->serialize_mutation_form(mut);
m.set_cell(prefix, column,
atomic_cell_or_collection::from_collection_mutation(
std::move(smut)));
}
void
@@ -379,6 +372,7 @@ lists::do_append(shared_ptr<term> value,
const column_definition& column,
const update_parameters& params) {
auto&& list_value = dynamic_pointer_cast<lists::value>(value);
auto&& ltype = dynamic_pointer_cast<const list_type_impl>(column.type);
if (column.type->is_multi_cell()) {
// If we append null, do nothing. Note that for Setter, we've
// already removed the previous value so we're good here too
@@ -386,10 +380,8 @@ lists::do_append(shared_ptr<term> value,
return;
}
auto ltype = static_cast<const list_type_impl*>(column.type.get());
auto&& to_add = list_value->_elements;
collection_mutation_description appended;
collection_type_impl::mutation appended;
appended.cells.reserve(to_add.size());
for (auto&& e : to_add) {
auto uuid1 = utils::UUID_gen::get_time_UUID_bytes();
@@ -397,7 +389,7 @@ lists::do_append(shared_ptr<term> value,
// FIXME: can e be empty?
appended.cells.emplace_back(std::move(uuid), params.make_cell(*ltype->value_comparator(), *e, atomic_cell::collection_member::yes));
}
m.set_cell(prefix, column, appended.serialize(*ltype));
m.set_cell(prefix, column, ltype->serialize_mutation_form(appended));
} else {
// for frozen lists, we're overwriting the whole cell value
if (!value) {
@@ -421,11 +413,11 @@ lists::prepender::execute(mutation& m, const clustering_key_prefix& prefix, cons
assert(lvalue);
auto time = precision_time::REFERENCE_TIME - (db_clock::now() - precision_time::REFERENCE_TIME);
collection_mutation_description mut;
collection_type_impl::mutation mut;
mut.cells.reserve(lvalue->get_elements().size());
// We reverse the order of insertion, so that the last element gets the lastest time
// (lists are sorted by time)
auto ltype = static_cast<const list_type_impl*>(column.type.get());
auto&& ltype = static_cast<const list_type_impl*>(column.type.get());
for (auto&& v : lvalue->_elements | boost::adaptors::reversed) {
auto&& pt = precision_time::get_next(time);
auto uuid = utils::UUID_gen::get_time_UUID_bytes(pt.millis.time_since_epoch().count(), pt.nanos);
@@ -433,7 +425,7 @@ lists::prepender::execute(mutation& m, const clustering_key_prefix& prefix, cons
}
// now reverse again, to get the original order back
std::reverse(mut.cells.begin(), mut.cells.end());
m.set_cell(prefix, column, mut.serialize(*ltype));
m.set_cell(prefix, column, atomic_cell_or_collection::from_collection_mutation(ltype->serialize_mutation_form(std::move(mut))));
}
bool
@@ -445,10 +437,12 @@ void
lists::discarder::execute(mutation& m, const clustering_key_prefix& prefix, const update_parameters& params) {
assert(column.type->is_multi_cell()); // "Attempted to delete from a frozen list";
auto&& existing_list = params.get_prefetched_list(m.key(), prefix, column);
auto&& existing_list = params.get_prefetched_list(m.key().view(), prefix.view(), column);
// We want to call bind before possibly returning to reject queries where the value provided is not a list.
auto&& value = _t->bind(params._options);
auto&& ltype = static_pointer_cast<const list_type_impl>(column.type);
if (!existing_list) {
return;
}
@@ -466,27 +460,24 @@ lists::discarder::execute(mutation& m, const clustering_key_prefix& prefix, cons
auto lvalue = dynamic_pointer_cast<lists::value>(value);
assert(lvalue);
auto ltype = static_cast<const list_type_impl*>(column.type.get());
// Note: below, we will call 'contains' on this toDiscard list for each element of existingList.
// Meaning that if toDiscard is big, converting it to a HashSet might be more efficient. However,
// the read-before-write this operation requires limits its usefulness on big lists, so in practice
// toDiscard will be small and keeping a list will be more efficient.
auto&& to_discard = lvalue->_elements;
collection_mutation_description mnew;
collection_type_impl::mutation mnew;
for (auto&& cell : elist) {
auto has_value = [&] (bytes_view value) {
auto have_value = [&] (bytes_view value) {
return std::find_if(to_discard.begin(), to_discard.end(),
[ltype, value] (auto&& v) { return ltype->get_elements_type()->equal(*v, value); })
!= to_discard.end();
};
bytes eidx = cell.first.type()->decompose(cell.first);
bytes value = cell.second.type()->decompose(cell.second);
if (has_value(value)) {
mnew.cells.emplace_back(std::move(eidx), params.make_dead_cell());
if (have_value(cell.value)) {
mnew.cells.emplace_back(cell.key, params.make_dead_cell());
}
}
m.set_cell(prefix, column, mnew.serialize(*ltype));
auto mnew_ser = ltype->serialize_mutation_form(mnew);
m.set_cell(prefix, column, atomic_cell_or_collection::from_collection_mutation(std::move(mnew_ser)));
}
bool
@@ -505,10 +496,11 @@ lists::discarder_by_index::execute(mutation& m, const clustering_key_prefix& pre
return;
}
auto ltype = static_pointer_cast<const list_type_impl>(column.type);
auto cvalue = dynamic_pointer_cast<constants::value>(index);
assert(cvalue);
auto&& existing_list_opt = params.get_prefetched_list(m.key(), prefix, column);
auto&& existing_list_opt = params.get_prefetched_list(m.key().view(), prefix.view(), column);
int32_t idx = read_simple_exactly<int32_t>(*cvalue->_bytes);
if (!existing_list_opt) {
throw exceptions::invalid_request_exception("Attempted to delete an element from a list which is null");
@@ -517,11 +509,9 @@ lists::discarder_by_index::execute(mutation& m, const clustering_key_prefix& pre
if (idx < 0 || size_t(idx) >= existing_list.size()) {
throw exceptions::invalid_request_exception(format("List index {:d} out of bound, list has size {:d}", idx, existing_list.size()));
}
collection_mutation_description mut;
const data_value& eidx_dv = existing_list[idx].first;
bytes eidx = eidx_dv.type()->decompose(eidx_dv);
mut.cells.emplace_back(std::move(eidx), params.make_dead_cell());
m.set_cell(prefix, column, mut.serialize(*column.type));
collection_type_impl::mutation mut;
mut.cells.emplace_back(existing_list[idx].key, params.make_dead_cell());
m.set_cell(prefix, column, ltype->serialize_mutation_form(mut));
}
}

View File

@@ -73,18 +73,18 @@ public:
};
class value : public multi_item_terminal, collection_terminal {
static value from_serialized(bytes_view v, const list_type_impl& type, cql_serialization_format sf);
static value from_serialized(bytes_view v, list_type type, cql_serialization_format sf);
public:
std::vector<bytes_opt> _elements;
public:
explicit value(std::vector<bytes_opt> elements)
: _elements(std::move(elements)) {
}
static value from_serialized(const fragmented_temporary_buffer::view& v, const list_type_impl& type, cql_serialization_format sf);
static value from_serialized(const fragmented_temporary_buffer::view& v, list_type type, cql_serialization_format sf);
virtual cql3::raw_value get(const query_options& options) override;
virtual bytes get_with_protocol_version(cql_serialization_format sf) override;
bool equals(shared_ptr<list_type_impl> lt, const value& v);
virtual const std::vector<bytes_opt>& get_elements() override;
virtual std::vector<bytes_opt> get_elements() override;
virtual sstring to_string() const;
friend class lists;
};

View File

@@ -153,17 +153,17 @@ maps::literal::to_string() const {
}
maps::value
maps::value::from_serialized(const fragmented_temporary_buffer::view& fragmented_value, const map_type_impl& type, cql_serialization_format sf) {
maps::value::from_serialized(const fragmented_temporary_buffer::view& fragmented_value, map_type type, cql_serialization_format sf) {
try {
// Collections have this small hack that validate cannot be called on a serialized object,
// but compose does the validation (so we're fine).
// FIXME: deserialize_for_native_protocol?!
return with_linearized(fragmented_value, [&] (bytes_view value) {
auto m = value_cast<map_type_impl::native_type>(type.deserialize(value, sf));
std::map<bytes, bytes, serialized_compare> map(type.get_keys_type()->as_less_comparator());
auto m = value_cast<map_type_impl::native_type>(type->deserialize(value, sf));
std::map<bytes, bytes, serialized_compare> map(type->get_keys_type()->as_less_comparator());
for (auto&& e : m) {
map.emplace(type.get_keys_type()->decompose(e.first),
type.get_values_type()->decompose(e.second));
map.emplace(type->get_keys_type()->decompose(e.first),
type->get_values_type()->decompose(e.second));
}
return maps::value { std::move(map) };
});
@@ -269,7 +269,8 @@ maps::marker::bind(const query_options& options) {
} catch (marshal_exception& e) {
throw exceptions::invalid_request_exception(e.what());
}
return ::make_shared(maps::value::from_serialized(*val, static_cast<const map_type_impl&>(*_receiver->type), options.get_cql_serialization_format()));
return ::make_shared<maps::value>(maps::value::from_serialized(*val, static_pointer_cast<const map_type_impl>(_receiver->type),
options.get_cql_serialization_format()));
}
void
@@ -284,10 +285,12 @@ maps::setter::execute(mutation& m, const clustering_key_prefix& row_key, const u
return;
}
if (column.type->is_multi_cell()) {
// Delete all cells first, then put new ones
collection_mutation_description mut;
// delete + put
collection_type_impl::mutation mut;
mut.tomb = params.make_tombstone_just_before();
m.set_cell(row_key, column, mut.serialize(*column.type));
auto ctype = static_pointer_cast<const map_type_impl>(column.type);
auto col_mut = ctype->serialize_mutation_form(std::move(mut));
m.set_cell(row_key, column, std::move(col_mut));
}
do_put(m, row_key, params, value, column);
}
@@ -307,12 +310,13 @@ maps::setter_by_key::execute(mutation& m, const clustering_key_prefix& prefix, c
if (!key) {
throw invalid_request_exception("Invalid null map key");
}
auto ctype = static_cast<const map_type_impl*>(column.type.get());
auto ctype = static_pointer_cast<const map_type_impl>(column.type);
auto avalue = value ? params.make_cell(*ctype->get_values_type(), *value, atomic_cell::collection_member::yes) : params.make_dead_cell();
collection_mutation_description update;
map_type_impl::mutation update;
update.cells.emplace_back(std::move(to_bytes(*key)), std::move(avalue));
m.set_cell(prefix, column, update.serialize(*ctype));
// should have been verified as map earlier?
auto col_mut = ctype->serialize_mutation_form(std::move(update));
m.set_cell(prefix, column, std::move(col_mut));
}
void
@@ -329,18 +333,18 @@ maps::do_put(mutation& m, const clustering_key_prefix& prefix, const update_para
shared_ptr<term> value, const column_definition& column) {
auto map_value = dynamic_pointer_cast<maps::value>(value);
if (column.type->is_multi_cell()) {
collection_type_impl::mutation mut;
if (!value) {
return;
}
collection_mutation_description mut;
auto ctype = static_cast<const map_type_impl*>(column.type.get());
auto ctype = static_pointer_cast<const map_type_impl>(column.type);
for (auto&& e : map_value->map) {
mut.cells.emplace_back(e.first, params.make_cell(*ctype->get_values_type(), fragmented_temporary_buffer::view(e.second), atomic_cell::collection_member::yes));
}
m.set_cell(prefix, column, mut.serialize(*ctype));
auto col_mut = ctype->serialize_mutation_form(std::move(mut));
m.set_cell(prefix, column, std::move(col_mut));
} else {
// for frozen maps, we're overwriting the whole cell
if (!value) {
@@ -363,10 +367,10 @@ maps::discarder_by_key::execute(mutation& m, const clustering_key_prefix& prefix
if (key == constants::UNSET_VALUE) {
throw exceptions::invalid_request_exception("Invalid unset map key");
}
collection_mutation_description mut;
collection_type_impl::mutation mut;
mut.cells.emplace_back(*key->get(params._options), params.make_dead_cell());
m.set_cell(prefix, column, mut.serialize(*column.type));
auto mtype = static_cast<const map_type_impl*>(column.type.get());
m.set_cell(prefix, column, mtype->serialize_mutation_form(mut));
}
}

View File

@@ -81,7 +81,7 @@ public:
value(std::map<bytes, bytes, serialized_compare> map)
: map(std::move(map)) {
}
static value from_serialized(const fragmented_temporary_buffer::view& value, const map_type_impl& type, cql_serialization_format sf);
static value from_serialized(const fragmented_temporary_buffer::view& value, map_type type, cql_serialization_format sf);
virtual cql3::raw_value get(const query_options& options) override;
virtual bytes get_with_protocol_version(cql_serialization_format sf);
bool equals(map_type mt, const value& v);

View File

@@ -43,12 +43,10 @@
#include "maps.hh"
#include "sets.hh"
#include "lists.hh"
#include "user_types.hh"
#include "types/tuple.hh"
#include "types/map.hh"
#include "types/list.hh"
#include "types/set.hh"
#include "types/user.hh"
namespace cql3 {
@@ -93,67 +91,6 @@ operation::set_element::is_compatible_with(shared_ptr<raw_update> other) {
return !dynamic_pointer_cast<set_value>(std::move(other));
}
sstring
operation::set_field::to_string(const column_definition& receiver) const {
return format("{}.{} = {}", receiver.name_as_text(), *_field, *_value);
}
shared_ptr<operation>
operation::set_field::prepare(database& db, const sstring& keyspace, const column_definition& receiver) {
if (!receiver.type->is_user_type()) {
throw exceptions::invalid_request_exception(
format("Invalid operation ({}) for non-UDT column {}", to_string(receiver), receiver.name_as_text()));
} else if (!receiver.type->is_multi_cell()) {
throw exceptions::invalid_request_exception(
format("Invalid operation ({}) for frozen UDT column {}", to_string(receiver), receiver.name_as_text()));
}
auto& type = static_cast<const user_type_impl&>(*receiver.type);
auto idx = type.idx_of_field(_field->name());
if (!idx) {
throw exceptions::invalid_request_exception(
format("UDT column {} does not have a field named {}", receiver.name_as_text(), *_field));
}
auto val = _value->prepare(db, keyspace, user_types::field_spec_of(receiver.column_specification, *idx));
return make_shared<user_types::setter_by_field>(receiver, *idx, std::move(val));
}
bool
operation::set_field::is_compatible_with(shared_ptr<raw_update> other) {
auto x = dynamic_pointer_cast<set_field>(other);
if (x) {
return _field != x->_field;
}
return !dynamic_pointer_cast<set_value>(std::move(other));
}
shared_ptr<column_identifier::raw>
operation::field_deletion::affected_column() {
return _id;
}
shared_ptr<operation>
operation::field_deletion::prepare(database& db, const sstring& keyspace, const column_definition& receiver) {
if (!receiver.type->is_user_type()) {
throw exceptions::invalid_request_exception(
format("Invalid deletion operation for non-UDT column {}", receiver.name_as_text()));
} else if (!receiver.type->is_multi_cell()) {
throw exceptions::invalid_request_exception(
format("Frozen UDT column {} does not support field deletions", receiver.name_as_text()));
}
auto type = static_cast<const user_type_impl*>(receiver.type.get());
auto idx = type->idx_of_field(_field->name());
if (!idx) {
throw exceptions::invalid_request_exception(
format("UDT column {} does not have a field named {}", receiver.name_as_text(), *_field));
}
return make_shared<user_types::deleter_by_field>(receiver, *idx);
}
sstring
operation::addition::to_string(const column_definition& receiver) const {
return format("{} = {} + {}", receiver.name_as_text(), receiver.name_as_text(), *_value);
@@ -263,24 +200,20 @@ operation::set_value::prepare(database& db, const sstring& keyspace, const colum
throw exceptions::invalid_request_exception(format("Cannot set the value of counter column {} (counters can only be incremented/decremented, not set)", receiver.name_as_text()));
}
if (receiver.type->is_collection()) {
auto k = receiver.type->get_kind();
if (k == abstract_type::kind::list) {
return make_shared<lists::setter>(receiver, v);
} else if (k == abstract_type::kind::set) {
return make_shared<sets::setter>(receiver, v);
} else if (k == abstract_type::kind::map) {
return make_shared<maps::setter>(receiver, v);
} else {
abort();
}
if (!receiver.type->is_collection()) {
return ::make_shared<constants::setter>(receiver, v);
}
if (receiver.type->is_user_type()) {
return make_shared<user_types::setter>(receiver, v);
auto k = static_pointer_cast<const collection_type_impl>(receiver.type)->get_kind();
if (k == abstract_type::kind::list) {
return make_shared<lists::setter>(receiver, v);
} else if (k == abstract_type::kind::set) {
return make_shared<sets::setter>(receiver, v);
} else if (k == abstract_type::kind::map) {
return make_shared<maps::setter>(receiver, v);
} else {
abort();
}
return ::make_shared<constants::setter>(receiver, v);
}
::shared_ptr <operation>

View File

@@ -142,7 +142,6 @@ public:
* This can be one of:
* - Setting a value: c = v
* - Setting an element of a collection: c[x] = v
* - Setting a field of a user-defined type: c.x = v
* - An addition/subtraction to a variable: c = c +/- v (where v can be a collection literal)
* - An prepend operation: c = v + c
*/
@@ -177,7 +176,6 @@ public:
* This can be one of:
* - Deleting a column
* - Deleting an element of a collection
* - Deleting a field of a user-defined type
*/
class raw_deletion {
public:
@@ -216,42 +214,11 @@ public:
: _selector(std::move(selector)), _value(std::move(value)), _by_uuid(by_uuid) {
}
virtual shared_ptr<operation> prepare(database& db, const sstring& keyspace, const column_definition& receiver) override;
virtual shared_ptr<operation> prepare(database& db, const sstring& keyspace, const column_definition& receiver);
virtual bool is_compatible_with(shared_ptr<raw_update> other) override;
};
// Set a single field inside a user-defined type.
class set_field : public raw_update {
const shared_ptr<column_identifier> _field;
const shared_ptr<term::raw> _value;
private:
sstring to_string(const column_definition& receiver) const;
public:
set_field(shared_ptr<column_identifier> field, shared_ptr<term::raw> value)
: _field(std::move(field)), _value(std::move(value)) {
}
virtual shared_ptr<operation> prepare(database& db, const sstring& keyspace, const column_definition& receiver) override;
virtual bool is_compatible_with(shared_ptr<raw_update> other) override;
};
// Delete a single field inside a user-defined type.
// Equivalent to setting the field to null.
class field_deletion : public raw_deletion {
const shared_ptr<column_identifier::raw> _id;
const shared_ptr<column_identifier> _field;
public:
field_deletion(shared_ptr<column_identifier::raw> id, shared_ptr<column_identifier> field)
: _id(std::move(id)), _field(std::move(field)) {
}
virtual shared_ptr<column_identifier::raw> affected_column() override;
virtual shared_ptr<operation> prepare(database& db, const sstring& keyspace, const column_definition& receiver) override;
};
class addition : public raw_update {
const shared_ptr<term::raw> _value;
private:

View File

@@ -46,7 +46,6 @@
#include "maps.hh"
#include "sets.hh"
#include "lists.hh"
#include "user_types.hh"
namespace cql3 {

View File

@@ -78,10 +78,6 @@ public:
bool is_slice() const {
return (*this == LT) || (*this == LTE) || (*this == GT) || (*this == GTE);
}
bool is_compare() const {
// EQ, LT, LTE, GT, GTE, NEQ
return _b < 5 || _b == 8;
}
sstring to_string() const { return _text; }
bool operator==(const operator_type& other) const { return this == &other; }
bool operator!=(const operator_type& other) const { return this != &other; }

View File

@@ -39,22 +39,17 @@
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "cql3/cql_config.hh"
#include "query_options.hh"
#include "version.hh"
namespace cql3 {
const cql_config default_cql_config;
thread_local const query_options::specific_options query_options::specific_options::DEFAULT{-1, {}, {}, api::missing_timestamp};
thread_local query_options query_options::DEFAULT{default_cql_config,
db::consistency_level::ONE, infinite_timeout_config, std::nullopt,
thread_local query_options query_options::DEFAULT{db::consistency_level::ONE, infinite_timeout_config, std::nullopt,
std::vector<cql3::raw_value_view>(), false, query_options::specific_options::DEFAULT, cql_serialization_format::latest()};
query_options::query_options(const cql_config& cfg,
db::consistency_level consistency,
query_options::query_options(db::consistency_level consistency,
const ::timeout_config& timeout_config,
std::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value> values,
@@ -62,8 +57,7 @@ query_options::query_options(const cql_config& cfg,
bool skip_metadata,
specific_options options,
cql_serialization_format sf)
: _cql_config(cfg)
, _consistency(consistency)
: _consistency(consistency)
, _timeout_config(timeout_config)
, _names(std::move(names))
, _values(std::move(values))
@@ -74,16 +68,14 @@ query_options::query_options(const cql_config& cfg,
{
}
query_options::query_options(const cql_config& cfg,
db::consistency_level consistency,
query_options::query_options(db::consistency_level consistency,
const ::timeout_config& timeout_config,
std::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value> values,
bool skip_metadata,
specific_options options,
cql_serialization_format sf)
: _cql_config(cfg)
, _consistency(consistency)
: _consistency(consistency)
, _timeout_config(timeout_config)
, _names(std::move(names))
, _values(std::move(values))
@@ -95,16 +87,14 @@ query_options::query_options(const cql_config& cfg,
fill_value_views();
}
query_options::query_options(const cql_config& cfg,
db::consistency_level consistency,
query_options::query_options(db::consistency_level consistency,
const ::timeout_config& timeout_config,
std::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value_view> value_views,
bool skip_metadata,
specific_options options,
cql_serialization_format sf)
: _cql_config(cfg)
, _consistency(consistency)
: _consistency(consistency)
, _timeout_config(timeout_config)
, _names(std::move(names))
, _values()
@@ -115,10 +105,8 @@ query_options::query_options(const cql_config& cfg,
{
}
query_options::query_options(db::consistency_level cl, const ::timeout_config& timeout_config, std::vector<cql3::raw_value> values,
specific_options options)
query_options::query_options(db::consistency_level cl, const ::timeout_config& timeout_config, std::vector<cql3::raw_value> values, specific_options options)
: query_options(
default_cql_config,
cl,
timeout_config,
{},
@@ -131,8 +119,7 @@ query_options::query_options(db::consistency_level cl, const ::timeout_config& t
}
query_options::query_options(std::unique_ptr<query_options> qo, ::shared_ptr<service::pager::paging_state> paging_state)
: query_options(qo->_cql_config,
qo->_consistency,
: query_options(qo->_consistency,
qo->get_timeout_config(),
std::move(qo->_names),
std::move(qo->_values),
@@ -144,8 +131,7 @@ query_options::query_options(std::unique_ptr<query_options> qo, ::shared_ptr<ser
}
query_options::query_options(std::unique_ptr<query_options> qo, ::shared_ptr<service::pager::paging_state> paging_state, int32_t page_size)
: query_options(qo->_cql_config,
qo->_consistency,
: query_options(qo->_consistency,
qo->get_timeout_config(),
std::move(qo->_names),
std::move(qo->_values),
@@ -217,12 +203,4 @@ void query_options::fill_value_views()
}
}
db::consistency_level query_options::check_serial_consistency() const {
if (_options.serial_consistency.has_value()) {
return *_options.serial_consistency;
}
throw exceptions::protocol_exception("Consistency level for LWT is missing for a request with conditions");
}
}

View File

@@ -55,9 +55,6 @@
namespace cql3 {
class cql_config;
extern const cql_config default_cql_config;
/**
* Options for a query.
*/
@@ -73,7 +70,6 @@ public:
const api::timestamp_type timestamp;
};
private:
const cql_config& _cql_config;
const db::consistency_level _consistency;
const timeout_config& _timeout_config;
const std::optional<std::vector<sstring_view>> _names;
@@ -108,16 +104,14 @@ public:
query_options(query_options&&) = default;
explicit query_options(const query_options&) = default;
explicit query_options(const cql_config& cfg,
db::consistency_level consistency,
explicit query_options(db::consistency_level consistency,
const timeout_config& timeouts,
std::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value> values,
bool skip_metadata,
specific_options options,
cql_serialization_format sf);
explicit query_options(const cql_config& cfg,
db::consistency_level consistency,
explicit query_options(db::consistency_level consistency,
const timeout_config& timeouts,
std::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value> values,
@@ -125,8 +119,7 @@ public:
bool skip_metadata,
specific_options options,
cql_serialization_format sf);
explicit query_options(const cql_config& cfg,
db::consistency_level consistency,
explicit query_options(db::consistency_level consistency,
const timeout_config& timeouts,
std::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value_view> value_views,
@@ -194,14 +187,11 @@ public:
return get_specific_options().state;
}
/** Serial consistency for conditional updates. */
/** Serial consistency for conditional updates. */
std::optional<db::consistency_level> get_serial_consistency() const {
return get_specific_options().serial_consistency;
}
/** Return serial consistency for conditional updates. Throws if the consistency is not set. */
db::consistency_level check_serial_consistency() const;
api::timestamp_type get_timestamp(service::query_state& state) const {
auto tstamp = get_specific_options().timestamp;
return tstamp != api::missing_timestamp ? tstamp : state.get_timestamp();
@@ -237,10 +227,6 @@ public:
return _names;
}
const cql_config& get_cql_config() const {
return _cql_config;
}
void prepare(const std::vector<::shared_ptr<column_specification>>& specs);
private:
void fill_value_views();
@@ -258,7 +244,7 @@ query_options::query_options(query_options&& o, std::vector<OneMutationDataRange
std::vector<query_options> tmp;
tmp.reserve(values_ranges.size());
std::transform(values_ranges.begin(), values_ranges.end(), std::back_inserter(tmp), [this](auto& values_range) {
return query_options(_cql_config, _consistency, _timeout_config, {}, std::move(values_range), _skip_metadata, _options, _cql_serialization_format);
return query_options(_consistency, _timeout_config, {}, std::move(values_range), _skip_metadata, _options, _cql_serialization_format);
});
_batch_options = std::move(tmp);
}

View File

@@ -69,7 +69,7 @@ const std::chrono::minutes prepared_statements_cache::entry_expiry = std::chrono
class query_processor::internal_state {
service::query_state _qs;
public:
internal_state() : _qs(service::client_state::for_internal_calls(), empty_service_permit()) {
internal_state() : _qs(service::client_state{service::client_state::internal_tag()}, empty_service_permit()) {
}
operator service::query_state&() {
return _qs;
@@ -83,8 +83,15 @@ public:
operator const service::client_state&() const {
return _qs.get_client_state();
}
api::timestamp_type next_timestamp() {
return _qs.get_client_state().get_timestamp();
}
};
api::timestamp_type query_processor::next_timestamp() {
return _internal_state->next_timestamp();
}
query_processor::query_processor(service::storage_proxy& proxy, database& db, query_processor::memory_config mcfg)
: _migration_subscriber{std::make_unique<migration_subscriber>(this)}
, _proxy(proxy)
@@ -114,77 +121,38 @@ query_processor::query_processor(service::storage_proxy& proxy, database& db, qu
}
_metrics.add_group("query_processor", qp_group);
sm::label cas_label("conditional");
auto cas_label_instance = cas_label("yes");
auto non_cas_label_instance = cas_label("no");
_metrics.add_group(
"cql",
{
sm::make_derive(
"reads",
_cql_stats.statements[size_t(statement_type::SELECT)],
_cql_stats.reads,
sm::description("Counts a total number of CQL read requests.")),
sm::make_derive(
"inserts",
_cql_stats.statements[size_t(statement_type::INSERT)],
sm::description("Counts a total number of CQL INSERT requests without conditions."),
{non_cas_label_instance}),
sm::make_derive(
"inserts",
_cql_stats.cas_statements[size_t(statement_type::INSERT)],
sm::description("Counts a total number of CQL INSERT requests with conditions."),
{cas_label_instance}),
_cql_stats.inserts,
sm::description("Counts a total number of CQL INSERT requests.")),
sm::make_derive(
"updates",
_cql_stats.statements[size_t(statement_type::UPDATE)],
sm::description("Counts a total number of CQL UPDATE requests without conditions."),
{non_cas_label_instance}),
sm::make_derive(
"updates",
_cql_stats.cas_statements[size_t(statement_type::UPDATE)],
sm::description("Counts a total number of CQL UPDATE requests with conditions."),
{cas_label_instance}),
_cql_stats.updates,
sm::description("Counts a total number of CQL UPDATE requests.")),
sm::make_derive(
"deletes",
_cql_stats.statements[size_t(statement_type::DELETE)],
sm::description("Counts a total number of CQL DELETE requests without conditions."),
{non_cas_label_instance}),
sm::make_derive(
"deletes",
_cql_stats.cas_statements[size_t(statement_type::DELETE)],
sm::description("Counts a total number of CQL DELETE requests with conditions."),
{cas_label_instance}),
_cql_stats.deletes,
sm::description("Counts a total number of CQL DELETE requests.")),
sm::make_derive(
"batches",
_cql_stats.batches,
sm::description("Counts a total number of CQL BATCH requests without conditions."),
{non_cas_label_instance}),
sm::make_derive(
"batches",
_cql_stats.cas_batches,
sm::description("Counts a total number of CQL BATCH requests with conditions."),
{cas_label_instance}),
sm::description("Counts a total number of CQL BATCH requests.")),
sm::make_derive(
"statements_in_batches",
_cql_stats.statements_in_batches,
sm::description("Counts a total number of sub-statements in CQL BATCH requests without conditions."),
{non_cas_label_instance}),
sm::make_derive(
"statements_in_batches",
_cql_stats.statements_in_cas_batches,
sm::description("Counts a total number of sub-statements in CQL BATCH requests with conditions."),
{cas_label_instance}),
sm::description("Counts a total number of sub-statements in CQL BATCH requests.")),
sm::make_derive(
"batches_pure_logged",
@@ -316,10 +284,7 @@ query_processor::process(const sstring_view& query_string, service::query_state&
auto p = get_statement(query_string, query_state.get_client_state());
auto cql_statement = p->statement;
if (cql_statement->get_bound_terms() != options.get_values_count()) {
const auto msg = format("Invalid amount of bind variables: expected {:d} received {:d}",
cql_statement->get_bound_terms(),
options.get_values_count());
throw exceptions::invalid_request_exception(msg);
throw exceptions::invalid_request_exception("Invalid amount of bind variables");
}
options.prepare(p->bound_names);

View File

@@ -298,6 +298,16 @@ public:
const timeout_config& timeout_config,
const std::initializer_list<data_value>& = { });
/*
* This function provides a timestamp that is guaranteed to be higher than any timestamp
* previously used in internal queries.
*
* This is useful because the client_state have a built-in mechanism to guarantee monotonicity.
* Bypassing that mechanism by the use of some other clock may yield times in the past, even if the operation
* was done in the future.
*/
api::timestamp_type next_timestamp();
future<::shared_ptr<cql_transport::messages::result_message::prepared>>
prepare(sstring query_string, service::query_state& query_state);

View File

@@ -1,35 +0,0 @@
/*
* Copyright (C) 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <cstdint>
namespace cql3::restrictions {
struct restrictions_config {
uint32_t partition_key_restrictions_max_cartesian_product_size = 100;
uint32_t clustering_key_restrictions_max_cartesian_product_size = 100;
};
}

View File

@@ -46,7 +46,6 @@
#include "cartesian_product.hh"
#include "cql3/restrictions/primary_key_restrictions.hh"
#include "cql3/restrictions/single_column_restrictions.hh"
#include "cql3/cql_config.hh"
#include <boost/algorithm/cxx11/all_of.hpp>
#include <boost/range/adaptor/transformed.hpp>
#include <boost/range/adaptor/filtered.hpp>
@@ -56,29 +55,6 @@ namespace cql3 {
namespace restrictions {
namespace {
template <typename ValueType>
const char*
restricted_component_name_v;
template <>
const char* restricted_component_name_v<partition_key> = "partition key";
template <>
const char* restricted_component_name_v<clustering_key> = "clustering key";
inline
void check_cartesian_product_size(size_t size, size_t max, const char* component_name) {
if (size > max) {
throw std::runtime_error(fmt::format("{} cartesian product size {} is greater than maximum {}",
component_name, size, max));
}
}
}
/**
* A set of single column restrictions on a primary key part (partition key or clustering key).
*/
@@ -93,8 +69,6 @@ private:
schema_ptr _schema;
bool _allow_filtering;
::shared_ptr<single_column_restrictions> _restrictions;
private:
static uint32_t max_cartesian_product_size(const restrictions_config& config);
public:
single_column_primary_key_restrictions(schema_ptr schema, bool allow_filtering)
: _schema(schema)
@@ -210,10 +184,7 @@ public:
}
std::vector<ValueType> result;
auto size = cartesian_product_size(value_vector);
check_cartesian_product_size(size, max_cartesian_product_size(options.get_cql_config().restrictions),
restricted_component_name_v<ValueType>);
result.reserve(size);
result.reserve(cartesian_product_size(value_vector));
for (auto&& v : make_cartesian_product(value_vector)) {
result.emplace_back(ValueType::from_optional_exploded(*_schema, std::move(v)));
}
@@ -288,10 +259,7 @@ private:
return ranges;
}
auto size = cartesian_product_size(vec_of_values);
check_cartesian_product_size(size, max_cartesian_product_size(options.get_cql_config().restrictions),
restricted_component_name_v<ValueType>);
ranges.reserve(size);
ranges.reserve(cartesian_product_size(vec_of_values));
for (auto&& prefix : make_cartesian_product(vec_of_values)) {
auto read_bound = [r, &prefix, &options, this](statements::bound bound) -> range_bound {
if (r->has_bound(bound)) {
@@ -332,10 +300,7 @@ private:
vec_of_values.emplace_back(std::move(values));
}
auto size = cartesian_product_size(vec_of_values);
check_cartesian_product_size(size, max_cartesian_product_size(options.get_cql_config().restrictions),
restricted_component_name_v<ValueType>);
ranges.reserve(size);
ranges.reserve(cartesian_product_size(vec_of_values));
for (auto&& prefix : make_cartesian_product(vec_of_values)) {
ranges.emplace_back(range_type::make_singular(ValueType::from_optional_exploded(*_schema, std::move(prefix))));
}
@@ -478,7 +443,7 @@ inline bool single_column_primary_key_restrictions<clustering_key>::needs_filter
// 3. a SLICE restriction isn't on a last place
column_id position = 0;
for (const auto& restriction : _restrictions->restrictions() | boost::adaptors::map_values) {
if (restriction->is_contains() || restriction->is_LIKE() || position != restriction->get_column_def().id) {
if (restriction->is_contains() || position != restriction->get_column_def().id) {
return true;
}
if (!restriction->is_slice()) {
@@ -524,19 +489,6 @@ inline unsigned single_column_primary_key_restrictions<partition_key>::num_prefi
using single_column_partition_key_restrictions = single_column_primary_key_restrictions<partition_key>;
using single_column_clustering_key_restrictions = single_column_primary_key_restrictions<clustering_key>;
template <>
inline
uint32_t single_column_primary_key_restrictions<partition_key>::max_cartesian_product_size(const restrictions_config& config) {
return config.partition_key_restrictions_max_cartesian_product_size;
}
template <>
inline
uint32_t single_column_primary_key_restrictions<clustering_key>::max_cartesian_product_size(const restrictions_config& config) {
return config.clustering_key_restrictions_max_cartesian_product_size;
}
}
}

View File

@@ -41,8 +41,6 @@
#pragma once
#include <optional>
#include "cql3/restrictions/restriction.hh"
#include "cql3/restrictions/term_slice.hh"
#include "cql3/term.hh"
@@ -53,7 +51,6 @@
#include "exceptions/exceptions.hh"
#include "keys.hh"
#include "mutation_partition.hh"
#include "utils/like_matcher.hh"
namespace cql3 {
@@ -387,11 +384,6 @@ public:
class single_column_restriction::LIKE final : public single_column_restriction {
private:
::shared_ptr<term> _value;
/// Matches cell value against LIKE pattern. Optional because it cannot be initialized in the
/// constructor when the pattern is a bind marker. Mutable because it is initialized on demand
/// in is_satisfied_by().
mutable std::optional<like_matcher> _matcher;
mutable bytes_opt _last_pattern; ///< Pattern from which _matcher was last initialized.
public:
LIKE(const column_definition& column_def, ::shared_ptr<term> value)
: single_column_restriction(op::LIKE, column_def)
@@ -432,14 +424,6 @@ public:
virtual ::shared_ptr<single_column_restriction> apply_to(const column_definition& cdef) override {
return ::make_shared<LIKE>(cdef, _value);
}
private:
/// If necessary, reinitializes _matcher and _last_pattern.
///
/// Invoked from is_satisfied_by(), so must be const.
///
/// @return true iff _value was successfully translated to LIKE pattern (regardless of initialization)
bool init_matcher(const query_options& options) const;
};
// This holds CONTAINS, CONTAINS_KEY, and map[key] = value restrictions because we might want to have any combination of them.

View File

@@ -35,6 +35,7 @@
#include "types/map.hh"
#include "types/list.hh"
#include "types/set.hh"
#include "utils/like_matcher.hh"
namespace cql3 {
namespace restrictions {
@@ -235,16 +236,6 @@ statement_restrictions::statement_restrictions(database& db,
_index_restrictions.push_back(_partition_key_restrictions);
}
// If the only updated/deleted columns are static, then we don't need clustering columns.
// And in fact, unless it is an INSERT, we reject if clustering columns are provided as that
// suggest something unintended. For instance, given:
// CREATE TABLE t (k int, v int, s int static, PRIMARY KEY (k, v))
// it can make sense to do:
// INSERT INTO t(k, v, s) VALUES (0, 1, 2)
// but both
// UPDATE t SET s = 3 WHERE k = 0 AND v = 1
// DELETE s FROM t WHERE k = 0 AND v = 1
// sounds like you don't really understand what your are doing.
if (selects_only_static_columns && has_clustering_columns_restriction()) {
if (type.is_update() || type.is_delete()) {
throw exceptions::invalid_request_exception(format("Invalid restrictions on clustering columns since the {} statement modifies only static columns", type));
@@ -390,45 +381,28 @@ std::vector<const column_definition*> statement_restrictions::get_column_defs_fo
if (need_filtering()) {
auto& sim = db.find_column_family(_schema).get_index_manager();
auto [opt_idx, _] = find_idx(sim);
auto column_uses_indexing = [&opt_idx] (const column_definition* cdef, ::shared_ptr<single_column_restriction> restr) {
return opt_idx && restr && restr->is_supported_by(*opt_idx);
auto column_uses_indexing = [&opt_idx] (const column_definition* cdef) {
return opt_idx && opt_idx->depends_on(*cdef);
};
auto single_pk_restrs = dynamic_pointer_cast<single_column_partition_key_restrictions>(_partition_key_restrictions);
if (_partition_key_restrictions->needs_filtering(*_schema)) {
for (auto&& cdef : _partition_key_restrictions->get_column_defs()) {
::shared_ptr<single_column_restriction> restr;
if (single_pk_restrs) {
auto it = single_pk_restrs->restrictions().find(cdef);
if (it != single_pk_restrs->restrictions().end()) {
restr = dynamic_pointer_cast<single_column_restriction>(it->second);
}
}
if (!column_uses_indexing(cdef, restr)) {
if (!column_uses_indexing(cdef)) {
column_defs_for_filtering.emplace_back(cdef);
}
}
}
auto single_ck_restrs = dynamic_pointer_cast<single_column_clustering_key_restrictions>(_clustering_columns_restrictions);
const bool pk_has_unrestricted_components = _partition_key_restrictions->has_unrestricted_components(*_schema);
if (pk_has_unrestricted_components || _clustering_columns_restrictions->needs_filtering(*_schema)) {
column_id first_filtering_id = pk_has_unrestricted_components ? 0 : _schema->clustering_key_columns().begin()->id +
_clustering_columns_restrictions->num_prefix_columns_that_need_not_be_filtered();
for (auto&& cdef : _clustering_columns_restrictions->get_column_defs()) {
::shared_ptr<single_column_restriction> restr;
if (single_pk_restrs) {
auto it = single_ck_restrs->restrictions().find(cdef);
if (it != single_ck_restrs->restrictions().end()) {
restr = dynamic_pointer_cast<single_column_restriction>(it->second);
}
}
if (cdef->id >= first_filtering_id && !column_uses_indexing(cdef, restr)) {
if (cdef->id >= first_filtering_id && !column_uses_indexing(cdef)) {
column_defs_for_filtering.emplace_back(cdef);
}
}
}
for (auto&& cdef : _nonprimary_key_restrictions->get_column_defs()) {
auto restr = dynamic_pointer_cast<single_column_restriction>(_nonprimary_key_restrictions->get_restriction(*cdef));
if (!column_uses_indexing(cdef, restr)) {
if (!column_uses_indexing(cdef)) {
column_defs_for_filtering.emplace_back(cdef);
}
}
@@ -733,7 +707,7 @@ bool single_column_restriction::contains::is_satisfied_by(const schema& schema,
return false;
}
auto col_type = static_cast<const collection_type_impl*>(_column_def.type.get());
auto col_type = static_pointer_cast<const collection_type_impl>(_column_def.type);
if ((!_keys.empty() || !_entry_keys.empty()) && !col_type->is_map()) {
return false;
}
@@ -743,8 +717,8 @@ bool single_column_restriction::contains::is_satisfied_by(const schema& schema,
auto&& element_type = col_type->is_set() ? col_type->name_comparator() : col_type->value_comparator();
if (_column_def.type->is_multi_cell()) {
auto cell = cells.find_cell(_column_def.id);
return cell->as_collection_mutation().with_deserialized(*col_type, [&] (collection_mutation_view_description mv) {
auto&& elements = mv.cells;
return cell->as_collection_mutation().data.with_linearized([&] (bytes_view collection_bv) {
auto&& elements = col_type->deserialize_mutation_form(collection_bv).cells;
auto end = std::remove_if(elements.begin(), elements.end(), [now] (auto&& element) {
return element.second.is_dead(now);
});
@@ -817,7 +791,6 @@ bool single_column_restriction::contains::is_satisfied_by(const schema& schema,
}
bool single_column_restriction::contains::is_satisfied_by(bytes_view collection_bv, const query_options& options) const {
assert(_column_def.type->is_collection());
auto col_type = static_pointer_cast<const collection_type_impl>(_column_def.type);
if (collection_bv.empty() || ((!_keys.empty() || !_entry_keys.empty()) && !col_type->is_map())) {
return false;
@@ -943,18 +916,6 @@ bool token_restriction::slice::is_satisfied_by(const schema& schema,
return satisfied;
}
bool single_column_restriction::LIKE::init_matcher(const query_options& options) const {
auto pattern = to_bytes_opt(_value->bind_and_get(options));
if (!pattern) {
return false;
}
if (!_matcher || pattern != _last_pattern) {
_matcher.emplace(*pattern);
_last_pattern = std::move(pattern);
}
return true;
}
bool single_column_restriction::LIKE::is_satisfied_by(const schema& schema,
const partition_key& key,
const clustering_key_prefix& ckey,
@@ -965,25 +926,19 @@ bool single_column_restriction::LIKE::is_satisfied_by(const schema& schema,
throw exceptions::invalid_request_exception("LIKE is allowed only on string types");
}
auto cell_value = get_value(schema, key, ckey, cells, now);
if (!cell_value) {
return false;
}
if (!init_matcher(options)) {
return false;
}
return cell_value->with_linearized([&] (bytes_view data) {
return (*_matcher)(data);
});
return !cell_value ? false :
cell_value->with_linearized([&] (bytes_view data) {
auto pattern = to_bytes_opt(_value->bind_and_get(options));
return pattern ? like_matcher(*pattern)(data) : false;
});
}
bool single_column_restriction::LIKE::is_satisfied_by(bytes_view data, const query_options& options) const {
if (!_column_def.type->is_string()) {
throw exceptions::invalid_request_exception("LIKE is allowed only on string types");
}
if (!init_matcher(options)) {
return false;
}
return (*_matcher)(data);
auto pattern = to_bytes_opt(_value->bind_and_get(options));
return pattern ? like_matcher(*pattern)(data) : false;
}
}

View File

@@ -137,20 +137,6 @@ public:
return _partition_key_restrictions->is_IN();
}
/**
* Checks if the restrictions on the clustering key is an IN restriction.
*
* @return <code>true</code> the restrictions on the partition key is an IN restriction, <code>false</code>
* otherwise.
*/
bool clustering_key_restrictions_has_IN() const {
return _clustering_columns_restrictions->is_IN();
}
bool clustering_key_restrictions_has_only_eq() const {
return _clustering_columns_restrictions->empty() || _clustering_columns_restrictions->is_all_eq();
}
/**
* Checks if the query request a range of partition keys.
*

View File

@@ -92,14 +92,6 @@ public:
: abstract_function_selector(fun, std::move(arg_selectors))
, _tfun(dynamic_pointer_cast<T>(fun)) {
}
const functions::function_name& name() const {
return _tfun->name();
}
virtual sstring assignment_testable_source_context() const override {
return format("{}", this->name());
}
};
}

View File

@@ -79,6 +79,11 @@ public:
dynamic_pointer_cast<functions::aggregate_function>(func), std::move(arg_selectors))
, _aggregate(fun()->new_aggregate()) {
}
virtual sstring assignment_testable_source_context() const override {
// FIXME:
return "FIXME";
}
};
}

View File

@@ -82,6 +82,12 @@ public:
: abstract_function_selector_for<functions::scalar_function>(
dynamic_pointer_cast<functions::scalar_function>(std::move(fun)), std::move(arg_selectors)) {
}
virtual sstring assignment_testable_source_context() const override {
// FIXME:
return "FIXME";
}
};
}

View File

@@ -142,19 +142,20 @@ shared_ptr<selector::factory>
selectable::with_field_selection::new_selector_factory(database& db, schema_ptr s, std::vector<const column_definition*>& defs) {
auto&& factory = _selected->new_selector_factory(db, s, defs);
auto&& type = factory->new_instance()->get_type();
if (!type->underlying_type()->is_user_type()) {
auto&& ut = dynamic_pointer_cast<const user_type_impl>(type->underlying_type());
if (!ut) {
throw exceptions::invalid_request_exception(
format("Invalid field selection: {} of type {} is not a user type", _selected->to_string(), type->as_cql3_type()));
format("Invalid field selection: {} of type {} is not a user type",
_selected->to_string(), factory->new_instance()->get_type()->as_cql3_type()));
}
auto ut = static_pointer_cast<const user_type_impl>(type->underlying_type());
auto idx = ut->idx_of_field(_field->bytes_);
if (!idx) {
throw exceptions::invalid_request_exception(format("{} of type {} has no field {}",
_selected->to_string(), ut->as_cql3_type(), _field));
for (size_t i = 0; i < ut->size(); ++i) {
if (ut->field_name(i) != _field->bytes_) {
continue;
}
return field_selector::new_factory(std::move(ut), i, std::move(factory));
}
return field_selector::new_factory(std::move(ut), *idx, std::move(factory));
throw exceptions::invalid_request_exception(format("{} of type {} has no field {}",
_selected->to_string(), ut->as_cql3_type(), _field));
}
sstring

View File

@@ -380,7 +380,7 @@ bool result_set_builder::restrictions_filter::do_filter(const selection& selecti
const std::vector<bytes>& partition_key,
const std::vector<bytes>& clustering_key,
const query::result_row_view& static_row,
const query::result_row_view* row) const {
const query::result_row_view& row) const {
static logging::logger rlogger("restrictions_filter");
if (_current_partition_key_does_not_match || _current_static_row_does_not_match || _remaining == 0 || _per_partition_remaining == 0) {
@@ -395,17 +395,14 @@ bool result_set_builder::restrictions_filter::do_filter(const selection& selecti
}
auto static_row_iterator = static_row.iterator();
auto row_iterator = row ? std::optional<query::result_row_view::iterator_type>(row->iterator()) : std::nullopt;
auto row_iterator = row.iterator();
auto non_pk_restrictions_map = _restrictions->get_non_pk_restriction();
for (auto&& cdef : selection.get_columns()) {
switch (cdef->kind) {
case column_kind::static_column:
// fallthrough
case column_kind::regular_column: {
if (cdef->kind == column_kind::regular_column && !row_iterator) {
continue;
}
auto& cell_iterator = (cdef->kind == column_kind::static_column) ? static_row_iterator : *row_iterator;
auto& cell_iterator = (cdef->kind == column_kind::static_column) ? static_row_iterator : row_iterator;
std::optional<query::result_bytes_view> result_view_opt;
if (cdef->type->is_multi_cell()) {
result_view_opt = cell_iterator.next_collection_cell();
@@ -461,9 +458,6 @@ bool result_set_builder::restrictions_filter::do_filter(const selection& selecti
if (restr_it == clustering_key_restrictions_map.end()) {
continue;
}
if (clustering_key.empty()) {
return false;
}
restrictions::single_column_restriction& restriction = *restr_it->second;
const bytes& value_to_check = clustering_key[cdef->id];
bool pk_restriction_matches = restriction.is_satisfied_by(value_to_check, _options);
@@ -483,7 +477,7 @@ bool result_set_builder::restrictions_filter::operator()(const selection& select
const std::vector<bytes>& partition_key,
const std::vector<bytes>& clustering_key,
const query::result_row_view& static_row,
const query::result_row_view* row) const {
const query::result_row_view& row) const {
const bool accepted = do_filter(selection, partition_key, clustering_key, static_row, row);
if (!accepted) {
++_rows_dropped;

View File

@@ -257,7 +257,7 @@ private:
public:
class nop_filter {
public:
inline bool operator()(const selection&, const std::vector<bytes>&, const std::vector<bytes>&, const query::result_row_view&, const query::result_row_view*) const {
inline bool operator()(const selection&, const std::vector<bytes>&, const std::vector<bytes>&, const query::result_row_view&, const query::result_row_view&) const {
return true;
}
void reset(const partition_key* = nullptr) {
@@ -300,13 +300,13 @@ public:
, _rows_fetched_for_last_partition(rows_fetched_for_last_partition)
, _last_pkey(std::move(last_pkey))
{ }
bool operator()(const selection& selection, const std::vector<bytes>& pk, const std::vector<bytes>& ck, const query::result_row_view& static_row, const query::result_row_view* row) const;
bool operator()(const selection& selection, const std::vector<bytes>& pk, const std::vector<bytes>& ck, const query::result_row_view& static_row, const query::result_row_view& row) const;
void reset(const partition_key* key = nullptr);
uint32_t get_rows_dropped() const {
return _rows_dropped;
}
private:
bool do_filter(const selection& selection, const std::vector<bytes>& pk, const std::vector<bytes>& ck, const query::result_row_view& static_row, const query::result_row_view* row) const;
bool do_filter(const selection& selection, const std::vector<bytes>& pk, const std::vector<bytes>& ck, const query::result_row_view& static_row, const query::result_row_view& row) const;
};
result_set_builder(const selection& s, gc_clock::time_point now, cql_serialization_format sf,
@@ -379,7 +379,7 @@ public:
void accept_new_row(const query::result_row_view& static_row, const query::result_row_view& row) {
auto static_row_iterator = static_row.iterator();
auto row_iterator = row.iterator();
if (!_filter(_selection, _partition_key, _clustering_key, static_row, &row)) {
if (!_filter(_selection, _partition_key, _clustering_key, static_row, row)) {
return;
}
_builder.new_row();
@@ -409,9 +409,6 @@ public:
uint32_t accept_partition_end(const query::result_row_view& static_row) {
if (_row_count == 0) {
if (!_filter(_selection, _partition_key, _clustering_key, static_row, nullptr)) {
return _filter.get_rows_dropped();
}
_builder.new_row();
auto static_row_iterator = static_row.iterator();
for (auto&& def : _selection.get_columns()) {

View File

@@ -38,22 +38,12 @@ shared_ptr<term>
sets::literal::prepare(database& db, const sstring& keyspace, shared_ptr<column_specification> receiver) {
validate_assignable_to(db, keyspace, receiver);
if (_elements.empty()) {
// In Cassandra, an empty (unfrozen) map/set/list is equivalent to the column being null. In
// other words a non-frozen collection only exists if it has elements. Return nullptr right
// away to simplify predicate evaluation. See also
// https://issues.apache.org/jira/browse/CASSANDRA-5141
if (receiver->type->is_multi_cell()) {
return cql3::constants::null_literal::NULL_VALUE;
}
// We've parsed empty maps as a set literal to break the ambiguity so
// handle that case now. This branch works for frozen sets/maps only.
if (dynamic_pointer_cast<const map_type_impl>(receiver->type)) {
// use empty_type for comparator, set is empty anyway.
std::map<bytes, bytes, serialized_compare> m(empty_type->as_less_comparator());
return ::make_shared<maps::value>(std::move(m));
}
// We've parsed empty maps as a set literal to break the ambiguity so
// handle that case now
if (_elements.empty() && dynamic_pointer_cast<const map_type_impl>(receiver->type)) {
// use empty_type for comparator, set is empty anyway.
std::map<bytes, bytes, serialized_compare> m(empty_type->as_less_comparator());
return ::make_shared<maps::value>(std::move(m));
}
auto value_spec = value_spec_of(receiver);
@@ -132,16 +122,16 @@ sets::literal::to_string() const {
}
sets::value
sets::value::from_serialized(const fragmented_temporary_buffer::view& val, const set_type_impl& type, cql_serialization_format sf) {
sets::value::from_serialized(const fragmented_temporary_buffer::view& val, set_type type, cql_serialization_format sf) {
try {
// Collections have this small hack that validate cannot be called on a serialized object,
// but compose does the validation (so we're fine).
// FIXME: deserializeForNativeProtocol?!
return with_linearized(val, [&] (bytes_view v) {
auto s = value_cast<set_type_impl::native_type>(type.deserialize(v, sf));
std::set<bytes, serialized_compare> elements(type.get_elements_type()->as_less_comparator());
auto s = value_cast<set_type_impl::native_type>(type->deserialize(v, sf));
std::set<bytes, serialized_compare> elements(type->get_elements_type()->as_less_comparator());
for (auto&& element : s) {
elements.insert(elements.end(), type.get_elements_type()->decompose(element));
elements.insert(elements.end(), type->get_elements_type()->decompose(element));
}
return value(std::move(elements));
});
@@ -237,15 +227,15 @@ sets::marker::bind(const query_options& options) {
} else if (value.is_unset_value()) {
return constants::UNSET_VALUE;
} else {
auto& type = static_cast<const set_type_impl&>(*_receiver->type);
auto as_set_type = static_pointer_cast<const set_type_impl>(_receiver->type);
try {
with_linearized(*value, [&] (bytes_view v) {
type.validate(v, options.get_cql_serialization_format());
as_set_type->validate(v, options.get_cql_serialization_format());
});
} catch (marshal_exception& e) {
throw exceptions::invalid_request_exception(e.what());
}
return make_shared(value::from_serialized(*value, type, options.get_cql_serialization_format()));
return make_shared(value::from_serialized(*value, as_set_type, options.get_cql_serialization_format()));
}
}
@@ -261,10 +251,12 @@ sets::setter::execute(mutation& m, const clustering_key_prefix& row_key, const u
return;
}
if (column.type->is_multi_cell()) {
// Delete all cells first, then add new ones
collection_mutation_description mut;
// delete + add
collection_type_impl::mutation mut;
mut.tomb = params.make_tombstone_just_before();
m.set_cell(row_key, column, mut.serialize(*column.type));
auto ctype = static_pointer_cast<const set_type_impl>(column.type);
auto col_mut = ctype->serialize_mutation_form(std::move(mut));
m.set_cell(row_key, column, std::move(col_mut));
}
adder::do_add(m, row_key, params, value, column);
}
@@ -283,21 +275,21 @@ void
sets::adder::do_add(mutation& m, const clustering_key_prefix& row_key, const update_parameters& params,
shared_ptr<term> value, const column_definition& column) {
auto set_value = dynamic_pointer_cast<sets::value>(std::move(value));
auto set_type = dynamic_cast<const set_type_impl*>(column.type.get());
assert(set_type);
auto set_type = dynamic_pointer_cast<const set_type_impl>(column.type);
if (column.type->is_multi_cell()) {
// FIXME: mutation_view? not compatible with params.make_cell().
collection_type_impl::mutation mut;
if (!set_value || set_value->_elements.empty()) {
return;
}
// FIXME: collection_mutation_view_description? not compatible with params.make_cell().
collection_mutation_description mut;
for (auto&& e : set_value->_elements) {
mut.cells.emplace_back(e, params.make_cell(*set_type->value_comparator(), bytes_view(), atomic_cell::collection_member::yes));
}
auto smut = set_type->serialize_mutation_form(mut);
m.set_cell(row_key, column, mut.serialize(*set_type));
m.set_cell(row_key, column, std::move(smut));
} else if (set_value != nullptr) {
// for frozen sets, we're overwriting the whole cell
auto v = set_type->serialize_partially_deserialized_form(
@@ -318,7 +310,7 @@ sets::discarder::execute(mutation& m, const clustering_key_prefix& row_key, cons
return;
}
collection_mutation_description mut;
collection_type_impl::mutation mut;
auto kill = [&] (bytes idx) {
mut.cells.push_back({std::move(idx), params.make_dead_cell()});
};
@@ -328,7 +320,10 @@ sets::discarder::execute(mutation& m, const clustering_key_prefix& row_key, cons
for (auto&& e : svalue->_elements) {
kill(e);
}
m.set_cell(row_key, column, mut.serialize(*column.type));
auto ctype = static_pointer_cast<const collection_type_impl>(column.type);
m.set_cell(row_key, column,
atomic_cell_or_collection::from_collection_mutation(
ctype->serialize_mutation_form(mut)));
}
void sets::element_discarder::execute(mutation& m, const clustering_key_prefix& row_key, const update_parameters& params)
@@ -338,9 +333,10 @@ void sets::element_discarder::execute(mutation& m, const clustering_key_prefix&
if (!elt) {
throw exceptions::invalid_request_exception("Invalid null set element");
}
collection_mutation_description mut;
collection_type_impl::mutation mut;
mut.cells.emplace_back(*elt->get(params._options), params.make_dead_cell());
m.set_cell(row_key, column, mut.serialize(*column.type));
auto ctype = static_pointer_cast<const collection_type_impl>(column.type);
m.set_cell(row_key, column, ctype->serialize_mutation_form(mut));
}
}

View File

@@ -78,7 +78,7 @@ public:
value(std::set<bytes, serialized_compare> elements)
: _elements(std::move(elements)) {
}
static value from_serialized(const fragmented_temporary_buffer::view& v, const set_type_impl& type, cql_serialization_format sf);
static value from_serialized(const fragmented_temporary_buffer::view& v, set_type type, cql_serialization_format sf);
virtual cql3::raw_value get(const query_options& options) override;
virtual bytes get_with_protocol_version(cql_serialization_format sf) override;
bool equals(set_type st, const value& v);

View File

@@ -42,7 +42,6 @@
#include "alter_keyspace_statement.hh"
#include "prepared_statement.hh"
#include "service/migration_manager.hh"
#include "service/storage_service.hh"
#include "db/system_keyspace.hh"
#include "database.hh"
@@ -94,8 +93,7 @@ void cql3::statements::alter_keyspace_statement::validate(service::storage_proxy
future<shared_ptr<cql_transport::event::schema_change>> cql3::statements::alter_keyspace_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only) {
auto old_ksm = service::get_local_storage_proxy().get_db().local().find_keyspace(_name).metadata();
const auto& tm = service::get_local_storage_service().get_token_metadata();
return service::get_local_migration_manager().announce_keyspace_update(_attrs->as_ks_metadata_update(old_ksm, tm), is_local_only).then([this] {
return service::get_local_migration_manager().announce_keyspace_update(_attrs->as_ks_metadata_update(old_ksm), is_local_only).then([this] {
using namespace cql_transport;
return make_shared<event::schema_change>(
event::schema_change::change_type::UPDATED,

View File

@@ -43,12 +43,10 @@
#include "index/secondary_index_manager.hh"
#include "prepared_statement.hh"
#include "service/migration_manager.hh"
#include "service/storage_service.hh"
#include "validation.hh"
#include "db/extensions.hh"
#include <boost/range/adaptor/filtered.hpp>
#include <boost/range/adaptor/transformed.hpp>
#include "cdc/cdc.hh"
#include "cql3/util.hh"
#include "view_info.hh"
#include "database.hh"
@@ -223,7 +221,7 @@ void alter_table_statement::add_column(schema_ptr schema, const table& cf, schem
schema_builder builder(view);
if (view->view_info()->include_all_columns()) {
builder.with_column(column_name->name(), type);
} else if (!view->view_info()->base_non_pk_column_in_view_pk()) {
} else if (view->view_info()->base_non_pk_columns_in_view_pk().empty()) {
db::view::create_virtual_column(builder, column_name->name(), type);
}
view_updates.push_back(view_ptr(builder.build()));
@@ -312,8 +310,6 @@ future<shared_ptr<cql_transport::event::schema_change>> alter_table_statement::a
}
};
bool create_cdc = false;
bool delete_cdc = false;
switch (_type) {
case alter_table_statement::type::add:
assert(_column_changes.size());
@@ -356,11 +352,6 @@ future<shared_ptr<cql_transport::event::schema_change>> alter_table_statement::a
throw exceptions::invalid_request_exception("Cannot set default_time_to_live on a table with counters");
}
{
bool enable_cdc = _properties->get_cdc_options() && cdc::options(*_properties->get_cdc_options()).enabled();
create_cdc = enable_cdc && !schema->cdc_options().enabled();
delete_cdc = !enable_cdc && schema->cdc_options().enabled();
}
_properties->apply_to_builder(cfm, db.extensions());
break;
@@ -397,17 +388,7 @@ future<shared_ptr<cql_transport::event::schema_change>> alter_table_statement::a
break;
}
auto f = service::get_local_migration_manager().announce_column_family_update(cfm.build(), false, std::move(view_updates), is_local_only);
if (create_cdc) {
f = f.then([&proxy, schema = std::move(schema)] {
return cdc::setup(cdc::db_context::builder(proxy).build(), schema);
});
} else if (delete_cdc) {
f = f.then([&proxy, schema = std::move(schema)] {
return cdc::remove(cdc::db_context::builder(proxy).build(), schema->ks_name(), schema->cf_name());
});
}
return f.then([this] {
return service::get_local_migration_manager().announce_column_family_update(cfm.build(), false, std::move(view_updates), is_local_only).then([this] {
using namespace cql_transport;
return make_shared<event::schema_change>(
event::schema_change::change_type::UPDATED,

View File

@@ -78,6 +78,16 @@ const sstring& alter_type_statement::keyspace() const
return _name.get_keyspace();
}
static std::optional<uint32_t> get_idx_of_field(user_type type, shared_ptr<column_identifier> field)
{
for (uint32_t i = 0; i < type->field_names().size(); ++i) {
if (field->name() == type->field_names()[i]) {
return {i};
}
}
return {};
}
void alter_type_statement::do_announce_migration(database& db, ::keyspace& ks, bool is_local_only)
{
auto&& all_types = ks.metadata()->user_types()->get_all_types();
@@ -112,6 +122,18 @@ void alter_type_statement::do_announce_migration(database& db, ::keyspace& ks, b
}
}
}
// Other user types potentially using the updated type
for (auto&& ut : ks.metadata()->user_types()->get_all_types() | boost::adaptors::map_values) {
// Re-updating the type we've just updated would be harmless but useless so we avoid it.
if (ut->_keyspace != updated->_keyspace || ut->_name != updated->_name) {
auto upd_opt = ut->update_user_type(updated);
if (upd_opt) {
service::get_local_migration_manager().announce_type_update(
static_pointer_cast<const user_type_impl>(*upd_opt), is_local_only).get();
}
}
}
}
future<shared_ptr<cql_transport::event::schema_change>> alter_type_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only)
@@ -143,30 +165,25 @@ alter_type_statement::add_or_alter::add_or_alter(const ut_name& name, bool is_ad
user_type alter_type_statement::add_or_alter::do_add(database& db, user_type to_update) const
{
if (to_update->idx_of_field(_field_name->name())) {
if (get_idx_of_field(to_update, _field_name)) {
throw exceptions::invalid_request_exception(format("Cannot add new field {} to type {}: a field of the same name already exists",
_field_name->to_string(), _name.to_string()));
}
if (to_update->size() == max_udt_fields) {
throw exceptions::invalid_request_exception(format("Cannot add new field to type {}: maximum number of fields reached", _name));
}
std::vector<bytes> new_names(to_update->field_names());
new_names.push_back(_field_name->name());
std::vector<data_type> new_types(to_update->field_types());
auto&& add_type = _field_type->prepare(db, keyspace()).get_type();
if (add_type->references_user_type(to_update->_keyspace, to_update->_name)) {
throw exceptions::invalid_request_exception(format("Cannot add new field {} of type {} to type {} as this would create a circular reference",
*_field_name, *_field_type, _name.to_string()));
throw exceptions::invalid_request_exception(format("Cannot add new field {} of type {} to type {} as this would create a circular reference", _field_name->to_string(), _field_type->to_string(), _name.to_string()));
}
new_types.push_back(std::move(add_type));
return user_type_impl::get_instance(to_update->_keyspace, to_update->_name, std::move(new_names), std::move(new_types), to_update->is_multi_cell());
return user_type_impl::get_instance(to_update->_keyspace, to_update->_name, std::move(new_names), std::move(new_types));
}
user_type alter_type_statement::add_or_alter::do_alter(database& db, user_type to_update) const
{
auto idx = to_update->idx_of_field(_field_name->name());
std::optional<uint32_t> idx = get_idx_of_field(to_update, _field_name);
if (!idx) {
throw exceptions::invalid_request_exception(format("Unknown field {} in type {}", _field_name->to_string(), _name.to_string()));
}
@@ -175,12 +192,12 @@ user_type alter_type_statement::add_or_alter::do_alter(database& db, user_type t
auto new_type = _field_type->prepare(db, keyspace()).get_type();
if (!new_type->is_compatible_with(*previous)) {
throw exceptions::invalid_request_exception(format("Type {} in incompatible with previous type {} of field {} in user type {}",
*_field_type, previous->as_cql3_type(), *_field_name, _name));
_field_type->to_string(), previous->as_cql3_type().to_string(), _field_name->to_string(), _name.to_string()));
}
std::vector<data_type> new_types(to_update->field_types());
new_types[*idx] = new_type;
return user_type_impl::get_instance(to_update->_keyspace, to_update->_name, to_update->field_names(), std::move(new_types), to_update->is_multi_cell());
return user_type_impl::get_instance(to_update->_keyspace, to_update->_name, to_update->field_names(), std::move(new_types));
}
user_type alter_type_statement::add_or_alter::make_updated_type(database& db, user_type to_update) const
@@ -203,13 +220,13 @@ user_type alter_type_statement::renames::make_updated_type(database& db, user_ty
std::vector<bytes> new_names(to_update->field_names());
for (auto&& rename : _renames) {
auto&& from = rename.first;
auto idx = to_update->idx_of_field(from->name());
std::optional<uint32_t> idx = get_idx_of_field(to_update, from);
if (!idx) {
throw exceptions::invalid_request_exception(format("Unknown field {} in type {}", from->to_string(), _name.to_string()));
}
new_names[*idx] = rename.second->name();
}
auto&& updated = user_type_impl::get_instance(to_update->_keyspace, to_update->_name, std::move(new_names), to_update->field_types(), to_update->is_multi_cell());
auto&& updated = user_type_impl::get_instance(to_update->_keyspace, to_update->_name, std::move(new_names), to_update->field_types());
create_type_statement::check_for_duplicate_names(updated);
return updated;
}

View File

@@ -40,10 +40,27 @@
#include "batch_statement.hh"
#include "raw/batch_statement.hh"
#include "db/config.hh"
#include "db/consistency_level_validations.hh"
#include "database.hh"
#include <seastar/core/execution_stage.hh>
#include "cas_request.hh"
namespace {
struct mutation_equals_by_key {
bool operator()(const mutation& m1, const mutation& m2) const {
return m1.schema() == m2.schema()
&& m1.decorated_key().equal(*m1.schema(), m2.decorated_key());
}
};
struct mutation_hash_by_key {
size_t operator()(const mutation& m) const {
auto dk_hash = std::hash<dht::decorated_key>();
return dk_hash(m.decorated_key());
}
};
}
namespace cql3 {
@@ -62,20 +79,12 @@ batch_statement::batch_statement(int bound_terms, type type_,
std::vector<single_statement> statements,
std::unique_ptr<attributes> attrs,
cql_stats& stats)
: cql_statement_opt_metadata(timeout_for_type(type_))
: cql_statement_no_metadata(timeout_for_type(type_))
, _bound_terms(bound_terms), _type(type_), _statements(std::move(statements))
, _attrs(std::move(attrs))
, _has_conditions(boost::algorithm::any_of(_statements, [] (auto&& s) { return s.statement->has_conditions(); }))
, _stats(stats)
{
if (has_conditions()) {
// A batch can be created not only by raw::batch_statement::prepare, but also by
// cql_server::connection::process_batch, which doesn't call any methods of
// cql3::statements::batch_statement, only constructs it. So let's call
// build_cas_result_set_metadata right from the constructor to avoid crash trying to access
// uninitialized batch metadata.
build_cas_result_set_metadata();
}
}
batch_statement::batch_statement(type type_,
@@ -183,20 +192,21 @@ const std::vector<batch_statement::single_statement>& batch_statement::get_state
return _statements;
}
future<std::vector<mutation>> batch_statement::get_mutations(service::storage_proxy& storage, const query_options& options,
db::timeout_clock::time_point timeout, bool local, api::timestamp_type now, service::query_state& query_state) {
future<std::vector<mutation>> batch_statement::get_mutations(service::storage_proxy& storage, const query_options& options, db::timeout_clock::time_point timeout, bool local, api::timestamp_type now, tracing::trace_state_ptr trace_state,
service_permit permit) {
// Do not process in parallel because operations like list append/prepend depend on execution order.
using mutation_set_type = std::unordered_set<mutation, mutation_hash_by_key, mutation_equals_by_key>;
return do_with(mutation_set_type(), [this, &storage, &options, timeout, now, local, &query_state] (auto& result) mutable {
return do_with(mutation_set_type(), [this, &storage, &options, timeout, now, local, trace_state, permit = std::move(permit)] (auto& result) mutable {
result.reserve(_statements.size());
_stats.statements_in_batches += _statements.size();
return do_for_each(boost::make_counting_iterator<size_t>(0),
boost::make_counting_iterator<size_t>(_statements.size()),
[this, &storage, &options, now, local, &result, timeout, &query_state] (size_t i) {
[this, &storage, &options, now, local, &result, timeout, trace_state, permit = std::move(permit)] (size_t i) {
auto&& statement = _statements[i].statement;
++_stats.statements[size_t(statement->type)];
statement->inc_cql_stats();
auto&& statement_options = options.for_statement(i);
auto timestamp = _attrs->get_timestamp(now, statement_options);
return statement->get_mutations(storage, statement_options, timeout, local, timestamp, query_state).then([&result] (auto&& more) {
return statement->get_mutations(storage, statement_options, timeout, local, timestamp, trace_state, permit).then([&result] (auto&& more) {
for (auto&& m : more) {
// We want unordered_set::try_emplace(), but we don't have it
auto pos = result.find(m);
@@ -264,6 +274,7 @@ static thread_local inheriting_concrete_execution_stage<
future<shared_ptr<cql_transport::messages::result_message>> batch_statement::execute(
service::storage_proxy& storage, service::query_state& state, const query_options& options) {
++_stats.batches;
return batch_stage(this, seastar::ref(storage), seastar::ref(state),
seastar::cref(options), false, options.get_timestamp(state));
}
@@ -281,16 +292,11 @@ future<shared_ptr<cql_transport::messages::result_message>> batch_statement::do_
throw new InvalidRequestException("Invalid empty serial consistency level");
#endif
if (_has_conditions) {
++_stats.cas_batches;
_stats.statements_in_cas_batches += _statements.size();
return execute_with_conditions(storage, options, query_state);
}
++_stats.batches;
_stats.statements_in_batches += _statements.size();
auto timeout = db::timeout_clock::now() + options.get_timeout_config().*get_timeout_config_selector();
return get_mutations(storage, options, timeout, local, now, query_state).then([this, &storage, &options, timeout, tr_state = query_state.get_trace_state(),
return get_mutations(storage, options, timeout, local, now, query_state.get_trace_state(), query_state.get_permit()).then([this, &storage, &options, timeout, tr_state = query_state.get_trace_state(),
permit = query_state.get_permit()] (std::vector<mutation> ms) mutable {
return execute_without_conditions(storage, std::move(ms), options.get_consistency(), timeout, std::move(tr_state), std::move(permit));
}).then([] {
@@ -336,82 +342,56 @@ future<> batch_statement::execute_without_conditions(
}
future<shared_ptr<cql_transport::messages::result_message>> batch_statement::execute_with_conditions(
service::storage_proxy& proxy,
service::storage_proxy& storage,
const query_options& options,
service::query_state& qs) {
service::query_state& state)
{
fail(unimplemented::cause::LWT);
#if 0
auto now = state.get_timestamp();
ByteBuffer key = null;
String ksName = null;
String cfName = null;
CQL3CasRequest casRequest = null;
Set<ColumnDefinition> columnsWithConditions = new LinkedHashSet<>();
auto cl_for_commit = options.get_consistency();
auto cl_for_paxos = options.check_serial_consistency();
seastar::shared_ptr<cas_request> request;
schema_ptr schema;
db::timeout_clock::time_point now = db::timeout_clock::now();
const timeout_config& cfg = options.get_timeout_config();
auto batch_timeout = now + cfg.write_timeout; // Statement timeout.
auto cas_timeout = now + cfg.cas_timeout; // Ballot contention timeout.
auto read_timeout = now + cfg.read_timeout; // Query timeout.
for (size_t i = 0; i < _statements.size(); ++i) {
modification_statement& statement = *_statements[i].statement;
const query_options& statement_options = options.for_statement(i);
++_stats.cas_statements[size_t(statement.type)];
modification_statement::json_cache_opt json_cache = statement.maybe_prepare_json_cache(statement_options);
// At most one key
std::vector<dht::partition_range> keys = statement.build_partition_keys(statement_options, json_cache);
if (keys.empty()) {
continue;
for (int i = 0; i < statements.size(); i++)
{
ModificationStatement statement = statements.get(i);
QueryOptions statementOptions = options.forStatement(i);
long timestamp = attrs.getTimestamp(now, statementOptions);
List<ByteBuffer> pks = statement.buildPartitionKeyNames(statementOptions);
if (pks.size() > 1)
throw new IllegalArgumentException("Batch with conditions cannot span multiple partitions (you cannot use IN on the partition key)");
if (key == null)
{
key = pks.get(0);
ksName = statement.cfm.ksName;
cfName = statement.cfm.cfName;
casRequest = new CQL3CasRequest(statement.cfm, key, true);
}
if (request.get() == nullptr) {
schema = statement.s;
request = seastar::make_shared<cas_request>(schema, std::move(keys));
} else if (keys.size() != 1 || keys.front().equal(request->key().front(), dht::ring_position_comparator(*schema)) == false) {
throw exceptions::invalid_request_exception("BATCH with conditions cannot span multiple partitions");
else if (!key.equals(pks.get(0)))
{
throw new InvalidRequestException("Batch with conditions cannot span multiple partitions");
}
std::vector<query::clustering_range> ranges = statement.create_clustering_ranges(statement_options, json_cache);
request->add_row_update(statement, std::move(ranges), std::move(json_cache), statement_options);
}
if (request.get() == nullptr) {
throw exceptions::invalid_request_exception(format("Unrestricted partition key in a conditional BATCH"));
}
return proxy.cas(schema, request, request->read_command(), request->key(),
{read_timeout, qs.get_permit(), qs.get_client_state(), qs.get_trace_state()},
cl_for_paxos, cl_for_commit, batch_timeout, cas_timeout).then([this, request] (bool is_applied) {
return modification_statement::build_cas_result_set(_metadata, _columns_of_cas_result_set, is_applied, request->rows());
});
}
void batch_statement::build_cas_result_set_metadata() {
if (_statements.empty()) {
return;
}
const auto& schema = *_statements.front().statement->s;
_columns_of_cas_result_set.resize(schema.all_columns_count());
// Add the mandatory [applied] column to result set metadata
std::vector<shared_ptr<column_specification>> columns;
auto applied = make_shared<cql3::column_specification>(schema.ks_name(), schema.cf_name(),
make_shared<cql3::column_identifier>("[applied]", false), boolean_type);
columns.push_back(applied);
for (const auto& def : boost::range::join(schema.partition_key_columns(), schema.clustering_key_columns())) {
_columns_of_cas_result_set.set(def.ordinal_id);
}
for (const auto& s : _statements) {
_columns_of_cas_result_set.union_with(s.statement->columns_of_cas_result_set());
}
columns.reserve(_columns_of_cas_result_set.count());
for (const auto& def : schema.all_columns()) {
if (_columns_of_cas_result_set.test(def.ordinal_id)) {
columns.emplace_back(def.column_specification);
Composite clusteringPrefix = statement.createClusteringPrefix(statementOptions);
if (statement.hasConditions())
{
statement.addConditions(clusteringPrefix, casRequest, statementOptions);
// As soon as we have a ifNotExists, we set columnsWithConditions to null so that everything is in the resultSet
if (statement.hasIfNotExistCondition() || statement.hasIfExistCondition())
columnsWithConditions = null;
else if (columnsWithConditions != null)
Iterables.addAll(columnsWithConditions, statement.getColumnsWithConditions());
}
casRequest.addRowUpdate(clusteringPrefix, statement, statementOptions, timestamp);
}
_metadata = seastar::make_shared<cql3::metadata>(std::move(columns));
ColumnFamily result = StorageProxy.cas(ksName, cfName, key, casRequest, options.getSerialConsistency(), options.getConsistency(), state.getClientState());
return new ResultMessage.Rows(ModificationStatement.buildCasResultSet(ksName, key, cfName, result, columnsWithConditions, true, options.forStatement(0)));
#endif
}
namespace raw {

View File

@@ -62,7 +62,7 @@ namespace statements {
* A <code>BATCH</code> statement parsed from a CQL query.
*
*/
class batch_statement : public cql_statement_opt_metadata {
class batch_statement : public cql_statement_no_metadata {
static logging::logger _logger;
public:
using type = raw::batch_statement::type;
@@ -85,19 +85,7 @@ private:
type _type;
std::vector<single_statement> _statements;
std::unique_ptr<attributes> _attrs;
// True if *any* statement of the batch has IF .. clause. In
// this case entire batch is considered a CAS batch.
bool _has_conditions;
// If the BATCH has conditions, it must return columns which
// are involved in condition expressions in its result set.
// Unlike Cassandra, Scylla always returns all columns,
// regardless of whether the batch succeeds or not - this
// allows clients to prepare a CAS statement like any other
// statement, and trust the returned statement metadata.
// Cassandra returns a result set only if CAS succeeds. If
// any statement in the batch has IF EXISTS, we must return
// all columns of the table, including the primary key.
column_set _columns_of_cas_result_set;
cql_stats& _stats;
public:
/**
@@ -131,18 +119,14 @@ public:
// Validates a prepared batch statement without validating its nested statements.
void validate();
bool has_conditions() const { return _has_conditions; }
void build_cas_result_set_metadata();
// The batch itself will be validated in either Parsed#prepare() - for regular CQL3 batches,
// or in QueryProcessor.processBatch() - for native protocol batches.
virtual void validate(service::storage_proxy& proxy, const service::client_state& state) override;
const std::vector<single_statement>& get_statements();
private:
future<std::vector<mutation>> get_mutations(service::storage_proxy& storage, const query_options& options, db::timeout_clock::time_point timeout,
bool local, api::timestamp_type now, service::query_state& query_state);
future<std::vector<mutation>> get_mutations(service::storage_proxy& storage, const query_options& options, db::timeout_clock::time_point timeout, bool local, api::timestamp_type now, tracing::trace_state_ptr trace_state,
service_permit permit);
public:
/**

View File

@@ -1,188 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/*
* Copyright (C) 2019 ScyllaDB
*
* Modified by ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "modification_statement.hh"
#include "cas_request.hh"
#include <seastar/core/sleep.hh>
namespace cql3::statements {
using namespace std::chrono;
void cas_request::add_row_update(modification_statement &stmt_arg,
std::vector<query::clustering_range> ranges_arg,
modification_statement::json_cache_opt json_cache_arg,
const query_options& options_arg) {
// TODO: reserve updates array for batches
_updates.emplace_back(cas_row_update{
.statement = stmt_arg,
.ranges = std::move(ranges_arg),
.json_cache = std::move(json_cache_arg),
.options = options_arg});
}
std::optional<mutation> cas_request::apply_updates(api::timestamp_type ts) const {
// We're working with a single partition, so there will be only one element
// in the vector. A vector is used since this is a conventional format
// to pass a mutation onward.
std::optional<mutation> mutation_set;
for (const cas_row_update& op: _updates) {
update_parameters params(_schema, op.options, ts, op.statement.get_time_to_live(op.options), _rows);
std::vector<mutation> statement_mutations = op.statement.apply_updates(_key, op.ranges, params, op.json_cache);
// Append all mutations (in fact only one) to the consolidated one.
for (mutation& m : statement_mutations) {
if (mutation_set.has_value() == false) {
mutation_set.emplace(std::move(m));
} else {
mutation_set->apply(std::move(m));
}
}
}
return mutation_set;
}
lw_shared_ptr<query::read_command> cas_request::read_command() const {
column_set columns_to_read(_schema->all_columns_count());
std::vector<query::clustering_range> ranges;
for (const cas_row_update& op : _updates) {
if (op.statement.has_conditions() == false && op.statement.requires_read() == false) {
// No point in pre-fetching the old row if the statement doesn't check it in a CAS and
// doesn't use it to apply updates.
continue;
}
columns_to_read.union_with(op.statement.columns_to_read());
if (op.statement.has_only_static_column_conditions() && !op.statement.requires_read()) {
// If a statement has only static column conditions and doesn't have operations that
// require read, it doesn't matter what clustering key range to query - any partition
// row will do for the check.
continue;
}
ranges.reserve(op.ranges.size());
std::copy(op.ranges.begin(), op.ranges.end(), std::back_inserter(ranges));
}
uint32_t max_rows = query::max_rows;
if (ranges.empty()) {
// With only a static condition, we still want to make the distinction between
// a non-existing partition and one that exists (has some live data) but has not
// static content. So we query the first live row of the partition.
ranges.emplace_back(query::clustering_range::make_open_ended_both_sides());
max_rows = 1;
} else {
ranges = query::clustering_range::deoverlap(std::move(ranges), clustering_key::tri_compare(*_schema));
}
auto options = update_parameters::options;
options.set(query::partition_slice::option::always_return_static_content);
query::partition_slice ps(std::move(ranges), *_schema, columns_to_read, options);
ps.set_partition_row_limit(max_rows);
return make_lw_shared<query::read_command>(_schema->id(), _schema->version(), std::move(ps));
}
bool cas_request::applies_to() const {
const partition_key& pkey = _key.front().start()->value().key().value();
const clustering_key empty_ckey = clustering_key::make_empty();
bool applies = true;
bool is_cas_result_set_empty = true;
bool has_static_column_conditions = false;
for (const cas_row_update& op: _updates) {
if (op.statement.has_conditions() == false) {
continue;
}
if (op.statement.has_static_column_conditions()) {
has_static_column_conditions = true;
}
// If a statement has only static columns conditions, we must ignore its clustering columns
// restriction when choosing a row to check the conditions, i.e. choose any partition row,
// because any of them must have static columns and that's all we need to know if the
// statement applies. For example, the following update must successfully apply (effectively
// turn into INSERT), because, although the table doesn't have any regular rows matching the
// statement clustering column restriction, the static row matches the statement condition:
// CREATE TABLE t(p int, c int, s int static, v int, PRIMARY KEY(p, c));
// INSERT INTO t(p, s) VALUES(1, 1);
// UPDATE t SET v=1 WHERE p=1 AND c=1 IF s=1;
// Another case when we pass an empty clustering key prefix is apparently when the table
// doesn't have any clustering key columns and the clustering key range is empty (open
// ended on both sides).
const auto& ckey = !op.statement.has_only_static_column_conditions() && op.ranges.front().start() ?
op.ranges.front().start()->value() : empty_ckey;
const auto* row = _rows.find_row(pkey, ckey);
if (row) {
row->is_in_cas_result_set = true;
is_cas_result_set_empty = false;
}
if (!applies) {
// No need to check this condition as we have already failed a previous one.
// Continuing the loop just to set is_in_cas_result_set flag for all involved
// statements, which is necessary to build the CAS result set.
continue;
}
applies = op.statement.applies_to(row, op.options);
}
if (has_static_column_conditions && is_cas_result_set_empty) {
// If none of the fetched rows matches clustering key restrictions and hence none of them is
// included into the CAS result set, but there is a static column condition in the CAS batch,
// we must still include the static row into the result set. Consider the following example:
// CREATE TABLE t(p int, c int, s int static, v int, PRIMARY KEY(p, c));
// INSERT INTO t(p, s) VALUES(1, 1);
// DELETE v FROM t WHERE p=1 AND c=1 IF v=1 AND s=1;
// In this case the conditional DELETE must return [applied=False, v=null, s=1].
const auto* row = _rows.find_row(pkey, empty_ckey);
if (row) {
row->is_in_cas_result_set = true;
}
}
return applies;
}
std::optional<mutation> cas_request::apply(query::result& qr,
const query::partition_slice& slice, api::timestamp_type ts) {
_rows = update_parameters::build_prefetch_data(_schema, qr, slice);
if (applies_to()) {
return apply_updates(ts);
} else {
return {};
}
}
} // end of namespace "cql3::statements"

View File

@@ -1,105 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/*
* Copyright (C) 2019 ScyllaDB
*
* Modified by ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "service/storage_proxy.hh"
namespace cql3::statements {
using namespace std::chrono;
/**
* Due to some operation on lists, we can't generate the update that a given Modification statement does before
* we get the values read by the initial read of Paxos. A RowUpdate thus just store the relevant information
* (include the statement itself) to generate those updates. We'll have multiple RowUpdate for a Batch, otherwise
* we'll have only one.
*/
struct cas_row_update {
modification_statement& statement;
std::vector<query::clustering_range> ranges;
modification_statement::json_cache_opt json_cache;
// This statement query options. Different from cas_request::query_options,
// which may stand for BATCH statement, not individual modification_statement,
// in case of BATCH
const query_options& options;
};
/**
* Processed CAS conditions and update on potentially multiple rows of the same partition.
*/
class cas_request: public service::cas_request {
private:
std::vector<cas_row_update> _updates;
schema_ptr _schema;
// A single partition key. Represented as a vector of partition ranges
// since this is the conventional format for storage_proxy.
std::vector<dht::partition_range> _key;
update_parameters::prefetch_data _rows;
public:
cas_request(schema_ptr schema_arg, std::vector<dht::partition_range> key_arg)
: _schema(schema_arg)
, _key(std::move(key_arg))
, _rows(schema_arg)
{
assert(_key.size() == 1 && query::is_single_partition(_key.front()));
}
dht::partition_range_vector key() const {
return dht::partition_range_vector(_key);
}
const update_parameters::prefetch_data& rows() const {
return _rows;
}
lw_shared_ptr<query::read_command> read_command() const;
void add_row_update(modification_statement &stmt_arg, std::vector<query::clustering_range> ranges_arg,
modification_statement::json_cache_opt json_cache_arg, const query_options& options_arg);
virtual std::optional<mutation> apply(query::result& qr,
const query::partition_slice& slice, api::timestamp_type ts) override;
private:
bool applies_to() const;
std::optional<mutation> apply_updates(api::timestamp_type t) const;
};
} // end of namespace "cql3::statements"

View File

@@ -41,8 +41,6 @@
#include "cql3/statements/cf_prop_defs.hh"
#include "db/extensions.hh"
#include "cdc/cdc.hh"
#include "service/storage_service.hh"
#include <boost/algorithm/string/predicate.hpp>
@@ -70,8 +68,6 @@ const sstring cf_prop_defs::KW_CRC_CHECK_CHANCE = "crc_check_chance";
const sstring cf_prop_defs::KW_ID = "id";
const sstring cf_prop_defs::KW_CDC = "cdc";
const sstring cf_prop_defs::COMPACTION_STRATEGY_CLASS_KEY = "class";
const sstring cf_prop_defs::COMPACTION_ENABLED_KEY = "enabled";
@@ -88,7 +84,7 @@ void cf_prop_defs::validate(const db::extensions& exts) {
KW_GCGRACESECONDS, KW_CACHING, KW_DEFAULT_TIME_TO_LIVE,
KW_MIN_INDEX_INTERVAL, KW_MAX_INDEX_INTERVAL, KW_SPECULATIVE_RETRY,
KW_BF_FP_CHANCE, KW_MEMTABLE_FLUSH_PERIOD, KW_COMPACTION,
KW_COMPRESSION, KW_CRC_CHECK_CHANCE, KW_ID, KW_CDC
KW_COMPRESSION, KW_CRC_CHECK_CHANCE, KW_ID
});
static std::set<sstring> obsolete_keywords({
sstring("index_interval"),
@@ -127,12 +123,6 @@ void cf_prop_defs::validate(const db::extensions& exts) {
cp.validate();
}
auto cdc_options = get_cdc_options();
if (cdc_options && !cdc_options->empty()) {
// Constructor throws if options are not valid
cdc::options opts(*cdc_options);
}
validate_minimum_int(KW_DEFAULT_TIME_TO_LIVE, 0, DEFAULT_DEFAULT_TIME_TO_LIVE);
auto min_index_interval = get_int(KW_MIN_INDEX_INTERVAL, DEFAULT_MIN_INDEX_INTERVAL);
@@ -182,10 +172,6 @@ std::optional<utils::UUID> cf_prop_defs::get_id() const {
return std::nullopt;
}
std::optional<std::map<sstring, sstring>> cf_prop_defs::get_cdc_options() const {
return get_map(KW_CDC);
}
void cf_prop_defs::apply_to_builder(schema_builder& builder, const db::extensions& exts) {
if (has_property(KW_COMMENT)) {
builder.set_comment(get_string(KW_COMMENT, ""));
@@ -259,14 +245,6 @@ void cf_prop_defs::apply_to_builder(schema_builder& builder, const db::extension
if (compression_options) {
builder.set_compressor_params(compression_parameters(*compression_options));
}
auto cdc_options = get_cdc_options();
if (cdc_options) {
auto opts = cdc::options(*cdc_options);
if (opts.enabled() && !service::get_local_storage_service().cluster_supports_cdc()) {
throw exceptions::configuration_exception("CDC not supported by the cluster");
}
builder.set_cdc_options(std::move(opts));
}
#if 0
CachingOptions cachingOptions = getCachingOptions();
if (cachingOptions != null)

View File

@@ -78,8 +78,6 @@ public:
static const sstring KW_ID;
static const sstring KW_CDC;
static const sstring COMPACTION_STRATEGY_CLASS_KEY;
static const sstring COMPACTION_ENABLED_KEY;
@@ -93,7 +91,6 @@ public:
void validate(const db::extensions&);
std::map<sstring, sstring> get_compaction_options() const;
std::optional<std::map<sstring, sstring>> get_compression_options() const;
std::optional<std::map<sstring, sstring>> get_cdc_options() const;
#if 0
public CachingOptions getCachingOptions() throws SyntaxException, ConfigurationException
{

View File

@@ -151,17 +151,21 @@ create_index_statement::validate(service::storage_proxy& proxy, const service::c
target->as_string()));
}
if (cd->type->is_multi_cell()) {
// NOTICE(sarna): should be lifted after #2962 (indexes on non-frozen collections) is implemented
// NOTICE(kbraun): don't forget about non-frozen user defined types
throw exceptions::invalid_request_exception(
format("Cannot create secondary index on non-frozen collection or UDT column {}", cd->name_as_text()));
} else if (cd->type->is_collection()) {
bool is_map = dynamic_cast<const collection_type_impl *>(cd->type.get()) != nullptr
&& dynamic_cast<const collection_type_impl *>(cd->type.get())->is_map();
bool is_collection = cd->type->is_collection();
bool is_frozen_collection = is_collection && !cd->type->is_multi_cell();
if (is_frozen_collection) {
validate_for_frozen_collection(target);
} else if (is_collection) {
// NOTICE(sarna): should be lifted after #2962 (indexes on non-frozen collections) is implemented
throw exceptions::invalid_request_exception(
format("Cannot create secondary index on non-frozen collection column {}", cd->name_as_text()));
} else {
validate_not_full_index(target);
validate_is_values_index_if_target_column_not_collection(cd, target);
validate_target_column_is_map_if_index_involves_keys(cd->type->is_map(), target);
validate_target_column_is_map_if_index_involves_keys(is_map, target);
}
}

View File

@@ -110,8 +110,7 @@ void create_keyspace_statement::validate(service::storage_proxy&, const service:
future<shared_ptr<cql_transport::event::schema_change>> create_keyspace_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only)
{
return make_ready_future<>().then([this, is_local_only] {
const auto& tm = service::get_local_storage_service().get_token_metadata();
return service::get_local_migration_manager().announce_new_keyspace(_attrs->as_ks_metadata(_name, tm), is_local_only);
return service::get_local_migration_manager().announce_new_keyspace(_attrs->as_ks_metadata(_name), is_local_only);
}).then_wrapped([this] (auto&& f) {
try {
f.get();

View File

@@ -51,12 +51,10 @@
#include "auth/resource.hh"
#include "auth/service.hh"
#include "cdc/cdc.hh"
#include "schema_builder.hh"
#include "service/storage_service.hh"
#include "db/extensions.hh"
#include "database.hh"
#include "types/user.hh"
namespace cql3 {
@@ -98,52 +96,25 @@ std::vector<column_definition> create_table_statement::get_columns()
return column_defs;
}
template <typename CreateTable>
future<shared_ptr<cql_transport::event::schema_change>>
create_table_statement::create_table_with_cdc(service::storage_proxy& proxy,
schema_ptr schema,
CreateTable&& create_table) {
if (_if_not_exists) {
throw exceptions::invalid_request_exception(
"Can't create table with CDC support using IF NOT EXISTS");
}
cdc::db_context ctx = cdc::db_context::builder(proxy).build();
return cdc::setup(ctx, schema).then([ctx, create_table = std::move(create_table), schema = std::move(schema)] () mutable{
return create_table().handle_exception([ctx, schema = std::move(schema)](std::exception_ptr ep) mutable {
return cdc::remove(ctx, schema->ks_name(), schema->cf_name()).then([ep = std::move(ep)] () -> shared_ptr<cql_transport::event::schema_change> {
std::rethrow_exception(ep);
});
});
});
}
future<shared_ptr<cql_transport::event::schema_change>> create_table_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only) {
auto schema = get_cf_meta_data(proxy.get_db().local());
auto create_table = [this, is_local_only, schema] () mutable {
return make_ready_future<>().then([this, is_local_only, schema = std::move(schema)] {
return service::get_local_migration_manager().announce_new_column_family(std::move(schema), is_local_only);
}).then_wrapped([this] (auto&& f) {
try {
f.get();
using namespace cql_transport;
return make_shared<event::schema_change>(
event::schema_change::change_type::CREATED,
event::schema_change::target_type::TABLE,
this->keyspace(),
this->column_family());
} catch (const exceptions::already_exists_exception& e) {
if (_if_not_exists) {
return ::shared_ptr<cql_transport::event::schema_change>();
}
throw e;
return make_ready_future<>().then([this, is_local_only, &proxy] {
return service::get_local_migration_manager().announce_new_column_family(get_cf_meta_data(proxy.get_db().local()), is_local_only);
}).then_wrapped([this] (auto&& f) {
try {
f.get();
using namespace cql_transport;
return make_shared<event::schema_change>(
event::schema_change::change_type::CREATED,
event::schema_change::target_type::TABLE,
this->keyspace(),
this->column_family());
} catch (const exceptions::already_exists_exception& e) {
if (_if_not_exists) {
return ::shared_ptr<cql_transport::event::schema_change>();
}
});
};
bool cdc_enabled = _properties->get_cdc_options() && cdc::options(*_properties->get_cdc_options()).enabled();
return cdc_enabled
? create_table_with_cdc(
proxy, std::move(schema), std::move(create_table))
: create_table();
throw e;
}
});
}
/**
@@ -234,34 +205,18 @@ std::unique_ptr<prepared_statement> create_table_statement::raw_statement::prepa
auto stmt = ::make_shared<create_table_statement>(_cf_name, _properties.properties(), _if_not_exists, _static_columns, _properties.properties()->get_id());
std::optional<std::map<bytes, data_type>> defined_multi_cell_columns;
std::optional<std::map<bytes, data_type>> defined_multi_cell_collections;
for (auto&& entry : _definitions) {
::shared_ptr<column_identifier> id = entry.first;
cql3_type pt = entry.second->prepare(db, keyspace());
if (pt.is_counter() && !service::get_local_storage_service().cluster_supports_counters()) {
throw exceptions::invalid_request_exception("Counter support is not enabled");
}
if (pt.get_type()->is_multi_cell()) {
if (pt.get_type()->is_user_type()) {
// check for multi-cell types (non-frozen UDTs or collections) inside a non-frozen UDT
auto type = static_cast<const user_type_impl*>(pt.get_type().get());
for (auto&& inner: type->all_types()) {
if (inner->is_multi_cell()) {
// a nested non-frozen UDT should have already been rejected when defining the type
assert(inner->is_collection());
throw exceptions::invalid_request_exception("Non-frozen UDTs with nested non-frozen collections are not supported");
}
}
if (!service::get_local_storage_service().cluster_supports_nonfrozen_udts()) {
throw exceptions::invalid_request_exception("Non-frozen UDT support is not enabled");
}
if (pt.is_collection() && pt.get_type()->is_multi_cell()) {
if (!defined_multi_cell_collections) {
defined_multi_cell_collections = std::map<bytes, data_type>{};
}
if (!defined_multi_cell_columns) {
defined_multi_cell_columns = std::map<bytes, data_type>{};
}
defined_multi_cell_columns->emplace(id->name(), pt.get_type());
defined_multi_cell_collections->emplace(id->name(), pt.get_type());
}
stmt->_columns.emplace(id, pt.get_type()); // we'll remove what is not a column below
}
@@ -298,8 +253,8 @@ std::unique_ptr<prepared_statement> create_table_statement::raw_statement::prepa
if (stmt->_columns.empty()) {
throw exceptions::invalid_request_exception("No definition found that is not part of the PRIMARY KEY");
}
if (defined_multi_cell_columns) {
throw exceptions::invalid_request_exception("Non-frozen collections and UDTs are not supported with COMPACT STORAGE");
if (defined_multi_cell_collections) {
throw exceptions::invalid_request_exception("Non-frozen collection types are not supported with COMPACT STORAGE");
}
}
stmt->_clustering_key_types = std::vector<data_type>{};
@@ -307,8 +262,8 @@ std::unique_ptr<prepared_statement> create_table_statement::raw_statement::prepa
// If we use compact storage and have only one alias, it is a
// standard "dynamic" CF, otherwise it's a composite
if (_properties.use_compact_storage() && _column_aliases.size() == 1) {
if (defined_multi_cell_columns) {
throw exceptions::invalid_request_exception("Non-frozen collections and UDTs are not supported with COMPACT STORAGE");
if (defined_multi_cell_collections) {
throw exceptions::invalid_request_exception("Collection types are not supported with COMPACT STORAGE");
}
auto alias = _column_aliases[0];
if (_static_columns.count(alias) > 0) {
@@ -341,8 +296,8 @@ std::unique_ptr<prepared_statement> create_table_statement::raw_statement::prepa
}
if (_properties.use_compact_storage()) {
if (defined_multi_cell_columns) {
throw exceptions::invalid_request_exception("Non-frozen collections and UDTs are not supported with COMPACT STORAGE");
if (defined_multi_cell_collections) {
throw exceptions::invalid_request_exception("Collection types are not supported with COMPACT STORAGE");
}
stmt->_clustering_key_types = types;
} else {
@@ -432,12 +387,8 @@ data_type create_table_statement::raw_statement::get_type_and_remove(column_map_
throw exceptions::invalid_request_exception(format("Unknown definition {} referenced in PRIMARY KEY", t->text()));
}
auto type = it->second;
if (type->is_multi_cell()) {
if (type->is_collection()) {
throw exceptions::invalid_request_exception(format("Invalid non-frozen collection type for PRIMARY KEY component {}", t->text()));
} else {
throw exceptions::invalid_request_exception(format("Invalid non-frozen user-defined type for PRIMARY KEY component {}", t->text()));
}
if (type->is_collection() && type->is_multi_cell()) {
throw exceptions::invalid_request_exception(format("Invalid collection type for PRIMARY KEY component {}", t->text()));
}
columns.erase(t);

View File

@@ -119,11 +119,6 @@ private:
void apply_properties_to(schema_builder& builder, const database&);
void add_column_metadata_from_aliases(schema_builder& builder, std::vector<bytes> aliases, const std::vector<data_type>& types, column_kind kind);
template <typename CreateTable>
future<shared_ptr<cql_transport::event::schema_change>> create_table_with_cdc(service::storage_proxy& proxy,
schema_ptr,
CreateTable&&);
};
class create_table_statement::raw_statement : public raw::cf_statement {

View File

@@ -88,16 +88,9 @@ void create_type_statement::validate(service::storage_proxy& proxy, const servic
throw exceptions::invalid_request_exception(format("Cannot add type in unknown keyspace {}", keyspace()));
}
if (_column_types.size() > max_udt_fields) {
throw exceptions::invalid_request_exception(format("A user type cannot have more than {} fields", max_udt_fields));
}
for (auto&& type : _column_types) {
if (type->is_counter()) {
throw exceptions::invalid_request_exception("A user type cannot contain counters");
}
if (type->is_user_type() && !type->is_frozen()) {
throw exceptions::invalid_request_exception("A user type cannot contain non-frozen user type fields");
throw exceptions::invalid_request_exception(format("A user type cannot contain counters"));
}
}
}
@@ -133,9 +126,8 @@ inline user_type create_type_statement::create_type(database& db)
field_types.push_back(column_type->prepare(db, keyspace()).get_type());
}
// When a table is created with a UDT column, the column will be non-frozen (multi cell) by default.
return user_type_impl::get_instance(keyspace(), _name.get_user_type_name(),
std::move(field_names), std::move(field_types), true /* multi cell */);
std::move(field_names), std::move(field_types));
}
future<shared_ptr<cql_transport::event::schema_change>> create_type_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only)

Some files were not shown because too many files have changed in this diff Show More