The executor::add_stream_options() obtains local database reference from
proxy just to get feature service from it.
Similar chain is used in executor::update_time_to_live().
It's shorter to get features from proxy itself.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#26973
An Alternator user was recently "bit" when switching `alternator_enforce_authorization` from "false" to "true": ְְְAfter the configuration change, all application requests suddenly failed because unbeknownst to the user, their application used incorrect secret keys.
This series introduces a solution for users who want to **safely** switch `alternator_enforce_authorization` from "false" to "true": Before switching from "false" to "true", the user can temporarily switch a new option, `alternator_warn_authorization`, to true. In this "warn" mode, authentication and authorization errors are counted in metrics (`scylla_alternator_authentication_failures` and `scylla_alternator_authorization_failures`) and logged as WARNings, but the user's application continues to work. The user can use these metrics or log messages to learn of errors in their application's setup, fix them, and only do the switch of `alternator_enforce_authorization` when the metrics or log messages show there are no more errors.
The first patch is the implementation of the the feature - the new configuration option, the metrics and the log messages, the second patch is a test for the new feature, and the third patch is documentation recommending how to use the warn mode and the associated metrics or log messages to safely switch `alternaor_enforce_authorization` from false to true.
Fixes#25308
This is a feature that users need, so it should probably be backported to live branches.
Closesscylladb/scylladb#25457
* github.com:scylladb/scylladb:
docs/alternator: explain alternator_warn_authorization
test/alternator: tests for new auth failure metrics and log messages
alternator: add alternator_warn_authorization config
In commit a3ec6c7d1d we supposedly
implemented the feature of telling TTL experation events from regular
user-sent deletions. However, that implementation did not actually work
at all... It had two bugs:
1. It created an null rjson::value() instead of an empty dictionary
rjson::empty_object(), so GetRecords failed every time such a
TTL expiration event was generated.
2. In the output, it used lowercase field names "type" and "principalId"
instead of the uppercase "Type" and "PrincipalId". This is not the
correct capitalization, and when boto3 recieves such incorrect
fields it silently deletes them and never passes them to the user's
get_records() call.
This patch fixes those two bugs, and importantly - enables a test for
this feature. We did already have such a test but it was marked as
"veryslow" so doesn't run in CI and apparently not even run once to
check the new feature. This test is not actually very long on Alternator
when the TTL period is set very low (as we do in our tests), so I replaced
the "veryslow" marker by "waits_for_expiration". The latter marker means
that the test is still very slow - as much as half an hour - on DynamoDB -
but runs quickly on Scylla in our test setup, and enabled in CI by
default.
The enabled test failed badly before this patch (a server error during
GetRecords), and passes with this patch.
Also, the aforementioned commit forgot to remove the paragraph in
Alternator's compatibility.md that claims we don't have that feature yet.
So we do it now.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#26633
Before this patch, the configuration alternator_enforce_authorization
is a boolean: true means enforce authentication checks (i.e., each
request is signed by a valid user) and authorization checks (the user
who signed the request is allowed by RBAC to perform this request).
This patch adds a second boolean configuration option,
alternator_warn_authorization. When alternator_enforce_authorization
is false but alternator_warn_authorization is true, authentication and
authorization checks are performed as in enforce mode, but failures
are ignored and counted in two new metrics:
scylla_alternator_authentication_failures
scylla_alternator_authorization_failures
additionally,also each authentication or authorization error is logged as
a WARN-level log message. Some users prefer those log messages over
metrics, as the log messages contain additional information about the
failure that can be useful - such as the address of the misconfigured
client, or the username attempted in the request.
All combinations of the two configuration options are allowed:
* If just "enforce" is true, auth failures cause a request failure.
The failures are counted, but not logged.
* If both "enforce" and "warn" are true, auth failures cause a request
failure. The failures are both counted and logged.
* If just "warn" is true, auth failures are ignored (the request
is allowed to compelete) but are counted and logged.
* If neither "enforce" nor "warn" are true, no authentication or
authorization check are done at all. So we don't know about failures,
so naturally we don't count them and don't log them.
This patch is fairly straightforward, doing mainly the following
things:
1. Add an alternator_warn_authorization config parameter.
2. Make sure alternator_enforce_authorization is live-updatable (we'll
use this in a test in the next patch). It "almost" was, but a typo
prevented the live update from working properly.
3. Add the two new metrics, and increment them in every type of
authentication or authorization error.
Some code that needs to increment these new metrics didn't have
access to the "stats" object, so we had to pass it around more.
4. Add log messages when alternator_warn_authorization is true.
5. If alternator_enforce_authorization is false, allow the auth check
to allow the request to proceed (after having counted and/or logged
the auth error).
A separate patch will follow and add documentation suggesting to users
how to use the new "warn" options to safely switch between non-enforcing
to enforcing mode. Another patch will add tests for the new configuration
options, new metrics and new log messages.
Fixes#25308.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
UserIdentity is a map of two fields in GetRecords responses, which
always has the same value. It may be missing, or contain a constant
object with value `{"type": "Service", "principalId":
"dynamodb.amazonaws.com"}`. Currently, the latter is set only for
`REMOVE`s triggered by TTL.
This commit introduces two new CDC operation types: `service_row_delete`
and `service_partition_delete`, emitted in place of `row_delete` and
`partition_delete`. Alternator Streams treats them as regular `REMOVE`s,
but in addition adds the `userIdentity` field to the record.
This change may break existing Scylla libraries for reading raw CDC
tables, but we doubt that anybody has this use case.
Refs https://github.com/scylladb/scylladb/pull/26149
Refs https://github.com/scylladb/scylladb/pull/26121
Fixes https://github.com/scylladb/scylladb/issues/11523Closesscylladb/scylladb#26460
Tiny code cleanup to improve readability without changing behavior.
Changes:
- remove unused variables and imports,
- remove redundant whitespaces, and a duplicated `public:` access
specifier,
- use `is_aws` function to check if running in AWS
test/alternator/test_metrics.py,
- other trivial changes.
Closesscylladb/scylladb#26423
This commit adds missing fields to GetRecords responses: `awsRegion` and
`eventVersion`. We also considered changing `eventSource` from
`scylladb:alternator` to `aws:dynamodb` and setting `SizeBytes` subfield
inside the `dynamodb` field.
We set `awsRegion` to the datacenter's name of the node that received
the request. This is in line with the AWS documentation, except that
Scylla has no direct equivalent of a region, so we use the datacenter's
name, which is analogous to DynamoDB's concept of region.
The field `eventVersion` determines the structure of a Record. It is
updated whenever the structure changes. We think that adding a field
`userIdentity` bumped the version from `1.0` to `1.1`. Currently, Scylla
doesn't support this field (#11523), hence we use the older 1.0 version.
We have decided to leave `eventSource` as is, since it's easy to modify
it in case of problems to `aws:dynamodb` used by DynamoDB.
Not setting `SizeBytes` subfield inside the `dynamodb` field was
dictated by the lack of apparent use cases. The documentation is unclear
about how `SizeBytes` is calculated and after experimenting a little
bit, I haven't found an obvious pattern.
Fixes: #6931Closesscylladb/scylladb#24903
std::views::trasform()s should not have side effects since they could be
called several times, depending on the algorithm they're paired with.
For example, std::ranges::to() can run the algorithm once to measure
the resulting container size, and then a second time to copy the data
(avoiding reallocations). If that happens, then the side-effect happens
twice.
Avoid this be refactoring the code. Make the side-effect -- appending
to the `column` vector -- happen first, then use that result to generate
the `regular_column` vector.
In this case, the side effect did not happen twice because small_vector's
std::from_range_t constructor only reserves if the input range is sized
(and it is not), but better not have the weakness in the code.
Closesscylladb/scylladb#25011
This series fixes one cause of oversized allocations - and therefore potentially stalls and increased tail latencies - in Alternator.
The first patch in the series is the main fix - the later patches are cleanups requested by reviewers but also involved other pre-existing code, so I did those cleanups as separate patches.
Alternator's Scan or Query operation return a page of results. When the number of items is not limited by a "Limit" parameter, the default is to return a 1 MB page. If items are short, a large number of them can fit in that 1MB. The test test_query.py::test_query_large_page_small_rows has 30,000 items returned in a single page.
In the response JSON, all these items are returned in a single array "Items". Before this patch, we build the full response as a RapidJSON object before sending it. The problem is that unfortunately, RapidJSON stores arrays as contiguous allocations. This results in large contiguous allocations in workloads that scan many small items, and large contiguous allocations can also cause stalls and high tail latencies. For example, before this patch, running
test/alternator/run --runveryslow \
test_query.py::test_query_large_page_small_rows
reports in the log:
oversized allocation: 573440 bytes.
After this patch, this warning no longer appears.
The patch solves the problem by collecting the scanned items not in a RapidJSON array, but rather in a chunked_vector<rjson::value>, i.e, a chunked (non-contiguous) array of items (each a JSON value). After collecting this array separately from the response object, we need to print its content without actually inserting it into the object - we add a new function print_with_extra_array() to do that.
The new separate-chunked-vector technique is used when a large number (currently, >256) of items were scanned. When there is a smaller number of items in a page (this is typical when each item is longer), we just insert those items in the object and print it as before.
Beyond the original slow test that demonstrated the oversized allocation (which is now gone), this patch also includes a new test which exercises the new code with a scan of 700 (>256) items in a page - but this new test is fast enough to be permanently in our test suite and not a manual "veryslow" test as the other test.
Fixes#23535
The stalls caused by large allocations was seen by actual users, so it makes sense to backport this patch. On the other hand, the patch while not big is fairly intrusive (modifies the nomal Scan and Query path and also the later patches do some cleanup of additional code) so there is some small risk involved in the backport.
Closesscylladb/scylladb#24480
* github.com:scylladb/scylladb:
alternator: clean up by co-routinizing
alternator: avoid spamming the log when failing to write response
alternator: clean up and simplify request_return_type
alternator: avoid oversized allocation in Query/Scan
The previous patch introduced a function make_streamed_with_extra_array
which was a duplicate of the existing make_streamed. Reviewers
complained how baroque the new function is (just like the old function),
having to jump through hoops to return a copyable function working
on non-copyable objects, making strange-named copies and shared pointers
of everything.
We needed to return a copyable function (std::function) just because
Alternator used Seastar's json::json_return_type in the return type
from executor function (request_return_type). This json_return_type
contained either a sstring or an std::function, but neither was ever
really appropriate:
1. We want to return noncopyable_function, not an std::function!
2. We want to return an std::string (which rjson::print()) returns,
not an sstring!
So in this patch we stop using seastar::json::json_return_type
entirely in Alternator.
Alternator's request_return_type is now an std::variant of *three* types:
1. std::string for short responses,
2. noncopyable_function for long streamed response
3. api_error for errors.
The ugliest parts of make_streamed() where we made copies and shared
pointers to allow for a copyable function are all gone. Even nicer, a
lot of other ugly relics of using seastar::json_return_type are gone:
1. We no longer need obscure classes and functions like make_jsonable()
and json_string() to convert strings to response bodies - an operation
can simply return a string directly - usually returning
rjson::print(value) or a fixed string like "" and it just works.
2. There is no more usage of seastar::json in Alternator (except one
minor use of seastar::json::formatter::to_json in streams.cc that
can be removed later). Alternator uses RapidJSON for its JSON
needs, we don't need to use random pieces from a different JSON
library.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Currently, in Alternator it is possible to create a table whose name has
222 characters, and then trying to add Streams to that table results in
an attempt to create a CDC log table with the same name plus a
15-character suffix "_scylla_cdc_log", which resulted (Ref #24598) in
an IO-error and a Scylla shutdown.
This patch adds code to the Stream-adding operations (both CreateTable
and UpdateTable) that validates that the table's name, plus that 15
character suffix, doesn't exceed max_auxiliary_table_name_length, i.e.,
222.
After this patch, if you have a table whose name is between 207 and 222
characters, attempting to enable Streams on it will fail with:
"Streams cannot be added if the table name is longer than 207 characters."
Note that in the future, if we lower max_table_name_length to below 207,
e.g., to 192, then it will always be possible to add a stream to any
legal table, and the new checks we had here will be mostly redundant.
But only "mostly" - not entirely: Checking in UpdateTable is still
important because of the possibility that an upgrading user might have
a pre-existing table whose name is longer than the new limit, and might
try to enable Streams.
After this patch, the crash reported in #24598 can no longer happen, so
in this sense the bug is solved. However, we still want to lower
max_table_name_length from 222 to 192, so that it will always be
possible to enable streams on any table with a legal name length.
We'll do this in the next patch.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Alternator Streams' "GetRecords" operation has a "Limit" parameter on
how many records to return. The DynamoDB documentations says that the
upper limit on this Limit parameter is 1000 - but Alternator didn't
enforce this. In this patch we begin enforcing this highest Limit, and
also add a test for verifying this enforcement. As usual, the new test
passes on DynamoDB, and after this patch - also on Alternator.
The reason why it's useful to have *some* upper limit on Limit is that
the existing executor::get_records() implementation does not really have
preemption points in all the necessary places. In particular, we have a
loop on all returned records without preemption points. We also store
the returned records in a RapidJson vector, which requires a contiguous
allocation.
Even before this patch, GetRecords had a hard limit of 1 MB of results.
But still, in some cases 1 MB of results may be a lot of results, and we
can see stalls in the aforementioned places being O(number of results).
Fixes#23534
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#23547
Read and Write Consumed Capacity units are an abstract way of measuring Alternator actions. In general, they correspond to the read or write data.
In the long run, the RCU/WCU adds a way of charging an operation and limiting usage.
This series addresses two issues: consume capacity request API and metering.
The Alternator (and DynmoDB) API has an optional parameter allowing users to check the number of units an operation consumes. When a user adds that parameter, the response will contain the number of units used for the operation.
This series adds the consume capacity support to the get_item and put_item, adds a metric to collect the overall RCU and WCU used, and adds a test for the new functionality.
Follow-up PRs will add support for more operations and GSI.
Replaces #19811
Partially implement: #5027Closesscylladb/scylladb#21543
* github.com:scylladb/scylladb:
alternator/test_metrics: Add tests for table consumption units
test_returnconsumedcapacity.py: Add putItem tests
Alternator: add WCU support
Add test/alternator/test_returnconsumedcapacity.py
alternator/executor: Add consume capacity for get_item
alsternator/stats: Add rcu and wcu metrics to stats
alternator/executor.hh: white-space cleanup
Add the consume_capacity helper class
This patch adds functionality to track Read Capacity Units (RCU) for the
get_item operation. This enhancement allows for standardized measurement
of read operations, aligning with DynamoDB-like metrics.
Additionally, the RCU value can now be included in the response to
provide immediate feedback on the read capacity usage.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
It's somewhat common to ask for the partition key and clustering key
columns, or for the static and regular columsn. Provide accessors for them
rather than requiring the user to glue them.
Some callers are converted.
Closesscylladb/scylladb#21191
To reduce dependency load, use std ranges instead of boost ranges.
The std::ranges::{lower,upper}_bound don't support heterogeneous lookup,
but a more natural solution is to use a projection to search for the name,
so we use that and the custom comparator is removed.
Many callers are converted as well due to poor interoperability between
boost ranges and std ranges.
When the configuration has alternator_enforce_authorization=false,
Alternator should not do authentication (check which user signed each
request) nor authorization (check if that user has permissions to do
each operation).
Our implementation forgot to disable the authorization checks when
it's configured to false. The (incorrect) assumption was that when
alternator_enforce_authorization is configured to false, the CQL
'authenticator' and 'authorizer' configuration is also disabled -
so the authorization checks will be no-ops. But we can't assume
that: Users are free to configure 'authenticator' and 'authorizer'
for use in CQL, and then set alternator_enforce_authorization=false
just for Alternator.
So this patch adds a new test for this case - when we have
authenticator=PasswordAuthenticator, authorizer=CassandraAuthorizer
but alternator_enforce_authorization=false, and fixes it to work
correctly.
The heart of the fix is trivial: the `verify_*_permission()` functions
just need to check the alternator_enforce_authorization and return
immediately when false. The bigger part of this change is to get the
alternator_enforce_authorization into the "executor" object and then
to pass it into the verify calls.
Although alternator_enforce_authorization is not YET live updatable,
this code is prepared for the future that it may become live
updatable, so the executor object saves not the boolean value of
this flag, but a live-updatable reference to it.
Fixes#20619
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This patch address two requests made by reviewers of the original "Add
CQL-based RBAC support to Alternator" series. Both requests were about
the error messages produced when access is denied:
1. The error message is improved to use more proper English, and also
to include the name of the role which was denied access.
2. The permission-check and error-message-formatting code is
de-duplicated, using a common function verify_permission().
This de-duplication required moving the access-denied error path to
throwing an exception instead of the previous exception-free
implementation. However, it can be argued that this change is actually
a good thing, because it makes the successful case, when access is
allowed, faster.
The de-duplicated code is shorter and simpler, and allowed changing
the text of the error message in just one place.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#20326
This patch adds a requirement for the "SELECT" permission on a table to
run a GetRecords on it (the DynamoDB Streams API, i.e., CDC).
The grant is checked on the *CDC log table* - not on the base table,
which allows giving a role the ability to read the base but not is
change stream, or vice versa.
The operations ListStreams, DescribeStreams, GetShardIterators do not
require any permissions to run - they do not read any data, and are
(in my opinion) similar in spirit to DescribeTable, so I think it's fine
not to require any permissions for them.
A test is also added.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
since we've switched almost all callers of the operator<< to {fmt},
let's drop the unused operator<<:s.
the callers in alternator/streams.cc is updated to use `fmt::print()`
to format the `bytes` instances.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19448
before this change, when linking an executable referencing `marker`,
we could have following error:
```
13:58:02 ld.lld: error: undefined symbol: alternator::event_id::marker
13:58:02 >>> referenced by streams.cc
13:58:02 >>> build/dev/alternator/streams.o:(from_string_helper<rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>, alternator::event_id>::Set(rapidjson::GenericValue<rapidjson::UTF8<char>, rjson::internal::throwing_allocator>&, alternator::event_id, rjson::internal::throwing_allocator&))
13:58:02 clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
```
it turns out `event_id::marker` is only declared, but never defined.
please note, the non-inline static member variable in its class
definition is not considered as a definition, see
[class.static.data](https://eel.is/c++draft/class.static.data#3)
> The declaration of a non-inline static data member in its class
> definition is not a definition and may be of an incomplete type
> other than cv void.
so, let's declare it as a `constexpr` instead. it implies `inline`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19452
The CDC feature was made non-experimental in e9072542c1 (2020; 4.4)
and can now be assumed to be always present. We also remove the corresponding
schema_feature.
Shard-level latencies generate a lot of metrics. This patch reduces the
the number of latencies reported by Alternator while keeping the same
functionality.
On the shard level, summaries will be reported instead of histograms.
On the instance level, an aggregated histogram will be reported.
Summaries, histograms, and counters are marked with skip_when_empty.
Fixes#12230Closesscylladb/scylladb#17581
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
The DeleteTable operation in Alternator shoudl return a TableDescription
object describing the table which has just been deleted, similar to what
DescribeTable returns
Fixes scylladb#11472
Closes#11628
this change replaces all occurrences of `boost::lexical_cast<std::string>`
in the source tree with `fmt::to_string()`. for couple reasons:
* `boost::lexical_cast<std::string>` is longer than `fmt::to_string()`,
so the latter is easier to parse and read.
* `boost::lexical_cast<std::string>` creates a stringstream under the
hood, so it can use the `operator<<` to stringify the given object.
but stringstream is known to be less performant than fmtlib.
* we are migrating to fmtlib based formatting, see #13245. so
using `fmt::to_string()` helps us to remove yet another dependency
on `operator<<`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13611
We compare a signed variable to an unsigned one, which can
yield surprising results. In this case, it is harmless since
we already validated the signed input is positive, but
use std::cmp_less() to quench any doubts (and warnings).
we should assume that some included header does this for us.
we'd have following compiling failure if seastar's
src/http/request_parser.rl does not `using namespace httpd;` anymore.
```
/home/kefu/dev/scylladb/alternator/streams.cc:433:55: error: no matching literal operator for call to 'operator""h' with argument of type 'unsigned long long' or 'const char *', and no matching literal operator template
static constexpr auto dynamodb_streams_max_window = 24h;
^
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.
Closes#12858
Fixes#12601 (maybe?)
Sort the set of tables on ID. This should ensure we never
generate duplicates in a paged listing here. Can obviously miss things if they
are added between paged calls and end up with a "smaller" UUID/ARN, but that
is to be expected.
Main assumption here is that if is_big is good enough for
GetBatchItems operation it should work well also for Scan,
Query and GetRecords. And it's easier to maintain more unified
code.
Additionally 'future<> print' documentation used for streaming
suggests that there is quite big overhead so since it seems the
only motivation for streaming was to reduce contiguous allocation
size below some threshold we should not stream when this threshold
is not exceeded.
Closes#12164
Now that we don't accept cql protocol version 1 or 2, we can
drop cql_serialization format everywhere, except when in the IDL
(since it's part of the inter-node protocol).
A few functions had duplicate versions, one with and one without
a cql_serialization_format parameter. They are deduplicated.
Care is taken that `partition_slice`, which communicates
the cql_serialization_format across nodes, still presents
a valid cql_serialization_format to other nodes when
transmitting itself and rejects protocol 1 and 2 serialization\
format when receiving. The IDL is unchanged.
One test checking the 16-bit serialization format is removed.
Define table_id as a distinct utils::tagged_uuid modeled after raft
tagged_id, so it can be differentiated from other uuid-class types,
in particular from table_schema_version.
Fixes#11207
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Tags are a useful mechanism that could be used outside of alternator
namespace. My motivation to move tags_extension and other utilities to
db/tags/ was that I wanted to use them to mark "synchronous mode" views.
I have extracted `get_tags_of_table`, `find_tag` and `update_tags`
method to db/tags/utils.cc and moved alternator/tags_extension.hh to
db/tags/.
The signature of `get_tags_of_table` was changed from `const
std::map<sstring, sstring>&` to `const std::map<sstring, sstring>*`
Original behavior of this function was to throw an
`alternator::api_error` exception. This was undesirable, as it
introduced a dependency on the alternator module. I chose to change it
to return a potentially null value, and added a wrapper function to the
alternator module - `get_tags_of_table_or_throw` to keep the previous
throwing behavior.
Each feature has a private variable and a public accessor. Since the
accessor effectively makes the variable public, avoid the intermediary
and make the variable public directly.
To ease mechanical translation, the variable name is chosen as
the function name (without the cluster_supports_ prefix).
References throughout the codebase are adjusted.
This patch implements the previously-unimplemented Select option of the
Query and Scan operators.
The most interesting use case of this option is Select=COUNT which means
we should only count the items, without returning their actual content.
But there are actually four different Select settings: COUNT,
ALL_ATTRIBUTES, SPECIFIC_ATTRIBUTES, and ALL_PROJECTED_ATTRIBUTES.
Five previously-failing tests now pass, and their xfail mark is removed:
* test_query.py::test_query_select
* test_scan.py::test_scan_select
* test_query_filter.py::test_query_filter_and_select_count
* test_filter_expression.py::test_filter_expression_and_select_count
* test_gsi.py::test_gsi_query_select_1
These tests cover many different cases of successes and errors, including
combination of Select and other options. E.g., combining Select=COUNT
with filtering requires us to get the parts of the items needed for the
filtering function - even if we don't need to return them to the user
at the end.
Because we do not yet support GSI/LSI projection (issue #5036), the
support for ALL_PROJECTED_ATTRIBUTES is a bit simpler than it will need
to be in the future, but we can only finish that after #5036 is done.
Fixes#5058.
The most intrusive part of this patch is a change from attrs_to_get -
a map of top-level attributes that a read needs to fetch - to an
optional<attrs_to_get>. This change is needed because we also need
to support the case that we want to read no attributes (Select=COUNT),
and attrs_to_get.empty() used to mean that we want to read *all*
attributes, not no attributes. After this patch, an unset
optional<attrs_to_get> means read *all* attributes, a set but empty
attrs_to_get means read *no* attributes, and a set and non-empty
attrs_to_get means read those specific attributes.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220405113700.9768-2-nyh@scylladb.com>
In 4aa9e86924 ("Merge 'alternator: move uses of replica module to
data_dictionary' from Avi Kivity"), we changed alternator to use
data_dictionary instead of replica::database. However,
data_dictionary::database objects are different from replica::database
objects in that they don't have a stable address and need to be
captured by value (they are pointer-like). One capture in
describe_stream() was capturing a data_dictionary::database
by reference and so caused a use-after-free when the previous
continuation was deallocated.
Fix by capturing by value.
Fixes#9952.
Closes#9954
Alternator is a coordinator-side service and so should not access
the replica module. In this series all but one of uses of the replica
module are replaced with data_dictionary.
One case remains - accessing the replication map which is not
available (and should not be available) via the data dictionary.
The data_dictionary module is expanded with missing accessors.
Closes#9945
* github.com:scylladb/scylla:
alternator: switch to data_dictionary for table listing purposes
data_dictionary: add get_tables()
data_dictionary: introduce keyspace::is_internal()
As a coordinator-side service, alternator shouldn't touch the
replica module, so it is migrated here to data_dictionary.
One use case still remains that uses replica::keyspace - accessing
the replication map. This really isn't a replica-side thing, but it's
also not logically part of the data dictionary, so it's left using
replica::keyspace (using the data_dictionary::database::real_database()
escape hatch). Figuring out how to expose the replication map to
coordinator-side services is left for later.
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
The patch series moves the rest of internal ddl users to do schema
change over raft (if enabled). After that series only tests are left
using old API.
* 'gleb/raft-schema-rest-v6' of github.com:scylladb/scylla-dev: (33 commits)
migration_manager: drop no longer used functions
system_distributed_keyspace: move schema creation code to use raft
auth: move table creation code to use raft
auth: move keyspace creation code to use raft
table_helper: move schema creation code to use raft
cql3: make query_processor inherit from peering_sharded_service
table_helper: make setup_table() static
table_helper: co-routinize setup_keyspace()
redis: move schema creation code to go through raft
thrift: move system_update_column_family() to raft
thrift: authenticate a statement before verifying in system_update_column_family()
thrift: co-routinize system_update_column_family()
thrift: move system_update_keyspace() to raft
thrift: authenticate a statement before verifying in system_update_keyspace()
thrift: co-routinize system_update_keyspace()
thrift: move system_drop_keyspace() to raft
thrift: authenticate a statement before verifying in system_drop_keyspace()
thrift: co-routinize system_drop_keyspace()
thrift: move system_add_keyspace() to raft
thrift: co-routinize system_add_keyspace()
...