The following metrics will be marked with basic_level label:
scylla_alternator_operation
scylla_alternator_op_latency_bucket
scylla_alternator_op_latency_count
scylla_alternator_op_latency_sum
scylla_alternator_total_operations
scylla_alternator_batch_item_count
scylla_alternator_op_latency
scylla_alternator_op_latency_summary
scylla_expiration_items_deleted
All alternator metrics are marked with __alternator label.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
If user fails to supply the AttributeDefinitions parameter when creating
a table, Scylla used to fail on RAPIDJSON_ASSERT. Now it calls a polite
exception, which is fully in-line with what DynamoDB does.
The commit supplies also a new, relevant test routine.
Fixes#23043Closesscylladb/scylladb#23041
This commit eliminates unused boost header includes from the tree.
Removing these unnecessary includes reduces dependencies on the
external Boost.Adapters library, leading to faster compile times
and a slightly cleaner codebase.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#22997
Table updates that try to enable stream (while changing or not the
StreamViewType) on a table that already has the stream enabled
will result in ValidationError.
Table updates that try to disable stream on a table that does not
have the stream enabled will result in ValidationError.
Add two tests to verify the above.
Mark the test for changing the existing stream's StreamViewType
not to xfail.
Fixesscylladb/scylladb#6939Closesscylladb/scylladb#22827
The main goal of this patch is to fully support UpdateTable's ability
to add a GSI to an existing table, and delete a GSI from an existing
table. But to achieve this, this patch first needs to overhaul how GSIs
are implemented:
Remember that in Alternator's data model, key attributes in a table
are stored as real CQL columns (with a known type), but all other
attributes of an item are stored in one map called ":attrs".
* Before this patch, the GSI's key columns were made into real columns
in the table's schema, and the materialized view used that column as
the view's key.
* After this patch, the GSI's key columns usually (when they are not
the base table's keys, and not any LSI's key) are left in the ":attrs"
map, just like any other non-key column. We use a new type of computed
column (added in the previous patch) to extract the desired element from
this map.
This overhaul of the GSI implementation doesn't change anything in the
functionality of GSIs (and the Alternator test suite tries very hard to
ensure that), but finally allows us to add a GSI to an already-existing
table. This is now possible because the GSI will be able to pick up
existing data from inside the ":attrs" map where it is stored, instead
of requiring the data in the map to be moved to a stand-alone column as
the previous implementation needed.
So this patch also finally implements the UpdateTable operations
(Create and Delete) to add or delete a GSI on an existing table,
as this is now fairly straightfoward. For the process of "backfilling"
the existing data into the new GSI we don't need to do anything - this
is just the materialized-view "view building" process that already
exists.
Fixes#11567.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This patch adds a new computed column class for materialized views,
extract_from_attrs_column_computation
which is Alternator-specific and knows how to extract a value (of a
known type) from an attribute stored in Alternator's map-of-all-nonkey-
attributes ":attrs".
We'll use this new computed column in the next patch to reimplement GSI.
The new computed-column class is based on regular_column_transformation
introduced in the previous patch. It is not yet wired to anything:
The MV code cannot handle any regular_column_transformation yet, and
Alternator will not yet use it to create a GSI. We'll do those things
in the following patches.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This patch introduces a function serialized_value_if_type() which takes
a serialized value stored in the ":attrs" map, and converts it into a
serialized *CQL* type if it matches a particular type (S, B or N) - or
returns null the value has the wrong type.
We will use this function in the following patch for deserializing
values stored in the ":attrs" map to use them as a materialized view
key. If the value has the right type, it will be converted to the CQL
type and used as the key - but if it has the wrong type the key will
be null and it will not appear in the view. This is exactly how GSI
is supposed to behave.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This patch adds the missing IndexStatus and Backfilling fields for the
GSIs listed by a DescribeTable request. These fields allow an application
to check whether a GSI has been fully built (IndexStatus=ACTIVE) or
currently being built (IndexStatus=CREATING, Backfilling=true).
This feature is necessary when a GSI can be added to an existing table
so its backfilling might take time - and the application might want to
wait for it.
One test - test_gsi.py::test_gsi_describe_indexstatus - begins to pass
with this fix, so the xfail tag is removed from it.
Fixes#11471.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This patch adds to Alternator's api_error type yet another type of
error, api_error::limit_exceeded (error code "LimitExceededException").
DynamoDB returns this error code in certain situations where certain
low limits were exceeded, such as the case we'll need in a following
patch - an UpdateTable that tries to create more than one GSI at once.
The LimitExceededException error type should not be confused with
other similarly-named but different error messages like
ProvisionedThroughputExceededException or RequestLimitExceeded.
In general, we make an attempt to return the same error code that
DynamoDB returns for a given error.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
these unused includes were identifier by clang-include-cleaner. after
auditing these source files, all of the reports have been confirmed.
please note, because quite a few source files relied on
`utils/to_string.hh` to pull in the specialization of
`fmt::formatter<std::optional<T>>`, after removing
`#include <fmt/std.h>` from `utils/to_string.hh`, we have to
include `fmt/std.h` directly.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This series adds WCU support for the Alternator update item.
This motivation behind it, is to have a rough estimation of what a similar operation would have taken from WCU perspective if used with DynamoDB.
The calculation is done while minimal overhead is the prime objective, the results are values that is less or equal to what it would have been in DynamoDB
** New feature, no need to backport. **
Closesscylladb/scylladb#21999
* github.com:scylladb/scylladb:
alternator/test_returnconsumedcapacity.py: update item
alternator/executor.cc: Add WCU for update_item
Currently, `get_network_topology_options()` is using gossip data
and iterates over topology using IPs and not host IDs, which may
result in operating on inconsistent data.
This method's implemenations has been changed to instead use
`get_datacenters()`, which should always return consistent data.
Fixes: scylladb/scylladb#21490Closesscylladb/scylladb#21940
This patch adds WCU support for update_item. The way Alternator modifies
values means we don't always have the full item sizes. When there is a
read-before-write, the code in rmw_operation takes care of the object
size.
When updating a value without read-before-write, we will make a rough
estimation of the value's size. This is better than simply taking 1 (as
we do with delete) and is also more Alternator-like.
Clean up the unnecessary includes reported by the GitHub checks that are
polluting the PR diffs.
The "utils/assert.hh" report should be actually fixed by the #21739, but
as the usage of `SEASTAR_ASSERT()` is protected by the `SEASTAR_DEBUG`
check it makes sense to include the header conditionally as well.
Closesscylladb/scylladb#21817
Calculating the item length of WCU deleted Item depends on how the
operations was performed.
In a simple scenario it would be consider a 1 byte.
With an unsafe Read-Before-Write the item is return by get_perious_item
and with LWT the item is get from the apply method.
This patch changes the calls to describe_single_item in the last two
scenarios so that they would use the read item to determine the item
length.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Actions in rmw_operation can use describe_item to determine to get an
existing value (Read before Write scenario) on those cases the existing
item size can be bigger than the one we are storing (in the extreme
case, when deleting an object we only have its keys)
This modify the describe_item API so it would take a pointer to uint
instead of the consumed_capacity_counter so we can use it to get the old
value size and depends on that, determine the size that will be used for
the WCU calculation.
rmw operations needs to be able to modify consume_capacity total_bytes
directly.
Depends on the previous stored item the length on which the WCU will be
calculated can be different than the length of the operation.
This patch makes the total_bytes public so it will be possible to modify
it directly.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
This patch add visibility to the WCU metrics. It uses a label 'ops' to
split each of the operations that contribute to WCU into their
operations.
When summing over all ops value the result will be the same.
these unused includes are identified by clang-include-cleaner. after
auditing the source files, all of the reports have been confirmed.
please note, because `mutation/mutation.hh` does not include
`seastar/coroutine/maybe_yield.hh` anymore, and quite a few source
files were relying on this header to bring in the declaration of
`maybe_yield()`, we have to include this header in the places where
this symbol is used. the same applies to `seastar/core/when_all.hh`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This change improves dependency management by explicitly specifying
library linkage visibility in CMake targets.
Previously, some ScyllaDB targets used `target_link_libraries()`
without `PUBLIC` or `PRIVATE` keywords, which resulted in transitive
library dependencies by default. This unintentionally exposed
non-public dependencies to downstream targets.
Changes:
- Always use explicit `PRIVATE` or `PUBLIC` keywords with
`target_link_libraries()`
- Tighten build dependency tree
- Enforce a more modular linkage model
See: [CMake documentation on library dependencies](https://cmake.org/cmake/help/latest/command/target_link_libraries.html)
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21686
Modernize the codebase by replacing Boost range adaptors with C++23 standard library views,
reducing external dependencies and leveraging modern C++ language features.
Key Changes:
- Replace `boost::adaptors::filtered` with `std::views::filter`
- Remove `#include <boost/range/adaptor/filtered.hpp>`
- Utilize standard library range views
Motivation:
- Reduce project's external dependency footprint
- Leverage standard library's range and view capabilities
- Improve long-term code maintainability
- Align with modern C++ best practices
Implementation Challenges and Considerations:
1. Range Conversion and Move Semantics
- `std::ranges::to` adaptor requires rvalue references
- Necessitated updates to variable and parameter constness
- Example: `cql3/restrictions/statement_restrictions.cc` modified to remove `const`
from `common` to enable efficient range conversion
2. Range Iteration and Mutation
- Range views may mutate internal state during iteration
- Cannot pass ranges by const reference in some scenarios
- Solution: Pass ranges by rvalue reference to explicitly indicate
state invalidation
Limitations:
- One instance of `boost::adaptors::filtered` temporarily preserved
due to lack of a C++23 alternative for `boost::join()`
- A comprehensive replacement will be addressed in a follow-up change
This change is part of our ongoing effort to modernize the codebase,
reducing external dependencies and adopting modern C++ practices.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21648
Read and Write Consumed Capacity units are an abstract way of measuring Alternator actions. In general, they correspond to the read or write data.
In the long run, the RCU/WCU adds a way of charging an operation and limiting usage.
This series addresses two issues: consume capacity request API and metering.
The Alternator (and DynmoDB) API has an optional parameter allowing users to check the number of units an operation consumes. When a user adds that parameter, the response will contain the number of units used for the operation.
This series adds the consume capacity support to the get_item and put_item, adds a metric to collect the overall RCU and WCU used, and adds a test for the new functionality.
Follow-up PRs will add support for more operations and GSI.
Replaces #19811
Partially implement: #5027Closesscylladb/scylladb#21543
* github.com:scylladb/scylladb:
alternator/test_metrics: Add tests for table consumption units
test_returnconsumedcapacity.py: Add putItem tests
Alternator: add WCU support
Add test/alternator/test_returnconsumedcapacity.py
alternator/executor: Add consume capacity for get_item
alsternator/stats: Add rcu and wcu metrics to stats
alternator/executor.hh: white-space cleanup
Add the consume_capacity helper class
now that we are allowed to use C++23. we now have the luxury of using `std::ranges::find_if`.
in this change, we:
- replace `boost::find_if` with `std::ranges::find_if`
- remove all `#include <boost/range/algorithm/find_if.hpp>`
to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support.
this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible.
---
it's a cleanup, hence no need to backport.
Closesscylladb/scylladb#21495
* github.com:scylladb/scylladb:
treewide: replace boost::find_if with std::ranges::find_if
counters: replace boost::find_if with std::ranges::find_if
combine.hh: use std::iter_const_reference_t when appropriate
This patch adds functionality to track Write Capacity Units (WCU).
Currently for the put_item operation.
This enhancement allows for standardized measurement of write
operations, aligning with DynamoDB-like metrics.
Additionally, the WCU value is now optionally included in the response to provide
immediate feedback on the write capacity usage.
The implementation adds a consumed_capacity_counter member to
rmw_operation, this will allow to add WCU functionality to update_item
and delete_item
This patch adds functionality to track Read Capacity Units (RCU) for the
get_item operation. This enhancement allows for standardized measurement
of read operations, aligning with DynamoDB-like metrics.
Additionally, the RCU value can now be included in the response to
provide immediate feedback on the read capacity usage.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Introduced `rcu` (Read Capacity Units) and `wcu` (Write Capacity Units)
metrics to the `stats` object for enhanced capacity tracking.
`rcu` and `wcu` provide a simplified way of measuring reads and writes,
respectively, by representing capacity usage in standardized units.
This patch adds these metrics to the existing alternator stats, enabling
monitoring of the total consumed units.
Alternator API should support returning WCU and RCU when requested.
The consumed capacity helper class serves multiple purposes:
1. Break the logic of calculating the RCU and WCU from the main code.
2. Add a helper class consumed_capacity_counter that can accumulate bytes.
3. Optionally update counters for RCU and WCU that will be used by the
metric layer.
4. Update the response with the consumed units if needed.
The consumed_capacity_counter is a base class with two implementations:
A read and write implmenentation.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Alternator's "/localnodes" HTTP requests is supposed to return the list
of nodes in the local DC to which the user can send requests.
Before commit bac7c33313 we used the
gossiper is_alive() method to determine if a node should be returned.
That commit changed the check to is_normal() - because a node can be
alive but in non-normal (e.g., joining) state and not ready for
requests.
However, it turns out that checking is_normal() is not enough, because
if node is stopped abruptly, other nodes will still consider it "normal",
but down (this is so-called "DN" state). So we need to check **both**
is_alive() and is_normal().
This patch also adds a test reproducing this case, where a node is
shut down abruptly. Before this patch, the test failed ("/localnodes"
continued to return the dead node), and after it it passes.
Fixes#21538
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#21540
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::find_if`.
in this change, we:
- replace `boost::find_if` with `std::ranges::find_if`
- remove all `#include <boost/range/algorithm/find_if.hpp>`
to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.
this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Our "sstring_view" is an historic alias for the standard std::string_view.
Alternator only used this alias in a couple of random names, let's change
them to the standard type name.
Refs #4062.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
For historic reasons, we have (in bytes.hh) a type sstring_view which
is an alias for std::string_view - since the same standard type can hold
a pointer into both a seastar::sstring and std::string.
This alias in unnecessary and misleading to new developers (who might
assume it is somehow different from std::string_view). This patch doesn't
yet remove all occurances of sstring_view (the request in #4062), but
begins to do it by renaming one commonly-used function, to_sstring_view(bytes)
to to_string_view() and of course changes all its uses to the new name.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The later includes the former and in addition to `seastar::format()`,
`print.hh` also provides helpers like `seastar::fprint()` and
`seastar::print()`, which are deprecated and not used by scylladb.
Previously, we include `seastar/core/print.hh` for using
`seastar::format()`. and in seastar 5b04939e, we extracted
`seastar::format()` into `seastar/core/format.hh`. this allows us
to include a much smaller header.
In this change, we just include `seastar/core/format.hh` in place of
`seastar/core/print.hh`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21574
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::any_of`.
in this change, we replace `boost::algorithm::any_of` with
`std::ranges::any_of`
to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.
this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::all_of`.
in this change, we replace `boost::algorithm::all_of` with
`std::ranges::all_of`
to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.
this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This reverts commit c286434e4c, reversing
changes made to 6712fcc316.
The commit causes memtable_test to be very flaky in debug mode.
Specifically, subtests test_exceptions_in_flush_on_sstable_open
and test_exceptions_in_flush_on_sstable_write).
It's somewhat common to ask for the partition key and clustering key
columns, or for the static and regular columsn. Provide accessors for them
rather than requiring the user to glue them.
Some callers are converted.
Closesscylladb/scylladb#21191
the log.hh under the root of the tree was created keep the backward
compatibility when seastar was extracted into a separate library.
so log.hh should belong to `utils` directory, as it is based solely
on seastar, and can be used all subsystems.
in this change, we move log.hh into utils/log.hh to that it is more
modularized. and this also improves the readability, when one see
`#include "utils/log.hh"`, it is obvious that this source file
needs the logging system, instead of its own log facility -- please
note, we do have two other `log.hh` in the tree.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
now that we are allowed to use C++23. we now have the luxury of using
`std::views::keys`.
in this change, we:
- replace `boost::adaptors::map_keys` with `std::views::keys`
- update affected code to work with `std::views::keys`
to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.
this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21198
This includes way too much, including <boost/regex.hpp>, which is huge.
Drop includes of adaptors.hpp and replace by what is needed.
Closesscylladb/scylladb#21187
To reduce dependency load, use std ranges instead of boost ranges.
The std::ranges::{lower,upper}_bound don't support heterogeneous lookup,
but a more natural solution is to use a projection to search for the name,
so we use that and the custom comparator is removed.
Many callers are converted as well due to poor interoperability between
boost ranges and std ranges.
Before this patch, the "/localnodes" HTTP request to the Alternator server
lists all the live nodes of the current DC. This patch adds two optional
parameters to this query:
dc: allows to list the live nodes of a specific named DC instead of the
current DC of the server.
rack: allows to restrict the results to just the nodes belonging to a
specific named rack.
For both options, if no live node exists in the given dc or rack (in
particular, if such a dc or rack doesn't even exist), an empty list is
returned - it's not an error.
The default, if dc or rack is not specified - remains exactly as it is
today - look at the current DC (the one of the node being request), and
do not restrict the list to any specific rack.
We expect the new options that we added here to be useful for two use cases:
1. A client that knows of *some* Scylla node (belonging to an unknown DC),
but wants to list the nodes in *its* DC, which it knows by name.
2. A client in a multi-rack DC (e.g., multi-AZ region in AWS) that wants
to send requests to nodes in its own rack (which it knows by name),
to avoid cross-rack networking costs.
Note that in both cases, this requires clients to know the names of DCs
and AZs via some out-of-band means. The client can also get a list of DCs
and racks using the system.local system table, as the tests included in
this patch demonstrate.
This patch includes two set of tests for these new options: One in the the
single-node test/alternator framework that has a single dc and rack but
can still check the case of an unknown dc or rack (in which case an empty
list is returned). The second test is in the topology framework, and runs
an 8-node cluster with two DCs, two racks, and two nodes in each, and
checks all the combinations of "/localnodes" requests with and without
dc and rack options. This test also resolves a longstanding TODO that
asked for such a multi-DC test for "/localnodes" to be written.
Fixes#12147
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#20915