Said method passes down its `diff` input to `mutate_internal()`, after
some std::ranges massaging. Said massaging is destructive -- it moves
items from the diff. If the output range is iterated-over multiple
times, only the first time will see the actual output, further
iterations will get an empty range.
When trace-level logging is enabled, this is exactly what happens:
`mutate_internal()` iterates over the range multiple times, first to log
its content, then to pass it down the stack. This ends up resulting in
a range with moved-from elements being pased down and consequently write
handlers being created with nullopt mutations.
Make the range re-entrant by materializing it into a vector before
passing it to `mutate_internal()`.
Fixes: scylladb/scylladb#21907Fixes: scylladb/scylladb#21714Closesscylladb/scylladb#21910
Oversized materialized view and index names are rejected;
Materialized view names with invalid symbols are rejected.
fixes: #20755Closesscylladb/scylladb#21746
Replace boost::copy() with the standard library's std::ranges::copy()
to reduce external dependencies and simplify the codebase. This change
eliminates the requirement for boost::range and makes the implementation
more maintainable.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#22789
Set true to wait for the repair to complete. Set false to skip waiting
for the repair to complete. When the option is not provided, it defaults
to false.
It is useful for management tool that wants the api to be async.
Fixes#22418Closesscylladb/scylladb#22436
Developers are expected to run new cqlpy tests against Cassandra - to
verify that the new test itself is correct. Usually there is no need
to run the entire cqlpy test suite against Cassandra, but when users do
this, it isn't confidence-inspiring to see hundreds of tests failing.
In this patch I fix many but not all of these failures.
Refs #11690 (which will remain open until we fix all the failures on
Cassandra)
* Fixed the "compact_storage" fixture recently introduced to enable the
deprecated feature in Scylla for the tests. This fixture was broken on
Cassandra and caused all compact-storage related tests to fail
on Cassandra.
* Marked all tests in test_tombstone_limit.py as scylla_only - as they
check the Scylla-only query_tombstone_page_limit configuration option.
* Marked all tests in test_service_level_api.py as scylla_only - as they
check the Scylla-only service levels feature.
* Marked a test specific to the Scylla-only IncrementalCompactionStrategy
as scylla_only. Some tests mix STCS and ICS testing in one test - this
is a mistake and isn't fixed in this patch.
* Various tests in test_tablets.py forgot to use skip_without_tablets
to skip them on Cassandra or older Scylla that doesn't have the
tablets feature.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
x
Closesscylladb/scylladb#22561
Use host_id in a children list of a task in task manager to indicate
a node on which the child was created.
Move TASKS_CHILDREN_REQUEST to IDL. Send it by host_id.
Fixes: https://github.com/scylladb/scylladb/issues/22284.
Ip to host_id transition; backport isn't needed.
Closesscylladb/scylladb#22487
* github.com:scylladb/scylladb:
tasks: drop task_manager::config::broadcast_address as it's unused
tasks: replace ip with host_id in task_identity
api: task_manager: pass gossiper to api::set_task_manager
tasks: keep host_id in task_manager
tasks: move tasks_get_children to IDL
In this series we implement the UpdateTable operation to add a GSI to an existing table, or remove a GSI from a table. As the individual commit messages will explained, this required changing how Alternator stores materialized view keys - instead of insisting that these key must be real columns (that is **not** the case when adding a GSI to an existing table), the materialized view can now take as its key any Alternator attribute serialized inside the ":attrs" map holding all non-key attributes. Fixes#11567.
We also fix the IndexStatus and Backfilling attributes returned by DescribeTable - as DynamoDB API users use this API to discover when a newly added GSI completed its "backfilling" (what we call "view building") stage. Fixes#11471.
This series should not be backported lightly - it's a new feature and required fairly large and intrusive changes that can introduce bugs to use cases that don't even use Alternator or its UpdateTable operations - every user of CQL materialized views or secondary indexes, as well as Alternator GSI or LSI, will use modified code. **It should be backported to 2025.1**, though - this version was actually branched long after this PR was sent, and it provides a feature that was promised for 2025.1.
Closesscylladb/scylladb#21989
* github.com:scylladb/scylladb:
alternator: fix view build on oversized GSI key attribute
mv: clean up do_delete_old_entry
test/alternator: unflake test for IndexStatus
test/alternator: work around unrelated bug causing test flakiness
docs/alternator: adding a GSI is no longer an unimplemented feature
test/alternator: remove xfail from all tests for issue 11567
alternator: overhaul implementation of GSIs and support UpdateTable
mv: support regular_column_transformation key columns in view
alternator: add new materialized-view computed column for item in map
build: in cmake build, schema needs alternator
build: build tests with Alternator
alternator: add function serialized_value_if_type()
mv: introduce regular_column_transformation, a new type of computed column
alternator: add IndexStatus/Backfilling in DescribeTable
alternator: add "LimitExceededException" error type
docs/alternator: document two more unimplemented Alternator features
We are supposed to be loading the most recent RPC compression dictionary
on startup, but we forgot to port the relevant piece of logic during
the source-available port. This causes a restarted node not to use the
dictionary for RPC compression until the next dictionary update.
Fix that.
Fixesscylladb/scylladb#22738
This is more of a bugfix than an improvement, so it should be backported to 2025.1.
Closesscylladb/scylladb#22739
* github.com:scylladb/scylladb:
test_rpc_compression.py: test the dictionaries are loaded on startup
raft/group0_state_machine: load current RPC compression dict on startup
Before these changes, shutting down a node could be prolonged because of
mapreduce_service. `mapreduce_service::stop()` uninitializes messaging
service, which includes waiting for all ongoing RPC handlers. We already
had a mechanism for cancelling local mapreduce tasks, but we were missing
one for cancelling external queries.
In this commit, we modify the signature of the request so it supports
cancelling via an abort source. We also provide a reproducer test
for the problem.
Fixesscylladb/scylladb#22337Closesscylladb/scylladb#22651
The following is observed in pytest:
1) node1, stream master, tried to pull data from node3
2) node3, stream follower, found node1 restarted
3) node3 killed the rpc stream
4) node1 did not get the stream session failure message from node3. This
failure message was supposed to kill the stream plan on node1. That's the
reason node1 failed the stream session much later at "2024-08-19 21:07:45,539".
Note, node3 failed the stream on its side, so it should have sent the stream
session failure message.
```
$ cat node1.log |grep f890bea0-5e68-11ef-99ae-e5bca04385fc
INFO 2024-08-19 20:24:01,162 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Executing streaming plan for Tablet migration-ks-index-0 with peers={127.0.34.3}, master
ERROR 2024-08-19 20:24:01,190 [shard 1:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Failed to handle STREAM_MUTATION_FRAGMENTS (receive and distribute phase) for ks=ks, cf=cf, peer=127.0.34.3: seastar::nested_exception: seastar::rpc::stream_closed (rpc stream was closed by peer) (while cleaning up after seastar::rpc::stream_closed (rpc stream was closed by peer))
WARN 2024-08-19 21:07:45,539 [shard 0:main] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming plan for Tablet migration-ks-index-0 failed, peers={127.0.34.3}, tx=0 KiB, 0.00 KiB/s, rx=484 KiB, 0.18 KiB/s
$ cat node3.log |grep f890bea0-5e68-11ef-99ae-e5bca04385fc
INFO 2024-08-19 20:24:01,163 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Executing streaming plan for Tablet migration-ks-index-0 with peers=127.0.34.1, slave
INFO 2024-08-19 20:24:01,164 [shard 1:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Start sending ks=ks, cf=cf, estimated_partitions=2560, with new rpc streaming
WARN 2024-08-19 20:24:01,187 [shard 0: gms] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming plan for Tablet migration-ks-index-0 failed, peers={127.0.34.1}, tx=633 KiB, 26506.81 KiB/s, rx=0 KiB, 0.00 KiB/s
WARN 2024-08-19 20:24:01,188 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] stream_transfer_task: Fail to send to 127.0.34.1:0: seastar::rpc::stream_closed (rpc stream was closed by peer)
WARN 2024-08-19 20:24:01,189 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Failed to send: seastar::rpc::stream_closed (rpc stream was closed by peer)
WARN 2024-08-19 20:24:01,189 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming error occurred, peer=127.0.34.1
```
To be safe in case the stream fail message is not received, node1 could fail
the stream plan as soon as the rpc stream is aborted in the
stream_mutation_fragments handler.
Fixes#20227Closesscylladb/scylladb#21960
This PR converts boost load balancer tests in preparation for load balancer changes
which add per-table tablet hints. After those changes, load balancer consults with the replication
strategy in the database, so we need to create proper schema in the
database. To do that, we need proper topology for replication
strategies which use RF > 1, otherwise keyspace creation will fail.
Topology is created in tests via group0 commands, which is abstracted by
the new `topology_builder` class.
Tests cannot modify token_metadata only in memory now as it needs to be
consistent with the schema and on-disk metadata. That's why modifications to
tablet metadata are now made under group0 guard and save back metadata to disk.
Closesscylladb/scylladb#22648
* github.com:scylladb/scylladb:
test: tablets: Drop keyspace after do_test_load_balancing_merge_colocation() scenario
tests: tablets: Set initial tablets to 1 to exit growing mode
test: tablets_test: Create proper schema in load balancer tests
test: lib: Introduce topology_builder
test: cql_test_env: Expose topology_state_machine
topology_state_machine: Introduce lock transition
This patch removes expansion of "SELECT *" in DESC MATERIALIZED VIEW.
Instead of explicitly printing each column, DESC command will now just
use SELECT *, if view was created with it. Also, adds a correspodning test.
Fixes#21154Closesscylladb/scylladb#21962
In few test cases of test_view_build_status we create a view, wait for
it and then query the view_build_status table and expect it to have all
rows for each node and view.
But it may fail because it could happen that the wait_for_view query and
the following queries are done on different nodes, and some of the nodes
didn't apply all the table updates yet, so they have missing rows.
To fix it, we change the assert to work in the eventual consistency
sense, retrying until the number of rows is as expectd.
Fixesscylladb/scylladb#22644Closesscylladb/scylladb#22654
Add the possibility to run boost and unit tests with pytest
test.py should follow the next paradigm - the ability to run all test cases sequentially by ONE pytest command.
With this paradigm, to have the better performance, we can split this 1 command into 2,3,4,5,100,200... whatever we want
It's a new functionality that does not touch test.py way of executing the boost and unit tests.
It supports the main features of test.py way of execution: automatic discovery of modes, repeats.
There is an additional requirement to execute tests in parallel: pytest-xdist. To install it, execute `pip install pytest-xdist`
To run test with pytest execute `pytest test/boost`. To execute only one file, provide the path filename `pytest test/boost/aggregate_fcts_test.cc` since it's a normal path, autocompletion will work on the terminal. To provide a specific mode, use the next parameter `--mode dev`, if parameter will not be provided pytest will try to use `ninja mode_list` to find out the compiled modes.
Parallel execution controlled by pyest-xdist and the parameter `-n 12`.
The useful command to discover the tests in the file or directory is `pytest --collect-only -q --mode dev test/boost/aggregate_fcts_test.cc`. That will return all test functions in the file. To execute only one function from the test, you can invoke the output from the previous command, but suffix for mode should be skipped, for example output will be `test/boost/aggregate_fcts_test.cc::test_aggregate_avg.dev`, so to execute this specific test function, please use the next command `pytest --mode dev test/boost/aggregate_fcts_test.cc::test_aggregate_avg`
There is a parameter `--repeat` that used to repeat the test case several times in the same way as test.py did.
It's not possible to run both boost and unit tests directories with one command, so we need to provide explicitly which directory should be executed. Like this `pytest --mode dev test/unit` or `pytest --mode dev test/boost`
Fixes: https://github.com/scylladb/qa-tasks/issues/1775Closesscylladb/scylladb#21108
* github.com:scylladb/scylladb:
test.py: Add possibility to run ldap tests from pytest
test.py: Add the possibility to run unit tests from pytest
test.py: Add the possibility to run boost test from pytest
test.py: Add discovery for C++ tests for pytest
test.py: Modify s3 server mock
test.py: Add method to get environment variables from MinIO wrapper
test.py: Move get configured modes to common lib
This series extends the table schema with per-table tablet options.
The options are used as hints for initial tablet allocation on table creation and later for resize (split or merge) decisions,
when the table size changes.
* New feature, no backport required
Closesscylladb/scylladb#22090
* github.com:scylladb/scylladb:
tablets: resize_decision: get rid of initial_decision
tablet_allocator: consider tablet options for resize decision
tablet_allocator: load_balancer: table_size_desc: keep target_tablet_size as member
network_topology_strategy: allocate_tablets_for_new_table: consider tablet options
network_topology_strategy: calculate_initial_tablets_from_topology: precalculate shards per dc using for_each_token_owner
network_topology_strategy: calculate_initial_tablets_from_topology: set default rf to 0
cql3: data_dictionary: format keyspace_metadata: print "enabled":true when initial_tablets=0
cql3/create_keyspace_statement: add deprecation warning for initial tablets
test: cqlpy: test_tablets: add tests for per-table tablet options
schema: add per-table tablet options
feature_service: add TABLET_OPTIONS cluster schema feature
`set_notify_handler()` is called after a querier was inserted into the querier cache. It has two purposes: set a callback for eviction and set a TTL for the cache entry. This latter was not disabling the pre-existing timeout of the permit (if any) and this would lead to premature eviction of the cache entry if the timeout was shorter than TTL (which his typical).
Disable the timeout before setting the TTL to prevent premature eviction.
Fixes: https://github.com/scylladb/scylladb/issues/22629
Backport required to all active releases, they are all affected.
Closesscylladb/scylladb#22701
* github.com:scylladb/scylladb:
reader_concurrency_semaphore: set_notify_handler(): disable timeout
reader_permit: mark check_abort() as const
Add posibility to run ldap tests with pytest.
LDAP server will be created for each worker if xdist will be used.
For one thread one LDAP server will be used for all tests.
Add the possibility to run boost test from pytest.
Boost facade based on code from https://github.com/pytest-dev/pytest-cpp, but enhanced and rewritten to suite better.
Code based on https://github.com/pytest-dev/pytest-cpp. Updated, customized, enhanced to suit current needs.
Modify generate report to not modify the names, since it will break
xdist way of working. Instead modification will be done in post collect
but before executing the tests.
Add the possibility to return environment as a dict to use it later it subprocess created by xdist, without starting another s3 mock server for each thread.
Add method to retrieve MinIO server wrapper environment variables for
later processing.
This change will allow to sharing connection information with other
processes and allow reusing the server across multiple tests.
This scenario is invoked in a loop in the
test_load_balancing_merge_colocation_with_random_load test case, which
will cause accumulation of tablet maps making each reload slower in
subsequent iterations.
It wasn't a problem before because we overwritten tablet_metadata in
each iteration to contain only tablets for the current table, but now
we need to keep it consistent with the schema and don't do that.
After tablet hints, there is no notion of leaving growing mode and
tablet count is sustained continuously by initial tablet option, so we
need to lower it for merge to happen.
This is in preparation for load balancer changes needed to respect
per-table tablet hints and respecting per-shard tablet count
goal. After those changes, load balancer consults with the replication
strategy in the database, so we need to create proper schema in the
database. To do that, we need proper topology for replication
strategies which use RF > 1, otherwise keyspace creation will fail.
Will be used by load balancer tests which need more than a single-node
topology, and which want to create proper schema in the database which
depends on that topology, in particular creating keyspaces with
replication factor > 1.
We need to do that because load balancer will use replication strategy
from the database as part of plan making.
It's possible to modify 'memtable_flush_period_in_ms' option only and as
single option, not with any other options together
Refs #20999Fixes#21223Closesscylladb/scylladb#22536
set_notify_handler() is called after a querier was inserted into the
querier cache. It has two purposes: set a callback for eviction and set
a TTL for the cache entry. This latter was not disabling the
pre-existing timeout of the permit (if any) and this would lead to
premature eviction of the cache entry if the timeout was shorter than
TTL (which his typical).
Disable the timeout before setting the TTL to prevent premature
eviction.
Fixes: #scylladb/scylladb#22629
Before this change, it was ensured that a default superuser is created
before serving CQL. However, the mechanism didn't wait for default
password initialization, so effectively, for a short period, customer
couldn't authenticate as the superuser properily. The purpose of this
change is to improve the superuser initialization mechanism to wait for
superuser default password, just as for the superuser creation.
This change:
- Introduce authenticator::ensure_superuser_is_created() to allow
waiting for complete initialization of super user authentication
- Implement ensure_superuser_is_created in password_authenticator, so
waiting for superuser password initialization is possible
- Implement ensure_superuser_is_create in transitional_authenticator,
so the implementation from password_authenticator is used
- Implement no-op ensure_superuser_is_create for other authenticators
- Extend service::ensure_superuser_is_created to wait for superuser
initialization in authenticator, just as it was implemented earlier
for role_manager
- Add injected error (sleep) in password_authenticator::start to
reproduce a case of delayed password creation
- Implement test_delayed_deafult_password to verify the correctness of the fix
- Ensure superuser is created in single_node_cql_env::run_in_thread to
make single_node_cql more similar to scylla_main in main.cc
Fixesscylladb/scylladb#20566
Backport not needed - a minor bugfix
Closesscylladb/scylladb#22532
* github.com:scylladb/scylladb:
test: implement test_auth_password_ensured
test: implement connect_driver argument in ManagerClient::server_add
auth: ensure default superuser password is set before serving CQL
auth: added password_authenticator_start_pause injected error
This pull request is an implementation of vector data type similar to one used by Apache Cassandra.
The patch contains:
- implementation of vector_type_impl class
- necessary functionalities similar to other data types
- support for serialization and deserialization of vectors
- support for Lua and JSON format
- valid CQL syntax for `vector<>` type
- `type_parser` support for vectors
- expression adjustments such as:
- add `collection_constructor::style_type::vector`
- rename `collection_constructor::style_type::list` to `collection_constructor::style_type::list_or_vector`
- vector type encoding (for drivers)
- unit tests
- cassandra compatibility tests
- necessary documentation
Co-authored-by: @janpiotrlakomy
Fixes https://github.com/scylladb/scylladb/issues/19455Closesscylladb/scylladb#22488
* github.com:scylladb/scylladb:
docs: add vector type documentation
cassandra_tests: translate tests covering the vector type
type_codec: add vector type encoding
boost/expr_test: add vector expression tests
expression: adjust collection constructor list style
expression: add vector style type
test/boost: add vector type cql_env boost tests
test/boost: add vector type_parser tests
type_parser: support vector type
cql3: add vector type syntax
types: implement vector_type_impl
Do not merge tablets if that would drop the tablet_count
below the minimum provided by hints.
Split tablets if the current tablet_count is less than
the minimum tablet count calculated using the table's tablet options.
TODO: override min_tablet_count if the tablet count per shard
is greater than the maximum allowed. In this case
the tables tablet counts should be scaled down proportionally.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Before fix of scylladb#20566, CQL was served irrespectively of default
superuser password creation, which led to an incorrect product behavior
and sporadic test failures. This test verifies race condition of serving
CQL and creating default superuser password. Injected failure is used to
ensure CQL use is attempted before default superuser password creation,
however, the attempt is expected to fail because scylladb#20566 is
fixed. Following that, the injected error is notified, so CQL driver can
be started correctly. Finally, CREATE USER query is executed to confirm
successful superuser authentication.
This change:
- Implement test_auth_password_ensured.py
The test starts a server without expecting CQL serving, because
expected_server_up_state=ServerUpState.HOST_ID_QUERIED and
connect_driver=False. Error password_authenticator_start_pause is
injected to block superuser password setup during server startup.
Next, the test waits for a log to confirm that the code implementing
injected error is reached. When the server startup procedure is
unfinished, some operations might not complete on a first try, so
waiting for driver connection is wrapped in repeat_if_host_unavailable.
This commit introduces connect_driver argument in
ManagerClient::server_add. The argument allow skipping CQL driver
initialization part during server start. Starting a server without
the driver is necessary to implement some test scenarios related
to system initialization.
After stopping a server, ManagerClient::server_start can be used to
start the server again, so connect_driver argument is also added here to
allow preventing connecting the driver after a server restart.
This change:
- Implement connect_driver argument in ManagerClient::server_add
- Implement connect_driver argument in ManagerClient::server_start
Before this patch, the regular_column_transformation constructor, which
we used in Alternator GSIs to generates a view key from a regular-column
cell, accepted a cell of any size. As a reviewer (Avi) noticed, very
long cells are possible, well beyond what Scylla allows for keys (64KB),
and because regular_column_transformation stores such values in a
contiguous "bytes" object it can cause stalls.
But allowing oversized attributes creates an even more accute problem:
While view building (backfilling in DynamoDB jargon), if we encounter
an oversized (>64KB) key, the view building step will fail and the
entire view building will hang forever.
This patch fixes both problems by adding to regular_column_transformation's
constructor the check that if the cell is 64KB or larger, an empty value
is returned for the key. This causes the backfilling to silently skip
this item, which is what we expect to happen (backfilling cannot do
anything to fix or reject the pre-existing items in the best table).
A test test_gsi_updatetable.py::test_gsi_backfill_oversized_key is
introduced to reproduce this problem and its fix. The test adds a 65KB
attribute to a base table, and then adds GSIs to this table with this
attribute as its partition key or its sort key. Before this patch, the
backfilling process for the new GSIs hangs, and never completes.
After this patch, the backfilling completes and as expected contains
other base-table items but not the item with the oversized attribute.
The new test also passes on DynamoDB.
However, while implementing this fix I realized that issue #10347 also
exists for GSIs. Issue #10347 is about the fact that DynamoDB limits
partition key and sort key attributes to 2048 and 1024 bytes,
respectively. In the fix described above we only handled the accute case
of lengths above 64 KB, but we should actually skip items whose GSI
keys are over 2048 or 1024 bytes - not 64KB. This extra checking is
not handled in this patch, and is part of a wider existing issue:
Refs #10347
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The test for IndexStatus verifies that on a newly created table and GSI,
the IndexStatus is "ACTIVE". However, in Alternator, this doesn't strictly
need to happen *immediately* - view building, even for an empty table -
can take a short while in debug mode. This make the test test
test_gsi_describe_indexstatus flaky in debug mode.
The fix is to wait for the GSI to become active with wait_for_gsi()
before checking it is active. This is sort of silly and redundant,
but the important point that if the IndexStatus is incorrect this test
will fail, it doesn't really matter whether the wait_for_gsi() or
the DescribeTable assertion is what fails.
Now that wait_for_gsi() is used in two test files, this patch moves it
(and its friend, wait_for_gsi_gone()) to util.py.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The alternator test test_gsi_updatetable.py::test_gsi_delete_with_lsi
Creates a GSI together with a table, and then deletes it. We have a
bug unrelated to the purpose of this test - #9059 - that causes view
building to sometimes crash Scylla if the view is deleted while the
view build is starting. We see specifically in debug builds that even
view building of an *empty* table might not finish before the test
deletes the view - so this bug happens.
Work around that bug by waiting for the GSI to build after creating
the table with the GSI. This shouldn't be necessary (in DynamoDB,
a GSI created with the table always begins ready with the table),
but doesn't hurt either.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The previous patches fully implemented issue 11567 - supporting
UpdateTable to add or delet a GSI on an existing Alternator table.
All 14 tests that were marked xfail because of this issue now pass,
so this patch removes their xfail. There are no more xfailing tests
referring to this issue.
These 14 tests, most of them in test/alternator/test_gsi_updatetable.py,
cover all aspects of this feature, including adding a GSI, deleting a
GSI, interactions between GSI and LSI, RBAC when adding or deleting a GSI,
data type limitation on an attribute that becomes a GSI key or stops
being one, GSI backfill, DescribeTable and backfill, various error
conditions, and more.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This patch adds the missing IndexStatus and Backfilling fields for the
GSIs listed by a DescribeTable request. These fields allow an application
to check whether a GSI has been fully built (IndexStatus=ACTIVE) or
currently being built (IndexStatus=CREATING, Backfilling=true).
This feature is necessary when a GSI can be added to an existing table
so its backfilling might take time - and the application might want to
wait for it.
One test - test_gsi.py::test_gsi_describe_indexstatus - begins to pass
with this fix, so the xfail tag is removed from it.
Fixes#11471.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Before this change, it was possible to change non-liveupdatable config
parameter without process restart. This erroneous behavior not only
contradicts the documentation but is potentially dangerous, as various
components theoretically might not be prepared for a change of
configuration parameter value without a restart. The issue came from
a fact that liveupdatability verification check was skipped for default
configuration parameters (those without its initial values
in configuration file during process start).
This change:
- Introduce _initialization_completed member in config_file
- Set _initialization_completed=true when config file is processed on
server start
- Verify config_file's initialization status during config update - if
config_file was initialized, prevent from further changes of
non-liveupdatable parameters
- Implement ScyllaRESTAPIClient::get_config() that obtains a current
value of given configuration parameter via /v2/config REST API
- Implement test to confirm that only liveupdatable parameters are
changed when SIGHUP is sent after configuration file change
Function set_initialization_completed() is called only once in main.cc,
and the effect is expected to be visible in all shards, as a side effect
of cfg->broadcast_to_all_shards() that is called shortly after. The same
technique was already used for enable_3_1_0_compatibility_mode() call.
Fixesscylladb/scylladb#5382
No backport - minor fix.
Closesscylladb/scylladb#22655
* github.com:scylladb/scylladb:
test: SIGHUP doesn't change non-liveupdatable configuration
test: implement ScyllaRESTAPIClient::get_config()
config: prevent SIGHUP from changing non-liveupdatable parameters
config: remove unused set_value_on_all_shards(const YAML::Node&)
This update introduces four types of credential providers:
1. Environment variables
2. Configuration file
3. AWS STS
4. EC2 Metadata service
The first two providers should only be used for testing and local runs. **They must NEVER be used in production.**
The last two providers are intended for use on real EC2 instances:
- **AWS STS**: Preferred method for obtaining temporary credentials using IAM roles.
- **EC2 Metadata Service**: Should be used as a last resort.
Additionally, a simple credentials provider chain is created. It queries each provider sequentially until valid credentials are obtained. If all providers fail, it returns an empty result.
fixes: #21828Closesscylladb/scylladb#21830
* github.com:scylladb/scylladb:
docs: update the `object_storage.md` and `admin.rst`
aws creds: add STS and Instance Metadata service credentials providers
aws creds: add env. and file credentials providers
s3 creds: move credentials out of endpoint config
Use the keyspace initial_tablets for min_tablet_count, if the latter
isn't set, then take the maximum of the option-based tablet counts:
- min_tablet_count
- and expected_data_size_in_gb / target_tablet_size
- min_per_shard_tablet_count (via
calculate_initial_tablets_from_topology)
If none of the hints produce a positive tablet_count,
fall back to calculate_initial_tablets_from_topology * initial_scale.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Test specifying of per-table tablet options on table creation
and alter table.
Also, add a negative test for atempting to use tablet options
with vnodes (that should fail).
And add a basic test for testing tablet options also with
materialized views.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Unlike with vnodes, each tablet is served only by a single
shard, and it is associated with a memtable that, when
flushed, it creates sstables which token-range is confined
to the tablet owning them.
On one hand, this allows for far better agility and elasticity
since migration of tablets between nodes or shards does not
require rewriting most if not all of the sstables, as required
with vnodes (at the cleanup phase).
Having too few tablets might limit performance due not
being served by all shards or by imbalance between shards
caused by quantization. The number of tabelts per table has to be
a power of 2 with the current design, and when divided by the
number of shards, some shards will serve N tablets, while others
may serve N+1, and when N is small N+1/N may be significantly
larger than 1. For example, with N=1, some shards will serve
2 tablet replicas and some will serve only 1, causing an imbalance
of 100%.
Now, simply allocating a lot more tablets for each table may
theoretically address this problem, but practically:
a. Each tablet has memory overhead and having too many tablets
in the system with many tables and many tablets for each of them
may overwhelm the system's and cause out-of-memory errors.
b. Too-small tablets cause a proliferation of small sstables
that are less efficient to acces, have higher metadata overhead
(due to per-sstable overhead), and might exhaust the system's
open file-descriptors limitations.
The options introduced in this change can help the user tune
the system in two ways:
1. Sizing the table to prevent unnecessary tablet splits
and migrations. This can be done when the table is created,
or later on, using ALTER TABLE.
2. Controlling min_per_shard_tablet_count to improve
tablet balancing, for hot tables.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
For example, nodes which are being decommissioned should not be
consider as available capacity for new tables. We don't allocate
tablets on such nodes.
Would result in higher per-shard load then planned.
Closesscylladb/scylladb#22657
in order to reduce the external header dependency, let's switch to
the standardlized std::ranges::min_element().
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#22572