Commit Graph

1062 Commits

Author SHA1 Message Date
Pavel Emelyanov
7c580b4bd4 Merge 'sstable: switch to uuid identifier for naming S3 sstable objects' from Kefu Chai
before this change, we create a new UUID for a new sstable managed by the s3_storage, and we use the string representation of UUID defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for naming the objects stored on s3_storage. but this representation is not what we are using for storing sstables on local filesystem when the option of "uuid_sstable_identifiers_enabled" is enabled. instead, we are using a base36-based representation which is shorter.

to be consistent with the naming of the sstables created for local filesystem, and more importantly, to simplify the interaction between the local copy of sstables and those stored on object storage, we should use the same string representation of the sstable identifier.

so, in this change:

1. instead of creating a new UUID, just reuse the generation of the sstable for the object's key.
2. do not store the uuid in the sstable_registry system table. As we already have the generation of the sstable for the same purpose.
3. switch the sstable identifier representation from the one defined by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the base36-based one (implemented by fmt::formatter<sstables::generation_type>)

Fixes #14175
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#14406

* github.com:scylladb/scylladb:
  sstable: remove _remote_prefix from s3_storage
  sstable: switch to uuid identifier for naming S3 sstable objects
2023-10-23 21:05:13 +03:00
Aleksandra Martyniuk
0c6a3f568a compaction: delete default_compaction_progress_monitor
default_compaction_progress_monitor returns a reference to a static
object. So, it should be read-only, but its users need to modify it.

Delete default_compaction_progress_monitor and use one's own
compaction_progress_monitor instance where it's needed.

Closes scylladb/scylladb#15800
2023-10-23 16:03:34 +03:00
Kefu Chai
af8bc8ba63 sstable: switch to uuid identifier for naming S3 sstable objects
before this change, we create a new UUID for a new sstable managed
by the s3_storage, and we use the string representation of UUID
defined by RFC4122 like "0aa490de-7a85-46e2-8f90-38b8f496d53b" for
naming the objects stored on s3_storage. but this representation is
not what we are using for storing sstables on local filesystem when
the option of "uuid_sstable_identifiers_enabled" is enabled. instead,
we are using a base36-based representation which is shorter.

to be consistent with the naming of the sstables created for local
filesystem, and more importantly, to simplify the interaction between
the local copy of sstables and those stored on object storage, we should
use the same string representation of the sstable identifier.

so, in this change:

1. instead of creating a new UUID, just reuse the generation of the
   sstable for the object's key.
2. do not store the uuid in the sstable_registry system table. As
   we already have the generation of the sstable for the same purpose.
3. switch the sstable identifier representation from the one defined
   by the RFC4122 (implemented by fmt::formatter<utils::UUID>) to the
   base36-based one (implemented by
   fmt::formatter<sstables::generation_type>)
4. enable the `uuid_sstable_identifers` cluster feature if it is
   enabled in the `test_env_config`, so that it the sstable manager
   can enable the uuid-based uuid when creating a new uuid for
   sstable.
5. throw if the generation of sstable is not UUID-based when
   accessing / manipulating an sstable with S3 storage backend. as
   the S3 storage backend now relies on this option. as, otherwise
   we'd have sstables with key like s3://bucket/number/basename, which
   is just unable to serve as a unique id for sstable if the bucket is
   shared across multiple tables.

Fixes #14175
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-10-23 10:08:22 +08:00
Botond Dénes
ffefa623f4 test/lib/memtable_snapshot_source: disable critical alloc section while waiting
memtable_snapshot_source starts a background fiber in its constructor,
which compacts LSA memory in a loop. The loop's inside is covered with a
critical alloc section. It also contains a wait on a condition variable
and in its present form the critical section also covers the wait,
effectively turning off allocation failure injection for any test using
the memtable_snapshot_source.
This patch disables the critical alloc section while the loop waits on
the condition variable.
2023-10-20 03:16:57 -04:00
Kefu Chai
a6e68d8309 build: cmake: move message/* into message/CMakeLists.txt
messaging_service.cc depends on idl, but many source files in
scylla-main do no depend on idl, so let's

* move "message/*" into its own directory and add an inter-library
  dependency between it and the "idl" library.
* rename the target of "message" under test/manual to "message_test"
  to avoid the name collision

this should address the compilation failure of
```
FAILED: CMakeFiles/scylla-main.dir/message/messaging_service.cc.o
/usr/bin/clang++ -DBOOST_NO_CXX98_FUNCTION_BASE -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_BROKEN_SOURCE_LOCATION -DSEASTAR_DEBUG -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/cmake/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/cmake/seastar/gen/include -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-mismatched-tags -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-ignored-qualifiers -march=westmere  -Og -g -gz -std=gnu++20 -fvisibility=hidden -U_FORTIFY_SOURCE -Wno-error=unused-result "-Wno-error=#warnings" -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT CMakeFiles/scylla-main.dir/message/messaging_service.cc.o -MF CMakeFiles/scylla-main.dir/message/messaging_service.cc.o.d -o CMakeFiles/scylla-main.dir/message/messaging_service.cc.o -c /home/kefu/dev/scylladb/message/messaging_service.cc
/home/kefu/dev/scylladb/message/messaging_service.cc:81:10: fatal error: 'idl/join_node.dist.hh' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~
```
where the compiler failed to find the included `idl/join_node.dist.hh`,
which is exposed by the idl library as part of its public interface.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15657
2023-10-19 13:33:29 +03:00
Pavel Emelyanov
ec94cc9538 Merge 'test: set use_uuid to true by default in sstables::test_env ' from Kefu Chai
this series

1. let sstable tests using test_env to use uuid-based sstable identifiers by default
2. let the test who requires integer-based identifier keep using it

this should enable us to perform the s3 related test after enforcing the uuid-based identifier for s3 backend, otherwise the s3 related test would fail as it also utilize `test_env`.

Closes scylladb/scylladb#14553

* github.com:scylladb/scylladb:
  test: set use_uuid to true by default in sstables::test_env
  test: enable test to set uuid_sstable_identifiers
2023-10-19 09:09:38 +03:00
Calle Wilund
3378c246f7 main/cql_test_env: Augment compaction mgr tombstone_gc_state with CL GC info
Fixes #14870 (yet another alternative solution)

(Originally suggested by @avikivity). Use store GC clock min positions from CL
to narrow compaction GC bounds.

Note: not optimized with caches or anything at this point. Can easily be added
though of course always somewhat risky.
2023-10-17 10:30:40 +00:00
Raphael S. Carvalho
4e6fe34501 tests: Synchronize boost logger for multithreaded tests in sstable_directory_test
The logger is not thread safe, so a multithreaded test can concurrently
write into the log, yielding unreadable XMLs.

Example:
boost/sstable_directory_test: failed to parse XML output '/scylladir/testlog/x86_64/release/xml/boost.sstable_directory_test.sstable_directory_shared_sstables_reshard_correctly.3.xunit.xml': not well-formed (invalid token): line 1, column 1351

The critical (today's unprotected) section is in boost/test/utils/xml_printer.hpp:
```
inline std::ostream&
operator<<( custom_printer<cdata> const& p, const_string value )
{
    *p << BOOST_TEST_L( "<![CDATA[" );
    print_escaped_cdata( *p, value );
    return  *p << BOOST_TEST_L( "]]>" );
}
```

The problem is not restricted to xml, but the unreadable xml file caused
the test to fail when trying to parse it, to present a summary.

New thread-safe variants of BOOST_REQUIRE and BOOST_REQUIRE_EQUAL are
introduced to help multithreaded tests. We'll start patching tests of
sstable_directory_test that will call BOOST_REQUIRE* from multiple
threads. Later, we can expand its usage to other tests.

Fixes #15654.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#15655
2023-10-08 15:57:08 +03:00
Kefu Chai
1efd0d9a92 test: set use_uuid to true by default in sstables::test_env
for better coverage of uuid-based sstable identifier. since this
option is enabled by default, this also match our tests with the
default behavior of scylladb.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-10-07 18:56:47 +08:00
Kefu Chai
50c8619ed9 test: enable test to set uuid_sstable_identifiers
some of the tests are still relying on the integer-based sstable
identifier, so let's add a method to test_env, so that the tests
relying on this can opt-out. we will change the default setting
of sstables::test_env to use uuid-base sstable identifier in the
next commit. this change does not change the existing behavior.
it just adds a new knob to test_env_config. and let the tests
relying on this to customize the test_env_config to disable
use_uuid.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-10-07 18:56:47 +08:00
Nadav Har'El
9dea20539d Merge 'Sanitize forward-service shutdown' from Pavel Emelyanov
There's a dedicated forward_service::shutdown() method that's defer-scheduled in main for very early invocation. That's not nice, the fwd service start-shutdown-stop sequence can be made "canonical" by moving the shutting down code into abort source subscription. Similar thing was done for view updates generator in 3b95f4f107

refs: #2737
refs: #4384

Closes scylladb/scylladb#15545

* github.com:scylladb/scylladb:
  forward_service: Remove .shutdown() method
  forward_service: Set _shutdown in abort-source subscription
  forward_service: Add abort_source to constructor
2023-09-26 18:36:52 +03:00
Pavel Emelyanov
b18c54f56c forward_service: Add abort_source to constructor
It will be used by the next patch

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-26 10:38:26 +03:00
Raphael S. Carvalho
8997fe0625 compaction: Switch to strategy_control::candidates() for regular compaction
Now everything is prepared for the switch, let's do it.

Now let's wait for ICS to enjoy the set of changes.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:21 -03:00
Raphael S. Carvalho
0fe2630d70 sstables: Make all_sstable_runs() more efficient by exposing frozen shared runs
Users of all_sstable_runs() don't want to mutate the runs, but rather
work with their content. So let's avoid copy and make the intention
explicit with the new frozen_sstable_run used as return type
for the interface.

This will guarantee that ICS will be able to fetch uncompacting
runs efficiently.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:20 -03:00
Raphael S. Carvalho
9f6c3369d2 sstables: Simplify sstable_set interface to retrieve runs
This interface selects all runs that store at least one of the
sstables in the vector.

But that's very fragile, to the point that even ICS had to
stop using it. A better interface is to return all runs
managed by the set and allow compaction manager to do its
filtering.

We want to use it in ICS to avoid the overhead of rebuilding
sstable runs which may be expensive as sorting is performed
to guarantee the disjoint invariant.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:04:20 -03:00
Kefu Chai
ac3406e537 utils/s3/creds: rename aws_config member variables
- s/key/access_key_id/
- s/secret/secret_access_key/
- s/token/session_token/

so they are more aligned with the AWS document.
for instance, in
https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#ConstructingTheAuthenticationHeader
AWSAccessKeyId is used in the "Authorization" header.

this would help with the readability and maintainability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-23 14:28:07 +08:00
Avi Kivity
1da6a939fe Merge 'Track memory usage of S3 object uploads' from Pavel Emelyanov
The S3 uploading sink needs to collect buffers internally before sending them out, because the minimal upload-able part size is 5Mb. When the necessary amount of bytes is accumulated, the part uploading fibers starts in the background. On flush the sink waits for all the fibers to complete and handles failure of any.

Uploading parallelism is nowadays limited by the means of the http client max-connections parameter. However, when a part uploading fibers waits for it connection it keeps the 5Mb+ buffers on the request's body, so even though the number of uploading parts is limited, the number of _waiting_ parts is effectively not.

This PR adds a shard-wide limiter on the number of background buffers S3 clients (and theirs http clients) may use.

Closes scylladb/scylladb#15497

* github.com:scylladb/scylladb:
  s3::client: Track memory in client uploads
  code: Configure s3 clients' memory usage
  s3::client: Construct client with shared semaphore
  sstables::storage_manager: Introduce config
2023-09-21 18:24:42 +03:00
Botond Dénes
3b95f4f107 Merge 'Sanitize view-update-generator start-stop sequence' from Pavel Emelyanov
The v.u.g. start stop is now spread over main() code heavily.

1. sharded<v.u.g.>.start() happens early enough to allow depending services register staging sstables on it
2. after the system is "more-or-less" alive the invoke_on_all(v.u.g.::start()) is called (conditionally) to activate the generator background fiber. Not 100% sure why it happens _that_ late, but somehow it's required that while scylla is joining the cluster the generation doesn't happen
3. early on stop the v.u.g. is fully stopped

The 3rd step is pretty nasty. It may happen that v.u.g. is not stopped if scylla start aborts before the last action is defer-scheduled. Also, when it happens, it leaves stopping dependencies with non-initialized v.u.g.'s local instances, which is not symmetrical to how they start.

Said that, this PR fixes the stopping sequence to happen later, i.e. -- being defer-scheduled right after sharded<v.u.g.> is started. Also it makes sure that terminating the background fiber happens as early as it is now. This is done the compaction_manager-style -- the v.u.g. subscribes on stop signal abort source and kicks the fiber to stop when it fires.

Closes scylladb/scylladb#15466

* github.com:scylladb/scylladb:
  view_update_generator: Stop for real later
  view_update_generator: Add logging to do_abort()
  view_update_generator: Move abort kicking to do_abort()
  view_update_generator: Add early abort subscription
2023-09-21 17:01:27 +03:00
Pavel Emelyanov
e34220ebb7 view_update_generator: Add early abort subscription
Subscribe v.u.g. to the main's stop_signal. For now a no-op callback.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-21 13:32:45 +03:00
Kefu Chai
c364efb998 utils/s3: auth using AWS_SESSION_TOKEN
when accessing AWS resources, uses are allowed to long-term security
credentials, they can also the temporary credentials. but if the latter
are used, we have to pass a session token along with the keys.
see also https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html
so, if we want to programatically get authenticated, we need to
set the "x-amz-security-token" header,
see
https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#UsingTemporarySecurityCredentials

so, in this change, we

1. add another member named `token` in `s3::endpoint_config::aws_config`
   for storing "AWS_SESSION_TOKEN".
2. populate the setting from "object_storage.yaml" and
  "$AWS_SESSION_TOKEN" environment variable.
3. set "x-amz-security-token" header if
   `s3::endpoint_config::aws_config::token` is not empty.

this should allow us to test s3 client and s3 object store backend
with S3 bucket, with the temporary credentials.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15486
2023-09-21 13:26:11 +03:00
Pavel Emelyanov
f40b4e3e84 sstables::storage_manager: Introduce config
Just an empty config that's fed to storage_manager when constructed as a
preparation for further heavier patching

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-20 17:42:59 +03:00
Botond Dénes
45dfce6632 Merge 'compaction: change behaviour of compaction task executors' from Aleksandra Martyniuk
Compaction tasks executors serve two different purposes - as compaction
manager related entity they execute compaction operation and as task
manager related entity they track compaction status.

When one role depends on the other, as it currently is for
compaction_task_impl::done() and compaction_task_executor::compaction_done(),
requirements of both roles need to be satisfied at the same time in each
corner case. Such complexity leads to bugs.

To prevent it, compaction_task_impl::done() of executors no longer depends
on compaction_task_executor::compaction_done().

Fixes: #14912.

Closes scylladb/scylladb#15140

* github.com:scylladb/scylladb:
  compaction: warn about compaction_done()
  compaction: do not run stopped compaction
  compaction: modify lowest compaction tasks' run method
  compaction: pass do_throw_if_stopping to compaction_task_executor
2023-09-19 15:15:14 +03:00
Petr Gusev
ce0ee32d5a database.cc: make _uses_schema_commitlog optional
This field on the null shard is properly initialized
in maybe_init_schema_commitlog function, until then
we can't make decisions based on its value. This problem
can happen e.g. if add_column_family function is called
with readonly=false before maybe_init_schema_commitlog.
It will call commitlog_for to pass the commitlog to
mark_ready_for_writes and commitlog_for reads _uses_schema_commitlog.

In this commit we add protection against this case - we
trigger internal_error if _uses_schema_commitlog is read
before it is initialized.

maybe_init_schema_commitlog() was added to cql_test_env
to make boost tests work with the new invariant.
2023-09-13 23:17:20 +04:00
Petr Gusev
beb29f094b system_keyspace: drop load phases
We want to switch system.scylla_local table to the
schema commitlog, but load phases hamper here - schema
commitlog is initialized after phase1,
so a table which is using it should be moved to phase2,
but system.scylla_local contains features, and we need
them before  schema commitlog initialization for
SCHEMA_COMMITLOG feature.

In this commit we are taking a different approach to
loading system tables. First, we load them all in
one pass in 'readonly' mode. In this mode, the table
cannot be written to and has not yet been assigned
a commit log. To achieve this we've added _readonly bool field
to the table class, it's initialized to true in table's
constructor. In addition, we changed the table constructor
to always assign nullptr to commitlog, and we trigger
an internal error if table.commitlog() property is accessed
while the table is in readonly mode. Then, after
triggering on_system_tables_loaded notifications on
feature_service and sstable_format_selector, we call
system_keyspace::mark_writable and eventually
table::mark_ready_for_writes which selects the
proper commitlog and marks the table as writable.

In sstable_compaction_test we drop several
mark_ready_for_writes calls since they are redundant,
the table has already been made writable in
env.make_table_for_tests call.

The table::commitlog function either returns the current
commitlog or causes an error if the table is readonly. This
didn't work for virtual tables, since they never called
mark_ready_for_writes. In this commit we add this
call to initialize_virtual_tables.
2023-09-13 23:17:20 +04:00
Petr Gusev
e395086557 system_keyspace: move initialize_virtual_tables into virtual_tables.hh
This is a readability refactoring commit without observable changes
in behaviour.

initialize_virtual_tables logically belongs to virtual_tables module,
and it allows to make other functions in virtual_tables.cc
(register_virtual_tables, install_virtual_readers)
local to the module, which simplifies the matters a bit.

all_virtual_tables() is not needed anymore, all the references to
registered virtual tables are now local to virtual_tables module
and can just use virtual_tables variable directly.
2023-09-13 23:00:15 +04:00
Petr Gusev
c4787a160b system_keyspace: remove unused parameter 2023-09-13 23:00:15 +04:00
Petr Gusev
b90011294d config.cc: drop db::config::host_id
In this refactoring commit we remove the db::config::host_id
field, as it's hacky and duplicates token_metadata::get_my_id.

Some tests want specific host_id, we add it to cql_test_config
and use in cql_test_env.

We can't pass host_id to sstables_manager by value since it's
initialized in database constructor and host_id is not loaded yet.
We also prefer not to make a dependency on shared_token_metadata
since in this case we would have to create artificial
shared_token_metadata in many tools and tests where sstables_manager
is used. So we pass a function that returns host_id to
sstables_manager constructor.
2023-09-13 23:00:15 +04:00
Pavel Emelyanov
512465288f main, cql_test_env: Start-stop batchlog manager in one "block"
Currently starting and stopping of b.m. is spread over main(). Keep it
close to each other.

Another trickery here is that calling b.m.::start() can only be done
after joining the cluster, because this start() spawns replay loop
which, in turn calls token_metadata::count_normal_token_owners() and if
the latter returns zero, the b.m. code uses it as a fraction denominator
and crashes.

With the above in mind, cql_test_env should start batchlog manager after
it "joins the ring" too. For now it doesn't make any difference, but
next patch will make use of it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-12 16:33:31 +03:00
Avi Kivity
89ba4e4a5e Merge 'Stop using anonymous minio bucket for tests' from Pavel Emelyanov
Currently minio starts with a bucket that has public anonymous access. Respectively, all tests use unsigned S3 requests. That was done for simplicity, and its better to apply some policy to the bucket and, consequentially, make tests sign their requests.

Other than the obvious benefit that we test requests signing in unit tests, another goal of this PR is to make it possible to simulate and test various error paths locally, e.g. #13745 and #13022

Closes #14525

* github.com:scylladb/scylladb:
  test/s3: Remove AWS_S3_EXTRA usage
  test/s3: Run tests over non-anonymous bucket
  test/minio: Create random temp user on start
  code: Rename S3_PUBLIC_BUCKET_FOR_TEST
2023-09-11 23:12:56 +03:00
Botond Dénes
b062b245ad Merge 'Don't cache dc:rack on system keyspace local cache' from Pavel Emelyanov
The local node's dc:rack pair is cached on system keyspace on start. However, most of other code don't need it as they get dc:rack from topology or directly from snitch. There are few places left that still mess with sysks cache, but they are easy to patch. So after this patch all the core code uses two sources of dc:rack -- topology / snitch -- instead of three.

Closes #15280

* github.com:scylladb/scylladb:
  system_keyspace: Don't require snitch argument on start
  system_keyspace: Don't cache local dc:rack pair
  system_keyspace: Save local info with explicit location
  storage_service: Get endpoint location from snitch, not system keyspace
  snitch: Introduce and use get_location() method
  repair: Local location variables instead of system keyspace's one
  repair: Use full endpoint location instead of datacenter part
2023-09-11 10:26:26 +03:00
Aleksandra Martyniuk
832df38d26 compaction: pass do_throw_if_stopping to compaction_task_executor
As a preparation for further changes, keep do_throw_if_stopping flag
as a member of compaction_task_executor.
2023-09-09 11:19:11 +02:00
Pavel Emelyanov
1d00cc5baa test/s3: Run tests over non-anonymous bucket
Currently minio applies anonymous public policy for the test bucket and
all tests just use unsigned S3 requests. This patch generates a policy
for the temporary minio user and removes the anon public one. All tests
are updated respectively to use the provided key:secret pair.

The use-https bit is off by default as minio still starts with plain
http. That's OK for now, all tests are local and have no secret data
anyway

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-07 11:16:13 +03:00
Pavel Emelyanov
e8e8539c7c code: Rename S3_PUBLIC_BUCKET_FOR_TEST
The bucket is going to stop being public, rename the env variable in
advance to make the essential patch smaller

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-07 10:25:53 +03:00
Pavel Emelyanov
5d52a35e05 system_keyspace: Don't require snitch argument on start
Now system keyspace is finally independent from snitch

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-05 12:57:09 +03:00
Pavel Emelyanov
9926917bf5 system_keyspace: Save local info with explicit location
On boot system keyspace is kicked to insert local info into system.local
table. Among other things there's dc:rack pair which sys.ks. gets from
its cache which, in turn, should have been previously initialized from
snitch on sys.ks. start. This patch makes the local info updating method
get the dc:rack from caller via argument. Callers, in turn, call snitch
directly, because these are main and cql_test_env startup routines.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-05 12:54:46 +03:00
Pavel Emelyanov
13a0c29618 storage_service: Remove query processor arg from join_cluster()
The s.service since d42685d0cb is having on-board query processor ref^w
pointer and can use it to join cluster

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #15236
2023-09-05 07:30:37 +03:00
Botond Dénes
3e7ec6cc83 Merge 'Move cell assertion from cql_test_env to cql_assertions' from Pavel Emelyanov
The cql_test_env has a virtual require_column_has_value() helper that better fits cql_assertions crowd. Also, the helper in question duplicates some existing code, so it can also be made shorter (and one class table helper gets removed afterwards)

Closes #15208

* github.com:scylladb/scylladb:
  cql_assertions: Make permit from env
  table: Remove find_partition_slow() helper
  sstable_compaction_test: Do not re-decorate key
  cql_test_env: Move .require_column_has_value
  cql_test_env: Use table.find_row() shortcut
2023-08-30 08:34:05 +03:00
Pavel Emelyanov
137c7116dc cql_assertions: Make permit from env
To call table::find_row() one needs to provide a permit. Tests have
short and neat helper to create one from cql_test_env

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-29 16:01:29 +03:00
Pavel Emelyanov
4e9f380608 cql_test_env: Move .require_column_has_value
This env helper is only used by tests (from cql_query_test)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-29 15:38:33 +03:00
Pavel Emelyanov
7597663ef5 cql_test_env: Use table.find_row() shortcut
The require_column_has_value() finds the cell in three steps -- finds
partition, then row, then cell. The class table already has a method to
facilitate row finding by partition and clustering key

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-29 15:37:27 +03:00
Pavel Emelyanov
a61454be00 storage_service: Use local cdc gen. service in join_cluster()
The method in question accepts cdc_generation_service ref argument from
main and cql_test_env, but storage service now has local cdcv gen.
service reference, so this argument and its propagation down the stack
can be removed

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-29 09:36:58 +03:00
Pavel Emelyanov
933ea0afe6 storage_service: Bring cdc_generation_service dependency back
It sort of reverts the 5a97ba7121 commit, because storage service now
uses the cdc generation service to serve raft topo updates which, in
turn, takes the cdc gen. service all over the raft code _just_ to make
it as an argument to storage service topo calls.

Also there's API carrying cdc gen. service for the single call and also
there's an implicit need to kick cdc gen. service on decommission which
also needs storage service to reference cdc gen. after boot is complete

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-29 09:36:58 +03:00
Botond Dénes
139ba553b8 Merge 'sstable, test: log sstable name and pk when capping local_deletion_time ' from Kefu Chai
in this series, we also print the sstable name and pk when writing a tombstone whose local_deletion_time (ldt for short) is greater than INT32_MAX which cannot be represented by an uint32_t.

Fixes #15015

Closes #15107

* github.com:scylladb/scylladb:
  sstable/writer: log sstable name and pk when capping ldt
  test: sstable_compaction_test: add a test for capped tombstone ldt
2023-08-23 09:29:54 +03:00
Kefu Chai
0bc99c7f49 test: sstable_compaction_test: add a test for capped tombstone ldt
local_delection_time (short for ldt) is a timestamp used for the
purpose of purging the tombstone after gc_grace_seconds. if its value
is greater than INT32_MAX, it is capped when being written to sstable.
this is very likely a signal of bad configuration or a even a bug in
scylla. so we keep track of it with a metric named
"scylla_sstables_capped_tombstone_deletion_time".

in this change, a test is added to verify that the metric is updated
upon seeing a tombstone with this abnormal ldt.

because we validate the consistency before and after compaction in
tests, this change adds a parameter to disable this check, otherwise,
because capping the ldt changes the mutation, the validation would
fail the test.

Refs #15015
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-08-21 19:25:32 +08:00
Pavel Emelyanov
6bc30f1944 system_keyspace: De-bloat .setup() from messing with system.local
On boot several manipulations with system.local are performed.

1. The host_id value is selected from it with key = local

   If not found, system_keyspace generates a new host_id, inserts the
   new value into the table and returns back

2. The cluster_name is selected from it with key = local

   Then it's system_keyspace that either checks that the name matches
   the one from db::config, or inserts the db::config value into the
   table

3. The row with key = local is updated with various info like versions,
   listen, rpc and bcast addresses, dc, rack, etc. Unconditionally

All three steps are scattered over main, p.1 is called directly, p.2 and
p.3 are executed via system_keyspace::setup() that happens rather late.
Also there's some touch of this table from the cql_test_env startup code.

The proposal is to collect this setup into one place and execute it
early -- as soon as the system.local table is populated. This frees the
system_keyspace code from the logic of selecting host id and cluster
name leaving it to main and keeps it with only select/insert work.

refs: #2795

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #15082
2023-08-20 21:24:31 +03:00
Tomasz Grabiec
bd8bb5d4b1 Merge 'Wire tablet into compaction group' from Raphael "Raph" Carvalho
Compaction group is the data plane for tablets, so this integration
allows each tablet to have its own storage (memtable + sstables).
A crucial step for dynamic tablets, where each tablet can be worked
on independently.

There are still some inefficiencies to be worked on, but as it is,
it already unlocks further development.

```
INFO  2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata
INFO  2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf
INFO  2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf
INFO  2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf
```

Closes #14863

* github.com:scylladb/scylladb:
  Kill scylla option to configure number of compaction groups
  replica: Wire tablet into compaction group
  token_metadata: Add this_host_id to topology config
  replica: Switch to chunked_vector for storing compaction groups
  replica: Generate group_id for compaction_group on demand
2023-08-18 15:17:17 +02:00
Gleb Natapov
4ffc39d885 cql3: Extend the scope of group0_guard during DDL statement execution
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.

Fixes: #13942

Message-ID: <ZNsynXayKim2XAFr@scylladb.com>
2023-08-17 15:52:48 +03:00
Raphael S. Carvalho
b578d6643f Kill scylla option to configure number of compaction groups
The option was introduced to bootstrap the project. It's still
useful for testing, but that translates into maintaining an
additional option and code that will not be really used
outside of testing. A possible option is to later map the
option in boost tests to initial_tablets, which may yield
the same effect for testing.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:53 -03:00
Raphael S. Carvalho
5d1f60439a token_metadata: Add this_host_id to topology config
The motivation is that token_metadata::get_my_id() is not available
early in the bootstrap process, as raft topology is pulled later
than new tables are registered and created, and this node is added
to topology even later.

To allow creation of compaction groups to retrieve "my id" from
token metadata early, initialization will now feed local id
into topology config which is immutable for each node anyway.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-08-16 18:23:44 -03:00
Avi Kivity
e8f3b073c3 Merge 'Maintain sstable state explicitly' from Pavel Emelyanov
An sstable can be in one of several states -- normal, quarantined, staging, uploading. Right now this "state" is hard-wired into sstable's path, e.g. quarantined sstable would sit in e.g. /var/lib/data/ks-cf-012345/quarantine/ directory. Respectively, there's a bunch of directory names constexprs in sstables.hh defining each "state". Other than being confusing, this approach doesn't work well with S3 backend. Additionally, there's snapshot subdir that adds to the confusion, because snapshot is not quite a state.

This PR converts "state" from constexpr char* directories names into a enum class and patches the sstable creation, opening and state-changing API to use that enum instead of parsing the path.

refs: #13017
refs: #12707

Closes #14152

* github.com:scylladb/scylladb:
  sstable/storage: Make filesystem storage with initial state
  sstable: Maintain state
  sstable: Make .change_state() accept state, not directory string
  sstable: Construct it with state
  sstables_manager: Remove state-less make_sstable()
  table: Make sstables with required state
  test: Make sstables with upload state in some cases
  tools: Make sstables with normal state
  table: Open-code sstables making streaming helpers
  tests: Make sstables with normal state by default
  sstable_directory: Make sstable with required state
  sstable_directory: Construct with state
  distributed_loader: Make sstable with desired state when populating
  distributed_loader: Make sstable with upload state when uploading
  sstable: Introduce state enum
  sstable_directory: Merge verify and g.c. calls
  distributed_loader: Merge verify and gc invocations
  sstable/filesystem: Put underscores to dir members
  sstable/s3: Mark make_s3_object_name() const
  sstable: Remove filename(dir, ...) method
2023-08-15 17:44:06 +03:00