2405 Commits

Author SHA1 Message Date
Botond Dénes
864d27f9af Merge 'clear_gently: handle null unique_ptr and optional values' from Benny Halevy
This series adds handling of null std::unique_ptr to utils::clear_gently
and handling of std::optional and seastar::optimized_optional (both engaged and disengaged cases).

Also, unit tests were added to tests the above cases.

Fixes #13636

Closes #13638

* github.com:scylladb/scylladb:
  utils: clear_gently: add variants for optional values
  utils: clear_gently: do not clear null unique_ptr
2023-04-24 10:27:32 +03:00
Botond Dénes
9e757d9c6d Merge 'De-globalize storage proxy' from Pavel Emelyanov
All users of global proxy are gone (*), proxy can be made fully main/cql_test_env local.

(*) one test case still needs it, but can get it via cql_test_env

Closes #13616

* github.com:scylladb/scylladb:
  code: Remove global proxy
  schema_change_test: Use proxy from cql_test_env
  test: Carry proxy reference on cql_test_env
2023-04-24 09:38:00 +03:00
Botond Dénes
1750bb34b7 Merge 'sstables, replica: add generation generator' from Kefu Chai
this is the first step to the uuid-based generation identifier. the goal is to encapsulate the generation related logic in generator, so its consumers do not have to understand the difference between the int64_t based generation and UUID v1 based generation.

this commit should not change the behavior of existing scylla. it just allows us to derive from `generation_generator` so we can have another generator which generates UUID based generation identifier.

Closes #13073

* github.com:scylladb/scylladb:
  replica, test: create generation id using generator
  sstables: add generation_generator
  test: sstables: use generate_n for generating ids for testing
2023-04-24 09:31:08 +03:00
Benny Halevy
002865018f utils: clear_gently: add variants for optional values
Implement clear_gently for std:;optional<T>
and seastar::optimized_optional<T> and respective
unit tests.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 21:34:02 +03:00
Benny Halevy
12877ad026 utils: clear_gently: do not clear null unique_ptr
Otherwise the null pointer is dereferenced.

Add a unit test reproducing the issue
and testing this fix.

Fixes #13636

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-23 21:33:11 +03:00
Pavel Emelyanov
5e201b9120 database: Remove compaction_manager.hh inclusion into database.hh
The only reason why it's there (right next to compaction_fwd.hh) is
because the database::table_truncate_state subclass needs the definition
of compaction_manager::compaction_reenabler subclass.

However, the former sub is not used outside of database.cc and can be
defined in .cc. Keeping it outside of the header allows dropping the
compaction_manager.hh from database.hh thus greatly reducing its fanout
over the code (from ~180 indirect inclusions down to ~20).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13622
2023-04-23 16:27:11 +03:00
Kefu Chai
576adbdbc5 replica, test: create generation id using generator
reuse generation_generator for generating generation identifiers for
less repeatings. also, add allow update generator to update its
lastest known generation id.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-21 22:02:30 +08:00
Kefu Chai
a2aa133822 treewide: use std::lexicographical_compare_threeway
this the standard library offers
`std::lexicographical_compare_threeway()`, and we never uses the
last two addition parameters which are not provided by
`std::lexicographical_compare_threeway()`. there is no need to have
the homebrew version of trichotomic compare function.

in this change,

* all occurrences of `lexicographical_tri_compare()` are replaced
  with `std::lexicographical_compare_threeway()`.
* ``lexicographical_tri_compare()` is dropped.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13615
2023-04-21 14:28:18 +03:00
Pavel Emelyanov
f953fb2f52 schema_change_test: Use proxy from cql_test_env
There's one place where test case calls for storage proxy and currently
does it via global refernece. Time to switch it to cql_test_env's one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-21 14:18:00 +03:00
Kamil Braun
f7408130c9 Merge 'Fix topology management when raft-based topology is enabled' from Tomasz Grabiec
Fixes a problem when raft-based topology is enabled, which loads
topology from storage. It starts by clearing topology and then adding
nodes one by one. Before this patch, this violates internal invariant
of topology object which puts the local node as the first node. This
would manifest by triggering an assert in topology::pop_node() which
throws if popping the node at index 0 in order to keep the information
about local node around. This is normally prevented by a check in
topology::remove_node() which avoid calling pop_node() if removing the
local node. But since there is no node which is marked as local, this
check allows the first node to be popped.

To fix the problem I lift the invariant that local node is always in
_nodes. We still have information about local node in config. Instead
of keeping it in _nodes, we recognize it as part of indexing. We also
allow removing the local node like a regular node.

The path which reloads topology works correctly after this, the local
node will be recognized when (if) it is added to the topology.

Fixes #13495

Closes #13498

* github.com:scylladb/scylladb:
  locator: topology: Fix move assignment
  locator: topology: Add printer
  tests: topology: Test that topology clearing preserves information about local node
  locator: topology: Recognize local node as part of indexing it
  locator: topology: Fix get_location(ep) for local node
  locator: topology: Fix typo
  locator: topology: Preserve config when cloning
2023-04-21 11:45:08 +02:00
Kefu Chai
b0ef053552 test: sstables: use generate_n for generating ids for testing
so we don't need to keep a `prev_gen` around, this also prepares for
the coming change to use generation generator.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-21 15:45:16 +08:00
Kefu Chai
c5fa1ac9f7 sstable: specialize fmt::formatter<component_type>
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `component_type` without the help of `operator<<`.

the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now.

also, please note, to enable fmtlib to format `std::set<component_type>`
in `test/boost/sstable_3_x_test.cc` , we need to include
`<fmt/ranges.h>` in that source file.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13598
2023-04-21 09:49:24 +03:00
Kefu Chai
ecb5380638 treewide: s/boost::lexical_cast<std::string>/fmt::to_string()/
this change replaces all occurrences of `boost::lexical_cast<std::string>`
in the source tree with `fmt::to_string()`. for couple reasons:

* `boost::lexical_cast<std::string>` is longer than `fmt::to_string()`,
  so the latter is easier to parse and read.
* `boost::lexical_cast<std::string>` creates a stringstream under the
  hood, so it can use the `operator<<` to stringify the given object.
  but stringstream is known to be less performant than fmtlib.
* we are migrating to fmtlib based formatting, see #13245. so
  using `fmt::to_string()` helps us to remove yet another dependency
  on `operator<<`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13611
2023-04-21 09:43:53 +03:00
Tomasz Grabiec
0ec700cd00 locator: topology: Fix move assignment
Defaulted assignment doesn't update node::_topology.
2023-04-20 23:39:18 +02:00
Tomasz Grabiec
3dfd49fe62 tests: topology: Test that topology clearing preserves information about local node 2023-04-20 23:39:18 +02:00
Tomasz Grabiec
7d3384089a locator: topology: Recognize local node as part of indexing it
Fixes a problem when raft-based topology is enabled, which loads
topology from storage. It starts by clearing topology and then adding
nodes one by one. Before this patch, this violates internal invariant
of topology object which puts the local node as the first node. This
would manifest by triggering an assert in topology::pop_node() which
throws if popping the node at index 0 in order to keep the information
about local node around. This is normally prevented by a check in
topology::remove_node() which avoid calling pop_node() if removing the
local node. But since there is no node which is marked as local, this
check allows the first node to be popped.

To fix the problem I lift the invariant that local node is always in
_nodes. We still have information about local node in config. Instead
of keeping it in _nodes, we recognize it as part of indexing. We also
allow removing the local node like a regular node.

The path which reloads topology works correctly after this, the local
node will be recognized when (if) it is added to the topology.

Fixes #13495
2023-04-20 23:39:18 +02:00
Botond Dénes
1426c623eb Merge 'Tune up S3 unit tests environment usage (and a bit more)' from Pavel Emelyanov
The tests in question are using MINIO_SERVER_ADDRESS environment variable to export minio server address from pylib to test cases. Also they use hard-coded public bucket name. Both plays badly with AWS S3, the former due to MINIO_... in its name and the latter because public bucket name can be any.

So this PR puts address and public bucket name into S3_..._FOR_TEST environment variables and fixes output stream closure on failure while at it.

Detached from #13493

Closes #13546

* github.com:scylladb/scylladb:
  s3/test: Rename MINIO_SERVER_ADDRESS environment variable
  s3/test: Keep public bucket name in environment
  s3/test: Fix upload stream closure
  test/lib: Add getenv_safe() helper
2023-04-20 18:01:12 +03:00
Pavel Emelyanov
a77ca69360 s3/test: Rename MINIO_SERVER_ADDRESS environment variable
Using it the pylib minio code export minio address for tests. This
creates unneeded WTFs when running the test over AWS S3, so it's better
to rename to variable not to mention MINIO at all.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-19 12:51:12 +03:00
Pavel Emelyanov
12c4e7d605 s3/test: Keep public bucket name in environment
Local test.py runs minio with the public 'testbucket' bucket and all
test cases know that. This series adds an ability to run tests over real
S3 so the bucket name should be configurable.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-19 12:51:12 +03:00
Pavel Emelyanov
91674da982 s3/test: Fix upload stream closure
If multipart upload fails for some reason the output stream remains not
closed and the respective assertion masquerades the original failure.
Fix that by closing the stream in all cases.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-19 12:51:12 +03:00
Avi Kivity
7a42927a3d treewide: stop using 'using namespace std' in namespace scope
Such namespace-wide imports can create conflicts between names that
are the same in seastar and std, such as {std,seastar}::future and
{std,seastar}::format, since we also have 'using namespace seastar'.

Replace the namespace imports with explicit qualification, or with
specific name imports.

Closes #13528
2023-04-17 14:08:37 +03:00
Botond Dénes
6c889213bf Merge 'Topology add node exception safety' from Benny Halevy
Currently if index_node throws when trying to
add an already indexed node, pop_node might
unindex the existing node instead of the new one.

Instead, with this change, unindex_node looks up
the node by its pointer and removed it from the
index map only if it's found there so to clean up
safely after index_node throws (at any stage).

Add a unit test to verify that.

In addition, added a unit test to reproduce #13502 and test the fix.

Closes #13512

* github.com:scylladb/scylladb:
  test: locator_topology: add test_update_node
  topology: add_node, unindex_node: make exception safe
2023-04-17 11:02:15 +03:00
Botond Dénes
4c37dc5507 Merge 'keys: specialize fmt::formatter<partition_key> and friends' from Kefu Chai
this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print following classes without the help of `operator<<`.

- partition_key_view
- partition_key
- partition_key::with_schema_wrapper
- key_with_schema
- clustering_key_prefix
- clustering_key_prefix::with_schema_wrapper

the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now. the helper
of `print_key()` is removed, as its only caller is
`operator<<(std::ostream&, const
clustering_key_prefix::with_schema_wrapper&)`.

the reason why all these operators are replaced in one go is that
we have a template function of `key_to_str()` in `db/large_data_handler.cc`.
this template function is actually the caller of operator<< of
`partition_key::with_schema_wrapper` and
`clustering_key_prefix::with_schema_wrapper`.
so, in order to drop either of these two operator<<, we need to remove
both of them, so that we can switch over to `fmt::to_string()` in this
template function.

Refs scylladb#13245

Closes #13513

* github.com:scylladb/scylladb:
  keys: consolidate the formatter for partition_keys
  keys: specialize fmt::formatter<partition_key> and friends
2023-04-17 10:27:31 +03:00
Benny Halevy
e18eb71fa3 test: locator_topology: add test_update_node
Reproduces issue fixed in PR #13502

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-14 17:51:07 +03:00
Benny Halevy
e29994b2aa topology: add_node, unindex_node: make exception safe
Current if index_node throws when trying to
add an already indexed node, pop_node might
unindex the existing node instead of the new one.

Instead, with this change, unindex_node looks up
the node by its pointer and removed it from the
index map only if it's found there so to clean up
safely after index_node throws (at any stage).

Add a unit test to verify that.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-14 17:51:05 +03:00
Raphael S. Carvalho
a47bac931c Move TWCS option from table into TWCS itself
enable_optimized_twcs_queries is specific to TWCS, therefore it
belongs to TWCS, not replica::table.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #13489
2023-04-14 08:28:16 +03:00
Kefu Chai
3738fcbe05 keys: specialize fmt::formatter<partition_key> and friends
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print following classes without the help of `operator<<`.

- partition_key_view
- partition_key
- partition_key::with_schema_wrapper
- key_with_schema
- clustering_key_prefix
- clustering_key_prefix::with_schema_wrapper

the corresponding `operator<<()` are dropped dropped in this change,
as all its callers are now using fmtlib for formatting now. the helper
of `print_key()` is removed, as its only caller is
`operator<<(std::ostream&, const
clustering_key_prefix::with_schema_wrapper&)`.

the reason why all these operators are replaced in one go is that
we have a template function of `key_to_str()` in `db/large_data_handler.cc`.
this template function is actually the caller of operator<< of
`partition_key::with_schema_wrapper` and
`clustering_key_prefix::with_schema_wrapper`.
so, in order to drop either of these two operator<<, we need to remove
both of them, so that we can switch over to `fmt::to_string()` in this
template function.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-14 13:21:30 +08:00
Botond Dénes
bd57471e54 reader_concurrency_semaphore: don't evict inactive readers needlessly
Inactive readers should only be evicted to free up resources for waiting
readers. Evicting them when waiters are not admitted for any other
reason than resources is wasteful and leads to extra load later on when
these evicted readers have to be recreated end requeued.
This patch changes the logic on both the registering path and the
admission path to not evict inactive readers unless there are readers
actually waiting on resources.
A unit-test is also added, reproducing the overly-agressive eviction and
checking that it doesn't happen anymore.

Fixes: #11803

Closes #13286
2023-04-13 15:20:18 +03:00
Kefu Chai
87170bf07a build: cmake: add more tests
this change should add the remaining tests under boost/

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13494
2023-04-13 14:57:00 +03:00
Botond Dénes
f1bbf705f9 Merge 'Cleanup sstables in resharding and other compaction types' from Benny Halevy
This series extends sstable cleanup to resharding and other (offstrategy, major, and regular) compaction types so to:
* cleanup uploaded sstables (#11933)
* cleanup staging sstables after they are moved back to the main directory and become eligible for compaction (#9559)

When perform_cleanup is called, all sstables are scanned, and those that require cleanup are marked as such, and are added for tracking to table_state::cleanup_sstable_set.  They are removed from that set once released by compaction.
Along with that sstables set, we keep the owned_ranges_ptr used by cleanup in the table_state to allow other compaction types (offstrategy, major, or regular) to cleanup those sstables that are marked as require_cleanup and that were skipped by cleanup compaction for either being in the maintenance set (requiring offstrategy compaction) or in staging.

Resharding is using a more straightforward mechanism of passing the owned token ranges when resharding uploaded sstables and using it to detect sstable that require cleanup, now done as piggybacked on resharding compaction.

Closes #12422

* github.com:scylladb/scylladb:
  table: discard_sstables: update_sstable_cleanup_state when deleting sstables
  compaction_manager: compact_sstables: retrieve owned ranges if required
  sstables: add a printer for shared_sstable
  compaction_manager: keep owned_ranges_ptr in compaction_state
  compaction_manager: perform_cleanup: keep sstables in compaction_state::sstables_requiring_cleanup
  compaction: refactor compaction_state out of compaction_manager
  compaction: refactor compaction_fwd.hh out of compaction_descriptor.hh
  compaction_manager: compacting_sstable_registration: keep a ref to the compaction_state
  compaction_manager: refactor get_candidates
  compaction_manager: get_candidates: mark as const
  table, compaction_manager: add requires_cleanup
  sstable_set: add for_each_sstable_until
  distributed_loader: reshard: update sstable cleanup state
  table, compaction_manager: add update_sstable_cleanup_state
  compaction_manager: needs_cleanup: delete unused schema param
  compaction_manager: perform_cleanup: disallow empty sorted_owened_ranges
  distributed_loader: reshard: consider sstables for cleanup
  distributed_loader: process_upload_dir: pass owned_ranges_ptr to reshard
  distributed_loader: reshard: add optional owned_ranges_ptr param
  distributed_loader: reshard: get a ref to table_state
  distributed_loader: reshard: capture creator by ref
  distributed_loader: reshard: reserve num_jobs buckets
  compaction: move owned ranges filtering to base class
  compaction: move owned_ranges into descriptor
2023-04-11 14:52:29 +03:00
Kefu Chai
86b66a9875 build: cmake: drop test_table.CC
this change mirrors the corresponding change in `configure.py` in
4b5b6a9010 .

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13461
2023-04-11 09:42:58 +03:00
Botond Dénes
05b381bfa2 Merge 'Simple S3 storage for sstables' from Pavel Emelyanov
The PR adds sstables storage backend that keeps all component files as S3 objects and system.sstables_registry ownership table that keeps track of what sstables objects belong to local node and their names.

When a keyspace is configured with 'STORAGE = { 'type': 'S3' }' the respective class table object eventually gets the storage_options instance pointing to the target S3 endpoint and bucket. All the sstables created for that table attach the S3 storage implementation that maintains components' files as S3 objects. Writing to and reading from components is handled by the S3 client facilities from utils/. Changing the sstable state, which is -- moving between normal, staging and quarantine states -- is not yet implemented, but would eventually happen by updating entries in the sstables registry.

To keep track of which node owns which objects, to provide bucket-wide uniqueness of object names and to maintain sstable state the storage driver keeps records in the system.sstables_registry ownership table. The table maps sstable location and generation to the object format, version, status-state (*) and (!) unique identifier (some time soon this identifier is supposed to be replaced with UUID sstables generations). The component object name is thus s3://bucket/uuid/component_basename. The registry is also used on boot. The distributed loader picks up sstables from all the tables found in schema and for S3-backed keyspaces it lists entries in the registry to a) identify those and b) get their unique S3-side identifiers to open by name.

(*) About sstable's status and state.

The state field is the part of today's sstable path on disk -- staging, quarantine, normal (root table data dir), etc. Since S3 doesn't have the renaming facility, moving sstable between those states is only possible by updating the entry in the registry. This is not yet implemented in this set (#13017)

The status field tracks sstable' transition through its creation-deletion. It first starts with 'creating' status which corresponds to the today's TemporaryTOC file. After being created and written to the sstable moves into 'sealed' state which corresponds to the today's normal sstable being with the TOC file. To delete sstable atomically it first moves into 'removing' state which is equivalent to being in the deletion-log for the on-disk sstable. Once removed from the bucket, the entry is removed from the registry.

To play with:

1. Start minio (installed by install-dependencies.sh)
```
export MINIO_ROOT_USER=${root_user}
export MINIO_ROOT_PASSWORD=${root_pass}
mkdir -p ${root_directory}
minio server ${root_directory}
```

2. Configure minio CLI, create anonymous bucket
```
mc config host rm local
mc config host add local http://127.0.0.1:9000 ${root_user} ${root_pass}
mc mb local/sstables
mc anonymous set public local/sstables
```

3. Start Scylla with object-storage feature enabled
``` scylla ... --experimental-features=keyspace-storage-options --workdir ${as_usual}```

4. Create KS with S3 storage
``` create keyspace ... storage = { 'type': 'S3', 'endpoint': '127.0.0.1:9000', 'bucket': 'sstables' };```

The S3 client has a logger named "s3", it's useful to use on with `trace` verbosity.

Closes #12523

* github.com:scylladb/scylladb:
  test: Add object-storage test
  distributed_loader: Print storage type when populating
  sstable_directory: Add ownership table components lister
  sstable_directory: Make components_lister and API
  sstable_directory: Create components lister based on storage options
  sstables: Add S3 storage implementation
  system_keyspace: Add ownership table
  system_keyspace: Plug to user sstables manager too
  sstable: Make storage instance based on storage options
  sstable_directory: Keep storage_options aboard
  sstable: Virtualize the helper that gets on-disk stats for sstable
  sstable, storage: Virtualize data sink making for small components
  sstable, storage: Virtualize data sink making for Data and Index
  sstable/writer: Shuffle writer::init_file_writers()
  sstable: Make storage an API
  utils: Add S3 readable file impl for random reads
  utils: Add S3 data sink for multipart upload
  utils: Add S3 client with basic ops
  cql-pytest: Add option to run scylla over stable directory
  test.py: Equip it with minio server
  sstables: Detach write_toc() helper
2023-04-11 08:17:25 +03:00
Benny Halevy
9105f9800c sstables: add a printer for shared_sstable
Refactor the printing logic in compaction::formatted_sstables_list
out to sstables::to_string(const shared_sstable&, bool include_origin)
and operator<<(const shared_sstable) on top of it.

So that we can easily print std::vector<shared_sstable>
from compaction_manager in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-10 23:31:35 +03:00
Benny Halevy
1baca96de1 compaction_manager: needs_cleanup: delete unused schema param
It isn't needed.  The sstable already has a schema.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-10 23:03:53 +03:00
Benny Halevy
09df04c919 compaction: move owned_ranges into descriptor
Move the owned_ranges_ptr, currently used only by
cleanup and upgrade compactions, to the generic
compaction descriptor so we apply cleanup in other
compaction types.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-10 22:52:12 +03:00
Pavel Emelyanov
fd817e199c Merge 'auth: replace operator<<(..) with fmt formatter' from Kefu Chai
this is a part of a series to migrating from `operator<<(ostream&, ..)`
based formatting to fmtlib based formatting. the goal here is to enable
fmtlib to print `auth::auth_authentication_options` and `auth::resource_kind`
without the help of fmt::ostream. and their `operator<<(ostream,..)` are
dropped, as there are no users of them anymore.

Refs #13245

Closes #13460

* github.com:scylladb/scylladb:
  auth: remove unused operator<<(.., resource_kind)
  auth: specialize fmt::formatter<resource_kind>
  auth: remove unused operator<<(.., authentication_option)
  auth: specialize fmt::formatter<authentication_option>
2023-04-10 17:05:09 +03:00
Pavel Emelyanov
4bb885b759 sstable: Make storage instance based on storage options
This patch adds storage options lw-ptr to sstables_manager::make_sstable
and makes the storage instance creation depend on the options. For local
it just creates the filesystem storage instance, for S3 -- throws, but
next patch will fix that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Pavel Emelyanov
df026e2cb5 sstable_directory: Keep storage_options aboard
The class in question will need to know the table's storage it will need
to list sstables from. For that -- construct it with the storage options
taken from table.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Pavel Emelyanov
033fa107f8 utils: Add S3 readable file impl for random reads
Sometimes an sstable is used for random read, sometimes -- for streamed
read using the input stream. For both cases the file API can be
provided, because S3 API allows random reads of arbitrary lengths.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Pavel Emelyanov
a4a64149a6 utils: Add S3 data sink for multipart upload
Putting a large object into S3 using plain PUT is bad choice -- one need
to collect the whole object in memory, then send it as a content-length
request with plain body. Less memory stress is by using multipart
upload, but multipart upload has its limitation -- each part should be
at least 5Mb in size. For that reason using file API doesn't work --
file IO API operates with external memory buffers and the file impl
would only have raw pointers to it. In order to collect 5Mb of chunk in
RAM the impl would have to copy the memory which is not good. Unlike the
file API data_sink API is more flexible, as it has temporary buffers at
hand and can cache them in zero-copy manner.

Having sad that, the S3 data_sink implementation is like this:

* put(buffer):
  move the buffer into local cache, once the local cache grows above 5Mb
  send out the part

* flush:
  send out whatever is in cache, then send upload completion request

* close:
  check that the upload finihsed (in flush), abort the upload otherwise

User of the API may (actually should) wrap the sink with output_stream
and use it as any other output_stream.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Pavel Emelyanov
3745b5c715 utils: Add S3 client with basic ops
Those include -- HEAD to get size, PUT to upload object in one go, GET
to read the object as contigious buffer and DELETE to drop one.

The client uses http client from seastar and just implements the S3
protocol using it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-04-10 16:43:01 +03:00
Nadav Har'El
d26bb8c12d Merge 'tree: migrate from std::regex to boost::regex' from Botond Dénes
Except for where usage of `std::regex` is required by 3rd party library interfaces.
As demonstrated countless times, std::regex's practice of using recursion for pattern matching can result in stack overflow, especially on AARCH64. The most recent incident happened after merging https://github.com/scylladb/scylladb/pull/13075, which (indirectly) uses `sstables::make_entry_descriptor()` to test whether a certain path is a valid scylla table path in a trial-and-error manner. This resulted in stacks blowing up in AARCH64.
To prevent this, use the already tried and tested method of switching from `std::regex` to `boost::regex`. Don't wait until each of the `std::regex` sites explode, replace them all preemptively.

Refs: https://github.com/scylladb/scylladb/issues/13404

Closes #13452

* github.com:scylladb/scylladb:
  test: s/std::regex/boost::regex/
  utils: s/std::regex/boost::regex/
  db/commitlog: s/std::regex/boost::regex/
  types: s/std::regex/boost::regex/
  index: s/std::regex/boost::regex/
  duration.cc: s/std::regex/boost::regex/
  cql3: s/std::regex/boost::regex/
  thrift: s/std::regex/boost::regex/
  sstables: use s/std::regex/boost::regex/
2023-04-09 18:47:41 +03:00
Kefu Chai
9d5fbe226e auth: remove unused operator<<(.., resource_kind)
since the only user of operator<<(..., resource_kind) is now
`auth_resource_test`, let's just move it into this test. and
there is no need to keep this operator in the header file where
`resource_kind` is defined.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-07 20:32:28 +08:00
Botond Dénes
452cb1a712 test: s/std::regex/boost::regex/
The former is prone to producing stack-overflow as it uses recursion in
it match implementation.

The migration is entirely mechanical.
2023-04-06 09:51:32 -04:00
Botond Dénes
0a46a574e6 Merge 'Topology: introduce nodes' from Benny Halevy
As a first step towards using host_id to identify nodes instead of ip addresses
this series introduces a node abstraction, kept in topology,
indexed by both host_id and endpoint.

The revised interface also allows callers to handle cases where nodes
are not found in the topology more gracefully by introducing `find_node()` functions
that look up nodes by host_id or inet_address and also get a `must_exist` parameter
that, if false (the default parameter value) would return nullptr if the node is not found.
If true, `find_node` throws an internal error, since this indicates a violation of an internal
assumption that the node must exist in the topology.

Callers that may handle missing nodes, should use the more permissive flavor
and handle the !find_node() case gracefully.

Closes #11987

* github.com:scylladb/scylladb:
  topology: add node state
  topology: remove dead code
  locator: add class node
  topology: rename update_endpoint to add_or_update_endpoint
  topology: define get_{rack,datacenter} inline
  shared_token_metadata: mutate_token_metadata: replicate to all shards
  locator: endpoint_dc_rack: refactor default_location
  locator: endpoint_dc_rack: define default operator==
  test: storage_proxy_test: provide valid endpoint_dc_rack
2023-04-06 13:47:22 +03:00
Tomasz Grabiec
bbabf07f69 Merge 'test/boost/multishard_mutation_query: use random schema' from Botond Dénes
This test currently uses `test/lib/test_table.hh` to generate data for its test cases. This data generation facility is used by no other tests. Worse, it is redundant as we already have a random data generator with fixed schema, in `test/lib/mutation_source_test.hh`. So in this series, we migrate the test cases in said test file to random schema and its random data generation facilities. These are used by several other test cases and using random schema allows us to cover a wider (quasi-infinite) number of possibilities.
After migrating all tests away from it, `test/lib/test_table.hh` is removed.
This series also reduces the runtime of `fuzzy_test` drastically. It should now run in a few minutes or even in seconds (depending on the machine).

Fixes: #12944

Closes #12574

* github.com:scylladb/scylladb:
  test/lib: rm test_table.hh
  test/boos/multishard_mutation_query_test: migrate other tests to random schema
  test/boost/multishard_mutation_query_test: use ks keyspace
  test/boost/multishard_mutation_query_test: improve test pager
  test/boost/multishard_mutation_query_test: refactor fuzzy_test
  test/boost: add multishard_mutation_query_test more memory
  types/user: add get_name() accessor
  test/lib/random_schema: add create_with_cql()
  test/lib/random_schema: fix udt handling
  test/lib/random_schema: type_generator(): also generate frozen types
  test/lib/random_schema: type_generator(): make static column generation conditional
  test/lib/random_schema: type_generator(): don't generate duration_type for keys
  test/lib/random_schema: generate_random_mutations(): add overload with seed
  test/lib/random_schema: generate_random_mutations(): respect range tombstone count param
  test/lib/random_schema: generate_random_mutations(): add yields
  test/lib/random_schema: generate_random_mutations(): fix indentation
  test/lib/random_schema: generate_random_mutations(): coroutinize method
  test/lib/random_schema: generate_random_mutations(): expand comment
2023-04-05 10:32:58 +02:00
Benny Halevy
c17df1759e topology: add node state
Add a simple node state model with:
`joining`, `normal`, `leaving`, and `left` states
to help managing nodes during replace
with the the same ip address.

Later on, this could also help prevent nodes
that were decommissioned, removed, or replaced
from rejoining the cluster.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-02 20:18:31 +03:00
Benny Halevy
f3d5df5448 locator: add class node
And keep per node information (idx, host_id, endpoint, dc_rack, is_pending)
in node objects, indexed by topology on several indices like:
idx, host_id, endpoint, current/pending, per dc, per dc/rack.

The node index is a shorthand identifier for the node.

node* and index are valid while the respective topology instance is valid.
To be used, the caller must hold on to the topology / token_metadata object
(e.g. via a token_metadata_ptr or effective_replication_map)

Refs #6403

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

topology: add node idx

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-02 20:13:02 +03:00
Benny Halevy
006e02410f topology: rename update_endpoint to add_or_update_endpoint
To reflect what it does,

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-02 20:08:03 +03:00
Benny Halevy
5874a0d0ca test: storage_proxy_test: provide valid endpoint_dc_rack
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-04-02 19:13:05 +03:00