The current S3 upload sink has an implicit limit on the final file size that comes from two places. First, the S3 protocol allows part numbers from 1 to 10000 (inclusive). Second, the upload sink sends out a part once it grows above the S3 minimal part size, which is 5MB. Since sstables put data in 128KB (or smaller) portions, parts end up almost exactly 5MB in size, so the total upload size cannot grow above ~50GB. That's too low.
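The resulting arithmetic:

    10000 parts * 5MB/part = ~50GB maximum object size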
To break this limit, the new sink (called the jumbo sink) uses the UploadPartCopy S3 call, which splices several objects into one right on the server. The jumbo sink starts uploading parts into an intermediate temporary object called a piece, named ${original_object}_${piece_number}. When the number of parts in the current piece grows above the configured limit, the piece is finalized, upload-copied into the object as its next part, and then deleted. This happens in the background; meanwhile a new piece is created and subsequent data is put into it. When the sink is flushed, the current, possibly short, piece is finalized and squashed into the object the same way.
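For illustration, a minimal single-threaded sketch of the piece rotation. The class and method names below are hypothetical, not the actual s3::client API, and the background execution and error handling are omitted:

    #include <cstddef>
    #include <string>

    // Hypothetical, simplified S3 client interface for illustration only;
    // the real ScyllaDB s3::client API differs.
    struct s3_client {
        void upload_part(const std::string& object, unsigned part,
                         const char* data, size_t len) {
            // issue UploadPart against `object`
        }
        void finalize_upload(const std::string& object) {
            // issue CompleteMultipartUpload
        }
        void upload_part_copy(const std::string& dst, unsigned part,
                              const std::string& src) {
            // issue UploadPartCopy: splice `src` into `dst` as part `part`
        }
        void remove(const std::string& object) {
            // issue DeleteObject
        }
    };

    class jumbo_sink {
        s3_client& _s3;
        std::string _object;   // final object name
        unsigned _piece = 0;   // number of the piece currently being filled
        unsigned _parts = 0;   // parts uploaded into the current piece
        unsigned _parts_limit; // configured per-piece part limit

        std::string piece_name() const {
            return _object + "_" + std::to_string(_piece);
        }

        // Finalize the current piece, splice it into the object as its
        // next part, then delete it. The real sink runs this in the
        // background while the next piece is being filled.
        void squash_piece() {
            _s3.finalize_upload(piece_name());
            _s3.upload_part_copy(_object, _piece + 1, piece_name()); // parts are 1-based
            _s3.remove(piece_name());
            _piece++;
            _parts = 0;
        }

    public:
        jumbo_sink(s3_client& s3, std::string object, unsigned parts_limit)
            : _s3(s3), _object(std::move(object)), _parts_limit(parts_limit) {}

        void put(const char* data, size_t len) {
            _s3.upload_part(piece_name(), ++_parts, data, len);
            if (_parts >= _parts_limit) {
                squash_piece();
            }
        }

        void flush() {
            if (_parts) {
                squash_piece(); // the last, possibly short, piece goes in as-is
            }
            _s3.finalize_upload(_object);
        }
    };

Since UploadPartCopy runs entirely server-side, the sink keeps streaming new data into the next piece while S3 splices the finished one.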
The new jumbo sink can upload ~500TB of data, which should be more than enough.
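The figure follows from the same arithmetic, assuming the per-piece part limit is configured near the protocol maximum:

    10000 parts/piece  * 5MB/part   = ~50GB per piece
    10000 pieces/object * 50GB/piece = ~500TB per object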
Fixes: #13019
Closes #13577
* github.com:scylladb/scylladb:
sstables: Switch data and index sink to use jumbo uploader
s3/test: Tune-up multipart upload test alignment
s3/test: Add jumbo upload test
s3/client: Wait for background upload fiber on close-abort
s3/client: Implement jumbo upload sink
s3/client: Move memory buffers to upload_sink from base
s3/client: Move last part upload out of finalize_upload()
s3/client: Merge do_flush() with upload_part()
s3/client: Rename upload_sink -> upload_sink_base