Merge "Classify queries based on their initiator, rather than their target" from Botond

"
Currently we classify queries as "system" or "user" based on the table
they target. The class of a query determines how the query is treated:
currently its timeout, the limits for reverse queries, and which
concurrency semaphore it goes through. The catch is that users are also
allowed to query system tables, and when they do so they bypass the
limits intended for user queries. This has caused performance problems
in the past, but the immediate reason for finally addressing it is that
we want to introduce a memory limit for unpaged queries. Internal
(system) queries are all unpaged, and we don't want to impose the same
limit on them.
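As a rough illustration of what such a limit looks like, here is a toy accounter with invented names (not Scylla's actual classes): a user query accounts buffered result data against a finite cap, while a system query passes `std::numeric_limits<uint64_t>::max()` and is effectively uncapped.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Illustrative sketch only: enforce a memory cap on unpaged results.
// System queries pass uint64_t's max value, i.e. effectively no cap.
class result_memory_accounter {
    uint64_t _used = 0;
    uint64_t _limit;
public:
    explicit result_memory_accounter(uint64_t limit) : _limit(limit) {}
    // Account for `bytes` of buffered result data; throw once over the cap.
    void account(uint64_t bytes) {
        _used += bytes;
        if (_used > _limit) {
            throw std::runtime_error("unpaged query exceeded its memory limit");
        }
    }
    uint64_t used() const { return _used; }
};
```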

This series uses scheduling groups to distinguish user and system
workloads, based on the assumption that user workloads run in the
statement scheduling group, while system workloads run in the main
(or default) scheduling group, or possibly some other group, but in
any case not the statement one. Currently the scheduling group of
reads and writes is lost when going through the messaging service, so
to be able to use scheduling groups to distinguish user and system
reads, this series refactors the messaging service to retain the
distinction across verb calls. Furthermore, some system reads/writes,
such as auth and schema sync, are executed as part of user
reads/writes; these are explicitly tagged to run in the main group.
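A minimal model of this tagging, with invented stand-ins for Seastar's fiber-aware `with_scheduling_group()` and `current_scheduling_group()` (the sketch only captures the save/run/restore idea used to pin system work to the main group):

```cpp
#include <cassert>

struct scheduling_group { int id; };

// Invented stand-in for the per-shard "current" group; group 0 plays the
// role of the main/default group, i.e. the one system work runs in.
thread_local scheduling_group current_group{0};

scheduling_group current_scheduling_group() { return current_group; }

// Run f with sg current, restoring the previous group afterwards.
// The real Seastar API tracks the group across continuations; this toy
// version only handles a synchronous call.
template <typename Func>
auto with_scheduling_group(scheduling_group sg, Func f) {
    scheduling_group prev = current_group;
    current_group = sg;
    auto r = f();
    current_group = prev;
    return r;
}
```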

This series also centralises query classification on the replica and
moves it to a higher level. More specifically, queries are now
classified at the database level: the scheduling group they run in is
translated into the appropriate query-class-specific configuration,
which is then propagated down to the lower layers. Currently this
configuration consists of the reader concurrency semaphore and the
maximum memory limit for otherwise unlimited queries. A corollary of
the semaphore being selected at the database level is that the read
permit is now created before the read starts. A valid permit is thus
available during all stages of the read, which enables tracking the
memory consumption of e.g. the memtable and cache readers. This change
aligns nicely with the needs of more accurate reader memory tracking,
which also wants a valid permit available in every layer.
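Stripped of Seastar and the actual semaphores, the classification boils down to the decision implemented further down in the diff by `database::make_query_class_config()`; a standalone sketch with invented plain types:

```cpp
#include <cassert>

enum class query_class { user, streaming, system };

// Invented configuration holder: the ids of the two "special" groups.
struct group_config { int statement; int streaming; };

// Translate the scheduling group a query runs in into its class:
// statement group => user, streaming group => view-update/streaming reads,
// anything else (including the default group) => system.
query_class classify(int current_group, group_config cfg) {
    if (current_group == cfg.statement) return query_class::user;
    if (current_group == cfg.streaming) return query_class::streaming;
    return query_class::system;
}
```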

The series can be divided roughly into the following distinct patch
groups:
* 01-02: Give system read concurrency a boost during startup.
* 03-06: Introduce user/system statement isolation to messaging service.
* 07-13: Various infrastructure changes to prepare for using read
  permits in all stages of reads.
* 14-19: Propagate the semaphore and the permit from database to the
  various table methods that currently create the permit.
* 20-23: Migrate away from using the reader concurrency semaphore for
  waiting for admission; use the permit instead.
* 24: Introduce `database::make_query_config()` and switch the database
  methods needing such a config to use it.
* 25-31: Get rid of all uses of `no_reader_permit()`.
* 32-33: Ban empty permits for good.
* 34: querier_cache: use the queriers' permits to obtain the semaphore.
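As a toy illustration of why the permit-centric groups above push the permit to the front of the read path (invented names, not the real `reader_permit` API): every layer of a read charges memory against the same permit, so the semaphore sees one total for the whole read and can release it as a unit.

```cpp
#include <cassert>
#include <cstdint>

// Toy sketch: a semaphore that hands out permits; each layer of a read
// (memtable, cache, sstables) consumes memory through the same permit,
// which releases everything at once when the read ends. Copying a live
// permit is not handled in this toy version.
class reader_semaphore {
    int64_t _available;
public:
    explicit reader_semaphore(int64_t memory) : _available(memory) {}

    class permit {
        reader_semaphore* _sem;
        int64_t _consumed = 0;
    public:
        explicit permit(reader_semaphore* s) : _sem(s) {}
        void consume(int64_t bytes) {
            _consumed += bytes;
            _sem->_available -= bytes;
        }
        ~permit() { _sem->_available += _consumed; } // release as a whole
    };

    permit make_permit() { return permit(this); }
    int64_t available() const { return _available; }
};
```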

Fixes: #5919

Tests: unit(dev, release, debug),
dtest(bootstrap_test.py:TestBootstrap.start_stop_test_node), manual
testing with a 2 node mixed cluster with extra logging.
"
* 'query-class/v6' of https://github.com/denesb/scylla: (34 commits)
  querier_cache: get semaphore from querier
  reader_permit: forbid empty permits
  reader_permit: fix reader_resources::operator bool
  treewide: remove all uses of no_reader_permit()
  database: make_multishard_streaming_reader: pass valid permit to multi range reader
  sstables: pass valid permits to all internal reads
  compaction: pass a valid permit to sstable reads
  database: add compaction read concurrency semaphore
  view: use valid permits for reads from the base table
  database: use valid permit for counter read-before-write
  database: introduce make_query_class_config()
  reader_concurrency_semaphore: remove wait_admission and consume_resources()
  test: move away from reader_concurrency_semaphore::wait_admission()
  reader_permit: resource_units: introduce add()
  mutation_reader: restricted_reader: work in terms of reader_permit
  row_cache: pass a valid permit to underlying read
  memtable: pass a valid permit to the delegate reader
  table: require a valid permit to be passed to most read methods
  multishard_mutation_query: pass a valid permit to shard mutation sources
  querier: add reader_permit parameter and forward it to the mutation_source
  ...
This commit is contained in:
Avi Kivity
2020-05-28 14:50:53 +03:00
72 changed files with 1163 additions and 755 deletions

@@ -894,6 +894,7 @@ scylla_tests_generic_dependencies = [
'test/lib/cql_test_env.cc',
'test/lib/test_services.cc',
'test/lib/log.cc',
'test/lib/reader_permit.cc',
]
scylla_tests_dependencies = scylla_core + idls + scylla_tests_generic_dependencies + [

@@ -190,8 +190,11 @@ database::database(const db::config& cfg, database_config dbcfg, service::migrat
max_count_streaming_concurrent_reads,
max_memory_streaming_concurrent_reads(),
"_streaming_concurrency_sem")
// No limits, just for accounting.
, _compaction_concurrency_sem(reader_concurrency_semaphore::no_limits{})
, _system_read_concurrency_sem(
max_count_system_concurrent_reads,
// Using higher initial concurrency, see revert_initial_system_read_concurrency_boost().
max_count_concurrent_reads,
max_memory_system_concurrent_reads(),
"_system_read_concurrency_sem")
, _data_query_stage("data_query", &column_family::query)
@@ -200,7 +203,7 @@ database::database(const db::config& cfg, database_config dbcfg, service::migrat
, _version(empty_version)
, _compaction_manager(make_compaction_manager(_cfg, dbcfg, as))
, _enable_incremental_backups(cfg.incremental_backups())
, _querier_cache(_read_concurrency_sem, dbcfg.available_memory * 0.04)
, _querier_cache(dbcfg.available_memory * 0.04)
, _large_data_handler(std::make_unique<db::cql_table_large_data_handler>(_cfg.compaction_large_partition_warning_threshold_mb()*1024*1024,
_cfg.compaction_large_row_warning_threshold_mb()*1024*1024,
_cfg.compaction_large_cell_warning_threshold_mb()*1024*1024,
@@ -916,8 +919,8 @@ keyspace::make_column_family_config(const schema& s, const database& db) const {
cfg.compaction_enforce_min_threshold = _config.compaction_enforce_min_threshold;
cfg.dirty_memory_manager = _config.dirty_memory_manager;
cfg.streaming_dirty_memory_manager = _config.streaming_dirty_memory_manager;
cfg.read_concurrency_semaphore = _config.read_concurrency_semaphore;
cfg.streaming_read_concurrency_semaphore = _config.streaming_read_concurrency_semaphore;
cfg.compaction_concurrency_semaphore = _config.compaction_concurrency_semaphore;
cfg.cf_stats = _config.cf_stats;
cfg.enable_incremental_backups = _config.enable_incremental_backups;
cfg.compaction_scheduling_group = _config.compaction_scheduling_group;
@@ -931,10 +934,8 @@ keyspace::make_column_family_config(const schema& s, const database& db) const {
// avoid self-reporting
if (is_system_table(s)) {
cfg.sstables_manager = &db.get_system_sstables_manager();
cfg.max_memory_for_unlimited_query = std::numeric_limits<uint64_t>::max();
} else {
cfg.sstables_manager = &db.get_user_sstables_manager();
cfg.max_memory_for_unlimited_query = db_config.max_memory_for_unlimited_query();
}
cfg.view_update_concurrency_semaphore = _config.view_update_concurrency_semaphore;
@@ -1179,6 +1180,7 @@ database::query(schema_ptr s, const query::read_command& cmd, query::result_opti
return _data_query_stage(&cf,
std::move(s),
seastar::cref(cmd),
make_query_class_config(),
opts,
seastar::cref(ranges),
std::move(trace_state),
@@ -1211,7 +1213,7 @@ database::query_mutations(schema_ptr s, const query::read_command& cmd, const dh
cmd.partition_limit,
cmd.timestamp,
timeout,
cf.get_config().max_memory_for_unlimited_query,
make_query_class_config(),
std::move(accounter),
std::move(trace_state),
std::move(cache_ctx)).then_wrapped([this, s = _stats, hit_rate = cf.get_global_cache_hit_rate(), op = cf.read_in_progress()] (auto f) {
@@ -1273,6 +1275,19 @@ void database::register_connection_drop_notifier(netw::messaging_service& ms) {
});
}
query_class_config database::make_query_class_config() {
// Everything running in the statement group is considered a user query
if (current_scheduling_group() == _dbcfg.statement_scheduling_group) {
return query_class_config{_read_concurrency_sem, _cfg.max_memory_for_unlimited_query()};
// Reads done on behalf of view update generation run in the streaming group
} else if (current_scheduling_group() == _dbcfg.streaming_scheduling_group) {
return query_class_config{_streaming_concurrency_sem, std::numeric_limits<uint64_t>::max()};
// Everything else is considered a system query
} else {
return query_class_config{_system_read_concurrency_sem, std::numeric_limits<uint64_t>::max()};
}
}
std::ostream& operator<<(std::ostream& out, const column_family& cf) {
return fmt_print(out, "{{column_family: {}/{}}}", cf._schema->ks_name(), cf._schema->cf_name());
}
@@ -1329,7 +1344,7 @@ future<mutation> database::do_apply_counter_update(column_family& cf, const froz
// counter state for each modified cell...
tracing::trace(trace_state, "Reading counter values from the CF");
return counter_write_query(m_schema, cf.as_mutation_source(), m.decorated_key(), slice, trace_state)
return counter_write_query(m_schema, cf.as_mutation_source(), make_query_class_config().semaphore.make_permit(), m.decorated_key(), slice, trace_state)
.then([this, &cf, &m, m_schema, timeout, trace_state] (auto mopt) {
// ...now, that we got existing state of all affected counter
// cells we can look for our shard in each of them, increment
@@ -1540,7 +1555,7 @@ future<> database::do_apply(schema_ptr s, const frozen_mutation& m, tracing::tra
if (cf.views().empty()) {
return apply_with_commitlog(std::move(s), cf, std::move(uuid), m, timeout, sync).finally([op = std::move(op)] { });
}
future<row_locker::lock_holder> f = cf.push_view_replica_updates(s, m, timeout, std::move(tr_state));
future<row_locker::lock_holder> f = cf.push_view_replica_updates(s, m, timeout, std::move(tr_state), make_query_class_config().semaphore);
return f.then([this, s = std::move(s), uuid = std::move(uuid), &m, timeout, &cf, op = std::move(op), sync] (row_locker::lock_holder lock) mutable {
return apply_with_commitlog(std::move(s), cf, std::move(uuid), m, timeout, sync).finally(
// Hold the local lock on the base-table partition or row
@@ -1631,8 +1646,8 @@ database::make_keyspace_config(const keyspace_metadata& ksm) {
cfg.compaction_enforce_min_threshold = _cfg.compaction_enforce_min_threshold;
cfg.dirty_memory_manager = &_dirty_memory_manager;
cfg.streaming_dirty_memory_manager = &_streaming_dirty_memory_manager;
cfg.read_concurrency_semaphore = &_read_concurrency_sem;
cfg.streaming_read_concurrency_semaphore = &_streaming_concurrency_sem;
cfg.compaction_concurrency_semaphore = &_compaction_concurrency_sem;
cfg.cf_stats = &_cf_stats;
cfg.enable_incremental_backups = _enable_incremental_backups;
@@ -1753,6 +1768,11 @@ future<> database::stop_large_data_handler() {
return _large_data_handler->stop();
}
void database::revert_initial_system_read_concurrency_boost() {
_system_read_concurrency_sem.consume({database::max_count_concurrent_reads - database::max_count_system_concurrent_reads, 0});
dblog.debug("Reverted system read concurrency from initial {} to normal {}", database::max_count_concurrent_reads, database::max_count_system_concurrent_reads);
}
future<>
database::stop() {
assert(!_large_data_handler->running());
@@ -2051,8 +2071,9 @@ flat_mutation_reader make_multishard_streaming_reader(distributed<database>& db,
std::move(trace_state), fwd_mr);
});
auto&& full_slice = schema->full_slice();
return make_flat_multi_range_reader(std::move(schema), std::move(ms), std::move(range_generator), std::move(full_slice),
service::get_local_streaming_read_priority(), {}, mutation_reader::forwarding::no);
auto& cf = db.local().find_column_family(schema);
return make_flat_multi_range_reader(std::move(schema), cf.streaming_read_concurrency_semaphore().make_permit(), std::move(ms),
std::move(range_generator), std::move(full_slice), service::get_local_streaming_read_priority(), {}, mutation_reader::forwarding::no);
}
std::ostream& operator<<(std::ostream& os, gc_clock::time_point tp) {

@@ -93,6 +93,7 @@
#include "utils/disk-error-handler.hh"
#include "utils/updateable_value.hh"
#include "user_types_metadata.hh"
#include "query_class_config.hh"
class cell_locker;
class cell_locker_stats;
@@ -375,8 +376,8 @@ public:
bool enable_dangerous_direct_import_of_cassandra_counters = false;
::dirty_memory_manager* dirty_memory_manager = &default_dirty_memory_manager;
::dirty_memory_manager* streaming_dirty_memory_manager = &default_dirty_memory_manager;
reader_concurrency_semaphore* read_concurrency_semaphore;
reader_concurrency_semaphore* streaming_read_concurrency_semaphore;
reader_concurrency_semaphore* compaction_concurrency_semaphore;
::cf_stats* cf_stats = nullptr;
seastar::scheduling_group memtable_scheduling_group;
seastar::scheduling_group memtable_to_cache_scheduling_group;
@@ -389,7 +390,6 @@ public:
db::timeout_semaphore* view_update_concurrency_semaphore;
size_t view_update_concurrency_semaphore_limit;
db::data_listeners* data_listeners = nullptr;
utils::updateable_value<uint64_t> max_memory_for_unlimited_query;
};
struct no_commitlog {};
@@ -641,6 +641,7 @@ private:
// The 'range' parameter must be live as long as the reader is used.
// Mutations returned by the reader will all have given schema.
flat_mutation_reader make_sstable_reader(schema_ptr schema,
reader_permit permit,
lw_shared_ptr<sstables::sstable_set> sstables,
const dht::partition_range& range,
const query::partition_slice& slice,
@@ -707,6 +708,7 @@ public:
// If I/O needs to be issued to read anything in the specified range, the operations
// will be scheduled under the priority class given by pc.
flat_mutation_reader make_reader(schema_ptr schema,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc = default_priority_class(),
@@ -714,6 +716,7 @@ public:
streamed_mutation::forwarding fwd = streamed_mutation::forwarding::no,
mutation_reader::forwarding fwd_mr = mutation_reader::forwarding::yes) const;
flat_mutation_reader make_reader_excluding_sstables(schema_ptr schema,
reader_permit permit,
std::vector<sstables::shared_sstable>& sst,
const dht::partition_range& range,
const query::partition_slice& slice,
@@ -722,9 +725,9 @@ public:
streamed_mutation::forwarding fwd = streamed_mutation::forwarding::no,
mutation_reader::forwarding fwd_mr = mutation_reader::forwarding::yes) const;
flat_mutation_reader make_reader(schema_ptr schema, const dht::partition_range& range = query::full_partition_range) const {
flat_mutation_reader make_reader(schema_ptr schema, reader_permit permit, const dht::partition_range& range = query::full_partition_range) const {
auto& full_slice = schema->full_slice();
return make_reader(std::move(schema), range, full_slice);
return make_reader(std::move(schema), std::move(permit), range, full_slice);
}
// The streaming mutation reader differs from the regular mutation reader in that:
@@ -786,9 +789,9 @@ public:
const schema_ptr& schema() const { return _schema; }
void set_schema(schema_ptr);
db::commitlog* commitlog() { return _commitlog; }
future<const_mutation_partition_ptr> find_partition(schema_ptr, const dht::decorated_key& key) const;
future<const_mutation_partition_ptr> find_partition_slow(schema_ptr, const partition_key& key) const;
future<const_row_ptr> find_row(schema_ptr, const dht::decorated_key& partition_key, clustering_key clustering_key) const;
future<const_mutation_partition_ptr> find_partition(schema_ptr, reader_permit permit, const dht::decorated_key& key) const;
future<const_mutation_partition_ptr> find_partition_slow(schema_ptr, reader_permit permit, const partition_key& key) const;
future<const_row_ptr> find_row(schema_ptr, reader_permit permit, const dht::decorated_key& partition_key, clustering_key clustering_key) const;
// Applies given mutation to this column family
// The mutation is always upgraded to current schema.
void apply(const frozen_mutation& m, const schema_ptr& m_schema, db::rp_handle&& = {});
@@ -798,6 +801,7 @@ public:
// Returns at most "cmd.limit" rows
future<lw_shared_ptr<query::result>> query(schema_ptr,
const query::read_command& cmd,
query_class_config class_config,
query::result_options opts,
const dht::partition_range_vector& ranges,
tracing::trace_state_ptr trace_state,
@@ -996,8 +1000,10 @@ public:
void remove_view(view_ptr v);
void clear_views();
const std::vector<view_ptr>& views() const;
future<row_locker::lock_holder> push_view_replica_updates(const schema_ptr& s, const frozen_mutation& fm, db::timeout_clock::time_point timeout, tracing::trace_state_ptr tr_state) const;
future<row_locker::lock_holder> push_view_replica_updates(const schema_ptr& s, mutation&& m, db::timeout_clock::time_point timeout, tracing::trace_state_ptr tr_state) const;
future<row_locker::lock_holder> push_view_replica_updates(const schema_ptr& s, const frozen_mutation& fm, db::timeout_clock::time_point timeout,
tracing::trace_state_ptr tr_state, reader_concurrency_semaphore& sem) const;
future<row_locker::lock_holder> push_view_replica_updates(const schema_ptr& s, mutation&& m, db::timeout_clock::time_point timeout,
tracing::trace_state_ptr tr_state, reader_concurrency_semaphore& sem) const;
future<row_locker::lock_holder>
stream_view_replica_updates(const schema_ptr& s, mutation&& m, db::timeout_clock::time_point timeout,
std::vector<sstables::shared_sstable>& excluded_sstables) const;
@@ -1020,17 +1026,17 @@ public:
flat_mutation_reader&&,
gc_clock::time_point);
reader_concurrency_semaphore& read_concurrency_semaphore() {
return *_config.read_concurrency_semaphore;
}
reader_concurrency_semaphore& streaming_read_concurrency_semaphore() {
return *_config.streaming_read_concurrency_semaphore;
}
reader_concurrency_semaphore& compaction_concurrency_semaphore() {
return *_config.compaction_concurrency_semaphore;
}
private:
future<row_locker::lock_holder> do_push_view_replica_updates(const schema_ptr& s, mutation&& m, db::timeout_clock::time_point timeout, mutation_source&& source,
tracing::trace_state_ptr tr_state, const io_priority_class& io_priority, query::partition_slice::option_set custom_opts) const;
tracing::trace_state_ptr tr_state, reader_concurrency_semaphore& sem, const io_priority_class& io_priority, query::partition_slice::option_set custom_opts) const;
std::vector<view_ptr> affected_views(const schema_ptr& base, const mutation& update, gc_clock::time_point now) const;
future<> generate_and_propagate_view_updates(const schema_ptr& base,
std::vector<view_ptr>&& views,
@@ -1085,7 +1091,7 @@ private:
public:
// Iterate over all partitions. Protocol is the same as std::all_of(),
// so that iteration can be stopped by returning false.
future<bool> for_all_partitions_slow(schema_ptr, std::function<bool (const dht::decorated_key&, const mutation_partition&)> func) const;
future<bool> for_all_partitions_slow(schema_ptr, reader_permit permit, std::function<bool (const dht::decorated_key&, const mutation_partition&)> func) const;
friend std::ostream& operator<<(std::ostream& out, const column_family& cf);
// Testing purposes.
@@ -1195,8 +1201,8 @@ public:
bool enable_dangerous_direct_import_of_cassandra_counters = false;
::dirty_memory_manager* dirty_memory_manager = &default_dirty_memory_manager;
::dirty_memory_manager* streaming_dirty_memory_manager = &default_dirty_memory_manager;
reader_concurrency_semaphore* read_concurrency_semaphore;
reader_concurrency_semaphore* streaming_read_concurrency_semaphore;
reader_concurrency_semaphore* compaction_concurrency_semaphore;
::cf_stats* cf_stats = nullptr;
seastar::scheduling_group memtable_scheduling_group;
seastar::scheduling_group memtable_to_cache_scheduling_group;
@@ -1340,6 +1346,7 @@ private:
reader_concurrency_semaphore _read_concurrency_sem;
reader_concurrency_semaphore _streaming_concurrency_sem;
reader_concurrency_semaphore _compaction_concurrency_sem;
reader_concurrency_semaphore _system_read_concurrency_sem;
named_semaphore _sstable_load_concurrency_sem{max_concurrent_sstable_loads(), named_semaphore_exception_factory{"sstable load concurrency"}};
@@ -1352,6 +1359,7 @@ private:
column_family*,
schema_ptr,
const query::read_command&,
query_class_config,
query::result_options,
const dht::partition_range_vector&,
tracing::trace_state_ptr,
@@ -1504,6 +1512,12 @@ public:
sstring get_available_index_name(const sstring& ks_name, const sstring& cf_name,
std::optional<sstring> index_name_root) const;
schema_ptr find_indexed_table(const sstring& ks_name, const sstring& index_name) const;
/// Revert the system read concurrency to the normal value.
///
/// When started the database uses a higher initial concurrency for system
/// reads, to speed up startup. After startup this should be reverted to
/// the normal concurrency.
void revert_initial_system_read_concurrency_boost();
future<> stop();
future<> close_tables(table_kind kind_to_close);
@@ -1643,6 +1657,8 @@ public:
bool supports_infinite_bound_range_deletions() {
return _supports_infinite_bound_range_deletions;
}
query_class_config make_query_class_config();
};
future<> start_large_data_handler(sharded<database>& db);

@@ -1910,8 +1910,7 @@ void make(database& db, bool durable, bool volatile_testing_only) {
kscfg.enable_disk_writes = !volatile_testing_only;
kscfg.enable_commitlog = !volatile_testing_only;
kscfg.enable_cache = true;
// don't make system keyspace reads wait for user reads
kscfg.read_concurrency_semaphore = &db._system_read_concurrency_sem;
kscfg.compaction_concurrency_semaphore = &db._compaction_concurrency_sem;
// don't make system keyspace writes wait for user writes (if under pressure)
kscfg.dirty_memory_manager = &db._system_dirty_memory_manager;
keyspace _ks{ksm, std::move(kscfg)};

@@ -64,6 +64,7 @@ class build_progress_virtual_reader {
build_progress_reader(
schema_ptr legacy_schema,
reader_permit permit,
column_family& scylla_views_build_progress,
const dht::partition_range& range,
const query::partition_slice& slice,
@@ -80,6 +81,7 @@ class build_progress_virtual_reader {
, _slice(adjust_partition_slice())
, _underlying(scylla_views_build_progress.make_reader(
scylla_views_build_progress.schema(),
std::move(permit),
range,
slice,
pc,
@@ -188,7 +190,7 @@ public:
flat_mutation_reader operator()(
schema_ptr s,
reader_permit,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
@@ -197,6 +199,7 @@ public:
mutation_reader::forwarding fwd_mr) {
return flat_mutation_reader(std::make_unique<build_progress_reader>(
std::move(s),
std::move(permit),
_db.find_column_family(s->ks_name(), system_keyspace::v3::SCYLLA_VIEWS_BUILDS_IN_PROGRESS),
range,
slice,

@@ -1295,9 +1295,10 @@ view_builder::build_step& view_builder::get_or_create_build_step(utils::UUID bas
void view_builder::initialize_reader_at_current_token(build_step& step) {
step.pslice = make_partition_slice(*step.base->schema());
step.prange = dht::partition_range(dht::ring_position::starting_at(step.current_token()), dht::ring_position::max());
auto permit = _db.make_query_class_config().semaphore.make_permit();
step.reader = make_local_shard_sstable_reader(
step.base->schema(),
no_reader_permit(),
std::move(permit),
make_lw_shared(sstables::sstable_set(step.base->get_sstable_set())),
step.prange,
step.pslice,

@@ -54,7 +54,7 @@ future<> view_update_generator::start() {
}
flat_mutation_reader staging_sstable_reader = ::make_range_sstable_reader(s,
no_reader_permit(),
_db.make_query_class_config().semaphore.make_permit(),
std::move(ssts),
query::full_partition_range,
s->full_slice(),

@@ -597,6 +597,7 @@ flat_mutation_reader_from_mutations(std::vector<mutation> mutations, const dht::
/// Delays the creation of the underlying reader until it is first
/// fast-forwarded and thus a range is available.
class forwardable_empty_mutation_reader : public flat_mutation_reader::impl {
reader_permit _permit;
mutation_source _source;
const query::partition_slice& _slice;
const io_priority_class& _pc;
@@ -604,11 +605,13 @@ class forwardable_empty_mutation_reader : public flat_mutation_reader::impl {
flat_mutation_reader_opt _reader;
public:
forwardable_empty_mutation_reader(schema_ptr s,
reader_permit permit,
mutation_source source,
const query::partition_slice& slice,
const io_priority_class& pc,
tracing::trace_state_ptr trace_state)
: impl(s)
, _permit(std::move(permit))
, _source(std::move(source))
, _slice(slice)
, _pc(pc)
@@ -632,7 +635,7 @@ public:
}
virtual future<> fast_forward_to(const dht::partition_range& pr, db::timeout_clock::time_point timeout) override {
if (!_reader) {
_reader = _source.make_reader(_schema, no_reader_permit(), pr, _slice, _pc, std::move(_trace_state), streamed_mutation::forwarding::no,
_reader = _source.make_reader(_schema, _permit, pr, _slice, _pc, std::move(_trace_state), streamed_mutation::forwarding::no,
mutation_reader::forwarding::yes);
_end_of_stream = false;
return make_ready_future<>();
@@ -674,6 +677,7 @@ class flat_multi_range_mutation_reader : public flat_mutation_reader::impl {
public:
flat_multi_range_mutation_reader(
schema_ptr s,
reader_permit permit,
mutation_source source,
const dht::partition_range& first_range,
Generator generator,
@@ -682,7 +686,7 @@ public:
tracing::trace_state_ptr trace_state)
: impl(s)
, _generator(std::move(generator))
, _reader(source.make_reader(s, no_reader_permit(), first_range, slice, pc, trace_state, streamed_mutation::forwarding::no, mutation_reader::forwarding::yes))
, _reader(source.make_reader(s, std::move(permit), first_range, slice, pc, trace_state, streamed_mutation::forwarding::no, mutation_reader::forwarding::yes))
{
}
@@ -729,7 +733,7 @@ public:
};
flat_mutation_reader
make_flat_multi_range_reader(schema_ptr s, mutation_source source, const dht::partition_range_vector& ranges,
make_flat_multi_range_reader(schema_ptr s, reader_permit permit, mutation_source source, const dht::partition_range_vector& ranges,
const query::partition_slice& slice, const io_priority_class& pc,
tracing::trace_state_ptr trace_state,
mutation_reader::forwarding fwd_mr)
@@ -751,14 +755,15 @@ make_flat_multi_range_reader(schema_ptr s, mutation_source source, const dht::pa
if (ranges.empty()) {
if (fwd_mr) {
return make_flat_mutation_reader<forwardable_empty_mutation_reader>(std::move(s), std::move(source), slice, pc, std::move(trace_state));
return make_flat_mutation_reader<forwardable_empty_mutation_reader>(std::move(s), std::move(permit), std::move(source), slice, pc,
std::move(trace_state));
} else {
return make_empty_flat_reader(std::move(s));
}
} else if (ranges.size() == 1) {
return source.make_reader(std::move(s), no_reader_permit(), ranges.front(), slice, pc, std::move(trace_state), streamed_mutation::forwarding::no, fwd_mr);
return source.make_reader(std::move(s), std::move(permit), ranges.front(), slice, pc, std::move(trace_state), streamed_mutation::forwarding::no, fwd_mr);
} else {
return make_flat_mutation_reader<flat_multi_range_mutation_reader<adapter>>(std::move(s), std::move(source),
return make_flat_mutation_reader<flat_multi_range_mutation_reader<adapter>>(std::move(s), std::move(permit), std::move(source),
ranges.front(), adapter(std::next(ranges.cbegin()), ranges.cend()), slice, pc, std::move(trace_state));
}
}
@@ -766,6 +771,7 @@ make_flat_multi_range_reader(schema_ptr s, mutation_source source, const dht::pa
flat_mutation_reader
make_flat_multi_range_reader(
schema_ptr s,
reader_permit permit,
mutation_source source,
std::function<std::optional<dht::partition_range>()> generator,
const query::partition_slice& slice,
@@ -798,12 +804,12 @@ make_flat_multi_range_reader(
auto* first_range = adapted_generator();
if (!first_range) {
if (fwd_mr) {
return make_flat_mutation_reader<forwardable_empty_mutation_reader>(std::move(s), std::move(source), slice, pc, std::move(trace_state));
return make_flat_mutation_reader<forwardable_empty_mutation_reader>(std::move(s), std::move(permit), std::move(source), slice, pc, std::move(trace_state));
} else {
return make_empty_flat_reader(std::move(s));
}
} else {
return make_flat_mutation_reader<flat_multi_range_mutation_reader<adapter>>(std::move(s), std::move(source),
return make_flat_mutation_reader<flat_multi_range_mutation_reader<adapter>>(std::move(s), std::move(permit), std::move(source),
*first_range, std::move(adapted_generator), slice, pc, std::move(trace_state));
}
}

@@ -39,6 +39,7 @@
using seastar::future;
class mutation_source;
class reader_permit;
GCC6_CONCEPT(
template<typename Consumer>
@@ -677,7 +678,7 @@ flat_mutation_reader_from_mutations(std::vector<mutation> ms,
/// ranges. Otherwise the reader is created with
/// mutation_reader::forwarding::yes.
flat_mutation_reader
make_flat_multi_range_reader(schema_ptr s, mutation_source source, const dht::partition_range_vector& ranges,
make_flat_multi_range_reader(schema_ptr s, reader_permit permit, mutation_source source, const dht::partition_range_vector& ranges,
const query::partition_slice& slice, const io_priority_class& pc = default_priority_class(),
tracing::trace_state_ptr trace_state = nullptr,
flat_mutation_reader::partition_range_forwarding fwd_mr = flat_mutation_reader::partition_range_forwarding::yes);
@@ -689,6 +690,7 @@ make_flat_multi_range_reader(schema_ptr s, mutation_source source, const dht::pa
flat_mutation_reader
make_flat_multi_range_reader(
schema_ptr s,
reader_permit permit,
mutation_source source,
std::function<std::optional<dht::partition_range>()> generator,
const query::partition_slice& slice,

@@ -46,6 +46,7 @@ class built_indexes_virtual_reader {
built_indexes_reader(
database& db,
schema_ptr schema,
reader_permit permit,
column_family& built_views,
const dht::partition_range& range,
const query::partition_slice& slice,
@@ -57,6 +58,7 @@ class built_indexes_virtual_reader {
, _db(db)
, _underlying(built_views.make_reader(
built_views.schema(),
std::move(permit),
range,
slice,
pc,
@@ -118,7 +120,7 @@ public:
flat_mutation_reader operator()(
schema_ptr s,
reader_permit,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
@@ -128,6 +130,7 @@ public:
return make_flat_mutation_reader<built_indexes_reader>(
_db,
std::move(s),
std::move(permit),
_db.find_column_family(s->ks_name(), system_keyspace::v3::BUILT_VIEWS),
range,
slice,

@@ -111,7 +111,7 @@ void init_ms_fd_gossiper(sharded<gms::gossiper>& gossiper
// Delay listening messaging_service until gossip message handlers are registered
netw::messaging_service::memory_config mcfg = { std::max<size_t>(0.08 * available_memory, 1'000'000) };
netw::messaging_service::scheduling_config scfg;
scfg.statement = scheduling_config.statement;
scfg.statement_tenants = { {scheduling_config.statement, "$user"}, {default_scheduling_group(), "$system"} };
scfg.streaming = scheduling_config.streaming;
scfg.gossip = scheduling_config.gossip;
netw::get_messaging_service().start(listen, storage_port, ew, cw, tndw, ssl_storage_port, creds, mcfg, scfg, sltba).get();

@@ -1082,6 +1082,10 @@ int main(int ac, char** av) {
// Truncate `clients' CF - this table should not persist between server restarts.
clear_clientlist().get();
db.invoke_on_all([] (database& db) {
db.revert_initial_system_read_concurrency_boost();
}).get();
if (cfg->start_native_transport()) {
supervisor::notify("starting native transport");
with_scheduling_group(dbcfg.statement_scheduling_group, [] {

@@ -341,12 +341,13 @@ protected:
return {};
}
flat_mutation_reader delegate_reader(const dht::partition_range& delegate,
flat_mutation_reader delegate_reader(reader_permit permit,
const dht::partition_range& delegate,
const query::partition_slice& slice,
const io_priority_class& pc,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr) {
auto ret = _memtable->_underlying->make_reader(_schema, no_reader_permit(), delegate, slice, pc, nullptr, fwd, fwd_mr);
auto ret = _memtable->_underlying->make_reader(_schema, std::move(permit), delegate, slice, pc, nullptr, fwd, fwd_mr);
_memtable = {};
_last = {};
return ret;
@@ -361,6 +362,7 @@ protected:
class scanning_reader final : public flat_mutation_reader::impl, private iterator_reader {
std::optional<dht::partition_range> _delegate_range;
std::optional<flat_mutation_reader> _delegate;
std::optional<reader_permit> _permit;
const io_priority_class& _pc;
const query::partition_slice& _slice;
mutation_reader::forwarding _fwd_mr;
@@ -389,12 +391,14 @@ class scanning_reader final : public flat_mutation_reader::impl, private iterato
public:
scanning_reader(schema_ptr s,
lw_shared_ptr<memtable> m,
std::optional<reader_permit> permit, // not needed when used for flushing
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
mutation_reader::forwarding fwd_mr)
: impl(s)
, iterator_reader(s, std::move(m), range)
, _permit(std::move(permit))
, _pc(pc)
, _slice(slice)
, _fwd_mr(fwd_mr)
@@ -405,7 +409,8 @@ public:
if (!_delegate) {
_delegate_range = get_delegate_range();
if (_delegate_range) {
_delegate = delegate_reader(*_delegate_range, _slice, _pc, streamed_mutation::forwarding::no, _fwd_mr);
assert(_permit);
_delegate = delegate_reader(*_permit, *_delegate_range, _slice, _pc, streamed_mutation::forwarding::no, _fwd_mr);
} else {
auto key_and_snp = read_section()(region(), [&] {
return with_linearized_managed_bytes([&] () -> std::optional<std::pair<dht::decorated_key, partition_snapshot_ptr>> {
@@ -636,6 +641,7 @@ partition_snapshot_ptr memtable_entry::snapshot(memtable& mtbl) {
flat_mutation_reader
memtable::make_flat_reader(schema_ptr s,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
@@ -666,7 +672,7 @@ memtable::make_flat_reader(schema_ptr s,
rd.upgrade_schema(s);
return rd;
} else {
auto res = make_flat_mutation_reader<scanning_reader>(std::move(s), shared_from_this(), range, slice, pc, fwd_mr);
auto res = make_flat_mutation_reader<scanning_reader>(std::move(s), shared_from_this(), std::move(permit), range, slice, pc, fwd_mr);
if (fwd == streamed_mutation::forwarding::yes) {
return make_forwardable(std::move(res));
} else {
@@ -681,7 +687,7 @@ memtable::make_flush_reader(schema_ptr s, const io_priority_class& pc) {
return make_flat_mutation_reader<flush_reader>(s, shared_from_this());
} else {
auto& full_slice = s->full_slice();
return make_flat_mutation_reader<scanning_reader>(std::move(s), shared_from_this(),
return make_flat_mutation_reader<scanning_reader>(std::move(s), shared_from_this(), std::nullopt,
query::full_partition_range, full_slice, pc, mutation_reader::forwarding::no);
}
}
@@ -696,8 +702,8 @@ memtable::update(db::rp_handle&& h) {
}
future<>
memtable::apply(memtable& mt) {
return do_with(mt.make_flat_reader(_schema), [this] (auto&& rd) mutable {
memtable::apply(memtable& mt, reader_permit permit) {
return do_with(mt.make_flat_reader(_schema, std::move(permit)), [this] (auto&& rd) mutable {
return consume_partitions(rd, [self = this->shared_from_this(), &rd] (mutation&& m) {
self->apply(m);
return stop_iteration::no;
@@ -749,7 +755,7 @@ mutation_source memtable::as_data_source() {
tracing::trace_state_ptr trace_state,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr) {
return mt->make_flat_reader(std::move(s), range, slice, pc, std::move(trace_state), fwd, fwd_mr);
return mt->make_flat_reader(std::move(s), std::move(permit), range, slice, pc, std::move(trace_state), fwd, fwd_mr);
});
}

View File

@@ -197,7 +197,7 @@ public:
future<> clear_gently() noexcept;
schema_ptr schema() const { return _schema; }
void set_schema(schema_ptr) noexcept;
future<> apply(memtable&);
future<> apply(memtable&, reader_permit);
// Applies mutation to this memtable.
// The mutation is upgraded to current schema.
void apply(const mutation& m, db::rp_handle&& = {});
@@ -248,6 +248,7 @@ public:
//
// Mutations returned by the reader will all have given schema.
flat_mutation_reader make_flat_reader(schema_ptr,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc = default_priority_class(),
@@ -256,9 +257,10 @@ public:
mutation_reader::forwarding fwd_mr = mutation_reader::forwarding::yes);
flat_mutation_reader make_flat_reader(schema_ptr s,
reader_permit permit,
const dht::partition_range& range = query::full_partition_range) {
auto& full_slice = s->full_slice();
return make_flat_reader(s, range, full_slice);
return make_flat_reader(s, std::move(permit), range, full_slice);
}
flat_mutation_reader make_flush_reader(schema_ptr, const io_priority_class& pc);

View File

@@ -279,7 +279,7 @@ future<> messaging_service::unregister_handler(messaging_verb verb) {
messaging_service::messaging_service(gms::inet_address ip, uint16_t port)
: messaging_service(std::move(ip), port, encrypt_what::none, compress_what::none, tcp_nodelay_what::all, 0, nullptr, memory_config{1'000'000},
scheduling_config{}, false)
scheduling_config{{{{}, "$default"}}, {}, {}}, false)
{}
static
@@ -321,6 +321,11 @@ void messaging_service::do_start_listen() {
// local or remote datacenter, and whether or not the connection will be used for gossip. We can fix
// the first by wrapping its server_socket, but not the second.
auto limits = rpc_resource_limits(_mcfg.rpc_memory_limit);
limits.isolate_connection = [this] (sstring isolation_cookie) {
rpc::isolation_config cfg;
cfg.sched_group = scheduling_group_for_isolation_cookie(isolation_cookie);
return cfg;
};
if (!_server[0]) {
auto listen = [&] (const gms::inet_address& a, rpc::streaming_domain_type sdomain) {
so.streaming_domain = sdomain;
@@ -386,8 +391,10 @@ messaging_service::messaging_service(gms::inet_address ip
, _should_listen_to_broadcast_address(sltba)
, _rpc(new rpc_protocol_wrapper(serializer { }))
, _credentials_builder(credentials ? std::make_unique<seastar::tls::credentials_builder>(*credentials) : nullptr)
, _clients(2 + scfg.statement_tenants.size() * 2)
, _mcfg(mcfg)
, _scheduling_config(scfg)
, _scheduling_info_for_connection_index(initial_scheduling_info())
{
_rpc->set_logger(&rpc_logger);
register_handler(this, messaging_verb::CLIENT_ID, [] (rpc::client_info& ci, gms::inet_address broadcast_address, uint32_t src_cpu_id, rpc::optional<uint64_t> max_result_size) {
@@ -396,6 +403,11 @@ messaging_service::messaging_service(gms::inet_address ip
ci.attach_auxiliary("max_result_size", max_result_size.value_or(query::result_memory_limiter::maximum_result_size));
return rpc::no_wait;
});
_connection_index_for_tenant.reserve(_scheduling_config.statement_tenants.size());
for (unsigned i = 0; i < _scheduling_config.statement_tenants.size(); ++i) {
_connection_index_for_tenant.push_back({_scheduling_config.statement_tenants[i].sched_group, i});
}
}
msg_addr messaging_service::get_source(const rpc::client_info& cinfo) {
@@ -449,24 +461,6 @@ rpc::no_wait_type messaging_service::no_wait() {
static constexpr unsigned do_get_rpc_client_idx(messaging_verb verb) {
switch (verb) {
case messaging_verb::CLIENT_ID:
case messaging_verb::MUTATION:
case messaging_verb::READ_DATA:
case messaging_verb::READ_MUTATION_DATA:
case messaging_verb::READ_DIGEST:
case messaging_verb::GOSSIP_DIGEST_ACK:
case messaging_verb::DEFINITIONS_UPDATE:
case messaging_verb::TRUNCATE:
case messaging_verb::MIGRATION_REQUEST:
case messaging_verb::SCHEMA_CHECK:
case messaging_verb::COUNTER_MUTATION:
// Use the same RPC client for light weight transaction
// protocol steps as for standard mutations and read requests.
case messaging_verb::PAXOS_PREPARE:
case messaging_verb::PAXOS_ACCEPT:
case messaging_verb::PAXOS_LEARN:
case messaging_verb::PAXOS_PRUNE:
return 0;
// GET_SCHEMA_VERSION is sent from read/mutate verbs so should be
// sent on a different connection to avoid potential deadlocks
// as well as reduce latency as there are potentially many requests
@@ -476,7 +470,7 @@ static constexpr unsigned do_get_rpc_client_idx(messaging_verb verb) {
case messaging_verb::GOSSIP_SHUTDOWN:
case messaging_verb::GOSSIP_ECHO:
case messaging_verb::GET_SCHEMA_VERSION:
return 1;
return 0;
case messaging_verb::PREPARE_MESSAGE:
case messaging_verb::PREPARE_DONE_MESSAGE:
case messaging_verb::STREAM_MUTATION:
@@ -499,6 +493,24 @@ static constexpr unsigned do_get_rpc_client_idx(messaging_verb verb) {
case messaging_verb::REPAIR_PUT_ROW_DIFF_WITH_RPC_STREAM:
case messaging_verb::REPAIR_GET_FULL_ROW_HASHES_WITH_RPC_STREAM:
case messaging_verb::HINT_MUTATION:
return 1;
case messaging_verb::CLIENT_ID:
case messaging_verb::MUTATION:
case messaging_verb::READ_DATA:
case messaging_verb::READ_MUTATION_DATA:
case messaging_verb::READ_DIGEST:
case messaging_verb::GOSSIP_DIGEST_ACK:
case messaging_verb::DEFINITIONS_UPDATE:
case messaging_verb::TRUNCATE:
case messaging_verb::MIGRATION_REQUEST:
case messaging_verb::SCHEMA_CHECK:
case messaging_verb::COUNTER_MUTATION:
// Use the same RPC client for light weight transaction
// protocol steps as for standard mutations and read requests.
case messaging_verb::PAXOS_PREPARE:
case messaging_verb::PAXOS_ACCEPT:
case messaging_verb::PAXOS_LEARN:
case messaging_verb::PAXOS_PRUNE:
return 2;
case messaging_verb::MUTATION_DONE:
case messaging_verb::MUTATION_FAILED:
@@ -518,21 +530,62 @@ static constexpr std::array<uint8_t, static_cast<size_t>(messaging_verb::LAST)>
static std::array<uint8_t, static_cast<size_t>(messaging_verb::LAST)> s_rpc_client_idx_table = make_rpc_client_idx_table();
static unsigned get_rpc_client_idx(messaging_verb verb) {
return s_rpc_client_idx_table[static_cast<size_t>(verb)];
unsigned
messaging_service::get_rpc_client_idx(messaging_verb verb) const {
auto idx = s_rpc_client_idx_table[static_cast<size_t>(verb)];
if (idx < 2) {
return idx;
}
// A statement or statement-ack verb
const auto curr_sched_group = current_scheduling_group();
for (unsigned i = 0; i < _connection_index_for_tenant.size(); ++i) {
if (_connection_index_for_tenant[i].sched_group == curr_sched_group) {
// i == 0: the default tenant maps to the default client indexes of 2 and 3.
idx += i * 2;
break;
}
}
return idx;
}
std::vector<messaging_service::scheduling_info_for_connection_index>
messaging_service::initial_scheduling_info() const {
if (_scheduling_config.statement_tenants.empty()) {
throw std::runtime_error("messaging_service::initial_scheduling_info(): must have at least one tenant configured");
}
auto sched_infos = std::vector<scheduling_info_for_connection_index>({
{ _scheduling_config.gossip, "gossip" },
{ _scheduling_config.streaming, "streaming" },
});
sched_infos.reserve(sched_infos.size() + _scheduling_config.statement_tenants.size() * 2);
for (const auto& tenant : _scheduling_config.statement_tenants) {
sched_infos.push_back({ tenant.sched_group, "statement:" + tenant.name });
sched_infos.push_back({ tenant.sched_group, "statement-ack:" + tenant.name });
}
return sched_infos;
}
scheduling_group
messaging_service::scheduling_group_for_verb(messaging_verb verb) const {
static const scheduling_group scheduling_config::*idx_to_group[] = {
&scheduling_config::statement,
&scheduling_config::gossip,
&scheduling_config::streaming,
&scheduling_config::statement,
};
return _scheduling_config.*(idx_to_group[get_rpc_client_idx(verb)]);
return _scheduling_info_for_connection_index[get_rpc_client_idx(verb)].sched_group;
}
scheduling_group
messaging_service::scheduling_group_for_isolation_cookie(const sstring& isolation_cookie) const {
// Once per connection, so a loop is fine.
for (auto&& info : _scheduling_info_for_connection_index) {
if (info.isolation_cookie == isolation_cookie) {
return info.sched_group;
}
}
// Client is using a new connection class that the server doesn't recognize yet.
// Assume it's important; after a server upgrade we'll recognize it.
return default_scheduling_group();
}
/**
* Get an IP for a given endpoint to connect to
*
@@ -643,6 +696,10 @@ shared_ptr<messaging_service::rpc_protocol_client_wrapper> messaging_service::ge
}
opts.tcp_nodelay = must_tcp_nodelay;
opts.reuseaddr = true;
// We send cookies only for non-default statement tenant clients.
if (idx > 3) {
opts.isolation_cookie = _scheduling_info_for_connection_index[idx].isolation_cookie;
}
auto client = must_encrypt ?
::make_shared<rpc_protocol_client_wrapper>(*_rpc, std::move(opts),


View File

@@ -219,11 +219,30 @@ public:
};
struct scheduling_config {
scheduling_group statement;
struct tenant {
scheduling_group sched_group;
sstring name;
};
// Must have at least one element. No two tenants may have the same
// scheduling group. [0] is the default tenant, which all unknown
// scheduling groups fall back to. The default tenant should use the
// statement scheduling group, for backward compatibility. In fact, any
// other scheduling group would be dropped, as the default tenant does
// not transfer its scheduling group across the wire.
std::vector<tenant> statement_tenants;
scheduling_group streaming;
scheduling_group gossip;
};
private:
struct scheduling_info_for_connection_index {
scheduling_group sched_group;
sstring isolation_cookie;
};
struct tenant_connection_index {
scheduling_group sched_group;
unsigned client_idx;
};
private:
gms::inet_address _listen_address;
uint16_t _port;
@@ -239,12 +258,14 @@ private:
::shared_ptr<seastar::tls::server_credentials> _credentials;
std::unique_ptr<seastar::tls::credentials_builder> _credentials_builder;
std::array<std::unique_ptr<rpc_protocol_server_wrapper>, 2> _server_tls;
std::array<clients_map, 4> _clients;
std::vector<clients_map> _clients;
uint64_t _dropped_messages[static_cast<int32_t>(messaging_verb::LAST)] = {};
bool _stopping = false;
std::list<std::function<void(gms::inet_address ep)>> _connection_drop_notifiers;
memory_config _mcfg;
scheduling_config _scheduling_config;
std::vector<scheduling_info_for_connection_index> _scheduling_info_for_connection_index;
std::vector<tenant_connection_index> _connection_index_for_tenant;
public:
using clock_type = lowres_clock;
public:
@@ -524,6 +545,9 @@ public:
std::unique_ptr<rpc_protocol_wrapper>& rpc();
static msg_addr get_source(const rpc::client_info& client);
scheduling_group scheduling_group_for_verb(messaging_verb verb) const;
scheduling_group scheduling_group_for_isolation_cookie(const sstring& isolation_cookie) const;
std::vector<messaging_service::scheduling_info_for_connection_index> initial_scheduling_info() const;
unsigned get_rpc_client_idx(messaging_verb verb) const;
};
extern distributed<messaging_service> _the_messaging_service;

View File

@@ -98,17 +98,21 @@ class read_context : public reader_lifecycle_policy {
struct reader_meta {
struct remote_parts {
reader_concurrency_semaphore& semaphore;
reader_permit permit;
std::unique_ptr<const dht::partition_range> range;
std::unique_ptr<const query::partition_slice> slice;
utils::phased_barrier::operation read_operation;
explicit remote_parts(reader_concurrency_semaphore& semaphore)
: permit(semaphore.make_permit()) {
}
remote_parts(
reader_concurrency_semaphore& semaphore,
reader_permit permit,
std::unique_ptr<const dht::partition_range> range = nullptr,
std::unique_ptr<const query::partition_slice> slice = nullptr,
utils::phased_barrier::operation read_operation = {})
: semaphore(semaphore)
: permit(std::move(permit))
, range(std::move(range))
, slice(std::move(slice))
, read_operation(std::move(read_operation)) {
@@ -236,7 +240,7 @@ public:
virtual void destroy_reader(shard_id shard, future<stopped_reader> reader_fut) noexcept override;
virtual reader_concurrency_semaphore& semaphore() override {
return _readers[this_shard_id()].rparts->semaphore;
return _readers[this_shard_id()].rparts->permit.semaphore();
}
future<> lookup_readers();
@@ -290,9 +294,10 @@ flat_mutation_reader read_context::create_reader(
}
auto& table = _db.local().find_column_family(schema);
auto class_config = _db.local().make_query_class_config();
if (!rm.rparts) {
rm.rparts = make_foreign(std::make_unique<reader_meta::remote_parts>(table.read_concurrency_semaphore()));
rm.rparts = make_foreign(std::make_unique<reader_meta::remote_parts>(class_config.semaphore));
}
rm.rparts->range = std::make_unique<const dht::partition_range>(pr);
@@ -300,7 +305,7 @@ flat_mutation_reader read_context::create_reader(
rm.rparts->read_operation = table.read_in_progress();
rm.state = reader_state::used;
return table.as_mutation_source().make_reader(std::move(schema), no_reader_permit(), *rm.rparts->range, *rm.rparts->slice, pc,
return table.as_mutation_source().make_reader(std::move(schema), rm.rparts->permit, *rm.rparts->range, *rm.rparts->slice, pc,
std::move(trace_state));
}
@@ -343,10 +348,8 @@ future<> read_context::stop() {
for (shard_id shard = 0; shard != smp::count; ++shard) {
if (_readers[shard].state == reader_state::saving) {
// Move to the background.
(void)_db.invoke_on(shard, [schema = global_schema_ptr(_schema), rm = std::move(_readers[shard])] (database& db) mutable {
// We cannot use semaphore() here, as this can be already destroyed.
auto& table = db.find_column_family(schema);
table.read_concurrency_semaphore().unregister_inactive_read(std::move(*rm.handle));
(void)_db.invoke_on(shard, [rm = std::move(_readers[shard])] (database& db) mutable {
rm.rparts->permit.semaphore().unregister_inactive_read(std::move(*rm.handle));
});
}
}
@@ -445,7 +448,7 @@ future<> read_context::save_reader(shard_id shard, const dht::decorated_key& las
return _db.invoke_on(shard, [this, shard, query_uuid = _cmd.query_uuid, query_ranges = _ranges, rm = std::exchange(_readers[shard], {}),
&last_pkey, &last_ckey, gts = tracing::global_trace_state_ptr(_trace_state)] (database& db) mutable {
try {
flat_mutation_reader_opt reader = try_resume(rm.rparts->semaphore, std::move(*rm.handle));
flat_mutation_reader_opt reader = try_resume(rm.rparts->permit.semaphore(), std::move(*rm.handle));
if (!reader) {
return;
@@ -474,6 +477,7 @@ future<> read_context::save_reader(shard_id shard, const dht::decorated_key& las
std::move(rm.rparts->range),
std::move(rm.rparts->slice),
std::move(*reader),
std::move(rm.rparts->permit),
last_pkey,
last_ckey);
@@ -508,7 +512,7 @@ future<> read_context::lookup_readers() {
auto schema = gs.get();
auto querier_opt = db.get_querier_cache().lookup_shard_mutation_querier(cmd->query_uuid, *schema, *ranges, cmd->slice, gts.get());
auto& table = db.find_column_family(schema);
auto& semaphore = table.read_concurrency_semaphore();
auto& semaphore = db.make_query_class_config().semaphore;
if (!querier_opt) {
return reader_meta(reader_state::inexistent, reader_meta::remote_parts(semaphore));
@@ -518,7 +522,7 @@ future<> read_context::lookup_readers() {
auto handle = pause(semaphore, std::move(q).reader());
return reader_meta(
reader_state::successful_lookup,
reader_meta::remote_parts(semaphore, std::move(q).reader_range(), std::move(q).reader_slice(), table.read_in_progress()),
reader_meta::remote_parts(q.permit(), std::move(q).reader_range(), std::move(q).reader_slice(), table.read_in_progress()),
std::move(handle));
}).then([this, shard] (reader_meta rm) {
_readers[shard] = std::move(rm);
@@ -599,18 +603,18 @@ static future<reconcilable_result> do_query_mutations(
mutation_reader::forwarding fwd_mr) {
return make_multishard_combining_reader(ctx, std::move(s), pr, ps, pc, std::move(trace_state), fwd_mr);
});
auto reader = make_flat_multi_range_reader(s, std::move(ms), ranges, cmd.slice,
auto class_config = ctx->db().local().make_query_class_config();
auto reader = make_flat_multi_range_reader(s, class_config.semaphore.make_permit(), std::move(ms), ranges, cmd.slice,
service::get_local_sstable_query_read_priority(), trace_state, mutation_reader::forwarding::no);
auto compaction_state = make_lw_shared<compact_for_mutation_query_state>(*s, cmd.timestamp, cmd.slice, cmd.row_limit,
cmd.partition_limit);
return do_with(std::move(reader), std::move(compaction_state), [&, accounter = std::move(accounter), timeout] (
return do_with(std::move(reader), std::move(compaction_state), [&, class_config, accounter = std::move(accounter), timeout] (
flat_mutation_reader& reader, lw_shared_ptr<compact_for_mutation_query_state>& compaction_state) mutable {
auto rrb = reconcilable_result_builder(*reader.schema(), cmd.slice, std::move(accounter));
auto& table = ctx->db().local().find_column_family(reader.schema());
return query::consume_page(reader, compaction_state, cmd.slice, std::move(rrb), cmd.row_limit, cmd.partition_limit, cmd.timestamp,
timeout, table.get_config().max_memory_for_unlimited_query).then([&] (consume_result&& result) mutable {
timeout, class_config.max_memory_for_unlimited_query).then([&] (consume_result&& result) mutable {
return make_ready_future<page_consume_result>(page_consume_result(std::move(result), reader.detach_buffer(), std::move(compaction_state)));
});
}).then_wrapped([&ctx] (future<page_consume_result>&& result_fut) {

View File

@@ -2167,7 +2167,7 @@ future<> data_query(
gc_clock::time_point query_time,
query::result::builder& builder,
db::timeout_clock::time_point timeout,
uint64_t max_memory_reverse_query,
query_class_config class_config,
tracing::trace_state_ptr trace_ptr,
query::querier_cache_context cache_ctx)
{
@@ -2178,12 +2178,11 @@ future<> data_query(
auto querier_opt = cache_ctx.lookup_data_querier(*s, range, slice, trace_ptr);
auto q = querier_opt
? std::move(*querier_opt)
: query::data_querier(source, s, range, slice, service::get_local_sstable_query_read_priority(), trace_ptr);
: query::data_querier(source, s, class_config.semaphore.make_permit(), range, slice, service::get_local_sstable_query_read_priority(), trace_ptr);
return do_with(std::move(q), [=, &builder, trace_ptr = std::move(trace_ptr),
cache_ctx = std::move(cache_ctx)] (query::data_querier& q) mutable {
return do_with(std::move(q), [=, &builder, trace_ptr = std::move(trace_ptr), cache_ctx = std::move(cache_ctx)] (query::data_querier& q) mutable {
auto qrb = query_result_builder(*s, builder);
return q.consume_page(std::move(qrb), row_limit, partition_limit, query_time, timeout, max_memory_reverse_query).then(
return q.consume_page(std::move(qrb), row_limit, partition_limit, query_time, timeout, class_config.max_memory_for_unlimited_query).then(
[=, &builder, &q, trace_ptr = std::move(trace_ptr), cache_ctx = std::move(cache_ctx)] () mutable {
if (q.are_limits_reached() || builder.is_short_read()) {
cache_ctx.insert(std::move(q), std::move(trace_ptr));
@@ -2262,7 +2261,7 @@ static do_mutation_query(schema_ptr s,
uint32_t partition_limit,
gc_clock::time_point query_time,
db::timeout_clock::time_point timeout,
uint64_t max_memory_reverse_query,
query_class_config class_config,
query::result_memory_accounter&& accounter,
tracing::trace_state_ptr trace_ptr,
query::querier_cache_context cache_ctx)
@@ -2274,12 +2273,12 @@ static do_mutation_query(schema_ptr s,
auto querier_opt = cache_ctx.lookup_mutation_querier(*s, range, slice, trace_ptr);
auto q = querier_opt
? std::move(*querier_opt)
: query::mutation_querier(source, s, range, slice, service::get_local_sstable_query_read_priority(), trace_ptr);
: query::mutation_querier(source, s, class_config.semaphore.make_permit(), range, slice, service::get_local_sstable_query_read_priority(), trace_ptr);
return do_with(std::move(q), [=, &slice, accounter = std::move(accounter), trace_ptr = std::move(trace_ptr), cache_ctx = std::move(cache_ctx)] (
query::mutation_querier& q) mutable {
auto rrb = reconcilable_result_builder(*s, slice, std::move(accounter));
return q.consume_page(std::move(rrb), row_limit, partition_limit, query_time, timeout, max_memory_reverse_query).then(
return q.consume_page(std::move(rrb), row_limit, partition_limit, query_time, timeout, class_config.max_memory_for_unlimited_query).then(
[=, &q, trace_ptr = std::move(trace_ptr), cache_ctx = std::move(cache_ctx)] (reconcilable_result r) mutable {
if (q.are_limits_reached() || r.is_short_read()) {
cache_ctx.insert(std::move(q), std::move(trace_ptr));
@@ -2302,13 +2301,13 @@ mutation_query(schema_ptr s,
uint32_t partition_limit,
gc_clock::time_point query_time,
db::timeout_clock::time_point timeout,
uint64_t max_memory_reverse_query,
query_class_config class_config,
query::result_memory_accounter&& accounter,
tracing::trace_state_ptr trace_ptr,
query::querier_cache_context cache_ctx)
{
return do_mutation_query(std::move(s), std::move(source), seastar::cref(range), seastar::cref(slice),
row_limit, partition_limit, query_time, timeout, max_memory_reverse_query, std::move(accounter), std::move(trace_ptr), std::move(cache_ctx));
row_limit, partition_limit, query_time, timeout, class_config, std::move(accounter), std::move(trace_ptr), std::move(cache_ctx));
}
deletable_row::deletable_row(clustering_row&& cr)
@@ -2502,7 +2501,7 @@ mutation_partition::fully_discontinuous(const schema& s, const position_range& r
return check_continuity(s, r, is_continuous::no);
}
future<mutation_opt> counter_write_query(schema_ptr s, const mutation_source& source,
future<mutation_opt> counter_write_query(schema_ptr s, const mutation_source& source, reader_permit permit,
const dht::decorated_key& dk,
const query::partition_slice& slice,
tracing::trace_state_ptr trace_ptr)
@@ -2514,19 +2513,19 @@ future<mutation_opt> counter_write_query(schema_ptr s, const mutation_source& so
range_and_reader(range_and_reader&&) = delete;
range_and_reader(const range_and_reader&) = delete;
range_and_reader(schema_ptr s, const mutation_source& source,
range_and_reader(schema_ptr s, const mutation_source& source, reader_permit permit,
const dht::decorated_key& dk,
const query::partition_slice& slice,
tracing::trace_state_ptr trace_ptr)
: range(dht::partition_range::make_singular(dk))
, reader(source.make_reader(s, no_reader_permit(), range, slice, service::get_local_sstable_query_read_priority(),
, reader(source.make_reader(s, std::move(permit), range, slice, service::get_local_sstable_query_read_priority(),
std::move(trace_ptr), streamed_mutation::forwarding::no,
mutation_reader::forwarding::no))
{ }
};
// do_with() doesn't support immovable objects
auto r_a_r = std::make_unique<range_and_reader>(s, source, dk, slice, std::move(trace_ptr));
auto r_a_r = std::make_unique<range_and_reader>(s, source, std::move(permit), dk, slice, std::move(trace_ptr));
auto cwqrb = counter_write_query_result_builder(*s);
auto cfq = make_stable_flattened_mutations_consumer<compact_for_query<emit_only_live_rows::yes, counter_write_query_result_builder>>(
*s, gc_clock::now(), slice, query::max_rows, query::max_rows, std::move(cwqrb));

View File

@@ -28,6 +28,7 @@
#include "db/timeout_clock.hh"
#include "querier.hh"
#include "utils/chunked_vector.hh"
#include "query_class_config.hh"
#include <seastar/core/execution_stage.hh>
class reconcilable_result;
@@ -162,7 +163,7 @@ future<reconcilable_result> mutation_query(
uint32_t partition_limit,
gc_clock::time_point query_time,
db::timeout_clock::time_point timeout,
uint64_t max_memory_reverse_query,
query_class_config class_config,
query::result_memory_accounter&& accounter = { },
tracing::trace_state_ptr trace_ptr = nullptr,
query::querier_cache_context cache_ctx = { });
@@ -177,7 +178,7 @@ future<> data_query(
gc_clock::time_point query_time,
query::result::builder& builder,
db::timeout_clock::time_point timeout,
uint64_t max_memory_reverse_query,
query_class_config class_config,
tracing::trace_state_ptr trace_ptr = nullptr,
query::querier_cache_context cache_ctx = { });
@@ -192,7 +193,7 @@ class mutation_query_stage {
uint32_t,
gc_clock::time_point,
db::timeout_clock::time_point,
uint64_t,
query_class_config,
query::result_memory_accounter&&,
tracing::trace_state_ptr,
query::querier_cache_context> _execution_stage;
@@ -203,7 +204,7 @@ public:
};
// Performs a query for counter updates.
future<mutation_opt> counter_write_query(schema_ptr, const mutation_source&,
future<mutation_opt> counter_write_query(schema_ptr, const mutation_source&, reader_permit permit,
const dht::decorated_key& dk,
const query::partition_slice& slice,
tracing::trace_state_ptr trace_ptr);

View File

@@ -667,6 +667,7 @@ class restricting_mutation_reader : public flat_mutation_reader::impl {
struct mutation_source_and_params {
mutation_source _ms;
schema_ptr _s;
reader_permit _permit;
std::reference_wrapper<const dht::partition_range> _range;
std::reference_wrapper<const query::partition_slice> _slice;
std::reference_wrapper<const io_priority_class> _pc;
@@ -674,17 +675,17 @@ class restricting_mutation_reader : public flat_mutation_reader::impl {
streamed_mutation::forwarding _fwd;
mutation_reader::forwarding _fwd_mr;
flat_mutation_reader operator()(reader_permit permit) {
return _ms.make_reader(std::move(_s), std::move(permit), _range.get(), _slice.get(), _pc.get(), std::move(_trace_state), _fwd, _fwd_mr);
flat_mutation_reader operator()() {
return _ms.make_reader(std::move(_s), std::move(_permit), _range.get(), _slice.get(), _pc.get(), std::move(_trace_state), _fwd, _fwd_mr);
}
};
struct pending_state {
reader_concurrency_semaphore& semaphore;
mutation_source_and_params reader_factory;
};
struct admitted_state {
flat_mutation_reader reader;
reader_permit::resource_units units;
};
std::variant<pending_state, admitted_state> _state;
@@ -702,17 +703,18 @@ class restricting_mutation_reader : public flat_mutation_reader::impl {
return fn(state->reader);
}
return std::get<pending_state>(_state).semaphore.wait_admission(new_reader_base_cost,
timeout).then([this, fn = std::move(fn)] (reader_permit permit) mutable {
return std::get<pending_state>(_state).reader_factory._permit.wait_admission(new_reader_base_cost,
timeout).then([this, fn = std::move(fn)] (reader_permit::resource_units units) mutable {
auto reader_factory = std::move(std::get<pending_state>(_state).reader_factory);
_state.emplace<admitted_state>(admitted_state{reader_factory(std::move(permit))});
_state.emplace<admitted_state>(admitted_state{reader_factory(), std::move(units)});
return fn(std::get<admitted_state>(_state).reader);
});
}
public:
restricting_mutation_reader(reader_concurrency_semaphore& semaphore,
restricting_mutation_reader(
mutation_source ms,
schema_ptr s,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
@@ -720,8 +722,8 @@ public:
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr)
: impl(s)
, _state(pending_state{semaphore,
mutation_source_and_params{std::move(ms), std::move(s), range, slice, pc, std::move(trace_state), fwd, fwd_mr}}) {
, _state(pending_state{
mutation_source_and_params{std::move(ms), std::move(s), std::move(permit), range, slice, pc, std::move(trace_state), fwd, fwd_mr}}) {
}
virtual future<> fill_buffer(db::timeout_clock::time_point timeout) override {
@@ -766,16 +768,17 @@ public:
};
flat_mutation_reader
make_restricted_flat_reader(reader_concurrency_semaphore& semaphore,
make_restricted_flat_reader(
mutation_source ms,
schema_ptr s,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
tracing::trace_state_ptr trace_state,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr) {
return make_flat_mutation_reader<restricting_mutation_reader>(semaphore, std::move(ms), std::move(s), range, slice, pc, std::move(trace_state), fwd, fwd_mr);
return make_flat_mutation_reader<restricting_mutation_reader>(std::move(ms), std::move(s), std::move(permit), range, slice, pc, std::move(trace_state), fwd, fwd_mr);
}

View File

@@ -268,7 +268,7 @@ public:
flat_mutation_reader
make_reader(
schema_ptr s,
reader_permit permit = no_reader_permit(),
reader_permit permit,
partition_range range = query::full_partition_range) const
{
auto& full_slice = s->full_slice();
@@ -315,9 +315,10 @@ snapshot_source make_empty_snapshot_source();
// a semaphore to track and limit the memory usage of readers. It also
// contains a timeout and a maximum queue size for inactive readers
// whose construction is blocked.
flat_mutation_reader make_restricted_flat_reader(reader_concurrency_semaphore& semaphore,
flat_mutation_reader make_restricted_flat_reader(
mutation_source ms,
schema_ptr s,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc = default_priority_class(),
@@ -325,12 +326,13 @@ flat_mutation_reader make_restricted_flat_reader(reader_concurrency_semaphore& s
streamed_mutation::forwarding fwd = streamed_mutation::forwarding::no,
mutation_reader::forwarding fwd_mr = mutation_reader::forwarding::yes);
inline flat_mutation_reader make_restricted_flat_reader(reader_concurrency_semaphore& semaphore,
inline flat_mutation_reader make_restricted_flat_reader(
mutation_source ms,
schema_ptr s,
reader_permit permit,
const dht::partition_range& range = query::full_partition_range) {
auto& full_slice = s->full_slice();
return make_restricted_flat_reader(semaphore, std::move(ms), std::move(s), range, full_slice);
return make_restricted_flat_reader(std::move(ms), std::move(s), std::move(permit), range, full_slice);
}
using mutation_source_opt = optimized_optional<mutation_source>;

@@ -191,7 +191,7 @@ void querier_cache::scan_cache_entries() {
while (it != end && it->is_expired(now)) {
++_stats.time_based_evictions;
--_stats.population;
_sem.unregister_inactive_read(std::move(*it).get_inactive_handle());
it->permit().semaphore().unregister_inactive_read(std::move(*it).get_inactive_handle());
it = _entries.erase(it);
}
}
@@ -217,9 +217,8 @@ static querier_cache::entries::iterator find_querier(querier_cache::entries& ent
return it->pos();
}
querier_cache::querier_cache(reader_concurrency_semaphore& sem, size_t max_cache_size, std::chrono::seconds entry_ttl)
: _sem(sem)
, _expiry_timer([this] { scan_cache_entries(); })
querier_cache::querier_cache(size_t max_cache_size, std::chrono::seconds entry_ttl)
: _expiry_timer([this] { scan_cache_entries(); })
, _entry_ttl(entry_ttl)
, _max_queriers_memory_usage(max_cache_size) {
_expiry_timer.arm_periodic(entry_ttl / 2);
@@ -245,7 +244,6 @@ public:
template <typename Querier>
static void insert_querier(
reader_concurrency_semaphore& sem,
querier_cache::entries& entries,
querier_cache::index& index,
querier_cache::stats& stats,
@@ -279,13 +277,15 @@ static void insert_querier(
auto it = entries.begin();
while (it != entries.end() && memory_usage >= max_queriers_memory_usage) {
memory_usage -= it->memory_usage();
sem.unregister_inactive_read(std::move(*it).get_inactive_handle());
it->permit().semaphore().unregister_inactive_read(std::move(*it).get_inactive_handle());
it = entries.erase(it);
--stats.population;
++stats.memory_based_evictions;
}
}
auto& sem = q.permit().semaphore();
auto& e = entries.emplace_back(key, std::move(q), expires);
e.set_pos(--entries.end());
++stats.population;
@@ -297,23 +297,22 @@ static void insert_querier(
}
void querier_cache::insert(utils::UUID key, data_querier&& q, tracing::trace_state_ptr trace_state) {
insert_querier(_sem, _entries, _data_querier_index, _stats, _max_queriers_memory_usage, key, std::move(q), lowres_clock::now() + _entry_ttl,
insert_querier(_entries, _data_querier_index, _stats, _max_queriers_memory_usage, key, std::move(q), lowres_clock::now() + _entry_ttl,
std::move(trace_state));
}
void querier_cache::insert(utils::UUID key, mutation_querier&& q, tracing::trace_state_ptr trace_state) {
insert_querier(_sem, _entries, _mutation_querier_index, _stats, _max_queriers_memory_usage, key, std::move(q), lowres_clock::now() + _entry_ttl,
insert_querier(_entries, _mutation_querier_index, _stats, _max_queriers_memory_usage, key, std::move(q), lowres_clock::now() + _entry_ttl,
std::move(trace_state));
}
void querier_cache::insert(utils::UUID key, shard_mutation_querier&& q, tracing::trace_state_ptr trace_state) {
insert_querier(_sem, _entries, _shard_mutation_querier_index, _stats, _max_queriers_memory_usage, key, std::move(q), lowres_clock::now() + _entry_ttl,
insert_querier(_entries, _shard_mutation_querier_index, _stats, _max_queriers_memory_usage, key, std::move(q), lowres_clock::now() + _entry_ttl,
std::move(trace_state));
}
template <typename Querier>
static std::optional<Querier> lookup_querier(
reader_concurrency_semaphore& sem,
querier_cache::entries& entries,
querier_cache::index& index,
querier_cache::stats& stats,
@@ -330,7 +329,7 @@ static std::optional<Querier> lookup_querier(
}
auto q = std::move(*it).template value<Querier>();
sem.unregister_inactive_read(std::move(*it).get_inactive_handle());
q.permit().semaphore().unregister_inactive_read(std::move(*it).get_inactive_handle());
entries.erase(it);
--stats.population;
@@ -350,7 +349,7 @@ std::optional<data_querier> querier_cache::lookup_data_querier(utils::UUID key,
const dht::partition_range& range,
const query::partition_slice& slice,
tracing::trace_state_ptr trace_state) {
return lookup_querier<data_querier>(_sem, _entries, _data_querier_index, _stats, key, s, range, slice, std::move(trace_state));
return lookup_querier<data_querier>(_entries, _data_querier_index, _stats, key, s, range, slice, std::move(trace_state));
}
std::optional<mutation_querier> querier_cache::lookup_mutation_querier(utils::UUID key,
@@ -358,7 +357,7 @@ std::optional<mutation_querier> querier_cache::lookup_mutation_querier(utils::UU
const dht::partition_range& range,
const query::partition_slice& slice,
tracing::trace_state_ptr trace_state) {
return lookup_querier<mutation_querier>(_sem, _entries, _mutation_querier_index, _stats, key, s, range, slice, std::move(trace_state));
return lookup_querier<mutation_querier>(_entries, _mutation_querier_index, _stats, key, s, range, slice, std::move(trace_state));
}
std::optional<shard_mutation_querier> querier_cache::lookup_shard_mutation_querier(utils::UUID key,
@@ -366,7 +365,7 @@ std::optional<shard_mutation_querier> querier_cache::lookup_shard_mutation_queri
const dht::partition_range_vector& ranges,
const query::partition_slice& slice,
tracing::trace_state_ptr trace_state) {
return lookup_querier<shard_mutation_querier>(_sem, _entries, _shard_mutation_querier_index, _stats, key, s, ranges, slice,
return lookup_querier<shard_mutation_querier>(_entries, _shard_mutation_querier_index, _stats, key, s, ranges, slice,
std::move(trace_state));
}
@@ -382,7 +381,8 @@ bool querier_cache::evict_one() {
++_stats.resource_based_evictions;
--_stats.population;
_sem.unregister_inactive_read(std::move(_entries.front()).get_inactive_handle());
auto& sem = _entries.front().permit().semaphore();
sem.unregister_inactive_read(std::move(_entries.front()).get_inactive_handle());
_entries.pop_front();
return true;
@@ -394,7 +394,7 @@ void querier_cache::evict_all_for_table(const utils::UUID& schema_id) {
while (it != end) {
if (it->schema().id() == schema_id) {
--_stats.population;
_sem.unregister_inactive_read(std::move(*it).get_inactive_handle());
it->permit().semaphore().unregister_inactive_read(std::move(*it).get_inactive_handle());
it = _entries.erase(it);
} else {
++it;

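The querier-cache changes above drop the cache-wide `_sem` member: each cached querier's permit now remembers which semaphore admitted it, and eviction routes `unregister_inactive_read()` through `permit().semaphore()`. A minimal model of that pattern, with stand-in types rather than Scylla's real classes:

```cpp
#include <cassert>
#include <list>

class semaphore {
    int _inactive = 0;
public:
    void register_inactive() { ++_inactive; }
    void unregister_inactive() { --_inactive; }
    int inactive() const { return _inactive; }
};

// Stand-in permit: remembers the semaphore it was created by.
class permit {
    semaphore* _sem;
public:
    explicit permit(semaphore& s) : _sem(&s) {}
    semaphore& sem() const { return *_sem; }
};

struct cache_entry {
    permit p;
};

// Eviction no longer needs a cache-wide semaphore: each entry routes the
// unregister call through its own permit, so entries admitted by different
// semaphores (user vs. system) can coexist in one cache.
int evict_all(std::list<cache_entry>& entries) {
    int evicted = 0;
    for (auto it = entries.begin(); it != entries.end(); it = entries.erase(it)) {
        it->p.sem().unregister_inactive();
        ++evicted;
    }
    return evicted;
}
```

This is why `querier_cache` can lose its `reader_concurrency_semaphore&` constructor parameter: the association moves from the cache to the individual read.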
@@ -142,6 +142,7 @@ struct position_view {
template <emit_only_live_rows OnlyLive>
class querier {
schema_ptr _schema;
reader_permit _permit;
std::unique_ptr<const dht::partition_range> _range;
std::unique_ptr<const query::partition_slice> _slice;
flat_mutation_reader _reader;
@@ -151,14 +152,16 @@ class querier {
public:
querier(const mutation_source& ms,
schema_ptr schema,
reader_permit permit,
dht::partition_range range,
query::partition_slice slice,
const io_priority_class& pc,
tracing::trace_state_ptr trace_ptr)
: _schema(schema)
, _permit(permit)
, _range(std::make_unique<dht::partition_range>(std::move(range)))
, _slice(std::make_unique<query::partition_slice>(std::move(slice)))
, _reader(ms.make_reader(schema, no_reader_permit(), *_range, *_slice, pc, std::move(trace_ptr),
, _reader(ms.make_reader(schema, std::move(permit), *_range, *_slice, pc, std::move(trace_ptr),
streamed_mutation::forwarding::no, mutation_reader::forwarding::no))
, _compaction_state(make_lw_shared<compact_for_query_state<OnlyLive>>(*schema, gc_clock::time_point{}, *_slice, 0, 0)) {
}
@@ -203,6 +206,10 @@ public:
return _schema;
}
reader_permit& permit() {
return _permit;
}
position_view current_position() const {
const dht::decorated_key* dk = _compaction_state->current_partition();
const clustering_key_prefix* clustering_key = _last_ckey ? &*_last_ckey : nullptr;
@@ -231,6 +238,7 @@ class shard_mutation_querier {
std::unique_ptr<const dht::partition_range> _reader_range;
std::unique_ptr<const query::partition_slice> _reader_slice;
flat_mutation_reader _reader;
reader_permit _permit;
dht::decorated_key _nominal_pkey;
std::optional<clustering_key_prefix> _nominal_ckey;
@@ -240,12 +248,14 @@ public:
std::unique_ptr<const dht::partition_range> reader_range,
std::unique_ptr<const query::partition_slice> reader_slice,
flat_mutation_reader reader,
reader_permit permit,
dht::decorated_key nominal_pkey,
std::optional<clustering_key_prefix> nominal_ckey)
: _query_ranges(std::move(query_ranges))
, _reader_range(std::move(reader_range))
, _reader_slice(std::move(reader_slice))
, _reader(std::move(reader))
, _permit(std::move(permit))
, _nominal_pkey(std::move(nominal_pkey))
, _nominal_ckey(std::move(nominal_ckey)) {
}
@@ -262,6 +272,10 @@ public:
return _reader.schema();
}
reader_permit& permit() {
return _permit;
}
position_view current_position() const {
return {&_nominal_pkey, _nominal_ckey ? &*_nominal_ckey : nullptr};
}
@@ -376,6 +390,12 @@ public:
}, _value);
}
reader_permit& permit() {
return std::visit([] (auto& q) -> reader_permit& {
return q.permit();
}, _value);
}
dht::partition_ranges_view ranges() const {
return std::visit([] (auto& q) {
return q.ranges();
@@ -413,7 +433,6 @@ public:
boost::intrusive::constant_time_size<false>>;
private:
reader_concurrency_semaphore& _sem;
entries _entries;
index _data_querier_index;
index _mutation_querier_index;
@@ -426,7 +445,7 @@ private:
void scan_cache_entries();
public:
explicit querier_cache(reader_concurrency_semaphore& sem, size_t max_cache_size = 1'000'000, std::chrono::seconds entry_ttl = default_entry_ttl);
explicit querier_cache(size_t max_cache_size = 1'000'000, std::chrono::seconds entry_ttl = default_entry_ttl);
querier_cache(const querier_cache&) = delete;
querier_cache& operator=(const querier_cache&) = delete;

query_class_config.hh (new file, 31 lines)
@@ -0,0 +1,31 @@
/*
* Copyright (C) 2020 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <cinttypes>
class reader_concurrency_semaphore;
struct query_class_config {
reader_concurrency_semaphore& semaphore;
uint64_t max_memory_for_unlimited_query;
};
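`query_class_config` is the vehicle for the per-class settings (the reader concurrency semaphore and the unpaged-query memory cap) that the cover letter says are selected on the database level and propagated down. A hypothetical illustration of that selection step — classify by scheduling group, then hand the matching config down; the group names and the `classify` helper are assumptions, not code from this series:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

class reader_concurrency_semaphore {};  // stub for illustration

struct query_class_config {
    reader_concurrency_semaphore& semaphore;
    uint64_t max_memory_for_unlimited_query;
};

// Hypothetical classification: user reads run in the "statement" scheduling
// group; anything else (main group, streaming, ...) is treated as system.
query_class_config classify(const std::string& scheduling_group,
                            query_class_config user_cfg,
                            query_class_config system_cfg) {
    return scheduling_group == "statement" ? user_cfg : system_cfg;
}
```

The point of classifying by initiator (scheduling group) rather than by target table is visible here: a user query against a system table still arrives in the statement group and gets the user limits.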

@@ -120,6 +120,7 @@ public:
class read_context final : public enable_lw_shared_from_this<read_context> {
row_cache& _cache;
schema_ptr _schema;
reader_permit _permit;
const dht::partition_range& _range;
const query::partition_slice& _slice;
const io_priority_class& _pc;
@@ -144,6 +145,7 @@ class read_context final : public enable_lw_shared_from_this<read_context> {
public:
read_context(row_cache& cache,
schema_ptr schema,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
@@ -151,6 +153,7 @@ public:
mutation_reader::forwarding fwd_mr)
: _cache(cache)
, _schema(std::move(schema))
, _permit(std::move(permit))
, _range(range)
, _slice(slice)
, _pc(pc)
@@ -176,6 +179,7 @@ public:
read_context(const read_context&) = delete;
row_cache& cache() { return _cache; }
const schema_ptr& schema() const { return _schema; }
reader_permit permit() const { return _permit; }
const dht::partition_range& range() const { return _range; }
const query::partition_slice& slice() const { return _slice; }
const io_priority_class& pc() const { return _pc; }

@@ -26,62 +26,57 @@
#include "utils/exceptions.hh"
reader_permit::impl::impl(reader_concurrency_semaphore& semaphore, reader_resources base_cost) : semaphore(semaphore), base_cost(base_cost) {
reader_permit::resource_units::resource_units(reader_concurrency_semaphore& semaphore, reader_resources res) noexcept
: _semaphore(&semaphore), _resources(res) {
}
reader_permit::impl::~impl() {
semaphore.signal(base_cost);
reader_permit::resource_units::resource_units(resource_units&& o) noexcept
: _semaphore(o._semaphore)
, _resources(std::exchange(o._resources, {})) {
}
reader_permit::memory_units::memory_units(reader_concurrency_semaphore* semaphore, ssize_t memory) noexcept
: _semaphore(semaphore), _memory(memory) {
if (_semaphore && _memory) {
_semaphore->consume_memory(_memory);
}
}
reader_permit::memory_units::memory_units(memory_units&& o) noexcept
: _semaphore(std::exchange(o._semaphore, nullptr))
, _memory(std::exchange(o._memory, 0)) {
}
reader_permit::memory_units::~memory_units() {
reader_permit::resource_units::~resource_units() {
reset();
}
reader_permit::memory_units& reader_permit::memory_units::operator=(memory_units&& o) noexcept {
reader_permit::resource_units& reader_permit::resource_units::operator=(resource_units&& o) noexcept {
if (&o == this) {
return *this;
}
reset();
_semaphore = std::exchange(o._semaphore, nullptr);
_memory = std::exchange(o._memory, 0);
_semaphore = o._semaphore;
_resources = std::exchange(o._resources, {});
return *this;
}
void reader_permit::memory_units::reset(size_t memory) {
if (_semaphore) {
_semaphore->consume_memory(memory);
_semaphore->signal_memory(_memory);
void reader_permit::resource_units::add(resource_units&& o) {
assert(_semaphore == o._semaphore);
_resources += std::exchange(o._resources, {});
}
void reader_permit::resource_units::reset(reader_resources res) {
_semaphore->consume(res);
if (_resources) {
_semaphore->signal(_resources);
}
_memory = memory;
_resources = res;
}
reader_permit::reader_permit(reader_concurrency_semaphore& semaphore, reader_resources base_cost)
: _impl(make_lw_shared<reader_permit::impl>(semaphore, base_cost)) {
reader_permit::reader_permit(reader_concurrency_semaphore& semaphore)
: _semaphore(&semaphore) {
}
reader_permit::memory_units reader_permit::get_memory_units(size_t memory) {
return memory_units(_impl ? &_impl->semaphore : nullptr, memory);
future<reader_permit::resource_units> reader_permit::wait_admission(size_t memory, db::timeout_clock::time_point timeout) {
return _semaphore->do_wait_admission(memory, timeout);
}
void reader_permit::release() {
_impl->semaphore.signal(_impl->base_cost);
_impl->base_cost = {};
reader_permit::resource_units reader_permit::consume_memory(size_t memory) {
return consume_resources(reader_resources{0, ssize_t(memory)});
}
reader_permit no_reader_permit() {
return reader_permit{};
reader_permit::resource_units reader_permit::consume_resources(reader_resources res) {
_semaphore->consume(res);
return resource_units(*_semaphore, res);
}
void reader_concurrency_semaphore::signal(const resources& r) noexcept {
@@ -90,7 +85,7 @@ void reader_concurrency_semaphore::signal(const resources& r) noexcept {
auto& x = _wait_list.front();
_resources -= x.res;
try {
x.pr.set_value(reader_permit(*this, x.res));
x.pr.set_value(reader_permit::resource_units(*this, x.res));
} catch (...) {
x.pr.set_exception(std::current_exception());
}
@@ -98,6 +93,10 @@ void reader_concurrency_semaphore::signal(const resources& r) noexcept {
}
}
reader_concurrency_semaphore::~reader_concurrency_semaphore() {
broken(std::make_exception_ptr(broken_semaphore{}));
}
reader_concurrency_semaphore::inactive_read_handle reader_concurrency_semaphore::register_inactive_read(std::unique_ptr<inactive_read> ir) {
// Implies _inactive_reads.empty(); we don't queue new readers before
// evicting all inactive reads.
@@ -139,13 +138,12 @@ bool reader_concurrency_semaphore::try_evict_one_inactive_read() {
return true;
}
future<reader_permit> reader_concurrency_semaphore::wait_admission(size_t memory,
db::timeout_clock::time_point timeout) {
future<reader_permit::resource_units> reader_concurrency_semaphore::do_wait_admission(size_t memory, db::timeout_clock::time_point timeout) {
if (_wait_list.size() >= _max_queue_length) {
if (_prethrow_action) {
_prethrow_action();
}
return make_exception_future<reader_permit>(
return make_exception_future<reader_permit::resource_units>(
std::make_exception_ptr(std::runtime_error(
format("{}: restricted mutation reader queue overload", _name))));
}
@@ -161,17 +159,23 @@ future<reader_permit> reader_concurrency_semaphore::wait_admission(size_t memory
}
if (may_proceed(r)) {
_resources -= r;
return make_ready_future<reader_permit>(reader_permit(*this, r));
return make_ready_future<reader_permit::resource_units>(reader_permit::resource_units(*this, r));
}
promise<reader_permit> pr;
promise<reader_permit::resource_units> pr;
auto fut = pr.get_future();
_wait_list.push_back(entry(std::move(pr), r), timeout);
return fut;
}
reader_permit reader_concurrency_semaphore::consume_resources(resources r) {
_resources -= r;
return reader_permit(*this, r);
reader_permit reader_concurrency_semaphore::make_permit() {
return reader_permit(*this);
}
void reader_concurrency_semaphore::broken(std::exception_ptr ex) {
while (!_wait_list.empty()) {
_wait_list.front().pr.set_exception(std::make_exception_ptr(broken_semaphore{}));
_wait_list.pop_front();
}
}
// A file that tracks the memory usage of buffers resulting from read operations.
@@ -244,11 +248,8 @@ public:
}
virtual future<temporary_buffer<uint8_t>> dma_read_bulk(uint64_t offset, size_t range_size, const io_priority_class& pc) override {
return get_file_impl(_tracked_file)->dma_read_bulk(offset, range_size, pc).then([this, units = _permit.get_memory_units(range_size)] (temporary_buffer<uint8_t> buf) {
if (_permit) {
buf = make_tracked_temporary_buffer(std::move(buf), _permit);
}
return make_ready_future<temporary_buffer<uint8_t>>(std::move(buf));
return get_file_impl(_tracked_file)->dma_read_bulk(offset, range_size, pc).then([this, units = _permit.consume_memory(range_size)] (temporary_buffer<uint8_t> buf) {
return make_ready_future<temporary_buffer<uint8_t>>(make_tracked_temporary_buffer(std::move(buf), _permit));
});
}
};
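The `dma_read_bulk()` change above always wraps the returned buffer in a tracked one, with the consumed units owned by the buffer's deleter. The same pattern can be shown with standard types — a deleter that owns the accounting, so freeing the buffer releases the memory (stand-in types, not seastar's `temporary_buffer`):

```cpp
#include <cassert>
#include <memory>

class counter {
    long _mem = 0;
public:
    void consume(long m) { _mem += m; }
    void signal(long m) { _mem -= m; }
    long tracked() const { return _mem; }
};

// Deleter owning the accounted units: freeing the buffer signals the counter,
// mirroring make_deleter(..., [units = permit.consume_memory(buf.size())] ...).
struct tracking_deleter {
    counter* c;
    long size;
    void operator()(char* p) const {
        c->signal(size);
        delete[] p;
    }
};

using tracked_buffer = std::unique_ptr<char[], tracking_deleter>;

tracked_buffer make_tracked_buffer(counter& c, long size) {
    c.consume(size);  // account the memory up front
    return tracked_buffer(new char[size], tracking_deleter{&c, size});
}
```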

@@ -23,18 +23,15 @@
#include <map>
#include <seastar/core/future.hh>
#include "db/timeout_clock.hh"
#include "reader_permit.hh"
using namespace seastar;
/// Specific semaphore for controlling reader concurrency
///
/// Before creating a reader one should obtain a permit by calling
/// `wait_admission()`. This permit can then be used for tracking the
/// reader's memory consumption.
/// The permit should be held onto for the lifetime of the reader
/// and/or any buffer its tracking.
/// Use `make_permit()` to create a permit to track the resource consumption
/// of a specific read. The permit should be created before the read is even
/// started so it is available to track resource consumption from the start.
/// Reader concurrency is dual limited by count and memory.
/// The semaphore can be configured with the desired limits on
/// construction. New readers will only be admitted when there is both
@@ -92,9 +89,9 @@ public:
private:
struct entry {
promise<reader_permit> pr;
promise<reader_permit::resource_units> pr;
resources res;
entry(promise<reader_permit>&& pr, resources r) : pr(std::move(pr)), res(r) {}
entry(promise<reader_permit::resource_units>&& pr, resources r) : pr(std::move(pr)), res(r) {}
};
class expiry_handler {
@@ -128,15 +125,8 @@ private:
return has_available_units(r) && _wait_list.empty();
}
void consume_memory(size_t memory) {
_resources.memory -= memory;
}
future<reader_permit::resource_units> do_wait_admission(size_t memory, db::timeout_clock::time_point timeout);
void signal(const resources& r) noexcept;
void signal_memory(size_t memory) noexcept {
signal(resources(0, static_cast<ssize_t>(memory)));
}
public:
struct no_limits { };
@@ -160,6 +150,8 @@ public:
std::numeric_limits<ssize_t>::max(),
"unlimited reader_concurrency_semaphore") {}
~reader_concurrency_semaphore();
reader_concurrency_semaphore(const reader_concurrency_semaphore&) = delete;
reader_concurrency_semaphore& operator=(const reader_concurrency_semaphore&) = delete;
@@ -199,16 +191,21 @@ public:
return _inactive_read_stats;
}
future<reader_permit> wait_admission(size_t memory, db::timeout_clock::time_point timeout);
/// Consume the specific amount of resources without waiting.
reader_permit consume_resources(resources r);
reader_permit make_permit();
const resources available_resources() const {
return _resources;
}
void consume(resources r) {
_resources -= r;
}
void signal(const resources& r) noexcept;
size_t waiters() const {
return _wait_list.size();
}
void broken(std::exception_ptr ex);
};
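The admission logic behind `do_wait_admission()` (grant immediately when both count and memory suffice and nobody is already queued, otherwise park the request; `signal()` wakes waiters in FIFO order) can be modelled synchronously. This is a simplified sketch under stated assumptions; the real code returns seastar futures and uses a timed wait list:

```cpp
#include <cassert>
#include <deque>

struct res { int count = 0; long memory = 0; };

class admission_model {
    res _avail;
    std::deque<res> _wait_list;
    bool has_units(res r) const { return _avail.count >= r.count && _avail.memory >= r.memory; }
    bool may_proceed(res r) const { return has_units(r) && _wait_list.empty(); }
public:
    explicit admission_model(res r) : _avail(r) {}
    // Returns true if admitted immediately, false if queued
    // (a future in the real code).
    bool wait_admission(res r) {
        if (may_proceed(r)) {
            _avail.count -= r.count;
            _avail.memory -= r.memory;
            return true;
        }
        _wait_list.push_back(r);
        return false;
    }
    // Return resources and admit waiters in FIFO order.
    void signal(res r) {
        _avail.count += r.count;
        _avail.memory += r.memory;
        while (!_wait_list.empty() && has_units(_wait_list.front())) {
            res x = _wait_list.front();
            _wait_list.pop_front();
            _avail.count -= x.count;
            _avail.memory -= x.memory;
        }
    }
    int waiters() const { return (int)_wait_list.size(); }
};
```

Note the `_wait_list.empty()` term in `may_proceed()`: a newly arriving read never jumps ahead of parked readers, even if its own resources would fit.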

@@ -25,6 +25,8 @@
#include <seastar/core/file.hh>
#include "seastarx.hh"
#include "db/timeout_clock.hh"
struct reader_resources {
int count = 0;
ssize_t memory = 0;
@@ -53,69 +55,70 @@ struct reader_resources {
}
explicit operator bool() const {
return count >= 0 && memory >= 0;
return count > 0 || memory > 0;
}
};
class reader_concurrency_semaphore;
/// A permit for a specific read.
///
/// Used to track the read's resource consumption and wait for admission to read
/// from the disk.
/// Use `consume_memory()` to register memory usage. Use `wait_admission()` to
/// wait for admission, before reading from the disk. Both methods return a
/// `resource_units` RAII object that should be held onto while the respective
/// resources are in use.
class reader_permit {
struct impl {
reader_concurrency_semaphore& semaphore;
reader_resources base_cost;
impl(reader_concurrency_semaphore& semaphore, reader_resources base_cost);
~impl();
};
friend reader_permit no_reader_permit();
friend class reader_concurrency_semaphore;
public:
class memory_units {
reader_concurrency_semaphore* _semaphore = nullptr;
size_t _memory = 0;
class resource_units {
reader_concurrency_semaphore* _semaphore;
reader_resources _resources;
friend class reader_permit;
friend class reader_concurrency_semaphore;
private:
memory_units(reader_concurrency_semaphore* semaphore, ssize_t memory) noexcept;
resource_units(reader_concurrency_semaphore& semaphore, reader_resources res) noexcept;
public:
memory_units(const memory_units&) = delete;
memory_units(memory_units&&) noexcept;
~memory_units();
memory_units& operator=(const memory_units&) = delete;
memory_units& operator=(memory_units&&) noexcept;
void reset(size_t memory = 0);
operator size_t() const {
return _memory;
}
resource_units(const resource_units&) = delete;
resource_units(resource_units&&) noexcept;
~resource_units();
resource_units& operator=(const resource_units&) = delete;
resource_units& operator=(resource_units&&) noexcept;
void add(resource_units&& o);
void reset(reader_resources res = {});
};
private:
lw_shared_ptr<impl> _impl;
reader_concurrency_semaphore* _semaphore;
private:
reader_permit() = default;
explicit reader_permit(reader_concurrency_semaphore& semaphore);
public:
reader_permit(reader_concurrency_semaphore& semaphore, reader_resources base_cost);
bool operator==(const reader_permit& o) const {
return _impl == o._impl;
}
operator bool() const {
return bool(_impl);
return _semaphore == o._semaphore;
}
memory_units get_memory_units(size_t memory = 0);
reader_concurrency_semaphore& semaphore() {
return *_semaphore;
}
future<resource_units> wait_admission(size_t memory, db::timeout_clock::time_point timeout);
resource_units consume_memory(size_t memory = 0);
resource_units consume_resources(reader_resources res);
void release();
};
reader_permit no_reader_permit();
template <typename Char>
temporary_buffer<Char> make_tracked_temporary_buffer(temporary_buffer<Char> buf, reader_permit& permit) {
return temporary_buffer<Char>(buf.get_write(), buf.size(),
make_deleter(buf.release(), [units = permit.get_memory_units(buf.size())] () mutable { units.reset(); }));
make_deleter(buf.release(), [units = permit.consume_memory(buf.size())] () mutable { units.reset(); }));
}
file make_tracked_file(file f, reader_permit p);
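The `resource_units` lifecycle above — consume on construction or `reset()`, signal back on destruction, moved-from objects giving up their share so nothing is double-signalled — can be sketched in isolation. A simplified stand-in for the diff's classes, assuming a bare count/memory pair for the semaphore:

```cpp
#include <cassert>
#include <utility>

// Simplified stand-in for reader_resources: a count/memory pair.
struct resources {
    int count = 0;
    long memory = 0;
    resources& operator+=(resources o) { count += o.count; memory += o.memory; return *this; }
    resources& operator-=(resources o) { count -= o.count; memory -= o.memory; return *this; }
    explicit operator bool() const { return count > 0 || memory > 0; }
};

// Simplified stand-in for reader_concurrency_semaphore.
class semaphore {
    resources _avail;
public:
    explicit semaphore(resources r) : _avail(r) {}
    void consume(resources r) { _avail -= r; }
    void signal(resources r) { _avail += r; }
    resources available() const { return _avail; }
};

// RAII units, mirroring reader_permit::resource_units in the diff:
// moved-from objects give up their resources, the destructor signals back.
class resource_units {
    semaphore* _sem;
    resources _res;
public:
    resource_units(semaphore& sem, resources r) : _sem(&sem), _res(r) { _sem->consume(r); }
    resource_units(resource_units&& o) noexcept
        : _sem(o._sem), _res(std::exchange(o._res, {})) {}
    ~resource_units() { if (_res) { _sem->signal(_res); } }
    resource_units& operator=(const resource_units&) = delete;
};

long available_memory_inside_scope(semaphore& sem) {
    resource_units u(sem, resources{1, 1024});  // consumes 1 count + 1024 memory
    resource_units moved = std::move(u);        // ownership transfers, no double-signal
    return sem.available().memory;              // still consumed inside the scope
}
```

Once the scope exits, only the moved-to object signals, so the semaphore's books balance exactly.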

@@ -47,7 +47,7 @@ using namespace cache;
flat_mutation_reader
row_cache::create_underlying_reader(read_context& ctx, mutation_source& src, const dht::partition_range& pr) {
ctx.on_underlying_created();
return src.make_reader(_schema, no_reader_permit(), pr, ctx.slice(), ctx.pc(), ctx.trace_state(), streamed_mutation::forwarding::yes);
return src.make_reader(_schema, ctx.permit(), pr, ctx.slice(), ctx.pc(), ctx.trace_state(), streamed_mutation::forwarding::yes);
}
static thread_local mutation_application_stats dummy_app_stats;
@@ -737,6 +737,7 @@ row_cache::make_scanning_reader(const dht::partition_range& range, lw_shared_ptr
flat_mutation_reader
row_cache::make_reader(schema_ptr s,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
@@ -744,7 +745,7 @@ row_cache::make_reader(schema_ptr s,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr)
{
auto ctx = make_lw_shared<read_context>(*this, s, range, slice, pc, trace_state, fwd_mr);
auto ctx = make_lw_shared<read_context>(*this, s, std::move(permit), range, slice, pc, trace_state, fwd_mr);
if (!ctx->is_range_query() && !fwd_mr) {
auto mr = _read_section(_tracker.region(), [&] {

@@ -490,6 +490,7 @@ public:
// as long as the reader is used.
// The range must not wrap around.
flat_mutation_reader make_reader(schema_ptr,
reader_permit permit,
const dht::partition_range&,
const query::partition_slice&,
const io_priority_class& = default_priority_class(),
@@ -497,9 +498,9 @@ public:
streamed_mutation::forwarding fwd = streamed_mutation::forwarding::no,
mutation_reader::forwarding fwd_mr = mutation_reader::forwarding::no);
flat_mutation_reader make_reader(schema_ptr s, const dht::partition_range& range = query::full_partition_range) {
flat_mutation_reader make_reader(schema_ptr s, reader_permit permit, const dht::partition_range& range = query::full_partition_range) {
auto& full_slice = s->full_slice();
return make_reader(std::move(s), range, full_slice);
return make_reader(std::move(s), std::move(permit), range, full_slice);
}
const stats& stats() const { return _stats; }

@@ -393,6 +393,7 @@ protected:
column_family& _cf;
creator_fn _sstable_creator;
schema_ptr _schema;
reader_permit _permit;
std::vector<shared_sstable> _sstables;
// Unused sstables are tracked because if compaction is interrupted we can only delete them.
// Deleting used sstables could potentially result in data loss.
@@ -417,6 +418,7 @@ protected:
: _cf(cf)
, _sstable_creator(std::move(descriptor.creator))
, _schema(cf.schema())
, _permit(_cf.compaction_concurrency_semaphore().make_permit())
, _sstables(std::move(descriptor.sstables))
, _max_sstable_size(descriptor.max_sstable_bytes)
, _sstable_level(descriptor.level)
@@ -733,7 +735,7 @@ public:
flat_mutation_reader make_sstable_reader() const override {
return ::make_local_shard_sstable_reader(_schema,
no_reader_permit(),
_permit,
_compacting,
query::full_partition_range,
_schema->full_slice(),
@@ -1232,7 +1234,7 @@ public:
// Use a reader that makes sure no non-local mutation is filtered out.
flat_mutation_reader make_sstable_reader() const override {
return ::make_range_sstable_reader(_schema,
no_reader_permit(),
_permit,
_compacting,
query::full_partition_range,
_schema->full_slice(),

@@ -2504,14 +2504,15 @@ future<> sstable::generate_summary(const io_priority_class& pc) {
return do_with(summary_generator(_schema->get_partitioner(), _components->summary,
_manager.config().sstable_summary_ratio()),
[this, &pc, options = std::move(options), index_file, index_size] (summary_generator& s) mutable {
auto sem = std::make_unique<reader_concurrency_semaphore>(reader_concurrency_semaphore::no_limits{});
auto ctx = make_lw_shared<index_consume_entry_context<summary_generator>>(
no_reader_permit(), s, trust_promoted_index::yes, *_schema, index_file, std::move(options), 0, index_size,
sem->make_permit(), s, trust_promoted_index::yes, *_schema, index_file, std::move(options), 0, index_size,
(_version == sstable_version_types::mc
? std::make_optional(get_clustering_values_fixed_lengths(get_serialization_header()))
: std::optional<column_values_fixed_lengths>{}));
return ctx->consume_input().finally([ctx] {
return ctx->close();
}).then([this, ctx, &s] {
}).then([this, ctx, &s, sem = std::move(sem)] {
seal_summary(_components->summary, std::move(s.first_key), std::move(s.last_key), s.state());
});
});
@@ -2796,8 +2797,8 @@ input_stream<char> sstable::data_stream(uint64_t pos, size_t len, const io_prior
return make_file_input_stream(f, pos, len, std::move(options));
}
future<temporary_buffer<char>> sstable::data_read(uint64_t pos, size_t len, const io_priority_class& pc) {
return do_with(data_stream(pos, len, pc, no_reader_permit(), tracing::trace_state_ptr(), {}), [len] (auto& stream) {
future<temporary_buffer<char>> sstable::data_read(uint64_t pos, size_t len, const io_priority_class& pc, reader_permit permit) {
return do_with(data_stream(pos, len, pc, std::move(permit), tracing::trace_state_ptr(), {}), [len] (auto& stream) {
return stream.read_exactly(len).finally([&stream] {
return stream.close();
});
@@ -3212,9 +3213,11 @@ future<bool> sstable::has_partition_key(const utils::hashed_key& hk, const dht::
if (!filter_has_key(hk)) {
return make_ready_future<bool>(false);
}
seastar::shared_ptr<sstables::index_reader> lh_index
= seastar::make_shared<sstables::index_reader>(s, no_reader_permit(), default_priority_class(), tracing::trace_state_ptr());
return lh_index->advance_lower_and_check_if_present(dk).then([lh_index, s, this] (bool present) {
auto sem = std::make_unique<reader_concurrency_semaphore>(reader_concurrency_semaphore::no_limits{});
auto lh_index_ptr = std::make_unique<sstables::index_reader>(s, sem->make_permit(), default_priority_class(), tracing::trace_state_ptr());
auto& lh_index = *lh_index_ptr;
return lh_index.advance_lower_and_check_if_present(dk).then([lh_index_ptr = std::move(lh_index_ptr), s, sem = std::move(sem)] (bool present) mutable {
lh_index_ptr.reset(); // destroy before the semaphore
return make_ready_future<bool>(present);
});
}
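The `has_partition_key()` rewrite above is careful about destruction order: the index reader holds a permit from the ad-hoc semaphore, so the reader must be destroyed first (`lh_index_ptr.reset(); // destroy before the semaphore`). The constraint can be shown with stand-ins; the `reader` type here is hypothetical, modelling anything that returns its units on destruction:

```cpp
#include <cassert>
#include <memory>

class semaphore {
    int _outstanding = 0;
public:
    void consume() { ++_outstanding; }
    void signal() { --_outstanding; }
    int outstanding() const { return _outstanding; }
};

// Hypothetical reader: takes a unit on construction, returns it on destruction.
class reader {
    semaphore* _sem;
public:
    explicit reader(semaphore& s) : _sem(&s) { _sem->consume(); }
    ~reader() { _sem->signal(); }
};

bool lookup() {
    auto sem = std::make_unique<semaphore>();
    auto r = std::make_unique<reader>(*sem);
    bool present = true;  // stand-in for advance_lower_and_check_if_present()
    // Destroy the reader *before* the semaphore, mirroring the diff; the
    // reader's destructor touches the semaphore, so the reverse order would
    // be a use-after-free.
    r.reset();
    assert(sem->outstanding() == 0);  // all units returned; safe to drop sem
    return present;
}
```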

@@ -644,7 +644,7 @@ private:
// determined using the index file).
// This function is intended (and optimized for) random access, not
// for iteration through all the rows.
future<temporary_buffer<char>> data_read(uint64_t pos, size_t len, const io_priority_class& pc);
future<temporary_buffer<char>> data_read(uint64_t pos, size_t len, const io_priority_class& pc, reader_permit permit);
future<summary_entry&> read_summary_entry(size_t i);

table.cc (114 lines changed)

@@ -327,6 +327,7 @@ flat_mutation_reader make_range_sstable_reader(schema_ptr s,
flat_mutation_reader
table::make_sstable_reader(schema_ptr s,
reader_permit permit,
lw_shared_ptr<sstables::sstable_set> sstables,
const dht::partition_range& pr,
const query::partition_slice& slice,
@@ -334,10 +335,6 @@ table::make_sstable_reader(schema_ptr s,
tracing::trace_state_ptr trace_state,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr) const {
auto* semaphore = service::get_local_streaming_read_priority().id() == pc.id()
? _config.streaming_read_concurrency_semaphore
: _config.read_concurrency_semaphore;
// CAVEAT: if make_sstable_reader() is called on a single partition
// we want to optimize and read exactly this partition. As a
// consequence, fast_forward_to() will *NOT* work on the result,
@@ -359,7 +356,7 @@ table::make_sstable_reader(schema_ptr s,
});
}
return mutation_source([semaphore, this, sstables=std::move(sstables)] (
return mutation_source([this, sstables=std::move(sstables)] (
schema_ptr s,
reader_permit permit,
const dht::partition_range& pr,
@@ -372,7 +369,7 @@ table::make_sstable_reader(schema_ptr s,
_stats.estimated_sstable_per_read, pr, slice, pc, std::move(trace_state), fwd, fwd_mr);
});
} else {
return mutation_source([semaphore, sstables=std::move(sstables)] (
return mutation_source([sstables=std::move(sstables)] (
schema_ptr s,
reader_permit permit,
const dht::partition_range& pr,
@@ -387,18 +384,14 @@ table::make_sstable_reader(schema_ptr s,
}
}();
if (semaphore) {
return make_restricted_flat_reader(*semaphore, std::move(ms), std::move(s), pr, slice, pc, std::move(trace_state), fwd, fwd_mr);
} else {
return ms.make_reader(std::move(s), no_reader_permit(), pr, slice, pc, std::move(trace_state), fwd, fwd_mr);
}
return make_restricted_flat_reader(std::move(ms), std::move(s), std::move(permit), pr, slice, pc, std::move(trace_state), fwd, fwd_mr);
}
// Exposed for testing, not performance critical.
future<table::const_mutation_partition_ptr>
table::find_partition(schema_ptr s, const dht::decorated_key& key) const {
return do_with(dht::partition_range::make_singular(key), [s = std::move(s), this] (auto& range) {
return do_with(this->make_reader(s, range), [s] (flat_mutation_reader& reader) {
table::find_partition(schema_ptr s, reader_permit permit, const dht::decorated_key& key) const {
return do_with(dht::partition_range::make_singular(key), [s = std::move(s), permit = std::move(permit), this] (auto& range) mutable {
return do_with(this->make_reader(std::move(s), std::move(permit), range), [] (flat_mutation_reader& reader) {
return read_mutation_from_flat_mutation_reader(reader, db::no_timeout).then([] (mutation_opt&& mo) -> std::unique_ptr<const mutation_partition> {
if (!mo) {
return {};
@@ -410,13 +403,13 @@ table::find_partition(schema_ptr s, const dht::decorated_key& key) const {
}
future<table::const_mutation_partition_ptr>
table::find_partition_slow(schema_ptr s, const partition_key& key) const {
return find_partition(s, dht::decorate_key(*s, key));
table::find_partition_slow(schema_ptr s, reader_permit permit, const partition_key& key) const {
return find_partition(s, std::move(permit), dht::decorate_key(*s, key));
}
future<table::const_row_ptr>
table::find_row(schema_ptr s, const dht::decorated_key& partition_key, clustering_key clustering_key) const {
return find_partition(s, partition_key).then([clustering_key = std::move(clustering_key), s] (const_mutation_partition_ptr p) {
table::find_row(schema_ptr s, reader_permit permit, const dht::decorated_key& partition_key, clustering_key clustering_key) const {
return find_partition(s, std::move(permit), partition_key).then([clustering_key = std::move(clustering_key), s] (const_mutation_partition_ptr p) {
if (!p) {
return make_ready_future<const_row_ptr>();
}
@@ -432,6 +425,7 @@ table::find_row(schema_ptr s, const dht::decorated_key& partition_key, clusterin
flat_mutation_reader
table::make_reader(schema_ptr s,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
@@ -439,7 +433,7 @@ table::make_reader(schema_ptr s,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr) const {
if (_virtual_reader) {
return (*_virtual_reader).make_reader(s, no_reader_permit(), range, slice, pc, trace_state, fwd, fwd_mr);
return (*_virtual_reader).make_reader(s, std::move(permit), range, slice, pc, trace_state, fwd, fwd_mr);
}
std::vector<flat_mutation_reader> readers;
@@ -466,13 +460,13 @@ table::make_reader(schema_ptr s,
// https://github.com/scylladb/scylla/issues/185
for (auto&& mt : *_memtables) {
readers.emplace_back(mt->make_flat_reader(s, range, slice, pc, trace_state, fwd, fwd_mr));
readers.emplace_back(mt->make_flat_reader(s, permit, range, slice, pc, trace_state, fwd, fwd_mr));
}
if (cache_enabled() && !slice.options.contains(query::partition_slice::option::bypass_cache)) {
readers.emplace_back(_cache.make_reader(s, range, slice, pc, std::move(trace_state), fwd, fwd_mr));
readers.emplace_back(_cache.make_reader(s, permit, range, slice, pc, std::move(trace_state), fwd, fwd_mr));
} else {
readers.emplace_back(make_sstable_reader(s, _sstables, range, slice, pc, std::move(trace_state), fwd, fwd_mr));
readers.emplace_back(make_sstable_reader(s, permit, _sstables, range, slice, pc, std::move(trace_state), fwd, fwd_mr));
}
auto comb_reader = make_combined_reader(s, std::move(readers), fwd, fwd_mr);
@@ -496,25 +490,27 @@ sstables::shared_sstable table::make_streaming_sstable_for_write(std::optional<s
flat_mutation_reader
table::make_streaming_reader(schema_ptr s,
const dht::partition_range_vector& ranges) const {
auto permit = _config.streaming_read_concurrency_semaphore->make_permit();
auto& slice = s->full_slice();
auto& pc = service::get_local_streaming_read_priority();
auto source = mutation_source([this] (schema_ptr s, reader_permit, const dht::partition_range& range, const query::partition_slice& slice,
auto source = mutation_source([this] (schema_ptr s, reader_permit permit, const dht::partition_range& range, const query::partition_slice& slice,
const io_priority_class& pc, tracing::trace_state_ptr trace_state, streamed_mutation::forwarding fwd, mutation_reader::forwarding fwd_mr) {
std::vector<flat_mutation_reader> readers;
readers.reserve(_memtables->size() + 1);
for (auto&& mt : *_memtables) {
readers.emplace_back(mt->make_flat_reader(s, range, slice, pc, trace_state, fwd, fwd_mr));
readers.emplace_back(mt->make_flat_reader(s, permit, range, slice, pc, trace_state, fwd, fwd_mr));
}
readers.emplace_back(make_sstable_reader(s, _sstables, range, slice, pc, std::move(trace_state), fwd, fwd_mr));
readers.emplace_back(make_sstable_reader(s, permit, _sstables, range, slice, pc, std::move(trace_state), fwd, fwd_mr));
return make_combined_reader(s, std::move(readers), fwd, fwd_mr);
});
return make_flat_multi_range_reader(s, std::move(source), ranges, slice, pc, nullptr, mutation_reader::forwarding::no);
return make_flat_multi_range_reader(s, std::move(permit), std::move(source), ranges, slice, pc, nullptr, mutation_reader::forwarding::no);
}
flat_mutation_reader table::make_streaming_reader(schema_ptr schema, const dht::partition_range& range,
const query::partition_slice& slice, mutation_reader::forwarding fwd_mr) const {
auto permit = _config.streaming_read_concurrency_semaphore->make_permit();
const auto& pc = service::get_local_streaming_read_priority();
auto trace_state = tracing::trace_state_ptr();
const auto fwd = streamed_mutation::forwarding::no;
@@ -522,9 +518,9 @@ flat_mutation_reader table::make_streaming_reader(schema_ptr schema, const dht::
std::vector<flat_mutation_reader> readers;
readers.reserve(_memtables->size() + 1);
for (auto&& mt : *_memtables) {
readers.emplace_back(mt->make_flat_reader(schema, range, slice, pc, trace_state, fwd, fwd_mr));
readers.emplace_back(mt->make_flat_reader(schema, permit, range, slice, pc, trace_state, fwd, fwd_mr));
}
readers.emplace_back(make_sstable_reader(schema, _sstables, range, slice, pc, std::move(trace_state), fwd, fwd_mr));
readers.emplace_back(make_sstable_reader(schema, permit, _sstables, range, slice, pc, std::move(trace_state), fwd, fwd_mr));
return make_combined_reader(std::move(schema), std::move(readers), fwd, fwd_mr);
}
@@ -535,7 +531,7 @@ future<std::vector<locked_cell>> table::lock_counter_cells(const mutation& m, db
// Not performance critical. Currently used for testing only.
future<bool>
table::for_all_partitions_slow(schema_ptr s, std::function<bool (const dht::decorated_key&, const mutation_partition&)> func) const {
table::for_all_partitions_slow(schema_ptr s, reader_permit permit, std::function<bool (const dht::decorated_key&, const mutation_partition&)> func) const {
struct iteration_state {
flat_mutation_reader reader;
std::function<bool (const dht::decorated_key&, const mutation_partition&)> func;
@@ -543,13 +539,14 @@ table::for_all_partitions_slow(schema_ptr s, std::function<bool (const dht::deco
bool empty = false;
public:
bool done() const { return !ok || empty; }
iteration_state(schema_ptr s, const column_family& cf, std::function<bool (const dht::decorated_key&, const mutation_partition&)>&& func)
: reader(cf.make_reader(std::move(s)))
iteration_state(schema_ptr s, reader_permit permit, const column_family& cf,
std::function<bool (const dht::decorated_key&, const mutation_partition&)>&& func)
: reader(cf.make_reader(std::move(s), std::move(permit)))
, func(std::move(func))
{ }
};
return do_with(iteration_state(std::move(s), *this, std::move(func)), [] (iteration_state& is) {
return do_with(iteration_state(std::move(s), std::move(permit), *this, std::move(func)), [] (iteration_state& is) {
return do_until([&is] { return is.done(); }, [&is] {
return read_mutation_from_flat_mutation_reader(is.reader, db::no_timeout).then([&is](mutation_opt&& mo) {
if (!mo) {
@@ -1559,14 +1556,14 @@ table::sstables_as_snapshot_source() {
return snapshot_source([this] () {
auto sst_set = _sstables;
return mutation_source([this, sst_set] (schema_ptr s,
reader_permit,
reader_permit permit,
const dht::partition_range& r,
const query::partition_slice& slice,
const io_priority_class& pc,
tracing::trace_state_ptr trace_state,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr) {
return make_sstable_reader(std::move(s), sst_set, r, slice, pc, std::move(trace_state), fwd, fwd_mr);
return make_sstable_reader(std::move(s), std::move(permit), sst_set, r, slice, pc, std::move(trace_state), fwd, fwd_mr);
}, [this, sst_set] {
return make_partition_presence_checker(sst_set);
});
@@ -2361,6 +2358,7 @@ struct query_state {
future<lw_shared_ptr<query::result>>
table::query(schema_ptr s,
const query::read_command& cmd,
query_class_config class_config,
query::result_options opts,
const dht::partition_range_vector& partition_ranges,
tracing::trace_state_ptr trace_state,
@@ -2372,14 +2370,14 @@ table::query(schema_ptr s,
_stats.reads.set_latency(lc);
auto f = opts.request == query::result_request::only_digest
? memory_limiter.new_digest_read(max_size) : memory_limiter.new_data_read(max_size);
return f.then([this, lc, s = std::move(s), &cmd, opts, &partition_ranges,
return f.then([this, lc, s = std::move(s), &cmd, class_config, opts, &partition_ranges,
trace_state = std::move(trace_state), timeout, cache_ctx = std::move(cache_ctx)] (query::result_memory_accounter accounter) mutable {
auto qs_ptr = std::make_unique<query_state>(std::move(s), cmd, opts, partition_ranges, std::move(accounter));
auto& qs = *qs_ptr;
return do_until(std::bind(&query_state::done, &qs), [this, &qs, trace_state = std::move(trace_state), timeout, cache_ctx = std::move(cache_ctx)] {
return do_until(std::bind(&query_state::done, &qs), [this, &qs, class_config, trace_state = std::move(trace_state), timeout, cache_ctx = std::move(cache_ctx)] {
auto&& range = *qs.current_partition_range++;
return data_query(qs.schema, as_mutation_source(), range, qs.cmd.slice, qs.remaining_rows(),
qs.remaining_partitions(), qs.cmd.timestamp, qs.builder, timeout, _config.max_memory_for_unlimited_query, trace_state, cache_ctx);
qs.remaining_partitions(), qs.cmd.timestamp, qs.builder, timeout, class_config, trace_state, cache_ctx);
}).then([qs_ptr = std::move(qs_ptr), &qs] {
return make_ready_future<lw_shared_ptr<query::result>>(
make_lw_shared<query::result>(qs.builder.build()));
@@ -2395,14 +2393,14 @@ table::query(schema_ptr s,
mutation_source
table::as_mutation_source() const {
return mutation_source([this] (schema_ptr s,
reader_permit,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
tracing::trace_state_ptr trace_state,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr) {
return this->make_reader(std::move(s), range, slice, pc, std::move(trace_state), fwd, fwd_mr);
return this->make_reader(std::move(s), std::move(permit), range, slice, pc, std::move(trace_state), fwd, fwd_mr);
});
}
@@ -2472,6 +2470,7 @@ table::disable_auto_compaction() {
flat_mutation_reader
table::make_reader_excluding_sstables(schema_ptr s,
reader_permit permit,
std::vector<sstables::shared_sstable>& excluded,
const dht::partition_range& range,
const query::partition_slice& slice,
@@ -2483,7 +2482,7 @@ table::make_reader_excluding_sstables(schema_ptr s,
readers.reserve(_memtables->size() + 1);
for (auto&& mt : *_memtables) {
readers.emplace_back(mt->make_flat_reader(s, range, slice, pc, trace_state, fwd, fwd_mr));
readers.emplace_back(mt->make_flat_reader(s, permit, range, slice, pc, trace_state, fwd, fwd_mr));
}
auto effective_sstables = ::make_lw_shared<sstables::sstable_set>(*_sstables);
@@ -2491,7 +2490,7 @@ table::make_reader_excluding_sstables(schema_ptr s,
effective_sstables->erase(sst);
}
readers.emplace_back(make_sstable_reader(s, std::move(effective_sstables), range, slice, pc, std::move(trace_state), fwd, fwd_mr));
readers.emplace_back(make_sstable_reader(s, permit, std::move(effective_sstables), range, slice, pc, std::move(trace_state), fwd, fwd_mr));
return make_combined_reader(s, std::move(readers), fwd, fwd_mr);
}
@@ -2524,14 +2523,15 @@ future<> table::move_sstables_from_staging(std::vector<sstables::shared_sstable>
* Given an update for the base table, calculates the set of potentially affected views,
* generates the relevant updates, and sends them to the paired view replicas.
*/
future<row_locker::lock_holder> table::push_view_replica_updates(const schema_ptr& s, const frozen_mutation& fm, db::timeout_clock::time_point timeout, tracing::trace_state_ptr tr_state) const {
future<row_locker::lock_holder> table::push_view_replica_updates(const schema_ptr& s, const frozen_mutation& fm,
db::timeout_clock::time_point timeout, tracing::trace_state_ptr tr_state, reader_concurrency_semaphore& sem) const {
//FIXME: Avoid unfreezing here.
auto m = fm.unfreeze(s);
return push_view_replica_updates(s, std::move(m), timeout, std::move(tr_state));
return push_view_replica_updates(s, std::move(m), timeout, std::move(tr_state), sem);
}
future<row_locker::lock_holder> table::do_push_view_replica_updates(const schema_ptr& s, mutation&& m, db::timeout_clock::time_point timeout, mutation_source&& source,
tracing::trace_state_ptr tr_state, const io_priority_class& io_priority, query::partition_slice::option_set custom_opts) const {
tracing::trace_state_ptr tr_state, reader_concurrency_semaphore& sem, const io_priority_class& io_priority, query::partition_slice::option_set custom_opts) const {
if (!_config.view_update_concurrency_semaphore->current()) {
// We don't have resources to generate view updates for this write. If we reached this point, we failed to
// throttle the client. The memory queue is already full, waiting on the semaphore would cause this node to
@@ -2574,14 +2574,14 @@ future<row_locker::lock_holder> table::do_push_view_replica_updates(const schema
// We'll return this lock to the caller, which will release it after
// writing the base-table update.
future<row_locker::lock_holder> lockf = local_base_lock(base, m.decorated_key(), slice.default_row_ranges(), timeout);
return lockf.then([m = std::move(m), slice = std::move(slice), views = std::move(views), base, this, timeout, now, source = std::move(source), tr_state = std::move(tr_state), &io_priority] (row_locker::lock_holder lock) mutable {
return lockf.then([m = std::move(m), slice = std::move(slice), views = std::move(views), base, this, timeout, now, source = std::move(source), tr_state = std::move(tr_state), &sem, &io_priority] (row_locker::lock_holder lock) mutable {
tracing::trace(tr_state, "View updates for {}.{} require read-before-write - base table reader is created", base->ks_name(), base->cf_name());
return do_with(
dht::partition_range::make_singular(m.decorated_key()),
std::move(slice),
std::move(m),
[base, views = std::move(views), lock = std::move(lock), this, timeout, now, source = std::move(source), &io_priority, tr_state = std::move(tr_state)] (auto& pk, auto& slice, auto& m) mutable {
auto reader = source.make_reader(base, no_reader_permit(), pk, slice, io_priority, tr_state, streamed_mutation::forwarding::no, mutation_reader::forwarding::no);
[base, views = std::move(views), lock = std::move(lock), this, timeout, now, source = std::move(source), &sem, &io_priority, tr_state = std::move(tr_state)] (auto& pk, auto& slice, auto& m) mutable {
auto reader = source.make_reader(base, sem.make_permit(), pk, slice, io_priority, tr_state, streamed_mutation::forwarding::no, mutation_reader::forwarding::no);
return this->generate_and_propagate_view_updates(base, std::move(views), std::move(m), std::move(reader), tr_state, now).then([base, tr_state = std::move(tr_state), lock = std::move(lock)] () mutable {
tracing::trace(tr_state, "View updates for {}.{} were generated and propagated", base->ks_name(), base->cf_name());
// return the local partition/row lock we have taken so it
@@ -2593,29 +2593,37 @@ future<row_locker::lock_holder> table::do_push_view_replica_updates(const schema
});
}
future<row_locker::lock_holder> table::push_view_replica_updates(const schema_ptr& s, mutation&& m, db::timeout_clock::time_point timeout, tracing::trace_state_ptr tr_state) const {
future<row_locker::lock_holder> table::push_view_replica_updates(const schema_ptr& s, mutation&& m, db::timeout_clock::time_point timeout,
tracing::trace_state_ptr tr_state, reader_concurrency_semaphore& sem) const {
return do_push_view_replica_updates(s, std::move(m), timeout, as_mutation_source(),
std::move(tr_state), service::get_local_sstable_query_read_priority(), {});
std::move(tr_state), sem, service::get_local_sstable_query_read_priority(), {});
}
future<row_locker::lock_holder>
table::stream_view_replica_updates(const schema_ptr& s, mutation&& m, db::timeout_clock::time_point timeout,
std::vector<sstables::shared_sstable>& excluded_sstables) const {
return do_push_view_replica_updates(s, std::move(m), timeout, as_mutation_source_excluding(excluded_sstables),
tracing::trace_state_ptr(), service::get_local_streaming_read_priority(), query::partition_slice::option_set::of<query::partition_slice::option::bypass_cache>());
return do_push_view_replica_updates(
s,
std::move(m),
timeout,
as_mutation_source_excluding(excluded_sstables),
tracing::trace_state_ptr(),
*_config.streaming_read_concurrency_semaphore,
service::get_local_streaming_read_priority(),
query::partition_slice::option_set::of<query::partition_slice::option::bypass_cache>());
}
mutation_source
table::as_mutation_source_excluding(std::vector<sstables::shared_sstable>& ssts) const {
return mutation_source([this, &ssts] (schema_ptr s,
reader_permit,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
const io_priority_class& pc,
tracing::trace_state_ptr trace_state,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr) {
return this->make_reader_excluding_sstables(std::move(s), ssts, range, slice, pc, std::move(trace_state), fwd, fwd_mr);
return this->make_reader_excluding_sstables(std::move(s), std::move(permit), ssts, range, slice, pc, std::move(trace_state), fwd, fwd_mr);
});
}


@@ -43,7 +43,7 @@ static void broken_sst(sstring dir, unsigned long generation, schema_ptr s, sstr
try {
sstables::test_env env;
sstable_ptr sstp = std::get<0>(env.reusable_sst(s, dir, generation, version).get());
auto r = sstp->read_rows_flat(s, no_reader_permit());
auto r = sstp->read_rows_flat(s, tests::make_permit());
r.consume(my_consumer{}, db::no_timeout).get();
BOOST_FAIL("expecting exception");
} catch (malformed_sstable_exception& e) {


@@ -36,6 +36,7 @@
#include "test/lib/memtable_snapshot_source.hh"
#include "test/lib/mutation_assertions.hh"
#include "test/lib/flat_mutation_reader_assertions.hh"
#include "test/lib/reader_permit.hh"
#include <variant>
@@ -230,7 +231,7 @@ void test_slice_single_version(mutation& underlying,
try {
auto range = dht::partition_range::make_singular(DK);
auto reader = cache.make_reader(SCHEMA, range, slice);
auto reader = cache.make_reader(SCHEMA, tests::make_permit(), range, slice);
check_produces_only(DK, std::move(reader), expected_sm_fragments, slice.row_ranges(*SCHEMA, DK.key()));


@@ -47,6 +47,7 @@
#include "test/lib/cql_test_env.hh"
#include "test/lib/data_model.hh"
#include "test/lib/sstable_utils.hh"
#include "test/lib/reader_permit.hh"
using namespace db;
@@ -639,7 +640,7 @@ SEASTAR_TEST_CASE(test_commitlog_replay_invalid_key){
}
{
auto rd = mt.make_flat_reader(s);
auto rd = mt.make_flat_reader(s, tests::make_permit());
auto mopt = read_mutation_from_flat_mutation_reader(rd, db::no_timeout).get0();
BOOST_REQUIRE(mopt);


@@ -24,6 +24,7 @@
#include "bytes.hh"
#include "utils/buffer_input_stream.hh"
#include "test/lib/reader_permit.hh"
#include <boost/test/unit_test.hpp>
#include <seastar/core/iostream.hh>
@@ -61,7 +62,7 @@ class test_consumer final : public data_consumer::continuous_data_consumer<test_
public:
test_consumer(uint64_t tested_value)
: continuous_data_consumer(no_reader_permit(), prepare_stream(tested_value), 0, calculate_length(tested_value))
: continuous_data_consumer(tests::make_permit(), prepare_stream(tested_value), 0, calculate_length(tested_value))
, _tested_value(tested_value)
{ }


@@ -27,6 +27,7 @@
#include "test/lib/cql_test_env.hh"
#include "test/lib/result_set_assertions.hh"
#include "test/lib/reader_permit.hh"
#include "database.hh"
#include "partition_slice_builder.hh"
@@ -150,7 +151,7 @@ SEASTAR_THREAD_TEST_CASE(test_database_with_data_in_sstables_is_a_mutation_sourc
tracing::trace_state_ptr trace_state,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr) {
return cf.make_reader(s, range, slice, pc, std::move(trace_state), fwd, fwd_mr);
return cf.make_reader(s, tests::make_permit(), range, slice, pc, std::move(trace_state), fwd, fwd_mr);
});
});
return make_ready_future<>();


@@ -39,6 +39,7 @@
#include "test/lib/simple_schema.hh"
#include "test/lib/flat_mutation_reader_assertions.hh"
#include "test/lib/log.hh"
#include "test/lib/reader_permit.hh"
struct mock_consumer {
struct result {
@@ -441,14 +442,14 @@ SEASTAR_TEST_CASE(test_multi_range_reader) {
// Generator ranges are single pass, so we need a new range each time they are used.
auto run_test = [&] (auto make_empty_ranges, auto make_single_ranges, auto make_multiple_ranges) {
testlog.info("empty ranges");
assert_that(make_flat_multi_range_reader(s.schema(), source, make_empty_ranges(), s.schema()->full_slice()))
assert_that(make_flat_multi_range_reader(s.schema(), tests::make_permit(), source, make_empty_ranges(), s.schema()->full_slice()))
.produces_end_of_stream()
.fast_forward_to(fft_range)
.produces(ms[9])
.produces_end_of_stream();
testlog.info("single range");
assert_that(make_flat_multi_range_reader(s.schema(), source, make_single_ranges(), s.schema()->full_slice()))
assert_that(make_flat_multi_range_reader(s.schema(), tests::make_permit(), source, make_single_ranges(), s.schema()->full_slice()))
.produces(ms[1])
.produces(ms[2])
.produces_end_of_stream()
@@ -457,7 +458,7 @@ SEASTAR_TEST_CASE(test_multi_range_reader) {
.produces_end_of_stream();
testlog.info("read full partitions and fast forward");
assert_that(make_flat_multi_range_reader(s.schema(), source, make_multiple_ranges(), s.schema()->full_slice()))
assert_that(make_flat_multi_range_reader(s.schema(), tests::make_permit(), source, make_multiple_ranges(), s.schema()->full_slice()))
.produces(ms[1])
.produces(ms[2])
.produces(ms[4])
@@ -467,7 +468,7 @@ SEASTAR_TEST_CASE(test_multi_range_reader) {
.produces_end_of_stream();
testlog.info("read, skip partitions and fast forward");
assert_that(make_flat_multi_range_reader(s.schema(), source, make_multiple_ranges(), s.schema()->full_slice()))
assert_that(make_flat_multi_range_reader(s.schema(), tests::make_permit(), source, make_multiple_ranges(), s.schema()->full_slice()))
.produces_partition_start(keys[1])
.next_partition()
.produces_partition_start(keys[2])


@@ -37,6 +37,7 @@
#include "test/lib/data_model.hh"
#include "test/lib/random_utils.hh"
#include "test/lib/log.hh"
#include "test/lib/reader_permit.hh"
static api::timestamp_type next_timestamp() {
static thread_local api::timestamp_type next_timestamp = 1;
@@ -96,7 +97,7 @@ SEASTAR_TEST_CASE(test_memtable_with_many_versions_conforms_to_mutation_source)
for (auto&& m : muts) {
mt->apply(m);
// Create reader so that each mutation is in a separate version
flat_mutation_reader rd = mt->make_flat_reader(s, ranges_storage.emplace_back(dht::partition_range::make_singular(m.decorated_key())));
flat_mutation_reader rd = mt->make_flat_reader(s, tests::make_permit(), ranges_storage.emplace_back(dht::partition_range::make_singular(m.decorated_key())));
rd.set_max_buffer_size(1);
rd.fill_buffer(db::no_timeout).get();
readers.push_back(std::move(rd));
@@ -199,8 +200,8 @@ SEASTAR_TEST_CASE(test_adding_a_column_during_reading_doesnt_affect_read_result)
mt->apply(m);
}
auto check_rd_s1 = assert_that(mt->make_flat_reader(s1));
auto check_rd_s2 = assert_that(mt->make_flat_reader(s2));
auto check_rd_s1 = assert_that(mt->make_flat_reader(s1, tests::make_permit()));
auto check_rd_s2 = assert_that(mt->make_flat_reader(s2, tests::make_permit()));
check_rd_s1.next_mutation().has_schema(s1).is_equal_to(ring[0]);
check_rd_s2.next_mutation().has_schema(s2).is_equal_to(ring[0]);
mt->set_schema(s2);
@@ -211,13 +212,13 @@ SEASTAR_TEST_CASE(test_adding_a_column_during_reading_doesnt_affect_read_result)
check_rd_s1.produces_end_of_stream();
check_rd_s2.produces_end_of_stream();
assert_that(mt->make_flat_reader(s1))
assert_that(mt->make_flat_reader(s1, tests::make_permit()))
.produces(ring[0])
.produces(ring[1])
.produces(ring[2])
.produces_end_of_stream();
assert_that(mt->make_flat_reader(s2))
assert_that(mt->make_flat_reader(s2, tests::make_permit()))
.produces(ring[0])
.produces(ring[1])
.produces(ring[2])
@@ -249,7 +250,7 @@ SEASTAR_TEST_CASE(test_virtual_dirty_accounting_on_flush) {
}
// Create a reader which will cause many partition versions to be created
flat_mutation_reader_opt rd1 = mt->make_flat_reader(s);
flat_mutation_reader_opt rd1 = mt->make_flat_reader(s, tests::make_permit());
rd1->set_max_buffer_size(1);
rd1->fill_buffer(db::no_timeout).get();
@@ -312,29 +313,29 @@ SEASTAR_TEST_CASE(test_partition_version_consistency_after_lsa_compaction_happen
m3.set_clustered_cell(ck3, to_bytes("col"), data_value(bytes(bytes::initialized_later(), 8)), next_timestamp());
mt->apply(m1);
std::optional<flat_reader_assertions> rd1 = assert_that(mt->make_flat_reader(s));
std::optional<flat_reader_assertions> rd1 = assert_that(mt->make_flat_reader(s, tests::make_permit()));
rd1->set_max_buffer_size(1);
rd1->fill_buffer().get();
mt->apply(m2);
std::optional<flat_reader_assertions> rd2 = assert_that(mt->make_flat_reader(s));
std::optional<flat_reader_assertions> rd2 = assert_that(mt->make_flat_reader(s, tests::make_permit()));
rd2->set_max_buffer_size(1);
rd2->fill_buffer().get();
mt->apply(m3);
std::optional<flat_reader_assertions> rd3 = assert_that(mt->make_flat_reader(s));
std::optional<flat_reader_assertions> rd3 = assert_that(mt->make_flat_reader(s, tests::make_permit()));
rd3->set_max_buffer_size(1);
rd3->fill_buffer().get();
logalloc::shard_tracker().full_compaction();
auto rd4 = assert_that(mt->make_flat_reader(s));
auto rd4 = assert_that(mt->make_flat_reader(s, tests::make_permit()));
rd4.set_max_buffer_size(1);
rd4.fill_buffer().get();
auto rd5 = assert_that(mt->make_flat_reader(s));
auto rd5 = assert_that(mt->make_flat_reader(s, tests::make_permit()));
rd5.set_max_buffer_size(1);
rd5.fill_buffer().get();
auto rd6 = assert_that(mt->make_flat_reader(s));
auto rd6 = assert_that(mt->make_flat_reader(s, tests::make_permit()));
rd6.set_max_buffer_size(1);
rd6.fill_buffer().get();
@@ -421,7 +422,7 @@ SEASTAR_TEST_CASE(test_fast_forward_to_after_memtable_is_flushed) {
mt2->apply(m);
}
auto rd = assert_that(mt->make_flat_reader(s));
auto rd = assert_that(mt->make_flat_reader(s, tests::make_permit()));
rd.produces(ring[0]);
mt->mark_flushed(mt2->as_data_source());
rd.produces(ring[1]);
@@ -447,7 +448,7 @@ SEASTAR_TEST_CASE(test_exception_safety_of_partition_range_reads) {
do {
try {
injector.fail_after(i++);
assert_that(mt->make_flat_reader(s, query::full_partition_range))
assert_that(mt->make_flat_reader(s, tests::make_permit(), query::full_partition_range))
.produces(ms);
injector.cancel();
} catch (const std::bad_alloc&) {
@@ -500,7 +501,7 @@ SEASTAR_TEST_CASE(test_exception_safety_of_single_partition_reads) {
do {
try {
injector.fail_after(i++);
assert_that(mt->make_flat_reader(s, dht::partition_range::make_singular(ms[1].decorated_key())))
assert_that(mt->make_flat_reader(s, tests::make_permit(), dht::partition_range::make_singular(ms[1].decorated_key())))
.produces(ms[1]);
injector.cancel();
} catch (const std::bad_alloc&) {
@@ -524,7 +525,7 @@ SEASTAR_TEST_CASE(test_hash_is_cached) {
mt->apply(m);
{
auto rd = mt->make_flat_reader(s);
auto rd = mt->make_flat_reader(s, tests::make_permit());
rd(db::no_timeout).get0()->as_partition_start();
clustering_row row = std::move(rd(db::no_timeout).get0()->as_mutable_clustering_row());
BOOST_REQUIRE(!row.cells().cell_hash_for(0));
@@ -533,14 +534,14 @@ SEASTAR_TEST_CASE(test_hash_is_cached) {
{
auto slice = s->full_slice();
slice.options.set<query::partition_slice::option::with_digest>();
auto rd = mt->make_flat_reader(s, query::full_partition_range, slice);
auto rd = mt->make_flat_reader(s, tests::make_permit(), query::full_partition_range, slice);
rd(db::no_timeout).get0()->as_partition_start();
clustering_row row = std::move(rd(db::no_timeout).get0()->as_mutable_clustering_row());
BOOST_REQUIRE(row.cells().cell_hash_for(0));
}
{
auto rd = mt->make_flat_reader(s);
auto rd = mt->make_flat_reader(s, tests::make_permit());
rd(db::no_timeout).get0()->as_partition_start();
clustering_row row = std::move(rd(db::no_timeout).get0()->as_mutable_clustering_row());
BOOST_REQUIRE(row.cells().cell_hash_for(0));
@@ -550,7 +551,7 @@ SEASTAR_TEST_CASE(test_hash_is_cached) {
mt->apply(m);
{
auto rd = mt->make_flat_reader(s);
auto rd = mt->make_flat_reader(s, tests::make_permit());
rd(db::no_timeout).get0()->as_partition_start();
clustering_row row = std::move(rd(db::no_timeout).get0()->as_mutable_clustering_row());
BOOST_REQUIRE(!row.cells().cell_hash_for(0));
@@ -559,14 +560,14 @@ SEASTAR_TEST_CASE(test_hash_is_cached) {
{
auto slice = s->full_slice();
slice.options.set<query::partition_slice::option::with_digest>();
auto rd = mt->make_flat_reader(s, query::full_partition_range, slice);
auto rd = mt->make_flat_reader(s, tests::make_permit(), query::full_partition_range, slice);
rd(db::no_timeout).get0()->as_partition_start();
clustering_row row = std::move(rd(db::no_timeout).get0()->as_mutable_clustering_row());
BOOST_REQUIRE(row.cells().cell_hash_for(0));
}
{
auto rd = mt->make_flat_reader(s);
auto rd = mt->make_flat_reader(s, tests::make_permit());
rd(db::no_timeout).get0()->as_partition_start();
clustering_row row = std::move(rd(db::no_timeout).get0()->as_mutable_clustering_row());
BOOST_REQUIRE(row.cells().cell_hash_for(0));


@@ -35,6 +35,7 @@
#include "test/lib/dummy_sharder.hh"
#include "test/lib/reader_lifecycle_policy.hh"
#include "test/lib/log.hh"
#include "test/lib/reader_permit.hh"
#include "dht/sharder.hh"
#include "mutation_reader.hh"
@@ -50,10 +51,11 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_as_mutation_source) {
// It has to be a container that does not invalidate pointers
std::list<dummy_sharder> keep_alive_sharder;
+test_reader_lifecycle_policy::operations_gate operations_gate;
-do_with_cql_env([&keep_alive_sharder] (cql_test_env& env) -> future<> {
-auto make_populate = [&keep_alive_sharder, &env] (bool evict_paused_readers, bool single_fragment_buffer) {
-return [&keep_alive_sharder, &env, evict_paused_readers, single_fragment_buffer] (schema_ptr s, const std::vector<mutation>& mutations) mutable {
+do_with_cql_env([&] (cql_test_env& env) -> future<> {
+auto make_populate = [&] (bool evict_paused_readers, bool single_fragment_buffer) {
+return [&, evict_paused_readers, single_fragment_buffer] (schema_ptr s, const std::vector<mutation>& mutations) mutable {
// We need to group mutations that have the same token so they land on the same shard.
std::map<dht::token, std::vector<frozen_mutation>> mutations_by_token;
@@ -83,7 +85,7 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_as_mutation_source) {
}
keep_alive_sharder.push_back(sharder);
-return mutation_source([&keep_alive_sharder, remote_memtables, evict_paused_readers, single_fragment_buffer] (schema_ptr s,
+return mutation_source([&, remote_memtables, evict_paused_readers, single_fragment_buffer] (schema_ptr s,
reader_permit,
const dht::partition_range& range,
const query::partition_slice& slice,
@@ -98,7 +100,7 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_as_mutation_source) {
const io_priority_class& pc,
tracing::trace_state_ptr trace_state,
mutation_reader::forwarding fwd_mr) {
-auto reader = remote_memtables->at(this_shard_id())->make_flat_reader(s, range, slice, pc, std::move(trace_state),
+auto reader = remote_memtables->at(this_shard_id())->make_flat_reader(s, tests::make_permit(), range, slice, pc, std::move(trace_state),
streamed_mutation::forwarding::no, fwd_mr);
if (single_fragment_buffer) {
reader.set_max_buffer_size(1);
@@ -106,7 +108,7 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_as_mutation_source) {
return reader;
};
-auto lifecycle_policy = seastar::make_shared<test_reader_lifecycle_policy>(std::move(factory), evict_paused_readers);
+auto lifecycle_policy = seastar::make_shared<test_reader_lifecycle_policy>(std::move(factory), operations_gate, evict_paused_readers);
auto mr = make_multishard_combining_reader_for_tests(keep_alive_sharder.back(), std::move(lifecycle_policy), s, range, slice, pc, trace_state, fwd_mr);
if (fwd_sm == streamed_mutation::forwarding::yes) {
return make_forwardable(std::move(mr));
@@ -125,6 +127,6 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_as_mutation_source) {
testlog.info("run_mutation_source_tests(evict_readers=true, single_fragment_buffer=true)");
run_mutation_source_tests(make_populate(true, true));
-return make_ready_future<>();
+return operations_gate.close();
}).get();
}
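The `operations_gate` now passed to `test_reader_lifecycle_policy`, together with `return operations_gate.close();` replacing `return make_ready_future<>();`, makes the test wait for background reader operations to drain before the environment is torn down. A simplified synchronous stand-in for the gate (the real one is built on Seastar futures; this model returns a bool instead of a future):

```cpp
#include <cassert>
#include <stdexcept>

// Simplified synchronous model of the test operations_gate: background
// reader operations enter/leave the gate, and the test's final close()
// only succeeds once all of them have drained.
class operations_gate {
    int _outstanding = 0;
    bool _closed = false;
public:
    void enter() {
        if (_closed) {
            throw std::runtime_error("operation started after gate closed");
        }
        ++_outstanding;
    }
    void leave() { --_outstanding; }
    // Returns true when every registered operation has finished; the real
    // gate would instead return a future resolving at that point.
    bool close() {
        _closed = true;
        return _outstanding == 0;
    }
    int outstanding() const { return _outstanding; }
};
```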


@@ -1021,13 +1021,13 @@ SEASTAR_THREAD_TEST_CASE(fuzzy_test) {
const auto& partitions = pop_desc.partitions;
smp::invoke_on_all([cfg, db = &env.db(), gs = global_schema_ptr(pop_desc.schema), &partitions] {
-auto& sem = db->local().find_column_family(gs.get()).read_concurrency_semaphore();
+auto& sem = db->local().make_query_class_config().semaphore;
auto resources = sem.available_resources();
resources -= reader_concurrency_semaphore::resources{1, 0};
-auto permit = sem.consume_resources(resources);
+auto permit = sem.make_permit();
-return run_fuzzy_test_workload(cfg, *db, gs.get(), partitions).finally([permit = std::move(permit)] {});
+return run_fuzzy_test_workload(cfg, *db, gs.get(), partitions).finally([units = permit.consume_resources(resources)] {});
}).handle_exception([seed] (std::exception_ptr e) {
testlog.error("Test workload failed with exception {}."
" To repeat this particular run, replace the random seed of the test, with that of this run ({})."

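The fuzzy-test hunk above switches from consuming resources directly on the semaphore to `sem.make_permit()` followed by `permit.consume_resources(resources)`, whose returned units hold most of the semaphore's resources for the duration of the workload and release them on destruction. A compact model of that RAII pattern, with illustrative names rather than the real classes:

```cpp
#include <cassert>

// Illustrative model of make_permit() + consume_resources(): the returned
// RAII units hold resources until destroyed.
struct resources {
    int count;
    long memory;
    resources& operator-=(const resources& o) {
        count -= o.count;
        memory -= o.memory;
        return *this;
    }
};

class semaphore_model;

class resource_units {
    semaphore_model* _sem;
    resources _r;
public:
    resource_units(semaphore_model& sem, resources r);
    resource_units(const resource_units&) = delete;
    resource_units(resource_units&& o) noexcept : _sem(o._sem), _r(o._r) { o._sem = nullptr; }
    ~resource_units();
};

class permit_model {
    semaphore_model* _sem;
public:
    explicit permit_model(semaphore_model& sem) : _sem(&sem) {}
    resource_units consume_resources(resources r);
};

class semaphore_model {
    resources _avail;
public:
    explicit semaphore_model(resources initial) : _avail(initial) {}
    permit_model make_permit() { return permit_model(*this); }
    resources available_resources() const { return _avail; }
    void consume(resources r) { _avail -= r; }
    void release(resources r) {
        _avail.count += r.count;
        _avail.memory += r.memory;
    }
};

resource_units::resource_units(semaphore_model& sem, resources r) : _sem(&sem), _r(r) {
    sem.consume(r); // take the resources up front
}
resource_units::~resource_units() {
    if (_sem) {
        _sem->release(_r); // give them back when the workload finishes
    }
}
resource_units permit_model::consume_resources(resources r) {
    return resource_units(*_sem, r);
}
```

The test drains everything except one count unit, exactly as the hunk's `resources -= reader_concurrency_semaphore::resources{1, 0};` line does, so the workload under test runs with a nearly exhausted semaphore.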

@@ -33,6 +33,7 @@
#include "memtable.hh"
#include "test/lib/mutation_assertions.hh"
+#include "test/lib/reader_permit.hh"
// A StreamedMutationConsumer which distributes fragments randomly into several mutations.
class fragment_scatterer {
@@ -127,7 +128,7 @@ SEASTAR_TEST_CASE(test_mutation_merger_conforms_to_mutation_source) {
{
std::vector<flat_mutation_reader> readers;
for (int i = 0; i < n; ++i) {
-readers.push_back(memtables[i]->make_flat_reader(s, range, slice, pc, trace_state, fwd, fwd_mr));
+readers.push_back(memtables[i]->make_flat_reader(s, tests::make_permit(), range, slice, pc, trace_state, fwd, fwd_mr));
}
return make_combined_reader(s, std::move(readers), fwd, fwd_mr);
});


@@ -34,6 +34,7 @@
#include "test/lib/mutation_assertions.hh"
#include "test/lib/result_set_assertions.hh"
#include "test/lib/mutation_source_test.hh"
+#include "test/lib/reader_permit.hh"
#include "mutation_query.hh"
#include <seastar/core/do_with.hh>
@@ -76,7 +77,6 @@ static query::partition_slice make_full_slice(const schema& s) {
}
static auto inf32 = std::numeric_limits<unsigned>::max();
-static const uint64_t max_memory_for_reverse_query = 1 << 20;
query::result_set to_result_set(const reconcilable_result& r, schema_ptr s, const query::partition_slice& slice) {
return query::result_set::from_raw_result(s, slice, to_data_query_result(r, s, slice, inf32, inf32));
@@ -101,7 +101,7 @@ SEASTAR_TEST_CASE(test_reading_from_single_partition) {
auto slice = make_full_slice(*s);
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 2, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 2, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
// FIXME: use mutation assertions
assert_that(to_result_set(result, s, slice))
@@ -124,7 +124,7 @@ SEASTAR_TEST_CASE(test_reading_from_single_partition) {
.build();
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, query::max_rows, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, query::max_rows, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_only(a_row()
@@ -160,7 +160,7 @@ SEASTAR_TEST_CASE(test_cells_are_expired_according_to_query_timestamp) {
auto slice = make_full_slice(*s);
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 1, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 1, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_only(a_row()
@@ -174,7 +174,7 @@ SEASTAR_TEST_CASE(test_cells_are_expired_according_to_query_timestamp) {
auto slice = make_full_slice(*s);
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 1, query::max_partitions, now + 2s, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 1, query::max_partitions, now + 2s, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_only(a_row()
@@ -207,7 +207,7 @@ SEASTAR_TEST_CASE(test_reverse_ordering_is_respected) {
.build();
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 3, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 3, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_size(3)
@@ -237,7 +237,7 @@ SEASTAR_TEST_CASE(test_reverse_ordering_is_respected) {
.build();
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 3, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 3, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_size(3)
@@ -265,7 +265,7 @@ SEASTAR_TEST_CASE(test_reverse_ordering_is_respected) {
{
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 10, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 10, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_size(3)
@@ -285,7 +285,7 @@ SEASTAR_TEST_CASE(test_reverse_ordering_is_respected) {
{
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 1, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 1, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_size(1)
@@ -297,7 +297,7 @@ SEASTAR_TEST_CASE(test_reverse_ordering_is_respected) {
{
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 2, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 2, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_size(2)
@@ -324,7 +324,7 @@ SEASTAR_TEST_CASE(test_reverse_ordering_is_respected) {
.build();
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 2, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 2, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_size(2)
@@ -348,7 +348,7 @@ SEASTAR_TEST_CASE(test_reverse_ordering_is_respected) {
.build();
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 3, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 3, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_size(2)
@@ -370,7 +370,7 @@ SEASTAR_TEST_CASE(test_reverse_ordering_is_respected) {
.build();
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, 3, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, 3, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_only(a_row()
@@ -396,7 +396,7 @@ SEASTAR_TEST_CASE(test_query_when_partition_tombstone_covers_live_cells) {
auto slice = make_full_slice(*s);
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, query::max_rows, query::max_partitions, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, query::max_rows, query::max_partitions, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.is_empty();
@@ -447,7 +447,7 @@ SEASTAR_TEST_CASE(test_partitions_with_only_expired_tombstones_are_dropped) {
auto query_time = now + std::chrono::seconds(1);
reconcilable_result result = mutation_query(s, src, query::full_partition_range, slice, query::max_rows, query::max_partitions, query_time,
-db::no_timeout, max_memory_for_reverse_query).get0();
+db::no_timeout, tests::make_query_class_config()).get0();
BOOST_REQUIRE_EQUAL(result.partitions().size(), 2);
BOOST_REQUIRE_EQUAL(result.row_count(), 2);
@@ -466,28 +466,28 @@ SEASTAR_TEST_CASE(test_result_row_count) {
auto src = make_source({m1});
auto r = to_data_query_result(mutation_query(s, make_source({m1}), query::full_partition_range, slice, 10000, query::max_partitions, now,
-db::no_timeout, max_memory_for_reverse_query).get0(), s, slice, inf32, inf32);
+db::no_timeout, tests::make_query_class_config()).get0(), s, slice, inf32, inf32);
BOOST_REQUIRE_EQUAL(r.row_count().value(), 0);
m1.set_static_cell("s1", data_value(bytes("S_v1")), 1);
r = to_data_query_result(mutation_query(s, make_source({m1}), query::full_partition_range, slice, 10000, query::max_partitions, now,
-db::no_timeout, max_memory_for_reverse_query).get0(), s, slice, inf32, inf32);
+db::no_timeout, tests::make_query_class_config()).get0(), s, slice, inf32, inf32);
BOOST_REQUIRE_EQUAL(r.row_count().value(), 1);
m1.set_clustered_cell(clustering_key::from_single_value(*s, bytes("A")), "v1", data_value(bytes("A_v1")), 1);
r = to_data_query_result(mutation_query(s, make_source({m1}), query::full_partition_range, slice, 10000, query::max_partitions, now,
-db::no_timeout, max_memory_for_reverse_query).get0(), s, slice, inf32, inf32);
+db::no_timeout, tests::make_query_class_config()).get0(), s, slice, inf32, inf32);
BOOST_REQUIRE_EQUAL(r.row_count().value(), 1);
m1.set_clustered_cell(clustering_key::from_single_value(*s, bytes("B")), "v1", data_value(bytes("B_v1")), 1);
r = to_data_query_result(mutation_query(s, make_source({m1}), query::full_partition_range, slice, 10000, query::max_partitions, now,
-db::no_timeout, max_memory_for_reverse_query).get0(), s, slice, inf32, inf32);
+db::no_timeout, tests::make_query_class_config()).get0(), s, slice, inf32, inf32);
BOOST_REQUIRE_EQUAL(r.row_count().value(), 2);
mutation m2(s, partition_key::from_single_value(*s, "key2"));
m2.set_static_cell("s1", data_value(bytes("S_v1")), 1);
r = to_data_query_result(mutation_query(s, make_source({m1, m2}), query::full_partition_range, slice, 10000, query::max_partitions, now,
-db::no_timeout, max_memory_for_reverse_query).get0(), s, slice, inf32, inf32);
+db::no_timeout, tests::make_query_class_config()).get0(), s, slice, inf32, inf32);
BOOST_REQUIRE_EQUAL(r.row_count().value(), 3);
});
}
@@ -510,7 +510,7 @@ SEASTAR_TEST_CASE(test_partition_limit) {
{
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, query::max_rows, 10, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, query::max_rows, 10, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_size(2)
@@ -526,7 +526,7 @@ SEASTAR_TEST_CASE(test_partition_limit) {
{
reconcilable_result result = mutation_query(s, src,
-query::full_partition_range, slice, query::max_rows, 1, now, db::no_timeout, max_memory_for_reverse_query).get0();
+query::full_partition_range, slice, query::max_rows, 1, now, db::no_timeout, tests::make_query_class_config()).get0();
assert_that(to_result_set(result, s, slice))
.has_size(1)
@@ -549,11 +549,11 @@ SEASTAR_THREAD_TEST_CASE(test_result_size_calculation) {
query::result::builder digest_only_builder(slice, query::result_options{query::result_request::only_digest, query::digest_algorithm::xxHash}, l.new_digest_read(query::result_memory_limiter::maximum_result_size).get0());
data_query(s, source, query::full_partition_range, slice, std::numeric_limits<uint32_t>::max(), std::numeric_limits<uint32_t>::max(),
-gc_clock::now(), digest_only_builder, db::no_timeout, max_memory_for_reverse_query).get0();
+gc_clock::now(), digest_only_builder, db::no_timeout, tests::make_query_class_config()).get0();
query::result::builder result_and_digest_builder(slice, query::result_options{query::result_request::result_and_digest, query::digest_algorithm::xxHash}, l.new_data_read(query::result_memory_limiter::maximum_result_size).get0());
data_query(s, source, query::full_partition_range, slice, std::numeric_limits<uint32_t>::max(), std::numeric_limits<uint32_t>::max(),
-gc_clock::now(), result_and_digest_builder, db::no_timeout, max_memory_for_reverse_query).get0();
+gc_clock::now(), result_and_digest_builder, db::no_timeout, tests::make_query_class_config()).get0();
BOOST_REQUIRE_EQUAL(digest_only_builder.memory_accounter().used_memory(), result_and_digest_builder.memory_accounter().used_memory());
}
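Throughout this file the bare `max_memory_for_reverse_query` argument of `mutation_query()`/`data_query()` is replaced by `tests::make_query_class_config()`. A hypothetical sketch of the shape such a configuration bundle could take, consistent with the cover letter (the real struct carries the reader concurrency semaphore and the memory cap for otherwise unlimited queries; the field and function names below are guesses, not the actual API):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical query-class configuration: classifying a query (user vs.
// system, by scheduling group) selects one of these, bundling the
// concurrency semaphore with the memory cap for unpaged queries.
struct reader_semaphore_stub {
    long available_memory;
};

struct query_class_config {
    reader_semaphore_stub* semaphore;        // concurrency control for this class
    std::uint64_t max_memory_for_unlimited_query; // cap applied to unpaged queries
};

// Analogue of tests::make_query_class_config(): a permissive test config;
// tests previously passed `1 << 20` directly as the memory limit.
inline query_class_config make_test_query_class_config(reader_semaphore_stub& sem) {
    return query_class_config{&sem, std::uint64_t{1} << 20};
}
```

Passing the whole config down instead of a single number is what lets the database level choose different limits (and semaphores) for user and system reads without the lower layers knowing which class they serve.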


@@ -44,6 +44,7 @@
#include "test/lib/make_random_string.hh"
#include "test/lib/dummy_sharder.hh"
#include "test/lib/reader_lifecycle_policy.hh"
+#include "test/lib/reader_permit.hh"
#include "dht/sharder.hh"
#include "mutation_reader.hh"
@@ -635,7 +636,7 @@ SEASTAR_THREAD_TEST_CASE(combined_mutation_reader_test) {
sstable_mutation_readers.emplace_back(
sst->as_mutation_source().make_reader(
s.schema(),
-no_reader_permit(),
+tests::make_permit(),
query::full_partition_range,
s.schema()->full_slice(),
seastar::default_priority_class(),
@@ -649,7 +650,7 @@ SEASTAR_THREAD_TEST_CASE(combined_mutation_reader_test) {
auto incremental_reader = make_local_shard_sstable_reader(
s.schema(),
-no_reader_permit(),
+tests::make_permit(),
sstable_set,
query::full_partition_range,
s.schema()->full_slice(),
@@ -988,7 +989,7 @@ public:
return flat_mutation_reader(std::move(tracker_ptr));
});
-_reader = make_restricted_flat_reader(semaphore, std::move(ms), schema);
+_reader = make_restricted_flat_reader(std::move(ms), schema, semaphore.make_permit());
}
reader_wrapper(
@@ -1084,8 +1085,7 @@ class dummy_file_impl : public file_impl {
SEASTAR_TEST_CASE(reader_restriction_file_tracking) {
return async([&] {
reader_concurrency_semaphore semaphore(100, 4 * 1024, get_name());
-// Testing the tracker here, no need to have a base cost.
-auto permit = semaphore.wait_admission(0, db::no_timeout).get0();
+auto permit = semaphore.make_permit();
{
auto tracked_file = make_tracked_file(file(shared_ptr<file_impl>(make_shared<dummy_file_impl>())), permit);
@@ -1327,16 +1327,14 @@ SEASTAR_TEST_CASE(restricted_reader_create_reader) {
SEASTAR_TEST_CASE(test_restricted_reader_as_mutation_source) {
return seastar::async([test_name = get_name()] {
reader_concurrency_semaphore semaphore(100, 10 * new_reader_base_cost, test_name);
-auto make_restricted_populator = [&semaphore](schema_ptr s, const std::vector<mutation> &muts) {
+auto make_restricted_populator = [] (schema_ptr s, const std::vector<mutation> &muts) {
auto mt = make_lw_shared<memtable>(s);
for (auto &&mut : muts) {
mt->apply(mut);
}
auto ms = mt->as_data_source();
-return mutation_source([&semaphore, ms = std::move(ms)](schema_ptr schema,
+return mutation_source([ms = std::move(ms)](schema_ptr schema,
reader_permit permit,
const dht::partition_range& range,
const query::partition_slice& slice,
@@ -1344,7 +1342,7 @@ SEASTAR_TEST_CASE(test_restricted_reader_as_mutation_source) {
tracing::trace_state_ptr tr,
streamed_mutation::forwarding fwd,
mutation_reader::forwarding fwd_mr) {
-return make_restricted_flat_reader(semaphore, std::move(ms), std::move(schema), range, slice, pc, tr,
+return make_restricted_flat_reader(std::move(ms), std::move(schema), std::move(permit), range, slice, pc, tr,
fwd, fwd_mr);
});
};
@@ -1384,7 +1382,7 @@ SEASTAR_TEST_CASE(test_fast_forwarding_combined_reader_is_consistent_with_slicin
}
mutation_source ds = create_sstable(env, s, muts)->as_mutation_source();
readers.push_back(ds.make_reader(s,
-no_reader_permit(),
+tests::make_permit(),
dht::partition_range::make({keys[0]}, {keys[0]}),
s->full_slice(), default_priority_class(), nullptr,
streamed_mutation::forwarding::yes,
@@ -1459,8 +1457,8 @@ SEASTAR_TEST_CASE(test_combined_reader_slicing_with_overlapping_range_tombstones
{
auto slice = partition_slice_builder(*s).with_range(range).build();
-readers.push_back(ds1.make_reader(s, no_reader_permit(), query::full_partition_range, slice));
-readers.push_back(ds2.make_reader(s, no_reader_permit(), query::full_partition_range, slice));
+readers.push_back(ds1.make_reader(s, tests::make_permit(), query::full_partition_range, slice));
+readers.push_back(ds2.make_reader(s, tests::make_permit(), query::full_partition_range, slice));
auto rd = make_combined_reader(s, std::move(readers),
streamed_mutation::forwarding::no, mutation_reader::forwarding::no);
@@ -1482,9 +1480,9 @@ SEASTAR_TEST_CASE(test_combined_reader_slicing_with_overlapping_range_tombstones
// Check fast_forward_to()
{
-readers.push_back(ds1.make_reader(s, no_reader_permit(), query::full_partition_range, s->full_slice(), default_priority_class(),
+readers.push_back(ds1.make_reader(s, tests::make_permit(), query::full_partition_range, s->full_slice(), default_priority_class(),
nullptr, streamed_mutation::forwarding::yes));
-readers.push_back(ds2.make_reader(s, no_reader_permit(), query::full_partition_range, s->full_slice(), default_priority_class(),
+readers.push_back(ds2.make_reader(s, tests::make_permit(), query::full_partition_range, s->full_slice(), default_priority_class(),
nullptr, streamed_mutation::forwarding::yes));
auto rd = make_combined_reader(s, std::move(readers),
@@ -1589,6 +1587,7 @@ SEASTAR_THREAD_TEST_CASE(test_foreign_reader_as_mutation_source) {
auto remote_reader = smp::submit_to(remote_shard,
[&, s = global_schema_ptr(s), fwd_sm, fwd_mr, trace_state = tracing::global_trace_state_ptr(trace_state)] {
return make_foreign(std::make_unique<flat_mutation_reader>(remote_mt->make_flat_reader(s.get(),
+tests::make_permit(),
range,
slice,
pc,
@@ -1958,7 +1957,9 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_reading_empty_table) {
return;
}
-do_with_cql_env([] (cql_test_env& env) -> future<> {
+test_reader_lifecycle_policy::operations_gate operations_gate;
+do_with_cql_env([&] (cql_test_env& env) -> future<> {
std::vector<std::atomic<bool>> shards_touched(smp::count);
simple_schema s;
auto factory = [&shards_touched] (
@@ -1973,7 +1974,7 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_reading_empty_table) {
};
assert_that(make_multishard_combining_reader(
-seastar::make_shared<test_reader_lifecycle_policy>(std::move(factory)),
+seastar::make_shared<test_reader_lifecycle_policy>(std::move(factory), operations_gate),
s.schema(),
query::full_partition_range,
s.schema()->full_slice(),
@@ -1984,7 +1985,7 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_reading_empty_table) {
BOOST_REQUIRE(shards_touched.at(i));
}
-return make_ready_future<>();
+return operations_gate.close();
}).get();
}
@@ -2179,7 +2180,9 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_destroyed_with_pending
return;
}
-do_with_cql_env([] (cql_test_env& env) -> future<> {
+test_reader_lifecycle_policy::operations_gate operations_gate;
+do_with_cql_env([&] (cql_test_env& env) -> future<> {
auto remote_controls = std::vector<foreign_ptr<std::unique_ptr<puppet_reader::control>>>();
remote_controls.reserve(smp::count);
for (unsigned i = 0; i < smp::count; ++i) {
@@ -2222,7 +2225,7 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_destroyed_with_pending
{
dummy_sharder sharder(s.schema()->get_sharder(), std::move(pkeys_by_tokens));
-auto reader = make_multishard_combining_reader_for_tests(sharder, seastar::make_shared<test_reader_lifecycle_policy>(std::move(factory)),
+auto reader = make_multishard_combining_reader_for_tests(sharder, seastar::make_shared<test_reader_lifecycle_policy>(std::move(factory), operations_gate),
s.schema(), query::full_partition_range, s.schema()->full_slice(), service::get_local_sstable_query_read_priority());
reader.fill_buffer(db::no_timeout).get();
BOOST_REQUIRE(reader.is_buffer_full());
@@ -2244,12 +2247,14 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_destroyed_with_pending
std::logical_and<bool>()).get0();
}));
-return make_ready_future<>();
+return operations_gate.close();
}).get();
}
SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_next_partition) {
-do_with_cql_env([] (cql_test_env& env) -> future<> {
+test_reader_lifecycle_policy::operations_gate operations_gate;
+do_with_cql_env([&] (cql_test_env& env) -> future<> {
env.execute_cql("CREATE KEYSPACE multishard_combining_reader_next_partition_ks"
" WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 1};").get();
env.execute_cql("CREATE TABLE multishard_combining_reader_next_partition_ks.test (pk int, v int, PRIMARY KEY(pk));").get();
@@ -2289,7 +2294,7 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_next_partition) {
auto& table = db->local().find_column_family(schema);
auto reader = table.as_mutation_source().make_reader(
schema,
-no_reader_permit(),
+tests::make_permit(),
range,
slice,
service::get_local_sstable_query_read_priority(),
@@ -2300,7 +2305,7 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_next_partition) {
return reader;
};
auto reader = make_multishard_combining_reader(
-seastar::make_shared<test_reader_lifecycle_policy>(std::move(factory)),
+seastar::make_shared<test_reader_lifecycle_policy>(std::move(factory), operations_gate),
schema,
query::full_partition_range,
schema->full_slice(),
@@ -2320,7 +2325,7 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_next_partition) {
}
assertions.produces_end_of_stream();
-return make_ready_future<>();
+return operations_gate.close();
}).get();
}
@@ -2402,7 +2407,9 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_non_strictly_monotonic
BOOST_REQUIRE(mf.as_clustering_row().key().equal(*s.schema(), ckey));
}
-do_with_cql_env([=, s = std::move(s)] (cql_test_env& env) mutable -> future<> {
+test_reader_lifecycle_policy::operations_gate operations_gate;
+do_with_cql_env([=, &operations_gate, s = std::move(s)] (cql_test_env& env) mutable -> future<> {
auto factory = [=, gs = global_simple_schema(s)] (
schema_ptr,
const dht::partition_range& range,
@@ -2427,14 +2434,14 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_combining_reader_non_strictly_monotonic
BOOST_REQUIRE(mut_opt);
assert_that(make_multishard_combining_reader(
-seastar::make_shared<test_reader_lifecycle_policy>(std::move(factory), true),
+seastar::make_shared<test_reader_lifecycle_policy>(std::move(factory), operations_gate, true),
s.schema(),
query::full_partition_range,
s.schema()->full_slice(),
service::get_local_sstable_query_read_priority()))
.produces_partition(*mut_opt);
-return make_ready_future<>();
+return operations_gate.close();
}).get();
}
@@ -2448,7 +2455,9 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_streaming_reader) {
return;
}
-do_with_cql_env([] (cql_test_env& env) -> future<> {
+test_reader_lifecycle_policy::operations_gate operations_gate;
+do_with_cql_env([&] (cql_test_env& env) -> future<> {
env.execute_cql("CREATE KEYSPACE multishard_streaming_reader_ks WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 1};").get();
env.execute_cql("CREATE TABLE multishard_streaming_reader_ks.test (pk int, v int, PRIMARY KEY(pk));").get();
@@ -2486,11 +2495,11 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_streaming_reader) {
tracing::trace_state_ptr trace_state,
mutation_reader::forwarding fwd_mr) mutable {
auto& table = db->local().find_column_family(s);
-return table.as_mutation_source().make_reader(std::move(s), no_reader_permit(), range, slice, pc, std::move(trace_state),
+return table.as_mutation_source().make_reader(std::move(s), tests::make_permit(), range, slice, pc, std::move(trace_state),
streamed_mutation::forwarding::no, fwd_mr);
};
auto reference_reader = make_filtering_reader(
-make_multishard_combining_reader(seastar::make_shared<test_reader_lifecycle_policy>(std::move(reader_factory)),
+make_multishard_combining_reader(seastar::make_shared<test_reader_lifecycle_policy>(std::move(reader_factory), operations_gate),
schema, partition_range, schema->full_slice(), service::get_local_sstable_query_read_priority()),
[&remote_partitioner] (const dht::decorated_key& pkey) {
return remote_partitioner.shard_of(pkey.token()) == 0;
@@ -2514,7 +2523,7 @@ SEASTAR_THREAD_TEST_CASE(test_multishard_streaming_reader) {
assert_that(tested_muts[i]).is_equal_to(reference_muts[i]);
}
-return make_ready_future<>();
+return operations_gate.close();
}).get();
}
@@ -2669,7 +2678,7 @@ SEASTAR_THREAD_TEST_CASE(test_compacting_reader_as_mutation_source) {
tracing::trace_state_ptr trace_state,
streamed_mutation::forwarding fwd_sm,
mutation_reader::forwarding fwd_mr) mutable {
-auto source = mt->make_flat_reader(s, range, slice, pc, std::move(trace_state), streamed_mutation::forwarding::no, fwd_mr);
+auto source = mt->make_flat_reader(s, tests::make_permit(), range, slice, pc, std::move(trace_state), streamed_mutation::forwarding::no, fwd_mr);
auto mr = make_compacting_reader(std::move(source), query_time, [] (const dht::decorated_key&) { return api::min_timestamp; });
if (single_fragment_buffer) {
mr.set_max_buffer_size(1);

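Several hunks in this file replace `semaphore.wait_admission(...)` with `semaphore.make_permit()` followed by `permit.wait_admission(...)`, accumulating the returned units via `add()`. A synchronous toy model of that admission flow (the real API is future-based and waits under contention; this sketch throws instead of blocking, and all names are illustrative):

```cpp
#include <cassert>
#include <stdexcept>

// Toy model: a permit is minted first, admission is requested through it.
// Each admission takes one count unit plus the requested memory; the
// returned units release everything they accumulated on destruction.
class reader_permit;

class admission_semaphore {
    int _count;
    long _memory;
public:
    admission_semaphore(int count, long memory) : _count(count), _memory(memory) {}
    int available_count() const { return _count; }
    long available_memory() const { return _memory; }
    void consume(long memory) {
        if (_count == 0 || _memory < memory) {
            throw std::runtime_error("admission would block"); // model: fail fast
        }
        --_count;
        _memory -= memory;
    }
    void release(int count, long memory) {
        _count += count;
        _memory += memory;
    }
    reader_permit make_permit();
};

class admission_units {
    admission_semaphore* _sem;
    int _count;
    long _memory;
public:
    admission_units(admission_semaphore& sem, long memory)
        : _sem(&sem), _count(1), _memory(memory) {}
    admission_units(const admission_units&) = delete;
    admission_units(admission_units&& o) noexcept
        : _sem(o._sem), _count(o._count), _memory(o._memory) { o._sem = nullptr; }
    void add(admission_units&& o) { // merge units taken through the same permit
        _count += o._count;
        _memory += o._memory;
        o._sem = nullptr;
    }
    ~admission_units() {
        if (_sem) {
            _sem->release(_count, _memory);
        }
    }
};

class reader_permit {
    admission_semaphore* _sem;
public:
    explicit reader_permit(admission_semaphore& sem) : _sem(&sem) {}
    admission_units wait_admission(long memory) {
        _sem->consume(memory); // admitted: one count unit + the memory
        return admission_units(*_sem, memory);
    }
};

inline reader_permit admission_semaphore::make_permit() { return reader_permit(*this); }
```

Separating permit creation from admission is what allows the database level to create the permit before the read starts, as the cover letter describes, while admission happens later when resources are actually needed.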

@@ -42,6 +42,7 @@
#include "query-result-reader.hh"
#include "partition_slice_builder.hh"
#include "test/lib/tmpdir.hh"
+#include "test/lib/reader_permit.hh"
#include "sstables/compaction_manager.hh"
#include <seastar/testing/test_case.hh>
@@ -91,7 +92,7 @@ static atomic_cell make_collection_member(data_type dt, T value) {
static mutation_partition get_partition(memtable& mt, const partition_key& key) {
auto dk = dht::decorate_key(*mt.schema(), key);
auto range = dht::partition_range::make_singular(dk);
-auto reader = mt.make_flat_reader(mt.schema(), range);
+auto reader = mt.make_flat_reader(mt.schema(), tests::make_permit(), range);
auto mo = read_mutation_from_flat_mutation_reader(reader, db::no_timeout).get0();
BOOST_REQUIRE(bool(mo));
return std::move(mo->partition());
@@ -438,7 +439,7 @@ SEASTAR_THREAD_TEST_CASE(test_large_collection_allocation) {
mt->apply(make_mutation_with_collection(pk, std::move(cmd1)));
mt->apply(make_mutation_with_collection(pk, std::move(cmd2))); // this should trigger a merge of the two collections
-auto rd = mt->make_flat_reader(schema);
+auto rd = mt->make_flat_reader(schema, tests::make_permit());
auto res_mut_opt = read_mutation_from_flat_mutation_reader(rd, db::no_timeout).get0();
BOOST_REQUIRE(res_mut_opt);
@@ -516,7 +517,7 @@ SEASTAR_TEST_CASE(test_multiple_memtables_one_partition) {
auto verify_row = [&] (int32_t c1, int32_t r1) {
auto c_key = clustering_key::from_exploded(*s, {int32_type->decompose(c1)});
auto p_key = dht::decorate_key(*s, key);
-auto r = cf.find_row(cf.schema(), p_key, c_key).get0();
+auto r = cf.find_row(cf.schema(), tests::make_permit(), p_key, c_key).get0();
{
BOOST_REQUIRE(r);
auto i = r->find_cell(r1_col.id);
@@ -575,13 +576,13 @@ SEASTAR_TEST_CASE(test_flush_in_the_middle_of_a_scan) {
std::sort(mutations.begin(), mutations.end(), mutation_decorated_key_less_comparator());
// Flush will happen in the middle of reading for this scanner
-auto assert_that_scanner1 = assert_that(cf.make_reader(s, query::full_partition_range));
+auto assert_that_scanner1 = assert_that(cf.make_reader(s, tests::make_permit(), query::full_partition_range));
// Flush will happen before it is invoked
-auto assert_that_scanner2 = assert_that(cf.make_reader(s, query::full_partition_range));
+auto assert_that_scanner2 = assert_that(cf.make_reader(s, tests::make_permit(), query::full_partition_range));
// Flush will happen after all data was read, but before EOS was consumed
-auto assert_that_scanner3 = assert_that(cf.make_reader(s, query::full_partition_range));
+auto assert_that_scanner3 = assert_that(cf.make_reader(s, tests::make_permit(), query::full_partition_range));
assert_that_scanner1.produces(mutations[0]);
assert_that_scanner1.produces(mutations[1]);
@@ -655,7 +656,7 @@ SEASTAR_TEST_CASE(test_multiple_memtables_multiple_partitions) {
}
return do_with(std::move(result), [&cf, s, &r1_col, shadow] (auto& result) {
-return cf.for_all_partitions_slow(s, [&, s] (const dht::decorated_key& pk, const mutation_partition& mp) {
+return cf.for_all_partitions_slow(s, tests::make_permit(), [&, s] (const dht::decorated_key& pk, const mutation_partition& mp) {
auto p1 = value_cast<int32_t>(int32_type->deserialize(pk._key.explode(*s)[0]));
for (const rows_entry& re : mp.range(*s, nonwrapping_range<clustering_key_prefix>())) {
auto c1 = value_cast<int32_t>(int32_type->deserialize(re.key().explode(*s)[0]));


@@ -113,6 +113,7 @@ private:
Querier make_querier(const dht::partition_range& range) {
return Querier(_mutation_source,
_s.schema(),
+_sem.make_permit(),
range,
_s.schema()->full_slice(),
service::get_local_sstable_query_read_priority(),
@@ -160,7 +161,7 @@ public:
test_querier_cache(const noncopyable_function<sstring(size_t)>& external_make_value, std::chrono::seconds entry_ttl = 24h, size_t cache_size = 100000)
: _sem(reader_concurrency_semaphore::no_limits{})
, _cache(_sem, cache_size, entry_ttl)
, _cache(cache_size, entry_ttl)
, _mutations(make_mutations(_s, external_make_value))
, _mutation_source([this] (schema_ptr, reader_permit, const dht::partition_range& range) {
auto rd = flat_mutation_reader_from_mutations(_mutations, range);
@@ -675,22 +676,22 @@ SEASTAR_THREAD_TEST_CASE(test_resources_based_cache_eviction) {
nullptr,
db::no_timeout).get();
auto& semaphore = cf.read_concurrency_semaphore();
auto& semaphore = db.make_query_class_config().semaphore;
auto permit = semaphore.make_permit();
BOOST_CHECK_EQUAL(db.get_querier_cache_stats().resource_based_evictions, 0);
// Drain all resources of the semaphore
std::vector<reader_permit> permits;
const auto resources = semaphore.available_resources();
permits.reserve(resources.count);
const auto per_permit_memory = resources.memory / resources.count;
const auto per_count_memory = resources.memory / resources.count;
for (int i = 0; i < resources.count; ++i) {
permits.emplace_back(semaphore.wait_admission(per_permit_memory, db::no_timeout).get0());
auto units = permit.wait_admission(per_count_memory, db::no_timeout).get0();
for (int i = 0; i < resources.count - 1; ++i) {
units.add(permit.wait_admission(per_count_memory, db::no_timeout).get0());
}
BOOST_CHECK_EQUAL(semaphore.available_resources().count, 0);
BOOST_CHECK(semaphore.available_resources().memory < per_permit_memory);
BOOST_CHECK(semaphore.available_resources().memory < per_count_memory);
auto cmd2 = query::read_command(s->id(),
s->version(),
@@ -748,12 +749,13 @@ SEASTAR_THREAD_TEST_CASE(test_immediate_evict_on_insert) {
test_querier_cache t;
auto& sem = t.get_semaphore();
auto permit = sem.make_permit();
auto permit1 = sem.consume_resources(reader_concurrency_semaphore::resources(sem.available_resources().count, 0));
auto resources = permit.consume_resources(reader_resources(sem.available_resources().count, 0));
BOOST_CHECK_EQUAL(sem.available_resources().count, 0);
auto permit2_fut = sem.wait_admission(1, db::no_timeout);
auto fut = permit.wait_admission(1, db::no_timeout);
BOOST_CHECK_EQUAL(sem.waiters(), 1);
@@ -763,7 +765,7 @@ SEASTAR_THREAD_TEST_CASE(test_immediate_evict_on_insert) {
.no_drops()
.resource_based_evictions();
permit1.release();
resources.reset();
permit2_fut.get();
fut.get();
}

File diff suppressed because it is too large.


@@ -31,6 +31,7 @@
#include "test/lib/mutation_source_test.hh"
#include "test/lib/flat_mutation_reader_assertions.hh"
#include "test/lib/sstable_utils.hh"
#include "test/lib/reader_permit.hh"
using namespace sstables;
using namespace std::chrono_literals;
@@ -56,7 +57,8 @@ SEASTAR_THREAD_TEST_CASE(test_schema_changes) {
mt->apply(m);
}
created_with_base_schema = env.make_sstable(base, dir.path().string(), gen, version, sstables::sstable::format_types::big);
created_with_base_schema->write_components(mt->make_flat_reader(base), base_mutations.size(), base, test_sstables_manager.configure_writer(), mt->get_encoding_stats()).get();
created_with_base_schema->write_components(mt->make_flat_reader(base, tests::make_permit()), base_mutations.size(), base,
test_sstables_manager.configure_writer(), mt->get_encoding_stats()).get();
created_with_base_schema->load().get();
created_with_changed_schema = env.make_sstable(changed, dir.path().string(), gen, version, sstables::sstable::format_types::big);
@@ -72,14 +74,14 @@ SEASTAR_THREAD_TEST_CASE(test_schema_changes) {
}
auto mr = assert_that(created_with_base_schema->as_mutation_source()
.make_reader(changed, no_reader_permit(), dht::partition_range::make_open_ended_both_sides(), changed->full_slice()));
.make_reader(changed, tests::make_permit(), dht::partition_range::make_open_ended_both_sides(), changed->full_slice()));
for (auto& m : changed_mutations) {
mr.produces(m);
}
mr.produces_end_of_stream();
mr = assert_that(created_with_changed_schema->as_mutation_source()
.make_reader(changed, no_reader_permit(), dht::partition_range::make_open_ended_both_sides(), changed->full_slice()));
.make_reader(changed, tests::make_permit(), dht::partition_range::make_open_ended_both_sides(), changed->full_slice()));
for (auto& m : changed_mutations) {
mr.produces(m);
}


@@ -53,6 +53,7 @@
#include "sstables/mc/writer.hh"
#include "test/lib/simple_schema.hh"
#include "test/lib/exception_utils.hh"
#include "test/lib/reader_permit.hh"
using namespace sstables;
@@ -93,7 +94,7 @@ public:
return sstables::test(_sst).read_indexes();
}
flat_mutation_reader read_rows_flat() {
return _sst->read_rows_flat(_sst->_schema, no_reader_permit());
return _sst->read_rows_flat(_sst->_schema, tests::make_permit());
}
const stats_metadata& get_stats_metadata() const {
@@ -109,7 +110,7 @@ public:
mutation_reader::forwarding fwd_mr = mutation_reader::forwarding::yes,
read_monitor& monitor = default_read_monitor()) {
return _sst->read_range_rows_flat(_sst->_schema,
no_reader_permit(),
tests::make_permit(),
range,
slice,
pc,
@@ -3010,7 +3011,7 @@ static flat_mutation_reader compacted_sstable_reader(test_env& env, schema_ptr s
sstables::compact_sstables(std::move(desc), *cf).get();
auto compacted_sst = open_sstable(env, s, tmp.path().string(), new_generation);
return compacted_sst->as_mutation_source().make_reader(s, no_reader_permit(), query::full_partition_range, s->full_slice());
return compacted_sst->as_mutation_source().make_reader(s, tests::make_permit(), query::full_partition_range, s->full_slice());
}
SEASTAR_THREAD_TEST_CASE(compact_deleted_row) {
@@ -3190,8 +3191,8 @@ static tmpdir write_sstables(test_env& env, schema_ptr s, lw_shared_ptr<memtable
auto sst = env.make_sstable(s, tmp.path().string(), 1, sstables::sstable_version_types::mc, sstable::format_types::big, 4096);
sst->write_components(make_combined_reader(s,
mt1->make_flat_reader(s),
mt2->make_flat_reader(s)), 1, s, test_sstables_manager.configure_writer(), mt1->get_encoding_stats()).get();
mt1->make_flat_reader(s, tests::make_permit()),
mt2->make_flat_reader(s, tests::make_permit())), 1, s, test_sstables_manager.configure_writer(), mt1->get_encoding_stats()).get();
return tmp;
}
@@ -4552,7 +4553,7 @@ static sstring get_read_index_test_path(sstring table_name) {
}
static std::unique_ptr<index_reader> get_index_reader(shared_sstable sst) {
return std::make_unique<index_reader>(sst, no_reader_permit(), default_priority_class(), tracing::trace_state_ptr());
return std::make_unique<index_reader>(sst, tests::make_permit(), default_priority_class(), tracing::trace_state_ptr());
}
shared_sstable make_test_sstable(test_env& env, schema_ptr schema, const sstring& table_name, int64_t gen = 1) {
@@ -5102,11 +5103,11 @@ SEASTAR_THREAD_TEST_CASE(test_sstable_reader_on_unknown_column) {
1 /* generation */,
sstable_version_types::mc,
sstables::sstable::format_types::big);
sst->write_components(mt->make_flat_reader(write_schema), 1, write_schema, cfg, mt->get_encoding_stats()).get();
sst->write_components(mt->make_flat_reader(write_schema, tests::make_permit()), 1, write_schema, cfg, mt->get_encoding_stats()).get();
sst->load().get();
BOOST_REQUIRE_EXCEPTION(
assert_that(sst->read_rows_flat(read_schema, no_reader_permit()))
assert_that(sst->read_rows_flat(read_schema, tests::make_permit()))
.produces_partition_start(dk)
.produces_row(to_ck(0), {{val2_cdef, int32_type->decompose(int32_t(200))}})
.produces_row(to_ck(1), {{val2_cdef, int32_type->decompose(int32_t(201))}})
@@ -5189,7 +5190,7 @@ static void test_sstable_write_large_row_f(schema_ptr s, memtable& mt, const par
// trigger depends on the size of rows after they are written in the MC format and that size
// depends on the encoding statistics (because of variable-length encoding). The original values
// were chosen with the default-constructed encoding_stats, so let's keep it that way.
sst->write_components(mt.make_flat_reader(s), 1, s, test_sstables_manager.configure_writer(), encoding_stats{}).get();
sst->write_components(mt.make_flat_reader(s, tests::make_permit()), 1, s, test_sstables_manager.configure_writer(), encoding_stats{}).get();
BOOST_REQUIRE_EQUAL(i, expected.size());
}
@@ -5239,7 +5240,7 @@ static void test_sstable_log_too_many_rows_f(int rows, uint64_t threshold, bool
auto env = test_env(manager);
tmpdir dir;
auto sst = env.make_sstable(sc, dir.path().string(), 1, sstable_version_types::mc, sstables::sstable::format_types::big);
sst->write_components(mt->make_flat_reader(sc), 1, sc, test_sstables_manager.configure_writer(), encoding_stats{}).get();
sst->write_components(mt->make_flat_reader(sc, tests::make_permit()), 1, sc, test_sstables_manager.configure_writer(), encoding_stats{}).get();
BOOST_REQUIRE_EQUAL(logged, expected);
}


@@ -70,7 +70,7 @@
#include <boost/icl/interval_map.hpp>
#include "test/lib/test_services.hh"
#include "test/lib/cql_test_env.hh"
#include "test/lib/reader_permit.hh"
#include "test/lib/sstable_utils.hh"
namespace fs = std::filesystem;
@@ -834,7 +834,7 @@ SEASTAR_TEST_CASE(datafile_generation_11) {
return write_memtable_to_sstable_for_test(*mt, sst).then([&env, s, sst, mt, verifier, tomb, &static_set_col, tmpdir_path] {
return env.reusable_sst(s, tmpdir_path, 11).then([s, verifier, tomb, &static_set_col] (auto sstp) mutable {
return do_with(make_dkey(s, "key1"), [sstp, s, verifier, tomb, &static_set_col] (auto& key) {
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, verifier, tomb, &static_set_col, rd] (auto mutation) {
auto verify_set = [&tomb] (const collection_mutation_description& m) {
BOOST_REQUIRE(bool(m.tomb) == true);
@@ -862,7 +862,7 @@ SEASTAR_TEST_CASE(datafile_generation_11) {
});
}).then([sstp, s, verifier] {
return do_with(make_dkey(s, "key2"), [sstp, s, verifier] (auto& key) {
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, verifier, rd] (auto mutation) {
auto m = verifier(mutation);
BOOST_REQUIRE(!m.tomb);
@@ -895,7 +895,7 @@ SEASTAR_TEST_CASE(datafile_generation_12) {
return write_memtable_to_sstable_for_test(*mt, sst).then([&env, s, tomb, tmpdir_path] {
return env.reusable_sst(s, tmpdir_path, 12).then([s, tomb] (auto sstp) mutable {
return do_with(make_dkey(s, "key1"), [sstp, s, tomb] (auto& key) {
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, tomb, rd] (auto mutation) {
auto& mp = mutation->partition();
BOOST_REQUIRE(mp.row_tombstones().size() == 1);
@@ -931,7 +931,7 @@ static future<> sstable_compression_test(compressor_ptr c, unsigned generation)
return write_memtable_to_sstable_for_test(*mtp, sst).then([&env, s, tomb, generation, tmpdir_path] {
return env.reusable_sst(s, tmpdir_path, generation).then([s, tomb] (auto sstp) mutable {
return do_with(make_dkey(s, "key1"), [sstp, s, tomb] (auto& key) {
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, tomb, rd] (auto mutation) {
auto& mp = mutation->partition();
BOOST_REQUIRE(mp.row_tombstones().size() == 1);
@@ -1012,19 +1012,19 @@ static future<std::vector<sstables::shared_sstable>> open_sstables(test_env& env
// mutation_reader for sstable keeping all the required objects alive.
static flat_mutation_reader sstable_reader(shared_sstable sst, schema_ptr s) {
return sst->as_mutation_source().make_reader(s, no_reader_permit(), query::full_partition_range, s->full_slice());
return sst->as_mutation_source().make_reader(s, tests::make_permit(), query::full_partition_range, s->full_slice());
}
static flat_mutation_reader sstable_reader(shared_sstable sst, schema_ptr s, const dht::partition_range& pr) {
return sst->as_mutation_source().make_reader(s, no_reader_permit(), pr, s->full_slice());
return sst->as_mutation_source().make_reader(s, tests::make_permit(), pr, s->full_slice());
}
// We don't need to normalize the sstable reader for 'mc' format
// because it is naturally normalized now.
static flat_mutation_reader make_normalizing_sstable_reader(
shared_sstable sst, schema_ptr s, const dht::partition_range& pr) {
auto sstable_reader = sst->as_mutation_source().make_reader(s, no_reader_permit(), pr, s->full_slice());
auto sstable_reader = sst->as_mutation_source().make_reader(s, tests::make_permit(), pr, s->full_slice());
if (sst->get_version() == sstables::sstable::version_types::mc) {
return sstable_reader;
}
@@ -1411,7 +1411,7 @@ SEASTAR_TEST_CASE(datafile_generation_37) {
return write_memtable_to_sstable_for_test(*mtp, sst).then([&env, s, tmpdir_path] {
return env.reusable_sst(s, tmpdir_path, 37).then([s, tmpdir_path] (auto sstp) {
return do_with(make_dkey(s, "key1"), [sstp, s] (auto& key) {
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, rd] (auto mutation) {
auto& mp = mutation->partition();
@@ -1446,7 +1446,7 @@ SEASTAR_TEST_CASE(datafile_generation_38) {
return write_memtable_to_sstable_for_test(*mtp, sst).then([&env, s, tmpdir_path] {
return env.reusable_sst(s, tmpdir_path, 38).then([s] (auto sstp) {
return do_with(make_dkey(s, "key1"), [sstp, s] (auto& key) {
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, rd] (auto mutation) {
auto& mp = mutation->partition();
auto clustering = clustering_key_prefix::from_exploded(*s, {to_bytes("cl1"), to_bytes("cl2")});
@@ -1482,7 +1482,7 @@ SEASTAR_TEST_CASE(datafile_generation_39) {
return write_memtable_to_sstable_for_test(*mtp, sst).then([&env, s, tmpdir_path] {
return env.reusable_sst(s, tmpdir_path, 39).then([s] (auto sstp) {
return do_with(make_dkey(s, "key1"), [sstp, s] (auto& key) {
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, rd] (auto mutation) {
auto& mp = mutation->partition();
auto& row = mp.clustered_row(*s, clustering_key::make_empty());
@@ -1578,7 +1578,7 @@ SEASTAR_TEST_CASE(datafile_generation_41) {
return write_memtable_to_sstable_for_test(*mt, sst).then([&env, s, tomb, tmpdir_path] {
return env.reusable_sst(s, tmpdir_path, 41).then([s, tomb] (auto sstp) mutable {
return do_with(make_dkey(s, "key1"), [sstp, s, tomb] (auto& key) {
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, tomb, rd] (auto mutation) {
auto& mp = mutation->partition();
BOOST_REQUIRE(mp.clustered_rows().calculate_size() == 1);
@@ -2509,7 +2509,7 @@ SEASTAR_TEST_CASE(sstable_rewrite) {
void test_sliced_read_row_presence(shared_sstable sst, schema_ptr s, const query::partition_slice& ps,
std::vector<std::pair<partition_key, std::vector<clustering_key>>> expected)
{
auto reader = sst->as_mutation_source().make_reader(s, no_reader_permit(), query::full_partition_range, ps);
auto reader = sst->as_mutation_source().make_reader(s, tests::make_permit(), query::full_partition_range, ps);
partition_key::equality pk_eq(*s);
clustering_key::equality ck_eq(*s);
@@ -3660,7 +3660,7 @@ SEASTAR_TEST_CASE(test_repeated_tombstone_skipping) {
.with_range(query::clustering_range::make_singular(ck2))
.with_range(query::clustering_range::make_singular(ck3))
.build();
flat_mutation_reader rd = ms.make_reader(table.schema(), no_reader_permit(), query::full_partition_range, slice);
flat_mutation_reader rd = ms.make_reader(table.schema(), tests::make_permit(), query::full_partition_range, slice);
assert_that(std::move(rd)).has_monotonic_positions();
}
}
@@ -3701,7 +3701,7 @@ SEASTAR_TEST_CASE(test_skipping_using_index) {
auto ms = as_mutation_source(sst);
auto rd = ms.make_reader(table.schema(),
no_reader_permit(),
tests::make_permit(),
query::full_partition_range,
table.schema()->full_slice(),
default_priority_class(),
@@ -4220,7 +4220,7 @@ SEASTAR_TEST_CASE(test_summary_entry_spanning_more_keys_than_min_interval) {
std::set<mutation, mutation_decorated_key_less_comparator> merged;
merged.insert(mutations.begin(), mutations.end());
auto rd = assert_that(sst->as_mutation_source().make_reader(s, no_reader_permit(), query::full_partition_range));
auto rd = assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit(), query::full_partition_range));
auto keys_read = 0;
for (auto&& m : merged) {
keys_read++;
@@ -4230,7 +4230,7 @@ SEASTAR_TEST_CASE(test_summary_entry_spanning_more_keys_than_min_interval) {
BOOST_REQUIRE(keys_read == keys_written);
auto r = dht::partition_range::make({mutations.back().decorated_key(), true}, {mutations.back().decorated_key(), true});
assert_that(sst->as_mutation_source().make_reader(s, no_reader_permit(), r))
assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit(), r))
.produces(slice(mutations, r))
.produces_end_of_stream();
});
@@ -4432,7 +4432,7 @@ SEASTAR_TEST_CASE(compaction_correctness_with_partitioned_sstable_set) {
}
static std::unique_ptr<index_reader> get_index_reader(shared_sstable sst) {
return std::make_unique<index_reader>(sst, no_reader_permit(), default_priority_class(), tracing::trace_state_ptr());
return std::make_unique<index_reader>(sst, tests::make_permit(), default_priority_class(), tracing::trace_state_ptr());
}
SEASTAR_TEST_CASE(test_broken_promoted_index_is_skipped) {
@@ -4492,7 +4492,7 @@ SEASTAR_TEST_CASE(test_old_format_non_compound_range_tombstone_is_read) {
{
auto slice = partition_slice_builder(*s).with_range(query::clustering_range::make_singular({ck})).build();
assert_that(sst->as_mutation_source().make_reader(s, no_reader_permit(), dht::partition_range::make_singular(dk), slice))
assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit(), dht::partition_range::make_singular(dk), slice))
.produces(m)
.produces_end_of_stream();
}
@@ -4727,7 +4727,7 @@ SEASTAR_TEST_CASE(sstable_scrub_test) {
BOOST_REQUIRE(table->candidates_for_compaction().front() == sst);
auto verify_fragments = [&] (sstables::shared_sstable sst, const std::vector<mutation_fragment>& mfs) {
auto r = assert_that(sst->as_mutation_source().make_reader(schema));
auto r = assert_that(sst->as_mutation_source().make_reader(schema, tests::make_permit()));
for (const auto& mf : mfs) {
testlog.trace("Expecting {}", mutation_fragment::printer(*schema, mf));
r.produces(*schema, mf);
@@ -5156,7 +5156,7 @@ SEASTAR_TEST_CASE(test_reads_cassandra_static_compact) {
m.set_clustered_cell(clustering_key::make_empty(), *s->get_column_definition("c2"),
atomic_cell::make_live(*utf8_type, 1551785032379079, utf8_type->decompose("cde"), {}));
assert_that(sst->as_mutation_source().make_reader(s, no_reader_permit()))
assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit()))
.produces(m)
.produces_end_of_stream();
});
@@ -5405,7 +5405,7 @@ SEASTAR_TEST_CASE(purged_tombstone_consumer_sstable_test) {
compacting->insert(std::move(sst));
}
auto reader = ::make_range_sstable_reader(s,
no_reader_permit(),
tests::make_permit(),
compacting,
query::full_partition_range,
s->full_slice(),
@@ -5568,8 +5568,6 @@ SEASTAR_TEST_CASE(incremental_compaction_data_resurrection_test) {
cfg.enable_commitlog = false;
cfg.enable_cache = true;
cfg.enable_incremental_backups = false;
reader_concurrency_semaphore sem = reader_concurrency_semaphore(reader_concurrency_semaphore::no_limits{});
cfg.read_concurrency_semaphore = &sem;
auto tracker = make_lw_shared<cache_tracker>();
auto cf = make_lw_shared<column_family>(s, cfg, column_family::no_commitlog(), *cm, cl_stats, *tracker);
cf->mark_ready_for_writes();
@@ -5577,7 +5575,7 @@ SEASTAR_TEST_CASE(incremental_compaction_data_resurrection_test) {
cf->set_compaction_strategy(sstables::compaction_strategy_type::null);
auto is_partition_dead = [&s, &cf] (partition_key& pkey) {
column_family::const_mutation_partition_ptr mp = cf->find_partition_slow(s, pkey).get0();
column_family::const_mutation_partition_ptr mp = cf->find_partition_slow(s, tests::make_permit(), pkey).get0();
return mp && bool(mp->partition_tombstone());
};
@@ -5678,8 +5676,6 @@ SEASTAR_TEST_CASE(twcs_major_compaction_test) {
cfg.enable_commitlog = false;
cfg.enable_cache = false;
cfg.enable_incremental_backups = false;
reader_concurrency_semaphore sem = reader_concurrency_semaphore(reader_concurrency_semaphore::no_limits{});
cfg.read_concurrency_semaphore = &sem;
auto tracker = make_lw_shared<cache_tracker>();
auto cf = make_lw_shared<column_family>(s, cfg, column_family::no_commitlog(), *cm, cl_stats, *tracker);
cf->mark_ready_for_writes();
@@ -5821,8 +5817,6 @@ SEASTAR_TEST_CASE(test_bug_6472) {
cfg.enable_commitlog = false;
cfg.enable_cache = false;
cfg.enable_incremental_backups = false;
reader_concurrency_semaphore sem = reader_concurrency_semaphore(reader_concurrency_semaphore::no_limits{});
cfg.read_concurrency_semaphore = &sem;
auto tracker = make_lw_shared<cache_tracker>();
cell_locker_stats cl_stats;
auto cf = make_lw_shared<column_family>(s, cfg, column_family::no_commitlog(), *cm, cl_stats, *tracker);


@@ -46,6 +46,7 @@
#include "test/lib/data_model.hh"
#include "test/lib/random_utils.hh"
#include "test/lib/log.hh"
#include "test/lib/reader_permit.hh"
using namespace sstables;
using namespace std::chrono_literals;
@@ -56,7 +57,7 @@ SEASTAR_THREAD_TEST_CASE(nonexistent_key) {
env.reusable_sst(uncompressed_schema(), uncompressed_dir(), 1).then([] (auto sstp) {
return do_with(make_dkey(uncompressed_schema(), "invalid_key"), [sstp] (auto& key) {
auto s = uncompressed_schema();
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return (*rd)(db::no_timeout).then([sstp, s, &key, rd] (auto mutation) {
BOOST_REQUIRE(!mutation);
return make_ready_future<>();
@@ -69,7 +70,7 @@ future<> test_no_clustered(sstables::test_env& env, bytes&& key, std::unordered_
return env.reusable_sst(uncompressed_schema(), uncompressed_dir(), 1).then([k = std::move(key), map = std::move(map)] (auto sstp) mutable {
return do_with(make_dkey(uncompressed_schema(), std::move(k)), [sstp, map = std::move(map)] (auto& key) {
auto s = uncompressed_schema();
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, &key, rd, map = std::move(map)] (auto mutation) {
BOOST_REQUIRE(mutation);
auto& mp = mutation->partition();
@@ -144,7 +145,7 @@ future<mutation> generate_clustered(sstables::test_env& env, bytes&& key) {
return env.reusable_sst(complex_schema(), "test/resource/sstables/complex", Generation).then([k = std::move(key)] (auto sstp) mutable {
return do_with(make_dkey(complex_schema(), std::move(k)), [sstp] (auto& key) {
auto s = complex_schema();
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, &key, rd] (auto mutation) {
BOOST_REQUIRE(mutation);
return std::move(*mutation);
@@ -344,7 +345,7 @@ future<> test_range_reads(sstables::test_env& env, const dht::token& min, const
auto stop = make_lw_shared<bool>(false);
return do_with(dht::partition_range::make(dht::ring_position::starting_at(min),
dht::ring_position::ending_at(max)), [&, sstp, s] (auto& pr) {
auto mutations = make_lw_shared<flat_mutation_reader>(sstp->read_range_rows_flat(s, no_reader_permit(), pr));
auto mutations = make_lw_shared<flat_mutation_reader>(sstp->read_range_rows_flat(s, tests::make_permit(), pr));
return do_until([stop] { return *stop; },
// Note: The data in the following lambda, including
// "mutations", continues to live until after the last
@@ -426,7 +427,7 @@ SEASTAR_TEST_CASE(test_sstable_can_write_and_read_range_tombstone) {
sstables::sstable::format_types::big);
write_memtable_to_sstable_for_test(*mt, sst).get();
sst->load().get();
auto mr = sst->read_rows_flat(s, no_reader_permit());
auto mr = sst->read_rows_flat(s, tests::make_permit());
auto mut = read_mutation_from_flat_mutation_reader(mr, db::no_timeout).get0();
BOOST_REQUIRE(bool(mut));
auto& rts = mut->partition().row_tombstones();
@@ -447,7 +448,7 @@ SEASTAR_THREAD_TEST_CASE(compact_storage_sparse_read) {
env.reusable_sst(compact_sparse_schema(), "test/resource/sstables/compact_sparse", 1).then([] (auto sstp) {
return do_with(make_dkey(compact_sparse_schema(), "first_row"), [sstp] (auto& key) {
auto s = compact_sparse_schema();
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, &key, rd] (auto mutation) {
BOOST_REQUIRE(mutation);
auto& mp = mutation->partition();
@@ -466,7 +467,7 @@ SEASTAR_THREAD_TEST_CASE(compact_storage_simple_dense_read) {
env.reusable_sst(compact_simple_dense_schema(), "test/resource/sstables/compact_simple_dense", 1).then([] (auto sstp) {
return do_with(make_dkey(compact_simple_dense_schema(), "first_row"), [sstp] (auto& key) {
auto s = compact_simple_dense_schema();
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, &key, rd] (auto mutation) {
auto& mp = mutation->partition();
@@ -487,7 +488,7 @@ SEASTAR_THREAD_TEST_CASE(compact_storage_dense_read) {
env.reusable_sst(compact_dense_schema(), "test/resource/sstables/compact_dense", 1).then([] (auto sstp) {
return do_with(make_dkey(compact_dense_schema(), "first_row"), [sstp] (auto& key) {
auto s = compact_dense_schema();
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, &key, rd] (auto mutation) {
auto& mp = mutation->partition();
@@ -511,7 +512,7 @@ SEASTAR_THREAD_TEST_CASE(broken_ranges_collection) {
sstables::test_env env;
env.reusable_sst(peers_schema(), "test/resource/sstables/broken_ranges", 2).then([] (auto sstp) {
auto s = peers_schema();
auto reader = make_lw_shared<flat_mutation_reader>(sstp->as_mutation_source().make_reader(s, no_reader_permit(), query::full_partition_range));
auto reader = make_lw_shared<flat_mutation_reader>(sstp->as_mutation_source().make_reader(s, tests::make_permit(), query::full_partition_range));
return repeat([s, reader] {
return read_mutation_from_flat_mutation_reader(*reader, db::no_timeout).then([s, reader] (mutation_opt mut) {
auto key_equal = [s, &mut] (sstring ip) {
@@ -579,7 +580,7 @@ SEASTAR_THREAD_TEST_CASE(tombstone_in_tombstone) {
auto wait_bg = seastar::defer([] { sstables::await_background_jobs().get(); });
ka_sst(tombstone_overlap_schema(), "test/resource/sstables/tombstone_overlap", 1).then([] (auto sstp) {
auto s = tombstone_overlap_schema();
return do_with(sstp->read_rows_flat(s, no_reader_permit()), [sstp, s] (auto& reader) {
return do_with(sstp->read_rows_flat(s, tests::make_permit()), [sstp, s] (auto& reader) {
return repeat([sstp, s, &reader] {
return read_mutation_from_flat_mutation_reader(reader, db::no_timeout).then([s] (mutation_opt mut) {
if (!mut) {
@@ -643,7 +644,7 @@ SEASTAR_THREAD_TEST_CASE(range_tombstone_reading) {
auto wait_bg = seastar::defer([] { sstables::await_background_jobs().get(); });
ka_sst(tombstone_overlap_schema(), "test/resource/sstables/tombstone_overlap", 4).then([] (auto sstp) {
auto s = tombstone_overlap_schema();
return do_with(sstp->read_rows_flat(s, no_reader_permit()), [sstp, s] (auto& reader) {
return do_with(sstp->read_rows_flat(s, tests::make_permit()), [sstp, s] (auto& reader) {
return repeat([sstp, s, &reader] {
return read_mutation_from_flat_mutation_reader(reader, db::no_timeout).then([s] (mutation_opt mut) {
if (!mut) {
@@ -721,7 +722,7 @@ SEASTAR_THREAD_TEST_CASE(tombstone_in_tombstone2) {
auto wait_bg = seastar::defer([] { sstables::await_background_jobs().get(); });
ka_sst(tombstone_overlap_schema2(), "test/resource/sstables/tombstone_overlap", 3).then([] (auto sstp) {
auto s = tombstone_overlap_schema2();
return do_with(sstp->read_rows_flat(s, no_reader_permit()), [sstp, s] (auto& reader) {
return do_with(sstp->read_rows_flat(s, tests::make_permit()), [sstp, s] (auto& reader) {
return repeat([sstp, s, &reader] {
return read_mutation_from_flat_mutation_reader(reader, db::no_timeout).then([s] (mutation_opt mut) {
if (!mut) {
@@ -801,7 +802,7 @@ static schema_ptr buffer_overflow_schema() {
SEASTAR_THREAD_TEST_CASE(buffer_overflow) {
auto s = buffer_overflow_schema();
auto sstp = ka_sst(s, "test/resource/sstables/buffer_overflow", 5).get0();
auto r = sstp->read_rows_flat(s, no_reader_permit());
auto r = sstp->read_rows_flat(s, tests::make_permit());
auto pk1 = partition_key::from_exploded(*s, { int32_type->decompose(4) });
auto dk1 = dht::decorate_key(*s, pk1);
auto pk2 = partition_key::from_exploded(*s, { int32_type->decompose(3) });
@@ -861,7 +862,7 @@ SEASTAR_TEST_CASE(test_non_compound_table_row_is_not_marked_as_static) {
sstables::sstable::format_types::big);
write_memtable_to_sstable_for_test(*mt, sst).get();
sst->load().get();
auto mr = sst->read_rows_flat(s, no_reader_permit());
auto mr = sst->read_rows_flat(s, tests::make_permit());
auto mut = read_mutation_from_flat_mutation_reader(mr, db::no_timeout).get0();
BOOST_REQUIRE(bool(mut));
}
@@ -900,7 +901,7 @@ SEASTAR_TEST_CASE(test_has_partition_key) {
dht::decorated_key dk(dht::decorate_key(*s, k));
auto hk = sstables::sstable::make_hashed_key(*s, dk.key());
sst->load().get();
auto mr = sst->read_rows_flat(s, no_reader_permit());
auto mr = sst->read_rows_flat(s, tests::make_permit());
auto res = sst->has_partition_key(hk, dk).get0();
BOOST_REQUIRE(bool(res));
@@ -913,7 +914,7 @@ SEASTAR_TEST_CASE(test_has_partition_key) {
}
static std::unique_ptr<index_reader> get_index_reader(shared_sstable sst) {
return std::make_unique<index_reader>(sst, no_reader_permit(), default_priority_class(), tracing::trace_state_ptr());
return std::make_unique<index_reader>(sst, tests::make_permit(), default_priority_class(), tracing::trace_state_ptr());
}
SEASTAR_TEST_CASE(test_promoted_index_blocks_are_monotonic) {
@@ -962,7 +963,7 @@ SEASTAR_TEST_CASE(test_promoted_index_blocks_are_monotonic) {
sstables::sstable::format_types::big);
sstable_writer_config cfg = test_sstables_manager.configure_writer();
cfg.promoted_index_block_size = 1;
-sst->write_components(mt->make_flat_reader(s), 1, s, cfg, mt->get_encoding_stats()).get();
+sst->write_components(mt->make_flat_reader(s, tests::make_permit()), 1, s, cfg, mt->get_encoding_stats()).get();
sst->load().get();
assert_that(get_index_reader(sst)).has_monotonic_positions(*s);
});
@@ -1015,7 +1016,7 @@ SEASTAR_TEST_CASE(test_promoted_index_blocks_are_monotonic_compound_dense) {
sstables::sstable::format_types::big);
sstable_writer_config cfg = test_sstables_manager.configure_writer();
cfg.promoted_index_block_size = 1;
-sst->write_components(mt->make_flat_reader(s), 1, s, cfg, mt->get_encoding_stats()).get();
+sst->write_components(mt->make_flat_reader(s, tests::make_permit()), 1, s, cfg, mt->get_encoding_stats()).get();
sst->load().get();
{
@@ -1024,7 +1025,7 @@ SEASTAR_TEST_CASE(test_promoted_index_blocks_are_monotonic_compound_dense) {
{
auto slice = partition_slice_builder(*s).with_range(query::clustering_range::make_starting_with({ck1})).build();
-assert_that(sst->as_mutation_source().make_reader(s, no_reader_permit(), dht::partition_range::make_singular(dk), slice))
+assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit(), dht::partition_range::make_singular(dk), slice))
.produces(m)
.produces_end_of_stream();
}
@@ -1075,7 +1076,7 @@ SEASTAR_TEST_CASE(test_promoted_index_blocks_are_monotonic_non_compound_dense) {
sstables::sstable::format_types::big);
sstable_writer_config cfg = test_sstables_manager.configure_writer();
cfg.promoted_index_block_size = 1;
-sst->write_components(mt->make_flat_reader(s), 1, s, cfg, mt->get_encoding_stats()).get();
+sst->write_components(mt->make_flat_reader(s, tests::make_permit()), 1, s, cfg, mt->get_encoding_stats()).get();
sst->load().get();
{
@@ -1084,7 +1085,7 @@ SEASTAR_TEST_CASE(test_promoted_index_blocks_are_monotonic_non_compound_dense) {
{
auto slice = partition_slice_builder(*s).with_range(query::clustering_range::make_starting_with({ck1})).build();
-assert_that(sst->as_mutation_source().make_reader(s, no_reader_permit(), dht::partition_range::make_singular(dk), slice))
+assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit(), dht::partition_range::make_singular(dk), slice))
.produces(m)
.produces_end_of_stream();
}
@@ -1132,12 +1133,12 @@ SEASTAR_TEST_CASE(test_promoted_index_repeats_open_tombstones) {
sstables::sstable::format_types::big);
sstable_writer_config cfg = test_sstables_manager.configure_writer();
cfg.promoted_index_block_size = 1;
-sst->write_components(mt->make_flat_reader(s), 1, s, cfg, mt->get_encoding_stats()).get();
+sst->write_components(mt->make_flat_reader(s, tests::make_permit()), 1, s, cfg, mt->get_encoding_stats()).get();
sst->load().get();
{
auto slice = partition_slice_builder(*s).with_range(query::clustering_range::make_starting_with({ck})).build();
-assert_that(sst->as_mutation_source().make_reader(s, no_reader_permit(), dht::partition_range::make_singular(dk), slice))
+assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit(), dht::partition_range::make_singular(dk), slice))
.produces(m)
.produces_end_of_stream();
}
@@ -1177,12 +1178,12 @@ SEASTAR_TEST_CASE(test_range_tombstones_are_correctly_seralized_for_non_compound
1 /* generation */,
version,
sstables::sstable::format_types::big);
-sst->write_components(mt->make_flat_reader(s), 1, s, test_sstables_manager.configure_writer(), mt->get_encoding_stats()).get();
+sst->write_components(mt->make_flat_reader(s, tests::make_permit()), 1, s, test_sstables_manager.configure_writer(), mt->get_encoding_stats()).get();
sst->load().get();
{
auto slice = partition_slice_builder(*s).build();
-assert_that(sst->as_mutation_source().make_reader(s, no_reader_permit(), dht::partition_range::make_singular(dk), slice))
+assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit(), dht::partition_range::make_singular(dk), slice))
.produces(m)
.produces_end_of_stream();
}
@@ -1218,7 +1219,7 @@ SEASTAR_TEST_CASE(test_promoted_index_is_absent_for_schemas_without_clustering_k
sstables::sstable::format_types::big);
sstable_writer_config cfg = test_sstables_manager.configure_writer();
cfg.promoted_index_block_size = 1;
-sst->write_components(mt->make_flat_reader(s), 1, s, cfg, mt->get_encoding_stats()).get();
+sst->write_components(mt->make_flat_reader(s, tests::make_permit()), 1, s, cfg, mt->get_encoding_stats()).get();
sst->load().get();
assert_that(get_index_reader(sst)).is_empty(*s);
@@ -1259,12 +1260,12 @@ SEASTAR_TEST_CASE(test_can_write_and_read_non_compound_range_tombstone_as_compou
sstables::sstable::format_types::big);
sstable_writer_config cfg = test_sstables_manager.configure_writer();
cfg.correctly_serialize_non_compound_range_tombstones = false;
-sst->write_components(mt->make_flat_reader(s), 1, s, cfg, mt->get_encoding_stats()).get();
+sst->write_components(mt->make_flat_reader(s, tests::make_permit()), 1, s, cfg, mt->get_encoding_stats()).get();
sst->load().get();
{
auto slice = partition_slice_builder(*s).build();
-assert_that(sst->as_mutation_source().make_reader(s, no_reader_permit(), dht::partition_range::make_singular(dk), slice))
+assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit(), dht::partition_range::make_singular(dk), slice))
.produces(m)
.produces_end_of_stream();
}
@@ -1312,11 +1313,11 @@ SEASTAR_TEST_CASE(test_writing_combined_stream_with_tombstones_at_the_same_posit
version,
sstables::sstable::format_types::big);
sst->write_components(make_combined_reader(s,
-mt1->make_flat_reader(s),
-mt2->make_flat_reader(s)), 1, s, test_sstables_manager.configure_writer(), encoding_stats{}).get();
+mt1->make_flat_reader(s, tests::make_permit()),
+mt2->make_flat_reader(s, tests::make_permit())), 1, s, test_sstables_manager.configure_writer(), encoding_stats{}).get();
sst->load().get();
-assert_that(sst->as_mutation_source().make_reader(s))
+assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit()))
.produces(m1 + m2)
.produces_end_of_stream();
}
@@ -1357,7 +1358,7 @@ SEASTAR_TEST_CASE(test_no_index_reads_when_rows_fall_into_range_boundaries) {
auto before = index_accesses();
{
-assert_that(ms.make_reader(s))
+assert_that(ms.make_reader(s, tests::make_permit()))
.produces(m1)
.produces(m2)
.produces_end_of_stream();
@@ -1497,17 +1498,17 @@ SEASTAR_THREAD_TEST_CASE(test_large_index_pages_do_not_cause_large_allocations)
1 /* generation */,
sstable_version_types::ka,
sstables::sstable::format_types::big);
-sst->write_components(mt->make_flat_reader(s), 1, s, test_sstables_manager.configure_writer(), mt->get_encoding_stats()).get();
+sst->write_components(mt->make_flat_reader(s, tests::make_permit()), 1, s, test_sstables_manager.configure_writer(), mt->get_encoding_stats()).get();
sst->load().get();
auto pr = dht::partition_range::make_singular(small_keys[0]);
-auto mt_reader = mt->make_flat_reader(s, pr);
+auto mt_reader = mt->make_flat_reader(s, tests::make_permit(), pr);
mutation expected = *read_mutation_from_flat_mutation_reader(mt_reader, db::no_timeout).get0();
auto t0 = std::chrono::steady_clock::now();
auto large_allocs_before = memory::stats().large_allocations();
-auto sst_reader = sst->as_mutation_source().make_reader(s, no_reader_permit(), pr);
+auto sst_reader = sst->as_mutation_source().make_reader(s, tests::make_permit(), pr);
mutation actual = *read_mutation_from_flat_mutation_reader(sst_reader, db::no_timeout).get0();
auto large_allocs_after = memory::stats().large_allocations();
auto duration = std::chrono::steady_clock::now() - t0;
@@ -1565,7 +1566,7 @@ SEASTAR_THREAD_TEST_CASE(test_reading_serialization_header) {
// writing parts. Let's use separate objects for writing and reading to ensure that nothing
// carries over that wouldn't normally be read from disk.
auto sst = env.make_sstable(s, dir.path().string(), 1, sstable::version_types::mc, sstables::sstable::format_types::big);
-sst->write_components(mt->make_flat_reader(s), 2, s, test_sstables_manager.configure_writer(), mt->get_encoding_stats()).get();
+sst->write_components(mt->make_flat_reader(s, tests::make_permit()), 2, s, test_sstables_manager.configure_writer(), mt->get_encoding_stats()).get();
}
auto sst = env.make_sstable(s, dir.path().string(), 1, sstable::version_types::mc, sstables::sstable::format_types::big);
@@ -1660,10 +1661,10 @@ SEASTAR_THREAD_TEST_CASE(test_counter_header_size) {
sstables::test_env env;
for (const auto version : all_sstable_versions) {
auto sst = env.make_sstable(s, dir.path().string(), 1, version, sstables::sstable::format_types::big);
-sst->write_components(mt->make_flat_reader(s), 1, s, test_sstables_manager.configure_writer(), mt->get_encoding_stats()).get();
+sst->write_components(mt->make_flat_reader(s, tests::make_permit()), 1, s, test_sstables_manager.configure_writer(), mt->get_encoding_stats()).get();
sst->load().get();
-assert_that(sst->as_mutation_source().make_reader(s))
+assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit()))
.produces(m)
.produces_end_of_stream();
}
@@ -1702,7 +1703,7 @@ SEASTAR_TEST_CASE(test_static_compact_tables_are_read) {
cfg.correctly_serialize_static_compact_in_mc = correctly_serialize;
auto ms = make_sstable_mutation_source(env, s, dir.path().string(), muts, cfg, version);
-assert_that(ms.make_reader(s))
+assert_that(ms.make_reader(s, tests::make_permit()))
.produces(muts[0])
.produces(muts[1])
.produces_end_of_stream();
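Every hunk in this file makes the same mechanical substitution: the `no_reader_permit()` placeholder is replaced with `tests::make_permit()`, so test readers draw a real permit from a shared test-only semaphore instead of bypassing admission entirely. As a rough standalone sketch of that idea (all names and shapes below are illustrative stand-ins; the real `reader_permit` and `reader_concurrency_semaphore` types track memory, admission, and much more state):

```cpp
#include <cassert>

// Illustrative stand-in: a semaphore that only counts outstanding permits.
class reader_concurrency_semaphore {
    int _outstanding = 0;
public:
    void admit() { ++_outstanding; }
    void release() { --_outstanding; }
    int outstanding() const { return _outstanding; }
};

// Illustrative stand-in: a move-only permit that registers itself on
// construction and unregisters itself on destruction.
class reader_permit {
    reader_concurrency_semaphore* _sem;
public:
    explicit reader_permit(reader_concurrency_semaphore& sem) : _sem(&sem) { _sem->admit(); }
    reader_permit(const reader_permit&) = delete;
    reader_permit(reader_permit&& o) noexcept : _sem(o._sem) { o._sem = nullptr; }
    ~reader_permit() { if (_sem) { _sem->release(); } }
};

namespace tests {
// One process-wide semaphore for tests; every reader created in a test is
// accounted against it, unlike the old no-op no_reader_permit().
inline reader_concurrency_semaphore& semaphore() {
    static reader_concurrency_semaphore sem;
    return sem;
}
inline reader_permit make_permit() {
    return reader_permit(semaphore());
}
}
```

Routing every test reader through one helper means the permit plumbing this series adds only has to be provided in a single place (`test/lib/reader_permit.hh`).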

View File

@@ -109,7 +109,7 @@ void run_sstable_resharding_test() {
auto shard = shards.front();
BOOST_REQUIRE(column_family_test::calculate_shard_from_sstable_generation(new_sst->generation()) == shard);
-auto rd = assert_that(new_sst->as_mutation_source().make_reader(s, no_reader_permit()));
+auto rd = assert_that(new_sst->as_mutation_source().make_reader(s, tests::make_permit()));
BOOST_REQUIRE(muts[shard].size() == keys_per_shard);
for (auto k : boost::irange(0u, keys_per_shard)) {
rd.produces(muts[shard][k]);

View File

@@ -340,7 +340,7 @@ public:
int count_row_end = 0;
test_row_consumer(int64_t t)
-: row_consumer(no_reader_permit(), tracing::trace_state_ptr()
+: row_consumer(tests::make_permit(), tracing::trace_state_ptr()
, default_priority_class()), desired_timestamp(t) {
}
@@ -461,7 +461,7 @@ public:
int count_range_tombstone = 0;
count_row_consumer()
-: row_consumer(no_reader_permit(), tracing::trace_state_ptr(), default_priority_class()) {
+: row_consumer(tests::make_permit(), tracing::trace_state_ptr(), default_priority_class()) {
}
virtual proceed consume_row_start(sstables::key_view key, sstables::deletion_time deltime) override {
@@ -839,7 +839,7 @@ SEASTAR_TEST_CASE(wrong_range) {
return test_using_reusable_sst(uncompressed_schema(), "test/resource/sstables/wrongrange", 114, [] (auto sstp) {
return do_with(make_dkey(uncompressed_schema(), "todata"), [sstp] (auto& key) {
auto s = columns_schema();
-auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, no_reader_permit(), key));
+auto rd = make_lw_shared<flat_mutation_reader>(sstp->read_row_flat(s, tests::make_permit(), key));
return read_mutation_from_flat_mutation_reader(*rd, db::no_timeout).then([sstp, s, &key, rd] (auto mutation) {
return make_ready_future<>();
});
@@ -975,7 +975,7 @@ static future<int> count_rows(sstable_ptr sstp, schema_ptr s, sstring key, sstri
return seastar::async([sstp, s, key, ck1, ck2] () mutable {
auto ps = make_partition_slice(*s, ck1, ck2);
auto dkey = make_dkey(s, key.c_str());
-auto rd = sstp->read_row_flat(s, no_reader_permit(), dkey, ps);
+auto rd = sstp->read_row_flat(s, tests::make_permit(), dkey, ps);
auto mfopt = rd(db::no_timeout).get0();
if (!mfopt) {
return 0;
@@ -996,7 +996,7 @@ static future<int> count_rows(sstable_ptr sstp, schema_ptr s, sstring key, sstri
static future<int> count_rows(sstable_ptr sstp, schema_ptr s, sstring key) {
return seastar::async([sstp, s, key] () mutable {
auto dkey = make_dkey(s, key.c_str());
-auto rd = sstp->read_row_flat(s, no_reader_permit(), dkey);
+auto rd = sstp->read_row_flat(s, tests::make_permit(), dkey);
auto mfopt = rd(db::no_timeout).get0();
if (!mfopt) {
return 0;
@@ -1018,7 +1018,7 @@ static future<int> count_rows(sstable_ptr sstp, schema_ptr s, sstring key) {
static future<int> count_rows(sstable_ptr sstp, schema_ptr s, sstring ck1, sstring ck2) {
return seastar::async([sstp, s, ck1, ck2] () mutable {
auto ps = make_partition_slice(*s, ck1, ck2);
-auto reader = sstp->read_range_rows_flat(s, no_reader_permit(), query::full_partition_range, ps);
+auto reader = sstp->read_range_rows_flat(s, tests::make_permit(), query::full_partition_range, ps);
int nrows = 0;
auto mfopt = reader(db::no_timeout).get0();
while (mfopt) {

View File

@@ -45,6 +45,7 @@
#include "db/batchlog_manager.hh"
#include "schema_builder.hh"
#include "test/lib/tmpdir.hh"
+#include "test/lib/reader_permit.hh"
#include "db/query_context.hh"
#include "test/lib/test_services.hh"
#include "db/view/view_builder.hh"
@@ -277,7 +278,7 @@ public:
table_name = std::move(table_name)] (database& db) mutable {
auto& cf = db.find_column_family(ks_name, table_name);
auto schema = cf.schema();
-return cf.find_partition_slow(schema, pkey)
+return cf.find_partition_slow(schema, tests::make_permit(), pkey)
.then([schema, ckey, column_name, exp] (column_family::const_mutation_partition_ptr p) {
assert(p != nullptr);
auto row = p->find_row(*schema, ckey);

View File

@@ -24,6 +24,7 @@
#include "mutation_reader.hh"
#include "memtable.hh"
#include "utils/phased_barrier.hh"
+#include "test/lib/reader_permit.hh"
#include <seastar/core/circular_buffer.hh>
#include <seastar/core/thread.hh>
#include <seastar/core/condition-variable.hh>
@@ -66,6 +67,7 @@ private:
std::vector<flat_mutation_reader> readers;
for (auto&& mt : _memtables) {
readers.push_back(mt->make_flat_reader(new_mt->schema(),
+tests::make_permit(),
query::full_partition_range,
new_mt->schema()->full_slice(),
default_priority_class(),
@@ -121,7 +123,7 @@ public:
void apply(memtable& mt) {
auto op = _apply.start();
auto new_mt = new_memtable();
-new_mt->apply(mt).get();
+new_mt->apply(mt, tests::make_permit()).get();
_memtables.push_back(new_mt);
}
// mt must not change from now on.

View File

@@ -35,6 +35,7 @@
#include "test/lib/make_random_string.hh"
#include "test/lib/data_model.hh"
#include "test/lib/log.hh"
+#include "test/lib/reader_permit.hh"
#include <boost/algorithm/string/join.hpp>
#include "types/user.hh"
#include "types/map.hh"
@@ -176,7 +177,7 @@ static void test_slicing_and_fast_forwarding(populate_fn_ex populate) {
auto test_common = [&] (const query::partition_slice& slice) {
testlog.info("Read whole partitions at once");
auto pranges_walker = partition_range_walker(pranges);
-auto mr = ms.make_reader(s.schema(), no_reader_permit(), pranges_walker.initial_range(), slice,
+auto mr = ms.make_reader(s.schema(), tests::make_permit(), pranges_walker.initial_range(), slice,
default_priority_class(), nullptr, streamed_mutation::forwarding::no, fwd_mr);
auto actual = assert_that(std::move(mr));
for (auto& expected : mutations) {
@@ -202,7 +203,7 @@ static void test_slicing_and_fast_forwarding(populate_fn_ex populate) {
testlog.info("Read partitions with fast-forwarding to each individual row");
pranges_walker = partition_range_walker(pranges);
-mr = ms.make_reader(s.schema(), no_reader_permit(), pranges_walker.initial_range(), slice,
+mr = ms.make_reader(s.schema(), tests::make_permit(), pranges_walker.initial_range(), slice,
default_priority_class(), nullptr, streamed_mutation::forwarding::yes, fwd_mr);
actual = assert_that(std::move(mr));
for (auto& expected : mutations) {
@@ -238,14 +239,14 @@ static void test_slicing_and_fast_forwarding(populate_fn_ex populate) {
test_common(slice);
testlog.info("Test monotonic positions");
-auto mr = ms.make_reader(s.schema(), no_reader_permit(), query::full_partition_range, slice,
+auto mr = ms.make_reader(s.schema(), tests::make_permit(), query::full_partition_range, slice,
default_priority_class(), nullptr, streamed_mutation::forwarding::no, fwd_mr);
assert_that(std::move(mr)).has_monotonic_positions();
if (range_size != 1) {
testlog.info("Read partitions fast-forwarded to the range of interest");
auto pranges_walker = partition_range_walker(pranges);
-mr = ms.make_reader(s.schema(), no_reader_permit(), pranges_walker.initial_range(), slice,
+mr = ms.make_reader(s.schema(), tests::make_permit(), pranges_walker.initial_range(), slice,
default_priority_class(), nullptr, streamed_mutation::forwarding::yes, fwd_mr);
auto actual = assert_that(std::move(mr));
for (auto& expected : mutations) {
@@ -285,7 +286,7 @@ static void test_slicing_and_fast_forwarding(populate_fn_ex populate) {
testlog.info("Read partitions with just static rows");
auto pranges_walker = partition_range_walker(pranges);
-mr = ms.make_reader(s.schema(), no_reader_permit(), pranges_walker.initial_range(), slice,
+mr = ms.make_reader(s.schema(), tests::make_permit(), pranges_walker.initial_range(), slice,
default_priority_class(), nullptr, streamed_mutation::forwarding::no, fwd_mr);
auto actual = assert_that(std::move(mr));
for (auto& expected : mutations) {
@@ -312,7 +313,7 @@ static void test_slicing_and_fast_forwarding(populate_fn_ex populate) {
test_common(slice);
testlog.info("Test monotonic positions");
-auto mr = ms.make_reader(s.schema(), no_reader_permit(), query::full_partition_range, slice,
+auto mr = ms.make_reader(s.schema(), tests::make_permit(), query::full_partition_range, slice,
default_priority_class(), nullptr, streamed_mutation::forwarding::no, fwd_mr);
assert_that(std::move(mr)).has_monotonic_positions();
}
@@ -385,10 +386,10 @@ static void test_streamed_mutation_forwarding_is_consistent_with_slicing(populat
mutation_source ms = populate(m.schema(), {m}, gc_clock::now());
flat_mutation_reader sliced_reader =
-ms.make_reader(m.schema(), no_reader_permit(), prange, slice_with_ranges);
+ms.make_reader(m.schema(), tests::make_permit(), prange, slice_with_ranges);
flat_mutation_reader fwd_reader =
-ms.make_reader(m.schema(), no_reader_permit(), prange, full_slice, default_priority_class(), nullptr, streamed_mutation::forwarding::yes);
+ms.make_reader(m.schema(), tests::make_permit(), prange, full_slice, default_priority_class(), nullptr, streamed_mutation::forwarding::yes);
std::optional<mutation_rebuilder> builder{};
struct consumer {
@@ -476,7 +477,7 @@ static void test_streamed_mutation_forwarding_guarantees(populate_fn_ex populate
auto new_stream = [&ms, s, &m] () -> flat_reader_assertions {
testlog.info("Creating new streamed_mutation");
auto res = assert_that(ms.make_reader(s,
-no_reader_permit(),
+tests::make_permit(),
query::full_partition_range,
s->full_slice(),
default_priority_class(),
@@ -612,7 +613,7 @@ static void test_fast_forwarding_across_partitions_to_empty_range(populate_fn_ex
auto pr = dht::partition_range::make({keys[0]}, {keys[1]});
auto rd = assert_that(ms.make_reader(s,
-no_reader_permit(),
+tests::make_permit(),
pr,
s->full_slice(),
default_priority_class(),
@@ -714,7 +715,7 @@ static void test_streamed_mutation_slicing_returns_only_relevant_tombstones(popu
))
.build();
-auto rd = assert_that(ms.make_reader(s, no_reader_permit(), pr, slice));
+auto rd = assert_that(ms.make_reader(s, tests::make_permit(), pr, slice));
rd.produces_partition_start(m.decorated_key());
rd.produces_row_with_key(keys[2]);
@@ -733,7 +734,7 @@ static void test_streamed_mutation_slicing_returns_only_relevant_tombstones(popu
))
.build();
-auto rd = assert_that(ms.make_reader(s, no_reader_permit(), pr, slice));
+auto rd = assert_that(ms.make_reader(s, tests::make_permit(), pr, slice));
rd.produces_partition_start(m.decorated_key())
.produces_range_tombstone(rt3, slice.row_ranges(*s, m.key()))
@@ -788,7 +789,7 @@ static void test_streamed_mutation_forwarding_across_range_tombstones(populate_f
mutation_source ms = populate(s, std::vector<mutation>({m}), gc_clock::now());
auto rd = assert_that(ms.make_reader(s,
-no_reader_permit(),
+tests::make_permit(),
query::full_partition_range,
s->full_slice(),
default_priority_class(),
@@ -872,7 +873,7 @@ static void test_range_queries(populate_fn_ex populate) {
auto test_slice = [&] (dht::partition_range r) {
testlog.info("Testing range {}", r);
-assert_that(ds.make_reader(s, no_reader_permit(), r))
+assert_that(ds.make_reader(s, tests::make_permit(), r))
.produces(slice(partitions, r))
.produces_end_of_stream();
};
@@ -976,7 +977,7 @@ void test_all_data_is_read_back(populate_fn_ex populate) {
auto ms = populate(m.schema(), {m}, query_time);
mutation copy(m);
copy.partition().compact_for_compaction(*copy.schema(), always_gc, query_time);
-assert_that(ms.make_reader(m.schema())).produces_compacted(copy, query_time);
+assert_that(ms.make_reader(m.schema(), tests::make_permit())).produces_compacted(copy, query_time);
});
}
@@ -1013,7 +1014,7 @@ static void test_date_tiered_clustering_slicing(populate_fn_ex populate) {
.with_range(ss.make_ckey_range(1, 2))
.build();
auto prange = dht::partition_range::make_singular(pkey);
-assert_that(ms.make_reader(s, no_reader_permit(), prange, slice))
+assert_that(ms.make_reader(s, tests::make_permit(), prange, slice))
.produces(m1, slice.row_ranges(*s, pkey.key()))
.produces_end_of_stream();
}
@@ -1023,7 +1024,7 @@ static void test_date_tiered_clustering_slicing(populate_fn_ex populate) {
.with_range(query::clustering_range::make_singular(ss.make_ckey(0)))
.build();
auto prange = dht::partition_range::make_singular(pkey);
-assert_that(ms.make_reader(s, no_reader_permit(), prange, slice))
+assert_that(ms.make_reader(s, tests::make_permit(), prange, slice))
.produces(m1)
.produces_end_of_stream();
}
@@ -1110,14 +1111,14 @@ static void test_clustering_slices(populate_fn_ex populate) {
auto slice = partition_slice_builder(*s)
.with_range(query::clustering_range::make_singular(make_ck(0)))
.build();
-assert_that(ds.make_reader(s, no_reader_permit(), pr, slice))
+assert_that(ds.make_reader(s, tests::make_permit(), pr, slice))
.produces_eos_or_empty_mutation();
}
{
auto slice = partition_slice_builder(*s)
.build();
-auto rd = assert_that(ds.make_reader(s, no_reader_permit(), pr, slice, default_priority_class(), nullptr, streamed_mutation::forwarding::yes));
+auto rd = assert_that(ds.make_reader(s, tests::make_permit(), pr, slice, default_priority_class(), nullptr, streamed_mutation::forwarding::yes));
rd.produces_partition_start(pk)
.fast_forward_to(position_range(position_in_partition::for_key(ck1), position_in_partition::after_key(ck2)))
.produces_row_with_key(ck1)
@@ -1128,7 +1129,7 @@ static void test_clustering_slices(populate_fn_ex populate) {
{
auto slice = partition_slice_builder(*s)
.build();
-auto rd = assert_that(ds.make_reader(s, no_reader_permit(), pr, slice, default_priority_class(), nullptr, streamed_mutation::forwarding::yes));
+auto rd = assert_that(ds.make_reader(s, tests::make_permit(), pr, slice, default_priority_class(), nullptr, streamed_mutation::forwarding::yes));
rd.produces_partition_start(pk)
.produces_end_of_stream()
.fast_forward_to(position_range(position_in_partition::for_key(ck1), position_in_partition::after_key(ck2)))
@@ -1140,7 +1141,7 @@ static void test_clustering_slices(populate_fn_ex populate) {
auto slice = partition_slice_builder(*s)
.with_range(query::clustering_range::make_singular(make_ck(1)))
.build();
-assert_that(ds.make_reader(s, no_reader_permit(), pr, slice))
+assert_that(ds.make_reader(s, tests::make_permit(), pr, slice))
.produces(row1 + row2 + row3 + row4 + row5 + del_1, slice.row_ranges(*s, pk.key()))
.produces_end_of_stream();
}
@@ -1148,7 +1149,7 @@ static void test_clustering_slices(populate_fn_ex populate) {
auto slice = partition_slice_builder(*s)
.with_range(query::clustering_range::make_singular(make_ck(2)))
.build();
-assert_that(ds.make_reader(s, no_reader_permit(), pr, slice))
+assert_that(ds.make_reader(s, tests::make_permit(), pr, slice))
.produces(row6 + row7 + del_1 + del_2, slice.row_ranges(*s, pk.key()))
.produces_end_of_stream();
}
@@ -1157,7 +1158,7 @@ static void test_clustering_slices(populate_fn_ex populate) {
auto slice = partition_slice_builder(*s)
.with_range(query::clustering_range::make_singular(make_ck(1, 2)))
.build();
-assert_that(ds.make_reader(s, no_reader_permit(), pr, slice))
+assert_that(ds.make_reader(s, tests::make_permit(), pr, slice))
.produces(row3 + row4 + del_1, slice.row_ranges(*s, pk.key()))
.produces_end_of_stream();
}
@@ -1166,7 +1167,7 @@ static void test_clustering_slices(populate_fn_ex populate) {
auto slice = partition_slice_builder(*s)
.with_range(query::clustering_range::make_singular(make_ck(3)))
.build();
-assert_that(ds.make_reader(s, no_reader_permit(), pr, slice))
+assert_that(ds.make_reader(s, tests::make_permit(), pr, slice))
.produces(row8 + del_3, slice.row_ranges(*s, pk.key()))
.produces_end_of_stream();
}
@@ -1174,12 +1175,12 @@ static void test_clustering_slices(populate_fn_ex populate) {
// Test out-of-range partition keys
{
auto pr = dht::partition_range::make_singular(keys[0]);
-assert_that(ds.make_reader(s, no_reader_permit(), pr, s->full_slice()))
+assert_that(ds.make_reader(s, tests::make_permit(), pr, s->full_slice()))
.produces_eos_or_empty_mutation();
}
{
auto pr = dht::partition_range::make_singular(keys[2]);
-assert_that(ds.make_reader(s, no_reader_permit(), pr, s->full_slice()))
+assert_that(ds.make_reader(s, tests::make_permit(), pr, s->full_slice()))
.produces_eos_or_empty_mutation();
}
}
@@ -1202,7 +1203,7 @@ static void test_query_only_static_row(populate_fn_ex populate) {
// fully populate cache
{
auto prange = dht::partition_range::make_ending_with(dht::ring_position(m1.decorated_key()));
-assert_that(ms.make_reader(s.schema(), no_reader_permit(), prange, s.schema()->full_slice()))
+assert_that(ms.make_reader(s.schema(), tests::make_permit(), prange, s.schema()->full_slice()))
.produces(m1)
.produces_end_of_stream();
}
@@ -1213,7 +1214,7 @@ static void test_query_only_static_row(populate_fn_ex populate) {
.with_ranges({})
.build();
auto prange = dht::partition_range::make_ending_with(dht::ring_position(m1.decorated_key()));
-assert_that(ms.make_reader(s.schema(), no_reader_permit(), prange, slice))
+assert_that(ms.make_reader(s.schema(), tests::make_permit(), prange, slice))
.produces(m1, slice.row_ranges(*s.schema(), m1.key()))
.produces_end_of_stream();
}
@@ -1224,7 +1225,7 @@ static void test_query_only_static_row(populate_fn_ex populate) {
.with_ranges({})
.build();
auto prange = dht::partition_range::make_singular(m1.decorated_key());
-assert_that(ms.make_reader(s.schema(), no_reader_permit(), prange, slice))
+assert_that(ms.make_reader(s.schema(), tests::make_permit(), prange, slice))
.produces(m1, slice.row_ranges(*s.schema(), m1.key()))
.produces_end_of_stream();
}
@@ -1246,7 +1247,7 @@ static void test_query_no_clustering_ranges_no_static_columns(populate_fn_ex pop
{
auto prange = dht::partition_range::make_ending_with(dht::ring_position(m1.decorated_key()));
-assert_that(ms.make_reader(s.schema(), no_reader_permit(), prange, s.schema()->full_slice()))
+assert_that(ms.make_reader(s.schema(), tests::make_permit(), prange, s.schema()->full_slice()))
.produces(m1)
.produces_end_of_stream();
}
@@ -1257,7 +1258,7 @@ static void test_query_no_clustering_ranges_no_static_columns(populate_fn_ex pop
.with_ranges({})
.build();
auto prange = dht::partition_range::make_ending_with(dht::ring_position(m1.decorated_key()));
-assert_that(ms.make_reader(s.schema(), no_reader_permit(), prange, slice))
+assert_that(ms.make_reader(s.schema(), tests::make_permit(), prange, slice))
.produces(m1, slice.row_ranges(*s.schema(), m1.key()))
.produces_end_of_stream();
}
@@ -1268,7 +1269,7 @@ static void test_query_no_clustering_ranges_no_static_columns(populate_fn_ex pop
.with_ranges({})
.build();
auto prange = dht::partition_range::make_singular(m1.decorated_key());
-assert_that(ms.make_reader(s.schema(), no_reader_permit(), prange, slice))
+assert_that(ms.make_reader(s.schema(), tests::make_permit(), prange, slice))
.produces(m1, slice.row_ranges(*s.schema(), m1.key()))
.produces_end_of_stream();
}
@@ -1284,7 +1285,7 @@ void test_streamed_mutation_forwarding_succeeds_with_no_data(populate_fn_ex popu
auto source = populate(s.schema(), {m}, gc_clock::now());
assert_that(source.make_reader(s.schema(),
-no_reader_permit(),
+tests::make_permit(),
query::full_partition_range,
s.schema()->full_slice(),
default_priority_class(),
@@ -1333,7 +1334,7 @@ void test_slicing_with_overlapping_range_tombstones(populate_fn_ex populate) {
{
auto slice = partition_slice_builder(*s).with_range(range).build();
-auto rd = ds.make_reader(s, no_reader_permit(), query::full_partition_range, slice);
+auto rd = ds.make_reader(s, tests::make_permit(), query::full_partition_range, slice);
auto prange = position_range(range);
mutation result(m1.schema(), m1.decorated_key());
@@ -1351,7 +1352,7 @@ void test_slicing_with_overlapping_range_tombstones(populate_fn_ex populate) {
// Check fast_forward_to()
{
-auto rd = ds.make_reader(s, no_reader_permit(), query::full_partition_range, s->full_slice(), default_priority_class(),
+auto rd = ds.make_reader(s, tests::make_permit(), query::full_partition_range, s->full_slice(), default_priority_class(),
nullptr, streamed_mutation::forwarding::yes);
auto prange = position_range(range);
@@ -1413,7 +1414,7 @@ void test_next_partition(populate_fn_ex populate) {
mutations.push_back(std::move(m));
}
auto source = populate(s.schema(), mutations, gc_clock::now());
-assert_that(source.make_reader(s.schema()))
+assert_that(source.make_reader(s.schema(), tests::make_permit()))
.next_partition() // Does nothing before first partition
.produces_partition_start(pkeys[0])
.produces_static_row()

View File

@@ -26,6 +26,55 @@
class test_reader_lifecycle_policy
: public reader_lifecycle_policy
, public enable_shared_from_this<test_reader_lifecycle_policy> {
+public:
+class operations_gate {
+public:
+class operation {
+gate* _g = nullptr;
+private:
+void leave() {
+if (_g) {
+_g->leave();
+}
+}
+public:
+operation() = default;
+explicit operation(gate& g) : _g(&g) { _g->enter(); }
+operation(const operation&) = delete;
+operation(operation&& o) : _g(std::exchange(o._g, nullptr)) { }
+~operation() { leave(); }
+operation& operator=(const operation&) = delete;
+operation& operator=(operation&& o) {
+leave();
+_g = std::exchange(o._g, nullptr);
+return *this;
+}
+};
+private:
+std::vector<gate> _gates;
+public:
+operations_gate()
+: _gates(smp::count) {
+}
+operation enter() {
+return operation(_gates[this_shard_id()]);
+}
+future<> close() {
+return parallel_for_each(boost::irange(smp::count), [this] (shard_id shard) {
+return smp::submit_to(shard, [this, shard] {
+return _gates[shard].close();
+});
+});
+}
+};
+private:
using factory_function = std::function<flat_mutation_reader(
schema_ptr,
const dht::partition_range&,
@@ -34,22 +83,28 @@ class test_reader_lifecycle_policy
tracing::trace_state_ptr,
mutation_reader::forwarding)>;
struct reader_params {
const dht::partition_range range;
const query::partition_slice slice;
};
struct reader_context {
-foreign_ptr<std::unique_ptr<reader_concurrency_semaphore>> semaphore;
-foreign_ptr<std::unique_ptr<const reader_params>> params;
+std::unique_ptr<reader_concurrency_semaphore> semaphore;
+operations_gate::operation op;
+std::optional<reader_permit> permit;
+std::optional<future<reader_permit::resource_units>> wait_future;
+std::optional<const dht::partition_range> range;
+std::optional<const query::partition_slice> slice;
+reader_context(dht::partition_range range, query::partition_slice slice) : range(std::move(range)), slice(std::move(slice)) {
+}
};
factory_function _factory_function;
-std::vector<reader_context> _contexts;
+operations_gate& _operation_gate;
+std::vector<foreign_ptr<std::unique_ptr<reader_context>>> _contexts;
std::vector<future<>> _destroy_futures;
bool _evict_paused_readers = false;
public:
-explicit test_reader_lifecycle_policy(factory_function f, bool evict_paused_readers = false)
+explicit test_reader_lifecycle_policy(factory_function f, operations_gate& g, bool evict_paused_readers = false)
: _factory_function(std::move(f))
, _operation_gate(g)
, _contexts(smp::count)
, _evict_paused_readers(evict_paused_readers) {
}
@@ -61,33 +116,47 @@ public:
tracing::trace_state_ptr trace_state,
mutation_reader::forwarding fwd_mr) override {
const auto shard = this_shard_id();
_contexts[shard].params = make_foreign(std::make_unique<const reader_params>(reader_params{range, slice}));
return _factory_function(std::move(schema), _contexts[shard].params->range, _contexts[shard].params->slice, pc,
std::move(trace_state), fwd_mr);
if (_contexts[shard]) {
_contexts[shard]->range.emplace(range);
_contexts[shard]->slice.emplace(slice);
} else {
_contexts[shard] = make_foreign(std::make_unique<reader_context>(range, slice));
}
_contexts[shard]->op = _operation_gate.enter();
return _factory_function(std::move(schema), *_contexts[shard]->range, *_contexts[shard]->slice, pc, std::move(trace_state), fwd_mr);
}
virtual void destroy_reader(shard_id shard, future<stopped_reader> reader) noexcept override {
// Move to the background.
// Move to the background, waited via _operation_gate
(void)reader.then([shard, this] (stopped_reader&& reader) {
return smp::submit_to(shard, [handle = std::move(reader.handle), ctx = std::move(_contexts[shard])] () mutable {
ctx.semaphore->unregister_inactive_read(std::move(*handle));
ctx->semaphore->unregister_inactive_read(std::move(*handle));
ctx->semaphore->broken(std::make_exception_ptr(broken_semaphore{}));
if (ctx->wait_future) {
return ctx->wait_future->then_wrapped([ctx = std::move(ctx)] (future<reader_permit::resource_units> f) mutable {
f.ignore_ready_future();
ctx->permit.reset(); // make sure it's destroyed before the semaphore
});
}
return make_ready_future<>();
});
}).finally([zis = shared_from_this()] {});
}
virtual reader_concurrency_semaphore& semaphore() override {
const auto shard = this_shard_id();
if (!_contexts[shard].semaphore) {
if (!_contexts[shard]->semaphore) {
if (_evict_paused_readers) {
_contexts[shard].semaphore = make_foreign(std::make_unique<reader_concurrency_semaphore>(0, std::numeric_limits<ssize_t>::max(),
format("reader_concurrency_semaphore @shard_id={}", shard)));
// Add a waiter, so that all registered inactive reads are
// immediately evicted.
// We don't care about the returned future.
(void)_contexts[shard].semaphore->wait_admission(1, db::no_timeout);
_contexts[shard]->semaphore = std::make_unique<reader_concurrency_semaphore>(0, std::numeric_limits<ssize_t>::max(),
format("reader_concurrency_semaphore @shard_id={}", shard));
_contexts[shard]->permit = _contexts[shard]->semaphore->make_permit();
// Add a waiter, so that all registered inactive reads are
// immediately evicted.
// We don't care about the returned future.
_contexts[shard]->wait_future = _contexts[shard]->permit->wait_admission(1, db::no_timeout);
} else {
_contexts[shard].semaphore = make_foreign(std::make_unique<reader_concurrency_semaphore>(reader_concurrency_semaphore::no_limits{}));
_contexts[shard]->semaphore = std::make_unique<reader_concurrency_semaphore>(reader_concurrency_semaphore::no_limits{});
}
}
return *_contexts[shard].semaphore;
return *_contexts[shard]->semaphore;
}
};
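The `operations_gate` used above keeps one gate per shard so that background reader-destruction work started in `destroy_reader()` can be waited on when the test shuts down. A minimal single-threaded sketch of the same RAII idea (hypothetical `gate`/`operation` stand-ins with no Seastar dependency, not Scylla's actual classes):

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Simplified stand-in for a per-shard gate: counts live operations and
// refuses new entries once closed. The real gate's close() also waits
// asynchronously until the count drops to zero.
class gate {
    int _count = 0;
    bool _closed = false;
public:
    void enter() {
        if (_closed) {
            throw std::runtime_error("gate closed");
        }
        ++_count;
    }
    void leave() { --_count; }
    void close() { _closed = true; }
    int count() const { return _count; }
};

// RAII handle mirroring operations_gate::operation: movable, not copyable,
// and leaves the gate exactly once even after a move.
class operation {
    gate* _g;
public:
    explicit operation(gate& g) : _g(&g) { _g->enter(); }
    operation(const operation&) = delete;
    operation(operation&& o) noexcept : _g(std::exchange(o._g, nullptr)) { }
    ~operation() {
        if (_g) {
            _g->leave();
        }
    }
};
```

The move constructor nulls out the source's pointer, which is why the destructor must check `_g` before leaving — the same pattern the patch's `operation(operation&& o) : _g(std::exchange(o._g, nullptr))` relies on.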

test/lib/reader_permit.cc (new file, 40 lines)

@@ -0,0 +1,40 @@
/*
* Copyright (C) 2020 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "test/lib/reader_permit.hh"
namespace tests {
thread_local reader_concurrency_semaphore the_semaphore{reader_concurrency_semaphore::no_limits{}};
reader_concurrency_semaphore& semaphore() {
return the_semaphore;
}
reader_permit make_permit() {
return the_semaphore.make_permit();
}
query_class_config make_query_class_config() {
return query_class_config{the_semaphore, std::numeric_limits<uint64_t>::max()};
}
} // namespace tests

test/lib/reader_permit.hh (new file, 35 lines)

@@ -0,0 +1,35 @@
/*
* Copyright (C) 2020 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "reader_concurrency_semaphore.hh"
#include "query_class_config.hh"
namespace tests {
reader_concurrency_semaphore& semaphore();
reader_permit make_permit();
query_class_config make_query_class_config();
} // namespace tests
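The helpers above give tests a shared thread-local semaphore from which permits are minted. The flow the series introduces — create a permit from the semaphore, wait for admission before the read starts, release the held units when the permit dies — can be sketched with hypothetical toy types (synchronous, not Scylla's async `reader_concurrency_semaphore`/`reader_permit` API):

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the reader concurrency semaphore: tracks available
// resource units; a permit must be admitted before a read may start.
class toy_semaphore {
    int64_t _available;
public:
    explicit toy_semaphore(int64_t units) : _available(units) { }
    bool try_admit(int64_t units) {
        if (units > _available) {
            return false;
        }
        _available -= units;
        return true;
    }
    void release(int64_t units) { _available += units; }
    int64_t available() const { return _available; }
};

// Toy permit: a synchronous analogue of wait_admission(); returns all
// units it was granted when destroyed, so units cannot leak.
class toy_permit {
    toy_semaphore& _sem;
    int64_t _held = 0;
public:
    explicit toy_permit(toy_semaphore& sem) : _sem(sem) { }
    toy_permit(const toy_permit&) = delete;
    bool wait_admission(int64_t units) {
        if (!_sem.try_admit(units)) {
            return false;
        }
        _held += units;
        return true;
    }
    ~toy_permit() { _sem.release(_held); }
};
```

This mirrors why the diff replaces direct `sem.wait_admission(...)` calls with `auto permit = sem.make_permit(); permit.wait_admission(...)`: the permit is the object that owns the admitted units for the duration of the read.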


@@ -28,6 +28,7 @@
#include <boost/range/irange.hpp>
#include <boost/range/adaptor/map.hpp>
#include "test/lib/flat_mutation_reader_assertions.hh"
#include "test/lib/reader_permit.hh"
#include <seastar/core/reactor.hh>
using namespace sstables;
@@ -92,7 +93,7 @@ sstables::shared_sstable make_sstable_containing(std::function<sstables::shared_
}
// validate the sstable
auto rd = assert_that(sst->as_mutation_source().make_reader(s));
auto rd = assert_that(sst->as_mutation_source().make_reader(s, tests::make_permit()));
for (auto&& m : merged) {
rd.produces(m);
}
@@ -117,7 +118,7 @@ shared_sstable make_sstable(sstables::test_env& env, schema_ptr s, sstring dir,
mt->apply(m);
}
sst->write_components(mt->make_flat_reader(s), mutations.size(), s, cfg, mt->get_encoding_stats()).get();
sst->write_components(mt->make_flat_reader(s, tests::make_permit()), mutations.size(), s, cfg, mt->get_encoding_stats()).get();
sst->load().get();
return sst;


@@ -32,6 +32,7 @@
#include <boost/range/adaptor/map.hpp>
#include "test/lib/test_services.hh"
#include "test/lib/sstable_test_env.hh"
#include "test/lib/reader_permit.hh"
#include "gc_clock.hh"
using namespace sstables;
@@ -97,12 +98,12 @@ public:
}
future<temporary_buffer<char>> data_read(uint64_t pos, size_t len) {
return _sst->data_read(pos, len, default_priority_class());
return _sst->data_read(pos, len, default_priority_class(), tests::make_permit());
}
future<index_list> read_indexes() {
auto l = make_lw_shared<index_list>();
return do_with(std::make_unique<index_reader>(_sst, no_reader_permit(), default_priority_class(), tracing::trace_state_ptr()),
return do_with(std::make_unique<index_reader>(_sst, tests::make_permit(), default_priority_class(), tracing::trace_state_ptr()),
[this, l] (std::unique_ptr<index_reader>& ir) {
return ir->read_partition_data().then([&, l] {
l->push_back(std::move(ir->current_partition_entry()));


@@ -20,6 +20,7 @@
*/
#include "test/lib/test_services.hh"
#include "test/lib/reader_permit.hh"
#include "auth/service.hh"
#include "db/config.hh"
#include "db/system_distributed_keyspace.hh"
@@ -106,6 +107,7 @@ thread_local sstables::sstables_manager test_sstables_manager(nop_lp_handler, te
column_family::config column_family_test_config() {
column_family::config cfg;
cfg.sstables_manager = &test_sstables_manager;
cfg.compaction_concurrency_semaphore = &tests::semaphore();
return cfg;
}


@@ -119,9 +119,9 @@ public:
app_config["stats-file"].as<sstring>(),
std::chrono::milliseconds(app_config["stats-period-ms"].as<unsigned>())};
}
stats_collector(table& tab, std::optional<params> p)
stats_collector(reader_concurrency_semaphore& sem, std::optional<params> p)
: _params(std::move(p))
, _sem(tab.read_concurrency_semaphore())
, _sem(sem)
, _initial_res(_sem.available_resources()) {
}
stats_collector(const stats_collector&) = delete;
@@ -186,7 +186,8 @@ void execute_reads(reader_concurrency_semaphore& sem, unsigned reads, unsigned c
if (sem.waiters()) {
testlog.trace("Waiting for queue to drain");
sem.wait_admission(1, db::no_timeout).get();
auto permit = sem.make_permit();
permit.wait_admission(1, db::no_timeout).get();
}
}
@@ -307,12 +308,14 @@ int main(int argc, char** argv) {
auto prev_occupancy = logalloc::shard_tracker().occupancy();
testlog.info("Occupancy before: {}", prev_occupancy);
auto& sem = env.local_db().make_query_class_config().semaphore;
testlog.info("Reading");
stats_collector sc(tab, stats_collector_params);
stats_collector sc(sem, stats_collector_params);
try {
auto _ = sc.collect();
memory::set_heap_profiling_enabled(true);
execute_reads(tab.read_concurrency_semaphore(), reads, read_concurrency, [&] (unsigned i) {
execute_reads(sem, reads, read_concurrency, [&] (unsigned i) {
return env.execute_cql(format("select * from ks.test where pk = 0 and ck > {} limit 100;",
tests::random::get_int(rows / 2))).discard_result();
});


@@ -37,6 +37,7 @@
#include "test/lib/test_services.hh"
#include "test/lib/sstable_test_env.hh"
#include "test/lib/cql_test_env.hh"
#include "test/lib/reader_permit.hh"
class size_calculator {
class nest {
@@ -204,7 +205,7 @@ static sizes calculate_sizes(cache_tracker& tracker, const mutation_settings& se
v,
sstables::sstable::format_types::big);
auto mt2 = make_lw_shared<memtable>(s);
mt2->apply(*mt).get();
mt2->apply(*mt, tests::make_permit()).get();
write_memtable_to_sstable_for_test(*mt2, sst).get();
sst->load().get();
result.sstable[v] = sst->data_size();


@@ -26,6 +26,7 @@
#include <boost/range/adaptors.hpp>
#include <json/json.h>
#include "test/lib/cql_test_env.hh"
#include "test/lib/reader_permit.hh"
#include "test/perf/perf.hh"
#include <seastar/core/app-template.hh>
#include "schema_builder.hh"
@@ -747,6 +748,7 @@ static void assert_partition_start(flat_mutation_reader& rd) {
// cf should belong to ks.test
static test_result scan_rows_with_stride(column_family& cf, int n_rows, int n_read = 1, int n_skip = 0) {
auto rd = cf.make_reader(cf.schema(),
tests::make_permit(),
query::full_partition_range,
cf.schema()->full_slice(),
default_priority_class(),
@@ -791,7 +793,7 @@ static test_result scan_with_stride_partitions(column_family& cf, int n, int n_r
int pk = 0;
auto pr = n_skip ? dht::partition_range::make_ending_with(dht::partition_range::bound(keys[0], false)) // covering none
: query::full_partition_range;
auto rd = cf.make_reader(cf.schema(), pr, cf.schema()->full_slice());
auto rd = cf.make_reader(cf.schema(), tests::make_permit(), pr, cf.schema()->full_slice());
metrics_snapshot before;
@@ -813,6 +815,7 @@ static test_result scan_with_stride_partitions(column_family& cf, int n, int n_r
static test_result slice_rows(column_family& cf, int offset = 0, int n_read = 1) {
auto rd = cf.make_reader(cf.schema(),
tests::make_permit(),
query::full_partition_range,
cf.schema()->full_slice(),
default_priority_class(),
@@ -842,7 +845,7 @@ static test_result slice_rows_by_ck(column_family& cf, int offset = 0, int n_rea
clustering_key::from_singular(*cf.schema(), offset + n_read - 1)))
.build();
auto pr = dht::partition_range::make_singular(make_pkey(*cf.schema(), 0));
auto rd = cf.make_reader(cf.schema(), pr, slice);
auto rd = cf.make_reader(cf.schema(), tests::make_permit(), pr, slice);
return test_reading_all(rd);
}
@@ -854,6 +857,7 @@ static test_result select_spread_rows(column_family& cf, int stride = 0, int n_r
auto slice = sb.build();
auto rd = cf.make_reader(cf.schema(),
tests::make_permit(),
query::full_partition_range,
slice);
@@ -867,14 +871,14 @@ static test_result test_slicing_using_restrictions(column_family& cf, int_range
}))
.build();
auto pr = dht::partition_range::make_singular(make_pkey(*cf.schema(), 0));
auto rd = cf.make_reader(cf.schema(), pr, slice, default_priority_class(), nullptr,
auto rd = cf.make_reader(cf.schema(), tests::make_permit(), pr, slice, default_priority_class(), nullptr,
streamed_mutation::forwarding::no, mutation_reader::forwarding::no);
return test_reading_all(rd);
}
static test_result slice_rows_single_key(column_family& cf, int offset = 0, int n_read = 1) {
auto pr = dht::partition_range::make_singular(make_pkey(*cf.schema(), 0));
auto rd = cf.make_reader(cf.schema(), pr, cf.schema()->full_slice(), default_priority_class(), nullptr, streamed_mutation::forwarding::yes, mutation_reader::forwarding::no);
auto rd = cf.make_reader(cf.schema(), tests::make_permit(), pr, cf.schema()->full_slice(), default_priority_class(), nullptr, streamed_mutation::forwarding::yes, mutation_reader::forwarding::no);
metrics_snapshot before;
assert_partition_start(rd);
@@ -893,7 +897,7 @@ static test_result slice_partitions(column_family& cf, const std::vector<dht::de
dht::partition_range::bound(keys[std::min<size_t>(keys.size(), offset + n_read) - 1], true)
);
auto rd = cf.make_reader(cf.schema(), pr, cf.schema()->full_slice());
auto rd = cf.make_reader(cf.schema(), tests::make_permit(), pr, cf.schema()->full_slice());
metrics_snapshot before;
uint64_t fragments = consume_all_with_next_partition(rd);
@@ -996,6 +1000,7 @@ static test_result test_forwarding_with_restriction(column_family& cf, clustered
auto pr = single_partition ? dht::partition_range::make_singular(make_pkey(*cf.schema(), 0)) : query::full_partition_range;
auto rd = cf.make_reader(cf.schema(),
tests::make_permit(),
pr,
slice,
default_priority_class(),


@@ -26,6 +26,7 @@
#include "seastar/include/seastar/testing/perf_tests.hh"
#include "test/lib/simple_schema.hh"
#include "test/lib/reader_permit.hh"
#include "mutation_reader.hh"
#include "flat_mutation_reader.hh"
@@ -291,22 +292,22 @@ protected:
PERF_TEST_F(memtable, one_partition_one_row)
{
return consume_all(single_row_mt().make_flat_reader(schema(), single_partition_range()));
return consume_all(single_row_mt().make_flat_reader(schema(), tests::make_permit(), single_partition_range()));
}
PERF_TEST_F(memtable, one_partition_many_rows)
{
return consume_all(multi_row_mt().make_flat_reader(schema(), single_partition_range()));
return consume_all(multi_row_mt().make_flat_reader(schema(), tests::make_permit(), single_partition_range()));
}
PERF_TEST_F(memtable, many_partitions_one_row)
{
return consume_all(single_row_mt().make_flat_reader(schema(), multi_partition_range(25)));
return consume_all(single_row_mt().make_flat_reader(schema(), tests::make_permit(), multi_partition_range(25)));
}
PERF_TEST_F(memtable, many_partitions_many_rows)
{
return consume_all(multi_row_mt().make_flat_reader(schema(), multi_partition_range(25)));
return consume_all(multi_row_mt().make_flat_reader(schema(), tests::make_permit(), multi_partition_range(25)));
}
}


@@ -35,6 +35,7 @@
#include "schema_builder.hh"
#include "memtable.hh"
#include "test/perf/perf.hh"
#include "test/lib/reader_permit.hh"
static const int update_iterations = 16;
static const int cell_size = 128;
@@ -152,7 +153,7 @@ void run_test(const sstring& name, schema_ptr s, MutationGenerator&& gen) {
// Create a reader which tests the case of memtable snapshots
// going away after memtable was merged to cache.
auto rd = std::make_unique<flat_mutation_reader>(
make_combined_reader(s, cache.make_reader(s), mt->make_flat_reader(s)));
make_combined_reader(s, cache.make_reader(s, tests::make_permit()), mt->make_flat_reader(s, tests::make_permit())));
rd->set_max_buffer_size(1);
rd->fill_buffer(db::no_timeout).get();


@@ -204,7 +204,7 @@ public:
}
future<double> read_sequential_partitions(int idx) {
return do_with(_sst[0]->read_rows_flat(s, no_reader_permit()), [this] (flat_mutation_reader& r) {
return do_with(_sst[0]->read_rows_flat(s, tests::make_permit()), [this] (flat_mutation_reader& r) {
auto start = perf_sstable_test_env::now();
auto total = make_lw_shared<size_t>(0);
auto done = make_lw_shared<bool>(false);


@@ -30,6 +30,7 @@
#include "log.hh"
#include "schema_builder.hh"
#include "memtable.hh"
#include "test/lib/reader_permit.hh"
static
partition_key new_key(schema_ptr s) {
@@ -186,7 +187,7 @@ int main(int argc, char** argv) {
// Verify that all mutations from memtable went through
for (auto&& key : keys) {
auto range = dht::partition_range::make_singular(key);
auto reader = cache.make_reader(s, range);
auto reader = cache.make_reader(s, tests::make_permit(), range);
auto mo = read_mutation_from_flat_mutation_reader(reader, db::no_timeout).get0();
assert(mo);
assert(mo->partition().live_row_count(*s) ==
@@ -203,7 +204,7 @@ int main(int argc, char** argv) {
for (auto&& key : keys) {
auto range = dht::partition_range::make_singular(key);
auto reader = cache.make_reader(s, range);
auto reader = cache.make_reader(s, tests::make_permit(), range);
auto mfopt = reader(db::no_timeout).get0();
assert(mfopt);
assert(mfopt->is_partition_start());
@@ -241,7 +242,7 @@ int main(int argc, char** argv) {
}
try {
auto reader = cache.make_reader(s, range);
auto reader = cache.make_reader(s, tests::make_permit(), range);
assert(!reader(db::no_timeout).get0());
auto evicted_from_cache = logalloc::segment_size + large_cell_size;
// GCC's -fallocation-dce can remove dead calls to new and malloc, so


@@ -100,7 +100,7 @@ struct table {
testlog.trace("flushing");
prev_mt = std::exchange(mt, make_lw_shared<memtable>(s.schema()));
auto flushed = make_lw_shared<memtable>(s.schema());
flushed->apply(*prev_mt).get();
flushed->apply(*prev_mt, tests::make_permit()).get();
prev_mt->mark_flushed(flushed->as_data_source());
testlog.trace("updating cache");
cache.update([&] {
@@ -148,12 +148,12 @@ struct table {
auto r = std::make_unique<reader>(reader{std::move(pr), std::move(slice), make_empty_flat_reader(s.schema())});
std::vector<flat_mutation_reader> rd;
if (prev_mt) {
rd.push_back(prev_mt->make_flat_reader(s.schema(), r->pr, r->slice, default_priority_class(), nullptr,
rd.push_back(prev_mt->make_flat_reader(s.schema(), tests::make_permit(), r->pr, r->slice, default_priority_class(), nullptr,
streamed_mutation::forwarding::no, mutation_reader::forwarding::no));
}
rd.push_back(mt->make_flat_reader(s.schema(), r->pr, r->slice, default_priority_class(), nullptr,
rd.push_back(mt->make_flat_reader(s.schema(), tests::make_permit(), r->pr, r->slice, default_priority_class(), nullptr,
streamed_mutation::forwarding::no, mutation_reader::forwarding::no));
rd.push_back(cache.make_reader(s.schema(), r->pr, r->slice, default_priority_class(), nullptr,
rd.push_back(cache.make_reader(s.schema(), tests::make_permit(), r->pr, r->slice, default_priority_class(), nullptr,
streamed_mutation::forwarding::no, mutation_reader::forwarding::no));
r->rd = make_combined_reader(s.schema(), std::move(rd), streamed_mutation::forwarding::no, mutation_reader::forwarding::no);
return r;