Check that the replica returns empty pages as expected, when a large
tombstone prefix/span is present. Large = larger than the configured
query_tombstone_limit (using a tiny value of 10 in the test to avoid
having to write many tombstones).
The query result writer now counts tombstones and cuts the page (marking
it as a short one) when the tombstone limit is reached. This is to avoid
timing out on large spans of tombstones, especially prefixes.
In the case of unpaged queries, we fail the read instead, similarly to
how we do with max result size.
If the limit is 0, the previous behaviour is used: tombstones are not
taken into consideration at all.
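The page-cutting rule described above can be sketched roughly like this (the names and types are illustrative only, not Scylla's actual result-writer API):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: a page builder that counts tombstones and cuts the
// page short once the configured limit is reached. A limit of 0 disables
// the check entirely (the previous behaviour).
struct page_builder {
    uint64_t tombstone_limit; // 0 means unlimited
    uint64_t tombstones_seen = 0;
    bool short_page = false;
    std::vector<int> rows; // stand-in for real row data

    // Returns false once the page has been cut.
    bool accept_tombstone() {
        ++tombstones_seen;
        if (tombstone_limit && tombstones_seen >= tombstone_limit) {
            short_page = true; // marked short so the pager keeps going
        }
        return !short_page;
    }

    bool accept_row(int row) {
        if (short_page) {
            return false;
        }
        rows.push_back(row);
        return true;
    }
};
```

Note that a page cut this way can be entirely empty yet still marked short, which is exactly why the pager must not treat an empty page as end-of-query.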
The pager currently assumes that an empty page means the query is
exhausted. Lift this assumption, as we will soon have empty short
pages.
Paging with filtering also needs to use the replica-provided
last-position when the page is empty.
We expect each replica to stop at exactly the same position when the
digests match. Soon however, if replicas have a lot of tombstones, some
may stop earlier than the others. As long as all digests match, this is
fine, but we need to make sure we continue from the smallest such
position on the next page.
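The "continue from the smallest position" rule can be sketched as follows (positions are simplified to plain integers and the function name is made up for illustration):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative sketch: when the digests match but replicas stopped at
// different positions (some hit the tombstone limit earlier), the
// coordinator resumes the next page from the smallest reported
// last-position, guaranteeing no row is skipped.
using position = int;

position next_page_start(const std::vector<position>& replica_last_positions) {
    // All digests matched, so each replica's data is valid up to its own
    // stop position; the minimum is the safe point to resume from.
    return *std::min_element(replica_last_positions.begin(),
                             replica_last_positions.end());
}
```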
We want to transmit the last position as determined by the replica on
both result and digest reads. Result reads already do that via the
query::result, but digest reads don't yet as they don't return the full
query::result structure, just the digest field from it. Add the last
position to the digest read's return value and collect these in the
digest resolver, along with the returned digests.
When merging multiple query-results, we use the last-position of the
last result in the combined one as the combined result's last position.
This only works however if said last result was included fully.
Otherwise we have to discard the last-position included with the result
and the pager will use the position of the last row in the combined
result as the last position.
The commit introducing the above logic mistakenly discarded the last
position when the result is a short read or a page is not full. This is
not necessary and even harmful as it can result in an empty combined
result being delivered to the pager, without a last-position.
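The corrected merge rule might be sketched like this (types are heavily simplified and the names are invented for the example):

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Illustrative sketch of the merge rule described above: keep the last
// result's last-position only when that result was included in full;
// otherwise discard it and let the pager fall back to the position of
// the last row in the combined result.
struct partial_result {
    std::optional<int> last_position; // replica-provided
    bool included_fully;
};

std::optional<int> merged_last_position(const std::vector<partial_result>& parts) {
    if (parts.empty()) {
        return std::nullopt;
    }
    const auto& last = parts.back();
    // Note: a short read or a non-full page does NOT by itself invalidate
    // the last-position; only partial inclusion during merging does.
    if (last.included_fully) {
        return last.last_position;
    }
    return std::nullopt; // pager uses the last row of the combined result
}
```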
To be used by coordinator-side code to determine the correct tombstone
limit to pass to read-command (tombstone limit field added in the next
commit). When this limit is non-zero, the replica will start cutting
pages after the tombstone limit is surpassed.
This getter works similarly to `get_max_result_size()`: if the cluster
feature for empty replica pages is set, it will return the value
configured via db::config::query_tombstone_limit. System queries always
use a limit of 0 (unlimited tombstones).
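A rough sketch of the getter's decision logic, under the behaviour stated above (the function and parameter names are hypothetical, not the actual signature):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: the configured limit only takes effect once the
// cluster feature for empty replica pages is enabled, and system queries
// always run with a limit of 0 (unlimited tombstones).
uint64_t get_query_tombstone_limit(bool empty_replica_pages_feature,
                                   bool is_system_query,
                                   uint64_t configured_limit) {
    if (is_system_query || !empty_replica_pages_feature) {
        return 0; // 0 = unlimited tombstones
    }
    return configured_limit;
}
```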
This will be the value used to break pages after processing the
specified number of tombstones. The page will be cut even if empty.
We could maybe use the already existing tombstone_{warn,fail}_threshold
instead, treating them as a soft/hard limit pair, like we did with page
sizes.
It is not type safe: it has multiple limits passed to it as raw ints, as
well as other types that ints implicitly convert to. Furthermore, the
row limit is passed in two separate fields (lower 32 bits and upper 32
bits). All this makes the constructor a minefield for humans to use. We
have had a safer constructor for some time, but some users of the old
one remain. Move them to the safe one.
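A minimal sketch of the strong-typing idea (the wrapper and struct names below are invented for illustration; they are not Scylla's actual types):

```cpp
#include <cassert>
#include <cstdint>

// Illustrative sketch of why raw-int parameters are a minefield and how
// strong types fix it: with explicit single-argument constructors, an
// argument-order mistake no longer compiles.
struct row_limit {
    uint64_t value;
    explicit row_limit(uint64_t v) : value(v) {}
};
struct partition_limit {
    uint64_t value;
    explicit partition_limit(uint64_t v) : value(v) {}
};

struct read_command_sketch {
    uint64_t rows;
    uint64_t partitions;
    // The safe constructor: each limit arrives as its own type, and the
    // 64-bit row limit is one field, not two 32-bit halves.
    read_command_sketch(row_limit r, partition_limit p)
        : rows(r.value), partitions(p.value) {}
};
```

Calling `read_command_sketch(partition_limit(10), row_limit(100))` with the arguments swapped is a compile error, which is the whole point of the safe constructor.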
When starting the `Build` job, the `x86` and `arm` builds may start on
different dates, causing the whole process to fail.
As suggested by @avikivity, add a date-stamp parameter and pass it
through to downstream jobs so we get one release for each job.
Ref: scylladb/scylla-pkg#3008
Closes #11234
We would like to define more distinct types
that are currently defined as aliases to utils::UUID to identify
resources in the system, like table id and schema
version id.
As with counter_id, the motivation is to restrict
the usage of the distinct types so they can be used
(assigned, compared, etc.) only with objects of
the same type. Using them with a generic UUID will
then require explicit conversion, which we want to
expose.
This series starts with cleaning up the idl header definition
by adding support for `import` and `include` statements in the idl-compiler.
These allow the idl header to become self-sufficient,
so we can then remove manually-added includes from source files.
The latter usually need only the top level idl header
and it, in turn, should include other headers if it depends on them.
Then, a UUID_class template was defined as shared boilerplate
for the various uuid-like classes. First, we convert counter_id to use
it, rather than mimicking utils::UUID on its own.
On top of utils::UUID_class<T>, we define table_id,
table_schema_version, and query_id.
Following up on this series, we should define more commonly used
types like: host_id, streaming_plan_id, paxos_ballot_id.
Fixes #11207
Closes #11220
* github.com:scylladb/scylladb:
query-request, everywhere: define and use query_id as a strong type
schema, everywhere: define and use table_schema_version as a strong type
schema, everywhere: define and use table_id as a strong type
schema: include schema_fwd.hh in schema.hh
system_keyspace: get_truncation_record: delete unused lambda capture
utils: uuid: define appending_hash<utils::tagged_uuid<Tag>>
utils: tagged_uuid: rename to_uuid() to uuid()
counters: counter_id: use base class create_random_id
counters: base counter_id on utils::tagged_uuid
utils: tagged_uuid: mark functions noexcept
utils: tagged_uuid: bool: reuse uuid::bool operator
raft: migrate tagged_id definition to utils::tagged_uuid
utils: uuid: mark functions noexcept
counters: counter_id delete requirement for triviality
utils: bit_cast: require TriviallyCopyable To
repair: delete unused include of utils/bit_cast.hh
bit_cast: use std::bit_cast
idl: make idl headers self-sufficient
db: hints: sync_point: do not include idl definition file
db/per_partition_rate_limit: tidy up headers self-sufficiency
idl-compiler: include serialization impl and visitors in generated dist.impl.hh files
idl-compiler: add include statements
idl_test: add a struct depending on UUID
Define table_schema_version as a distinct tagged_uuid class,
so it can be differentiated from other uuid-class types,
in particular table_id.
Added reversed(table_schema_version) for convenience
and uniformity, since the same logic is currently open-coded
in several places.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Define table_id as a distinct utils::tagged_uuid modeled after raft
tagged_id, so it can be differentiated from other uuid-class types,
in particular from table_schema_version.
Fixes#11207
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Rather than defining its own generate_random, use the base class's
create_random_id, and use it accordingly in unit tests.
(It was inherited from raft::internal::tagged_id.)
This allows us to shorten counter_id's definition
to just using utils::tagged_uuid<struct counter_id_tag>.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Use the common base class for uuid-based types.
tagged_uuid::to_uuid is defined here for backward
compatibility, but it will be renamed to uuid() in the
next patch.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
So it can be used for other types in the system outside
of raft, like counter_id, table_id, table_schema_version,
and more.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This stemmed from an overly strict requirement in utils/bit_cast.
Now that it has been relaxed, there is no need for this static assert,
as counter_id is trivially copyable, and that is checked
by bit_cast {read,write}_unaligned.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now that scylla requires C++20, there's no
need to define our own implementation in utils/bit_cast.hh.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Add include statements to satisfy dependencies.
Delete, now unneeded, include directives from the upper level
source files.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
idl definition files are not intended for direct
inclusion in .cc files.
The data types they represent are supposed to be defined
in a regular C++ header, so define them in db/hints/sync_point.hh
and include it rather than idl/hinted_handoff.idl.hh.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
They are generally required by the serialization implementation.
This will simplify using them, without having to hand-pick
which headers to include in the .cc files that include them.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
For generating #include directives in the generated files,
so we don't have to hand-craft includes for the dependencies
in the right order.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This series is aimed at fixing #11132.
To get there, the series untangles the functions that currently depend on the cross-shard coordination in table::snapshot,
namely database::truncate and, consequently, database::drop_column_family.
database::get_table_on_all_shards is added here as a helper to get a foreign shared ptr to the table shard from all shards,
and it is later used by multiple functions to truncate and then take a snapshot of the sharded table.
database::truncate_table_on_all_shards is defined to orchestrate the truncate process end-to-end: flushing or clearing all table shards before taking a snapshot if needed, using the newly defined table::snapshot_on_all_shards, thereby leaving only the discard_sstables job to the per-shard database::truncate function.
The latter, snapshot_on_all_shards, orchestrates the snapshot process on all shards, getting rid of the per-shard table::snapshot function (after refactoring take_snapshot and finalize_snapshot out of it) and the associated dreaded data structures: snapshot_manager and pending_snapshots.
Fixes#11132.
Closes #11133
* github.com:scylladb/scylladb:
table: reindent write_schema_as_cql
table: coroutinize write_schema_as_cql
table: seal_snapshot: maybe_yield when iterating over the table names
table: reindent seal_snapshot
table: coroutinize seal_snapshot
table: delete unused snapshot_manager and pending_snapshots
table: delete unused snapshot function
table: snapshot_on_all_shards: orchestrate snapshot process
table: snapshot: move pending_snapshots.erase from seal_snapshot
table: finalize_snapshot: take the file sets as a param
table: make seal_snapshot a static member
table: finalize_snapshot: reindent
table: refactor finalize_snapshot out of snapshot
table: snapshot: keep per-shard file sets in snapshot_manager
table: take_snapshot: return foreign unique ptr
table: take_snapshot: maybe yield in per-sstable loop
table: take_snapshot: simplify tables construction code
table: take_snapshot: reindent
table: take_snapshot: simplify error handling
table: refactor take_snapshot out of snapshot
utils: get rid of joinpoint
database: get rid of timestamp_func
database: truncate: snapshot table in all-shards layer
database: truncate: flush table and views in all-shards layer
database: truncate: stop and disable compaction in all-shards layer
database: truncate: move call to set_low_replay_position_mark to all-shards layer
database: truncate: enter per-shard table async_gate in all-shards layer
database: truncate: move check for schema_tables keyspace to all-shards layer.
database: snapshot_table_on_all_shards: reindent
table: add snapshot_on_all_shards
database: add snapshot_table_on_all_shards
database: rename {flush,snapshot}_on_all and make static
database: drop_table_on_all_shards: truncate and stop table in upper layer
database: drop_table_on_all_shards: get all table shards before drop_column_family on each
database: drop_column_family: define table& cf
database: drop_column_family: reuse uuid for evict_all_for_table
database: drop_column_family: move log message up a layer
database: truncate: get rid of the unused ks param
database: add truncate_table_on_all_shards
database: drop_table_on_all_shards: do not accept a truncated_at timestamp_func
database: truncate: get optional snapshot_name from caller
database: truncate: fix assert about replay_position low_mark
database_test: apply_mutation on the correct db shard
Add maybe_yield calls in a tight loop, potentially
over thousands of sstable names, to prevent reactor stalls.
Although the per-sstable cost is very small, we've experienced
stalls related to printing in O(#sstables) in compaction.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
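The yield-every-so-often pattern can be sketched without Seastar like this (in the real code a maybe_yield helper returns control to the reactor when preemption is needed; this stand-in merely counts would-be yields, and all names are invented):

```cpp
#include <cassert>
#include <cstddef>

// Illustrative sketch: a counter that requests a yield roughly every N
// iterations of a tight loop, so no single loop run monopolizes the
// reactor for O(#sstables) work.
struct yield_counter {
    size_t every;            // yield roughly every N iterations
    size_t since_yield = 0;
    size_t yields = 0;

    void maybe_yield() {
        if (++since_yield >= every) {
            since_yield = 0;
            ++yields; // in Seastar this would suspend back to the reactor
        }
    }
};
```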
Handle exceptions, making sure the output
stream is properly closed in all cases,
and an intermediate error, if any, is returned as the
final future.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
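The close-on-all-paths shape can be sketched with plain exceptions instead of futures (all names here are invented; in the real code the intermediate error travels as an exceptional future):

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Stand-in for the output stream being written to.
struct fake_stream {
    bool closed = false;
    void close() { closed = true; }
};

// Illustrative sketch: the stream is closed on every path, and an
// intermediate error, if any, is rethrown only after the close.
bool write_all(fake_stream& out, bool fail) {
    std::exception_ptr err;
    try {
        if (fail) {
            throw std::runtime_error("write failed");
        }
    } catch (...) {
        err = std::current_exception();
    }
    out.close(); // runs whether or not writing failed
    if (err) {
        std::rethrow_exception(err);
    }
    return true;
}
```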
Now that snapshot orchestration in snapshot_on_all_shards
doesn't use snapshot_manager, get rid of the data structure.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now that snapshot orchestration is done solely
in snapshot_on_all_shards, the per-shard
snapshot function can be deleted.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Call take_snapshot on each shard and collect the
returned snapshot_file_sets.
When all are done, move the vector<snapshot_file_set>
to finalize_snapshot.
All that without resorting to the snapshot_manager
or calling table::snapshot.
Both will be deleted in the following patches.
Fixes#11132
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now that seal_snapshot doesn't need to look up
the snapshot_manager in pending_snapshots to
get to the file_sets, erasing the snapshot_manager
object can be done in table::snapshot, which
also inserted it there.
This will make it easier to get rid of it in a later patch.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
and pass it to seal_snapshot, so that the latter won't
need to look up and access the snapshot_manager object.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>