scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 03:45:11 +00:00

Author	SHA1	Message	Date
Avi Kivity	71398f3fb4	Merge "Cleanup sstable writer" from Benny " This series cleans up the legacy and common sstable writer code. metadata_collector::_ancestors was moved to class sstable so that the former can be moved out of sstable into file_writer_impl. Moved setting of replay position and sstable level via sstable_writer_config so that compaction won't need to access the metadata_collector via the sstable. With that, metadata_collector could be moved from class sstable to sstable_writer::writer_impl along with the column_stats. That allowed moving "generic" file_writer methods that were actually k/l format specific into sstable_writer_k_l. Eventually `file_writer` code is moved into sstables/writer.cc and sstable_writer_k_l into sstables/kl/writer.{hh,cc}. A bonus cleanup is the ability to get rid of sstable::_correctly_serialize_non_compound_range_tombstones as it's now available to the writers via the writer configuration and not required to be stored in the sstable object. Fixes #3012 Test: unit(dev) " * tag 'cleanup-sstable-writer-v2' of github.com:bhalevy/scylla: sstables: move writer code away to writer.cc sstables: move sstable_writer_k_l away to kl/writer sstables: get rid of sstable::_correctly_serialize_non_compound_range_tombstones sstables: move writer methods to sstable_writer_k_l sstables: move compaction ancestors to sstable sstables: sstable_writer: optionally set sstable level via config sstables: sstable_writer: optionally set replay position via config sstables: compaction: make_sstable_writer_config sstables: open code update_stats_on_end_of_stream in sstable_writer::consume_end_of_stream sstables: fold components_writer into sstable_writer_k_l sstables: move sstable_writer_k_l definition upwards sstables: components_writer: turn _index into unique_ptr	2020-10-15 10:40:28 +03:00
Benny Halevy	8cd4d53643	sstables: mx/writer: fix copy-paste error in reader_semaphore name It was copied from sstables.cc in `6ca0464af5`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20201014171651.541232-1-bhalevy@scylladb.com>	2020-10-14 22:17:49 +02:00
Benny Halevy	96cd6adc71	sstables: get rid of sstable::_correctly_serialize_non_compound_range_tombstones Now it's available to the writers via the writer configuration and not required to be stored in the sstable object. Refs #3012 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 19:53:23 +03:00
Benny Halevy	97a446f9fa	sstables: move writer methods to sstable_writer_k_l They are called solely from the sstable_writer_k_l path. With that, move the metadata collector and column stats to writer_impl. They are now only used by the sstable writers. Refs #3012 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 19:52:17 +03:00
Benny Halevy	e1692bec17	sstables: move compaction ancestors to sstable Compaction needs access to the sstable's ancestors so we need to keep the ancestors for the sstable separately from the metadata collector as the latter is about to be moved to the sstable writer. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-10-14 19:51:26 +03:00
Avi Kivity	86bbf1763d	Merge "reader concurrency semaphore: dump permit diagnostics on timeout or queue overflow" from Botond " The reader concurrency semaphore timing out or its queue overflowing are fairly common events both in production and in testing. At the same time it is a hard-to-diagnose problem that often has a benign cause (especially during testing), but it is equally possible that it points to something serious. So when this error starts to appear in logs, usually we want to investigate and the investigation is lengthy... either involves looking at metrics or coredumps or both. This patch intends to jumpstart this process by dumping diagnostics on semaphore timeout or queue overflow. The diagnostics are printed to the log with debug level to avoid excessive spamming. They contain a histogram of all the permits associated with the problematic semaphore organized by table, operation and state. Example: DEBUG 2020-10-08 17:05:26,115 [shard 0] reader_concurrency_semaphore - Semaphore _read_concurrency_sem: timed out, dumping permit diagnostics: Permits with state admitted, sorted by memory memory count name 3499M 27 ks.test:data-query 3499M 27 total Permits with state waiting, sorted by count count memory name 1 0B ks.test:drain 7650 0B ks.test:data-query 7651 0B total Permits with state registered, sorted by count count memory name 0 0B total Total: permits: 7678, memory: 3499M This allows determining several things at a glance: * What are the tables involved * What are the operations involved * Where is the memory This can speed up a follow-up investigation greatly, or it can even be enough on its own to determine that the issue is benign. 
Tests: unit(dev, debug) " * 'dump-diagnostics-on-semaphore-timeout/v2' of https://github.com/denesb/scylla: reader_concurrency_semaphore: dump permit diagnostics on timeout or queue overflow utils: add to_hr_size() reader_concurrency_semaphore: link permits into an intrusive list reader_concurrency_semaphore: move expiry_handler::operator()() out-of-line reader_concurrency_semaphore: move constructors out-of-line reader_concurrency_semaphore: add state to permits reader_concurrency_semaphore: name permits querier_cache_test: test_immediate_evict_on_insert: use two permits multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader() multishard_combining_reader: add permit parameter multishard_combining_reader: shard_reader: use multishard reader's permit	2020-10-13 12:44:23 +03:00
Botond Dénes	ff623e70b3	reader_concurrency_semaphore: name permits Require a schema and an operation name to be given to each permit when created. The schema is that of the table the read is executed against, and the operation name is some name identifying the operation the permit is part of. Ideally this should be different for each site the permit is created at, to be able to discern not only different kinds of reads, but different code paths the read took. As not all reads can be associated with one schema, the schema is allowed to be null. The name will be used for debugging purposes, both for coredump debugging and runtime logging of permit-related diagnostics.	2020-10-13 12:32:13 +03:00
Avi Kivity	5065ae835f	sstables: move bound_kind_m formatter to namespace sstables Without this, clang complains that we violate argument dependent lookup rules: note: 'operator<<' should be declared prior to the call site or in namespace 'sstables' std::ostream& operator<<(std::ostream&, const sstables::bound_kind_m&); we can't enforce the #include order, but we can easily move it to namespace sstables (where it belongs anyway), so let's do that. gcc is happy either way.	2020-10-12 20:38:11 +03:00
Avi Kivity	a00fca1a69	sstables: move bound_kind_m formatter to its natural place Move bound_kind_m's formatter to the same header file where it is defined. This prevents cases where the compiler decays the type (an enum) to the underlying integral type because it does not see the formatter declaration, resulting in the wrong output.	2020-10-12 20:36:10 +03:00
Botond Dénes	6ca0464af5	mutation_fragment: add schema and permit We want to start tracking the memory consumption of mutation fragments. For this we need schema and permit during construction, and on each modification, so the memory consumption can be recalculated and passed to the permit. In this patch we just add the new parameters and go through the insane churn of updating all call sites. They will be used in the next patch.	2020-09-28 11:27:23 +03:00
Pavel Emelyanov	812eed27fe	code: Force formatting of pointer in .debug and .trace ... and tests. Printing a pointer in logs is considered to be a bad practice, so the proposal is to keep this explicit (with fmt::ptr) and allow it for .debug and .trace cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Benny Halevy	f5ffd5fc5f	sstables: Fix reactor stall in sstables::seal_summary() With relatively big summaries, reactor can be stalled for a couple of milliseconds. This patch: a. allocates positions upfront to avoid excessive reallocation. b. returns a future from seal_summary() and uses `seastar::do_for_each` to iterate over the summary entries so the loop can yield if necessary. Fixes #7108. Based on 2470aad5a389dfd32621737d2c17c7e319437692 by Raphael S. Carvalho <raphaelsc@scylladb.com> Test: unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200826091337.28530-1-bhalevy@scylladb.com>	2020-08-26 12:18:05 +03:00
Avi Kivity	3530e80ce1	Merge "Support md format" from Benny " This series adds support for the "md" sstable format. Support is based on the following: * do not use clustering based filtering in the presence of static row, tombstones. * Disabling min/max column names in the metadata for formats older than "md". * When updating the metadata, reset and disable min/max in the presence of range tombstones (like Cassandra does and until we process them accurately). * Fix the way we maintain min/max column names by: keeping whole clustering key prefixes as min/max rather than calculating min/max independently for each component, like Cassandra does in the "md" format. Fixes #4442 Tests: unit(dev), cql_query_test -t test_clustering_filtering* (debug) md migration_test dtest from git@github.com:bhalevy/scylla-dtest.git migration_test-md-v1 " * tag 'md-format-v4' of github.com:bhalevy/scylla: (27 commits) config: enable_sstables_md_format by default test: cql_query_test: add test_clustering_filtering unit tests table: filter_sstable_for_reader: allow clustering filtering md-format sstables table: create_single_key_sstable_reader: emit partition_start/end for empty filtered results table: filter_sstable_for_reader: adjust to md-format table: filter_sstable_for_reader: include non-scylla sstables with tombstones table: filter_sstable_for_reader: do not filter if static column is requested table: filter_sstable_for_reader: refactor clustering filtering conditional expression features: add MD_SSTABLE_FORMAT cluster feature config: add enable_sstables_md_format database: add set_format_by_config test: sstable_3_x_test: test both mc and md versions test: Add support for the "md" format sstables: mx/writer: use version from sstable for write calls sstables: mx/writer: update_min_max_components for partition tombstone sstables: metadata_collector: support min_max_components for range tombstones sstable: validate_min_max_metadata: drop outdated logic sstables: rename mc folder to mx 
sstables: may_contain_rows: always true for old formats sstables: add may_contain_rows ...	2020-08-11 13:29:11 +03:00
Benny Halevy	e44ec45ab9	sstables: mx/writer: use version from sstable for write calls Rather than using a constant sstable_version_types::mc. In preparation to supporting sstable_version_types::md. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	bd4383a842	sstables: mx/writer: update_min_max_components for partition tombstone Partition tombstones represent an implicit clustering range that is unbound on both sides, so reflect that in min/max column names metadata using empty clustering key prefixes. If we don't do that, when using the sstable for filtering, we have no other way of distinguishing range tombstones from partition tombstones given the sstable metadata and we would need to include any sstable with tombstones, even if those are range tombstones, for which we can do a better filtering job, using the sstable min/max column names metadata. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	68acae5873	sstables: metadata_collector: support min_max_components for range tombstones We essentially treat min/max column names as range bounds with min as incl_start and max as incl_end. By generating a bound_view for min/max column names on the fly, we can correctly track and compare also short clustering key prefixes that may be used as bounds for range tombstones. Extend the sstable_tombstone_metadata_check unit test to cover these cases. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	12393c5ec2	sstables: rename mc folder to mx Prepare for supporting the md format. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
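
The "Support md format" merge (3530e80ce1) describes keeping whole clustering key prefixes as min/max rather than calculating min/max independently for each component. A minimal sketch of the difference follows. It is not ScyllaDB code: `prefix`, `min_max`, `per_component` and `whole_prefix` are hypothetical names, components are plain strings compared bytewise rather than by their CQL types, and all keys are assumed to be non-empty and of equal length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a clustering key prefix: one string per component.
using prefix = std::vector<std::string>;

struct min_max {
    prefix min;
    prefix max;
};

// Per-component tracking: the min and max of each component are computed
// independently, so the resulting bounds may not match any key actually seen.
// Assumes keys is non-empty and every key has the same number of components.
min_max per_component(const std::vector<prefix>& keys) {
    min_max r{keys.front(), keys.front()};
    for (const auto& k : keys) {
        for (std::size_t i = 0; i < k.size(); ++i) {
            r.min[i] = std::min(r.min[i], k[i]);
            r.max[i] = std::max(r.max[i], k[i]);
        }
    }
    return r;
}

// Whole-prefix tracking: keys are compared lexicographically as a unit
// (std::vector's operator<), so min and max are always keys that were seen.
// Assumes keys is non-empty.
min_max whole_prefix(const std::vector<prefix>& keys) {
    min_max r{keys.front(), keys.front()};
    for (const auto& k : keys) {
        r.min = std::min(r.min, k);
        r.max = std::max(r.max, k);
    }
    return r;
}
```

For the keys ("a", "9") and ("b", "1"), per-component tracking yields the bounds ("a", "1") and ("b", "9"), neither of which was written, while whole-prefix tracking yields ("a", "9") and ("b", "1"), a tighter range under lexicographic order.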

17 Commits