sstables/mx/writer: when creating filter, use the sstables's schema, not the writer's

There are two schema's associated with a sstable writer:
the sstable's schema (i.e. the schema of the table at the time when the
sstable object was created), and the writer's schema (equal to the schema
of the reader which is feeding into the writer).

It's easy to mix up the two and break something as a result.

The writer's schema is needed to correctly interpret and serialize the data
passing through the writer, and to populate the on-disk metadata about the
on-disk schema.

The sstables's schema is used to configure some parameters for newly created
sstable, such as bloom filter false positive ratio, or compression.

The problem fixed by this patch is that the writer was wrongly creating
the filter based on its own schema, while the layer outside the writer
was interpreting it as if it was created with the sstable's schema.

This patch forces the writer to pick the filter's parameters based on the
sstable's schema instead.
This commit is contained in:
Michał Chojnowski
2024-07-11 10:35:00 +02:00
parent a1834efd82
commit d10b38ba5b
2 changed files with 40 additions and 2 deletions

View File

@@ -804,7 +804,7 @@ public:
_sst._shards = { shard };
_cfg.monitor->on_write_started(_data_writer->offset_tracker());
_sst._components->filter = utils::i_filter::get_filter(estimated_partitions, _schema.bloom_filter_fp_chance(), utils::filter_format::m_format);
_sst._components->filter = utils::i_filter::get_filter(estimated_partitions, _sst._schema->bloom_filter_fp_chance(), utils::filter_format::m_format);
_pi_write_m.promoted_index_block_size = cfg.promoted_index_block_size;
_pi_write_m.promoted_index_auto_scale_threshold = cfg.promoted_index_auto_scale_threshold;
_index_sampling_state.summary_byte_cost = _cfg.summary_byte_cost;
@@ -1466,7 +1466,7 @@ void writer::consume_end_of_stream() {
_sst._components->statistics.contents[metadata_type::Serialization] = std::make_unique<serialization_header>(std::move(_sst_schema.header));
seal_statistics(_sst.get_version(), _sst._components->statistics, _collector,
_sst._schema->get_partitioner().name(), _schema.bloom_filter_fp_chance(),
_sst._schema->get_partitioner().name(), _sst._schema->bloom_filter_fp_chance(),
_sst._schema, _sst.get_first_decorated_key(), _sst.get_last_decorated_key(), _enc_stats);
close_data_writer();
_sst.write_summary();

View File

@@ -5604,3 +5604,41 @@ SEASTAR_TEST_CASE(test_compression_premature_eof) {
}
});
}
// A reproducer for scylladb/scylladb#16065.
// Creates an sstable with a newer schema, and populates
// it with a reader created with an older schema.
//
// Before the fixes, it would have resulted in an assert violation.
SEASTAR_TEST_CASE(test_alter_bloom_fp_chance_during_write) {
return test_env::do_with_async([] (test_env& env) {
auto s1 = schema_builder("ks", "t")
.with_column("pk", bytes_type, column_kind::partition_key)
.with_column("v", utf8_type, column_kind::regular_column)
.set_bloom_filter_fp_chance(1.0)
.build();
auto s2 = schema_builder(s1)
.set_bloom_filter_fp_chance(0.01)
.build();
auto ts = api::new_timestamp();
auto m = mutation(s1, partition_key::from_single_value(*s1, serialized(0)));
auto val = std::string(1000, '0');
m.set_clustered_cell(clustering_key::make_empty(), "v", val, ts);
auto mt = make_lw_shared<replica::memtable>(s1);
mt->apply(m);
auto sst = env.make_sstable(s2, sstable_version_types::me);
sst->write_components(mt->make_flat_reader(s1, env.make_reader_permit()), 1, s1, env.manager().configure_writer(), mt->get_encoding_stats()).get();
sstable_assertions sa(env, sst);
sa.load();
m.upgrade(s2);
auto assertions = assert_that(sa.make_reader());
assertions.produces(m);
assertions.produces_end_of_stream();
});
}