Mirror of https://github.com/scylladb/scylladb.git (synced 2026-05-12 19:02:12 +00:00)
This series adds support for detecting collections that have too many elements and recording them in `system.large_cells`. A new configuration variable, `compaction_collection_elements_count_warning_threshold`, was added to db/config, with a default of 10000. Collections with more elements than this threshold are warned about and recorded as large cells in the `system.large_cells` table. The documentation has been updated accordingly.

A new column, `collection_elements`, was added to `system.large_cells`. Similar to the `rows` column in `system.large_partitions`, `collection_elements` holds the number of elements in the collection when the large cell is a collection, or 0 otherwise. Note that a collection may be recorded in `system.large_cells` either due to its size, like any other cell, and/or due to the number of elements in it, if it crosses the said threshold.

Note that #11449 called for a new `system.large_collections` table, but extending `system.large_cells` follows the existing logic of `system.large_partitions` and is a smaller change overall, hence it was preferred.

Since the system keyspace schema is hard-coded, the schema version of `system.large_cells` was bumped, and since the change is not backward compatible, a cluster feature - `LARGE_COLLECTION_DETECTION` - was added to gate its use. The large-cell recording function in `large_data_handler` populates the new column only when this cluster feature is enabled.

In addition, unit tests were added in `sstable_3_x_test` covering large-cell detection by cell size and large-collection detection by number of elements.
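As a sketch of how the new option might be set, the fragment below uses the option name from this series and the stated default of 10000; treat it as illustrative, not a complete configuration file:

```yaml
# scylla.yaml -- illustrative fragment only.
# Collections with more elements than this threshold are warned about
# and recorded in system.large_cells (default per this series: 10000).
compaction_collection_elements_count_warning_threshold: 10000
```

Once the threshold is exceeded, the offending cells should then be discoverable by querying `system.large_cells` and looking at the new `collection_elements` column (0 for non-collection cells).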
Closes #11449
Closes #11674

* github.com:scylladb/scylladb:
  sstables: mx/writer: optimize large data stats members order
  sstables: mx/writer: keep large data stats entry as members
  db: large_data_handler: dynamically update config thresholds
  utils/updateable_value: add transforming_value_updater
  db/large_data_handler: cql_table_large_data_handler: record large_collections
  db/large_data_handler: pass ref to feature_service to cql_table_large_data_handler
  db/large_data_handler: cql_table_large_data_handler: move ctor out of line
  docs: large-rows-large-cells-tables: fix typos
  db/system_keyspace: add collection_elements column to system.large_cells
  gms/feature_service: add large_collection_detection cluster feature
  test: sstable_3_x_test: add test_sstable_too_many_collection_elements
  test: lib: simple_schema: add support for optional collection column
  test: lib: simple_schema: build schema in ctor body
  test: lib: simple_schema: cql: define s1 as static only if built this way
  db/large_data_handler: maybe_record_large_cells: consider collection_elements
  db/large_data_handler: debug cql_table_large_data_handler::delete_large_data_entries
  sstables: mx/writer: pass collection_elements to writer::maybe_record_large_cells
  sstables: mx/writer: add large_data_type::elements_in_collection
  db/large_data_handler: get the collection_elements_count_threshold
  db/config: add compaction_collection_elements_count_warning_threshold
  test: sstable_3_x_test: add test_sstable_write_large_cell
  test: sstable_3_x_test: pass cell_threshold_bytes to large_data_handler
  test: sstable_3_x_test: large_data_handler: prepare callback for testing large_cells
  test: sstable_3_x_test: large_data tests: use BOOST_REQUIRE_[GL]T
  test: sstable_3_x_test: test_sstable_log_too_many_rows: use tests::random