scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 06:05:53 +00:00

Author	SHA1	Message	Date
Avi Kivity	d36601a838	Merge 'Make commitlog respect disk limit better' from Calle " Refs #6148 Separates disk usage into two cases: Allocated and used. Since we use both reserve and recycled segments, both which are not actually filled with anything at the point of waiting. Also refuses to recycle segments or increase reserve size if our current disk footprint exceeds threshold. And finally uses some initial heuristics to determine when we should suggest flushing, based on disk limit, segment size, and current usage. Right now, when we only have a half segment left before hitting used == max. Some initial tests show an improved adherence to limit though it will still be exceeded, because we do _not_ force waiting for segments to become cleared or similar if we need to add data, thus slow flushing can still make usage create extra segments. We will however attempt to shrink disk usage when load is lighter. Somewhat unclear how much this impacts performance with tight limits, and how much this matters. " * elcallio-calle/commitlog_size: commitlog: Make commitlog respect disk limit better commitlog: Demote buffer write log messages to trace	2020-08-11 15:03:32 +03:00
Dejan Mircevski	013893b08d	auth: Drop needless role-manager check The service constructor included a check ensuring that only standard_role_manager can be used with password_authenticator. But after `00f7bc6`, password_authenticator does not depend on any action of standard_role_manager. All queries to meta::roles_table in password_authenticator seem self-contained: the table is created at the start if missing, and salted_hash is CRUDed independently of any other columns bar the primary key role_col_name. NOTE: a nonstandard role manager may not delete a role's row in meta::roles_table when that role is dropped. This will result in successful authentication for that non-existing role. But the clients call check_user_can_login() after such authentication, which in turn calls role_manager::exists(role). Any correctly implemented role manager will then return false, and authentication_exception will be thrown. Therefore, no dependencies exist on the role-manager behaviour, other than it being self-consistent. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-11 14:56:18 +03:00
Avi Kivity	4547949420	Merge "Fix repair stalls in get_sync_boundary and apply_rows_on_master_in_thread" from Asias " This patch set fixes stalls in repair that are caused by std::list merge and clear operations during the test_latency_read_with_nemesis test. Fixes #6940 Fixes #6975 Fixes #6976 " * 'fix_repair_list_stall_merge_clear_v2' of github.com:asias/scylla: repair: Fix stall in apply_rows_on_master_in_thread and apply_rows_on_follower repair: Use clear_gently in get_sync_boundary to avoid stall utils: Add clear_gently repair: Use merge_to_gently to merge two lists utils: Add merge_to_gently	2020-08-11 14:52:23 +03:00
Botond Dénes	db5926134a	sstables: sstable_mutation_reader: read_partition(): include more information in exception Resolve the FIXME to help investigating related issues and include the position of the consumer in the error message. Refs: #6529 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200811111101.1576222-1-bdenes@scylladb.com>	2020-08-11 14:52:04 +03:00
Asias He	c65ad02fcd	repair: Fix stall in apply_rows_on_master_in_thread and apply_rows_on_follower The row_diff list in apply_rows_on_master_in_thread and apply_rows_on_follower can be large. Modify do_apply_rows to remove each row from the list as soon as it is consumed, to avoid a stall when the list is destroyed. Fixes #6975	2020-08-11 19:37:47 +08:00
Asias He	9f4b3a5fa6	repair: Use clear_gently in get_sync_boundary to avoid stall The _row_buf and _working_row_buf lists can be large. Use the clear_gently helper to avoid stalls. Fixes #6940	2020-08-11 19:37:47 +08:00
Asias He	3e8c4a6788	utils: Add clear_gently A helper to clear a list without stalling. Refs #6975 Refs #6940	2020-08-11 19:37:47 +08:00
Calle Wilund	ed86e870ee	docs/cdc.md: Add short explanation of stream ID bit composition Bit layout, sort order and field usage of CDC stream ids.	2020-08-11 14:09:45 +03:00
Avi Kivity	41a75f2b99	Merge "make do_io_check path noexcept" from Benny " Make do_io_check and the io_check functions that call it noexcept. Up to sstable_write_io_check and sstable_touch_directory_io_check. Tests: unit (dev) " * tag 'io-check-noexcept-v1' of github.com:bhalevy/scylla: ssstable: io_check functions: make noexcept utils: do_io_check: adjust indentation utils: io_check: make noexcept for future-returning functions	2020-08-11 13:41:20 +03:00
Calle Wilund	5d044ab74e	commitlog: Make commitlog respect disk limit better Refs #6148 Separates disk usage into two cases: Allocated and used. Since we use both reserve and recycled segments, both which are not actually filled with anything at the point of waiting. Also refuses to recycle segments or increase reserve size if our current disk footprint exceeds threshold. And finally uses some initial heuristics to determine when we should suggest flushing, based on disk limit, segment size, and current usage. Right now, when we only have a half segment left before hitting used == max. Some initial tests show an improved adherence to limit though it will still be exceeded, because we do _not_ force waiting for segments to become cleared or similar if we need to add data, thus slow flushing can still make usage create extra segments. We will however attempt to shrink disk usage when load is lighter. Somewhat unclear how much this impacts performance with tight limits, and how much this matters. v2: * Add some comments/explanations v3: * Made disk footprint subtract happen post delete (non-optimistic)	2020-08-11 10:40:56 +00:00
Avi Kivity	3530e80ce1	Merge "Support md format" from Benny " This series adds support for the "md" sstable format. Support is based on the following: * do not use clustering based filtering in the presence of static row, tombstones. * Disabling min/max column names in the metadata for formats older than "md". * When updating the metadata, reset and disable min/max in the presence of range tombstones (like Cassandra does and until we process them accurately). * Fix the way we maintain min/max column names by: keeping whole clustering key prefixes as min/max rather than calculating min/max independently for each component, like Cassandra does in the "md" format. Fixes #4442 Tests: unit(dev), cql_query_test -t test_clustering_filtering* (debug) md migration_test dtest from git@github.com:bhalevy/scylla-dtest.git migration_test-md-v1 " * tag 'md-format-v4' of github.com:bhalevy/scylla: (27 commits) config: enable_sstables_md_format by default test: cql_query_test: add test_clustering_filtering unit tests table: filter_sstable_for_reader: allow clustering filtering md-format sstables table: create_single_key_sstable_reader: emit partition_start/end for empty filtered results table: filter_sstable_for_reader: adjust to md-format table: filter_sstable_for_reader: include non-scylla sstables with tombstones table: filter_sstable_for_reader: do not filter if static column is requested table: filter_sstable_for_reader: refactor clustering filtering conditional expression features: add MD_SSTABLE_FORMAT cluster feature config: add enable_sstables_md_format database: add set_format_by_config test: sstable_3_x_test: test both mc and md versions test: Add support for the "md" format sstables: mx/writer: use version from sstable for write calls sstables: mx/writer: update_min_max_components for partition tombstone sstables: metadata_collector: support min_max_components for range tombstones sstable: validate_min_max_metadata: drop outdated logic sstables: rename mc folder to mx 
sstables: may_contain_rows: always true for old formats sstables: add may_contain_rows ...	2020-08-11 13:29:11 +03:00
Piotr Jastrzebski	80e3923b3c	codebase wide: replace find(...) != end() with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously the code pattern looked like: <collection>.find(<element>) != <collection>.end() In C++20 the same can be expressed with: <collection>.contains(<element>) This is not only more concise but also expresses the intent of the code more clearly. This commit replaces all the occurrences of the old pattern with the new approach. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>	2020-08-11 13:28:50 +03:00
Avi Kivity	55cf219c97	Merge "sstable: close files on error" from Benny " Make sure to close sstable files also on error paths. Refs #5509 Fixes #6448 Tests: unit (dev) " * tag 'sstable-close-files-on-error-v6' of github.com:bhalevy/scylla: sstable: file_writer: auto-close in destructor sstable: file_writer: add optional filename member sstable: add make_component_file_writer sstable: remove_by_toc_name: accept std::string_view sstable: remove_by_toc_name: always close file and input stream sstable: delete_sstables: delete outdated FIXME comment sstable: remove_by_toc_name: drop error_handler parameter sstable: remove_by_toc_name: make static sstable: read_toc: always close file sstable: mark read_toc and methods calling it noexcept sstable: read_toc: get rid of file_path sstable: open_data, create_data: set member only on success. sstable: open_file: mark as noexcept sstable: new_sstable_component_file: make noexcept sstable: new_sstable_component_file: close file on failure sstable: rename_new_sstable_component_file: do not pass file sstable: open_sstable_component_file_non_checked: mark as noexcept sstable: open_integrity_checked_file_dma: make noexcept sstable: open_integrity_checked_file_dma: close file on failure	2020-08-11 13:28:50 +03:00
Botond Dénes	b11d181413	scylla-gdb.py: restore python2 compatibility Although python2 should be a distant memory by now, the reality is that we still need to debug scylla on platforms that still have no python3 available (centos7), so we need to keep scylla-gdb.py python2 compatible. Refs: #7014 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Reviewed-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200811093753.1567689-1-bdenes@scylladb.com>	2020-08-11 12:55:42 +03:00
Nadav Har'El	796ad24f37	docs: correct typo in maintainers.md maintainers.md contains a very helpful explanation of how to backport Seastar fixes to old branches of Scylla, but has a tiny typo, which this patch corrects. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200811095350.77146-1-nyh@scylladb.com>	2020-08-11 12:54:41 +03:00
Takuya ASADA	6fbbe836c1	scylla_raid_setup: use mdadm.service on older Debian variants Older Debian variants do not have mdmonitor.service, so we should use mdadm.service instead. Fixes #7000	2020-08-11 12:52:24 +03:00
Calle Wilund	a6ad70d3da	cdc:stream_id: Encode format version + vnode grouping/index in id Fixes #6948 Changes the stream_id format from <token:64>:<rand:64> to <token:64>:<rand:38><index:22><version:4> The code will attempt to assert version match when presented with a stored id (i.e. construct from bytes). This means that IDs created by previous (experimental) versions will break. Moves the ID encoding fully into the ID class, and makes the code path private for the topology generation code path. Removes some superfluous accessors but adds accessors for token, version and index. (For alternator etc).	2020-08-11 12:48:04 +03:00
Calle Wilund	9167d1ac76	commitlog: Demote buffer write log messages to trace Because they become very plentiful and annoying when one tries to analyze segment behaviour. More so in batch mode.	2020-08-11 09:18:23 +00:00
Asias He	53fee789f0	repair: Use merge_to_gently to merge two lists During a performance test, test_latency_read_with_nemesis during manager repair, it experienced a stall of 73 ms: ``` (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >::operator=(repair_row const&) at /usr/include/c++/9/bits/stl_iterator.h:515 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::__copy_move<false, false, std::bidirectional_iterator_tag>::__copy_m<std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > >(std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >) at /usr/include/c++/9/bits/stl_algobase.h:312 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::__copy_move_a<false, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > >(std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >) at /usr/include/c++/9/bits/stl_algobase.h:404 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::__copy_move_a2<false, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > >(std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >) at /usr/include/c++/9/bits/stl_algobase.h:440 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::copy<std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > 
>(std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >) at /usr/include/c++/9/bits/stl_algobase.h:474 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::__merge<std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >, __gnu_cxx::__ops::_Iter_comp_iter<repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}> >(std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >, std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, __gnu_cxx::__ops::_Iter_comp_iter<repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}>, __gnu_cxx::__ops::_Iter_comp_iter<repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}>) at /usr/include/c++/9/bits/stl_algo.h:4923 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::merge<std::_List_iterator<repair_row>, 
std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >, repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}>(std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >, std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}, repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}) at /usr/include/c++/9/bits/stl_algo.h:5018 (inlined by) repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int) at ./repair/row_level.cc:1242 repair_meta::get_row_diff_source_op(seastar::bool_class<update_peer_row_hash_sets_tag>, gms::inet_address, unsigned int, seastar::rpc::sink<repair_hash_with_cmd>&, seastar::rpc::source<repair_row_on_wire_with_cmd>&) at ./repair/row_level.cc:1608 
repair_meta::get_row_diff_with_rpc_stream(std::unordered_set<repair_hash, std::hash<repair_hash>, std::equal_to<repair_hash>, std::allocator<repair_hash> >, seastar::bool_class<needs_all_rows_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, gms::inet_address, unsigned int) at ./repair/row_level.cc:1674 row_level_repair::get_missing_rows_from_follower_nodes(repair_meta&) at ./repair/row_level.cc:2413 ``` The problem was that when std::merge() ran out of one range, it copied the second range. To fix, use the new merge_to_gently helper. Fixes #6976	2020-08-11 10:37:34 +08:00
Asias He	0bf0019eeb	utils: Add merge_to_gently This helper is similar to std::merge but it runs inside a thread and does not stall. Refs #6976	2020-08-11 10:37:34 +08:00
Benny Halevy	e2340d0684	config: enable_sstables_md_format by default Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 19:19:32 +03:00
Benny Halevy	0d85ceaf37	test: cql_query_test: add test_clustering_filtering unit tests Add unit tests reproducing https://github.com/scylladb/scylla/issues/3552 with clustering-key filtering enabled. enable_sstables_md_format option is set to true as clustering-key filtering is enabled only for md-format sstables. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 19:19:32 +03:00
Benny Halevy	7cfca519cb	table: filter_sstable_for_reader: allow clustering filtering md-format sstables Now that it is safe to filter md format sstable by min/max column names we can remove the `filtering_broken` variable that disabled filtering in `19b76bf75b` to fix #4442. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 19:19:32 +03:00
Benny Halevy	ab67629ea6	table: create_single_key_sstable_reader: emit partition_start/end for empty filtered results To prevent https://github.com/scylladb/scylla/issues/3552 we want to ensure that whenever the partition exists in any sstable, we emit partition_start/end, even when returning no rows. In the first filtering pass, filter_sstable_for_reader_by_pk filters the input sstables based on the partition key, and num_sstables is set to the size of the sstables list after the first filtering pass. An empty sstables list at this stage means there are indeed no sstables with the required partition so returning an empty result will leave the cache in the desired state. Otherwise, we filter again, using filter_sstable_for_reader_by_ck, and examine the list of the remaining readers. If num_readers != num_sstables, we know that some sstables were filtered by clustering key, so we append a flat_mutation_reader_from_mutations to the list of readers and return a combined reader as before. This will ensure that we will always have partition_start/end mutations for the queried partition, even if the filtered readers emit no rows. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 19:19:32 +03:00
Benny Halevy	a672747da3	table: filter_sstable_for_reader: adjust to md-format With the md sstable format, min/max column names in the metadata now track clustering rows (with or without row tombstones), range tombstones, and partition tombstones (that are reflected with empty min/max column names - indicating the full range). As such, min and max column names may be of different lengths due to range tombstones and potentially short clustering key prefixes with compact storage, so the current matching algorithm must be changed to take this into account. To determine if a slice range overlaps the min/max range we are using position_range::overlaps. sstable::clustering_components_ranges was renamed to position_range as it now holds a single position_range rather than a vector of bytes_view ranges. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 19:19:30 +03:00
Benny Halevy	90d0fea7df	table: filter_sstable_for_reader: include non-scylla sstables with tombstones Move contains_rows from table code to sstable::may_contain_rows since its implementation now has too specific knowledge of sstable internals. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	2a57ec8c3d	table: filter_sstable_for_reader: do not filter if static column is requested Static rows aren't reflected in the sstable min/max clustering keys metadata. Since we don't have any indication in the metadata that the sstable stores static rows, we must read all sstables if a static column is requested. Refs #3553 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	2fed3f472c	table: filter_sstable_for_reader: refactor clustering filtering conditional expression We're about to drop `filtering_broken` in a future patch when clustering filtering can be supported for md-format sstables. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	e8d7744040	features: add MD_SSTABLE_FORMAT cluster feature Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	65239a6e50	config: add enable_sstables_md_format MD format is disabled by default at this point. The option extends enable_sstables_mc_format so that both are needed to be set for supporting the md format. The MD_FORMAT cluster feature will be added in a following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	8e0e2c8a48	database: add set_format_by_config This is required for test applications that may select a sstable format different than the default mc format, like perf_fast_forward. These apps don't use the gossip-based sstables_format_selector to set the format based on the cluster feature and so they need to rely on the db config. Call set_format_by_config in single_node_cql_env::do_with. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	d77ceba498	test: sstable_3_x_test: test both mc and md versions Run the test cases that write sstables using both the mc and md versions. Note that we can still compare the resulting Data, Index, Digest, and Filter components with the prepared mc sstables we have since these haven't changed in md. We take special consideration around validating min/max column names that are now calculated using a revised algorithm in the md format. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Pekka Enberg	3168be3483	test: Add support for the "md" format Test also the md format in all_sstable_versions. Add pre-computed md-sstable files generated using Cassandra version 3.11.7 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	e44ec45ab9	sstables: mx/writer: use version from sstable for write calls Rather than using a constant sstable_version_types::mc. In preparation to supporting sstable_version_types::md. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	bd4383a842	sstables: mx/writer: update_min_max_components for partition tombstone Partition tombstones represent an implicit clustering range that is unbound on both sides, so reflect that in min/max column names metadata using empty clustering key prefixes. If we don't do that, when using the sstable for filtering, we have no other way of distinguishing range tombstones from partition tombstones given the sstable metadata and we would need to include any sstable with tombstones, even if those are range tombstones, for which we can do a better filtering job, using the sstable min/max column names metadata. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	68acae5873	sstables: metadata_collector: support min_max_components for range tombstones We essentially treat min/max column names as range bounds with min as incl_start and max as incl_end. By generating a bound_view for min/max column names on the fly, we can correctly track and compare also short clustering key prefixes that may be used as bounds for range tombstones. Extend the sstable_tombstone_metadata_check unit test to cover these cases. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	34fb95dacf	sstable: validate_min_max_metadata: drop outdated logic The following checks were introduced in `0a5af61176` to deal with a bug in min max metadata generation of our own, from a time when only ka / la were supported. This is no longer relevant now that we'll consider min_max_column_names only for sstable format > mc (in sstable::may_contain_rows). We choose not to clear_incorrect_min_max_column_names from older versions here as this disturbs sstable unit tests. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	12393c5ec2	sstables: rename mc folder to mx Prepare for supporting the md format. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	7139fb92e6	sstables: may_contain_rows: always true for old formats The min/max column names metadata can be trusted only starting with the md format, so just always return `true` for older sstable formats. Note that we could achieve that by clearing the min/max metadata in set_clustering_components_ranges but we choose not to do so since it disturbs sstable unit tests. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	200d8d41d9	sstables: add may_contain_rows Move the logic from table to sstable as it will contain intimate knowledge of the sstable min/max column names validity for md format. Also, get rid of the sstable::clustering_components_ranges() method as the member is used only internally by the sstable code now. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Pekka Enberg	a37eaaa022	sstables: Add support for the "md" format enum value Add the sstable_version_types::md enum value and logically extend sstable_version_types comparisons to cover also the > sstable_version_types::mc cases. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	7de004d42a	sstables: version: delete unused is_latest_supported predicate Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	025b74e20e	sstables: metadata_collector: use empty key to represent full min/max range Instead of keeping the `_has_min_max_clustering_keys` flag, just store an empty key for `_{min,max}_clustering_key` to represent the full range. These will never be narrowed down and will be encoded as empty min/max column names as if they weren't set. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	9f114d821a	sstables: keep whole clustering_key_prefix as min/max_column_names Currently we compare each min/max component independently. This may lead to suboptimal, inclusive clustering ranges that do not indicate any actual key we encountered. For example: ['a', 2], ['b', 1] will lead to min=['a', 1], max=['b', 2] instead of the keys themselves. This change keeps the min or max keys as a whole. It considers shorter clustering prefixes (that are possible with compact storage) as range tombstone bounds, so that a shorter key is considered less than the minimum if the latter has a common prefix, and greater than the maximum if the latter has a common prefix. Extend the min_max_clustering_key_test to test for this case. Previously {"a", "2"}, {"b", "1"} clustering keys would erroneously end up with min={"a", "1"} max={"b", "2"} while we want them to be min={"a", "2"} max={"b", "1"}. Adjust sstable_3_x_test to ignore original mc sstables that were previously computed with different min/max column names. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:03 +03:00
Benny Halevy	707b098f44	sstables: metadata_collector: construct with schema Pass the sstable schema to the metadata_collector constructor. Note that the long term plan is to move metadata_collector to the sstable writer but this requires a bigger change to get rid of the dependencies on it in the legacy writer code in class sstable methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:52:43 +03:00
Benny Halevy	c9cade833c	sstables: metadata_collector: make only for write path Make a metadata_collector only when writing the sstable; there is no need to make one when reading. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:51:12 +03:00
Rafael Ávila de Espíndola	74db08165d	tests: Convert to using memory::with_allocation_failures Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200805155143.122396-1-espindola@scylladb.com>	2020-08-10 18:37:42 +03:00
Piotr Jastrzebski	52ec0c683e	codebase wide: replace erase + remove_if with erase_if C++20 introduced std::erase_if which simplifies removal of elements from the collection. Previously the code pattern looked like: <collection>.erase( std::remove_if(<collection>.begin(), <collection>.end(), <predicate>), <collection>.end()); In C++20 the same can be expressed with: std::erase_if(<collection>, <predicate>); This commit replaces all the occurrences of the old pattern with the new approach. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <6ffcace5cce79793ca6bd65c61dc86e6297233fd.1597064990.git.piotr@scylladb.com>	2020-08-10 18:17:38 +03:00
Calle Wilund	9620755c7f	database: Do not assert on replay positions if truncate does not flush Fixes #6995 In `c2c6c71` the assert on replay positions in flushed sstables discarded by truncate was broken, by the fact that we no longer flush all sstables unless auto snapshot is enabled. This means the low_mark assertion does not hold, because we maybe/probably never got around to creating the sstables that would hold said mark. Note that the (old) change to not create sstables and then just delete them is in itself good. But in that case we should not try to verify the rp mark.	2020-08-10 18:17:38 +03:00
Avi Kivity	f9aea94c5c	Merge 'add out of box configs for GCP VMs with nvmes' from Lubos " not recommended setups will still run iotune fixes #6631 " * tarzanek-gcp-iosetup: scylla_io_setup: Supported GCP VMs with NVMEs get out of box I/O configs scylla_util.py: add support for gcp instances scylla_util.py: support http headers in curl function scylla_io_setup: refactor iotune run to a function	2020-08-10 18:17:38 +03:00

23141 Commits
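
Commit 9f114d821a above contrasts per-component min/max tracking with keeping whole clustering keys. The sketch below reproduces that example under simplifying assumptions: a key is modelled as a plain list of strings compared lexicographically, all keys have the same length, and the shorter-prefix rules for range tombstone bounds are not modelled. The type alias and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a clustering key (assumption: real keys are typed
// and compared using the table schema, not as raw strings).
using key = std::vector<std::string>;
using min_max = std::pair<key, key>;

// Old behaviour described in the commit: min and max are tracked
// independently for every component, so the result may be a key that was
// never actually seen.
min_max per_component_min_max(const std::vector<key>& keys) {
    assert(!keys.empty());
    key lo = keys.front();
    key hi = keys.front();
    for (const key& k : keys) {
        assert(k.size() == lo.size());  // equal-length keys only in this sketch
        for (std::size_t i = 0; i < k.size(); ++i) {
            lo[i] = std::min(lo[i], k[i]);
            hi[i] = std::max(hi[i], k[i]);
        }
    }
    return {lo, hi};
}

// New behaviour described in the commit: whole keys are compared, so min and
// max are always keys that were actually encountered.
min_max whole_key_min_max(const std::vector<key>& keys) {
    assert(!keys.empty());
    // std::vector's operator< compares lexicographically.
    auto [lo, hi] = std::minmax_element(keys.begin(), keys.end());
    return {*lo, *hi};
}
```

For the keys {"a", "2"} and {"b", "1"}, the per-component version yields min={"a", "1"} and max={"b", "2"}, while the whole-key version yields min={"a", "2"} and max={"b", "1"}, matching the commit message.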