scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 20:05:10 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	c9e1694c58	Merge "Some optimizations on cache entry lookup" from Pavel Emelyanov The set contains 3 small optimizations: - avoid copying of partition key on lookup path - reduce number of args carried around when creating a new entry - save one partition key comparison on reader creation Plus related satellite cleanups. * https://github.com/xemul/scylla/tree/br-row-cache-less-copies: row_cache: Revive do_find_or_create_entry concepts populating reader: Do not copy decorated key too early populating reader: Less allocator switching on population populating reader: Fix indentation after previous patch row_cache: Move missing entry creation into helper test: Lookup an existing entry with its own helper row_cache: Do not copy partition tombstone when creating cache entry row_cache: Kill incomplete_tag row_cache: Save one key compare on direct hit	2020-09-15 17:49:47 +02:00
Avi Kivity	0e03c979d2	Merge 'Fix ignoring cells after null in appending hash' from Piotr Sarna " This series fixes a bug in `appending_hash<row>` that caused it to ignore any cells after the first NULL. It also adds a cluster feature which starts using the new hashing only after the whole cluster is aware of it. The series comes with tests, which reproduce the issue. Fixes #4567 Based on #4574 " * psarna-fix_ignoring_cells_after_null_in_appending_hash: test: extend mutation_test for NULL values tests/mutation: add reproducer for #4567 gms: add a cluster feature for fixed hashing digest: add null values to row digest mutation_partition: fix formatting appending_hash<row>: make publicly visible	2020-09-10 15:35:38 +03:00
Piotr Sarna	fe5cd846b5	test: extend mutation_test for NULL values The test is extended for another possible corner case: [1, NULL, 2] vs [1, 2, NULL] should have different digests. Also, a check for legacy behavior is added.	2020-09-10 13:16:44 +02:00
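A minimal sketch of the digest problem described in the two entries above, under stated assumptions. It is not Scylla's `appending_hash<row>`: the type names, the `fed_legacy`/`fed_fixed` helpers and the 0/1 presence marker are all hypothetical. Instead of computing a real hash, each function returns the token stream that would be fed to a hasher, so that equal streams necessarily mean equal digests.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-ins: a cell is either NULL (nullopt) or an int value.
using sketch_cell = std::optional<int>;
using sketch_row = std::vector<sketch_cell>;
using token_stream = std::vector<int>;

// Legacy behaviour as described in the commit message:
// everything after the first NULL cell is ignored.
token_stream fed_legacy(const sketch_row& r) {
    token_stream out;
    for (const auto& c : r) {
        if (!c) {
            break;  // the bug: stop feeding at the first NULL
        }
        out.push_back(*c);
    }
    return out;
}

// Fixed behaviour (assumed encoding): every cell contributes a presence
// marker, so NULLs are part of the digest and later cells are not dropped.
token_stream fed_fixed(const sketch_row& r) {
    token_stream out;
    for (const auto& c : r) {
        out.push_back(c ? 1 : 0);  // 1 = value present, 0 = NULL
        if (c) {
            out.push_back(*c);
        }
    }
    return out;
}
```

With this sketch, `[1, NULL, 2]` and `[1, NULL, 3]` feed identical streams under the legacy scheme, while the fixed scheme distinguishes them and also distinguishes `[1, NULL, 2]` from `[1, 2, NULL]`.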
Paweł Dziepak	287d0371fa	tests/mutation: add reproducer for #4567	2020-09-10 13:16:44 +02:00
Dejan Mircevski	9d02f10c71	cql3: Fix NULL reference in get_column_defs_for_filtering There was a typo in get_column_defs_for_filtering(): it checked the wrong pointer before dereferencing. Add a test exposing the NULL dereference and fix the typo. Tests: unit (dev) Fixes #7198. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-09-10 08:45:07 +02:00
Avi Kivity	7ac59dcc98	lsa: decay reserves The log-structured allocator (LSA) reserves memory when performing operations, since its operations are performed with reclaiming disabled and if it runs out, it cannot evict cache to gain more. The amount of memory to reserve is remembered across calls so that it does not have to repeat the fail/increase-reserve/retry cycle for every operation. However, we currently lack decaying the amount to reserve. This means that if a single operation increased the reserve in the distant past, all current operations also require this large reserve. Large reserves are expensive since they can cause large amounts of cache to be evicted. This patch adds reserve decay. The time-to-decay is inversely proportional to reserve size: 10GB/reserve. This means that a 20MB reserve is halved after 500 operations (10GB/20MB) while a 20kB reserve is halved after 500,000 operations (10GB/20kB). So large, expensive reserves are decayed quickly while small, inexpensive reserves are decayed slowly to reduce the risk of allocation failures and exceptions. A unit test is added. Fixes #325.	2020-09-08 15:59:25 +03:00
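The arithmetic in the reserve-decay entry above can be checked with a short sketch. It assumes decimal units (10GB = 10^10 bytes, 20MB = 2×10^7 bytes); the constant and function names are hypothetical and are not taken from the LSA code.

```cpp
#include <cassert>
#include <cstdint>

// Assumed decay budget: the "10GB" from the commit message, in decimal units.
constexpr std::uint64_t decay_budget_bytes = 10'000'000'000ULL;

// Number of operations after which a reserve of the given size is halved,
// following the stated rule: time-to-decay = 10GB / reserve.
inline std::uint64_t operations_to_halve(std::uint64_t reserve_bytes) {
    assert(reserve_bytes > 0);
    return decay_budget_bytes / reserve_bytes;
}
```

For example, `operations_to_halve(20'000'000)` gives 500 and `operations_to_halve(20'000)` gives 500,000, matching the figures quoted in the message.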
Piotr Grabowski	ffd8c8c505	utf8: Print invalid UTF-8 character position Add new validate_with_error_position function which returns -1 if data is a valid UTF-8 string or otherwise a byte position of first invalid character. The position is added to exception messages of all UTF-8 parsing errors in Scylla. validate_with_error_position is done in two passes in order to preserve the same performance in common case when the string is valid.	2020-09-07 18:11:21 +03:00
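The two-pass idea in the entry above can be sketched as follows. This is an illustration only, not Scylla's implementation: the function names are hypothetical, and both passes share one scalar decoder for brevity, whereas the point of the split is that the first pass can be tuned purely for speed on the common, valid case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Length (1-4) of the well-formed UTF-8 sequence starting at s[i],
// or 0 if the bytes at that position are malformed.
inline std::size_t utf8_sequence_length(std::string_view s, std::size_t i) {
    assert(i < s.size());
    auto byte = [&](std::size_t k) { return static_cast<std::uint8_t>(s[k]); };
    auto is_cont = [&](std::size_t k) {
        return k < s.size() && (byte(k) & 0xC0) == 0x80;
    };
    const std::uint8_t b0 = byte(i);
    if (b0 < 0x80) {
        return 1;
    }
    if (b0 >= 0xC2 && b0 <= 0xDF) {
        return is_cont(i + 1) ? 2 : 0;
    }
    if (b0 >= 0xE0 && b0 <= 0xEF) {
        if (!is_cont(i + 1) || !is_cont(i + 2)) return 0;
        const std::uint8_t b1 = byte(i + 1);
        if (b0 == 0xE0 && b1 < 0xA0) return 0;  // overlong encoding
        if (b0 == 0xED && b1 > 0x9F) return 0;  // UTF-16 surrogate range
        return 3;
    }
    if (b0 >= 0xF0 && b0 <= 0xF4) {
        if (!is_cont(i + 1) || !is_cont(i + 2) || !is_cont(i + 3)) return 0;
        const std::uint8_t b1 = byte(i + 1);
        if (b0 == 0xF0 && b1 < 0x90) return 0;  // overlong encoding
        if (b0 == 0xF4 && b1 > 0x8F) return 0;  // above U+10FFFF
        return 4;
    }
    return 0;  // stray continuation byte or invalid lead byte
}

// Pass 1: plain yes/no answer, the only work done for valid input.
inline bool is_valid_utf8(std::string_view s) {
    for (std::size_t i = 0; i < s.size();) {
        const std::size_t len = utf8_sequence_length(s, i);
        if (len == 0) return false;
        i += len;
    }
    return true;
}

// Pass 2: only reached for invalid input; finds the first bad byte offset.
inline std::ptrdiff_t first_invalid_position(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const std::size_t len = utf8_sequence_length(s, i);
        if (len == 0) break;
        i += len;
    }
    return static_cast<std::ptrdiff_t>(i);
}

// Returns -1 for valid UTF-8, otherwise the byte position of the first
// invalid character (the contract described in the commit message).
inline std::ptrdiff_t validate_with_error_position_sketch(std::string_view s) {
    if (is_valid_utf8(s)) return -1;
    return first_invalid_position(s);
}
```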
Botond Dénes	c01af1d9d2	tests/boost/multishard_mutation_query_test: remove last BOOST_REQUIRE* macros Previous patches removed those `BOOST_REQUIRE` macros that could be invoked from shards other than 0. The reason is that said macros are not thread-safe, so calling them from multiple shards produces mangled output to stdout as well as the XML report file. It was assumed that only these invocations -- from a non-0 shard -- are problematic, but it turns out even these can race with seastar log messages emitted from other shards. This patch removes all such macros, replacing them with the thread safe `require` functions from `test/lib/test_utils.hh`. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200907125309.1199104-1-bdenes@scylladb.com>	2020-09-07 17:07:26 +03:00
Pavel Emelyanov	84a6d439ad	test: Lookup an existing entry with its own helper The only caller of find_or_create() in tests works on an already existing (.populate()-d) entry, so patch this place for explicitness and for the sake of the next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-09-03 21:13:21 +03:00
Avi Kivity	64c7c81bac	Merge "Update log messages to {fmt} rules" from Pavel E " Before seastar is updated with the {fmt} engine under the logging hood, some changes are to be made in scylla to conform to {fmt} standards. Compilation and tests checked against both -- old (current) and new seastar-s. tests: unit(dev), manual " * 'br-logging-update' of https://github.com/xemul/scylla: code: Force formatting of pointer in .debug and .trace code: Format { and } as {fmt} needs streaming: Do not reveal raw pointer in info message mp_row_consumer: Provide hex-formatting wrapper for bytes_view heat_load_balance: Include fmt/ranges.h	2020-09-03 15:10:09 +03:00
Kamil Braun	ff78a3c332	cdc: rename CDC description tables... again Commit `a6ad70d3da` changed the format of stream IDs: the lower 8 bytes were previously generated randomly, now some of them have semantics. In particular, the least significant byte contains a version (stream IDs might evolve with further releases). This is a backward-incompatible change: the code won't properly handle stream IDs with all lower 8 bytes generated randomly. To protect us from subtle bugs, the code has an assertion that checks the stream ID's version. This means that if an experimental user used CDC before the change and then upgraded, they might hit the assertion when a node attempts to retrieve a CDC generation with old stream IDs from the CDC description tables and then decode it. In effect, the user won't even be able to start a node. Similarly as with the case described in `d89b7a0548`, the simplest fix is to rename the tables. This fix must get merged in before CDC goes out of experimental. Now, if the user upgrades their cluster from a pre-rename version, the node will simply complain that it can't obtain the CDC generation instead of preventing the cluster from working. The user will be able to use CDC after running checkAndRepairCDCStreams. Since a new table is added to the system_distributed keyspace, the cluster's schema has changed, so sstables and digests need to be regenerated for schema_digest_test.	2020-08-31 11:33:14 +03:00
Rafael Ávila de Espíndola	d18af34205	everywhere: Use future::get0 when appropriate This works with current seastar and clears most of the way for updating to a version that doesn't use std::tuple in futures. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200826231947.1145890-1-espindola@scylladb.com>	2020-08-27 15:05:51 +03:00
Nadav Har'El	95afadfe21	merge: alternator_streams: Include keys in OldImage/NewImage Merged pull request https://github.com/scylladb/scylla/pull/7063 By Calle Wilund: Fixes #6935 DynamoDB streams for some reason duplicate the record keys into both the "Keys" and "OldImage"/"NewImage" sub-objects when doing GetRecords. But only if there is other data to include. This patch appends the pk/ck parts into old/new image iff we had any record data. Updated to handle keys-only updates, and distinguish creating vs. updating rows. Changes cdc to not generate preimage for non-existent/deleted rows, and also fixes missing operations/ttls in keys-only delta mode. alternator_streams: Include keys in OldImage/NewImage cdc: Do not generate pre/post image for non-existent rows	2020-08-27 11:23:35 +03:00
Calle Wilund	e50911e5b0	cdc: Do not generate pre/post image for non-existent rows Fixes #7119 Fixes #7120 If preimage select came up empty - i.e. the row did not exist, either because it was never created or because it was deleted, we should not bother creating a log preimage row for it. Esp. since it makes it harder to interpret the cdc log. If an operation in a cdc batch did a row delete (ranged, ck, etc), do not generate postimage data, since the row no longer exists. Note that we differentiate deleting all (non-pk/ck) columns from actual row delete.	2020-08-26 18:14:09 +00:00
Pavel Emelyanov	812eed27fe	code: Force formatting of pointer in .debug and .trace ... and tests. Printing a pointer in logs is considered to be a bad practice, so the proposal is to keep this explicit (with fmt::ptr) and allow it for .debug and .trace cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Pavel Emelyanov	366b4e8a8f	code: Format { and } as {fmt} needs There are two places that want to print "{<text>}" strings, but do not format the curly braces the {fmt}-way. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Avi Kivity	3daa49f098	Merge "materialized views: Fix undefined behavior on base table schema changes" from Tomasz " The view_info object, which is attached to the schema object of the view, contains a data structure called "base_non_pk_columns_in_view_pk". This data structure contains column ids of the base table so is valid only for a particular version of the base table schema. This data structure is used by materialized view code to interpret mutations of the base table, those coming from base table writes, or reads of the base table done as part of view updates or view building. The base table schema version of that data structure must match the schema version of the mutation fragments, otherwise we hit undefined behavior. This may include aborts, exceptions, segfaults, or data corruption (e.g. writes landing in the wrong column in the view). Before this patch, we could get schema version mismatch here after the base table was altered. That's because the view schema did not change when the base table was altered. Another problem was that view building was using the current table's schema to interpret the fragments and invoke view building. That's incorrect for two reasons. First, fragments generated by a reader must be accessed only using the reader's schema. Second, base_non_pk_columns_in_view_pk of the recorded view ptrs may no longer match the current base table schema, which is used to generate the view updates. Part of the fix is to extract base_non_pk_columns_in_view_pk into a third entity called base_dependent_view_info, which changes both on base table schema changes and view schema changes. It is managed by a shared pointer so that we can take immutable snapshots of it, just like with schema_ptr. When starting the view update, the base table schema_ptr and the corresponding base_dependent_view_info have to match. So we must obtain them atomically, and base_dependent_view_info cannot change during update. 
Also, whenever the base table schema changes, we must update base_dependent_view_infos of all attached views (atomically) so that it matches the base table schema. Fixes #7061. Tests: - unit (dev) - [v1] manual (reproduced using scylla binary and cqlsh) " * tag 'mv-schema-mismatch-fix-v2' of github.com:tgrabiec/scylla: db: view: Refactor view_info::initialize_base_dependent_fields() tests: mv: Test dropping columns from base table db: view: Fix incorrect schema access during view building after base table schema changes schema: Call on_internal_error() when out of range id is passed to column_at() db: views: Fix undefined behavior on base table schema changes db: views: Introduce has_base_non_pk_columns_in_view_pk()	2020-08-26 17:37:52 +03:00
Raphael S. Carvalho	1c29f0a43d	cql3/statements: verify that counter column cannot be added into non-counter table A check, to validate that counter column cannot be added into non-counter table, is missing for alter table statement. Validation is performed when building new schema, but it's limited to checking that a schema will not contain both counter and non-counter columns. Due to lack of validation, the added counter column could be incorrectly persisted to the schema, but this results in a crash when setting the new schema to its table. On restart, it can be confirmed that the schema change was indeed persisted when describing the table. This problem is fixed by doing proper validation for the alter table statement, which consists of making sure a new counter column cannot be added to a non-counter table. The test cdc_disallow_cdc_for_counters_test is adjusted because one of its tests was built on the assumption that counter column can be added into a non-counter table. Fixes #7065. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200824155709.34743-1-raphaelsc@scylladb.com>	2020-08-25 10:41:54 +03:00
Pavel Emelyanov	a6e6856e1f	compaction: Keep database reference on cleanup options The database is available at both places that create the options -- tests and API perform_cleanup call. Options object doesn't over-survive the returned future, so it's safe to keep the reference on it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-21 14:58:40 +03:00
Tomasz Grabiec	c44455d514	Merge "Miscellaneous schema code cleanups" from Rafael	2020-08-20 15:19:42 +02:00
Tomasz Grabiec	617ccc5408	tests: mv: Test dropping columns from base table Reproduces #7061.	2020-08-20 14:53:07 +02:00
Avi Kivity	392e24d199	Merge "Unglobal messaging service" from Pavel E " The messaging service is (as many other services) present in the global namespace and is widely accessed from where needed with global get(_local)?_messaging_service() calls. There's a long-term task to get rid of this globality and make services and components reference each other and, for and due to this, start and stop in a specific order. This set does this for the messaging service. The service is very low level and doesn't depend on anything. It's used by gossiper, streaming, repair, migration manager, storage proxy, storage service and API. According to these dependencies the set consists of several parts: patches 1-9 are preparatory, they encapsulate messaging service init/fini stuff in its own module and decouple it from the db::config patch 10-12 introduce local service reference in main and set its init/fini calls at the early stage so that this reference can later be passed to those depending on it patches 13-42 replace global referencing of messaging service from other subsystems with local references initialized from main. patch 43 finalizes tests. patch 44 wraps things up with removing global messaging service instance along with get(_local)?_messaging_service calls. The service's stopping part is deliberately left incomplete (as it is now), the sharded service remains alive, only the instance's stop() method is called (and is empty for a while). Since the messaging service's users still do not stop cleanly, its instances should better continue leaking on exit. Once (if) the seastar gets the helper rpc::has_handlers() method merged the messaging_service::stop() will be able to check if all the verbs had been unregistered (spoiler: not yet, more fixes to come). For debugging purposes the pointer on now-local messaging service instance is kept in service::debug namespace. 
tests: unit(dev) dtest(dev: simple_boot_shutdown, repair, update_cluster_layout) manual start-stop " * 'br-unglobal-messaging-service-2' of https://github.com/xemul/scylla: (44 commits) messaging_service: Unglobal messaging service instance tests: Use own instances of messaging_service storage_service: Use local messaging reference storage_service: Keep reference on sharded messaging service migration_manager: Add messaging service as argument to get_schema_definition migration_manager: Use local messaging reference in simple cases migration_manager: Keep reference on messaging migration_manager: Make push_schema_mutation private non-static method migration_manager: Move get_schema_version verb handling from proxy repair: Stop using global messaging_service references repair: Keep sharded messaging service reference on repair_meta repair: Keep sharded messaging service reference on repair_info repair: Keep reference on messaging in row-level code repair: Keep sharded messaging service in API repair: Unset API endpoints on stop repair: Setup API endpoints in separate helper repair: Push the sharded<messaging_service> reference down to sync_data_using_repair repair: Use existing sharded db reference repair: Mark repair.cc local functions as static streaming: Keep messaging service on send_info ...	2020-08-20 12:20:36 +03:00
Rafael Ávila de Espíndola	6363716799	schema: Pass an rvalue to set_compaction_strategy_options This produces less code and makes sure every caller moves the value. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:02:35 -07:00
Pavel Emelyanov	ee41645a1a	tests: Use own instances of messaging_service The global one is going away, no core code uses it, so all tests can be safely switched to use their own instances. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-19 20:50:53 +03:00
Pavel Emelyanov	4ea3c2797c	storage_service: Keep reference on sharded messaging service It is a bit of a step backward in the storage-service decomposition campaign, but... Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-19 20:50:53 +03:00
Pavel Emelyanov	65bd54604d	gossiper: Use messaging service by reference Gossiper needs messaging service, the messaging is started before the gossiper, so we can push the former reference into it. Gossiper is not stopped for real, neither the messaging service is, so the memory usage is still safe. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-19 20:50:52 +03:00
Avi Kivity	6f986df458	Merge "Fix TWCS compaction aggressiveness due to data segregation" from Raphael " After the data segregation feature, anything that causes out-of-order writes, like read repair, can result in small updates to past time windows. This causes compaction to be very aggressive because whenever a past time window is updated like that, that time window is recompacted into a single SSTable. Users expect that once a window is closed, it will no longer be written to, but that has changed since the introduction of the data segregation feature. We didn't anticipate the write amplification issues that the feature would cause. To fix this problem, let's perform size-tiered compaction on the windows that are no longer active and were updated because data was segregated. The current behavior where the last active window is merged into one file is kept. But thereafter, that same window will only be compacted using STCS. Fixes #6928. " * 'fix_twcs_agressiveness_after_data_segregation_v2' of github.com:raphaelsc/scylla: compaction/twcs: improve further debug messages compaction/twcs: Improve debug log which shows all windows test: Check that TWCS properly performs size-tiered compaction on past windows compaction/twcs: Make task estimation take into account the size-tiered behavior compaction/stcs: Export static function that estimates pending tasks compaction/stcs: Make get_buckets() static compact/twcs: Perform size-tiered compaction on past time windows compaction/twcs: Make strategy easier to extend by removing duplicated knowledge compaction/twcs: Make newest_bucket() non-static compaction/twcs: Move TWCS implementation into source file	2020-08-19 17:19:01 +03:00
Pavel Emelyanov	dc0918e255	tests: Keep local reference on global messaging Some tests directly reference the global messaging service. For the sake of simpler patching wrap this global reference with a local one. Once the global messaging service goes away tests will get their own instances. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-19 13:08:12 +03:00
Raphael S. Carvalho	3be1420083	test: Check that TWCS properly performs size-tiered compaction on past windows Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-18 15:14:09 -03:00
Dejan Mircevski	fb6c011b52	everywhere: Insert space after `switch` Quoth @avikivity: "switch is not a function, and we celebrate that by putting a space after it like other control-flow keywords." https://github.com/scylladb/scylla/pull/7052#discussion_r471932710 Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-18 14:31:04 +03:00
Botond Dénes	22a6493716	view_update_generator: fix race between registering and processing sstables `fea83f6` introduced a race between processing (and hence removing) sstables from `_sstables_with_tables` and registering new ones. This manifested in sstables that were added concurrently with processing a batch for the same sstables being dropped and the semaphore units associated with them not returned. This resulted in repairs being blocked indefinitely as the units of the semaphore were effectively leaked. This patch fixes this by moving the contents of `_sstables_with_tables` to a local variable before starting the processing. A unit test reproducing the problem is also added. Fixes: #6892 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200817160913.2296444-1-bdenes@scylladb.com>	2020-08-18 10:22:35 +03:00
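The fix described in the entry above (move the pending container into a local variable before processing it) can be sketched in plain C++. This is not the `view_update_generator` code: the class, its members and the `yield_point` callback are hypothetical, with the callback standing in for any suspension point at which other code may register new items.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a component that accumulates work items and
// processes them in batches.
class batch_processor {
    std::vector<std::string> _pending;

public:
    void register_item(std::string name) { _pending.push_back(std::move(name)); }
    std::size_t pending() const { return _pending.size(); }

    // Takes ownership of the current batch up front. Items registered while
    // the batch is being processed land in the (now empty) _pending and are
    // picked up by the next call instead of being dropped with this batch.
    std::vector<std::string> process_batch(const std::function<void()>& yield_point) {
        auto batch = std::exchange(_pending, {});
        std::vector<std::string> processed;
        for (auto& item : batch) {
            yield_point();  // concurrent registration may happen here
            processed.push_back(std::move(item));
        }
        return processed;
    }
};
```

If an item is registered from inside `yield_point`, it is still reported by `pending()` after `process_batch` returns, which is the property the original race violated.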
Raphael S. Carvalho	f2b588cfc4	compaction/twcs: Make newest_bucket() non-static To fix #6928, newest_bucket() will have to access the class fields. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-08-17 12:29:34 -03:00
Pavel Solodovnikov	9aa4712270	lwt: introduce `paxos_grace_seconds` per-table option to set paxos ttl Previously system.paxos TTL was set as max(3h, gc_grace_seconds). Introduce new per-table option named `paxos_grace_seconds` to set the amount of seconds which are used to TTL data in paxos tables when using LWT queries against the base table. Default value is equal to `DEFAULT_GC_GRACE_SECONDS`, which is 10 days. This change allows to easily test various issues related to paxos TTL. Fixes #6284 Tests: unit (dev, debug) Co-authored-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200816223935.919081-1-pa.solodovnikov@scylladb.com>	2020-08-17 16:44:14 +02:00
Raphael S. Carvalho	11df96718a	compaction: Prevent non-regular compaction from picking compacting SSTables After `8014c7124`, cleanup can potentially pick a compacting SSTable. Upgrade and scrub can also pick a compacting SSTable. The problem is that table::candidates_for_compaction() was badly named. It misleads the user into thinking that the SSTables returned are perfect candidates for compaction, but the manager still needs to filter out the compacting SSTables from the returned set. So it's being renamed. When the same SSTable is compacted in parallel, the strategy invariant can be broken, like overlapping being introduced in LCS, and some deletions can fail as more than one compaction process would try to delete the same files. Let's fix scrub, cleanup and upgrade by calling the manager function which gets the correct candidates for compaction. Fixes #6938. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200811200135.25421-1-raphaelsc@scylladb.com>	2020-08-16 17:31:03 +03:00
Dejan Mircevski	edf91e9e06	test: Restore a case in user_types_test This testcase was temporarily commented out in `37ebe52`, because it relied on buggy (#6369) behaviour fixed by that commit. Specifically, it expected a NULL comparison to match a NULL cell value. We now bring it back, with corrected result expectation. Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-16 13:49:55 +03:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously `count` function was often used in various ways. `contains` not only expresses the intent of the code better but also does so in a more unified way. This commit replaces all the occurrences of `count` with `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Nadav Har'El	ee7291aa88	merge: CDC: allow "full" preimage in logs Merged pull request https://github.com/scylladb/scylla/pull/7028 By Calle Wilund: Changes the "preimage" option from binary true/false to on/off/full (accepting true/false, and using old style notation for normal to string - for upgrade reasons), where "full" will force us to include all columns in pre image log rows. Adds small test (just adding the case to preimage test). Uses the feature in alternator Fixes #7030 alternator: Set "preimage" to "full" for streams cdc_test: Do small test of "full" cdc: Make pre image optionally "full" (include all columns)	2020-08-12 23:19:46 +03:00
Calle Wilund	8cc5076033	cdc_test: Do small test of "full" Not a huge test change, but at least verifies it works.	2020-08-12 16:04:52 +00:00
Avi Kivity	24aa03a13c	Merge "Move some test code out of line" (sstable_run_based_compaction_strategy_for_test) from Rafael * 'espindola/move-out-of-line' of https://github.com/espindola/scylla: test: Move code in sstable_run_based_compaction_strategy_for_tests.hh out of line test: Drop ifdef now that we always use c++20 test: Move sstable_run_based_compaction_strategy_for_tests.hh to test/lib	2020-08-12 10:46:40 +03:00
Rafael Ávila de Espíndola	bd2f9fc685	test: Move sstable_run_based_compaction_strategy_for_tests.hh to test/lib This is in preparation to moving the code to a .cc file. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-11 11:48:41 -07:00
Benny Halevy	6deba1d0b4	test: cql_query_test: test_cache_bypass: use table stats test is currently flaky since system reads can happen in the background and disturb the global row cache stats. Use the table's row_cache stats instead. Fixes #6773 Test: cql_query_test.test_cache_bypass(dev, debug) Credit-to: Botond Dénes <bdenes@scylladb.com> Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200811140521.421813-1-bhalevy@scylladb.com>	2020-08-11 19:52:16 +03:00
Avi Kivity	4547949420	Merge "Fix repair stalls in get_sync_boundary and apply_rows_on_master_in_thread" from Asias " This patch set fixes stalls in repair that are caused by std::list merge and clear operations during test_latency_read_with_nemesis test. Fixes #6940 Fixes #6975 Fixes #6976 " * 'fix_repair_list_stall_merge_clear_v2' of github.com:asias/scylla: repair: Fix stall in apply_rows_on_master_in_thread and apply_rows_on_follower repair: Use clear_gently in get_sync_boundary to avoid stall utils: Add clear_gently repair: Use merge_to_gently to merge two lists utils: Add merge_to_gently	2020-08-11 14:52:23 +03:00
Botond Dénes	db5926134a	sstables: sstable_mutation_reader: read_partition(): include more information in exception Resolve the FIXME to help investigating related issues and include the position of the consumer in the error message. Refs: #6529 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200811111101.1576222-1-bdenes@scylladb.com>	2020-08-11 14:52:04 +03:00
Avi Kivity	3530e80ce1	Merge "Support md format" from Benny " This series adds support for the "md" sstable format. Support is based on the following: * do not use clustering based filtering in the presence of static row, tombstones. * Disabling min/max column names in the metadata for formats older than "md". * When updating the metadata, reset and disable min/max in the presence of range tombstones (like Cassandra does and until we process them accurately). * Fix the way we maintain min/max column names by: keeping whole clustering key prefixes as min/max rather than calculating min/max independently for each component, like Cassandra does in the "md" format. Fixes #4442 Tests: unit(dev), cql_query_test -t test_clustering_filtering* (debug) md migration_test dtest from git@github.com:bhalevy/scylla-dtest.git migration_test-md-v1 " * tag 'md-format-v4' of github.com:bhalevy/scylla: (27 commits) config: enable_sstables_md_format by default test: cql_query_test: add test_clustering_filtering unit tests table: filter_sstable_for_reader: allow clustering filtering md-format sstables table: create_single_key_sstable_reader: emit partition_start/end for empty filtered results table: filter_sstable_for_reader: adjust to md-format table: filter_sstable_for_reader: include non-scylla sstables with tombstones table: filter_sstable_for_reader: do not filter if static column is requested table: filter_sstable_for_reader: refactor clustering filtering conditional expression features: add MD_SSTABLE_FORMAT cluster feature config: add enable_sstables_md_format database: add set_format_by_config test: sstable_3_x_test: test both mc and md versions test: Add support for the "md" format sstables: mx/writer: use version from sstable for write calls sstables: mx/writer: update_min_max_components for partition tombstone sstables: metadata_collector: support min_max_components for range tombstones sstable: validate_min_max_metadata: drop outdated logic sstables: rename mc folder to mx 
sstables: may_contain_rows: always true for old formats sstables: add may_contain_rows ...	2020-08-11 13:29:11 +03:00
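The difference between per-component min/max and whole-prefix min/max, mentioned in the "md" format entry above, can be illustrated with a small sketch. It assumes full clustering keys of equal length modelled as `std::vector<int>` compared lexicographically; the type and function names are hypothetical and do not come from the sstable code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using clustering_key = std::vector<int>;  // hypothetical stand-in
using key_range = std::pair<clustering_key, clustering_key>;

// Older approach as described: min and max are tracked independently for
// each component, so the result need not be a key that was ever written.
key_range min_max_per_component(const std::vector<clustering_key>& keys) {
    assert(!keys.empty());
    clustering_key lo = keys.front();
    clustering_key hi = keys.front();
    for (const auto& k : keys) {
        for (std::size_t i = 0; i < k.size(); ++i) {
            lo[i] = std::min(lo[i], k[i]);
            hi[i] = std::max(hi[i], k[i]);
        }
    }
    return {lo, hi};
}

// "md"-style approach as described: the smallest and largest whole keys
// are kept, using lexicographic ordering of the full key.
key_range min_max_whole_key(const std::vector<clustering_key>& keys) {
    assert(!keys.empty());
    const auto mm = std::minmax_element(keys.begin(), keys.end());
    return {*mm.first, *mm.second};
}
```

For the keys `(1, 9)` and `(2, 0)`, the per-component bounds are `(1, 0)` and `(2, 9)`, neither of which exists in the data, while the whole-key bounds are `(1, 9)` and `(2, 0)`.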
Piotr Jastrzebski	80e3923b3c	codebase wide: replace find(...) != end() with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously the code pattern looked like: <collection>.find(<element>) != <collection>.end() In C++20 the same can be expressed with: <collection>.contains(<element>) This is not only more concise but also expresses the intent of the code more clearly. This commit replaces all the occurrences of the old pattern with the new approach. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>	2020-08-11 13:28:50 +03:00
Asias He	0bf0019eeb	utils: Add merge_to_gently This helper is similar to std::merge but it runs inside a thread and does not stall. Refs #6976	2020-08-11 10:37:34 +08:00
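A rough sketch of the idea behind the entry above: merge one sorted list into another a node at a time, with a yield opportunity after every step, so that a long merge does not monopolize the CPU. This is not the helper from `utils`; the name, the signature and the `maybe_yield` callback (standing in for whatever preemption point the real code uses inside a thread) are assumptions.

```cpp
#include <functional>
#include <list>

// Merges sorted `src` into sorted `dst`, leaving `src` empty. Elements are
// relinked with splice (no copies), and maybe_yield is invoked after every
// step. Equivalent elements already in `dst` stay ahead of those from `src`.
template <typename T, typename Compare, typename Yield>
void merge_to_gently_sketch(std::list<T>& dst, std::list<T>& src,
                            Compare comp, Yield maybe_yield) {
    auto it = dst.begin();
    while (!src.empty()) {
        // Skip dst elements that are not ordered after the next src element.
        while (it != dst.end() && !comp(src.front(), *it)) {
            ++it;
            maybe_yield();
        }
        dst.splice(it, src, src.begin());  // move one node from src into dst
        maybe_yield();
    }
}
```

Merging `{2, 3, 6}` into `{1, 3, 5}` with `std::less<int>{}` leaves `dst` as `{1, 2, 3, 3, 5, 6}` and `src` empty.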
Benny Halevy	0d85ceaf37	test: cql_query_test: add test_clustering_filtering unit tests Add unit tests reproducing https://github.com/scylladb/scylla/issues/3552 with clustering-key filtering enabled. enable_sstables_md_format option is set to true as clustering-key filtering is enabled only for md-format sstables. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 19:19:32 +03:00
Benny Halevy	d77ceba498	test: sstable_3_x_test: test both mc and md versions Run the test cases that write sstables using both the mc and md versions. Note that we can still compare the resulting Data, Index, Digest, and Filter components with the prepared mc sstables we have since these haven't changed in md. We take special consideration around validating min/max column names that are now calculated using a revised algorithm in the md format. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	bd4383a842	sstables: mx/writer: update_min_max_components for partition tombstone Partition tombstones represent an implicit clustering range that is unbound on both sides, so reflect that in min/max column names metadata using empty clustering key prefixes. If we don't do that, when using the sstable for filtering, we have no other way of distinguishing range tombstones from partition tombstones given the sstable metadata and we would need to include any sstable with tombstones, even if those are range tombstones, for which we can do a better filtering job, using the sstable min/max column names metadata. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	68acae5873	sstables: metadata_collector: support min_max_components for range tombstones We essentially treat min/max column names as range bounds with min as incl_start and max as incl_end. By generating a bound_view for min/max column names on the fly, we can correctly track and compare also short clustering key prefixes that may be used as bounds for range tombstones. Extend the sstable_tombstone_metadata_check unit test to cover these cases. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00


541 Commits