Commit Graph

2777 Commits

Author SHA1 Message Date
Pavel Emelyanov
64c9359443 storage_proxy: Don't use default-initialized endpoint in get_read_executor()
After calling filter_for_query(), the extra_replica to speculate to may
be left default-initialized, which is the :0 IPv6 address. Later below,
this address is used as-is to check whether it belongs to the same DC,
which is not nice, as :0 is not the address of any existing endpoint.

The recent move of dc/rack data onto topology made this place reveal
itself by emitting an internal error, since :0 is not present in the
topology's collection of endpoints. Prior to this move, the dc filter
would count :0 as belonging to the "default_dc" datacenter, which may or
may not match the dc of the local node.

The fix is to explicitly tell a set extra_replica from an unset one.

fixes: #11825

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11833
2022-10-25 09:16:50 +03:00
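The fix above amounts to carrying the "unset" state in the type instead of smuggling it through a default-initialized (:0) address. A minimal sketch with invented names (not the actual storage_proxy code), using std::optional:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical model of the bug: a default-initialized endpoint ("::")
// is not a real node, so code that later asks "is this endpoint in my
// DC?" gets nonsense.
struct endpoint {
    std::string addr = "::";  // value of a default-initialized endpoint
};

// Before the fix: callers cannot tell "unset" from "the :: address".
inline endpoint filter_for_query_before(bool speculate) {
    endpoint extra_replica{};            // stays "::" unless set below
    if (speculate) {
        extra_replica.addr = "10.0.0.2";
    }
    return extra_replica;                // ambiguous sentinel escapes
}

// After the fix: the unset state is explicit in the type.
inline std::optional<endpoint> filter_for_query_after(bool speculate) {
    if (!speculate) {
        return std::nullopt;             // explicitly "no extra replica"
    }
    return endpoint{"10.0.0.2"};
}
```

With the optional, the DC check simply doesn't run when no extra replica was chosen.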
Botond Dénes
e981bd4f21 Merge 'Alternator, MV: fix bug in some view updates which set the view key to its existing value' from Nadav Har'El
As described in issue #11801, we saw in Alternator that when a GSI has both partition and sort keys which were non-key attributes in the base table, updating the GSI-sort-key attribute to the same value it already had caused the entire GSI row to be deleted.

This series fixes this bug (it was a bug in our materialized views implementation) and adds a reproducing test (plus a few more tests for similar situations which worked before the patch and continue to work after it).

Fixes #11801

Closes #11808

* github.com:scylladb/scylladb:
  test/alternator: add test for issue 11801
  MV: fix handling of view update which reassigns the same key value
  materialized views: inline used-once and confusing function, replace_entry()
2022-10-21 10:49:28 +03:00
Pavel Emelyanov
52d6e56a10 system_keyspace: Don't use global snitch instance
There are two places to patch, .start() and .setup(), and both only need
the snitch to get the local dc/rack from, nothing more. Thus both can
live with an explicit argument for now.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-20 12:29:26 +03:00
Avi Kivity
69199dbfba Merge 'schema_tables: limit concurrency' from Benny Halevy
To prevent stalls due to a large number of tables.

Fixes scylladb/scylladb#11574

Closes #11689

* github.com:scylladb/scylladb:
  schema_tables: merge_tables_and_views reindent
  schema_tables: limit parallelism
2022-10-19 18:40:45 +03:00
Nadav Har'El
8f4243b875 MV: fix handling of view update which reassigns the same key value
When a materialized view has a key (in Alternator, this can be two
keys) which was a regular column in the base table, and a base
update modifies that regular column, there are two distinct cases:

1. If the old and new key values are different, we need to delete the
   old view row, and create a new view row (with the different key).

2. If the old and new key values are the same, we just need to update
   the pre-existing row.

It's important not to confuse the two cases: If we try to delete and
create the *same* view row in the same timestamp, the result will be
that the row will be deleted (a tombstone wins over data if they have the
same timestamp) instead of updated. This is what we saw in issue #11801.

We had a bug that was seen when an update set the view key column to
the old value it already had: to compare the old and new key values
we used the function compare_atomic_cell_for_merge(), but this compared
not just the values but also, incorrectly, the metadata such as
the timestamp. Because setting a column to the same value changes its
timestamp, we wrongly concluded that these were different view keys
and used the delete-and-create code for this case, resulting in the
view row being deleted (as explained above).

The simple fix is to compare just the key values - not looking at
the metadata.

See tests reproducing this bug and confirming its fix in the next patch.

Fixes #11801

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-10-19 13:43:12 +03:00
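The comparison bug above can be condensed into a toy model (invented types, not the real compare_atomic_cell_for_merge()). A cell carries a value and a write timestamp; setting a column to the value it already had bumps the timestamp, so a whole-cell comparison reports "different" and triggers delete-then-create of the same view row, where the same-timestamp tombstone wins:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Simplified stand-in for an atomic cell.
struct cell {
    std::string value;
    int64_t timestamp;
};

// Buggy check: metadata leaks into the key comparison.
inline bool keys_differ_buggy(const cell& old_c, const cell& new_c) {
    return old_c.value != new_c.value || old_c.timestamp != new_c.timestamp;
}

// Fixed check: only the key value decides whether the view row moves.
inline bool keys_differ_fixed(const cell& old_c, const cell& new_c) {
    return old_c.value != new_c.value;
}
```

With the fixed check, a rewrite of the same key value takes the update path instead of delete-and-create.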
Nadav Har'El
e1f8cb6521 materialized views: inline used-once and confusing function, replace_entry()
The replace_entry() function is nothing more than a convenience for
calling delete_old_entry() and then create_entry(). But it is only used
once in the code, so we can just open-code the two calls at its single
call site.

The reason I want to change it now is that the shortcut replace_entry()
helped hide a bug (#11801) - replace_entry() works incorrectly if the
old and new row have the same key, because if they do we get a deletion
and creation of the same row with the same timestamp - and the deletion
wins. Having the two calls not hidden by a convenience function makes
this potential problem more apparent.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-10-19 13:25:34 +03:00
Benny Halevy
ce22dd4329 schema_tables: merge_tables_and_views reindent
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-19 13:05:41 +03:00
Benny Halevy
7ccb0e70f0 schema_tables: limit parallelism
To prevent stalls due to a large number of tables.

Fixes scylladb/scylladb#11574

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-19 13:05:38 +03:00
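The idea behind the patch above can be sketched with a plain bounded-batch loop (invented helper; the real code uses Seastar concurrency primitives and yields to the reactor between units of work):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>

// Instead of kicking off work for every table at once -- which can
// stall when there are thousands of tables -- process at most
// max_concurrent units per batch. Returns the number of batches, i.e.
// the number of points where a real reactor would get to run other
// tasks.
inline size_t process_in_batches(size_t n_tables, size_t max_concurrent,
                                 const std::function<void(size_t)>& work) {
    size_t batches = 0;
    for (size_t start = 0; start < n_tables; start += max_concurrent) {
        size_t end = std::min(start + max_concurrent, n_tables);
        for (size_t i = start; i < end; ++i) {
            work(i);                 // at most max_concurrent per batch
        }
        ++batches;                   // a real reactor would yield here
    }
    return batches;
}
```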
Botond Dénes
2d581e9e8f Merge "Maintain dc/rack by topology" from Pavel Emelyanov
"
There's an ongoing effort to move the endpoint -> {dc/rack} mappings
from snitch onto topology object and this set finalizes it. After it the
snitch service stops depending on gossiper and system keyspace and is
ready for de-globalization. As a nice side-effect the system keyspace no
longer needs to maintain the dc/rack info cache and its starting code gets
relaxed.

refs: #2737
refs: #2795
"

* 'br-snitch-dont-mess-with-topology-data-2' of https://github.com/xemul/scylla: (23 commits)
  system_keyspace: Dont maintain dc/rack cache
  system_keyspace: Indentation fix after previous patch
  system_keyspace: Coroutinize build_dc_rack_info()
  topology: Move all post-configuration to topology::config
  snitch: Start early
  gossiper: Do not export system keyspace
  snitch: Remove gossiper reference
  snitch: Mark get_datacenter/_rack methods const
  snitch: Drop some dead dependency knots
  snitch, code: Make get_datacenter() report local dc only
  snitch, code: Make get_rack() report local rack only
  storage_service: Populate pending endpoint in on_alive()
  code: Populate pending locations
  topology: Put local dc/rack on topology early
  topology: Add pending locations collection
  topology: Make get_location() errors more verbose
  token_metadata: Add config, spread everywhere
  token_metadata: Hide token_metadata_impl copy constructor
  gossiper: Remove messaging service getter
  snitch: Get local address to gossip via config
  ...
2022-10-19 06:50:21 +03:00
Tomasz Grabiec
4ff204c028 Merge 'cache: make all removals of cache items explicit' from Michał Chojnowski
This series is a step towards non-LRU cache algorithms.

Our cache items are able to unlink themselves from the LRU list. (In other words, they can be unlinked solely via a pointer to the item, without access to the containing list head). Some places in the code make use of that, e.g. by relying on auto-unlink of items in their destructor.

However, to implement algorithms smarter than LRU, we might want to update some cache-wide metadata on item removal. But any cache-wide structures are unreachable through an item pointer, since items only have access to themselves and their immediate neighbours. Therefore, we don't want items to unlink themselves — we want `cache.remove(item)`, rather than `item.remove_self()`, because the former can update the metadata in `cache`.

This series inserts explicit item unlink calls in places that were previously relying on destructors, gets rid of other self-unlinks, and adds an assert which ensures that every item is explicitly unlinked before destruction.

Closes #11716

* github.com:scylladb/scylladb:
  utils: lru: assert that evictables are unlinked before destruction
  utils: lru: remove unlink_from_lru()
  cache: make all cache unlinks explicit
2022-10-17 12:47:02 +02:00
Michał Chojnowski
d785364375 cache: make all cache unlinks explicit
Our LSA cache is implemented as an auto_unlink Boost intrusive list, meaning
that elements of the list unlink themselves from the list automatically on
destruction. Some parts of the code rely on that, and don't unlink them
manually.

However, this precludes accurate bookkeeping about the cache. Elements only have
access to themselves and their neighbours, not to any bookkeeping context.
Therefore, a destructor cannot update the relevant metadata.

In this patch, we fix this by adding explicit unlink calls to places where it
would be done by a destructor. In a following patch, we will add an assert to
the destructor to check that every element is unlinked before destruction.
2022-10-17 12:07:27 +02:00
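The bookkeeping argument above can be illustrated with a toy cache (hand-rolled on std::list rather than a Boost intrusive list, names invented): removal routed through the cache can update cache-wide metadata, which a self-unlinking destructor, seeing only the item, cannot reach.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

struct item {
    std::string key;
    size_t size;
};

class cache {
    std::list<item> _lru;
    size_t _total_bytes = 0;  // cache-wide metadata an item can't reach
public:
    std::list<item>::iterator add(item it) {
        _total_bytes += it.size;
        return _lru.insert(_lru.begin(), std::move(it));
    }
    // cache.remove(item) rather than item.remove_self(): only here can
    // the cache-wide counters be kept accurate.
    void remove(std::list<item>::iterator it) {
        _total_bytes -= it->size;
        _lru.erase(it);
    }
    size_t total_bytes() const { return _total_bytes; }
    size_t count() const { return _lru.size(); }
};
```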
Avi Kivity
19e62d4704 commitlog: delete unused "num_deleted" variable
Since d478896d46 we update the variable, but never read it.
Clang 15 notices and complains. Remove the variable to make it
happy.

Closes #11765
2022-10-13 15:11:32 +02:00
Raphael S. Carvalho
ec79ac46c9 db/view: Add visibility to view updating of Staging SSTables
Today, we're completely blind to the progress of view updating
on Staging files. We don't know how long it will take, nor how much
progress we've made.

This patch adds visibility with a new metric that reports
the number of bytes still to be processed from Staging files.
Before any work is done, the metric tells us the total size to be
processed. As view updating progresses, the metric value is
expected to decrease, unless work is being produced faster than
we can consume it.

We're piggybacking on sstables::read_monitor, which allows the
progress metric to be updated whenever the SSTable reader makes
progress.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #11751
2022-10-12 16:57:37 +03:00
Kamil Braun
08e654abf5 Merge 'raft: (service) cleanups on the path for dynamic IP address support' from Konstantin Osipov
In preparation for supporting IP address changes of Raft Group 0:
1) Always use start_server_for_group0() to start a server for group 0.
   This will provide a single extension point when it's necessary to
   prompt raft_address_map with gossip data.
2) Don't use raft::server_address in discovery, since going forward
   discovery won't store raft::server_address. By the same token, stop
   using discovery::peer_set anywhere outside discovery (for persistence),
   and use a peer_list instead, which is easier to marshal.

Closes #11676

* github.com:scylladb/scylladb:
  raft: (discovery) do not use raft::server_address to carry IP data
  raft: (group0) API refactoring to avoid raft::server_address
  raft: rename group0_upgrade.hh to group0_fwd.hh
  raft: (group0) move the code around
  raft: (discovery) persist a list of discovered peers, not a set
  raft: (group0) always start group0 using start_server_for_group0()
2022-10-11 13:43:41 +02:00
Pavel Emelyanov
8b8b37cdda system_keyspace: Dont maintain dc/rack cache
Some good news, finally. The saved dc/rack info about the ring is now
only loaded once on start, so the whole cache is not needed and the
loading code in storage_service can be greatly simplified.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-11 05:18:31 +03:00
Pavel Emelyanov
775f42c8d1 system_keyspace: Indentation fix after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-11 05:18:31 +03:00
Pavel Emelyanov
8f1df240c7 system_keyspace: Coroutinuze build_dc_rack_info()
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-11 05:18:31 +03:00
Pavel Emelyanov
4206b1f98f snitch, code: Make get_datacenter() report local dc only
The continuation of the previous patch -- all the code uses
topology::get_datacenter(endpoint) to get peers' dc string. The topology
still uses snitch for that, but it already contains the needed data.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-11 05:17:08 +03:00
Pavel Emelyanov
6c6711404f snitch, code: Make get_rack() report local rack only
All the code out there now calls snitch::get_rack() to get the rack of
the local node only. For other nodes, topology::get_rack(endpoint) is
used. Since the topology is now properly populated with endpoints, it
can finally be patched to stop using the snitch and get the rack from
its internal collections.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-11 05:17:08 +03:00
Konstantin Osipov
3e46c32d7b raft: (discovery) do not use raft::server_address to carry IP data
We plan to remove IP information from Raft addresses.
raft::server_address is used in the Raft configuration, and also in
discovery (a separate algorithm) as a handy data structure that avoids
introducing new entities in RPC.

Since we plan to remove IP addresses from the Raft configuration,
using raft::server_address in discovery and still storing
IPs in it would create ambiguity: in some uses raft::server_address
would store an IP, and in others it would not.

So switch to a dedicated data structure for the purposes of discovery,
discovery_peer, which contains an (IP, raft server id) pair.

Note to reviewers: ideally we should switch to URIs
in discovery_peer right away. Otherwise we may have to
deal with incompatible changes in discovery when adding URI
support to Scylla.
2022-10-10 16:24:33 +03:00
Pavel Emelyanov
b1f4273f0d large_data_handler: Use local system_keyspace to update entries
The large_data_handler's way of updating the system keyspace is unlike
the rest of the code. Instead of a dedicated helper on the
system_keyspace side, it executes the insertion query directly with the
help of qctx.

Now that the large_data_handler has a weak system_keyspace reference,
it can execute queries on _it_ rather than on the qctx.

Just like in the previous patch, it needs to keep the system_keyspace
weak reference alive until the query's future resolves.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-10 16:20:59 +03:00
Pavel Emelyanov
907fd2d355 system_keyspace: De-static compaction history update
The compaction manager now has a weak reference to the system keyspace
object and can use it to update its stats. It only needs to take care
to keep the shared pointer alive until the respective future resolves.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-10 16:20:59 +03:00
Pavel Emelyanov
3e0b61d707 compaction_manager: Relax history paths
There's a virtual method on table_state to update the entry in the
system keyspace. It's overkill, there only to facilitate tests that
don't want this. With the new system_keyspace weak referencing, this can
be made simpler by moving the updating call to the compaction_manager
itself.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-10 16:20:59 +03:00
Pavel Emelyanov
f9b57df471 database: Plug/unplug system_keyspace
There's a circular dependency between system_keyspace and database. The
former needs the latter because it needs to execute local requests via
query_processor. The latter needs the former via the compaction manager
and the large data handler: database depends on both, and these two need
to insert their entries into the system keyspace.

To cut this loop, the compaction manager and large data handler both get
a weak reference to the system keyspace. Once the system keyspace starts,
it activates this reference via a database call. When the system keyspace
is shut down on stop, it deactivates the reference.

Technically, the weak reference is implemented by marking the
system_keyspace object as an async_sharded_service, and the "reference"
in question is the shared_from_this() pointer. When the compaction
manager or the large data handler needs to update a system keyspace
table, it holds an extra reference on the system keyspace until the
entry is committed, thus making sure that system_keyspace isn't stopped
from under its feet. At the same time, unplugging the reference on
shutdown makes sure that no new entry updates will appear and the
system_keyspace will eventually be released.

It's not a classical C++ reference, because system_keyspace starts after
and stops before database.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-10 16:20:59 +03:00
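The plug/unplug lifetime scheme above can be sketched with standard smart pointers (simplified, invented names; the real code uses Seastar's async_sharded_service rather than bare std::shared_ptr):

```cpp
#include <cassert>
#include <memory>

// Stand-in for system_keyspace; shared_from_this() is the "reference".
struct sys_keyspace : std::enable_shared_from_this<sys_keyspace> {
    int entries = 0;
};

// Stand-in for compaction_manager / large_data_handler.
class handler {
    std::weak_ptr<sys_keyspace> _sys_ks;
public:
    void plug(std::shared_ptr<sys_keyspace> ks) { _sys_ks = ks; }
    void unplug() { _sys_ks.reset(); }   // shutdown: no new updates start
    // Returns false when the keyspace is already gone / unplugged.
    bool update_entry() {
        auto ks = _sys_ks.lock();        // pin for the duration of the update
        if (!ks) {
            return false;
        }
        ++ks->entries;                   // "commit" the entry
        return true;                     // pin released here
    }
};
```

The weak pointer cuts the ownership cycle: the handler never keeps the keyspace alive on its own, but each in-flight update pins it until the "commit" completes.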
Konstantin Osipov
224dd9ce1e raft: rename group0_upgrade.hh to group0_fwd.hh
The plan is to add other group-0-related forward declarations
to this file, not just the ones for upgrade.
2022-10-10 15:58:48 +03:00
Pavel Emelyanov
caed12c8f2 system_keyspace: Add .shutdown() method
Many services out there have one (sometimes called .drain()) that's
called early on stop and is responsible for preparing the service for
stopping: aborting pending/in-flight fibers and the like.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-10 15:29:33 +03:00
Pavel Emelyanov
53bad617c0 virtual_tables: Use token_metadata.is_member()
This method just jumps into topology.has_endpoint(). The change is made
for consistency with other users of it and as a preparation for future
topology.has_endpoint() enhancements.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-10 12:16:19 +03:00
Botond Dénes
b247f29881 Merge 'De-static system_keyspace::get_{saved|local}_tokens()' from Pavel Emelyanov
Yet another user of the global qctx object. Making the method(s) non-static requires pushing the system_keyspace all the way down to size_estimate_virtual_reader, plus a small update to the cql_test_env.

Closes #11738

* github.com:scylladb/scylladb:
  system_keyspace: Make get_{local|saved}_tokens non static
  size_estimates_virtual_reader: Pass sys_ks argument to get_local_ranges()
  cql_test_env: Keep sharded<system_keyspace> reference
  size_estimate_virtual_reader: Keep system_keyspace reference
  system_keyspace: Pass sys_ks argument to install_virtual_readers()
  system_keyspace: Make make() non-static
  distributed_loader: Pass sys_ks argument to init_system_keyspace()
  system_keyspace: Remove dangling forward declaration
2022-10-07 11:28:32 +03:00
Avi Kivity
20bad62562 Merge 'Detect and record large collections' from Benny Halevy
This series adds support for detecting collections that have too many items
and recording them in `system.large_cells`.

A configuration variable was added to db/config: `compaction_collection_items_count_warning_threshold` set by default to 10000.
Collections that have more items than this threshold will be warned about and will be recorded as a large cell in the `system.large_cells` table.  Documentation has been updated accordingly.

A new column was added to system.large_cells: `collection_items`.
Similar to the `rows` column in system.large_partitions, `collection_items` holds the number of items in a collection when the large cell is a collection, or 0 if it isn't.  Note that the collection may be recorded in system.large_cells either due to its size, like any other cell, and/or due to the number of items in it, if it crosses the said threshold.

Note that #11449 called for a new system.large_collections table, but extending system.large_cells follows the logic of system.large_partitions and is a smaller change overall, hence it was preferred.

Since the system keyspace schema is hard coded, the schema version of system.large_cells was bumped, and since the change is not backward compatible, we added a cluster feature - `LARGE_COLLECTION_DETECTION` - to enable using it.
The large_data_handler large cell detection record function will populate the new column only when the new cluster feature is enabled.

In addition, unit tests were added in sstable_3_x_test for testing large cells detection by cell size, and large_collection detection by the number of items.

Closes #11449

Closes #11674

* github.com:scylladb/scylladb:
  sstables: mx/writer: optimize large data stats members order
  sstables: mx/writer: keep large data stats entry as members
  db: large_data_handler: dynamically update config thresholds
  utils/updateable_value: add transforming_value_updater
  db/large_data_handler: cql_table_large_data_handler: record large_collections
  db/large_data_handler: pass ref to feature_service to cql_table_large_data_handler
  db/large_data_handler: cql_table_large_data_handler: move ctor out of line
  docs: large-rows-large-cells-tables: fix typos
  db/system_keyspace: add collection_elements column to system.large_cells
  gms/feature_service: add large_collection_detection cluster feature
  test: sstable_3_x_test: add test_sstable_too_many_collection_elements
  test: lib: simple_schema: add support for optional collection column
  test: lib: simple_schema: build schema in ctor body
  test: lib: simple_schema: cql: define s1 as static only if built this way
  db/large_data_handler: maybe_record_large_cells: consider collection_elements
  db/large_data_handler: debug cql_table_large_data_handler::delete_large_data_entries
  sstables: mx/writer: pass collection_elements to writer::maybe_record_large_cells
  sstables: mx/writer: add large_data_type::elements_in_collection
  db/large_data_handler: get the collection_elements_count_threshold
  db/config: add compaction_collection_elements_count_warning_threshold
  test: sstable_3_x_test: add test_sstable_write_large_cell
  test: sstable_3_x_test: pass cell_threshold_bytes to large_data_handler
  test: sstable_3_x_test: large_data_handler: prepare callback for testing large_cells
  test: sstable_3_x_test: large_data tests: use BOOST_REQUIRE_[GL]T
  test: sstable_3_x_test: test_sstable_log_too_many_rows: use tests::random
2022-10-06 18:28:21 +03:00
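The detection rule this series adds can be sketched as a simple predicate (invented names; the thresholds follow the description above, with the collection-items threshold defaulting to 10000):

```cpp
#include <cassert>
#include <cstdint>

struct large_data_config {
    uint64_t cell_size_threshold = 1024 * 1024;       // illustrative default
    uint64_t collection_elements_threshold = 10000;   // per the series
};

// A cell is recorded in system.large_cells if its size exceeds the cell
// threshold OR, when it is a collection, if its element count exceeds
// the collection-items threshold. collection_elements is 0 for
// non-collection cells.
inline bool should_record_large_cell(const large_data_config& cfg,
                                     uint64_t cell_size,
                                     uint64_t collection_elements) {
    return cell_size > cfg.cell_size_threshold
        || collection_elements > cfg.collection_elements_threshold;
}
```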
Pavel Emelyanov
59da903054 system_keyspace: Make get_{local|saved}_tokens non static
Now all callers have a system_keyspace reference at hand. This removes
one more user of the global qctx object.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-06 18:02:09 +03:00
Pavel Emelyanov
b03f1e7b17 size_estimates_virtual_reader: Pass sys_ks argument to get_local_ranges()
This method statically calls system_keyspace::get_local_tokens(). Having
a system_keyspace reference will make this method non-static.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-06 18:00:09 +03:00
Pavel Emelyanov
34e8e5959f size_estimate_virtual_reader: Keep system_keyspace reference
The size_estimates_virtual_reader::fill_buffer() method needs the system
keyspace to get the node's local tokens. It's currently a static method;
having a system_keyspace reference will make it non-static.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-06 17:58:07 +03:00
Pavel Emelyanov
04552f2d58 system_keyspace: Pass sys_ks argument to install_virtual_readers()
The size-estimates virtual reader will need it; now it's available as
"this" from the system_keyspace::make() method.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-06 17:57:13 +03:00
Pavel Emelyanov
1938412d7a system_keyspace: Make make() non-static
This helper needs a system_keyspace reference, and using "this" for it
looks natural. Also, this de-static-ification makes it possible to put
some sense into the invoke_on_all() call from init_system_keyspace().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-06 17:56:11 +03:00
Pavel Emelyanov
e996503f0d system_keyspace: Remove dangling forward declaration
It doesn't match the real system_keyspace_make() definition and is in
fact not needed, as there's another "real" one in database.hh

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-06 17:54:22 +03:00
Benny Halevy
2c4ff71d2b db: large_data_handler: dynamically update config thresholds
Make the various large data thresholds live-updateable,
and construct the observers and updaters in
cql_table_large_data_handler to dynamically update
the base large_data_handler class threshold members.

Fixes #11685

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-05 10:53:40 +03:00
Avi Kivity
37c6b46d26 dirty_memory_manager: re-term "virtual dirty" to "unspooled dirty"
The "virtual dirty" term is not very informative. "Virtual" means
"not real", but it doesn't say in which way it isn't real.

In this case, virtual dirty refers to real dirty memory, minus
the portion of memtables that has been written to disk (but not
yet sealed - in that case it would not be dirty in the first
place).

I chose to refer to "the portion of memtables that has been written
to disk" as "spooled memory". At least the unique term will cause
people to look it up, and it may be easier to remember. From that
we get "unspooled memory".

I plan to further change the accounting to account for spooled memory
rather than unspooled, as that is a more natural term, but that is left
for later.

The documentation, config item, and metrics are adjusted. The config
item is practically unused so it isn't worth keeping compatibility here.
2022-10-04 14:03:59 +03:00
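The renamed quantity is simple arithmetic; a minimal sketch with invented names of the relation described above:

```cpp
#include <cassert>
#include <cstdint>

struct dirty_stats {
    uint64_t real_dirty;   // all memory held by unsealed memtables
    uint64_t spooled;      // portion already written to disk, not yet sealed
};

// "Unspooled dirty" (formerly "virtual dirty") is the real dirty memory
// minus the spooled portion.
inline uint64_t unspooled_dirty(const dirty_stats& s) {
    return s.real_dirty - s.spooled;
}
```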
Benny Halevy
46ebffcc93 db/large_data_handler: cql_table_large_data_handler: record large_collections
When the large_collection_detection cluster feature is enabled,
select the internal_record_large_cells_and_collections method
to record the large collection cell, also storing the
collection_elements column.

We want to do that only when the cluster feature is enabled
to facilitate rollback in case a rolling upgrade is aborted;
otherwise system.large_cells won't be backward compatible
and will have to be deleted manually.

Delete the sstable's entries from system.large_cells if it contains
elements_in_collection above the threshold.

Closes #11449

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:10 +03:00
Benny Halevy
3f8bba202f db/large_data_handler: pass ref to feature_service to cql_table_large_data_handler
For recording collection_elements of large_collections when
the large_collection_detection feature is enabled.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:10 +03:00
Benny Halevy
dc4e7d8e01 db/large_data_handler: cql_table_large_data_handler: move ctor out of line
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:09 +03:00
Benny Halevy
2f49eebb04 db/system_keyspace: add collection_elements column to system.large_cells
And bump the schema version offset since the new schema
should be distinguishable from the previous one.

Refs scylladb/scylladb#11660

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:08 +03:00
Benny Halevy
6dadca2648 db/large_data_handler: maybe_record_large_cells: consider collection_elements
Detect large_collections when the number of collection_elements
is above the configured threshold.

Next step would be to record the number of collection_elements
in the system.large_cells table, when the respective
cluster feature is enabled.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:05 +03:00
Benny Halevy
27ee75c54e db/large_data_handler: debug cql_table_large_data_handler::delete_large_data_entries
Log at debug level when deleting a large data entry
from a system table.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:42:04 +03:00
Benny Halevy
a107f583fd db/large_data_handler: get the collection_elements_count_threshold
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:31:11 +03:00
Benny Halevy
167ec84eeb db/config: add compaction_collection_elements_count_warning_threshold
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-04 08:31:10 +03:00
Botond Dénes
5621cdd7f9 db/view/view_builder: don't drop partition and range tombstones when resuming
The view builder builds the views from a given base table in
view_builder::batch_size batches of rows. After processing this many
rows, it suspends so the view builder can switch to building views for
other base tables in the name of fairness. When resuming the build step
for a given base table, it reuses the reader used previously (which also
serves the role of a snapshot, pinning the sstables read from). The
compactor, however, is created anew. As the reader can be in the middle
of a partition, the view builder injects a partition start into the
compactor to prime it for continuing the partition. This, however, only
included the partition key, crucially missing any active tombstones: the
partition tombstone or -- since the v2 transition -- an active range
tombstone. This could result in base rows covered by either of these
being resurrected, and in the view builder generating view updates for
them.
This patch solves this by using the detach-state mechanism of the
compactor, which was explicitly developed for situations like this (in
the range scan code): resuming a read with the readers kept but the
compactor recreated.
Also included are two test cases reproducing the problem, one with a
range tombstone, the other with a partition tombstone.

Fixes: #11668

Closes #11671
2022-10-03 11:28:22 +03:00
Botond Dénes
ad04f200d3 Merge 'database: automatically take snapshot of base table views' from Benny Halevy
The logic to reject explicit snapshot of views/indexes was improved in aa127a2dbb. However, we never implemented auto-snapshot of
views/indexes when taking a snapshot of the base table.

This is implemented in this patch.

The implementation is built on top of
ba42852b0e
so it would be hard to backport to 5.1 or earlier
releases.

Fixes #11612

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11616

* github.com:scylladb/scylladb:
  database: automatically take snapshot of base table views
  api: storage_service: reject snapshot of views in api layer
2022-09-29 13:33:31 +03:00
Benny Halevy
d32c497cd9 database: automatically take snapshot of base table views
The logic to reject explicit snapshot of views/indexes
was improved in aa127a2dbb.
However, we never implemented auto-snapshot of
views/indexes when taking a snapshot of the base table.

This is implemented in this patch.

The implementation is built on top of
ba42852b0e
so it would be hard to backport to 5.1 or earlier
releases.

Fixes #11612

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-26 11:02:54 +03:00
Benny Halevy
55b0b8fe2c api: storage_service: reject snapshot of views in api layer
Rather than pushing the check down to
`snapshot_ctl::take_column_family_snapshot`, just check
explicitly when taking a snapshot of a particular
table by name over the api.

Other paths that call snapshot_ctl::take_column_family_snapshot
are internal and already use it to snapshot views.

With that, we can get rid of the allow_view_snapshots flag
that was introduced in aab4cd850c.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-26 10:44:56 +03:00
Benny Halevy
fcbbc3eb9c db/large_data_handler: print static cell/collection description in log warning
When warning about a large cell/collection in a static row,
print that fact in the log warning to make it clearer.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-25 14:37:42 +03:00