scylladb

Author	SHA1	Message	Date
Botond Dénes	7af0690762	mutation/mutation_compactor: drop v2 from compactor and related names	2025-05-09 07:53:29 -04:00
Botond Dénes	b5170e27d0	replica/table: s/make_reader_v2/make_mutation_reader/	2025-05-09 07:53:29 -04:00
Botond Dénes	ca7f557e86	readers/multishard: drop v2 from reader and related names	2025-05-09 07:53:29 -04:00
Botond Dénes	4d92bc8b2f	readers/evictable: drop v2 from reader and related names	2025-05-09 07:53:28 -04:00
Raphael S. Carvalho	c77f710a0c	sstables: Fix quadratic space complexity in partitioned_sstable_set Interval map is very susceptible to quadratic space behavior when it's flooded with many entries overlapping all (or most of) intervals, since each such entry will have presence on all intervals it overlaps with. A trigger we observed was a memtable flush storm, which creates many small "L0" sstables that span roughly the entire token range. Since we cannot rely on insertion order, the solution is to store sstables with such wide ranges in a vector (unleveled). There should be no consequence for single-key reads, since the upper layer applies an additional filtering based on the token of the key being queried. And for range scans, there can be an increase in memory usage, but not significant because the sstables span a wide range and would have been selected in the combined reader if the range of the scan overlaps with them. Anyway, this is a protection against a storm of memtable flushes and shouldn't be the common scenario. It works both with tablets and vnodes, by adjusting the token range spanned by the compaction group accordingly. Fixes #23634. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2025-04-29 15:47:33 -03:00
Wojciech Mitros	bf7bba9634	mv: add a test for dropping an index while it's building Dropping an index is a schema change of its base table and a schema drop of the index's materialized view. This combination of schema changes used to cause issues during view building, because when a view schema was dropped, it wasn't getting updated with the new version of the base schema, and while the view building was in progress, we would update the base schema for the base table mutation reader and try generating updates with a view schema that wasn't compatible with the base schema, failing on an `on_internal_error`. In this patch we add a test for this scenario. We create an index, halt its view building process using an injection, and drop it. If no errors are thrown, the test succeeds. The test was failing before https://github.com/scylladb/scylladb/pull/23337 and is passing afterwards.	2025-04-24 01:09:32 +02:00
Wojciech Mitros	d77f11d436	base_info: remove the lw_shared_ptr variant The base_dependent_view_info is no longer needed to be shared or modified in the view_info, so we no longer need to keep it as a shared pointer.	2025-04-24 01:08:40 +02:00
Wojciech Mitros	d7bd86591e	view_info: don't re-set base_info after construction In the previous commits we made sure that the base info is not dependent on the base schema version, and the info dependent on the base schema version is calculated when it's needed. In this patch we remove the unnecessary re-setting of the base_info. The set_base_info method isn't removed completely, because it also has a secondary function - zeroing the view_info fields other than base_info. Because of this, in this patch we rename it accordingly and limit its use to the updates caused by a base schema change.	2025-04-24 01:08:40 +02:00
Wojciech Mitros	ea462efa3d	base_info: remove base_info snapshot semantics The base info in view schemas no longer changes on base schema updates, so saving the base info with a view schema from a specific point in time doesn't provide any additional benefits. In this patch we remove the code using the base_and_view snapshots as it's no longer useful.	2025-04-24 01:08:40 +02:00
Wojciech Mitros	ad55935411	base_info: remove base schema from the base_info The base info now only contains values which are not reliant on the base schema version. We remove the base schema from the base info to make it immutable regardless of base schema version. At the point of this patch it's also not needed anywhere - the new base info can replace the base schema in most places, and in the few (view_updates) where we need it, we pull the most recent base schema version from the database. After this change, the base info no longer changes in a view schema after creation, so we'll no longer get errors when we try generating view updates with a base_info that's incompatible with a specific base schema version. Fixes #9059 Fixes #21292 Fixes #22410	2025-04-24 01:08:39 +02:00
Wojciech Mitros	05fce91945	schema_registry: store base info instead of base schema for view entries In the following patch we plan to remove the base schema from the base_info to make the base_info immutable. To do that, we first prepare the schema registry for the change; we need to be able to create view schemas from frozen schemas there and frozen schemas have no information about the base table. Unless we do this change, after base schemas are removed from the base info, we'll no longer be able to load a view schema to the schema registry without looking up the base schema in the database. This change also required some updates to schema building: * we add a method for unfreezing a view schema with base info instead of a base schema * we make it possible to use schema_builder with a base info instead of a base schema * we add a method for creating a view schema from mutations with a base info instead of a base schema * we add a view_info constructor with a base info instead of a base schema * we update the naming in schema_registry to reflect the usage of base info instead of base schema	2025-04-24 01:08:39 +02:00
Wojciech Mitros	6e539c2b4d	base_info: make members non-const In the following patches we'll add the base info instead of the base schema to various places (schema building, schema registry). There, we'll sometimes need to update the base_info fields, which we can't do with const members. There's also a place (global_schema_ptr) where we won't be able to use the base_info_ptr (a shared pointer to the base_info), so we can't just use the base_info_ptr everywhere instead. In this patch we unmark these members as const. In the following patches we'll remove the methods for changing the base_info in the view schema, so it will remain effectively const.	2025-04-24 01:08:39 +02:00
Wojciech Mitros	32258d8f9a	view_info: move the base info to a separate header In the following commits the base_dependent_view_info will be needed in many more places. To avoid including the whole db/view/view.hh or forward declaring (where possible) the base info, we move it to a separate header which can be included anywhere at almost no cost.	2025-04-24 01:08:39 +02:00
Wojciech Mitros	a3d2cd6b5e	view_info: move computation of view pk columns not in base pk to view_updates In preparation of making the base_info immutable, we want to get rid of any base_dependent_view_info fields that can change when the base schema is updated. The _base_regular_columns_in_view_pk and _base_static_columns_in_view_pk fields hold the column_ids of the corresponding base columns, and they can change (decrease) when an earlier column is dropped in the base table. view_updates is the only location where these values are used and calculating them is not expensive compared to the overall work done while performing a view update - we iterate over all view primary key columns and look them up in the base table. With this in mind, we can just calculate them when creating a view_updates object, instead of keeping them in the base_info. We do that in this patch.	2025-04-24 01:08:39 +02:00
Wojciech Mitros	a33963daef	view_info: move base-dependent variables into base_info The has_computed_column_depending_on_base_non_primary_key and is_partition_key_permutation_of_base_partition_key variables in the view_info depend on the base table so they should be in the base_dependent_view_info instead of view_info.	2025-04-24 01:08:39 +02:00
Wojciech Mitros	900687c818	view_info: set base info on construction Currently, the base_info may or may not be set in view schemas. Even when it's set, it may be modified. This necessitates extra checks when handling view schemas, as well as potentially causing errors when we forget to set it at some point. Instead, we want to make the base info an immutable member of view schemas (inside view_info). The first step towards that is making sure that all newly created schemas have the base info set. We achieve that by requiring a base schema when constructing a view schema. Unfortunately, this adds complexity each time we're making a view schema - we need to get the base schema as well. In most cases, the base schema is already available. The most problematic scenario is when we create a schema from mutations: - when parsing system tables we can get the schema from the database, as regular tables are parsed before views - when loading a view schema using the schema loader tool, we need to load the base additionally to the view schema, effectively doubling the work - when pulling the schema from another node - in this case we can only get the current version of the base schema from the local database Additionally, we need to consider the base schema version - when we generate view updates the version of the base schema used for reads should match the version of the base schema in view's base info. This is achieved by selecting the correct (old or new) schema in `db::schema_tables::merge_tables_and_views` and using the stored base schema in the schema_registry.	2025-04-24 01:08:39 +02:00
Botond Dénes	c29c696780	readers: mv from_mutations_v2.hh from_mutations.hh Completely mechanical change.	2025-04-16 04:46:08 -04:00
Botond Dénes	b104862702	tree: s/make_mutation_reader_from_mutations_v2/make_mutation_reader_from_mutations/s Completely mechanical change.	2025-04-16 04:46:07 -04:00
Botond Dénes	7547d0c6a9	readers: mv from_fragments_v2.hh from_fragments.hh Completely mechanical change.	2025-04-16 04:35:00 -04:00
Benny Halevy	e1fe82ed33	utils: phased_barrier, pluggable: use named gate Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-04-12 11:47:00 +03:00
Amnon Heiman	19a414598b	db/view/view.cc: label metrics with basic_level The following metrics will be marked with basic_level label: scylla_view_builder_builds_in_progress Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2025-03-03 16:58:39 +02:00
Kefu Chai	6e4cb20a69	tree: implement boost::accumulate with std::ranges library Replace boost::accumulate() calls with std::ranges facilities. This change reduces external dependencies and modernizes the codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23062	2025-02-26 23:22:02 +02:00
Andrzej Jackowski	b4f0a5149a	db: cql3: add comments regarding unsafe interval<clustering_key_prefix> class clustering_range is a range of Clustering Key Prefixes implemented as interval<clustering_key_prefix>. However, due to the nature of Clustering Key Prefix, the ordering of clustering_range is complex and does not satisfy the invariant of interval<>. To be more specific, as a comment in interval<> implementation states: “The end bound can never be smaller than the start bound”. As a range of CKP violates the invariant, some algorithms, like intersection(), can return incorrect results. For more details refer to scylladb#8157, scylladb#21604, scylladb#22817. This commit: - Add a WARNING comment to discourage usage of clustering_range - Add WARNING comments to potentially incorrect uses of interval<clustering_key_prefix> non-trivial methods - Add a FIXME comment to incorrect use of interval<clustering_key_prefix_view>::deoverlap and WARNING comments to related interval<clustering_key_prefix_view> misuse. Closes scylladb/scylladb#22913	2025-02-26 12:01:28 +01:00
Kefu Chai	7ff0d7ba98	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22857	2025-02-15 20:32:22 +02:00
Nadav Har'El	cae8a7222e	alternator: fix view build on oversized GSI key attribute Before this patch, the regular_column_transformation constructor, which we used in Alternator GSIs to generate a view key from a regular-column cell, accepted a cell of any size. As a reviewer (Avi) noticed, very long cells are possible, well beyond what Scylla allows for keys (64KB), and because regular_column_transformation stores such values in a contiguous "bytes" object it can cause stalls. But allowing oversized attributes creates an even more acute problem: While view building (backfilling in DynamoDB jargon), if we encounter an oversized (>64KB) key, the view building step will fail and the entire view building will hang forever. This patch fixes both problems by adding to regular_column_transformation's constructor the check that if the cell is 64KB or larger, an empty value is returned for the key. This causes the backfilling to silently skip this item, which is what we expect to happen (backfilling cannot do anything to fix or reject the pre-existing items in the base table). A test test_gsi_updatetable.py::test_gsi_backfill_oversized_key is introduced to reproduce this problem and its fix. The test adds a 65KB attribute to a base table, and then adds GSIs to this table with this attribute as its partition key or its sort key. Before this patch, the backfilling process for the new GSIs hangs, and never completes. After this patch, the backfilling completes and as expected contains other base-table items but not the item with the oversized attribute. The new test also passes on DynamoDB. However, while implementing this fix I realized that issue #10347 also exists for GSIs. Issue #10347 is about the fact that DynamoDB limits partition key and sort key attributes to 2048 and 1024 bytes, respectively. In the fix described above we only handled the acute case of lengths above 64 KB, but we should actually skip items whose GSI keys are over 2048 or 1024 bytes - not 64KB. This extra checking is not handled in this patch, and is part of a wider existing issue: Refs #10347 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-02-06 09:59:50 +01:00
Nadav Har'El	7a0027bacc	mv: clean up do_delete_old_entry The function do_delete_old_entry() had an if() which was supposedly for the case of collection column indexing, and which our previous patch that improved this function to support caller-specified deletion_ts left behind. As a reviewer noticed, the new tombstone-setting code was in an "else" of that existing if(), and it wasn't clear what happens if we get to that else in the collection column indexing. So I reviewed the code and added breakpoints and realized that in fact, do_delete_old_entry() is never called for the collection-indexing case, which has its own update_entry_for_computed_column() which view_updates::generate_update() calls instead of the do_delete_old_entry() function and its friends. So it appears that do_delete_old_entry() doesn't need that special case at all, which simplifies it. We should eventually simplify this code further. In particular, the function generate_update() already knows the key of the rows it adds or deletes so do_delete_old_entry() and its friends don't need to call get_view_rows() to get it again. But these simplifications and other will need to come in a later patch series, this one is already long enough :-) Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-02-06 09:59:49 +01:00
Nadav Har'El	bc7b5926d2	mv: support regular_column_transformation key columns in view In an earlier patch, we introduced regular_column_transformation, a new type of computed column that does a computation on a cell in regular column in the base and returns a potentially transformed cell (value or deletion, timestamp and ttl). In this patch, we wire the materialized view code to support this new kind of computed column that is usable as a materialized-view key column. This new type of computed column is not yet used in this patch - this will come in the next patch, where we will use it for Alternator GSIs. Before this patch, the logic of deciding when the view update needs to create a new row or delete a new one, and which timestamp and ttl to give to the new row, could depend on one (or two - in Alternator) cells read from base-table regular columns. In this patch, this logic is rewritten - the notion of "base table regular columns" is generalized to the notion of "updatable view key columns" - these are view key columns that an update may change - because they really are base regular columns, or a computed function of one (regular_column_transformation). In some sense, the new code is easier to understand - there is no longer a separate "compute_row_marker()" function, rather the top-level generate_update() is now in charge of finding the "updatable view key columns" and calculate the row marker (timestamp and ttl) as part of deciding what needs to be done. But unfortunately the code still has separate code paths for "collection secondary indexing", and also for old-style column_computation (basically, only token_column_computation). Perhaps in the future this can be further simplified. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-02-06 09:59:49 +01:00
Nadav Har'El	c8ea9f8470	mv: introduce regular_column_transformation, a new type of computed column In the patches that follow, we want Alternator to be able to use as a key for a materialized view (GSI) not a real column from the schema, but rather an attribute value deserialized from a member of the ":attrs" map. For this, we need the ability for materialized view to define a key column which is computed as function of a real column (":attrs"). We already have an MV feature which we called "computed column" (column_computation), but it is wholly inadequate for this job: column_computation can only take a partition key, and produce a value - while we need it to take a regular column (one member of ":attrs"), not just the partition key, and return a cell - value or deletion, timestamp and TTL. So in this patch we introduce a new type of computed column, which we called "regular_column_transformation" since it intends to perform some sort of transformation on a single column (or more accurately, a single atomic cell). The limitation that this function transforms a single column only is important - if we had a function of multiple columns, we wouldn't know which timestamp or ttl it should use for the result if the two columns had different timestamps or TTLs. The new class isn't wired to anything yet: The MV code cannot handle it yet, and the Alternator code will not use it yet. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-02-06 09:59:48 +01:00
Michael Litvak	6d34125eb7	view_builder: fix loop in view builder when tokens are moved The view builder builds a view by going over the entire token ring, consuming the base table partitions, and generating view updates for each partition. A view is considered as built when we complete a full cycle of the token ring. Suppose we start to build a view at a token F. We will consume all partitions with tokens starting at F until the maximum token, then go back to the minimum token and consume all partitions until F, and then we detect that we pass F and complete building the view. This happens in the view builder consumer in `check_for_built_views`. The problem is that we check if we pass the first token F with the condition `_step.current_token() >= it->first_token` whenever we consume a new partition or the current_token goes back to the minimum token. But suppose that we don't have any partitions with a token greater than or equal to the first token (this could happen if the partition with token F was moved to another node for example), then this condition will never be satisfied, and we don't detect correctly when we pass F. Instead, we go back to the minimum token, building the same token ranges again, in a possibly infinite loop. To fix this we add another step when reaching the end of the reader's stream. When this happens it means we don't have any more fragments to consume until the end of the range, so we advance the current_token to the end of the range, simulating a partition, and check for built views in that range. Fixes scylladb/scylladb#21829 Closes scylladb/scylladb#22493	2025-01-30 14:35:18 +02:00
Benny Halevy	dd21d591f6	network_topology_strategy_test: add tablets rack_aware_view_pairing tests Test the simple case of base/view pairing with replication_factor that is a multiple of the number of racks. As well as the complex case when simple_tablets_rack_aware_view_pairing is not possible. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-22 09:04:24 +02:00
Benny Halevy	249b793674	view: get_view_natural_endpoint: implement rack-aware pairing for tablets Enabled with the tablets_rack_aware_view_pairing cluster feature rack-aware pairing pairs base to view replicas that are in the same dc and rack, using their ordinality in the replica map We distinguish between 2 cases: - Simple rack-aware pairing: when the replication factor in the dc is a multiple of the number of racks and the minimum number of nodes per rack in the dc is greater than or equal to rf / nr_racks. In this case (that includes the single rack case), all racks would have the same number of replicas, so we first filter all replicas by dc and rack, retaining their ordinality in the process, and finally, we pair between the base replicas and view replicas, that are in the same rack, using their original order in the tablet-map replica set. For example, nr_racks=2, rf=4: base_replicas = { N00, N01, N10, N11 } view_replicas = { N11, N12, N01, N02 } pairing would be: { N00, N01 }, { N01, N02 }, { N10, N11 }, { N11, N12 } Note that we don't optimize for self-pairing if it breaks pairing ordinality. - Complex rack-aware pairing: when the replication factor is not a multiple of nr_racks. In this case, we attempt best-match pairing in all racks, using the minimum number of base or view replicas in each rack (given their global ordinality), while pairing all the other replicas, across racks, sorted by their ordinality. For example, nr_racks=4, rf=3: base_replicas = { N00, N10, N20 } view_replicas = { N11, N21, N31 } pairing would be: { N00, N31 }, { N10, N11 }, { N20, N21 } cross-rack pair If we'd simply stable-sort both base and view replicas by rack, we might end up with much worse pairing across racks: { N00, N11 }, { N10, N21 }, { N20, N31 }* * cross-rack pair Fixes scylladb/scylladb#17147 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-22 09:04:24 +02:00
Benny Halevy	0e388a1594	view: get_view_natural_endpoint: handle case when there are too few view replicas Currently, when reducing RF, we may drop replicas from the view before dropping replicas from the base table. Since get_view_natural_endpoint is allowed to return a disengaged optional if it can't find a pair for the base replica, replace the existing assertion with code handling this case, and count those events in a new table metric: total_view_updates_failed_pairing. Note that this does not fix the root cause for the issue which is the unsynchronized dropping of replicas, that should be atomic, using a single group0 transaction. Refs scylladb/scylladb#21492 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-22 09:04:24 +02:00
Benny Halevy	858b0a51f8	view: get_view_natural_endpoint: track replica locator::nodes Rather than tracking only the replica host_id, keep track of the locator::node& to prepare for rack-aware pairing. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-22 09:04:24 +02:00
Benny Halevy	cadd33bdf6	view: get_view_natural_endpoint: refactor predicate function Simplify the function logic by calculating the predicate function once, before scanning all base and view replicas, rather than testing the different options in the inner loop. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-22 09:04:24 +02:00
Benny Halevy	97f85e52f7	view: get_view_natural_endpoint: clarify documentation "self-pairing" is enabled only when use_legacy_self_pairing is enabled. That is currently unclear in the documentation comment for this function. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-22 09:04:24 +02:00
Benny Halevy	6d4de30a3a	view: mutate_MV: optimize remote_endpoints filtering check Currently we always lookup both `my_address` and target_endpoint in remote_endpoints. But if my_address is in remote_endpoints in some cases the second lookup is not needed, so do it only to decide whether to swap target_endpoint with my_address, if found in remote_endpoints, or to remove that match, if target_endpoint is already pending as well. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-22 09:04:24 +02:00
Benny Halevy	91d3bf8ebc	view: mutate_MV: lookup base and view erms synchronously Although at the moment storage_service::replicate_to_all_cores may yield between updating the base and view tables with a new effective_replication_map, scylladb/scylladb#21781 was submitted to change that so that they are updated atomically together. This change prepares for the above change, and is harmless at the moment. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-22 09:04:24 +02:00
Benny Halevy	d04cdce0fc	view: mutate_MV: calculate keyspace-dependent flags once All views live in the same keyspace as their base table, so calculate the keyspace-dependent flags once, outside the per-view update loop. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-22 09:04:24 +02:00
Kamil Braun	89ee2a6834	Merge 'drop ip addresses from token metadata' from Gleb Now that all topology related code uses host ids there is no point to maintain ip to id (and back) mappings in the token metadata. After the patch the mapping will be maintained in the gossiper only. The rest of the system will use host ids and in rare cases where translation is needed (mostly for UX compatibility reasons) the translation will be done using gossiper. Fixes: scylladb/scylla#21777 * 'gleb/drop-ip-from-tm-v3' of github.com:scylladb/scylla-dev: (57 commits) hint manager: do not translate ip to id in case hint manager is stopped already locator: token_metadata: drop update_host_id() function that does nothing now locator: topology: drop indexing by ips repair: drop unneeded code storage_service: use host_id to look for a node in on_alive handler storage_proxy: translate ips to ids in forward array using gossiper locator: topology: remove unused functions storage_service: check for outdated ip in on_change notification in the peers table storage_proxy: translate id to ip using address map in tablets's describe_ring code instead of taking one from the topology topology coordinator: change connection dropping code to work on host ids cql3: report host id instead of ip in error during SELECT FROM MUTATION_FRAGMENTS query locator: drop unused function from tablet_effective_replication_map api: view_build_statuses: do not use IP from the topology, but translate id to ip using address map instead locator: token_metadata: remove unused ip based functions locator: network_topology_strategy: use host_id based function to check number of endpoints in dcs gossiper: drop get_unreachable_token_owners functions storage_service: use gossiper to map ip to id in node_ops operations storage_service: fix indentation after the last patch storage_service: drop loops from node ops replace_prepare handling since there can be only one replacing node token_metadata: drop no longer used functions ...	2025-01-17 11:00:52 +01:00
Gleb Natapov	122d58b4ad	api: view_build_statuses: do not use IP from the topology, but translate id to ip using address map instead	2025-01-16 16:37:07 +02:00
Gleb Natapov	844cb090bf	view: do not use get_endpoint_for_host_id_if_known to check if a node is part of the topology Check directly in the topology instead.	2025-01-15 16:30:28 +02:00
Michael Litvak	7a6aec1a6c	view_builder: hold semaphore during entire startup Guard the whole view builder startup routine by holding the semaphore until it's done instead of releasing it early, so that it's not intercepted by migration notifications.	2025-01-14 12:31:29 +02:00
Michael Litvak	1104411f83	view_builder: pass view name by value to write_view_build_status The function write_view_build_status takes two lambda functions and chooses which of them to run depending on the upgrade state. It might run both of them. The parameters ks_name and view_name should be passed by value instead of by reference because they are moved inside each lambda function. Otherwise, if both lambdas are run, the second call operates on invalid values that were moved.	2025-01-14 12:31:29 +02:00
Michael Litvak	b1be2d3c41	view_builder: write status to tables before starting to build When adding a new view for building, first write the status to the system tables and then add the view building step that will start building it. Otherwise, if we start building it before the status is written to the table, it may happen that we complete building the view, write the SUCCESS status, and then overwrite it with the STARTED status. The view_build_status table will remain in incorrect state indicating the view building is not complete. Fixes scylladb/scylladb#20638	2025-01-14 12:31:20 +02:00
Michael Litvak	2a8ff478f0	view_builder: register listener for new views before reading views When starting the view builder, we find all existing views in `calculate_shard_build_step` and then register a listener for new views. Between these steps we may yield and create a new view, then we miss initializing the view build step for the new view, and we won't start building it. To fix this we first register the listener and then read existing views, so a view can't be missed. Fixes scylladb/scylladb#20338 Closes scylladb/scylladb#22184	2025-01-09 13:18:28 +02:00
Kefu Chai	e4463b11af	treewide: replace boost::algorithm::join() with fmt::join() Replace usages of `boost::algorithm::join()` with `fmt::join()` to improve performance and reduce dependency on Boost. `fmt::join()` allows direct formatting of ranges and tuples with custom separators without creating intermediate strings. When formatting comma-separated values into another string, fmt::join() avoids the overhead of temporary string creation that `boost::algorithm::join()` requires. This change also helps streamline our dependencies by leveraging the existing fmt library instead of Boost.Algorithm. To avoid the ambiguity, some caller sites were updated to call `seastar::format()` explicitly. See also - boost::algorithm::join(): https://www.boost.org/doc/libs/1_87_0/doc/html/string_algo/reference.html#doxygen.join_8hpp - fmt::join(): https://fmt.dev/11.0/api/#ranges-api Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22082	2025-01-07 12:45:05 +02:00
Wojciech Mitros	37a25d3af4	mv: avoid stalls when calculating affected clustering ranges Currently, when finishing db::view::calculate_affected_clustering_ranges we deoverlap, transform and copy all ranges prepared before. This is all done within a single continuation and can cause stalls. We fix this by adding yields after each transform and moving elements to the final vector one by one instead of copying them all at the end. After this change, the longest continuation in this code will be deoverlapping the initial ranges (and one transform). While it has a relatively high computational complexity (we sort all ranges), it should execute quickly because we're operating on views there and we don't need to copy the actual bytes. If we encounter a stall there, we'll need to implement an asynchronous `deoverlap` method. Fixes scylladb/scylladb#21843 Closes scylladb/scylladb#21846	2024-12-19 12:50:30 +01:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Botond Dénes	34a8b492be	Merge 'materialized view: make flow-control maximum delay configurable' from Piotr Dulikowski This pull request is a continuation of scylladb/scylladb#20688 - contents of the main commit are the same, the only change is the additional commit with a test. Until this patch, the materialized view flow-control algorithm (https://www.scylladb.com/2018/12/04/worry-free-ingestion-flow-control/) used a constant delay_limit_us hard-coded to one second, which means that when the size of view-update backlog reached the maximum (10% of memory), we delay every request by an additional second - while smaller amounts of backlog will result in smaller delays. This hard-coded one-second maximum delay was considered huge - it will slow down a client with concurrency 1000 to just 1000 requests per second - but we already saw some workloads where it was not enough - such as a test workload running very slow reads at high concurrency on a slow machine, where a latency of over one second was expected for each read, so adding a one second latency for writes wasn't having any noticeable effect on slowing down the client. So this patch replaces the hard-coded default with a live-updateable configuration parameter, `view_flow_control_delay_limit_in_ms`, which defaults to 1000ms as before. Another useful way in which the new `view_flow_control_delay_limit_in_ms` can be used is to set it to 0. In that case, the view-update flow control always adds zero delay, and in effect - does absolutely nothing. This setting can be used in emergency situations where it is suspected that the MV flow control is not behaving properly, and the user wants to disable it. The new parameter's help string mentions both these use cases of the parameter. Fixes #18187 This is new functionality, no need to backport to any open source release. Closes scylladb/scylladb#21647 * github.com:scylladb/scylladb: materialized views: test for the MV delay configuration parameter service: add injection for skipping view update backlog materialized view: make flow-control maximum delay configurable	2024-12-16 14:20:33 +02:00
muthu90tech	e49381119d	locator: topology: use node& instead of node* This change goes through locator::topology to use node& instead of node* where nullptr is not possible. There are places where the node object is used in unordered_set, in those cases the node is wrapped in std::reference_wrapper. Fixes scylladb/scylladb#20357 Closes scylladb/scylladb#21863	2024-12-12 13:22:55 +01:00
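
The "simple rack-aware pairing" case described in commit 249b793674 can be sketched roughly as below. This is an illustrative Python model, not the code in get_view_natural_endpoint: the function name, the `rack_of` callable, and the reading of node name "Nxy" as "node y in rack x" are assumptions made for the example.

```python
from collections import defaultdict


def simple_rack_aware_pairing(base_replicas, view_replicas, rack_of):
    """Pair base and view replicas rack by rack, keeping replica-set order.

    base_replicas / view_replicas: node names in replica-set order.
    rack_of: hypothetical callable mapping a node name to its rack.
    Returns a list of (base, view) pairs in base replica order.
    """
    # Group view replicas by rack; list order preserves their ordinality.
    view_by_rack = defaultdict(list)
    for node in view_replicas:
        view_by_rack[rack_of(node)].append(node)

    pairs = []
    seen_in_rack = defaultdict(int)  # base replicas already paired, per rack
    for node in base_replicas:
        rack = rack_of(node)
        idx = seen_in_rack[rack]
        seen_in_rack[rack] += 1
        candidates = view_by_rack[rack]
        # The simple case assumes equal replica counts per rack; if a rack
        # has too few view replicas, this sketch reports no partner (None).
        partner = candidates[idx] if idx < len(candidates) else None
        pairs.append((node, partner))
    return pairs


def rack_of(name):
    # Assumption: in "Nxy", the first digit x identifies the rack.
    return name[1]


# Example from the commit message: nr_racks=2, rf=4.
base = ["N00", "N01", "N10", "N11"]
view = ["N11", "N12", "N01", "N02"]
pairs = simple_rack_aware_pairing(base, view, rack_of)
```

With these inputs the sketch yields the pairing listed in the commit message: (N00, N01), (N01, N02), (N10, N11), (N11, N12). Note that N01 and N11 are not paired with themselves, matching the remark that self-pairing is not favoured when it would break ordinality.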

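
The view builder loop fixed in commit 6d34125eb7 can be modelled in a few lines. The sketch below is a simplified Python illustration of one scan after the builder has wrapped around to the minimum token; integer tokens, the function name, and the `advance_at_end_of_stream` flag are assumptions for the example, and the real check lives in `check_for_built_views`.

```python
def completes_after_wraparound(partition_tokens, first_token, max_token,
                               advance_at_end_of_stream):
    """Simulate one scan from the minimum token after wrapping around.

    Returns True if the builder detects that it has passed first_token,
    which is the point where the view counts as built.
    """
    for current_token in sorted(partition_tokens):
        # Condition from the commit message: current_token >= first_token.
        if current_token >= first_token:
            return True
    if advance_at_end_of_stream:
        # The fix: the stream ended, so nothing is left up to the end of
        # the range. Advance to the end of the range and check again.
        current_token = max_token
        return current_token >= first_token
    # Old behaviour: the check never fires, so the builder wraps around
    # and builds the same token ranges again.
    return False


# The partition that used to sit at token 50 has moved to another node.
remaining = [10, 20, 30]
old_behaviour = completes_after_wraparound(remaining, 50, 100, False)
fixed_behaviour = completes_after_wraparound(remaining, 50, 100, True)
```

In this model `old_behaviour` is False (the scan would repeat indefinitely) and `fixed_behaviour` is True, because advancing to the end of the range simulates a partition at or beyond the starting token.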

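
The throughput figure quoted in merge 34a8b492be (a client with concurrency 1000 slowed to about 1000 requests per second by a one-second delay) follows from Little's law for a closed-loop client. The helper below is a hypothetical illustration of that arithmetic only; it does not model how the flow-control algorithm scales the delay with backlog.

```python
def max_throughput(concurrency, base_latency_s, added_delay_s):
    """Upper bound on requests per second for a closed-loop client.

    Each of the `concurrency` in-flight requests takes
    base_latency_s + added_delay_s seconds, so at most
    concurrency / (base_latency_s + added_delay_s) complete per second.
    """
    return concurrency / (base_latency_s + added_delay_s)


# Maximum backlog with the default 1000 ms limit, negligible base latency.
at_default_limit = max_throughput(1000, 0.0, 1.0)

# Limit set to 0 ms: flow control adds no delay, only base latency remains.
with_limit_zero = max_throughput(1000, 0.5, 0.0)
```

The same arithmetic shows why a fixed one-second cap stops mattering when requests are already slow: the larger the base latency, the smaller the relative effect of the added second.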
659 Commits