scylladb

Author	SHA1	Message	Date
Duarte Nunes	6fbf792777	db/view/view_builder: Don't timeout waiting for view to be built Remove the timeout argument to db::view::view_builder::wait_until_built(), a test-only function to wait until a given materialized view has finished building. This change is motivated by the fact that some tests running on slow environments will timeout. Instead of incrementally increasing the timeout, remove it completely since tests are already run under an exterior timeout. Fixes #3920 Tests: unit release(view_build_test, view_schema_test) Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20181115173902.19048-1-duarte@scylladb.com>	2018-11-15 19:41:43 +02:00
Piotr Sarna	fc7267c797	db/view: add view_update_from_staging_generator service A shardable service for generating mv updates after restarts is added.	2018-11-13 15:01:52 +01:00
Piotr Sarna	ed05d91adc	db/view: add view updating consumer This consumer is used to generate and push view replica updates from read mutations.	2018-11-13 14:54:39 +01:00
Avi Kivity	d77e044cde	db: convert sprint() to format() sprint() recently became more strict, throwing on sprint("%s", 5). Replace with the more modern format(). Mechanically converted with https://github.com/avikivity/unsprint.	2018-11-01 13:16:17 +00:00
Nadav Har'El	b8337f8c9d	Materalized views: fix race condition in resharding while view building When a node reshards (i.e., restarts with a different number of CPUs), and is in the middle of building a view for a pre-existing table, the view building needs to find the right token from which to start building on all shards. We ran the same code on all shards, hoping they would all make the same decision on which token to continue. But in some cases, one shard might make the decision, start building, and make progress - all before a second shard goes to make the decision, which will now be different. This resulted, in some rare cases, in the new materialized view missing a few rows when the build was interrupted with a resharding. The fix is to add the missing synchronization: All shards should make the same decision on whether and how to reshard - and only then should start building the view. Fixes #3890 Fixes #3452 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20181028140549.21200-1-nyh@scylladb.com>	2018-10-28 17:20:10 +00:00
Duarte Nunes	f3a5ec0fd9	db/view: Don't copy keyspace name Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20181022104527.14555-1-duarte@scylladb.com>	2018-10-22 13:00:00 +02:00
Nadav Har'El	1d5f8d0015	materialized views: update stats.write statistics in all cases mutate_MV usually calls send_to_endpoint() to push view update to remote view replicas. This function gets passed a statistics object, service::storage_proxy_stats::write_stats and, in particular, updates its "writes" statistic which counts the number of ongoing writes. In the case that the paired view replica happens to be the same node, we avoid calling send_to_endpoint() and call mutate_locally() instead. That function does not take a write_stats object, so the "writes" statistic doesn't get incremented for the duration of the write. So we should do this explicitly. Co-authored-by: Nadav Har'El <nyh@scylladb.com> Co-authored-by: Duarte Nunes <duarte@scylladb.com>	2018-10-02 20:44:58 +01:00
Botond Dénes	eb357a385d	flat_mutation_reader: make timeout opt-out rather than opt-in Currently timeout is opt-in, that is, all methods that even have it default it to `db::no_timeout`. This means that ensuring timeout is used where it should be is completely up to the author and the reviewrs of the code. As humans are notoriously prone to mistakes this has resulted in a very inconsistent usage of timeout, many clients of `flat_mutation_reader` passing the timeout only to some members and only on certain call sites. This is small wonder considering that some core operations like `operator()()` only recently received a timeout parameter and others like `peek()` didn't even have one until this patch. Both of these methods call `fill_buffer()` which potentially talks to the lower layers and is supposed to propagate the timeout. All this makes the `flat_mutation_reader`'s timeout effectively useless. To make order in this chaos make the timeout parameter a mandatory one on all `flat_mutation_reader` methods that need it. This ensures that humans now get a reminder from the compiler when they forget to pass the timeout. Clients can still opt-out from passing a timeout by passing `db::no_timeout` (the previous default value) but this will be now explicit and developers should think before typing it. There were suprisingly few core call sites to fix up. Where a timeout was available nearby I propagated it to be able to pass it to the reader, where I couldn't I passed `db::no_timeout`. Authors of the latter kind of code (view, streaming and repair are some of the notable examples) should maybe consider propagating down a timeout if needed. In the test code (the wast majority of the changes) I just used `db::no_timeout` everywhere. Tests: unit(release, debug) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <1edc10802d5eb23de8af28c9f48b8d3be0f1a468.1536744563.git.bdenes@scylladb.com>	2018-09-20 11:31:24 +02:00
Nadav Har'El	16a6f76873	materialized views: simplify do_delete_old_entry() In previous patches, we gave up on an old (and broken) attempt to track the timestamps of many unselected base-table columns through one row marker in the view table - and replaced them by "virtual cells", one per unselected cell. The do_delete_old_entry() function still contains old code which maintained that row marker, and is no longer needed. That old code is no only no longer needed, it also no longer did anything because all columns now appear in the view (as virtual columns) so the code ignored them when calculating the row marker. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180829131914.16042-1-nyh@scylladb.com>	2018-08-29 14:33:41 +01:00
Nadav Har'El	6c00341383	Materialized Views: no need for elaborate row marker calculations Now that we have separate virtual cells to represent unselected columns in a materialized view, we no longer need the elaborate row-marker liveness calculations which aimed (but failed) to do the same thing. So that code can be removed. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-08-16 15:45:41 +03:00
Nadav Har'El	30f721afab	Materialized Views: add unselected columns as virtual columns When a view's partition key contains only columns from the base's partition key (and not an additional one), the liveness (existance or disappearance) of a view-table row is tied to the liveness of the base table row - and that depends not only on selected columns (base-table columns SELECTed to also appear in the view) but also on unselected columns. This means that we may need to keep a view row alive even without data, just because some unselected column is alive in the base table. Before this patch we tried to build a single "row marker" in the view column which summarizes the liveness information in all unselected columns, but this proved unworkable, as explained in issue #3362 and as will be demonstrated in unit tests in a later patch. Because we can't replace several unselected cells by one row marker, what we do in this patch is to add for each for the unselected cell a "virtual cell" which contains the cell's liveness information (timestamp, deletion, ttl) but not its value. For collections, we can't represent the entire collection by one virtual cell, and rather need a collection of virtual cells. This patch just adds the virtual columns to the view schema. Code in the previous patch, when it notices the virtual columns in the view's schema, added the appropriate content into these columns. We may need to add virtual columns to a view when first created, but also when an unselected column is added to the base table with "ALTER TABLE", so both are supported in this patch. Fixes #3362. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-08-16 15:42:22 +03:00
Nadav Har'El	782baa44ef	Materialized Views: fill virtual columns The add_cells_to_view() function usually adds selected cells from the base table to the view mutation. For issue #3362, we sometimes want to also add unselected cells as "virtual" cells - truncated versions of the base-table cells just without the values. This patch contains the code to fill the virtual columns' data using the regular columns from the base table. This patch does not yet actually add any virtual columns to the schema, so until that is done (in the next patch), this patch will not yet cause any behavior change. This is important for bisectability. Refs #3362. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2018-08-16 15:38:27 +03:00
Tomasz Grabiec	894961006b	Merge "db/view/view_builder: Fixes to bookkeeping" from Duarte This series contains a couple of fixes to the bookkeeping of the view build process, which could cause data to be left behind in the system tables. * git@github.com:duarten/scylla.git materialized-views/view-build-fixes/v1: Duarte Nunes (3): db/system_keyspace: Add function to remove view build status of a shard db/view: Don't have shard 0 clear other shard's status on drop db/view: Restrict writes to the distributed system keyspace to shard 0	2018-07-17 18:01:28 +02:00
Duarte Nunes	55caaec411	db/view/build_progress_virtual_reader: Also adjust end RT bound Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-07-11 23:28:31 +01:00
Duarte Nunes	eda6b88b0e	db/view/build_progress_virtual_reader: Fix full ck detection As an optimization, the virtual reader doesn't change the underlying key if it is not full, and hence doesn't include the extra clustering key. However, this detection is broken because it checked for 3 clustering columns, instead of 2. This patch fixes that by obtaining the clustering key size from the underlying schema instead of hardcoding the size. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-07-11 23:28:31 +01:00
Duarte Nunes	ff3a0d437a	db/view/build_progress_virtual_reader: Use correct schema to adjust ck The virtual reader adjusts clustering keys obtained from the underlying, scylla-specific schema, and potentially sheds the extra clustering key that's absent from the Cassandra-compatible schema. This patches ensures we use the correct schema to iterator over the key. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-07-11 23:28:31 +01:00
Duarte Nunes	df66d7db59	db/view: Restrict writes to the distributed system keyspace to shard 0 Writing to the distributed system keyspace should be confined to a single shard of each host, namely shard 0. We were violating this constraint by having all shards set the host status to "started". This could be problematic when the build finishes quickly or there's a concurrent view drop, such that a write done by shard 0 can have a smaller timestamp than one done by some other shard. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-07-11 21:45:26 +01:00
Duarte Nunes	e683c1367f	db/view: Don't have shard 0 clear other shard's status on drop Shard 0 can clear the in-progress build status of all shards when a view finishes building, because we are ensured all writes to the system table have completed with earlier timestamps. This is not the case when dropping a view. A drop can happen concurrently with the build, in which case shard 0 may process the notification before another shard receives it, and before that shard writes to the system table. Fix this by ensuring each shard clears its own status on drop. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-07-11 21:45:26 +01:00
Piotr Sarna	d5e7b5507b	view: add handling of a token column for secondary indexes In order to ensure token order on secondary index queries, first clustering column for each view that backs a secondary index is going to store a token computed from base's partition keys. After this commit, if there exists a column that is not present in base schema, it will be filled with computed token.	2018-06-05 18:59:25 +02:00
Piotr Sarna	06eee0f525	view: add is_index method is_index method returns true if view that owns it is backing a secondary index.	2018-06-05 11:10:24 +02:00
Paweł Dziepak	aa25f0844f	atomic_cell: introduce fragmented buffer value interface As a prepratation for the switch to the new cell representation this patch changes the type returned by atomic_cell_view::value() to one that requires explicit linearisation of the cell value. Even though the value is still implicitly linearised (and only when managed by the LSA) the new interface is the same as the target one so that no more changes to its users will be needed.	2018-05-31 15:51:11 +01:00
Paweł Dziepak	27014a23d7	treewide: require type info for copying atomic_cell_or_collection	2018-05-31 15:51:11 +01:00
Paweł Dziepak	93130e80fb	atomic_cell: require column_definition for creating atomic_cell views	2018-05-31 15:51:11 +01:00
Duarte Nunes	99d678d079	db/view: Remove ifdef'd Java code It provides no useful information, so just get rid of it. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-05-28 11:51:23 +01:00
Duarte Nunes	ad18d535e9	db/view: Ignore scenario where base replica hasn't joined the ring Apache Cassandra handles a case where the node hasn't joined the ring and may consequentially have an outdated view of it. Following the same reasoning as with the previous patch, we ignore this scenario. It happens when there are range movements, and this node is bootstrapping, but there are already other mechanisms in the cluster, such as hinted handoff and dual-writing to replicas during range movements, that contribute to this update eventually making its way to the view. This patch doesn't change any behavior, but it provides the reasoning why we won't use the batchlog as Cassandra does, or the hinted handoff log as we will, to later send the update when the node is joined (note that Cassandra just sends the mutations "later", and doesn't check again for any condition or change). Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-05-28 11:51:23 +01:00
Duarte Nunes	be45e6a1b7	db/view: Handle case when base has no paired view replica If no view replica is paired with the current base replica, it means there's a range movement going on (decommission or move), such that this base replica is gaining new token ranges. The current node is thus a pending_endpoint from the POV of the coordinator that sent the request. Sending view updates to the view replica this base will eventually be paired with only makes a difference when the base update didn't make it to the node which is currently being decommissioned or moved-from. The update will, however, make it to that node if HH is enabled at the coordinator, before the range movement finishes, or later to this node when it becomes a natural endpoint for the token. We still ensure we send to any pending view endpoints though, at least until we handle that case more optimally. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-05-28 11:51:18 +01:00
Piotr Sarna	3792bed3ed	view: adapt view_stats to act as write stats This commit adapts view_stats structure so it can be passed to storage_proxy as write stats. Thanks to that, mv replica updates will not interfere with user write metrics. As a side effect it also provides more stats to replica view updates. Closes #3385 Closes #3416	2018-05-22 16:52:58 +02:00
Piotr Sarna	9246bb36bc	db: add row locking metrics This commit adds statistics to row_locker class. Metrics are independendly counted for all lock types: row<->partition and exclusive<->shared. Metrics gathered: - total acquisitions - operations that wait on the lock - histogram of the time spent on waiting on this type of lock References #3385 References #3416	2018-05-22 16:52:58 +02:00
Piotr Sarna	49bebcfa25	view: add view metrics This commit introduces view statistics: - updates pushed to local/remote replicas - updates failed to be pushed to local/remote replicas Metrics are kept on per-table basis, i.e. updates_pushed_remote shows the number of total updates (mutations) pushed to all paired mv replicas that this particular table has. Every single update is taken into consideration, so if view update requires removing a row from one view and adding a row to another, it will be counted as 2 updates. References #3385 References #3416	2018-05-22 16:52:58 +02:00
Paweł Dziepak	75b8b521d9	db/view/build_progress: avoid copying mutation fragment	2018-05-09 16:52:26 +01:00
Paweł Dziepak	0b4c6b8938	types: make some collection_type_impl functions non-static The switch to the new in-memory representation will require a larger parts of the logic be aware of the type of the values they are dealing with. In most cases it is not a significant burden for the users.	2018-05-09 16:52:26 +01:00
Duarte Nunes	c053275a48	db/view/row_locking: Add timeout when waiting for the lock This ensures we respect the write timeout set by the client when applying base writes, in case a writes takes too long to acquire the row lock for the read-before-write phase of a materialized view update. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180507132755.8751-1-duarte@scylladb.com>	2018-05-07 18:22:39 +01:00
Nadav Har'El	8012f231ca	materialized views: fix another case-sensitivity bug We had another case-sensitivity bug in materialized views, where if a case-sensitive (quoted) column name was listed explicitly on "SELECT" (instead of implicitly, e.g., in "SELECT *") the column name was incorrectly folded to lower-case and inserts would fail. This patch fixes the code, where a "SELECT" statement was built using the desired column names, but column names that needed quoting were not being quoted. The bug was in a helper function build_select_statement() which took column name strings and failed to quote them. We clean up this function to take column definitions instead of strings - and take care of the quoting itself. It also needs to quote the table's name in the select statement being built. Fixes #3391. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180429221857.6248-6-nyh@scylladb.com>	2018-04-30 00:27:23 +02:00
Duarte Nunes	844e0b41d1	db/view: Move cells instead of copying in add_cells_to_view() Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:03 +01:00
Duarte Nunes	4b4d1dbd1f	db/view: Handle unselected base columns and corner cases When a view's PK only contains the columns that form the base's PK, then the liveness of a particular view row is determined not only by the base row's marker, but also by the selected and, more importantly, unselected columns. This patch ensures that unselected columns are considered as much as possible, even though some limitations will still exist. In particular, we need to represent multiple timestamps (from all the unselected columns), but have only mechanisms to record a single timestamp. We also have some issues when dealing with selected column, and the way we currently delete them. Consider the following: create table cf (p int, c int, a int, b int, primary key (p, c)) create materialized view vcf as select a, b from cf where p is not null and c is not null primary key (p, c) 1) update cf using timestamp 10 set a = 1 where p = 1 and c = 1 2) delete a from cf using timestamp 11 where p = 1 and c = 1 3) update cf using timestamp 1 set a = 2 where p = 1 and c = 1 After 1), the MV should include a row with row marker @ ts10, p = 1, c = 1, a = 1. After 2), this row should be removed. At 3), we should add a row with row marker @ ts1, p = 1, c = 1, a = 1, with a lower timestamp. This means that the delete should not insert a row tombstone with timestamp @ 11, as we do now but it should just delete the view's row marker (which exists) with ts1. Refs #3362 Fixes #3140 Fixes #3361 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:02 +01:00
Duarte Nunes	67dac67c46	mutation_partition: Regular base column in view determines row liveness When views contain a primary key column that is not part of the base table primary key, that column determines whether the row is live or not. We need to ensure that when that cell is dead, and thus the derived row marker, either by normal deletion of by TTL, so is the rest of the row. This patch introduces the idea of shawdowing row marker. We map the status of the regular base column in the view's PK to the view row's marker. If this marker is dead, so is that cell in the base table, and so should the view row become. To enforce that, a view row's dead marker shadows the whole row if that view includes a base regular column in its PK. Fixes #3360 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:02 +01:00
Duarte Nunes	4dfce4d369	db/view: Don't avoid read-before-write when view PK matches base When a view's PK only contains the columns that form the base's PK, then the liveness of a particular view row is determined not only by the base row's marker, but also by the selected and, more importantly, unselected columns. When calculating the view's row marker we need to access those unselected columns, so we can't avoid the read-before-write as we were doing. Refs #3362 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:02 +01:00
Duarte Nunes	bd3cedd240	db/view: Process base updates to column unselected by its views When a view's PK only contains the columns that form the base's PK, then the liveness of a particular view row is determined not only by the base row's marker, but also by the selected and, more importantly, unselected columns. So, process base updates to columns unselected by any of its views. Refs #3362 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:02 +01:00
Duarte Nunes	ac9b93eb89	db/view: Consider partition tombstone when generating updates Not adding the partition tombstone to the current list of tombstones may cause updates to be incorrectly generated. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:02 +01:00
Duarte Nunes	164f043768	view_info: Add view_column() overload For when we already have the base's column_definition. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:02 +01:00
Duarte Nunes	31370fd7b1	view_info: Explicitly initialize base-dependent fields Instead of lazily-initializing the regular base column in the view's PK field, explicitly initialize it. This will be used by future patches that don't have access to the schema when wanting to obtain that column. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:02 +01:00
Duarte Nunes	17917e12ce	db/view: Wait for schema agreement in background upon view building Waiting for schema agreement in the foreground may cause the node to not boot in useful time. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180417125915.11262-1-duarte@scylladb.com>	2018-04-17 18:03:43 +03:00
Avi Kivity	9cef37e643	Merge "db/view: View building fixes" from Duarte " Fixes to the view building process, discovered from field experience. Tests: dtest(materialized_view_tests.py, smp=2) " * 'views/view-build-fixes/v1' of https://github.com/duarten/scylla: db/view: Start view building after schema agreement db/system_keyspace: scylla_views_builds_in_progress writes are user mem db/view: Require configuration option to enable view building	2018-04-03 17:42:21 +03:00
Duarte Nunes	ec8960df45	db/view: Reject view entries with non-composite, empty partition key Empty partition keys are not supported on normal tables - they cannot be inserted or queried (surprisingly, the rules for composite partition keys are different: all components are then allowed to be empty). However, the (non-composite) partition key of a view could end up being empty if that column is: a base table regular column, a base table clustering key column, or a base table partition key column, part of a composite key. Fixes #3262 Refs CASSANDRA-14345 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180403122244.10626-1-duarte@scylladb.com>	2018-04-03 15:25:52 +03:00
Duarte Nunes	d4db043f03	db/view: Start view building after schema agreement If a base table or view has been dropped in one node, but another one hasn't yet learned about it, it starts the view build process immediately on boot, possibly calculating unneeded view updates and causing errors at the view replica, if that replica has already processed the schema changes. We should thus wait for schema agreement, even if the node is a seed. Fixes #3328 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-03 13:16:28 +01:00
Duarte Nunes	11ece46f14	db/view: Remove leftover debug statement Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180402175238.5528-1-duarte@scylladb.com>	2018-04-03 09:41:33 +01:00
Duarte Nunes	a45fa8eaa2	db/view/view_builder: Allow synchronizing with the end of a build Intended for use by unit tests, this patch allows synchronizing with the end of a build for a particular view. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-03-27 01:20:11 +01:00
Duarte Nunes	5f822e3928	db/view/view_builder: Actually build views This patch adds the missing view building code to the eponymous class. We consume from the reader associated with each base table until all its views are built. If the reader reaches the end and there are incomplete views, then a view was added while others were being built. In such cases, we restart the reader to the beginning of the current token, but not to the beginning of the token range, when the view is added. Then, when we exhaust the reader, we simply create a new one for the whole token range, and resume building the pending views. We aim to be resource-conscious. On a given shard, at any given moment, we consume at most from one reader. We also strive for fairness, in that each build step inserts entries for the views of a different base. Each build step reads and generates updates for batch_size rows. We lack a controller, which could potentially allow us to go faster (to execute multiple steps at the same time, or consume more rows per batch), and also which would apply backpressure, so we could, for example, delay executing a build step. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-03-27 01:20:11 +01:00
Duarte Nunes	a21efeffa0	db/view/view_builder: React to schema changes The view_builder now uses the migration_manager to subscribe to schema change events, and update its bookkeeping accordingly. We prefer this to having the database call into the view_builder, as that would create a cyclic dependency. We serialize changes to the views of a particular base table, such that schema changes do not interfere with the upcoming view building code. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-03-27 01:20:11 +01:00
Duarte Nunes	901faabaa2	db/view: Introduce view_builder This patch introduces the view_builder class, a sharded service responsible for building all defined materialized views. This process entails walking over the existing data in a given base table, and using it to calculate and insert the respective entries for one or more views. This patch introduces only the bootstrap functionality, which is responsible for loading the data stored in the system tables and filling the in-memory data structures with the relevant information, to be used in subsequent patches for the actual view building. The interaction with the system tables is as follows. Interaction with the tables in system_keyspace: - When we start building a view, we add an entry to the scylla_views_builds_in_progress system table. If the node restarts at this point, we'll consider these newly inserted views as having made no progress, and we'll treat them as new views; - When we finish a build step, we update the progress of the views that we built during this step by writing the next token to the scylla_views_builds_in_progress table. If the node restarts here, we'll start building the views at the token in the next_token column. - When we finish building a view, we mark it as completed in the built views system table, and remove it from the in-progress system table. Under failure, the following can happen: * When we fail to mark the view as built, we'll redo the last step upon node reboot; * When we fail to delete the in-progress record, upon reboot we'll remove this record. A view is marked as completed only when all shards have finished their share of the work, that is, if a view is not built, then all shards will still have an entry in the in-progress system table; - A view that a shard finished building, but not all other shards, remains in the in-progress system table, with first_token == next_token. Interaction with the distributed system table (view_build_status): - When we start building a view, we mark the view build as being in-progress; - When we finish building a view, we mark the view as being built. Upon failure, we ensure that if the view is in the in-progress system table, then it may not have been written to this table. We don't load the built views from this table when starting. When starting, the following happens: * If the view is in the system.built_views table and not the in-progress system table, then it will be in view_build_status; * If the view is in the system.built_views table and not in this one, it will still be in the in-progress system table - we detect this and mark it as built in this table too, keeping the invariant; * If the view is in this table but not in system.built_views, then it will also be in the in-progress system table - we don't detect this and will redo the missing step, for simplicity. View building is necessarily a sharded process. That means that on restart, if the number of shards has changed, we need to calculate the most conservative token range that has been built, and build the remainder. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-03-27 01:20:10 +01:00

1 2

91 Commits