scylladb

Author	SHA1	Message	Date
Avi Kivity	fd1dd0eac7	Merge "Track the memory consumption of reader buffers" from Botond " The last major untracked area of the reader pipeline is the reader buffers. These scale with the number of readers as well as with the size and shape of data, so their memory consumption is unpredictable and varies wildly. For example, many small rows will trigger larger buffers allocated within the `circular_buffer<mutation_fragment>`, while few larger rows will consume a lot of external memory. This series covers this area by tracking the memory consumption of both the buffer and its content. This is achieved by passing a tracking allocator to `circular_buffer<mutation_fragment>` so that each allocation it makes is tracked. Additionally, we now track the memory consumption of each and every mutation fragment through its whole lifetime. Initially I contemplated just tracking the `_buffer_size` of `flat_mutation_reader::impl`, but concluded that as our reader trees are typically quite deep, this would result in a lot of unnecessary `signal()`/`consume()` calls, whose number scales with the number of mutation fragments and hence adds to the already considerable per mutation fragment overhead. The solution chosen in this series is to instead track the memory consumption of the individual mutation fragments, with the observation that these are typically always moved and very rarely copied, so the number of `signal()`/`consume()` calls will be minimal. This additional tracking introduces an interesting dilemma however: readers will now have significant memory on their account even before being admitted. So it may happen that they can prevent their own admission via this memory consumption. To prevent this, memory consumption is only forwarded to the semaphore upon admission. This might be solved when the semaphore is moved to the front -- before the cache. 
Another consequence of this additional, more complete tracking is that evictable readers now consume memory even when the underlying reader is evicted. So it may happen that even though no reader is currently admitted, all memory is consumed from the semaphore. To prevent any such deadlocks, the semaphore now admits a reader unconditionally if no reader is admitted -- that is, if all count resources are available. Refs: #4176 Tests: unit(dev, debug, release) " * 'track-reader-buffers/v2' of https://github.com/denesb/scylla: (37 commits) test/manual/sstable_scan_footprint_test: run test body in statement sched group test/manual/sstable_scan_footprint_test: move test main code into separate function test/manual/sstable_scan_footprint_test: sprinkle some thread::maybe_yield():s test/manual/sstable_scan_footprint_test: make clustering row size configurable test/manual/sstable_scan_footprint_test: document sstable related command line arguments mutation_fragment_test: add exception safety test for mutation_fragment::mutate_as_*() test: simple_schema: add make_static_row() reader_permit: reader_resources: add operator== mutation_fragment: memory_usage(): remove unused schema parameter mutation_fragment: track memory usage through the reader_permit reader_permit: resource_units: add permit() and resources() accessors mutation_fragment: add schema and permit partition_snapshot_row_cursor: row(): return clustering_row instead of mutation_fragment mutation_fragment: remove as_mutable_end_of_partition() mutation_fragment: s/as_mutable_partition_start/mutate_as_partition_start/ mutation_fragment: s/as_mutable_range_tombstone/mutate_as_range_tombstone/ mutation_fragment: s/as_mutable_clustering_row/mutate_as_clustering_row/ mutation_fragment: s/as_mutable_static_row/mutation_as_static_row/ flat_mutation_reader: make _buffer a tracked buffer mutation_reader: extract the two fill_buffer_result into a single one ...	2020-09-29 16:08:16 +03:00
Eliran Sinvani	925cdc9ae1	consistency level: fix wrong quorum calculation when RF = 0 We used to calculate the number of endpoints for quorum and local_quorum unconditionally as ((rf / 2) + 1). This formula doesn't take into account the corner case where RF = 0; in this situation quorum should also be 0. This commit adds the missing corner case. Tests: Unit Tests (dev) Fixes #6905 Closes #7296	2020-09-29 13:25:41 +03:00
Piotr Sarna	4b856cf62d	transport: make max_concurrent_requests_per_shard reloadable This configuration entry is expected to be used as a quick fix for an overloaded node, so it should be possible to reload this value without having to restart the server.	2020-09-29 10:11:36 +02:00
Piotr Sarna	b4db6d2598	transport,config: add a param for max request concurrency The newly introduced parameter - max_concurrent_requests_per_shard - can be used to limit the number of in-flight requests a single coordinator shard can handle. Each surplus request will be immediately refused by returning OverloadedException error to the client. The default value for this parameter is large enough to never actually shed any requests. Currently, the limit is only applied to CQL requests - other frontends like alternator and redis are not throttled yet.	2020-09-29 09:59:30 +02:00
Botond Dénes	6ca0464af5	mutation_fragment: add schema and permit We want to start tracking the memory consumption of mutation fragments. For this we need schema and permit during construction, and on each modification, so the memory consumption can be recalculated and passed to the permit. In this patch we just add the new parameters and go through the insane churn of updating all call sites. They will be used in the next patch.	2020-09-28 11:27:23 +03:00
Botond Dénes	4f5ccf82cb	mutation_fragment: s/as_mutable_clustering_row/mutate_as_clustering_row/ We will soon want to update the memory consumption of a mutation fragment after each modification done to it. To do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_clustering_row() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	3fab83b3a1	flat_mutation_reader: impl: add reader_permit parameter Not used yet, this patch does all the churn of propagating a permit to each impl. In the next patch we will use it to track the memory consumption of `_buffer`.	2020-09-28 10:53:48 +03:00
Piotr Dulikowski	39771967bb	hinted handoff: fix race - decommission vs. endpoint mgr init This patch fixes a race between two methods in hints manager: drain_for and store_hint. The first method is called when a node leaves the cluster, and it 'drains' end point hints manager for that node (sends out all hints for that node). If this method is called when the local node is being decommissioned or removed, it instead drains hints managers for all endpoints. In the case of decommission/remove, drain_for first calls parallel_for_each on all current ep managers and tells them to drain their hints. Then, after all of them complete, _ep_managers.clear() is called. End point hints managers are created lazily and inserted into _ep_managers map the first time a hint is stored for that node. If this happens between parallel_for_each and _ep_managers.clear() described above, the clear operation will destroy the new ep manager without draining it first. This is a bug and will trigger an assert in ep manager's destructor. To solve this, a new flag for the hints manager is added which is set when it drains all ep managers on removenode/decommission, and prevents further hints from being written. Fixes #7257 Closes #7278	2020-09-24 14:51:24 +03:00
Avi Kivity	844b675520	view: view_update_generator: drop references to sstables when stopping sstable_manager will soon wait for all sstables under its control to be deleted (if so marked), but that can't happen if someone is holding on to references to those sstables. To allow sstables_manager::stop() to work, drop remaining queued work when terminating.	2020-09-23 20:55:02 +03:00
Avi Kivity	a0ffcabd66	view: use nonwrapping_interval instead of nonwrapping_range to avoid clang deduction failure We use class template argument deduction (CTAD) in a few places, but it appears not to work for alias templates in clang. While it looks like a clang bug, using the class name is an improvement, so let's do that.	2020-09-21 16:32:53 +03:00
Pavel Solodovnikov	6e10f2b530	schema_registry: make grace period configurable Introduce new database config option `schema_registry_grace_period` describing the amount of time in seconds after which unused schema versions will be cleaned up from the schema registry cache. Default value is 1 second, the same value as was hardcoded before. Tests: unit(debug) Refs: #7225 Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200915131957.446455-1-pa.solodovnikov@scylladb.com>	2020-09-15 17:53:27 +02:00
Avi Kivity	dcaf4ea4dd	Merge "Fix race in schema version recalculation leading to stale schema version in gossip" from Tomasz " Migration manager installs several cluster feature change listeners. The listeners will call update_schema_version_and_announce() when cluster features are enabled, which does this: return update_schema_version(proxy, features).then([] (utils::UUID uuid) { return announce_schema_version(uuid); }); It first updates the schema version and then publishes it via gossip in announce_schema_version(). It is possible that the announce_schema_version() part of the first schema change will be deferred and will execute after the other four calls to update_schema_version_and_announce(). It will install the old schema version in gossip instead of the more recent one. The fix is to serialize schema digest calculation and publishing. Refs #7200 This problem also brought my attention to initialization code, which could be prone to the same problem. The storage service computes gossiper states before it starts the gossiper. Among them, node's schema version. There are two problems with that. First is that computing the schema version and publishing it is not atomic, so is not safe against concurrent schema changes or schema version recalculations. It will not exclude with recalculate_schema_version() calls, and we could end up with the old (and incorrect) schema version being advertised in gossip. Second problem is that we should not allow the database layer to call into the gossiper layer before it is fully initialized, as this may produce undefined behavior. Maybe we're not doing concurrent schema changes/recalculations now, but it is easy to imagine that this could change for whatever reason in the future. The solution for both problems is to break the cyclic dependency between the database layer and the storage_service layer by having the database layer not use the gossiper at all. 
The database layer publishes schema version inside the database class and allows installing listeners on changes. The storage_service layer asks the database layer for the current version when it initializes, and only after that installs a listener which will update the gossiper. Tests: - unit (dev) - manual (3 node ccm) " * tag 'fix-schema-digest-calculation-race-v1' of github.com:tgrabiec/scylla: db, schema: Hide update_schema_version_and_announce() db, storage_service: Do not call into gossiper from the database layer db: Make schema version observable utils: updateable_value_source: Introduce as_observable() schema: Fix race in schema version recalculation leading to stale schema version in gossip	2020-09-14 12:37:46 +03:00
Tomasz Grabiec	691009bc1e	db, schema: Hide update_schema_version_and_announce()	2020-09-11 14:42:48 +02:00
Tomasz Grabiec	9f58dcc705	db, storage_service: Do not call into gossiper from the database layer The storage service computes gossiper states before it starts the gossiper. Among them, node's schema version. There are two problems with that. First is that computing the schema version and publishing it is not atomic, so is not safe against concurrent schema changes or schema version recalculations. It will not exclude with recalculate_schema_version() calls, and we could end up with the old (and incorrect) schema version being advertised in gossip. Second problem is that we should not allow the database layer to call into the gossiper layer before it is fully initialized, as this may produce undefined behavior. The solution for both problems is to break the cyclic dependency between the database layer and the storage_service layer by having the database layer not use the gossiper at all. The database layer publishes schema version inside the database class and allows installing listeners on changes. The storage_service layer asks the database layer for the current version when it initializes, and only after that installs a listener which will update the gossiper. This also allows us to drop unsafe functions like update_schema_version().	2020-09-11 14:42:41 +02:00
Tomasz Grabiec	1a57d641d1	schema: Fix race in schema version recalculation leading to stale schema version in gossip Migration manager installs several feature change listeners: if (this_shard_id() == 0) { _feature_listeners.push_back(_feat.cluster_supports_view_virtual_columns().when_enabled(update_schema)); _feature_listeners.push_back(_feat.cluster_supports_digest_insensitive_to_expiry().when_enabled(update_schema)); _feature_listeners.push_back(_feat.cluster_supports_cdc().when_enabled(update_schema)); _feature_listeners.push_back(_feat.cluster_supports_per_table_partitioners().when_enabled(update_schema)); } They will call update_schema_version_and_announce() when features are enabled, which does this: return update_schema_version(proxy, features).then([] (utils::UUID uuid) { return announce_schema_version(uuid); }); So it first updates the schema version and then publishes it via gossip in announce_schema_version(). It is possible that the announce_schema_version() part of the first schema change will be deferred and will execute after the other four calls to update_schema_version_and_announce(). It will install the old schema version in gossip instead of the more recent one. The fix is to serialize schema digest calculation and publishing. Refs #7200	2020-09-11 14:40:28 +02:00
Piotr Grabowski	462d12f555	db: Propagate enable_cache to system keyspaces Make enable_cache configuration option also affect caching of system keyspaces. Fixes #2909.	2020-09-07 17:54:46 +03:00
Avi Kivity	64c7c81bac	Merge "Update log messages to {fmt} rules" from Pavel E " Before seastar is updated with the {fmt} engine under the logging hood, some changes are to be made in scylla to conform to {fmt} standards. Compilation and tests checked against both -- old (current) and new seastar-s. tests: unit(dev), manual " * 'br-logging-update' of https://github.com/xemul/scylla: code: Force formatting of pointer in .debug and .trace code: Format { and } as {fmt} needs streaming: Do not reveal raw pointer in info message mp_row_consumer: Provide hex-formatting wrapper for bytes_view heat_load_balance: Include fmt/ranges.h	2020-09-03 15:10:09 +03:00
Kamil Braun	ff78a3c332	cdc: rename CDC description tables... again Commit `a6ad70d3da` changed the format of stream IDs: the lower 8 bytes were previously generated randomly, now some of them have semantics. In particular, the least significant byte contains a version (stream IDs might evolve with further releases). This is a backward-incompatible change: the code won't properly handle stream IDs with all lower 8 bytes generated randomly. To protect us from subtle bugs, the code has an assertion that checks the stream ID's version. This means that if an experimental user used CDC before the change and then upgraded, they might hit the assertion when a node attempts to retrieve a CDC generation with old stream IDs from the CDC description tables and then decode it. In effect, the user won't even be able to start a node. Similarly as with the case described in `d89b7a0548`, the simplest fix is to rename the tables. This fix must get merged in before CDC goes out of experimental. Now, if the user upgrades their cluster from a pre-rename version, the node will simply complain that it can't obtain the CDC generation instead of preventing the cluster from working. The user will be able to use CDC after running checkAndRepairCDCStreams. Since a new table is added to the system_distributed keyspace, the cluster's schema has changed, so sstables and digests need to be regenerated for schema_digest_test.	2020-08-31 11:33:14 +03:00
Rafael Ávila de Espíndola	d18af34205	everywhere: Use future::get0 when appropriate This works with current seastar and clears most of the way for updating to a version that doesn't use std::tuple in futures. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200826231947.1145890-1-espindola@scylladb.com>	2020-08-27 15:05:51 +03:00
Piotr Sarna	ca9422ca73	Merge 'Fix view_builder lockup and crash on shutdown' from Pavel The lockup: When view_builder starts all shards at some point get to a barrier waiting for each other to pass. If any shard misses this checkpoint, all others get stuck forever. As this barrier lives inside the _started future, which in turn is waited on stop, the stop gets stuck as well. Reasons to miss the barrier -- exception in the middle of the fun^w start or explicit abort request while waiting for the schema agreement. Fix the "exception" case by unlocking the barrier promise with exception and fix the "abort request" case by turning it into an exception. The bug can be reproduced by hand by making one shard never see the schema agreement and continue looping until the abort request. The crash: If the background start up fails, then the _started future is resolved into exception. The view_builder::stop then turns this future into a real exception caught-and-rethrown by main.cc. It seems wrong that a failure in a background fiber aborts the regular shutdown that may proceed otherwise. tests: unit(dev), manual start-stop branch: https://github.com/xemul/scylla/tree/br-view-builder-shutdown-fix-3 fixes: #7077 Patch #5 leaves the seastar::async() in the 1-st phase of the start() although can also be tuned not to produce a thread. However, there's one more (painless) issue with the _sem usage, so this change appears too large for the part of the bug-fix and will come as a followup. 
* 'br-view-builder-shutdown-fix-3' of git://github.com/xemul/scylla: view_builder: Add comment about builder instances life-times view_builder: Do sleep abortable view_builder: Wakeup barrier on exception view_builder: Always resolve started future to success view_builder: Re-futurize start view_builder: Split calculate_shard_build_step into two view_builder: Populate the view_builder_init_state view_builder: Fix indentation after previous patch view_builder: Introduce view_builder_init_state	2020-08-27 11:51:46 +02:00
Pavel Emelyanov	812eed27fe	code: Force formatting of pointer in .debug and .trace ... and tests. Printing a pointer in logs is considered to be a bad practice, so the proposal is to keep this explicit (with fmt::ptr) and allow it for .debug and .trace cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Pavel Emelyanov	366b4e8a8f	code: Format { and } as {fmt} needs There are two places that want to print "{<text>}" strings, but do not format the curly braces the {fmt}-way. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Pavel Emelyanov	fe33e3ed78	heat_load_balance: Include fmt/ranges.h To provide vector<> formatter for {fmt} Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:08 +03:00
Avi Kivity	3daa49f098	Merge "materialized views: Fix undefined behavior on base table schema changes" from Tomasz " The view_info object, which is attached to the schema object of the view, contains a data structure called "base_non_pk_columns_in_view_pk". This data structure contains column ids of the base table so is valid only for a particular version of the base table schema. This data structure is used by materialized view code to interpret mutations of the base table, those coming from base table writes, or reads of the base table done as part of view updates or view building. The base table schema version of that data structure must match the schema version of the mutation fragments, otherwise we hit undefined behavior. This may include aborts, exceptions, segfaults, or data corruption (e.g. writes landing in the wrong column in the view). Before this patch, we could get schema version mismatch here after the base table was altered. That's because the view schema did not change when the base table was altered. Another problem was that view building was using the current table's schema to interpret the fragments and invoke view building. That's incorrect for two reasons. First, fragments generated by a reader must be accessed only using the reader's schema. Second, base_non_pk_columns_in_view_pk of the recorded view ptrs may no longer match the current base table schema, which is used to generate the view updates. Part of the fix is to extract base_non_pk_columns_in_view_pk into a third entity called base_dependent_view_info, which changes both on base table schema changes and view schema changes. It is managed by a shared pointer so that we can take immutable snapshots of it, just like with schema_ptr. When starting the view update, the base table schema_ptr and the corresponding base_dependent_view_info have to match. So we must obtain them atomically, and base_dependent_view_info cannot change during update. 
Also, whenever the base table schema changes, we must update base_dependent_view_infos of all attached views (atomically) so that it matches the base table schema. Fixes #7061. Tests: - unit (dev) - [v1] manual (reproduced using scylla binary and cqlsh) " * tag 'mv-schema-mismatch-fix-v2' of github.com:tgrabiec/scylla: db: view: Refactor view_info::initialize_base_dependent_fields() tests: mv: Test dropping columns from base table db: view: Fix incorrect schema access during view building after base table schema changes schema: Call on_internal_error() when out of range id is passed to column_at() db: views: Fix undefined behavior on base table schema changes db: views: Introduce has_base_non_pk_columns_in_view_pk()	2020-08-26 17:37:52 +03:00
Pavel Emelyanov	cf1cb4d145	view_builder: Add comment about builder instances life-times The barrier passing is tricky and deserves a description about objects' life-times. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 15:56:38 +03:00
Pavel Emelyanov	643c431ce4	view_builder: Do sleep abortable If one shard delays in seeing the schema agreement and returns on abort request, other shards may get stuck waiting for it on the status read barrier. Luckily with the previous patch the barrier is exception-proof, so we may abort the waiting loop with exception and handle the lock-up. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 15:56:38 +03:00
Pavel Emelyanov	c36bbc37c9	view_builder: Wakeup barrier on exception If an exception pops up during the view_builder::start while some shards wait for the status-read barrier, these shards are not woken up, thus causing the shutdown to get stuck. Fix this by setting exception on the barrier promise, resolving all pending and on-going futures. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 15:56:38 +03:00
Pavel Emelyanov	8f8ed625ab	view_builder: Always resolve started future to success If the view builder background start fails, the _started future resolves to exceptional state. In turn, stopping the view builder keeps this state through .finally() and aborts the shutdown very early, while it may and should proceed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 15:56:38 +03:00
Pavel Emelyanov	60e21bb59a	view_builder: Re-futurize start Step two turning the view_builder::start() into a chain of lambdas -- rewrite (most of) the seastar::async()'s lambda into a more "classical" form. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 15:56:38 +03:00
Pavel Emelyanov	77c7d94f85	view_builder: Split calculate_shard_build_step into two The calculate_shard_build_step() has a cross-shard barrier in the middle and passing the barrier is broken wrt exceptions that may happen before it. The intention is to prepare this barrier passing for exception handling by turning the view_builder::start() into a dedicated continuation lambda. Step one in this campaign -- split the calculate_shard_build_step() into steps called by view_builder::start(): - before the barrier - barrier - after the barrier Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 15:56:38 +03:00
Pavel Emelyanov	fe0326b75b	view_builder: Populate the view_builder_init_state Keep the internal calculate_shard_build_step()'s stuff on the init helper struct, as the method in question is about to be split into a chain of continuation lambdas. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 15:56:35 +03:00
Pavel Emelyanov	2d2d04c6b7	view_builder: Fix indentation after previous patch No functional changes. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 15:46:36 +03:00
Pavel Emelyanov	d0393d92a2	view_builder: Introduce view_builder_init_state This is the helper initialization struct that will carry the needed objects across continuation lambdas. The indentation in ::start() will be fixed in the next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 15:45:15 +03:00
Botond Dénes	0a8cc4c2b5	db/size_estimates_virtual_reader: remove redundant _schema member This reader was probably created in ancient times, when readers didn't yet have a _schema member of their own. But now that they do, it is not necessary to store the schema in the reader implementation, there is one available in the parent class. While at it also move the schema into the class when calling the constructor. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Reviewed-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200821070358.33937-1-bdenes@scylladb.com>	2020-08-22 20:47:49 +03:00
Tomasz Grabiec	c44455d514	Merge "Miscellaneous schema code cleanups" from Rafael	2020-08-20 15:19:42 +02:00
Rafael Ávila de Espíndola	33669bd21d	commitlog: Use try_with_gate Now that we have try_with_gate we can use instead of futurize_invoke and with_gate. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200819191334.74108-1-espindola@scylladb.com>	2020-08-20 15:19:42 +02:00
Tomasz Grabiec	cf12b5e537	db: view: Refactor view_info::initialize_base_dependent_fields() It is no longer called once for a given view_info, so the name "initialize" is not appropriate. This patch splits the "initialize" method into the "make" part, which makes a new base_info object, and the "set" part, which changes the current base_info object attached to the view.	2020-08-20 14:53:07 +02:00
Tomasz Grabiec	f8df214836	db: view: Fix incorrect schema access during view building after base table schema changes The view building process was accessing mutation fragments using current table's schema. This is not correct, fragments must be accessed using the schema of the generating reader. This could lead to undefined behavior when the column set of the base table changes. out_of_range exceptions could be observed, or data in the view ending up in the wrong column. Refs #7061. The fix has two parts. First, we always use the reader's schema to access fragments generated by the reader. Second, when calling populate_views() we upgrade the fragment-wrapping reader's schema to the base table schema so that it matches the base table schema of view_and_base snapshots passed to populate_views().	2020-08-20 14:53:07 +02:00
Tomasz Grabiec	3a6ec9933c	db: views: Fix undefined behavior on base table schema changes The view_info object, which is attached to the schema object of the view, contains a data structure called "base_non_pk_columns_in_view_pk". This data structure contains column ids of the base table so is valid only for a particular version of the base table schema. This data structure is used by materialized view code to interpret mutations of the base table, those coming from base table writes, or reads of the base table done as part of view updates or view building. The base table schema version of that data structure must match the schema version of the mutation fragments, otherwise we hit undefined behavior. This may include aborts, exceptions, segfaults, or data corruption (e.g. writes landing in the wrong column in the view). Before this patch, we could get schema version mismatch here after the base table was altered. That's because the view schema does not change when the base table is altered. Part of the fix is to extract base_non_pk_columns_in_view_pk into a third entity called base_dependent_view_info, which changes both on base table schema changes and view schema changes. It is managed by a shared pointer so that we can take immutable snapshots of it, just like with schema_ptr. When starting the view update, the base table schema_ptr and the corresponding base_dependent_view_info have to match. So we must obtain them atomically, and base_dependent_view_info cannot change during update. Also, whenever the base table schema changes, we must update base_dependent_view_infos of all attached views (atomically) so that it matches the base table schema. Refs #7061.	2020-08-20 14:53:07 +02:00
Tomasz Grabiec	dc18117b82	db: views: Introduce has_base_non_pk_columns_in_view_pk() In preparation for pushing _base_non_pk_columns_in_view_pk deeper.	2020-08-20 14:53:07 +02:00
Rafael Ávila de Espíndola	6363716799	schema: Pass an rvalue to set_compaction_strategy_options This produces less code and makes sure every caller moves the value. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:02:35 -07:00
Pavel Emelyanov	78298ec776	init: Use local messaging reference in main There are few places that initialize db and system_ks and need the messaging service. Pass the reference to it from main instead of using the global helpers. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-19 13:08:12 +03:00
Botond Dénes	22a6493716	view_update_generator: fix race between registering and processing sstables `fea83f6` introduced a race between processing (and hence removing) sstables from `_sstables_with_tables` and registering new ones. This manifested in sstables that were added concurrently with processing a batch for the same sstables being dropped and the semaphore units associated with them not returned. This resulted in repairs being blocked indefinitely as the units of the semaphore were effectively leaked. This patch fixes this by moving the contents of `_sstables_with_tables` to a local variable before starting the processing. A unit test reproducing the problem is also added. Fixes: #6892 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200817160913.2296444-1-bdenes@scylladb.com>	2020-08-18 10:22:35 +03:00
Pavel Solodovnikov	9aa4712270	lwt: introduce `paxos_grace_seconds` per-table option to set paxos ttl Previously system.paxos TTL was set as max(3h, gc_grace_seconds). Introduce new per-table option named `paxos_grace_seconds` to set the number of seconds used as the TTL for data in paxos tables when using LWT queries against the base table. Default value is equal to `DEFAULT_GC_GRACE_SECONDS`, which is 10 days. This change makes it easy to test various issues related to paxos TTL. Fixes #6284 Tests: unit (dev, debug) Co-authored-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200816223935.919081-1-pa.solodovnikov@scylladb.com>	2020-08-17 16:44:14 +02:00
Piotr Jastrzebski	01ea159fde	codebase wide: use try_emplace when appropriate C++17 introduced try_emplace for maps to replace a pattern: if(element not in a map) { map.emplace(...) } try_emplace is more efficient and results in more concise code. This commit introduces usage of try_emplace when it's appropriate. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <4970091ed770e233884633bf6d46111369e7d2dd.1597327358.git.piotr@scylladb.com>	2020-08-16 14:41:09 +03:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously `count` function was often used in various ways. `contains` not only expresses the intent of the code better but also does it in a more unified way. This commit replaces all the occurrences of the `count` with the `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Nadav Har'El	8135647906	merge: Add metrics to semaphores Merged pull request https://github.com/scylladb/scylla/pull/7018 by Piotr Sarna: This series addresses various issues with metrics and semaphores - it mainly adds missing metrics, which makes it possible to see the length of the queues attached to the semaphores. In case of view building and view update generation, metrics were not present in these services at all, so a first, basic implementation is added. More precise semaphore metrics would ease the testing and development of load shedding and admission control. view_builder: add metrics db, view: add view update generator metrics hints: track resource_manager sending queue length hints: add drain queue length to metrics table: add metrics for sstable deletion semaphore database: remove unused semaphore	2020-08-12 12:39:59 +03:00
Piotr Sarna	5086a5ca32	view_builder: add metrics The view builder service lacked metrics, so a basic set of them is added.	2020-08-11 17:43:53 +02:00
Piotr Sarna	e4d78b60ff	db, view: add view update generator metrics The view update generator completely lacked metrics, so a basic set of them is now exposed.	2020-08-11 17:43:53 +02:00
Piotr Sarna	180a1505fd	hints: track resource_manager sending queue length The number of tasks waiting for a hint to be sent is now tracked.	2020-08-11 17:43:53 +02:00
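
Commit 925cdc9ae1 above describes the quorum formula ((rf / 2) + 1) and its missing RF = 0 corner case. A minimal sketch of the corrected calculation follows; the function name `quorum_for` is hypothetical and is not taken from the ScyllaDB source.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not ScyllaDB's actual function): number of replicas
// that must respond for QUORUM, given a replication factor `rf`.
constexpr std::size_t quorum_for(std::size_t rf) {
    // With RF = 0 there are no replicas, so zero responses are required.
    // Otherwise a majority is needed: floor(rf / 2) + 1.
    return rf == 0 ? 0 : (rf / 2) + 1;
}
```

Without the special case, the formula would demand one response from a keyspace (or datacenter) that has no replicas at all.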
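
Commit 01ea159fde above replaces the "check, then emplace" pattern with C++17 `try_emplace`. The sketch below shows the idiom on a standard `std::map`; the helper `register_key` is a hypothetical name used only for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: make sure `key` has an entry (starting at 0), bump its
// counter, and report whether the entry was newly created.
inline bool register_key(std::map<std::string, int>& counts, const std::string& key) {
    // try_emplace performs a single lookup; if the key already exists the
    // existing value is left untouched and `inserted` is false.
    auto [it, inserted] = counts.try_emplace(key, 0);
    ++it->second;
    return inserted;
}
```

The older pattern needs one lookup for the presence check and a second one for the insertion, which is the inefficiency the commit message refers to.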
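
Commit 22a6493716 above fixes a race by moving the contents of `_sstables_with_tables` to a local variable before processing starts. A simplified, single-threaded sketch of that idea, using a hypothetical `pending_queue` type with plain integers in place of sstables, might look like this:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the generator's pending-work container.
struct pending_queue {
    std::vector<int> items;

    void register_item(int id) { items.push_back(id); }

    // Detach the current batch and leave the member empty. Anything registered
    // while the batch is being processed lands in the member and is picked up
    // by the next call instead of being dropped with the processed batch.
    std::vector<int> take_batch() {
        return std::exchange(items, {});
    }
};
```

Because the batch is owned by the caller after `take_batch()`, clearing it when processing finishes cannot discard entries that arrived in the meantime.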
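
The merge fd1dd0eac7 at the top of this log tracks buffer memory by passing a tracking allocator to the reader buffer. The sketch below illustrates the general technique with `std::vector` and a hypothetical `memory_account` counter; it does not use Seastar's `circular_buffer` or ScyllaDB's `reader_permit`, whose interfaces are not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical byte counter standing in for a permit or semaphore account.
struct memory_account {
    std::size_t bytes = 0;
};

// Minimal allocator that reports every allocation to a memory_account.
template <typename T>
class tracking_allocator {
public:
    using value_type = T;

    explicit tracking_allocator(memory_account& acc) noexcept : _acc(&acc) {}
    template <typename U>
    tracking_allocator(const tracking_allocator<U>& o) noexcept : _acc(o.account()) {}

    T* allocate(std::size_t n) {
        T* p = std::allocator<T>{}.allocate(n);
        _acc->bytes += n * sizeof(T);   // account only after a successful allocation
        return p;
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>{}.deallocate(p, n);
        _acc->bytes -= n * sizeof(T);
    }
    memory_account* account() const noexcept { return _acc; }

private:
    memory_account* _acc;
};

template <typename T, typename U>
bool operator==(const tracking_allocator<T>& a, const tracking_allocator<U>& b) noexcept {
    return a.account() == b.account();
}
template <typename T, typename U>
bool operator!=(const tracking_allocator<T>& a, const tracking_allocator<U>& b) noexcept {
    return !(a == b);
}
```

A container declared as `std::vector<int, tracking_allocator<int>>` then charges its storage to the account for as long as the storage is alive, which mirrors the "each allocation it makes is tracked" behaviour the merge message describes.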

1832 Commits