scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-08 07:53:20 +00:00

Author	SHA1	Message	Date
Botond Dénes	0a8cc4c2b5	db/size_estimates_virtual_reader: remove redundant _schema member This reader was probably created in ancient times, when readers didn't yet have a _schema member of their own. But now that they do, it is not necessary to store the schema in the reader implementation, there is one available in the parent class. While at it also move the schema into the class when calling the constructor. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Reviewed-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200821070358.33937-1-bdenes@scylladb.com>	2020-08-22 20:47:49 +03:00
Tomasz Grabiec	c44455d514	Merge "Miscellaneous schema code cleanups" from Rafael	2020-08-20 15:19:42 +02:00
Rafael Ávila de Espíndola	33669bd21d	commitlog: Use try_with_gate Now that we have try_with_gate we can use instead of futurize_invoke and with_gate. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200819191334.74108-1-espindola@scylladb.com>	2020-08-20 15:19:42 +02:00
Rafael Ávila de Espíndola	6363716799	schema: Pass an rvalue to set_compaction_strategy_options This produces less code and makes sure every caller moves the value. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-08-19 14:02:35 -07:00
Pavel Emelyanov	78298ec776	init: Use local messaging reference in main There are few places that initialize db and system_ks and need the messaging service. Pass the reference to it from main instead of using the global helpers. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-19 13:08:12 +03:00
Botond Dénes	22a6493716	view_update_generator: fix race between registering and processing sstables `fea83f6` introduced a race between processing (and hence removing) sstables from `_sstables_with_tables` and registering new ones. This manifested in sstables that were added concurrently with processing a batch for the same sstables being dropped and the semaphore units associated with them not returned. This resulted in repairs being blocked indefinitely as the units of the semaphore were effectively leaked. This patch fixes this by moving the contents of `_sstables_with_tables` to a local variable before starting the processing. A unit test reproducing the problem is also added. Fixes: #6892 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200817160913.2296444-1-bdenes@scylladb.com>	2020-08-18 10:22:35 +03:00
Pavel Solodovnikov	9aa4712270	lwt: introduce `paxos_grace_seconds` per-table option to set paxos ttl Previously system.paxos TTL was set as max(3h, gc_grace_seconds). Introduce new per-table option named `paxos_grace_seconds` to set the amount of seconds which are used to TTL data in paxos tables when using LWT queries against the base table. Default value is equal to `DEFAULT_GC_GRACE_SECONDS`, which is 10 days. This change allows to easily test various issues related to paxos TTL. Fixes #6284 Tests: unit (dev, debug) Co-authored-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200816223935.919081-1-pa.solodovnikov@scylladb.com>	2020-08-17 16:44:14 +02:00
Piotr Jastrzebski	01ea159fde	codebase wide: use try_emplace when appropriate C++17 introduced try_emplace for maps to replace a pattern: if(element not in a map) { map.emplace(...) } try_emplace is more efficient and results in a more concise code. This commit introduces usage of try_emplace when it's appropriate. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <4970091ed770e233884633bf6d46111369e7d2dd.1597327358.git.piotr@scylladb.com>	2020-08-16 14:41:09 +03:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously `count` function was often used in various ways. `contains` does not only express the intend of the code better but also does it in more unified way. This commit replaces all the occurences of the `count` with the `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Nadav Har'El	8135647906	merge: Add metrics to semaphores Merged pull request https://github.com/scylladb/scylla/pull/7018 by Piotr Sarna: This series addresses various issues with metrics and semaphores - it mainly adds missing metrics, which makes it possible to see the length of the queues attached to the semaphores. In case of view building and view update generation, metrics was not present in these services at all, so a first, basic implementation is added. More precise semaphore metrics would ease the testing and development of load shedding and admission control. view_builder: add metrics db, view: add view update generator metrics hints: track resource_manager sending queue length hints: add drain queue length to metrics table: add metrics for sstable deletion semaphore database: remove unused semaphore	2020-08-12 12:39:59 +03:00
Piotr Sarna	5086a5ca32	view_builder: add metrics The view builder service lacked metrics, so a basic set of them is added.	2020-08-11 17:43:53 +02:00
Piotr Sarna	e4d78b60ff	db, view: add view update generator metrics The view update generator completely lacked metrics, so a basic set of them is now exposed.	2020-08-11 17:43:53 +02:00
Piotr Sarna	180a1505fd	hints: track resource_manager sending queue length The number of tasks waiting for a hint to be sent is now tracked.	2020-08-11 17:43:53 +02:00
Piotr Sarna	58a9fa7d2e	hints: add drain queue length to metrics The number of tasks waiting for a drain is now tracked.	2020-08-11 17:43:53 +02:00
Avi Kivity	d36601a838	Merge 'Make commitlog respect disk limit better' from Calle " Refs #6148 Separates disk usage into two cases: Allocated and used. Since we use both reserve and recycled segments, both which are not actually filled with anything at the point of waiting. Also refuses to recycle segments or increase reserve size if our current disk footprint exceeds threshold. And finally uses some initial heuristics to determine when we should suggest flushing, based on disk limit, segment size, and current usage. Right now, when we only have a half segment left before hitting used == max. Some initial tests show an improved adherence to limit though it will still be exceeded, because we do _not_ force waiting for segments to become cleared or similar if we need to add data, thus slow flushing can still make usage create extra segments. We will however attempt to shrink disk usage when load is lighter. Somewhat unclear how much this impacts performance with tight limits, and how much this matters. " * elcallio-calle/commitlog_size: commitlog: Make commitlog respect disk limit better commitlog: Demote buffer write log messages to trace	2020-08-11 15:03:32 +03:00
Calle Wilund	5d044ab74e	commitlog: Make commitlog respect disk limit better Refs #6148 Separates disk usage into two cases: Allocated and used. Since we use both reserve and recycled segments, both which are not actually filled with anything at the point of waiting. Also refuses to recycle segments or increase reserve size if our current disk footprint exceeds threshold. And finally uses some initial heuristics to determine when we should suggest flushing, based on disk limit, segment size, and current usage. Right now, when we only have a half segment left before hitting used == max. Some initial tests show an improved adherence to limit though it will still be exceeded, because we do _not_ force waiting for segments to become cleared or similar if we need to add data, thus slow flushing can still make usage create extra segments. We will however attempt to shrink disk usage when load is lighter. Somewhat unclear how much this impacts performance with tight limits, and how much this matters. v2: * Add some comments/explanations v3: * Made disk footprint subtract happen post delete (non-optimistic)	2020-08-11 10:40:56 +00:00
Avi Kivity	3530e80ce1	Merge "Support md format" from Benny " This series adds support for the "md" sstable format. Support is based on the following: * do not use clustering based filtering in the presence of static row, tombstones. * Disabling min/max column names in the metadata for formats older than "md". * When updating the metadata, reset and disable min/max in the presence of range tombstones (like Cassandra does and until we process them accurately). * Fix the way we maintain min/max column names by: keeping whole clustering key prefixes as min/max rather than calculating min/max independently for each component, like Cassandra does in the "md" format. Fixes #4442 Tests: unit(dev), cql_query_test -t test_clustering_filtering* (debug) md migration_test dtest from git@github.com:bhalevy/scylla-dtest.git migration_test-md-v1 " * tag 'md-format-v4' of github.com:bhalevy/scylla: (27 commits) config: enable_sstables_md_format by default test: cql_query_test: add test_clustering_filtering unit tests table: filter_sstable_for_reader: allow clustering filtering md-format sstables table: create_single_key_sstable_reader: emit partition_start/end for empty filtered results table: filter_sstable_for_reader: adjust to md-format table: filter_sstable_for_reader: include non-scylla sstables with tombstones table: filter_sstable_for_reader: do not filter if static column is requested table: filter_sstable_for_reader: refactor clustering filtering conditional expression features: add MD_SSTABLE_FORMAT cluster feature config: add enable_sstables_md_format database: add set_format_by_config test: sstable_3_x_test: test both mc and md versions test: Add support for the "md" format sstables: mx/writer: use version from sstable for write calls sstables: mx/writer: update_min_max_components for partition tombstone sstables: metadata_collector: support min_max_components for range tombstones sstable: validate_min_max_metadata: drop outdated logic sstables: rename mc folder to mx sstables: may_contain_rows: always true for old formats sstables: add may_contain_rows ...	2020-08-11 13:29:11 +03:00
Piotr Jastrzebski	80e3923b3c	codebase wide: replace find(...) != end() with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously the code pattern looked like: <collection>.find(<element>) != <collection>.end() In C++20 the same can be expressed with: <collection>.contains(<element>) This is not only more concise but also expresses the intend of the code more clearly. This commit replaces all the occurences of the old pattern with the new approach. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>	2020-08-11 13:28:50 +03:00
Calle Wilund	9167d1ac76	commitlog: Demote buffer write log messages to trace Because they become very plentiful and annoying when one tries to analyze segment behaviour. More so in batch mode.	2020-08-11 09:18:23 +00:00
Benny Halevy	e2340d0684	config: enable_sstables_md_format by default Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 19:19:32 +03:00
Benny Halevy	e8d7744040	features: add MD_SSTABLE_FORMAT cluster feature Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	65239a6e50	config: add enable_sstables_md_format MD format is disabled by default at this point. The option extends enable_sstables_mc_format so that both are needed to be set for supporting the md format. The MD_FORMAT cluster feature will be added in a following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Piotr Jastrzebski	52ec0c683e	codebase wide: replace erase + remove_if with erase_if C++20 introduced std::erase_if which simplifies removal of elements from the collection. Previously the code pattern looked like: <collection>.erase( std::remove_if(<collection>.begin(), <collection>.end(), <predicate>), <collection>.end()); In C++20 the same can be expressed with: std::erase_if(<collection>, <predicate>); This commit replaces all the occurences of the old pattern with the new approach. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <6ffcace5cce79793ca6bd65c61dc86e6297233fd.1597064990.git.piotr@scylladb.com>	2020-08-10 18:17:38 +03:00
Dejan Mircevski	df20854963	cql3: Move expressions to their own namespace Move the classes representing CQL expressions (and utility functions on them) from the `restrictions` namespace to a new namespace `expr`. Most of the restriction.hh content was moved verbatim to expression.hh. Similarly, all expression-related code was moved from statement_restrictions.cc verbatim to expression.cc. As suggested in #5763 feedback https://github.com/scylladb/scylla/pull/5763#discussion_r443210498 Tests: dev (unit) Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-08-08 21:03:26 +03:00
Avi Kivity	1572b9e41c	Merge 'transport: Added listener with port-based load balancing' from Juliusz " This is inspired by #6781. The idea is to make Scylla listen for CQL connections on port 9042 (where both old shard-aware and shard-unaware clients can still connect the traditional way). On top of that I added a new port, where everything works the same way, only the port from client's socket used to determine the shard No. to connect to. Desired shard No. is the result of `clientside_port % num_shards`. The new port is configurable from scylla.yaml and defaults to 19042 (unencrypted, unless user configures encryption options and omits `native_shard_aware_transport_port_ssl` in DB config). Two "SUPPORTED" tags are added: "SCYLLA_SHARD_AWARE_PORT" and "SCYLLA_SHARD_AWARE_PORT_SSL". For compatibility, "SCYLLA_SHARDING_ALGORITHM" is still kept. Fixes #5239 " * jul-stas-shard-aware-listener: docs: Info about shard-aware listeners in protocol-extensions transport: Added listener with port-based load balancing	2020-08-03 19:23:28 +03:00
Calle Wilund	30a700c5b0	system_keyspace: Remove support for legacy truncation records Fixes #6341 Since scylla no longer supports upgrading from a version without the "new" (dedicated) truncation record table, we can remove support for these and the migtration thereof. Make sure the above holds whereever this is committed. Note that this does not remove the "truncated_at" field in system.local.	2020-08-03 17:16:26 +03:00
Eliran Sinvani	779502ab11	Revert "schema: take into account features when converting a table creation to" This reverts commit `b97f466438`. It turns out that the schema mechanism has a lot of nuances, after this change, for unknown reason, it was empirically proven that the amount of cross shard on an upgraded node was increased significantly with a steady stress traffic, if was so significant that the node appeared unavailable to the coordinators because all of the requests started to fail on smp_srvice_group semaphore. This revert will bring back a caveat in Scylla, the caveat is that creating a table in a mixed cluster might under certain condition cause schema mismatch on the newly created table, this make the table essentially unusable until the whole cluster has a uniform version (rolling upgrade or rollback completion). Fixes #6893.	2020-08-03 12:51:16 +03:00
Avi Kivity	257c17a87a	Merge "Don't depend on seastar::make_(lw_)?shared idiosyncrasies" from Rafael " While working on another patch I was getting odd compiler errors saying that a call to ::make_shared was ambiguous. The reason was that seastar has both: template <typename T, typename... A> shared_ptr<T> make_shared(A&&... a); template <typename T> shared_ptr<T> make_shared(T&& a); The second variant doesn't exist in std::make_shared. This series drops the dependency in scylla, so that a future change can make seastar::make_shared a bit more like std::make_shared. " * 'espindola/make_shared' of https://github.com/espindola/scylla: Everywhere: Explicitly instantiate make_lw_shared Everywhere: Add a make_shared_schema helper Everywhere: Explicitly instantiate make_shared cql3: Add a create_multi_column_relation helper main: Return a shared_ptr from defer_verbose_shutdown	2020-08-02 19:51:24 +03:00
Juliusz Stasiewicz	1c11d8f4c4	transport: Added listener with port-based load balancing The new port is configurable from scylla.yaml and defaults to 19042 (unencrypted, unless client configures encryption options and omits `native_shard_aware_transport_port_ssl`). Two "SUPPORTED" tags are added: "SCYLLA_SHARD_AWARE_PORT" and "SCYLLA_SHARD_AWARE_PORT_SSL". For compatibility, "SCYLLA_SHARDING_ALGORITHM" is still kept. Fixes #5239	2020-07-31 13:02:13 +02:00
Tomasz Grabiec	3486eba1ce	commitlog: Fix use-after-free on mutation object during replay The mutation object may be freed prematurely during commitlog replay in the schema upgrading path. We will hit the problem if the memtable is full and apply_in_memory() needs to defer. This will typically manifest as a segfault. Fixes #6953 Introduced in `79935df` Tests: - manual using scylla binary. Reproduced the problem then verified the fix makes it go away Message-Id: <1596044010-27296-1-git-send-email-tgrabiec@scylladb.com>	2020-07-29 20:58:15 +03:00
Avi Kivity	fea5067dfa	Merge "Limit non-paged query memory consumption" from Botond " Non-paged queries completely ignore the query result size limiter mechanism. They consume all the memory they want. With sufficiently large datasets this can easily lead to a handful or even a single unpaged query producing an OOM. This series continues the work started by `134d5a5f7`, by introducing a configurable pair of soft/hard limit (default to 1MB/100MB) that is applied to otherwise unlimited queries, like reverse and unpaged ones. When an unlimited query reaches the soft limit a warning is logged. This should give users some heads-up to adjust their application. When the hard limit is reached the query is aborted. The idea is to not greet users with failing queries after an upgrade while at the same time protect the database from the really bad queries. The hard limit should be decreased from time to time gradually approaching the desired goal of 1MB. We don't want to limit internal queries, we trust ourselves to either use another form of memory usage control, or read only small datasets. So the limit is selected according to the query class. User reads use the `max_memory_for_unlimited_query_{soft,hard}_limit` configuration items, while internal reads are not limited. The limit is obtained by the coordinator, who passes it down to replicas using the existing `max_result_size` parameter (which is not a special type containing the two limits), which is now passed on every verb, instead of once per connection. This ensures that all replicas work with the same limits. For normal paged queries `max_result_size` is set to the usual `query::result_memory_limiter::maximum_result_size` For queries that can consume unlimited amount of memory -- unpaged and reverse queries -- this is set to the value of the aforementioned `max_memory_for_unlimited_query_{soft,hard}_limit` configuration item, but only for user reads, internal reads are not limited. This has the side-effect that reverse reads now send entire partitions in a single page, but this is not that bad. The data was already read, and its size was below the limit, the replica might as well send it all. Fixes: #5870 " * 'nonpaged-query-limit/v5' of https://github.com/denesb/scylla: (26 commits) test: database_test: add test for enforced max result limit mutation_partition: abort read when hard limit is exceeded for non-paged reads query-result.hh: move the definition of short_read to the top test: cql_test_env: set the max_memory_unlimited_query_{soft,hard}_limit test: set the allow_short_read slice option for paged queries partition_slice_builder: add with_option() result_memory_accounter: remove default constructor query_*(): use the coordinator specified memory limit for unlimited queries storage_proxy: use read_command::max_result_size to pass max result size around query: result_memory_limiter: use the new max_result_size type query: read_command: add max_result_size query: read_command: use tagged ints for limit ctor params query: read_command: add separate convenience constructor service: query_pager: set the allow_short_read flag result_memory_accounter: check(): use _maximum_result_size instead of hardcoded limit storage_proxy: add get_max_result_size() result_memory_limiter: add unlimited_result_size constant database: add get_statement_scheduling_group() database: query_mutations(): obtain the memory accounter inside query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field ...	2020-07-29 13:41:53 +03:00
Botond Dénes	9eab5bca27	query_*(): use the coordinator specified memory limit for unlimited queries It is important that all replicas participating in a read use the same memory limits to avoid artificial differences due to different amount of results. The coordinator now passes down its own memory limit for reads, in the form of max_result_size (or max_size). For unpaged or reverse queries this has to be used now instead of the locally set max_memory_unlimited_query configuration item. To avoid the replicas accidentally using the local limit contained in the `query_class_config` returned from `database::make_query_class_config()`, we refactor the latter into `database::get_reader_concurrency_semaphore()`. Most of its callers were only interested in the semaphore only anyway and those that were interested in the limit as well should get it from the coordinator instead, so this refactoring is a win-win.	2020-07-28 18:00:29 +03:00
Botond Dénes	159d37053d	storage_proxy: use read_command::max_result_size to pass max result size around Use the recently added `max_result_size` field of `query::read_command` to pass the max result size around, including passing it to remote nodes. This means that the max result size will be sent along each read, instead of once per connection. As we want to select the appropriate `max_result_size` based on the type of the query as well as based on the query class (user or internal) the previous method won't do anymore. If the remote doesn't fill this field, the old per-connection value is used.	2020-07-28 18:00:29 +03:00
Botond Dénes	92a7b16cba	query: read_command: add max_result_size This field will replace max size which is currently passed once per established rpc connection via the CLIENT_ID verb and stored as an auxiliary value on the client_info. For now it is unused, but we update all sites creating a read command to pass the correct value to it. In the next patch we will phase out the old max size and use this field to pass max size on each verb instead.	2020-07-28 18:00:29 +03:00
Botond Dénes	8992bcd1f8	query: read_command: use tagged ints for limit ctor params The convenience constructor of read_command now has two integer parameter next to each other. In the next patch we intend to add another one. This is recipe for disaster, so to avoid mistakes this patch converts these parameters to tagged integers. This makes sure callers pass what they meant to pass. As a matter of fact, while fixing up call-sites, I already found several ones passing `query::max_partitions` to the `row_limit` parameter. No harm done yet, as `query::max_partitions` == `query::max_rows` but this shows just how easy it is to mix up parameters with the same type.	2020-07-28 18:00:29 +03:00
Botond Dénes	46d5b651eb	db/config: introduce max_memory_for_unlimited_query_soft_limit and max_memory_for_unlimited_query_hard_limit This pair of limits replace the old max_memory_for_unlimited_query one, which remains as an alias to the hard limit. The soft limit inherits the previous value of the limit (1MB), when this limit is reached a warning will be logged allowing the users to adjust their client codes without downtime. The hard limit starts out with a more permissive default of 100MB. When this is reached queries are aborted, the same behaviour as with the previous single limit. The idea is to allow clients a grace period for fixing their code, while at the same time protecting the database from the really bad queries.	2020-07-28 18:00:29 +03:00
Botond Dénes	9faaf46d4b	utils: config_src::add_command_line_options(): drop name and desc args Now that there are no ad-hoc aliases needing to overwrite the name and description parameter of this method, we can drop these and have each config item just use `name()` and `desc()` to access these.	2020-07-28 18:00:29 +03:00
Botond Dénes	dc23736d0c	db/config: replace ad-hoc aliases with alias mechanism We already uses aliases for some configuration items, although these are created with an ad-hoc mechanism that only registers them on the command line. Replace this with the built-in alias mechanism in the previous patch, which has the benefit of conflict resolution and also working with YAML.	2020-07-28 18:00:29 +03:00
Piotr Sarna	ee35c4c3d6	db: handle errors when loading view build progress Currently, encountering an error when loading view build progress would result in view builder refusing to start - which also means that future views would not be built until the server restarts. A more user-friendly solution would be to log an error message, but continue to boot the view builder as if no views are currently in progress, which would at least allow future views to be built correctly. The test case is also amended, since now it expects the call to return that "no view builds are in progress" instead of an exception. Fixes #6934 Tests: unit(dev) Message-Id: <9f26de941d10e6654883a919fd43426066cee89c.1595922374.git.sarna@scylladb.com>	2020-07-28 11:32:09 +03:00
Nadav Har'El	f488eaebaf	merge: db/view: view_update_generator: make staging reader evictable Merged patch set by Botond Dénes: The view update generation process creates two readers. One is used to read the staging sstables, the data which needs view updates to be generated for, and another reader for each processed mutation, which reads the current value (pre-image) of each row in said mutation. The staging reader is created first and is kept alive until all staging data is processed. The pre-image reader is created separately for each processed mutation. The staging reader is not restricted, meaning it does not wait for admission on the relevant reader concurrency semaphore, but it does register its resource usage on it. The pre-image reader however is restricted. This creates a situation, where the staging reader possibly consumes all resources from the semaphore, leaving none for the later created pre-image reader, which will not be able to start reading. This will block the view building process meaning that the staging reader will not be destroyed, causing a deadlock. This patch solves this by making the staging reader restricted and making it evictable. To prevent thrashing -- evicting the staging reader after reading only a really small partition -- we only make the staging reader evictable after we have read at least 1MB worth of data from it. test/boost: view_build_test: add test_view_update_generator_buffering test/boost: view_build_test: add test test_view_update_generator_deadlock reader_permit: reader_resources: add operator- and operator+ reader_concurrency_semaphore: add initial_resources() test: cql_test_env: allow overriding database_config mutation_reader: expose new_reader_base_cost db/view: view_updating_consumer: allow passing custom update pusher db/view: view_update_generator: make staging reader evictable db/view: view_updating_consumer: move implementation from table.cc to view.cc database: add make_restricted_range_sstable_reader() Signed-off-by: Botond Dénes <bdenes@scylladb.com> --- db/view/view_updating_consumer.hh \| 51 ++++++++++++++++++++++++++++--- db/view/view.cc \| 39 +++++++++++++++++------ db/view/view_update_generator.cc \| 19 +++++++++--- 3 files changed, 91 insertions(+), 18 deletions(-)	2020-07-27 09:19:37 +02:00
Avi Kivity	39db54a758	Merge "Use seastar::with_file_close_on_failure in commitlog" from Benny " `close_on_failure` was committed to seastar so use the library version. This requires making the lambda function passed to it nothrow move constructible, so this series also makes db::commitlog::descriptor move constructor noexcept and changes allocate_segment_ex and segment::segment to get a descriptor by value rather than by reference. Test: unit(dev), commitlog_test(debug) " * tag 'commit-log-use-with_file_close_on_failure-v1' of github.com:bhalevy/scylla: commitlog: use seastar::with_file_close_on_failure commitlog: descriptor: make nothrow move constructible commitlog: allocate_segment_ex, segment: pass descriptor by value commitlog: allocate_segment_ex: filename capture is unused	2020-07-23 19:23:23 +03:00
Rafael Ávila de Espíndola	e15c8ee667	Everywhere: Explicitly instantiate make_lw_shared seastar::make_lw_shared has a constructor taking a T&&. There is no such constructor in std::make_shared: https://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared This means that we have to move from make_lw_shared(T(...) to make_lw_shared<T>(...) If we don't want to depend on the idiosyncrasies of seastar::make_lw_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-21 10:33:49 -07:00
Rafael Ávila de Espíndola	efeaded427	Everywhere: Add a make_shared_schema helper This replaces a lot of make_lw_shared(schema(...)) with make_shared_schema(...). This makes it easier to drop a dependency on the differences between seastar::make_shared and std::make_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-21 10:33:49 -07:00
Botond Dénes	566e31a5ac	db/view: view_updating_consumer: allow passing custom update pusher So that tests can test the `view_update_consumer` in isolation, without having to set up the whole database machinery. In addition to less infrastructure setup, this allows more direct checking of mutations pushed for view generation.	2020-07-20 11:23:39 +03:00
Botond Dénes	0166f97096	db/view: view_update_generator: make staging reader evictable The view update generation process creates two readers. One is used to read the staging sstables, the data which needs view updates to be generated for, and another reader for each processed mutation, which reads the current value (pre-image) of each row in said mutation. The staging reader is created first and is kept alive until all staging data is processed. The pre-image reader is created separately for each processed mutation. The staging reader is not restricted, meaning it does not wait for admission on the relevant reader concurrency semaphore, but it does register its resource usage on it. The pre-image reader however is restricted. This creates a situation, where the staging reader possibly consumes all resources from the semaphore, leaving none for the later created pre-image reader, which will not be able to start reading. This will block the view building process meaning that the staging reader will not be destroyed, causing a deadlock. This patch solves this by making the staging reader restricted and making it evictable. To prevent thrashing -- evicting the staging reader after reading only a really small partition -- we only make the staging reader evictable after we have read at least 1MB worth of data from it.	2020-07-20 11:23:39 +03:00
Botond Dénes	84357f0722	db/view: view_updating_consumer: move implementation from table.cc to view.cc table.cc is a very counter-intuitive place for view related stuff, especially if the declarations reside in `db/view/`.	2020-07-20 11:23:39 +03:00
Avi Kivity	5371be71e9	Merge "Reduce fanout of some mutation-related headers" from Pavel E " The set's goal is to reduce the indirect fanout of 3 headers only, but likely affects more. The measured improvement rates are flat_mutation_reader.hh: -80% mutation.hh : -70% mutation_partition.hh : -20% tests: dev-build, 'checkheaders' for changed headers (the tree-wide fails on master) " * 'br-debloat-mutation-headers' of https://github.com/xemul/scylla: headers:: Remove flat_mutation_reader.hh from several other headers migration_manager: Remove db/schema_tables.hh inclustion into header storage_proxy: Remove frozen_mutation.hh inclustion storage_proxy: Move paxos/*.hh inclusions from .hh to .cc storage_proxy: Move hint_wrapper from .hh to .cc headers: Remove mutation.hh from trace_state.hh	2020-07-19 19:47:59 +03:00
Eliran Sinvani	b97f466438	schema: take into account features when converting a table creation to schema_mutations When upgrading from a version that lacks some schema features, during the transition, when we have a mixed cluster. Schema digests are calculated without taking into account the mixed cluster supported features. Every node calculate the digest as if the whole cluster supports its supported features. Scylla already has a mechanism of redaction to the lowest common denominator, but it haven't been used in this context. This commit is using the redaction mechanism when calculating the digest on the newly added table so it will match the supported features of the whole cluster. Tests: Manual upgrading - upgraded to a version with an additional feature and additional schema column and validated that the digest of the tables schema is identical on every node on the mixed cluster.	2020-07-19 10:30:51 +03:00
Pavel Emelyanov	92f58f62f2	headers:: Remove flat_mutation_reader.hh from several other headers All they can live with forward declaration of the f._m._r. plus a seastar header in commitlog code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-17 17:54:47 +03:00
Pavel Emelyanov	8618a02815	migration_manager: Remove db/schema_tables.hh inclustion into header The schema_tables.hh -> migration_manager.hh couple seems to work as one of "single header for everyhing" creating big blot for many seemingly unrelated .hh's. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-17 17:54:43 +03:00

1 2 3 4 5 ...

1795 Commits