scylladb

Author	SHA1	Message	Date
Botond Dénes	3fab83b3a1	flat_mutation_reader: impl: add reader_permit parameter Not used yet, this patch does all the churn of propagating a permit to each impl. In the next patch we will use it to track to track the memory consumption of `_buffer`.	2020-09-28 10:53:48 +03:00
Avi Kivity	88ea02bfeb	table: clear sstable set when stopping Drop references to a table's sstables when stopping it, so that the sstable_manager can start deleting it. This includes staging sstables. Although the table is no longer in use at this time, maintain cache synchronity by calling row_cache::invalidate() (this also has the benefit of avoiding a stall in row_cache's destructor). We also refresh the cache's view of the sstable set to drop the cache's references.	2020-09-23 20:55:05 +03:00
Avi Kivity	9932e6a899	table: prevent table::stop() race with table::query() Take the gate in table::query() so that stop() waits for queries. The gate is already waited for in table::stop(). This allows us to know we are no longer using the table's sstables in table::stop().	2020-09-23 20:55:05 +03:00
Avi Kivity	64c7c81bac	Merge "Update log messages to {fmt} rules" from Pavel E " Before seastar is updated with the {fmt} engine under the logging hood, some changes are to be made in scylla to conform to {fmt} standards. Compilation and tests checked against both -- old (current) and new seastar-s. tests: unit(dev), manual " * 'br-logging-update' of https://github.com/xemul/scylla: code: Force formatting of pointer in .debug and .trace code: Format { and } as {fmt} needs streaming: Do not reveal raw pointer in info message mp_row_consumer: Provide hex-formatting wrapper for bytes_view heat_load_balance: Include fmt/ranges.h	2020-09-03 15:10:09 +03:00
Pavel Emelyanov	812eed27fe	code: Force formatting of pointer in .debug and .trace ... and tests. Printin a pointer in logs is considered to be a bad practice, so the proposal is to keep this explicit (with fmt::ptr) and allow it for .debug and .trace cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-08-26 20:44:11 +03:00
Tomasz Grabiec	cf12b5e537	db: view: Refactor view_info::initialize_base_dependent_fields() It is no longer called once for a given view_info, so the name "initialize" is not appropriate. This patch splits the "initialize" method into the "make" part, which makes a new base_info object, and the "set" part, which changes the current base_info object attached to the view.	2020-08-20 14:53:07 +02:00
Tomasz Grabiec	f8df214836	db: view: Fix incorrect schema access during view building after base table schema changes The view building process was accessing mutation fragments using current table's schema. This is not correct, fragments must be accessed using the schema of the generating reader. This could lead to undefined behavior when the column set of the base table changes. out_of_range exceptions could be observed, or data in the view ending up in the wrong column. Refs #7061. The fix has two parts. First, we always use the reader's schema to access fragments generated by the reader. Second, when calling populate_views() we upgrade the fragment-wrapping reader's schema to the base table schema so that it matches the base table schema of view_and_base snapshots passed to populate_views().	2020-08-20 14:53:07 +02:00
Tomasz Grabiec	3a6ec9933c	db: views: Fix undefined behavior on base table schema changes The view_info object, which is attached to the schema object of the view, contains a data structure called "base_non_pk_columns_in_view_pk". This data structure contains column ids of the base table so is valid only for a particular version of the base table schema. This data structure is used by materialized view code to interpret mutations of the base table, those coming from base table writes, or reads of the base table done as part of view updates or view building. The base table schema version of that data structure must match the schema version of the mutation fragments, otherwise we hit undefined behavior. This may include aborts, exceptions, segfaults, or data corruption (e.g. writes landing in the wrong column in the view). Before this patch, we could get schema version mismatch here after the base table was altered. That's because the view schema does not change when the base table is altered. Part of the fix is to extract base_non_pk_columns_in_view_pk into a third entitiy called base_dependent_view_info, which changes both on base table schema changes and view schema changes. It is managed by a shared pointer so that we can take immutable snapshots of it, just like with schema_ptr. When starting the view update, the base table schema_ptr and the corresponding base_dependent_view_info have to match. So we must obtain them atomically, and base_dependent_view_info cannot change during update. Also, whenever the base table schema changes, we must update base_dependent_view_infos of all attached views (atomically) so that it matches the base table schema. Refs #7061.	2020-08-20 14:53:07 +02:00
Botond Dénes	78f94ba36a	table: get_sstables_by_partition_key(): don't make a copy of selected sstables Currently we assign the reference to the vector of selected sstables to `auto sst`. This makes a copy and we pass this local variable to `do_for_each()`, which will result in a use-after-free if the latter defers. Fix by not making a copy and instead just keep the reference. Fixes: #7060 Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200818091241.2341332-1-bdenes@scylladb.com>	2020-08-18 14:20:31 +03:00
Raphael S. Carvalho	11df96718a	compaction: Prevent non-regular compaction from picking compacting SSTables After `8014c7124`, cleanup can potentially pick a compacting SSTable. Upgrade and scrub can also pick a compacting SSTable. The problem is that table::candidates_for_compaction() was badly named. It misleads the user into thinking that the SSTables returned are perfect candidates for compaction, but manager still need to filter out the compacting SSTables from the returned set. So it's being renamed. When the same SSTable is compacted in parallel, the strategy invariant can be broken like overlapping being introduced in LCS, and also some deletion failures as more than one compaction process would try to delete the same files. Let's fix scrub, cleanup and ugprade by calling the manager function which gets the correct candidates for compaction. Fixes #6938. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200811200135.25421-1-raphaelsc@scylladb.com>	2020-08-16 17:31:03 +03:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously `count` function was often used in various ways. `contains` does not only express the intend of the code better but also does it in more unified way. This commit replaces all the occurences of the `count` with the `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Nadav Har'El	8135647906	merge: Add metrics to semaphores Merged pull request https://github.com/scylladb/scylla/pull/7018 by Piotr Sarna: This series addresses various issues with metrics and semaphores - it mainly adds missing metrics, which makes it possible to see the length of the queues attached to the semaphores. In case of view building and view update generation, metrics was not present in these services at all, so a first, basic implementation is added. More precise semaphore metrics would ease the testing and development of load shedding and admission control. view_builder: add metrics db, view: add view update generator metrics hints: track resource_manager sending queue length hints: add drain queue length to metrics table: add metrics for sstable deletion semaphore database: remove unused semaphore	2020-08-12 12:39:59 +03:00
Piotr Sarna	8b56b24737	table: add metrics for sstable deletion semaphore It's now possible to read the number of tasks waiting on the sstable deletion semaphore.	2020-08-11 17:43:53 +02:00
Benny Halevy	7cfca519cb	table: filter_sstable_for_reader: allow clustering filtering md-format sstables Now that it is safe to filter md format sstable by min/max column names we can remove the `filtering_broken` variable that disabled filtering in `19b76bf75b` to fix #4442. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 19:19:32 +03:00
Benny Halevy	ab67629ea6	table: create_single_key_sstable_reader: emit partition_start/end for empty filtered results To prevent https://github.com/scylladb/scylla/issues/3552 we want to ensure that in any case that the partition exists in any sstable, we emit partition_start/end, even when returning no rows. In the first filtering pass, filter_sstable_for_reader_by_pk filters the input sstables based on the partition key, and num_sstables is set the size of the sstables list after the first filtering pass. An empty sstables list at this stage means there are indeed no sstables with the required partition so returning an empty result will leave the cache in the desired state. Otherwise, we filter again, using filter_sstable_for_reader_by_ck, and examine the list of the remaining readers. If num_readers != num_sstables, we know that some sstables were filterd by clustering key, so we append a flat_mutation_reader_from_mutations to the list of readers and return a combined reader as before. This will ensure that we will always have a partition_start/end mutations for the queried partition, even if the filtered readers emit no rows. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 19:19:32 +03:00
Benny Halevy	a672747da3	table: filter_sstable_for_reader: adjust to md-format With the md sstable format, min/max column names in the metadata now track clustering rows (with or without row tombstones), range tombstones, and partition tombstones (that are reflected with empty min/max column names - indicating the full range). As such, min and max column names may be of different lengths due to range tombstones and potentially short clustering key prefixes with compact storage, so the current matching algorithm must be changed to take this into account. To determine if a slice range overlaps the min/max range we are using position_range::overlaps. sstable::clustering_components_ranges was renamed to position_range as it now holds a single position_range rather than a vector of bytes_view ranges. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 19:19:30 +03:00
Benny Halevy	90d0fea7df	table: filter_sstable_for_reader: include non-scylla sstables with tombstones Move contains_rows from table code to sstable::may_contain_rows since its implementation now has too specific knowledge of sstable internals. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	2a57ec8c3d	table: filter_sstable_for_reader: do not filter if static column is requested Static rows aren't reflected in the sstable min/max clustering keys metadata. Since we don't have any indication in the metadata that the sstable stores static rows, we must read all sstables if a static column is requested. Refs #3553 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	2fed3f472c	table: filter_sstable_for_reader: refactor clustering filtering conditional expression We're about to drop `filtering_broken` in a future patche when clustering filtering can be supported for md-format sstables. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Benny Halevy	200d8d41d9	sstables: add may_contain_rows Move the logic from table to sstable as it will contain intimate knowledge of the sstable min/max column names validity for md format. Also, get rid of the sstable::clustering_components_ranges() method as the member is used only internally by the sstable code now. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-10 18:53:04 +03:00
Nadav Har'El	936cf4cce0	merge: Increase row limits Merged pull request https://github.com/scylladb/scylla/pull/6910 by Wojciech Mitros: This patch enables selecting more than 2^32 rows from a table. The change becomes active after upgrading whole cluster - until then old limits are used. Tested reading 4.5*10^9 rows from a virtual table, manually upgrading a cluster with ccm and performing cql SELECT queries during the upgrade, ran unit tests in dev mode and cql and paging dtests. tests: add large paging state tests increase the maximum size of query results to 2^64	2020-08-04 19:52:30 +03:00
Avi Kivity	311ba4e427	Merge "sstables: Simplify the relationship of monitors and writers" from Rafael " With this patches a monitor is destroyed before the writer, which simplifies the writer destructor. " * 'espindola/simplify-write-monitor-v2' of https://github.com/espindola/scylla: sstables: Delete write_failed sstables: Move monitor after writer in compaction_writer	2020-08-04 11:01:55 +03:00
Wojciech Mitros	45215746fe	increase the maximum size of query results to 2^64 Currently, we cannot select more than 2^32 rows from a table because we are limited by types of variables containing the numbers of rows. This patch changes these types and sets new limits. The new limits take effect while selecting all rows from a table - custom limits of rows in a result stay the same (2^32-1). In classes which are being serialized and used in messaging, in order to be able to process queries originating from older nodes, the top 32 bits of new integers are optional and stay at the end of the class - if they're absent we assume they equal 0. The backward compatibility was tested by querying an older node for a paged selection, using the received paging_state with the same select statement on an upgraded node, and comparing the returned rows with the result generated for the same query by the older node, additionally checking if the paging_state returned by the upgraded node contained new fields with correct values. Also verified if the older node simply ignores the top 32 bits of the remaining rows number when handling a query with a paging_state originating from an upgraded node by generating and sending such a query to an older node and checking the paging_state in the reply(using python driver). Fixes #5101.	2020-08-03 17:32:49 +02:00
Benny Halevy	3fa0f289de	table: snapshot: do not capture name This captured sstring is unused. Test: database_test(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200803072258.44681-1-bhalevy@scylladb.com>	2020-08-03 12:51:16 +03:00
Benny Halevy	122136c617	tables: snapshot: do not create links from multiple shards We need only one of the shards owning each ssatble to call create_links. This will allow us to simplify it and only handle crash/replay scenarios rather than rename/link/remove races. Fixes #1622 Test: unit(dev), database_test(debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200803065505.42100-3-bhalevy@scylladb.com>	2020-08-03 10:07:07 +03:00
Benny Halevy	ec6e136819	table: snapshot: reduce copies of snapshot dir sstring Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200803065505.42100-2-bhalevy@scylladb.com>	2020-08-03 10:07:06 +03:00
Benny Halevy	72365445c6	table: snapshot: create destination dir only once No need to recursive_touch_directory for each sstable. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200803065505.42100-1-bhalevy@scylladb.com>	2020-08-03 10:07:05 +03:00
Avi Kivity	257c17a87a	Merge "Don't depend on seastar::make_(lw_)?shared idiosyncrasies" from Rafael " While working on another patch I was getting odd compiler errors saying that a call to ::make_shared was ambiguous. The reason was that seastar has both: template <typename T, typename... A> shared_ptr<T> make_shared(A&&... a); template <typename T> shared_ptr<T> make_shared(T&& a); The second variant doesn't exist in std::make_shared. This series drops the dependency in scylla, so that a future change can make seastar::make_shared a bit more like std::make_shared. " * 'espindola/make_shared' of https://github.com/espindola/scylla: Everywhere: Explicitly instantiate make_lw_shared Everywhere: Add a make_shared_schema helper Everywhere: Explicitly instantiate make_shared cql3: Add a create_multi_column_relation helper main: Return a shared_ptr from defer_verbose_shutdown	2020-08-02 19:51:24 +03:00
Botond Dénes	f7a4d19fb1	mutation_partition: abort read when hard limit is exceeded for non-paged reads If the read is not paged (short read is not allowed) abort the query if the hard memory limit is reached. On reaching the soft memory limit a warning is logged. This should allow users to adjust their application code while at the same time protecting the database from the really bad queries. The enforcement happens inside the memory accounter and doesn't require cooperation from the result builders. This ensures memory limit set for the query is respected for all kind of reads. Previously non-paged reads simply ignored the memory accounter requesting the read to stop and consumed all the memory they wanted.	2020-07-29 08:32:31 +03:00
Botond Dénes	6660a5df51	result_memory_accounter: remove default constructor If somebody wants to bypass proper memory accounting they should at the very least be forced to consider if that is indeed wise and think a second about the limit they want to apply.	2020-07-28 18:00:29 +03:00
Botond Dénes	159d37053d	storage_proxy: use read_command::max_result_size to pass max result size around Use the recently added `max_result_size` field of `query::read_command` to pass the max result size around, including passing it to remote nodes. This means that the max result size will be sent along each read, instead of once per connection. As we want to select the appropriate `max_result_size` based on the type of the query as well as based on the query class (user or internal) the previous method won't do anymore. If the remote doesn't fill this field, the old per-connection value is used.	2020-07-28 18:00:29 +03:00
Botond Dénes	fbbbc3e05c	query: result_memory_limiter: use the new max_result_size type	2020-07-28 18:00:29 +03:00
Botond Dénes	517a941feb	query_class_config: move into the query namespace It belongs there, its name even starts with "query".	2020-07-28 18:00:29 +03:00
Rafael Ávila de Espíndola	34d60efbf9	sstables: Delete write_failed It is no longer used. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-27 11:23:48 -07:00
Rafael Ávila de Espíndola	e15c8ee667	Everywhere: Explicitly instantiate make_lw_shared seastar::make_lw_shared has a constructor taking a T&&. There is no such constructor in std::make_shared: https://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared This means that we have to move from make_lw_shared(T(...) to make_lw_shared<T>(...) If we don't want to depend on the idiosyncrasies of seastar::make_lw_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-21 10:33:49 -07:00
Botond Dénes	84357f0722	db/view: view_updating_consumer: move implementation from table.cc to view.cc table.cc is a very counter-intuitive place for view related stuff, especially if the declarations reside in `db/view/`.	2020-07-20 11:23:39 +03:00
Botond Dénes	cd849ed40d	database: add make_restricted_range_sstable_reader() A variant of `make_range_sstable_reader()` that wraps the reader in a restricting reader, hence making it wait for admission on the read concurrency semaphore, before starting to actually read.	2020-07-20 11:23:39 +03:00
Benny Halevy	e39fbe1849	compaction: move compaction uuid generation to compaction_info We'd like to use the same uuid both for printing compaction log messages and to update compaction_history. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-07-16 13:55:23 +03:00
Amnon Heiman	186301aff8	per table metrics: change estimated_histogram to time_estimated_histogram This patch changes the per table latencies histograms: read, write, cas_prepare, cas_accept, and cas_learn. Beside changing the definition type and the insertion method, the API was changed to support the new metrics. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2020-07-14 11:17:43 +03:00
Amnon Heiman	ea8d52b11c	row_locking: change estimated histogram with time_estimated_histogram This patch changes the row locking latencies to use time_estimated_histogram. The change consist of changing the histogram definition and changing how values are inserted to the histogram. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2020-07-14 11:17:43 +03:00
Raphael S. Carvalho	1e9c5b5295	table: simplify table::discard_sstables() no longer need to have any special code for shared SSTables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:23:40 -03:00
Raphael S. Carvalho	ce210a4420	table: simplify add_sstable() get_shards_for_this_sstable() can be called inside table::add_sstable() because the shards for a sstable is precomputed and so completely exception safe. We want a central point for checking that table will no longer added shared SSTables to its sstable set. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:23:32 -03:00
Raphael S. Carvalho	68b527f100	table: simplify update_stats_for_new_sstable() no longer need to conditionally track the SSTable metadata, as table will no longer accept shared SSTables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:22:04 -03:00
Raphael S. Carvalho	607c74dc95	table: remove unused open_sstable function Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:22:00 -03:00
Raphael S. Carvalho	60467a7e36	table: no longer keep track of sstables that need resharding Now that table will no longer accept shared SSTables, it no longer needs to keep track of them. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:21:38 -03:00
Raphael S. Carvalho	cd548c6304	table: Remove unused functions no longer used by resharding Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:21:36 -03:00
Raphael S. Carvalho	68a4739a42	table: remove sstable::shared() condition from backlog tracker add/remove functions Now that table no longer accept shared SSTables, those two functions can be simplified by removing the shared condition. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:21:34 -03:00
Raphael S. Carvalho	343efe797d	table: No longer accept a shared SSTable With off-strategy work on reshard on boot and refresh, table no longer needs to work with Shared SSTables. That will unlock a host of cleanups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2020-06-29 14:21:04 -03:00
Pavel Emelyanov	f045cec586	snap: Get rid of storage_service reference in schema.cc Now when the snapshot stopping is correctly handled, we may pull the database reference all the way down to the schema::describe(). One tricky place is in table::napshot() -- the local db reference is pulled through an smp::submit_to call, but thanks to the shard checks in the place where it is needed the db is still "local" Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-26 20:28:25 +03:00
Avi Kivity	e5be3352cf	database, streaming, messaging: drop streaming memtables Before Scylla 3.0, we used to send streaming mutations using individual RPC requests and flush them together using dedicated streaming memtables. This mechanism is no longer in use and all versions that use it have long reached end-of-life. Remove this code.	2020-06-25 15:25:54 +02:00

1 2 3 4

197 Commits