scylladb

Author	SHA1	Message	Date
Botond Dénes	0b5217052d	querier: switch to v2 compactor output The change is mostly mechanical: update all compactor instances to the _v2 variant and update all call-sites, of which there are not that many. As a consequence of this patch, queries -- both single-partition and range-scans -- now do the v2->v1 conversion in the consumers, instead of in the compactor.	2022-03-11 09:24:05 +02:00
Botond Dénes	7e0b51ff23	Merge 'Overhaul compaction_manager::task' from Benny Halevy The series overhauls the compaction_manager::task design and implementation by properly layering the functionality between the compaction_manager that deals with generic task execution, and the per-task business logic that is defined in a set of classes derived from the generic task class. While at it, the series introduces `task::state` and a set of helper functions to manage it to prevent leaks in the statistics, fixing #9974. Two more stats counters were exposed: `completed_tasks` and a new `postponed_tasks`. Test: sstable_compaction_test Dtest: compaction_test.py compaction_additional_test.py Fixes #9974 Closes #10122 * github.com:scylladb/scylla: compaction_manager: use coroutine::switch_to compaction_manager::task: drop _compaction_running compaction_manager: move per-type logic to derived task compaction_manager: task: add state enum compaction_manager: task: add maybe_retry compaction_manager: reevaluate_postponed_compactions: mark as noexcept compaction_manager: define derived task types compaction_manager: register_metrics: expose postponed_compactions compaction_manager: register_metrics: expose failed_compactions compaction_manager: register_metrics: expose _stats.completed_tasks compaction: add documentation for compaction_type to string conversions compaction: expose to_string(compaction_type) compaction_manager: task: standardize task description in log messages compaction_manager: refactor can_proceed compaction_manager: pass compaction_manager& to task ctor compaction_manager: use shared_ptr<task> rather than lw_shared_ptr compaction_manager: rewrite_sstables: acquire _maintenance_ops_sem once compaction_manager: use compaction_state::lock only to synchronize major and regular compaction	2022-03-10 13:33:56 +02:00
Benny Halevy	a2a5e530f0	compaction_manager: move per-type logic to derived task Move the business logic into the task specific classes. Separating initialization during task construction, from the compaction_done task, moved into a do_run() method, and in some cases moving a lambda function that was called per table (as in rewrite_sstables) into a private method of the derived class. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:20:01 +02:00
Mikołaj Sielużycki	5920349357	row_cache: Make row_cache reader from sstables compacting. Reading data from sstables without compacting first puts unnecessary pressure on the cache. The mutation streams need to be resolved anyway before passing to subsequent consumers, so it's better to do it as close to the source as possible. Fixes: #3568 Closes #10188	2022-03-10 11:40:10 +02:00
Botond Dénes	32e9809e9c	test/boost/mutation_test: migrate to compact_for_mutation_v2	2022-03-10 09:16:33 +02:00
Botond Dénes	959483a2dc	test: migrate to the v2 variant of the sstable writer API	2022-03-10 09:16:33 +02:00
Botond Dénes	105bf8888a	sstables: convert mx writer to v2 The sstables::sstable class has two methods for writing sstables: 1) sstable_writer get_writer(...); 2) future<> write_components(flat_mutation_reader, ...); (1) directly exposes the writer type, so we have to update all users of it (there are not that many) in this same patch. We defer updating users of (2) to follow-up commits.	2022-03-10 07:03:49 +02:00
Botond Dénes	2057db54ab	test/boost/mutation_test: test_compactor_range_tombstone_spanning_many_pages extend to check v2 output too	2022-03-10 07:03:49 +02:00
Botond Dénes	7a37e30310	mutation_reader: convert compacting reader v2 Its input was already a v2 reader, now itself is also a v2 reader. With this commit, compaction.cc is finally v2 all-the-way.	2022-03-10 07:03:46 +02:00
Avi Kivity	e1c326a5ba	Merge "Convert multishard writer to v2" from Botond " Also convert the foreign_reader used by it in the process. Tests: unit(dev) " * 'multishard-writer-v2/v1' of https://github.com/denesb/scylla: mutation_writer/multishard_writer: remove now unused v1 factory overloads test/boost/mutation_writer_test: test the v2 variant of distribute_reader_and_consume_on_shards() flat_mutation_reader: add v2 variant of make_generating_reader() mutation_reader: multishard_writer: migrate implementation to v2 mutation_reader: convert foreign_reader to v2 streaming/consumer: convert to v2 mutation_writer/multishard_writer: add v2 variant of distribute_reader_and_consume_on_shards()	2022-03-09 19:28:05 +02:00
Tomasz Grabiec	8fa704972f	loading_cache: Make invalidation take immediate effect There are two issues with current implementation of remove/remove_if: 1) If it happens concurrently with get_ptr(), the latter may still populate the cache using value obtained from before remove() was called. remove() is used to invalidate caches, e.g. the prepared statements cache, and the expected semantic is that values calculated from before remove() should not be present in the cache after invalidation. 2) As long as there is any active pointer to the cached value (obtained by get_ptr()), the old value from before remove() will be still accessible and returned by get_ptr(). This can make remove() have no effect indefinitely if there is persistent use of the cache. One of the user-perceived effects of this bug is that some prepared statements may not get invalidated after a schema change and still use the old schema (until next invalidation). If the schema change was modifying UDT, this can cause statement execution failures. CQL coordinator will try to interpret bound values using old set of fields. If the driver uses the new schema, the coordinator will fail to process the value with the following exception: User Defined Type value contained too many fields (expected 5, got 6) The patch fixes the problem by making remove()/remove_if() erase old entries from _loading_values immediately. The predicate-based remove_if() variant has to also invalidate values which are concurrently loading to be safe. The predicate cannot be evaluated on values which are not ready. This may invalidate some values unnecessarily, but I think it's fine. Fixes #10117 Message-Id: <20220309135902.261734-1-tgrabiec@scylladb.com>	2022-03-09 16:13:07 +02:00
Avi Kivity	1622995900	Merge 'Allow empty partition keys in views' from Nadav Har'El Cassandra generally does not allow empty strings as partition keys (note, by the way, that empty strings are allowed as clustering keys, as well as in individual components of a compound partition key). However, Cassandra does allow empty strings in _regular_ columns - and those regular columns can be indexed by a secondary index, or become an empty partition-key column in a materialized view. As noted in issues #9375 and #9364 and verified in a few xfailing cql-pytest tests, Scylla didn't allow these cases - and this patch series fixes that. Before the last patch in this series finally enables empty-string partition keys in materialized views, we first need to solve a couple of bugs in our code related to handling empty partition keys: The first patch fixes issue #10178 - a bug in `key_view::tri_compare()` where comparing two empty keys returned a random result instead of "equal". The second patch fixes issue #9352: our tokenizer has an inconsistency where for an empty string key, two variants of the same function return different results: 1. One variant `murmur3_partitioner::get_token(bytes_view key)` returned `minimum_token()` for the empty string. 2. Another variant `murmur3_partitioner::get_token(const schema& s, partition_key_view key)` did not have this special case, and called the normal hash-function calculation on the empty string (the resulting token is 0). Variant 2 was an unintentional bug, because Cassandra always does what variant 1 does. So the "obvious" fix here would be to fix variant 2 to do what variant 1 does. Nevertheless, we decided to do the opposite: Change variant 1 to match variant 2. The reasoning is as follows: The `minimum_token()` is `token{token::kind::before_all_keys, 0 }` - it's not a real token. Since we intend in this patch to allow real data to exist with the empty key, we need this real data to have a real token. 
For example, this token needs to be located on the token ring (so the empty-key partition will have replicas) and also belong to one of the shards, and it's not clear that `minimum_token()` will be handled correctly in this context. After changing the token of the empty string to 0, we note that some places in the code assume that `dht::decorated_key(dht::minimum_token(), partition_key::make_empty())` is a legal decorated key. However, as far as I can tell, none of these places actually assume that the partition-key part (the `make_empty()`) really matches the token - this decorated key is only used to start an iteration (ignoring this key itself) or to indicate a non-existent key (in modern code `std::optional` should be used for that). While normally changing the token of a key is a big faux-pas, which can result in old data no longer being readable, in this case this change is safe because: 1. Scylla previously disallowed empty partition keys (in both base tables and views), so we cannot have had such a partition key saved in any sstable. 3. Cassandra does allow empty partition keys in _views_ and _secondary indexes_, but we do not support migrating sstables of those into Scylla - users are expected to only migrate the base table and then re-create the view or index. So however Cassandra writes those empty-key partitions, we don't care. The third patch finally fixes the materialized views implementation to not drop view rows with an empty-string partition key (#9375). This means we basically revert commit `ec8960df45` - which fixed #3262 by disallowing empty partition keys in views, whereas this patch fixes the same problem by handling the empty partition keys correctly. The fix for the secondary index bug (#9364) comes "for free" because it is based on materialized views. We already had xfailing test cases for empty strings in materialized views and indexes, and after this series they begin to pass so the "xfail" mark is removed. 
The series also adds additional test cases that validate additional corner cases discovered during the debugging. Fixes #9352 Fixes #9364 Fixes #9375 Fixes #10178 Closes #10170 * github.com:scylladb/scylla: compound_compat.hh: add missing methods of iterator materialized views: allow empty strings in views and indexes murmur3: fix inconsistent token for empty partition key compound_compat.hh: fix bug iterating on empty singular key	2022-03-08 15:55:55 +02:00
Nadav Har'El	ef43531fb6	materialized views: allow empty strings in views and indexes Although Cassandra generally does not allow empty strings as partition keys (note they are allowed as clustering keys!), it does allow empty strings in regular columns to be indexed by a secondary index, or to become an empty partition-key column in a materialized view. As noted in issues #9375 and #9364 and verified in a few xfailing cql-pytest tests, Scylla didn't allow these cases - and this patch fixes that. The patch mostly removes unnecessary code: In one place, code prevented an sstable with an empty partition key from being written. Another piece of removed code was a function is_partition_key_empty() which the materialized-view code used to check whether the view's row will end up with an empty partition key, which was supposedly forbidden. But in fact, such rows should have been allowed, as they are allowed in Cassandra and required for the secondary-index implementation, and the entire function wasn't necessary. Note that the removed function is_partition_key_empty() was NOT required for the "IS NOT NULL" feature of materialized views - this continues to work as expected after this patch, and we add another test to confirm it. Being null and being an empty string are two different things. This patch also removes a part of a unit test which enshrined the wrong behavior. After this patch we are left with one interesting difference from Cassandra: Though Cassandra allows a user to create a view row with an empty-string partition key, and this row is fully visible when scanning the view, this row can not be queried individually because "WHERE v=''" is forbidden when v is the partition key (of the view). Scylla does not reproduce this anomaly - and such point query does work in Scylla after this patch. We add a new test to check this case, and mark it "cassandra_bug", i.e., it's a Cassandra behavior which we consider wrong and don't want to emulate. 
This patch relies on #9352 and #10178 having been fixed in previous patches, otherwise the WHERE v='' does not work when reading from sstables. We add to the already existing tests we had for empty materialized-views keys a lookup with WHERE v='' which failed before fixing those two issues. Fixes #9364 Fixes #9375 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-03-08 15:34:26 +02:00
Nadav Har'El	f8807e24f4	compound_compat.hh: fix bug iterating on empty singular key When iterating over a compound key with legacy_compound_view<>, when the key is "singular" (i.e., a single column) we need to iterate over just the component's actual bytes - without the two length bytes or end-of-component byte. In particular, when the component is an empty string, the iteration should return zero bytes. In other words, we should have begin() == end(). Unfortunately, this is not what happened - for an empty singular key, the iterator returned for begin() was slightly different from end() - so code using this iterator would not know there is nothing to iterate. So in this patch we fix begin() and end() to return the same thing if we have an empty singular key. The bug in legacy_compound_view<> (which we fix here) caused a bug in sstables::key_view::tri_compare(const schema& s, partition_key_view other), causing it to return wrong results when comparing two empty keys. As a result we were unable to retrieve a partition with an empty key from the sstable index. So this patch is necessary to fix support for empty-string keys in sstables (part of issue #9375). This patch also includes a unit-test for this bug. We test it in the context of sstables::key_view::tri_compare(), where it was first discovered, and also test the legacy_compound_view itself. The included test used to fail in both places before this patch, and pass after it. Fixes #10178 Refs #9375 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-03-08 14:14:18 +02:00
Benny Halevy	a085ef74ff	atomic_cell: compare_atomic_cell_for_merge: compare ttl if expiry is equal Following up on `a57c087c89`, compare_atomic_cell_for_merge should compare the ttl value in the reverse order since, when comparing two cells that are identical in all attributes but their ttl, we want to keep the cell with the smaller ttl value rather than the larger ttl, since it was written at a later (wall-clock) time, and so would remain longer after it expires, until purged after gc_grace seconds. Fixes #10173 Test: mutation_test.test_cell_ordering, unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220302154328.2400717-1-bhalevy@scylladb.com> Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220306091913.106508-1-bhalevy@scylladb.com>	2022-03-07 11:05:30 +02:00
Benny Halevy	a57c087c89	atomic_cell: compare_atomic_cell_for_merge: compare ttl if expiry is equal Unlike atomic_cell_or_collection::equals, compare_atomic_cell_for_merge currently returns std::strong_ordering::equal if two cells are equal in every way except their ttl:s. The problem with that is that the cells' hashes are different and this will cause repair to keep trying to repair discrepancies caused by the ttl being different. This may be triggered by e.g. the spark migrator that computes the ttl based on the expiry time by subtracting the expiry time from the current time to produce a respective ttl. If the cell is migrated multiple times at different times, it will generate cells that have the same expiry (by design) but have different ttl values. Fixes #10156 Test: mutation_test.test_cell_ordering, unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220302154328.2400717-1-bhalevy@scylladb.com>	2022-03-03 15:27:16 +02:00
Raphael S. Carvalho	2dba0670ad	compaction: Fix time_window_backlog_tracker::replace_sstables() Introduced in commit: `ddd693c6d7` We're not emplacing newer windows in the tracker, causing std::out_of_range when replacing sstables for windows. Let's fix the logic and add a unit test to cover this. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220301194944.95096-1-raphaelsc@scylladb.com>	2022-03-02 10:08:40 +02:00
Botond Dénes	70e95a9cf7	test/boost/mutation_writer_test: test the v2 variant of distribute_reader_and_consume_on_shards() The underlying implementation behind the v1 and v2 variants of said methods is the same, but we want to move to using the v2 variant in the test as the v1 variant is going away soon.	2022-03-02 09:57:24 +02:00
Botond Dénes	cdf7e74da8	mutation_reader: convert foreign_reader to v2	2022-03-02 09:55:38 +02:00
Nadav Har'El	7cf2e5ee5c	Merge 'directory_lister: drop abort method and simplify close semantics' from Benny Halevy This series contains: - lister: move to utils - tidy up the clutter in the root dir Based on Avi's feedback to `[PATCH 1/1] utils: directory_lister: close: always abort queue` that was sent to the mailing list: - directory_lister: drop abort method - lister: do not require get after close to fail - test: lister_test: test_directory_lister_close simplify indentation - cosmetic cleanup Closes #10142 * github.com:scylladb/scylla: test: lister_test: test_directory_lister_close simplify indentation lister: do not require get after close to fail directory_lister: drop abort method lister: move to utils	2022-03-01 16:23:47 +02:00
Michael Livshin	2221aeff0e	flat_mutation_reader_test: fix "test_flat_mutation_reader_consume_single_partition" Since `flat_reader_assertions::produces(const range_tombstone&,...)` records the range tombstone for checking, be sure to explicitly pass in a clustering range that does not extend beyond the mock-read part of the mutation. Also (provisionally) change the assertion method to accept clustering ranges. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-28 17:11:54 +02:00
Michael Livshin	9bacce4359	memtable::make_flat_reader(): return flat_mutation_reader_v2 This is just a facade change. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-28 17:11:54 +02:00
Michał Radwański	9ada63a9cb	flat_mutation_reader: allow destructing readers which are not closed and didn't initiate any IO. In functions such as upgrade_to_v2 (excerpt below), if the constructor of transforming_reader throws, r needs to be destroyed, however it hasn't been closed. However, if a reader didn't start any operations, it is safe to destruct such a reader. This issue can potentially manifest itself in many more readers and might be hard to track down. This commit adds a bool indicating whether a close is anticipated, thus avoiding errors in the destructor. Code excerpt: flat_mutation_reader_v2 upgrade_to_v2(flat_mutation_reader r) { class transforming_reader : public flat_mutation_reader_v2::impl { // ... }; return make_flat_mutation_reader_v2<transforming_reader>(std::move(r)); } Fixes #9065.	2022-02-28 17:11:54 +02:00
Michael Livshin	67c3c31a6e	tests: stop comparing sstables with range tombstones to C* reference As flat mutation reader {up,down}grades get added to the write path, comparing range-tombstone-containing (at least) sstables byte-by-byte to a reference is starting to seem like a fool's errand. * When a flat mutation reader is {up,down}graded, information may get lost while splitting range tombstones. Making those splits revertable should in theory be possible but would surely make {up,down}graders slower and more complex, and may also possibly entail adding information to in-memory representation of range tombstones and range tombstone changes. Such investment for the sake of 7 unit tests does not seem wise, given that the plan is to get rid of reader {up,down}grade logic once the move to flat mutation reader v2 is completed. * All affected tests also validate their written sstables semantically. * At least some of the offending reference sstables are not "canonical" wrt range tombstones to begin with -- they contain range tombstones that overlap with clustering rows. The fact that Scylla does not "canonicalize" those in some way seems purely incidental. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-28 17:11:54 +02:00
Benny Halevy	9c89c2df37	test: lister_test: test_directory_lister_close simplify indentation There's no need anymore for an indented block to destroy the directory_lister since the other sub-case was deleted. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-02-28 13:00:03 +02:00
Benny Halevy	41d097ef47	lister: do not require get after close to fail Currently, the lister test expected get() to always fail after close(), but it unexpectedly succeeded if get() was never called before close, as seen in https://jenkins.scylladb.com/view/master/job/scylla-master/job/next/4587/artifact/testlog/x86_64_debug/lister_test.test_directory_lister_close.4001.log ``` random-seed=1475104835 Generated 719 dir entries Getting 565 dir entries Closing directory_lister Getting 0 dir entries Closing directory_lister test/boost/lister_test.cc(190): fatal error: in "test_directory_lister_close": exception std::exception expected but not raised ``` This change relaxes this requirement to keep close() simple, based on Avi's feedback: > The user should call close(), and not do it while get() is running, and > that's it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-02-28 12:59:08 +02:00
Benny Halevy	00327bfae3	directory_lister: drop abort method Based on Avi's feedback: > We generally have a public abort() only if we depend on an external > event (like data from a tcp socket) that we don't control. But here > there are no such external events. So why have a public abort() at all? If needed in the future, we can consider adding get(abort_source&) to allow aborting get() via an external event. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-02-28 12:52:47 +02:00
Benny Halevy	ebbbf1e687	lister: move to utils There's nothing specific to scylla in the lister classes, they could (and maybe should) be part of the seastar library. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-02-28 12:36:03 +02:00
Avi Kivity	ff2cd72766	Merge 'utils: cached_file: Fix alloc-dealloc mismatch during eviction' from Tomasz Grabiec cached_page::on_evicted() is invoked in the LSA allocator context, set in the reclaimer callback installed by the cache_tracker. However, cached_pages are allocated in the standard allocator context (note: page content is allocated inside LSA via lsa_buffer). The LSA region will happily deallocate these, thinking that these are large objects which were delegated to the standard allocator. But the _non_lsa_memory_in_use metric will underflow. When it underflows enough, shard_segment_pool.total_memory() will become 0 and memory reclamation will stop doing anything, leading to apparent OOM. The fix is to switch to the standard allocator context inside cached_page::on_evicted(). evict_range() was also given the same treatment as a precaution, it currently is only invoked in the standard allocator context. The series also adds two safety checks to LSA to catch such problems earlier. Fixes #10056 \cc @slivne @bhalevy Closes #10130 * github.com:scylladb/scylla: lsa: Abort when trying to free a standard allocator object not allocated through the region lsa: Abort when _non_lsa_memory_in_use goes negative tests: utils: cached_file: Validate occupancy after eviction test: sstable_partition_index_cache_test: Fix alloc-dealloc mismatch utils: cached_file: Fix alloc-dealloc mismatch during eviction	2022-02-25 18:19:04 +02:00
Tomasz Grabiec	ca09a72597	tests: utils: cached_file: Validate occupancy after eviction Reproducer for #10056 Catches alloc-dealloc mismatch leading to the underflow of _non_lsa_memory_in_use.	2022-02-25 01:42:15 +01:00
Tomasz Grabiec	b0d5bb334c	test: sstable_partition_index_cache_test: Fix alloc-dealloc mismatch The test was allocating entries in the standard allocator, but they are evicted in the LSA allocator context. Fix by allocating under LSA.	2022-02-25 01:42:15 +01:00
Raphael S. Carvalho	2a7939ee4d	tests: Add compaction controller test There's no automated test for controller, it's time to have one. Let's start with a basic one that verifies the assumption that perfectly compacted tiers should produce 0 backlog. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-02-24 18:57:45 -03:00
Raphael S. Carvalho	ddd693c6d7	compaction_backlog_tracker: Batch changes through a new replacement interface This new interface allows table to communicate multiple changes in the SSTable set with a single call, which is useful on compaction completion for example. With this new interface, the size tiered backlog tracker will be able to know when compaction completed, which will allow it to recompute tiers and their backlog contribution, if any. Without it, tiered tracker would have to recompute tiers for every change, which would be terribly expensive. The old remove/add interface are being removed in favor of the new one. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-02-24 15:34:16 -03:00
Avi Kivity	cbba80914d	memtable: move to replica module and namespace Memtables are a replica-side entity, and so are moved to the replica module and namespace. Memtables are also used outside the replica, in two places: - in some virtual tables; this is also in some way inside the replica, (virtual readers are installed at the replica level, not the coordinator), so I don't consider it a layering violation - in many sstable unit tests, as a convenient way to create sstables with known input. This is a layering violation. We could make memtables their own module, but I think this is wrong. Memtables are deeply tied into replica memory management, and trying to make them a low-level primitive (at a lower level than sstables) will be difficult. Not least because memtables use sstables. Instead, we should have a memtable-like thing that doesn't support merging and doesn't have all other funky memtable stuff, and instead replace the uses of memtables in sstable tests with some kind of make_flat_mutation_reader_from_unsorted_mutations() that does the sorting that is the reason for the use of memtables in tests (and live with the layering violation meanwhile). Test: unit (dev) Closes #10120	2022-02-23 09:05:16 +02:00
Avi Kivity	75fb45df1b	Merge 'Propagate CQL coordinator timeouts and failures for reads' from Piotr Dulikowski This PR propagates the read coordinator logic so that read timeout and read failure exceptions are propagated without throwing on the coordinator side. This PR is only concerned with exceptions which were originally thrown by the coordinator (in read resolvers). Exceptions propagated through RPC and RPC timeouts will still throw, although those exceptions will be caught and converted into exceptions-as-values by read resolvers. This is a continuation of work started in #10014. Results of `perf_simple_query --smp 1 --operations-per-shard 1000000` (read workload), compared with merge base (`10880fb0a7`): ``` BEFORE: 125085.13 tps ( 80.2 allocs/op, 12.2 tasks/op, 49010 insns/op) 125645.88 tps ( 80.2 allocs/op, 12.2 tasks/op, 49008 insns/op) 126148.85 tps ( 80.2 allocs/op, 12.2 tasks/op, 49005 insns/op) 126044.40 tps ( 80.2 allocs/op, 12.2 tasks/op, 49005 insns/op) 125799.75 tps ( 80.2 allocs/op, 12.2 tasks/op, 49003 insns/op) AFTER: 127557.21 tps ( 80.2 allocs/op, 12.2 tasks/op, 49197 insns/op) 127835.98 tps ( 80.2 allocs/op, 12.2 tasks/op, 49198 insns/op) 127749.81 tps ( 80.2 allocs/op, 12.2 tasks/op, 49202 insns/op) 128941.17 tps ( 80.2 allocs/op, 12.2 tasks/op, 49192 insns/op) 129276.15 tps ( 80.2 allocs/op, 12.2 tasks/op, 49182 insns/op) ``` The PR does not introduce additional allocations on the read happy-path. The number of instructions used grows by about 200 insns/op. The increase in TPS is probably just a measurement error. 
Closes #10092 * github.com:scylladb/scylla: indexed_table_select_statement: return some exceptions as exception messages result_combinators: add result_wrap_unpack select_statement: return exceptions as errors in execute_without_checking_exception_message select_statement: return exceptions without throwing in do_execute select_statement: implement execute_without_checking_exception_message select_statement: introduce helpers for working with failed results query_pager: resultify relevant methods storage_proxy: resultify (do_)query storage_proxy: resultify query_singular storage_proxy: propagate failed results through query_partition_key_range storage_proxy: resultify query_partition_key_range_concurrent storage_proxy: modify handle_read_error to also handle exception containers abstract_read_executor: return result from execute() abstract_read_executor: return and handle result from has_cl() storage_proxy: resultify handling errors from read-repair abstract_read_executor::reconcile: resultise handling of data_resolver->done() abstract_read_executor::execute: resultify handling of data_resolver->done() result_combinators: add result_discard_value abstract_read_executor: resultify _result_promise abstract_read_executor: return result from done() abstract_read_resolver: fail promises by passing exception as value abstract_read_resolver: resultify promises exceptions: make it possible to return read_{timeout,failure}_exception as value result_try: add as_inner/clone_inner to handle types result_try: relax ConvertWithTo constraint exception_container: switch impl to std::shared_ptr and make copyable result_loop: add result_repeat result_loop: add result_do_until result_loop: add result_map_reduce utils/result: add utilities for checking/creating rebindable results	2022-02-22 20:58:25 +03:00
Nadav Har'El	be84a8def3	Merge 'Allow integers in scientific format in `INSERT JSON` ' from Piotr Grabowski Add support for specifying integers in scientific format (for example 1.234e8) in INSERT JSON statement: ``` INSERT INTO table JSON '{"int_column": 1e7}'; ``` Before the JSON parsing library was switched to RapidJSON from JsonCpp, this statement used to work correctly, because JsonCpp transparently casts double to integer value. Inserting a floating-point number ending with .0 is allowed, as the fractional part is zero. Non-zero fractional part (for example 12.34) is disallowed. A new test is added to test all those behaviors. This behavior differs from Cassandra, which disallows those types of numbers (1e7, 123.0 and 12.34), however some users rely on that behavior and JSON specification itself does not distinguish between floating-point numbers and integer numbers (only a single "number" type is defined). This PR also fixes two minor issues I noticed while looking at the code: wrong blob validation and missing `IsString()` checks that could result in assertion error. Fixes #10100 Fixes #10114 Fixes #10115 Closes #10101 * github.com:scylladb/scylla: type_json: support integers in scientific format type_json: add missing IsString() checks type_json: fix wrong blob JSON validation	2022-02-22 20:58:25 +03:00
Piotr Dulikowski	091b20019b	result_combinators: add result_wrap_unpack Adds a helper combinator utils::result_wrap_unpack which, in contrast to utils::result_wrap, uses futurize_apply instead of futurize_invoke to call the wrapped callable. In short, if utils::result_wrap is used to adapt code like this: f.then([] {}) -> f_result.then(utils::result_wrap([] {})) Then utils::result_wrap_unpack works for the following case: f.then_unpack([] (arg1, arg2) {}) -> f_result.then(utils::result_wrap_unpack([] (arg1, arg2) {}))	2022-02-22 16:25:21 +01:00
Piotr Dulikowski	7afea88dfc	result_loop: add result_repeat Adds a result-aware counterpart to seastar::repeat. The new function is not based on seastar::repeat; rather, it is a rewrite of the original (using a coroutine instead of an open-coded task). The main consequence of using a coroutine is that exceptions from AsyncAction need to be thrown once more.	2022-02-22 16:08:52 +01:00
Piotr Dulikowski	32cbc89779	result_loop: add result_do_until Adds a result-aware counterpart to seastar::do_until. The new function is not based on seastar::do_until; rather, it is a rewrite of the original (using a coroutine instead of an open-coded task). The main consequence of using a coroutine is that exceptions from StopCondition or AsyncAction need to be thrown once more.	2022-02-22 16:08:52 +01:00
Piotr Dulikowski	4f0a98a829	result_loop: add result_map_reduce Adds result-aware counterparts to all seastar::map_reduce overloads. Fortunately, it was possible to implement the functions by basing them on seastar::map_reduce and get the same number of allocations. The only exception happens when reducer::get() returns a non-ready future, which doesn't seem to happen on the read coordinator path.	2022-02-22 16:08:52 +01:00
Piotr Grabowski	efe7456f0a	type_json: support integers in scientific format Add support for specifying integers in scientific format (for example 1.234e8) in INSERT JSON statement: INSERT INTO table JSON '{"int_column": 1e7}'; Inserting a floating-point number ending with .0 is allowed, as the fractional part is zero. Non-zero fractional part (for example 12.34) is disallowed. A new test is added to test all those behaviors. Before the JSON parsing library was switched to RapidJSON from JsonCpp, this statement used to work correctly, because JsonCpp transparently casts double to integer value. This behavior differs from Cassandra, which disallows those types of numbers (1e7, 123.0 and 12.34). Fix typo in if condition: "if (value.GetUint64())" to "if (value.IsUint64())". Fixes #10100	2022-02-22 12:55:38 +01:00
Piotr Grabowski	649ab70936	type_json: add missing IsString() checks Add missing IsString() checks to parsing date, time, uuid and inet types by introducing validated_to_string_view function which checks whether the value is of string type and otherwise throws marshal_exception. Without this check, a malformed input to those types would result in a nasty ServerError with RapidJSON assertion instead of marshal_exception with details about the problem. Add new tests checking passing non-string values for those types. Fixes #10115	2022-02-21 16:58:13 +01:00
Piotr Grabowski	f8b67c9bd1	type_json: fix wrong blob JSON validation Fixes wrong condition for validating whether a JSON string representing blob value is valid. Previously, strings such as "6" or "0392fa" would pass the validation, even though they are too short or don't start with "0x". Add those test cases to json_cql_query_test.cc. Fixes #10114	2022-02-21 16:58:12 +01:00
Avi Kivity	adc08d0ab9	Merge "Drop v1 input support for mutation compactor" from Botond " Currently the mutation compactor supports v1 and v2 input and has a v1 output. The next step is to add a v2 output but this would lead to a full conversion matrix which we want to avoid. So in preparation we drop the v1 input support. Most inputs were already v2, but there were some notable exceptions: tests, the compacting reader and the multishard query code. The former two were simple mechanical updates but the latter required some further work because it turned out the v2 version of evictable reader wasn't used yet and thus it managed to hide some bugs and dropped features. While at it, we migrate all evictable and multishard reader users to the v2 variant of the respective readers and drop the v1 variant completely. With this the road is open to a v2 compactor output and therefore to a v2 sstable writer. Tests: unit(dev, release), dtest(paging_additional_test.py) " * 'compact-mutation-v2-only-input/v5' of https://github.com/denesb/scylla: test/lib/test_utils: return OK from check() variants repair/row_level: use evictable reader v2 db/view/view_updating_consumer: migrate to v2 test/boost/mutation_reader_test: add v2 specific evictable reader tests test: migrate to evictable reader v2 and multishard combining reader v2 compact_mutation: drop support for v1 input test: pass v2 input to mutation_compaction test/boost/mutation_test: simplify test_compaction_data_stream_split test mutation_partition: do_compact(): do drop row tombstones covered by higher order tombstones multishard_mutation_query: migrate to v2 mutation_fragment_v2: range_tombstone_change: add memory_usage() evictable_reader_v2: terminate active range tombstones on reader recreation evictable_reader_v2: restore handling of non-monotonically increasing positions evictable_reader_v2: simplify handling of reader recreation mutation: counter_write_query: use v2 reader mutation: migrate consume() to v2 mutation_fragment_v2,flat_mutation_reader_v2: mirror v1 concept organization mutation_reader: compacting_reader: require a v2 input reader db/view/view_builder: use v2 reader test/lib/flat_mutation_reader_assertions: adjust has_monotonic_positions() to v2 spec	2022-02-21 14:32:55 +02:00
Botond Dénes	841b982e51	test/lib/test_utils: return OK from check() variants The various require() and check() methods in test_utils.hh were introduced to replace BOOST_REQUIRE() and BOOST_CHECK() respectively in multi-shard concurrent tests, specifically those in tests/boost/multishard_mutation_query_test.cc. This was done literally, just replacing BOOST_REQUIRE() with require() and BOOST_CHECK() with check(). The problem is that check() is missing a feature BOOST_CHECK() had: while BOOST_CHECK() doesn't cause an immediate test failure, just logging an error if the condition fails, it remembers this failure and will fail the test in the end. check() did not have this feature and this caused potential errors to just be logged while the test could still pass fine, causing false-positive test passes. This patch fixes this by returning a [[nodiscard]] bool from the check() methods. The caller can & these together over all calls to check() methods and manually fail the test in the end. We choose this method over a hidden global (like BOOST_CHECK() does) for simplicity's sake.	2022-02-21 12:29:25 +02:00
Botond Dénes	05c48ee0cc	db/view/view_updating_consumer: migrate to v2 Not a completely mechanical transition. The consumer has to generate its mutation via a mutation_rebuilder_v2 as mutation fragment v2 cannot be applied to mutations directly yet.	2022-02-21 12:29:24 +02:00
Botond Dénes	014a23bf2a	test/boost/mutation_reader_test: add v2 specific evictable reader tests One is a reincarnation of the recently removed test_multishard_combining_reader_non_strictly_monotonic_positions. The latter was actually targeting the evictable reader but through the multishard reader, probably for historic reasons (evictable reader was part of the multishard reader family). The other one checks that active range tombstone changes are properly terminated when the partition ends abruptly after recreating the reader.	2022-02-21 12:29:24 +02:00
Botond Dénes	e3c618beba	test: migrate to evictable reader v2 and multishard combining reader v2 All reads are now using the v2 version of these readers, test them instead of the old v1.	2022-02-21 12:29:24 +02:00
Botond Dénes	f1e9e3b3b7	compact_mutation: drop support for v1 input	2022-02-21 12:29:24 +02:00
Botond Dénes	284ed9154f	test: pass v2 input to mutation_compaction	2022-02-21 12:29:24 +02:00
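
The acceptance rule described in commit efe7456f0a (integers written in scientific format are accepted, a trailing .0 is accepted, a non-zero fractional part is rejected) can be illustrated with a minimal sketch. This is not the type_json code, which is C++ on top of RapidJSON; it is a Python illustration, and the helper name `json_number_to_int` is hypothetical. It also omits the range check a real integer column would need.

```python
from decimal import Decimal, InvalidOperation


def json_number_to_int(text: str) -> int:
    """Hypothetical helper: convert a JSON number literal to an integer.

    Accepts plain integers, scientific notation (1e7, 1.234e8) and values
    whose fractional part is zero (123.0); rejects anything else.
    """
    try:
        value = Decimal(text)
    except InvalidOperation as exc:
        raise ValueError(f"not a number: {text!r}") from exc
    if not value.is_finite():
        raise ValueError(f"not a finite number: {text!r}")
    # A value is integral exactly when rounding it to an integer changes nothing.
    if value != value.to_integral_value():
        raise ValueError(f"non-zero fractional part: {text!r}")
    return int(value)
```

Using exact decimal arithmetic here is a choice made for the sketch; it sidesteps the double-to-integer cast that the commit message attributes to JsonCpp.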

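The pattern described in commit 841b982e51 (check() logs a failure, returns a bool, and the caller combines the results with & to fail the test at the end) might look roughly like the sketch below. It is written in Python for illustration only; the real helpers live in test_utils.hh and are C++, Python has no equivalent of [[nodiscard]], and the names `check` and `run_checks` as defined here are assumptions of this sketch.

```python
import logging

logger = logging.getLogger("test_utils_sketch")


def check(condition: bool, message: str) -> bool:
    """Log a failed condition without aborting, and report the outcome."""
    if not condition:
        logger.error("check failed: %s", message)
    return bool(condition)


def run_checks(values) -> bool:
    """Run every check and remember whether any of them failed."""
    ok = True
    for v in values:
        # `&=` does not short-circuit, so every check runs and is logged
        # even after an earlier one has already failed.
        ok &= check(v >= 0, f"{v} is negative")
    return ok
```

A test body would end with something like `assert run_checks(data)`, so a failure that was only logged earlier still fails the test, without relying on hidden global state.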

1537 Commits