Commit Graph

638 Commits

Author SHA1 Message Date
Nadav Har'El
e0693f19d0 alternator test: produce newer xunit format for test results
test.py passes the "--junit-xml" option to test/alternator/run, which passes
this option to pytest to get an xunit-format summary of the test results.
Unfortunately, until very recent versions (which aren't yet in Linux
distributions), pytest defaulted to a non-standard xunit format which tools
like Jenkins couldn't properly parse. The more standard format can be chosen
by passing the option "-o junit_family=xunit2", so this is what we do in
this patch.

Fixes #6767 (hopefully).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200719203414.985340-1-nyh@scylladb.com>
2020-07-20 09:24:50 +03:00
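The option-passing described above can be sketched as follows. This is a hypothetical illustration of how a runner script might assemble the pytest command line; `build_pytest_args` and its parameter are illustrative names, not taken from test/alternator/run itself.

```python
# Hypothetical sketch of the pytest command-line assembly described above.
def build_pytest_args(junit_xml_path):
    args = []
    if junit_xml_path:
        # Ask pytest for a JUnit-style XML summary of the results...
        args += ["--junit-xml", junit_xml_path]
        # ...and select the standard "xunit2" family, which Jenkins can
        # parse (older pytest releases defaulted to a non-standard format).
        args += ["-o", "junit_family=xunit2"]
    return args
```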
Avi Kivity
5371be71e9 Merge "Reduce fanout of some mutation-related headers" from Pavel E
"
The set's goal is to reduce the indirect fanout of 3 headers only,
but it likely affects more. The measured improvement rates are:

flat_mutation_reader.hh: -80%
mutation.hh            : -70%
mutation_partition.hh  : -20%

tests: dev-build, 'checkheaders' for the changed headers (the tree-wide
       run fails on master)
"

* 'br-debloat-mutation-headers' of https://github.com/xemul/scylla:
  headers:: Remove flat_mutation_reader.hh from several other headers
  migration_manager: Remove db/schema_tables.hh inclusion into header
  storage_proxy: Remove frozen_mutation.hh inclusion
  storage_proxy: Move paxos/*.hh inclusions from .hh to .cc
  storage_proxy: Move hint_wrapper from .hh to .cc
  headers: Remove mutation.hh from trace_state.hh
2020-07-19 19:47:59 +03:00
Rafael Ávila de Espíndola
9fd2682bfd restrictions_test: Fix use after return
The query_options constructor captures a reference to the cql_config.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200718013221.640926-1-espindola@scylladb.com>
2020-07-19 15:44:38 +03:00
Pavel Emelyanov
8618a02815 migration_manager: Remove db/schema_tables.hh inclusion into header
The schema_tables.hh -> migration_manager.hh couple seems to act as a
"single header for everything", creating big bloat for many seemingly
unrelated .hh's.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-17 17:54:43 +03:00
Nadav Har'El
3b5122fd04 alternator test: fix warning message in test_streams.py
In test_streams.py, we had the line:
  assert desc['StreamDescription'].get('StreamLabel')

In Alternator, the 'StreamLabel' attribute is missing, which the author of
this test probably thought would cause this test to fail (which is expected,
the test is marked with "xfail"). However, my version of pytest actually
doesn't like an assert being given a value instead of a comparison, and we
get the warning message:

  PytestAssertRewriteWarning: asserting the value None, please use "assert is None"

I think that the nicest replacement for this line is

  assert 'StreamLabel' in desc['StreamDescription']

This is very readable, "pythonic", and checks the right thing - that the
JSON includes the 'StreamLabel' item, as the get() assertion was supposed
to do.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200716124621.906473-1-nyh@scylladb.com>
2020-07-17 14:36:23 +03:00
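The two assertion styles discussed above can be demonstrated on a plain dict shaped like a DescribeStream response (the data here is made up for illustration):

```python
# A dict shaped like a DescribeStream response, with 'StreamLabel' absent.
desc = {'StreamDescription': {'StreamArn': 'arn:illustrative:stream/abc'}}

# Old style: asserts on the *value* returned by get(). With the key
# missing, get() returns None, and pytest warns about asserting a plain
# value instead of a comparison.
value = desc['StreamDescription'].get('StreamLabel')

# New style: a membership check that states exactly what is meant -
# the JSON must include the 'StreamLabel' item.
has_label = 'StreamLabel' in desc['StreamDescription']
```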
Rafael Ávila de Espíndola
66d866427d sstable_datafile_test: Use BOOST_REQUIRE_EQUAL
This only works for types that can be printed, but produces a better
error message if the check fails.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Reviewed-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200716232700.521414-1-espindola@scylladb.com>
2020-07-17 11:58:58 +03:00
Avi Kivity
7bf51b8c6c Merge 'Distinguish single-column expressions in AST' from Dejan
"
Fix #6825 by explicitly distinguishing single- from multi-column expressions in AST.

Tests: unit (dev), dtest secondary_indexes_test.py (dev)
"

* dekimir-single-multiple-ast:
  cql3/restrictions: Separate AST for single column
  cql3/restrictions: Single-column helper functions
2020-07-16 16:59:14 +03:00
Pavel Solodovnikov
5ff5df1afd storage_proxy: un-hardcode force sync flag for mutate_locally(mutation) overload
The corresponding overload of `storage_proxy::mutate_locally`
was hardcoded to pass `db::commitlog::force_sync::no` to
`database::apply`. Unhardcode it and pass `force_sync::no`
explicitly at all existing call sites (as it was before).

`force_sync::yes` will be used later for paxos learn writes
when trying to apply mutations upgraded from an obsolete
schema version (similar to the current case when applying
locally a `frozen_mutation` stored in accepted proposal).

Tests: unit(dev)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200716124915.464789-1-pa.solodovnikov@scylladb.com>
2020-07-16 16:38:48 +03:00
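The un-hardcoding pattern above can be modeled in a short sketch. This is hypothetical Python (the real code is C++ and uses `db::commitlog::force_sync`); the point is only that the overload no longer fixes the flag internally, so callers decide, with the old behaviour as the default.

```python
# Record of (mutation, force_sync) pairs passed down, for illustration.
applied = []

def database_apply(mutation, force_sync):
    applied.append((mutation, force_sync))

def mutate_locally(mutation, force_sync=False):
    # Previously the equivalent of force_sync=False was hardcoded here;
    # now the caller decides (paxos learn writes will pass True later).
    database_apply(mutation, force_sync)

mutate_locally("m1")                   # existing call sites: behaviour unchanged
mutate_locally("m2", force_sync=True)  # future paxos-learn path
```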
Dejan Mircevski
0047e1e44d cql3/restrictions: Separate AST for single column
The existing AST assumes the single-column expression is a special case of
multi-column expressions, so it cannot distinguish `c=(0)` from
`(c)=(0)`.  This leads to incorrect behaviour and dtest failures.  Fix
it by separating the two cases explicitly in the AST representation.

Modify AST-creation code to create different AST for single- and
multi-column expressions.

Modify AST-consuming code to handle column_name separately from
vector<column_name>.  Drop code relying on cardinality testing to
distinguish single-column cases.

Add a new unit test for `c=(0)`.

Fixes #6825.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-07-16 12:27:25 +02:00
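The AST split described above can be sketched with two distinct node types, so that `c=(0)` (a single column compared to a tuple value) and `(c)=(0)` (a one-element tuple of columns) no longer collapse into one shape. Class and field names here are illustrative, not Scylla's:

```python
from dataclasses import dataclass

@dataclass
class single_column_relation:
    column: str
    value: object            # the RHS; may itself be a tuple, as in c = (0)

@dataclass
class multi_column_relation:
    columns: tuple           # e.g. ('c',) for (c) = (0)
    values: tuple

# With one shared node type, both cases become a cardinality-1
# multi-column relation and are indistinguishable; with two types the
# consumer dispatches on the node class instead of testing list sizes.
a = single_column_relation('c', (0,))
b = multi_column_relation(('c',), ((0,),))
```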
Nadav Har'El
dcf9c888a2 alternator test: disable test_streams.py::test_get_records
This test usually fails, with the following error. Marking it "xfail" until
we can get to the bottom of this.

dynamodb = dynamodb.ServiceResource()
dynamodbstreams = <botocore.client.DynamoDBStreams object at 0x7fa91e72de80>

    def test_get_records(dynamodb, dynamodbstreams):
        # TODO: add tests for storage/transactionable variations and global/local index
        with create_stream_test_table(dynamodb, StreamViewType='NEW_AND_OLD_IMAGES') as table:
            arn = wait_for_active_stream(dynamodbstreams, table)

            p = 'piglet'
            c = 'ninja'
            val = 'lucifers'
            val2 = 'flowers'
>           table.put_item(Item={'p': p, 'c': c, 'a1': val, 'a2': val2})
test_streams.py:316:
...
E           botocore.exceptions.ClientError: An error occurred (Internal Server Error) when calling the PutItem operation (reached max retries: 3): Internal server error: std::runtime_error (cdc::metadata::get_stream: could not find any CDC stream (current time: 2020/07/15 17:26:36). Are we in the middle of a cluster upgrade?)

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-07-16 08:24:25 +03:00
Nadav Har'El
61f52da9b1 merge: Alternator/CDC: Implement streams support
Merged pull request https://github.com/scylladb/scylla/pull/6694
by Calle Wilund:

Implementation of DynamoDB streams using Scylla CDC.

Fixes #5065

Initial, naive implementation insofar as it uses a 1:1 mapping of CDC stream
to DynamoDB shard, i.e. there are a lot of shards.

Includes tests verified against both local DynamoDB server and actual AWS
remote one.

Note:
Because of how data puts are implemented in alternator, currently we do not
get "proper" INSERT labels for the first write of data, because to CDC it looks
like an update. The test compensates for this, but actual users might not
like it.
2020-07-16 08:18:25 +03:00
Nadav Har'El
c4497bf770 alternator test: enable experimental CDC
In the script test/alternator/run, which runs Scylla for the Alternator
tests, add the "--experimental-features=cdc" option, to allow us to test
the streams API, whose implementation requires the experimental CDC feature.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2020-07-16 08:18:09 +03:00
Nadav Har'El
09a71ccd84 merge: cql3/restrictions: exclude NULLs from comparison in filtering
Merge pull request https://github.com/scylladb/scylla/pull/6834 by
Juliusz Stasiewicz:

NULLs used to give false positives in GT, LT, GEQ and LEQ ops performed upon
ALLOW FILTERING. That was a consequence of not distinguishing NULL from an
empty buffer.

This patch excludes NULLs at a high level, preventing them from entering the LHS
of any comparison, i.e. it assumes that any binary operation should return
false whenever the LHS operand is NULL (note: at the moment filters with
RHS NULL, such as ...WHERE x=NULL ALLOW FILTERING, return empty sets anyway).

Fixes #6295

* '6295-do-not-compare-nulls-v2' of github.com:jul-stas/scylla:
  filtering_test: check that NULLs do not compare to normal values
  cql3/restrictions: exclude NULLs from comparison in filtering
2020-07-15 18:32:14 +03:00
Calle Wilund
76f6fe679a alternator tests: Add streams test
Small set of positive and negative tests of streams
functionality. Verified against DynamoDB and Alternator.
2020-07-15 08:21:34 +00:00
Juliusz Stasiewicz
c25398e8cf filtering_test: check that NULLs do not compare to normal values
Tested operators are: `<` and `>`. Tests all types of NULLs except
`duration` because durations are explicitly not comparable. Values
to compare against were chosen arbitrarily.
2020-07-14 15:37:17 +02:00
Pavel Emelyanov
f8ffc31218 test: Print more sizes in memory_footprint_test
The row cache memory footprint changed after the switch to B+,
because we no longer have a sole cache_entry allocation, but
also the bplus::data and bplus::node ones. Knowing their sizes
helps in analyzing the footprint changes.

Also print the size of memtable_entry that's now also stored
in B+'s data.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:30:02 +03:00
Pavel Emelyanov
174b101a49 row_cache: Switch partition tree onto B+ rails
The row_cache::partitions_type is changed from boost::intrusive::set
to bplus::tree<Key = int64_t, T = array_trusted_bounds<cache_entry>>,

where the token is used to quickly locate the partition and the
internal array -- to resolve hash conflicts.

Summary of changes in cache_entry:

- the compare struct goes away, as the new collection needs a three-way
  comparator, which is provided by ring_position_comparator

- on initialization, the dummy entry is added with the "after_all_keys"
  kind, not "before_all_keys" as it was by default. This is to keep
  tree entries sorted by token

- insertion and removal of cache_entries happens inside double_decker;
  most of the changes in row_cache.cc are about passing constructor args
  from current_allocator.construct into double_decker.emplace_before()

- the _flags field is extended to keep array head/tail bits. There's room
  for it, so sizeof(cache_entry) remains unchanged

The rest fits smoothly into the double_decker API.

Also, as noted in the previous patch, insertion and removal _may_
invalidate iterators, but may also leave them intact. However, currently
this doesn't seem to be a problem, as cache_tracker::insert() and
::on_partition_erase invalidate iterators unconditionally.

Later this can be optimized: iterators are invalidated by double-decker
only in case of a hash conflict; otherwise it doesn't change arrays, and
the B+ tree doesn't invalidate its iterators.

tests: unit(dev), perf(dev)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:30:02 +03:00
Pavel Emelyanov
cf1315cde5 double-decker: A combination of B+tree with array
The collection is a K:V store

   bplus::tree<Key = K, Value = array_trusted_bounds<V>>

It will be used as partitions cache. The outer tree is used to
quickly map token to cache_entry, the inner array -- to resolve
(expected to be rare) hash collisions.

It also must be equipped with two comparators -- a less-only one
for keys and a full one for values. The latter is not kept on
board, but is required on all calls.

The core API consists of just 2 calls

- Heterogeneous lower_bound(search_key) -> iterator : finds the
  element that's greater than or equal to the provided search key

  Other than the iterator the call returns a "hint" object
  that helps the next call.

- emplace_before(iterator, key, hint, ...) : constructs the
  element right before the given iterator. The key and hint
  are needed for a more optimal algorithm, but strictly speaking
  are not required.

  Adding an entry to the double_decker may result in growing the
  node's array. Here the B+ iterator's .reconstruct() method
  comes into play: a new array is created, old elements are
  moved into it, then the fresh node replaces the old one.

// TODO: Ideally this should be turned into the
// template <typename OuterCollection, typename InnerCollection>
// but for now the double_decker still has some intimate knowledge
// about what outer and inner collections are.

Insertion into this collection _may_ invalidate iterators, but
may also leave them intact. Invalidation only happens in case of
a hash conflict, which can be clearly seen from the hint object,
so there's good room for improvement.

The main usage by row_cache (the find_or_create_entry) looks like

   cache_entry find_or_create_entry() {
       bound_hint hint;

       it = lower_bound(decorated_key, &hint);
       if (!hint.found) {
           it = emplace_before(it, decorated_key.token(), hint,
                               <constructor args>);
       }
       return *it;
   }

Now the hint. It contains 3 booleans:

  - match: set to true when the "greater or equal" condition
    evaluated to "equal". This frees the caller from the need
    to manually check whether the returned entry matches the
    search key or whether a new one should be inserted.

    This is the "!found" check from the above snippet.

To explain the next 2 bools, here's a small example. Consider
the tree containing two elements {token, partition key}:

   { 3, "a" }, { 5, "z" }

As the collection is sorted they go in the order shown. Next,
this is what the lower_bound would return for some cases:

   { 3, "z" } -> { 5, "z" }
   { 4, "a" } -> { 5, "z" }
   { 5, "a" } -> { 5, "z" }

Apparently, the lower bound for those 3 elements is the same,
but the code flows for emplacing them before it differ drastically.

   { 3, "z" } : need to get the previous element from the tree and
                 push the element to the back of its vector
   { 4, "a" } : need to create new element in the tree and populate
                its empty vector with the single element
   { 5, "a" } : need to put the new element in the found tree
                element right before the found vector position

To make one of the above decisions the .emplace_before would need
to perform another set of comparisons of keys and elements.
Fortunately, the needed information was already known inside the
lower_bound call and can be reported via the hint.

That said:

  - key_match: set to true if tree.lower_bound() found the element
    for the Key (which is token). For above examples this will be
    true for cases 3z and 5a.

  - key_tail: set to true if the tree element was found, but when
    comparing values from the array, the bounding element turned out
    to belong to the next tree element and the iterator was ++-ed.
    For above examples this would be true for case 3z only.

And last, but not least -- the "erase self" feature, which, given
only the cache_entry pointer, removes the entry from the collection.
To make this happen we need two steps:

1. get the array the entry sits in
2. get the b+ tree node the vector sits in

Both methods are provided by array_trusted_bounds and bplus::tree.
So, when we need to get an iterator from the given T pointer, the
algorithm looks like:

- Walk back the T array until hitting the head element
- Call array_trusted_bounds::from_element() getting the array
- Construct b+ iterator from obtained array
- Construct the double_decker iterator from b+ iterator and from
  the number of "steps back" from above
- Call double_decker::iterator.erase()

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:29:53 +03:00
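The lower_bound/emplace interplay described above can be modeled in a much-simplified sketch: an outer sorted list of tokens mapping to inner sorted arrays, plus a hint object like the one in the message. This is an illustration of the idea, not the C++ implementation; hash-conflict handling and iterator invalidation are elided.

```python
import bisect

class bound_hint:
    def __init__(self):
        self.match = False       # exact (token, key) pair found
        self.key_match = False   # token found in the outer structure

class double_decker:
    def __init__(self):
        self.tokens = []    # sorted outer keys (tokens)
        self.arrays = {}    # token -> sorted inner array of keys

    def lower_bound(self, token, key, hint):
        i = bisect.bisect_left(self.tokens, token)
        if i < len(self.tokens) and self.tokens[i] == token:
            hint.key_match = True
            arr = self.arrays[token]
            j = bisect.bisect_left(arr, key)
            hint.match = j < len(arr) and arr[j] == key
        return i

    def find_or_create(self, token, key):
        # mirrors the find_or_create_entry() snippet from the message
        hint = bound_hint()
        self.lower_bound(token, key, hint)
        if not hint.match:
            if hint.key_match:
                # case { 5, "a" }: insert into the found inner array
                bisect.insort(self.arrays[token], key)
            else:
                # case { 4, "a" }: new outer element with a one-entry array
                bisect.insort(self.tokens, token)
                self.arrays[token] = [key]
        return (token, key)
```

The point is only how lower_bound reports its findings through the hint, so that emplacing does not have to repeat the comparisons.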
Pavel Emelyanov
eb70644c1c intrusive-array: Array with trusted bounds
A plain array of elements that grows and shrinks by
constructing the new instance from an existing one and
moving the elements from it.

Behaves similarly to vector's external array, but has
zero bytes of overhead. The array bounds (0-th and N-th
elements) are determined by checking the flags on the
elements themselves. For this the type must support
getters and setters for the flags.

To remove an element from the array there's also a nothrow
option that drops the requested element, shifts the elements
to its right left by one, and keeps the trailing unused
memory (the so-called "train") until reconstruction or
destruction.

It also comes with a lower_bound() helper that helps keep
the elements sorted, and a from_element() one that returns
a reference back to the array in which the element sits.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:29:49 +03:00
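The "trusted bounds" idea above can be shown with a toy model: the array stores no size; its 0-th and last elements carry head/tail flags, and walking back to the head recovers the array start. Purely illustrative Python; the real structure is a C++ memory layout.

```python
HEAD, TAIL = 1, 2

class elem:
    def __init__(self, value):
        self.value = value
        self.flags = 0

def make_array(values):
    arr = [elem(v) for v in values]
    arr[0].flags |= HEAD        # bounds live on the elements themselves
    arr[-1].flags |= TAIL
    return arr

def head_index(arr, i):
    # Walk left until the HEAD flag is hit; the number of steps taken
    # gives the element's position without any stored bounds.
    steps = 0
    while not (arr[i - steps].flags & HEAD):
        steps += 1
    return i - steps
```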
Pavel Emelyanov
95f15ea383 utils: B+ tree implementation
// The story is at
// https://groups.google.com/forum/#!msg/scylladb-dev/sxqTHM9rSDQ/WqwF1AQDAQAJ

This is the B+ version which satisfies several specific requirements
to be suitable for row-cache usage.

1. Insert/Remove doesn't invalidate iterators
2. Elements should be LSA-compactable
3. Low overhead of data nodes (1 pointer)
4. External less-only comparator
5. As little actions on insert/delete as possible
6. Iterator walks the sorted keys

The design, briefly is:

There are 3 types of nodes: inner, leaf and data. Inner and leaf
nodes keep a built-in array of N keys and N(+1) child nodes. Leaf
nodes sit in a doubly linked list. Data nodes live separately from
the leaf ones and keep pointers to them. The tree handle keeps
pointers to the root and to the left-most and right-most leaves.
Nodes do _not_ keep pointers or references to the tree (except 3
of them, see below).

changes in v9:

- explicitly marked keys/kids indices with type aliases
- marked the whole erase/clear stuff noexcept
- disposers now accept object pointer instead of reference
- clear tree in destructor
- added more comments
- style/readability review comments fixed

Prior changes

**
- Add noexcepts where possible
- Restrict Less-comparator constraint -- it must be noexcept
- Generalized node_id
- Packed code for begin()/cbegin()

**
- Unsigned indices everywhere
- Cosmetics changes

**
- Const iterators
- C++20 concepts

**
- The index_for() implementation is templatized the other way
  to make an AVX key-search specialization possible (further
  patching)

**
- Insertion tries to push kids to siblings before split

  Before this change, insertion into a full node resulted in the
  node being split into two equal parts. Under a random-key stress,
  this behaviour gives a tree with ~2/3 of its nodes half-filled.

  With this change, before splitting, the full node tries to push one
  element to each of its siblings (if they exist and are not full).
  This slows insertion a bit (but it's still way faster than
  std::set), but gives a 15% smaller total number of nodes.

- Iterator method to reconstruct the data at the given position

  The helper creates a new data node, emplaces data into it and
  replaces the iterator's one with it. Needed to keep arrays of
  data in the tree.

- Milli-optimize erase()
  - Return back an iterator that will likely be not re-validated
  - Do not try to update ancestors separation key for leftmost kid

  This caused clear()-like workloads to perform poorly compared to
  std::set. In particular, the row_cache::invalidate() method does
  exactly this, and this change improves its timing.

- Perf test to measure drain speed
- Helper call to collect tree counters

**
- Fix corner case of iterator.emplace_before()
- Clean heterogeneous lookup API
- Handle exceptions from nodes allocations
- Explicitly mark places where the key is copied (for future)
- Extend the tree.lower_bound() API to report back whether
  the bound hit the key or not
- Addressed style/cleanness review comments

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:29:43 +03:00
Juliusz Stasiewicz
c69075bbef cql3/restrictions: exclude NULLs from comparison in filtering
NULLs used to give false positives in GT, LT, GEQ and LEQ ops
performed upon `ALLOW FILTERING`. That was a consequence of
not distinguishing NULL from an empty buffer. This patch excludes
NULLs on high level, preventing them from entering any comparison,
i.e. it assumes that any binary operator should return `false`
whenever one of the operands is NULL (note: ATM filters such as
`...WHERE x=NULL ALLOW FILTERING` return empty sets anyway).

`restriction_test/regular_col_slice` had to be updated accordingly.

Fixes #6295
2020-07-14 12:59:01 +02:00
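The rule described above can be sketched in a few lines: any ordering comparison whose LHS operand is NULL (None here) evaluates to false under ALLOW FILTERING, instead of treating NULL as an empty buffer that compares below everything. This is a hypothetical Python model, not the C++ restriction code.

```python
import operator

OPS = {'<': operator.lt, '>': operator.gt,
       '<=': operator.le, '>=': operator.ge}

def filter_compare(lhs, op, rhs):
    if lhs is None:       # NULL never enters the comparison at all
        return False
    return OPS[op](lhs, rhs)
```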
Pavel Emelyanov
9d38846ed2 test: Move perf measurement helpers into header
To use the code in new perf tests in next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 12:58:26 +03:00
Nadav Har'El
8e3be5e7d6 alternator test: configurable temporary directory
The test/alternator/run script creates a temporary directory for the Scylla
database in /tmp. The assumption was that this is the fastest disk (usually
even a ramdisk) on the test machine, and we didn't need anything else from
it.

But it turns out that on some systems, /tmp is actually a slow disk, so
this patch adds a way to configure the temporary directory - if the TMPDIR
environment variable exists, it is used instead of /tmp. As before this
patch, a temporary subdirectory is created in $TMPDIR, and this subdirectory
is automatically deleted when the test ends.

The test.py script already passes an appropriate TMPDIR (testlog/$mode),
which after this patch the Alternator test will use instead of /tmp.

Fixes #6750

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200713193023.788634-1-nyh@scylladb.com>
2020-07-14 08:52:22 +03:00
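The behaviour described above can be sketched as follows (the actual script is shell, not Python): honour $TMPDIR when set, fall back to /tmp, and create a per-run subdirectory that is deleted when the test ends.

```python
import os
import shutil
import tempfile

def make_test_dir():
    # Use $TMPDIR if the environment variable exists, else /tmp.
    base = os.environ.get('TMPDIR', '/tmp')
    # Create a fresh subdirectory for this test run.
    return tempfile.mkdtemp(prefix='alternator-', dir=base)

tmpdir = make_test_dir()
# ... run the server with its data directory under tmpdir ...
shutil.rmtree(tmpdir)       # cleaned up when the test ends
```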
Nadav Har'El
35f7048228 alternator: CreateTable with bad Tags shouldn't create a table
Currently, if a user tries to CreateTable with a forbidden set of tags,
e.g., the Tags list is too long or contains an invalid value for
system:write_isolation, then the CreateTable request fails but the table
is still created - without the tag, of course.

This patch fixes this bug, and adds two test cases for it that fail
before this patch, and succeed with it. One of the test cases is
scylla_only because it checks the Scylla-specific system:write_isolation
tag, but the second test case works on DynamoDB as well.

What this patch does is to split the update_tags() function into two
parts - the first part just parses the Tags, validates them, and builds
a map. Only the second part actually writes the tags to the schema.
CreateTable now does the first part early, before creating the table,
so failure in parsing or validating the Tags will not leave a created
table behind.

Fixes #6809.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200713120611.767736-1-nyh@scylladb.com>
2020-07-13 17:14:44 +03:00
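The split described above can be modeled briefly: parsing and validating the Tags happens before the table is created, so a validation failure can no longer leave a table behind. The limit and names here are illustrative, not Alternator's actual checks.

```python
tables = {}

def parse_and_validate_tags(tags):
    # First part: parse the Tags, validate them, and build a map.
    if len(tags) > 50:                       # e.g. a list-length limit
        raise ValueError('too many tags')
    return {t['Key']: t['Value'] for t in tags}

def create_table(name, tags):
    tag_map = parse_and_validate_tags(tags)  # may raise; nothing created yet
    # Second part: only now is the table (and its tags) actually created.
    tables[name] = {'tags': tag_map}
```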
Nadav Har'El
f549d147ea alternator: fix Expected's "NULL" operator with missing AttributeValueList
The "NULL" operator in Expected (old-style conditional operations) doesn't
have any parameters, so we insisted that the AttributeValueList be empty.
However, we forgot to allow it to also be missing - a possibility which
DynamoDB allows.

This patch adds a test to reproduce this case (the test passes on DynamoDB,
fails on Alternator before this patch, and succeeds after this patch), and
a fix.

Fixes #6816.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200709161254.618755-1-nyh@scylladb.com>
2020-07-10 07:45:02 +02:00
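The fix above amounts to accepting both an empty and a missing AttributeValueList for the parameterless "NULL" operator. A hypothetical sketch (`validate_null_operator` is an illustrative helper name, not Alternator's code):

```python
def validate_null_operator(expected_entry):
    # get() with a default treats a missing list the same as an empty one.
    values = expected_entry.get('AttributeValueList', [])
    if values:
        raise ValueError('NULL operator takes no AttributeValueList')
```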
Benny Halevy
3ce86a7160 test: restrictions_test: set_contains: uncomment check depending on #6797
Now that #6797 is fixed.

Refs #5763

Cc: Dejan Mircevski <dejan@scylladb.com>
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Test: restrictions_test(debug)
Message-Id: <20200709123703.955897-1-bhalevy@scylladb.com>
2020-07-09 17:56:09 +03:00
Nadav Har'El
9042161ba3 merge: cdc: better pre/postimages for complicated batches
Merged pull request https://github.com/scylladb/scylla/pull/6741
by Piotr Dulikowski:

This PR changes the algorithm used to generate preimages and postimages
in CDC log. While its behavior is the same for non-batch operations
(with one exception described later), it generates pre/postimages that
are organized more nicely, and account for multiple updates to the same
row in one CQL batch.

Fixes #6597, #6598

Tests:
- unit(dev), for each consecutive commit
- unit(debug), for the last commit

Previous method

The previous method worked on a per delta row basis. First, the base
table is queried for the current state of the rows being modified in
the processed mutation (this is called the "preimage query"). Then,
for each delta row (representing a modification of a row):

    - If preimage is enabled and the row was already present in the table,
      a corresponding preimage row is inserted before the delta row.
      The preimage row contains data taken directly from the preimage
      query result. Only columns that are modified by the delta are
      included in the preimage.

    - If postimage is enabled, then a postimage row is inserted after the
      delta row. The postimage row contains data which is the result of
      taking the row data directly from the preimage query result and
      applying the change the corresponding delta row represented. All
      columns of the row are included in the postimage.

The above works well for simple cases such as a singular CQL INSERT,
UPDATE, DELETE, or simple CQL BATCH-es. An example:

cqlsh:ks> BEGIN UNLOGGED BATCH
			INSERT INTO tbl (pk, ck, v) VALUES (0, 1, 111);
			INSERT INTO tbl (pk, ck, v) VALUES (0, 2, 222);
			APPLY BATCH;
cqlsh:ks> SELECT "cdc$batch_seq_no", "cdc$operation", "cdc$ttl",
			pk, ck, v from ks.tbl_scylla_cdc_log ;

 cdc$batch_seq_no | cdc$operation | cdc$ttl | pk | ck | v
------------------+---------------+---------+----+----+-----
...snip...
                0 |             0 |    null |  0 |  1 | 100
                1 |             2 |    null |  0 |  1 | 111
                2 |             9 |    null |  0 |  1 | 111
                3 |             0 |    null |  0 |  2 | 200
                4 |             2 |    null |  0 |  2 | 222
                5 |             9 |    null |  0 |  2 | 222

Preimage rows are represented by cdc operation 0, and postimage by 9.
Please note that all rows presented above share the same value of
cdc$time column, which was not shown here for brevity.

Problems with previous approach

This simple algorithm has some conceptual and implementational problems
which arise when processing more complicated CQL BATCH-es. Consider
the following example:

cqlsh:ks> BEGIN UNLOGGED BATCH
			INSERT INTO tbl (pk, ck, v1) VALUES (0, 0, 1) USING TTL 1000;
			INSERT INTO tbl (pk, ck, v2) VALUES (0, 0, 2) USING TTL 2000;
			APPLY BATCH;
cqlsh:ks> SELECT "cdc$batch_seq_no", "cdc$operation", "cdc$ttl",
			pk, ck, v1, v2 FROM tbl_scylla_cdc_log;

 cdc$batch_seq_no | cdc$operation | cdc$ttl | pk | ck | v1   | v2
------------------+---------------+---------+----+----+------+------
...snip...
                0 |             0 |    null |  0 |  0 | null |    0
                1 |             2 |    2000 |  0 |  0 | null |    2
                2 |             9 |    null |  0 |  0 |    0 |    2
                3 |             0 |    null |  0 |  0 |    0 | null
                4 |             1 |    1000 |  0 |  0 |    1 | null
                5 |             9 |    null |  0 |  0 |    1 |    0

A single cdc group (corresponding to rows sharing the same cdc$time)
might have more than one delta that modify the same row. For example,
this happens when modifying two columns of the same row with
different TTLs - due to our choice of CDC log schema, we must
represent such change with two delta rows.

It does not make sense to present a postimage after the first delta
and preimage before the second - both deltas are applied
simultaneously by the same CQL BATCH, so the middle "image" is purely
imaginary and does not appear at any point in the table.

Moreover, in this example, the last postimage is wrong - v1 is updated,
but v2 is not. None of the postimages presented above represent the
final state of the row.

New algorithm

The new algorithm works now on per cdc group basis, not delta row.
When starting processing a CQL BATCH:

    - Load preimage query results into a data structure representing
      the current state of the affected rows.

For each cdc group:

    - For each row modified within the group, a preimage row is
      produced, regardless of whether the row was present in the table.
      The preimage is calculated based on the current state. Only
      columns that are modified for this row within the group are
      included.
    - For each delta, produce a delta row and update the current state
      accordingly.
    - Produce postimages in the same way as preimages - but include all
      columns for each row in the postimage.

The new algorithm produces the postimage correctly when multiple deltas
affect one row, because the state of the row is updated on the fly.

This algorithm moves preimage and postimage rows to the beginning and
the end of the cdc group, respectively. This solves the problem of
imaginary preimages and postimages appearing inside a cdc group.

Unfortunately, it is possible for one CQL BATCH to contain changes that
use multiple timestamps. This will result in one CQL BATCH creating
multiple cdc groups, with different cdc$time. As it is impossible, with
our choice of schema, to tell that those cdc groups were created from
one CQL BATCH, instead we pretend as if those groups were separate CQL
operations. By tracking the state of the affected rows, we make sure
that preimages in later groups will reflect changes introduced in
previous groups.

One more thing - this algorithm should have the same results for
singular CQL operations and simple CQL BATCH-es, with one exception.
Previously, a preimage was not produced if the row was not present in
the table. Now, the preimage row will appear unconditionally - it will
have nulls in place of the column values.

* 'cdc-pre-postimage-persistence' of github.com:piodul/scylla:
  cdc: fix indentation
  cdc: don't update partition state when not needed
  cdc: implement pre/postimage persistence
  cdc: add interface for producing pre/postimages
  cdc: load preimage query result into partition state fields
  cdc: introduce fields for keeping partition state
  cdc: rename set_pk_columns -> allocate_new_log_row
  cdc: track batch_no inside transformer
  cdc: move cdc$time generation to transformer
  cdc: move find_timestamp to split.cc
  cdc: introduce change_processor interface
  cdc: remove redundant schema arguments from cdc functions
  cdc: move management of generated mutations inside transformer
  cdc: move preimage result set into a field of transformer
  cdc: keep ts and tuuid inside transformer
  cdc: track touched parts of mutations inside transformer
  cdc: always include preimage for affected rows
2020-07-09 16:55:55 +03:00
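The per-group algorithm above can be modeled in a simplified sketch: keep a running row state, emit one preimage (modified columns only) before the group's deltas and one postimage (all columns) after them, applying each delta to the state on the fly. Illustrative Python, not the CDC implementation.

```python
def process_group(state, deltas, all_columns):
    # state: the current row values; deltas: {column: value} updates
    # to the same row within one cdc group.
    modified = sorted({c for d in deltas for c in d})
    preimage = {c: state.get(c) for c in modified}      # before any delta
    for delta in deltas:
        state.update(delta)                             # track state on the fly
    postimage = {c: state.get(c) for c in all_columns}  # full final row
    return preimage, postimage

# The two-INSERT batch from the example: both deltas touch one row, and
# the single postimage now reflects both v1 and v2.
state = {}
pre, post = process_group(state, [{'v1': 1}, {'v2': 2}], ['v1', 'v2'])
```

Because the same `state` dict carries over, a later group's preimage reflects the changes made by earlier groups, as the message describes for multi-timestamp batches.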
Piotr Sarna
75dbaa0834 test: add alternator test for incorrect numeric values
The test case is put inside test_manual_requests suite, because
boto3 validates numeric inputs and does not allow passing arbitrary
incorrect values.

Tests: unit(dev), alternator(local, remote)

Message-Id: <ac2baedc2ea61f0d857e7c01839f34cd15f7e02d.1594289250.git.sarna@scylladb.com>
2020-07-09 13:58:33 +03:00
Dejan Mircevski
d956233a80 cql_query_test: Drop get() on cquery_nofail result
cquery_nofail returns the query result, not a future.  Invoking .get()
on its result is unnecessary.  This just happened to compile because
shared_ptr has a get() method with the same signature as future::get.

Tests: cql_query_test unit test (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-07-09 13:52:52 +03:00
Nadav Har'El
9ff9cd37c3 alternator test: tests for the number type
We had some tests for the number type in Alternator and how it can be
stored, retrieved, calculated and sorted, but only had rudimentary tests
for the allowed magnitude and precision of numbers.

This patch creates a new test file, test_number.py, with tests aiming to
check exactly the supported magnitudes and precision of numbers.

These tests verify two things:

1. That Alternator's number type supports the full precision and magnitude
   that DynamoDB's number type supports. We don't want to see precision
   or magnitude lost when storing and retrieving numbers, or when doing
   calculations on them.

2. That Alternator's number type does not have *better* precision or
   magnitude than DynamoDB does. If it did, users may be tempted to rely
   on that implementation detail.

The three tests of the first type pass, but all four tests of the second
type xfail: Alternator currently stores numbers using big_decimal, which
has unlimited precision and almost-unlimited magnitude, and is not yet
limited by the precision and magnitude allowed by DynamoDB.
This is a known issue - Refs #6794 - and these four new xfailing tests
can be used to reproduce that issue.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200707204824.504877-1-nyh@scylladb.com>
2020-07-09 07:38:36 +02:00
Piotr Dulikowski
246f8da6f6 cdc: implement pre/postimage persistence
Moves responsibility for generating pre/postimage rows from the
"process_change" method to "produce_preimage" and "produce_postimage".
This commit actually affects the contents of generated CDC log
mutations.

Added a unit test that verifies more complicated cases with CQL BATCH.
2020-07-08 15:36:41 +02:00
Piotr Dulikowski
f907cab156 cdc: remove redundant schema arguments from cdc functions
A `mutation` object already has a reference to its schema. It does not
make sense to call the functions changed in this commit with a different
schema.
2020-07-08 15:36:40 +02:00
Piotr Dulikowski
027d20c654 cdc: always include preimage for affected rows
This changes the current algorithm so that the preimage row will not be
skipped if the corresponding row was not present in the preimage query
results.
2020-07-08 15:36:40 +02:00
Rafael Ávila de Espíndola
b10beead61 memtable_snapshot_source: Avoid a std::bad_alloc crash
_should_compact is a condition_variable and condition_variable::wait()
allocates memory.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200706223201.903072-1-espindola@scylladb.com>
2020-07-08 15:21:50 +02:00
Avi Kivity
7ea9ee27dd Merge 'aggregates: Use type-specific comparators in min/max' from Juliusz
"
For collections and UDTs the `MIN()` and `MAX()` functions are
generated on the fly. Until now they worked by comparing just the
byte representations of their arguments.

This patch employs specific per-type comparators to provide semantically
sensible, dynamically created aggregates.

Fixes #6768
"

* jul-stas-6768-use-type-comparators-for-minmax:
  tests: Test min/max on set
  aggregate_fcts: Use per-type comparators for dynamic types
2020-07-08 15:07:57 +03:00
Juliusz Stasiewicz
f08e0e10be tests: Test min/max on set
The expected behavior is lexicographical comparison of sets
(element by element), so this test was failing when raw byte
representations were compared.
2020-07-08 13:39:15 +02:00
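The pitfall these patches address can be sketched outside Scylla: with a size-prefixed serialization, byte-wise comparison looks at the element count before any element, reversing the order of sets. The serialization and function names below are hypothetical illustrations, not the actual aggregate_fcts code:

```cpp
#include <algorithm>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical serialization of a set<uint8_t>: element count, then elements.
std::vector<uint8_t> serialize(const std::set<uint8_t>& s) {
    std::vector<uint8_t> out;
    out.push_back(static_cast<uint8_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
    return out;
}

// Byte-wise comparison of the serialized form (the old behavior):
// the size prefix is compared before any element is looked at.
bool byte_less(const std::set<uint8_t>& a, const std::set<uint8_t>& b) {
    return serialize(a) < serialize(b);
}

// Element-by-element lexicographical comparison (the expected behavior).
bool typed_less(const std::set<uint8_t>& a, const std::set<uint8_t>& b) {
    return std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end());
}
```

For example, element-wise {1, 3} sorts before {2}, but byte-wise the shorter set's size prefix makes {2} sort first.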
Avi Kivity
b0698dfb38 Merge 'Rewrite CQL3 restriction representation' from dekimir
"
This is the first stage of replacing the existing restrictions code with a new representation. It adds a new class `expression` to replace the existing class `restriction`. Lots of the old code is deleted, though not all -- that will come in subsequent stages.

Tests: unit (dev, debug restrictions_test), dtest (next-gating)
"

* dekimir-restrictions-rewrite:
  cql3/restrictions: Drop dead code
  cql3/restrictions: Use free functions instead of methods
  cql3/restrictions: Create expression objects
  cql3/restrictions: Add free functions over new classes
  cql3/restrictions: Add new representation
2020-07-08 10:22:17 +03:00
Dejan Mircevski
37ebe521e3 cql3/restrictions: Use free functions instead of methods
Instead of `restriction` class methods, use the new free functions.
Specific replacement actions are listed below.

Note that class `restrictions` (plural) remains intact -- both its
methods and its type hierarchy remain intact for now.

Ensure full test coverage of the replacement code with new file
test/boost/restrictions_test.cc and some extra testcases in
test/cql/*.

Drop some existing tests because they codify buggy behaviour
(reference #6369, #6382).  Drop others because they forbid relation
combinations that are now allowed (e.g., mixing equality and
inequality, comparing to NULL, etc.).

Here are some specific categories of what was replaced:

- restriction::is_foo predicates are replaced by using the free
  function find_if; sometimes it is used transitively (see, eg,
  has_slice)

- restriction::is_multi_column is replaced by dynamic casts (recall
  that the `restrictions` class hierarchy still exists)

- utility methods is_satisfied_by, is_supported_by, to_string, and
  uses_function are replaced by eponymous free functions; note that
  restrictions::uses_function still exists

- restriction::apply_to is replaced by free function
  replace_column_def

- when checking infinite_bound_range_deletions, the has_bound is
  replaced by local free function bounded_ck

- restriction::bounds and restriction::value are replaced by the more
  general free function possible_lhs_values

- using free functions allows us to simplify the
  multi_column_restriction and token_restriction hierarchies; their
  methods merge_with and uses_function became identical in all
  subclasses, so they were moved to the base class

- single_column_primary_key_restrictions<clustering_key>::needs_filtering
  was changed to reuse num_prefix_columns_that_need_not_be_filtered,
  which uses free functions

Fixes #5799.
Fixes #6369.
Fixes #6371.
Fixes #6372.
Fixes #6382.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-07-07 23:08:09 +02:00
Nadav Har'El
0143aaa5a8 merge: Forbid internal schema changes for distributed tables
Merged patch set from Piotr Sarna:

This series addresses issue #6700 again (it was reopened),
by forbidding all non-local schema changes to be performed
from within the database via CQL interface. These changes
are dangerous since they are not directly propagated to other
nodes.

Tests: unit(dev)
Fixes #6700

Piotr Sarna (4):
  test: make schema changes in query_processor_test global
  cql3: refuse to change schema internally for distributed tables
  test: expand testing internal schema changes
  cql3: add explanatory comments to execute_internal

 cql3/query_processor.hh                      | 13 ++++++++++++-
 cql3/statements/alter_table_statement.cc     |  6 ------
 cql3/statements/schema_altering_statement.cc | 15 +++++++++++++++
 test/boost/cql_query_test.cc                 |  8 ++++++--
 test/boost/query_processor_test.cc           | 16 ++++++++--------
 5 files changed, 41 insertions(+), 17 deletions(-)
2020-07-07 18:27:16 +03:00
Piotr Sarna
8ecae38d6b test: expand testing internal schema changes
... in order to ensure that not only ALTER TABLE, but also other
schema altering statements are not allowed for distributed
tables/keyspaces.
2020-07-07 10:02:58 +02:00
Piotr Sarna
9bdf17a804 test: make schema changes in query_processor_test global
Now that schema changes are going to be forbidden for non-local tables,
query_processor_test is updated accordingly.
2020-07-07 09:09:40 +02:00
Botond Dénes
5ebe2c28d1 db/view: view_update_generator: re-balance wait/signal on the register semaphore
The view update generator has a semaphore to limit concurrency. This
semaphore is waited on in `register_staging_sstable()` and later the
unit is returned after the sstable is processed in the loop inside
`start()`.
This was broken by 4e64002, which changed the loop inside `start()` to
process sstables in per-table batches, but didn't change the
`signal()` call to return the amount of units according to the number of
sstables processed. This can cause the semaphore units to dry up, as the
loop can process multiple sstables per table but return just a single
unit. This can also block callers of `register_staging_sstable()`
indefinitely, as some waiters will never be released: under the right
circumstances the units on the semaphore can permanently go below 0.
In addition to this, 4e64002 introduced another bug: table entries from
`_sstables_with_tables` are never removed, so they are processed
every turn. If the sstable list is empty, there won't be any update
generated but due to the unconditional `signal()` described above, this
can cause the units on the semaphore to grow to infinity, allowing
future staging sstables producers to register a huge amount of sstables,
causing memory problems due to the amount of sstable readers that have
to be opened (#6603, #6707).
Both outcomes are equally bad. This patch fixes both issues and modifies
the `test_view_update_generator` unit test to reproduce them and hence
to verify that this doesn't happen in the future.

Fixes: #6774
Refs: #6707
Refs: #6603

Tests: unit(dev)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200706135108.116134-1-bdenes@scylladb.com>
2020-07-07 08:53:00 +02:00
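The unit accounting the patch above fixes can be sketched in miniature. The type and names here are hypothetical, not the actual view_update_generator API: the point is only that a batch loop which consumes several registered items must return one unit per item, not one per batch.

```cpp
#include <deque>

// Hypothetical miniature of a register/process cycle guarded by a
// counting semaphore: registering an sstable takes one unit, and the
// batch-processing loop must give back one unit per sstable processed.
struct batch_processor {
    int units;                 // free semaphore units
    std::deque<int> staged;    // registered-but-unprocessed sstables

    explicit batch_processor(int n) : units(n) {}

    bool try_register(int sstable_id) {
        if (units == 0) {
            return false;      // a real caller would wait on the semaphore
        }
        --units;
        staged.push_back(sstable_id);
        return true;
    }

    // Process the whole batch; return one unit per processed sstable
    // (the buggy version signalled a single unit per batch).
    int process_batch() {
        int processed = static_cast<int>(staged.size());
        staged.clear();        // also clears the per-table entry, so it is
                               // not reprocessed on the next turn
        units += processed;
        return processed;
    }
};
```

With a single `units += 1` in `process_batch()`, three registrations followed by one batch would leak two units forever, eventually blocking all future registrations.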
Dejan Mircevski
921dbd0978 cql/restrictions: Handle WHERE a>0 AND a<0
WHERE clauses with start point above the end point were handled
incorrectly.  When the slice bounds are transformed to interval
bounds, the resulting interval is interpreted as wrap-around (because
start > end), so it contains all values above 0 and all values below
0.  This is clearly incorrect, as the user's intent was to filter out
all possible values of a.

Fix it by explicitly short-circuiting to false when start > end.  Add
a test case.

Fixes #5799.

Tests: unit (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-07-06 19:11:20 +03:00
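The short-circuit above can be sketched with an illustrative interval type (names and types are hypothetical, not the actual restrictions code): when the lower bound ends up above the upper bound, the result must be the empty interval, never a wrap-around one.

```cpp
#include <optional>

// Illustrative interval over int, inclusive on both ends.
struct interval {
    int start;
    int end;
    bool contains(int v) const { return start <= v && v <= end; }
};

// Building an interval from slice bounds: a start above the end means
// no value can satisfy the clause, so short-circuit to "empty" rather
// than interpreting the pair as a wrap-around interval.
std::optional<interval> from_bounds(int start, int end) {
    if (start > end) {
        return std::nullopt;   // e.g. WHERE a > 0 AND a < 0 lands here
    }
    return interval{start, end};
}
```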
Piotr Sarna
446b89f408 test: move json tests from manual/ to boost/
Manual tests are, as the name suggests, not run automatically,
which makes them more prone to regressions. JSON tests are
fast and correct, so there's no reason for them to be marked
as manual.

Message-Id: <dea75b0a0d1c238d12382a28840978884ac6ec2c.1594023481.git.sarna@scylladb.com>
2020-07-06 11:24:12 +03:00
Piotr Sarna
83ab41c76d test: add json test for parsing from map
Our JSON legacy helper functions for parsing documents to/from
string maps are indirectly tested by several unit tests, e.g.
caching_options_test.cc. They however lacked one corner case
detected only by dtest - parsing an empty map from a null JSON document.
This case is hereby added in order to prevent future regressions.

Message-Id: <df8243bd083b2ba198df665aeb944c8710834736.1594020411.git.sarna@scylladb.com>
2020-07-06 10:28:55 +03:00
Avi Kivity
cc891a5de8 Merge "Convert a few uses of sstring to std::string_view" from Rafael
"
This series converts an API to use std::string_view and then converts
a few sstring variables to be constexpr std::string_view. This has the
advantage that a constexpr variables cannot be part of any
initialization order problem.
"

* 'espindola/convert-to-constexpr' of https://github.com/espindola/scylla:
  auth: Convert sstring variables in common.hh to constexpr std::string_view
  auth: Convert sstring variables in default_authorizer to constexpr std::string_view
  cql_test_env: Make ks_name a constexpr std::string_view
  class_registry: Use std::string_view in (un)?qualified_name
2020-07-05 17:08:54 +03:00
Rafael Ávila de Espíndola
33af0c293f cql_test_env: Make ks_name a constexpr std::string_view
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-07-03 12:28:20 -07:00
Nadav Har'El
8e3ecc30a9 merge: Migrate from libjsoncpp to rjson
Merged patch series by Piotr Sarna:

The alternator project was in need of a more optimized
JSON library, which resulted in creating "rjson" helper functions.
Scylla generally used libjsoncpp for its JSON handling, but in order
to reduce the dependency hell, the usage is now migrated
to rjson, which is faster and offers the same functionality.

The original plan was to be able to drop the dependency
on libjsoncpp-lib altogether and remove it from install-dependencies.sh,
but one last usage of it remains in our test suite,
namely cql_repl. The tool compares its output JSON textually,
so it depends on how a library presents JSON - the delimiters,
indentation, etc. It's possible to provide a layer of translation
to force rjson to print in an identical format, but the other issue
is that libjsoncpp keeps subobjects sorted by their name,
while rjson uses an unordered structure.
There are two possible solutions for the last remaining usage
of libjsoncpp:
 1. change our test suite to compare JSON documents with a JSON parser,
    so that we don't rely on internal library details
 2. provide a layer of translation which forces rjson to print
its objects in a format identical to libjsoncpp.
(1.) would be preferred, since now we're also vulnerable to changes
inside libjsoncpp itself - if they change anything in their output
format, tests would start failing. The issue is not critical however,
so it's left for later.

Tests: unit(dev), manual(json_test),
       dtest(partitioner_tests.TestPartitioner.murmur3_partitioner_test)

Piotr Sarna (8):
  alternator,utils: move rjson.hh to utils/
  alternator: remove ambiguous string overloads in rjson
  rjson: add parse_to_map helper function
  rjson: add from_string_map function
  rjson: add non-throwing parsing
  rjson: move quote_json_string to rjson
  treewide: replace libjsoncpp usage with rjson
  configure: drop json.cc and json.hh helpers

 alternator/base64.hh                |   2 +-
 alternator/conditions.cc            |   2 +-
 alternator/executor.hh              |   2 +-
 alternator/expressions.hh           |   2 +-
 alternator/expressions_types.hh     |   2 +-
 alternator/rmw_operation.hh         |   2 +-
 alternator/serialization.cc         |   2 +-
 alternator/serialization.hh         |   2 +-
 alternator/server.cc                |   2 +-
 caching_options.hh                  |   9 +-
 cdc/log.cc                          |   4 +-
 column_computation.hh               |   5 +-
 configure.py                        |   3 +-
 cql3/functions/functions.cc         |   4 +-
 cql3/statements/update_statement.cc |  24 ++--
 cql3/type_json.cc                   | 212 ++++++++++++++++++----------
 cql3/type_json.hh                   |   7 +-
 db/legacy_schema_migrator.cc        |  12 +-
 db/schema_tables.cc                 |   1 -
 flat_mutation_reader.cc             |   1 +
 index/secondary_index.cc            |  80 +++++------
 json.cc                             |  80 -----------
 json.hh                             | 113 ---------------
 schema.cc                           |  25 ++--
 test/boost/cql_query_test.cc        |   9 +-
 test/manual/json_test.cc            |   4 +-
 test/tools/cql_repl.cc              |   1 +
 {alternator => utils}/rjson.cc      |  75 +++++++++-
 {alternator => utils}/rjson.hh      |  40 +++++-
 29 files changed, 344 insertions(+), 383 deletions(-)
 delete mode 100644 json.cc
 delete mode 100644 json.hh
 rename {alternator => utils}/rjson.cc (86%)
 rename {alternator => utils}/rjson.hh (81%)
2020-07-03 18:23:56 +02:00
Piotr Sarna
4cb79f04b0 treewide: replace libjsoncpp usage with rjson
In order to eventually switch to a single JSON library,
most of the libjsoncpp usage is dropped in favor of rjson.
Unfortunately, one usage still remains:
the test/tools/cql_repl utility heavily depends on the *exact textual*
format of its output JSON files, so replacing the library results
in all tests failing because of differences in formatting.
It is possible to force rjson to print its documents in the exact
matching format, but that's left for later, since the issue is not
critical. It would be nice though if our test suite compared
JSON documents with a real JSON parser, since there are more
differences - e.g. libjsoncpp keeps children of the object
sorted, while rapidjson uses an unordered data structure.
This change should cause no change in semantics; it strives
just to replace all usage of libjsoncpp with rjson.
2020-07-03 10:27:23 +02:00
Piotr Sarna
1b37517aab rjson: move quote_json_string to rjson
This utility function is used for type serialization,
but it also has a dedicated unit test, so it needs to be globally
reachable.
2020-07-03 10:27:23 +02:00