Improve "pull_github_pr.sh" to detect the number of commits in a
pull request, and use "git cherry-pick" to merge single-commit pull
requests.
Message-Id: <20200713093044.96764-1-penberg@scylladb.com>
There are a bunch of renames that are done if PRODUCT is not the
default, but the Python code for them is incorrect. Path.glob()
is not a static method, and Path does not support .endswith().
Fix by constructing a Path object, and later casting to str.
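A minimal sketch of the corrected pattern follows. The helper name
rename_for_product, the file names and the renaming rule are all
hypothetical and only illustrate the two fixes (calling glob() on a Path
instance, and casting to str before using string methods):

```python
from pathlib import Path
import tempfile


def rename_for_product(root, product, default="scylla"):
    """Rename *.service files starting with `default` (hypothetical rule)."""
    renamed = []
    # Path.glob() is an instance method, so build a Path object first.
    # sorted() materializes the matches before any file is renamed.
    for p in sorted(Path(root).glob(f"{default}*")):
        # Path has no .endswith(); cast to str to use string methods.
        if str(p).endswith(".service"):
            target = p.with_name(p.name.replace(default, product, 1))
            p.rename(target)
            renamed.append(target.name)
    return renamed


with tempfile.TemporaryDirectory() as d:
    (Path(d) / "scylla-server.service").touch()
    (Path(d) / "scylla-notes.txt").touch()
    result = rename_for_product(d, "acme")
```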
Let's report each missing CPU feature individually, and improve the
error message a bit. For example, if the "clmul" instruction is missing,
the report looks as follows:
ERROR: You will not be able to run Scylla on this machine because its CPU lacks the following features: pclmulqdq
If this is a virtual machine, please update its CPU feature configuration or upgrade to a newer hypervisor.
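The per-feature reporting can be sketched as below. This is only an
illustration in Python: the required feature list and the function names
are assumptions, not the actual check shipped with Scylla.

```python
# Hypothetical feature list; the real check may require a different set.
REQUIRED_FEATURES = ["sse4_2", "pclmulqdq"]


def missing_cpu_features(cpu_flags, required=REQUIRED_FEATURES):
    """Return each required feature absent from cpu_flags, in required order."""
    present = set(cpu_flags)
    return [f for f in required if f not in present]


def format_report(missing):
    """Build the error text, naming every missing feature individually."""
    if not missing:
        return ""
    return (
        "ERROR: You will not be able to run Scylla on this machine because "
        "its CPU lacks the following features: " + " ".join(missing)
    )


# Example flags, as they might appear in /proc/cpuinfo on a limited VM.
flags = "fpu sse sse2 sse4_2 avx".split()
report = format_report(missing_cpu_features(flags))
```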
Fixes #6528
Add links to the users and developers mailing lists, and the Slack
channel in README.md to make them more discoverable.
Message-Id: <20200713074654.90204-1-penberg@scylladb.com>
Simplify the build and run instructions by splitting the text in three
sections (prerequisites, building, and running) and streamlining the
steps a bit.
Message-Id: <20200713065910.84582-1-penberg@scylladb.com>
scylla fiber often fails to really unwind the entire fiber, stopping
sooner than expected. This is expected as scylla fiber only recognizes
the most standard continuations but can drop the ball as soon as there
is an unusual transmission.
This commit adds a message below the found tasks explaining that the
list might not be exhaustive and prints a command which can be used to
explain why the unwinding stopped at the last task.
While at it also rephrase an out-of-date comment.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200710120813.100009-1-bdenes@scylladb.com>
As requested in #5763 feedback, require that Fn be callable with
binary_operator in the functions mentioned above.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
The problem is that this option is defined in seastar testing wrapper,
while no unit tests use it, all just start themselves with app.run() and
would complain on unknown option.
"Would", because nowadays every single test in it declares its own options
in suite.yaml, that override test.py's defaults. Once an option-less unit
test is added (B+ tree ones) it will complain.
The proposal is to remove this option from the defaults. If any unit test
uses the seastar testing wrappers and needs this option, it can add it
to suite.yaml.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200709084602.8386-1-xemul@scylladb.com>
The goal is to make the lambdas that are fed into the partition cache's
clear_and_dispose() and erase_in_dispose() noexcept.
This is to satisfy B+, which strictly requires those to be noexcept
(currently used collections don't care).
The set covers not only the strictly required minimum, but also some
other methods that happened to be nearby.
* https://github.com/xemul/scylla/tree/br-noexcepts-over-the-row-cache:
row_cache: Mark invalidation lambda as noexcept
cache_tracker: Mark methods noexcept
cache_entry: Mark methods noexcept
region: Mark trivial noexcept methods as such
allocation_strategy: Mark returning lambda as noexcept
allocation_strategy: Mark trivial noexcept methods as such
dht: Mark noexcept methods
The "NULL" operator in Expected (old-style conditional operations) doesn't
have any parameters, so we insisted that the AttributeValueList be empty.
However, we forgot to allow it to also be missing - a possibility which
DynamoDB allows.
This patch adds a test to reproduce this case (the test passes on DynamoDB,
fails on Alternator before this patch, and succeeds after this patch), and
a fix.
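The intended rule can be modeled in a few lines of Python. This is not
the Alternator code (which is C++); validate_null_condition is a
hypothetical helper that only shows that a missing AttributeValueList is
treated the same as an empty one:

```python
def validate_null_condition(condition):
    """Validate an Expected entry that uses ComparisonOperator NULL (sketch)."""
    if condition.get("ComparisonOperator") != "NULL":
        raise ValueError("not a NULL condition")
    # NULL takes no operands: the list may be absent (None) or empty.
    values = condition.get("AttributeValueList")
    if values:
        raise ValueError("NULL operator requires an empty AttributeValueList")
    return True


ok_missing = validate_null_condition({"ComparisonOperator": "NULL"})
ok_empty = validate_null_condition(
    {"ComparisonOperator": "NULL", "AttributeValueList": []}
)
```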
Fixes #6816.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200709161254.618755-1-nyh@scylladb.com>
If either of the compared bytes_views is empty,
treat the empty prefix as equal and proceed to compare
the size of the suffix.
A similar issue exists in legacy_compound_view::tri_comparator::operator().
It too must not pass nullptr to memcmp if either of the compared
bytes_views is empty.
Fixes #6797
Refs #6814
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Test: unit(dev)
Branches: all
Message-Id: <20200709123453.955569-1-bhalevy@scylladb.com>
Merged pull request https://github.com/scylladb/scylla/pull/6741
by Piotr Dulikowski:
This PR changes the algorithm used to generate preimages and postimages
in CDC log. While its behavior is the same for non-batch operations
(with one exception described later), it generates pre/postimages that
are organized more nicely, and account for multiple updates to the same
row in one CQL batch.
Fixes #6597, #6598
Tests:
- unit(dev), for each consecutive commit
- unit(debug), for the last commit
Previous method
The previous method worked on a per delta row basis. First, the base
table is queried for the current state of the rows being modified in
the processed mutation (this is called the "preimage query"). Then,
for each delta row (representing a modification of a row):
If preimage is enabled and the row was already present in the table,
a corresponding preimage row is inserted before the delta row.
The preimage row contains data taken directly from the preimage
query result. Only columns that are modified by the delta are
included in the preimage.
If postimage is enabled, then a postimage row is inserted after the
delta row. The postimage row contains data which was a result of
taking row data directly from the preimage query result and applying
the change the corresponding delta row represented. All columns
of the row are included in the postimage.
The above works well for simple cases such as singular CQL INSERT,
UPDATE, DELETE, or simple CQL BATCH-es. An example:
cqlsh:ks> BEGIN UNLOGGED BATCH
INSERT INTO tbl (pk, ck, v) VALUES (0, 1, 111);
INSERT INTO tbl (pk, ck, v) VALUES (0, 2, 222);
APPLY BATCH;
cqlsh:ks> SELECT "cdc$batch_seq_no", "cdc$operation", "cdc$ttl",
pk, ck, v from ks.tbl_scylla_cdc_log ;
cdc$batch_seq_no | cdc$operation | cdc$ttl | pk | ck | v
------------------+---------------+---------+----+----+-----
...snip...
0 | 0 | null | 0 | 1 | 100
1 | 2 | null | 0 | 1 | 111
2 | 9 | null | 0 | 1 | 111
3 | 0 | null | 0 | 2 | 200
4 | 2 | null | 0 | 2 | 222
5 | 9 | null | 0 | 2 | 222
Preimage rows are represented by cdc operation 0, and postimage by 9.
Please note that all rows presented above share the same value of
cdc$time column, which was not shown here for brevity.
Problems with previous approach
This simple algorithm has some conceptual and implementational problems
which arise when processing more complicated CQL BATCH-es. Consider
the following example:
cqlsh:ks> BEGIN UNLOGGED BATCH
INSERT INTO tbl (pk, ck, v1) VALUES (0, 0, 1) USING TTL 1000;
INSERT INTO tbl (pk, ck, v2) VALUES (0, 0, 2) USING TTL 2000;
APPLY BATCH;
cqlsh:ks> SELECT "cdc$batch_seq_no", "cdc$operation", "cdc$ttl",
pk, ck, v1, v2 FROM tbl_scylla_cdc_log;
cdc$batch_seq_no | cdc$operation | cdc$ttl | pk | ck | v1 | v2
------------------+---------------+---------+----+----+------+------
...snip...
0 | 0 | null | 0 | 0 | null | 0
1 | 2 | 2000 | 0 | 0 | null | 2
2 | 9 | null | 0 | 0 | 0 | 2
3 | 0 | null | 0 | 0 | 0 | null
4 | 1 | 1000 | 0 | 0 | 1 | null
5 | 9 | null | 0 | 0 | 1 | 0
A single cdc group (corresponding to rows sharing the same cdc$time)
might have more than one delta that modify the same row. For example,
this happens when modifying two columns of the same row with
different TTLs - due to our choice of CDC log schema, we must
represent such change with two delta rows.
It does not make sense to present a postimage after the first delta
and preimage before the second - both deltas are applied
simultaneously by the same CQL BATCH, so the middle "image" is purely
imaginary and does not appear at any point in the table.
Moreover, in this example, the last postimage is wrong - v1 is updated,
but v2 is not. None of the postimages presented above represent the
final state of the row.
New algorithm
The new algorithm now works on a per cdc group basis, not per delta row.
When starting processing a CQL BATCH:
Load preimage query results into a data structure representing
current state of the affected rows.
For each cdc group:
For each row modified within the group, a preimage is produced,
regardless of whether the row was present in the table. The preimage
is calculated based on the current state. Only include columns
that are modified for this row within the group.
For each delta, produce a delta row and update the current state
accordingly.
Produce postimages in the same way as preimages - but include all
columns for each row in the postimage.
The new algorithm produces the postimage correctly when multiple deltas
affect one row, because the state of the row is updated on the fly.
This algorithm moves preimage and postimage rows to the beginning and
the end of the cdc group, respectively. This solves the problem of
imaginary preimages and postimages appearing inside a cdc group.
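The per-group procedure can be sketched with a simplified Python model.
This is not the C++ implementation: process_batch, the "pre"/"delta"/
"post" labels and the dict-based row state are assumptions made for
illustration, and TTLs, deletions and static rows are ignored. Each
entry of `groups` stands for one cdc group (one timestamp).

```python
def process_batch(preimage_query_result, groups, all_columns):
    """Return log rows as (kind, row_key, columns) tuples (simplified model)."""
    # Current state of the affected rows, seeded from the preimage query.
    state = {key: dict(cols) for key, cols in preimage_query_result.items()}
    log = []
    for deltas in groups:
        # Columns touched per row within this group, rows in first-seen order.
        touched = {}
        for key, changes in deltas:
            cols = touched.setdefault(key, [])
            for c in changes:
                if c not in cols:
                    cols.append(c)
        # Preimages first: touched columns only, nulls if the row is absent.
        for key, cols in touched.items():
            current = state.get(key, {})
            log.append(("pre", key, {c: current.get(c) for c in cols}))
        # Then the deltas, updating the tracked state on the fly.
        for key, changes in deltas:
            log.append(("delta", key, dict(changes)))
            state.setdefault(key, {}).update(changes)
        # Postimages last: all columns of every touched row.
        for key in touched:
            log.append(("post", key, {c: state[key].get(c) for c in all_columns}))
    return log


# The problematic batch from above: two deltas hitting row (0, 0) at once.
before = {(0, 0): {"v1": 0, "v2": 0}}
batch = [[((0, 0), {"v1": 1}), ((0, 0), {"v2": 2})]]
log = process_batch(before, batch, all_columns=["v1", "v2"])
```

In this model the group yields one preimage, two deltas and a single
postimage with both v1 and v2 updated, with no intermediate images.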
Unfortunately, it is possible for one CQL BATCH to contain changes that
use multiple timestamps. This will result in one CQL BATCH creating
multiple cdc groups, with different cdc$time. As it is impossible, with
our choice of schema, to tell that those cdc groups were created from
one CQL BATCH, instead we pretend as if those groups were separate CQL
operations. By tracking the state of the affected rows, we make sure
that preimage in later groups will reflect changes introduced in
previous groups.
One more thing - this algorithm should have the same results for
singular CQL operations and simple CQL BATCH-es, with one exception.
Previously, preimage was not produced if a row was not present in the
table. Now, the preimage row will appear unconditionally - it will have
nulls in place of column values.
* 'cdc-pre-postimage-persistence' of github.com:piodul/scylla:
cdc: fix indentation
cdc: don't update partition state when not needed
cdc: implement pre/postimage persistence
cdc: add interface for producing pre/postimages
cdc: load preimage query result into partition state fields
cdc: introduce fields for keeping partition state
cdc: rename set_pk_columns -> allocate_new_log_row
cdc: track batch_no inside transformer
cdc: move cdc$time generation to transformer
cdc: move find_timestamp to split.cc
cdc: introduce change_processor interface
cdc: remove redundant schema arguments from cdc functions
cdc: move management of generated mutations inside transformer
cdc: move preimage result set into a field of transformer
cdc: keep ts and tuuid inside transformer
cdc: track touched parts of mutations inside transformer
cdc: always include preimage for affected rows
All but a few are trivially such.
The clear_continuity() calls cache_entry::set_continuous() that had become noexcept
a patch ago.
The allocator() calls region.allocator() which had been marked noexcept a few
patches back.
The on_partition_erase() calls allocator().invalidate_references(), both had
been marked noexcept a few patches back.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
All but one are trivially such, the position() one calls is_dummy_entry()
which has become noexcept right now.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
cquery_nofail returns the query result, not a future. Invoking .get()
on its result is unnecessary. This just happened to compile because
shared_ptr has a get() method with the same signature as future::get.
Tests: cql_query_test unit test (dev)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
When "trace"-level logging is enabled for Alternator, we log every request,
but currently only the request's body. For debugging, it is sometimes useful
to also see the headers - which are important to debug authentication,
for example. So let's print the headers as well.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200709103414.599883-1-nyh@scylladb.com>
In one of the longevity tests, we observed 1.3s reactor stall which came from
repair_meta::get_full_row_hashes_source_op. It traced back to a call to
std::unordered_set::insert() which triggered big memory allocation and
reclaim.
I measured std::unordered_set, absl::flat_hash_set, absl::node_hash_set
and absl::btree_set. The absl::btree_set was the only one for which the
seastar oversized allocation checker did not warn in my tests, where
around 300K repair hashes were inserted into the container.
- unordered_set:
hash_sets=295634, time=333029199 ns
- flat_hash_set:
hash_sets=295634, time=312484711 ns
- node_hash_set:
hash_sets=295634, time=346195835 ns
- btree_set:
hash_sets=295634, time=341379801 ns
The btree_set is a bit slower than unordered_set but it does not perform
huge memory allocations. I did not measure any real difference in the total
time to finish repair of the same dataset with unordered_set and btree_set.
To fix, switch to absl btree_set container.
Fixes #6190
We had some tests for the number type in Alternator and how it can be
stored, retrieved, calculated and sorted, but only had rudimentary tests
for the allowed magnitude and precision of numbers.
This patch creates a new test file, test_number.py, with tests aiming to
check exactly the supported magnitudes and precision of numbers.
These tests verify two things:
1. That Alternator's number type supports the full precision and magnitude
that DynamoDB's number type supports. We don't want to see precision
or magnitude lost when storing and retrieving numbers, or when doing
calculations on them.
2. That Alternator's number type does not have *better* precision or
magnitude than DynamoDB does. If it did, users may be tempted to rely
on that implementation detail.
The three tests of the first type pass, but all four tests of the second
type xfail: Alternator currently stores numbers using big_decimal which
has unlimited precision and almost-unlimited magnitude, and is not yet
limited by the precision and magnitude allowed by DynamoDB.
This is a known issue - Refs #6794 - and these four new xfailing tests
can be used to reproduce that issue.
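To make the two kinds of checks concrete, here is a small Python sketch
of a limit check. The limits used (38 significant digits, positive
magnitude between 1E-130 and 9.99...E+125) are my understanding of
DynamoDB's documented number limits and are not stated in this patch;
fits_dynamodb_number is a hypothetical helper, not part of test_number.py.

```python
from decimal import Decimal

# Assumed DynamoDB limits (not taken from this patch).
MAX_DIGITS = 38
MAX_POSITIVE = Decimal("9." + "9" * 37 + "E+125")
MIN_POSITIVE = Decimal("1E-130")


def fits_dynamodb_number(text):
    """Return True if `text` is within the assumed precision and magnitude."""
    d = Decimal(text)  # exact conversion, independent of context precision
    if d.is_zero():
        return True
    # Count significant digits, ignoring trailing zeros of the coefficient.
    digits = list(d.as_tuple().digits)
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()
    # copy_abs() does not round, unlike abs() under the default context.
    magnitude = d.copy_abs()
    return len(digits) <= MAX_DIGITS and MIN_POSITIVE <= magnitude <= MAX_POSITIVE


within = fits_dynamodb_number("1" * 38)
too_precise = fits_dynamodb_number("1" * 39)
too_large = fits_dynamodb_number("1E126")
```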
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200707204824.504877-1-nyh@scylladb.com>
While building unified-deb we first use scylla/reloc/build_deb.sh to create the scylla core package, and after that scylla/reloc/python3/build_deb.sh to create python3.
In 058da69#diff-4a42abbd0ed654a1257c623716804c82 a new rm -rf command was added.
It causes the python3 build step to erase the output of the Scylla core build step.
Set python3 to erase its own dir scylla-python3-package only.
In some cases, tracking the state of processed rows inside `transformer`
is not needed at all. We don't need to do it if either:
- Preimage and postimage are disabled for the table,
- Only preimage is enabled and we are processing the last timestamp.
This commit disables updating the state in the cases listed above.
Moves responsibility for generating pre/postimage rows from the
"process_change" method to "produce_preimage" and "produce_postimage".
This commit actually affects the contents of generated CDC log
mutations.
Added a unit test that verifies more complicated cases with CQL BATCH.
Introduces new methods to the change_processor interface that will cause
it to produce pre/postimage rows for requested clustering key, or for
static row.
Introduces logic in split.cc responsible for calling pre/postimage
methods of the change_processor interface. This does not have any effect
on generated CDC log mutations yet, because the transformer class has
empty implementations in place of those methods.
Instead of looking up preimage data directly from the raw preimage query
results, use the raw results to populate current partition state data,
and read directly from the current partition state.
The function is no longer used in log.cc, so instead it is moved to
split.cc.
Removed declaration of the function from the log.hh header, because it
is not used elsewhere - apart from testing code, but it already
declared find_timestamp in the cdc_test.cc file.
This allows for a more refined use of the transformer by the
for_each_change function (now named "process_changes_with_splitting").
The change_processor interface exposes two methods so far:
begin_timestamp, and process_change (previously named "transform").
By separating those two and exposing them,
process_changes_with_splitting can cause the transformer to generate
fewer CDC log mutations - only one for each timestamp in the batch.
Adds a `begin_timestamp` method which tells the `transformer` to start
using the following timestamp and timeuuid when generating new log row
mutations.
Moves tracking of the "touched parts" statistics inside the transformer
class.
This commit is the first of multiple commits in this series which move
parts of the state used in CDC log row generation inside the
`transformer` class. There is a lot of state being passed to
`transformer` each time its methods are called, which could be as well
tracked by the `transformer` itself. This will result in a nicer
interface and will allow us to generate fewer CDC log mutations which
give the same result.