Commit Graph

23464 Commits

Author SHA1 Message Date
Pavel Emelyanov
ca148acbf9 deletable_row: Do not mess with clustering_row
The deletable_row accepts clustering_row in constructor and
.apply() method. The next patch will make clustering_row
embed the deletable_row inside, so those two methods will
violate layering and should be fixed in advance.

The fix is in providing a clustering_row method to convert
itself into a deletable_row. There are two places that need
this: mutation_fragment_applier and partition_snapshot_row_cursor.
Both methods pass temporary clustering_row value, so the
method in question is also move-converter.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-09-08 22:18:15 +03:00
Wojciech Mitros
66e8214606 cql: Forbid adding new fields to UDTs used in partition key columns
Changing a user type may allow adding apparently duplicate rows to
tables where this type is used in a partitioning key. Fix by checking
all types of existing partitioning columns before allowing to add new
fields to the type.

Fixes #6941
2020-09-08 16:08:07 +03:00
Avi Kivity
7ac59dcc98 lsa: decay reserves
The log-structured allocator (LSA) reserves memory when performing
operations, since its operations are performed with reclaiming disabled
and if it runs out, it cannot evict cache to gain more. The amount of
memory to reserve is remembered across calls so that it does not have
to repeat the fail/increase-reserve/retry cycle for every operation.

However, we currently lack decaying the amount to reserve. This means
that if a single operation increased the reserve in the distant past,
all current operations also require this large reserve. Large reserves
are expensive since they can cause large amounts of cache to be evicted.

This patch adds reserve decay. The time-to-decay is inversely proportional
to reserve size: 10GB/reserve. This means that a 20MB reserve is halved
after 500 operations (10GB/20MB) while a 20kB reserve is halved after
500,000 operations (10GB/20kB). So large, expensive reserves are decayed
quickly while small, inexpensive reserves are decayed slowly to reduce
the risk of allocation failures and exceptions.

A unit test is added.

Fixes #325.
2020-09-08 15:59:25 +03:00
Takuya ASADA
4deb245198 scylla_ntp_setup: don't install ntpd package when it's already exists
Don't install ntpd package when it's already exists.

Related with #7153
2020-09-08 13:59:04 +03:00
Etienne Adam
63a1a4cbb9 redis: add hgetall and hdel commands
This patch adds support for 2 hash commands HDEL and HGETALL.

Internally it introduces the hashes_result_builder class to
read hashes and stored them in a std::map.

Other changes:
  - one exception return string was fixed
  - tests now use pytest.raises

Signed-off-by: Etienne Adam <etienne.adam@gmail.com>
Message-Id: <20200907202528.4985-1-etienne.adam@gmail.com>
2020-09-08 11:59:52 +03:00
Benny Halevy
f8d9e81bdb types: time_point_to_string: prevent overflow of nanoseconds
Due to #7175, microseconds are stored in a db_clock::time_point
as if they were milliseconds.

std::chrono::duration_cast<std::chrono::nanoseconds> may cause overflow
and end up with invalid/negative nanos.

This change specializes time_point_to_string to std::chrono::milliseconds
since it's currently only called to print db_clock::time_point
and uses boost::posix_time::milliseconds to print the count.

This would generate an exception in today's time stamps
and the output will look like:
    1599493018559873 milliseconds (Year is out of valid range: 1400..9999)
instead of:
    1799-07-16T19:57:52.175010

It is preferrable to print the numeric value annotated as out of valid range
than to print a bogus date in the past.

Test: unit(dev), commitlog_test:TestCommitLog.test_mixed_mode_commitlog_same_partition_smp_1
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200907162845.147477-1-bhalevy@scylladb.com>
2020-09-08 10:02:02 +03:00
Nadav Har'El
f76c519c1f merge: Alternator streams - fix table description and sequence number
Merged pull request https://github.com/scylladb/scylla/pull/7160
By Calle Wilund:

Stream descriptions, as returned from create, update and describe stream was missing "latest" stream arn.

Shard descriptions sequence number (for us timeuuid:s) were formatted wrongly. The spec states they should be numeric only.

Both these defects break Kinesis operations.

  alternator: Set CDC delta to keys only for alternator streams
  alternator: Include stream spec in desc for create/update/describe
  alternator: Include LatestStreamLabel in resulting desc for create/update table
  alternator: Make "StreamLabel" an iso8601 timestamp
  alternator: Alloc BILLING_MODE in update_table
  cdc: Add setter for delta mode
  alternator: Fix sequence number range using wrong format
  alternator: Include stream arn in table description if enabled
2020-09-07 18:26:21 +03:00
Piotr Grabowski
ffd8c8c505 utf8: Print invalid UTF-8 character position
Add new validate_with_error_position function
which returns -1 if data is a valid UTF-8 string
or otherwise a byte position of first invalid
character. The position is added to exception
messages of all UTF-8 parsing errors in Scylla.

validate_with_error_position is done in two
passes in order to preserve the same performance
in common case when the string is valid.
2020-09-07 18:11:21 +03:00
Piotr Grabowski
462d12f555 db: Propagate enable_cache to system keyspaces
Make enable_cache configuration option also
affect caching of system keyspaces. Fixes #2909.
2020-09-07 17:54:46 +03:00
Calle Wilund
7224ae6d38 alternator: Set CDC delta to keys only for alternator streams
Fixes #7190

Since we don't use any delta value when translating cdc -> streams
it is wasteful to write these to the log table, esp. since we already
write big fat pre- and post images.
2020-09-07 14:27:54 +00:00
Calle Wilund
f7bb0baba7 alternator: Include stream spec in desc for create/update/describe
Fixes #7163

If enabled, the resulting table description should include a
StreamDescription object with the appropriate members describing
current stream settings.
2020-09-07 14:26:21 +00:00
Calle Wilund
e6266d5652 alternator: Include LatestStreamLabel in resulting desc for create/update table
Fixes #7162

Same value as 'StreamLabel' in the currently active stream (cdc log) if
enabled.
2020-09-07 14:24:48 +00:00
Calle Wilund
fa68493d64 alternator: Make "StreamLabel" an iso8601 timestamp
Fixes #7164

See https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_TableDescription.html
StreamLabel: A timestamp, in ISO 8601 format, for this stream

Scylla tables do not have a timestamp as such, but the UUID for a given schema
is a timeuuid, so we can misuse this to fake a creation timestamp.
2020-09-07 14:24:00 +00:00
Calle Wilund
f16792aad0 alternator: Alloc BILLING_MODE in update_table
While it does not do anything, we want something to
update for testing (dynamo python libs refuse empty
update).
2020-09-07 14:15:17 +00:00
Calle Wilund
d29d676955 cdc: Add setter for delta mode 2020-09-07 14:14:04 +00:00
Botond Dénes
c01af1d9d2 tests/boost/multishard_mutation_query_test: remove last BOOST_REQUIRE* macros
Previous patches removed those `BOOST_REQUIRE*` macros that could be
invoked from shards other than 0. The reason is that said macros are not
thread-safe, so calling them from multiple shards produces mangled
output to stdout as well as the XML report file. It was assumed that
only these invocations -- from a non-0 shard -- are problematic, but it
turns out even these can race with seastar log messages emitted from
other shards. This patch removes all such macros, replacing them with
the thread safe `require*` functions from `test/lib/test_utils.hh`.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200907125309.1199104-1-bdenes@scylladb.com>
2020-09-07 17:07:26 +03:00
Calle Wilund
a7d021ee57 alternator: Fix sequence number range using wrong format
Fixes #7158

A streams shard  descriptions has a sequence range describing start/end
(if available) of the shard. This is specified as being "numeric only".

Alternator incorrectly used UUID here, which breaks kinesis.

v2:
* Fix uint128_t parsing from string. bmp::number constructor accepted
  sstring, but did not interpret it as std::string/chars. Weird results.
2020-09-07 12:01:22 +00:00
Pekka Enberg
1ed9a336a5 Update tools/jmx submodule
* tools/jmx 12ab6aa...8d92e54 (1):
  > Merge 'JMX footprint work' from Calle
Fixes scylladb/scylla-jmx#133
Fixes scylladb/scylla-jmx#134
2020-09-07 13:56:47 +03:00
Benny Halevy
0c474b1c01 types: time_point_to_string: handle errors from boost::posix_time::to_iso_extended_string
As seen in https://github.com/scylladb/scylla/issues/7175,
1e676cd845
that was merged in bc77939ada
exposed a preexisting problem in time_point_to_string
where it tried printing a timestamp that was in microseconds
(taken from an api::timestamp_type instead of db_clock::time_point)
and hit `boost::wrapexcept<boost::gregorian::bad_year> (Year is out of valid range: 1400..9999)`

If hit, this patch with print the offending time_stamp in
nanoseconds and the error message.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200907083303.33229-1-bhalevy@scylladb.com>
2020-09-07 11:36:43 +03:00
Calle Wilund
f5c79d15a8 alternator: Include stream arn in table description if enabled
Fixes #7157

When creating/altering/describing a table, if streams are enabled, the
"latest active" stream arn should be included as LatestStreamArn.

Not doing so breaks java kinesis.
2020-09-07 08:16:11 +00:00
Pekka Enberg
e4266ead98 Improve build documentation
This improves the build documentation beyond just packaging:

- Explain how the configure.py step works

- Explain how to build just executables and tests (for development)

- Explain how to build for specific build mode if you didn't specify a
  build mode in configure.py step

- Fix build artifact locations, missing .debs, and add executables and
  tests

Message-Id: <20200904084443.495137-1-penberg@iki.fi>
2020-09-07 10:51:31 +03:00
Benny Halevy
66ce3a4c25 types: time_point_to_string: do not assume tp is in milliseconds
T& tp may have other period than milliseconds.
Cast the time_point duration to nanoseconds (or microseconds
if boost doesn't supports it) so it is printed in the best possible
resolution.

Note that we presume that the time_point epoch is the
Unix epoch of 1970-01-01, but the c++ standard doesn't guwarntee
that.  See https://github.com/scylladb/scylla/issues/5498

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200906171106.690872-1-bhalevy@scylladb.com>
2020-09-07 10:44:52 +03:00
Pekka Enberg
478e831d4f Update tools/jmx submodule
* tools/jmx d5d1efd...12ab6aa (1):
  > Merge "Fix JMX startup after offline installation" from Amos

Fixes: scylladb/scylla#7098
Fixes: scylladb/scylla-jmx#129
2020-09-07 09:42:26 +03:00
Piotr Jastrzebski
4499a37eae docs: Improve protocol-extensions documentation
Documentation states that `SCYLLA_LWT_OPTIMIZATION_META_BIT_MASK`
is a 32-bit integer that represents bit mask. What it fails to mention
is that it's a unsigned value and in fact it takes value of 2147483648.
This is problematic for clients in languages that don't have unsigned
types (like Java).

This patch improves the documentation to make it clear that
`SCYLLA_LWT_OPTIMIZATION_META_BIT_MASK` is represented by unsigned
value.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <7166b736461ae6f3d8ffdf5733e810a82aa02abc.1599382184.git.piotr@scylladb.com>
2020-09-06 13:35:12 +03:00
Dejan Mircevski
a127f5615b cql3: Simplify pk test in statement_restrictions
statement_restrictions::process_partition_key_restrictions() was
checking has_unrestricted_components(), whereas just an empty() check
suffices there, because has_unrestricted_components() is implicitly
checked five lines down by needs_filtering().

The replacement check is cheaper and simpler to understand.

Tests: unit (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-09-06 12:39:36 +03:00
Dejan Mircevski
df3ea2443b cql3: Drop all uses_function methods
No one seems to call them except for other uses_function methods.

Tests: unit (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-09-04 17:27:30 +02:00
Piotr Grabowski
561753fe71 mutation: Improve log print of mutations
Changes format of mutation, mutation_partition
log messages to more human-readable. Fixes #826.
2020-09-04 16:33:25 +02:00
Tomasz Grabiec
bcdcf06ec7 Merge "lwt: for each statement in cas_request provide a row in CAS result set" from Pavel Solodovnikov
Previously batch statement result set included rows for only
those updates which have a prefetch data present (i.e. there
was an "old" (pre-existing) row for a key).

Also, these rows were sorted not in the order in which statements
appear in the batch, but in the order of updated clustering keys.

If we have a batch which updates a few non-existent keys, then
it's impossible to figure out which update inserted a new key
by looking at the query response. Not only because the responses
may not correspond to the order of statements in the batch, but
even some rows may not show up in the result set at all.

Please see #7113 on Github for detailed description
of the problem:
https://github.com/scylladb/scylla/issues/7113

The patch set proposes the following fix:

For conditional batch statements the result set now always
includes a row for each LWT statement, in the same order
in which individual statements appear in the batch.

This way we can always tell which update did actually insert
a new key or update the existing one.

Technically, the following changes were made:
 * `update_parameters::prefetch_data::row::is_in_cas_result_set`
   member removed as well as the supporting code in
   `cas_request::applies_to` which iterated through cas updates
   and marked individual `prefetch_data` rows as "need to be in
   cas result set".
 * `cas_request::applies_to` substantially simplified since it
   doesn't do anything more than checking `stmt.applies_to()`
   in short-circuiting manner.
 * `modification_statement::build_cas_result_set` method moved
   to `cas_request`. This allows to easily iterate through
   individual `cas_row_update` instances and preserve the order
   of the rows in the result set.
 * A little helper `cas_request::find_old_row`
   is introduced to find a row in `prefetch_data` based on the
   (pk, ck) combination obtained from the current `cas_request`
   and a given `cas_row_update`.
 * A few tests for the issue #7113 are written, other lwt-batch-related
   tests adjusted accordingly.
2020-09-04 16:09:45 +02:00
Pavel Solodovnikov
92fd515186 lwt: for each statement in cas_request provide a row in CAS result set
Previously batch statement result set included rows for only
those updates which have a prefetch data present (i.e. there
was an "old" (pre-existing) row for a key).

Also, these rows were sorted not in the order in which statements
appear in the batch, but in the order of updated clustering keys.

If we have a batch which updates a few non-existent keys, then
it's impossible to figure out which update inserted a new key
by looking at the query response. Not only because the responses
may not correspond to the order of statements in the batch, but
even some rows may not show up in the result set at all.

The patch proposes the following fix:

For conditional batch statements the result set now always
includes a row for each LWT statement, in the same order
in which individual statements appear in the batch.

This way we can always tell which update did actually insert
a new key or update the existing one.

`update_parameters::prefetch_data::row::is_in_cas_result_set`
member variable was removed as well as supporting code in
`cas_request::applies_to` which iterated through cas updates
and marked individual `prefetch_data` rows as "need to be in
cas result set".

Instead now `cas_request::applies_to` is significantly
simplified since it doesn't do anything more than checking
`stmt.applies_to()` in short-circuiting manner.

A few tests for the issue are written, other lwt-batch-related
tests were adjusted accordingly to include rows in result set
for each statement inside conditional batches.

Tests: unit(dev, debug)

Co-authored-by: Konstantin Osipov <kostja@scylladb.com>
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-09-04 13:13:26 +03:00
Pavel Solodovnikov
feaf2b6320 cas_request: move modification_statement::build_cas_result_set to cas_request
This is just a plain move of the code from `modification_statement`
to `cas_request` without changes in the logic, which will further
help to refactor `build_cas_result_set` behavior to include a row
for each LWT statement and order rows in the order of statements
in a batch.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-09-04 12:25:06 +03:00
Pavel Solodovnikov
0f0ff73a58 cas_request: extract find_old_row helper function
Factor out little helper function which finds a pre-existing
row for a given `cas_row_update` (matching the primary key).
Used in `cas_request::applies_to`.

Will be used in a subsequent patch to move
`modification_statement::build_cas_result_set` into `cas_request`.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-09-04 12:09:31 +03:00
Botond Dénes
e88b8a9a07 docs/debugging.md: document how TLS variables work
We use a lot of TLS variables yet GDB is not of much help when working
with these. So in this patch I document where they are located in
memory, how to calculate the address of a known TLS variable and how to
find (identify) one given an address.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200903131802.1068288-1-bdenes@scylladb.com>
2020-09-03 17:41:15 +02:00
Avi Kivity
bc77939ada Update seastar submodule
* seastar 7f7cf0f232...4ff91c4c3a (47):
  > core/reactor: complete_timers(): restore previous scheduling group
Fixes #7117.
  > future: Drop spurious 'mutable' from a lambda
  > future: Don't put a std::tuple in future_state
  > future: Prepare for changing the future_state storage type
  > future: Add future_state::get0()
  > when_all: Replace another untuple with get0
  > when_all: Use get0 instead of untuple
  > rpc: Don't assume that a future stores a std::tuple
  > future: Add a futurize_base
  > testing: Stop boost from installing its own signal handler
  > future: Define futurize::value_type with future::value_type
  > future: Move futurize down the file
  > Merge "Put logging onto {fmt} rails" from Pavel E
  > Merge "Make future_state non variadic" from Rafael
  > sharded: propagate sharded instances stop error
  > log: Fix misprint in docs
  > future: Use destroy_at in destructor
  > future: Add static_assert rejecting variadic future_state
  > futures_test: Drop variadic test
  > when_all: Drop a superfluous use of futurize::from_tuples
  > everywhere: Use future::get0 when appropriate
  > net: upgrade sockopt comments to doxygen
  > iostream: document the read_exactly() function
  > net:tls: fix clang-10 compilation duration cast
  > future: Move continuation_base_from_future to future.hh
  > repeat: Drop unnecessary make_tuple
  > shared_future: Don't use future_state_type
  > future: Add a static_assert against variadic futures
  > future: Delete warn_variadic_future
  > rpc_demo: Don't use a variadic future
  > rpc_test: Don't use a variadic future
  > futures_test: Don't use a variadic future
  > future: Move disable_failure_guard to promise::schedule
  > net: add an interface for custom socket options
  > posix: add one more setsockopt overload
  > Merge "Simplify pollfn and its inheritants" from Pavel E
  > util:std-compat.hh: add forward declaration of std::pmr for clang-10
  > rpc: Add protocol::has_handlers() helper
  > Add a Seastar_DEBUG_ALLOCATIONS build option
  > futures_test: Add a test for futures of references
  > future: Simplify destruction of future_state
  > Use detect_stack_use_after_return=1
  > repeat: Fix indentation
  > repeat: Delete try/catch
  > repeat: Simplify loop
  > Avoid call to std::exception_ptr's destructor from a .hh
  > file: Add missing include
2020-09-03 15:56:12 +03:00
Avi Kivity
64c7c81bac Merge "Update log messages to {fmt} rules" from Pavel E
"
Before seastar is updated with the {fmt} engine under the
logging hood, some changes are to be made in scylla to
conform to {fmt} standards.

Compilation and tests checked against both -- old (current)
and new seastar-s.

tests: unit(dev), manual
"

* 'br-logging-update' of https://github.com/xemul/scylla:
  code: Force formatting of pointer in .debug and .trace
  code: Format { and } as {fmt} needs
  streaming: Do not reveal raw pointer in info message
  mp_row_consumer: Provide hex-formatting wrapper for bytes_view
  heat_load_balance: Include fmt/ranges.h
2020-09-03 15:10:09 +03:00
Nadav Har'El
1d06da18fc alternator test: test for the TRIM_HORIZON stream iterator
This patch adds a test for the TRIM_HORIZON option of GetShardIterator in
Alternator Streams. This option asks to fetch again *all* the available
history in this shard stream. We had an implementation for it, but not a
test - so this patch adds one. The test passes.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200830131458.381350-1-nyh@scylladb.com>
2020-09-02 18:37:06 +02:00
Nadav Har'El
3d4183863a alternator test: add tests for sequence-number based iterators
Alternator Streams already support the AT_SEQUENCE_NUMBER and
AFTER_SEQUENCE_NUMBER options for iterators. These options allow to replay
a stream of changes from a known position or after that known position.
However, we never had a test verifying that these features actually work
as intended, beyond just checking syntax. Having such tests is important
because recently we changed the implementation of these iterators, but
didn't have a test verifying that they still work.

So in this patch we add such tests. The tests pass (as usual, on both
Alternator and DynamoDB).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200830115817.380075-1-nyh@scylladb.com>
2020-09-02 18:36:59 +02:00
Nadav Har'El
c879b23b82 alternator test: add another passing Alternator Streams test
We had a test, test_streams_last_result, that verifies that after reading from
an Alternator Stream the last event, reading again will find nothing.
But we didn't actually have a test which checks that if at that point a new event
*does* arrive, we can read it. This test checks this case, and it passes (we don't
have a bug there, but it's good as a regression test for NextShardIterator).

This test also verifies that after reading an event for a particular key on a
a specific stream "shard", the next event for the same key will arrive on the
same shard.

This test passes on both Alternator and DynamoDB.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200830105744.378790-1-nyh@scylladb.com>
2020-09-02 18:36:50 +02:00
Raphael S. Carvalho
adf576f769 compaction_manager: export method that returns if table has ongoing compaction
A compaction strategy, that supports parallel compaction, may want to know
if the table has compaction running on its behalf before making a decision.
For example, a size-tiered-like strategy may not want to trigger a behavior,
like cross-tier compaction, when there's ongoing compaction.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200901134306.23961-1-raphaelsc@scylladb.com>
2020-09-02 16:46:49 +03:00
Botond Dénes
90042746bf scylla-gdb.py: scylla_sstables::filename(): add md format support
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200902090929.879377-1-bdenes@scylladb.com>
2020-09-02 16:12:17 +03:00
Dejan Mircevski
0c73ac107d cql3: Drop get_partition_key_unrestricted_components
Not used anywhere.

Tests: unit (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-09-02 08:14:54 +03:00
Botond Dénes
b3f00685ec scylla-gdb.py: scylla memory: better summary of semaphore memory usage
If available, use the recently added
`reader_concurrency_semaphore::_initial_resources` to calculate the
amount of memory used out of the initially configured amount. If not
available, the summary falls back to the previous mode of just printing
the remaining amount of memory.
Example:
Replica:
  Read Concurrency Semaphores:
    user sstable reads:       11/100,     263621214/     42949672 B, queued: 847
    streaming sstable reads:   0/ 10,             0/     42949672 B, queued: 0
    system sstable reads:      1/ 10,        251584/     42949672 B, queued: 0

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200901091452.806419-1-bdenes@scylladb.com>
2020-09-01 16:26:57 +02:00
Nadav Har'El
52f92b886b alternator streams: fix bug returning the same change again
This patch fixes a bug which caused sporadic failures of the Alternator
test - test_streams.py::test_streams_last_result.

The GetRecords operation reads from an Alternator Streams shard and then
returns an "iterator" from where to continue reading next time. Because
we obviously don't want to read the same change again, we "incremented"
the current position, to start at the incremented position on the next read.

Unfortunately, the implementation of the increment() function wasn't quite
right. The position in the CDC log is a timeuuid, which has a really bizarre
comparison function (see compare_visitor in types.cc). In particular the
least-sigificant bytes of the UUID are compared as *signed* bytes. This
means that if the last byte of the UUID was 127, and increment() increased
it to 128, and this was wrong because the comparison function later deemed
that as a signed byte, where 128 is lower than 127, not higher! The result
was that with 1/256 probability (whenever the last byte of the position was
127) we would return an item twice. This was reproduced (with 1/256
probability) by the test test_streams_last_result, as reported in issue #7004.

The fix in this patch is to drop the increment() and replace it by a flag
whether an iterator is inclusive of the threshold (>=) or exclusive (>).
The internal representation of the iterator has a boolean flag "inclusive",
and the string representation uses the prefixes "I" or "i" to indicate an
inclusive or exclusive range, respectively - whereas before this patch we
always used the prefix "I".

Although increment() could have been fixed to work correctly, the result would
have been ugly because of the weirdness of the timeuuid comparison function.
increment() would also require extensive new unit-tests: we were lucky that
the high-level functional tests caught a 1 in 256 error, but they would not
have caught rarer errors (e.g., 1 in 2^32). Furthermore, I am looking at
Alternator as the first "user" of CDC, and seeing how complicated and
error-prone increment() is, we should not recommend to users to use this
technique - they should use exclusive (>) range queries instead.

Fixes #7004.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200901102718.435227-1-nyh@scylladb.com>
2020-09-01 12:28:39 +02:00
Avi Kivity
37352a73b8 Update tools/python3 submodule
* tools/python3 f89ade5...19a9cd3 (1):
  > dist: redhat: reduce log spam from unpacking sources when building rpm
2020-09-01 12:36:24 +03:00
Pavel Emelyanov
86897aa040 partition_version: Remove dead code
The rows_iterator is no longer in use since 70c72773

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200831191208.18418-1-xemul@scylladb.com>
2020-09-01 10:19:47 +03:00
Raphael S. Carvalho
7f7f366cb5 compaction: add debug msg to inform the amount of expired ssts skipped by compaction
this information is useful when debugging compaction issues that involve
fully expired ssts.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200828140401.96440-1-raphaelsc@scylladb.com>
2020-08-31 17:18:47 +03:00
Amos Kong
5785947e28 unified/install.sh: set default python3/sysconfdir smartly
Users can set python3 and sysconfdir from cmdline of install.sh
according to the install mode (root or nonroot) and distro type.

It's helpful to correct the default python3/sysconfdir, otherwise
setup scripts or scylla-jmx doesn't work.

Fixes #7130

Signed-off-by: Amos Kong <amos@scylladb.com>
2020-08-31 15:54:51 +03:00
Amos Kong
83d7454787 install.sh: clean tmp scylla.yaml after installation
If the install.sh is executed by sudo, then the tmp scylla.yaml is owned
by root. It's difficult to overwrite it by non-privileged users.

Signed-off-by: Amos Kong <amos@scylladb.com>
2020-08-31 15:54:51 +03:00
Kamil Braun
ff78a3c332 cdc: rename CDC description tables... again
Commit a6ad70d3da changed the format of
stream IDs: the lower 8 bytes were previously generated randomly, now
some of them have semantics. In particular, the least significant byte
contains a version (stream IDs might evolve with further releases).

This is a backward-incompatible change: the code won't properly handle
stream IDs with all lower 8 bytes generated randomly. To protect us from
subtle bugs, the code has an assertion that checks the stream ID's
version.

This means that if an experimental user used CDC before the change and
then upgraded, they might hit the assertion when a node attempts to
retrieve a CDC generation with old stream IDs from the CDC description
tables and then decode it.
In effect, the user won't even be able to start a node.

Similarly as with the case described in
d89b7a0548, the simplest fix is to rename
the tables. This fix must get merged in before CDC goes out of
experimental.

Now, if the user upgrades their cluster from a pre-rename version, the
node will simply complain that it can't obtain the CDC generation
instead of preventing the cluster from working. The user will be able to
use CDC after running checkAndRepairCDCStreams.

Since a new table is added to the system_distributed keyspace, the
cluster's schema has changed, so sstables and digests need to be
regenerated for schema_digest_test.
2020-08-31 11:33:14 +03:00
Nadav Har'El
f3dfb6e011 merge: cdc: Remove post-filterings for keys-only/off cdc delta generation
Merged pull request https://github.com/scylladb/scylla/pull/7121
By Calle Wilund:

Refs #7095
Fixes #7128

CDC delta!=full both relied on post-filtering
to remove generated log row and/or cells. This is inefficient.
Instead, simply check if the data should be created in the
visitors.

Also removed delta_mode=off mode.

  cdc: Remove post-filterings for keys-only/off cdc delta generation
  cdc: Remove cdc delta_mode::off
2020-08-31 11:22:09 +03:00
Calle Wilund
70a282ced2 cdc: Remove post-filterings for keys-only/off cdc delta generation
Refs #7095

CDC delta!=full both relied on post-filtering
to remove generated log row and/or cells. This is inefficient.
Instead, simply check if the data should be created in the
visitors.

v2:
* Fixed delta logs rows created (empty) even when delta == off
v3:
* Killed delta == off
v4:
* Move checks into (const) member var(s)
2020-08-31 07:59:43 +00:00