Bring supervisor support from dist/docker to install.sh, make it
installable from relocatable package.
This enables to use supervisor with nonroot / offline environment,
and also make relocatable package able to run in Docker environment.
Related #8849Closes#8918
This patch follows #9002, further reducing the complexity of the sstable readers.
The split between row consumer interfaces and implementations has been first added in 2015, and there is no reason to create new implementations anymore. By merging those classes, we achieve a sizeable reduction in sstable reader length and complexity.
Refs #7952
Tests: unit(dev)
Closes#9073
* github.com:scylladb/scylla:
sstables: merge row_consumer into mp_row_consumer_k_l
sstables: move kl row_consumer
sstables: merge consumer_m into mp_row_consumer_m
sstables: move mp_row_consumer_m
This is a translation of Cassandra's CQL unit test source file
validation/entities/TypeTest.java into our our cql-pytest framework.
This is a tiny test file, with only four test which apparently didn't
find their place in other source files. All four tests pass on Cassandra,
and all but one pass on Scylla - the test marked xfail discovered one
previously-unknown incompatibility with Cassandra:
Refs #9082: DROP TYPE IF EXISTS shouldn't fail on non-existent keyspace
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210726140934.1479443-1-nyh@scylladb.com>
Prevent accidental conversions to bool from yielding the wrong results.
Unprepared users (that converted to bool, or assigned to int) are adjusted.
Ref #1449
Test: unit (dev)
Closes#9088
* seastar 93d053cd...ce3cc268 (4):
> doc: update coroutine exception paragraph with make_exception
> coroutine: add make_exception helper
> coroutine: use std::move for forwarding exception_ptr
> doc: tutorial: document direct exception propagation
With the new throw-less coroutine exception support, we can modify
some of Scylla's new coroutine code to generate exceptions a bit more
efficiently, without actually thowing an exception.
Partition count is of a type size_t but we use std::plus<int>
to reduce values of partition count in various column families.
This patch changes the argument of std::plus to the right type.
Using std::plus<int> for size_t compiles but does not work as expected.
For example plus<int>(2147483648LL, 1LL) = -2147483647 while the code
would probably want 2147483649.
Fixes#9090
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Closes#9074
This is a translation of Cassandra's CQL unit test source file
validation/entities/TupleTypeTest.java into our our cql-pytest framework.
This test file checks has a few tests on various features of tuples.
Unfortunately, some of the tests could not be easily translated into
Python so were left commented out: Some tests try to send invalid input
to the server which the Python driver "helpfully" forbids; Two tests
used an external testing library "QuickTheories" and are the only two
tests in the Cassandra test suite to use this library - so it's not
a worthwhile to translate it to Python.
11 tests remain, all of them pass on Cassandra, and just one fails on
Scylla (so marked xfail for now), reproducing one known issue:
Refs #7735: CQL parser missing support for Cassandra 3.10's new "+=" syntax
Actually, += is not supposed to be supported on tuple columns anyway, but
should print the appropriate error - not the syntax error we get now as
the "+=" feature is not supported at all.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210722201900.1442391-1-nyh@scylladb.com>
Since detach_buffer is used before closing and
destroying the reader, we want to mark it as noexcept
to simply the caller error handling.
Currently, although it does construct a new circular_buffer,
none of the constructors used may throw.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210617114240.1294501-2-bhalevy@scylladb.com>
detach_buffer exchanges the current _buffer with
a new buffer constructed using the circular_buffer(Alloc)
constructor. The compiler implicitly constructs a
tracking_allocator(reader_permit) and passes it
to the circular_buffer constructor.
This patch just makes that explicit so it would be
clearer to the reader what's going on here.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210617114240.1294501-1-bhalevy@scylladb.com>
Loading snapshot id and term + vote involve selecting static
fields from the "system.raft" table, constrained by a given
group id.
The code incorrectly assumes that, for example,
`SELECT snapshot_id FROM raft WHERE group_id=?` in
`load_snapshot` always returns only one row.
This is not true, since this will return a row
for each (pk, ck) combination, which is (group_id, index)
for "system.raft" table.
The same applies for the `load_term_and_vote`, which selects
static `vote_term` and `vote` from "system.raft".
This results in a crash at node startup when there is
a non-empty raft log containing more than one entry
for a given `group_id`.
Restrict the selection to always return one row by applying
`LIMIT 1` clause.
Tests: unit(dev)
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20210723183232.742083-1-pa.solodovnikov@scylladb.com>
Commit 2150c0f7a2 proposed by issue #5619
added a limitation that USING TIMESTAMP cannot be more than 3 days into
the future. But the actual code used to check it,
timestamp - now > MAX_DIFFERENCE
only makes sense for *positive* timestamps. For negative timestamps,
which are allowed in Cassandra, the difference "timestamp - now" might
overflow the signed integer and the result is undefined - leading to the
undefined-behavior sanitizer to complain as reported in issue #8895.
Beyond the sanitizer, in practice, on my test setup, the timestamp -2^63+1
causes such overflow, which causes the above if() to make the nonsensical
statement that the timestamp is more than 3 days into the future.
This patch assumes that negative timestamps of any magnitude are still
allowed (as they are in Cassandra), and fixes the above if() to only
check timestamps which are in the future (timestamp > now).
We also add a cql-pytest test for negative timestamps, passing on both
Cassandra and Scylla (after this patch - it failed before, and also
reported sanitizer errors in the debug build).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210621141255.309485-1-nyh@scylladb.com>
This is the 2nd PR in series with the goal to finish the hackathon project authored by @tgrabiec, @kostja, @amnonh and @mmatczuk (improved virtual tables + function call syntax in CQL). This one introduces a new implementation of the virtual tables, the streaming tables, which are suitable for large amounts of data.
This PR was created by @jul-stas and @StarostaGit
Closes#8961
* github.com:scylladb/scylla:
test/boost: run_mutation_source_tests on streaming virtual table
system_keyspace: Introduce describe_ring table as virtual_table
storage_service: Pass the reference down to system_keyspace
endpoint_details: store `_host` as `gms::inet_address`
queue_reader: implement next_partition()
virtual_tables: Introduce streaming_virtual_table
flat_mutation_reader: Add a new filtering reader factory method
We want to serialize snapshot application with command application
otherwise a command may be applied after a snapshot that already contains
the result of its application (it is not necessary a problem since the
raft by itself does not guaranty apply-once semantics, but better to
prevent it when possible). This also moves all interactions with user's
state machine into one place.
Message-Id: <YPltCmBAGUQnpW7r@scylladb.com>
"
The cql-server -> storage-service dependency comes from the server's
event_notifier which (un)subscribes on the lifecycle events that come
from the storage service. To break this link the same trick as with
migration manager notifications is used -- the notification engine
is split out of the storage service and then is pushed directly into
both -- the listeners (to (un)subscribe) and the storage service (to
notify).
tests: unit(dev), dtest(simple_boot_shutdown, dev)
manual({ start/stop,
with/without started transport,
nodetool enable-/disablebinary
} in various combinations, dev)
"
* 'br-remove-storage-service-from-transport' of https://github.com/xemul/scylla:
transport.controller: Brushup cql_server declarations
code: Remove storage-service header from irrelevant places
storage_service: Remove (unlifecycle) subscribe methods
transport: Use local notifier to (un)subscribe server
transport: Keep lifecycle notifier sharded reference
main: Use local lifecycle notifier to (un)subscribe listeners
main, tests: Push notifier through storage service
storage_service: Move notification core into dedicated class
storage_service: Split lifecycle notification code
transport, generic_server: Remove no longer used functionality
transport: (Un)Subscribe cql_server::event_notifier from controller
tests: Remove storage service from manual gossiper test
The controller code sits in the cql_transport namespace and
can omit its mentionings. Also the seastar::distributed<>
is replaced with modern seastar::sharded<> while at it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Some .cc files over the code include the storage service
for no real need. Drop the header and include (in some)
what's really needed.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now the controller has the lifecycle notifier reference and
can stop using storage service to manage the subscription.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The storage proxy and sl-manager get subscribed on lifecycle
events with the help of storage service. Now when the notifier
lives in main() they can use it directly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now it's time to move the lifecycle notifier from storage
service to the main's scope. Next patches will remove the
$lifecycle-subscriber -> storage_service dependency.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Introduce the endpoint_lifecycle_notifier class that's in
charge of keeping track of subscribers and notifying them.
The subscribers will thus be able to set and unset their
subscription without the need to mess with storage service
at all.
The storage_service for now keeps the notifier on board, but
this is going to change in the next patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This prepares the ground for moving the notification engine
into own class like it was done for migration_notifier some
time ago.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
After subscription management was moved onto controller level
a bunch of code can be dropped:
- passing migration notifier beyond controller
- event_notifier's _stopped bit
- event_notifier .stop() method
- event_notifier empty constructor and destrictor
- generic_server's on_stop virtual method
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There's a migration notifier that's carried through cql_server
_just_ to let event-notifier (un)subscribe on it. Also there's
a call for global storage-service in there which will need to
be replaced with yet another pass-through argument which is not
great.
It's easier to establish this subscription outside of cql_server
like it's currently done for proxy and sl-manager. In case of
cql_server the "outside" is the controller.
This patch just moves the subscription management from cql_server
to controller, next two patches will make more use of this change.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This PR contains the parts relevant to batchlog_manager stop in #8998 without adding a gate to the storage_proxy for synchronization with on-going queries in storage_proxy::drain_on_shutdown.
As explained in #9009, we see that the batchlog_manager isn't stopped if scylla shuts down during startup, e.g. when waiting for gossip to settle, since currently the batchlog_manager is stopped only from `storage_service::do_drain`, while `storage_service::drain_on_shutdown` deferred shutdown is installed only later on:
222ef17305/main.cc (L1419-L1421)Fixes#9009
Test: unit(dev)
DTest: compact_storage_tests.py:TestCompactStorage.wide_row_test paging_test:TestPagingDatasetChanges.test_cell_TTL_expiry_during_paging update_cluster_layout_tests:TestUpdateClusterLayout.simple_add_new_node_while_adding_info_{1,2}_test (dev)
Closes#9010
* github.com:scylladb/scylla:
main: add deferred stop of batchlog_manager
batchlog_manager: refactor drain out of stop
batchlog_manager: stop: break _sem on shard 0
batchlog_manager: stop: use abort_source to abort batchlog_replay_loop
batchlog_manager: do_batch_log_replay: hold _gate
* seastar 388ee307...93d053cd (5):
> doc: tutorial: document seastar::coroutine::all()
> doc: tutorial: nest "exceptions in coroutines" under "coroutines"
> coroutine: add a way of propagating exceptions without throwing
> input_stream: Fix read_exactly(n) incorrectly skipping data
> coroutines: introduce all() template for waiting for multiple futures
The class name `coroutine` became problematic since seastar
introduced it as a namespace for coroutine helpers.
To avoid a clash, the class from scylla is wrapped in a separate
namespace.
Without this patch, Seastar submodule update fails to compile.
Message-Id: <6cb91455a7ac3793bc78d161e2cb4174cf6a1606.1626949573.git.sarna@scylladb.com>
This is a rewrite of an old PR: #7582
`TOKEN()` restrictions don't work properly when a query uses an index.
For example this returns both rows:
```cql
CREATE TABLE t(pk int, ck int, v int, PRIMARY KEY(pk, ck));
CREATE INDEX ON t(v);
INSERT INTO t (pk, ck, v) VALUES (0, 0, 0);
INSERT INTO t (pk, ck, v) VALUES (1, 0, 0);
SELECT token(pk), pk, ck, v FROM t WHERE v = 0 AND token(pk) = token(0) ALLOW FILTERING;
```
This functionality is supported on both old and new indexes. In old
indexes the type of the token column was `blob`. This causes problems,
because `blob` representation of tokens is ordered differently. Tokens
represented as blobs are ordered like this:
```
0, 1, 2, 3, 4, 5, ..., bigint_max, bigint_min, ...., -5, -4, -3, -2, -1
```
Because of that clustering range for `token()` restrictions needs to be
translated to two clustering ranges on the `blob` column.
To create old indexes disable the feature called:
`CORRECT_IDX_TOKEN_IN_SECONDARY_INDEX` or run scylla version from branch
[`cvybhu/si-token2-old-index`](https://github.com/cvybhu/scylla/commits/si-token2-old-index)
I'm not sure if it's possible to create automatic tests with old
indexes. I ran `dev-test` manually on the `si-token2-old-index` branch,
and the only tests that failed were the ones testing row ordering. Rows
should be ordered by `token`, but because in old indexes the token is
represented as a `blob` this ordering breaks. This is a known issue
(#7443), that has been fixed by introducing new indexes.
To sum up:
* `token()` restrictions are fixed on both new and old indexes.
* When using old indexes, the rows are not properly ordered by token.
* With new indexes the rows are properly ordered by token.
Fixes#7043Closes#9067
* github.com:scylladb/scylla:
tests: add secondary index tests with TOKEN clause
secondary_index_test: extract test data
secondary_index: Fix TOKEN() restrictions in indexed SELECTs
expression: Add replace_token function
The row_consumer interface has only one implementation:
mp_row_consumer_k_l; and we're not planning other ones,
so to reduce the number of inheritances, and the number
of lines in the sstable reader, these classes may be
combined.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
In preparation for the next patch combining row_consumer and
mp_row_consumer_k_l, move row_consumer next to row_consumer.
Because row_consumer is going to be removed, we retire some
old tests for different implementations of the row_consumer
interface; as a result, we don't need to expose internal
types of kl sstable reader for tests, so all classes from
reader_impl.hh are moved to reader.cc, and the reader_impl.hh
file is deleted, and the reader.cc file has an analogous
structure to the reader.cc file in sstables/mx directory.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
The consumer_m interface has only one implementation:
mp_row_consumer_m; and we're not planning other ones,
so to reduce the number of inheritances, and the number
of lines in the sstable reader, these classes may be
combined.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
To make next patch combining consumer_m and mp_row_consumer_m
more readable, move mp_row_consumer_m next to consumer_m.
Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Add tests of SELECTs with TOKEN clauses on tables with secondary
indexes (both global and local).
test_select_with_token_range_cases checks all possible token range
combinations (inclusive/exclusive/infinity start/end) on tables without
index, with local or with global index.
test_select_with_token_range_filtering checks whether TOKEN restrictions
combined with column restrictions work properly. As different code paths
are taken if index is created on clustering key (first or non-first) or
non-primary-key column, the tests checks scenarios when index is created
on different columns.
Extract test data to a separate variables, allowing it to be easily
reused by other tests. The tokens are hard-coded, because calculating
their value brought too much complexity to this code.
When using an index, restrictions like token(p) <= x were ignored.
Because of this a query like this would select all rows where r = 0:
SELECT * FROM tab WHERE r = 0 and token(p) > 0;
Adds proper handling of token restrictions to queries that use indexes.
Old indexes represented token as a blob, which complicates
clustering bounds. Special code is included, which translates
token clustering bounds to blob clustering bounds.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Today, table relies on row_cache::invalidate() serialization for
concurrent sstable list updates to produce correct results.
That's very error prone because table is relying on an implementation
detail of invalidate() to get things right.
Instead, let's make table itself take care of serialization on
concurrent updates.
To achieve that, sstable_list_builder is introduced. Only one
builder can be alive for a given table, so serialization is guaranteed
as long as the builder is kept alive throughout the update procedure.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210721001716.210281-1-raphaelsc@scylladb.com>
That function is dangerously used by distributed loader, as the latter
was responsible for invalidating cache for new sstable.
load_sstable() is an unsafe alternative to
add_sstable_and_update_cache() that should never have been used by
the outside world. Instead, let's kill it and make loader use
the safe alternative instead.
This will also make it easier to make sure that all concurrent updates
to sstable set are properly serialized.
Additionally, this may potentially reduce the amount of data evicted
from the cache, when the sstables being imported have a narrow range,
like high level sstables imported from a LCS table. Unlikely but
possible.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210721131949.26899-1-raphaelsc@scylladb.com>
Adds replace_token function which takes an expression
and replaces all left hand side occurences of token()
with the given column definition.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
When a leader moves to a follower state it aborts all requests that are
waiting on an admission semaphore with not_a_leader exception. But
currently it specifies itself as a new leader since abortion happens
before the fsm state changes to a follower. The patch fixes this by
destroying leader state after fsm state already changed to be a
follower.
Message-Id: <YPbI++0z5ZPV9pKb@scylladb.com>
* seastar ef320940...388ee307 (4):
> Merge 'Add a stall analyser tool' from Benny Halevy
> compat: implement coroutine_handle<void> for <experimental/coroutine> header
> Merge "Make app_template::run noexcept" from Pavel E
> perftune.py: make RPS CPU set to be a full CPU set
The stall analyser tool was requested by the SCT team to help make
sense of Scylla's stall reports and find more stall bugs!
Stop the batchlog manager using a deferred
action in main to make sure it is stopped after
its start() method has been called, also
if we bail out of main early due to exception.
Change the bm.stop() calls in storage_service
to just stop the replay loop using drain().
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>