"
The main challenge here is to move messaging_service.start_listen()
call from out of gossiper into main. Other changes are pretty minor
compared to that and include
- patch gossiper API towards a standard start-shutdown-stop form
- gossiping "sharder info" in initial state
- configure cluster name and seeds via gossip_config
tests: unit(dev)
dtest.bootstrap_test.start_stop_test_node(dev)
manual(dev): start+stop, nodetool enable-/disablegossip
refs: #2737
refs: #2795
refs: #5489
"
* 'br-gossiper-dont-start-messaging-listen-2' of https://github.com/xemul/scylla:
code: Expell gossiper.hh from other headers
storage_service: Gossip "sharder" in initial states
gossiper: Relax set_seeds()
gossiper, main: Turn init_gossiper into get_seeds_from_config
storage_service: Eliminate the do-bind argument from everywhere
gossiper: Drop ms-registered manipulations
messaging, main, gossiper: Move listening start into main
gossiper: Do handlers reg/unreg from start/stop
gossiper: Split (un)init_messaging_handler()
gossiper: Relocate stop_gossiping() into .stop()
gossiper: Introduce .shutdown() and use where appropriate
gossiper: Set cluster_name via gossip_config
gossiper, main: Straighten start/stop
tests/cql_test_env: Open-code tst_init_ms_fd_gossiper
tests/cql_test_env: De-global most of gossiper
gossiper: Merge start_gossiping() overloads into one
gossiper: Use is_... helpers
gossiper: Fix do_shadow_round comment
gossiper: Dispose dead code
* seastar c04a12edbd...e6db0cd587 (13):
> Merge "Add kernel stack trace reporting for stalls" from Avi
Ref #8828
> Merge "Keep XFS' dioattr cached" from Pavel E
> coroutines: de-template maybe_yield()
> sharded: Add const versions of map_reduce's
> apps/io_tester: remove unused lambda capture
> doc: exclude seastar::coroutine::internal namespace
> deprecate unaligned_cast<> from unaligned.hh
> reactor: adjust max_networking_aio_io_control_blocks to lower size when fs.aio-max-nr is small
> build: clarify choice of C++ dialect, and change default to C++20
> coding_style: update concepts style to snake_case
> Merge "Teach io_tester to submit requests-per-second flow" from Pavel E
> cmake: find and link against Boost::filesystem
> coroutine: add maybe_yield
We know (verified by existing tests) that null keys are not allowed -
neither as partition keys nor clustering keys.
In issue #9352 a question was raised of whether an *empty string* is
allowed as as a key on a base table (not a materialized view or index).
The following tests confirm that the current situation is as follows:
1. An empty string is perfectly legal as a clustering key.
2. An empty string is NOT ALLOWED as a partition key - the error
"Key may not be empty" is reported if this is attempted.
3. If the partition key is compound (multiple partition-key columns)
then any or all of them may be empty strings.
These tests pass the same on both Cassandra and Scylla, showing that
this bizarre (and undocumented) behavior is identical in both.
Refs #9352.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210922131310.293846-1-nyh@scylladb.com>
"
There's a whole lot of places that create an sstable for tests
like this
auto sst = env.make_sstable(...);
sst->write_components(...);
sst->load();
Some of them are already generalized with the make_sstable_easy
helper, but there are several instances of them.
Found while hunting down the places that use default IO sched
class behind the scenes.
tests: unit(dev)
"
* 'br-sst-tests-make-sstable-easy' of https://github.com/xemul/scylla:
test: Generalize make_sstable() and make_sstable_easy()
test: Use now existing helpers elsewhere
test: Generalize all make_sstable_easy()-s
test: Set test change estimation to 1
test: Generalize make_sstable_easy in mutation tests
test: Generalize make_sstable_easy in set tests
test: Reuse make_sstable_easy in datafile tests
test: Relax make_sstable_easy in compaction tests
When a string column is indexed with a secondary index, the empty value
for this column (an empty string '') is perfectly legal, and should be
indexed as well. This is not the same as an unset (null) value which
isn't indexed.
The following test demonstrates that this case works in Cassandra, but
does not in Scylla (so the test is marked "xfail"). In Scylla, a query
that returns the expected results with ALLOW FILTERING suddenly returns
a different (and wrong) result when an index is added on the table.
This test reproduces issue #9364.
Refs #9364.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210922121510.291826-1-nyh@scylladb.com>
"
Backlog tracker isn't updated correctly when facing a schema change, and
may leak a SSTable if compaction strategy is changed, which causes
backlog to be computed incorrectly. Most of these problems happen because
sstable set and tracker are updated independently, so it could happen
that tracker lose track (pun intended) of changes applied to set.
The first patch will fix the leak when strategy is changed, and the third
patch will make sure that tracker is updated atomically with sstable set,
so these kind of problems will not happen anymore.
Fixes#9157
test: mode(debug)
"
* 'fixes_to_backlog_tracker_v3' of https://github.com/raphaelsc/scylla:
compaction: Update backlog tracker correctly when schema is updated
compaction: Don't leak backlog of input sstable when compaction strategy is changed
compaction: introduce compaction_read_monitor_generator::remove_exhausted_sstables()
compaction: simplify removal of monitors
Scylla and Cassandra do not allow an empty string as a partition key,
but a materialized view might "convert" a regular string column into a
partition key, and an empty string is a perfectly valid value for this
column. This can result in a view row which has an empty string as a
partition key. This case works in Cassandra, but doesn't in Scylla (the
row with the empty string as a partition key doesn't appear). The
following test demonstrates this difference between Scylla and Cassandra
(it passes on Cassandra, fails on Scylla, and accordingly marked
"xfail").
Refs #9375.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210922115000.290387-1-nyh@scylladb.com>
Recently we had to say goodbye to our dear friend Pekka.
He orphaned a few subsystems that can't call for his help in code
reviews anymore.
This patch makes sure no one will bother Pekka in his afterlife.
It also cleanups HACKING.md a little bit by removing Pekka and Duarte
from the maintainer/reviewer lists.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <98ba1aed9ee8a87b9037b5032b82abc5bfddbd66.1632301309.git.piotr@scylladb.com>
A variant of make_reversed() which goes through the schema registry,
teaching the schema to the registry if necessary. This effectively
caches the result of the reversing and as an added bonus double
reversing yields the very same schema C++ object that was the starting
point.
Closes#9365
Add the content of the compaction stats introduced in the previous patch
to the tracing data. This will help diagnose query performance related
problems caused by tombstones.
Add the content of the compaction stats introduced in the previous patch
to the tracing data. This will help diagnose query performance related
problems caused by tombstones.
This needs to add forward declarations of the gossiper class and
re-include some other headers here and there.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Right now the number of shards and ignore-msb-bits are gossiped
with a separate call. It's simpler to include this data into
the initial gossiping state.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's much shorter and simpler to pass the seeds, obtained from the
config, into gossiper via gossip_config rahter than with the help
of a special call.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Looking into init_gossiper() helper makes it clear that what it does
is gets seeds, provider and listen_address from config and generates
a set of seeds for the gossiper. Then calls gossiper.set_seeds().
This patch renames the helper into get_seeds_from_config(), removes
all but db::config& argunebts from it and moves the call to set_seed()
into main.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The same as in previous patch -- the gossiper doesn't need to know
if it should call messaging.start_listen() or not, neither should
do the storage_service.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Before preparing the cluster join process the messaging should be
put into listening state. Right now it's done "on-demand" by the
call to the do_shadow_round(), also there's a safety call in the
start_gossiping(). Tests, however, should not start listening, so
the do_bind boolean exists and is passed all the way around.
Make the main() code explicitly call the messaging.start_listen()
and leave tests without it. This change makes messaging start
listening a bit earlier, but in between these old and new places
there's nothing that needs messaging to stay deaf.
As the do_bind becomes useless, the wait_for_gossip_to_settle() is
also moved into main.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
On start handlers can be registered any time before the messaging
starts to listen. On stop handlers can remain registered any long,
since the messaging service stops early in drain_on_shutdown().
One tricky place is API start_/stop_gossiping(). The latter calls
gossiper::stop() thus unregistering the handlers. So to make the
start_gossiping() work it must call gossiper::start() in advance.
Overall the gossiper start/stop becomes this:
gossiper.start()
`- registers handlers
gossiper.start_gossiping()
`- // starts gossiping
gossiper.shutdown()
`- // stops gossiping
gossiper.stop()
`- calls shutdown() // re-entrable
`- unregisters handlers
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The helper in question is called in two places:
1. In main() as a fuse against early exception before creating the
drain_on_shutdown() defer
2. In the stop_gossiping() API call
Both can be replaced with the stop_gossiping() call from the .stop()
method, here's why:
1. In main the gossiper::stop() call is already deferred right after
the gossiper is started. So this change moves it above. It may
happen that an exception pops up before the old fuse was deferred,
but that's OK -- the stop_gossiping() is safe against early- and
re- entrances
2. The stop_gossiping() change is effectlvey a rename -- it calls the
stop_gossiping() as it did before, but with the help of the .stop()
method
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The start/stop sequence we're moving towards assumes a shutdown (or
drain) method that will be called early on stop to notify the service
that the system is going down so it could prepare.
For gossiper it already means calling stop_gossiping() on the shard-0
instance. So by and large this patch renames a few stop_gossiping()
calls into .shutdown() ones.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's taken purely from the db::config and thus can be set up early.
Right now the empty name is converted into "Test Cluster" one, but
remains empty in the config and is later used by the system_keyspace
code. This logic remains intact.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Turn the gossiper start/stop sequence into the canonical form
gossiper.start(std::ref(dependencies)...).get();
auto stop_gossiper = defer({
gossiper.invoke_on_all(&gossiper::stop).get();
});
gossiper.invoke_on_all(&gossiper::start).get();
The deferred call should be gossiper.stop(); but for now keep
the instances memory alive.
This trick is safe at this point, because .start() and .stop()
methods are both empty (still).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The helper is called once. Keeping this code in the caller packs the
code, helps it look more like main() and facilitates further patching.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Sizes too small to fit a ::seastar::memory::free_object won't contain
any objects at all so they don't contribute anything to the listing
beyond noise.
Closes#9366
Gossiper is still global and cql_test_env heavily exploits this fact.
Clean that by getting the gossiper once and using the local reference
everywhere else.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are two of them and one is only called from the API with the
do_bind always set to "yes". This fact makes it possible to remove
it by adding relevant defaults for the other.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are several state booleans on the service and some helpers to
manipulate/check those. Make the code consistent by always using these
helpers.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The debug_show() is unused, as well as the advertise_myself().
The _features_condvar used to be listened on before f32f08c9,
now it's signal-only.
Feature frendship with gossiper is not required.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
... from Nadav Har'El
This small series adds a stub implementation of Alternator's
UpdateTimeToLive and DescribeTimeToLive operations. These operations can
enable, disable, or inquire about, the chosen expiration-time attribute.
Currently, the information about the chosen attribute is only saved,
with no actual expiration of any items taking place.
Because this is an incomplete implementation of this feature, it is not
enabled unless an experimental flag is enabled on all nodes in the
cluster.
See the individual patches for more information on what this series
does.
Refs #5060.
Closes#9345
* github.com:scylladb/scylla:
test/alternator: rename utility function test_table_name()
alternator: stub TTL operations
alternator: make three utility functions in executor.cc non-static
test/alternator: test another corner case of TTL
Currently the following can happen:
1) there's ongoing compaction with input sstable A, so sstable set
and backlog tracker both contains A.
2) ongoing compaction replaces input sstable A by B, so sstable set
contains only B now.
3) schema is updated, so a new backlog tracker is built without A
because sstable set now contains only B.
4) ongoing compaction tries to remove A from tracker, but it was
excluded in step 3.
5) tracker can now have a negative value if table is decreasing in
size, which leads to log(<negative number>) == -NaN
This problem happens because backlog tracker updates are decoupled
from sstable set updates. Given that the essential content of
backlog tracker should be the same as one of sstable set, let's move
tracker management to table.
Whenever sstable set is updated, backlog tracker will be updated with
the same changes, making their management less error prone.
Fixes#9157
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The generic back formula is: ALL + PARTIAL - COMPACTING
With transfer_ongoing_charges() we already ignore the effect of
ongoing compactions on COMPACTING as we judge them to be pointless.
But ongoing compactions will run to completion, meaning that output
sstables will be added to ALL anyway, in the formula above.
With stop_tracking_ongoing_compactions(), input sstables are never
removed from the tracker, but output sstables are added, which means
we end up with duplicate backlog in the tracker.
By removing this tracking mechanism, pointless ongoing compaction
will be ignored as expected and the leaks will be fixed.
Later, the intention is to force a stop on ongoing compactions if
strategy has changed as they're pointless anyway.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
by switching to unordered_map, removal of generated monitors is
made easier. this is a preparatory change for patch which will
remove monitor for all exhausted sstables
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The former constructs a memtable from the vector of mutations and
then does exactlty the same steps as the latter one -- creates an
sstable corresponding to the memtable.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are already four of them. Those working with the mutation reader
can be folded into one with some default args.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The test intention is not to test how zero estimated partitions
work, there's another case for than (in another test). Also it
looks like 0 is doesn't flow anywhere far, it's std::max-ed into
1 early inside mc::writer constructor.
This changes significantly simplifies the unification of the set
of make_sstable_easy()-s in the next patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The same trick as in the previous patch, but the new helper
accepts a memtable instead of a mutation reader and makes the
reader from the memtable.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There a bunch of places in the test that do the same sequence
of steps to create an sstable. Generalize them into a helper
that resembles the one from previous patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This patch is two-fold. First it changes the signature of the
local helper to facilitate next patching. Second, it makes more
relevant places in the test use this helper.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The version argument can be omitted, the env.make_sstable will
default it to highest version. The generation argument is left
and defaulted to 1.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Originally, the expected failure for a recursive invocation
test case was to expect that fuel gets exhausted, but it's also
possible to hit a stack limit first. All errors are equally
expected here as long as the execution is halted, so let's relax
the condition and accept any wasm-related InvalidRequest errors.
Closes#9361
This reverts commit e9343fd382, reversing
changes made to 27138b215b. It causes a
regression in v2 serialization_format support:
collection_serialization_with_protocol_v2_test fails with: marshaling error: read_simple_bytes - not enough bytes (requested 1627390306, got 3)
Fixes#9360
"
Currently database start and stop code is quite disperse and
exists in two slightly different forms -- one in main and the
other one in cql_test_env. This set unifies both and makes
them look almost the perfect way:
sharded<database> db;
db.start(<dependencies>);
auto stop = defer([&db] { db.stop().get(); });
db.invoke_on_all(&database::start).get();
with all (well, most) other mentionings of the "db" variable
being arguments for other services' dependencies.
tests: unit(dev, release), unit.cross_shard_barrier(debug)
dtest.simple_boot_shutdown(dev)
refs: #2737
refs: #2795
refs: #5489
"
* 'br-database-teardown-unification-2' of https://github.com/xemul/scylla: (26 commits)
main: Log when database starts
view_update_generator: Register staging sstables in constructor
database, messaging: Delete old connection drop notification
database, proxy: Relocate connection-drop activity
messaging, proxy: Notify connection drops with boost signal
database, tests: Rework recommended format setting
database, sstables_manager: Sow some noexcepts
database: Eliminate unused helpers
database: Merge the stop_database() into database::stop()
database: Flatten stop_database()
database: Equip with cross-shard-barrier
database: Move starting bits into start()
database: Add .start() method
main: Initialize directories before database
main, api: Detach set_server_config from database and move up
main: Shorten commitlog creation
database: Extract commitlog initialization from init_system_keyspace
repair: Shutdown without database help
main: Shift iosched verification upward
database: Remove unused mm arg from init_non_system_keyspaces()
...