In order to be able to avoid a deadlock when CQL server cannot be started,
the view builder shutdown procedure is now split to two parts -
- drain and stop. Drain is performed before storage proxy shutdown,
but stop() will be called even before drain is scheduled.
The deadlock is as follows:
- view builder creates a reader permit in order to be able
to read from system tables
- CQL server fails to start, shutdown procedure begins
- view builder stop() is not called (because it was not scheduled
yet), so it holds onto its reader permit
- database shutdown procedure waits for all permits to be destroyed,
and it hangs indefinitely because view builder keeps holding
its permit.
In the quest to have explicit dependencies and the abiliy to run
multiple nodes in one process, remove some uses of get_gossiper() and
get_local_gossiper() and replace them with parameters passed from main()
or its equivalents.
Some uses still remain, mostly in snitch, but this series removes a
majority.
* https://github.com/avikivity/scylla.git gossiper-deglobal-1/v1
alternator: remove uses of get_local_gossiper()
storage_service: remove stray get_gossiper(), get_local_gossiper() calls
migration_manager: remove use of get_gossiper() from passive_announce()
storage_proxy: start_hints_manager(): don't require caller to provide gossiper
migration_manager: remove uses of get_local_gossiper()
storage_proxy: remove uses of get_local_gossiper()
gossiper: remove get_local_gossiper() from some inline helpers
gossiper: remove get_gossiper() from stop_gossiping()
gossiper: remove uses of get_local_gossiper for its rpc server
api: remove use of get_local_gossiper()
gossiper: remove calls to global get_gossiper from within the gossiper itself
migration_manager already has a reference to _gossiper, but
passive_announce is static and so can't use it. Luckily the only
caller (in storage_service) uses it as it it wasn't static, so
we can just unstaticify it.
Pass gossiper as a constructor parameter instead. cql_test_env
gains a use of get_gossiper() instead, but at least these uses
are concentrated in one place.
Some state accessors called get_local_gossiper(); this is removed
and replaced with a parameter. Some callers (redis, alternators)
now have the gossiper passed as a parameter during initialization
so they can use the adjusted API.
Have the callers pass it instead, and they all have a reference
already except for cql_test_env (which will be fixed later).
The checks for initialization it does are likely unnecessary, but
we'll only be able to prove it when get_gossiper() is completely
removed.
Initialization happens in the gossiper itself, so we can capture
'this'. If we need to move to shard 0, use sharded::invoke_on() to
get the local instance.
Pass down gossiper from main, converting it to a shard-local instance
in calls to register_api() (which is the point that broadcasts the
endpoint registration across shards).
This helps remove gossiper as a global variable.
i3en.xlarge is currently not getting tuned properly. A quick test using
Scylla AMI ( ami-07a31481e4394d346 ) reveals that the storage
capabilities under this instance are greatly reduced:
$ grep iops /etc/scylla.d/io_properties.yaml
read_iops: 257024
write_iops: 174080
This patch corrects this typo, in such a way that iotune now properly
tunes this instance type.
Closes#9298
"
This is a set of random patches trying to fix broken cmake build:
* `-fcoroutines` flag is now used only for GCC, but not for Clang
* `SCYLLA-VERSION-GEN` invocation is adjusted to work correctly with
out-of-source builds
* Auxiliary targets are adjusted to support out-of-source builds
* Removed extra source files and added missing ones to the scylla
target
Scylla still doesn't build successfully with CMake build. But now,
at least, it's passes configuration step, which is a prerequisite
to loading the solution in IDEs.
"
* 'cmake_improvements' of github.com:ManManson/scylla:
cmake: fix out-of-source builds
cmake: don't use `-fcoroutines` for clang
cmake: update and sort source files and idl:s
"
There's a landmine buried in range_rombstone's move constructor.
Whoever tries to use it risks grabbing the tombstone from the
containing list thus leaking the guy optionally invalidating an
iterator pointing at it. There's a safety without_link moving
constructor out there, but still.
To keep this place safe it's better to separate range_tombstone
from its linkage into anywhere. In particular to keep the range
tombstones in a range_tombstone_list here's the entry that keeps
the tombstone _and_ the list hook (which's a boost set hook).
The approach resembles the rows_entry::deletable_row pair.
tests: unit(dev, debug, patch from #9207)
fixes: #9243
"
* 'br-range-tombstone-vs-entry' of https://github.com/xemul/scylla:
range_tombstone: Drop without-link constructor
range_tombstone: Drop move_assign()
range_tombstone: Move linkage into range_tombstone_entry
range_tombstone_list: Prepare to use range_tombstone_entry
range_tombstone, code: Add range_tombstone& getters
range_tombstone_list: Factor out tombstone construction
range_tombstone_list: Simplify (maybe) pop_front_and_lock()
range_tombstone_list: De-templatize pop_as<>
range_tombstone_list: Conceptualize erase_where()
range_tombstone(_list): Mark some bits noexcept
mutation: Use range_tombstone_list's iterators
mutation_partition: Shorten memory usage calculation
mutation_partition: Remove unused local variable
Scrub is supposed to not remove anything from input, write it as is
while fixing any corruption it might have. It shouldn't have any
assumption on the input. Additionally, a data shadowed by a tombstone
might be in another corrupted sstable, so expired tombstones should
not be purged in order to prevent data ressurection from occurring.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210904165908.135044-1-raphaelsc@scylladb.com>
We were silently ignoring INSERTs with NULL values for primary-key
columns, which Cassandra rejects. Fix it by rejecting any
modification_statement that would operate on empty partition or
clustering range.
This is the most direct fix, because range and slice are calculated in
one place for all modification statements. It covers not only NULL
cases, but also impossible restrictions like c>0 AND c<0.
Unfortunately, Cassandra doesn't treat all modification statements
consistently, so this fix cannot fully match its behavior. We err on
the side of tolerance, accepting some DELETE statements that Cassandra
rejects. We add a TODO for rejecting such DELETEs later.
Fixes#7852.
Tests: unit (dev), cql-pytest against Cassandra 4.0
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Closes#9286
The thing was used to move a range tombstone without detaching it
from the containing list (well, intrusive set). Now when the linkage
is gone this facility is no longer needed (and actually no longer
used).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The helper was in use by move-assignment operator and by the .swap()
method. Since now the operator equals the helper, the code can be
merged and the .swap() can be prettified.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now it's time to remove the boost set's hook from the range_tombstone
and keep it wrapped into another class if the r._tombstone's location
is the range_tombstone_list.
Also the added previously .tombstone() getters and the _entry alias
can be removed -- all the code can work with the new class.
Two places in the code that made use of without_link{} move-constructor
are patched to get the range_tombstone part from the respective _entry
with the same result.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
A continuation of the previous patch. The range_tombstone_list works
with the range_tombstone very actively, kicking every single line
doing this to call .tombstone() seems excessive. Instead, declare
the range_tombstone_entry alias. When the entry will appear for real,
the alias would go away and the range_tombstone_list will be switched
into new entity right at once.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently all the code operates on the range_tombstone class.
and many of those places get the range tombstone in question
from the range_tombstone_list. Next patches will make that list
carry (and return) some new object called range_tombstone_entry,
so all the code that expects to see the former one there will
need to patched to get the range_tombstone from the _entry one.
This patch prepares the ground for that by introdusing the
range_tombstone& tombstone() { return *this; }
getter on the range_tombstone itself and patching all future
users of the _entry to call .tombstone() right now.
Next patch will remove those getters together with adding the new
range_tombstone_entry object thus automatically converting all
the patched places into using the entry in a proper way.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Just add a helper for constructing the managed range
tombstone object. This will also help further patch
have less duplicating hunks in it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method returns a pointer on the left-most range tombstone
and expects the caller to "dispose" it. This is not very nice
because the callers thus needs to mess with the relevant deleter.
A nicer approach is the pop-like one (former pop_as<>-like)
which is in returning the range tombstone by value. This value
is move-constructed from the original object which is disposed
by the container itself.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method pops the range tombstone from the containing list
and transparently "converts" it into some other type. Nowadays
all callers of it need range tombstone as-is, so the template
can be relaxed down to a plan call.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The range_tombstone's .empty() and .operator bool are trivially such.
The swap()'s noexceptness comes from what it calls -- the without-link
move constructor (noexcept) and .move_assign(). The latter is noexcept
because it's already called from noexcept move-assign operator and
because it calls noexcept move operators of tombstones' fields. The
update_node() is noexcept for the same reason.
The range_tombstone_list's clear() is noexcept because both -- set
clear and disposer lambda are both such.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This gcc flag is not supported. `-fcoroutines-ts` also
cannot be used, so just don't supply anything, similar to
what `configure.py` does.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
The consume_clustering_fragments declares several auxiliary symbols
to work with rows' and range-tombstones' iterators. For the range
tombstones it relies on what container is declared inside the
range tombstone itself. Soon the container declaration will move
from range_tombstone class into a new entity and this place should
be prepared for that. The better place to get iterator types from
is the range-tombstones container itself.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The tests in this patch verify that null characters are valid characters
inside string and bytes (blob) attributes in Alternator. The tests
verify this for both key attributes and non-key attributes (since those
are serialized differently, it's important to check both cases).
The tests pass on both DynamoDB and Alternator - confirming that we
don't have a bug in this area.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210824163442.186881-1-nyh@scylladb.com>
"
A special-purpose reader which doesn't use the index at all, designed to
be used in circumstances where the index is not reliable. The use-case
is scrub and validate which often have to work with corrupt indexes and
it is especially important that they don't further any existing
corruption.
Tests: unit(dev)
"
* 'crawling-sstable-reader/v2' of https://github.com/denesb/scylla:
compaction: scrub/validate: use the crawling sstable reader
sstables: wire in crawling reader
sstables: mx/reader: add crawling reader
sstables: kl/reader: add crawling reader
This patch adds tests for two undocumented (as far as I can tell) corner
cases of CQL's string types:
1. The types "text" and "varchar" are not just similar - they are in
fact exactly the same type.
2. All CQL string and blob types ("ascii", "text" or "varchar", "blob")
allow the null character as a valid character inside them. They are
*not* C strings that get terminated by the first null.
These tests pass on both Cassandra and Scylla, so did not expose any
bug, but having such tests is useful to understand these (so-far)
undocumented behaviors - so we can later document them.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210824225641.194146-1-nyh@scylladb.com>
Python 3.8 doesn't allow to use built-in collection types
in type annotations (such as `list` or `dict`).
This feature is implemented starting from 3.9.
Replace `list[str]` type annotation with an old-style
`List[str]`, which uses `List` from the `typing` module.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20210901131436.35231-1-pa.solodovnikov@scylladb.com>
After aa7cdc0392, run_custom_job() propagates stop exception.
The problem is that we fail to handle stop exception in the procedure
which stops ongoing compactions, so the exception will be propagated
all the way to init, which causes scylla to abort.
to fix this, let's swallow stop_exception in stop_ongoing_compactions(),
which is correct because compactions are stopped by triggering that
exception if signalled to stop.
when reshard is stopped, scylla init will fail as follow instead:
ERROR 2021-08-16 20:13:13,770 [shard 0] init - Startup failed: std::runtime_error
(Exception while populating keyspace 'keyspace5' with column family 'standard1'
from file ...
Fixes#9158.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210816232434.78375-1-raphaelsc@scylladb.com>
lw_shared_ptr must not be copied on a foreign shard.
Copying the vector on shard 0 tries increases the reference count of
lw_shared_ptr<sstable> elements that were created on other shards,
as seen in https://github.com/scylladb/scylla/issues/9278.
Fixes#9278
DTest: migration_test.py:TestLoadAndStream_with_3_0_md.load_and_stream_increase_cluster_test(debug)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210902084313.2003328-1-bhalevy@scylladb.com>
The CQL grammar is obviously about cql3 and mostly about cql3
expressions, so add using namespace statements so we don't
have to specify it over and over again.
These statements are present in the headers, but only in the
cql_parser namespace, so it doesn't pollute other translation
units.
Closes#9255
compile_commands.json is a format of compilation database info for use
with several editors, such as VSCode (with official C++ extension) and
Vim (with youcompleteme). It can be generated with ninja:
```
ninja -t compdb > compile_commands.json
```
I propose this addition, so that this file won't be commited by
accident.
Closes#9279
This change allows to invoke the script in out-of-source
builds: `git log` now uses `-C` option with the directory
containing the script.
Also, the destination path can now be overriden by providing
`-o|--output-dir PATH` option. By default it's set to the
`build` directory relative to the script location.
Usage message is now shown, when '-h|--help' option is
specified.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20210831120257.46920-1-pa.solodovnikov@scylladb.com>
Pass --pip-packages option to tools/python3/reloc/build_reloc.sh,
add scylla-driver to relocatable python3 which required for
fix_system_distributed_tables.py.
[avi: regenrate toolchain]
Ref #9040
Both functions do the same -- get datacenters from given
endpoint and local broadcast address and compare them to
match (or not).
tests: unit(dev)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210902080858.16364-1-xemul@scylladb.com>
* 'raft-misc-v3' of github.com:scylladb/scylla-dev:
raft: rename snapshot into snapshot_descriptor
raft: drop snapshot if is application failed
raft: fix local snapshot detection
raft: replication_test: store multiple snapshots in a state machine
raft: do not wait for entry to become stable before replicate it