To emphasize that the function requires `seastar::thread`
context to function properly.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
* seastar 655078dfdb...28fe4214e5 (2):
> program_options: avoid including boost/program_options.hpp when possible
> smp: split smp_options out of smp.hh
sstables/sstables.cc uses seastar::metrics but was missing an include of
<seastar/core/metrics.hh>. It probably received this include through
some other random included Seastar header (e.g., smp.hh).
Now that we're reducing the unnecessary inclusions in Seastar (an ongoing
effort of Seastar patches), it is no longer included implicitly, and we
need to include it explicitly in sstables.cc.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220109162823.511781-1-nyh@scylladb.com>
As already noted in commit eac6fb8, many of the scylla-gdb tests fail on
aarch64 for various reasons. The solution used in that commit was to
have test/scylla-gdb/run pretend to succeed - without testing anything -
when not running on x86_64. This workaround was accidentally lost when
scylla-gdb/run was recently rewritten.
This patch brings this workaround back, but in a slightly different form -
Instead of the run script not doing anything, the tests do get called, but
the "gdb" fixture in test/scylla-gdb/conftest.py causes each individual
test to be skipped.
The benefit of this approach is that it can easily be improved in the
future to only skip (or xfail) specific tests which are known to fail on
aarch64, instead of all of them - as half of the tests do pass on aarch64.
Fixes#9892.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220109152630.506088-1-nyh@scylladb.com>
This is a consolidation of #9714 and #9709 PRs by @elcallio that were reviewed by @asias
The last comment on those was that they should be consolidated in order not to create a security degradation for
ec2 setups.
For some cases it is impossible to determine dc or rack association for nodes on outgoing connections.
One example is when some IPs are hidden behind Nat layer.
In some cases this creates problems where one side of the connection is aware of the rack/dc association where the
other doesn't.
The solution here is a two stage one:
1. First add a gossip reverse lookup that will help us determine the rack/dc association for a broader (hopefully all) range
of setups and NAT situations.
2. When this fails - be more strict about downgrading a node which tries to ensure that both sides of the connection will at least
downgrade the connection instead of just fail to start when it is not possible for one side to determine rack/dc association.
Fixes#9653
/cc @elcallio @asias
Closes#9822
* github.com:scylladb/scylla:
messaging_service: Add reverse mapping of private ip -> public endpoint
production_snitch_base: Do reverse lookup of endpoint for info
messaging_service: Make dc/rack encryption check for connection more strict
init.hh relies on boost::program_options but forgot to include the
header file <boost/program_options.hpp> for it. Today, this doesn't
matter, because Seastar unnecessarily includes <boost/program_options.hpp>
from unrelated header files (such as smp.hh) - so it ends up not being
missing.
But we plan to clean up Seastar from those unnecessary includes, and
then including what we need in init.hh will become important.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220109123152.492466-1-nyh@scylladb.com>
Move the ::database, ::keyspace, and ::table classes to a new replica
namespace and replica/ directory. This designates objects that only
have meaning on a replica and should not be used on a coordinator
(but note that not all replica-only classes should be in this module,
for example compaction and sstables are lower-level objects that
deserve their own modules).
The module is imperfect - some additional classes like distributed_loader
should also be moved, but there is only one way to untie Gordian knots.
Closes#9872
* github.com:scylladb/scylla:
replica: move ::database, ::keyspace, and ::table to replica namespace
database: Move database, keyspace, table classes to replica/ directory
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.
References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.
scylla-gdb.py is adjusted to look for both the new and old names.
Tables waiting for a chance to run reshape wouldn't trigger stop
exception, as the exception was only being triggered for ongoing
compactions. Given that stop reshape API must abort all ongoing
tasks and all pending ones, let's change run_custom_job() to
trigger the exception if it found that the pending task was
asked to stop.
Tests:
dtest: compaction_additional_test.py::TestCompactionAdditional::test_stop_reshape_with_multiple_keyspaces
unit: dev
Fixes#9836.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20211223002157.215571-1-raphaelsc@scylladb.com>
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.
As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
This patch is an almost complete rewrite of the test/scylla-gdb
framework for testing Scylla's gdb commands.
The goals of this rewrite are described in issue #9864. In short, the
goals are:
1. Use pytest to define individual test cases instead one long Python
script. This will make it easier to add more tests, to run only
individual tests (e.g., test/scylla-gdb/run somefile.py::sometest),
to understand which test failed when it fails - and a lot of other
pytest conveniences.
2. Instead of an ad-hoc shell script to run Scylla, gdb, and the test,
use the same Python code which is used in other test suites (alternator,
cql-pytest, redis, and more). The resulting handling of the temporary
resources (processes, directories, IP address) is more robust, and
interrupting test/scylla-gdb/run will correctly kill its child
processes (both Scylla and gdb).
All existing gdb tests (except one - more on this below...) were
easily rewritten in the new framework.
The biggest change in this patch is who starts what. Before this patch,
"run" starts gdb, which in turn starts Scylla, stops it on a breakpoint,
and then runs various tests. After this patch, "run" starts Scylla on
its own (like it does in test/cql-pytest/run, et al.), and then gdb runs
pytest - and in a pytest fixture attaches to the running Scylla process.
The biggest benefit of this approach is that "run" is aware of both gdb
and Scylla, and can kill both with abruptly with SIGKILL to end the test.
But there's also a downside to this change: One of the tests (of "scylla
fiber") needs access to some task object. Before this patch, Scylla was
stopped on a breakpoint, and a task was available at that point. After
this patch, we attach gdb to an idle Scylla, and the test cannot find
any task to use. So the test_fiber() test fails for now.
One way we could perhaps fix it is to add a breakpoint and "continue"
Scylla a bit more after attaching to it. However, I could find the right
breakpoint - and we may also need to send a request to Scylla to
get it to reach that breakpoint. I'm still looking for a better way
to have access to some "task" object we can test on.
Fixes#9864.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220102221534.1096659-1-nyh@scylladb.com>
This set contains follow-up fixes to folding tools into the scylla
executable:
* Improve the app description of scylla w.r.t. tools
* Add a new --list-tools option
* Error out when the first argument is unrecognized
Tests: unit(dev)
Botond Dénes (3):
main: rephrase app description
main: add move tool listing to --list-tools
main: improve handling of non-matching argv[1]
main.cc | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)
Be silent when argv[1] starts with "-", it is probably an option to
scylla (and "server" is missing from the cmd line).
Print an error and stop when argv[1] doesn't start with "-" and thus the
user assumably meant to start either the server or a tool and mis-typed
it. Instead of trying to guess what they meant stop with a clear error
message.
And make it the central place listing available tools (to minimize the
places to update when adding a new one). The description is edited to
point to this command instead of listing the tools itself.
Remove "compatible with Apache Cassandra", scylla is much more than that
already.
Rephrase the part describing the included tools such that it is clear
that the scylla server is the main thing and the tools are the "extra"
additions. Also use the term "tool" instead of the term "app".
The error-handling code removes the cache entry but this leads to an
assertion because the entry is still referenced by the entry pointer
instance which is returned on the normal path. To avoid this clear the
pointer on the error path and make sure there are no additional
references kept to it.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20220105140859.586234-2-bdenes@scylladb.com>
When a template is instantiated in a header file which is included by many
source files, the compiler needs to compile it again and again.
ClangBuildAnalyzer helps find the worst cases of this happening, and one
of the worst happens to be
seastar::throw_with_backtrace<marshal_exception, sstring>
This specific template function takes (according to ClangBuildAnalyzer)
362 milliseconds to instantiate, and this is done 312 (!) times, because
it reaches virtually every Scylla source file via either types.hh or
compound.hh which use this idiom.
Unfortunately, C++ as it exists today does not have a mechanism to
avoid compiling a specific template instantiation if this was already
done in some other source file. But we can do this manually using
the C++11 feature of "extern template":
1. For a specific template instance, in this case
seastar::throw_with_backtrace<marhsal_exception, sstring>,
all source files except one specify it as "extern template".
This means that the code for it will NOT be built in this source
file, and the compiler assumes the linker will eventually supply it.
2. At the same time, one source file instantiates this template
instance once regularly, without "extern".
The numbers from ClangBuildAnalyzer suggest that this patch should
reduce total build time by 1% (in dev build mode), but this is hard to
measure in practice because the very long build time (210 CPU minutes on
my laptop) usually fluctuates by more than 1% in consecutive runs.
However, we've seen in the past that a good estimate of build time is
the total produced object size (du -bc build/dev/**/*.o). This patch
indeed reduces this total object size (in dev build mode) by exactly 1%.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220105171453.308821-1-nyh@scylladb.com>
The header file <seastar/net/ip.hh> is a large collection of unrelated stuff, and according to ClangBuildAnalyzer, takes 2 seconds to compile for every source file that included it - and unfortunately virtually all Scylla source files included it - through either "types.hh" or "gms/inet_address.hh". That's 2*300 CPU seconds wasted.
In this two-patch series we completely eliminate the inclusion of <seastar/net/ip.hh> from Scylla. We still need the ipv4_address, ipv6_address types (e.g., gms/inet_address.hh uses it to hold a node's IP address) so those were split (in a Seastar patch that is already in) from ip.hh into separate small header files that we can include.
This patch reduces the entire build time (of build/dev/scylla) by 4% - reducing almost 10 sCPU minutes (!) from the build.
Closes#9875
* github.com:scylladb/scylla:
build performance: do not include <seastar/net/ip.hh>
build performance: speed up inclusion of <gm/inet_address.hh>
In a previous patch, we noticed that the header file <gm/inet_address.hh>,
which is included, directly or indirectly, by most source files,
includes <seastar/net/ip.hh> which is very slow to compile, and
replaced it by the much faster-to-include <seastar/net/ipv[46]_address.hh>.
However, we also included <seastar/net/ip.hh> in types.hh - and that
too is included by almost every file, so the actual saving from the
above patch was minimal. So in this patch we replace this include too.
After this patch Scylla does not include <seastar/net/ip.hh> at all.
According to ClangBuildAnalyzer, this reduces the average time to include
types.hh (multiply this by 312 times!) from 4 seconds to 1.8 seconds,
and reduces total build time (dev mode) by about 3%.
Some of the source files were now missing some include directives, that
were previously included in ip.hh - so we need to add those explicitly.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The test case assumed int32 partition key, but
scylla_bench_large_part_ds1 has int64 partition key. This resulted in
no results to be returned by the reader.
Fixs by introducing a partition key factory on the data source level.
Message-Id: <20220105150550.67951-1-tgrabiec@scylladb.com>
dbuild's README contained some vague and very partial hints on how to use
ccache with dbuild. Replace them with more concrete instructions.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211229180433.781906-1-nyh@scylladb.com>
Which configures seastar to act more appropriate to a tool app. I.e.
don't act as if it owns the place, taking over all system resources.
These tools are often run on a developer machine, or even next to a
running scylla instance, we want them to be the least intrusive
possible.
Also use the new tool mode in the existing tools.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20211220143104.132327-1-bdenes@scylladb.com>
Sort follower nodes by the proximity so that in the step where the
master node gets missing rows from repair follower nodes,the master
node has a chance to get the missing rows from a near node first (e.g.,
local dc node), avoding getting rows from a far node.
For example:
dc1: n1, n2
dc2: n3, n4
dc3: n5, n6
Run repair on n1, with this patch, n1 will get data from n2 which is in the same dc first.
[shard 0] repair - Repair 1 out of 1 ranges, id=[id=1, uuid=8b0040bd-5aa5-42e1-bb9f-58c5e7052aec],
shard=0, keyspace=ks, table={cf}, range=(-6734413101754081925, -6539883972247625343],
peers={127.0.39.5, 127.0.39.6, 127.0.39.2, 127.0.39.4, 127.0.39.3},
live_peers={127.0.39.5, 127.0.39.6, 127.0.39.2, 127.0.39.4, 127.0.39.3}
[shard 0] repair - Before sort = {127.0.39.5, 127.0.39.6, 127.0.39.2, 127.0.39.4, 127.0.39.3}
[shard 0] repair - After sort = {127.0.39.2, 127.0.39.5, 127.0.39.6, 127.0.39.4, 127.0.39.3}
[shard 0] repair - Started Row Level Repair (Master): local=127.0.39.1,
peers={127.0.39.2, 127.0.39.5, 127.0.39.6, 127.0.39.4, 127.0.39.3}
Closes#9769
We already have multiple tests for the unimplemented "Projection" feature
of GSI and LSI (see issue #5036). This patch adds seven more test cases,
focusing on various types of errors conditions (e.g., trying to project
the same attribute twice), esoteric corner cases (it's fine to list a key in
NonKeyAttributes!), and corner cases that I expect we will have in our
implementation (e.g., a projected attribute may either be a real Scylla
column or just an element in a map column).
All new tests pass on DynamoDB and fail on Alternator (due to #5036), so
marked with "xfail".
Refs #5036.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211228193748.688060-1-nyh@scylladb.com>
"
Like flat_mutation_reader_from_fragments, this reader is also heavily
used by tests to compose a specific workload for readers above it. So
instead of converting it, we add a v2 variant and leave the v1 variant
in place.
The v2 variant was written from scratch to have built-in support for
reading in reverse. It is built-on `mutation::consume()` to avoid
duplicating the logic of consuming the contents of the mutation. To
avoid stalls, `mutation::consume()` gets support for pausing and
resuming consuming a mutation.
Tests: unit(dev)
"
* 'flat_mutation_reader_from_mutations_v2/v2' of https://github.com/denesb/scylla:
flat_mutation_reader: convert make_flat_mutation_reader_from_mutation() v2
flat_mutation_reader: extract mutation slicing into a function
mutation: consume(): make it pausable/resumable
mutation: consume(): restructure clustering iterator initialization
test/boost/mutation_test: add rebuild test for mutation::consume()