Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
---
v2:
- Have stop earlier so that an exception in start/listen does
not prevent prometheus.stop from being called
As suggested in issue #4586 here is the helper that prints
"shutting down foo" message, then shuts the foo down, then
prints the "shutting down foo was successful" message. In between
it catches the exception (if any) and logs a warning about it.
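
The log / stop / log pattern described above can be sketched as follows.
This is only an illustration in Python, not the helper from the patch:
the name stop_with_logging and the logger name are hypothetical, and
whether the real helper re-raises the exception is not stated here, so
the sketch assumes it is swallowed after the warning.

```python
import logging

log = logging.getLogger("init")


def stop_with_logging(what, stop_fn):
    """Hypothetical helper: announce the shutdown, run it, report the result."""
    log.info("Shutting down %s", what)
    try:
        stop_fn()
    except Exception as exc:
        # Assumption: the failure is only logged, so that the remaining
        # shutdown steps still get a chance to run.
        log.warning("Shutting down %s failed: %s", what, exc)
        return False
    log.info("Shutting down %s was successful", what)
    return True
```

A caller would wrap each service's stop routine, e.g.
stop_with_logging("foo", foo.stop), and go on to the next one
regardless of the outcome.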
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Before stopping the db itself, stop the migration service.
It must be stopped before RPC, but RPC is not stopped yet
itself, so we should be safe here.
Here's the tail of the resulting logs:
INFO 2019-11-20 11:22:35,193 [shard 0] init - shutdown migration manager
INFO 2019-11-20 11:22:35,193 [shard 0] migration_manager - stopping migration service
INFO 2019-11-20 11:22:35,193 [shard 1] migration_manager - stopping migration service
INFO 2019-11-20 11:22:35,193 [shard 0] init - Shutdown database started
INFO 2019-11-20 11:22:35,193 [shard 0] init - Shutdown database finished
INFO 2019-11-20 11:22:35,193 [shard 0] init - stopping prometheus API server
INFO 2019-11-20 11:22:35,193 [shard 0] init - Scylla version 666.development-0.20191120.25820980f shutdown complete.
Also -- stop the mm on drain before the commitlog is stopped.
[Tomasz: mm needs the cl because pulling schema changes from other nodes
involves applying them into the database. So cl/db needs to be
stopped after mm is stopped.]
The drain logs would look like
...
INFO 2019-11-25 11:00:40,562 [shard 0] migration_manager - stopping migration service
INFO 2019-11-25 11:00:40,562 [shard 1] migration_manager - stopping migration service
INFO 2019-11-25 11:00:40,563 [shard 0] storage_service - DRAINED:
and then on stop
...
INFO 2019-11-25 11:00:46,427 [shard 0] init - shutdown migration manager
INFO 2019-11-25 11:00:46,427 [shard 0] init - Shutdown database started
INFO 2019-11-25 11:00:46,427 [shard 0] init - Shutdown database finished
INFO 2019-11-25 11:00:46,427 [shard 0] init - stopping prometheus API server
INFO 2019-11-25 11:00:46,427 [shard 0] init - Scylla version 666.development-0.20191125.3eab6cd54 shutdown complete.
Fixes #5300
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20191125080605.7661-1-xemul@scylladb.com>
In get_full_row_hashes_with_rpc_stream and
repair_get_row_diff_with_rpc_stream_process_op which were introduced in
the "Repair switch to rpc stream" series, rx_hashes_nr metrics are not
updated correctly.
In the test we have 3 nodes and run repair on node3; we make sure the
following metrics are correct.
assertEqual(node1_metrics['scylla_repair_tx_hashes_nr'] + node2_metrics['scylla_repair_tx_hashes_nr'],
node3_metrics['scylla_repair_rx_hashes_nr'])
assertEqual(node1_metrics['scylla_repair_rx_hashes_nr'] + node2_metrics['scylla_repair_rx_hashes_nr'],
node3_metrics['scylla_repair_tx_hashes_nr'])
assertEqual(node1_metrics['scylla_repair_tx_row_nr'] + node2_metrics['scylla_repair_tx_row_nr'],
node3_metrics['scylla_repair_rx_row_nr'])
assertEqual(node1_metrics['scylla_repair_rx_row_nr'] + node2_metrics['scylla_repair_rx_row_nr'],
node3_metrics['scylla_repair_tx_row_nr'])
assertEqual(node1_metrics['scylla_repair_tx_row_bytes'] + node2_metrics['scylla_repair_tx_row_bytes'],
node3_metrics['scylla_repair_rx_row_bytes'])
assertEqual(node1_metrics['scylla_repair_rx_row_bytes'] + node2_metrics['scylla_repair_rx_row_bytes'],
node3_metrics['scylla_repair_tx_row_bytes'])
Tests: repair_additional_test.py:RepairAdditionalTest.repair_almost_synced_3nodes_test
Fixes: #5339
Backports: 3.2
Merged pull request https://github.com/scylladb/scylla/pull/5310 from
Avi Kivity:
This is a minor update as gcc and boost versions did not change. A notable
update is patchelf 0.10, which adds support for large binaries.
A few minor issues exposed by the update are fixed in preparatory patches.
Patches:
dist: rpm: correct systemd post-uninstall scriptlet
build: force xz compression on rpm binary payload
tools: toolchain: update to Fedora 31
Since 90d6c0b, cache will abort when trying to detach partition
entries while they're updated. This should never happen. It can happen
though, when the update fails on bad_alloc, because the cleanup guard
invalidates the cache before it releases partition snapshots (held by
"update" coroutine).
Fix by destroying the coroutine first.
Fixes #5327.
Tests:
- row_cache_test (dev)
Message-Id: <1574360259-10132-1-git-send-email-tgrabiec@scylladb.com>
By default rpm uses dwz to merge the debug info from various
binaries. Unfortunately, it looks like addr2line has not been updated
to handle this:
// This works
$ addr2line -e build/release/scylla 0x1234567
$ dwz -m build/release/common.debug build/release/scylla.debug build/release/iotune.debug
// now this fails
$ addr2line -e build/release/scylla 0x1234567
I think the issue is
https://sourceware.org/bugzilla/show_bug.cgi?id=23652
Fixes #5289
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191123015734.89331-1-espindola@scylladb.com>
By default we were compressing debug info only in release
executables. The idea, if I understand it correctly, is that those are
the ones we ship, so we want a more compact binary.
I don't think that was doing anything useful. The compression is just
gzip, so when we ship a .tar.xz, having the debug info compressed
inside the scylla binary probably reduces the overall compression a
bit.
When building an rpm the situation is amusing. As part of the rpm
build process the debug info is decompressed and extracted to an
external file.
Given that most of the link time goes to compressing debug info, it is
probably a good idea to just skip that.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191123022825.102837-1-espindola@scylladb.com>
Structure the code to be able to introduce futures.
Apply trivial cleanups.
Switch to asyncio and use it to work with processes and
handle signals. Clean up all processes upon signal.
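
A minimal sketch of that arrangement, assuming a Unix event loop. The
function names run_one, kill_all and run_all are hypothetical and are
not taken from the harness; the sketch only shows child processes being
awaited concurrently while a signal handler kills whatever is still
running.

```python
import asyncio
import signal
import sys


async def run_one(cmd, running):
    """Start one child process, track it, and return its exit code."""
    proc = await asyncio.create_subprocess_exec(*cmd)
    running.add(proc)
    try:
        return await proc.wait()
    finally:
        running.discard(proc)


def kill_all(running):
    """Signal handler body: kill every child that has not exited yet."""
    for proc in list(running):
        if proc.returncode is None:
            try:
                proc.kill()
            except ProcessLookupError:
                pass  # the child exited between the check and the kill


async def run_all(cmds):
    """Run all commands concurrently; clean up children on SIGINT/SIGTERM."""
    running = set()
    loop = asyncio.get_running_loop()
    signals = (signal.SIGINT, signal.SIGTERM)
    for sig in signals:
        loop.add_signal_handler(sig, kill_all, running)
    try:
        return await asyncio.gather(*(run_one(c, running) for c in cmds))
    finally:
        for sig in signals:
            loop.remove_signal_handler(sig)


if __name__ == "__main__":
    print(asyncio.run(run_all([[sys.executable, "-c", "pass"]])))
```

Killed children make proc.wait() return, so gather() completes and the
caller sees their (negative) exit codes instead of hanging.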
This patch implements a simple optimization for LWT: it makes PAXOS
prepare phase query locally and return the current value of the modified
key so that a separate query is not necessary. For more details see
patch 6. Patch 1 fixes a bug in next. Patches 2-5 contain trivial
preparatory refactoring.
Current LWT implementation uses at least three network round trips:
- first, execute PAXOS prepare phase
- second, query the current value of the updated key
- third, propose the change to participating replicas
(there's also learn phase, but we don't wait for it to complete).
The idea behind the optimization implemented by this patch is simple:
piggyback the current value of the updated key on the prepare response
to eliminate one round trip.
To generate less network traffic, only the replica closest to the
coordinator sends data, while the other participating replicas send
digests which are used to check data consistency.
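
The data-plus-digests check can be illustrated with a small Python
model. This is a sketch under assumptions, not the patch's code: the
function names are hypothetical, SHA-256 merely stands in for whatever
digest the real implementation uses, and returning None on a mismatch
stands in for falling back to a separate read.

```python
import hashlib


def digest(value: bytes) -> bytes:
    # Assumption: any stable hash works for the illustration.
    return hashlib.sha256(value).digest()


def prepare_response(replica_value: bytes, send_data: bool):
    """What one replica piggybacks on its prepare response."""
    if send_data:
        return ("data", replica_value)
    return ("digest", digest(replica_value))


def reconcile(responses):
    """Return the value if every digest matches the single data reply."""
    data = [v for kind, v in responses if kind == "data"]
    if len(data) != 1:
        return None
    expected = digest(data[0])
    if all(v == expected for kind, v in responses if kind == "digest"):
        return data[0]
    return None  # replicas disagree; the piggybacked value cannot be used
```

With three replicas, one full value and two digests cross the network
instead of three full values.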
Note, this patch changes the API of some RPC calls used by PAXOS, but
this should be okay as long as the feature is in the early development
stage and marked experimental.
To assess the impact of this optimization on LWT performance, I ran a
simple benchmark that starts a number of concurrent clients each of
which updates its own key (uncontended case) stored in a cluster of
three AWS i3.2xlarge nodes located in the same region (us-west-1) and
measures the aggregate bandwidth and latency. The test uses shard-aware
gocql driver. Here are the results:
         latency 99% (ms)   bandwidth (rq/s)   timeouts (rq/s)
clients    before   after     before   after    before   after
      1         2       2        626     637         0       0
      5         4       3       2616    2843         0       0
     10         3       3       4493    4767         0       0
     50         7       7      10567   10833         0       0
    100        15      15      12265   12934         0       0
    200        48      30      13593   14317         0       0
    400       185      60      14796   15549         0       0
    600       290      94      14416   15669         0       0
    800       568     118      14077   15820         2       0
   1000       710     118      13088   15830         9       0
   2000      1388     232      13342   15658        85       0
   3000      1110     363      13282   15422       233       0
   4000      1735     454      13387   15385       329       0
That is, this optimization improves max LWT bandwidth by about 15%
and allows running 3-4x more clients while maintaining the same level
of system responsiveness.
invoke_on() guarantees that captured objects won't be destroyed until the
future returned by the invoked function is resolved so there's no need
to move key, token, proposal for calling paxos_state::*_impl helpers.
The test_health_only_works_for_root_path test checks that while Alternator's
HTTP server responds to a "GET /" request with success ("health check"), it
should respond to different URLs with failures (page not found).
One of the URLs it tested was "/..", but unfortunately some versions of
Python's HTTP client canonicalize this request to just "/", causing the
request to unexpectedly succeed - and the test to fail.
So this patch just drops the "/.." check. A few other nonsense URLs are
attempted by the test - e.g., "/abc".
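
To see why "/.." is a poor probe, here is a small Python illustration
of dot-segment removal. It uses posixpath.normpath from the standard
library as a stand-in; this is not the code path of the HTTP client
itself, only the same kind of normalization.

```python
import posixpath


def canonical_path(path: str) -> str:
    """Collapse '.' and '..' segments, as a normalizing client might."""
    return posixpath.normpath(path)


# "/.." has no parent to climb to, so it collapses to the root path,
# while an ordinary nonsense path is left alone.
canonical = {p: canonical_path(p) for p in ["/..", "/abc"]}
```

After normalization "/.." is indistinguishable from the health-check
URL "/", whereas "/abc" still reaches the server as a different path.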
Fixes #5321
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
One of the fields still missing in DescribeTable's response (Refs #5026)
was the table's schema - KeySchema and AttributeDefinitions.
This patch adds this missing feature, and enables the previously-xfailing
test test_describe_table_schema.
A complication of this patch is that in a table with secondary indexes,
we need to return not just the base table's schema, but also the indexes'
schema. The existing tests did not cover that feature, so we add here
two more tests in test_gsi.py for that.
One of these secondary-index schema tests, test_gsi_2_describe_table_schema,
still fails, because it outputs a range-key which Scylla added to a view
because of its own implementation needs, but wasn't in the user's
definition of the GSI. I opened a separate issue #5320 for that.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Serialize reference_wrapper<T> as T and make sure is_equivalent<> treats
reference_wrapper<T> wrapped in std::optional<> or std::variant<>, or
std::tuple<> as T.
We need it to avoid copying query::result while serializing
paxos::promise.
Currently even if `-a` or `-s 0` is provided, `scylla task_histogram`
will scan a limited number of pages due to a bug in the scan loop's stop
condition, which will trigger a stop once the default sample limit is
reached. Fix the loop by skipping this check when the user wants to scan
all tasks.
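
The shape of the fixed stop condition, as a hedged Python sketch. The
function scan_pages and its parameters are hypothetical and much
simpler than the real gdb script; the point is only that the limit
check is bypassed when everything should be scanned.

```python
def scan_pages(pages, sample_limit, scan_all=False):
    """Collect objects page by page; stand-in for the scan loop."""
    sampled = []
    for page in pages:
        sampled.extend(page)
        # The sample limit is consulted only when a limit is wanted.
        if not scan_all and len(sampled) >= sample_limit:
            break
    return sampled
```

Before the fix, the equivalent of the len(sampled) >= sample_limit test
ran unconditionally, so a "scan all" request still stopped early.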
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20191121141706.29476-1-bdenes@scylladb.com>
The UnitTest class juggles the name 'args' quite a bit to
construct the command line for a unit test, so let's spread
the harness command line arguments from the unit test command line
arguments a bit apart by consistently calling the harness command line
arguments 'options', and unit test command line arguments 'args'.
Rename usage() to parse_cmd_line().
Create unique UnitTest objects in find_tests() for each found match,
including repeat, to ensure each test has its own unique id.
This will also be used to store execution state in the test.
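
Roughly, the idea looks like the sketch below. The class layout, the
counter and the find_tests signature are assumptions made for the
example and do not mirror the harness code.

```python
import itertools


class UnitTest:
    _ids = itertools.count(1)

    def __init__(self, name, args):
        self.id = next(UnitTest._ids)  # unique per object, even for repeats
        self.name = name
        self.args = args
        self.success = None  # execution state is kept on the test itself


def find_tests(names, repeat):
    """Create one UnitTest object per match and per repetition."""
    return [UnitTest(name, []) for name in names for _ in range(repeat)]
```

Because every repetition is its own object, per-test output such as log
or XML file names can be derived from the id without collisions.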
It somewhat stands in the way of using asyncio
This patch also implements a more comprehensive
fix for #5303, since we not only have --repeat, but
run some tests in different configurations, in which
case xml output is also overwritten.
When starting scylla daemon as non-root the initialization fails
because standard /var/lib/scylla is not accessible by regular users.
Making the default dir accessible to the user is not very convenient
either, as it will cause conflicts if two or more instances of scylla
are in use.
This problem can be resolved by specifying --commitlog-directory,
--data-file-directories, etc on start, but it's too much typing. I
propose to revive Nadav's --home option that allows moving all the
directories under the same prefix in one go.
Unlike Nadav's approach the --workdir option doesn't do any tricky
manipulations with existing directories. Instead, as Pekka suggested,
the individual directories are placed under the workdir if and only
if the respective option is NOT provided. Otherwise the directory
configuration is taken as is regardless of whether it is an absolute
or a relative path.
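
The rule can be sketched in a few lines of Python. The option names and
sub-directory names below are placeholders for illustration, not the
actual configuration defaults.

```python
import os.path

# Hypothetical option names and sub-directory names.
DEFAULT_SUBDIRS = {
    "commitlog_directory": "commitlog",
    "data_file_directories": "data",
}


def resolve_dirs(workdir, options):
    """Place unset directories under workdir; keep explicit ones as is."""
    resolved = {}
    for name, subdir in DEFAULT_SUBDIRS.items():
        value = options.get(name)
        if value is None:
            resolved[name] = os.path.join(workdir, subdir)
        else:
            resolved[name] = value  # absolute or relative, left untouched
    return resolved
```

For example, with a workdir of /tmp/scylla and only the commitlog
directory given explicitly, the data directory ends up under
/tmp/scylla while the commitlog path is used exactly as typed.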
The value substitution is done early on start. Avi suggested that
this is unsafe wrt HUP config re-read and proper paths must be
resolved on the fly, but this patch doesn't address that yet, here's
why.
First of all, the respective options are MustRestart now and the
substitution is done before HUP handler is installed.
Next, commitlog and data_file values are copied on start, so marking
the options as LiveUpdate won't have any effect.
Finally, the existing named_value::operator() returns a reference,
so returning a calculated (and thus temporary) value is not possible
(from my current understanding, correct me if I'm wrong). Thus if we
want *_directory() to return a calculated value, all of its callers
must be patched to call something different (e.g. *_directory.get() ?)
which will lead to more confusion and errors.
Changes v3:
- the option is --workdir back again
- the existing *directory are only affected if unset
- default config doesn't have any of these set
- added the short -W alias
Changes v2:
- the option is --home now
- all other paths are changed to be relative
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20191119130059.18066-1-xemul@scylladb.com>