Commit Graph

30 Commits

Author SHA1 Message Date
Pavel Emelyanov
8a03683671 batchlog_manager: Drain it with shared future
The .drain() method can be called from several places, each needs to
wait for its completion. Now this is achieved with the help of a gate,
but there's a simpler way

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-07-04 13:42:45 +03:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Nadav Har'El
3fbbad7d60 build performance: speed up inclusion of <gm/inet_address.hh>
The header file <gm/inet_address.hh> is included, directly or
indirectly, from 291 source files in Scylla. It is hard to reduce this
number because Scylla relies heavily on IP addresses as keys to
different things. So it is important that this header file be fast to
include. Unfortunately it wasn't... ClangBuildAnalyzer measurements
showed that each inclusion of this header file added a whopping 2 seconds
(in dev build mode) to the build. A total of 600 CPU seconds - 10 CPU
minutes - were spent just on this header file. It was actually worse
because the build also spent additional time on template instantiation
(more on this below).

So in this patch we:

1. Remove some unnecessary stuff from gms/inet_address.hh, and avoid
   including it in one place that doesn't need it. This is just
   cosmetic, and doesn't significantly speed up the build.

2. Move the to_sstring() implementation for the .hh to .cc. This saves
   a lot of time on template instantiations - previously every source
   file instantiated this to_sstring(), which was slow (that "format"
   thing is slow).

3. Do not include <seastar/net/ip.hh> which is a huge file including
   half the world. All we need from it is the type "ipv4_address",
   so instead include just the new <seastar/net/ipv4_address.hh>.
   This change brings most of the performance improvement.
   So source files forgot to include various Seastar header files
   because the includes-everything ip.hh did it - so we need to add
   these missing includes in this patch.

After this patch, ClangBuildAnalyzer's reports that the cost of
inclusion of <gms/inet_address.hh> is down from 2 seconds to 0.326
seconds. Additionally the format<inet_address> template instantiation
291 times - about half a second each - is also gone.

All in all, this patch should reduce around 10 CPU minutes from the build.

Refs #1

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-01-04 21:07:23 +02:00
Benny Halevy
d344765ec6 get rid of the global batchlog_manager
Now that it's unused.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00
Benny Halevy
744275df73 batchlog_manager: get_batch_log_mutation_for: move to storage_proxy
And rename to get_batchlog_mutation_for while at it,
as it's about the batchlog, not batch_log.

This resolves a circular dependency between the
batchlog_manager and the storage_proxy that required
it in the case.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00
Benny Halevy
55967a8597 batchlog_manager: endpoint_filter: move to gossiper
There's nothing in this function that actually requries
the batchlog manager instance.

It uses a random number engine that's moved along with it
to class gossiper.

This resolves a circular dependency between the
batchlog_manager and storage_proxy.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00
Benny Halevy
691afe1c4d batchlog_manager: derive from peering_sharded_service
So that do_batch_log_replay can get the sharded
batchlog_manager as container().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00
Benny Halevy
03039e8f8a main: allow setting the global batchlog_manager
As a prerequisite to globalizing the batchlog_manager,
allow setting a global pointer to it and instantiate
the sharded<db::batchlog_manager> on the main/cql_test_env
stack.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00
Benny Halevy
5165780d81 batchlog_manager: refactor drain out of stop
drain() aborts the replay loop fiber
and returns its future.

It's grabbing _gate so stop() will wait on it.

The intention is to call stop_replay_loop from
storage_service::decommission and do_drain rather
than stop, so we can stop the batchlog manager once,
using a deferred action in main.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-07-20 20:23:06 +03:00
Benny Halevy
deef1b4f59 batchlog_manager: stop: use abort_source to abort batchlog_replay_loop
Harden start/stop by using an abort_source to abort from
the replay loop.

Extract the loop into batchlog_replay_loop() coroutine,
with the _stop abourt source as a stop condition,
plus use it for sleep_abortable to be able to promptly
stop while sleeping.

start() stores batchlog_replay_loop's future in a newly added
_started member, which is waited on in stop() to synchronize
with the start process at any stage.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-07-20 19:32:55 +03:00
Avi Kivity
4d70f3baee storage_proxy: change unordered_set<inet_address> to small_vector in write path
The write paths in storage_proxy pass replica sets as
std::unordered_set<gms::inet_address>. This is a complex type, with
N+1 allocations for N members, so we change it to a small_vector (via
inet_address_vector_replica_set) which requires just one allocation, and
even zero when up to three replicas are used.

This change is more nuanced than the corresponding change to the read path
abe3d7d7 ("Merge 'storage_proxy: use small_vector for vectors of
inet_address' from Avi Kivity"), for two reasons:

 - there is a quadratic algorithm in
   abstract_write_response_handler::response(): it searches for a replica
   and erases it. Since this happens for every replica, it happens N^2/2
   times.
 - replica sets for writes always include all datacenters, while reads
   usually involve just one datacenter.

So, a write to a keyspace that has 5 datacenters will invoke 15*(15-1)/2
=105 compares.

We could remove this by sending the index of the replica in the replica
set to the replica and ask it to include the index in the response, but
I think that this is unnecessary. Those 105 compares need to be only
105/15 = 7 times cheaper than the corresponding unordered_set operation,
which they surely will. Handling a response after a cross-datacenter round
trip surely involves L3 cache misses, and a small_vector reduces these
to a minimum compared to an unordered_set with its bucket table, linked
list walking and managent, and table rehashing.

Tests using perf_simple_query --write --smp 1 --operations-per-shard 1000000
 --task-quota-ms show two allocations removed (as expected) and a nice
reduction in instructions executed.

before: median 204842.54 tps ( 54.2 allocs/op,  13.2 tasks/op,   49890 insns/op)
after:  median 206077.65 tps ( 52.2 allocs/op,  13.2 tasks/op,   49138 insns/op)

Closes #8847
2021-06-17 13:46:40 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Solodovnikov
e0749d6264 treewide: some random header cleanups
Eliminate not used includes and replace some more includes
with forward declarations where appropriate.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-06-06 19:18:49 +03:00
Pavel Emelyanov
b4e66ddf1d batchlog: Use in-config ring-delay
This kills the first (out of two) global reference on storage_service

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-02-10 20:54:32 +03:00
Avi Kivity
89be47e291 batchlog_manager: remove dependency on db::config
Extract configuration into a new struct batchlog_manager_config and have the
callers populate it using db::config. This reduces dependencies on global objects.
2018-12-09 20:11:38 +02:00
Nadav Har'El
25bd139508 cross-tree: clean up use of std::random_device()
std::random_device() uses the relatively slow /dev/urandom, and we rarely if
ever intend to use it directly - we normally want to use it to seed a faster
random_engine (a pseudo-random number generator).

In many places in the code, we first created a random_device variable, and then
using it created a random_engine variable. However, this practice created the
risk of a programmer accidentally using the random_device object, instead of the
random_engine object, because both have the same API; This hurts performance.

This risk materialized in just two places in the code, utils/uuid.cc and
gms/gossiper.cc. A patch for to uuid.cc was sent previously by Pawel and is
not included in this patch, and the fix for gossiper.{cc,hh} is included here.

To avoid risking the same mistake in the future, this patch switches across the
code to an idiom where the random_device object is not *named*, so cannot be
accidentally used. We use the following idiom:

   std::default_random_engine _engine{std::random_device{}()};

Here std::random_device{}() creates the random device (/dev/urandom) and pulls
a random integer from it. It then uses this seed to create the random_engine
(the pseudo-random number generator). The std::random_device{} object is
temporary and unnamed, and cannot be unintentionally used directly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180726154958.4405-1-nyh@scylladb.com>
2018-07-26 16:54:58 +01:00
Avi Kivity
86de6cc7fb Merge seastat upstream
* seastar f14d2a3...7a49ae5 (8):
  > sharded: improve support for cooperating sharded<> services
  > sharded: support for peer services
  > semaphore: add a version of with_semaphore that takes a duration timeout
  > scripts: perftune.py: fix the CPU mask generation for more than 64 CPUs
  > Revert "future-utils: make when_all() (vector variant) exception safe"
  > Revert "future-utils: fix gross compilation errors in when_all()"
  > future-utils: fix gross compilation errors in when_all()
  > future-utils: make when_all() (vector variant) exception safe

Includes change to batchlog_manager constructor to adapt it to
seastar::sharded::start() change.
2017-08-06 17:47:47 +03:00
Vlad Zolotarov
a9f6e5f8da db::batchlog_manager: move collectd registration to the metrics registration layer
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-01-10 16:24:54 -05:00
Vlad Zolotarov
4ef5b11e9b batchlog_manager: add a counter for a total number of write attempts
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-04-21 11:29:21 +03:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Asias He
cdb43c5586 batchlog_manager: Allow user initiated bachlog replay operation
During decommission, the storage_service::unbootstrap() needs to
initiate a batchlog replay operation. To sync the replay operation
initiated by the timer in batchlog_manager and storage_service, a
semaphore is introduced.  To simplify the semaphore locking, the
management code now always runs on shard zero, but the real work is
distruted to all shards.
2016-03-30 20:54:30 +08:00
Calle Wilund
42c086a5cd batchlog_manager: Fixup includes + exception handling
* Fix exception handling in batch loop (report + still re-arm)
* Cleanup seastar include reference style
2015-10-07 17:06:34 +03:00
Calle Wilund
6f94a3bdad batchlog_manager: Use gate instead of semaphore
Since that exists now.
2015-10-07 14:30:09 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Paweł Dziepak
ddec2b4d09 batchlog_manager: pass mutations by const ref
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-03 10:30:29 +02:00
Calle Wilund
9a52ad84b1 BatchlogManager: make blm globally reachable distributed like other objects 2015-08-11 17:10:17 +02:00
Calle Wilund
0ded44eeee BatchlogManager: make endpoint_filter method + implement 2015-08-11 17:10:16 +02:00
Calle Wilund
b7cdd189e7 BatchlogManager: make constructible from distributed<db> (to fit main init) 2015-08-11 09:46:59 +02:00
Calle Wilund
ef2cc9b05d BatchLogManager.java -> C++
Somewhat simplifies version of the Origin code, since from what I 
can see, there is less need for us to do explicit query sends in 
the BLM itself, instead we can just go through storage_proxy. 
I could be wrong though.
2015-07-08 10:59:57 +02:00