27 Commits

Author SHA1 Message Date
Marcin Maliszkiewicz
6152223890 test: perf: extract result aggregation logic to a separate struct
It will be reused later by a new tool.
2024-05-09 13:58:29 +02:00
Benny Halevy
e5ca65f78b test/perf: report also log_allocations/op
Currently perf-simple-query --write ignores
log allocations that happen on the memtable
apply path.

This change adds tracking and accounting
of the number of log allocations,
and reporting thereof.

For reference, here's the output of
build/release/scylla perf-simple-query --write --default-log-level=error --random-seed=1 -c 1
```
random-seed=1
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=write, frontend=cql, query_single_key=no, counters=no}
Disabling auto compaction
78073.55 tps ( 59.4 allocs/op,  16.3 logallocs/op,  14.3 tasks/op,   52991 insns/op,        0 errors)
77263.59 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   53282 insns/op,        0 errors)
79913.07 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   53295 insns/op,        0 errors)
79554.32 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   53284 insns/op,        0 errors)
79151.53 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   53289 insns/op,        0 errors)

median 79151.53 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   53289 insns/op,        0 errors)
median absolute deviation: 761.54
maximum: 79913.07
minimum: 77263.59
```
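
The summary lines above (median, median absolute deviation, maximum, minimum) can be reproduced from the five tps samples with a few lines of standard C++. This is only a sketch: `summary`, `median_of` and `summarize` are hypothetical names, not taken from the perf tool, and averaging the two middle values for an even-sized sample is an assumption that may differ from what the tool does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical summary type; field names are illustrative only.
struct summary {
    double median;
    double mad;  // median absolute deviation
    double max;
    double min;
};

// Median of a non-empty sample. Takes a copy because it sorts.
// Assumption: an even-sized sample averages its two middle values.
double median_of(std::vector<double> v) {
    assert(!v.empty());
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();
    return n % 2 ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;
}

summary summarize(const std::vector<double>& tps) {
    const double med = median_of(tps);
    // Absolute deviation of every sample from the median.
    std::vector<double> dev;
    dev.reserve(tps.size());
    for (double x : tps) {
        dev.push_back(std::fabs(x - med));
    }
    const auto mm = std::minmax_element(tps.begin(), tps.end());
    return summary{med, median_of(dev), *mm.second, *mm.first};
}
```

Feeding it the five tps values from the run above yields a median of 79151.53 and a median absolute deviation of 761.54, matching the reported figures.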

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-05-02 18:42:41 +03:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in {fmt} before v10, there is a specialization of `fmt::formatter<..>`
for `std::string_view` as well as a specialization of `fmt::formatter<..>`
for `fmt::string_view`, which is an implementation built into {fmt} for
compatibility with pre-C++17 compilers. this type is used even if the code is
compiled with a C++ standard greater than or equal to C++17. also, before v10,
`fmt::formatter<std::string_view>::format()` is defined so that it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using the `format_as()` machinery, so its
`format()` method does not actually accept `std::string_view`; it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have the following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::formatter<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the documentation at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherits
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Kefu Chai
fe28aac440 test/perf: add fmt::formatter for perf_result_with_aio_writes
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for `perf_result_with_aio_writes`,
and drop its operator<<.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17849
2024-03-18 12:53:39 +02:00
Kefu Chai
2ccd9e695d test/perf: add fmt::formatters for scheduling_latency_measurer and perf_result
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for

* scheduling_latency_measurer
* perf_result

and drop their operator<< overloads.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-02-23 10:17:50 +08:00
Avi Kivity
7cb1c10fed treewide: replace seastar::future::get0() with seastar::future::get()
get0() dates back from the days where Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.
2024-02-02 22:12:57 +08:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Piotr Dulikowski
b9250a43e3 test: perf: count errors and report the count in results
Now, exceptions encountered during the test are counted as errors, and
the error count is reported at the end of the test.
2022-06-27 22:14:29 +02:00
Piotr Dulikowski
21612f97b0 test: perf: add stop-on-error argument
Adds the "--stop-on-error" argument to perf_simple_query. When enabled
(and it is enabled by default), the benchmark will propagate exceptions
if any occur in the tested function. Otherwise, errors will be ignored.
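
A minimal sketch of the two behaviours, in plain synchronous C++ rather than Seastar futures. `run_result` and `run_ops` are hypothetical names, and counting a failed operation towards `ops` is an assumption of this sketch, not something the commit states.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>

// Hypothetical result type with only the fields needed here.
struct run_result {
    std::uint64_t ops = 0;
    std::uint64_t errors = 0;
};

// Runs `op` `iterations` times. With stop_on_error set, the first
// exception propagates to the caller; otherwise it is counted as an
// error and the loop continues.
run_result run_ops(const std::function<void(std::uint64_t)>& op,
                   std::uint64_t iterations, bool stop_on_error) {
    assert(op);  // an empty std::function would throw bad_function_call
    run_result r;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        try {
            op(i);
        } catch (...) {
            if (stop_on_error) {
                throw;  // propagate to the caller unchanged
            }
            ++r.errors;
        }
        ++r.ops;  // assumption: failed operations still count as ops
    }
    return r;
}
```

With an operation that throws on every fourth call and `stop_on_error` set to false, ten iterations report three errors; with `stop_on_error` set to true, the first exception reaches the caller.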
2022-06-27 22:14:29 +02:00
Piotr Dulikowski
d3bc946859 test: perf: coroutinize run_worker()
Converts the executor::run_worker() method to a coroutine. This will
allow extending the function in further commits without having to
allocate continuations.
2022-06-27 22:14:29 +02:00
Piotr Dulikowski
33b22e78be test: perf: fix crash on exception in time_parallel_ex
The `time_parallel_ex` function creates a sharded<executor> and uses it
to run the benchmark on multiple shards in parallel. However, if the
benchmarking function throws an exception, the sharded<executor> will be
destroyed without being stopped, which triggers an assertion in
sharded<T> destructor.

This commit makes sure that the executor is stopped before being
destroyed by putting `exec.stop()` into a `seastar::defer`.
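
The pattern can be illustrated without Seastar. In the sketch below, `scope_exit` is a hand-written stand-in for a deferred action and `fake_service` is a hypothetical type whose destructor asserts that `stop()` was called, loosely mirroring the assertion described above. Neither is Seastar code.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Minimal stand-in for a deferred action: runs `fn` when it leaves scope,
// whether the scope exits normally or through an exception.
class scope_exit {
    std::function<void()> _fn;
public:
    explicit scope_exit(std::function<void()> fn) : _fn(std::move(fn)) {}
    scope_exit(const scope_exit&) = delete;
    scope_exit& operator=(const scope_exit&) = delete;
    ~scope_exit() { _fn(); }
};

// Hypothetical service that must be stopped before it is destroyed.
struct fake_service {
    bool stopped = false;
    void stop() { stopped = true; }
    ~fake_service() { assert(stopped && "destroyed without stop()"); }
};

// Returns true if the benchmark body threw. Locals are destroyed in
// reverse order, so the guard stops the service before ~fake_service runs.
bool run_benchmark(bool should_throw) {
    try {
        fake_service svc;
        scope_exit stop_guard([&svc] { svc.stop(); });
        if (should_throw) {
            throw std::runtime_error("benchmark failed");
        }
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

Removing the `stop_guard` line makes the destructor assertion fire on both the normal and the exceptional path, which is the failure mode this commit addresses.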
2022-06-27 22:14:29 +02:00
Nadav Har'El
043b1c7f89 Update seastar submodule. Unfortunately, also requires two changes
to Scylla itself to make it still compile - see below

* seastar 5e863627...96bb3a1b (18):
  > install-dependencies: add rocky as a supported distro
  > circleci: relax docker limits to allow running with new toolchain
  > core: memory: Add memory::free_memory() also in Debug mode
  > build: bump up zlib to 1.2.12
  > cmake: add FindValgrind.cmake
  > Merge 'seastar-addr2line: support sct syslogs' from Benny Halevy
  > rpc: lower log level for 'failed to connect' errors
  > scripts: Build validation
  > perftune.py: remove rx_queue_count from mode condition.
  > memory: add attributes to memalign for compatibility with glibc 2.35
  > condition-variable: Fix timeout "when" potentially not killing timer
  > Merge "tests: perf: measure coroutines performance" from Benny
  > Merge: Refine COUNTER metrics
  > Revert "Merge: Refine COUNTER metrics"
  > reactor: document intentional bitwise-on-bool op in smp_pollfn::poll()
  > Merge: Refine COUNTER metrics
  > SLES: additionally check irqbalance.service under /usr/lib
  > rpc_tester: job_cpu: mark virtual methods override

Changes to Scylla also included in this merge:

1. api: Don't export DERIVEs (Pavel Emelyanov)

Newer seastar doesn't have DERIVE metrics, but does have a REAL_COUNTER
one. Teach the collectd getter about the change.

(for the record: I don't understand how this endpoint works at all;
there's a HISTOGRAM metric out there that would be exposed through
the v.ui() call, which is totally wrong)

2. test: use linux_perf_events.{cc,hh} from Seastar

Seastar now has linux_perf_events.{cc,hh}. Remove Scylla's version
of the same files and use Seastar's. Without this change, Scylla
fails to compile when some source files end up including both
versions and seeing double definitions.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-05-11 14:46:30 +02:00
Calle Wilund
5b60a6cf7c perf: Add aio_writes mixin for perf_results
Can be used with time_parallel_ex. Adds measurements for aio writes/aio written bytes.
2022-04-05 13:42:36 +00:00
Calle Wilund
12ab34a3d9 test/perf/perf.hh: Make templated version of test routine to allow extended stats
Adds sub-template for time_parallel with templated result type + optional
per-iteration post-process func. The idea is that Res may be a subtype of
perf_result with additional stats, initialized at construction, and the
post-process function can fix up and apply stats, so we can add stats to the result.
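
A rough sketch of that shape, using hypothetical names (`basic_result`, `result_with_bytes`, `run_iterations`) in place of perf_result and time_parallel_ex, and with no timing or sharding:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical base result; the real perf_result carries more fields.
struct basic_result {
    std::uint64_t ops = 0;
};

// Hypothetical extended result that adds one extra statistic.
struct result_with_bytes : basic_result {
    std::uint64_t bytes_written = 0;
};

// Runs `func` ops_per_iteration times per iteration, fills in the base
// fields of a fresh Res, then lets `post` fix up the extended fields.
template <typename Res, typename Func, typename PostProcess>
std::vector<Res> run_iterations(Func func, unsigned iterations,
                                unsigned ops_per_iteration, PostProcess post) {
    assert(ops_per_iteration > 0);
    std::vector<Res> results;
    results.reserve(iterations);
    for (unsigned it = 0; it < iterations; ++it) {
        Res r;
        for (unsigned i = 0; i < ops_per_iteration; ++i) {
            func();
        }
        r.ops = ops_per_iteration;
        post(r);  // per-iteration hook: apply the additional stats
        results.push_back(r);
    }
    return results;
}
```

A caller would pass the extended type explicitly, for example `run_iterations<result_with_bytes>(op, 3, 5, fixup)`, where `fixup` copies an externally tracked byte counter into `bytes_written` and resets it.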
2022-04-05 13:30:42 +00:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
4118f2d8be treewide: replace deprecated seastar::later() with seastar::yield()
seastar::later() was recently deprecated and replaced with two
alternatives: a cheap seastar::yield() and an expensive (but more
powerful) seastar::check_for_io_immediately(), that corresponds to
the original later().

This patch replaces all later() calls with the weaker yield(). In
all cases except one, it's unambiguously correct. In one case
(test/perf scheduling_latency_measurer::stop()) it's less clear-cut,
since check_for_io_immediately() will additionally force a poll and
so will cause more work to be done (but no additional tasks to be
executed). However, I think that any measurement that relies on
measuring the work on the last tick is so inaccurate (you need
thousands of ticks to get any amount of confidence in the
measurement) that in the end it doesn't matter what we pick.

Tests: unit (dev)

Closes #9904
2022-01-12 12:19:19 +01:00
Botond Dénes
2454811dd6 test/perf: perf.hh: add reader_concurrency_semaphore_wrapper
A convenience, self-closing wrapper for those perf tests that have no
way to stop the semaphore and wait for it too.
2021-07-08 16:53:38 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day
2021-06-06 19:18:49 +03:00
Benny Halevy
f4cfa530cc perf: enable instructions_retired_counter only once per executor::run
Enabling it for each run_worker call will invoke ioctl
PERF_EVENT_IOC_ENABLE in parallel to other workers running
and this may skew the results.

Test: perf_simple_query
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210514130542.301168-1-bhalevy@scylladb.com>
2021-05-16 12:13:27 +03:00
Avi Kivity
2b252ef9b7 test: perf: tidy up executor_stats snapshot computation
Now that executor_stats_snapshot() is a member function, we can move
the capture of _count into invocations into it, capturing all the
stats in one place.
2021-04-28 19:02:35 +03:00
Avi Kivity
863b49af03 test: perf: report instructions retired per operations
Instructions retired per op is much more stable than time per op
(inverse throughput) since it isn't much affected by changes in
CPU frequency or other load on the test system (it's still somewhat
affected since a slower system will run more reactor polls per op).
It's also less indicative of real performance, since it's possible for
fewer instructions to execute in more time than more instructions,
but that isn't an issue for comparative tests.

This allows incremental changes to the code base to be compared with
more confidence.
2021-04-28 18:46:55 +03:00
Avi Kivity
498e6b9a64 test: perf: make executor_stats_snapshot() a member function of executor
I'd like to add an instructions counter which isn't accessible via
a global, so make the snapshot function a member. Out of respect to #1,
define functions for getting the number of allocations and tasks processed,
as they need heavy header files.
2021-04-28 18:38:35 +03:00
Avi Kivity
202c631dee test: perf: perf_simple_query: collect allocation and task statistics
Calculate and display the number of memory allocations and tasks
executed per operation. Sample results (--smp 1):

180022.46 tps (90 allocs/op, 20 tasks/op)
178963.44 tps (90 allocs/op, 20 tasks/op)
178702.41 tps (90 allocs/op, 20 tasks/op)
177679.74 tps (90 allocs/op, 20 tasks/op)
179539.36 tps (90 allocs/op, 20 tasks/op)

median 178963.44 tps (90 allocs/op, 20 tasks/op)
median absolute deviation: 575.92
maximum: 180022.46
minimum: 177679.74

This allows less noisy tracking of how some changes impact performance.
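
Per-operation figures like these are typically obtained by taking a snapshot of monotonically increasing counters before and after a run and dividing the deltas by the number of operations. The sketch below assumes that approach; `stats_snapshot`, `per_op_stats` and `per_op` are hypothetical names, and the counter values in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of monotonically increasing counters.
struct stats_snapshot {
    std::uint64_t invocations;
    std::uint64_t allocations;
    std::uint64_t tasks_executed;
};

struct per_op_stats {
    double allocs_per_op;
    double tasks_per_op;
};

// Divides the counter deltas between two snapshots by the number of
// operations performed in between. Returns zeros if no operation ran.
per_op_stats per_op(const stats_snapshot& before, const stats_snapshot& after) {
    assert(after.invocations >= before.invocations);
    const std::uint64_t ops = after.invocations - before.invocations;
    per_op_stats out{0.0, 0.0};
    if (ops == 0) {
        return out;
    }
    out.allocs_per_op =
        static_cast<double>(after.allocations - before.allocations) / ops;
    out.tasks_per_op =
        static_cast<double>(after.tasks_executed - before.tasks_executed) / ops;
    return out;
}
```

For instance, 2000 operations that account for 180000 allocations and 40000 tasks come out as 90 allocs/op and 20 tasks/op.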
2021-04-07 17:54:48 +03:00
Avi Kivity
3a90df39c5 perf: deinline some functions in perf.hh
Those functions were defined in a header, but not marked inline.
This made including the header from two source files impossible,
as the linker would complain about duplicate symbols.

Rather than making them inline, put them in a new source file
perf.cc as they don't need to be inline.
2021-04-07 17:51:58 +03:00
Pavel Emelyanov
9d38846ed2 test: Move perf measurement helpers into header
To use the code in new perf tests in next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 12:58:26 +03:00
Rafael Ávila de Espíndola
c6897dcbea perf_simple_query: Simplify with seastar::thread
There is no reason not to use a seastar::thread in setup code.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200225055745.321086-1-espindola@scylladb.com>
2020-02-26 18:22:04 +02:00
Konstantin Osipov
8047d24c48 tests: move .hh files and resources to new locations
The plan is to move the unstructured content of tests/ directory
into the following directories of test/:

test/lib - shared header and source files for unit tests
test/boost - boost unit tests
test/unit - non-boost unit tests
test/manual - tests intended to be run manually
test/resource - binary test resources and configuration files

In order to not break git bisect and preserve the file history,
first move most of the header files and resources.
Update paths to these files in .cc files, which are not moved.
2019-12-16 17:47:42 +03:00