Commit Graph

31 Commits

Author SHA1 Message Date
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Kefu Chai
00810e6a01 treewide: include seastar/core/format.hh instead of seastar/core/print.hh
The later includes the former and in addition to `seastar::format()`,
`print.hh` also provides helpers like `seastar::fprint()` and
`seastar::print()`, which are deprecated and not used by scylladb.

Previously, we include `seastar/core/print.hh` for using
`seastar::format()`. and in seastar 5b04939e, we extracted
`seastar::format()` into `seastar/core/format.hh`. this allows us
to include a much smaller header.

In this change, we just include `seastar/core/format.hh` in place of
`seastar/core/print.hh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21574
2024-11-14 17:45:07 +02:00
Kefu Chai
aebb532906 bytes, utils: include fmt/iostream.h and iostream when appropriate
in seastar e96932b05f394b27cd0101e24f0584736795b50f, we stopped
including unused `fmt/ostream.h`. this helped to reduce the header
dependency. but this also broke the build of scylladb, as we rely
on the `fmt/ostream.h` indirectly included by seastar's header project.

in this change, we include `fmt/iostream.h` and `iostream` explictly
when we are using the declarations in them. this enables us to

- bump up the seastar submodule
- potentially reduce the header dependency as we will be able to
  include seastar/core/format.hh instead of a more bloated
  seastar/core/print.hh after bumping up seastar submodule

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21494
2024-11-08 16:43:25 +03:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Botond Dénes
47ac7d70e4 utils/estimated_histogram: estimated_histogram: add constructor taking buckets
And bucket offsets. Allows constructing the histogram back from a json
format.
2024-03-13 02:06:30 -04:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Kefu Chai
26dcfea84a estimated_histogram: do not use dynamic format_string
fmtlib allows us to specify the field width dynamically, so specify
the field width in the same statement formatting the argument improves
the readability. and use the constexpr fmt string allows us to switch
to compile-time formatter supported by fmtlib v8.

this change also use `fmt::print()` to format the argument right to
the output ostream, instead of creating a temporary sstring, and
copy it to the output ostream.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #14579
2023-07-08 15:10:41 +03:00
Amnon Heiman
99bc6d882b estimated_histogram: add missing getter method
This patch adds the square bracket operator method that was missing.
2022-07-26 15:59:33 +03:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Raphael S. Carvalho
affa1d9b04 utils/estimated_histogram.hh: fix division-by-zero in mean()
if mean() is called when there are no elements in the histogram,
a runtime error will happen due to division-by-zero.
approx_exponential_histogram::mean() handles it but for some
reason we forgot to do the same for estimated_histogram.

this problem was found when adding an unit test which calls
mean() in an empty histogram.

Fixes #9531.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20211027142813.56969-1-raphaelsc@scylladb.com>
2021-10-28 08:58:25 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Amnon Heiman
bc854342e7 approx_exponential_histogram: Makes the implementation clearer
This patch aim to make the implementation and usage of the
approx_exponential_histogram clearer.

The approx_exponential_histogram Uses a combination of Min, Max,
Precision and number of buckets where the user needs to pick 3.

Most of the changes in the patch are about documenting the class and
method, but following the review there are two functionality changes:

1. The user would pick: Min, Max and Precision and the number of buckets
   will be calculated from these values.
2. The template restrictions are now state in a requires so voiolation
   will be stop at compile time.
2020-06-18 14:18:21 +03:00
Amnon Heiman
3319756f36 utils/estimated_histogram: Adding approx_exponential_histogram
This patch adds an efficient histogram implementation.
The implementation chooses efficiency over flexibility.
That is why templates are used.

How the approx_exponential_histogram pseudo floating point histogram
works: It split the range [MIN, MAX] into log2(MAX/MIN) ranges it then
split each of that ranges linearly according to a given resolution.

For example, using resolution of 4, would be similar to using an
exponentially growing histogram with a coefficient of 1.2.

All values are uint64. To prevent handling of corner cases, it is not
allowed to set the MIN to be lower than the resolution.

The approx_exponential_histogram will probably not be used directly,
the first used is by time_estimated_histogram. A histogram for durations.

It should be compared to the estimated_histogram.

Performance comparison:
Comparison was done by inserting 2^20 values into
time_estimated_histogram and estimated_histogram.

In debug mode on a local machine insert operation took an average of
26.0 nanoseconds vs 342.2 nanoseconds.

In release mode insert operation took an average of 1.90 vs 8.28 nanoseconds

Fixes #5815

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2020-06-15 08:22:43 +03:00
Amnon Heiman
3e5beba403 estimated_histogram: clean if0 and FIXME
This patch cleans the estimated histogram implementation.
It removes the FIXME that were left in the code from the migration time
and the if0 commented out code.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2020-05-27 08:40:05 +03:00
Botond Dénes
e0284bb9ee treewide: add missing headers and/or forward declarations 2020-03-23 09:29:45 +02:00
Avi Kivity
775b7e41f4 Update seastar submodule
* seastar d59fcef...b924495 (2):
  > build: Fix protobuf generation rules
  > Merge "Restructure files" from Jesse

Includes fixup patch from Jesse:

"
Update Seastar `#include`s to reflect restructure

All Seastar header files are now prefixed with "seastar" and the
configure script reflects the new locations of files.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <5d22d964a7735696fb6bb7606ed88f35dde31413.1542731639.git.jhaberku@scylladb.com>
"
2018-11-21 00:01:44 +02:00
Avi Kivity
be99101f36 utils: convert sprint() to format()
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().

Mechanically converted with https://github.com/avikivity/unsprint.
2018-11-01 13:16:17 +00:00
Avi Kivity
3cf434b863 utils: estimated_histogram: convert generated format strings to fmt
Convert printf games to format games.

Note that fmt supports specifying the field width as an argument, but that
is left to a dedicated change.
2018-11-01 13:16:17 +00:00
Amnon Heiman
3f8d9a87ee estimated_histogram: update the sum and count when merging
When merging histograms the count and the sum should be updated.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20171122154822.23855-1-amnon@scylladb.com>
2017-11-22 16:55:55 +01:00
Glauber Costa
6c4e8049a0 estimated_histogram: also fill up sum metric
Prometheus histograms have 3 embedded metrics: count, buckets, and sum.
Currently we fill up count and buckets but sum is left at 0. This is
particularly bad, since according to the prometheus documentation, the
best way to calculate histogram averages is to write:

  rate(metric_sum[5m]) / rate(metric_count[5m])

One way of keeping track of the sum is adding the value we sampled,
every time we sample. However, the interface for the estimated histogram
has a method that allows to add a metric while allowing to adjust the
count for missing metrics (add_nano())

That makes acumulating a sum inaccurate--as we will have no values for
the points that were added. To overcome that, when we call add_nano(),
we pretend we are introducing new_count - _count metrics, all with the
same value.

Long term, doing away with sampling may help us provide more accurate
results.

After this patch, we are able to correctly calculate latency averages
through the data exported in prometheus.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20171122144558.7575-1-glauber@scylladb.com>
2017-11-22 16:10:12 +01:00
Glauber Costa
fc4416abcc estimated_histogram: bring histogram closer to what prometheus expects.
Histograms are a native prometheus type, and there are many functions
available that operate on them. There is extensive documentation about
them at https://prometheus.io/docs/practices/histograms/

One example is the function histogram_quantile(), that can extract
useful quantiles from the histograms. Currently, those functions don't
work well.

The reasons are twofold:
1) We are only exporting 16 metrics, starting from 1usec. That means
   that the highest latency we can differentiate is 4ms. After that,
   everything falls into the same bin.

2) The format that prometheus expects is that each bin will contain
   the total number of points seen *up until that bin*, while we
   currently export the total number of points that falls between bins.
   IOW, it is a cummulative histogram.

About point two, granted it is a bit hidden in their website, but it is
there. The following phrase about a caveat make it clear:

 "Note that we divide the sum of both buckets. The reason is that the
  histogram buckets are cumulative. The le="0.3" bucket is also contained
  in the le="1.2" bucket; dividing it by 2 corrects for that."

It is also not needed to accumulate things that fall over the last bin:
the _count component of the histogram will already account for that.

This patch changes the histogram format to be more in line with what
prometheus expect.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2017-10-04 20:01:13 -04:00
Gleb Natapov
31e803a36c storage_proxy: wire up percentile speculative read properly
Collect coordinator side read statistic per CF and use them in percentile
speculative read executor. Getting percentile from estimated_histogram
object is rather expensive, so cache it and recalculate only once per
second (or if requested percentile changes).

Fixes #2757

Message-Id: <20170911131752.27369-3-gleb@scylladb.com>
2017-09-14 10:31:26 +03:00
Gleb Natapov
0842faecef estimated_histogram: fix overflow error handling
Currently overflow values are stored in incorrect bucket (last one
instead of special "overflow" one) and percentile() function throws
if there is overflow value. The patch fixes the code to store overflow
value in corespondent bucket and makes percentile() to take it into
account instead of throwing.

Message-Id: <20170911131752.27369-2-gleb@scylladb.com>
2017-09-14 10:31:21 +03:00
Avi Kivity
3fe6731436 Merge "educe the effect of the latency metrics" from Amnon
"This series reduce that effect in two ways:
1. Remove the latency counters from the system keyspaces
2. Reduce the histogram size by limiting the maximum number of buckets and
   stop the last bucket."

Fixes #2650.

* 'amnon/remove_cf_latency_v2' of github.com:cloudius-systems/seastar-dev:
  database: remove latency from the system table
  estimated histogram: return a smaller histogram
2017-07-31 15:58:30 +03:00
Tomasz Grabiec
5602be72fa estimated_histogram: Implement percentile() 2017-07-27 17:19:07 +02:00
Tomasz Grabiec
1bc305ed7b utils: estimated_histogram: Make printable 2017-07-27 17:19:03 +02:00
Amnon Heiman
1b05f23d12 estimated histogram: return a smaller histogram
The current histogram contains 91 buckets, this is a very high
resolution with a high upper limit.

To reduce traffic passed, between scylla and the prometheus, this patch
generate a smaller histogram.

It limit the number of buckets (16 by default), set a lower limit to the
lowest bucket, and uses 2 as the bucket coeficient.

Highest empty buckets will not be reported.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

estimated histogram
2017-07-27 11:41:10 +03:00
Tomasz Grabiec
cd4d15672b utils: estimated_histogram: Fix clear()
It was a no-op. It doesn't seem currently used, but I will have a use
for it soon.
Message-Id: <1495198172-1969-1-git-send-email-tgrabiec@scylladb.com>
2017-05-19 14:34:34 +01:00
Amnon Heiman
1e3cfe7396 estimated_histogram: returns a metrics histogram
The metrics histogram is a struct that describe a histogram.
This patch adds a getter method that lets the estimated_histogram return
a metrics::histogram, this will allow to register it as a histogram
metrics.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2017-02-06 17:34:43 +02:00
Glauber Costa
4310635bae move estimated histogram to utils
Nothing sstable-specific in it, really.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-08-31 15:13:23 -04:00