scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-20 08:30:35 +00:00

Author	SHA1	Message	Date
Avi Kivity	a55b434a2b	treewide: extent copyright statements to present day	2021-06-06 19:18:49 +03:00
Amnon Heiman	bc854342e7	approx_exponential_histogram: Makes the implementation clearer This patch aim to make the implementation and usage of the approx_exponential_histogram clearer. The approx_exponential_histogram Uses a combination of Min, Max, Precision and number of buckets where the user needs to pick 3. Most of the changes in the patch are about documenting the class and method, but following the review there are two functionality changes: 1. The user would pick: Min, Max and Precision and the number of buckets will be calculated from these values. 2. The template restrictions are now state in a requires so voiolation will be stop at compile time.	2020-06-18 14:18:21 +03:00
Amnon Heiman	3319756f36	utils/estimated_histogram: Adding approx_exponential_histogram This patch adds an efficient histogram implementation. The implementation chooses efficiency over flexibility. That is why templates are used. How the approx_exponential_histogram pseudo floating point histogram works: It split the range [MIN, MAX] into log2(MAX/MIN) ranges it then split each of that ranges linearly according to a given resolution. For example, using resolution of 4, would be similar to using an exponentially growing histogram with a coefficient of 1.2. All values are uint64. To prevent handling of corner cases, it is not allowed to set the MIN to be lower than the resolution. The approx_exponential_histogram will probably not be used directly, the first used is by time_estimated_histogram. A histogram for durations. It should be compared to the estimated_histogram. Performance comparison: Comparison was done by inserting 2^20 values into time_estimated_histogram and estimated_histogram. In debug mode on a local machine insert operation took an average of 26.0 nanoseconds vs 342.2 nanoseconds. In release mode insert operation took an average of 1.90 vs 8.28 nanoseconds Fixes #5815 Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2020-06-15 08:22:43 +03:00
Amnon Heiman	3e5beba403	estimated_histogram: clean if0 and FIXME This patch cleans the estimated histogram implementation. It removes the FIXME that were left in the code from the migration time and the if0 commented out code. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2020-05-27 08:40:05 +03:00
Botond Dénes	e0284bb9ee	treewide: add missing headers and/or forward declarations	2020-03-23 09:29:45 +02:00
Avi Kivity	775b7e41f4	Update seastar submodule * seastar d59fcef...b924495 (2): > build: Fix protobuf generation rules > Merge "Restructure files" from Jesse Includes fixup patch from Jesse: " Update Seastar `#include`s to reflect restructure All Seastar header files are now prefixed with "seastar" and the configure script reflects the new locations of files. Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com> Message-Id: <5d22d964a7735696fb6bb7606ed88f35dde31413.1542731639.git.jhaberku@scylladb.com> "	2018-11-21 00:01:44 +02:00
Avi Kivity	be99101f36	utils: convert sprint() to format() sprint() recently became more strict, throwing on sprint("%s", 5). Replace with the more modern format(). Mechanically converted with https://github.com/avikivity/unsprint.	2018-11-01 13:16:17 +00:00
Avi Kivity	3cf434b863	utils: estimated_histogram: convert generated format strings to fmt Convert printf games to format games. Note that fmt supports specifying the field width as an argument, but that is left to a dedicated change.	2018-11-01 13:16:17 +00:00
Amnon Heiman	3f8d9a87ee	estimated_histogram: update the sum and count when merging When merging histograms the count and the sum should be updated. Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <20171122154822.23855-1-amnon@scylladb.com>	2017-11-22 16:55:55 +01:00
Glauber Costa	6c4e8049a0	estimated_histogram: also fill up sum metric Prometheus histograms have 3 embedded metrics: count, buckets, and sum. Currently we fill up count and buckets but sum is left at 0. This is particularly bad, since according to the prometheus documentation, the best way to calculate histogram averages is to write: rate(metric_sum[5m]) / rate(metric_count[5m]) One way of keeping track of the sum is adding the value we sampled, every time we sample. However, the interface for the estimated histogram has a method that allows to add a metric while allowing to adjust the count for missing metrics (add_nano()) That makes acumulating a sum inaccurate--as we will have no values for the points that were added. To overcome that, when we call add_nano(), we pretend we are introducing new_count - _count metrics, all with the same value. Long term, doing away with sampling may help us provide more accurate results. After this patch, we are able to correctly calculate latency averages through the data exported in prometheus. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20171122144558.7575-1-glauber@scylladb.com>	2017-11-22 16:10:12 +01:00
Glauber Costa	fc4416abcc	estimated_histogram: bring histogram closer to what prometheus expects. Histograms are a native prometheus type, and there are many functions available that operate on them. There is extensive documentation about them at https://prometheus.io/docs/practices/histograms/ One example is the function histogram_quantile(), that can extract useful quantiles from the histograms. Currently, those functions don't work well. The reasons are twofold: 1) We are only exporting 16 metrics, starting from 1usec. That means that the highest latency we can differentiate is 4ms. After that, everything falls into the same bin. 2) The format that prometheus expects is that each bin will contain the total number of points seen up until that bin, while we currently export the total number of points that falls between bins. IOW, it is a cummulative histogram. About point two, granted it is a bit hidden in their website, but it is there. The following phrase about a caveat make it clear: "Note that we divide the sum of both buckets. The reason is that the histogram buckets are cumulative. The le="0.3" bucket is also contained in the le="1.2" bucket; dividing it by 2 corrects for that." It is also not needed to accumulate things that fall over the last bin: the _count component of the histogram will already account for that. This patch changes the histogram format to be more in line with what prometheus expect. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2017-10-04 20:01:13 -04:00
Gleb Natapov	31e803a36c	storage_proxy: wire up percentile speculative read properly Collect coordinator side read statistic per CF and use them in percentile speculative read executor. Getting percentile from estimated_histogram object is rather expensive, so cache it and recalculate only once per second (or if requested percentile changes). Fixes #2757 Message-Id: <20170911131752.27369-3-gleb@scylladb.com>	2017-09-14 10:31:26 +03:00
Gleb Natapov	0842faecef	estimated_histogram: fix overflow error handling Currently overflow values are stored in incorrect bucket (last one instead of special "overflow" one) and percentile() function throws if there is overflow value. The patch fixes the code to store overflow value in corespondent bucket and makes percentile() to take it into account instead of throwing. Message-Id: <20170911131752.27369-2-gleb@scylladb.com>	2017-09-14 10:31:21 +03:00
Avi Kivity	3fe6731436	Merge "educe the effect of the latency metrics" from Amnon "This series reduce that effect in two ways: 1. Remove the latency counters from the system keyspaces 2. Reduce the histogram size by limiting the maximum number of buckets and stop the last bucket." Fixes #2650. * 'amnon/remove_cf_latency_v2' of github.com:cloudius-systems/seastar-dev: database: remove latency from the system table estimated histogram: return a smaller histogram	2017-07-31 15:58:30 +03:00
Tomasz Grabiec	5602be72fa	estimated_histogram: Implement percentile()	2017-07-27 17:19:07 +02:00
Tomasz Grabiec	1bc305ed7b	utils: estimated_histogram: Make printable	2017-07-27 17:19:03 +02:00
Amnon Heiman	1b05f23d12	estimated histogram: return a smaller histogram The current histogram contains 91 buckets, this is a very high resolution with a high upper limit. To reduce traffic passed, between scylla and the prometheus, this patch generate a smaller histogram. It limit the number of buckets (16 by default), set a lower limit to the lowest bucket, and uses 2 as the bucket coeficient. Highest empty buckets will not be reported. Signed-off-by: Amnon Heiman <amnon@scylladb.com> estimated histogram	2017-07-27 11:41:10 +03:00
Tomasz Grabiec	cd4d15672b	utils: estimated_histogram: Fix clear() It was a no-op. It doesn't seem currently used, but I will have a use for it soon. Message-Id: <1495198172-1969-1-git-send-email-tgrabiec@scylladb.com>	2017-05-19 14:34:34 +01:00
Amnon Heiman	1e3cfe7396	estimated_histogram: returns a metrics histogram The metrics histogram is a struct that describe a histogram. This patch adds a getter method that lets the estimated_histogram return a metrics::histogram, this will allow to register it as a histogram metrics. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2017-02-06 17:34:43 +02:00
Glauber Costa	4310635bae	move estimated histogram to utils Nothing sstable-specific in it, really. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-08-31 15:13:23 -04:00

20 Commits