Commit Graph

397 Commits

Author SHA1 Message Date
Avi Kivity
78eebe74c7 loading_cache: adjust code for older compilers
Need this-> qualifier in a generic lambda, and static_assert()s need
a message.
2018-04-23 16:12:25 +03:00
Vlad Zolotarov
e31331bdb2 utils + cql3: use a functor class instead of std::function
Define value_extractor_fn as a functor class instead of std::function.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-02-02 15:28:45 -05:00
Vlad Zolotarov
ad68d3ecfd utils::loading_cache: make the size limitation more strict
Ensure that the size of the cache is never bigger than the "max_size".

Before this patch the size of the cache could have been indefinitely bigger than
the requested value during the refresh time period which is clearly an undesirable
behaviour.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-02-02 12:36:56 -05:00
Vlad Zolotarov
707ac9242e utils::loading_cache: added static_asserts for checking the callbacks signatures
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-02-02 12:36:53 -05:00
Vlad Zolotarov
dae0563ff8 utils::loading_cache: add a bunch of standard synchronous methods
Add a few standard synchronous methods to the cache, e.g. find(), remove_if(), etc.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-02-02 12:36:43 -05:00
Vlad Zolotarov
7ba50b87f1 utils::loading_cache: add the ability to create a cache that would not reload the values
Sometimes we don't want the cached values to be periodically reloaded.
This patch adds the ability to control this using a ReloadEnabled template parameter.

In case the reloading is not needed the "loading" function is not given to the constructor
but rather to the get_ptr(key, loader) method (currently it's the only method that is used, we may add
the corresponding get(key, loader) method in the future when needed).

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-02-02 12:35:25 -05:00
Vlad Zolotarov
1da277c78e utils::loading_cache: add the ability to work with not-copy-constructable values
Current get(...) interface restricts the cache to work only with copy-constructable
values (it returns future<Tp>).
To make it able to work with non-copyable value we need to introduce an interface that would
return something like a reference to the cached value (like regular containers do).

We can't return future<Tp&> since the caller would have to ensure somehow that the underlying
value is still alive. The much more safe and easy-to-use way would be to return a shared_ptr-like
pointer to that value.

"Luckily" to us we value we actually store in a cache is already wrapped into the lw_shared_ptr
and we may simply return an object that impersonates itself as a smart_pointer<Tp> value while
it keeps a "reference" to an object stored in the cache.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-02-02 12:35:21 -05:00
Vlad Zolotarov
36dfd4b990 utils::loading_cache: add EntrySize template parameter
Allow a variable entry size parameter.
Provide an EntrySize functor that would return a size for a
specific entry.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-02-02 12:35:19 -05:00
Vlad Zolotarov
c5ce2765dc utils::loading_cache: rework on top of utils::loading_shared_values
Get rid of the "proprietary" solution for asynchronous values on-demand loading.
Use utils::loading_shared_values instead.

We would still need to maintain intrusive set and list for efficient shrink and invalidate
operations but their entry is not going to contain the actual key and value anymore
but rather a loading_shared_values::entry_ptr which is essentially a shared pointer to a key-value
pair value.

In general, we added another level of dereferencing in order to get the key value but since
we use the bi::store_hash<true> in the hook and the bi::compare_hash<true> in the bi::unordered_set
this should not translate into an additional set lookup latency.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-02-02 12:35:15 -05:00
Vlad Zolotarov
dbc6d9fe01 utils::loading_cache: arm the timer with a period equal to min(_expire, _update)
Arm the timer with a period that is not greater than either the permissions_validity_in_ms
or the permissions_update_interval_in_ms in order to ensure that we are not stuck with
the values older than permissions_validity_in_ms.

Fixes #2590

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-02-02 12:17:41 -05:00
Avi Kivity
a603111a85 Merge "Fix memory leak on zone reclaim" from Tomek
"_free_segments_in_zones is not adjusted by
segment_pool::reclaim_segments() for empty zones on reclaim under some
conditions. For instance when some zone becomes empty due to regular
free() and then reclaiming is called from the std allocator, and it is
satisfied from a zone after the one which is empty. This would result
in free memory in such zone to appear as being leaked due to corrupted
free segment count, which may cause a later reclaim to fail. This
could result in bad_allocs.

The fix is to always collect such zones.

Fixes #3129
Refs #3119
Refs #3120"

* 'tgrabiec/fix-free_segments_in_zones-leak' of github.com:scylladb/seastar-dev:
  tests: lsa: Test _free_segments_in_zones is kept correct on reclaim
  lsa: Expose max_zone_segments for tests
  lsa: Expose tracker::non_lsa_used_space()
  lsa: Fix memory leak on zone reclaim

(cherry picked from commit 4ad212dc01)
2018-01-16 15:54:54 +02:00
Paweł Dziepak
b3bc82bc77 Merge "Fix exception safety related to range tombstones in cache" from Tomasz
Fixes #2938.

* 'tgrabiec/fix-range-tombstone-list-exception-safety-v1' of github.com:scylladb/seastar-dev:
  tests: range_tombstone_list: Add test for exception safety of apply()
  tests: Introduce range_tombstone_list assertions
  cache: Make range tombstone merging exception-safe
  range_tombstone_list: Introduce apply_monotonically()
  range_tombstone_list: Make reverter::erase() exception-safe
  range_tombstone_list: Fix memory leaks in case of bad_alloc
  mutation_partition: Fix abort in case range tombstone copying fails
  managed_bytes: Declare copy constructor as allocation point
  Integrate with allocation failure injection framework

(cherry picked from commit 5a4b46f555)
2017-11-10 13:10:26 +01:00
Duarte Nunes
b12a2e6b08 Merge 'Solves problems related to gossip which can be observed in a large cluster' from Tomasz
"The main problem fixed is slow processing of application state changes.
This may lead to a bootstrapping node not having up to date view on the
ring, and serve incorrect data.

Fixes #2855."

* tag 'tgrabiec/gossip-performance-v3' of github.com:scylladb/seastar-dev:
  gms/gossiper: Remove periodic replication of endpoint state map
  gossiper: Check for features in the change listener
  gms/gossiper: Replicate changes incrementally to other shards
  gms/gossiper: Document validity of endpoint_state properties
  storage_service: Update token_metadata after changing endpoint_state
  gms/gossiper: Process endpoints in parallel
  gms/gossiper: Serialize state changes and notifications for given node
  utils/loading_shared_values: Allow Loader to return non-future result
  gms/gossiper: Encapsulate lookup of endpoint_state
  storage_service: Batch token metadata and endpoint state replication
  utils/serialized_action: Introduce trigger_later()
  gossiper: Add and improve logging
  gms/gossiper: Don't fire change listeners when there is no change
  gms/gossiper: Allow parallel apply_state_locally()
  gms/gossiper: Avoid copies in endpoint_state::add_application_state()
  gms/failure_detector: Ignore short update intervals

(cherry picked from commit 044b8deae4)
2017-11-07 19:21:30 +08:00
Vlad Zolotarov
eb34937ff6 utils: introduce loading_shared_values
This class implements an key-value container that is populated
using the provided asynchronous callback.

The value is loaded when there are active references to the value for the given key.

Container ensures that only one entry is loaded per key at any given time.

The returned value is a lw_shared_ptr to the actual value.

The value for a specific key is immediately evicted when there are no
more references to it.

The container is based on the boost::intrusive::unordered_set and is rehashed (grown) if needed
every time a new value is added (asynchronously loaded).

The container has a rehash() method that would grow or shrink the container as needed
in order to get the load factor into the [0.25, 0.75] range.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
(cherry picked from commit ec3fed5c4d)
2017-11-07 19:21:29 +08:00
Tomasz Grabiec
f06bb656a4 utils: Introduce serialized_action
(cherry picked from commit 6a3703944b)

Conflicts:
	configure.py
2017-11-07 19:21:29 +08:00
Vlad Zolotarov
4bb6ba6d58 utils::loading_cache: cancel the timer after closing the gate
The timer is armed inside the section guarded by the _timer_reads_gate
therefore it has to be canceled after the gate is closed.

Otherwise we may end up with the armed timer after stop() method has
returned a ready future.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1501603059-32515-1-git-send-email-vladz@scylladb.com>
(cherry picked from commit 4b28ea216d)
2017-08-01 17:23:53 +01:00
Avi Kivity
4710ee229d Merge "educe the effect of the latency metrics" from Amnon
"This series reduce that effect in two ways:
1. Remove the latency counters from the system keyspaces
2. Reduce the histogram size by limiting the maximum number of buckets and
   stop the last bucket."

Fixes #2650.

* 'amnon/remove_cf_latency_v2' of github.com:cloudius-systems/seastar-dev:
  database: remove latency from the system table
  estimated histogram: return a smaller histogram

(cherry picked from commit 3fe6731436)
2017-07-31 16:01:05 +03:00
Vlad Zolotarov
93cb78f21d utils::loading_cache: add stop() method
loading_cache invokes a timer that may issue asynchronous operations
(queries) that would end with writing into the internal fields.

We have to ensure that these operations are over before we can destroy
the loading_cache object.

Fixes #2624

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1501208345-3687-1-git-send-email-vladz@scylladb.com>
2017-07-31 15:55:46 +03:00
Tomasz Grabiec
d76e9e4026 lsa: Fix performance regression in eviction and compact_on_idle
Region comparator, used by the two, calls region_impl::min_occupancy(),
which calls log_histogram::largest(). The latter is O(N) in terms of
the number of segments, and is supposed to be used only in tests.
We should call one_of_largest() instead, which is O(1).

This caused compact_on_idle() to take more CPU as the number of
segments grew (even when there was nothing to compact). Eviction
would see the same kind of slow down as well.

Introduced in 11b5076b3c.

Message-Id: <1498641973-20054-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 3489c68a68)
2017-06-28 12:33:11 +03:00
Tomasz Grabiec
4b4aef789e utils: Add helpers for dealing with nonwrapping_range<int> 2017-06-24 18:06:11 +02:00
Vlad Zolotarov
1ae40ee91a utils::timestamped_val: fix the touch() comment
The current comment has been written when the function has not been a
timestamped_val member.

Let's adjust it to the current code.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1495555659-10881-1-git-send-email-vladz@scylladb.com>
2017-05-26 19:26:56 +03:00
Vlad Zolotarov
0619c2cb71 utils::serialization: remove not used deserialization_xxx() functions
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1495556124-16672-1-git-send-email-vladz@scylladb.com>
2017-05-26 19:26:20 +03:00
Paweł Dziepak
3b9c0a6ae2 Merge "loading_cache: fix the known complexity issue in the shrink() method" from Vlad
Use the boost::intrusive containers in order to achieve a O(1) complexity
for both "LRU list" update and to minimize the memory overhead in the hash
table item to "LRU list" item connection:
   - Make the timestamped_val be both a bi::list and a bi::unordered_set
     item.
   - Make a bi::unordered_set be a cache backend instead of the
     std::unordered_map.

As a result dropping k LRU items becomes an O(k) operation instead of
O(N log N), where N is a total number of all cached items:
   - Every time a value is read - move it to the front of the "LRU list"
     (O(1)).
   - When we need to remove k LRU items:
      - Repeat k times:
         - Take an element from the back of the "LRU list". (O(1)).
         - Remove it from the bi::unordered_set and dispose. (O(1)).
           We use an auto-unlink configuration for bi::list, therefore
           disposing an item is going to auto unlink it from the list.

* 'permissions_cache_move_to_intrusive-v1' of github.com:scylladb/seastar-dev:
  utils::loading_cache: cleanup
  utils/loading_cache.hh: use intrusive list to store the lru entry
  utils::loading_cache: implement automatic rehashing
  utils::loading_cache: make the underlying map to be an intrusive unordered_set
2017-05-23 16:18:16 +01:00
Avi Kivity
fd0e1eb1e2 Merge "Fixes for mutation algebra" from Tomasz
"Enforces commutativity of addition:

 m1 + m2 == m2 + m1

and consistency of difference and addition with equality:

 m1 + (m2 - m1) == m1 + m2"

* tag 'tgrabiec/fix-range-tombstone-commutativity-v2' of github.com:cloudius-systems/seastar-dev:
  mutation: Make compare_*_for_merge() consistent with equals()
  tests: mutation: Improve assertion failure message
  tests: Use default equality in test_mutation_diff_with_random_generator
  mutation: Make counter cell difference consistent with apply
  tests: range_tombstone_list_test: Improve error message
  tests: range_tombstone_list: Check adjacent range merging
  range_tombstone_list: Merge adjacent range tombstones in apply()
  tests: mutation: Check commutativity of mutation addition
  range_tombstone_list: Avoid violating set invariant
  range_tombstone_list: Make tombstone merging commutative
  range_tombstone_list: Add erase() operation to the reverter
  range_tombstone_list: Make all undo operations ordered relative to each other
  utils: Extract to_boost_visitor() to a separate header
  allocating_strategy: Introduce alloc_strategy_unique_ptr<>
2017-05-23 15:20:38 +03:00
Vlad Zolotarov
2d4d198fb9 utils::loading_cache: cleanup
- Remove "_" at the beginning of the type names.
   - s/Pred/EqualPred/

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-22 23:02:18 -04:00
Vlad Zolotarov
fd59a548c0 utils/loading_cache.hh: use intrusive list to store the lru entry
Fix the shrink() O(n log n) complexity issue by constantly pushing the corresponding intrusive
list entry to the head of the list every time the values are read.

This will keep the list ordered by the last read time from the most recently read
to the least recently read entry.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-22 23:00:18 -04:00
Vlad Zolotarov
0c4e9efce7 utils::loading_cache: implement automatic rehashing
- Start the cache with 256 buckets - the minimum number of buckets.
   - Limit the maximal number of buckets by 1M buckets.
   - Keep the load factor between 0.25 and 1.0 as long as the number of buckets is
     between the minimum and the maximum values mentioned above.
   - Grow and shrink the hash every "refresh" period if needed.
   - Enable bi::power_2_buckets and bi::compare_hash bi::unordered_set options.
   - Enable bi::unordered_set_base_hook's bi::store_hash option.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-22 22:57:44 -04:00
Vlad Zolotarov
2be3596a4f utils::loading_cache: make the underlying map to be an intrusive unordered_set
Make the underlying map to be a boost::intrusive::unordered_set<timestamped_val>
instead of std::unordered_set<Key, timestamped_val>.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-22 18:45:13 -04:00
Tomasz Grabiec
5aeb9eb70c utils: Extract to_boost_visitor() to a separate header 2017-05-22 19:30:02 +02:00
Tomasz Grabiec
69e2eccf68 allocating_strategy: Introduce alloc_strategy_unique_ptr<> 2017-05-22 19:30:02 +02:00
Avi Kivity
ebaeefa02b Merge seatar upstream (seastar namespace)
- introcduced "seastarx.hh" header, which does a "using namespace seastar";
 - 'net' namespace conflicts with seastar::net, renamed to 'netw'.
 - 'transport' namespace conflicts with seastar::transport, renamed to
   cql_transport.
 - "logger" global variables now conflict with logger global type, renamed
   to xlogger.
 - other minor changes
2017-05-21 12:26:15 +03:00
Tomasz Grabiec
cd4d15672b utils: estimated_histogram: Fix clear()
It was a no-op. It doesn't seem currently used, but I will have a use
for it soon.
Message-Id: <1495198172-1969-1-git-send-email-tgrabiec@scylladb.com>
2017-05-19 14:34:34 +01:00
Vlad Zolotarov
6a63c87a9f utils::loading_cache: avoid the reads storm when the key is not in the cache
Use a mutex to serialize producers when the key is not present in the cache.

Fixes #2262

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-18 07:55:48 -04:00
Vlad Zolotarov
1ef22f84c1 utils::loading_cache: cleanup
- Fix a callback signature: receive a const ref.
   - White spaces.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-17 15:03:14 -04:00
Vlad Zolotarov
87ce0b2d47 utils::loading_cache: align the constrains in the constructor with the parameters description
According to description of permissions_validity_in_ms the permissions_cache is enabled if this
value is set to a non-zero value. Otherwise the permissions_cache is disabled.

According to the permissions_update_interval_in_ms description it must have a non-zero value if permissions_cache
is enabled.

permissions_cache_max_entries description doesn't explicitly state it but it makes no sense to allow it to be zero
if permissions_cache is enabled.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-17 15:03:14 -04:00
Vlad Zolotarov
e286828472 utils::loading_cache: refresh in the background
This patch changes the way a loading_cache works.

Before this patch:
   1) If a permissions key is not in the cache it's loaded in the foreground and the original
      query is blocked till the permissions are loaded.
   2) Every _period the timer does the following:
      1) If a value was loaded more than _expiry time ago it is removed from the cache.
      2) If the cache is too big - the less recently loaded values are removed till the cache
         fits the requested size.

After this patch:
   1) If a permissions key is not in the cache it's loaded in the foreground and the original
      query is blocked till the permissions are loaded.
   2) Every _period the timer does the following:
      1) If a value in the cache was loaded or read for the last time more than _expiry time ago - it's removed from the cache.
      2) If the cache is too big - the less recently read values are removed till the cache fits the requested size.
      3) The values that were loaded more than _refresh time ago are re-read in the background.

The new implementation allows to minimize the amount of the foreground reads for a frequently used value to a single
event (when the value is loaded for the first time).

It also ensures we do not reload values we no longer need.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-17 15:03:06 -04:00
Avi Kivity
1c6cecd9d0 utils: introduce div_ceil()
Divides integrals but rounds up rather than down.
2017-05-17 12:30:03 +03:00
Vlad Zolotarov
494ea82a88 utils::UUID: align the UUID serialization API with the similar API of other classes in the project
The standard serialization API (e.g. in data_value) includes the following methods:

size_t serialized_size() const;
void serialize(bytes::iterator& it) const;
bytes serialize() const;

Align the utils::UUID API with the pattern above.

The only addition is that we are going to make an output iterator parameter of a second method above
a template so that we may serialize into different output sources.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-16 15:56:03 -04:00
Vlad Zolotarov
7706775a63 utils: serialization: unify the variety of serialize_XXX(...)
Use the same templated implementation for all different serialize_XXX(...).

The chosen implementation is based on the std::copy_n(char*, size, OutputIterator),
which is heavily optimized and will be using memcpy/memmove where possible.

This patch also removes the not needed specializations that accept signed integer
values since we were casting them to unsigned value anyway.

The std::ostream based specifications are also removed since they are not used
anywhere except for a test-serialization.cc and adjusting the ostream to the iterator
is a single-liner.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-16 15:56:03 -04:00
Avi Kivity
7e29dd7066 managed_bytes: improve alignment hygene
While blob_storage is marked as an unaligned type, the back references also
point to an unaligned type (a pointer to blob_storage), since a back
reference can live in a blob_storage.  This triggers errors from zapcc/clang 4.

Fix by creating a type for the reference, which is marked as unaligned.
Message-Id: <20170502071404.507-1-avi@scylladb.com>
2017-05-02 10:04:13 +01:00
Avi Kivity
1d12d69881 logalloc: define segment_zone::maximum_size
Yield build errors with some compilers, if missing.
2017-05-01 16:31:29 +03:00
Paweł Dziepak
f5cf86484e lsa: introduce upper bound on zone size
Attempting to create huge zones may introduce significant latency. This
patch introduces the maximum allowed zone size so that the time spent
trying to allocate and initialising zone is bounded.

Fixes #2335.

Message-Id: <20170428145916.28093-1-pdziepak@scylladb.com>
2017-04-30 10:58:11 +03:00
Duarte Nunes
d216c3dbd2 tombstone: Extract out relational operators
This patch extracts out the relational operators in struct tombstone
to a class capable of generating them from a tri-compare function.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-04-25 11:43:04 +02:00
Pekka Enberg
940c3f4330 Merge "Clang fixes (part 2)" from Avi
"This series fixes some more errors found by clang, with the aim of enabling
clang/zapcc as a supported compiler.  A single issue remains, but it's
probably in std::experimental::optional::swap(); not in our code."

* tag 'clang/2/v1' of https://github.com/avikivity/scylla:
  sstable_test: avoid passing negative non-type template arguments to unsigned parameters
  UUID: add more comparison operators
  sstable_datafile_test: avoid string_view user-defined literal conversion operator
  mutation_source_test: avoid template function without template keyword
  cql_query_test: define static variable
  cql_query_test: add braces for single-item collection initializers
  storage_service: don't use typeid(temporary)
  logalloc: remove unused max_occupancy_for_compaction
  storage_proxy: drop overzealous use of __int128_t in recently-modified-no-read-repair logic
  storage_proxy: drop unused member access from return value
  storage_proxy: fix reference bound to temporary in data_read_resolver::less_compare
  read_repair_decision: fix operator<<(std::ostream&, ...)
2017-04-24 20:32:16 +03:00
Avi Kivity
6d9e18fd61 logalloc: reduce descriptor overhead
Every lsa-allocated object is prefixed by a header that contains information
needed to free or migrate it.  This includes its size (for freeing) and
an 8-byte migrator (for migrating).  Together with some flags, the overhead
is 14 bytes (16 bytes if the default alignment is used).

This patch reduces the header size to 1 byte (8 bytes if the default alignment
is used).  It uses the following techniques:

 - ULEB128-like encoding (actually more like ULEB64) so a live object's header
   can typically be stored using 1 byte
 - indirection, so that migrators can be encoded in a small index pointing
   to a migrator table, rather than using an 8-byte pointer; this exploits
   the fact that only a small number of types are stored in LSA
 - moving the responsibility for determining an object's size to its
   migrator, rather than storing it in the header; this exploits the fact
   that the migrator stores type information, and object size is in fact
   information about the type

The patch improves the results of memory_footprint_test as following:

Before:

 - in cache:     976
 - in memtable:  947

After:

mutation footprint:
 - in cache:     880
 - in memtable:  858

A reduction of about 10%.  Further reductions are possible by reducing the
alignment of lsa objects.

logalloc_test was adjusted to free more objects, since with the lower
footprint, rounding errors (to full segments) are different and caused
false errors to be detected.

Missing: adjustments to scylla-gdb.py; will be done after we agree on the
new descriptor's format.
2017-04-24 12:23:12 +02:00
Avi Kivity
dc6ea51ffa UUID: add more comparison operators
Clang wanted them for some unit test; not sure how gcc was able to
synthesize them, but they're clearly needed.
2017-04-22 22:12:33 +03:00
Avi Kivity
9303b09a64 logalloc: remove unused max_occupancy_for_compaction
Noticed by clang.
2017-04-22 21:09:41 +03:00
Tomasz Grabiec
20f4c9bf23 lsa: Reduce reclamation latency
Currently eviction is performed until occupancy of the whole region
drops below the 85% threshold. This may take a while if region had
high occupancy and is large. We could improve the situation by only
evicting until occupancy of the sparsest segment drops below the
threshold, as is done by this change.

I tested this using a c-s read workload in which the condition
triggers in the cache region, with 1G per shard:

 lsa-timing - Reclamation cycle took 12.934 us.
 lsa-timing - Reclamation cycle took 47.771 us.
 lsa-timing - Reclamation cycle took 125.946 us.
 lsa-timing - Reclamation cycle took 144356 us.
 lsa-timing - Reclamation cycle took 655.765 us.
 lsa-timing - Reclamation cycle took 693.418 us.
 lsa-timing - Reclamation cycle took 509.869 us.
 lsa-timing - Reclamation cycle took 1139.15 us.

The 144ms pause is when large eviction is necessary.

Statistics for reclamation pauses for a read workload over
larger-than-memory data set:

Before:

 avg = 865.796362
 stdev = 10253.498038
 min = 93.891000
 max = 264078.000000
 sum = 574022.988000
 samples = 663

After:

 avg = 513.685650
 stdev = 275.270157
 min = 212.286000
 max = 1089.670000
 sum = 340573.586000
 samples = 663

Refs #1634.

Message-Id: <1484730859-11969-1-git-send-email-tgrabiec@scylladb.com>
2017-04-21 12:52:31 +02:00
Tomasz Grabiec
4313641c03 tests: Add test for log_histogram 2017-04-21 12:52:31 +02:00
Tomasz Grabiec
c83768d6bb log_histogram: Allow non-power-of-two minimum values
We will want to reuse the min_size mechanism for the whole compaction
threshold, including the occupancy threshold. That threshold is close
to the segment size and we cannot pick a power of two which would be
close enough to what we need.

Therefore, change log_histogram to support arbitrary minimum base.

bucket_of() was moved into log_histogram_options so that it can be used
in number_of_buckets(), which makes for a simple and much less
error-prone implementation.
2017-04-21 10:54:50 +02:00