scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 20:16:43 +00:00

Author	SHA1	Message	Date
Paweł Dziepak	dcd79af8ed	lsa: optimise disabling reclamation and invalidation counter Most of the lsa gory details are hidden in utils/logalloc.cc. That includes the actual implementation of a lsa region: region_impl. However, there is code in the hot path that often accesses the _reclaiming_enabled member as well as its base class allocation_strategy. In order to optimise those accesses another class is introduced: basic_region_impl that inherits from allocation_strategy and is a base of region_impl. It is defined in utils/logalloc.hh so that it is publicly visible and its member functions are inlineable from anywhere in the code. This class is supposed to be as small as possible, but contain all members and functions that are accessed from the fast path and should be inlined.	2018-01-30 18:33:26 +01:00
Paweł Dziepak	d825ae37bf	lsa: split alloc section into reserving and reclamation-disabled parts Allocating sections reserves certain amount of memory, then disables reclamation and attempts to perform given operation. If that fails due to std::bad_alloc the reserve is increased and the operation is retried. Reserving memory is expensive while just disabling reclamation isn't. Moreover, the code that runs inside the section needs to be safely retryable. This means that we want the amount of logic running with reclamation disabled as small as possible, even if it means entering and leaving the section multiple times. In order to reduce the performance penalty of such solution the memory reserving and reclamation disabling parts of the allocating sections are separated.	2018-01-30 18:33:26 +01:00
Paweł Dziepak	eb2e88e925	linearization_context: remove non-trivial operations from fast path Since linearization_context is thread_local every time it is accessed the compiler needs to emit code that checks if it was already constructed and does so if it wasn't. Moreover, upon leaving the context from the outermost scope the map needs to be cleared. All these operations impose some performance overhead and aren't really necessary if no buffers were linearised (the expected case). This patch rearranges the code so that linearization_context is trivially constructible and the map is cleared only if it was modified.	2018-01-30 18:33:25 +01:00
Vladimir Krivopalov	9fdf4b24b5	Add helper input streams: buffer_input_stream and prepended_input_stream. buffer_input_stream is a simple input_stream wrapping a single temporary_buffer. prepended_input_stream suits for the case when some data has been read into a buffer and the rest is still in a stream. It accepts a buffer and a data_source and first reads from the buffer and then, when it ends, proceeds reading from the data_source. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-01-29 11:57:04 -08:00
Botond Dénes	12b1520415	exponential_backoff_retry::do_until_value(): restore indentation Deferred from previous patch. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <a10053f6c0ed8a24a74e51f1df4e9a5acf59922d.1517222195.git.bdenes@scylladb.com>	2018-01-29 10:50:01 +00:00
Botond Dénes	e0c082616a	exponential_backoff_retry::do_until_value(): fix use-after-move The exponential_backoff_retry instance is captured by move and is then indirectly moved again as repeat_until_value() moves the lambda it's passed into its internal state. This caused problems as internal lambdas store references to the instance and these references go stale after the move. To fix this keep hold of the exponential_backoff_retry instance in an enclosing do_with() to make it safe for internal lambdas to reference it. Indentation will be fixed by the next patch. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <adc49d25a6176756d60e092f3713c0c897732382.1517222195.git.bdenes@scylladb.com>	2018-01-29 10:50:01 +00:00
Duarte Nunes	bfe5a8e96f	utils/managed_vector: Return reference to emplaced element We are in 2018, after all. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180126105417.54285-1-duarte@scylladb.com>	2018-01-26 13:49:56 +01:00
Tomasz Grabiec	1292315579	anchorless_list: Introduce last()	2018-01-18 11:32:49 +01:00
Tomasz Grabiec	5c85e9c2db	lsa: Expose max_zone_segments for tests	2018-01-16 13:17:20 +01:00
Tomasz Grabiec	99708cc498	lsa: Expose tracker::non_lsa_used_space() So that it can be used in unit tests.	2018-01-16 13:17:20 +01:00
Tomasz Grabiec	e5f8176c32	lsa: Fix memory leak on zone reclaim _free_segments_in_zones is not adjusted by segment_pool::reclaim_segments() for empty zones on reclaim under some conditions. For instance when some zone becomes empty due to regular free() and then reclaiming is called from the std allocator, and it is satisfied from a zone after the one which is empty. This would result in free memory in such zone to appear as being leaked due to corrupted free segment count, which may cause a later reclaim to fail. This could result in bad_allocs. The fix is to always collect such zones. Fixes #3129 Refs #3119 Refs #3120	2018-01-16 13:17:11 +01:00
Glauber Costa	80c4a211d8	consolidate timeout_clock At the moment, various different subsystems use their different ideas of what a timeout_clock is. This makes it a bit harder to pass timeouts between them because although most are actually a lowres_clock, that is not guaranteed to be the case. As a matter of fact, the timeout for restricted reads is expressed as nanoseconds, which is not a valid duration in the lowres_clock. As a first step towards fixing this, we'll consolidate all of the existing timeout_clocks in one, now called db::timeout_clock. Other things that tend to be expressed in terms of that clock--like the fact that the maximum time_point means no timeout and a semaphore that wait()s with that resolution are also moved to the common header. In the upcoming patch we will fix the restricted reader timeouts to be expressed in terms of the new timeout_clock. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-01-11 12:07:41 -05:00
Duarte Nunes	40ad65666f	utils/exponential_backoff_retry: Add helper to automate retries This patch adds the do_until_value static member function to exponential_backoff_retry, which retries the specified function until it returns an engaged optional. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-12-28 13:00:28 +00:00
Duarte Nunes	9a602c7796	utils/exponential_backoff_retry: Add abort_source-based retry Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-12-28 13:00:28 +00:00
Duarte Nunes	1374f898b9	Merge seastar upstream Class optimized_optional was moved into seastar, and its usage simplified so move_and_disengage() is replaced in favour of std::exchange(_, { }). * seastar adaca37...b0f5591 (9): > Merge "core: Introduce cancellation mechanism" from Duarte > Fix Seastar build that no longer builds with --enable-dpdk after the recent commit fd87ea2 > noncopyable_function: support function objects whose move constructors throw > Adding new hardware options to new config format, using new config format for dpdk device > Fix check for Boost version during pre-build configuration. > variant_utils: add variant_visitor constructor for C++17 mode > Merge "Allows json object to be stream to an" from Amnon > Merge 'Default to C++17' from Avi > Add const version of subscript operator to circular_buffer Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20171228112126.18142-1-duarte@scylladb.com>	2017-12-28 13:24:18 +02:00
Vlad Zolotarov	6c037899b5	utils::fb_utilities: add is_me(addr) method Add a widely used method that returns TRUE if a given address is a broadcast address of the local node. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-12-14 15:05:48 -05:00
Avi Kivity	e6940d8d4a	Merge "Gossip propagation and stabilization" from Calle "Fixes #2866 Fixes #2894 Changes gossip propagation to allow "atomic" grouping of values to ensure their respective order. Modifies gossip bootstrap startup to potentially wait longer in cases where stabilization (messages done) takes time, to avoid data loss in repair." * 'calle/gossip' of github.com:scylladb/seastar-dev: gossip: wait for stabilized gossip on bootstrap gossiper: Prevent race condition in propagation utils::to_string: Add printers for pairs+maps utils::in: Add helper type for perfect forwarding initializer lists	2017-12-12 17:59:00 +02:00
Vlad Zolotarov	0145ae2b4b	utils::crc32: add power64 crc32 HW accelerated implementation Based on the work of Anton Blanchard <anton@au.ibm.com>, IBM that may be found here: https://github.com/antonblanchard/crc32-vpmsum Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-12-08 13:38:13 -05:00
Michael Munday	18c0ab539e	utils/allocation_strategy: force alignment to be at least sizeof(void*) The alignment of packed structs can be 1. The system¹ posix_memalign function will return EINVAL when passed this alignment. This fix forces the alignment to be at least sizeof(void*). ¹ The seastar implementation of posix_memalign does not appear to have this limitation currently.	2017-12-08 10:12:41 -05:00
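The alignment clamp described in the commit above can be sketched as follows. This is a minimal illustration, not the code from utils/allocation_strategy: the function name `alloc_with_min_alignment` is hypothetical, and the sketch assumes a POSIX system and a power-of-two `alignment` argument.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not the ScyllaDB API): raise the requested alignment
// to at least sizeof(void*) before calling the system posix_memalign, which
// rejects smaller values with EINVAL.
// Assumption: `alignment` is a power of two, as posix_memalign requires.
void* alloc_with_min_alignment(std::size_t alignment, std::size_t size) {
    alignment = std::max(alignment, sizeof(void*));
    void* p = nullptr;
    if (posix_memalign(&p, alignment, size) != 0) {
        return nullptr; // allocation failed or alignment was invalid
    }
    return p; // release with free()
}
```

With this clamp, a packed struct whose natural alignment is 1 is allocated with pointer alignment instead of failing.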
Michael Munday	5158b3f484	utils::crc: introduce process_le/be(T) methods Replace the oblique process(T) overloads for integer types with explicit process_le/be(T) methods that would interpret the given integer as a stream of bytes using the corresponding endianness. For instance process_le(0x11223344) would treat this integer as the following array of bytes: {0x44, 0x33, 0x22, 0x11}. process_be(0x11223344) on the other hand would treat this integer as if it's {0x11, 0x22, 0x33, 0x44}. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-12-08 10:12:21 -05:00
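The byte orders named in the commit above can be reproduced with a small sketch. The helpers `bytes_le` and `bytes_be` are hypothetical and are not part of utils::crc; they only produce the byte sequences that process_le and process_be are described as consuming.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helpers (not the utils::crc API): return the bytes of an
// unsigned integer in the order the commit message describes.
template <typename T>
std::array<std::uint8_t, sizeof(T)> bytes_le(T v) {
    static_assert(std::is_unsigned<T>::value, "unsigned integers only");
    std::array<std::uint8_t, sizeof(T)> out{};
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        // Least significant byte goes first.
        out[i] = static_cast<std::uint8_t>(v >> (8 * i));
    }
    return out;
}

template <typename T>
std::array<std::uint8_t, sizeof(T)> bytes_be(T v) {
    static_assert(std::is_unsigned<T>::value, "unsigned integers only");
    std::array<std::uint8_t, sizeof(T)> out{};
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        // Most significant byte goes first.
        out[sizeof(T) - 1 - i] = static_cast<std::uint8_t>(v >> (8 * i));
    }
    return out;
}
```

Because the helpers use shifts rather than reinterpreting memory, the result does not depend on the endianness of the host.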
Michael Munday	26b7c2622e	utils/crc: use zlib for crc32 on non-x86 platforms Ideally we should use the Castagnoli polynomial to match the SSE 4.2 crc32 instructions, but this works for now.	2017-12-08 09:47:50 -05:00
Calle Wilund	f4362a5289	utils::in: Add helper type for perfect forwarding initializer lists wrapper type (courtesy of http://cpptruths.blogspot.se/2013/09/21-ways-of-passing-parameters-plus-one.html#inTidiom) to enable move semantics in initializer lists. Useful as an engineering overkill to retain nice call sites.	2017-12-05 14:28:34 +00:00
Amnon Heiman	3f8d9a87ee	estimated_histogram: update the sum and count when merging When merging histograms the count and the sum should be updated. Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <20171122154822.23855-1-amnon@scylladb.com>	2017-11-22 16:55:55 +01:00
Glauber Costa	6c4e8049a0	estimated_histogram: also fill up sum metric Prometheus histograms have 3 embedded metrics: count, buckets, and sum. Currently we fill up count and buckets but sum is left at 0. This is particularly bad, since according to the prometheus documentation, the best way to calculate histogram averages is to write: rate(metric_sum[5m]) / rate(metric_count[5m]) One way of keeping track of the sum is adding the value we sampled, every time we sample. However, the interface for the estimated histogram has a method that allows adding a metric while adjusting the count for missing metrics (add_nano()). That makes accumulating a sum inaccurate--as we will have no values for the points that were added. To overcome that, when we call add_nano(), we pretend we are introducing new_count - _count metrics, all with the same value. Long term, doing away with sampling may help us provide more accurate results. After this patch, we are able to correctly calculate latency averages through the data exported in prometheus. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20171122144558.7575-1-glauber@scylladb.com>	2017-11-22 16:10:12 +01:00
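The approximation described in the commit above, where every point skipped by sampling is assumed to carry the sampled value, can be sketched as follows. The struct `sampled_sum` and its member names are hypothetical and do not reproduce the real estimated_histogram class; only the "new_count - _count" idea comes from the commit message.

```cpp
#include <cstdint>

// Hypothetical stand-in for the sum/count bookkeeping; not the real
// estimated_histogram class.
struct sampled_sum {
    std::uint64_t count = 0; // number of points accounted for so far
    std::uint64_t sum = 0;   // running sum, exported next to count

    // Record one sampled value while moving the count to new_count.
    // Points skipped by sampling are assumed to share the same value.
    void add_sampled(std::uint64_t value, std::uint64_t new_count) {
        if (new_count <= count) {
            return; // nothing new to account for
        }
        sum += value * (new_count - count);
        count = new_count;
    }
};
```

Dividing `sum` by `count` then gives an average that is exact only when the skipped points really had the sampled value, which is the inaccuracy the commit message acknowledges.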
Vladimir Krivopalov	61b1988aa1	Use meaningful error messages when throwing a marshal_exception Fixes #2977 Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com> Message-Id: <20171121005108.23074-1-vladimir@scylladb.com>	2017-11-21 16:05:43 +02:00
Daniel Fiala	21ea05ada1	utils/big_decimal: Fix compilation issue with conversion of cpp_int to uint64_t. Signed-off-by: Daniel Fiala <daniel@scylladb.com> Message-Id: <20171121134854.16278-1-daniel@scylladb.com>	2017-11-21 15:51:29 +02:00
Jesse Haber-Kucharsky	6f4241574c	utils/loading_cache: Include necessary dependency	2017-11-15 23:17:05 -05:00
Tomasz Grabiec	8d69d217af	lsa: Guarantee invalidated references on allocating section retry There is existing code (e.g. use of partition_snapshot_row_cursor in cache_streamed_mutation) which assumes that references will be invalidated when bad_alloc is thrown from allocating_section. That is currently the case because on retry we will attempt memory reclamation which will invalidate references either through compaction or eviction. Make this guarantee explicit.	2017-11-13 20:55:13 +01:00
Daniel Fiala	ce2f010859	utils/big_decimal: Added necessary operators and methods for aggregate functions. Signed-off-by: Daniel Fiala <daniel@scylladb.com>	2017-11-12 15:51:29 +01:00
Tomasz Grabiec	5348d9f596	managed_bytes: Declare copy constructor as allocation point Because of the small size optimization, not all copies will call the allocator, so allocation failure injection may miss this site if the value is not large enough. Make the testing more effective by marking this place explicitly as an allocation point.	2017-11-07 15:33:24 +01:00
Tomasz Grabiec	34ccf234ea	Integrate with allocation failure injection framework	2017-11-07 15:33:24 +01:00
Calle Wilund	287b6fd8bd	config_file: Add optional "error_handler" to yaml parse functions Allowing parse errors / unknown options to be ignored.	2017-11-06 09:53:05 +00:00
Duarte Nunes	044b8deae4	Merge 'Solves problems related to gossip which can be observed in a large cluster' from Tomasz "The main problem fixed is slow processing of application state changes. This may lead to a bootstrapping node not having up to date view on the ring, and serve incorrect data. Fixes #2855." * tag 'tgrabiec/gossip-performance-v3' of github.com:scylladb/seastar-dev: gms/gossiper: Remove periodic replication of endpoint state map gossiper: Check for features in the change listener gms/gossiper: Replicate changes incrementally to other shards gms/gossiper: Document validity of endpoint_state properties storage_service: Update token_metadata after changing endpoint_state gms/gossiper: Process endpoints in parallel gms/gossiper: Serialize state changes and notifications for given node utils/loading_shared_values: Allow Loader to return non-future result gms/gossiper: Encapsulate lookup of endpoint_state storage_service: Batch token metadata and endpoint state replication utils/serialized_action: Introduce trigger_later() gossiper: Add and improve logging gms/gossiper: Don't fire change listeners when there is no change gms/gossiper: Allow parallel apply_state_locally() gms/gossiper: Avoid copies in endpoint_state::add_application_state() gms/failure_detector: Ignore short update intervals	2017-10-18 10:13:25 +01:00
Duarte Nunes	c468e59817	Merge 'Extract config file mechanism + allow additional' from Calle "Extracts the yaml/boost-po aspects of the "self-describing" db::config into an abstract type. db::config is then reimplemented in said type, removing some of the slightly cumbersome entanglement with seastar opts (log). Adds a main hook for additional configuration files (options + file)" * 'calle/config' of github.com:scylladb/seastar-dev: main/init: Add registerable configuration objects db::config: Re-implement on utils/config_file. utils::config_file: Abstract out config file to external type	2017-10-18 09:50:53 +01:00
Tomasz Grabiec	f7a7e97095	utils/loading_shared_values: Allow Loader to return non-future result	2017-10-18 08:49:52 +02:00
Tomasz Grabiec	2e2ae4671e	utils/serialized_action: Introduce trigger_later() Can be used instead of trigger() to improve batching.	2017-10-18 08:49:52 +02:00
Calle Wilund	05db87e068	utils::config_file: Abstract out config file to external type Handling all the boost::commandline + YAML stuff. This patch only provides an external version of these functions, it does not modify the db::config object. That is for a follow-up patch.	2017-10-18 00:51:41 +00:00
Pekka Enberg	ae92055b52	Merge "Bring histogram closer to what Prometheus expects" from Glauber "Histograms are a native prometheus type, and there are many functions available that operate on them. There is extensive documentation about them at https://prometheus.io/docs/practices/histograms/ One example is the function histogram_quantile(), that can extract useful quantiles from the histograms. Currently, those functions don't work well. The reasons are twofold: 1) We are only exporting 16 metrics, starting from 1usec. That means that the highest latency we can differentiate is 4ms. After that, everything falls into the same bin. 2) The format that prometheus expects is that each bin will contain the total number of points seen up until that bin, while we currently export the total number of points that fall between bins. IOW, it is a cumulative histogram. About point two, granted it is a bit hidden in their website, but it is there. The following phrase about a caveat makes it clear: "Note that we divide the sum of both buckets. The reason is that the histogram buckets are cumulative. The le="0.3" bucket is also contained in the le="1.2" bucket; dividing it by 2 corrects for that." It is also not needed to accumulate things that fall over the last bin: the _count component of the histogram will already account for that." Acked-by: Amnon Heiman <amnon@scylladb.com> Acked-by: Gleb Natapov <gleb@scylladb.com> * 'prometheus-histograms' of github.com:glommer/scylla: storage_proxy: change reporting of estimated histograms estimated_histogram: bring histogram closer to what prometheus expects.	2017-10-17 20:23:10 +03:00
Tomasz Grabiec	68fe1a5bee	utils/loading_cache: Fix compilation on older compilers Message-Id: <1507728312-10585-1-git-send-email-tgrabiec@scylladb.com>	2017-10-12 14:55:34 +03:00
Botond Dénes	d2b294dc06	loading_cache: prepend this-> to method calls on captured this To make gcc 6.3 happy. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <849402e20a1ffa6f603eff4fe295981a94b9ca79.1507282527.git.bdenes@scylladb.com>	2017-10-06 12:09:34 +02:00
Vlad Zolotarov	1394e781be	utils + cql3: use a functor class instead of std::function Define value_extractor_fn as a functor class instead of std::function. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1507137724-2408-2-git-send-email-vladz@scylladb.com>	2017-10-05 15:29:51 +01:00
Glauber Costa	fc4416abcc	estimated_histogram: bring histogram closer to what prometheus expects. Histograms are a native prometheus type, and there are many functions available that operate on them. There is extensive documentation about them at https://prometheus.io/docs/practices/histograms/ One example is the function histogram_quantile(), that can extract useful quantiles from the histograms. Currently, those functions don't work well. The reasons are twofold: 1) We are only exporting 16 metrics, starting from 1usec. That means that the highest latency we can differentiate is 4ms. After that, everything falls into the same bin. 2) The format that prometheus expects is that each bin will contain the total number of points seen up until that bin, while we currently export the total number of points that fall between bins. IOW, it is a cumulative histogram. About point two, granted it is a bit hidden in their website, but it is there. The following phrase about a caveat makes it clear: "Note that we divide the sum of both buckets. The reason is that the histogram buckets are cumulative. The le="0.3" bucket is also contained in the le="1.2" bucket; dividing it by 2 corrects for that." It is also not needed to accumulate things that fall over the last bin: the _count component of the histogram will already account for that. This patch changes the histogram format to be more in line with what prometheus expects. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2017-10-04 20:01:13 -04:00
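The difference between per-bin counts and the cumulative buckets that the commit above says Prometheus expects can be sketched as a running sum. The function `to_cumulative` is a hypothetical helper for illustration and is not taken from estimated_histogram.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not the estimated_histogram API): convert counts of
// points that fall between bin boundaries into cumulative counts, where each
// bucket holds every point seen up to and including that bin.
std::vector<std::uint64_t> to_cumulative(const std::vector<std::uint64_t>& per_bin) {
    std::vector<std::uint64_t> cumulative;
    cumulative.reserve(per_bin.size());
    std::uint64_t running = 0;
    for (std::uint64_t c : per_bin) {
        running += c;
        cumulative.push_back(running);
    }
    return cumulative;
}
```

For example, per-bin counts of 1, 0, 3 and 2 become cumulative buckets of 1, 1, 4 and 6, so the last bucket equals the number of points that fell within the exported bins.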
Calle Wilund	801ee44cb8	class_registrator: Fix qualified name matching + provider helpers Should not assume namespace "org", nor should we allow "loose" substring matching.	2017-10-04 12:43:42 +02:00
Calle Wilund	3c509e0333	class_registrator: Allow different return types Allows registry to give back, for example, shared_ptr etc instead of solely unique_ptr. If a registry is defined with seastar/std shared/lw_shared/unique_ptr as "BaseType", the type will assume this is the intended result type.	2017-10-04 12:43:42 +02:00
Paweł Dziepak	fdfa6703c3	Merge "loading_shared_values and size limited and evicting prepared statements cache" from Vlad " The original motivation for the "utils: introduce a loading_shared_values" series was a hinted handoff work where I needed an on-demand asynchronously loading key-value container (a replica address to a commitlog instance map). It turned out that we already have the classes that do almost what I needed: - utils::loading_cache - sstables::shared_index_lists Therefore it made sense to find a common ground, unify this functionality and reuse the code both in the classes above and in the new hinted handoff code. This series introduces the utils::loading_shared_values that generalizes the sstables::shared_index_lists API on top of bi::unordered_set with the rehashing logic from the utils::loading_cache triggered by an addition of an entry to the set (PATCH1). Then it reworks the sstables::shared_index_lists and utils::loading_cache on top of the new class (PATCH2 and PATCH3). PATCH4 optimizes the loading_cache for the long timer period use case. But then we have discovered that we have another "customer" for the loading_cache. Apparently our prepared statements cache had a birth flaw - it was unlimited in size - unless the corresponding keyspace and/or table are modified/dropped the entries are never evicted. We clearly need to limit its size and it would also make sense to evict the cache entries that haven't been used long enough. This seems like a perfect match for a utils::loading_cache except for prepared statements don't need to be reloaded after they are created. Patches starting from PATCH5 are dealing with adding the utils::loading_cache the missing functionality (like making the "reloading" conditional and adding the synchronous methods like find(key)) and then transitioning the CQL and Thrift prepared statements caches to utils::loading_cache. This also fixes #2474." 
* 'evict_unused_prepared-v5' of https://github.com/vladzcloudius/scylla: tests: loading_cache_test: initial commit cql3::query_processor: implement CQL and Thrift prepared statements caches using cql3::prepared_statements_cache cql3: prepared statements cache on top of loading_cache utils::loading_cache: make the size limitation more strict utils::loading_cache: added static_asserts for checking the callbacks signatures utils::loading_cache: add a bunch of standard synchronous methods utils::loading_cache: add the ability to create a cache that would not reload the values utils::loading_cache: add the ability to work with not-copy-constructable values utils::loading_cache: add EntrySize template parameter utils::loading_cache: rework on top of utils::loading_shared_values sstables::shared_index_list: use utils::loading_shared_values utils: introduce loading_shared_values	2017-10-04 09:13:32 +01:00
Avi Kivity	a2f26f7b29	log_histogram: rename to log_heap log_histogram is not really a histogram, it is a heap-like container. Rename to log_heap in case we do want a log_histogram one day. Message-Id: <20170916172137.30941-1-avi@scylladb.com>	2017-09-18 12:44:05 +02:00
Vlad Zolotarov	9a43398d6a	utils::loading_cache: make the size limitation more strict Ensure that the size of the cache is never bigger than the "max_size". Before this patch the size of the cache could have been indefinitely bigger than the requested value during the refresh time period which is clearly an undesirable behaviour. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00
Vlad Zolotarov	4e72a56310	utils::loading_cache: added static_asserts for checking the callbacks signatures Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00
Vlad Zolotarov	a13362e74b	utils::loading_cache: add a bunch of standard synchronous methods Add a few standard synchronous methods to the cache, e.g. find(), remove_if(), etc. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00
Vlad Zolotarov	fa2f8162a5	utils::loading_cache: add the ability to create a cache that would not reload the values Sometimes we don't want the cached values to be periodically reloaded. This patch adds the ability to control this using a ReloadEnabled template parameter. In case the reloading is not needed the "loading" function is not given to the constructor but rather to the get_ptr(key, loader) method (currently it's the only method that is used, we may add the corresponding get(key, loader) method in the future when needed). Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00


461 Commits