scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-04 14:03:06 +00:00

Author	SHA1	Message	Date
Avi Kivity	4f02a5f4b3	bloom_filter: fix overflow for large filters We use ::abs(), which has an int parameter, on long arguments, resulting in incorrect results. Switch to std::abs() instead, which has the correct overloads. Fixes #1494. Message-Id: <1469347802-28933-1-git-send-email-avi@scylladb.com> (cherry picked from commit `900639915d`)	2016-07-24 11:32:54 +03:00
Pekka Enberg	07ba03ce7b	utils/exceptions: Whitelist EEXIST and ENOENT in should_stop_on_system_error() There are various call-sites that explicitly check for EEXIST and ENOENT: $ git grep "std::error_code(E" database.cc: if (e.code() != std::error_code(EEXIST, std::system_category())) { database.cc: if (e.code() != std::error_code(ENOENT, std::system_category())) { database.cc: if (e.code() != std::error_code(ENOENT, std::system_category())) { database.cc: if (e.code() != std::error_code(ENOENT, std::system_category())) { sstables/sstables.cc: if (e.code() == std::error_code(ENOENT, std::system_category())) { sstables/sstables.cc: if (e.code() == std::error_code(ENOENT, std::system_category())) { Commit `961e80a` ("Be more conservative when deciding when to shut down due to disk errors") turned these errors into a storage_io_exception that is not expected by the callers, which causes 'nodetool snapshot' functionality to break, for example. Whitelist the two error codes to revert back to the old behavior of io_check(). Message-Id: <1465454446-17954-1-git-send-email-penberg@scylladb.com> (cherry picked from commit `8df5aa7b0c`)	2016-06-16 14:01:33 +03:00
Avi Kivity	de690a6997	Be more conservative when deciding when to shut down due to disk errors Currently we only shut down on EIO. Expand this to shut down on any system_error. This may cause us to shut down prematurely due to a transient error, but this is better than not shutting down due to a permanent error (such as ENOSPC or EPERM). We may whitelist certain errors in the future to improve the behavior. Fixes #1311. Message-Id: <1465136956-1352-1-git-send-email-avi@scylladb.com> (cherry picked from commit `961e80ab74`)	2016-06-16 14:01:33 +03:00
Pekka Enberg	7916182cfa	Revert "Be more conservative when deciding when to shut down due to disk errors" This reverts commit `a6179476c5`. The change breaks 'nodetool snapshot', for example.	2016-06-09 10:11:29 +03:00
Amnon Heiman	6255076c20	rate_moving_average: mean_rate is not initialized The rate_moving_average is used by timed_rate_moving_average to return its internal values. If there are no timed events, the mean_rate is not properly initialized. To solve that, the mean_rate is now initialized to 0 in the structure definition. Refs #1306 Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1465231006-7081-1-git-send-email-amnon@scylladb.com> (cherry picked from commit `2cf882c365`)	2016-06-07 09:44:26 +03:00
Avi Kivity	a6179476c5	Be more conservative when deciding when to shut down due to disk errors Currently we only shut down on EIO. Expand this to shut down on any system_error. This may cause us to shut down prematurely due to a transient error, but this is better than not shutting down due to a permanent error (such as ENOSPC or EPERM). We may whitelist certain errors in the future to improve the behavior. Fixes #1311. Message-Id: <1465136956-1352-1-git-send-email-avi@scylladb.com> (cherry picked from commit `961e80ab74`)	2016-06-06 16:15:25 +03:00
Piotr Jastrzebski	136b8148d2	Use idle CPU to compact LSA memory Register an idle CPU handler that compacts a single segment every time there's nothing better to execute on CPU. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <c26aa608a1e0752fb9e6db1833ef3ba1de95f161.1464169748.git.piotr@scylladb.com>	2016-05-26 12:43:53 +03:00
Amnon Heiman	8ef25ceb05	Add weighted average rate related objects This patch adds a few data structures for derived and accumulative statistics that are similar to the yammer implementation used by the JMX. It also adds a plus operator to histogram, which cleans up histogram usage. moving_average - An exponentially weighted moving average; calculates an event rate over a given interval. rate_moving_average and timed_rate_moving_average - Calculate 1m, 5m and 15m EWMAs, an all-time average, and a counter. rate_moving_average_and_histogram and timed_rate_moving_average_and_histogram - Combine a histogram with a rate_moving_average. They also expose a histogram API, so it is easy to replace a histogram with a timed_rate_moving_average_and_histogram. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2016-05-17 11:47:49 +03:00
Pekka Enberg	4ed702f0da	Merge "Authorizer support" from Calle "Conversion/implementation of "authorizer" code from origin, handling permissions management for users/resources. Default implementation keeps mapping of <user.resource>->{permissions} in a table, contents of which is cached for slightly quicker checks. Adds access control to all (existing) cql statements. Adds access management support to the CQL impl. (GRANT/REVOKE/LIST) Verified manually and with dtest auth_test.py. Note that several of these still fail due to (unrelated) unimplemented features, like index, types etc. Fixes #1138"	2016-04-19 15:00:38 +03:00
Calle Wilund	ead1c882f8	utils::loading_cache: Version of the LoadingCache type used in origin Simple, expiring, cache of potentially limited number of entries.	2016-04-19 11:49:05 +00:00
Takuya ASADA	f6252be0c1	utils: fix compilation error on utils/exceptions.hh The compiler is unable to find std::system_error due to a missing header. Fixes #1202 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1461006884-28316-1-git-send-email-syuu@scylladb.com>	2016-04-19 09:37:31 +03:00
Calle Wilund	c446fe50e6	tuple_hash: Add convenience operator for two arguments (non-pair)	2016-04-18 13:51:15 +00:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Tomasz Grabiec	a0cba3c86f	logalloc: Introduce tracker::occupancy() Returns occupancy information for all memory allocated by LSA, including segment pools / zones.	2016-03-22 16:28:10 +01:00
Tomasz Grabiec	529c8b8858	logalloc: Rename tracker::occupancy() to region_occupancy()	2016-03-22 14:56:44 +01:00
Tomasz Grabiec	ca08db504b	managed_bytes: Make operator[] work for large blobs as well Fixes assertion in mutation_test: mutation_test: ./utils/managed_bytes.hh:349: blob_storage::char_type* managed_bytes::data(): Assertion `!_u.ptr->next' Introduced in `ea7c2dd085` Message-Id: <1458648786-9127-1-git-send-email-tgrabiec@scylladb.com>	2016-03-22 14:43:52 +02:00
Tomasz Grabiec	184e2831e7	managed_bytes: Mark move-assignment noexcept	2016-03-21 18:41:27 +01:00
Tomasz Grabiec	92d4cfc3ab	managed_bytes: Make copy assignment exception-safe	2016-03-21 18:41:27 +01:00
Tomasz Grabiec	22d193ba9f	managed_bytes: Make linearization_context::forget() noexcept It is needed for noexcept destruction, which we need for exception safety in higher layers. According to [1], erase() only throws if key comparison throws, and in our case it doesn't. [1] http://en.cppreference.com/w/cpp/container/unordered_map/erase	2016-03-21 18:41:27 +01:00
Benoît Canet	1fb9a48ac5	exception: Optionally shut down communication on I/O errors. I/O errors cannot be fixed by Scylla; the only solution is to shut down the database communications. Signed-off-by: Benoît Canet <benoit@scylladb.com> Message-Id: <1458154098-9977-1-git-send-email-benoit@scylladb.com>	2016-03-17 15:02:52 +02:00
Paweł Dziepak	338fd34770	lsa: update _closed_occupancy after freeing all segments _closed_occupancy will be used when a region is removed from its region group, make sure that it is accurate. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-17 11:12:05 +00:00
Paweł Dziepak	99b61d3944	lsa: set _active to nullptr in region destructor In the region destructor, after the active segment is freed, the pointer to it is left unchanged. This confuses the remaining parts of the destructor logic (namely, removal from the region group), which may rely on the information in region_impl::_active. In this particular case the problem was that the code removing the region from the region group called region_impl::occupancy(), which dereferenced _active if it was not null. Fixes #993. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1457341670-18266-1-git-send-email-pdziepak@scylladb.com>	2016-03-07 10:15:28 +01:00
Calle Wilund	e79ca557ed	managed_bytes: Change init of small object to silence error on gcc5 Fixes #865 (Some) gcc 5 (5.3.0 for me) on ubuntu will generate errors on compilation of this code (compiling logalloc_test). The memcpy to inline storage seems to confuse the compiler. Simply change to std::copy, which shuts the compiler up. Any decent stl should convert primitive std::copy to memcpy anyway, but since it is also the inline (small storage), it should not matter which way. Message-Id: <1456931988-5876-4-git-send-email-calle@scylladb.com>	2016-03-02 18:21:51 +02:00
Calle Wilund	43ea1f5945	utils::jointpoint: Helper type to generate a singular value for all shards Lets operations working on all shards "join" and acquire the same value of something, with that value being based on whenever all shards reach the join. Obvious use case: time stamp after one set of per-shard ops, but before final ones. The generation of the value is guaranteed to happen on the shards that created the join point. Based on the join-ops in CF::snapshot, but abstracted and made caller responsibility. Primary use case is to help deal with the join-problem of truncation. Message-Id: <1456332856-23395-1-git-send-email-calle@scylladb.com>	2016-02-24 18:59:25 +02:00
Paweł Dziepak	d5c794d5e4	data_output: add reserve() Allows mixing data_output with other output stream like seastar::simple_output_stream which is useful when switching to the new IDL-based serializers. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-02-19 23:11:59 +00:00
Amnon Heiman	1e4d227b20	managed_bytes: don't return auto from non-member function gcc 4.9 does not allow a non-static data member declared auto. This patch replaces the auto declaration with std::result_of_t Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1455652166-16860-1-git-send-email-amnon@scylladb.com>	2016-02-16 21:50:55 +02:00
Avi Kivity	13144ea9eb	managed_bytes: get rid of explicit linearize/scatter Now that everything is in a linearization context, we don't need to explicitly gather data.	2016-02-16 14:37:46 +02:00
Avi Kivity	af8ef54d5a	managed_bytes: introduce with_linearized_managed_bytes() A large managed_bytes blob can be scattered in lsa memory. Usually this is fine, but sometimes we want to examine it in place without copying it out, while using contiguous iterators for efficiency. For this use case, introduce with_linearized_managed_bytes(Func), which runs a function in a "linearization context". Within the linearization context, reads of managed_bytes objects will see temporarily linearized copies instead of scattered data.	2016-02-09 19:55:13 +02:00
Avi Kivity	e5b72aedf1	managed_bytes: don't copy data during hashing	2016-02-08 12:43:05 +02:00
Avi Kivity	5d958db869	managed_bytes: fix operator== for fragmented blobs Must compare fragment by fragment.	2016-02-08 12:43:05 +02:00
Erich Keane	49842aacd9	managed_vector: maybe_constructed ctor to non-constexpr Clang enforces that a union's constexpr CTOR must initialize one of the members. The spec is seemingly silent as to what the rule on this is, however, making this non-constexpr results in clang accepting the constructor. Signed-off-by: Erich Keane <erich.keane@verizon.net> Message-Id: <1454604300-1673-1-git-send-email-erich.keane@verizon.net>	2016-02-07 10:30:45 +02:00
Gleb Natapov	4e440ebf8e	Remove old inet_address and uuid serializers	2016-02-02 12:15:50 +02:00
Raphael S. Carvalho	2164aa8d5b	move compaction manager from /utils to /sstables Compaction manager was initially created at utils because it was more generic, and wasn't only intended for compaction. It was more like a task handler based on futures, but now it's only intended to manage compaction tasks, and thus should be moved elsewhere. /sstables is where compaction code is located. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-21 15:23:05 -02:00
Raphael S. Carvalho	3bd240d9e8	compaction: add ability to stop an ongoing compaction That's needed for nodetool stop, which is called to stop all ongoing compaction. The implementation is about informing an ongoing compaction that it was asked to stop, so the compaction itself will trigger an exception. Compaction manager will catch this exception and re-schedule the compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-19 23:15:18 -02:00
Raphael S. Carvalho	ec4c73d451	compaction: rename compaction_stats to compaction_info compaction_info makes more sense because this structure doesn't only store stats about ongoing compaction. Soon, we will add information to it about whether or not a user asked to stop the respective ongoing compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-19 23:15:18 -02:00
Tomasz Grabiec	237819c31f	logalloc: Excluded zones' free segments in lsa/byres-non_lsa_used_space Historically the purpose of the metric is to show how much memory is in standard allocations. After zones were introduced, this would also include free space in lsa zones, which is almost all memory, and thus the metric lost its original meaning. This change brings it back to its original meaning. Message-Id: <1452865125-4033-1-git-send-email-tgrabiec@scylladb.com>	2016-01-18 10:48:14 +02:00
Raphael S. Carvalho	a5c90194f5	db: add support to clean up a column family Cleanup is a procedure that will discard irrelevant keys from all sstables of a column family, thus saving disk space. Scylla will clean up a sstable by using compaction code, in which this sstable will be the only input used. Compaction manager was changed to become aware of cleanup, such that it will be able to schedule cleanup requests and also know how to handle them properly. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-12 03:53:04 -02:00
Raphael S. Carvalho	d44a5d1e94	compaction: filter out compacting sstables The implementation is about storing generation of compacting sstables in an unordered set per column family, so before strategy is called, compaction manager will filter out compacting sstables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-12 01:18:29 -02:00
Raphael S. Carvalho	9c13c1c738	compaction: move compaction execution from strategy to manager Currently, the compaction strategy is responsible for both getting the sstables selected for compaction and running compaction. Moving the code that runs compaction from strategy to manager is a big improvement, which will also make it possible for the compaction manager to keep track of which sstables are being compacted at any given moment. This change will also be needed for cleanup and concurrent compaction on the same column family. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-12 00:04:27 -02:00
Tomasz Grabiec	eb1b21eb4b	Introduce hashing helpers	2016-01-08 21:10:25 +01:00
Avi Kivity	c8b09a69a9	lsa: disable constant_time_size in binomial_heap implementation Corrupts heap on boost < 1.60, and not needed. Fixes #698.	2015-12-29 12:59:00 +01:00
Vlad Zolotarov	33552829b2	core: use steady_clock where monotonic clock is required Use steady_clock instead of high_resolution_clock where a monotonic clock is required. high_resolution_clock is essentially a system_clock (wall clock) and therefore cannot be assumed monotonic, since the wall clock may move backwards due to time/date adjustments. Fixes issue #638 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2015-12-27 18:07:53 +02:00
Calle Wilund	803b58620f	data_output: specialize serialized_size for bool to ensure sync with write	2015-12-21 14:19:45 +00:00
Paweł Dziepak	442bc90505	compaction_manager: check whether the manager is already stopped Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-12-17 14:06:41 +01:00
Tomasz Grabiec	157af1036b	data_output: Introduce write_view() which matches data_input::read_view()	2015-12-16 18:06:54 +01:00
Raphael S. Carvalho	e74dcc86bd	compaction_manager: introduce list of compaction_stats This list will store compaction_stats for each ongoing compaction. That's why register and deregister methods are provided. This change is important for compaction stats API that needs data of each ongoing compaction, such as progress, ks, cf, etc. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-15 09:50:28 -02:00
Lucas Meneghel Rodrigues	2167173251	utils/logalloc.cc - Declare member minimum_size from segment_zone struct This fixes compile error: In function `logalloc::segment_zone::segment_zone()': /home/lmr/Code/scylla/utils/logalloc.cc:412: undefined reference to `logalloc::segment_zone::minimum_size' collect2: error: ld returned 1 exit status ninja: build stopped: subcommand failed. Signed-off-by: Lucas Meneghel Rodrigues <lmr@scylladb.com>	2015-12-10 12:54:34 +02:00
Paweł Dziepak	ec453c5037	managed_bytes: fix potentially unaligned accesses blob_storage is defined with attribute packed, which makes its alignment requirement equal to 1. This means that its members may be unaligned. GCC is obviously aware of that and will generate appropriate code (and not generate ubsan checks). However, there are a few places where members of blob_storage are accessed via pointers; these have to be wrapped by unaligned_cast<> to let the compiler know that the location pointed to may not be properly aligned. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-12-10 11:59:54 +02:00
Avi Kivity	204610ac61	Merge "Make LSA more large-allocation-friendly" from Paweł "This series attempts to make LSA more friendly for large (i.e. bigger than LSA segment) allocations. It is achieved by introducing segment zones – large, contiguous areas of segments and using them to allocate segments instead of calling malloc() directly. Zones can be shrunk when needed to reclaim memory and segments can be migrated either to reduce the number of zones or to defragment one in order to be able to shrink it. LSA tries to keep all segments at the lower addresses and reclaims memory starting from the zones in the highest parts of the address space."	2015-12-09 10:49:23 +02:00
Paweł Dziepak	8ba66bb75d	managed_bytes: fix copy size in move constructor Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-12-09 10:38:28 +02:00
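
The error-code whitelist described in commit 07ba03ce7b (treat every std::system_error as fatal except EEXIST and ENOENT, which callers check for explicitly) could look roughly like the sketch below. This is an illustration based only on the commit message, not the code in utils/exceptions; the names should_stop_on_system_error_sketch and is_enoent are hypothetical.

```cpp
#include <cerrno>
#include <system_error>

// Hypothetical sketch, NOT the actual Scylla implementation.
// Returns true when the error should be treated as fatal (shut down),
// and false for errno values that call sites handle on their own.
inline bool should_stop_on_system_error_sketch(const std::system_error& e) {
    if (e.code().category() == std::system_category()) {
        switch (e.code().value()) {
        case EEXIST:  // e.g. creating a directory that already exists
        case ENOENT:  // e.g. removing a file that is already gone
            return false;
        default:
            break;
        }
    }
    // Anything else (EIO, ENOSPC, EPERM, ...) is conservatively fatal.
    return true;
}

// Caller-side check in the style of the call sites quoted in the commit
// message. It only works if the original error_code reaches the caller
// unchanged, which is why the whitelist above matters.
inline bool is_enoent(const std::system_error& e) {
    return e.code() == std::error_code(ENOENT, std::system_category());
}
```

With this shape, a std::system_error carrying ENOENT is passed through to a caller that can test it with is_enoent(), while ENOSPC or EIO would still trigger the conservative shutdown path introduced in de690a6997.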


250 Commits