scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-01 21:55:50 +00:00

Author	SHA1	Message	Date
Duarte Nunes	c19c633299	size_estimates_recorder: Increase estimate accuracy This patch uses the estimated_keys_for_range() function to get better estimates. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-10-10 17:52:16 +02:00
Duarte Nunes	ceed09b23e	sstables: Get estimates for a particular range This patch adds the estimated_keys_for_range() function, which estimates the number of keys present between the specified range. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-10-10 17:52:15 +02:00
Duarte Nunes	8c223b31c8	sstables/key: Make key::kind public Needed to create synthetic keys without any value but with ordering properties. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-10-10 17:47:24 +02:00
Avi Kivity	b305d92a65	Merge "housekeeping: check version during setup" from Amnon "The version is taken from the installation rather than the API, a mode command line indicated that this is part of the setup and uuid is used for the interaction with the checkversion server." * 'amnon/check_version_on_startup_v3' of github.com:cloudius-systems/seastar-dev: scylla_setup: Check and report the scylla version scylla-housekeeping: check version during setup	2016-10-10 16:37:14 +03:00
Vlad Zolotarov	ab748e829d	docs: tracing.md: initial commit Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Message-Id: <1475686745-20383-1-git-send-email-vladz@cloudius-systems.com>	2016-10-10 16:12:02 +03:00
Tomasz Grabiec	4357d0a6d9	db: Add counter for writes blocked on dirty memory There is already queue_length-requests_blocked_memory, but it's a gauge so does not reflect what happened between the sampling points. total_operations-requests_blocked_memory will make it possible to see if there were any (and how many) requests which were blocked by dirty memory. Message-Id: <1476098616-12682-1-git-send-email-tgrabiec@scylladb.com>	2016-10-10 14:25:22 +03:00
Pekka Enberg	3b75ff1496	docs/docker: Tag `--listen-address` as 1.4 feature The Docker Hub documentation is the same for all image versions. Tag `--listen-address` as 1.4 feature. Message-Id: <1475819164-7865-1-git-send-email-penberg@scylladb.com>	2016-10-10 13:26:16 +03:00
Vlad Zolotarov	006999f46c	api::storage_service::slow_query: don't use duration_cast in GET The slow_query_record_ttl() and slow_query_threshold() return the duration of the appropriate type already - no need for an additional cast. In addition there was a mistake in a cast of ttl. Fixes #1734 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Message-Id: <1475669400-5925-1-git-send-email-vladz@cloudius-systems.com>	2016-10-09 18:09:13 +03:00
Takuya ASADA	469e9af1f4	dist/common/scripts/scylla_setup: use 'swapon -s' instead of 'swapon --show' Since Ubuntu 14.04 doesn't support the --show option, we need to avoid using it. Fixes #1740 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1475788340-22939-2-git-send-email-syuu@scylladb.com>	2016-10-09 18:05:14 +03:00
Takuya ASADA	8452045b85	dist/ubuntu: add realpath to dependencies, required for scylla_setup We need a dependency on realpath, since scylla_setup uses it. Fixes #1740. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1475788340-22939-1-git-send-email-syuu@scylladb.com>	2016-10-09 18:05:14 +03:00
Tomasz Grabiec	41e66ebce2	gdb: Introduce 'scylla heapprof' Presents current heap profile recording. Works in text mode or dumps to collapsed stacks format from which flame graph can be generated. To generate a flamegraph: (gdb) scylla heapprof --flame Wrote heapprof.stacks $ flamegraph.pl --colors mem < heapprof.stacks > heapprof.svg flamegraph.pl comes from: https://github.com/brendangregg/FlameGraph.git Text mode example: (gdb) scylla heapprof --min 100000000 All (274699676, #10213) \-- void* memory::cpu_pages::allocate_large_and_trim<memory::cpu_pages::allocate_large_aligned(unsigned int, unsigned int)::{lambda(unsigned int, unsigned int)#1}>(unsigned int, memory::cpu_pages::allocate_large_aligned(unsigned int, unsigned int)::{lambda(unsigned int, unsigned int)#1}) + 169 (268435456, #1) memory::allocate_large_aligned(unsigned long, unsigned long) + 87 memory::allocate_aligned(unsigned long, unsigned long) + 48 aligned_alloc + 9 logalloc::segment_zone::segment_zone() + 304 logalloc::segment_pool::allocate_segment() + 477 logalloc::segment_pool::segment_pool() + 304 __tls_init.part.801 + 72 logalloc::region_group::release_requests() + 1333 logalloc::region_group::add(logalloc::region_group*) + 514 The branches are formatted like this: -- <symbol> (<size>, #<count>) Where <size> is total size of live objects and <count> is total number of live objects, for all objects allocated from paths going through this node. Nodes which share the same <size> and <count> are stacked like this: -- <symbol_1> (<size>, #<count>) <symbol_2> <symbol_3> Message-Id: <1475583334-19524-1-git-send-email-tgrabiec@scylladb.com>	2016-10-09 10:54:08 +03:00
Glauber Costa	33e9c2bbdd	memtable: reduce sstable flush concurrency to one Limiting the concurrency of memtable flushes to 4 was a temporary workaround for the fact that we lacked good write behind support. Now that write behind is properly merged we can reduce the concurrency to what it should be, one. This means that memtable flushes will now be serialized, and only when one of them ends will the next one begin. Disk parallelism is obtained through the write-behind mechanism. Fixes #1373 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <528f9ef928b5101bed952df600eb8555c275497a.1475881100.git.glauber@scylladb.com>	2016-10-09 10:48:57 +03:00
Tomasz Grabiec	2a5a90f391	db: Do not timeout streaming readers There is a limit to concurrency of sstable readers on each shard. When this limit is exhausted (currently 100 readers) readers queue. There is a timeout after which queued readers are failed, equal to read_request_timeout_in_ms (5s by default). The reason we have the timeout here is primarily because the readers created for the purpose of serving a CQL request no longer need to execute after waiting longer than read_request_timeout_in_ms. The coordinator no longer waits for the result so there is no point in proceeding with the read. This timeout should not apply for readers created for streaming. The streaming client currently times out after 10 minutes, so we could wait at least that long. Timing out sooner makes streaming unreliable, which under high load may prevent streaming from completing. The change sets no timeout for streaming readers at replica level, similarly as we do for system tables readers. Fixes #1741. Message-Id: <1475840678-25606-1-git-send-email-tgrabiec@scylladb.com>	2016-10-07 15:41:04 +03:00
Raphael S. Carvalho	9175977a9d	cql3: fix build failure by defining out unused function Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <cba6207278ea945ee750d78b189320443843a288.1475793747.git.raphaelsc@scylladb.com>	2016-10-07 08:45:18 +03:00
Avi Kivity	9ac441d3b5	range: adjust split_after to allow split_point outside input range Make split_after() more generic by allowing split_point to be anywhere, not just within the input range. If the split_point is before, the entire range is returned; and if it is after, stdx::nullopt is returned. "before" and "after" are not well defined for wrap-around ranges, but we are phasing them out and soon there will not be wrapping_range::split_after() users. This is a prerequisite for converting partition_range and friends to nonwrapping_range. Message-Id: <1475765099-10657-1-git-send-email-avi@scylladb.com>	2016-10-06 17:54:44 +02:00
Raphael S. Carvalho	7ea4513595	database: trigger compaction after loading new sstables Scylla wasn't trying to compact new sstables uploaded via 'nodetool refresh'. Thus, all new sstables were left uncompacted until user issued 'nodetool flush' or a new sstable was written which would trigger compaction too. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <bbdf274c8bb49f4bedeefcb85da78a6fb61a1232.1475535203.git.raphaelsc@scylladb.com>	2016-10-06 18:26:49 +03:00
Raphael S. Carvalho	9c59ccc52a	storage_service: improve log message for refresh 'No new SSTables were found for keyspace1.standard1' was printed if user uploaded new sstables to upload dir instead, and that is confusing. We should instead print that if new sstables weren't found in both cf and cf/upload dirs. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <90386f6255407697434213227ae7ff0de7464f99.1475535203.git.raphaelsc@scylladb.com>	2016-10-06 18:26:32 +03:00
Raphael S. Carvalho	76862d0d9c	main: start compaction procedure after commit log is replayed Commit log replay is a synchronous operation in bootstrap, so services will only be started after it's completed. By starting compaction before, less bandwidth will be available to both and consequently boot will be slowed down. Fix is simply about moving compaction, which is an asynchronous operation, to after commitlog replay is over. Fixes #1620. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <d2a173a4ee4d474317b970c6b39530e61067fea9.1475527955.git.raphaelsc@scylladb.com>	2016-10-06 18:25:24 +03:00
Nadav Har'El	ee7ec10b11	CQL parser: "CREATE MATERIALIZED VIEW" statement This patch adds the parsing for the "CREATE MATERIALIZED VIEW" statement, following Cassandra 3 syntax. For example: CREATE MATERIALIZED VIEW building_by_city AS SELECT * FROM buildings WHERE city IS NOT NULL PRIMARY KEY(city, name); It also adds the "IS NOT NULL" operator needed for this purpose. As in Cassandra, "IS NOT NULL" can only be used for materialized view creation, and not in a normal SELECT. It can only be used with the NULL operand (i.e., "IS NOT 3" will be a syntax error). The current implementation of this statement just does some sanity checking (such as to verify that "city" is a valid column name and that the "buildings" base table exists), and then complains that materialized views are not yet supported: SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] message="Failed parsing statement: [CREATE MATERIALIZED VIEW building_by_city AS SELECT * FROM buildings WHERE city IS NOT NULL PRIMARY KEY(city, name);] reason: unsupported operation: Materialized views not yet supported"> As mentioned above, the "IS NOT NULL" restriction is not allowed in ordinary selects that do not create a materialized view: SELECT * FROM buildings WHERE city IS NOT NULL; InvalidRequest: code=2200 [Invalid query] message="restriction 'city IS NOT null' is only supported in materialized view creation" Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1475742927-30695-1-git-send-email-nyh@scylladb.com>	2016-10-06 15:42:37 +03:00
Glauber Costa	7146776d7c	fix sstable tests by not using the flush_reader if no region_group The latest virtual dirty patches broke the SSTable tests. The reason for this is that those tests will flush synthetic memtables that do not have a region_group attached to it. Normally in cases like this we would just give the flush_reader an empty region group. However, the memtable class constructor takes a region_group pointer and that can be null according to the interface. So we must conditionally test it. If there isn't a region_group involved, the virtual dirty accounting should be disabled: after all, we won't even have the baseline memory to begin with. One of the approaches to fix this could be to just provide null accounter classes to be used as a surrogate for the accounting classes in this case. However, since this is mostly used for tests, a much simpler way is to just revert back to the scanning reader in that case. The scanning reader is similar enough to the flush_reader, except that it can handle partial ranges, slices, and delegate accesses to an sstable post-flush. We don't need any of that, but as argued above, there is no need to remove it either. Signed-off-by: Glauber Costa <glommer@scylladb.com> Message-Id: <1475667271-60806-1-git-send-email-glommer@scylladb.com>	2016-10-05 12:44:21 +01:00
Avi Kivity	c94fb1bf12	build: reduce inclusions of messaging_service.hh Remove inclusions from header files (primary offender is fb_utilities.hh) and introduce new messaging_service_fwd.hh to reduce rebuilds when the messaging service changes. Message-Id: <1475584615-22836-1-git-send-email-avi@scylladb.com>	2016-10-05 11:46:49 +03:00
Avi Kivity	f8118d9fc2	Merge "Virtual dirty memory management" from Glauber "Description: ============ Scylla currently suffers from a brick wall behavior of the request throttler. Requests pile up until we reach the dirty memory limit, at which point we stop serving them until we have freed enough memory to allow for more requests. The problem is that freeing dirty memory means writing an SSTable to completion. That can take a long time, even if we are blessed with great disks. Those long waiting times can and will translate into timeouts. That is bad behavior. What this patch does is introduce one form of virtual dirty memory accounting. Instead of allowing 100 % of the dirty memory to be filled up until we stop accepting requests, we will do that when we reach 50 % of memory. However, instead of releasing requests only when an SSTable is fully written, we start releasing them when some memory was written. The practical effect of that, is that once we reach 50 % occupancy in our dirty memory region, we will bring the system from CPU speed to disk speed, and will start accepting requests only at the rate we are able to write memory back. Results ======= With this patchset running a load big enough to easily saturate the disk, (commitlog disabled to highlight the effects of the memtable writer), I am able to run scylla for many minutes, with timeouts occurring only when I run out of disk space, whereas without this patch a swarm of timeouts would start merely 2 seconds after the load started - and would never get stable. In V2, I have sent a set of graphs illustrating the performance of this solution. This version does not have any significant differences in that front. 
For details, please refer to https://groups.google.com/d/msg/scylladb-dev/iCvD-3Z-QqY/EM8KUh_MAQAJ Accuracy of the accounting: --------------------------- It is important for us to be as accurate as possible when accounting freed memory, since every byte we mark as freed may allow one or more requests to be executed. I have measured the accuracy of this approach (ignoring padding, object size for the mutation fragments) to be 99.83 % of used memory in the test workload I have run (large, 65k mutations). Memtables under this circumstance tend to have a very high occupancy ratio because throttle breeds idle, and idle breeds compact-on-idle. Known Issues: ------------- A lot of time can elapse between destroying the flush_reader and actually releasing memory. The release of memory only happens when the SSTable is fully sealed, and we have to flush the files, as well as finish writing all SSTable components at this point. This happened in practice with a buggy kernel that would result in flushes taking a long time. After that is fixed, this is just a theoretical problem and in practice it shouldn't matter given the time we expect those operations to take." * 'virtual-dirty-v6' of github.com:glommer/scylla: database: allow virtual dirty memory management streamed_mutation: make _buffer private add accounting of memory read to partition_snapshot_reader move partition_snapshot_reader code to header file LSA: allow a group to query its own region group memtables: split scanning reader in two sstables: use special reader for writing a memtable LSA: export information about object memory footprint LSA: export information about size of the throttle queue database: export virtual dirty bytes region group	2016-10-04 20:57:52 +03:00
Avi Kivity	cc33c8b4ba	Merge seastar upstream * seastar 18f7bb8...f937fb0 (5): > Merge "Fix signal mask corruption" from Tomasz > core/memory: Avoid violating strict aliasing when accessing allocation sites > core/memory: Avoid indirection when storing allocation sites > core/memory: Add a way to disable abort on allocation failure in some scope > core/sharded: Allow mapper to take the service by non-const reference	2016-10-04 20:08:57 +03:00
Glauber Costa	f89a67c75c	database: allow virtual dirty memory management Scylla currently suffers from a brick wall behavior of the request throttler. Requests pile up until we reach the dirty memory limit, at which point we stop serving them until we have freed enough memory to allow for more requests. The problem is that freeing dirty memory means writing an SSTable to completion. That can take a long time, even if we are blessed with great disks. Those long waiting times can and will translate into timeouts. That is bad behavior. What this patch does is introduce one form of virtual dirty memory accounting. Instead of allowing 100 % of the dirty memory to be filled up until we stop accepting requests, we will do that when we reach 50 % of memory. However, instead of releasing requests only when an SSTable is fully written, we start releasing them when some memory was written. The practical effect of that is that once we reach 50 % occupancy in our dirty memory region, we will bring the system from CPU speed to disk speed, and will start accepting requests only at the rate we are able to write memory back. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-10-04 10:39:10 -04:00
Glauber Costa	7b6e8a2526	streamed_mutation: make _buffer private It is currently protected, but now all users go through push_mutation_fragment(). So we can safely move its visibility to guarantee that it stays that way. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-10-04 10:39:10 -04:00
Glauber Costa	1db245b52d	add accounting of memory read to partition_snapshot_reader By default, we don't do any accounting. By specializing this class and providing an accounter class, we can account for how much memory we are reading as we read through the elements. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-10-04 10:39:10 -04:00
Glauber Costa	452eb95943	move partition_snapshot_reader code to header file This is so we can template it without worrying about declaring the specializations in the .cc file. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-10-04 10:39:10 -04:00
Glauber Costa	86aa0b830d	LSA: allow a group to query its own region group Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-10-04 10:39:10 -04:00
Glauber Costa	eee15578fb	memtables: split scanning reader in two The code that is common will live in its own reader, the iterator_reader. All friendly private access to memtable attributes and methods happen through the iterator reader. After this patch, we are now left with the scanning_reader - same as always, but now implemented on top of the iterator_reader, and a flush_reader, which will be used by SSTable flushes only. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-10-04 10:39:10 -04:00
Glauber Costa	16886eeb96	sstables: use special reader for writing a memtable Right now the special reader doesn't do much, but the idea is that we will soon replace it with a reader that specializes in flush, and is in turn able to provide read-side on-flush functionality like virtual dirty. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-10-04 10:39:10 -04:00
Glauber Costa	28e3f2f6ee	LSA: export information about object memory footprint We allocate objects of a certain size, but we use a bit more memory to hold them. To get a clearer picture of how much memory an object will cost us, we need help from the allocator. This patch exports an interface that allows users to query a specific allocator to get that information. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-10-04 10:39:10 -04:00
Pekka Enberg	c3bebea1ef	dist/docker: Add '--listen-address' to 'docker run' Add a '--listen-address' command line parameter to the Docker image, which can be used to set Scylla's listen address. Refs #1723 Message-Id: <1475485165-6772-1-git-send-email-penberg@scylladb.com>	2016-10-04 13:57:55 +03:00
Marius	876775a52c	dist/docker/ubuntu: refactored $IP/listen_address In order to allow Scylla’s docker container to handle multiple network interfaces, the start-scylla script was refactored: - `$IP` is now called `$SCYLLA_LISTEN_ADDRESS`, so it is less likely to be confused or interfere with other environment variables. - `$SCYLLA_LISTEN_ADDRESS` now checks its value and also tries to resolve a hostname, if no IP was set to it. - `$SCYLLA_LISTEN_DEVICE` can now be set as environment variable and contain any available NIC device name (e.g. `eth0`). The script automatically retrieves the IP address from the device. Usage: 1. With `$SCYLLA_LISTEN_ADDRESS` as IP: `docker run -t -i --rm --name scylla -e SCYLLA_LISTEN_ADDRESS=192.168.1.100 scylladb/scylla` 2. With `$SCYLLA_LISTEN_ADDRESS` as hostname: `docker run -t -i --rm --name scylla -e SCYLLA_LISTEN_ADDRESS=containername.network.lan scylladb/scylla` 3. With `$SCYLLA_LISTEN_DEVICE`: `docker run -t -i --rm --name scylla -e SCYLLA_LISTEN_DEVICE=eth0 scylladb/scylla` Message-Id: <20161003151230.67672-1-marius@twostairs.com>	2016-10-04 13:56:55 +03:00
Raphael S. Carvalho	747b42299c	database: remove unused code Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <95e1ed590c9e45d15f19a84824a4dce05aefdab8.1475528611.git.raphaelsc@scylladb.com>	2016-10-04 09:26:43 +03:00
Paweł Dziepak	7599ef6fde	query_pager: fix splitting range at the end bound Currently, the code responsible for calculating ranges for the next request could produce a wrap-around partition range. For example, if the original range was (unimportant, A] and the last partition key A then the output range would be (A, A]. This patch adds checks to make sure that in such cases the range is removed. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1475497244-2790-1-git-send-email-pdziepak@scylladb.com>	2016-10-03 19:33:42 +02:00
Avi Kivity	8747054d10	exceptions: mark function called before construction static cassandra_exception::prepare_message() is called from derived classes' constructors before the base cassandra_exception object is constructed. This is technically illegal but harmless. Fix by marking the function static. Found by clang.	2016-10-03 16:29:02 +03:00
Calle Wilund	5b815b81b4	auth::password_authenticator: Ensure exceptions are processed in continuation Fixes #1718 (even more) Message-Id: <1475497389-27016-1-git-send-email-calle@scylladb.com>	2016-10-03 14:49:59 +02:00
Pekka Enberg	f3cd21c8f1	Merge seastar upstream * seastar 0e60722...18f7bb8 (1): > core/memory: Fix compilation errors	2016-10-03 12:54:38 +03:00
Calle Wilund	d24d0f8f90	auth::password_authenticator: "authenticate" should not throw undeclared excpt Fixes #1718 Message-Id: <1475487331-25927-1-git-send-email-calle@scylladb.com>	2016-10-03 12:53:30 +03:00
Avi Kivity	a51804eca8	Merge "token_restriction: Deal with minimum tokens" from Duarte "This patch set ensures we can correctly handle queries where the minimum token is specified." * 'min-token/v3' of github.com:duarten/scylla: cql_query_test: Add test case for min/max token bounds token_restriction: Deal with minimum tokens partitioner: Parse token from bytes	2016-10-02 12:32:40 +03:00
Avi Kivity	5071f4c0bf	Merge seastar upstream * seastar 9e1d5db...0e60722 (9): > core/memory: Replace assert with bad_alloc in allocate_large() > chunked_fifo: avoid direct use of sized operator delete > memory: fix build without heap profiler > xen: initialize port::_sem > Merge "Make input streams skippable" from Paweł > semaphore: require explicit setting for start value > prometheus: remove invalid chars from metric names > core/memory: Introduce heap profiler > util/backtrace: Mark noexcept if func() doesn't throw	2016-10-02 11:43:22 +03:00
Vlad Zolotarov	7e180c7bd3	tracing: introduce the tracing::global_trace_state_ptr class This object, similarly to a global_schema_ptr, allows to dynamically create the trace_state_ptr objects on different shards in a context of the original tracing session. This object would create a secondary tracing session object from the original trace_state_ptr object when a trace_state_ptr object is needed on a "remote" shard, similarly to what we do when we need it on a remote Node. Fixes #1678 Fixes #1647 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Message-Id: <1474387767-21910-1-git-send-email-vladz@cloudius-systems.com>	2016-10-02 11:31:37 +03:00
Amnon Heiman	a83bd900be	scylla_setup: Check and report the scylla version This patch adds a call to the scylla-housekeeping check version during setup, so a warning will be printed if a newer version is available. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2016-10-02 11:11:07 +03:00
Amnon Heiman	5e3ab32365	scylla-housekeeping: check version during setup These changes are for running scylla during setup. They contain the following changes: 1. Get the current version from the command line (as scylla does not run at this stage). 2. Support a mode parameter in the command line to indicate that we are running during the installation. 3. Accept an external uuid that will be used for all interaction with the check_version server. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2016-10-02 11:11:07 +03:00
Takuya ASADA	15b156c9d4	dist/common/scripts/scylla_io_setup: describe how to set developer mode when validation tests failed Describe how to set developer mode, not to confuse users. Fixes #1701 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1475167584-18092-1-git-send-email-syuu@scylladb.com>	2016-10-02 10:58:38 +03:00
Avi Kivity	58ddfea18f	Merge "Fixes for leveled compaction strategy" from Raphael * 'lcs_fixes' of github.com:raphaelsc/scylla: lcs: fix starvation at higher levels lcs: fix broken token range distribution at higher levels	2016-10-01 21:34:21 +03:00
Takuya ASADA	9639cc840e	dist/redhat: add missing build time dependency for libunwind A build-time dependency on libunwind was missing, so add it. Fixes #1722 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1475260099-25881-1-git-send-email-syuu@scylladb.com>	2016-09-30 21:33:39 +03:00
Takuya ASADA	c89d9599b1	dist/ubuntu: add missing build time dependency for libunwind A build-time dependency on libunwind was missing, so add it. Fixes #1721 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1475255706-26434-1-git-send-email-syuu@scylladb.com>	2016-09-30 21:33:21 +03:00
Raphael S. Carvalho	a8ab4b8f37	lcs: fix starvation at higher levels When max sstable size is increased, higher levels are suffering from starvation because we decide to compact a given level if the following calculation results in a number greater than 1.001: level_size(L) / max_size_for_level_l(L) Fixes #1720. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-30 14:09:49 -03:00
Raphael S. Carvalho	a3bf7558f2	lcs: fix broken token range distribution at higher levels Uniform token range distribution across sstables in a level > 1 was broken, because we were only choosing the sstable with the lowest first key when compacting a level > 0. This resulted in a performance problem because L1->L2 may have a huge overlap over time, for example. The last compacted key will now be stored for each level to ensure a sort of "round robin" selection of sstables for compactions at level >= 1. That's also done by C*, and they were once affected by it as described in https://issues.apache.org/jira/browse/CASSANDRA-6284. Fixes #1719. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-09-30 14:09:16 -03:00
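
The two lcs commits at the end of this log (a8ab4b8f37 and a3bf7558f2) describe a per-level score check and a "round robin" choice of the next sstable to compact. The following is only a rough Python sketch of those two ideas, not Scylla's actual C++ code. The 1.001 threshold and the ratio come from the commit message; the `SSTable` class, the integer keys, the function names, and the choice to record the picked sstable's first key as the "last compacted key" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Threshold quoted in commit a8ab4b8f37.
SCORE_THRESHOLD = 1.001


@dataclass
class SSTable:
    # Hypothetical model: keys are plain integers, sizes are bytes.
    first_key: int
    size: int


def level_score(level_size: int, max_size_for_level: int) -> float:
    """Ratio from the commit message: level_size(L) / max_size_for_level_l(L)."""
    return level_size / max_size_for_level


def needs_compaction(sstables: List[SSTable], max_size_for_level: int) -> bool:
    """A level is a compaction candidate when its score exceeds the threshold."""
    total = sum(s.size for s in sstables)
    return level_score(total, max_size_for_level) > SCORE_THRESHOLD


def pick_candidate(sstables: List[SSTable],
                   last_compacted_key: Optional[int]) -> SSTable:
    """Pick the first sstable starting after the last compacted key.

    Wraps around to the lowest first key when nothing starts after it,
    which is also what happens when no key has been recorded yet.
    """
    if not sstables:
        raise ValueError("level has no sstables")
    ordered = sorted(sstables, key=lambda s: s.first_key)
    if last_compacted_key is not None:
        for s in ordered:
            if s.first_key > last_compacted_key:
                return s
    return ordered[0]


def round_robin_picks(sstables: List[SSTable], rounds: int) -> List[int]:
    """Simulate repeated picks on one level, remembering the last key."""
    last_key: Optional[int] = None
    picks = []
    for _ in range(rounds):
        candidate = pick_candidate(sstables, last_key)
        picks.append(candidate.first_key)
        # Assumption: the picked sstable's first key stands in for the
        # "last compacted key" stored per level.
        last_key = candidate.first_key
    return picks
```

With three sstables whose first keys are 10, 20 and 30, `round_robin_picks` walks across the level and wraps around, whereas always passing `None` as the last compacted key reproduces the old behaviour of picking the lowest first key every time.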

10503 Commits