"
Main optimization is in the patch titled "lsa: Reduce amount of segment compactions".
I measured a 50% reduction in cache update run time in a steady state for an
append-only workload with a large partition, in the perf_row_cache_update version from:
c3f9e6ce1f/tests/perf_row_cache_update.cc
Other workloads and other allocation sites could probably also see an
improvement.
"
* tag 'tgrabiec/reduce-lsa-segment-compactions-v1' of github.com:tgrabiec/scylla:
lsa: Expose counters for allocation and compaction throughput
lsa: Reduce amount of segment compactions
lsa: Avoid the call to segment_pool::descriptor() in compact()
lsa: Make reclamation on reserve refill more efficient
Fixes #3339
* seastar 840002c...0a1a327 (7):
> Merge "fix perftune.py issues with cpu-masks on big machines" from Vlad
> Merge 'Handle Intel's NICs in a special way' from Vlad
> reactor: fix calculation of idle ticks
> log: streamline logging internals a little
> Merge "CMake improvements and compatibility" from Jesse
> iotune: fix typo in property name
> cmake: do not find_package(Boost ...) if Boost is a target
"
This patchset adds support for writing counter cells in the SSTables 3.x
format ('m'). The logic for writing counters is almost identical to that
used for the old 2.x format ('k'/'l'), the only difference being that the
data length preceding the serialised shards is written as a vint.
Tests: unit {release}.
Generated SSTables are verified to be processed fine by sstabledump
(note that sstabledump only outputs the binary data for counters, not
their actual values, same as sstable2json).
Verified with Cassandra 3.11 to get the expected values from the
counters table:
cqlsh> SELECT * from sst3.counter_table;
pk | ck | rc1 | rc2
-----+-----+-----+-----
key | ck1 | 10 | 1
(1 rows)
Verified that the deleted counter can no longer be updated:
cqlsh> use sst3 ;
cqlsh:sst3> UPDATE counter_table SET rc1 = rc1 + 2 WHERE pk = 'key' AND ck = 'ck2';
cqlsh:sst3> SELECT * from sst3.counter_table;
pk | ck | rc1 | rc2
-----+-----+-----+-----
key | ck1 | 10 | 1
(1 rows)
"
* 'projects/sstables-30/write_counters/v1' of https://github.com/argenet/scylla:
tests: Unit tests to cover writing counters in SSTables 3.x format.
sstables: Support writing counters for SSTables 3.x.
sstables: Move code writing counter value into a separate helper.
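The length-prefix difference described above can be illustrated with a small sketch. It uses a generic LEB128-style varint purely for illustration; Cassandra's actual vint encoding differs in its bit layout, and the 4-byte fixed prefix for the old format is an assumption here, not the exact 'k'/'l' wire format.

```python
def varint(n):
    # Illustrative unsigned LEB128 varint; NOT Cassandra's exact vint encoding.
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def write_counter_cell(shards_blob, new_format):
    # Old 'k'/'l' formats: fixed-width length prefix (width illustrative).
    # New 'm' format: the data length is written as a variable-length integer.
    if new_format:
        prefix = varint(len(shards_blob))
    else:
        prefix = len(shards_blob).to_bytes(4, "big")
    return prefix + shards_blob

shards = b"\x00" * 48                          # stand-in for serialized shards
print(len(write_counter_cell(shards, False)))  # 52 = 4-byte prefix + 48
print(len(write_counter_cell(shards, True)))   # 49 = 1-byte varint + 48
```

For small counter cells the varint prefix saves a few bytes per cell; the real win is format consistency with the rest of the 'm' serialization.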
The sstable test fails when run concurrently (for example, in release and debug
mode) because many of its tests use a static temporary dir.
Fix it by switching to a dynamic temporary dir, created using
mkdtemp(). The sstable tests also now run in /tmp, which makes them
much faster.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180516042044.15336-1-raphaelsc@scylladb.com>
Reclaiming memory through segment compaction is expensive. At an
occupancy of 85%, reclaiming one free segment requires compacting 7
segments, migrating 6 segments' worth of data. This results
in significant amplification. Compaction involves moving objects,
which in some cases is expensive in itself as well
(See https://github.com/scylladb/scylla/issues/3247).
This patch reduces the number of segment compactions in favor of doing
more eviction. It especially helps workloads in which the LRU order
matches the allocation order, in which case there is no segment
compaction at all, just eviction.
In the perf_row_cache_update test case for a large partition with lots
of rows, which simulates an appending workload, I measured that before
the patch, 2 objects had to be migrated for each new object allocated.
After the patch, only 0.003 objects are migrated. This reduces the run
time of the cache update part by 50%.
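The 85%-occupancy figures above follow from a bit of arithmetic; a minimal sketch (segment granularity and rounding are simplified):

```python
import math

def compaction_amplification(occupancy):
    """Segments compacted and live data migrated to reclaim one
    completely free segment at the given occupancy."""
    # Each compacted segment yields (1 - occupancy) of a segment's worth
    # of free space, so one free segment needs ceil(1 / (1 - occupancy)) inputs.
    compacted = math.ceil(1 / (1 - occupancy))
    migrated = compacted * occupancy  # live data that must be moved
    return compacted, migrated

compacted, migrated = compaction_amplification(0.85)
print(compacted, round(migrated))  # 7 segments compacted, ~6 migrated
```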
"
The memory estimates we have when using the chunked vector
are usually slightly wrong. We can make them more accurate by
exporting the memory usage directly as a chunked_vector API.
"
* 'chunked_memory-v2' of github.com:glommer/scylla:
large_bitset: be more accurate with memory usage
chunked_vector: exports its current memory usage
We are slightly underestimating the amount of memory we use. Now that
the chunked vector can export its internal memory usage, we can use that
directly.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
There are times when we would like to estimate how much memory
a chunked_vector is using. We have two strategies for doing so:
1) multiply the number of elements by the element size. That is wrong,
because the chunked_vector can allocate larger chunks in anticipation of
more elements to come.
2) multiply the number of chunks by 128kB. That is also wrong, because
the chunked_vector will not always allocate the entire chunk if there
are only a few elements in it.
The best way to deal with this is to allow the chunked_vector to export
its current memory usage.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
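A toy model of the two misestimates described above (the 128kB chunk size comes from the description; the doubling growth policy for the last chunk is an assumption made for illustration):

```python
CHUNK_BYTES = 128 * 1024

def toy_allocated_bytes(n, elem_size):
    # Full chunks of CHUNK_BYTES, plus a last chunk whose capacity is
    # assumed to grow by doubling (hypothetical growth policy).
    total = n * elem_size
    full_chunks, rest = divmod(total, CHUNK_BYTES)
    allocated = full_chunks * CHUNK_BYTES
    if rest:
        cap = elem_size
        while cap < rest:
            cap *= 2
        allocated += min(cap, CHUNK_BYTES)
    return allocated

n, elem = 1000, 8
print(n * elem)                      # strategy 1: 8000   (underestimates)
print(1 * CHUNK_BYTES)               # strategy 2: 131072 (overestimates)
print(toy_allocated_bytes(n, elem))  # model:      8192   (actual footprint)
```

Exporting the real figure from the container itself, as the patch does, sidesteps both errors.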
Commit 9eb8ea8b11 installed
scylla_blocktune.py as part of preparing the rpm, but forgot
to add it to the installed file list, breaking the rpm build.
Fix by listing the file in the %files section.
Message-Id: <20180506202807.5719-1-avi@scylladb.com>
"
SSTables 3.x (format 'm') use CRC32 instead of Adler32 for calculating
checksums. This patchset introduces support for CRC32 along with Adler32
in checksummed_file_writer to be used for SSTables written in 'mc'
format.
Structures and helpers introduced for CRC32 will be later used for
calculating checksums for compressed files as well (not a part of this
patchset).
Tests: unit {release}
"
* 'projects/sstables-30/write-digest-crc/v3' of https://github.com/argenet/scylla:
tests: Add test covering checksumming SSTables 3.0 with CRC32.
sstables: Support CRC32 checksum for SSTables 3.x.
sstables: Move adler32 routines under the scope of a class.
sstables: Move checksum utils into separate header.
sstables: Remove unused 'checksum_file' flag from checksummed_file_writer.
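A minimal sketch of per-chunk checksumming with both algorithms, using Python's zlib as a stand-in for the C++ writer (the chunk size here is illustrative, not the sstable's configured one):

```python
import zlib

CHUNK_SIZE = 65536  # illustrative chunk size

def checksum_chunks(data, algo):
    # 'k'/'l' sstables checksum chunks with Adler32; 'm'/'mc' use CRC32.
    fn = zlib.crc32 if algo == "crc32" else zlib.adler32
    return [fn(data[i:i + CHUNK_SIZE]) & 0xFFFFFFFF
            for i in range(0, len(data), CHUNK_SIZE)]

data = b"x" * 100000
print(len(checksum_chunks(data, "crc32")))  # 2 chunks checksummed
print(checksum_chunks(b"", "adler32"))      # [] -- no data, no checksums
```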
"
The native protocol server generates many reactor tasks that
can easily be eliminated. I measured a read workload with a 100%
cache hit rate and saw the number of tasks per request drop
from ~31 to ~27, with a 3% increase in throughput.
"
* tag 'transport-optimize-1/v1' of https://github.com/avikivity/scylla:
transport: remove unused capture of flags variable
transport: merge response write and error handling continuations
transport: make write_response() return void
transport: de-template a lambda
transport: merge memory-management and logging continuations
transport: remove gate continuation
transport: merge two response processing continuations
transport: simplify response processing continuation
transport: remove gratuitous continuation from process_request_one()
It just schedules the response, and returns immediately.
(I thought about calling it schedule_response(), but usually it will
write the response immediately, since waiting for network writes is
rare in a local network).
with_gate() generates a continuation if the protected function defers.
Avoid that by merging a gate::leave() call with another, preexisting,
continuation.
We have one continuation transforming the result, and another shutting
down tracing. Since the first cannot defer, we can merge the two, reducing
the number of tasks processed by the reactor.
This patch fixes a bug where queries using a secondary index would, in
some cases, produce the same rows multiple times.
The problem was that the code begins by finding a list of primary keys
that match the search, and then works on the partitions containing them.
If multiple rows matched in the same partition, the partition was processed
multiple times, and the same rows were output multiple times.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180510203141.17157-1-nyh@scylladb.com>
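The essence of the fix can be sketched as deduplicating partitions before reading them (hypothetical names; the real change lives in the C++ secondary-index query path):

```python
def partitions_to_read(index_matches):
    # index_matches: (partition_key, clustering_key) pairs from the index.
    # Read each partition only once, preserving first-seen order.
    seen = set()
    ordered = []
    for pk, _ck in index_matches:
        if pk not in seen:
            seen.add(pk)
            ordered.append(pk)
    return ordered

matches = [("p1", "c1"), ("p1", "c2"), ("p2", "c1")]
print(partitions_to_read(matches))  # ['p1', 'p2'] -- 'p1' read only once
```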
If we're given a single reader (which can be common for a low-write-rate
table, where most of the data will be in a single large sstable, or for
leveled tables), we can avoid the overhead of the combining reader by
returning the single input directly.
Tests: unit (release)
Message-Id: <20180513130333.15424-1-avi@scylladb.com>
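The optimization amounts to a fast path in the reader factory; a hypothetical Python stand-in for the C++ combining-reader constructor:

```python
class CombinedReader:
    # Stand-in for the merging reader; the real one interleaves
    # mutation fragments from all of its inputs.
    def __init__(self, readers):
        self.readers = readers

def make_combined_reader(readers):
    if len(readers) == 1:
        return readers[0]  # fast path: no combining overhead
    return CombinedReader(readers)

single = object()
assert make_combined_reader([single]) is single  # passed through untouched
```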
"
This patchset implements separate timeouts for range queries, and lays
the foundations for separate timeouts for other query types.
While the feature is worthy in itself, the real motivation is to have
the timeouts decided by the caller, instead of by storage_proxy. This in
turn is required to disentangle each layer from behaving differently
depending on whether the query is internal or not; instead, the goal
is to have each caller declare its needs in terms of consistency level
and timeouts, and have the lower layers implement those requirements
instead of making their own decisions.
Fixes #3013.
Tests: unit (release)
"
* tag '3013/v1.1' of https://github.com/avikivity/scylla:
storage_proxy: remove default_query_timeout()
storage_proxy: don't use default timeouts
query_options: augment with timeout_config
thrift: configure thrift transport and handler with a timeout_config
transport: configure native transport with a timeout_config
cql3: define and populate timeout_config_selector
timeout_config: introduce timeout configuration
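The shape of the change can be sketched as a per-verb timeout configuration picked by the caller (field names here are hypothetical, not the actual Scylla ones):

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass(frozen=True)
class timeout_config:
    # Hypothetical per-verb timeouts; each caller supplies its own.
    read_timeout: timedelta
    write_timeout: timedelta
    range_read_timeout: timedelta

user = timeout_config(read_timeout=timedelta(seconds=5),
                      write_timeout=timedelta(seconds=2),
                      range_read_timeout=timedelta(seconds=10))

def query_range(partition_range, cfg):
    # The timeout comes from the caller's config, not from storage_proxy.
    return cfg.range_read_timeout

print(query_range(None, user))  # 0:00:10
```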
Before we accept running while not in developer mode, we verify that
the I/O Scheduler is properly configured. Up until now, that meant
verifying that --max-io-requests is properly set and that the number
of I/O queues is enough to leave at least 4 requests per I/O queue.
Systems that move to newer versions of Scylla may continue doing that,
so we need to be backwards compatible and keep testing for that.
However, newer systems will not set that option, but pass a YAML
property file (or string) instead. So we need to make sure that
either one of those is set.
If the property file is set, I have decided not to test for the
number of I/O queues. scylla_io_setup will usually configure that
anyway, and we plan to move soon to all-shards dispatch, making
it less important.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20180509163737.5907-1-glauber@scylladb.com>
There is an ongoing discussion in issue 2678 about the right time to
release permits. Right now we release the permit after we flush all
the data for the memtable plus the SSTable's accompanying components,
flushing them, closing them, etc.
During all that time we keep increasing virtual dirty by adding more
data to the buffers, but we are not able to decrease it: until we
release the permit we can't start flushing the next memtable. This is
much more of a concern than the I/O overlapping described in the issue.
We have a hook in the SSTable write process that is (or should be)
called as soon as the data is written. We should move the permit
release there. We aren't, however, calling it as early as we could:
the data-written hook is only invoked after the Index is closed, the
Summary is sealed, etc.
This patch fixes that.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20180508182746.28310-2-glauber@scylladb.com>
Currently, reserve refill allocates segments repeatedly until the
reserve threshold is met. If a single segment allocation needs to
reclaim memory, it asks the reclaimer for one segment. The
reclaimer could make better decisions if it knew the total number of
segments we are trying to allocate. In particular, it would not attempt
to compact any segment until it had first evicted the total amount of
memory needed, which may reduce the total amount of segment compaction
during refill.
This patch changes refill to increase the reclamation step used by
allocate_segment() so that it matches the total amount of memory we
refill.
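The patch's idea can be sketched as follows (hypothetical interfaces; the real code lives in the C++ LSA segment pool): compute the whole shortfall once and hand it to the reclaimer as a single reclamation step, so it can evict in bulk before resorting to per-segment compaction.

```python
SEGMENT_SIZE = 128 * 1024  # illustrative segment size

def refill_reserve(free_segments, reserve_target, reclaim):
    # One bulk reclamation request for the entire refill, instead of
    # asking the reclaimer for one segment's worth at a time.
    shortfall = reserve_target - len(free_segments)
    if shortfall > 0:
        reclaim(shortfall * SEGMENT_SIZE)
    while len(free_segments) < reserve_target:
        free_segments.append(object())  # allocate_segment() stand-in
    return free_segments

requests = []
refill_reserve([], 4, requests.append)
print(requests)  # [524288] -- one bulk request instead of four singles
```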
We have a conflict between scylla-libgcc72/scylla-libstdc++72 and
scylla-libgcc73/scylla-libstdc++73; we need to replace the *72 packages
with the scylla-2.2 metapackage to prevent it.
Fixes #3373
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180510081246.17928-1-syuu@scylladb.com>
"Previously, the partition tombstone was not written for partitions with
no rows, causing corrupted data files.
This is now fixed and covered with tests.
In addition, we now track partition tombstones while collecting encoding
statistics."
* 'projects/sstables-30/fix-partition-tombstone/v3' of https://github.com/argenet/scylla:
tests: Don't use deprecated schema constructor.
tests: Add tests to cover partitions consisting only of partition keys.
sstables: Make sure partition level tombstone is written for partitions with no rows.
memtable: Collect statistics from partition-level tombstone.
"
This is a preparatory cleanup series with fixes/cleanups of miscellaneous
issues that I discovered while working on the stateful range-scans.
Since the stateful range-scans series, even without these patches, is a
20+ patches strong series, I'd like to fast-track this to ease reviewing
the former.
Most of the changes here are related to code hygiene and effectiveness,
and there is one patch that is correctness-related ("querier: check only
the end bound of ranges when matching them") and one that is related to
ease of use ("range: clean the deduced transformed type").
Note that although these changes were made in the context of working on
the stateful range-scans, they make sense on their own as well.
Tests: unit(release, debug)
"
* '1865/pre-range-scans-cleanup/v1' of https://github.com/denesb/scylla:
multishard_combining_reader: use optimized optional for the shard reader
Use dht::token_range alias for last/preferred replicas
storage_proxy::coordinator_query_result: merge constructors into one w/ default params
querier: check only the end bound of ranges when matching them
querier: take range and slice by value
querier: remove const params from make_compaction_state()
querier: make _range and _slice const
flat_multi_range_mutation_reader: optimize for non-plural range vectors
range: clean the deduced transformed type
"
Fixes #3420.
Tests: dtest (`auth_test.py`), unit (release)
"
* 'jhk/fix_3420/v2' of https://github.com/hakuch/scylla:
cql3: Include custom options in LIST ROLES
auth: Query custom options from the `authenticator`
auth: Add type alias for custom auth. options