scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-02 13:06:57 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	28206993a4	database: fix indentation of distributed_loader::open_sstable Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-05-22 11:52:52 -03:00
Raphael S. Carvalho	a4e414cb3b	database: reduce memory requirement to load sstables SSTable load temporarily uses more space than needed to store metadata, due to: 1) All components are read using read_simple(), which uses a 128k buffer. file::dma_read_bulk() will allocate 128k, and may potentially allocate another big buffer (128k - read) for file::read_maybe_eof(). 2) read_filter() may use double the space it needs. Because sstable loading parallelism is unlimited, Scylla may require much more memory to load all sstables, and that may lead to OOM. The higher the number of sstables, the higher the memory overhead. To confirm this problem, I wrote a test[1] which loads 30k sstables in parallel and reports the peak memory usage at the end. When loading 30k sstables, each with ~300kb of metadata, peak memory usage was ~18GB. When loading completed, only ~9GB was needed to store all the metadata. [1]: https://gist.github.com/raphaelsc/2db37b4fb34301833ab9eeed3b1a524d To fix this problem, we need to set a limit on load parallelism (let's start with a small number like 3 and adjust later if needed) and rely on readahead so that the requirement drops considerably without increasing boot time. Actually, boot time is improved by it. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Reviewed-by: Nadav Har'El <nyh@scylladb.com>	2017-05-22 11:52:51 -03:00
Raphael S. Carvalho	043fae2ef5	sstables: loads components for a sstable in parallel Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Reviewed-by: Nadav Har'El <nyh@scylladb.com>	2017-05-22 11:52:49 -03:00
Raphael S. Carvalho	0ac729fd57	sstables: enable read ahead for read of in-memory components Read ahead 4 is used. Let's adjust it later if needed. File size is used to prevent file_input_stream from issuing useless reads beyond file size with read ahead enabled. We can switch to variant without length once file_input_stream handles it properly. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-05-22 11:52:37 -03:00
Raphael S. Carvalho	77b8870cf3	sstables: make random_access_reader work with read ahead Scylla crashes if read ahead is enabled by file_random_access_reader because a call to seek() destroys the existing input stream without closing it first. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-05-22 11:52:33 -03:00
Duarte Nunes	6ac73b57fb	cql3/statements/select_statement: Remove dead code Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170522100230.17393-1-duarte@scylladb.com>	2017-05-22 14:32:12 +03:00
Avi Kivity	5828ddcca4	Merge seastar upstream * seastar 4af898c...68dbf60 (4): > dpdk: follow namespace changes to fix compile error > perftune.py: fix regression introduced in df5f74ac > doc: typo in README.md > posix_net: load-balance connections	2017-05-22 12:39:48 +03:00
Asias He	b56ba02335	gossip: Make bootstrap more robust The bootstrapping node will be a gossip-only member until the streaming finishes and the node enters the NORMAL state. If during this time the bootstrapping node is overwhelmed with streaming, it is possible the node will delay updating the gossip heartbeat. Be forgiving for the bootstrapping node and do not remove it from gossip too fast. Otherwise, streaming rpc verbs will not be resent because the node is not in gossip membership anymore. Fixes #2150 Message-Id: <286d7035d854f2a48abf4e1e2e3bfcb8b22b9ca2.1494553580.git.asias@scylladb.com>	2017-05-21 19:25:40 +03:00
Takuya ASADA	7777b558c4	dist/redhat: Use mock for CentOS/RHEL rpms Enable mock for CentOS/RHEL, also support cross building by mock. Fixes #630 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <20170513171200.14926-1-syuu@scylladb.com>	2017-05-21 19:22:54 +03:00
Avi Kivity	2f23648b9e	Revert "dist: add conflict with Cassandra" This reverts commit `da55aecca3`. Instead of an install-time conflict, we'll add a run-time conflict.	2017-05-21 18:37:59 +03:00
Alexys Jacob	c8116b4252	scylla_raid_setup: fix typo on print_usage Simple typo fix on the usage message output, the script name was not correct. Signed-off-by: Alexys Jacob <ultrabug@gentoo.org> Message-Id: <20170519145851.6205-1-ultrabug@gentoo.org>	2017-05-21 18:01:28 +03:00
Avi Kivity	5b182537db	Merge seastar upstream * seastar 8aef5f5...4af898c (4): > memory: fix debug build > tests: fix slab_test build > xen: fix fallouts from seastar namespace change > build: make swagger generated files depend on the code generator	2017-05-21 13:48:24 +03:00
Alexys Jacob	8dbad4f34a	scylla_sysconfig_setup: fix typo on print_usage Simple typo fix on the usage message output, the script name was not correct. Signed-off-by: Alexys Jacob <ultrabug@gentoo.org> Message-Id: <20170519143227.2741-1-ultrabug@gentoo.org>	2017-05-21 13:41:43 +03:00
Alexys Jacob	c0756d97b8	scylla_setup: fix typos on cpu scaling messages This fixes typos on CPU scaling related messages. Signed-off-by: Alexys Jacob <ultrabug@gentoo.org> Message-Id: <20170519143703.3574-1-ultrabug@gentoo.org>	2017-05-21 13:41:42 +03:00
Glauber Costa	5f99158889	api: return correct values for bloom filter statistics We are currently suspecting that the bloom filter false positive ratio is not being respected. While trying to debug that, I found out that we have a more basic problem: The numbers are all meaningless, because the stats are wrong. We are accumulating by summing the ratios together. It's easy to see how this doesn't work, if we look at an example where the ratio for some CFs is zero: SST1: false = 1, total = 2, ratio = 0.5. SST2: false = 0, total = 98, ratio = 0. The real ratio in this example is 1 / (98 + 2) = 1%, but the displayed ratio will be 0.5 + 0 = 0.5. This patch will map-reduce all the sstables together keeping both numerator and denominator, yielding the right value at the end. To do that, we'll reuse the existing ratio_holder class, which already does exactly what we want. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20170518222333.16307-1-glauber@scylladb.com>	2017-05-21 13:11:22 +03:00
Avi Kivity	ebaeefa02b	Merge seastar upstream (seastar namespace) - introduced "seastarx.hh" header, which does a "using namespace seastar"; - 'net' namespace conflicts with seastar::net, renamed to 'netw'. - 'transport' namespace conflicts with seastar::transport, renamed to cql_transport. - "logger" global variables now conflict with logger global type, renamed to xlogger. - other minor changes	2017-05-21 12:26:15 +03:00
Avi Kivity	dab2783b58	Merge seastar upstream * seastar 45b718b...f726938 (2): > memory: add --mbind option to suppress warning message when running Seastar apps on container > Add support for Gentoo Linux irqbalance configuration detection.	2017-05-20 21:15:46 +03:00
Avi Kivity	c8cb3d6ff5	Merge "Materialized views: bug fixes and unit tests" from Duarte "This series fixes bugs related to materialized views, most pertaining to column filtering in the where clause." * 'materialized-views/bug-fixes/v1' of https://github.com/duarten/scylla: tests/view_schema_test: Add more test cases tests/cql_assertions: Add assertion for row set equality single_column_relation: Correctly print IN relation statement_restrictions: Allow filtering regular columns for views statement_restrictions: Relax clustering restrictions for views statement_restrictions: Relax partition restrictions for views cql3/statements: Prevent setting default ttl on view cql3/restrictions: Complete implementation of is_satisfied_by() db/view: Re-implement clustering_prefix_matches() db/view: Re-implement partition_key_matches() db/view: Generate regular tombstone for base deletions db/view: Consider cell liveness when generating updates db/view: Don't generate view updates for static rows	2017-05-20 13:52:56 +03:00
Tomasz Grabiec	cd4d15672b	utils: estimated_histogram: Fix clear() It was a no-op. It doesn't seem currently used, but I will have a use for it soon. Message-Id: <1495198172-1969-1-git-send-email-tgrabiec@scylladb.com>	2017-05-19 14:34:34 +01:00
Paweł Dziepak	c560cf9d9d	Merge "fixes and improvements in the permissions cache implementation" from Vlad "There are numerous issues in the current implementation of permissions cache starting from the logical errors and bugs and ending with the suboptimal implementation described in the issue #2262." * 'permissions_cache_fixes-v4' of github.com:scylladb/seastar-dev: utils::loading_cache: avoid the reads storm when the key is not in the cache utils::loading_cache: cleanup utils::loading_cache: align the constraints in the constructor with the parameters description utils::loading_cache: refresh in the background auth::auth: add operator<<() for a permission_cache key auth::auth::permissions_cache: use the values from the configuration - don't try to be smart db::config: define a saner default value for permissions_validity_in_ms	2017-05-18 13:33:05 +01:00
Vlad Zolotarov	6a63c87a9f	utils::loading_cache: avoid the reads storm when the key is not in the cache Use a mutex to serialize producers when the key is not present in the cache. Fixes #2262 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-18 07:55:48 -04:00
Tomasz Grabiec	3fc1703ccf	range: Fix SFINAE rule for picking the best do_lower_bound()/do_upper_bound() overload mutation_partition has a slicing constructor which is supposed to copy only the rows from the query range. The rows are located using nonwrapping_range::lower_bound() and nonwrapping_range::upper_bound(). Those two have two different implementations chosen with SFINAE. One uses std::lower_bound(), and the other uses the container's built-in lower_bound(), should it exist. We're using an intrusive tree in mutation_partition, so the container's lower_bound() is preferred. It's O(log N) whereas std::lower_bound() is O(N), because the tree's iterator is not random access. However, the current rule for picking the container's lower_bound() never triggers, because lower_bound() has two overloads in the container: ./range.hh:618:14: error: decltype cannot resolve address of overloaded function typename = decltype(&std::remove_reference<Range>::type::upper_bound)> ^~~~~~~~ As a result, the overload which uses std::lower_bound() is used. Spotted when running perf_fast_forward with the wide partition limit in cache lifted. It's so slow that I timed out waiting for the result (> 16 min). Fixes #2395. Message-Id: <1495048614-9913-1-git-send-email-tgrabiec@scylladb.com>	2017-05-18 13:28:10 +03:00
Avi Kivity	ba31619594	tests: fix partitioner_test for g++ 5 It can't make the leap from dht::ring_position to stdx::optional<range_bound<dht::ring_position>> for some reason.	2017-05-18 13:09:41 +03:00
Pekka Enberg	30b5933db2	Merge "Add Gentoo Linux support to utility and setup scripts" from Alexys "These patches add support to setup and operate ScyllaDB on Gentoo Linux. * scylla_setup and related scripts * node_health_check I have kept them as simple as possible and tested them to successfully set up and operate a three-node cluster running on Gentoo Linux." * 'gentoo_linux_support' of github.com:ultrabug/scylla: scylla_setup: add gentoo linux installation detection prometheus node_exporter install: add support for gentoo linux raid setup: add support for gentoo linux ntp setup: add support for gentoo linux kernel check: add support for gentoo linux cpuscaling setup: add support for gentoo linux coredump setup: add support for gentoo linux detect gentoo linux on selinux setup add gentoo_variant detection and SYSCONFIG setup	2017-05-18 09:41:13 +03:00
Vlad Zolotarov	1ef22f84c1	utils::loading_cache: cleanup - Fix a callback signature: receive a const ref. - White spaces. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-17 15:03:14 -04:00
Vlad Zolotarov	87ce0b2d47	utils::loading_cache: align the constraints in the constructor with the parameters description According to the description of permissions_validity_in_ms, the permissions_cache is enabled if this value is set to a non-zero value. Otherwise the permissions_cache is disabled. According to the permissions_update_interval_in_ms description, it must have a non-zero value if permissions_cache is enabled. The permissions_cache_max_entries description doesn't explicitly state it, but it makes no sense to allow it to be zero if permissions_cache is enabled. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-17 15:03:14 -04:00
Vlad Zolotarov	e286828472	utils::loading_cache: refresh in the background This patch changes the way a loading_cache works. Before this patch: 1) If a permissions key is not in the cache it's loaded in the foreground and the original query is blocked till the permissions are loaded. 2) Every _period the timer does the following: a) If a value was loaded more than _expiry time ago it is removed from the cache. b) If the cache is too big, the less recently loaded values are removed till the cache fits the requested size. After this patch: 1) If a permissions key is not in the cache it's loaded in the foreground and the original query is blocked till the permissions are loaded. 2) Every _period the timer does the following: a) If a value in the cache was last loaded or read more than _expiry time ago, it's removed from the cache. b) If the cache is too big, the less recently read values are removed till the cache fits the requested size. c) The values that were loaded more than _refresh time ago are re-read in the background. The new implementation reduces the number of foreground reads for a frequently used value to a single event (when the value is loaded for the first time). It also ensures we do not reload values we no longer need. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-17 15:03:06 -04:00
Alexys Jacob	fa0944ac19	scylla_setup: add gentoo linux installation detection	2017-05-17 18:06:54 +02:00
Alexys Jacob	9bb1bda466	prometheus node_exporter install: add support for gentoo linux	2017-05-17 18:06:34 +02:00
Alexys Jacob	1d235e5012	raid setup: add support for gentoo linux	2017-05-17 18:06:14 +02:00
Alexys Jacob	fdd5944ab2	ntp setup: add support for gentoo linux	2017-05-17 18:05:59 +02:00
Alexys Jacob	412f96a1bf	kernel check: add support for gentoo linux	2017-05-17 18:05:45 +02:00
Alexys Jacob	a198f2b1af	cpuscaling setup: add support for gentoo linux	2017-05-17 18:05:24 +02:00
Alexys Jacob	6a1807a7d8	coredump setup: add support for gentoo linux	2017-05-17 18:05:08 +02:00
Alexys Jacob	bc63e501db	detect gentoo linux on selinux setup	2017-05-17 18:04:20 +02:00
Vlad Zolotarov	4edb336ac5	auth::auth: add operator<<() for a permission_cache key Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-17 12:03:56 -04:00
Vlad Zolotarov	d780818cac	auth::auth::permissions_cache: use the values from the configuration - don't try to be smart Our configuration already has the default values for the permission cache parameters. Therefore, if the user decides to give some bad parameters, we'd rather fail the load and inform him/her about the bad parameters instead of trying to silently "fix" them. In addition, the original code wasn't passing the parameters correctly: it switched the "expiry" and "refresh" parameters in the utils::loading_cache constructor. Add to this that the original code was doing really strange things in the permission_cache::expiry(cfg) method. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-17 12:03:56 -04:00
Vlad Zolotarov	ea1cfabe28	db::config: define a saner default value for permissions_validity_in_ms It makes little sense to have the same value for permissions_update_interval_in_ms and permissions_validity_in_ms. This may cause the values to be invalidated only because of some minor delays in the timer scheduling. It makes a lot more sense to make the permissions_update_interval_in_ms value smaller than permissions_validity_in_ms. This way we would minimize the chances of "false invalidation" due to some small delays in the timer scheduling. In addition, 2s seems to be too small a value for permissions_validity_in_ms since our default read_request_timeout_in_ms is 5s. This means that a single system_auth read failure would guarantee that the following queries are going to read system_auth data in the foreground. Setting it to 10s would allow a second read attempt before we enforce the foreground read. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-17 12:03:56 -04:00
Alexys Jacob	2ca0380d06	add gentoo_variant detection and SYSCONFIG setup	2017-05-17 18:03:53 +02:00
Avi Kivity	2aa5b3e20c	Merge "Improve perf_fast_forward test" from Tomasz "Notably: - add validation of the results (e.g. fragment count, expectations about disk activity) - add cache-specific tests" * 'tgrabiec/add-cache-tests-to-perf-fast-forward' of github.com:cloudius-systems/seastar-dev: tests: perf_fast_forward: Report cache stats row_cache: Keep counters in a struct tests: perf_fast_forward: Add cache-specific tests tests: perf_fast_forward: Extract test_reading_all() tests: perf_fast_forward: Add validation of the results tests: perf_fast_forward: Fix partition scans to read the expected amount of fragments tests: perf_fast_forward: Allow the test to be interrupted tests: perf_fast_forward: Allow testing with cache enabled row_cache: Implement mutation_reader::fast_forward_to() for cache scanner	2017-05-17 18:06:02 +03:00
Calle Wilund	29b20d410a	schema_tables: Remove "class" attribute from strategy options Not 100% proper, but in line with how we still store the info. Ensures (helps at least) to keep schema loaded from tables and schema from builder comparable. Fixes schema_changes_test error. Message-Id: <1495030581-2138-2-git-send-email-calle@scylladb.com>	2017-05-17 17:56:11 +03:00
Calle Wilund	6ca07f16c1	scylla: fix compilation errors on gcc 5 Message-Id: <1495030581-2138-1-git-send-email-calle@scylladb.com>	2017-05-17 17:56:06 +03:00
Paweł Dziepak	3ecceaee48	Merge "Fix fast_forward_to() on sstable reader being ignored in some cases" from Tomasz "When mutation reader enters the partition using index, streamed_mutation object is returned to the user before the row start fragment is processed. In that case, when we process the row start, we should ignore it and not call setup_for_partition() again. That may override user's fast_forward_to() request." * 'tgrabiec/fix-initial-fast-forward-to-for-single-key-sstable-readers' of github.com:scylladb/seastar-dev: tests: mutation_source_test: Test forwarding in single-key readers sstables: Remove unused code sstables: mutation_reader: Fix setup_for_partition() being called twice in some cases sstables: Fix verify_end_state() to tolerate ATOM_START_2 state	2017-05-17 15:35:30 +01:00
Avi Kivity	eb69fe78a4	Merge "Adding private repository to housekeeping" from Amnon "This series adds private repository support to scylla-housekeeping" * 'amnon/housekeeping_private_repo_v3' of github.com:cloudius-systems/seastar-dev: scylla-housekeeping service: Support private repositories scylla-housekeeping-upstart: Use repository id, when checking for version scylla-housekeeping: support private repositories	2017-05-17 15:56:46 +03:00
Tomasz Grabiec	777ffa3a27	tests: perf_fast_forward: Report cache stats	2017-05-17 14:15:14 +02:00
Tomasz Grabiec	d1bde3036e	row_cache: Keep counters in a struct So that taking a snapshot of all stats is easy.	2017-05-17 14:15:14 +02:00
Tomasz Grabiec	7a81f5e980	tests: perf_fast_forward: Add cache-specific tests	2017-05-17 14:15:14 +02:00
Tomasz Grabiec	1a7b03004a	tests: perf_fast_forward: Extract test_reading_all()	2017-05-17 14:15:14 +02:00
Tomasz Grabiec	a38fd16f89	tests: perf_fast_forward: Add validation of the results	2017-05-17 14:15:14 +02:00
Tomasz Grabiec	3c3ea51657	tests: perf_fast_forward: Fix partition scans to read the expected amount of fragments make_pkeys() needs to be invoked with n equal to the number of keys which the table was populated with. Otherwise the extra keys, which are missing in the table, may be placed anywhere in the vector due to ring order sorting, and break the assumption that the table contains all keys from the array up to index n. This resulted in the test reading slightly fewer fragments than the desired count would imply. Another problem is that we should not skip the fast_forward_to() call for the initial range (workaround for a bug in sstable mutation reader), otherwise we will read slightly less than expected as well.	2017-05-17 14:15:14 +02:00
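
The arithmetic in commit 5f99158889 (bloom filter statistics) can be checked with a short C++ sketch. This is not Scylla's code: filter_counts, sum_of_ratios and combined_ratio are hypothetical names used only for illustration, and the commit itself reuses the existing ratio_holder class, whose interface is not shown in the log. The sketch contrasts summing per-sstable ratios with accumulating the numerator and denominator and dividing once at the end.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical per-sstable counters (illustration only, not Scylla's types).
struct filter_counts {
    std::uint64_t false_positives;
    std::uint64_t total;
};

// The flawed aggregation described in the commit message:
// per-sstable ratios are simply added together.
double sum_of_ratios(const std::vector<filter_counts>& ssts) {
    double acc = 0.0;
    for (const auto& s : ssts) {
        if (s.total > 0) {
            acc += static_cast<double>(s.false_positives) / static_cast<double>(s.total);
        }
    }
    return acc;
}

// The approach the commit describes: keep numerator and denominator
// separately while reducing, and divide only once at the end.
double combined_ratio(const std::vector<filter_counts>& ssts) {
    std::uint64_t fp = 0;
    std::uint64_t total = 0;
    for (const auto& s : ssts) {
        fp += s.false_positives;
        total += s.total;
    }
    return total > 0 ? static_cast<double>(fp) / static_cast<double>(total) : 0.0;
}
```

With the two sstables from the commit message (1 false positive out of 2, and 0 out of 98), sum_of_ratios yields 0.5 while combined_ratio yields 0.01, i.e. 1%.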


12077 Commits