Some gcc versions incorrectly complain:
tests/log_histogram_test.cc:87:22: error: ‘opts1’ is not a valid template argument for type ‘const log_histogram_options&’ because object ‘opts1’ has not external linkage
size_t hist_key<node<opts1>>(const node<opts1>& n) { return n.v; }
Apparently this is a bug in gcc:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52036
Fixes #2307.
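For reference, a minimal reduction of the pattern (the struct member and the
extern-declaration workaround are assumptions for illustration, not the actual
test code):

    #include <cstddef>

    struct log_histogram_options { unsigned min_size; };

    template <const log_histogram_options& Opts>
    struct node { std::size_t v; };

    // A namespace-scope const object has internal linkage; C++11 permits
    // it as a reference template argument, but affected gcc versions
    // reject it with the error quoted above. Declaring the object extern
    // gives it external linkage and sidesteps the bug.
    extern const log_histogram_options opts1;
    const log_histogram_options opts1{1};

    std::size_t hist_key(const node<opts1>& n) { return n.v; }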
Message-Id: <1493108791-11247-1-git-send-email-tgrabiec@scylladb.com>
"This series fixes some more errors found by clang, with the aim of enabling
clang/zapcc as a supported compiler. A single issue remains, but it's
probably in std::experimental::optional::swap(); not in our code."
* tag 'clang/2/v1' of https://github.com/avikivity/scylla:
sstable_test: avoid passing negative non-type template arguments to unsigned parameters
UUID: add more comparison operators
sstable_datafile_test: avoid string_view user-defined literal conversion operator
mutation_source_test: avoid template function without template keyword
cql_query_test: define static variable
cql_query_test: add braces for single-item collection initializers
storage_service: don't use typeid(temporary)
logalloc: remove unused max_occupancy_for_compaction
storage_proxy: drop overzealous use of __int128_t in recently-modified-no-read-repair logic
storage_proxy: drop unused member access from return value
storage_proxy: fix reference bound to temporary in data_read_resolver::less_compare
read_repair_decision: fix operator<<(std::ostream&, ...)
Fixes the following error in "scylla segment-descs" and a similar one in "scylla lsa-segment":
Traceback (most recent call last):
File "scylla-gdb.py", line 530, in invoke
gdb.write('0x%x: lsa free=%d region=0x%x zone=0x%x\n' % (addr, desc['_free_space'], desc['_region'], desc['_zone']))
TypeError: %x format: an integer is required, not gdb.Value
Message-Id: <1493029465-6482-1-git-send-email-tgrabiec@scylladb.com>
Every lsa-allocated object is prefixed by a header that contains information
needed to free or migrate it. This includes its size (for freeing) and
an 8-byte migrator pointer (for migrating). Together with some flags, the overhead
is 14 bytes (16 bytes if the default alignment is used).
This patch reduces the header size to 1 byte (8 bytes if the default alignment
is used). It uses the following techniques:
- ULEB128-like encoding (actually more like ULEB64) so a live object's header
can typically be stored using 1 byte (see the sketch after this list)
- indirection, so that migrators can be encoded in a small index pointing
to a migrator table, rather than using an 8-byte pointer; this exploits
the fact that only a small number of types are stored in LSA
- moving the responsibility for determining an object's size to its
migrator, rather than storing it in the header; this exploits the fact
that the migrator stores type information, and object size is in fact
information about the type
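For illustration, a generic ULEB128-style encoder/decoder in the spirit of the
first technique (a sketch of the general approach only; the actual descriptor
format is still to be decided, per the note below):

    #include <cstddef>
    #include <cstdint>

    // Encode 7 bits per byte, low bits first; the top bit of each byte
    // marks whether another byte follows. Values below 128 -- i.e. most
    // live-object sizes -- take a single byte.
    std::size_t uleb_encode(uint64_t value, uint8_t* out) {
        std::size_t n = 0;
        do {
            uint8_t byte = value & 0x7f;
            value >>= 7;
            out[n++] = byte | (value ? 0x80 : 0);
        } while (value);
        return n;
    }

    uint64_t uleb_decode(const uint8_t* in, std::size_t& consumed) {
        uint64_t value = 0;
        unsigned shift = 0;
        std::size_t n = 0;
        uint8_t byte;
        do {
            byte = in[n++];
            value |= uint64_t(byte & 0x7f) << shift;
            shift += 7;
        } while (byte & 0x80);
        consumed = n;
        return value;
    }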
The patch improves the results of memory_footprint_test as follows:
Before:
- in cache: 976
- in memtable: 947
After:
mutation footprint:
- in cache: 880
- in memtable: 858
A reduction of about 10%. Further reductions are possible by reducing the
alignment of lsa objects.
logalloc_test was adjusted to free more objects: with the lower
footprint, rounding (to full segments) works out differently and caused
spurious failures to be detected.
Missing: adjustments to scylla-gdb.py; will be done after we agree on the
new descriptor's format.
We both move names_ to its destination and call names_.size() in the same
expression; the evaluation order is unspecified, and the code fails with clang.
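A minimal sketch of the pattern (the surrounding class and the consumer are
hypothetical, not the actual code):

    #include <cstddef>
    #include <string>
    #include <utility>
    #include <vector>

    void consume(std::vector<std::string> names, std::size_t count) {
        // stand-in for wherever names_ was being moved to
        (void)names;
        (void)count;
    }

    struct holder {
        std::vector<std::string> names_;

        void release() {
            // Buggy: function arguments are evaluated in unspecified
            // order, so names_.size() may run after names_ has already
            // been moved from:
            //   consume(std::move(names_), names_.size());

            // Fixed: read the size before moving.
            auto n = names_.size();
            consume(std::move(names_), n);
        }
    };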
With this patch as well as the clang build fixes, Scylla starts and is
able to serve requests (light cassandra-stress load).
Message-Id: <20170423121727.1948-1-avi@scylladb.com>
This patch fixes a failure of virtual_reader_test, where both the test
itself and the cql_test_env initialize the messaging_service to listen
on the same address and port, triggering an assert in
posix_ap_server_socket_impl::accept().
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170423104240.21275-1-duarte@scylladb.com>
Clang warns that the expression will be evaluated (doh). While the warning
seems dubious, keep it and change the code to call the function outside
typeid(), in case it does help someone one day.
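A sketch of the warning and the rewrite (hypothetical snitch types; clang's
diagnostic is "expression with side effects will be evaluated despite being
used as an operand to 'typeid'"):

    #include <iostream>
    #include <typeinfo>

    struct snitch_base { virtual ~snitch_base() = default; };
    struct simple_snitch : snitch_base {};

    snitch_base& local_snitch() {
        static simple_snitch s;
        return s;
    }

    int main() {
        // Before: typeid(local_snitch()) -- the operand is a polymorphic
        // glvalue, so the call is evaluated, and clang warns about it.
        // After: hoist the call out of typeid().
        auto& snitch = local_snitch();
        std::cout << typeid(snitch).name() << '\n';
    }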
Clang's std::abs() doesn't support __int128_t, so use __int64_t instead. With
this change, it's possible that a read repair 252,700 years after a write
will be interpreted as a recent write and the read repair will incorrectly
be skipped; hopefully by that time __int128_t will be standardized.
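A sketch of the kind of computation involved (the function and parameter
names are hypothetical; timestamps in microseconds):

    #include <cstdint>
    #include <cstdlib>

    // Decide whether a write is recent enough to skip read repair.
    // int64_t only overflows once the timestamps are hundreds of
    // millennia apart, which we accept per the note above.
    bool write_is_recent(int64_t write_ts_us, int64_t now_us, int64_t window_us) {
        return std::abs(now_us - write_ts_us) < window_us;
    }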
Argument-dependent lookup requires that the operator be declared in the
same namespace as the class; move it there.
While at it, de-static it, since the static linkage only causes bloat.
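A sketch of the rule (the enumerators here are assumptions for illustration):

    #include <ostream>

    namespace db {

    enum class read_repair_decision { NONE, GLOBAL, DC_LOCAL };

    // Declared in the same namespace as the type, so that
    // argument-dependent lookup finds it; inline rather than static,
    // so all translation units share one definition.
    inline std::ostream& operator<<(std::ostream& os, read_repair_decision d) {
        switch (d) {
        case read_repair_decision::NONE:     return os << "NONE";
        case read_repair_decision::GLOBAL:   return os << "GLOBAL";
        case read_repair_decision::DC_LOCAL: return os << "DC_LOCAL";
        }
        return os;
    }

    } // namespace db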
"Currently, a shared sstable is rewritten at all shards it belongs to, and only
after that, it's deleted.
This new algorithm adds the ability to reshard a set of sstables together at a
single shard and produce unshared sstables for all shards involved.
That's important for the leveled compaction strategy issue, in which the number
of sstables grows considerably after resharding. What happened is that every
sstable was being split into N ones, so we could end up with tons of small
sstables. Now, we will reshard together a set of adjacent sstables."
* 'sstable_resharding_revamp_v9' of github.com:raphaelsc/scylla:
tests: add test for new sstable resharding
database: kill column_family::start_rewrite
database: wire up new resharding algorithm
database: implement new sstable resharding algorithm
database: introduce function to replace new sstables by their ancestors
prevent regular compaction from choosing shared sstables
compaction_strategy: implement resharding strategy for compaction strategies
sstables: store more info in foreign_sstable_open_info
sstables: make it possible to get open info from loaded sstable
database: export column family dir
database: inform if column family has shared tables
sstables: add method to export ancestors
lcs: implement get_level_count
compaction_manager: introduce method to check if manager stopped
lcs: restore invariant instead of sending overlapping sst to L0
sstables: extend compaction for new resharding
sstables: allow shard A to correctly create sstable for shard B
compaction: rework compacting_sstable_writer to work with multiple writers
compaction: prepare compacting_sstable_writer to work with writers
sstables: rework compaction to make it easy to extend
NOTE: it's not wired yet.
Currently, a shared sstable is rewritten at all shards it belongs
to and only after that, it's deleted. With this new algorithm, a
shared sstable will be read only once and N unshared sstables
will be created, each of them with 1/N of the data. After it's
done, each owner shard will receive its new unshared sstable
replacing its ancestors.
Another benefit is that resharding will no longer result in the
number of sstables growing considerably.
A full-sized leveled sstable is usually 160MB, so resharding a
single one could leave N files of 160MB/N. Now, leveled strategy
will help resharding: N adjacent sstables of the same level will
be resharded together, so we'll end up with N files of
N*160MB/N = 160MB each.
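A hedged sketch of the read-once, write-N idea (stand-in types, nothing like
the actual sstable reader/writer interfaces):

    #include <cstdint>
    #include <vector>

    struct row { uint64_t token; };

    // Stand-in for the real sharding function.
    unsigned shard_of(uint64_t token, unsigned shard_count) {
        return token % shard_count;
    }

    // Read the shared input once; route each row to the output of the
    // shard owning its token, producing one unshared sstable per shard.
    std::vector<std::vector<row>> reshard(const std::vector<row>& shared_input,
                                          unsigned shard_count) {
        std::vector<std::vector<row>> outputs(shard_count);
        for (const auto& r : shared_input) {
            outputs[shard_of(r.token, shard_count)].push_back(r);
        }
        return outputs;
    }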
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
When resharding, we're working with sstables from all shards. Say
we're done resharding sstable A, which belongs to shards 0 and 1, and
sstable B, which belongs to shards 1 and 2. Sstables were generated for
shards 0, 1, and 2, so shards 0, 1, and 2 need to load the new sstables
and remove the ancestors. Shard 1, for example, will remove sstables A
and B (its ancestors) and add the new one. That's where this new
function comes in.
We'll forward new sstables to their target shards using foreign sstable
open info.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
For new resharding, it's important to exclude sstables undergoing
resharding from the list of candidates for regular compaction. This
doesn't affect current resharding, which marks the sstables as
compacting; that approach won't work for new resharding, which
operates on sstables from multiple shards.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Strategies other than leveled will reshard one shared sstable at
a time, and the target shard (the shard at which the job will run)
will be chosen for each job in a round-robin fashion.
For leveled strategy, we will reshard together smp::count adjacent
sstables that belong to the same level. The reason is that
resharding one sstable at a time may create a file for each shard,
meaning after resharding we could end up with
NO_SSTABLES*NO_SHARDS files.
These resharding strategies will be used for our new resharding
algorithm, sketched below.
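The sketch below shows the two job-creation policies; the types and names are
hypothetical, not the actual job-building code:

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct sstable;  // opaque stand-in

    struct resharding_job {
        std::vector<sstable*> input;
        unsigned target_shard;
    };

    // Non-leveled strategies: one shared sstable per job, with the
    // target shard chosen round-robin.
    std::vector<resharding_job> reshard_one_by_one(
            const std::vector<sstable*>& shared, unsigned shard_count) {
        std::vector<resharding_job> jobs;
        unsigned next = 0;
        for (auto* sst : shared) {
            jobs.push_back({{sst}, next});
            next = (next + 1) % shard_count;
        }
        return jobs;
    }

    // Leveled strategy: group shard_count adjacent same-level sstables
    // into one job, so the per-shard outputs stay close to full size.
    std::vector<resharding_job> reshard_level(
            const std::vector<sstable*>& level_ssts, unsigned shard_count) {
        std::vector<resharding_job> jobs;
        unsigned next = 0;
        for (std::size_t i = 0; i < level_ssts.size(); i += shard_count) {
            auto end = std::min(level_ssts.size(), i + shard_count);
            jobs.push_back({std::vector<sstable*>(level_ssts.begin() + i,
                                                  level_ssts.begin() + end),
                            next});
            next = (next + 1) % shard_count;
        }
        return jobs;
    }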
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
We need that info for opening an sstable at a different shard, unlike
the sstable loader, which has everything in entry_descriptor, obtained
from the components in the sstable filename.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
It will be useful for resharding, which will need to move an
sstable across shards. To do that without reloading the
sstable at the target shard, we need to be able to get the open
info and move it to the target shard instead.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
That's going to be useful for quickly determining whether it's worth
resharding a column family.
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
An sstable spanning a large token range may find its way into a high level due
to resharding, which means the strategy invariant is broken. The invariant is
restored by compacting the first set of overlapping sstables, meaning that the
restoration is done incrementally across multiple overlapping sets.
The invariant is restored by regular compaction after resharding puts new
unshared sstables into their original level, for levels > 0.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Extends compaction for new resharding algorithm. Not wired yet.
New resharding will compact shared sstable(s) and create one
sstable for each owner. It's up to the caller to open these
new unshared sstables at their respective column families.
This new approach will save a lot of bandwidth because we'll
no longer read the entire shared sstable #smp::count times.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
That's made possible by shard A explicitly saying that the sstable is
created for shard B. If we don't do that, the sharding metadata isn't
correct, and consequently the sstable will report the wrong owners.
We'll need this for resharding which will create sstables for all
shards that own the shared sstable.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
compacting_sstable_writer so far allowed only one writer, but we will need
multiple ones for resharding.
This is done by moving writer management into compaction.
finish_sstable_writer() is added for the compaction impl to stop all writers,
whereas stop_sstable_writer() only stops the current writer (needed, for
example, when the current sstable reaches its maximum size limit).
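A hedged sketch of the split (hypothetical classes and signatures, not the
actual interfaces):

    #include <memory>
    #include <vector>

    struct sstable_writer {
        void close() {}
    };

    class compaction {
        // one writer per output sstable currently being written
        std::vector<std::unique_ptr<sstable_writer>> _writers;
    public:
        sstable_writer& writer_for(unsigned output) {
            if (_writers.size() <= output) {
                _writers.resize(output + 1);
            }
            if (!_writers[output]) {
                _writers[output] = std::make_unique<sstable_writer>();
            }
            return *_writers[output];
        }
        // Stop a single writer, e.g. when its sstable reaches the
        // maximum size limit; a fresh one can be opened for that output.
        void stop_sstable_writer(unsigned output) {
            if (output < _writers.size() && _writers[output]) {
                _writers[output]->close();
                _writers[output].reset();
            }
        }
        // Stop all remaining writers once compaction has consumed all
        // of its input.
        void finish_sstable_writer() {
            for (unsigned o = 0; o < _writers.size(); ++o) {
                stop_sstable_writer(o);
            }
        }
    };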
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
No need for compacting_sstable_writer to store items that are available
in compaction class. Also, that's a step towards supporting multiple
writers for compaction.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
compact_sstables() supported both regular and cleanup compaction,
but with lots of conditions that made it ugly and hard to extend.
In the future, we want to introduce a new type of compaction for
resharding that will create one sstable for every shard owning
the sstable(s) given as input. That will be easier now.
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
* seastar 2eec212...194d80f (4):
> removing the collectd tests
> fix fstream metrics reporting.
> do_for_each: Make it check for need preempt
> core/sharded: introduce copy method to foreign_ptr
"Currently eviction is performed until occupancy of the whole region
drops below the 85% threshold. This may take a while if region had
high occupancy and is large. We could improve the situation by only
evicting until occupancy of the sparsest segment drops below the
threshold, as is done by this change.
I tested this using a c-s read workload in which the condition
triggers in the cache region, with 1G per shard:
lsa-timing - Reclamation cycle took 12.934 us.
lsa-timing - Reclamation cycle took 47.771 us.
lsa-timing - Reclamation cycle took 125.946 us.
lsa-timing - Reclamation cycle took 144356 us.
lsa-timing - Reclamation cycle took 655.765 us.
lsa-timing - Reclamation cycle took 693.418 us.
lsa-timing - Reclamation cycle took 509.869 us.
lsa-timing - Reclamation cycle took 1139.15 us.
The 144ms pause is when large eviction is necessary.
Statistics for reclamation pauses for a read workload over
larger-than-memory data set:
Before:
avg = 865.796362
stdev = 10253.498038
min = 93.891000
max = 264078.000000
sum = 574022.988000
samples = 663
After:
avg = 513.685650
stdev = 275.270157
min = 212.286000
max = 1089.670000
sum = 340573.586000
samples = 663
Refs #1634."
* tag 'tgrabiec/lsa-reduce-reclaim-latency-v3' of github.com:cloudius-systems/seastar-dev:
lsa: Reduce reclamation latency
tests: Add test for log_histogram
log_histogram: Allow non-power-of-two minimum values
lsa: Use regular compaction threshold in on-idle compaction
tests: row_cache_test: Induce update failure more reliably
lsa: Add getter for region's eviction function
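A hedged sketch of the new stopping condition described in the quoted cover
letter (stand-in structures, not the actual LSA code):

    #include <algorithm>
    #include <vector>

    struct segment { unsigned used_percent; };

    struct region { std::vector<segment> segments; };

    constexpr unsigned compaction_threshold = 85;

    unsigned sparsest_segment_occupancy(const region& r) {
        unsigned lowest = 100;
        for (const auto& s : r.segments) {
            lowest = std::min(lowest, s.used_percent);
        }
        return lowest;
    }

    void evict_one(region& r) {
        // Stand-in: evicting an object lowers the occupancy of the
        // segment that held it.
        for (auto& s : r.segments) {
            if (s.used_percent > 0) { --s.used_percent; return; }
        }
    }

    void reclaim(region& r) {
        // Before: evict until *whole-region* occupancy drops below the
        // threshold -- potentially a lot of work for a large, dense region.
        // After: stop as soon as the sparsest segment is compactible,
        // bounding the work of a single reclamation cycle.
        while (!r.segments.empty() &&
               sparsest_segment_occupancy(r) >= compaction_threshold) {
            evict_one(r);
        }
    }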
Message-Id: <1484730859-11969-1-git-send-email-tgrabiec@scylladb.com>
We will want to reuse the min_size mechanism for the whole compaction
threshold, including the occupancy threshold. That threshold is close
to the segment size and we cannot pick a power of two which would be
close enough to what we need.
Therefore, change log_histogram to support arbitrary minimum base.
bucket_of() was moved into log_histogram_options so that it can be used
in number_of_buckets(), which makes for a simple and much less
error-prone implementation.
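A hedged sketch of the idea (not Scylla's actual log_histogram): with an
arbitrary min_value, bucket i covers [min_value * 2^i, min_value * 2^(i+1)),
so min_value no longer has to be a power of two.

    #include <cassert>
    #include <cstdint>

    struct log_histogram_options {
        uint64_t min_value;  // lower bound of bucket 0; any value >= 1

        // Bucket index of v; keeping this in the options struct lets
        // the same logic also size the bucket array.
        unsigned bucket_of(uint64_t v) const {
            assert(v >= min_value);
            unsigned i = 0;
            for (uint64_t bound = min_value * 2; v >= bound; bound *= 2) {
                ++i;
            }
            return i;
        }
    };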
Idle-time compaction should not produce non-compactible segments,
because that means we would have to evict a lot when we finally need
to reclaim some memory, just to bring occupancy below the regular
compaction threshold. This may cause latency spikes.
Refs #1634.
After changing the region evictability condition to be less strict, the cache
update stopped failing, because reclamation was able to compact the dense
region. Induce the failure by installing an evictor which refuses to evict
more than a few elements from the cache.