scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-23 16:22:15 +00:00

Author	SHA1	Message	Date
Pekka Enberg	815c91a1b8	service/storage_service: Add feature flag for secondary indices	2017-05-04 14:59:11 +03:00
Gleb Natapov	b4c368a6bc	storage_proxy: update correct statistics on range reads Fixes #2167 Message-Id: <20170405094119.GM8197@scylladb.com>	2017-04-09 18:16:06 +03:00
Vlad Zolotarov	c26799c9b0	config: enforce the 'stop' value for commit_failure_policy/disk_failure_policy Fixes #2246 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1491246164-26612-1-git-send-email-vladz@scylladb.com>	2017-04-04 16:46:36 +03:00
Avi Kivity	27c42359bc	Merge seastar upstream * seastar 6b21197...2ebe842 (6): > Merge "Various improvements to execution stages" from Paweł > app-template: allow apps to specify a name for help message > bool_class: avoid initializing object of incomplete type > app-template: make sure we can still get help with required options > prometheus: Http handler that returns prometheus 0.4 protobuf or text format > Update DPDK to 17.02 Includes patch from Pawel to adjust to updated execution_stage interface.	2017-03-26 10:50:21 +03:00
Duarte Nunes	e215f25b11	migration_manager: Atomically migrate table and views This patch changes the migration path for table updates such that the base table mutations are sent and applied atomically with the view schema mutations. This ensures that after schema merging, we have a consistent mapping of base table versions to view table versions, which will be used in later patches. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-03-15 16:03:56 +01:00
Amnon Heiman	295a981c61	storage_proxy: metrics should have unique name Metrics should have their unique name. This patch changes throttled_writes of the queue length to current_throttled_writes. Without it, metrics will be reported twice under the same name, which may cause errors in the prometheus server. This could be related to scylladb/seastar#250 Fixes #2163. Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <20170314081456.6392-1-amnon@scylladb.com>	2017-03-14 11:19:39 +02:00
Paweł Dziepak	cfde2ad5b4	storage_proxy: make mutate() an execution stage	2017-03-09 09:27:43 +00:00
Paweł Dziepak	6db6d25f66	Merge "Avoid losing changes to keyspace parameters of system_auth and tracing keyspaces" from Tomek "If a node is bootstrapped with auto_bootstrap disabled, it will not wait for schema sync before creating global keyspaces for auth and tracing. When such schema changes are then reconciled with schema on other nodes, they may overwrite changes made by the user before the node was started, because they will have higher timestamp. To prevent that, let's use minimum timestamp so that default schema always loses to manual modifications. This is what Cassandra does. Fixes #2129." * tag 'tgrabiec/prevent-keyspace-metadata-loss-v1' of github.com:scylladb/seastar-dev: db: Create default auth and tracing keyspaces using lowest timestamp migration_manager: Append actual keyspace mutations with schema notifications	2017-03-08 10:59:47 +00:00
Tomasz Grabiec	06d4ad1bdd	migration_manager: Append actual keyspace mutations with schema notifications There is a workaround for notification race, which attaches keyspace mutations to other schema changes in case the target node missed the keyspace creation. Currently that generated keyspace mutations on the spot instead of using the ones stored in schema tables. Those mutations would have current timestamp, as if the keyspace has been just modified. This is problematic because this may generate an overwrite of keyspace parameters with newer timestamp but with stale values, if the node is not up to date with keyspace metadata. That's especially the case when booting up a node without enabling auto_bootstrap. In such case the node will not wait for schema sync before creating auth tables. Such table creation will attach potentially out of date mutations for keyspace metadata, which may overwrite changes made to keyspace parameters earlier in the cluster. Refs #2129.	2017-03-07 19:19:15 +01:00
Avi Kivity	439b38f5ab	Merge "Improvements to counter implementation" from Paweł "This series adds various optimisations to counter implementation (nothing extreme, mostly just avoiding unnecessary operations) as well as some missing features such as tracing and dropping timed out queries. Performance was tested using: perf-simple-query -c4 --counters --duration 60 The following results are medians. before after diff write 18640.41 33156.81 +77.9% read 58002.32 62733.93 +8.2%" * tag 'pdziepak/optimise-counters/v3' of github.com:cloudius-systems/seastar-dev: (30 commits) cell_locker: add metrics for lock acquisition storage_proxy: count counter updates for which the node was a leader storage_proxy: use counter-specific timeout for writes storage_proxy: transform counter timeouts to mutation_write_timeout_exception db: avoid allocations in do_apply_counter_update() tests/counters: add test for apply reversability counters: attempt to apply in place atomic_cell: add COUNTER_IN_PLACE_REVERT flag counters: add equality operators counters: implement decrement operators for shard_iterator counters: allow using both views and mutable_views atomic_cell: introduce atomic_cell_mutable_view managed_bytes: add cast to mutable_view bytes: add bytes_mutable_view utils: introduce mutable_view db: add more tracing events for counter writes db: propagate tracing state for counter writes tests/cell_locker: add test for timing out lock acquisition counter_cell_locker: allow setting timeouts db: propagate timeout for counter writes ...	2017-03-07 11:48:13 +02:00
Gleb Natapov	7f5923f510	storage_service: handle empty token list correctly boost::split() return one empty string if called on an empty input. Trying to cast an empty string to a token value results in a bad_lexical_cast exception. Fix it by handling empty token list explicitly. Message-Id: <20170302125405.GU11471@scylladb.com>	2017-03-06 15:31:33 +02:00
Paweł Dziepak	00b42c477f	storage_proxy: count counter updates for which the node was a leader	2017-03-02 09:05:12 +00:00
Paweł Dziepak	cf193f4b41	storage_proxy: use counter-specific timeout for writes	2017-03-02 09:05:12 +00:00
Paweł Dziepak	d177160f90	storage_proxy: transform counter timeouts to mutation_write_timeout_exception	2017-03-02 09:05:12 +00:00
Paweł Dziepak	774241648d	db: add more tracing events for counter writes	2017-03-02 09:05:10 +00:00
Paweł Dziepak	277501f42f	db: propagate tracing state for counter writes	2017-03-02 09:05:10 +00:00
Paweł Dziepak	25173f8095	db: propagate timeout for counter writes	2017-03-02 09:05:10 +00:00
Paweł Dziepak	426345e1d4	storage_proxy: avoid excessive mutation freezes	2017-03-01 16:33:36 +00:00
Paweł Dziepak	f10eb952d0	coordinator: do not apply counter write twice on leader	2017-03-01 16:33:36 +00:00
Calle Wilund	0a4edca756	counters/cql: allow wormholing actual counter values (with shards) via cql Adds yet another magic function "SCYLLA_COUNTER_SHARD_LIST", indicating that argument value, which must be a list of tuples <int, UUID, long, long>, should be inserted as an actual counter value, not update. This of course to allow counters to be read from sstable loader. Note that we also need to allow timestamps for counter mutations, as well as convince the counter code itself to treat the data as already baked. So ugly wormhole galore. v2: * Changed flag names * More explicit wormholing, bypassing normal counter path, to avoid read-before-write etc * throw exceptions on unhandled shard types in marshalling v3: * Added counter id ordering check * Added batch statement check for mixing normal and raw counter updates Message-Id: <1487683665-23426-2-git-send-email-calle@scylladb.com>	2017-02-22 09:19:46 +00:00
Gleb Natapov	bb72425b61	storage_proxy: fix send_to_endpoint() to use correct create_write_response_handler() overload There are several problems with storage_proxy::send_to_endpoint right now. It uses create_write_response_handler() overload that is specific to read repair which is suboptimal and creates incorrect logs, it does not process errors and it does not hold storage_proxy object until write is complete. The patch fixes all of the problems. Message-Id: <20170208101949.GA19474@scylladb.com> Reviewed-by: Nadav Har'El <nyh@scylladb.com>	2017-02-12 10:46:13 +02:00
Avi Kivity	9530bac2d6	Merge "Adding metrics using histogram and labels" from Amnon "This series uses the newly added histogram and label support to add metrics to the storage_proxy and to the column_family. This would add latency and histogram and the missing metrics from column family." * 'amnon/histogram_metrics' of github.com:cloudius-systems/seastar-dev: database: add metrics registration for the column family storage_proxy: add read and write latency histogram estimated_histogram: returns a metrics histogram	2017-02-09 11:40:57 +02:00
Amnon Heiman	2cf13c26e2	storage_proxy: add read and write latency histogram Register the read and write latency histogram on the metrics layer. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2017-02-06 17:54:47 +02:00
Nadav Har'El	f2fd81ece0	materialized views: function to send a mutation to endpoint Add a function for sending one mutation to one remote replica owning this mutation. This is needed for materialized views, where each base replica sends each view mutation to one particular view replica. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2017-02-06 13:36:45 +01:00
Avi Kivity	3896c27e5f	Merge "DNS use in scylla" from Calle "Fixes #1531 Adds lookup to gms::inet_address and uses it in (hopefully all) the salient places where configured symbolic names are interpreted. Removes the dummy dns module in scylla in favour of the seastar one." * 'calle/use-dns' of github.com:cloudius-systems/seastar-dev: remove scylla dns code service::storage_service: Remove dependency on scylla dns main.cc: remove scylla dns dependency main/init: Lookup inet addresses from config by dns lookup db::system_keyspace: Find rpc_address by lookup gms::inet_address: Add lookup functionality. scylla tls: Add option support for client auth and tls opts	2017-02-06 13:50:42 +02:00
Calle Wilund	ab800c225a	service::storage_service: Remove dependency on scylla dns Use seastar facilities instead	2017-02-06 11:36:57 +00:00
Gleb Natapov	3c372525ed	storage_proxy: use storage_proxy clock instead of explicit lowres_clock Merge commit `45b6070832` used butchered version of storage_proxy patch to adjust to rpc timer change instead the one I've sent. This patch fixes the differences. Message-Id: <20170206095237.GA7691@scylladb.com>	2017-02-06 12:51:36 +02:00
Calle Wilund	ff8f82f21c	scylla tls: Add option support for client auth and tls opts Refs #1813 (fixes scylla part) Added require_client_auth and priority_string options to server_encryption_options/client_encryption_options an process them. Allows TLS method/algo specification. Also enabled enforcing known cert authentication for both node-to-node and client communication.	2017-02-06 09:45:09 +00:00
Paweł Dziepak	1e8814f5ce	storage_proxy: support counter updates	2017-02-02 10:35:14 +00:00
Paweł Dziepak	c14c6b753b	storage_proxy: add get_live_endpoints()	2017-02-02 10:35:14 +00:00
Paweł Dziepak	67ca6959bd	storage_service: add COUNTERS feature	2017-02-02 10:35:14 +00:00
Paweł Dziepak	c66db213d3	storage_service: allow getting local host id without futures<>	2017-02-02 10:35:13 +00:00
Amnon Heiman	45b6070832	Merge seastar upstream * seastar 397685c...c1dbd89 (13): > lowres_clock: drop cache-line alignment for _timer > net/packet: add missing include > Merge "Adding histogram and description support" from Amnon > reactor: Fix the error: cannot bind 'std::unique_ptr' lvalue to 'std::unique_ptr&&' > Set the option '--server' of tests/tcp_sctp_client to be required > core/memory: Remove superfluous assignment > core/memory: Remove dead code > core/reactor: Use logger instead of cerr > fix inverted logic in overprovision parameter > rpc: fix timeout checking condition > rpc: use lowres_clock instead of high resolution one > semaphore: make semaphore's clock configurable > rpc: detect timedout outgoing packets earlier Includes treewide change to accommodate rpc changing its timeout clock to lowres_clock. Includes fixup from Amnon: collectd api should use the metrics getters As part of the preparation for the change in the metrics layer, this changes the way the collectd api uses the metrics value to use the getters instead of calling the member directly. This will be important when the internal implementation changes from union to variant. Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1485457657-17634-1-git-send-email-amnon@scylladb.com>	2017-02-01 14:39:08 +02:00
Gleb Natapov	6e4817137e	storage_proxy: report foreground reads instead of reads The reason is the same as why foreground writes are reported instead of total writes (`049ae37d08`): It is much easier to see what is going on this way. Also fixes a typo in a counter's description. Fixes #1217 Message-Id: <20170129093412.GS11469@scylladb.com>	2017-01-29 12:40:56 +02:00
Gleb Natapov	64660397fc	storage_proxy: move operation type information from counter's name to a label Makes it much more flexible to view the data in various ways in Grafana. Message-Id: <20170126102746.GL11469@scylladb.com>	2017-01-26 12:38:29 +02:00
Gleb Natapov	ccee01f352	storage_proxy: put datacenter name into a label instead of counter's name Having datacenter name as a label makes it possible to create Prometheus board for the counters. Message-Id: <20170124132051.GX11469@scylladb.com>	2017-01-24 15:27:34 +02:00
Raphael S. Carvalho	1857ba0abc	db: fix bad resource usage distribution when resharding due to refresh That's because a single shard is used to calculate generation for new sstables in upload directory, and that will result in that single shard sharing all the resources with other shards. For refresh without upload dir, it currently works fine because we reshuffle column family dir instead. flush_upload_dir() is now a free function, takes a distributed database object, and uses calculate_shard_from_sstable_generation() to decide which shard will move sstable using its own generation namespace. Fixes #2008. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <b0cccf7bbb61416ff8718bac92fdca90cc5fb9c9.1484253232.git.raphaelsc@scylladb.com>	2017-01-19 18:55:21 +02:00
Amnon Heiman	e19fa02a17	remove scollectd from headers As the metrics migration progressed, some includes of scollectd.hh were left behind. Because of the nature of the scollectd implementation, those includes bring a lot of code with them into the header files and eventually into many source files. This patch removes those includes and adds a missing include to storage_proxy.cc. The fact that the compiler didn't complain is an indication of the problematic nature of those includes in the first place. Before this patch, a change in metrics.hh would cause 169 files to compile; after this change, 17. Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1484667536-2185-1-git-send-email-amnon@scylladb.com>	2017-01-17 17:39:47 +02:00
Duarte Nunes	c8cbfb7919	storage_service: Make MV feature experimental This patch ensures that the host only announces and registers the MATERIALIZED_VIEWS feature if it was started with the experimental flag. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170116123412.21365-1-duarte@scylladb.com>	2017-01-16 15:45:25 +02:00
Paweł Dziepak	e03868c226	tests: run with all features enabled Since `ce083308a1` "random_mutation_generator: Generate RTs by default" random mutation generator produces range tombstones. However, so far the tests were run with all features disabled (because of incomplete initialization of all services) which meant that RANGE_TOMBSTONE feature was not enabled and the code couldn't handle range tombstones that weren't just prefixes. This patch solves the problem by forcing all features to be enabled when tests are run. Message-Id: <20170116103324.22956-1-pdziepak@scylladb.com>	2017-01-16 11:38:45 +01:00
Tomasz Grabiec	3c3a4358ae	storage_proxy: Fix capturing of on-stack variable by reference partition_range_count was accepted by do_with callback by value and then captured by reference by async code, thus invoking use after destroy. Message-Id: <1484317846-14485-1-git-send-email-tgrabiec@scylladb.com>	2017-01-16 11:49:11 +02:00
Tomasz Grabiec	66547e7d7c	storage_proxy: Add missing initialization of _short_read_allowed Dropped by `a1cafed370` ("storage_proxy: handle range scans of sparsely populated tables"). Fixes the failure in update_cluster_layout_tests.TestUpdateClusterLayout test. Message-Id: <1484317450-13525-1-git-send-email-tgrabiec@scylladb.com>	2017-01-13 16:47:54 +02:00
Tomasz Grabiec	1e8151b4f2	storage_proxy: Fix use-after-free on one_or_two_partition_ranges query_mutations_locally() takes one_or_two_partition_ranges by reference and requires, indirectly, that it is kept alive until operation resolves. However, we were passing expiring value to it, the result of unwrap(). Fixes dtest failure in consistent_bootstrap_test.py:TestBootstrapConsistency.consistent_reads_after_bootstrap_test Another potential problem was that we were dereferencing "s" in the same expression which move-constructs an argument out of it. Message-Id: <1484222759-4967-1-git-send-email-tgrabiec@scylladb.com>	2017-01-12 15:10:51 +02:00
Gleb Natapov	76aed548e3	storage_proxy: add replica side counters for data read Message-Id: <20170112085907.GN11469@scylladb.com>	2017-01-12 11:41:04 +02:00
Avi Kivity	0591303b72	Merge "avoid excessive memory usage during resharding" from Raphael "Intended to reduce memory usage when resharding by sharing sstable components among shards. File descriptors are also shared from now on, meaning that a much smaller number of file descriptors will be used during resharding. Fixes #1951." branch 'excessive_memory_usage_v4' of github.com:raphaelsc/scylla * 'excessive_memory_usage_v4' of github.com:raphaelsc/scylla: db: avoid excessive memory usage during resharding checked_file_impl: add support to dup sstables: group sstable components that can be shared among shards sstables: rename sstable member	2017-01-09 20:43:50 +02:00
Raphael S. Carvalho	68dfcf5256	db: avoid excessive memory usage during resharding After resharding, sstables may be owned by all shards, which means that file descriptors and memory usage for metadata will increase by a factor equal to number of shards. That can easily lead to OOM. SSTable components are immutable, so they can be stored in one shard and shared with others that need it. We use the following formula to decide which shard will open the sstable and share it with the others: (generation % smp::count), which is the inverse of how we calculate generation for new sstables. So if no resharding is performed, everything is shard-local. With this approach, resource usage due to loaded sstables will be evenly distributed among shards. For this approach to work, we now only populate keyspaces from shard 0. It's now the sole responsible for iterating through column family dirs. In addition, most of population functions are now free and take distributed database object as parameter. Fixes #1951. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-01-09 15:24:36 -02:00
Avi Kivity	8f36dca6f1	storage_proxy: prevent short read due to buffer size limit from being swallowed during range scan mutation_result_merger::get() assumes that the merged result may be a short read if at least one of the partial results is a short read (in other words, if none of the partial results is a short read, then the merged result is also not a short read). However this is not true; because we update the memory accounter incrementally, we may stop scanning early. All the partial results are full; but we did not scan the entire range. Fix by changing the short_read variable initialization from `no` (which assumes we'll encounter a short read indication when processing one of the batches) to `this->short_read()`, which also takes into account the memory accounter. Fixes #2001. Message-Id: <20170108111315.17877-1-avi@scylladb.com>	2017-01-09 09:21:43 +00:00
Vlad Zolotarov	492295eb7f	init: move supervisor_notify() out of main.cc Transform the supervisor_notify() and related functions into the "supervisor" class and place this class implementation in a separate .cc file. This is going to fix the compilation breakage of tests introduced by a commit `8014adc2a1` init: serialize the creation of system_traces KS objects Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1483663955-20096-1-git-send-email-vladz@scylladb.com>	2017-01-06 10:10:55 +00:00
Avi Kivity	eb520e7352	storage_proxy: fix result ordering for parallel partition range scans During a range scan, we try to avoid sorting according to partition range when we can do so. This is when we scan fewer than smp::count shards -- each shard's range is strictly ordered with respect to the others. However, we use the wrong key for the sort -- we use the shard number. But if we started at shard s > 0 and wrapped around to shard 0, then shard 0's range will be after the range belonging to shard s, but will sort before it. Fix by storing the iteration order as the sort key. We use that when we know that shards do not overlap (shards < smp::count) and the index within the source partition range vector when they do. Fixes #1998. Message-Id: <20170105114253.17492-1-avi@scylladb.com>	2017-01-05 12:51:37 +01:00
Vlad Zolotarov	8014adc2a1	init: serialize the creation of system_traces KS objects Serialize the creation of a system_traces KS objects when they do not exist - the initial cluster boot. Avoid creating them in parallel by different cluster Nodes in order to avoid issue #420. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1483552503-12873-3-git-send-email-vladz@scylladb.com>	2017-01-05 12:41:38 +01:00

993 Commits