Currently, removenode works as follows:
- The coordinator node advertises the node to be removed in
REMOVING_TOKEN status in gossip
- Existing nodes learn about the node in REMOVING_TOKEN status
- Existing nodes sync data for the ranges they own
- Existing nodes send a notification to the coordinator
- The coordinator node waits for the notifications and announces the
node in REMOVED_TOKEN status
Current problems:
- Existing nodes do not tell the coordinator whether the data sync
succeeded or failed.
- The coordinator cannot abort the removenode operation in case of error.
- A failed removenode operation leaves the node to be removed in
REMOVING_TOKEN status forever.
- The removenode operation runs in best-effort mode, which may cause
data consistency issues.
That is, if a node that owns the range after the removenode
operation is down during the operation, the removenode operation
will still succeed without requiring that node to perform data
syncing. This can cause data consistency issues.
For example, with five nodes in the cluster and RF = 3, for a given
range n1, n2, n3 are the old replicas and n2 is being removed; after the
removenode operation, the new replicas are n1, n5, n3. If n3 is down
during the removenode operation, only n1 will be used to sync data with
the new owner n5. This will break QUORUM read consistency if n1 happens
to miss some writes.
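A toy model of the scenario above (node names as in the example; this is an illustration, not Scylla code): counting how many of the new replicas still hold a given write shows how the best-effort mode can drop below the QUORUM threshold of 2.

```cpp
#include <cassert>
#include <set>
#include <string>

// Counts how many of a range's replicas hold a particular write.
// If the count falls below QUORUM (2 with RF=3), a QUORUM read can miss it.
int replicas_holding_write(const std::set<std::string>& holders,
                           const std::set<std::string>& replicas) {
    int n = 0;
    for (const auto& r : replicas) {
        n += static_cast<int>(holders.count(r));  // 0 or 1 per replica
    }
    return n;
}
```

In the example, the write lived on n2 and n3 (n1 missed it); n5 syncs only from n1 because n3 is down, so after removenode only n3 holds the write: 1 replica, below QUORUM.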
Improvements in this patch:
- This patch makes removenode safe by default.
We require all nodes in the cluster to participate in the removenode
operation and sync data if needed. We fail the removenode operation if
any of them is down or fails.
If the user wants the removenode operation to succeed even if some of
the nodes are not available, the user has to explicitly pass a list of
nodes that can be skipped for the operation:
$ nodetool removenode --ignore-dead-nodes <list_of_dead_nodes_to_ignore> <host_id>
Example restful api:
$ curl -X POST "http://127.0.0.1:10000/storage_service/remove_node/?host_id=7bd303e9-4c7b-4915-84f6-343d0dbd9a49&ignore_nodes=127.0.0.3,127.0.0.5"
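The strict-by-default check could be sketched as follows (function and parameter names are hypothetical, not Scylla's actual code): every cluster member must be alive unless it appears in the explicit ignore list, otherwise the operation is aborted up front.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical sketch of the strict removenode admission check:
// all members must participate unless explicitly listed as ignorable.
void check_removenode_participants(
        const std::vector<std::string>& members,
        const std::unordered_set<std::string>& alive,
        const std::unordered_set<std::string>& ignore_dead_nodes) {
    for (const auto& node : members) {
        if (ignore_dead_nodes.count(node)) {
            continue;  // the user explicitly allowed skipping this node
        }
        if (!alive.count(node)) {
            throw std::runtime_error("removenode aborted: node " + node +
                                     " is down and not in the ignore list");
        }
    }
}
```

With an empty ignore list, any dead node fails the operation; passing the node in --ignore-dead-nodes lets it proceed.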
- The coordinator can abort data sync on existing nodes.
For example, if one of the nodes fails to sync data, it makes no sense
for the other nodes to continue syncing, because the whole operation
will fail anyway.
- The coordinator can decide which nodes to ignore and pass the decision
to other nodes.
Previously, there was no way for the coordinator to tell existing nodes
to run in strict mode or best-effort mode. Users had to modify the
config file or run a RESTful API command on all the nodes to select
strict or best-effort mode. With this patch, that cluster-wide
configuration is eliminated.
Fixes #7359
Closes #7626
This change enhances the toppartitions API to also return
the cardinality of the read and write sample sets. It now uses
the size() method of the space_saving_top_k class, counting the unique
operations in the sampled set, up to the given capacity.
Fixes #4089
Closes #7766
It is used to force remove a node from gossip membership if something
goes wrong.
Note: run the force_remove_endpoint API at the same time on _all_ the
nodes in the cluster in order to prevent the removed nodes from coming
back. This is because nodes that have not run the force_remove_endpoint
API command can gossip the removed node's information around to other
nodes for 2 * ring_delay (2 * 30 seconds by default).
For instance, in a 3-node cluster where node 3 is decommissioned, to
remove node 3 from gossip membership prior to the automatic removal (3
days by default), run the API command on both node 1 and node 2 at the
same time:
$ curl -X POST --header "Accept: application/json"
"http://127.0.0.1:10000/gossiper/force_remove_endpoint/127.0.0.3"
$ curl -X POST --header "Accept: application/json"
"http://127.0.0.2:10000/gossiper/force_remove_endpoint/127.0.0.3"
Then run 'nodetool gossipinfo' on all the nodes to check that the
removed node is no longer present.
Fixes #2134
Closes #5436
This commit makes it possible to change hints manager's configuration at
runtime through HTTP API.
To preserve backwards compatibility, we keep the old behavior of not
creating and checking hints directories if they are not enabled at
startup. Instead, hint directories are lazily initialized when hints are
enabled for the first time through HTTP API.
The GET `hinted_handoff_enabled_by_dc` endpoint had an incorrect return
type specified. Although it does not have an implementation yet, it was
supposed to return a list of strings with DC names for which generating
hints is enabled - not a list of string pairs. This return type is
expected by JMX.
Uses db::hints::host_filter as the type of the hinted_handoff_enabled
configuration option.
Previously, hinted_handoff_enabled used to be a string option, and it
was parsed later in a separate function during startup. The function
returned a std::optional<std::unordered_set<sstring>>, whose meaning in
the context of hints is rather enigmatic for an observer not familiar
with hints.
Now, hinted_handoff_enabled has the type db::hints::host_filter, and it
is plugged into the config parsing framework, so there is no need for
later post-processing.
And use it to get a token_metadata& compatible
with current usage, until the services are converted to
use token_metadata_ptr.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accommodate this, don't
use structured bindings for variables that are later
captured.
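An illustration of the workaround (variable names are just for the example): instead of capturing structured bindings, copy them into plain variables first.

```cpp
#include <cassert>
#include <utility>

// Clang without p1091r3 rejects capturing structured bindings in lambdas,
// so we destructure into plain variables before capturing.
int sum_via_lambda(std::pair<int, int> p) {
    // Rejected by such clang versions:
    //   auto [a, b] = p;
    //   auto f = [a, b] { return a + b; };
    auto a = p.first;   // plain variables capture portably
    auto b = p.second;
    auto f = [a, b] { return a + b; };
    return f();
}
```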
The get_range_to_endpoint_map method takes a keyspace and returns a map
between the token ranges and the endpoints.
It is used by some external tools for repair.
Token ranges are encoded as size-2 arrays; if the start or end is empty,
it is added as an empty string.
The implementation uses get_range_to_address_map and re-packs the result
accordingly.
The use of stream_range_as_array is to reduce the risk of large
allocations and stalls.
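The size-2-array encoding described above could be sketched like this (function name and use of std::optional are hypothetical, for illustration only):

```cpp
#include <array>
#include <cassert>
#include <optional>
#include <string>

// Hypothetical sketch of the encoding: a token range becomes a size-2
// array of strings, with an empty (open) start or end encoded as "".
std::array<std::string, 2> encode_token_range(
        const std::optional<std::string>& start,
        const std::optional<std::string>& end) {
    return { start.value_or(""), end.value_or("") };
}
```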
Relates to scylladb/scylla-jmx#36
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Closes #7329
"
There's one last call to the global storage service left in the
compaction code; it comes from cleanup_compaction, to get local token
ranges for filtering.
The call in question is a pure wrapper over database, so this set just
makes use of the database where it's already available (perform_cleanup)
and adds it where it's needed (perform_sstable_upgrade).
tests: unit(dev), nodetool upgradesstables
"
* 'br-remove-ss-from-compaction-3' of https://github.com/xemul/scylla:
storage_service: Remove get_local_ranges helper
compaction: Use database from options to get local ranges
compaction: Keep database reference on upgrade options
compaction: Keep database reference on cleanup options
db: Factor out get_local_ranges helper
"
We keep references to locator::token_metadata in many places.
Most of them are for read-only access and only a few want
to modify the token_metadata.
Recently, in 94995acedb,
we added yielding loops that access token_metadata in order
to avoid cpu stalls. To make that possible we need to make
sure the token_metadata object they are traversing won't change
mid-loop.
This series is a first step in ensuring that updates to the shared
token metadata are serialized with respect to reading it.
Test: unit(dev)
Dtest: bootstrap_test:TestBootstrap.start_stop_test{,_node}, update_cluster_layout_tests.py -a next-gating(dev)
"
* tag 'constify-token-metadata-access-v2' of github.com:bhalevy/scylla:
api/http_context: keep a const sharded<locator::token_metadata>&
gossiper: keep a const token_metadata&
storage_service: separate get_mutable_token_metadata
range_streamer: keep a const token_metadata&
storage_proxy: delete unused get_restricted_ranges declaration
storage_proxy: keep a const token_metadata&
storage_proxy: get rid of mutable get_token_metadata getter
database: keep const token_metadata&
database: keyspace_metadata: pass const locator::token_metadata& around
everywhere_replication_strategy: move methods out of line
replication_strategy: keep a const token_metadata&
abstract_replication_strategy: get_ranges: accept const token_metadata&
token_metadata: rename calculate_pending_ranges to update_pending_ranges
token_metadata: mark const methods
token_ranges: pending_endpoints_for: return empty vector if keyspace not found
token_ranges: get_pending_ranges: return empty vector if keyspace not found
token_ranges: get rid of unused get_pending_ranges variant
replication_strategy: calculate_natural_endpoints: make token_metadata& param const
token_metadata: add get_datacenter_racks() const variant
The only place that creates them is the API upgrade_sstables call.
The created options object doesn't outlive the returned
future, so it's safe to keep this reference there.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This is the unset (roll-back) counterpart of the corresponding _set-s.
The messaging service will be (already is, but implicitly) used in
repair API callbacks, so make sure they are unset before the messaging
service is stopped.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The unset part will come soon; this is the preparation. No functional
changes in api/storage_server.cc, just moving the code.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The API is one of the subsystems that work with the messaging service.
To keep the dependencies correct, the related API stuff should be
stopped before the messaging service stops.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously,
the `count` function was often used for this in various ways.
`contains` not only expresses the intent of the code better but also
does so in a more unified way.
This commit replaces all the occurrences of `count` with
`contains`.
Tests: unit(dev)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>
"
This series adds support for the "md" sstable format.
Support is based on the following:
* Do not use clustering-based filtering in the presence
of static rows or tombstones.
* Disable min/max column names in the metadata for
formats older than "md".
* When updating the metadata, reset and disable min/max
in the presence of range tombstones (like Cassandra does,
until we process them accurately).
* Fix the way we maintain min/max column names by
keeping whole clustering key prefixes as min/max
rather than calculating min/max independently for
each component, like Cassandra does in the "md" format.
Fixes #4442
Tests: unit(dev), cql_query_test -t test_clustering_filtering* (debug)
md migration_test dtest from git@github.com:bhalevy/scylla-dtest.git migration_test-md-v1
"
* tag 'md-format-v4' of github.com:bhalevy/scylla: (27 commits)
config: enable_sstables_md_format by default
test: cql_query_test: add test_clustering_filtering unit tests
table: filter_sstable_for_reader: allow clustering filtering md-format sstables
table: create_single_key_sstable_reader: emit partition_start/end for empty filtered results
table: filter_sstable_for_reader: adjust to md-format
table: filter_sstable_for_reader: include non-scylla sstables with tombstones
table: filter_sstable_for_reader: do not filter if static column is requested
table: filter_sstable_for_reader: refactor clustering filtering conditional expression
features: add MD_SSTABLE_FORMAT cluster feature
config: add enable_sstables_md_format
database: add set_format_by_config
test: sstable_3_x_test: test both mc and md versions
test: Add support for the "md" format
sstables: mx/writer: use version from sstable for write calls
sstables: mx/writer: update_min_max_components for partition tombstone
sstables: metadata_collector: support min_max_components for range tombstones
sstable: validate_min_max_metadata: drop outdated logic
sstables: rename mc folder to mx
sstables: may_contain_rows: always true for old formats
sstables: add may_contain_rows
...
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
the code pattern looked like:
<collection>.find(<element>) != <collection>.end()
In C++20 the same can be expressed with:
<collection>.contains(<element>)
This is not only more concise but also expresses the intent of the code
more clearly.
This commit replaces all the occurrences of the old pattern with the new
approach.
Tests: unit(dev)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>
Add the sstable_version_types::md enum value
and logically extend sstable_version_types comparisons to cover
also the > sstable_version_types::mc cases.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This patch changes the per table latencies histograms: read, write,
cas_prepare, cas_accept, and cas_learn.
Besides changing the definition type and the insertion method, the API
was changed to support the new metrics.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
The following stall was seen during a cleanup operation:
scylla: Reactor stalled for 16262 ms on shard 4.
| std::_MakeUniq<locator::tokens_iterator_impl>::__single_object std::make_unique<locator::tokens_iterator_impl, locator::tokens_iterator_impl&>(locator::tokens_iterator_impl&) at /usr/include/fmt/format.h:1158
| (inlined by) locator::token_metadata::tokens_iterator::tokens_iterator(locator::token_metadata::tokens_iterator const&) at ./locator/token_metadata.cc:1602
| locator::simple_strategy::calculate_natural_endpoints(dht::token const&, locator::token_metadata&) const at simple_strategy.cc:?
| (inlined by) locator::simple_strategy::calculate_natural_endpoints(dht::token const&, locator::token_metadata&) const at ./locator/simple_strategy.cc:56
| locator::abstract_replication_strategy::get_ranges(gms::inet_address, locator::token_metadata&) const at /usr/include/fmt/format.h:1158
| locator::abstract_replication_strategy::get_ranges(gms::inet_address) const at /usr/include/fmt/format.h:1158
| service::storage_service::get_ranges_for_endpoint(seastar::basic_sstring<char, unsigned int, 15u, true> const&, gms::inet_address const&) const at /usr/include/fmt/format.h:1158
| service::storage_service::get_local_ranges(seastar::basic_sstring<char, unsigned int, 15u, true> const&) const at /usr/include/fmt/format.h:1158
| (inlined by) operator() at ./sstables/compaction_manager.cc:691
| (inlined by) _M_invoke at /usr/include/c++/9/bits/std_function.h:286
| std::function<std::vector<seastar::lw_shared_ptr<sstables::sstable>, std::allocator<seastar::lw_shared_ptr<sstables::sstable> > > (table const&)>::operator()(table const&) const at /usr/include/fmt/format.h:1158
| (inlined by) compaction_manager::rewrite_sstables(table*, sstables::compaction_options, std::function<std::vector<seastar::lw_shared_ptr<sstables::sstable>, std::allocator<seastar::lw_shared_ptr<sstables::sstable> > > (table const&)>) at ./sstables/compaction_manager.cc:604
| compaction_manager::perform_cleanup(table*) at /usr/include/fmt/format.h:1158
To fix, we futurize the function to get local ranges and sstables.
In addition, this patch removes the dependency on the global
storage_service object.
Fixes #6662
The result of get_snapshot_details() is saved in do_with, then is
captured in the json callback by reference, then the do_with's
future returns, so by the time the callback is called the map is already
freed and empty.
Fix by capturing the result directly in the callback.
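The bug and fix follow a general C++ pattern, illustrated below outside the Seastar context (this is not the actual Scylla code): a callback capturing a local result by reference outlives it, while capturing by value keeps the data alive inside the closure.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Generic illustration of the use-after-free pattern: `details` plays the
// role of the do_with-held snapshot map, the returned std::function plays
// the role of the json callback that runs after the owning scope ends.
std::function<std::size_t()> make_callback(bool by_value) {
    std::map<std::string, int> details{{"snapshot1", 1}};
    if (by_value) {
        return [details] { return details.size(); };  // safe: copied in
    }
    return [&details] { return details.size(); };     // dangles on return
}
```

Only the by-value variant may be invoked after make_callback returns; the by-reference one is exactly the bug being fixed.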
Fixes recently merged b6086526.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This includes
- rename namespace in snapshot-ctl.[cc|hh]
- move methods from storage_service to snapshot_ctl
- move snapshot_details struct
- temporarily make storage_service._snapshot_lock and ._snapshot_ops public
- replace two get_local_storage_service() occurrences with this._db
The latter is not 100% clear as the code that does this references "this"
from another shard, but the _db in question is the distributed object, so
they are all the same on all instances.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
A placeholder for snapshotting code that will be moved into it
from the storage_service.
Also -- pass it through the API for future use.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now, with the seastar httpd routes unset() at hand, we
can shut down individual API endpoints. Do this for
the snapshot calls; this will make stopping the snapshot
controller safe.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The lambda calls the core snapshot method deep inside the
json marshalling callback. This will bring problems with
stopping the snapshot controller in the next patches.
To prepare for this -- call .get_snapshot_details()
first, then keep the result in the do_with() context. This
change doesn't affect the issue the lambda in question is
meant to solve, as the whole result set is anyway kept in
memory while being streamed outside.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Before Scylla 3.0, we used to send streaming mutations using
individual RPC requests and flush them together using dedicated
streaming memtables. This mechanism is no longer in use and all
versions that use it have long reached end-of-life.
Remove this code.
We are about to use the auto compaction property during the
populate/reshard process. If the user toggles it, the database can be
left in a bad state.
There should be no reason why a user would want to set that up this
early, so we'll disallow it.
To do that properly, it is better if the check of whether or not
the storage service is ready to accommodate this request is local
to the storage service itself. We therefore move the logic of
set_tables_autocompaction from the API to the storage service. The API
layer now merely translates the table names and passes them along.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Merged pull request https://github.com/scylladb/scylla/pull/6551
from Juliusz Stasiewicz:
The command regenerates streams when:
generations corresponding to a gossiped timestamp cannot be
fetched from the system_distributed table,
or when generation token ranges do not align with token metadata.
In such cases the streams are regenerated and a new timestamp is
gossiped around. The returned JSON is always empty, regardless of
whether the streams needed regeneration or not.
Fixes #6498
Accompanied by: scylladb/scylla-jmx#109, scylladb/scylla-tools-java#172
"
This series adds a pseudo-floating-point histogram implementation.
The histogram is used for time_estimated_histogram, a histogram for
latency tracking, which is then used in storage_proxy as a more
efficient, higher-resolution histogram.
A follow-up series will use the new histogram in other places in the
system and will add an implementation that supports lower values.
Fixes #5815
Fixes #4746
"
* amnonh-quicker_estimated_histogram:
storage_proxy: use time_estimated_histogram for latencies
test/boost/estimated_histogram_test
utils/histogram_metrics_helper Adding histogram converter
utils/estimated_histogram: Adding approx_exponential_histogram
Remove the on-storage_service instance and make everybody use
the standalone one.
Stopping thrift is done by registering the controller in the
client service shutdown hooks. This automatically wires the
stopping into the drain, decommission and isolation code.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The goal is to make the relevant endpoints work on standalone
thrift controller instead of the storage_service's one, so
prepare this controller (dummy for now) and pass it all the
way down the API code.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Remove the on-storage_service instance and make everybody use
the standalone one.
Stopping the server is done by registering the controller in the
client service shutdown hooks. This automatically wires the
stopping into the drain, decommission and isolation code.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The goal is to make the relevant endpoints work on standalone
cql controller instead of the storage_service's one, so
prepare this controller (dummy for now) and pass it all the
way down the API code.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently, the API endpoints to start and stop cql_server and thrift
are registered right after the storage service is started, but
much earlier than those services themselves. In between these two
points a lot of other stuff gets initialized. This opens a small
window during which cql_server and thrift can be started by
hand too early.
The most obvious problem is -- storage_service::join_cluster()
may not yet have been called, so the auth service is not started, but
starting cql/thrift needs auth.
Another problem is that those endpoints are not unregistered on stop,
thus creating another way to start cql/thrift at the wrong time.
Also, the endpoint registration change helps further patching.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
We were not consistent about using '#include "foo.hh"' instead of
'#include <foo.hh>' for scylla's own headers. This patch fixes that
inconsistency and, to enforce it, changes the build to use -iquote
instead of -I to find those headers.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200608214208.110216-1-espindola@scylladb.com>
The command regenerates streams when:
- generations corresponding to a gossiped timestamp cannot be
fetched from the `system_distributed` table,
- or when generation token ranges do not align with token metadata.
In such cases the streams are regenerated and a new timestamp is
gossiped around. The returned JSON is always empty, regardless of
whether the streams needed regeneration or not.
"
This series changes the describe_ring API to use an HTTP stream instead
of serializing the results and sending them as a single buffer.
While testing the change I hit a 4-year-old issue inside
service/storage_proxy.cc that causes a use-after-free, so I fixed it
along the way.
Fixes #6297
"
* amnonh-stream_describe_ring:
api/storage_service.cc: stream result of token_range
storage_service: get_range_to_address_map prevent use after free
The response of the get token range API can become big, which can cause
large allocations and stalls.
This patch replaces the implementation so that it streams the results
using the HTTP stream capabilities instead of serializing and sending
one big buffer.
Fixes #6297
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
This series adds support for taking a snapshot of multiple tables.
Fixes #6333
* amnonh-snapshot_keyspace_table:
api/storage_service.cc: Snapshot, support multiple tables
service/storage_service: Take snapshot of multiple tables