scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 12:06:44 +00:00

Author	SHA1	Message	Date
Botond Dénes	8e117501ac	tools/scylla-sstable: extract sstable_consumer interface into own header So it can be used in code outside scylla-sstable.cc. This source file is quite large already, and as we have yet another large chunk of code to add, we want to add it in a separate file.	2023-01-09 09:46:57 -05:00
Botond Dénes	9b1c486051	tools/json_writer: add accessor to underlying writer	2023-01-09 09:46:57 -05:00
Botond Dénes	cfb5afbe9b	tools/scylla-sstable: fix indentation Left broken by previous patches.	2023-01-09 09:46:57 -05:00
Botond Dénes	d42b0bb5d5	tools/scylla-sstable: export mutation_fragment_json_writer declaration To json_writer.hh. Method definitions are left in scylla-sstable.cc. Indentation is left broken, will be fixed by the next patch.	2023-01-09 09:46:57 -05:00
Botond Dénes	517135e155	tools/scylla-sstable: mutation_fragment_json_writer un-implement sstable_consumer There is no point in the former implementing said interface. For one it is a futurized interface, which is not needed for something writing to the stdout. Rename the methods to follow the naming convention of rjson writers more closely.	2023-01-09 09:46:57 -05:00
Botond Dénes	0ee1c6ca57	tools/scylla-sstable: extract json writing logic from json_dumper We want to split this class into two parts: one with the actual logic converting mutation fragments to json, and a wrapper over this one, which implements the sstable_consumer interface. As a first step we extract the class as is (no changes) and just forward all calls from the now-empty wrapper to it.	2023-01-09 09:46:57 -05:00
Botond Dénes	55ef0ed421	tools/scylla-sstable: extract json_writer into its own header Other source files will want to use it soon.	2023-01-09 09:46:57 -05:00
Botond Dénes	8623818a8d	tools/scylla-sstable: use json_writer::DataKey() to write all keys This method was renamed from its previous name of PartitionKey. Since in json partition keys and clustering keys look alike, with the only difference being that the former may also have a token, it makes sense to have a single method to write them (with an optional token parameter). This was the case at some point, json_dumper::write_key() taking this role. However at a later point, json_writer::PartitionKey() was introduced and now the code uses both. Standardize on the latter and give it a more generic name.	2023-01-09 09:46:57 -05:00
Botond Dénes	602fca0a12	tools/scylla-types: fix use-after-free on main lambda captures The main lambda of scylla-types, the one passed to app_template::run() was recently made a coroutine. app_template::run() however doesn't keep this lambda alive and hence after the first suspension point, accessing the lambda's captures triggers use-after-free. The simple fix is to convert the coroutine into a continuation chain.	2023-01-09 09:46:57 -05:00
Michał Chojnowski	08b3a9c786	configure: don't reduce parsers' optimization level to 1 in release The line modified in this patch was supposed to increase the optimization levels of parsers in debug mode to 1, because they were too slow otherwise. But as a side effect, it also reduced the optimization level in release mode to 1. This is not a problem for the CQL frontend, because statement preparation is not performance-sensitive, but it is a serious performance problem for Alternator, where it lies in the hot path. Fix this by only applying the -O1 to debug modes. Fixes #12463 Closes #12460	2023-01-06 18:04:36 +02:00
Avi Kivity	6868dcf30b	tools: toolchain: drop s390x from prepare script architecture list It's been a long while since we built ScyllaDB for s390x, and in fact the last time I checked it was broken on the ragel parser generator generating bad source files for the HTTP parser. So just drop it from the list. I kept s390x in the architecture mapping table since it's still valid. Closes #12455	2023-01-06 09:08:01 +02:00
Botond Dénes	2612f98a6c	Merge 'Abort repair tasks' from Aleksandra Martyniuk Aborting of repair operation is fully managed by task manager. Repair tasks are aborted: - on shutdown; top level repair tasks subscribe to global abort source. On shutdown all tasks are aborted recursively - through node operations (applies to data_sync_repair_task_impls and their descendants only); data_sync_repair_task_impl subscribes to node_ops_info abort source - with task manager api (top level tasks are abortable) - with storage_service api and on failure; these cases were modified to be aborted the same way as the ones from above are. Closes #12085 * github.com:scylladb/scylladb: repair: make top level repair tasks abortable repair: unify a way of aborting repair operations repair: delete sharded abort source from node_ops_info repair: delete unused node_ops_info from data_sync_repair_task_impl repair: delete redundant abort subscription from shard_repair_task_impl repair: add abort subscription to data sync task tasks: abort tasks on system shutdown	2023-01-05 15:21:35 +01:00
Avi Kivity	cc6010b512	Merge 'Make restore_replica_count abortable' from Benny Halevy Similar to the way we allow aborting streaming-based removenode, subscribe to storage_service::_abort_source to request abort locally and pass a shared_ptr<abort_source> to `node_ops_info`, used to abort removenode_with_repair on shutdown. Fixes #12429 Closes #12430 * github.com:scylladb/scylladb: storage_service: restore_replica_count: demote status_checker related logging to debug level storage_service: restore_replica_count: allow aborting removenode_with_repair storage_service: coroutinize restore_replica_count storage_service: restore_replica_count: undefer stop_status_checker storage_service: restore_replica_count: handle exceptions from stream_async and send_replication_notification storage_service: restore_replica_count: coroutinize status_checker	2023-01-05 15:21:35 +01:00
Kamil Braun	09da661eeb	Merge 'raft: replace experimental raft option with dedicated flag' from Gleb Natapov Unlike other experimental features, we want raft to be opt-in even after it leaves experimental mode. For that we need to have a separate option to enable it. The patch adds the binary option "consistent-cluster-management" for that. * 'consistent-cluster-management-flag' of github.com:scylladb/scylla-dev: raft: replace experimental raft option with dedicated flag main: move supervisor notification about group registry start where it actually starts	2023-01-05 15:21:35 +01:00
Kamil Braun	df72536fc5	Merge 'docs: add the upgrade guide for Enterprise from 2022.1 to 2022.2' from Anna Stuchlik Fixes https://github.com/scylladb/scylladb/issues/12314 This PR adds the upgrade guide for ScyllaDB Enterprise - from version 2022.1 to 2022.2. Using this opportunity, I've replaced "Scylla" with "ScyllaDB" in the upgrade-enterprise index file. In previous releases, we added several upgrade guides - one per platform (and version). In this PR, I've merged the information for different platforms to create one generic upgrade guide. It is similar to what @kbr- added for the Open Source upgrade guide from 5.0 to 5.1. See https://docs.scylladb.com/stable/upgrade/upgrade-opensource/upgrade-guide-from-5.0-to-5.1/. Closes #12339 * github.com:scylladb/scylladb: docs: add the info about minor release docs: add the new upgrade guide 2022.1 to 2022.2 to the index and the toctree docs: add the index file for the new upgrade guide from 2022.1 to 2022.2 docs: add the metrics update file to the upgrade guide 2022.1 to 2022.2 docs: add the upgrade guide for ScyllaDB Enterprise from 2022.1 to 2022.2	2023-01-04 18:07:00 +01:00
Benny Halevy	086546f575	storage_service: restore_replica_count: demote status_checker related logging to debug level the status_checker is not the main line of business of restore_replica_count, so starting and stopping it do not seem to deserve info level logging, which might have been useful in the past to debug issues surrounding that. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:05:04 +02:00
Benny Halevy	3879ee1db8	storage_service: restore_replica_count: allow aborting removenode_with_repair Similar to the way we allow aborting streaming-based removenode, subscribe to storage_service::_abort_source to request abort locally and pass a shared_ptr<abort_source> to `node_ops_info`, used to abort removenode_with_repair on shutdown. Fixes #12429 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:05:04 +02:00
Benny Halevy	afece5bdc4	storage_service: coroutinize restore_replica_count and unwrap the async thread started for streaming. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:05:04 +02:00
Benny Halevy	d1eadc39c1	storage_service: restore_replica_count: undefer stop_status_checker Now that all exceptions in the rest of the function are swallowed, just execute the stop_status_checker deferred action serially before returning, on the way to coroutinizing restore_replica_count (since we can't co_await status_checker inside the deferred action). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:05:04 +02:00
Benny Halevy	788ecb738d	storage_service: restore_replica_count: handle exceptions from stream_async and send_replication_notification On the way to coroutinizing restore_replica_count, extract awaiting stream_async and send_replication_notification into try/catch blocks so we can later undefer stop_status_checker. The exception is still returned as an exceptional future which is logged by the caller as warning. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:02:42 +02:00
Benny Halevy	b54d121dfd	storage_service: restore_replica_count: coroutinize status_checker There is no need to start a thread for the status_checker; it can be implemented using a background coroutine. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:02:20 +02:00
Botond Dénes	1d273a98b9	readers/multishard: shard_reader::close() silence read-ahead timeouts Timeouts are benign, especially on a read-ahead that turned out to be not needed at all. They just introduce noise in the logs, so silence them. Fixes: #12435 Closes #12441	2023-01-04 16:10:09 +02:00
Kamil Braun	4268b1bbc2	Merge 'raft: raft_group0, register RPC verbs on all shards' from Gusev Petr raft_group0 used to register RPC verbs only on shard 0. This worked on clusters with the same --smp setting on all nodes, since RPCs in this case are processed on the same shard as the calling code, and raft_group0 methods only run on shard 0. A new test test_nodes_with_different_smp was added to identify the problem. Since --smp can only be specified via the command line, a corresponding parameter was added to the ManagerClient.server_add method. It allows to override the default parameters set by the SCYLLA_CMDLINE_OPTIONS variable by changing, adding or deleting individual items. Fixes: #12252 Closes #12374 * github.com:scylladb/scylladb: raft: raft_group0, register RPC verbs on all shards raft: raft_append_entries, copy entries to the target shard test.py, allow to specify the node's command line in test	2023-01-04 11:11:21 +01:00
Marcin Maliszkiewicz	61a9816bad	utils/rjson: enable inlining in rapidjson library Due to lack of NDEBUG macro inlining was disabled. It's important for parsing and printing performance. Testing with perf_simple_query shows that it reduced around 7000 insns/op, thus increasing median tps by 4.2% for the alternator frontend. Because inlined functions are called for every character in json this scales with request/response size. When default write size is increased by around 7x (from ~180 to ~ 1255 bytes) then the median tps increased by 12%. Running: ./build/release/test/perf/perf_simple_query_g --smp 1 \ --alternator forbid --default-log-level error \ --random-seed=1235000092 --duration=60 --write Results before the patch: median 46011.50 tps (197.1 allocs/op, 12.1 tasks/op, 170989 insns/op, 0 errors) median absolute deviation: 296.05 maximum: 46548.07 minimum: 42955.49 Results after the patch: median 47974.79 tps (197.1 allocs/op, 12.1 tasks/op, 163723 insns/op, 0 errors) median absolute deviation: 303.06 maximum: 48517.53 minimum: 44083.74 The change affects both json parsing and printing. Closes #12440	2023-01-04 10:27:35 +02:00
Michał Jadwiszczak	83bb77b8bb	test/boost/cql_query_test: enable `parallelized_aggregation` Run tests for parallelized aggregation with `enable_parallelized_aggregation` set always to true, so the tests work even if the default value of the option is false. Closes #12409	2023-01-04 10:11:25 +02:00
Anna Stuchlik	c4d779e447	doc: Fix https://github.com/scylladb/scylla-doc-issues/issues/854 - update the procedure to update topology strategy when nodes are on different racks Closes #12439	2023-01-04 09:50:10 +02:00
Avi Kivity	f600ad5c1b	Update seastar submodule * seastar 3db15b5681...ca586cfb8d (28): > reactor: trim returned buffer to received number of bytes > util/process: include used header > build: drop unused target_include_directories() > build: use BUILD_IN_SOURCE instead chdir <SOURCE_DIR> > build: specify CMake policy CMP0135 to new > tests: only destroy allocated pending connections > build: silence the output when generating private keys > tests, httpd: Limit loopback connection factory sharding > lw_shared_ptr: Add nullptr_t comparing operators > noncopyable_function: Add concept for (Func func) constructor > reactor: add process::terminate() and process::kill() > Merge 'tests, include: include headers without ".." in path' from Kefu Chai > build: customize toolset for building Boost > build: use different toolset base on specified compiler > allocator: add an option to reserve additional memory for the OS > Merge 'build: pass cflags and ldflags to cooking.sh' from Kefu Chai > build: build static library of cryptopp > gate: add gate holders debugging > build: detect debug build of yaml-cpp also > build: do not use pkg_search_module(IMPORTED_TARGET) for finding yaml-cpp > build: bump yaml-cpp to 0.7.0 in cooking_recipe > build: bump cryptopp to 8.7.0 in cooking_recipe > build: bump boost to 1.81.0 in cooking_recipe > build: bump fmtlib to 9.1.0 in cooking_recipe > shared_ptr: add overloads for fmt::ptr() > chunked_fifo: const_iterator: use the base class ctor > build: s/URING_LIBARIES/URING_LIBRARIES/ > build: export the full path of uring with URING_LIBRARIES Closes #12434	2023-01-03 17:58:31 +02:00
Alejo Sanchez	889acf710c	test/python: increase CQL connection timeout for... test_ssl In very slow debug builds the default driver timeouts are too low and tests might fail. Bump up the values to a more reasonable time. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #12408	2023-01-03 17:10:46 +02:00
Nadav Har'El	1c96d2134f	docs,alternator: link to issue about missing ACL feature The alternator compatibility.md document mentions the missing ACL (access control) feature, but unlike other missing features we forgot to link to the open issue about this missing feature. So let's add that link. Refs #5047. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12399	2023-01-03 16:50:33 +02:00
Kamil Braun	fc57626afa	Merge 'docs: remove auto_bootstrap option from the documentation' from Anna Stuchlik Fixes https://github.com/scylladb/scylladb/issues/12318 This PR removes all occurrences of the `auto_bootstrap` option in the docs. In most cases, I've simply removed the option name and its definition, but sometimes additional changes were necessary: - In node-joined-without-any-data.rst, I removed the `auto_bootstrap` option as one of the causes of the problem. - In rebuild-node.rst, I removed the first step in the procedure (enabling the `auto_bootstrap` option). - In admin.rst, I removed the section about manual bootstrapping - it's based on setting `auto_bootstrap` to false, which is not possible now. Closes #12419 * github.com:scylladb/scylladb: docs: remove the auto_bootstrap option from the admin procedures - involves removing the Manual Bootstrapping section docs: remove the auto_bootstrap option from the procedure to replace a dead node docs: remove the auto_bootstrap option from the Troubleshooting article about a node joining with no data docs: remove the auto_bootstrap option from the procedure to rebuild a node after losing the data volume docs: remove the auto_bootstrap option from the procedures to create a cluster or add a DC	2023-01-03 15:44:00 +01:00
Petr Gusev	8417840647	raft: raft_group0, register RPC verbs on all shards raft_group0 used to register RPC verbs only on shard 0. This worked on clusters with the same --smp setting on all nodes, since RPCs in this case are (usually) processed on the same shard as the calling code, and raft_group0 methods only run on shard 0. A new test test_nodes_with_different_smp was added to identify the problem. Fixes: #12252	2023-01-03 17:04:07 +03:00
Anna Stuchlik	00ef20c3df	docs: remove the auto_bootstrap option from the admin procedures - involves removing the Manual Bootstrapping section	2023-01-03 14:48:01 +01:00
Anna Stuchlik	b7d62b2fc7	docs: remove the auto_bootstrap option from the procedure to replace a dead node	2023-01-03 14:47:55 +01:00
Anna Stuchlik	bc62e61df1	docs: remove the auto_bootstrap option from the Troubleshooting article about a node joining with no data	2023-01-03 14:46:38 +01:00
Anna Stuchlik	1602f27cd7	docs: remove the auto_bootstrap option from the procedure to rebuild a node after losing the data volume	2023-01-03 14:45:08 +01:00
Petr Gusev	7725e03a09	raft: raft_append_entries, copy entries to the target shard If append_entries RPC was received on a non-zero shard, we may need to pass it to a zero (or, potentially, some other) shard. The problem is that raft::append_request contains entries in the form of raft::log_entry_ptr == lw_shared_ptr<log_entry>, which doesn't support cross-shard reference counting. In debug mode it contains a special ref-counting facility debug_shared_ptr_counter_type, which resorts to on_internal_error if it detects such a case. To solve this, we just copy log entries to the target shard if it isn't equal to the current one. In most cases, if --smp setting is the same on all nodes, RPC will be handled on zero shard, so there will be no overhead.	2023-01-03 15:25:00 +03:00
Petr Gusev	1c23390f12	test.py, allow to specify the node's command line in test An optional parameter cmdline has been added to the ManagerClient.server_add method. It allows you to override the default parameters set by the SCYLLA_CMDLINE_OPTIONS variable by changing, adding or deleting individual items. To change or add a parameter just specify its name and value one after the other. To remove parameter use the special keyword __remove__ as a value. To set a parameter without a value (such as --overprovisioned) use the special keyword __missing__ as the value.	2023-01-03 15:24:54 +03:00
Nadav Har'El	eb85f136c8	cql-pytest: document how to write new cql-pytest tests Add to test/cql-pytest/README.md an explanation of the philosophy of the cql-pytest test suite, and some guidelines on how to write good tests in that framework. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12400	2023-01-03 12:13:22 +02:00
Anna Stuchlik	994bc33147	docs: fix the command on the Manager-Monitoring Integration troubleshooting page Closes #12375	2023-01-03 11:41:16 +02:00
Anna Stuchlik	9d17d812c0	docs: Fix https://github.com/scylladb/scylla-doc-issues/issues/870 , update the nodetool rebuild command Closes #12416	2023-01-03 11:40:40 +02:00
Gleb Natapov	1688163233	raft: replace experimental raft option with dedicated flag Unlike other experimental features, we want raft to be optional even after it leaves experimental mode. For that we need to have a separate option to enable it. The patch adds the binary option "consistent-cluster-management" for that.	2023-01-03 11:15:11 +02:00
Gleb Natapov	29060cc235	main: move supervisor notification about group registry start where it actually starts `99fe580068` moved the raft_group_registry::start call a bit later, but forgot to move the supervisor notification call. Do it now.	2023-01-03 11:09:30 +02:00
Botond Dénes	2ef71e9c70	Merge 'Improve verbosity of task manager api' from Aleksandra Martyniuk The PR introduces changes to task manager api: - extends tasks' list returned with get_tasks with task type, keyspace, table, entity, and sequence number - extends status returned with get_task_status and wait_task with a list of children's ids Closes #12338 * github.com:scylladb/scylladb: api: extend status in task manager api api: extend get_tasks in task manager api	2023-01-03 10:39:41 +02:00
Botond Dénes	82101b786d	Merge 'docs: document scylla-api-client' from Anna Stuchlik Fixes https://github.com/scylladb/scylladb/issues/11999. This PR adds a description of scylla-api-cli. Closes #12392 * github.com:scylladb/scylladb: docs: fix the description of the system log POST example docs: update the curl tool name docs: describe how to use the scylla-api-client tool docs: fix the scylla-api-client tool name docs: document scylla-api-cli	2023-01-03 10:30:04 +02:00
Benny Halevy	63c2cdafe8	sstables: index_reader: close(index_bound&) reset current_list When closing _lower_bound and *_upper_bound in the final close() call, they are currently left with an engaged current_list member. If the index_reader uses a _local_index_cache, it is evicted with evict_gently which will, rightfully, see the respective pages as referenced, and they won't be evicted gently (only later when the index_reader is destroyed). Reset index_bound.current_list on close(index_bound&) to free up the reference. Ref #12271 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12370	2023-01-02 16:42:33 +01:00
Avi Kivity	767b7be8be	Merge 'Get rid of handle_state_replacing' from Benny Halevy Since [repair: Always use run_replace_ops](`2ec1f719de`), nodes no longer publish HIBERNATE state so we don't need to support handling it. Replace is now always done using node operations (using repair or streaming). so nodes are never expected to change status to HIBERNATE. Therefore storage_service:handle_state_replacing is not needed anymore. This series gets rid of it and updates documentation related to STATUS:HIBERNATE respectively. Fixes #12330 Closes #12349 * github.com:scylladb/scylladb: docs: replace-dead-node: get rid of hibernate status storage_service: get rid of handle_state_replacing	2023-01-02 13:35:29 +02:00
Gleb Natapov	28952d32ff	storage_service: move leave_ring outside of unbootstrap() We want to reuse the latter without the call. Message-Id: <20221228144944.3299711-17-gleb@scylladb.com>	2023-01-02 12:03:29 +02:00
Gleb Natapov	229cef136d	raft: add trace logging to raft::server::start Allows to see initial state of the server during start. Message-Id: <20221228144944.3299711-15-gleb@scylladb.com>	2023-01-02 11:57:53 +02:00
Gleb Natapov	96453ff75f	service: raft: improve group0_state_machine::apply logging Trace how many entries are applied as well. Message-Id: <20221228144944.3299711-14-gleb@scylladb.com>	2023-01-02 11:57:16 +02:00
Gleb Natapov	dbd5b97201	storage_service: improve logging in update_pending_ranges() function We pass the reason for the change. Log it as well. Message-Id: <20221228144944.3299711-11-gleb@scylladb.com>	2023-01-02 11:54:03 +02:00


34498 Commits