scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 17:40:34 +00:00

Author	SHA1	Message	Date
Calle Wilund	05851578d4	alternator::streams: Report streams as not ready until CDC stream id:s are available Refs #6864 When booting a clean scylla, CDC stream ID:s will not be available until a nring delay time period has passed. Before this, writing to a CDC enabled table will fail hard. For alternator (and its tests), we can report the stream(s) for tables as not yet available (ENABLING) until such time as id:s are computed. v2: Keep storage service ref in executor	2020-08-03 20:34:15 +03:00
Avi Kivity	4edfdfa78d	Merge 'Build id cleanups' from Benny " Refs #5525 - main: add --build-id option - build_id: mv sources to utils/ - build_id: throw on errors rather than assert - build_id: simplify callback pointer type casting " * bhalevy-build-id-cleanups: build_id: simplify callback pointer type casting build_id: mv sources to utils/ main: add --build-id option	2020-08-03 17:18:09 +03:00
Calle Wilund	30a700c5b0	system_keyspace: Remove support for legacy truncation records Fixes #6341 Since scylla no longer supports upgrading from a version without the "new" (dedicated) truncation record table, we can remove support for these and the migration thereof. Make sure the above holds wherever this is committed. Note that this does not remove the "truncated_at" field in system.local.	2020-08-03 17:16:26 +03:00
Benny Halevy	bf6e8f66d9	build_id: mv sources to utils/ The root directory is already overcrowded. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-03 15:55:16 +03:00
Benny Halevy	46f7d01536	main: add --build-id option Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-08-03 15:52:08 +03:00
Avi Kivity	257c17a87a	Merge "Don't depend on seastar::make_(lw_)?shared idiosyncrasies" from Rafael " While working on another patch I was getting odd compiler errors saying that a call to ::make_shared was ambiguous. The reason was that seastar has both: template <typename T, typename... A> shared_ptr<T> make_shared(A&&... a); template <typename T> shared_ptr<T> make_shared(T&& a); The second variant doesn't exist in std::make_shared. This series drops the dependency in scylla, so that a future change can make seastar::make_shared a bit more like std::make_shared. " * 'espindola/make_shared' of https://github.com/espindola/scylla: Everywhere: Explicitly instantiate make_lw_shared Everywhere: Add a make_shared_schema helper Everywhere: Explicitly instantiate make_shared cql3: Add a create_multi_column_relation helper main: Return a shared_ptr from defer_verbose_shutdown	2020-08-02 19:51:24 +03:00
Pavel Emelyanov	50d07696e4	main: Add missing calls to unregister RPC handlers The gossiper's and migration_manager's unregistration is done on the services' stop, for the rest we need to call the recently introduced methods. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-22 16:35:07 +03:00
Pavel Emelyanov	cc070ceca0	main: Shorten call to storage_proxy::init_messaging_service Just for brevity Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-07-22 16:31:57 +03:00
Rafael Ávila de Espíndola	ad6d65dbbd	Everywhere: Explicitly instantiate make_shared seastar::make_shared has a constructor taking a T&&. There is no such constructor in std::make_shared: https://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared This means that we have to move from make_shared(T(...)) to make_shared<T>(...) if we don't want to depend on the idiosyncrasies of seastar::make_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-21 10:33:49 -07:00
Rafael Ávila de Espíndola	8858873d85	main: Return a shared_ptr from defer_verbose_shutdown This moves a few calls to make_shared to a single location. This makes it easier to drop a dependency on the differences between seastar::make_shared and std::make_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-07-21 10:33:44 -07:00
Calle Wilund	0708a9971a	executor: Add system_distributed_keyspace as parameter/member Streams implementation will require querying system tables etc to do its work, thus will need access to this object.	2020-07-15 08:10:23 +00:00
Pavel Emelyanov	8d2e05778c	main: Stop http server Currently it's not stopped at all, so calling a REST request shutdown-time may crash things at random places. Fixes: #5702 But it's not the end of the story. Since the server stays up while we are shutting things down, each subsystem should carefully handle the cases when it's half-down, but a request comes. A better solution is to unregister rest verbs eventually, but httpd's rules cannot do it now. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-26 20:27:28 +03:00
Pavel Emelyanov	ba47ef0397	snapshots: Move ops gate from storage_service Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-26 20:17:21 +03:00
Pavel Emelyanov	d989d9c1c7	snapshots: Initial skeleton A placeholder for snapshotting code that will be moved into it from the storage_service. Also -- pass it through the API for future use. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-26 19:54:14 +03:00
Pavel Emelyanov	9a8a1635b7	snapshots: Properly shutdown API endpoints Now with the seastar httpd routes unset() at hands we can shut down individual API endpoints. Do this for snapshot calls, this will make snapshot controller stop safe. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-26 17:27:45 +03:00
Glauber Costa	e40aa042a7	distributed_loader: reshard before the node is made online This patch moves the resharding process to use the new directory_with_sstables_handler infrastructure. There is no longer a clear reshard step, and that just becomes a natural part of populate_column_family. In main.cc, a couple of changes are necessary to make that happen. The first one obviously is to stop calling reshard. We also need to make sure that: - The compaction manager is started much earlier, so we can register resharding jobs with it. - auto compactions are disabled in the populate method, so resharding doesn't have to fight for bandwidth with auto compactions. Now that we are resharding through the sstable_directory, the old resharding code can be deleted. There is also no need to deal with the resharding backlog either, because the SSTables are not yet added to the sstable set at this point. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-18 09:37:18 -04:00
Glauber Costa	9902af894a	compaction_manager: rename run_resharding_job It will be used to run any custom job where the caller provides a function. One such example is indeed resharding, but reshaping SSTables can also fall here. The semaphore is also renamed, and we'll allow only one custom job at a time (across all possible types). We also remove the assumption of the scheduling group. The caller has to have already placed the code in the correct CPU scheduling group. The I/O priority class comes from the descriptor. To make sure that we don't regress, we wrap the entire reshard-at-boot code in the compaction class. Currently the setup would be done in the main group, and the actual resharding in the compaction group. Note that this is temporary, as this code is about to change. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-06-18 09:00:27 -04:00
Pavel Emelyanov	60e283b23e	auth: Move away from storage_service Now after the auth start/stop is standalone, we can remove reference from storage service to it. This frees some tests from the need to carry the auth service around for nothing. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-12 22:14:33 +03:00
Pavel Emelyanov	6a46721fb7	auth: Move start-stop code into main The auth service management is currently sitting in storage service, but it was needed there just for cql/thrift start code. After the latter has been moved away there are no other reasons for the auth to be integrated with the storage service, so move it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-12 22:14:33 +03:00
Pavel Emelyanov	3eaf6b3ec7	main: Don't forget to stop cql/thrift when start is aborted The defer action for stopping the storage_service is registered very late, after the cql and thrift started. If an error happens in between, these client-shutdown hooks will not be called. This is a problem with the hooks, but fixing it in hooks place is a big rework, so for now put fuses for cql and thrift individually -- both their stopping codes are re-entrant. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-12 22:14:33 +03:00
Pavel Emelyanov	a1df24621c	thrift_controller: Switch on standalone Remove the on-storage_service instance and make everybody use the standalone one. Stopping the thrift is done by registering the controller in client service shutdown hooks. This automatically wires the stopping into drain, decommission and isolation codes. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-12 22:14:33 +03:00
Pavel Emelyanov	c26943e7b5	thrift_controller: Pass one through management API The goal is to make the relevant endpoints work on standalone thrift controller instead of the storage_service's one, so prepare this controller (dummy for now) and pass it all the way down the API code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-12 22:14:33 +03:00
Pavel Emelyanov	1d5cdfe3c6	cql_controller: Switch on standalone Remove the on-storage_service instance and make everybody use the standalone one. Stopping the server is done by registering the controller in client service shutdown hooks. This automatically wires the stopping into drain, decommission and isolation codes. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-12 22:14:09 +03:00
Pavel Emelyanov	7ebe44f33d	cql_controller: Pass one through management API The goal is to make the relevant endpoints work on standalone cql controller instead of the storage_service's one, so prepare this controller (dummy for now) and pass it all the way down the API code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-12 22:14:09 +03:00
Pavel Emelyanov	6a89c987e4	api: Tune reg/unreg of client services control endpoints Currently API endpoints to start and stop cql_server and thrift are registered right after the storage service is started, but much earlier than those services are. In between these two points a lot of other stuff gets initialized. This opens a small window during which cql_server and thrift can be started by hand too early. The most obvious problem is -- the storage_service::join_cluster() may not yet be called, the auth service is thus not started, but starting cql/thrift needs auth. Another problem is those endpoints are not unregistered on stop, thus creating another way to start cql/thrift at the wrong time. Also the endpoints registration change helps further patching. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-06-12 18:47:24 +03:00
Pavel Emelyanov	7696ed1343	shard_tracker: Configure it in one go Instead of doing 3 smp::invoke_on_all-s and duplicating tracker::impl API for the tracker itself, introduce the tracker::configure, simplify the tracker configuration and narrow down the public tracker API. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200528185442.10682-1-xemul@scylladb.com>	2020-05-29 14:50:43 +02:00
Botond Dénes	e0b98ba921	database: give system reads a concurrency boost during startup In the next patches we will match reads to the appropriate reader concurrency semaphore based on the scheduling group they run in. This will result in a lot of system reads that are executed during startup and that were up to now (incorrectly) using the user read semaphore to switch to the system read semaphore. This latter has a much more constrained concurrency, which was observed to cause system reads to saturate and block on the semaphore, slowing down startup. To solve this, boost the concurrency of the system read semaphore during startup to match that of the user semaphore. This is ok, as during startup there are no user reads to compete with. After startup, before we start serving user reads the concurrency is reverted back to the normal value.	2020-05-28 10:40:08 +03:00
Nadav Har'El	c3da9f2bd4	alternator: add mandatory configurable write isolation mode Alternator supports four ways in which write operations can use quorum writes or LWT or both, which we called "write isolation policies". Until this patch, Alternator defaulted to the most generally safe policy, "always_use_lwt". This default could have been overridden for each table separately, but there was no way to change this default for all tables. This patch adds a "--alternator-write-isolation" configuration option which allows changing the default. Moreover, @dorlaor asked that users must explicitly choose this default mode, and not get "always_use_lwt" without noticing. The previous default, "always_use_lwt" supports any workload correctly but because it uses LWT for all writes it may be disappointingly slow for users who run write-only workloads (including most benchmarks) - such users might find the slow writes so disappointing that they will drop Scylla. Conversely, a default of "forbid_rmw" will be faster and still correct, but will fail on workloads which need read-modify-write operations - and surprise users that need these operations. So Dor asked that none of the write modes be made the default, and users must make an informed choice between the different write modes, rather than being disappointed by a default choice they weren't aware of. So after this patch, Scylla refuses to boot if Alternator is enabled but a "--alternator-write-isolation" option is missing. The patch also modifies the relevant documentation, adds the same option to our docker image, and modifies the test-running script test/alternator/run to run Scylla with the old default mode (always_use_lwt), which we need because we want to test RMW operations as well. Fixes #6452 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200524160338.108417-1-nyh@scylladb.com>	2020-05-27 08:40:05 +03:00
Pavel Emelyanov	3c2066bd78	system_keyspace: Cleanup setup() from storage_service Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-05-25 14:17:31 +03:00
Pavel Emelyanov	89a1b09214	sstables_manager: Keep format on Make the database be the format_selector target, so when the format is selected it's set on the database which in turn just forwards the selection into sstables managers. All users of the format are already patched to read it from those managers. The initial value for the format is the highest, which is needed by tests. When scylla starts the format is updated by format_selector, first after reading from system tables, then by selecting it from features. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-05-25 14:17:28 +03:00
Pavel Emelyanov	a61f18ed64	format_selector: Make it standalone Remove the selector from storage_service and introduce an instance in main.cc that starts soon after the gossiper and feature_service, starts listening for features and sets the selected format on storage_service. This change includes - Removal of for_testing bit from format_selector constructor, now tests just do not use it - Adding a gate to selection routine to make sure on exit all the selection stuff is done. Although before the cluster join the selector waits for the feature listeners to finish (the .sync() method) this gate is still required to handle aborted start cases and wait for gossiper announcement from selector to complete. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-05-25 14:15:04 +03:00
Pavel Emelyanov	5eb37c3743	storage_service: Introduce format_selector The final goal is to have an entity that will - read the saved sstables format (if any) - listen for sstables format related features enabling - select the top-most format - put the selected format onto a "target" - spread the word about it (via gossiper) The target is the service from which the selected format is read (so the selector can be removed once features agreement is reached). Today it's the storage_service, but at the end of this series it will be sstables_manager. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-05-25 13:27:34 +03:00
Pavel Emelyanov	70391feb8e	storage_service: Tossing bits around The goal is to have main.cc add code between prepare_to_join and join_token_ring. As a side effect this drives us closer to proper split of storage service into sharded service itself vs start/boot/join code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-05-25 13:21:08 +03:00
Glauber Costa	7423ccc318	compaction_manager: allow early aborts through abort sources. The shutdown process of compaction manager starts with an explicit call from the database object. However that can only happen once everything is already initialized. This works well today, but I am soon to change the resharding process to operate before the node is fully ready. One can still stop the database in this case, but reshardings will have to finish before the abort signal is processed. This patch passes the existing abort source to the construction of the compaction_manager and subscribes to it. If the abort source is triggered, the compaction manager will react to it firing and all compactions it manages will be stopped. We still want the database object to be able to wait for the compaction manager, since the database is the object that owns the lifetime of the compaction manager. To make that possible we'll use a future that is returned from stop(): no matter what triggered the abort, either an early abort during initial resharding or a database-level event like drain, everything will shut down in the right order. The abort source is passed to the database, which is responsible for constructing the compaction manager. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-05-13 16:51:25 -04:00
Glauber Costa	e29701ca1c	compaction_manager: expand state to be able to differentiate between enabled and stopped We are having many issues with the stop code in the compaction_manager. Part of the reason is that the "stopped" state has its meaning overloaded to indicate both "compaction manager is not accepting compactions" and "compaction manager is not ready or destructed". In a later step we could default to enabled-at-start, but right now we maintain current behavior to minimize noise. It is only possible to stop the compaction manager once. It is possible to enable / disable the compaction manager many times. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-05-13 16:51:25 -04:00
Botond Dénes	74b020ad05	main: run redis service in the statement scheduling group Like all the other API services (CQL, thrift and alternator). Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200512145631.104051-1-bdenes@scylladb.com>	2020-05-12 18:01:27 +03:00
Avi Kivity	5b971397aa	Revert "compaction_manager: allow early aborts through abort sources." This reverts commit `e8213fb5c3`. It results in an assertion failure in remove_index_file_test. Fixes #6413.	2020-05-10 12:32:18 +03:00
Glauber Costa	e8213fb5c3	compaction_manager: allow early aborts through abort sources. The shutdown process of compaction manager starts with an explicit call from the database object. However that can only happen once everything is already initialized. This works well today, but I am soon to change the resharding process to operate before the node is fully ready. One can still stop the database in this case, but reshardings will have to finish before the abort signal is processed. This patch passes the existing abort source to the construction of the compaction_manager and subscribes to it. If the abort source is triggered, the compaction manager will react to it firing and all compactions it manages will be stopped. We still want the database object to be able to wait for the compaction manager, since the database is the object that owns the lifetime of the compaction manager. To make that possible we'll use a future that is returned from stop(): no matter what triggered the abort, either an early abort during initial resharding or a database-level event like drain, everything will shut down in the right order. The abort source is passed to the database, which is responsible for constructing the compaction manager. Tests: unit (dev), manual start+stop, manual drain + stop Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200506184749.98288-1-glauber@scylladb.com>	2020-05-07 13:24:47 +03:00
Piotr Sarna	b8df958811	alternator: deduplicate logs on boot Alternator server used to print a startup log line for each shard, which is redundant and creates churn for nodes with many cores. Instead of all that, a single line is now printed once alternator server properly boots. Fixes #6347 Tests: manual(boot), unit(dev)	2020-05-05 16:19:18 +03:00
Pavel Emelyanov	98635b74a6	main: Keep feature_service for storage_proxy Fixes #6250 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200423165608.32419-1-xemul@scylladb.com>	2020-04-23 20:46:36 +02:00
Avi Kivity	88ade3110f	treewide: replace calls to engine().some_api() with some_api() This removes the need to include reactor.hh, a source of compile time bloat. In some places, the call is qualified with seastar:: in order to resolve ambiguities with a local name. Includes are adjusted to make everything compile. We end up having 14 translation units including reactor.hh, primarily for deprecated things like reactor::at_exit(). Ref #1	2020-04-05 12:46:04 +03:00
Pavel Emelyanov	86296ba557	main: Do not destroy token_metadata The storage_proxy instances hold references to token_metadata ones and leave unwaited futures continuing to its query_partition_key_range_concurrent method. The latter is called from do_query so it's not that easy to find out who is leaking. Keep the tokens not freed for a while. Fixes: #6093 Test: manual start-stop Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200402183538.9674-1-xemul@scylladb.com>	2020-04-03 16:00:08 +02:00
Konstantin Osipov	9948f548a5	lwt: remove Paxos from experimental list Always enable lightweight transactions. Remove the check for the command line switch from the feature service, assuming LWT is always enabled. Remove the check for LWT from Alternator. Note that in order for the cluster to work with LWT, all nodes need to support it. Rename LWT to UNUSED in db/config.hh, to keep accepting lwt keyword in --experimental-features command line option, but do nothing with it. Changes in v2: * remove enable_lwt feature flag, it's always there Closes #6102 test: unit (dev, debug) Message-Id: <20200401071149.41921-1-kostja@scylladb.com>	2020-04-01 09:12:21 +02:00
Rafael Ávila de Espíndola	c5795e8199	everywhere: Replace engine().cpu_id() with this_shard_id() This is a bit simpler and might allow removing a few includes of reactor.hh. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200326194656.74041-1-espindola@scylladb.com>	2020-03-27 11:40:03 +03:00
Nadav Har'El	2deba4035a	merge: Hook alternator to admission control Merged patch series from Piotr Sarna: This series hooks alternator to admission control, similarly to how CQL server uses it. The estimated memory consumption is set to 2x raw JSON request, since that seems to be the upper limit of how much more memory rapidjson allocates during parsing. Note, that since Seastar HTTP currently reads the whole contents upfront, there's no easy way to apply admission control before reading the request - that would involve some changes to our HTTP API. Note 2: currently, admission control in CQL does not properly pass memory consumption information for requests that are bounced to another shard - that would require either transferring semaphore units between shards or keeping a foreign pointer to the original units. As a result, alternator also does not pass correct admission control info between shards, and all places in code which do that are marked with clear FIXMEs. Fixes #5029 Piotr Sarna (5): storage_service: add memory limiter semaphore getter alternator: add service permit to callbacks alternator: add memory limiter to alternator server alternator: add addmission control stats entry alternator: hook admission control to alternator server alternator/executor.cc | 113 ++++++++++++++++++++++-------------- alternator/executor.hh | 32 +++++----- alternator/rmw_operation.hh | 1 + alternator/server.cc | 83 +++++++++++++++----------- alternator/server.hh | 8 ++- alternator/stats.cc | 2 + alternator/stats.hh | 1 + main.cc | 3 +- service/storage_service.hh | 4 ++ 9 files changed, 149 insertions(+), 98 deletions(-)	2020-03-19 15:51:17 +02:00
Pavel Emelyanov	da3bf20e71	main: Respect config start_native_transport option There's such an option, and it's not taken into account on scylla start. There's a symmetrical start_rpc one, which is, so make both act similarly. The default value for the option is true, so default set-ups will not get broken. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200310140518.29410-1-xemul@scylladb.com>	2020-03-18 11:17:56 +02:00
Piotr Sarna	a1ea650d83	alternator: add memory limiter to alternator server With the memory limiter semaphore, the server will be able to apply admission control to alternator requests.	2020-03-16 07:44:26 +01:00
Piotr Jastrzebski	22daa262ee	partitioner: move default_partitioner to schema.cc Make it inaccessible to other compilation units. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-03-15 10:25:20 +01:00
Tomasz Grabiec	3548e85ff7	Merge "features: Properly resolve when_enabled futures on stop" from Pavel E. If the feature service is stopped without enabling some features, the latter may end up with "broken promise" exception on futures attached to the _pr promise. Fix this by switching the only user of it onto 'listener' API and remove future-based one. Tests: unit(debug), manual start-stop and aborted-start	2020-03-10 10:09:24 +02:00
Avi Kivity	8af6dabbf0	Merge "Decouple cql_config from storage_service" from Pavel E " The cql_config is needed by storage_service to feed it to thrift/transport servers. These servers, in turn, put the config onto query_options. The final goal of this config reference is the guts of query_processor (but currently it's only used by restrictions). This way is rather long and confusing. It seems more natural to keep the cql_config on its main "user" -- query processor. This patch set does so. However, in order to push the config into its current usage places a huge refactoring is needed -- most of the classes in cql3/statements and cql3/restrictions. It's much more handy to continue keeping it via query_options, so the query_processor is equipped with the method to return the reference on the config to those initializing query_options. Tests: unit(debug) " * 'br-clean-client-services-from-cql-config-2' of https://github.com/xemul/scylla: storage_service: Forget cql_config transport: Forget cql_config thrift: Forget cql_config query_processor: Carry reference on cql_config	2020-03-09 15:06:59 +02:00

490 Commits