Just in case the new algorithm turns out to be buggy or gives a
performance regression, add a flag to fall back to the old algorithm for
use in the field.
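For illustration only, such a fallback flag in db::config could look roughly
like this (the option name and registration arguments are assumptions, not the
actual patch):

    // Hypothetical declaration in db/config.hh:
    named_value<bool> enable_optimized_reversed_reads;

    // Hypothetical registration in the db/config.cc initializer list:
    , enable_optimized_reversed_reads(this, "enable_optimized_reversed_reads",
          value_status::Used, true,
          "If false, fall back to the old reversed-reads code path.")

    // Hypothetical read site:
    if (!_db.get_config().enable_optimized_reversed_reads()) {
        // take the old reversed-reads algorithm
    }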
Closes #9908
* github.com:scylladb/scylla:
db: config: add a flag to disable new reversed reads algorithm
replica: table: remove obsolete comment about reversed reads
This series greatly reduces the gossiper's dependence on `seastar::async` (though not yet completely).
`i_endpoint_state_change_subscriber` callbacks are converted to return futures (again, to get rid of `seastar::async` dependency), all users are adjusted appropriately (e.g. `storage_service`, `cdc::generation_service`, `streaming::stream_manager`, `view_update_backlog_broker` and `migration_manager`).
This includes futurizing and coroutinizing the whole function call chain up to the `i_endpoint_state_change_subscriber` callback functions.
To aid the conversion process, a non-`seastar::async` dependent variant of `utils::atomic_vector::for_each` is introduced (`for_each_futurized`). A different name is used to clearly distinguish converted and non-converted code, so that the last step (remove `seastar::async()` wrappers around callback-calling code in gossiper) is easier. This is left for a follow-up series, though.
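As a rough illustration (signatures simplified, not the exact Scylla
declarations), the conversion changes the callback and iteration pattern like
this:

    // Before: the callback may block; callers wrap it in seastar::async().
    virtual void on_join(gms::inet_address ep, gms::endpoint_state state) override;

    // After: the callback returns a future and is awaited directly.
    virtual seastar::future<> on_join(gms::inet_address ep, gms::endpoint_state state) override;

    // Hypothetical shape of the futurized iteration helper; the real
    // utils::atomic_vector::for_each_futurized must also cope with
    // concurrent modification, which is elided here.
    template <typename Func>
    seastar::future<> for_each_futurized(Func func) {
        for (auto& subscriber : _subscribers) {   // _subscribers: the stored callbacks
            co_await func(subscriber);
        }
    }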
Tests: unit(dev)
Closes #9844
* github.com:scylladb/scylla:
service: storage_service: coroutinize `set_gossip_tokens`
service: storage_service: coroutinize `leave_ring`
service: storage_service: coroutinize `handle_state_left`
service: storage_service: coroutinize `handle_state_leaving`
service: storage_service: coroutinize `handle_state_removing`
service: storage_service: coroutinize `do_drain`
service: storage_service: coroutinize `shutdown_protocol_servers`
service: storage_service: coroutinize `excise`
service: storage_service: coroutinize `remove_endpoint`
service: storage_service: coroutinize `handle_state_replacing`
service: storage_service: coroutinize `handle_state_normal`
service: storage_service: coroutinize `update_peer_info`
service: storage_service: coroutinize `do_update_system_peers_table`
service: storage_service: coroutinize `update_table`
service: storage_service: coroutinize `handle_state_bootstrap`
service: storage_service: futurize `notify_*` functions
service: storage_service: coroutinize `handle_state_replacing_update_pending_ranges`
repair: row_level_repair_gossip_helper: coroutinize `remove_row_level_repair`
locator: reconnectable_snitch_helper: coroutinize `reconnect`
gms: i_endpoint_state_change_subscriber: make callbacks to return futures
utils: atomic_vector: introduce future-returning `for_each` function
utils: atomic_vector: rename `for_each` to `thread_for_each`
gms: gossiper: coroutinize `start_gossiping`
gms: gossiper: coroutinize `force_remove_endpoint`
gms: gossiper: coroutinize `do_status_check`
gms: gossiper: coroutinize `remove_endpoint`
replica::database is (as its name indicates) a replica-side service, while thrift
is coordinator-side. Convert thrift's use of replica::database for data dictionary
lookups to the data_dictionary module. Since data_dictionary was missing a
get_keyspaces() operation, add that.
Thrift still uses replica::database to get the schema version. That should be
provided by migration_manager, but changing that is left for later.
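A minimal sketch of the direction of the switch (function and identifier names
here are illustrative; the real thrift handler code differs):

    // Before: coordinator-side thrift code reached into the replica module.
    //   replica::database& db = ...;
    //   auto& ks = db.find_keyspace(ks_name);

    // After: the same metadata lookups go through the data_dictionary facade.
    void list_keyspaces(data_dictionary::database db) {
        for (auto ks : db.get_keyspaces()) {      // operation added by this series
            // build the thrift KsDef from ks.metadata()
        }
    }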
Closes #9888
* github.com:scylladb/scylla:
thrift: switch from replica module to data_dictionary module
thrift: simplify execute_schema_command() calling convention
data_dictionary: add get_keyspaces() method
The BatchGetItem request can return a very large response - according to
the DynamoDB documentation up to 16 MB, but presently Alternator allows
even more (see #5944).
The problem is that the existing code prepares the entire response as
a large contiguous string, resulting in oversized allocation warnings -
and potentially allocation failures. So in this patch we estimate the size
of the BatchGetItem response, and if it is "big enough" (currently over
100 KB), we return it with the recently added streaming output support.
This streaming output unfortunately doesn't avoid the extra memory copies,
but it does avoid a *contiguous* allocation, which is the goal of this
patch.
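A hedged sketch of the size-based decision described above (the threshold
constant and the size-estimation helper are illustrative names, not
necessarily the ones in the patch):

    // Illustrative only: choose streamed output for a large BatchGetItem response.
    static constexpr size_t streaming_threshold = 100 * 1024;   // "big enough"

    executor::request_return_type make_batch_get_item_response(rjson::value&& response) {
        if (estimate_response_size(response) > streaming_threshold) {   // hypothetical helper
            // streamed output still copies data, but avoids one huge contiguous allocation
            return make_streamed(std::move(response));
        }
        return make_jsonable(std::move(response));   // small responses keep the simple path
    }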
After this patch, one oversized allocation warning is gone from the test:
test/alternator/run test_batch.py::test_batch_get_item_large
(a second oversized allocation is still present, but comes from the
unrelated BatchWriteItem issue #8183).
Fixes #8522
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220111170541.637176-1-nyh@scylladb.com>
Just in case the new algorithm turns out to be buggy or gives a
performance regression, add a flag to fall back to the old algorithm for
use in the field.
Thrift is a coordinator-side service and should not touch the replica
module. Switch it to data_dictionary.
The switch is straightforward with two exceptions:
- client_state still receives replica::database parameters. After
this change it will be easier to adapt client_state too.
- calls to replica::database::get_version() remain. They should be
rerouted to migration_manager instead, as that deals with schema
management.
execute_schema_command is always called with the same first two
parameters, which are always defined from the thrift_handler
instance that contains its caller. Simplify it by making it a member
function.
This simplifies migration to data_dictionary in the next patch.
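Schematically, the calling-convention change looks like this (parameter types
are illustrative, not the actual signature):

    // Before: a free function, always invoked with the same first two
    // arguments, both derived from the calling thrift_handler.
    future<std::string> execute_schema_command(service::migration_manager& mm,
                                               replica::database& db
                                               /* , remaining args */);

    // After: a member function; the handler supplies that state itself.
    class thrift_handler {
        future<std::string> execute_schema_command(/* remaining args */);
    };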
Mirroring replica::database::get_keyspaces(), for Thrift's use.
We return a vector instead of a hash map. Random access is already
available via database::find_keyspace(). The name is available
via the keyspace metadata, and in fact Thrift ignores the map
name and uses the metadata name. Using a simpler type reduces
include dependencies for this heavily used module.
The function is plumbed to replica::database::get_keyspaces() so
it returns the same data.
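Under the assumptions above, the added accessor looks roughly like this (exact
types simplified):

    // Illustrative shape of the new data_dictionary operation: a vector of
    // keyspace handles rather than a name-keyed hash map. Names are reachable
    // via ks.metadata()->name(), random access already exists through
    // find_keyspace(), and the facade forwards to
    // replica::database::get_keyspaces(), so both return the same data.
    std::vector<keyspace> database::get_keyspaces() const;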
This patch series moves the rest of the internal DDL users to perform schema
changes over Raft (if enabled). After this series, only tests are left
using the old API.
* 'gleb/raft-schema-rest-v6' of github.com:scylladb/scylla-dev: (33 commits)
migration_manager: drop no longer used functions
system_distributed_keyspace: move schema creation code to use raft
auth: move table creation code to use raft
auth: move keyspace creation code to use raft
table_helper: move schema creation code to use raft
cql3: make query_processor inherit from peering_sharded_service
table_helper: make setup_table() static
table_helper: co-routinize setup_keyspace()
redis: move schema creation code to go through raft
thrift: move system_update_column_family() to raft
thrift: authenticate a statement before verifying in system_update_column_family()
thrift: co-routinize system_update_column_family()
thrift: move system_update_keyspace() to raft
thrift: authenticate a statement before verifying in system_update_keyspace()
thrift: co-routinize system_update_keyspace()
thrift: move system_drop_keyspace() to raft
thrift: authenticate a statement before verifying in system_drop_keyspace()
thrift: co-routinize system_drop_keyspace()
thrift: move system_add_keyspace() to raft
thrift: co-routinize system_add_keyspace()
...
This was needed to fix issue #2129, which only manifested itself with
auto_bootstrap set to false. The option is now ignored and we always
wait for the schema to synchronize during boot.
Fixes #9408
While it is rare, some customer issues have shown that we can run into cases where commit log apply (writing mutations to it) fails badly - in the known cases, due to oversized mutations. While these should really have been caught earlier in the call chain, it would probably help both end users and us (trying to figure out how they got so big and how they got so far) if we added info to the errors thrown (and printed), such as ks, cf, and mutation content.
Somewhat controversially, this coroutinizes the apply-with-commitlog decision path, mainly to make the error handling for the more informative wrapper exception easier and less ugly. It could perhaps also be done with futurize_invoke + then_wrapped, but the future is coroutines...
As stated, this is somewhat problematic: it adds an allocation to the perf_simple_query::write path (because of poor clang coroutine frame folding?). However, tasks/op remain constant and actual tps (though unstable) remains more or less the same (in my rough measurements).
The counter path is unaffected, as the coroutine frame alloc replaces the with(...) alloc.
A dtest for the wrapped exception is on a separate PR.
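A hedged sketch of the wrapper-exception idea in the coroutinized path
(function and member names are illustrative, not the exact patch):

    // Illustrative only: attach ks/cf/mutation info when commitlog apply fails.
    future<> database::do_apply(schema_ptr s, const frozen_mutation& m,
                                db::timeout_clock::time_point timeout) {
        try {
            co_await apply_with_commitlog(s, m, timeout);
        } catch (...) {
            std::throw_with_nested(std::runtime_error(format(
                "Failed to apply mutation to {}.{} (size {} bytes)",
                s->ks_name(), s->cf_name(), m.representation().size())));
        }
    }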
Closes #9412
* github.com:scylladb/scylla:
database: Add error message with mutation info on commit log apply failure
database: coroutinize do_apply and apply_with_commitlog
Fixes #9408
While it is rare, some customer issues have shown that we can run into cases
where commit log apply (writing mutations to it) fails badly. In the known
cases, due to oversized mutations. While these should really have been caught
earlier in the call chain, it would probably help both end users and us (trying
to figure out how they got so big and how they got so far) if we added info
to the errors thrown (and printed), such as ks, cf, and mutation content.
Somewhat controversial: this coroutinizes the apply-with-commitlog decision
path, mainly to make it possible in the next patch to make error handling
more informative (because we will have exceptions that are immediate
and/or futurized).
As stated, this is somewhat problematic: it adds an allocation to the
perf_simple_query::write path (because of poor clang coroutine frame folding?).
However, tasks/op remain constant and actual tps (though unstable)
remains more or less the same (in my rough measurements).
The counter path is unaffected, as the coroutine frame alloc replaces the
with(...) alloc, and all stays the same.
I am hoping that the simpler error handling and more verbose errors will
compensate for the extra alloc.
Refs: #9555
When running the "Kraken" dynamodb streams test to provoke the issued observed by QA, I noticed on my setup mainly two things: Large allocation stalls (+ warnings) and timeouts on read semaphores in DB.
This tries to address the first issue, partly by making query_result_view serialization using chunked vector instead of linear one, and by introducing a streaming option for json return objects, avoiding linearizing to string before wire.
Note that the latter has some overhead issues of its own, mainly data copying, since we essentially will be triple buffering (local, wrapped http stream, and final output stream). Still, normal string output will typically do a lot of realloc which is potential extra copies as well, so...
This is not really performance tested, but with these tweaks I no longer get large alloc stalls at least, so that is a plus. :-)
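A minimal sketch of the streamed-output idea (the exact print-to-stream
signature is an assumption based on the commit titles listed below):

    // Illustrative only: write the rjson::value straight to the HTTP output
    // stream instead of first linearizing it into one large contiguous string.
    future<> send_streamed(seastar::output_stream<char>& os, const rjson::value& v) {
        co_await rjson::print(v, os);   // assumed shape of the new print-to-stream helper
        co_await os.flush();
    }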
Closes #9713
* github.com:scylladb/scylla:
alternator::executor: Use streamed result for scan etc if large result
alternator::streams: Use streamed result in get_records if large result
executor/server: Add routine to make stream object return
rjson: Add print to stream of rjson::value
query_idl: Make qr_partition::rows/query_result::partitions chunked