scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 11:36:54 +00:00

Author	SHA1	Message	Date
Aleksandra Martyniuk	ec86410094	task_manager: test api layer implementation The implementation of a test api that helps test the task manager api. It provides methods to simulate the operations that can happen on modules and their tasks. Through the api, the user can: register and unregister the test module and the tasks belonging to the module, and finish the tasks with success or a custom error.	2022-09-09 14:29:28 +02:00
Aleksandra Martyniuk	42f36db55b	task_manager: test api layer The test api that helps test the task manager api. It can be used to simulate the operations that can happen on modules and their tasks. Through the api, the user can: register and unregister the test module and the tasks belonging to the module, and finish the tasks with success or a custom error.	2022-09-09 14:29:28 +02:00
Aleksandra Martyniuk	c9637705a6	task_manager: api layer implementation The implementation of a task manager api layer. It provides methods to list the modules registered in task_manager, list tasks belonging to the given module, abort, wait for or retrieve a status of the given task.	2022-09-09 14:29:28 +02:00
Aleksandra Martyniuk	07043cee68	task_manager: api layer The task manager api layer. It can be used to list the modules registered in task_manager, list tasks belonging to the given module, abort, wait for or retrieve a status of the given task.	2022-09-09 14:29:28 +02:00
Aleksandra Martyniuk	b87a0a74ab	task_manager: keep task_manager reference in http_context Keep a reference to sharded<task_manager> as a member of http_context so it can be reached from rest api.	2022-09-09 14:29:28 +02:00
Amnon Heiman	5ac20ac861	Reduce the number of per-scheduling group metrics This patch reduces the number of metrics ScyllaDB generates. Motivation: The combination of per-shard with per-scheduling group generates a lot of metrics. When combined with histograms, which require many metrics, the problem becomes even bigger. The two tools we are going to use: 1. Replace per-shard histograms with summaries 2. Do not report unused metrics. The storage_proxy stats hold information for the API and the metrics layer. We replaced timed_rate_moving_average_and_histogram and time_estimated_histogram with the unified timed_rate_moving_average_summary_and_histogram, which gives us an option to report per-shard summaries instead of histograms. All the counters, histograms, and summaries were marked as skip_when_empty. The API was modified to use timed_rate_moving_average_summary_and_histogram. Closes #11173	2022-08-11 13:31:19 +03:00
Avi Kivity	be44fd63f9	Merge 'Make get_range_addresses async and hold effective_replication_map_ptr around it' from Benny Halevy This series converts the synchronous `effective_replication_map::get_range_addresses` to async by calling the replication strategy async entry point with the same name, as its callers are already async or can be made so easily. To allow it to yield and work on a coherent view of the token_metadata / topology / replication_map, let the callers of this patch hold a effective_replication_map per keyspace and pass it down to the (now asynchronous) functions that use it (making affected storage_service methods static where possible if they no longer depend on the storage_service instance). Also, the repeated calls to everywhere_replication_strategy::calculate_natural_endpoints are optimized in this series by introducing a virtual abstract_replication_strategy::has_static_natural_endpoints predicate that is true for local_strategy and everywhere_replication_strategy, and is false otherwise. With it, functions repeatedly calling calculate_natural_endpoints in a loop, for every token, will call it only once since it will return the same result every time anyhow. 
Refs #11005 Doesn't fix the issue as the large allocation still remains until we make change dht::token_range_vector chunked (chunked_vector cannot be used as is at the moment since we require the ability to push also to the front when unwrapping) Closes #11009 * github.com:scylladb/scylladb: effective_replication_map: make get_range_addresses asynchronous range_streamer: add_ranges and friends: get erm as param storage_service: get_new_source_ranges: get erm as param storage_service: get_changed_ranges_for_leaving: get erm as param storage_service: get_ranges_for_endpoint: get erm as param repair: use get_non_local_strategy_keyspaces_erms database: add get_non_local_strategy_keyspaces_erms database: add get_non_local_strategy_keyspaces storage_service: coroutinize update_pending_ranges effective_replication_map: add get_replication_strategy effective_replication_map: get_range_addresses: use the precalculated replication_map abstract_replication_strategy: get_pending_address_ranges: prevent extra vector copies abstract_replication_strategy: reindent utils: sequenced_set: expose set and `contains` method abstract_replication_strategy: calculate_natural_endpoints: return endpoint_set utils: sequenced_set: templatize VectorType utils: sanitize sequenced_set utils: sequenced_set: delete mutable get_vector method	2022-08-09 13:25:53 +03:00
Benny Halevy	7ee6048255	database: add get_non_local_strategy_keyspaces For node operations, we currently call get_non_system_keyspaces but really want to work on all keyspace that have non-local replication strategy as they are replicated on other nodes. Reflect that in the replica::database function name. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:01 +03:00
Benny Halevy	257d74bb34	schema, everywhere: define and use table_id as a strong type Define table_id as a distinct utils::tagged_uuid modeled after raft tagged_id, so it can be differentiated from other uuid-class types, in particular from table_schema_version. Fixes #11207 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:09:41 +03:00
Benny Halevy	d96b56fee2	database: rename {flush,snapshot}_on_all and make static Follow the convention of drop_table_on_all_shards. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-07 12:53:05 +03:00
Benny Halevy	14faa3b6f4	compaction_manager: perform_cleanup, perform_sstable_upgrade: use a lw_shared_ptr for owned token ranges And completely get rid of the dependency on replica::database. Also, add respective rest_api tests. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-02 08:08:11 +03:00
Aleksandra Martyniuk	6ea5bc96d7	scrub compaction: return status indicating aborted operations over the rest api When performing a compaction scrub, the user did not know whether an operation was aborted. If a compaction scrub is aborted, the return status the user gets over the rest api is set to 1.	2022-07-29 09:35:20 +02:00
Aleksandra Martyniuk	f1980f8dc6	scrub compaction: count validation errors and return status over the rest api When performing a compaction scrub, the user did not know whether any validation errors were encountered. The number of validation errors per given compaction scrub is gathered and summed from each shard. Based on that value, the return status over the rest api is set to 3 if any validation errors were encountered.	2022-07-29 09:35:20 +02:00
Amnon Heiman	99a060126d	database: Reduce the number of per-table metrics This patch reduces the number of metrics that is reported per table, when the per-table flag is on. When possible, it moves from time_estimated_histogram and timed_rate_moving_average_and_histogram to use the unified timer. Instead of a histogram per shard, it will now report a summary per shard and a histogram per node. Counters, histograms, and summaries will not be reported if they were never used. The API was updated accordingly so it would not break. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2022-07-27 16:58:52 +03:00
Benny Halevy	5eb31eff64	storage_service: coroutinize get_range_to_address_map and friends And add calls to maybe_yield to prevent stalls in this path as seen in performance testing. Also, add a respective rest_api test. Fixes #11114 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-25 18:06:28 +03:00
Benny Halevy	0b474866a3	storage_service: get_range_to_address_map: move selection of arbitrary ks to api layer It is only needed for the "storage_service/describe_ring" api and service/storage_service shouldn't bother with it. It's an api sugar coating. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-25 18:06:28 +03:00
Raphael S. Carvalho	cebe6e22cb	compaction_manager: scrub: switch to table_state Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	d29f7070d9	compaction_manager: upgrade: switch to table_state Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	bdd049afd6	compaction_manager: cleanup: switch to table_state Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	7a9908dbf1	compaction_manager: make stop compaction procedures switch to table_state They're used to stop all ongoing compaction on behalf of a given table T. Today, each table has a single table_state representing it, but after we implement compaction groups, we'll need to call the procedure for each group in a table. But the discussion doesn't belong here, as compaction group work will only come later. For the time being, we're only making compaction manager fully switch to table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Aleksandra Martyniuk	7871989551	api: list of the user keyspaces contains only user keyspaces storage_service/keyspaces?type=user along with user keyspaces returned the keyspaces that were internal but non-system. The list of the keyspaces for the user option (storage_service/keyspaces?type=user) contains neither system nor internal but only user keyspaces. Fixes: #11042 Closes #11049	2022-07-15 20:42:30 +02:00
Pavel Emelyanov	3a753068be	Merge "Make permissions cache live updateable and add an API for resetting authorization cache" from Igor Ribeiro Barbosa Duarte Currently, for users who have permissions_cache configs set to very high values (and thus can't wait for the configured times to pass) having to restart the service every time they make a change related to permissions or prepared_statements cache (e.g. adding a user and changing their permissions) can become pretty annoying. This patch series makes permissions_validity_in_ms, permissions_update_interval_in_ms and permissions_cache_max_entries live updateable so that restarting the service is not necessary anymore for these cases. It also adds an API for flushing the cache to make it easier for users who don't want to modify their permissions_cache config. branch: https://github.com/igorribeiroduarte/scylla/tree/make_permissions_cache_live_updateable CI: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1005/ dtests: https://github.com/igorribeiroduarte/scylla-dtest/tree/test_permissions_cache * https://github.com/igorribeiroduarte/scylla/make_permissions_cache_live_updateable: loading_cache_test: Test loading_cache::reset and loading_cache::update_config api: Add API for resetting authorization cache authorization_cache: Make permissions cache and authorized prepared statements cache live updateable auth_prep_statements_cache: Make aut_prep_statements_cache accept a config struct utils/loading_cache.hh: Add update_config method utils/loading_cache.hh: Rename permissions_cache_config to loading_cache_config and move it to loading_cache.hh utils/loading_cache.hh: Add reset method	2022-06-29 11:14:13 +03:00
Igor Ribeiro Barbosa Duarte	a23c3d6338	api: Add API for resetting authorization cache For cases where we have very high values set to permissions_cache validity and update interval (E.g.: 1 day), whenever a change to permissions is made it's necessary to update scylla config and decrease these values, since waiting for all this time to pass wouldn't be viable. This patch adds an API for resetting the authorization cache so that changing the config won't be mandatory for these cases. Usage: $ curl -X POST http://localhost:10000/authorization_cache/reset Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>	2022-06-28 19:58:06 -03:00
Botond Dénes	6c818f8625	Merge 'sstables: generation_type tidy-up' from Michael Livshin - Use `sstables::generation_type` in more places - Enforce conceptual separation of `sstables::generation_type` and `int64_t` - Fix `extremum_tracker` so that `sstables::generation_type` can be non-default-constructible Fixes #10796. Closes #10844 * github.com:scylladb/scylla: sstables: make generation_type an actual separate type sstables: use generation_type more soundly extremum_tracker: do not require default-constructible value types	2022-06-28 08:50:12 +03:00
Pavel Emelyanov	3ab7c9320c	api: Get rack/datacenter from topology The http_ctx already has token metadata on board, it's possible to get topology from it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-22 11:47:27 +03:00
Michael Livshin	ab13127761	sstables: use generation_type more soundly `generation_type` is (supposed to be) conceptually different from `int64_t` (even if physically they are the same), but at present Scylla code still largely treats them interchangeably. In addition to using `generation_type` in more places, we provide (no-op) `generation_value()` and `generation_from_value()` operations to make the smoke-and-mirrors more believable. The churn is considerable, but all mechanical. To avoid even more (way, way more) churn, unit test code is left untreated for now, except where it uses the affected core APIs directly. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-06-20 19:37:31 +03:00
Michael Livshin	28d44ce6db	api-doc: correct spelling Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-06-15 11:30:58 +03:00
Michael Livshin	aab4cd850c	allow pre-scrub snapshots of materialized views and secondary indices Previously, any attempt to take a materialized view or secondary index snapshot was considered a mistake and caused the snapshot operation to abort, with a suggestion to snapshot the base table instead. But an automatic pre-scrub snapshot of a view cannot be attributed to user error, so the operation should not be aborted in that case. (It is an open question whether the more correct thing to do during pre-scrub snapshot would be to silently ignore views. Or perhaps they should be ignored in all cases except when the user explicitly asks to snapshot them, by name) Closes #10760. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-06-15 11:30:58 +03:00
Nadav Har'El	043b1c7f89	Update seastar submodule. Unfortunately, also requires two changes to Scylla itself to make it still compile - see below * seastar 5e863627...96bb3a1b (18): > install-dependencies: add rocky as a supported distro > circleci: relax docker limits to allow running with new toolchain > core: memory: Add memory::free_memory() also in Debug mode > build: bump up zlib to 1.2.12 > cmake: add FindValgrind.cmake > Merge 'seastar-addr2line: support sct syslogs' from Benny Halevy > rpc: lower log level for 'failed to connect' errors > scripts: Build validation > perftune.py: remove rx_queue_count from mode condition. > memory: add attributes to memalign for compatibility with glibc 2.35 > condition-variable: Fix timeout "when" potentially not killing timer > Merge "tests: perf: measure coroutines performance" from Benny > Merge: Refine COUNTER metrics > Revert "Merge: Refine COUNTER metrics" > reactor: document intentional bitwise-on-bool op in smp_pollfn::poll() > Merge: Refine COUNTER metrics > SLES: additionally check irqbalance.service under /usr/lib > rpc_tester: job_cpu: mark virtual methods override Changes to Scylla also included in this merge: 1. api: Don't export DERIVEs (Pavel Emelyanov) Newer seastar doesn't have DERIVE metrics, but does have REAL_COUNTER one. Teach the collectd getter the change. (for the record: I don't understand how this endpoint works at all, there's a HISTOGRAM metrics out there that would be attempted to get exposed with the v.ui() call which is totally wrong) 2. test: use linux_perf_events.{cc,hh} from Seastar Seastar now has linux_perf_events.{cc,hh}. Remove Scylla's version of the same files and use Seastar's. Without this change, Scylla fails to compile when some source files end up including both versions and seeing double definitions. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-05-11 14:46:30 +02:00
Benny Halevy	01b1e54e22	api: storage_service: increase visibility of snapshot ops in the log snapshot operations over the api are rare but they contain significant state on disk in the form of sstables hard-linked to the snapshot directories. Also, we've seen snapshot operations hang in the field, requiring a core dump to analyse the issue, while there were no records in the log indicating when previous snapshot operations were last executed. This change promotes logging to info level when take_snapshot and del_snapshot start, and logs errors in case they fail. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:15:46 +03:00
Benny Halevy	b9d972d029	api: storage_service: coroutinize take_snapshot and del_snapshot Before making any further changes in them. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:02:52 +03:00
Benny Halevy	10b86ee5bd	api: storage_service: take_snapshot: improve api help messages Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:02:47 +03:00
Benny Halevy	5b4eb44795	database: add flush_on_all variants Use by api layer. Will be used in a later patch to flush on all shards before taking a snapshot. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 09:56:44 +03:00
Pavel Emelyanov	334d3434e7	code: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	5ac28a29d3	gossiper, code: Relax get_up/down/all_counters() helpers These helpers count elements in the endpoint state map. It makes sense to keep them in gossiper API, but it's worth removing the wrappers that do invoke_on(0). This makes code shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	5f53799ffb	api: Fix indentation after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	0ef33b71ba	gossiper, api: Remove get_arrival_samples() It's empty too, but the API-side conversion probably has some value for the future, so keep it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	37d392c772	gossiper, api: Remove get/set phi convict threshold helpers These are empty anyway. API caller can place return stubs itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	ad786d6b4d	gossiper, api: Move get_simple_states() into API code The API method in question just tries to scan the state map. There's no need in doing invoke_on(0) and in a separate helper method in gossiper, the creation of the json return value can happen in the API handler. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	f278d84cfe	gossiper, api: Remove get_endpoint_state() helpers There are two of them -- one to do invoke_on(0) the other one to get the needed data. The former one is not needed -- the scanned endpoint state map is replicated across shards and is the same everywhere. The latter is not needed, because there's only one user of it -- the API -- which can work with the existing gossiper API. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	0aea43a245	gossiper: Make state and locks maps private Locks are not needed outside gossiper, state map is sometimes read from, but there is a const getter for such cases. Both methods now deserve the underbar prefix, but it doesn't come with this short patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Avi Kivity	582802825a	treewide: use system-#include (angle brackets) for seastar Seastar is an external library from Scylla's point of view so we should use the angle bracket #include style. Most of the source follows this, this patch fixes a few stragglers. Also fix cases of #include which reached out to seastar's directory tree directly, via #include "seastar/include/seastar/..." to just refer to <seastar/...>. Closes #10433	2022-04-26 14:46:42 +03:00
Avi Kivity	de6631656c	api: avoid function specialization in req_param Function specializations are not allowed (you're supposed to use overloads), but clang appears to allow them. Here, we can't use an overload since the type doesn't appear in the parameter list. Use a constraint instead.	2022-04-18 12:27:18 +03:00
Pavel Emelyanov	633746b87d	snitch: Make config-based construction of all drivers Currently snitch drivers register themselves in class-registry with all sorts of construction options possible. All those different constructors are in fact "config options". When later snitch will declare its dependencies (gossiper and system keyspace), it will require patching all these registrations, which is very inconvenient. This patch introduces the snitch_config struct and replaces all the snitch constructors with the snitch_driver(snitch_config cfg) one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-11 14:38:34 +03:00
Pavel Emelyanov	ba6d2ecc6f	api: Remove unused argument from set_tables_autocompaction helper Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220329093113.5953-1-xemul@scylladb.com>	2022-03-30 11:42:52 +03:00
Avi Kivity	3c2271af52	Merge "De-globalize system keyspace local cache" from Pavel E " There's a static global sharded<local_cache> variable in system keyspace that keeps several bits on board that other subsystems need to get from the system keyspace, but want to have it in a future<>-less manner. Some time ago the system_keyspace became a classical sharded<> service that references the qctx and the local cache. This set removes the global cache variable and makes its instances be unique_ptr's sitting on the system keyspace instances. The biggest obstacle on this route is the local_host_id that was cached, but at some point was copied onto db::config to simplify getting the value from sstables manager (there's no system keyspace at hand there at all). So the first thing this set does is removes the cached host_id and makes all the users get it from the db::config. (There's a BUG with config copy of host id -- replace node doesn't update it. This set also fixes this place) De-globalizing the cache is the prerequisite for untangling the snitch-messaging-gossiper-system_keyspace knot. Currently cache is initialized too late -- when main calls system_keyspace.start() on all shards -- but before this time messaging should already have access to it to store its preferred IP mappings. 
tests: unit(dev), dtest.simple_boot_shutdown(dev) " * 'br-trade-local-hostid-for-global-cache' of https://github.com/xemul/scylla: system_keyspace: Make set_local_host_id non-static system_keyspace: Make load_local_host_id non-static system_keyspace: Remove global cache instance system_keyspace: Make it peering service system_keyspace,snitch: Make load_dc_rack_info non-static system_keyspace,cdc,storage_service: Make bootstrap manipulations non-static system_keyspace: Coroutinize set_bootstrap_state gossiper: Add system keyspace dependency cdc_generation_service: Add system keyspace dependency system_keyspace: Remove local host id from local cache storage_service: Update config.host_id on replace storage_service: Indentation fix after previous patch storage_service: Coroutinize prepare_replacement_info() system_distributed_keyspace: Indentation fix after previous patch code,system_keyspace: Relax system_keyspace::load_local_host_id() usage code,system_keyspace: Remove system_keyspace::get_local_host_id()	2022-03-27 17:19:24 +03:00
Avi Kivity	1feec08c2d	Revert "api: storage_service: force_keyspace_compaction: compact one table at a time" This reverts commit `37dc31c429`. There is no reason to suppose compacting different tables concurrently on different shards reduces space requirements, apart from non-deterministically pausing random shards. However, when data is badly distributed and there are many tables, it will slow down major compaction considerably. Consider a case where there are 100 tables, each with a 2GB large partition on some shard. This extra 200GB will be compacted on just one shard. With compaction rate of 40 MB/s, this adds more than an hour to the process. With the existing code, these compactions would overlap if the badly distributed data was not all in one shard. It is also counter to tablets, where data is not equally distributed on purpose. Closes #10246	2022-03-25 19:24:50 +03:00
Pavel Emelyanov	b8d3048104	code,system_keyspace: Relax system_keyspace::load_local_host_id() usage The method is nowadays called from several places: - API - sys.dist.ks. (to update view building info) - storage service prepare_to_join() - set up in main They all, but the last, can use db::config cached value, because it's loaded earlier than any of them (but the last -- that's the loading part itself). Once patched, the load_local_host_id() can avoid checking the cache for that value -- it will not be there for sure. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-25 13:23:30 +03:00
Pavel Emelyanov	b80d5f8900	schema_tables: Add sharded<system_keyspace> argument to update_schema_version_and_announce All its (indirect) callers had been patched to have it, now it's possible to have the argument in it. Next patch will make use of it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-16 14:24:40 +03:00
Pavel Emelyanov	bd4beeeebe	api: Carry sharded<system_keyspace> reference along There's an API call to recalculate schema version that needs the system_keyspace instance at hand Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-03-16 14:24:40 +03:00

657 Commits