scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 06:05:53 +00:00

Author	SHA1	Message	Date
Alejo Sanchez	700054abee	test.py: use internal id to manage servers Instead of using assigned IP addresses, use an internal server id. Define types to distinguish local server id, host ID (UUID), and IP address. This is needed to test servers changing IP address and for node replace (host UUID). Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	1e38f5478c	test.py: rename hostname to ip_addr The code explicitly manages an IP as string, make it explicit in the variable name. Define its type and test for set in the instance instead of using an empty string as placeholder. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	f478eb52a3	test.py: get host id When initializing a ScyllaServer, try to get the host id instead of only checking the REST API is up. Use the existing aiohttp session from ScyllaCluster. In case of HTTP error check the status was not an internal error (500+). Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	78663dda72	test.py: use REST api client in ScyllaCluster Move the REST api client to ScyllaCluster. This will allow the cluster to query its own servers. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	75ea345611	test.py: remove unnecessary reference to web app The aiohttp.web.Application only needs to be passed, so don't store a reference in ScyllaCluster object. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	a5316b0c6b	test.py: requests without aiohttp ClientSession Simplify REST helper by doing requests without a session. Reusing an aiohttp.ClientSession causes knock-on effects on `rest_api/test_task_manager` due to handling exceptions outside of an async with block. Requests for cluster management and Scylla REST API don't need session, anyway. Raise HTTPError with status code, text reason, params, and json. In ScyllaCluster.install_and_start() instead of adding one more custom exception, just catch all exceptions as they will be re-raised later. While there avoid code duplication and improve sanity, type checking, and lint score. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Petr Gusev	44f48bea0f	raft: test_remove_node_with_concurrent_ddl The test runs remove_node command with background ddl workload. It was written in an attempt to reproduce scylladb#11228 but seems to have value on its own. The if_exists parameter has been added to the add_table and drop_table functions, since the driver could retry the request sent to a removed node, but that request might have already been completed. Function wait_for_host_known waits until the information about the node reaches the destination node. Since we add new nodes at each iteration in main, this can take some time. A number of abort-related options were added to SCYLLA_CMDLINE_OPTIONS as it simplifies nailing down problems. Closes #11734	2022-11-04 17:16:35 +01:00
Kamil Braun	4974a31510	test/topology_raft_disabled: more Raft upgrade tests The tests are checking the upgrade procedure and recovery from failure in scenarios like when a node fails causing the procedure to get stuck or when we lose a majority in a fully upgraded cluster. Added some new functionalities to `ScyllaRESTAPIClient` like injecting errors and obtaining gossip generation numbers.	2022-10-10 14:32:10 +02:00
Kamil Braun	fa8dcb0d54	test/pylib: scylla_cluster: pass a list of ignored nodes to removenode The `removenode` operation normally requires the removing node to contact every node in the cluster except the one that is being removed. But if more than 1 node is down it's possible to specify a list of nodes to ignore for the operation; the `/storage_service/remove_node` endpoint accepts an `ignore_nodes` param which is a comma-separated list of IPs. Extend `ScyllaRESTAPIClient`, `ScyllaClusterManager` and `ManagerClient` so it's possible to pass the list of ignored nodes. We also modify the `/cluster/remove-node` Manager endpoint to use `put_json` instead of `get_text` and pass all parameters except the initiator IP (the IP of the node who coordinates the `removenode` operation) through JSON. This simplifies the URL greatly (it was already messy with 3 parameters) and more closely resembles Scylla's endpoint.	2022-10-10 12:59:12 +02:00
Kamil Braun	130ab1d312	test/pylib: rest_client: propagate errors from put_json	2022-10-10 12:59:12 +02:00
Kamil Braun	63892326d5	test/pylib: fix some type hints	2022-10-10 12:59:12 +02:00
Kamil Braun	6e3fe13fcf	test/pylib: scylla_cluster: don't create and drop keyspaces to check if cql is up Do a simple `SELECT` instead. This speeds up tests - creating and dropping keyspaces is relatively expensive, and we did this on every server restart.	2022-10-10 12:59:12 +02:00
Alejo Sanchez	abf1425ad4	test.py: Scylla REST methods for topology tests Provide a helper client for Scylla REST requests. Use it on both ScyllaClusterManager (e.g. remove node, test.py process) and ManagerClient (e.g. get uuid, pytest process). For now keep using IPs as key in ScyllaCluster, but this will be changed to UUID -> IP in the future. So, for now, pass both independently. Note the UUID must be obtained from the server before stopping it. Refresh client driver connection when decommissioning or removing a node. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-10-03 19:01:03 +02:00
Alejo Sanchez	86c752c2a0	test.py: rename server_id to server_ip In ScyllaCluster currently servers are tracked by the host IP. This is not the host id (UUID). Fix the variable name accordingly. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-10-03 19:01:03 +02:00
Alejo Sanchez	a7a0b446f0	test.py: HTTP client helper Split aiohttp client to a shared helper file. While there, move aiohttp session setup back to constructors. When there were teardown issues it looked like it could be caused by aiohttp session being created outside a coroutine. But this is proven not to be the case after recent fixes. So move it back to the ManagerClient constructor. On the other hand, create a close() coroutine to stop the aiohttp session. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-10-03 19:01:03 +02:00
Alejo Sanchez	41dbdf0f70	test.py: topology pass ManagerClient instead of... cql connection When there are topology changes, the driver needs to be updated. Instead of passing the CassandraCluster.Connection, pass the ManagerClient instance which manages the driver connection inside of it. Remove workaround for test_raft_upgrade. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-10-03 19:00:47 +02:00
Alejo Sanchez	0c3a06d0d7	test.py: delete unimplemented remove server Delete the unused, unimplemented, and broken version of remove server. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-10-03 18:57:38 +02:00
Alejo Sanchez	98bc4c198f	test.py: fix variable name ssl name clash Change variable ssl to use_ssl to avoid clash with ssl module. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-10-03 18:57:38 +02:00
Kamil Braun	b2cf610567	test/pylib: scylla_cluster: improve cluster printing Print the cluster name and stopped servers in addition to the running servers. Fix a logging call which tried to print a server in place of a cluster and even at that it failed (the server didn't have a hostname yet so it printed as an empty string). Add another logging call.	2022-09-30 17:00:05 +02:00
Kamil Braun	05ed3769dd	test/pylib: don't pass test_case_name to after-test endpoint It's redundant now, the manager tracks the current test case using before-test endpoint calls.	2022-09-30 16:41:45 +02:00
Kamil Braun	dc6f37b7f7	test/pylib: scylla_cluster: track current test case name and print it Use `_before_test` calls to track the current test case name. Concatenate it with the unique test name like this: `test_topology.1::test_add_server_add_column`, and print it instead of the test case name.	2022-09-30 16:38:35 +02:00
Kamil Braun	5be818d73b	test.py: pass the unique test name (e.g. `test_topology.1`) to cluster manager This helps us distinguish the different repeats of a test in logs. Rename the variable accordingly in `ScyllaClusterManager`.	2022-09-30 16:24:10 +02:00
Kamil Braun	fde4642472	test/pylib: scylla_cluster: pass the test case name to `before_test` We pass the test case name to `after_test` - so make it consistent. Arguably, the test case name is more useful (as it's more precise) than the test name.	2022-09-30 16:17:59 +02:00
Kamil Braun	43d8b4a214	test/pylib: use "test_case_name" variable name when talking about test cases Distinguish "test name" (e.g. `test_topology`) from "test case name" (e.g. `test_add_server_add_column` - a test case inside `test_topology`).	2022-09-30 16:15:48 +02:00
Kamil Braun	1793d43b15	test/pylib: scylla_cluster: mark `server_remove` as not implemented The `server_remove` function did a very weird thing: it shut down a server and made the framework 'forget' about it. From the point of view of the Scylla cluster and the driver the server was still there. Replace the function's body with `raise NotImplementedError`. In the future it can be replaced with an implementation that calls `removenode` on the Scylla cluster. Remove `test_remove_server_add_column` from `test_topology`. It effectively does the same thing as `test_stop_server_add_column`, except that the framework also 'forgets' about the stopped server. This could lead to weird situations because the forgotten server's IP could be reused in another test that was running concurrently with this test. Closes #11657	2022-09-29 21:03:18 +03:00
Nadav Har'El	de1bc147bc	Merge 'test.py: cleanups in topology test suites' from Kamil Braun Fix the type of `create_server`, rename `topology_for_class` to `get_cluster_factory`, simplify the suite definitions and parameters passed to `get_cluster_factory` Closes #11590 * github.com:scylladb/scylladb: test.py: replace `topology` with `cluster_size` in Topology tests test.py: rename `topology_for_class` to `get_cluster_factory` test/pylib: ScyllaCluster: fix create_server parameter type	2022-09-28 15:19:54 +03:00
Kamil Braun	1bcc28b48b	test/topology_raft_disabled: reenable `test_raft_upgrade` The test was disabled due to a bug in the Python driver which caused the driver not to reconnect after a node was restarted (see scylladb/python-driver#170). Introduce a workaround for that bug: we simply create a new driver session after restarting the nodes. Reenable the test. Closes #11641	2022-09-28 15:13:42 +03:00
Alejo Sanchez	02933c9b82	test.py: close aiohttp session for topology tests Close the aiohttp ClientSession after pytest session finishes. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #11648	2022-09-27 18:09:08 +02:00
Kamil Braun	06cc4f9259	test/pylib: ScyllaCluster: fix create_server parameter type The only usage of `ScyllaCluster` constructor passed a `create_server` function which expected a `List[str]` for the second parameter, while the constructor specified that the function should expect an `Optional[List[str]]`. There was no reason for the latter, we can easily fix this type error. Also give a type hint for `create_cluster` function in `PythonTestSuite.topology_for_class`. This is actually what caught the type error.	2022-09-26 11:45:44 +02:00
Alejo Sanchez	510215d79a	test.py: fix ScyllaClusterManager start/stop Check existing is_running member to avoid re-starting. While there, set it to false after stopping. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-09-21 11:42:02 +02:00
Alejo Sanchez	933d93d052	test.py: fix topology init error handling Start ScyllaClusterManager within error handling so the ScyllaCluster logs are available in case of error starting up. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-09-21 09:15:25 +02:00
Alejo Sanchez	087ae521c5	test.py: make client fail if before test check fails Check if request to server side (test.py) failed and raise if so. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #11575	2022-09-19 18:04:07 +02:00
Kamil Braun	348582c4c8	test/pylib: pool: make it possible to free up space Some tests mark clusters as 'dirty', which makes them non-reusable by later tests; we don't want to return them to the pool of clusters. This use-case was covered by the `add_one` function in the `Pool` class. However, it had the unintended side effect of creating extra clusters even if there were no more tests that were waiting for new clusters. Rewrite the implementation of `Pool` so it provides 3 interface functions: - `get` borrows an object, building it first if necessary - `put` returns a borrowed object - `steal` is called by a borrower to free up space in the pool; the borrower is then responsible for cleaning up the object. Both `put` and `steal` wake up any outstanding `get` calls. Objects are built only in `get`, so no objects are built if none are needed. Closes #11558	2022-09-18 12:05:57 +03:00
Alejo Sanchez	2da7304696	test.py: log server restarts for topology tests Add missing logging for server restart. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-09-15 15:10:29 +02:00
Alejo Sanchez	61a92afa2d	test.py: log actions for topology tests For debugging, log driver connection, before and after checks, and topology changes. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-09-15 15:10:29 +02:00
Alejo Sanchez	604f7353ef	Revert "test.py: restart stopped servers before... teardown..." This reverts commit `df1ca57fda`. In order to prevent timeouts on teardown queries, the previous commit added functionality to restart servers that were down. This issue is fixed in fc0263fc9b so there's no longer need to restart stopped servers on test teardown.	2022-09-15 14:47:01 +02:00
Alejo Sanchez	ed81f1a85c	test.py: ManagerClient API fix return text For ManagerClient request API, don't return status, raise an exception. Server side errors are signaled by status 500, not text body. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-09-15 14:47:01 +02:00
Alejo Sanchez	4a5f2418ec	test.py: ManagerClient raise on HTTP != 200 Raise an exception if the request result is not HTTP 200 for .get() helper. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-09-15 14:47:01 +02:00
Alejo Sanchez	a84bde38c0	test.py: ManagerClient fix paths to updated resource Fix missing path renames for server-side rename "node" -> "server" API. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-09-15 14:47:01 +02:00
Alejo Sanchez	b8f68729b0	test.py: Pool add fresh when item not returned Pool.get() might have waiting callers, so if an item is not returned to the pool after use, tell the pool to add a new one and tell the pool an entry was taken (used for total running entries, i.e. clusters). Use it when a ScyllaCluster is dirty and not returned. While there improve logging and docstrings. Issue reported by @kbr-. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #11546	2022-09-15 13:56:44 +03:00
Kamil Braun	73bf781e17	test/pylib: APIs to read and modify configuration from tests We introduce `server_get_config` to fetch the entire configuration dict and `update_config` to update a value under the given key.	2022-09-14 12:46:41 +02:00
Kamil Braun	1f550428a9	test/pylib: ScyllaServer: extract _write_config_file function For refreshing the on-disk config file with the config stored in dict form in the `self.config` field.	2022-09-14 12:46:41 +02:00
Kamil Braun	52e52e8503	test/pylib: ScyllaCluster: extend ActionReturn with dict data For returning types more complex than text. Also specify a default empty string value for the `msg` field for non-text return values.	2022-09-14 12:46:41 +02:00
Kamil Braun	c9348ae8ea	test/pylib: ManagerClient: introduce _put_json For sending PUT requests to the Manager (such as updating configuration).	2022-09-14 12:46:41 +02:00
Kamil Braun	d81c722476	test/pylib: ManagerClient: replace `_request` with `_get`, `_get_text` `_request` performed a GET request and extracted a text body out of the response. Split it into `_get`, which only performs the request, and `_get_text`, which calls `_get` and extracts the body as text. Also extract a `_resource_uri` function which will be used for other request types.	2022-09-14 12:46:41 +02:00
Kamil Braun	9d39e14518	test: pylib: store server configuration in `ScyllaServer` In following commits we will make this configuration accessible from tests through the Manager (for fetching and updating).	2022-09-14 12:46:41 +02:00
Kamil Braun	311806244d	test: pylib: use Python dicts to manipulate `ScyllaServer` configuration Previously we used a formattable string to represent the configuration; values in the string were substituted by Python's formatting mechanism and the resulting string was stored to obtain the config file. This approach had some downsides, e.g. it required boilerplate work to extend: to add a new config option, you would have to modify this template string. Instead we can represent the configuration as a Python dictionary. Dicts are easy to manipulate, for example you can sum two dicts; if a key appears in both, the second dict 'wins': ``` {1:1} | {1:2} == {1:2} ``` This makes the configuration easy to extend without having to write boilerplate: if the user of `ScyllaServer` wants to add or override a config option, they can simply add it to the `config_options` dict and that's it - no need to modify any internal template strings in `ScyllaServer` implementation like before. The `config_options` dict is simply summed with the 'base' config dict of `ScyllaServer` (`config_options` is the right summand so anything in there overrides anything in the base dict). An example of this extensibility is the `authenticator` and `authorizer` options which no longer appear in `scylla_cluster.py` module after this change, they only appear in the suite.yaml file. Also, use "workdir" option instead of specifying data dir, commitlog dir etc. separately.	2022-09-12 11:57:58 +02:00
Kamil Braun	fd19825eaa	test: pylib: store `config_options` in `ScyllaServer` Previously the code extracted `authenticator` and `authorizer` keys from the config options and stored them. Store the entire dict instead. The new code is easier to extend if we want to make more options configurable.	2022-09-12 11:57:18 +02:00
Pavel Emelyanov	bbad3eac63	pylib: Cast port number config to int explicitly Otherwise it crashes some python versions. The cast was there before `a2dd64f68f` explicitly dropped one while moving the code between files. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #11511	2022-09-09 18:08:08 +02:00
Kamil Braun	dba595d347	Merge 'Minimal implementation of Broadcast Tables' from Mikołaj Grzebieluch Broadcast tables are tables for which all statements are strongly consistent (linearizable), replicated to every node in the cluster and available as long as a majority of the cluster is available. If a user wants to store a “small” volume of metadata that is not modified “too often” but provides high resiliency against failures and strong consistency of operations, they can use broadcast tables. The main goal of the broadcast tables project is to solve problems which need to be solved when we eventually implement general-purpose strongly consistent tables: designing the data structure for the Raft command, ensuring that the commands are idempotent, handling snapshots correctly, and so on. In this MVP (Minimum Viable Product), statements are limited to simple SELECT and UPDATE operations on the built-in table. In the future, other statements and data types will be available but with this PR we can already work on features like idempotent commands or snapshotting. Snapshotting is not handled yet which means that restarting a node or performing too many operations (which would cause a snapshot to be created) will give incorrect results. In a follow-up, we plan to add end-to-end Jepsen tests (https://jepsen.io/). With this PR we can already simulate operations on lists and test linearizability in linear complexity. This can also test Scylla's implementation of persistent storage, failure detector, RPC, etc. 
Design doc: https://docs.google.com/document/d/1m1IW320hXtsGulzSTSHXkfcBKaG5UlsxOpm6LN7vWOc/edit?usp=sharing Closes #11164 * github.com:scylladb/scylladb: raft: broadcast_tables: add broadcast_kv_store test raft: broadcast_tables: add returning query result raft: broadcast_tables: add execution of intermediate language raft: broadcast_tables: add compilation of cql to intermediate language raft: broadcast_tables: add definition of intermediate language db: system_keyspace: add broadcast_kv_store table db: config: add BROADCAST_TABLES feature flag	2022-09-09 18:05:37 +02:00
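
Note on 348582c4c8 (`Pool` rewrite): the `get` / `put` / `steal` interface described in that commit message can be sketched roughly as follows. This is a minimal illustration and not the code from `test/pylib`; the constructor arguments, the attribute names, and the use of `asyncio.Condition` are all assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, Generic, List, TypeVar

T = TypeVar("T")


class Pool(Generic[T]):
    """Hypothetical pool of at most `max_size` lazily built objects."""

    def __init__(self, max_size: int, build: Callable[[], Awaitable[T]]) -> None:
        self.max_size = max_size
        self.build = build          # assumed async factory, e.g. "start a cluster"
        self.free: List[T] = []     # built objects that are not currently borrowed
        self.total = 0              # free + borrowed + currently being built
        self.cond = asyncio.Condition()

    async def get(self) -> T:
        """Borrow an object, building it first if necessary."""
        async with self.cond:
            # Wait until something is free or there is room to build a new one.
            await self.cond.wait_for(
                lambda: bool(self.free) or self.total < self.max_size)
            if self.free:
                return self.free.pop()
            self.total += 1         # reserve a slot before building
        try:
            return await self.build()
        except BaseException:
            # Building failed: release the reserved slot and wake a waiter.
            async with self.cond:
                self.total -= 1
                self.cond.notify()
            raise

    async def put(self, obj: T) -> None:
        """Return a borrowed object so that later `get` calls can reuse it."""
        async with self.cond:
            self.free.append(obj)
            self.cond.notify()

    async def steal(self) -> None:
        """Free up space without returning the object (e.g. a dirty cluster).

        The borrower keeps the object and is responsible for cleaning it up.
        """
        async with self.cond:
            self.total -= 1
            self.cond.notify()
```

Because objects are built only inside `get`, a `steal` by the last running test does not trigger the construction of a replacement that nobody is waiting for, which is the side effect the commit message attributes to the earlier `add_one` approach.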

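Note on 311806244d (dict-based `ScyllaServer` configuration): the "right summand wins" merge from that commit message can be illustrated with a small sketch. The function name `make_config` and the concrete keys and values below are hypothetical, chosen only for the example; the `|` operator on dicts requires Python 3.9 or newer.

```python
from typing import Any, Dict


def make_config(base: Dict[str, Any], config_options: Dict[str, Any]) -> Dict[str, Any]:
    """Merge two config dicts; on a key clash the right-hand dict wins."""
    return base | config_options


# Hypothetical base configuration held by the server object.
base_config = {
    "workdir": "/tmp/scylla-1",
    "authenticator": "AllowAllAuthenticator",
}

# Hypothetical per-suite overrides, e.g. as they might be read from suite.yaml.
suite_options = {
    "authenticator": "PasswordAuthenticator",
    "authorizer": "CassandraAuthorizer",
}

merged = make_config(base_config, suite_options)
print(merged["authenticator"])  # the value from suite_options
```

The merge returns a new dict and leaves both inputs unchanged, so the base configuration can be reused for other servers.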

105 Commits