scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-29 04:37:00 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	f953fb2f52	schema_change_test: Use proxy from cql_test_env There's one place where test case calls for storage proxy and currently does it via global reference. Time to switch it to cql_test_env's one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-21 14:18:00 +03:00
Pavel Emelyanov	681a19f54c	test: Carry proxy reference on cql_test_env All sharded<> services are created by cql_test_env on the stack. The cql_test_env() is then used to keep references on some of them and to export them to test cases via its methods. Proxy is missing on that exportable list, but will be needed, so add one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-21 14:16:54 +03:00
Avi Kivity	0c64dd12b1	test: raft_server_test: fix string compare for clang 15 Clang 15 rejects string compares where the left-hand-side is a C string, so help it along by converting it ourselves. Closes #13582	2023-04-21 06:38:10 +03:00
Botond Dénes	1426c623eb	Merge 'Tune up S3 unit tests environment usage (and a bit more)' from Pavel Emelyanov The tests in question are using MINIO_SERVER_ADDRESS environment variable to export minio server address from pylib to test cases. Also they use hard-coded public bucket name. Both play badly with AWS S3, the former due to MINIO_... in its name and the latter because public bucket name can be anything. So this PR puts address and public bucket name into S3_..._FOR_TEST environment variables and fixes output stream closure on failure while at it. Detached from #13493 Closes #13546 * github.com:scylladb/scylladb: s3/test: Rename MINIO_SERVER_ADDRESS environment variable s3/test: Keep public bucket name in environment s3/test: Fix upload stream closure test/lib: Add getenv_safe() helper	2023-04-20 18:01:12 +03:00
Botond Dénes	66ee73641e	test/cql-pytest/nodetool.py: no_autocompaction_context: use the correct API This `with` context is supposed to disable, then re-enable autocompaction for the given keyspaces, but it used the wrong API for it, it used the column_family/autocompaction API, which operates on column families, not keyspaces. This oversight led to a silent failure because the code didn't check the result of the request. Both are fixed in this patch: * switch to use `storage_service/auto_compaction/{keyspace}` endpoint * check the result of the API calls and report errors as exceptions Fixes: #13553 Closes #13568	2023-04-20 16:21:16 +03:00
Kamil Braun	8d7b5f1710	Merge 'test/pylib: topology fix asyncio fixture and fix logger' from Alecco Remove unnecessary asyncio marker and re-introduce top level logger instance. Closes #13561 * github.com:scylladb/scylladb: test/pylib: add missing logger test/pylib: remove unnecessary asyncio marker	2023-04-20 14:23:05 +02:00
Alejo Sanchez	11561a73cb	test/pylib: ManagerClient helpers to wait for... server to see other servers after start/restart When starting/restarting a server, provide a way to wait for the server to see at least n other servers. Also leave the implementation methods available for manual use and update previous tests, one to wait for a specific server to be seen, and one to wait for a specific server to not be seen (down). Fixes #13147 Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #13438	2023-04-20 14:22:31 +02:00
Alejo Sanchez	2c1ba377bf	test/pylib: add missing logger The logger instance was removed in a previous commit but it is used in the wrapper helper. Add it back. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-04-20 10:36:02 +02:00
Alejo Sanchez	05338a6cd7	test/pylib: remove unnecessary asyncio marker Remove the asyncio marker from the fixture, as it is only needed for tests. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-04-20 10:36:02 +02:00
Botond Dénes	0c430c01e9	Merge 'cql: allow SUM() aggregations which result in a NaN' from Nadav Har'El This short PR fixes a bug in SUM() aggregation where if the data contains +Inf and -Inf the returned sum should be NaN but we returned an error instead. This is a recent regression uncovered by a dtest (see issue #13551), but in the first patch we add additional tests in the cql-pytest framework which reproduce this bug and explore various other areas (wrongly) implicated by the failing dtest. Fixes #13551 Closes #13564 * github.com:scylladb/scylladb: cql3: allow SUM() aggregation to result in a NaN test/cql-pytest: add tests for data casts and inf in sums	2023-04-19 13:50:23 +03:00
Pavel Emelyanov	a77ca69360	s3/test: Rename MINIO_SERVER_ADDRESS environment variable The pylib minio code uses it to export the minio address for tests. This creates unneeded WTFs when running the test over AWS S3, so it's better to rename the variable so that it does not mention MINIO at all. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	12c4e7d605	s3/test: Keep public bucket name in environment Local test.py runs minio with the public 'testbucket' bucket and all test cases know that. This series adds an ability to run tests over real S3 so the bucket name should be configurable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	91674da982	s3/test: Fix upload stream closure If multipart upload fails for some reason the output stream remains not closed and the respective assertion masquerades the original failure. Fix that by closing the stream in all cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	b239e0d368	test/lib: Add getenv_safe() helper The helper is like ::getenv() but checks if the variable exists and throws descriptive exception. So instead of fatal error: in "...": std::logic_error: basic_string: construction from null is not valid one could get something like fatal error: in "...": std::logic_error: Environment variable ... not set Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:49:26 +03:00
Nadav Har'El	81e0f5b581	cql3: allow SUM() aggregation to result in a NaN When floating-point data contains +Inf and -Inf, the sum is NaN. Our SUM() aggregation calculated this sum correctly, but then instead of returning it, complained that the sum overflowed by narrowing. This was a false positive: The sum() finalizer wanted to test that no precision was lost when casting the accumulator to the result type, so checked that the result before and after the cast are the same. But specifically for NaN, it is never equal to anything - not even to itself. This check is wrong for floating point, but moreover - isn't even necessary when the two types (accumulator type and result type) are identical so in this patch we skip it in this case. Note that in the current code, a different accumulator and result type is only used in the case of integer types; When accumulating floating point sums, the same type is used, so the broken check will be avoided. The test for this issue starts to pass with this patch, so the xfail tag is removed. Fixes #13551 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-04-19 09:31:41 +03:00
Nadav Har'El	78555ba7f1	test/cql-pytest: add tests for data casts and inf in sums This patch adds tests to reproduce issue #13551. The issue, discovered by a dtest (cql_cast_test.py), claimed that either cast() or sum(cast()) from varint type broke. So we add two tests in cql-pytest: 1. A new test file, test_cast_data.py, for testing data casts (a CAST (...) as ... in a SELECT), starting with testing casts from varint to other types. The test uncovers a lot of interesting cases (it is heavily commented to explain these cases) but nothing there is wrong and all tests pass on Scylla. 2. An xfailing test for sum() aggregate of +Inf and -Inf. It turns out that this caused #13551. In Cassandra and older Scylla, the sum returned a NaN. In Scylla today, it generates a misleading error message. As usual, the tests were run on both Cassandra (4.1.1) and Scylla. Refs #13551. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-04-18 13:38:42 +03:00
Avi Kivity	7a42927a3d	treewide: stop using 'using namespace std' in namespace scope Such namespace-wide imports can create conflicts between names that are the same in seastar and std, such as {std,seastar}::future and {std,seastar}::format, since we also have 'using namespace seastar'. Replace the namespace imports with explicit qualification, or with specific name imports. Closes #13528	2023-04-17 14:08:37 +03:00
Botond Dénes	6c889213bf	Merge 'Topology add node exception safety' from Benny Halevy Currently if index_node throws when trying to add an already indexed node, pop_node might unindex the existing node instead of the new one. Instead, with this change, unindex_node looks up the node by its pointer and removes it from the index map only if it's found there so to clean up safely after index_node throws (at any stage). Add a unit test to verify that. In addition, added a unit test to reproduce #13502 and test the fix. Closes #13512 * github.com:scylladb/scylladb: test: locator_topology: add test_update_node topology: add_node, unindex_node: make exception safe	2023-04-17 11:02:15 +03:00
Botond Dénes	4c37dc5507	Merge 'keys: specialize fmt::formatter<partition_key> and friends' from Kefu Chai this is a part of a series to migrate from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print following classes without the help of `operator<<`. - partition_key_view - partition_key - partition_key::with_schema_wrapper - key_with_schema - clustering_key_prefix - clustering_key_prefix::with_schema_wrapper the corresponding `operator<<()` are dropped in this change, as all their callers are now using fmtlib for formatting. the helper of `print_key()` is removed, as its only caller is `operator<<(std::ostream&, const clustering_key_prefix::with_schema_wrapper&)`. the reason why all these operators are replaced in one go is that we have a template function of `key_to_str()` in `db/large_data_handler.cc`. this template function is actually the caller of operator<< of `partition_key::with_schema_wrapper` and `clustering_key_prefix::with_schema_wrapper`. so, in order to drop either of these two operator<<, we need to remove both of them, so that we can switch over to `fmt::to_string()` in this template function. Refs scylladb#13245 Closes #13513 * github.com:scylladb/scylladb: keys: consolidate the formatter for partition_keys keys: specialize fmt::formatter<partition_key> and friends	2023-04-17 10:27:31 +03:00
Benny Halevy	e18eb71fa3	test: locator_topology: add test_update_node Reproduces issue fixed in PR #13502 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-14 17:51:07 +03:00
Benny Halevy	e29994b2aa	topology: add_node, unindex_node: make exception safe Currently, if index_node throws when trying to add an already indexed node, pop_node might unindex the existing node instead of the new one. Instead, with this change, unindex_node looks up the node by its pointer and removes it from the index map only if it's found there so to clean up safely after index_node throws (at any stage). Add a unit test to verify that. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-14 17:51:05 +03:00
Tomasz Grabiec	952b455310	Merge ' tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes scylla-sstable currently has two ways to obtain the schema: * via a `schema.cql` file. * load schema definition from memory (only works for system tables). This meant that for most cases it was necessary to export the schema into a CQL format and write it to a file. This is very flexible. The sstable can be inspected anywhere, it doesn't have to be on the same host where it originates from. Yet in many cases the sstable is inspected on the same host where it originates from. In these cases, the schema is readily available in the schema tables on disk and it is plain annoying to have to export it into a file, just to quickly inspect an sstable file. This series solves this annoyance by providing a mechanism to load schemas from the on-disk schema tables. Furthermore, an auto-detect mechanism is provided to detect the location of these schema tables based on the path of the sstable, but if that fails, the tool checks the usual locations of the scylla data dir, the scylla configuration file and even looks for environment variables that tell the location of these. The old methods are still supported. In fact, if a schema.cql is present in the working directory of the tool, it is preferred over any other method, allowing for an easy force-override. If the auto-detection magic fails, an error is printed to the console, advising the user to turn on debug level logging to see what went wrong. A comprehensive test is added which checks all the different schema loading mechanisms. The documentation is also updated to reflect the changes. This change breaks the backward-compatibility of the command-line API of the tool, as `--system-schema` is now just a flag, the keyspace and table names are supplied separately via the new `--keyspace` and `--table` options. 
I don't think this will break anybody's workflow as this tool is still lightly used, exactly because of the annoying way the schema has to be provided. Hopefully after this series, this will change. Example: ``` $ ./build/dev/scylla sstable dump-data /var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine/me-1-big-Data.db {"sstables":{"/var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine//me-1-big-Data.db":[{"key":{"token":"-3485513579396041028","raw":"000400000000","value":"0"},"clustering_elements":[{"type":"clustering-row","key":{"raw":"","value":""},"marker":{"timestamp":1677837047297728},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1677837047297728,"value":"0"}}}]}]}} ``` As seen above, subdirectories like quarantine, staging etc are also supported. Fixes: https://github.com/scylladb/scylladb/issues/10126 Closes #13448 * github.com:scylladb/scylladb: test/cql-pytest: test_tools.py: add tests for schema loading test/cql-pytest: add no_autocompaction_context docs: scylla-sstable.rst: remove accidentally added copy-pasta docs: scylla-sstable.rst: remove paragraph with schema limitations docs: scylla-sstable.rst: update schema section test/cql-pytest: nodetool.py: add flush_keyspace() tools/scylla-sstable: reform schema loading mechanism tools/schema_loader: add load_schema_from_schema_tables() db/schema_tables: expose types schema	2023-04-14 16:46:26 +02:00
Kamil Braun	200123624f	Merge 'test: reproducers for store mutation with schema change and host down' from Alecco Reproducers for https://github.com/scylladb/scylladb/issues/10770. (Already fixed in `15ebd59071`) Includes necessary improvements and fixes to `pylib`. Closes #12699 * github.com:scylladb/scylladb: test/pytest: reproducers for store mutation... test: pylib: Add a way to create cql connections with particular coordinators test/pylib: get gossiper alive endpoints test/topology: default replication factor 3 test/pylib: configurable replication factor	2023-04-14 13:47:51 +02:00
Kefu Chai	c580e30ec7	cql3: expr: return more accurate error message for invalidated token() args before this change, we just print out the addresses of the elements in `column_defs`, if the arguments passed to `token()` function are not valid. this is not quite helpful from the user's perspective. as user would be more interested in the values. also, we could print more accurate error message for different error. in this change, following Cassandra 4.1's behavior, three cases are identified, and corresponding errors are returned respectively: * duplicated partition keys * wrong order of partition key * missing keys where, if the partition key order is wrong, instead of printing the keys specified by user, the correct order is printed in the error message for helping user to correct the `token()` function. for better performance, the checks are performed only if the keys do not match, based on the assumption that the error handling path is not likely to be executed. tests are added accordingly. they were also tested with Cassandra 4.1.1. Fixes #13468 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13470	2023-04-14 11:46:18 +03:00
Raphael S. Carvalho	47b2a0a1f6	data_directory: Describe storage options of a keyspace Description of storage options is important for S3, as one needs to know if underlying storage is either local or remote, and if the latter, details about it. This relies on server-side desc statement. $ ./bin/cqlsh.py -e "describe keyspace1;" CREATE KEYSPACE keyspace1 WITH replication = { ... } AND storage = {'type': 'S3', 'bucket': 'sstables', 'endpoint': '127.0.0.1:9000'} AND durable_writes = true; Fixes #13507. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13510	2023-04-14 11:34:35 +03:00
Raphael S. Carvalho	a47bac931c	Move TWCS option from table into TWCS itself enable_optimized_twcs_queries is specific to TWCS, therefore it belongs to TWCS, not replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13489	2023-04-14 08:28:16 +03:00
Kefu Chai	3738fcbe05	keys: specialize fmt::formatter<partition_key> and friends this is a part of a series to migrate from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print following classes without the help of `operator<<`. - partition_key_view - partition_key - partition_key::with_schema_wrapper - key_with_schema - clustering_key_prefix - clustering_key_prefix::with_schema_wrapper the corresponding `operator<<()` are dropped in this change, as all their callers are now using fmtlib for formatting. the helper of `print_key()` is removed, as its only caller is `operator<<(std::ostream&, const clustering_key_prefix::with_schema_wrapper&)`. the reason why all these operators are replaced in one go is that we have a template function of `key_to_str()` in `db/large_data_handler.cc`. this template function is actually the caller of operator<< of `partition_key::with_schema_wrapper` and `clustering_key_prefix::with_schema_wrapper`. so, in order to drop either of these two operator<<, we need to remove both of them, so that we can switch over to `fmt::to_string()` in this template function. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-14 13:21:30 +08:00
Alejo Sanchez	9597822214	test/pytest: reproducers for store mutation... with schema change and host down Reproducers for a failure during lwt operation due to missing of a column mapping in schema history table. Issue #10770	2023-04-13 21:23:03 +02:00
Tomasz Grabiec	041ee3ffdd	test: pylib: Add a way to create cql connections with particular coordinators Usage: await manager.driver_connect(server=servers[0]) manager.cql.execute(f"...", execution_profile='whitelist')	2023-04-13 21:23:03 +02:00
Alejo Sanchez	62a945ccd5	test/pylib: get gossiper alive endpoints Helper to get list of gossiper alive endpoints from REST API. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-04-13 21:23:03 +02:00
Alejo Sanchez	08d754e13f	test/topology: default replication factor 3 For most tests there will be nodes down, increase replication factor to 3 to avoid having problems for partitions belonging to down nodes. Use replication factor 1 for raft upgrade tests.	2023-04-13 21:23:02 +02:00
Alejo Sanchez	3508a4e41e	test/pylib: configurable replication factor Make replication factor configurable for the RandomTables helper. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-04-13 21:23:02 +02:00
Botond Dénes	bd57471e54	reader_concurrency_semaphore: don't evict inactive readers needlessly Inactive readers should only be evicted to free up resources for waiting readers. Evicting them when waiters are not admitted for any other reason than resources is wasteful and leads to extra load later on when these evicted readers have to be recreated and requeued. This patch changes the logic on both the registering path and the admission path to not evict inactive readers unless there are readers actually waiting on resources. A unit-test is also added, reproducing the overly aggressive eviction and checking that it doesn't happen anymore. Fixes: #11803 Closes #13286	2023-04-13 15:20:18 +03:00
Kefu Chai	87170bf07a	build: cmake: add more tests this change should add the remaining tests under boost/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13494	2023-04-13 14:57:00 +03:00
Botond Dénes	1440efa042	test/cql-pytest: test_tools.py: add tests for schema loading A set of comprehensive tests covering all the supported ways of providing the schema to scylla-sstable, either explicitly or implicitly (auto-detect).	2023-04-12 03:14:43 -04:00
Botond Dénes	76a7d3448f	test/cql-pytest: add no_autocompaction_context	2023-04-12 03:14:43 -04:00
Botond Dénes	222f624757	test/cql-pytest: nodetool.py: add flush_keyspace() It would have been better if `flush()` could have been called with a keyspace and optional table param, but changing it now is too much churn, so we add a dedicated method to flush a keyspace instead.	2023-04-12 03:14:43 -04:00
Botond Dénes	525b21042f	Merge 'Rewrite sstables keyspace compaction task' from Aleksandra Martyniuk Task manager task implementations of classes that cover rewrite sstables keyspace compaction which can be started through /storage_service/keyspace_compaction/ api. Top level task covers the whole compaction and creates child tasks on each shard. Closes #12714 * github.com:scylladb/scylladb: test: extend test_compaction_task.py to test rewrite sstables compaction compaction: create task manager's task for rewrite sstables keyspace compaction on one shard compaction: create task manager's task for rewrite sstables keyspace compaction compaction: create rewrite_sstables_compaction_task_impl	2023-04-12 08:38:59 +03:00
Botond Dénes	f1bbf705f9	Merge 'Cleanup sstables in resharding and other compaction types' from Benny Halevy This series extends sstable cleanup to resharding and other (offstrategy, major, and regular) compaction types so to: * cleanup uploaded sstables (#11933) * cleanup staging sstables after they are moved back to the main directory and become eligible for compaction (#9559) When perform_cleanup is called, all sstables are scanned, and those that require cleanup are marked as such, and are added for tracking to table_state::cleanup_sstable_set. They are removed from that set once released by compaction. Along with that sstables set, we keep the owned_ranges_ptr used by cleanup in the table_state to allow other compaction types (offstrategy, major, or regular) to cleanup those sstables that are marked as require_cleanup and that were skipped by cleanup compaction for either being in the maintenance set (requiring offstrategy compaction) or in staging. Resharding is using a more straightforward mechanism of passing the owned token ranges when resharding uploaded sstables and using it to detect sstable that require cleanup, now done as piggybacked on resharding compaction. 
Closes #12422 * github.com:scylladb/scylladb: table: discard_sstables: update_sstable_cleanup_state when deleting sstables compaction_manager: compact_sstables: retrieve owned ranges if required sstables: add a printer for shared_sstable compaction_manager: keep owned_ranges_ptr in compaction_state compaction_manager: perform_cleanup: keep sstables in compaction_state::sstables_requiring_cleanup compaction: refactor compaction_state out of compaction_manager compaction: refactor compaction_fwd.hh out of compaction_descriptor.hh compaction_manager: compacting_sstable_registration: keep a ref to the compaction_state compaction_manager: refactor get_candidates compaction_manager: get_candidates: mark as const table, compaction_manager: add requires_cleanup sstable_set: add for_each_sstable_until distributed_loader: reshard: update sstable cleanup state table, compaction_manager: add update_sstable_cleanup_state compaction_manager: needs_cleanup: delete unused schema param compaction_manager: perform_cleanup: disallow empty sorted_owened_ranges distributed_loader: reshard: consider sstables for cleanup distributed_loader: process_upload_dir: pass owned_ranges_ptr to reshard distributed_loader: reshard: add optional owned_ranges_ptr param distributed_loader: reshard: get a ref to table_state distributed_loader: reshard: capture creator by ref distributed_loader: reshard: reserve num_jobs buckets compaction: move owned ranges filtering to base class compaction: move owned_ranges into descriptor	2023-04-11 14:52:29 +03:00
Aleksandra Martyniuk	e170fa1c99	test: extend test_compaction_task.py to test rewrite sstables compaction	2023-04-11 13:07:22 +02:00
Botond Dénes	dba1d36aa6	Merge 'alternator: fix isolation of concurrent modifications to tags' from Nadav Har'El Alternator's implementation of TagResource, UntagResource and UpdateTimeToLive (the latter uses tags to store the TTL configuration) was unsafe for concurrent modifications - some of these modifications may be lost. This short series fixes the bug, and also adds (in the last patch) a test that reproduces the bug and verifies that it's fixed. The cause of the incorrect isolation was that we separately read the old tags and wrote the modified tags. In this series we introduce a new function, `modify_tags()` which can do both under one lock, so concurrent tag operations are serialized and therefore isolated as expected. Fixes #6389. Closes #13150 * github.com:scylladb/scylladb: test/alternator: test concurrent TagResource / UntagResource db/tags: drop unsafe update_tags() utility function alternator: isolate concurrent modification to tags db/tags: add safe modify_tags() utility functions migration_manager: expose access to storage_proxy	2023-04-11 11:17:23 +03:00
Kefu Chai	86b66a9875	build: cmake: drop test_table.CC this change mirrors the corresponding change in `configure.py` in `4b5b6a9010` . Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13461	2023-04-11 09:42:58 +03:00
Nadav Har'El	79114c5030	cql-pytest: translate Cassandra's tests for DELETE operations This is a translation of Cassandra's CQL unit test source file validation/operations/DeleteTest.java into our cql-pytest framework. There are 51 tests, and they did not reproduce any previously-unknown bug, but did provide additional reproducers for three known issues: Refs #4244 Add support for mixing token, multi- and single-column restrictions Refs #12474 DELETE prints misleading error message suggesting ALLOW FILTERING would work Refs #13250 one-element multi-column restriction should be handled like a single-column restriction Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13436	2023-04-11 09:10:11 +03:00
Botond Dénes	05b381bfa2	Merge 'Simple S3 storage for sstables' from Pavel Emelyanov The PR adds sstables storage backend that keeps all component files as S3 objects and system.sstables_registry ownership table that keeps track of what sstables objects belong to local node and their names. When a keyspace is configured with 'STORAGE = { 'type': 'S3' }' the respective class table object eventually gets the storage_options instance pointing to the target S3 endpoint and bucket. All the sstables created for that table attach the S3 storage implementation that maintains components' files as S3 objects. Writing to and reading from components is handled by the S3 client facilities from utils/. Changing the sstable state, which is -- moving between normal, staging and quarantine states -- is not yet implemented, but would eventually happen by updating entries in the sstables registry. To keep track of which node owns which objects, to provide bucket-wide uniqueness of object names and to maintain sstable state the storage driver keeps records in the system.sstables_registry ownership table. The table maps sstable location and generation to the object format, version, status-state () and (!) unique identifier (some time soon this identifier is supposed to be replaced with UUID sstables generations). The component object name is thus s3://bucket/uuid/component_basename. The registry is also used on boot. The distributed loader picks up sstables from all the tables found in schema and for S3-backed keyspaces it lists entries in the registry to a) identify those and b) get their unique S3-side identifiers to open by name. () About sstable's status and state. The state field is the part of today's sstable path on disk -- staging, quarantine, normal (root table data dir), etc. Since S3 doesn't have the renaming facility, moving sstable between those states is only possible by updating the entry in the registry. 
This is not yet implemented in this set (#13017) The status field tracks sstable' transition through its creation-deletion. It first starts with 'creating' status which corresponds to the today's TemporaryTOC file. After being created and written to the sstable moves into 'sealed' state which corresponds to the today's normal sstable being with the TOC file. To delete sstable atomically it first moves into 'removing' state which is equivalent to being in the deletion-log for the on-disk sstable. Once removed from the bucket, the entry is removed from the registry. To play with: 1. Start minio (installed by install-dependencies.sh) ``` export MINIO_ROOT_USER=${root_user} export MINIO_ROOT_PASSWORD=${root_pass} mkdir -p ${root_directory} minio server ${root_directory} ``` 2. Configure minio CLI, create anonymous bucket ``` mc config host rm local mc config host add local http://127.0.0.1:9000 ${root_user} ${root_pass} mc mb local/sstables mc anonymous set public local/sstables ``` 3. Start Scylla with object-storage feature enabled ``` scylla ... --experimental-features=keyspace-storage-options --workdir ${as_usual}``` 4. Create KS with S3 storage ``` create keyspace ... storage = { 'type': 'S3', 'endpoint': '127.0.0.1:9000', 'bucket': 'sstables' };``` The S3 client has a logger named "s3", it's useful to use on with `trace` verbosity. 
Closes #12523 * github.com:scylladb/scylladb: test: Add object-storage test distributed_loader: Print storage type when populating sstable_directory: Add ownership table components lister sstable_directory: Make components_lister and API sstable_directory: Create components lister based on storage options sstables: Add S3 storage implementation system_keyspace: Add ownership table system_keyspace: Plug to user sstables manager too sstable: Make storage instance based on storage options sstable_directory: Keep storage_options aboard sstable: Virtualize the helper that gets on-disk stats for sstable sstable, storage: Virtualize data sink making for small components sstable, storage: Virtualize data sink making for Data and Index sstable/writer: Shuffle writer::init_file_writers() sstable: Make storage an API utils: Add S3 readable file impl for random reads utils: Add S3 data sink for multipart upload utils: Add S3 client with basic ops cql-pytest: Add option to run scylla over stable directory test.py: Equip it with minio server sstables: Detach write_toc() helper	2023-04-11 08:17:25 +03:00
Benny Halevy	9105f9800c	sstables: add a printer for shared_sstable Refactor the printing logic in compaction::formatted_sstables_list out to sstables::to_string(const shared_sstable&, bool include_origin) and operator<<(const shared_sstable) on top of it. So that we can easily print std::vector<shared_sstable> from compaction_manager in the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:31:35 +03:00
Benny Halevy	1baca96de1	compaction_manager: needs_cleanup: delete unused schema param It isn't needed. The sstable already has a schema. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:03:53 +03:00
Benny Halevy	09df04c919	compaction: move owned_ranges into descriptor Move the owned_ranges_ptr, currently used only by cleanup and upgrade compactions, to the generic compaction descriptor so we apply cleanup in other compaction types. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 22:52:12 +03:00
Pavel Emelyanov	fd817e199c	Merge 'auth: replace operator<<(..) with fmt formatter' from Kefu Chai this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `auth::auth_authentication_options` and `auth::resource_kind` without the help of fmt::ostream. and their `operator<<(ostream,..)` are dropped, as there are no users of them anymore. Refs #13245 Closes #13460 * github.com:scylladb/scylladb: auth: remove unused operator<<(.., resource_kind) auth: specialize fmt::formatter<resource_kind> auth: remove unused operator<<(.., authentication_option) auth: specialize fmt::formatter<authentication_option>	2023-04-10 17:05:09 +03:00
Pavel Emelyanov	21ef5bcc22	test: Add object-storage test The test does - starts scylla (over stable directory) - creates S3-backed keyspace (minio is up and running by test.py already) - creates table in that keyspace and populates it with several rows - flushes the keyspace to make sstables hit the storage - checks that the ownership table is populated properly - restarts scylla - makes sure old entries exist Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	4bb885b759	sstable: Make storage instance based on storage options This patch adds storage options lw-ptr to sstables_manager::make_sstable and makes the storage instance creation depend on the options. For local it just creates the filesystem storage instance, for S3 -- throws, but next patch will fix that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
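
The message of commit 81e0f5b581 in the list above explains why a before/after equality check around a cast misreports a NaN sum as a narrowing overflow. The reasoning can be sketched roughly as below. This is only an illustration in Python, not Scylla's C++ finalizer: the names `to_int32`, `narrow_checked` and `narrow_fixed` are hypothetical, and the 32-bit wrap-around is an assumed stand-in for a narrowing integer cast.

```python
import math


def to_int32(value: int) -> int:
    """Hypothetical narrowing cast: wrap an integer into the signed 32-bit range."""
    return (value + 2**31) % 2**32 - 2**31


def narrow_checked(acc, cast):
    """Round-trip check as described in the commit message:
    compare the value before and after the cast."""
    result = cast(acc)
    if result != acc:  # NaN is never equal to anything, not even itself
        raise OverflowError("sum overflowed by narrowing")
    return result


def narrow_fixed(acc, cast, same_type: bool):
    """Sketch of the fix: skip the check when the accumulator type and
    the result type are identical (the floating-point case)."""
    if same_type:
        return acc
    return narrow_checked(acc, cast)


# Summing +Inf and -Inf yields NaN.
total = math.inf + -math.inf
```

With these definitions, `narrow_checked(total, float)` raises even though nothing was narrowed, while `narrow_fixed(total, float, same_type=True)` returns the NaN. The integer path still detects real overflow, for example `narrow_fixed(2**31, to_int32, same_type=False)` raises.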


4686 Commits