scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 18:40:38 +00:00

Author	SHA1	Message	Date
Kefu Chai	1e6b2eb4c8	tools/scylla-nodetool: mark format string as constexpr this change change `const` to `constexpr`. because the string literal defined here is not only immutable, but also initialized at compile-time, and can be used by constexpr expressions and functions. this change is introduced to reduce the size of the change when moving to compile-time format string in future. so far, seastar::format() does not use the compile-time format string, but we have patches pending on review implementing this. and the author of this change has local branches implementing the changes on scylla side to support compile-time format string, which practically replaces most of the `format()` calls with `seastar::format()`. without this change, if we use compile-time format check, compiler fails like: ``` /home/kefu/dev/scylladb/tools/scylla-nodetool.cc:276:44: error: call to consteval function 'fmt::basic_format_string<char, const char const &, seastar::basic_sstring<char, unsigned int, 15>>::basic_format_string<const char , 0>' is not a constant expression .description = seastar::format(description_template, app_name, boost::algorithm::join(operations \| boost::adaptors::transformed([] (const auto& op) { ^ /usr/include/fmt/core.h:3148:67: note: read of non-constexpr variable 'description_template' is not allowed in a constant expression FMT_CONSTEVAL FMT_INLINE basic_format_string(const S& s) : str_(s) { ^ /home/kefu/dev/scylladb/tools/scylla-nodetool.cc:276:44: note: in call to 'basic_format_string(description_template)' .description = seastar::format(description_template, app_name, boost::algorithm::join(operations \| boost::adaptors::transformed([] (const auto& op) { ^ /home/kefu/dev/scylladb/tools/scylla-nodetool.cc:258:16: note: declared here const auto description_template = ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15432	2023-09-15 19:28:38 +03:00
Kefu Chai	6c75dc4be8	tools/scylla-nodetool: do not compare unsigned with int change the loop variable to `int` to silence warning like ``` /home/kefu/.local/bin/clang++ -DBOOST_NO_CXX98_FUNCTION_BASE -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_BROKEN_SOURCE_LOCATION -DSEASTAR_DEBUG -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/cmake/seastar/gen/include -I/home/kefu/dev/scylladb/build/cmake/gen -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-mismatched-tags -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-ignored-qualifiers -march=westmere -Og -g -gz -std=gnu++20 -fvisibility=hidden -U_FORTIFY_SOURCE -DSEASTAR_SSTRING -Wno-error=unused-result "-Wno-error=#warnings" -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT tools/CMakeFiles/tools.dir/scylla-nodetool.cc.o -MF tools/CMakeFiles/tools.dir/scylla-nodetool.cc.o.d -o tools/CMakeFiles/tools.dir/scylla-nodetool.cc.o -c /home/kefu/dev/scylladb/tools/scylla-nodetool.cc /home/kefu/dev/scylladb/tools/scylla-nodetool.cc:215:28: error: comparison of integers of different signs: 'unsigned int' and 'int' [-Werror,-Wsign-compare] for (unsigned i = 0; i < argc; ++i) { ~ ^ ~~~~ ``` `i` is used as the index in a plain C-style array, it's perfectly fine to use a signed integer as index in this case. as per C++ standard, > The expression E1[E2] is identical (by definition) to *((E1)+(E2)) Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15431	2023-09-15 19:28:14 +03:00
Botond Dénes	b87660f90c	tools/scylla-sstable: log where schema was obtained from Currently, we only log anything about what was tried w.r.t. obtaining the schema if it failed. Add a log message to the success path too, so in case the wrong schema was successfully loaded, the user can find the problem. The log message is printed with debug-level, so it doesn't distrurb output by default. Fixes: #15384 Closes #15417	2023-09-14 23:09:30 +03:00
Avi Kivity	d9a453e72e	Merge 'Introduce a scylla-native nodetool' from Botond Dénes This series introduces a scylla-native nodetool. It is invokable via the main scylla executable as the other native tools we have. It uses the seastar's new `http::client` to connect to the specified node and execute the desired commands. For now a single command is implemented: `nodetool compact`, invokable as `scylla nodetool compact`. Once all the boilerplate is added to create a new tool, implementing a single command is not too bad, in terms of code-bloat. Certainly not as clean as a python implementation would be, but good enough. The advantages of a C++ implementation is that all of us in the core team know C++ and that it is shipped right as part of the scylla executable.. Closes #14841 * github.com:scylladb/scylladb: test: add nodetool tests test.py: add ToolTestSuite and ToolTest tools/scylla-nodetool: implement compact operation tools/scylla-nodetool: implement basic scylla_rest_api_client tools: introduce scylla-nodetool utils: export dns_connection_factory from s3/client.cc to http.hh utils/s3/client: pass logger to dns_connection_factory in constructor tools/utils: tool_app_template::run_async(): also detect --help* as --help	2023-09-14 17:20:40 +03:00
Benny Halevy	a5a22fe5b7	tools/scylla-sstable: load_sstables: handle load errors Currently, exceptions thrown from `sst->load` are unhandled, resulting in, e.g.: ``` ERROR 2023-09-12 08:02:58,124 [shard 0:main] seastar - Exiting on unhandled exception: std::runtime_error (SSTable /home/bhalevy/.dtest/dtest-dxg4xdxg/test/node1/data/ks/cf-a3009f20512911ee8000d81cd2da3fd7/me-3g9b_0e0x_39vtt1y2rcqrffz55j-big-Data.db uses org.apache.cassandra.dht.Murmur3Partitioner partitioner which is different than com.scylladb.dht.CDCPartitioner partitioner used by the database) ``` Log the errors and exit the tool with non-zero status in this case. Fixes #15359 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #15376	2023-09-14 14:27:38 +03:00
Botond Dénes	60dc2e9303	tools/scylla-nodetool: implement compact operation Equivalent of nodetool compact. The following arguments are accepted: * split-output,s (unused) * user-defined (error is raised) * start-token,st (unused) * end-token,et (unused) * partition (unused) The partition argument is mentioned only in our doc, our nodetool doesn't recognize it. I added it nevertheless (it is ignored). Split-output doesn't work with our current nodetool, attempting to use it will result in an error. The option is parsed but an error is used if used.	2023-09-14 05:25:14 -04:00
Botond Dénes	d67e22b876	tools/scylla-nodetool: implement basic scylla_rest_api_client Add --host and --port parameters, parse and resolve these and establish a connection to the provided host. Add a simple get() and post() method, parsing the returned data as json. Add the following compatibility arguments: * password,pw * password-file,pwf * username,u * print-port,pp These are parsed and silently ignored, as they are specific to JMX and aren't needed when connecting to the REST API. Since boost program options doesn't support multi-char short-form switches, as well as the -short=value syntax, the argv has to be massaged into a form which boost program options can digest. This is achieved by substituting all incompatible option formats and syntax with the equivalent boost program options compatible one. This mechanism is also used to make sure -h is translated to --host, not --help. The help message is unfortunately still ambiguous, displaying both with -h. This will be addressed in a follow-up.	2023-09-14 05:25:14 -04:00
Botond Dénes	eb1beca1b6	tools: introduce scylla-nodetool This patch only introuces the bare skeleton of the tool, plus the wiring into main. No operations are added yet, they will be added in later patches.	2023-09-14 05:25:14 -04:00
Botond Dénes	4dd373b8d3	tools/utils: tool_app_template::run_async(): also detect --help* as --help Don't try to lookup the current operation if the first argument is --help*. This allows --help-seastar and --help-loggers to work.	2023-09-14 05:25:14 -04:00
Kamil Braun	bff9cedef9	Merge 'system_keyspace: remove flushes when writing to system tables' from Petr Gusev There are several system tables with strict durability requirements. This means that if we have written to such a table, we want to be sure that the write won't be lost in case of node failure. We currently accomplish this by accompanying each write to these tables with `db.flush()` on all shards. This is expensive, since it causes all the memtables to be written to sstables, which causes a lot of disk writes. This overheads can become painful during node startup, when we write the current boot state to `system.local`/`system.scylla_local` or during topology change, when `update_peer_info`/`update_tokens` write to `system.peers`. In this series we remove flushes on writes to the `system.local`, `system.peers`, `system.scylla_local` and `system.cdc_local` tables and start using schema commitlog for durability. Fixes: #15133 Closes #15279 * github.com:scylladb/scylladb: system_keyspace: switch CDC_LOCAL to schema commitlog system_keyspace: scylla_local: use schema commitlog database.cc: make _uses_schema_commitlog optional system_keyspace: drop load phases database.hh: add_column_family: add readonly parameter schema_tables: merge_tables_and_views: delay events until tables/views are created on all shards system_keyspace: switch system.peers to schema commitlog system_keyspace: switch system.local to schema commitlog main.cc: move schema commitlog replay earlier sstables_format_selector: extract listener sstables_format_selector: wrap when_enabled with seastar::async main.cc: inline and split system_keyspace.setup system_keyspace: refactor save_system_schema function system_keyspace: move initialize_virtual_tables into virtual_tables.hh system_keyspace: remove unused parameter config.cc: drop db::config::host_id main.cc:: extract local_info initialization into function schema.cc: check static_props for sanity system_keyspace: set null sharder when configuring schema commitlog system_keyspace: rename static variables system_keyspace: remove redundant wait_for_sync_to_commitlog	2023-09-14 10:39:20 +02:00
Kefu Chai	25457fca38	Update tools/cqlsh submodule * tools/cqlsh 66ae7eac...e651e12e (6): > setup.py: specify Cython language_level explicitly > setup.py: pass extensions as a list > setup.py: reindent block in else branch > setup.py: early return in get_extension() > reloc: install build==0.10.0 > reloc: add --verbose option to build_reloc.sh Closes #15401	2023-09-14 10:30:07 +02:00
Yaniv Kaul	6c67c270c8	Update node exporter to v1.6.1 Fixes: https://github.com/scylladb/scylladb/issues/15044 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes #15045 [avi: toolchain regenerated; also pulls in clang-16.0.6-3] Ref #15090 Closes #15392	2023-09-14 01:04:14 +03:00
Petr Gusev	b90011294d	config.cc: drop db::config::host_id In this refactoring commit we remove the db::config::host_id field, as it's hacky and duplicates token_metadata::get_my_id. Some tests want specific host_id, we add it to cql_test_config and use in cql_test_env. We can't pass host_id to sstables_manager by value since it's initialized in database constructor and host_id is not loaded yet. We also prefer not to make a dependency on shared_token_metadata since in this case we would have to create artificial shared_token_metadata in many tools and tests where sstables_manager is used. So we pass a function that returns host_id to sstables_manager constructor.	2023-09-13 23:00:15 +04:00
Botond Dénes	7e7101c180	Revert "Merge 'database, storage_proxy: Reconcile pages with dead rows and partitions incrementally' from Botond Dénes" This reverts commit `628e6ffd33`, reversing changes made to `45ec76cfbf`. The test included with this PR is flaky and often breaks CI. Revert while a fix is found. Fixes: #15371	2023-09-13 10:45:37 +03:00
Kefu Chai	571fab4179	build: cmake: build cqlsh as a submodule since we also redistribute cqlsh, let's package it as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-12 18:18:31 +08:00
Kefu Chai	111d20958e	build: cmake: build python3 dist tarball with arch postfix now that `configure.py` always generate python3 dist tarball with ${arch} postfix, let's mirror this behavior. as `build_unified.sh` uses this naming convention. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-12 18:18:31 +08:00
Kefu Chai	34e3302c01	dbuild: use --userns option when using podman instead of fabricating a `/etc/password` manually, we can just leave it to podman to add an entry in `/etc/password` in container. as podman allows us to map user's account to the same UID in the container. see https://docs.podman.io/en/stable/markdown/options/userns.container.html. this is not only a cosmetic change, it also avoid the permission denied failure when accessing `/etc/passwd` in the container when selinux is enabled. without this change, we would otherwise need to either add the selinux lable to the bind volume with ':Z' option address the failure like: ``` type=AVC msg=audit(1693449115.261:2599): avc: denied { open } for pid=2298247 comm="bash" path="/etc/passwd" dev="tmpfs" ino=5931 scontext=system_u:system_r:container_t:s0:c252,c259 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=file permissive=0 type=AVC msg=audit(1693449115.263:2600): avc: denied { open } for pid=2298249 comm="id" path="/etc/passwd" dev="tmpfs" ino=5931 scontext=system_u:system_r:container_t:s0:c252,c259 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=file permissive=0 ``` found in `/var/log/audit/audit.log`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15230	2023-09-11 21:41:48 +03:00
Avi Kivity	b8a655f55e	Update tools/python3 submodule * tools/python3 45fbd05...3e833f1 (1): > install.sh: replace <tab> with spaces	2023-09-11 21:38:02 +03:00
Avi Kivity	628e6ffd33	Merge 'database, storage_proxy: Reconcile pages with dead rows and partitions incrementally' from Botond Dénes Currently, mutation query on replica side will not respond with a result which doesn't have at least one live row. This causes problems if there is a lot of dead rows or partitions before we reach a live row, which stem from the fact that resulting reconcilable_result will be large: 1. Large allocations. Serialization of reconcilable_result causes large allocations for storing result rows in std::deque 2. Reactor stalls. Serialization of reconcilable_result on the replica side and on the coordinator side causes reactor stalls. This impacts not only the query at hand. For 1M dead rows, freezing takes 130ms, unfreezing takes 500ms. Coordinator does multiple freezes and unfreezes. The reactor stall on the coordinator side is >5s 3. Too large repair mutations. If reconciliation works on large pages, repair may fail due to too large mutation size. 1M dead rows is already too much: Refs https://github.com/scylladb/scylladb/issues/9111. This patch fixes all of the above by making mutation reads respect the memory accounter's limit for the page size, even for dead rows. This patch also addresses the problem of client-side timeouts during paging. Reconciling queries processing long strings of tombstones will now properly page tombstones,like regular queries do. My testing shows that this solution even increases efficiency. I tested with a cluster of 2 nodes, and a table of RF=2. The data layout was as follows (1 partition): * Node1: 1 live row, 1M dead rows * Node2: 1M dead rows, 1 live row This was designed to trigger reconciliation right from the very start of the query. Before: ``` Running query (node2, CL=ONE, cold cache) Query done, duration: 140.0633503ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)] Running query (node2, CL=ONE, hot cache) Query done, duration: 66.7195275ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)] Running query (all-nodes, CL=ALL, reconcile, cold-cache) Query done, duration: 873.5400742ms, pages: 2, result: [Row(pk=0, ck=0, v=0), Row(pk=0, ck=3000000, v=0)] ``` After: ``` Running query (node2, CL=ONE, cold cache) Query done, duration: 136.9035122ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)] Running query (node2, CL=ONE, hot cache) Query done, duration: 69.5286021ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)] Running query (all-nodes, CL=ALL, reconcile, cold-cache) Query done, duration: 162.6239498ms, pages: 100, result: [Row(pk=0, ck=0, v=0), Row(pk=0, ck=3000000, v=0)] ``` Non-reconciling queries have almost identical duration (1 few ms changes can be observed between runs). Note how in the after case, the reconciling read also produces 100 pages, vs. just 2 pages in the before case, leading to a much lower duration (less than 1/4 of the before). Refs https://github.com/scylladb/scylladb/issues/7929 Refs https://github.com/scylladb/scylladb/issues/3672 Refs https://github.com/scylladb/scylladb/issues/7933 Fixes https://github.com/scylladb/scylladb/issues/9111 Closes #14923 * github.com:scylladb/scylladb: test/topology_custom: add test_read_repair.py replica/mutation_dump: detect end-of-page in range-scans tools/scylla-sstable: write: abort parser thread if writing fails test/pylib: add REST methods to get node exe and workdir paths test/pylib/rest_client: add load_new_sstables, keyspace_{flush,compaction} service/storage_proxy: add trace points for the actual read executor type service/storage_proxy: add trace points for read-repair storage_proxy: Add more trace-level logging to read-repair database: Fix accounting of small partitions in mutation query database, storage_proxy: Reconcile pages with no live rows incrementally	2023-09-11 19:20:19 +03:00
Botond Dénes	82f4563757	tools/scylla-sstable: write: abort parser thread if writing fails Currently if writing the sstable fails, e.g. because the input data is out-of-order, the json parser thread hangs because its output is no longer consumed. This results in the entire application just freezing. Fix this by aborting the parsing thread explicitely in the json_mutation_stream_parser destructor. If the parser thread existed successfully, this will be a no-op, but on the error-path, this will ensure that the parser thread doesn't hang.	2023-09-11 07:02:14 -04:00
Botond Dénes	685486a20d	Update tools/python3 submodule * tools/python3 30b8fc21...45fbd056 (1): > build_reloc: do not run SCYLLA-VERSION-GEN twice	2023-09-11 10:59:56 +03:00
Avi Kivity	0656810c28	Update tools/java submodule * tools/java 585b30fda6...9dddad27bf (1): > install-dependencies.sh: do not install weak dependencies Frozen toolchain regenerated. Closes #15322	2023-09-08 17:22:07 +03:00
Israel Fruchter	3d082acd29	Update tools/cqlsh submodule * tools/cqlsh 2254e920...66ae7eac (5): > switch from `ssl_options` to `ssl_context` > cqlsh should use cql v4 by default when connecting #44 > Revert "Skip pp38-macosx wheel builds" > update to newer cibuildwheel > Skip pp38-macosx wheel builds Closes #15308	2023-09-07 22:48:37 +03:00
Kefu Chai	a29838f9e1	sstables: change make_descriptor() to accept fs::path change another overload of `make_descriptor()` to accept `fs::path`, in the same spirit of a previous change in this area. so we have a more consistent API for creating sstable descriptor. and this new API is simpler to use. Refs #15187 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-01 07:44:06 +08:00
Kefu Chai	6656707164	sstables: change make_descriptor() to accept fs::path to lower the programmer's cognitive load. as programmer might want to pass the full path as the `fname` when calling `make_descriptor(sstring sstdir, sstring fname)`, but this overload only accepts the filename component as its second parameter. a single `path` parameter would be easier to work with. Refs #15187 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-01 07:44:06 +08:00
Botond Dénes	1609c76d62	tools/scylla-sstable: scrub: don't qurantine sstables after validate Scylla sstable promises to never mutate its input sstables. This promise was broken by `scylla sstable scrub --scrub-mode=validate`, because validate moves invalid input sstables into qurantine. This is unexpected and caused occasional failures in the scrub tests in test_tools.py. Fix by propagating a flag down to `scrub_sstables_validate_mode()` in `compaction.cc`, specifying whether validate should qurantine invalid sstables, then set this flag to false in `scylla-sstable.cc`. The existing test for validate-mode scrub is ammended to check that the sstable is not mutated. The test now fails before the fix and passes afterwards. Fixes: #14309 Closes #15139	2023-08-23 21:53:12 +03:00
Kefu Chai	adfc139a74	tools/scylla-sstable: path::parent_path() when appropriate in load_sstables(), `sst_path` is already an instace of `std::filesystem::path`, so there is no need to cast it to `std::filesystem::path`. also, `path.remove_filename()` returns something like "system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f/", when the trailing slash. when we get a component's path in `sstable::filename`, we always add a "/" in between the `dir` and the filename, so this'd end up with two slashes in the path like: "/var/scylla/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f//mc-2-big-Data.db" so, in order to remove the duplicated slash, let's just use `path.parent_path()` here. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15035	2023-08-21 09:28:03 +03:00
Tomasz Grabiec	bd8bb5d4b1	Merge 'Wire tablet into compaction group' from Raphael "Raph" Carvalho Compaction group is the data plane for tablets, so this integration allows each tablet to have its own storage (memtable + sstables). A crucial step for dynamic tablets, where each tablet can be worked on independently. There are still some inefficiencies to be worked on, but as it is, it already unlocks further development. ``` INFO 2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata INFO 2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf ``` Closes #14863 * github.com:scylladb/scylladb: Kill scylla option to configure number of compaction groups replica: Wire tablet into compaction group token_metadata: Add this_host_id to topology config replica: Switch to chunked_vector for storing compaction groups replica: Generate group_id for compaction_group on demand	2023-08-18 15:17:17 +02:00
Avi Kivity	1901475598	Merge 'config: mark "experimental" option unused and cleanups' from Kefu Chai in this series, the "experimental" option is marked `Unused` as it has been marked deprecated for almost 2 years since scylla 4.6. and use `experimental_features` to specify the used experimental features explicitly. Closes #14948 * github.com:scylladb/scylladb: config: remove unused namespace alias config: use std::ranges when appropriate config: drop "experimental" option test: disable 'enable_user_defined_functions' if experimental_features does not include udf test: pylib: specify experimental_features explicitly	2023-08-17 20:42:02 +03:00
Avi Kivity	e8f3b073c3	Merge 'Maintain sstable state explicitly' from Pavel Emelyanov An sstable can be in one of several states -- normal, quarantined, staging, uploading. Right now this "state" is hard-wired into sstable's path, e.g. quarantined sstable would sit in e.g. /var/lib/data/ks-cf-012345/quarantine/ directory. Respectively, there's a bunch of directory names constexprs in sstables.hh defining each "state". Other than being confusing, this approach doesn't work well with S3 backend. Additionally, there's snapshot subdir that adds to the confusion, because snapshot is not quite a state. This PR converts "state" from constexpr char* directories names into a enum class and patches the sstable creation, opening and state-changing API to use that enum instead of parsing the path. refs: #13017 refs: #12707 Closes #14152 * github.com:scylladb/scylladb: sstable/storage: Make filesystem storage with initial state sstable: Maintain state sstable: Make .change_state() accept state, not directory string sstable: Construct it with state sstables_manager: Remove state-less make_sstable() table: Make sstables with required state test: Make sstables with upload state in some cases tools: Make sstables with normal state table: Open-code sstables making streaming helpers tests: Make sstables with normal state by default sstable_directory: Make sstable with required state sstable_directory: Construct with state distributed_loader: Make sstable with desired state when populating distributed_loader: Make sstable with upload state when uploading sstable: Introduce state enum sstable_directory: Merge verify and g.c. calls distributed_loader: Merge verify and gc invocations sstable/filesystem: Put underscores to dir members sstable/s3: Mark make_s3_object_name() const sstable: Remove filename(dir, ...) method	2023-08-15 17:44:06 +03:00
Raphael S. Carvalho	2590eec352	replica: Generate group_id for compaction_group on demand There are a few good reasons for this change. 1) compaction_group doesn't have to be aware of # of groups 2) thinking forward to dynamic tablets, # of groups cannot be statically embedded in group id, otherwise it gets stale. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-08-15 09:04:05 -03:00
Kefu Chai	c82f1d2f57	tools/scylla-sstable: dump column_desc as an object before this change, `scylla sstable dump-statistics` prints the "regular_columns" as a list of strings, like: ``` "regular_columns": [ "name", "clustering_order", "type_name", "org.apache.cassandra.db.marshal.UTF8Type", "name", "column_name_bytes", "type_name", "org.apache.cassandra.db.marshal.BytesType", "name", "kind", "type_name", "org.apache.cassandra.db.marshal.UTF8Type", "name", "position", "type_name", "org.apache.cassandra.db.marshal.Int32Type", "name", "type", "type_name", "org.apache.cassandra.db.marshal.UTF8Type" ] ``` but according https://opensource.docs.scylladb.com/stable/operating-scylla/admin-tools/scylla-sstable.html#dump-statistics, > $SERIALIZATION_HEADER_METADATA := { > "min_timestamp_base": Uint64, > "min_local_deletion_time_base": Uint64, > "min_ttl_base": Uint64", > "pk_type_name": String, > "clustering_key_types_names": [String, ...], > "static_columns": [$COLUMN_DESC, ...], > "regular_columns": [$COLUMN_DESC, ...], > } > > $COLUMN_DESC := { > "name": String, > "type_name": String > } "regular_columns" is supposed to be a list of "$COLUMN_DESC". the same applies to "static_columnes". this schema makes sense, as each column should be considered as a single object which is composed of two properties. but we dump them like a list. so, in this change, we guard each visit() call of `json_dumper()` with `StartObject()` and `EndObject()` pair, so that each column is printed as an object. after the change, "regular_columns" are printed like: ``` "regular_columns": [ { "name": "clustering_order", "type_name": "org.apache.cassandra.db.marshal.UTF8Type" }, { "name": "column_name_bytes", "type_name": "org.apache.cassandra.db.marshal.BytesType" }, { "name": "kind", "type_name": "org.apache.cassandra.db.marshal.UTF8Type" }, { "name": "position", "type_name": "org.apache.cassandra.db.marshal.Int32Type" }, { "name": "type", "type_name": "org.apache.cassandra.db.marshal.UTF8Type" } ] ``` Fixes #15036 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15037	2023-08-15 08:22:51 +03:00
Pavel Emelyanov	9e752ca6ab	tools: Make sstables with normal state Just like tests, tool open sstable by its full path and doesn't make any assumptions about sstable state Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-14 14:56:02 +03:00
Pavel Emelyanov	c0b922a8af	sstable_directory: Construct with state This is to replace full path sitting on this object eventually. For now they have to co-exist, but state will be used to make_sstable()-s from manager with its new API Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-14 14:56:01 +03:00
Kefu Chai	64bc8d2f7d	config: drop "experimental" option "experimental" was marked deprecated in `8b917f7c`. this change was included since Scylla 4.6. now that 5.3 has been branched, this change will be included 5.4. this should be long enough for the user's turn around if this option is ever used. the dtests using this option has been audited and updated accordingly. and the unit testing this option is removed as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-09 10:17:34 +08:00
Kefu Chai	374bed8c3d	tools: do not create bpo::value unless transfer it to an option_description `boost::program_options::value()` create a new typed_value<T> object, without holding it with a shared_ptr. boost::program_options expects developer to construct a `bpo::option_description` right away from it. and `boost::program_options::option_description` takes the ownership of the `type_value<T>*` raw pointer, and manages its life cycle with a shared_ptr. but before passing it to a `bpo::option_description`, the pointer created by `boost::program_options::value()` is a still a raw pointer. before this change, we initialize positional options as global variables using `boost::program_options::value()`. but unfortunately, we don't always initialize a `bpo::option_description` from it -- we only do this on demand when the corresponding subcommand is called. so, if the corresponding subcommand is not called, the created `typed_value<T>` objects are leaked. hence LeakSanitizer warns us. after this change, we create the option vector as a static local variable in a function so it is created on demand as well. as an alternative, we could initialize the options vector as local variable where it used. but to be more consistent with how `global_option` is specified. and to colocate them in a single place, let's keep the existing code layout. Fixes #14929 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14939	2023-08-04 08:03:11 +03:00
Botond Dénes	2d26613f28	tools: move operation-options to the operations themselves Currently, operation-options are declared in a single global list, then operations refer to the options they support via name. This system was born at a time, when scylla-sstable had a lot of shared options between its operations, so it was desirable to declare them centrally and only add references to individual operations, to reduce duplication. However, as the dust settled, only 2 options are shared by 2 operations each. This is a very low benefit. Up to now the cost was also very low -- shared options meant the same in all operations that used them. However this is about to change and this system becomes very awkward to use as soon as multiple operations want to have an option with the same name, but sligthly (or very) different meaning/semantics. So this patch changes moves the options to the operations themselves. Each will declare the list of options it supports, without having to reference some common list. This also removes an entire (although very uncommon) class of bugs: option-name referring to inexistent option. Closes #14898	2023-07-31 20:16:41 +03:00
Kefu Chai	1c525c02a3	tools/utils: use std::shift_left() when appropriate instead of using a loop of std::swap(), let's use std::shift_left() when appropriate. simpler and more readable this way. moreover, the pattern of looking for a command and consume it from the command line resembles what we have in main(), so let's use similar logic to handle both of them. probably we can consolidate them in future. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14888	2023-07-31 09:46:52 +03:00
Kefu Chai	eab160e947	tools/scylla-sstable: mark const variable with constexpr this change change `const` to `constexpr`. because the string literal defined here is not only immutable, but also initialized at compile-time, and can be used by constexpr expressions and functions. this change is introduced to reduce the size of the change when moving to compile-time format string in future. so far, seastar::format() does not use the compile-time format string, but we have patches pending on review implementing this. and the author of this change has local branches implementing the changes on scylla side to support compile-time format string, which practically replaces most of the `format()` calls with `seastar::format()`. to reduce the size of the change and the pain of rebasing, some of the less controversial changes are extracted and upstreamed. this one is one of them. this change also addresses following compilation failure: ``` /home/kefu/dev/scylladb/tools/scylla-sstable.cc:2836:44: error: call to consteval function 'fmt::basic_format_string<char, const char const &, seastar::basic_sstring<char, unsigned int, 15>>::basic_format_string<const char , 0>' is not a constant expression 2836 \| .description = seastar::format(description_template, app_name, boost::algorithm::join(operations \| boost::adaptors::transformed([] (const auto& op) { \| ^ /usr/include/fmt/core.h:3148:67: note: read of non-constexpr variable 'description_template' is not allowed in a constant expression 3148 \| FMT_CONSTEVAL FMT_INLINE basic_format_string(const S& s) : str_(s) { \| ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14887	2023-07-31 09:44:00 +03:00
Avi Kivity	accd6271bc	Merge 'tools: introduce tool_app_template and migrate all tools to it' from Botond Dénes The scaffolding required to have a working scylla tool app, is considerable, leading to a large amount of boilerplate code in each such app. This logic is also very similar across the two tool apps we have and would presumably be very similar in any future app. This PR extracts this logic into `tools/utils.hh` and introduces `tool_app_template`, which is similar to `seastar::app_template` in that it centralizes all the option handling and more in a single class, that each tool has to just instantiate and then call `run()` to run the app. This cuts down on the repetition and boilerplate in our current tool apps and make prototyping new tool apps much easier. Closes #14855 * github.com:scylladb/scylladb: tools/utils.hh: remove unused headers tools/utils: make get_selected_operation() and configure_tool_mode() private tools/utils.hh: de-template get_selected_operation() tools/scylla-types: migrate to tools_app_template tools/scylla-types: prepare for migration to tool_app_template tools/scylla-sstable.cc: fix indentation tools/scylla-sstables: migrate to tool_app_template tools/scylla-sstables: prepare for migration to tool_app_template tools: extract tool app skeleton to utils.hh	2023-07-30 18:31:10 +03:00
Avi Kivity	1c3d22b717	build: update frozen toolchain to Fedora 38 This refreshes clang to 16.0.6 and libstdc++ to 13.1.1. compiler-rt, libasan, and libubsan are added to install-dependencies.sh since they are no longer pulled in as depdendencies. Closes #13730	2023-07-30 03:08:48 +03:00
Botond Dénes	1eca60fe10	tools/utils.hh: remove unused headers	2023-07-28 08:41:34 -04:00
Botond Dénes	cbcb20f0f9	tools/utils: make get_selected_operation() and configure_tool_mode() private Their only user is in tools/utils.cc, so move them there, into an anonymous namespace.	2023-07-28 08:41:34 -04:00
Botond Dénes	fc0c87002c	tools/utils.hh: de-template get_selected_operation() It now has a single user, so it doesn't have to be a template. For now, make the method inline, so it can stay in the header. It will be moved to utils.cc in the next patch.	2023-07-28 08:41:16 -04:00
Botond Dénes	8caf258539	tools/scylla-types: migrate to tools_app_template Discard the locally coded app skeleton and reuse the tool app template instead. Reduces boilerplate greatly.	2023-07-28 08:30:53 -04:00
Botond Dénes	68a452be00	tools/scylla-types: prepare for migration to tool_app_template Make options more declarative and create a local reference to app.configuration() in the main lambda. To faciliate further patching.	2023-07-28 08:30:53 -04:00
Botond Dénes	7598c23359	tools/scylla-sstable.cc: fix indentation Broken in the previous patch.	2023-07-28 08:30:53 -04:00
Botond Dénes	d082622ab9	tools/scylla-sstables: migrate to tool_app_template Removing a great amount of boilerplate, streamlinging the main method.	2023-07-28 08:30:53 -04:00
Botond Dénes	092650b20b	tools/scylla-sstables: prepare for migration to tool_app_template Make options more declarative. To facilitate further patching.	2023-07-28 08:30:53 -04:00
Botond Dénes	89d7d80fce	tools: extract tool app skeleton to utils.hh The skeleton of the two existing scylla-native tools (scylla-types and scylla-sstable) is very similar. By skeleton, I mean all the boilerplate around creating and configuring a seastar::app_template, representing operations/command and their options, and presenting and selecting these. To facilitate code-sharing and quick development of any new tools, extract this skeleton from scylla-sstable.cc into tools/utils.hh, in the form of a new tool_app_template, which wraps a seastar::app_template and centralizes all the boilerplate logic in a single place. The extracted code is not a simple copy-paste, although many elements are simply copied. The original code is not removed yet.	2023-07-28 08:30:53 -04:00

1 2 3 4 5 ...

608 Commits