The object in question fully describes the keyspace to be created and,
among other things, contains replication strategy options. Next patches
move the "initial_tablets" option out of those options and keep it
separately, so the ks metadata should also carry this option separately.
This patch is _just_ extending the metadata creation API, in fact the
new field is unused (write-only) so all the places that need to provide
this data keep it disengaged and are explicitly marked with FIXME
comment. Next patches will fix that.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The reader used to read the sstables was not closed. This could
sometimes trigger an abort(), because the reader was destroyed, without
it being closed first.
Why only sometimes? This is due to two factors:
* read_mutation_from_flat_mutation_reader() - the method used to extract
a mutation from the reader, uses consume(), which does not trigger
`set_close_is_required()` (#16520). Due to this, the top-level
combined reader did not complain when destroyed without close.
* The combined reader closes underlying readers who have no more data
for the current range. If the circumstances are just right, all
underlying readers are closed, before the combined reader is
destoyed. Looks like this is what happens for the most time.
This bug was discovered in SCT testing. After fixing #16520, all
invokations of `scylla-sstable`, which use this code would trigger the
abort, without this patch. So no further testing is required.
Fixes: #16519Closesscylladb/scylladb#16521
In other words, print more user-friendly messages, and avoid crashing.
Specifically:
* Don't crash when attempting to load schema tables from configured data-dir, while configuration does not have any configured data-directories.
* Detect the case where schema mutations have no rows for the current table -- the keyspace exists, but the table doesn't.
* Add negative tests for schema-loading.
Fixes: https://github.com/scylladb/scylladb/issues/16459Closesscylladb/scylladb#16494
* github.com:scylladb/scylladb:
test/cql-pytest: test_tools.py: add test for failed schema loadig
tools/scylla-sstable: use at() instead of operator [] when obtaining data dirs
tools/schema_loader: also check for empty table/column mutations
tools/schema_loader: log more details when loading schema from schema tables
Currently, `tool_app_template::run_async()` crashes when invoked with empty argv (with just `argv[0]` populated). This can happen if the tool app is invoked without any further args, e.g. just invoking `scylla nodetool`. The crash happens because unconditional dereferencing of `argv[1]` to get the current operation.
To fix, add an early-exit for this case, just printing a usage message and exiting with exit code 2.
Fixes: #16451Closesscylladb/scylladb#16456
* github.com:scylladb/scylladb:
test: add regression tests for invoking tools with no args
tools/utils: tool_app_template: handle the case of no args
tools/utils: tool_app_template: remove "scylla-" prefix from app name
system_schema.tables and system_schema.columns must have content for
every existing table. To detect a failed load of a table, before
attempting to invoke `db::schema_tables::create_table_from_mutations()`,
we check for the mutations read from these two tables, to not be
disengaged. There is another failure scenario however. The mutations are
not null, but do not have any clustering rows. This currently results in
a cryptic error message, about failing to lookup a row in a result-set.
This happens when the lookup-up keyspace exists, but the table doesn't.
Add this to the check, so we get a human-readeable error message when
this happens.
Currently, there is no visibility at all into what happens when
attempting to load schema from schema tables. If it fails, we are left
guessing on what went wrong.
Add a logger and add various debug/trace logs to help following the
process and identify what went wrong.
Currently, tool_app_template::run_async() crashes when invoked with
empty argv (with just argv[0] populated). This can happen if the tool
app is invoked without any further args, e.g. just invoking `scylla
nodetool`. The crash happens because unconditional dereferencing of
argv[1] to get the current operation.
To fix, add an early-exit for this case, just printing a usage message
and exiting with exit code 2.
In other words, have all tools pass their name without the "scylla-"
prefix to `tool_app_template::config::name`. E.g., replace
"scylla-nodetool" with just "nodetool".
Patch all usages to re-add the prefix if needed.
The app name is just more flexible this way, some users might want the
name without the "scylla-" prefix (in the next patch).
when migrating to the uuid-based identifiers, the mapping from the
integer-based generation to the shard-id is preserved. we used to have
"gen % smp_count" for calculating the shard which is responsible to host
a given sstable. despite that this is not a documented behavior, this is
handy when we try to correlate an sstable to a shard, typically when
looking at a performance issue.
in this change, a new subcommand is added to expose the connection
between the sstable and its "owner" shards.
Fixes#16343
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16345
Reduce code duplication by defining each metric just once, instead of three times, by having the semaphore register metrics by itself. This also makes the lifecycle of metrics contained in that of the semaphore. This is important on enterprise where semaphores are added and removed, together with service levels.
We don't want all semaphores to export metrics, so a new parameter is introduced and all call-sites make a call whether they opt-in or not.
Fixes: https://github.com/scylladb/scylladb/issues/16402Closesscylladb/scylladb#16383
* github.com:scylladb/scylladb:
database, reader_concurrency_sempaphore: deduplicate reader_concurrency_sempaphore metrics
reader_concurrency_semaphore: add register_metrics constructor parameter
sstables: name sstables_manager
To be used in the next patch to control whether the semaphore registers
and exports metrics or not. We want to move metric registration to the
semaphore but we don't want all semaphores to export metrics. The
decision on whether a semaphore should or shouldn't export metrics
should be made on a case-by-case basis so this new parameter has no
default value (except for the for_tests constructor).
Soon, the reader_concurrency_semaphore will require a unique
and meaningful name in order to label its metrics. To prepare
for that, name sstable_manager instances. This will be used
to generate a name for sstable_manager's reader_concurrency_semaphore.
On top of the capabilities of the java-nodetool command, the following
additional functionalit is implemented:
* Expose quarantine-mode option of the scrub_keyspace REST API
* Exit with error and print a message, when scrub finishes with abort or
validation_errors return code
* ./tools/java 26f5f71c...29fe44da (3):
> tools: catch and print UnsupportedOperationException
> tools/SSTableMetadataViewer: continue if sstable does not exist
> throw more informative error when fail to parse sstable generation
Fixes: scylladb/scylla-tools-java#360
As part of code coverage we need some additional packages in order to
being able to process the code coverage data and being able to provide
some meaningful information in logs.
Here we add the following packages:
fedora packages:
----------------
lcov - A package of utilities to manipulate lcov traces and generate
coverage html reports
fedora python3 packages:
------------------------
The following packages are added into fedora_packages and not the
python3_packages since we don't need them to be packaged into
scylla-python3 package but we only require them for the build
environment.
python3-unidiff - A python library for working with patch files, this is
required in order to generate "patch coverage" reports.
python3-humanfriendly - A python library to format some quantities into
a human readable strings (time spans, sizes, etc...)
we use it to print meaningful logs that tracks
the volume and time it takes to process coverage
data so we can better debug and optimize it in the
future.
python3-jinja3 - This is a template based generator that will eventually
will allow to consolidate and rearrange several reports into one so we
can publish a single report "site" for all of the coverage information.
For example, include both, coverage report as well as
patch report in a tab based site.
pip packages:
-------------
treelib - A tree data structure that supports also pretty printing of
the tree data. We use it to log the coverage processing steps in
order to have debugging capabilities in the future.
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Closesscylladb/scylladb#16330
[avi: regenerate toolchain]
Closesscylladb/scylladb#16357
In the java nodetool, this command ends up calling an API endpoint which
just throws an exception saying moving tokens is not supported. So in
the native implementation we just throw an exception to the same effect
in scylla-nodetool itself.
utils::fb_utilities is a global in-memory registry for storing and retrieving broadcast_address and broadcat_rpc_address.
As part of the effort to get rid of all global state, this series gets rid of fb_utilities.
This will eventually allow e.g. cql_test_env to instantiate multiple scylla server nodes, each serving on its own address.
Closesscylladb/scylladb#16250
* github.com:scylladb/scylladb:
treewide: get rid of now unused fb_utilities
tracing: use locator::topology rather than fb_utilities
streaming: use locator::topology rather than fb_utilities
raft: use locator::topology/messaging rather than fb_utilities
storage_service: use locator::topology rather than fb_utilities
storage_proxy: use locator::topology rather than fb_utilities
service_level_controller: use locator::topology rather than fb_utilities
misc_services: use locator::topology rather than fb_utilities
migration_manager: use messaging rather than fb_utilities
forward_service: use messaging rather than fb_utilities
messaging_service: accept broadcast_addr in config rather than via fb_utilities
messaging_service: move listen_address and port getters inline
test: manual: modernize message test
table: use gossiper rather than fb_utilities
repair: use locator::topology rather than fb_utilities
dht/range_streamer: use locator::topology rather than fb_utilities
db/view: use locator::topology rather than fb_utilities
database: use locator::topology rather than fb_utilities
db/system_keyspace: use topology via db rather than fb_utilities
db/system_keyspace: save_local_info: get broadcast addresses from caller
db/hints/manager: use locator::topology rather than fb_utilities
db/consistency_level: use locator::topology rather than fb_utilities
api: use locator::topology rather than fb_utilities
alternator: ttl: use locator::topology rather than fb_utilities
gossiper: use locator::topology rather than fb_utilities
gossiper: add get_this_endpoint_state_ptr
test: lib: cql_test_env: pass broadcast_address in cql_test_config
init: get_seeds_from_db_config: accept broadcast_address
locator: replication strategies: use locator::topology rather than fb_utilities
locator: topology: add helpers to retrieve this host_id and address
snitch: pass broadcast_address in snitch_config
snitch: add optional get_broadcast_address method
locator: ec2_multi_region_snitch: keep local public address as member
ec2_multi_region_snitch: reindent load_config
ec2_multi_region_snitch: coroutinize load_config
ec2_snitch: reindent load_config
ec2_snitch: coroutinize load_config
thrift: thrift_validation: use std::numeric_limits rather than fb_utilities
Currently, scylla.yaml is read conditionally, if either the user
provided `--scylla-yaml-file` command line parameter, or if deducing the
data dir location from the sstable path failed.
We want the scylla.yaml file to be always read, so that when working
with encrypted file (enterprise), scylla-sstable can pick up the
configuration for the encryption.
This patch makes scylla-sstable always attempt to read the scylla-yaml
file, whether the user provided a location for it or not. When not, the
default location is used (also considering the `SCYLLA_CONF` and
`SCYLLA_HOME` environment variables.
Failing to find the scylla.yaml file is not considered an error. The
rational is that the user will discover this if they attempt to do an
operation that requires this anyway.
There is a debug-level log about whether it was successfully read or
not.
Fixes: #16132Closesscylladb/scylladb#16174
Said commands print errors as they validate the sstables. Currently this
intermingles with the regular JSON output of these commands, resulting
in ugly and confusing output.
This is not a problem for scripted use, as logs go to stderr while the
JSON go to stdout, but it is a problem for human users.
Solve this by outputting the JSON into a std::stringstream and printing
it in one go at the very end. This means JSON is accumulated in a memory
buffer, but these commands don't output a lot of JSON, so this shouldn't
be a problem.
Closesscylladb/scylladb#16216
For major compacting all tables in the database.
The advantage of this api is that `commitlog->force_new_active_segment`
happens only once in `database::flush_all_tables` rather than
once per keyspace (when `nodetool compact` translates to
a sequence of `/storage_service/keyspace_compaction` calls).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
For flushing all tables in the database.
The advantage of this api is that `commitlog->force_new_active_segment`
happens only once in `database::flush_all_tables` rather than
once per keyspace (when `nodetool flush` translates to
a sequence of `/storage_service/keyspace_flush` calls).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
When flushing is done externally, e.g. by running
`nodetool flush` prior to `nodetool compact`,
flush_memtables=false can be passed to skip flushing
of tables right before they are major-compacted.
This is useful to prevent creation of small sstables
due to excessive memtable flushing.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Document the behavior if no keyspace is specified
or no table(s) are specified for a given keyspace.
Fixesscylladb/scylladb#16032
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
* ./tools/jmx 05bb7b68...80ce5996 (4):
> StorageService: Normalize endpoint inetaddress strings to java form
Fixes#16039
> ColumnFamilyStore: only quote table names if necessary
> APIBuilder: allow quoted scope names
> ColumnFamilyStore: don't fail if there is a table with ":" in its name
Fixes#16153
* ./tools/java 10480342...26f5f71c (1):
> NodeProbe: allow addressing table name with colon in it
Also needed for #16153Closesscylladb/scylladb#16146
instead of printing the result of the "validate" subcommand in a
free-style plain text, let's print it using JSON. for two reasons:
1. it is simpler to consume the output with other tools and tests.
2. more consistent with other commands.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16105
instead of printing the result of the "validate-checksum" subcommand
with the logging message, let's print it using JSON. for three reasons:
1. it is simpler to consume the output with other tools and tests.
2. more consistent with other commands.
3. the logging system is used for audit the behavior and for debugging
purposes, not for building a user-facing command line interface.
4. the behavior should match with the corresponding document. and
in docs/operating-scylla/admin-tools/scylla-sstable.sst, we claim
that `validate-checksums` subcommand prints a dict of
```
$ROOT := { "$sstable_path": Bool, ... }
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16106
Update node_exporter to 1.7.0.
The previous version (1.6.1) was flagged by security scanners (such as
Trivy) with HIGH-severity CVE-2023-39325. 1.7.0 release fixed that
problem.
[Botond: regenerate frozen toolchain]
Fixes#16085Closesscylladb/scylladb#16086Closesscylladb/scylladb#16090
before this change, `load_sstables()` fills the output sstables vector
by indexing it with the sstable's path. but if there are duplicated
items in the given sstable_names, the returned vector would have uninitialized
shared_sstable instance(s) in it. if we feed such a sstables to the
operation funcs, they would segfault when derferencing the empty
lw_shared_ptr.
in this change, we error out if duplicated sstable names are specified
in the command line.
an alternative is to tolerate this usage by initializing the sstables
vector with a back_inserter, as we always return a dictionary with the
sstable's name as the key, but it might be desirable from user's
perspective to preserve the order, like OrderedDict in Python. so
let's preserve the ordering of the sstables in the command line.
this should address the problem of the segfault if we pass duplicated
sstable paths to this tool.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16048
to have feature parity with `configure.py`. we won't need this
once we migrate to C++20 modules. but before that day comes, we
need to stick with C++ headers.
we generate a rule for each .hh files to create a corresponding
.cc and then compile it, in order to verify the self-containness of
that header. so the number of rule is quite large, to avoid the
unnecessary overhead. the check-header target is enabled only if
`Scylla_CHECK_HEADERS` option is enabled.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15913
Having to extract 1 keyspace and N tables from the command-line is
proving to be a common pattern among commands. Extract this into a
method, so the boiler-plate can be shared. Add a forward-looking
overload as well, which will be used in the next patch.