Now it's scattered between dist. loader and sstable directory code making the latter quite bloated. Keeping everything in distributed loader makes the sstable_directory code compact and easier to patch to support object storage backend.
Closes#12771
* github.com:scylladb/scylladb:
sstable_directory: Rename remove_input_sstables_from_reshaping()
sstable_directory: Make use of remove_sstables() helper
sstable_directory: Merge output sstables collecting methods
distributed_loader: Remove max_compaction_threshold argument from reshard()
distributed_loader: Remove compaction_manager& argument from reshard()
sstable_directory: Move the .reshard() to distributed_loader
sstable_directory: Add helper to load foreign sstable
sstable_directory: Add io-prio argument to .reshard()
sstable_directory: Move reshard() to distributed_loader.cc
distributed_loader: Remove compaction_manager& argument from reshape()
sstable_directory: Move the .reshape() to distributed loader
sstable_directory: Add helper to retrive local sstables
sstable_directory: Add io-prio argument to .reshape()
sstable_directory: Move reshape() to distributed_loader.cc
CQL transport sever error handling fixes and improvements:
* log failed requests with `DEBUG` level for easier debugging;
* in case of unhandled errors, deliver them to the client as `SERVER_ERROR`'s
* fix for `protocol_error`'s in case of shedded big requests;
* explicit tests have been written for the error handling problems above.
Closes#11949
* github.com:scylladb/scylladb:
transport server: fix "request size too large" handling
transport server: log failed requests with debug level
transport server: fix unexpected server errors handling
transport server: log client errors with debug level
It unlinks unshared sstables filtering some of them out. Name it
according to what it does without mentioning reshape/reshard.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently it's called remove_input_sstables_from_resharding() but it's
just unlinks sstables in parallel from the given list. So rename it not
to mention reshard and also make use of this "new" helper in the
remove_input_sstables_from_reshaping(), it needs exactly the same
functionality.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are two of them collecting sstables from resharding and reshaping.
Both doing the same job except for the latter doesn't expect the list to
contain remote sstables.
This patch merges them together with the help of an extra sanity boolean
to check for the remote sstable not in the list. And renames the method
not to mention reshape/reshard.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
We have a cql3::expr::expression::printer wrapper that annotates
an expression with a debug_mode boolean prior to formatting. The
fmt library, however, provides a much simpler alterantive: a custom
format specifier. With this, we can write format("{:user}", expr) for
user-oriented prints, or format("{:debug}", expr) for debug-oriented
prints (if nothing is specified, the default remains debug).
This is done by implementing fmt::formatter::parse() for the
expression type, can using expression::printer internally.
Since sometimes we pass expression element types rather than
the expression variant, we also provide a custom formatter for all
ExpressionElement Types.
Uses for expression::printer are updated to use the nicer syntax. In
one place we eliminate a temporary that is no longer needed since
ExpressionElement:s can be formatted directly.
Closes#12702
Calling _read_buf.close() doesn't imply eof(), some data
may have already been read into kernel or client buffers
and will be returned next time read() is called.
When the _server._max_request_size limit was exceeded
and the _read_buf was closed, the process_request method
finished and we started processing the next request in
connection::process. The unread data from _read_buf was
treated as the header of the next request frame, resulting
in "Invalid or unsupported protocol version" error.
The existing test_shed_too_large_request was adjusted.
It was originally written with the assumption that the data
of a large query would simply be dropped from the socket
and the connection could be used to handle the
next requests. This behaviour was changed in scylladb#8800,
now the connection is closed on the Scylla side and
can no longer be used. To check there are no errors
in this case, we use Scylla metrics, getting them
from the Scylla Prometheus API.
If request processing ended with an error, it is worth
sending the error to the client through
make_error/write_response. Previously in this case we
just wrote a message to the log and didn't handle the
client connection in any way. As a result, the only
thing the client got in this case was timeout error.
A new test_batch_with_error is added. It is quite
difficult to reproduce error condition in a test,
so we use error injection instead. Passing injection_key
in the body of the request ensures that the exception
will be thrown only for this test request and
will not affect other requests that
the driver may send in the background.
Closes: scylladb#12104
Since the whole reshard() is local to dist. loader code now, the caller
of the reshard helper may let this method get the threshold itself
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now all the reshading logic is accumulated in distributed loader and the
sstable_directory is just the place where sstables are collected.
The changes summary is:
- add sstable_directory as argument (used to be "this")
- replace all "this" captures with &dir ones
- remove temporary namespace gap and declaration from sst-dir class
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This is to generalize the code duplication between .reshard() and
existing .load_foreign_sstables() (plural form).
Make it coroutinized right at once.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now it gets one from this-> but the method is becoming static one in
distributed_loader which only has it as an argument. That's not big deal
as the current IO class is going to be derived from current sched group,
so this extra arg will go away at all some day.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now all the reshaping logic is accumulated in distributed loader and the
sstable_directory is just the place where sstables are collected.
The changes summary is:
- add sstable_directory as argument (used to be "this")
- replace all "this" captures with &dir ones
- remove temporary namespace gap and declaration from sst-dir class
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are methods to retrive shared local sstables and foreign sstables,
so here's one more to the family
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now it gets one from this-> but the method is becoming static one in
distributed_loader which only has it as an argument. That's not big deal
as the current IO class is going to be derived from current sched group,
so this extra arg will go away at all some day.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Let the initial range passed to query_partition_key_range
be [1, 2) where 2 is the successor of 1 in terms
of ring_position order and 1 is equal to vnode.
Then query_ranges_to_vnodes_generator() -> [[1, 1], (1, 2)],
so we get an empty range (1,2) and subsequently will
make a data request with this empty range in
storage_proxy::query_partition_key_range_concurrent,
which will be redundant.
The patch adds a check for this condition after
making a split in the main loop in process_one_range.
The patch does not attempt to handle cases where the
original ranges were empty, since this check is the
responsibility of the caller. We only take care
not to add empty ranges to the result as an
unintentional artifact of the algorithm in
query_ranges_to_vnodes_generator.
A test case is added in test_get_restricted_ranges.
The helper lambda check is changed so that not to limit
the number of ranges to the length of expected
ranges, otherwise this check passes without
the change in process_one_range.
Fixes: #12566Closes#12755
request_controller_timeout_exception_factory::timeout() creates an
instance of `request_controller_timed_out_error` whose ctor is
default-created by compiler from that of timed_out_error, which is
in turn default-created from the one of `std::exception`. and
`std::exception::exception` does not throw. so it's safe to
mark this factory method `noexcept`.
with this specifier, we don't need to worry about the exception thrown
by it, and don't need to handle them if any in `seastar::semaphore`,
where `timeout()` is called for the customized exception.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#12759
- main: consider EDQUOT as environmental failure also
- main: use defer_verbose_shutdown() to shutdown compaction manager
- replica/table: extract should_retry() int with_retry
- replica/table: retry on EDQUOT when flushing memtable
Fixes#12626Closes#12653
* github.com:scylladb/scylladb:
replica/table: retry on EDQUOT when flushing memtable
replica/table: extract should_retry() int with_retry
main: use defer_verbose_shutdown() to shutdown compaction manager
main: consider EDQUOT as environmental failure also
This patch fixes#12475, where an aggregation (e.g., COUNT(*), MIN(v))
of absolutely no partitions (e.g., "WHERE p = null" or "WHERE p in ()")
resulted in an internal error instead of the "zero" result that each
aggregator expects (e.g., 0 for COUNT, null for MIN).
The problem is that normally our aggregator forwarder picks the nodes
which hold the relevant partition(s), forwards the request to each of
them, and then combines these results. When there are no partitions,
the query is sent to no node, and we end up with an empty result set
instead of the "zero" results. So in this patch we recognize this
case and build those "zero" results (as mentioned above, these aren't
always 0 and depend on the aggregation function!).
The patch also adds two tests reproducing this issue in a fairly general
way (e.g., several aggregators, different aggregation functions) and
confirming the patch fixes the bug.
The test also includes two additional tests for COUNT aggregation, which
uncovered an incompatibility with Cassandra which is still not fixed -
so these tests are marked "xfail":
Refs #12477: Combining COUNT with GROUP by results with empty results
in Cassandra, and one result with empty count in Scylla.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12715
Related: https://github.com/scylladb/scylladb/pull/12586
This PR improves the upgrade policy added with https://github.com/scylladb/scylladb/pull/12586, according to the feedback from:
@tzach
> Upgrading from 4.6 to 5.0 is not clear; better to use 4.x to 4.y versions as an example.
and @bhalevy
> It is not completely clear that when upgrading through several versions, the whole cluster needs to be upgraded to each consecutive version, not just the rolling node.
In addition, the content is organized into sections for the sake of readability.
Closes#12647
* github.com:scylladb/scylladb:
doc: add the information abou patch releases
doc: add the info about the minor versions
doc: reorganize the content on the Upgrade ScyllaDB page
doc: improve the overview of the upgrade procedure (apply feedback)
This patch adds three additional tests for empty (e.g., empty string)
clustering keys.
The first test disproves a worry that was raised in #12561 that perhaps
empty clustering keys only seem work, but they don't get written to
sstables. The new test verifies that there is no bug - they are written
and can be read correctly.
The second and third test reproduce issue #12749 - an empty clustering
should be allowed in a COMPACT STORAGE table only if there is a compound
(multi-column) clustering key. But as the tests demonstrate, 1. if there
is just one clustering column, Scylla gives the wrong error message, and
2. if there is a compound clustering key, Scylla doesn't allow an empty
key as it should.
As usual, all tests pass on Cassandra. The last two tests fail on
Scylla, so are marked xfail.
Refs #12561
Refs #12749
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12750
Ideally, these errors should be transparently delivered
to the client, but in practice, due to various
flaws/bugs in scylla and/or the driver,
they can be lost, which enormously complicates troubleshooting.
const socket_address& get_remote_address() is needed for its
convenient conversion to string, which includes ip and port.
retry when memtable flush fails due to EDQUOT.
there are chances that user exceeds the disk quota when scylla flushes
memtable and user manages to free up the necessary resource before the
next retry.
before this change, we simply `abort()` in this case.
after this change, we just keep on retrying until the service is shutdown.
Fixes#12626
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
* extract a lambda encapsulating the condition if we should retry
at seeing an exception when calling functions with `with_retry()`.
we apply the same check to the exception raised when performing
table related i/o operations. in this change, the two checks are
consolidated and extracted into a single lambda, so we can add
yet more error code (s) which should be considered retry-able
failures.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
* use `defer_verbose_shutdown()` to shutdown compaction manager
`EDQUOT` is quite similar as `ENOSPC`, in the sense that both of them
are caused by environmental issues.
before this change, `compaction_manager` filters the
ENOSPC exceptions thrown by `compaction_manager::really_do_stop()`,
so they are not propagated to caller when calling
`compaction_manager::stop()` -- only a warning message is printed
in the log. but `EDQUOT` is not handled.
after this change, the exception raised by compaction manager's
stop process is not filtered anymore and is handled by
`defer_verbose_shutdown()` instead, which is able to check the
type of exception, and print out error message in the log. so
the `ENOSPC` and `EDQUOT` errors are taken care of, and more
visible from user's perspective as they are printed as errors
instead of warning. but they are not printed using the
`compaction_manager` logger anymore. so if our testing or user's
workflow depends on this behavior, the related setting should be
updated accordingly.
Fixes#12626
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
EDQUOT can be returned as the errno when the underlying filesystem
is trying to reserve necessary resources from disk for performing
i/o on behalf of the effective user, and the filesystem fails to
acquire the necessary resources. it could be inode, volume space,
or whatever resources for completing the i/o operation. but none
of them is the consequence of scylla's fault. so we should not
`abort()` at seeing this errno. instead, it's should be reported
to the administrator.
in this change, EDQUOT is also considered as an environmental
failure just like EIO, EACCES and ENOSPC. they could be thrown
when stopping an server.
Fixes#12626
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
We currently have two method families to generate partition keys:
* make_keys() in test/lib/simple_schema.hh
* token_generation_for_shard() in test/lib/sstable_utils.hh
Both work only for schemas with a single partition key column of `text` type and both generate keys of fixed size.
This is very restrictive and simplistic. Tests, which wanted anything more complicated than that had to rely on open-coded key generation.
Also, many tests started to rely on the simplistic nature of these keys, in particular two tests started failing because the new key generation method generated keys of varying size:
* sstable_compaction_test.sstable_run_based_compaction_test
* sstable_mutation_test.test_key_count_estimation
These two tests seems to depend on generated keys all being of the same size. This makes some sense in the case of the key count estimation test, but makes no sense at all to me in the case of the sstable run test.
Closes#12657
* github.com:scylladb/scylladb:
test/lib/sstable_utils: remove now unused token_generation_for_shard() and friends
test/lib/simple_schema: remove now unused make_keys() and friends
test: migrate to tests::generate_partition_key[s]()
test/lib/test_services: add table_for_tests::make_default_schema()
test/lib: add key_utils.hh
test/lib/random_schema.hh: value_generator: add min_size_in_bytes
This patch fixes 2 issues with checking whether UDFs are used in UDAs:
1) UDFs types are not considered during the check, which prevents us from dropping a UDF that isn't used in any UDAs, but shares its name with one of them.
2) the REDUCEFUNC is not considered during the check, which allows dropping a UDF even though it's used in a UDA as the REDUCEFUNC.
Additionally, tests for these issues are added
Closes#12681
* github.com:scylladb/scylladb:
udf: also check reducefunc to confirm that a UDF is not used in a UDA
udf: fix dropping UDFs that share names with other UDFs used in UDAs
pytest: add optional argument for new_function argument types
Fixes: https://github.com/scylladb/scylladb/issues/12309Closes#12720
* github.com:scylladb/scylladb:
service/raft: raft_group_registry: use recent_entries_map to store rate_limits in pinger. Fixes#12309
utils: introduce recent_entries_map datatype to track least recent visited entries.
This PR is related to https://github.com/scylladb/scylla-enterprise/issues/2176.
It adds a FAQ about a workaround to install a ScyllaDB version that is not the most recent patch version.
In addition, the link to that FAQ is added to the patch upgrade guides 2021 and 2022 .
Closes#12660
* github.com:scylladb/scylladb:
doc: add the missing sudo command
doc: replace the reduntant link with an alternative way to install a non-latest version
doc: add the link to the FAQ about pinning to the patch upgrade guides 2022 and 2022
doc: add a FAQ with a workaround to install a non-latest ScyllaDB version on Debian and Ubuntu
`poetry install` consistently times out when resolving the
dependencies. like:
```
Command ['/home/kefu/.cache/pypoetry/virtualenvs/scylla-1fWQLpOv-py3.9/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/home/kefu/.cache/pypoetry/virtualenvs
/scylla-1fWQLpOv-py3.9', '--upgrade', '--no-deps', '/home/kefu/.cache/pypoetry/artifacts/e6/ad/ab/eca9f61c5b15fd05df7192c0e5914a9e5ac927744b1fb5f6c07a92d7a4/sphinx-sitemap-2.2.0.tar.gz'] errored with the following return code 1, and out
put:
Processing /home/kefu/.cache/pypoetry/artifacts/e6/ad/ab/eca9f61c5b15fd05df7192c0e5914a9e5ac927744b1fb5f6c07a92d7a4/sphinx-sitemap-2.2.0.tar.gz
Installing build dependencies: started
Installing build dependencies: finished with status 'error'
ERROR: Command errored out with exit status 2:
command: /home/kefu/.cache/pypoetry/virtualenvs/scylla-1fWQLpOv-py3.9/bin/python /tmp/pip-standalone-pip-z97s216j/__env_pip__.zip/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-37k3lwqd/overlay --no-warn-scrip
t-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- 'setuptools>=40.8.0' wheel
cwd: None
Complete output (80 lines):
Collecting setuptools>=40.8.0
Downloading setuptools-67.1.0-py3-none-any.whl (1.1 MB)
ERROR: Exception:
Traceback (most recent call last):
File "/tmp/pip-standalone-pip-z97s216j/__env_pip__.zip/pip/_vendor/urllib3/response.py", line 438, in _error_catcher
yield
File "/tmp/pip-standalone-pip-z97s216j/__env_pip__.zip/pip/_vendor/urllib3/response.py", line 519, in read
data = self._fp.read(amt) if not fp_closed else b""
File "/tmp/pip-standalone-pip-z97s216j/__env_pip__.zip/pip/_vendor/cachecontrol/filewrapper.py", line 62, in read
data = self.__fp.read(amt)
File "/usr/lib64/python3.9/http/client.py", line 463, in read
n = self.readinto(b)
File "/usr/lib64/python3.9/http/client.py", line 507, in readinto
n = self.fp.readinto(b)
File "/usr/lib64/python3.9/socket.py", line 704, in readinto
return self._sock.recv_into(b)
File "/usr/lib64/python3.9/ssl.py", line 1242, in recv_into
return self.read(nbytes, buffer)
File "/usr/lib64/python3.9/ssl.py", line 1100, in read
return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
```
while sphinx-sitemap 2.5.0 installs without problems. sphinx-sitemap
2.50 is the latest version published to pypi.
according to sphinx-sitemap's changelog at
https://github.com/jdillard/sphinx-sitemap/blob/master/CHANGELOG.rst ,
no breaking changes were introduced in between 2.2.0 and 2.5.0.
after bumping sphinx-sitemap 2.5.0, following commands can complete
without errors:
```
poetry lock
make preview
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#12705
[table: Fix disk-space related metrics](529a1239a9) fixes the table's disk space related metrics.
whereas second patch fixes an inefficiency when computing statistics which can be triggered with multiple compaction groups.
Closes#12718
* github.com:scylladb/scylladb:
table: Fix inefficiency when rebuilding statistics with compaction groups
table: Fix disk-space related metrics
When dropping a UDF we're checking if it's not begin used in any UDAs
and fail otherwise. However, we're only checking its state function
and final function, and it may also be used as its reduce function.
This patch adds the missing checks and a test for them.
Currently, when dropping a function, we only check if there exist
an aggregate that uses a function with the same name as its state
function or final function. This may cause the drop to fail even
when it's just another UDF with the same name that's used in the
aggregate, even when the actual dropped function is not used there.
This patch fixes this by checking whether not only the name of the
UDA's sfunc and finalfunc, but also their argument types.
When multiple functions with the same name but different argument types
are created, the default drop statement for these functions will fail
because it does not include the argument types.
With this change, this problem can be worked around by specifying
argument types when creating the function, as this will cause the drop
statement to include them.
The system_keyspace defines several auxiliary methods to help view_builder update system.scylla_views_builds_in_progress and system.built_views tables. All use global qctx thing.
It only takes adding view_builder -> system_keyspace dependency in order to de-static all those helpers and let them use query-processor from it, not the qctx.
Closes#12728
* github.com:scylladb/scylladb:
system_keysace: De-static calls that update view-building tables
storage_service: Coroutinize mark_existing_views_as_built()
api: Unset column_famliy endpoints
api: Carry sharded<db::system_keyspace> reference over
view_builder: Add system_keyspace dependency
In most cases, tasks manager's tasks are started just after they are
created. Thus, to reduce boilerplate required for creating and starting
tasks, tasks::task_manager::module::make_and_start_task method is added.
Repair tasks are modified to use the method where possible.
Closes#12729
* github.com:scylladb/scylladb:
repair: use tasks::task_manager::module::make_and_start_task for repair tasks
tasks: add task_manager::module::make_and_start_task method