Close _delegate if it's engaged both in the close() method
and when ever it is currently reset by _delegate = {}.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
We don't own _source therefore do not close it.
That said, we still need to make sure that the reversing reader
itself is closed to calm down the check when it's destroyed.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The underlying reader is owned by the caller if it is moved to it,
but not if it was constructed with a reference to the underlying reader.
Close the underlying reader on close() only in the former case.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Close both the _index_reader and _context, if they are engaged.
Warn and ignore any erros from close as it may be called either
from the destructor or from f_m_r close.
Call close() for closing in the background if needed when destroyed
and warn about.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
We'd like that to simplify the soon-to-be-introduced
sstable_mutation_reader::close error handling path.
close_index_list can be marked noexcept since parallel_for_each is,
with that index_reader::close can be marked noexcept too.
Note that since reader close can not fail
both lower and upper bounds are closed (since
closing lower_bound cannot fail).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
`with_closeable` simplifies scoped use of
flat_mutation_reader, making sure to always close
the reader after use.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
We cannot close in the background since there are use cases
that require the impl to be destroyed synchronously.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Allow closing readers before destorying them.
This way, outstanding background operations
such as read-aheads can be gently canceled
and be waited upon.
Note that similar to destructors, close must not fail.
There is nothing to do about errors after the f_m_r is done.
Enforce that in flat_mutation_reader::close() so if the f_m_r
implementation did return a failure, report it and abort as internal
error.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This series adds a wrapper for the default rjson allocator which throws on allocation/reallocation failures. It's done to work around several rapidjson (the underlying JSON parsing library) bugs - in a few cases, malloc/realloc return value is not checked, which results in dereferencing a null pointer (or an arbitrary pointer computed as 0 + `size`, with the `size` parameter being provided by the user). The new allocator will throw an `rjson:error` if it fails to allocate or reallocate memory.
This series comes with unit tests which checks the new allocator behavior and also validates that an internal rapidjson structure which we indirectly rely upon (Stack) is not left in invalid state after throwing. The last part is verified by the fact that its destructor ran without errors.
Fixes#8521
Refs #8515
Tests:
* unit(release)
* YCSB: inserting data similar to the one mentioned in #8515 - 1.5MB objects clustered in partitions 30k objects in size - nothing crashed during various YCSB workloads, but nothing also crashed for me locally before this patch, so it's not 100% robust
relevant YCSB workload config for using 1.5MB objects:
```yaml
fieldcount=150
fieldlength=10000
```
Closes#8529
* github.com:scylladb/scylla:
test: add a test for rjson allocation
test: rename alternator_base64_test to alternator_unit_test
rjson: add a throwing allocator
Migration manager has a function to get a schema (for read or write),
this function queries a peer node and retrieves the schema from it. One
scenario where it can happen is if an old node, queries an old not fixed
index.
This makes a hole through which views that are only adjusted for reading
can slip through.
Here we plug the hole by fixing such views before they are registered.
Closes#8509
Otherwise waiters on committed configuration changes (e.g.
`server::set_configuration`) would never get notified.
Also if we tried to send another entry concurrently we would get
replication_test: raft/server.cc:318: void raft::server_impl::notify_waiters(std::map<index_t, op_status> &, const std::vector<log_entry_ptr> &): Assertion `entry_idx >= first_idx' failed.
(not sure if this commit also fixes whatever caused that).
Message-Id: <20210419181319.68628-2-kbraun@scylladb.com>
sstable cells are parsed into temporary_buffers, which causes large contiguous allocations for some cells.
This is fixed by storing fragments of the cell value in a fragmented_temporary_buffer instead.
To achieve this, this patch also adds new methods to the fragmented_temporary_buffer(size(), ostream& operator<<()) and adds methods to the underlying parser(primitive_consumer) for parsing byte strings into fragmented buffers.
Fixes#7457Fixes#6376Closes#8182
* github.com:scylladb/scylla:
primitive_consumer: keep fragments of parsed buffer in a small_vector
sstables: add parsing of cell values into fragmented buffers
sstables: add non-contiguous parsing of byte strings to the primitive_consumer
utils: add ostream operator<<() for fragmented_temporary_buffer::view
compound_type: extend serialize_value for all FragmentedView types
The service level controller spawns an updating thread,
which wasn't properly waited for during shutdown.
This behavior is now fixed.
Tests: manual
Fixes#8468Closes#8470
* github.com:scylladb/scylla:
qos: make sure to wait for sl updates on shutdown
db: stop using infinite timeout for service level updates
This adds support for disabling writeback cache by adding a new
DISABLE_WRITEBACK_CACHE option to "scylla-server" sysconfig file, which
makes the "scylla_prepare" script (that is run before Scylla starts up)
call perftune.py with appropriate parameters. Also add a
"--disable-writeback-cache" option to "scylla_sysconfig_setup", which
can be called by scylla-machine image scripts, for example.
Refs: #7341
Tests: dtest (next-gating)
Closes#8526
"
Memory usage is considerably reduced by making reshape switch to partitioned set,
given that input sstables are disjoint. This will benefit reshape for all
strategies, not only TWCS.
Write amplification is reduced a lot by compacting all input sstables at once,
which is possible given that unbounded memory usage is fixed too.
With both these issues fixed, TWCS reshape will be much more efficient.
tests: mode(dev).
"
* 'twcs_reshape_fixes' of github.com:raphaelsc/scylla:
tests: sstables: Check that TWCS is able to reshape disjoint sstables efficiently
TWCS: Reshape all sstables in a time window at once if they're disjoint
sstables: Extract code to count amount of overlapping into a function
LCS: reshape: Fix overlapping check when determining if a sstable set is disjoint
compaction: Make reshape compaction always use partitioned_sstable_set
compaction: Allow a compaction type to override the sstable_set for input sstables
The service level controller spawns an updating thread,
which wasn't properly waited for during shutdown.
This behavior is now fixed.
In order to make the shutdown order more standardized,
the operation is split into two phases - draining and stopping.
Tests: manual
Fixes#8468
Due to a porting bug, the routines for updating service levels
used the default infinite timeout for internal CQL queries,
which causes Scylla to hang on shutdown. The behavior is now
fixed and the routines use the same timeout as the other
similar functions - 10s at the time of writing this message.
With repair-based operations, each window will have 256 disjoint
sstables due to data segregation which produces N sstables for each
vnode range, where N = # of existing windows. So each window ends up
with one sstable per vnode range = 256.
Given that reshape now unconditionally uses partitioned set's incremental
selector, all the 256 sstables can be compacted at once as compaction
essentially becomes a copy operation, where only one sstable will be
opened at a time, making its memory usage very efficient.
By compacting all sstables at once, write amplification is a lot
reduced because each byte is now only rewritten once.
Previously, with the initial set of 256 sstables, write amp could be
up to 8, which makes reshape for TWCS very slow.
Refs #8449.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
This function will be reused by TWCS reshape when checking if all
sstables in a window are disjoint and can be all compacted together.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Wrong comparison operator is used when checking for overlapping. It
would miss overlapping when last key of a sstable is equal to the first
key of another sstable that comes next in the set, which is sorted by
first key.
Fixes#8531.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The default rapidjson allocator returns nullptr from
a failed allocation or reallocation. It's not a bug by itself,
but rapidjson internals usually don't check for these return values
and happily use nullptr as a valid pointer, which leads to segmentation
faults and memory corruptions.
In order to prevent these bugs, the default allocator is wrapped
with a class which simply throws once it fails to allocate or reallocate
memory, thus preventing the use of nullptr in the code.
One exception is Malloc/Realloc with size 0, which is expected
to return nullptr by rapidjson code.
On 'product != scylla' environment, we have a bug with .default file
(sysconfig file) handling.
Since .default file should be install original name, package name can be
doesn't match with .default filename.
(ex: default file is /etc/default/scylla-node-exporter, but
package name is scylla-enterprise-node-exporter)
When filename doesn't match with package name, it should be renamed with
as follows:
<package name>.<filename>.default
We already do this on .service file, but mistakenly haven't handled
.default file, so let's add it too.
Related scylladb/scylla-enterprise#1718
Fixes#8527Closes#8528
Reduce rebuilds and build time by removing unnecessary includes. Along the way,
improve header sanity.
Ref #1.
Test: dev-headers, unit(dev).
Closes#8524
* github.com:scylladb/scylla:
treewide: remove inclusions of storage_proxy.hh from headers
storage_proxy: unnest coordinator_query_result
treewide: make headers self-sufficient
utils: intrusive_btree: add missing #pragma once
Make sure that sstable::unlink will never fail.
It will terminate in the unlikely case toc_filename
throws (e,g, on bad_alloc), otherwise it ignores any other error
and juts warns about it.
Make unlink a coroutine to simplify the implementation
without introducing additional allocations.
Note that remove_by_toc_name and maybe_delete_large_data_entries
are executed asynchronously and concurrently.
Waiting for them to finish is serialized by co_await,
making sure that both are being waited on so not to leave
abandoned futures behind.
Test: unit(dev)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210420135020.102733-1-bhalevy@scylladb.com>
Reshape compaction potentially works with disjoint sstables, so it will
benefit a lot from using partitioned_sstable_set, which is able to
incrementally open the disjoint sstables. Without it, all sstables are
opened at once, which means unbounded memory usage.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
storage_proxy.hh is huge and includes many headers itself, so
remove its inclusions from headers and re-add smaller headers
where needed (and storage_proxy.hh itself in source files that
need it).
Ref #1.
Nested classes cannot be forward declared, and
storage_proxy::coordinator_query_result is used in pagers, where
we'd like to forward-declare it. Unnest it and introduce an alias
for compatibility.