"
The motivation is to keep code related to each format separate, to make it
easier to comprehend and reduce incremental compilation times.
Also reduces dependency on sstable writer code by removing writer bits from
sstales.hh.
The ka/la format writers are still left in sstables.cc, they could be also extracted.
"
* 'extract-sstable-writer-code' of github.com:tgrabiec/scylla:
sstables: Make variadic write() not picked on substitution error
sstables: Extract MC format writer to mc/writer.cc
sstables: Extract maybe_add_summary_entry() out of components_writer
sstables: Publish functions used by writers in writer.hh
sstables: Move common write functions to writer.hh
sstables: Extract sstable_writer_impl to a header
sstables: Do not include writer.hh from sstables.hh
sstables: mc: Extract bound_kind_m related stuff into mc/types.hh
sstables: types: Extract sstable_enabled_features::all()
sstables: Move components_writer to .cc
tests: sstable_datafile_test: Avoid dependency on components_writer
The previous code was using mp_row_consumer_k_l to be as close to the
tested code as possible.
Given that it is testing for an unhandled exception, there is probably
more value in moving it to a higher level, easier to use, API.
This patch changes it to use read_rows_flat().
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20181210235016.41133-1-espindola@scylladb.com>
Cassandra supports a "CREATE CUSTOM INDEX" to create a secondary index
with a custom implementation. The only custom implementation that Cassandra
supports is SASI. But Scylla doesn't support this, or any other custom
index implementation. If a CREATE CUSTOM INDEX statement is used, we
shouldn't silently ignore the "CUSTOM" tag, we should generate an error.
This patch also includes a regression test that "CREATE CUSTOM INDEX"
statements with valid syntax fail (before this patch, they succeeded).
Fixes#3977
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20181211224545.18349-2-nyh@scylladb.com>
These are the current uninteresting cases I found when looking at
malformed_sstable_exception. The existing code is working, just not
being tested.
* https://github.com/espindola/scylla.git espindola/espindola/broken-sst:
Add a broken sstable test.
Add a test with mismatched schema.
db::config is a global class; changes in any module can cause changes
in db::config. Therefore, it is a cause of needless recompilation.
Remove some of these dependencies by having consumers of db::config
declare an intermediate config struct that is contains only
configuration of interest to them, and have their caller fill it out
(in the case of auth, it already followed this scheme and the patchset
only moves the translation function).
In addition, some outright pointless inclusions of db/config.hh are
removed.
The result is somewhat shorter compile times, and fewer needless
recompiles.
* https://github.com/avikivity/scylla unconfig-1/v1:
config: remove inclusions of db/config.hh from header files
repair: remove unneeded config.hh inclusion
batchlog_manager: remove dependency on db::config
auth: remove permissions_cache dependency on db::config
auth: remove auth::service dependency on db::config
auth: remove unneeded db/config.hh includes
"
This series optimises the read path by replacing some usages of
std::vector by utils::small_vector. The motivation for this change was
an observation that memory allocation functions are pointed out by the
profiler as the ones where we spent most time and while they have a
large number of callers storage allocation for some vectors was close to
the top. The gains are not huge, since the problem is a lot of things
adding up and not a single slow thing, but we need to start with
something.
Unfortunately, the performance of boost::container::small_vector is
quite disappointing so a new implementation of a small_vector was
introduced.
perf_simple_query -c4 --duration 60, medians:
./perf_before ./perf_after diff
read 343086.80 360720.53 5.1%
Tests: unit(release, small_vector in debug)
"
* tag 'small_vector/v2.1' of https://github.com/pdziepak/scylla:
partition_slice: use small_vector for column_ids
mutation_fragment_merger: use small_vector
auth: use small_vector in resource
auth: avoid list-initialisation of vectors
idl: serialiser: add serialiser for utils::small_vector
idl: serialiser: deduplicate vector serialisers
utils: introduce small_vector
intrusive_set_external_comparator: make iterator nothrow move constructible
mutation_fragment_merger: value-initialise iterator
"
Refs #3929
Enables re-use of commitlog segments.
First, ensures we never succeed playing back a commitlog
segment with name not matching the ID:s in the actual
file data, by determining expected id based on file name.
This will also handle partially written re-used files, as
each chunk headers CRC is dependent on the ID, and will
fail once we hit any left-overs.
Second part renamed and puts files into a recycle list
instead of actually deleting them when finished.
Allocating new files will the prioritize this list
before creating a new file.
Note that since consumtion and release of segments can
be somewhat unbalanced, this does not really guarantee
we will use recycled files even in all cases when it
might be possible, simply because of timing. It does
however give a good chance of it.
We limit recycled files based on the max disk size
setting, thus we can potentially grow disk size
more than without depending on timing, but not
uncontrolled.
While all this theoretially might improve disk
writes in some cases, it is far from any magic bullet.
No real performance testing has been done yet, only
functional.
"
* 'calle/commitlog-reuse' of github.com:scylladb/seastar-dev:
commitlog: Recycle used segments instead of delete + new file
commitlog: Terminate all segments with a zero chunk
commitlog_replay: Enforce file name based id matching
When reading the header chunk of a commitlog file, check the stored id
value against the id derived from the file name, and ignore if
mismatched. This is a prerequisite for re-using renamed commitlog files,
as we can then fail-fast should one such be left on disk, instead of
trying to replay it.
We also check said id via the CRC check for each chunk parsed. If we
find a chunk with
mismatched id, we will get a CRC error for the chunk, and replay will
terminate (albeit not gracefully).
"
Make major compaction aware of compaction strategy, by using an
optimal approach which suits the strategy needs.
Refs #1431.
"
* 'compaction_strategy_aware_major_compaction_v2' of github.com:raphaelsc/scylla:
tests: add test for compaction-strategy-aware major compaction
compaction: implement major compaction heuristic for leveled strategy
compaction: introduce notion of compaction-strategy-aware major compaction
Extract configuration into a new struct batchlog_manager_config and have the
callers populate it using db::config. This reduces dependencies on global objects.
Instead, distribute those inclusions to .cc files that require them. This
reduces rebuilds when config.hh changes, and makes it easier to locate files
that need config disaggregation.
The results vector should be populated vertically, not horizontally.
Responsible for assertion failure with --cache-enabled:
void result_collector::add(test_result_vector): Assertion `rs.size() == results.size()' failed.
Introduced in 3fc78a25bf.
Message-Id: <1544105835-24530-2-git-send-email-tgrabiec@scylladb.com>