sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().
Mechanically converted with https://github.com/avikivity/unsprint.
The index_reader class public interface has been amended to only deal
with the upper bound cursor along with advancing the lower bound.
Since users of the class can only operate explicitly on the lower bound
cursor (take the data file position, advance to the next partition, etc.),
it no longer makes sense for a method name to state that the method
operates on the lower bound cursor.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
As preparation for the switch to the new cell representation, this
patch changes the type returned by atomic_cell_view::value() to one that
requires explicit linearisation of the cell value. Even though the value
is still implicitly linearised (and only when managed by the LSA) the
new interface is the same as the target one so that no more changes to
its users will be needed.
The sstable test fails when several instances run concurrently (for
example, release and debug modes) because many tests share a static
temporary directory.
Fix it by switching to a dynamic temporary directory created with
mkdtemp(). The sstable tests now also run in /tmp, which makes them
much faster.
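A minimal sketch of the mkdtemp() approach, for illustration only. The
helper name make_unique_tmpdir and the "/tmp/sstable-test-" prefix are
assumptions, not names taken from the patch:

```cpp
#include <cassert>
#include <cerrno>
#include <stdlib.h>      // mkdtemp() is POSIX, declared in <stdlib.h>
#include <string>
#include <system_error>

// Hypothetical helper: creates a uniquely named directory and returns its
// path, so concurrent test runs never share a directory.
std::string make_unique_tmpdir(const std::string& prefix = "/tmp/sstable-test-") {
    // mkdtemp() requires the template to end in exactly six 'X' characters.
    std::string tmpl = prefix + "XXXXXX";
    // mkdtemp() rewrites the trailing XXXXXX in place and creates the
    // directory with mode 0700; it returns nullptr on failure.
    if (::mkdtemp(tmpl.data()) == nullptr) {
        throw std::system_error(errno, std::generic_category(), "mkdtemp");
    }
    // Only the XXXXXX suffix is modified, so the prefix is still intact.
    assert(tmpl.compare(0, prefix.size(), prefix) == 0);
    return tmpl;
}
```

Each call yields a different directory, so release and debug runs of the
same test no longer collide. Cleanup of the directory is left out of the
sketch.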
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180516042044.15336-1-raphaelsc@scylladb.com>
We were feeding the total estimated partition count of an input shared
sstable to each of the unshared output sstables.
So the sstable writer thinks, *from the estimate*, that each sstable
created by resharding will hold as much data as the shared sstable it is
being created from. That's a problem because the estimate is fed to
bloom filter creation, where it directly influences the filter size.
So if we're resharding all sstables that belong to all shards, the
disk usage taken by filter components will be multiplied by the number
of shards. That becomes more of a problem with #3302.
Partition count estimation for a shard S will now be done as follows:
//
// TE, the total estimated partition count for a shard S, is defined as
// TE = Sum(i = 0...N) { Ei / Si }.
//
// where i is an input sstable that belongs to shard S,
// Ei is the estimated partition count for sstable i,
// Si is the total number of shards that own sstable i.
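The formula above can be sketched as follows. This is an illustration
under assumptions: the struct input_sstable and the function name
estimate_partitions_for_shard are hypothetical, and rounding down with
integer division is a choice made here, not something the patch states.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one input sstable that belongs to shard S.
struct input_sstable {
    uint64_t estimated_partitions;  // Ei
    unsigned owner_shards;          // Si, number of shards that own sstable i
};

// TE = Sum over inputs of Ei / Si.
uint64_t estimate_partitions_for_shard(const std::vector<input_sstable>& inputs) {
    uint64_t total = 0;
    for (const auto& sst : inputs) {
        assert(sst.owner_shards > 0);  // every sstable has at least one owner
        // Integer division rounds down (an assumption of this sketch).
        total += sst.estimated_partitions / sst.owner_shards;
    }
    return total;
}
```

For example, inputs with (Ei, Si) of (1000, 4), (600, 2) and (50, 1)
give 250 + 300 + 50 = 600, instead of the 1650 that would result from
passing each Ei through unchanged.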
Fixes #2672.
Refs #3302.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180423151001.9995-1-raphaelsc@scylladb.com>
I see the following error:
seastar/core/future-util.hh:597:10: note: constraints not satisfied
seastar/core/future-util.hh:597:10: note: with ‘sstables::sstable_version_types* c’
seastar/core/future-util.hh:597:10: note: with ‘sub_partitions_read::run_test_case()::<lambda(sstables::sstable::version_types)> aa’
seastar/core/future-util.hh:597:10: note: the required expression ‘seastar::futurize_apply(aa, (* c.begin()))’ would be ill-formed
seastar/core/future-util.hh:597:10: note: ‘seastar::futurize_apply(aa, (* c.begin()))’ is not implicitly convertible to ‘seastar::future<>’
The C array all_sstable_versions decayed to a pointer (see second gcc note)
and of course doesn't support std::begin().
Fix by replacing the C array with an std::array<>, which supports std::begin().
Not clear what made this break again, or why it worked before.
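A reduced sketch of the decay problem and the fix. The names version,
for_each_in, all_versions and count_all_versions are hypothetical
stand-ins, and for_each_in is only assumed to resemble a generic helper
that receives its range by value:

```cpp
#include <array>
#include <cassert>
#include <iterator>

enum class version { ka, la };  // hypothetical stand-in for the version enum

// Hypothetical generic helper that takes its range by value. If a C array
// is passed, Range is deduced as a pointer and std::begin(r) is ill-formed.
template <typename Range, typename Func>
int for_each_in(Range r, Func f) {
    int visited = 0;
    for (auto it = std::begin(r); it != std::end(r); ++it) {
        f(*it);
        ++visited;
    }
    return visited;
}

// std::array keeps its size in the type and never decays to a pointer.
static const std::array<version, 2> all_versions = { version::ka, version::la };

int count_all_versions() {
    int seen = 0;
    int visited = for_each_in(all_versions, [&seen](version) { ++seen; });
    assert(visited == seen);
    return seen;
}
```

Declaring all_versions as "version all_versions[2]" instead would make
the call in count_all_versions() fail to compile, because the argument
would be deduced as "version*".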
Message-Id: <20180325095239.12407-1-avi@scylladb.com>
"
These patches add support for C* 2.2 file(name) format.
Namely:
* It forces Scylla to write files in la format.
* Adds storage-service feature for them.
* cf and ks are determined from directory, not from file-name (for 2.2 format).
* Adds some other fixes to make dtest happy.
* Unit tests work with la format or with both formats.
"
* 'danfiala/filename-format-2.2-v4' of https://github.com/hagrid-the-developer/scylla:
tests/sstables: Tests use la format or iterate over both formats.
tests/sstables: Helper functions support 2.2 format directory structure.
sstables: Use 2.2 (la) format as a default format to store sstables if it is enabled by feature-bits.
storage_service: Support la sstable storage format as a feature.
sstables: make_descriptor accepts sstable-directory, because it is necessary to determine cf and ks in 2.2 format.
sstables: Throw more detailed exception for unknown item in reverse_map.
sstables/compaction: Suppress NaN in a report of a throughput.
Right now the summary can be copied, but in real life there is no reason
for this to be a requirement. Tests want it, so we can destroy a summary,
load another, and compare the two. We can achieve this by allowing the first
summary to be moved, and then we can still have a reference to the second.
I am about to make a change that will make the summary not copyable as a
requirement, so we need to do this first.
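A sketch of how a test can compare two loads of a summary using moves
only. The types summary and sstable below are simplified, hypothetical
stand-ins, and load() and move_summary() are assumed names:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the summary component: movable, not copyable.
struct summary {
    std::vector<uint64_t> positions;
    summary() = default;
    summary(const summary&) = delete;
    summary& operator=(const summary&) = delete;
    summary(summary&&) noexcept = default;
    summary& operator=(summary&&) noexcept = default;
};
static_assert(!std::is_copy_constructible<summary>::value, "summary must not be copyable");

// Hypothetical stand-in for the sstable that owns the summary.
struct sstable {
    summary _summary;
    void load(std::vector<uint64_t> positions) { _summary.positions = std::move(positions); }
    summary& get_summary() { return _summary; }
    // Hands the current summary to the caller; the sstable must load again
    // before its own summary is used.
    summary move_summary() { return std::move(_summary); }
};

// Load, move the first summary out, load again, compare the two.
bool reload_matches(sstable& s, const std::vector<uint64_t>& on_disk) {
    s.load(on_disk);
    summary first = s.move_summary();
    s.load(on_disk);
    return first.positions == s.get_summary().positions;
}
```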
Signed-off-by: Glauber Costa <glauber@scylladb.com>
"
Adds extension points to schema/sstables to enable hooking in
stuff, like, say, something that modifies how sstable disk io
works. (Cough, cough, *encryption*)
Extensions are processed as property keywords in CQL. To add
an extension, a "module" must register it with the extensions
object at boot time. To avoid globals (though not entirely),
extensions are reachable from config (and thus from db).
Table/view tables already contain an extension element, so
we utilize this to persist config.
schema_tables tables/views from mutations now require a "context"
object (currently only extensions, but abstracted for easier
further changes).
Because of how schemas currently operate, there is a super
lame workaround to allow "schema_registry" access to config
and, by extension, extensions. DB, upon instantiation, calls
a thread-local global "init" in schema_registry and registers
the config. It, in turn, can then call table_from_mutations
as required.
Includes the (modified) patch to encapsulate compression
into objects, mainly because it is nice to encapsulate, and
isolate a little.
"
* 'calle/extensions-v5' of github.com:scylladb/seastar-dev:
extensions: Small unit test
sstables: Process extensions on file open
sstables::types: Add optional extensions attribute to scylla metadata
sstables::disk_types: Add hash and comparator(sstring) to disk_string
schema_tables: Load/save extensions table
cql: Add schema extensions processing to properties
schema_tables: Require context object in schema load path
schema_tables: Add opaque context object
config_file_impl: Remove ostream operators
main/init: Formalize configurables + add extensions to init call
db::config: Add extensions as a config sub-object
db::extensions: Configuration object to store various extensions
cql3::statements::property_definitions: Use std::variant instead of any
sstables: Add extension type for wrapping file io
schema: Add opaque type to represent extensions
sstables::compress/compress: Make compression a virtual object
71495691aa removed sstable::get_index_reader(),
but forgot to update its callers in tests/. Update the callers to construct
a temporary shared_index_list and create the index_reader directly.
This is none too clean, but shared_index_lists needs to be retired, and then
the changes in this patch can go away too.
Tests: unit (release)
Message-Id: <20180211164739.17862-1-avi@scylladb.com>
Make a "compressor" an actual class, that can be implemented and
registered via class registry.
For "common" compressors, the objects will be shared, but complex
implementors can be semi-stateful.
sstable compression is split into two parts: The "static" config
which is shared across shards, and a "local" one, which holds
a compressor pointer. The latter is encapsulated, along with
actual compressed data writers, in sstables/compress.cc.
For compression (write), a compression writer is instantiated
with the settings active in table metadata.
For decompression (read), a compression reader is instantiated
with the settings stored in sstable metadata, which can
differ from the currently active table metadata.
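A rough sketch of a compressor class registered by name. Everything
here is an illustrative assumption: the interface, the registry
functions and the identity_compressor are hypothetical and do not
reproduce the actual class registry or compressor API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical virtual compressor interface.
class compressor {
public:
    virtual ~compressor() = default;
    virtual std::string compress(const std::string& in) const = 0;
    virtual std::string uncompress(const std::string& in) const = 0;
};

using compressor_factory = std::function<std::shared_ptr<compressor>()>;

// Hypothetical registry mapping a compressor name to its factory.
std::map<std::string, compressor_factory>& registry() {
    static std::map<std::string, compressor_factory> r;
    return r;
}

void register_compressor(const std::string& name, compressor_factory f) {
    registry()[name] = std::move(f);
}

std::shared_ptr<compressor> create_compressor(const std::string& name) {
    auto it = registry().find(name);
    if (it == registry().end()) {
        throw std::invalid_argument("unknown compressor: " + name);
    }
    return it->second();
}

// Toy stateless compressor that returns its input unchanged.
class identity_compressor final : public compressor {
public:
    std::string compress(const std::string& in) const override { return in; }
    std::string uncompress(const std::string& in) const override { return in; }
};

void register_builtin_compressors() {
    // A "common" compressor: every lookup shares one instance.
    register_compressor("identity", [] {
        static auto shared = std::make_shared<identity_compressor>();
        return std::shared_ptr<compressor>(shared);
    });
}
```

A stateful implementation would register a factory that builds a new
object per call instead of returning the shared one.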
v2:
* Structured patch sets differently (dependencies)
* Added more comments/api descs
* Added patch to move all sstable compression into compress.cc,
effectively separating top-level virtual compressor object
from sstable io knowledge
v3:
* Rebased
v4:
* Moved all sstable compression logic/knowledge into
compress.cc (local compression). Merged the two patches
(separation just confuses reader).
The promoted index is now converted into an input_stream and skipped over
instead of being consumed immediately and stored as a single buffer.
The only part that is read right away is the deletion time, as it is
likely to be in the already-read buffer; reading it should be cheap
and avoids reading the whole promoted index when only the deletion
time mark is needed.
When accessed, the promoted index is parsed in chunks, buffer by buffer,
to limit memory consumption.
Fixes #2981
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
sstables created prior to cc6c383 can contain a bad max deletion time
stat, which would make get_fully_expired_sstables return sstables that
aren't actually fully expired. Let's make the sstable invalidate the stat
if it is potentially incorrect.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Currently, fully expired sstable[1] is unconditionally chosen for compaction
by DTCS, but that may lead to a compaction loop under certain conditions.
Let's consider that an almost expired sstable is compacted, and it's not
deleted yet, and that the new sstable becomes expired before its ancestor is
deleted.
Because this new sstable is expired, it will be chosen by DTCS, but it will
not be purged because 'compacted undeleted' sstables are taken into account
when calculating the max purgeable timestamp, which prevents expired data
from being purged. The problem is that this sequence of events can keep
happening forever, as reported in issue #2260.
NOTE: This problem was easier to reproduce before improvement on compaction
of expired cells, because fully expired sstable was being converted into a
sstable full of tombstones, which is also considered fully expired.
Fixes #2260.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170428233554.13744-1-raphaelsc@scylladb.com>
So that as_mutation_reader() will create the same kind of reader which
database::make_sstable_reader() does.
Before this change, all readers were range readers.
There are instantiations of binary_search() used in sstables.cc, but
defined in partition.cc. The instantiations are explicitly declared in
partition.cc, but the types changed and the declarations became obsolete.
The thing worked because partition.cc also instantiated it with the right
type. But once that code is removed, it no longer will, and we
would get a linker error. To avoid such problems, define
binary_search() in a header.
We intend to share immutable sstable components among shards to
reduce excessive memory usage when resharding shared sstables.
This change is about grouping those components into a structure,
and using foreign_ptr to make sure that the structure will be
deleted by whichever shard created it.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Rename _components to _recognized_components because _components
will be used to name a field with shareable components.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The problem causes the size-tiered strategy to return small jobs when
there are more than max_threshold sstables of similar size. For example,
if max_threshold is 32 and there are 36 sstables of similar size, the
strategy will only return 4 sstables to be compacted. That's because
we incorrectly create a new bucket when the current one reaches the max
threshold.
What we should do is allow buckets to grow beyond the max threshold
and trim them when selecting the most suitable one for compaction.
It is worth mentioning that estimation for size tiered will now
work better when there are more than max_threshold sstables of
similar size.
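A simplified sketch of the corrected behaviour: buckets grow without a
cap, and trimming happens only at selection time. The similarity rule
(within 0.5x to 1.5x of the bucket average) and "largest bucket wins"
are assumptions of this sketch, as are the names make_buckets and
select_job; the real strategy may choose differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using bucket = std::vector<uint64_t>;  // sstable sizes in one bucket

// Groups sizes into buckets of similar size. No bucket is closed for
// reaching max_threshold; that cap is applied later, in select_job().
std::vector<bucket> make_buckets(std::vector<uint64_t> sizes) {
    std::sort(sizes.begin(), sizes.end());
    std::vector<bucket> buckets;
    double avg = 0;  // average size of the bucket currently being filled
    for (uint64_t size : sizes) {
        if (!buckets.empty() && size >= avg * 0.5 && size <= avg * 1.5) {
            bucket& b = buckets.back();
            avg = (avg * b.size() + size) / (b.size() + 1);
            b.push_back(size);
        } else {
            buckets.push_back({size});
            avg = static_cast<double>(size);
        }
    }
    return buckets;
}

// Picks the bucket with the most sstables after trimming each eligible
// bucket down to max_threshold entries.
bucket select_job(std::vector<bucket> buckets, size_t min_threshold, size_t max_threshold) {
    assert(min_threshold <= max_threshold);
    bucket best;
    for (auto& b : buckets) {
        if (b.size() < min_threshold) {
            continue;
        }
        if (b.size() > max_threshold) {
            b.resize(max_threshold);  // buckets are sorted, so this keeps the smallest
        }
        if (b.size() > best.size()) {
            best = std::move(b);
        }
    }
    return best;
}
```

With 36 sstables of equal size, a min threshold of 4 and a max
threshold of 32, the sketch produces one bucket of 36 and a job of 32
sstables rather than a job of 4.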
Fixes #1901.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <080bad70d6cb86eaf52ac1bdd6765ac47aab5b03.1478316140.git.raphaelsc@scylladb.com>
After 7c28ed, the schemas defined in the test became compressed by
default. This patch changes the test so that it is explicit about
which schemas shouldn't define a compressor.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1478646530-5558-1-git-send-email-duarte@scylladb.com>
The leveled strategy makes heavy use of the first and last decorated
keys of an sstable to find overlapping sstables in a given level. By
storing the first and last decorated keys in the sstable object, we
expect the performance of the leveled strategy (not of compaction) to
improve.
We will set the first and last keys in the sstable when either loading
or sealing it.
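A sketch of the overlap check that benefits from cached keys. As a
simplification, a decorated key is reduced to an integer token, and the
names sstable_info, overlaps and overlapping are hypothetical:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical simplification: a decorated key is represented by its token.
using token = int64_t;

struct sstable_info {
    token first;  // cached once, when the sstable is loaded or sealed
    token last;
};

// Two sstables overlap when their [first, last] key ranges intersect.
bool overlaps(const sstable_info& a, const sstable_info& b) {
    assert(a.first <= a.last && b.first <= b.last);
    return a.first <= b.last && b.first <= a.last;
}

// Returns the sstables of a level that overlap the candidate. With cached
// keys, each check is two comparisons and needs no metadata lookup.
std::vector<sstable_info> overlapping(const sstable_info& candidate,
                                      const std::vector<sstable_info>& level) {
    std::vector<sstable_info> result;
    for (const auto& sst : level) {
        if (overlaps(candidate, sst)) {
            result.push_back(sst);
        }
    }
    return result;
}
```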
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <0abca819454ab4c088541bb49714f1f6a7dc4f42.1473959677.git.raphaelsc@scylladb.com>
That will be needed for an optimization that will store decorated keys
in the sstable object, and also for subsequent work that will
detect wrong metadata (min/max column names) by looking at columns
in the schema. As the schema is stored in the sstable, there's no longer
a need to store ks and cf names in it.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>