"
This patch series introduces initial support for writing SSTables in
'mc' format (aka SSTables 3.0).
Currently, the following components are written in 3.0 format:
- Data.db
- Index.db
- Summary.db
(there were no changes to the summary file format compared to ka/la)
Other SSTables components are written in the old format for now as they
still need to exist to satisfy post-flush processing.
For now, only rows are written to the data file and indexed. Range
tombstones are not supported.
Writing rows is supported in full with the only exception being counter
cells. All the other features (TTLed data, row/cell level tombstones,
collections, etc) are supported.
Unit tests rely on producing files and binary-comparing them with
'golden' copies that are produced using Cassandra 3.11. This is done so
that testing is not blocked until reading the SSTables 3.0 format is implemented.
=======================================
Implementation notes
=======================================
Internally, sstable_writer has been refactored to support multiple
implementations that are instantiated in its constructor based on the
sstable version. Little to no code is shared among sstable_writer_v2 and
sstable_writer_v3 as we only intend to support sstable_writer_v2
alongside sstable_writer_v3 for a single release (to be able to do
rollback on rolling upgrade failure) and then plan to get rid of it
entirely and switch to always writing SSTables in the new format.
The design of sstable_writer_v3 mostly follows that of its precursors
sstable_writer(_v2) and components_writer. Some refactoring and further
code rearrangements are expected in the future but the main code is
there.
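
The version-based dispatch described above can be sketched roughly as follows. This is only an illustration in Python, not the actual C++ code: the class names mirror sstable_writer, sstable_writer_v2 and sstable_writer_v3, while the version table and the impl_name property are hypothetical.

```python
class SSTableWriterV2:
    """Stand-in for the writer of the older formats (assumed: ka and la)."""
    name = "v2"


class SSTableWriterV3:
    """Stand-in for the writer of the 'mc' (SSTables 3.0) format."""
    name = "v3"


# Hypothetical mapping from sstable version to writer implementation.
_WRITERS = {"ka": SSTableWriterV2, "la": SSTableWriterV2, "mc": SSTableWriterV3}


class SSTableWriter:
    """Facade that picks its implementation in the constructor."""

    def __init__(self, version):
        try:
            impl_cls = _WRITERS[version]
        except KeyError:
            raise ValueError(f"unsupported sstable version: {version!r}") from None
        self._impl = impl_cls()

    @property
    def impl_name(self):
        # Exposed only so the sketch can show which implementation was chosen.
        return self._impl.name
```

Keeping the two implementations separate behind one facade means the older one can later be deleted without touching callers.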
"
* 'projects/sstables-30/write-rows/v2' of https://github.com/argenet/scylla:
Add tests for writing data and index files in SSTables 3.0 ('mc') format.
Support for writing SSTables 3.0 ('mc') Data.db and Index.db files - rows only.
Add missing enum values to bound_kind.
Add building blocks for writing data in SSTables 3.0 format.
Refactor sstable_writer to support various internal implementations.
Add is_fixed_length() to data types.
Add mutation_partition::apply_insert() overload that accepts TTL and expiry for row marker.
bound_kind::clustering, bound_kind::excl_end_incl_start and
bound_kind::incl_end_excl_start are used during SSTables 3.0 writing.
bound_kind::static_clustering is not used yet but added for completeness
and parity with the Origin.
For #1969.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
For any given CQL data type, this member returns whether its values are
of fixed or variable length. This is used by SSTables 3.0 format to only
store the length value for variable-length cells.
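
A minimal sketch of the idea, in Python for brevity. The type table and the 4-byte big-endian length prefix are assumptions made for illustration only; they are not the actual on-disk encoding, and write_cell_value is a hypothetical helper.

```python
import struct

# Hypothetical sizes for a few fixed-length types; in the real code the
# answer comes from is_fixed_length() on the data type.
FIXED_LENGTH = {"int": 4, "bigint": 8, "boolean": 1}


def write_cell_value(type_name, value):
    """Serialize a cell value, storing its length only when it can vary."""
    fixed_size = FIXED_LENGTH.get(type_name)
    if fixed_size is not None:
        if len(value) != fixed_size:
            raise ValueError(f"{type_name} value must be {fixed_size} bytes")
        return value  # the reader already knows the size from the type
    # Variable-length value: prefix it with its length (placeholder encoding).
    return struct.pack(">I", len(value)) + value
```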
For #1969.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
"
This patchset prepares everything for supporting both the 2.x and 3.x formats and implements reading
a very simple table with just partition keys from a 3.x sstable.
Tests: units (release)
"
* 'haaawk/sstables3/read_only_partitions_v4' of ssh://github.com/scylladb/seastar-dev: (22 commits)
Test for reading sstable in MC format with no columns
Use new mp_row_consumer_m and data_consume_rows_context_m
Introduce mp_row_consumer_m
Rename mp_row_consumer to mp_row_consumer_k_l
Introduce consumer_m and data_consume_rows_context_m
Use read_short_length_bytes in RANGE_TOMBSTONE
Use read_short_length_bytes in ATOM_START
Use read_short_length_bytes in ROW_START
Add continuous_data_consumer::read_short_length_bytes
Reduce duplication with continuous_data_consumer::read_partial_int
Add test for a simple table with just partition key
Add test for reading index
Extract mp_row_consumer to separate header
Make sstable_mutation_reader independent from mp_row_consumer
Make sstable_mutation_reader a template
Make data_consume_context a template
Move data_consume_rows_context from row.cc to row.hh
Decouple sstable.hh and row.hh
Reduce visibility of sstable::data_consume_*
Move data_consume_context to separate header
...
Take DataConsumeRowsContext type as parameter.
This will allow us to implement different context
for reading 3.x files.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Parametrize it with the type of data consume rows context.
There will be different implementations used for different
sstable file formats.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
It will be used as a template parameter for sstable_mutation_reader
once it's turned into a template. This means the definition has
to be accessible.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
They are used just in partition.cc, row.cc and sstables_test.cc
so it is useful to narrow their scope by moving them
to data_consume_context.hh.
This will make it much easier to turn data_consume_context into
a template.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
It's used only in row.cc, partition.cc and sstables_test.cc
so it's better to reduce the dependency just to those files.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
In some build environments we may want to limit the number of parallel jobs:
ninja-build runs ncpus jobs by default, which may be too many because g++
consumes a lot of memory.
So support --jobs <njobs>, just like the rpm build script does.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180425205439.30053-1-syuu@scylladb.com>
"
This patchset brings in a statistics collector that tracks minimal
values for timestamps, TTLs and local deletion times for all the updates
made to a given memtable.
These statistics are later used when flushing memtables into SSTables
using 3.x ('mc') format to delta-encode corresponding values using
collected minimums as bases (that is why it is called encoding
statistics).
This patchset is sent out apart from other changes that introduce
writing SSTables 3.x to facilitate read path implementation that also
needs the encoding_stats structure.
The tests for write path implicitly cover this functionality as any rows
written to an SSTable 3.0 file make use of delta-encoding.
"
* 'projects/sstables-30/collect-encoding-statistics-v4' of https://github.com/argenet/scylla:
Collect encoding statistics for memtable updates.
Factor out min_tracker and max_tracker as common helpers.
Always pass mutation_partitions to partition_entry::apply()
We keep track of all updates and store the minimal values of timestamps,
TTLs and local deletion times across all the inserted data.
These values are written as a part of serialization_header for
Statistics.db and used for delta-encoding values when writing Data.db
file in SSTables 3.0 (mc) format.
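
A rough sketch of the mechanism, in Python rather than the project's C++. MinTracker and EncodingStats loosely mirror min_tracker and encoding_stats; their methods, the default of 0 and the delta helpers are assumptions made for illustration.

```python
class MinTracker:
    """Remembers the smallest value seen so far."""

    def __init__(self):
        self._min = None

    def update(self, value):
        if self._min is None or value < self._min:
            self._min = value

    def get(self, default=0):
        # The default for "nothing seen yet" is an assumption of this sketch.
        return default if self._min is None else self._min


class EncodingStats:
    """Minimal timestamp, TTL and local deletion time across all updates."""

    def __init__(self):
        self.min_timestamp = MinTracker()
        self.min_ttl = MinTracker()
        self.min_local_deletion_time = MinTracker()

    def update(self, timestamp, ttl=None, local_deletion_time=None):
        self.min_timestamp.update(timestamp)
        if ttl is not None:
            self.min_ttl.update(ttl)
        if local_deletion_time is not None:
            self.min_local_deletion_time.update(local_deletion_time)


def delta_encode(value, base):
    """Store only the distance from the collected minimum."""
    return value - base


def delta_decode(delta, base):
    return base + delta
```

Because the base is the minimum over all updates, every delta is non-negative and typically small, which is what makes the encoding compact.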
For #1969.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
They will be re-used for collecting encoding statistics which is needed
to write SSTables 3.0.
Part of #1969.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Previously it was also possible to pass a frozen_mutation to it.
Now we de-serialize frozen mutations at the calling side.
This is a prerequisite for collecting memtable statistics needed for
writing into the SSTables 3.0 format.
For #1969.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
When provisioning a Scylla docker image with --developer-mode 0 (disabled),
scylla_raid_setup is not invoked. As a consequence the "data" directory is not
created and scylla_io_setup fails (steps to reproduce and error message provided
at the end).
This patch adds the same verifications present in scylla_io_setup to docker's
scyllasetup.py and creates the data directory if it is not present.
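
The directory-creation part could look roughly like the sketch below. ensure_data_dir is a hypothetical helper, not the code added to scyllasetup.py, and the default base path is taken from the reproduction steps.

```python
import os


def ensure_data_dir(base="/var/lib/scylla"):
    """Create <base>/data if it is missing and return its path."""
    data_dir = os.path.join(base, "data")
    # exist_ok=True makes the call a no-op when the directory already exists.
    os.makedirs(data_dir, exist_ok=True)
    return data_dir
```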
--
Steps to reproduce on AWS i3.2xlarge with Ubuntu 16.04:
sudo -s
apt update && apt upgrade -y && apt-get install docker.io -y
mdadm --create --verbose --force --run /dev/md0 --level=0 -c1024 --raid-devices=1 /dev/nvme0n1
mkfs.xfs /dev/md0 -f -K
mkdir /var/lib/scylla
mount -t xfs /dev/md0 /var/lib/scylla
docker run --name some-scylla \
--volume /var/lib/scylla:/var/lib/scylla \
-p 9042:9042 -p 7000:7000 -p 7001:7001 -p 7199:7199 \
-p 9160:9160 -p 9180:9180 -p 10000:10000 \
-d scylladb/scylla --overprovisioned 1 --developer-mode 0
docker logs some-scylla
running: (['/usr/lib/scylla/scylla_dev_mode_setup', '--developer-mode', '0'],)
running: (['/usr/lib/scylla/scylla_io_setup'],)
terminate called after throwing an instance of 'std::system_error'
what(): open: No such file or directory
ERROR:root:/var/lib/scylla/data did not pass validation tests, it may not be on XFS and/or has limited disk space.
This is a non-supported setup, and performance is expected to be very bad.
For better performance, placing your data on XFS-formatted directories is required.
To override this error, enable developer mode as follow:
sudo /usr/lib/scylla/scylla_dev_mode_setup --developer-mode 1
failed!
Traceback (most recent call last):
File "/docker-entrypoint.py", line 15, in <module>
setup.io()
File "/scyllasetup.py", line 34, in io
self._run(['/usr/lib/scylla/scylla_io_setup'])
File "/scyllasetup.py", line 23, in _run
subprocess.check_call(*args, **kwargs)
File "/usr/lib64/python3.4/subprocess.py", line 558, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/lib/scylla/scylla_io_setup']' returned non-zero exit status 1
ls -latr /var/lib/scylla
total 4
drwxr-xr-x 44 root root 4096 Abr 24 13:02 ..
drwxr-xr-x 2 root root 6 Abr 24 13:10 .
Signed-off-by: Moreno Garcia <moreno@scylladb.com>
Message-Id: <20180424173729.22151-1-moreno@scylladb.com>
Fixes #3187
Requires seastar "inet_address: Add constructor and conversion function
from/to IPv4"
Implements IPv6 support for CQL inet data. The stored value is now
either 4 or 16 bytes long. gms::inet_address has been augmented
to interoperate with seastar::inet_address, though actually trying
to use an IPv6 address there or in any of its tables will throw.
Tests that assumed IPv4 have been changed. Storing an ipv4_address should be
transparent, as it now "widens". However, since every ipv4_address is an
inet_address, but not vice versa, there is no implicit overloading on
the read paths. That is, tests and system_keyspace (where we read IP
addresses from tables explicitly) are modified to use the proper type.
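
The 4-versus-16-byte distinction can be illustrated with Python's standard ipaddress module. This only sketches the value layout; serialize_inet and deserialize_inet are hypothetical names, not functions from the patch.

```python
import ipaddress


def serialize_inet(text):
    """Return the raw network-order bytes: 4 for IPv4, 16 for IPv6."""
    return ipaddress.ip_address(text).packed


def deserialize_inet(raw):
    """Rebuild an address from raw bytes, using the length to pick the family."""
    if len(raw) not in (4, 16):
        raise ValueError(f"inet value must be 4 or 16 bytes, got {len(raw)}")
    return ipaddress.ip_address(raw)
```

A reader that assumes every value is 4 bytes would reject or misread the 16-byte form, which is why the read paths have to use the wider type explicitly.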
Message-Id: <20180424161817.26316-1-calle@scylladb.com>
CQL normally folds identifiers such as column names to lowercase. However,
if the column name is quoted, case-sensitive column names and other strange
characters can be used. We had a bug where such columns could be indexed,
but then, when trying to use the index in a SELECT statement, it was not
found.
The existing code remembered the index's column after converting it to CQL
format (adding quotes). But such conversion was unnecessary, and wrong,
because the rest of the code works with bare strings and does not involve
actual CQL statements. So the fix avoids this mistaken conversion.
This patch also includes a test to reproduce this problem.
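
The mismatch can be illustrated with a small Python sketch. to_cql_string and the column set are hypothetical, and reserved keywords (which also need quoting in CQL) are ignored here.

```python
import re

# Names that survive CQL's lowercase folding unchanged need no quotes.
_UNQUOTED = re.compile(r"[a-z][a-z0-9_]*")


def to_cql_string(name):
    """Render a bare column name the way it would be written in CQL."""
    if _UNQUOTED.fullmatch(name):
        return name
    return '"' + name.replace('"', '""') + '"'


# Internal metadata keeps bare names, without any CQL quoting.
columns = {"id", "MixedCase"}

bare = "MixedCase"
quoted = to_cql_string(bare)       # the form the buggy code remembered

found_with_quoted = quoted in columns   # lookup by the CQL form misses
found_with_bare = bare in columns       # lookup by the bare name succeeds
```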
Fixes #3154.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180424154920.15924-1-nyh@scylladb.com>
This commit fixes two closely related issues with handling
case-sensitive column names in JSON:
* according to the documentation, case-sensitive names should be wrapped
with an additional pair of double quotes during JSON SELECT
* logic error in parse_json() prevented INSERT JSON from working
properly on case-sensitive column names
This commit is followed by updated cql_query_test, which checks
case-sensitive cases as well.
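
The naming convention can be sketched as below. Both helpers are hypothetical and only illustrate the extra pair of double quotes; they are not the logic of parse_json(), and the handling of keys without quotes (folding to lowercase) is an assumption based on how CQL treats unquoted identifiers.

```python
import json


def json_key_for_column(name):
    """Key used in SELECT JSON output for a column with this bare name."""
    if name != name.lower():
        return '"' + name + '"'   # case-sensitive: add a pair of quotes
    return name


def column_for_json_key(key):
    """Bare column name addressed by a key in an INSERT JSON document."""
    if len(key) >= 2 and key.startswith('"') and key.endswith('"'):
        return key[1:-1]          # quoted: keep the case as written
    return key.lower()            # unquoted: fold to lowercase (assumed)


row = {json_key_for_column("MixedCase"): 1, json_key_for_column("id"): 7}
encoded = json.dumps(row)
```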
Message-Id: <82d9d5e193a656e99bc86b297c00662a6fb808a0.1524576066.git.sarna@scylladb.com>
"
Pass sstable version to parse, write and describe_type methods to make it possible to handle different versions.
For now, the serialization header from the 3.x format is ignored.
Tests: units (release)
"
* 'haaawk/sstables3/loading_v4' of ssh://github.com/scylladb/seastar-dev:
Add test for loading the whole sstable
Add test for loading statistics
Add support for 3_x stats metadata
Pass sstable version to describe_type
Pass sstable version to write methods
metadata_type: add Serialization type
Pass sstable_version_types to parse methods
Add test for reading filter
Add test for read_summary
sstables 3.x: Add test for reading TOC
sstable: Make component_map version dependent
sstable::component_type: add operator<<
Extract sstable::component_type to separate header
Remove unused sstable::get_shared_components
sstable_version_types: add mc version