This patchset addresses two issues with static rows support in SSTables
3.x. ('mc' format):
1. Since collections are allowed in static rows, we need to check for
complex deletion, set corresponding flag and write tombstones, if any.
2. Column indices need to be partitioned for static columns the same way
they are partitioned for regular ones.
* github.com/argenet/scylla.git projects/sstables-30/columns-proper-order-followup/v1:
sstables: Partition static columns by atomicity when reading/writing
SSTables 3.x.
sstables: Use std::reference_wrapper<> instead of a helper structure.
sstables: Check for complex deletion when writing static rows.
tests: Add/fix comments to
test_write_interleaved_atomic_and_collection_columns.
tests: Add test covering inverleaved atomic and collection cells in
static row.
It is possible to have collections in a static row so we need to check
for collection-wide tombstones like with clustering rows.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Collections are permitted in static rows so same partitioning as for
regular columns is required.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Current scylla.spec fails build on Fedora 27, since python2-pystache is
new package name that renamed on Fedora 28.
But Fedora 28's python2-pystache has tag "Provides: pystache",
so we can depends on old package name, this way we can build scylla.spec both
on Fedora 27/28.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20181028175450.31156-1-syuu@scylladb.com>
When a node reshards (i.e., restarts with a different number of CPUs), and
is in the middle of building a view for a pre-existing table, the view
building needs to find the right token from which to start building on all
shards. We ran the same code on all shards, hoping they would all make
the same decision on which token to continue. But in some cases, one
shard might make the decision, start building, and make progress -
all before a second shard goes to make the decision, which will now
be different.
This resulted, in some rare cases, in the new materialized view missing
a few rows when the build was interrupted with a resharding.
The fix is to add the missing synchronization: All shards should make
the same decision on whether and how to reshard - and only then should
start building the view.
Fixes#3890Fixes#3452
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20181028140549.21200-1-nyh@scylladb.com>
"
In Cassandra, row columns are stored in a BTree that uses the following
ordering on them:
- all atomic columns go first, then all multi-cell ones
- columns of both types (atomic and multi-cell) are
lexicographically ordered by name regarding each other
Scylla needs to store columns and their respective indices using the
same ordering as well as when reading them back.
Fixes#3853
Tests: unit {release}
+
Checked that the following SSTables are dumped fine using Cassandra's
sstabledump:
cqlsh:sst3> CREATE TABLE atomic_and_collection3 ( pk int, ck int, rc1 text, rc2 list<text>, rc3 text, rc4 list<text>, rc5 text, rc6 list<text>, PRIMARY KEY (pk, ck)) WITH compression = {'sstable_compression': ''};
cqlsh:sst3> INSERT INTO atomic_and_collection3 (pk, ck, rc1, rc4, rc5) VALUES (0, 0, 'hello', ['beautiful','world'], 'here');
<< flush >>
sstabledump:
[
{
"partition" : {
"key" : [ "0" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 96,
"clustering" : [ 0 ],
"liveness_info" : { "tstamp" : "1540599270139464" },
"cells" : [
{ "name" : "rc1", "value" : "hello" },
{ "name" : "rc5", "value" : "here" },
{ "name" : "rc4", "deletion_info" : { "marked_deleted" : "1540599270139463", "local_delete_time" : "1540599270" } },
{ "name" : "rc4", "path" : [ "45e22cb0-d97d-11e8-9f07-000000000000" ], "value" : "beautiful" },
{ "name" : "rc4", "path" : [ "45e22cb1-d97d-11e8-9f07-000000000000" ], "value" : "world" }
]
}
]
}
]
"
* 'projects/sstables-30/columns-proper-order/v1' of https://github.com/argenet/scylla:
tests: Test interleaved atomic and multi-cell columns written to SSTables 3.x.
sstables: Re-order columns (atomic first, then collections) for SSTables 3.x.
sstables: Use a compound structure for storing information used for reading columns.
In Cassandra, row columns are stored in a BTree that uses the following
ordering on them:
- all atomic columns go first, then all multi-cell ones
- columns of both types (atomic and multi-cell) are
lexicographically ordered by name regarding each other
Since schema already has all columns lexicographically sorted by name,
we only need to stably partition them by atomicity for that.
Fixes#3853
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
This representation makes it easier to operate with compound structures
instead of separate values that were stored in multiple containers.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Before this fix, write_missing_columns() helper would always deal with
regular columns even when writing static rows.
This would cause errors on reading those files.
Now, the missing columns are written correctly for regular and static
rows alike.
* github.com/argenet/scylla.git projects/sstables-30/fix-writing-static-missing-columns/v1:
schema: Add helper method returning the count of columns of specified
kind.
sstables: Honour the column kind when writing missing columns in 'mc'
format.
tests: Add test for a static row with missing columns (SStables 3.x.).
Previously, we've been writing the wrong missing columns indices for
static rows because write_missing_columns() explicitly used regular
columns internally.
Now, it takes the proper column kind into account.
Fixes#3892
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
* seastar d152f2d...c1e0e5d (6):
> scripts: perftune.py: properly merge parameters from the command line and the configuration file
> fmt: update to 5.2.1
> io_queue: only increment statistics when request is admitted
> Adds `read_first_line.cc` and `read_first_line.hh` to CMake.
> fstream: remove default extent allocation hint
> core/semaphore: Change the access of semaphore_units main ctor
Due to a compile-time fight between fmt and boost::multiprecision, a
lexical_cast was added to mediate.
sprint("%s", var) no longer accepts numeric values, so some sprint()s were
converted to format() calls. Since more may be lurking we'll need to remove
all sprint() calls.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
"
This patchset adddresses two problems with shadowable deletions handling
in SSTables 3.x. ('mc' format).
Firstly, we previously did not set a flag indicating the presence of
extended flags byte with HAS_SHADOWABLE_DELETION bitmask on writing.
This would break subsequent reading and cause all types of failures up
to crash.
Secondly, when reading rows with this extended flag set, we need to
preserve that information and create a shadowable_tombstone for the row.
Tests: unit {release}
+
Verified manually with 'hexdump' and using modified 'sstabledump' that
second (shadowable) tombstone is written for MV tables by Scylla.
+
DTest (materialized_views_test.py:TestMaterializedViews.hundred_mv_concurrent_test)
that originally failed due to this issue has successfully passed locally.
"
* 'projects/sstables-30/shadowable-deletion/v4' of https://github.com/argenet/scylla:
tests: Add tests writing both regular and shadowable tombstones to SSTables 3.x.
tests: Add test covering writing and reading a shadowable tombstone with SSTables 3.x.
sstables: Support Scylla-specific extension for writing shadowable tombstones.
sstables: Introduce a feature for shadowable tombstones in Scylla.db.
memtable: Track regular and shadowable tombstones separately in encoding_stats_collector.
sstables: Error out when reading SSTables 3.x with Cassandra shadowable deletion.
sstables: Support checking row extension flags for Cassandra shadowable deletion.
Even when we're using a full clustering range, need_skip() will return
true when we start a new partition and advance_context() will be
called with position_in_partition::before_all_clustered_rows(). We
should detect that there is no need to skip to that position before
the call to advance_to(*_current_partition_key), which will read the
index page.
Fixes#3868.
Message-Id: <1539881775-8578-1-git-send-email-tgrabiec@scylladb.com>
"
This patchset adds support generating .rpm/.deb from relocatable
package.
"
* 'reloc_rpmdeb_v5' of https://github.com/syuu1228/scylla:
configure.py: run create-relocatable-package.py everytime
configure.py: add SCYLLA-RELEASE-FILE/SCYLLA-VERSION-FILE targets
configure.py: use {mode} instead of $mode on scylla-package.tar.gz build target
dist/ami: build relocatable .rpm when --localrpm specified
dist/debian: use relocatable package to produce .deb
dist/redhat: use relocatable package to produce .rpm
install-dependencies.sh: add libsystemd as dependencies
install.sh: drop hardcoded distribution name, add --target option to specify distribution
build: add script to build relocatable package
build: compress relocatable package
build: add files on relocatable package to support generating .rpm/.deb
Right now we don't have dependencies for dist/, ninja not able to detect
changes under the directory.
To update relocatable package even only change is under dist/, we need
to run create-relocatable-package.py everytime.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
To re-generate scylla version files when it removed, since these files
required for relocatable package.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Since debian packaging system requires source package to compress tar
file, so let's use .gz compression.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
After the new in-memory representation of cells was introduced there was
a regression in atomic_cell_or_collection::operator<< which stopped
printing the content of the cell. This makes debugging more incovenient
are time-consuming. This patch fixes the problem. Schema is propagated
to the atomic_cell_or_collection printer and the full content of the
cell is printed.
Fixes#3571.
Message-Id: <20181024095413.10736-1-pdziepak@scylladb.com>
Limit message size according to the configuration, to avoid a huge message from
allocating all of the server's memory.
We also need to limit memory used in aggregate by thrift, but that is left to
another patch.
Fixes#3878.
Message-Id: <20181024081042.13067-1-avi@scylladb.com>
Compaction mode fails if more than one shard is used because it doesn't
make sure sstables used as input for compaction only contain local keys.
Therefore, sstable generated by compaction has less keys than expected
because non-local keys are purged out.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20181022225153.12029-1-raphaelsc@scylladb.com>
It is very useful for investigations in scylla issues, and we have
been moving those scripts manually when needed. Make it officially
part of the scylla package.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20181023184400.23187-1-glauber@scylladb.com>
scyllatop uses a log file, if opening the file fails, the user should
get a clear response not an exception trace.
The same is true for connecting to scylla
After this patch the following:
$ scyllatop.py -L /usr/lib/scyllatop.log
scyllatop failed opening log file: '/usr/lib/scyllatop.log' With an error: [Errno 13] Permission denied: '/usr/lib/scyllatop.log'
Fixes#3860
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20181021065525.22749-1-amnon@scylladb.com>
The original SSTables 'mc' format, as defined in Cassandra, does not provide
a way to store shadowable deletion in addition to regular row deletion
for materialized views.
It is essential to store it because of known corner-case issues that
otherwise appear.
For this to work, we introduce a Scylla-specific extended flag to be set
in SSTables in 'mc' format that indicates a shadowable tombstone is
written after the regular row tombstone.
This is deemed to be safe because shadowable tombstones are specific to
materialized views and MV tables are not supposed to be imported or
exported.
Note that a shadowable tombstone can be written without a regular
tombstone as well as along with it.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
This is used to indicate that the SSTables being read may contain a
Scylla-specific HAS_SCYLLA_SHADOWABLE_TOMBSTONE extended flag set.
If feature is not disabled, we should not honour this flag.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>