Fixes crash in cql_tests.StorageProxyCQLTester.table_test
"avoid race condition when deleting sstable on behalf..." changed
discard_sstables behaviour to only return rp:s for sstables owned
and submitted for deletion (not all matching time stamp),
which can in some cases cause zero rp returned.
Message-Id: <20180508070003.1110-1-calle@scylladb.com>
When node is decommissioned/removed it will drain all its hints and all
remote nodes that have hints to it will drain their hints to this node.
What "drain" means? - The node that "drains" hints to a specific
destination will ignore failures and will continue sending hints till the end
of the current segment, erase it and move to the next one till there are
no more segments left.
After all hints are drained the corresponding hints directory is removed.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Returning a future with an exception from end_point_manager::stop()
is practically useless because the best the caller can do is to log
it and continue as if it didn't happen because it has other things
to shut down.
Therefore in order to simplify the caller we will log the exception
if it happens and will always return a non-exceptional future.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
This ensures we respect the write timeout set by the client when
applying base writes, in case a writes takes too long to acquire the
row lock for the read-before-write phase of a materialized view
update.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180507132755.8751-1-duarte@scylladb.com>
* seastar ac02df7...840002c (20):
> dpdk: protect against missing statistics
> alien: make visible in documentation
> Merge "rewrite iotune to conform to the new ioscheduler" from Glauber
> app_template: Correct outdated comment
> apps, tests: Catch polymorphic exceptions by reference
> configure.py: Enhance detection for gcc -fvisibility=hidden bug
> reactor: add rudimentary task histogram reporting
> Revert "Merge "rewrite iotune to conform to the new ioscheduler" from Glauber"
> Merge "rewrite iotune to conform to the new ioscheduler" from Glauber
> build: Use the same warning name for Clang and GCC
> core/rwlock: Add support for timeouts
> fs qualification: protect against EINTR
> Docker: Fix failing build due to missing GNU make
> reactor: move optional to experimental so we compile with c++14
> future: remove allocation from future::get() thread context switch
> Merge "rpc streaming" from Gleb
> reactor: put mountpoint_params in seastar namespace
> Tutorial: in PDF version of tutorial, better backtick typesetting
> tutorial: support, and start using, links to other sections
> tutorial: improve second half of semaphores section
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch adds a simple and naive mechanism to ensure a base replica
doesn't overwhelm a potentially overloaded view replica by sending too
many concurrent view updates. We add a semaphore to limit to 100 the
number of outstanding view updates. We limit globally per shard, and
not per destination view replica. We also limit statically.
Refs #2538
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180426134457.21290-2-duarte@scylladb.com>
The functions in cql_assertions.hh are very convenient, but have one
frustrating drawback: When you have many of those assertions in one
test, it's very hard to know *which* of the similar assertions failed.
The problem is that an error often looks like this:
unknown location(0): fatal error: in "test_many_columns":
std::runtime_error: Expected 2 row(s) but got 0
tests/cql_assertions.cc(131): last checkpoint
Which of the many similar checks in "test_many_columns" failed? Note the
unhelpful "unknown location" and also the "last checkpoint" points to code
in cql_assertions.cc, not in the actual test, so it is useless.
The root cause of these problems is that the Boost macros use the C
preprocessor __FILE__ and __LINE__, which in actual C++ functions like
is_rows() remembers its location, instead of the caller. Fixing this will
not be simple. But this patch has a much simpler solution - fixing the
"last checkpoint". What ruins the last checkpoint is the use of BOOST_REQUIRE
inside the cql_assertions.cc is_rows() - when that succeeds, it records
the location inside cql_assertions.cc (!) as the last success.
If we just replace BOOST_REQUIRE by our own test (just like in the rest of
the cql_assertions.cc code), this code will not override the last checkpoint.
The user can see the last real successful BOOST_REQUIRE, or use
BOOST_TEST_PASSPOINT() to set his own checkpoints between different parts of
the same test.
After this patch, and with adding BOOST_TEST_PASSPOINT() calls between
different parts of my test, the failure above now looks like:
unknown location(0): fatal error: in "test_many_columns":
std::runtime_error: Expected 2 row(s) but got 0
tests/secondary_index_test.cc(299): last checkpoint
The "last checkpoint" now shows me exactly where my failing check was.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180501152638.26238-1-nyh@scylladb.com>
"This patchset adds ostream operators to result_message and uses them
in cql_assertions."
* tag 'result_message-print/v1.1' of https://github.com/avikivity/scylla:
tests: cql_assersions: improve error message when a row is not found
transport: add ostream support to result_message
transport: const correctness for result_message::accept()
Use utf8_type where warranted.
Fixes view_schema_test failure where the rows did not match. I don't
understand exactly why the failure happened (using the wrong type
should not cause a failure here), but the change fixes the problem.
Tests: view_schema_test (release)
Message-Id: <20180506130015.7450-1-avi@scylladb.com>
The visitor does not alter the result_message it is visiting (and
its signature indicates that) so accept() should be const-qualified
to indicate that and to allow visiting const result_message:s.
"
This patchset adds support for writing Statistics.db in the SSTables
'mc' (3.x) format. This file is essential for reading data stored in
Data.db as it contains base values used for delta encoding and types of
columns.
This patchset also fixes several bugs found in writing data and index
files as well as bugs in a statistics-related structure definition.
Tests: unit {debug, release}
All SSTables files for write unit tests are validated to be processed by
sstabledump and output is verified to show the expected data.
"
* 'projects/sstables-30/write-statistics/v1' of https://github.com/argenet/scylla:
Add test covering the composite partition key case.
Add Statistics.db files to write tests for SSTables 3.0.
Do not check rows and cells for expiration when writing them to the data file.
Fix promoted index serialization.
Fix the order of items in stats_metadata.
Fix timestamp_epoch value which was truncated on exceeding int32_t type limit.
Write serialization header to Statistics.db for SSTables 3.x.
Do not pass schema to metadata_collector::update(column_stats)
Collect metadata statistics when writing SSTables 3.0.
Call get_metadata_collector() instead of referencing sstable::_collector directly.
Fix logic of writing TTLed cells in SSTable 3.0 format.
Separate statistics for count of cells, columns and rows in column_stats.
Deserialize collection in a way that doesn't incur shared_ptr counter increment and is generally shorter.
Track both min & max values for timestamp, TTL and local deletion time in metadata_collector.
Add class for tracking both extremum values (min and max) on updates.
Mainly to check that the composite type is properly serialized when
writing serialization header to Statistics.db.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
For these tests to work, all time-related values are now fixed as these
are stored in Statistics.db files.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Although this logic may be seen as a useful optimization, it hinders
unit tests writing SSTables 3.0 as those need to have fixed time-related
values to produce Statistics.db files with the same content on each run.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
There is a new field introduced in the SSTables 3.0 index file format
named 'partition_header_length' that can be used to skip over to the
first clustering row in a wide partition. This one has not been
previously written and caused malformed indices.
Updated the corresponding test to include a static row and write
multiple wide partitions.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Serialization header is a new components in Statistics.db introduced in
SSTables 3.0 ('ma') format. It is essential for reading data file as it
contains the base values used for delta-encoded values (timestamps,
TTLs, local deletion times) and description of column types.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
After reboot, all existing sstables are considered shared. That's a safe default.
Reader used by compaction decides to use filtering reader (filters out data that
doesn't belong to this shard) if sstable is considered shared even though it may
actually be unshared.
By avoiding filtering reader we're avoiding an extra check for each key, and that
may be meaningful for compaction of tons of small partitions and even range
reads of such. We do so by fixing sstable::_shared, which is now set properly for
existing sstables at start.
quick check using microbenchmark which extends perf_sstable with compaction mode:
before: 69407.61 +- 37.03 partitions / sec (30 runs, 1 concurrent ops)
after: 70161.09 +- 40.35 partitions / sec (30 runs, 1 concurrent ops)
Fixes#3042.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180504182158.21130-1-raphaelsc@scylladb.com>
"
This series introduces a system.large_partitions table,
used to gather information on largest partitions in the cluster.
Schema below allows easy extraction of most offending keys and removal
by sstable name, which happens when a table is compacted away.
Schema: (
keyspace_name text,
table_name text,
sstable_name text,
partition_size bigint,
key text,
compaction_time timestamp,
PRIMARY KEY((keyspace_name, table_name), sstable_name, partition_size, key)
) WITH CLUSTERING ORDER BY (partition_size DESC);
"
Closes#3292.
* 'large_partition_table_3' of https://github.com/psarna/scylla:
database, sstables, tests: add large_partition_handler
db: add large_partition_handler interface with implementations
docs: init system_keyspace entry with system.large_partitions
db: add system.large_partitions table
This commit makes database, sstables and tests aware
of which large_partition_handler they use.
Proper large_partition_handler is retrievable from config information
and is based on existing compaction_large_partition_warning_threshold_mb
entry. Right now CQL TABLE variant of large_partition_handler is used
in the database.
Tests use a NOP version of large_partition_handler, which does not
depend on CQL queries at all.
This commit introduces large_partition_handler class, which can be used
to take additional action when large partitions are written.
It comes with two implementations:
* NOP, used in tests, which does nothing on large partition
update/delete
* CQL TABLE, which inserts/deletes information on particular sstable
to system.large_partitions table, in order to be retrievable from
cqlsh later.
References #3292
This commit adds a system.large_partitions table, which can be used
to trace largest partitions of a cluster.
Schema: (
keyspace_name text,
table_name text,
sstable_name text,
partition_size bigint,
key text,
compaction_time timestamp,
PRIMARY KEY((keyspace_name, table_name), sstable_name, partition_size, key)
) WITH CLUSTERING ORDER BY (partition_size DESC);
References #3292
After removal of deletion manager, caller is now responsible for properly
submitting the deletion of a shared sstable. That's because deletion manager
was responsible for holding deletion until all owners agreed on it.
Resharding for example was changed to delete the shared sstables at the end,
but truncate wasn't changed and so race condition could happen when deleting
same sstable at more than one shard in parallel. Change the operation to only
submit a shared sstable for deletion in only one owner.
Fixes dtest migration_test.TestMigration.migrate_sstable_with_schema_change_test
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180503193427.24049-1-raphaelsc@scylladb.com>
A step to untie classes sstable_writer_m and sstable so that eventually
we could stop them being friends.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
SSTables 3.0 format makes a distinction between count of cells and count
of columns. In that sense, a column of a collection type counts as one
column but every atomic cell in it counts as a separate cell.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
test_multishard_combining_reader_destroyed_with_pending_create_reader
was failing because it relied on smp == 3 and thus the shard on which
the reader creation is blocked being shard-2. Since the test requires to
be run with smp >= 3 we can hardcode this shard to be 2 because if the
test runs at all we are guaranteed to have at least smp >= 3.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <38883a1f4c18ca0cd065aa13826a4f1858353289.1525328233.git.bdenes@scylladb.com>
These tests are quite complicated and require intimate knowledge of how
foreign_reader and multishard_combining_reader operates. Knowing these
two objects is still required to understand the tests but make it that
much easier by explaining how they were designed to test what they test.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <8de580131a8652924de920c2bc68a98e579398ee.1525328226.git.bdenes@scylladb.com>
'shard' is a short-lived on-stack variable that gets captured by
reference by continuation that gets executed on another shard.
Fixes a race condition that leads to an heap-use-after-free.
Message-Id: <20180502150507.2776-1-pdziepak@scylladb.com>