Currently, send_complete_message is not used. We will use it shortly in
case the local session is failed. Send a complete message with failed
flag to notify peer node that the session is failed so that peer can
close the session. This can speed up the closing of failed session.
Also rename it to send_failed_complete_message.
The complete_message is not needed and the handler of this rpc message
does nothing but returns a ready future. The patch to remove it did not
make into the Scylla 1.0 release so it was left there.
Use this flag to notify the peer that the session is failed so that the
peer can close the failed session more quickly.
The flag is used as a rpc::optional so it is compatible use old
version of the verb.
It is useful for larger cluster with larger gossip message latency. By
default the fd_max_interval_ms is 2 seconds which means the
failure_detector will ignore any gossip message update interval larger
than 2 seconds. However, in larger cluster, the gossip message udpate
interval can be larger than 2 seconds.
Fixes#2603.
Message-Id: <49b387955fbf439e49f22e109723d3a19d11a1b9.1500278434.git.asias@scylladb.com>
branch 'tgrabiec/schema-migration-fixes' of github.com:scylladb/seastar-dev:
schema: Use proper name comparator
legacy_schema_migrator: Properly migrate non-UTF8 named columns
schema_tables: Store column_name in text form
legacy_schema_migrator: Migrate columns like Cassandra
schema_builder: Add factory method for default_names
legacy_schema_migrator: Simplify logic
thrift: Don't set regular_column_name_type
schema: Use proper column name type for static columns
schema: Fix column_name_type() for static compact tables
schema: Introduce clustering_column_at()
thrift: Reuse cell_comparator::to_sstring() for obtaining comparator type
partition_slice_builder: Use proper column's type instead of regular_column_name_type()
This replaces column_definition::name_comparator, which incorrectly
assumes that names are always utf8, with name_compare moved from
schema::rebuild() and unifies usages.
Currently migrator assumed all columns are utf8-named, which
doesn't have to be the case for static compact tables.
Refs #2597.
Due to #2573, we can assume that Scylla wasn't used with non-utf8
column names, and that old names are always in textual form.
This fixes generation of synthetic columns for static compact tables.
Current code always generates synthetic clustering column with utf8
type and synthetic regular column with bytes type (in schema_builder).
That's fine when creating a new CQL table, but not when migrating
existing tables created via thrift API.
Fixes#2584.
This also migrates empty compact value columns like Cassandra
does. Such columns are present in compact tables without regular
columns, e.g.:
create table test (k int, ck int, primary key (k, ck)) with compact storage;
They should be migrated to a synthetic regular column with
empty_type type and a non-empty name.
The expression "is_dense.value_or(true)" is always true inside the if,
so drop it.
This allows us to drop temporary calulated_is_dense.
We can also get rid of one of the if branches by extracting
builder.set_is_dense() outside.
After f5dae826ce, static columns not
always have utf8 column names. For static compact tables it's
determined by the cell name comparator type, which is equal to the
type of the synthetic clustering column.
Caused various errors with static thrift tables with non-utf8
comparator.
Reduces view_schema_test runtime to 5 seconds, from 53 seconds on an NVMe disk
with write-back cache, and forever on a spinning disk.
Message-Id: <20170716081653.10018-1-avi@scylladb.com>
We will be creating links to those sstable's files, and those don't work
if the data directory and the test sstable are on different devices.
Copying the files to the same directory fixes the problem.
Message-Id: <20170716090405.14307-1-avi@scylladb.com>
We don't ensure mutations are applied in memory following the order of their
replay positions. A memtable can thus be flushed with replay position rp,
with the new one being at replay position rp', where rp' < rp. This breaks
an intrinsic assumption in the code, which this series addresses.
Fixes#2074
branch memtable-flush/v3 of git@github.com:duarten/scylla.git:
commitlog: Always flush latest memtable
column_family: More precise count of switched memtables
column_family: Fix typo in pending_tasks metric name
column_family: More precise count of pending flushes
dirty_memory_manager: Remove unnecessary check from flush_one()
column_family: Don't rely on flush_queue to guarantee flushes finished
column_family: Don't bother closing the flush_queue on stop()
column_family: Stop using flush_queue
column_family: Remove outdated comment about the flush_queue
memtable: Stop tracking the highest flushed rp
Clang is happy to create a vector<data_value> from a {}, a {1, 2}, but not a {1}.
No doubt it is correct, but sheesh.
Make the data_value explicit to humor it.
Message-Id: <20170713074315.9857-1-avi@scylladb.com>
In storage_proxy we arrange the mutations sent by the replicas in a
vector of vectors, such that each row corresponds to a partition key
and each column contains the mutation, possibly empty, as sent by a
particular replica.
There is reconciliation-related code that assumes that all the
mutations sent by a particular replica can be found in a single
column, but that isn't guaranteed by the way we initially arrange the
mutations.
This patch fixes this and enforces the expected order.
Fixes#2531Fixes#2593
Signed-off-by: Gleb Natapov <gleb@scylladb.com>
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170713162014.15343-1-duarte@scylladb.com>
Since we no longer enforce that mutations are applied in memory
ordered by their replay_positions, the way the highest_flush_rp is
being tracked is no longer correct.
The invariant it was used to maintain no longer exists, so we can get
rid of it together with the assertion on the highest_flush_rp on
flush().
Fixes#2074
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Since commitlog ordering requirements have been relaxed, we now keep
the set of replay_positions seen by a memtable in a set, which we then
use to clean up relevant segments in the commitlog. This means that
the guarantees provided by the flush_queue are no longer necessary.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
When stopping a column family we issue a flush(), for which we wait.
Since writes are supposed to have stopped coming in, and also new
flush requests, there's no need to call and wait for the flush_queue
to be closed.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
We now don't ensure mutations are applied in memory following the
order of their replay positions, so we can't rely on the replay
position to order memtable flushes. So, use a phased_barrier() to
ensure that calling flush() returns a future that completes when all
flushes up to that point have finished.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
We don't need to check whether a memtable is empty in flush_one(), as
that must be checked later, during the actual sealing.
The condition itself is rare and is checked already after the potentially
contented semaphore has been acquired.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch ensures we update the count of pending flushes in the same
place as we update the stats across column families, which is more
correct since it only accounts for actual flushes and not those of
empty memtables or that have been coalesced together.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
The memtable_switch_count metric is supposed to count the number of
times a flush has resulted in the memtable being switched out, but we
were incrementing the count regardless of whether we tried to flush an
empty memtable or two or more flushes were coalesced into one. This
patch fixes this by moving the metric to where the memtable is
actually switched.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
We now don't ensure mutations are applied in memory following the
order of their replay positions, so we can't rely on the replay
position to order memtable flushes. When flushing commit log segments,
ensure we flush the latest memtable.
Refs #2074
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
"This series aims to fix the "serving invalid (old) values" issue in the
loading_cache (issue #2590) by arming the timer with a period that equals
min(expire, refresh).
We are still trying to optimize the main case where 'expire' is
significantly longer than 'refresh' period.
We don't want to add any additional logic in the fast path and this
series gives the immediate solution for the issue above while not adding
any additional CPU cycle to the fast path."
* 'loading_cache_short_expired-v2' of https://github.com/vladzcloudius/scylla:
utils::loading_cache: arm the timer with a period equal to min(_expire, _update)
utils::loading_cache: make a timer use a loading_cache_clock_type clock as a source
Make the descriptions of permissions_validity_in_ms, permissions_update_interval_in_ms
and permissions_cache_max_entries more readable and more related to what they really
do.
Mention the none-zero value requirement for the permissions_update_interval_in_ms and
the permissions_cache_max_entries when the permissions cache is enabled.
Adjust the parameters description in the scylla.yaml too.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1499957053-31792-1-git-send-email-vladz@scylladb.com>
Arm the timer with a period that is not greater than either the permissions_validity_in_ms
or the permissions_update_interval_in_ms in order to ensure that we are not stuck with
the values older than permissions_validity_in_ms.
Fixes#2590
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Current algorithm was marking tables with regular columns not named
"value" as not dense, which doesn't have to be the case. It can be
either way.
It should be enough to look at clustering components. If there is a
clustering key, then table is dense if and only if all comparator
components belong to the clustering key.
If there is no clustering key, then if there are any regular columns
we're sure it's not dense.
Fixes#2587.
Message-Id: <1499877777-7083-1-git-send-email-tgrabiec@scylladb.com>
Cassandra 3.10 added the `duration` type [1], intended to manipulate date-time
values with offsets (for example, `now() - 2y3h`).
The full implementation of the `duration` type in Scylla requires support
for version 5 of the binary protocol, which is not yet available.
In the meantime, this patch patch adds the implementation of the underlying type
for the eventual `duration` type. Included is also the ported test suite from
the reference implementation and additional tests.
Related to #2240.
[1] https://issues.apache.org/jira/browse/CASSANDRA-11873
Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <b1e481da103efee82106bf31f261c5a1f4f8d9ca.1499885803.git.jhaberku@scylladb.com>
Since base tables no longer look for their views, we need to parse
base tables first so that when we add a view we can fetch and connect
it to its base table.
When announcing view table mutations to other nodes we always include
the base table mutations, so there's no need to expect a view being
added before its base table.
Found out while testing view building.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170712172115.2960-1-duarte@scylladb.com>