All variants of write_component now take an io_priority. The public
interfaces are by default set to Seastar's default priority.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Its definition as a lambda function is inconvenient, because it does not allow
us to use default values for parameters.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Its definition as a lambda function is inconvenient, because it does not allow
us to use default values for parameters.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Compaction manager was initially created at utils because it was
more generic, and wasn't only intended for compaction.
It was more like a task handler based on futures, but now it's
only intended to manage compaction tasks, and thus should be
moved elsewhere. /sstables is where compaction code is located.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
test_cassandra_hash also sort of expects exceptions. ASas causes false
positives here as well with seastar::thread, so do it with normal cont.
Message-Id: <1453295521-29580-2-git-send-email-calle@scylladb.com>
From Paweł:
"This series contains some more fixes for issues related to alter table,
namely: incorrect parsing of collection information in comparator, missing
schema::_raw._collections in equality check, missing compatibility
information for utf8->blob, ascii->blob and ascii->utf8 casts."
test_password_authenticator_operations causes ASan failures, in a way
that I am 99% sure is fully false positive, caused by a combo of
seastar threads, exception throwing and externals.
In lieu of actually identifying what ASan flaw causes this and
potentially cure it, for now, lets just re-write the test in question
to not use seastar::async, but normal continuation. Less easy to read,
but passes ASan.
Message-Id: <1453205136-10308-1-git-send-email-calle@scylladb.com>
When compacting sstable, mutation that doesn't belong to current shard
should be filtered out. Otherwise, mutation would be duplicated in
all shards that share the sstable being compacted.
sstable_test will now run with -c1 because arbitrary keys are chosen
for sstables to be compacted, so test could fail because of mutations
being filtered out.
fixes#527.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <1acc2e8b9c66fb9c0c601b05e3ae4353e514ead5.1453140657.git.raphaelsc@scylladb.com>
Representation format is an implementation detail of
partition_key. Code which compares a value to representation makes
assumptions about key's representation. Compare keys to keys instead.
Message-Id: <1453136316-18125-1-git-send-email-tgrabiec@scylladb.com>
"This series makes sure that Scylla rejects adding a collections if
its column name is the same as a collection that existed before and
their types are incompatible.
Fixes#782"
"This patch is intended to add support to column family cleanup, which will
make 'nodetool cleanup' possible.
Why is this feature needed? Remove irrelevant data from a node that loses part
of its token range to a newly added node."
The implementation is about storing generation of compacting sstables
in an unordered set per column family, so before strategy is called,
compaction manager will filter out compacting sstables.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
"Our domain objects have schema version dependent format, for efficiency
reasons. The data structures which map between columns and values rely on
column ids, which are consecutive integers. For example, we store cells in a
vector where index into the vector is an implicit column id identifying table
column of the cell. When columns are added or removed the column ids may
shift. So, to access mutations or query results one needs to know the version
of the schema corresponding to it.
In case of query results, the schema version to which it conforms will always
be the version which was used to construct the query request. So there's no
change in the way query result consumers operate to handle schema changes. The
interfaces for querying needed to be extended to accept schema version and do
the conversions if necessary.
Shard-local interfaces work with a full definition of schema version,
represented by the schema type (usually passed as schema_ptr). Schema versions
are identified across shards and nodes with a UUID (table_schema_version
type). We maintain schema version registry (schema_registry) to avoid fetching
definitions we already know about. When we get a request using unknown schema,
we need to fetch the definition from the source, which must know it, to obtain
a shard-local schema_ptr for it.
Because mutation representation is schema version dependent, mutations of
different versions don't necessarily commute. When a column is dropped from
schema, the dropped column is no longer representable in the new schema. It is
generally fine to not hold data for dropped columns, the intent behind
dropping a column is to lose the data in that column. However, when merging an
incoming mutation with an existing mutation both of which have different
schema versions, we'd have to choose which schema should be considered
"latest" in order not to loose data. Schema changes can be made concurrently
in the cluster and initiated on different nodes so there is not always a
single notion of latest schema. However, schema changes are commutative and by
merging changes nodes eventually agree on the version. For example adding
column A (version X) on one node and adding column B (version Y) on another
eventually results in a schema version with both A and B (version Z). We
cannot tell which version among X and Y is newer, but we can tell that version
Z is newer than both X and Y. So the solution to the problem of merging
conflicting mutations could be to ensure that such merge is performed using
the schema which is superior to schemas of both mutations.
The approach taken in the series for ensuring this is as follows. When a node
receives a mutation of an unknown schema version it first performs a schema
merge with the source of that mutation. Schema merge makes sure that current
node's version is superior to the schema of incoming mutation. Once the
version is synced with, it is remembered as such and won't be synced with on
later mutations. Because of this bookkeeping, schema versions must be
monotonic; we don't want table altering to result in any earlier version
because that would cause nodes to avoid syncing with them. The version is a
cryptographically-secure hash of schema mutations, which should fulfill this
purpose in practice.
TODO: It's possible that the node is already performing a sync triggered by
broadcasted schema mutations. To avoid triggering a second sync needlessly, the
schema merging should mark incoming versions as being synced with.
Each table shard keeps track of its current schema version, which is
considered to be superior to all versions which are going to be applied to it.
All data sources for given column family within a shard have the same notion
of current schema version. Individual entries in cache and memtables may be at
earlier versions but this is hidden behind the interface. The entries are
upgraded to current version lazily on access. Sstables are immutable, so they
don't need to track current version. Like any other data source, they can be
queried with any schema version.
Note, the series triggered a bug in demangler:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68700"
The intent is to make data returned by queries always conform to a
single schema version, which is requested by the client. For CQL
queries, for example, we want to use the same schema which was used to
compile the query. The other node expects to receive data conforming
to the requested schema.
Interface on shard level accepts schema_ptr, across nodes we use
table_schema_version UUID. To transfer schema_ptr across shards, we
use global_schema_ptr.
Because schema is identified with UUID across nodes, requestors must
be prepared for being queried for the definition of the schema. They
must hold a live schema_ptr around the request. This guarantees that
schema_registry will always know about the requested version. This is
not an issue because for queries the requestor needs to hold on to the
schema anyway to be able to interpret the results. But care must be
taken to always use the same schema version for making the request and
parsing the results.
Schema requesting across nodes is currently stubbed (throws runtime
exception).
Schema is tracked in memtable and cache per-entry. Entries are
upgraded lazily on access. Incoming mutations are upgraded to table's
current schema on given shard.
Mutating nodes need to keep schema_ptr alive in case schema version is
requested by target node.
This patch fixes a regression introduced by
a commit ca935bf "tests: Fix gossip_test".
database service initializes a replication_strategy
object and a replication_strategy requires a snitch
service to be initialized.
A snitch service requires a broadcast address to be
set.
If any of the above is not initialized we are going
to hit the corresponding assert().
Set a snitch to a SimpleSnitch and a broadcast
address to 127.0.0.1.
Fixes issue #770
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1452421748-9605-1-git-send-email-vladz@cloudius-systems.com>
The goal is to provide various test cases with a way of iterating over
many combinations of mutaitons. It's good to have this in one place to
avoid duplication and increased coverage.