`duration` is a new native type that was introduced in Cassandra 3.10 [1].
Support for parsing and the internal representation of the type was added in
8fa47b74e8.
Important note: The version of cqlsh distributed with Scylla does not have
support for durations included (it was added to Cassandra in [2]). To test this
change, you can use cqlsh distributed with Cassandra.
Duration types are useful when working with time-series tables, because they can
be used to manipulate date-time values in relative terms.
Two interesting applications are:
- Aggregation by time intervals [3]:
`SELECT * FROM my_table GROUP BY floor(time, 3h)`
- Querying on changes in date-times:
`SELECT ... WHERE last_heartbeat_time < now() - 3h`
(Note: neither of these is currently supported, though columns with duration
values are.)
Internally, durations are represented as three signed counters: one for months,
one for days, and one for nanoseconds. Each of these counters is serialized using a
variable-length encoding which is described in version 5 of the CQL native
protocol specification.
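
As a rough sketch of the representation described above: the struct below holds the three counters, and the two helpers show the zig-zag step that the native protocol specification describes for signed variable-length integers. The type name, field widths, and function names are assumptions made for illustration and are not taken from the patch. The byte-level packing of the resulting unsigned value is left out.

```cpp
#include <cstdint>

// Hypothetical illustration type; field widths are an assumption.
struct duration_counters {
    int32_t months;
    int32_t days;
    int64_t nanoseconds;
};

// Zig-zag mapping: interleaves negative and non-negative values so that
// small magnitudes of either sign become small unsigned values.
inline uint64_t zigzag_encode(int64_t v) {
    return (static_cast<uint64_t>(v) << 1) ^ static_cast<uint64_t>(v >> 63);
}

// Inverse of zigzag_encode.
inline int64_t zigzag_decode(uint64_t u) {
    return static_cast<int64_t>(u >> 1) ^ -static_cast<int64_t>(u & 1);
}
```

Each counter would be passed through `zigzag_encode` before being written as an unsigned variable-length integer.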
The representation of a duration as three counters means that a semantic
ordering on durations doesn't exist: Is `1mo` greater than `1mo1d`? We cannot
know, because some months have more days than others. Durations can only have a
concrete absolute value when they are "attached" to absolute date-time
references. For example, `2015-04-30 at 12:00:00 + 1mo`.
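
To make the ambiguity concrete, here is a small sketch (the helper names are hypothetical) that computes how many days `1mo` spans when attached to the first day of a given month. Compared against a fixed `30d`, the answer changes with the reference date.

```cpp
// Gregorian leap-year rule.
inline bool is_leap_year(int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// Number of days in `month` (1..12) of `year`. This is also the number of
// days covered by adding `1mo` to the first day of that month.
inline int days_in_month(int year, int month) {
    static const int lengths[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    return (month == 2 && is_leap_year(year)) ? 29 : lengths[month - 1];
}
```

Starting from 2015-02-01, `1mo` covers 28 days and is shorter than `30d`; starting from 2015-03-01 it covers 31 days and is longer.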
That duration values are not comparable presents some difficulties for the
implementation, because most CQL types are. As in Cassandra's implementation
[2], I adopted a strategy similar to the way restrictions on the `counter` type
are checked. A type "references" a duration if it is either a duration or it
contains a duration (like a `tuple<..., duration, ...>`, or a UDT with a
duration member).
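
The "references a duration" check can be sketched as a simple recursion over a type and its member types. The `cql_type` model below is a deliberately simplified, hypothetical stand-in and does not reflect the actual type classes in the code base.

```cpp
#include <memory>
#include <vector>

// Hypothetical, simplified type model used only for this sketch.
struct cql_type {
    enum class kind { int32, text, duration, tuple, user_defined, list };
    kind k;
    // Element or field types for tuples, UDTs and collections; empty otherwise.
    std::vector<std::shared_ptr<cql_type>> members;
};

// True if the type is a duration or contains one at any nesting depth.
inline bool references_duration(const cql_type& t) {
    if (t.k == cql_type::kind::duration) {
        return true;
    }
    for (const auto& m : t.members) {
        if (m && references_duration(*m)) {
            return true;
        }
    }
    return false;
}
```

With this model, a `tuple<int, duration>` references a duration while a `tuple<int, int>` does not.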
The following restrictions apply on durations. Note that some of these contexts
are either experimental features (materialized views), or not currently
supported at run-time (though support exists in the parser and code, so it is
prudent to add the restrictions now):
- Durations cannot appear in any part of a primary key, either for tables or
materialized views.
- Durations cannot be directly used as the element type of a `set`, nor can they
be used as the key type of a `map`. Because internal ordering on durations is
based on a byte-level comparison, this restriction in Cassandra was intended to
help avoid user confusion around ordering of collection elements.
- Secondary indexes on durations are not supported.
- "Slice" relations (<=, <, >=, >) are not supported on durations with `WHERE`
restrictions (like `SELECT ... WHERE span <= 3d`). Multi-column restrictions
only work with clustering columns, which cannot be `duration` due to the
first rule.
- "Slice" relations are not supported on durations with query conditions (like
`UPDATE my_table ... IF span > 5us`).
Backwards incompatibility note:
As described in the documentation [4], duration literals take one of two
forms: either ISO 8601 formats (there are three), or a "standard" format. The ISO
8601 formats start with "P" (like "P5W"). Therefore, identifiers that have this
form are no longer supported.
Fixes #2240.
[1] https://issues.apache.org/jira/browse/CASSANDRA-11873
[2] bfd57d13b7
[3] https://issues.apache.org/jira/browse/CASSANDRA-11871
[4] http://cassandra.apache.org/doc/latest/cql/types.html#working-with-durations
- introduced "seastarx.hh" header, which does a "using namespace seastar";
- 'net' namespace conflicts with seastar::net, renamed to 'netw'.
- 'transport' namespace conflicts with seastar::transport, renamed to
cql_transport.
- "logger" global variables now conflict with logger global type, renamed
to xlogger.
- other minor changes
This changes announce_migration() to return a change event directly in the
schema_altering_statement base class. It's needed for the drop index
statement, which does not know the keyspace or column family until it
looks them up based on the index. A two-stage approach of announcing a
migration and then creating the change event won't work because in the
latter stage, the lookup will fail. The same change in
announce_migration() has been applied to Apache Cassandra.
Use seastar::checked_ptr<weak_ptr<prepared_statement>> instead of shared_ptr for passing prepared statements around.
This allows easy tracking and handling of statement invalidation.
This implementation will throw an exception every time an invalidated
statement reference is dereferenced.
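
The behaviour described above can be illustrated with a minimal sketch. It uses `std::weak_ptr` as a stand-in and a hypothetical `checked_weak_ref` wrapper; it is not the `seastar::checked_ptr` API, only an illustration of "throw when an invalidated reference is dereferenced".

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a checked weak reference.
template <typename T>
class checked_weak_ref {
    std::weak_ptr<T> _ref;
public:
    explicit checked_weak_ref(const std::shared_ptr<T>& p) : _ref(p) {}

    // Returns a strong reference, or throws if the target has been
    // destroyed (i.e. the statement was invalidated).
    std::shared_ptr<T> get() const {
        auto p = _ref.lock();
        if (!p) {
            throw std::runtime_error("reference to invalidated object");
        }
        return p;
    }
};
```

Callers never observe a dangling or null pointer: either `get()` yields a live object or the failure surfaces as an exception at the point of use.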
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
This patchset adds missing properties to the create_view_statement,
such as whether the view is compact or the order of its clustering
columns.
Fixes #1766
This patch extracts the definition of the default compressor into the
compression_parameters class, so that the table and view creation
statements don't have to explicitly deal with it.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch extracts the cf_properties class, which contains common
attributes of tables and materialized views.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
In preparation for the removal of schema_altering_statement::prepare(),
add a fake create_table_statement::prepare(). create_table_statement
has already been split to raw and prepared variants, so this prepare()
will never be called, but it is required because schema_altering_statement
is both a cql_statement and a prepared_statement. This confusion will
be fixed later on.
The current algorithm is O(N^2), where N is the column count. This causes
limits.py:TestLimits.max_columns_and_query_parameters_test to time out
because the CREATE TABLE statement takes too long.
This change replaces it with an algorithm of O(N)
complexity. _defined_names are already sorted, so if any duplicates
exist, they must be next to each other.
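
A minimal sketch of the linear scan, assuming the names are held in an already sorted `std::vector<std::string>` (the container type and the function name are assumptions for illustration):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// In a sorted sequence equal elements are adjacent, so one pass that
// compares each element with its successor finds any duplicate.
inline bool has_adjacent_duplicate(const std::vector<std::string>& sorted_names) {
    return std::adjacent_find(sorted_names.begin(), sorted_names.end())
           != sorted_names.end();
}
```

The check only holds for sorted input; on unsorted input it would miss duplicates that are not neighbours.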
Message-Id: <1456058447-5080-1-git-send-email-tgrabiec@scylladb.com>
The CQL tokenizer recognizes the "COUNTER" token but the parser rule for
counter type is disabled. This causes users to see the following error
in cqlsh, for example:
CREATE TABLE count (u int PRIMARY KEY, c counter);
SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] message=" : cannot match to any predicted input... ">
We cannot disable the "COUNTER" token because it's also used in batch
statements. Instead, fix the issue by implementing a stub counter type.
Fixes #195.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Use get_storage_proxy() and get_local_storage_proxy() helpers under the
hood to simplify migration manager API users.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
In preparation for adding listener state to migration manager, use
sharded<> for migration manager.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
The statement being removed in this patch is wrong, and nonexistent in Origin.
If the list of column aliases is empty, we should leave it this way.
This code was already present before the compact storage series. But because
tables created using the schema_builder directly won't exercise this code path,
I ended up not noticing - especially because it only happens with tables that
lack a clustering key. The ones I tested through cqlsh, all had a clustering
key.
Fixes #45
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
We have the information that they should be reverted, but we are not yet
reverting them. Go ahead and do it.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
First of all, we should abide by our convention of prepending member names with
a '_'. (That is the underline character, it just looks like a face)
But more importantly, because we will be searching its contents frequently, a
helper function is provided.
Note that obviously a hash is better suited for this: but because we do need to
keep the fields in the order they are inserted, a vector really is the best choice
for that.
A table is not expected to have a lot of clustering keys. So this search should
be cheap. If it turns out to be a problem, we can adjust later.
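
The trade-off can be sketched as follows: a vector preserves insertion order, and a small helper performs the linear search. The class and member names are hypothetical and only follow the underscore convention mentioned above.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container: (name, type) pairs kept in insertion order.
class ordered_fields {
    std::vector<std::pair<std::string, std::string>> _entries;
public:
    void add(std::string name, std::string type) {
        _entries.emplace_back(std::move(name), std::move(type));
    }

    // Linear search helper; returns nullptr when the name is absent.
    const std::string* find(const std::string& name) const {
        auto it = std::find_if(_entries.begin(), _entries.end(),
                               [&](const auto& e) { return e.first == name; });
        return it == _entries.end() ? nullptr : &it->second;
    }

    const std::vector<std::pair<std::string, std::string>>& entries() const {
        return _entries;
    }
};
```

For a handful of entries the O(N) lookup is unlikely to matter, which matches the expectation stated above.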
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
We currently have code to calculate "is_dense" in the create statement handler.
That obviously doesn't work for the system schemas, which are not defined this
way.
Since all of our schemas now have to pass through the schema_builder one way or
another, that is the best place in which to do that calculation.
Note that unfortunately, that does not mean we can just get rid of
set_is_dense() in the schema builder: we still need to set it in some
situations, where for instance, we read that property in schema_columnfamilies,
and then apply it to the relevant CF. Those uses are, however, all internal to
legacy_schema_tables.cc.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Persist column family's "is_dense" value to system tables. Please note
that we throw an exception if "is_dense" is null upon read. That needs
to be fixed later by inferring the value from other information like
Origin does.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Store the column family key validator in system tables. Please note that
we derive the validator from CQL partition keys and never actually read
it from the database. This is different from Origin which uses
CompositeType that is both stored and read from the system tables.
Fixes #7.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Tested-by: Pekka Enberg <penberg@cloudius-systems.com>
The column_identifiers are wrapped in shared_ptr<> so use the
appropriate hash and comparison functions for the container.
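
To illustrate why this matters: with the default hash and equality, a container of `shared_ptr` compares pointer addresses rather than the pointed-to values. A sketch with a hypothetical `identifier` type and hypothetical functor names:

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for an identifier held behind a shared_ptr.
struct identifier {
    std::string text;
};

// Hash the pointed-to value instead of the pointer address.
struct pointee_hash {
    std::size_t operator()(const std::shared_ptr<identifier>& p) const {
        return std::hash<std::string>()(p->text);
    }
};

// Compare the pointed-to values instead of the pointer addresses.
struct pointee_equal {
    bool operator()(const std::shared_ptr<identifier>& a,
                    const std::shared_ptr<identifier>& b) const {
        return a->text == b->text;
    }
};

using identifier_set =
    std::unordered_set<std::shared_ptr<identifier>, pointee_hash, pointee_equal>;
```

Two distinct `shared_ptr` instances naming the same identifier then collapse to a single entry, which would not happen with the defaults.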
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Add clustering key support to create_table_statement. While at it, add
compound types for partition keys, to unify schema building code for
both partition and clustering keys.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Move raw_statement implementation out of the header file. This makes
life easier when we modify the class for clustering key support.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Make 'create table' statements also specify the following for
schema_ptrs:
- Partition keys
- Regular columns
- Static columns
Please note that clustering keys are _not_ included because we seem to
lack infrastructure like CompoundType and CellNameType to properly
enable them.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Switch to using schema_builder in create_table_statement in preparation
for also defining columns in the resulting schema_ptr.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Move create_table_statement code out-of-line in preparation for
modifying the implementation.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>