Commit Graph

194 Commits

Author SHA1 Message Date
Avi Kivity
a71ab365e3 toplevel: convert sprint() to format()
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().

Mechanically converted with https://github.com/avikivity/unsprint.
2018-11-01 13:16:17 +00:00
Avi Kivity
8db8c01fbe types: get rid of PRId64 formatting
It's not needed for out sprint() implementation, and gets in the way of
converting all formatting to fmt.
2018-11-01 13:16:16 +00:00
Piotr Sarna
37a5c38471 types: enable deserializing varint from JSON string
Previously deserialization failed because the JSON string
representing a number was unnecessarily quoted.

Fixes #3666
Message-Id: <a0a100dbac7c151d627522174303657d1da05c27.1534845398.git.sarna@scylladb.com>
2018-08-21 11:20:11 +01:00
Piotr Sarna
b3f438bfec types: enable parsing numeric JSON values from string
In order to be Cassandra-compatible, JSON values passed in INSERT JSON
statement should accept string parameters for numeric types - int,
double, etc.

Fixes #3666
Message-Id: <4da9a2f68de31492a2e9432493663a62b138c2f2.1534153955.git.sarna@scylladb.com>
2018-08-13 23:57:37 +01:00
Piotr Sarna
9ba218c161 cql3: remove superfluous null conversions in to_json_string
Some types checked when passed bytes argument was empty, and if so,
returned "null" as a JSON string. Now, with to_json_string(bytes_opt)
it's not needed anymore. Also, some types returned "null" instead
of signaling a deserialization error.
2018-08-09 18:07:12 +02:00
Piotr Sarna
957cc712b6 cql3: enable parsing decimal JSON values from string
In order to be Cassandra-compatible, decimal type should be parsable
from both numeric values and strings.

Fixes #3666
2018-08-09 18:07:12 +02:00
Piotr Sarna
d307b5712c types: use value_to_quoted_string in JSON quoting
In order to avoid regressions caused by external libraries,
our own value_to_quoted_string implementation is used.

Fixes #3622
2018-07-25 13:16:06 +02:00
Paweł Dziepak
a0c1c0c921 types: bytes_view: override fragmented validate()
The default implementation linearises the buffer and calls
validate(bytes_view). This is bad and not needed for bytes_type which
doesn't do any validation anyway.
2018-07-18 12:28:06 +01:00
Piotr Sarna
90d323a522 types: add time_native_type
CQL3's time_type didn't have any suitable native type,
so time_native_type is introduced to serve that purpose.
2018-06-14 11:11:41 +02:00
Paweł Dziepak
e34ff8b4bf treewide: require type for creating collection_mutation_view 2018-05-31 15:51:11 +01:00
Paweł Dziepak
aa25f0844f atomic_cell: introduce fragmented buffer value interface
As a prepratation for the switch to the new cell representation this
patch changes the type returned by atomic_cell_view::value() to one that
requires explicit linearisation of the cell value. Even though the value
is still implicitly linearised (and only when managed by the LSA) the
new interface is the same as the target one so that no more changes to
its users will be needed.
2018-05-31 15:51:11 +01:00
Paweł Dziepak
418c159057 treewide: require type to copy atomic_cell 2018-05-31 15:51:11 +01:00
Paweł Dziepak
43b216b43d types: provide information for IMR 2018-05-31 15:51:11 +01:00
Vladimir Krivopalov
3981dd6dd6 types: Treat byte_type as a variable-length type for compatibility reasons.
Although values of the byte_type that corresponds to CQL TINYINT type
always occupy only a single byte, Cassandra treats this it as a
variable-length type for SSTables 3.0 reading and writing.

While it is clearly a mistake at Cassandra side, we have to stay
compatible.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-05-25 21:41:23 -07:00
Vladimir Krivopalov
24cb062834 types: Remove is_value_fixed() and use value_length_if_fixed() instead.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-05-25 21:41:23 -07:00
Piotr Jastrzebski
7a25819e5a Add abstract_type::value_length_if_fixed
This info is used by SSTable 3.x format to read column values
without reading their lengths.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-05-23 19:54:16 +02:00
Paweł Dziepak
0b4c6b8938 types: make some collection_type_impl functions non-static
The switch to the new in-memory representation will require a larger
parts of the logic be aware of the type of the values they are dealing
with. In most cases it is not a significant burden for the users.
2018-05-09 16:52:26 +01:00
Vladimir Krivopalov
36fe06fd3e Make abstract_type::is_fixed_length() non-virtual.
This method is called agressively through SSTable 3.0 read/write, we
want to reasonably optimise it to no incur extra indirect calls.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <2d00ddecd112af867a30d3d6930c10165dd5af34.1524851530.git.vladimir@scylladb.com>
2018-04-27 20:57:46 +03:00
Vladimir Krivopalov
54bd74fda0 Add is_fixed_length() to data types.
For any given CQL data type, this member returns whether its values are
of fixed or variable length. This is used by SSTables 3.0 format to only
store the length value for variable-length cells.

For #1969.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-04-26 14:34:20 -07:00
Calle Wilund
b1edf75c8b types: Make seastar::inet_address the "native" type for CQL inet.
Fixes #3187

Requires seastar "inet_address: Add constructor and conversion function
from/to IPv4"

Implements support IPv6 for CQL inet data. The actual data stored will
now vary between 4 and 16 bytes. gms::inet_address has been augumented
to interop with seastar::inet_address, though of course actually trying
to use an Ipv6 address there or in any of its tables with throw badly.

Tests assuming ipv4 changed. Storing a ipv4_address should be
transparent, as it now "widens". However, since all ipv4 is
inet_address, but not vice versa, there is no implicit overloading on
the read paths. I.e. tests and system_keyspace (where we read ip
addresses from tables explicitly) are modified to use the proper type.
Message-Id: <20180424161817.26316-1-calle@scylladb.com>
2018-04-24 23:12:07 +01:00
Vladimir Krivopalov
fc644a8778 Fix Scylla to compile with older versions of JsonCpp (<= 1.7.0).
Old versions of JsonCpp declare the following typedefs for internally
used aliases:
    typedef long long int Int64;
    typedef unsigned long long int UInt64;

In newer versions (1.8.x), those are declared as:
    typedef int64_t Int64;
    typedef uint64_t UInt64;

Those base types are not identical so in cases when a type has
constructors overloaded only for specific integral types (such as
Json::Value in JsonCpp or data_value in Scylla), an attempt to
pack/unpack an integer from/to a JSON object causes ambiguous calls.

Fixes #3208

Tests: unit {release}.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <e9fff9f41e0f34b15afc90b5439be03e4295623e.1524556258.git.vladimir@scylladb.com>
2018-04-24 10:58:38 +03:00
Piotr Sarna
1d40d2186e cql3: add from_json_object function to types
This commit adds a 'from_json_object' method which will be used
for converting JSON representation of a value to raw bytes representing
the same value. This functionality will be needed by 'INSERT JSON'
clause implementation, which can turn these raw bytes into cql3::term.

References #2058
2018-04-23 12:00:56 +02:00
Piotr Sarna
399ab1d455 cql3: add to_json_string function to types
This commit adds a 'to_json_string' method which will be used
for converting values to JSON strings. In several cases it's not
sufficient to use 'to_string', e.g. actual strings need to be
surrounded with double quotes.

References #2058
2018-04-11 13:27:56 +02:00
Tomasz Grabiec
52c61df930 Relax includes
To avoid unnecessary recompilations.
Message-Id: <1522168295-994-1-git-send-email-tgrabiec@scylladb.com>
2018-03-28 10:49:07 +03:00
Avi Kivity
1193e7d2e2 Merge "CAST from integers to decimal" from Daniel
"It turned out that decimal numbers that were obtained as cast from integers
should always contain just one decimal place 0.

This can be recognised especially when calculating avg(.) over such numbers
because result contains just one decimal point.

Fixes #3111."

* 'danfiala/integers-to-decimal' of github.com:hagrid-the-developer/scylla:
  tests: Add test that decimal obtained as CAST from integer always contain one decimal place.
  types: Decimal that is obtained from integer always contain one decimal place.
2018-01-21 20:21:00 +02:00
Daniel Fiala
39a08cac6b types: Decimal that is obtained from integer always contain one decimal place.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2018-01-21 17:37:24 +01:00
Daniel Fiala
0d71194da6 types: Added native types for timestamp and timeuuid.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2018-01-14 13:11:36 +01:00
Vladimir Krivopalov
6d76ac8043 Lift checks on list and map values to allow values of length > 64K.
Fixes #3007

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <7b232a655b5531d4bfa2be3d9611f8b1ba0349b0.1512021011.git.vladimir@scylladb.com>
2017-11-30 10:31:19 +02:00
Vladimir Krivopalov
61b1988aa1 Use meaningful error messages when throwing a marshal_exception
Fixes #2977

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <20171121005108.23074-1-vladimir@scylladb.com>
2017-11-21 16:05:43 +02:00
Daniel Fiala
f5629b3a23 types: Use std::pair instead of std::tuple to avoid compile-time error with explicit constructor.
Fixes #2895.

Signed-off-by: Daniel Fiala <daniel@scylladb.com>
Message-Id: <20171017071316.2836-1-daniel@scylladb.com>
2017-10-17 12:32:43 +01:00
Daniel Fiala
61570e4a73 types:: Add support for CAST AS functions.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2017-10-07 21:04:40 +02:00
Daniel Fiala
e2c0a57ecf types: Moved code that implements conversion of types' values to string.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2017-10-07 21:04:40 +02:00
Daniel Fiala
1133838b9f types: Add data_type_for for varint and decimal, data_value constructor for simple_date_type.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
Message-Id: <20171004044040.21631-1-daniel@scylladb.com>
2017-10-04 10:52:57 +03:00
Daniel Fiala
19b21a0ab2 types: Allow 'T' as a date-time separator in timestamps.
* Letter 'T' is specified in ISO 8601 and also in Cassandra
  documentation.

Signed-off-by: Daniel Fiala <daniel@scylladb.com>
Message-Id: <20171003073558.19257-1-daniel@scylladb.com>
2017-10-03 11:10:11 +03:00
Duarte Nunes
20337053ad Don't use literal lambdas
These are only available in C++17. Fixes the build after b5460c2.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-08-11 13:08:42 +02:00
Jesse Haber-Kucharsky
509626fe08 Support duration CQL native type
`duration` is a new native type that was introduced in Cassandra 3.10 [1].

Support for parsing and the internal representation of the type was added in
8fa47b74e8.

Important note: The version of cqlsh distributed with Scylla does not have
support for durations included (it was added to Cassandra in [2]). To test this
change, you can use cqlsh distributed with Cassandra.

Duration types are useful when working with time-series tables, because they can
be used to manipulate date-time values in relative terms.

Two interesting applications are:

- Aggregation by time intervals [3]:

`SELECT * FROM my_table GROUP BY floor(time, 3h)`

- Querying on changes in date-times:

`SELECT ... WHERE last_heartbeat_time < now() - 3h`

(Note: neither of these is currently supported, though columns with duration
values are.)

Internally, durations are represented as three signed counters: one for months,
for days, and for nanoseconds. Each of these counters is serialized using a
variable-length encoding which is described in version 5 of the CQL native
protocol specification.

The representation of a duration as three counters means that a semantic
ordering on durations doesn't exist: Is `1mo` greater than `1mo1d`? We cannot
know, because some months have more days than others. Durations can only have a
concrete absolute value when they are "attached" to absolute date-time
references. For example, `2015-04-31 at 12:00:00 + 1mo`.

That duration values are not comparable presents some difficulties for the
implementation, because most CQL types are. Like in Cassandra's implementation
[2], I adopted a similar strategy to the way restrictions on the `counter` type
are checked. A type "references" a duration if it is either a duration or it
contains a duration (like a `tuple<..., duration, ...>`, or a UDT with a
duration member).

The following restrictions apply on durations. Note that some of these contexts
are either experimental features (materialized views), or not currently
supported at run-time (though support exists in the parser and code, so it is
prudent to add the restrictions now):

- Durations cannot appear in any part of a primary key, either for tables or
  materialized views.

- Durations cannot be directly used as the element type of a `set`, nor can they
  be used as the key type of a `map`. Because internal ordering on durations is
  based on a byte-level comparison, this property of Cassandra was intended to
  help avoid user confusion around ordering of collection elements.

- Secondary indexes on durations are not supported.

- "Slice" relations (<=, <, >=, >) are not supported on durations with `WHERE`
   restrictions (like `SELECT ... WHERE span <= 3d`). Multi-column restrictions
   only work with clustering columns, which cannot be `duration` due to the
   first rule.

- "Slice" relations are not supported on durations with query conditions (like
  `UPDATE my_table ... IF span > 5us`).

Backwards incompatibility note:

As described in the documentation [4], duration literals take one of two
forms: either ISO 8601 formats (there are three), or a "standard" format. The ISO
8601 formats start with "P" (like "P5W"). Therefore, identifiers that have this
form are no longer supported.

Fixes #2240.

[1] https://issues.apache.org/jira/browse/CASSANDRA-11873

[2] bfd57d13b7

[3] https://issues.apache.org/jira/browse/CASSANDRA-11871

[4] http://cassandra.apache.org/doc/latest/cql/types.html#working-with-durations
2017-08-10 15:01:10 -04:00
Duarte Nunes
3bfcf47cc6 types: Implement hash() for collections
This patch provides a rather trivial implementation of hash() for
collection types.

It is needed for view building, where we hold mutations in a map
indexed by partition keys (and frozen collection types can be part of
the key).

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170718192107.13746-1-duarte@scylladb.com>
2017-07-19 09:52:56 +03:00
Gleb Natapov
a032078410 intern also tuple and user defined types
Currently each time UDT or tuple is parsed new object is created. If
those objects are used to create container type repeatedly it will cause
memory leak since container types are interned, but lookup in the
cache is done using pointer to a contained type (which will be always
different for UDT and tuples). This patches interns also UDT and tuple,
so each type the same object is parsed same pointer is also returned.

Refs #2469
Fixes #2487

Message-Id: <20170612142942.GO21915@scylladb.com>
2017-06-14 14:41:17 +03:00
Avi Kivity
f5dae826ce Merge "Migrate schema tables to v3 format" from Calle
"Defines origin v3-format for system/schema tables, and use them for
schema storage/retrival.

Includes a legacy_schema_migrator implementation/port from origin. Note
that since we don't support features like triggers, functions and
aggregates, it will bail if encountering such a feature used.

Note also that this patch set does not convert the "hints" and
"backlog" tables, even though these have changed in v3 as well.
That will be a separate patch set.

Tested against dtests. Note that patches for dtest + ccm
will follow."

* 'calle/systemtables' of github.com:cloudius-systems/seastar-dev: (36 commits)
  legacy_schema_migrator: Actually truncate legacy schema tables on finish
  database: Extract "remove" from "drop_columnfamily"
  v3 schema test fixes
  thrift: Update CQL mapping of static CFs
  schema_tables: Use v3 schema tables and formats
  type_parser: Origin expects empty string -> bytes_type
  cf_prop_defs: Add crc_check_chance as recognized (even if we don't use)
  types_test: v3 style schemas enforce explicit "frozen" in tupes/ut:s
  cql3_type: v3 to_string
  cql_types: Introduce cql3_type::empty and associate with empty data_type
  schema: rename column accessors to be in line with origin
  schema: Add "is_static_compact_table"
  schema_builder: Add helper to generate unique column names akin origin
  schema: Add utility functions for static columns
  schema: Use heterogeneous comparator for columns bounds
  cql3_type_parser: Resolve from cql3 names/expressions
  cql3_type: Add "prepare_interal" and "references_user_type"
  cql3::cql3_type: Add prepare_internal path using only "local" holders
  cql3_type: Add virtual destructor.
  database/main: encapsulate system CF dir touching
  ...
2017-05-17 11:25:52 +03:00
Vlad Zolotarov
494ea82a88 utils::UUID: align the UUID serialization API with the similar API of other classes in the project
The standard serialization API (e.g. in data_value) includes the following methods:

size_t serialized_size() const;
void serialize(bytes::iterator& it) const;
bytes serialize() const;

Align the utils::UUID API with the pattern above.

The only addition is that we are going to make an output iterator parameter of a second method above
a template so that we may serialize into different output sources.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-05-16 15:56:03 -04:00
Calle Wilund
c572a8c83c cql_types: Introduce cql3_type::empty and associate with empty data_type 2017-05-10 16:44:48 +00:00
Duarte Nunes
4e693383f7 mutation_partion: Use row_tombstone
This patch replaces the current row tombstone representation by a
row_tombstone.

The intent of the patch is thus to reify the idea of shadowable
tombstones, that up until now we considered all materialized view row
tombstones to be.

We need to distinguish shadowable from non-shadowable row tombstones
to support scenarios such as, when inserting to a table with a
materialzied view:

1. insert into base (p, v1, v2) values (3, 1, 3) using timestamp 1
2. delete from base using timestamp 2 where p = 3
3. insert into base (p, v1) values (3, 1) using timestamp 3

These should yield a view row where v2 is definitely null, but with
the current implementation, v2 will pop back with its value v2=3@TS=1,
even though its dead in the base row. This is because the row
tombstone inserted at 2) is a shadowable one.

This patch only addresses the memory representation of such
row_tombstones.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-04-25 11:46:33 +02:00
Raphael S. Carvalho
a6f8f4fe24 compaction: do not write expired cell as dead cell if it can be purged right away
When compacting a fully expired sstable, we're not allowing that sstable
to be purged because expired cell is *unconditionally* converted into a
dead cell. Why not check if the expired cell can be purged instead using
gc before and max purgeable timestamp?

Currently, we need two compactions to get rid of a fully expired sstable
which cells could have always been purged.

look at this sstable with expired cell:
  {
    "partition" : {
      "key" : [ "2" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 120,
        "liveness_info" : { "tstamp" : "2017-04-09T17:07:12.702597Z",
"ttl" : 20, "expires_at" : "2017-04-09T17:07:32Z", "expired" : true },
        "cells" : [
          { "name" : "country", "value" : "1" },
        ]

now this sstable data after first compaction:
[shard 0] compaction - Compacted 1 sstables to [...]. 120 bytes to 79
(~65% of original) in 229ms = 0.000328997MB/s.

  {
    ...
    "rows" : [
      {
        "type" : "row",
        "position" : 79,
        "cells" : [
          { "name" : "country", "deletion_info" :
{ "local_delete_time" : "2017-04-09T17:07:12Z" },
            "tstamp" : "2017-04-09T17:07:12.702597Z"
          },
        ]

now another compaction will actually get rid of data:
compaction - Compacted 1 sstables to []. 79 bytes to 0 (~0% of original)
in 1ms = 0MB/s. ~2 total partitions merged to 0

NOTE:
It's a waste of time to wait for second compaction because the expired
cell could have been purged at first compaction because it satisfied
gc_before and max purgeable timestamp.

Fixes #2249, #2253

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170413001049.9663-1-raphaelsc@scylladb.com>
2017-04-13 10:59:19 +03:00
Duarte Nunes
61741a69b6 collection_type_impl: Use set difference for tombstones
This patch fixes collection_type_impl::difference() so it does set
difference for tombstones instead of just returning the larger
one, as difference() is supposed to return only the information in
mutation A that supersedes that in B, given difference(A, B).

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 14:34:01 +01:00
Duarte Nunes
19fcd2d140 collection_type_impl: A mutation with a tombstone is not empty
This patch changes the collection_type_impl::is_empty() function so
that it doesn't consider empty a collection_mutation which has a
tombstone.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 14:34:01 +01:00
Paweł Dziepak
53d9a6f220 types: make counter_type_impl report its cql3_type 2017-02-02 10:35:14 +00:00
Paweł Dziepak
0c93d01232 atomic_cell: make sure upper level tombstones cover counters
Support for deletion of counters is limited in a way that once deleted
they cannot be used again (i.e. tombstone always wins, regardless of the
timestamp). Logic responsible for merging two counter cells already
makes sure that tombstones are handled properly, but it is also
necessary to ensure that higher level tombstones always cover counters.
2017-02-02 10:35:14 +00:00
Paweł Dziepak
8cdffd7c57 time_type_impl: value initialize result
parse_time() adds hourse, minutes, etc to a final value 'result'.
However, it is of type std::chrono::nanoseconds which means it is not
zeroed at initialization unless it is explicitly asked to do so.

Fixed debug mode failures in types_tyes and cql_query_test.

Message-Id: <20170125155239.1253-1-pdziepak@scylladb.com>
2017-01-25 17:56:31 +02:00
Pekka Enberg
93e6592296 cql3: TIME data type support
This adds support for the TIME data type introduced in CQL 3.3.1.

Refs #1284
2017-01-09 10:42:20 +02:00
Pekka Enberg
9def7db381 cql3: DATE type support
This adds support for the DATE type introduced in CQL 3.3.1.

Refs #1284
2017-01-09 10:42:20 +02:00