Commit Graph

391 Commits

Author SHA1 Message Date
Benny Halevy
f8d9e81bdb types: time_point_to_string: prevent overflow of nanoseconds
Due to #7175, microseconds are stored in a db_clock::time_point
as if they were milliseconds.

std::chrono::duration_cast<std::chrono::nanoseconds> may cause overflow
and end up with invalid/negative nanos.

This change specializes time_point_to_string to std::chrono::milliseconds
since it's currently only called to print db_clock::time_point
and uses boost::posix_time::milliseconds to print the count.

This would generate an exception in today's time stamps
and the output will look like:
    1599493018559873 milliseconds (Year is out of valid range: 1400..9999)
instead of:
    1799-07-16T19:57:52.175010

It is preferrable to print the numeric value annotated as out of valid range
than to print a bogus date in the past.

Test: unit(dev), commitlog_test:TestCommitLog.test_mixed_mode_commitlog_same_partition_smp_1
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200907162845.147477-1-bhalevy@scylladb.com>
2020-09-08 10:02:02 +03:00
Piotr Grabowski
ffd8c8c505 utf8: Print invalid UTF-8 character position
Add new validate_with_error_position function
which returns -1 if data is a valid UTF-8 string
or otherwise a byte position of first invalid
character. The position is added to exception
messages of all UTF-8 parsing errors in Scylla.

validate_with_error_position is done in two
passes in order to preserve the same performance
in common case when the string is valid.
2020-09-07 18:11:21 +03:00
Benny Halevy
0c474b1c01 types: time_point_to_string: handle errors from boost::posix_time::to_iso_extended_string
As seen in https://github.com/scylladb/scylla/issues/7175,
1e676cd845
that was merged in bc77939ada
exposed a preexisting problem in time_point_to_string
where it tried printing a timestamp that was in microseconds
(taken from an api::timestamp_type instead of db_clock::time_point)
and hit `boost::wrapexcept<boost::gregorian::bad_year> (Year is out of valid range: 1400..9999)`

If hit, this patch with print the offending time_stamp in
nanoseconds and the error message.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200907083303.33229-1-bhalevy@scylladb.com>
2020-09-07 11:36:43 +03:00
Benny Halevy
66ce3a4c25 types: time_point_to_string: do not assume tp is in milliseconds
T& tp may have other period than milliseconds.
Cast the time_point duration to nanoseconds (or microseconds
if boost doesn't supports it) so it is printed in the best possible
resolution.

Note that we presume that the time_point epoch is the
Unix epoch of 1970-01-01, but the c++ standard doesn't guwarntee
that.  See https://github.com/scylladb/scylla/issues/5498

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200906171106.690872-1-bhalevy@scylladb.com>
2020-09-07 10:44:52 +03:00
Avi Kivity
a4c44cab88 treewide: update concepts language from the Concepts TS to C++20
Seastar recently lost support for the experimental Concepts Technical
Specification (TS) and gained support for C++20 concepts. Re-enable
concepts in Scylla by updating our use of concepts to the C++20
standard.

This change:
 - peels off uses of the GCC6_CONCEPT macro
 - removes inclusions of <seastar/gcc6-concepts.hh>
 - replaces function-style concepts (no longer supported) with
   equation-style concepts
 - semicolons added and removed as needed
 - deprecated std::is_pod replaced by recommended replacement
 - updates return type constraints to use concepts instead of
   type names (either std::same_as or std::convertible_to, with
   std::same_as chosen when possible)

No attempt is made to improve the concepts; this is a specification
update only.
Message-Id: <20200531110254.2555854-1-avi@scylladb.com>
2020-06-02 09:12:21 +03:00
Avi Kivity
777d5e88c3 types: support altering fixed-size integer types to varint
Fixed-size integer types are legal varints - both are serialized as
two's complement in network byte order. So there's tinyint, shortint,
int, and bigint can be interpreted as varints.

Change is_compatible_with() to reflect that.
Message-Id: <20200516115143.28690-2-avi@scylladb.com>
2020-05-17 11:31:00 +03:00
Avi Kivity
ff57e4d9a5 types: make short and byte types value-compatible with varint
The short and byte types are two's complement network byte order,
just like varint (except fixed size) and so varint can read them
just fine.

Mark them as value compatible like int32_type and long_type.

A unit test is added.
Message-Id: <20200516115143.28690-1-avi@scylladb.com>
2020-05-17 11:31:00 +03:00
Botond Dénes
2e09a0317c types, compound: pass std::current_exception() to on_internal_error()
So that  nested exceptions are not lost. Also, marshal exceptions, the
ones we have in these places, already have a backtrace, so might as well
use that, instead of creating a new one, loosing unwound frames.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200507091405.244544-1-bdenes@scylladb.com>
2020-05-07 11:25:25 +02:00
Piotr Sarna
c32faee657 Merge 'counters: Fix filtering of counters' from Juliusz
Queries with `ALLOW FILTERING` and constraints on counter
values used to be rejected as "unimplemented". The reason
was a missing tri-comparator, which is added in this patch.

Fixes #5635

* jul-stas-5635-filtering-on-counters:
  cql/tests: Added test for filtering on counter columns
  counters: add comparator and remove `unimplemented` from restrictions
2020-04-27 13:53:34 +02:00
Juliusz Stasiewicz
cf2d81bb12 counters: add comparator and remove unimplemented from restrictions
CQL `counter_type_impl` is now made comparable by deserializing it
as an `int64_t`. It allows the use of counters in statement
restrictions.
2020-04-27 13:27:48 +02:00
Pavel Solodovnikov
f6e765b70f cql3: pass column_specification via lw_shared_ptr
`column_specification` class is marked as "final": it's safe
to use non-polymorphic pointer "lw_shared_ptr" instead of a
more generic "shared_ptr".

tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200427084016.26068-1-pa.solodovnikov@scylladb.com>
2020-04-27 12:47:42 +03:00
Botond Dénes
9e1d6ada0f types: compare(): cover more paths with on_internal_error()
Currently we call `on_internal_error()` if `tri_compare()` throws
`marshal_exception`. Some compare paths however might go around
`tri_compare()` and call `abstract_type::compare()` directly. Move the
check there to cover these cases too.

Tests: dev
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200403162530.1175801-1-bdenes@scylladb.com>
2020-04-03 18:35:30 +02:00
Avi Kivity
3c772757c0 treewide: use utils::multiprecision_int for varint implementation
The goal is to forward-declare utils::multiprecision_int, something
beyond my capabilities for boost::multiprecision::cpp_int, to reduce
compile time bloat.

The patch is mostly search-and-replace, with a few casts added to
disambiguate conversions the compiler had trouble with.
2020-03-04 13:28:16 +02:00
Calle Wilund
b6443e44b9 set: Make set_type_impl::serialize_partially_deserialized_form static
Conform with map + does not require any instance info.
2020-03-02 14:43:34 +00:00
Rafael Ávila de Espíndola
93de9597bf types: Add more data_value constructors
With this we can construct a data_value from any string type. This
also avoids a few sstring copies.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-02-28 08:36:27 -08:00
Benny Halevy
b31867eafa types: tri_compare: turn marshal_exception to on_internal_error
We see this exception on gemini testing with large number of pk, ck, columns, for example:
  2020-02-19T17:52:54+00:00  gemini-8h-large-num-columns-GeminiL-db-node-f2d6a8e0-3 !ERR     | scylla: [shard 0] storage_proxy - Exception when communicating with 10.0.207.169: std::runtime_error (marshaling error: read_simple_exactly - size mismatch (expected 4, got 1) Backtrace:   0x2c4f08d#012  0x9fcd3e#012  0x444b28#012  0x4d8fe5#012  0xa78e8b#012  0xeab269#012  0xc27a67#012  0xc28239#012  0xc600e3#012  0xadebf3#012  0xae14c1#012  0x29ff291#012  0x29ff49f#012  0x2a3fc65#012  0x29a5d6f#012  0x29a6e9e#012  0x72a4e3#012  /opt/scylladb/libreloc/libc.so.6+0x271a2#012  0x77548d#012)

Decoded backtrace:
  seastar::current_backtrace() at crtstuff.c:?
  seastar::internal::backtraced<marshal_exception>::backtraced<seastar::basic_sstring<char, unsigned int, 15u, true> >(seastar::basic_sstring<char, unsigned int, 15u, true>&&) at crtstuff.c:?
  void seastar::throw_with_backtrace<marshal_exception, seastar::basic_sstring<char, unsigned int, 15u, true> >(seastar::basic_sstring<char, unsigned int, 15u, true>&&) at crtstuff.c:?
  abstract_type::compare(std::basic_string_view<signed char, std::char_traits<signed char> >, std::basic_string_view<signed char, std::char_traits<signed char> >) const [clone .cold] at types.cc:?
  bound_view::tri_compare::operator()(clustering_key_prefix const&, int, clustering_key_prefix const&, int) const at crtstuff.c:?
  sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at crtstuff.c:?
  mutation_reader_merger::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at crtstuff.c:?
  combined_mutation_reader::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at crtstuff.c:?
  restricting_mutation_reader::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at crtstuff.c:?
  cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at crtstuff.c:?

This patch should help us get a core dump if this happens again.

Ref #5856

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200227131939.388770-1-bhalevy@scylladb.com>
2020-02-28 07:57:13 +02:00
Pavel Solodovnikov
abb3a7e218 cql3: minor sweeps through the cql layer code to reduce shared_ptrs count
Convert some more helper functions to accept const reference to
column_specification and column_identifier instead of shared_ptr.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-02-16 17:24:26 +03:00
Pavel Solodovnikov
49bf936403 cql3: change signatures of several functions to return crefs instead of pointers
The following functions now accept const reference to
column_specification instead of shared_ptr:
 * lists::index_spec_of
 * lists::value_spec_of
 * lists::uuid_index_spec_of
 * sets::value_spec_of

Changed maps::value_spec_of and maps::key_spec_of signatures
to accept const ref instead of non-const ref to
column_specification.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-02-16 17:23:56 +03:00
Pavel Solodovnikov
76a0652deb types: fix serialization and validation of empty values
Empty values (zero-sized string in serialized form) were not
handled properly in serialize routines for floating types and
uuids, which led to runtime exceptions and failing tests as
described in https://github.com/scylladb/scylla/issues/5782.

Also fix validation visitor to handle empty values properly.

There already was the code in place that took into
consideration zero-sized values. But it was trying to read
some bytes regardless of that (e.g. for timeuuid values),
even if there is none to read.

Tests: unit(dev, debug)

Fixes: #5782

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200213130021.31598-1-pa.solodovnikov@scylladb.com>
2020-02-16 11:22:30 +02:00
Avi Kivity
541893e69a Merge "Fix conversion of lua nil to cql null" from Rafael
"
The fix itself is fairly simple, but looking at the code I found that
our code base was not cleanly distinguishing null and empty values and
was treating null and missing values differently, but that distinction
was dead since a null is represented as a dead cell.
"

* 'espindola/lua-fix-null-v6' of https://github.com/espindola/scylla:
  lua: Handle nil returns correctly
  types: Return bytes_opt from data_value::serialize
  query-result-set: Assert that we don't have null values
  types: Fix comparison of empty and null data_values
  Revert "tests: Handle null and not present values differently"
  query-result-set: Avoid a copy during construction
  types: Move operator== for data_value out-of-line
2020-02-02 15:43:24 +02:00
Rafael Ávila de Espíndola
cc81ba3432 types: Use a fancy iterator to avoid a temporary buffer
By using a fancy iterator we can avoid calling export_bits with a
temporary buffer before copying the result to the output.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-01-30 10:26:39 -08:00
Rafael Ávila de Espíndola
7e67ce0bdb types: Use export_bits to serialize cpp_int
This avoid a copy when serializing positive numbers.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-01-30 10:26:39 -08:00
Rafael Ávila de Espíndola
27a67f1a2c types: Avoid a branch in a loop
Thanks to Benny for the suggestion.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-01-30 10:26:39 -08:00
Rafael Ávila de Espíndola
c89c90d07f types: Fix encoding of negative varint
We would sometimes produce an unnecessary extra 0xff prefix byte.

The new encoding matches what cassandra does.

This was both a efficiency and correctness issue, as using varint in a
key could produce different tokens.

Fixes #5656

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-01-30 10:25:09 -08:00
Rafael Ávila de Espíndola
ed747122aa types: Replace "num.sign() < 0" with "num < 0"
Surprisingly, this produces better code with cpp_int.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-01-30 10:24:03 -08:00
Rafael Ávila de Espíndola
4b4efcf302 types: Remove collection_type_impl::serialize
The rest of the serialize api has been devirtualized some time ago,
but this auxiliary function stayed virtual.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200129203916.20460-1-espindola@scylladb.com>
2020-01-30 14:10:18 +02:00
Rafael Ávila de Espíndola
bd93a0af52 types: Return bytes_opt from data_value::serialize
Since a data_value can contain a null value, returning bytes from
serialize() was losing information as it was mapping null to empty.

This also introduces a serialize_nonnull that still returns bytes, but
results in an internal error if called with a null value.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-01-29 14:04:59 -08:00
Rafael Ávila de Espíndola
3abac35d9f types: Fix comparison of empty and null data_values
Before this patch a null data_value would compare equal to any
data_value that serialized to an empty byte sequence.

With this patch null only compares equal to null.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-01-29 13:24:10 -08:00
Rafael Ávila de Espíndola
02e8e8d6b3 types: Move operator== for data_value out-of-line
Most of the work is done by decompose and compare which are
out-of-line anyway.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-01-29 13:24:10 -08:00
Rafael Ávila de Espíndola
054f5761a7 types: Refactor code into a serialize_varint helper
This is a bit cleaner and avoids a boost::multiprecision::cpp_int copy
while serializing a decimal.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200110221422.35807-1-espindola@scylladb.com>
2020-01-14 16:28:27 +02:00
Nadav Har'El
de1171181c user defined types: fix support for case-sensitive type names
In the current code, support for case-sensitive (quoted) user-defined type
names is broken. For example, a test doing:

    CREATE TYPE "PHone" (country_code int, number text)
    CREATE TABLE cf (pk blob, pn "PHone", PRIMARY KEY (pk))

Fails - the first line creates the type with the case-sensitive name PHone,
but the second line wrongly ends up looking for the lowercased name phone,
and fails with an exception "Unknown type ks.phone".

The problem is in cql3_type_name_impl. This class is used to convert a
type object into its proper CQL syntax - for example frozen<list<int>>.
The problem is that for a user-defined type, we forgot to quote its name
if not lowercase, and the result is wrong CQL; For example, a list of
PHone will be written as list<PHone> - but this is wrong because the CQL
parser, when it sees this expression, lowercases the unquoted type name
PHone and it becomes just phone. It should be list<"PHone">, not list<PHone>.

The solution is for cql3_type_name_impl to use for a user-defined type
its get_name_as_cql_string() method instead of get_name_as_string().

get_name_as_cql_string() is a new method which prints the name of the
user type as it should be in a CQL expression, i.e., quoted if necessary.

The bug in the above test was apparently caused when our code serialized
the type name to disk as the string PHone (without any quoting), and then
later deserialized it using the CQL type parser, which converted it into
a lowercase phone. With this patch, the type's name is serialized as
"PHone", with the quotes, and deserialized properly as the type PHone.
While the extra quotes may seem excessive, they are necessary for the
correct CQL type expression - remember that the type expression may be
significantly more complex, e.g., frozen<list<"PHone">> and all of this,
including the quotes, is necessary for our parser to be able to translate
this string back into a type object.

This patch may cause breakage to existing databases which used case-
sensitive user-defined types, but I argue that these use cases were
already broken (as demonstrated by this test) so we won't break anything
that actually worked before.

Fixes #5544

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200101160805.15847-1-nyh@scylladb.com>
2020-01-03 15:48:20 +02:00
Rafael Ávila de Espíndola
5417c5356b types: Move get_castas_fctn to cql3
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191120181213.111758-9-espindola@scylladb.com>
2019-11-21 12:08:50 +02:00
Rafael Ávila de Espíndola
f06d6df4df types: Simplify casts to string
These now just use the to_string member functions, which makes it
possible to move the code to another file.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191120181213.111758-8-espindola@scylladb.com>
2019-11-21 12:08:50 +02:00
Rafael Ávila de Espíndola
786b1ec364 types: Move json code to its own file
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191120181213.111758-7-espindola@scylladb.com>
2019-11-21 12:08:49 +02:00
Rafael Ávila de Espíndola
af8e207491 types: Avoid using deserialize_value in json code
This makes it independent of internal functions and makes it possible
to move it to another file.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191120181213.111758-6-espindola@scylladb.com>
2019-11-21 12:08:49 +02:00
Rafael Ávila de Espíndola
ed65e2c848 types: Move cql3_kind to the cql3 directory
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191120181213.111758-5-espindola@scylladb.com>
2019-11-21 12:08:47 +02:00
Rafael Ávila de Espíndola
bd560e5520 types: Fix dynamic types of some data_value objects
I found these mismatched types while converting some member functions
to standalone functions, since they have to use the public API that
has more type checks.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191120181213.111758-4-espindola@scylladb.com>
2019-11-21 12:08:46 +02:00
Rafael Ávila de Espíndola
9208b2f498 Lua: Implement support for returning inet
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-11-07 08:41:08 -08:00
Rafael Ávila de Espíndola
64be94ab01 Lua: Implement support for inet arguments
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-11-07 08:41:08 -08:00
Rafael Ávila de Espíndola
faf029d472 Lua: Implement support for returning time
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-11-07 08:41:08 -08:00
Rafael Ávila de Espíndola
484f498534 Lua: Implement support for returning timeuuid
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-11-07 08:41:08 -08:00
Rafael Ávila de Espíndola
9c2daf6554 Lua: Implement support for returning uuid
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-11-07 08:41:08 -08:00
Rafael Ávila de Espíndola
f8aeed5beb Lua: Implement support for returning date
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-11-07 08:41:08 -08:00
Rafael Ávila de Espíndola
63bc960152 Lua: Implement support for returning timestamp
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-11-07 08:41:08 -08:00
Rafael Ávila de Espíndola
0d9d53b5da Lua: Implement support for counter arguments
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-11-07 08:41:08 -08:00
Rafael Ávila de Espíndola
9bf9a84e4d types: Move the data_value visitor to a header
It will be used by the UDF implementation.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-11-07 08:19:52 -08:00
Rafael Ávila de Espíndola
c74864447b types: Simplify validate_visitor for strings
We have different types for ascii and utf8, so there is no need for
an extra if.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191024232911.22700-1-espindola@scylladb.com>
2019-10-29 11:02:55 +02:00
Kamil Braun
612de1f4e3 types: handle trailing nulls in tuples/UDTs better.
Comparing user types after adding new fields was bugged.
In the following scenario:

create type ut (a int);
create table cf (a int primary key, b frozen<ut>);
insert into cf (a, b) values (0, (0));
alter type ut add b int;
select * from cf where b = {a:0,b:null};

the row with a = 0 should be returned, even though the value stored in the database is shorter
(by one null) than the value given by the user. Until now it wouldn't
have.
2019-10-25 12:04:44 +02:00
Kamil Braun
abe6c2d3d2 types: introduce to_bytes_opt_vec function.
It converts a vector<bytes_view_opt> to a vector<bytes_opt>.
Used in a bunch of places.
2019-10-25 12:04:44 +02:00
Kamil Braun
a8c7670722 types: add multi_cell field to user_type_impl.
is_value_compatible_with_internal and update_user_type were generalized
to the non-frozen case.

For now, all user_type_impls in the code are non-multi-cell (frozen).
This will be changed in future commits.
2019-10-25 12:04:44 +02:00