Commit Graph

705 Commits

Author SHA1 Message Date
Piotr Sarna
d1db198211 Merge ' Allow repeated LIKE on same column' from Dejan
Fixes #5902 by making the LIKE restriction keep a vector
of matchers and apply them all to the column value.

Tests: unit (dev)

* dekimir/multiple-likes:
  cql3: Allow repeated LIKE on same column
  cql3: Forbid calling LIKE::values()
  cql3: Move LIKE::_last_pattern to matcher
2020-03-06 09:55:54 +01:00
Nadav Har'El
f67a402c48 merge: Remove treewide dependency on boost/multiprecision
Merged patch series from Avi Kivity:

boost/multiprecision is a heavyweight library, pulling in 20,000 lines of code into
each header that depends on it. It is used by converting_mutation_partition_applier
and types.hh. While the former is easy to put out-of-line, the latter is not.

All we really need is to forward-declare boost::multiprecision::cpp_int, but that
is not easy - it is a template taking several parameters, among which are non-type
template parameters also defined in that header. So it's quite difficult to
disentangle, and fragile wrt boost changes.

This patchset introduces a wrapper type utils::multiprecision_int which _can_
be forward declared, and together with a few other small fixes, manages to
uninclude boost/multiprecision from most of the source files. The total reduction
in number of lines compiled over a full build is 324 * 23,227 or around 7.5
million.

Tests: unit (dev)
Ref #1

https://github.com/avikivity/scylla uninclude-boost-multiprecision/v1

Avi Kivity (5):
  converting_mutation_partition_applier: move to .cc file
  utils: introduce multiprecision_int
  tests: cdc_test: explicitly convert from cdc::operation to uint8_t
  treewide: use utils::multiprecision_int for varint implementation
  types: forward-declare multiprecision_int

 configure.py                             |   2 +
 concrete_types.hh                        |   2 +-
 converting_mutation_partition_applier.hh | 163 ++-------------
 types.hh                                 |  12 +-
 utils/big_decimal.hh                     |   3 +-
 utils/multiprecision_int.hh              | 256 +++++++++++++++++++++++
 converting_mutation_partition_applier.cc | 188 +++++++++++++++++
 cql3/functions/aggregate_fcts.cc         |  10 +-
 cql3/functions/castas_fcts.cc            |  28 +--
 cql3/type_json.cc                        |   2 +-
 lua.cc                                   |  38 ++--
 mutation_partition_view.cc               |   2 +
 test/boost/cdc_test.cc                   |   6 +-
 test/boost/cql_query_test.cc             |  16 +-
 test/boost/json_cql_query_test.cc        |  12 +-
 test/boost/types_test.cc                 |  58 ++---
 test/boost/user_function_test.cc         |   2 +-
 test/lib/random_schema.cc                |  14 +-
 types.cc                                 |  20 +-
 utils/big_decimal.cc                     |   4 +-
 utils/multiprecision_int.cc              |  37 ++++
 21 files changed, 627 insertions(+), 248 deletions(-)
 create mode 100644 utils/multiprecision_int.hh
 create mode 100644 converting_mutation_partition_applier.cc
 create mode 100644 utils/multiprecision_int.cc
2020-03-04 15:13:42 +02:00
Avi Kivity
3c772757c0 treewide: use utils::multiprecision_int for varint implementation
The goal is to forward-declare utils::multiprecision_int, something
beyond my capabilities for boost::multiprecision::cpp_int, to reduce
compile time bloat.

The patch is mostly search-and-replace, with a few casts added to
disambiguate conversions the compiler had trouble with.
2020-03-04 13:28:16 +02:00
Avi Kivity
7434c81a29 utils: introduce multiprecision_int
multiprecision_int is a wrapper around boost::multiprecision::cpp_int that adds
no functionality. The intent is to allow forward declration; cpp_int is so
complicated that just finding out what its true type is a difficult exercise, as
it depends on many internal declarations.

Because cpp_int uses expression templates, the implementation has to explicitly
cast to the desired type in many places, otherwise the C++ compile is presented
with too many choices, especially in conjunction with data_value (which can
convert from many different types too).
2020-03-04 12:42:57 +02:00
Tomasz Grabiec
82b76163e3 utils/small_vector: Add missing include
Needed for std::uninitialized_move() et al

Message-Id: <20200303191148.11716-1-tgrabiec@scylladb.com>
2020-03-03 21:23:40 +02:00
Avi Kivity
157fe4bd19 Merge "Remove default timeouts" from Botond
"
Timeouts defaulted to `db::no_timeout` are dangerous. They allow any
modifications to the code to drop timeouts and introduce a source of
unbounded request queue to the system.

This series removes the last such default timeouts from the code. No
problems were found, only test code had to be updated.

tests: unit(dev)
"

* 'no-default-timeouts/v1' of https://github.com/denesb/scylla:
  database: database::query*(), database::apply*(): remove default timeouts
  database: table::query(): remove default timeout
  mutation_query: data_query(): remove default timeout
  mutation_query: mutation_query(): remove default timeout
  multishard_mutation_query: query_mutations_on_all_shards(): remove default timeout
  reader_concurrency_semaphore: wait_admission(): remove default timeout
  utils/logallog: run_when_memory_available(): remove default timeout
2020-03-01 17:29:17 +02:00
Avi Kivity
db544db5e2 Merge "Convert a few APIs to std::string_view" from Rafael
"
As part of avoiding static initialization order problems I want to
switch a few global sstring to constexpr std::string_view. The
advantage being that a constexpr variable doesn't need runtime
initialization and therefore cannot be part of a static initialization
order problem.

In order to do the conversion I needed to convert a few APIs to use
std::string_view instead of sstring and const sstring&.

These patches are the simple cases that are also an improvement in
their own right.
"

* 'espindola/string_view' of https://github.com/espindola/scylla: (22 commits)
  test: Pass a string_view to create_table's callback
  Pass string_view to the schema constructor
  cql3: Pass string_view to the column_specification constructor
  Pass string_view to keyspace_metadata::new_keyspace
  Pass string_view to the keyspace_metadata constructor
  utils: Use std::string as keys in nonstatic_class_registry
  utils: Pass a string_view to class_registry::to_qualified_class_name
  auth: Return a string_view from authorizer::qualified_java_name
  auth: Return a string_view from authenticator::qualified_java_name
  utils: Pass string_view to is_class_name_qualified
  test: Pass a string_view to create_keyspace
  Pass string_view to no_such_column_family's constructor
  perf_simple_query: Pass a string_view to make_counter_schema
  Pass string_view to the schema_builder constructor
  types: Add more data_value constructors
  transport: Pass a string_view to cql_server::connection::make_autheticate
  transport: Pass a string_view to cql_server::response::write_string
  cql3: Pass std::string_view to query_processor::compute_id
  cql3: Remove unused variable
  cql3: Pass a string_view to cf_statement::prepare_keyspace
  ...
2020-03-01 14:22:28 +02:00
Rafael Ávila de Espíndola
b3d396ea1f utils: Use on_internal_error from seastar
With this change abort_on_internal_error is enable on every
SEASTAR_TEST_CASE.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200227164823.21021-1-espindola@scylladb.com>
2020-02-29 19:28:57 +02:00
Rafael Ávila de Espíndola
01fe766f1f utils: Use std::string as keys in nonstatic_class_registry
The sstring::compare functions was never updated to work with
std::string_view. We could fix that, but it seems better to just
switch to std::string.

With a working compare function we can avoid copying the argument
passed to to_qualified_class_name when an entry is found in the map.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-02-28 17:04:08 -08:00
Rafael Ávila de Espíndola
31985d3c28 utils: Pass a string_view to class_registry::to_qualified_class_name
This just moves a string copy from the caller to the implementation.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-02-28 13:30:00 -08:00
Rafael Ávila de Espíndola
fae05e9268 utils: Pass string_view to is_class_name_qualified
With this we don't need to construct a sstring just to call
is_class_name_qualified.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-02-28 08:36:27 -08:00
Botond Dénes
93039a085d utils/logallog: run_when_memory_available(): remove default timeout 2020-02-27 18:36:32 +02:00
Botond Dénes
d1194da98d utils::updateable_value: add operator=(T)
Allow assigning a const value.
2020-02-27 18:11:54 +02:00
Dejan Mircevski
0d7457946f cql3: Allow repeated LIKE on same column
No reason to disallow this.  We still forbid mixing LIKE and non-LIKE
relations on the same column.

Fixes #5902.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-02-27 09:34:51 -05:00
Dejan Mircevski
fd583196ce cql3: Move LIKE::_last_pattern to matcher
Instead of keeping the LIKE pattern in a restriction object (as we
currently do), keep it in like_matcher.  Also move the
pattern-idempotence check from the restriction to the matcher.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-02-26 14:00:04 -05:00
Avi Kivity
48b694df55 cql3: like_matcher: pimplify to reduce inclusions of boost/regex
boost/regex has huge header dependencies amounting to tens of thousands of
lines. This are now replicated in 167 translation units.

This patch converts like_matcher to use the pointer-to-implementation
idiom, which reduces the number of translations including boost/regex
to 28.

Since regular expressions are relatively expensive, and like_matcher is
relatively rare, the extra memory usage and run time will be
negligible.
Message-Id: <20200211170152.809554-1-avi@scylladb.com>
2020-02-12 17:04:12 +02:00
Pavel Emelyanov
d1775dd701 utils: Move disk-error-handler into it
The disk-error-handler is purely auxiliary thing that helps
propagating IO errors to the rest of the code. It well
deserves not sitting in the root namespace.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200207112443.18475-1-xemul@scylladb.com>
2020-02-09 17:26:52 +02:00
Avi Kivity
b01f0cab60 utils: add missing include for ssize_t
gcc 10 tightened its C++ includes to no longer provide ssize_t,
so we must get it from a C header instead.
Message-Id: <20200129205912.21139-1-avi@scylladb.com>
2020-01-30 14:10:18 +02:00
Rafael Ávila de Espíndola
090164791c logalloc: Store unused ids in a std::vector
There doesn't seem to be any requirement for how unused ids are
reused, so we may as well use the simpler type.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200129211154.47907-1-espindola@scylladb.com>
2020-01-30 10:31:16 +02:00
Avi Kivity
3343baf159 Merge "cql3: time_uuid_fcts: validate time UUID" from Benny
"
Throw an error in case we hit an invalid time UUID
rather than hitting an assert.

Fixes #5552

(Ref #5588 that was dequeued and fixed here)

Test: UUID_test, cql_query_test(debug)
"

* 'validate-time-uuid' of https://github.com/bhalevy/scylla:
  cql3: abstract_function_selector: provide assignment_testable_source_context
  test: cql_query_test: add time uuid validation tests
  cql3: time_uuid_fcts: validate timestamp arg
  cql3: make_max_timeuuid_fct: delete outdated FIXME comment
  cql3: time_uuid_fcts: validate time UUID
  test: UUID_test: add tests for time uuid
  utils: UUID: create_time assert nanos_since validity
  utils/UUID_gen: make_nanos_since
  utils: UUID: assert UUID.is_timestamp
2020-01-29 00:11:17 +02:00
Avi Kivity
ec1687e4fe Merge "Remove deprecated partitioners #5636" from Piotr
"
This PR makes named_value respect allowed_values and then use it to transition away from old deprecated RandomPartitioner and ByteOrderedPartitioner. Then it removes the code that's no longer used.

We want to remove deprecated partitioners because, on one hand, they lead to performance problems and hot nodes. Moreover, we're planning to unify the token representation which would allow per table partitioner support. That, in turn, is a feature helpful in multiple efforts like CDC, materialized views, secondary indexes and multi-tenancy.

tests: unit(dev)
"

* 'remove_deprecated_partitioners' of https://github.com/haaawk/scylla:
  partitioners: remove random_partitioner
  partitioners: Make it impossible to use RandomPartitioner
  partitioners: remove byte_ordered_partitioner
  partitioners: Make it impossible to use ByteOrderedPartitioner
  partitioners: Remove leftovers of OrderPreservingPartitioner
  i_partitioner.cc: stop including byte_ordered_partitioner.hh
  i_partitioner.cc: stop including random_partitioner.hh
  config: use allowed_values to verify named_value input
  config: add operator<< for seed_provider_type
2020-01-29 00:11:17 +02:00
Benny Halevy
72e2ea47c1 cql3: time_uuid_fcts: validate time UUID
Throw an error in case we hit an invalid time UUID
rather than hitting an assert.

Ref #5552

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-01-27 11:09:01 +02:00
Benny Halevy
f8b079b599 utils: UUID: create_time assert nanos_since validity
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-01-27 11:09:01 +02:00
Benny Halevy
cd3460cc88 utils/UUID_gen: make_nanos_since
Safely convert millis to "nanos_since" (number of 100
nanseconds since START_EPOCH) while type casting to uint64_t
to avoid possible int overflow.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-01-27 11:08:16 +02:00
Benny Halevy
22bac26023 utils: UUID: assert UUID.is_timestamp
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-01-26 18:54:36 +02:00
Piotr Jastrzebski
6a2cd64b5c config: use allowed_values to verify named_value input
Even though we configure the set of accepted values for
some config flags, named_value ignore them.

This patch implements the checks that verify flag is
not set to the value that's not on the list.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-01-24 09:08:59 +01:00
Rafael Ávila de Espíndola
d9a71a7cff service: Refactor code into a atomic_vector class
This templates the code for listener_vector, renames it to
atomic_vector and moves it to the utils directory.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-01-22 08:16:03 -08:00
Avi Kivity
1f46133273 Merge "data: make cell::make_collection() exception safe" from Botond
"
Most of the code in `cell` and the `imr` infrastructure it is built on
is `noexcept`. This means that extra care must be taken to avoid rouge
exceptions as they will bring down the node. The changes introduced by
0a453e5d3a did just that - introduced rouge `std::bad_alloc` into this
code path by violating an undocumented and unvalidated assumption --
that fragment ranges passed to `cell::make_collection()` are nothrow
copyable and movable.

This series refactors `cell::make_collection()` such that it does not
have this assumption anymore and is safe to use with any range.

Note that the unit test included in this series, that was used to find
all the possible exception sources will not be currently run in any of
our build modes, due to `SEASTAR_ENABLE_ALLOC_FAILURE_INJECTION` not
being set. I plan to address this in a followup because setting this
flags fails other tests using the failure injection mechanism. This is
because these tests are normally run with the failure injection disabled
so failures managed to lurk in without anyone noticing.

Fixes: #5575
Refs: #5341

Tests: unit(dev, debug)
"

* 'data-cell-make-collection-exception-safety/v2' of https://github.com/denesb/scylla:
  test: mutation_test: add exception safety test for large collection serialization
  data/cell.hh: avoid accidental copies of non-nothrow copiable ranges
  utils/fragment_range.hh: introduce fragment_range_view
2020-01-14 10:01:06 +02:00
Botond Dénes
b52b4d36a2 utils/fragment_range.hh: introduce fragment_range_view
A lightweight, trivially copyable and movable view for fragment ranges.
Allows for uniform treatment of all kinds of ranges, i.e. treating all
of them as a view. Currently `fragment_range.hh` provides lightweight,
view-like adaptors for empty and single-fragment ranges (`bytes_view`). To
allow code to treat owning multi-fragment ranges the shame way as the
former two, we need a view for the latter as well -- this is
`fragment_range_view`.
2020-01-13 16:52:59 +02:00
Avi Kivity
454074f284 Merge "database: Avoid OOMing with flush continuations after failed memtable flush" from Tomasz
"
The original fix (10f6b125c8) didn't
take into account that if there was a failed memtable flush (Refs
flush) but is not a flushable memtable because it's not the latest in
the memtable list. If that happens, it means no other memtable is
flushable as well, cause otherwise it would be picked due to
evictable_occupancy(). Therefore the right action is to not flush
anything in this case.

Suspected to be observed in #4982. I didn't manage to reproduce after
triggering a failed memtable flush.

Fixes #3717
"

* tag 'avoid-ooming-with-flush-continuations-v2' of github.com:tgrabiec/scylla:
  database: Avoid OOMing with flush continuations after failed memtable flush
  lsa: Introduce operator bool() to occupancy_stats
  lsa: Expose region_impl::evictable_occupancy in the region class
2020-01-08 16:58:54 +02:00
Rafael Ávila de Espíndola
3d641d4062 lua: Use existing cpp_int cast logic
Different versions of boost have different rules for what conversions
from cpp_int to smaller intergers are allowed.

We already had a function that worked with all supported versions, but
it was not being use by lua.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200104041028.215153-1-espindola@scylladb.com>
2020-01-05 12:10:54 +02:00
Benny Halevy
4c884908bb directories: Keep a unique set of directories to initialize
If any two directories of data/commitlog/hints/view_hints
are the same we still end up running verify_owner_and_mode
and disk_sanity(check_direct_io_support) in parallel
on the same directoriea and hit #5510.

This change uses std::set rather than std::vector to
collect a unique set of directories that need initialization.

Fixes #5510

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20191225160645.2051184-1-bhalevy@scylladb.com>
2019-12-29 16:26:26 +02:00
Pavel Emelyanov
a5cdfea799 directories: Do not mess with per-shard base dir
The hints and view_hints directory has per-shard sub-dirs,
and the directories code tries to create, check and lock
all of them, including the base one.

The manipulations in question are excessive -- it's enough
to check and lock either the base dir, or all the per-shard
ones, but not everything. Let's take the latter approach for
its simplicity.

Fixes #5510

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Looks-good-to: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20191223142429.28448-1-xemul@scylladb.com>
2019-12-24 14:49:28 +02:00
Pavel Emelyanov
23a8d32920 directories: Make internals work on fs::path
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2019-12-12 19:52:01 +03:00
Pavel Emelyanov
373fcfdb3e directories: Cleanup adding dirs to the vector to work on
The unordered_set is turned into vector since for fs::path
there's no hash() method that's needed for set.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2019-12-12 19:52:01 +03:00
Pavel Emelyanov
14437da769 directories: Drop seastar::async usage
Now the only future-able operation remained is the call to
parallel_for_each(), all the rest is non-blocking preparation,
so we can drop the seastar::async and just return the future
from parallel_for_each.

The indendation is now good, as in previous patch is was prepared
just for that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2019-12-12 19:52:01 +03:00
Pavel Emelyanov
06f4f3e6d8 directories: Do touch_and_lock and verify sequentially
The goal is to drop the seastar::async() usage.

Currently we have two places that return futures -- calls to
parallel_for_each-s.  We can either chain them together or,
since both are working on the same set of directories, chain
actions inside them.

For code simplicity I propose to chain actions.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2019-12-12 19:52:01 +03:00
Pavel Emelyanov
8d0c820aa1 directories: Do touch_and_lock in parallel
The list of paths that should be touch-and-locked is already
at hands, this shortens the code and makes it slightly faster
(in theory).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2019-12-12 19:52:01 +03:00
Pavel Emelyanov
71a528d404 directories: Move the whole stuff into own .cc file
In order not to pollute the root dir place the code in
utils/ directory, "utils" namespace.

While doing this -- move the touch_and_lock from the
class declaration.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2019-12-12 19:52:01 +03:00
Pavel Emelyanov
82ef2a7730 file_lock: Work with fs::path, not sstring
The main.cc code that converts sstring to fs::path
will be patched soon, the file_desc::open belongs
to seastar and works on sstrings.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2019-12-12 17:32:10 +03:00
Dejan Mircevski
a26bd9b847 utils: Add enum_option
This allows us to accept command-line options with a predefined set of
valid arguments.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-12-09 09:45:59 -05:00
Tomasz Grabiec
aa173898d6 Merge "Named semaphores in concurrency reader, segment_manager and region_group" from Juliusz
Selected semaphores' names are now included in exception messages in
case of timeout or when admission queue overflows.

Resolves #5281
2019-12-05 14:19:56 +01:00
Juliusz Stasiewicz
430b2ad19d commitlog+region_group: timeout exceptions with names
`segment_manager' now uses a decorated version of `timed_out_error'
with hardcoded name. On the other hand `region_group' uses named
`on_request_expiry' within its `expiring_fifo'.
2019-12-03 19:07:19 +01:00
Botond Dénes
690e9d2b44 utils: introduce linearizing_input_stream
`linearizing_input_stream` allows transparently reading linearized
values from a fragmented buffer. This is done by linearizing on-the-fly
only those read values that happen to be split across multiple
fragments. This reduces the size of the largest allocation from the size
of the entire buffer (when the entire buffer is linearized) to the size
of the largest read value. This is a huge gain when the buffer contains
loads of small objects, and modest gains when the buffer contains few
large objects. But the even in the worst case the size of the largest
allocation will be less or equal compared to the case where the entire
buffer is linearized.

This stream is planned to be used as glue code between the fragmented
cell value and the collection deserialization code which expects to be
reading linearized values.
2019-12-02 10:10:31 +02:00
Botond Dénes
4054ba0c45 serialization: accept any CharOutputIterator
Not just bytes::output_iterator. Allow writing into streams other than
just `bytes`. In fact we should be very careful with writing into
`bytes` as they require potentially large contiguous allocations.

The `write()` method is now templatized also on the type of its first
argument, which now accepts any CharOutputIterator. Due to our poor
usage of namespace this now collides with `write` defined inside
`db/commitlog/commitlog.cc`. Luckily, the latter doesn't really have to
be templatized on the data type it reads from, and de-templatizing it
resolves the clash.
2019-12-02 10:10:31 +02:00
Dejan Mircevski
c43b286f35 utils: Add operator<< for big_decimal
... and remove an existing duplicate from lua.cc.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2019-11-29 15:32:09 -05:00
Pavel Solodovnikov
2f442f28af treewide: add const qualifiers throughout the code base 2019-11-26 02:24:49 +03:00
Tomasz Grabiec
fb28543116 lsa: Introduce operator bool() to occupancy_stats 2019-11-22 12:08:28 +01:00
Tomasz Grabiec
a69fda819c lsa: Expose region_impl::evictable_occupancy in the region class 2019-11-22 12:08:10 +01:00
Tomasz Grabiec
5e4abd75cc main: Abort on EBADF and ENOTSOCK by default
Those are typically symptoms of use-after-free or memory corruption in
the program. It's better to catch such error sooner than later.

That situation is also dangerous since if a valid descriptor would
land under the invalid access, not the one which was intended for the
operation, then the operation may be performed on the wrong file and
result in corruption.

Message-Id: <1565206788-31254-1-git-send-email-tgrabiec@scylladb.com>
2019-11-19 13:07:33 +02:00