Commit Graph

24538 Commits

Author SHA1 Message Date
Michał Chojnowski
2b3d2c193d types: serialize sets to bytes_ostream
Avoids linearization by serializing to a fragmented type.
It's still linearized at the very end; this will be changed in the near future.
2020-12-07 17:47:49 +01:00
Michał Chojnowski
35823d12db types: serialize maps to bytes_ostream
Avoids linearization by serializing to a fragmented type.
It's still linearized at the very end; this will be changed in the near future.
2020-12-07 17:47:12 +01:00
Michał Chojnowski
60a3cecfea utils: fragment_range: use range-based for loop instead of boost::for_each
We want to pass bytes_ostream to this loop in later commits.
bytes_ostream does not conform to some boost concepts required by
boost::for_each, so let's just use C++'s native loop.
2020-12-07 12:50:36 +01:00
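As a rough illustration of the commit above: a range-based for loop needs only `begin()` and `end()`, so it accepts types that do not model the Boost range concepts. The `fragment_list` class and `total_size` function below are hypothetical stand-ins for illustration; they are not Scylla's `bytes_ostream` or its `fragment_range` helpers.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical fragmented buffer. It exposes only begin()/end(),
// which is all a range-based for loop requires.
class fragment_list {
    std::vector<std::string_view> _frags;
public:
    explicit fragment_list(std::vector<std::string_view> frags)
        : _frags(std::move(frags)) {}
    auto begin() const { return _frags.begin(); }
    auto end() const { return _frags.end(); }
};

// Sums the sizes of all fragments using the language's native loop.
template <typename Range>
std::size_t total_size(const Range& range) {
    std::size_t n = 0;
    for (std::string_view frag : range) {
        n += frag.size();
    }
    return n;
}
```

The same `total_size` template works for a plain `std::vector<std::string_view>` and for the custom class, since neither has to satisfy any library-specific concept.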
Michał Chojnowski
1fe7490970 types: add write_collection_value() overload for bytes_ostream and value_view
We will use it to serialize collections to bytes_ostream in serialize_for_cql().
2020-12-07 08:48:31 +01:00
Nadav Har'El
0cd05dd0fd cql-pytest: add tests for ALLOW FILTERING
The original goal of this patch was to replace the two single-node dtests
allow_filtering_test and allow_filtering_secondary_indexes_test, which
recently caused us problems when we wanted to change the ALLOW FILTERING
behavior but the tests were outside the tree. I'm hoping that after this
patch, those two tests could be removed from dtest.

But this patch actually tests more cases than those original dtests, and
moreover tests not just whether ALLOW FILTERING is required or not, but
also that the results of the filtering are correct.

Currently, four of the included tests are expected to fail ("xfail") on
Scylla, reproducing two issues:

1. Refs #5545:
   "WHERE x IN ..." on indexed column x wrongly requires ALLOW FILTERING
2. Refs #7608:
   "WHERE c=1" on clustering key c should require ALLOW FILTERING, but
   doesn't.

All tests, except the one for issue #5545, pass on Cassandra. That one
fails on Cassandra because it doesn't support IN on an indexed column at all
(regardless of whether ALLOW FILTERING is used or not).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20201115124631.1224888-1-nyh@scylladb.com>
2020-12-06 19:51:25 +02:00
Pavel Solodovnikov
56c0fcfcb2 cql_query_test: handle bounce_to_shard msg in test_null_value_tuple_floating_types_and_uuids
Use `prepared_on_shard` helper function to handle `bounce_to_shard`
messages that can happen when using LWT statements.

Fixes: #7757
Tests: unit(dev)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20201204172944.601730-1-pa.solodovnikov@scylladb.com>
2020-12-06 19:34:13 +02:00
Amos Kong
6b1659ee80 schema.cc/describe: fix invalid compaction options in schema
The schema.cql file written with a snapshot has a typo: a comma is missing
after the compaction strategy class. Restoring the schema from that file fails.

    AND compaction = {'class': 'SizeTieredCompactionStrategy''max_compaction_threshold': '32'}

The map_as_cql_param() function has a `first` parameter that controls whether
a leading comma is added. Since compaction_strategy_options never comes first,
the comma is always needed.
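
The role of the `first` flag can be sketched as follows. This is a minimal illustration under stated assumptions: `append_map_as_cql_param` and `describe_compaction` are hypothetical names, and the signatures and use of `std::map` are not taken from Scylla's actual `map_as_cql_param()`.

```cpp
#include <map>
#include <string>

// Hypothetical helper, loosely modelled on the description above.
// Appends "'key': 'value'" pairs to `out`, separated by ", ".
// `first` says whether the next pair is the first item inside the braces;
// if it is not, a separator is written before it.
void append_map_as_cql_param(std::string& out,
                             const std::map<std::string, std::string>& m,
                             bool first = true) {
    for (const auto& [key, value] : m) {
        if (!first) {
            out += ", ";
        }
        first = false;
        out += "'" + key + "': '" + value + "'";
    }
}

// Builds the compaction clause. The strategy class is always written first,
// so the options are appended with first = false to get the comma.
std::string describe_compaction(const std::string& strategy,
                                const std::map<std::string, std::string>& options) {
    std::string out = "compaction = {'class': '" + strategy + "'";
    append_map_as_cql_param(out, options, /*first=*/false);
    out += "}";
    return out;
}
```

Calling the helper with `first = true` after the class entry would reproduce the missing-comma output shown above.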

Fixes #7741

Signed-off-by: Amos Kong <amos@scylladb.com>

Closes #7734
2020-12-06 17:40:05 +02:00
Avi Kivity
ca950e6f08 Merge "Remove get_local_storage_service() from counters" from Pavel E
"
The storage service is called there to get the cached value
of db::system_keyspace::get_local_host_id(). Keeping the value
on database decouples it from storage service and kills one
more global storage service reference.

tests: unit(dev)
"

* 'br-remove-storage-service-from-counters-2' of https://github.com/xemul/scylla:
  counters: Drop call to get_local_storage_service and related
  counters: Use local id arg in transform_counter_update_to_shards
  database: Have local id arg in transform_counter_updates_to_shards()
  storage_service: Keep local host id to database
2020-12-06 16:15:21 +02:00
Avi Kivity
6e460e121a Merge 'docs: Add Sphinx and ScyllaDB theme' from David Garcia
This PR adds the Sphinx documentation generator and the custom theme ``sphinx-scylladb-theme``. Once merged, the GitHub Actions workflow should automatically publish the developer notes stored under the ``docs`` directory to http://scylladb.github.io/scylla

1. Run the command ``make preview`` from the ``docs`` directory.
2. Check the terminal where you executed the previous command. It should not raise warnings.
3. Open http://127.0.0.1:5500/ in a new browser tab to see the generated documentation pages.

The table of contents displays the files sorted as they appear on GitHub. In a subsequent iteration, @lauranovich and I will submit an additional PR proposing a new folder organization structure.

Closes #7752

* github.com:scylladb/scylla:
  docs: fixed warnings
  docs: added theme
2020-12-06 15:26:57 +02:00
Benny Halevy
64a4ffc579 large_data_handler: do not delete records in the absence of large_data_stats
The previous way of deleting records based on the whole
sstable data_size causes overzealous deletions (#7668)
and inefficiency in the rows cache due to the large number
of range tombstones created.

Therefore we'd be better off just letting the
records expire using the 30-day TTL.

Test: unit(dev)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20201206083725.1386249-1-bhalevy@scylladb.com>
2020-12-06 11:34:37 +02:00
Avi Kivity
dc77d128e9 Revert "Merge "raft: fix replication if existing log on leader" from Gleb"
This reverts commit 0aa1f7c70a, reversing
changes made to 72c59e8000. The diff is
strange, including unrelated commits. There is no understanding of the
cause, so to be safe, revert and try again.
2020-12-06 11:34:19 +02:00
Pavel Emelyanov
df0e26035f counters: Drop call to get_local_storage_service and related
The local host id is now passed by argument, so we don't
need the counter_id::local() and some other methods that
call or are called by it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-12-04 16:31:12 +03:00
Pavel Emelyanov
914613b3c3 counters: Use local id arg in transform_counter_update_to_shards
Only a few places in it need the uuid. And since it's only 16 bytes,
it's possible to safely capture it by value in the called lambdas.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-12-04 16:30:31 +03:00
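The capture-by-value reasoning in the commit above can be sketched like this. The `host_id` struct and `make_is_local` function are hypothetical and stand in for the real uuid type and the counter code; only the 16-byte size comes from the commit message.

```cpp
#include <array>
#include <cstdint>
#include <functional>

// Hypothetical 16-byte identifier standing in for the local host id.
struct host_id {
    std::array<std::uint8_t, 16> bytes{};
    bool operator==(const host_id& o) const { return bytes == o.bytes; }
};

static_assert(sizeof(host_id) == 16, "small enough to copy into a closure");

// Returns a callback that checks whether an id is the local one.
// The id is captured by value, so the callback holds its own copy and
// stays valid even if the caller's object changes or goes out of scope.
std::function<bool(const host_id&)> make_is_local(host_id local) {
    return [local](const host_id& candidate) { return candidate == local; };
}
```

Capturing by reference instead would tie the lambda's lifetime to the caller's variable, which is the hazard the by-value copy avoids.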
Pavel Emelyanov
62214e2258 database: Have local id arg in transform_counter_updates_to_shards()
There are two places that call it -- database code itself and
tests. The former already has the local host id, so just pass
one.

The latter are a bit trickier. Currently they use the value from
storage_service created by storage_service_for_tests, but since
this version of service doesn't pass through prepare_to_join()
the local_host_id value there is default-initialized, so just
default-initialize the needed argument in place.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-12-04 15:09:30 +03:00
Pavel Emelyanov
5a286ee8d4 storage_service: Keep local host id to database
The value in question is cached from db::system_keyspace
for places that want to have it without waiting for
futures. So far the only place is database counters code,
so keep the value on database itself. Next patches will
make use of it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-12-04 15:09:29 +03:00
Piotr Sarna
2015988373 Merge 'types: get rid of linearization in deserialize()' from Michał Chojnowski
Citing #6138:
> In the past few years we have converted most of our codebase to work in terms
> of fragmented buffers, instead of linearised ones, to help avoid large
> allocations that put large pressure on the memory allocator.
> One prominent component that still works exclusively in terms of linearised
> buffers is the types hierarchy, more specifically the de/serialization code
> to/from CQL format. Note that for most types, this is the same as our internal
> format, notable exceptions are non-frozen collections and user types.
>
> Most types are expected to contain reasonably small values, but texts, blobs
> and especially collections can get very large. Since the entire hierarchy
> shares a common interface we can either transition all or none to work with
> fragmented buffers.

This series gets rid of intermediate linearizations in deserialization. The next
steps are removing linearizations from serialization, validation and comparison
code.

Series summary:
- Fix a bug in `fragmented_temporary_buffer::view::remove_prefix`. (Discovered
  while testing. Since it wasn't discovered earlier, I guess it doesn't occur in
  any code path in master.)
- Add a `FragmentedView` concept to allow uniform handling of various types of
  fragmented buffers (`bytes_view`, `fragmented_temporary_buffer::view`,
  `ser::buffer_view` and likely `managed_bytes_view` in the future).
- Implement `FragmentedView` for relevant fragmented buffer types.
- Add helper functions for reading from `FragmentedView`.
- Switch `deserialize()` and all its helpers from `bytes_view` to
  `FragmentedView`.
- Remove `with_linearized()` calls which just became unnecessary.
- Add an optimization for single-fragment cases.

The addition of `FragmentedView` might be controversial, because another concept
meant for the same purpose - `FragmentRange` - is already used. Unfortunately,
it lacks the functionality we need. The main (only?) thing we want to do with a
fragmented buffer is to extract a prefix from it and `FragmentRange` gives us no
way to do that, because it's immutable by design. We can work around that by
wrapping it into a mutable view which will track the offset into the immutable
`FragmentRange`, and that's exactly what `linearizing_input_stream` is. But it's
wasteful. `linearizing_input_stream` is a heavy type, unsuitable for passing
around as a view - it stores a pair of fragment iterators, a fragment view and a
size (11 words) to conform to the iterator-based design of `FragmentRange`, when
one fragment iterator (4 words) already contains all needed state, just hidden.
I suggest we replace `FragmentRange` with `FragmentedView` (or something
similar) altogether.
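
The argument above, that the main operation on a fragmented buffer is extracting a prefix from a mutable view, can be sketched as follows. This is an illustration under assumptions, not the `FragmentedView` concept from the series: `simple_fragmented_view` and `read_prefix` are hypothetical names, and fragments are modelled as `std::string_view`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical mutable view over a sequence of fragments (not owned).
class simple_fragmented_view {
    const std::vector<std::string_view>* _frags;
    std::size_t _index = 0;   // index of the current fragment
    std::size_t _offset = 0;  // bytes already consumed from that fragment
    std::size_t _size = 0;    // total bytes remaining in the view

    // Moves past fragments that have nothing left to read.
    void skip_empty() {
        while (_index < _frags->size() && (*_frags)[_index].size() == _offset) {
            ++_index;
            _offset = 0;
        }
    }
public:
    explicit simple_fragmented_view(const std::vector<std::string_view>& frags)
        : _frags(&frags) {
        for (std::string_view f : frags) {
            _size += f.size();
        }
        skip_empty();
    }
    std::size_t size_bytes() const { return _size; }
    bool empty() const { return _size == 0; }
    // Unread part of the current fragment; empty only when the view is empty.
    std::string_view current_fragment() const {
        return empty() ? std::string_view() : (*_frags)[_index].substr(_offset);
    }
    // Drops the first n bytes, crossing fragment boundaries as needed.
    void remove_prefix(std::size_t n) {
        assert(n <= _size);
        _size -= n;
        while (n > 0) {
            std::size_t step = std::min(n, (*_frags)[_index].size() - _offset);
            _offset += step;
            n -= step;
            skip_empty();
        }
    }
};

// Copies the first n bytes out of the view and consumes them.
std::string read_prefix(simple_fragmented_view& v, std::size_t n) {
    assert(n <= v.size_bytes());
    std::string out;
    out.reserve(n);
    while (n > 0) {
        std::string_view frag = v.current_fragment();
        std::size_t step = std::min(n, frag.size());
        out.append(frag.substr(0, step));
        v.remove_prefix(step);
        n -= step;
    }
    return out;
}
```

The view carries its own position, so callers can pass it by value and consume prefixes without wrapping an immutable range in a separate stream object.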

Refs: #6138

Closes #7692

* github.com:scylladb/scylla:
  types: collection: add an optimization for single-fragment buffers in deserialize
  types: add an optimization for single-fragment buffers in deserialize
  cql3: tuples: don't linearize in in_value::from_serialized
  cql3: expr: expression: replace with_linearize with linearized
  cql3: constants: remove unneeded uses of with_linearized
  cql3: update_parameters: don't linearize in prefetch_data_builder::add_cell
  cql3: lists: remove unneeded use of with_linearized
  query-result-set: don't linearize in result_set_builder::deserialize
  types: remove unneeded collection deserialization overloads
  types: switch collection_type_impl::deserialize from bytes_view to FragmentedView
  cql3: sets: don't linearize in value::from_serialized
  cql3: lists: don't linearize in value::from_serialized
  cql3: maps: don't linearize in value::from_serialized
  types: remove unused deserialize_aux
  types: deserialize: don't linearize tuple elements
  types: deserialize: don't linearize collection elements
  types: switch deserialize from bytes_view to FragmentedView
  types: deserialize tuple types from FragmentedView
  types: deserialize set type from FragmentedView
  types: deserialize map type from FragmentedView
  types: deserialize list type from FragmentedView
  types: add FragmentedView versions of read_collection_size and read_collection_value
  types: deserialize varint type from FragmentedView
  types: deserialize floating point types from FragmentedView
  types: deserialize decimal type from FragmentedView
  types: deserialize duration type from FragmentedView
  types: deserialize IP address types from FragmentedView
  types: deserialize uuid types from FragmentedView
  types: deserialize timestamp type from FragmentedView
  types: deserialize simple date type from FragmentedView
  types: deserialize time type from FragmentedView
  types: deserialize boolean type from FragmentedView
  types: deserialize integer types from FragmentedView
  types: deserialize string types from FragmentedView
  types: remove unused read_simple_opt
  types: implement read_simple* versions for FragmentedView
  utils: fragmented_temporary_buffer: implement FragmentedView for view
  utils: fragment_range: add single_fragmented_view
  serializer: implement FragmentedView for buffer_view
  utils: fragment_range: add linearized and with_linearized for FragmentedView
  utils: fragment_range: add FragmentedView
  utils: fragmented_temporary_buffer: fix view::remove_prefix
2020-12-04 09:46:20 +01:00
Michał Chojnowski
a1f7fabb3d types: collection: add an optimization for single-fragment buffers in deserialize
Helpers parametrized with single_fragmented_view should compile to better code,
so let's use them when possible.
2020-12-04 09:21:05 +01:00
Michał Chojnowski
08c394726e types: add an optimization for single-fragment buffers in deserialize
Values usually come in a single fragment, but we pay the cost of fragmented
deserialization nevertheless: bigger view objects (4 words instead of 2 words),
more state to keep updated (i.e. total view size in addition to current fragment
size) and more branches.

This patch adds a special case for single-fragment buffers to
abstract_type::deserialize. They are converted to a single_fragmented_view
before doing anything else. Templates instantiated with single_fragmented_view
should compile to better code than their multi-fragmented counterparts. If
abstract_type::deserialize is inlined, this patch should completely prevent any
performance penalties for switching from with_linearized to fragmented
deserialization.
2020-12-04 09:19:39 +01:00
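The dispatch described in the commit above can be sketched as follows: one templated helper, instantiated once for a cheap single-fragment view and once for a multi-fragment view, with the entry point choosing between them. All names here (`single_view`, `multi_view`, `read_be32`, `deserialize_u32`) are hypothetical; they are not the `single_fragmented_view` or `abstract_type::deserialize` from the source tree.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string_view>
#include <vector>

// Hypothetical view over exactly one contiguous fragment: pointer and size only.
struct single_view {
    std::string_view data;
    std::size_t size_bytes() const { return data.size(); }
    std::string_view current_fragment() const { return data; }
    void remove_prefix(std::size_t n) { data.remove_prefix(n); }
};

// Hypothetical view over several fragments: more state to keep updated.
class multi_view {
    const std::string_view* _cur;
    const std::string_view* _end;
    std::string_view _frag;   // unread part of the current fragment
    std::size_t _total = 0;   // bytes remaining across all fragments

    // Advances to the next non-empty fragment, if any.
    void settle() {
        while (_frag.empty() && _cur != _end && ++_cur != _end) {
            _frag = *_cur;
        }
    }
public:
    explicit multi_view(const std::vector<std::string_view>& frags)
        : _cur(frags.data()), _end(frags.data() + frags.size()) {
        for (std::string_view f : frags) {
            _total += f.size();
        }
        if (_cur != _end) {
            _frag = *_cur;
        }
        settle();
    }
    std::size_t size_bytes() const { return _total; }
    std::string_view current_fragment() const { return _frag; }
    // n must not exceed current_fragment().size().
    void remove_prefix(std::size_t n) {
        _frag.remove_prefix(n);
        _total -= n;
        settle();
    }
};

// Reads a big-endian 32-bit integer; works for either view type.
template <typename View>
std::uint32_t read_be32(View v) {
    if (v.size_bytes() < 4) {
        throw std::out_of_range("buffer too short");
    }
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        result = (result << 8) | static_cast<unsigned char>(v.current_fragment().front());
        v.remove_prefix(1);
    }
    return result;
}

// Entry point: take the cheaper instantiation when there is one fragment.
std::uint32_t deserialize_u32(const std::vector<std::string_view>& frags) {
    if (frags.size() == 1) {
        return read_be32(single_view{frags.front()});
    }
    return read_be32(multi_view(frags));
}
```

Both paths return the same value; the single-fragment path simply avoids the fragment-switching bookkeeping.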
Michał Chojnowski
f75db1fcf5 cql3: tuples: don't linearize in in_value::from_serialized
We can deserialize directly from fragmented buffers now.
2020-12-04 09:19:39 +01:00
Michał Chojnowski
68177a6721 cql3: expr: expression: replace with_linearize with linearized
with_linearized creates an additional internal `bytes` when the input is
fragmented. linearized copies the data directly to the output `bytes`, so it's
more efficient.
2020-12-04 09:19:39 +01:00
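The difference described in the commit above can be sketched roughly as follows. The functions below are simplified, hypothetical versions that reuse the names `linearized` and `with_linearized` for readability; their signatures are assumptions and do not match Scylla's helpers, and fragments are modelled as `std::string_view`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

using fragments = std::vector<std::string_view>;

// Copies every fragment once, straight into the returned buffer.
std::string linearized(const fragments& frags) {
    std::size_t total = 0;
    for (std::string_view f : frags) {
        total += f.size();
    }
    std::string out;
    out.reserve(total);
    for (std::string_view f : frags) {
        out.append(f);
    }
    return out;
}

// Calls func with a contiguous view of the data. A single fragment is passed
// through without copying; otherwise a temporary buffer is built first.
template <typename Func>
auto with_linearized(const fragments& frags, Func&& func) {
    if (frags.size() == 1) {
        return func(frags.front());
    }
    std::string tmp = linearized(frags);
    return func(std::string_view(tmp));
}

// Producing an owned buffer this way copies fragmented input twice:
// fragments -> temporary -> result.
std::string to_bytes_via_with_linearized(const fragments& frags) {
    return with_linearized(frags, [](std::string_view v) { return std::string(v); });
}
```

When the caller wants an owned buffer anyway, calling `linearized` directly skips the intermediate temporary.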
Michał Chojnowski
5ffe40d5a2 cql3: constants: remove unneeded uses of with_linearized
We can deserialize directly from fragmented buffers now.
2020-12-04 09:19:39 +01:00
Michał Chojnowski
3c98806df9 cql3: update_parameters: don't linearize in prefetch_data_builder::add_cell
We can deserialize directly from fragmented buffers now.
2020-12-04 09:19:39 +01:00
Michał Chojnowski
c43ef3951b cql3: lists: remove unneeded use of with_linearized
We can deserialize directly from fragmented buffers now.
2020-12-04 09:19:39 +01:00
Michał Chojnowski
0d5c5b8645 query-result-set: don't linearize in result_set_builder::deserialize
We can deserialize directly from fragmented buffers now.
2020-12-04 09:19:39 +01:00
Michał Chojnowski
04786dee30 types: remove unneeded collection deserialization overloads
Inherit the method from base class rather than reimplementing it in every child.
2020-12-04 09:19:39 +01:00
Michał Chojnowski
c08419e28d types: switch collection_type_impl::deserialize from bytes_view to FragmentedView
Devirtualizes collection_type_impl::deserialize (so it can be templated) and
adds a FragmentedView overload. This will allow us to deserialize collections
with explicit cql_serialization_format directly from fragmented buffers.
2020-12-04 09:19:37 +01:00
dgarcia360
1304f6a0bb docs: fixed warnings
2020-12-03 17:40:34 +01:00
dgarcia360
a340b46a79 docs: added theme
2020-12-03 17:37:18 +01:00
Michał Chojnowski
d731b34d95 cql3: sets: don't linearize in value::from_serialized
We can deserialize directly from fragmented buffers now.
2020-12-03 10:57:07 +01:00
Michał Chojnowski
64e64fd2b3 cql3: lists: don't linearize in value::from_serialized
We can deserialize directly from fragmented buffers now.
2020-12-03 10:57:07 +01:00
Michał Chojnowski
536a2f8c8d cql3: maps: don't linearize in value::from_serialized
We can deserialize directly from fragmented buffers now.
2020-12-03 10:57:07 +01:00
Michał Chojnowski
58d9f52363 types: remove unused deserialize_aux
Dead code.
2020-12-03 10:57:07 +01:00
Michał Chojnowski
8440279130 types: deserialize: don't linearize tuple elements
We can deserialize directly from fragmented buffers now.
2020-12-03 10:57:07 +01:00
Michał Chojnowski
a216b0545f types: deserialize: don't linearize collection elements
We can deserialize directly from fragmented buffers now.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
1ccdfc7a90 types: switch deserialize from bytes_view to FragmentedView
The final part of the transition of deserialize from bytes_view to
FragmentedView.
Adds a FragmentedView overload to abstract_type::deserialize and
switches deserialize_visitor from bytes_view to FragmentedView, allowing
deserialization of all types with no intermediate linearization.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
898cea4cde types: deserialize tuple types from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
507883f808 types: deserialize set type from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
9b211a7285 types: deserialize map type from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
5f1939554c types: deserialize list type from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
ad7ab73cd0 types: add FragmentedView versions of read_collection_size and read_collection_value
We will need those to deserialize collections from FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
495bf5c431 types: deserialize varint type from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
0f8ad89740 types: deserialize floating point types from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
0bb0291e50 types: deserialize decimal type from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
760bc5fd60 types: deserialize duration type from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
75a56f439b types: deserialize IP address types from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
9f668929db types: deserialize uuid types from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
3e1a24ca0d types: deserialize timestamp type from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
a4bc43ab19 types: deserialize simple date type from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
24bd986aea types: deserialize time type from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00
Michał Chojnowski
c03ad52513 types: deserialize boolean type from FragmentedView
A part of the transition of deserialize from bytes_view to FragmentedView.
2020-12-03 10:57:06 +01:00