15634 Commits

Author SHA1 Message Date
Piotr Sarna
a6aae369da storage_proxy: add hints manager for views
This commit adds a separate hints manager that serves
only failed materialized view updates.
2018-06-04 09:46:06 +02:00
Piotr Sarna
204bc17bd7 hints: decouple hints manager metrics from constructor
Now that more than one instance of hints manager can be present
at the same time, registering metrics is moved out of the constructor
to prevent 'registering metrics twice' errors.
2018-06-04 09:46:06 +02:00
Piotr Sarna
a791dce0ae db, config: add view_pending_updates directory
Hints for materialized view updates need to be kept somewhere,
because their dedicated hints manager has to have a root directory.
view_pending_updates directory resides in /data and is used
for that purpose.
2018-06-04 09:46:06 +02:00
Piotr Sarna
f345efc79a hints: move space_watchdog to resource manager
Space watchdog is decoupled from hints manager and moved to resource
manager, so it can be shared among different hints manager instances.
2018-06-04 09:46:01 +02:00
Piotr Sarna
ef40f7e628 hints: move send limiter to resource manager
Send limiting semaphore is moved from hints manager to resource manager.
In consequence, hints manager now keeps a reference to its resource
manager.
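
A minimal sketch of the ownership change described above, assuming a plain
counter in place of the real semaphore. The class shapes and the method names
(try_acquire, release, try_begin_send, end_send) are hypothetical and only
illustrate several hints managers drawing from one shared limiter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the resource manager: it owns the send budget
// that previously lived inside each hints manager.
class resource_manager {
    std::size_t _budget;
    std::size_t _available;
public:
    explicit resource_manager(std::size_t budget)
        : _budget(budget), _available(budget) {}

    // Takes `units` from the shared budget if they are available.
    bool try_acquire(std::size_t units) {
        if (units > _available) {
            return false;
        }
        _available -= units;
        return true;
    }

    // Returns previously acquired units to the shared budget.
    void release(std::size_t units) {
        assert(_available + units <= _budget);
        _available += units;
    }

    std::size_t available() const { return _available; }
};

// Hypothetical hints manager: it keeps a reference to its resource manager
// instead of owning a limiter of its own.
class hints_manager {
    resource_manager& _resources;
public:
    explicit hints_manager(resource_manager& rm) : _resources(rm) {}
    bool try_begin_send(std::size_t units) { return _resources.try_acquire(units); }
    void end_send(std::size_t units) { _resources.release(units); }
};
```

Two hints_manager instances built over the same resource_manager compete for
one budget, which is the sharing this commit prepares for.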
2018-06-04 09:35:58 +02:00
Piotr Sarna
2315937854 hints: move constants to resource_manager
Constants related to managing resources are moved to newly created
resource_manager class. Later, this class will be used to manage
(potentially shared) resources of hints managers.
2018-06-04 09:35:58 +02:00
Amos Kong
364c2551c8 scylla_setup: fix conditional statement of silent mode
Commit 300af65555 introduced a problem in the
conditional statement: the script always aborts in silent mode,
regardless of the return value.

Fixes #3485

Signed-off-by: Amos Kong <amos@scylladb.com>
Message-Id: <1c12ab04651352964a176368f8ee28f19ae43c68.1528077114.git.amos@scylladb.com>
2018-06-04 10:14:06 +03:00
Avi Kivity
6f2d3b7f9f Merge "Fix previous row size calculation for SSTables 3.x" from Vladimir
"
SSTables 3.x format ('m') stores the size of previous row or RT marker
inside each row/marker. That potentially allows traversing rows/markers
in reverse order.

The previous code calculating those sizes appeared to produce invalid
values for all rows except the first one. The problem with detecting
this bug was that neither Cassandra itself nor the sstabledump tool use
those values, they are simply rejected on reading.
From the UnfilteredSerializer.deserializeRowBody() method
(https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/db/rows/UnfilteredSerializer.java#L562):

            if (header.isForSSTable())
            {
                in.readUnsignedVInt(); // Skip row size
                in.readUnsignedVInt(); // previous unfiltered size
            }

So while the previous test files were technically correct in that they
contained valid data readable by Cassandra/sstabledump, they didn't
follow the format specification.

This patchset fixes the code to produce correct values and replaces
incorrect data files with correct ones. The newly generated data files
have been validated to be identical to files generated with Cassandra
using same data and timestamps as unit tests.

Tests: Unit {release}
"
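
As an illustration of what the stored sizes are for, here is a small sketch
under stated assumptions: sizes are given per row/marker in write order, the
first entry stores 0 because it has no predecessor, and exactly which bytes
count towards a "size" is left to the format. The helper names are
hypothetical and are not taken from the sstable writer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// For each row/marker, the value to store is the size of the entry written
// just before it. The first entry gets 0 (an assumption of this sketch).
std::vector<std::uint64_t> previous_sizes(const std::vector<std::uint64_t>& sizes) {
    std::vector<std::uint64_t> prev(sizes.size(), 0);
    for (std::size_t i = 1; i < sizes.size(); ++i) {
        prev[i] = sizes[i - 1];
    }
    return prev;
}

// Walks backwards from the start offset of the last entry using only the
// stored previous sizes, and returns the start offsets it visits.
std::vector<std::uint64_t> reverse_offsets(std::uint64_t last_offset,
                                           const std::vector<std::uint64_t>& prev) {
    std::vector<std::uint64_t> offsets;
    std::uint64_t offset = last_offset;
    for (std::size_t i = prev.size(); i-- > 0;) {
        offsets.push_back(offset);
        assert(prev[i] <= offset);
        offset -= prev[i];
    }
    return offsets;
}
```

If every stored value after the first one is wrong, a reverse walk like this
lands on bogus offsets, while forward readers that skip the field never notice.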

* 'projects/sstables-30/fix-prev-row_size/v1' of https://github.com/argenet/scylla:
  tests: Fix test files to use correct previous row sizes.
  sstables: Fix calculation of previous row size for SSTables 3.x
  sstables: Factor out code building promoted index blocks into separate helpers.
2018-06-03 11:38:22 +03:00
Avi Kivity
a43b3e22fc Merge "Fix clustering blocks serialization for SSTables 3.x" from Vladimir
"
This patchset contains two fixes to the clustering key prefixes
serialization logic for SSTables 3.x.

First, it fixes a vexing typo: a bitwise-and (&) has been used instead
of a remainder operator (%) for truncating the shift value.
This did not show up in existing tests because they all had non-empty
clustering columns values.
Added tests to cover empty clustering columns values.

Second, it fixes the logic of serialization to write values up to the
prefix length, not the length of the clustering key as defined by
schema. This matches the way it is done by the Origin.

There is, however, a special case where the prefix size is smaller than
that of a clustering key but we still need to serialize up to the full
size. This is the case when a compact table is being used and some
rows in it are added using incomplete clustering keys (containing null
for trailing columns).
In Cassandra, these prefixes still have a full length and missing
columns are just set to 'null'. In our code those prefixes have their
real length, but since we need to serialize beyond it, we pass a flag to
indicate this.
"

* 'projects/sstables-30/fix-clustering-blocks/v1' of https://github.com/argenet/scylla:
  tests: Add test covering compact table with non-full clustering key.
  sstables: Improve clustering blocks writing, use logical clustering prefix size.
  tests: Add test covering large clustering keys (>32 columns) for SSTables 3.x
  tests: Add unit test covering empty values in clustering key.
  sstables: Fix typo in clustering blocks write helper.
2018-06-03 11:35:49 +03:00
Avi Kivity
1071e481ed Merge "Implement support for missing columns in SSTable 3.0" from Piotr
"
Add handling for missing columns and tests for it.

There are 3 cases:
1. Number of columns in a table is smaller than 64
2. Number of columns in a table is greater than 64
2a. and less than half of all possible columns are present in sstable
2b. and at least half of all possible columns are present in sstable

Case 1 is implemented using a bit mask; a column is present if (mask & (1 << <column number>)) == 0
Case 2a is implemented by storing the list of column numbers of the present columns
Case 2b is implemented by storing the list of column numbers of the absent columns
"
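
The three cases above can be sketched as follows. This is an illustrative
encoder, not the reader added by this series: the types encoded_subset and
subset_kind and the function names are hypothetical, and treating a table with
exactly 64 columns as "large" is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class subset_kind { bitmask, present_list, absent_list };

struct encoded_subset {
    subset_kind kind = subset_kind::bitmask;
    std::uint64_t mask = 0;              // bitmask case: a set bit marks a MISSING column
    std::vector<std::uint32_t> columns;  // list cases: column numbers
};

// present[i] tells whether column i of the table appears in the sstable.
encoded_subset encode_subset(const std::vector<bool>& present) {
    const std::size_t total = present.size();
    encoded_subset out;
    if (total < 64) {
        // Case 1: one bit per column.
        for (std::size_t i = 0; i < total; ++i) {
            if (!present[i]) {
                out.mask |= std::uint64_t(1) << i;
            }
        }
        return out;
    }
    std::size_t present_count = 0;
    for (std::size_t i = 0; i < total; ++i) {
        present_count += present[i] ? 1 : 0;
    }
    // Case 2a lists present columns, case 2b lists absent ones.
    const bool sparse = 2 * present_count < total;
    out.kind = sparse ? subset_kind::present_list : subset_kind::absent_list;
    for (std::size_t i = 0; i < total; ++i) {
        if (present[i] == sparse) {
            out.columns.push_back(static_cast<std::uint32_t>(i));
        }
    }
    return out;
}

// Case 1 lookup: a column is present when its bit is clear.
bool bitmask_has_column(const encoded_subset& s, std::size_t column) {
    assert(s.kind == subset_kind::bitmask && column < 64);
    return (s.mask & (std::uint64_t(1) << column)) == 0;
}
```

Choosing whichever list is shorter keeps the encoding of a large table bounded
by half its column count.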

* 'haaawk/sstables3/read-missing-columns-v3' of ssh://github.com/scylladb/seastar-dev:
  sstables 3: add test for reading big dense subset of columns
  sstables 3: support reading big dense subsets of columns
  sstables 3: add test for reading big sparse subset of columns
  sstables 3: support reading big sparse subsets of columns
  sstables 3: add test for reading small subset of columns
  sstables 3: support reading small subsets of columns
2018-06-03 10:42:00 +03:00
Avi Kivity
78182a704b partition_snapshot_row_cursor: initialize _dummy and _continuous
Debug mode view_schema_test sometimes complains that a bool member
doesn't contain in-range values, apparently in the move constructor.

Initialize them for its benefit to avoid false-positive test
failures.
Message-Id: <20180602184934.31258-1-avi@scylladb.com>
2018-06-02 19:51:36 +01:00
Avi Kivity
187ebdbe46 auth: fix possible use of disengaged optional in has_salted_hash()
untyped_result_set_row's cell data type is bytes_opt, and the
get_blob() accessor accesses the value assuming it's engaged
(relying on the caller to call has()).

has_salted_hash() calls get_blob() without calling has() beforehand,
potentially triggering undefined behavior.

Fix by using get_or() instead, which also simplifies the caller.
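
A simplified model of the pattern, assuming std::optional in place of
bytes_opt. The row class below is hypothetical; only the names has(),
get_blob(), get_or() and has_salted_hash() come from the message above, and
their signatures here are assumptions.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a result-set row with optional cells.
class row {
    std::map<std::string, std::optional<std::string>> _cells;
public:
    void set(const std::string& name, std::optional<std::string> value) {
        _cells[name] = std::move(value);
    }

    bool has(const std::string& name) const {
        auto it = _cells.find(name);
        return it != _cells.end() && it->second.has_value();
    }

    // Precondition: has(name). Dereferencing a disengaged optional is
    // undefined behavior, so callers must check first.
    const std::string& get_blob(const std::string& name) const {
        assert(has(name));
        return *_cells.at(name);
    }

    // Safe accessor: falls back to a default when the cell is missing or null.
    std::string get_or(const std::string& name, std::string fallback) const {
        return has(name) ? *_cells.at(name) : fallback;
    }
};

// With get_or() the caller needs no separate has() check.
bool has_salted_hash(const row& r) {
    return !r.get_or("salted_hash", "").empty();
}
```

The column name "salted_hash" is a placeholder; the point is that get_or()
folds the engagement check into the accessor.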

I observed failures in Jenkins in this area. It's hard to be sure
this is the root cause, since the failures triggered an internal
consistency assertion in asan rather than an asan report. However,
the error is hard to reproduce and the fix makes sense even if it
doesn't prevent the error.

See #3480 for the asan error.

Fixes #3480 (hopefully).
Message-Id: <20180602181919.29204-1-avi@scylladb.com>
2018-06-02 19:46:32 +01:00
Piotr Jastrzebski
2fd0566eb7 sstables 3: add test for reading big dense subset of columns
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-06-02 10:41:18 +02:00
Piotr Jastrzebski
829f0c5f80 sstables 3: support reading big dense subsets of columns
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-06-02 10:41:18 +02:00
Piotr Jastrzebski
4e4972ffea sstables 3: add test for reading big sparse subset of columns
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-06-02 10:40:56 +02:00
Piotr Jastrzebski
e5fb499736 sstables 3: support reading big sparse subsets of columns
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-06-01 21:35:28 +02:00
Piotr Jastrzebski
24e9ab4ab6 sstables 3: add test for reading small subset of columns
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-06-01 21:34:03 +02:00
Piotr Jastrzebski
63d45c4f24 sstables 3: support reading small subsets of columns
A small subset contains no more than 63 elements.
Support for large subsets will come in the following
patches.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-06-01 21:33:50 +02:00
Vladimir Krivopalov
b6511d1b07 tests: Add test covering compact table with non-full clustering key.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-05-31 17:30:36 -07:00
Vladimir Krivopalov
47a7e78bc8 sstables: Improve clustering blocks writing, use logical clustering prefix size.
In the Origin, the size of the clustering key prefix used during
serialization is the actual length of the prefix and not the full size
as defined in schema. So the code is fixed to align with that logic.
This, in particular, is needed to write clustering blocks for RT
markers.

There is, however, a special case where the prefix size is smaller than
that of a clustering key but we still need to serialize up to the full
size. This is the case when a compact table is being used and some
rows in it are added using incomplete clustering keys (containing null
for trailing columns).
In Cassandra, these prefixes still have a full length and missing
columns are just set to 'null'. In our code those prefixes have their
real length, but since we need to serialize beyond it, we pass a flag to
indicate this.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-05-31 17:30:36 -07:00
Vladimir Krivopalov
3f404f19dc tests: Add test covering large clustering keys (>32 columns) for SSTables 3.x
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-05-31 17:30:36 -07:00
Vladimir Krivopalov
487796de85 tests: Add unit test covering empty values in clustering key.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-05-31 17:30:36 -07:00
Vladimir Krivopalov
0dadd4fdf3 sstables: Fix typo in clustering blocks write helper.
What was supposed to be a remainder operation turned out to be a
bitwise 'and'. This didn't show up in existing tests only because they
all had non-empty clustering values.
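
To see why such a typo can stay hidden, consider this sketch. It assumes a
block header with two flag bits per clustering column and 32 columns per
block; the enum, constants and function names are hypothetical and do not
reproduce the actual write helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class value_state { present, empty, null };

constexpr std::size_t columns_per_block = 32;

// Intended computation: position of column i inside its block.
constexpr std::size_t slot_with_remainder(std::size_t i) { return i % columns_per_block; }
// The typo: for i < 32 this is always 0, so every column maps to slot 0.
constexpr std::size_t slot_with_bitwise_and(std::size_t i) { return i & columns_per_block; }

using slot_fn = std::size_t (*)(std::size_t);

// Builds the header for a single block (at most 32 columns). Bit 2*slot
// flags an empty value, bit 2*slot+1 flags a null value.
std::uint64_t block_header(const std::vector<value_state>& states, slot_fn slot) {
    assert(states.size() <= columns_per_block);
    std::uint64_t header = 0;
    for (std::size_t i = 0; i < states.size(); ++i) {
        const std::size_t shift = 2 * slot(i);
        if (states[i] == value_state::empty) {
            header |= std::uint64_t(1) << shift;
        } else if (states[i] == value_state::null) {
            header |= std::uint64_t(1) << (shift + 1);
        }
    }
    return header;
}
```

When every value is non-empty no bit is ever set, so both variants yield a
zero header; they diverge as soon as one value is empty.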

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-05-31 15:12:40 -07:00
Avi Kivity
aab6b0ee27 Merge "Introduce new in-memory representation for cells" from Paweł
"
This is the first part of the first step of switching Scylla to the new
in-memory representation (IMR). It covers
converting cells to the new serialisation format. The actual structure
of the cells doesn't differ much from the original one with a notable
exception of the fact that large values are now fragmented and
linearisation needs to be explicit. Counters and collections still
partially rely on their old, custom serialisation code and their
handling is not optimal (although not significantly worse than it used
to be).

The new in-memory representation allows objects to be of varying size
and makes it possible to provide deserialisation context so that we
don't need to keep in each instance of an IMR type all the information
needed to interpret it. The structure of IMR types is described in C++
using some metaprogramming with the hopes of making it much easier to
modify the serialisation format than it would be in case of open-coded
serialisation functions.

Moreover, IMR types can own memory thanks to a limited support for
destructors and movers (the latter are not exactly the same thing as C++
move constructors, hence the different name). This makes it (relatively)
easy to ensure that there is an upper bound on the size of all allocations.

For now the only thing that is converted to the IMR are atomic_cells
and collections which means that the reduction in the memory footprint
is not as big as it can be, but introducing the IMR is a big step on its
own and also paves the way towards complete elimination of unbounded
memory allocations.

The first part of this patchset contains miscellaneous preparatory
changes to various parts of the Scylla codebase. They are followed by
introduction of the IMR infrastructure. Then structure of cells is
defined and all helper functions are implemented. Next are several
treewide patches that mostly deal with propagating type information to
the cell-related operations. Finally, atomic_cell and collections are
switched to use the new IMR-based cell implementation.

The IMR is described in much more detail in imr/IMR.md added in "imr:
add IMR documentation".

Refs #2031.
Refs #2409.

perf_simple_query -c4, medians of 30 results:

        ./perf_base  ./perf_imr   diff
 read     308790.08   309775.35   0.3%
 write    402127.32   417729.18   3.9%

The same with 1 byte values:
        ./perf_base1  ./perf_imr1   diff
 read      314107.26    314648.96   0.2%
 write     463801.40    433255.96  -6.6%

The memory footprint is reduced, but that is partially due to removal of
small buffer optimisation (whether it will be restored depends on the
exact measurements of the performance impact). Generally, this series was
not expected to make a huge difference as this would require converting
whole rows to the IMR.

Memory footprint:
Before:
mutation footprint:
 - in cache: 1264
 - in memtable: 986

After:
mutation footprint:
 - in cache: 1104
 - in memtable: 866

Tests: unit (release, debug)
"

* tag 'imr-cells/v3' of https://github.com/pdziepak/scylla: (37 commits)
  tests/mutation: add test for changing column type
  atomic_cell: switch to new IMR-based cell representation
  atomic_cell: explicitly state when atomic_cell is a collection member
  treewide: require type for creating collection_mutation_view
  treewide: require type for comparing cells
  atomic_cell: introduce fragmented buffer value interface
  treewide: require type to compute cell memory usage
  treewide: require type to copy atomic_cell
  treewide: require type info for copying atomic_cell_or_collection
  treewide: require type for creating atomic_cell
  atomic_cell: require column_definition for creating atomic_cell views
  tests: test imr representation of cells
  types: provide information for IMR
  data: introduce cell
  data: introduce type_info
  imr/utils: add imr object holder
  imr: introduce concepts
  imr: add helper for allocating objects
  imr: allow creating lsa migrators for IMR objects
  imr: introduce placeholders
  ...
2018-05-31 19:21:15 +03:00
Amnon Heiman
bc7503feee Scyllatop to use prometheus by default
Scylla now exposes the Prometheus API by default. This patch changes
scyllatop to use the Prometheus API; the collectd API is still available.

The main changes in the patch:
* Move collectd specific logic inside collectd.
* Add support for help information.
* Add command-line options to configure the Prometheus endpoint and to
  enable collectd.
* Add a prometheus class that collects information from Prometheus.

Fixes: #1541
Message-Id: <20180531124156.26336-1-amnon@scylladb.com>
2018-05-31 18:00:22 +03:00
Tomasz Grabiec
b5e42bc6a0 tests: row_cache: Do not hang when only one of the readers throws
Message-Id: <20180531122729.3314-1-tgrabiec@scylladb.com>
2018-05-31 18:00:22 +03:00
Piotr Sarna
360326fdc5 cql3: add compatibility with libjsoncpp < 1.6.0
Only libjsoncpp >= 1.6.0 offers a safe name() method for value
iterators. For older versions, deprecated memberName() is used
instead. Note that memberName() was deprecated because of its
inability to deal with embedded null characters.
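
The embedded-null limitation can be shown without jsoncpp at all. The sketch
below only models the difference between handing a key out as a C string and
as a std::string; the two helper functions are hypothetical and do not call
the library.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Models an accessor that hands the key out as a NUL-terminated C string:
// the consumer cannot see anything past the first '\0'.
std::string key_via_c_string(const std::string& key) {
    const char* raw = key.c_str();
    return std::string(raw, std::strlen(raw));
}

// Models an accessor that returns a string object carrying its own length.
std::string key_via_string(const std::string& key) {
    return key;
}
```

A key such as "a\0b" (three bytes) comes back as "a" through the first helper
and intact through the second.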

Fixes #3471

Message-Id: <e64a62bfc24ef06daee238d79d557fe6ec8979d3.1527758708.git.sarna@scylladb.com>
2018-05-31 18:00:22 +03:00
Paweł Dziepak
131a47dea3 tests/mutation: add test for changing column type
With the introduction of the new in-memory representation changing
column type has become a more complex operation since it needs to handle
switch from fixed-size to variable-size types. This commit adds an
explicit test for such cases.
2018-05-31 15:51:11 +01:00
Paweł Dziepak
a040d37cd5 atomic_cell: switch to new IMR-based cell representation
This patch changes the implementation of atomic_cell and
atomic_cell_or_collection to use the data::cell implementation which is
based on the new in-memory representation infrastructure.
2018-05-31 15:51:11 +01:00
Paweł Dziepak
0ea6d14cf5 atomic_cell: explicitly state when atomic_cell is a collection member
Collections are not going to be fully converted to the IMR just yet and
still use the old serialisation format. This means that they still don't
support fragmented values very well. This patch passes the information
when an atomic_cell is created as a member of a collection so that later
we can avoid fragmenting the value in such cases.
2018-05-31 15:51:11 +01:00
Paweł Dziepak
e34ff8b4bf treewide: require type for creating collection_mutation_view
2018-05-31 15:51:11 +01:00
Paweł Dziepak
9bb1f10bb6 treewide: require type for comparing cells
2018-05-31 15:51:11 +01:00
Paweł Dziepak
aa25f0844f atomic_cell: introduce fragmented buffer value interface
As a preparation for the switch to the new cell representation this
patch changes the type returned by atomic_cell_view::value() to one that
requires explicit linearisation of the cell value. Even though the value
is still implicitly linearised (and only when managed by the LSA) the
new interface is the same as the target one so that no more changes to
its users will be needed.
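
What "explicit linearisation" means for callers can be sketched like this.
The class fragmented_value_view and its methods are hypothetical; the real
type returned by atomic_cell_view::value() is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical view over a value that may be stored in several fragments.
class fragmented_value_view {
    std::vector<std::string_view> _fragments;
public:
    explicit fragmented_value_view(std::vector<std::string_view> fragments)
        : _fragments(std::move(fragments)) {}

    bool is_fragmented() const { return _fragments.size() > 1; }

    std::size_t size() const {
        std::size_t total = 0;
        for (std::string_view f : _fragments) {
            total += f.size();
        }
        return total;
    }

    // Explicit linearisation: the caller asks for one contiguous copy and
    // pays for it visibly, instead of getting it implicitly.
    std::string linearize() const {
        std::string out;
        out.reserve(size());
        for (std::string_view f : _fragments) {
            out.append(f.data(), f.size());
        }
        return out;
    }
};
```

Code that only needs the size or iterates over fragments never triggers the
copy; code that needs contiguous bytes calls linearize() and says so.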
2018-05-31 15:51:11 +01:00
Paweł Dziepak
ec9d166a4f treewide: require type to compute cell memory usage
2018-05-31 15:51:11 +01:00
Paweł Dziepak
418c159057 treewide: require type to copy atomic_cell
2018-05-31 15:51:11 +01:00
Paweł Dziepak
27014a23d7 treewide: require type info for copying atomic_cell_or_collection
2018-05-31 15:51:11 +01:00
Paweł Dziepak
e9d6fc48ac treewide: require type for creating atomic_cell
2018-05-31 15:51:11 +01:00
Paweł Dziepak
93130e80fb atomic_cell: require column_definition for creating atomic_cell views
2018-05-31 15:51:11 +01:00
Paweł Dziepak
b25cc61a13 tests: test imr representation of cells
2018-05-31 15:51:11 +01:00
Paweł Dziepak
43b216b43d types: provide information for IMR
2018-05-31 15:51:11 +01:00
Paweł Dziepak
eec33fda14 data: introduce cell
This commit introduces cell serializers and views based on the in-memory
representation infrastructure. The code doesn't assume anything about
how the cells are stored, they can be either a part of another IMR
object (once the rows are converted to the IMR) or separate objects
(just like current atomic_cell).
2018-05-31 15:51:11 +01:00
Duarte Nunes
f8626c7c93 tests/view_schema_test: Test view correctness under base schema changes
Reproducer for #3443.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180530194536.51202-2-duarte@scylladb.com>
2018-05-31 12:10:50 +03:00
Duarte Nunes
c4f267bdfe database: Refresh view dependent fields when altering base
A view schema's view_info contains the id of the base regular column
that view includes in its primary key. Since the column id of a
particular column can potentially change with a new schema version, we
need to refresh the stored column id. We weren't doing that when
unselected base columns are added, and this patch fixes it by
triggering an update of the view schema when base columns are added
and the view contains a base regular column in its PK.

Fixes #3443

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180530194536.51202-1-duarte@scylladb.com>
2018-05-31 12:10:49 +03:00
Paweł Dziepak
544b3c9a34 data: introduce type_info
This patch introduces type_info class which contains all type
information needed by IMR deserialisation contexts.
2018-05-31 10:09:01 +01:00
Paweł Dziepak
4929c1f39a imr/utils: add imr object holder
imr::object<> is an owning pointer to an IMR object. It is LSA-aware.
2018-05-31 10:09:01 +01:00
Paweł Dziepak
fd47858755 imr: introduce concepts
This commit adds type traits and concepts for sizers, serializers and
writers that help explicitly specify requirements of various interfaces.
2018-05-31 10:09:01 +01:00
Paweł Dziepak
28ea36a686 imr: add helper for allocating objects
IMR objects may own memory. object_allocator takes care of allocating
memory for all owned objects during the serialisation of their owner.

In practice a writer of the parent object would accept a helper object
created by object_allocator. That helper object would be either
responsible for computing the size of buffers that have to be allocated
or perform the actual serialisation in the same two phase manner as it
is done for the parent IMR object.
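
The two-phase manner mentioned above can be illustrated generically. This
sketch does not model object_allocator itself; the sizer, serializer and
write_value names are hypothetical and only show one writer being run first
to measure and then to write.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Phase 1 sink: only counts how many bytes the writer would produce.
class sizer {
    std::size_t _size = 0;
public:
    void write(const void*, std::size_t n) { _size += n; }
    std::size_t size() const { return _size; }
};

// Phase 2 sink: writes into a buffer sized from the phase 1 result.
class serializer {
    std::vector<std::uint8_t>& _out;
    std::size_t _pos = 0;
public:
    explicit serializer(std::vector<std::uint8_t>& out) : _out(out) {}
    void write(const void* data, std::size_t n) {
        assert(_pos + n <= _out.size());
        std::memcpy(_out.data() + _pos, data, n);
        _pos += n;
    }
};

// The same writer runs in both phases, so size and content cannot disagree.
template <typename Sink>
void write_value(Sink& sink, std::uint32_t id, const std::string& payload) {
    sink.write(&id, sizeof(id));
    sink.write(payload.data(), payload.size());
}

std::vector<std::uint8_t> serialize_value(std::uint32_t id, const std::string& payload) {
    sizer measure;
    write_value(measure, id, payload);
    std::vector<std::uint8_t> buffer(measure.size());  // single exact allocation
    serializer emit(buffer);
    write_value(emit, id, payload);
    return buffer;
}
```

Knowing the size before writing is what lets all memory be allocated up front,
with no reallocation during serialisation.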
2018-05-31 10:09:01 +01:00
Paweł Dziepak
79941f2fc7 imr: allow creating lsa migrators for IMR objects
This patch introduces helpers for creating LSA migrators from IMR
deserialisation contexts and context factories.
2018-05-31 10:09:01 +01:00
Paweł Dziepak
5ddb118c78 imr: introduce placeholders
In some cases the actual value of an IMR object is not known at
serialisation time. If the type is fixed-size we can use a placeholder
to defer writing it to a more convenient moment.
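
A minimal sketch of the placeholder idea, assuming a byte vector as the
output and a 32-bit length as the deferred value. The names u32_placeholder,
reserve_u32 and write_length_prefixed are hypothetical and are not the IMR
interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical placeholder for a fixed-size value: it remembers where the
// value belongs so that it can be written later.
class u32_placeholder {
    std::size_t _offset;
public:
    explicit u32_placeholder(std::size_t offset) : _offset(offset) {}
    void fill(std::vector<std::uint8_t>& buf, std::uint32_t value) const {
        assert(_offset + sizeof(value) <= buf.size());
        std::memcpy(buf.data() + _offset, &value, sizeof(value));
    }
};

// Reserves room for the value now and returns a handle for filling it in.
u32_placeholder reserve_u32(std::vector<std::uint8_t>& buf) {
    u32_placeholder p(buf.size());
    buf.resize(buf.size() + sizeof(std::uint32_t));
    return p;
}

// The length is only known after the payload is written, so it is deferred.
std::vector<std::uint8_t> write_length_prefixed(const std::vector<std::uint8_t>& payload) {
    std::vector<std::uint8_t> buf;
    u32_placeholder length = reserve_u32(buf);
    buf.insert(buf.end(), payload.begin(), payload.end());
    length.fill(buf, static_cast<std::uint32_t>(buf.size() - sizeof(std::uint32_t)));
    return buf;
}
```

This only works because the deferred value has a fixed size: the space it
occupies is known before the value is.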
2018-05-31 10:09:01 +01:00
Paweł Dziepak
8c38f09fbc tests/imr: add tests for destructor and mover methods
2018-05-31 10:09:01 +01:00