In bd6855e, we reverted to Boost ranges and commented out the concept
check. But Boost has its own concept check, which this patch enables.
Tests: unit (dev)
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Closes#7471
Currently, we linearize large UTF8 cells in order to validate them.
This can cause large latency spikes if the cell is large.
This series changes UTF8 validation to work on fragmented buffers.
This is somewhat tricky since the validation routines are optimized
for single-instruction-multiple-data (SIMD) architectures.
The unit tests are expanded to cover the new functionality.
Fixes#7448.
Closes#7449
* github.com:scylladb/scylla:
types: don't linearize utf8 for validation
test: utf8: add fragmented buffer validation tests
utils: utf8: add function to validate fragmented buffers
utils: utf8: expose validate_partial() in a header
utils: utf8: introduce validate_partial()
utils: utf8: extract a function to evaluate a single codepoint
This PR purpose is to handle schema integrity issues that can arise in races involving materialized views.
The possibility of such integrity issues was found in #7420 , where a view schema was used for reading without
it's _base_info member initialized resulting in a segfault.
We handle this doing 3 things:
1. First guard against using an uninitialized base info - this will be considered as an internal error as it will indicate that
there is a path in our code that creates a view schema to be used for reads or writes but is not initializing the base info.
2. We allow the base info to be initialized also from partially matching base (most likely a newer one that this used to create the view).
3. We fix the suspected path that create such a view schema to initialize it. (in migration manager)
It is worth mentioning that this PR is a workaround to a probable design flaw in our materialized views which requires the base
table's information to be retrieved in the first place instead of just being self contained.
Refs #7420Closes#7469
* github.com:scylladb/scylla:
materialized views: add a base table reference if missing
view info: support partial match between base and view for only reading from view.
view info: guard against null dereference of the base info
schema pointers can be obtained from two distinct entities,
one is the database, those schema are obtained from the table
objects and the other is from the schema registry.
When a schema or a new schema is attached to a table object that
represents a base table for views, all of the corresponding attached
view schemas are guarantied to have their base info in sync.
However if an older schema is inserted into the registry by the
migratrion manager i.e loaded from other node, it will be
missing this info.
This becomes a problem when this schema is published through the
schema registry as it can be obtained for an obsolete read command
for example and then eventually cause a segmentation fault by null
dereferencing the _base_info ptr.
Refs #7420
only reading from view.
The current implementation of materialized views does
no keep the version to which a specific version of materialized
view schema corresponds to. This complicate things especially on
old views versions that the schema doesn't support anymore. However,
the views, being also an independent table should allow reading from
them as long as they exist even if the base table changed since then.
For the reading purpose, we don't need to know the exact composition
of view primary key columns that are not part of the base primary
key, we only need to know that there are any, and this is a much
looser constrain on the schema.
We can rely on a table invariants such as the fact that pk columns are
not going to disappear on newer version of the table.
This means that if we don't find a view column in the base table, it is
not a part of the base table primary key.
This information is enough for us to perform read on the view.
This commit adds support for being able to rely on such partial
information along with a validation that it is not going to be used for
writes. If it is, we simply abort since this means that our schema
integrity is compromised.
The change's purpose is to guard against segfault that is the
result of dereferencing the _base_info member when it is
uninitialized. We already know this can happen (#7420).
The only purpose of this change is to treat this condition as
an internal error, the reason is that it indicates a schema integrity
problem.
Besides this change, other measures should be taken to ensure that
the _base_table member is initialized before calling methods that
rely on it.
We call the internal_error as a last resort.
Use the new non-linearizing validator, avoiding linearization.
Linearization can cause large contiguous memory allocations, which
in turn causes latency spikes.
Fixes#7448.
Since there are a huge number of variations, we use random
testing. Each test case is composed of a random number of valid
code points, with a possible invalid code point somehwere. The
test case is broken up into a random number of fragments. We
test both validation success and error position indicator.
Add a function to validate fragmented buffers. We validate
each buffer with SIMD-optimized validate_partial(), then
collect the codepoint that spans buffer boundaries (if any)
in a temporary buffer, validate that too, and continue.
The current validators expect the buffer to contain a full
UTF-8 string. This won't be the case for fragmented buffers,
since a codepoint can straddle two (or more) buffers.
To prepare for that, convert the existing validators to
validate_partial(), which returns either an error, or
success with an indication of the size of the tail that
was not validated and now many bytes it is missing.
This is natural since the SIMD validators already
cannot process a tail in SIMD mode if it's smaller than
the vector size, so only minor rearrangements are needed.
In addition, we now have validate_partial() for non-SIMD
architectures, since we'll need it for fragmented buffer
validation.
Our SIMD optimized validators cannot process a codepoint that
spans multiple buffers, and adapting them to be able to will slow
them down. So our strategy is to special-case any codepoint that
spans two buffers.
To do that, extract an evaluate_codepoint() function from the
current validate_naive() function. It returns three values:
- if a codepoint was successfully decoded from the buffer,
how many bytes were consumed
- if not enough bytes were in the buffer, how many more
are needed
- otherwise, an error happened, so return an indication
The new function uses a table to calculate a codepoint's
size from its first byte, similar to the SIMD variants.
validate_naive() is now implemented in terms of
evaluate_codepoint().
This is a regression caused by aebd965f0.
After the sstable_directory changes, resharding now waits for all sstables
to be exhausted before releasing reference to them, which prevents their
resources like disk space and fd from being released. Let's restore the
old behavior of incrementally releasing resources, reducing the space
requirement significantly.
Fixes#7463.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20201020140939.118787-1-raphaelsc@scylladb.com>
... type deduction and parenthesized aggregate construction' from Avi Kivity
Clang does not implement P0960R3 and P1816R0, so constructions
of aggregates (structs with no constructors) have to used braced initialization
and cannot use class template argument deduction. This series makes the
adjustments.
Closes#7456
* github.com:scylladb/scylla:
reader_concurrency_semaphore: adjust permit_summary construction for clang
schema_tables: adjust altered_schema construction for clang
types: adjust validation_visitor construction for clang
Clang does not implement P0960R3, parenthesized initialization of
aggregates, so we have to use brace initialization in
permit_summary.
As the parenthesized constructor call is done by emplace_back(),
we have to do the braced call ourselves.
Clang does not implement P0960R3, parenthesized initialization of
aggregates, so we have to use brace initialization in
altered_schema.
As the parenthesized constructor call is done by emplace_back(),
we have to do the braced call ourselves.
Clang does not implement P0960R3, parenthesized initialization of
aggregates, so we have to use brace initialization in
validation_visitor. It also does not implement class template
argument deduction for aggregates (P1816r0), so we have to
specify the template parameters explicity.
Clang has trouble compiling libstdc++'s `<ranges>`. It is not known whether
the problem is in clang or in libstdc++; I filed bugs for both [1] [2].
Meanwhile, we wish to use clang to gain working coroutine support,
so drop the failing uses of `<ranges>`. Luckily the changes are simple.
[1] https://bugs.llvm.org/show_bug.cgi?id=47509
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97120Closes#7450
* github.com:scylladb/scylla:
test: view_build_test: drop <ranges>
test: mutation_reader_test: drop <ranges>
cql3: expression: drop <ranges>
sstables: leveled_compaction_strategy: drop use of <ranges>
utils: to_range(): relax constraint
The check was added to support migration from schema tables format v2
to v3. It was needed to handle the rolling upgrade from 2.x to 3.x
scylla version. Old nodes wouldn't recognize new schema mutations, so
the pull handler in v3 was changed to ignore requests from v2 nodes based
on their advertised SCHEMA_TABLES_VERSION gossip state.
This started to cause problems after 3b1ff90 (get rid of the seed
concept). The bootstrapping node sometimes would hang during boot
unable to reach schema agreement.
It's relevant that gossip exchanges about new nodes are unidirectional
(refs #2862).
It's also relevant that pulls are edge-triggered only (refs #7426).
If the bootstrapping node (A) is listed as a seed in one of the
existing node's (B) configuration then node A can be contacted before
it contacts node B. Node A may then send schema pull request to node B
before it learns about node A, and node B will assume it's an old node
and give an empty response. As a result, node A will end up with an
old schema.
The fix is to drop the check so that pull handler always responds with
the schema. We don't support upgrades from nodes using v2 schema
tables format anymore so this should be safe.
Fixes#7396
Tests:
- manual (ccm)
- unit (dev)
Message-Id: <1602612578-21258-1-git-send-email-tgrabiec@scylladb.com>
The input range to utils::to_range() should be indeed a range,
but clang has trouble compiling <ranges> which causes it to fail.
Relax the constraint until this is fixed.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Hopefully, most of these lambda captures will be replaces with
coroutines.
Closes#7445
* github.com:scylladb/scylla:
test: mutation_reader_test: don't capture structured bindings in lambdas
api: column_family: don't capture structured bindings in lambdas
thrift: don't capture structured bindings in lambdas
test: partition_data_test: don't capture structured bindings in lambdas
test: querier_cache_test: don't capture structured bindings in lambdas
test: mutation_test: don't capture structured bindings in lambdas
storage_proxy: don't capture structured bindings in lambdas
db: hints/manager: don't capture structured bindings in lambdas
db: commitlog_replayer: don't capture structured bindings in lambdas
cql3: select_statement: don't capture structured bindings in lambdas
cql3: statement_restrictions: don't capture structured bindings in lambdas
cdc: log: don't capture structured bindings in lambdas
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
"
Focusing on different aspects of debugging Scylla. Also expand some of
the existing segments and fix some small issues around the document.
"
* 'debugging.md-advanced-guides/v1' of https://github.com/denesb/scylla:
docs/debugging.md: add thematic debugging guides
docs/debugging.md: tips and tricks: add section about optimized-out variables
docs/debugging.md: TLS variables: add missing $ to terminal command
docs/debugging.md: TUI: describe how to switch between windows
docs/debugging.md: troubleshooting: expand on crash on backtrace
Before this change, invalid query exception on selects with both normal
and token restrictions was only thrown when token restriction was after
normal restriction.
This change adds proper validation when token restriction is before normal restriction.
**Before the change - does not return error in last query; returns wrong results:**
```
cqlsh> CREATE TABLE ks.t(pk int, PRIMARY KEY(pk));
cqlsh> INSERT INTO ks.t(pk) VALUES (1);
cqlsh> INSERT INTO ks.t(pk) VALUES (2);
cqlsh> INSERT INTO ks.t(pk) VALUES (3);
cqlsh> INSERT INTO ks.t(pk) VALUES (4);
cqlsh> SELECT pk, token(pk) FROM ks.t WHERE pk = 2 AND token(pk) > 0;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Columns "ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}" cannot be restricted by both a normal relation and a token relation"
cqlsh> SELECT pk, token(pk) FROM ks.t WHERE token(pk) > 0 AND pk = 2;
pk | system.token(pk)
----+---------------------
3 | 9010454139840013625
(1 rows)
```
Closes#7441
* github.com:scylladb/scylla:
tests: Add token and non-token conjunction tests
token_restriction: Add non-token merge exception
This patch introduces a new system table: `system.scylla_table_schema_history`,
which is used to keep track of column mappings for obsolete table
schema versions (i.e. schema becomes obsolete when it's being changed
by means of `CREATE TABLE` or `ALTER TABLE` DDL operations).
It is populated automatically when a new schema version is being
pulled from a remote in get_schema_definition() at migration_manager.cc
and also when schema change is being propagated to system schema tables
in do_merge_schema() at schema_tables.cc.
The data referring to the most recent table schema version is always
present. Other entries are garbage-collected when the corresponding
table schema version is obsoleted (they will be updated with a TTL equal
to `DEFAULT_GC_GRACE_SECONDS` on `ALTER TABLE`).
In case we failed to persist column mapping after a schema change,
missing entries will be recreated on node boot.
Later, the information from this table is used in `paxos_state::learn`
callback in case we have a mismatch between the most recent schema
version and the one that is stored inside the `frozen_mutation`
for the accepted proposal.
Such situation may arise under following circumstances:
1. The previous LWT operation crashed on the "accept" stage,
leaving behind a stale accepted proposal, which waits to be
repaired.
2. The table affected by LWT operation is being altered, so that
schema version is now different. Stored proposal now references
obsolete schema.
3. LWT query is retried, so that Scylla tries to repair the
unfinished Paxos round and apply the mutation in the learn stage.
When such mismatch happens, prior to that patch the stored
`frozen_mutation` is able to be applied only if we are lucky enough
and column_mapping in the mutation is "compatible" with the new
table schema.
It wouldn't work if, for example, the columns are reordered, or
some columns, which are referenced by an LWT query, are dropped.
With this patch we try to look up the column mapping for
the obsolete schema version, then upgrade the stored mutation
using obtained column mapping and apply an upgraded mutation instead.
* git@github.com:ManManson/scylla.git feature/table_schema_history_v7:
lwt: add column_mapping history persistence tests
schema: add equality operator for `column_mapping` class
lwt: store column_mapping's for each table schema version upon a DDL change
schema_tables: extract `fill_column_info` helper
frozen_mutation: introduce `unfreeze_upgrading` method