Adjusts the column_computation interface so that it is able to accept
both clustering and static rows through the common
db::view::clustering_or_static_row interface.
The structure "bytes_with_action" was very hard to understand because of
its mysterious and general-sounding name, and no comments.
In this patch I add a large comment explaining its purpose, and rename
it to a more suitable name, view_key_and_action, which suggests that
each such object is about one view key (where to add a view row), and
an additional "action" that we need to take beyond adding the view row.
This is the best I can do to make this code easier to understand without
completely reorganizing it.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
depends_on_non_primary_key_column for a column computation is needed to
detect the case where the primary key of a materialized view depends on a
non-primary-key column of the base table while the view itself has no
non-primary-key columns. This is an issue because, until now, it was
assumed that a view schema without non-primary-key columns meant that an
update cannot change the primary key of the view, and therefore that the
update path can be simplified.
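A minimal sketch of the check described above. All types here are
simplified, hypothetical stand-ins (view_column,
indexed_collection_computation, update_may_change_view_key), not the
actual Scylla definitions; only the name
depends_on_non_primary_key_column comes from this patch.

```cpp
#include <memory>
#include <vector>

// Hypothetical, simplified stand-in for the real column_computation class.
struct column_computation {
    virtual ~column_computation() = default;
    // By default a computation depends only on primary key columns.
    virtual bool depends_on_non_primary_key_column() const { return false; }
};

// Hypothetical computation whose value is derived from a regular base column.
struct indexed_collection_computation : column_computation {
    bool depends_on_non_primary_key_column() const override { return true; }
};

// Hypothetical description of one view column.
struct view_column {
    bool is_primary_key;
    std::shared_ptr<column_computation> computation; // null if not computed
};

// Returns true if an update to the base table may change the view's primary
// key, i.e. the simplified update path must not be taken.
bool update_may_change_view_key(const std::vector<view_column>& columns) {
    for (const auto& c : columns) {
        if (!c.is_primary_key) {
            return true; // the previously assumed condition
        }
        if (c.computation && c.computation->depends_on_non_primary_key_column()) {
            return true; // the additional case this patch detects
        }
    }
    return false;
}
```

With only the first condition, a view whose columns are all key columns
would always take the simplified path, even if one of those key columns
is computed from a regular base column.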
This type of column computation will be used for creating updates to
materialized views that are indexes over collections.
This type features an additional function, compute_values_with_action,
which, given an (optional) old row and a new row (the update to the
base table), returns multiple bytes_with_action: a vector of pairs
(computed value, some action), where the action signifies whether a row
with the specific key needs to be deleted or created.
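As an illustration only, the idea can be sketched as a diff between the
indexed collection elements of the old and new rows. The types below
(bytes as std::string, the action enum, rows modelled as sets of
elements) are assumptions made for this sketch and do not match the
real signatures.

```cpp
#include <optional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real types in Scylla differ.
using bytes = std::string;
enum class action { delete_row, create_row };
using bytes_with_action = std::pair<bytes, action>;

// old_values: indexed collection elements before the update (absent if
//             there was no old row).
// new_values: indexed collection elements after the update.
std::vector<bytes_with_action> compute_values_with_action(
        const std::optional<std::set<bytes>>& old_values,
        const std::set<bytes>& new_values) {
    std::vector<bytes_with_action> result;
    if (old_values) {
        // Elements that disappeared: their view rows must be deleted.
        for (const bytes& v : *old_values) {
            if (new_values.count(v) == 0) {
                result.emplace_back(v, action::delete_row);
            }
        }
    }
    // Elements that appeared: their view rows must be created.
    for (const bytes& v : new_values) {
        if (!old_values || old_values->count(v) == 0) {
            result.emplace_back(v, action::create_row);
        }
    }
    return result;
}
```

Elements present in both the old and the new row produce no entry in
this sketch, since their view rows neither appear nor disappear.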
The compute_value function of column_computation previously had the
following signature:
virtual bytes_opt compute_value(const schema& schema, const partition_key& key, const clustering_row& row) const override;
This is superfluous: never in the history of Scylla was the last
parameter (row) used in any implementation, and never did the function
actually need the optional return type (bytes_opt). The absurdity of
this interface is especially visible at call sites like the following,
where a dummy empty row was created:
```
token_column.get_computation().compute_value(
*_schema, pkv_linearized, clustering_row(clustering_key_prefix::make_empty()));
```
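For illustration, a simplified interface without the unused parameter
could look like the sketch below. The exact signature is an assumption
made here, and schema, partition_key and bytes are hypothetical
stand-ins rather than the real Scylla types.

```cpp
#include <string>

// Hypothetical stand-ins for the real Scylla types.
using bytes = std::string;
struct schema {};
struct partition_key {
    bytes representation;
};

class column_computation {
public:
    virtual ~column_computation() = default;
    // Assumed simplified signature: no row parameter, non-optional result.
    virtual bytes compute_value(const schema& s, const partition_key& key) const = 0;
};

// Hypothetical computation that derives its value from the partition key only.
class echo_key_computation final : public column_computation {
public:
    bytes compute_value(const schema&, const partition_key& key) const override {
        return key.representation;
    }
};
```

A caller then no longer needs to construct a dummy clustering_row just
to satisfy the interface.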
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes were applied mechanically with a script, except for
licenses/README.md.
Closes #9937
If utils/rjson.hh is modified, 300 (!) source files get recompiled.
This is frustrating for anyone working on this header file (like me).
Moreover, utils/rjson.hh includes the large rapidjson header
files (rapidjson is a header-only library!), slowing the compilation
of all these 300 files.
It turns out most includers of utils/rjson.hh get it because
column_computation.hh includes it. But the fact that column
computations are serialized as JSON is an internal implementation
detail that the users of this header don't need to know, and they
care even less that this JSON implementation uses utils/rjson.hh.
So in this patch column_computation.hh no longer includes rjson.hh,
and no longer exposes a method taking a rjson::value that was never
used outside the implementation.
After this patch, touching utils/rjson.hh only recompiles 21 files.
Refs #1
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210422183526.114366-1-nyh@scylladb.com>
Introduce a new token_column_computation class, which is intended to
replace legacy_token_column_computation. The new column computation
returns the token as long_type, which means that it will be ordered
according to signed comparison (not unsigned comparison of bytes), which
is the correct ordering of tokens.
Rename token_column_computation to legacy_token_column_computation, as
it will be replaced with a new column computation. The reason is that
this computation returns bytes, but all tokens in Scylla can now be
represented by int64_t. Moreover, returning bytes causes invalid token
ordering, as bytes comparison is unsigned (not signed, as for
int64_t). See issue:
https://github.com/scylladb/scylla/issues/7443
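The ordering problem can be reproduced with a small standalone sketch.
It assumes, for illustration only, that a token is serialized as 8
big-endian bytes and that bytes are compared lexicographically as
unsigned values; the helper names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

using token_bytes = std::array<unsigned char, 8>;

// Serialize a token as 8 big-endian bytes (assumed representation).
token_bytes to_big_endian(std::int64_t token) {
    token_bytes out{};
    auto u = static_cast<std::uint64_t>(token);
    for (int i = 7; i >= 0; --i) {
        out[i] = static_cast<unsigned char>(u & 0xff);
        u >>= 8;
    }
    return out;
}

// Ordering obtained when tokens are compared as raw bytes:
// lexicographic comparison of unsigned bytes.
bool bytewise_less(std::int64_t a, std::int64_t b) {
    const token_bytes x = to_big_endian(a);
    const token_bytes y = to_big_endian(b);
    return std::lexicographical_compare(x.begin(), x.end(), y.begin(), y.end());
}

// Ordering obtained when tokens are compared as signed 64-bit integers.
bool signed_less(std::int64_t a, std::int64_t b) {
    return a < b;
}
```

Under these assumptions, the token -1 serializes to eight 0xff bytes and
therefore sorts after the token 1 in the bytewise ordering, while the
signed comparison places it before. The two orderings agree only when
both tokens have the same sign.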
In order to eventually switch to a single JSON library,
most of the libjsoncpp usage is dropped in favor of rjson.
Unfortunately, one usage still remains:
test/utils/test_repl utility heavily depends on the *exact textual*
format of its output JSON files, so replacing a library results
in all tests failing because of differences in formatting.
It is possible to force rjson to print its documents in the exact
matching format, but that's left for later, since the issue is not
critical. It would be nice though if our test suite compared
JSON documents with a real JSON parser, since there are more
differences - e.g. libjsoncpp keeps children of the object
sorted, while rapidjson uses an unordered data structure.
This change should cause no change in semantics; it strives
just to replace all usage of libjsoncpp with rjson.
Some columns may represent values that are not user-provided, but
computed from other columns. Currently, one example is the token column
used in secondary indexes to provide proper ordering. To avoid hardcoding
special cases in the execution stage, optional additional information
for computed columns is stored in the column definition.