We use boost test logging primarily to generate nice XML xunit
files for Jenkins. These XML files can be bloated
with messages from BOOST_TEST_MESSAGE(), adding hundreds of megabytes
of build archives on every build.
Let's use seastar logger for test logging instead, reserving
the use of boost log facilities for boost test markup information.
Most test-methods log a message with their name upon entry.
This helps identify in the logs which test-method a failure
happened in. Two methods were missing this log line, so add it.
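The entry-logging pattern can be sketched as follows (a minimal stand-in: `test_logger`, `testlog` and `LOG_TEST_ENTRY` are illustrative names; the real code uses seastar::logger):

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Minimal stand-in for the test logger (illustrative; the real patch
// uses seastar::logger, which writes to the test's stderr/stdout
// instead of bloating the boost XML output).
struct test_logger {
    std::string last;
    void info(const std::string& msg) {
        last = msg;
        std::cout << "INFO  " << msg << "\n";
    }
};

inline test_logger testlog;

// Log the test-method name on entry, so a failure can be located in the logs.
#define LOG_TEST_ENTRY() testlog.info(__func__)

void test_simple_query() {
    LOG_TEST_ENTRY();
    // ... test body ...
}
```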
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200304155235.46170-1-bdenes@scylladb.com>
Merged patch series from Avi Kivity:
boost/multiprecision is a heavyweight library, pulling 20,000 lines of code into
each header that depends on it. It is used by converting_mutation_partition_applier
and types.hh. While the former is easy to put out-of-line, the latter is not.
All we really need is to forward-declare boost::multiprecision::cpp_int, but that
is not easy - it is a template taking several parameters, among which are non-type
template parameters also defined in that header. So it's quite difficult to
disentangle, and fragile wrt boost changes.
This patchset introduces a wrapper type utils::multiprecision_int which _can_
be forward declared, and together with a few other small fixes, manages to
uninclude boost/multiprecision from most of the source files. The total reduction
in number of lines compiled over a full build is 324 * 23,227 or around 7.5
million.
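A minimal sketch of the wrapper idea, with `long long` standing in for the boost type so the example is self-contained (the real utils::multiprecision_int holds a cpp_int and has a much richer interface):

```cpp
#include <cassert>

// A class with a name of its own CAN be forward-declared, unlike
// boost::multiprecision::cpp_int, which is an alias for a template with
// non-type parameters defined in the same heavyweight header.
namespace utils {

class multiprecision_int {
    long long _v = 0;   // real code: boost::multiprecision::cpp_int _v;
public:
    multiprecision_int() = default;
    multiprecision_int(long long v) : _v(v) {}
    friend multiprecision_int operator+(multiprecision_int a, multiprecision_int b) {
        return multiprecision_int(a._v + b._v);
    }
    friend bool operator==(multiprecision_int a, multiprecision_int b) {
        return a._v == b._v;
    }
};

} // namespace utils

// A header that only passes the type around now needs just:
//   namespace utils { class multiprecision_int; }
```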
Tests: unit (dev)
Ref #1
https://github.com/avikivity/scylla uninclude-boost-multiprecision/v1
Avi Kivity (5):
converting_mutation_partition_applier: move to .cc file
utils: introduce multiprecision_int
tests: cdc_test: explicitly convert from cdc::operation to uint8_t
treewide: use utils::multiprecision_int for varint implementation
types: forward-declare multiprecision_int
configure.py | 2 +
concrete_types.hh | 2 +-
converting_mutation_partition_applier.hh | 163 ++-------------
types.hh | 12 +-
utils/big_decimal.hh | 3 +-
utils/multiprecision_int.hh | 256 +++++++++++++++++++++++
converting_mutation_partition_applier.cc | 188 +++++++++++++++++
cql3/functions/aggregate_fcts.cc | 10 +-
cql3/functions/castas_fcts.cc | 28 +--
cql3/type_json.cc | 2 +-
lua.cc | 38 ++--
mutation_partition_view.cc | 2 +
test/boost/cdc_test.cc | 6 +-
test/boost/cql_query_test.cc | 16 +-
test/boost/json_cql_query_test.cc | 12 +-
test/boost/types_test.cc | 58 ++---
test/boost/user_function_test.cc | 2 +-
test/lib/random_schema.cc | 14 +-
types.cc | 20 +-
utils/big_decimal.cc | 4 +-
utils/multiprecision_int.cc | 37 ++++
21 files changed, 627 insertions(+), 248 deletions(-)
create mode 100644 utils/multiprecision_int.hh
create mode 100644 converting_mutation_partition_applier.cc
create mode 100644 utils/multiprecision_int.cc
The goal is to forward-declare utils::multiprecision_int, something
beyond my capabilities for boost::multiprecision::cpp_int, to reduce
compile time bloat.
The patch is mostly search-and-replace, with a few casts added to
disambiguate conversions the compiler had trouble with.
I found that a few variables in cql_test_env were wrapping sharded in
shared_ptr for no apparent reason. These patches convert them to plain
sharded<...>.
"
This set cleans sstable_writer_config and the surrounding sstables
code of uses of the global storage_ and feature_ services and the
database, by moving the configuration logic onto sstables_manager
(which was supposed to do this since eebc3701a5).
Most of the complexity is hidden around sstable_writer_config
creation; this set makes the sstables_manager create this object
with an explicit call. All the rest are consequences of this change.
Tests: unit(debug), manual start-stop
"
* 'br-clean-sstables-manager-2' of https://github.com/xemul/scylla:
sstables: Move get_highest_supported_format
sstables: Remove global get_config() helper
sstables: Use manager's config() in .new_sstable_component_file()
sstable_writer_config: Extend with more db::config stuff
sstables_manager: Don't use global helper to generate writer config
sstable_writer_config: Sanitize out some features fields initialization
sstable_writer_config: Factor out some field initialization
sstables: Generate writer config via manager only
sstables: Keep reference on manager
test: Re-use existing global sstables_manager
table: Pass sstable_writer_config into write_memtable_to_sstable
The main goal of this patch is to stop using the get_config() global
when creating the sstable_writer_config instance.
Other than being global, the existing get_config() is also confusing,
as it effectively generates 3 (three) sorts of configs -- one for
scylla, when the db config and features are ready; another for
tests, when no storage service is at hand; and a third one, also for
tests, when the storage service is created by the test env
(likely intentionally, though maybe by coincidence, the resulting
config is the same as in the no-storage-service case).
With this patch it's now 100% clear which one is used when. Also
this makes half the work of removing get_config() helper.
The db::config and feature_service used to initialize the managers
are referenced by the database that creates and keeps the managers,
so the references are safe.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The sstable_writer_config creation looks simple (just declare
the struct instance) but behind the scenes it references the storage
and feature services, messes with the database config, etc.
This patch teaches the sstables_manager to generate the writer
config and makes the rest of the code use it. For future
safety, by-hand creation of the sstable_writer_config is
prohibited.
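The "creation prohibited" part can be sketched roughly like this (simplified, hypothetical member names, not the real Scylla definitions):

```cpp
#include <cassert>
#include <cstddef>

// sstable_writer_config has a private constructor, so it can only be
// obtained from the sstables_manager, which fills it from its own config.
class sstables_manager;

struct sstable_writer_config {
    size_t promoted_index_block_size = 0;               // hypothetical field
    bool correctly_serialize_range_tombstones = false;  // hypothetical field
private:
    sstable_writer_config() = default;   // by-hand creation prohibited
    friend class sstables_manager;
};

class sstables_manager {
    size_t _block_size;
public:
    explicit sstables_manager(size_t block_size) : _block_size(block_size) {}

    // The single place where writer configs are generated.
    sstable_writer_config configure_writer() const {
        sstable_writer_config cfg;
        cfg.promoted_index_block_size = _block_size;
        cfg.correctly_serialize_range_tombstones = true;
        return cfg;
    }
};
```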
The manager is referenced through table-s and sstable-s, but the
two existing sstables_managers live on the database object, and
table-s and sstable-s both live shorter than the database, so
this reference is safe.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The sstables_manager in the scylla binary outlives the sstables objects
created by it; this makes it possible to add an sstable->manager reference
and use it. In unit tests there are cases where the sstables::test_env that
keeps the manager in its _mgr field is destroyed right after sstable creation
(e.g. in the boost/sstable_mutation_test.cc ka_sst() helper).
Fix this by changing _mgr to be a reference to the manager and
initializing it with the already existing global manager. The few
exceptions to this rule that need to set their own large data handler
will create their own sstables_manager.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The latter creates the config by hand, but the plan is to
create it via the sstables_manager. Callers of this helper are the
final frontiers where the manager will be safely accessible.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
random_schema already has a _schema field, which in turn
has a get_partitioner() function. Storing the partitioner
in random_schema is redundant.
At the moment all uses of random_schema are based on the
default partitioner, so it is not necessary to set it
explicitly. If in the future we need random_schema to
work with other partitioners, we will add the constructor
back and fix the creation of _schema to contain it. It's
not needed now though.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
and replace all dht::global_partitioner().decorate_key
with dht::decorate_key
It is an improvement because dht::decorate_key takes a schema
and uses it to obtain the partitioner, instead of using the global
partitioner as before.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Take const schema& as a parameter of shard_of and
use it to obtain partitioner instead of calling
global_partitioner().
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Internal execution always uses the query text as a key in the
cache of internal prepared statements. There is no need
to publish an API for executing an internal prepared statement object.
The folded execute_internal() calls an internal prepare() and then
internal execute().
execute_internal(cache=true) does exactly that.
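The folded flow can be sketched as follows (hypothetical, simplified types; the real code lives in cql3::query_processor):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// A prepared statement, cached by the query text it was prepared from.
struct prepared_statement {
    std::string text;
    int times_executed = 0;
};

class query_processor {
    // Cache of internal prepared statements, keyed by query text.
    std::unordered_map<std::string, std::shared_ptr<prepared_statement>> _internal_cache;

    std::shared_ptr<prepared_statement> prepare_internal(const std::string& text) {
        auto& slot = _internal_cache[text];
        if (!slot) {
            slot = std::make_shared<prepared_statement>();
            slot->text = text;
        }
        return slot;
    }
public:
    // execute_internal(): internal prepare() (cached by text) + execute.
    void execute_internal(const std::string& text) {
        auto stmt = prepare_internal(text);
        ++stmt->times_executed;   // stands in for real execution
    }
    size_t cache_size() const { return _internal_cache.size(); }
};
```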
Rename an overloaded function process() to execute_direct().
Execute direct is a common term for executing a statement
that was not previously prepared. See, for example
SQLExecuteDirect in ODBC/SQL CLI specification,
mysql_stmt_execute_direct() in MySQL C API or EXECUTE DIRECT
in Postgres XC.
Way too many places in the code need storage_service just for
token_metadata. These references increase the number of
get(_local)?_storage_service() calls and create loops in component
dependencies. Keep the token_metadata separate from storage_service
and pass references to instances where needed (for now -- only into
the storage_service itself).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
There's a lot of code around that needs the storage service purely to
get a specific feature value (cluster_supports_<something> calls).
This creates several circular dependencies, e.g. storage_service <->
migration_manager and database <-> storage_service. Also, features
sit on storage_service but register themselves on the feature_service,
and the former subscribes to them back, which also looks strange.
I propose to keep all the features on the feature_service; this keeps
the latter independent from other components, makes it possible to break
one of the mentioned circular dependencies, and heavily relaxes the other.
The set also helps us fight the globals, and, after that, the
feature_service can be safely stopped at the very last moment.
Tests: unit(dev), manual debug build start-stop
"
* 'br-features-to-service-5' of https://github.com/xemul/scylla:
gossiper: Avoid string merge-split for nothing
features: Stop on shutdown
storage_service: Remove helpers
storage_service: Prepare to switch from on-board feature helpers
cql3: Check feature in .validate
database: Use feature service
storage_proxy: Use feature service
migration_manager: Use feature service
start: Pass needed feature as argument into migrate_truncation_records
features: Unfriend storage_service
features: Simplify feature registration
features: Introduce known_feature_set
features: Move disabled features set from storage_service
features: Move schema_features helper
features: Move all features from storage_service to feature_service
storage_service: Use feature_config from _feature_service
features: Add feature_config
storage_service: Kill set_disabled_features
gms: Move features stuff into own .cc file
migration_manager: Move some fns into class
The view_update_generator accepts (and keeps) database and storage_proxy;
the latter is only needed to initialize the view_updating_consumer, which,
in turn, only needs it to get the database (to find the column family).
This can be relaxed by providing the database from _generator to _consumer
directly, without using the storage_proxy in between.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200207112427.18419-1-xemul@scylladb.com>
This is not just a direct flip to a variable with the negated Boolean
value. When created, a large_data_handler is not considered to be
running; the user has to call start() before it can be used.
The advantage of doing this is that if initialization fails and a
database is destructed before the large_data_handler is started, the
assert in
database::stop() {
assert(!_large_data_handler->running());
is not triggered.
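A minimal sketch of the start()/running() life cycle described above (illustrative, not the real class):

```cpp
#include <cassert>

// The handler starts out not running; start() must be called before
// use, and stop() before destruction. database::stop() can then safely
// assert(!running()) even if startup failed before start() was reached.
class large_data_handler {
    bool _running = false;
public:
    void start() { _running = true; }
    void stop()  { _running = false; }
    bool running() const { return _running; }
};
```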
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Merged pull request https://github.com/scylladb/scylla/pull/5485
by Kamil Braun:
This series introduces the notion of CDC generations: sets of CDC streams
used by the cluster to choose partition keys for CDC log writes.
Each CDC generation begins operating at a specific time point, called the
generation's timestamp (cdc_streams_timestamp in the code).
It continues being used by all nodes in the cluster to generate log writes
until superseded by a new generation.
Generations are chosen so that CDC log writes are colocated with their
corresponding base table writes, i.e. their partition keys (which are CDC
stream identifiers picked from the generation operating at time of making
the write) fall into the same vnode and shard as the corresponding base
table write partition keys. Currently this is probabilistic and not 100%
of log writes will be colocated - this will change in future commits,
after per-table partitioners are implemented.
CDC generations are a global property of the cluster -- they don't depend
on any particular table's configuration. Therefore the old "CDC stream
description tables", which were specific to each CDC-enabled table,
were removed and replaced by a new, global description table inside the
system_distributed keyspace.
A new generation is introduced and supersedes the previous one whenever
we insert new tokens into the token ring, which breaks the colocation
property of the previous generation. The new generation is chosen to
account for the new tokens and restore colocation. This happens when a
new node joins the cluster.
The joining node is responsible for creating and informing other nodes
about the new CDC generation. It does that by serializing it and inserting
into an internal distributed table ("CDC topology description table").
If it fails the insert, it fails the joining process. It then announces
the generation to other nodes through gossip using the generation's
timestamp, which is the partition key of the inserted distributed table
entry.
Nodes that learn about the new generation through gossip attempt to
retrieve it from the distributed table. This might fail - for example,
if the node is partitioned away from all replicas that hold this
generation's table entry. In that case the node might stop accepting
writes, since it knows that it should send log entries to a new generation
of streams, but it doesn't know what the generation is. The node will keep
trying to retrieve the data in the background until it succeeds or sees
that it is no longer necessary (e.g., because yet another generation
superseded this one). So we give up some availability to achieve safety.
However, this solution is not completely safe (might break consistency
properties): if a node learns about a new generation too late (if gossip
doesn't reach this node in time), the node might send writes to the wrong
(old) generation. In the future we will introduce a transaction-based
approach where we will always make sure that all nodes receive the new
generation before any of them starts using it (and if it's impossible
e.g. due to a network partition, we will fail the bootstrap attempt).
In practice, if the admin makes sure that the cluster works correctly
before bootstrapping a new node, and a network partition doesn't start
in the few seconds window where a new generation is announced, everything
will work as it should.
After the learning node retrieves the generation, it inserts it into an
in-memory data structure called "CDC metadata". This structure is then
used when performing writes to the CDC log -- given the timestamp of the
written mutation, the data structure will return the CDC generation
operating at this time point. CDC metadata might reject the query for
two reasons: if the timestamp belongs to an earlier generation, which
most probably doesn't have the colocation property anymore, or if it is
picked too far away into the future, where we don't know if the current
generation won't be superseded by a different one (so we don't yet know
the set of streams that this log write should be sent to). If the client
uses server-generated timestamps, the query will never be rejected.
Clients can also use client-generated timestamps, but they must make sure
that their clocks are not too desynchronized with the database --
otherwise some or all of their writes to CDC-enabled tables will be
rejected.
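The timestamp-to-generation lookup with both rejection cases can be sketched like this (a simplified stand-in for the real cdc::metadata; names and types are illustrative):

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <optional>
#include <string>
#include <vector>

using timestamp_t = long long;
using streams_t = std::vector<std::string>;

class cdc_metadata {
    std::map<timestamp_t, streams_t> _generations; // keyed by generation timestamp
    timestamp_t _future_horizon;                   // how far ahead of `now` we trust
public:
    explicit cdc_metadata(timestamp_t horizon) : _future_horizon(horizon) {}

    void insert(timestamp_t ts, streams_t streams) {
        _generations[ts] = std::move(streams);
    }

    // Streams of the generation operating at `ts`, or nullopt if the
    // write must be rejected.
    std::optional<streams_t> get_streams(timestamp_t ts, timestamp_t now) const {
        if (ts > now + _future_horizon) {
            return std::nullopt; // too far ahead: a newer generation may supersede
        }
        auto it = _generations.upper_bound(ts);
        if (it == _generations.begin()) {
            return std::nullopt; // before any known generation
        }
        --it; // the generation operating at ts
        if (std::next(it) != _generations.end() && std::next(it)->first <= now) {
            return std::nullopt; // that generation is already superseded
        }
        return it->second;
    }
};
```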
In the case of rolling upgrade, where we restart nodes that were
previously running without CDC, we act a bit differently - there is no
naturally selected joining node which must propose a new generation.
We have to select such a node using other means. For this we use a bully
approach: every node compares its host id with host ids of other nodes
and if it finds that it has the greatest host id, it becomes responsible
for creating the first generation.
This change also fixes the way of choosing values of the "time" column
of CDC log writes: the timeuuid is chosen in a way which preserves
ordering of corresponding base table mutations (the timestamp of this
timeuuid is equal to the base table mutation timestamp).
Warning: if you were running a previous CDC version (without topology
change support), make sure to disable CDC on all tables before performing
the upgrade. This will drop the log data -- back it up first if needed.
TODO in future patchset: expire CDC generations. Currently, each inserted
CDC generation will stay in the distributed tables forever (until
manually removed by the administrator). When a generation is superseded,
it should become "expired", and 24 hours after expiration, it should be
removed. The distributed tables (cdc_topology_description and
cdc_description) both have an "expired" column which can be used for
this purpose.
Unit tests: dev, debug, release
dtests (dev): https://jenkins.scylladb.com/job/scylla-master/job/byo/job/byo_build_tests_dtest/907/
Keep a local feature_service reference on the database. This relaxes the
circular storage_service <-> database reference, but does not remove it
completely.
This needs some args tossing in apply_to_builder, but it's
rather straightforward, so comes in the same patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Keep a reference to the local feature service from storage_proxy
and use it in places that have the (local) storage_proxy at hand.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This unties migration_manager from storage_service thus breaking
the circular dependency between these two.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
The fix itself is fairly simple, but looking at the code I found that
our code base was not cleanly distinguishing null and empty values, and
was treating null and missing values differently; that distinction was
dead, since a null is represented as a dead cell.
"
* 'espindola/lua-fix-null-v6' of https://github.com/espindola/scylla:
lua: Handle nil returns correctly
types: Return bytes_opt from data_value::serialize
query-result-set: Assert that we don't have null values
types: Fix comparison of empty and null data_values
Revert "tests: Handle null and not present values differently"
query-result-set: Avoid a copy during construction
types: Move operator== for data_value out-of-line
We use eventually() in tests to wait for eventually consistent data
to become consistent. However, we see spurious failures indicating
that we wait too little.
Increasing the timeout has a negative side effect in that tests that
fail will now take longer to do so. However, this negative side effect
is negligible compared to false-positive failures, since those throw
away large test efforts and sometimes require a person to investigate
the problem, only to conclude it is a false positive.
This patch therefore makes eventually() more patient, by a factor of
32.
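A sketch of the idea, assuming eventually() is a bounded retry loop whose attempt budget is multiplied by 32 (illustrative constants; the real version also sleeps between attempts):

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical constants; the real base budget lives in the test library.
constexpr int base_attempts = 4;
constexpr int patience_factor = 32;

// Retry `cond` until it holds or the (enlarged) attempt budget runs out.
void eventually(const std::function<bool()>& cond) {
    for (int i = 0; i < base_attempts * patience_factor; ++i) {
        if (cond()) {
            return;
        }
        // the real implementation sleeps between attempts
    }
    throw std::runtime_error("eventually(): condition not met in time");
}
```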
Fixes #4707.
Message-Id: <20200130162745.45569-1-avi@scylladb.com>
This commit builds on top of the introduced per-scheduling-group
statistics template and employs it to achieve per-scheduling-group
statistics in storage_proxy.
Some of the statistics also had meaning as global, per-shard
values: those used for determining whether to throttle a write
request. This was handled by creating a global stats struct that
holds those stats and by changing the stat updates to also include
the global one.
One point that complicated this is an already existing aggregation
over the per-shard stats, which now became per-scheduling-group
per-shard stats, turning the aggregation into a two-dimensional one.
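The two-dimensional aggregation can be sketched as (illustrative types; the real stats structs are richer):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct write_stats {
    uint64_t writes = 0;
};

// stats[shard][scheduling_group]: per-shard stats became
// per-scheduling-group per-shard stats, so a counter is now
// aggregated over two dimensions.
using stats_matrix = std::vector<std::vector<write_stats>>;

uint64_t total_writes(const stats_matrix& stats) {
    uint64_t sum = 0;
    for (const auto& per_shard : stats) {          // first dimension: shards
        for (const auto& per_group : per_shard) {  // second: scheduling groups
            sum += per_group.writes;
        }
    }
    return sum;
}
```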
One thing this commit doesn't handle is validating that an individual
statistic didn't "cross a scheduling group boundary". Such validation
is possible and can easily be added in the future. There is a
subtlety to doing so: if an operation did cross to another
scheduling group, two connected statistics can lose balance,
for example written bytes and completed write transactions.
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Since a data_value can contain a null value, returning bytes from
serialize() was losing information, as it mapped null to empty.
This patch also introduces serialize_nonnull(), which still returns
bytes but results in an internal error if called with a null value.
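A sketch of the distinction, with simplified stand-ins for data_value and bytes (the real error path uses Scylla's internal-error machinery rather than a plain exception):

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

using bytes = std::string;
using bytes_opt = std::optional<bytes>;

struct data_value {
    bytes_opt value;   // nullopt models a null data_value; "" is a valid empty value
    bool is_null() const { return !value.has_value(); }

    // Null stays distinguishable from empty: nullopt vs engaged "".
    bytes_opt serialize() const {
        return value;
    }

    // Old bytes return type, but null is now an internal error.
    bytes serialize_nonnull() const {
        if (is_null()) {
            throw std::logic_error("serialize_nonnull() called on a null value");
        }
        return *value;
    }
};
```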
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
This reverts commit 2ebd1463b2.
The test introduced by that commit was wrong, and in fact depended on
a bug in operator== for data_value. A followup patch fixes operator==,
so this reverts the broken commit first.
The reason it was broken was that it created a live cell with a null
data_value. In reality, null values are represented with dead cells.
For example, the sstable produced by
CREATE TABLE my_table (key int PRIMARY KEY, v1 int, v2 int) with compression = {'sstable_compression': ''};
INSERT INTO my_table (key, v1, v2) VALUES (1, 42, null);
is:
00 04 key_length
00 00 00 01 key
7f ff ff ff local_deletion_time
80 00 00 00 00 00 00 00 marked_for_delete_at
24 HAS_ALL_COLUMNS | HAS_TIMESTAMP
09 row_body_size
12 prev_unfiltered_size
00 delta_timestamp
08 USE_ROW_TIMESTAMP_MASK
00 00 00 2a value
0d USE_ROW_TIMESTAMP_MASK | HAS_EMPTY_VALUE_MASK | IS_DELETED_MASK
00 deletion time
01 END_OF_PARTITION
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>