In order to prevent users from creating a network topology
strategy instance with invalid inputs, it's not enough to use
std::stol() on the input: for a string like "3abc" it still returns
the number 3, which will later confuse cqlsh and other drivers
when they ask for topology strategy details.
The error message is also more human-readable now; for incorrect
numeric inputs it used to be a rather cryptic:
ServerError: stol()
This commit fixes the issue and comes with a simple test.
Fixes #3801
Tests: unit(dev)
Message-Id: <7aaae83d003738f047d28727430ca0a5cec6b9c6.1583478000.git.sarna@scylladb.com>
... and reading cdc metadata' from Piotr
Currently, information on which cdc options are enabled
in a table - cdc metadata, in short - is stored in two places:
in the cdc column of system_schema.scylla_tables,
and in a cdc schema extension.
The former is used as the source of truth, i.e. a node reads cdc metadata
from that column, while the latter serves cosmetic purposes
(e.g. cqlsh displays info on cdc based on this extension)
and is only written, never read, by the node.
Introducing the cdc column to scylla_tables made the logic
of schema agreement more complicated. As a first step towards removing
this column, this PR makes the cdc schema extension the
"source of truth" - from now on a node will read cdc metadata
from that extension.
The cdc column will be deprecated and removed in subsequent releases,
but it is left for now and will still be written to in order not to break
the logic of schema agreement.
Acked-by: Nadav Har-El <nyh@scylladb.com>
Refs: #5737
Tests: unit(dev), 2-node cluster upgrade under write load to a cdc-enabled table
* piodul/5737-cdc-schema-extension:
schema: get cdc options from schema extensions
alter_table_statement: fix indentation
cf_prop_defs: initialize schema extensions externally
cf_prop_defs: move checking of cdc support to ::validate
cf_prop_defs: pass database& to ::validate, not db::extensions&
unit tests: register cdc extension before tests
cdc: construct cdc_options directly inside cdc_extension
db::extensions: add shorthands for add_schema_extension
Moves initialization of schema extensions outside of cf_prop_defs. This
allows constructing these extensions once and using them several times in
cf_prop_defs' methods without caching or recalculating them.
Changes the cf_prop_defs::validate function to take database& as an
argument instead of db::extensions&. This change will allow us to move
the check which asserts that the cluster supports CDC from
`apply_to_builder` to the `validate` method.
Instead of storing a raw map of options inside `cdc_extension`, the
extension now converts them into `cdc_options` directly on construction.
This removes the need to construct `cdc_options` object multiple times.
With #5950 we changed the representation of stream_id
in CDC Log from two int columns to a single blob column.
This PR cleans up stream_id representation internally.
Now stream_id is stored as blob both in-memory and in
internal CDC tables.
Tests: unit(dev)
* hawk/stream_id_representation:
cdc: store stream_ids as blobs in internal tables
cdc: improve do_update_streams_description
cdc: Fix generate_topology_description
cdc: add stream_id::operator<
cdc: change stream_id representation
Fixes #5891
Refs #5899
When creating segments with the o_dsync option active, we write max_size
zeros to disk, to ensure actual disk blocks are allocated.
However, if we recycle a segment, we should, when not actually creating
a new file, check the existing size on disk, and only zero blocks
not already allocated (i.e. if the recycled file was smaller than max_size
due to segment truncation on close).
test: unit
Message-Id: <20200226121601.15347-1-calle@scylladb.com>
from Kamil.
If a batch update is performed with a sequence of changes with a single
timestamp, they will now show up in CDC with a single timeuuid
in the cdc$time column, distinguished by different cdc$batch_seq_no values.
Fixes#5953.
Tests: unit(dev)
* haaawk/splitbatch:
cdc: use a single timeuuid value for a batch of changes
cdc: replace `split` with `for_each_change`
Now it shows the full info about the range being streamed, e.g.:
range_streamer - Rebuild with 127.0.0.2 for keyspace=ks2, streaming [72, 96) out of 248 ranges
The [x, y) range is a semi-open one, so the full streaming progress
can be logged like:
... streaming [0, 16) out of 36 ranges <- first send
... streaming [16, 24) out of 36 ranges
... streaming [24, 36) out of 36 ranges <- last send
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200304101505.5506-1-xemul@scylladb.com>
If a batch update is performed with a sequence of changes with a single
timestamp, they will now show up in CDC with a single timeuuid in the
`time` column, distinguished by different `batch_seq_no` values.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Presently, lightweight transactions piggyback the old
row value on the prepare round response. If one of the participants
did not provide the old value, or the values from peers don't match,
we perform a full read round which repairs the Paxos table and the
base table, if necessary, at all participants.
Capture the fact that read optimization has failed in a metric.
Message-Id: <20200304192955.84208-2-kostja@scylladb.com>
`for_each_change` is like `split` but it doesn't return a vector of
mutations representing each change; instead, it takes as a parameter
a function which gets called on each mutation.
This reduces memory usage and allows preserving common context
when handling each change (this will be useful in subsequent commits).
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
The relocatable package requires a magic dynamic linker path for
"patchelf" to work correctly. Therefore, use the "get-dynamic-linker.sh"
script to unconditionally define a magic dynamic linker path to ensure
that building the relocatable package with ninja build ("ninja-build
build/<mode>/scylla-package.tar.gz") is always correct. Although the
path looks odd with a lot of leading slashes, it works outside the
relocatable package too.
Message-Id: <20200305091919.6315-2-penberg@scylladb.com>
In preparation for moving dynamic linker flags to ninja build, move the
magic dynamic linker path generation to "reloc/get-dynamic-linker.sh"
script that configure.py can call.
Message-Id: <20200305084331.5339-1-penberg@scylladb.com>
In the new CDC Log format, stream_id is represented by a single
blob column, so it makes sense to store it in the same form
everywhere - including internal CDC tables.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Use the std::set::insert overload that takes a range instead of
looping through elements and adding them one by one.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
In the new CDC Log format we store only a single stream_id column.
This means generate_topology_description has to use an appropriate
schema for generating tokens for stream_ids.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
The new CDC Log format stores stream ids as blobs.
It makes sense to keep them in the same form internally.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Currently, writes to a static row in a base table are not reflected
at all in the corresponding cdc log. This patch causes such writes
to be properly logged.
Fixes: #5744
Tests: unit(dev)
* piodul/5744-handle-static-row-correctly-in-cdc:
cdc_test: add tests for handling static row
cdc: fix indentation in transformer::transform
cdc: handle static rows separately in transformer::transform
cdc: move process_cells higher (and fix captured variables)
cdc: reduce dependencies on captured variables in process_cells
cdc: fix preimage query for static rows
Until this patch, we used the default_smp_service_group() when bouncing
Alternator requests between shards (which is needed for LWT).
This patch creates a new smp_service_group for this purpose, which is
limited to 5000 concurrent requests (the same limit used for CQL's
bounce_request_smp_service_group). The purpose of this limit is to avoid
many shards admitting a huge number of requests and bouncing all of them
to the same shard, which then can't "unadmit" these requests.
Fixes #5664.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200304170825.27226-1-nyh@scylladb.com>
We use boost test logging primarily to generate nice XML xunit
files used in Jenkins. These XML files can become bloated
with messages from BOOST_TEST_MESSAGE(), adding hundreds of megabytes
to build archives on every build.
Let's use seastar logger for test logging instead, reserving
the use of boost log facilities for boost test markup information.
Now, if CDC is enabled, `paxos_response_handler::learn_decision()`
augments the base table mutation. The differences in logic between:
(1) `mutate_internal<std::vector<mutation>>()`
and
(2) `mutate_internal<std::vector<std::tuple<paxos::proposal, schema_ptr, ...>>>()`
make it necessary to separate "CDC mutations" from "base mutation"
and send them, respectively, to (1) and (2).
Gleb explained in #5869 why it became necessary to add CDC code to LWT
writes specifically, instead of doing it somewhere central that affects
all writes:
"All paths that do write goes through mutate_internally() eventually so it
would have been best to do augmentations there, but cdc chose to log only
certain writes and not others (unlike MV that does not care how write
happened) and mutate_internal have no idea which is which so I do not have
other choice but code duplication. ... paxos_response_handler::learn_decision
is probably the place to add cdc augmentation."
Fixes #5869
It is possible to produce an empty mutation using CQL. For example, the
following query:
DELETE FROM ks.tbl WHERE pk = 0 AND ck < 1 AND ck > 2;
will attempt to delete from an empty range of rows. This is translated
to the following mutation:
{ks.tbl {key: pk{000400000000}, token:-3485513579396041028}
{mutation_partition:
static: cont=1 {row: },
clustered: {}}}
Such a mutation does not contain any timestamp, therefore it is difficult
to determine which timestamp was used while making the query. This is
problematic for CDC, because an entry in the CDC log should be written with
the same timestamp as the mutation.
Because an empty mutation does not modify the table in any way, we can
safely skip logging such mutations in CDC and still preserve the
ability to reconstruct the current state of the base table from full
CDC log.
Tests: unit(dev)
Before this patch, `transform` did not generate any log rows about
static row change. This commit fixes that - now, a log row is created if
a static row is changed, and this row is separate from the rows that
describe changes to the clustering rows.
This is a preparation for moving the lambda outside the for loop.
- `log_ck`, `pikey`, `pirow` are now passed as arguments,
- `value` is now a variable local to the lambda,
- `ttl` is now a variable local to the lambda that is returned.
Most test-methods log a message with their names upon entering them.
This helps identify, from the logs, the test-method in which a failure
happened. Two methods were missing this log line, so add it.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200304155235.46170-1-bdenes@scylladb.com>
Regular compaction relies on the compaction manager to run compaction jobs
until the compaction strategy is satisfied. Resharding, on the other hand,
is a one-off operation which runs only once in the compaction manager,
and leaves the sstable set in such a state that the strategy is very
likely unsatisfied. We need to trigger regular compaction whenever
a resharding job replaces a shared sstable with an unshared one,
so that compaction will not fall way behind due to the many new sstables
created by the resharding process.
Fixes #5262.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200217144946.20338-1-raphaelsc@scylladb.com>
Merged patch series from Avi Kivity:
boost/multiprecision is a heavyweight library, pulling 20,000 lines of code into
each header that depends on it. It is used by converting_mutation_partition_applier
and types.hh. While the former is easy to move out-of-line, the latter is not.
All we really need is to forward-declare boost::multiprecision::cpp_int, but that
is not easy - it is a template taking several parameters, among which are non-type
template parameters also defined in that header. So it's quite difficult to
disentangle, and fragile with respect to boost changes.
This patchset introduces a wrapper type utils::multiprecision_int which _can_
be forward declared, and together with a few other small fixes, manages to
uninclude boost/multiprecision from most of the source files. The total reduction
in number of lines compiled over a full build is 324 * 23,227 or around 7.5
million.
Tests: unit (dev)
Ref #1
https://github.com/avikivity/scylla uninclude-boost-multiprecision/v1
Avi Kivity (5):
converting_mutation_partition_applier: move to .cc file
utils: introduce multiprecision_int
tests: cdc_test: explicitly convert from cdc::operation to uint8_t
treewide: use utils::multiprecision_int for varint implementation
types: forward-declare multiprecision_int
configure.py | 2 +
concrete_types.hh | 2 +-
converting_mutation_partition_applier.hh | 163 ++-------------
types.hh | 12 +-
utils/big_decimal.hh | 3 +-
utils/multiprecision_int.hh | 256 +++++++++++++++++++++++
converting_mutation_partition_applier.cc | 188 +++++++++++++++++
cql3/functions/aggregate_fcts.cc | 10 +-
cql3/functions/castas_fcts.cc | 28 +--
cql3/type_json.cc | 2 +-
lua.cc | 38 ++--
mutation_partition_view.cc | 2 +
test/boost/cdc_test.cc | 6 +-
test/boost/cql_query_test.cc | 16 +-
test/boost/json_cql_query_test.cc | 12 +-
test/boost/types_test.cc | 58 ++---
test/boost/user_function_test.cc | 2 +-
test/lib/random_schema.cc | 14 +-
types.cc | 20 +-
utils/big_decimal.cc | 4 +-
utils/multiprecision_int.cc | 37 ++++
21 files changed, 627 insertions(+), 248 deletions(-)
create mode 100644 utils/multiprecision_int.hh
create mode 100644 converting_mutation_partition_applier.cc
create mode 100644 utils/multiprecision_int.cc
This reduces the number of translation units that depend on
boost/multiprecision from 354 to 30, and reduces the size of
database.i (as an example) from 406160 to 382933 (smaller
files will benefit more, relatively).
Ref #1
The goal is to forward-declare utils::multiprecision_int, something
beyond my capabilities for boost::multiprecision::cpp_int, to reduce
compile time bloat.
The patch is mostly search-and-replace, with a few casts added to
disambiguate conversions the compiler had trouble with.
After the varint data type starts using the new multiprecision_int type,
this code fails to compile. I expect that somehow the conversion from enum
class to cpp_int was allowed to succeed, and we ended up with a data_value
of type varint. The tests succeeded because the serialized representation
happened to be the same.
Previously we had stream_id_1 and stream_id_2 columns,
each of type long, which together formed the partition key.
In the new format we want a single stream_id column that
forms the partition key. To still be able to store two
longs, the new column has type blob, and its value is
the concatenated bytes of the two longs the
partition key used to be composed of.
We still want the partition key to logically be two longs,
because those two values will be used by a custom partitioner
once we implement it.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
multiprecision_int is a wrapper around boost::multiprecision::cpp_int that adds
no functionality. The intent is to allow forward declaration; cpp_int is so
complicated that just finding out what its true type is is a difficult exercise,
as it depends on many internal declarations.
Because cpp_int uses expression templates, the implementation has to explicitly
cast to the desired type in many places, otherwise the C++ compiler is presented
with too many choices, especially in conjunction with data_value (which can
convert from many different types too).
converting_mutation_partition_applier is a heavyweight class that is not
used in the hot path, so it can be safely out-of-lined. This moves
some includes of boost/multiprecision out of header files, where they
can infect a lot of code.
mutation_partition_view.cc's includes were adjusted to recover
missing dependencies.
The previous version erroneously used a local db reference
which was propagated to another shard. This time, carry
the sharded instance and use .local() as before.
tests: unit(dev)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200303221729.31261-1-xemul@scylladb.com>