Commit Graph

21477 Commits

Author SHA1 Message Date
Piotr Sarna
6df132436f cql3: disallow range deletions for specific columns
Range deletions of specific columns are not well-defined
(range tombstones cover entire rows) and are forbidden
in Cassandra, so we follow suit.
This commit comes with a simple test.

Fixes #5728
Tests: unit(dev)
Message-Id: <896264f5f5790b9f96fcc18655ac3248a6abf37a.1583424131.git.sarna@scylladb.com>
2020-03-06 10:04:05 +02:00
Piotr Sarna
5b7a35e02b network_topology_strategy: validate integers
In order to prevent users from creating a network topology
strategy instance with invalid inputs, it's not enough to use
std::stol() on the input: a string "3abc" still returns the number '3',
but will later confuse cqlsh and other drivers, when they ask for
topology strategy details.
The error message is now more human readable, since for incorrect
numeric inputs it used to return a rather cryptic message:
    ServerError: stol()
This commit fixes the issue and comes with a simple test.
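
The strict check described above can be sketched as follows. This is a minimal illustration only: `parse_replication_factor` and its error text are hypothetical names chosen for this example, not the code touched by this commit. It relies on the second argument of std::stol(), which reports how many characters were consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption for illustration, not Scylla's code):
// parse a numeric option and reject input that std::stol() only partly reads.
long parse_replication_factor(const std::string& text) {
    std::size_t consumed = 0;
    long value = 0;
    try {
        // std::stol() stops at the first non-digit and reports how far it got.
        value = std::stol(text, &consumed);
    } catch (const std::exception&) {
        // Covers both std::invalid_argument and std::out_of_range.
        throw std::invalid_argument("replication factor is not a valid integer: " + text);
    }
    assert(consumed > 0);  // a successful std::stol() consumed at least one character
    if (consumed != text.size()) {
        // Trailing garbage, such as the "abc" in "3abc".
        throw std::invalid_argument("replication factor is not a valid integer: " + text);
    }
    return value;
}
```

With this sketch, "3" parses to 3, while "3abc" and "abc" are both rejected with a readable message instead of the bare "stol()" text.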

Fixes #3801
Tests: unit(dev)
Message-Id: <7aaae83d003738f047d28727430ca0a5cec6b9c6.1583478000.git.sarna@scylladb.com>
2020-03-06 09:50:33 +02:00
Piotr Sarna
30d2826358 Merge 'cdc: use cdc schema extension for storing...
... and reading cdc metadata' from Piotr

Currently, information on what cdc options are enabled
in a table - cdc metadata in short - is stored in two places:

    In cdc column of the system_schema.scylla_tables,
    In a cdc schema extension.

The former is used as a source of truth, i.e. a node reads cdc metadata
from that column, while the latter is used for cosmetic purposes
(e.g. cqlsh displays info on cdc based on this extension)
and is only written, but never read by the node.

Introducing the cdc column to scylla_tables made the logic
of schema agreement more complicated. As a first step of removing
this column, this PR makes the cdc schema extension the
"source of truth" - a node will from now on read cdc metadata
from that extension.

The cdc column will be deprecated and removed in subsequent releases,
but it is left for now and will still be written to in order not to break
the logic of schema agreement.

Acked-by: Nadav Har-El <nyh@scylladb.com>

Refs: #5737
Tests: unit(dev), 2-node cluster upgrade under write load to a cdc-enabled table

* piodul/5737-cdc-schema-extension:
  schema: get cdc options from schema extensions
  alter_table_statement: fix indentation
  cf_prop_defs: initialize schema extensions externally
  cf_prop_defs: move checking of cdc support to ::validate
  cf_prop_defs: pass database& to ::validate, not db::extensions&
  unit tests: register cdc extension before tests
  cdc: construct cdc_options directly inside cdc_extension
  db::extensions: add shorthands for add_schema_extension
2020-03-05 16:31:40 +01:00
Piotr Dulikowski
861c7b5626 schema: get cdc options from schema extensions
Removes logic responsible for setting cdc_options from dedicated column
in scylla_tables, and uses the "cdc" schema extension instead.
2020-03-05 16:11:21 +01:00
Piotr Dulikowski
e98766dd81 alter_table_statement: fix indentation 2020-03-05 16:11:21 +01:00
Piotr Dulikowski
828077be5e cf_prop_defs: initialize schema extensions externally
Moves initialization of schema extensions outside of cf_prop_defs. This
allows these extensions to be constructed once and used several times in
cf_prop_defs' methods without caching or recalculating them.
2020-03-05 16:11:21 +01:00
Piotr Dulikowski
0bdc22e33b cf_prop_defs: move checking of cdc support to ::validate
Validation of CDC options fits better into the `validate` method rather
than `apply_to_builder`.
2020-03-05 16:11:21 +01:00
Piotr Dulikowski
260c47d758 cf_prop_defs: pass database& to ::validate, not db::extensions&
Changes cf_prop_defs::validate function to take database& as an argument
instead of db::extensions&. This change will allow us to move the check
which asserts that the cluster supports CDC from `apply_to_builder` to
`validate` method.
2020-03-05 16:11:21 +01:00
Piotr Dulikowski
38b7f1ad45 unit tests: register cdc extension before tests
In the following commits, using cdc in tests will require registering
cdc extension explicitly in db config.
2020-03-05 16:11:20 +01:00
Piotr Dulikowski
0f4f48ef76 cdc: construct cdc_options directly inside cdc_extension
Instead of storing a raw map of options inside `cdc_extension`, the
extension now converts them into `cdc_options` directly on construction.
This removes the need to construct `cdc_options` object multiple times.
2020-03-05 16:09:44 +01:00
Piotr Dulikowski
6895b0e395 db::extensions: add shorthands for add_schema_extension
This abstracts away a pattern used everywhere when adding a schema
extension.
2020-03-05 16:09:44 +01:00
Piotr Sarna
c35160457b Merge 'Clean up stream_id representation' from Piotr
With #5950 we changed the representation of stream_id
in CDC Log from two int columns to a single blob column.

This PR cleans up stream_id representation internally.
Now stream_id is stored as blob both in-memory and in
internal CDC tables.

Tests: unit(dev)

* hawk/stream_id_representation:
  cdc: store stream_ids as blobs in internal tables
  cdc: improve do_update_streams_description
  cdc: Fix generate_topology_description
  cdc: add stream_id::operator<
  cdc: change stream_id representation
2020-03-05 14:14:29 +01:00
Tomasz Grabiec
d5557023f6 Merge "Stop using BOOST_TEST_MESSAGE() in unit tests" from Kostja
Stop using BOOST_TEST_MESSAGE() in unit tests, it bloats test XML
output. Use Scylla logger instead.

Test: unit (debug, dev, release)
2020-03-05 13:27:30 +01:00
Calle Wilund
b48255a4cd db::commitlog: Only zero disk blocks not already allocated in segment
Fixes #5891
Refs #5899

When creating segments with the o_dsync option active, we write max_size
zeros to disk, to ensure actual disk blocks are allocated.

However, if we recycle a segment, we should, when not actually creating
a new file, check the existing size on disk, and only zero any blocks
not already allocated (i.e. if recycled file was smaller than max_size,
due to segment truncation on close).
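
The size arithmetic behind this can be sketched as below. It is an assumption-laden illustration: `zero_fill` and `region_to_zero` are hypothetical names, and the sketch ignores block alignment and the actual I/O, which the real commitlog code has to handle.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of the byte range that still needs zeroing.
struct zero_fill {
    std::uint64_t offset;
    std::uint64_t length;
};

// For a newly created file pass existing_size == 0, so the whole segment is
// zeroed. For a recycled file only the tail beyond its current size is zeroed.
zero_fill region_to_zero(std::uint64_t existing_size, std::uint64_t max_size) {
    assert(max_size > 0);  // a segment always has a non-zero maximum size
    if (existing_size >= max_size) {
        return {max_size, 0};  // every block is already allocated
    }
    return {existing_size, max_size - existing_size};
}
```

For example, a recycled file of 8 units with max_size 32 needs only the range starting at offset 8 with length 24 to be zeroed.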

test: unit
Message-Id: <20200226121601.15347-1-calle@scylladb.com>
2020-03-05 13:27:08 +01:00
Piotr Sarna
875d230298 Merge "CDC: use a single cdc$time value for a batch of changes"
from Kamil.

If a batch update is performed with a sequence of changes with a single
timestamp, they will now show up in CDC with a single timeuuid
in the cdc$time column, distinguished by different cdc$batch_seq_no values.

Fixes #5953.

Tests: unit(dev)

* haaawk/splitbatch:
  cdc: use a single timeuuid value for a batch of changes
  cdc: replace `split` with `for_each_change`
2020-03-05 13:17:34 +01:00
Pavel Emelyanov
7bc34c17eb range-streamer: Tune the progress message
Now it will show the full info about the range being streamed, like

range_streamer - Rebuild with 127.0.0.2 for keyspace=ks2, streaming [72, 96) out of 248 ranges

The [x, y) range is half-open, so the full streaming progress
can then be logged like

... streaming [0, 16) out of 36 ranges   <- first send
... streaming [16, 24) out of 36 ranges
... streaming [24, 36) out of 36 ranges  <- last send
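
A small sketch of how such half-open progress lines can be produced. The helper names `progress_message` and `progress_log` are hypothetical and the batch sizes are taken from the example above; this is not the range_streamer code itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Formats the progress text for a batch covering ranges [begin, end).
std::string progress_message(std::size_t begin, std::size_t end, std::size_t total) {
    assert(begin <= end && end <= total);
    return "streaming [" + std::to_string(begin) + ", " + std::to_string(end) +
           ") out of " + std::to_string(total) + " ranges";
}

// Produces one line per batch. Because each range is half-open, the end of
// one batch is exactly the begin of the next, with no overlap and no gap.
std::vector<std::string> progress_log(const std::vector<std::size_t>& batch_sizes) {
    std::size_t total = 0;
    for (std::size_t n : batch_sizes) {
        total += n;
    }
    std::vector<std::string> lines;
    std::size_t begin = 0;
    for (std::size_t n : batch_sizes) {
        lines.push_back(progress_message(begin, begin + n, total));
        begin += n;
    }
    return lines;
}
```

Calling `progress_log({16, 8, 12})` yields the three lines shown above, ending with the batch whose upper bound equals the total.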

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200304101505.5506-1-xemul@scylladb.com>
2020-03-05 12:56:29 +01:00
Kamil Braun
3200d415da cdc: use a single timeuuid value for a batch of changes
If a batch update is performed with a sequence of changes with a single
timestamp, they will now show up in CDC with a single timeuuid in the
`time` column, distinguished by different `batch_seq_no` values.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-05 12:32:57 +01:00
Konstantin Osipov
94ee511f6a lwt: implement cas_failed_read_round_optimization metric
Presently lightweight transactions piggyback the old
row value on prepare round response. If one of the participants
did not provide the old value or the values from peers don't match,
we perform a full read round which will repair the Paxos table and the
base table, if necessary, at all participants.

Capture the fact that read optimization has failed in a metric.
Message-Id: <20200304192955.84208-2-kostja@scylladb.com>
2020-03-05 12:20:45 +01:00
Kamil Braun
292eba9da0 cdc: replace split with for_each_change
`for_each_change` is like `split` but it doesn't return a vector of
mutations representing each change; instead, it takes as a parameter
a function which gets called on each mutation.

This reduces memory usage and allows preserving common context
when handling each change (which will be useful in the next commits).
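
The difference between the two styles can be sketched as below. This is an illustration under stated assumptions: `cell_write` is a hypothetical stand-in for the real mutation types, and this `for_each_change` only mirrors the callback shape described above, not the actual signature in the cdc code.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for a single write inside a mutation.
struct cell_write {
    long timestamp;
    std::string column;
};

// Calls on_change(first, last) once per run of writes sharing a timestamp.
// Nothing is materialized, and the caller's lambda can keep shared context
// (counters, buffers) alive across all invocations.
template <typename Func>
void for_each_change(const std::vector<cell_write>& writes, Func&& on_change) {
    assert(std::is_sorted(writes.begin(), writes.end(),
        [](const cell_write& a, const cell_write& b) { return a.timestamp < b.timestamp; }));
    auto first = writes.begin();
    while (first != writes.end()) {
        const long ts = first->timestamp;
        auto last = std::find_if(first, writes.end(),
            [ts](const cell_write& w) { return w.timestamp != ts; });
        on_change(first, last);
        first = last;
    }
}
```

A caller passes a lambda that captures its context by reference, for example a running counter of changes, instead of receiving a vector and iterating over it afterwards.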

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-05 12:05:08 +01:00
Pekka Enberg
0beb45faf3 build: Use reloc dynamic linker unconditionally
The relocatable package requires a magic dynamic linker path for
"patchelf" to work correctly. Therefore, use the "get-dynamic-linker.sh"
script to unconditionally define a magic dynamic linker path to ensure
that building the relocatable package with ninja build ("ninja-build
build/<mode>/scylla-package.tar.gz") is always correct. Although the
path looks odd with a lot of leading slashes, it works outside
relocatable package too.
Message-Id: <20200305091919.6315-2-penberg@scylladb.com>
2020-03-05 12:53:28 +02:00
Pekka Enberg
8a810cc41a reloc: Move dynamic linker magic to get-dynamic-linker.sh
In preparation for moving dynamic linker flags to ninja build, move the
magic dynamic linker path generation to "reloc/get-dynamic-linker.sh"
script that configure.py can call.
Message-Id: <20200305084331.5339-1-penberg@scylladb.com>
2020-03-05 12:53:22 +02:00
Konstantin Osipov
ac0717fb64 test: consistently use a global testlog object in all tests
Use test/lib/log.hh in all tests now that we have it.
2020-03-05 13:34:24 +03:00
Piotr Jastrzebski
57cfe6d0e1 cdc: store stream_ids as blobs in internal tables
In new CDC Log format stream_id is represented by a single
blob column so it makes sense to store it in the same form
everywhere - including internal CDC tables.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-05 11:31:22 +01:00
Piotr Jastrzebski
b2acdc9307 cdc: improve do_update_streams_description
Use std::set::insert that takes range instead of
looping through elements and adding them one by one.
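
For illustration, the two insertion styles look as follows. Plain ints stand in for stream ids and both function names are hypothetical; the point is only that the iterator-range overload of std::set::insert replaces the explicit loop.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Before: insert elements one at a time.
void add_one_by_one(std::set<int>& known, const std::vector<int>& fresh) {
    for (int id : fresh) {
        known.insert(id);
    }
}

// After: a single call to the iterator-range overload of std::set::insert.
void add_range(std::set<int>& known, const std::vector<int>& fresh) {
    const auto size_before = known.size();
    known.insert(fresh.begin(), fresh.end());
    assert(known.size() >= size_before);  // insertion never removes elements
}
```

Both functions leave the set in the same state; duplicates in the input are ignored either way.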

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-05 11:31:22 +01:00
Piotr Jastrzebski
446722d6ed cdc: Fix generate_topology_description
In new CDC Log format we store only a single stream_id column.
This means generate_topology_description has to use appropriate
schema for generating tokens for stream_ids.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-05 11:31:22 +01:00
Piotr Jastrzebski
9a212dcaef cdc: add stream_id::operator<
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-05 11:31:21 +01:00
Piotr Jastrzebski
f317a659d9 cdc: change stream_id representation
New CDC Log format stores stream ids as blobs.
It makes sense to keep them internally in the same form.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-05 11:30:10 +01:00
Piotr Sarna
f21bd57058 Merge "cdc: log static rows correctly" from Piotr
Currently, writes to a static row in a base table are not reflected
at all in the corresponding cdc log. This patch causes such writes
to be properly logged.

Fixes: #5744
Tests: unit(dev)

* piodul/5744-handle-static-row-correctly-in-cdc:
  cdc_test: add tests for handling static row
  cdc: fix indentation in transformer::transform
  cdc: handle static rows separately in transformer::transform
  cdc: move process_cells higher (and fix captured variables)
  cdc: reduce dependencies on captured variables in process_cells
  cdc: fix preimage query for static rows
2020-03-05 10:42:15 +01:00
Nadav Har'El
96ca5ac2c8 alternator: use separate smp_service_group for bouncing requests
Until this patch, we used the default_smp_service_group() when bouncing
Alternator requests between shards (which is needed for LWT).

This patch creates a new smp_service_group for this purpose, which is
limited to 5000 concurrent requests (the same limit used for CQL's
bounce_request_smp_service_group). The purpose of this limit is to avoid
many shards admitting a huge number of requests and bouncing all of them
to the same shard, which then cannot "unadmit" these requests.

Fixes #5664.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200304170825.27226-1-nyh@scylladb.com>
2020-03-05 10:17:51 +01:00
Konstantin Osipov
ff3f9cb7cf test: stop using BOOST_TEST_MESSAGE() for logging
We use boost test logging primarily to generate nice XML xunit
files used in Jenkins. These XML files can be bloated
with messages from BOOST_TEST_MESSAGE(), adding up to hundreds of
megabytes of build archives on every build.

Let's use seastar logger for test logging instead, reserving
the use of boost log facilities for boost test markup information.
2020-03-05 11:38:11 +03:00
Juliusz Stasiewicz
c8527f20b0 CDC+LWT: fix missing CDC entries for successful LWTs
Now, if CDC is enabled, `paxos_response_handler::learn_decision()`
augments the base table mutation. The differences in logic between:
(1) `mutate_internal<std::vector<mutation>>()`
and
(2) `mutate_internal<std::vector<std::tuple<paxos::proposal, schema_ptr, ...>>>()`
make it necessary to separate "CDC mutations" from "base mutation"
and send them, respectively, to (1) and (2).

Gleb explained in #5869 why it became necessary to add CDC code to LWT
writes specifically, instead of doing it somewhere central that affects
all writes:

"All paths that do write goes through mutate_internally() eventually so it
would have been best to do augmentations there, but cdc chose to log only
certain writes and not others (unlike MV that does not care how write
happened) and mutate_internal have no idea which is which so I do not have
other choice but code duplication. ... paxos_response_handler::learn_decision
is probably the place to add cdc augmentation."

Fixes #5869
2020-03-05 09:49:19 +02:00
Piotr Dulikowski
204e204586 cdc: do not attempt to log empty mutations
It is possible to produce an empty mutation using CQL. For example, the
following query:

DELETE FROM ks.tbl WHERE pk = 0 AND ck < 1 AND ck > 2;

will attempt to delete from an empty range of rows. This is translated
to the following mutation:

{ks.tbl {key: pk{000400000000}, token:-3485513579396041028}
 {mutation_partition:
  static: cont=1 {row: },
  clustered: {}}}

Such mutation does not contain any timestamp, therefore it is difficult
to determine what timestamp was used while making the query. This is
problematic for CDC, because an entry in CDC log should be written with
the same timestamp as a part of the mutation.

Because an empty mutation does not modify the table in any way, we can
safely skip logging such mutations in CDC and still preserve the
ability to reconstruct the current state of the base table from full
CDC log.

Tests: unit(dev)
2020-03-05 08:32:54 +01:00
Piotr Dulikowski
e6751fad62 cdc_test: add tests for handling static row 2020-03-05 00:16:17 +01:00
Piotr Dulikowski
39519ce923 cdc: fix indentation in transformer::transform 2020-03-05 00:16:17 +01:00
Piotr Dulikowski
0d05b17881 cdc: handle static rows separately in transformer::transform
Before this patch, `transform` did not generate any log rows about
static row change. This commit fixes that - now, a log row is created if
a static row is changed, and this row is separate from the rows that
describe changes to the clustering rows.
2020-03-05 00:16:17 +01:00
Piotr Dulikowski
6a0b0b5786 cdc: move process_cells higher (and fix captured variables)
The `process_cells` lambda is moved outside the loop, because it will be
used by other code in subsequent commits.
2020-03-05 00:15:57 +01:00
Piotr Dulikowski
f136f6e02c cdc: reduce dependencies on captured variables in process_cells
This is a preparation for moving the lambda outside the for loop.

- `log_ck`, `pikey`, `pirow` are now passed as arguments,
- `value` is now a variable local to the lambda,
- `ttl` is now a variable local to the lambda that is returned.
2020-03-05 00:14:05 +01:00
Piotr Dulikowski
a7f51449c3 cdc: fix preimage query for static rows
For static rows, we need to fetch at least one row from its partition in
order to compute its preimage.
2020-03-04 18:43:55 +01:00
Botond Dénes
8b908a9aba test: lib/mutation_source_test: log the name of the test-method
Most test-methods log a message with their names upon entering them.
This helps identify, in the logs, the test-method in which a failure
happened. Two methods were missing this log line, so add it.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200304155235.46170-1-bdenes@scylladb.com>
2020-03-04 18:16:21 +02:00
Pekka Enberg
7fde2e28da dist/redhat: Specify files once in scylla.spec file
Silences the following warnings when building an RPM:

  warning: File listed twice: /opt/scylladb/scripts/libexec/hex2list.py
  warning: File listed twice: /opt/scylladb/scripts/libexec/node_exporter_install
  warning: File listed twice: /opt/scylladb/scripts/libexec/perftune.py
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla-blocktune
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla-housekeeping
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_bootparam_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_config_get.py
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_coredump_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_cpuscaling_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_cpuset_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_dev_mode_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_ec2_check
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_fstrim
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_fstrim_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_io_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_kernel_check
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_ntp_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_prepare
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_raid_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_selinux_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_stop
  warning: File listed twice: /opt/scylladb/scripts/libexec/scylla_sysconfig_setup
  warning: File listed twice: /opt/scylladb/scripts/libexec/seastar-addr2line
  warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses
  warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/LICENSE-crc32-vpmsum.TXT
  warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/README.md
  warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/apache-license-2.0.txt
  warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/boost-license-1.0.txt
  warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/date-license.txt
  warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/git-archive-all-license.txt
  warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/libdeflate-license.txt
  warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/xxhash-license.txt
  warning: File listed twice: /opt/scylladb/share/doc/scylla/licenses/zstd-license.txt

I verified that the files are in the generated RPMs after the change:

  [penberg@nero scylla]$ rpm -ql build/dist/dev/redhat/RPMS/x86_64/scylla-server-666.development-0.20200304.2bc700b008.x86_64.rpm | grep scripts.*libexec
  /opt/scylladb/scripts/libexec
  /opt/scylladb/scripts/libexec/hex2list.py
  /opt/scylladb/scripts/libexec/node_exporter_install
  /opt/scylladb/scripts/libexec/perftune.py
  /opt/scylladb/scripts/libexec/scylla-blocktune
  /opt/scylladb/scripts/libexec/scylla-housekeeping
  /opt/scylladb/scripts/libexec/scylla_bootparam_setup
  /opt/scylladb/scripts/libexec/scylla_config_get.py
  /opt/scylladb/scripts/libexec/scylla_coredump_setup
  /opt/scylladb/scripts/libexec/scylla_cpuscaling_setup
  /opt/scylladb/scripts/libexec/scylla_cpuset_setup
  /opt/scylladb/scripts/libexec/scylla_dev_mode_setup
  /opt/scylladb/scripts/libexec/scylla_ec2_check
  /opt/scylladb/scripts/libexec/scylla_fstrim
  /opt/scylladb/scripts/libexec/scylla_fstrim_setup
  /opt/scylladb/scripts/libexec/scylla_io_setup
  /opt/scylladb/scripts/libexec/scylla_kernel_check
  /opt/scylladb/scripts/libexec/scylla_ntp_setup
  /opt/scylladb/scripts/libexec/scylla_prepare
  /opt/scylladb/scripts/libexec/scylla_raid_setup
  /opt/scylladb/scripts/libexec/scylla_selinux_setup
  /opt/scylladb/scripts/libexec/scylla_setup
  /opt/scylladb/scripts/libexec/scylla_stop
  /opt/scylladb/scripts/libexec/scylla_sysconfig_setup
  /opt/scylladb/scripts/libexec/seastar-addr2line
  [penberg@nero scylla]$ rpm -ql build/dist/dev/redhat/RPMS/x86_64/scylla-server-666.development-0.20200304.2bc700b008.x86_64.rpm | grep license
  /opt/scylladb/share/doc/scylla/licenses
  /opt/scylladb/share/doc/scylla/licenses/LICENSE-crc32-vpmsum.TXT
  /opt/scylladb/share/doc/scylla/licenses/README.md
  /opt/scylladb/share/doc/scylla/licenses/apache-license-2.0.txt
  /opt/scylladb/share/doc/scylla/licenses/boost-license-1.0.txt
  /opt/scylladb/share/doc/scylla/licenses/date-license.txt
  /opt/scylladb/share/doc/scylla/licenses/git-archive-all-license.txt
  /opt/scylladb/share/doc/scylla/licenses/libdeflate-license.txt
  /opt/scylladb/share/doc/scylla/licenses/xxhash-license.txt
  /opt/scylladb/share/doc/scylla/licenses/zstd-license.txt

Message-Id: <20200304150057.2621-1-penberg@scylladb.com>
2020-03-04 17:25:53 +02:00
Tomasz Grabiec
da4bd3d2e6 Merge "Clean cql3 usage of storage_proxy and _service" from Pavel E.
This set removes _all_ mentions of storage_service and _all_ calls
for global storage_proxy instances from cql3/ code.

Tests: unit(dev)
2020-03-04 15:20:24 +01:00
Raphael S. Carvalho
3ba3ee2a7b distributed_loader: trigger regular compaction on resharding completion
Regular compaction relies on compaction manager to run compaction jobs
until compaction strategy is satisfied. Resharding, on the other hand,
is a one-off operation which runs only once in compaction manager,
and leaves the sstable set in such a way that the strategy is very
likely unsatisfied. We need to trigger regular compaction whenever
a resharding job replaces a shared sstable by an unshared sstable,
so that compaction will not fall way behind due to lots of new sstables
created by resharding process.

Fixes #5262.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200217144946.20338-1-raphaelsc@scylladb.com>
2020-03-04 16:08:13 +02:00
Nadav Har'El
f67a402c48 merge: Remove treewide dependency on boost/multiprecision
Merged patch series from Avi Kivity:

boost/multiprecision is a heavyweight library, pulling in 20,000 lines of code into
each header that depends on it. It is used by converting_mutation_partition_applier
and types.hh. While the former is easy to put out-of-line, the latter is not.

All we really need is to forward-declare boost::multiprecision::cpp_int, but that
is not easy - it is a template taking several parameters, among which are non-type
template parameters also defined in that header. So it's quite difficult to
disentangle, and fragile wrt boost changes.

This patchset introduces a wrapper type utils::multiprecision_int which _can_
be forward declared, and together with a few other small fixes, manages to
uninclude boost/multiprecision from most of the source files. The total reduction
in number of lines compiled over a full build is 324 * 23,227 or around 7.5
million.

Tests: unit (dev)
Ref #1

https://github.com/avikivity/scylla uninclude-boost-multiprecision/v1

Avi Kivity (5):
  converting_mutation_partition_applier: move to .cc file
  utils: introduce multiprecision_int
  tests: cdc_test: explicitly convert from cdc::operation to uint8_t
  treewide: use utils::multiprecision_int for varint implementation
  types: forward-declare multiprecision_int

 configure.py                             |   2 +
 concrete_types.hh                        |   2 +-
 converting_mutation_partition_applier.hh | 163 ++-------------
 types.hh                                 |  12 +-
 utils/big_decimal.hh                     |   3 +-
 utils/multiprecision_int.hh              | 256 +++++++++++++++++++++++
 converting_mutation_partition_applier.cc | 188 +++++++++++++++++
 cql3/functions/aggregate_fcts.cc         |  10 +-
 cql3/functions/castas_fcts.cc            |  28 +--
 cql3/type_json.cc                        |   2 +-
 lua.cc                                   |  38 ++--
 mutation_partition_view.cc               |   2 +
 test/boost/cdc_test.cc                   |   6 +-
 test/boost/cql_query_test.cc             |  16 +-
 test/boost/json_cql_query_test.cc        |  12 +-
 test/boost/types_test.cc                 |  58 ++---
 test/boost/user_function_test.cc         |   2 +-
 test/lib/random_schema.cc                |  14 +-
 types.cc                                 |  20 +-
 utils/big_decimal.cc                     |   4 +-
 utils/multiprecision_int.cc              |  37 ++++
 21 files changed, 627 insertions(+), 248 deletions(-)
 create mode 100644 utils/multiprecision_int.hh
 create mode 100644 converting_mutation_partition_applier.cc
 create mode 100644 utils/multiprecision_int.cc
2020-03-04 15:13:42 +02:00
Avi Kivity
5dee627f73 types: forward-declare multiprecision_int
This reduces the number of translation units that depend on
boost/multiprecision from 354 to 30, and reduces the size of
database.i (as an example) from 406160 to 382933 (smaller
files will benefit more, relatively).

Ref #1
2020-03-04 13:28:16 +02:00
Avi Kivity
3c772757c0 treewide: use utils::multiprecision_int for varint implementation
The goal is to forward-declare utils::multiprecision_int, something
beyond my capabilities for boost::multiprecision::cpp_int, to reduce
compile time bloat.

The patch is mostly search-and-replace, with a few casts added to
disambiguate conversions the compiler had trouble with.
2020-03-04 13:28:16 +02:00
Avi Kivity
874f65c58c tests: cdc_test: explicitly convert from cdc::operation to uint8_t
After the varint data type starts using the new multiprecision_int type,
this code fails to compile. I expect that somehow the conversion from enum
class to cpp_int was allowed to succeed, and we ended up with a data_value
of type varint. The tests succeeded because the serialized representation
happened to be the same.
2020-03-04 13:28:16 +02:00
Piotr Jastrzebski
354e3c34c8 cdc log: merge stream_id columns into a single column
Previously we had stream_id_1 and stream_id_2 columns
of type long each. They were forming a partition key.

In a new format we want a single stream_id column that
forms a partition key. To be able to still store two
longs, the new column will have type blob and its value
will be concatenated bytes of two longs that
partition key is composed of.

We still want partition key to logically be two longs
because those two values will be used by a custom partitioner
later once we implement it.
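
The packing can be sketched as below. The big-endian byte order and the names `to_blob` and `from_blob` are assumptions made for this illustration; the commit message only states that the blob is the concatenated bytes of the two longs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using blob = std::vector<std::uint8_t>;

// Appends the 8 bytes of value, most significant byte first (assumed order).
void append_be64(blob& out, std::int64_t value) {
    const auto bits = static_cast<std::uint64_t>(value);
    for (int shift = 56; shift >= 0; shift -= 8) {
        out.push_back(static_cast<std::uint8_t>(bits >> shift));
    }
}

// Reads 8 big-endian bytes starting at offset.
std::int64_t read_be64(const blob& in, std::size_t offset) {
    std::uint64_t bits = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        bits = (bits << 8) | in[offset + i];
    }
    return static_cast<std::int64_t>(bits);
}

// One 16-byte blob column instead of two long columns.
blob to_blob(std::int64_t first, std::int64_t second) {
    blob out;
    out.reserve(16);
    append_be64(out, first);
    append_be64(out, second);
    return out;
}

// Recovers the two logical longs from the blob.
std::pair<std::int64_t, std::int64_t> from_blob(const blob& b) {
    assert(b.size() == 16);
    return {read_be64(b, 0), read_be64(b, 8)};
}
```

The round trip keeps the partition key logically two longs: `from_blob(to_blob(a, b))` gives back the pair (a, b).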

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-04 13:27:48 +02:00
Avi Kivity
7434c81a29 utils: introduce multiprecision_int
multiprecision_int is a wrapper around boost::multiprecision::cpp_int that adds
no functionality. The intent is to allow forward declaration; cpp_int is so
complicated that just finding out its true type is a difficult exercise, as
it depends on many internal declarations.

Because cpp_int uses expression templates, the implementation has to explicitly
cast to the desired type in many places, otherwise the C++ compiler is presented
with too many choices, especially in conjunction with data_value (which can
convert from many different types too).
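
A minimal sketch of why a wrapper class helps. To stay self-contained it does not use Boost: the `heavy` namespace is a made-up stand-in that only imitates the shape of a template alias like cpp_int, and `to_int64` is a hypothetical function. A template alias cannot be forward-declared without repeating its parameters, whereas a plain class can.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// What a lightweight header needs: one forward declaration, no heavy include.
namespace utils { class multiprecision_int; }
std::int64_t to_int64(const utils::multiprecision_int& v);

// Stand-in for the heavyweight library (hypothetical, for illustration only).
namespace heavy {
template <unsigned MinBits, unsigned MaxBits, typename Allocator>
struct int_backend { long long value; };
template <typename Backend, bool ExpressionTemplates>
struct number { Backend backend; };
// Forward-declaring this alias would require repeating all of the above.
using cpp_int = number<int_backend<0, 0, std::allocator<unsigned long long>>, true>;
}

// Only the implementation file needs the full definition.
namespace utils {
class multiprecision_int {
    heavy::cpp_int _v;
public:
    explicit multiprecision_int(long long x) : _v{{x}} {}
    long long value() const { return _v.backend.value; }
};
}

std::int64_t to_int64(const utils::multiprecision_int& v) {
    return v.value();
}
```

Headers that only mention the type in function signatures can use the forward declaration, so the heavy definitions are compiled in far fewer translation units.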
2020-03-04 12:42:57 +02:00
Avi Kivity
414ec8c68e converting_mutation_partition_applier: move to .cc file
converting_mutation_partition_applier is a heavyweight class that is not
used in the hot path, so it can be safely out-of-lined. This moves
some includes of boost/multiprecision out of header files, where they
can infect a lot of code.

mutation_partition_view.cc's includes were adjusted to recover
missing dependencies.
2020-03-04 12:42:57 +02:00
Pavel Emelyanov
35b0e6dd7f repair_writer: Use db from repair_meta (2nd try)
The previous version erroneously used a local db reference
which was propagated into another shard. This time carry
the sharded instance and use .local() as before.

tests: unit(dev)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200303221729.31261-1-xemul@scylladb.com>
2020-03-04 11:31:52 +01:00