Commit Graph

7 Commits

Author SHA1 Message Date
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Piotr Jastrzebski
649f254863 cdc: Limit size of topology description
Currently, whole topology description for CDC is stored in a single row.
This means that for a large cluster of strong machines (say 100 nodes 64
cpus each), the size of the topology description can reach 32MB.

This causes multiple problems. First of all, there's a hard limit on
mutation size that can be written to Scylla. It's related to commit log
block size which is 16MB by default. Mutations bigger than that can't be
saved. Moreover, such big partitions/rows cause reactor stalls and
negatively influence latency of other requests.

This patch limits the size of topology description to about 4MB. This is
done by reducing the number of CDC streams per vnode and can lead to CDC
data not being fully colocated with Base Table data on shards. It can
impact performance and consistency of data.

This is just a quick fix to make it easily backportable. A full solution
to the problem is under development.

For more details see #7961, #7993 and #7985.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2021-02-17 13:24:40 +01:00
Calle Wilund
05851578d4 alternator::streams: Report streams as not ready until CDC stream id:s are available
Refs #6864

When booting a clean scylla, CDC stream ID:s will not be availble until
a n*ring delay time period has passed. Before this, writing to a CDC
enabled table will fail hard.
For alternator (and its tests), we can report the stream(s) for tables as not yet
available (ENABLING) until such time as id:s are
computed.

v2:
* Keep storage service ref in executor
2020-08-03 20:34:15 +03:00
Piotr Jastrzebski
f0f6e220ea cdc: stop using partitioners
CDC can get all it needs from a config and does not need
partitioner.

For base table specific operations CDC is using partitioner
from that table (obtained with schema::get_partitioner).

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-02-17 10:59:15 +01:00
Piotr Jastrzebski
50cfe81331 murmur3: move sharding logic to token and i_partitioner
Since token representation is fixed now, all the partitioners
will share the sharding logic. It makes sense now to keep
the logic in common super class and separate header that's
included only in i_partitioner.cc.

shard_of and token_for_next_shard are now implemented in
i_partitioner. They would be non-virtual but we have to
keep them virtual because one test is overriding them
to enforce some specific sharding.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-02-05 09:31:32 +01:00
Kamil Braun
834c2ca997 cdc: add cdc::metadata class
The class stores a queue of CDC generations to be used for choosing
streams when writing to the CDC log.

This data structure will be updated on some gossip events (when a new node
joins the cluster and proposes a new generation of CDC streams).
2020-01-30 11:10:08 +01:00