Nadav Har'El fec562ec8f Materialized views: limit size of row batching during bulk view building
The bulk materialized-view building processes (when adding a materialized
view to a table with existing data) currently reads the base table in
batches of 128 (view_builder::batch_size) rows. This is clearly better
than reading entire partitions (which may be huge), but still, 128 rows
may grow pretty large when we have rows with large strings or blobs,
and there is no real reason to buffer 128 rows when they are large.

Instead, when the rows we read so far exceed some size threshold (in this
patch, 1MB), we can operate on them immediately instead of waiting for
128.

As a side-effect, this patch also solves another bug: At worst case, all
the base rows of one batch may be written into one output view partition,
in one mutation. But there is a hard limit on the size of one mutation
(commitlog_segment_size_in_mb, by default 32MB), so we cannot allow the
batch size to exceed this limit. By not batching further after 1MB,
we avoid reaching this limit when individual rows do not reach it but
128 of them did.

Fixes #4213.

This patch also includes a unit test reproducing #4213, and demonstrating
that it is now solved.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190214093424.7172-1-nyh@scylladb.com>
2019-02-14 12:04:40 +02:00
2018-11-21 00:01:44 +02:00
2019-01-30 11:17:38 +02:00
2018-08-01 16:50:58 +01:00
2018-12-03 11:18:02 +02:00
2018-11-26 18:59:41 +01:00
2019-02-12 18:42:07 +02:00
2019-01-28 15:03:14 -08:00
2018-11-27 13:01:02 +02:00
2018-06-19 16:26:51 +03:00
2018-11-21 00:01:44 +02:00
2018-11-21 00:01:44 +02:00
2018-11-21 00:01:44 +02:00
2018-11-21 00:01:44 +02:00
2019-02-13 13:33:50 +02:00
2017-06-23 11:35:35 -04:00
2016-04-08 08:12:47 +03:00
2018-11-21 00:01:44 +02:00
2018-02-14 14:15:59 -05:00
2019-01-22 15:34:32 +02:00
2018-11-01 13:16:17 +00:00
2018-11-21 00:01:44 +02:00
2017-08-27 13:11:33 +03:00
2019-02-11 09:25:25 +01:00
2018-11-28 23:54:03 +01:00
2017-06-23 11:35:35 -04:00
2018-11-21 00:01:44 +02:00
2018-11-21 00:01:44 +02:00
2018-11-21 00:01:44 +02:00
2018-11-21 00:01:44 +02:00
2018-12-12 16:49:01 +08:00

Scylla

Quick-start

$ git submodule update --init --recursive
$ sudo ./install-dependencies.sh
$ ./configure.py --mode=release
$ ninja-build -j4 # Assuming 4 system threads.
$ ./build/release/scylla
$ # Rejoice!

Please see HACKING.md for detailed information on building and developing Scylla.

Note: GCC >= 8.1.1 is require to compile Scylla.

Note: See frozen toolchain for a way to build and run on an older distribution.

Running Scylla

  • Run Scylla
./build/release/scylla

  • run Scylla with one CPU and ./tmp as data directory
./build/release/scylla --datadir tmp --commitlog-directory tmp --smp 1
  • For more run options:
./build/release/scylla --help

Building Fedora RPM

As a pre-requisite, you need to install Mock on your machine:

# Install mock:
sudo yum install mock

# Add user to the "mock" group:
usermod -a -G mock $USER && newgrp mock

Then, to build an RPM, run:

./dist/redhat/build_rpm.sh

The built RPM is stored in /var/lib/mock/<configuration>/result directory. For example, on Fedora 21 mock reports the following:

INFO: Done(scylla-server-0.00-1.fc21.src.rpm) Config(default) 20 minutes 7 seconds
INFO: Results and/or logs in: /var/lib/mock/fedora-21-x86_64/result

Building Fedora-based Docker image

Build a Docker image with:

cd dist/docker
docker build -t <image-name> .

Run the image with:

docker run -p $(hostname -i):9042:9042 -i -t <image name>

Contributing to Scylla

Guidelines for contributing

Description
No description provided
Readme 459 MiB
Languages
C++ 72.3%
Python 26.5%
CMake 0.3%
GAP 0.3%
Shell 0.3%