It is well known that Seastar applications, like Scylla, do not play
well with external processes: CPU usage from external processes may
confuse the I/O and CPU schedulers and create stalls.
We have also recently seen that anonymous and page cache memory used by
other applications can bring the system to OOM.
Linux has a very good infrastructure for resource control, contributed by
amazingly bright engineers, in the form of cgroup controllers. This
infrastructure is exposed by systemd in the form of slices: a
hierarchical structure to which controllers can be attached.
In true systemd fashion, the hierarchy is implicit in the filenames of the
slice files. A "-" symbol defines the hierarchy, so the files that this
patch presents, scylla-server and scylla-helper, essentially create a
"scylla" cgroup at the top level with "server" and "helper" children.
Later we mark the services needed to run Scylla as belonging to one
or the other through the Slice= directive.
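To make the naming rule concrete, here is a minimal sketch of how a dashed slice name expands into a nested cgroup path. The function name `slice_to_cgroup_path` is hypothetical and not part of this patch; it only mirrors the rule described above, where each "-" adds one level to the hierarchy.

```shell
#!/bin/bash
# Hypothetical helper (not part of this patch): expand a slice unit name into
# the nested path implied by the dashes in its name.
slice_to_cgroup_path() {
    local name="${1%.slice}"      # strip the ".slice" suffix
    local path="" prefix="" part
    local IFS='-'                 # split the name on dashes
    for part in $name; do
        prefix="${prefix:+$prefix-}$part"
        path="$path/$prefix.slice"
    done
    printf '%s\n' "${path#/}"
}

slice_to_cgroup_path scylla-server.slice
slice_to_cgroup_path scylla-helper.slice
```

Both names share the `scylla.slice` parent, which is why the two slice files end up as siblings under one top-level "scylla" cgroup.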
Scylla DBAs can benefit from this setup by using the systemd-run
utility to fire ad-hoc commands.
For example, suppose someone wants to run a backup and transfer files
to an external object store like S3, while making sure that the amount
of page cache used won't create swap pressure leading to database
timeouts.
One can then run something like:
```
sudo systemd-run --uid=`id -u scylla` --gid=`id -g scylla` -t --slice=scylla-helper.slice /path/to/my/magical_backup_tool
```
(or even better, the backup tool can itself be a systemd timer)
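For repeated ad-hoc use, the command above could be wrapped in a small function. This is only a sketch: `run_in_helper_slice` and the `DRY_RUN` variable are hypothetical names, not something this patch installs, and the dry-run mode exists so the command line can be inspected on a machine without systemd or a scylla account.

```shell
#!/bin/bash
# Hypothetical wrapper, not part of this patch: run a command in the
# scylla-helper slice under the given uid/gid.
# With DRY_RUN=1 the command line is printed instead of executed.
run_in_helper_slice() {
    local uid="$1" gid="$2"
    shift 2
    local cmd=(systemd-run --uid="$uid" --gid="$gid" -t
               --slice=scylla-helper.slice "$@")
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "${cmd[*]}"
    else
        sudo "${cmd[@]}"
    fi
}

# Dry run with the current user's ids, so it works without a scylla account.
DRY_RUN=1 run_in_helper_slice "$(id -u)" "$(id -g)" /path/to/my/magical_backup_tool
```

On a real node one would pass `"$(id -u scylla)"` and `"$(id -g scylla)"` and drop `DRY_RUN`.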
Changes from last version:
- No longer use the CPUQuota
- Minor typo fixes
- postinstall fixup for small machines
Benchmark results:
==================
Test: read from disk, with 100% disk util using a single i3.xlarge (4 vCPUs).
We have to fill the cache as we read, so this should stress CPU, memory and
disk I/O.
cassandra-stress command:
```
cassandra-stress read no-warmup duration=5m -rate threads=20 -node 10.2.209.188 -pop dist=uniform\(1..150000000\)
```
Baseline results:
```
Results:
Op rate : 13,830 op/s [READ: 13,830 op/s]
Partition rate : 13,830 pk/s [READ: 13,830 pk/s]
Row rate : 13,830 row/s [READ: 13,830 row/s]
Latency mean : 1.4 ms [READ: 1.4 ms]
Latency median : 1.4 ms [READ: 1.4 ms]
Latency 95th percentile : 2.4 ms [READ: 2.4 ms]
Latency 99th percentile : 2.8 ms [READ: 2.8 ms]
Latency 99.9th percentile : 3.4 ms [READ: 3.4 ms]
Latency max : 12.0 ms [READ: 12.0 ms]
Total partitions : 4,149,130 [READ: 4,149,130]
Total errors : 0 [READ: 0]
Total GC count : 0
Total GC memory : 0.000 KiB
Total GC time : 0.0 seconds
Avg GC time : NaN ms
StdDev GC time : 0.0 ms
Total operation time : 00:05:00
```
Question 1:
===========
Does putting Scylla in a special slice affect its performance?
Results with Scylla running in a slice:
```
Results:
Op rate : 13,811 op/s [READ: 13,811 op/s]
Partition rate : 13,811 pk/s [READ: 13,811 pk/s]
Row rate : 13,811 row/s [READ: 13,811 row/s]
Latency mean : 1.4 ms [READ: 1.4 ms]
Latency median : 1.4 ms [READ: 1.4 ms]
Latency 95th percentile : 2.2 ms [READ: 2.2 ms]
Latency 99th percentile : 2.6 ms [READ: 2.6 ms]
Latency 99.9th percentile : 3.3 ms [READ: 3.3 ms]
Latency max : 23.2 ms [READ: 23.2 ms]
Total partitions : 4,151,409 [READ: 4,151,409]
Total errors : 0 [READ: 0]
Total GC count : 0
Total GC memory : 0.000 KiB
Total GC time : 0.0 seconds
Avg GC time : NaN ms
StdDev GC time : 0.0 ms
Total operation time : 00:05:00
```
*Conclusion*: No significant change
Question 2:
===========
What happens when there is a CPU hog running on the same server as Scylla?
CPU hog:
```
taskset -c 0 /bin/sh -c "while true; do true; done" &
taskset -c 1 /bin/sh -c "while true; do true; done" &
taskset -c 2 /bin/sh -c "while true; do true; done" &
taskset -c 3 /bin/sh -c "while true; do true; done" &
sleep 330
```
Scenario 1: CPU hog runs freely:
```
Results:
Op rate : 2,939 op/s [READ: 2,939 op/s]
Partition rate : 2,939 pk/s [READ: 2,939 pk/s]
Row rate : 2,939 row/s [READ: 2,939 row/s]
Latency mean : 6.8 ms [READ: 6.8 ms]
Latency median : 5.3 ms [READ: 5.3 ms]
Latency 95th percentile : 11.0 ms [READ: 11.0 ms]
Latency 99th percentile : 14.9 ms [READ: 14.9 ms]
Latency 99.9th percentile : 17.1 ms [READ: 17.1 ms]
Latency max : 26.3 ms [READ: 26.3 ms]
Total partitions : 884,460 [READ: 884,460]
Total errors : 0 [READ: 0]
Total GC count : 0
Total GC memory : 0.000 KiB
Total GC time : 0.0 seconds
Avg GC time : NaN ms
StdDev GC time : 0.0 ms
Total operation time : 00:05:00
```
Scenario 2: CPU hog runs inside the scylla-helper slice:
```
Results:
Op rate : 13,527 op/s [READ: 13,527 op/s]
Partition rate : 13,527 pk/s [READ: 13,527 pk/s]
Row rate : 13,527 row/s [READ: 13,527 row/s]
Latency mean : 1.5 ms [READ: 1.5 ms]
Latency median : 1.4 ms [READ: 1.4 ms]
Latency 95th percentile : 2.4 ms [READ: 2.4 ms]
Latency 99th percentile : 2.9 ms [READ: 2.9 ms]
Latency 99.9th percentile : 3.8 ms [READ: 3.8 ms]
Latency max : 18.7 ms [READ: 18.7 ms]
Total partitions : 4,069,934 [READ: 4,069,934]
Total errors : 0 [READ: 0]
Total GC count : 0
Total GC memory : 0.000 KiB
Total GC time : 0.0 seconds
Avg GC time : NaN ms
StdDev GC time : 0.0 ms
Total operation time : 00:05:00
```
*Conclusion*: With the systemd slice we can keep the performance very close
to baseline
Question 3:
===========
What happens when there is an I/O hog running on the same server as Scylla?
I/O hog (data in the cluster is 2x the size of memory):
```
while true; do
find /var/lib/scylla/data -type f -exec grep glauber {} +
done
```
Scenario 1: I/O hog runs freely:
```
Results:
Op rate : 7,680 op/s [READ: 7,680 op/s]
Partition rate : 7,680 pk/s [READ: 7,680 pk/s]
Row rate : 7,680 row/s [READ: 7,680 row/s]
Latency mean : 2.6 ms [READ: 2.6 ms]
Latency median : 1.3 ms [READ: 1.3 ms]
Latency 95th percentile : 7.8 ms [READ: 7.8 ms]
Latency 99th percentile : 10.9 ms [READ: 10.9 ms]
Latency 99.9th percentile : 16.9 ms [READ: 16.9 ms]
Latency max : 40.8 ms [READ: 40.8 ms]
Total partitions : 2,306,723 [READ: 2,306,723]
Total errors : 0 [READ: 0]
Total GC count : 0
Total GC memory : 0.000 KiB
Total GC time : 0.0 seconds
Avg GC time : NaN ms
StdDev GC time : 0.0 ms
Total operation time : 00:05:00
```
Scenario 2: I/O hog runs in the scylla-helper systemd slice:
```
Results:
Op rate : 13,277 op/s [READ: 13,277 op/s]
Partition rate : 13,277 pk/s [READ: 13,277 pk/s]
Row rate : 13,277 row/s [READ: 13,277 row/s]
Latency mean : 1.5 ms [READ: 1.5 ms]
Latency median : 1.4 ms [READ: 1.4 ms]
Latency 95th percentile : 2.4 ms [READ: 2.4 ms]
Latency 99th percentile : 2.9 ms [READ: 2.9 ms]
Latency 99.9th percentile : 3.5 ms [READ: 3.5 ms]
Latency max : 183.4 ms [READ: 183.4 ms]
Total partitions : 3,984,080 [READ: 3,984,080]
Total errors : 0 [READ: 0]
Total GC count : 0
Total GC memory : 0.000 KiB
Total GC time : 0.0 seconds
Avg GC time : NaN ms
StdDev GC time : 0.0 ms
Total operation time : 00:05:00
```
*Conclusion*: With the systemd slice we can keep the performance very close
to baseline
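To put "very close to baseline" in numbers, the op rates reported above can be expressed as a percentage of the baseline run. The sketch below only restates figures already listed in the results; the function name `pct_of_baseline` and the scenario labels are illustrative.

```shell
#!/bin/bash
# Op rates (op/s) copied from the cassandra-stress summaries in this message.
baseline=13830

# Print a rate as a percentage of the baseline, with one decimal place.
pct_of_baseline() {
    awk -v rate="$1" -v base="$baseline" 'BEGIN { printf "%.1f\n", 100 * rate / base }'
}

for entry in \
    "scylla in slice:13811" \
    "CPU hog, free:2939" \
    "CPU hog, helper slice:13527" \
    "I/O hog, free:7680" \
    "I/O hog, helper slice:13277"; do
    label="${entry%%:*}"
    rate="${entry##*:}"
    printf '%-24s %6s op/s  %5s%%\n' "$label" "$rate" "$(pct_of_baseline "$rate")"
done
```

By this measure the helper slice keeps throughput within about 4% of baseline in both hog scenarios, against roughly 21% (CPU hog) and 56% (I/O hog) of baseline when the hogs run unconstrained.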
Signed-off-by: Glauber Costa <glauber@scylladb.com>
212 lines
7.6 KiB
Bash
Executable File
#!/bin/bash
#
# Copyright (C) 2018 ScyllaDB
#

#
# This file is part of Scylla.
#
# Scylla is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Scylla is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Scylla. If not, see <http://www.gnu.org/licenses/>.
#

set -e

print_usage() {
    cat <<EOF
Usage: install.sh [options]

Options:
  --root /path/to/root         alternative install root (default /)
  --prefix /prefix             directory prefix (default /opt/scylladb)
  --python3 /opt/python3       path of the python3 interpreter relative to install root (default /opt/scylladb/python3/bin/python3)
  --housekeeping               enable housekeeping service
  --target centos              specify target distribution
  --disttype [redhat|debian]   specify type of distribution (redhat or debian)
  --pkg package                specify build package (server/conf/kernel-conf)
  --help                       this helpful message
EOF
    exit 1
}

root=/
prefix=/opt/scylladb
housekeeping=false
target=centos
python3=/opt/scylladb/python3/bin/python3

while [ $# -gt 0 ]; do
    case "$1" in
        "--root")
            root="$2"
            shift 2
            ;;
        "--prefix")
            prefix="$2"
            shift 2
            ;;
        "--housekeeping")
            housekeeping=true
            shift 1
            ;;
        "--target")
            target="$2"
            shift 2
            ;;
        "--python3")
            python3="$2"
            shift 2
            ;;
        "--disttype")
            disttype="$2"
            shift 2
            ;;
        "--pkg")
            pkg="$2"
            shift 2
            ;;
        "--help")
            shift 1
            print_usage
            ;;
        *)
            print_usage
            ;;
    esac
done
if [ -n "$pkg" ] && [ "$pkg" != "server" -a "$pkg" != "conf" -a "$pkg" != "kernel-conf" ]; then
    print_usage
    exit 1
fi

rprefix="$root/$prefix"
retc="$root/etc"
rusr="$root/usr"
rdoc="$rprefix/share/doc"

is_redhat=false
is_debian=false
MUSTACHE_DIST="\"$target\": true, \"target\": \"$target\""
if [ "$disttype" = "redhat" ]; then
    MUSTACHE_DIST="\"redhat\": true, $MUSTACHE_DIST"
    is_redhat=true
    sysconfdir=sysconfig
elif [ "$disttype" = "debian" ]; then
    MUSTACHE_DIST="\"debian\": true, $MUSTACHE_DIST"
    is_debian=true
    sysconfdir=default
else
    print_usage
    exit 1
fi

mkdir -p build
pystache dist/common/systemd/scylla-server.service.mustache "{ $MUSTACHE_DIST }" > build/scylla-server.service
pystache dist/common/systemd/scylla-housekeeping-daily.service.mustache "{ $MUSTACHE_DIST }" > build/scylla-housekeeping-daily.service
pystache dist/common/systemd/scylla-housekeeping-restart.service.mustache "{ $MUSTACHE_DIST }" > build/scylla-housekeeping-restart.service


if [ -z "$pkg" ] || [ "$pkg" = "conf" ]; then
    install -d -m755 "$retc"/scylla
    install -d -m755 "$retc"/scylla.d
    install -m644 conf/scylla.yaml -Dt "$retc"/scylla
    install -m644 conf/cassandra-rackdc.properties -Dt "$retc"/scylla
    # XXX: since housekeeping.cfg mistakenly belongs to a different package
    # in .rpm/.deb, we need this workaround to keep the package upgradable
    if $is_redhat && $housekeeping; then
        install -m644 conf/housekeeping.cfg -Dt "$retc"/scylla.d
    fi
fi
if [ -z "$pkg" ] || [ "$pkg" = "kernel-conf" ]; then
    install -m755 -d "$rusr/lib/sysctl.d"
    install -m644 dist/common/sysctl.d/*.conf -Dt "$rusr"/lib/sysctl.d
fi
if [ -z "$pkg" ] || [ "$pkg" = "server" ]; then
    install -m755 -d "$retc/$sysconfdir"
    install -m755 -d "$retc/security/limits.d"
    install -m755 -d "$retc/scylla.d"
    install -m644 dist/common/sysconfig/scylla-server -Dt "$retc"/$sysconfdir
    install -m644 dist/common/limits.d/scylla.conf -Dt "$retc"/security/limits.d
    install -m644 dist/common/scylla.d/*.conf -Dt "$retc"/scylla.d

    install -d -m755 "$retc"/scylla "$rusr/lib/systemd/system" "$rusr/bin" "$rprefix/bin" "$rprefix/libexec" "$rprefix/libreloc" "$rprefix/scripts"
    install -m644 build/*.service -Dt "$rusr"/lib/systemd/system
    install -m644 dist/common/systemd/*.service -Dt "$rusr"/lib/systemd/system
    install -m644 dist/common/systemd/*.slice -Dt "$rusr"/lib/systemd/system
    install -m644 dist/common/systemd/*.timer -Dt "$rusr"/lib/systemd/system
    install -m755 seastar/scripts/seastar-cpu-map.sh -Dt "$rprefix"/scripts
    install -m755 seastar/dpdk/usertools/dpdk-devbind.py -Dt "$rprefix"/scripts
    install -m755 bin/* -Dt "$rprefix/bin"
    # some files in libexec are symlinks, which "install" dereferences
    # use cp -P for the symlinks instead.
    install -m755 libexec/*.bin -Dt "$rprefix/libexec"
    for f in libexec/*; do
        if [[ "$f" != *.bin ]]; then
            cp -P "$f" "$rprefix/libexec"
        fi
    done
    install -m755 libreloc/* -Dt "$rprefix/libreloc"
    ln -srf "$rprefix/bin/scylla" "$rusr/bin/scylla"
    ln -srf "$rprefix/bin/iotune" "$rusr/bin/iotune"

    # XXX: since housekeeping.cfg mistakenly belongs to a different package
    # in .rpm/.deb, we need this workaround to keep the package upgradable
    if $is_debian && $housekeeping; then
        install -m644 conf/housekeeping.cfg -Dt "$retc"/scylla.d
    fi
    install -d -m755 "$rdoc"/scylla
    install -m644 README.md -Dt "$rdoc"/scylla/
    install -m644 README-DPDK.md -Dt "$rdoc"/scylla
    install -m644 NOTICE.txt -Dt "$rdoc"/scylla/
    install -m644 ORIGIN -Dt "$rdoc"/scylla/
    install -d -m755 -d "$rdoc"/scylla/licenses/
    install -m644 licenses/* -Dt "$rdoc"/scylla/licenses/
    install -m755 -d "$root"/var/lib/scylla/
    install -m755 -d "$root"/var/lib/scylla/data
    install -m755 -d "$root"/var/lib/scylla/commitlog
    install -m755 -d "$root"/var/lib/scylla/hints
    install -m755 -d "$root"/var/lib/scylla/view_hints
    install -m755 -d "$root"/var/lib/scylla/coredump
    install -m755 -d "$root"/var/lib/scylla-housekeeping
    install -m755 -d "$rprefix"/swagger-ui
    cp -r swagger-ui/dist "$rprefix"/swagger-ui
    install -d -m755 -d "$rprefix"/api
    cp -r api/api-doc "$rprefix"/api
    install -d -m755 -d "$rprefix"/scyllatop
    cp -r tools/scyllatop/* "$rprefix"/scyllatop
    install -d -m755 -d "$rprefix"/scripts
    cp -r dist/common/scripts/* "$rprefix"/scripts
    ln -srf "$rprefix/scyllatop/scyllatop.py" "$rusr/bin/scyllatop"

    SBINFILES=$(cd dist/common/scripts/; ls scylla_*setup node_exporter_install node_health_check scylla_ec2_check scylla_kernel_check)
    install -d "$rusr"/sbin
    for i in $SBINFILES; do
        ln -srf "$rprefix/scripts/$i" "$rusr/sbin/$i"
    done

    install -m755 scylla-gdb.py -Dt "$rprefix"/scripts/

    PYSCRIPTS=$(find dist/common/scripts/ -maxdepth 1 -type f -exec grep -Pls '\A#!/usr/bin/env python3' {} +)
    for i in $PYSCRIPTS; do
        ./relocate_python_scripts.py \
                --installroot $rprefix/scripts/ --with-python3 "$root/$python3" $i
    done
    ./relocate_python_scripts.py \
                --installroot $rprefix/scripts/ --with-python3 "$root/$python3" \
                seastar/scripts/perftune.py seastar/scripts/seastar-addr2line seastar/scripts/perftune.py

    ./relocate_python_scripts.py \
                --installroot $rprefix/scyllatop/ --with-python3 "$root/$python3" \
                tools/scyllatop/scyllatop.py
fi
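
The `ln -srf` calls in the script create relative symlinks, so a tree staged under an alternative `--root` keeps valid links once it is moved into place. A minimal sketch of that behavior, assuming GNU coreutils `ln` (for `-r`) and using a throwaway temp directory and a placeholder file in place of the real install root and binary:

```shell
#!/bin/bash
set -e
# Throwaway staging root; stands in for the directory passed via --root.
root=$(mktemp -d)
rprefix="$root/opt/scylladb"
rusr="$root/usr"

install -d -m755 "$rprefix/bin" "$rusr/bin"
# Placeholder file, only here so the link has a target.
printf '#!/bin/sh\necho scylla\n' > "$rprefix/bin/scylla"
chmod 755 "$rprefix/bin/scylla"

# -s symbolic, -r relative to the link location, -f replace an existing link.
ln -srf "$rprefix/bin/scylla" "$rusr/bin/scylla"

link_target=$(readlink "$rusr/bin/scylla")
echo "$link_target"   # ../../opt/scylladb/bin/scylla
```

Because the stored target never mentions the staging directory, the link resolves the same way whether the tree lives under the temp root or under `/`.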