Since scylla-cpupower.service isn't installed by .rpm package, but created
in the setup script, it's better to not use /usr/lib directory, use /etc.
We already doing same way for scylla-server.service.d/*.conf, *.mount, and
*.swap created by setup scripts.
Amazon Linux 2 has /usr/bin/cpupower, but does not have cpupower.service
unlike CentOS7.
We need to provide the .service file when distribution is Amazon Linux 2.
Fixes#5977
In 28c3d4 `out()` was used without `shell=True` and was the spliting of arguments
failed cause of the complex commands in the cmd (pipe and such)
Fixes#6159
We generate a coredump as part of "scylla_coredump_setup" to verify that
coredumps are working. However, we need to *remove* that test coredump
to avoid people and test infrastructure reporting those coredumps.
Fixes#6159
we always raise exception 'Unit xxx not found' when exception is raised in
executing 'systemctl cat xxx'. Sometimes the error is confused.
On OEL7, the 'systemctl cat var-lib-systemd-coredump.mount' will also verify
the config content, scylla_coredump_setup failed for that the config file
is invalid, but the error is 'unit var-lib-systemd-coredump.mount not found'.
This patch improved the error message.
Related issue: https://github.com/scylladb/scylla/issues/6432
Currently we use a systemd mount (var-lib-systemd-coredump.mount) to mount
default coredump directory (/var/lib/systemd/coredump) to
(/var/lib/scylla/coredump). The /var/lib/scylla had been mounted to a big
storage, so we will have enough space for coredump after the mount.
Currently in coredump_setup, we only enabled var-lib-systemd-coredump.mount,
but not start it. The directory won't be mounted after coredump_setup, so the
coredump will still be saved to default coredump directory.
The mount will only effect after reboot.
Fixes#6566
This reverts commit e77dad3adf because its
incorrect.
Amos explains:
"Quote from https://www.freedesktop.org/software/systemd/man/systemd.mount.html
What=
Takes an absolute path of a device node, file or other resource to
mount. See mount(8) for details. If this refers to a device node, a
dependency on the respective device unit is automatically created.
Where=
Takes an absolute path of a file or directory for the mount point; in
particular, the destination cannot be a symbolic link. If the mount
point does not exist at the time of mounting, it is created as
directory.
So the mount point is '/var/lib/systemd/coredump' and
'/var/lib/scylla/coredump' is the file to mount, because /var/lib/scylla
had mounted a second big storage, which has enough space for Huge
coredumps.
Bentsi or other touched problem with old scylla-master AMI, a coredump
occurred but not successfully saved to disk for enospc. The directory
/var/lib/systemd/coredump wasn't mounted to /var/lib/scylla/coredump.
They WRONGLY thought the wrong mount was caused by the config problem,
so he posted a fix.
Actually scylla-ami-setup / coredump wasn't executed on that AMI, err:
unit scylla-ami-setup.service not found Because
'scylla-ami-setup.service' config file doesn't exist or is invalid.
Details of my testing: https://github.com/scylladb/scylla/issues/6300#issuecomment-637324507
So we need to revert Bentsi's patch, it changed the right config to wrong."
since dbuild was updated to fedora-32, hence to python3.8
`platform.dist()` is deprecated, and need to be replaced
Fixes: #6501
[avi: folded patch with install-dependencies.sh change]
[avi: regenerated toolchain]
The issue is that the mount is /var/lib/scylla/coredump ->
/var/lib/systemd/coredump. But we need to do the opposite in order to
save the coredump on the partition that Scylla is using:
/var/lib/systemd/coredump-> /var/lib/scylla/coredump
Fixes#6301
On Centos 7 machine:
fstrim.timer not enabled, only unmasked due scylla_fstrim_setup on installation
When trying run scylla-fstrim service manually you get error:
Traceback (most recent call last):
File "/opt/scylladb/scripts/libexec/scylla_fstrim", line 60, in <module>
main()
File "/opt/scylladb/scripts/libexec/scylla_fstrim", line 44, in main
cfg = parse_scylla_dirs_with_default(conf=args.config)
File "/opt/scylladb/scripts/scylla_util.py", line 484, in parse_scylla_dirs_with_default
if key not in y or not y[k]:
NameError: name 'k' is not defined
It caused by error in scylla_util.py
Fixes#6294.
On some environment systemd-coredump does not work with symlink directory,
we can use bind-mount instead.
Also, it's better to check systemd-coredump is working by generating coredump.
To fix#5916, drop scylla_coredump_setup from .rpm %post scriptlet.
Fixes#5753Fixes#5916
This reverts commit 65aadad9a6. It causes
crashes (due to the coredump test) during package install, since scylla_coredump_setup
is called from rpm postinstall. The test should be done only from scylla_setup (and
the user should be warned).
Fixes#5916.
On some environment systemd-coredump does not work with symlink directory,
we can use bind-mount instead.
Also, it's better to check systemd-coredump is working by generating coredump.
Fixes#5753
Since we set 'eth0' as default NIC name, we get following error when running scylla_setup in non-interactive mode without --nic parameter:
$ sudo scylla_setup --setup-nic-and-disks --no-raid-setup --no-verify-package --no-io-setup
NIC eth0 doesn't exist.
It looks strange since user actually does not specified 'eth0', they might forget to specify --nic.
I think we should shows up usage, when eth0 is not available on the system.
Fixes#5828
On Debian, we don't add xfsprogs/mdadm on package dependency, install on
scylla_raid_setup script instead.
Since xfsprogs/mdadm only needed for constructing RAID, we can move
dependencies to scylla_raid_setup too.
Currently --ami does not check instance types, creates invalid
io_properties.yaml on unsupported instance types.
It actually won't occur on AMI startup, since scylla_ami_setup only
invoke scylla_io_setup --ami when the instance is supported, so we don't
get the issue on startup, but we still get when we run scylla_io_setup
manually.
It's better to check instance type on scylla_io_setup, too.
Refs #5438
VERSION_ID of centos7 is "7", but VERSION_ID of oel7.7 is "7.7"
scylla_ntp_setup doesn't work on OEL7.7 for ValueError.
- ValueError: invalid literal for int() with base 10: '7.7'
This patch changed redhat_version() to return version string, and compare
with parse_version().
Fixes#5433
Signed-off-by: Amos Kong <amos@scylladb.com>
A Linux machine typically has multiple clocksources with distinct
performances. Setting a high-performant clocksource might result in
better performance for ScyllaDB, so this should be considered whenever
starting it up.
This patch introduces the possibility of enforcing optimized Linux
clocksource to Scylla's setup/start-up processes. It does so by adding
an interactive question about enforcing clocksource setting to scylla_setup,
which modifies the parameter "CLOCKSOURCE" in scylla_server configuration
file. This parameter is read by perftune.py which, if set to "yes", proceeds
to (non persistently) setting the clocksource. On x86, TSC clocksource is used.
Fixes#4474Fixes#5474Fixes#5480
"
These series solves an issue with scylla_setup and prevent it from
waiting forever if housekeeping cannot look for the new Scylla version.
Fixes#5302
It should be backported to versions that support offline installations.
"
* 'scylla_setup_timeout' of git://github.com/amnonh/scylla:
scylla_setup: do not wait forever if no reply is return housekeeping
scylla_util.py: Add optional timeout to out function
When scylla is installed without a network connectivity, the test if a
newer version is available can cause scylla_setup to wait forever.
This patch adds a limit to the time scylla_setup will wait for a reply.
When there is no reply, the relevent error will be shown that it was
unable to check for newer version, but this will not block the setup
script.
Fixes#5302
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
This reverts commit 4333b37f9e. It breaks upgrades,
and the user question is not informative enough for the user to make a correct
decision.
Fixes#5478.
Fixes#5480.
A Linux machine typically has multiple clocksources with distinct
performances. Setting a high-performant clocksource might result in
better performance for ScyllaDB, so this should be considered whenever
starting it up.
This patch introduces the possibility of enforcing optimized Linux
clocksource to Scylla's setup/start-up processes. It does so by adding
an interactive question about enforcing clocksource setting to scylla_setup,
which modifies the parameter "CLOCKSOURCE" in scylla_server configuration
file. This parameter is read by perftune.py which, if set to "yes", proceeds
to (non persistently) setting the clocksource. On x86, TSC clocksource is
used.
Fixes#4474
We may able to use chrony setup script on future version of RHEL/CentOS,
it better to run chrony setup when RHEL version >= 8, not only 8.
Note that on Fedora it still provides ntp/ntpdate package, so we run
ntp setup on it for now. (same on debian variants)
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20191203192812.5861-1-syuu@scylladb.com>
Currently interactive RAID setup prompt does not list virtio-blk devices due to
following reasons:
- We fail matching '-p' option on 'lsblk --help' output since misusage of
regex functon, list_block_devices() always skipping to use lsblk output.
- We don't check existance of /dev/vd* when we skipping to use lsblk.
- We mistakenly excluded virtio-blk devices on 'lsblk -pnr' output using '-e'
option, but we actually needed them.
To fix the problem we need to use re.search() instead of re.match() to match
'-p' option on 'lsblk --help', need to add '/dev/vd*' on block device list,
then need to stop '-e 252' option on lsblk which excludes virtio-blk.
Additionally, it better to parse 'TYPE' field of lsblk output, we should skip
'loop' devices and 'rom' devices since these are not disk devices.
Fixes#4066
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20191201160143.219456-1-syuu@scylladb.com>
Currently we support to assign workdir from scylla.yaml, and we use many
hardcode '/var/lib/scylla' in setup scripts.
Some setup scripts get scylla directories by parsing scylla.yaml, introduced
parse_scylla_dirs_with_default() that adds default values if scylla directories
aren't assigned in scylla.yaml
Signed-off-by: Amos Kong <amos@scylladb.com>
Currently we are checking an invalid path of some default scylla directories,
the directories don't exist, so the tune will always be skipped. It caused by
two problem.
Problem 1: paths of default directories is invalid
Introduced by commit 5ec191536e, we try to tune some scylla default directories
if they exist. But the directory paths we try are wrong.
For example:
- What we check: /var/lib/scylla/commitlog_directory
- Correct one: /var/lib/scylla/commitlog
Problem 2: wrong path join
Introduced by commit 31ddb2145a, default_path might be replaced from
'/var/lib/scylla/' to '/var/lib/scylla'.
Our code tries to check an invalid path that is wrongly join, eg:
'/var/lib/scyllacommitlog'
Signed-off-by: Amos Kong <amos@scylladb.com>
The default values of data_file_directories and commitlog_directory were
commented by commit e0f40ed16a. It causes scylla_util.py:get_scylla_dirs() to
fail in checking the values.
This patch changed get_scylla_dirs() to return default data/commitlog
directories if they aren't set.
Fixes#5358
Reviewed-by: Pavel Emelyanov <xemul@scylladb.com>
Signed-off-by: Amos Kong <amos@scylladb.com>
Currently scylla_io_setup will skip in scylla_setup, because we didn't support
those new instance types.
I manually executed scylla_io_setup, and the scylla-server started and worked
well.
Let's apply this patch first, then check if there is some new problem in
ami-test.
Signed-off-by: Amos Kong <amos@scylladb.com>
It is useful to have an option to limit the execution time of a shell
script.
This patch adds an optional timeout parameter, if a parameter will be
provided a command will return and failure if the duration is passed.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Since nonroot mode requires to run everything on non-privileged user,
most of setup scripts does not able to use nonroot mode.
We only provide following functions on nonroot mode:
- EC2 check
- IO setup
- Node exporter installer
- Dev mode setup
Rest of functions will be skipped on scylla_setup.
To implement nonroot mode on setup scripts, scylla_util provides
utility functions to abstract difference of directory structure between normal
installation and nonroot mode.
Scylla requires the CLMUL and SSE 4.2 instruction sets and will fail without them.
There is a check in main(), but that happens after the code is running and it may
already be too late. Add a check in scylla_prepare which runs before the main
executable.
The scylla_blocktune.py has a FIXME for btrfs from 2016, which is no
longer relevant for Scylla deployments, as Red Hat dropped support for
the file system in 2017.
Message-Id: <20190823114013.31112-1-penberg@scylladb.com>
Don't let perftune.py print false alarm error message when we calculate
a compute CPU set for tuning modes.
This may happen when we calculate a CPU set for non-MQ tuning modes on
small systems on which these modes are forbidden because they would
result in a zero CPU set, e.g. sq_split on a system with a single
physical core.
We are going to utilize a newly introduced --get-cpu-mask-quiet execution
mode introduced to the seastar/script/perftune.py by the
"perftune.py: introduce --get-cpu-mask-quiet" series which would return
a zero CPU set if that's what it turns out to be instead of exiting with
an error what --get-cpu-mask would do in such a case.
The rest of scylla_util.py logic is going to handle a zero CPU set
returned by get_mode_cpuset() correctly.
Fixes#4211Fixes#4443
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <20190731212901.9510-1-vladz@scylladb.com>