var-lib-scylla.mount should wait for MDRAID initialization, so we need to add
'After=mdmonitor.service'.
However, mdmonitor.service currently fails to start because no mail address is
specified, so we also need to add that entry to mdadm.conf.
Fixes #6876
For some CLI tools, command options may differ between the latest version
and older versions.
To maximize compatibility of the setup scripts, we should always use the
relocatable CLI tools instead of the distribution's version of the tool.
Related #6954
On GCE, /dev/sda14 is reported as an unused disk, but it is the BIOS boot
partition. It should not be used as a Scylla data partition, and it cannot be
used for that anyway since it is too small.
It's better to exclude such partitions from the unused disk list.
Fixes #6636
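A minimal sketch of the filtering idea above, not the actual setup-script code. It assumes the candidate list is available as records with a partition-type field (for example from `lsblk` output); the function name, the record layout, and the use of the GPT type GUID as the test are assumptions made for illustration.

```python
# GPT partition type GUID that marks a "BIOS boot partition".
BIOS_BOOT_PARTTYPE = "21686148-6449-6e6f-744e-656564454649"


def exclude_bios_boot(devices):
    """Return the device records that are not BIOS boot partitions.

    `devices` is a hypothetical list of dicts with "name" and "parttype"
    keys; "parttype" may be None for whole disks.
    """
    kept = []
    for dev in devices:
        parttype = (dev.get("parttype") or "").lower()
        if parttype == BIOS_BOOT_PARTTYPE:
            continue  # never offer the BIOS boot partition as a data disk
        kept.append(dev)
    return kept


candidates = [
    {"name": "/dev/sda14", "parttype": "21686148-6449-6E6F-744E-656564454649"},
    {"name": "/dev/sdb", "parttype": None},
]
unused = exclude_bios_boot(candidates)
print([d["name"] for d in unused])
```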
In 2d63acdd6a we replaced 'ol' and 'amzn'
with 'oracle' and 'amazon', but distro.id() actually returns 'amzn' for
Amazon Linux 2, so we need to revert the change.
Fixes #6882
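As an illustration only, a check for Amazon Linux 2 could look like the sketch below. The helper name is hypothetical, and the ID and version strings are passed in as plain arguments; in a real script they would presumably come from `distro.id()` and `distro.version()`.

```python
def is_amzn2(distro_id, distro_version):
    """True for Amazon Linux 2, whose ID is reported as 'amzn', not 'amazon'."""
    # Assumption: the version string starts with the major version, e.g. "2".
    return distro_id == "amzn" and distro_version.split(".")[0] == "2"


print(is_amzn2("amzn", "2"))
print(is_amzn2("amazon", "2"))
```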
To make the "scylla_setup" interface similar to the Docker image, let's add
a "--io-setup ENABLE" command line option. The old "--no-io-setup"
option is retained for compatibility.
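One way to keep both spellings working is to point them at the same destination, as in this argparse sketch. It is not the actual scylla_setup parser: the 0/1 value format for ENABLE, the default of 1, and the help strings are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="scylla_setup")
    # New style: explicit value, e.g. "--io-setup 0" or "--io-setup 1".
    parser.add_argument("--io-setup", dest="io_setup", type=int,
                        choices=[0, 1], metavar="ENABLE",
                        help="1 to run I/O setup, 0 to skip it")
    # Old style, kept for compatibility: behaves like "--io-setup 0".
    parser.add_argument("--no-io-setup", dest="io_setup",
                        action="store_const", const=0,
                        help="compatibility alias for --io-setup 0")
    # Both options share one destination, so set its default once.
    parser.set_defaults(io_setup=1)
    return parser


parser = build_parser()
print(parser.parse_args([]).io_setup)
print(parser.parse_args(["--io-setup", "0"]).io_setup)
print(parser.parse_args(["--no-io-setup"]).io_setup)
```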
Let's report each missing CPU feature individually, and improve the
error message a bit. For example, if the "clmul" instruction is missing,
the report looks as follows:
ERROR: You will not be able to run Scylla on this machine because its CPU lacks the following features: pclmulqdq
If this is a virtual machine, please update its CPU feature configuration or upgrade to a newer hypervisor.
Fixes #6528
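The per-feature report can be sketched as follows. This is an illustration, not the shipped check: the function name and the list of required flags are assumptions, and the input is the text of /proc/cpuinfo passed in as a string.

```python
# Assumption: an illustrative list of required flags, not the real one.
REQUIRED_CPU_FLAGS = ["sse4_2", "pclmulqdq"]


def missing_cpu_features(cpuinfo_text, required=REQUIRED_CPU_FLAGS):
    """Return the required flags that the first CPU entry does not list."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Line format: "flags<TAB>: fpu vme de ..."
            flags.update(line.split(":", 1)[1].split())
            break
    return [flag for flag in required if flag not in flags]


sample = "processor\t: 0\nflags\t\t: fpu sse4_2 avx\n"
missing = missing_cpu_features(sample)
if missing:
    print("ERROR: You will not be able to run Scylla on this machine because "
          "its CPU lacks the following features: " + " ".join(missing))
```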
"coredumpctl info" behavior had been changed since systemd-v232, we need to
support both versions.
Before systemd-v232, it was simple:
it prints the 'Coredump' field only when the coredump exists on the filesystem,
and otherwise prints nothing.
Since the change made in systemd-v232, it is more complex:
it always prints the 'Storage' field, even when the coredump does not exist.
The field is not just available/unavailable; it describes more:
- Storage: none
- Storage: journal
- Storage: /path/to/file (inaccessible)
- Storage: /path/to/file
To support both, we need to detect the message version first, then
try to detect the coredump path.
Fixes: #6789
reference: 47f5064207
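The two-step detection described above might be sketched like this. The function name is hypothetical and the parsing assumes one "Field: value" pair per line, as in the examples listed in the message; it is not the code from the referenced commit.

```python
import re


def find_coredump_path(info_text):
    """Return the coredump file path, or None if no usable file is reported."""
    # Newer format: a 'Storage:' field is always present.
    m = re.search(r"^\s*Storage:\s*(.+?)\s*$", info_text, re.MULTILINE)
    if m:
        value = m.group(1)
        if value in ("none", "journal") or value.endswith("(inaccessible)"):
            return None
        return value
    # Older format: 'Coredump:' is printed only when the file exists.
    m = re.search(r"^\s*Coredump:\s*(.+?)\s*$", info_text, re.MULTILINE)
    return m.group(1) if m else None


print(find_coredump_path("      Storage: /path/to/file\n"))
print(find_coredump_path("      Storage: journal\n"))
```

Returning None for every "no file on disk" case keeps the caller simple: it only has to decide whether a missing path is an error.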
We could hit "cannot serialize '_io.BufferedReader' object" when a request got
a 404 error from the server.
Now you get a proper error message in that case.
Fixes #6690
Print an error message and exit with a non-zero status under the following conditions:
- coredumpctl says the coredump file is inaccessible
- failed to detect the coredump file path from 'coredumpctl info <pid>'
- deleting the coredump file failed because the file is missing
Fixes #6654
When we started porting the bash scripts to Python, we were not able to use
subprocess.run() since EPEL only provided python 3.4, but now we have
relocatable python, so we can switch to it.
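For reference, a small wrapper around subprocess.run() could look like the sketch below. The helper name is hypothetical; it only uses parameters that subprocess.run() has had since Python 3.5.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run a command, raise CalledProcessError on failure, return stdout text."""
    result = subprocess.run(argv, check=True, stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout.strip()


# Use the current interpreter as a command that exists on every platform.
print(run_and_capture([sys.executable, "-c", "print('hello')"]))
```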
Currently we mistakenly have two different ways to detect the distribution:
reading /etc/os-release directly and using the distro package.
The distro package provides well-abstracted APIs and still has full access to
the os-release information, so we should switch to it.
Fixes #6691
On Ubuntu 18.04 and earlier and Debian 10 and earlier, the /usr merge is not
done, so /usr/bin/systemd-escape and /bin/systemd-escape are different paths.
We call /usr/bin/systemd-escape, but Debian variants install the command in /bin.
Drop the full path; just call the command by name and let the default PATH resolve it.
Fixes: #6650
Since scylla-cpupower.service isn't installed by the .rpm package but is created
by the setup script, it's better to place it under /etc rather than /usr/lib.
We already do the same for scylla-server.service.d/*.conf, *.mount, and
*.swap created by the setup scripts.
Amazon Linux 2 has /usr/bin/cpupower but, unlike CentOS 7, does not have
cpupower.service.
We need to provide the .service file when the distribution is Amazon Linux 2.
Fixes #5977
In 28c3d4, `out()` was used without `shell=True`, and the splitting of arguments
failed because of the complex commands in the cmd (pipes and such).
Fixes #6159
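The difference matters because a pipeline is shell syntax: without a shell, `|` is just another argument. The sketch below uses a hypothetical stand-in for `out()`; the real helper's signature is not shown in the message.

```python
import subprocess


def out(cmd, shell=True):
    """Hypothetical stand-in for the helper named above; returns stdout text."""
    return subprocess.check_output(cmd, shell=shell,
                                   universal_newlines=True).strip()


# The pipe is interpreted by the shell, so shell=True is required here.
print(out("echo hello | tr a-z A-Z"))
```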
We generate a coredump as part of "scylla_coredump_setup" to verify that
coredumps are working. However, we need to *remove* that test coredump
to avoid people and test infrastructure reporting those coredumps.
Fixes #6159
We always raise the exception 'Unit xxx not found' when any exception is raised
while executing 'systemctl cat xxx'. Sometimes this error is confusing.
On OEL7, 'systemctl cat var-lib-systemd-coredump.mount' also verifies
the config content. scylla_coredump_setup failed because the config file
was invalid, but the reported error was 'unit var-lib-systemd-coredump.mount not found'.
This patch improves the error message.
Related issue: https://github.com/scylladb/scylla/issues/6432
Currently we use a systemd mount (var-lib-systemd-coredump.mount) to mount the
default coredump directory (/var/lib/systemd/coredump) to
(/var/lib/scylla/coredump). /var/lib/scylla is already mounted on a large
storage device, so we will have enough space for coredumps after the mount.
Currently, coredump_setup only enables var-lib-systemd-coredump.mount
but does not start it. The directory is not mounted after coredump_setup, so
coredumps are still saved to the default coredump directory.
The mount only takes effect after a reboot.
Fixes #6566
This reverts commit e77dad3adf because it is incorrect.
Amos explains:
"Quote from https://www.freedesktop.org/software/systemd/man/systemd.mount.html
What=
Takes an absolute path of a device node, file or other resource to
mount. See mount(8) for details. If this refers to a device node, a
dependency on the respective device unit is automatically created.
Where=
Takes an absolute path of a file or directory for the mount point; in
particular, the destination cannot be a symbolic link. If the mount
point does not exist at the time of mounting, it is created as
directory.
So the mount point is '/var/lib/systemd/coredump' and
'/var/lib/scylla/coredump' is the file to mount, because /var/lib/scylla
had mounted a second big storage, which has enough space for Huge
coredumps.
Bentsi or other touched problem with old scylla-master AMI, a coredump
occurred but not successfully saved to disk for enospc. The directory
/var/lib/systemd/coredump wasn't mounted to /var/lib/scylla/coredump.
They WRONGLY thought the wrong mount was caused by the config problem,
so he posted a fix.
Actually scylla-ami-setup / coredump wasn't executed on that AMI, err:
unit scylla-ami-setup.service not found Because
'scylla-ami-setup.service' config file doesn't exist or is invalid.
Details of my testing: https://github.com/scylladb/scylla/issues/6300#issuecomment-637324507
So we need to revert Bentsi's patch, it changed the right config to wrong."
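To make the What=/Where= roles concrete, here is a sketch that renders such a mount unit as text. It is not the unit shipped by the setup script: the helper names, the Description, the bind-mount options, and the [Install] section are assumptions. The unit file name is derived from the mount point, which is why the unit is called var-lib-systemd-coredump.mount.

```python
def mount_unit_name(where):
    """Derive the unit file name from the mount point (simple paths only)."""
    # Assumption: the path has no characters that need systemd escaping;
    # the systemd-escape tool is the authoritative way to do this.
    return where.strip("/").replace("/", "-") + ".mount"


def bind_mount_unit(what, where):
    """Render a mount unit: `what` is the source, `where` is the mount point."""
    return (
        "[Unit]\n"
        "Description=Bind mount for coredumps\n"
        "\n"
        "[Mount]\n"
        "What={what}\n"
        "Where={where}\n"
        "Type=none\n"
        "Options=bind\n"
        "\n"
        "[Install]\n"
        "WantedBy=multi-user.target\n"
    ).format(what=what, where=where)


where = "/var/lib/systemd/coredump"
print(mount_unit_name(where))
print(bind_mount_unit("/var/lib/scylla/coredump", where))
```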
Since dbuild was updated to fedora-32, and hence to python3.8,
`platform.dist()`, which was deprecated, has been removed and needs to be replaced.
Fixes: #6501
[avi: folded patch with install-dependencies.sh change]
[avi: regenerated toolchain]
The issue is that the mount is /var/lib/scylla/coredump ->
/var/lib/systemd/coredump. But we need to do the opposite in order to
save the coredump on the partition that Scylla is using:
/var/lib/systemd/coredump -> /var/lib/scylla/coredump
Fixes #6301
On a CentOS 7 machine:
fstrim.timer is not enabled, only unmasked, due to scylla_fstrim_setup running on installation.
When trying to run the scylla-fstrim service manually, you get this error:
Traceback (most recent call last):
  File "/opt/scylladb/scripts/libexec/scylla_fstrim", line 60, in <module>
    main()
  File "/opt/scylladb/scripts/libexec/scylla_fstrim", line 44, in main
    cfg = parse_scylla_dirs_with_default(conf=args.config)
  File "/opt/scylladb/scripts/scylla_util.py", line 484, in parse_scylla_dirs_with_default
    if key not in y or not y[k]:
NameError: name 'k' is not defined
It is caused by a typo in scylla_util.py: `k` is used where `key` was intended.
Fixes #6294.
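The corrected condition can be illustrated with the sketch below. Only the `if` line is taken from the traceback; the function name, the loop around it, and the default values are hypothetical, since the real body of parse_scylla_dirs_with_default is not shown.

```python
def fill_missing_dirs(y, defaults):
    """Fill in entries that are absent or empty in the parsed config."""
    for key, default in defaults.items():
        # The traceback shows y[k] on this line, where k was never defined.
        if key not in y or not y[key]:
            y[key] = default
    return y


# Hypothetical keys and paths, used only to exercise the function.
cfg = fill_missing_dirs({"a_dir": None}, {"a_dir": "/x", "b_dir": "/y"})
print(cfg)
```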
In some environments, systemd-coredump does not work with a symlinked directory;
we can use a bind mount instead.
Also, it's better to check that systemd-coredump is working by generating a coredump.
To fix #5916, drop scylla_coredump_setup from the .rpm %post scriptlet.
Fixes #5753
Fixes #5916
This reverts commit 65aadad9a6. It causes
crashes (due to the coredump test) during package install, since scylla_coredump_setup
is called from rpm postinstall. The test should be done only from scylla_setup (and
the user should be warned).
Fixes #5916.
In some environments, systemd-coredump does not work with a symlinked directory;
we can use a bind mount instead.
Also, it's better to check that systemd-coredump is working by generating a coredump.
Fixes #5753
Since we set 'eth0' as the default NIC name, we get the following error when running scylla_setup in non-interactive mode without the --nic parameter:
$ sudo scylla_setup --setup-nic-and-disks --no-raid-setup --no-verify-package --no-io-setup
NIC eth0 doesn't exist.
This looks strange since the user did not actually specify 'eth0'; they might have forgotten to specify --nic.
I think we should show the usage when eth0 is not available on the system.
Fixes #5828
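A possible shape for that check is sketched below. It is not the actual scylla_setup code: the helper names are hypothetical, the lookup under /sys/class/net is an assumption about how NIC existence is tested, and `parser.error()` is used because it prints the usage line before exiting.

```python
import argparse
import os


def nic_exists(nic, sys_class_net="/sys/class/net"):
    """True if the kernel exposes a network interface with this name."""
    return os.path.exists(os.path.join(sys_class_net, nic))


def check_nic(parser, nic, sys_class_net="/sys/class/net"):
    """Print usage and exit if the NIC (possibly the default) is missing."""
    if not nic_exists(nic, sys_class_net):
        parser.error("NIC {} doesn't exist; use --nic to specify one.".format(nic))


parser = argparse.ArgumentParser(prog="scylla_setup")
parser.add_argument("--nic", default="eth0")
args = parser.parse_args([])
print(args.nic)
```

Calling `check_nic(parser, args.nic)` right after parsing would then turn a silently applied default into a usage message.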