Commit Graph

545 Commits

Author SHA1 Message Date
Takuya ASADA
d9a625c842 scylla_setup: don't run node-exporter setup when it's not installed
We need to run package existance check before run setup of
node-exporter.

Fixes #8276

Closes #8278
2021-03-18 11:24:18 +01:00
Takuya ASADA
e8cfd5114f scylla_coredump_setup: support SLES
SLES requires to install systemd-coredump package and enable
systemd-coredump.socket to use systemd-coredump.
2021-03-15 19:19:56 +09:00
Takuya ASADA
13871ff1f8 scylla_setup: use rpm to check package availability for SLES
Use rpm to check scylla packages installed on SLES.
2021-03-15 19:18:44 +09:00
Takuya ASADA
e3b5ffcf14 dist: install optional packages for SLES
Support SUSE original package manager 'zypper' for pkg_install()
function.
2021-03-15 19:17:48 +09:00
Takuya ASADA
af8eae317b scylla_coredump_setup: avoid coredump failure when hard limit of coredump is set to zero
On the environment hard limit of coredump is set to zero, coredump test
script will fail since the system does not generate coredump.
To avoid such issue, set ulimit -c 0 before generating SEGV on the script.

Note that scylla-server.service can generate coredump even ulimit -c 0
because we set LimitCORE=infinity on its systemd unit file.

Fixes #8238

Closes #8245
2021-03-10 19:28:10 +02:00
Takuya ASADA
2d9feaacea scylla_raid_setup: don't abort using raiddev when array_state is 'clear'
On Ubuntu 20.04 AMI, scylla_raid_setup --raiddev /dev/md0 causes
'/dev/md0 is already using' (issue #7627).
So we merged the patch to find free mdX (587b909).

However, look into /proc/mdstat of the AMI, it actually says no active md device available:

ubuntu@ip-10-0-0-43:~$ cat /proc/mdstat
Personalities :
unused devices: <none>

We currently decide mdX is used when os.path.exists('/sys/block/mdX/md/array_state') == True,
but according to kernel doc, the file may available even array is STOPPED:

    clear

        No devices, no size, no level
        Writing is equivalent to STOP_ARRAY ioctl
https://www.kernel.org/doc/html/v4.15/admin-guide/md.html

So we should also check array_state != 'clear', not just array_state
existance.

Fixes #8219

Closes #8220
2021-03-07 18:30:11 +02:00
Takuya ASADA
53c7600da8 dist: increase fs.aio-max-nr value for other apps
Current fs.aio-max-nr value cpu_count() * 11026 is exact size of scylla
uses, if other apps on the environment also try to use aio, aio slot
will be run out.
So increase value +65536 for other apps.

Related #8133

Closes #8228
2021-03-07 12:11:36 +02:00
Takuya ASADA
870c3a28c1 scylla_setup: strip spaces of comma separated list
On RAID prompt, we can type disk list something like this:
 /dev/sda1,/dev/sdb1,/dev/sdc1,/dev/sdd1

However, if the list has spaces in the list, it doesn't work:
 /dev/sda1, /dev/sdb1, /dev/sdc1, /dev/sdd1

Because the script mistakenly recognize the space part of a device path.
So we need strip() the input for each item.

Fixes #8174

Closes #8190
2021-03-02 12:48:18 +02:00
Takuya ASADA
d0297c599a dist: tune fs.aio-max-nr based on the number of cpus
Current aio-max-nr is set up statically to 1048576 in
/etc/sysctl.d/99-scylla-aio.conf.
This is sufficient for most use cases, but falls short on larger machines
such as i3en.24xlarge on AWS that has 96 vCPUs.

We need to tune the parameter based on the number of cpus, instead of
static setting.

Fixes #8133

Signed-off-by: Takuya ASADA <syuu@scylladb.com>

Closes #8188
2021-03-01 14:18:24 +02:00
Takuya ASADA
4cf9b6988e scylla_coredump_setup: don't run apt-get when systemd-coredump is already installed
Check systemd-coredump existance before running apt-get install
systemd-coredump.

Closes #8185
2021-03-01 09:38:51 +02:00
Takuya ASADA
f3a82f4685 scylla_setup: allow running scylla_setup with strict umask setting
We currently deny running scylla_setup when umask != 0022.
To remove this limitation, run os.chmod(0o644) on every file creation
to allow reading from scylla user.

Note that perftune.yaml is not really needed to set 0644 since perftune.py is
running in root user, but setting it to align permission with other files.

Fixes #8049

Closes #8119
2021-02-25 16:42:45 +02:00
Takuya ASADA
32d4ec6b8a scylla_util.py: resolve /dev/root to get actual device on aws
When psutil.disk_paritions() reports / is /dev/root, aws_instance mistakenly
reports root partition is part of ephemeral disks, and RAID construction will
fail.
This prevents the error and reports correct free disks.

Fixes #8055

Closes #8040
2021-02-18 20:25:45 +02:00
Shlomi Livne
718976e794 scylla_io_setup did not configure pre tuned gce instances correctly
scylla_io_setup condition for nr_disks was using the bitwise operator
(&) instead of logical and operator (and) causing the io_properties
files to have incorrect values

Fixes #7341

Reviewed-by: Lubos Kosco <lubos@scylladb.com>
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>

Closes #8019
2021-02-11 11:06:00 +02:00
Benny Halevy
55e3df8a72 dist: scylla_util: prevent IndexError when no ephemeral_disks were found
Currently we call firstNvmeSize before checking that we have enough
(at least 1) ephemeral disks.  When none are found, we hit the following
error (see #7971):
```
File "/opt/scylladb/scripts/libexec/scylla_io_setup", line 239, in
if idata.is_recommended_instance():
File "/opt/scylladb/scripts/scylla_util.py", line 311, in is_recommended_instance
diskSize = self.firstNvmeSize
File "/opt/scylladb/scripts/scylla_util.py", line 291, in firstNvmeSize
firstDisk = ephemeral_disks[0]
IndexError: list index out of range
```

This change reverses the order and first checks that we found
enough disks before getting the fist disk size.

Fixes #7971

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #8027
2021-02-03 11:30:18 +02:00
Takuya ASADA
5d527bd17e scylla_ntp_setup: use chrony on all distributions
To simplify scylla_ntp_setup, use chrony on all distributions.

Closes #7922
2021-01-24 11:45:58 +02:00
Takuya ASADA
7a74f8cd2e dist: use sysconfig_parser to parse gentoo config file
Use sysconfig_parser instead of regex, to improve code readability.
2021-01-13 21:34:23 +09:00
Takuya ASADA
2a4d293841 dist: add package name translation
Translate package name from CentOS package to different distribution
package name, to use single package name for pkg_install().
2021-01-13 21:27:14 +09:00
Takuya ASADA
0a9843842d dist: support SLES/OpenSUSE
Add support SLES/OpenSUSE on setup script.
2021-01-13 19:32:46 +09:00
Takuya ASADA
e8f74e800c dist: show warning on unsupported distributions
Add warning message on unsupported distributions, for
scylla_cpuscaling_setup and scylla_ntp_setup.
2021-01-13 19:32:45 +09:00
Takuya ASADA
2f344cf50d dist: drop Ubuntu 14.04 code
We don't support Ubuntu 14.04 anymore, drop them
2021-01-13 19:32:45 +09:00
Takuya ASADA
8e59f70080 dist: move back is_amzn2() to scylla_util.py
Distribution detection functions should be placed same place,
so move back it to scylla_util.py
2021-01-13 19:32:45 +09:00
Takuya ASADA
921b1676c0 dist: rename is_gentoo_variant() to is_gentoo()
is_redhat_variant() is the function to detect RHEL/CentOS/Fedora/OEL,
and is_debian_variant() is the function to detect Debian/Ubuntu.
Unlike these functions, is_gentoo_variant() does not detect "Gentoo variants",
we should rename it to is_gentoo().
2021-01-13 19:32:45 +09:00
Takuya ASADA
fffa8f5ded dist: support Arch Linux
Add support Arch Linux on setup script.
2021-01-13 19:32:45 +09:00
Takuya ASADA
0d11f9463d dist: make sysconfig directory detectable
Currently, install.sh provide a way to customize sysconfig directory,
but sysconfig directory is hardcoded on script.
Also, /etc/sysconfig seems correct to use default value, but current
code specify /etc/default as non-redhat distributions.

Instead of hardcoding, generate generate python script in install.sh
to save specified sysconfig directory path in python code.
2021-01-13 19:32:45 +09:00
Amos Kong
9adc6f68ee scylla_setup: cleanup if judgments
This patch merged two nested if judgments.

Signed-off-by: Amos Kong <amos@scylladb.com>
2020-12-26 04:45:25 +08:00
Amos Kong
632b01ce4e scylla_setup: enable node_exporter for offline installation
node_exporter had been added to scylla-server package by commit
95197a09c9.

So we can enable it by default for offline installation.

Signed-off-by: Amos Kong <amos@scylladb.com>
2020-12-25 10:54:31 +08:00
Takuya ASADA
95197a09c9 dist: add node_exporter to scylla-server package
To connection-less environment, we need to add node_exporter binary
to scylla-server package, not downloading it from internet.

Related #7765
Fixes #2190

Closes #7796
2020-12-24 11:44:13 +02:00
Aleksandr Bykov
e74dc311e7 dist: scylla_util: fix aws_instance.ebs_disks method
aws_instance.ebs_disks() method should return ebs disk
instead of ephemeral

Signed-off-by: Aleksandr Bykov <alex.bykov@scylladb.com>

Closes #7780
2020-12-13 17:33:37 +02:00
Avi Kivity
461c9826de Merge 'scylla_setup: fix wrong command suggestion' from Takuya ASADA
scylla_setup command suggestion does not shows an argument of --io-setup,
because we mistakely stores bool value on it (recognized as 'store_true').
We always need to print '--io-setup X' on the suggestion instead.

Also, --nic is currently ignored by command suggestion, need to print it just like other options.

Related #7395

Closes #7724

* github.com:scylladb/scylla:
  scylla_setup: print --swap-directory and --swap-size on command suggestion
  scylla_setup: print --nic on command suggestion
  scylla_setup: fix wrong command suggestion on --io-setup scylla_setup command suggestion does not shows an argument of --io-setup, because we mistakely stores bool value on it (recognized as 'store_true'). We always need to print '--io-setup X' on the suggestion instead.
2020-12-08 13:58:55 +02:00
Lubos Kosco
a0b1474bba scylla_util.py: Increase disk to ram ratio for GCP
Increase accepted disk-to-RAM ratio to 105 to accomodate even 7.5GB of
RAM for one NVMe log various reasons for not recommending the instance
type.

Fixes #7587

Closes #7600
2020-12-08 11:20:30 +02:00
Takuya ASADA
c3abba1913 scylla_setup: print --swap-directory and --swap-size on command suggestion
We need to print --swap-directory and --swap-size on command suggestion just like other options.

Related #7395
2020-12-07 18:40:59 +09:00
Takuya ASADA
582a3ffb2f scylla_setup: print --nic on command suggestion
We need to print --nic on command suggestion just like other options.

Related #7395
2020-12-07 18:40:59 +09:00
Avi Kivity
2fd895a367 Merge 'dist/common/scripts/scylla_setup: Optionally config rsyslog destination' from Amnon Heiman
This patch adds an option to scylla_setup to configure an rsyslog destination.

The monitoring stack has an option to get information from rsyslog it
requires that rsyslog on the scylla machines will send the trace line to
it.

The configuration will be in a Scylla configuration file, so it is safe to run it multiple times.

Fixes #7589

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

Closes #7634

* github.com:scylladb/scylla:
  dist/common/scripts/scylla_setup: Optionally config rsyslog destination
  Adding dist/common/scripts/scylla_rsyslog_setup utility
2020-12-01 13:12:32 +02:00
Amnon Heiman
4036cecdea dist/common/scripts/scylla_setup: Optionally config rsyslog destination
This patch adds an option to scylla_setup to configure an rsyslog
destination.

The monitoring stack has an option to get information from rsyslog, it
requires that rsyslog on the scylla machines will send the trace line to
it.

If the /etc/rsyslog.d/ directory exists (that means the current system
runs rsyslog) it will ask if to add rsyslog configuration and if yes, it
would run scylla_rsyslog_setup.

Fixes #7589

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2020-12-01 12:33:37 +02:00
Takuya ASADA
572d6b2a4e scylla_setup: fix wrong command suggestion on --io-setup
scylla_setup command suggestion does not shows an argument of --io-setup,
because we mistakely stores bool value on it (recognized as 'store_true').
We always need to print '--io-setup X' on the suggestion instead.

Related #7395
2020-12-01 07:23:55 +09:00
Lubos Kosco
4d0587ed11 scylla_util.py: fix metadata gcp call for disks to get details
disk parsing expects output from recursive listing of GCP
metadata REST call, the method used to do it by default,
but now it requires a boolean flag to run in recursive mode

Fixes #7684

Closes #7685
2020-11-27 15:20:56 +02:00
Takuya ASADA
b90ddc12c9 scylla_prepare: add --tune system when SET_CLOCKSOURCE=yes
perftune.py only run clocksource setup when --tune system specified,
so we need to add it on the parameter when SET_CLOCKSOURCE=yes.

Fixes #7672
2020-11-23 10:51:16 +02:00
Takuya ASADA
3fefa520bd dist/common/scripts: drop run() and out(), swtich to subprocess.run()
We initially implemented run() and out() functions because we couldn't use
subprocess.run() since we were on Python 3.4.
But since we moved to relocatable python3, we don't need to implement it ourselves.
Why we keep using these functions are, because we needed to set environemnt variable to set PATH.
Since we recently moved away these codes to python thunk, we finally able to
drop run() and out(), switch to subprocess.run().
2020-11-22 17:59:27 +02:00
Amnon Heiman
9e116d136e Adding dist/common/scripts/scylla_rsyslog_setup utility
scylla_rsyslog_setup adds a configuration file to rsyslog to forward the
trances to a remote server.

It will override any existing file, so it is safe to run it multiple
times.

It takes an ip, or ip and port from the users for that configuration, if
no port is provided, the default port of Scylla-Monitoring promtail is
used.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2020-11-22 15:48:48 +02:00
Evgeniy Naydanov
587b909c5c scylla_raid_setup: try /dev/md[0-9] if no --raiddev provided
If scylla_raid_setup script called without --raiddev argument
then try to use any of /dev/md[0-9] devices instead of only
one /dev/md0.  Do it in this way because on Ubuntu 20.04
/dev/md0 used by OS already.

Closes #7628
2020-11-18 18:42:31 +02:00
Takuya ASADA
2ce8ca0f75 dist/common/scripts/scylla_util.py: move DEBIAN_FRONTEND environment variable to apt_install()/apt_uninstall()
DEBIAN_FRONTEND environment variable was added just for prevent opening
dialog when running 'apt-get install mdadm', no other program depends on it.
So we can move it inside of apt_install()/apt_uninstall() and drop scylla_env,
since we don't have any other environment variables.
To passing the variable, added env argument on run()/out().
2020-11-16 14:21:36 +02:00
Lubos Kosco
5c488b6e9a scylla_util.py: properly parse GCP instances without size
fixes #7577

Closes #7592
2020-11-12 13:01:40 +02:00
Takuya ASADA
5867af4edd install.sh: set PATH for relocatable CLI tools in python thunk
We currently set PATH for relocatable CLI tools in scylla_util.run() and
scylla_util.out(), but it doesn't work for perftune.py, since it's not part of
Scylla, does not use scylla_util module.
We can set PATH in python thunk instead, it can set PATH for all python scripts.

Fixes #7350
2020-11-11 10:27:08 +02:00
Bentsi Magidovich
956b97b2a8 scylla_util.py: fix exception handling in curl
Retry mechanism didn't work when URLError happend. For example:

  urllib.error.URLError: <urlopen error [Errno 101] Network is unreachable>

Let's catch URLError instead of HTTP since URLError is a base exception
for all exceptions in the urllib module.

Fixes: #7569

Closes #7567
2020-11-09 10:20:35 +02:00
Bentsi Magidovich
2866f2d65d scylla_util.py: remove unnecessary logging
when calling curl and exception is raised we can see unnecessary log messages that we can't control.
For example when used in scylla_login we can see following messages:
WARNING:root:Failed to grab http://169.254.169.254/latest/...
WARNING:root:Failed to grab http://169.254.169.254/latest/...
    Initial image configuration failed!

To see status, run
 'systemctl status scylla-image-setup'
2020-11-02 01:13:44 +03:00
Bentsi Magidovich
a62237f1c6 scylla_util.py: make is_aws_instance faster
when used for example in scylla_login we need to understand that we
are not running on AWS faster then 10 seconds
2020-11-02 00:11:21 +03:00
Bentsi Magidovich
83a8550a5f scylla_util.py: added ability to control sleep time between retries in curl() 2020-11-01 22:39:19 +03:00
Takuya ASADA
fc1c4f2261 scylla_raid_setup: use sysfs to detect existing RAID volume
We may not able to detect existing RAID volume by device file existance,
we should use sysfs instead to make sure it's running.

Fixes #7383

Closes #7399
2020-10-29 09:13:55 +02:00
Takuya ASADA
fe2d6765f9 node_exporter_install: upgrade to latest release
We currently uses outdated version of node_exporter, let's upgrade to latest
version.

Fixes #7427
2020-10-25 13:59:14 +02:00
Takuya ASADA
d5ff82dc61 scylla_setup: skip iotune when developer_mode is enabled
When developer mode automatically enabled on nonroot mode, we should skip
iotune since the parameter won't be used.

Closes #7327
2020-10-12 11:08:10 +03:00