scylladb

Author	SHA1	Message	Date
Takuya ASADA	48b6aec16a	scripts: use "out()" function for all capture_output subprocesses In `acaf0bb` we applied out() only to perftune.py because we hit issue #10390 with that script. But the issue can happen with other commands too, so let's apply it to all commands that use capture_output. Related #10390 Closes #10414	2022-04-26 13:56:52 +03:00
Takuya ASADA	acaf0bb88a	scripts: print perftune.py error message when capture_output=True Currently we cannot get any error message from a subprocess when we specify capture_output=True on subprocess.run(). This is because CalledProcessError does not print stdout/stderr when it is raised, and we don't catch the exception; we just let Python produce a traceback. As a result, we only know the exit status and the failed command, not stdout/stderr. This is especially problematic when working on a perftune.py bug, since the script should have produced a traceback but we never get to see it. To resolve this, add a wrapper function "out()" for capturing output, and print stdout/stderr with the error message inside the function. Fixes #10390 Closes #10391	2022-04-18 14:06:51 +03:00
Takuya ASADA	59adf05951	scylla_sysconfig_setup: avoid parse error on perftune.py --get-cpu-mask Currently, we pass the entire output of perftune.py when getting the CPU mask from the script, but this may cause a parse error since the script may also print warning messages. To avoid that, we need to extract the CPU mask from the output. Fixes #10082 Closes #10107	2022-03-28 16:31:14 +03:00
Takuya ASADA	59c72d5d60	scylla_prepare: print Traceback with current user-friendly messages In `e1b15ba`, we introduced a user-friendly error message for when an Exception occurs while generating perftune.yaml. However, it became difficult to investigate bugs since we dropped the traceback. To resolve this problem, let's print both the traceback and the user-friendly messages. Related #10050 Closes #10140	2022-03-20 16:55:18 +02:00
Takuya ASADA	c2ccdac297	move cloud related code from scylla repository to scylla-machine-image Currently, cloud related code has cross-dependencies between scylla and scylla-machine-image. This is not a good design, and a single change can break both packages. To resolve the issue, we need to move all cloud related code to scylla-machine-image and remove it from the scylla repository. Change list: - move cloud part of scylla_util.py to scylla-machine-image - move cloud part of scylla_io_setup to scylla-machine-image - move scylla_ec2_check to scylla-machine-image - move cloud part of scylla_bootparam_setup to scylla-machine-image Closes #9957	2022-02-01 11:26:59 +02:00
Takuya ASADA	218dd3851c	scylla_swap_setup: add --swap-size-bytes Currently, --swap-size cannot specify an exact file size because the option takes its parameter only in GB. To fix the limitation, let's add --swap-size-bytes to specify the swap size in bytes. We need this to implement swapfile preallocation while building the IaaS image. See scylladb/scylla-machine-image#285 Closes #9971	2022-01-31 18:32:32 +02:00
Takuya ASADA	32f2eb63ac	scylla_raid_setup: use mdmonitor only when RAID level > 0 We found that the monitor mode of mdadm does not work on RAID0, and according to a RHEL developer this is not a bug but expected behavior. Therefore, we should stop enabling mdmonitor when RAID0 is specified. Fixes #9540	2022-01-26 22:33:07 +09:00
Takuya ASADA	cd57815fff	Revert "scylla_raid_setup: workaround for mdmonitor.service issue on CentOS8" This reverts commit `0d8f932f0b`, because a RHEL developer explained that this is not a bug but expected behavior (mdadm --monitor does not start when RAID level is 0). See: https://bugzilla.redhat.com/show_bug.cgi?id=2031936 So we should stop downgrading the mdadm package and modify our script not to enable mdmonitor.service on RAID0, using it only for RAID5.	2022-01-26 22:33:06 +09:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Valerii Ponomarov	12fa68fe67	scylla_util: return boolean calling systemd_unit.available As of now, 'systemd_unit.available' works ok only when provided unit is present. It raises Exception instead of returning boolean when provided systemd unit is absent. So, make it return boolean in both cases. Fixes https://github.com/scylladb/scylla/issues/9848 Closes #9849	2021-12-28 15:14:04 +02:00
Takuya ASADA	6a834261fb	scylla_coredump_setup: prevent coredump timeout on systemd-coredump@.service On newer versions of systemd-coredump, coredumps are handled in systemd-coredump@.service, which may time out while running the systemd unit, like this: systemd[1]: systemd-coredump@xxxx.service: Service reached runtime time limit. Stopping. To prevent that, we need to override TimeoutStartSec=infinity. Fixes #9837 Closes #9841	2021-12-27 13:58:07 +02:00
Takuya ASADA	0d8f932f0b	scylla_raid_setup: workaround for mdmonitor.service issue on CentOS8 On CentOS8, mdmonitor.service does not work correctly when using mdadm-4.1-15.el8.x86_64 and later versions. Until we find a solution, let's pin the package version to an older one which does not cause the issue (4.1-14.el8.x86_64). Fixes #9540 Closes #9782	2021-12-27 12:07:34 +02:00
Takuya ASADA	7064ae3d90	dist: fix scylla-housekeeping uuid file chmod call Should use chmod() on a file, not fchmod() Fixes #9683 Closes #9802	2021-12-27 11:47:06 +02:00
Takuya ASADA	6870938842	scylla_raid_setup: fix typo Closes #9790	2021-12-14 11:15:23 +02:00
Takuya ASADA	ea20f89c56	dist: allow running scylla-housekeeping with strict umask setting To avoid failing scylla-housekeeping in strict umask environment, we need to chmod a+r on repository file and housekeeping.uuid. Fixes #9683 Closes #9739	2021-12-05 20:46:46 +02:00
Takuya ASADA	097a6ee245	dist: add support for im4gn/is4gen instances on AWS Add support for next-generation, storage-optimized ARM64 instance types. Fixes #9711 Closes #9730	2021-12-05 13:20:01 +02:00
Michał Chojnowski	08f7b81b36	dist: scylla_io_setup: run iotune for supported but not preconfigured AWS instance types Currently, for AWS instances in `is_supported_instance_class()` other than i3* and *gd (for example: m5d), scylla_io_setup neither provides preconfigured values for io_properties.yaml nor runs iotune nor fails. This silently results in a broken io_properties.yaml, like so: disks: - mountpoint: /var/lib/scylla Fix that. Closes #9660	2021-11-24 18:28:13 +02:00
Avi Kivity	a19d00ef9b	dist: scylla_raid_setup: mount XFS with online discard Online discard asks the disk to erase flash memory cells as soon as files are deleted. This gives the disk more freedom to choose where to place new files, so it improves performance. On older kernel versions, and on really bad disks, this can reduce performance so we add an option to disable it. Since fstrim is pointless when online discard is enabled, we don't configure it if online discard is selected. I tested it on an AWS i3.large instance; the flag showed up in `mount` after configuration. Closes #9608	2021-11-15 14:16:08 +02:00
Takuya ASADA	279fabe9b4	scylla_ntp_setup: use string in systemd_unit.is_active() Since we reverted `2545d7fd43`, we need to use string instead of bool value.	2021-11-15 19:50:31 +09:00
Takuya ASADA	d646673705	Revert "scylla_util.py: return bool value on systemd_unit.is_active()" This reverts commit `2545d7fd43`. Fixes #9627 Fixes scylladb/scylla-machine-image#241	2021-11-15 19:50:31 +09:00
Takuya ASADA	9b4cf8c532	scylla_util.py: On is_gce(), return False when it's on GKE The GKE metadata server does not provide the same metadata as GCE, so we should not return True from is_gce(). So try to fetch machine-type from the metadata server, and return False if it responds with 404 Not Found. Fixes #9471 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Closes #9582	2021-11-04 12:49:06 +02:00
Avi Kivity	075ceb8918	Merge 'AWS: add scylla_io_setup preset parameters for ARM instances' from Takuya ASADA Currently, scylla-server fails to start on ARM instances because scylla_io_setup does not have preset parameters even though the instance types were added to the 'supported instance' list. To fix this, we need to add I/O parameter presets to scylla_io_setup. Also, we mistakenly added EBS-only instances in `a004b1da30` and need to remove them. Instances that do not have ephemeral disks should be 'unsupported instance'; we can still run our AMI on them, but we print a warning message at the login prompt, and the user is required to run scylla_io_setup. Fixes #9493 Closes #9532 * github.com:scylladb/scylla: scylla_util.py: remove EBS only ARM instances from support instance list scylla_io_setup: support ARM instances on AWS	2021-11-03 10:19:59 +02:00
Takuya ASADA	4a96a8145e	scylla_util.py: remove EBS only ARM instances from support instance list Since we require ephemeral disks for our AMI, these EBS-only ARM instances cannot be added to the 'supported instance' list. We are still able to run our AMI on these instance types, but the login message warns that it is an 'unsupported instance type' and requires running scylla_io_setup manually.	2021-11-03 10:26:42 +09:00
Takuya ASADA	4e8060ba72	scylla_io_setup: support ARM instances on AWS Add preset parameters for AWS ARM instances. Fixes #9493	2021-11-03 10:26:42 +09:00
Takuya ASADA	13ffe3c094	scylla_util.py: detect ephemeral/EBS disks correctly on Nitro System Currently, aws_instance.ephemeral_disks() returns both ephemeral disks and EBS disks on Nitro System. This is because both are attached as NVMe disks, so we need to add disk type detection code to the NVMe handling logic. Fixes #9440 Closes #9462	2021-10-28 08:58:25 +03:00
Takuya ASADA	3b798afc1e	scylla_io_setup: handle nr_disks on GCP correctly nr_disks is int, should not be string. Fixes #9429 Closes #9430	2021-10-06 12:31:38 +03:00
Takuya ASADA	9c830297ac	scylla_util.py: add persistent disk support for GCE Just like EBS disks for EC2, we want to use persistent disk on GCE. We won't recommend to use it, but still need to support it. Related scylladb/scylla-machine-image#215 Closes #9395	2021-10-03 17:58:18 +03:00
Takuya ASADA	d87b80ad14	scylla_util.py: add persistent disk support for Azure Just like EBS disks for EC2, we want to use persistent disk on Azure. We won't recommend to use it, but still need to support it. Related https://github.com/scylladb/scylla-machine-image/issues/218 Closes #9417	2021-10-03 17:56:31 +03:00
Takuya ASADA	cd7fe9a998	scylla_cpuscaling_setup: disable ondemand.service on Ubuntu On Ubuntu, scaling_governor becomes powersave after a reboot, even if we configured cpufrequtils. This is because ondemand.service unconditionally changes scaling_governor to ondemand or powersave. cpufrequtils starts before ondemand.service, so scaling_governor is overwritten by ondemand.service. To configure scaling_governor correctly, we have to disable this service. Fixes #9324 Closes #9325	2021-09-29 10:32:34 +03:00
Takuya ASADA	f928dced0c	scylla_cpuscaling_setup: add --force option To build an Ubuntu AMI with CPU scaling configuration, we need a force mode for scylla_cpuscaling_setup, which runs setup without checking scaling_governor support. See scylladb/scylla-machine-image#204 Closes #9326	2021-09-13 18:45:46 +03:00
Felipe Mendes	1b8dff63c3	iotune - Fix i3en.xlarge check i3en.xlarge is currently not getting tuned properly. A quick test using Scylla AMI ( ami-07a31481e4394d346 ) reveals that the storage capabilities under this instance are greatly reduced: $ grep iops /etc/scylla.d/io_properties.yaml read_iops: 257024 write_iops: 174080 This patch corrects this typo, in such a way that iotune now properly tunes this instance type. Closes #9298	2021-09-07 10:44:39 +03:00
Takuya ASADA	5b62bebbb6	scylla_io_setup: check root privilege on root mode This is a side effect of allowing scylla_io_setup to run in nonroot mode: the script can be run by a non-root user even when the installation is not in nonroot mode. As a result, the script eventually fails to write io_properties.yaml with a permission denied error. Since the evaluation takes a long time, we should run the permission check before starting it. We need to add the root privilege check again, but skip it in nonroot mode. Fixes #8915 Closes #8984	2021-08-22 16:49:40 +03:00
Takuya ASADA	e5bb88b69a	scylla_cpuscaling_setup: change scaling_governor path In some environments /sys/devices/system/cpu/cpufreq/policy0/scaling_governor does not exist even though CPU scaling is supported. Instead, /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor is available in both environments, so we should switch to it. Fixes #9191 Closes #9193	2021-08-11 15:31:14 +03:00
Nadav Har'El	9662de85f5	Merge 'Azure snitch support' from Pekka Enberg This adds support for Azure snitch. The work is an adaptation of AzureSnitch for Apache Cassandra by Yoshua Wakeham: https://raw.githubusercontent.com/yoshw/cassandra/9387-trunk/src/java/org/apache/cassandra/locator/AzureSnitch.java Also change `production_snitch_base` to protect against a snitch implementation setting DC and rack to an empty string, which Lubos says can happen on Azure. Fixes #8593 Closes #9084 * github.com:scylladb/scylla: scylla_util: Use AzureSnitch on Azure production_snitch_base: Fallback for empty DC or rack strings azure_snitch: Azure snitch support	2021-08-03 22:52:05 +03:00
Eduardo Benzecri	f196a4131a	scylla_setup: Fix outdated message Message changed according to what 'scylla_bootparam_setup' currently does (set a clock source at boot time) instead of what it used to do in the past (setting huge pages). Closes #9116.	2021-08-02 16:04:38 +03:00
Pekka Enberg	ef5b2934e8	scylla_util: Use AzureSnitch on Azure Fixes #8593	2021-07-28 14:07:42 +03:00
Takuya ASADA	42fd73d033	scylla_setup: add RAID5 support This adds optional RAID5 support to scylla_setup. Fixes #9076 Closes #9093	2021-07-27 12:49:29 +03:00
Yaron Kaikov	a004b1da30	scylla_util: add AWS ARM based instances to supported list Today we have a Scylla AMI image based on the x86 architecture only. Following the work we did in https://github.com/scylladb/scylla-machine-image/pull/153 we can build an ARM based AMI image. Let's add ARM based instances to the supported list. Closes #9064	2021-07-22 15:48:29 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Yaron Kaikov	6a447db8a8	scylla_util.py: Fix Azure support for machine-image In https://github.com/scylladb/scylla/pull/7807 we added support for Azure instances in Scylla. The following changes are required in order for machine-image to work: 1) fix the wrong metadata URL and update metadata path values (introduced in `f627fcbb0c`) 2) fix the naming of functions which are used by machine-image 3) add missing functions which are required by machine-image 4) clean up unused functions Closes #8596	2021-06-06 09:21:23 +03:00
Lubos Kosco	777771df34	scylla_util.py: Relax GCE setup NVMe device checks We don't want to fail I/O setup if there are more than one NVMe devices mounted as root nor if there are no NVMe devices. Fixes #8032 Closes #8444	2021-06-06 09:21:23 +03:00
Yaron Kaikov	dd453ffe6a	install.sh: Setup aio-max-nr upon installation This is a follow-up change to #8512. Let's add the aio conf file during the Scylla installation process and make sure we also remove this file when uninstalling Scylla. As per Avi Kivity's suggestion, let's set the aio value as static configuration, and make it large enough to work with 500 CPUs. Closes #8650	2021-05-24 14:24:20 +03:00
Takuya ASADA	3d307919c3	scylla_raid_setup: use /dev/disk/by-uuid to specify filesystem Currently, var-lib-scylla.mount may fail because it can start before the MDRAID volume is initialized. We might be able to add "After=dev-disk-by\x2duuid-<uuid>.device" to wait for the device to become available, but the systemd manual says it automatically configures the dependency for a mount unit when we specify the filesystem path by "absolute path of a device node". So we need to replace What=UUID=<uuid> with What=/dev/disk/by-uuid/<uuid>. Fixes #8279 Closes #8681	2021-05-24 14:24:08 +03:00
Yaron Kaikov	588a065304	scylla_io_setup: configure "aio-max-nr" before iotune On several instance types in AWS and Azure, we get the following failure during the scylla_io_setup process: ``` ERROR 2021-04-14 07:50:35,666 [shard 5] seastar - Could not setup Async I/O: Resource temporarily unavailable. The most common cause is not enough request capacity in /proc/sys/fs/aio-max-nr. Try increasing that number or reducing the amount of logical CPUs available for your application ``` We have scylla_prepare:configure_io_slots() running before scylla-server.service starts, but scylla_io_setup takes place before that. 1) Let's move configure_io_slots() to scylla_util.py since both scylla_io_setup and scylla_prepare import functions from it 2) clean up scylla_prepare since we don't need the same function twice 3) Let's use configure_io_slots() during scylla_io_setup to avoid such failure Fixes: #8587 Closes #8512	2021-05-11 18:39:10 +03:00
Avi Kivity	6977064693	dist: scylla_raid_setup: reduce xfs block size to 1k Since Linux 5.12 [1], XFS is able to asynchronously overwrite sub-block ranges without stalling. However, we want good performance on older Linux versions, so this patch reduces the block size to the minimum possible. That turns out to be 1024 for crc-protected filesystems (which we want) and it also cannot be smaller than the sector size. So we fetch the sector size and set the block size to that if it is larger than 512. Most SSDs have a sector size of 512, so this isn't a problem. Tested on AWS i3.large. Fixes #8156. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ed1128c2d0c87e5ff49c40f5529f06bc35f4251b Closes #8585	2021-05-05 16:07:50 +03:00
Lubos Kosco	c26bcf29f9	scylla_io_setup: add disk properties for L Azure instances	2021-05-04 13:13:05 +02:00
Lubos Kosco	f627fcbb0c	scylla_util.py: add new class for Azure cloud support	2021-05-04 13:12:42 +02:00
Takuya ASADA	c9324634ca	scylla_raid_setup: enabling mdmonitor.service on Debian variants On Debian variants, mdmonitor.service cannot be enabled because it is missing the [Install] section, so 'systemctl enable mdmonitor.service' fails and mdmonitor does not run after the system restarts. To force running the service, add Wants=mdmonitor.service to var-lib-scylla.mount. Fixes #8494 Closes #8530	2021-04-28 11:32:27 +03:00
Peter Veentjer	c255903fb0	dist: Added r5b to ena instance_class. The r5b instances also have ena support. For a confirmation that all r5b instances have ena, go to the following page: https://instances.vantage.sh/ Select the r5b and add the 'enhanced networking' column. Then it will show that for every r5b type there is ena support Closes #8546	2021-04-27 15:39:24 +03:00
Pekka Enberg	0ddbed2513	dist: Add support for disabling writeback cache This adds support for disabling writeback cache by adding a new DISABLE_WRITEBACK_CACHE option to "scylla-server" sysconfig file, which makes the "scylla_prepare" script (that is run before Scylla starts up) call perftune.py with appropriate parameters. Also add a "--disable-writeback-cache" option to "scylla_sysconfig_setup", which can be called by scylla-machine image scripts, for example. Refs: #7341 Tests: dtest (next-gating) Closes #8526	2021-04-22 11:24:49 +03:00
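
The block-size rule described in commit 6977064693 (1024 as the minimum for crc-protected XFS, never below the sector size) can be sketched as follows. This is only an illustration of the arithmetic stated in the commit message; the function name `xfs_block_size` is hypothetical and is not taken from scylla_raid_setup.

```python
def xfs_block_size(sector_size: int) -> int:
    """Pick an XFS block size from the device sector size (hypothetical helper).

    Follows the rule stated in the commit message: use 1024 by default,
    and use the sector size itself when it is larger than 512.
    """
    if sector_size <= 0:
        raise ValueError("sector size must be positive")
    # 1024 is the minimum for crc-protected filesystems per the commit message.
    return sector_size if sector_size > 512 else 1024


if __name__ == "__main__":
    for sector in (512, 4096):
        print(sector, "->", xfs_block_size(sector))
```

For the common 512-byte sector case this yields 1024, and a 4096-byte sector device keeps a 4096-byte block.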


603 Commits
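
Commits acaf0bb88a and 48b6aec16a describe a wrapper named "out()" that prints a subprocess's stdout/stderr when a command run with capture_output=True fails. A minimal sketch of that idea, assuming nothing about the real function beyond its name, might look like the following. The parameters `shell` and `timeout` and the exact message format are assumptions for illustration, not the signature used in the scylla scripts.

```python
import subprocess
import sys


def out(cmd, shell=False, timeout=None):
    """Run cmd and return its stripped stdout (illustrative sketch).

    On a non-zero exit status, print the captured stdout/stderr to our
    own stderr before re-raising, so the failure is not reduced to just
    an exit status and a command line in the traceback.
    """
    try:
        res = subprocess.run(cmd, capture_output=True, shell=shell,
                             timeout=timeout, check=True, encoding="utf-8")
    except subprocess.CalledProcessError as e:
        # CalledProcessError carries the captured streams but does not
        # include them in its string form, so print them explicitly.
        print(f"Command {e.cmd!r} failed with exit status {e.returncode}",
              file=sys.stderr)
        print(f"----- stdout -----\n{e.stdout}", file=sys.stderr)
        print(f"----- stderr -----\n{e.stderr}", file=sys.stderr)
        raise
    return res.stdout.strip()


if __name__ == "__main__":
    print(out([sys.executable, "-c", "print('hello')"]))
```

Re-raising after printing keeps the original traceback intact, which matches the intent stated in the commit message of seeing both the failure location and the child's output.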