scylladb

Author	SHA1	Message	Date
Takuya ASADA	fefa76bffc	scylla_raid_setup: install update-initramfs when it's not available scylla_raid_setup may fail on Ubuntu minimal image since it calls update-initramfs without installing. (cherry picked from commit `b6dedf1ee1`) Closes scylladb/scylladb#19871	2024-07-25 13:58:11 +03:00
Patryk Wrobel	ec820e214c	scylla_io_setup: ensure correct RLIMIT_NOFILE for iotune The default limit of open file descriptors per process may be too small for iotune on certain machines with a large number of cores. In such a case iotune reports failure due to inability to create files or to set up the seastar framework. This change configures the limit of open file descriptors before running iotune to ensure that the failure does not occur. The limit is set via 'resource.setrlimit()' in the parent process. The limit is then inherited by the child process. Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#18546	2024-05-13 08:35:52 +03:00
Takuya ASADA	9538af0d95	scylla_kernel_check: fix block device size error on latest mkfs.xfs On latest mkfs.xfs, it does not allow to format a block device which is smaller than 300MB. There are options to ignore this validation but it is unsupported feature, so it is better to increase the loopback image size to "supported size" == 300MB. reference: https://lore.kernel.org/all/164738662491.3191861.15611882856331908607.stgit@magnolia/ Fixes #18568 Closes scylladb/scylladb#18620	2024-05-13 07:23:29 +03:00
Yaron Kaikov	2cf7cc1ea5	scylla_setup: Remove jmx and tools packages from being verified Following `b8634fb244` machine image started to fail with the following error: ``` 10:44:59 googlecompute.gce: scylla-jmx package is not installed. 10:44:59 ==> googlecompute.gce: Traceback (most recent call last): 10:44:59 ==> googlecompute.gce: File "/home/ubuntu/scylla_install_image", line 135, in <module> 10:44:59 ==> googlecompute.gce: run('/opt/scylladb/scripts/scylla_setup --no-coredump-setup --no-sysconfig-setup --no-raid-setup --no-io-setup --no-ec2-check --no-swap-setup --no-cpuscaling-setup --no-ntp-setup', shell=True, check=True) 10:44:59 ==> googlecompute.gce: File "/usr/lib/python3.10/subprocess.py", line 526, in run 10:44:59 ==> googlecompute.gce: raise CalledProcessError(retcode, process.args, 10:44:59 ==> googlecompute.gce: subprocess.CalledProcessError: Command '/opt/scylladb/scripts/scylla_setup --no-coredump-setup --no-sysconfig-setup --no-raid-setup --no-io-setup --no-ec2-check --no-swap-setup --no-cpuscaling-setup --no-ntp-setup' returned non-zero exit status 1. ``` It seems we no longer need to verify that jmx and tools-java packages are installed. Closes scylladb/scylladb#18494	2024-05-02 13:30:50 +03:00
Yaniv Kaul	2ce2649ec1	Typo: you -> your Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#17806	2024-04-04 14:55:46 +03:00
Kefu Chai	ab07fb25f5	scylla_raid_setup: reference xfsprog on the minimal 1024 block size the quote of "The minimum block size for crc enabled filesystems is 1024" comes from the output of mkfs.xfs, let's quote the source for better maintainability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17094	2024-02-14 08:44:14 +02:00
Kefu Chai	cd3c7a50ed	scylla_raid_setup: drop unused import Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17095	2024-02-02 15:20:40 +01:00
Takuya ASADA	7275b614aa	scylla_util.py: wait for apt operation on other processes apt_install() / apt_uninstall() may fail if a background process is running an apt operation, such as unattended-upgrades. To avoid this, we need to add two things: 1. For apt-get install / remove, we need to pass the option "DPkg::Lock::Timeout=-1" to wait for dpkg lock. 2. For apt-get update, there is no option to wait for cache lock. Therefore, we need to implement a retry-loop to wait for apt-get update to succeed. Fixes #16537 Closes scylladb/scylladb#16561	2023-12-28 19:00:36 +02:00
Avi Kivity	3da346a86d	Merge 'Drop CentOS7 specific codes' from Takuya ASADA Since we decided to drop CentOS7 support from latest version of Scylla, now we can drop CentOS7 specific codes from packaging scripts and setup scripts. Related scylladb/scylla-enterprise#3502 Closes scylladb/scylladb#16365 * github.com:scylladb/scylladb: scylla-server.service: switch deprecated PermissionsStartsOnly to ExecStartPre=+ dist: drop legacy control group parameters scylla-server.slice: Drop workaround for MemorySwapMax=0 bug dist: move AmbientCapabilities to scylla-server.service Revert "scylla_setup: add warning for CentOS7 default kernel" [avi: CentOS 7 reached EOL on June 2024]	2023-12-25 18:25:05 +02:00
Takuya ASADA	7c38aff368	scylla_swap_setup: fix AttributeError On `dffadabb94` we mistakenly added "if args.overwrite_unit_file", but the option is coming from an unmerged patch. So we need to drop this to fix the script error. Fixes #16331 Closes scylladb/scylladb#16358	2023-12-11 13:41:00 +02:00
Takuya ASADA	1dc4feb68d	Revert "scylla_setup: add warning for CentOS7 default kernel" This reverts commit `85339d1820`.	2023-12-11 19:38:28 +09:00
Avi Kivity	92d61def57	Merge 'scylla_swap_setup: run error check before allocating swap and increase swap allocation speed' from Takuya ASADA This patch fixes the error check and speeds up swap allocation. Following patches are included: - scylla_swap_setup: run error check before allocating swap avoid creating the swapfile before running the error check - scylla_swap_setup: use fallocate on ext4 this increases swap allocation speed on ext4 Closes scylladb/scylladb#12668 * github.com:scylladb/scylladb: scylla_swap_setup: use fallocate on ext4 scylla_swap_setup: run error check before allocating swap	2023-12-06 21:40:10 +02:00
Yaniv Kaul	f2b810a16a	Update dist/common/scripts/scylla-housekeeping cobvert -> convert	2023-12-05 15:20:35 +02:00
Yaniv Kaul	ae2ab6000a	Typos: fix typos in code Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255	2023-12-05 15:18:11 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Takuya ASADA	c9d77699e1	scylla_setup: stop listing virtual devices on the NIC prompt Currently, the NIC prompt on scylla_setup shows virtual devices such as VLAN devices and bridge devices, but perftune.py does not support them. To prevent causing an error while running scylla_setup, we should stop listing these devices in the NIC prompt. closes #6757 Closes scylladb/scylladb#15958	2023-11-23 10:27:09 +03:00
Takuya ASADA	b97df92d76	scylla_setup: stop aborting on old kernel warning when non-interactive mode On non-interactive mode setup, RHEL/CentOS7 old kernel check causes "Setup aborted", this is not what we want. We should keep warning but proceed setup, so default value of the kernel check should be True, since it will automatically applied on non-interactive mode. Fixes #16045 Closes scylladb/scylladb#16100	2023-11-22 17:44:07 +02:00
Kefu Chai	2bae14f743	dist: let scylla-server.service Wants var-lib-systemd-coredump without adding `WantedBy=scylla-server.service` in var-lib-systemd-coredump, if we starts `scylla-server.service`, it does not necessarily starts `var-lib-systemd-coredump` even if the latter is installed. with `WantedBy=scylla-server.service` in var-lib-systemd-coredump, if we starts `scylla-server.service`, var-lib-systemd-coredump will be started also. and `Before=scylla-server.service` ensures that, before `scylla-server.service` is started, var-lib-systemd-coredump is already ready. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15984	2023-11-14 14:54:39 +02:00
Takuya ASADA	85339d1820	scylla_setup: add warning for CentOS7 default kernel Since CentOS7 default kernel is too old, has performance issues and also has some bugs, we have been recommended to use kernel-ml kernel. Let's check kernel version in scylla_setup and print warning if the kernel is CentOS7 default one. related #7365 Closes scylladb/scylladb#15705	2023-11-13 13:47:06 +02:00
Takuya ASADA	a4aeef2eb0	scylla_util.py: run apt-get update before apt-get install if it necessary Unlike yum, "apt-get install" may fails because package cache is outdated. Let's check package cache mtime and run "apt-get update" if it's too old. Fixes #4059 Closes scylladb/scylladb#15960	2023-11-07 20:40:16 +02:00
Takuya ASADA	2e7552a0ca	dist/redhat: drop rpm conflict with ABRT, add systemd conflict instead Currently, "yum install scylla" causes conflict when ABRT is installed. To avoid this behavior and keep using systemd-coredump for scylla coredump, let's drop "Conflicts: abrt" from rpm and add "Conflicts=abrt-ccpp.service" to systemd unit. Fixes #892 Closes scylladb/scylladb#15691	2023-11-07 10:30:23 +02:00
Takuya ASADA	a23278308f	dist: fix local-fs.target dependency systemd man page says: systemd-fstab-generator(3) automatically adds dependencies of type Before= to all mount units that refer to local mount points for this target unit. So "Before=local-fs.target" is the correct dependency for local mount points, but we currently specify "After=local-fs.target", it should be fixed. Also replaced "WantedBy=multi-user.target" with "WantedBy=local-fs.target", since .mount units are not related to multi-user but depend on local filesystems. Fixes #8761 Closes scylladb/scylladb#15647	2023-11-06 18:39:53 +01:00
Takuya ASADA	58d94a54a3	scylla_raid_setup: fallback to other paths when UUID not available On some environments such as VMware instances, /dev/disk/by-uuid/<UUID> is not available, and scylla_raid_setup will fail while mounting the volume. To avoid failing to mount /dev/disk/by-uuid/<UUID>, fetch all available paths to mount the disk and fallback to other paths like by-partuuid, by-id, by-path or just using real device path like /dev/md0. To get the device path, and also to dump device status when UUID is not available, this introduces the UdevInfo class which communicates with udev using pyudev. Related #11359 Closes scylladb/scylladb#13803	2023-10-17 12:24:58 +03:00
Takuya ASADA	ae25a216bc	scylla_fstrim_setup: stop disabling fstrim.timer Disabling fstrim.timer was for avoid running fstrim on /var/lib/scylla from both scylla-fstrim.timer and fstrim.timer, but fstrim.timer actually never do that, since it is only looking on fstab entries, not our systemd unit. To run fstrim correctly on rootfs and other filesystems not related scylla, we should stop disabling fstrim.timer. Fixes #15176 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Closes #15177	2023-08-27 14:56:37 +03:00
Vlad Zolotarov	e13a2b687d	scylla_raid_setup: make --online-discard argument useful This argument was dead since its introduction and 'discard' was always configured regardless of its value. This patch allows actually configuring things using this argument. Fixes #14963 Closes #14964	2023-08-21 12:21:23 +03:00
Takuya ASADA	c70a9cbffe	scylla_fstrim_setup: start scylla-fstrim.timer on setup Currently, scylla_fstrim_setup does not start scylla-fstrim.timer and just enables it, so the timer starts only after reboot. This is incorrect behavior; we should start it during the setup. Also, unmask is unnecessary for enabling the timer. Fixes #14249 Closes #14252	2023-06-26 11:17:51 +03:00
Takuya ASADA	fdceda20cc	scylla_raid_setup: wipe filesystem signatures from specified disks The discussion on the thread says, when we reformat a volume with another filesystem, kernel and libblkid may skip to populate /dev/disk/by-* since it detected two filesystem signatures, because mkfs.xxx did not cleared previous filesystem signature. To avoid this, we need to run wipefs before running mkfs. Note that this runs wipefs twice, for target disks and also for RAID device. wipefs for RAID device is needed since wipefs on disks doesn't clear filesystem signatures on /dev/mdX (we may see previous filesystem signature on /dev/mdX when we construct RAID volume multiple time on same disks). Also dropped -f option from mkfs.xfs, it will check wipefs is working as we expected. Fixes #13737 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Closes #13738	2023-05-08 16:53:43 +03:00
Petr Gusev	0152c000bb	commitlog: use separate directory for schema commitlog The commitlog api originally implied that the commitlog_directory would contain files from a single commitlog instance. This is checked in segment_manager::list_descriptors, if it encounters a file with an unknown prefix, an exception occurs in commitlog::descriptor::descriptor, which is logged with the WARN level. A new schema commitlog was added recently, which shares the filesystem directory with the main commitlog. This causes warnings to be emitted on each boot. This patch solves the warnings problem by moving the schema commitlog to a separate directory. In addition, the user can employ the new schema_commitlog_directory parameter to move the schema commitlog to another disk drive. By default, the schema commitlog directory is nested in the commitlog_directory. This can help avoid problems during an upgrade if the commitlog_directory in the custom scylla.yaml is located on a separate disk partition. This is expected to be released in 5.3. As #13134 (raft tables->schema commitlog) is also scheduled for 5.3, and it already requires a clean rolling restart (no cl segments to replay), we don't need to specifically handle upgrade here. Fixes: #11867	2023-03-30 21:55:50 +04:00
Takuya ASADA	160c184d0b	scylla_kernel_check: suppress verbose iotune messages Stop printing verbose iotune messages while the check, just print error message. Fixes #13373. Closes #13362	2023-03-30 07:30:07 +03:00
Takuya ASADA	bf27fdeaa2	scylla_coredump_setup: fix coredump timeout settings We currently configure only TimeoutStartSec, but probably it's not enough to prevent coredump timeout, since TimeoutStartSec is maximum waiting time for service startup, and there is another directive to specify maximum service running time (RuntimeMaxSec). To fix the problem, we should specify RuntimeMaxSec and TimeoutSec (it configures both TimeoutStartSec and TimeoutStopSec). Fixes #5430 Closes #12757	2023-02-16 10:23:20 +02:00
Takuya ASADA	ea61b14f27	scylla_swap_setup: use fallocate on ext4 We stop using fallocate for allocating swap since it does not work on xfs (#6650). However, dd is much slower than fallocate since it filling data on the file, let's use fallocate when filesystem is ext4 since it actually works and faster. Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2023-02-01 01:58:13 +09:00
Takuya ASADA	dffadabb94	scylla_swap_setup: run error check before allocating swap We should run error check before running dd, otherwise it will left swapfile on disk without completing swap setup. Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2023-02-01 01:58:13 +09:00
Takuya ASADA	9acdd3af23	dist: drop deprecated AMI parameters on setup scripts Since we moved all IaaS code to scylla-machine-image, we no longer need AMI variable on sysconfig file or --ami parameter on setup scripts, and also never used /etc/scylla/ami_disabled. So let's drop all of them from Scylla core. Related with scylladb/scylla-machine-image#61 Closes #12043	2022-11-23 17:56:13 +02:00
Takuya ASADA	acc408c976	scylla_setup: fix incorrect type definition on --online-discard option --online-discard option defined as string parameter since it doesn't specify "action=", but has default value in boolean (default=True). It breaks "provisioning in a similar environment" since the code supposed boolean value should be "action='store_true'" but it's not. We should change the type of the option to int, and also specify "choices=[0, 1]" just like --io-setup does. Fixes #11700 Closes #11831	2022-11-08 08:40:44 +02:00
Takuya ASADA	464b5de99b	scylla_setup: allow symlink to --disks option Currently, --disks options does not allow symlinks such as /dev/disk/by-uuid/* or /dev/disk/azure/*. To allow using them, is_unused_disk() should resolve symlink to realpath, before evaluating the disk path. Fixes #11634 Closes #11646	2022-10-28 07:24:11 +03:00
Takuya ASADA	cd6030d5df	scylla_util.py: adding unescape for sysconfig_parser Even we have __escape() for escaping " middle of the value to writing sysconfig file, we didn't unescape for reading from sysconfig file. So adding __unescape() and call it on get().	2022-10-27 16:39:47 +09:00
Takuya ASADA	de57433bcf	scylla_util.py: on sysconfig_parser, don't use double quote when it's possible It seems like distribution original sysconfig files do not use double quote to set the parameter when the value does not contain space. Adding function to detect spaces in the value, don't use double quote when none are detected. Fixes #9149	2022-10-27 16:36:27 +09:00
Takuya ASADA	a938b009ca	scylla_raid_setup: run uuidpath existence check only after mount failed We added UUID device file existence check on #11399, we expect UUID device file is created before checking, and we wait for the creation by "udevadm settle" after "mkfs.xfs". However, we are actually getting an error which says UUID device file missing, it probably means "udevadm settle" doesn't guarantee the device file is created, on some condition. To avoid the error, use var-lib-scylla.mount to wait for UUID device file is ready, and run the file existence check when the service is failed. Fixes #11617 Closes #11666	2022-10-25 08:54:21 +03:00
Vlad Zolotarov	8195dab92a	scylla_prepare: correctly handle a former 'MQ' mode Fixes a regression introduced in `80917a1054`: "scylla_prepare: stop generating 'mode' value in perftune.yaml" When cpuset.conf contains a "full" CPU set the negation of it from the "full" CPU set is going to generate a zero mask as a irq_cpu_mask. This is an illegal value that will eventually end up in the generated perftune.yaml, which in line will make the scylla service fail to start until the issue is resolved. In such a case a irq_cpu_mask must represent a "full" CPU set mimicking a former 'MQ' mode. Fixes #11701 Tested: - Manually on a 2 vCPU VM in an 'auto-selection' mode. - Manually on a large VM (48 vCPUs) with an 'MQ' manually enforced. Message-Id: <20221004004237.2961246-1-vladz@scylladb.com>	2022-10-06 17:43:37 +03:00
Avi Kivity	372eadf542	Merge "perftune related improvements in scylla_* scripts" from Vlad Zolotarov " This series adds a long waited transition of our auto-generation code to irq_cpu_mask instead of 'mode' in perftune.yaml. And then it fixes a regression in scylla_prepare perftune.yaml auto-generation logic. " * 'scylla_prepare_fix_regression-v1' of https://github.com/vladzcloudius/scylla: scylla_prepare + scylla_cpuset_setup: make scylla_cpuset_setup idempotent without introducing regressions scylla_prepare: stop generating 'mode' value in perftune.yaml	2022-10-02 13:25:13 +03:00
Takuya ASADA	8835a34ab6	scylla_raid_setup: prevent mount failed for /var/lib/scylla Just like `4a8ed4c`, we also need to wait for udev event completion to create /dev/disk/by-uuid/$UUID for newly formatted disk, to mount the disk just after formatting. Fixes #11359	2022-08-27 03:27:44 +09:00
Takuya ASADA	40134efee4	scylla_raid_setup: check uuid and device path are valid Added code to check make sure uuid and uuid based device path are valid.	2022-08-27 03:08:31 +09:00
Vlad Zolotarov	c538cc2372	scylla_prepare + scylla_cpuset_setup: make scylla_cpuset_setup idempotent without introducing regressions This patch fixes the regression introduced by `3a51e78` which broke a very important contract: perftune.yaml should not be "touched" by Scylla scriptology unless explicitly requested. And a call for scylla_cpuset_setup is such an explicit request. The issue that the offending patch was intending to fix was that cpuset.conf was always generated anew for every call of scylla_cpuset_setup - even if a resulting cpuset.conf would come out exactly the same as the one present on the disk before the call. And since the original code was following the contract mentioned above it was also deleting perftune.yaml every time too. However, this was just an unavoidable side-effect of that cpuset.conf re-generation. The above also means that if scylla_cpuset_setup doesn't write to cpuset.conf we should not "touch" perftune.yaml and vice versa. This patch implements exactly that together with reverting the dangerous logic introduced by `3a51e78`. Fixes #11385 Fixes #10121	2022-08-25 13:03:02 -04:00
Vlad Zolotarov	80917a1054	scylla_prepare: stop generating 'mode' value in perftune.yaml Modern perftune.py supports a more generic way of defining IRQ CPUs: 'irq_cpu_mask'. This patch makes our auto-generation code create a perftune.yaml that uses this new parameter instead of using outdated 'mode'. As a side effect, this change eliminates the notion of "incorrect" value in cpuset.conf - every value is valid now as long as it fits into the 'all' CPU set of the specific machine. Auto-generated 'irq_cpu_mask' is going to include all bits from 'all' CPU mask except those defined in cpuset.conf. Fixes #9903	2022-08-25 13:02:57 -04:00
Takuya ASADA	ce87e15ecf	scylla_prepare: fix Exception when SET_NIC_AND_DISKS=no and SET_CLOCKSOURCE=yes We shouldn't call get_tune_mode() when NIC tuning is disabled. fixes #10412 Closes #10959	2022-07-05 14:52:52 +03:00
Takuya ASADA	7501465b7c	scylla_util.py: change debug log directory to /var/tmp/scylla Current debug log is bit difficult to collect in CI, to find the debug log we must know which script caused Exception. Because the filename does not include prefix, and also specified directory is shared with other programs. To make things more easily, let's change debug log directory to /var/tmp/scylla. Closes #10730	2022-07-05 14:49:00 +03:00
Takuya ASADA	3a51e7820a	scylla_cpuset_setup: stop deleting perftune.yaml and skip update cpuset.conf when same parameter specified To make scylla setup scripts easier to handle in Ansible, stop deleting perftune.yaml and detect cpuset.conf changes by mtime of the file. Also, skip update cpuset.conf when same parameter specified. Fixes #10121 Closes #10312	2022-06-23 10:28:36 +03:00
Israel Fruchter	d2ca2455db	scripts/scylla_util.py: introduce back user/group arguments for out() since #10467 removed the user/group parameters needed for the housekeeping call, need to introduce them back Fixes: #10804 Closes #10818	2022-06-16 13:50:17 +03:00
Takuya ASADA	5643c6de56	scylla_util.py: fix "systemctl is-active" causes error On `48b6aec16a` we mistakenly allowed check=True on systemd_unit.is_active(), it should be check=False. We check unit's status by "systemctl is-active" output string, it returns "active" or "inactive". But systemctl command returns non-zero status when it returning "inactive", so we are getting Exception here. To fix this, we need new option "ignore_error=True" for out(), and use it in systemd_unit.is_active(). Fixes #10455 Closes #10467	2022-06-13 13:45:50 +03:00
Takuya ASADA	ad2344a864	scylla_coredump_setup: support new format of Storage field Storage field of "coredumpctl info" changed at systemd-v248, it added "(present)" on the end of line when coredump file available. Fixes #10669 Closes #10714	2022-06-07 02:21:32 +03:00
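
The `80917a1054` and `8195dab92a` entries above describe how the auto-generated 'irq_cpu_mask' is derived: take every bit of the 'all' CPU set, clear the CPUs listed in cpuset.conf, and fall back to the full set when the result would be a zero mask (the former 'MQ' mode). A minimal sketch of that rule follows. The function name `irq_cpu_mask` and the plain hex output are assumptions made for illustration; this is not the code in scylla_prepare, and perftune.py's actual mask syntax may differ.

```python
def irq_cpu_mask(all_cpus, scylla_cpus):
    """Return a hex mask of CPUs left for IRQ handling (hypothetical helper)."""
    all_set = set(all_cpus)
    compute = set(scylla_cpus)
    # The commit message treats any cpuset as valid as long as it fits
    # into the machine's 'all' CPU set.
    if not compute <= all_set:
        raise ValueError("cpuset does not fit into the 'all' CPU set")
    irq = all_set - compute
    if not irq:
        # A zero mask is illegal, so mimic the former 'MQ' mode:
        # IRQs may land on every CPU.
        irq = all_set
    return hex(sum(1 << cpu for cpu in irq))


print(irq_cpu_mask(range(8), range(1, 8)))  # CPU 0 is left for IRQs
print(irq_cpu_mask(range(2), [0, 1]))       # "full" cpuset, falls back to all CPUs
```

The fallback branch is the part that `8195dab92a` adds: without it, a cpuset.conf covering every CPU would yield a mask of zero.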


657 Commits
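
The `ec820e214c` entry (scylla_io_setup) says the open-file limit is set with 'resource.setrlimit()' in the parent process and then inherited by the child. The sketch below shows that mechanism using only the Python standard library. The helper names `raise_nofile_limit` and `child_soft_limit` are hypothetical, and the child here is a tiny Python one-liner standing in for iotune; the value scylla_io_setup actually requests is not stated in the log, so the caller supplies it.

```python
import resource
import subprocess
import sys


def raise_nofile_limit(wanted):
    """Raise the soft RLIMIT_NOFILE to `wanted`, capped by the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft  # already high enough, leave it alone
    resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return wanted


def child_soft_limit():
    """Spawn a child process and report the soft limit it observes."""
    code = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, check=True)
    return int(result.stdout)


if __name__ == "__main__":
    # 10000 is an arbitrary illustrative value, not the one used upstream.
    print("parent soft limit:", raise_nofile_limit(10000))
    print("child soft limit: ", child_soft_limit())
```

Because resource limits are copied to child processes at spawn time, no extra plumbing is needed between the setup script and the tool it launches.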