scylladb

Author	SHA1	Message	Date
Amnon Heiman	a213e41250	scylla-node-exporter: Add ethtool to node exporter AWS suggests monitoring several network performance metrics: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-network-performance-ena.html#network-performance-metrics This patch enables the ethtool collector with the specific list of metrics. After this patch the relevant metrics look like: $ curl http://localhost:9100/metrics |& grep ethtool node_ethtool_bw_in_allowance_exceeded{device="ens5"} 0 node_ethtool_bw_out_allowance_exceeded{device="ens5"} 0 node_ethtool_conntrack_allowance_available{device="ens5"} 51303 node_ethtool_conntrack_allowance_exceeded{device="ens5"} 0 node_ethtool_info{bus_info="0000:00:05.0",device="ens5",driver="ena",expansion_rom_version="",firmware_version="",version="6.14.0-1015-aws"} 1 node_ethtool_linklocal_allowance_exceeded{device="ens5"} 0 node_scrape_collector_duration_seconds{collector="ethtool"} 0.001091436 node_scrape_collector_success{collector="ethtool"} 1 Signed-off-by: Amnon Heiman <amnon@scylladb.com> Closes scylladb/scylladb#27358	2025-12-08 14:27:10 +02:00
Avi Kivity	1f6e3301e7	dist: systemd: drop deprecated CPU and I/O shares/weight from scylla-server.slice The BlockIOWeight and CPUShares are deprecated. They are only used on RHEL 7, which has reached end-of-life. Their replacements, IOWeight and CPUWeight, are already set in the file. Remove the deprecated settings to reduce noise in the logs. Closes scylladb/scylladb#27222	2025-11-26 06:42:11 +02:00
Avi Kivity	8e480110c2	dist: housekeeping: set python.multiprocessing fork mode to "fork" Python 3.14 changed the multiprocessing fork mode to "forkserver", presumably for good reasons. However, it conflicts with our relocatable Python system. "forkserver" forks and execs a Python process at startup, but it does this without supplying our relocated ld.so. The system ld.so detects a conflict and crashes. Fix this by switching back to "fork", which is sufficient for housekeeping's modest needs. Closes scylladb/scylladb#26831	2025-11-05 15:47:38 +03:00
Takuya ASADA	eb30594a60	dist: detect corrupted NUMA topology information Some environments have corrupted NUMA topology information, such as some instance types on AWS EC2 with specific Linux kernel images. In such environments, we cannot get hardware information correctly from hwloc, so perftune cannot proceed with its optimization. To avoid a script error, check the NUMA topology information and skip running perftune if the information is corrupted. Related scylladb/seastar#2925 Closes scylladb/scylladb#26344	2025-10-22 01:11:14 +03:00
Avi Kivity	bb02295695	setup: add the lazytime XFS mount option In `f828fe0d59` ("setup: add the lazytime XFS version") we added the lazytime mount option to /var/lib/scylla, but it was quickly reverted (`8f5e80e61a`) as it caused a regression on CentOS 7. We reinstate it now with a kernel version check. This will avoid the lazytime mount option on CentOS 7, which is unsupported anyway. The lazytime option avoids marking the inode as dirty if it's only for the purpose of updating mtime/ctime. This won't help much while writing sstables (since the write also updates extent information), but may help a little with commitlog writes, since those are pure overwrites. It likely won't help with the RWF_NOWAIT violations seen in [1], since those are likely due to in-memory locking, not flushing dirty inodes to disk. Tested with an install to Ubuntu 24.04 LTS followed by a scylla_setup run. The lazytime option was added to the .mount file and showed up in the live mount. [1] https://github.com/scylladb/seastar/issues/2974 Closes scylladb/scylladb#26436 Fixes #26002	2025-10-09 15:55:58 +03:00
Robert Bindar	2c74a6981b	Make scylla_io_setup detect request size for best write IOPS We noticed during work on scylladb/seastar#2802 that on i7i family (later proved that it's valid for i4i family as well), the disks are reporting the physical sector sizes incorrectly as 512bytes, whilst we proved we can render much better write IOPS with 4096bytes. This is not the case on AWS i3en family where the reported 512bytes physical sector size is also the size we can achieve the best write IOPS. This patch works around this issue by changing `scylla_io_setup` to parse the instance type out of `/sys/devices/virtual/dmi/id/product_name` and run iotune with the correct request size based on the instance type. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#25315	2025-10-08 14:30:52 +03:00
Avi Kivity	5d1846d783	dist: scylla_raid_setup: don't override XFS block size on modern kernels In `6977064693` ("dist: scylla_raid_setup: reduce xfs block size to 1k"), we reduced the XFS block size to 1k when possible. This is because commitlog wants to write the smallest amount of padding it can, and older Linux could only write a multiple of the block size. Modern Linux [1] can O_DIRECT overwrite a range smaller than a filesystem block. However, this doesn't play well with some SSDs that have 512 byte logical sector size and 4096 byte physical sector size - it causes them to issue read-modify-writes. To improve the situation, if we detect that the kernel is recent enough, format the filesystem with its default block size, which should be optimal. Note that commitlog will still issue sub-4k writes, which can translate to RMW. There, we believe that the amplification is reduced since sequential sub-physical-sector writes can be merged, and that the overhead from commitlog space amplification is worse than the RMW overhead. Tested on AWS i4i.large. 
fsqual report: ``` memory DMA alignment: 512 disk DMA alignment: 512 filesystem block size: 4096 context switch per write io (size-changing, append, blocksize 4096, iodepth 1): 0.0003 (GOOD) context switch per write io (size-changing, append, blocksize 4096, iodepth 3): 0.7961 (BAD) context switch per write io (size-unchanging, append, blocksize 4096, iodepth 3): 0 (GOOD) context switch per write io (size-unchanging, append, blocksize 4096, iodepth 7): 0.0001 (GOOD) context switch per write io (size-unchanging, append, blocksize 512, iodepth 1): 0.125 (BAD) context switch per write io (size-unchanging, overwrite, blocksize 512, iodepth 1): 0 (GOOD) context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 1): 0 (GOOD) context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD) context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD) context switch per write io (size-changing, append, blocksize 4096, iodepth 1): 0 (GOOD) context switch per write io (size-changing, append, blocksize 4096, iodepth 3): 0.8006 (BAD) context switch per write io (size-unchanging, append, blocksize 4096, iodepth 3): 0.0001 (GOOD) context switch per write io (size-unchanging, append, blocksize 4096, iodepth 7): 0 (GOOD) context switch per write io (size-unchanging, append, blocksize 512, iodepth 1): 0.125 (BAD) context switch per write io (size-unchanging, overwrite, blocksize 512, iodepth 1): 0 (GOOD) context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 1): 0 (GOOD) context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD) context switch per read io (size-changing, append, blocksize 512, iodepth 30): 0 (GOOD) ``` The sub-block overwrite cases are GOOD. 
In comparison, the fsqual report for 1k (similar): ``` memory DMA alignment: 512 disk DMA alignment: 512 filesystem block size: 1024 context switch per write io (size-changing, append, blocksize 1024, iodepth 1): 0.0005 (GOOD) context switch per write io (size-changing, append, blocksize 1024, iodepth 3): 0.7948 (BAD) context switch per write io (size-unchanging, append, blocksize 1024, iodepth 3): 0.0015 (GOOD) context switch per write io (size-unchanging, append, blocksize 1024, iodepth 7): 0.0022 (GOOD) context switch per write io (size-unchanging, append, blocksize 512, iodepth 1): 0.4999 (BAD) context switch per write io (size-unchanging, overwrite, blocksize 512, iodepth 1): 0 (GOOD) context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 1): 0 (GOOD) context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD) context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD) context switch per write io (size-changing, append, blocksize 1024, iodepth 1): 0 (GOOD) context switch per write io (size-changing, append, blocksize 1024, iodepth 3): 0.798 (BAD) context switch per write io (size-unchanging, append, blocksize 1024, iodepth 3): 0.0012 (GOOD) context switch per write io (size-unchanging, append, blocksize 1024, iodepth 7): 0.0019 (GOOD) context switch per write io (size-unchanging, append, blocksize 512, iodepth 1): 0.5 (BAD) context switch per write io (size-unchanging, overwrite, blocksize 512, iodepth 1): 0 (GOOD) context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 1): 0 (GOOD) context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD) context switch per read io (size-changing, append, blocksize 512, iodepth 30): 0 (GOOD) ``` Fixes #25441. [1] `ed1128c2d0` Closes scylladb/scylladb#25445	2025-09-30 17:14:36 +03:00
Taras Veretilnyk	15e3980693	docs: Add documentation for the nodetool dropquarantinedsstables command Fixes scylladb/scylladb#19061	2025-08-01 11:46:33 +02:00
Yaron Kaikov	fdcaa9a7e7	dist/common/scripts/scylla_sysconfig_setup: fix `SyntaxWarning: invalid escape sequence` There are invalid escape sequence warnings where raw strings should be used for the regex patterns Fixes: https://github.com/scylladb/scylladb/issues/24915 Closes scylladb/scylladb#24916	2025-07-14 11:20:41 +02:00
Yaniv Michael Kaul	198ecd8039	Do not perform blkdiscard by default on the disks during RAID setup. This is not needed on clean disks, which is often the case with cloud instances, but can be useful on bare metal servers with disks that were used before. Therefore, the default is to skip blkdiscard operation, which makes overall installation faster. If the user wishes to run it anyway, use the newly introduced --blkdiscard option of scylla_raid_setup to perform it. Note: since we either perform online discard or schedule fstrim, the (previously used) space will gradually get trimmed, this way or another. Fixes: https://github.com/scylladb/scylladb/issues/24470 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#24579	2025-06-26 12:25:38 +02:00
Avi Kivity	092a88c9b9	dist: drop the scylla-env package scylla-env was used to glue together support for older distributions. It hasn't been used for many years. Remove it. Closes scylladb/scylladb#23985	2025-05-09 14:10:00 +03:00
Takuya ASADA	781dec5852	dist/docker: run the container as non-root user Since it is requirement for Red Hat OpenShift Certification, we need to run the container as non-root user. Related scylladb/scylla-pkg#4858 Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2025-03-05 23:39:56 +09:00
Avi Kivity	5c647408c7	systemd: map libraries close to the executable The Intel Optimization Manual states that branches with relative offsets greater than 2GB suffer a penalty. They cite a 6% improvement when this is avoided. Our code doesn't rely heavily on dynamically linked libraries, so I don't expect a similar win, but it's still better to do it than not. Eliminate long branches by asking the dynamic linker to restrict itself to the lower 4GB of the address space. I saw that it maps libraries at 1GB+ addresses, so this satisfies the limitation. Fix is from the Intel Optimization Manual as well. This change was ported from ScyllaDB Enterprise. Closes scylladb/scylladb#22498	2025-02-11 22:16:09 +02:00
Avi Kivity	cf72c31617	treewide: improve bash error reporting bash error handling and reporting is atrocious. Without -e it will just ignore errors. With -e it will stop on errors, but not report where the error happened (apart from exiting itself with an error code). Improve that with the `trap ERR` command. Note that this won't be invoked on intentional error exit with `exit 1`. We apply this on every bash script that contains -e or that it appears trivial to set it in. Non-trivial scripts without -e are left unmodified, since they might intentionally invoke failing scripts. Closes scylladb/scylladb#22747	2025-02-10 18:28:52 +03:00
Yaron Kaikov	b74565e83f	dist/common/scripts/scylla_raid_setup: reduce XFS metadata overhead The block size of 1k is significantly increasing metadata overhead with xfs since it reserves space upfront for btree expansion. With CRC disabled, this reservation doesn't happen. Smaller btree blocks reduce the fanout factor, increasing btree height and the reservation size. So block size implies a trade-off between write amplification and metadata size. Bigger blocks, smaller metadata, more write ampl. Smaller blocks, more metadata, and less write ampl. Let's disable both `rmapbt` and `reflink` since we replicate data, and we can afford to rebuild a replica on local corruption. Fixes: https://github.com/scylladb/scylladb/issues/22028 Closes scylladb/scylladb#22072	2025-01-07 13:18:21 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Kefu Chai	a9c244ddf7	dist: scylla_io_setup: use raw string to avoid invalid escape sequence Use raw string literals to prevent syntax warnings when using regular expressions with backslash-based patterns. The original code triggered a SyntaxWarning in developer mode (`python3 -Xdev`) due to unescaped backslash characters in regex patterns like '\s'. While CPython typically interprets these silently, strict Python parsing modes raise warnings about potentially unintended escape sequences. This change adds the `r` prefix to string literals containing regex patterns, ensuring consistent behavior across different Python runtime configurations and eliminating unnecessary syntax warning like: ``` /opt/scylladb/scripts/libexec/scylla_io_setup:41: SyntaxWarning: invalid escape sequence '\s' pattern = re.compile(_nocomment + r"CPUSET=\s\"" + _reopt(_cpuset) + _reopt(_smp) + "\s\"") ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21839	2024-12-09 19:18:39 +03:00
Takuya ASADA	2b3115ac79	scylla-server.service: drop scylla-jmx.service Since we dropped scylla-jmx at `3cd2a61`, Wants=scylla-jmx.service is not needed anymore. Also we have issue on nonroot mode installation with this line (#21720), we need to drop this now. Fixes #21720 Closes scylladb/scylladb#21721	2024-12-01 14:14:33 +02:00
Takuya ASADA	6fe09a5a16	scylla_raid_setup: support installing semanage on Amazon Linux 2 Since Amazon Linux 2 has a different package name for semanage, we need to adjust the package name. Fixes #21351	2024-11-11 17:27:24 +09:00
Takuya ASADA	7ad5e69c54	scylla_raid_setup: fix failure on SELinux package installation After merging `5a470b2`, we found that scylla_raid_setup fails on offline mode installation. This is because pkg_install() just prints an error and exits the script in offline mode instead of installing packages, since offline mode is not supposed to be able to connect to the internet. It seems to occur because of the missing "policycoreutils-python-utils" package, which provides the "semanage" command. So we need to implement the relabeling patch without using the command. Fixes #21441	2024-11-11 17:27:24 +09:00
Kefu Chai	961a53f716	dist: systemd: use default KillMode before this change, we specify the KillMode of the scylla-service service unit explicitly to "process". according to https://www.freedesktop.org/software/systemd/man/latest/systemd.kill.html, > If set to process, only the main process itself is killed (not recommended!). and the document suggests using "control-group" over "process". but scylla server is not a multi-process server, it is a multi-threaded server. so it should not make any difference even if we switch to the recommended "control-group". in the light that we've been seeing "defunct" scylla process after stopping the scylla service using systemd. we are wondering if we should try to change the `KillMode` to "control-group", which is the default value of this setting. in this change, we just drop the setting so that systemd stops the service by stopping all processes in the control group of this unit. Refs scylladb/scylladb#21507 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21508	2024-11-09 20:07:11 +02:00
Yaniv Michael Kaul	4c5e102aee	node_exporter: use fewer collectors Remove unused / less useful collectors by default. While it doesn't seem to reduce memory usage, it may reduce potential performance or security issues in the future. This is what we are left with (snippet of log when loading node exporter manually with the changed command line): ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:111 level=info msg="Enabled collectors" ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=arp ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=bonding ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=conntrack ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=cpu ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=cpufreq ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=diskstats ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=dmi ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=edac ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=entropy ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=filefd ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=filesystem ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=interrupts ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=loadavg ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=mdadm ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=meminfo ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=netclass ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=netdev ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=netstat ts=2024-11-03T15:41:06.855Z 
caller=node_exporter.go:118 level=info collector=nvme ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=os ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=pressure ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=schedstat ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=selinux ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=sockstat ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=softnet ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=stat ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=textfile ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=time ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=timex ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=uname ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=vmstat ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=watchdog ts=2024-11-03T15:41:06.855Z caller=node_exporter.go:118 level=info collector=xfs Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Improvement, no need to backport. Closes scylladb/scylladb#21419	2024-11-05 10:41:09 +03:00
Tomasz Grabiec	850d9cfb59	node-exporter: Disable hwmon collector This collector reads nvme temperature sensor, which was observed to cause bad performance on Azure cloud following the reading of the sensor for ~6 seconds. During the event, we can see elevated system time (up to 30%) and softirq time. CPU utilization is high, with nvm_queue_rq taking several orders of magnitude more time than normally. There are signs of contention, we can see __pv_queued_spin_lock_slowpath in the perf profile, called. This manifests as latency spikes and potentially also throughput drop due to reduced CPU capacity. By default, the monitoring stack queries it once every 60s. Closes scylladb/scylladb#21165	2024-10-27 21:59:15 +02:00
Kefu Chai	947d9d5a97	scylla_coredump_setup: fix typos in comment these typos were identified by the codespell workflow. and fixed a syntax error along the way. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#20877	2024-09-30 13:29:34 +03:00
Avi Kivity	5a470b2bfb	Merge 'scylla_raid_setup: configure SELinux file context' from Takuya ASADA On RHEL9, systemd-coredump fails to coredump on /var/lib/scylla/coredump because the service only has write access with systemd_coredump_var_lib_t. To make it writable, we need to add file context rule for /var/lib/scylla/coredump, and run restorecon on /var/lib/scylla. Fixes #19325 Closes scylladb/scylladb#20528 * github.com:scylladb/scylladb: scylla_raid_setup: configure SELinux file context scylla_coredump_setup: fix SELinux configuration for RHEL9	2024-09-29 12:53:00 +03:00
Takuya ASADA	3cd2a61736	dist: drop scylla-jmx Since JMX server is deprecated, drop them from submodule, build system and package definition. Related scylladb/scylla-tools-java#370 Related #14856 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Closes scylladb/scylladb#17969	2024-09-13 07:59:45 +03:00
Takuya ASADA	0ac450de05	scylla_raid_setup: configure SELinux file context On RHEL9, systemd-coredump fails to coredump on /var/lib/scylla/coredump because the service only has write access with systemd_coredump_var_lib_t. To make it writable, we need to add file context rule for /var/lib/scylla/coredump, and run restorecon on /var/lib/scylla. Fixes #20573	2024-09-13 04:31:52 +09:00
Takuya ASADA	56c971373c	scylla_coredump_setup: fix SELinux configuration for RHEL9 It seems that a specific version of the systemd package on RHEL9 has a bug in its SELinux configuration: it introduced the "systemd-container-coredump" module to provide rules for systemd-coredump, but the module is not enabled by default. We have to load it manually, otherwise it causes a permission error. Fixes #19325	2024-09-13 04:31:16 +09:00
Takuya ASADA	e36c939505	dist: tune LimitNOFILES for large nodes On very large nodes, LimitNOFILES=80000 may not be large enough, which can cause a "Too many files" error. To avoid that, let's increase LimitNOFILES at the scylla_setup stage, generating an optimal value calculated from memory size and number of cpus. Closes scylladb/scylla-enterprise#4304 Closes scylladb/scylladb#20443	2024-09-09 14:13:49 +03:00
Kefu Chai	a06e1c6545	scylla-housekeeping: use raw string to avoid using escape sequence before this change, when running `scylla-housekeeping`: ``` /opt/scylladb/scripts/libexec/scylla-housekeeping:122: SyntaxWarning: invalid escape sequence '\s' match = re.search(".http.?://repositories./scylladb/([^/\s]+)/./([^/\s]+)/scylladb-.", line) ``` we could have the warning above. because `\s` is not a valid escape sequence, but the Python interpreter accepts it as two separated characters of `\s` after complaining. but it's still annoying. so, let's use a raw string here. Refs scylladb/scylladb#20317 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#20359	2024-09-01 18:59:23 +03:00
Kefu Chai	03ab80501f	tools/scylla-nodetool: add restore integration as we have an API for restore a keyspace / table, let's expose this feature with nodetool. so we can exercise it without the help of scylla-manager or 3rd-party tools with a user-friendly interface. in this change: * add a new subcommand named "restore" to nodetool * add test to verify its interaction with the API server * update the document accordingly. * the bash completion script is updated accordingly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-08-28 15:42:49 +03:00
Kefu Chai	969cbb75ce	tools/scylla-nodetool: add backup integration as we have an API for backup a keyspace, let's expose this feature with nodetool. so we can exercise it without the help of scylla-manager or 3rd-party tools with a user-friendly interface. in this change: * add a new subcommand named "backup" to nodetool * add test to verify its interaction with the API server * add two more route to the REST API mock server, as the test is using /task_manager/wait_task/{task_id} API. for the sake of completeness, the route for /task_manager/{part1} is added as well. * update the document accordingly. * the bash completion script is updated accordingly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-08-22 19:48:06 +03:00
Yaron Kaikov	8221a178d8	Revert "dist: support nonroot and offline mode for scylla-housekeeping" This reverts commit `c3bea539b6`, since it breaks offline-installer artifact-tests. Also, it seems that we should not have merged it in the first place, since we don't need scylla-housekeeping checks for offline-installer Closes scylladb/scylladb#19976	2024-08-04 10:55:26 +03:00
Takuya ASADA	02b20089cb	scylla_raid_setup: install update-initramfs when it's not available scylla_raid_setup may fail on Ubuntu minimal image since it calls update-initramfs without installing. Closes scylladb/scylladb#19651	2024-07-24 11:55:16 +03:00
Takuya ASADA	c3bea539b6	dist: support nonroot and offline mode for scylla-housekeeping Introduce support nonroot and offline mode for scylla-housekeeping. Closes #13084 Closes scylladb/scylladb#13088	2024-07-23 07:57:32 +03:00
Takuya ASADA	373a7825b5	scylla-housekeeping: fix exception on parsing version string Since Python 3.12, version parsing becomes strict, parse_version() does not accept the version string like '6.1.0~dev'. To fix this, we need to pass acceptable version string to parse_version() like '6.1.0.dev0', which is allowed on Python version scheme. Also, release candidate version like '6.0.0~rc3' has same issue, it should be replaced to '6.0.0rc3' to compare in parse_version(). reference: https://packaging.python.org/en/latest/specifications/version-specifiers/ Fixes #19564 Closes scylladb/scylladb#19572	2024-07-12 03:23:34 +09:00
Takuya ASADA	db04f8b16e	Revert "scylla-housekeeping: fix exception on parsing version string" This reverts commit `65fbf72ed0`, since it breaks scylla-housekeeping and SCT because the patch modified version string. We shouldn't modify version string directly, need to pass modified string just for parse_version() instead.	2024-07-12 03:23:34 +09:00
Takuya ASADA	cbf33aba5c	scylla_coredump_setup: install systemd-coredump before has_zstd() On Ubuntu/Debian, we have to install systemd-coredump before running has_zstd(), since it detects ZSTD support by running coredumpctl. Move pkg_install('systemd-coredump') to the head of the script. Fixes #19643 Closes scylladb/scylladb#19648	2024-07-08 15:04:34 +03:00
Takuya ASADA	09e22690dc	scylla_coredump_setup: enable compress by default when zstd support detected We disabled coredump compression by default because it was too slow, but recent versions of systemd-coredump supports faster zstd based compression, so let's enable compression by default when zstd support detected. Related scylladb/scylla-machine-image#462 Closes scylladb/scylladb#18854	2024-07-04 10:38:22 +03:00
Takuya ASADA	65fbf72ed0	scylla-housekeeping: fix exception on parsing version string Since Python 3.12, version parsing becomes strict, parse_version() does not accept the version string like '6.1.0~dev'. To fix this, we need to replace version string from '6.1.0~dev' to '6.1.0.dev0', which is allowed on Python version scheme. reference: https://packaging.python.org/en/latest/specifications/version-specifiers/ Fixes #19564 Closes scylladb/scylladb#19572	2024-07-04 10:27:51 +03:00
Yaniv Michael Kaul	9b0eb82175	dist/common/scripts/scylla_coredump_setup: fix typo Does not able -> Unable Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#19328	2024-06-17 17:33:46 +03:00
Kefu Chai	4c1006a5bb	dist: s/SafeConfigParser/ConfigParser/ `SafeConfigParser` was renamed to `ConfigParser` in Python 3.2, and Python warns us: > scylla-housekeeping:183: DeprecationWarning: The SafeConfigParser > class has been renamed to ConfigParser in Python 3.2. This alias will > be removed in Python 3.12. Use ConfigParser directly instead. see https://docs.python.org/3.2/library/configparser.html#configparser.ConfigParser and https://docs.python.org/3.1/library/configparser.html#configparser.SafeConfigParser Fixes #13046 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#19285	2024-06-14 09:59:22 +03:00
Avi Kivity	dffd0901b3	dist: scylla_util: sysconfig_parser: replace deprecated ConfigParser.readfp ConfigParser.readfp was deprecated in Python 3.2 and removed in Python 3.12. Under Fedora 40, the container fails to launch because it cannot parse its configuration. Fix by using the newer read_file(). Closes scylladb/scylladb#19236	2024-06-12 10:07:10 +03:00
Patryk Wrobel	ec820e214c	scylla_io_setup: ensure correct RLIMIT_NOFILE for iotune The default limit of open file descriptors per process may be too small for iotune on certain machines with a large number of cores. In such cases iotune reports failure due to inability to create files or to set up the seastar framework. This change configures the limit of open file descriptors before running iotune to ensure that the failure does not occur. The limit is set via 'resource.setrlimit()' in the parent process. The limit is then inherited by the child process. Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#18546	2024-05-13 08:35:52 +03:00
Takuya ASADA	9538af0d95	scylla_kernel_check: fix block device size error on latest mkfs.xfs On latest mkfs.xfs, it does not allow to format a block device which is smaller than 300MB. There are options to ignore this validation but it is unsupported feature, so it is better to increase the loopback image size to "supported size" == 300MB. reference: https://lore.kernel.org/all/164738662491.3191861.15611882856331908607.stgit@magnolia/ Fixes #18568 Closes scylladb/scylladb#18620	2024-05-13 07:23:29 +03:00
Yaron Kaikov	2cf7cc1ea5	scylla_setup: Remove jmx and tools packages from being verified Following `b8634fb244` machine image started to fail with the following error: ``` 10:44:59 googlecompute.gce: scylla-jmx package is not installed. 10:44:59 ==> googlecompute.gce: Traceback (most recent call last): 10:44:59 ==> googlecompute.gce: File "/home/ubuntu/scylla_install_image", line 135, in <module> 10:44:59 ==> googlecompute.gce: run('/opt/scylladb/scripts/scylla_setup --no-coredump-setup --no-sysconfig-setup --no-raid-setup --no-io-setup --no-ec2-check --no-swap-setup --no-cpuscaling-setup --no-ntp-setup', shell=True, check=True) 10:44:59 ==> googlecompute.gce: File "/usr/lib/python3.10/subprocess.py", line 526, in run 10:44:59 ==> googlecompute.gce: raise CalledProcessError(retcode, process.args, 10:44:59 ==> googlecompute.gce: subprocess.CalledProcessError: Command '/opt/scylladb/scripts/scylla_setup --no-coredump-setup --no-sysconfig-setup --no-raid-setup --no-io-setup --no-ec2-check --no-swap-setup --no-cpuscaling-setup --no-ntp-setup' returned non-zero exit status 1. ``` It seems we no longer need to verify that jmx and tools-java packages are installed. Closes scylladb/scylladb#18494	2024-05-02 13:30:50 +03:00
Botond Dénes	7cbe5c78b4	install.sh: use the native nodetool directly * tools/java b810e8b00e...4ee15fd9ea (1): > install.sh: don't install nodetool into /usr/bin Add a bin/nodetool and install it to bin/ in install.sh. This script simply forwards to scylla nodetool and it is the replacement for the Java nodetool, which is dropped from the java-tools's install.sh, in the submodule update also included in this patch. With this change, we now hardwire the usage of the native nodetool, as the nodetool, with the intermediary nodetool wrapper script removed from the picture. Bash completion was copied from the java tools repository and it is now installed by the scylla package, together with nodetool. The Java nodetool is still available as a fall-back, in case the native nodetool has problems, at the path of /opt/scylladb/share/cassandra/bin/nodetool. Testing I tested upgrades on a DEB and RPM distro: Ubuntu and Fedora. First I installed scylla-5.4, then I installed the packages for this PR. On Ubuntu, I had to use dpkg -i --auto-deconfigure, otherwise, dpkg would refuse to install the new packages because they break the old ones. No extra flags were required on Fedora. In both cases, /usr/bin/nodetool was changed from a thunk calling the Java nodetool (from 5.4) to the native launcher script from this PR. /opt/scylladb/share/cassandra/bin/nodetool remained in place and still works after the upgrade. I also verified that --nonroot installs also work. Nodetool works both when called with an absolute path, or when ~/scylladb/bin is added to $PATH. Fixes: #18226 Fixes: #17412 Closes scylladb/scylladb#18255 [avi: reset submodule to actual hash we ended up with]	2024-04-25 22:52:00 +03:00
Yaniv Kaul	2ce2649ec1	Typo: you -> your Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#17806	2024-04-04 14:55:46 +03:00
Kefu Chai	ab07fb25f5	scylla_raid_setup: reference xfsprog on the minimal 1024 block size the quote of "The minimum block size for crc enabled filesystems is 1024" comes from the output of mkfs.xfs, let's quote the source for better maintainability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17094	2024-02-14 08:44:14 +02:00
Kefu Chai	cd3c7a50ed	scylla_raid_setup: drop unused import Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17095	2024-02-02 15:20:40 +01:00
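
Note on the raw-string commits listed above (`fdcaa9a7e7`, `a9c244ddf7`, `a06e1c6545`): the sketch below illustrates why adding the `r` prefix silences the "invalid escape sequence" warning. It is not code from the repository; the helper `compiles_cleanly` and the two sample one-line programs are assumptions made for illustration only.

```python
import warnings

# Two one-line programs that differ only in the `r` prefix (illustrative
# samples, not taken from the scylla scripts). The outer literals are raw
# so that the backslash reaches compile() untouched.
PLAIN_SOURCE = r'pattern = "CPUSET=\s"'
RAW_SOURCE = r'pattern = r"CPUSET=\s"'


def compiles_cleanly(source: str) -> bool:
    """Return True if `source` compiles without a warning or syntax error."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones hidden by default filters.
        warnings.simplefilter("always")
        try:
            compile(source, "<example>", "exec")
        except SyntaxError:
            return False
    return not caught


# Both spellings denote the same two characters: a backslash and "s".
assert r"\s" == "\\s"

print("plain literal compiles cleanly:", compiles_cleanly(PLAIN_SOURCE))
print("raw literal compiles cleanly:", compiles_cleanly(RAW_SOURCE))
```

The regex engine receives the same pattern in both cases, so the change is about the literal's spelling rather than the matching behavior.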


791 Commits
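
Note on the scylla-housekeeping version-parsing commits (`65fbf72ed0`, reverted by `db04f8b16e`, then redone in `373a7825b5`): the messages describe converting '6.1.0~dev' to '6.1.0.dev0' and '6.0.0~rc3' to '6.0.0rc3' only for the value handed to parse_version(), leaving the original string unchanged. A minimal sketch of that idea follows. The function name `version_for_parsing` is hypothetical, only the two forms mentioned in the commit messages are handled, and parse_version() itself is not called so that the example stays self-contained.

```python
def version_for_parsing(version: str) -> str:
    """Return a copy of `version` in the form the commit messages describe.

    Hypothetical helper: it covers only the '~dev' and '~rcN' suffixes
    named in the commit messages and returns a new string, so the
    caller's original version string stays as it was.
    """
    return version.replace("~dev", ".dev0").replace("~rc", "rc")


packaged = "6.1.0~dev"
comparable = version_for_parsing(packaged)

# The packaged string is still available unchanged for anything that
# reports or stores it; only `comparable` would go to the parser.
print(packaged, "->", comparable)
print("6.0.0~rc3", "->", version_for_parsing("6.0.0~rc3"))
```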