Commit Graph

1642 Commits

Author SHA1 Message Date
artem.penner
9898e5700b scylla-node-exporter: Add systemd collector to node exporter
This PR enables the node_exporter systemd collector and configures the unit whitelist to include scylla-server.service and systemd-coredump services.

**Motivation**: We currently lack visibility into system-level service states, which is critical for diagnosing stability issues.

This configuration enables two specific use cases:
- Detecting Coredump Loops: We encounter scenarios where ScyllaDB enters a restart loop. To pinpoint SIGSEGV (coredumps) as the root cause, we need to track when the systemd-coredump service becomes active, indicating a dump is being processed.
- Identifying Startup Failures: We need to detect when the scylla-server unit enters a failed state. This is essential for catching unrecoverable errors (e.g., corrupted commitlogs or configuration bugs) that prevent the server from starting.

example of promql queries:
- `node_systemd_unit_state{name=~"systemd-coredump@.*", state="active"} == 1`
- `node_systemd_unit_state{name="scylla-server.service", state="failed"} == 1`

Closes #28402
2026-03-20 08:39:56 +02:00
Avi Kivity
5ae40caa6d dist: tune tcp_mem to 3% of total memory in scylla-kernel-conf package
tcp_mem defaults to 9% of total memory. ScyllaDB defaults to 93%. The
sum is more than 100%.

Fix by tuning tcp_mem to 3% of total memory.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-734

Closes scylladb/scylladb#28700
2026-03-05 12:51:04 +03:00
Grzegorz Burzyński
b4f0eb666f packaging: add systemctl command to dependencies
scylladb/scylla container image doesn't include systemctl binary, while it
is used by perftune.py script shipped within the same image.

Scylla Operator runs this script to tune Scylla nodes/containers,
expecting its all dependencies to be available in the container's PATH.
Without systemctl, the script fails on systems that run irqbalance
(e.g., on EKS nodes) as the script tries to reconfigure irqbalance and
restart it via systemctl afterwards.

Fixes: scylladb/scylla-operator#3080

Closes scylladb/scylladb#28567
2026-02-25 10:19:32 +02:00
Botond Dénes
0bf4c68af5 Merge 'docs: fix link to docker build readme in the README.MD' from Marcin Szopa
Links were pointing to the `debian` subdirectory. However, there docker build was refactored to use `redhat`: 1abf981a73, see https://github.com/scylladb/scylladb/pull/22910

No backport, just a README link fixes.

Closes scylladb/scylladb#28699

* github.com:scylladb/scylladb:
  docs: fix path to the build_docker.sh which was moved from debian to redhat subdirectory
  docs: fix link to docker build README.MD
2026-02-20 08:21:46 +02:00
Avi Kivity
58a662b9db dist: refresh container base image (ubi9-minimal)
Using an outdated image can cause problems when `microdnf update`
runs, if the distribution doesn't maintain good update hygiene.
Although, I suspect that when update failures happen they're really
caused by propagation delay of packages to mirrors.

Fix by using --pull=always to get a fresh image.

Ref https://scylladb.atlassian.net/browse/SCYLLADB-714

Closes scylladb/scylladb#28680
2026-02-19 12:42:43 +03:00
Marcin Szopa
9217f85e99 docs: fix path to the build_docker.sh which was moved from debian to redhat subdirectory 2026-02-18 12:19:27 +01:00
Israel Fruchter
6b3ce5fdcc dist: scylla_coredump_setup: force unmount /var/lib/systemd/coredump before setup
When setting up coredump handling, if there are old mounts in a deleted state (e.g. from an older installation),
systemd might fail to activate the new `.mount` unit properly because it assumes the path is already mounted.

Explicitly unmount `/var/lib/systemd/coredump` before proceeding with the setup to ensure a clean state.

Fix: scylladb/scylla-enterprise#5692

Closes scylladb/scylladb#28300
2026-01-22 14:35:26 +02:00
Yaniv Michael Kaul
af8eaa9ea5 scripts: fixes flagged by CodeQL/PyLens
Unused imports, unused variables and such.
Initially, there were no functional changes, just to get rid of some standard CodeQL warnings.

I've then broken the CI, as apparently there's a install time(!?) Python script creation for the sole purpose of product
naming. I changed it - we have it in etcdir, as SCYLLA-PRODUCT-FILE.
So added (copied from a different script) a get_product() helper function in scylla_util.py and used it instead.

While at it, also fixed the too broad import from scylla_util, which 'forced' me to also fix other specific imports (such as shutil).

Improvement - no need to backport.
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#27883
2026-01-09 15:13:12 +02:00
Yaron Kaikov
1ee89c9682 Revert "scripts: benign fixes flagged by CodeQL/PyLens"
This reverts commit 377c3ac072.

This breaks all artifact tests and cloud image build process

Closes scylladb/scylladb#27881
2025-12-28 09:49:49 +02:00
Yaniv Michael Kaul
377c3ac072 scripts: benign fixes flagged by CodeQL/PyLens
Unused imports, unused variables and such.
No functional changes, just to get rid of some standard CodeQL warnings.

Benign - no need to backport.
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#27801
2025-12-24 13:08:24 +02:00
Amnon Heiman
a213e41250 scylla-node-exporter: Add ethtool to node exporter
AWS suggests following multiple network performance metrics:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-network-performance-ena.html#network-performance-metrics

This patch enables the ethtool collector with the specific list of
metrics

Ater this patch the relevant metris looks like:

$ curl http://localhost:9100/metrics |& grep ethtool
node_ethtool_bw_in_allowance_exceeded{device="ens5"} 0
node_ethtool_bw_out_allowance_exceeded{device="ens5"} 0
node_ethtool_conntrack_allowance_available{device="ens5"} 51303
node_ethtool_conntrack_allowance_exceeded{device="ens5"} 0
node_ethtool_info{bus_info="0000:00:05.0",device="ens5",driver="ena",expansion_rom_version="",firmware_version="",version="6.14.0-1015-aws"} 1
node_ethtool_linklocal_allowance_exceeded{device="ens5"} 0
node_scrape_collector_duration_seconds{collector="ethtool"} 0.001091436
node_scrape_collector_success{collector="ethtool"} 1

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

Closes scylladb/scylladb#27358
2025-12-08 14:27:10 +02:00
Avi Kivity
b82f92b439 main: replace p11-kit hack for trust paths override with gnutls hack
p11-kit has hardcoded paths for the trust paths. Of course, each
Linux distribution hardcodes those paths differently. As a result,
our relocatable gnutls, which uses p11-kit-trust.so to process the
trust paths, needs some overrides to select the right paths.

Currently, we use p11_kit_override_system_files(), a p11-kit API
intended for testing, but which worked well enough for our purpose,
to override the trust module configuration.

Unfortunately, starting (presumably [1]) in gnutls 3.8.11, gnutls
changed how it works with p11-kit and our override is now ignored.

This was likely unintentional, but there appears to be a better way:
instead of letting gnutls auto-load the trust module from a hacked
configuration, we load the modules outselves using
gnutls_pkcs11_init(GNUTLS_PKCS11_FLAG_MANUAL) and
gnutls_pkcs11_add_provider(). These appear to be intended for the purpose.

We communicate the paths to the scylla executable using an environment
variable. This isn't optimal, but is much easier than adding a command
line variable since there are multiple levels of command line parsing due
to the subtool mechanism.

With this, we unlock the possibility to upgrade gnutls to newer versions.

[1] aa5f15a872

Closes scylladb/scylladb#27348
2025-12-04 11:33:51 +02:00
Avi Kivity
1f6e3301e7 dist: systemd: drop deprecated CPU and I/O shares/weight from scylla-server.slice
The BlockIOWeight and CPUShares are deprecated. They are only used on
RHEL 7, which has reached end-of-life. Their replacements, IOWeight
and CPUWeight, are already set in the file.

Remove the deprecated settings to reduce noise in the logs.

Closes scylladb/scylladb#27222
2025-11-26 06:42:11 +02:00
Yehuda Lebi
a05ebbbfbb dist/docker: add configurable blocked-reactor-notify-ms parameter
Add --blocked-reactor-notify-ms argument to allow overriding the default
blocked reactor notification timeout value of 25 ms.

This change provides users the flexibility to customize the reactor
notification timeout as needed.

Fixes: scylladb/scylla-enterprise#5525

Closes scylladb/scylladb#26892
2025-11-11 12:38:40 +02:00
Avi Kivity
8e480110c2 dist: housekeeping: set python.multiprocessing fork mode to "fork"
Python 3.14 changed the multiprocessing fork mode to "forkserver",
presumably for good reasons. However, it conflicts with our
relocatable Python system. "forkserver" forks and execs a Python
process at startup, but it does this without supplying our relocated
ld.so. The system ld.so detects a conflict and crashes.

Fix this by switching back to "fork", which is sufficient for
housekeeping's modest needs.

Closes scylladb/scylladb#26831
2025-11-05 15:47:38 +03:00
Takuya ASADA
eb30594a60 dist: detect corrupted NUMA topology information
There are some environment which has corrupted NUMA topology
information, such as some instance types on AWS EC2 with specific Linux
kernel images.
On such environment, we cannot get HW information correctly from hwloc,
so we cannot proceed optimization on perftune.
To avoid causing script error, check NUMA topology information and skip
running perftune if the information corrupted.

Related scylladb/seastar#2925

Closes scylladb/scylladb#26344
2025-10-22 01:11:14 +03:00
Avi Kivity
bb02295695 setup: add the lazytime XFS mount option
In f828fe0d59 ("setup: add the lazytime XFS version") we added the
lazytime mount option to /var/lib/scylla, but it was quickly reverted
(8f5e80e61a) as it caused a regression on CentOS 7.

We reinstate it now with a kernel version check. This will avoid
the lazytime mount option on CentOS 7, which is unsupported anyway.

The lazytime option avoids marking the inode as dirty if it's only for the
purpose of updating mtime/ctime. This won't help much while writing sstables
(since the write also updates extent information), but may help a little
with with commitlog writes, since those are pure overwrites.

It likely won't help with the RWF_NOWAIT violations seen in [1], since
those are likely due to in-memory locking, not flushing dirty inodes
to disk.

Tested with an install to Ubuntu 24.04 LTS followed by a scylla_setup run.
The lazytime option was added the the .mount file and showed up in
the live mount.

[1] https://github.com/scylladb/seastar/issues/2974

Closes scylladb/scylladb#26436
Fixes #26002
2025-10-09 15:55:58 +03:00
Robert Bindar
2c74a6981b Make scylla_io_setup detect request size for best write IOPS
We noticed during work on scylladb/seastar#2802 that on i7i family
(later proved that it's valid for i4i family as well),
the disks are reporting the physical sector sizes incorrectly
as 512bytes, whilst we proved we can render much better write IOPS with
4096bytes.

This is not the case on AWS i3en family where the reported 512bytes
physical sector size is also the size we can achieve the best write IOPS.

This patch works around this issue by changing `scylla_io_setup` to parse
the instance type out of `/sys/devices/virtual/dmi/id/product_name`
and run iotune with the correct request size based on the instance type.

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>

Closes scylladb/scylladb#25315
2025-10-08 14:30:52 +03:00
Avi Kivity
5d1846d783 dist: scylla_raid_setup: don't override XFS block size on modern kernels
In 6977064693 ("dist: scylla_raid_setup:
reduce xfs block size to 1k"), we reduced the XFS block size to 1k when
possible. This is because commitlog wants to write the smallest amount
of padding it can, and older Linux could only write a multiple of the
block size. Modern Linux [1] can O_DIRECT overwrite a range smaller than
a filesystem block.

However, this doesn't play well with some SSDs that have 512 byte
logical sector size and 4096 byte physical sector size - it causes them
to issue read-modify-writes.

To improve the situation, if we detect that the kernel is recent enough,
format the filesystem with its default block size, which should be optimal.

Note that commitlog will still issue sub-4k writes, which can translate
to RMW. There, we believe that the amplification is reduced since
sequential sub-physical-sector writes can be merged, and that the overhead
from commitlog space amplification is worse than the RMW overhead.

Tested on AWS i4i.large. fsqual report:

```
memory DMA alignment:    512
disk DMA alignment:      512
filesystem block size:   4096
context switch per write io (size-changing, append, blocksize 4096, iodepth 1): 0.0003 (GOOD)
context switch per write io (size-changing, append, blocksize 4096, iodepth 3): 0.7961 (BAD)
context switch per write io (size-unchanging, append, blocksize 4096, iodepth 3): 0 (GOOD)
context switch per write io (size-unchanging, append, blocksize 4096, iodepth 7): 0.0001 (GOOD)
context switch per write io (size-unchanging, append, blocksize 512, iodepth 1): 0.125 (BAD)
context switch per write io (size-unchanging, overwrite, blocksize 512, iodepth 1): 0 (GOOD)
context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 1): 0 (GOOD)
context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD)
context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD)
context switch per write io (size-changing, append, blocksize 4096, iodepth 1): 0 (GOOD)
context switch per write io (size-changing, append, blocksize 4096, iodepth 3): 0.8006 (BAD)
context switch per write io (size-unchanging, append, blocksize 4096, iodepth 3): 0.0001 (GOOD)
context switch per write io (size-unchanging, append, blocksize 4096, iodepth 7): 0 (GOOD)
context switch per write io (size-unchanging, append, blocksize 512, iodepth 1): 0.125 (BAD)
context switch per write io (size-unchanging, overwrite, blocksize 512, iodepth 1): 0 (GOOD)
context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 1): 0 (GOOD)
context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD)
context switch per read io (size-changing, append, blocksize 512, iodepth 30): 0 (GOOD)
```

The sub-block overwrite cases are GOOD.

In comparison, the fsqual report for 1k (similar):

```
memory DMA alignment:    512
disk DMA alignment:      512
filesystem block size:   1024
context switch per write io (size-changing, append, blocksize 1024, iodepth 1): 0.0005 (GOOD)
context switch per write io (size-changing, append, blocksize 1024, iodepth 3): 0.7948 (BAD)
context switch per write io (size-unchanging, append, blocksize 1024, iodepth 3): 0.0015 (GOOD)
context switch per write io (size-unchanging, append, blocksize 1024, iodepth 7): 0.0022 (GOOD)
context switch per write io (size-unchanging, append, blocksize 512, iodepth 1): 0.4999 (BAD)
context switch per write io (size-unchanging, overwrite, blocksize 512, iodepth 1): 0 (GOOD)
context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 1): 0 (GOOD)
context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD)
context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD)
context switch per write io (size-changing, append, blocksize 1024, iodepth 1): 0 (GOOD)
context switch per write io (size-changing, append, blocksize 1024, iodepth 3): 0.798 (BAD)
context switch per write io (size-unchanging, append, blocksize 1024, iodepth 3): 0.0012 (GOOD)
context switch per write io (size-unchanging, append, blocksize 1024, iodepth 7): 0.0019 (GOOD)
context switch per write io (size-unchanging, append, blocksize 512, iodepth 1): 0.5 (BAD)
context switch per write io (size-unchanging, overwrite, blocksize 512, iodepth 1): 0 (GOOD)
context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 1): 0 (GOOD)
context switch per write io (size-unchanging, overwrite, blocksize 512, O_DSYNC, iodepth 3): 0 (GOOD)
context switch per read io (size-changing, append, blocksize 512, iodepth 30): 0 (GOOD)
```

Fixes #25441.

[1] ed1128c2d0

Closes scylladb/scylladb#25445
2025-09-30 17:14:36 +03:00
Yaron Kaikov
0a025d121f packaging: Add adduser as dependnacy
As `adduser` command is being used by `/var/lib/dpkg/info/scylla-server.postinst` and similar during rpm post-install.

Fixes: https://github.com/scylladb/scylladb/issues/23722

Closes scylladb/scylladb#25928
2025-09-10 21:51:25 +03:00
Yaron Kaikov
d57741edc2 build_docker.sh: enable debug symboles installation
Adding the latest scylla.repo location to our docker container, this
will allow installation scylla-debuginfo package in case it's needed

Fixes: https://github.com/scylladb/scylladb/issues/24271

Closes scylladb/scylladb#25646
2025-09-08 18:39:27 +03:00
Michael Litvak
25fb3b49fa dist/docker: add dc and rack arguments
add --dc and --rack commandline arguments to the scylla docker image, to
allow starting a node with a specified dc and rack names in a simple
way.

This is useful mostly for small examples and demonstrations of starting
multiple nodes with different racks, when we prefer not to bother with
editing configuration files. The ability to assign nodes to different
racks is especially important with RF=Rack enforcing.

The previous method to achieve this is to set the snitch to
GossipingPropertyFileSnitch and provide a configuration file in
/etc/scylla/cassandra-rackdc.properties with the name of the dc and
rack.

The new dc and rack parameters are implemented similarly by using the
snitch GossipingPropertyFileSnitch and writing the dc and rack values to
the rackdc properties file. We don't support passing the parameters
together with a different snitch, or when mounting a properties file
from the host, because we don't want to overwrite it.

Example:
docker run -d --name scylla1 scylladb/scylla --dc my_dc1 --rack my_rack1

Fixes scylladb/scylladb#23423

Closes scylladb/scylladb#25607
2025-08-24 17:48:07 +03:00
Taras Veretilnyk
15e3980693 docs: Add documentation for the nodetool dropquarantinedsstables command
Fixes scylladb/scylladb#19061
2025-08-01 11:46:33 +02:00
Yaron Kaikov
fdcaa9a7e7 dist/common/scripts/scylla_sysconfig_setup: fix SyntaxWarning: invalid escape sequence
There are invalid escape sequence warnings where raw strings should be used for the regex patterns

Fixes: https://github.com/scylladb/scylladb/issues/24915

Closes scylladb/scylladb#24916
2025-07-14 11:20:41 +02:00
Yaron Kaikov
66ff6ab6f9 packaging: add ps command to dependancies
ScyllaDB container image doesn't have ps command installed, while this command is used by perftune.py script shipped within the same image. This breaks node and container tuning in Scylla Operator.

Fixes: #24827

Closes scylladb/scylladb#24830
2025-07-13 17:09:05 +03:00
Avi Kivity
07c5edcc30 tools: add patchelf utility
We use patchelf to rewrite the dynamic loader (known as the interpreter)
of the binaries we ship, so we can point to our shipped dynamic loader,
which is compatible with our binaries, rather than rely on the distribution's
dynamic loader, which is likely to be incompatible.

Upstream patchelf losing compatibity [1] with Linux 5.17 and below.
This change was also picked up by Fedora 42, so we cannot update the
toolchain to that distribution until we have an alternative.

Here we add a minimal patchelf alternative. It was mostly written by
Claude. It is minimal in that it only supports --set-interpreter and
--print-interpreter, and works well enough for our needs. We still use
the original patchelf for --remove-rpath; this reduces our maintenance
needs.

[1] 43b75fbc9f
[2] 4b015255d1

Closes scylladb/scylladb#24695
2025-06-30 07:24:05 +03:00
Yaniv Michael Kaul
198ecd8039 Do not perform blkdiscard by default on the disks during RAID setup.
This is not needed on clean disks, which is often the case with cloud instances, but can be useful on bare metal servers with disks that were used before.
Therefore, the default is to skip blkdiscard operation, which makes overall installation faster.
If the user wishes to run it anyway, use the newly introduced --blkdiscard option of scylla_raid_setup to perform it.

Note: since we either perform online discard or schedule fstrim, the (previously used) space will gradually get trimmed, this way or another.

Fixes: https://github.com/scylladb/scylladb/issues/24470
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#24579
2025-06-26 12:25:38 +02:00
Avi Kivity
37f9cf6de6 dist: rpm: override %_sbindir for Fedora 42
Fedora 42 merged /usr/sbin into /usr/bin [1]. As part of that change
the rpm macro %_sbindir was redefined from /usr/sbin to /usr/bin. As
a result RPM build on Fedora 42 fails: install.sh places some files
into /usr/sbin, while rpmbuild looks for them in /usr/bin.

We could resolve this either by following the change and moving
the files to /usr/bin as well, or fixing the spec to place the files
in /usr/sbin. The former is more difficult:
 - what about Debian/Ubuntu?
 - what about older RPM-based distributions (like all RHEL distributions)?
 - what about scripts that hard-code /usr/sbin/<scylla utility>?

So we pick the latter, and redefine %_sbindir to /usr/sbin. Since that
directory still exists (as a symlink), installation on systems with
merged /usr/bin and /usr/sbin will work.

We'll have to address the problem later (likely by installing to either
/usr/bin or /usr/sbin depending on context), but for now, this is a simple
solution that works everywhere.

[1] https://fedoraproject.org/wiki/Changes/Unify_bin_and_sbin

Closes scylladb/scylladb#24101
2025-05-16 12:05:29 +02:00
Avi Kivity
092a88c9b9 dist: drop the scylla-env package
scylla-env was used to glue together support for older
distributions. It hasn't been used for many years. Remove
it.

Closes scylladb/scylladb#23985
2025-05-09 14:10:00 +03:00
Yaniv Michael Kaul
b374f94b15 pip installation: use --no-cache-dir
There are two reasons we may want NOT to use caching of pip deps:
1. When building a container, unless we specifically clean it up, it'll remain, even when we squash the image layers later.
2. When building a container, that cache is not useful, as we squash our containers later (so that layer is not cached really). And our CI cleans up the layers repo anyway.
3. Caching sometimes isn't great, and doesn't ensure we pick up the exact version (or latest) that we wish to...

This PR changes two locations in Scylla, both of which (also) build containers, so certainly relevant for 1, 2 above and possibly 3.
No real need to backport.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#23822
2025-04-21 13:46:57 +03:00
Avi Kivity
5e1cf90a51 build: replace tools/java submodule with packaged cassandra-stress
We no longer use tools/java (scylladb/scylla-tools-java.git) for
nodetool or cqlsh; only cassandra-stress. Since that is available
in package form install that and excise the tools/java submodule
from the source tree.

pgo/ is adjusted to use the packaged cassandra-stress (and the cqlsh
submodule).

A few jmx references are dropped as well.

Frozen toolchain regenerated.

Optimized clang from

  https://devpkg.scylladb.com/clang/clang-19.1.7-Fedora-41-aarch64.tar.gz
  https://devpkg.scylladb.com/clang/clang-19.1.7-Fedora-41-x86_64.tar.gz

Closes scylladb/scylladb#23698
2025-04-15 10:11:28 +03:00
Takuya ASADA
781dec5852 dist/docker: run the container as non-root user
Since it is requirement for Red Hat OpenShift Certification, we need to
run the container as non-root user.

Related scylladb/scylla-pkg#4858

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
2025-03-05 23:39:56 +09:00
Takuya ASADA
1abf981a73 dist/docker: switch to UBI9
Switch container base image to UBI9, to prepare for Red Hat OpenShift
Certification.

Fixes scylladb/scylla-pkg#4858

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
2025-03-05 23:39:56 +09:00
Takuya ASADA
f2a8ae101b dist/docker: drop hostname package, use Python API
We currently depends on hostname command to get local IP, but we can do
this on Python API.
After the change, we can drop the package.

Closes scylladb/scylladb#22909
2025-02-24 15:03:44 +02:00
Takuya ASADA
b5e306047f dist: fix upgrade error from 2024.1
We need to allow replacing nodetool from scylla-enterprise-tools < 2024.2,
just like we did for scylla-tools < 5.5.
This is required to make packages able to upgrade from 2024.1.

Fixes #22820

Closes scylladb/scylladb#22821
2025-02-13 12:36:24 +02:00
Avi Kivity
5c647408c7 systemd: map libraries close to the executable
The Intel Optimizaton Manual states that branches with relative offsets
greater than 2GB suffer a penalty. They cite a 6% improvement when this
is avoided. Our code doesn't rely heavily on dynamically linked
libraries, so I don't expect a similar win, but it's still better to do
it than not.

Eliminate long branches by asking the dynamic linker to restrict itself
to the lower 4GB of the address space. I saw that it maps libraries
at 1GB+ addresses, so this satisfies the limitation.

Fix is from the Intel Optimization Manual as well.

This change was ported from ScyllaDB Enterprise.

Closes scylladb/scylladb#22498
2025-02-11 22:16:09 +02:00
Avi Kivity
cf72c31617 treewide: improve bash error reporting
bash error handling and reporting is atrocious. Without -e it will
just ignore errors. With -e it will stop on errors, but not report
where the error happened (apart from exiting itself with an error code).

Improve that with the `trap ERR` command. Note that this won't be invoked
on intentional error exit with `exit 1`.

We apply this on every bash script that contains -e or that it appears
trivial to set it in. Non-trivial scripts without -e are left unmodified,
since they might intentionally invoke failing scripts.

Closes scylladb/scylladb#22747
2025-02-10 18:28:52 +03:00
Yaron Kaikov
93f53f4eb8 dist: support smooth upgrade from enterprise to source availalbe
When upgrading for example from `2024.1` to `2025.1` the package name is
not identical casuing the upgrade command to fail:
```
Command: 'sudo DEBIAN_FRONTEND=noninteractive apt-get dist-upgrade scylla -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold"'
Exit code: 100
Stdout:
Selecting previously unselected package scylla.
Preparing to unpack .../6-scylla_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb ...
Unpacking scylla (2025.1.0~dev-0.20250118.1ef2d9d07692-1) ...
Errors were encountered while processing:
/tmp/apt-dpkg-install-JbOMav/0-scylla-conf_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/1-scylla-python3_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/2-scylla-server_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/3-scylla-kernel-conf_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/4-scylla-node-exporter_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/5-scylla-cqlsh_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
Stderr:
E: Sub-process /usr/bin/dpkg returned an error code (1)
```

Adding `Obsoletes` (for rpm) and `Replaces` (for deb)

Fixes: https://github.com/scylladb/scylladb/issues/22420

Closes scylladb/scylladb#22457
2025-02-08 21:56:09 +02:00
Takuya ASADA
fb4c7dc3d8 dist: Support FIPS mode
- To make Scylla able to run in FIPS-compliant system, add .hmac files for
  crypto libraries on relocatable/rpm/deb packages.
- Currently we just write hmac value on *.hmac files, but there is new
  .hmac file format something like this:

  ```
  [global]
  format-version = 1
  [lib.xxx.so.yy]
  path = /lib64/libxxx.so.yy
  hmac = <hmac>
  ```
  Seems like GnuTLS rejects fips selftest on .libgnutls.so.30.hmac when
  file format is older one.
  Since we need to absolute path on "path" directive, we need to generate
  .libgnutls.so.30.hmac in older format on create-relocatable-script.py,

Signed-off-by: Takuya ASADA <syuu@scylladb.com>

Closes scylladb/scylladb#22384
2025-01-26 22:49:21 +02:00
Takuya ASADA
f2a53d6a2c dist: make p11-kit-trust.so able to work in relocatable package
Currently, our relocatable package doesn't contains p11-kit-trust.so
since it dynamically loaded, not showing on "ldd" results
(Relocatable packaging script finds dependent libraries by "ldd").
So we need to add it on create-relocatable-pacakge.py.

Also, we have two more problems:
1. p11 module load path is defined as "/usr/lib64/pkcs11", not
   referencing to /opt/scylladb/libreloc
   (and also RedHat variants uses different path than Debian variants)

2. ca-trust-source path is configured on build time (on Fedora),
   it compatible with RedHat variants but not compatible with Debian
   variants

To solve these problems, we need to override default p11-kit
configuration.
To do so, we need to add an configuration file to
/opt/scylladb/share/pkcs11/modules/p11-kit-trust.module.
Also, ofcause p11-kit doesn't reference /opt/scylladb by default, we
need to override load path by p11_kit_override_system_files().

On the configuration file, we can specify module load path by "modules: <path>",
and also we can specify ca-trust-source path by "x-init-reservied: paths=<path>".

Fixes scylladb/scylladb#13904

Closes scylladb/scylladb#22302
2025-01-15 10:09:17 +02:00
Kefu Chai
f8885a4afd dist/docker,docs: replace "--experimental" with "--experimental-features"
The "--experimental" option was removed in commit f6cca741ea. Using this
deprecated option now causes Scylla to fail with the error:

```
error: the argument ('on') for option '--experimental-features' is invalid
```

So, in this change, let's update the docker entry point script to use
`--experimental-features` command line option instead. The related
document is updated accordingly.

Fixes scylladb/scylladb#22207
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22283
2025-01-14 07:56:38 -05:00
Yaron Kaikov
b74565e83f dist/common/scripts/scylla_raid_setup: reduce XFS metadata overhead
The block size of 1k is significantly increasing metadata overhead with xfs since it reserves space upfront for btree expansion. With CRC disabled, this reservation doesn't happen. Smaller btree blocks reduce the fanout factor, increasing btree height and the reservation size. So block size implies a trade-off between write amplification and metadata size. Bigger blocks, smaller metadata, more write ampl. Smaller blocks, more metadata, and less write ampl.

Let's disable both `rmapbt` and `relink` since we replicate data, and we can afford to rebuild a replica on local corruption.

Fixes: https://github.com/scylladb/scylladb/issues/22028

Closes scylladb/scylladb#22072
2025-01-07 13:18:21 +02:00
Yaron Kaikov
74c5aabd23 build_docker: add option for building container based on Ubuntu Pro
Today our container is based on ubuntu:22.04, we need to build another container based on Ubuntu Pro for FIPS support (currently the latest one is 20.04)

The default docker build process doesn't change, if FIPS is required I have added `--type pro` to build a supported container.

To enable FIPS there is a need to attach an Ubuntu Pro subscription (it will be done as part of https://github.com/scylladb/scylla-pkg/issues/4186)

Closes scylladb/scylladb#21974
2024-12-20 13:09:24 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Yaron Kaikov
3a00ffd2eb build_docker.sh: remove rsyslog installation and conf
It seems that no one is using rsyslog, so there is no point having it
inside our container (see https://github.com/scylladb/scylladb/issues/21923#issuecomment-2545191667)

Refs: https://github.com/scylladb/scylladb/issues/21923

Closes scylladb/scylladb#21953
2024-12-17 17:34:35 +02:00
Kefu Chai
a9c244ddf7 dist: scylla_io_setup: use raw string to avoid invalid escape sequence
Use raw string literals to prevent syntax warnings when using regular
expressions with backslash-based patterns.

The original code triggered a SyntaxWarning in developer mode (`python3 -Xdev`)
due to unescaped backslash characters in regex patterns like '\s'. While
CPython typically interprets these silently, strict Python parsing modes
raise warnings about potentially unintended escape sequences.

This change adds the `r` prefix to string literals containing regex patterns,
ensuring consistent behavior across different Python runtime configurations
and eliminating unnecessary syntax warning like:

```
/opt/scylladb/scripts/libexec/scylla_io_setup:41: SyntaxWarning: invalid escape sequence '\s'
  pattern = re.compile(_nocomment + r"CPUSET=\s*\"" + _reopt(_cpuset) + _reopt(_smp) + "\s*\"")
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21839
2024-12-09 19:18:39 +03:00
Takuya ASADA
2b3115ac79 scylla-server.service: drop scylla-jmx.service
Since we dropped scylla-jmx at 3cd2a61, Wants=scylla-jmx.service is not
needed anymore.

Also we have issue on nonroot mode installation with this line (#21720),
we need to drop this now.

Fixes #21720

Closes scylladb/scylladb#21721
2024-12-01 14:14:33 +02:00
Takuya ASADA
6fe09a5a16 scylla_raid_setup: support installing semanage on Amazon Linux 2
Since Amazon Linux 2 has different package name for semange, we need to
adjust package name.

Fixes #21351
2024-11-11 17:27:24 +09:00
Takuya ASADA
7ad5e69c54 scylla_raid_setup: fix failure on SELinux package installation
After merged 5a470b2, we found that scylla_raid_setup fails on offline mode
installation.
This is because pkg_install() just print error and exit script on offline mode, instead of installing packages since offline mode not supposed able to connect
internet.
Seems like it occur because of missing "policycoreutils-python-utils"
package, which is the package for "semange" command.
So we need to implement the relabeling patch without using the command.

Fixes #21441
2024-11-11 17:27:24 +09:00
Kefu Chai
961a53f716 dist: systemd: use default KillMode
before this change, we specify the KillMode of the scylla-service
service unit explicitly to "process". according to
according to
https://www.freedesktop.org/software/systemd/man/latest/systemd.kill.html,

> If set to process, only the main process itself is killed (not recommended!).

and the document suggests use "control-group" over "process".
but scylla server is not a multi-process server, it is a multi-threaded
server. so it should not make any difference even if we switch to
the recommended "control-group".

in the light that we've been seeing "defunct" scylla process after
stopping the scylla service using systemd. we are wondering if we should
try to change the `KillMode` to "control-group", which is the default
value of this setting.

in this change, we just drop the setting so that the systemd stops the
service by stopping all processes in the control group of this unit
are stopped.

Refs scylladb/scylladb#21507

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21508
2024-11-09 20:07:11 +02:00