Commit Graph

62 Commits

Author SHA1 Message Date
Avi Kivity
07c5edcc30 tools: add patchelf utility
We use patchelf to rewrite the dynamic loader (known as the interpreter)
of the binaries we ship, so we can point to our shipped dynamic loader,
which is compatible with our binaries, rather than rely on the distribution's
dynamic loader, which is likely to be incompatible.

Upstream patchelf losing compatibity [1] with Linux 5.17 and below.
This change was also picked up by Fedora 42, so we cannot update the
toolchain to that distribution until we have an alternative.

Here we add a minimal patchelf alternative. It was mostly written by
Claude. It is minimal in that it only supports --set-interpreter and
--print-interpreter, and works well enough for our needs. We still use
the original patchelf for --remove-rpath; this reduces our maintenance
needs.

[1] 43b75fbc9f
[2] 4b015255d1

Closes scylladb/scylladb#24695
2025-06-30 07:24:05 +03:00
Takuya ASADA
fb4c7dc3d8 dist: Support FIPS mode
- To make Scylla able to run in FIPS-compliant system, add .hmac files for
  crypto libraries on relocatable/rpm/deb packages.
- Currently we just write hmac value on *.hmac files, but there is new
  .hmac file format something like this:

  ```
  [global]
  format-version = 1
  [lib.xxx.so.yy]
  path = /lib64/libxxx.so.yy
  hmac = <hmac>
  ```
  Seems like GnuTLS rejects fips selftest on .libgnutls.so.30.hmac when
  file format is older one.
  Since we need to absolute path on "path" directive, we need to generate
  .libgnutls.so.30.hmac in older format on create-relocatable-script.py,

Signed-off-by: Takuya ASADA <syuu@scylladb.com>

Closes scylladb/scylladb#22384
2025-01-26 22:49:21 +02:00
Takuya ASADA
f2a53d6a2c dist: make p11-kit-trust.so able to work in relocatable package
Currently, our relocatable package doesn't contains p11-kit-trust.so
since it dynamically loaded, not showing on "ldd" results
(Relocatable packaging script finds dependent libraries by "ldd").
So we need to add it on create-relocatable-pacakge.py.

Also, we have two more problems:
1. p11 module load path is defined as "/usr/lib64/pkcs11", not
   referencing to /opt/scylladb/libreloc
   (and also RedHat variants uses different path than Debian variants)

2. ca-trust-source path is configured on build time (on Fedora),
   it compatible with RedHat variants but not compatible with Debian
   variants

To solve these problems, we need to override default p11-kit
configuration.
To do so, we need to add an configuration file to
/opt/scylladb/share/pkcs11/modules/p11-kit-trust.module.
Also, ofcause p11-kit doesn't reference /opt/scylladb by default, we
need to override load path by p11_kit_override_system_files().

On the configuration file, we can specify module load path by "modules: <path>",
and also we can specify ca-trust-source path by "x-init-reservied: paths=<path>".

Fixes scylladb/scylladb#13904

Closes scylladb/scylladb#22302
2025-01-15 10:09:17 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Botond Dénes
7cbe5c78b4 install.sh: use the native nodetool directly
* tools/java b810e8b00e...4ee15fd9ea (1):
  > install.sh: don't install nodetool into /usr/bin

Add a bin/nodetool and install it to bin/ in install.sh. This script
simply forwards to scylla nodetool and it is the replacement for the
Java nodetool, which is dropped from the java-tools's install.sh, in the
submodule update also included in this patch.
With this change, we now hardwire the usage of the native nodetool, as
*the* nodetool, with the intermediary nodetool wrapper script removed
from the picture.
Bash completion was copied from the java tools repository and it is now
installed by the scylla package, together with nodetool.

The Java nodetool is still available as as a fall-back, in case the
native nodetool has problems, at the path of
/opt/scylladb/share/cassandra/bin/nodetool.

Testing

I tested upgrades on a DEB and RPM distro: Ubuntu and Fedora.
First I installed scylla-5.4, then I installed the packages for this PR.
On Ubuntu, I had to use dpkg -i --auto-deconfigure, otherwise, dpkg would
refuse to install the new packages because they break the old ones. No
extra flags were required on Fedora.
In both cases, /usr/bin/nodetool was changed from a thunk calling the
Java nodetool (from 5.4) to the native launcher script from this PR.
/opt/scylladb/share/cassandra/bin/nodetool remained in place and still
works after the upgrade.

I also verified that --nonroot installs also work. Nodetool works both
when called with an absolute path, or when ~/scylladb/bin is added to
$PATH.

Fixes: #18226
Fixes: #17412

Closes scylladb/scylladb#18255

[avi: reset submodule to actual hash we ended up with]
2024-04-25 22:52:00 +03:00
Kefu Chai
d93b018bcf create-relocatable-package.py: add --debian-dir option
before this change, we assume that debian packaging directory is
always located under `build/debian/debian`. which is hardwired by
`configure.py`. but this could might hold anymore, if we want to
have a self-contained build, in the sense that different builds do
not share the same build directory. this could be a waste for the
non-mult-config build, but `configure.py` uses mult-config generator
when building with CMake. so in that case, all builds still share the
same $build_dir/debian/ directory.

in order to work with the out-of-source build, where the build
directory is not necessarily "build", a new option is added to
`create-relocatable-package.py`, this allows us to specify the directory
where "debian" artifacts are located.

Refs #15241

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17558
2024-03-06 10:18:00 +02:00
Kefu Chai
16eea4569d create-relocatable-package.py: prefer $build_dir/SCYLLA-RELEASE-FILE
similar to d9dcda9dd5, we need to
use the version files located under $build_dir instead "build".
so let's check the existence of $build_dir/SCYLLA-RELEASE-FILE,
and then fallback to the ones under "build".

Refs #15241

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-14 12:45:40 +08:00
Kefu Chai
6dc6b39609 create-relocatable-package.py: create SCYLLA-RELOCATABLE-FILE with tempfile
this change serves two purposes:

1. so we don't assume the existence of '$PWD/build' directory. we should
   not assume this. as the build directory could be any diectory, it
   does not have to be "build".
2. we don't have to actually create a file under $build_dir. what we
   need is but an empty file. so tempfile serves this purpose just well.

Refs #15241

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-14 12:45:15 +08:00
Kefu Chai
e1d08d888a create-relocatable-package.py: add --node-exporter-dir option
before this change, we assume that node_exporter artifacts are
always located under `build/node_exporter`. but this could might
hold anymore, if we want to have a self-contained build, in the sense
that different builds do not share the same set of node_exporter
artifacts. this could be a waste as the node_exporter artifacts
are identical across different builds, but this makes things
a lot simpler -- different builds do not have to hardwire to
a certain directory.

so, a new option is added to `create-relocatable-package.py`, this
allows us to specify the directory where node_export artifacts
are located.

Refs #15241
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-06 20:52:07 +08:00
Kefu Chai
8bd9645d1d build: specify the build dir instead mode
instead of specifying the build "mode", and assuming that
the build directory is always located at "build/${mode}", specify
the build directory explicitly. this allows us to use
`create-relocatable-package.py` to package artifacts built
at build directory whose path does not comply to the
naming convention, for instance, we might want to build
scylla in `build/yet-another-super-feature/release`.

so, in this change, we trade `--mode` for an option named
`--build-dir` and update `configure.py` accordingly.

Refs #15241
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-06 20:47:42 +08:00
Kefu Chai
17c1b15c81 create-relocatable-package.py: add version file with tempfile
before this change, we build multiple relocatable package for
different builds in parallel using ninja. all these relocatable
packages are built using the same script of
`create-relocatable-package.py`. but this script always use the
same directory and file for the `.relocatable_package_version`
file.

so there are chances that these jobs building the relocatable
package can race and writing / accessing the same file at the same
time. so, in this change, instead of using a fixed file path
for this temporary file, we use a NamedTemporaryFile for this purpose.
this should helps us avoid the build failures like

```
[2023-08-10T09:38:00.019Z] FAILED: build/debug/dist/tar/scylla-unstripped-5.4.0~dev-0.20230810.116c10a2b0c6.x86_64.tar.gz
[2023-08-10T09:38:00.019Z] scripts/create-relocatable-package.py --mode debug 'build/debug/dist/tar/scylla-unstripped-5.4.0~dev-0.20230810.116c10a2b0c6.x86_64.tar.gz'
[2023-08-10T09:38:00.019Z] Traceback (most recent call last):
[2023-08-10T09:38:00.019Z]   File "/jenkins/workspace/scylla-master/scylla-ci/scylla/scripts/create-relocatable-package.py", line 130, in <module>
[2023-08-10T09:38:00.019Z]     os.makedirs(f'build/{SCYLLA_DIR}')
[2023-08-10T09:38:00.019Z]   File "<frozen os>", line 225, in makedirs
[2023-08-10T09:38:00.019Z] FileExistsError: [Errno 17] File exists:
'build/scylla-package'
```

Fixes #15018
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #15007
2023-08-11 12:57:28 +03:00
Kefu Chai
024b96a211 create-relocatable-package.py: package build/node_export only for stripped version
because we build stripped package and non-stripped package in parallel
using ninja. there are chances that the non-stripped build job could
be adding build/node_exporter directory to the tarball while the job
building stripped package is using objcopy to extract the symbols from
the build/node_exporter/node_exporter executable. but objcopy creates
temporary files when processing the executables. and the temporary
files can be spotted by the non-stripped build job. there are two
consequences:

1. non-stripped build job includes the temporary files in its tarball,
   even they are not supposed to be distributed
2. non-stripped build job fails to include the temporary file(s), as
   they are removed after objcopy finishes its job. but the job did spot
   them when preparing the tarball. so when the tarfile python module
   tries to include the previous found temporary file(s), it throws.

neither of these consequences is expected. but fortunately, this only
happens when packaging the non-stripped package. when packaging the
stripped package, the build/node_exported directory is not in flux
anymore. as ninja ensures the dependencies between the jobs.

so, in this change, we do not add the whole directory when packaging
the non-stripped version. as all its ingredients have been added
separately as regular files. and when packaing the stripped version,
we still use the existing step, as we don't have to list all the
files created by strip.sh:

node_exporter{,.debug,.dynsyms,.funcsyms,.keep_symbols,.minidebug.xz}

we could do so in this script, but the repeatings is unnecessary and
error-prune. so, let's keep including the whole directory recursively,
so all the debug symbols are included.

Fixes #14079
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-30 14:00:02 +08:00
Kefu Chai
665a747fab create-relocatable-package.py: use positive condition when possible
to reduce the programmer's cognitive load.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-05-30 14:00:02 +08:00
Kefu Chai
a0b8aa9b13 create-relocatable-package.py: raise if rmtree fails
occasionally, we are observing build failures like:
```
17:20:54  FAILED: build/release/dist/tar/scylla-debuginfo-5.4.0~dev-0.20230522.5b2687e11800.x86_64.tar.gz
17:20:54  dist/debuginfo/scripts/create-relocatable-package.py --mode release 'build/release/dist/tar/scylla-debuginfo-5.4.0~dev-0.20230522.5b2687e11800.x86_64.tar.gz'
17:20:54  Traceback (most recent call last):
17:20:54    File "/jenkins/workspace/scylla-master/scylla-ci/scylla/dist/debuginfo/scripts/create-relocatable-package.py", line 60, in <module>
17:20:54      os.makedirs(f'build/{SCYLLA_DIR}')
17:20:54    File "<frozen os>", line 225, in makedirs
17:20:54  FileExistsError: [Errno 17] File exists: 'build/scylla-debuginfo-package'
```

to understand the root cause better, instead of swallowing the error,
let's raise the exception it is not caused by non-existing directory.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13978
2023-05-29 23:03:24 +03:00
Kefu Chai
60ff230d54 create-relocatable-package.py: use f-string
in dcce0c96a9, we should have used
f-string for printing the return code of gzip subprocess. but the
"f" prefix was missed. so, in this change, it is added.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13500
2023-04-14 08:29:33 +03:00
Kefu Chai
dcce0c96a9 create-relocatable-package.py: error out if pigz fails
before this change, we don't error out even if pigz fails. but
there is chance that pigz fails to create the gzip'ed relocatable
tarball either due to environmental issues or some other problems,
and we are not aware of this until packaging scripts like
`reloc/build_rpm.sh` tries to ungzip this corrupted gzip file.

in this change, if pigz's status code is not 0, the status code
is printed, and create-relocatable-package.py will return 1.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13459
2023-04-11 14:29:25 +03:00
Takuya ASADA
497dd7380f create-relocatable-package.py: stop using filter function on tools
We introduced exclude_submodules at 19da4a5b8f
to exclude tools/java and tools/jmx since they have their own
relocatable packages, so we don't want to package same files twice.

However, most of the files under tools/ are not needed for installation,
we just need tools/scyllatop.
So what we really need to do is "ar.reloc_add('tools/scyllatop')", not
excluding files from tools/.

related with #13183

Closes #13215
2023-03-29 16:23:43 +03:00
Kefu Chai
96ba88f621 dist/debian: add libexec/scylla to source/include-binaries
* scripts/create-relocatable-package.py: add a command to print out
  executables under libexec
* dist/debian/debian_files_gen.py: call create-relocatable-package.py
  for a list of files under libexec and create source/include-binaries
  with the list.

we repackage the precompiled binaries in the relocatable package into a debian source package using `./scylla/install.sh`, which edits the executable to use the specified dynamic library loader. but dpkg-source does not like this, as it wants to ensure that the files in original tarball (*.orig.tar.gz) is identical to the files in the source package created by dpkg-source.

so we have following failure when running reloc/build_deb.sh

```
dpkg-source: error: cannot represent change to scylla/libexec/scylla: binary file contents changed
dpkg-source: error: add scylla/libexec/scylla in debian/source/include-binaries if you want to store the modified binary in the debian tarball
dpkg-source: error: unrepresentable changes to source
dpkg-buildpackage: error: dpkg-source -b . subprocess returned exit status 1
debuild: fatal error at line 1182:
dpkg-buildpackage -rfakeroot -us -uc -ui failed
```

in this change, to address the build failure, as proposed by dpkg, the
path to the patched/edited executable is added to
`debian/source/include-binaries`. see the "Building" section in https://manpages.debian.org/bullseye/dpkg-dev/dpkg-source.1.en.html for more details.
please search `adjust_bin()` in `scylladb/install.sh` for more details.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12722
2023-03-27 10:10:12 +03:00
Takuya ASADA
a79604b0d6 create-relocatable-package.py: exclude tools/cqlsh
We should exclude tools/cqlsh for relocatable package.

fixes #13181

Closes #13183
2023-03-16 13:37:16 +02:00
Avi Kivity
46690bcb32 build: harden create-relocatable-package.py against changes in libthread-db.so name
create-relocatable-package.py collects shared libraries used by
executables for packaging. It also adds libthread-db.so to make
debugging possible. However, the name it uses has changed in glibc,
so packaging fails in Fedora 37.

Switch to the version-agnostic names, libthread-db.so. This happens
to be a symlink, so resolve it.

Closes #11917
2022-11-08 08:41:22 +02:00
Takuya ASADA
1a11a38add unified: move unified package contents to sub-directory
On most of the software distribution tar.gz, it has sub-directory to contain
everything, to prevent extract contents to current directory.
We should follow this style on our unified package too.

To do this we need to increment relocatable package version to '3.0'.

Fixes #8349

Closes #8867
2022-10-25 08:58:15 +03:00
Takuya ASADA
49d5e51d76 reloc: add support stripped binary installation for relocatable package
This add support stripped binary installation for relocatable package.
After this change, scylla and unified packages only contain stripped binary,
and introduce "scylla-debuginfo" package for debug symbol.
On scylla-debuginfo package, install.sh script will extract debug symbol
at /opt/scylladb/<dir>/.debug.

Note that we need to keep unstripped version of relocatable package for rpm/deb,
otherwise rpmbuild/debuild fails to create debug symbol package.
This version is renamed to scylla-unstripped-$version-$release.$arch.tar.gz.

See #8918

Signed-off-by: Takuya ASADA <syuu@scylladb.com>

Closes #9005
2022-10-13 15:11:32 +02:00
Takuya ASADA
cd5320fe60 install.sh: add --without-systemd option
Since we fail to write files to $USER/.config on Jenkins jobs, we need
an option to skip installing systemd units.
Let's add --without-systemd to do that.

Also, to detect the option availability, we need to increment
relocatable package version.

See scylladb/scylla-dtest#2819

Closes #11345
2022-09-12 13:04:00 +03:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Takuya ASADA
76519751bc install.sh: add fix_system_distributed_tables.py to the package
Related with #4601

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
2021-11-08 08:07:49 +02:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Takuya ASADA
95197a09c9 dist: add node_exporter to scylla-server package
To connection-less environment, we need to add node_exporter binary
to scylla-server package, not downloading it from internet.

Related #7765
Fixes #2190

Closes #7796
2020-12-24 11:44:13 +02:00
Benny Halevy
bc64ee5410 reloc: add ubsan-suppressions.supp to relocatable package
So we can use it to suppress false-positive ubsan error
when running scylla in debug mode.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20201110165214.1467027-1-bhalevy@scylladb.com>
2020-11-10 19:14:27 +02:00
Takuya ASADA
100127bc02 install.sh: allow --packaging with nonroot mode
Since scylla-ccm wants to skip systemctl, we need to support --packaging
in nonroot mode too.

Related: #7187
2020-11-02 12:07:14 +02:00
Avi Kivity
fc15d0a4be build: relocatable package: exclude tools/python3
python3 has its own relocatable package, no need to include it
in scylla-package.tar.gz.

Python has its own relocatable package, so packaging it in scylla-package.ta

Closes #7467
2020-10-28 16:22:23 +02:00
Takuya ASADA
6ba2a6c42e create-relocatable-package.py: add lsblk for relocatable CLI tools
We need latest version of lsblk that supported partition type UUID.

Fixes #6954
2020-07-31 04:23:03 +09:00
Avi Kivity
19da4a5b8f build: don't package tools/java and tools/jmx in relocatable pacakge
tools/java and tools/jmx have their own relocatable packages (and rpm/deb),
so they should not be part of the main relocatable package.

Enforce this by enabling the filter parameter in reloc_add, and passing
a filter that excludes tools/java and tools/jmx.
2020-07-22 20:03:18 +03:00
Takuya ASADA
536ab4ebe4 reloc-pkg: move all files under project name directory
To make unified relocatable package easily, we may want to merge tarballs to single tarball like this:
zcat *.tar.gz | gzip -c > scylla-unified.tar.xz
But it's not possible with current relocatable package format, since there are multiple files conflicts, install.sh, SCYLLA-*-FILE, dist/, README.md, etc..

To support this, we need to archive everything in the directory when building relocatable package.

This is modifying relocatable package format, we need to provide a way to
detect the format version.
To do this, we added a new file ".relocatable_package_version" on the top of the
archive, and set version number "2" to the file.

Fixes #6315
2020-06-03 09:52:44 +03:00
Takuya ASADA
287d6e5ece dist/debian: drop dependency on pystache
Same as 9d91ac345a, drop dependency on pystache
since it nolonger present in Fedora 32.

To implement it, simplified debian package build process.
It will be generate debian/ directory when building relocatable package,
we just need to run debuild using the package.

To generate debian/ directory this commit added debian_files_gen.py,
it construct whole directory including control and changelog files
from template files.
Since we need to stop pystache, these template files swiched to
string.Template class which is included python3 standard library.

see: https://github.com/scylladb/scylla/pull/6313
2020-05-27 08:40:05 +03:00
Takuya ASADA
46386beba2 install.sh: convert relocate_python_scripts.py to a bash function
Since we need to run relocate_python_scripts.py on install time,
python script may not able to run on various different environment.
So convert the script to bash script, merge it into install.sh.
2020-01-20 11:15:34 +02:00
Takuya ASADA
e0071b1756 reloc: don't archive dist/ami/files/*.rpm on relocatable package
We should skip archiving dist/ami/files/*.rpm on relocatable package,
since it doesn't used.
Also packer and variables.json, too.

Fixes #5508

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20191223121044.163861-1-syuu@scylladb.com>
2019-12-23 14:19:51 +02:00
Avi Kivity
440ad6abcc Revert "relocatable: Check that patchelf didn't mangle the PT_LOAD headers"
This reverts commit 237ba74743. While it
works for the scylla executable, it fails for iotune, which is built
by seastar. It should be reinstated after we pass the correct link
parameters to the seastar build system.
2019-12-19 11:20:34 +02:00
Rafael Ávila de Espíndola
8d777b3ad5 relocatable: Use a super long path for the dynamic linker
Having a long path allows patchelf to change the interpreter without
changing the PT_LOAD headers and therefore without moving the
build-id out of the first page.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191213224803.316783-1-espindola@scylladb.com>
2019-12-18 19:10:59 +02:00
Rafael Ávila de Espíndola
237ba74743 relocatable: Check that patchelf didn't mangle the PT_LOAD headers
Should avoid issue #4983 showing up again.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191213224803.316783-2-espindola@scylladb.com>
2019-12-16 20:18:32 +02:00
Pekka Enberg
0c1dad0838 Merge "Misc documentation cleanup" from Botond
"Delete README-DPDK.md, move IDL.md to docs/ and fix
docs/review-checklist.md to point to scylla's coding style document,
instead of seastar's."

* 'documentation-cleanup/v3' of https://github.com/denesb/scylla:
  docs/review-checklist.md: point to scylla's coding-style.md instead of seastar's
  docs: mv coding-style.md docs/
  rm README-DPDK.md
  docs: mv IDL.md docs/
2019-10-15 12:53:49 +02:00
Avi Kivity
d77171e10e build: adjust libthread_db file name to match gdb expectations
gdb searches for libthread_db.so using its canonical name of libthread_db.so.1 rather
than the file name of libthread_db-1.0.so, so use that name to store the file in the
archive.

Fixes #4996.
2019-09-16 14:48:42 +02:00
Benny Halevy
bdfb73f67d scripts/create-relocatable-package: ldd: print executable name in exception
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190903080511.534-1-bhalevy@scylladb.com>
2019-09-03 15:34:38 +03:00
Avi Kivity
f1d73d0c13 Merge "systemd: put scylla processes in systemd slices. #4743" from Glauber
"
It is well known that seastar applications, like Scylla, do not play
well with external processes: CPU usage from external processes may
confuse the I/O and CPU schedulers and create stalls.

We have also recently seen that memory usage from other application's
anonymous and page cache memory can bring the system to OOM.

Linux has a very good infrastructure for resource control contributed by
amazingly bright engineers in the form of cgroup controllers. This
infrastructure is exposed by SystemD in the form of slices: a
hierarchical structure to which controllers can be attached.

In true systemd way, the hierarchy is implicit in the filenames of the
slice files. a "-" symbol defines the hierarchy, so the files that this
patch presents, scylla-server and scylla-helper, essentially create a
"scylla" cgroup at the top level with "server" and "helper" children.

Later we mark the Services needed to run scylla as belonging to one
or the other through the Slice= directive.

Scylla DBAs can benefit from this setup by using the systemd-run
utility to fire ad-hoc commands.

Let's say for example that someone wants to hypothetically run a backup
and transfer files to an external object store like S3, making sure that
the amount of page cache used won't create swap pressure leading to
database timeouts.

One can then run something like:

sudo systemd-run --uid=id -u scylla --gid=id -g scylla -t --slice=scylla-helper.slice /path/to/my/magical_backup_tool

(or even better, the backup tool can itself be a systemd timer)
"

* 'slices' of https://github.com/glommer/scylla:
  systemd: put scylla processes in systemd slices.
  move postinst steps to an external script
2019-08-26 20:16:55 +03:00
Avi Kivity
698b72b501 relocatable: switch from run-time relocation to install-time relocation
Our current relocation works by invoking the dynamic linker with the
executable as an argument. This confuses gdb since the kernel records
the dynamic linker as the executable, not the real executable.

Switch to install-time relocation with patchelf: when installing the
executable and libraries, all paths are known, and we can update the
path to the dynamic loader and to the dynamic libraries.

Since patchelf itself is dynamically linked, we have to relocate it
dynamically (with the old method of invoking it via the dynamic linker).
This is okay since it's a one-time operation and since we don't expect
to debug core dumps of patchelf crashes.

We lose the ability to run scylla directly from the uninstalled
tarball, but since the nonroot installer is already moving in the
direction of requiring install.sh, that is not a great loss, and
certainly the ability to debug is more important.

dh_strip barfs on some binaries which were treated with patchelf,
so exclude them from dh_strip. This doesn't lose any functionality,
since these binaries didn't have debug information to begin with
(they are already-stripped Fedora executables).

Fixes #4673.
2019-08-20 00:25:43 +02:00
Tomasz Grabiec
b9447d0319 Revert "relocatable: switch from run-time relocation to install-time relocation"
This reverts commit 4ecce2d286.

Should be committed via the next branch.
2019-08-20 00:22:30 +02:00
Avi Kivity
4ecce2d286 relocatable: switch from run-time relocation to install-time relocation
Our current relocation works by invoking the dynamic linker with the
executable as an argument. This confuses gdb since the kernel records
the dynamic linker as the executable, not the real executable.

Switch to install-time relocation with patchelf: when installing the
executable and libraries, all paths are known, and we can update the
path to the dynamic loader and to the dynamic libraries.

Since patchelf itself is dynamically linked, we have to relocate it
dynamically (with the old method of invoking it via the dynamic linker).
This is okay since it's a one-time operation and since we don't expect
to debug core dumps of patchelf crashes.

We lose the ability to run scylla directly from the uninstalled
tarball, but since the nonroot installer is already moving in the
direction of requiring install.sh, that is not a great loss, and
certainly the ability to debug is more important.

dh_strip barfs on some binaries which were treated with patchelf,
so exclude them from dh_strip. This doesn't lose any functionality,
since these binaries didn't have debug information to begin with
(they are already-stripped Fedora executables).

Fixes #4673.
2019-08-20 00:20:19 +02:00
Glauber Costa
da260ecd61 systemd: put scylla processes in systemd slices.
It is well known that seastar applications, like Scylla, do not play
well with external processes: CPU usage from external processes may
confuse the I/O and CPU schedulers and create stalls.

We have also recently seen that memory usage from other application's
anonymous and page cache memory can bring the system to OOM.

Linux has a very good infrastructure for resource control contributed by
amazingly bright engineers in the form of cgroup controllers. This
infrastructure is exposed by SystemD in the form of slices: a
hierarchical structure to which controllers can be attached.

In true systemd way, the hierarchy is implicit in the filenames of the
slice files. a "-" symbol defines the hierarchy, so the files that this
patch presents, scylla-server and scylla-helper, essentially create a
"scylla" cgroup at the top level with "server" and "helper" children.

Later we mark the Services needed to run scylla as belonging to one
or the other through the Slice= directive.

Scylla DBAs can benefit from this setup by using the systemd-run
utility to fire ad-hoc commands.

Let's say for example that someone wants to hypothetically run a backup
and transfer files to an external object store like S3, making sure that
the amount of page cache used won't create swap pressure leading to
database timeouts.

One can then run something like:

```
   sudo systemd-run --uid=`id -u scylla` --gid=`id -g scylla` -t --slice=scylla-helper.slice /path/to/my/magical_backup_tool
```

(or even better, the backup tool can itself be a systemd timer)

Changes from last version:
- No longer use the CPUQuota
- Minor typo fixes
- postinstall fixup for small machines

Benchmark results:
==================

Test: read from disk, with 100% disk util using a single i3.xlarge (4 vCPUs).
We have to fill the cache as we read, so this should stress CPU, memory and
disk I/O.

cassandra-stress command:
```
  cassandra-stress read no-warmup duration=5m -rate threads=20 -node 10.2.209.188 -pop dist=uniform\(1..150000000\)
```

Baseline results:

```
Results:
Op rate                   :   13,830 op/s  [READ: 13,830 op/s]
Partition rate            :   13,830 pk/s  [READ: 13,830 pk/s]
Row rate                  :   13,830 row/s [READ: 13,830 row/s]
Latency mean              :    1.4 ms [READ: 1.4 ms]
Latency median            :    1.4 ms [READ: 1.4 ms]
Latency 95th percentile   :    2.4 ms [READ: 2.4 ms]
Latency 99th percentile   :    2.8 ms [READ: 2.8 ms]
Latency 99.9th percentile :    3.4 ms [READ: 3.4 ms]
Latency max               :   12.0 ms [READ: 12.0 ms]
Total partitions          :  4,149,130 [READ: 4,149,130]
Total errors              :          0 [READ: 0]
Total GC count            : 0
Total GC memory           : 0.000 KiB
Total GC time             :    0.0 seconds
Avg GC time               :    NaN ms
StdDev GC time            :    0.0 ms
Total operation time      : 00:05:00
```

Question 1:
===========

Does putting scylla in a special slice affect its performance ?

Results with Scylla running in a slice:

```
Results:
Op rate                   :   13,811 op/s  [READ: 13,811 op/s]
Partition rate            :   13,811 pk/s  [READ: 13,811 pk/s]
Row rate                  :   13,811 row/s [READ: 13,811 row/s]
Latency mean              :    1.4 ms [READ: 1.4 ms]
Latency median            :    1.4 ms [READ: 1.4 ms]
Latency 95th percentile   :    2.2 ms [READ: 2.2 ms]
Latency 99th percentile   :    2.6 ms [READ: 2.6 ms]
Latency 99.9th percentile :    3.3 ms [READ: 3.3 ms]
Latency max               :   23.2 ms [READ: 23.2 ms]
Total partitions          :  4,151,409 [READ: 4,151,409]
Total errors              :          0 [READ: 0]
Total GC count            : 0
Total GC memory           : 0.000 KiB
Total GC time             :    0.0 seconds
Avg GC time               :    NaN ms
StdDev GC time            :    0.0 ms
Total operation time      : 00:05:00
```

*Conclusion* : No significant change

Question 2:
===========

What happens when there is a CPU hog running in the same server as scylla?

CPU hog:

```
   taskset -c 0 /bin/sh -c "while true; do true; done" &
   taskset -c 1 /bin/sh -c "while true; do true; done" &
   taskset -c 2 /bin/sh -c "while true; do true; done" &
   taskset -c 3 /bin/sh -c "while true; do true; done" &
   sleep 330
```

Scenario 1: CPU hog runs freely:

```
Results:
Op rate                   :    2,939 op/s  [READ: 2,939 op/s]
Partition rate            :    2,939 pk/s  [READ: 2,939 pk/s]
Row rate                  :    2,939 row/s [READ: 2,939 row/s]
Latency mean              :    6.8 ms [READ: 6.8 ms]
Latency median            :    5.3 ms [READ: 5.3 ms]
Latency 95th percentile   :   11.0 ms [READ: 11.0 ms]
Latency 99th percentile   :   14.9 ms [READ: 14.9 ms]
Latency 99.9th percentile :   17.1 ms [READ: 17.1 ms]
Latency max               :   26.3 ms [READ: 26.3 ms]
Total partitions          :    884,460 [READ: 884,460]
Total errors              :          0 [READ: 0]
Total GC count            : 0
Total GC memory           : 0.000 KiB
Total GC time             :    0.0 seconds
Avg GC time               :    NaN ms
StdDev GC time            :    0.0 ms
Total operation time      : 00:05:00
```

Scenario 2: CPU hog runs inside scylla-helper slice

```
Results:
Op rate                   :   13,527 op/s  [READ: 13,527 op/s]
Partition rate            :   13,527 pk/s  [READ: 13,527 pk/s]
Row rate                  :   13,527 row/s [READ: 13,527 row/s]
Latency mean              :    1.5 ms [READ: 1.5 ms]
Latency median            :    1.4 ms [READ: 1.4 ms]
Latency 95th percentile   :    2.4 ms [READ: 2.4 ms]
Latency 99th percentile   :    2.9 ms [READ: 2.9 ms]
Latency 99.9th percentile :    3.8 ms [READ: 3.8 ms]
Latency max               :   18.7 ms [READ: 18.7 ms]
Total partitions          :  4,069,934 [READ: 4,069,934]
Total errors              :          0 [READ: 0]
Total GC count            : 0
Total GC memory           : 0.000 KiB
Total GC time             :    0.0 seconds
Avg GC time               :    NaN ms
StdDev GC time            :    0.0 ms
Total operation time      : 00:05:00
```

*Conclusion*: With systemd slice we can keep the performance very close to
baseline

Question 3:
===========

What happens when there is a CPU hog running in the same server as scylla?

I/O hog: (Data in the cluster is 2x size of memory)

```
while true; do
	find /var/lib/scylla/data -type f -exec grep glauber {} +
done
```

Scenario 1: I/O hog runs freely:

```
Results:
Op rate                   :    7,680 op/s  [READ: 7,680 op/s]
Partition rate            :    7,680 pk/s  [READ: 7,680 pk/s]
Row rate                  :    7,680 row/s [READ: 7,680 row/s]
Latency mean              :    2.6 ms [READ: 2.6 ms]
Latency median            :    1.3 ms [READ: 1.3 ms]
Latency 95th percentile   :    7.8 ms [READ: 7.8 ms]
Latency 99th percentile   :   10.9 ms [READ: 10.9 ms]
Latency 99.9th percentile :   16.9 ms [READ: 16.9 ms]
Latency max               :   40.8 ms [READ: 40.8 ms]
Total partitions          :  2,306,723 [READ: 2,306,723]
Total errors              :          0 [READ: 0]
Total GC count            : 0
Total GC memory           : 0.000 KiB
Total GC time             :    0.0 seconds
Avg GC time               :    NaN ms
StdDev GC time            :    0.0 ms
Total operation time      : 00:05:00
```

Scenario 2: I/O hog runs in the scylla-helper systemd slice:

```
Results:
Op rate                   :   13,277 op/s  [READ: 13,277 op/s]
Partition rate            :   13,277 pk/s  [READ: 13,277 pk/s]
Row rate                  :   13,277 row/s [READ: 13,277 row/s]
Latency mean              :    1.5 ms [READ: 1.5 ms]
Latency median            :    1.4 ms [READ: 1.4 ms]
Latency 95th percentile   :    2.4 ms [READ: 2.4 ms]
Latency 99th percentile   :    2.9 ms [READ: 2.9 ms]
Latency 99.9th percentile :    3.5 ms [READ: 3.5 ms]
Latency max               :  183.4 ms [READ: 183.4 ms]
Total partitions          :  3,984,080 [READ: 3,984,080]
Total errors              :          0 [READ: 0]
Total GC count            : 0
Total GC memory           : 0.000 KiB
Total GC time             :    0.0 seconds
Avg GC time               :    NaN ms
StdDev GC time            :    0.0 ms
Total operation time      : 00:05:00
```

*Conclusion*: With systemd slice we can keep the performance very close to
baseline

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2019-08-19 14:31:28 -04:00
Glauber Costa
ffc328c924 move postinst steps to an external script
There are systemd-related steps done in both rpm and deb builds.
Move that to a script so we avoid duplication.

The tests are so far a bit specific to the distributions, so it
needs to be adapted a bit.

Also note that this also fixes a bug with rpm as a side-effect:
rpm does not call daemon-reload after potentially changing the
systemd files (it is only implied during postun operations, that
happen during uninstall). daemon-reload was called explicitly for
debian packages, and now it is called for both.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2019-08-15 10:43:17 -04:00
Takuya ASADA
842f75d066 reloc: provide libthread_db.so.1 to debug thread on gdb
In scylla-debuginfo package, we have /usr/lib/debug/opt/scylladb/libreloc/libthread_db-1.0.so-666.development-0.20190711.73a1978fb.el7.x86_64.debug
but we actually does not have libthread_db.so.1 in /opt/scylladb/libreloc
since it's not available on ldd result with scylla binary.

To debug thread, we need to add the library in a relocatable package manually.

Fixes #4673

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190711111058.7454-1-syuu@scylladb.com>
2019-07-12 19:21:26 +03:00
Takuya ASADA
214c74a71d dist: merge product name parameter on single place
When we add product name customization, we mistakenly defined the
parameter on each package build script.
Number of script is increasing since we recently added relocatable
python3 package, we should merge it in single place.

Also we should save the parameter on relocatable package, just like
version-release parameters.

So move the definition to SCYLLA-VERSION-GEN, save it to
build/SCYLLA-PRODUCT-FILE then archive it to relocatable package.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190417163335.10191-1-syuu@scylladb.com>
2019-04-19 11:47:40 +03:00