Commit Graph

59 Commits

Author SHA1 Message Date
Jenkins Promoter
f8c02a420d Update pgo profiles - aarch64 2025-10-01 05:32:35 +03:00
Jenkins Promoter
b45a57f65e Update pgo profiles - x86_64 2025-10-01 04:54:14 +03:00
Jenkins Promoter
c63b335819 Update pgo profiles - aarch64 2025-09-15 05:17:07 +03:00
Jenkins Promoter
e97a0c8b42 Update pgo profiles - x86_64 2025-09-14 21:23:37 -04:00
Marcin Maliszkiewicz
2109110037 pgo: add links to issues about tablet missing features 2025-09-03 15:43:52 +02:00
Marcin Maliszkiewicz
8aa2825caa pgo: enable counters workload
It was not enabled due to some cqlsh dependency missing.
After 3 years it's hard to say if the thing is fixed or not,
but anyway we don't need another big dependecy while we already
have python driver used exstensively in tests. We use simple
wrapper file exec_cql.py, shared with auth_conns workload to
conveniently read needed preparation statements from the file.

Additionally we switch tablets off as counters don't support
it yet.
2025-09-03 15:43:51 +02:00
Marcin Maliszkiewicz
09476a4df8 pgo: add auth connections stress workload
It uses some derived roles and permissions
to exercise auth code paths and also creates new
connection with each stress request to exercise
also transport/server.cc connection handling code.
2025-09-03 15:43:51 +02:00
Marcin Maliszkiewicz
f2270034ec pgo: enable auth in training clusters
As it's best practice to use auth and we don't
want to have 2^n configs to train we just enable
auth for every workload.
2025-09-03 15:29:27 +02:00
Jenkins Promoter
619b4102bd Update pgo profiles - x86_64 2025-09-01 05:08:56 +03:00
Jenkins Promoter
783f866bd3 Update pgo profiles - aarch64 2025-09-01 05:05:17 +03:00
Jenkins Promoter
d4ce070168 Update pgo profiles - aarch64 2025-08-15 05:03:28 +03:00
Jenkins Promoter
c0f691f4d9 Update pgo profiles - x86_64 2025-08-15 04:56:11 +03:00
Jenkins Promoter
2de91d43d5 Update pgo profiles - x86_64 2025-08-13 07:52:17 +03:00
Jenkins Promoter
647d9fe45d Update pgo profiles - aarch64 2025-08-13 07:43:38 +03:00
Botond Dénes
72b2bbac4f pgo/pgo.py: use tablet repair API for repair
Since a1d7722 tablet keyspaces are not allowed to be repaired via the
old /storage_service/repair_async/{keyspace} API, instead the new
/storage_service/tablets/repair API has to be used. Adjust the repair
code and also add await_completion=true: the script just waits
for the repair to finish immediately after starting it.

Closes scylladb/scylladb#25455
2025-08-12 20:32:19 +03:00
Jenkins Promoter
41bc6a8e86 Update pgo profiles - x86_64 2025-07-15 04:54:17 +03:00
Jenkins Promoter
b86674a922 Update pgo profiles - aarch64 2025-07-15 04:49:45 +03:00
Jenkins Promoter
94d7c22880 Update pgo profiles - aarch64 2025-07-01 11:33:20 +03:00
Jenkins Promoter
7531fc72a6 Update pgo profiles - x86_64 2025-07-01 11:33:20 +03:00
Jenkins Promoter
b0a7fcf21b Update pgo profiles - aarch64 2025-06-23 19:20:50 +03:00
Jenkins Promoter
e15e5a6081 Update pgo profiles - x86_64 2025-06-23 19:20:50 +03:00
Jenkins Promoter
1b5eee6a12 Update pgo profiles - aarch64 2025-06-15 04:57:59 +03:00
Jenkins Promoter
e0c2d591c7 Update pgo profiles - x86_64 2025-06-15 04:44:13 +03:00
Jenkins Promoter
7d562c24b1 Update pgo profiles - aarch64 2025-06-01 04:45:06 +03:00
Jenkins Promoter
75cf16afa2 Update pgo profiles - x86_64 2025-06-01 04:31:56 +03:00
Jenkins Promoter
76dddb758e Update pgo profiles - x86_64 2025-05-27 12:02:49 +03:00
Jenkins Promoter
de9d9c9ece Update pgo profiles - aarch64 2025-05-27 11:59:56 +03:00
Avi Kivity
29932a5af1 pgo: drop Java configuration
Since 5e1cf90a51
("build: replace tools/java submodule with packaged cassandra-stress")
we run pre-packaged cassandra-stress. As such, we don't need to look for
a Java runtime (which is missing on the frozen toolchain) and can
rely on the cassandra-stress package finding its own Java runtime.

Fix by just dropping all the Java-finding stuff.

Note: Java 11 is in fact present on the frozen toolchain, just
not in a way that pgo.py can find it.

Fixes #24176.

Closes scylladb/scylladb#24178
2025-05-26 10:16:03 +02:00
Avi Kivity
5e1cf90a51 build: replace tools/java submodule with packaged cassandra-stress
We no longer use tools/java (scylladb/scylla-tools-java.git) for
nodetool or cqlsh; only cassandra-stress. Since that is available
in package form install that and excise the tools/java submodule
from the source tree.

pgo/ is adjusted to use the packaged cassandra-stress (and the cqlsh
submodule).

A few jmx references are dropped as well.

Frozen toolchain regenerated.

Optimized clang from

  https://devpkg.scylladb.com/clang/clang-19.1.7-Fedora-41-aarch64.tar.gz
  https://devpkg.scylladb.com/clang/clang-19.1.7-Fedora-41-x86_64.tar.gz

Closes scylladb/scylladb#23698
2025-04-15 10:11:28 +03:00
Jenkins Promoter
9699c3ded4 Update pgo profiles - aarch64 2025-04-15 04:45:34 +03:00
Jenkins Promoter
8472aa9e53 Update pgo profiles - x86_64 2025-04-15 04:29:24 +03:00
Jenkins Promoter
6c528f5027 Update pgo profiles - aarch64 2025-04-01 04:45:44 +03:00
Jenkins Promoter
3c12029584 Update pgo profiles - x86_64 2025-04-01 04:27:11 +03:00
Jenkins Promoter
d84da3dc11 Update pgo profiles - x86_64 2025-03-15 04:57:28 +02:00
Jenkins Promoter
6e8e2ae333 Update pgo profiles - aarch64 2025-03-15 04:48:49 +02:00
Jenkins Promoter
7b50fbafb3 Update pgo profiles - aarch64 2025-03-01 04:58:49 +02:00
Jenkins Promoter
84e1514152 Update pgo profiles - x86_64 2025-03-01 04:26:11 +02:00
Jenkins Promoter
0d5f5e6c9d Update pgo profiles - x86_64 2025-02-15 20:32:23 +02:00
Jenkins Promoter
9daf50d424 Update pgo profiles - aarch64 2025-02-15 20:32:22 +02:00
Avi Kivity
cf72c31617 treewide: improve bash error reporting
bash error handling and reporting is atrocious. Without -e it will
just ignore errors. With -e it will stop on errors, but not report
where the error happened (apart from exiting itself with an error code).

Improve that with the `trap ERR` command. Note that this won't be invoked
on intentional error exit with `exit 1`.

We apply this on every bash script that contains -e or that it appears
trivial to set it in. Non-trivial scripts without -e are left unmodified,
since they might intentionally invoke failing scripts.

Closes scylladb/scylladb#22747
2025-02-10 18:28:52 +03:00
Jenkins Promoter
9add2ccc41 Update pgo profiles - x86_64 2025-02-05 08:44:54 +02:00
Jenkins Promoter
c7660e5962 Update pgo profiles - aarch64 2025-02-05 07:51:46 +02:00
Michał Chojnowski
bea434f417 pgo: disable tablets for training with secondary index, lwt and counters
As of right now, materialized views (and consequently secondary
indexes), lwt and counters are unsupported or experimental with tablets.
Since by defaults tablets are enabled, training cases using those
features are currently broken.

The right thing to do here is to disable tablets in those cases.

Fixes https://github.com/scylladb/scylladb/issues/22638

Closes scylladb/scylladb#22661
2025-02-04 15:38:53 +02:00
Jenkins Promoter
f5f15c6d07 Update pgo profiles - aarch64 2025-01-15 04:49:45 +02:00
Jenkins Promoter
f021e16d0c Update pgo profiles - x86_64 2025-01-15 04:26:42 +02:00
Jenkins Promoter
a7d8d21e86 Update pgo profiles - aarch64 2025-01-12 15:27:50 +02:00
Jenkins Promoter
b4ca9489c4 Update pgo profiles - x86_64 2025-01-12 15:05:40 +02:00
Kefu Chai
de42dce4c4 pgo: use java-11 when running cassandra-stress
we updated tools/java/build.xml recently to only build for java-11. so
if

- the `java` executable in `$PATH` points to a java which is neither
  java-8 nor java-11.
- java-8 is installed

java-8 is used to execute the cassandra-stress tool. and we would have
following failure:

```
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/cassandra/stress/Stress has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recogniz
es class file versions up to 52.0
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:473)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:621)
```

in order to be compatible with the bytecode targeting java-11, let's run
cassandra-stress with java-11. we do not need to support java-8, because
the new tools/java is now building cassandra-stress targeting java-11 jre.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22142
2025-01-02 16:56:29 +02:00
Kefu Chai
6adf70ec03 build: cmake: add CMake options for PGO support
- "Scylla_BUILD_INSTRUMENTED" option

  Scylla_BUILD_INSTRUMENTED allows us to instrument the code at
  different level, namely, IR, and CSIR. this option mirrors
  "--pgo" and "--cspgo" options in `configure.py` . please note,
  the instrumentation at the frontend is not supported, as the IR
  based instrumentation is better when it comes to the use case of
  optimization for performance.
  see https://lists.llvm.org/pipermail/llvm-dev/2015-August/089044.html
  for the rationales.

- "Scylla_PROFDATA_FILE" option

  this option allows us to specify the profile data previous generated
  with the "Scylla_BUILD_INSTRUMENTED" option. this option mirrors
  the `--use-profile` option in `configure.py`, but it does not
  take the empty option as a special case and consider it as a file
  fetched from Git LFS. that will be handled by another option in a
  follow-up change. please note, one cannot use
  -DScylla_BUILD_INSTRUMENTED=PGO and -DScylla_PROFDATA_FILE=...
  at the same time. clang just does not allow this. but CSPGO is fine.

- "Scylla_PROFDATA_COMPRESSED_FILE" option

  this option allows us to specify the compressed profile data previouly
  generated with the "Scylla_BUILD_INSTRUMENTED" option. along with
  "Scylla_PROFDATA_FILE", this option mirros the functionality of
  `--use-profile` in `configure.py`. the goal is to ensure user always
  gets the result with the specified options. if anything goes wrong,
  we just error out.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-12-27 16:16:04 +08:00
Michał Chojnowski
a868b44ad8 configure.py: introduce profile-guided optimization
This commit enables profile-guided optimizations (PGO) in the Scylla build.

A full LLVM PGO requires 3 builds:
1. With -fprofile-generate to generate context-free (pre-inlining) profile. This
profile influences inlining, indirect-call promotion and call graph
simplifications.
2. With -fprofile-use=results_of_build_1 -fcs-profile-generate to generate
context-sensitive (post-inlining) profile. This profile influences post-inline
and codegen optimizations.
3. With -fprofile-use=merged_results_of_builds_1_2 to build the final binary
with both profiles.

We do all three in one ninja call by adding release-pgo and release-cs-pgo
"stages" to release. They are a copy of regular release mode, just with the
flags described above added. With the full course, release objects depend on the
profile file produced by build/release-cs-pgo/scylla, while release-cs-pgo
depends on the profile file generated by build/release-pgo/scylla.

The stages are orthogonal and enabled with separate options. It's recommended
to run them both for full performance, but unfortunately each one adds a full
build of scylla to the compile time, so maybe we can drop one of them in the
future if it turns out e.g. that regular PGO doesn't have a big effect.

It's strongly recommended to combine PGO with LTO. The latter enables the entire
class of binary layout optimizations, which for us is probably the most
important part of the entire thing.
2024-12-27 16:16:04 +08:00