It was not enabled due to some cqlsh dependency missing.
After 3 years it's hard to say if the thing is fixed or not,
but anyway we don't need another big dependecy while we already
have python driver used exstensively in tests. We use simple
wrapper file exec_cql.py, shared with auth_conns workload to
conveniently read needed preparation statements from the file.
Additionally we switch tablets off as counters don't support
it yet.
It uses some derived roles and permissions
to exercise auth code paths and also creates new
connection with each stress request to exercise
also transport/server.cc connection handling code.
As of right now, materialized views (and consequently secondary
indexes), lwt and counters are unsupported or experimental with tablets.
Since by defaults tablets are enabled, training cases using those
features are currently broken.
The right thing to do here is to disable tablets in those cases.
Fixes https://github.com/scylladb/scylladb/issues/22638Closesscylladb/scylladb#22661
This workload is added to teach PGO about repair.
Tests are inconclusive about its alignment with existing workloads,
because repair doesn't seem utilize 100% of the reactor.
This workload is added to teach PGO about counters.
Tests seem to show it's mostly aligned with existing CQL workloads.
The config YAML is based on the default cassandra-stress schema.
This workload is added to teach PGO about secondary indexes.
Tests seem to show that it's mostly aligned with existing CQL workloads.
The config YAML was copied from one of scylla-cluster-test test cases.
This workload is added to teach PGO about LWT codepaths.
Tests seem to show that it's mostly aligned with existing CQL workloads.
The config YAML was copied from one of scylla-cluster-tests test cases.
In contrast to the basic workload, this workload uses clustering
keys, CK range queries, RF=1, logged batches, and more CQL types.
Tests seem to show that this workload is mostly aligned with the existing basic
workload (where "aligned" means that training on workload A improves workload B
about as much as training on workload B).
The config YAML is based on the example YAML attached to cassandra-stress
sources.
Profile-guided optimization consists of the following steps:
1. Build the program as usual, but with with special options (instrumentation
or just some supplementary info tables, depending on the exact flavor of PGO
in use).
2. Collect an execution profile from the special binary by running a
training workload on it.
3. Rebuild the program again, using the collected profile.
This commit introduces a script automating step 2: running PGO training workloads
on Scylla. The contents of training workloads will be added in future commits.
The changes in configure.py responsible for steps 1. and 3. will also appear
in future commits.
As input, the script takes a path to the instrumented binary, a path to a
the output file, and a directory with (optionally) prepopulated datasets for use
in training. The output profile file can be then passed to the compiler to
perform a PGO build.
The script current supports two kinds of PGO instrumentation: LLVM instrumentation
(binary instrumented with -fprofile-generate and -fcs-profile-generate passed to
clang during compilation) and BOLT instrumentation (binary instrumented with
`llvm-bolt -instrument`, with logs from this operation saved to
$binary_path.boltlog)
The actual training workloads for generating the profile will be added in later
commits.