mirror of https://github.com/scylladb/scylladb.git synced 2026-06-07 23:43:31 +00:00

Go to file

Marcin Maliszkiewicz 80989556ac pgo: add alternator workloads training

This patch adds a set of alternator workloads to pgo training
script.

To confirm that added workloads are indeed affecting profile we can compare:

⤖ llvm-profdata show ./build/release-pgo/profiles/workdirs/clustering/prof.profdata

Instrumentation level: IR  entry_first = 0
Total functions: 105075
Maximum function count: 1079870885
Maximum internal block count: 2197851358

and

⤖ llvm-profdata show ./build/release-pgo/profiles/workdirs/alternator/prof.profdata

Instrumentation level: IR  entry_first = 0
Total functions: 105075
Maximum function count: 5240506052
Maximum internal block count: 9112894084

to see that function counters are on similar levels, they are around 5x higher for alternator
but that's because it combines 5 specific sub-workloads.

To confirm that final profile contains alterantor functions we can inspect:

⤖ llvm-profdata show --counts --function=alternator --value-cutoff 100000 ./build/release-pgo/profiles/merged.profdata
(...)
Instrumentation level: IR  entry_first = 0
Functions shown: 356
Total functions: 105075
Number of functions with maximum count (< 100000): 97275
Number of functions with maximum count (>= 100000): 7800
Maximum function count: 7248370728
Maximum internal block count: 13722347326

we can see that 356 functions which symbol name contains word alternator were identified as 'hot' (with max count grater than 100'000). Running:

⤖ llvm-profdata show --counts --function=alternator --value-cutoff 1 ./build/release-pgo/profiles/merged.profdata
(...)
Instrumentation level: IR  entry_first = 0
Functions shown: 806
Total functions: 105075
Number of functions with maximum count (< 1): 67036
Number of functions with maximum count (>= 1): 38039
Maximum function count: 7248370728
Maximum internal block count: 13722347326

we can see that 806 alternator functions were executed at least once during training.

And finally to confirm that alternator specific PGO brings any speedups we run:

for workload in read scan write write_gsi write_rmw
do
./build/release/scylla perf-alternator-workloads --smp 4 --cpuset "10,12,14,16" --workload $workload --duration 1 --remote-host 127.0.0.1 2> /dev/null | grep median
done

results BEFORE:

median 258137.51910849303
median absolute deviation: 786.06
median 547.2578202937141
median absolute deviation: 6.33
median 145718.19856685458
median absolute deviation: 5689.79
median 89024.67095807113
median absolute deviation: 1302.56
median 43708.101729598646
median absolute deviation: 294.47

results AFTER:

median 303968.55333940056
median absolute deviation: 1152.19
median 622.4757636209254
median absolute deviation: 8.42
median 198566.0403745328
median absolute deviation: 1689.96
median 91696.44912842038
median absolute deviation: 1891.84
median 51445.356525664996
median absolute deviation: 1780.15

We can see that single node cluster tps increase is typically 13% - 17% with notable exceptions,
improvement for write_gsi is 3% and for write workload whopping 36%.
The increase is on top of CQL PGO.

Write workload is executed more often because it's involved also as data preparation for read and scan.
Some further improvement could be to separate preparation from training as it's done for CQL but it would
be a bit odd if ~3x higher counters for one flow have so big impact.

Additional disclaimers:
 - tests are performing exactly the same workloads as in training so there might be some bias
 - tests are running single node cluster, more realistic setup will likely show lower improvement

Fixes https://github.com/scylladb/scylla-enterprise/issues/4066

2024-12-27 16:16:04 +08:00

.github

.github: cherry-pick each commit instead of merge commit when available

2024-12-26 13:10:18 +02:00

abseil @ d7aaad83b4

build: bring abseil submodule back

2024-05-05 23:31:09 +03:00

alternator

alternator: use get_datacenters() in get_network_topology_options()

2024-12-22 18:57:10 +02:00

api

Merge 'Enhance load-and-stream with "scope"' from Pavel Emelyanov

2024-12-25 13:52:51 +02:00

auth

treewide: use angle brackets when including seastar headers

2024-12-20 16:16:28 +02:00

bin

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

cdc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

cmake

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

compaction

treewide: migrate from boost::copy_range to std::ranges::to

2024-12-26 11:46:26 +02:00

conf

db: config: add user_task_ttl_seconds named value

2024-11-25 14:16:06 +01:00

cql3

cql3: implement NOT IN

2024-12-22 15:15:23 +02:00

data_dictionary

Merge 'Make restore task abortable' from Calle Wilund

2024-12-19 12:23:33 +03:00

treewide: migrate from boost::copy_range to std::ranges::to

2024-12-26 11:46:26 +02:00

debug

…

dht

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

direct_failure_detector

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

dist

build_docker: add option for building container based on Ubuntu Pro

2024-12-20 13:09:24 +02:00

docs

api: New "scope" API param to load-and-stream calls

2024-12-23 19:28:05 +03:00

exceptions

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

gms

treewide: migrate from boost::copy_range to std::ranges::to

2024-12-26 11:46:26 +02:00

idl

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

index

api: Use built_views table in get_built_indexes API

2024-12-24 16:18:00 +03:00

lang

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

licenses

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

locator

treewide: migrate from boost::copy_range to std::ranges::to

2024-12-26 11:46:26 +02:00

message

message: do not include unused header

2024-12-19 15:16:34 +02:00

mutation

mutation: remove unused function

2024-12-20 16:12:10 +02:00

mutation_writer

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

node_ops

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

pgo

pgo: add alternator workloads training

2024-12-27 16:16:04 +08:00

raft

raft: do not include unused headers

2024-12-19 14:57:22 +02:00

readers

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

redis

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

reloc

reloc: create $BUILDDIR for getting its path

2024-05-01 09:52:17 +03:00

repair

treewide: migrate from boost::copy_range to std::ranges::to

2024-12-26 11:46:26 +02:00

replica

treewide: migrate from boost::copy_range to std::ranges::to

2024-12-26 11:46:26 +02:00

rust

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

schema

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

scripts

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

seastar @ 3133ecdd65

Update seastar submodule

2024-12-12 18:30:54 +02:00

service

treewide: migrate from boost::copy_range to std::ranges::to

2024-12-26 11:46:26 +02:00

sstables

sstables: parse(summary): reserve positions vector

2024-12-26 13:33:29 +02:00

streaming

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

swagger-ui @ 12f1da1082

…

tasks

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

test

treewide: migrate from boost::copy_range to std::ranges::to

2024-12-26 11:46:26 +02:00

tools

api: New "scope" API param to load-and-stream calls

2024-12-23 19:28:05 +03:00

tracing

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

transport

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

types

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

unified

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

utils

utils: small_vector: expose internal_capacity()

2024-12-24 12:19:20 +02:00

.clang-format

clang-format: argument and function packing

2024-10-04 14:52:41 +02:00

.dockerignore

…

.gitattributes

gitattributes: Mark swagger .js files as binary

2024-06-19 15:07:56 +03:00

.gitignore

Add .idea folder to .gitignore

2024-09-20 11:49:41 +03:00

.gitmodules

dist: drop scylla-jmx

2024-09-13 07:59:45 +03:00

.gitorderfile

…

.mailmap

…

absl-flat_hash_map.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

absl-flat_hash_map.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

amplify.yml

…

backlog_controller.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

build_mode.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

bytes_fwd.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

bytes_ostream.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

bytes.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

bytes.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

cache_mutation_reader.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

cache_temperature.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

cartesian_product.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

cell_locking.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

checked-file-impl.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

client_data.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

client_data.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

clocks-impl.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

clocks-impl.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

clustering_bounds_comparator.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

clustering_interval_set.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

clustering_key_filter.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

clustering_ranges_walker.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

CMakeLists.txt

configure: enable the io_uring backend

2024-12-04 02:55:31 -05:00

collection_mutation.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

collection_mutation.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

column_computation.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

combine.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

compound_compat.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

compound.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

compress.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

compress.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

concrete_types.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

configure.py

configure.py: don't include non-default modes in dist-server-* rules

2024-12-27 16:16:04 +08:00

CONTRIBUTING.md

…

converting_mutation_partition_applier.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

converting_mutation_partition_applier.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

counters.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

counters.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

coverage_excludes.txt

test.py: support code coverage

2024-01-18 11:11:34 +02:00

coverage_sources.list

configure.py support coverage profiles on standrad build modes

2024-01-18 11:11:34 +02:00

cql_serialization_format.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

db_clock.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

debug.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

debug.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

default.nix

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

Doxyfile

…

duration.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

duration.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

encoding_stats.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

enum_set.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

fix_system_distributed_tables.py

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

flake.lock

…

flake.nix

…

frozen_schema.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

frozen_schema.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

full_position.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

gc_clock.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

gdbinit

…

gen_segmented_compress_params.py

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

generic_server.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

generic_server.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

HACKING.md

HACKIGN.md: clarify the use of dbuild when running test.py

2024-09-10 13:40:45 +03:00

hashing_partition_visitor.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

idl-compiler.py

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

inet_address_vectors.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

init.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

init.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

install-dependencies.sh

Merge 'install-dependencies.sh: cleanups to silence shellcheck' from Kefu Chai

2024-12-24 10:27:27 +03:00

install.sh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

interval.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

keys.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

keys.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

LICENSE-ScyllaDB-Source-Available.md

LICENSE-ScyllaDB-Source-Available.md: fix markdown

2024-12-20 16:11:39 +02:00

main.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

map_difference.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

marshal_exception.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

multishard_mutation_query.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

multishard_mutation_query.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

mutation_query.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

mutation_query.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

noexcept_traits.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

partition_range_compat.hh

treewide: migrate from boost::copy_range to std::ranges::to

2024-12-26 11:46:26 +02:00

partition_slice_builder.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

partition_slice_builder.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

partition_snapshot_reader.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

partition_snapshot_row_cursor.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

protocol_server.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

querier.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

querier.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query_id.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query_ranges_to_vnodes.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query_ranges_to_vnodes.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query_result_merger.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query-request.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query-result-reader.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query-result-set.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query-result-set.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query-result-writer.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query-result.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

read_context.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

reader_concurrency_semaphore.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

reader_concurrency_semaphore.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

reader_permit.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

README.md

README: Update the version of C++ to C++23

2024-08-14 12:06:23 +03:00

real_dirty_memory_accounter.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

release.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

release.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

reversibly_mergeable.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

row_cache.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

row_cache.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

schema_mutations.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

schema_mutations.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

schema_upgrader.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

scylla_post_install.sh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

scylla-gdb.py

compaction/compaction_manager: make _tasks an intrusive list

2024-11-03 10:17:11 +02:00

SCYLLA-VERSION-GEN

SCYLLA-VERSION-GEN: correct the logic for skipping SCYLLA-*-FILE

2024-10-14 13:49:15 +03:00

seastarx.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

serialization_visitors.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

serializer_impl.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

serializer.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

serializer.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

service_permit.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

setup.py

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

shell.nix

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

sstables_loader.cc

sstables_loader: Propagate scope from API down

2024-12-23 19:28:05 +03:00

sstables_loader.hh

sstables_loader: Propagate scope from API down

2024-12-23 19:28:05 +03:00

supervisor.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

table_helper.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

table_helper.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

test.py

test: compile unit tests into a single executable

2024-12-22 19:14:09 +02:00

timeout_config.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

timeout_config.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

timestamp.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tombstone_gc_extension.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tombstone_gc_options.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tombstone_gc_options.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tombstone_gc-internals.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tombstone_gc.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tombstone_gc.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tox.ini

…

ubsan-suppressions.supp

…

unimplemented.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

unimplemented.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

validation.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

validation.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

version.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

view_info.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

vint-serialization.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

vint-serialization.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

zstd.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of the C++23 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain, This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The community forum and Slack channel are for users to discuss configuration, management, and operations of the ScyllaDB open source.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.1%

Python 26.7%

CMake 0.3%

GAP 0.3%

Shell 0.3%