From 378bcd69e353d84ca987c5176cb8e92b558c14dc Mon Sep 17 00:00:00 2001 From: Piotr Szymaniak Date: Wed, 1 Apr 2026 21:19:05 +0200 Subject: [PATCH] tree: add AGENTS.md router and improve AI instruction files Add AGENTS.md as a minimal router that directs AI agents to the relevant instruction files based on what they are editing. Improve the instruction files: - cpp.instructions.md: clarify seastarx.hh scope (headers, not "many files"), explain std::atomic restriction (single-shard model, not "blocking"), scope macros prohibition to new ad-hoc only, add coroutine exception propagation pattern, add invariant checking section preferring throwing_assert() over SCYLLA_ASSERT (issue #7871) - python.instructions.md: demote PEP 8 to fallback after local style, clarify that only wildcard imports are prohibited - copilot-instructions.md: show configure.py defaults to dev mode, add frozen toolchain section, clarify --no-gather-metrics applies to test.py, fix Python test paths to use .py extension, add license header guidance for new files Closes scylladb/scylladb#29023 --- .github/copilot-instructions.md | 33 +++++++++++---------- .github/instructions/cpp.instructions.md | 14 ++++++--- .github/instructions/python.instructions.md | 4 +-- AGENTS.md | 16 ++++++++++ 4 files changed, 46 insertions(+), 21 deletions(-) create mode 100644 AGENTS.md diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 6f100664f1..3da62bd475 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -5,13 +5,14 @@ High-performance distributed NoSQL database. Core values: performance, correctne ## Build System -### Modern Build (configure.py + ninja) +### Using native OS environment ```bash -# Configure (run once per mode, or when switching modes) -./configure.py --mode= # mode: dev, debug, release, sanitize +# Configure (run once) +./configure.py # Build everything -ninja -build # e.g., ninja dev-build +ninja -build # modes: dev, debug, release, sanitize + # dev is recommended for development (fastest compilation) # Build Scylla binary only (sufficient for Python integration tests) ninja build//scylla @@ -20,6 +21,9 @@ ninja build//scylla ninja build//test/boost/ ``` +### Using frozen toolchain (Docker) +Prefix any build command with `./tools/toolchain/dbuild`. + ## Running Tests ### C++ Unit Tests @@ -36,9 +40,9 @@ ninja build//test/boost/ ``` **Important:** -- Use full path with `.cc` extension (e.g., `test/boost/test_name.cc`, not `boost/test_name`) +- Use full path with `.cc` extension (e.g., `test/boost/memtable_test.cc`) - To run a single test case, append `::` to the file path -- If you encounter permission issues with cgroup metric gathering, add `--no-gather-metrics` flag +- If you encounter permission issues with cgroup metrics, add `--no-gather-metrics` to the `./test.py` command **Rebuilding Tests:** - test.py does NOT automatically rebuild when test source files are modified @@ -60,25 +64,21 @@ ninja build//scylla # Run a single test case from a file ./test.py --mode= test//.py:: -# Run all tests in a directory -./test.py --mode= test// - # Examples ./test.py --mode=dev test/alternator/ -./test.py --mode=dev test/cluster/test_raft_voters.py::test_raft_limited_voters_retain_coordinator ./test.py --mode=dev test/cqlpy/test_json.py +./test.py --mode=dev test/cluster/test_raft_voters.py::test_raft_limited_voters_retain_coordinator # Optional flags -./test.py --mode=dev test/cluster/test_raft_no_quorum.py -v # Verbose output -./test.py --mode=dev test/cluster/test_raft_no_quorum.py --repeat 5 # Repeat test 5 times +./test.py --mode=dev test/cluster/test_raft_no_quorum.py -v --repeat 5 ``` **Important:** -- Use full path with `.py` extension (e.g., `test/cluster/test_raft_no_quorum.py`, not `cluster/test_raft_no_quorum`) +- Use full path with `.py` extension - To run a single test case, append `::` to the file path - Add `-v` for verbose output - Add `--repeat ` to repeat a test multiple times -- After modifying C++ source files, only rebuild the Scylla binary for Python tests - building the entire repository is unnecessary +- After modifying C++ source files, only rebuild the Scylla binary for Python tests ## Code Philosophy - Performance matters in hot paths (data read/write, inner loops) @@ -92,10 +92,13 @@ ninja build//scylla ## Test Philosophy - Performance matters. Tests should run as quickly as possible. Sleeps in the code are highly discouraged and should be avoided, to reduce run time and flakiness. - Stability matters. Tests should be stable. New tests should be executed 100 times at least to ensure they pass 100 out of 100 times. (use --repeat 100 --max-failures 1 when running it) -- Unit tests should ideally test one thing and one thing only. +- Unit tests should ideally test one thing only. - Tests for bug fixes should run before the fix - and show the failure and after the fix - and show they now pass. - Tests for bug fixes should have in their comments which bug fixes (GitHub or JIRA issue) they test. - Tests in debug are always slower, so if needed, reduce number of iterations, rows, data used, cycles, etc. in debug mode. - Tests should strive to be repeatable, and not use random input that will make their results unpredictable. - Tests should consume as little resources as possible. Prefer running tests on a single node if it is sufficient, for example. +## New Files +- Include `LicenseRef-ScyllaDB-Source-Available-1.1` in the SPDX header +- Use the current year for new files; for existing code keep the year as is diff --git a/.github/instructions/cpp.instructions.md b/.github/instructions/cpp.instructions.md index 2c14e7f556..1c2d6eaafc 100644 --- a/.github/instructions/cpp.instructions.md +++ b/.github/instructions/cpp.instructions.md @@ -25,6 +25,8 @@ applyTo: "**/*.{cc,hh}" - Use `seastar::gate` for shutdown coordination - Use `seastar::semaphore` for resource limiting (not `std::mutex`) - Break long loops with `maybe_yield()` to avoid reactor stalls +- Most Scylla code runs on a single shard where atomics are unnecessary +- Use Seastar message passing for cross-shard communication ## Coroutines ```cpp @@ -36,10 +38,16 @@ seastar::future func() { ## Error Handling - Throw exceptions for errors (futures propagate them automatically) +- In coroutines, use `co_await coroutine::return_exception_ptr()` or `co_return coroutine::exception()` to avoid the overhead of throwing - In data path: avoid exceptions, use `std::expected` (or `boost::outcome`) instead - Use standard exceptions (`std::runtime_error`, `std::invalid_argument`) - Database-specific: throw appropriate schema/query exceptions +## Invariant Checking +- Prefer `throwing_assert()` (`utils/assert.hh`), it logs and throws instead of aborting +- Use `SCYLLA_ASSERT` where critical to system stability where no clean shutdown is possible, it aborts +- Use `on_internal_error()` for should-never-happen conditions that should be logged with backtrace + ## Performance - Pass large objects by `const&` or `&&` (move semantics) - Use `std::string_view` for non-owning string references @@ -68,7 +76,7 @@ seastar::future func() { - Use `#pragma once` - Include order: own header, C++ std, Seastar, Boost, project headers - Forward declare when possible -- Never `using namespace` in headers (exception: `using namespace seastar` is globally available via `seastarx.hh`) +- Never `using namespace` in headers. Exception: most headers include `seastarx.hh`, which provides `using namespace seastar` project-wide. ## Documentation - Public APIs require clear documentation @@ -101,10 +109,8 @@ seastar::future func() { - `malloc`/`free` - `printf` family (use logging or fmt) - Raw pointers for ownership -- `using namespace` in headers - Blocking operations: `std::sleep`, `std::read`, `std::mutex` (use Seastar equivalents) -- `std::atomic` (reserved for very special circumstances only) -- Macros (use `inline`, `constexpr`, or templates instead) +- New ad-hoc macros (prefer `inline`, `constexpr`, or templates; established project macros like `SCYLLA_ASSERT` are fine) ## Testing When modifying existing code, follow TDD: create/update test first, then implement. diff --git a/.github/instructions/python.instructions.md b/.github/instructions/python.instructions.md index dc327cc798..763a2e032f 100644 --- a/.github/instructions/python.instructions.md +++ b/.github/instructions/python.instructions.md @@ -7,7 +7,7 @@ applyTo: "**/*.py" **Important:** Match existing code style. Some directories (like `test/cqlpy` and `test/alternator`) prefer simplicity over type hints and docstrings. ## Style -- Follow PEP 8 +- Match style of the file and directory you are editing; fall back to PEP 8 if unclear - Use type hints for function signatures (unless directory style omits them) - Use f-strings for formatting - Line length: 160 characters max @@ -25,7 +25,7 @@ from cassandra.cluster import Cluster from test.utils import setup_keyspace ``` -Never use `from module import *` +Avoid wildcard imports (`from module import *`). ## Documentation All public functions/classes need docstrings (unless the current directory conventions omit them): diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000000..b499c65ef2 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,16 @@ +# ScyllaDB — AI Agent Instructions + +This file routes you to the relevant instruction files. +Do NOT load all files at once — read only what applies to your current task. + +## Instruction Files + +- `.github/copilot-instructions.md` — build system, test runner, code philosophy, test philosophy +- `.github/instructions/cpp.instructions.md` — C++ style, Seastar patterns, memory, error handling (for `*.cc`, `*.hh`) +- `.github/instructions/python.instructions.md` — Python style, testing conventions (for `*.py`) + +## Which files to read + +- **Always read** `.github/copilot-instructions.md` for build/test commands and project values +- **If editing C++ files** (`*.cc`, `*.hh`): also read `.github/instructions/cpp.instructions.md` +- **If editing Python files** (`*.py`): also read `.github/instructions/python.instructions.md`