scylladb

Files

Botond Dénes df2ac0f257 Merge 'test: dtest: schema_management_test.py: migrate from dtest' from Dario Mirovic

This PR migrates schema management tests from dtest to this repository.

One reason is that there is an ongoing effort to migrate tests from dtest to here.

Test `TestLargePartitionAlterSchema.test_large_partition_with_drop_column` failed with a timeout error once. The main suspects so far are infrastructure-related problems, such as infra congestion. The [logs from the test execution](https://jenkins.scylladb.com/job/scylla-master/job/dtest-release/1062/testReport/junit/schema_management_test/TestLargePartitionAlterSchema/Run_Dtest_Parallel_Cloud_Machines___Dtest___full_split001___test_large_partition_with_drop_column/), linked in the issue [test_large_partition_with_drop_column failed on TimeoutError #26932](https://github.com/scylladb/scylladb/issues/26932), show the following:
- `populate` works as intended: it starts, the column is dropped while it is still inserting, and the resulting exception is raised and intentionally ignored by the test, so there is no `Finish populate DB` message for 50 x 1490 records, as expected
- drop column works as intended: it interrupts `populate` and proceeds to flush
- flush **probably** works as intended: the logs are consistent with what we expect and with what I got in local test runs
- `read` is the only step that visibly got stuck, and it stayed stuck until the timeout, 5 minutes after the start

Migrating the test to this repo will also give us test start and end times on CI machines, in the SQL report database, which stores a start and end timestamp for each executed test. We will be able to see how long the test usually takes when it succeeds. This cannot be seen from the logs, because logs are not kept for successful tests.

This PR also adds a log message at the end of `database::flush_all_tables`. This will let us know whether a thread got stuck inside it or finished successfully. This addresses the **probably** part of the flush analysis step described above. If the issue reoccurs, we will have more information.

The test `test_large_partition_with_add_column` has not been executed for ~5 years. It was never migrated to pytest: its name was left as `large_partition_with_add_column_test`, and it was skipped. Now it is enabled and updated.

Both `test_large_partition_with_add_column` and `test_large_partition_with_drop_column` are improved.
Small performance improvements:
- Regex compilation is moved out of the stress function to the module level, to avoid recompilation.
- The `stress_object` for loop no longer materializes a list; it uses a generator expression instead.
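
The two micro-optimizations above could look roughly like the following. This is a minimal sketch, not the code from the PR: the pattern `ROW_COUNT_RE`, the function `total_rows`, and the shape of the input lines are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern. It is compiled once, when the module is imported,
# instead of on every call of the function below.
ROW_COUNT_RE = re.compile(r"rows=(\d+)")


def total_rows(lines):
    """Sum the row counts found in an iterable of log-like lines."""
    # Generator expression: matches are produced one at a time,
    # so no intermediate list is built.
    matches = (ROW_COUNT_RE.search(line) for line in lines)
    return sum(int(m.group(1)) for m in matches if m is not None)
```

Because `lines` is only iterated, the caller can pass a generator as well, and nothing larger than a single line needs to be held in memory.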

The tests in `TestLargePartitionAlterSchema` are `test_large_partition_with_add_column`
and `test_large_partition_with_drop_column`.

These tests need to reproduce the conditions that triggered a bug which was fixed around 5 years ago.

The scenario in which the problem could have happened has to involve:
- a large partition with many rows, large enough for preemption (every 0.5ms) to happen during the scan of the partition.
- appending writes to the partition (not overwrites)
- scans of the partition
- schema alter of that table. The issue is exposed only by adding or dropping a column, such that the added/dropped
  column lands in the middle (in alphabetical order) of the old column set.
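
The last condition can be illustrated with a small check. This is only a sketch: `lands_in_middle` is a hypothetical helper, and plain Python string comparison is assumed here as a stand-in for the alphabetical column order mentioned above.

```python
def lands_in_middle(old_columns, column):
    """Return True if `column` sorts strictly between two other columns.

    Works for both cases: an added column (not yet in `old_columns`)
    and a dropped column (still present in `old_columns`).
    """
    # Compare against the other columns only, in sorted order.
    others = sorted(c for c in old_columns if c != column)
    return len(others) >= 2 and others[0] < column < others[-1]
```

For example, adding `c2` to a table with columns `c1` and `c3` satisfies the condition, while adding `c4` to the same table does not.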

The way the test is set up is:
- fixed number of writes per populate call
- fixed number of reads

This has the following implications:
- if the machine executing the test is fast, all the writes are done before the 10-second sleep
- there are too many reads - most of them get executed after the test logic is done

This patch solves these issues in the following way:
- populate lazily generates write data, and stops when instructed by the `stop_populating` event
- read, which is done sequentially, stops when instructed by the `stop_reading` event
- the maximum number of operations is increased significantly, but the operations are stopped 1 second
  after the node flush; this makes sure there are enough operations during the test, but also that
  the test does not take more time than necessary
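
The event-driven stopping described above can be sketched with `threading.Event`. This is an illustration under assumptions, not the test's actual code: the helper names `populate`, `read_loop`, and `run_workload`, the `write` and `read` callbacks, and the row format are hypothetical; only the event names `stop_populating` and `stop_reading` come from the description.

```python
import threading
import time


def populate(write, stop_populating, max_ops):
    """Lazily generate rows and write them until the event is set."""
    rows = ((i, f"value-{i}") for i in range(max_ops))  # nothing is materialized
    written = 0
    for row in rows:
        if stop_populating.is_set():
            break
        write(row)
        written += 1
    return written


def read_loop(read, stop_reading, max_ops):
    """Issue sequential reads until the event is set."""
    done = 0
    for _ in range(max_ops):
        if stop_reading.is_set():
            break
        read()
        done += 1
    return done


def run_workload(write, read, run_for=1.0, max_ops=10_000_000):
    """Run both loops in threads and stop them after `run_for` seconds."""
    stop_populating = threading.Event()
    stop_reading = threading.Event()
    results = {}
    writer = threading.Thread(
        target=lambda: results.update(written=populate(write, stop_populating, max_ops)))
    reader = threading.Thread(
        target=lambda: results.update(read=read_loop(read, stop_reading, max_ops)))
    writer.start()
    reader.start()
    time.sleep(run_for)  # stands in for "alter schema, flush, wait 1 second"
    stop_populating.set()
    stop_reading.set()
    writer.join()
    reader.join()
    return results
```

With this shape, `max_ops` acts only as an upper bound. The amount of work actually done is determined by how long the events stay unset, so a fast machine keeps writing and reading for the whole window and a slow one does not drag the test out.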

Test execution time has been reduced severalfold. On a dev machine, the time the tests take is
reduced from 110 seconds to 34 seconds.

scylla-dtest PR that removes migrated tests:
[schema_management_test.py: remove tests already ported to scylladb repo #6427](https://github.com/scylladb/scylla-dtest/pull/6427)

Fixes #26932

This is a migration of existing tests to this repository. No need for backport.

Closes scylladb/scylladb#27106

* github.com:scylladb/scylladb:
  test: dtest: schema_management_test.py: speed up `TestLargePartitionAlterSchema` tests
  test: dtest: schema_management_test.py: fix large partition add column test
  test: dtest: schema_management_test.py: add `TestSchemaManagement.prepare`
  test: dtest: schema_management_test.py: test enhancements
  test: dtest: schema_management_test.py: make the tests work
  test: dtest: migrate setup and tools from dtest
  test: dtest: copy unmodified schema_management_test.py
  replica: database: flush_all_tables log on completion

2025-12-19 12:30:00 +02:00

alternator

test/alternator: delete unnecessary "pass"

2025-12-16 19:29:23 +03:00

boost

schema_registry: fix learning a schema with cdc schema

2025-12-17 20:01:00 +02:00

broadcast_tables

…

cluster

Merge 'test: dtest: schema_management_test.py: migrate from dtest' from Dario Mirovic

2025-12-19 12:30:00 +02:00

cql

vector_search: Restrict vector index tests to tablets only

2025-11-25 09:26:16 +02:00

cqlpy

Merge 'test/cqlpy: rename tests with duplicate name' from Nadav Har'El

2025-12-16 19:32:20 +03:00

ldap

main: auth: add auth cache dependency to auth service

2025-11-26 12:01:31 +01:00

lib

Merge 'service: support conversion of tablet keyspaces to rack-list using ALTER KEYSPACE' from Aleksandra Martyniuk

2025-12-17 10:05:06 +01:00

manual

replica/table: keep track of total pre-compression file size

2025-11-13 00:49:57 +01:00

nodetool

Revert "Merge 'Add option to use sstable identifier in snapshot' from Benny Halevy"

2025-12-12 03:55:13 +00:00

perf

service: pass topology and system_keyspace to load_balancer ctor

2025-12-16 13:25:38 +01:00

pylib

test/pylib/suite/python.py: Handle extra_cmdline_options correctly

2025-12-16 20:14:43 +03:00

pylib_test

…

raft

test/raft: fix race condition in failure_detector_test

2025-12-19 09:42:19 +02:00

resource

streaming: add pytest case to reproduce mutation loss issue

2025-11-18 09:34:41 +02:00

rest_api

test: add API tests for client_routes endpoints

2025-12-15 17:46:14 +01:00

scylla_gdb

test/scylla_gdb: use gcore instead of signal SIGSEGV to generate a coredump on failure

2025-12-16 06:53:43 +02:00

storage

test_user_writes_rejection: Disable speculative retries

2025-12-19 09:39:09 +02:00

unit

…

vector_search

unittest: fix vector_store_client_test_dns_refresh_aborted hangs

2025-12-02 12:22:44 +01:00

vector_search_validator

vector_search: add vector-search-validator tests

2025-11-24 17:26:04 +01:00

__init__.py

…

CMakeLists.txt

vector_search: implement building vector-search-validator

2025-11-24 17:26:04 +01:00

conftest.py

…

pytest.ini

…

README.md

…

README.md

Scylla in-source tests.

For details on how to run the tests, see docs/dev/testing.md

Shared C++ utilities and libraries are in lib/; shared Python ones are in pylib/.

- alternator - Python tests which connect to a single server and use the DynamoDB API
- unit, boost, raft - unit tests in C++
- cqlpy - Python tests which connect to a single server and use CQL
- topology* - tests that set up clusters and add/remove nodes
- cql - approval tests that use CQL and pre-recorded output
- rest_api - tests for Scylla REST API Port 9000
- scylla-gdb - tests for scylla-gdb.py helper script
- nodetool - tests for C++ implementation of nodetool

If you can use an existing folder, consider adding your test to it. New folders should be used for new large categories/subsystems, or when the test environment is significantly different from that of any existing suite, e.g. you plan to start scylladb with a different configuration, and you intend to add many tests and would like them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).

To add a new folder, create a new directory, and then copy & edit its suite.ini.