Files
scylladb/test
Nadav Har'El 2431f92967 alternator, test: add reproducer for issue about immediate LWT timeout
This patch adds a reproducer for issue #16261, where it was reported
that when Alternator read-modify-write (using LWT) operations to the
same partition are sent to different nodes, sometimes the operation
fails immediately, with an InternalServerError claiming to be a "timeout",
although this happens almost immediately (after a few milliseconds),
not after any real timeout.

The test uses 3 nodes, and 3 threads which send RMW operations to different
items in the same partition, and usually (though not with 100% certainty)
it reaches the InternalServerError in around 100 writes by each thread.
This InternalServerError looks like:

    Internal server error: exceptions::mutation_write_timeout_exception
    (Operation timed out for alternator_alternator_Test_1719157066704.alternator_Test_1719157066704 - received only 1 responses from 2 CL=LOCAL_SERIAL.)

The test also prints how much time it took for the request to fail,
for example:
    In incrementing 1,0 on node 1: error after 0.017074108123779297
This is 0.017 seconds - it's not the cas_contention_timeout_in_ms
timeout (1 second) or any other timeout.

If we enable trace logging, adding to topology_experimental_raft/suite.yaml
    extra_scylla_cmdline_options: ["--logger-log-level", "paxos=trace"]
we get the following TRACE-level message in the log:

    paxos - CAS[0] accept_proposal: proposal is partially rejected

This again shows the problem is "uncertainty" (partial rejection) and not
a timeout.

Refs #16261

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#19445
2025-08-01 11:58:52 +03:00
..
2025-07-24 20:04:43 +02:00

Scylla in-source tests.

For details on how to run the tests, see docs/dev/testing.md

Shared C++ utils, libraries are in lib/, for Python - pylib/

alternator - Python tests which connect to a single server and use the DynamoDB API unit, boost, raft - unit tests in C++ cqlpy - Python tests which connect to a single server and use CQL topology* - tests that set up clusters and add/remove nodes cql - approval tests that use CQL and pre-recorded output rest_api - tests for Scylla REST API Port 9000 scylla-gdb - tests for scylla-gdb.py helper script nodetool - tests for C++ implementation of nodetool

If you can use an existing folder, consider adding your test to it. New folders should be used for new large categories/subsystems, or when the test environment is significantly different from some existing suite, e.g. you plan to start scylladb with different configuration, and you intend to add many tests and would like them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).

To add a new folder, create a new directory, and then copy & edit its suite.ini.