mirror of
https://github.com/scylladb/scylladb.git
synced 2026-04-30 13:17:01 +00:00
We have been seeing rare failures of the cql-pytest (translated from
Cassandra's unit tests) for testing TTL in secondary indexes:
cassandra_tests/validation/entities/secondary_index_test.py::testIndexOnRegularColumnInsertExpiringColumn
The problem is that the test writes an item with 1 second TTL, and then
sleeps *exactly* 1.0 seconds, and expects to see the item disappear
by that time. Which doesn't always happen:
The problem with that assumption stems from Scylla's TTL clock ("gc_clock")
being based on Seastar's lowres clock. lowres_clock only has a 10ms
"granularity": The time Scylla sees when deciding whether an item expires
may be up to 10ms in the past - the arbitrary point when the lowres timer
happened to last run. In rare overload cases, the inaccuracy may be even
grater than 10ms (if the timer got delayed by other things running).
So when Scylla is asked to expire an item in 1 second - we cannot be
sure it will be expired in exactly 1 second or less - the expiration
can be also around 10ms later.
So in this patch we change the test to sleep with more than enough
margin - 1.1 seconds (i.e., 100ms more than 1 second). By that time
we're sure the item must have expired.
Before this patch, I saw the test failing once every few hundred runs,
after this patch I ran if 2,000 times without a single failure.
Fixes #9008
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210712100655.953969-1-nyh@scylladb.com>