Use an internally create instance of random engine. Passing a readily
seeded engine from the outside is pointless now that we have a mechanism
to seed entire test suites with a command line algorithm: the internal
engine is seeded from tests::random, so the seed of the test suite
determines the internal seed as well.
Update the sole user of this method (mutation_writer_test.cc) to not
generate local seeds anymore.
The new collector parameter is a pointer to a
`compaction_garbage_collector` implementation. This collector is passed
all atoms that are expired and would be discarded. The body of
`compact_and_expire()` was changed so that it checks cells' tombstone
coverage before it checks their expiry, so that cells that are both
covered by a tombstone and also expired are not passed to the collector.
The collector param is optional and defaults to nullptr. To accommodate
the collector, which needs to know the column id, a new `column_id`
parameter was added as well.
"
Fixes#2027
Modifies inet address type in scylla to use seastar::net::inet_address,
and removes explicit use of ipv4_addr in various network code in favour
of socket_address. Thus capable of resolving and binding to ipv6.
Adds config option to enable/disable ipv6 (default enabled), so
upgrading cluster can continue to work while running mixed version
nodes (since gossip message address serialization becomes different).
"
* 'calle/ipv6' of https://github.com/elcallio/scylla:
test-serialization: Add small roundtrip test for inet address (v4 + v6)
inet_address/init: Make ipv6 default enabled
db::config: Add enable ipv6 switch (default off)
gms::inet_address: Make serialization ipv6 aware
Remove usage of inet_address::raw_addr()
Replace use of "ipv4_addr" with socket_address
inet_address: Add optional family to lookup
gms::inet_address: Change inet_address to wrap actual seastar::net::inet_address
types: Add ipv6_address support
Post commit b3adabda2d
(Reduce logalloc differences between debug and release)
logalloc_test's memory footprint has grown, in particular
in test_region_groups, and it triggers the oom killer on
our test automation machines.
This patch scales down this test case so it requires less memory.
Fixes#4669
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Currently the handling of partition tombstones is broken in multiple
ways:
* The partition-tombstone is lost when the bucket is calculated for its
timestamp (due to a misplaced `std::exchange()`).
* When the `partition_start` fragment (containing the partition
tombstone) is actually written to the bucket we emit another
`partition_start` fragment before it because the bucket has not seen
that partition before and we fail to notice that we are actually writing
the partition header.
This bug was allowed to fly under the radar because the unit test was
accidentally not creating partition tombstones in the generated data
(due to a mistake). It was discovered while working on unit tests for
another test and fixing the data generation function to actually
generate partition tombstones.
This patch fixes both problems in the handling of partition tombstones
but it doesn't yet fixes the test. That is deferred until the patch
series which uncovered this bug is merged to avoid merge conflicts.
The other series mentioned here is: [PATCH v6 00/15] compaction: allow
collecting purged data
Fixes: #4683
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20190710092427.122623-1-bdenes@scylladb.com>
"
This miniseries expands big_decimal interface with convenience operators
(-=, +, -), provides test cases for it and makes one of the constructors
explicit.
Tests: unit(dev)
"
* 'expand_big_decimal_interface' of https://github.com/psarna/scylla:
utils: make string-based big decimal constructor explicit
tests: add more operators to big decimal tests
utils: add operators to big_decimal
"
Implement LIKE parsing, intermediate representation, and query processing. Add tests
for this implementation (leaving the LIKE functionality tests in
tests/like_matcher_test.cc).
Refs #4477.
"
* 'finish-like' of https://github.com/dekimir/scylla:
cql3: Add LIKE operator to CQL grammar
cql3: Ensure LIKE filtering for partition columns
cql3: Add LIKE restriction
cql3: Add LIKE relation
This patchset allows changing the configuration at runtime, The user
triggers this by editing the configuration file normally, then
signalling the database with SIGHUP (as is traditional).
The implementation is somewhat complicated due the need to store
non-atomic mutable state per-shard and to synchronize the values in
all shards. This is somewhat similar to Seastar's sharded<>, but that
cannot be used since the configuration is read before Seastar is
initialized (due to the need to read command-line options).
Tests: unit (dev, debug), manual test with extra prints (dev)
Ref #2689Fixes#2517.
Once we start using updateable_value<>, we must make it refer
to the updateable_value_source<> on the same shard, and to do
that we need to call broadcast_to_all_shards() first (this
creates the per-shard copy).
The updateable_value and updateable_value_source classes allow broadcasting
configuration changes across the application. The updateable_value_source class
represents a value that can be updated, and updateable_value tracks its source
and reflects changes. A typical use replaces "uint64_t config_item" with
"updateable_value<uint64_t> config_item", and from now on changes to the source
will be reflected in config_item. For more complicated uses, which must run some
callback when configuration changes, you can also call
config_item.observe(callback) to be actively notified of changes.
Currently, we allow adjusting configuration via
cfg.whatever() = 5;
by returning a mutable reference from cfg.whatever(). Soon, however, this operation
will have side effects (updating all references to the config item, and triggering
notifiers). While this can be done with a proxy, it is too tricky.
Switch to an ordinary setter interface:
cfg.whatever.set(5);
Because boost::program_options no longer gets a reference to the value to be written
to, we have to move the update to a notifier, and the value_ex() function has to
be adjusted to infer whether it was called with a vector type after it is
called, not before.
config_file and db::config are soon not going to be copyable. The reason is that
in order to support live updating, we'll need per-shard copies of each value,
and per-shard tracking of references to values. While these can be copied, it
will be an asycnronous operation and thus cannot be done from a copy constructor.
So to prepare for these changes, replace all copies of db::config by references
and delete config_file's copy constructor.
Some existing references had to be made const in order to adapt the const-ness
of db::config now being propagated (rather than being terminated by a non-const
copy).
Copying the config object breaks the link between the original and the copied
object, so updates to config items will not be visible. To allow updates, don't
copy any more, and instead keep a pointer.
The pointer won't work will once config is updateable, since the same object is
shared across multiple shard, but that can be addressed later.
We want to eliminate the default database constructor (to be explained in
the next patch), so eliminate its only use in gossip_test, using the regular
constructor instead.
With this unused descriptors and objects should always be poisoned.
* https://github.com/espindola/scylla/ align-descriptors-so-that-they-are-poisoned-v4:
Convert macros to inline functions
More precise poisoning in logalloc
This change aligns descriptors and values to 8 bytes so that poisoning
a descriptor or value doesn't interfere with other descriptors and
values.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
"
When writing streamed data into sstables, while using time window
compaction strategy, we have to emit a new sstable for each time window.
Otherwise we can end up with sstables, mixing data from wildly different
windows, ruining the compaction strategy's ability to drop entire
sstables when all data within is expired. This gets worse as these mixed
sstables get compacted together with sstables that used to contain a
single time window.
This series provides a solution to this by segregating the data by its
atom's the time-windows. This is done on the new RPC streaming and the
new row-level, repair, memtable-flush and compaction, ensuring that the
segregation requirement is respected at all times.
Fixes: #2687
"
* 'segregate-data-into-sstables-by-time-window-streaming/v2.1' of ssh://github.com/denesb/scylla:
streaming,repair: restore indentation
repair: pass the data stream through the compaction strategy's interposer consumer
streaming: pass the data stream through the compaction strategy's interposer consumer
TWCS: implement add_interposer_consumer()
compaction_strategy: add add_interposer_consumer()
Add mutation_source_metadata
tests: add unit test for timestamp_based_splitting_writer
Add timestamp_based_splitting_writer
Introduce mutation_writer namespace
Most of our tests use overly simplistic schemas (`simple_schema`) or
very specialized ones that focus on exercising a specific area of the
tested code. This is fine in most places as not all code is schema
dependent, however practice has showed that there can be nasty bugs
hiding in dark corners that only appear with a schema that has a
specific combination of types.
This series introduces `tests::random_schema` a utility class for
generating random schemas and random data for them. An important goal is
to make using random schemas in tests as simple and convenient as
possible, therefore fostering the appearance of tests using random
schemas.
Random schema was developed to help testing code I'm currently working
on, which segregates data by time-windows. As I wasn't confident in my
ability to think of every possible combination of types that can break
my code I came up with random-schema to help me finding these corner
cases. So far I consider it a success, it already found bugs in my code
that I'm not sure I would have found if I had relied on specific
schemas. It also found bugs in unrelated areas of the code which proves
my point in the first paragraph.
* https://github.com/denesb/scylla.git random_schema/v5:
tests/data_model: approximate to the modeled data structures
data_value: add ascii constructor
tests/random-utils.hh: add stepped_int_distribution
tests/random-utils.hh: get_int() add overloads that accept external
rand engine
tests/random-utils.hh: add get_real()
tests: introduce random_schema
Currently there is a single mutation_writer: `multishard_writer`,
however in the next path we are going to add another one. This is the
right moment to move these into a common namespace (and folder), we
have way too much stuff scattered already in the top-level namespace
(and folder).
Also rename `tests/multishard_writer_test.cc` to
`tests/mutation_writer_test.cc`, this test-suite will be the home of all
the different mutation writer's unit test cases.
O_DSYNC causes commitlog to pre-allocate each commitlog segment by writing
zeroes into it. In normal operation, this is amortized over the many
times the segment will be reused. In tests, this is wasteful, but under
the default workstation configuration with /tmp using tmpfs, no actual
writes occur.
However on a non-default configuration with /tmp mounted on a real disk,
this causes huge disk I/O and eventually a crash (observed in
schema_change_test). The crash is likely only caused indirectly, as the
extra I/O (exacerbated by many tests running in parallel) xcauses timeouts.
I reproduced this problem by running 15 copies of schema_change_test in
parallel with /tmp mounted on a real filesystem. Without this change, I
usually observe one or two of the copies crashing, with the change they
complete (and much more quickly, too).
"
If the database supports infinite bound range deletions,
CQL layer will no longer throw an error indicating that both ranges
need to be specified.
Fixes#432
Update test_range_deletion_scenarios unit test accordingly.
"
* 'cql3-lift-infinite-bound-check' of https://github.com/bhalevy/scylla:
cql3: lift infinite bound check if it's supported
service: enable infinite bound range deletions with mc
database: add flag for infinite bound range deletions
random_schema is a utility class that provides methods for generating
random schemas as well as generating data (mutations) for them. The aim
is to make using random schemas in tests as simple and convenient as
is using `simple_schema`. For this reason the interface of
`random_schema` follows closely that of `simple_schema` to the extent
that it makes sense. An important difference is that `random_schema`
relies on `data_model` to actually build mutations. So all its
mutation-related operations work with `data_model::mutation_descrition`
instead of actual `mutation` objects. Once the user arrived at the
desired mutation description they can generate an actual mutation via
`data_model::mutation_description::build()`.
In addition to the `random_schema` class, the `random_schema.hh` header
exposes the generic utility classes for generating types and values
that it internally uses.
random_schema is fully deterministic. Using the same seed and the same
set of operations is guaranteed to result in generating the same schema
and data.
Make the the data modelling structures model their "real" counterparts
more closely, allowing the user greater control on the produced data.
The changes:
* Add timestamp to atomic_value (which is now a struct, not just an
alias to bytes).
* Add tombstone to collection.
* Add row_tombstone to row.
* Add bound kinds and tombstone to range_tombstone.
Great care was taken to preserve backward compatibility, to avoid
unnecessary changes in existing code.
If the database supports infinite bound range deletions,
CQL layer will no longer throw an error indicating that both ranges
need to be specified.
[bhalevy] Update test_range_deletion_scenarios unit test accordingly.
Fixes#432
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Indexed queries used to erroneously return partial per-page results
for aggregation queries. This test case used to reproduce the problem
and now ensures that there would be no regressions.
Refs #4540
"
partitioned_sstable_set is not self sufficient because it relies on
compatible_ring_position_view, which in turn relies on lifetime of
sstable object. This leads to use-after-free. Fix this problem by
introducing compatible_ring_position and using it in p__s__s.
Fixes#4572.
Test: unit (dev), compaction dtests (dev)
"
* 'projects/fix_partitioned_sstable_set/v4' of ssh://github.com/bhalevy/scylla:
tests: Test partitioned sstable set's self-sufficiency
sstables: Fix partitioned_sstable_set by making it self sufficient
Introduce compatible_ring_position and compatible_ring_position_or_view
The code that decides whether a query should used indexing was buggy - a partition key index might have influenced the decision even if the whole partition key was passed in the query (which effectively means that indexing it is not necessary).
Fixes#4539
Closes https://github.com/scylladb/scylla/pull/4544
Merged from branch 'fix_deciding_whether_a_query_uses_indexing' of git://github.com/psarna/scylla
tests: add case for partition key index and filtering
cql3: fix deciding if a query uses indexing
Currently, calling unfreeze() using the wrong version of the schema
results in undefined behavior. That can cause hard-to-debug
problems. Better to throw in such cases.
Refs #4549.
Tests:
- unit (dev)
Message-Id: <1560459022-23786-1-git-send-email-tgrabiec@scylladb.com>
To provide test reproducibility use the seastar local_random_engine.
To reproduce a run, use the --random-seed command line option
with the seed printed accordingly.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190613114307.31038-1-bhalevy@scylladb.com>