Commit Graph

1986 Commits

Author SHA1 Message Date
Avi Kivity
3497891cf9 utils: spell "barrett" correctly
As P. T. Barnoom famously said, "write what you like but spell my name
correctly". Following that, we correct the spelling of Barrett's name
in the source tree.

Closes #11989
2022-11-16 16:30:38 +02:00
Botond Dénes
bd1fcbc38f Merge 'Introduce reverse vector_deserializer.' from Michał Radwański
As indicated in #11816, we'd like to enable deserializing vectors in reverse.
The forward deserialization is achieved by reading from an input_stream. The
input stream internally is a singly linked list with complicated logic. In order to
allow for going through it in reverse, instead when creating the reverse vector
initializer, we scan the stream and store substreams to all the places that are a
starting point for a next element. The iterator itself just deserializes elements
from the remembered substreams, this time in reverse.

Fixes #11816

Closes #11956

* github.com:scylladb/scylladb:
  test/boost/serialization_test.cc: add test for reverse vector deserializer
  serializer_impl.hh: add reverse vector serializer
  serializer_impl: remove unneeded generic parameter
2022-11-16 07:37:24 +02:00
Botond Dénes
34f29c8d67 Merge 'Use with_sstable_directory() helper in tests' from Pavel Emelyanov
The helper is already widely used, one (last) test case can benefit from using it too

Closes #11978

* github.com:scylladb/scylladb:
  test: Indentation fix after previous patch
  test: Wse with_sstable_directory() helper
2022-11-15 14:21:48 +01:00
Nadav Har'El
8a4ab87e44 Merge 'utils: crc: generate crc barrett fold tables at compile time' from Avi Kivity
We use Barrett tables (misspelled in the code unfortunately) to fold
crc computations of multiple buffers into a single crc. This is important
because it turns out to be faster to compute crc of three different buffers
in parallel rather than compute the crc of one large buffer, since the crc
instruction has latency 3.

Currently, we have a separate code generation step to compute the
fold tables. The step generates a new C++ source files with the tables.
But modern C++ allows us to do this computation at compile time, avoiding
the code generation step. This simplifies the build.

This series does that. There is some complication in that the code uses
compiler intrinsics for the computation, and these are not constexpr friendly.
So we first introduce constexpr-friendly alternatives and use them.

To prove the transformation is correct, I compared the generated code from
before the series and from just before the last step (where we use constexpr
evaluation but still retain the generated file) and saw no difference in the values.

Note that constexpr is not strictly needed - we could have run the code in the
global variables' initializer. But that would cause a crash if we run on a pre-clmul
machine, and is not as fun.

Closes #11957

* github.com:scylladb/scylladb:
  test: crc: add unit tests for constexpr clmul and barrett fold
  utils: crc combine table: generate at compile time
  utils: barrett: inline functions in header
  utils: crc combine table: generate tables at compile time
  utils: crc combine table: extract table generation into a constexpr function
  utils: crc combine table: extract "pow table" code into constexpr function
  utils: crc combine table: store tables std::arrray rather than C array
  utils: barrett: make the barrett reduction constexpr friendly
  utils: clmul: add 64-bit constexpr clmul
  utils: barrett: extract barrett reduction constants
  utils: barrett: reorder functions
  utils: make clmul() constexpr
2022-11-15 14:21:48 +01:00
Pavel Emelyanov
8dcd9d98d6 test: Indentation fix after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-14 20:11:01 +03:00
Pavel Emelyanov
c9128e9791 test: Wse with_sstable_directory() helper
It's already used everywhere, but one test case wires up the
sstable_directory by hand. Fix it too, but keep in mind, that the caller
fn stops the directory early.

(indentation is deliberately left broken)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-11-14 20:11:01 +03:00
Michał Radwański
32c60b44c5 test/boost/serialization_test.cc: add test for reverse vector
deserializer

This test is just a copy-pasted version of forward serializer test.
2022-11-14 16:06:24 +01:00
Avi Kivity
b8cb34b928 test: crc: add unit tests for constexpr clmul and barrett fold
Check that the constexpr variants indeed match the runtime variants.

I verified manually that exactly one computation in each test is
executed at run time (and is compared against a constant).
2022-11-13 16:22:29 +02:00
Raphael S. Carvalho
835927a2ad test/sstable_compaction_test: Switch to table_state::get_backlog_tracker()
Important for decoupling backlog tracker from table's compaction
strategy.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-11-11 09:17:36 -03:00
Botond Dénes
725e5b119d Revert "replica: Pick new generation for SSTables being moved from staging dir"
This reverts commit ba6186a47f.

Said commit violates the widely held assumption that sstables
generations can be used as sstable identity. One known problem caused
this is potential OOO partition emitted when reading from sstables
(#11843). We now also have a better fix for #11789 (the bug this commit
was meant to fix): 4aa0b16852. So we can
revert without regressions.

Fixes: #11843

Closes #11886
2022-11-09 16:35:31 +02:00
Michał Chojnowski
3e0c7a6e9f test: sstable_datafile_test: eliminate a use of std::regex to prevent stack overflow
This usage of std::regex overflows the seastar::thread stack size (128 KiB),
causing memory corruption. Fix that.

Closes #11911
2022-11-08 14:41:34 +02:00
Avi Kivity
ca2010144e test: loading_cache_test: fix use-after-free in test_loading_cache_remove_leaves_no_old_entries_behind
We capture `key` by reference, but it is in a another continuation.

Capture it by value, and avoid the default capture specification.

Found by clang 15 + asan + aarch64.

Closes #11884
2022-11-03 17:23:40 +02:00
Avi Kivity
9ebac12e60 test: mutation-test: fix off-by-one in test_large_collection_allocation
The test wants to see that no allocations larger than 128k are present,
but sets the warning threshold to exactly 128k. Due to an off-by-one in
Seastar, this went unnoticed. However, now that the off-by-one in Seastar
is fixed [1], this test starts to fail.

Fix by setting the warning threshold to 128k + 1.

[1] 429efb5086

Closes #11817
2022-10-21 10:04:40 +03:00
Avi Kivity
db79f1eb60 Merge 'cql3: expr: Add unit tests for evaluate()' from Jan Ciołek
This PR adds some unit tests for the `expr::evaluate()` function.

At first I wanted to add the unit tests as part of #11658, but their size grew and grew, until I decided that they deserve their own pull request.

I found a few places where I think it would be better to behave in a different way, but nothing serious.

Closes #11815

* github.com:scylladb/scylladb:
  test/boost: move expr_test_utils.hh to .hh and .cc in test/lib
  cql3: expr: Add unit tests for bind_variable validation of collections
  cql3: expr: Add test for subscripted list and map
  cql3: expr: Add test for usertype_constructor
  cql3: expr: Add test for tuple_constructor
  cql3: expr: Add tests for evaluation of collection constructors
  cql3: expr: Add tests for evaluation of column_values and bind_variables
  cql3: expr: Add constant evaluation tests
  test/boost: Add expr_test_utils.hh
  cql3: Add ostream operator for raw_value
  cql3: add is_empty_value() to raw_value and raw_value_view
2022-10-20 22:55:34 +03:00
Jan Ciolek
4c4ed8e6df test/boost: move expr_test_utils.hh to .hh and .cc in test/lib
expr_test_utils.hh was a header file with helper methods for
expression tests. All functions were inline, because I didn't
know how to create and link a .cc file in test/boost.

Now the header is split into expr_test_utils.hh and expr_test_utils.cc
and moved to test/lib, which is designed to keep this kind of files.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-20 17:31:37 +02:00
Jan Ciolek
75b27cb61c cql3: expr: Add unit tests for bind_variable validation of collections
evaluating a bind variable should validate
collection values.

Test that bound collection values are validated,
even in case of a nested collection.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-20 12:12:03 +02:00
Jan Ciolek
c4651e897f cql3: expr: Add test for subscripted list and map
Test that subscripting lists and maps works as expected.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-20 12:12:03 +02:00
Jan Ciolek
5a00c3dd76 cql3: expr: Add test for usertype_constructor
Test that evaluate(usertype_constructor) works
as expected.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-20 12:12:03 +02:00
Jan Ciolek
8f6309bd66 cql3: expr: Add test for tuple_constructor
Test that evaluate(tuple_constructor) works
as expected.

It was necessary to implement a custom function
for serializing tuples, because some tests
require the tuple to contain unset_value
or an empty value, which is impossible
to express using the exisiting code.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-20 12:12:03 +02:00
Jan Ciolek
5ae719d51a cql3: expr: Add tests for evaluation of collection constructors
Test that evaluate(collection_constructor) works as expected.

Added a bunch of utility methods for creating
collection values to expr_test_utils.hh.

I was forced to write custom serialization of
collections. I tried to use data_value,
but it doesn't allow to express unset_value
and empty values.

The custom serialization isnt actually used
in this specific commit, but it's needed
in the following ones.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-20 12:12:02 +02:00
Pavel Emelyanov
01b1f56bd7 code: Deglobalize snitch
All uses of snitch not have their own local referece. The global
instance can now be replaced with the one living in main (and tests)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-20 12:33:41 +03:00
Pavel Emelyanov
8e4e3f7185 tests: Get local reference on global snitch instance once
Some tests actively use global snitch instance. This patch makes each
test get a local reference and use it everywhere. Next patch will
replace global instance with local one

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-20 12:33:40 +03:00
Pavel Emelyanov
1674882220 snitch: Add sharded<snitch_ptr> arg to reset_snitch()
The method replaces snitch instance on the existing sharded<snitch_ptr>
and the "existing" is nowadays the global instance. This patch changes
it to use local reference passed from API code

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-20 12:33:34 +03:00
Avi Kivity
6b0afb968d Merge 'reader_concurrency_semaphore: add set_resources()' from Botond Dénes
Allowing to change the total or initial resources the semaphore has. After calling `set_resources()` the semaphore will look like as if it was created with the specified amount of resources when created.

Use the new method in `replica::database::revert_initial_system_read_concurrency_boost()` so it doesn't lead to strange semaphore diagnostics output. Currently the system semaphore has 90/100 count units when there are no reads against it, which has led to some confusion.

I also plan on using the new facility in enterprise.

Closes #11772

* github.com:scylladb/scylladb:
  replica/database: revert initial boost to system semaphore with set_resources()
  reader_concurrency_semaphore: add set_resources()
2022-10-19 18:04:20 +03:00
Raphael S. Carvalho
ba6186a47f replica: Pick new generation for SSTables being moved from staging dir
When moving a SSTable from staging to base dir, we reused the generation
under the assumption that no SSTable in base dir uses that same
generation. But that's not always true.

When reshaping staging dir, reshape compaction can pick a generation
taken by a SSTable in base dir. That's because staging dir is populated
first and it doesn't have awareness of generations in base dir yet.

When that happens, view building will fail to move SSTable in staging
which shares the same generation as another in base dir.

We could have played with order of population, populating base dir
first than staging dir, but the fragility wouldn't be gone. Not
future proof at all.
We can easily make this safe by picking a new generation for the SSTable
being moved from staging, making sure no clash will ever happen.

Fixes #11789.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #11790
2022-10-19 15:33:30 +03:00
Jan Ciolek
1b7acc758e cql3: expr: Add tests for evaluation of column_values and bind_variables
Add tests which test that evaluate(column_value)
and evaluate(bind_variable) work as expected.

values of columns and bind variables are
kept in evaluation_inputs, so we need to mock
them in order for evaluate() to work.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-19 10:30:51 +02:00
Jan Ciolek
0f29015d9f cql3: expr: Add constant evaluation tests
Add unit test for evaluating expr::constant values.
evaluate(constant) just returns constant.value,
so there is no point in trying all the possible combinations.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-19 10:30:42 +02:00
Botond Dénes
2d581e9e8f Merge "Maintain dc/rack by topology" from Pavel Emelyanov
"
There's an ongoing effort to move the endpoint -> {dc/rack} mappings
from snitch onto topology object and this set finalizes it. After it the
snitch service stops depending on gossiper and system keyspace and is
ready for de-globalization. As a nice side-effect the system keyspace no
longer needs to maintain the dc/rack info cache and its starting code gets
relaxed.

refs: #2737
refs: #2795
"

* 'br-snitch-dont-mess-with-topology-data-2' of https://github.com/xemul/scylla: (23 commits)
  system_keyspace: Dont maintain dc/rack cache
  system_keyspace: Indentation fix after previous patch
  system_keyspace: Coroutinuze build_dc_rack_info()
  topology: Move all post-configuration to topology::config
  snitch: Start early
  gossiper: Do not export system keyspace
  snitch: Remove gossiper reference
  snitch: Mark get_datacenter/_rack methods const
  snitch: Drop some dead dependency knots
  snitch, code: Make get_datacenter() report local dc only
  snitch, code: Make get_rack() report local rack only
  storage_service: Populate pending endpoint in on_alive()
  code: Populate pending locations
  topology: Put local dc/rack on topology early
  topology: Add pending locations collection
  topology: Make get_location() errors more verbose
  token_metadata: Add config, spread everywhere
  token_metadata: Hide token_metadata_impl copy constructor
  gosspier: Remove messaging service getter
  snitch: Get local address to gossip via config
  ...
2022-10-19 06:50:21 +03:00
Jan Ciolek
429600a957 test/boost: Add expr_test_utils.hh
Add a header file which will contain
utilities for writing expression tests.

For now it contains simple functions
like make_int_constant(), but there
are many more to come.

I feel like it's cleaner to put
all these functions in a separate
file instead of having them spread randomly
between tests.

It also enables code reuse so that
future expression tests can reuse
these functions instead of writing
them from scratch.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-18 22:48:33 +02:00
Botond Dénes
7fbad8de87 reader_concurrency_semaphore: unify admission logic across all paths
The semaphore currently has two admission paths: the
obtain_permit()/with_permit() methods which admits permits on user
request (the front door) and the maybe_admit_waiters() which admits
permits based on internal events like memory resource being returned
(the back door). The two paths used their own admission conditions
and naturally this means that they diverged in time. Notably,
maybe_admit_waiters() did not look at inactive readers assuming that if
there are waiters there cannot be inactive readers. This is not true
however since we merged the execution-stage into the semaphore. Waiters
can queue up even when there are inactive reads and thus
maybe_admit_waiters() has to consider evicting some of them to see if
this would allow for admitting new reads.
To avoid such divergence in the future, the admission logic was moved
into a new method can_admit_read() which is now shared between the two
method families. This method now checks for the possibility of evicting
inactive readers as well.
The admission logic was tuned slightly to only consider evicting
inactive readers if there is a real possibility that this will result
in admissions: notably, before this patch, resource availability was
checked before stalls were (used permits == blocked permits), so we
could evict readers even if this couldn't help.
Because now eviction can be started from maybe_admit_waiters(), which is
also downstream from eviction, we added a flag to avoid recursive
evict -> maybe admit -> evict ... loops.

Fixes: #11770

Closes #11784
2022-10-18 17:07:43 +03:00
Tomasz Grabiec
4ff204c028 Merge 'cache: make all removals of cache items explicit' from Michał Chojnowski
This series is a step towards non-LRU cache algorithms.

Our cache items are able to unlink themselves from the LRU list. (In other words, they can be unlinked solely via a pointer to the item, without access to the containing list head). Some places in the code make use of that, e.g. by relying on auto-unlink of items in their destructor.

However, to implement algorithms smarter than LRU, we might want to update some cache-wide metadata on item removal. But any cache-wide structures are unreachable through an item pointer, since items only have access to themselves and their immediate neighbours. Therefore, we don't want items to unlink themselves — we want `cache.remove(item)`, rather than `item.remove_self()`, because the former can update the metadata in `cache`.

This series inserts explicit item unlink calls in places that were previously relying on destructors, gets rid of other self-unlinks, and adds an assert which ensures that every item is explicitly unlinked before destruction.

Closes #11716

* github.com:scylladb/scylladb:
  utils: lru: assert that evictables are unlinked before destruction
  utils: lru: remove unlink_from_lru()
  cache: make all cache unlinks explicit
2022-10-17 12:47:02 +02:00
Michał Chojnowski
d785364375 cache: make all cache unlinks explicit
Our LSA cache is implemented as an auto_unlink Boost intrusive list, meaning
that elements of the list unlink themselves from the list automatically on
destruction. Some parts of the code rely on that, and don't unlink them
manually.

However, this precludes accurate bookkeeping about the cache. Elements only have
access to themselves and their neighbours, not to any bookkeeping context.
Therefore, a destructor cannot update the relevant metadata.

In this patch, we fix this by adding explicit unlink calls to places where it
would be done by a destructor. In a following patch, we will add an assert to
the destructor to check that every element is unlinked before destruction.
2022-10-17 12:07:27 +02:00
Botond Dénes
ecc7c72acd reader_concurrency_semaphore: add set_resources()
Allowing to change the total or initial resources the semaphore has.
After calling `set_resources()` the semaphore will look like as if it
was created with the specified amount of resources when created.
2022-10-17 07:39:20 +03:00
Tomasz Grabiec
c8a372ae7f test: db: Add test for row merging involving many versions
The test verifies that a row which participated in earlier merge, and
its cells lost on the timestamp check, behaves exactly like an empty
row and can accept any mutation.

This wasn't the case in versions prior to f006acc.

Closes #11787
2022-10-16 14:29:49 +03:00
Pavel Emelyanov
707efb6dfb tracing: Wire tracing test back
The boost/tracing test is not run, because test.py boost suite collects
tests that match *_test.cc pattern. The tracing one apparently doesn't

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:59:13 +03:00
Pavel Emelyanov
5b67a2a876 tracing: Indentation fix after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:59:08 +03:00
Pavel Emelyanov
53ac8536f1 tracing: Move test into thread
The test calls future<>.get()'s in its lambda which is only allowed in
seastar threads. It's not stepped upon because (surprise, surprise) this
test is not run at all. Next patch fixes it.

Meanwhile, the fix is in using cql_env_thread thing for the whole lambda
which runs in it seastar::async() context

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:57:35 +03:00
Pavel Emelyanov
5c8a61ace2 tracing: Dismantle trace-backend registry
It's not used any longer

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:57:24 +03:00
Pavel Emelyanov
0a6a5a242e tracing: Remove copy-n-paste comments from test
Tests don't have supervisor, so there's no sense in keeping these bits

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 17:55:40 +03:00
Pavel Emelyanov
2bb354b2e7 snitch: Remove gossiper reference
It doesn't need gossiper any longer. This change will allow starting
snitch early by the next patch, and eventually improving the
token-metadata start-up sequence

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-11 05:17:08 +03:00
Pavel Emelyanov
4206b1f98f snitch, code: Make get_datacenter() report local dc only
The continuation of the previous patch -- all the code uses
topology::get_datacenter(endpoint) to get peers' dc string. The topology
still uses snitch for that, but it already contains the needed data.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-11 05:17:08 +03:00
Pavel Emelyanov
6c6711404f snitch, code: Make get_rack() report local rack only
All the code out there now calls snitch::get_rack() to get rack for the
local node. For other nodes the topology::get_rack(endpoint) is used.
Since now the topology is properly populated with endpoints, it can
finally be patched to stop using snitch and get rack from its internal
collections

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-11 05:17:08 +03:00
Pavel Emelyanov
da75552e1f topology: Add pending locations collection
Nowadays the topology object only keeps info about nodes that are normal
members of the ring. Nodes that are joining or bootstrapping or leaving
are out of it. However, one of the goals of this patchset is to make
topology object provide dc/rack info for _all_ nodes, even those in
transitive state.

The introduced _pending_locations is about to hold the dc/rack info for
transitive endpoints. When a node becomes member of the ring it is moved
from pending (if it's there) to current locations, when it leaves the
ring it's moved back to pending.

For now the new collection is just added and the add/remove/get API is
extended to maintain it, but it's not really populated. It will come in
the next patch

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-11 05:17:08 +03:00
Pavel Emelyanov
d60ebc5ace token_metadata: Add config, spread everywhere
Next patches will need to provide some early-start data for topology.
The standard way of doing it is via service config, so this patch adds
one. The new config is empty in this patch, to be filled later

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-11 05:17:08 +03:00
Botond Dénes
b247f29881 Merge 'De-static system_keyspace::get_{saved|local}_tokens()' from Pavel Emelyanov
Yet another user of global qctx object. Making the method(s) non-static requires pushing the system_keyspace all the way down to size_estimate_virtual_reader and a small update of the cql_test_env

Closes #11738

* github.com:scylladb/scylladb:
  system_keyspace: Make get_{local|saved}_tokens non static
  size_estimates_virtual_reader: Pass sys_ks argument to get_local_ranges()
  cql_test_env: Keep sharded<system_keyspace> reference
  size_estimate_virtual_reader: Keep system_keyspace reference
  system_keyspace: Pass sys_ks argument to install_virtual_readers()
  system_keyspace: Make make() non-static
  distributed_loader: Pass sys_ks argument to init_system_keyspace()
  system_keyspace: Remove dangling forward declaration
2022-10-07 11:28:32 +03:00
Avi Kivity
20bad62562 Merge 'Detect and record large collections' from Benny Halevy
This series adds support for detecting collections that have too many items
and recording them in `system.large_cells`.

A configuration variable was added to db/config: `compaction_collection_items_count_warning_threshold` set by default to 10000.
Collections that have more items than this threshold will be warned about and will be recorded as a large cell in the `system.large_cells` table.  Documentation has been updated respectively.

A new column was added to system.large_cells: `collection_items`.
Similar to the `rows` column in system.large_partition, `collection_items` holds the number of items in a collection when the large cell is a collection, or 0 if it isn't.  Note that the collection may be recorded in system.large_cells either due to its size, like any other cell, and/or due to the number of items in it, if it cross the said threshold.

Note that #11449 called for a new system.large_collections table, but extending system.large_cells follows the logic of system.large_partitions is a smaller change overall, hence it was preferred.

Since the system keyspace schema is hard coded, the schema version of system.large_cells was bumped, and since the change is not backward compatible, we added a cluster feature - `LARGE_COLLECTION_DETECTION` - to enable using it.
The large_data_handler large cell detection record function will populate the new column only when the new cluster feature is enabled.

In addition, unit tests were added in sstable_3_x_test for testing large cells detection by cell size, and large_collection detection by the number of items.

Closes #11449

Closes #11674

* github.com:scylladb/scylladb:
  sstables: mx/writer: optimize large data stats members order
  sstables: mx/writer: keep large data stats entry as members
  db: large_data_handler: dynamically update config thresholds
  utils/updateable_value: add transforming_value_updater
  db/large_data_handler: cql_table_large_data_handler: record large_collections
  db/large_data_handler: pass ref to feature_service to cql_table_large_data_handler
  db/large_data_handler: cql_table_large_data_handler: move ctor out of line
  docs: large-rows-large-cells-tables: fix typos
  db/system_keyspace: add collection_elements column to system.large_cells
  gms/feature_service: add large_collection_detection cluster feature
  test: sstable_3_x_test: add test_sstable_too_many_collection_elements
  test: lib: simple_schema: add support for optional collection column
  test: lib: simple_schema: build schema in ctor body
  test: lib: simple_schema: cql: define s1 as static only if built this way
  db/large_data_handler: maybe_record_large_cells: consider collection_elements
  db/large_data_handler: debug cql_table_large_data_handler::delete_large_data_entries
  sstables: mx/writer: pass collection_elements to writer::maybe_record_large_cells
  sstables: mx/writer: add large_data_type::elements_in_collection
  db/large_data_handler: get the collection_elements_count_threshold
  db/config: add compaction_collection_elements_count_warning_threshold
  test: sstable_3_x_test: add test_sstable_write_large_cell
  test: sstable_3_x_test: pass cell_threshold_bytes to large_data_handler
  test: sstable_3_x_test: large_data_handler: prepare callback for testing large_cells
  test: sstable_3_x_test: large_data tests: use BOOST_REQUIRE_[GL]T
  test: sstable_3_x_test: test_sstable_log_too_many_rows: use tests::random
2022-10-06 18:28:21 +03:00
Avi Kivity
62a4d2d92b Merge 'Preliminary changes for multiple Compaction Groups' from Raphael "Raph" Carvalho
What's contained in this series:
- Refactored compaction tests (and utilities) for integration with multiple groups
    - The idea is to write a new class of tests that will stress multiple groups, whereas the existing ones will still stress a single group.
- Fixed a problem when cloning compound sstable set (cannot be triggered today so I didn't open a GH issue)
- Many changes in replica::table for allowing integration with multiple groups

Next:
- Introduce for_each_compaction_group() for iterating over groups wherever needed.
- Use for_each_compaction_group() in replica::table operations spanning all groups (API, readers, etc).
- Decouple backlog tracker from compaction strategy, to allow for backlog isolation across groups
- Introduce static option for defining number of compaction groups and implement function to map a token to its respective group.
- Testing infrastructure for multiple compaction groups (helpful when testing the dynamic behavior: i.e. merging / splitting).

Closes #11592

* github.com:scylladb/scylladb:
  sstable_resharding_test: Switch to table_for_tests
  replica: Move compacted_undeleted_sstables into compaction group
  replica: Use correct compaction_group in try_flush_memtable_to_sstable()
  replica: Make move_sstables_from_staging() robust and compaction group friendly
  test: Rename column_family_for_tests to table_for_tests
  sstable_compaction_test: Use column_family_for_tests::as_table_state() instead
  test: Don't expose compound set in column_family_for_tests
  test: Implement column_family_for_tests::table_state::is_auto_compaction_disabled_by_user()
  sstable_compaction_test: Merge table_state_for_test into column_family_for_tests
  sstable_compaction_test: use table_state_for_test itself in fully_expired_sstables()
  sstable_compaction_test: Switch to table_state in compact_sstables()
  sstable_compaction_test: Reduce boilerplate by switching to column_family_for_tests
2022-10-06 18:23:47 +03:00
Pavel Emelyanov
b03f1e7b17 size_estimates_virtual_reader: Pass sys_ks argument to get_local_ranges()
This method static calls system_keyspace::get_local_tokens(). Having the
system_keyspace reference will make this method non-static

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-06 18:00:09 +03:00
Avi Kivity
9932c4bd62 Merge 'cql3: Make CONTAINS NULL and CONTAINS KEY NULL return false' from Jan Ciołek
Currently doing `CONTAINS NULL` or `CONTAINS KEY NULL` on a collection evaluates to `true`.

This is a really weird behaviour. Collections can't contain `NULL`, even if they wanted to.
Any operation that has a NULL on either side should evaluate to `NULL`, which is interpreted as `false`.
In Cassandra trying to do `CONTAINS NULL` causes an error.

Fixes: #10359

The only problem is that this change is not backwards compatible. Some existing code might break.

Closes #11730

* github.com:scylladb/scylladb:
  cql3: Make CONTAINS KEY NULL return false
  cql3: Make CONTAINS NULL return false
2022-10-06 17:08:56 +03:00
Raphael S. Carvalho
14d6459efc compaction: Make compaction_manager stop more robust
Commit aba475fe1d accidentally fixed a race, which happens in
the following sequence of events:

1) storage service starts drain() via API for example
2) main's abort source is triggered, calling compaction_manager's do_stop()
via subscription.
        2.1) do_stop() initiates the stop but doesn't wait for it.
        2.2) compaction_manager's state is set to stopped, such that
        compaction_manager::stop() called in defer_verbose_shutdown()
        will wait for the stop and not start a new one.
3) drain() calls compaction_manager::drain() changing the state from
stopped to disabled.
4) main calls compaction_manager::stop() (as described in 2.2) and
incorrectly tries to stop the manager again, because the state was
changed in step 3.

aba475fe1d accidentally fixed this problem because drain() will no
longer take place if it detects the shutdown process was initiated
(it does so by ignoring drain request if abort source's subscription
was unlinked).

This shows us that looking at the state to determine if stop should be
performed is fragile, because once the state changes from A to B,
manager doesn't know the state was A. To make it robust, we can instead
check if the future that stores stop's promise is engaged, meaning that
the stop was already initiated and we don't have to start a new one.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #11711
2022-10-06 13:49:26 +02:00