Commit Graph

39095 Commits

Author SHA1 Message Date
Botond Dénes
2cc37eb89b Merge 'Sanitize storage_service API maintenance' from Pavel Emelyanov
Storage service API set/unset has two flaws.

First, unset doesn't happen, so after storage service is stopped its handlers become "local is not initialized"-assertion and use-after-free landmines.

Second, setting of storage service API carry gossiper and system keyspace references, thus duplicating the knowledge about storage service dependencies.

This PR fixes both by adding the storage service API unsetting and by making the handlers use _only_ storage service instance, not any externally provided references.

Closes scylladb/scylladb#15547

* github.com:scylladb/scylladb:
  main, api: Set/Unset storage_service API in proper place
  api/storage_service: Remove gossiper arg from API
  api/storage_service: Remove system keyspace arg from API
  api/storage_service: Get gossiper from storage service
  api/storage_service: Get token_metadata from storage service
2023-09-27 10:00:54 +03:00
Avi Kivity
301b0a989a Merge ' cql3/prepare_context: fix generating pk_indexes for duplicate named bind variables' from Jan Ciołek
When presented with queries that use the same named bind variables twice, like this one:
```cql
SELECT p FROM table WHERE p = :x AND c = :x
```

Scylla generated empty `partition_key_bind_indexes` (`pk_indexes`).
`pk_indexes` tell the driver which bind variables it should use to calculate the partition token for a query. Without it, the driver is unable to determine the token and it will send the query to a random node.

Scylla should generate pk_indexes which tell the driver that it can use bind variable with `bind_index = 0` to calculate the partition token for this query.

The problem was that `_target_columns` keep only a single target_column for each bind variable.
In the example above `:x` is compared with both `p` and `c`, but `_target_columns` would contain only one of them, and Scylla wasn't able to tell that this bind variable is compared with a partition key column.

To fix it, let's replace `_target_columns` with `_targets`. `_targets` keeps all comparisons
between bind variables and other expressions, so none of them will be forgotten/overwritten.

A `cql-pytest` reproducer is added.

I also added some comments in `prepare_context.hh/cc` to make it easier to read.

Fixes: https://github.com/scylladb/scylladb/issues/15374

Closes scylladb/scylladb#15526

* github.com:scylladb/scylladb:
  cql-pytest/test-prepare: remove xfail marker from *pk_indexes_duplicate_named_variables
  cql3/prepare_context: fix generating pk_indexes for duplicate named bind variables
  cql3: improve readability of prepare_context
  cql-pytest: test generation of pk indexes during PREPARE
2023-09-26 19:47:04 +03:00
Nadav Har'El
9dea20539d Merge 'Sanitize forward-service shutdown' from Pavel Emelyanov
There's a dedicated forward_service::shutdown() method that's defer-scheduled in main for very early invocation. That's not nice, the fwd service start-shutdown-stop sequence can be made "canonical" by moving the shutting down code into abort source subscription. Similar thing was done for view updates generator in 3b95f4f107

refs: #2737
refs: #4384

Closes scylladb/scylladb#15545

* github.com:scylladb/scylladb:
  forward_service: Remove .shutdown() method
  forward_service: Set _shutdown in abort-source subscription
  forward_service: Add abort_source to constructor
2023-09-26 18:36:52 +03:00
Kefu Chai
50c937439b reloc: strip.sh: always generate symbol list with posix format
we compare the symbols lists of stripped ELF file ($orig.stripped) and
that of the one including debugging symbols ($orig.debug) to get a
an ELF file which includes only the necessary bits as the debuginfo
($orig.minidebug).

but we generate the symbol list of stripped ELF file using the
sysv format, while generate the one from the unstripped one using
posix format. the former is always padded the symbol names with spaces
so that their the length at least the same as the section name after
we split the fields with "|".

that's why the diff includes the stuff we don't expect. and hence,
we have tons of warnings like:

```
objcopy: build/node_exporter/node_exporter.keep_symbols:4910: Ignoring rubbish found on this line
```

when using objcopy to filter the ELF file to keep only the
symbols we are interested in.

so, in this change

* use the same format when dumping the symbols from unstripped ELF
  file
* include the symbols in the text area -- the code, by checking
  "T" and "t" in the dumped symbols. this was achieved by matching
  the lines with "FUNC" before this change.
* include the the symbols in .init data section -- the global
  variables which are initialized at compile time. they could
  be also interesting when debugging an application.

Fixes #15513
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15514
2023-09-26 17:59:40 +03:00
Alexander Turetskiy
024ba84637 cql3: SELECT CAST column names should match Cassandra's
When doing a SELECT CAST(b AS int), Cassandra returns a column named
cast(b as int). Currently, Scylla uses a different name -
system.castasint(b). For Cassandra compatibility, we should switch to
the same name.

fixes #14508

Closes scylladb/scylladb#14800
2023-09-26 17:26:14 +03:00
Aleksandra Martyniuk
f42be12f43 repair: release resources of shard_repair_task_impl
Before integration with task manager the state of one shard repair
was kept in repair_info. repair_info object was destroyed immediately
after shard repair was finished.

In an integration process repair_info's fields were moved to
shard_repair_task_impl as the two served the similar purposes.
Though, shard_repair_task_impl isn't immediately destoyed, but is
kept in task manager for task_ttl seconds after it's complete.
Thus, some of repair_info's fields have their lifetime prolonged,
which makes the repair state change delayed.

Release shard_repair_task_impl resources immediately after shard
repair is finished.

Fixes: #15505.

Closes scylladb/scylladb#15506
2023-09-26 17:09:47 +03:00
Piotr Jastrzebski
47917bcf22 filter: hash key once per sstable set not sstable
Before this commit the primary key was hashed for bloom filter check
for each sstable.
This commit makes the key be hashed once per sstable set and reused
for bloom filter lookups in all sstables in the set.

I tested this change using perf_simple_query with the following modifications:
1. Create more than one sstable to have sstable set of more than one elements
2. Try to prevent compactions (I wasn't 100% successful)
3. Use a key that's not present to avoid reading from disk

```
diff --git a/test/perf/perf_simple_query.cc b/test/perf/perf_simple_query.cc
index 26dbf1e99..6bd460df2 100644
--- a/test/perf/perf_simple_query.cc
+++ b/test/perf/perf_simple_query.cc
@@ -105,6 +105,8 @@ std::ostream& operator<<(std::ostream& os, const test_config& cfg) {

 static void create_partitions(cql_test_env& env, test_config& cfg) {
     std::cout << "Creating " << cfg.partitions << " partitions..." << std::endl;
+    // Create 10 sstables each with all the data
+    for (unsigned count = 0; count < 10; ++count) {
     for (unsigned sequence = 0; sequence < cfg.partitions; ++sequence) {
         if (cfg.counters) {
             execute_counter_update_for_key(env, make_key(sequence));
@@ -117,6 +119,7 @@ static void create_partitions(cql_test_env& env, test_config& cfg) {
         std::cout << "Flushing partitions..." << std::endl;
         env.db().invoke_on_all(&replica::database::flush_all_memtables).get();
     }
+    }
 }

 static int64_t make_random_seq(test_config& cfg) {
@@ -137,8 +140,18 @@ static std::vector<perf_result> test_read(cql_test_env& env, test_config& cfg) {
         query += " using timeout " + cfg.timeout;
     }
     auto id = env.prepare(query).get0();
-    return time_parallel([&env, &cfg, id] {
-            bytes key = make_random_key(cfg);
+    // Always use the same key that is not present
+    // to make sure we don't read from disk and make
+    // the benchmark CPU bounded.
+    int64_t key_value = 6;
+    bytes key(bytes::initialized_later(), 5*sizeof(key_value));
+    auto i = key.begin();
+    write<uint64_t>(i, key_value);
+    write<uint64_t>(i, key_value);
+    write<uint64_t>(i, key_value);
+    write<uint64_t>(i, key_value);
+    write<uint64_t>(i, key_value);
+    return time_parallel([&env, id, key] {
             return env.execute_prepared(id, {{cql3::raw_value::make_value(std::move(key))}}).discard_result();
         }, cfg.concurrency, cfg.duration_in_seconds, cfg.operations_per_shard, cfg.stop_on_error);
 }
@@ -423,6 +436,10 @@ static std::vector<perf_result> do_cql_test(cql_test_env& env, test_config& cfg)
                 .with_column("C2", bytes_type)
                 .with_column("C3", bytes_type)
                 .with_column("C4", bytes_type)
+		// Try to prevent compaction
+		// to keep the number of sstables high
+		.set_compaction_enabled(false)
+		.set_min_compaction_threshold(2000000000)
                 .build();
     }).get();

@@ -539,6 +556,11 @@ int scylla_simple_query_main(int argc, char** argv) {
             const auto enable_cache = app.configuration()["enable-cache"].as<bool>();
             std::cout << "enable-cache=" << enable_cache << '\n';
             db_cfg->enable_cache(enable_cache);
+	    // Try to prevent compaction
+	    // to keep the number of sstables high
+	    db_cfg->concurrent_compactors(1);
+	    db_cfg->compaction_enforce_min_threshold(true);
+	    db_cfg->compaction_throughput_mb_per_sec(1);

             cql_test_config cfg(db_cfg);
           return do_with_cql_env_thread([&app] (auto&& env) {
```

The following command showed 2-3% improvement on my machine but this
depends on the lenght of the key and the number of sstables in the set.

```
./build/release/scylla perf-simple-query --bypass-cache --flush -c 1
--random-seed=2068087418 --enable-cache false
```

Signed-off-by: Piotr Jastrzebski <haaawk@gmail.com>

Closes scylladb/scylladb#15538
2023-09-26 16:27:11 +03:00
Botond Dénes
d5f095d5a4 Merge 'Make interaction of compaction strategy with sstable runs more robust and efficient' from Raphael "Raph" Carvalho
SSTable runs work hard to keep the disjointness invariant, therefore they're
expensive to build from scratch.
For every insertion, it keeps the elements sorted by their first key in
order to reject insertion of element that would introduce overlapping.

Additionally, a sstable run can grow to dozens of elements (or hundreds)
therefore, we can also make interaction with compaction strategies more
efficient by not copying them when building a list of candidates in compaction
manager. And less fragile by filtering out any sstable runs that are not
completely eligible for compaction.

Previously, ICS had to give up on using runs managed by sstable set due to
fragility of the interface (meaning runs are being built from scratch
on every call to the strategy, which is very inefficient, but that had to
be done for correctness), but now we can restore that.

Closes scylladb/scylladb#15440

* github.com:scylladb/scylladb:
  compaction: Switch to strategy_control::candidates() for regular compaction
  tests: Prepare sstable_compaction_test for change in compaction_strategy interface
  compaction: Allow strategy to retrieve candidates either as sstables or runs
  compaction: Make get_candidates() work with frozen_sstable_run too
  sstables: add sstable_run::run_identifier()
  sstables: tag sstable_run::insert() with nodiscard
  sstables: Make all_sstable_runs() more efficient by exposing frozen shared runs
  sstables: Simplify sstable_set interface to retrieve runs
2023-09-26 14:56:05 +03:00
Aleksandra Martyniuk
d799adc536 tasks: change task_manager::task::impl::is_internal()
Most of the time only the roots of tasks tree should be non internal.

Change default implementation of is_internal and delete overrides
consistent with it.

Closes scylladb/scylladb#15353
2023-09-26 14:49:49 +03:00
Avi Kivity
5804386ca6 Merge 'Don't mess with table directories in distributed loader' from Pavel Emelyanov
Distributed loader code still "knows" that table's datadir is a filesystem directory with some structure. For S3-backed sstables this still works because for S3 keyspaces scylla still creates and maintains empty directories in datadir. This set fixes the dist. loader assumptions about that and moves them into sstable directory's lister.

refs: #13020

Closes scylladb/scylladb#15542

* github.com:scylladb/scylladb:
  sstable_directory: Indentation fix after previous patch
  sstable_directory: Simplify filesystem prepare()
  distributed_loader: Remove get_path() method
  distributed_loader: Move directory touching to sstable_directory
  distributed_loader: Move directory existance checks to sstable_directory
  sstable_directory: Move prepare() core to lister
2023-09-26 14:48:23 +03:00
Kefu Chai
8066929960 build: cmake: use if (.. IN_LIST ..) when appropriate
for better readability, and do not create a CMAKE_BUILD_TYPE CACHE
entry if it is already set using `-D`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#15543
2023-09-26 14:15:53 +03:00
Pavel Emelyanov
e022a76350 main, api: Set/Unset storage_service API in proper place
Currently the storage-service API handlers are set up in "random" place.
It can happen earlier -- as soon as the storage service itself is ready.

Also, despite storage service is stopped on shutdown, API handlers
continue reference it leading to potential use-after-frees or "local is
not initialized" assertions.

Fix both. Unsetting is pretty bulky, scylladb/seastar#1620 is to help.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-26 12:22:13 +03:00
Pavel Emelyanov
78a22c5ae3 api/storage_service: Remove gossiper arg from API
Now all handlers work purely on storage_service and gossiper argument is
no longer needed

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-26 12:21:46 +03:00
Pavel Emelyanov
8dc6e74138 api/storage_service: Remove system keyspace arg from API
It's not used nowadays

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-26 12:21:25 +03:00
Pavel Emelyanov
27eaff9d44 api/storage_service: Get gossiper from storage service
Some handlers in set_storage_service() have implicit dependency on
gossiper. It's not API that should track it, but storage service itself,
so get the gossiper from service, not from the external argument (it
will be removed soon)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-26 12:20:27 +03:00
Pavel Emelyanov
4008ebb1b0 api/storage_service: Get token_metadata from storage service
The API handlers that live in set_storage_service() should be
self-contained and operate on storage-service only. Said that, they
should get the token metadata, when needed, from storage service, not
from somewhere else.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-26 12:19:24 +03:00
Tomasz Grabiec
0f22e8d196 storage_service: Fixed missed notificaiton on tablet metadata update
There can be 2 waiters now (coordinator and CDC generation publisher),
so signal() is not enough.

Change made in c416c9ff33 missed to
update this site.

Closes scylladb/scylladb#15527
2023-09-26 10:37:57 +02:00
Jan Ciolek
e5f0468761 cql/prepare_expr: fix wrong receiver in field_selection_test_assignment
When preparing a `field_selection`, we need to prepare the UDT value,
and then verify that it has this field.

`field_selection_test_assignment` prepares the UDT value using the same
receiver as the whole `field_selection`. This is wrong, this receiver
has the type of the field, and not the UDT.

It's impossible to create a receiver for the UDT. Many different UDTs
can produce an `int` value when the field `a` is selected.
Therefore the receiver should be `nullptr`.

No unit test is added, as this bug doesn't currently cause any issues.
Preparing a column value doesn't do any type checks, so nothing fails.
Still it's good to fix it, just to be correct.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>

Closes scylladb/scylladb#14788
2023-09-26 11:15:00 +03:00
Pavel Emelyanov
0e0f9a57c6 forward_service: Remove .shutdown() method
It's now empty and has no value

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-26 10:39:22 +03:00
Pavel Emelyanov
a251b9893f forward_service: Set _shutdown in abort-source subscription
Currently the bit is set in .shutdown() method which is called early on
stop. After the patch the bit it set in the abort-source subscription
callback which is also called early on stop.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-26 10:38:34 +03:00
Pavel Emelyanov
b18c54f56c forward_service: Add abort_source to constructor
It will be used by the next patch

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-26 10:38:26 +03:00
Raphael S. Carvalho
8997fe0625 compaction: Switch to strategy_control::candidates() for regular compaction
Now everything is prepared for the switch, let's do it.

Now let's wait for ICS to enjoy the set of changes.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:21 -03:00
Raphael S. Carvalho
761a37022f tests: Prepare sstable_compaction_test for change in compaction_strategy interface
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:21 -03:00
Raphael S. Carvalho
02f1f24f27 compaction: Allow strategy to retrieve candidates either as sstables or runs
That's needed for upcoming changes that will allow ICS to efficiently
retrieve sstable runs.

Next patch will remove candidates from compaction_strategy's interface
to retrieve candidates using this one instead.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:21 -03:00
Raphael S. Carvalho
ff8510445d compaction: Make get_candidates() work with frozen_sstable_run too
This is done in preparation for ICS to retrieve candidates as
sstable runs.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:21 -03:00
Raphael S. Carvalho
4b193c04dd sstables: add sstable_run::run_identifier()
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:21 -03:00
Raphael S. Carvalho
8235889b8a sstables: tag sstable_run::insert() with nodiscard
sstable_run may reject insertion of a sstable if it's going
to break the disjoint invariant of the run, but it's important
that the caller is aware of it, so it can act on it like
generating a new run id for the sstable so it can be inserted
in another run. the tag is important to avoid unknown
problems in this area.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:21 -03:00
Raphael S. Carvalho
0fe2630d70 sstables: Make all_sstable_runs() more efficient by exposing frozen shared runs
Users of all_sstable_runs() don't want to mutate the runs, but rather
work with their content. So let's avoid copy and make the intention
explicit with the new frozen_sstable_run used as return type
for the interface.

This will guarantee that ICS will be able to fetch uncompacting
runs efficiently.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:18:20 -03:00
Raphael S. Carvalho
9f6c3369d2 sstables: Simplify sstable_set interface to retrieve runs
This interface selects all runs that store at least one of the
sstables in the vector.

But that's very fragile, to the point that even ICS had to
stop using it. A better interface is to return all runs
managed by the set and allow compaction manager to do its
filtering.

We want to use it in ICS to avoid the overhead of rebuilding
sstable runs which may be expensive as sorting is performed
to guarantee the disjoint invariant.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2023-09-25 17:04:20 -03:00
Tomasz Grabiec
19ff4b730f storage_service: Avoid SIGSEGV when tablet cleanup is invoked on non-0 shard
We access group0, which is only set on shard 0.

Closes scylladb/scylladb#15469
2023-09-25 20:59:27 +03:00
Pavel Emelyanov
901bbf21e9 Merge 'build: extract code fragments into functions' from Kefu Chai
more structured this way. this also allows us to quickly identify the part which should/can be reused when migrating to CMake based building system.

Refs https://github.com/scylladb/scylladb/issues/15379

Closes scylladb/scylladb#15515

* github.com:scylladb/scylladb:
  build: extract get_os_ids() out
  build: extract find_ninja() out
  build: extract thrift_uses_boost_share_ptr() out
2023-09-25 20:57:59 +03:00
Botond Dénes
caeddb9c88 tools/utils: return a distinct error-code on unknown operation
Currently, the tools loosely follow the following convention on
error-codes:
* return 1 if the error is with any of the command-line arguments
* return 2 on other errors

This patch changes the returned error-code on unknown operation/command
to 100 (instead of the previous 1). The intent is to allow any wrapper
script to determine that the tool failed because the operation is
unrecognized and not because of something else. In particular this
should enable us to write a wrapper script for scylla-nodetool, which
dispatches commands still un-implemented in scylla-nodetool, to the java
nodetool.
Note that the tool will still print an error message on an unknown
operation. So such wrapper script would have to make sure to not let
this bleed-through when it decides to forward the operation.

Closes scylladb/scylladb#15517
2023-09-25 20:56:44 +03:00
Pavel Emelyanov
99cbb6b733 sstable_directory: Indentation fix after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-25 20:34:52 +03:00
Pavel Emelyanov
7ab03e33a2 sstable_directory: Simplify filesystem prepare()
When FS lister gets prepared it

- checks if the directory exists
- creates if it it doesn't or bais out if it's quarantine one
- goes and checks the directory's owner and mode

The last step is excessive if the directory didn't exist on entry and
was created.

Indentation is deliberately left broken.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-25 20:03:19 +03:00
Pavel Emelyanov
0232f939dc distributed_loader: Remove get_path() method
It's no longer used

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-25 20:03:06 +03:00
Pavel Emelyanov
9c3e055d22 distributed_loader: Move directory touching to sstable_directory
This is continuation of the previous patch -- when populating a table,
creating directories should be (optionally) performed by the lister
backend, not by the generic loader.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-25 20:01:53 +03:00
Pavel Emelyanov
2678cc2ae8 distributed_loader: Move directory existance checks to sstable_directory
The loader code still "knows" that tables' sstables live in directories
on datadir filesystem, but that's not always so. So whether or not the
directory with sstables exists should be checked by sstable directory's
component lister, not the loader.

After this change potentially missing quarantine directory will be
processed with the sstable directory with empty result, but that's OK,
empty directories should be already handled correctly, so even if the
directory lister doesn't produce any sstables because it found no files,
or because it just skipped scanning doesn't make any difference.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-25 19:59:41 +03:00
Pavel Emelyanov
603f3ca042 sstable_directory: Move prepare() core to lister
Current sstable_directory::prepare() code checks the sstable directory
existance which only makes sense for filesystem-backed sstables.
S3-backed don't (well -- won't) have any directories in datadir, so the
check should be moved into component lister.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-09-25 19:58:53 +03:00
Jan Ciolek
649b634c63 cql-pytest/test-prepare: remove xfail marker from *pk_indexes_duplicate_named_variables
Issue #15374 has been fixed, so these tests can be enabled.
Duplicate bind variable names are now handled correctly.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-09-25 17:19:07 +02:00
Jan Ciolek
3cff10f756 cql3/prepare_context: fix generating pk_indexes for duplicate named bind variables
When presented with queries that use the same named bind variables twice,
like this one:
```cql
SELECT p FROM table WHERE p = :x AND c = :x
```

Scylla generated empty partition_key_bind_indexes (pk_indexes).
pk_indexes tell the driver which bind variables it should use to calculate the partition
token for a query. Without it, the driver is unable to determine the token and it will
send the query to a random node.

Scylla should generate pk_indexes which tell the driver that it can use bind variable
with bind_index = 0 to calculate the partition token for a query.

The problem was that _target_columns keep only a single target_column for each bind variable.
In the example above :x is compared with both p and c, but _target_columns would contain
only one of them, and Scylla wasn't able to tell that this bind variable is compared with
a partition key column.

To fix it, let's replace _target_columns with _targets. _targets keeps all comparisons
between bind variables and other expressions, so none of them will be forgotten/overwritten.

Fixes: #15374

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-09-25 17:18:53 +02:00
Jan Ciolek
a993ae31f8 cql3: improve readability of prepare_context
This commits adds a few comments and changes a few variable names
so that it's easier to figure out what the code does.

When I first started looking at this part of the code it wasn't
obvious what's going on - what are _specs, how are they different
from _target_columns? What happens when a variable doesn't have a name?

I hope that this change will make it easier to understand for future readers.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-09-25 17:18:53 +02:00
Anna Stuchlik
4afe2b9d9f doc: add RBNO to glossary
This commit adds Repair Based Node Operations
to the ScyllaDB glossary.

Fixes https://github.com/scylladb/scylladb/issues/11959

Closes scylladb/scylladb#15522
2023-09-25 18:16:53 +03:00
Jan Ciolek
f3ecd279f2 cql-pytest: test generation of pk indexes during PREPARE
Add some tests that test whether `pk indexes` are generated correctly.
When a driver asks to prepare a statement, Scylla's response includes
the metadata for this prepared statement.
In this metadata there's `pk indexes`, which tells the driver which
bind variable values it should use to calculate the partition token.

For a query like:
SELECT * FROM t WHERE p2 = ? AND p1 = ? AND p3 = ?

The correct pk_indexes would be [1, 0, 2], which means
"To calculate the token calculate Hash(bind_vars[1] | bind_vars[0] | bind_vars[2])".

More information is available in the specification:
1959502d8b/doc/native_protocol_v4.spec (L699-L707)

Two tests are marked as xfail because of #15374 - Scylla doesn't correctly handle using the same
named variable in multiple places. This will be fixed soon.

I couldn't find a good place for these tests, so I created a new file - test_prepare.py.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-09-25 17:12:17 +02:00
Pavel Emelyanov
652153c291 Merge 'populate_keyspace: use datadir' from Benny Halevy
Currently the datadir is ignored.
Use it to construct the table's base path.

Fixes scylladb/scylladb#15418

Closes scylladb/scylladb#15480

* github.com:scylladb/scylladb:
  distributed_loader: populate_keyspace: access cf by ref
  distributed_loader: table_populator: use datadir for base_path
  distributed_loader: populate_keyspace: issue table mark_ready_for_writes after all datadirs are processed
  distributed_loader: populate_keyspace: fixup indentation
  distributed_loader: populate_keyspace: iterate over datadirs in the inner loop
  test: sstable_directory_test: add test_multiple_data_dirs
  table: init_storage: create upload and staging subdirs on all datadirs
2023-09-25 13:40:50 +03:00
Nadav Har'El
1a5debac5c test/cql-pytest: cleaner reproducer for spurious static row returned
Issue #10357 is about a SELECT with a filter on a regular column which
incorrectly returns a static row without regular columns set (so the
filter would not have matched). We already have four tests reproducing
this issue, but each of them is a small part of a large tests translated
from Cassandra, making it hard to understand the scope of this bug.

So in this patch we add two new tests, one passing and one xfailing,
which clarify the scope of this bug. It turns out that the bug only
occurs when a partition has no clustering rows and only has a static
row. If the partition does have clustering rows - even if those don't
match the filter - the bug doesn't happen. The xfailing test is just
two statements long - a single INSERT and a single SELECT

Refs #10357.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#15120
2023-09-25 11:01:22 +03:00
Raphael S. Carvalho
914cbc11cf reader_concurrency_semaphore: Fix stop() in face of evictable reads becoming inactive
Scylla can crash due to a complicated interaction of service level drop,
evictable readers, inactive read registration path.

1) service level drop invoke stop of reader concurrency semaphore, which will
wait for in flight requests

2) turns out it stops first the gate used for closing readers that will
become inactive.

3) proceeds to wait for in-flight reads by closing the reader permit gate.

4) one of evictable reads take the inactive read registration path, and
finds the gate for closing readers closed.

5) flat mutation reader is destroyed, but finds the underlying reader was
not closed gracefully and triggers the abort.

By closing permit gate first, evictable readers becoming inactive will
be able to properly close underlying reader, therefore avoiding the
crash.

Fixes #15534.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#15535
2023-09-25 08:55:50 +03:00
Nadav Har'El
be942c1bce Merge 'treewide: rename s3 credentials related variable and option names' from Kefu Chai
in this series, we rename s3 credential related variable and option names so they are more consistent with AWS's official document. this should help with the maintainability.

Closes scylladb/scylladb#15529

* github.com:scylladb/scylladb:
  main.cc: rename aws option
  utils/s3/creds: rename aws_config member variables
2023-09-24 14:03:47 +03:00
Nadav Har'El
4e1e7568d8 Merge 'cql3:statements:describe_statement: include UDT/UDF/UDA in generic describe' from Michał Jadwiszczak
So far generic describe (`DESC <name>`) followed Cassandra implementation and it only described keyspace/table/view/index.

This commit adds UDT/UDF/UDA to generic describe.

Fixes: #14170

Closes scylladb/scylladb#14334

* github.com:scylladb/scylladb:
  docs:cql: add information  about generic describe
  cql-pytest:test_describe: add test for generic UDT/UDF/UDA desc
  cql3:statements:describe_statement: include UDT/UDF/UDA in generic describe
2023-09-24 13:03:04 +03:00
Kefu Chai
f3f31f0c65 main.cc: rename aws option
- s/aws_key/aws_access_key_id/
- s/aws_secret/aws_secret_access_key/
- s/aws_token/aws_session_token/

rename them to more popular names, these names are also used by
boto's API. this should improve the readability and consistency.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-23 14:31:32 +08:00
Kefu Chai
ac3406e537 utils/s3/creds: rename aws_config member variables
- s/key/access_key_id/
- s/secret/secret_access_key/
- s/token/session_token/

so they are more aligned with the AWS document.
for instance, in
https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#ConstructingTheAuthenticationHeader
AWSAccessKeyId is used in the "Authorization" header.

this would help with the readability and maintainability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-09-23 14:28:07 +08:00