Currently, scylla_fstrim_setup only enables scylla-fstrim.timer and
does not start it, so the timer starts only after a reboot.
This is incorrect behavior; we should start it during setup.
Also, unmask is unnecessary for enabling the timer.
Fixes #14249
Closes #14252
The current Seastar RPC infrastructure lacks support
for null values in tuples in handler responses.
In this commit we add the make_default_rpc_tuple function,
which solves the problem by returning pointers to
default-constructed values for smart pointer types
rather than nulls.
The problem was introduced in commit
2d791a5ed4. The function
`encode_replica_exception_for_rpc` used the
`default_tuple_maker` callback to create tuples
containing exceptions. Callers returned pointers
to default-constructed values in this callback,
e.g. `foreign_ptr(make_lw_shared<reconcilable_result>())`.
The commit changed this to just `SourceTuple{}`,
which means nullptr for pointer types.
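The idea behind the fix can be sketched in isolation. This is a hedged sketch: std::unique_ptr/std::shared_ptr stand in for Scylla's foreign_ptr/lw_shared_ptr, and the surrounding types are illustrative, not the actual RPC glue.

```cpp
#include <memory>
#include <tuple>

// Primary template: plain types are simply value-initialized.
template <typename T>
struct default_rpc_value {
    static T make() { return T{}; }
};

// Smart pointer types get a pointer to a default-constructed value
// instead of a null pointer, so serialization never sees nullptr.
template <typename T>
struct default_rpc_value<std::unique_ptr<T>> {
    static std::unique_ptr<T> make() { return std::make_unique<T>(); }
};

template <typename T>
struct default_rpc_value<std::shared_ptr<T>> {
    static std::shared_ptr<T> make() { return std::make_shared<T>(); }
};

// Build a tuple whose smart-pointer elements are non-null defaults.
template <typename... Ts>
std::tuple<Ts...> make_default_rpc_tuple() {
    return std::tuple<Ts...>(default_rpc_value<Ts>::make()...);
}
```

Compared with `SourceTuple{}`, which value-initializes every element (yielding nullptr for pointer types), each pointer element here owns a default-constructed value.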
Fixes: #14282
Closes #14352
Fixes https://github.com/scylladb/scylla-enterprise/issues/3036
This commit adds support for Ubuntu 22.04 to the list
of OSes supported by ScyllaDB Enterprise 2021.1.
This commit fixes a bug and must be backported to
branch-5.3 and branch-5.2.
Closes #14372
Compaction tasks covering table major, cleanup, offstrategy,
and upgrade sstables compaction inherit their sequence number from
their parents. Thus they do not need a new sequence number
generated, as it would be overwritten anyway.
Closes #14379
The series contains mostly cleanups for the query processor, with no
functional changes. The last patch is a small cleanup for the storage_proxy.
* 'qp-cleanup' of https://github.com/gleb-cloudius/scylla:
storage_proxy: remove unused variable
client_state: co-routinise has_column_family_access function
query_processor: get rid of internal_state and create individual query_state for each request
cql3: move validation::validate_column_family from client_state::has_column_family_access
client_state: drop unneeded argument from has.*access functions
cql3: move check for dropping cdc tables from auth to the drop statement code itself
query_processor: co-routinise execute_prepared_without_checking_exception_message function
query_processor: co-routinize execute_direct_without_checking_exception_message function
cql3: remove empty statement::validate functions
cql3: remove empty function validate_cluster_support
cql3/statements: fix indentation and spurious white spaces
query_processor: move statement::validate call into execute_with_params function
query_processor: co-routinise execute_with_params function
query_processor: execute statement::validate before each execution of internal query instead of only during prepare
query_processor: get rid of shared internal_query_state
query_processor: co-routinize execute_paged_internal function
query_processor: co_routinize execute_batch_without_checking_exception_message function
query_processor: co-routinize process_authorized_statement function
It's very annoying to add a declaration to expression.hh and watch
the whole world get recompiled. Improve that by moving less-common
functions to a new header expr-utils.hh. Move the evaluation machinery
to a new header evaluate.hh. The remaining definitions in expression.hh
should not change as often, and thus cause less frequent recompiles.
Closes #14346
* github.com:scylladb/scylladb:
cql3: expr: break up expression.hh header
cql3: expr: restrictions.hh: protect against double inclusions
cql3: constants: deinline
cql3: statement_restrictions: deinline
cql3: deinline operation::fill_prepare_context()
There was a bug in describe_statement. When executing `DESC FUNCTION <uda name>` or `DESC AGGREGATE <udf name>`, Scylla crashed because the function was found (`functions::find()` searches both UDFs and UDAs) but it was of the wrong kind, and the pointer wasn't checked after the cast.
Added a test for this.
Fixes: #14360
Closes #14332
* github.com:scylladb/scylladb:
cql-pytest:test_describe: add test for filtering UDF and UDA
cql3:statements:describe_statement: check pointer to UDF/UDA
Adding a function declaration to expression.hh causes many
recompilations. Reduce that by:
- moving some restrictions-related definitions to
the existing expr/restrictions.hh
- moving evaluation related names to a new header
expr/evaluate.hh
- moving utilities to a new header
expr/expr-utilities.hh
expression.hh now contains only the expression definitions and the most
basic and common helpers, like printing.
To reduce future header fan-in, deinline all non-trivial functions.
While these are on the hot path, they can't be inlined anyway as they're
virtual, and they're quite heavy in any case.
Checking keyspace/table presence should not be part of authorization code,
and it is not done consistently today. For instance, keyspace presence
is not checked in "alter keyspace" during authorization, but during
statement execution. Make it consistent.
Checking if a table is CDC log and cannot be dropped should not be done
as part of authentication (this has nothing to do with auth), but in the
drop statement itself. Throwing unauthorized_exception is wrong as well,
but unfortunately it is enshrined in a test. Not sure if it is a good
idea to change it now.
There is a discrepancy in how statement::validate is used. On the regular
path it is called before each execution, but on the internal execution
path it is called only once, during prepare. Such a discrepancy makes it
hard to reason about what can and cannot be done during the call. Call it
uniformly before each execution. This allows validate to check state that
can change after prepare.
internal_query_state was passed in a shared_ptr, a leftover from the Java
translation times. It can be a regular C++ type with a lifetime
bound to the function execution it was created in.
Make evaluate()'s body more regular, then exploit it by
replacing the long list of branches with a lambda template.
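The "lambda template" pattern referred to above can be sketched with a toy expression type (illustrative names; the real code dispatches over Scylla's own expression variant):

```cpp
#include <variant>

// Toy expression elements standing in for the real expression variant.
struct constant { int value; };
struct negation { int value; };

// One do_evaluate() overload per element type, matching the renamed branches.
int do_evaluate(const constant& c) { return c.value; }
int do_evaluate(const negation& n) { return -n.value; }

using expression = std::variant<constant, negation>;

// evaluate() shrinks to a single generic lambda instead of a long
// list of if/else branches naming each alternative explicitly.
int evaluate(const expression& e) {
    return std::visit([](const auto& element) { return do_evaluate(element); }, e);
}
```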
Closes #14306
* github.com:scylladb/scylladb:
cql3: expr: simplify evaluate()
cql3: expr: standardize evaluate() branches to call do_evaluate()
cql3: expr: rename evaluate(ExpressionElement) to do_evaluate()
This is V2 of https://github.com/scylladb/scylladb/pull/14108
This commit moves the cloud installation instructions from the [website](https://www.scylladb.com/download/) to the docs.
The scope:
* Added new files with instructions for AWS, GCP, and Azure.
* Added the new files to the index.
* Updated the "Install ScyllaDB" page to create the "Cloud Deployment" section.
* Added new bookmarks in other files to create stable links, for example, ".. _networking-ports:"
* Moved common files to the new "installation-common" directory. This step is required to exclude the open source-only files in the Enterprise repository.
In addition:
- The Configuration Reference file was moved out of the installation section (it's not about installation at all).
- The links to creating a cluster were removed from the installation page (as not related).
Related: https://github.com/scylladb/scylla-docs/issues/4091
Closes #14153
* github.com:scylladb/scylladb:
doc: remove the rpm-info file (What is in each RPM) from the installation section
doc: move cloud deployment instruction to docs -v2
Spans are slightly cleaner, slightly faster (as they avoid an indirection),
and allow replacing some of the arguments with small_vectors.
Closes #14313
There was a bug that caused aggregates to fail when used on case-sensitive column names.
For example:
```cql
SELECT SUM("SomeColumn") FROM ks.table;
```
would fail with a message saying that there is no column "somecolumn".
This is because the case-sensitivity got lost along the way.
Non-case-sensitive column names are converted to lowercase, but case-sensitive names must be preserved as originally written.
The problem was in `forward_service` - we took a column name and created a non case-sensitive `column_identifier` out of it.
This converted the name to lowercase, and later such column couldn't be found.
To fix it, let's make the `column_identifier` case-sensitive.
It will preserve the name, without converting it to lowercase.
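A minimal sketch of the distinction (the constructor signature with a keep_case flag is assumed for illustration; Scylla's actual column_identifier differs):

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <utility>

// When keep_case is false (CQL's unquoted identifiers) the name is
// lower-cased; when true (quoted identifiers) it is preserved verbatim.
struct column_identifier {
    std::string name;
    column_identifier(std::string text, bool keep_case) : name(std::move(text)) {
        if (!keep_case) {
            std::transform(name.begin(), name.end(), name.begin(),
                           [](unsigned char c) { return std::tolower(c); });
        }
    }
};
```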
Fixes: https://github.com/scylladb/scylladb/issues/14307
Closes #14340
* github.com:scylladb/scylladb:
service/forward_service.cc: make case-sensitivity explicit
cql-pytest/test_aggregate: test case-sensitive column name in aggregate
forward_service: fix forgetting case-sensitivity in aggregates
Task manager task covering compaction group major
compaction.
It uses multiple inheritance on the already existing
major_compaction_task_executor to keep track of
the operation with the task manager.
Closes #14271
* github.com:scylladb/scylladb:
test: extend test_compaction_task.py
test: use named variable for task tree depth
compaction: turn major_compaction_task_executor into major_compaction_task_impl
compaction: take gate holder out of task executor
compaction: extend signature of some methods
tasks: keep shared_ptr to impl in task
compaction: rename compaction_task_executor methods
Use the new Seastar functionality for storing references to connections to implement banning hosts that have left the cluster (either decommissioned or using removenode) in raft-topology mode. Any attempts at communication from those nodes will be rejected.
This works not only for nodes that restart, but also for nodes that were running behind a network partition when we removed them. Even when the partition resolves, the existing nodes will effectively keep a firewall up against that node.
Some changes to the decommission algorithm had to be introduced for it to work with node banning. As a side effect a pre-existing problem with decommission was fixed. Read the "introduce `left_token_ring` state" and "prepare decommission path for node banning" commits for details.
Closes #13850
* github.com:scylladb/scylladb:
test: pylib: increase checking period for `get_alive_endpoints`
test: add node banning test
test: pylib: manager_client: `get_cql()` helper
test: pylib: ScyllaCluster: server pause/unpause API
raft topology: ban left nodes
raft topology: skip `left_token_ring` state during `removenode`
raft topology: prepare decommission path for node banning
raft topology: introduce `left_token_ring` state
raft topology: `raft_topology_cmd` implicit constructor
messaging_service: implement host banning
messaging_service: exchange host IDs and map them to connections
messaging_service: store the node's host ID
messaging_service: don't use parameter defaults in constructor
main: move messaging_service init after system_keyspace init
Fixes https://github.com/scylladb/scylladb/issues/14333
This commit replaces the documentation landing page with
the Open Source-only documentation landing page.
This change is required because there is now a separate landing
page for the ScyllaDB documentation, so the page is duplicated,
creating a bad user experience.
Closes #14343
Make it explicit that the boolean argument determines case-sensitivity. This emphasizes its importance.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
There was a bug which made aggregates fail when used with case-sensitive
column names.
Add a test to make sure that this doesn't happen in the future.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
There was a bug that caused aggregates to fail when
used on case-sensitive column names.
For example:
```
SELECT SUM("SomeColumn") FROM ks.table;
```
would fail with a message saying that there
is no column "somecolumn".
This is because the case-sensitivity got lost on the way.
For non case-sensitive column names we convert them to lowercase,
but for case sensitive names we have to preserve the name
as originally written.
The problem was in `forward_service` - we took a column name
and created a non case-sensitive `column_identifier` out of it.
This converted the name to lowercase, and later such column
couldn't be found.
To fix it, let's make the `column_identifier` case-sensitive.
It will preserve the name, without converting it to lowercase.
Fixes: https://github.com/scylladb/scylladb/issues/14307
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
The chunk size used in sstable compression can be set when creating a
table, using the "chunk_length_in_kb" parameter. It can be any power-of-two
multiple of 1KB. Very large compression chunks are not useful - they
offer diminishing returns on compression ratio, and require very large
memory buffers and reading a very large amount of disk data just to
read a small row. In fact, small chunks are recommended - Scylla
defaults to 4 KB chunks, and Cassandra lowered their default from 64 KB
(in Cassandra 3) to 16 KB (in Cassandra 4).
Therefore, allowing arbitrarily large chunk sizes is just asking for
trouble. Today, a user can ask for a 1 GB chunk size, and crash or hang
Scylla when it runs out of memory. So in this patch we add a hard limit
of 128 KB for the chunk size - anything larger is refused.
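The resulting check boils down to two conditions. Below is a sketch under the constraints described above; the function name and placement are illustrative, not Scylla's actual code:

```cpp
// Hard limit from this patch: compression chunks larger than 128 KB are refused.
constexpr unsigned max_chunk_length_in_kb = 128;

// chunk_length_in_kb must be a power-of-two number of KB, and at most 128.
bool valid_chunk_length_in_kb(unsigned kb) {
    bool power_of_two = kb != 0 && (kb & (kb - 1)) == 0;
    return power_of_two && kb <= max_chunk_length_in_kb;
}
```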
Fixes #9933
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #14267
This reverts commit 562087beff.
The regressions introduced by the reverted change have been fixed.
So let's revert this revert to resurrect the
uuid_sstable_identifier_enabled support.
Fixes #10459
This PR changes the system to respect shard assignment to tablets in tablet metadata (system.tablets):
1. The tablet allocator is changed to distribute tablets evenly across shards, taking into account currently allocated tablets in the system. Each tablet has equal weight. vnode load is ignored.
2. The CDC subsystem was not adjusted (not supported yet).
3. sstable sharding metadata reflects tablet boundaries.
4. Resharding is NOT supported yet (the node will abort on boot if there is a need to reshard tablet-based tables).
5. The system is NOT prepared to handle tablet migration / topology changes in a safe way.
6. sstable cleanup is not wired properly yet.
After this PR, dht::shard_of() and schema::get_sharder() are deprecated. One should use table::shard_of() and effective_replication_map::get_sharder() instead.
To make life easier, support was added for obtaining the table pointer from the schema pointer:
```
schema_ptr s;
s->table().shard_of(...)
```
Closes #13939
* github.com:scylladb/scylladb:
locator: network_topology_strategy: Allocate shards to tablets
locator: Store node shard count in topology
service: topology: Extract topology updating to a lambda
test: Move test_tablets under topology_experimental
sstables: Add trace-level logging related to shard calculation
schema: Catch incorrect uses of schema::get_sharder()
dht: Rename dht::shard_of() to dht::static_shard_of()
treewide: Replace dht::shard_of() uses with table::shard_of() / erm::shard_of()
storage_proxy: Avoid multishard reader for tablets
storage_proxy: Obtain shard from erm in the read path
db, storage_proxy: Drop mutation/frozen_mutation ::shard_of()
forward_service: Use table sharder
alternator: Use table sharder
db: multishard: Obtain sharder from erm
sstable_directory: Improve trace-level logging
db: table: Introduce shard_of() helper
db: Use table sharder in compaction
sstables: Compute sstable shards using sharder from erm when loading
sstables: Generate sharding metadata using sharder from erm when writing
test: partitioner: Test split_range_to_single_shard() on tablet-like sharder
dht: Make split_range_to_single_shard() prepared for tablet sharder
sstables: Move compute_shards_for_this_sstable() to load()
dht: Take sharder externally in splitting functions
locator: Make sharder accessible through effective_replication_map
dht: sharder: Document guarantees about mapping stability
tablets: Implement tablet sharder
tablets: Include pending replica in get_shard()
dht: sharder: Introduce next_shard()
db: token_ring_table: Filter out tablet-based keyspaces
db: schema: Attach table pointer to schema
schema_registry: Fix SIGSEGV in learn() when concurrent with get_or_load()
schema_registry: Make learn(schema_ptr) attach entry to the target schema
test: lib: cql_test_env: Expose feature_service
test: Extract throttle object to separate header
Fixes #11017
When doing writes, the storage proxy creates types deriving from abstract_write_response_handler.
These are created in the various scheduling groups executing the write-inducing code. They
pick up a group-local reference to the various metrics used by SP. Normally all code
using (and esp. modifying) these metrics is executed in the same scheduling group.
However, if gossip sees a node go down, it will notify listeners, which eventually
calls get_ep_stat and register_metrics.
This code (before this patch) uses the _active_ scheduling group to eventually add
metrics, using a local dict as a guard against double registrations. If, as described
above, we're called in a different scheduling group than the original one, this
can cause double registrations.
Fixed here by keeping a reference to the creating scheduling group and using it, not
the active one, when/if creating new metrics.
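The gist of the fix can be sketched with stand-in types: a scheduling group reduced to an id, the registry to a set of (group, metric) pairs. None of this is Seastar's actual API.

```cpp
#include <set>
#include <string>
#include <utility>

using scheduling_group = int;

struct metrics_registry {
    std::set<std::pair<scheduling_group, std::string>> registered;
    // Returns false when this (group, metric) pair was already registered.
    bool register_metric(scheduling_group sg, const std::string& name) {
        return registered.insert({sg, name}).second;
    }
};

struct write_response_handler {
    scheduling_group _created_in; // the fix: captured once, at creation time
    explicit write_response_handler(scheduling_group active) : _created_in(active) {}
    // Later registrations use the stored group, not whatever group happens
    // to be active, so they always hit the same registry slot.
    bool register_metrics(metrics_registry& r, const std::string& name) {
        return r.register_metric(_created_in, name);
    }
};
```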
Closes #14294
Uses a simple algorithm for allocating shards which chooses the
least-loaded shard on a given node, encapsulated in load_sketch.
Takes load due to current tablet allocation into account.
Each tablet, new or allocated for other tables, is assumed to have an
equal load weight.
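A minimal sketch of the load_sketch idea under these assumptions (equal per-tablet weight, per-node shard counts; names are illustrative, not Scylla's actual API):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Tracks per-shard tablet counts on one node and always hands the next
// tablet to the currently least-loaded shard.
class load_sketch {
    std::vector<std::size_t> _tablets_per_shard;
public:
    explicit load_sketch(std::size_t shard_count)
        : _tablets_per_shard(shard_count, 0) {}

    // Pick the least-loaded shard, record the new tablet on it, return it.
    std::size_t allocate() {
        auto it = std::min_element(_tablets_per_shard.begin(),
                                   _tablets_per_shard.end());
        ++*it;
        return static_cast<std::size_t>(it - _tablets_per_shard.begin());
    }
};
```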