Commit Graph

1709 Commits

Author SHA1 Message Date
Benny Halevy
736f89b31a tablets: enforce tablets using tablets_mode_for_new_keyspaces=enforced config option
`tablets_mode_for_new_keyspaces=enforced` enables tablets by default for
new keyspaces, like `tablets_mode_for_new_keyspaces=enabled`.
However, it does not allow opting out when creating
new keyspaces by setting `tablets = {'enabled': false}`.
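
The decision described above can be sketched as follows. This is only an illustration: the enum, the function name, and the `disabled` value are assumptions made for the example, not the actual db/config code.

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical mirror of the option values; "disabled" is assumed here.
enum class tablets_mode { disabled, enabled, enforced };

// Decide whether a new keyspace uses tablets.
// `requested` models the optional `tablets = {'enabled': ...}` keyspace option.
inline bool use_tablets_for_new_keyspace(tablets_mode mode, std::optional<bool> requested) {
    if (mode == tablets_mode::enforced) {
        // In enforced mode an explicit opt-out is rejected.
        if (requested.has_value() && !*requested) {
            throw std::invalid_argument(
                "tablets cannot be disabled when tablets_mode_for_new_keyspaces=enforced");
        }
        return true;
    }
    // Otherwise an explicit per-keyspace setting wins over the configured default.
    return requested.value_or(mode == tablets_mode::enabled);
}
```

With this sketch, `enabled` and `enforced` give the same default, and only `enforced` turns an explicit `false` into an error.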

Refs scylladb/scylla-enterprise#4355

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 62aeba759b)
2025-04-08 08:35:14 +03:00
Benny Halevy
a49e27ac8f db/config: add tablets_mode_for_new_keyspaces option
The new option deprecates the existing `enable_tablets` option.
It will be extended in the next patch with a 3rd value: "enforced",
which will enable tablets by default for new keyspaces but
without the possibility to opt out using the `tablets = {'enabled':
false}` keyspace schema option.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit c62865df90)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-04-08 08:08:47 +03:00
Dawid Mędrek
af2215c2d2 cql3: Ensure that CREATE and ALTER never lead to RF-rack-invalid keyspaces
In this commit, we refuse to create or alter a keyspace when that operation
would make it RF-rack-invalid if the option `rf_rack_valid_keyspaces` is
enabled.

We provide two tests verifying that the changes work as intended.

Fixes scylladb/scylladb#23276

(cherry picked from commit 41f862d7ba)
2025-03-21 12:27:04 +00:00
Wojciech Mitros
138c68d80e mv: forbid views with tablets by default
Materialized views with tablets are not stable yet, but we want
them available as an experimental feature, mainly for testing.

The feature was added in scylladb/scylladb#21833,
but currently it has no effect. All tests have been updated to use the
feature, so we should finally make it work.
This patch prevents users from creating materialized views in keyspaces
using tablets when the VIEWS_WITH_TABLETS feature is not enabled - such
requests will now get rejected.

Fixes scylladb/scylladb#21832

Closes scylladb/scylladb#22217

(cherry picked from commit 677f9962cf)

Closes scylladb/scylladb#22659
2025-02-04 08:06:23 +01:00
Nadav Har'El
a8805c4fc1 Merge 'cql3, test, utils: switch from boost::adaptors::uniqued to utils::views:unique ' from Kefu Chai
In order to reduce the dependency on external libraries, and for better integration with ranges in the C++ standard library, let's use the homebrew `utils::views::unique()` before unique is accepted by the C++ standard.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#22393

* github.com:scylladb/scylladb:
  cql3, test: switch from boost::adaptors::uniqued to utils::views:unique
  utils: implement drop-in replacement for replacing boost::adaptors::uniqued
2025-01-21 19:06:21 +02:00
Kefu Chai
ccb7b4e606 cql3, test: switch from boost::adaptors::uniqued to utils::views:unique
In order to reduce the dependency on external libraries, and for better
integration with ranges in the C++ standard library, let's use the homebrew
`utils::views::unique()` before unique is accepted by the C++ standard.
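
For readers unfamiliar with `boost::adaptors::uniqued`: it drops elements that are equal to their immediate predecessor. A minimal eager sketch of that behavior, using only the standard library, might look like the following. The function name `unique_adjacent` is made up for this example, and unlike a real range adaptor it is not lazy.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Eager stand-in for a "unique" adaptor: keeps the first element of every
// run of equal neighbours. Non-adjacent duplicates are preserved.
template <typename T>
std::vector<T> unique_adjacent(const std::vector<T>& in) {
    std::vector<T> out;
    std::unique_copy(in.begin(), in.end(), std::back_inserter(out));
    return out;
}
```

For example, `{1, 1, 2, 2, 2, 3, 1}` becomes `{1, 2, 3, 1}`; the trailing `1` stays because it is not adjacent to the first run.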

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-21 16:24:45 +08:00
Nadav Har'El
3e16b80014 Merge 'Reject create table with compact storage' from Benny Halevy
As discussed in
https://github.com/scylladb/scylladb/issues/12263#issuecomment-1853576813,
compact storage tables are deprecated.

Yet, there is nothing in the code that prevents users
from creating such tables.

This patch adds a live-updateable config option:
`enable_create_table_with_compact_storage`, set to
`false` by default, that requires users to opt in
in order to create new tables WITH COMPACT STORAGE.

Refs scylladb/scylladb#12263, scylladb/scylladb#16375

* Since this guardrail is an enhancement, no backport is needed

Closes scylladb/scylladb#16403

* github.com:scylladb/scylladb:
  docs: ddl: document the deprecation of compact tables
  test: enable_create_table_with_compact_storage for tests that need it
  config: add enable_create_table_with_compact_storage
2025-01-20 22:02:02 +02:00
Benny Halevy
88ae067ddb everywhere: add skeletal support for the in_memory_tables feature
Forward-ported from scylla-enterprise.
Note that the feature has been deprecated and the implementation
is provided only for backward compatibility with pre-existing
features and schema.

Tested manually after adding the following to feature_service:
```
    gms::feature workload_prioritization { *this, "WORKLOAD_PRIORITIZATION"sv };
```

Launched a single-node cluster running 2023.1.10
```
cqlsh> create KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> create TABLE ks.test ( pk int PRIMARY KEY, val int ) WITH compaction = {'class': 'InMemoryCompactionStrategy'};
```

log:
```
Scylla version 2023.1.10-0.20241227.21cffccc1ccd with build-id bd65b8399cb13b713a87e57fe333cfcabfd50be7 starting ...
...
INFO  2024-12-27 19:45:16,563 [shard 0] migration_manager - Create new ColumnFamily: org.apache.cassandra.config.CFMetaData@0x600000f1b400[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName=ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,readRepairChance=0,dcLocalReadRepairChance=0,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,keyValidator=org.apache.cassandra.db.marshal.Int32Type,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, droppedAt=-9223372036854775808}],compactionStrategyClass=class org.apache.cassandra.db.compaction.InMemoryCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,in_memory=false,version=5529c631-c47a-11ef-bd1d-4295734ce5a8,droppedColumns={},collections={},indices={}]
INFO  2024-12-27 19:45:16,564 [shard 0] schema_tables - Creating ks.test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ec88d510-6aff-344a-914d-541d37081440
```

Upgraded to this branch and started scylla.
Verified that ks.test was successfully loaded:

log:
```
INFO  2024-12-27 19:48:58,115 [shard 0:main] init - Scylla version 6.3.0~dev-0.20241227.a64c6dfc153e with build-id f9496134a09cf2e55d3865b9e9ff499f672aa7da starting ...
...
WARN  2024-12-27 19:53:02,948 [shard 1:main] CompactionStrategy - InMemoryCompactionStrategy is no longer supported. Defaulting to NullCompactionStrategy.
...
INFO  2024-12-27 19:53:02,948 [shard 0:main] database - Keyspace ks: Reading CF test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ec88d510-6aff-344a-914d-541d37081440 storage=/home/bhalevy/scylladb/data/ks/test-5529c630c47a11efbd1d4295734ce5a8
```

Then, tested:
```
cqlsh> describe KEYSPACE ks;

CREATE KEYSPACE ks WITH replication = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true AND tablets = {'enabled': false};

CREATE TABLE ks.test (
    pk int,
    val int,
    PRIMARY KEY (pk)
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
    AND comment = ''
    AND compaction = {'class': 'InMemoryCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND speculative_retry = '99.0PERCENTILE';

cqlsh> alter TABLE ks.test with compaction = {'class': 'SizeTieredCompactionStrategy'};
cqlsh> describe KEYSPACE ks;

CREATE KEYSPACE ks WITH replication = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true AND tablets = {'enabled': false};

CREATE TABLE ks.test (
    pk int,
    val int,
    PRIMARY KEY (pk)
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
    AND comment = ''
    AND compaction = {'class': 'SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND speculative_retry = '99.0PERCENTILE'
    AND tombstone_gc = {'mode': 'timeout', 'propagation_delay_in_seconds': '3600'};
```

log:
```
INFO  2024-12-27 19:56:40,465 [shard 0:stmt] migration_manager - Update table 'ks.test' From org.apache.cassandra.config.CFMetaData@0x60000362d800[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName==ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, droppedAt=-9223372036854775808}],compactionStrategyClass=class org.apache.cassandra.db.compaction.InMemoryCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,version=ec88d510-6aff-344a-914d-541d37081440,droppedColumns={},collections={},indices={}] To org.apache.cassandra.config.CFMetaData@0x60000336e000[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName==ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, 
droppedAt=-9223372036854775808}],compactionStrategyClass=class org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,version=ecccf010-c47b-11ef-b52c-622f2f0e87c4,droppedColumns={},collections={},indices={}]
INFO  2024-12-27 19:56:40,466 [shard 0: gms] schema_tables - Altering ks.test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ecccf010-c47b-11ef-b52c-622f2f0e87c4
```

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#22068
2025-01-20 16:55:17 +02:00
Benny Halevy
0110eb0506 config: add enable_create_table_with_compact_storage
As discussed in
https://github.com/scylladb/scylladb/issues/12263#issuecomment-1853576813,
compact storage tables are deprecated.

Yet, there is nothing in the code that prevents users
from creating such tables.

This patch adds a live-updateable config option:
`enable_create_table_with_compact_storage` that requires users
to opt in in order to create new tables WITH COMPACT STORAGE.

The option is currently set to `true` by default in db/config
to reduce the churn to tests and to `false` in scylla.yaml,
for new clusters.

TODO: once regression tests that use compact storage
are converted to enable the option, change the default in
db/config to false.

A unit test was added to test/cql-pytest that
checks that the respective cql query fails as expected
with the default option or when it is explicitly set to `false`,
and that the query succeeds when the option is set to `true`.

Note that `check_restricted_table_properties` already
returns an optional warning, but it is only logged
and not returned in the `prepared_statement`.
Fixing that is out of the scope of this patch.
See https://github.com/scylladb/scylladb/issues/20945

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-01-20 08:03:25 +02:00
Piotr Dulikowski
6aa962f5f4 Merge 'Add audit subsystem for database operations' from Paweł Zakrzewski
Introduces a comprehensive audit system to track database operations for security
and compliance purposes. This change includes:

Core Components:
- New audit subsystem for logging database operations
- Service level integration for proper resource management
- CQL statement tracking with operation categories
- Login process integration for tenant management

Key Features:
- Configurable audit logging (syslog/table)
- Operation categorization (QUERY/DML/DDL/DCL/AUTH/ADMIN)
- Selective auditing by keyspace/table
- Password sanitization in audit logs
- Service level shares support (1-1000) for workload prioritization
- Proper lifecycle management and cleanup

I ran the dtests for audit (manually enabled) and they pass.
The in-repo tests pass.

Notably, there should be no non-whitespace changes between this and scylla-enterprise

Fixes scylladb/scylla-enterprise#4999

Closes scylladb/scylladb#22147

* github.com:scylladb/scylladb:
  audit: Add shares support to service level management
  audit: Add service level support to CQL login process
  audit: Add support to CQL statements
  audit: Integrate audit subsystem into Scylla main process
  audit: Add documentation for the audit subsystem
  audit: Add the audit subsystem
2025-01-17 13:14:55 +01:00
Gleb Natapov
83d15b8e32 cql3: report host id instead of ip in error during SELECT FROM MUTATION_FRAGMENTS query
We want to drop ip from the topology::node.
2025-01-16 16:37:07 +02:00
Paweł Zakrzewski
5b1da31595 audit: Add shares support to service level management
Introduces shares-based workload prioritization for service levels, allowing
fine-grained control over resource allocation between tenants. Key changes:

- Add shares option to service level configuration:
  - Valid range: 1-1000 shares
  - Default value: 1000 shares
  - Enterprise-only feature gated by WORKLOAD_PRIORITIZATION feature flag

- Extend CQL interface:
  - Add shares parameter to CREATE/ALTER SERVICE_LEVEL
  - Add shares column to system_distributed.service_levels
  - Add percentage calculation to LIST SERVICE_LEVELS
  - Add shares to DESCRIBE EFFECTIVE SERVICE_LEVEL output

- Add validation:
  - Enforce shares range (1-1000)
  - Validate enterprise feature flag
  - Handle unset/delete markers properly

- Update service level statements:
  - Add shares validation to CREATE/ALTER operations
  - Preserve shares through default value replacement
  - Add proper decomposition for shares values in result sets

This change enables operators to control relative resource allocation between
tenants using proportional share scheduling, while maintaining backward
compatibility with existing service level configurations.
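
As a rough illustration of the range check and of the percentage column, here is a small sketch. The function names and the use of `std::map` are assumptions made for the example; only the 1-1000 range and the 1000 default come from the text above.

```cpp
#include <map>
#include <stdexcept>
#include <string>

constexpr int min_shares = 1;
constexpr int max_shares = 1000;
constexpr int default_shares = 1000;

// Reject values outside the documented 1-1000 range.
inline int validate_shares(int shares) {
    if (shares < min_shares || shares > max_shares) {
        throw std::invalid_argument("shares must be in the range 1-1000");
    }
    return shares;
}

// Each service level's share expressed as a percentage of all shares.
inline std::map<std::string, double> share_percentages(const std::map<std::string, int>& levels) {
    long total = 0;
    for (const auto& [name, shares] : levels) {
        total += validate_shares(shares);
    }
    std::map<std::string, double> out;
    for (const auto& [name, shares] : levels) {
        out[name] = 100.0 * shares / total;  // total > 0 whenever levels is non-empty
    }
    return out;
}
```

Two service levels with 250 and 750 shares would be reported as 25% and 75% respectively.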
2025-01-15 15:01:05 +01:00
Paweł Zakrzewski
98f5e49ea8 audit: Add support to CQL statements
Integrates audit functionality into CQL statement processing to enable tracking of database operations. Key changes:

- Add audit_info and statement_category to all CQL statements
- Implement audit categories for different statement types:
  - DDL: Schema altering statements (CREATE/ALTER/DROP)
  - DML: Data manipulation (INSERT/UPDATE/DELETE/TRUNCATE/USE)
  - DCL: Access control (GRANT/REVOKE/CREATE ROLE)
  - QUERY: SELECT statements
  - ADMIN: Service level operations

- Add audit inspection points in query processing:
  - Before statement execution
  - After access checks
  - After statement completion
  - On execution failures

- Add password sanitization for role management statements
  - Mask plaintext passwords in audit logs
  - Handle both direct password parameters and options maps
  - Preserve query structure while hiding sensitive data

- Modify prepared statement lifecycle to carry audit context
  - Pass audit info during statement preparation
  - Track audit info through statement execution
  - Support batch statement auditing

This change enables comprehensive auditing of CQL operations while ensuring sensitive data is properly masked in audit logs.
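
To give a feel for the password masking, here is a deliberately simplified sketch based on `std::regex`. It is not the audit implementation: it only covers the direct `PASSWORD = '...'` form, ignores options maps, and does not handle escaped quotes inside the literal. The function name is hypothetical.

```cpp
#include <regex>
#include <string>

// Replace the string literal following PASSWORD = with '***',
// leaving the rest of the query text untouched.
// Simplification: escaped quotes ('') inside the password are not handled.
inline std::string mask_password(const std::string& query) {
    static const std::regex re(R"((PASSWORD\s*=\s*)'[^']*')", std::regex::icase);
    return std::regex_replace(query, re, "$1'***'");
}
```

The query structure is preserved, so the logged statement still shows which role was created or altered.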
2025-01-15 11:10:36 +01:00
Paweł Zakrzewski
384641194a audit: Add the audit subsystem
This change introduces a new audit subsystem that allows tracking and logging of database operations for security and compliance purposes. Key features include:

- Configurable audit logging to either syslog or a dedicated system table (audit.audit_log)
- Selective auditing based on:
  - Operation categories (QUERY, DML, DDL, DCL, AUTH, ADMIN)
  - Specific keyspaces
  - Specific tables
- New configuration options:
  - audit: Controls audit destination (none/syslog/table)
  - audit_categories: Comma-separated list of operation categories to audit
  - audit_tables: Specific tables to audit
  - audit_keyspaces: Specific keyspaces to audit
  - audit_unix_socket_path: Path for syslog socket
  - audit_syslog_write_buffer_size: Buffer size for syslog writes

The audit logs capture details including:
- Operation timestamp
- Node and client IP addresses
- Operation category and query
- Username
- Success/failure status
- Affected keyspace and table names
2025-01-15 11:10:35 +01:00
Kefu Chai
7215d4bfe9 utils: do not include unused headers
these unused includes were identified by clang-include-cleaner. after
auditing these source files, all of the reports have been confirmed.

please note, because quite a few source files relied on
`utils/to_string.hh` to pull in the specialization of
`fmt::formatter<std::optional<T>>`, after removing
`#include <fmt/std.h>` from `utils/to_string.hh`, we have to
include `fmt/std.h` directly.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-14 07:56:39 -05:00
Wojciech Mitros
d04f376227 mv: add an experimental feature for creating views using tablets
We still have a number of issues to be solved for views with tablets.
Until they are fixed, we should prevent users from creating them,
and use the vnode-based views instead.

This patch prepares the feature for enabling views with tablets. The
feature is disabled by default, but currently it has no effect.
After all tests are adjusted to use the feature, we should depend
on the feature for deciding whether we can create materialized views
in tablet-enabled keyspaces.

The unit tests are adjusted to enable this feature explicitly, and it's
also added to the scylla sstable tool config - this tool treats all
tables as if they were tablet-based (surprisingly, with SimpleStrategy),
so for it to work on views, the new feature must be enabled.

Refs scylladb/scylladb#21832

Closes scylladb/scylladb#21833
2025-01-07 15:52:36 +01:00
Kefu Chai
e4463b11af treewide: replace boost::algorithm::join() with fmt::join()
Replace usages of `boost::algorithm::join()` with `fmt::join()` to improve
performance and reduce dependency on Boost. `fmt::join()` allows direct
formatting of ranges and tuples with custom separators without creating
intermediate strings.

When formatting comma-separated values into another string, fmt::join()
avoids the overhead of temporary string creation that
`boost::algorithm::join()` requires. This change also helps streamline
our dependencies by leveraging the existing fmt library instead of
Boost.Algorithm.

To avoid the ambiguity, some caller sites were updated to call
`seastar::format()` explicitly.

See also

- boost::algorithm::join():
  https://www.boost.org/doc/libs/1_87_0/doc/html/string_algo/reference.html#doxygen.join_8hpp
- fmt::join():
  https://fmt.dev/11.0/api/#ranges-api

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22082
2025-01-07 12:45:05 +02:00
Piotr Dulikowski
0f62eb45d1 cql3/statements: update SL statements for workload prioritization
Introduce the "SHARES" keyword which can be used in conjunction with
existing CQL statements related to the service levels.

Adjust the CQL statements for service levels:

- CREATE/ALTER now allow setting shares (only if the cluster is fully
  upgraded)
- LIST EFFECTIVE SERVICE LEVEL now returns the number of shares in a new
  column
- LIST SERVICE LEVEL(S) also returns the number of shares, and has the
  additional column "percentage of all service level shares"
2025-01-02 07:13:34 +01:00
Michael Litvak
5ef7afb968 cql3: allow SELECT of specific collection key
This adds to the grammar the option to SELECT a specific key in a
collection column using subscript syntax.

For example:
SELECT map['key'] FROM table
SELECT map['key1']['key2'] FROM table

The key can also be parameterized in a prepared query. For this we need
to pass the query options to result_set_builder where we process the
selectors.

Fixes scylladb/scylladb#7751
2024-12-30 17:05:20 +02:00
Avi Kivity
eb62593f2c treewide: use angle brackets when including seastar headers
We treat Seastar as a "system" library, and those are included
with angle brackets.

Closes scylladb/scylladb#21959
2024-12-20 16:16:28 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Michael Litvak
53224d90be service/qos: increase timeout of internal get_service_levels queries
The function get_service_levels is used to retrieve all service levels
and it is called from multiple different contexts.
Importantly, it is called internally from the context of group0 state reload,
where it should be executed with a long timeout, similarly to other
internal queries, because a failure of this function affects the entire
group0 client, and a longer timeout can be tolerated.
The function is also called in the context of the user command LIST
SERVICE LEVELS, and perhaps other contexts, where a shorter timeout is
preferred.

The commit introduces a function parameter to indicate whether the
context is internal or not. For internal context, a long timeout is
chosen for the query. Otherwise, the timeout is shorter, the same as
before. When the distinction is not important, a default value is
chosen which maintains the same behavior.
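
The shape of such a parameter can be sketched as below. The enum, the function name, and both timeout values are placeholders invented for the example; the commit message does not state the real values.

```cpp
#include <chrono>

// Hypothetical marker for who issues the query.
enum class query_context { user, group0 };

// Placeholder values, chosen only so that the internal timeout is longer.
constexpr std::chrono::seconds user_timeout{10};
constexpr std::chrono::seconds internal_timeout{300};

// The default argument keeps existing callers on the shorter timeout.
constexpr std::chrono::seconds service_levels_query_timeout(
        query_context ctx = query_context::user) {
    return ctx == query_context::group0 ? internal_timeout : user_timeout;
}
```

Callers that do not care about the distinction simply omit the argument and keep the previous behavior.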

The main purpose is to fix the case where the timeout is too short and causes
a failure that propagates and fails the group0 client.

Fixes scylladb/scylladb#20483

Closes scylladb/scylladb#21748
2024-12-09 13:20:32 +01:00
Kefu Chai
bab12e3a98 treewide: migrate from boost::adaptors::transformed to std::views::transform
now that we are allowed to use C++23, we have the luxury of using
`std::views::transform`.

in this change, we:

- replace `boost::adaptors::transformed` with `std::views::transform`
- use `fmt::join()` when appropriate where `boost::algorithm::join()`
  is not applicable to a range view returned by `std::views::transform`.
- use `std::ranges::fold_left()` to accumulate the range returned by
  `std::views::transform`
- use `std::ranges::fold_left()` to get the maximum element in the
  range returned by `std::views::transform`
- use `std::ranges::min()` to get the minimal element in the range
  returned by `std::views::transform`
- use `std::ranges::equal()` to compare the range views returned
  by `std::views::transform`
- remove unused `#include <boost/range/adaptor/transformed.hpp>`
- use `std::ranges::subrange()` instead of `boost::make_iterator_range()`,
  to feed `std::views::transform()` a view range.

this reduces our dependency on boost for better maintainability, and
leverages standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

limitations:

there are still a couple places where we are still using
`boost::adaptors::transformed` due to the lack of a C++23 alternative
for `boost::join()` and `boost::adaptors::uniqued`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21700
2024-12-03 09:41:32 +02:00
Kefu Chai
a5ee0c896b treewide: migrate from boost::adaptors::filtered to std::views::filter
Modernize the codebase by replacing Boost range adaptors with C++23 standard library views,
reducing external dependencies and leveraging modern C++ language features.

Key Changes:
- Replace `boost::adaptors::filtered` with `std::views::filter`
- Remove `#include <boost/range/adaptor/filtered.hpp>`
- Utilize standard library range views

Motivation:
- Reduce project's external dependency footprint
- Leverage standard library's range and view capabilities
- Improve long-term code maintainability
- Align with modern C++ best practices

Implementation Challenges and Considerations:
1. Range Conversion and Move Semantics
   - `std::ranges::to` adaptor requires rvalue references
   - Necessitated updates to variable and parameter constness
   - Example: `cql3/restrictions/statement_restrictions.cc` modified to remove `const`
     from `common` to enable efficient range conversion

2. Range Iteration and Mutation
   - Range views may mutate internal state during iteration
   - Cannot pass ranges by const reference in some scenarios
   - Solution: Pass ranges by rvalue reference to explicitly indicate
     state invalidation

Limitations:
- One instance of `boost::adaptors::filtered` temporarily preserved
  due to lack of a C++23 alternative for `boost::join()`
- A comprehensive replacement will be addressed in a follow-up change

This change is part of our ongoing effort to modernize the codebase,
reducing external dependencies and adopting modern C++ practices.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21648
2024-11-26 14:26:50 +02:00
muthu90tech
0ea0234a7a Avoid unnecessary copy in query_processor::execute_direct_without_checking_exception_message
Instead of making a copy of the warnings vector, make the warnings non-const in prepared_statement
and move the warnings vector to execute_maybe_with_guard.
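
The general idea (move out of a non-const member instead of copying it) can be sketched like this. The struct and function below are simplified stand-ins invented for the example, not the real `prepared_statement` or `execute_maybe_with_guard`.

```cpp
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in: the member must be non-const to be moved from.
struct prepared_statement_sketch {
    std::vector<std::string> warnings;
};

// Hand the warnings over without copying; the statement is left with an empty vector.
inline std::vector<std::string> take_warnings(prepared_statement_sketch& ps) {
    return std::exchange(ps.warnings, {});
}
```

Had `warnings` been declared `const`, `std::exchange` (or `std::move`) could not steal its buffer and a copy would be the only option.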

Closes scylladb/scylladb#20361

Closes scylladb/scylladb#21083
2024-11-22 13:34:31 +02:00
Botond Dénes
075ca6cc02 Merge 'cql3: respect PER PARTITION LIMIT for aggregate queries' from Paweł Zakrzewski
Currently, PER PARTITION LIMIT is not implemented for aggregates and queries can result in more rows than expected from the same partition.

Instrument the result_set_builder class so that it can enforce PER PARTITION LIMIT for aggregate queries, specifically:
- add per_partition_limit to the result_set_builder
- expose the number of input rows in the selector

result_set_builder gets two new functions handling partition start and end:
- accept_partition_end for notifying that a partition has been finished. This is also called when a page ends, so we cannot simply flush here, as a naive implementation could do.
- accept_new_partition, where we flush_selectors() if it's indeed a new partition (and not a continuation of the previous) and the query has a grouping: we don't want to flush on new partition in a query like SELECT COUNT(*) FROM foo;
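
A toy model of the intended semantics, assuming rows arrive ordered by partition key and then by grouping key: each group yields one aggregated row, and at most `per_partition_limit` such rows are produced per partition. The types and the function are invented for illustration and do not reflect the `result_set_builder` interface.

```cpp
#include <cstdint>
#include <vector>

struct row { int pk; int ck; };
struct group_count { int pk; int ck; std::uint64_t count; };

// Models: SELECT pk, ck, COUNT(*) ... GROUP BY pk, ck PER PARTITION LIMIT n
// Assumes `rows` is sorted by (pk, ck).
inline std::vector<group_count> count_groups(const std::vector<row>& rows,
                                             std::uint64_t per_partition_limit) {
    std::vector<group_count> out;
    bool have_prev = false;
    row prev{};
    std::uint64_t emitted_in_partition = 0;
    bool group_open = false;  // whether the current group is within the limit
    for (const row& r : rows) {
        const bool new_partition = !have_prev || r.pk != prev.pk;
        const bool new_group = new_partition || r.ck != prev.ck;
        if (new_partition) {
            emitted_in_partition = 0;  // the limit is counted per partition
        }
        if (new_group) {
            group_open = emitted_in_partition < per_partition_limit;
            if (group_open) {
                out.push_back({r.pk, r.ck, 0});
                ++emitted_in_partition;
            }
        }
        if (group_open) {
            ++out.back().count;
        }
        prev = r;
        have_prev = true;
    }
    return out;
}
```

Without the per-partition counter, a partition with many groups would contribute more result rows than the limit allows, which is the behavior the series fixes.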

Fixes #5363

Closes scylladb/scylladb#21125

* github.com:scylladb/scylladb:
  test: enable PER PARTIION LIMIT + GROUP BY tests
  cql3: respect PER PARTITION LIMIT for aggregates
  cql3: selection: count input rows in the selector
  cql3: selection: pass per partition limit to the result_set_builder
  cql3: show different messages for LIMIT and PER PARTITION LIMIT in get_limit
2024-11-20 09:54:28 +02:00
Avi Kivity
b14871ad3f Merge 'code cleanup: remove "sstring_view" and replace its usages by std::string_view' from Nadav Har'El
For historic reasons, we have (in bytes.hh) a type sstring_view which is an alias for std::string_view - since the same standard type can hold a pointer into both a seastar::sstring and std::string.

This alias is unnecessary and misleading to new developers, who might assume it is somehow different from std::string_view when it isn't.

This series removes all uses of sstring_view (changing them to use std::string_view), and in the last patch removes the alias itself. A few functions whose name referred to "sstring" but take a std::string_view were renamed.

The patches are fairly mechanical and trivial, with no functional changes intended. To ease the review the series was split to a few smaller patches that modify specific areas of the code.

Fixes #4062.

Closes scylladb/scylladb#21617

* github.com:scylladb/scylladb:
  bytes: remove unused alias sstring_view
  change remaining sstring_view to std::string_view
  test: change sstring_view to std::string_view
  cql3: change sstring_view to std::string_view
  alternator: change sstring_view to std::string_view
  type: change from_sstring() to from_string_view()
  cross-tree: change to_sstring_view() to to_string_view()
2024-11-18 22:43:46 +02:00
Paweł Zakrzewski
aea3c3851e cql3: selection: pass per partition limit to the result_set_builder
Aggregates require the limit to be applied from within the builder
class, so it needs to be passed to it.
2024-11-18 17:56:53 +01:00
Paweł Zakrzewski
cb1483037c cql3: show different messages for LIMIT and PER PARTITION LIMIT in get_limit
select_statement::get_limit is used to evaluate the LIMIT value for both
LIMIT and PER PARTITION LIMIT. This change fixes the error message for
incorrect values passed by the user.
2024-11-18 17:56:53 +01:00
Nadav Har'El
b778ce08a9 cql3: change sstring_view to std::string_view
Our "sstring_view" is an historic alias for the standard std::string_view.
The cql3/ directory used this old alias in a few random places; let's
change them to use the standard type name.

Refs #4062.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-11-18 15:57:20 +02:00
Avi Kivity
de822d3a46 mutation: mutation_partition.hh: switch from boost ranges to std ranges
Consolidate on one range solution. Fallout in mutation_partition.cc
due to interoperability problems is adjusted.
2024-11-15 14:09:31 +02:00
Kefu Chai
00810e6a01 treewide: include seastar/core/format.hh instead of seastar/core/print.hh
The latter includes the former. In addition to `seastar::format()`,
`print.hh` also provides helpers like `seastar::fprint()` and
`seastar::print()`, which are deprecated and not used by scylladb.

Previously, we included `seastar/core/print.hh` in order to use
`seastar::format()`. In seastar 5b04939e, we extracted
`seastar::format()` into `seastar/core/format.hh`, which allows us
to include a much smaller header.

In this change, we just include `seastar/core/format.hh` in place of
`seastar/core/print.hh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21574
2024-11-14 17:45:07 +02:00
Nikita Kurashkin
3032d8ccbf add check to refuse usage of DESC TABLE on a materialized view
Fixes #21026

Closes scylladb/scylladb#21500
2024-11-11 10:23:30 +02:00
Benny Halevy
4b21cca443 treewide: always allow tablets keyspaces
With the tablets feature always enabled (unless gossip topology
changes are forced), the enable_tablets option now controls only
the default for newly created keyspaces.

Even when set to `false`, tablets are still enabled as a
feature and the user may explicitly enable tablets
using `CREATE KEYSPACE <name> WITH tablets = {'enabled': true}`
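
The resulting semantics can be modelled roughly as follows. This is a sketch under assumptions: `keyspace_uses_tablets` is a hypothetical function, `default_enabled` stands in for the `enable_tablets` config value, and `requested` models an explicit `tablets = {'enabled': ...}` keyspace option when one is given.

```cpp
#include <optional>

// Hypothetical model of the decision, not ScyllaDB code.
// default_enabled: value of the enable_tablets config option.
// requested: explicit per-keyspace choice, or std::nullopt if the
//            CREATE KEYSPACE statement did not mention tablets.
inline bool keyspace_uses_tablets(bool default_enabled, std::optional<bool> requested) {
    // The config option only supplies the default; an explicit choice wins.
    return requested.value_or(default_enabled);
}
```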

Note: best viewed with `git show -w`

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-11-07 13:57:39 +02:00
Avi Kivity
4dab2473a2 Merge 'treewide: trade boost's any_of and all_of for std's any_of and all_of' from Kefu Chai
now that we are allowed to use C++23, we have the luxury of using
`std::ranges::all_of` and `std::ranges::any_of`

in this change, we replace `boost::algorithm::all_of` and `boost::algorithm::any_of` with
`std::ranges::all_of` and `std::ranges::any_of` respectively.

to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#21411

* github.com:scylladb/scylladb:
  treewide: s/boost::algorithm::any_of/std::ranges::any_of/
  treewide: s/boost::algorithm::all_of/std::ranges::all_of/
2024-11-05 12:48:24 +02:00
Piotr Dulikowski
7f17894c88 Merge 'cql3: Allow for describing CDC log tables' from Dawid Mędrek
In the past, DESC SCHEMA would produce create statements for both the base
and the log table. That was incorrect as the log table is automatically
created alongside the base one. That was solved in scylladb/scylladb@9ab57b1
(scylladb/scylladb#18467).

The mentioned changes implemented the following solution:

* DESC SCHEMA/KEYSPACE/TABLE would still print a create statement for the
  CDC base table,
* DESC SCHEMA/KEYSPACE would start printing an alter statement for the
  CDC log table. That statement would ensure that the restored log table
  has the same parameters as the original one,
* DESC TABLE <base table> would behave as DESC SCHEMA/KEYSPACE, i.e.
  it would print a create statement for the base table and an alter
  statement for the log table,
* DESC TABLE <log table> would result in an error.

While that solution was good and behaved correctly in the context of
restoring the schema, it had one flaw: describe statements aren't only
used as a means for producing a backup; they also serve an informative
purpose to learn about the schema, e.g. to learn what parameters a specific
table uses. Because we didn't allow for describing CDC log tables, the user
couldn't look them up directly via a describe statement -- they had to
describe the base table for that.

Attempting to describe a log table ended with an error, e.g.:

```
$ DESC TABLE ks.t_scylla_cdc_log;

ks.t_scylla_cdc_log is a cdc log table and it cannot be described directly. Try `DESC TABLE ks.t` to describe cdc base table and it's log table.
```

In these changes, we allow for describing CDC log tables again. The
semantics of the first three bullets above remains unchanged, but
we impose new behavior for DESC TABLE <log table>:

* When the user executes DESC TABLE <log table>, a create statement
  will be returned, treating the table as if it were a regular one,
* The create statement will be wrapped in CQL comment markers.

The rationale for the second bullet is that although we want to give the
user a means to look into the structure and options of a CDC log table,
the returned statement is not supposed to be ever executed by them. We
want to minimize the risk of that.
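
A rough sketch of the wrapping step, assuming a hypothetical helper `wrap_in_cql_comment` (the real implementation may be structured differently; only the warning text is taken from this message):

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: surrounds a CREATE statement with CQL block-comment
// markers so that pasting the DESC output into cqlsh executes nothing.
inline std::string wrap_in_cql_comment(std::string_view create_statement) {
    std::string out =
        "/* Do NOT execute this statement! It's only for informational purposes.\n"
        "   A CDC log table is created automatically when the base is created.\n\n";
    out += create_statement;
    out += "\n\n*/";
    return out;
}
```

Note that this sketch does not handle a statement that itself contains `*/`, which would end the comment early.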

An example of the behavior after the change:

```
$ DESC TABLE ks.t_scylla_cdc_log;

/* Do NOT execute this statement! It's only for informational purposes.
   A CDC log table is created automatically when the base is created.

CREATE TABLE ks.t_scylla_cdc_log (
    "cdc$stream_id" blob,
    "cdc$time" timeuuid,
    "cdc$batch_seq_no" int,
    "cdc$end_of_batch" boolean,
    "cdc$operation" tinyint,
    "cdc$ttl" bigint,
    p int,
    PRIMARY KEY ("cdc$stream_id", "cdc$time", "cdc$batch_seq_no")
) WITH CLUSTERING ORDER BY ("cdc$time" ASC, "cdc$batch_seq_no" ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'enabled': 'false', 'keys': 'NONE', 'rows_per_partition': 'NONE'}
    AND comment = 'CDC log for ks.t'
    AND compaction = {'class': 'TimeWindowCompactionStrategy', 'compaction_window_size': '60', 'compaction_window_unit': 'MINUTES', 'expired_sstable_check_frequency_seconds': '1800'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND speculative_retry = '99.0PERCENTILE';

*/
```

We also extend the developer documentation regarding DESCRIBE
statements on CDC tables.

Fixes scylladb/scylladb#21235

Backport: these changes are an enhancement, so not needed.

Closes scylladb/scylladb#21228

* github.com:scylladb/scylladb:
  docs/dev: Document semantics of describing CDC tables
  cql3: Allow for describing CDC log tables
2024-11-05 10:06:13 +01:00
Kefu Chai
59eb2ab119 treewide: s/boost::algorithm::any_of/std::ranges::any_of/
now that we are allowed to use C++23, we have the luxury of using
`std::ranges::any_of`.

in this change, we replace `boost::algorithm::any_of` with
`std::ranges::any_of`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-05 14:06:09 +08:00
Kefu Chai
f8bb1c64f1 treewide: s/boost::algorithm::all_of/std::ranges::all_of/
now that we are allowed to use C++23, we have the luxury of using
`std::ranges::all_of`.

in this change, we replace `boost::algorithm::all_of` with
`std::ranges::all_of`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-05 14:05:24 +08:00
Kefu Chai
ee2a9419b3 lang: remove unused "#includes"
these unused includes are identified by clang-include-cleaner. after
auditing the source files, all of the reports have been confirmed.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-05 10:01:04 +08:00
Avi Kivity
856489ded1 cql3: remove unused request_validations methods
These methods are not used and therefore removed.

Closes scylladb/scylladb#21392
2024-11-03 13:17:32 +02:00
Kefu Chai
64122b3df3 treewide: s/boost::transform/std::ranges::transform/
now that we are allowed to use C++23, we have the luxury of using
`std::ranges::transform`.

in this change, we:

- replace `boost::transform` with `std::ranges::transform`
- update affected code to work with `std::ranges::transform`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21318
2024-11-01 08:15:14 +02:00
Dawid Mędrek
39e0513e1b cql3: Allow for describing CDC log tables
In the past, DESC SCHEMA would produce create statements for both the base
and the log table. That was incorrect as the log table is automatically
created alongside the base one. That was solved in scylladb/scylladb@9ab57b1
(scylladb/scylladb#18467).

The mentioned changes implemented the following solution:

* DESC SCHEMA/KEYSPACE/TABLE would still print a create statement for the
  CDC base table,
* DESC SCHEMA/KEYSPACE would start printing an alter statement for the
  CDC log table. That statement would ensure that the restored log table
  has the same parameters as the original one,
* DESC TABLE <base table> would behave as DESC SCHEMA/KEYSPACE, i.e.
  it would print a create statement for the base table and an alter
  statement for the log table,
* DESC TABLE <log table> would result in an error.

While that solution was good and behaved correctly in the context of
restoring the schema, it had one flaw: describe statements aren't only
used as a means for producing a backup; they also serve an informative
purpose to learn about the schema, e.g. to learn what parameters a specific
table uses. Because we didn't allow for describing CDC log tables, the user
couldn't look them up directly via a describe statement -- they had to
describe the base table for that.

Attempting to describe a log table ended with an error, e.g.:

```
$ DESC TABLE ks.t_scylla_cdc_log;

ks.t_scylla_cdc_log is a cdc log table and it cannot be described directly. Try `DESC TABLE ks.t` to describe cdc base table and it's log table.
```

In these changes, we allow for describing CDC log tables again. The
semantics of the first three bullets above remains unchanged, but
we impose new behavior for DESC TABLE <log table>:

* When the user executes DESC TABLE <log table>, a create statement
  will be returned, treating the table as if it were a regular one,
* The create statement will be wrapped in CQL comment markers.

The rationale for the second bullet is that although we want to give the
user a means to look into the structure and options of a CDC log table,
the returned statement is not supposed to be ever executed by them. We
want to minimize the risk of that.

An example of the behavior after the change:

```
$ DESC TABLE ks.t_scylla_cdc_log;

/* Do NOT execute this statement! It's only for informational purposes.
   A CDC log table is created automatically when the base is created.

CREATE TABLE ks.t_scylla_cdc_log (
    "cdc$stream_id" blob,
    "cdc$time" timeuuid,
    "cdc$batch_seq_no" int,
    "cdc$end_of_batch" boolean,
    "cdc$operation" tinyint,
    "cdc$ttl" bigint,
    p int,
    PRIMARY KEY ("cdc$stream_id", "cdc$time", "cdc$batch_seq_no")
) WITH CLUSTERING ORDER BY ("cdc$time" ASC, "cdc$batch_seq_no" ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'enabled': 'false', 'keys': 'NONE', 'rows_per_partition': 'NONE'}
    AND comment = 'CDC log for ks.t'
    AND compaction = {'class': 'TimeWindowCompactionStrategy', 'compaction_window_size': '60', 'compaction_window_unit': 'MINUTES', 'expired_sstable_check_frequency_seconds': '1800'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND speculative_retry = '99.0PERCENTILE';

*/
```

Fixes scylladb/scylladb#21235
2024-10-31 11:25:19 +01:00
Dawid Mędrek
b984488552 cql3: Rename SALTED HASH to HASHED PASSWORD
Cassandra 4.1 announced a new option to create a role with:
`HASHED PASSWORD`. Example:

```
CREATE ROLE bob WITH HASHED PASSWORD = 'hashed_password';
```

We've already introduced another option following the same
semantics: `SALTED HASH`; example:

```
CREATE ROLE bob WITH SALTED HASH = 'salted_hash';
```

The change hasn't made it to any release yet, so in this commit
we rename it to `HASHED PASSWORD` to be compatible with Cassandra.

Additionally, we adjust existing tests to work against Cassandra too.

Fixes scylladb/scylladb#21350

Closes scylladb/scylladb#21352
2024-10-30 14:07:58 +02:00
Avi Kivity
d3dae09316 compound: replace boost ranges with std ranges
Standardize on the standard range library.

The serialize_value(initializer_list) overload is disambiguated
not to call itself. Apparently it wasn't called before.

Since std::ranges::subrange does not provide operator==, replace
it with std::ranges::equal().
2024-10-28 18:35:41 +02:00
Kefu Chai
24d14b601b treewide: s/boost::adaptors::map_values/std::views::values/
now that we are allowed to use C++23, we have the luxury of using
`std::views::values`.

in this change, we:

- replace `boost::adaptors::map_values` with `std::views::values`
- update affected code to work with `std::views::values`
- the places where we use `boost::join()` are not changed, because
  we cannot use `std::views::concat` yet. this helper is only
  available in C++26.

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21265
2024-10-27 21:32:45 +02:00
Avi Kivity
847c850034 schema: add accessors for primary key columns and non-primary-key columns
It's somewhat common to ask for the partition key and clustering key
columns, or for the static and regular columns. Provide accessors for them
rather than requiring the user to glue them.

Some callers are converted.

Closes scylladb/scylladb#21191
2024-10-22 15:01:14 +02:00
Kefu Chai
6ead5a4696 treewide: move log.hh into utils/log.hh
the log.hh under the root of the tree was created to keep backward
compatibility when seastar was extracted into a separate library.
log.hh belongs in the `utils` directory, as it is based solely
on seastar and can be used by all subsystems.

in this change, we move log.hh into utils/log.hh so that it is more
modular. this also improves readability: when one sees
`#include "utils/log.hh"`, it is obvious that this source file
needs the logging system, instead of its own log facility -- please
note, we do have two other `log.hh` in the tree.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-22 06:54:46 +03:00
Kefu Chai
5cd619a60c treewide: s/boost::adaptors::map_keys/std::views::keys/
now that we are allowed to use C++23, we have the luxury of using
`std::views::keys`.

in this change, we:

- replace `boost::adaptors::map_keys` with `std::views::keys`
- update affected code to work with `std::views::keys`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21198
2024-10-21 12:47:52 +03:00
Avi Kivity
c3be2489ce treewide: drop includes of <boost/range/adaptors.hpp>
This includes way too much, including <boost/regex.hpp>, which is huge.
Drop includes of adaptors.hpp and replace by what is needed.

Closes scylladb/scylladb#21187
2024-10-20 17:17:11 +03:00
Botond Dénes
568b767ec3 Merge 'schema: convert from boost ranges to std ranges' from Avi Kivity
To reduce dependency load, change uses of boost ranges to std::ranges.

The first patch is preparation, replacing a construct that isn't easy to support with std ranges with something simpler.

No backport as this is a code cleanup.

Closes scylladb/scylladb#21122

* github.com:scylladb/scylladb:
  schema: replace boost ranges with std ranges
  schema: precompute all_columns_in_select_order()
2024-10-18 08:42:50 +03:00