Commit Graph

21 Commits

Author SHA1 Message Date
Nadav Har'El
0a990d2a48 config: split tri_mode_restriction to a separate header
Today, any source file or header file that wants to use the
tri_mode_restriction type needs to include db/config.hh, which is a
large and frequently-changing header file. In this patch we split this
type into a separate header file, db/tri_mode_restriction.hh, and avoid
a few unnecessary inclusions of db/config.hh. However, a few source
files now need to explicitly include db/config.hh, after its
transitive inclusion is gone.

Note that the overwhelmingly common inclusion of db/config.hh continues
to be a problem after this patch - 128 source files include it directly.
So this patch is just the first step in long journey.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#25692
2025-08-27 13:47:04 +03:00
Wojciech Mitros
88d3fc68b5 alter_table_statement: fix renaming multiple columns in tables with views
When we rename columns in a table which has materialized views depending
on it, we need to also rename them in the materialized views' WHERE
clauses.
Currently, we do that by creating a new WHERE clause after each rename,
with the updated column. This is later converted to a mutation that
overwrites the WHERE clause. After multiple renames, we have multiple
mutations, each overwriting the WHERE clause with one column renamed.
As a result, the final WHERE clause is one of the modified clauses with
one column renamed.
Instead, we should prepare one new WHERE clause which includes all the
renamed columns. This patch accomplishes this by processing all the
column renames first, and only preparing the new view schema with the
new WHERE clause afterwards.

This patch also includes a test reproducer for this scenario.

Fixes scylladb/scylladb#22194

Closes scylladb/scylladb#23152
2025-03-25 09:58:58 +01:00
Kefu Chai
e4463b11af treewide: replace boost::algorithm::join() with fmt::join()
Replace usages of `boost::algorithm::join()` with `fmt::join()` to improve
performance and reduce dependency on Boost. `fmt::join()` allows direct
formatting of ranges and tuples with custom separators without creating
intermediate strings.

When formatting comma-separated values into another string, fmt::join()
avoids the overhead of temporary string creation that
`boost::algorithm::join()` requires. This change also helps streamline
our dependencies by leveraging the existing fmt library instead of
Boost.Algorithm.

To avoid the ambiguity, some caller sites were updated to call
`seastar::format()` explicitly.

See also

- boost::algorithm::join():
  https://www.boost.org/doc/libs/1_87_0/doc/html/string_algo/reference.html#doxygen.join_8hpp
- fmt::join():
  https://fmt.dev/11.0/api/#ranges-api

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22082
2025-01-07 12:45:05 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Kefu Chai
bab12e3a98 treewide: migrate from boost::adaptors::transformed to std::views::transform
now that we are allowed to use C++23. we now have the luxury of using
`std::views::transform`.

in this change, we:

- replace `boost::adaptors::transformed` with `std::views::transform`
- use `fmt::join()` when appropriate where `boost::algorithm::join()`
  is not applicable to a range view returned by `std::view::transform`.
- use `std::ranges::fold_left()` to accumulate the range returned by
  `std::view::transform`
- use `std::ranges::fold_left()` to get the maximum element in the
  range returned by `std::view::transform`
- use `std::ranges::min()` to get the minimal element in the range
  returned by `std::view::transform`
- use `std::ranges::equal()` to compare the range views returned
  by `std::view::transform`
- remove unused `#include <boost/range/adaptor/transformed.hpp>`
- use `std::ranges::subrange()` instead of `boost::make_iterator_range()`,
  to feed `std::views::transform()` a view range.

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

limitations:

there are still a couple places where we are still using
`boost::adaptors::transformed` due to the lack of a C++23 alternative
for `boost::join()` and `boost::adaptors::uniqued`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21700
2024-12-03 09:41:32 +02:00
Nadav Har'El
b778ce08a9 cql3: change sstring_view to std::string_view
Our "sstring_view" is an historic alias for the standard std::string_view.
The cql3/ directory used this old alias in a few of random places, let's
change them to use the standard type name.

Refs #4062.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-11-18 15:57:20 +02:00
Avi Kivity
8c67f9b42e cql3: util: remove unneeded boost/range includes from header files
The includes are redistributed to the source files that need them.

Closes scylladb/scylladb#21391
2024-10-31 23:49:44 +01:00
Avi Kivity
d69bf4f010 cql3: introduce dialect infrastructure
A dialect is a different way to interpret the same CQL statement.

Examples:
 - how duplicate bind variable names are handled (later in this series)
 - whether `column = NULL` in LWT can return true (as is now) or
   whether it always returns NULL (as in SQL)

Currently, dialect is an empty structure and will be filled in later.
It is passed to query_processor methods that also accept a CQL string,
and from there to the parser. It is part of the prepared statement cache
key, so that if the dialect is changed online, previous parses of the
statement are ignored and the statement is prepared again.

The patch is careful to pick up the dialect at the entry point (e.g.
CQL protocol server) so that the dialect doesn't change while a statement
is parsed, prepared, and cached.
2024-08-29 21:19:23 +03:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Avi Kivity
b858a4669d cql3: expr: break up expression.hh header
Adding a function declaration to expression.hh causes many
recompilations. Reduce that by:

 - moving some restrictions-related definitions to
   the existing expr/restrictions.hh
 - moving evaluation related names to a new header
   expr/evaluate.hh
 - move utilities to a new header
   expr/expr-utilities.hh

expression.hh contains only expression definitions and the most
basic and common helpers, like printing.
2023-06-22 14:21:03 +03:00
Wojciech Mitros
fc8dcc1a62 cql: add a check for currently used stack in parser
While in debug mode, we may switch the default stack to
a larger one when parsing cql. We may, however, invoke
the parser recusively, causing us to switch to the big
stack while currently using it. After the reset, we
assume that the stack is empty, so after switching to
the same stack, we write over its previous contents.

This is fixed by checking if we're already using the large
stack, which is achieved by comparing the address of
a local variable to the start and end of the large stack.
2023-03-23 01:41:58 +01:00
Avi Kivity
0f15ff740d cql3: expr: simplify user/debug formatting
We have a cql3::expr::expression::printer wrapper that annotates
an expression with a debug_mode boolean prior to formatting. The
fmt library, however, provides a much simpler alterantive: a custom
format specifier. With this, we can write format("{:user}", expr) for
user-oriented prints, or format("{:debug}", expr) for debug-oriented
prints (if nothing is specified, the default remains debug).

This is done by implementing fmt::formatter::parse() for the
expression type, can using expression::printer internally.

Since sometimes we pass expression element types rather than
the expression variant, we also provide a custom formatter for all
ExpressionElement Types.

Uses for expression::printer are updated to use the nicer syntax. In
one place we eliminate a temporary that is no longer needed since
ExpressionElement:s can be formatted directly.

Closes #12702
2023-02-08 12:24:58 +02:00
Nadav Har'El
5bf94ae220 cql: allow disabling of USING TIMESTAMP sanity checking
As requested by issue #5619, commit 2150c0f7a2
added a sanity check for USING TIMESTAMP - the number specified in the
timestamp must not be more than 3 days into the future (when viewed as
a number of microseconds since the epoch).

This sanity checking helps avoid some annoying client-side bugs and
mis-configurations, but some users genuinely want to use arbitrary
or futuristic-looking timestamps and are hindered by this sanity check
(which Cassandra doesn't have, by the way).

So in this patch we add a new configuration option, restrict_future_timestamp
If set to "true", futuristic timestamps (more than 3 days into the future)
are forbidden. The "true" setting is the default (as has been the case
sinced #5619). Setting this option to "false" will allow using any 64-bit
integer as a timestamp, like is allowed Cassanda (and was allowed in
Scylla prior to #5619.

The error message in the case where a futuristic timestamp is rejected
now mentions the configuration paramter that can be used to disable this
check (this, and the option's name "restrict_*", is similar to other
so-called "safe mode" options).

This patch also includes a test, which works in Scylla and Cassandra,
with either setting of restrict_future_timestamp, checking the right
thing in all these cases (the futuristic timestamp can either be written
and read, or can't be written). I used this test to manually verify that
the new option works, defaults to "true", and when set to "false" Scylla
behaves like Cassandra.

Fixes #12527

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #12537
2023-01-16 23:18:56 +02:00
Avi Kivity
9823e75d16 cql3: grammar: make where clause return an expression
In preparation of the relaxation of the grammar to return any expression,
change the whereClause production to return an expression rather than
terms. Note that the expression is still constrained to be a conjunction
of relations, and our filtering code isn't prepared for more.

Before the patch, if the WHERE clause was optional, the grammar would
pass an empty vector of expressions (which is exactly correct). After
the patch, it would pass a default-constructed expression. Now that
happens to be an empty conjunction, which is exactly what's needed, but
it is too accidental, so the patch changes optional WHERE clauses to
explicitly generate an empty conjunction if the WHERE clause wasn't
specified.
2022-07-22 20:14:48 +03:00
Avi Kivity
a037f9a086 cql3: util: deinline where clause utilities
Some where clause related functions were unnecessarily inline; another
was just recently de-templated. Move them to .cc.
2022-07-22 20:14:48 +03:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Nadav Har'El
ec5e4c338b cql: fix undefined behavior in timestamp verification
Commit 2150c0f7a2 proposed by issue #5619
added a limitation that USING TIMESTAMP cannot be more than 3 days into
the future. But the actual code used to check it,

     timestamp - now > MAX_DIFFERENCE

only makes sense for *positive* timestamps. For negative timestamps,
which are allowed in Cassandra, the difference "timestamp - now" might
overflow the signed integer and the result is undefined - leading to the
undefined-behavior sanitizer to complain as reported in issue #8895.
Beyond the sanitizer, in practice, on my test setup, the timestamp -2^63+1
causes such overflow, which causes the above if() to make the nonsensical
statement that the timestamp is more than 3 days into the future.

This patch assumes that negative timestamps of any magnitude are still
allowed (as they are in Cassandra), and fixes the above if() to only
check timestamps which are in the future (timestamp > now).

We also add a cql-pytest test for negative timestamps, passing on both
Cassandra and Scylla (after this patch - it failed before, and also
reported sanitizer errors in the debug build).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210621141255.309485-1-nyh@scylladb.com>
2021-07-24 11:01:08 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Piotr Wojtczak
2150c0f7a2 cql: Check for timestamp correctness in USING TIMESTAMP statements
In certain CQL statements it's possible to provide a custom timestamp via the USING TIMESTAMP clause. Those values are accepted in microseconds, however, there's no limit on the timestamp (apart from type size constraint) and providing a timestamp in a different unit like nanoseconds can lead to creating an entry with a timestamp way ahead in the future, thus compromising the table.

To avoid this, this change introduces a sanity check for modification and batch statements that raises an error when a timestamp of more than 3 days into the future is provided.

Fixes #5619

Closes #7475
2020-11-01 11:01:24 +02:00
Avi Kivity
cb6231d1e2 cql3: use larger stack for do_with_cql_parser() in debug mode
Our cql parser uses large amounts of stack, and can overflow it
in debug mode with clang. To prevent this stack overflow,
temporarily use a larger (1MB) stack.

We can't use seastar::thread(), since do_with_cql_parser() does
not yield. We can't use std::thread(), since lw_shared_ptr()'s
debug mode will scream murder at an lw_shared_ptr used across
threads (even though it's perfectly safe in this case). We
can't use boost::context2 since that requires the library to
be compiled with address sanitizer support, which it isn't on
Fedora. So we use a fiber switch using the getcontext() function
familty. This requires extra annotations for debu mode, which are
added.
2020-10-10 00:31:50 +03:00
Avi Kivity
31886bc562 cql3: deinline do_with_cql_parser()
The cql parser causes trouble with the santizers and clang,
since it consumes a large amount of stack space (it does so
with gcc too, but does not overflow our 128k stacks). In
preparation for working around the problem, deinline it
so the hacks need not spread to the entire code base
via #include.

There is no performance impact from the virtual function,
as cql parsing will dominate the call.
2020-10-09 23:49:42 +03:00