26 Commits

Author SHA1 Message Date
Nadav Har'El
f76f6dbccb secondary index: avoid special characters in default index names
In CQL, table names are limited to so-called word characters (letters,
numbers and underscores), but column names don't have such a limitation.
When we create a secondary index, its default name is constructed from
the column name - so can contain problematic characters. It can include
even the "/" character. The problem is that the index name is then used,
like a table name, to create a directory with that name.

The test included in this patch demonstrates that before this patch, this
can be misused to create subdirectories anywhere in the filesystem, or to
crash Scylla when it fails to create a directory (which it considers an
unrecoverable I/O error).

In this patch we do what Cassandra does - remove all non-word
characters from the indexed column name before constructing the default
index name. In the included test - which can run on both Scylla and
Cassandra - we verify that the constructed index name is the same as
in Cassandra, which is useful to know (e.g., because knowing the index
name is needed to DROP the index).

Also, this patch adds a second line of defense against the security problem
described above: It is now an error to create a schema with a slash or
null (the two characters not allowed in Unix filenames) in the keyspace
or table names. So if the first line of defense (CQL checking the validity
of its commands) fails, we'll have that second line of defense. I verified
that if I revert the default-index-name fix, the second line of defense
kicks in: the index creation is aborted, so it can neither create files
in the wrong place nor crash Scylla.

Fixes #3403

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220320162543.3091121-1-nyh@scylladb.com>
2022-03-20 18:33:48 +02:00
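
The two defenses described in f76f6dbccb can be sketched roughly as follows. This is a minimal illustration, not Scylla's code: the function names are hypothetical, and the `<table>_<column>_idx` name layout is an assumption about the default naming scheme.

```python
import re

# Matches anything that is not a letter, digit or underscore.
_NON_WORD = re.compile(r"\W", re.ASCII)


def default_index_name(table: str, column: str) -> str:
    """First line of defense: strip non-word characters from the column name.

    Assumption: the default name has the form <table>_<column>_idx.
    """
    return f"{table}_{_NON_WORD.sub('', column)}_idx"


def validate_schema_name(name: str) -> None:
    """Second line of defense: reject the two characters that are not
    allowed in Unix filenames (slash and null)."""
    if "/" in name or "\0" in name:
        raise ValueError(f"invalid character in schema name: {name!r}")


name = default_index_name("tbl", "../../evil col")
validate_schema_name(name)  # passes: the slashes were already stripped
print(name)
```

With the sanitization in place the validator never fires for default names; it only matters if the first check is bypassed.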
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except for
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
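
As an illustration of the three cases in fcb8d040e8, the helper below produces a single-line identifier. It is a sketch only: the helper name and comment prefix are hypothetical, and the AGPL-only and Apache-only identifiers are assumed from the dual-license expression quoted in the commit message.

```python
# Assumed identifiers for the three cases; only the dual-license
# expression is quoted verbatim in the commit message.
SPDX_IDS = {
    "agpl": "AGPL-3.0-or-later",
    "apache": "Apache-2.0",
    "dual": "(AGPL-3.0-or-later and Apache-2.0)",
}


def spdx_header(kind: str, comment_prefix: str = " * ") -> str:
    """Return the one-line license header for the given case."""
    return f"{comment_prefix}SPDX-License-Identifier: {SPDX_IDS[kind]}"


print(spdx_header("dual"))
```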
Avi Kivity
bbad8f4677 replica: move ::database, ::keyspace, and ::table to replica namespace
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.

References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.

scylla-gdb.py is adjusted to look for both the new and old names.
2022-01-07 12:04:38 +02:00
Dejan Mircevski
ba55769f80 test: Use ALLOW FILTERING more strictly
Prepare for the upcoming strict ALLOW FILTERING check by modifying
unit-test queries that need it.  Current code allows such queries both
with and without ALLOW FILTERING; future code will reject them without
ALLOW FILTERING.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2021-08-08 08:01:19 +02:00
Piotr Grabowski
e06102aed9 tests: add secondary index tests with TOKEN clause
Add tests of SELECTs with TOKEN clauses on tables with secondary
indexes (both global and local).

test_select_with_token_range_cases checks all possible token range
combinations (inclusive/exclusive/infinity start/end) on tables without
index, with local or with global index.

test_select_with_token_range_filtering checks whether TOKEN restrictions
combined with column restrictions work properly. As different code paths
are taken if the index is created on a clustering key column (first or
non-first) or on a non-primary-key column, the tests check scenarios
where the index is created on different columns.
2021-07-21 16:12:55 +02:00
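
The range combinations that e06102aed9 describes (inclusive, exclusive, or unbounded at each end) could be enumerated as below. This is a hedged sketch: the helper name, the column name `pk` and the bound values are hypothetical and are not taken from the actual test.

```python
from itertools import product

# None means the range is unbounded on that side.
START_OPS = (None, ">", ">=")
END_OPS = (None, "<", "<=")


def token_range_clauses(lo: int, hi: int, pk: str = "pk") -> list:
    """Build one WHERE fragment per start/end combination (3 x 3 = 9)."""
    clauses = []
    for start_op, end_op in product(START_OPS, END_OPS):
        parts = []
        if start_op:
            parts.append(f"token({pk}) {start_op} {lo}")
        if end_op:
            parts.append(f"token({pk}) {end_op} {hi}")
        clauses.append(" AND ".join(parts))
    return clauses


for clause in token_range_clauses(-5, 7):
    print(repr(clause))
```

The empty string stands for the fully unbounded case, where no TOKEN restriction is added at all.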
Piotr Grabowski
e2bd1cdb9d secondary_index_test: extract test data
Extract test data to separate variables, allowing it to be easily
reused by other tests. The tokens are hard-coded, because calculating
their value brought too much complexity to this code.
2021-07-21 16:12:55 +02:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day
2021-06-06 19:18:49 +03:00
Piotr Sarna
c5214eb096 treewide: remove timeout config from query options
Timeout config is now stored in each connection, so there's no point
in tracking it inside each query as well. This patch removes
timeout_config from query_options and follows by removing now
unnecessary parameters of many functions and constructors.
2021-02-25 17:20:27 +01:00
Dejan Mircevski
46b4b59945 cql3: Fix value_for when restriction is impossible
Previously, single_column_restrictions::value_for() assumed that a
column's restriction specifies exactly one value for the column.  But
since 37ebe521e3, multiple equalities on the same column are allowed,
so the restriction could be a conjunction of conflicting
equalities (eg, c=1 AND c=0).  That violates an assert and crashes
Scylla.

This patch fixes value_for() by gracefully handling the
impossible-restriction case.

Fixes #7772

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-12-16 15:00:29 -05:00
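
The idea behind 46b4b59945 can be shown with a simplified model in which a column's restriction is just a list of equality values. This is a sketch under that assumption; the real `value_for()` operates on restriction objects, and returning `None` is a stand-in for however the fix reports "no value".

```python
from typing import List, Optional


def value_for(equalities: List[int]) -> Optional[int]:
    """Return the single value the column must have, or None if the
    conjunction of equalities cannot be satisfied (e.g. c=1 AND c=0)
    or if there are no equalities at all."""
    distinct = set(equalities)
    if len(distinct) == 1:
        return distinct.pop()
    return None  # handled gracefully instead of asserting


print(value_for([1, 1]))  # c=1 AND c=1
print(value_for([1, 0]))  # c=1 AND c=0 is impossible
```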
Dejan Mircevski
e45af3b9b8 index: Ensure restriction is supported in find_idx
Previously, statement_restrictions::find_idx() would happily return an
index for a non-EQ restriction (because it checked only the column
name, not the operator).  This is incorrect: when the selected index
is for a non-EQ restriction, it is impossible to query that index
table.

Fixes #7659.

Tests: unit (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>

Closes #7665
2020-12-01 15:16:48 +02:00
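
The check added in e45af3b9b8 amounts to matching on the operator as well as the column name. The sketch below uses a hypothetical `Restriction` type and a plain set of indexed column names; it is not the signature of `statement_restrictions::find_idx()`.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Restriction:
    column: str
    op: str  # e.g. "=", "<", ">="


def find_idx(restrictions: Iterable[Restriction],
             indexed_columns: Set[str]) -> Optional[str]:
    """Return the first indexed column restricted by an equality.

    Checking only the column name would also pick up non-EQ
    restrictions, for which the index table cannot be queried.
    """
    for r in restrictions:
        if r.column in indexed_columns and r.op == "=":
            return r.column
    return None


print(find_idx([Restriction("v", "<")], {"v"}))
print(find_idx([Restriction("v", "<"), Restriction("w", "=")], {"v", "w"}))
```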
Dejan Mircevski
db63b40347 cql3: Don't use index for multi-column restrictions
The downstream code expects a single-column restriction when using an
index.  We could fix it, but we'd still have to filter the rows
fetched from the index table, unlike the code that queries the base
table directly.  For instance, WHERE (c1,c2,c3) = (1,2,3) with an
index on c3 can fetch just the right rows from the base table but all
the c3=3 rows from the index table.

Fixes #7680

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-11-25 10:39:04 -05:00
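
The trade-off in db63b40347 can be illustrated with the commit's own example, WHERE (c1,c2,c3) = (1,2,3) with an index on c3. The rows below are made-up sample data, and the list comprehensions only model which rows each path would fetch.

```python
# Hypothetical clustering-key tuples (c1, c2, c3) in one partition.
rows = [(1, 2, 3), (1, 5, 3), (4, 2, 3), (1, 2, 4)]
target = (1, 2, 3)

# Querying the base table directly fetches exactly the matching rows.
from_base = [r for r in rows if r == target]

# Going through an index on c3 fetches every row with c3 = 3 ...
from_index = [r for r in rows if r[2] == target[2]]

# ... so the result would still need filtering afterwards.
after_filter = [r for r in from_index if r == target]

print(len(from_base), len(from_index), len(after_filter))
```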
Avi Kivity
756b14f309 Merge 'cql3: Drop unneeded filtering when continuous clustering-key is selected' from Dejan Mircevski
I noticed that we require filtering for continuous clustering key, which is not necessary.  I dropped the requirement and made sure the correct data is read from the storage proxy.

The corresponding dtest PR: https://github.com/scylladb/scylla-dtest/pull/1727

Tests: unit (dev,debug), dtest (next-gating, cql*py)

Closes #7460

* github.com:scylladb/scylla:
  cql3: Delete some newlines
  cql3: Drop superfluous ALLOW FILTERING
  cql3: Drop unneeded filtering for continuous CK
2020-11-10 17:41:00 +02:00
Piotr Grabowski
491987016c tests: add token ordering test of indexed selects
Add a new test validating that both non-indexed and indexed selects
return rows sorted in token order (making sure that both positive and
negative tokens are present, to test that signed comparison order is
maintained).
2020-11-04 12:02:42 +01:00
Piotr Grabowski
2bd23fbfa9 tests: fix tests according to new token ordering
Fix tests to adhere to new (correct) token ordering of rows when
querying tables with secondary indexes.
2020-11-04 12:02:42 +01:00
Piotr Grabowski
b1350af951 token_column_computation: rename as legacy
Rename token_column_computation to legacy_token_column_computation, as
it will be replaced with a new column_computation. The reason is that
this computation returns bytes, but all tokens in Scylla can now be
represented by int64_t. Moreover, returning bytes causes invalid token
ordering, as bytes are compared as unsigned values (not signed, as
int64_t is). See issue:

https://github.com/scylladb/scylla/issues/7443
2020-11-04 12:00:18 +01:00
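
The ordering problem described in b1350af951 is easy to reproduce outside Scylla. The sketch below assumes the token bytes are a big-endian two's complement encoding of the int64 value; that encoding is an assumption made for illustration.

```python
import struct

tokens = [-3, -1, 0, 2, 5]


def as_bytes(token: int) -> bytes:
    """Serialize a token as a big-endian signed 64-bit integer."""
    return struct.pack(">q", token)


# Comparing integers gives the correct signed order.
signed_order = sorted(tokens)

# Comparing the serialized bytes is unsigned: negative tokens start
# with 0xff bytes, so they sort after all non-negative tokens.
bytes_order = sorted(tokens, key=as_bytes)

print(signed_order)  # [-3, -1, 0, 2, 5]
print(bytes_order)   # [0, 2, 5, -3, -1]
```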
Piotr Grabowski
e96ef0d629 tests: Cleanup select_statement_utils
Add additional comments to select_statement_utils, fix formatting, add
missing #pragma once and introduce set_internal_paging_size_guard to
set internal_paging in RAII fashion.

Closes #7507
2020-10-29 15:25:02 +01:00
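
The RAII pattern mentioned in e96ef0d629 has a close analogue in Python's context managers. The sketch below borrows the guard's name for readability, but the `config` dictionary and its default value are hypothetical; the actual guard is a C++ class.

```python
from contextlib import contextmanager

# Hypothetical stand-in for the internal paging setting.
config = {"internal_paging_size": 10000}


@contextmanager
def set_internal_paging_size_guard(new_size: int):
    """Set the paging size for the duration of a block and restore the
    previous value on exit, even if the block raises."""
    old_size = config["internal_paging_size"]
    config["internal_paging_size"] = new_size
    try:
        yield
    finally:
        config["internal_paging_size"] = old_size


with set_internal_paging_size_guard(3):
    print(config["internal_paging_size"])  # 3 inside the block
print(config["internal_paging_size"])      # restored afterwards
```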
Piotr Grabowski
006d4f40d9 tests: Add secondary index aggregates tests
Extensively tests queries on tables with secondary indices with
aggregates and GROUP BYs. Tests three cases that are implemented
in indexed_table_select_statement::do_execute - partition_slices,
whole_partitions and (non-partition_slices and non-whole_partitions).
As some of the issues found were related to paging, the tests check
scenarios where the inserted data is smaller than a page, larger than
a page and larger than two pages (and some boundary scenarios).
2020-10-28 17:01:25 +01:00
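
The data sizes that 006d4f40d9 mentions (smaller than a page, larger than a page, larger than two pages, plus boundaries) could be generated as follows. Both helpers are hypothetical; `paged_count` only models an aggregate that must add up results across pages.

```python
def boundary_row_counts(page_size: int) -> list:
    """Row counts just below, at, and just above one and two pages."""
    return sorted({
        page_size - 1, page_size, page_size + 1,
        2 * page_size - 1, 2 * page_size, 2 * page_size + 1,
    })


def paged_count(rows: list, page_size: int) -> int:
    """Count rows by fetching one page at a time and summing up."""
    total = 0
    for start in range(0, len(rows), page_size):
        total += len(rows[start:start + page_size])
    return total


for n in boundary_row_counts(4):
    # The paged aggregate must match the plain count for every size.
    assert paged_count(list(range(n)), 4) == n
print(boundary_row_counts(4))
```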
Dejan Mircevski
b037b0c10b cql3: Delete some newlines
Makes files shorter while still keeping the lines under 120 columns.
Separate from other commits to make review easier.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-10-19 15:40:55 -04:00
Dejan Mircevski
62ea6dcd28 cql3: Drop superfluous ALLOW FILTERING
Required no longer, after the last commit.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-10-19 15:38:11 -04:00
Dejan Mircevski
6773563d3d cql3: Drop unneeded filtering for continuous CK
Don't require filtering when a continuous slice of the clustering key
is requested, even if partition is unrestricted.  The read command we
generate will fetch just the selected data; filtering is unnecessary.

Some tests needed to update the expected results now that we're not
fetching the extra data needed for filtering.  (Because tests don't do
the final trim to match selectors and assert instead on all the data
read.)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-10-19 14:46:43 -04:00
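
One simplified way to state when clustering-key restrictions select a continuous slice, as discussed in 6773563d3d, is: equalities on a prefix of the clustering columns, optionally followed by a single range, with nothing restricted after that. The function below encodes that rule as an assumption; it is not Scylla's actual check, and the names are hypothetical.

```python
from typing import Dict, Sequence


def is_continuous_ck_slice(ck_columns: Sequence[str],
                           restrictions: Dict[str, str]) -> bool:
    """restrictions maps a clustering column to "eq" or "range".

    Returns True when the restricted columns form an EQ prefix that is
    optionally closed by one range restriction.
    """
    prefix_closed = False
    for col in ck_columns:
        op = restrictions.get(col)
        if op is None:
            prefix_closed = True   # gap: nothing later may be restricted
        elif prefix_closed:
            return False           # restriction after a gap or a range
        elif op == "range":
            prefix_closed = True   # a range must be the last restriction
    return True


ck = ("c1", "c2", "c3")
print(is_continuous_ck_slice(ck, {"c1": "eq", "c2": "range"}))
print(is_continuous_ck_slice(ck, {"c1": "eq", "c3": "eq"}))
```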
Piotr Sarna
88913e9d44 test: add cases for empty paging state for index queries
In order to check regressions related to #6136 and similar issues,
test cases for handling paging state with empty partition/clustering
key pair are added.
2020-04-06 08:59:40 +02:00
Rafael Ávila de Espíndola
eca0ac5772 everywhere: Update for deprecated apply functions
Now apply is only for tuples, for varargs use invoke.

This depends on the seastar changes adding invoke.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200324163809.93648-1-espindola@scylladb.com>
2020-03-25 08:49:53 +02:00
Piotr Sarna
62c34a9085 cql: fix qualifying indexed columns for filtering
When qualifying columns to be fetched for filtering, we also check
if the target column is not used as an index - in which case there's
no need to fetch it. However, the check incorrectly assumed that any
restriction is eligible for indexing, while that is currently only
true for EQ. The fix makes a more specific check and contains
many dynamic casts, but these will hopefully be gone once our
long-planned "restrictions rewrite" is done.
This commit comes with a test.

Fixes #5708
Tests: unit(dev)
2020-03-19 10:34:16 +02:00
Pavel Emelyanov
5b86f4be9a test: Split secondary_index test
Detach test_index_with_paging into an individual file.

This particular test case is the longest one in the suite: it takes
~14 minutes to run. Splitting this test further is pointless (for
now), and all subsequent splits in this set just bring the resulting
times below this one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-03-16 20:26:34 +03:00
Pavel Solodovnikov
d64fd52ae5 paging_state: switch from shared_ptr to lw_shared_ptr
Change the way `service::pager::paging_state` is passed around
from `shared_ptr` to `lw_shared_ptr`. It's safe since
`paging_state` is final.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-02-16 17:23:36 +03:00
Konstantin Osipov
1c8736f998 tests: move all test source files to their new locations
1. Move tests to test (using singular seems to be a convention
   in the rest of the code base)
2. Move boost tests to test/boost, other
   (non-boost) unit tests to test/unit, tests which are
   expected to be run manually to test/manual.

Update configure.py and test.py with new paths to tests.
2019-12-16 17:47:42 +03:00