10 Commits

Author SHA1 Message Date
Nadav Har'El
55317666c6 test/cql-pytest: check that most aggregators don't take "*"
Although you can "SELECT COUNT(*)", this has special handling in the CQL
parser (it is converted into a special row-counting request) and you can't
give "*" to other aggregators - e.g., "SELECT SUM(*)". This patch includes
a simple test that confirms this.

I wanted to check this in relation to the previous patch, which did,
sort of, a "SELECT $$first$$(*)" - a syntax which this test shows
wouldn't have actually worked if we tried it.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-12-25 17:53:42 +02:00
Avi Kivity
f6f974cdeb cql3: selection: fix GROUP BY, empty groups, and aggregations
A GROUP BY combined with aggregation should produce a single
row per group, except for empty groups. This is in contrast
to an aggregation without GROUP BY, which produces a single
row no matter what.

The existing code only considered the case of no grouping
and forced a row into the result, but this caused an unwanted
row if grouping was used.

Fix by refining the check to also consider GROUP BY.
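
The rule above can be sketched in plain Python. This is only an
illustration of the expected result shape, not the cql3 selection code;
the helper name aggregate_count and the dict-based rows are made up for
the example.

```python
from collections import defaultdict

def aggregate_count(rows, group_by=None):
    """Count rows, optionally per group (hypothetical helper, not Scylla code)."""
    if group_by is None:
        # No GROUP BY: always exactly one result row, even for empty input.
        return [(len(rows),)]
    groups = defaultdict(int)
    for row in rows:
        groups[row[group_by]] += 1
    # With GROUP BY: one row per non-empty group, so empty input gives no rows.
    return sorted(groups.items())

print(aggregate_count([]))                 # [(0,)]
print(aggregate_count([], group_by="p"))   # []
```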

XFAIL tests are relaxed.

Fixes #12477.

Note, forward_service requires that aggregation produce
exactly one row, but since it can't work with grouping,
it isn't affected.

Closes #14399
2023-06-28 18:56:22 +03:00
Jan Ciolek
854b0301be cql-pytest/test_aggregate: test case-sensitive column name in aggregate
There was a bug which made aggregates fail when used with case-sensitive
column names.
Add a test to make sure that this doesn't happen in the future.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-06-21 14:49:24 +02:00
Avi Kivity
e7c1824ed0 test: add regression test for rejection of aggregates in the WHERE clause
The test passes on Cassandra and ScyllaDB.
2023-06-13 21:04:49 +03:00
Avi Kivity
78f4ee385f cql3: functions: fix count(col) for non-scalar types
count(col), unlike count(*), does not count rows for which col is NULL.
However, if col's data type is not a scalar (e.g. a collection, tuple,
or user-defined type) it behaves like count(*), counting NULLs too.
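
The intended difference between the two forms can be sketched in plain
Python, with None standing in for a CQL NULL and a list standing in for
a collection column. The helper names count_star and count_col are
assumptions for the example, not functions from the code base.

```python
# Three rows; "v" is a collection column that is NULL in one of them.
rows = [
    {"p": 1, "v": [1, 2]},
    {"p": 2, "v": None},
    {"p": 3, "v": [3]},
]

def count_star(rows):
    """count(*): counts every row."""
    return len(rows)

def count_col(rows, col):
    """count(col): counts only rows where col is not NULL, whatever its type."""
    return sum(1 for row in rows if row[col] is not None)

print(count_star(rows))      # 3
print(count_col(rows, "v"))  # 2
```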

The cause is that get_dynamic_aggregate() converts count() to
the count(*) version. It works for scalars because get_dynamic_aggregate()
intentionally fails to match scalar arguments, and functions::get() then
matches the arguments against the pre-declared count functions.

As we can only pre-declare count(scalar) (there's an infinite number
of non-scalar types), we change the approach to be the same as min/max:
we make count() a generic function. In fact count(col) is much better
as a generic function, as it only examines its input to see if it is
NULL.

A unit test is added. It passes with Cassandra as well.

Fixes #14198.

Closes #14199
2023-06-13 14:40:14 +03:00
Nadav Har'El
9c3907bb3c test/cql-pytest: reproducers for incorrect AVG of "decimal" type
This patch contains tests reproducing issue #13601 and the corresponding
Cassandra issue CASSANDRA-18470. These issues are about what the AVG
aggregation does for arbitrary-precision "decimal" numbers - the tests
we add here show examples where the current behavior doesn't make sense:

The problem is that "decimal" has arbitrary precision - so, should an
average of 1/3 be returned as 0.3 or 0.33333333333333333? This is not
specified, so Scylla (and Cassandra) decided to pick the result precision
based on the input precision. In particular, the average of 1 and 2
is returned as 2 (zero digits after the decimal point, like in the
inputs) instead of the expected 1.5. Arguably this isn't useful behavior.

This patch also adds a second test, which fails on Cassandra but passes
on Scylla: for the inputs 1, 2, 2, 3 Cassandra returns the integer 1 as
the average, whereas the correct average is 2 (which Scylla returns).
The reason why this bug is even worse on Cassandra is that Scylla's AVG
only loses precision when dividing the sum and count, but Cassandra
tries to maintain only the average, and loses precision at every step.
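
The two strategies can be compared with Python's decimal module. This is
a sketch under assumptions: all inputs have zero digits after the decimal
point, results are rounded to that same scale with ROUND_HALF_EVEN, and
the running-average update is only the general shape of an incremental
mean, not code taken from either database.

```python
from decimal import Decimal, ROUND_HALF_EVEN

UNIT = Decimal(1)  # scale 0, i.e. no digits after the decimal point

def avg_sum_then_divide(values):
    """Keep an exact sum and count; round only once, at the final division."""
    total = sum(values, Decimal(0))
    return (total / len(values)).quantize(UNIT, rounding=ROUND_HALF_EVEN)

def avg_running(values):
    """Keep only a running average, rounding the correction at every step."""
    avg = Decimal(0)
    for count, value in enumerate(values, start=1):
        step = ((value - avg) / count).quantize(UNIT, rounding=ROUND_HALF_EVEN)
        avg += step
    return avg

two = [Decimal(1), Decimal(2)]
four = [Decimal(1), Decimal(2), Decimal(2), Decimal(3)]
print(avg_sum_then_divide(two))   # 2 (1.5 rounded to zero decimal digits)
print(avg_sum_then_divide(four))  # 2
print(avg_running(four))          # 1 (every correction rounds down to 0)
```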

Refs #13601

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13603
2023-04-21 08:32:30 +03:00
Nadav Har'El
81e0f5b581 cql3: allow SUM() aggregation to result in a NaN
When floating-point data contains +Inf and -Inf, the sum is NaN.

Our SUM() aggregation calculated this sum correctly, but then instead
of returning it, complained that the sum overflowed by narrowing.
This was a false positive: The sum() finalizer wanted to test that no
precision was lost when casting the accumulator to the result type,
so checked that the result before and after the cast are the same.
But specifically for NaN, it is never equal to anything - not even
to itself. This check is wrong for floating point, but moreover -
isn't even necessary when the two types (accumulator type and result
type) are identical so in this patch we skip it in this case.
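
The false positive is easy to reproduce in Python, whose floats follow
the same IEEE 754 rules. The two finalizer functions below are
hypothetical stand-ins for the check described above, not the actual
C++ code.

```python
import math

def finalize_sum_broken(acc, cast):
    """Round-trip check applied unconditionally (the old behavior)."""
    result = cast(acc)
    if result != acc:  # always true for NaN, since NaN != NaN
        raise OverflowError("sum overflowed by narrowing")
    return result

def finalize_sum_fixed(acc, cast, same_type):
    """Skip the check when accumulator and result types are identical."""
    if same_type:
        return acc
    result = cast(acc)
    if result != acc:
        raise OverflowError("sum overflowed by narrowing")
    return result

total = float("inf") + float("-inf")
print(math.isnan(total))  # True
print(total == total)     # False

try:
    finalize_sum_broken(total, float)
except OverflowError as e:
    print("broken check:", e)

print(math.isnan(finalize_sum_fixed(total, float, same_type=True)))  # True
```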

Note that in the current code, different accumulator and result types
are only used for integer types; when accumulating floating-point
sums, the same type is used, so the broken check is avoided.

The test for this issue starts to pass with this patch, so the xfail
tag is removed.

Fixes #13551

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-04-19 09:31:41 +03:00
Nadav Har'El
78555ba7f1 test/cql-pytest: add tests for data casts and inf in sums
This patch adds tests to reproduce issue #13551. The issue, discovered
by a dtest (cql_cast_test.py), claimed that either cast() or sum(cast())
from varint type broke. So we add two tests in cql-pytest:

1. A new test file, test_cast_data.py, for testing data casts (a
   CAST (...) as ... in a SELECT), starting with testing casts from
   varint to other types.

   The test uncovers a lot of interesting cases (it is heavily
   commented to explain these cases) but nothing there is wrong
   and all tests pass on Scylla.

2. An xfailing test for sum() aggregate of +Inf and -Inf. It turns out
   that this caused #13551. In Cassandra and older Scylla, the sum
   returned a NaN. In Scylla today, it generates a misleading
   error message.

As usual, the tests were run on both Cassandra (4.1.1) and Scylla.

Refs #13551.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-04-18 13:38:42 +03:00
Nadav Har'El
130c090251 cql-pytest: add tests for sum() aggregate
This patch adds regression tests for the strange (but Cassandra-compatible)
behavior described in issue #13027 - that the sum of no results returns 0
(not null or no row at all), and if p is also selected, it comes back as null.
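
As a rough model of that result in plain Python: the helper
select_p_sum_v is made up for the example, and taking p from the first
matching row is an assumption for the non-empty case.

```python
def select_p_sum_v(rows):
    """Sketch of SELECT p, sum(v) with no GROUP BY (hypothetical helper)."""
    total = sum(row["v"] for row in rows)  # the sum over no rows is 0
    # Assumption: a non-aggregated column takes its value from the first
    # matching row; with no rows there is nothing to take, hence None (null).
    p = rows[0]["p"] if rows else None
    return [(p, total)]

print(select_p_sum_v([]))  # [(None, 0)]
```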

Refs #13027.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-02-28 15:35:21 +02:00
Nadav Har'El
e1f97715eb test/cql-pytest: move aggregation tests to one file
We had separate test files test_minmax.py and test_count.py but the
separation was artificial (and test_count.py even had one test using
min()). Now that I want to add another test for sum(), I don't know
where to put it. So in this patch I combine test_minmax.py and
test_count.py into one test file - test_aggregate.py, and we can
later add sum() tests in the same file.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-02-28 14:39:04 +02:00