scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 02:20:37 +00:00

Author	SHA1	Message	Date
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Avi Kivity	b7556e9482	cql3: functions: add "first" aggregate function first(x) returns the first x it sees in the group. This is useful for SELECT clauses that return a mix of aggregates and non-aggregates, for example SELECT max(x), x with inputs of x = { 1, 2, 3 } is expected to return (3, 1). Currently, this behavior is handled by individual selectors, which means they need to contain extra state for this, which cannot be easily translated to expressions. The new first function allows translating the SELECT clause above to SELECT max(x), first(x) so all selectors are aggregations and can be handled in the same way. The first() function is not exposed to users.	2023-07-02 18:15:00 +03:00
Avi Kivity	78f4ee385f	cql3: functions: fix count(col) for non-scalar types count(col), unlike count(*), does not count rows for which col is NULL. However, if col's data type is not a scalar (e.g. a collection, tuple, or user-defined type) it behaves like count(*), counting NULLs too. The cause is that get_dynamic_aggregate() converts count() to the count(*) version. It works for scalars because get_dynamic_aggregate() intentionally fails to match scalar arguments, and functions::get() then matches the arguments against the pre-declared count functions. As we can only pre-declare count(scalar) (there's an infinite number of non-scalar types), we change the approach to be the same as min/max: we make count() a generic function. In fact count(col) is much better as a generic function, as it only examines its input to see if it is NULL. A unit test is added. It passes with Cassandra as well. Fixes #14198. Closes #14199	2023-06-13 14:40:14 +03:00
Kefu Chai	ca6ebbd1f0	cql3, db: sstable: specialize fmt::formatter<function_name> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `function_name` without the help of `operator<<`. the corresponding `operator<<()` is dropped in this change, as all its callers now use fmtlib for formatting. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13608	2023-04-21 10:07:28 +03:00
Botond Dénes	d828cfcb23	Merge 'db, cql3: functions: switch argument passing to std::span' from Avi Kivity Database functions currently receive their arguments as an std::vector. This is inflexible (for example, one cannot use small_vector to reduce allocations). This series adapts the function signature to accept parameters using std::span. Some changes in the keys interface are needed to support this. Lastly, one call site is migrated to small_vector. This is in support of changing selectors to use expressions. Closes #13581 * github.com:scylladb/scylladb: cql3: abstract_function_selector: use small_vector for argument buffer db, cql3: functions: pass function parameters as a span instead of a vector keys: change from_optional_exploded to accept a span instead of a vector	2023-04-21 06:49:07 +03:00
Avi Kivity	3e0aacc8b5	db, cql3: functions: pass function parameters as a span instead of a vector Spans are more flexible and can be constructed from any contiguous container (such as small_vector), or a subrange of such a container. This can save allocations, so change the signature to accept a span. Spans cannot be constructed from std::initializer_list, so one such call site is changed to construct a span directly from the single argument.	2023-04-19 20:38:55 +03:00
Nadav Har'El	81e0f5b581	cql3: allow SUM() aggregation to result in a NaN When floating-point data contains +Inf and -Inf, the sum is NaN. Our SUM() aggregation calculated this sum correctly, but then instead of returning it, complained that the sum overflowed by narrowing. This was a false positive: The sum() finalizer wanted to test that no precision was lost when casting the accumulator to the result type, so checked that the result before and after the cast are the same. But specifically for NaN, it is never equal to anything - not even to itself. This check is wrong for floating point, but moreover - isn't even necessary when the two types (accumulator type and result type) are identical so in this patch we skip it in this case. Note that in the current code, a different accumulator and result type is only used in the case of integer types; When accumulating floating point sums, the same type is used, so the broken check will be avoided. The test for this issue starts to pass with this patch, so the xfail tag is removed. Fixes #13551 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-04-19 09:31:41 +03:00
Avi Kivity	58eb21aa5d	db: functions: fold stateless_aggregate_function_adapter into aggregate_function Now that all aggregate functions are derived from stateless_aggregate_function_adapter, we can just fold its functionality into the base class. This exposes stateless_aggregate_function to all users of aggregate_function, so they can begin to benefit from the transformation, though this patch doesn't touch those users. The aggregate_function base class is partially devirtualized since there is just a single implementation now.	2023-03-28 23:47:11 +03:00
Avi Kivity	68529896aa	cql3: functions: simplify accumulator_for template The accumulator_for template is used to select the accumulator type for aggregates. After refactoring, all that is needed from it is to select the native type, so remove all the excess code.	2023-03-28 23:47:11 +03:00
Avi Kivity	4ea3136026	cql3: functions: base user-defined aggregates on stateless aggregates Since the model for stateless aggregates was taken from user defined aggregates, the conversion is trivial.	2023-03-28 23:47:11 +03:00
Avi Kivity	f2715b289a	cql3: functions: drop native_aggregate_function Now that all aggregates are implemented statelessly, native_aggregate_function no longer has subclasses, so drop it.	2023-03-28 23:47:11 +03:00
Avi Kivity	6bceb25982	cql3: functions: reimplement count(column) statelessly Note that we don't use the automarshalling helper for the aggregation function, since it doesn't work for compound types.	2023-03-28 23:47:11 +03:00
Avi Kivity	4f2cdace9a	cql3: functions: reimplement avg() statelessly	2023-03-28 23:47:11 +03:00
Avi Kivity	b0a8fd3287	cql3: functions: reimplement sum() statelessly	2023-03-28 23:47:11 +03:00
Avi Kivity	d21d11466a	cql3: functions: change wide accumulator type to varint Currently, we use __int128, but this has no direct counterpart in CQL, so we can't express the accumulator type as part of a CQL scalar function. Switch to varint which is a superset, although slower.	2023-03-28 23:47:11 +03:00
Avi Kivity	3252dc0172	cql3: functions: unreverse types for min/max Currently it works without this, but later unreversing will be removed from another part of the stack, causing min/max on reversed types to return incorrect results. Anticipate that and unreverse the types during construction.	2023-03-28 23:47:09 +03:00
Avi Kivity	ed466b7e68	cql3: functions: rename make_{min,max}_dynamic_function There's no longer a statically-typed variant, so no need to distinguish the dynamically-typed one.	2023-03-28 23:37:49 +03:00
Avi Kivity	bfd70c192e	cql3: functions: reimplement min/max statelessly min() and max() had two implementations: one static (for each type in a select list) and one dynamic (for compound types). Since the dynamic implementation is sufficient, we only reimplement that. This means we don't use the automarshalling helpers, since we don't do any arithmetic on values apart from comparison, which is conveniently provided by abstract_type.	2023-03-26 15:18:22 +03:00
Avi Kivity	e6342d476b	cql3: functions: reimplement count(*) statelessly Note we have to explicitly decay lambdas to functions using unary operator +.	2023-03-26 15:18:22 +03:00
Avi Kivity	9291ec5ed1	cql3: functions: simplify creating native functions even more Add a helper function to consolidate the internal native function class and the automatic marshalling introduced in previous patches. Since a lambda must be decayed into a function pointer (in order to infer its signature), there are two overloads: one accepts a lambda and decays it into a function pointer, the second accepts a function pointer, infers its argument, and constructs the function object.	2023-03-26 15:15:36 +03:00
Avi Kivity	7a5e609d8d	cql3: functions: add helpers for automating marshalling for scalar functions Add a helper that, given a C++ function, deduces its argument types and wraps the function in marshalling/unmarshalling code. The native function expects non-null inputs, so an additional helper is called to decide what to do if nulls are encountered. One such helper is return_accumulator_on_null (since that's the default behavior of aggregates), and the other is return_any_nonnull(), useful for reductions.	2023-03-15 22:28:41 +02:00
Avi Kivity	6c8d942fa1	cql3: functions: add helper class for internal scalar functions We'll need many scalar functions to implement aggregates in terms of scalars, so we add an internal_scalar_function class to reduce boilerplate. The new class proxies the scalar function into a native noncopyable_function provided by the constructor.	2023-03-15 22:22:02 +02:00
Kefu Chai	df63e2ba27	types: move types.{cc,hh} into types they are part of the CQL type system, and are "closer" to types. let's move them into "types" directory. the build systems are updated accordingly. the source files referencing `types.hh` were updated using the following command: ``` find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} + ``` the source files under sstables include "types.hh", which is indeed the one located under "sstables", so include "sstables/types.hh" instead, so it's more explicit. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12926	2023-02-19 21:05:45 +02:00
Avi Kivity	2739ac66ed	treewide: drop cql_serialization_format Now that we don't accept cql protocol version 1 or 2, we can drop cql_serialization_format everywhere, except in the IDL (since it's part of the inter-node protocol). A few functions had duplicate versions, one with and one without a cql_serialization_format parameter. They are deduplicated. Care is taken that `partition_slice`, which communicates the cql_serialization_format across nodes, still presents a valid cql_serialization_format to other nodes when transmitting itself and rejects protocol 1 and 2 serialization format when receiving. The IDL is unchanged. One test checking the 16-bit serialization format is removed.	2023-01-03 19:54:13 +02:00
Michał Jadwiszczak	29ad5a08a8	implement `keyspace_element` interface This patch implements `data_dictionary::keyspace_element` interface in: `keyspace_metadata`, `user_type_impl`, `user_function`, `user_aggregate` and schema.	2022-12-10 12:34:09 +01:00
Jadw1	0f08c8e099	cql3: reducible aggregates Introduces reducible aggregates, which return not the final result but an accumulator that can be reduced later.	2022-07-18 15:25:41 +02:00
Jadw1	d13f347621	DB: Add `scylla_aggregates` system table Saving information about UDA's reduce function to `scylla_aggregates` table and distributing it across cluster.	2022-07-18 15:25:37 +02:00
Jadw1	d8f3461147	CQL3: Add reduce function to UDA Add optional field to UDA, that describes reduce function to allow parallelization of UDA aggregates.	2022-07-18 14:18:48 +02:00
Avi Kivity	5937b1fa23	treewide: remove empty comments in top-of-files After `fcb8d040` ("treewide: use Software Package Data Exchange (SPDX) license identifiers"), many dual-licensed files were left with empty comments on top. Remove them to avoid visual noise. Closes #10562	2022-05-13 07:11:58 +02:00
Jadw1	c921efd1b3	cql3: allow no final_func and no initcond in UDA Makes the final function and initial condition optional when creating a UDA. No final function means the UDA returns the final state, and the default initial condition is `null`. Fixes: #10324	2022-04-06 09:08:50 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Piotr Sarna	d83d212ee5	cql3: fix aggregates with > 1 argument It was impossible to use an aggregate with more than 1 argument due to an overzealous assert, which is now removed.	2021-08-16 19:49:03 +02:00
Piotr Sarna	ee81453596	cql3: add user-defined aggregate representation A user-defined aggregate is represented as an aggregate which calls its state function on each input row and then finalizes its execution by calling its final function on the final state, after all rows were already processed.	2021-08-13 11:13:41 +02:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Dejan Mircevski	d79c2cab63	cql3: Use correct comparator in timeuuid min/max The min/max aggregators use aggregate_type_for comparators, and the aggregate_type_for<timeuuid> is regular uuid. But that yields wrong results; timeuuids should be compared as timestamps. Fix it by changing aggregate_type_for<timeuuid> from uuid to timeuuid, so aggregators can distinguish between the two. Then specialize the aggregation utilities for timeuuid. Add a cql-pytest and change some unit tests, which relied on naive uuid comparators. Fixes #7729. Tests: unit (dev, debug) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #7910	2021-01-13 11:07:29 +02:00
Raphael S. Carvalho	7a728803f7	cql3/functions: protect against uninitialized value impl_count_function doesn't explicitly initialize _count, so its correctness depends on default initialization. Let's explicitly initialize _count to make the code future proof. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200714162604.64402-1-raphaelsc@scylladb.com>	2020-07-15 12:38:39 +03:00
Juliusz Stasiewicz	5b438e79be	aggregate_fcts: Use per-type comparators for dynamic types For collections and UDTs the `MIN()` and `MAX()` functions are generated on the fly. Until now they worked by comparing just the byte representations of arguments. This patch uses specific per-type comparators to provide semantically sensible, dynamically created aggregates. Fixes #6768	2020-07-08 13:39:10 +02:00
Avi Kivity	3c772757c0	treewide: use utils::multiprecision_int for varint implementation The goal is to forward-declare utils::multiprecision_int, something beyond my capabilities for boost::multiprecision::cpp_int, to reduce compile time bloat. The patch is mostly search-and-replace, with a few casts added to disambiguate conversions the compiler had trouble with.	2020-03-04 13:28:16 +02:00
Benny Halevy	476a102de0	cql3: aggregate_fcts: simplify accumulator_for template definitions Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-01-19 08:26:40 +02:00
Benny Halevy	ff55b5dca3	cql3: functions: limit sum overflow detection to integral types Other types do not have a wider accumulator at the moment. And static_cast<accumulator_type>(ret) != _sum evaluates as false for NaN/Inf floating point values. Fixes #5586 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200112183436.77951-1-bhalevy@scylladb.com>	2020-01-14 10:01:06 +02:00
Benny Halevy	1c81422c1b	cql3: functions: protect against int overflow in avg Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-01-08 09:48:33 +02:00
Benny Halevy	e97a111f64	cql3: functions: detect and handle int overflow in sum Detect integer overflow in cql sum functions and throw an error. Note that Cassandra quietly truncates the sum if it doesn't fit in the input type but we would rather break compatibility in this case. See https://issues.apache.org/jira/browse/CASSANDRA-4914?focusedCommentId=14158400&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14158400 Fixes #5536 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-01-08 09:48:33 +02:00
Rafael Ávila de Espíndola	bbed9cac35	cql3: move function creation to a .cc file We had a lot of code in a .hh file that, while using templates, was only used for creating functions during startup. This moves it to a new .cc file. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200101002158.246736-1-espindola@scylladb.com>	2020-01-03 15:48:19 +02:00

43 Commits
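
Several of the commits above (e97a111f64, ff55b5dca3, 81e0f5b581) concern the round-trip check that the sum finalizer applies when casting the accumulator to the result type. The following is a minimal sketch of that idea, not the code from the repository: the name `finalize_sum`, its signature, and the exception type are hypothetical and chosen only for illustration. It assumes the check compares the value before and after the cast, and skips the comparison when the two types are identical, so a NaN sum (which never compares equal to itself) is returned rather than reported as an overflow.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>
#include <type_traits>

// Hypothetical helper for illustration only; not ScyllaDB's actual finalizer.
// Converts the accumulator to the result type and rejects lossy conversions.
template <typename Result, typename Acc>
Result finalize_sum(Acc acc) {
    if constexpr (std::is_same_v<Result, Acc>) {
        // Same type: the cast cannot lose anything, so no check is needed.
        // Skipping it also lets NaN through, since NaN != NaN would
        // otherwise look like a narrowing failure.
        return acc;
    } else {
        // Narrow, then widen back; a changed value means the sum did not fit.
        // (Out-of-range integer narrowing wraps on mainstream compilers and is
        // guaranteed to wrap from C++20 on.)
        const Result ret = static_cast<Result>(acc);
        if (static_cast<Acc>(ret) != acc) {
            throw std::overflow_error("sum does not fit in the result type");
        }
        return ret;
    }
}

// Small self-check exercising both branches.
inline void check_finalize_sum() {
    const double inf = std::numeric_limits<double>::infinity();
    // +Inf plus -Inf is NaN; with identical types it is returned as-is.
    assert(std::isnan(finalize_sum<double, double>(inf + -inf)));
    // A value that fits survives the round trip.
    assert((finalize_sum<int, long long>(42LL) == 42));
    // A value that does not fit is reported as an overflow.
    bool threw = false;
    try {
        finalize_sum<int, long long>(1LL << 40);
    } catch (const std::overflow_error&) {
        threw = true;
    }
    assert(threw);
}
```

Calling `check_finalize_sum()` from a test or from `main` exercises the three cases. The `if constexpr` branch mirrors the reasoning in 81e0f5b581: when accumulator and result types are the same, the comparison is unnecessary as well as wrong for NaN.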