Move mutation-related files to a new mutation/ directory. The names
are kept in the global namespace to reduce churn; the names are
unambiguous in any case.
mutation_reader remains in the readers/ module.
mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this
patch.
This is a step forward towards librarization or modularization of the
source base.
Closes#12788
Now that we don't accept cql protocol version 1 or 2, we can
drop cql_serialization format everywhere, except when in the IDL
(since it's part of the inter-node protocol).
A few functions had duplicate versions, one with and one without
a cql_serialization_format parameter. They are deduplicated.
Care is taken that `partition_slice`, which communicates
the cql_serialization_format across nodes, still presents
a valid cql_serialization_format to other nodes when
transmitting itself and rejects protocol 1 and 2 serialization\
format when receiving. The IDL is unchanged.
One test checking the 16-bit serialization format is removed.
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
We had a logger called "query result log", with spaces, which made it
impossible to enable it with the REST API due to missing percent
decoding support in our HTTP server (see #9614).
Although that HTTP server bug should be fixed as well (in Seastar -
see scylladb/seastar#725), there is no good reason to have a logger
name with a space in it. This is the only logger whose name has
a space: We have 77 other loggers using underscores (_) in their name,
and only 9 using hyphens (-). So in this patch we choose the more
popular alternative - an underscore.
Fixes#9614.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211205093732.1092553-1-nyh@scylladb.com>
Currently, we cannot select more than 2^32 rows from a table because we are limited by types of
variables containing the numbers of rows. This patch changes these types and sets new limits.
The new limits take effect while selecting all rows from a table - custom limits of rows in a result
stay the same (2^32-1).
In classes which are being serialized and used in messaging, in order to be able to process queries
originating from older nodes, the top 32 bits of new integers are optional and stay at the end
of the class - if they're absent we assume they equal 0.
The backward compatibility was tested by querying an older node for a paged selection, using the
received paging_state with the same select statement on an upgraded node, and comparing the returned
rows with the result generated for the same query by the older node, additionally checking if the
paging_state returned by the upgraded node contained new fields with correct values. Also verified
if the older node simply ignores the top 32 bits of the remaining rows number when handling a query
with a paging_state originating from an upgraded node by generating and sending such a query to
an older node and checking the paging_state in the reply(using python driver).
Fixes#5101.
If somebody wants to bypass proper memory accounting they should at
the very least be forced to consider if that is indeed wise and think a
second about the limit they want to apply.
Null values are represented with dead cells and never included in a
result_set. To enforce that, this adds a non_null_data_value that
wraps a data_value and whose constructor calls on_internal_error if a
null data_value is passed.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
We've seen schema application failing with marshal_exception
here. That's not enough information to figure out what is the
problem. Knowing which table and column is affected would make
diagnosis much easier in certain cases.
This patch wraps errors in query::deserialization_error with more
information.
Example output:
query::deserialization_error (failed on column system_schema.tables#bloom_filter_fp_chance \
(version: c179c1d7-9503-3f66-a5b3-70e72af3392a, id: 0, index: 0, type: org.apache.cassandra.db.marshal.DoubleType):\
seastar::internal::backtraced<marshal_exception> (marshaling error: read_simple - not enough bytes (expected 8, got 3)
Message-Id: <20190221113219.13018-1-tgrabiec@scylladb.com>
query::result_view already operates on views of a serialised
query::result. However, until now the value of a cell was always
linearised and copied. This patch makes use of ser::buffer_view to avoid
that.
Introduce class result_options to carry result options through the
request pipeline, which at this point mean the result type and the
digest algorithm. This class allows us to encapsulate the concrete
digest algorithm to use.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Clustering key prefix may have less columns than described in schema.
Deserailiaztion should stop when end of buffer is reached.
Message-Id: <20160503140420.GP23113@scylladb.com>
"Currently data query digest includes cells and tombstones which may have
expired or be covered by higher-level tombstones. This causes digest
mismatch between replicas if some elements are compacted on one of the
nodes and not on others. This mismatch triggers read-repair which doesn't
resolve because mutations received by mutation queries are not differing,
they are compacted already.
The fix adds compacting step before writing and digesting query results by
reusing the algorithm used by mutation query. This is not the most optimal
way to fix this. The compaction step could be folded with the query writing,
there is redundancy in both steps. However such change carries more risk,
and thus was postponed.
perf_simple_query test (cassandra-stress-like partitions) shows regression
from 83k to 77k (7%) ops/s.
Fixes #1165."
Currently data query digest includes cells and tombstones which may have
expired or be covered by higher-level tombstones. This causes digest
mismatch between replicas if some elements are compacted on one of the
nodes and not on others. This mismatch triggers read-repair which doesn't
resolve because mutations received by mutation queries are not differing,
they are compacted already.
The fix adds compacting step before writing and digesting query results by
reusing the algorithm used by mutation query. This is not the most optimal
way to fix this. The compaction step could be folded with the query writing,
there is redundancy in both steps. However such change carries more risk,
and thus was postponed.
perf_simple_query test (cassandra-stress-like partitions) shows regression
from 83k to 77k (7%) ops/s.
Fixes#1165.
We want the format of query results to be eventually defined in the
IDL and be independent of the format we use in memory to represent
collections. This change is a step in this direction.
The change decouples format of collection cells in query results from
our in-memory representation. We currently use collection_mutation_view,
after the change we will use CQL binary protocol format. We use that because
it requires less transformations on the coordinator side.
One complication is that some list operations need to retrieve keys
used in list cells, not only values. To satisfy this need, new query
option was added called "collections_as_maps" which will cause lists
and sets to be reinterpreted as maps matching their underlying
representation. This allows the coordinator to generate mutations
referencing existing items in lists.
We use boost::any to convert to and from database values (stored in
serlialized form) and native C++ values. boost::any captures information
about the data type (how to copy/move/delete etc.) and stores it inside
the boost::any instance. We later retrieve the real value using
boost::any_cast.
However, data_value (which has a boost::any member) already has type
information as a data_type instance. By teaching data_type intances about
the corresponding native type, we can elimiante the use of boost::any.
While boost::any is evil and eliminating it improves efficiency somewhat,
the real goal is growing native type support in data_type. We will use that
later to store native types in the cache, enabling O(log n) access to
collections, O(1) access to tuples, and more efficient large blob support.
Collection cells in query results are serialized using the "mutation form",
and must be deserialized using that format as well.
Fixes "--logger-log-level storage_proxy=trace" crasher.
This builder is the one used to build the convenient result_set (not
on fast path). The builder was assuming that the whole set of columns
was always queried, which resulted in buffer underflow exceptions
during parsing of the results if this was not the case. Let's also
handle queries which have narrower column sets.
Add a query::result_set class that contains per-row cells that can be
accessed by column name. Partition keys, clustering keys, and static
values are duplicated for every row for convenience.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>