Take DataConsumeRowsContext type as parameter.
This will allow us to implement different context
for reading 3.x files.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Parametrize it with the type of data consume rows context.
There will be different implementations used for different
sstable file formats.
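As a rough illustration only (not the actual patch), the parametrization
could look like the sketch below. The names rows_context_2x,
rows_context_3x and mutation_reader_sketch are hypothetical stand-ins and
do not come from the source tree.

```cpp
#include <string>

// Hypothetical stand-ins for per-format data consume rows contexts.
struct rows_context_2x {
    static std::string format() { return "2.x"; }
};
struct rows_context_3x {
    static std::string format() { return "3.x"; }
};

// Hypothetical reader that takes the context type as a template parameter,
// so each sstable file format can plug in its own implementation.
template <typename DataConsumeRowsContext>
class mutation_reader_sketch {
public:
    std::string describe() const {
        return "reading " + DataConsumeRowsContext::format() + " sstable";
    }
};
```

Instantiating mutation_reader_sketch<rows_context_3x> selects the 3.x
context at compile time, with no virtual dispatch on the read path.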
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
It will be used as a template parameter for sstable_mutation_reader
once it's turned into a template. This means the definition has
to be accessible.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
They are used just in partition.cc, row.cc and sstables_test.cc
so it is useful to cut their scope by moving them
to data_consume_context.hh.
This will make it much easier to turn data_consume_context into
a template.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
It's used only in row.cc, partition.cc and sstables_test.cc
so it's better to reduce the dependency just to those files.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Introduce sstable_version_constants that will be a proxy
serving correct constants depending on the format version.
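A minimal sketch of what such a proxy might look like, assuming a simple
enum of format versions. All names and values below (format_version,
constants_2x, constants_3x, placeholder_value) are hypothetical and chosen
only for illustration; they are not the real constants.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical format versions.
enum class format_version { v2x, v3x };

// Hypothetical per-format constant sets; the values are placeholders.
struct constants_2x { static constexpr uint32_t placeholder_value = 1; };
struct constants_3x { static constexpr uint32_t placeholder_value = 2; };

// Proxy that serves the constant matching the format version it was built with.
class version_constants_sketch {
    format_version _version;
public:
    explicit version_constants_sketch(format_version v) : _version(v) {}

    uint32_t placeholder_value() const {
        switch (_version) {
        case format_version::v2x: return constants_2x::placeholder_value;
        case format_version::v3x: return constants_3x::placeholder_value;
        }
        throw std::logic_error("unknown sstable format version");
    }
};
```

Callers ask the proxy for a constant instead of naming a per-format
struct directly, so the format decision lives in one place.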
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
We were feeding the total estimated partition count of an input shared
sstable to the output unshared ones.
So the sstable writer thinks, *from the estimation*, that each sstable
created by resharding will have the same amount of data as the shared
sstable it is being created from. That's a problem because the estimation
is fed to bloom filter creation, which directly influences the filter's size.
So if we're resharding all sstables that belong to all shards, the
disk usage taken by filter components will be multiplied by the number
of shards. That becomes more of a problem with #3302.
Partition count estimation for a shard S will now be done as follows:
//
// TE, the total estimated partition count for a shard S, is defined as
// TE = Sum(i = 0...N) { Ei / Si }.
//
// where i is an input sstable that belongs to shard S,
// Ei is the estimated partition count for sstable i,
// Si is the total number of shards that own sstable i.
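The formula above can be sketched as follows. This is an illustration
under stated assumptions, not the code from the patch: input_sstable and
estimate_partitions_for_shard are hypothetical names, and rounding the
sum up is an assumption made here so that a fractional share is not lost.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical summary of one input sstable that belongs to shard S.
struct input_sstable {
    uint64_t estimated_partitions; // Ei
    unsigned owner_shards;         // Si, assumed to be >= 1
};

// Computes TE = Sum { Ei / Si } over the given input sstables.
// The sum is accumulated in floating point and rounded up (assumption).
uint64_t estimate_partitions_for_shard(const std::vector<input_sstable>& inputs) {
    double total = 0;
    for (const auto& sst : inputs) {
        total += static_cast<double>(sst.estimated_partitions) / sst.owner_shards;
    }
    return static_cast<uint64_t>(std::ceil(total));
}
```

For example, an sstable with 1000 estimated partitions shared by 4 shards
and another with 600 shared by 2 shards contribute 250 + 300 = 550 to the
estimate for each owning shard, instead of 1600.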
Fixes #2672.
Refs #3302.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180423151001.9995-1-raphaelsc@scylladb.com>
"
Fixes to several issues around view update generation, pertaining to
timestamp and TTL management.
Fixes #3361
Fixes #3360
Fixes #3140
Refs #3362
Tests: unit(release, debug), dtest(materialized_views.py)
"
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
* 'materialized-views/fixes-galore/v2' of http://github.com/duarten/scylla:
mutation_partition: Clarify comment about emptiness
tests: Add view_complex_test
tests/view_schema_test: Complete test
db/view: Move cells instead of copying in add_cells_to_view()
db/view: Handle unselected base columns and corner cases
mutation_partition: Regular base column in view determines row liveness
db/view: Don't avoid read-before-write when view PK matches base
db/view: Process base updates to column unselected by its views
db/view: Consider partition tombstone when generating updates
tests/view_schema_test: Remove unneeded test
mutation_fragment: Allow querying if row is live
view_info: Add view_column() overload
view_info: Explicitly initialize base-dependent fields
cql3/alter_table_statement: Forbid dropping columns of MV base tables
This patch fixes several cases where it was disallowed to create
a materialized view with a filter ("where ..."), for no good reason.
After this patch, these cases will be allowed. Fixes #2367.
In ordinary SELECT queries, certain types of filtering which are known to
be deceptively inefficient are not allowed. For example, trying to query
a range of partition keys cannot be done without reading the entire
database (because the murmur3 tokenizer randomizes the order of partitions).
Restricting two partition key components also cannot be done without
reading an excessive amount of data. So Scylla, following
Cassandra, chooses to disallow such SELECT queries, and give an error
message.
However, the same SELECT statements *should* be allowed when defining a
materialized view. In this case, the filter is just used to check an
individual row - not to search for one - so there is no performance
concern.
Unfortunately the existing code did these validations while building the
SELECT statement's "restrictions", in code shared by both uses of SELECT
(query and MV definition). It was easy to move one of the validations
to later code which runs after the restriction has already been built (and
knows if it is working for query or MV), but because of the way the
"restrictions" objects (translated from Cassandra 2's code) hide what they
contain, many of the checks are harder to perform after having built the
restrictions object. So instead, we add in strategic places in the
restriction-handling code a new "allow_filtering" flag. If restrictions
are built with allow_filtering=true, the extra performance-oriented tests
on the filtering restrictions are not done. Materialized views set
allow_filtering=true.
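The idea can be sketched as below. This is a simplified illustration, not
the real restriction classes: restriction_sketch and
validate_restriction_sketch are hypothetical names, and a single boolean
stands in for "this restriction would require filtering".

```cpp
#include <stdexcept>

// Hypothetical, simplified restriction: only records whether it is of a
// kind that ordinary SELECT rejects for performance reasons.
struct restriction_sketch {
    bool needs_filtering; // e.g. a range restriction on the partition key
};

// Performs the performance-oriented check only when filtering is not allowed.
// Materialized view definitions would pass allow_filtering = true.
void validate_restriction_sketch(const restriction_sketch& r, bool allow_filtering) {
    if (!allow_filtering && r.needs_filtering) {
        throw std::invalid_argument("restriction requires filtering");
    }
}
```

With allow_filtering=false the same restriction is rejected, which is the
behaviour that ordinary SELECT queries keep.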
The allow_filtering flag will also be useful later when we want to support
the "ALLOW FILTERING" query option which is currently not supported properly
(we have several open issues on that). However, note that this patch doesn't
complete that support: I left a FIXME in the spot where we set
allow_filtering in the Materialized Views case, but in the future we also
need to set it if the user specified "ALLOW FILTERING" in the query.
This patch also enables several unit tests written by Duarte which used to
fail because of this bug, and now pass. These tests verify that the
restrictions are now allowed and filter the view as desired, but I also
added test code to verify that the same restrictions are still forbidden,
as before, when used in ordinary SELECT queries.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180423124343.17591-1-nyh@scylladb.com>
"
Make sure install_dependencies.sh installs all the right dependencies
and that the example `configure.py` invocation can just be copy-pasted
into the terminal and will "just work".
Ref: #3208
"
* 'fix_centos_compile/v2' of https://github.com/denesb/scylla:
install_dependencies.sh: update centos package list and example
configure.py: add --with-ragel option
configure.py: add --with-antlr3
configure.py: check compiler version first
Add missing packages to `yum install` list:
* scylla-boost163-static
* scylla-python34-pyparsing20
Update the configure.py example so that it just works:
* Change g++ to 7.3
* Add --with-antlr3 pointing to antlr3 installed from scylla 3rdparty
Before checking anything else (presence of boost, its version, etc.)
check that the compiler is present and can compile and link a simple c++
program.
Previously, if the compiler was not set up correctly, configure.py would
fail at one of the other try_compile checks, whichever came first (usually
the one checking for boost). This led the user into chasing some
false-positive error when in fact the compiler wasn't working.
Debian 8 fails with "Invalid argument" when AmbientCapabilities is used in
the systemd unit file, so drop the line when we build the .deb package for
Debian 8.
For other distributions, keep using the feature.
Fixes #3344
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180423102041.2138-1-syuu@scylladb.com>
"
This series complements JSON support with INSERT JSON and fromJson
cql function.
INSERT JSON implementation tries hard to interfere as little as possible
with regular INSERT path. So, after being parsed, insertJsonStatement
exists as a separate statement and is handled in a special way.
Overridden add_update_for_key extracts values from JSON map and applies
them to columns.
Converting from insert_json_statement to insert_statement uses auxiliary
from_json_object methods to convert JSON-encoded types to bytes.
Then, terms are matched to appropriate column names and cells are
updated.
fromJson CQL function uses the same from_json_object helper methods,
but applies them to single arguments, not whole rows.
Existing json handling functions from json.hh and libjsoncpp were used
where possible.
Things implemented:
* expanding CQL grammar to accept INSERT JSON
* converting JSON representation of cql values to cql terms
* serving 'INSERT INTO xxx JSON yyy' clause
* tests for INSERT JSON and fromJson()
"
* 'json_ops_2' of https://github.com/psarna/scylla:
tests: add cql unit tests for INSERT JSON
cql3: add fromJson() function
cql3: add INSERT JSON parsing to CQL grammar
cql3: add support for INSERT JSON clause
cql3: decouple execute from term binding in setters
cql3: change operation::make_* functions to static
cql3: add from_json_object function to types
cql3: Make literals::NULL_VALUE public
This commit adds tests for INSERT JSON clause, which is expected
to accept JSON strings and insert appropriate values to columns
defined there.
The tests also cover fromJson function calls and inserting prepared
batch statements with INSERT JSON inside.
References #2058
This commit extends JSON support with the fromJson() function,
which can be used in UPDATE clause to transform JSON value
into a value with proper CQL type.
fromJson() accepts strings and may return any type, so its instances,
like toJson(), are generated during calls.
This commit also extends functions::get() with additional
'receiver' parameter. This parameter is used to extract receiver type
information needed to generate proper fromJson instance.
Receiver is known only during insert/update, so functions::get() also
accepts a nullptr if receiver is not known (e.g. during selection).
References #2058
This commit adds the implementation of INSERT JSON clause
which accepts JSON object as parameter and inserts appropriate
values into appropriate columns, as defined in given JSON.
Example:
INSERT INTO testme JSON '{
"id" : 77,
"name" : "Jones",
"ranking" : 8.5
}'
References #2058
This commit makes it possible to pass values to setters,
instead of having to pass cql3::term instances.
Thanks to that, previously prepared terminals can be used directly
in a setter execution.
References #2058
This commit makes operation::make* functions static, because they
don't access any instance-specific data anyway. It is later needed
to decouple setter execution from binding a cql3::term.
This commit adds a 'from_json_object' method which will be used
for converting JSON representation of a value to raw bytes representing
the same value. This functionality will be needed by 'INSERT JSON'
clause implementation, which can turn these raw bytes into cql3::term.
References #2058