Compare commits

99 Commits

Author SHA1 Message Date
Jenkins
d27eb734a7 release: prepare for 2.2.2 by hagitsegev 2019-01-12 18:28:25 +02:00
Avi Kivity
e6aeb490b5 Update seastar submodule
* seastar 6f61d74...88cb58c (2):
  > reactor: disable nowait aio due to a kernel bug
  > configure.py: Enhance detection for gcc -fvisibility=hidden bug

Fixes #3996.
2018-12-17 15:57:58 +02:00
Vladimir Krivopalov
2e3b09b593 database: Capture io_priority_class by reference to avoid dangling ref.
The original reference points to a thread-local storage object that is
guaranteed to outlive the continuation, but copying it makes the
subsequent calls point to a local object and introduces a use-after-free
bug.

Fixes #3948

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
(cherry picked from commit 68458148e7)
2018-12-02 13:32:59 +02:00
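The capture problem described in 2e3b09b593 can be sketched in isolation. This is a minimal illustration, not the Scylla code: `priority_class`, `reader` and `long_lived_priority()` are hypothetical stand-ins, and the functions only compare addresses, so nothing dangling is ever dereferenced.

```cpp
#include <cassert>

// Hypothetical stand-in for an I/O priority class kept in long-lived storage.
struct priority_class { int id; };

// Hypothetical consumer that remembers the address of what it was given,
// the way a reader might keep a `const priority_class&` for later use.
struct reader {
    const priority_class* pc;
    explicit reader(const priority_class& p) : pc(&p) {}
};

inline priority_class& long_lived_priority() {
    static thread_local priority_class pc{1};
    return pc;
}

// Capture by value: the lambda holds its own copy, so the reader ends up
// pointing at storage that dies together with the lambda.
inline bool points_at_long_lived_when_copied() {
    const priority_class& pc = long_lived_priority();
    auto make = [pc] { return reader(pc); };
    return make().pc == &long_lived_priority();
}

// Capture by reference: the reader keeps pointing at the long-lived object.
inline bool points_at_long_lived_when_referenced() {
    const priority_class& pc = long_lived_priority();
    auto make = [&pc] { return reader(pc); };
    return make().pc == &long_lived_priority();
}
```

Capturing by reference is only safe here because the referenced object is thread-local and outlives the continuation, as the commit message points out.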
Tomasz Grabiec
92c74f4e0b utils: phased_barrier: Make advance_and_await() have strong exception guarantees
Currently, when advance_and_await() fails to allocate the new gate
object, it will throw bad_alloc and leave the phased_barrier object in
an invalid state. Calling advance_and_await() again on it will result
in undefined behavior (typically SIGSEGV) because _gate will be
disengaged.

One place affected by this is table::seal_active_memtable(), which
calls _flush_barrier.advance_and_await(). If this throws, subsequent
flush attempts will SIGSEGV.

This patch rearranges the code so that advance_and_await() has strong
exception guarantees.
Message-Id: <1542645562-20932-1-git-send-email-tgrabiec@scylladb.com>

Fixes #3931.

(cherry picked from commit 57e25fa0f8)
2018-11-21 12:18:25 +02:00
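The strong exception guarantee mentioned in 92c74f4e0b usually comes down to doing everything that can throw before modifying the object. The sketch below assumes a hypothetical `barrier_sketch` class with a `gate` member; the real phased_barrier is not shown in this log, and `fail_next_alloc` is only a test hook that simulates bad_alloc.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

// Hypothetical stand-in for the gate object.
struct gate { int phase; };

class barrier_sketch {
    std::unique_ptr<gate> _gate = std::make_unique<gate>(gate{0});

    std::unique_ptr<gate> make_gate(int phase) {
        if (fail_next_alloc) {
            fail_next_alloc = false;
            throw std::bad_alloc();
        }
        return std::make_unique<gate>(gate{phase});
    }

public:
    bool fail_next_alloc = false;  // test hook simulating an allocation failure

    // Allocate the replacement first; _gate is only touched once nothing
    // can throw any more, so a failure leaves the object unchanged.
    void advance() {
        auto next = make_gate(_gate->phase + 1);  // may throw, _gate untouched
        std::swap(_gate, next);                   // cannot throw
    }

    bool engaged() const { return static_cast<bool>(_gate); }
    int phase() const { return _gate->phase; }
};
```

After a failed `advance()` the barrier is still engaged and in its old phase, so a later retry is well defined.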
Avi Kivity
89d835e9e3 tests: fix network_topology_test timing out in debug mode
In 2.2, SEASTAR_DEBUG is just DEBUG.
2018-10-21 19:04:08 +03:00
Takuya ASADA
263a740084 dist/debian: use --configfile to specify pbuilderrc
Use --configfile to specify pbuilderrc, instead of copying it to home directory.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180420024624.9661-1-syuu@scylladb.com>
(cherry picked from commit 01c36556bf)
2018-10-21 18:21:18 +03:00
Avi Kivity
7f24b5319e release: prepare for 2.2.1 2018-10-19 21:16:14 +03:00
Avi Kivity
fe16c0e985 locator: fix abstract_replication_strategy::get_ranges() and friends violating sort order
get_ranges() is supposed to return ranges in sorted order. However, a35136533d
broke this and returned the range that was supposed to be last in the second
position (e.g. [0, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9]). This broke cleanup, which
relied on the sort order to perform a binary search. Other users of the
get_ranges() family did not rely on the sort order.

Fixes #3872.
Message-Id: <20181019113613.1895-1-avi@scylladb.com>

(cherry picked from commit 1ce52d5432)
2018-10-19 21:16:12 +03:00
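Why the ordering in fe16c0e985 matters can be shown with plain integers standing in for token ranges (an assumption for illustration; the real ranges are not integers). Binary search has a sortedness precondition, so the helper below checks it before searching. `sorted_copy` is a hypothetical repair step, not what the patch does.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Binary search is only meaningful on sorted input, so check the
// precondition before relying on it.
inline bool contains_sorted(const std::vector<int>& ranges, int value) {
    assert(std::is_sorted(ranges.begin(), ranges.end()));
    return std::binary_search(ranges.begin(), ranges.end(), value);
}

// Hypothetical repair step for the illustration: restore the order.
inline std::vector<int> sorted_copy(std::vector<int> ranges) {
    std::sort(ranges.begin(), ranges.end());
    return ranges;
}
```

The sequence from the commit message, [0, 10, 1, 2, ...], fails the `std::is_sorted` check, which is exactly the property cleanup depended on.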
Glauber Costa
f85badaaac api: use longs instead of ints for snapshot sizes
Int types in json will be serialized to int types in C++. They will then
only be able to handle 4GB, and we tend to store more data than that.

Without this patch, listsnapshots is broken in all versions.

Fixes: #3845

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20181012155902.7573-1-glauber@scylladb.com>
(cherry picked from commit 98332de268)
2018-10-12 22:02:56 +03:00
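The size limit behind f85badaaac is easy to check directly. The sketch below is generic C++ and does not use the Scylla API code; `fits` and `GiB` are names made up for the illustration. A signed 32-bit integer tops out at 2^31 - 1, so a snapshot of a few gigabytes no longer fits, while a 64-bit integer has ample room.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when a byte count can be stored in integer type T without overflow.
template <typename T>
constexpr bool fits(std::uint64_t bytes) {
    return bytes <= static_cast<std::uint64_t>(std::numeric_limits<T>::max());
}

constexpr std::uint64_t GiB = 1ull << 30;
```

For example, `fits<std::int32_t>(5 * GiB)` is false while `fits<std::int64_t>(5 * GiB)` is true.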
Eliran Sinvani
2193d41683 cql3 : add workaround to antlr3 null dereference bug
The Antlr3 exception class has a null dereference bug that crashes
the system when trying to extract the exception message using
ANTLR_Exception<...>::displayRecognitionError(...) function. When
a parsing error occurs, the CqlParser throws an exception, which is in
turn processed for some special cases in scylla to generate a custom
message. The default case however, creates the message using
displayRecognitionError, causing the system to crash.
The fix is a simple workaround, making sure the pointer is not null
before the call to the function. A "proper" fix can't be implemented
because the exception class itself is implemented outside scylla
in antlr headers that reside on the host machine's OS.

Tested manually with 2 test cases: a typo causing scylla to crash, and
a cql comment without a newline at the end, which also caused scylla to crash.
Ran unit tests (release).

Fixes #3740
Fixes #3764

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <cfc7e0d758d7a855d113bb7c8191b0fd7d2e8921.1538566542.git.eliransin@scylladb.com>
(cherry picked from commit 20f49566a2)
2018-10-08 11:02:16 +03:00
Avi Kivity
1e1f0c29bf utils: crc32: mark power crc32 assembly as not requiring an executable stack
The linker uses an opt-in system for non-executable stack: if all object files
opt into a non-executable stack, the binary will have a non-executable stack,
which is very desirable for security. The compiler cooperates by opting into
a non-executable stack whenever possible (always for our code).

However, we also have an assembly file (for fast power crc32 computations).
Since it doesn't opt into a non-executable stack, we get a binary with
executable stack, which Gentoo's build system rightly complains about.

Fix by adding the correct incantation to the file.

Fixes #3799.

Reported-by: Alexys Jacob <ultrabug@gmail.com>
Message-Id: <20181002151251.26383-1-avi@scylladb.com>
(cherry picked from commit aaab8a3f46)
2018-10-08 11:02:16 +03:00
Calle Wilund
84d4588b5f storage_proxy: Add missing re-throw in truncate_blocking
Iff truncation times out, we want to log it, but the exception should
not be swallowed, but re-thrown.

Fixes #3796.

Message-Id: <20181001112325.17809-1-calle@scylladb.com>
(cherry picked from commit 2996b8154f)
2018-10-08 11:02:16 +03:00
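The log-and-re-throw pattern from 84d4588b5f looks roughly like this in plain C++. It is a sketch under assumptions: `timeout_error`, `log_lines()` and `run_logged` are hypothetical names, and the real code works with Seastar futures rather than a synchronous try/catch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical log sink: just a vector of lines.
inline std::vector<std::string>& log_lines() {
    static std::vector<std::string> lines;
    return lines;
}

// Hypothetical timeout exception type.
struct timeout_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Log the timeout, then let the caller see the same exception.
template <typename Op>
void run_logged(Op&& op) {
    try {
        op();
    } catch (const timeout_error& e) {
        log_lines().push_back(std::string("timed out: ") + e.what());
        throw;  // without this the exception would be swallowed
    }
}
```

A bare `throw;` re-throws the exception currently being handled, so the caller receives the original object and type.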
Duarte Nunes
7b43b26709 tests/aggregate_fcts_test: Add test case for wrapped types
Provide a test case which checks a type being wrapped in a
reverse_type plays no role in assignment.

Refs #3789

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180927223201.28152-2-duarte@scylladb.com>
(cherry picked from commit 17578c3579)
2018-10-08 11:02:16 +03:00
Duarte Nunes
0ed01acf15 cql3/selection/selector: Unwrap types when validating assignment
When validating assignment between two types, it's possible one of
them is wrapped in a reverse_type, if it comes, for example, from the
type associated with a clustering column. When checking for weak
assignment the types are correctly unwrapped, but not when checking
for an exact match, which this patch fixes.

Technically, the receiver is never a reversed_type for the current
callers, but this is the morally correct implementation, as the type
being reversed or not plays no role in assignment.

Tests: unit(release)

Fixes #3789

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180927223201.28152-1-duarte@scylladb.com>
(cherry picked from commit 5e7bb20c8a)
2018-10-08 11:02:16 +03:00
Gleb Natapov
7ce160f408 mutation_query_test: add test for result size calculation
Check that digest only and digest+data query calculate result size to be
the same.

Message-Id: <20180906153800.GK2326@scylladb.com>
(cherry picked from commit 9e438933a2)

Message-Id: <20181008075901.GC2380@scylladb.com>
2018-10-08 11:02:09 +03:00
Gleb Natapov
5017d9b46a mutation_partition: accurately account for result size in digest only queries
When measuring_output_stream is used to calculate a result element's size,
it incorrectly takes into account not only the serialized element size, but
also a placeholder that the ser::qr_partition__rows/qr_partition__static_row__cells
constructors put at the beginning. Fix it by taking the starting point in the
stream before element serialization and subtracting it afterwards.

Fixes #3755

Message-Id: <20180906153609.GJ2326@scylladb.com>
(cherry picked from commit d7674288a9)
2018-10-07 18:16:19 +03:00
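The measurement fix in 5017d9b46a follows a common pattern: record the stream position before writing and subtract it afterwards. The class below is a hypothetical counting stream written for this illustration; it is not Scylla's measuring_output_stream.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical counting stream: records how many bytes were "written".
class measuring_stream {
    std::size_t _size = 0;
public:
    void write(const std::string& bytes) { _size += bytes.size(); }
    std::size_t size() const { return _size; }
};

// Measures only the element, excluding whatever is already in the stream
// (for example a placeholder written earlier).
inline std::size_t element_size(measuring_stream& out, const std::string& element) {
    const std::size_t start = out.size();  // starting point before serialization
    out.write(element);
    return out.size() - start;
}
```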
Gleb Natapov
50b6ab3552 mutation_partition: correctly measure static row size when doing digest calculation
The code uses the wrong output stream when only a digest is requested
and thus gets an incorrect data size. Failing to correctly account
for static row size while calculating the digest may cause a mismatch
between digest and data queries.

Fixes #3753.

Message-Id: <20180905131219.GD2326@scylladb.com>
(cherry picked from commit 98092353df)
2018-09-06 16:51:19 +03:00
Eliran Sinvani
b1652823aa cql3: ensure repeated values in IN clauses don't return repeated rows
When the list of values in the IN list of a single column contains
duplicates, multiple executors are activated since the assumption
is that each value in the IN list corresponds to a different partition.
This results in the same row appearing in the result as many times
as the partition value is duplicated.

Added queries to the IN restriction unit test and fixed a bad result check.

Fixes #2837
Tests: Queries as in the use case from the GitHub issue in both forms,
prepared and plain (using python driver), unit test.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <ad88b7218fa55466be7bc4303dc50326a3d59733.1534322238.git.eliransin@scylladb.com>
(cherry picked from commit d734d316a6)
2018-08-26 15:51:17 +03:00
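One way to avoid duplicate executors for b1652823aa is to deduplicate the IN values before dispatching. The log does not say how the patch does it, so the following is only a sketch of the idea, with integers standing in for partition key values. Note that sorting changes the order of the values.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the distinct values of an IN list, in ascending order.
inline std::vector<int> unique_in_values(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    values.erase(std::unique(values.begin(), values.end()), values.end());
    return values;
}
```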
Tomasz Grabiec
02b24aec34 Merge 'Fix multi-cell static list updates in the presence of ckeys' from Duarte
Fixes a regression introduced in
9e88b60ef5, which broke the lookup for
prefetched values of lists when a clustering key is specified.

This is the code that was removed from some list operations:

 std::experimental::optional<clustering_key> row_key;
 if (!column.is_static()) {
   row_key = clustering_key::from_clustering_prefix(*params._schema, prefix);
 }
 ...
 auto&& existing_list = params.get_prefetched_list(m.key().view(), row_key, column);

Put it back, in the form of common code in the update_parameters class.

Fixes #3703

* https://github.com/duarten/scylla cql-list-fixes/v1:
  tests/cql_query_test: Test multi-cell static list updates with ckeys
  cql3/lists: Fix multi-cell static list updates in the presence of ckeys
  keys: Add factory for an empty clustering_key_prefix_view

(cherry picked from commit 6937cc2d1c)
2018-08-21 21:39:22 +01:00
Duarte Nunes
22eea4d8cf cql3/query_options: Use _value_views in prepare()
_value_views is the authoritative data structure for the
client-specified values. Indeed, the ctor called
transport::request::read_options() leaves _values completely empty.

In query_options::prepare() we were, however, using _values to
associate values with the client-specified column names, and not
_value_views. Fix this by using _value_views instead.

As for the reasons we didn't see this bug earlier, I assume it's
because very few drivers set the 0x04 query options flag, which means
column names are omitted. This is the right thing to do since most
drivers have enough information to correctly position the values.

Fixes #3688

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180814234605.14775-1-duarte@scylladb.com>
(cherry picked from commit a4355fe7e7)
2018-08-21 21:39:22 +01:00
Tomasz Grabiec
d257f6d57c mutation_partition: Fix exception safety of row::apply_monotonically()
When emplace_back() fails, value is already moved-from into a
temporary, which breaks monotonicity expected from
apply_monotonically(). As a result, writes to that cell will be lost.

The fix is to avoid the temporary by in-place construction of
cell_and_hash. To do that, appropriate cell_and_hash constructor was
added.

Found by mutation_test.cc::test_apply_monotonically_is_monotonic with
some modifications to the random mutation generator.

Introduced in 99a3e3a.

Fixes #3678.

Message-Id: <1533816965-27328-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 024b3c9fd9)
2018-08-21 21:39:18 +01:00
Takuya ASADA
6fca92ac3c dist/common/scripts/scylla_ec2_check: support custom NIC ifname on EC2
This is bash version of commit 88fe3c2694.

Since some AMIs use consistent network device naming, the primary NIC
ifname is not 'eth0'.
But we hardcoded the NIC name as 'eth0' in scylla_ec2_check, so we need to add
a --nic option to specify a custom NIC ifname.

Fixes #3658

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180807231650.13697-1-syuu@scylladb.com>
2018-08-08 09:16:57 +03:00
Jesse Haber-Kucharsky
26e3917046 auth: Don't use unsupported hashing algorithms
In previous versions of Fedora, the `crypt_r` function returned
`nullptr` when a requested hashing algorithm was not supported.

This is consistent with the documentation of the function in its man
page.

As of Fedora 28, the function's behavior changes so that the encrypted
text is not `nullptr` on error, but instead the string "*0".

The info pages for `crypt_r` clarify somewhat (and contradict the man
pages):

    Some implementations return `NULL` on failure, and others return an
    _invalid_ hashed passphrase, which will begin with a `*` and will
    not be the same as SALT.

Because of this change of behavior, users running Scylla on a Fedora 28
machine which was upgraded from a previous release would not be able to
authenticate: an unsupported hashing algorithm would be selected,
producing encrypted text that did not match the entry in the table.

With this change, unsupported algorithms are correctly detected and
users should be able to continue to authenticate themselves.

Fixes #3637.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <bcd708f3ec195870fa2b0d147c8910fb63db7e0e.1533322594.git.jhaberku@scylladb.com>
(cherry picked from commit fce10f2c6e)
2018-08-05 10:30:47 +03:00
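The two failure conventions quoted in 26e3917046 can be covered by a single check on the returned string. `is_crypt_failure` is a hypothetical helper written for this illustration; it inspects a result that has already been obtained and does not call `crypt_r` itself.

```cpp
#include <cassert>

// True when a crypt-style result signals failure under either convention:
// a null pointer, or an invalid hash that begins with '*'.
inline bool is_crypt_failure(const char* result) {
    return result == nullptr || result[0] == '*';
}
```

With such a check, a candidate hashing algorithm can be rejected on both older and newer Fedora releases.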
Gleb Natapov
3892594a93 cache_hitrate_calculator: fix race when new table is added during calculations
The calculation consists of several parts with preemption points between
them, so a table can be added while the calculation is ongoing. Do not
assume that the table exists in the intermediate data structure.

Fixes #3636

Message-Id: <20180801093147.GD23569@scylladb.com>
(cherry picked from commit 44a6afad8c)
2018-08-01 14:34:08 +03:00
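The defensive lookup described in 3892594a93 amounts to using `find()` and handling the miss, instead of assuming the key is present. The map type and the `rate_or` helper below are hypothetical; the real intermediate data structure is not shown in this log.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using rates = std::unordered_map<std::string, double>;

// Returns the rate recorded for `table`, or `fallback` when the table was
// added after the snapshot was taken and is therefore missing from it.
inline double rate_or(const rates& snapshot, const std::string& table, double fallback) {
    auto it = snapshot.find(table);
    return it == snapshot.end() ? fallback : it->second;
}
```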
Amos Kong
4b24439841 scylla_setup: fix conditional statement of silent mode
Commit 300af65555 introduced a problem in the
conditional statement: the script always aborts in silent mode, regardless
of the return value.

Fixes #3485

Signed-off-by: Amos Kong <amos@scylladb.com>
Message-Id: <1c12ab04651352964a176368f8ee28f19ae43c68.1528077114.git.amos@scylladb.com>
(cherry picked from commit 364c2551c8)
2018-07-25 09:36:32 +03:00
Takuya ASADA
a02a4592d8 dist/common/scripts/scylla_setup: abort running script when one of setup failed in silent mode
The current script silently continues even if one of the setup steps fails; it needs to
abort.

Fixes #3433

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180522180355.1648-1-syuu@scylladb.com>
(cherry picked from commit 300af65555)
2018-07-25 09:36:29 +03:00
Avi Kivity
b6e1c08451 Merge "row_cache: Fix violation of continuity on concurrent eviction and population" from Tomasz
"
The problem happens under the following circumstances:

  - we have a partially populated partition in cache, with a gap in the middle

  - a read with no clustering restrictions trying to populate that gap

  - eviction of the entry for the lower bound of the gap concurrent with population

The population may incorrectly mark the range before the gap as continuous.
This may result in temporary loss of writes in that clustering range. The
problem heals by clearing cache.

Caught by row_cache_test::test_concurrent_reads_and_eviction, which has been
failing sporadically.

The problem is in ensure_population_lower_bound(), which returns true if
current clustering range covers all rows, which means that the populator has a
right to set continuity flag to true on the row it inserts. This is correct
only if the current population range actually starts before all
clustering rows. Otherwise, we're populating from _last_row and should
consult it.

Fixes #3608.
"

* 'tgrabiec/fix-violation-of-continuity-on-concurrent-read-and-eviction' of github.com:tgrabiec/scylla:
  row_cache: Fix violation of continuity on concurrent eviction and population
  position_in_partition: Introduce is_before_all_clustered_rows()

(cherry picked from commit 31151cadd4)
2018-07-18 12:07:01 +02:00
Botond Dénes
9469afcd27 storage_proxy: use the original row limits for the final results merging
`query_partition_key_range()` does the final result merging and trimming
(if necessary) to make sure we don't send more rows to the client than
requested. This merging and trimming is done by a continuation attached
to the `query_partition_key_range_concurrent()` which does the actual
querying. The continuation captures by value the `row_limit` and
`partition_limit` fields of the `query::read_command` object of the
query. This has an unexpected consequence. The lambda object is
constructed after the call to `query_partition_key_range_concurrent()`
returns. If this call doesn't defer, any modifications made to the read
command object by `query_partition_key_range_concurrent()` will be
visible to the lambda. This is undesirable because
`query_partition_key_range_concurrent()` updates the read command object
directly as the vnodes are traversed which in turn will result in the
lambda doing the final trimming according to a decremented `row_limit`,
which will cause the paging logic to declare the query as exhausted
prematurely because the page will not be full.
To avoid all this make a copy of the relevant limit fields before
`query_partition_key_range_concurrent()` is called and pass these copies
to the continuation, thus ensuring that the final trimming will be done
according to the original page limits.

Spotted while investigating a dtest failure on my 1865/range-scans/v2
branch. On that branch the way range scans are executed on replicas is
completely refactored. These changes apparently reduce the number of
continuations in the read path to the point where an entire page can be
filled without deferring and thus causing the problem to surface.

Fixes #3605.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <f11e80a6bf8089d49ba3c112b25a69edf1a92231.1531743940.git.bdenes@scylladb.com>
(cherry picked from commit cc4acb6e26)
2018-07-16 17:51:06 +03:00
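The ordering issue in 9469afcd27 can be reproduced without futures. In this sketch the call and the lambda construction are written as separate statements so the order is explicit; `read_command_sketch` and `query_concurrent` are hypothetical stand-ins, and the real code attaches the lambda as a Seastar continuation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for query::read_command with only the field we need.
struct read_command_sketch { std::uint32_t row_limit; };

// Stand-in for the querying step: it decrements cmd.row_limit as rows are
// found, just as the commit describes for the vnode traversal.
inline std::uint32_t query_concurrent(read_command_sketch& cmd, std::uint32_t rows_found) {
    const std::uint32_t taken = rows_found < cmd.row_limit ? rows_found : cmd.row_limit;
    cmd.row_limit -= taken;
    return taken;
}

// Buggy shape: the lambda is built after the call returned, so its by-value
// capture already sees the decremented limit.
inline std::uint32_t limit_seen_after_call(read_command_sketch cmd, std::uint32_t rows_found) {
    query_concurrent(cmd, rows_found);
    auto trim = [limit = cmd.row_limit] { return limit; };
    return trim();
}

// Fixed shape: copy the limit before the call and capture the copy.
inline std::uint32_t limit_seen_with_copy(read_command_sketch cmd, std::uint32_t rows_found) {
    const std::uint32_t original_limit = cmd.row_limit;
    query_concurrent(cmd, rows_found);
    auto trim = [original_limit] { return original_limit; };
    return trim();
}
```

With a limit of 100 and 40 rows found, the first shape trims to 60 and the second to the original 100.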
Avi Kivity
240b9f122b Merge "Backport empty partition range scan fixes" from Botond
"
This mini-series lumps together the fix for the empty partition range
scan crash (#3564) and the two follow-up patches.
"

* 'paging-fix-backport-2.2/v1' of https://github.com/denesb/scylla:
  query_pager: use query::is_single_partition() to check for singular range
  tests/cql_query_tess: add unit test for querying empty ranges test
  query_pager: be prepared to _ranges being empty
2018-07-05 10:29:31 +03:00
Botond Dénes
cb16cd7724 query_pager: use query::is_single_partition() to check for singular range
Use query::is_single_partition() to check whether the queried ranges are
singular or not. The current method of using
`dht::partition_range::is_singular()` is incorrect, as it is possible to
build a singular range that doesn't represent a single partition.
`query::is_single_partition()` correctly checks for this so use it
instead.

Found during code-review.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <f671f107e8069910a2f84b14c8d22638333d571c.1530675889.git.bdenes@scylladb.com>
(cherry picked from commit 8084ce3a8e)
2018-07-04 12:57:45 +03:00
Botond Dénes
c864d198fc tests/cql_query_tess: add unit test for querying empty ranges test
A bug was found recently (#3564) in the paging logic, where the code
assumed the queried ranges list is non-empty. This assumption is
incorrect as there can be valid (if rare) queries that can result in the
ranges list to be empty. Add a unit test that executes such a query with
paging enabled to detect any future bugs related to assumptions about
the ranges list being non-empty.

Refs: #3564
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <f5ba308c4014c24bb392060a7e72e7521ff021fa.1530618836.git.bdenes@scylladb.com>
(cherry picked from commit c236a96d7d)
2018-07-04 09:52:54 +03:00
Botond Dénes
25125e9c4f query_pager: be prepared to _ranges being empty
do_fetch_page() checks in the beginning whether there is a saved query
state already, meaning this is not the first page. If there is not, it
checks whether the query is for a singular partition or a range scan
to decide whether to enable the stateful queries or not. This check
assumed that there is at least one range in _ranges which will not hold
under some circumstances. Add a check for _ranges being empty.

Fixes: #3564
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <cbe64473f8013967a93ef7b2104c7ca0507afac9.1530610709.git.bdenes@scylladb.com>
(cherry picked from commit 59a30f0684)
2018-07-04 09:52:54 +03:00
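The guard added in 25125e9c4f is the usual emptiness check before looking at the first element. The types below are hypothetical simplifications of the pager's `_ranges`; the point is only that `front()` on an empty vector is undefined behavior and must not be reached.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a partition range.
struct range_sketch { bool single_partition; };

// Short-circuit evaluation ensures front() is only called on a non-empty list.
inline bool is_single_partition_query(const std::vector<range_sketch>& ranges) {
    return !ranges.empty() && ranges.front().single_partition;
}
```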
Shlomi Livne
faf10fe6aa release: prepare for 2.2.0
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
2018-07-01 22:40:42 +03:00
Calle Wilund
f76269cdcf sstables::compress: Ensure unqualified compressor name if possible
Fixes #3546

Both older origin and scylla write "known" compressor names (i.e. those
in origin namespace) unqualified (i.e. LZ4Compressor).

This behaviour was not preserved in the virtualization change. But
probably should be.

Message-Id: <20180627110930.1619-1-calle@scylladb.com>
(cherry picked from commit 054514a47a)
2018-06-28 18:55:15 +03:00
Avi Kivity
a9b0ccf116 Merge "Disable sstable filtering based on min/max clustering key components" from Tomasz
"
With DateTiered and TimeWindow, there is a read optimization enabled
which excludes sstables based on overlap with recorded min/max values
of clustering key components. The problem is that it doesn't take into
account partition tombstones and static rows, which should still be
returned by the reader even if there is no overlap in the query's
clustering range. A read which returns no clustering rows can
mispopulate cache, which will appear as partition deletion or writes
to the static row being lost, until node restart or eviction of the
partition entry.

There is also a bad interaction between cache population on read and
that optimization. When the clustering range of the query doesn't
overlap with any sstable, the reader will return no partition markers
for the read, which leads cache populator to assume there is no
partition in sstables and it will cache an empty partition. This will
cause later reads of that partition to miss prior writes to that
partition until it is evicted from cache or node is restarted.

Disable until a more elaborate fix is implemented.

Fixes #3552
Fixes #3553
"

* tag 'tgrabiec/disable-min-max-sstable-filtering-v1' of github.com:tgrabiec/scylla:
  tests: Add test for slicing a mutation source with date tiered compaction strategy
  tests: Check that database conforms to mutation source
  database: Disable sstable filtering based on min/max clustering key components

(cherry picked from commit e1efda8b0c)
2018-06-28 18:55:15 +03:00
Tomasz Grabiec
abc5941f87 flat_mutation_reader: Move field initialization to initializer list
This works around a problem of std::terminate() being called in debug
mode build if initialization of _current throws.

Backtrace:

Thread 2 "row_cache_test_" received signal SIGABRT, Aborted.
0x00007ffff17ce9fb in raise () from /lib64/libc.so.6
(gdb) bt
  #0  0x00007ffff17ce9fb in raise () from /lib64/libc.so.6
  #1  0x00007ffff17d077d in abort () from /lib64/libc.so.6
  #2  0x00007ffff5773025 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
  #3  0x00007ffff5770c16 in ?? () from /lib64/libstdc++.so.6
  #4  0x00007ffff576fb19 in ?? () from /lib64/libstdc++.so.6
  #5  0x00007ffff5770508 in __gxx_personality_v0 () from /lib64/libstdc++.so.6
  #6  0x00007ffff3ce4ee3 in ?? () from /lib64/libgcc_s.so.1
  #7  0x00007ffff3ce570e in _Unwind_Resume () from /lib64/libgcc_s.so.1
  #8  0x0000000003633602 in reader::reader (this=0x60e0001160c0, r=...) at flat_mutation_reader.cc:214
  #9  0x0000000003655864 in std::make_unique<make_forwardable(flat_mutation_reader)::reader, flat_mutation_reader>(flat_mutation_reader &&) (__args#0=...)
    at /usr/include/c++/7/bits/unique_ptr.h:825
  #10 0x0000000003649a63 in make_flat_mutation_reader<make_forwardable(flat_mutation_reader)::reader, flat_mutation_reader>(flat_mutation_reader &&) (args#0=...)
    at flat_mutation_reader.hh:440
  #11 0x000000000363565d in make_forwardable (m=...) at flat_mutation_reader.cc:270
  #12 0x000000000303f962 in memtable::make_flat_reader (this=0x61300001d540, s=..., range=..., slice=..., pc=..., trace_state_ptr=..., fwd=..., fwd_mr=...)
    at memtable.cc:592

Message-Id: <1528792447-13336-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 6d6b93d1e7)
2018-06-28 18:55:15 +03:00
Asias He
a152ac12af gossip: Fix tokens assignment in assassinate_endpoint
The tokens vector is defined a few lines above and is needed outside the
if block.

Do not redefine it again in the if block, otherwise the tokens will be empty.

Found by code inspection.

Fixes #3551.

Message-Id: <c7a06375c65c950e94236571127f533e5a60cbfd.1530002177.git.asias@scylladb.com>
(cherry picked from commit c3b5a2ecd5)
2018-06-28 18:55:15 +03:00
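The shadowing mistake fixed by a152ac12af is easy to reproduce in a few lines. The function names and token values are made up for the illustration; only the shape of the bug follows the commit message.

```cpp
#include <cassert>
#include <vector>

// Buggy shape: the inner declaration creates a second variable that shadows
// the outer one, so the outer vector is never filled.
inline std::vector<int> tokens_shadowed(bool known) {
    std::vector<int> tokens;
    if (known) {
        std::vector<int> tokens = {1, 2, 3};  // new variable, not an assignment
        (void)tokens;
    }
    return tokens;  // always empty
}

// Fixed shape: assign to the variable declared outside the if block.
inline std::vector<int> tokens_assigned(bool known) {
    std::vector<int> tokens;
    if (known) {
        tokens = {1, 2, 3};
    }
    return tokens;
}
```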
Botond Dénes
c274fdf2ec querier: find_querier(): return end() when no querier matches the range
When none of the queriers found for the lookup key match the lookup
range `_entries.end()` should be returned as the search failed. Instead
the iterator returned from the failed `std::find_if()` is returned
which, if the find failed, will be the end iterator returned by the
previous call to `_entries.equal_range()`. This is incorrect because as
long as `equal_range()`'s end iterator is not also `_entries.end()` the
search will always return an iterator to a querier regardless of whether
any of them actually matches the read range.
Fix by returning `_entries.end()` when it is detected that no queriers
match the range.

Fixes: #3530
(cherry picked from commit 2609a17a23)
2018-06-28 18:55:15 +03:00
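The iterator mix-up in c274fdf2ec can be sketched with a standard multimap. This is an illustration under assumptions: the real `_entries` container and querier type differ, and strings stand in for read ranges here.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Hypothetical cache: lookup key -> range tag.
using entries = std::multimap<int, std::string>;

inline entries::const_iterator find_entry(const entries& e, int key, const std::string& range) {
    const auto candidates = e.equal_range(key);
    const auto it = std::find_if(candidates.first, candidates.second,
        [&](const entries::value_type& kv) { return kv.second == range; });
    // A failed find_if yields candidates.second, which may point at a real
    // entry stored under a different key. Translate it to the container's end().
    return it == candidates.second ? e.end() : it;
}
```

Callers that compare the result against `end()` then see a failed search as a miss instead of receiving an unrelated entry.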
Botond Dénes
5b88d6b4d6 querier_cache: restructure entries storage
Currently querier_cache uses a `std::unordered_map<utils::UUID, querier>`
to store cache entries and an `std::list<meta_entry>` to store meta
information about the querier entries, like insertion order, expiry
time, etc.

All cache eviction algorithms use the meta-entry list to evict entries
in reverse insertion order (LRU order). To make this possible
meta-entries keep an iterator into the entry map so that given a
meta-entry one can easily erase the querier entry. This however poses a
problem as std::unordered_map can possibly invalidate all its iterators
when new items are inserted. This is use-after-free waiting to happen.

Another disadvantage of the current solution is that it requires the
meta-entry to use a weak pointer to the querier entry so that in case
that is removed (as a result of a successful lookup) it doesn't try to
access it. This has an impact on all cache eviction algorithms as they
have to be prepared to deal with stale meta-entries. Stale meta-entries
also unnecessarily consume memory.

To solve these problems redesign how querier_cache stores entries
completely. Instead of storing the entries in an `std::unordered_map`
and storing the meta-entries in an `std::list`, store the entries in an
`std::list` and an intrusive-map (index) for lookups. This new design
has several advantages over the old one:
* The entries will now be in insert order, so eviction strategies can
  work on the entry list itself, no need to involve additional data
  structures for this.
* All data related to an entry is stored in one place, no data
  duplication.
* Removing an entry automatically removes it from the index as intrusive
  containers support auto unlink. This means there is no need to store
  iterators for long terms, risking use-after-free when the container
  invalidates its iterators.

Additional changes:
* Modify eviction strategies so that they work with the `entry`
  interface rather than the stored value directly.

Ref #3424

(cherry picked from commit 7ce7f3f0cc)
2018-06-28 18:55:15 +03:00
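The storage layout described in 5b88d6b4d6 can be approximated with standard containers. This sketch swaps the intrusive map of the real patch for a `std::unordered_map` holding `std::list` iterators, which stay valid when other elements are inserted or erased; `entry_cache_sketch` and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

class entry_cache_sketch {
    struct entry {
        int key;
        std::string value;
    };
    std::list<entry> _entries;  // insertion order, oldest first
    std::unordered_map<int, std::list<entry>::iterator> _index;  // lookup index

public:
    // Assumes `key` is not already present.
    void insert(int key, std::string value) {
        assert(_index.count(key) == 0);
        _entries.push_back(entry{key, std::move(value)});
        _index.emplace(key, std::prev(_entries.end()));
    }

    // Removes the entry and hands out its value, as a successful lookup does.
    bool take(int key, std::string& out) {
        auto it = _index.find(key);
        if (it == _index.end()) {
            return false;
        }
        out = std::move(it->second->value);
        _entries.erase(it->second);
        _index.erase(it);
        return true;
    }

    // Eviction works on the entry list itself, oldest entry first.
    bool evict_oldest() {
        if (_entries.empty()) {
            return false;
        }
        _index.erase(_entries.front().key);
        _entries.pop_front();
        return true;
    }

    std::size_t size() const { return _entries.size(); }
};
```

Unlike the intrusive containers in the commit, this version has to erase from the index by hand, which is the bookkeeping that auto-unlink removes.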
Botond Dénes
2d626e1cf8 tests/querier_cache: fix memory based eviction test
Do increment the key counter after inserting the first querier into the
cache. Otherwise two queriers with the same key will be inserted and
will fail the test. This problem is exposed by the changes the next
patches make to the querier-cache, but is fixed beforehand to maintain
bisectability of the code.

Fixes: #3529
(cherry picked from commit b9d51b4c08)
2018-06-28 18:55:15 +03:00
Avi Kivity
c11bd3e1cf Merge "Do not allow compaction controller shares to grow indefinitely" from Glauber
"
We are seeing some workloads with large datasets where the compaction
controller ends up with a lot of shares. Regardless of whether or not
we'll change the algorithm, this patchset handles a more basic issue,
which is the fact that the current controller doesn't set a maximum
explicitly, so if the input is larger than the maximum it will keep
growing without bounds.

It also pushes the maximum input point of the compaction controller from
10 to 30, allowing us to err on the side of caution for the 2.2 release.
"

* 'tame-controller' of github.com:glommer/scylla:
  controller: do not increase shares of controllers for inputs higher than the maximum
  controller: adjust constants for compaction controller

(cherry picked from commit e0eb66af6b)
2018-06-20 10:58:20 +03:00
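The bounded behavior described in c11bd3e1cf can be sketched as a clamp on the controller input. The linear mapping and the `shares_for` helper are assumptions made for the illustration; the log only states that shares must stop growing once the input exceeds the maximum, and that the maximum input point moved from 10 to 30.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical linear controller: shares grow with the input up to
// max_input and stay flat beyond it.
inline double shares_for(double input, double max_input, double max_shares) {
    const double clamped = std::min(std::max(input, 0.0), max_input);
    return max_shares * (clamped / max_input);
}
```

With a maximum input of 30, an input of 90 yields the same shares as an input of 30.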
Avi Kivity
9df3df92bc Merge "Try harder to move STCS towards zero-backlog" from Glauber
"
Tests: unit (release)

Before merging the LCS controller, we merged patches that would
guarantee that LCS would move towards zero backlog - otherwise the
backlog could get too high.

We didn't do the same for STCS, our first controlled strategy. So we may
end up with a situation where there are many SSTables inducing a large
backlog, but they are not yet meeting the minimum criteria for
compaction. The backlog, then, never goes down.

This patch changes the SSTable selection criteria so that if there is
nothing to do, we'll keep pushing towards reaching a state of zero
backlog. Very similar to what we did for LCS.
"

* 'stcs-min-threshold-v4' of github.com:glommer/scylla:
  STCS: bypass min_threshold unless configure to enforce strictly
  compaction_strategy: allow the user to tell us if min_threshold has to be strict

(cherry picked from commit f0fc888381)
2018-06-18 14:21:52 +03:00
Takuya ASADA
8ad9578a6c dist/debian: add --jobs <njobs> option just like build_rpm.sh
In some build environments we may want to limit the number of parallel jobs, since
ninja-build runs ncpus jobs by default, which may be too many because g++ uses a
lot of memory.
So support --jobs <njobs> just like the rpm build script does.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180425205439.30053-1-syuu@scylladb.com>
(cherry picked from commit 782ebcece4)
2018-06-14 15:04:50 +03:00
Tomasz Grabiec
4cb6061a9f tests: row_cache: Reduce concurrency limit to avoid bad_alloc
The test uses random mutations. We saw it failing with bad_alloc from time to time.
Reduce concurrency to reduce memory footprint.

Message-Id: <20180611090304.16681-1-tgrabiec@scylladb.com>
(cherry picked from commit a91974af7a)
2018-06-14 13:40:00 +02:00
Tomasz Grabiec
1940e6bd95 tests: row_cache: Do not hang when only one of the readers throws
Message-Id: <20180531122729.3314-1-tgrabiec@scylladb.com>
(cherry picked from commit b5e42bc6a0)
2018-06-14 13:40:00 +02:00
Avi Kivity
044cfde5f3 database: stop using incremental selectors
There is a bug in incremental_selector for partitioned_sstable_set, so
until it is found, stop using it.

This degrades scan performance of Leveled Compaction Strategy tables.

Fixes #3513. (as a workaround)
Introduced: 2.1
Message-Id: <20180613131547.19084-1-avi@scylladb.com>

(cherry picked from commit aeffbb6732)
2018-06-13 21:04:56 +03:00
Vlad Zolotarov
262a246436 locator::ec2_multi_region_snitch: don't call for ec2_snitch::gossiper_starting()
ec2_snitch::gossiper_starting() calls the base class (default) method
that sets _gossip_started to TRUE and thereby prevents the following
reconnectable_snitch_helper registration.

Fixes #3454

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1528208520-28046-1-git-send-email-vladz@scylladb.com>
(cherry picked from commit 2dde372ae6)
2018-06-12 19:02:19 +03:00
Botond Dénes
799dbb4f2e forwardable reader: implement fast_forward_to(position_in_partition)
Instead of throwing std::bad_function_call. Needed by the foreign_reader
unit test. Not sure how other tests didn't hit this before as the test
is using `run_mutation_source_tests()`.

(cherry picked from commit 50b67232e5)
Fixes #3491.
2018-06-05 12:34:15 +03:00
Shlomi Livne
a2fe669dd3 dist/docker: Switch to Scylla 2.2 repository
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <83b4ff801b283ade512a7035ecea9057a864dcdd.1526995747.git.shlomi@scylladb.com>
2018-06-05 12:34:15 +03:00
Avi Kivity
56de761daf Update seastar submodule
* seastar 7c6ba3a...6f61d74 (1):
  > tls: Ensure handshake always drains output before return/throw

Fixes #3461.
2018-06-05 12:34:15 +03:00
Shlomi Livne
c3187093a3 release: prepare for 2.2.rc2
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
2018-05-30 17:32:16 +03:00
Avi Kivity
111c2ecf5d Update scylla-ami submodule
* dist/ami/files/scylla-ami 49896ec...6ed71a3 (1):
  > scylla_install_ami: Update CentOS to latest version
2018-05-28 14:02:43 +03:00
Takuya ASADA
a6ecdbbba6 Revert "dist/ami: update CentOS base image to latest version"
This reverts commit 69d226625a.
Since ami-4bf3d731 is a Marketplace AMI, it is not possible to publish a public AMI based on it.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180523112414.27307-1-syuu@scylladb.com>
(cherry picked from commit 6b1b9f9e602c570bbc96692d30046117e7d31ea7)
2018-05-28 13:40:15 +03:00
Glauber Costa
17cc62d0b3 commitlog: don't move pointer to segment
We are currently moving the pointer we acquired to the segment inside
the lambda in which we'll handle the cycle.

The problem is, we also use that same pointer inside the exception
handler. If an exception happens we'll access it and we'll crash.
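The hazard described above can be sketched in plain C++. This is only an illustration under stated assumptions: `segment` and `cycle_segment` are hypothetical names, and `std::shared_ptr` stands in for whatever pointer type the commitlog actually uses. The point is that the lambda copies the pointer instead of moving it, so the exception handler can still dereference it.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a commitlog segment.
struct segment {
    std::string name;
};

// Runs the "cycle" body; on failure, reports which segment failed.
// The lambda captures `seg` by copy (sharing ownership), so `seg`
// is still non-null when the exception handler reads it. Capturing with
// `seg = std::move(seg)` would leave the outer pointer empty.
std::string cycle_segment(std::shared_ptr<segment> seg, bool fail) {
    auto body = [seg, fail] {
        if (fail) {
            throw std::runtime_error("cycle failed");
        }
        return "cycled " + seg->name;
    };
    try {
        return body();
    } catch (const std::exception& e) {
        // Safe: `seg` was copied into the lambda, not moved from.
        return "error in " + seg->name + ": " + e.what();
    }
}
```

With a moved-from `std::shared_ptr`, the handler would dereference a null pointer, which matches the crash described in the message.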

Probably #3440.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20180518125820.10726-1-glauber@scylladb.com>
(cherry picked from commit 596a525950)
2018-05-19 19:12:26 +03:00
Shlomi Livne
eb646c61ed release: prepare for 2.2.rc1
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
2018-05-16 21:31:50 +03:00
Avi Kivity
782d817e84 dist: redhat: get rid of raid0.devices_discard_performance
This parameter is not available on recent Red Hat kernels or on
non-Red Hat kernels (it was removed on 3.10.0-772.el7,
RHBZ 1455932). The presence of the parameter on kernels that don't
support it causes the module load to fail, with the result that the
storage is not available.

Fix by removing the parameter. For someone running an older Red Hat
kernel the effect will be that discard is disabled, but they can fix
that by updating the kernel. For someone running a newer kernel, the
effect will be that they can access their data.

Fixes #3437.
Message-Id: <20180516134913.6540-1-avi@scylladb.com>

(cherry picked from commit 3b8118d4e5)
2018-05-16 20:13:59 +03:00
Avi Kivity
3ed5e63e8a Update scylla-ami submodule
* dist/ami/files/scylla-ami 02b1853...49896ec (1):
  > Merge "AMI build fix" from Takuya
2018-05-16 12:37:03 +03:00
Tomasz Grabiec
d17ce46983 Update seastar submodule
Fixes #3339.

* seastar 491f994...7c6ba3a (2):
  > Merge "fix perftune.py issues with cpu-masks on big machines" from Vlad
  > Merge 'Handle Intel's NICs in a special way'  from Vlad
2018-05-16 09:37:41 +02:00
Takuya ASADA
7ca5e7e993 dist/redhat: replace scylla-libgcc72/scylla-libstdc++72 with scylla-2.2 metapackage
There is a conflict between scylla-libgcc72/scylla-libstdc++72 and
scylla-libgcc73/scylla-libstdc++73; we need to replace the *72 packages with
the scylla-2.2 metapackage to prevent it.

Fixes #3373

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180510081246.17928-1-syuu@scylladb.com>
(cherry picked from commit 6fa3c4dcad)
2018-05-11 09:42:28 +03:00
Duarte Nunes
07b0ce27fa Merge 'Include OPTIONS with LIST ROLES' from Jesse
"
Fixes #3420.

Tests: dtest (`auth_test.py`), unit (release)
"

* 'jhk/fix_3420/v2' of https://github.com/hakuch/scylla:
  cql3: Include custom options in LIST ROLES
  auth: Query custom options from the `authenticator`
  auth: Add type alias for custom auth. options

(cherry picked from commit d49348b0e1)
2018-05-10 13:22:49 +03:00
Amnon Heiman
27be3cd242 scylla-housekeeping: support new 2018.1 path variation
Starting from 2018.1 and 2.2, the repository path changed.
The change was made to support multiple products (such as manager, and to
place the enterprise in a different path).

As a result, the regular expression that looks for the repository fails.

This patch changes the way the path is searched: the rpm and debian
variations are combined, and both forms of the repository path are
unified.

See scylladb/scylla-enterprise#527

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20180429151926.20431-1-amnon@scylladb.com>
(cherry picked from commit 6bf759128b)
2018-05-09 15:22:55 +03:00
Calle Wilund
abf50aafef database: Fix assert in truncate
Fixes crash in cql_tests.StorageProxyCQLTester.table_test
"avoid race condition when deleting sstable on behalf..." changed
discard_sstables behaviour to only return rp:s for sstables owned
and submitted for deletion (not all matching time stamp),
which can in some cases cause zero rp returned.
Message-Id: <20180508070003.1110-1-calle@scylladb.com>
2018-05-09 10:02:09 +01:00
Duarte Nunes
dfe5b38a43 db/view: Limit number of pending view updates
This patch adds a simple and naive mechanism to ensure a base replica
doesn't overwhelm a potentially overloaded view replica by sending too
many concurrent view updates. We add a semaphore to limit to 100 the
number of outstanding view updates. We limit globally per shard, and
not per destination view replica. We also limit statically.
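A minimal sketch of the limiting idea, assuming a single-threaded shard. The class name `update_limiter` and its methods are hypothetical; the actual patch uses a semaphore, which makes callers wait for a free slot, whereas this sketch only reports whether a slot is available.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-shard limiter for outstanding view updates.
// Not the real implementation: a semaphore would suspend the caller
// until a unit is free; here try_acquire() simply returns false.
class update_limiter {
    std::size_t _available;
public:
    explicit update_limiter(std::size_t max_pending) : _available(max_pending) {}

    // Reserve a slot for one outgoing view update, if any is left.
    bool try_acquire() {
        if (_available == 0) {
            return false;
        }
        --_available;
        return true;
    }

    // Give the slot back once the update has completed or failed.
    void release() { ++_available; }

    std::size_t available() const { return _available; }
};
```

Because the limit is global per shard and static, one slow view replica can consume all 100 slots; the message itself calls the mechanism simple and naive.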

Refs #2538

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180426134457.21290-2-duarte@scylladb.com>
(cherry picked from commit 4b3562c3f5)
2018-05-08 00:46:33 +01:00
Duarte Nunes
9bdc8c25f5 db/view: Return a future when sending view updates
While we now send view mutations asynchronously in the normal view
write path, other processes interested in sending view updates, such
as streaming or view building, may wish to do it synchronously.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
(cherry picked from commit dc44a08370)
2018-05-08 00:46:19 +01:00
Duarte Nunes
e75c55b2db db/timeout_clock: Properly scope type names
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180426134457.21290-1-duarte@scylladb.com>
(cherry picked from commit 2be75bdfc9)
2018-05-07 19:29:48 +01:00
Botond Dénes
756feae052 database: when dropping a table evict all relevant queriers
Queriers shouldn't outlive the table they read from as that could lead
to use-after-free problems when they are destroyed.

Fixes: #3414

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <3d7172cef79bb52b7097596e1d4ebba3a6ff757e.1525716986.git.bdenes@scylladb.com>
(cherry picked from commit 6f7d919470)
2018-05-07 21:20:42 +03:00
Tomasz Grabiec
202b4e6797 storage_proxy: Request schema from the coordinator in the original DC
The mutation forwarding intermediary (src_addr) may not always know
about the schema which was used by the original coordinator. I think
this may be the cause of the "Schema version ... not found" error seen
in one of the clusters which entered some pathological state:

  storage_proxy - Failed to apply mutation from 1.1.1.1#5: std::_Nested_exception<schema_version_loading_failed> (Failed to load schema version 32893223-a911-3a01-ad70-df1eb2a15db1): std::runtime_error (Schema version 32893223-a911-3a01-ad70-df1eb2a15db1 not found)

Fixes #3393.

Message-Id: <1524639030-1696-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 423712f1fe)
2018-05-07 13:08:40 +03:00
Raphael S. Carvalho
76ac200eff database: avoid race condition when deleting sstable on behalf of cf truncate
After removal of deletion manager, caller is now responsible for properly
submitting the deletion of a shared sstable. That's because deletion manager
was responsible for holding deletion until all owners agreed on it.
Resharding for example was changed to delete the shared sstables at the end,
but truncate wasn't changed and so race condition could happen when deleting
same sstable at more than one shard in parallel. Change the operation to only
submit a shared sstable for deletion in only one owner.

Fixes dtest migration_test.TestMigration.migrate_sstable_with_schema_change_test

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180503193427.24049-1-raphaelsc@scylladb.com>
2018-05-04 13:10:12 +01:00
Tomasz Grabiec
9aa172fe8e db: schema_tables: Treat drop of scylla_tables.version as an alter
After upgrade from 1.7 to 2.0, nodes will record a per-table schema
version which matches that on 1.7 to support the rolling upgrade. Any
later schema change (after the upgrade is done) will drop this record
from affected tables so that the per-table schema version is
recalculated. If nodes perform a schema pull (they detect schema
mismatch), then the merge will affect all tables and will wipe the
per-table schema version record from all tables, even if their schema
did not change. If then only some nodes get restarted, the restarted
nodes will load tables with the new (recalculated) per-table schema
version, while not restarted nodes will still use the 1.7 per-table
schema version. Until all nodes are restarted, writes or reads between
nodes from different groups will involve a needless exchange of schema
definition.

This will manifest in logs with repeated messages indicating schema
merge with no effect, triggered by writes:

  database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f
  database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f
  database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f

The sync will be performed if the receiving shard forgets the foreign
version, which happens if it doesn't process any request referencing
it for more than 1 second.

This may impact latency of writes and reads.

The fix is to treat schema changes which drop the 1.7 per-table schema
version marker as an alter, which will switch in-memory data
structures to use the new per-table schema version immediately,
without the need for a restart.

Fixes #3394

Tests:
    - dtest: schema_test.py, schema_management_test.py
    - reproduced and validated the fix with run_upgrade_tests.sh from git@github.com:tgrabiec/scylla-dtest.git
    - unit (release)

Message-Id: <1524764211-12868-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit b1465291cf)
2018-05-03 10:51:19 +03:00
Takuya ASADA
c4af043ef7 dist/common/scripts/scylla_raid_setup: prevent 'device or resource busy' on creating mdraid device
According to this web site, mdraid creation can race with udev:
http://dev.bizo.com/2012/07/mdadm-device-or-resource-busy.html
It looks like this can happen on our AMI, too (see #2784).

To initialize RAID safely, we should wait for udev events to finish before and
after mdadm is executed.

Fixes #2784

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1505898196-28389-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 4a8ed4cc6f)
2018-04-24 12:53:34 +03:00
Raphael S. Carvalho
06b25320be sstables: Fix bloom filter size after resharding by properly estimating partition count
We were feeding the total estimated partition count of an input shared
sstable to the output unshared ones.

So the sstable writer thinks, *from estimation*, that each sstable created
by resharding will have the same data amount as the shared sstable it
is being created from. That's a problem because the estimation is fed to
bloom filter creation, which directly influences the filter's size.
So if we're resharding all sstables that belong to all shards, the
disk usage taken by filter components will be multiplied by the number
of shards. That becomes more of a problem with #3302.

Partition count estimation for a shard S will now be done as follows:
    //
    // TE, the total estimated partition count for a shard S, is defined as
    // TE = Sum(i = 0...N) { Ei / Si }.
    //
    // where i is an input sstable that belongs to shard S,
    //       Ei is the estimated partition count for sstable i,
    //       Si is the total number of shards that own sstable i.
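The formula above can be sketched as follows. The struct and function names are hypothetical, and the use of integer division is an assumption; the message does not say how fractional results are rounded.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one input sstable that belongs to shard S.
struct input_sstable {
    std::uint64_t estimated_partitions; // Ei
    unsigned owner_shards;              // Si, assumed to be >= 1
};

// TE = Sum over inputs of Ei / Si (integer division is an assumption).
std::uint64_t estimate_partitions_for_shard(const std::vector<input_sstable>& inputs) {
    std::uint64_t total = 0;
    for (const auto& sst : inputs) {
        total += sst.estimated_partitions / sst.owner_shards;
    }
    return total;
}
```

For example, two inputs with estimates 1000 (shared by 4 shards) and 600 (shared by 2 shards) contribute 250 + 300 = 550 partitions to the shard's estimate, instead of the 1600 the old behaviour would have fed to the bloom filter.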

Fixes #2672.
Refs #3302.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180423151001.9995-1-raphaelsc@scylladb.com>
(cherry picked from commit 11940ca39e)
2018-04-24 12:53:34 +03:00
Takuya ASADA
ff70d9f15c dist: Drop AmbientCapabilities from scylla-server.service for Debian 8
Debian 8 fails with "Invalid argument" when AmbientCapabilities is used in a
systemd unit file, so drop the line when we build the .deb package for Debian 8.
For other distributions, keep using the feature.

Fixes #3344

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180423102041.2138-1-syuu@scylladb.com>
(cherry picked from commit 7b92c3fd3f)
2018-04-24 12:53:34 +03:00
Avi Kivity
9bbd5821a2 Update scylla-ami submodule
* dist/ami/files/scylla-ami 9b4be70...02b1853 (1):
  > scylla_install_ami: remove the host id file after scylla_setup
2018-04-24 12:53:34 +03:00
Avi Kivity
a7841f1f2e release: prepare for 2.2.rc0
2018-04-18 11:08:43 +03:00
Takuya ASADA
84859e0745 dist/debian: use ~root as HOME to place .pbuilderrc
When 'always_set_home' is specified in /etc/sudoers, pbuilder won't read
.pbuilderrc from the current user's home directory, and we don't have a way to
change that behavior from a sudo command parameter.

So let's use ~root/.pbuilderrc and switch to HOME=/root when sudo is executed;
this works both in environments that specify always_set_home and in those
that don't.

Fixes #3366

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1523926024-3937-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit ace44784e8)
2018-04-17 09:38:43 +03:00
Avi Kivity
6b74e1f02d Update seastar submodule
* seastar bcfbe0c...491f994 (3):
  > tls: Ensure we always pass through semaphores on shutdown
  > cpu scheduler: don't penalize first group to run
  > reactor: fix sleep mode

Fixes #3350.
2018-04-14 20:44:11 +03:00
Avi Kivity
520f17b315 Point seastar submodule at scylla-seastar.git
This allows backporting seastar patches.
2018-04-14 20:43:28 +03:00
Gleb Natapov
9fe3d04f31 cql_server: fix a race between closing of a connection and notifier registration
There is a race between cql connection closure and notifier
registration. If a connection is closed before notification registration
is complete, a stale pointer to the connection will remain in the notification
list, since the attempt to unregister the connection will happen too early.
The fix is to move notifier unregistration to after the connection's gate
is closed, which ensures that there is no outstanding registration
request. But this means that a connection with a closed gate can now be in the
notifier list, so with_gate() may throw and abort a notifier loop. Fix
that by replacing with_gate() with a call to is_closed().

Fixes: #3355
Tests: unit(release)

Message-Id: <20180412134744.GB22593@scylladb.com>
(cherry picked from commit 1a9aaece3e)
2018-04-12 16:57:07 +03:00
Raphael S. Carvalho
a74183eb1e sstables/compaction_manager: do not break lcs invariant by not allowing parallel compaction for it
After the change to serialize compaction on compaction weight (eff62bc61e),
the LCS invariant may break because parallel compaction can start, and it's
not currently supported for LCS.

The condition is that the weight is deregistered right before the last sstable
for a leveled compaction is sealed, so a new compaction may meanwhile start
for the same column family and promote an sstable to
an overlapping token range.

That leads to the strategy restoring the invariant when it finds the overlap,
which wastes resources.
The fix removes a fast-path check that is now incorrect because
we release the weight early, and also fixes a check for ongoing compaction
that prevented compaction from starting for LCS whenever the weight tracker
was not empty.

Fixes #3279.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180410034538.30486-1-raphaelsc@scylladb.com>
(cherry picked from commit 638a647b7d)
2018-04-10 20:59:48 +03:00
Raphael S. Carvalho
e059f17bf2 database: make sure sstable is also forwarded to shard responsible for its generation
After f59f423f3c, sstable is loaded only at shards
that own it so as to reduce the sstable load overhead.

The problem is that a sstable may no longer be forwarded to a shard that needs to
be aware of its existence which would result in that sstable generation being
reallocated for a write request.
That would result in a failure as follows:
"SSTable write failed due to existence of TOC file for generation..."

This can be fixed by forwarding any sstable at load to all its owner shards
*and* the shard responsible for its generation, which is determined as follows:
s = generation % smp::count
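A small sketch of the forwarding rule, with `shard_count` standing in for `smp::count`. The function names are hypothetical and the owner set is passed in rather than computed from the sstable.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Shard responsible for a generation: s = generation % shard count.
unsigned generation_shard(std::uint64_t generation, unsigned shard_count) {
    return static_cast<unsigned>(generation % shard_count);
}

// Shards that must learn about an sstable at load time:
// all owner shards plus the shard responsible for its generation.
std::set<unsigned> shards_to_load(const std::set<unsigned>& owners,
                                  std::uint64_t generation,
                                  unsigned shard_count) {
    std::set<unsigned> targets = owners;
    targets.insert(generation_shard(generation, shard_count));
    return targets;
}
```

With 4 shards, an sstable of generation 7 owned by shards 0 and 1 is also forwarded to shard 3, so that shard will not hand out generation 7 again for a new write.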

Fixes #3273.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180405035245.30194-1-raphaelsc@scylladb.com>
(cherry picked from commit 30b6c9b4cd)
2018-04-05 10:58:29 +03:00
Duarte Nunes
0e8e005357 db/view: Reject view entries with non-composite, empty partition key
Empty partition keys are not supported on normal tables - they cannot
be inserted or queried (surprisingly, the rules for composite
partition keys are different: all components are then allowed to be
empty). However, the (non-composite) partition key of a view could end
up being empty if that column is: a base table regular column, a
base table clustering key column, or a base table partition key column,
part of a composite key.

Fixes #3262
Refs CASSANDRA-14345

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180403122244.10626-1-duarte@scylladb.com>
(cherry picked from commit ec8960df45)
2018-04-03 17:20:33 +03:00
Glauber Costa
8bf6f39392 docker: default docker to overprovisioned mode.
By default, overprovisioned is not enabled on docker unless it is
explicitly set. I have come to believe that this is a mistake.

If the user is running alone in the machine, and there are no other
processes pinned anywhere - including interrupts - not running
overprovisioned is the best choice.

But everywhere else, it is not: even if a user runs 2 docker containers
in the same machine and statically partitions CPUs with --smp (but
without cpuset) the docker containers will pin themselves to the same
sets of CPU, as they are totally unaware of each other.

It is also very common, especially in some virtualized environments, for
interrupts not to be properly distributed - being particularly keen on
being delivered on CPU0, a CPU which Scylla will pin by default.

Lastly, environments like Kubernetes simply don't support pinning at the
moment.

This patch enables the overprovisioned flag if it is explicitly set -
like we did before - but also by default unless --cpuset is set.
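The resulting decision can be sketched as a small predicate. The function name and parameter types are hypothetical; the real logic lives in the Docker entrypoint scripts, not in C++.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Returns true when the overprovisioned flag should be passed to Scylla.
// explicit_flag: the user asked for overprovisioned mode.
// cpuset: the value of --cpuset, if the user gave one.
bool use_overprovisioned(bool explicit_flag,
                         const std::optional<std::string>& cpuset) {
    if (explicit_flag) {
        return true;             // same behaviour as before the patch
    }
    return !cpuset.has_value();  // new default: on unless --cpuset is set
}
```

A user who pins CPUs with --cpuset keeps the previous behaviour; everyone else gets overprovisioned mode by default.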

Fixes #3336.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20180331142131.842-1-glauber@scylladb.com>
(cherry picked from commit ef84780c27)
2018-04-02 17:07:20 +03:00
Glauber Costa
04ba51986e parse and ignore background writer controller
Unused options are not exposed as command line options and will prevent
Scylla from booting when present, although they can still be passed over
YAML, for Cassandra compatibility.

That has never been a problem, but we have been adding options to i3
(and others) that are now deprecated, but were previously marked as
Used. Systems with those options may have issues upgrading.

While this problem is common to all Unused options, the likelihood for
any other unused option to appear in the command line is near zero,
except for those two - since we put them there ourselves.

There are two ways to handle this issue:

1) Mark them as Used, and just ignore them.
2) Add them explicitly to boost program options, and then ignore them.

The second option is preferred here, because we can add them as hidden
options in program_options, meaning they won't show up in the help. We
can then just print a discreet message saying that those options are
from now on ignored.

v2: mark set as const (Botond)
v3: rebase on top of master, indentation suggested by Duarte.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20180329145517.8462-1-glauber@scylladb.com>
(cherry picked from commit a9ef72537f)
2018-03-29 17:57:43 +03:00
Asias He
1d5379c462 gossip: Relax generation max difference check
start node 1 2 3
shutdown node2
shutdown node1 and node3
start node1 and node3
nodetool removenode node2
clean up all scylla data on node2
bootstrap node2 as a new node

I saw that node2 could not bootstrap; it was stuck forever waiting for schema information to complete:

On node1, node3

    [shard 0] gossip - received an invalid gossip generation for peer 127.0.0.2; local generation = 2, received generation = 1521779704

On node2

    [shard 0] storage_service - JOINING: waiting for schema information to complete

This is because in the nodetool removenode operation, the generation of node1 was increased from 0 to 2.

   gossiper::advertise_removing () calls eps.get_heart_beat_state().force_newer_generation_unsafe();
   gossiper::advertise_token_removed() calls eps.get_heart_beat_state().force_newer_generation_unsafe();

Each force_newer_generation_unsafe increases the generation by 1.

Here is an example,

Before nodetool removenode:
```
curl -X GET --header "Accept: application/json" "http://127.0.0.1:10000/failure_detector/endpoints/" | python -mjson.tool
   {
   "addrs": "127.0.0.2",
   "generation": 0,
   "is_alive": false,
   "update_time": 1521778757334,
   "version": 0
   },
```

After nodetool removenode:
```
curl -X GET --header "Accept: application/json" "http://127.0.0.1:10000/failure_detector/endpoints/" | python -mjson.tool
 {
     "addrs": "127.0.0.2",
     "application_state": [
         {
             "application_state": 0,
             "value": "removed,146b52d5-dc94-4e35-b7d4-4f64be0d2672,1522038476246",
             "version": 214
         },
         {
             "application_state": 6,
             "value": "REMOVER,14ecc9b0-4b88-4ff3-9c96-38505fb4968a",
             "version": 153
         }
     ],
     "generation": 2,
     "is_alive": false,
     "update_time": 1521779276246,
     "version": 0
 },
```

In gossiper::apply_state_locally, we have this check:

```
if (local_generation != 0 && remote_generation > local_generation + MAX_GENERATION_DIFFERENCE) {
    // assume some peer has corrupted memory and is broadcasting an unbelievable generation about another peer (or itself)
    logger.warn("received an invalid gossip generation for peer {}; local generation = {}, received generation = {}", ep, local_generation, remote_generation);
}
```
to skip the gossip update.

To fix, we relax generation max difference check to allow the generation
of a removed node.
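To make the failure concrete, the check can be modelled as a pure function. This is a sketch only: the constant value, the function names, and the `forced_bumps` threshold are assumptions chosen for illustration, and the relaxed variant shows one possible relaxation, not necessarily the one the patch implements.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value (one year in seconds); illustration only.
constexpr std::int64_t max_generation_difference = 86400LL * 365;

// The check as quoted above: any non-zero local generation is trusted.
bool strict_rejects(std::int64_t local_generation, std::int64_t remote_generation) {
    return local_generation != 0 &&
           remote_generation > local_generation + max_generation_difference;
}

// Hypothetical relaxation: a local generation that could only have been
// produced by forced bumps (each adds 1) is treated like an unset one.
bool relaxed_rejects(std::int64_t local_generation, std::int64_t remote_generation,
                     std::int64_t forced_bumps = 2) {
    return local_generation > forced_bumps &&
           remote_generation > local_generation + max_generation_difference;
}
```

With the values from the log above (local generation 2, received generation 1521779704), the strict check rejects the update, while the relaxed sketch accepts it.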

After this patch, the removed node bootstraps successfully.

Tests: dtest:update_cluster_layout_tests.py
Fixes #3331

Message-Id: <678fb60f6b370d3ca050c768f705a8f2fd4b1287.1522289822.git.asias@scylladb.com>
(cherry picked from commit f539e993d3)
2018-03-29 12:10:09 +03:00
Avi Kivity
cb5dc56bfd Update scylla-ami submodule
Ref #3332.
2018-03-29 10:35:54 +03:00
Duarte Nunes
b578b492cd column_family: Don't retry flushing memtable if shutdown is requested
Since we just keep retrying, this can cause Scylla to not shut down for
a while.

The data will be safe in the commit log.

Note that this patch doesn't fix the issue when shutdown goes through
storage_service::drain_on_shutdown - more work is required to handle
that case.

Ref #3318.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180324140822.3743-3-duarte@scylladb.com>
(cherry picked from commit a985ea0fcb)
2018-03-26 15:26:56 +03:00
Duarte Nunes
30c950a7f6 column_family: Increase scope of exception handling when flushing a memtable
In column_family::try_flush_memtable_to_sstable, the handle_exception()
block is on the inside of the continuations to
write_memtable_to_sstable(), which, if it fails, will leave the
sstable in the compaction_backlog_tracker::_ongoing_writes map, which
will waste disk space, and that sstable will map to a dangling pointer
to a destroyed database_sstable_write_monitor, which causes a seg
fault when accessed (for example, through the backlog_controller,
which accounts the _ongoing_writes when calculating the backlog).

Fix this by increasing the scope of handle_exception().

Fixes #3315

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180324140822.3743-2-duarte@scylladb.com>
(cherry picked from commit 50ad37d39b)
2018-03-26 15:26:54 +03:00
Duarte Nunes
f0d1e9c518 backlog_controller: Stop update timer
On database shutdown, this timer can cause use-after-free errors if
not stopped.

Refs #3315

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180324140822.3743-1-duarte@scylladb.com>
(cherry picked from commit b7bd9b8058)
2018-03-26 15:26:52 +03:00
Avi Kivity
597aeca93d Merge "Bug fixes for access-control, and finalizing roles" from Jesse
"
This series does not add or change any features of access-control and
roles, but addresses some bugs and finalizes the switch to roles.

"auth: Wait for schema agreement" and the patch prior help avoid false
negatives for integration tests and error messages in logs.

"auth: Remove ordering dependence" fixes an important bug in `auth` that
could leave the default superuser in a corrupted state when it is first
created.

Since roles are feature-complete (to the best of the author's knowledge
as of this writing), the final patch in the series removes any warnings
about them being unimplemented.

Tests: unit (release), dtest (PENDING)
"

* 'jhk/auth_fixes/v1' of https://github.com/hakuch/scylla:
  Roles are implemented
  auth: Increase delay before background tasks start
  auth: Remove ordering dependence
  auth: Don't warn on rescheduled task
  auth: Wait for schema agreement
  Single-node clusters can agree on schema

(cherry picked from commit 999df41a49)
2018-03-26 12:37:41 +03:00
Duarte Nunes
1a94b90a4d Merge 'Grant default permissions' from Jesse
The functional change in this series is in the last patch
("auth: Grant all permissions to object creator").

The first patch addresses `const` correctness in `auth`. This change
allowed the new code added in the last patch to be written with the
correct `const` specifiers, and also some code to be removed.

The second-to-last patch addresses error-handling in the authorizer for
unsupported operations and is a prerequisite for the last patch (since
we now always grant permissions for new database objects).

Tests: unit (release)

* 'jhk/default_permissions/v3' of https://github.com/hakuch/scylla:
  auth: Grant all permissions to object creator
  auth: Unify handling for unsupported errors
  auth: Fix life-time issue with parameter
  auth: Fix `const` correctness

(cherry picked from commit 934d805b4b)
2018-03-26 12:37:35 +03:00
Avi Kivity
acdd42c7c8 Merge "Fix abort during counter table read-on-delete" from Tomasz
"
This fixes an abort in an sstable reader when querying a partition with no
clustering ranges (happens on counter table mutation with no live rows) which
also doesn't have any static columns. In such case, the
sstable_mutation_reader will setup the data_consume_context such that it only
covers the static row of the partition, knowing that there is no need to read
any clustered rows. See partition.cc::advance_to_upper_bound(). Later when
the reader is done with the range for the static row, it will try to skip to
the first clustering range (missing in this case). If clustering_ranges_walker
tells us to skip to after_all_clustering_rows(), we will hit an assert inside
continuous_data_consumer::fast_forward_to() due to attempt to skip past the
original data file range. If clustering_ranges_walker returns
before_all_clustering_rows() instead, all is fine because we're still at the
same data file position.

Fixes #3304.
"

* 'tgrabiec/fix-counter-read-no-static-columns' of github.com:scylladb/seastar-dev:
  tests: mutation_source_test: Test reads with no clustering ranges and no static columns
  tests: simple_schema: Allow creating schema with no static column
  clustering_ranges_walker: Stop after static row in case no clustering ranges

(cherry picked from commit 054854839a)
2018-03-22 18:13:29 +02:00
Takuya ASADA
bd4f658555 scripts/scylla_install_pkg: follow redirection of specified repo URL
We should follow redirections in curl, just like a normal web browser does.
Fixes #3312

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1521712056-301-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit bef08087e1)
2018-03-22 12:56:58 +02:00
Vladimir Krivopalov
a983ba7aad perf_fast_forward: fix error in date formatting
Instead of 'month', 'minutes' has been used.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <1e005ecaa992d8205ca44ea4eebbca4621ad9886.1521659341.git.vladimir@scylladb.com>
(cherry picked from commit 3010b637c9)
2018-03-22 12:56:56 +02:00
Duarte Nunes
0a561fc326 gms/gossiper: Synchronize endpoint state destruction
In gossiper::handle_major_state_change() we set the endpoint_state for
a particular endpoint and replicate the changes to other cores.

This is totally unsynchronized with the execution of
gossiper::evict_from_membership(), which can happen concurrently, and
can remove the very same endpoint from the map (in all cores).

Replicating the changes to other cores in handle_major_state_change()
can interleave with replicating the changes to other cores in
evict_from_membership(), and result in an undefined final state.

Another issue happened in debug mode dtests, where a fiber executes
handle_major_state_change(), calls into the subscribers, of which
storage_service is one, and ultimately lands on
storage_service::update_peer_info(), which iterates over the
endpoint's application state with deferring points in between (to
update a system table). gossiper::evict_from_membership() was executed
concurrently by another fiber, which freed the state the first one is
iterating over.

Fixes #3299.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180318123211.3366-1-duarte@scylladb.com>
(cherry picked from commit 810db425a5)
2018-03-18 14:54:54 +02:00
Takuya ASADA
1f10549056 dist/redhat: build only scylla, iotune
Since we don't package tests, we don't need to build them.
It reduces package building time.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1521066363-4859-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 1bb3531b90)
2018-03-15 10:48:36 +02:00
Takuya ASADA
c2a2560ea3 dist/debian: use 3rdparty ppa on Ubuntu 18.04
Currently Ubuntu 18.04 uses the distribution-provided g++ and boost, but it's
easier to maintain the Scylla package if it is built with the same
toolchain/library versions everywhere, so switch to the 3rdparty ppa.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1521075576-12064-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 945e6ec4f6)
2018-03-15 10:48:31 +02:00
Takuya ASADA
237e36a0b4 dist/ami: update CentOS base image to latest version
Since we require an updated version of systemd, we need to update the CentOS
base image.

Fixes #3184

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1518118694-23770-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 69d226625a)
2018-03-15 10:47:54 +02:00
Takuya ASADA
e78c137bfc dist/redhat: switch to gcc-7.3
We have hit the following bug in the debug-mode binary:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82560
Since it's fixed in gcc-7.3, we need to upgrade our gcc package.

See: https://groups.google.com/d/topic/scylladb-dev/RIdIpqMeTog/discussion
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1521064473-17906-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 856dc0a636)
2018-03-15 10:39:40 +02:00
Avi Kivity
fb99a7c902 Merge "Ubuntu/Debian build error fixes" from Takuya
* 'debian-ubuntu-build-fixes-v2' of https://github.com/syuu1228/scylla:
  dist/debian: build only scylla, iotune
  dist/debian: switch to boost-1.65
  dist/debian: switch to gcc-7.3

(cherry picked from commit bb4b1f0e91)
2018-03-14 22:51:44 +02:00
1006 changed files with 13590 additions and 39505 deletions

View File

@@ -1,9 +1,3 @@
This is Scylla's bug tracker, to be used for reporting bugs only.
If you have a question about Scylla, and not a bug, please ask it in
our mailing-list at scylladb-dev@googlegroups.com or in our slack channel.
- [] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.
*Installation details*
Scylla version (or git commit hash):
Cluster size:

View File

@@ -1,4 +0,0 @@
Scylla doesn't use pull-requests, please send a patch to the [mailing list](mailto:scylladb-dev@googlegroups.com) instead.
See our [contributing guidelines](../CONTRIBUTING.md) and our [Scylla development guidelines](../HACKING.md) for more information.
If you have any questions please don't hesitate to send a mail to the [dev list](mailto:scylladb-dev@googlegroups.com).

1
.gitignore vendored
View File

@@ -18,4 +18,3 @@ CMakeLists.txt.user
*.egg-info
__pycache__CMakeLists.txt.user
.gdbinit
resources

2
.gitmodules vendored
View File

@@ -1,6 +1,6 @@
[submodule "seastar"]
path = seastar
url = ../seastar
url = ../scylla-seastar
ignore = dirty
[submodule "swagger-ui"]
path = swagger-ui

View File

@@ -1,6 +1,6 @@
#!/bin/sh
VERSION=666.development
VERSION=2.2.2
if test -f version
then

View File

@@ -455,7 +455,7 @@
"operations":[
{
"method":"GET",
"summary":"Returns a list of sstable filenames that contain the given partition key on this node",
"summary":"Returns a list of filenames that contain the given key on this node",
"type":"array",
"items":{
"type":"string"
@@ -475,7 +475,7 @@
},
{
"name":"key",
"description":"The partition key. In a composite-key scenario, use ':' to separate the columns in the key.",
"description":"The key",
"required":true,
"allowMultiple":false,
"type":"string",

View File

@@ -1,30 +0,0 @@
"/v2/config/{id}": {
"get": {
"description": "Return a config value",
"operationId": "find_config_id",
"produces": [
"application/json"
],
"tags": ["config"],
"parameters": [
{
"name": "id",
"in": "path",
"description": "ID of config to return",
"required": true,
"type": "string"
}
],
"responses": {
"200": {
"description": "Config value"
},
"default": {
"description": "unexpected error",
"schema": {
"$ref": "#/definitions/ErrorModel"
}
}
}
}
}

View File

@@ -2129,41 +2129,6 @@
]
}
]
},
{
"path":"/storage_service/view_build_statuses/{keyspace}/{view}",
"operations":[
{
"method":"GET",
"summary":"Gets the progress of a materialized view build",
"type":"array",
"items":{
"type":"mapper"
},
"nickname":"view_build_statuses",
"produces":[
"application/json"
],
"parameters":[
{
"name":"keyspace",
"description":"The keyspace",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"view",
"description":"View name",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
}
]
}
],
"models":{
@@ -2228,11 +2193,11 @@
"description":"The column family"
},
"total":{
"type":"int",
"type":"long",
"description":"The total snapshot size"
},
"live":{
"type":"int",
"type":"long",
"description":"The live snapshot size"
}
}

View File

@@ -39,7 +39,6 @@
#include "http/exception.hh"
#include "stream_manager.hh"
#include "system.hh"
#include "api/config.hh"
namespace api {
@@ -66,7 +65,6 @@ future<> set_server_init(http_context& ctx) {
rb->set_api_doc(r);
rb02->set_api_doc(r);
rb02->register_api_file(r, "swagger20_header");
set_config(rb02, ctx, r);
rb->register_function(r, "system",
"The system related API");
set_system(ctx, r);

View File

@@ -429,7 +429,7 @@ void set_column_family(http_context& ctx, routes& r) {
return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
utils::estimated_histogram res(0);
for (auto i: *cf.get_sstables() ) {
res.merge(i->get_stats_metadata().estimated_cells_count);
res.merge(i->get_stats_metadata().estimated_column_count);
}
return res;
},
@@ -905,20 +905,5 @@ void set_column_family(http_context& ctx, routes& r) {
return make_ready_future<json::json_return_type>(res);
});
});
cf::get_sstables_for_key.set(r, [&ctx](std::unique_ptr<request> req) {
auto key = req->get_query_param("key");
auto uuid = get_uuid(req->param["name"], ctx.db.local());
return ctx.db.map_reduce0([key, uuid] (database& db) {
return db.find_column_family(uuid).get_sstables_by_partition_key(key);
}, std::unordered_set<sstring>(),
[](std::unordered_set<sstring> a, std::unordered_set<sstring>&& b) mutable {
a.insert(b.begin(),b.end());
return a;
}).then([](const std::unordered_set<sstring>& res) {
return make_ready_future<json::json_return_type>(container_to_vec(res));
});
});
}
}

View File

@@ -24,7 +24,6 @@
#include "api.hh"
#include "api/api-doc/column_family.json.hh"
#include "database.hh"
#include <any>
namespace api {
@@ -38,15 +37,9 @@ template<class Mapper, class I, class Reducer>
future<I> map_reduce_cf_raw(http_context& ctx, const sstring& name, I init,
Mapper mapper, Reducer reducer) {
auto uuid = get_uuid(name, ctx.db.local());
using mapper_type = std::function<std::any (database&)>;
using reducer_type = std::function<std::any (std::any, std::any)>;
return ctx.db.map_reduce0(mapper_type([mapper, uuid](database& db) {
return I(mapper(db.find_column_family(uuid)));
}), std::any(std::move(init)), reducer_type([reducer = std::move(reducer)] (std::any a, std::any b) mutable {
return I(reducer(std::any_cast<I>(std::move(a)), std::any_cast<I>(std::move(b))));
})).then([] (std::any r) {
return std::any_cast<I>(std::move(r));
});
return ctx.db.map_reduce0([mapper, uuid](database& db) {
return mapper(db.find_column_family(uuid));
}, init, reducer);
}
@@ -58,42 +51,35 @@ future<json::json_return_type> map_reduce_cf(http_context& ctx, const sstring& n
});
}
template<class Mapper, class I, class Reducer, class Result>
future<I> map_reduce_cf_raw(http_context& ctx, const sstring& name, I init,
Mapper mapper, Reducer reducer, Result result) {
auto uuid = get_uuid(name, ctx.db.local());
return ctx.db.map_reduce0([mapper, uuid](database& db) {
return mapper(db.find_column_family(uuid));
}, init, reducer);
}
template<class Mapper, class I, class Reducer, class Result>
future<json::json_return_type> map_reduce_cf(http_context& ctx, const sstring& name, I init,
Mapper mapper, Reducer reducer, Result result) {
return map_reduce_cf_raw(ctx, name, init, mapper, reducer).then([result](const I& res) mutable {
return map_reduce_cf_raw(ctx, name, init, mapper, reducer, result).then([result](const I& res) mutable {
result = res;
return make_ready_future<json::json_return_type>(result);
});
}
struct map_reduce_column_families_locally {
std::any init;
std::function<std::any (column_family&)> mapper;
std::function<std::any (std::any, std::any)> reducer;
std::any operator()(database& db) const {
template<class Mapper, class I, class Reducer>
future<I> map_reduce_cf_raw(http_context& ctx, I init,
Mapper mapper, Reducer reducer) {
return ctx.db.map_reduce0([mapper, init, reducer](database& db) {
auto res = init;
for (auto i : db.get_column_families()) {
res = reducer(res, mapper(*i.second.get()));
}
return res;
}
};
template<class Mapper, class I, class Reducer>
future<I> map_reduce_cf_raw(http_context& ctx, I init,
Mapper mapper, Reducer reducer) {
using mapper_type = std::function<std::any (column_family&)>;
using reducer_type = std::function<std::any (std::any, std::any)>;
auto wrapped_mapper = mapper_type([mapper = std::move(mapper)] (column_family& cf) mutable {
return I(mapper(cf));
});
auto wrapped_reducer = reducer_type([reducer = std::move(reducer)] (std::any a, std::any b) mutable {
return I(reducer(std::any_cast<I>(std::move(a)), std::any_cast<I>(std::move(b))));
});
return ctx.db.map_reduce0(map_reduce_column_families_locally{init, std::move(wrapped_mapper), wrapped_reducer}, std::any(init), wrapped_reducer).then([] (std::any res) {
return std::any_cast<I>(std::move(res));
});
}, init, reducer);
}
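The two-level fold that the no-name `map_reduce_cf_raw` overload above performs (reduce over every column family inside a shard, then reduce the per-shard results) can be sketched without Seastar. This is an illustrative sketch only: `table_stats`, `shard`, and `map_reduce_tables` are hypothetical names, and a plain loop over a vector stands in for `ctx.db.map_reduce0`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins: a "shard" owns a few tables keyed by name.
struct table_stats {
    int64_t live_bytes;
};
using shard = std::map<std::string, table_stats>;

// Local step: fold mapper(table) over all tables of one shard, starting
// from init. Global step: fold the per-shard results, again from init.
template <class Mapper, class I, class Reducer>
I map_reduce_tables(const std::vector<shard>& shards, I init,
                    Mapper mapper, Reducer reducer) {
    I total = init;
    for (const auto& s : shards) {
        I local = init;
        for (const auto& entry : s) {
            local = reducer(local, mapper(entry.second));
        }
        total = reducer(total, local);
    }
    return total;
}
```

Because `init` seeds both the per-shard fold and the cross-shard fold, it has to be an identity element for `reducer` (0 for addition, an empty histogram for a merge); otherwise it is counted once per shard plus once more.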

View File

@@ -1,112 +0,0 @@
/*
* Copyright 2018 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "api/config.hh"
#include "api/api-doc/config.json.hh"
#include "db/config.hh"
#include <sstream>
#include <boost/algorithm/string/replace.hpp>
namespace api {
template<class T>
json::json_return_type get_json_return_type(const T& val) {
return json::json_return_type(val);
}
/*
* As noted on db::seed_provider_type, it is not used
* and probably never will be.
*
* Just in case, we return its name.
*/
template<>
json::json_return_type get_json_return_type(const db::seed_provider_type& val) {
return json::json_return_type(val.class_name);
}
std::string format_type(const std::string& type) {
if (type == "int") {
return "integer";
}
return type;
}
future<> get_config_swagger_entry(const std::string& name, const std::string& description, const std::string& type, bool& first, output_stream<char>& os) {
std::stringstream ss;
if (first) {
first=false;
} else {
ss <<',';
};
ss << "\"/config/" << name <<"\": {"
"\"get\": {"
"\"description\": \"" << boost::replace_all_copy(boost::replace_all_copy(boost::replace_all_copy(description,"\n","\\n"),"\"", "''"), "\t", " ") <<"\","
"\"operationId\": \"find_config_"<< name <<"\","
"\"produces\": ["
"\"application/json\""
"],"
"\"tags\": [\"config\"],"
"\"parameters\": ["
"],"
"\"responses\": {"
"\"200\": {"
"\"description\": \"Config value\","
"\"schema\": {"
"\"type\": \"" << format_type(type) << "\""
"}"
"},"
"\"default\": {"
"\"description\": \"unexpected error\","
"\"schema\": {"
"\"$ref\": \"#/definitions/ErrorModel\""
"}"
"}"
"}"
"}"
"}";
return os.write(ss.str());
}
namespace cs = httpd::config_json;
#define _get_config_value(name, type, deflt, status, desc, ...) if (id == #name) {return get_json_return_type(ctx.db.local().get_config().name());}
#define _get_config_description(name, type, deflt, status, desc, ...) f = f.then([&os, &first] {return get_config_swagger_entry(#name, desc, #type, first, os);});
void set_config(std::shared_ptr < api_registry_builder20 > rb, http_context& ctx, routes& r) {
rb->register_function(r, [] (output_stream<char>& os) {
return do_with(true, [&os] (bool& first) {
auto f = make_ready_future();
_make_config_values(_get_config_description)
return f;
});
});
cs::find_config_id.set(r, [&ctx] (const_req r) {
auto id = r.param["id"];
_make_config_values(_get_config_value)
throw bad_param_exception(sstring("No such config entry: ") + id);
});
}
}

View File

@@ -1,30 +0,0 @@
/*
* Copyright (C) 2018 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "api.hh"
#include <seastar/http/api_docs.hh>
namespace api {
void set_config(std::shared_ptr<api_registry_builder20> rb, http_context& ctx, routes& r);
}

View File

@@ -852,15 +852,6 @@ void set_storage_service(http_context& ctx, routes& r) {
return make_ready_future<json::json_return_type>(map_to_key_value(ownership, res));
});
});
ss::view_build_statuses.set(r, [&ctx] (std::unique_ptr<request> req) {
auto keyspace = validate_keyspace(ctx, req->param);
auto view = req->param["view"];
return service::get_local_storage_service().view_build_statuses(std::move(keyspace), std::move(view)).then([] (std::unordered_map<sstring, sstring> status) {
std::vector<storage_service_json::mapper> res;
return make_ready_future<json::json_return_type>(map_to_key_value(std::move(status), res));
});
});
}
}

View File

@@ -1,239 +0,0 @@
/*
* Copyright (C) 2018 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "atomic_cell.hh"
#include "atomic_cell_or_collection.hh"
#include "types.hh"
/// LSA migrator for cells with irrelevant type
///
///
const data::type_imr_descriptor& no_type_imr_descriptor() {
static thread_local data::type_imr_descriptor state(data::type_info::make_variable_size());
return state;
}
atomic_cell atomic_cell::make_dead(api::timestamp_type timestamp, gc_clock::time_point deletion_time) {
auto& imr_data = no_type_imr_descriptor();
return atomic_cell(
imr_data.type_info(),
imr_object_type::make(data::cell::make_dead(timestamp, deletion_time), &imr_data.lsa_migrator())
);
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, bytes_view value, atomic_cell::collection_member cm) {
auto& imr_data = type.imr_state();
return atomic_cell(
imr_data.type_info(),
imr_object_type::make(data::cell::make_live(imr_data.type_info(), timestamp, value, bool(cm)), &imr_data.lsa_migrator())
);
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, ser::buffer_view<bytes_ostream::fragment_iterator> value, atomic_cell::collection_member cm) {
auto& imr_data = type.imr_state();
return atomic_cell(
imr_data.type_info(),
imr_object_type::make(data::cell::make_live(imr_data.type_info(), timestamp, value, bool(cm)), &imr_data.lsa_migrator())
);
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, bytes_view value,
gc_clock::time_point expiry, gc_clock::duration ttl, atomic_cell::collection_member cm) {
auto& imr_data = type.imr_state();
return atomic_cell(
imr_data.type_info(),
imr_object_type::make(data::cell::make_live(imr_data.type_info(), timestamp, value, expiry, ttl, bool(cm)), &imr_data.lsa_migrator())
);
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, ser::buffer_view<bytes_ostream::fragment_iterator> value,
gc_clock::time_point expiry, gc_clock::duration ttl, atomic_cell::collection_member cm) {
auto& imr_data = type.imr_state();
return atomic_cell(
imr_data.type_info(),
imr_object_type::make(data::cell::make_live(imr_data.type_info(), timestamp, value, expiry, ttl, bool(cm)), &imr_data.lsa_migrator())
);
}
atomic_cell atomic_cell::make_live_counter_update(api::timestamp_type timestamp, int64_t value) {
auto& imr_data = no_type_imr_descriptor();
return atomic_cell(
imr_data.type_info(),
imr_object_type::make(data::cell::make_live_counter_update(timestamp, value), &imr_data.lsa_migrator())
);
}
atomic_cell atomic_cell::make_live_uninitialized(const abstract_type& type, api::timestamp_type timestamp, size_t size) {
auto& imr_data = no_type_imr_descriptor();
return atomic_cell(
imr_data.type_info(),
imr_object_type::make(data::cell::make_live_uninitialized(imr_data.type_info(), timestamp, size), &imr_data.lsa_migrator())
);
}
static imr::utils::object<data::cell::structure> copy_cell(const data::type_imr_descriptor& imr_data, const uint8_t* ptr)
{
using imr_object_type = imr::utils::object<data::cell::structure>;
// If the cell doesn't own any memory it is trivial and can be copied with
// memcpy.
auto f = data::cell::structure::get_member<data::cell::tags::flags>(ptr);
if (!f.template get<data::cell::tags::external_data>()) {
data::cell::context ctx(f, imr_data.type_info());
// XXX: We may be better off storing the total cell size in memory. Measure!
auto size = data::cell::structure::serialized_object_size(ptr, ctx);
return imr_object_type::make_raw(size, [&] (uint8_t* dst) noexcept {
std::copy_n(ptr, size, dst);
}, &imr_data.lsa_migrator());
}
return imr_object_type::make(data::cell::copy_fn(imr_data.type_info(), ptr), &imr_data.lsa_migrator());
}
atomic_cell::atomic_cell(const abstract_type& type, atomic_cell_view other)
: atomic_cell(type.imr_state().type_info(),
copy_cell(type.imr_state(), other._view.raw_pointer()))
{ }
atomic_cell_or_collection atomic_cell_or_collection::copy(const abstract_type& type) const {
if (!_data.get()) {
return atomic_cell_or_collection();
}
auto& imr_data = type.imr_state();
return atomic_cell_or_collection(
copy_cell(imr_data, _data.get())
);
}
atomic_cell_or_collection::atomic_cell_or_collection(const abstract_type& type, atomic_cell_view acv)
: _data(copy_cell(type.imr_state(), acv._view.raw_pointer()))
{
}
static collection_mutation_view get_collection_mutation_view(const uint8_t* ptr)
{
auto f = data::cell::structure::get_member<data::cell::tags::flags>(ptr);
auto ti = data::type_info::make_collection();
data::cell::context ctx(f, ti);
auto view = data::cell::structure::get_member<data::cell::tags::cell>(ptr).as<data::cell::tags::collection>(ctx);
auto dv = data::cell::variable_value::make_view(view, f.get<data::cell::tags::external_data>());
return collection_mutation_view { dv };
}
collection_mutation_view atomic_cell_or_collection::as_collection_mutation() const {
return get_collection_mutation_view(_data.get());
}
collection_mutation::collection_mutation(const collection_type_impl& type, collection_mutation_view v)
: _data(imr_object_type::make(data::cell::make_collection(v.data), &type.imr_state().lsa_migrator()))
{
}
collection_mutation::collection_mutation(const collection_type_impl& type, bytes_view v)
: _data(imr_object_type::make(data::cell::make_collection(v), &type.imr_state().lsa_migrator()))
{
}
collection_mutation::operator collection_mutation_view() const
{
return get_collection_mutation_view(_data.get());
}
bool atomic_cell_or_collection::equals(const abstract_type& type, const atomic_cell_or_collection& other) const
{
auto ptr_a = _data.get();
auto ptr_b = other._data.get();
if (!ptr_a || !ptr_b) {
return !ptr_a && !ptr_b;
}
if (type.is_atomic()) {
auto a = atomic_cell_view::from_bytes(type.imr_state().type_info(), _data);
auto b = atomic_cell_view::from_bytes(type.imr_state().type_info(), other._data);
if (a.timestamp() != b.timestamp()) {
return false;
}
if (a.is_live()) {
if (!b.is_live()) {
return false;
}
if (a.is_counter_update()) {
if (!b.is_counter_update()) {
return false;
}
return a.counter_update_value() == b.counter_update_value();
}
if (a.is_live_and_has_ttl()) {
if (!b.is_live_and_has_ttl()) {
return false;
}
if (a.ttl() != b.ttl() || a.expiry() != b.expiry()) {
return false;
}
}
return a.value() == b.value();
}
return a.deletion_time() == b.deletion_time();
} else {
return as_collection_mutation().data == other.as_collection_mutation().data;
}
}
size_t atomic_cell_or_collection::external_memory_usage(const abstract_type& t) const
{
if (!_data.get()) {
return 0;
}
auto ctx = data::cell::context(_data.get(), t.imr_state().type_info());
auto view = data::cell::structure::make_view(_data.get(), ctx);
auto flags = view.get<data::cell::tags::flags>();
size_t external_value_size = 0;
if (flags.get<data::cell::tags::external_data>()) {
if (flags.get<data::cell::tags::collection>()) {
external_value_size = get_collection_mutation_view(_data.get()).data.size_bytes();
} else {
auto cell_view = data::cell::atomic_cell_view(t.imr_state().type_info(), view);
external_value_size = cell_view.value_size();
}
// Add overhead of chunk headers. The last one is a special case.
external_value_size += (external_value_size - 1) / data::cell::maximum_external_chunk_length * data::cell::external_chunk_overhead;
external_value_size += data::cell::external_last_chunk_overhead;
}
return data::cell::structure::serialized_object_size(_data.get(), ctx)
+ imr_object_type::size_overhead + external_value_size;
}
std::ostream& operator<<(std::ostream& os, const atomic_cell_or_collection& c) {
if (!c._data.get()) {
return os << "{ null atomic_cell_or_collection }";
}
using dc = data::cell;
os << "{ ";
if (dc::structure::get_member<dc::tags::flags>(c._data.get()).get<dc::tags::collection>()) {
os << "collection";
} else {
os << "atomic cell";
}
return os << " @" << static_cast<const void*>(c._data.get()) << " }";
}
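The chunk-overhead arithmetic in `external_memory_usage()` above can be checked in isolation. The sketch below mirrors the two `+=` lines; the three constants are hypothetical placeholders, since the real values of `data::cell::maximum_external_chunk_length`, `external_chunk_overhead` and `external_last_chunk_overhead` are not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical values, chosen only to make the arithmetic easy to follow.
constexpr std::size_t max_chunk_length = 8192;
constexpr std::size_t chunk_overhead = 16;
constexpr std::size_t last_chunk_overhead = 8;

// Size of an externally stored value including chunk headers.
// Every full chunk before the last one costs chunk_overhead; the last
// chunk is a special case and costs last_chunk_overhead.
// Precondition: value_size > 0 (otherwise value_size - 1 wraps around).
inline std::size_t external_size_with_overhead(std::size_t value_size) {
    assert(value_size > 0);
    return value_size
        + (value_size - 1) / max_chunk_length * chunk_overhead
        + last_chunk_overhead;
}
```

The `- 1` keeps a value that exactly fills one chunk from being charged for a second, empty chunk: with these placeholder constants a value of 8192 bytes pays only the last-chunk header, while 8193 bytes pays one regular header plus the last-chunk header.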

View File

@@ -30,50 +30,189 @@
#include <cstdint>
#include <iosfwd>
#include <seastar/util/gcc6-concepts.hh>
#include "data/cell.hh"
#include "data/schema_info.hh"
#include "imr/utils.hh"
#include "serializer.hh"
template<typename T, typename Input>
static inline
void set_field(Input& v, unsigned offset, T val) {
reinterpret_cast<net::packed<T>*>(v.begin() + offset)->raw = net::hton(val);
}
class abstract_type;
class collection_type_impl;
template<typename T>
static inline
T get_field(const bytes_view& v, unsigned offset) {
return net::ntoh(*reinterpret_cast<const net::packed<T>*>(v.begin() + offset));
}
using atomic_cell_value_view = data::value_view;
using atomic_cell_value_mutable_view = data::value_mutable_view;
class atomic_cell_or_collection;
/// View of an atomic cell
template<mutable_view is_mutable>
class basic_atomic_cell_view {
protected:
data::cell::basic_atomic_cell_view<is_mutable> _view;
friend class atomic_cell;
/*
* Represents atomic cell layout. Works on serialized form.
*
* Layout:
*
* <live> := <int8_t:flags><int64_t:timestamp>(<int32_t:expiry><int32_t:ttl>)?<value>
* <dead> := <int8_t: 0><int64_t:timestamp><int32_t:deletion_time>
*/
class atomic_cell_type final {
private:
static constexpr int8_t LIVE_FLAG = 0x01;
static constexpr int8_t EXPIRY_FLAG = 0x02; // When present, expiry field is present. Set only for live cells
static constexpr int8_t COUNTER_UPDATE_FLAG = 0x08; // Cell is a counter update.
static constexpr int8_t COUNTER_IN_PLACE_REVERT = 0x10;
static constexpr unsigned flags_size = 1;
static constexpr unsigned timestamp_offset = flags_size;
static constexpr unsigned timestamp_size = 8;
static constexpr unsigned expiry_offset = timestamp_offset + timestamp_size;
static constexpr unsigned expiry_size = 4;
static constexpr unsigned deletion_time_offset = timestamp_offset + timestamp_size;
static constexpr unsigned deletion_time_size = 4;
static constexpr unsigned ttl_offset = expiry_offset + expiry_size;
static constexpr unsigned ttl_size = 4;
friend class counter_cell_builder;
private:
static bool is_counter_update(bytes_view cell) {
return cell[0] & COUNTER_UPDATE_FLAG;
}
static bool is_counter_in_place_revert_set(bytes_view cell) {
return cell[0] & COUNTER_IN_PLACE_REVERT;
}
template<typename BytesContainer>
static void set_counter_in_place_revert(BytesContainer& cell, bool flag) {
cell[0] = (cell[0] & ~COUNTER_IN_PLACE_REVERT) | (flag * COUNTER_IN_PLACE_REVERT);
}
static bool is_live(const bytes_view& cell) {
return cell[0] & LIVE_FLAG;
}
static bool is_live_and_has_ttl(const bytes_view& cell) {
return cell[0] & EXPIRY_FLAG;
}
static bool is_dead(const bytes_view& cell) {
return !is_live(cell);
}
// Can be called on live and dead cells
static api::timestamp_type timestamp(const bytes_view& cell) {
return get_field<api::timestamp_type>(cell, timestamp_offset);
}
template<typename BytesContainer>
static void set_timestamp(BytesContainer& cell, api::timestamp_type ts) {
set_field(cell, timestamp_offset, ts);
}
// Can be called on live cells only
private:
template<typename BytesView>
static BytesView do_get_value(BytesView cell) {
auto expiry_field_size = bool(cell[0] & EXPIRY_FLAG) * (expiry_size + ttl_size);
auto value_offset = flags_size + timestamp_size + expiry_field_size;
cell.remove_prefix(value_offset);
return cell;
}
public:
using pointer_type = std::conditional_t<is_mutable == mutable_view::no, const uint8_t*, uint8_t*>;
static bytes_view value(bytes_view cell) {
return do_get_value(cell);
}
static bytes_mutable_view value(bytes_mutable_view cell) {
return do_get_value(cell);
}
// Can be called on live counter update cells only
static int64_t counter_update_value(bytes_view cell) {
return get_field<int64_t>(cell, flags_size + timestamp_size);
}
// Can be called only when is_dead() is true.
static gc_clock::time_point deletion_time(const bytes_view& cell) {
assert(is_dead(cell));
return gc_clock::time_point(gc_clock::duration(
get_field<int32_t>(cell, deletion_time_offset)));
}
// Can be called only when is_live_and_has_ttl() is true.
static gc_clock::time_point expiry(const bytes_view& cell) {
assert(is_live_and_has_ttl(cell));
auto expiry = get_field<int32_t>(cell, expiry_offset);
return gc_clock::time_point(gc_clock::duration(expiry));
}
// Can be called only when is_live_and_has_ttl() is true.
static gc_clock::duration ttl(const bytes_view& cell) {
assert(is_live_and_has_ttl(cell));
return gc_clock::duration(get_field<int32_t>(cell, ttl_offset));
}
static managed_bytes make_dead(api::timestamp_type timestamp, gc_clock::time_point deletion_time) {
managed_bytes b(managed_bytes::initialized_later(), flags_size + timestamp_size + deletion_time_size);
b[0] = 0;
set_field(b, timestamp_offset, timestamp);
set_field(b, deletion_time_offset, deletion_time.time_since_epoch().count());
return b;
}
static managed_bytes make_live(api::timestamp_type timestamp, bytes_view value) {
auto value_offset = flags_size + timestamp_size;
managed_bytes b(managed_bytes::initialized_later(), value_offset + value.size());
b[0] = LIVE_FLAG;
set_field(b, timestamp_offset, timestamp);
std::copy_n(value.begin(), value.size(), b.begin() + value_offset);
return b;
}
static managed_bytes make_live_counter_update(api::timestamp_type timestamp, int64_t value) {
auto value_offset = flags_size + timestamp_size;
managed_bytes b(managed_bytes::initialized_later(), value_offset + sizeof(value));
b[0] = LIVE_FLAG | COUNTER_UPDATE_FLAG;
set_field(b, timestamp_offset, timestamp);
set_field(b, value_offset, value);
return b;
}
static managed_bytes make_live(api::timestamp_type timestamp, bytes_view value, gc_clock::time_point expiry, gc_clock::duration ttl) {
auto value_offset = flags_size + timestamp_size + expiry_size + ttl_size;
managed_bytes b(managed_bytes::initialized_later(), value_offset + value.size());
b[0] = EXPIRY_FLAG | LIVE_FLAG;
set_field(b, timestamp_offset, timestamp);
set_field(b, expiry_offset, expiry.time_since_epoch().count());
set_field(b, ttl_offset, ttl.count());
std::copy_n(value.begin(), value.size(), b.begin() + value_offset);
return b;
}
// make_live_from_serializer() is intended for users that need to serialise
// some object or objects to the format used in atomic_cell::value().
// With just make_live() the pattern would look as follows:
// 1. allocate a buffer and write to it serialised objects
// 2. pass that buffer to make_live()
// 3. make_live() needs to prepend some metadata to the cell value so it
// allocates a new buffer and copies the content of the original one
//
// The allocation and copy of a buffer can be avoided.
// make_live_from_serializer() allows the user code to specify the timestamp
// and size of the cell value as well as provide the serialiser function
// object, which would write the serialised value of the cell to the buffer
// given to it by make_live_from_serializer().
template<typename Serializer>
GCC6_CONCEPT(requires requires(Serializer serializer, bytes::iterator it) {
serializer(it);
})
static managed_bytes make_live_from_serializer(api::timestamp_type timestamp, size_t size, Serializer&& serializer) {
auto value_offset = flags_size + timestamp_size;
managed_bytes b(managed_bytes::initialized_later(), value_offset + size);
b[0] = LIVE_FLAG;
set_field(b, timestamp_offset, timestamp);
serializer(b.begin() + value_offset);
return b;
}
template<typename ByteContainer>
friend class atomic_cell_base;
friend class atomic_cell;
};
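The serialized layout documented at the top of `atomic_cell_type` (one flags byte, an 8-byte timestamp, optional 4-byte expiry and ttl, then the value) can be sketched with standard containers. This is a simplified illustration, not the class above: `put_be`, `get_be`, `make_live_expiring` and `value_offset` are hypothetical helpers, and `std::vector<uint8_t>` stands in for `managed_bytes`. Big-endian byte order is assumed because the original `set_field` uses `net::hton`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr uint8_t LIVE_FLAG = 0x01;
constexpr uint8_t EXPIRY_FLAG = 0x02;

// Appends v in big-endian (network) byte order.
template <typename T>
void put_be(std::vector<uint8_t>& out, T v) {
    for (int shift = static_cast<int>(sizeof(T) - 1) * 8; shift >= 0; shift -= 8) {
        out.push_back(static_cast<uint8_t>(static_cast<uint64_t>(v) >> shift));
    }
}

// Reads a big-endian T starting at offset.
template <typename T>
T get_be(const std::vector<uint8_t>& in, std::size_t offset) {
    uint64_t v = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        v = (v << 8) | in[offset + i];
    }
    return static_cast<T>(v);
}

// <live> := <flags:1><timestamp:8><expiry:4><ttl:4><value>
std::vector<uint8_t> make_live_expiring(int64_t timestamp, int32_t expiry,
                                        int32_t ttl, const std::string& value) {
    std::vector<uint8_t> cell;
    cell.push_back(LIVE_FLAG | EXPIRY_FLAG);
    put_be<int64_t>(cell, timestamp);
    put_be<int32_t>(cell, expiry);
    put_be<int32_t>(cell, ttl);
    cell.insert(cell.end(), value.begin(), value.end());
    return cell;
}

// The value starts after flags and timestamp, plus expiry and ttl when
// EXPIRY_FLAG is set (the same computation as do_get_value above).
std::size_t value_offset(const std::vector<uint8_t>& cell) {
    assert(!cell.empty());
    return 1 + 8 + ((cell[0] & EXPIRY_FLAG) ? 8 : 0);
}
```

Fixed offsets let every accessor read its field without parsing the rest of the cell; only the value offset depends on a flag.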
template<typename ByteContainer>
class atomic_cell_base {
protected:
explicit basic_atomic_cell_view(data::cell::basic_atomic_cell_view<is_mutable> v)
: _view(std::move(v)) { }
basic_atomic_cell_view(const data::type_info& ti, pointer_type ptr)
: _view(data::cell::make_atomic_cell_view(ti, ptr))
{ }
ByteContainer _data;
protected:
atomic_cell_base(ByteContainer&& data) : _data(std::forward<ByteContainer>(data)) { }
friend class atomic_cell_or_collection;
public:
operator basic_atomic_cell_view<mutable_view::no>() const noexcept {
return basic_atomic_cell_view<mutable_view::no>(_view);
}
void swap(basic_atomic_cell_view& other) noexcept {
using std::swap;
swap(_view, other._view);
}
bool is_counter_update() const {
return _view.is_counter_update();
return atomic_cell_type::is_counter_update(_data);
}
bool is_counter_in_place_revert_set() const {
return atomic_cell_type::is_counter_in_place_revert_set(_data);
}
bool is_live() const {
return _view.is_live();
return atomic_cell_type::is_live(_data);
}
bool is_live(tombstone t, bool is_counter) const {
return is_live() && !is_covered_by(t, is_counter);
@@ -82,136 +221,122 @@ public:
return is_live() && !is_covered_by(t, is_counter) && !has_expired(now);
}
bool is_live_and_has_ttl() const {
return _view.is_expiring();
return atomic_cell_type::is_live_and_has_ttl(_data);
}
bool is_dead(gc_clock::time_point now) const {
return !is_live() || has_expired(now);
return atomic_cell_type::is_dead(_data) || has_expired(now);
}
bool is_covered_by(tombstone t, bool is_counter) const {
return timestamp() <= t.timestamp || (is_counter && t.timestamp != api::missing_timestamp);
}
// Can be called on live and dead cells
api::timestamp_type timestamp() const {
return _view.timestamp();
return atomic_cell_type::timestamp(_data);
}
void set_timestamp(api::timestamp_type ts) {
_view.set_timestamp(ts);
atomic_cell_type::set_timestamp(_data, ts);
}
// Can be called on live cells only
data::basic_value_view<is_mutable> value() const {
return _view.value();
}
// Can be called on live cells only
size_t value_size() const {
return _view.value_size();
}
bool is_value_fragmented() const {
return _view.is_value_fragmented();
auto value() const {
return atomic_cell_type::value(_data);
}
// Can be called on live counter update cells only
int64_t counter_update_value() const {
return _view.counter_update_value();
return atomic_cell_type::counter_update_value(_data);
}
// Can be called only when is_dead(gc_clock::time_point)
gc_clock::time_point deletion_time() const {
return !is_live() ? _view.deletion_time() : expiry() - ttl();
return !is_live() ? atomic_cell_type::deletion_time(_data) : expiry() - ttl();
}
// Can be called only when is_live_and_has_ttl()
gc_clock::time_point expiry() const {
return _view.expiry();
return atomic_cell_type::expiry(_data);
}
// Can be called only when is_live_and_has_ttl()
gc_clock::duration ttl() const {
return _view.ttl();
return atomic_cell_type::ttl(_data);
}
// Can be called on live and dead cells
bool has_expired(gc_clock::time_point now) const {
return is_live_and_has_ttl() && expiry() <= now;
}
bytes_view serialize() const {
return _view.serialize();
return _data;
}
void set_counter_in_place_revert(bool flag) {
atomic_cell_type::set_counter_in_place_revert(_data, flag);
}
};
class atomic_cell_view final : public basic_atomic_cell_view<mutable_view::no> {
atomic_cell_view(const data::type_info& ti, const uint8_t* data)
: basic_atomic_cell_view<mutable_view::no>(ti, data) {}
template<mutable_view is_mutable>
atomic_cell_view(data::cell::basic_atomic_cell_view<is_mutable> view)
: basic_atomic_cell_view<mutable_view::no>(view) { }
friend class atomic_cell;
class atomic_cell_view final : public atomic_cell_base<bytes_view> {
atomic_cell_view(bytes_view data) : atomic_cell_base(std::move(data)) {}
public:
static atomic_cell_view from_bytes(const data::type_info& ti, const imr::utils::object<data::cell::structure>& data) {
return atomic_cell_view(ti, data.get());
}
static atomic_cell_view from_bytes(const data::type_info& ti, bytes_view bv) {
return atomic_cell_view(ti, reinterpret_cast<const uint8_t*>(bv.begin()));
}
static atomic_cell_view from_bytes(bytes_view data) { return atomic_cell_view(data); }
friend class atomic_cell;
friend std::ostream& operator<<(std::ostream& os, const atomic_cell_view& acv);
};
class atomic_cell_mutable_view final : public basic_atomic_cell_view<mutable_view::yes> {
atomic_cell_mutable_view(const data::type_info& ti, uint8_t* data)
: basic_atomic_cell_view<mutable_view::yes>(ti, data) {}
class atomic_cell_mutable_view final : public atomic_cell_base<bytes_mutable_view> {
atomic_cell_mutable_view(bytes_mutable_view data) : atomic_cell_base(std::move(data)) {}
public:
static atomic_cell_mutable_view from_bytes(const data::type_info& ti, imr::utils::object<data::cell::structure>& data) {
return atomic_cell_mutable_view(ti, data.get());
}
static atomic_cell_mutable_view from_bytes(bytes_mutable_view data) { return atomic_cell_mutable_view(data); }
friend class atomic_cell;
};
using atomic_cell_ref = atomic_cell_mutable_view;
class atomic_cell final : public basic_atomic_cell_view<mutable_view::yes> {
using imr_object_type = imr::utils::object<data::cell::structure>;
imr_object_type _data;
atomic_cell(const data::type_info& ti, imr::utils::object<data::cell::structure>&& data)
: basic_atomic_cell_view<mutable_view::yes>(ti, data.get()), _data(std::move(data)) {}
class atomic_cell_ref final : public atomic_cell_base<managed_bytes&> {
public:
class collection_member_tag;
using collection_member = bool_class<collection_member_tag>;
atomic_cell_ref(managed_bytes& buf) : atomic_cell_base(buf) {}
};
class atomic_cell final : public atomic_cell_base<managed_bytes> {
atomic_cell(managed_bytes b) : atomic_cell_base(std::move(b)) {}
public:
atomic_cell(const atomic_cell&) = default;
atomic_cell(atomic_cell&&) = default;
atomic_cell& operator=(const atomic_cell&) = delete;
atomic_cell& operator=(const atomic_cell&) = default;
atomic_cell& operator=(atomic_cell&&) = default;
void swap(atomic_cell& other) noexcept {
basic_atomic_cell_view<mutable_view::yes>::swap(other);
_data.swap(other._data);
static atomic_cell from_bytes(managed_bytes b) {
return atomic_cell(std::move(b));
}
operator atomic_cell_view() const { return atomic_cell_view(_view); }
atomic_cell(const abstract_type& t, atomic_cell_view other);
static atomic_cell make_dead(api::timestamp_type timestamp, gc_clock::time_point deletion_time);
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, bytes_view value,
collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, ser::buffer_view<bytes_ostream::fragment_iterator> value,
collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, const bytes& value,
collection_member cm = collection_member::no) {
return make_live(type, timestamp, bytes_view(value), cm);
atomic_cell(atomic_cell_view other) : atomic_cell_base(managed_bytes{other._data}) {}
operator atomic_cell_view() const {
return atomic_cell_view(_data);
}
static atomic_cell make_live_counter_update(api::timestamp_type timestamp, int64_t value);
static atomic_cell make_live(const abstract_type&, api::timestamp_type timestamp, bytes_view value,
gc_clock::time_point expiry, gc_clock::duration ttl, collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type&, api::timestamp_type timestamp, ser::buffer_view<bytes_ostream::fragment_iterator> value,
gc_clock::time_point expiry, gc_clock::duration ttl, collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, const bytes& value,
gc_clock::time_point expiry, gc_clock::duration ttl, collection_member cm = collection_member::no)
static atomic_cell make_dead(api::timestamp_type timestamp, gc_clock::time_point deletion_time) {
return atomic_cell_type::make_dead(timestamp, deletion_time);
}
static atomic_cell make_live(api::timestamp_type timestamp, bytes_view value) {
return atomic_cell_type::make_live(timestamp, value);
}
static atomic_cell make_live(api::timestamp_type timestamp, const bytes& value) {
return make_live(timestamp, bytes_view(value));
}
static atomic_cell make_live_counter_update(api::timestamp_type timestamp, int64_t value) {
return atomic_cell_type::make_live_counter_update(timestamp, value);
}
static atomic_cell make_live(api::timestamp_type timestamp, bytes_view value,
gc_clock::time_point expiry, gc_clock::duration ttl)
{
return make_live(type, timestamp, bytes_view(value), expiry, ttl, cm);
return atomic_cell_type::make_live(timestamp, value, expiry, ttl);
}
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, bytes_view value, ttl_opt ttl, collection_member cm = collection_member::no) {
static atomic_cell make_live(api::timestamp_type timestamp, const bytes& value,
gc_clock::time_point expiry, gc_clock::duration ttl)
{
return make_live(timestamp, bytes_view(value), expiry, ttl);
}
static atomic_cell make_live(api::timestamp_type timestamp, bytes_view value, ttl_opt ttl) {
if (!ttl) {
return make_live(type, timestamp, value, cm);
return atomic_cell_type::make_live(timestamp, value);
} else {
return make_live(type, timestamp, value, gc_clock::now() + *ttl, *ttl, cm);
return atomic_cell_type::make_live(timestamp, value, gc_clock::now() + *ttl, *ttl);
}
}
static atomic_cell make_live_uninitialized(const abstract_type& type, api::timestamp_type timestamp, size_t size);
template<typename Serializer>
static atomic_cell make_live_from_serializer(api::timestamp_type timestamp, size_t size, Serializer&& serializer) {
return atomic_cell_type::make_live_from_serializer(timestamp, size, std::forward<Serializer>(serializer));
}
friend class atomic_cell_or_collection;
friend std::ostream& operator<<(std::ostream& os, const atomic_cell& ac);
};
@@ -225,24 +350,33 @@ class collection_mutation_view;
// list: tbd, probably ugly
class collection_mutation {
public:
using imr_object_type = imr::utils::object<data::cell::structure>;
imr_object_type _data;
managed_bytes data;
collection_mutation() {}
collection_mutation(const collection_type_impl&, collection_mutation_view v);
collection_mutation(const collection_type_impl&, bytes_view bv);
collection_mutation(managed_bytes b) : data(std::move(b)) {}
collection_mutation(collection_mutation_view v);
operator collection_mutation_view() const;
};
class collection_mutation_view {
public:
atomic_cell_value_view data;
bytes_view data;
bytes_view serialize() const { return data; }
static collection_mutation_view from_bytes(bytes_view v) { return { v }; }
};
inline
collection_mutation::collection_mutation(collection_mutation_view v)
: data(v.data) {
}
inline
collection_mutation::operator collection_mutation_view() const {
return { data };
}
class column_definition;
int compare_atomic_cell_for_merge(atomic_cell_view left, atomic_cell_view right);
void merge_column(const abstract_type& def,
void merge_column(const column_definition& def,
atomic_cell_or_collection& old,
const atomic_cell_or_collection& neww);
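
Note on the expiry accessors in the hunk above: `is_dead(now)`, `has_expired(now)` and `deletion_time()` are defined in terms of each other, and the relationship is easier to see in isolation. The following is a minimal sketch, assuming a toy cell type; `toy_cell`, `toy_clock` and `make_live_with_ttl` are hypothetical names for illustration and do not reflect the serialized cell layout on either side of this diff.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

// Hypothetical stand-in clock; the diff uses gc_clock.
using toy_clock = std::chrono::system_clock;

// Hypothetical cell model, not the real atomic_cell representation.
struct toy_cell {
    bool live = true;
    std::optional<toy_clock::time_point> expiry;      // live cells with a TTL only
    std::optional<toy_clock::duration> ttl;           // live cells with a TTL only
    std::optional<toy_clock::time_point> deleted_at;  // dead cells only

    bool is_live() const { return live; }
    bool is_live_and_has_ttl() const { return live && expiry.has_value(); }

    // A cell expires only if it is live and carries a TTL.
    bool has_expired(toy_clock::time_point now) const {
        return is_live_and_has_ttl() && *expiry <= now;
    }

    // Dead means either a tombstone or an expired live cell.
    bool is_dead(toy_clock::time_point now) const {
        return !is_live() || has_expired(now);
    }

    // For an expired cell the deletion time is the moment it was written,
    // that is, expiry minus ttl.
    toy_clock::time_point deletion_time() const {
        assert(!is_live() || is_live_and_has_ttl());
        return !is_live() ? *deleted_at : *expiry - *ttl;
    }
};

// Mirrors the ttl_opt overload in the diff: expiry is computed as now + ttl.
inline toy_cell make_live_with_ttl(toy_clock::time_point now, toy_clock::duration ttl) {
    toy_cell c;
    c.expiry = now + ttl;
    c.ttl = ttl;
    return c;
}
```

The comparison in `has_expired` is `<=`, so a cell counts as expired at exactly its expiry instant.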


@@ -33,15 +33,12 @@ template<>
struct appending_hash<collection_mutation_view> {
template<typename Hasher>
void operator()(Hasher& h, collection_mutation_view cell, const column_definition& cdef) const {
cell.data.with_linearized([&] (bytes_view cell_bv) {
auto ctype = static_pointer_cast<const collection_type_impl>(cdef.type);
auto m_view = ctype->deserialize_mutation_form(cell_bv);
auto m_view = collection_type_impl::deserialize_mutation_form(cell);
::feed_hash(h, m_view.tomb);
for (auto&& key_and_value : m_view.cells) {
::feed_hash(h, key_and_value.first);
::feed_hash(h, key_and_value.second, cdef);
}
});
}
};
@@ -53,9 +50,7 @@ struct appending_hash<atomic_cell_view> {
feed_hash(h, cell.timestamp());
if (cell.is_live()) {
if (cdef.is_counter()) {
counter_cell_view::with_linearized(cell, [&] (counter_cell_view ccv) {
::feed_hash(h, ccv);
});
::feed_hash(h, counter_cell_view(cell));
return;
}
if (cell.is_live_and_has_ttl()) {
@@ -90,9 +85,9 @@ struct appending_hash<atomic_cell_or_collection> {
template<typename Hasher>
void operator()(Hasher& h, const atomic_cell_or_collection& c, const column_definition& cdef) const {
if (cdef.is_atomic()) {
feed_hash(h, c.as_atomic_cell(cdef), cdef);
feed_hash(h, c.as_atomic_cell(), cdef);
} else {
feed_hash(h, c.as_collection_mutation(), cdef);
}
}
};
};


@@ -25,56 +25,42 @@
#include "schema.hh"
#include "hashing.hh"
#include "imr/utils.hh"
// A variant type that can hold either an atomic_cell, or a serialized collection.
// Which type is stored is determined by the schema.
// Has an "empty" state.
// Objects moved-from are left in an empty state.
class atomic_cell_or_collection final {
// FIXME: This has made us lose small-buffer optimisation. Unfortunately,
// due to the changed cell format it would be less effective now, anyway.
// Measure the actual impact because any attempts to fix this will become
// irrelevant once rows are converted to the IMR as well, so maybe we can
// live with this like that.
using imr_object_type = imr::utils::object<data::cell::structure>;
imr_object_type _data;
managed_bytes _data;
private:
atomic_cell_or_collection(imr::utils::object<data::cell::structure>&& data) : _data(std::move(data)) {}
atomic_cell_or_collection(managed_bytes&& data) : _data(std::move(data)) {}
public:
atomic_cell_or_collection() = default;
atomic_cell_or_collection(atomic_cell_or_collection&&) = default;
atomic_cell_or_collection(const atomic_cell_or_collection&) = delete;
atomic_cell_or_collection& operator=(atomic_cell_or_collection&&) = default;
atomic_cell_or_collection& operator=(const atomic_cell_or_collection&) = delete;
atomic_cell_or_collection(atomic_cell ac) : _data(std::move(ac._data)) {}
atomic_cell_or_collection(const abstract_type& at, atomic_cell_view acv);
static atomic_cell_or_collection from_atomic_cell(atomic_cell data) { return { std::move(data._data) }; }
atomic_cell_view as_atomic_cell(const column_definition& cdef) const { return atomic_cell_view::from_bytes(cdef.type->imr_state().type_info(), _data); }
atomic_cell_ref as_atomic_cell_ref(const column_definition& cdef) { return atomic_cell_mutable_view::from_bytes(cdef.type->imr_state().type_info(), _data); }
atomic_cell_mutable_view as_mutable_atomic_cell(const column_definition& cdef) { return atomic_cell_mutable_view::from_bytes(cdef.type->imr_state().type_info(), _data); }
atomic_cell_or_collection(collection_mutation cm) : _data(std::move(cm._data)) { }
atomic_cell_or_collection copy(const abstract_type&) const;
atomic_cell_view as_atomic_cell() const { return atomic_cell_view::from_bytes(_data); }
atomic_cell_ref as_atomic_cell_ref() { return { _data }; }
atomic_cell_mutable_view as_mutable_atomic_cell() { return atomic_cell_mutable_view::from_bytes(_data); }
atomic_cell_or_collection(collection_mutation cm) : _data(std::move(cm.data)) {}
explicit operator bool() const {
return bool(_data);
return !_data.empty();
}
static constexpr bool can_use_mutable_view() {
return true;
bool can_use_mutable_view() const {
return !_data.is_fragmented();
}
void swap(atomic_cell_or_collection& other) noexcept {
_data.swap(other._data);
static atomic_cell_or_collection from_collection_mutation(collection_mutation data) {
return std::move(data.data);
}
collection_mutation_view as_collection_mutation() const {
return collection_mutation_view{_data};
}
bytes_view serialize() const {
return _data;
}
bool operator==(const atomic_cell_or_collection& other) const {
return _data == other._data;
}
size_t external_memory_usage() const {
return _data.external_memory_usage();
}
static atomic_cell_or_collection from_collection_mutation(collection_mutation data) { return std::move(data._data); }
collection_mutation_view as_collection_mutation() const;
bytes_view serialize() const;
bool equals(const abstract_type& type, const atomic_cell_or_collection& other) const;
size_t external_memory_usage(const abstract_type&) const;
friend std::ostream& operator<<(std::ostream&, const atomic_cell_or_collection&);
};
namespace std {
inline void swap(atomic_cell_or_collection& a, atomic_cell_or_collection& b) noexcept
{
a.swap(b);
}
}


@@ -103,7 +103,6 @@ future<bool> default_authorizer::any_granted() const {
return _qp.process(
query,
db::consistency_level::LOCAL_ONE,
infinite_timeout_config,
{},
true).then([this](::shared_ptr<cql3::untyped_result_set> results) {
return !results->empty();
@@ -116,8 +115,7 @@ future<> default_authorizer::migrate_legacy_metadata() const {
return _qp.process(
query,
db::consistency_level::LOCAL_ONE,
infinite_timeout_config).then([this](::shared_ptr<cql3::untyped_result_set> results) {
db::consistency_level::LOCAL_ONE).then([this](::shared_ptr<cql3::untyped_result_set> results) {
return do_for_each(*results, [this](const cql3::untyped_result_set_row& row) {
return do_with(
row.get_as<sstring>("username"),
@@ -198,7 +196,6 @@ default_authorizer::authorize(const role_or_anonymous& maybe_role, const resourc
return _qp.process(
query,
db::consistency_level::LOCAL_ONE,
infinite_timeout_config,
{*maybe_role.name, r.name()}).then([](::shared_ptr<cql3::untyped_result_set> results) {
if (results->empty()) {
return permissions::NONE;
@@ -228,7 +225,6 @@ default_authorizer::modify(
return _qp.process(
query,
db::consistency_level::ONE,
infinite_timeout_config,
{permissions::to_strings(set), sstring(role_name), resource.name()}).discard_result();
});
}
@@ -254,7 +250,6 @@ future<std::vector<permission_details>> default_authorizer::list_all() const {
return _qp.process(
query,
db::consistency_level::ONE,
infinite_timeout_config,
{},
true).then([](::shared_ptr<cql3::untyped_result_set> results) {
std::vector<permission_details> all_details;
@@ -282,7 +277,6 @@ future<> default_authorizer::revoke_all(stdx::string_view role_name) const {
return _qp.process(
query,
db::consistency_level::ONE,
infinite_timeout_config,
{sstring(role_name)}).discard_result().handle_exception([role_name](auto ep) {
try {
std::rethrow_exception(ep);
@@ -303,7 +297,6 @@ future<> default_authorizer::revoke_all(const resource& resource) const {
return _qp.process(
query,
db::consistency_level::LOCAL_ONE,
infinite_timeout_config,
{resource.name()}).then_wrapped([this, resource](future<::shared_ptr<cql3::untyped_result_set>> f) {
try {
auto res = f.get0();
@@ -321,7 +314,6 @@ future<> default_authorizer::revoke_all(const resource& resource) const {
return _qp.process(
query,
db::consistency_level::LOCAL_ONE,
infinite_timeout_config,
{r.get_as<sstring>(ROLE_NAME), resource.name()}).discard_result().handle_exception(
[resource](auto ep) {
try {


@@ -149,7 +149,9 @@ static sstring gensalt() {
// blowfish 2011 fix, blowfish, sha512, sha256, md5
for (sstring pfx : { "$2y$", "$2a$", "$6$", "$5$", "$1$" }) {
salt = pfx + input;
if (crypt_r("fisk", salt.c_str(), &tlcrypt)) {
const char* e = crypt_r("fisk", salt.c_str(), &tlcrypt);
if (e && (e[0] != '*')) {
prefix = pfx;
return salt;
}
@@ -162,7 +164,7 @@ static sstring hashpw(const sstring& pass) {
}
static bool has_salted_hash(const cql3::untyped_result_set_row& row) {
return !row.get_or<sstring>(SALTED_HASH, "").empty();
return utf8_type->deserialize(row.get_blob(SALTED_HASH)) != data_value::make_null(utf8_type);
}
static const sstring update_row_query = sprint(
@@ -183,8 +185,7 @@ future<> password_authenticator::migrate_legacy_metadata() const {
return _qp.process(
query,
db::consistency_level::QUORUM,
infinite_timeout_config).then([this](::shared_ptr<cql3::untyped_result_set> results) {
db::consistency_level::QUORUM).then([this](::shared_ptr<cql3::untyped_result_set> results) {
return do_for_each(*results, [this](const cql3::untyped_result_set_row& row) {
auto username = row.get_as<sstring>("username");
auto salted_hash = row.get_as<sstring>(SALTED_HASH);
@@ -192,7 +193,6 @@ future<> password_authenticator::migrate_legacy_metadata() const {
return _qp.process(
update_row_query,
consistency_for_user(username),
infinite_timeout_config,
{std::move(salted_hash), username}).discard_result();
}).finally([results] {});
}).then([] {
@@ -209,7 +209,6 @@ future<> password_authenticator::create_default_if_missing() const {
return _qp.process(
update_row_query,
db::consistency_level::QUORUM,
infinite_timeout_config,
{hashpw(DEFAULT_USER_PASSWORD), DEFAULT_USER_NAME}).then([](auto&&) {
plogger.info("Created default superuser authentication record.");
});
@@ -309,7 +308,6 @@ future<authenticated_user> password_authenticator::authenticate(
return _qp.process(
query,
consistency_for_user(username),
infinite_timeout_config,
{username},
true);
}).then_wrapped([=](future<::shared_ptr<cql3::untyped_result_set>> f) {
@@ -337,7 +335,6 @@ future<> password_authenticator::create(stdx::string_view role_name, const authe
return _qp.process(
update_row_query,
consistency_for_user(role_name),
infinite_timeout_config,
{hashpw(*options.password), sstring(role_name)}).discard_result();
}
@@ -355,7 +352,6 @@ future<> password_authenticator::alter(stdx::string_view role_name, const authen
return _qp.process(
query,
consistency_for_user(role_name),
infinite_timeout_config,
{hashpw(*options.password), sstring(role_name)}).discard_result();
}
@@ -366,7 +362,7 @@ future<> password_authenticator::drop(stdx::string_view name) const {
meta::roles_table::qualified_name(),
meta::roles_table::role_col_name);
return _qp.process(query, consistency_for_user(name), infinite_timeout_config, {sstring(name)}).discard_result();
return _qp.process(query, consistency_for_user(name), {sstring(name)}).discard_result();
}
future<custom_options> password_authenticator::query_custom_options(stdx::string_view role_name) const {
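
Note on the `gensalt()` hunk above: one variant treats any non-null `crypt_r` result as success, while the other also rejects results whose first character is `*`. The sketch below isolates that check, assuming that a failure may be reported either as a null pointer or as a short token starting with `*`. `is_usable_crypt_result` and `pick_salt` are hypothetical helpers, and the hash function is passed in as a callable so the example does not depend on linking against a crypt library.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical predicate: true when a crypt(3)-style result looks like a real hash.
// Assumption: failure is signalled by nullptr or by a token beginning with '*'.
inline bool is_usable_crypt_result(const char* e) {
    return e != nullptr && e[0] != '*';
}

// Hypothetical probe loop in the spirit of gensalt(): try each scheme prefix in
// order and return the first salt for which the hash function reports success.
inline std::optional<std::string> pick_salt(
        const std::vector<std::string>& prefixes,
        const std::string& input,
        const std::function<const char*(const std::string&)>& hash) {
    for (const auto& pfx : prefixes) {
        std::string salt = pfx + input;
        if (is_usable_crypt_result(hash(salt))) {
            return salt;
        }
    }
    return std::nullopt;  // no scheme was accepted
}
```

With only the null check, a prefix that yields a `*`-prefixed failure token would be accepted as the first working scheme, which is presumably the case the two-part condition guards against.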


@@ -72,14 +72,12 @@ future<bool> default_role_row_satisfies(
return qp.process(
query,
db::consistency_level::ONE,
infinite_timeout_config,
{meta::DEFAULT_SUPERUSER_NAME},
true).then([&qp, &p](::shared_ptr<cql3::untyped_result_set> results) {
if (results->empty()) {
return qp.process(
query,
db::consistency_level::QUORUM,
infinite_timeout_config,
{meta::DEFAULT_SUPERUSER_NAME},
true).then([&p](::shared_ptr<cql3::untyped_result_set> results) {
if (results->empty()) {
@@ -103,8 +101,7 @@ future<bool> any_nondefault_role_row_satisfies(
return do_with(std::move(p), [&qp](const auto& p) {
return qp.process(
query,
db::consistency_level::QUORUM,
infinite_timeout_config).then([&p](::shared_ptr<cql3::untyped_result_set> results) {
db::consistency_level::QUORUM).then([&p](::shared_ptr<cql3::untyped_result_set> results) {
if (results->empty()) {
return false;
}


@@ -37,7 +37,7 @@
#include "cql3/query_processor.hh"
#include "cql3/untyped_result_set.hh"
#include "db/config.hh"
#include "db/consistency_level_type.hh"
#include "db/consistency_level.hh"
#include "exceptions/exceptions.hh"
#include "log.hh"
#include "service/migration_listener.hh"
@@ -223,7 +223,6 @@ future<bool> service::has_existing_legacy_users() const {
return _qp.process(
default_user_query,
db::consistency_level::ONE,
infinite_timeout_config,
{meta::DEFAULT_SUPERUSER_NAME},
true).then([this](auto results) {
if (!results->empty()) {
@@ -233,7 +232,6 @@ future<bool> service::has_existing_legacy_users() const {
return _qp.process(
default_user_query,
db::consistency_level::QUORUM,
infinite_timeout_config,
{meta::DEFAULT_SUPERUSER_NAME},
true).then([this](auto results) {
if (!results->empty()) {
@@ -242,8 +240,7 @@ future<bool> service::has_existing_legacy_users() const {
return _qp.process(
all_users_query,
db::consistency_level::QUORUM,
infinite_timeout_config).then([](auto results) {
db::consistency_level::QUORUM).then([](auto results) {
return make_ready_future<bool>(!results->empty());
});
});


@@ -89,7 +89,6 @@ static future<stdx::optional<record>> find_record(cql3::query_processor& qp, std
return qp.process(
query,
consistency_for_role(role_name),
infinite_timeout_config,
{sstring(role_name)},
true).then([](::shared_ptr<cql3::untyped_result_set> results) {
if (results->empty()) {
@@ -174,7 +173,6 @@ future<> standard_role_manager::create_default_role_if_missing() const {
return _qp.process(
query,
db::consistency_level::QUORUM,
infinite_timeout_config,
{meta::DEFAULT_SUPERUSER_NAME}).then([](auto&&) {
log.info("Created default superuser role '{}'.", meta::DEFAULT_SUPERUSER_NAME);
return make_ready_future<>();
@@ -200,8 +198,7 @@ future<> standard_role_manager::migrate_legacy_metadata() const {
return _qp.process(
query,
db::consistency_level::QUORUM,
infinite_timeout_config).then([this](::shared_ptr<cql3::untyped_result_set> results) {
db::consistency_level::QUORUM).then([this](::shared_ptr<cql3::untyped_result_set> results) {
return do_for_each(*results, [this](const cql3::untyped_result_set_row& row) {
role_config config;
config.is_superuser = row.get_as<bool>("super");
@@ -263,7 +260,6 @@ future<> standard_role_manager::create_or_replace(stdx::string_view role_name, c
return _qp.process(
query,
consistency_for_role(role_name),
infinite_timeout_config,
{sstring(role_name), c.is_superuser, c.can_login},
true).discard_result();
}
@@ -307,7 +303,6 @@ standard_role_manager::alter(stdx::string_view role_name, const role_config_upda
build_column_assignments(u),
meta::roles_table::role_col_name),
consistency_for_role(role_name),
infinite_timeout_config,
{sstring(role_name)}).discard_result();
});
}
@@ -327,7 +322,6 @@ future<> standard_role_manager::drop(stdx::string_view role_name) const {
return _qp.process(
query,
consistency_for_role(role_name),
infinite_timeout_config,
{sstring(role_name)}).then([this, role_name](::shared_ptr<cql3::untyped_result_set> members) {
return parallel_for_each(
members->begin(),
@@ -367,7 +361,6 @@ future<> standard_role_manager::drop(stdx::string_view role_name) const {
return _qp.process(
query,
consistency_for_role(role_name),
infinite_timeout_config,
{sstring(role_name)}).discard_result();
};
@@ -394,7 +387,6 @@ standard_role_manager::modify_membership(
return _qp.process(
query,
consistency_for_role(grantee_name),
infinite_timeout_config,
{role_set{sstring(role_name)}, sstring(grantee_name)}).discard_result();
};
@@ -406,7 +398,6 @@ standard_role_manager::modify_membership(
"INSERT INTO %s (role, member) VALUES (?, ?)",
meta::role_members_table::qualified_name()),
consistency_for_role(role_name),
infinite_timeout_config,
{sstring(role_name), sstring(grantee_name)}).discard_result();
case membership_change::remove:
@@ -415,7 +406,6 @@ standard_role_manager::modify_membership(
"DELETE FROM %s WHERE role = ? AND member = ?",
meta::role_members_table::qualified_name()),
consistency_for_role(role_name),
infinite_timeout_config,
{sstring(role_name), sstring(grantee_name)}).discard_result();
}
@@ -516,7 +506,7 @@ future<role_set> standard_role_manager::query_all() const {
// To avoid many copies of a view.
static const auto role_col_name_string = sstring(meta::roles_table::role_col_name);
return _qp.process(query, db::consistency_level::QUORUM, infinite_timeout_config).then([](::shared_ptr<cql3::untyped_result_set> results) {
return _qp.process(query, db::consistency_level::QUORUM).then([](::shared_ptr<cql3::untyped_result_set> results) {
role_set roles;
std::transform(


@@ -96,12 +96,6 @@ protected:
}
virtual ~backlog_controller() {}
public:
backlog_controller(backlog_controller&&) = default;
float backlog_of_shares(float shares) const;
seastar::scheduling_group sg() {
return _scheduling_group;
}
};
// memtable flush CPU controller.
@@ -125,7 +119,7 @@ public:
flush_controller(seastar::scheduling_group sg, const ::io_priority_class& iop, float static_shares) : backlog_controller(sg, iop, static_shares) {}
flush_controller(seastar::scheduling_group sg, const ::io_priority_class& iop, std::chrono::milliseconds interval, float soft_limit, std::function<float()> current_dirty)
: backlog_controller(sg, iop, std::move(interval),
std::vector<backlog_controller::control_point>({{soft_limit, 10}, {soft_limit + (hard_dirty_limit - soft_limit) / 2, 200} , {hard_dirty_limit, 1000}}),
std::vector<backlog_controller::control_point>({{soft_limit, 100}, {soft_limit + (hard_dirty_limit - soft_limit) / 2, 200} , {hard_dirty_limit, 1000}}),
std::move(current_dirty)
)
{}
@@ -134,8 +128,6 @@ public:
class compaction_controller : public backlog_controller {
public:
static constexpr unsigned normalization_factor = 30;
static constexpr float disable_backlog = std::numeric_limits<double>::infinity();
static constexpr float backlog_disabled(float backlog) { return std::isinf(backlog); }
compaction_controller(seastar::scheduling_group sg, const ::io_priority_class& iop, float static_shares) : backlog_controller(sg, iop, static_shares) {}
compaction_controller(seastar::scheduling_group sg, const ::io_priority_class& iop, std::chrono::milliseconds interval, std::function<float()> current_backlog)
: backlog_controller(sg, iop, std::move(interval),


@@ -29,7 +29,7 @@
#include <functional>
#include "utils/mutable_view.hh"
using bytes = basic_sstring<int8_t, uint32_t, 31, false>;
using bytes = basic_sstring<int8_t, uint32_t, 31>;
using bytes_view = std::experimental::basic_string_view<int8_t>;
using bytes_mutable_view = basic_mutable_view<bytes_view::value_type>;
using bytes_opt = std::experimental::optional<bytes>;
@@ -78,11 +78,3 @@ struct appending_hash<bytes_view> {
h.update(reinterpret_cast<const char*>(v.begin()), v.size() * sizeof(bytes_view::value_type));
}
};
inline int32_t compare_unsigned(bytes_view v1, bytes_view v2) {
auto n = memcmp(v1.begin(), v2.begin(), std::min(v1.size(), v2.size()));
if (n) {
return n;
}
return (int32_t) (v1.size() - v2.size());
}
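
Note on `compare_unsigned` in the hunk above: it orders two byte ranges by their common prefix using `memcmp`, which compares bytes as unsigned values, and falls back to length when one range is a prefix of the other. The sketch below shows the same ordering over `std::string_view`. `compare_unsigned_sketch` is a hypothetical name, and unlike the original it returns -1, 0 or 1 for the length tie-break rather than a size difference cast to `int32_t`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string_view>

// Lexicographic comparison of raw bytes, treating each byte as unsigned.
// Returns a negative value, zero, or a positive value.
inline int compare_unsigned_sketch(std::string_view a, std::string_view b) {
    const std::size_t common = std::min(a.size(), b.size());
    if (common != 0) {
        // memcmp interprets bytes as unsigned char, so 0x80 sorts after 0x7f.
        const int n = std::memcmp(a.data(), b.data(), common);
        if (n != 0) {
            return n;
        }
    }
    // Equal on the common prefix: the shorter range sorts first.
    if (a.size() == b.size()) {
        return 0;
    }
    return a.size() < b.size() ? -1 : 1;
}
```

Returning a fixed sign for the tie-break avoids depending on how a `size_t` difference narrows to a 32-bit integer.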


@@ -65,9 +65,8 @@ private:
size_type _size;
public:
class fragment_iterator : public std::iterator<std::input_iterator_tag, bytes_view> {
chunk* _current = nullptr;
chunk* _current;
public:
fragment_iterator() = default;
fragment_iterator(chunk* current) : _current(current) {}
fragment_iterator(const fragment_iterator&) = default;
fragment_iterator& operator=(const fragment_iterator&) = default;
@@ -290,24 +289,6 @@ public:
}
}
// Removes n bytes from the end of the bytes_ostream.
// Beware of O(n) algorithm.
void remove_suffix(size_t n) {
_size -= n;
auto left = _size;
auto current = _begin.get();
while (current) {
if (current->offset >= left) {
current->offset = left;
_current = current;
current->next.reset();
return;
}
left -= current->offset;
current = current->next.get();
}
}
// begin() and end() form an input range to bytes_view representing fragments.
// Any modification of this instance invalidates iterators.
fragment_iterator begin() const { return { _begin.get() }; }


@@ -60,11 +60,12 @@ class cache_flat_mutation_reader final : public flat_mutation_reader::impl {
// - _next_row_in_range = _next.position() < _upper_bound
// - _last_row points at a direct predecessor of the next row which is going to be read.
// Used for populating continuity.
// - _population_range_starts_before_all_rows is set accordingly
reading_from_underlying,
end_of_stream
};
partition_snapshot_ptr _snp;
lw_shared_ptr<partition_snapshot> _snp;
position_in_partition::tri_compare _position_cmp;
query::clustering_key_filter_ranges _ck_ranges;
@@ -86,10 +87,12 @@ class cache_flat_mutation_reader final : public flat_mutation_reader::impl {
partition_snapshot_row_cursor _next_row;
bool _next_row_in_range = false;
// Whether _lower_bound was changed within current fill_buffer().
// If it did not then we cannot break out of it (e.g. on preemption) because
// forward progress is not guaranteed in case iterators are getting constantly invalidated.
bool _lower_bound_changed = false;
// True iff current population interval, since the previous clustering row, starts before all clustered rows.
// We cannot just look at _lower_bound, because emission of range tombstones changes _lower_bound and
// because we mark clustering intervals as continuous when consuming a clustering_row, it would prevent
// us from marking the interval as continuous.
// Valid when _state == reading_from_underlying.
bool _population_range_starts_before_all_rows;
future<> do_fill_buffer(db::timeout_clock::time_point);
void copy_from_cache_to_buffer();
@@ -129,7 +132,7 @@ public:
dht::decorated_key dk,
query::clustering_key_filter_ranges&& crr,
lw_shared_ptr<read_context> ctx,
partition_snapshot_ptr snp,
lw_shared_ptr<partition_snapshot> snp,
row_cache& cache)
: flat_mutation_reader::impl(std::move(s))
, _snp(std::move(snp))
@@ -149,6 +152,9 @@ public:
cache_flat_mutation_reader(const cache_flat_mutation_reader&) = delete;
cache_flat_mutation_reader(cache_flat_mutation_reader&&) = delete;
virtual future<> fill_buffer(db::timeout_clock::time_point timeout) override;
virtual ~cache_flat_mutation_reader() {
maybe_merge_versions(_snp, _lsa_manager.region(), _lsa_manager.read_section());
}
virtual void next_partition() override {
clear_buffer_to_next_partition();
if (is_buffer_empty()) {
@@ -228,6 +234,7 @@ inline
future<> cache_flat_mutation_reader::do_fill_buffer(db::timeout_clock::time_point timeout) {
if (_state == state::move_to_underlying) {
_state = state::reading_from_underlying;
_population_range_starts_before_all_rows = _lower_bound.is_before_all_clustered_rows(*_schema);
auto end = _next_row_in_range ? position_in_partition(_next_row.position())
: position_in_partition(_upper_bound);
return _read_context->fast_forward_to(position_range{_lower_bound, std::move(end)}, timeout).then([this, timeout] {
@@ -255,13 +262,9 @@ future<> cache_flat_mutation_reader::do_fill_buffer(db::timeout_clock::time_poin
}
_next_row.maybe_refresh();
clogger.trace("csm {}: next={}, cont={}", this, _next_row.position(), _next_row.continuous());
_lower_bound_changed = false;
while (_state == state::reading_from_cache) {
while (!is_buffer_full() && _state == state::reading_from_cache) {
copy_from_cache_to_buffer();
// We need to check _lower_bound_changed even if is_buffer_full() because
// we may have emitted only a range tombstone which overlapped with _lower_bound
// and thus didn't cause _lower_bound to change.
if ((need_preempt() || is_buffer_full()) && _lower_bound_changed) {
if (need_preempt()) {
break;
}
}
@@ -357,7 +360,7 @@ future<> cache_flat_mutation_reader::read_from_underlying(db::timeout_clock::tim
inline
bool cache_flat_mutation_reader::ensure_population_lower_bound() {
if (!_ck_ranges_curr->start()) {
if (_population_range_starts_before_all_rows) {
return true;
}
if (!_last_row.refresh(*_snp)) {
@@ -371,7 +374,7 @@ bool cache_flat_mutation_reader::ensure_population_lower_bound() {
rows_entry::compare less(*_schema);
// FIXME: Avoid the copy by inserting an incomplete clustering row
auto e = alloc_strategy_unique_ptr<rows_entry>(
current_allocator().construct<rows_entry>(*_schema, *_last_row));
current_allocator().construct<rows_entry>(*_last_row));
e->set_continuous(false);
auto insert_result = rows.insert_check(rows.end(), *e, less);
auto inserted = insert_result.second;
@@ -412,6 +415,7 @@ inline
void cache_flat_mutation_reader::maybe_add_to_cache(const clustering_row& cr) {
if (!can_populate()) {
_last_row = nullptr;
_population_range_starts_before_all_rows = false;
_read_context->cache().on_mispopulate();
return;
}
@@ -424,7 +428,7 @@ void cache_flat_mutation_reader::maybe_add_to_cache(const clustering_row& cr) {
cr.cells().prepare_hash(*_schema, column_kind::regular_column);
}
auto new_entry = alloc_strategy_unique_ptr<rows_entry>(
current_allocator().construct<rows_entry>(*_schema, cr.key(), cr.tomb(), cr.marker(), cr.cells()));
current_allocator().construct<rows_entry>(cr.key(), cr.tomb(), cr.marker(), cr.cells()));
new_entry->set_continuous(false);
auto it = _next_row.iterators_valid() ? _next_row.get_iterator_in_latest_version()
: mp.clustered_rows().lower_bound(cr.key(), less);
@@ -445,6 +449,7 @@ void cache_flat_mutation_reader::maybe_add_to_cache(const clustering_row& cr) {
with_allocator(standard_allocator(), [&] {
_last_row = partition_snapshot_row_weakref(*_snp, it, true);
});
_population_range_starts_before_all_rows = false;
});
}
@@ -466,19 +471,15 @@ void cache_flat_mutation_reader::copy_from_cache_to_buffer() {
_next_row.touch();
position_in_partition_view next_lower_bound = _next_row.dummy() ? _next_row.position() : position_in_partition_view::after_key(_next_row.key());
for (auto &&rts : _snp->range_tombstones(_lower_bound, _next_row_in_range ? next_lower_bound : _upper_bound)) {
position_in_partition::less_compare less(*_schema);
// This guarantees that rts starts after any emitted clustering_row
// and not before any emitted range tombstone.
if (!less(_lower_bound, rts.position())) {
rts.set_start(*_schema, _lower_bound);
} else {
if (rts.trim_front(*_schema, _lower_bound)) {
_lower_bound = position_in_partition(rts.position());
_lower_bound_changed = true;
if (is_buffer_full()) {
return;
}
push_mutation_fragment(std::move(rts));
}
push_mutation_fragment(std::move(rts));
}
// We add the row to the buffer even when it's full.
// This simplifies the code. For more info see #3139.
@@ -515,7 +516,6 @@ void cache_flat_mutation_reader::move_to_range(query::clustering_row_ranges::con
_last_row = nullptr;
_lower_bound = std::move(lb);
_upper_bound = std::move(ub);
_lower_bound_changed = true;
_ck_ranges_curr = next_it;
auto adjacent = _next_row.advance_to(_lower_bound);
_next_row_in_range = !after_current_range(_next_row.position());
@@ -593,7 +593,6 @@ void cache_flat_mutation_reader::add_clustering_row_to_buffer(mutation_fragment&
auto new_lower_bound = position_in_partition::after_key(row.key());
push_mutation_fragment(std::move(mf));
_lower_bound = std::move(new_lower_bound);
_lower_bound_changed = true;
}
inline
@@ -601,16 +600,10 @@ void cache_flat_mutation_reader::add_to_buffer(range_tombstone&& rt) {
clogger.trace("csm {}: add_to_buffer({})", this, rt);
// This guarantees that rt starts after any emitted clustering_row
// and not before any emitted range tombstone.
position_in_partition::less_compare less(*_schema);
if (!less(_lower_bound, rt.end_position())) {
if (!rt.trim_front(*_schema, _lower_bound)) {
return;
}
if (!less(_lower_bound, rt.position())) {
rt.set_start(*_schema, _lower_bound);
} else {
_lower_bound = position_in_partition(rt.position());
_lower_bound_changed = true;
}
_lower_bound = position_in_partition(rt.position());
push_mutation_fragment(std::move(rt));
}
@@ -664,7 +657,7 @@ inline flat_mutation_reader make_cache_flat_mutation_reader(schema_ptr s,
query::clustering_key_filter_ranges crr,
row_cache& cache,
lw_shared_ptr<cache::read_context> ctx,
partition_snapshot_ptr snp)
lw_shared_ptr<partition_snapshot> snp)
{
return make_flat_mutation_reader<cache::cache_flat_mutation_reader>(
std::move(s), std::move(dk), std::move(crr), std::move(ctx), std::move(snp), cache);


@@ -23,7 +23,21 @@
#include <boost/intrusive/unordered_set.hpp>
#include "utils/small_vector.hh"
#if __has_include(<boost/container/small_vector.hpp>)
#include <boost/container/small_vector.hpp>
template <typename T, size_t N>
using small_vector = boost::container::small_vector<T, N>;
#else
#include <vector>
template <typename T, size_t N>
using small_vector = std::vector<T>;
#endif
#include "fnv1a_hasher.hh"
#include "mutation_fragment.hh"
#include "mutation_partition.hh"
@@ -31,7 +45,7 @@
#include "db/timeout_clock.hh"
class cells_range {
using ids_vector_type = utils::small_vector<column_id, 5>;
using ids_vector_type = small_vector<column_id, 5>;
position_in_partition_view _position;
ids_vector_type _ids;


@@ -22,7 +22,6 @@
#pragma once
#include <functional>
#include "keys.hh"
#include "schema.hh"
#include "range.hh"
@@ -44,20 +43,22 @@ bound_kind invert_kind(bound_kind k);
int32_t weight(bound_kind k);
class bound_view {
const static thread_local clustering_key _empty_prefix;
std::reference_wrapper<const clustering_key_prefix> _prefix;
bound_kind _kind;
public:
const static thread_local clustering_key empty_prefix;
const clustering_key_prefix& prefix;
bound_kind kind;
bound_view(const clustering_key_prefix& prefix, bound_kind kind)
: _prefix(prefix)
, _kind(kind)
: prefix(prefix)
, kind(kind)
{ }
bound_view(const bound_view& other) noexcept = default;
bound_view& operator=(const bound_view& other) noexcept = default;
bound_kind kind() const { return _kind; }
const clustering_key_prefix& prefix() const { return _prefix; }
bound_view& operator=(const bound_view& other) noexcept {
if (this != &other) {
this->~bound_view();
new (this) bound_view(other);
}
return *this;
}
struct tri_compare {
// To make it assignable and to avoid taking a schema_ptr, we
// wrap the schema reference.
@@ -81,13 +82,13 @@ public:
return d1 < d2 ? w1 - (w1 <= 0) : -(w2 - (w2 <= 0));
}
int operator()(const bound_view b, const clustering_key_prefix& p) const {
return operator()(b._prefix, weight(b._kind), p, 0);
return operator()(b.prefix, weight(b.kind), p, 0);
}
int operator()(const clustering_key_prefix& p, const bound_view b) const {
return operator()(p, 0, b._prefix, weight(b._kind));
return operator()(p, 0, b.prefix, weight(b.kind));
}
int operator()(const bound_view b1, const bound_view b2) const {
return operator()(b1._prefix, weight(b1._kind), b2._prefix, weight(b2._kind));
return operator()(b1.prefix, weight(b1.kind), b2.prefix, weight(b2.kind));
}
};
struct compare {
@@ -100,26 +101,26 @@ public:
return _cmp(p1, w1, p2, w2) < 0;
}
bool operator()(const bound_view b, const clustering_key_prefix& p) const {
return operator()(b._prefix, weight(b._kind), p, 0);
return operator()(b.prefix, weight(b.kind), p, 0);
}
bool operator()(const clustering_key_prefix& p, const bound_view b) const {
return operator()(p, 0, b._prefix, weight(b._kind));
return operator()(p, 0, b.prefix, weight(b.kind));
}
bool operator()(const bound_view b1, const bound_view b2) const {
return operator()(b1._prefix, weight(b1._kind), b2._prefix, weight(b2._kind));
return operator()(b1.prefix, weight(b1.kind), b2.prefix, weight(b2.kind));
}
};
bool equal(const schema& s, const bound_view other) const {
return _kind == other._kind && _prefix.get().equal(s, other._prefix.get());
return kind == other.kind && prefix.equal(s, other.prefix);
}
bool adjacent(const schema& s, const bound_view other) const {
return invert_kind(other._kind) == _kind && _prefix.get().equal(s, other._prefix.get());
return invert_kind(other.kind) == kind && prefix.equal(s, other.prefix);
}
static bound_view bottom() {
return {_empty_prefix, bound_kind::incl_start};
return {empty_prefix, bound_kind::incl_start};
}
static bound_view top() {
return {_empty_prefix, bound_kind::incl_end};
return {empty_prefix, bound_kind::incl_end};
}
template<template<typename> typename R>
GCC6_CONCEPT( requires Range<R, clustering_key_prefix_view> )
@@ -143,13 +144,13 @@ public:
template<template<typename> typename R>
GCC6_CONCEPT( requires Range<R, clustering_key_prefix_view> )
static stdx::optional<typename R<clustering_key_prefix_view>::bound> to_range_bound(const bound_view& bv) {
if (&bv._prefix.get() == &_empty_prefix) {
if (&bv.prefix == &empty_prefix) {
return {};
}
bool inclusive = bv._kind != bound_kind::excl_end && bv._kind != bound_kind::excl_start;
return {typename R<clustering_key_prefix_view>::bound(bv._prefix.get().view(), inclusive)};
bool inclusive = bv.kind != bound_kind::excl_end && bv.kind != bound_kind::excl_start;
return {typename R<clustering_key_prefix_view>::bound(bv.prefix.view(), inclusive)};
}
friend std::ostream& operator<<(std::ostream& out, const bound_view& b) {
return out << "{bound: prefix=" << b._prefix.get() << ", kind=" << b._kind << "}";
return out << "{bound: prefix=" << b.prefix << ", kind=" << b.kind << "}";
}
};


@@ -25,8 +25,7 @@
#include "exceptions/exceptions.hh"
#include "sstables/compaction_backlog_manager.hh"
class table;
using column_family = table;
class column_family;
class schema;
using schema_ptr = lw_shared_ptr<const schema>;


@@ -0,0 +1,67 @@
/*
* Copyright (C) 2016 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "query-request.hh"
#include <experimental/optional>
// Wraps ring_position so it is compatible with old-style C++: default constructor,
// stateless comparators, yada yada
class compatible_ring_position {
const schema* _schema = nullptr;
// optional to supply a default constructor, no more
std::experimental::optional<dht::ring_position> _rp;
public:
compatible_ring_position() noexcept = default;
compatible_ring_position(const schema& s, const dht::ring_position& rp)
: _schema(&s), _rp(rp) {
}
compatible_ring_position(const schema& s, dht::ring_position&& rp)
: _schema(&s), _rp(std::move(rp)) {
}
const dht::token& token() const {
return _rp->token();
}
friend int tri_compare(const compatible_ring_position& x, const compatible_ring_position& y) {
return x._rp->tri_compare(*x._schema, *y._rp);
}
friend bool operator<(const compatible_ring_position& x, const compatible_ring_position& y) {
return tri_compare(x, y) < 0;
}
friend bool operator<=(const compatible_ring_position& x, const compatible_ring_position& y) {
return tri_compare(x, y) <= 0;
}
friend bool operator>(const compatible_ring_position& x, const compatible_ring_position& y) {
return tri_compare(x, y) > 0;
}
friend bool operator>=(const compatible_ring_position& x, const compatible_ring_position& y) {
return tri_compare(x, y) >= 0;
}
friend bool operator==(const compatible_ring_position& x, const compatible_ring_position& y) {
return tri_compare(x, y) == 0;
}
friend bool operator!=(const compatible_ring_position& x, const compatible_ring_position& y) {
return tri_compare(x, y) != 0;
}
};


@@ -1,64 +0,0 @@
/*
* Copyright (C) 2016 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "query-request.hh"
#include <optional>
// Wraps ring_position_view so it is compatible with old-style C++: default
// constructor, stateless comparators, yada yada.
class compatible_ring_position_view {
const schema* _schema = nullptr;
// Optional to supply a default constructor, no more.
std::optional<dht::ring_position_view> _rpv;
public:
constexpr compatible_ring_position_view() = default;
compatible_ring_position_view(const schema& s, dht::ring_position_view rpv)
: _schema(&s), _rpv(rpv) {
}
const dht::ring_position_view& position() const {
return *_rpv;
}
friend int tri_compare(const compatible_ring_position_view& x, const compatible_ring_position_view& y) {
return dht::ring_position_tri_compare(*x._schema, *x._rpv, *y._rpv);
}
friend bool operator<(const compatible_ring_position_view& x, const compatible_ring_position_view& y) {
return tri_compare(x, y) < 0;
}
friend bool operator<=(const compatible_ring_position_view& x, const compatible_ring_position_view& y) {
return tri_compare(x, y) <= 0;
}
friend bool operator>(const compatible_ring_position_view& x, const compatible_ring_position_view& y) {
return tri_compare(x, y) > 0;
}
friend bool operator>=(const compatible_ring_position_view& x, const compatible_ring_position_view& y) {
return tri_compare(x, y) >= 0;
}
friend bool operator==(const compatible_ring_position_view& x, const compatible_ring_position_view& y) {
return tri_compare(x, y) == 0;
}
friend bool operator!=(const compatible_ring_position_view& x, const compatible_ring_position_view& y) {
return tri_compare(x, y) != 0;
}
};


@@ -25,7 +25,6 @@
#include <boost/range/adaptor/transformed.hpp>
#include "compound.hh"
#include "schema.hh"
#include "sstables/version.hh"
//
// This header provides adaptors between the representation used by our compound_type<>
@@ -303,7 +302,7 @@ private:
}
public:
template <typename Describer>
auto describe_type(sstables::sstable_version_types v, Describer f) const {
auto describe_type(Describer f) const {
return f(const_cast<bytes&>(_bytes));
}


@@ -241,7 +241,7 @@ size_t lz4_processor::compress(const char* input, size_t input_len,
output[1] = (input_len >> 8) & 0xFF;
output[2] = (input_len >> 16) & 0xFF;
output[3] = (input_len >> 24) & 0xFF;
#ifdef SEASTAR_HAVE_LZ4_COMPRESS_DEFAULT
#ifdef HAVE_LZ4_COMPRESS_DEFAULT
auto ret = LZ4_compress_default(input, output + 4, input_len, LZ4_compressBound(input_len));
#else
auto ret = LZ4_compress(input, output + 4, input_len);


@@ -228,7 +228,6 @@ scylla_tests = [
'tests/memory_footprint',
'tests/perf/perf_sstable',
'tests/cql_query_test',
'tests/secondary_index_test',
'tests/storage_proxy_test',
'tests/schema_change_test',
'tests/mutation_reader_test',
@@ -236,7 +235,6 @@ scylla_tests = [
'tests/row_cache_test',
'tests/test-serialization',
'tests/sstable_test',
'tests/sstable_3_x_test',
'tests/sstable_mutation_test',
'tests/sstable_resharding_test',
'tests/memtable_test',
@@ -275,15 +273,12 @@ scylla_tests = [
'tests/input_stream_test',
'tests/virtual_reader_test',
'tests/view_schema_test',
'tests/view_build_test',
'tests/view_complex_test',
'tests/counter_test',
'tests/cell_locker_test',
'tests/row_locker_test',
'tests/streaming_histogram_test',
'tests/duration_test',
'tests/vint_serialization_test',
'tests/continuous_data_consumer_test',
'tests/compress_test',
'tests/chunked_vector_test',
'tests/loading_cache_test',
@@ -298,18 +293,11 @@ scylla_tests = [
'tests/extensions_test',
'tests/cql_auth_syntax_test',
'tests/querier_cache',
'tests/limiting_data_source_test',
'tests/meta_test',
'tests/imr_test',
'tests/partition_data_test',
'tests/reusable_buffer_test',
'tests/multishard_writer_test',
'tests/querier_cache_resource_based_eviction',
]
perf_tests = [
'tests/perf/perf_mutation_readers',
'tests/perf/perf_mutation_fragment',
'tests/perf/perf_idl',
'tests/perf/perf_mutation_readers'
]
apps = [
@@ -370,10 +358,6 @@ arg_parser.add_argument('--enable-gcc6-concepts', dest='gcc6_concepts', action='
help='enable experimental support for C++ Concepts as implemented in GCC 6')
arg_parser.add_argument('--enable-alloc-failure-injector', dest='alloc_failure_injector', action='store_true', default=False,
help='enable allocation failure injection')
arg_parser.add_argument('--with-antlr3', dest='antlr3_exec', action='store', default=None,
help='path to antlr3 executable')
arg_parser.add_argument('--with-ragel', dest='ragel_exec', action='store', default=None,
help='path to ragel executable')
args = arg_parser.parse_args()
defines = []
@@ -383,7 +367,6 @@ extra_cxxflags = {}
cassandra_interface = Thrift(source = 'interface/cassandra.thrift', service = 'Cassandra')
scylla_core = (['database.cc',
'atomic_cell.cc',
'schema.cc',
'frozen_schema.cc',
'schema_registry.cc',
@@ -396,11 +379,11 @@ scylla_core = (['database.cc',
'frozen_mutation.cc',
'memtable.cc',
'schema_mutations.cc',
'release.cc',
'supervisor.cc',
'utils/logalloc.cc',
'utils/large_bitset.cc',
'utils/buffer_input_stream.cc',
'utils/limiting_data_source.cc',
'mutation_partition.cc',
'mutation_partition_view.cc',
'mutation_partition_serializer.cc',
@@ -410,9 +393,7 @@ scylla_core = (['database.cc',
'keys.cc',
'counters.cc',
'compress.cc',
'sstables/mp_row_consumer.cc',
'sstables/sstables.cc',
'sstables/sstable_version.cc',
'sstables/compress.cc',
'sstables/row.cc',
'sstables/partition.cc',
@@ -421,12 +402,9 @@ scylla_core = (['database.cc',
'sstables/compaction_manager.cc',
'sstables/integrity_checked_file_impl.cc',
'sstables/prepended_input_stream.cc',
'sstables/m_format_write_helpers.cc',
'sstables/m_format_read_helpers.cc',
'transport/event.cc',
'transport/event_notifier.cc',
'transport/server.cc',
'transport/messages/result_message.cc',
'cql3/abstract_marker.cc',
'cql3/attributes.cc',
'cql3/cf_name.cc',
@@ -514,7 +492,6 @@ scylla_core = (['database.cc',
'cql3/variable_specifications.cc',
'db/consistency_level.cc',
'db/system_keyspace.cc',
'db/system_distributed_keyspace.cc',
'db/schema_tables.cc',
'db/cql_type_parser.cc',
'db/legacy_schema_migrator.cc',
@@ -522,17 +499,15 @@ scylla_core = (['database.cc',
'db/commitlog/commitlog_replayer.cc',
'db/commitlog/commitlog_entry.cc',
'db/hints/manager.cc',
'db/hints/resource_manager.cc',
'db/config.cc',
'db/extensions.cc',
'db/heat_load_balance.cc',
'db/large_partition_handler.cc',
'db/index/secondary_index.cc',
'db/marshal/type_parser.cc',
'db/batchlog_manager.cc',
'db/view/view.cc',
'db/view/row_locking.cc',
'index/secondary_index_manager.cc',
'index/secondary_index.cc',
'utils/UUID_gen.cc',
'utils/i_filter.cc',
'utils/bloom_filter.cc',
@@ -629,8 +604,6 @@ scylla_core = (['database.cc',
'vint-serialization.cc',
'utils/arch/powerpc/crc32-vpmsum/crc32_wrapper.cc',
'querier.cc',
'data/cell.cc',
'multishard_writer.cc',
]
+ [Antlr3Grammar('cql3/Cql.g')]
+ [Thrift('interface/cassandra.thrift', 'Cassandra')]
@@ -667,9 +640,7 @@ api = ['api/api.cc',
'api/api-doc/stream_manager.json',
'api/stream_manager.cc',
'api/api-doc/system.json',
'api/system.cc',
'api/config.cc',
'api/api-doc/config.json',
'api/system.cc'
]
idls = ['idl/gossip_digest.idl.hh',
@@ -697,7 +668,7 @@ idls = ['idl/gossip_digest.idl.hh',
'idl/cache_temperature.idl.hh',
]
scylla_tests_dependencies = scylla_core + idls + [
scylla_tests_dependencies = scylla_core + api + idls + [
'tests/cql_test_env.cc',
'tests/cql_assertions.cc',
'tests/result_set_assertions.cc',
@@ -710,7 +681,7 @@ scylla_tests_seastar_deps = [
]
deps = {
'scylla': idls + ['main.cc', 'release.cc'] + scylla_core + api,
'scylla': idls + ['main.cc'] + scylla_core + api,
}
pure_boost_tests = set([
@@ -738,10 +709,6 @@ pure_boost_tests = set([
'tests/auth_resource_test',
'tests/enum_set_test',
'tests/cql_auth_syntax_test',
'tests/meta_test',
'tests/imr_test',
'tests/partition_data_test',
'tests/reusable_buffer_test',
])
tests_not_using_seastar_test_framework = set([
@@ -760,6 +727,7 @@ tests_not_using_seastar_test_framework = set([
'tests/memory_footprint',
'tests/gossip',
'tests/perf/perf_sstable',
'tests/querier_cache_resource_based_eviction',
]) | pure_boost_tests
for t in tests_not_using_seastar_test_framework:
@@ -772,7 +740,7 @@ for t in scylla_tests:
deps[t] += scylla_tests_dependencies
deps[t] += scylla_tests_seastar_deps
else:
deps[t] += scylla_core + idls + ['tests/cql_test_env.cc']
deps[t] += scylla_core + api + idls + ['tests/cql_test_env.cc']
perf_tests_seastar_deps = [
'seastar/tests/perf/perf_tests.cc'
@@ -791,10 +759,6 @@ deps['tests/murmur_hash_test'] = ['bytes.cc', 'utils/murmur_hash.cc', 'tests/mur
deps['tests/allocation_strategy_test'] = ['tests/allocation_strategy_test.cc', 'utils/logalloc.cc', 'utils/dynamic_bitset.cc']
deps['tests/log_heap_test'] = ['tests/log_heap_test.cc']
deps['tests/anchorless_list_test'] = ['tests/anchorless_list_test.cc']
deps['tests/perf/perf_fast_forward'] += ['release.cc']
deps['tests/meta_test'] = ['tests/meta_test.cc']
deps['tests/imr_test'] = ['tests/imr_test.cc']
deps['tests/reusable_buffer_test'] = ['tests/reusable_buffer_test.cc']
warnings = [
'-Wno-mismatched-tags', # clang-only
@@ -868,22 +832,6 @@ for pkglist in optional_packages:
alternatives = ':'.join(pkglist[1:])
print('Missing optional package {pkglist[0]} (or alternatives {alternatives})'.format(**locals()))
compiler_test_src = '''
#if __GNUC__ < 7
#error "MAJOR"
#elif __GNUC__ == 7
#if __GNUC_MINOR__ < 3
#error "MINOR"
#endif
#endif
int main() { return 0; }
'''
if not try_compile_and_link(compiler=args.cxx, source=compiler_test_src):
print('Wrong GCC version. Scylla needs GCC >= 7.3 to compile.')
sys.exit(1)
if not try_compile(compiler=args.cxx, source='#include <boost/version.hpp>'):
print('Boost not installed. Please install {}.'.format(pkgname("boost-devel")))
sys.exit(1)
@@ -1009,16 +957,6 @@ do_sanitize = True
if args.static:
do_sanitize = False
if args.antlr3_exec:
antlr3_exec = args.antlr3_exec
else:
antlr3_exec = "antlr3"
if args.ragel_exec:
ragel_exec = args.ragel_exec
else:
ragel_exec = "ragel"
with open(buildfile, 'w') as f:
f.write(textwrap.dedent('''\
configure_args = {configure_args}
@@ -1032,7 +970,7 @@ with open(buildfile, 'w') as f:
pool seastar_pool
depth = 1
rule ragel
command = {ragel_exec} -G2 -o $out $in
command = ragel -G2 -o $out $in
description = RAGEL $out
rule gen
command = echo -e $text > $out
@@ -1078,7 +1016,7 @@ with open(buildfile, 'w') as f:
# Because we add such a variable to every function, and because `ExceptionBaseType` is not a global
# name, we also add a global typedef to avoid compilation errors.
command = sed -e '/^#if 0/,/^#endif/d' $in > $builddir/{mode}/gen/$in $
&& {antlr3_exec} $builddir/{mode}/gen/$in $
&& antlr3 $builddir/{mode}/gen/$in $
&& sed -i -e 's/^\\( *\)\\(ImplTraits::CommonTokenType\\* [a-zA-Z0-9_]* = NULL;\\)$$/\\1const \\2/' $
-e '1i using ExceptionBaseType = int;' $
-e 's/^{{/{{ ExceptionBaseType\* ex = nullptr;/; $
@@ -1086,7 +1024,7 @@ with open(buildfile, 'w') as f:
s/exceptions::syntax_exception e/exceptions::syntax_exception\& e/' $
build/{mode}/gen/${{stem}}Parser.cpp
description = ANTLR3 $in
''').format(mode = mode, antlr3_exec = antlr3_exec, **modeval))
''').format(mode = mode, **modeval))
f.write('build {mode}: phony {artifacts}\n'.format(mode = mode,
artifacts = str.join(' ', ('$builddir/' + mode + '/' + x for x in build_artifacts))))
compiles = {}


@@ -39,32 +39,16 @@ private:
return ::is_compatible(new_def.kind, kind) && new_def.type->is_value_compatible_with(*old_type);
}
static void accept_cell(row& dst, column_kind kind, const column_definition& new_def, const data_type& old_type, atomic_cell_view cell) {
if (!is_compatible(new_def, old_type, kind) || cell.timestamp() <= new_def.dropped_at()) {
return;
if (is_compatible(new_def, old_type, kind) && cell.timestamp() > new_def.dropped_at()) {
dst.apply(new_def, atomic_cell_or_collection(cell));
}
auto new_cell = [&] {
if (cell.is_live() && !old_type->is_counter()) {
if (cell.is_live_and_has_ttl()) {
return atomic_cell_or_collection(
atomic_cell::make_live(*new_def.type, cell.timestamp(), cell.value().linearize(), cell.expiry(), cell.ttl())
);
}
return atomic_cell_or_collection(
atomic_cell::make_live(*new_def.type, cell.timestamp(), cell.value().linearize())
);
} else {
return atomic_cell_or_collection(*new_def.type, cell);
}
}();
dst.apply(new_def, std::move(new_cell));
}
static void accept_cell(row& dst, column_kind kind, const column_definition& new_def, const data_type& old_type, collection_mutation_view cell) {
if (!is_compatible(new_def, old_type, kind)) {
return;
}
cell.data.with_linearized([&] (bytes_view cell_bv) {
auto&& ctype = static_pointer_cast<const collection_type_impl>(old_type);
auto old_view = ctype->deserialize_mutation_form(cell_bv);
auto old_view = ctype->deserialize_mutation_form(cell);
collection_type_impl::mutation_view new_view;
if (old_view.tomb.timestamp > new_def.dropped_at()) {
@@ -76,7 +60,6 @@ private:
}
}
dst.apply(new_def, ctype->serialize_mutation_form(std::move(new_view)));
});
}
public:
converting_mutation_partition_applier(
@@ -92,10 +75,6 @@ public:
_p.apply(t);
}
void accept_static_cell(column_id id, atomic_cell cell) {
return accept_static_cell(id, atomic_cell_view(cell));
}
virtual void accept_static_cell(column_id id, atomic_cell_view cell) override {
const column_mapping_entry& col = _visited_column_mapping.static_column_at(id);
const column_definition* def = _p_schema.get_column_definition(col.name());
@@ -123,10 +102,6 @@ public:
_current_row = &r;
}
void accept_row_cell(column_id id, atomic_cell cell) {
return accept_row_cell(id, atomic_cell_view(cell));
}
virtual void accept_row_cell(column_id id, atomic_cell_view cell) override {
const column_mapping_entry& col = _visited_column_mapping.regular_column_at(id);
const column_definition* def = _p_schema.get_column_definition(col.name());
@@ -145,11 +120,11 @@ public:
// Appends the cell to dst upgrading it to the new schema.
// Cells must have monotonic names.
static void append_cell(row& dst, column_kind kind, const column_definition& new_def, const column_definition& old_def, const atomic_cell_or_collection& cell) {
static void append_cell(row& dst, column_kind kind, const column_definition& new_def, const data_type& old_type, const atomic_cell_or_collection& cell) {
if (new_def.is_atomic()) {
accept_cell(dst, kind, new_def, old_def.type, cell.as_atomic_cell(old_def));
accept_cell(dst, kind, new_def, old_type, cell.as_atomic_cell());
} else {
accept_cell(dst, kind, new_def, old_def.type, cell.as_collection_mutation());
accept_cell(dst, kind, new_def, old_type, cell.as_collection_mutation());
}
}
};


@@ -78,10 +78,10 @@ std::vector<counter_shard> counter_cell_view::shards_compatible_with_1_7_4() con
return sorted_shards;
}
static bool apply_in_place(const column_definition& cdef, atomic_cell_mutable_view dst, atomic_cell_mutable_view src)
static bool apply_in_place(atomic_cell_or_collection& dst, atomic_cell_or_collection& src)
{
auto dst_ccmv = counter_cell_mutable_view(dst);
auto src_ccmv = counter_cell_mutable_view(src);
auto dst_ccmv = counter_cell_mutable_view(dst.as_mutable_atomic_cell());
auto src_ccmv = counter_cell_mutable_view(src.as_mutable_atomic_cell());
auto dst_shards = dst_ccmv.shards();
auto src_shards = src_ccmv.shards();
@@ -118,19 +118,48 @@ static bool apply_in_place(const column_definition& cdef, atomic_cell_mutable_vi
auto src_ts = src_ccmv.timestamp();
dst_ccmv.set_timestamp(std::max(dst_ts, src_ts));
src_ccmv.set_timestamp(dst_ts);
src.as_mutable_atomic_cell().set_counter_in_place_revert(true);
return true;
}
void counter_cell_view::apply(const column_definition& cdef, atomic_cell_or_collection& dst, atomic_cell_or_collection& src)
static void revert_in_place_apply(atomic_cell_or_collection& dst, atomic_cell_or_collection& src)
{
auto dst_ac = dst.as_atomic_cell(cdef);
auto src_ac = src.as_atomic_cell(cdef);
assert(dst.can_use_mutable_view() && src.can_use_mutable_view());
auto dst_ccmv = counter_cell_mutable_view(dst.as_mutable_atomic_cell());
auto src_ccmv = counter_cell_mutable_view(src.as_mutable_atomic_cell());
auto dst_shards = dst_ccmv.shards();
auto src_shards = src_ccmv.shards();
auto dst_it = dst_shards.begin();
auto src_it = src_shards.begin();
while (src_it != src_shards.end()) {
while (dst_it != dst_shards.end() && dst_it->id() < src_it->id()) {
++dst_it;
}
assert(dst_it != dst_shards.end() && dst_it->id() == src_it->id());
dst_it->swap_value_and_clock(*src_it);
++src_it;
}
auto dst_ts = dst_ccmv.timestamp();
auto src_ts = src_ccmv.timestamp();
dst_ccmv.set_timestamp(src_ts);
src_ccmv.set_timestamp(dst_ts);
src.as_mutable_atomic_cell().set_counter_in_place_revert(false);
}
bool counter_cell_view::apply_reversibly(atomic_cell_or_collection& dst, atomic_cell_or_collection& src)
{
auto dst_ac = dst.as_atomic_cell();
auto src_ac = src.as_atomic_cell();
if (!dst_ac.is_live() || !src_ac.is_live()) {
if (dst_ac.is_live() || (!src_ac.is_live() && compare_atomic_cell_for_merge(dst_ac, src_ac) < 0)) {
std::swap(dst, src);
return true;
}
return;
return false;
}
if (dst_ac.is_counter_update() && src_ac.is_counter_update()) {
@@ -138,26 +167,22 @@ void counter_cell_view::apply(const column_definition& cdef, atomic_cell_or_coll
auto dst_v = dst_ac.counter_update_value();
dst = atomic_cell::make_live_counter_update(std::max(dst_ac.timestamp(), src_ac.timestamp()),
src_v + dst_v);
return;
return true;
}
assert(!dst_ac.is_counter_update());
assert(!src_ac.is_counter_update());
with_linearized(dst_ac, [&] (counter_cell_view dst_ccv) {
with_linearized(src_ac, [&] (counter_cell_view src_ccv) {
if (dst_ccv.shard_count() >= src_ccv.shard_count()) {
auto dst_amc = dst.as_mutable_atomic_cell(cdef);
auto src_amc = src.as_mutable_atomic_cell(cdef);
if (!dst_amc.is_value_fragmented() && !src_amc.is_value_fragmented()) {
if (apply_in_place(cdef, dst_amc, src_amc)) {
return;
}
if (counter_cell_view(dst_ac).shard_count() >= counter_cell_view(src_ac).shard_count()
&& dst.can_use_mutable_view() && src.can_use_mutable_view()) {
if (apply_in_place(dst, src)) {
return true;
}
}
auto dst_shards = dst_ccv.shards();
auto src_shards = src_ccv.shards();
src.as_mutable_atomic_cell().set_counter_in_place_revert(false);
auto dst_shards = counter_cell_view(dst_ac).shards();
auto src_shards = counter_cell_view(src_ac).shards();
counter_cell_builder result;
combine(dst_shards.begin(), dst_shards.end(), src_shards.begin(), src_shards.end(),
@@ -166,9 +191,22 @@ void counter_cell_view::apply(const column_definition& cdef, atomic_cell_or_coll
});
auto cell = result.build(std::max(dst_ac.timestamp(), src_ac.timestamp()));
src = std::exchange(dst, atomic_cell_or_collection(std::move(cell)));
});
});
src = std::exchange(dst, atomic_cell_or_collection(cell));
return true;
}
void counter_cell_view::revert_apply(atomic_cell_or_collection& dst, atomic_cell_or_collection& src)
{
if (dst.as_atomic_cell().is_counter_update()) {
auto src_v = src.as_atomic_cell().counter_update_value();
auto dst_v = dst.as_atomic_cell().counter_update_value();
dst = atomic_cell::make_live(dst.as_atomic_cell().timestamp(),
long_type->decompose(dst_v - src_v));
} else if (src.as_atomic_cell().is_counter_in_place_revert_set()) {
revert_in_place_apply(dst, src);
} else {
std::swap(dst, src);
}
}
stdx::optional<atomic_cell> counter_cell_view::difference(atomic_cell_view a, atomic_cell_view b)
@@ -178,15 +216,13 @@ stdx::optional<atomic_cell> counter_cell_view::difference(atomic_cell_view a, at
if (!b.is_live() || !a.is_live()) {
if (b.is_live() || (!a.is_live() && compare_atomic_cell_for_merge(b, a) < 0)) {
return atomic_cell(*counter_type, a);
return atomic_cell(a);
}
return { };
}
return with_linearized(a, [&] (counter_cell_view a_ccv) {
return with_linearized(b, [&] (counter_cell_view b_ccv) {
auto a_shards = a_ccv.shards();
auto b_shards = b_ccv.shards();
auto a_shards = counter_cell_view(a).shards();
auto b_shards = counter_cell_view(b).shards();
auto a_it = a_shards.begin();
auto a_end = a_shards.end();
@@ -208,21 +244,18 @@ stdx::optional<atomic_cell> counter_cell_view::difference(atomic_cell_view a, at
if (!result.empty()) {
diff = result.build(std::max(a.timestamp(), b.timestamp()));
} else if (a.timestamp() > b.timestamp()) {
diff = atomic_cell::make_live(*counter_type, a.timestamp(), bytes_view());
diff = atomic_cell::make_live(a.timestamp(), bytes_view());
}
return diff;
});
});
}
void transform_counter_updates_to_shards(mutation& m, const mutation* current_state, uint64_t clock_offset) {
// FIXME: allow current_state to be frozen_mutation
auto transform_new_row_to_shards = [&s = *m.schema(), clock_offset] (column_kind kind, auto& cells) {
cells.for_each_cell([&] (column_id id, atomic_cell_or_collection& ac_o_c) {
auto& cdef = s.column_at(kind, id);
auto acv = ac_o_c.as_atomic_cell(cdef);
auto transform_new_row_to_shards = [clock_offset] (auto& cells) {
cells.for_each_cell([clock_offset] (auto, atomic_cell_or_collection& ac_o_c) {
auto acv = ac_o_c.as_atomic_cell();
if (!acv.is_live()) {
return; // continue -- we are in lambda
}
@@ -233,35 +266,32 @@ void transform_counter_updates_to_shards(mutation& m, const mutation* current_st
};
if (!current_state) {
transform_new_row_to_shards(column_kind::static_column, m.partition().static_row());
transform_new_row_to_shards(m.partition().static_row());
for (auto& cr : m.partition().clustered_rows()) {
transform_new_row_to_shards(column_kind::regular_column, cr.row().cells());
transform_new_row_to_shards(cr.row().cells());
}
return;
}
clustering_key::less_compare cmp(*m.schema());
auto transform_row_to_shards = [&s = *m.schema(), clock_offset] (column_kind kind, auto& transformee, auto& state) {
auto transform_row_to_shards = [clock_offset] (auto& transformee, auto& state) {
std::deque<std::pair<column_id, counter_shard>> shards;
state.for_each_cell([&] (column_id id, const atomic_cell_or_collection& ac_o_c) {
auto& cdef = s.column_at(kind, id);
auto acv = ac_o_c.as_atomic_cell(cdef);
auto acv = ac_o_c.as_atomic_cell();
if (!acv.is_live()) {
return; // continue -- we are in lambda
}
counter_cell_view::with_linearized(acv, [&] (counter_cell_view ccv) {
counter_cell_view ccv(acv);
auto cs = ccv.local_shard();
if (!cs) {
return; // continue
}
shards.emplace_back(std::make_pair(id, counter_shard(*cs)));
});
});
transformee.for_each_cell([&] (column_id id, atomic_cell_or_collection& ac_o_c) {
auto& cdef = s.column_at(kind, id);
auto acv = ac_o_c.as_atomic_cell(cdef);
auto acv = ac_o_c.as_atomic_cell();
if (!acv.is_live()) {
return; // continue -- we are in lambda
}
@@ -283,7 +313,7 @@ void transform_counter_updates_to_shards(mutation& m, const mutation* current_st
});
};
transform_row_to_shards(column_kind::static_column, m.partition().static_row(), current_state->partition().static_row());
transform_row_to_shards(m.partition().static_row(), current_state->partition().static_row());
auto& cstate = current_state->partition();
auto it = cstate.clustered_rows().begin();
@@ -293,10 +323,10 @@ void transform_counter_updates_to_shards(mutation& m, const mutation* current_st
++it;
}
if (it == end || cmp(cr.key(), it->key())) {
transform_new_row_to_shards(column_kind::regular_column, cr.row().cells());
transform_new_row_to_shards(cr.row().cells());
continue;
}
transform_row_to_shards(column_kind::regular_column, cr.row().cells(), it->row().cells());
transform_row_to_shards(cr.row().cells(), it->row().cells());
}
}


@@ -79,7 +79,7 @@ static_assert(std::is_pod<counter_id>::value, "counter_id should be a POD type")
std::ostream& operator<<(std::ostream& os, const counter_id& id);
template<mutable_view is_mutable>
template<typename View>
class basic_counter_shard_view {
enum class offset : unsigned {
id = 0u,
@@ -88,8 +88,7 @@ class basic_counter_shard_view {
total_size = unsigned(logical_clock) + sizeof(int64_t),
};
private:
using pointer_type = std::conditional_t<is_mutable == mutable_view::no, const signed char*, signed char*>;
pointer_type _base;
typename View::pointer _base;
private:
template<typename T>
T read(offset off) const {
@@ -101,7 +100,7 @@ public:
static constexpr auto size = size_t(offset::total_size);
public:
basic_counter_shard_view() = default;
explicit basic_counter_shard_view(pointer_type ptr) noexcept
explicit basic_counter_shard_view(typename View::pointer ptr) noexcept
: _base(ptr) { }
counter_id id() const { return read<counter_id>(offset::id); }
@@ -112,7 +111,7 @@ public:
static constexpr size_t off = size_t(offset::value);
static constexpr size_t size = size_t(offset::total_size) - off;
signed char tmp[size];
typename View::value_type tmp[size];
std::copy_n(_base + off, size, tmp);
std::copy_n(other._base + off, size, _base + off);
std::copy_n(tmp, size, other._base + off);
@@ -139,7 +138,7 @@ public:
};
};
using counter_shard_view = basic_counter_shard_view<mutable_view::no>;
using counter_shard_view = basic_counter_shard_view<bytes_view>;
std::ostream& operator<<(std::ostream& os, counter_shard_view csv);
@@ -199,7 +198,7 @@ public:
return do_apply(other);
}
static constexpr size_t serialized_size() {
static size_t serialized_size() {
return counter_shard_view::size;
}
void serialize(bytes::iterator& out) const {
@@ -253,33 +252,15 @@ public:
}
atomic_cell build(api::timestamp_type timestamp) const {
// If we can assume that the counter shards never cross fragment boundaries
// the serialisation code gets much simpler.
static_assert(data::cell::maximum_external_chunk_length % counter_shard::serialized_size() == 0);
auto ac = atomic_cell::make_live_uninitialized(*counter_type, timestamp, serialized_size());
auto dst_it = ac.value().begin();
auto dst_current = *dst_it++;
for (auto&& cs : _shards) {
if (dst_current.empty()) {
dst_current = *dst_it++;
}
assert(!dst_current.empty());
auto value_dst = dst_current.data();
cs.serialize(value_dst);
dst_current.remove_prefix(counter_shard::serialized_size());
}
return ac;
return atomic_cell::make_live_from_serializer(timestamp, serialized_size(), [this] (bytes::iterator out) {
serialize(out);
});
}
static atomic_cell from_single_shard(api::timestamp_type timestamp, const counter_shard& cs) {
// We don't really need to bother with fragmentation here.
static_assert(data::cell::maximum_external_chunk_length >= counter_shard::serialized_size());
auto ac = atomic_cell::make_live_uninitialized(*counter_type, timestamp, counter_shard::serialized_size());
auto dst = ac.value().first_fragment().begin();
cs.serialize(dst);
return ac;
return atomic_cell::make_live_from_serializer(timestamp, counter_shard::serialized_size(), [&cs] (bytes::iterator out) {
cs.serialize(out);
});
}
class inserter_iterator : public std::iterator<std::output_iterator_tag, counter_shard> {
@@ -306,32 +287,28 @@ public:
// <counter_id> := <int64_t><int64_t>
// <shard> := <counter_id><int64_t:value><int64_t:logical_clock>
// <counter_cell> := <shard>*
template<mutable_view is_mutable>
template<typename View>
class basic_counter_cell_view {
protected:
using linearized_value_view = std::conditional_t<is_mutable == mutable_view::no,
bytes_view, bytes_mutable_view>;
using pointer_type = typename linearized_value_view::pointer;
basic_atomic_cell_view<is_mutable> _cell;
linearized_value_view _value;
atomic_cell_base<View> _cell;
private:
class shard_iterator : public std::iterator<std::input_iterator_tag, basic_counter_shard_view<is_mutable>> {
pointer_type _current;
basic_counter_shard_view<is_mutable> _current_view;
class shard_iterator : public std::iterator<std::input_iterator_tag, basic_counter_shard_view<View>> {
typename View::pointer _current;
basic_counter_shard_view<View> _current_view;
public:
shard_iterator() = default;
shard_iterator(pointer_type ptr) noexcept
shard_iterator(typename View::pointer ptr) noexcept
: _current(ptr), _current_view(ptr) { }
basic_counter_shard_view<is_mutable>& operator*() noexcept {
basic_counter_shard_view<View>& operator*() noexcept {
return _current_view;
}
basic_counter_shard_view<is_mutable>* operator->() noexcept {
basic_counter_shard_view<View>* operator->() noexcept {
return &_current_view;
}
shard_iterator& operator++() noexcept {
_current += counter_shard_view::size;
_current_view = basic_counter_shard_view<is_mutable>(_current);
_current_view = basic_counter_shard_view<View>(_current);
return *this;
}
shard_iterator operator++(int) noexcept {
@@ -341,7 +318,7 @@ private:
}
shard_iterator& operator--() noexcept {
_current -= counter_shard_view::size;
_current_view = basic_counter_shard_view<is_mutable>(_current);
_current_view = basic_counter_shard_view<View>(_current);
return *this;
}
shard_iterator operator--(int) noexcept {
@@ -358,23 +335,22 @@ private:
};
public:
boost::iterator_range<shard_iterator> shards() const {
auto begin = shard_iterator(_value.data());
auto end = shard_iterator(_value.data() + _value.size());
auto bv = _cell.value();
auto begin = shard_iterator(bv.data());
auto end = shard_iterator(bv.data() + bv.size());
return boost::make_iterator_range(begin, end);
}
size_t shard_count() const {
return _cell.value().size_bytes() / counter_shard_view::size;
return _cell.value().size() / counter_shard_view::size;
}
protected:
public:
// ac must be a live counter cell
explicit basic_counter_cell_view(basic_atomic_cell_view<is_mutable> ac, linearized_value_view vv) noexcept
: _cell(ac), _value(vv)
{
explicit basic_counter_cell_view(atomic_cell_base<View> ac) noexcept : _cell(ac) {
assert(_cell.is_live());
assert(!_cell.is_counter_update());
}
public:
api::timestamp_type timestamp() const { return _cell.timestamp(); }
static data_type total_value_type() { return long_type; }
@@ -405,22 +381,18 @@ public:
}
};
struct counter_cell_view : basic_counter_cell_view<mutable_view::no> {
struct counter_cell_view : basic_counter_cell_view<bytes_view> {
using basic_counter_cell_view::basic_counter_cell_view;
template<typename Function>
static decltype(auto) with_linearized(basic_atomic_cell_view<mutable_view::no> ac, Function&& fn) {
return ac.value().with_linearized([&] (bytes_view value_view) {
counter_cell_view ccv(ac, value_view);
return fn(ccv);
});
}
// Returns counter shards in an order that is compatible with Scylla 1.7.4.
std::vector<counter_shard> shards_compatible_with_1_7_4() const;
// Reversibly applies two counter cells, at least one of them must be live.
static void apply(const column_definition& cdef, atomic_cell_or_collection& dst, atomic_cell_or_collection& src);
// Returns true iff dst was modified.
static bool apply_reversibly(atomic_cell_or_collection& dst, atomic_cell_or_collection& src);
// Reverts apply performed by apply_reversible().
static void revert_apply(atomic_cell_or_collection& dst, atomic_cell_or_collection& src);
// Computes a counter cell containing minimal amount of data which, when
// applied to 'b' returns the same cell as 'a' and 'b' applied together.
@@ -429,15 +401,9 @@ struct counter_cell_view : basic_counter_cell_view<mutable_view::no> {
friend std::ostream& operator<<(std::ostream& os, counter_cell_view ccv);
};
struct counter_cell_mutable_view : basic_counter_cell_view<mutable_view::yes> {
struct counter_cell_mutable_view : basic_counter_cell_view<bytes_mutable_view> {
using basic_counter_cell_view::basic_counter_cell_view;
explicit counter_cell_mutable_view(atomic_cell_mutable_view ac) noexcept
: basic_counter_cell_view<mutable_view::yes>(ac, ac.value().first_fragment())
{
assert(!ac.value().is_fragmented());
}
void set_timestamp(api::timestamp_type ts) { _cell.set_timestamp(ts); }
};
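
Note on the shard layout: the comment in this header gives the counter cell format as `<counter_cell> := <shard>*`, with each shard being a `<counter_id>` (two `int64_t`) followed by an `int64_t` value and an `int64_t` logical clock. That fixed stride is what lets `shard_count()` divide the value size by `counter_shard_view::size`. A minimal sketch of the same idea follows. The names `shard_fields`, `append_shard`, `shard_count` and `read_shard` are hypothetical stand-ins and not Scylla's API, and the host byte order is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative stand-in for one shard; not the Scylla type.
struct shard_fields {
    int64_t id_hi;          // <counter_id> := <int64_t><int64_t>
    int64_t id_lo;
    int64_t value;          // <int64_t:value>
    int64_t logical_clock;  // <int64_t:logical_clock>
};

// Every shard occupies the same number of bytes (32 with 8-byte int64_t).
constexpr std::size_t shard_size = 4 * sizeof(int64_t);

// Append one shard to a flat buffer (assumption: host byte order).
inline void append_shard(std::vector<signed char>& buf, const shard_fields& s) {
    const int64_t fields[4] = {s.id_hi, s.id_lo, s.value, s.logical_clock};
    const std::size_t old_size = buf.size();
    buf.resize(old_size + shard_size);
    std::memcpy(buf.data() + old_size, fields, shard_size);
}

// Number of shards is the buffer size divided by the fixed shard size.
inline std::size_t shard_count(const std::vector<signed char>& buf) {
    return buf.size() / shard_size;
}

// Read the shard at the given index by jumping index * shard_size bytes.
inline shard_fields read_shard(const std::vector<signed char>& buf, std::size_t index) {
    int64_t fields[4];
    std::memcpy(fields, buf.data() + index * shard_size, shard_size);
    return shard_fields{fields[0], fields[1], fields[2], fields[3]};
}
```

Because the stride is constant, an iterator over shards only needs to advance a pointer by `shard_size`, which is what `shard_iterator::operator++` does in the hunk above.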


@@ -373,7 +373,7 @@ useStatement returns [::shared_ptr<raw::use_statement> stmt]
;
/**
* SELECT [JSON] <expression>
* SELECT <expression>
* FROM <CF>
* WHERE KEY = "key1" AND COL > 1 AND COL < 100
* LIMIT <NUMBER>;
@@ -384,12 +384,9 @@ selectStatement returns [shared_ptr<raw::select_statement> expr]
::shared_ptr<cql3::term::raw> limit;
raw::select_statement::parameters::orderings_type orderings;
bool allow_filtering = false;
bool is_json = false;
}
: K_SELECT (
( K_JSON { is_json = true; } )?
( K_DISTINCT { is_distinct = true; } )?
sclause=selectClause
: K_SELECT ( ( K_DISTINCT { is_distinct = true; } )?
sclause=selectClause
)
K_FROM cf=columnFamilyName
( K_WHERE wclause=whereClause )?
@@ -397,7 +394,7 @@ selectStatement returns [shared_ptr<raw::select_statement> expr]
( K_LIMIT rows=intValue { limit = rows; } )?
( K_ALLOW K_FILTERING { allow_filtering = true; } )?
{
auto params = ::make_shared<raw::select_statement::parameters>(std::move(orderings), is_distinct, allow_filtering, is_json);
auto params = ::make_shared<raw::select_statement::parameters>(std::move(orderings), is_distinct, allow_filtering);
$expr = ::make_shared<raw::select_statement>(std::move(cf), std::move(params),
std::move(sclause), std::move(wclause), std::move(limit));
}
@@ -451,51 +448,33 @@ orderByClause[raw::select_statement::parameters::orderings_type& orderings]
: c=cident (K_ASC | K_DESC { reversed = true; })? { orderings.emplace_back(c, reversed); }
;
jsonValue returns [::shared_ptr<cql3::term::raw> value]
:
| s=STRING_LITERAL { $value = cql3::constants::literal::string(sstring{$s.text}); }
| ':' id=ident { $value = new_bind_variables(id); }
| QMARK { $value = new_bind_variables(shared_ptr<cql3::column_identifier>{}); }
;
/**
* INSERT INTO <CF> (<column>, <column>, <column>, ...)
* VALUES (<value>, <value>, <value>, ...)
* USING TIMESTAMP <long>;
*
*/
insertStatement returns [::shared_ptr<raw::modification_statement> expr]
insertStatement returns [::shared_ptr<raw::insert_statement> expr]
@init {
auto attrs = ::make_shared<cql3::attributes::raw>();
std::vector<::shared_ptr<cql3::column_identifier::raw>> column_names;
std::vector<::shared_ptr<cql3::term::raw>> values;
bool if_not_exists = false;
::shared_ptr<cql3::term::raw> json_value;
}
: K_INSERT K_INTO cf=columnFamilyName
'(' c1=cident { column_names.push_back(c1); } ( ',' cn=cident { column_names.push_back(cn); } )* ')'
( K_VALUES
'(' v1=term { values.push_back(v1); } ( ',' vn=term { values.push_back(vn); } )* ')'
( K_IF K_NOT K_EXISTS { if_not_exists = true; } )?
( usingClause[attrs] )?
{
$expr = ::make_shared<raw::insert_statement>(std::move(cf),
std::move(attrs),
std::move(column_names),
std::move(values),
if_not_exists);
}
| K_JSON
json_token=jsonValue { json_value = $json_token.value; }
( K_IF K_NOT K_EXISTS { if_not_exists = true; } )?
( usingClause[attrs] )?
{
$expr = ::make_shared<raw::insert_json_statement>(std::move(cf),
std::move(attrs),
std::move(json_value),
if_not_exists);
}
)
K_VALUES
'(' v1=term { values.push_back(v1); } ( ',' vn=term { values.push_back(vn); } )* ')'
( K_IF K_NOT K_EXISTS { if_not_exists = true; } )?
( usingClause[attrs] )?
{
$expr = ::make_shared<raw::insert_statement>(std::move(cf),
std::move(attrs),
std::move(column_names),
std::move(values),
if_not_exists);
}
;
usingClause[::shared_ptr<cql3::attributes::raw> attrs]
@@ -1671,7 +1650,6 @@ basic_unreserved_keyword returns [sstring str]
| K_LANGUAGE
| K_NON
| K_DETERMINISTIC
| K_JSON
) { $str = $k.text; }
;
@@ -1808,7 +1786,6 @@ K_NON: N O N;
K_OR: O R;
K_REPLACE: R E P L A C E;
K_DETERMINISTIC: D E T E R M I N I S T I C;
K_JSON: J S O N;
K_SCYLLA_TIMEUUID_LIST_INDEX: S C Y L L A '_' T I M E U U I D '_' L I S T '_' I N D E X;
K_SCYLLA_COUNTER_SHARD_LIST: S C Y L L A '_' C O U N T E R '_' S H A R D '_' L I S T;


@@ -1,187 +0,0 @@
/*
* Copyright (C) 2018 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "cql3/prepared_statements_cache.hh"
namespace cql3 {
struct authorized_prepared_statements_cache_size {
size_t operator()(const statements::prepared_statement::checked_weak_ptr& val) {
// TODO: improve the size approximation - most of the entry is occupied by the key here.
return 100;
}
};
class authorized_prepared_statements_cache_key {
public:
using cache_key_type = std::pair<auth::authenticated_user, typename cql3::prepared_cache_key_type::cache_key_type>;
private:
cache_key_type _key;
public:
authorized_prepared_statements_cache_key(auth::authenticated_user user, cql3::prepared_cache_key_type prepared_cache_key)
: _key(std::move(user), std::move(prepared_cache_key.key())) {}
cache_key_type& key() { return _key; }
const cache_key_type& key() const { return _key; }
bool operator==(const authorized_prepared_statements_cache_key& other) const {
return _key == other._key;
}
bool operator!=(const authorized_prepared_statements_cache_key& other) const {
return !(*this == other);
}
static size_t hash(const auth::authenticated_user& user, const cql3::prepared_cache_key_type::cache_key_type& prep_cache_key) {
return utils::hash_combine(std::hash<auth::authenticated_user>()(user), utils::tuple_hash()(prep_cache_key));
}
};
/// \class authorized_prepared_statements_cache
/// \brief A cache of previously authorized statements.
///
/// Entries are inserted every time a new statement is authorized.
/// Entries are evicted in any of the following cases:
/// - When the corresponding prepared statement is not valid anymore.
/// - Periodically, with the same period as the permission cache is refreshed.
/// - If the corresponding entry hasn't been used for \ref entry_expiry.
class authorized_prepared_statements_cache {
public:
struct stats {
uint64_t authorized_prepared_statements_cache_evictions = 0;
};
static stats& shard_stats() {
static thread_local stats _stats;
return _stats;
}
struct authorized_prepared_statements_cache_stats_updater {
static void inc_hits() noexcept {}
static void inc_misses() noexcept {}
static void inc_blocks() noexcept {}
static void inc_evictions() noexcept {
++shard_stats().authorized_prepared_statements_cache_evictions;
}
};
private:
using cache_key_type = authorized_prepared_statements_cache_key;
using checked_weak_ptr = typename statements::prepared_statement::checked_weak_ptr;
using cache_type = utils::loading_cache<cache_key_type,
checked_weak_ptr,
utils::loading_cache_reload_enabled::yes,
authorized_prepared_statements_cache_size,
std::hash<cache_key_type>,
std::equal_to<cache_key_type>,
authorized_prepared_statements_cache_stats_updater>;
public:
using key_type = cache_key_type;
using value_type = checked_weak_ptr;
using entry_is_too_big = typename cache_type::entry_is_too_big;
using iterator = typename cache_type::iterator;
private:
cache_type _cache;
logging::logger& _logger;
public:
// Choose the memory budget such that would allow us ~4K entries when a shard gets 1GB of RAM
authorized_prepared_statements_cache(std::chrono::milliseconds entry_expiration, std::chrono::milliseconds entry_refresh, size_t cache_size, logging::logger& logger)
: _cache(cache_size, entry_expiration, entry_refresh, logger, [this] (const key_type& k) {
_cache.remove(k);
return make_ready_future<value_type>();
})
, _logger(logger)
{}
future<> insert(auth::authenticated_user user, cql3::prepared_cache_key_type prep_cache_key, value_type v) noexcept {
return _cache.get_ptr(key_type(std::move(user), std::move(prep_cache_key)), [v = std::move(v)] (const cache_key_type&) mutable {
return make_ready_future<value_type>(std::move(v));
}).discard_result();
}
iterator find(const auth::authenticated_user& user, const cql3::prepared_cache_key_type& prep_cache_key) {
struct key_view {
const auth::authenticated_user& user_ref;
const cql3::prepared_cache_key_type& prep_cache_key_ref;
};
struct hasher {
size_t operator()(const key_view& kv) {
return cql3::authorized_prepared_statements_cache_key::hash(kv.user_ref, kv.prep_cache_key_ref.key());
}
};
struct equal {
bool operator()(const key_type& k1, const key_view& k2) {
return k1.key().first == k2.user_ref && k1.key().second == k2.prep_cache_key_ref.key();
}
bool operator()(const key_view& k2, const key_type& k1) {
return operator()(k1, k2);
}
};
return _cache.find(key_view{user, prep_cache_key}, hasher(), equal());
}
iterator end() {
return _cache.end();
}
void remove(const auth::authenticated_user& user, const cql3::prepared_cache_key_type& prep_cache_key) {
iterator it = find(user, prep_cache_key);
_cache.remove(it);
}
size_t size() const {
return _cache.size();
}
size_t memory_footprint() const {
return _cache.memory_footprint();
}
future<> stop() {
return _cache.stop();
}
};
}
namespace std {
template <>
struct hash<cql3::authorized_prepared_statements_cache_key> final {
size_t operator()(const cql3::authorized_prepared_statements_cache_key& k) const {
return cql3::authorized_prepared_statements_cache_key::hash(k.key().first, k.key().second);
}
};
inline std::ostream& operator<<(std::ostream& out, const cql3::authorized_prepared_statements_cache_key& k) {
return out << "{ " << k.key().first << ", " << k.key().second << " }";
}
}
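
Note on `find()`: the lookup builds a `key_view` of references so that no `authenticated_user` or prepared-statement key is copied, and it passes a custom hasher and equality functor alongside. This only works if hashing the view yields the same value as hashing the owning key. A minimal sketch of that invariant is below, using strings for both key parts. All names here (`owning_key`, `key_view`, `hash_parts`, `hash_of`, `same_key`) are hypothetical, and `hash_combine` is a Boost-style stand-in, not `utils::hash_combine`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Boost-style combiner; a hypothetical stand-in for utils::hash_combine.
inline std::size_t hash_combine(std::size_t seed, std::size_t h) {
    return seed ^ (h + 0x9e3779b9 + (seed << 6) + (seed >> 2));
}

// The key stored in the cache owns its parts.
struct owning_key {
    std::string user;
    std::string statement_id;
};

// The lookup key only refers to parts owned by the caller.
struct key_view {
    const std::string& user;
    const std::string& statement_id;
};

// Single hashing routine shared by both representations, so they always agree.
inline std::size_t hash_parts(const std::string& user, const std::string& statement_id) {
    return hash_combine(std::hash<std::string>()(user),
                        std::hash<std::string>()(statement_id));
}

inline std::size_t hash_of(const owning_key& k) { return hash_parts(k.user, k.statement_id); }
inline std::size_t hash_of(const key_view& v) { return hash_parts(v.user, v.statement_id); }

// Cross-type equality used when a view is compared against a stored key.
inline bool same_key(const owning_key& k, const key_view& v) {
    return k.user == v.user && k.statement_id == v.statement_id;
}
```

Routing both hash functions through one helper mirrors the static `hash()` member in the header above, which is called from both `std::hash<authorized_prepared_statements_cache_key>` and the local `hasher`.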


@@ -22,7 +22,6 @@
#include "cql3/column_identifier.hh"
#include "exceptions/exceptions.hh"
#include "cql3/selection/simple_selector.hh"
#include "cql3/util.hh"
#include <regex>
@@ -63,11 +62,14 @@ sstring column_identifier::to_string() const {
}
sstring column_identifier::to_cql_string() const {
return util::maybe_quote(_text);
}
sstring column_identifier::raw::to_cql_string() const {
return util::maybe_quote(_text);
static const std::regex unquoted_identifier_re("[a-z][a-z0-9_]*");
if (std::regex_match(_text.begin(), _text.end(), unquoted_identifier_re)) {
return _text;
}
static const std::regex double_quote_re("\"");
std::string result = _text;
std::regex_replace(result, double_quote_re, "\"\"");
return '"' + result + '"';
}
column_identifier::raw::raw(sstring raw_text, bool keep_case)


@@ -123,7 +123,6 @@ public:
bool operator!=(const raw& other) const;
virtual sstring to_string() const;
sstring to_cql_string() const;
friend std::hash<column_identifier::raw>;
friend std::ostream& operator<<(std::ostream& out, const column_identifier::raw& id);


@@ -85,8 +85,8 @@ public:
virtual ::shared_ptr<terminal> bind(const query_options& options) override { return {}; }
virtual sstring to_string() const override { return "null"; }
};
public:
static thread_local const ::shared_ptr<terminal> NULL_VALUE;
public:
virtual ::shared_ptr<term> prepare(database& db, const sstring& keyspace, ::shared_ptr<column_specification> receiver) override {
if (!is_assignable(test_assignment(db, keyspace, receiver))) {
throw exceptions::invalid_request_exception("Invalid null value for counter increment/decrement");
@@ -203,14 +203,10 @@ public:
virtual void execute(mutation& m, const clustering_key_prefix& prefix, const update_parameters& params) override {
auto value = _t->bind_and_get(params._options);
execute(m, prefix, params, column, std::move(value));
}
static void execute(mutation& m, const clustering_key_prefix& prefix, const update_parameters& params, const column_definition& column, cql3::raw_value_view value) {
if (value.is_null()) {
m.set_cell(prefix, column, std::move(make_dead_cell(params)));
} else if (value.is_value()) {
m.set_cell(prefix, column, std::move(make_cell(*column.type, *value, params)));
m.set_cell(prefix, column, std::move(make_cell(*value, params)));
}
}
};


@@ -395,15 +395,18 @@ operator<<(std::ostream& os, const cql3_type::raw& r) {
namespace util {
sstring maybe_quote(const sstring& identifier) {
static const std::regex unquoted_identifier_re("[a-z][a-z0-9_]*");
if (std::regex_match(identifier.begin(), identifier.end(), unquoted_identifier_re)) {
return identifier;
sstring maybe_quote(const sstring& s) {
static const std::regex unquoted("\\w*");
static const std::regex double_quote("\"");
if (std::regex_match(s.begin(), s.end(), unquoted)) {
return s;
}
static const std::regex double_quote_re("\"");
std::string result = identifier;
std::regex_replace(result, double_quote_re, "\"\"");
return '"' + result + '"';
std::ostringstream ss;
ss << "\"";
std::regex_replace(std::ostreambuf_iterator<char>(ss), s.begin(), s.end(), double_quote, "\"\"");
ss << "\"";
return ss.str();
}
}
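
Note on the quoting variants: the three-argument form `std::regex_replace(result, double_quote_re, "\"\"")` returns a new string and leaves `result` untouched, so a variant that ignores the return value never doubles embedded quotes. The iterator form that writes into the `ostringstream` does not have that problem. Also, `\w*` accepts uppercase letters and the empty string, which the `[a-z][a-z0-9_]*` pattern rejects. A minimal sketch that keeps the stricter pattern and uses the returned value is below. `quote_identifier` is a hypothetical name and works on `std::string` rather than `sstring`.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the input unchanged when it is a lowercase unquoted identifier;
// otherwise wraps it in double quotes and doubles any embedded double quote.
inline std::string quote_identifier(const std::string& s) {
    static const std::regex unquoted_re("[a-z][a-z0-9_]*");
    if (std::regex_match(s, unquoted_re)) {
        return s;
    }
    static const std::regex double_quote_re("\"");
    // regex_replace returns the result; it does not modify its argument.
    return '"' + std::regex_replace(s, double_quote_re, "\"\"") + '"';
}
```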


@@ -45,7 +45,6 @@
#include "service/query_state.hh"
#include "service/storage_proxy.hh"
#include "cql3/query_options.hh"
#include "timeout_config.hh"
namespace cql_transport {
@@ -63,15 +62,10 @@ class metadata;
shared_ptr<const metadata> make_empty_metadata();
class cql_statement {
timeout_config_selector _timeout_config_selector;
public:
explicit cql_statement(timeout_config_selector timeout_selector) : _timeout_config_selector(timeout_selector) {}
virtual ~cql_statement()
{ }
timeout_config_selector get_timeout_config_selector() const { return _timeout_config_selector; }
virtual uint32_t get_bound_terms() = 0;
/**
@@ -87,7 +81,7 @@ public:
*
* @param state the current client state
*/
virtual void validate(service::storage_proxy& proxy, const service::client_state& state) = 0;
virtual void validate(distributed<service::storage_proxy>& proxy, const service::client_state& state) = 0;
/**
* Execute the statement and return the resulting result or null if there is no result.
@@ -96,7 +90,15 @@ public:
* @param options options for this query (consistency, variables, pageSize, ...)
*/
virtual future<::shared_ptr<cql_transport::messages::result_message>>
execute(service::storage_proxy& proxy, service::query_state& state, const query_options& options) = 0;
execute(distributed<service::storage_proxy>& proxy, service::query_state& state, const query_options& options) = 0;
/**
* Variant of execute used for internal query against the system tables, and thus only query the local node = 0.
*
* @param state the current query state
*/
virtual future<::shared_ptr<cql_transport::messages::result_message>>
execute_internal(distributed<service::storage_proxy>& proxy, service::query_state& state, const query_options& options) = 0;
virtual bool uses_function(const sstring& ks_name, const sstring& function_name) const = 0;
@@ -109,7 +111,6 @@ public:
class cql_statement_no_metadata : public cql_statement {
public:
using cql_statement::cql_statement;
virtual shared_ptr<const metadata> get_result_metadata() const override {
return make_empty_metadata();
}


@@ -67,6 +67,12 @@ class error_collector : public error_listener<RecognizerType, ExceptionBaseType>
*/
const sstring_view _query;
/**
* An empty bitset to be used as a workaround for AntLR null dereference
* bug.
*/
static typename ExceptionBaseType::BitsetListType _empty_bit_list;
public:
/**
@@ -144,6 +150,14 @@ private:
break;
}
default:
// AntLR Exception class has a bug of dereferencing a null
// pointer in the displayRecognitionError. The following
// if statement makes sure it will not be null before the
// call to that function (displayRecognitionError).
// bug reference: https://github.com/antlr/antlr3/issues/191
if (!ex->get_expectingSet()) {
ex->set_expectingSet(&_empty_bit_list);
}
ex->displayRecognitionError(token_names, msg);
}
return msg.str();
@@ -345,4 +359,8 @@ private:
#endif
};
template<typename RecognizerType, typename TokenType, typename ExceptionBaseType>
typename ExceptionBaseType::BitsetListType
error_collector<RecognizerType,TokenType,ExceptionBaseType>::_empty_bit_list = typename ExceptionBaseType::BitsetListType();
}
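
Note on `_empty_bit_list`: the workaround relies on a static data member of a class template, which is declared inside the class and defined once outside it with the full template header. A null expecting-set is then replaced by the address of that shared empty object before `displayRecognitionError` runs. A minimal sketch of the pattern is below. `example_traits`, `collector` and `non_null` are hypothetical names and do not use the ANTLR types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical traits type standing in for ExceptionBaseType.
struct example_traits {
    using bitset_list_type = std::vector<int>;
};

template <typename Traits>
class collector {
    // Declaration only; the definition follows the class.
    static typename Traits::bitset_list_type _empty_list;
public:
    // Returns p when it is non-null, otherwise the shared empty list.
    static typename Traits::bitset_list_type* non_null(typename Traits::bitset_list_type* p) {
        return p ? p : &_empty_list;
    }
};

// Out-of-class definition; one instance exists per template instantiation.
template <typename Traits>
typename Traits::bitset_list_type collector<Traits>::_empty_list =
        typename Traits::bitset_list_type();
```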


@@ -42,7 +42,6 @@
#pragma once
#include "types.hh"
#include "cql3/cql3_type.hh"
#include <vector>
#include <iosfwd>
#include <boost/functional/hash.hpp>
@@ -106,9 +105,9 @@ abstract_function::print(std::ostream& os) const {
if (i > 0) {
os << ", ";
}
os << _arg_types[i]->as_cql3_type()->to_string();
os << _arg_types[i]->name(); // FIXME: asCQL3Type()
}
os << ") -> " << _return_type->as_cql3_type()->to_string();
os << ") -> " << _return_type->name(); // FIXME: asCQL3Type()
}
}


@@ -20,7 +20,6 @@
*/
#include "functions.hh"
#include "function_call.hh"
#include "token_fct.hh"
#include "cql3/maps.hh"
@@ -42,22 +41,11 @@ functions::init() {
declare(time_uuid_fcts::make_min_timeuuid_fct());
declare(time_uuid_fcts::make_max_timeuuid_fct());
declare(time_uuid_fcts::make_date_of_fct());
declare(time_uuid_fcts::make_unix_timestamp_of_fct());
declare(time_uuid_fcts::make_currenttimestamp_fct());
declare(time_uuid_fcts::make_currentdate_fct());
declare(time_uuid_fcts::make_currenttime_fct());
declare(time_uuid_fcts::make_currenttimeuuid_fct());
declare(time_uuid_fcts::make_timeuuidtodate_fct());
declare(time_uuid_fcts::make_timestamptodate_fct());
declare(time_uuid_fcts::make_timeuuidtotimestamp_fct());
declare(time_uuid_fcts::make_datetotimestamp_fct());
declare(time_uuid_fcts::make_timeuuidtounixtimestamp_fct());
declare(time_uuid_fcts::make_timestamptounixtimestamp_fct());
declare(time_uuid_fcts::make_datetounixtimestamp_fct());
declare(time_uuid_fcts::make_unix_timestamp_of_fcf());
declare(make_uuid_fct());
for (auto&& type : cql3_type::values()) {
// Note: because text and varchar ends up being synonymous, our automatic makeToBlobFunction doesn't work
// Note: because text and varchar ends up being synonimous, our automatic makeToBlobFunction doesn't work
// for varchar, so we special case it below. We also skip blob for obvious reasons.
if (type == cql3_type::varchar || type == cql3_type::blob) {
continue;
@@ -107,22 +95,15 @@ functions::init() {
declare(aggregate_fcts::make_max_function<sstring>());
declare(aggregate_fcts::make_min_function<sstring>());
declare(aggregate_fcts::make_count_function<simple_date_native_type>());
declare(aggregate_fcts::make_max_function<simple_date_native_type>());
declare(aggregate_fcts::make_min_function<simple_date_native_type>());
declare(aggregate_fcts::make_count_function<timestamp_native_type>());
declare(aggregate_fcts::make_max_function<timestamp_native_type>());
declare(aggregate_fcts::make_min_function<timestamp_native_type>());
declare(aggregate_fcts::make_count_function<timeuuid_native_type>());
declare(aggregate_fcts::make_max_function<timeuuid_native_type>());
declare(aggregate_fcts::make_min_function<timeuuid_native_type>());
declare(aggregate_fcts::make_count_function<utils::UUID>());
declare(aggregate_fcts::make_max_function<utils::UUID>());
declare(aggregate_fcts::make_min_function<utils::UUID>());
//FIXME:
//declare(aggregate_fcts::make_count_function<bytes>());
//declare(aggregate_fcts::make_max_function<bytes>());
@@ -172,73 +153,23 @@ functions::get_overload_count(const function_name& name) {
return _declared.count(name);
}
inline
shared_ptr<function>
make_to_json_function(data_type t) {
return make_native_scalar_function<true>("tojson", utf8_type, {t},
[t](cql_serialization_format sf, const std::vector<bytes_opt>& parameters) -> bytes_opt {
return utf8_type->decompose(t->to_json_string(parameters[0].value()));
});
}
inline
shared_ptr<function>
make_from_json_function(database& db, const sstring& keyspace, data_type t) {
return make_native_scalar_function<true>("fromjson", t, {utf8_type},
[&db, &keyspace, t](cql_serialization_format sf, const std::vector<bytes_opt>& parameters) -> bytes_opt {
Json::Value json_value = json::to_json_value(utf8_type->to_string(parameters[0].value()));
bytes_opt parsed_json_value;
if (!json_value.isNull()) {
parsed_json_value.emplace(t->from_json_object(json_value, sf));
}
return std::move(parsed_json_value);
});
}
shared_ptr<function>
functions::get(database& db,
const sstring& keyspace,
const function_name& name,
const std::vector<shared_ptr<assignment_testable>>& provided_args,
const sstring& receiver_ks,
const sstring& receiver_cf,
shared_ptr<column_specification> receiver) {
const sstring& receiver_cf) {
static const function_name TOKEN_FUNCTION_NAME = function_name::native_function("token");
static const function_name TO_JSON_FUNCTION_NAME = function_name::native_function("tojson");
static const function_name FROM_JSON_FUNCTION_NAME = function_name::native_function("fromjson");
if (name.has_keyspace()
? name == TOKEN_FUNCTION_NAME
: name.name == TOKEN_FUNCTION_NAME.name) {
? name == TOKEN_FUNCTION_NAME
: name.name == TOKEN_FUNCTION_NAME.name)
{
return ::make_shared<token_fct>(db.find_schema(receiver_ks, receiver_cf));
}
if (name.has_keyspace()
? name == TO_JSON_FUNCTION_NAME
: name.name == TO_JSON_FUNCTION_NAME.name) {
if (provided_args.size() != 1) {
throw exceptions::invalid_request_exception("toJson() accepts 1 argument only");
}
selection::selector *sp = dynamic_cast<selection::selector *>(provided_args[0].get());
if (!sp) {
throw exceptions::invalid_request_exception("toJson() is only valid in SELECT clause");
}
return make_to_json_function(sp->get_type());
}
if (name.has_keyspace()
? name == FROM_JSON_FUNCTION_NAME
: name.name == FROM_JSON_FUNCTION_NAME.name) {
if (provided_args.size() != 1) {
throw exceptions::invalid_request_exception("fromJson() accepts 1 argument only");
}
if (!receiver) {
throw exceptions::invalid_request_exception("fromJson() can only be called if receiver type is known");
}
return make_from_json_function(db, keyspace, receiver->type);
}
std::vector<shared_ptr<function>> candidates;
auto&& add_declared = [&] (function_name fn) {
auto&& fns = _declared.equal_range(fn);
@@ -483,7 +414,7 @@ function_call::raw::prepare(database& db, const sstring& keyspace, ::shared_ptr<
[] (auto&& x) -> shared_ptr<assignment_testable> {
return x;
});
auto&& fun = functions::functions::get(db, keyspace, _name, args, receiver->ks_name, receiver->cf_name, receiver);
auto&& fun = functions::functions::get(db, keyspace, _name, args, receiver->ks_name, receiver->cf_name);
if (!fun) {
throw exceptions::invalid_request_exception(sprint("Unknown function %s called", _name));
}
@@ -547,7 +478,7 @@ function_call::raw::test_assignment(database& db, const sstring& keyspace, share
// of another, existing, function. In that case, we return true here because we'll throw a proper exception
// later with a more helpful error message than if we were to return false here.
try {
auto&& fun = functions::get(db, keyspace, _name, _terms, receiver->ks_name, receiver->cf_name, receiver);
auto&& fun = functions::get(db, keyspace, _name, _terms, receiver->ks_name, receiver->cf_name);
if (fun && receiver->type->equals(fun->return_type())) {
return assignment_testable::test_result::EXACT_MATCH;
} else if (!fun || receiver->type->is_value_compatible_with(*fun->return_type())) {

@@ -80,18 +80,16 @@ public:
const function_name& name,
const std::vector<shared_ptr<assignment_testable>>& provided_args,
const sstring& receiver_ks,
const sstring& receiver_cf,
::shared_ptr<column_specification> receiver = nullptr);
const sstring& receiver_cf);
template <typename AssignmentTestablePtrRange>
static shared_ptr<function> get(database& db,
const sstring& keyspace,
const function_name& name,
AssignmentTestablePtrRange&& provided_args,
const sstring& receiver_ks,
const sstring& receiver_cf,
::shared_ptr<column_specification> receiver = nullptr) {
const sstring& receiver_cf) {
const std::vector<shared_ptr<assignment_testable>> args(std::begin(provided_args), std::end(provided_args));
return get(db, keyspace, name, args, receiver_ks, receiver_cf, receiver);
return get(db, keyspace, name, args, receiver_ks, receiver_cf);
}
static std::vector<shared_ptr<function>> find(const function_name& name);
static shared_ptr<function> find(const function_name& name, const std::vector<data_type>& arg_types);

@@ -117,7 +117,7 @@ make_date_of_fct() {
inline
shared_ptr<function>
make_unix_timestamp_of_fct() {
make_unix_timestamp_of_fcf() {
return make_native_scalar_function<true>("unixtimestampof", long_type, { timeuuid_type },
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
using namespace utils;
@@ -129,163 +129,6 @@ make_unix_timestamp_of_fct() {
});
}
inline shared_ptr<function>
make_currenttimestamp_fct() {
return make_native_scalar_function<true>("currenttimestamp", timestamp_type, {},
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
return {timestamp_type->decompose(timestamp_native_type{db_clock::now()})};
});
}
inline shared_ptr<function>
make_currenttime_fct() {
return make_native_scalar_function<true>("currenttime", time_type, {},
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
constexpr int64_t milliseconds_in_day = 3600 * 24 * 1000;
int64_t milliseconds_since_epoch = std::chrono::duration_cast<std::chrono::milliseconds>(db_clock::now().time_since_epoch()).count();
int64_t nanoseconds_today = (milliseconds_since_epoch % milliseconds_in_day) * 1000 * 1000;
return {time_type->decompose(time_native_type{nanoseconds_today})};
});
}
inline shared_ptr<function>
make_currentdate_fct() {
return make_native_scalar_function<true>("currentdate", simple_date_type, {},
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
auto to_simple_date = get_castas_fctn(simple_date_type, timestamp_type);
return {simple_date_type->decompose(to_simple_date(timestamp_native_type{db_clock::now()}))};
});
}
inline
shared_ptr<function>
make_currenttimeuuid_fct() {
return make_native_scalar_function<true>("currenttimeuuid", timeuuid_type, {},
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
return {timeuuid_type->decompose(timeuuid_native_type{utils::UUID_gen::get_time_UUID()})};
});
}
inline
shared_ptr<function>
make_timeuuidtodate_fct() {
return make_native_scalar_function<true>("todate", simple_date_type, { timeuuid_type },
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
using namespace utils;
auto& bb = values[0];
if (!bb) {
return {};
}
auto ts = db_clock::time_point(db_clock::duration(UUID_gen::unix_timestamp(UUID_gen::get_UUID(*bb))));
auto to_simple_date = get_castas_fctn(simple_date_type, timestamp_type);
return {simple_date_type->decompose(to_simple_date(ts))};
});
}
inline
shared_ptr<function>
make_timestamptodate_fct() {
return make_native_scalar_function<true>("todate", simple_date_type, { timestamp_type },
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
using namespace utils;
auto& bb = values[0];
if (!bb) {
return {};
}
auto ts_obj = timestamp_type->deserialize(*bb);
if (ts_obj.is_null()) {
return {};
}
auto to_simple_date = get_castas_fctn(simple_date_type, timestamp_type);
return {simple_date_type->decompose(to_simple_date(ts_obj))};
});
}
inline
shared_ptr<function>
make_timeuuidtotimestamp_fct() {
return make_native_scalar_function<true>("totimestamp", timestamp_type, { timeuuid_type },
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
using namespace utils;
auto& bb = values[0];
if (!bb) {
return {};
}
auto ts = db_clock::time_point(db_clock::duration(UUID_gen::unix_timestamp(UUID_gen::get_UUID(*bb))));
return {timestamp_type->decompose(ts)};
});
}
inline
shared_ptr<function>
make_datetotimestamp_fct() {
return make_native_scalar_function<true>("totimestamp", timestamp_type, { simple_date_type },
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
using namespace utils;
auto& bb = values[0];
if (!bb) {
return {};
}
auto simple_date_obj = simple_date_type->deserialize(*bb);
if (simple_date_obj.is_null()) {
return {};
}
auto from_simple_date = get_castas_fctn(timestamp_type, simple_date_type);
return {timestamp_type->decompose(from_simple_date(simple_date_obj))};
});
}
inline
shared_ptr<function>
make_timeuuidtounixtimestamp_fct() {
return make_native_scalar_function<true>("tounixtimestamp", long_type, { timeuuid_type },
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
using namespace utils;
auto& bb = values[0];
if (!bb) {
return {};
}
return {long_type->decompose(UUID_gen::unix_timestamp(UUID_gen::get_UUID(*bb)))};
});
}
inline
shared_ptr<function>
make_timestamptounixtimestamp_fct() {
return make_native_scalar_function<true>("tounixtimestamp", long_type, { timestamp_type },
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
using namespace utils;
auto& bb = values[0];
if (!bb) {
return {};
}
auto ts_obj = timestamp_type->deserialize(*bb);
if (ts_obj.is_null()) {
return {};
}
return {long_type->decompose(ts_obj)};
});
}
inline
shared_ptr<function>
make_datetounixtimestamp_fct() {
return make_native_scalar_function<true>("tounixtimestamp", long_type, { simple_date_type },
[] (cql_serialization_format sf, const std::vector<bytes_opt>& values) -> bytes_opt {
using namespace utils;
auto& bb = values[0];
if (!bb) {
return {};
}
auto simple_date_obj = simple_date_type->deserialize(*bb);
if (simple_date_obj.is_null()) {
return {};
}
auto from_simple_date = get_castas_fctn(timestamp_type, simple_date_type);
return {long_type->decompose(from_simple_date(simple_date_obj))};
});
}
}
}
}
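
The time-of-day arithmetic in the `make_currenttime_fct()` body shown in this hunk can be checked in isolation. The following is a minimal sketch, not code from either side of the diff: `nanoseconds_of_day` is a hypothetical helper name, and it assumes a non-negative count of milliseconds since the Unix epoch (for negative inputs the C++ `%` operator yields a negative remainder).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the diff): nanoseconds elapsed since
// midnight UTC, given milliseconds since the Unix epoch. Mirrors the
// arithmetic in the make_currenttime_fct() lambda.
constexpr int64_t nanoseconds_of_day(int64_t milliseconds_since_epoch) {
    constexpr int64_t milliseconds_in_day = 3600 * 24 * 1000;
    // Keep only the part of the timestamp that falls within the current day,
    // then scale milliseconds to nanoseconds.
    return (milliseconds_since_epoch % milliseconds_in_day) * 1000 * 1000;
}
```

Taking the remainder before multiplying keeps the intermediate value below one day's worth of nanoseconds, which fits comfortably in an `int64_t`.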

@@ -237,12 +237,7 @@ lists::precision_time::get_next(db_clock::time_point millis) {
void
lists::setter::execute(mutation& m, const clustering_key_prefix& prefix, const update_parameters& params) {
auto value = _t->bind(params._options);
execute(m, prefix, params, column, std::move(value));
}
void
lists::setter::execute(mutation& m, const clustering_key_prefix& prefix, const update_parameters& params, const column_definition& column, ::shared_ptr<terminal> value) {
const auto& value = _t->bind(params._options);
if (value == constants::UNSET_VALUE) {
return;
}
@@ -304,7 +299,7 @@ lists::setter_by_index::execute(mutation& m, const clustering_key_prefix& prefix
if (!value) {
mut.cells.emplace_back(eidx, params.make_dead_cell());
} else {
mut.cells.emplace_back(eidx, params.make_cell(*ltype->value_comparator(), *value, atomic_cell::collection_member::yes));
mut.cells.emplace_back(eidx, params.make_cell(*value));
}
auto smut = ltype->serialize_mutation_form(mut);
m.set_cell(prefix, column, atomic_cell_or_collection::from_collection_mutation(std::move(smut)));
@@ -331,7 +326,7 @@ lists::setter_by_uuid::execute(mutation& m, const clustering_key_prefix& prefix,
list_type_impl::mutation mut;
mut.cells.reserve(1);
mut.cells.emplace_back(to_bytes(*index), params.make_cell(*ltype->value_comparator(), *value, atomic_cell::collection_member::yes));
mut.cells.emplace_back(to_bytes(*index), params.make_cell(*value));
auto smut = ltype->serialize_mutation_form(mut);
m.set_cell(prefix, column,
atomic_cell_or_collection::from_collection_mutation(
@@ -370,7 +365,7 @@ lists::do_append(shared_ptr<term> value,
auto uuid1 = utils::UUID_gen::get_time_UUID_bytes();
auto uuid = bytes(reinterpret_cast<const int8_t*>(uuid1.data()), uuid1.size());
// FIXME: can e be empty?
appended.cells.emplace_back(std::move(uuid), params.make_cell(*ltype->value_comparator(), *e, atomic_cell::collection_member::yes));
appended.cells.emplace_back(std::move(uuid), params.make_cell(*e));
}
m.set_cell(prefix, column, ltype->serialize_mutation_form(appended));
} else {
@@ -379,7 +374,7 @@ lists::do_append(shared_ptr<term> value,
m.set_cell(prefix, column, params.make_dead_cell());
} else {
auto newv = list_value->get_with_protocol_version(cql_serialization_format::internal());
m.set_cell(prefix, column, params.make_cell(*column.type, std::move(newv)));
m.set_cell(prefix, column, params.make_cell(std::move(newv)));
}
}
}
@@ -400,14 +395,14 @@ lists::prepender::execute(mutation& m, const clustering_key_prefix& prefix, cons
mut.cells.reserve(lvalue->get_elements().size());
// We reverse the order of insertion, so that the last element gets the latest time
// (lists are sorted by time)
auto&& ltype = static_cast<const list_type_impl*>(column.type.get());
for (auto&& v : lvalue->_elements | boost::adaptors::reversed) {
auto&& pt = precision_time::get_next(time);
auto uuid = utils::UUID_gen::get_time_UUID_bytes(pt.millis.time_since_epoch().count(), pt.nanos);
mut.cells.emplace_back(bytes(uuid.data(), uuid.size()), params.make_cell(*ltype->value_comparator(), *v, atomic_cell::collection_member::yes));
mut.cells.emplace_back(bytes(uuid.data(), uuid.size()), params.make_cell(*v));
}
// now reverse again, to get the original order back
std::reverse(mut.cells.begin(), mut.cells.end());
auto&& ltype = static_cast<const list_type_impl*>(column.type.get());
m.set_cell(prefix, column, atomic_cell_or_collection::from_collection_mutation(ltype->serialize_mutation_form(std::move(mut))));
}

@@ -147,7 +147,6 @@ public:
: operation(column, std::move(t)) {
}
virtual void execute(mutation& m, const clustering_key_prefix& prefix, const update_parameters& params) override;
static void execute(mutation& m, const clustering_key_prefix& prefix, const update_parameters& params, const column_definition& column, ::shared_ptr<terminal> value);
};
class setter_by_index : public operation {

@@ -266,11 +266,6 @@ maps::marker::bind(const query_options& options) {
void
maps::setter::execute(mutation& m, const clustering_key_prefix& row_key, const update_parameters& params) {
auto value = _t->bind(params._options);
execute(m, row_key, params, column, std::move(value));
}
void
maps::setter::execute(mutation& m, const clustering_key_prefix& row_key, const update_parameters& params, const column_definition& column, ::shared_ptr<terminal> value) {
if (value == constants::UNSET_VALUE) {
return;
}
@@ -300,11 +295,10 @@ maps::setter_by_key::execute(mutation& m, const clustering_key_prefix& prefix, c
if (!key) {
throw invalid_request_exception("Invalid null map key");
}
auto ctype = static_pointer_cast<const map_type_impl>(column.type);
auto avalue = value ? params.make_cell(*ctype->get_values_type(), *value, atomic_cell::collection_member::yes) : params.make_dead_cell();
map_type_impl::mutation update;
update.cells.emplace_back(std::move(to_bytes(*key)), std::move(avalue));
auto avalue = value ? params.make_cell(*value) : params.make_dead_cell();
map_type_impl::mutation update = { {}, { { std::move(to_bytes(*key)), std::move(avalue) } } };
// should have been verified as map earlier?
auto ctype = static_pointer_cast<const map_type_impl>(column.type);
auto col_mut = ctype->serialize_mutation_form(std::move(update));
m.set_cell(prefix, column, std::move(col_mut));
}
@@ -329,10 +323,10 @@ maps::do_put(mutation& m, const clustering_key_prefix& prefix, const update_para
return;
}
auto ctype = static_pointer_cast<const map_type_impl>(column.type);
for (auto&& e : map_value->map) {
mut.cells.emplace_back(e.first, params.make_cell(*ctype->get_values_type(), e.second, atomic_cell::collection_member::yes));
mut.cells.emplace_back(e.first, params.make_cell(e.second));
}
auto ctype = static_pointer_cast<const map_type_impl>(column.type);
auto col_mut = ctype->serialize_mutation_form(std::move(mut));
m.set_cell(prefix, column, std::move(col_mut));
} else {
@@ -342,7 +336,7 @@ maps::do_put(mutation& m, const clustering_key_prefix& prefix, const update_para
} else {
auto v = map_type_impl::serialize_partially_deserialized_form({map_value->map.begin(), map_value->map.end()},
cql_serialization_format::internal());
m.set_cell(prefix, column, params.make_cell(*column.type, std::move(v)));
m.set_cell(prefix, column, params.make_cell(std::move(v)));
}
}
}

@@ -117,7 +117,6 @@ public:
}
virtual void execute(mutation& m, const clustering_key_prefix& row_key, const update_parameters& params) override;
static void execute(mutation& m, const clustering_key_prefix& row_key, const update_parameters& params, const column_definition& column, ::shared_ptr<terminal> value);
};
class setter_by_key : public operation {

@@ -87,15 +87,15 @@ public:
virtual ~operation() {}
static atomic_cell make_dead_cell(const update_parameters& params) {
atomic_cell make_dead_cell(const update_parameters& params) const {
return params.make_dead_cell();
}
static atomic_cell make_cell(const abstract_type& type, bytes_view value, const update_parameters& params) {
return params.make_cell(type, value);
atomic_cell make_cell(bytes_view value, const update_parameters& params) const {
return params.make_cell(value);
}
static atomic_cell make_counter_update_cell(int64_t delta, const update_parameters& params) {
atomic_cell make_counter_update_cell(int64_t delta, const update_parameters& params) const {
return params.make_counter_update_cell(delta);
}

@@ -68,14 +68,6 @@ public:
static thrift_prepared_id_type thrift_id(const prepared_cache_key_type& key) {
return key.key().second;
}
bool operator==(const prepared_cache_key_type& other) const {
return _key == other._key;
}
bool operator!=(const prepared_cache_key_type& other) const {
return !(*this == other);
}
};
class prepared_statements_cache {
@@ -110,9 +102,9 @@ private:
}
};
public:
static const std::chrono::minutes entry_expiry;
public:
using key_type = prepared_cache_key_type;
using value_type = checked_weak_ptr;
using statement_is_too_big = typename cache_type::entry_is_too_big;
@@ -124,8 +116,8 @@ private:
value_extractor_fn _value_extractor_fn;
public:
prepared_statements_cache(logging::logger& logger, size_t size)
: _cache(size, entry_expiry, logger)
prepared_statements_cache(logging::logger& logger)
: _cache(memory::stats().total_memory() / 256, entry_expiry, logger)
{}
template <typename LoadFunc>
@@ -163,10 +155,6 @@ public:
size_t memory_footprint() const {
return _cache.memory_footprint();
}
future<> stop() {
return _cache.stop();
}
};
}
@@ -180,11 +168,4 @@ inline std::ostream& operator<<(std::ostream& os, const cql3::prepared_cache_key
os << p.key();
return os;
}
template<>
struct hash<cql3::prepared_cache_key_type> final {
size_t operator()(const cql3::prepared_cache_key_type& k) const {
return utils::tuple_hash()(k.key());
}
};
}

@@ -46,11 +46,10 @@ namespace cql3 {
thread_local const query_options::specific_options query_options::specific_options::DEFAULT{-1, {}, {}, api::missing_timestamp};
thread_local query_options query_options::DEFAULT{db::consistency_level::ONE, infinite_timeout_config, std::experimental::nullopt,
thread_local query_options query_options::DEFAULT{db::consistency_level::ONE, std::experimental::nullopt,
std::vector<cql3::raw_value_view>(), false, query_options::specific_options::DEFAULT, cql_serialization_format::latest()};
query_options::query_options(db::consistency_level consistency,
const ::timeout_config& timeout_config,
std::experimental::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value> values,
std::vector<cql3::raw_value_view> value_views,
@@ -58,7 +57,6 @@ query_options::query_options(db::consistency_level consistency,
specific_options options,
cql_serialization_format sf)
: _consistency(consistency)
, _timeout_config(timeout_config)
, _names(std::move(names))
, _values(std::move(values))
, _value_views(value_views)
@@ -69,14 +67,12 @@ query_options::query_options(db::consistency_level consistency,
}
query_options::query_options(db::consistency_level consistency,
const ::timeout_config& timeout_config,
std::experimental::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value> values,
bool skip_metadata,
specific_options options,
cql_serialization_format sf)
: _consistency(consistency)
, _timeout_config(timeout_config)
, _names(std::move(names))
, _values(std::move(values))
, _value_views()
@@ -88,14 +84,12 @@ query_options::query_options(db::consistency_level consistency,
}
query_options::query_options(db::consistency_level consistency,
const ::timeout_config& timeout_config,
std::experimental::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value_view> value_views,
bool skip_metadata,
specific_options options,
cql_serialization_format sf)
: _consistency(consistency)
, _timeout_config(timeout_config)
, _names(std::move(names))
, _values()
, _value_views(std::move(value_views))
@@ -105,10 +99,9 @@ query_options::query_options(db::consistency_level consistency,
{
}
query_options::query_options(db::consistency_level cl, const ::timeout_config& timeout_config, std::vector<cql3::raw_value> values, specific_options options)
query_options::query_options(db::consistency_level cl, std::vector<cql3::raw_value> values, specific_options options)
: query_options(
cl,
timeout_config,
{},
std::move(values),
false,
@@ -120,7 +113,6 @@ query_options::query_options(db::consistency_level cl, const ::timeout_config& t
query_options::query_options(std::unique_ptr<query_options> qo, ::shared_ptr<service::pager::paging_state> paging_state)
: query_options(qo->_consistency,
qo->get_timeout_config(),
std::move(qo->_names),
std::move(qo->_values),
std::move(qo->_value_views),
@@ -132,7 +124,7 @@ query_options::query_options(std::unique_ptr<query_options> qo, ::shared_ptr<ser
query_options::query_options(std::vector<cql3::raw_value> values)
: query_options(
db::consistency_level::ONE, infinite_timeout_config, std::move(values))
db::consistency_level::ONE, std::move(values))
{}
db::consistency_level query_options::get_consistency() const
@@ -217,19 +209,18 @@ void query_options::prepare(const std::vector<::shared_ptr<column_specification>
}
auto& names = *_names;
std::vector<cql3::raw_value> ordered_values;
std::vector<cql3::raw_value_view> ordered_values;
ordered_values.reserve(specs.size());
for (auto&& spec : specs) {
auto& spec_name = spec->name->text();
for (size_t j = 0; j < names.size(); j++) {
if (names[j] == spec_name) {
ordered_values.emplace_back(_values[j]);
ordered_values.emplace_back(_value_views[j]);
break;
}
}
}
_values = std::move(ordered_values);
fill_value_views();
_value_views = std::move(ordered_values);
}
void query_options::fill_value_views()

@@ -44,14 +44,13 @@
#include <seastar/util/gcc6-concepts.hh>
#include "timestamp.hh"
#include "bytes.hh"
#include "db/consistency_level_type.hh"
#include "db/consistency_level.hh"
#include "service/query_state.hh"
#include "service/pager/paging_state.hh"
#include "cql3/column_specification.hh"
#include "cql3/column_identifier.hh"
#include "cql3/values.hh"
#include "cql_serialization_format.hh"
#include "timeout_config.hh"
namespace cql3 {
@@ -71,7 +70,6 @@ public:
};
private:
const db::consistency_level _consistency;
const timeout_config& _timeout_config;
const std::experimental::optional<std::vector<sstring_view>> _names;
std::vector<cql3::raw_value> _values;
std::vector<cql3::raw_value_view> _value_views;
@@ -105,14 +103,12 @@ public:
query_options(const query_options&) = delete;
explicit query_options(db::consistency_level consistency,
const timeout_config& timeouts,
std::experimental::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value> values,
bool skip_metadata,
specific_options options,
cql_serialization_format sf);
explicit query_options(db::consistency_level consistency,
const timeout_config& timeouts,
std::experimental::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value> values,
std::vector<cql3::raw_value_view> value_views,
@@ -120,7 +116,6 @@ public:
specific_options options,
cql_serialization_format sf);
explicit query_options(db::consistency_level consistency,
const timeout_config& timeouts,
std::experimental::optional<std::vector<sstring_view>> names,
std::vector<cql3::raw_value_view> value_views,
bool skip_metadata,
@@ -152,12 +147,10 @@ public:
// forInternalUse
explicit query_options(std::vector<cql3::raw_value> values);
explicit query_options(db::consistency_level, const timeout_config& timeouts,
std::vector<cql3::raw_value> values, specific_options options = specific_options::DEFAULT);
explicit query_options(db::consistency_level, std::vector<cql3::raw_value> values, specific_options options = specific_options::DEFAULT);
explicit query_options(std::unique_ptr<query_options>, ::shared_ptr<service::pager::paging_state> paging_state);
db::consistency_level get_consistency() const;
const timeout_config& get_timeout_config() const { return _timeout_config; }
cql3::raw_value_view get_value_at(size_t idx) const;
cql3::raw_value_view make_temporary(cql3::raw_value value) const;
size_t get_values_count() const;
@@ -168,11 +161,6 @@ public:
::shared_ptr<service::pager::paging_state> get_paging_state() const;
/** Serial consistency for conditional updates. */
std::experimental::optional<db::consistency_level> get_serial_consistency() const;
const std::experimental::optional<std::vector<sstring_view>>& get_names() const noexcept {
return _names;
}
api::timestamp_type get_timestamp(service::query_state& state) const;
/**
* The protocol version for the query. Will be 3 if the object doesn't come from
@@ -200,7 +188,7 @@ query_options::query_options(query_options&& o, std::vector<OneMutationDataRange
std::vector<query_options> tmp;
tmp.reserve(values_ranges.size());
std::transform(values_ranges.begin(), values_ranges.end(), std::back_inserter(tmp), [this](auto& values_range) {
return query_options(_consistency, _timeout_config, {}, std::move(values_range), _skip_metadata, _options, _cql_serialization_format);
return query_options(_consistency, {}, std::move(values_range), _skip_metadata, _options, _cql_serialization_format);
});
_batch_options = std::move(tmp);
}

@@ -58,7 +58,6 @@ using namespace cql_transport::messages;
logging::logger log("query_processor");
logging::logger prep_cache_log("prepared_statements_cache");
logging::logger authorized_prepared_statements_cache_log("authorized_prepared_statements_cache");
distributed<query_processor> _the_query_processor;
@@ -92,16 +91,12 @@ api::timestamp_type query_processor::next_timestamp() {
return _internal_state->next_timestamp();
}
query_processor::query_processor(service::storage_proxy& proxy, distributed<database>& db, query_processor::memory_config mcfg)
query_processor::query_processor(distributed<service::storage_proxy>& proxy, distributed<database>& db)
: _migration_subscriber{std::make_unique<migration_subscriber>(this)}
, _proxy(proxy)
, _db(db)
, _internal_state(new internal_state())
, _prepared_cache(prep_cache_log, mcfg.prepared_statment_cache_size)
, _authorized_prepared_cache(std::min(std::chrono::milliseconds(_db.local().get_config().permissions_validity_in_ms()),
std::chrono::duration_cast<std::chrono::milliseconds>(prepared_statements_cache::entry_expiry)),
std::chrono::milliseconds(_db.local().get_config().permissions_update_interval_in_ms()),
mcfg.authorized_prepared_cache_size, authorized_prepared_statements_cache_log) {
, _prepared_cache(prep_cache_log) {
namespace sm = seastar::metrics;
_metrics.add_group(
@@ -164,11 +159,6 @@ query_processor::query_processor(service::storage_proxy& proxy, distributed<data
sm::description("Counts a total number of LOGGED batches that were executed as UNLOGGED "
"batches.")),
sm::make_derive(
"rows_read",
_cql_stats.rows_read,
sm::description("Counts a total number of rows read during CQL requests.")),
sm::make_derive(
"prepared_cache_evictions",
[] { return prepared_statements_cache::shard_stats().prepared_cache_evictions; },
@@ -182,70 +172,7 @@ query_processor::query_processor(service::storage_proxy& proxy, distributed<data
sm::make_gauge(
"prepared_cache_memory_footprint",
[this] { return _prepared_cache.memory_footprint(); },
sm::description("Size (in bytes) of the prepared statements cache.")),
sm::make_derive(
"secondary_index_creates",
_cql_stats.secondary_index_creates,
sm::description("Counts a total number of CQL CREATE INDEX requests.")),
sm::make_derive(
"secondary_index_drops",
_cql_stats.secondary_index_drops,
sm::description("Counts a total number of CQL DROP INDEX requests.")),
// secondary_index_reads total count is also included in all cql reads
sm::make_derive(
"secondary_index_reads",
_cql_stats.secondary_index_reads,
sm::description("Counts a total number of CQL read requests performed using secondary indexes.")),
// secondary_index_rows_read total count is also included in all cql rows read
sm::make_derive(
"secondary_index_rows_read",
_cql_stats.secondary_index_rows_read,
sm::description("Counts a total number of rows read during CQL requests performed using secondary indexes.")),
// read requests that required ALLOW FILTERING
sm::make_derive(
"filtered_read_requests",
_cql_stats.filtered_reads,
sm::description("Counts a total number of CQL read requests that required ALLOW FILTERING. See filtered_rows_read_total to compare how many rows needed to be filtered.")),
// rows read with filtering enabled (because ALLOW FILTERING was required)
sm::make_derive(
"filtered_rows_read_total",
_cql_stats.filtered_rows_read_total,
sm::description("Counts a total number of rows read during CQL requests that required ALLOW FILTERING. See filtered_rows_matched_total and filtered_rows_dropped_total for information how accurate filtering queries are.")),
// rows read with filtering enabled and accepted by the filter
sm::make_derive(
"filtered_rows_matched_total",
_cql_stats.filtered_rows_matched_total,
sm::description("Counts a number of rows read during CQL requests that required ALLOW FILTERING and accepted by the filter. Number similar to filtered_rows_read_total indicates that filtering is accurate.")),
// rows read with filtering enabled and rejected by the filter
sm::make_derive(
"filtered_rows_dropped_total",
[this]() {return _cql_stats.filtered_rows_read_total - _cql_stats.filtered_rows_matched_total;},
sm::description("Counts a number of rows read during CQL requests that required ALLOW FILTERING and dropped by the filter. Number similar to filtered_rows_read_total indicates that filtering is not accurate and might cause performance degradation.")),
sm::make_derive(
"authorized_prepared_statements_cache_evictions",
[] { return authorized_prepared_statements_cache::shard_stats().authorized_prepared_statements_cache_evictions; },
sm::description("Counts a number of authenticated prepared statements cache entries evictions.")),
sm::make_gauge(
"authorized_prepared_statements_cache_size",
[this] { return _authorized_prepared_cache.size(); },
sm::description("A number of entries in the authenticated prepared statements cache.")),
sm::make_gauge(
"user_prepared_auth_cache_footprint",
[this] { return _authorized_prepared_cache.memory_footprint(); },
sm::description("Size (in bytes) of the authenticated prepared statements cache."))
});
sm::description("Size (in bytes) of the prepared statements cache."))});
service::get_local_migration_manager().register_listener(_migration_subscriber.get());
}
@@ -255,7 +182,7 @@ query_processor::~query_processor() {
future<> query_processor::stop() {
service::get_local_migration_manager().unregister_listener(_migration_subscriber.get());
return _authorized_prepared_cache.stop().finally([this] { return _prepared_cache.stop(); });
return make_ready_future<>();
}
future<::shared_ptr<result_message>>
@@ -275,55 +202,33 @@ query_processor::process(const sstring_view& query_string, service::query_state&
metrics.regularStatementsExecuted.inc();
#endif
tracing::trace(query_state.get_trace_state(), "Processing a statement");
return process_statement_unprepared(std::move(cql_statement), query_state, options);
return process_statement(std::move(cql_statement), query_state, options);
}
future<::shared_ptr<result_message>>
query_processor::process_statement_unprepared(
query_processor::process_statement(
::shared_ptr<cql_statement> statement,
service::query_state& query_state,
const query_options& options) {
return statement->check_access(query_state.get_client_state()).then([this, statement, &query_state, &options] () mutable {
return process_authorized_statement(std::move(statement), query_state, options);
});
}
return statement->check_access(query_state.get_client_state()).then([this, statement, &query_state, &options]() {
auto& client_state = query_state.get_client_state();
future<::shared_ptr<result_message>>
query_processor::process_statement_prepared(
statements::prepared_statement::checked_weak_ptr prepared,
cql3::prepared_cache_key_type cache_key,
service::query_state& query_state,
const query_options& options,
bool needs_authorization) {
statement->validate(_proxy, client_state);
::shared_ptr<cql_statement> statement = prepared->statement;
future<> fut = make_ready_future<>();
if (needs_authorization) {
fut = statement->check_access(query_state.get_client_state()).then([this, &query_state, prepared = std::move(prepared), cache_key = std::move(cache_key)] () mutable {
return _authorized_prepared_cache.insert(*query_state.get_client_state().user(), std::move(cache_key), std::move(prepared)).handle_exception([this] (auto eptr) {
log.error("failed to cache the entry", eptr);
});
});
}
return fut.then([this, statement = std::move(statement), &query_state, &options] () mutable {
return process_authorized_statement(std::move(statement), query_state, options);
});
}
future<::shared_ptr<result_message>>
query_processor::process_authorized_statement(const ::shared_ptr<cql_statement> statement, service::query_state& query_state, const query_options& options) {
auto& client_state = query_state.get_client_state();
statement->validate(_proxy, client_state);
auto fut = statement->execute(_proxy, query_state, options);
return fut.then([statement] (auto msg) {
if (msg) {
return make_ready_future<::shared_ptr<result_message>>(std::move(msg));
auto fut = make_ready_future<::shared_ptr<cql_transport::messages::result_message>>();
if (client_state.is_internal()) {
fut = statement->execute_internal(_proxy, query_state, options);
} else {
fut = statement->execute(_proxy, query_state, options);
}
return make_ready_future<::shared_ptr<result_message>>(::make_shared<result_message::void_message>());
return fut.then([statement] (auto msg) {
if (msg) {
return make_ready_future<::shared_ptr<result_message>>(std::move(msg));
}
return make_ready_future<::shared_ptr<result_message>>(
::make_shared<result_message::void_message>());
});
});
}
@@ -435,7 +340,6 @@ query_options query_processor::make_internal_options(
const statements::prepared_statement::checked_weak_ptr& p,
const std::initializer_list<data_value>& values,
db::consistency_level cl,
const timeout_config& timeout_config,
int32_t page_size) {
if (p->bound_names.size() != values.size()) {
throw std::invalid_argument(
@@ -459,11 +363,10 @@ query_options query_processor::make_internal_options(
api::timestamp_type ts = api::missing_timestamp;
return query_options(
cl,
timeout_config,
bound_values,
cql3::query_options::specific_options{page_size, std::move(paging_state), serial_consistency, ts});
}
return query_options(cl, timeout_config, bound_values);
return query_options(cl, bound_values);
}
statements::prepared_statement::checked_weak_ptr query_processor::prepare_internal(const sstring& query_string) {
@@ -494,7 +397,7 @@ struct internal_query_state {
::shared_ptr<internal_query_state> query_processor::create_paged_state(const sstring& query_string,
const std::initializer_list<data_value>& values, int32_t page_size) {
auto p = prepare_internal(query_string);
auto opts = make_internal_options(p, values, db::consistency_level::ONE, infinite_timeout_config, page_size);
auto opts = make_internal_options(p, values, db::consistency_level::ONE, page_size);
::shared_ptr<internal_query_state> res = ::make_shared<internal_query_state>(
internal_query_state{
query_string,
@@ -543,7 +446,7 @@ future<> query_processor::for_each_cql_result(
future<::shared_ptr<untyped_result_set>>
query_processor::execute_paged_internal(::shared_ptr<internal_query_state> state) {
return state->p->statement->execute(_proxy, *_internal_state, *state->opts).then(
return state->p->statement->execute_internal(_proxy, *_internal_state, *state->opts).then(
[state, this](::shared_ptr<cql_transport::messages::result_message> msg) mutable {
class visitor : public result_message::visitor_base {
::shared_ptr<internal_query_state> _state;
@@ -582,9 +485,9 @@ future<::shared_ptr<untyped_result_set>>
query_processor::execute_internal(
statements::prepared_statement::checked_weak_ptr p,
const std::initializer_list<data_value>& values) {
query_options opts = make_internal_options(p, values, db::consistency_level::ONE, infinite_timeout_config);
query_options opts = make_internal_options(p, values);
return do_with(std::move(opts), [this, p = std::move(p)](auto& opts) {
return p->statement->execute(
return p->statement->execute_internal(
_proxy,
*_internal_state,
opts).then([&opts, stmt = p->statement](auto msg) {
@@ -597,16 +500,15 @@ future<::shared_ptr<untyped_result_set>>
query_processor::process(
const sstring& query_string,
db::consistency_level cl,
const timeout_config& timeout_config,
const std::initializer_list<data_value>& values,
bool cache) {
if (cache) {
return process(prepare_internal(query_string), cl, timeout_config, values);
return process(prepare_internal(query_string), cl, values);
} else {
auto p = parse_statement(query_string)->prepare(_db.local(), _cql_stats);
p->statement->validate(_proxy, *_internal_state);
auto checked_weak_ptr = p->checked_weak_from_this();
return process(std::move(checked_weak_ptr), cl, timeout_config, values).finally([p = std::move(p)] {});
return process(std::move(checked_weak_ptr), cl, values).finally([p = std::move(p)] {});
}
}
@@ -614,9 +516,8 @@ future<::shared_ptr<untyped_result_set>>
query_processor::process(
statements::prepared_statement::checked_weak_ptr p,
db::consistency_level cl,
const timeout_config& timeout_config,
const std::initializer_list<data_value>& values) {
auto opts = make_internal_options(p, values, cl, timeout_config);
auto opts = make_internal_options(p, values, cl);
return do_with(std::move(opts), [this, p = std::move(p)](auto & opts) {
return p->statement->execute(_proxy, *_internal_state, opts).then([](auto msg) {
return make_ready_future<::shared_ptr<untyped_result_set>>(::make_shared<untyped_result_set>(msg));
@@ -628,18 +529,11 @@ future<::shared_ptr<cql_transport::messages::result_message>>
query_processor::process_batch(
::shared_ptr<statements::batch_statement> batch,
service::query_state& query_state,
query_options& options,
std::unordered_map<prepared_cache_key_type, authorized_prepared_statements_cache::value_type> pending_authorization_entries) {
return batch->check_access(query_state.get_client_state()).then([this, &query_state, &options, batch, pending_authorization_entries = std::move(pending_authorization_entries)] () mutable {
return parallel_for_each(pending_authorization_entries, [this, &query_state] (auto& e) {
return _authorized_prepared_cache.insert(*query_state.get_client_state().user(), e.first, std::move(e.second)).handle_exception([this] (auto eptr) {
log.error("failed to cache the entry", eptr);
});
}).then([this, &query_state, &options, batch] {
batch->validate();
batch->validate(_proxy, query_state.get_client_state());
return batch->execute(_proxy, query_state, options);
});
query_options& options) {
return batch->check_access(query_state.get_client_state()).then([this, &query_state, &options, batch] {
batch->validate();
batch->validate(_proxy, query_state.get_client_state());
return batch->execute(_proxy, query_state, options);
});
}


@@ -49,7 +49,6 @@
#include <seastar/core/shared_ptr.hh>
#include "cql3/prepared_statements_cache.hh"
#include "cql3/authorized_prepared_statements_cache.hh"
#include "cql3/query_options.hh"
#include "cql3/statements/prepared_statement.hh"
#include "cql3/statements/raw/parsed_statement.hh"
@@ -100,14 +99,10 @@ public:
class query_processor {
public:
class migration_subscriber;
struct memory_config {
size_t prepared_statment_cache_size = 0;
size_t authorized_prepared_cache_size = 0;
};
private:
std::unique_ptr<migration_subscriber> _migration_subscriber;
service::storage_proxy& _proxy;
distributed<service::storage_proxy>& _proxy;
distributed<database>& _db;
struct stats {
@@ -122,7 +117,6 @@ private:
std::unique_ptr<internal_state> _internal_state;
prepared_statements_cache _prepared_cache;
authorized_prepared_statements_cache _authorized_prepared_cache;
// A map for prepared statements used internally (which we don't want to mix with user statements; in particular, we
// don't bother with expiration on those).
@@ -141,7 +135,7 @@ public:
static ::shared_ptr<statements::raw::parsed_statement> parse_statement(const std::experimental::string_view& query);
query_processor(service::storage_proxy& proxy, distributed<database>& db, memory_config mcfg);
query_processor(distributed<service::storage_proxy>& proxy, distributed<database>& db);
~query_processor();
@@ -149,7 +143,7 @@ public:
return _db;
}
service::storage_proxy& proxy() {
distributed<service::storage_proxy>& proxy() {
return _proxy;
}
@@ -157,21 +151,6 @@ public:
return _cql_stats;
}
statements::prepared_statement::checked_weak_ptr get_prepared(const auth::authenticated_user* user_ptr, const prepared_cache_key_type& key) {
if (user_ptr) {
auto it = _authorized_prepared_cache.find(*user_ptr, key);
if (it != _authorized_prepared_cache.end()) {
try {
return it->get()->checked_weak_from_this();
} catch (seastar::checked_ptr_is_null_exception&) {
// If the prepared statement got invalidated - remove the corresponding authorized_prepared_statements_cache entry as well.
_authorized_prepared_cache.remove(*user_ptr, key);
}
}
}
return statements::prepared_statement::checked_weak_ptr();
}
statements::prepared_statement::checked_weak_ptr get_prepared(const prepared_cache_key_type& key) {
auto it = _prepared_cache.find(key);
if (it == _prepared_cache.end()) {
@@ -181,19 +160,11 @@ public:
}
future<::shared_ptr<cql_transport::messages::result_message>>
process_statement_unprepared(
process_statement(
::shared_ptr<cql_statement> statement,
service::query_state& query_state,
const query_options& options);
future<::shared_ptr<cql_transport::messages::result_message>>
process_statement_prepared(
statements::prepared_statement::checked_weak_ptr statement,
cql3::prepared_cache_key_type cache_key,
service::query_state& query_state,
const query_options& options,
bool needs_authorization);
future<::shared_ptr<cql_transport::messages::result_message>>
process(
const std::experimental::string_view& query_string,
@@ -244,14 +215,12 @@ public:
future<::shared_ptr<untyped_result_set>> process(
const sstring& query_string,
db::consistency_level,
const timeout_config& timeout_config,
const std::initializer_list<data_value>& = { },
bool cache = false);
future<::shared_ptr<untyped_result_set>> process(
statements::prepared_statement::checked_weak_ptr p,
db::consistency_level,
const timeout_config& timeout_config,
const std::initializer_list<data_value>& = { });
/*
@@ -273,11 +242,7 @@ public:
future<> stop();
future<::shared_ptr<cql_transport::messages::result_message>>
process_batch(
::shared_ptr<statements::batch_statement>,
service::query_state& query_state,
query_options& options,
std::unordered_map<prepared_cache_key_type, authorized_prepared_statements_cache::value_type> pending_authorization_entries);
process_batch(::shared_ptr<statements::batch_statement>, service::query_state& query_state, query_options& options);
std::unique_ptr<statements::prepared_statement> get_statement(
const std::experimental::string_view& query,
@@ -289,13 +254,9 @@ private:
query_options make_internal_options(
const statements::prepared_statement::checked_weak_ptr& p,
const std::initializer_list<data_value>&,
db::consistency_level,
const timeout_config& timeout_config,
db::consistency_level = db::consistency_level::ONE,
int32_t page_size = -1);
future<::shared_ptr<cql_transport::messages::result_message>>
process_authorized_statement(const ::shared_ptr<cql_statement> statement, service::query_state& query_state, const query_options& options);
/*!
* \brief created a state object for paging
*


@@ -95,32 +95,7 @@ public:
uint32_t size() const override {
return uint32_t(get_column_defs().size());
}
bool has_unrestricted_components(const schema& schema) const;
virtual bool needs_filtering(const schema& schema) const;
};
template<>
inline bool primary_key_restrictions<partition_key>::has_unrestricted_components(const schema& schema) const {
return size() < schema.partition_key_size();
}
template<>
inline bool primary_key_restrictions<clustering_key>::has_unrestricted_components(const schema& schema) const {
return size() < schema.clustering_key_size();
}
template<>
inline bool primary_key_restrictions<partition_key>::needs_filtering(const schema& schema) const {
return !empty() && !is_on_token() && (has_unrestricted_components(schema) || is_contains() || is_slice());
}
template<>
inline bool primary_key_restrictions<clustering_key>::needs_filtering(const schema& schema) const {
// Currently only overloaded single_column_primary_key_restrictions will require ALLOW FILTERING
return false;
}
}
}


@@ -64,15 +64,13 @@ class single_column_primary_key_restrictions : public primary_key_restrictions<V
using bounds_range_type = typename primary_key_restrictions<ValueType>::bounds_range_type;
private:
schema_ptr _schema;
bool _allow_filtering;
::shared_ptr<single_column_restrictions> _restrictions;
bool _slice;
bool _contains;
bool _in;
public:
single_column_primary_key_restrictions(schema_ptr schema, bool allow_filtering)
single_column_primary_key_restrictions(schema_ptr schema)
: _schema(schema)
, _allow_filtering(allow_filtering)
, _restrictions(::make_shared<single_column_restrictions>(schema))
, _slice(false)
, _contains(false)
@@ -112,7 +110,7 @@ public:
}
void do_merge_with(::shared_ptr<single_column_restriction> restriction) {
if (!_restrictions->empty() && !_allow_filtering) {
if (!_restrictions->empty()) {
auto last_column = *_restrictions->last_column();
auto new_column = restriction->get_column_def();
@@ -129,6 +127,11 @@ public:
last_column.name_as_text(), new_column.name_as_text()));
}
}
if (_in && _schema->position(new_column) > _schema->position(last_column)) {
throw exceptions::invalid_request_exception(sprint("Clustering column \"%s\" cannot be restricted by an IN relation",
new_column.name_as_text()));
}
}
_slice |= restriction->is_slice();
@@ -314,10 +317,6 @@ public:
fail(unimplemented::cause::LEGACY_COMPOSITE_KEYS); // not 100% correct...
}
const single_column_restrictions::restrictions_map& restrictions() const {
return _restrictions->restrictions();
}
virtual bool has_supporting_index(const secondary_index::secondary_index_manager& index_manager) const override {
return _restrictions->has_supporting_index(index_manager);
}
@@ -353,8 +352,6 @@ public:
_restrictions->restrictions() | boost::adaptors::map_values,
[&] (auto&& r) { return r->is_satisfied_by(schema, key, ckey, cells, options, now); });
}
virtual bool needs_filtering(const schema& schema) const override;
};
template<>
@@ -412,29 +409,6 @@ single_column_primary_key_restrictions<clustering_key_prefix>::bounds_ranges(con
return bounds;
}
template<>
bool single_column_primary_key_restrictions<partition_key>::needs_filtering(const schema& schema) const {
return primary_key_restrictions<partition_key>::needs_filtering(schema);
}
template<>
bool single_column_primary_key_restrictions<clustering_key>::needs_filtering(const schema& schema) const {
// Restrictions currently need filtering in three cases:
// 1. any of them is a CONTAINS restriction
// 2. restrictions do not form a contiguous prefix (i.e. there are gaps in it)
// 3. a SLICE restriction is not in the last position
column_id position = 0;
for (const auto& restriction : _restrictions->restrictions() | boost::adaptors::map_values) {
if (restriction->is_contains() || position != restriction->get_column_def().id) {
return true;
}
if (!restriction->is_slice()) {
position = restriction->get_column_def().id + 1;
}
}
return false;
}
}
}
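The clustering-key `needs_filtering()` specialization in the hunk above decides whether the restrictions form a contiguous prefix that ends in at most one slice. The following is a minimal, self-contained sketch of that rule, not the Scylla implementation: `column_restriction` and `clustering_needs_filtering` are hypothetical names, and the input is assumed to be ordered by column position, as the restrictions map in the diff appears to be.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a single-column restriction (not a Scylla type).
struct column_restriction {
    uint32_t column_id;  // position of the column within the clustering key
    bool is_contains;    // CONTAINS / CONTAINS KEY restriction
    bool is_slice;       // range restriction such as c > 5
};

// Returns true when the restrictions cannot be served as a plain clustering
// prefix. Assumes `restrictions` is sorted by column_id.
bool clustering_needs_filtering(const std::vector<column_restriction>& restrictions) {
    uint32_t position = 0;  // next column id expected in a gap-free prefix
    for (const auto& r : restrictions) {
        // Case 1: CONTAINS always filters. Case 2: a gap in the prefix.
        if (r.is_contains || position != r.column_id) {
            return true;
        }
        // Case 3: a slice does not advance `position`, so any restriction
        // that follows a slice is reported as a gap on the next iteration.
        if (!r.is_slice) {
            position = r.column_id + 1;
        }
    }
    return false;
}
```

With this sketch, an equality on column 0 followed by a slice on column 1 needs no filtering, while a slice on column 0 followed by an equality on column 1 does.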


@@ -93,8 +93,6 @@ public:
}
virtual bool is_supported_by(const secondary_index::index& index) const = 0;
using abstract_restriction::is_satisfied_by;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const = 0;
#if 0
/**
* Check if this type of restriction is supported by the specified index.
@@ -115,7 +113,7 @@ public:
class contains;
protected:
std::optional<atomic_cell_value_view> get_value(const schema& schema,
bytes_view_opt get_value(const schema& schema,
const partition_key& key,
const clustering_key_prefix& ckey,
const row& cells,
@@ -168,7 +166,6 @@ public:
const row& cells,
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
#if 0
@Override
@@ -204,8 +201,15 @@ public:
const row& cells,
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
virtual std::vector<bytes_opt> values_raw(const query_options& options) const = 0;
virtual std::vector<bytes_opt> values(const query_options& options) const override {
std::vector<bytes_opt> ret = values_raw(options);
std::sort(ret.begin(),ret.end());
ret.erase(std::unique(ret.begin(),ret.end()),ret.end());
return ret;
}
#if 0
@Override
protected final boolean isSupportedBy(SecondaryIndex index)
@@ -228,7 +232,7 @@ public:
return abstract_restriction::term_uses_function(_values, ks_name, function_name);
}
virtual std::vector<bytes_opt> values(const query_options& options) const override {
virtual std::vector<bytes_opt> values_raw(const query_options& options) const override {
std::vector<bytes_opt> ret;
for (auto&& v : _values) {
ret.emplace_back(to_bytes_opt(v->bind_and_get(options)));
@@ -253,7 +257,7 @@ public:
return false;
}
virtual std::vector<bytes_opt> values(const query_options& options) const override {
virtual std::vector<bytes_opt> values_raw(const query_options& options) const override {
auto&& lval = dynamic_pointer_cast<multi_item_terminal>(_marker->bind(options));
if (!lval) {
throw exceptions::invalid_request_exception("Invalid null value for IN restriction");
@@ -360,7 +364,6 @@ public:
const row& cells,
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
};
// This holds CONTAINS, CONTAINS_KEY, and map[key] = value restrictions because we might want to have any combination of them.
@@ -482,7 +485,6 @@ public:
const row& cells,
const query_options& options,
gc_clock::time_point now) const override;
virtual bool is_satisfied_by(bytes_view data, const query_options& options) const override;
#if 0
private List<ByteBuffer> keys(const query_options& options) {
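The `values()` override added for IN restrictions in this file sorts the output of `values_raw()` and drops duplicates, so each distinct IN value is returned once. Here is a minimal sketch of that sort-then-unique idiom. It assumes `std::optional<std::string>` as a stand-in for `bytes_opt`, and `dedup_values` is a hypothetical helper name, not part of the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Stand-in for bytes_opt; an assumption made for this sketch only.
using value_opt = std::optional<std::string>;

// Sorts the raw IN values and removes duplicates.
// std::unique only collapses adjacent equal elements, so the sort must come
// first. An empty optional compares less than any engaged one, so a null
// value (if present) ends up at the front.
std::vector<value_opt> dedup_values(std::vector<value_opt> raw) {
    std::sort(raw.begin(), raw.end());
    raw.erase(std::unique(raw.begin(), raw.end()), raw.end());
    return raw;
}
```

Taking the vector by value keeps the caller's copy untouched, which mirrors how the diff builds a fresh `ret` vector before sorting it.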


@@ -23,7 +23,6 @@
#include <boost/range/algorithm/transform.hpp>
#include <boost/range/algorithm.hpp>
#include <boost/range/adaptors.hpp>
#include <boost/algorithm/cxx11/any_of.hpp>
#include "statement_restrictions.hh"
#include "single_column_primary_key_restrictions.hh"
@@ -37,24 +36,19 @@
namespace cql3 {
namespace restrictions {
static logging::logger rlogger("restrictions");
using boost::adaptors::filtered;
using boost::adaptors::transformed;
template<typename T>
class statement_restrictions::initial_key_restrictions : public primary_key_restrictions<T> {
bool _allow_filtering;
public:
initial_key_restrictions(bool allow_filtering)
: _allow_filtering(allow_filtering) {}
using bounds_range_type = typename primary_key_restrictions<T>::bounds_range_type;
::shared_ptr<primary_key_restrictions<T>> do_merge_to(schema_ptr schema, ::shared_ptr<restriction> restriction) const {
if (restriction->is_multi_column()) {
throw std::runtime_error(sprint("%s not implemented", __PRETTY_FUNCTION__));
}
return ::make_shared<single_column_primary_key_restrictions<T>>(schema, _allow_filtering)->merge_to(schema, restriction);
return ::make_shared<single_column_primary_key_restrictions<T>>(schema)->merge_to(schema, restriction);
}
::shared_ptr<primary_key_restrictions<T>> merge_to(schema_ptr schema, ::shared_ptr<restriction> restriction) override {
if (restriction->is_multi_column()) {
@@ -63,7 +57,7 @@ public:
if (restriction->is_on_token()) {
return static_pointer_cast<token_restriction>(restriction);
}
return ::make_shared<single_column_primary_key_restrictions<T>>(schema, _allow_filtering)->merge_to(restriction);
return ::make_shared<single_column_primary_key_restrictions<T>>(schema)->merge_to(restriction);
}
void merge_with(::shared_ptr<restriction> restriction) override {
throw exceptions::unsupported_operation_exception();
@@ -128,10 +122,9 @@ statement_restrictions::initial_key_restrictions<clustering_key_prefix>::merge_t
}
template<typename T>
::shared_ptr<primary_key_restrictions<T>> statement_restrictions::get_initial_key_restrictions(bool allow_filtering) {
static thread_local ::shared_ptr<primary_key_restrictions<T>> initial_kr_true = ::make_shared<initial_key_restrictions<T>>(true);
static thread_local ::shared_ptr<primary_key_restrictions<T>> initial_kr_false = ::make_shared<initial_key_restrictions<T>>(false);
return allow_filtering ? initial_kr_true : initial_kr_false;
::shared_ptr<primary_key_restrictions<T>> statement_restrictions::get_initial_key_restrictions() {
static thread_local ::shared_ptr<primary_key_restrictions<T>> initial_kr = ::make_shared<initial_key_restrictions<T>>();
return initial_kr;
}
std::vector<::shared_ptr<column_identifier>>
@@ -148,10 +141,10 @@ statement_restrictions::get_partition_key_unrestricted_components() const {
return r;
}
statement_restrictions::statement_restrictions(schema_ptr schema, bool allow_filtering)
statement_restrictions::statement_restrictions(schema_ptr schema)
: _schema(schema)
, _partition_key_restrictions(get_initial_key_restrictions<partition_key>(allow_filtering))
, _clustering_columns_restrictions(get_initial_key_restrictions<clustering_key_prefix>(allow_filtering))
, _partition_key_restrictions(get_initial_key_restrictions<partition_key>())
, _clustering_columns_restrictions(get_initial_key_restrictions<clustering_key_prefix>())
, _nonprimary_key_restrictions(::make_shared<single_column_restrictions>(schema))
{ }
#if 0
@@ -169,9 +162,8 @@ statement_restrictions::statement_restrictions(database& db,
::shared_ptr<variable_specifications> bound_names,
bool selects_only_static_columns,
bool select_a_collection,
bool for_view,
bool allow_filtering)
: statement_restrictions(schema, allow_filtering)
bool for_view)
: statement_restrictions(schema)
{
/*
* WHERE clause. For a given entity, rules are: - EQ relation conflicts with anything else (including a 2nd EQ)
@@ -205,7 +197,7 @@ statement_restrictions::statement_restrictions(database& db,
throw exceptions::invalid_request_exception(sprint("restriction '%s' is only supported in materialized view creation", relation->to_string()));
}
} else {
add_restriction(relation->to_restriction(db, schema, bound_names), for_view, allow_filtering);
add_restriction(relation->to_restriction(db, schema, bound_names));
}
}
}
@@ -217,11 +209,11 @@ statement_restrictions::statement_restrictions(database& db,
|| _nonprimary_key_restrictions->has_supporting_index(sim);
// At this point, the select statement is fully constructed, but we still have a few things to validate
process_partition_key_restrictions(has_queriable_index, for_view, allow_filtering);
process_partition_key_restrictions(has_queriable_index, for_view);
// Some but not all of the partition key columns have been specified;
// hence we need to turn these restrictions into index expressions.
if (_uses_secondary_indexing || _partition_key_restrictions->needs_filtering(*_schema)) {
if (_uses_secondary_indexing) {
_index_restrictions.push_back(_partition_key_restrictions);
}
@@ -237,14 +229,13 @@ statement_restrictions::statement_restrictions(database& db,
}
}
process_clustering_columns_restrictions(has_queriable_index, select_a_collection, for_view, allow_filtering);
process_clustering_columns_restrictions(has_queriable_index, select_a_collection, for_view);
// Covers indexes on the first clustering column (among others).
if (_is_key_range && has_queriable_clustering_column_index) {
_uses_secondary_indexing = true;
}
if (_is_key_range && has_queriable_clustering_column_index)
_uses_secondary_indexing = true;
if (_uses_secondary_indexing || _clustering_columns_restrictions->needs_filtering(*_schema)) {
if (_uses_secondary_indexing) {
_index_restrictions.push_back(_clustering_columns_restrictions);
} else if (_clustering_columns_restrictions->is_contains()) {
fail(unimplemented::cause::INDEXES);
@@ -273,48 +264,31 @@ statement_restrictions::statement_restrictions(database& db,
uses_secondary_indexing = true;
#endif
}
// Even if uses_secondary_indexing is false at this point, we'll still have to use one if
// there are restrictions not covered by the PK.
if (!_nonprimary_key_restrictions->empty()) {
if (has_queriable_index) {
_uses_secondary_indexing = true;
} else if (!allow_filtering) {
throw exceptions::invalid_request_exception("Cannot execute this query as it might involve data filtering and "
"thus may have unpredictable performance. If you want to execute "
"this query despite the performance unpredictability, use ALLOW FILTERING");
}
_uses_secondary_indexing = true;
_index_restrictions.push_back(_nonprimary_key_restrictions);
}
if (_uses_secondary_indexing && !(for_view || allow_filtering)) {
if (_uses_secondary_indexing && !for_view) {
validate_secondary_index_selections(selects_only_static_columns);
}
}
void statement_restrictions::add_restriction(::shared_ptr<restriction> restriction, bool for_view, bool allow_filtering) {
void statement_restrictions::add_restriction(::shared_ptr<restriction> restriction) {
if (restriction->is_multi_column()) {
_clustering_columns_restrictions = _clustering_columns_restrictions->merge_to(_schema, restriction);
} else if (restriction->is_on_token()) {
_partition_key_restrictions = _partition_key_restrictions->merge_to(_schema, restriction);
} else {
add_single_column_restriction(::static_pointer_cast<single_column_restriction>(restriction), for_view, allow_filtering);
add_single_column_restriction(::static_pointer_cast<single_column_restriction>(restriction));
}
}
void statement_restrictions::add_single_column_restriction(::shared_ptr<single_column_restriction> restriction, bool for_view, bool allow_filtering) {
void statement_restrictions::add_single_column_restriction(::shared_ptr<single_column_restriction> restriction) {
auto& def = restriction->get_column_def();
if (def.is_partition_key()) {
// A SELECT query may not request a slice (range) of partition keys
// without using token(). This is because there is no way to do this
// query efficiently: murmur3 turns a contiguous range of partition
// keys into tokens all over the token space.
// However, in a SELECT statement used to define a materialized view,
// such a slice is fine - it is used to check whether individual
// partitions match, and does not present a performance problem.
assert(!restriction->is_on_token());
if (restriction->is_slice() && !for_view && !allow_filtering) {
throw exceptions::invalid_request_exception(
"Only EQ and IN relation are supported on the partition key (unless you use the token() function or allow filtering)");
}
_partition_key_restrictions = _partition_key_restrictions->merge_to(_schema, restriction);
} else if (def.is_clustering_key()) {
_clustering_columns_restrictions = _clustering_columns_restrictions->merge_to(_schema, restriction);
@@ -333,7 +307,7 @@ const std::vector<::shared_ptr<restrictions>>& statement_restrictions::index_res
return _index_restrictions;
}
void statement_restrictions::process_partition_key_restrictions(bool has_queriable_index, bool for_view, bool allow_filtering) {
void statement_restrictions::process_partition_key_restrictions(bool has_queriable_index, bool for_view) {
// If there is a queriable index, no special conditions are required on the other restrictions.
// But we still need to know 2 things:
// - If we don't have a queriable index, is the query ok
@@ -342,32 +316,28 @@ void statement_restrictions::process_partition_key_restrictions(bool has_queriab
// components must have a EQ. Only the last partition key component can be in IN relation.
if (_partition_key_restrictions->is_on_token()) {
_is_key_range = true;
} else if (_partition_key_restrictions->has_unrestricted_components(*_schema)) {
_is_key_range = true;
_uses_secondary_indexing = has_queriable_index;
}
if (_partition_key_restrictions->needs_filtering(*_schema)) {
if (!allow_filtering && !for_view && !has_queriable_index) {
throw exceptions::invalid_request_exception("Cannot execute this query as it might involve data filtering and "
"thus may have unpredictable performance. If you want to execute "
"this query despite the performance unpredictability, use ALLOW FILTERING");
} else if (has_partition_key_unrestricted_components()) {
if (!_partition_key_restrictions->empty() && !for_view) {
if (!has_queriable_index) {
throw exceptions::invalid_request_exception(sprint("Partition key parts: %s must be restricted as other parts are",
join(", ", get_partition_key_unrestricted_components())));
}
}
_is_key_range = true;
_uses_secondary_indexing = has_queriable_index;
}
}
bool statement_restrictions::has_partition_key_unrestricted_components() const {
return _partition_key_restrictions->has_unrestricted_components(*_schema);
return _partition_key_restrictions->size() < _schema->partition_key_size();
}
bool statement_restrictions::has_unrestricted_clustering_columns() const {
return _clustering_columns_restrictions->has_unrestricted_components(*_schema);
return _clustering_columns_restrictions->size() < _schema->clustering_key_size();
}
void statement_restrictions::process_clustering_columns_restrictions(bool has_queriable_index, bool select_a_collection, bool for_view, bool allow_filtering) {
void statement_restrictions::process_clustering_columns_restrictions(bool has_queriable_index, bool select_a_collection, bool for_view) {
if (!has_clustering_columns_restriction()) {
return;
}
@@ -376,36 +346,38 @@ void statement_restrictions::process_clustering_columns_restrictions(bool has_qu
throw exceptions::invalid_request_exception(
"Cannot restrict clustering columns by IN relations when a collection is selected by the query");
}
if (_clustering_columns_restrictions->is_contains() && !has_queriable_index && !allow_filtering) {
if (_clustering_columns_restrictions->is_contains() && !has_queriable_index) {
throw exceptions::invalid_request_exception(
"Cannot restrict clustering columns by a CONTAINS relation without a secondary index or filtering");
"Cannot restrict clustering columns by a CONTAINS relation without a secondary index");
}
if (has_clustering_columns_restriction() && _clustering_columns_restrictions->needs_filtering(*_schema)) {
if (has_queriable_index) {
_uses_secondary_indexing = true;
} else if (!allow_filtering && !for_view) {
auto clustering_columns_iter = _schema->clustering_key_columns().begin();
for (auto&& restricted_column : _clustering_columns_restrictions->get_column_defs()) {
const column_definition* clustering_column = &(*clustering_columns_iter);
++clustering_columns_iter;
if (clustering_column != restricted_column) {
throw exceptions::invalid_request_exception(sprint(
"PRIMARY KEY column \"%s\" cannot be restricted as preceding column \"%s\" is not restricted",
restricted_column->name_as_text(), clustering_column->name_as_text()));
}
auto clustering_columns_iter = _schema->clustering_key_columns().begin();
for (auto&& restricted_column : _clustering_columns_restrictions->get_column_defs()) {
const column_definition* clustering_column = &(*clustering_columns_iter);
++clustering_columns_iter;
if (clustering_column != restricted_column && !for_view) {
if (!has_queriable_index) {
throw exceptions::invalid_request_exception(sprint(
"PRIMARY KEY column \"%s\" cannot be restricted as preceding column \"%s\" is not restricted",
restricted_column->name_as_text(), clustering_column->name_as_text()));
}
_uses_secondary_indexing = true; // handle gaps and non-keyrange cases.
break;
}
}
if (_clustering_columns_restrictions->is_contains()) {
_uses_secondary_indexing = true;
}
}
dht::partition_range_vector statement_restrictions::get_partition_key_ranges(const query_options& options) const {
if (_partition_key_restrictions->empty()) {
return {dht::partition_range::make_open_ended_both_sides()};
}
if (_partition_key_restrictions->needs_filtering(*_schema)) {
return {dht::partition_range::make_open_ended_both_sides()};
}
return _partition_key_restrictions->bounds_ranges(options);
}
@@ -413,30 +385,18 @@ std::vector<query::clustering_range> statement_restrictions::get_clustering_boun
if (_clustering_columns_restrictions->empty()) {
return {query::clustering_range::make_open_ended_both_sides()};
}
// TODO(sarna): For filtering to work, clustering range is not bounded at all. For filtering to work faster,
// the biggest clustering prefix restriction should be used here.
if (_clustering_columns_restrictions->needs_filtering(*_schema)) {
return {query::clustering_range::make_open_ended_both_sides()};
}
return _clustering_columns_restrictions->bounds_ranges(options);
}
bool statement_restrictions::need_filtering() const {
bool statement_restrictions::need_filtering() {
uint32_t number_of_restricted_columns = 0;
for (auto&& restrictions : _index_restrictions) {
number_of_restricted_columns += restrictions->size();
}
if (_partition_key_restrictions->is_multi_column() || _clustering_columns_restrictions->is_multi_column()) {
// TODO(sarna): Implement ALLOW FILTERING support for multi-column restrictions - return false for now
// in order to ensure backwards compatibility
return false;
}
return number_of_restricted_columns > 1
|| (number_of_restricted_columns == 0 && _partition_key_restrictions->empty() && !_clustering_columns_restrictions->empty())
|| (number_of_restricted_columns != 0 && _nonprimary_key_restrictions->has_multiple_contains())
|| (number_of_restricted_columns != 0 && !_uses_secondary_indexing);
|| (number_of_restricted_columns == 0 && has_clustering_columns_restriction())
|| (number_of_restricted_columns != 0 && _nonprimary_key_restrictions->has_multiple_contains());
}
void statement_restrictions::validate_secondary_index_selections(bool selects_only_static_columns) {
@@ -454,34 +414,7 @@ void statement_restrictions::validate_secondary_index_selections(bool selects_on
}
}
const single_column_restrictions::restrictions_map& statement_restrictions::get_single_column_partition_key_restrictions() const {
static single_column_restrictions::restrictions_map empty;
auto single_restrictions = dynamic_pointer_cast<single_column_primary_key_restrictions<partition_key>>(_partition_key_restrictions);
if (!single_restrictions) {
if (dynamic_pointer_cast<initial_key_restrictions<partition_key>>(_partition_key_restrictions)) {
return empty;
}
throw std::runtime_error("statement restrictions for multi-column partition key restrictions are not implemented yet");
}
return single_restrictions->restrictions();
}
/**
* @return clustering key restrictions split into single column restrictions (e.g. for filtering support).
*/
const single_column_restrictions::restrictions_map& statement_restrictions::get_single_column_clustering_key_restrictions() const {
static single_column_restrictions::restrictions_map empty;
auto single_restrictions = dynamic_pointer_cast<single_column_primary_key_restrictions<clustering_key>>(_clustering_columns_restrictions);
if (!single_restrictions) {
if (dynamic_pointer_cast<initial_key_restrictions<clustering_key>>(_clustering_columns_restrictions)) {
return empty;
}
throw std::runtime_error("statement restrictions for multi-column partition key restrictions are not implemented yet");
}
return single_restrictions->restrictions();
}
static std::optional<atomic_cell_value_view> do_get_value(const schema& schema,
static bytes_view_opt do_get_value(const schema& schema,
const column_definition& cdef,
const partition_key& key,
const clustering_key_prefix& ckey,
@@ -489,21 +422,21 @@ static std::optional<atomic_cell_value_view> do_get_value(const schema& schema,
gc_clock::time_point now) {
switch(cdef.kind) {
case column_kind::partition_key:
return atomic_cell_value_view(key.get_component(schema, cdef.component_index()));
return key.get_component(schema, cdef.component_index());
case column_kind::clustering_key:
return atomic_cell_value_view(ckey.get_component(schema, cdef.component_index()));
return ckey.get_component(schema, cdef.component_index());
default:
auto cell = cells.find_cell(cdef.id);
if (!cell) {
return std::nullopt;
return stdx::nullopt;
}
assert(cdef.is_atomic());
auto c = cell->as_atomic_cell(cdef);
return c.is_dead(now) ? std::nullopt : std::optional<atomic_cell_value_view>(c.value());
auto c = cell->as_atomic_cell();
return c.is_dead(now) ? stdx::nullopt : bytes_view_opt(c.value());
}
}
std::optional<atomic_cell_value_view> single_column_restriction::get_value(const schema& schema,
bytes_view_opt single_column_restriction::get_value(const schema& schema,
const partition_key& key,
const clustering_key_prefix& ckey,
const row& cells,
@@ -523,24 +456,11 @@ bool single_column_restriction::EQ::is_satisfied_by(const schema& schema,
auto operand = value(options);
if (operand) {
auto cell_value = get_value(schema, key, ckey, cells, now);
if (!cell_value) {
return false;
}
return cell_value->with_linearized([&] (bytes_view cell_value_bv) {
return _column_def.type->compare(*operand, cell_value_bv) == 0;
});
return cell_value && _column_def.type->compare(*operand, *cell_value) == 0;
}
return false;
}
bool single_column_restriction::EQ::is_satisfied_by(bytes_view data, const query_options& options) const {
if (_column_def.type->is_counter()) {
fail(unimplemented::cause::COUNTERS);
}
auto operand = value(options);
return operand && _column_def.type->compare(*operand, data) == 0;
}
bool single_column_restriction::IN::is_satisfied_by(const schema& schema,
const partition_key& key,
const clustering_key_prefix& ckey,
@@ -555,20 +475,8 @@ bool single_column_restriction::IN::is_satisfied_by(const schema& schema,
return false;
}
auto operands = values(options);
return cell_value->with_linearized([&] (bytes_view cell_value_bv) {
return std::any_of(operands.begin(), operands.end(), [&] (auto&& operand) {
return operand && _column_def.type->compare(*operand, cell_value_bv) == 0;
});
});
}
bool single_column_restriction::IN::is_satisfied_by(bytes_view data, const query_options& options) const {
if (_column_def.type->is_counter()) {
fail(unimplemented::cause::COUNTERS);
}
auto operands = values(options);
return boost::algorithm::any_of(operands, [this, &data] (const bytes_opt& operand) {
return operand && _column_def.type->compare(*operand, data) == 0;
return operand && _column_def.type->compare(*operand, *cell_value) == 0;
});
}
@@ -602,16 +510,7 @@ bool single_column_restriction::slice::is_satisfied_by(const schema& schema,
if (!cell_value) {
return false;
}
return cell_value->with_linearized([&] (bytes_view cell_value_bv) {
return to_range(_slice, options).contains(cell_value_bv, _column_def.type->as_tri_comparator());
});
}
bool single_column_restriction::slice::is_satisfied_by(bytes_view data, const query_options& options) const {
if (_column_def.type->is_counter()) {
fail(unimplemented::cause::COUNTERS);
}
return to_range(_slice, options).contains(data, _column_def.type->as_tri_comparator());
return to_range(_slice, options).contains(*cell_value, _column_def.type->as_tri_comparator());
}
bool single_column_restriction::contains::is_satisfied_by(const schema& schema,
@@ -637,8 +536,7 @@ bool single_column_restriction::contains::is_satisfied_by(const schema& schema,
auto&& element_type = col_type->is_set() ? col_type->name_comparator() : col_type->value_comparator();
if (_column_def.type->is_multi_cell()) {
auto cell = cells.find_cell(_column_def.id);
return cell->as_collection_mutation().data.with_linearized([&] (bytes_view collection_bv) {
auto&& elements = col_type->deserialize_mutation_form(collection_bv).cells;
auto&& elements = col_type->deserialize_mutation_form(cell->as_collection_mutation()).cells;
auto end = std::remove_if(elements.begin(), elements.end(), [now] (auto&& element) {
return element.second.is_dead(now);
});
@@ -648,9 +546,7 @@ bool single_column_restriction::contains::is_satisfied_by(const schema& schema,
continue;
}
auto found = std::find_if(elements.begin(), end, [&] (auto&& element) {
return element.second.value().with_linearized([&] (bytes_view value_bv) {
return element_type->compare(value_bv, *val) == 0;
});
return element_type->compare(element.second.value(), *val) == 0;
});
if (found == end) {
return false;
@@ -677,26 +573,16 @@ bool single_column_restriction::contains::is_satisfied_by(const schema& schema,
auto found = std::find_if(elements.begin(), end, [&] (auto&& element) {
return map_key_type->compare(element.first, *map_key) == 0;
});
if (found == end) {
return false;
}
auto cmp = found->second.value().with_linearized([&] (bytes_view value_bv) {
return element_type->compare(value_bv, *map_value);
});
if (cmp != 0) {
if (found == end || element_type->compare(found->second.value(), *map_value) != 0) {
return false;
}
}
return true;
});
} else {
auto cell_value = get_value(schema, key, ckey, cells, now);
if (!cell_value) {
return false;
}
auto deserialized = cell_value->with_linearized([&] (bytes_view cell_value_bv) {
return _column_def.type->deserialize(cell_value_bv);
});
auto deserialized = _column_def.type->deserialize(*cell_value);
for (auto&& value : _values) {
auto val = value->bind_and_get(options);
if (!val) {
@@ -756,11 +642,6 @@ bool single_column_restriction::contains::is_satisfied_by(const schema& schema,
return true;
}
bool single_column_restriction::contains::is_satisfied_by(bytes_view data, const query_options& options) const {
//TODO(sarna): Deserialize & return. It would be nice to deduplicate, is_satisfied_by above is rather long
fail(unimplemented::cause::INDEXES);
}
bool token_restriction::EQ::is_satisfied_by(const schema& schema,
const partition_key& key,
const clustering_key_prefix& ckey,
@@ -772,9 +653,7 @@ bool token_restriction::EQ::is_satisfied_by(const schema& schema,
for (auto&& operand : values(options)) {
if (operand) {
auto cell_value = do_get_value(schema, **cdef, key, ckey, cells, now);
satisfied = cell_value && cell_value->with_linearized([&] (bytes_view cell_value_bv) {
return (*cdef)->type->compare(*operand, cell_value_bv) == 0;
});
satisfied = cell_value && (*cdef)->type->compare(*operand, *cell_value) == 0;
}
if (!satisfied) {
break;
@@ -796,9 +675,7 @@ bool token_restriction::slice::is_satisfied_by(const schema& schema,
if (!cell_value) {
return false;
}
satisfied = cell_value->with_linearized([&] (bytes_view cell_value_bv) {
return range.contains(cell_value_bv, cdef->type->as_tri_comparator());
});
satisfied = range.contains(*cell_value, cdef->type->as_tri_comparator());
if (!satisfied) {
break;
}

@@ -67,7 +67,7 @@ private:
class initial_key_restrictions;
template<typename T>
static ::shared_ptr<primary_key_restrictions<T>> get_initial_key_restrictions(bool allow_filtering);
static ::shared_ptr<primary_key_restrictions<T>> get_initial_key_restrictions();
/**
* Restrictions on partitioning columns
@@ -108,7 +108,7 @@ public:
* @param cfm the column family meta data
* @return a new empty <code>StatementRestrictions</code>.
*/
statement_restrictions(schema_ptr schema, bool allow_filtering);
statement_restrictions(schema_ptr schema);
statement_restrictions(database& db,
schema_ptr schema,
@@ -117,11 +117,10 @@ public:
::shared_ptr<variable_specifications> bound_names,
bool selects_only_static_columns,
bool select_a_collection,
bool for_view = false,
bool allow_filtering = false);
bool for_view = false);
private:
void add_restriction(::shared_ptr<restriction> restriction, bool for_view, bool allow_filtering);
void add_single_column_restriction(::shared_ptr<single_column_restriction> restriction, bool for_view, bool allow_filtering);
void add_restriction(::shared_ptr<restriction> restriction);
void add_single_column_restriction(::shared_ptr<single_column_restriction> restriction);
public:
bool uses_function(const sstring& ks_name, const sstring& function_name) const;
@@ -175,7 +174,7 @@ public:
*/
bool has_unrestricted_clustering_columns() const;
private:
void process_partition_key_restrictions(bool has_queriable_index, bool for_view, bool allow_filtering);
void process_partition_key_restrictions(bool has_queriable_index, bool for_view);
/**
* Returns the partition key components that are not restricted.
@@ -190,7 +189,7 @@ private:
* @param select_a_collection <code>true</code> if the query should return a collection column
* @throws InvalidRequestException if the request is invalid
*/
void process_clustering_columns_restrictions(bool has_queriable_index, bool select_a_collection, bool for_view, bool allow_filtering);
void process_clustering_columns_restrictions(bool has_queriable_index, bool select_a_collection, bool for_view);
/**
* Returns the <code>Restrictions</code> for the specified type of columns.
@@ -358,7 +357,7 @@ public:
* Checks if the query need to use filtering.
* @return <code>true</code> if the query need to use filtering, <code>false</code> otherwise.
*/
bool need_filtering() const;
bool need_filtering();
void validate_secondary_index_selections(bool selects_only_static_columns);
@@ -399,16 +398,6 @@ public:
const single_column_restrictions::restrictions_map& get_non_pk_restriction() const {
return _nonprimary_key_restrictions->restrictions();
}
/**
* @return partition key restrictions split into single column restrictions (e.g. for filtering support).
*/
const single_column_restrictions::restrictions_map& get_single_column_partition_key_restrictions() const;
/**
* @return clustering key restrictions split into single column restrictions (e.g. for filtering support).
*/
const single_column_restrictions::restrictions_map& get_single_column_clustering_key_restrictions() const;
};
}

@@ -1,139 +0,0 @@
/*
* Copyright (C) 2018 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "selection/selection.hh"
#include "stats.hh"
namespace cql3 {
class result_generator {
schema_ptr _schema;
foreign_ptr<lw_shared_ptr<query::result>> _result;
lw_shared_ptr<const query::read_command> _command;
shared_ptr<const selection::selection> _selection;
cql_stats* _stats;
private:
template<typename Visitor>
class query_result_visitor {
const schema& _schema;
std::vector<bytes> _partition_key;
std::vector<bytes> _clustering_key;
uint32_t _partition_row_count = 0;
uint32_t _total_row_count = 0;
Visitor& _visitor;
const selection::selection& _selection;
private:
void accept_cell_value(const column_definition& def, query::result_row_view::iterator_type& i) {
if (def.is_multi_cell()) {
_visitor.accept_value(i.next_collection_cell());
} else {
auto cell = i.next_atomic_cell();
_visitor.accept_value(cell ? std::optional<query::result_bytes_view>(cell->value()) : std::optional<query::result_bytes_view>());
}
}
public:
query_result_visitor(const schema& s, Visitor& visitor, const selection::selection& select)
: _schema(s), _visitor(visitor), _selection(select) { }
void accept_new_partition(const partition_key& key, uint32_t row_count) {
_partition_key = key.explode(_schema);
accept_new_partition(row_count);
}
void accept_new_partition(uint32_t row_count) {
_partition_row_count = row_count;
_total_row_count += row_count;
}
void accept_new_row(const clustering_key& key, query::result_row_view static_row,
query::result_row_view row) {
_clustering_key = key.explode(_schema);
accept_new_row(static_row, row);
}
void accept_new_row(query::result_row_view static_row, query::result_row_view row) {
auto static_row_iterator = static_row.iterator();
auto row_iterator = row.iterator();
_visitor.start_row();
for (auto&& def : _selection.get_columns()) {
switch (def->kind) {
case column_kind::partition_key:
_visitor.accept_value(query::result_bytes_view(bytes_view(_partition_key[def->component_index()])));
break;
case column_kind::clustering_key:
if (_clustering_key.size() > def->component_index()) {
_visitor.accept_value(query::result_bytes_view(bytes_view(_clustering_key[def->component_index()])));
} else {
_visitor.accept_value({});
}
break;
case column_kind::regular_column:
accept_cell_value(*def, row_iterator);
break;
case column_kind::static_column:
accept_cell_value(*def, static_row_iterator);
break;
}
}
_visitor.end_row();
}
void accept_partition_end(const query::result_row_view& static_row) {
if (_partition_row_count == 0) {
_total_row_count++;
_visitor.start_row();
auto static_row_iterator = static_row.iterator();
for (auto&& def : _selection.get_columns()) {
if (def->is_partition_key()) {
_visitor.accept_value(query::result_bytes_view(bytes_view(_partition_key[def->component_index()])));
} else if (def->is_static()) {
accept_cell_value(*def, static_row_iterator);
} else {
_visitor.accept_value({});
}
}
_visitor.end_row();
}
}
uint32_t rows_read() const { return _total_row_count; }
};
public:
result_generator() = default;
result_generator(schema_ptr s, foreign_ptr<lw_shared_ptr<query::result>> result, lw_shared_ptr<const query::read_command> cmd,
::shared_ptr<const selection::selection> select, cql_stats& stats)
: _schema(std::move(s))
, _result(std::move(result))
, _command(std::move(cmd))
, _selection(std::move(select))
, _stats(&stats)
{ }
template<typename Visitor>
void visit(Visitor&& visitor) const {
query_result_visitor<Visitor> v(*_schema, visitor, *_selection);
query::result_view::consume(*_result, _command->slice, v);
_stats->rows_read += v.rows_read();
}
};
}

@@ -47,12 +47,6 @@
#include "service/pager/paging_state.hh"
#include "schema.hh"
#include "query-result-reader.hh"
#include "result_generator.hh"
#include <seastar/util/gcc6-concepts.hh>
namespace cql3 {
class metadata {
@@ -137,22 +131,10 @@ public:
const std::vector<uint16_t>& partition_key_bind_indices() const;
};
GCC6_CONCEPT(
template<typename Visitor>
concept bool ResultVisitor = requires(Visitor& visitor) {
visitor.start_row();
visitor.accept_value(std::optional<query::result_bytes_view>());
visitor.end_row();
};
)
class result_set {
public:
::shared_ptr<metadata> _metadata;
std::deque<std::vector<bytes_opt>> _rows;
friend class result;
public:
result_set(std::vector<::shared_ptr<column_specification>> metadata_);
@@ -181,80 +163,6 @@ public:
// Returns a range of rows. A row is a range of bytes_opt.
const std::deque<std::vector<bytes_opt>>& rows() const;
template<typename Visitor>
GCC6_CONCEPT(requires ResultVisitor<Visitor>)
void visit(Visitor&& visitor) const {
auto column_count = get_metadata().column_count();
for (auto& row : _rows) {
visitor.start_row();
for (auto i = 0u; i < column_count; i++) {
auto& cell = row[i];
visitor.accept_value(cell ? std::optional<query::result_bytes_view>(*cell) : std::optional<query::result_bytes_view>());
}
visitor.end_row();
}
}
class builder;
};
class result_set::builder {
result_set _result;
std::vector<bytes_opt> _current_row;
public:
explicit builder(shared_ptr<metadata> mtd)
: _result(std::move(mtd)) { }
void start_row() { }
void accept_value(std::optional<query::result_bytes_view> value) {
if (!value) {
_current_row.emplace_back();
return;
}
_current_row.emplace_back(value->linearize());
}
void end_row() {
_result.add_row(std::exchange(_current_row, { }));
}
result_set get_result_set() && { return std::move(_result); }
};
class result {
std::unique_ptr<cql3::result_set> _result_set;
result_generator _result_generator;
shared_ptr<cql3::metadata> _metadata;
public:
explicit result(std::unique_ptr<cql3::result_set> rs)
: _result_set(std::move(rs))
, _metadata(_result_set->_metadata)
{ }
explicit result(result_generator generator, shared_ptr<metadata> m)
: _result_generator(std::move(generator))
, _metadata(std::move(m))
{ }
const cql3::metadata& get_metadata() const { return *_metadata; }
cql3::result_set result_set() const {
if (_result_set) {
return *_result_set;
} else {
auto builder = result_set::builder(_metadata);
_result_generator.visit(builder);
return std::move(builder).get_result_set();
}
}
template<typename Visitor>
GCC6_CONCEPT(requires ResultVisitor<Visitor>)
void visit(Visitor&& visitor) const {
if (_result_set) {
_result_set->visit(std::forward<Visitor>(visitor));
} else {
_result_generator.visit(std::forward<Visitor>(visitor));
}
}
};
}

@@ -112,32 +112,6 @@ selectable::with_function::raw::make_count_rows_function() {
std::vector<shared_ptr<cql3::selection::selectable::raw>>());
}
shared_ptr<selector::factory>
selectable::with_anonymous_function::new_selector_factory(database& db, schema_ptr s, std::vector<const column_definition*>& defs) {
auto&& factories = selector_factories::create_factories_and_collect_column_definitions(_args, db, s, defs);
return abstract_function_selector::new_factory(_function, std::move(factories));
}
sstring
selectable::with_anonymous_function::to_string() const {
return sprint("%s(%s)", _function->name().name, join(", ", _args));
}
shared_ptr<selectable>
selectable::with_anonymous_function::raw::prepare(schema_ptr s) {
std::vector<shared_ptr<selectable>> prepared_args;
prepared_args.reserve(_args.size());
for (auto&& arg : _args) {
prepared_args.push_back(arg->prepare(s));
}
return ::make_shared<with_anonymous_function>(_function, std::move(prepared_args));
}
bool
selectable::with_anonymous_function::raw::processes_selection() const {
return true;
}
shared_ptr<selector::factory>
selectable::with_field_selection::new_selector_factory(database& db, schema_ptr s, std::vector<const column_definition*>& defs) {
auto&& factory = _selected->new_selector_factory(db, s, defs);

@@ -46,7 +46,6 @@
#include "core/shared_ptr.hh"
#include "cql3/selection/selector.hh"
#include "cql3/cql3_type.hh"
#include "cql3/functions/function.hh"
#include "cql3/functions/function_name.hh"
namespace cql3 {
@@ -83,7 +82,6 @@ public:
class writetime_or_ttl;
class with_function;
class with_anonymous_function;
class with_field_selection;
@@ -116,28 +114,6 @@ public:
};
};
class selectable::with_anonymous_function : public selectable {
shared_ptr<functions::function> _function;
std::vector<shared_ptr<selectable>> _args;
public:
with_anonymous_function(::shared_ptr<functions::function> f, std::vector<shared_ptr<selectable>> args)
: _function(f), _args(std::move(args)) {
}
virtual sstring to_string() const override;
virtual shared_ptr<selector::factory> new_selector_factory(database& db, schema_ptr s, std::vector<const column_definition*>& defs) override;
class raw : public selectable::raw {
shared_ptr<functions::function> _function;
std::vector<shared_ptr<selectable::raw>> _args;
public:
raw(shared_ptr<functions::function> f, std::vector<shared_ptr<selectable::raw>> args)
: _function(f), _args(std::move(args)) {
}
virtual shared_ptr<selectable> prepare(schema_ptr s) override;
virtual bool processes_selection() const override;
};
};
class selectable::with_cast : public selectable {
::shared_ptr<selectable> _arg;

@@ -53,15 +53,13 @@ selection::selection(schema_ptr schema,
std::vector<const column_definition*> columns,
std::vector<::shared_ptr<column_specification>> metadata_,
bool collect_timestamps,
bool collect_TTLs,
trivial is_trivial)
bool collect_TTLs)
: _schema(std::move(schema))
, _columns(std::move(columns))
, _metadata(::make_shared<metadata>(std::move(metadata_)))
, _collect_timestamps(collect_timestamps)
, _collect_TTLs(collect_TTLs)
, _contains_static_columns(std::any_of(_columns.begin(), _columns.end(), std::mem_fn(&column_definition::is_static)))
, _is_trivial(is_trivial)
{ }
query::partition_slice::option_set selection::get_query_options() {
@@ -102,7 +100,7 @@ public:
*/
simple_selection(schema_ptr schema, std::vector<const column_definition*> columns,
std::vector<::shared_ptr<column_specification>> metadata, bool is_wildcard)
: selection(schema, std::move(columns), std::move(metadata), false, false, trivial::yes)
: selection(schema, std::move(columns), std::move(metadata), false, false)
, _is_wildcard(is_wildcard)
{ }
@@ -330,86 +328,93 @@ std::unique_ptr<result_set> result_set_builder::build() {
return std::move(_result_set);
}
bool result_set_builder::restrictions_filter::operator()(const selection& selection,
const std::vector<bytes>& partition_key,
const std::vector<bytes>& clustering_key,
const query::result_row_view& static_row,
const query::result_row_view& row) const {
static logging::logger rlogger("restrictions_filter");
result_set_builder::visitor::visitor(
cql3::selection::result_set_builder& builder, const schema& s,
const selection& selection)
: _builder(builder), _schema(s), _selection(selection), _row_count(0) {
}
if (_current_pratition_key_does_not_match || _current_static_row_does_not_match) {
return false;
void result_set_builder::visitor::add_value(const column_definition& def,
query::result_row_view::iterator_type& i) {
if (def.type->is_multi_cell()) {
auto cell = i.next_collection_cell();
if (!cell) {
_builder.add_empty();
return;
}
_builder.add_collection(def, *cell);
} else {
auto cell = i.next_atomic_cell();
if (!cell) {
_builder.add_empty();
return;
}
_builder.add(def, *cell);
}
}
void result_set_builder::visitor::accept_new_partition(const partition_key& key,
uint32_t row_count) {
_partition_key = key.explode(_schema);
_row_count = row_count;
}
void result_set_builder::visitor::accept_new_partition(uint32_t row_count) {
_row_count = row_count;
}
void result_set_builder::visitor::accept_new_row(const clustering_key& key,
const query::result_row_view& static_row,
const query::result_row_view& row) {
_clustering_key = key.explode(_schema);
accept_new_row(static_row, row);
}
void result_set_builder::visitor::accept_new_row(
const query::result_row_view& static_row,
const query::result_row_view& row) {
auto static_row_iterator = static_row.iterator();
auto row_iterator = row.iterator();
auto non_pk_restrictions_map = _restrictions->get_non_pk_restriction();
auto partition_key_restrictions_map = _restrictions->get_single_column_partition_key_restrictions();
auto clustering_key_restrictions_map = _restrictions->get_single_column_clustering_key_restrictions();
for (auto&& cdef : selection.get_columns()) {
switch (cdef->kind) {
case column_kind::static_column:
// fallthrough
case column_kind::regular_column:
if (cdef->type->is_multi_cell()) {
rlogger.debug("Multi-cell filtering is not implemented yet", cdef->name_as_text());
_builder.new_row();
for (auto&& def : _selection.get_columns()) {
switch (def->kind) {
case column_kind::partition_key:
_builder.add(_partition_key[def->component_index()]);
break;
case column_kind::clustering_key:
if (_clustering_key.size() > def->component_index()) {
_builder.add(_clustering_key[def->component_index()]);
} else {
auto cell_iterator = (cdef->kind == column_kind::static_column) ? static_row_iterator : row_iterator;
auto cell = cell_iterator.next_atomic_cell();
auto restr_it = non_pk_restrictions_map.find(cdef);
if (restr_it == non_pk_restrictions_map.end()) {
continue;
}
restrictions::single_column_restriction& restriction = *restr_it->second;
bool regular_restriction_matches;
if (cell) {
regular_restriction_matches = cell->value().with_linearized([&restriction](bytes_view data) {
return restriction.is_satisfied_by(data, cql3::query_options({ }));
});
} else {
regular_restriction_matches = restriction.is_satisfied_by(bytes(), cql3::query_options({ }));
}
if (!regular_restriction_matches) {
_current_static_row_does_not_match = (cdef->kind == column_kind::static_column);
return false;
}
_builder.add({});
}
break;
case column_kind::partition_key: {
auto restr_it = partition_key_restrictions_map.find(cdef);
if (restr_it == partition_key_restrictions_map.end()) {
continue;
}
restrictions::single_column_restriction& restriction = *restr_it->second;
const bytes& value_to_check = partition_key[cdef->id];
bool pk_restriction_matches = restriction.is_satisfied_by(value_to_check, cql3::query_options({ }));
if (!pk_restriction_matches) {
_current_pratition_key_does_not_match = true;
return false;
}
}
case column_kind::regular_column:
add_value(*def, row_iterator);
break;
case column_kind::clustering_key: {
auto restr_it = clustering_key_restrictions_map.find(cdef);
if (restr_it == clustering_key_restrictions_map.end()) {
continue;
}
restrictions::single_column_restriction& restriction = *restr_it->second;
const bytes& value_to_check = clustering_key[cdef->id];
bool pk_restriction_matches = restriction.is_satisfied_by(value_to_check, cql3::query_options({ }));
if (!pk_restriction_matches) {
return false;
}
}
case column_kind::static_column:
add_value(*def, static_row_iterator);
break;
default:
break;
assert(0);
}
}
}
void result_set_builder::visitor::accept_partition_end(
const query::result_row_view& static_row) {
if (_row_count == 0) {
_builder.new_row();
auto static_row_iterator = static_row.iterator();
for (auto&& def : _selection.get_columns()) {
if (def->is_partition_key()) {
_builder.add(_partition_key[def->component_index()]);
} else if (def->is_static()) {
add_value(*def, static_row_iterator);
} else {
_builder.add_empty();
}
}
}
return true;
}
api::timestamp_type result_set_builder::timestamp_of(size_t idx) {
@@ -421,7 +426,7 @@ int32_t result_set_builder::ttl_of(size_t idx) {
}
bytes_opt result_set_builder::get_value(data_type t, query::result_atomic_cell_view c) {
return {c.value().linearize()};
return {to_bytes(c.value())};
}
}

@@ -48,7 +48,6 @@
#include "exceptions/exceptions.hh"
#include "cql3/selection/raw_selector.hh"
#include "cql3/selection/selector_factories.hh"
#include "cql3/restrictions/statement_restrictions.hh"
#include "unimplemented.hh"
namespace cql3 {
@@ -85,15 +84,12 @@ private:
const bool _collect_timestamps;
const bool _collect_TTLs;
const bool _contains_static_columns;
bool _is_trivial;
protected:
using trivial = bool_class<class trivial_tag>;
selection(schema_ptr schema,
std::vector<const column_definition*> columns,
std::vector<::shared_ptr<column_specification>> metadata_,
bool collect_timestamps,
bool collect_TTLs, trivial is_trivial = trivial::no);
bool collect_TTLs);
virtual ~selection() {}
public:
@@ -227,12 +223,6 @@ public:
}
}
/**
* Returns true if the selection is trivial, i.e. there are no function
* selectors (including casts or aggregates).
*/
bool is_trivial() const { return _is_trivial; }
friend class result_set_builder;
};
@@ -248,28 +238,6 @@ private:
const gc_clock::time_point _now;
cql_serialization_format _cql_serialization_format;
public:
class nop_filter {
public:
inline bool operator()(const selection&, const std::vector<bytes>&, const std::vector<bytes>&, const query::result_row_view&, const query::result_row_view&) const {
return true;
}
void reset() {
}
};
class restrictions_filter {
::shared_ptr<restrictions::statement_restrictions> _restrictions;
mutable bool _current_pratition_key_does_not_match = false;
mutable bool _current_static_row_does_not_match = false;
public:
restrictions_filter() = default;
explicit restrictions_filter(::shared_ptr<restrictions::statement_restrictions> restrictions) : _restrictions(restrictions) {}
bool operator()(const selection& selection, const std::vector<bytes>& pk, const std::vector<bytes>& ck, const query::result_row_view& static_row, const query::result_row_view& row) const;
void reset() {
_current_pratition_key_does_not_match = false;
_current_static_row_does_not_match = false;
}
};
result_set_builder(const selection& s, gc_clock::time_point now, cql_serialization_format sf);
void add_empty();
void add(bytes_opt value);
@@ -279,9 +247,8 @@ public:
std::unique_ptr<result_set> build();
api::timestamp_type timestamp_of(size_t idx);
int32_t ttl_of(size_t idx);
// Implements ResultVisitor concept from query.hh
template<typename Filter = nop_filter>
class visitor {
protected:
result_set_builder& _builder;
@@ -290,100 +257,20 @@ public:
uint32_t _row_count;
std::vector<bytes> _partition_key;
std::vector<bytes> _clustering_key;
Filter _filter;
public:
visitor(cql3::selection::result_set_builder& builder, const schema& s,
const selection& selection, Filter filter = Filter())
: _builder(builder)
, _schema(s)
, _selection(selection)
, _row_count(0)
, _filter(filter)
{}
visitor(cql3::selection::result_set_builder& builder, const schema& s, const selection&);
visitor(visitor&&) = default;
void add_value(const column_definition& def, query::result_row_view::iterator_type& i) {
if (def.type->is_multi_cell()) {
auto cell = i.next_collection_cell();
if (!cell) {
_builder.add_empty();
return;
}
_builder.add_collection(def, cell->linearize());
} else {
auto cell = i.next_atomic_cell();
if (!cell) {
_builder.add_empty();
return;
}
_builder.add(def, *cell);
}
}
void accept_new_partition(const partition_key& key, uint32_t row_count) {
_partition_key = key.explode(_schema);
_row_count = row_count;
_filter.reset();
}
void accept_new_partition(uint32_t row_count) {
_row_count = row_count;
_filter.reset();
}
void accept_new_row(const clustering_key& key, const query::result_row_view& static_row, const query::result_row_view& row) {
_clustering_key = key.explode(_schema);
accept_new_row(static_row, row);
}
void accept_new_row(const query::result_row_view& static_row, const query::result_row_view& row) {
auto static_row_iterator = static_row.iterator();
auto row_iterator = row.iterator();
if (!_filter(_selection, _partition_key, _clustering_key, static_row, row)) {
return;
}
_builder.new_row();
for (auto&& def : _selection.get_columns()) {
switch (def->kind) {
case column_kind::partition_key:
_builder.add(_partition_key[def->component_index()]);
break;
case column_kind::clustering_key:
if (_clustering_key.size() > def->component_index()) {
_builder.add(_clustering_key[def->component_index()]);
} else {
_builder.add({});
}
break;
case column_kind::regular_column:
add_value(*def, row_iterator);
break;
case column_kind::static_column:
add_value(*def, static_row_iterator);
break;
default:
assert(0);
}
}
}
void accept_partition_end(const query::result_row_view& static_row) {
if (_row_count == 0) {
_builder.new_row();
auto static_row_iterator = static_row.iterator();
for (auto&& def : _selection.get_columns()) {
if (def->is_partition_key()) {
_builder.add(_partition_key[def->component_index()]);
} else if (def->is_static()) {
add_value(*def, static_row_iterator);
} else {
_builder.add_empty();
}
}
}
}
void add_value(const column_definition& def, query::result_row_view::iterator_type& i);
void accept_new_partition(const partition_key& key, uint32_t row_count);
void accept_new_partition(uint32_t row_count);
void accept_new_row(const clustering_key& key,
const query::result_row_view& static_row,
const query::result_row_view& row);
void accept_new_row(const query::result_row_view& static_row,
const query::result_row_view& row);
void accept_partition_end(const query::result_row_view& static_row);
};
private:
bytes_opt get_value(data_type t, query::result_atomic_cell_view c);
};


@@ -105,9 +105,11 @@ public:
virtual void reset() = 0;
virtual assignment_testable::test_result test_assignment(database& db, const sstring& keyspace, ::shared_ptr<column_specification> receiver) override {
if (receiver->type == get_type()) {
auto t1 = receiver->type->underlying_type();
auto t2 = get_type()->underlying_type();
if (t1 == t2) {
return assignment_testable::test_result::EXACT_MATCH;
} else if (receiver->type->is_value_compatible_with(*get_type())) {
} else if (t1->is_value_compatible_with(*t2)) {
return assignment_testable::test_result::WEAKLY_ASSIGNABLE;
} else {
return assignment_testable::test_result::NOT_ASSIGNABLE;


@@ -225,12 +225,7 @@ sets::marker::bind(const query_options& options) {
void
sets::setter::execute(mutation& m, const clustering_key_prefix& row_key, const update_parameters& params) {
auto value = _t->bind(params._options);
execute(m, row_key, params, column, std::move(value));
}
void
sets::setter::execute(mutation& m, const clustering_key_prefix& row_key, const update_parameters& params, const column_definition& column, ::shared_ptr<terminal> value) {
const auto& value = _t->bind(params._options);
if (value == constants::UNSET_VALUE) {
return;
}
@@ -269,7 +264,7 @@ sets::adder::do_add(mutation& m, const clustering_key_prefix& row_key, const upd
}
for (auto&& e : set_value->_elements) {
mut.cells.emplace_back(e, params.make_cell(*set_type->value_comparator(), {}, atomic_cell::collection_member::yes));
mut.cells.emplace_back(e, params.make_cell({}));
}
auto smut = set_type->serialize_mutation_form(mut);
@@ -279,7 +274,7 @@ sets::adder::do_add(mutation& m, const clustering_key_prefix& row_key, const upd
auto v = set_type->serialize_partially_deserialized_form(
{set_value->_elements.begin(), set_value->_elements.end()},
cql_serialization_format::internal());
m.set_cell(row_key, column, params.make_cell(*column.type, std::move(v)));
m.set_cell(row_key, column, params.make_cell(std::move(v)));
} else {
m.set_cell(row_key, column, params.make_dead_cell());
}


@@ -113,7 +113,6 @@ public:
: operation(column, std::move(t)) {
}
virtual void execute(mutation& m, const clustering_key_prefix& row_key, const update_parameters& params) override;
static void execute(mutation& m, const clustering_key_prefix& row_key, const update_parameters& params, const column_definition& column, ::shared_ptr<terminal> value);
};
class adder : public operation {


@@ -116,6 +116,18 @@ single_column_relation::to_receivers(schema_ptr schema, const column_definition&
throw exceptions::invalid_request_exception(sprint(
"IN predicates on non-primary-key columns (%s) is not yet supported", column_def.name_as_text()));
}
} else if (is_slice()) {
// Non EQ relation is not supported without token(), even if we have a 2ndary index (since even those
// are ordered by partitioner).
// Note: In theory we could allow it for 2ndary index queries with ALLOW FILTERING, but that would
// probably require some special casing
// Note bis: This is also why we don't bother handling the 'tuple' notation of #4851 for keys. If we
// lift the limitation for 2ndary
// index with filtering, we'll need to handle it though.
if (column_def.is_partition_key()) {
throw exceptions::invalid_request_exception(
"Only EQ and IN relation are supported on the partition key (unless you use the token() function)");
}
}
if (is_contains() && !receiver->type->is_collection()) {


@@ -134,7 +134,7 @@ protected:
#endif
virtual sstring to_string() const override {
auto entity_as_string = _entity->to_cql_string();
auto entity_as_string = _entity->to_string();
if (_map_key) {
entity_as_string = sprint("%s[%s]", std::move(entity_as_string), _map_key->to_string());
}


@@ -42,7 +42,7 @@
#include "alter_keyspace_statement.hh"
#include "prepared_statement.hh"
#include "service/migration_manager.hh"
#include "db/system_keyspace.hh"
#include "database.hh"
bool is_system_keyspace(const sstring& keyspace);
@@ -59,7 +59,7 @@ future<> cql3::statements::alter_keyspace_statement::check_access(const service:
return state.has_keyspace_access(_name, auth::permission::ALTER);
}
void cql3::statements::alter_keyspace_statement::validate(service::storage_proxy& proxy, const service::client_state& state) {
void cql3::statements::alter_keyspace_statement::validate(distributed<service::storage_proxy>& proxy, const service::client_state& state) {
try {
service::get_local_storage_proxy().get_db().local().find_keyspace(_name); // throws on failure
auto tmp = _name;
@@ -90,7 +90,7 @@ void cql3::statements::alter_keyspace_statement::validate(service::storage_proxy
}
}
future<shared_ptr<cql_transport::event::schema_change>> cql3::statements::alter_keyspace_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only) {
future<shared_ptr<cql_transport::event::schema_change>> cql3::statements::alter_keyspace_statement::announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only) {
auto old_ksm = service::get_local_storage_proxy().get_db().local().find_keyspace(_name).metadata();
return service::get_local_migration_manager().announce_keyspace_update(_attrs->as_ks_metadata_update(old_ksm), is_local_only).then([this] {
using namespace cql_transport;


@@ -60,8 +60,8 @@ public:
const sstring& keyspace() const override;
future<> check_access(const service::client_state& state) override;
void validate(service::storage_proxy& proxy, const service::client_state& state) override;
future<shared_ptr<cql_transport::event::schema_change>> announce_migration(service::storage_proxy& proxy, bool is_local_only) override;
void validate(distributed<service::storage_proxy>& proxy, const service::client_state& state) override;
future<shared_ptr<cql_transport::event::schema_change>> announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only) override;
virtual std::unique_ptr<prepared> prepare(database& db, cql_stats& stats) override;
};


@@ -62,12 +62,12 @@ public:
, _options(std::move(options)) {
}
void validate(service::storage_proxy&, const service::client_state&) override;
void validate(distributed<service::storage_proxy>&, const service::client_state&) override;
virtual future<> check_access(const service::client_state&) override;
virtual future<::shared_ptr<cql_transport::messages::result_message>>
execute(service::storage_proxy&, service::query_state&, const query_options&) override;
execute(distributed<service::storage_proxy>&, service::query_state&, const query_options&) override;
};
}


@@ -75,7 +75,7 @@ future<> alter_table_statement::check_access(const service::client_state& state)
return state.has_column_family_access(keyspace(), column_family(), auth::permission::ALTER);
}
void alter_table_statement::validate(service::storage_proxy& proxy, const service::client_state& state)
void alter_table_statement::validate(distributed<service::storage_proxy>& proxy, const service::client_state& state)
{
// validated in announce_migration()
}
@@ -165,9 +165,9 @@ static void validate_column_rename(database& db, const schema& schema, const col
}
}
future<shared_ptr<cql_transport::event::schema_change>> alter_table_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only)
future<shared_ptr<cql_transport::event::schema_change>> alter_table_statement::announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only)
{
auto& db = proxy.get_db().local();
auto& db = proxy.local().get_db().local();
auto schema = validation::validate_column_family(db, keyspace(), column_family());
if (schema->is_view()) {
throw exceptions::invalid_request_exception("Cannot use ALTER TABLE on Materialized View");
@@ -247,15 +247,12 @@ future<shared_ptr<cql_transport::event::schema_change>> alter_table_statement::a
cfm.with_column(column_name->name(), type, _is_static ? column_kind::static_column : column_kind::regular_column);
// Adding a column to a table which has an include all view requires the column to be added to the view
// as well. If the view has a regular base column in its PK, then the column ID needs to be updated in
// view_info; for that, rebuild the schema.
// as well
if (!_is_static) {
for (auto&& view : cf.views()) {
if (view->view_info()->include_all_columns() || view->view_info()->base_non_pk_column_in_view_pk()) {
if (view->view_info()->include_all_columns()) {
schema_builder builder(view);
if (view->view_info()->include_all_columns()) {
builder.with_column(column_name->name(), type);
}
builder.with_column(column_name->name(), type);
view_updates.push_back(view_ptr(builder.build()));
}
}
@@ -308,10 +305,14 @@ future<shared_ptr<cql_transport::event::schema_change>> alter_table_statement::a
}
}
if (!cf.views().empty()) {
// If a column is dropped which is included in a view, we don't allow the drop to take place.
auto view_names = ::join(", ", cf.views()
| boost::adaptors::filtered([&] (auto&& v) { return bool(v->get_column_definition(column_name->name())); })
| boost::adaptors::transformed([] (auto&& v) { return v->cf_name(); }));
if (!view_names.empty()) {
throw exceptions::invalid_request_exception(sprint(
"Cannot drop column %s on base table %s.%s with materialized views",
column_name, keyspace(), column_family()));
"Cannot drop column %s, depended on by materialized views (%s.{%s})",
column_name, keyspace(), view_names));
}
break;
}
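
Note on the drop-column hunk above: one side rejects any column drop on a table that has views, while the other rejects the drop only when some view actually contains the column, and names those views in the error. The filter-then-join logic of the second variant might be sketched as below. This is an illustration using only the standard library; `view`, `views_depending_on`, and `check_drop_column` are hypothetical stand-ins, not Scylla types or functions.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a materialized view: a name plus the columns it selects.
struct view {
    std::string name;
    std::vector<std::string> columns;

    bool has_column(const std::string& column) const {
        for (const auto& c : columns) {
            if (c == column) {
                return true;
            }
        }
        return false;
    }
};

// Returns a comma-separated list of the views that include `column`,
// or an empty string when no view depends on it.
std::string views_depending_on(const std::vector<view>& views, const std::string& column) {
    std::string names;
    for (const auto& v : views) {
        if (!v.has_column(column)) {
            continue;
        }
        if (!names.empty()) {
            names += ", ";
        }
        names += v.name;
    }
    return names;
}

// Throws when at least one view depends on `column`; otherwise the drop may proceed.
void check_drop_column(const std::vector<view>& views,
                       const std::string& keyspace,
                       const std::string& column) {
    const std::string names = views_depending_on(views, column);
    if (!names.empty()) {
        throw std::invalid_argument("Cannot drop column " + column +
            ", depended on by materialized views (" + keyspace + ".{" + names + "})");
    }
}
```

With this shape, dropping a column that no view selects succeeds even when the base table has views, which is the behavioral difference between the two sides of the hunk.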


@@ -77,8 +77,8 @@ public:
bool is_static);
virtual future<> check_access(const service::client_state& state) override;
virtual void validate(service::storage_proxy& proxy, const service::client_state& state) override;
virtual future<shared_ptr<cql_transport::event::schema_change>> announce_migration(service::storage_proxy& proxy, bool is_local_only) override;
virtual void validate(distributed<service::storage_proxy>& proxy, const service::client_state& state) override;
virtual future<shared_ptr<cql_transport::event::schema_change>> announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only) override;
virtual std::unique_ptr<prepared> prepare(database& db, cql_stats& stats) override;
};


@@ -66,7 +66,7 @@ future<> alter_type_statement::check_access(const service::client_state& state)
return state.has_keyspace_access(keyspace(), auth::permission::ALTER);
}
void alter_type_statement::validate(service::storage_proxy& proxy, const service::client_state& state)
void alter_type_statement::validate(distributed<service::storage_proxy>& proxy, const service::client_state& state)
{
// Validation is left to announceMigration as it's easier to do it while constructing the updated type.
// It doesn't really change anything anyway.
@@ -135,10 +135,10 @@ void alter_type_statement::do_announce_migration(database& db, ::keyspace& ks, b
}
}
future<shared_ptr<cql_transport::event::schema_change>> alter_type_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only)
future<shared_ptr<cql_transport::event::schema_change>> alter_type_statement::announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only)
{
return seastar::async([this, &proxy, is_local_only] {
auto&& db = proxy.get_db().local();
auto&& db = proxy.local().get_db().local();
try {
auto&& ks = db.find_keyspace(keyspace());
do_announce_migration(db, ks, is_local_only);


@@ -59,11 +59,11 @@ public:
virtual future<> check_access(const service::client_state& state) override;
virtual void validate(service::storage_proxy& proxy, const service::client_state& state) override;
virtual void validate(distributed<service::storage_proxy>& proxy, const service::client_state& state) override;
virtual const sstring& keyspace() const override;
virtual future<shared_ptr<cql_transport::event::schema_change>> announce_migration(service::storage_proxy& proxy, bool is_local_only) override;
virtual future<shared_ptr<cql_transport::event::schema_change>> announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only) override;
class add_or_alter;
class renames;


@@ -69,14 +69,14 @@ future<> alter_view_statement::check_access(const service::client_state& state)
return make_ready_future<>();
}
void alter_view_statement::validate(service::storage_proxy&, const service::client_state& state)
void alter_view_statement::validate(distributed<service::storage_proxy>&, const service::client_state& state)
{
// validated in announce_migration()
}
future<shared_ptr<cql_transport::event::schema_change>> alter_view_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only)
future<shared_ptr<cql_transport::event::schema_change>> alter_view_statement::announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only)
{
auto&& db = proxy.get_db().local();
auto&& db = proxy.local().get_db().local();
schema_ptr schema = validation::validate_column_family(db, keyspace(), column_family());
if (!schema->is_view()) {
throw exceptions::invalid_request_exception("Cannot use ALTER MATERIALIZED VIEW on Table");
@@ -86,10 +86,10 @@ future<shared_ptr<cql_transport::event::schema_change>> alter_view_statement::an
throw exceptions::invalid_request_exception("ALTER MATERIALIZED VIEW WITH invoked, but no parameters found");
}
_properties->validate(proxy.get_db().local().get_config().extensions());
_properties->validate(proxy.local().get_db().local().get_config().extensions());
auto builder = schema_builder(schema);
_properties->apply_to_builder(builder, proxy.get_db().local().get_config().extensions());
_properties->apply_to_builder(builder, proxy.local().get_db().local().get_config().extensions());
if (builder.get_gc_grace_seconds() == 0) {
throw exceptions::invalid_request_exception(


@@ -43,7 +43,7 @@
#include <seastar/core/shared_ptr.hh>
#include "database_fwd.hh"
#include "database.hh"
#include "cql3/statements/cf_prop_defs.hh"
#include "cql3/statements/schema_altering_statement.hh"
#include "cql3/cf_name.hh"
@@ -61,9 +61,9 @@ public:
virtual future<> check_access(const service::client_state& state) override;
virtual void validate(service::storage_proxy&, const service::client_state& state) override;
virtual void validate(distributed<service::storage_proxy>&, const service::client_state& state) override;
virtual future<shared_ptr<cql_transport::event::schema_change>> announce_migration(service::storage_proxy& proxy, bool is_local_only) override;
virtual future<shared_ptr<cql_transport::event::schema_change>> announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only) override;
virtual std::unique_ptr<prepared> prepare(database& db, cql_stats& stats) override;
};


@@ -67,10 +67,17 @@ bool cql3::statements::authentication_statement::depends_on_column_family(
}
void cql3::statements::authentication_statement::validate(
service::storage_proxy&,
distributed<service::storage_proxy>&,
const service::client_state& state) {
}
future<> cql3::statements::authentication_statement::check_access(const service::client_state& state) {
return make_ready_future<>();
}
future<::shared_ptr<cql_transport::messages::result_message>> cql3::statements::authentication_statement::execute_internal(
distributed<service::storage_proxy>& proxy,
service::query_state& state, const query_options& options) {
// Internal queries are exclusively on the system keyspace and makes no sense here
throw std::runtime_error("unsupported operation");
}


@@ -52,8 +52,6 @@ namespace statements {
class authentication_statement : public raw::parsed_statement, public cql_statement_no_metadata, public ::enable_shared_from_this<authentication_statement> {
public:
authentication_statement() : cql_statement_no_metadata(&timeout_config::other_timeout) {}
uint32_t get_bound_terms() override;
std::unique_ptr<prepared> prepare(database& db, cql_stats& stats) override;
@@ -66,7 +64,10 @@ public:
future<> check_access(const service::client_state& state) override;
void validate(service::storage_proxy&, const service::client_state& state) override;
void validate(distributed<service::storage_proxy>&, const service::client_state& state) override;
future<::shared_ptr<cql_transport::messages::result_message>>
execute_internal(distributed<service::storage_proxy>& proxy, service::query_state& state, const query_options& options) override;
};
}


@@ -67,7 +67,7 @@ bool cql3::statements::authorization_statement::depends_on_column_family(
}
void cql3::statements::authorization_statement::validate(
service::storage_proxy&,
distributed<service::storage_proxy>&,
const service::client_state& state) {
}
@@ -75,6 +75,13 @@ future<> cql3::statements::authorization_statement::check_access(const service::
return make_ready_future<>();
}
future<::shared_ptr<cql_transport::messages::result_message>> cql3::statements::authorization_statement::execute_internal(
distributed<service::storage_proxy>& proxy,
service::query_state& state, const query_options& options) {
// Internal queries are exclusively on the system keyspace and makes no sense here
throw std::runtime_error("unsupported operation");
}
void cql3::statements::authorization_statement::maybe_correct_resource(auth::resource& resource, const service::client_state& state) {
if (resource.kind() == auth::resource_kind::data) {
const auto data_view = auth::data_resource_view(resource);


@@ -56,8 +56,6 @@ namespace statements {
class authorization_statement : public raw::parsed_statement, public cql_statement_no_metadata, public ::enable_shared_from_this<authorization_statement> {
public:
authorization_statement() : cql_statement_no_metadata(&timeout_config::other_timeout) {}
uint32_t get_bound_terms() override;
std::unique_ptr<prepared> prepare(database& db, cql_stats& stats) override;
@@ -70,7 +68,10 @@ public:
future<> check_access(const service::client_state& state) override;
void validate(service::storage_proxy&, const service::client_state& state) override;
void validate(distributed<service::storage_proxy>&, const service::client_state& state) override;
future<::shared_ptr<cql_transport::messages::result_message>>
execute_internal(distributed<service::storage_proxy>& proxy, service::query_state& state, const query_options& options) override;
protected:
static void maybe_correct_resource(auth::resource&, const service::client_state&);


@@ -67,27 +67,19 @@ namespace statements {
logging::logger batch_statement::_logger("BatchStatement");
timeout_config_selector
timeout_for_type(batch_statement::type t) {
return t == batch_statement::type::COUNTER
? &timeout_config::counter_write_timeout
: &timeout_config::write_timeout;
}
batch_statement::batch_statement(int bound_terms, type type_,
std::vector<single_statement> statements,
std::vector<shared_ptr<modification_statement>> statements,
std::unique_ptr<attributes> attrs,
cql_stats& stats)
: cql_statement_no_metadata(timeout_for_type(type_))
, _bound_terms(bound_terms), _type(type_), _statements(std::move(statements))
: _bound_terms(bound_terms), _type(type_), _statements(std::move(statements))
, _attrs(std::move(attrs))
, _has_conditions(boost::algorithm::any_of(_statements, [] (auto&& s) { return s.statement->has_conditions(); }))
, _has_conditions(boost::algorithm::any_of(_statements, std::mem_fn(&modification_statement::has_conditions)))
, _stats(stats)
{
}
batch_statement::batch_statement(type type_,
std::vector<single_statement> statements,
std::vector<shared_ptr<modification_statement>> statements,
std::unique_ptr<attributes> attrs,
cql_stats& stats)
: batch_statement(-1, type_, std::move(statements), std::move(attrs), stats)
@@ -97,7 +89,7 @@ batch_statement::batch_statement(type type_,
bool batch_statement::uses_function(const sstring& ks_name, const sstring& function_name) const
{
return _attrs->uses_function(ks_name, function_name)
|| boost::algorithm::any_of(_statements, [&] (auto&& s) { return s.statement->uses_function(ks_name, function_name); });
|| boost::algorithm::any_of(_statements, [&] (auto&& s) { return s->uses_function(ks_name, function_name); });
}
bool batch_statement::depends_on_keyspace(const sstring& ks_name) const
@@ -118,11 +110,7 @@ uint32_t batch_statement::get_bound_terms()
future<> batch_statement::check_access(const service::client_state& state)
{
return parallel_for_each(_statements.begin(), _statements.end(), [&state](auto&& s) {
if (s.needs_authorization) {
return s.statement->check_access(state);
} else {
return make_ready_future<>();
}
return s->check_access(state);
});
}
@@ -142,12 +130,12 @@ void batch_statement::validate()
}
}
bool has_counters = boost::algorithm::any_of(_statements, [] (auto&& s) { return s.statement->is_counter(); });
bool has_non_counters = !boost::algorithm::all_of(_statements, [] (auto&& s) { return s.statement->is_counter(); });
bool has_counters = boost::algorithm::any_of(_statements, std::mem_fn(&modification_statement::is_counter));
bool has_non_counters = !boost::algorithm::all_of(_statements, std::mem_fn(&modification_statement::is_counter));
if (timestamp_set && has_counters) {
throw exceptions::invalid_request_exception("Cannot provide custom timestamp for a BATCH containing counters");
}
if (timestamp_set && boost::algorithm::any_of(_statements, [] (auto&& s) { return s.statement->is_timestamp_set(); })) {
if (timestamp_set && boost::algorithm::any_of(_statements, std::mem_fn(&modification_statement::is_timestamp_set))) {
throw exceptions::invalid_request_exception("Timestamp must be set either on BATCH or individual statements");
}
if (_type == type::COUNTER && has_non_counters) {
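
Note on the hunk above: the lambdas of the form `[] (auto&& s) { return s.statement->is_counter(); }` and the `std::mem_fn(&modification_statement::is_counter)` calls express the same predicate. The `std::mem_fn` form works on the side where `_statements` holds pointers to the statements directly, because the resulting callable dereferences pointer-like arguments. A small standalone sketch of the equivalence follows. It assumes `std::shared_ptr` and the `<algorithm>` functions in place of the Seastar pointer and Boost algorithms used in the diff, and `statement` is a hypothetical stand-in type.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for modification_statement.
struct statement {
    bool counter;
    bool is_counter() const { return counter; }
};

using statement_ptr = std::shared_ptr<statement>;

// std::mem_fn builds a callable that accepts an object, a raw pointer,
// or a smart pointer, and invokes the member function on the pointee.
bool has_counters(const std::vector<statement_ptr>& statements) {
    return std::any_of(statements.begin(), statements.end(),
                       std::mem_fn(&statement::is_counter));
}

// The lambda form of the same predicate, negated over all_of.
bool has_non_counters(const std::vector<statement_ptr>& statements) {
    return !std::all_of(statements.begin(), statements.end(),
                        [] (const statement_ptr& s) { return s->is_counter(); });
}

// A batch is mixed when it contains both counter and non-counter statements.
bool is_mixed_batch(const std::vector<statement_ptr>& statements) {
    return has_counters(statements) && has_non_counters(statements);
}
```

An empty batch yields `false` from both helpers, since `any_of` over an empty range is false and `all_of` over an empty range is true.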
@@ -163,35 +151,35 @@ void batch_statement::validate()
if (_has_conditions
&& !_statements.empty()
&& (boost::distance(_statements
| boost::adaptors::transformed([] (auto&& s) { return s.statement->keyspace(); })
| boost::adaptors::transformed(std::mem_fn(&modification_statement::keyspace))
| boost::adaptors::uniqued) != 1
|| (boost::distance(_statements
| boost::adaptors::transformed([] (auto&& s) { return s.statement->column_family(); })
| boost::adaptors::transformed(std::mem_fn(&modification_statement::column_family))
| boost::adaptors::uniqued) != 1))) {
throw exceptions::invalid_request_exception("Batch with conditions cannot span multiple tables");
}
std::experimental::optional<bool> raw_counter;
for (auto& s : _statements) {
if (raw_counter && s.statement->is_raw_counter_shard_write() != *raw_counter) {
if (raw_counter && s->is_raw_counter_shard_write() != *raw_counter) {
throw exceptions::invalid_request_exception("Cannot mix raw and regular counter statements in batch");
}
raw_counter = s.statement->is_raw_counter_shard_write();
raw_counter = s->is_raw_counter_shard_write();
}
}
void batch_statement::validate(service::storage_proxy& proxy, const service::client_state& state)
void batch_statement::validate(distributed<service::storage_proxy>& proxy, const service::client_state& state)
{
for (auto&& s : _statements) {
s.statement->validate(proxy, state);
s->validate(proxy, state);
}
}
const std::vector<batch_statement::single_statement>& batch_statement::get_statements()
const std::vector<shared_ptr<modification_statement>>& batch_statement::get_statements()
{
return _statements;
}
future<std::vector<mutation>> batch_statement::get_mutations(service::storage_proxy& storage, const query_options& options, bool local, api::timestamp_type now, tracing::trace_state_ptr trace_state) {
future<std::vector<mutation>> batch_statement::get_mutations(distributed<service::storage_proxy>& storage, const query_options& options, bool local, api::timestamp_type now, tracing::trace_state_ptr trace_state) {
// Do not process in parallel because operations like list append/prepend depend on execution order.
using mutation_set_type = std::unordered_set<mutation, mutation_hash_by_key, mutation_equals_by_key>;
return do_with(mutation_set_type(), [this, &storage, &options, now, local, trace_state] (auto& result) {
@@ -200,7 +188,7 @@ future<std::vector<mutation>> batch_statement::get_mutations(service::storage_pr
return do_for_each(boost::make_counting_iterator<size_t>(0),
boost::make_counting_iterator<size_t>(_statements.size()),
[this, &storage, &options, now, local, &result, trace_state] (size_t i) {
auto&& statement = _statements[i].statement;
auto&& statement = _statements[i];
statement->inc_cql_stats();
auto&& statement_options = options.for_statement(i);
auto timestamp = _attrs->get_timestamp(now, statement_options);
@@ -235,21 +223,43 @@ void batch_statement::verify_batch_size(const std::vector<mutation>& mutations)
size_t warn_threshold = service::get_local_storage_proxy().get_db().local().get_config().batch_size_warn_threshold_in_kb() * 1024;
size_t fail_threshold = service::get_local_storage_proxy().get_db().local().get_config().batch_size_fail_threshold_in_kb() * 1024;
size_t size = 0;
class my_partition_visitor : public mutation_partition_visitor {
public:
void accept_partition_tombstone(tombstone) override {}
void accept_static_cell(column_id, atomic_cell_view v) override {
size += v.value().size();
}
void accept_static_cell(column_id, collection_mutation_view v) override {
size += v.data.size();
}
void accept_row_tombstone(const range_tombstone&) override {}
void accept_row(position_in_partition_view, const row_tombstone&, const row_marker&, is_dummy, is_continuous) override {}
void accept_row_cell(column_id, atomic_cell_view v) override {
size += v.value().size();
}
void accept_row_cell(column_id id, collection_mutation_view v) override {
size += v.data.size();
}
size_t size = 0;
};
my_partition_visitor v;
for (auto&m : mutations) {
size += m.partition().external_memory_usage(*m.schema());
m.partition().accept(*m.schema(), v);
}
if (size > warn_threshold) {
if (v.size > warn_threshold) {
auto error = [&] (const char* type, size_t threshold) -> sstring {
std::unordered_set<sstring> ks_cf_pairs;
for (auto&& m : mutations) {
ks_cf_pairs.insert(m.schema()->ks_name() + "." + m.schema()->cf_name());
}
return sprint("Batch of prepared statements for %s is of size %d, exceeding specified %s threshold of %d by %d.",
join(", ", ks_cf_pairs), size, type, threshold, size - threshold);
join(", ", ks_cf_pairs), v.size, type, threshold, v.size - threshold);
};
if (size > fail_threshold) {
if (v.size > fail_threshold) {
_logger.error(error("FAIL", fail_threshold).c_str());
throw exceptions::invalid_request_exception("Batch too large");
} else {
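
Note on the hunk above: one side measures a batch by walking each partition with a visitor and summing cell payload sizes, then compares the total against the warn and fail thresholds (configured in KB, hence the `* 1024`). A minimal standalone sketch of that accumulate-then-classify pattern follows. `cell`, `size_visitor`, `batch_verdict`, and `classify_batch` are hypothetical names for illustration only and do not reflect Scylla's `mutation_partition_visitor` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a cell: only its serialized value matters here.
struct cell {
    std::string value;
};

// Accumulates payload bytes across every cell it is shown.
struct size_visitor {
    std::size_t size = 0;
    void accept_cell(const cell& c) { size += c.value.size(); }
};

enum class batch_verdict { ok, warn, fail };

// Thresholds are given in KB, as with the configuration options in the diff.
batch_verdict classify_batch(const std::vector<std::vector<cell>>& partitions,
                             std::size_t warn_threshold_kb,
                             std::size_t fail_threshold_kb) {
    const std::size_t warn_threshold = warn_threshold_kb * 1024;
    const std::size_t fail_threshold = fail_threshold_kb * 1024;

    size_visitor v;
    for (const auto& partition : partitions) {
        for (const auto& c : partition) {
            v.accept_cell(c);
        }
    }
    // Same nesting as the diff: the fail check runs only once the warn threshold is exceeded.
    if (v.size > warn_threshold) {
        if (v.size > fail_threshold) {
            return batch_verdict::fail;
        }
        return batch_verdict::warn;
    }
    return batch_verdict::ok;
}
```

Summing payload bytes and asking each partition for its memory footprint can give different totals for the same batch, so which measure is used affects when the thresholds trigger.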
@@ -261,24 +271,17 @@ void batch_statement::verify_batch_size(const std::vector<mutation>& mutations)
struct batch_statement_executor {
static auto get() { return &batch_statement::do_execute; }
};
static thread_local inheriting_concrete_execution_stage<
future<shared_ptr<cql_transport::messages::result_message>>,
batch_statement*,
service::storage_proxy&,
service::query_state&,
const query_options&,
bool,
api::timestamp_type> batch_stage{"cql3_batch", batch_statement_executor::get()};
static thread_local auto batch_stage = seastar::make_execution_stage("cql3_batch", batch_statement_executor::get());
future<shared_ptr<cql_transport::messages::result_message>> batch_statement::execute(
service::storage_proxy& storage, service::query_state& state, const query_options& options) {
distributed<service::storage_proxy>& storage, service::query_state& state, const query_options& options) {
++_stats.batches;
return batch_stage(this, seastar::ref(storage), seastar::ref(state),
seastar::cref(options), false, options.get_timestamp(state));
}
future<shared_ptr<cql_transport::messages::result_message>> batch_statement::do_execute(
service::storage_proxy& storage,
distributed<service::storage_proxy>& storage,
service::query_state& query_state, const query_options& options,
bool local, api::timestamp_type now)
{
@@ -302,7 +305,7 @@ future<shared_ptr<cql_transport::messages::result_message>> batch_statement::do_
}
future<> batch_statement::execute_without_conditions(
service::storage_proxy& storage,
distributed<service::storage_proxy>& storage,
std::vector<mutation> mutations,
db::consistency_level cl,
tracing::trace_state_ptr tr_state)
@@ -332,11 +335,11 @@ future<> batch_statement::execute_without_conditions(
mutate_atomic = false;
}
}
return storage.mutate_with_triggers(std::move(mutations), cl, mutate_atomic, std::move(tr_state));
return storage.local().mutate_with_triggers(std::move(mutations), cl, mutate_atomic, std::move(tr_state));
}
future<shared_ptr<cql_transport::messages::result_message>> batch_statement::execute_with_conditions(
service::storage_proxy& storage,
distributed<service::storage_proxy>& storage,
const query_options& options,
service::query_state& state)
{
@@ -388,6 +391,23 @@ future<shared_ptr<cql_transport::messages::result_message>> batch_statement::exe
#endif
}
future<shared_ptr<cql_transport::messages::result_message>> batch_statement::execute_internal(
distributed<service::storage_proxy>& proxy,
service::query_state& query_state, const query_options& options)
{
throw std::runtime_error(sprint("%s not implemented", __PRETTY_FUNCTION__));
#if 0
assert !hasConditions;
for (IMutation mutation : getMutations(BatchQueryOptions.withoutPerStatementVariables(options), true, queryState.getTimestamp()))
{
// We don't use counters internally.
assert mutation instanceof Mutation;
((Mutation) mutation).apply();
}
return null;
#endif
}
namespace raw {
std::unique_ptr<prepared_statement>
@@ -398,9 +418,7 @@ batch_statement::prepare(database& db, cql_stats& stats) {
stdx::optional<sstring> first_cf;
bool have_multiple_cfs = false;
std::vector<cql3::statements::batch_statement::single_statement> statements;
statements.reserve(_parsed_statements.size());
std::vector<shared_ptr<cql3::statements::modification_statement>> statements;
for (auto&& parsed : _parsed_statements) {
if (!first_ks) {
first_ks = parsed->keyspace();
@@ -408,7 +426,7 @@ batch_statement::prepare(database& db, cql_stats& stats) {
} else {
have_multiple_cfs = first_ks.value() != parsed->keyspace() || first_cf.value() != parsed->column_family();
}
statements.emplace_back(parsed->prepare(db, bound_names, stats));
statements.push_back(parsed->prepare(db, bound_names, stats));
}
auto&& prep_attrs = _attrs->prepare(db, "[batch]", "[batch]");
@@ -419,7 +437,7 @@ batch_statement::prepare(database& db, cql_stats& stats) {
std::vector<uint16_t> partition_key_bind_indices;
if (!have_multiple_cfs && batch_statement_.get_statements().size() > 0) {
partition_key_bind_indices = bound_names->get_partition_key_bind_indexes(batch_statement_.get_statements()[0].statement->s);
partition_key_bind_indices = bound_names->get_partition_key_bind_indexes(batch_statement_.get_statements()[0]->s);
}
return std::make_unique<prepared>(make_shared(std::move(batch_statement_)),
bound_names->get_specifications(),

@@ -66,24 +66,10 @@ class batch_statement : public cql_statement_no_metadata {
static logging::logger _logger;
public:
using type = raw::batch_statement::type;
struct single_statement {
shared_ptr<modification_statement> statement;
bool needs_authorization = true;
public:
single_statement(shared_ptr<modification_statement> s)
: statement(std::move(s))
{}
single_statement(shared_ptr<modification_statement> s, bool na)
: statement(std::move(s))
, needs_authorization(na)
{}
};
private:
int _bound_terms;
type _type;
std::vector<single_statement> _statements;
std::vector<shared_ptr<modification_statement>> _statements;
std::unique_ptr<attributes> _attrs;
bool _has_conditions;
cql_stats& _stats;
@@ -97,12 +83,12 @@ public:
* @param attrs additional attributes for statement (CL, timestamp, timeToLive)
*/
batch_statement(int bound_terms, type type_,
std::vector<single_statement> statements,
std::vector<shared_ptr<modification_statement>> statements,
std::unique_ptr<attributes> attrs,
cql_stats& stats);
batch_statement(type type_,
std::vector<single_statement> statements,
std::vector<shared_ptr<modification_statement>> statements,
std::unique_ptr<attributes> attrs,
cql_stats& stats);
@@ -121,11 +107,11 @@ public:
// The batch itself will be validated in either Parsed#prepare() - for regular CQL3 batches,
// or in QueryProcessor.processBatch() - for native protocol batches.
virtual void validate(service::storage_proxy& proxy, const service::client_state& state) override;
virtual void validate(distributed<service::storage_proxy>& proxy, const service::client_state& state) override;
const std::vector<single_statement>& get_statements();
const std::vector<shared_ptr<modification_statement>>& get_statements();
private:
future<std::vector<mutation>> get_mutations(service::storage_proxy& storage, const query_options& options, bool local, api::timestamp_type now, tracing::trace_state_ptr trace_state);
future<std::vector<mutation>> get_mutations(distributed<service::storage_proxy>& storage, const query_options& options, bool local, api::timestamp_type now, tracing::trace_state_ptr trace_state);
public:
/**
@@ -135,25 +121,29 @@ public:
static void verify_batch_size(const std::vector<mutation>& mutations);
virtual future<shared_ptr<cql_transport::messages::result_message>> execute(
service::storage_proxy& storage, service::query_state& state, const query_options& options) override;
distributed<service::storage_proxy>& storage, service::query_state& state, const query_options& options) override;
private:
friend class batch_statement_executor;
future<shared_ptr<cql_transport::messages::result_message>> do_execute(
service::storage_proxy& storage,
distributed<service::storage_proxy>& storage,
service::query_state& query_state, const query_options& options,
bool local, api::timestamp_type now);
future<> execute_without_conditions(
service::storage_proxy& storage,
distributed<service::storage_proxy>& storage,
std::vector<mutation> mutations,
db::consistency_level cl,
tracing::trace_state_ptr tr_state);
future<shared_ptr<cql_transport::messages::result_message>> execute_with_conditions(
service::storage_proxy& storage,
distributed<service::storage_proxy>& storage,
const query_options& options,
service::query_state& state);
public:
virtual future<shared_ptr<cql_transport::messages::result_message>> execute_internal(
distributed<service::storage_proxy>& proxy,
service::query_state& query_state, const query_options& options) override;
// FIXME: no cql_statement::to_string() yet
#if 0
sstring to_string() const {

@@ -44,6 +44,7 @@
#include "cql3/statements/property_definitions.hh"
#include "schema.hh"
#include "database.hh"
#include "schema_builder.hh"
#include "compaction_strategy.hh"
#include "utils/UUID.hh"

@@ -75,9 +75,9 @@ create_index_statement::check_access(const service::client_state& state) {
}
void
create_index_statement::validate(service::storage_proxy& proxy, const service::client_state& state)
create_index_statement::validate(distributed<service::storage_proxy>& proxy, const service::client_state& state)
{
auto& db = proxy.get_db().local();
auto& db = proxy.local().get_db().local();
auto schema = validation::validate_column_family(db, keyspace(), column_family());
if (schema->is_counter()) {
@@ -216,11 +216,11 @@ void create_index_statement::validate_targets_for_multi_column_index(std::vector
}
future<::shared_ptr<cql_transport::event::schema_change>>
create_index_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only) {
create_index_statement::announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only) {
if (!service::get_local_storage_service().cluster_supports_indexes()) {
throw exceptions::invalid_request_exception("Index support is not enabled");
}
auto& db = proxy.get_db().local();
auto& db = proxy.local().get_db().local();
auto schema = db.find_schema(keyspace(), column_family());
std::vector<::shared_ptr<index_target>> targets;
for (auto& raw_target : _raw_targets) {
@@ -252,7 +252,6 @@ create_index_statement::announce_migration(service::storage_proxy& proxy, bool i
sprint("Index %s is a duplicate of existing index %s", index.name(), existing_index.value().name()));
}
}
++_cql_stats->secondary_index_creates;
schema_builder builder{schema};
builder.with_index(index);
return service::get_local_migration_manager().announce_column_family_update(
@@ -268,7 +267,6 @@ create_index_statement::announce_migration(service::storage_proxy& proxy, bool i
std::unique_ptr<cql3::statements::prepared_statement>
create_index_statement::prepare(database& db, cql_stats& stats) {
_cql_stats = &stats;
return std::make_unique<prepared_statement>(make_shared<create_index_statement>(*this));
}
@@ -281,7 +279,7 @@ index_metadata create_index_statement::make_index_metadata(schema_ptr schema,
index_options_map new_options = options;
auto target_option = boost::algorithm::join(targets | boost::adaptors::transformed(
[schema](const auto &target) -> sstring {
return target->as_string();
return target->as_cql_string(schema);
}), ",");
new_options.emplace(index_target::target_option_name, target_option);
return index_metadata{name, new_options, kind};

@@ -70,7 +70,7 @@ class create_index_statement : public schema_altering_statement {
const std::vector<::shared_ptr<index_target::raw>> _raw_targets;
const ::shared_ptr<index_prop_defs> _properties;
const bool _if_not_exists;
cql_stats* _cql_stats = nullptr;
public:
create_index_statement(::shared_ptr<cf_name> name, ::shared_ptr<index_name> index_name,
@@ -78,8 +78,8 @@ public:
::shared_ptr<index_prop_defs> properties, bool if_not_exists);
future<> check_access(const service::client_state& state) override;
void validate(service::storage_proxy&, const service::client_state& state) override;
future<::shared_ptr<cql_transport::event::schema_change>> announce_migration(service::storage_proxy&, bool is_local_only) override;
void validate(distributed<service::storage_proxy>&, const service::client_state& state) override;
future<::shared_ptr<cql_transport::event::schema_change>> announce_migration(distributed<service::storage_proxy>&, bool is_local_only) override;
virtual std::unique_ptr<prepared> prepare(database& db, cql_stats& stats) override;
private:

@@ -69,7 +69,7 @@ future<> create_keyspace_statement::check_access(const service::client_state& st
return state.has_all_keyspaces_access(auth::permission::CREATE);
}
void create_keyspace_statement::validate(service::storage_proxy&, const service::client_state& state)
void create_keyspace_statement::validate(distributed<service::storage_proxy>&, const service::client_state& state)
{
std::string name;
name.resize(_name.length());
@@ -103,7 +103,7 @@ void create_keyspace_statement::validate(service::storage_proxy&, const service:
#endif
}
future<shared_ptr<cql_transport::event::schema_change>> create_keyspace_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only)
future<shared_ptr<cql_transport::event::schema_change>> create_keyspace_statement::announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only)
{
return make_ready_future<>().then([this, is_local_only] {
return service::get_local_migration_manager().announce_new_keyspace(_attrs->as_ks_metadata(_name), is_local_only);

@@ -79,9 +79,9 @@ public:
*
* @throws InvalidRequestException if arguments are missing or unacceptable
*/
virtual void validate(service::storage_proxy&, const service::client_state& state) override;
virtual void validate(distributed<service::storage_proxy>&, const service::client_state& state) override;
virtual future<shared_ptr<cql_transport::event::schema_change>> announce_migration(service::storage_proxy& proxy, bool is_local_only) override;
virtual future<shared_ptr<cql_transport::event::schema_change>> announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only) override;
virtual std::unique_ptr<prepared> prepare(database& db, cql_stats& stats) override;

@@ -72,12 +72,12 @@ public:
future<> grant_permissions_to_creator(const service::client_state&) const;
void validate(service::storage_proxy&, const service::client_state&) override;
void validate(distributed<service::storage_proxy>&, const service::client_state&) override;
virtual future<> check_access(const service::client_state&) override;
virtual future<::shared_ptr<cql_transport::messages::result_message>>
execute(service::storage_proxy&, service::query_state&, const query_options&) override;
execute(distributed<service::storage_proxy>&, service::query_state&, const query_options&) override;
};
}

@@ -76,7 +76,7 @@ future<> create_table_statement::check_access(const service::client_state& state
return state.has_keyspace_access(keyspace(), auth::permission::CREATE);
}
void create_table_statement::validate(service::storage_proxy&, const service::client_state& state) {
void create_table_statement::validate(distributed<service::storage_proxy>&, const service::client_state& state) {
// validated in announceMigration()
}
@@ -94,9 +94,9 @@ std::vector<column_definition> create_table_statement::get_columns()
return column_defs;
}
future<shared_ptr<cql_transport::event::schema_change>> create_table_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only) {
future<shared_ptr<cql_transport::event::schema_change>> create_table_statement::announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only) {
return make_ready_future<>().then([this, is_local_only, &proxy] {
return service::get_local_migration_manager().announce_new_column_family(get_cf_meta_data(proxy.get_db().local()), is_local_only);
return service::get_local_migration_manager().announce_new_column_family(get_cf_meta_data(proxy.local().get_db().local()), is_local_only);
}).then_wrapped([this] (auto&& f) {
try {
f.get();

@@ -100,9 +100,9 @@ public:
virtual future<> check_access(const service::client_state& state) override;
virtual void validate(service::storage_proxy&, const service::client_state& state) override;
virtual void validate(distributed<service::storage_proxy>&, const service::client_state& state) override;
virtual future<shared_ptr<cql_transport::event::schema_change>> announce_migration(service::storage_proxy& proxy, bool is_local_only) override;
virtual future<shared_ptr<cql_transport::event::schema_change>> announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only) override;
virtual std::unique_ptr<prepared> prepare(database& db, cql_stats& stats) override;

@@ -76,10 +76,10 @@ inline bool create_type_statement::type_exists_in(::keyspace& ks)
return keyspace_types.find(_name.get_user_type_name()) != keyspace_types.end();
}
void create_type_statement::validate(service::storage_proxy& proxy, const service::client_state& state)
void create_type_statement::validate(distributed<service::storage_proxy>& proxy, const service::client_state& state)
{
try {
auto&& ks = proxy.get_db().local().find_keyspace(keyspace());
auto&& ks = proxy.local().get_db().local().find_keyspace(keyspace());
if (type_exists_in(ks) && !_if_not_exists) {
throw exceptions::invalid_request_exception(sprint("A user type of name %s already exists", _name.to_string()));
}
@@ -129,9 +129,9 @@ inline user_type create_type_statement::create_type(database& db)
std::move(field_names), std::move(field_types));
}
future<shared_ptr<cql_transport::event::schema_change>> create_type_statement::announce_migration(service::storage_proxy& proxy, bool is_local_only)
future<shared_ptr<cql_transport::event::schema_change>> create_type_statement::announce_migration(distributed<service::storage_proxy>& proxy, bool is_local_only)
{
auto&& db = proxy.get_db().local();
auto&& db = proxy.local().get_db().local();
// Keyspace exists or we wouldn't have validated otherwise
auto&& ks = db.find_keyspace(keyspace());

Some files were not shown because too many files have changed in this diff.