Compare commits

..

99 Commits

Author SHA1 Message Date
Jenkins
d27eb734a7 release: prepare for 2.2.2 by hagitsegev 2019-01-12 18:28:25 +02:00
Avi Kivity
e6aeb490b5 Update seastar submodule
* seastar 6f61d74...88cb58c (2):
  > reactor: disable nowait aio due to a kernel bug
  > configure.py: Enhance detection for gcc -fvisibility=hidden bug

Fixes #3996.
2018-12-17 15:57:58 +02:00
Vladimir Krivopalov
2e3b09b593 database: Capture io_priority_class by reference to avoid dangling ref.
The original reference points to a thread-local storage object that
guaranteed to outlive the continuation, but copying it make the
subsequent calls point to a local object and introduces a use-after-free
bug.

Fixes #3948

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
(cherry picked from commit 68458148e7)
2018-12-02 13:32:59 +02:00
Tomasz Grabiec
92c74f4e0b utils: phased_barrier: Make advance_and_await() have strong exception guarantees
Currently, when advance_and_await() fails to allocate the new gate
object, it will throw bad_alloc and leave the phased_barrier object in
an invalid state. Calling advance_and_await() again on it will result
in undefined behavior (typically SIGSEGV) beacuse _gate will be
disengaged.

One place affected by this is table::seal_active_memtable(), which
calls _flush_barrier.advance_and_await(). If this throws, subsequent
flush attempts will SIGSEGV.

This patch rearranges the code so that advance_and_await() has strong
exception guarantees.
Message-Id: <1542645562-20932-1-git-send-email-tgrabiec@scylladb.com>

Fixes #3931.

(cherry picked from commit 57e25fa0f8)
2018-11-21 12:18:25 +02:00
Avi Kivity
89d835e9e3 tests: fix network_topology_test timing out in debug mode
In 2.2, SEASTAR_DEBUG is just DEBUG.
2018-10-21 19:04:08 +03:00
Takuya ASADA
263a740084 dist/debian: use --configfile to specify pbuilderrc
Use --configfile to specify pbuilderrc, instead of copying it to home directory.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180420024624.9661-1-syuu@scylladb.com>
(cherry picked from commit 01c36556bf)
2018-10-21 18:21:18 +03:00
Avi Kivity
7f24b5319e release: prepare for 2.2.1 2018-10-19 21:16:14 +03:00
Avi Kivity
fe16c0e985 locator: fix abstract_replication_strategy::get_ranges() and friends violating sort order
get_ranges() is supposed to return ranges in sorted order. However, a35136533d
broke this and returned the range that was supposed to be last in the second
position (e.g. [0, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9]). The broke cleanup, which
relied on the sort order to perform a binary search. Other users of the
get_ranges() family did not rely on the sort order.

Fixes #3872.
Message-Id: <20181019113613.1895-1-avi@scylladb.com>

(cherry picked from commit 1ce52d5432)
2018-10-19 21:16:12 +03:00
Glauber Costa
f85badaaac api: use longs instead of ints for snapshot sizes
Int types in json will be serialized to int types in C++. They will then
only be able to handle 4GB, and we tend to store more data than that.

Without this patch, listsnapshots is broken in all versions.

Fixes: #3845

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20181012155902.7573-1-glauber@scylladb.com>
(cherry picked from commit 98332de268)
2018-10-12 22:02:56 +03:00
Eliran Sinvani
2193d41683 cql3 : add workaround to antlr3 null dereference bug
The Antlr3 exception class has a null dereference bug that crashes
the system when trying to extract the exception message using
ANTLR_Exception<...>::displayRecognitionError(...) function. When
a parsing error occurs the CqlParser throws an exception which in
turn processesed for some special cases in scylla to generate a custom
message. The default case however, creates the message using
displayRecognitionError, causing the system to crash.
The fix is a simple workaround, making sure the pointer is not null
before the call to the function. A "proper" fix can't be implemented
because the exception class itself is implemented outside scylla
in antlr headers that resides on the host machine os.

Tested manualy 2 testcases, a typo causing scylla to crash and
a cql comment without a newline at the end also caused scylla to crash.
Ran unit tests (release).

Fixes #3740
Fixes #3764

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <cfc7e0d758d7a855d113bb7c8191b0fd7d2e8921.1538566542.git.eliransin@scylladb.com>
(cherry picked from commit 20f49566a2)
2018-10-08 11:02:16 +03:00
Avi Kivity
1e1f0c29bf utils: crc32: mark power crc32 assembly as not requiring an executable stack
The linker uses an opt-in system for non-executable stack: if all object files
opt into a non-executable stack, the binary will have a non-executable stack,
which is very desirable for security. The compiler cooperates by opting into
a non-executable stack whenever possible (always for our code).

However, we also have an assembly file (for fast power crc32 computations).
Since it doesn't opt into a non-executable stack, we get a binary with
executable stack, which Gentoo's build system rightly complains about.

Fix by adding the correct incantation to the file.

Fixes #3799.

Reported-by: Alexys Jacob <ultrabug@gmail.com>
Message-Id: <20181002151251.26383-1-avi@scylladb.com>
(cherry picked from commit aaab8a3f46)
2018-10-08 11:02:16 +03:00
Calle Wilund
84d4588b5f storage_proxy: Add missing re-throw in truncate_blocking
Iff truncation times out, we want to log it, but the exception should
not be swallowed, but re-thrown.

Fixes #3796.

Message-Id: <20181001112325.17809-1-calle@scylladb.com>
(cherry picked from commit 2996b8154f)
2018-10-08 11:02:16 +03:00
Duarte Nunes
7b43b26709 tests/aggregate_fcts_test: Add test case for wrapped types
Provide a test case which checks a type being wrapped in a
reverse_type plays no role in assignment.

Refs #3789

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180927223201.28152-2-duarte@scylladb.com>
(cherry picked from commit 17578c3579)
2018-10-08 11:02:16 +03:00
Duarte Nunes
0ed01acf15 cql3/selection/selector: Unwrap types when validating assignment
When validating assignment between two types, it's possible one of
them is wrapped in a reverse_type, if it comes, for example, from the
type associated with a clustering column. When checking for weak
assignment the types are correctly unwrapped, but not when checking
for an exact match, which this patch fixes.

Technically, the receiver is never a reversed_type for the current
callers, but this is the morally correct implementation, as the type
being reversed or not plays no role in assignment.

Tests: unit(release)

Fixes #3789

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180927223201.28152-1-duarte@scylladb.com>
(cherry picked from commit 5e7bb20c8a)
2018-10-08 11:02:16 +03:00
Gleb Natapov
7ce160f408 mutation_query_test: add test for result size calculation
Check that digest only and digest+data query calculate result size to be
the same.

Message-Id: <20180906153800.GK2326@scylladb.com>
(cherry picked from commit 9e438933a2)

Message-Id: <20181008075901.GC2380@scylladb.com>
2018-10-08 11:02:09 +03:00
Gleb Natapov
5017d9b46a mutation_partition: accurately account for result size in digest only queries
When measuring_output_stream is used to calculate result's element size
it incorrectly takes into account not only serialized element size, but
a placeholder that ser::qr_partition__rows/qr_partition__static_row__cells
constructors puts in the beginning. Fix it by taking starting point in a
stream before element serialization and subtracting it afterwords.

Fixes #3755

Message-Id: <20180906153609.GJ2326@scylladb.com>
(cherry picked from commit d7674288a9)
2018-10-07 18:16:19 +03:00
Gleb Natapov
50b6ab3552 mutation_partition: correctly measure static row size when doing digest calculation
The code uses incorrect output stream in case only digest is requested
and thus getting incorrect data size. Failing to correctly account
for static row size while calculating digest may cause digest mismatch
between digest and data query.

Fixes #3753.

Message-Id: <20180905131219.GD2326@scylladb.com>
(cherry picked from commit 98092353df)
2018-09-06 16:51:19 +03:00
Eliran Sinvani
b1652823aa cql3: ensure repeated values in IN clauses don't return repeated rows
When the list of values in the IN list of a single column contains
duplicates, multiple executors are activated since the assumption
is that each value in the IN list corresponds to a different partition.
this results in the same row appearing in the result number times
corresponding to the duplication of the partition value.

Added queries for the in restriction unitest and fixed with a bad result check.

Fixes #2837
Tests: Queries as in the usecase from the GitHub issue in both forms ,
prepared and plain (using python driver),Unitest.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <ad88b7218fa55466be7bc4303dc50326a3d59733.1534322238.git.eliransin@scylladb.com>
(cherry picked from commit d734d316a6)
2018-08-26 15:51:17 +03:00
Tomasz Grabiec
02b24aec34 Merge 'Fix multi-cell static list updates in the presence of ckeys' from Duarte
Fixes a regression introduced in
9e88b60ef5, which broke the lookup for
prefetched values of lists when a clustering key is specified.

This is the code that was removed from some list operations:

 std::experimental::optional<clustering_key> row_key;
 if (!column.is_static()) {
   row_key = clustering_key::from_clustering_prefix(*params._schema, prefix);
 }
 ...
 auto&& existing_list = params.get_prefetched_list(m.key().view(), row_key, column);

Put it back, in the form of common code in the update_parameters class.

Fixes #3703

* https://github.com/duarten/scylla cql-list-fixes/v1:
  tests/cql_query_test: Test multi-cell static list updates with ckeys
  cql3/lists: Fix multi-cell static list updates in the presence of ckeys
  keys: Add factory for an empty clustering_key_prefix_view

(cherry picked from commit 6937cc2d1c)
2018-08-21 21:39:22 +01:00
Duarte Nunes
22eea4d8cf cql3/query_options: Use _value_views in prepare()
_value_views is the authoritative data structure for the
client-specified values. Indeed, the ctor called
transport::request::read_options() leaves _values completely empty.

In query_options::prepare() we were, however, using _values to
associated values to the client-specified column names, and not
_value_views. Fix this by using _value_views instead.

As for the reasons we didn't see this bug earlier, I assume it's
because very few drivers set the 0x04 query options flag, which means
column names are omitted. This is the right thing to do since most
drivers have enough information to correctly position the values.

Fixes #3688

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180814234605.14775-1-duarte@scylladb.com>
(cherry picked from commit a4355fe7e7)
2018-08-21 21:39:22 +01:00
Tomasz Grabiec
d257f6d57c mutation_partition: Fix exception safety of row::apply_monotonically()
When emplace_back() fails, value is already moved-from into a
temporary, which breaks monotonicity expected from
apply_monotonically(). As a result, writes to that cell will be lost.

The fix is to avoid the temporary by in-place construction of
cell_and_hash. To do that, appropriate cell_and_hash constructor was
added.

Found by mutation_test.cc::test_apply_monotonically_is_monotonic with
some modifications to the random mutation generator.

Introduced in 99a3e3a.

Fixes #3678.

Message-Id: <1533816965-27328-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 024b3c9fd9)
2018-08-21 21:39:18 +01:00
Takuya ASADA
6fca92ac3c dist/common/scripts/scylla_ec2_check: support custom NIC ifname on EC2
This is bash version of commit 88fe3c2694.

Since some AMIs using consistent network device naming, primary NIC
ifname is not 'eth0'.
But we hardcoded NIC name as 'eth0' on scylla_ec2_check, we need to add
--nic option to specify custom NIC ifname.

Fixes #3658

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180807231650.13697-1-syuu@scylladb.com>
2018-08-08 09:16:57 +03:00
Jesse Haber-Kucharsky
26e3917046 auth: Don't use unsupported hashing algorithms
In previous versions of Fedora, the `crypt_r` function returned
`nullptr` when a requested hashing algorithm was not supported.

This is consistent with the documentation of the function in its man
page.

As of Fedora 28, the function's behavior changes so that the encrypted
text is not `nullptr` on error, but instead the string "*0".

The info pages for `crypt_r` clarify somewhat (and contradict the man
pages):

    Some implementations return `NULL` on failure, and others return an
    _invalid_ hashed passphrase, which will begin with a `*` and will
    not be the same as SALT.

Because of this change of behavior, users running Scylla on a Fedora 28
machine which was upgraded from a previous release would not be able to
authenticate: an unsupported hashing algorithm would be selected,
producing encrypted text that did not match the entry in the table.

With this change, unsupported algorithms are correctly detected and
users should be able to continue to authenticate themselves.

Fixes #3637.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <bcd708f3ec195870fa2b0d147c8910fb63db7e0e.1533322594.git.jhaberku@scylladb.com>
(cherry picked from commit fce10f2c6e)
2018-08-05 10:30:47 +03:00
Gleb Natapov
3892594a93 cache_hitrate_calculator: fix race when new table is added during calculations
The calculation consists of several parts with preemption point between
them, so a table can be added while calculation is ongoing. Do not
assume that table exists in intermediate data structure.

Fixes #3636

Message-Id: <20180801093147.GD23569@scylladb.com>
(cherry picked from commit 44a6afad8c)
2018-08-01 14:34:08 +03:00
Amos Kong
4b24439841 scylla_setup: fix conditional statement of silent mode
Commit 300af65555 introdued a problem in
conditional statement, script will always abort in silent mode, it doesn't
care about the return value.

Fixes #3485

Signed-off-by: Amos Kong <amos@scylladb.com>
Message-Id: <1c12ab04651352964a176368f8ee28f19ae43c68.1528077114.git.amos@scylladb.com>
(cherry picked from commit 364c2551c8)
2018-07-25 09:36:32 +03:00
Takuya ASADA
a02a4592d8 dist/common/scripts/scylla_setup: abort running script when one of setup failed in silent mode
Current script silently continues even one of setup fails, need to
abort.

Fixes #3433

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180522180355.1648-1-syuu@scylladb.com>
(cherry picked from commit 300af65555)
2018-07-25 09:36:29 +03:00
Avi Kivity
b6e1c08451 Merge "row_cache: Fix violation of continuity on concurrent eviction and population" from Tomasz
"
The problem happens under the following circumstances:

  - we have a partially populated partition in cache, with a gap in the middle

  - a read with no clustering restrictions trying to populate that gap

  - eviction of the entry for the lower bound of the gap concurrent with population

The population may incorrectly mark the range before the gap as continuous.
This may result in temporary loss of writes in that clustering range. The
problem heals by clearing cache.

Caught by row_cache_test::test_concurrent_reads_and_eviction, which has been
failing sporadically.

The problem is in ensure_population_lower_bound(), which returns true if
current clustering range covers all rows, which means that the populator has a
right to set continuity flag to true on the row it inserts. This is correct
only if the current population range actually starts since before all
clustering rows. Otherwise, we're populating since _last_row and should
consult it.

Fixes #3608.
"

* 'tgrabiec/fix-violation-of-continuity-on-concurrent-read-and-eviction' of github.com:tgrabiec/scylla:
  row_cache: Fix violation of continuity on concurrent eviction and population
  position_in_partition: Introduce is_before_all_clustered_rows()

(cherry picked from commit 31151cadd4)
2018-07-18 12:07:01 +02:00
Botond Dénes
9469afcd27 storage_proxy: use the original row limits for the final results merging
`query_partition_key_range()` does the final result merging and trimming
(if necessary) to make sure we don't send more rows to the client than
requested. This merging and trimming is done by a continuation attached
to the `query_partition_key_range_concurrent()` which does the actual
querying. The continuations captures via value the `row_limit` and
`partition_limit` fields of the `query::read_command` object of the
query. This has an unexpected consequence. The lambda object is
constructed after the call to `query_partition_key_range_concurrent()`
returns. If this call doesn't defer, any modifications done to the read
command object done by `query_partition_key_range_concurrent()` will be
visible to the lambda. This is undesirable because
`query_partition_key_range_concurrent()` updates the read command object
directly as the vnodes are traversed which in turn will result in the
lambda doing the final trimming according to a decremented `row_limits`,
which will cause the paging logic to declare the query as exhausted
prematurely because the page will not be full.
To avoid all this make a copy of the relevant limit fields before
`query_partition_key_range_concurrent()` is called and pass these copies
to the continuation, thus ensuring that the final trimming will be done
according to the original page limits.

Spotted while investigating a dtest failure on my 1865/range-scans/v2
branch. On that branch the way range scans are executed on replicas is
completely refactored. These changes appearantly reduce the number of
continuations in the read path to the point where an entire page can be
filled without deferring and thus causing the problem to surface.

Fixes #3605.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <f11e80a6bf8089d49ba3c112b25a69edf1a92231.1531743940.git.bdenes@scylladb.com>
(cherry picked from commit cc4acb6e26)
2018-07-16 17:51:06 +03:00
Avi Kivity
240b9f122b Merge "Backport empty partition range scan fixes" from Botond
"
This mini-series lumps together the fix for the empty partition range
scan crash (#3564) and the two follow-up patches.
"

* 'paging-fix-backport-2.2/v1' of https://github.com/denesb/scylla:
  query_pager: use query::is_single_partition() to check for singular range
  tests/cql_query_tess: add unit test for querying empty ranges test
  query_pager: be prepared to _ranges being empty
2018-07-05 10:29:31 +03:00
Botond Dénes
cb16cd7724 query_pager: use query::is_single_partition() to check for singular range
Use query::is_single_partition() to check whether the queried ranges are
singular or not. The current method of using
`dht::partition_range::is_singular()` is incorrect, as it is possible to
build a singular range that doesn't represent a single partition.
`query::is_single_partition()` correctly checks for this so use it
instead.

Found during code-review.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <f671f107e8069910a2f84b14c8d22638333d571c.1530675889.git.bdenes@scylladb.com>
(cherry picked from commit 8084ce3a8e)
2018-07-04 12:57:45 +03:00
Botond Dénes
c864d198fc tests/cql_query_tess: add unit test for querying empty ranges test
A bug was found recently (#3564) in the paging logic, where the code
assumed the queried ranges list is non-empty. This assumption is
incorrect as there can be valid (if rare) queries that can result in the
ranges list to be empty. Add a unit test that executes such a query with
paging enabled to detect any future bugs related to assumptions about
the ranges list being non-empty.

Refs: #3564
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <f5ba308c4014c24bb392060a7e72e7521ff021fa.1530618836.git.bdenes@scylladb.com>
(cherry picked from commit c236a96d7d)
2018-07-04 09:52:54 +03:00
Botond Dénes
25125e9c4f query_pager: be prepared to _ranges being empty
do_fetch_page() checks in the beginning whether there is a saved query
state already, meaning this is not the first page. If there is not it
checks whether the query is for a singulular partitions or a range scan
to decide whether to enable the stateful queries or not. This check
assumed that there is at least one range in _ranges which will not hold
under some circumstances. Add a check for _ranges being empty.

Fixes: #3564
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <cbe64473f8013967a93ef7b2104c7ca0507afac9.1530610709.git.bdenes@scylladb.com>
(cherry picked from commit 59a30f0684)
2018-07-04 09:52:54 +03:00
Shlomi Livne
faf10fe6aa release: prepare for 2.2.0
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
2018-07-01 22:40:42 +03:00
Calle Wilund
f76269cdcf sstables::compress: Ensure unqualified compressor name if possible
Fixes #3546

Both older origin and scylla writes "known" compressor names (i.e. those
in origin namespace) unqualified (i.e. LZ4Compressor).

This behaviour was not preserved in the virtualization change. But
probably should be.

Message-Id: <20180627110930.1619-1-calle@scylladb.com>
(cherry picked from commit 054514a47a)
2018-06-28 18:55:15 +03:00
Avi Kivity
a9b0ccf116 Merge "Disable sstable filtering based on min/max clustering key components" from Tomasz
"
With DateTiered and TimeWindow, there is a read optimization enabled
which excludes sstables based on overlap with recorded min/max values
of clustering key components. The problem is that it doesn't take into
account partition tombstones and static rows, which should still be
returned by the reader even if there is no overlap in the query's
clustering range. A read which returns no clustering rows can
mispopulate cache, which will appear as partition deletion or writes
to the static row being lost. Until node restart or eviction of the
partition entry.

There is also a bad interaction between cache population on read and
that optimization. When the clustering range of the query doesn't
overlap with any sstable, the reader will return no partition markers
for the read, which leads cache populator to assume there is no
partition in sstables and it will cache an empty partition. This will
cause later reads of that partition to miss prior writes to that
partition until it is evicted from cache or node is restarted.

Disable until a more elaborate fix is implemented.

Fixes #3552
Fixes #3553
"

* tag 'tgrabiec/disable-min-max-sstable-filtering-v1' of github.com:tgrabiec/scylla:
  tests: Add test for slicing a mutation source with date tiered compaction strategy
  tests: Check that database conforms to mutation source
  database: Disable sstable filtering based on min/max clustering key components

(cherry picked from commit e1efda8b0c)
2018-06-28 18:55:15 +03:00
Tomasz Grabiec
abc5941f87 flat_mutation_reader: Move field initialization to initializer list
This works around a problem of std::terminate() being called in debug
mode build if initialization of _current throws.

Backtrace:

Thread 2 "row_cache_test_" received signal SIGABRT, Aborted.
0x00007ffff17ce9fb in raise () from /lib64/libc.so.6
(gdb) bt
  #0  0x00007ffff17ce9fb in raise () from /lib64/libc.so.6
  #1  0x00007ffff17d077d in abort () from /lib64/libc.so.6
  #2  0x00007ffff5773025 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
  #3  0x00007ffff5770c16 in ?? () from /lib64/libstdc++.so.6
  #4  0x00007ffff576fb19 in ?? () from /lib64/libstdc++.so.6
  #5  0x00007ffff5770508 in __gxx_personality_v0 () from /lib64/libstdc++.so.6
  #6  0x00007ffff3ce4ee3 in ?? () from /lib64/libgcc_s.so.1
  #7  0x00007ffff3ce570e in _Unwind_Resume () from /lib64/libgcc_s.so.1
  #8  0x0000000003633602 in reader::reader (this=0x60e0001160c0, r=...) at flat_mutation_reader.cc:214
  #9  0x0000000003655864 in std::make_unique<make_forwardable(flat_mutation_reader)::reader, flat_mutation_reader>(flat_mutation_reader &&) (__args#0=...)
    at /usr/include/c++/7/bits/unique_ptr.h:825
  #10 0x0000000003649a63 in make_flat_mutation_reader<make_forwardable(flat_mutation_reader)::reader, flat_mutation_reader>(flat_mutation_reader &&) (args#0=...)
    at flat_mutation_reader.hh:440
  #11 0x000000000363565d in make_forwardable (m=...) at flat_mutation_reader.cc:270
  #12 0x000000000303f962 in memtable::make_flat_reader (this=0x61300001d540, s=..., range=..., slice=..., pc=..., trace_state_ptr=..., fwd=..., fwd_mr=...)
    at memtable.cc:592

Message-Id: <1528792447-13336-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 6d6b93d1e7)
2018-06-28 18:55:15 +03:00
Asias He
a152ac12af gossip: Fix tokens assignment in assassinate_endpoint
The tokens vector is defined a few lines above and is needed outsie the
if block.

Do not redefine it again in the if block, otherwise the tokens will be empty.

Found by code inspection.

Fixes #3551.

Message-Id: <c7a06375c65c950e94236571127f533e5a60cbfd.1530002177.git.asias@scylladb.com>
(cherry picked from commit c3b5a2ecd5)
2018-06-28 18:55:15 +03:00
Botond Dénes
c274fdf2ec querier: find_querier(): return end() when no querier matches the range
When none of the queriers found for the lookup key match the lookup
range `_entries.end()` should be returned as the search failed. Instead
the iterator returned from the failed `std::find_if()` is returned
which, if the find failed, will be the end iterator returned by the
previous call to `_entries.equal_range()`. This is incorrect because as
long as `equal_range()`'s end iterator is not also `_entries.end()` the
search will always return an iterator to a querier regardless of whether
any of them actually matches the read range.
Fix by returning `_entries.end()` when it is detected that no queriers
match the range.

Fixes: #3530
(cherry picked from commit 2609a17a23)
2018-06-28 18:55:15 +03:00
Botond Dénes
5b88d6b4d6 querier_cache: restructure entries storage
Currently querier_cache uses a `std::unordered_map<utils::UUID, querier>`
to store cache entries and an `std::list<meta_entry>` to store meta
information about the querier entries, like insertion order, expiry
time, etc.

All cache eviction algorithms use the meta-entry list to evict entries
in reverse insertion order (LRU order). To make this possible
meta-entries keep an iterator into the entry map so that given a
meta-entry one can easily erase the querier entry. This however poses a
problem as std::unordered_map can possibly invalidate all its iterators
when new items are inserted. This is use-after-free waiting to happen.

Another disadvantages of the current solution is that it requires the
meta-entry to use a weak pointer to the querier entry so that in case
that is removed (as a result of a successful lookup) it doesn't try to
access it. This has an impact on all cache eviction algorithms as they
have to be prepared to deal with stale meta-entries. Stale meta-entries
also unnecesarily consume memory.

To solve these problems redesign how querier_cache stores entries
completely. Instead of storing the entries in an `std::unordered_map`
and storing the meta-entries in an `std::list`, store the entries in an
`std::list` and an intrusive-map (index) for lookups. This new design
has severeal advantages over the old one:
* The entries will now be in insert order, so eviction strategies can
  work on the entry list itself, no need to involve additional data
  structures for this.
* All data related to an entry is stored in one place, no data
  duplication.
* Removing an entry automatically removes it from the index as intrusive
  containers support auto unlink. This means there is no need to store
  iterators for long terms, risking use-after-free when the container
  invalidates it's iterators.

Additional changes:
* Modify eviction strategies so that they work with the `entry`
  interface rather than the stored value directly.

Ref #3424

(cherry picked from commit 7ce7f3f0cc)
2018-06-28 18:55:15 +03:00
Botond Dénes
2d626e1cf8 tests/querier_cache: fix memory based eviction test
Do increment the key counter after inserting the first querier into the
cache. Otherwise two queriers with the same key will be inserted and
will fail the test. This problem is exposed by the changes the next
patches make to the querier-cache but will be fixed before to maintain
bisectability of the code.

Fixes: #3529
(cherry picked from commit b9d51b4c08)
2018-06-28 18:55:15 +03:00
Avi Kivity
c11bd3e1cf Merge "Do not allow compaction controller shares to grow indefinitely" from Glauber
"
We are seeing some workloads with large datasets where the compaction
controller ends up with a lot of shares. Regardless of whether or not
we'll change the algorithm, this patchset handles a more basic issue,
which is the fact that the current controller doesn't set a maximum
explicitly, so if the input is larger than the maximum it will keep
growing without bounds.

It also pushes the maximum input point of the compaction controller from
10 to 30, allowing us to err on the side of caution for the 2.2 release.
"

* 'tame-controller' of github.com:glommer/scylla:
  controller: do not increase shares of controllers for inputs higher than the maximum
  controller: adjust constants for compaction controller

(cherry picked from commit e0eb66af6b)
2018-06-20 10:58:20 +03:00
Avi Kivity
9df3df92bc Merge "Try harder to move STCS towards zero-backlog" from Glauber
"
Tests: unit (release)

Before merging the LCS controller, we merged patches that would
guarantee that LCS would move towards zero backlog - otherwise the
backlog could get too high.

We didn't do the same for STCS, our first controlled strategy. So we may
end up with a situation where there are many SSTables inducing a large
backlog, but they are not yet meeting the minimum criteria for
compaction. The backlog, then, never goes down.

This patch changes the SSTable selection criteria so that if there is
nothing to do, we'll keep pushing towards reaching a state of zero
backlog. Very similar to what we did for LCS.
"

* 'stcs-min-threshold-v4' of github.com:glommer/scylla:
  STCS: bypass min_threshold unless configure to enforce strictly
  compaction_strategy: allow the user to tell us if min_threshold has to be strict

(cherry picked from commit f0fc888381)
2018-06-18 14:21:52 +03:00
Takuya ASADA
8ad9578a6c dist/debian: add --jobs <njobs> option just like build_rpm.sh
On some build environment we may want to limit number of parallel jobs since
ninja-build runs ncpus jobs by default, it may too many since g++ eats very
huge memory.
So support --jobs <njobs> just like on rpm build script.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180425205439.30053-1-syuu@scylladb.com>
(cherry picked from commit 782ebcece4)
2018-06-14 15:04:50 +03:00
Tomasz Grabiec
4cb6061a9f tests: row_cache: Reduce concurrency limit to avoid bad_alloc
The test uses random mutations. We saw it failing with bad_alloc from time to time.
Reduce concurrency to reduce memory footprint.

Message-Id: <20180611090304.16681-1-tgrabiec@scylladb.com>
(cherry picked from commit a91974af7a)
2018-06-14 13:40:00 +02:00
Tomasz Grabiec
1940e6bd95 tests: row_cache: Do not hang when only one of the readers throws
Message-Id: <20180531122729.3314-1-tgrabiec@scylladb.com>
(cherry picked from commit b5e42bc6a0)
2018-06-14 13:40:00 +02:00
Avi Kivity
044cfde5f3 database: stop using incremental selectors
There is a bug in incremental_selector for partitioned_sstable_set, so
until it is found, stop using it.

This degrades scan performance of Leveled Compaction Strategy tables.

Fixes #3513. (as a workaround)
Introduced: 2.1
Message-Id: <20180613131547.19084-1-avi@scylladb.com>

(cherry picked from commit aeffbb6732)
2018-06-13 21:04:56 +03:00
Vlad Zolotarov
262a246436 locator::ec2_multi_region_snitch: don't call for ec2_snitch::gossiper_starting()
ec2_snitch::gossiper_starting() calls for the base class (default) method
that sets _gossip_started to TRUE and thereby prevents to following
reconnectable_snitch_helper registration.

Fixes #3454

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1528208520-28046-1-git-send-email-vladz@scylladb.com>
(cherry picked from commit 2dde372ae6)
2018-06-12 19:02:19 +03:00
Botond Dénes
799dbb4f2e forwardable reader: implement fast_forward_to(position_in_partition)
Instead of throwing std::bad_function_call. Needed by the foreign_reader
unit test. Not sure how other tests didn't hit this before as the test
is using `run_mutation_source_tests()`.

(cherry picked from commit 50b67232e5)
Fixes #3491.
2018-06-05 12:34:15 +03:00
Shlomi Livne
a2fe669dd3 dist/docker: Switch to Scylla 2.2 repository
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <83b4ff801b283ade512a7035ecea9057a864dcdd.1526995747.git.shlomi@scylladb.com>
2018-06-05 12:34:15 +03:00
Avi Kivity
56de761daf Update seastar submodule
* seastar 7c6ba3a...6f61d74 (1):
  > tls: Ensure handshake always drains output before return/throw

Fixes #3461.
2018-06-05 12:34:15 +03:00
Shlomi Livne
c3187093a3 release: prepare for 2.2.rc2
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
2018-05-30 17:32:16 +03:00
Avi Kivity
111c2ecf5d Update scylla-ami submodule
* dist/ami/files/scylla-ami 49896ec...6ed71a3 (1):
  > scylla_install_ami: Update CentOS to latest version
2018-05-28 14:02:43 +03:00
Takuya ASADA
a6ecdbbba6 Revert "dist/ami: update CentOS base image to latest version"
This reverts commit 69d226625a.
Since ami-4bf3d731 is Market Place AMI, not possible to publish public AMI based on it.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180523112414.27307-1-syuu@scylladb.com>
(cherry picked from commit 6b1b9f9e602c570bbc96692d30046117e7d31ea7)
2018-05-28 13:40:15 +03:00
Glauber Costa
17cc62d0b3 commitlog: don't move pointer to segment
We are currently moving the pointer we acquired to the segment inside
the lambda in which we'll handle the cycle.

The problem is, we also use that same pointer inside the exception
handler. If an exception happens we'll access it and we'll crash.

Probably #3440.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20180518125820.10726-1-glauber@scylladb.com>
(cherry picked from commit 596a525950)
2018-05-19 19:12:26 +03:00
Shlomi Livne
eb646c61ed release: prepare for 2.2.rc1
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
2018-05-16 21:31:50 +03:00
Avi Kivity
782d817e84 dist: redhat: get rid of raid0.devices_discard_performance
This parameter is not available on recent Red Hat kernels or on
non-Red Hat kernels (it was removed on 3.10.0-772.el7,
RHBZ 1455932). The presence of the parameter on kernels that don't
support it cause the module load to fail, with the result that the
storage is not available.

Fix by removing the parameter. For someone running an older Red Hat
kernel the effect will be that discard is disabled, but they can fix
that by updating the kernel. For someone running a newer kernel, the
effect will be that they can access their data.

Fixes #3437.
Message-Id: <20180516134913.6540-1-avi@scylladb.com>

(cherry picked from commit 3b8118d4e5)
2018-05-16 20:13:59 +03:00
Avi Kivity
3ed5e63e8a Update scylla-ami submodule
* dist/ami/files/scylla-ami 02b1853...49896ec (1):
  > Merge "AMI build fix" from Takuya
2018-05-16 12:37:03 +03:00
Tomasz Grabiec
d17ce46983 Update seastar submodule
Fixes #3339.

* seastar 491f994...7c6ba3a (2):
  > Merge "fix perftune.py issues with cpu-masks on big machines" from Vlad
  > Merge 'Handle Intel's NICs in a special way'  from Vlad
2018-05-16 09:37:41 +02:00
Takuya ASADA
7ca5e7e993 dist/redhat: replace scylla-libgcc72/scylla-libstdc++72 with scylla-2.2 metapackage
We have conflict between scylla-libgcc72/scylla-libstdc++72 and
scylla-libgcc73/scylla-libstdc++73, need to replace *72 package with
scylla-2.2 metapackage to prevent it.

Fixes #3373

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180510081246.17928-1-syuu@scylladb.com>
(cherry picked from commit 6fa3c4dcad)
2018-05-11 09:42:28 +03:00
Duarte Nunes
07b0ce27fa Merge 'Include OPTIONS with LIST ROLES' from Jesse
"
Fixes #3420.

Tests: dtest (`auth_test.py`), unit (release)
"

* 'jhk/fix_3420/v2' of https://github.com/hakuch/scylla:
  cql3: Include custom options in LIST ROLES
  auth: Query custom options from the `authenticator`
  auth: Add type alias for custom auth. options

(cherry picked from commit d49348b0e1)
2018-05-10 13:22:49 +03:00
Amnon Heiman
27be3cd242 scylla-housekeeping: support new 2018.1 path variation
Starting from 2018.1 and 2.2 there was a change in the repository path.
It was made to support multiple product (like manager and place the
enterprise in a different path).

As a result, the regular expression that look for the repository fail.

This patch change the way the path is searched, both rpm and debian
varations are combined and both options of the repository path are
unified.

See scylladb/scylla-enterprise#527

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20180429151926.20431-1-amnon@scylladb.com>
(cherry picked from commit 6bf759128b)
2018-05-09 15:22:55 +03:00
Calle Wilund
abf50aafef database: Fix assert in truncate
Fixes crash in cql_tests.StorageProxyCQLTester.table_test
"avoid race condition when deleting sstable on behalf..." changed
discard_sstables behaviour to only return rp:s for sstables owned
and submitted for deletion (not all matching time stamp),
which can in some cases cause zero rp returned.
Message-Id: <20180508070003.1110-1-calle@scylladb.com>
2018-05-09 10:02:09 +01:00
Duarte Nunes
dfe5b38a43 db/view: Limit number of pending view updates
This patch adds a simple and naive mechanism to ensure a base replica
doesn't overwhelm a potentially overloaded view replica by sending too
many concurrent view updates. We add a semaphore to limit to 100 the
number of outstanding view updates. We limit globally per shard, and
not per destination view replica. We also limit statically.

Refs #2538

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180426134457.21290-2-duarte@scylladb.com>
(cherry picked from commit 4b3562c3f5)
2018-05-08 00:46:33 +01:00
Duarte Nunes
9bdc8c25f5 db/view: Return a future when sending view updates
While we now send view mutations asynchronously in the normal view
write path, other processes interested in sending view updates, such
as streaming or view building, may wish to do it synchronously.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
(cherry picked from commit dc44a08370)
2018-05-08 00:46:19 +01:00
Duarte Nunes
e75c55b2db db/timeout_clock: Properly scope type names
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180426134457.21290-1-duarte@scylladb.com>
(cherry picked from commit 2be75bdfc9)
2018-05-07 19:29:48 +01:00
Botond Dénes
756feae052 database: when dropping a table evict all relevant queriers
Queriers shouldn't outlive the table they read from as that could lead
to use-after-free problems when they are destroyed.

Fixes: #3414

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <3d7172cef79bb52b7097596e1d4ebba3a6ff757e.1525716986.git.bdenes@scylladb.com>
(cherry picked from commit 6f7d919470)
2018-05-07 21:20:42 +03:00
Tomasz Grabiec
202b4e6797 storage_proxy: Request schema from the coordinator in the original DC
The mutation forwarding intermediary (src_addr) may not always know
about the schema which was used by the original coordinator. I think
this may be the cause of the "Schema version ... not found" error seen
in one of the clusters which entered some pathological state:

  storage_proxy - Failed to apply mutation from 1.1.1.1#5: std::_Nested_exception<schema_version_loading_failed> (Failed to load schema version 32893223-a911-3a01-ad70-df1eb2a15db1): std::runtime_error (Schema version 32893223-a911-3a01-ad70-df1eb2a15db1 not found)

Fixes #3393.

Message-Id: <1524639030-1696-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 423712f1fe)
2018-05-07 13:08:40 +03:00
Raphael S. Carvalho
76ac200eff database: avoid race condition when deleting sstable on behalf of cf truncate
After removal of deletion manager, caller is now responsible for properly
submitting the deletion of a shared sstable. That's because deletion manager
was responsible for holding deletion until all owners agreed on it.
Resharding for example was changed to delete the shared sstables at the end,
but truncate wasn't changed and so race condition could happen when deleting
same sstable at more than one shard in parallel. Change the operation to only
submit a shared sstable for deletion in only one owner.

Fixes dtest migration_test.TestMigration.migrate_sstable_with_schema_change_test

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180503193427.24049-1-raphaelsc@scylladb.com>
2018-05-04 13:10:12 +01:00
Tomasz Grabiec
9aa172fe8e db: schema_tables: Treat drop of scylla_tables.version as an alter
After upgrade from 1.7 to 2.0, nodes will record a per-table schema
version which matches that on 1.7 to support the rolling upgrade. Any
later schema change (after the upgrade is done) will drop this record
from affected tables so that the per-table schema version is
recalculated. If nodes perform a schema pull (they detect schema
mismatch), then the merge will affect all tables and will wipe the
per-table schema version record from all tables, even if their schema
did not change. If then only some nodes get restarted, the restarted
nodes will load tables with the new (recalculated) per-table schema
version, while not restarted nodes will still use the 1.7 per-table
schema version. Until all nodes are restarted, writes or reads between
nodes from different groups will involve a needless exchange of schema
definition.

This will manifest in logs with repeated messages indicating schema
merge with no effect, triggered by writes:

  database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f
  database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f
  database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f

The sync will be performed if the receiving shard forgets the foreign
version, which happens if it doesn't process any request referencing
it for more than 1 second.

This may impact latency of writes and reads.

The fix is to treat schema changes which drop the 1.7 per-table schema
version marker as an alter, which will switch in-memory data
structures to use the new per-table schema version immediately,
without the need for a restart.

Fixes #3394

Tests:
    - dtest: schema_test.py, schema_management_test.py
    - reproduced and validated the fix with run_upgrade_tests.sh from git@github.com:tgrabiec/scylla-dtest.git
    - unit (release)

Message-Id: <1524764211-12868-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit b1465291cf)
2018-05-03 10:51:19 +03:00
Takuya ASADA
c4af043ef7 dist/common/scripts/scylla_raid_setup: prevent 'device or resource busy' on creating mdraid device
According to this web site, there is possibility we have race condition with
mdraid creation vs udev:
http://dev.bizo.com/2012/07/mdadm-device-or-resource-busy.html
And looks like it can happen on our AMI, too (see #2784).

To initialize RAID safely, we should wait udev events are finished before and
after mdadm executed.

Fixes #2784

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1505898196-28389-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 4a8ed4cc6f)
2018-04-24 12:53:34 +03:00
Raphael S. Carvalho
06b25320be sstables: Fix bloom filter size after resharding by properly estimating partition count
We were feeding the total estimation partition count of an input shared
sstable to the output unshared ones.

So sstable writer thinks, *from estimation*, that each sstable created
by resharding will have the same data amount as the shared sstable they
are being created from. That's a problem because estimation is feeded to
bloom filter creation which directly influences its size.
So if we're resharding all sstables that belong to all shards, the
disk usage taken by filter components will be multiplied by the number
of shards. That becomes more of a problem with #3302.

Partition count estimation for a shard S will now be done as follow:
    //
    // TE, the total estimated partition count for a shard S, is defined as
    // TE = Sum(i = 0...N) { Ei / Si }.
    //
    // where i is an input sstable that belongs to shard S,
    //       Ei is the estimated partition count for sstable i,
    //       Si is the total number of shards that own sstable i.

Fixes #2672.
Refs #3302.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180423151001.9995-1-raphaelsc@scylladb.com>
(cherry picked from commit 11940ca39e)
2018-04-24 12:53:34 +03:00
Takuya ASADA
ff70d9f15c dist: Drop AmbientCapabilities from scylla-server.service for Debian 8
Debian 8 causes "Invalid argument" when we used AmbientCapabilities on systemd
unit file, so drop the line when we build .deb package for Debian 8.
For other distributions, keep using the feature.

Fixes #3344

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180423102041.2138-1-syuu@scylladb.com>
(cherry picked from commit 7b92c3fd3f)
2018-04-24 12:53:34 +03:00
Avi Kivity
9bbd5821a2 Update scylla-ami submodule
* dist/ami/files/scylla-ami 9b4be70...02b1853 (1):
  > scylla_install_ami: remove the host id file after scylla_setup
2018-04-24 12:53:34 +03:00
Avi Kivity
a7841f1f2e release: prepare for 2.2.rc0 2018-04-18 11:08:43 +03:00
Takuya ASADA
84859e0745 dist/debian: use ~root as HOME to place .pbuilderrc
When 'always_set_home' is specified on /etc/sudoers pbuilder won't read
.pbuilderrc from current user home directory, and we don't have a way to change
the behavor from sudo command parameter.

So let's use ~root/.pbuilderrc and switch to HOME=/root when sudo executed,
this can work both environment which does specified always_set_home and doesn't
specified.

Fixes #3366

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1523926024-3937-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit ace44784e8)
2018-04-17 09:38:43 +03:00
Avi Kivity
6b74e1f02d Update seastar submodule
* seastar bcfbe0c...491f994 (3):
  > tls: Ensure we always pass through semaphores on shutdown
  > cpu scheduler: don't penalize first group to run
  > reactor: fix sleep mode

Fixes #3350.
2018-04-14 20:44:11 +03:00
Avi Kivity
520f17b315 Point seastar submodule at scylla-seastar.git
This allows backporting seastar patches.
2018-04-14 20:43:28 +03:00
Gleb Natapov
9fe3d04f31 cql_server: fix a race between closing of a connection and notifier registration
There is a race between cql connection closure and notifier
registration. If a connection is closed before notification registration
is complete stale pointer to the connection will remain in notification
list since attempt to unregister the connection will happen to early.
The fix is to move notifier unregisteration after connection's gate
is closed which will ensure that there is no outstanding registration
request. But this means that now a connection with closed gate can be in
notifier list, so with_gate() may throw and abort a notifier loop. Fix
that by replacing with_gate() by call to is_closed();

Fixes: #3355
Tests: unit(release)

Message-Id: <20180412134744.GB22593@scylladb.com>
(cherry picked from commit 1a9aaece3e)
2018-04-12 16:57:07 +03:00
Raphael S. Carvalho
a74183eb1e sstables/compaction_manager: do not break lcs invariant by not allowing parallel compaction for it
After change to serialize compaction on compaction weight (eff62bc61e),
LCS invariant may break because parallel compaction can start, and it's
not currently supported for LCS.

The condition is that weight is deregistered right before last sstable
for a leveled compaction is sealed, so it may happen that a new compaction
starts for the same column family meanwhile that will promote a sstable to
an overlapping token range.

That leads to strategy restoring invariant when it finds the overlapping,
and that means wasted resources.
The fix is about removing a fast path check which is incorrect now because
we release weight early and also fixing a check for ongoing compaction
which prevented compaction from starting for LCS whenever weight tracker
was not empty.

Fixes #3279.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180410034538.30486-1-raphaelsc@scylladb.com>
(cherry picked from commit 638a647b7d)
2018-04-10 20:59:48 +03:00
Raphael S. Carvalho
e059f17bf2 database: make sure sstable is also forwarded to shard responsible for its generation
After f59f423f3c, sstable is loaded only at shards
that own it so as to reduce the sstable load overhead.

The problem is that a sstable may no longer be forwarded to a shard that needs to
be aware of its existence which would result in that sstable generation being
reallocated for a write request.
That would result in a failure as follow:
"SSTable write failed due to existence of TOC file for generation..."

This can be fixed by forwarding any sstable at load to all its owner shards
*and* the shard responsible for its generation, which is determined as follow:
s = generation % smp::count

Fixes #3273.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180405035245.30194-1-raphaelsc@scylladb.com>
(cherry picked from commit 30b6c9b4cd)
2018-04-05 10:58:29 +03:00
Duarte Nunes
0e8e005357 db/view: Reject view entries with non-composite, empty partition key
Empty partition keys are not supported on normal tables - they cannot
be inserted or queried (surprisingly, the rules for composite
partition keys are different: all components are then allowed to be
empty). However, the (non-composite) partition key of a view could end
up being empty if that column is: a base table regular column, a
base table clustering key column, or a base table partition key column,
part of a composite key.

Fixes #3262
Refs CASSANDRA-14345

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180403122244.10626-1-duarte@scylladb.com>
(cherry picked from commit ec8960df45)
2018-04-03 17:20:33 +03:00
Glauber Costa
8bf6f39392 docker: default docker to overprovisioned mode.
By default, overprovisioned is not enabled on docker unless it is
explicitly set. I have come to believe that this is a mistake.

If the user is running alone in the machine, and there are no other
processes pinned anywhere - including interrupts - not running
overprovisioned is the best choice.

But everywhere else, it is not: even if a user runs 2 docker containers
in the same machine and statically partitions CPUs with --smp (but
without cpuset) the docker containers will pin themselves to the same
sets of CPU, as they are totally unaware of each other.

It is also very common, specially in some virtualized environments, for
interrupts not to be properly distributed - being particularly keen on
being delivered on CPU0, a CPU which Scylla will pin by default.

Lastly, environments like Kubernetes simply don't support pinning at the
moment.

This patch enables the overprovisioned flag if it is explicitly set -
like we did before - but also by default unless --cpuset is set.

Fixes #3336.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20180331142131.842-1-glauber@scylladb.com>
(cherry picked from commit ef84780c27)
2018-04-02 17:07:20 +03:00
Glauber Costa
04ba51986e parse and ignore background writer controller
Unused options are not exposed as command line options and will prevent
Scylla from booting when present, although they can still be pased over
YAML, for Cassandra compatibility.

That has never been a problem, but we have been adding options to i3
(and others) that are now deprecated, but were previously marked as
Used. Systems with those options may have issues upgrading.

While this problem is common to all Unused options, the likelihood for
any other unused option to appear in the command line is near zero,
except for those two - since we put them there ourselves.

There are two ways to handle this issue:

1) Mark them as Used, and just ignore them.
2) Add them explicitly to boost program options, and then ignore them.

The second option is preferred here, because we can add them as hidden
options in program_options, meaning they won't show up in the help. We
can then just print a discrete message saying that those options are,
for now on ignored.

v2: mark set as const (Botond)
v3: rebase on top of master, identation suggested by Duarte.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20180329145517.8462-1-glauber@scylladb.com>
(cherry picked from commit a9ef72537f)
2018-03-29 17:57:43 +03:00
Asias He
1d5379c462 gossip: Relax generation max difference check
start node 1 2 3
shutdown node2
shutdown node1 and node3
start node1 and node3
nodetool removenode node2
clean up all scylla data on node2
bootstrap node2 as a new node

I saw node2 could not bootstrap stuck at waiting for schema information to compelte for ever:

On node1, node3

    [shard 0] gossip - received an invalid gossip generation for peer 127.0.0.2; local generation = 2, received generation = 1521779704

On node2

    [shard 0] storage_service - JOINING: waiting for schema information to complete

This is becasue in nodetool removenode operation, the generation of node1 was increased from 0 to 2.

   gossiper::advertise_removing () calls eps.get_heart_beat_state().force_newer_generation_unsafe();
   gossiper::advertise_token_removed() calls eps.get_heart_beat_state().force_newer_generation_unsafe();

Each force_newer_generation_unsafe increases the generation by 1.

Here is an example,

Before nodetool removenode:
```
curl -X GET --header "Accept: application/json" "http://127.0.0.1:10000/failure_detector/endpoints/" | python -mjson.tool
   {
   "addrs": "127.0.0.2",
   "generation": 0,
   "is_alive": false,
   "update_time": 1521778757334,
   "version": 0
   },
```

After nodetool revmoenode:
```
curl -X GET --header "Accept: application/json" "http://127.0.0.1:10000/failure_detector/endpoints/" | python -mjson.tool
 {
     "addrs": "127.0.0.2",
     "application_state": [
         {
             "application_state": 0,
             "value": "removed,146b52d5-dc94-4e35-b7d4-4f64be0d2672,1522038476246",
             "version": 214
         },
         {
             "application_state": 6,
             "value": "REMOVER,14ecc9b0-4b88-4ff3-9c96-38505fb4968a",
             "version": 153
            }
     ],
     "generation": 2,
     "is_alive": false,
     "update_time": 1521779276246,
     "version": 0
 },
```

In gossiper::apply_state_locally, we have this check:

```
if (local_generation != 0 && remote_generation > local_generation + MAX_GENERATION_DIFFERENCE) {
    // assume some peer has corrupted memory and is broadcasting an unbelievable generation about another peer (or itself)
  logger.warn("received an invalid gossip generation for peer {}; local generation = {}, received generation = {}",ep, local_generation, remote_generation);

}
```
to skip the gossip update.

To fix, we relax generation max difference check to allow the generation
of a removed node.

After this patch, the removed node bootstraps successfully.

Tests: dtest:update_cluster_layout_tests.py
Fixes #3331

Message-Id: <678fb60f6b370d3ca050c768f705a8f2fd4b1287.1522289822.git.asias@scylladb.com>
(cherry picked from commit f539e993d3)
2018-03-29 12:10:09 +03:00
Avi Kivity
cb5dc56bfd Update scylla-ami submodule
Ref #3332.
2018-03-29 10:35:54 +03:00
Duarte Nunes
b578b492cd column_family: Don't retry flushing memtable if shutdown is requested
Since we just keep retrying, this can cause Scylla to not shutdown for
a while.

The data will be safe in the commit log.

Note that this patch doesn't fix the issue when shutdown goes through
storage_service::drain_on_shutdown - more work is required to handle
that case.

Ref #3318.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180324140822.3743-3-duarte@scylladb.com>
(cherry picked from commit a985ea0fcb)
2018-03-26 15:26:56 +03:00
Duarte Nunes
30c950a7f6 column_family: Increase scope of exception handling when flushing a memtable
In column_family::try_flush_memtable_to_sstable, the handle_exception()
block is on the inside of the continuations to
write_memtable_to_sstable(), which, if it fails, will leave the
sstable in the compaction_backlog_tracker::_ongoing_writes map, which
will waste disk space, and that sstable will map to a dangling pointer
to a destroyed database_sstable_write_monitor, which causes a seg
fault when accessed (for example, through the backlog_controller,
which accounts the _ongoing_writes when calculating the backlog).

Fix this by increasing the scope of handle_exception().

Fixes #3315

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180324140822.3743-2-duarte@scylladb.com>
(cherry picked from commit 50ad37d39b)
2018-03-26 15:26:54 +03:00
Duarte Nunes
f0d1e9c518 backlog_controller: Stop update timer
On database shutdown, this timer can cause use-after-free errors if
not stopped.

Refs #3315

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180324140822.3743-1-duarte@scylladb.com>
(cherry picked from commit b7bd9b8058)
2018-03-26 15:26:52 +03:00
Avi Kivity
597aeca93d Merge "Bug fixes for access-control, and finalizing roles" from Jesse
"
This series does not add or change any features of access-control and
roles, but addresses some bugs and finalizes the switch to roles.

"auth: Wait for schema agreement" and the patch prior help avoid false
negatives for integration tests and error messages in logs.

"auth: Remove ordering dependence" fixes an important bug in `auth` that
could leave the default superuser in a corrupted state when it is first
created.

Since roles are feature-complete (to the best of the author's knowledge
as of this writing), the final patch in the series removes any warnings
about them being unimplemented.

Tests: unit (release), dtest (PENDING)
"

* 'jhk/auth_fixes/v1' of https://github.com/hakuch/scylla:
  Roles are implemented
  auth: Increase delay before background tasks start
  auth: Remove ordering dependence
  auth: Don't warn on rescheduled task
  auth: Wait for schema agreement
  Single-node clusters can agree on schema

(cherry picked from commit 999df41a49)
2018-03-26 12:37:41 +03:00
Duarte Nunes
1a94b90a4d Merge 'Grant default permissions' from Jesse
The functional change in this series is in the last patch
("auth: Grant all permissions to object creator").

The first patch addresses `const` correctness in `auth`. This change
allowed the new code added in the last patch to be written with the
correct `const` specifiers, and also some code to be removed.

The second-to-last patch addresses error-handling in the authorizer for
unsupported operations and is a prerequisite for the last patch (since
we now always grant permissions for new database objects).

Tests: unit (release)

* 'jhk/default_permissions/v3' of https://github.com/hakuch/scylla:
  auth: Grant all permissions to object creator
  auth: Unify handling for unsupported errors
  auth: Fix life-time issue with parameter
  auth: Fix `const` correctness

(cherry picked from commit 934d805b4b)
2018-03-26 12:37:35 +03:00
Avi Kivity
acdd42c7c8 Merge "Fix abort during counter table read-on-delete" from Tomasz
"
This fixes an abort in an sstable reader when querying a partition with no
clustering ranges (happens on counter table mutation with no live rows) which
also doesn't have any static columns. In such case, the
sstable_mutation_reader will setup the data_consume_context such that it only
covers the static row of the partition, knowing that there is no need to read
any clustered rows. See partition.cc::advance_to_upper_bound(). Later when
the reader is done with the range for the static row, it will try to skip to
the first clustering range (missing in this case). If clustering_ranges_walker
tells us to skip to after_all_clustering_rows(), we will hit an assert inside
continuous_data_consumer::fast_forward_to() due to attempt to skip past the
original data file range. If clustering_ranges_walker returns
before_all_clustering_rows() instead, all is fine because we're still at the
same data file position.

Fixes #3304.
"

* 'tgrabiec/fix-counter-read-no-static-columns' of github.com:scylladb/seastar-dev:
  tests: mutation_source_test: Test reads with no clustering ranges and no static columns
  tests: simple_schema: Allow creating schema with no static column
  clustering_ranges_walker: Stop after static row in case no clustering ranges

(cherry picked from commit 054854839a)
2018-03-22 18:13:29 +02:00
Takuya ASADA
bd4f658555 scripts/scylla_install_pkg: follow redirection of specified repo URL
We should follow redirection on curl, just like normal web browser does.
Fixes #3312

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1521712056-301-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit bef08087e1)
2018-03-22 12:56:58 +02:00
Vladimir Krivopalov
a983ba7aad perf_fast_forward: fix error in date formatting
Instead of 'month', 'minutes' has been used.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <1e005ecaa992d8205ca44ea4eebbca4621ad9886.1521659341.git.vladimir@scylladb.com>
(cherry picked from commit 3010b637c9)
2018-03-22 12:56:56 +02:00
Duarte Nunes
0a561fc326 gms/gossiper: Synchronize endpoint state destruction
In gossiper::handle_major_state_change() we set the endpoint_state for
a particular endpoint and replicate the changes to other cores.

This is totally unsynchronized with the execution of
gossiper::evict_from_membership(), which can happen concurrently, and
can remove the very same endpoint from the map  (in all cores).

Replicating the changes to other cores in handle_major_state_change()
can interleave with replicating the changes to other cores in
evict_from_membership(), and result in an undefined final state.

Another issue happened in debug mode dtests, where a fiber executes
handle_major_state_change(), calls into the subscribers, of which
storage_service is one, and ultimately lands on
storage_service::update_peer_info(), which iterates over the
endpoint's application state with deferring points in between (to
update a system table). gossiper::evict_from_membership() was executed
concurrently by another fiber, which freed the state the first one is
iterating over.

Fixes #3299.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180318123211.3366-1-duarte@scylladb.com>
(cherry picked from commit 810db425a5)
2018-03-18 14:54:54 +02:00
Takuya ASADA
1f10549056 dist/redhat: build only scylla, iotune
Since we don't package tests, we don't need to build them.
It reduces package building time.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1521066363-4859-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 1bb3531b90)
2018-03-15 10:48:36 +02:00
Takuya ASADA
c2a2560ea3 dist/debian: use 3rdparty ppa on Ubuntu 18.04
Currently Ubuntu 18.04 uses distribution provided g++ and boost, but it's easier
to maintain Scylla package to build with same version toolchain/libraries, so
switch to them.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1521075576-12064-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 945e6ec4f6)
2018-03-15 10:48:31 +02:00
Takuya ASADA
237e36a0b4 dist/ami: update CentOS base image to latest version
Since we requires updated version of systemd, we need to update CentOS base
image.

Fixes #3184

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1518118694-23770-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 69d226625a)
2018-03-15 10:47:54 +02:00
Takuya ASADA
e78c137bfc dist/redhat: switch to gcc-7.3
We have hit following bug on debug-mode binary:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82560
Since it's fixed on gcc-7.3, we need to upgrade our gcc package.

See: https://groups.google.com/d/topic/scylladb-dev/RIdIpqMeTog/discussion
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1521064473-17906-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 856dc0a636)
2018-03-15 10:39:40 +02:00
Avi Kivity
fb99a7c902 Merge "Ubuntu/Debian build error fixes" from Takuya
* 'debian-ubuntu-build-fixes-v2' of https://github.com/syuu1228/scylla:
  dist/debian: build only scylla, iotune
  dist/debian: switch to boost-1.65
  dist/debian: switch to gcc-7.3

(cherry picked from commit bb4b1f0e91)
2018-03-14 22:51:44 +02:00
4184 changed files with 79910 additions and 285323 deletions

View File

@@ -1,4 +0,0 @@
.git
build
seastar/build
testlog

87
.github/CODEOWNERS vendored
View File

@@ -1,87 +0,0 @@
# AUTH
auth/* @elcallio @vladzcloudius
# CACHE
row_cache* @tgrabiec @haaawk
*mutation* @tgrabiec @haaawk
tests/mvcc* @tgrabiec @haaawk
# CDC
cdc/* @haaawk @kbr- @elcallio @piodul @jul-stas
test/cql/cdc_* @haaawk @kbr- @elcallio @piodul @jul-stas
test/boost/cdc_* @haaawk @kbr- @elcallio @piodul @jul-stas
# COMMITLOG / BATCHLOG
db/commitlog/* @elcallio
db/batch* @elcallio
# COORDINATOR
service/storage_proxy* @gleb-cloudius
# COMPACTION
sstables/compaction* @raphaelsc @nyh
# CQL TRANSPORT LAYER
transport/* @penberg
# CQL QUERY LANGUAGE
cql3/* @tgrabiec @penberg @psarna
# COUNTERS
counters* @haaawk @jul-stas
tests/counter_test* @haaawk @jul-stas
# GOSSIP
gms/* @tgrabiec @asias
# DOCKER
dist/docker/* @penberg
# LSA
utils/logalloc* @tgrabiec
# MATERIALIZED VIEWS
db/view/* @nyh @psarna
cql3/statements/*view* @nyh @psarna
test/boost/view_* @nyh @psarna
# PACKAGING
dist/* @syuu1228
# REPAIR
repair/* @tgrabiec @asias @nyh
# SCHEMA MANAGEMENT
db/schema_tables* @tgrabiec @nyh
db/legacy_schema_migrator* @tgrabiec @nyh
service/migration* @tgrabiec @nyh
schema* @tgrabiec @nyh
# SECONDARY INDEXES
db/index/* @nyh @penberg @psarna
cql3/statements/*index* @nyh @penberg @psarna
test/boost/*index* @nyh @penberg @psarna
# SSTABLES
sstables/* @tgrabiec @raphaelsc @nyh
# STREAMING
streaming/* @tgrabiec @asias
service/storage_service.* @tgrabiec @asias
# ALTERNATOR
alternator/* @nyh @psarna
test/alternator/* @nyh @psarna
# HINTED HANDOFF
db/hints/* @haaawk @piodul @vladzcloudius
# REDIS
redis/* @nyh @syuu1228
redis-test/* @nyh @syuu1228
# READERS
reader_* @denesb
querier* @denesb
test/boost/mutation_reader_test.cc @denesb
test/boost/querier_cache_test.cc @denesb

View File

@@ -1,9 +1,3 @@
This is Scylla's bug tracker, to be used for reporting bugs only.
If you have a question about Scylla, and not a bug, please ask it in
our mailing-list at scylladb-dev@googlegroups.com or in our slack channel.
- [] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.
*Installation details*
Scylla version (or git commit hash):
Cluster size:

View File

@@ -1,33 +0,0 @@
name: "CI Docs"
on:
push:
branches:
- master
paths:
- 'docs/**'
jobs:
release:
name: Build
runs-on: ubuntu-latest
env:
LATEST_VERSION: master
steps:
- name: Checkout
uses: actions/checkout@v2
with:
persist-credentials: false
fetch-depth: 0
- name: Set up Python
uses: actions/setup-python@v1
with:
python-version: 3.7
- name: Build docs
run: |
export PATH=$PATH:~/.local/bin
cd docs
make multiversion
- name: Deploy
run : ./docs/_utils/deploy.sh
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

9
.gitignore vendored
View File

@@ -18,12 +18,3 @@ CMakeLists.txt.user
*.egg-info
__pycache__CMakeLists.txt.user
.gdbinit
resources
.pytest_cache
/expressions.tokens
tags
testlog
test/*/*.reject
.vscode
docs/_build
docs/poetry.lock

21
.gitmodules vendored
View File

@@ -6,18 +6,9 @@
path = swagger-ui
url = ../scylla-swagger-ui
ignore = dirty
[submodule "libdeflate"]
path = libdeflate
url = ../libdeflate
[submodule "abseil"]
path = abseil
url = ../abseil-cpp
[submodule "scylla-jmx"]
path = tools/jmx
url = ../scylla-jmx
[submodule "scylla-tools"]
path = tools/java
url = ../scylla-tools-java
[submodule "scylla-python3"]
path = tools/python3
url = ../scylla-python3
[submodule "dist/ami/files/scylla-ami"]
path = dist/ami/files/scylla-ami
url = ../scylla-ami
[submodule "xxHash"]
path = xxHash
url = ../xxHash

View File

@@ -1,756 +1,141 @@
cmake_minimum_required(VERSION 3.18)
##
## For best results, first compile the project using the Ninja build-system.
##
cmake_minimum_required(VERSION 3.7)
project(scylla)
if(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
message(STATUS "Setting build type to 'Release' as none was specified.")
set(CMAKE_BUILD_TYPE "Release" CACHE
STRING "Choose the type of build." FORCE)
# Set the possible values of build type for cmake-gui
set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS
"Debug" "Release" "Dev" "Sanitize")
if (NOT DEFINED FOR_IDE AND NOT DEFINED ENV{FOR_IDE} AND NOT DEFINED ENV{CLION_IDE})
message(FATAL_ERROR "This CMakeLists.txt file is only valid for use in IDEs, please define FOR_IDE to acknowledge this.")
endif()
if(CMAKE_BUILD_TYPE)
string(TOLOWER "${CMAKE_BUILD_TYPE}" BUILD_TYPE)
else()
set(BUILD_TYPE "release")
endif()
# Default value. A more accurate list is populated through `pkg-config` below if `seastar.pc` is available.
set(SEASTAR_INCLUDE_DIRS "seastar")
function(default_target_arch arch)
set(x86_instruction_sets i386 i686 x86_64)
if(CMAKE_SYSTEM_PROCESSOR IN_LIST x86_instruction_sets)
set(${arch} "westmere" PARENT_SCOPE)
elseif(CMAKE_SYSTEM_PROCESSOR EQUAL "aarch64")
set(${arch} "armv8-a+crc+crypto" PARENT_SCOPE)
# These paths are always available, since they're included in the repository. Additional DPDK headers are placed while
# Seastar is built, and are captured in `SEASTAR_INCLUDE_DIRS` through parsing the Seastar pkg-config file (below).
set(SEASTAR_DPDK_INCLUDE_DIRS
seastar/dpdk/lib/librte_eal/common/include
seastar/dpdk/lib/librte_eal/common/include/generic
seastar/dpdk/lib/librte_eal/common/include/x86
seastar/dpdk/lib/librte_ether)
find_package(PkgConfig REQUIRED)
set(ENV{PKG_CONFIG_PATH} "${CMAKE_SOURCE_DIR}/seastar/build/release:$ENV{PKG_CONFIG_PATH}")
pkg_check_modules(SEASTAR seastar)
find_package(Boost COMPONENTS filesystem program_options system thread)
##
## Populate the names of all source and header files in the indicated paths in a designated variable.
##
## When RECURSIVE is specified, directories are traversed recursively.
##
## Use: scan_scylla_source_directories(VAR my_result_var [RECURSIVE] PATHS [path1 path2 ...])
##
function (scan_scylla_source_directories)
set(options RECURSIVE)
set(oneValueArgs VAR)
set(multiValueArgs PATHS)
cmake_parse_arguments(args "${options}" "${oneValueArgs}" "${multiValueArgs}" "${ARGN}")
set(globs "")
foreach (dir ${args_PATHS})
list(APPEND globs "${dir}/*.cc" "${dir}/*.hh")
endforeach()
if (args_RECURSIVE)
set(glob_kind GLOB_RECURSE)
else()
set(${arch} "" PARENT_SCOPE)
set(glob_kind GLOB)
endif()
endfunction()
default_target_arch(target_arch)
if(target_arch)
set(target_arch_flag "-march=${target_arch}")
endif()
# Configure Seastar compile options to align with Scylla
set(Seastar_CXX_FLAGS -fcoroutines ${target_arch_flag} CACHE INTERNAL "" FORCE)
set(Seastar_CXX_DIALECT gnu++20 CACHE INTERNAL "" FORCE)
file(${glob_kind} var
${globs})
add_subdirectory(seastar)
add_subdirectory(abseil)
# Exclude absl::strerror from the default "all" target since it's not
# used in Scylla build and, moreover, makes use of deprecated glibc APIs,
# such as sys_nerr, which are not exposed from "stdio.h" since glibc 2.32,
# which happens to be the case for recent Fedora distribution versions.
#
# Need to use the internal "absl_strerror" target name instead of namespaced
# variant because `set_target_properties` does not understand the latter form,
# unfortunately.
set_target_properties(absl_strerror PROPERTIES EXCLUDE_FROM_ALL TRUE)
# System libraries dependencies
find_package(Boost COMPONENTS filesystem program_options system thread regex REQUIRED)
find_package(Lua REQUIRED)
find_package(ZLIB REQUIRED)
find_package(ICU COMPONENTS uc REQUIRED)
set(scylla_build_dir "${CMAKE_BINARY_DIR}/build/${BUILD_TYPE}")
set(scylla_gen_build_dir "${scylla_build_dir}/gen")
file(MAKE_DIRECTORY "${scylla_build_dir}" "${scylla_gen_build_dir}")
# Place libraries, executables and archives in ${buildroot}/build/${mode}/
foreach(mode RUNTIME LIBRARY ARCHIVE)
set(CMAKE_${mode}_OUTPUT_DIRECTORY "${scylla_build_dir}")
endforeach()
# Generate C++ source files from thrift definitions
function(scylla_generate_thrift)
set(one_value_args TARGET VAR IN_FILE OUT_DIR SERVICE)
cmake_parse_arguments(args "" "${one_value_args}" "" ${ARGN})
get_filename_component(in_file_name ${args_IN_FILE} NAME_WE)
set(aux_out_file_name ${args_OUT_DIR}/${in_file_name})
set(outputs
${aux_out_file_name}_types.cpp
${aux_out_file_name}_types.h
${aux_out_file_name}_constants.cpp
${aux_out_file_name}_constants.h
${args_OUT_DIR}/${args_SERVICE}.cpp
${args_OUT_DIR}/${args_SERVICE}.h)
add_custom_command(
DEPENDS
${args_IN_FILE}
thrift
OUTPUT ${outputs}
COMMAND ${CMAKE_COMMAND} -E make_directory ${args_OUT_DIR}
COMMAND thrift -gen cpp:cob_style,no_skeleton -out "${args_OUT_DIR}" "${args_IN_FILE}")
add_custom_target(${args_TARGET}
DEPENDS ${outputs})
set(${args_VAR} ${outputs} PARENT_SCOPE)
set(${args_VAR} ${var} PARENT_SCOPE)
endfunction()
scylla_generate_thrift(
TARGET scylla_thrift_gen_cassandra
VAR scylla_thrift_gen_cassandra_files
IN_FILE interface/cassandra.thrift
OUT_DIR ${scylla_gen_build_dir}
SERVICE Cassandra)
## Although Seastar is an external project, it is common enough to explore the sources while doing
## Scylla development that we'll treat the Seastar sources as part of this project for easier navigation.
scan_scylla_source_directories(
VAR SEASTAR_SOURCE_FILES
RECURSIVE
# Parse antlr3 grammar files and generate C++ sources
function(scylla_generate_antlr3)
set(one_value_args TARGET VAR IN_FILE OUT_DIR)
cmake_parse_arguments(args "" "${one_value_args}" "" ${ARGN})
PATHS
seastar/core
seastar/http
seastar/json
seastar/net
seastar/rpc
seastar/tests
seastar/util)
get_filename_component(in_file_pure_name ${args_IN_FILE} NAME)
get_filename_component(stem ${in_file_pure_name} NAME_WE)
scan_scylla_source_directories(
VAR SCYLLA_ROOT_SOURCE_FILES
PATHS .)
set(outputs
"${args_OUT_DIR}/${stem}Lexer.hpp"
"${args_OUT_DIR}/${stem}Lexer.cpp"
"${args_OUT_DIR}/${stem}Parser.hpp"
"${args_OUT_DIR}/${stem}Parser.cpp")
scan_scylla_source_directories(
VAR SCYLLA_SUB_SOURCE_FILES
RECURSIVE
add_custom_command(
DEPENDS
${args_IN_FILE}
OUTPUT ${outputs}
# Remove #ifdef'ed code from the grammar source code
COMMAND sed -e "/^#if 0/,/^#endif/d" "${args_IN_FILE}" > "${args_OUT_DIR}/${in_file_pure_name}"
COMMAND antlr3 "${args_OUT_DIR}/${in_file_pure_name}"
# We replace many local `ExceptionBaseType* ex` variables with a single function-scope one.
# Because we add such a variable to every function, and because `ExceptionBaseType` is not a global
# name, we also add a global typedef to avoid compilation errors.
COMMAND sed -i -e "/^.*On :.*$/d" "${args_OUT_DIR}/${stem}Lexer.hpp"
COMMAND sed -i -e "/^.*On :.*$/d" "${args_OUT_DIR}/${stem}Lexer.cpp"
COMMAND sed -i -e "/^.*On :.*$/d" "${args_OUT_DIR}/${stem}Parser.hpp"
COMMAND sed -i
-e "s/^\\( *\\)\\(ImplTraits::CommonTokenType\\* [a-zA-Z0-9_]* = NULL;\\)$/\\1const \\2/"
-e "/^.*On :.*$/d"
-e "1i using ExceptionBaseType = int;"
-e "s/^{/{ ExceptionBaseType\\* ex = nullptr;/; s/ExceptionBaseType\\* ex = new/ex = new/; s/exceptions::syntax_exception e/exceptions::syntax_exception\\& e/"
"${args_OUT_DIR}/${stem}Parser.cpp"
VERBATIM)
PATHS
api
auth
cql3
db
dht
exceptions
gms
index
io
locator
message
repair
service
sstables
streaming
tests
thrift
tracing
transport
utils)
add_custom_target(${args_TARGET}
DEPENDS ${outputs})
scan_scylla_source_directories(
VAR SCYLLA_GEN_SOURCE_FILES
RECURSIVE
PATHS build/release/gen)
set(${args_VAR} ${outputs} PARENT_SCOPE)
endfunction()
set(antlr3_grammar_files
cql3/Cql.g
alternator/expressions.g)
set(antlr3_gen_files)
foreach(f ${antlr3_grammar_files})
get_filename_component(grammar_file_name "${f}" NAME_WE)
get_filename_component(f_dir "${f}" DIRECTORY)
scylla_generate_antlr3(
TARGET scylla_antlr3_gen_${grammar_file_name}
VAR scylla_antlr3_gen_${grammar_file_name}_files
IN_FILE ${f}
OUT_DIR ${scylla_gen_build_dir}/${f_dir})
list(APPEND antlr3_gen_files "${scylla_antlr3_gen_${grammar_file_name}_files}")
endforeach()
# Generate C++ sources from ragel grammar files
seastar_generate_ragel(
TARGET scylla_ragel_gen_protocol_parser
VAR scylla_ragel_gen_protocol_parser_file
IN_FILE redis/protocol_parser.rl
OUT_FILE ${scylla_gen_build_dir}/redis/protocol_parser.hh)
# Generate C++ sources from Swagger definitions
set(swagger_files
api/api-doc/cache_service.json
api/api-doc/collectd.json
api/api-doc/column_family.json
api/api-doc/commitlog.json
api/api-doc/compaction_manager.json
api/api-doc/config.json
api/api-doc/endpoint_snitch_info.json
api/api-doc/error_injection.json
api/api-doc/failure_detector.json
api/api-doc/gossiper.json
api/api-doc/hinted_handoff.json
api/api-doc/lsa.json
api/api-doc/messaging_service.json
api/api-doc/storage_proxy.json
api/api-doc/storage_service.json
api/api-doc/stream_manager.json
api/api-doc/system.json
api/api-doc/utils.json)
set(swagger_gen_files)
foreach(f ${swagger_files})
get_filename_component(fname "${f}" NAME_WE)
get_filename_component(dir "${f}" DIRECTORY)
seastar_generate_swagger(
TARGET scylla_swagger_gen_${fname}
VAR scylla_swagger_gen_${fname}_files
IN_FILE "${f}"
OUT_DIR "${scylla_gen_build_dir}/${dir}")
list(APPEND swagger_gen_files "${scylla_swagger_gen_${fname}_files}")
endforeach()
# Create C++ bindings for IDL serializers
function(scylla_generate_idl_serializer)
set(one_value_args TARGET VAR IN_FILE OUT_FILE)
cmake_parse_arguments(args "" "${one_value_args}" "" ${ARGN})
get_filename_component(out_dir ${args_OUT_FILE} DIRECTORY)
set(idl_compiler "${CMAKE_SOURCE_DIR}/idl-compiler.py")
find_package(Python3 COMPONENTS Interpreter)
add_custom_command(
DEPENDS
${args_IN_FILE}
${idl_compiler}
OUTPUT ${args_OUT_FILE}
COMMAND ${CMAKE_COMMAND} -E make_directory ${out_dir}
COMMAND Python3::Interpreter ${idl_compiler} --ns ser -f ${args_IN_FILE} -o ${args_OUT_FILE})
add_custom_target(${args_TARGET}
DEPENDS ${args_OUT_FILE})
set(${args_VAR} ${args_OUT_FILE} PARENT_SCOPE)
endfunction()
set(idl_serializers
idl/cache_temperature.idl.hh
idl/commitlog.idl.hh
idl/consistency_level.idl.hh
idl/frozen_mutation.idl.hh
idl/frozen_schema.idl.hh
idl/gossip_digest.idl.hh
idl/idl_test.idl.hh
idl/keys.idl.hh
idl/messaging_service.idl.hh
idl/mutation.idl.hh
idl/paging_state.idl.hh
idl/partition_checksum.idl.hh
idl/paxos.idl.hh
idl/query.idl.hh
idl/range.idl.hh
idl/read_command.idl.hh
idl/reconcilable_result.idl.hh
idl/replay_position.idl.hh
idl/result.idl.hh
idl/ring_position.idl.hh
idl/streaming.idl.hh
idl/token.idl.hh
idl/tracing.idl.hh
idl/truncation_record.idl.hh
idl/uuid.idl.hh
idl/view.idl.hh)
set(idl_gen_files)
foreach(f ${idl_serializers})
get_filename_component(idl_name "${f}" NAME)
get_filename_component(idl_target "${idl_name}" NAME_WE)
get_filename_component(idl_dir "${f}" DIRECTORY)
string(REPLACE ".idl.hh" ".dist.hh" idl_out_hdr_name "${idl_name}")
scylla_generate_idl_serializer(
TARGET scylla_idl_gen_${idl_target}
VAR scylla_idl_gen_${idl_target}_files
IN_FILE ${f}
OUT_FILE ${scylla_gen_build_dir}/${idl_dir}/${idl_out_hdr_name})
list(APPEND idl_gen_files "${scylla_idl_gen_${idl_target}_files}")
endforeach()
set(scylla_sources
absl-flat_hash_map.cc
alternator/auth.cc
alternator/base64.cc
alternator/conditions.cc
alternator/executor.cc
alternator/expressions.cc
alternator/serialization.cc
alternator/server.cc
alternator/stats.cc
alternator/streams.cc
api/api.cc
api/cache_service.cc
api/collectd.cc
api/column_family.cc
api/commitlog.cc
api/compaction_manager.cc
api/config.cc
api/endpoint_snitch.cc
api/error_injection.cc
api/failure_detector.cc
api/gossiper.cc
api/hinted_handoff.cc
api/lsa.cc
api/messaging_service.cc
api/storage_proxy.cc
api/storage_service.cc
api/stream_manager.cc
api/system.cc
atomic_cell.cc
auth/allow_all_authenticator.cc
auth/allow_all_authorizer.cc
auth/authenticated_user.cc
auth/authentication_options.cc
auth/authenticator.cc
auth/common.cc
auth/default_authorizer.cc
auth/password_authenticator.cc
auth/passwords.cc
auth/permission.cc
auth/permissions_cache.cc
auth/resource.cc
auth/role_or_anonymous.cc
auth/roles-metadata.cc
auth/sasl_challenge.cc
auth/service.cc
auth/standard_role_manager.cc
auth/transitional.cc
bytes.cc
canonical_mutation.cc
cdc/cdc_partitioner.cc
cdc/generation.cc
cdc/log.cc
cdc/metadata.cc
cdc/split.cc
clocks-impl.cc
collection_mutation.cc
compress.cc
connection_notifier.cc
converting_mutation_partition_applier.cc
counters.cc
cql3/abstract_marker.cc
cql3/attributes.cc
cql3/cf_name.cc
cql3/column_condition.cc
cql3/column_identifier.cc
cql3/column_specification.cc
cql3/constants.cc
cql3/cql3_type.cc
cql3/expr/expression.cc
cql3/functions/aggregate_fcts.cc
cql3/functions/castas_fcts.cc
cql3/functions/error_injection_fcts.cc
cql3/functions/functions.cc
cql3/functions/user_function.cc
cql3/index_name.cc
cql3/keyspace_element_name.cc
cql3/lists.cc
cql3/maps.cc
cql3/operation.cc
cql3/query_options.cc
cql3/query_processor.cc
cql3/relation.cc
cql3/restrictions/statement_restrictions.cc
cql3/result_set.cc
cql3/role_name.cc
cql3/selection/abstract_function_selector.cc
cql3/selection/selectable.cc
cql3/selection/selection.cc
cql3/selection/selector.cc
cql3/selection/selector_factories.cc
cql3/selection/simple_selector.cc
cql3/sets.cc
cql3/single_column_relation.cc
cql3/statements/alter_keyspace_statement.cc
cql3/statements/alter_table_statement.cc
cql3/statements/alter_type_statement.cc
cql3/statements/alter_view_statement.cc
cql3/statements/authentication_statement.cc
cql3/statements/authorization_statement.cc
cql3/statements/batch_statement.cc
cql3/statements/cas_request.cc
cql3/statements/cf_prop_defs.cc
cql3/statements/cf_statement.cc
cql3/statements/create_function_statement.cc
cql3/statements/create_index_statement.cc
cql3/statements/create_keyspace_statement.cc
cql3/statements/create_table_statement.cc
cql3/statements/create_type_statement.cc
cql3/statements/create_view_statement.cc
cql3/statements/delete_statement.cc
cql3/statements/drop_function_statement.cc
cql3/statements/drop_index_statement.cc
cql3/statements/drop_keyspace_statement.cc
cql3/statements/drop_table_statement.cc
cql3/statements/drop_type_statement.cc
cql3/statements/drop_view_statement.cc
cql3/statements/function_statement.cc
cql3/statements/grant_statement.cc
cql3/statements/index_prop_defs.cc
cql3/statements/index_target.cc
cql3/statements/ks_prop_defs.cc
cql3/statements/list_permissions_statement.cc
cql3/statements/list_users_statement.cc
cql3/statements/modification_statement.cc
cql3/statements/permission_altering_statement.cc
cql3/statements/property_definitions.cc
cql3/statements/raw/parsed_statement.cc
cql3/statements/revoke_statement.cc
cql3/statements/role-management-statements.cc
cql3/statements/schema_altering_statement.cc
cql3/statements/select_statement.cc
cql3/statements/truncate_statement.cc
cql3/statements/update_statement.cc
cql3/statements/use_statement.cc
cql3/token_relation.cc
cql3/tuples.cc
cql3/type_json.cc
cql3/untyped_result_set.cc
cql3/update_parameters.cc
cql3/user_types.cc
cql3/ut_name.cc
cql3/util.cc
cql3/values.cc
cql3/variable_specifications.cc
data/cell.cc
database.cc
db/batchlog_manager.cc
db/commitlog/commitlog.cc
db/commitlog/commitlog_entry.cc
db/commitlog/commitlog_replayer.cc
db/config.cc
db/consistency_level.cc
db/cql_type_parser.cc
db/data_listeners.cc
db/extensions.cc
db/heat_load_balance.cc
db/hints/manager.cc
db/hints/resource_manager.cc
db/large_data_handler.cc
db/legacy_schema_migrator.cc
db/marshal/type_parser.cc
db/schema_tables.cc
db/size_estimates_virtual_reader.cc
db/snapshot-ctl.cc
db/sstables-format-selector.cc
db/system_distributed_keyspace.cc
db/system_keyspace.cc
db/view/row_locking.cc
db/view/view.cc
db/view/view_update_generator.cc
dht/boot_strapper.cc
dht/i_partitioner.cc
dht/murmur3_partitioner.cc
dht/range_streamer.cc
dht/token.cc
distributed_loader.cc
duration.cc
exceptions/exceptions.cc
flat_mutation_reader.cc
frozen_mutation.cc
frozen_schema.cc
gms/application_state.cc
gms/endpoint_state.cc
gms/failure_detector.cc
gms/feature_service.cc
gms/gossip_digest_ack.cc
gms/gossip_digest_ack2.cc
gms/gossip_digest_syn.cc
gms/gossiper.cc
gms/inet_address.cc
gms/version_generator.cc
gms/versioned_value.cc
hashers.cc
index/secondary_index.cc
index/secondary_index_manager.cc
init.cc
keys.cc
lister.cc
locator/abstract_replication_strategy.cc
locator/ec2_multi_region_snitch.cc
locator/ec2_snitch.cc
locator/everywhere_replication_strategy.cc
locator/gce_snitch.cc
locator/gossiping_property_file_snitch.cc
locator/local_strategy.cc
locator/network_topology_strategy.cc
locator/production_snitch_base.cc
locator/rack_inferring_snitch.cc
locator/simple_snitch.cc
locator/simple_strategy.cc
locator/snitch_base.cc
locator/token_metadata.cc
lua.cc
main.cc
memtable.cc
message/messaging_service.cc
multishard_mutation_query.cc
mutation.cc
raft/fsm.cc
raft/log.cc
raft/progress.cc
raft/raft.cc
raft/server.cc
mutation_fragment.cc
mutation_partition.cc
mutation_partition_serializer.cc
mutation_partition_view.cc
mutation_query.cc
mutation_reader.cc
mutation_writer/multishard_writer.cc
mutation_writer/shard_based_splitting_writer.cc
mutation_writer/timestamp_based_splitting_writer.cc
mutation_writer/feed_writers.cc
partition_slice_builder.cc
partition_version.cc
querier.cc
query-result-set.cc
query.cc
range_tombstone.cc
range_tombstone_list.cc
reader_concurrency_semaphore.cc
redis/abstract_command.cc
redis/command_factory.cc
redis/commands.cc
redis/keyspace_utils.cc
redis/lolwut.cc
redis/mutation_utils.cc
redis/options.cc
redis/query_processor.cc
redis/query_utils.cc
redis/server.cc
redis/service.cc
redis/stats.cc
repair/repair.cc
repair/row_level.cc
row_cache.cc
schema.cc
schema_mutations.cc
schema_registry.cc
service/client_state.cc
service/migration_manager.cc
service/migration_task.cc
service/misc_services.cc
service/pager/paging_state.cc
service/pager/query_pagers.cc
service/paxos/paxos_state.cc
service/paxos/prepare_response.cc
service/paxos/prepare_summary.cc
service/paxos/proposal.cc
service/priority_manager.cc
service/storage_proxy.cc
service/storage_service.cc
sstables/compaction.cc
sstables/compaction_manager.cc
sstables/compaction_strategy.cc
sstables/compress.cc
sstables/integrity_checked_file_impl.cc
sstables/kl/writer.cc
sstables/leveled_compaction_strategy.cc
sstables/m_format_read_helpers.cc
sstables/metadata_collector.cc
sstables/mp_row_consumer.cc
sstables/mx/writer.cc
sstables/partition.cc
sstables/prepended_input_stream.cc
sstables/random_access_reader.cc
sstables/size_tiered_compaction_strategy.cc
sstables/sstable_directory.cc
sstables/sstable_version.cc
sstables/sstables.cc
sstables/sstables_manager.cc
sstables/time_window_compaction_strategy.cc
sstables/writer.cc
streaming/progress_info.cc
streaming/session_info.cc
streaming/stream_coordinator.cc
streaming/stream_manager.cc
streaming/stream_plan.cc
streaming/stream_reason.cc
streaming/stream_receive_task.cc
streaming/stream_request.cc
streaming/stream_result_future.cc
streaming/stream_session.cc
streaming/stream_session_state.cc
streaming/stream_summary.cc
streaming/stream_task.cc
streaming/stream_transfer_task.cc
table.cc
table_helper.cc
thrift/controller.cc
thrift/handler.cc
thrift/server.cc
thrift/thrift_validation.cc
timeout_config.cc
tracing/trace_keyspace_helper.cc
tracing/trace_state.cc
tracing/traced_file.cc
tracing/tracing.cc
tracing/tracing_backend_registry.cc
transport/controller.cc
transport/cql_protocol_extension.cc
transport/event.cc
transport/event_notifier.cc
transport/messages/result_message.cc
transport/server.cc
types.cc
unimplemented.cc
utils/UUID_gen.cc
utils/arch/powerpc/crc32-vpmsum/crc32_wrapper.cc
utils/array-search.cc
utils/ascii.cc
utils/big_decimal.cc
utils/bloom_calculations.cc
utils/bloom_filter.cc
utils/buffer_input_stream.cc
utils/build_id.cc
utils/config_file.cc
utils/directories.cc
utils/disk-error-handler.cc
utils/dynamic_bitset.cc
utils/error_injection.cc
utils/exceptions.cc
utils/file_lock.cc
utils/generation-number.cc
utils/gz/crc_combine.cc
utils/human_readable.cc
utils/i_filter.cc
utils/large_bitset.cc
utils/like_matcher.cc
utils/limiting_data_source.cc
utils/logalloc.cc
utils/managed_bytes.cc
utils/multiprecision_int.cc
utils/murmur_hash.cc
utils/rate_limiter.cc
utils/rjson.cc
utils/runtime.cc
utils/updateable_value.cc
utils/utf8.cc
utils/uuid.cc
validation.cc
vint-serialization.cc
zstd.cc
release.cc)
set(scylla_gen_sources
"${scylla_thrift_gen_cassandra_files}"
"${scylla_ragel_gen_protocol_parser_file}"
"${swagger_gen_files}"
"${idl_gen_files}"
"${antlr3_gen_files}")
set(SCYLLA_SOURCE_FILES
${SCYLLA_ROOT_SOURCE_FILES}
${SCYLLA_GEN_SOURCE_FILES}
${SCYLLA_SUB_SOURCE_FILES})
add_executable(scylla
${scylla_sources}
${scylla_gen_sources})
${SEASTAR_SOURCE_FILES}
${SCYLLA_SOURCE_FILES})
target_link_libraries(scylla PRIVATE
seastar
# Boost dependencies
Boost::filesystem
Boost::program_options
Boost::system
Boost::thread
Boost::regex
Boost::headers
# Abseil libs
absl::hashtablez_sampler
absl::raw_hash_set
absl::synchronization
absl::graphcycles_internal
absl::stacktrace
absl::symbolize
absl::debugging_internal
absl::demangle_internal
absl::time
absl::time_zone
absl::int128
absl::city
absl::hash
absl::malloc_internal
absl::spinlock_wait
absl::base
absl::dynamic_annotations
absl::raw_logging_internal
absl::exponential_biased
absl::throw_delegate
# System libs
ZLIB::ZLIB
ICU::uc
systemd
zstd
snappy
${LUA_LIBRARIES}
thrift
crypt)
# Note that since CLion does not undestand GCC6 concepts, we always disable them (even if users configure otherwise).
# CLion seems to have trouble with `-U` (macro undefinition), so we do it this way instead.
list(REMOVE_ITEM SEASTAR_CFLAGS "-DHAVE_GCC6_CONCEPTS")
target_link_libraries(scylla PRIVATE
-Wl,--build-id=sha1 # Force SHA1 build-id generation
# TODO: Use lld linker if it's available, otherwise gold, else bfd
-fuse-ld=lld)
# TODO: patch dynamic linker to match configure.py behavior
# If the Seastar pkg-config information is available, append to the default flags.
#
# For ease of browsing the source code, we always pretend that DPDK is enabled.
target_compile_options(scylla PUBLIC
-std=gnu++1z
-DHAVE_DPDK
-DHAVE_HWLOC
"${SEASTAR_CFLAGS}")
target_compile_options(scylla PRIVATE
-std=gnu++20
-fcoroutines # TODO: Clang does not have this flag, adjust to both variants
${target_arch_flag})
# Hacks needed to expose internal APIs for xxhash dependencies
target_compile_definitions(scylla PRIVATE XXH_PRIVATE_API HAVE_LZ4_COMPRESS_DEFAULT)
target_include_directories(scylla PRIVATE
"${CMAKE_CURRENT_SOURCE_DIR}"
libdeflate
abseil
"${scylla_gen_build_dir}")
###
### Create crc_combine_table helper executable.
### Use it to generate crc_combine_table.cc to be used in scylla at build time.
###
add_executable(crc_combine_table utils/gz/gen_crc_combine_table.cc)
target_link_libraries(crc_combine_table PRIVATE seastar)
target_include_directories(crc_combine_table PRIVATE "${CMAKE_CURRENT_SOURCE_DIR}")
target_compile_options(crc_combine_table PRIVATE
-std=gnu++20
-fcoroutines
${target_arch_flag})
add_dependencies(scylla crc_combine_table)
# Generate an additional source file at build time that is needed for Scylla compilation
add_custom_command(OUTPUT "${scylla_gen_build_dir}/utils/gz/crc_combine_table.cc"
COMMAND $<TARGET_FILE:crc_combine_table> > "${scylla_gen_build_dir}/utils/gz/crc_combine_table.cc"
DEPENDS crc_combine_table)
target_sources(scylla PRIVATE "${scylla_gen_build_dir}/utils/gz/crc_combine_table.cc")
###
### Generate version file and supply appropriate compile definitions for release.cc
###
execute_process(COMMAND ${CMAKE_SOURCE_DIR}/SCYLLA-VERSION-GEN RESULT_VARIABLE scylla_version_gen_res)
if(scylla_version_gen_res)
message(SEND_ERROR "Version file generation failed. Return code: ${scylla_version_gen_res}")
endif()
file(READ build/SCYLLA-VERSION-FILE scylla_version)
string(STRIP "${scylla_version}" scylla_version)
file(READ build/SCYLLA-RELEASE-FILE scylla_release)
string(STRIP "${scylla_release}" scylla_release)
get_property(release_cdefs SOURCE "${CMAKE_SOURCE_DIR}/release.cc" PROPERTY COMPILE_DEFINITIONS)
list(APPEND release_cdefs "SCYLLA_VERSION=\"${scylla_version}\"" "SCYLLA_RELEASE=\"${scylla_release}\"")
set_source_files_properties("${CMAKE_SOURCE_DIR}/release.cc" PROPERTIES COMPILE_DEFINITIONS "${release_cdefs}")
###
### Custom command for building libdeflate. Link the library to scylla.
###
set(libdeflate_lib "${scylla_build_dir}/libdeflate/libdeflate.a")
add_custom_command(OUTPUT "${libdeflate_lib}"
COMMAND make -C libdeflate
BUILD_DIR=../build/${BUILD_TYPE}/libdeflate/
CC=${CMAKE_C_COMPILER}
"CFLAGS=${target_arch_flag}"
../build/${BUILD_TYPE}/libdeflate//libdeflate.a) # Two backslashes are important!
# Hack to force generating custom command to produce libdeflate.a
add_custom_target(libdeflate DEPENDS "${libdeflate_lib}")
target_link_libraries(scylla PRIVATE "${libdeflate_lib}")
# TODO: create cmake/ directory and move utilities (generate functions etc) there
# TODO: Build tests if BUILD_TESTING=on (using CTest module)
# The order matters here: prefer the "static" DPDK directories to any dynamic paths from pkg-config. Some files are only
# available dynamically, though.
target_include_directories(scylla PUBLIC
.
${SEASTAR_DPDK_INCLUDE_DIRS}
${SEASTAR_INCLUDE_DIRS}
${Boost_INCLUDE_DIRS}
xxhash
build/release/gen)

View File

@@ -1,18 +1,11 @@
# Contributing to Scylla
# Asking questions or requesting help
## Asking questions or requesting help
Use the [ScyllaDB user mailing list](https://groups.google.com/forum/#!forum/scylladb-users) for general questions and help.
Use the [Scylla Users mailing list](https://groups.google.com/g/scylladb-users) or the [Slack workspace](http://slack.scylladb.com) for general questions and help.
# Reporting an issue
Join the [Scylla Developers mailing list](https://groups.google.com/g/scylladb-dev) for deeper technical discussions and to discuss your ideas for contributions.
Please use the [Issue Tracker](https://github.com/scylladb/scylla/issues/) to report issues. Fill in as much information as you can in the issue template, especially for performance problems.
## Reporting an issue
# Contributing Code to Scylla
Please use the [issue tracker](https://github.com/scylladb/scylla/issues/) to report issues or to suggest features. Fill in as much information as you can in the issue template, especially for performance problems.
## Contributing code to Scylla
Before you can contribute code to Scylla for the first time, you should sign the [Contributor License Agreement](https://www.scylladb.com/open-source/contributor-agreement/) and send the signed form cla@scylladb.com. You can then submit your changes as patches to the to the [scylladb-dev mailing list](https://groups.google.com/forum/#!forum/scylladb-dev) or as a pull request to the [Scylla project on github](https://github.com/scylladb/scylla).
If you need help formatting or sending patches, [check out these instructions](https://github.com/scylladb/scylla/wiki/Formatting-and-sending-patches).
The Scylla C++ source code uses the [Seastar coding style](https://github.com/scylladb/seastar/blob/master/coding-style.md) so please adhere to that in your patches. Note that Scylla code is written with `using namespace seastar`, so should not explicitly add the `seastar::` prefix to Seastar symbols. You will usually not need to add `using namespace seastar` to new source files, because most Scylla header files have `#include "seastarx.hh"`, which does this.
To contribute code to Scylla, you need to sign the [Contributor License Agreement](http://www.scylladb.com/opensource/cla/) and send your changes as [patches](https://github.com/scylladb/scylla/wiki/Formatting-and-sending-patches) to the [mailing list](https://groups.google.com/forum/#!forum/scylladb-dev). We don't accept pull requests on GitHub.

View File

@@ -1 +0,0 @@
Dedicated to the memory of Alberto José Araújo, a coworker and a friend.

View File

@@ -18,36 +18,13 @@ $ git submodule update --init --recursive
### Dependencies
Scylla is fairly fussy about its build environment, requiring a very recent
version of the C++20 compiler and numerous tools and libraries to build.
Scylla depends on the system package manager for its development dependencies.
Run `./install-dependencies.sh` (as root) to use your Linux distributions's
package manager to install the appropriate packages on your build machine.
However, this will only work on very recent distributions. For example,
currently Fedora users must upgrade to Fedora 32 otherwise the C++ compiler
will be too old, and not support the new C++20 standard that Scylla uses.
Alternatively, to avoid having to upgrade your build machine or install
various packages on it, we provide another option - the **frozen toolchain**.
This is a script, `./tools/toolchain/dbuild`, that can execute build or run
commands inside a Docker image that contains exactly the right build tools and
libraries. The `dbuild` technique is useful for beginners, but is also the way
in which ScyllaDB produces official releases, so it is highly recommended.
To use `dbuild`, you simply prefix any build or run command with it. Building
and running Scylla becomes as easy as:
```bash
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla
$ ./tools/toolchain/dbuild ./build/release/scylla --developer-mode 1
```
Running `./install_dependencies.sh` (as root) installs the appropriate packages based on your Linux distribution.
### Build system
**Note**: Compiling Scylla requires, conservatively, 2 GB of memory per native
thread, and up to 3 GB per native thread while linking. GCC >= 10 is
required.
**Note**: Compiling Scylla requires, conservatively, 2 GB of memory per native thread, and up to 3 GB per native thread while linking.
Scylla is built with [Ninja](https://ninja-build.org/), a low-level rule-based system. A Python script, `configure.py`, generates a Ninja file (`build.ninja`) based on configuration options.
@@ -66,9 +43,11 @@ The full suite of options for project configuration is available via
$ ./configure.py --help
```
The most important option is:
The most important options are:
- `--enable-dpdk`: [DPDK](http://dpdk.org/) is a set of libraries and drivers for fast packet processing. During development, it's not necessary to enable support even if it is supported by your platform.
- `--mode={release,debug,all}`: Debug mode enables [AddressSanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer) and allows for debugging with tools like GDB. Debugging builds are generally slower and generate much larger object files than release builds.
- `--{enable,disable}-dpdk`: [DPDK](http://dpdk.org/) is a set of libraries and drivers for fast packet processing. During development, it's not necessary to enable support even if it is supported by your platform.
Source files and build targets are tracked manually in `configure.py`, so the script needs to be updated when new files or targets are added or removed.
@@ -76,30 +55,6 @@ To save time -- for instance, to avoid compiling all unit tests -- you can also
```bash
$ ninja-build build/release/tests/schema_change_test
$ ninja-build build/release/service/storage_proxy.o
```
You can also specify a single mode. For example
```bash
$ ninja-build release
```
Will build everytihng in release mode. The valid modes are
* Debug: Enables [AddressSanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer)
and other sanity checks. It has no optimizations, which allows for debugging with tools like
GDB. Debugging builds are generally slower and generate much larger object files than release builds.
* Release: Fewer checks and more optimizations. It still has debug info.
* Dev: No optimizations or debug info. The objective is to compile and link as fast as possible.
This is useful for the first iterations of a patch.
Note that by default unit tests binaries are stripped so they can't be used with gdb or seastar-addr2line.
To include debug information in the unit test binary, build the test binary with a `_g` suffix. For example,
```bash
$ ninja-build build/release/tests/schema_change_test_g
```
### Unit testing
@@ -128,7 +83,7 @@ The `-c1 -m1G` arguments limit this Seastar-based test to a single system thread
### Preparing patches
All changes to Scylla are submitted as patches to the public [mailing list](mailto:scylladb-dev@googlegroups.com). Once a patch is approved by one of the maintainers of the project, it is committed to the maintainers' copy of the repository at https://github.com/scylladb/scylla.
All changes to Scylla are submitted as patches to the public mailing list. Once a patch is approved by one of the maintainers of the project, it is committed to the maintainers' copy of the repository at https://github.com/scylladb/scylla.
Detailed instructions for formatting patches for the mailing list and advice on preparing good patches are available at the [ScyllaDB website](http://docs.scylladb.com/contribute/). There are also some guidelines that can help you make the patch review process smoother:
@@ -153,12 +108,10 @@ In v3:
"Tests: unit ({mode}), dtest ({smp})"
```
The usual is "Tests: unit (dev)", although running debug tests is encouraged.
The usual is "Tests: unit (release)", although running debug tests is encouraged.
5. When answering review comments, prefer inline quotes as they make it easier to track the conversation across multiple e-mails.
6. The Linux kernel's [Submitting Patches](https://www.kernel.org/doc/html/v4.19/process/submitting-patches.html) document offers excellent advice on how to prepare patches and patchsets for review. Since the Scylla development process is derived from the kernel's, almost all of the advice there is directly applicable.
### Finding a person to review and merge your patches
You can use the `scripts/find-maintainer` script to find a subsystem maintainer and/or reviewer for your patches. The script accepts a filename in the git source tree as an argument and outputs a list of subsystems the file belongs to and their respective maintainers and reviewers. For example, if you changed the `cql3/statements/create_view_statement.hh` file, run the script as follows:
@@ -211,29 +164,6 @@ On a development machine, one might run Scylla as
$ SCYLLA_HOME=$HOME/scylla build/release/scylla --overprovisioned --developer-mode=yes
```
To interact with scylla it is recommended to build our versions of
cqlsh and nodetool. They are available at
https://github.com/scylladb/scylla-tools-java and can be built with
```bash
$ sudo ./install-dependencies.sh
$ ant jar
```
cqlsh should work out of the box, but nodetool depends on a running
scylla-jmx (https://github.com/scylladb/scylla-jmx). It can be build
with
```bash
$ mvn package
```
and must be started with
```bash
$ ./scripts/scylla-jmx
```
### Branches and tags
Multiple release branches are maintained on the Git repository at https://github.com/scylladb/scylla. Release 1.5, for instance, is tracked on the `branch-1.5` branch.
@@ -324,7 +254,7 @@ In this example, `10.0.0.2` will be sent up to 16 jobs and the local machine wil
When a compilation is in progress, the status of jobs on all remote machines can be visualized in the terminal with `distccmon-text` or graphically as a GTK application with `distccmon-gnome`.
One thing to keep in mind is that linking object files happens on the coordinating machine, which can be a bottleneck. See the next sections speeding up this process.
One thing to keep in mind is that linking object files happens on the coordinating machine, which can be a bottleneck. See the next section speeding up this process.
### Using the `gold` linker
@@ -334,24 +264,6 @@ Linking Scylla can be slow. The gold linker can replace GNU ld and often speeds
$ sudo alternatives --config ld
```
### Using split dwarf
With debug info enabled, most of the link time is spent copying and
relocating it. It is possible to leave most of the debug info out of
the link by writing it to a side .dwo file. This is done by passing
`-gsplit-dwarf` to gcc.
Unfortunately just `-gsplit-dwarf` would slow down `gdb` startup. To
avoid that the gold linker can be told to create an index with
`--gdb-index`.
More info at https://gcc.gnu.org/wiki/DebugFission.
Both options can be enable by passing `--split-dwarf` to configure.py.
Note that distcc is *not* compatible with it, but icecream
(https://github.com/icecc/icecream) is.
### Testing changes in Seastar with Scylla
Sometimes Scylla development is closely tied with a feature being developed in Seastar. It can be useful to compile Scylla with a particular check-out of Seastar.
@@ -365,8 +277,3 @@ $ git remote add local /home/tsmith/src/seastar
$ git remote update
$ git checkout -t local/my_local_seastar_branch
```
### Core dump debugging
Slides:
2018.11.20: https://www.slideshare.net/tomekgrabiec/scylla-core-dump-debugging-tools

View File

@@ -1,9 +1,9 @@
# IDL definition
#IDL definition
The schema we use similar to c++ schema.
Use class or struct similar to the object you need the serializer for.
Use namespace when applicable.
## keywords
##keywords
* class/struct - a class or a struct like C++
class/struct can have final or stub marker
* namespace - has the same C++ meaning
@@ -12,16 +12,16 @@ Use namespace when applicable.
* stub class - when a class is mark as stub, it means that no code will be generated for this class and it is only there as a documentation.
* version attributes - mark with [[version id ]] mark that a field is available from a specific version
* template - A template class definition like C++
## Syntax
##Syntax
### Namespace
###Namespace
```
namespace ns_name { namespace-body }
```
* ns_name: either a previously unused identifier, in which case this is original-namespace-definition or the name of a namespace, in which case this is extension-namespace-definition
* namespace-body: possibly empty sequence of declarations of any kind (including class and struct definitions as well as nested namespaces)
### class/struct
###class/struct
`
class-key class-name final(optional) stub(optional) { member-specification } ;(optional)
`
@@ -31,7 +31,7 @@ class-key class-name final(optional) stub(optional) { member-specification } ;(
* stub: when a class is mark as stub, it means no code will generate for it and it is added for documentation only.
* member-specification: list of access specifiers, and public member accessor see class member below.
* to be compatible with C++ a class definition can be followed by a semicolon.
### enum
###enum
`enum-key identifier enum-base { enumerator-list(optional) }`
* enum-key: only enum class is supported
* identifier: the name of the enumeration that's being declared.
@@ -39,18 +39,18 @@ class-key class-name final(optional) stub(optional) { member-specification } ;(
* enumerator-list: comma-separated list of enumerator definitions, each of which is either simply an identifier, which becomes the name of the enumerator, or an identifier with an initializer: identifier = integral value.
Note that though C++ allows constexpr as an initialize value, it makes the documentation less readable, hence is not permitted.
### class member
###class member
`type member-access attributes(optional) default-value(optional);`
* type: Any valid C++ type, following the C++ notation. note that there should be a serializer for the type, but deceleration order is not mandatory
* member-access: is the way the member can be access. If the member is public it can be the name itself. if not it could be a getter function that should be followed by braces. Note that getter can (and probably should) be const methods.
* attributes: Attributes define by square brackets. Currently are use to mark a version in which a specific member was added [ [ version version-number] ] would mark that the specific member was added in the given version number.
### template
###template
`template < parameter-list > class-declaration`
* parameter-list - a non-empty comma-separated list of the template parameters.
* class-decleration - (See class section) The class name declared become a template name.
## IDL example
##IDL example
Forward slashes comments are ignored until the end of the line.
```
namespace utils {

131
MAINTAINERS Normal file
View File

@@ -0,0 +1,131 @@
M: Maintainer with commit access
R: Reviewer with subsystem expertise
F: Filename, directory, or pattern for the subsystem
---
AUTH
M: Paweł Dziepak <pdziepak@scylladb.com>
M: Duarte Nunes <duarte@scylladb.com>
R: Calle Wilund <calle@scylladb.com>
R: Vlad Zolotarov <vladz@scylladb.com>
R: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
F: auth/*
CACHE
M: Tomasz Grabiec <tgrabiec@scylladb.com>
M: Paweł Dziepak <pdziepak@scylladb.com>
R: Piotr Jastrzebski <piotr@scylladb.com>
F: row_cache*
F: *mutation*
F: tests/mvcc*
COMMITLOG / BATCHLOGa
M: Paweł Dziepak <pdziepak@scylladb.com>
M: Duarte Nunes <duarte@scylladb.com>
R: Calle Wilund <calle@scylladb.com>
F: db/commitlog/*
F: db/batch*
COORDINATOR
M: Paweł Dziepak <pdziepak@scylladb.com>
M: Duarte Nunes <duarte@scylladb.com>
R: Gleb Natapov <gleb@scylladb.com>
F: service/storage_proxy*
COMPACTION
R: Raphael S. Carvalho <raphaelsc@scylladb.com>
R: Glauber Costa <glauber@scylladb.com>
R: Nadav Har'El <nyh@scylladb.com>
F: sstables/compaction*
CQL TRANSPORT LAYER
M: Pekka Enberg <penberg@scylladb.com>
F: transport/*
CQL QUERY LANGUAGE
M: Tomasz Grabiec <tgrabiec@scylladb.com>
M: Pekka Enberg <penberg@scylladb.com>
F: cql3/*
COUNTERS
M: Paweł Dziepak <pdziepak@scylladb.com>
F: counters*
F: tests/counter_test*
GOSSIP
M: Duarte Nunes <duarte@scylladb.com>
M: Tomasz Grabiec <tgrabiec@scylladb.com>
R: Asias He <asias@scylladb.com>
F: gms/*
DOCKER
M: Pekka Enberg <penberg@scylladb.com>
F: dist/docker/*
LSA
M: Tomasz Grabiec <tgrabiec@scylladb.com>
M: Paweł Dziepak <pdziepak@scylladb.com>
F: utils/logalloc*
MATERIALIZED VIEWS
M: Duarte Nunes <duarte@scylladb.com>
M: Pekka Enberg <penberg@scylladb.com>
R: Nadav Har'El <nyh@scylladb.com>
R: Duarte Nunes <duarte@scylladb.com>
F: db/view/*
F: cql3/statements/*view*
PACKAGING
R: Takuya ASADA <syuu@scylladb.com>
F: dist/*
REPAIR
M: Tomasz Grabiec <tgrabiec@scylladb.com>
M: Duarte Nunes <duarte@scylladb.com>
R: Asias He <asias@scylladb.com>
R: Nadav Har'El <nyh@scylladb.com>
F: repair/*
SCHEMA MANAGEMENT
M: Tomasz Grabiec <tgrabiec@scylladb.com>
M: Duarte Nunes <duarte@scylladb.com>
M: Pekka Enberg <penberg@scylladb.com>
F: db/schema_tables*
F: db/legacy_schema_migrator*
F: service/migration*
F: schema*
SECONDARY INDEXES
M: Pekka Enberg <penberg@scylladb.com>
M: Duarte Nunes <duarte@scylladb.com>
R: Nadav Har'El <nyh@scylladb.com>
R: Pekka Enberg <penberg@scylladb.com>
F: db/index/*
F: cql3/statements/*index*
SSTABLES
M: Tomasz Grabiec <tgrabiec@scylladb.com>
M: Duarte Nunes <duarte@scylladb.com>
R: Raphael S. Carvalho <raphaelsc@scylladb.com>
R: Glauber Costa <glauber@scylladb.com>
R: Nadav Har'El <nyh@scylladb.com>
F: sstables/*
STREAMING
M: Tomasz Grabiec <tgrabiec@scylladb.com>
M: Duarte Nunes <duarte@scylladb.com>
R: Asias He <asias@scylladb.com>
F: streaming/*
F: service/storage_service.*
THRIFT TRANSPORT LAYER
M: Duarte Nunes <duarte@scylladb.com>
F: thrift/*
THE REST
M: Avi Kivity <avi@scylladb.com>
M: Paweł Dziepak <pdziepak@scylladb.com>
M: Duarte Nunes <duarte@scylladb.com>
M: Tomasz Grabiec <tgrabiec@scylladb.com>
F: *

View File

@@ -1,9 +1,5 @@
This project includes code developed by the Apache Software Foundation (http://www.apache.org/),
especially Apache Cassandra.
It includes files from https://github.com/antonblanchard/crc32-vpmsum (author Anton Blanchard <anton@au.ibm.com>, IBM).
It also includes files from https://github.com/antonblanchard/crc32-vpmsum (author Anton Blanchard <anton@au.ibm.com>, IBM).
These files are located in utils/arch/powerpc/crc32-vpmsum. Their license may be found in licenses/LICENSE-crc32-vpmsum.TXT.
It includes modified code from https://gitbox.apache.org/repos/asf?p=cassandra-dtest.git (owned by The Apache Software Foundation)
It includes modified tests from https://github.com/etcd-io/etcd.git (owned by The etcd Authors)

29
README-DPDK.md Normal file
View File

@@ -0,0 +1,29 @@
Seastar and DPDK
================
Seastar uses the Data Plane Development Kit to drive NIC hardware directly. This
provides an enormous performance boost.
To enable DPDK, specify `--enable-dpdk` to `./configure.py`, and `--dpdk-pmd` as a
run-time parameter. This will use the DPDK package provided as a git submodule with the
seastar sources.
To use your own self-compiled DPDK package, follow this procedure:
1. Setup host to compile DPDK:
- Ubuntu
`sudo apt-get install -y build-essential linux-image-extra-$(uname -r)`
2. Prepare a DPDK SDK:
- Download the latest DPDK release: `wget http://dpdk.org/browse/dpdk/snapshot/dpdk-1.8.0.tar.gz`
- Untar it.
- Edit config/common_linuxapp: set CONFIG_RTE_MBUF_REFCNT and CONFIG_RTE_LIBRTE_KNI to 'n'.
- For DPDK 1.7.x: edit config/common_linuxapp:
- Set CONFIG_RTE_LIBRTE_PMD_BOND to 'n'.
- Set CONFIG_RTE_MBUF_SCATTER_GATHER to 'n'.
- Set CONFIG_RTE_LIBRTE_IP_FRAG to 'n'.
- Start the tools/setup.sh script as root.
- Compile a linuxapp target (option 9).
- Install IGB_UIO module (option 11).
- Bind some physical port to IGB_UIO (option 17).
- Configure hugepage mappings (option 14/15).
3. Run a configure.py: `./configure.py --dpdk-target <Path to untared dpdk-1.8.0 above>/x86_64-native-linuxapp-gcc`.

140
README.md
View File

@@ -1,110 +1,78 @@
# Scylla
[![Slack](https://img.shields.io/badge/slack-scylla-brightgreen.svg?logo=slack)](http://slack.scylladb.com)
[![Twitter](https://img.shields.io/twitter/follow/ScyllaDB.svg?style=social&label=Follow)](https://twitter.com/intent/follow?screen_name=ScyllaDB)
## What is Scylla?
Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB.
Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.
For more information, please see the [ScyllaDB web site].
[ScyllaDB web site]: https://www.scylladb.com
## Build Prerequisites
Scylla is fairly fussy about its build environment, requiring very recent
versions of the C++20 compiler and of many libraries to build. The document
[HACKING.md](HACKING.md) includes detailed information on building and
developing Scylla, but to get Scylla building quickly on (almost) any build
machine, Scylla offers a [frozen toolchain](tools/toolchain/README.md),
This is a pre-configured Docker image which includes recent versions of all
the required compilers, libraries and build tools. Using the frozen toolchain
allows you to avoid changing anything in your build machine to meet Scylla's
requirements - you just need to meet the frozen toolchain's prerequisites
(mostly, Docker or Podman being available).
## Building Scylla
Building Scylla with the frozen toolchain `dbuild` is as easy as:
## Quick-start
```bash
$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla
$ git submodule update --init --recursive
$ sudo ./install-dependencies.sh
$ ./configure.py --mode=release
$ ninja-build -j4 # Assuming 4 system threads.
$ ./build/release/scylla
$ # Rejoice!
```
For further information, please see:
* [Developer documentation] for more information on building Scylla.
* [Build documentation] on how to build Scylla binaries, tests, and packages.
* [Docker image build documentation] for information on how to build Docker images.
[developer documentation]: HACKING.md
[build documentation]: docs/guides/building.md
[docker image build documentation]: dist/docker/redhat/README.md
Please see [HACKING.md](HACKING.md) for detailed information on building and developing Scylla.
## Running Scylla
To start Scylla server, run:
* Run Scylla
```
./build/release/scylla
```bash
$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1
```
This will start a Scylla node with one CPU core allocated to it and data files stored in the `tmp` directory.
The `--developer-mode` is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations).
Please note that you need to run Scylla with `dbuild` if you built it with the frozen toolchain.
* run Scylla with one CPU and ./tmp as data directory
For more run options, run:
```bash
$ ./tools/toolchain/dbuild ./build/release/scylla --help
```
./build/release/scylla --datadir tmp --commitlog-directory tmp --smp 1
```
## Testing
* For more run options:
```
./build/release/scylla --help
```
See [test.py manual](docs/guides/testing.md).
## Building Fedora RPM
## Scylla APIs and compatibility
By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and
Thrift. There is also support for the API of Amazon DynamoDB™,
which needs to be enabled and configured in order to be used. For more
information on how to enable the DynamoDB™ API in Scylla,
and the current compatibility of this feature as well as Scylla-specific extensions, see
[Alternator](docs/alternator/alternator.md) and
[Getting started with Alternator](docs/alternator/getting-started.md).
As a pre-requisite, you need to install [Mock](https://fedoraproject.org/wiki/Mock) on your machine:
## Documentation
```
# Install mock:
sudo yum install mock
Documentation can be found [here](https://scylla.docs.scylladb.com).
Seastar documentation can be found [here](http://docs.seastar.io/master/index.html).
User documentation can be found [here](https://docs.scylladb.com/).
# Add user to the "mock" group:
usermod -a -G mock $USER && newgrp mock
```
## Training
Then, to build an RPM, run:
Training material and online courses can be found at [Scylla University](https://university.scylladb.com/).
The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling,
administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions,
multi-datacenters and how Scylla integrates with third-party applications.
```
./dist/redhat/build_rpm.sh
```
The built RPM is stored in ``/var/lib/mock/<configuration>/result`` directory.
For example, on Fedora 21 mock reports the following:
```
INFO: Done(scylla-server-0.00-1.fc21.src.rpm) Config(default) 20 minutes 7 seconds
INFO: Results and/or logs in: /var/lib/mock/fedora-21-x86_64/result
```
## Building Fedora-based Docker image
Build a Docker image with:
```
cd dist/docker
docker build -t <image-name> .
```
Run the image with:
```
docker run -p $(hostname -i):9042:9042 -i -t <image name>
```
## Contributing to Scylla
If you want to report a bug or submit a pull request or a patch, please read the [contribution guidelines].
If you are a developer working on Scylla, please read the [developer guidelines].
[contribution guidelines]: CONTRIBUTING.md
[developer guidelines]: HACKING.md
## Contact
* The [users mailing list] and [Slack channel] are for users to discuss configuration, management, and operations of the ScyllaDB open source.
* The [developers mailing list] is for developers and people interested in following the development of ScyllaDB to discuss technical topics.
[Users mailing list]: https://groups.google.com/forum/#!forum/scylladb-users
[Slack channel]: http://slack.scylladb.com/
[Developers mailing list]: https://groups.google.com/forum/#!forum/scylladb-dev
[Guidelines for contributing](CONTRIBUTING.md)

View File

@@ -1,7 +1,6 @@
#!/bin/sh
PRODUCT=scylla
VERSION=4.5.7
VERSION=2.2.2
if test -f version
then
@@ -19,16 +18,7 @@ else
SCYLLA_RELEASE=$SCYLLA_BUILD.$DATE.$GIT_COMMIT
fi
if [ -f build/SCYLLA-RELEASE-FILE ]; then
RELEASE_FILE=$(cat build/SCYLLA-RELEASE-FILE)
GIT_COMMIT_FILE=$(cat build/SCYLLA-RELEASE-FILE |cut -d . -f 3)
if [ "$GIT_COMMIT" = "$GIT_COMMIT_FILE" ]; then
exit 0
fi
fi
echo "$SCYLLA_VERSION-$SCYLLA_RELEASE"
mkdir -p build
echo "$SCYLLA_VERSION" > build/SCYLLA-VERSION-FILE
echo "$SCYLLA_RELEASE" > build/SCYLLA-RELEASE-FILE
echo "$PRODUCT" > build/SCYLLA-PRODUCT-FILE

1
abseil

Submodule abseil deleted from 9c6a50fdd8

View File

@@ -1,26 +0,0 @@
/*
* Copyright (C) 2020 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "absl-flat_hash_map.hh"
size_t sstring_hash::operator()(std::string_view v) const noexcept {
return absl::Hash<std::string_view>{}(v);
}

View File

@@ -1,47 +0,0 @@
/*
* Copyright (C) 2020 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <absl/container/flat_hash_map.h>
#include <seastar/core/sstring.hh>
using namespace seastar;
struct sstring_hash {
using is_transparent = void;
size_t operator()(std::string_view v) const noexcept;
};
struct sstring_eq {
using is_transparent = void;
bool operator()(std::string_view a, std::string_view b) const noexcept {
return a == b;
}
};
template <typename K, typename V, typename... Ts>
struct flat_hash_map : public absl::flat_hash_map<K, V, Ts...> {
};
template <typename V>
struct flat_hash_map<sstring, V>
: public absl::flat_hash_map<sstring, V, sstring_hash, sstring_eq> {};

View File

@@ -1,154 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "alternator/error.hh"
#include "log.hh"
#include <string>
#include <string_view>
#include <gnutls/crypto.h>
#include <seastar/util/defer.hh>
#include "hashers.hh"
#include "bytes.hh"
#include "alternator/auth.hh"
#include <fmt/format.h>
#include "auth/common.hh"
#include "auth/password_authenticator.hh"
#include "auth/roles-metadata.hh"
#include "cql3/query_processor.hh"
#include "cql3/untyped_result_set.hh"
namespace alternator {
static logging::logger alogger("alternator-auth");
static hmac_sha256_digest hmac_sha256(std::string_view key, std::string_view msg) {
hmac_sha256_digest digest;
int ret = gnutls_hmac_fast(GNUTLS_MAC_SHA256, key.data(), key.size(), msg.data(), msg.size(), digest.data());
if (ret) {
throw std::runtime_error(fmt::format("Computing HMAC failed ({}): {}", ret, gnutls_strerror(ret)));
}
return digest;
}
static hmac_sha256_digest get_signature_key(std::string_view key, std::string_view date_stamp, std::string_view region_name, std::string_view service_name) {
auto date = hmac_sha256("AWS4" + std::string(key), date_stamp);
auto region = hmac_sha256(std::string_view(date.data(), date.size()), region_name);
auto service = hmac_sha256(std::string_view(region.data(), region.size()), service_name);
auto signing = hmac_sha256(std::string_view(service.data(), service.size()), "aws4_request");
return signing;
}
static std::string apply_sha256(std::string_view msg) {
sha256_hasher hasher;
hasher.update(msg.data(), msg.size());
return to_hex(hasher.finalize());
}
static std::string apply_sha256(const std::vector<temporary_buffer<char>>& msg) {
sha256_hasher hasher;
for (const temporary_buffer<char>& buf : msg) {
hasher.update(buf.get(), buf.size());
}
return to_hex(hasher.finalize());
}
static std::string format_time_point(db_clock::time_point tp) {
time_t time_point_repr = db_clock::to_time_t(tp);
std::string time_point_str;
time_point_str.resize(17);
::tm time_buf;
// strftime prints the terminating null character as well
std::strftime(time_point_str.data(), time_point_str.size(), "%Y%m%dT%H%M%SZ", ::gmtime_r(&time_point_repr, &time_buf));
time_point_str.resize(16);
return time_point_str;
}
void check_expiry(std::string_view signature_date) {
//FIXME: The default 15min can be changed with X-Amz-Expires header - we should honor it
std::string expiration_str = format_time_point(db_clock::now() - 15min);
std::string validity_str = format_time_point(db_clock::now() + 15min);
if (signature_date < expiration_str) {
throw api_error::invalid_signature(
fmt::format("Signature expired: {} is now earlier than {} (current time - 15 min.)",
signature_date, expiration_str));
}
if (signature_date > validity_str) {
throw api_error::invalid_signature(
fmt::format("Signature not yet current: {} is still later than {} (current time + 15 min.)",
signature_date, validity_str));
}
}
std::string get_signature(std::string_view access_key_id, std::string_view secret_access_key, std::string_view host, std::string_view method,
std::string_view orig_datestamp, std::string_view signed_headers_str, const std::map<std::string_view, std::string_view>& signed_headers_map,
const std::vector<temporary_buffer<char>>& body_content, std::string_view region, std::string_view service, std::string_view query_string) {
auto amz_date_it = signed_headers_map.find("x-amz-date");
if (amz_date_it == signed_headers_map.end()) {
throw api_error::invalid_signature("X-Amz-Date header is mandatory for signature verification");
}
std::string_view amz_date = amz_date_it->second;
check_expiry(amz_date);
std::string_view datestamp = amz_date.substr(0, 8);
if (datestamp != orig_datestamp) {
throw api_error::invalid_signature(
format("X-Amz-Date date does not match the provided datestamp. Expected {}, got {}",
orig_datestamp, datestamp));
}
std::string_view canonical_uri = "/";
std::stringstream canonical_headers;
for (const auto& header : signed_headers_map) {
canonical_headers << fmt::format("{}:{}", header.first, header.second) << '\n';
}
std::string payload_hash = apply_sha256(body_content);
std::string canonical_request = fmt::format("{}\n{}\n{}\n{}\n{}\n{}", method, canonical_uri, query_string, canonical_headers.str(), signed_headers_str, payload_hash);
std::string_view algorithm = "AWS4-HMAC-SHA256";
std::string credential_scope = fmt::format("{}/{}/{}/aws4_request", datestamp, region, service);
std::string string_to_sign = fmt::format("{}\n{}\n{}\n{}", algorithm, amz_date, credential_scope, apply_sha256(canonical_request));
hmac_sha256_digest signing_key = get_signature_key(secret_access_key, datestamp, region, service);
hmac_sha256_digest signature = hmac_sha256(std::string_view(signing_key.data(), signing_key.size()), string_to_sign);
return to_hex(bytes_view(reinterpret_cast<const int8_t*>(signature.data()), signature.size()));
}
future<std::string> get_key_from_roles(cql3::query_processor& qp, std::string username) {
static const sstring query = format("SELECT salted_hash FROM {} WHERE {} = ?",
auth::meta::roles_table::qualified_name, auth::meta::roles_table::role_col_name);
auto cl = auth::password_authenticator::consistency_for_user(username);
return qp.execute_internal(query, cl, auth::internal_distributed_query_state(), {sstring(username)}, true).then_wrapped([username = std::move(username)] (future<::shared_ptr<cql3::untyped_result_set>> f) {
auto res = f.get0();
auto salted_hash = std::optional<sstring>();
if (res->empty()) {
throw api_error::unrecognized_client(fmt::format("User not found: {}", username));
}
salted_hash = res->one().get_opt<sstring>("salted_hash");
if (!salted_hash) {
throw api_error::unrecognized_client(fmt::format("No password found for user: {}", username));
}
return make_ready_future<std::string>(*salted_hash);
});
}
}

View File

@@ -1,46 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <string>
#include <string_view>
#include <array>
#include "gc_clock.hh"
#include "utils/loading_cache.hh"
namespace cql3 {
class query_processor;
}
namespace alternator {
using hmac_sha256_digest = std::array<char, 32>;
using key_cache = utils::loading_cache<std::string, std::string>;
std::string get_signature(std::string_view access_key_id, std::string_view secret_access_key, std::string_view host, std::string_view method,
std::string_view orig_datestamp, std::string_view signed_headers_str, const std::map<std::string_view, std::string_view>& signed_headers_map,
const std::vector<temporary_buffer<char>>& body_content, std::string_view region, std::string_view service, std::string_view query_string);
future<std::string> get_key_from_roles(cql3::query_processor& qp, std::string username);
}

View File

@@ -1,145 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
// The DynamoAPI dictates that "binary" (a.k.a. "bytes" or "blob") values
// be encoded in the JSON API as base64-encoded strings. This is code to
// convert byte arrays to base64-encoded strings, and back.
#include "base64.hh"
#include <ctype.h>
// Arrays for quickly converting to and from an integer between 0 and 63,
// and the character used in base64 encoding to represent it.
static class base64_chars {
public:
static constexpr const char to[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
int8_t from[255];
base64_chars() {
static_assert(sizeof(to) == 64 + 1);
for (int i = 0; i < 255; i++) {
from[i] = -1; // signal invalid character
}
for (int i = 0; i < 64; i++) {
from[(unsigned) to[i]] = i;
}
}
} base64_chars;
std::string base64_encode(bytes_view in) {
std::string ret;
ret.reserve(((4 * in.size() / 3) + 3) & ~3);
int i = 0;
unsigned char chunk3[3]; // chunk of input
for (auto byte : in) {
chunk3[i++] = byte;
if (i == 3) {
ret += base64_chars.to[ (chunk3[0] & 0xfc) >> 2 ];
ret += base64_chars.to[ ((chunk3[0] & 0x03) << 4) + ((chunk3[1] & 0xf0) >> 4) ];
ret += base64_chars.to[ ((chunk3[1] & 0x0f) << 2) + ((chunk3[2] & 0xc0) >> 6) ];
ret += base64_chars.to[ chunk3[2] & 0x3f ];
i = 0;
}
}
if (i) {
// i can be 1 or 2.
for(int j = i; j < 3; j++)
chunk3[j] = '\0';
ret += base64_chars.to[ ( chunk3[0] & 0xfc) >> 2 ];
ret += base64_chars.to[ ((chunk3[0] & 0x03) << 4) + ((chunk3[1] & 0xf0) >> 4) ];
if (i == 2) {
ret += base64_chars.to[ ((chunk3[1] & 0x0f) << 2) + ((chunk3[2] & 0xc0) >> 6) ];
} else {
ret += '=';
}
ret += '=';
}
return ret;
}
static std::string base64_decode_string(std::string_view in) {
int i = 0;
int8_t chunk4[4]; // chunk of input, each byte converted to 0..63;
std::string ret;
ret.reserve(in.size() * 3 / 4);
for (unsigned char c : in) {
uint8_t dc = base64_chars.from[c];
if (dc == 255) {
// Any unexpected character, include the "=" character usually
// used for padding, signals the end of the decode.
break;
}
chunk4[i++] = dc;
if (i == 4) {
ret += (chunk4[0] << 2) + ((chunk4[1] & 0x30) >> 4);
ret += ((chunk4[1] & 0xf) << 4) + ((chunk4[2] & 0x3c) >> 2);
ret += ((chunk4[2] & 0x3) << 6) + chunk4[3];
i = 0;
}
}
if (i) {
// i can be 2 or 3, meaning 1 or 2 more output characters
if (i>=2)
ret += (chunk4[0] << 2) + ((chunk4[1] & 0x30) >> 4);
if (i==3)
ret += ((chunk4[1] & 0xf) << 4) + ((chunk4[2] & 0x3c) >> 2);
}
return ret;
}
bytes base64_decode(std::string_view in) {
// FIXME: This copy is sad. The problem is we need back "bytes"
// but "bytes" doesn't have efficient append and std::string.
// To fix this we need to use bytes' "uninitialized" feature.
std::string ret = base64_decode_string(in);
return bytes(ret.begin(), ret.end());
}
static size_t base64_padding_len(std::string_view str) {
size_t padding = 0;
padding += (!str.empty() && str.back() == '=');
padding += (str.size() > 1 && *(str.end() - 2) == '=');
return padding;
}
size_t base64_decoded_len(std::string_view str) {
return str.size() / 4 * 3 - base64_padding_len(str);
}
bool base64_begins_with(std::string_view base, std::string_view operand) {
if (base.size() < operand.size() || base.size() % 4 != 0 || operand.size() % 4 != 0) {
return false;
}
if (base64_padding_len(operand) == 0) {
return base.starts_with(operand);
}
const std::string_view unpadded_base_prefix = base.substr(0, operand.size() - 4);
const std::string_view unpadded_operand = operand.substr(0, operand.size() - 4);
if (unpadded_base_prefix != unpadded_operand) {
return false;
}
// Decode and compare last 4 bytes of base64-encoded strings
const std::string base_remainder = base64_decode_string(base.substr(operand.size() - 4, operand.size()));
const std::string operand_remainder = base64_decode_string(operand.substr(operand.size() - 4));
return base_remainder.starts_with(operand_remainder);
}

View File

@@ -1,38 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <string_view>
#include "bytes.hh"
#include "utils/rjson.hh"
std::string base64_encode(bytes_view);
bytes base64_decode(std::string_view);
inline bytes base64_decode(const rjson::value& v) {
return base64_decode(std::string_view(v.GetString(), v.GetStringLength()));
}
size_t base64_decoded_len(std::string_view str);
bool base64_begins_with(std::string_view base, std::string_view operand);

View File

@@ -1,750 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include <list>
#include <map>
#include <string_view>
#include "alternator/conditions.hh"
#include "alternator/error.hh"
#include "cql3/constants.hh"
#include <unordered_map>
#include "utils/rjson.hh"
#include "serialization.hh"
#include "base64.hh"
#include <stdexcept>
#include <boost/algorithm/cxx11/all_of.hpp>
#include <boost/algorithm/cxx11/any_of.hpp>
#include "utils/overloaded_functor.hh"
#include "expressions.hh"
namespace alternator {
static logging::logger clogger("alternator-conditions");
comparison_operator_type get_comparison_operator(const rjson::value& comparison_operator) {
static std::unordered_map<std::string, comparison_operator_type> ops = {
{"EQ", comparison_operator_type::EQ},
{"NE", comparison_operator_type::NE},
{"LE", comparison_operator_type::LE},
{"LT", comparison_operator_type::LT},
{"GE", comparison_operator_type::GE},
{"GT", comparison_operator_type::GT},
{"IN", comparison_operator_type::IN},
{"NULL", comparison_operator_type::IS_NULL},
{"NOT_NULL", comparison_operator_type::NOT_NULL},
{"BETWEEN", comparison_operator_type::BETWEEN},
{"BEGINS_WITH", comparison_operator_type::BEGINS_WITH},
{"CONTAINS", comparison_operator_type::CONTAINS},
{"NOT_CONTAINS", comparison_operator_type::NOT_CONTAINS},
};
if (!comparison_operator.IsString()) {
throw api_error::validation(format("Invalid comparison operator definition {}", rjson::print(comparison_operator)));
}
std::string op = comparison_operator.GetString();
auto it = ops.find(op);
if (it == ops.end()) {
throw api_error::validation(format("Unsupported comparison operator {}", op));
}
return it->second;
}
namespace {
struct size_check {
// True iff size passes this check.
virtual bool operator()(rapidjson::SizeType size) const = 0;
// Check description, such that format("expected array {}", check.what()) is human-readable.
virtual sstring what() const = 0;
};
class exact_size : public size_check {
rapidjson::SizeType _expected;
public:
explicit exact_size(rapidjson::SizeType expected) : _expected(expected) {}
bool operator()(rapidjson::SizeType size) const override { return size == _expected; }
sstring what() const override { return format("of size {}", _expected); }
};
struct empty : public size_check {
bool operator()(rapidjson::SizeType size) const override { return size < 1; }
sstring what() const override { return "to be empty"; }
};
struct nonempty : public size_check {
bool operator()(rapidjson::SizeType size) const override { return size > 0; }
sstring what() const override { return "to be non-empty"; }
};
} // anonymous namespace
// Check that array has the expected number of elements
static void verify_operand_count(const rjson::value* array, const size_check& expected, const rjson::value& op) {
if (!array && expected(0)) {
// If expected() allows an empty AttributeValueList, it is also fine
// that it is missing.
return;
}
if (!array || !array->IsArray()) {
throw api_error::validation("With ComparisonOperator, AttributeValueList must be given and an array");
}
if (!expected(array->Size())) {
throw api_error::validation(
format("{} operator requires AttributeValueList {}, instead found list size {}",
op, expected.what(), array->Size()));
}
}
struct rjson_engaged_ptr_comp {
bool operator()(const rjson::value* p1, const rjson::value* p2) const {
return rjson::single_value_comp()(*p1, *p2);
}
};
// It's not enough to compare underlying JSON objects when comparing sets,
// as internally they're stored in an array, and the order of elements is
// not important in set equality. See issue #5021
static bool check_EQ_for_sets(const rjson::value& set1, const rjson::value& set2) {
if (!set1.IsArray() || !set2.IsArray() || set1.Size() != set2.Size()) {
return false;
}
std::set<const rjson::value*, rjson_engaged_ptr_comp> set1_raw;
for (auto it = set1.Begin(); it != set1.End(); ++it) {
set1_raw.insert(&*it);
}
for (const auto& a : set2.GetArray()) {
if (!set1_raw.contains(&a)) {
return false;
}
}
return true;
}
// Moreover, the JSON being compared can be a nested document with outer
// layers of lists and maps and some inner set - and we need to get to that
// inner set to compare it correctly with check_EQ_for_sets() (issue #8514).
static bool check_EQ(const rjson::value* v1, const rjson::value& v2);
static bool check_EQ_for_lists(const rjson::value& list1, const rjson::value& list2) {
if (!list1.IsArray() || !list2.IsArray() || list1.Size() != list2.Size()) {
return false;
}
auto it1 = list1.Begin();
auto it2 = list2.Begin();
while (it1 != list1.End()) {
// Note: Alternator limits an item's depth (rjson::parse() limits
// it to around 37 levels), so this recursion is safe.
if (!check_EQ(&*it1, *it2)) {
return false;
}
++it1;
++it2;
}
return true;
}
static bool check_EQ_for_maps(const rjson::value& list1, const rjson::value& list2) {
if (!list1.IsObject() || !list2.IsObject() || list1.MemberCount() != list2.MemberCount()) {
return false;
}
for (auto it1 = list1.MemberBegin(); it1 != list1.MemberEnd(); ++it1) {
auto it2 = list2.FindMember(it1->name);
if (it2 == list2.MemberEnd() || !check_EQ(&it1->value, it2->value)) {
return false;
}
}
return true;
}
// Check if two JSON-encoded values match with the EQ relation
static bool check_EQ(const rjson::value* v1, const rjson::value& v2) {
if (v1 && v1->IsObject() && v1->MemberCount() == 1 && v2.IsObject() && v2.MemberCount() == 1) {
auto it1 = v1->MemberBegin();
auto it2 = v2.MemberBegin();
if (it1->name != it2->name) {
return false;
}
if (it1->name == "SS" || it1->name == "NS" || it1->name == "BS") {
return check_EQ_for_sets(it1->value, it2->value);
} else if(it1->name == "L") {
return check_EQ_for_lists(it1->value, it2->value);
} else if(it1->name == "M") {
return check_EQ_for_maps(it1->value, it2->value);
} else {
// Other, non-nested types (number, string, etc.) can be compared
// literally, comparing their JSON representation.
return it1->value == it2->value;
}
} else {
// If v1 and/or v2 are missing (IsNull()) the result should be false.
// In the unlikely case that the object is malformed (issue #8070),
// let's also return false.
return false;
}
}
// Check if two JSON-encoded values match with the NE relation
static bool check_NE(const rjson::value* v1, const rjson::value& v2) {
return !check_EQ(v1, v2);
}
// Check if two JSON-encoded values match with the BEGINS_WITH relation
bool check_BEGINS_WITH(const rjson::value* v1, const rjson::value& v2,
bool v1_from_query, bool v2_from_query) {
bool bad = false;
if (!v1 || !v1->IsObject() || v1->MemberCount() != 1) {
if (v1_from_query) {
throw api_error::validation("begins_with() encountered malformed argument");
} else {
bad = true;
}
} else if (v1->MemberBegin()->name != "S" && v1->MemberBegin()->name != "B") {
if (v1_from_query) {
throw api_error::validation(format("begins_with supports only string or binary type, got: {}", *v1));
} else {
bad = true;
}
}
if (!v2.IsObject() || v2.MemberCount() != 1) {
if (v2_from_query) {
throw api_error::validation("begins_with() encountered malformed argument");
} else {
bad = true;
}
} else if (v2.MemberBegin()->name != "S" && v2.MemberBegin()->name != "B") {
if (v2_from_query) {
throw api_error::validation(format("begins_with() supports only string or binary type, got: {}", v2));
} else {
bad = true;
}
}
if (bad) {
return false;
}
auto it1 = v1->MemberBegin();
auto it2 = v2.MemberBegin();
if (it1->name != it2->name) {
return false;
}
if (it2->name == "S") {
return rjson::to_string_view(it1->value).starts_with(rjson::to_string_view(it2->value));
} else /* it2->name == "B" */ {
return base64_begins_with(rjson::to_string_view(it1->value), rjson::to_string_view(it2->value));
}
}
static bool is_set_of(const rjson::value& type1, const rjson::value& type2) {
return (type2 == "S" && type1 == "SS") || (type2 == "N" && type1 == "NS") || (type2 == "B" && type1 == "BS");
}
// Check if two JSON-encoded values match with the CONTAINS relation
bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2) {
if (!v1) {
return false;
}
const auto& kv1 = *v1->MemberBegin();
const auto& kv2 = *v2.MemberBegin();
if (kv1.name == "S" && kv2.name == "S") {
return rjson::to_string_view(kv1.value).find(rjson::to_string_view(kv2.value)) != std::string_view::npos;
} else if (kv1.name == "B" && kv2.name == "B") {
return base64_decode(kv1.value).find(base64_decode(kv2.value)) != bytes::npos;
} else if (is_set_of(kv1.name, kv2.name)) {
for (auto i = kv1.value.Begin(); i != kv1.value.End(); ++i) {
if (*i == kv2.value) {
return true;
}
}
} else if (kv1.name == "L") {
for (auto i = kv1.value.Begin(); i != kv1.value.End(); ++i) {
if (!i->IsObject() || i->MemberCount() != 1) {
clogger.error("check_CONTAINS received a list whose element is malformed");
return false;
}
const auto& el = *i->MemberBegin();
if (el.name == kv2.name && el.value == kv2.value) {
return true;
}
}
}
return false;
}
// Check if two JSON-encoded values match with the NOT_CONTAINS relation
static bool check_NOT_CONTAINS(const rjson::value* v1, const rjson::value& v2) {
if (!v1) {
return false;
}
return !check_CONTAINS(v1, v2);
}
// Check if a JSON-encoded value equals any element of an array, which must have at least one element.
static bool check_IN(const rjson::value* val, const rjson::value& array) {
if (!array[0].IsObject() || array[0].MemberCount() != 1) {
throw api_error::validation(
format("IN operator encountered malformed AttributeValue: {}", array[0]));
}
const auto& type = array[0].MemberBegin()->name;
if (type != "S" && type != "N" && type != "B") {
throw api_error::validation(
"IN operator requires AttributeValueList elements to be of type String, Number, or Binary ");
}
if (!val) {
return false;
}
bool have_match = false;
for (const auto& elem : array.GetArray()) {
if (!elem.IsObject() || elem.MemberCount() != 1 || elem.MemberBegin()->name != type) {
throw api_error::validation(
"IN operator requires all AttributeValueList elements to have the same type ");
}
if (!have_match && *val == elem) {
// Can't return yet, must check types of all array elements. <sigh>
have_match = true;
}
}
return have_match;
}
// Another variant of check_IN, this one for ConditionExpression. It needs to
// check whether the first element in the given vector is equal to any of the
// others.
static bool check_IN(const std::vector<rjson::value>& array) {
const rjson::value* first = &array[0];
for (unsigned i = 1; i < array.size(); i++) {
if (check_EQ(first, array[i])) {
return true;
}
}
return false;
}
static bool check_NULL(const rjson::value* val) {
return val == nullptr;
}
static bool check_NOT_NULL(const rjson::value* val) {
return val != nullptr;
}
// Only types S, N or B (string, number or bytes) may be compared by the
// various comparion operators - lt, le, gt, ge, and between.
// Note that in particular, if the value is missing (v->IsNull()), this
// check returns false.
static bool check_comparable_type(const rjson::value& v) {
if (!v.IsObject() || v.MemberCount() != 1) {
return false;
}
const rjson::value& type = v.MemberBegin()->name;
return type == "S" || type == "N" || type == "B";
}
// Check if two JSON-encoded values match with cmp.
template <typename Comparator>
bool check_compare(const rjson::value* v1, const rjson::value& v2, const Comparator& cmp,
bool v1_from_query, bool v2_from_query) {
bool bad = false;
if (!v1 || !check_comparable_type(*v1)) {
if (v1_from_query) {
throw api_error::validation(format("{} allow only the types String, Number, or Binary", cmp.diagnostic));
}
bad = true;
}
if (!check_comparable_type(v2)) {
if (v2_from_query) {
throw api_error::validation(format("{} allow only the types String, Number, or Binary", cmp.diagnostic));
}
bad = true;
}
if (bad) {
return false;
}
const auto& kv1 = *v1->MemberBegin();
const auto& kv2 = *v2.MemberBegin();
if (kv1.name != kv2.name) {
return false;
}
if (kv1.name == "N") {
return cmp(unwrap_number(*v1, cmp.diagnostic), unwrap_number(v2, cmp.diagnostic));
}
if (kv1.name == "S") {
return cmp(std::string_view(kv1.value.GetString(), kv1.value.GetStringLength()),
std::string_view(kv2.value.GetString(), kv2.value.GetStringLength()));
}
if (kv1.name == "B") {
return cmp(base64_decode(kv1.value), base64_decode(kv2.value));
}
// cannot reach here, as check_comparable_type() verifies the type is one
// of the above options.
return false;
}
struct cmp_lt {
template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs < rhs; }
// We cannot use the normal comparison operators like "<" on the bytes
// type, because they treat individual bytes as signed but we need to
// compare them as *unsigned*. So we need a specialization for bytes.
bool operator()(const bytes& lhs, const bytes& rhs) const { return compare_unsigned(lhs, rhs) < 0; }
static constexpr const char* diagnostic = "LT operator";
};
struct cmp_le {
template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs <= rhs; }
bool operator()(const bytes& lhs, const bytes& rhs) const { return compare_unsigned(lhs, rhs) <= 0; }
static constexpr const char* diagnostic = "LE operator";
};
struct cmp_ge {
template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs >= rhs; }
bool operator()(const bytes& lhs, const bytes& rhs) const { return compare_unsigned(lhs, rhs) >= 0; }
static constexpr const char* diagnostic = "GE operator";
};
struct cmp_gt {
template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs > rhs; }
bool operator()(const bytes& lhs, const bytes& rhs) const { return compare_unsigned(lhs, rhs) > 0; }
static constexpr const char* diagnostic = "GT operator";
};
// True if v is between lb and ub, inclusive. Throws or returns false
// (depending on bounds_from_query parameter) if lb > ub.
template <typename T>
static bool check_BETWEEN(const T& v, const T& lb, const T& ub, bool bounds_from_query) {
if (cmp_lt()(ub, lb)) {
if (bounds_from_query) {
throw api_error::validation(
format("BETWEEN operator requires lower_bound <= upper_bound, but {} > {}", lb, ub));
} else {
return false;
}
}
return cmp_ge()(v, lb) && cmp_le()(v, ub);
}
static bool check_BETWEEN(const rjson::value* v, const rjson::value& lb, const rjson::value& ub,
bool v_from_query, bool lb_from_query, bool ub_from_query) {
if ((v && v_from_query && !check_comparable_type(*v)) ||
(lb_from_query && !check_comparable_type(lb)) ||
(ub_from_query && !check_comparable_type(ub))) {
throw api_error::validation("between allow only the types String, Number, or Binary");
}
if (!v || !v->IsObject() || v->MemberCount() != 1 ||
!lb.IsObject() || lb.MemberCount() != 1 ||
!ub.IsObject() || ub.MemberCount() != 1) {
return false;
}
const auto& kv_v = *v->MemberBegin();
const auto& kv_lb = *lb.MemberBegin();
const auto& kv_ub = *ub.MemberBegin();
bool bounds_from_query = lb_from_query && ub_from_query;
if (kv_lb.name != kv_ub.name) {
if (bounds_from_query) {
throw api_error::validation(
format("BETWEEN operator requires the same type for lower and upper bound; instead got {} and {}",
kv_lb.name, kv_ub.name));
} else {
return false;
}
}
if (kv_v.name != kv_lb.name) { // Cannot compare different types, so v is NOT between lb and ub.
return false;
}
if (kv_v.name == "N") {
const char* diag = "BETWEEN operator";
return check_BETWEEN(unwrap_number(*v, diag), unwrap_number(lb, diag), unwrap_number(ub, diag), bounds_from_query);
}
if (kv_v.name == "S") {
return check_BETWEEN(std::string_view(kv_v.value.GetString(), kv_v.value.GetStringLength()),
std::string_view(kv_lb.value.GetString(), kv_lb.value.GetStringLength()),
std::string_view(kv_ub.value.GetString(), kv_ub.value.GetStringLength()),
bounds_from_query);
}
if (kv_v.name == "B") {
return check_BETWEEN(base64_decode(kv_v.value), base64_decode(kv_lb.value), base64_decode(kv_ub.value), bounds_from_query);
}
if (v_from_query) {
throw api_error::validation(
format("BETWEEN operator requires AttributeValueList elements to be of type String, Number, or Binary; instead got {}",
kv_lb.name));
} else {
return false;
}
}
// Verify one Expect condition on one attribute (whose content is "got")
// for the verify_expected() below.
// This function returns true or false depending on whether the condition
// succeeded - it does not throw ConditionalCheckFailedException.
// However, it may throw ValidationException on input validation errors.
static bool verify_expected_one(const rjson::value& condition, const rjson::value* got) {
const rjson::value* comparison_operator = rjson::find(condition, "ComparisonOperator");
const rjson::value* attribute_value_list = rjson::find(condition, "AttributeValueList");
const rjson::value* value = rjson::find(condition, "Value");
const rjson::value* exists = rjson::find(condition, "Exists");
// There are three types of conditions that Expected supports:
// A value, not-exists, and a comparison of some kind. Each allows
// and requires a different combinations of parameters in the request
if (value) {
if (exists && (!exists->IsBool() || exists->GetBool() != true)) {
throw api_error::validation("Cannot combine Value with Exists!=true");
}
if (comparison_operator) {
throw api_error::validation("Cannot combine Value with ComparisonOperator");
}
return check_EQ(got, *value);
} else if (exists) {
if (comparison_operator) {
throw api_error::validation("Cannot combine Exists with ComparisonOperator");
}
if (!exists->IsBool() || exists->GetBool() != false) {
throw api_error::validation("Exists!=false requires Value");
}
// Remember Exists=false, so we're checking that the attribute does *not* exist:
return !got;
} else {
if (!comparison_operator) {
throw api_error::validation("Missing ComparisonOperator, Value or Exists");
}
comparison_operator_type op = get_comparison_operator(*comparison_operator);
switch (op) {
case comparison_operator_type::EQ:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_EQ(got, (*attribute_value_list)[0]);
case comparison_operator_type::NE:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_NE(got, (*attribute_value_list)[0]);
case comparison_operator_type::LT:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_compare(got, (*attribute_value_list)[0], cmp_lt{}, false, true);
case comparison_operator_type::LE:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_compare(got, (*attribute_value_list)[0], cmp_le{}, false, true);
case comparison_operator_type::GT:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_compare(got, (*attribute_value_list)[0], cmp_gt{}, false, true);
case comparison_operator_type::GE:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_compare(got, (*attribute_value_list)[0], cmp_ge{}, false, true);
case comparison_operator_type::BEGINS_WITH:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_BEGINS_WITH(got, (*attribute_value_list)[0], false, true);
case comparison_operator_type::IN:
verify_operand_count(attribute_value_list, nonempty(), *comparison_operator);
return check_IN(got, *attribute_value_list);
case comparison_operator_type::IS_NULL:
verify_operand_count(attribute_value_list, empty(), *comparison_operator);
return check_NULL(got);
case comparison_operator_type::NOT_NULL:
verify_operand_count(attribute_value_list, empty(), *comparison_operator);
return check_NOT_NULL(got);
case comparison_operator_type::BETWEEN:
verify_operand_count(attribute_value_list, exact_size(2), *comparison_operator);
return check_BETWEEN(got, (*attribute_value_list)[0], (*attribute_value_list)[1],
false, true, true);
case comparison_operator_type::CONTAINS:
{
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
// Expected's "CONTAINS" has this artificial limitation.
// ConditionExpression's "contains()" does not...
const rjson::value& arg = (*attribute_value_list)[0];
const auto& argtype = (*arg.MemberBegin()).name;
if (argtype != "S" && argtype != "N" && argtype != "B") {
throw api_error::validation(
format("CONTAINS operator requires a single AttributeValue of type String, Number, or Binary, "
"got {} instead", argtype));
}
return check_CONTAINS(got, arg);
}
case comparison_operator_type::NOT_CONTAINS:
{
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
// Expected's "NOT_CONTAINS" has this artificial limitation.
// ConditionExpression's "contains()" does not...
const rjson::value& arg = (*attribute_value_list)[0];
const auto& argtype = (*arg.MemberBegin()).name;
if (argtype != "S" && argtype != "N" && argtype != "B") {
throw api_error::validation(
format("CONTAINS operator requires a single AttributeValue of type String, Number, or Binary, "
"got {} instead", argtype));
}
return check_NOT_CONTAINS(got, arg);
}
}
throw std::logic_error(format("Internal error: corrupted operator enum: {}", int(op)));
}
}
conditional_operator_type get_conditional_operator(const rjson::value& req) {
const rjson::value* conditional_operator = rjson::find(req, "ConditionalOperator");
if (!conditional_operator) {
return conditional_operator_type::MISSING;
}
if (!conditional_operator->IsString()) {
throw api_error::validation("'ConditionalOperator' parameter, if given, must be a string");
}
auto s = rjson::to_string_view(*conditional_operator);
if (s == "AND") {
return conditional_operator_type::AND;
} else if (s == "OR") {
return conditional_operator_type::OR;
} else {
throw api_error::validation(
format("'ConditionalOperator' parameter must be AND, OR or missing. Found {}.", s));
}
}
// Check if the existing values of the item (previous_item) match the
// conditions given by the Expected and ConditionalOperator parameters
// (if they exist) in the request (an UpdateItem, PutItem or DeleteItem).
// This function can throw an ValidationException API error if there
// are errors in the format of the condition itself.
bool verify_expected(const rjson::value& req, const rjson::value* previous_item) {
const rjson::value* expected = rjson::find(req, "Expected");
auto conditional_operator = get_conditional_operator(req);
if (conditional_operator != conditional_operator_type::MISSING &&
(!expected || (expected->IsObject() && expected->GetObject().ObjectEmpty()))) {
throw api_error::validation("'ConditionalOperator' parameter cannot be specified for missing or empty Expression");
}
if (!expected) {
return true;
}
if (!expected->IsObject()) {
throw api_error::validation("'Expected' parameter, if given, must be an object");
}
bool require_all = conditional_operator != conditional_operator_type::OR;
return verify_condition(*expected, require_all, previous_item);
}
bool verify_condition(const rjson::value& condition, bool require_all, const rjson::value* previous_item) {
for (auto it = condition.MemberBegin(); it != condition.MemberEnd(); ++it) {
const rjson::value* got = nullptr;
if (previous_item) {
got = rjson::find(*previous_item, rjson::to_string_view(it->name));
}
bool success = verify_expected_one(it->value, got);
if (success && !require_all) {
// When !require_all, one success is enough!
return true;
} else if (!success && require_all) {
// When require_all, one failure is enough!
return false;
}
}
// If we got here and require_all, none of the checks failed, so succeed.
// If we got here and !require_all, all of the checks failed, so fail.
return require_all;
}
static bool calculate_primitive_condition(const parsed::primitive_condition& cond,
const rjson::value* previous_item) {
std::vector<rjson::value> calculated_values;
calculated_values.reserve(cond._values.size());
for (const parsed::value& v : cond._values) {
calculated_values.push_back(calculate_value(v,
cond._op == parsed::primitive_condition::type::VALUE ?
calculate_value_caller::ConditionExpressionAlone :
calculate_value_caller::ConditionExpression,
previous_item));
}
switch (cond._op) {
case parsed::primitive_condition::type::BETWEEN:
if (calculated_values.size() != 3) {
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error(format("Wrong number of values {} in BETWEEN primitive_condition", cond._values.size()));
}
return check_BETWEEN(&calculated_values[0], calculated_values[1], calculated_values[2],
cond._values[0].is_constant(), cond._values[1].is_constant(), cond._values[2].is_constant());
case parsed::primitive_condition::type::IN:
return check_IN(calculated_values);
case parsed::primitive_condition::type::VALUE:
if (calculated_values.size() != 1) {
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error(format("Unexpected values in primitive_condition", cond._values.size()));
}
// Unwrap the boolean wrapped as the value (if it is a boolean)
if (calculated_values[0].IsObject() && calculated_values[0].MemberCount() == 1) {
auto it = calculated_values[0].MemberBegin();
if (it->name == "BOOL" && it->value.IsBool()) {
return it->value.GetBool();
}
}
throw api_error::validation(
format("ConditionExpression: condition results in a non-boolean value: {}",
calculated_values[0]));
default:
// All the rest of the operators have exactly two parameters (and unless
// we have a bug in the parser, that's what we have in the parsed object:
if (calculated_values.size() != 2) {
throw std::logic_error(format("Wrong number of values {} in primitive_condition object", cond._values.size()));
}
}
switch (cond._op) {
case parsed::primitive_condition::type::EQ:
return check_EQ(&calculated_values[0], calculated_values[1]);
case parsed::primitive_condition::type::NE:
return check_NE(&calculated_values[0], calculated_values[1]);
case parsed::primitive_condition::type::GT:
return check_compare(&calculated_values[0], calculated_values[1], cmp_gt{},
cond._values[0].is_constant(), cond._values[1].is_constant());
case parsed::primitive_condition::type::GE:
return check_compare(&calculated_values[0], calculated_values[1], cmp_ge{},
cond._values[0].is_constant(), cond._values[1].is_constant());
case parsed::primitive_condition::type::LT:
return check_compare(&calculated_values[0], calculated_values[1], cmp_lt{},
cond._values[0].is_constant(), cond._values[1].is_constant());
case parsed::primitive_condition::type::LE:
return check_compare(&calculated_values[0], calculated_values[1], cmp_le{},
cond._values[0].is_constant(), cond._values[1].is_constant());
default:
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error(format("Unknown type {} in primitive_condition object", (int)(cond._op)));
}
}
// Check if the existing values of the item (previous_item) match the
// conditions given by the given parsed ConditionExpression.
bool verify_condition_expression(
const parsed::condition_expression& condition_expression,
const rjson::value* previous_item) {
if (condition_expression.empty()) {
return true;
}
bool ret = std::visit(overloaded_functor {
[&] (const parsed::primitive_condition& cond) -> bool {
return calculate_primitive_condition(cond, previous_item);
},
[&] (const parsed::condition_expression::condition_list& list) -> bool {
auto verify_condition = [&] (const parsed::condition_expression& e) {
return verify_condition_expression(e, previous_item);
};
switch (list.op) {
case '&':
return boost::algorithm::all_of(list.conditions, verify_condition);
case '|':
return boost::algorithm::any_of(list.conditions, verify_condition);
default:
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error("bad operator in condition_list");
}
}
}, condition_expression._expression);
return condition_expression._negated ? !ret : ret;
}
}

View File

@@ -1,61 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
/*
* This file contains definitions and functions related to placing conditions
* on Alternator queries (equivalent of CQL's restrictions).
*
* With conditions, it's possible to add criteria to selection requests (Scan, Query)
* and use them for narrowing down the result set, by means of filtering or indexing.
*
* Ref: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Condition.html
*/
#pragma once
#include "cql3/restrictions/statement_restrictions.hh"
#include "serialization.hh"
#include "expressions_types.hh"
namespace alternator {
enum class comparison_operator_type {
EQ, NE, LE, LT, GE, GT, IN, BETWEEN, CONTAINS, NOT_CONTAINS, IS_NULL, NOT_NULL, BEGINS_WITH
};
comparison_operator_type get_comparison_operator(const rjson::value& comparison_operator);
enum class conditional_operator_type {
AND, OR, MISSING
};
conditional_operator_type get_conditional_operator(const rjson::value& req);
bool verify_expected(const rjson::value& req, const rjson::value* previous_item);
bool verify_condition(const rjson::value& condition, bool require_all, const rjson::value* previous_item);
bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2);
bool check_BEGINS_WITH(const rjson::value* v1, const rjson::value& v2, bool v1_from_query, bool v2_from_query);
bool verify_condition_expression(
const parsed::condition_expression& condition_expression,
const rjson::value* previous_item);
}

View File

@@ -1,92 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <seastar/http/httpd.hh>
#include "seastarx.hh"
namespace alternator {
// api_error contains a DynamoDB error message to be returned to the user.
// It can be returned by value (see executor::request_return_type) or thrown.
// The DynamoDB's error messages are described in detail in
// https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html
// An error message has an HTTP code (almost always 400), a type, e.g.,
// "ResourceNotFoundException", and a human readable message.
// Eventually alternator::api_handler will convert a returned or thrown
// api_error into a JSON object, and that is returned to the user.
class api_error final {
public:
using status_type = httpd::reply::status_type;
status_type _http_code;
std::string _type;
std::string _msg;
api_error(std::string type, std::string msg, status_type http_code = status_type::bad_request)
: _http_code(std::move(http_code))
, _type(std::move(type))
, _msg(std::move(msg))
{ }
// Factory functions for some common types of DynamoDB API errors
static api_error validation(std::string msg) {
return api_error("ValidationException", std::move(msg));
}
static api_error resource_not_found(std::string msg) {
return api_error("ResourceNotFoundException", std::move(msg));
}
static api_error resource_in_use(std::string msg) {
return api_error("ResourceInUseException", std::move(msg));
}
static api_error invalid_signature(std::string msg) {
return api_error("InvalidSignatureException", std::move(msg));
}
static api_error missing_authentication_token(std::string msg) {
return api_error("MissingAuthenticationTokenException", std::move(msg));
}
static api_error unrecognized_client(std::string msg) {
return api_error("UnrecognizedClientException", std::move(msg));
}
static api_error unknown_operation(std::string msg) {
return api_error("UnknownOperationException", std::move(msg));
}
static api_error access_denied(std::string msg) {
return api_error("AccessDeniedException", std::move(msg));
}
static api_error conditional_check_failed(std::string msg) {
return api_error("ConditionalCheckFailedException", std::move(msg));
}
static api_error expired_iterator(std::string msg) {
return api_error("ExpiredIteratorException", std::move(msg));
}
static api_error trimmed_data_access_exception(std::string msg) {
return api_error("TrimmedDataAccessException", std::move(msg));
}
static api_error request_limit_exceeded(std::string msg) {
return api_error("RequestLimitExceeded", std::move(msg));
}
static api_error internal(std::string msg) {
return api_error("InternalServerError", std::move(msg), reply::status_type::internal_server_error);
}
};
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,224 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <seastar/core/future.hh>
#include <seastar/http/httpd.hh>
#include "seastarx.hh"
#include <seastar/json/json_elements.hh>
#include <seastar/core/sharded.hh>
#include "service/storage_proxy.hh"
#include "service/migration_manager.hh"
#include "service/client_state.hh"
#include "db/timeout_clock.hh"
#include "alternator/error.hh"
#include "stats.hh"
#include "utils/rjson.hh"
namespace db {
class system_distributed_keyspace;
}
namespace query {
class partition_slice;
class result;
}
namespace cql3::selection {
class selection;
}
namespace service {
class storage_service;
}
namespace cdc {
class metadata;
}
namespace alternator {
class rmw_operation;
struct make_jsonable : public json::jsonable {
rjson::value _value;
public:
explicit make_jsonable(rjson::value&& value);
std::string to_json() const override;
};
struct json_string : public json::jsonable {
std::string _value;
public:
explicit json_string(std::string&& value);
std::string to_json() const override;
};
namespace parsed {
class path;
};
// An attribute_path_map object is used to hold data for various attributes
// paths (parsed::path) in a hierarchy of attribute paths. Each attribute path
// has a root attribute, and then modified by member and index operators -
// for example in "a.b[2].c" we have "a" as the root, then ".b" member, then
// "[2]" index, and finally ".c" member.
// Data can be added to an attribute_path_map using the add() function, but
// requires that attributes with data not be *overlapping* or *conflicting*:
//
// 1. Two attribute paths which are identical or an ancestor of one another
// are considered *overlapping* and not allowed. If a.b.c has data,
// we can't add more data in a.b.c or any of its descendants like a.b.c.d.
//
// 2. Two attribute paths which need the same parent to have both a member and
// an index are considered *conflicting* and not allowed. E.g., if a.b has
// data, you can't add a[1]. The meaning of adding both would be that the
// attribute a is both a map and an array, which isn't sensible.
//
// These two requirements are common to the two places where Alternator uses
// this abstraction to describe how a hierarchical item is to be transformed:
//
// 1. In ProjectExpression: for filtering from a full top-level attribute
// only the parts for which user asked in ProjectionExpression.
//
// 2. In UpdateExpression: for taking the previous value of a top-level
// attribute, and modifying it based on the instructions in the user
// wrote in UpdateExpression.
template<typename T>
class attribute_path_map_node {
public:
using data_t = T;
// We need the extra unique_ptr<> here because libstdc++ unordered_map
// doesn't work with incomplete types :-(
using members_t = std::unordered_map<std::string, std::unique_ptr<attribute_path_map_node<T>>>;
// The indexes list is sorted because DynamoDB requires handling writes
// beyond the end of a list in index order.
using indexes_t = std::map<unsigned, std::unique_ptr<attribute_path_map_node<T>>>;
// The prohibition on "overlap" and "conflict" explained above means
// That only one of data, members or indexes is non-empty.
std::optional<std::variant<data_t, members_t, indexes_t>> _content;
bool is_empty() const { return !_content; }
bool has_value() const { return _content && std::holds_alternative<data_t>(*_content); }
bool has_members() const { return _content && std::holds_alternative<members_t>(*_content); }
bool has_indexes() const { return _content && std::holds_alternative<indexes_t>(*_content); }
// get_members() assumes that has_members() is true
members_t& get_members() { return std::get<members_t>(*_content); }
const members_t& get_members() const { return std::get<members_t>(*_content); }
indexes_t& get_indexes() { return std::get<indexes_t>(*_content); }
const indexes_t& get_indexes() const { return std::get<indexes_t>(*_content); }
T& get_value() { return std::get<T>(*_content); }
const T& get_value() const { return std::get<T>(*_content); }
};
template<typename T>
using attribute_path_map = std::unordered_map<std::string, attribute_path_map_node<T>>;
using attrs_to_get_node = attribute_path_map_node<std::monostate>;
using attrs_to_get = attribute_path_map<std::monostate>;
class executor : public peering_sharded_service<executor> {
service::storage_proxy& _proxy;
service::migration_manager& _mm;
db::system_distributed_keyspace& _sdks;
service::storage_service& _ss;
cdc::metadata& _cdc_metadata;
// An smp_service_group to be used for limiting the concurrency when
// forwarding Alternator request between shards - if necessary for LWT.
smp_service_group _ssg;
public:
using client_state = service::client_state;
using request_return_type = std::variant<json::json_return_type, api_error>;
stats _stats;
static constexpr auto ATTRS_COLUMN_NAME = ":attrs";
static constexpr auto KEYSPACE_NAME_PREFIX = "alternator_";
static constexpr std::string_view INTERNAL_TABLE_PREFIX = ".scylla.alternator.";
executor(service::storage_proxy& proxy, service::migration_manager& mm, db::system_distributed_keyspace& sdks, service::storage_service& ss, cdc::metadata& cdc_metadata, smp_service_group ssg)
: _proxy(proxy), _mm(mm), _sdks(sdks), _ss(ss), _cdc_metadata(cdc_metadata), _ssg(ssg) {}
future<request_return_type> create_table(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> describe_table(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> delete_table(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> update_table(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> put_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> get_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> delete_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> update_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> list_tables(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> scan(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> describe_endpoints(client_state& client_state, service_permit permit, rjson::value request, std::string host_header);
future<request_return_type> batch_write_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> batch_get_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> query(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> tag_resource(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> untag_resource(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> list_tags_of_resource(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> list_streams(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> describe_stream(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> get_shard_iterator(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> get_records(client_state& client_state, tracing::trace_state_ptr, service_permit permit, rjson::value request);
future<> start();
future<> stop() { return make_ready_future<>(); }
future<> create_keyspace(std::string_view keyspace_name);
static sstring table_name(const schema&);
static db::timeout_clock::time_point default_timeout();
static void set_default_timeout(db::timeout_clock::duration timeout);
private:
static db::timeout_clock::duration s_default_timeout;
public:
static schema_ptr find_table(service::storage_proxy&, const rjson::value& request);
private:
friend class rmw_operation;
static bool is_alternator_keyspace(const sstring& ks_name);
static sstring make_keyspace_name(const sstring& table_name);
static void describe_key_schema(rjson::value& parent, const schema&, std::unordered_map<std::string,std::string> * = nullptr);
static void describe_key_schema(rjson::value& parent, const schema& schema, std::unordered_map<std::string,std::string>&);
public:
static std::optional<rjson::value> describe_single_item(schema_ptr,
const query::partition_slice&,
const cql3::selection::selection&,
const query::result&,
const attrs_to_get&);
static void describe_single_item(const cql3::selection::selection&,
const std::vector<bytes_opt>&,
const attrs_to_get&,
rjson::value&,
bool = false);
void add_stream_options(const rjson::value& stream_spec, schema_builder&) const;
void supplement_table_info(rjson::value& descr, const schema& schema) const;
void supplement_table_stream_info(rjson::value& descr, const schema& schema) const;
};
}

View File

@@ -1,767 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "expressions.hh"
#include "serialization.hh"
#include "base64.hh"
#include "conditions.hh"
#include "alternator/expressionsLexer.hpp"
#include "alternator/expressionsParser.hpp"
#include "utils/overloaded_functor.hh"
#include "error.hh"
#include "seastarx.hh"
#include <seastar/core/print.hh>
#include <seastar/util/log.hh>
#include <boost/algorithm/cxx11/any_of.hpp>
#include <boost/algorithm/cxx11/all_of.hpp>
#include <functional>
#include <unordered_map>
namespace alternator {
template <typename Func, typename Result = std::result_of_t<Func(expressionsParser&)>>
Result do_with_parser(std::string input, Func&& f) {
expressionsLexer::InputStreamType input_stream{
reinterpret_cast<const ANTLR_UINT8*>(input.data()),
ANTLR_ENC_UTF8,
static_cast<ANTLR_UINT32>(input.size()),
nullptr };
expressionsLexer lexer(&input_stream);
expressionsParser::TokenStreamType tstream(ANTLR_SIZE_HINT, lexer.get_tokSource());
expressionsParser parser(&tstream);
auto result = f(parser);
return result;
}
parsed::update_expression
parse_update_expression(std::string query) {
try {
return do_with_parser(query, std::mem_fn(&expressionsParser::update_expression));
} catch (...) {
throw expressions_syntax_error(format("Failed parsing UpdateExpression '{}': {}", query, std::current_exception()));
}
}
std::vector<parsed::path>
parse_projection_expression(std::string query) {
try {
return do_with_parser(query, std::mem_fn(&expressionsParser::projection_expression));
} catch (...) {
throw expressions_syntax_error(format("Failed parsing ProjectionExpression '{}': {}", query, std::current_exception()));
}
}
parsed::condition_expression
parse_condition_expression(std::string query) {
try {
return do_with_parser(query, std::mem_fn(&expressionsParser::condition_expression));
} catch (...) {
throw expressions_syntax_error(format("Failed parsing ConditionExpression '{}': {}", query, std::current_exception()));
}
}
namespace parsed {
void update_expression::add(update_expression::action a) {
std::visit(overloaded_functor {
[&] (action::set&) { seen_set = true; },
[&] (action::remove&) { seen_remove = true; },
[&] (action::add&) { seen_add = true; },
[&] (action::del&) { seen_del = true; }
}, a._action);
_actions.push_back(std::move(a));
}
void update_expression::append(update_expression other) {
if ((seen_set && other.seen_set) ||
(seen_remove && other.seen_remove) ||
(seen_add && other.seen_add) ||
(seen_del && other.seen_del)) {
throw expressions_syntax_error("Each of SET, REMOVE, ADD, DELETE may only appear once in UpdateExpression");
}
std::move(other._actions.begin(), other._actions.end(), std::back_inserter(_actions));
seen_set |= other.seen_set;
seen_remove |= other.seen_remove;
seen_add |= other.seen_add;
seen_del |= other.seen_del;
}
void condition_expression::append(condition_expression&& a, char op) {
std::visit(overloaded_functor {
[&] (condition_list& x) {
// If 'a' has a single condition, we could, instead of inserting
// it insert its single condition (possibly negated if a._negated)
// But considering it we don't evaluate these expressions many
// times, this optimization is not worth extra code complexity.
if (!x.conditions.empty() && x.op != op) {
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error("condition_expression::append called with mixed operators");
}
x.conditions.push_back(std::move(a));
x.op = op;
},
[&] (primitive_condition& x) {
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error("condition_expression::append called on primitive_condition");
}
}, _expression);
}
void path::check_depth_limit() {
if (1 + _operators.size() > depth_limit) {
throw expressions_syntax_error(format("Document path exceeded {} nesting levels", depth_limit));
}
}
std::ostream& operator<<(std::ostream& os, const path& p) {
os << p.root();
for (const auto& op : p.operators()) {
std::visit(overloaded_functor {
[&] (const std::string& member) {
os << '.' << member;
},
[&] (unsigned index) {
os << '[' << index << ']';
}
}, op);
}
return os;
}
} // namespace parsed
// The following resolve_*() functions resolve references in parsed
// expressions of different types. Resolving a parsed expression means
// replacing:
// 1. In parsed::path objects, replace references like "#name" with the
// attribute name from ExpressionAttributeNames,
// 2. In parsed::constant objects, replace references like ":value" with
// the value from ExpressionAttributeValues.
// These function also track which name and value references were used, to
// allow complaining if some remain unused.
// Note that the resolve_*() functions modify the expressions in-place,
// so if we ever intend to cache parsed expression, we need to pass a copy
// into this function.
//
// Doing the "resolving" stage before the evaluation stage has two benefits.
// First, it allows us to be compatible with DynamoDB in catching unused
// names and values (see issue #6572). Second, in the FilterExpression case,
// we need to resolve the expression just once but then use it many times
// (once for each item to be filtered).
static std::optional<std::string> resolve_path_component(const std::string& column_name,
const rjson::value* expression_attribute_names,
std::unordered_set<std::string>& used_attribute_names) {
if (column_name.size() > 0 && column_name.front() == '#') {
if (!expression_attribute_names) {
throw api_error::validation(
format("ExpressionAttributeNames missing, entry '{}' required by expression", column_name));
}
const rjson::value* value = rjson::find(*expression_attribute_names, column_name);
if (!value || !value->IsString()) {
throw api_error::validation(
format("ExpressionAttributeNames missing entry '{}' required by expression", column_name));
}
used_attribute_names.emplace(column_name);
return std::string(rjson::to_string_view(*value));
}
return std::nullopt;
}
static void resolve_path(parsed::path& p,
const rjson::value* expression_attribute_names,
std::unordered_set<std::string>& used_attribute_names) {
std::optional<std::string> r = resolve_path_component(p.root(), expression_attribute_names, used_attribute_names);
if (r) {
p.set_root(std::move(*r));
}
for (auto& op : p.operators()) {
std::visit(overloaded_functor {
[&] (std::string& s) {
r = resolve_path_component(s, expression_attribute_names, used_attribute_names);
if (r) {
s = std::move(*r);
}
},
[&] (unsigned index) {
// nothing to resolve
}
}, op);
}
}
static void resolve_constant(parsed::constant& c,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_values) {
std::visit(overloaded_functor {
[&] (const std::string& valref) {
if (!expression_attribute_values) {
throw api_error::validation(
format("ExpressionAttributeValues missing, entry '{}' required by expression", valref));
}
const rjson::value* value = rjson::find(*expression_attribute_values, valref);
if (!value) {
throw api_error::validation(
format("ExpressionAttributeValues missing entry '{}' required by expression", valref));
}
if (value->IsNull()) {
throw api_error::validation(
format("ExpressionAttributeValues null value for entry '{}' required by expression", valref));
}
validate_value(*value, "ExpressionAttributeValues");
used_attribute_values.emplace(valref);
c.set(*value);
},
[&] (const parsed::constant::literal& lit) {
// Nothing to do, already resolved
}
}, c._value);
}
void resolve_value(parsed::value& rhs,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values) {
std::visit(overloaded_functor {
[&] (parsed::constant& c) {
resolve_constant(c, expression_attribute_values, used_attribute_values);
},
[&] (parsed::value::function_call& f) {
for (parsed::value& value : f._parameters) {
resolve_value(value, expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
}
},
[&] (parsed::path& p) {
resolve_path(p, expression_attribute_names, used_attribute_names);
}
}, rhs._value);
}
void resolve_set_rhs(parsed::set_rhs& rhs,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values) {
resolve_value(rhs._v1, expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
if (rhs._op != 'v') {
resolve_value(rhs._v2, expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
}
}
void resolve_update_expression(parsed::update_expression& ue,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values) {
for (parsed::update_expression::action& action : ue.actions()) {
resolve_path(action._path, expression_attribute_names, used_attribute_names);
std::visit(overloaded_functor {
[&] (parsed::update_expression::action::set& a) {
resolve_set_rhs(a._rhs, expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
},
[&] (parsed::update_expression::action::remove& a) {
// nothing to do
},
[&] (parsed::update_expression::action::add& a) {
resolve_constant(a._valref, expression_attribute_values, used_attribute_values);
},
[&] (parsed::update_expression::action::del& a) {
resolve_constant(a._valref, expression_attribute_values, used_attribute_values);
}
}, action._action);
}
}
static void resolve_primitive_condition(parsed::primitive_condition& pc,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values) {
for (parsed::value& value : pc._values) {
resolve_value(value,
expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
}
}
void resolve_condition_expression(parsed::condition_expression& ce,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values) {
std::visit(overloaded_functor {
[&] (parsed::primitive_condition& cond) {
resolve_primitive_condition(cond,
expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
},
[&] (parsed::condition_expression::condition_list& list) {
for (parsed::condition_expression& cond : list.conditions) {
resolve_condition_expression(cond,
expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
}
}
}, ce._expression);
}
void resolve_projection_expression(std::vector<parsed::path>& pe,
const rjson::value* expression_attribute_names,
std::unordered_set<std::string>& used_attribute_names) {
for (parsed::path& p : pe) {
resolve_path(p, expression_attribute_names, used_attribute_names);
}
}
// condition_expression_on() checks whether a condition_expression places any
// condition on the given attribute. It can be useful, for example, for
// checking whether the condition tries to restrict a key column.
static bool value_on(const parsed::value& v, std::string_view attribute) {
return std::visit(overloaded_functor {
[&] (const parsed::constant& c) {
return false;
},
[&] (const parsed::value::function_call& f) {
for (const parsed::value& value : f._parameters) {
if (value_on(value, attribute)) {
return true;
}
}
return false;
},
[&] (const parsed::path& p) {
return p.root() == attribute;
}
}, v._value);
}
static bool primitive_condition_on(const parsed::primitive_condition& pc, std::string_view attribute) {
for (const parsed::value& value : pc._values) {
if (value_on(value, attribute)) {
return true;
}
}
return false;
}
bool condition_expression_on(const parsed::condition_expression& ce, std::string_view attribute) {
return std::visit(overloaded_functor {
[&] (const parsed::primitive_condition& cond) {
return primitive_condition_on(cond, attribute);
},
[&] (const parsed::condition_expression::condition_list& list) {
for (const parsed::condition_expression& cond : list.conditions) {
if (condition_expression_on(cond, attribute)) {
return true;
}
}
return false;
}
}, ce._expression);
}
// for_condition_expression_on() runs a given function over all the attributes
// mentioned in the expression. If the same attribute is mentioned more than
// once, the function will be called more than once for the same attribute.
static void for_value_on(const parsed::value& v, const noncopyable_function<void(std::string_view)>& func) {
std::visit(overloaded_functor {
[&] (const parsed::constant& c) { },
[&] (const parsed::value::function_call& f) {
for (const parsed::value& value : f._parameters) {
for_value_on(value, func);
}
},
[&] (const parsed::path& p) {
func(p.root());
}
}, v._value);
}
void for_condition_expression_on(const parsed::condition_expression& ce, const noncopyable_function<void(std::string_view)>& func) {
std::visit(overloaded_functor {
[&] (const parsed::primitive_condition& cond) {
for (const parsed::value& value : cond._values) {
for_value_on(value, func);
}
},
[&] (const parsed::condition_expression::condition_list& list) {
for (const parsed::condition_expression& cond : list.conditions) {
for_condition_expression_on(cond, func);
}
}
}, ce._expression);
}
// The following calculate_value() functions calculate, or evaluate, a parsed
// expression. The parsed expression is assumed to have been "resolved", with
// the matching resolve_* function.
// Take two JSON-encoded list values (remember that a list value is
// {"L": [...the actual list]}) and return the concatenation, again as
// a list value.
static rjson::value list_concatenate(const rjson::value& v1, const rjson::value& v2) {
const rjson::value* list1 = unwrap_list(v1);
const rjson::value* list2 = unwrap_list(v2);
if (!list1 || !list2) {
throw api_error::validation("UpdateExpression: list_append() given a non-list");
}
rjson::value cat = rjson::copy(*list1);
for (const auto& a : list2->GetArray()) {
rjson::push_back(cat, rjson::copy(a));
}
rjson::value ret = rjson::empty_object();
rjson::set(ret, "L", std::move(cat));
return ret;
}
// calculate_size() is ConditionExpression's size() function, i.e., it takes
// a JSON-encoded value and returns its "size" as defined differently for the
// different types - also as a JSON-encoded number.
// It return a JSON-encoded "null" value if this value's type has no size
// defined. Comparisons against this non-numeric value will later fail.
static rjson::value calculate_size(const rjson::value& v) {
// NOTE: If v is improperly formatted for our JSON value encoding, it
// must come from the request itself, not from the database, so it makes
// sense to throw a ValidationException if we see such a problem.
if (!v.IsObject() || v.MemberCount() != 1) {
throw api_error::validation(format("invalid object: {}", v));
}
auto it = v.MemberBegin();
int ret;
if (it->name == "S") {
if (!it->value.IsString()) {
throw api_error::validation(format("invalid string: {}", v));
}
ret = it->value.GetStringLength();
} else if (it->name == "NS" || it->name == "SS" || it->name == "BS" || it->name == "L") {
if (!it->value.IsArray()) {
throw api_error::validation(format("invalid set: {}", v));
}
ret = it->value.Size();
} else if (it->name == "M") {
if (!it->value.IsObject()) {
throw api_error::validation(format("invalid map: {}", v));
}
ret = it->value.MemberCount();
} else if (it->name == "B") {
if (!it->value.IsString()) {
throw api_error::validation(format("invalid byte string: {}", v));
}
ret = base64_decoded_len(rjson::to_string_view(it->value));
} else {
rjson::value json_ret = rjson::empty_object();
rjson::set(json_ret, "null", rjson::value(true));
return json_ret;
}
rjson::value json_ret = rjson::empty_object();
rjson::set(json_ret, "N", rjson::from_string(std::to_string(ret)));
return json_ret;
}
static const rjson::value& calculate_value(const parsed::constant& c) {
return std::visit(overloaded_functor {
[&] (const parsed::constant::literal& v) -> const rjson::value& {
return *v;
},
[&] (const std::string& valref) -> const rjson::value& {
// Shouldn't happen, we should have called resolve_value() earlier
// and replaced the value reference by the literal constant.
throw std::logic_error("calculate_value() called before resolve_value()");
}
}, c._value);
}
static rjson::value to_bool_json(bool b) {
rjson::value json_ret = rjson::empty_object();
rjson::set(json_ret, "BOOL", rjson::value(b));
return json_ret;
}
static bool known_type(std::string_view type) {
static thread_local const std::unordered_set<std::string_view> types = {
"N", "S", "B", "NS", "SS", "BS", "L", "M", "NULL", "BOOL"
};
return types.contains(type);
}
using function_handler_type = rjson::value(calculate_value_caller, const rjson::value*, const parsed::value::function_call&);
static const
std::unordered_map<std::string_view, function_handler_type*> function_handlers {
{"list_append", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::UpdateExpression) {
throw api_error::validation(
format("{}: list_append() not allowed here", caller));
}
if (f._parameters.size() != 2) {
throw api_error::validation(
format("{}: list_append() accepts 2 parameters, got {}", caller, f._parameters.size()));
}
rjson::value v1 = calculate_value(f._parameters[0], caller, previous_item);
rjson::value v2 = calculate_value(f._parameters[1], caller, previous_item);
return list_concatenate(v1, v2);
}
},
{"if_not_exists", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::UpdateExpression) {
throw api_error::validation(
format("{}: if_not_exists() not allowed here", caller));
}
if (f._parameters.size() != 2) {
throw api_error::validation(
format("{}: if_not_exists() accepts 2 parameters, got {}", caller, f._parameters.size()));
}
if (!std::holds_alternative<parsed::path>(f._parameters[0]._value)) {
throw api_error::validation(
format("{}: if_not_exists() must include path as its first argument", caller));
}
rjson::value v1 = calculate_value(f._parameters[0], caller, previous_item);
rjson::value v2 = calculate_value(f._parameters[1], caller, previous_item);
return v1.IsNull() ? std::move(v2) : std::move(v1);
}
},
{"size", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpression) {
throw api_error::validation(
format("{}: size() not allowed here", caller));
}
if (f._parameters.size() != 1) {
throw api_error::validation(
format("{}: size() accepts 1 parameter, got {}", caller, f._parameters.size()));
}
rjson::value v = calculate_value(f._parameters[0], caller, previous_item);
return calculate_size(v);
}
},
{"attribute_exists", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpressionAlone) {
throw api_error::validation(
format("{}: attribute_exists() not allowed here", caller));
}
if (f._parameters.size() != 1) {
throw api_error::validation(
format("{}: attribute_exists() accepts 1 parameter, got {}", caller, f._parameters.size()));
}
if (!std::holds_alternative<parsed::path>(f._parameters[0]._value)) {
throw api_error::validation(
format("{}: attribute_exists()'s parameter must be a path", caller));
}
rjson::value v = calculate_value(f._parameters[0], caller, previous_item);
return to_bool_json(!v.IsNull());
}
},
{"attribute_not_exists", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpressionAlone) {
throw api_error::validation(
format("{}: attribute_not_exists() not allowed here", caller));
}
if (f._parameters.size() != 1) {
throw api_error::validation(
format("{}: attribute_not_exists() accepts 1 parameter, got {}", caller, f._parameters.size()));
}
if (!std::holds_alternative<parsed::path>(f._parameters[0]._value)) {
throw api_error::validation(
format("{}: attribute_not_exists()'s parameter must be a path", caller));
}
rjson::value v = calculate_value(f._parameters[0], caller, previous_item);
return to_bool_json(v.IsNull());
}
},
{"attribute_type", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpressionAlone) {
throw api_error::validation(
format("{}: attribute_type() not allowed here", caller));
}
if (f._parameters.size() != 2) {
throw api_error::validation(
format("{}: attribute_type() accepts 2 parameters, got {}", caller, f._parameters.size()));
}
// There is no real reason for the following check (not
// allowing the type to come from a document attribute), but
// DynamoDB does this check, so we do too...
if (!f._parameters[1].is_constant()) {
throw api_error::validation(
format("{}: attribute_types()'s first parameter must be an expression attribute", caller));
}
rjson::value v0 = calculate_value(f._parameters[0], caller, previous_item);
rjson::value v1 = calculate_value(f._parameters[1], caller, previous_item);
if (v1.IsObject() && v1.MemberCount() == 1 && v1.MemberBegin()->name == "S") {
// If the type parameter is not one of the legal types
// we should generate an error, not a failed condition:
if (!known_type(rjson::to_string_view(v1.MemberBegin()->value))) {
throw api_error::validation(
format("{}: attribute_types()'s second parameter, {}, is not a known type",
caller, v1.MemberBegin()->value));
}
if (v0.IsObject() && v0.MemberCount() == 1) {
return to_bool_json(v1.MemberBegin()->value == v0.MemberBegin()->name);
} else {
return to_bool_json(false);
}
} else {
throw api_error::validation(
format("{}: attribute_type() second parameter must refer to a string, got {}", caller, v1));
}
}
},
{"begins_with", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpressionAlone) {
throw api_error::validation(
format("{}: begins_with() not allowed here", caller));
}
if (f._parameters.size() != 2) {
throw api_error::validation(
format("{}: begins_with() accepts 2 parameters, got {}", caller, f._parameters.size()));
}
rjson::value v1 = calculate_value(f._parameters[0], caller, previous_item);
rjson::value v2 = calculate_value(f._parameters[1], caller, previous_item);
return to_bool_json(check_BEGINS_WITH(v1.IsNull() ? nullptr : &v1, v2,
f._parameters[0].is_constant(), f._parameters[1].is_constant()));
}
},
{"contains", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpressionAlone) {
throw api_error::validation(
format("{}: contains() not allowed here", caller));
}
if (f._parameters.size() != 2) {
throw api_error::validation(
format("{}: contains() accepts 2 parameters, got {}", caller, f._parameters.size()));
}
rjson::value v1 = calculate_value(f._parameters[0], caller, previous_item);
rjson::value v2 = calculate_value(f._parameters[1], caller, previous_item);
return to_bool_json(check_CONTAINS(v1.IsNull() ? nullptr : &v1, v2));
}
},
};
// Given a parsed::path and an item read from the table, extract the value
// of a certain attribute path, such as "a" or "a.b.c[3]". Returns a null
// value if the item or the requested attribute does not exist.
// Note that the item is assumed to be encoded in JSON using DynamoDB
// conventions - each level of a nested document is a map with one key -
// a type (e.g., "M" for map) - and its value is the representation of
// that value.
static rjson::value extract_path(const rjson::value* item,
const parsed::path& p, calculate_value_caller caller) {
if (!item) {
return rjson::null_value();
}
const rjson::value* v = rjson::find(*item, p.root());
if (!v) {
return rjson::null_value();
}
for (const auto& op : p.operators()) {
if (!v->IsObject() || v->MemberCount() != 1) {
// This shouldn't happen. We shouldn't have stored malformed
// objects. But today Alternator does not validate the structure
// of nested documents before storing them, so this can happen on
// read.
throw api_error::validation(format("{}: malformed item read: {}", *item));
}
const char* type = v->MemberBegin()->name.GetString();
v = &(v->MemberBegin()->value);
std::visit(overloaded_functor {
[&] (const std::string& member) {
if (type[0] == 'M' && v->IsObject()) {
v = rjson::find(*v, member);
} else {
v = nullptr;
}
},
[&] (unsigned index) {
if (type[0] == 'L' && v->IsArray() && index < v->Size()) {
v = &(v->GetArray()[index]);
} else {
v = nullptr;
}
}
}, op);
if (!v) {
return rjson::null_value();
}
}
return rjson::copy(*v);
}
// Given a parsed::value, which can refer either to a constant value from
// ExpressionAttributeValues, to the value of some attribute, or to a function
// of other values, this function calculates the resulting value.
// "caller" determines which expression - ConditionExpression or
// UpdateExpression - is asking for this value. We need to know this because
// DynamoDB allows a different choice of functions for different expressions.
rjson::value calculate_value(const parsed::value& v,
calculate_value_caller caller,
const rjson::value* previous_item) {
return std::visit(overloaded_functor {
[&] (const parsed::constant& c) -> rjson::value {
return rjson::copy(calculate_value(c));
},
[&] (const parsed::value::function_call& f) -> rjson::value {
auto function_it = function_handlers.find(std::string_view(f._function_name));
if (function_it == function_handlers.end()) {
throw api_error::validation(
format("{}: unknown function '{}' called.", caller, f._function_name));
}
return function_it->second(caller, previous_item, f);
},
[&] (const parsed::path& p) -> rjson::value {
return extract_path(previous_item, p, caller);
}
}, v._value);
}
// Same as calculate_value() above, except takes a set_rhs, which may be
// either a single value, or v1+v2 or v1-v2.
rjson::value calculate_value(const parsed::set_rhs& rhs,
const rjson::value* previous_item) {
switch (rhs._op) {
case 'v':
return calculate_value(rhs._v1, calculate_value_caller::UpdateExpression, previous_item);
case '+': {
rjson::value v1 = calculate_value(rhs._v1, calculate_value_caller::UpdateExpression, previous_item);
rjson::value v2 = calculate_value(rhs._v2, calculate_value_caller::UpdateExpression, previous_item);
return number_add(v1, v2);
}
case '-': {
rjson::value v1 = calculate_value(rhs._v1, calculate_value_caller::UpdateExpression, previous_item);
rjson::value v2 = calculate_value(rhs._v2, calculate_value_caller::UpdateExpression, previous_item);
return number_subtract(v1, v2);
}
}
// Can't happen
return rjson::null_value();
}
} // namespace alternator

View File

@@ -1,265 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*
* This file is part of Scylla. See the LICENSE.PROPRIETARY file in the
* top-level directory for licensing information.
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
/*
* The DynamoDB protocol is based on JSON, and most DynamoDB requests
* describe the operation and its parameters via JSON objects such as maps
* and lists. Nevertheless, in some types of requests an "expression" is
* passed as a single string, and we need to parse this string. These
* cases include:
* 1. Attribute paths, such as "a[3].b.c", are used in projection
* expressions as well as inside other expressions described below.
* 2. Condition expressions, such as "(NOT (a=b OR c=d)) AND e=f",
* used in conditional updates, filters, and other places.
* 3. Update expressions, such as "SET #a.b = :x, c = :y DELETE d"
*
* All these expression syntaxes are very simple: Most of them could be
* parsed as regular expressions, and the parenthesized condition expression
* could be done with a simple hand-written lexical analyzer and recursive-
* descent parser. Nevertheless, we decided to specify these parsers in the
* ANTLR3 language already used in the Scylla project, hopefully making these
* parsers easier to reason about, and easier to change if needed - and
* reducing the amount of boiler-plate code.
*/
grammar expressions;
options {
language = Cpp;
}
@parser::namespace{alternator}
@lexer::namespace{alternator}
/* TODO: explain what these traits things are. I haven't seen them explained
* in any document... Compilation fails without these fail because a definition
* of "expressionsLexerTraits" and "expressionParserTraits" is needed.
*/
@lexer::traits {
class expressionsLexer;
class expressionsParser;
typedef antlr3::Traits<expressionsLexer, expressionsParser> expressionsLexerTraits;
}
@parser::traits {
typedef expressionsLexerTraits expressionsParserTraits;
}
@lexer::header {
#include "alternator/expressions.hh"
// ANTLR generates a bunch of unused variables and functions. Yuck...
#pragma GCC diagnostic ignored "-Wunused-variable"
#pragma GCC diagnostic ignored "-Wunused-function"
}
@parser::header {
#include "expressionsLexer.hpp"
}
/* By default, ANTLR3 composes elaborate syntax-error messages, saying which
* token was unexpected, where, and so on on, but then dutifully writes these
* error messages to the standard error, and returns from the parser as if
* everything was fine, with a half-constructed output object! If we define
* the "displayRecognitionError" method, it will be called upon to build this
* error message, and we can instead throw an exception to stop the parsing
* immediately. This is good enough for now, for our simple needs, but if
* we ever want to show more information about the syntax error, Cql3.g
* contains an elaborate implementation (it would be nice if we could reuse
* it, not duplicate it).
* Unfortunately, we have to repeat the same definition twice - once for the
* parser, and once for the lexer.
*/
@parser::context {
void displayRecognitionError(ANTLR_UINT8** token_names, ExceptionBaseType* ex) {
throw expressions_syntax_error("syntax error");
}
}
@lexer::context {
void displayRecognitionError(ANTLR_UINT8** token_names, ExceptionBaseType* ex) {
throw expressions_syntax_error("syntax error");
}
}
/*
* Lexical analysis phase, i.e., splitting the input up to tokens.
* Lexical analyzer rules have names starting in capital letters.
* "fragment" rules do not generate tokens, and are just aliases used to
* make other rules more readable.
* Characters *not* listed here, e.g., '=', '(', etc., will be handled
* as individual tokens on their own right.
* Whitespace spans are skipped, so do not generate tokens.
*/
WHITESPACE: (' ' | '\t' | '\n' | '\r')+ { skip(); };
/* shortcuts for case-insensitive keywords */
fragment A:('a'|'A');
fragment B:('b'|'B');
fragment C:('c'|'C');
fragment D:('d'|'D');
fragment E:('e'|'E');
fragment F:('f'|'F');
fragment G:('g'|'G');
fragment H:('h'|'H');
fragment I:('i'|'I');
fragment J:('j'|'J');
fragment K:('k'|'K');
fragment L:('l'|'L');
fragment M:('m'|'M');
fragment N:('n'|'N');
fragment O:('o'|'O');
fragment P:('p'|'P');
fragment Q:('q'|'Q');
fragment R:('r'|'R');
fragment S:('s'|'S');
fragment T:('t'|'T');
fragment U:('u'|'U');
fragment V:('v'|'V');
fragment W:('w'|'W');
fragment X:('x'|'X');
fragment Y:('y'|'Y');
fragment Z:('z'|'Z');
/* These keywords must be appear before the generic NAME token below,
* because NAME matches too, and the first to match wins.
*/
SET: S E T;
REMOVE: R E M O V E;
ADD: A D D;
DELETE: D E L E T E;
AND: A N D;
OR: O R;
NOT: N O T;
BETWEEN: B E T W E E N;
IN: I N;
fragment ALPHA: 'A'..'Z' | 'a'..'z';
fragment DIGIT: '0'..'9';
fragment ALNUM: ALPHA | DIGIT | '_';
INTEGER: DIGIT+;
NAME: ALPHA ALNUM*;
NAMEREF: '#' ALNUM+;
VALREF: ':' ALNUM+;
/*
* Parsing phase - parsing the string of tokens generated by the lexical
* analyzer defined above.
*/
path_component: NAME | NAMEREF;
path returns [parsed::path p]:
root=path_component { $p.set_root($root.text); }
( '.' name=path_component { $p.add_dot($name.text); }
| '[' INTEGER ']' { $p.add_index(std::stoi($INTEGER.text)); }
)*;
value returns [parsed::value v]:
VALREF { $v.set_valref($VALREF.text); }
| path { $v.set_path($path.p); }
| NAME { $v.set_func_name($NAME.text); }
'(' x=value { $v.add_func_parameter($x.v); }
(',' x=value { $v.add_func_parameter($x.v); })*
')'
;
update_expression_set_rhs returns [parsed::set_rhs rhs]:
v=value { $rhs.set_value(std::move($v.v)); }
( '+' v=value { $rhs.set_plus(std::move($v.v)); }
| '-' v=value { $rhs.set_minus(std::move($v.v)); }
)?
;
update_expression_set_action returns [parsed::update_expression::action a]:
path '=' rhs=update_expression_set_rhs { $a.assign_set($path.p, $rhs.rhs); };
update_expression_remove_action returns [parsed::update_expression::action a]:
path { $a.assign_remove($path.p); };
update_expression_add_action returns [parsed::update_expression::action a]:
path VALREF { $a.assign_add($path.p, $VALREF.text); };
update_expression_delete_action returns [parsed::update_expression::action a]:
path VALREF { $a.assign_del($path.p, $VALREF.text); };
update_expression_clause returns [parsed::update_expression e]:
SET s=update_expression_set_action { $e.add(s); }
(',' s=update_expression_set_action { $e.add(s); })*
| REMOVE r=update_expression_remove_action { $e.add(r); }
(',' r=update_expression_remove_action { $e.add(r); })*
| ADD a=update_expression_add_action { $e.add(a); }
(',' a=update_expression_add_action { $e.add(a); })*
| DELETE d=update_expression_delete_action { $e.add(d); }
(',' d=update_expression_delete_action { $e.add(d); })*
;
// Note the "EOF" token at the end of the update expression. We want to the
// parser to match the entire string given to it - not just its beginning!
update_expression returns [parsed::update_expression e]:
(update_expression_clause { e.append($update_expression_clause.e); })* EOF;
projection_expression returns [std::vector<parsed::path> v]:
p=path { $v.push_back(std::move($p.p)); }
(',' p=path { $v.push_back(std::move($p.p)); } )* EOF;
primitive_condition returns [parsed::primitive_condition c]:
v=value { $c.add_value(std::move($v.v));
$c.set_operator(parsed::primitive_condition::type::VALUE); }
( ( '=' { $c.set_operator(parsed::primitive_condition::type::EQ); }
| '<' '>' { $c.set_operator(parsed::primitive_condition::type::NE); }
| '<' { $c.set_operator(parsed::primitive_condition::type::LT); }
| '<' '=' { $c.set_operator(parsed::primitive_condition::type::LE); }
| '>' { $c.set_operator(parsed::primitive_condition::type::GT); }
| '>' '=' { $c.set_operator(parsed::primitive_condition::type::GE); }
)
v=value { $c.add_value(std::move($v.v)); }
| BETWEEN { $c.set_operator(parsed::primitive_condition::type::BETWEEN); }
v=value { $c.add_value(std::move($v.v)); }
AND
v=value { $c.add_value(std::move($v.v)); }
| IN '(' { $c.set_operator(parsed::primitive_condition::type::IN); }
v=value { $c.add_value(std::move($v.v)); }
(',' v=value { $c.add_value(std::move($v.v)); })*
')'
)?
;
// The following rules for parsing boolean expressions are verbose and
// somewhat strange because of Antlr 3's limitations on recursive rules,
// common rule prefixes, and (lack of) support for operator precedence.
// These rules could have been written more clearly using a more powerful
// parser generator - such as Yacc.
boolean_expression returns [parsed::condition_expression e]:
b=boolean_expression_1 { $e.append(std::move($b.e), '|'); }
(OR b=boolean_expression_1 { $e.append(std::move($b.e), '|'); } )*
;
boolean_expression_1 returns [parsed::condition_expression e]:
b=boolean_expression_2 { $e.append(std::move($b.e), '&'); }
(AND b=boolean_expression_2 { $e.append(std::move($b.e), '&'); } )*
;
boolean_expression_2 returns [parsed::condition_expression e]:
p=primitive_condition { $e.set_primitive(std::move($p.c)); }
| NOT b=boolean_expression_2 { $e = std::move($b.e); $e.apply_not(); }
| '(' b=boolean_expression ')' { $e = std::move($b.e); }
;
condition_expression returns [parsed::condition_expression e]:
boolean_expression { e=std::move($boolean_expression.e); } EOF;

View File

@@ -1,102 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <string>
#include <stdexcept>
#include <vector>
#include <unordered_set>
#include <string_view>
#include <seastar/util/noncopyable_function.hh>
#include "expressions_types.hh"
#include "utils/rjson.hh"
namespace alternator {
class expressions_syntax_error : public std::runtime_error {
public:
using runtime_error::runtime_error;
};
parsed::update_expression parse_update_expression(std::string query);
std::vector<parsed::path> parse_projection_expression(std::string query);
parsed::condition_expression parse_condition_expression(std::string query);
void resolve_update_expression(parsed::update_expression& ue,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values);
void resolve_projection_expression(std::vector<parsed::path>& pe,
const rjson::value* expression_attribute_names,
std::unordered_set<std::string>& used_attribute_names);
void resolve_condition_expression(parsed::condition_expression& ce,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values);
void validate_value(const rjson::value& v, const char* caller);
bool condition_expression_on(const parsed::condition_expression& ce, std::string_view attribute);
// for_condition_expression_on() runs the given function on the attributes
// that the expression uses. It may run for the same attribute more than once
// if the same attribute is used more than once in the expression.
void for_condition_expression_on(const parsed::condition_expression& ce, const noncopyable_function<void(std::string_view)>& func);
// calculate_value() behaves slightly different (especially, different
// functions supported) when used in different types of expressions, as
// enumerated in this enum:
enum class calculate_value_caller {
UpdateExpression, ConditionExpression, ConditionExpressionAlone
};
inline std::ostream& operator<<(std::ostream& out, calculate_value_caller caller) {
switch (caller) {
case calculate_value_caller::UpdateExpression:
out << "UpdateExpression";
break;
case calculate_value_caller::ConditionExpression:
out << "ConditionExpression";
break;
case calculate_value_caller::ConditionExpressionAlone:
out << "ConditionExpression";
break;
default:
out << "unknown type of expression";
break;
}
return out;
}
rjson::value calculate_value(const parsed::value& v,
calculate_value_caller caller,
const rjson::value* previous_item);
rjson::value calculate_value(const parsed::set_rhs& rhs,
const rjson::value* previous_item);
} /* namespace alternator */

View File

@@ -1,270 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <vector>
#include <string>
#include <variant>
#include <seastar/core/shared_ptr.hh>
#include "utils/rjson.hh"
/*
* Parsed representation of expressions and their components.
*
* Types in alternator::parse namespace are used for holding the parse
* tree - objects generated by the Antlr rules after parsing an expression.
* Because of the way Antlr works, all these objects are default-constructed
* first, and then assigned when the rule is completed, so all these types
* have only default constructors - but setter functions to set them later.
*/
namespace alternator {
namespace parsed {
// "path" is an attribute's path in a document, e.g., a.b[3].c.
class path {
// All paths have a "root", a top-level attribute, and any number of
// "dereference operators" - each either an index (e.g., "[2]") or a
// dot (e.g., ".xyz").
std::string _root;
std::vector<std::variant<std::string, unsigned>> _operators;
// It is useful to limit the depth of a user-specified path, because is
// allows us to use recursive algorithms without worrying about recursion
// depth. DynamoDB officially limits the length of paths to 32 components
// (including the root) so let's use the same limit.
static constexpr unsigned depth_limit = 32;
void check_depth_limit();
public:
void set_root(std::string root) {
_root = std::move(root);
}
void add_index(unsigned i) {
_operators.emplace_back(i);
check_depth_limit();
}
void add_dot(std::string(name)) {
_operators.emplace_back(std::move(name));
check_depth_limit();
}
const std::string& root() const {
return _root;
}
bool has_operators() const {
return !_operators.empty();
}
const std::vector<std::variant<std::string, unsigned>>& operators() const {
return _operators;
}
std::vector<std::variant<std::string, unsigned>>& operators() {
return _operators;
}
friend std::ostream& operator<<(std::ostream&, const path&);
};
// When an expression is first parsed, all constants are references, like
// ":val1", into ExpressionAttributeValues. This uses std::string() variant.
// The resolve_value() function replaces these constants by the JSON item
// extracted from the ExpressionAttributeValues.
struct constant {
// We use lw_shared_ptr<rjson::value> just to make rjson::value copyable,
// to make this entire object copyable as ANTLR needs.
using literal = lw_shared_ptr<rjson::value>;
std::variant<std::string, literal> _value;
void set(const rjson::value& v) {
_value = make_lw_shared<rjson::value>(rjson::copy(v));
}
void set(std::string& s) {
_value = s;
}
};
// "value" is is a value used in the right hand side of an assignment
// expression, "SET a = ...". It can be a constant (a reference to a value
// included in the request, e.g., ":val"), a path to an attribute from the
// existing item (e.g., "a.b[3].c"), or a function of other such values.
// Note that the real right-hand-side of an assignment is actually a bit
// more general - it allows either a value, or a value+value or value-value -
// see class set_rhs below.
struct value {
struct function_call {
std::string _function_name;
std::vector<value> _parameters;
};
std::variant<constant, path, function_call> _value;
void set_constant(constant c) {
_value = std::move(c);
}
void set_valref(std::string s) {
_value = constant { std::move(s) };
}
void set_path(path p) {
_value = std::move(p);
}
void set_func_name(std::string s) {
_value = function_call {std::move(s), {}};
}
void add_func_parameter(value v) {
std::get<function_call>(_value)._parameters.emplace_back(std::move(v));
}
bool is_constant() const {
return std::holds_alternative<constant>(_value);
}
bool is_path() const {
return std::holds_alternative<path>(_value);
}
bool is_func() const {
return std::holds_alternative<function_call>(_value);
}
};
// The right-hand-side of a SET in an update expression can be either a
// single value (see above), or value+value, or value-value.
class set_rhs {
public:
char _op; // '+', '-', or 'v''
value _v1;
value _v2;
void set_value(value&& v1) {
_op = 'v';
_v1 = std::move(v1);
}
void set_plus(value&& v2) {
_op = '+';
_v2 = std::move(v2);
}
void set_minus(value&& v2) {
_op = '-';
_v2 = std::move(v2);
}
};
class update_expression {
public:
struct action {
path _path;
struct set {
set_rhs _rhs;
};
struct remove {
};
struct add {
constant _valref;
};
struct del {
constant _valref;
};
std::variant<set, remove, add, del> _action;
void assign_set(path p, set_rhs rhs) {
_path = std::move(p);
_action = set { std::move(rhs) };
}
void assign_remove(path p) {
_path = std::move(p);
_action = remove { };
}
void assign_add(path p, std::string v) {
_path = std::move(p);
_action = add { constant { std::move(v) } };
}
void assign_del(path p, std::string v) {
_path = std::move(p);
_action = del { constant { std::move(v) } };
}
};
private:
std::vector<action> _actions;
bool seen_set = false;
bool seen_remove = false;
bool seen_add = false;
bool seen_del = false;
public:
void add(action a);
void append(update_expression other);
bool empty() const {
return _actions.empty();
}
const std::vector<action>& actions() const {
return _actions;
}
std::vector<action>& actions() {
return _actions;
}
};
// A primitive_condition is a condition expression involving one condition,
// while the full condition_expression below adds boolean logic over these
// primitive conditions.
// The supported primitive conditions are:
// 1. Binary operators - v1 OP v2, where OP is =, <>, <, <=, >, or >= and
// v1 and v2 are values - from the item (an attribute path), the query
// (a ":val" reference), or a function of the the above (only the size()
// function is supported).
// 2. Ternary operator - v1 BETWEEN v2 and v3 (means v1 >= v2 AND v1 <= v3).
// 3. N-ary operator - v1 IN ( v2, v3, ... )
// 4. A single function call (attribute_exists etc.). The parser actually
// accepts a more general "value" here but later stages reject a value
// which is not a function call (because DynamoDB does it too).
class primitive_condition {
public:
enum class type {
UNDEFINED, VALUE, EQ, NE, LT, LE, GT, GE, BETWEEN, IN
};
type _op = type::UNDEFINED;
std::vector<value> _values;
void set_operator(type op) {
_op = op;
}
void add_value(value&& v) {
_values.push_back(std::move(v));
}
bool empty() const {
return _op == type::UNDEFINED;
}
};
class condition_expression {
public:
bool _negated = false; // If true, the entire condition is negated
struct condition_list {
char op = '|'; // '&' or '|'
std::vector<condition_expression> conditions;
};
std::variant<primitive_condition, condition_list> _expression = condition_list();
void set_primitive(primitive_condition&& p) {
_expression = std::move(p);
}
void append(condition_expression&& c, char op);
void apply_not() {
_negated = !_negated;
}
bool empty() const {
return std::holds_alternative<condition_list>(_expression) &&
std::get<condition_list>(_expression).conditions.empty();
}
};
} // namespace parsed
} // namespace alternator

View File

@@ -1,128 +0,0 @@
/*
* Copyright 2020 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "seastarx.hh"
#include "service/storage_proxy.hh"
#include "service/storage_proxy.hh"
#include "utils/rjson.hh"
#include "executor.hh"
namespace alternator {
// An rmw_operation encapsulates the common logic of all the item update
// operations which may involve a read of the item before the write
// (so-called Read-Modify-Write operations). These operations include PutItem,
// UpdateItem and DeleteItem: All of these may be conditional operations (the
// "Expected" parameter) which requir a read before the write, and UpdateItem
// may also have an update expression which refers to the item's old value.
//
// The code below supports running the read and the write together as one
// transaction using LWT (this is why rmw_operation is a subclass of
// cas_request, as required by storage_proxy::cas()), but also has optional
// modes not using LWT.
class rmw_operation : public service::cas_request, public enable_shared_from_this<rmw_operation> {
public:
// The following options choose which mechanism to use for isolating
// parallel write operations:
// * The FORBID_RMW option forbids RMW (read-modify-write) operations
// such as conditional updates. For the remaining write-only
// operations, ordinary quorum writes are isolated enough.
// * The LWT_ALWAYS option always uses LWT (lightweight transactions)
// for any write operation - whether or not it also has a read.
// * The LWT_RMW_ONLY option uses LWT only for RMW operations, and uses
// ordinary quorum writes for write-only operations.
// This option is not safe if the user may send both RMW and write-only
// operations on the same item.
// * The UNSAFE_RMW option does read-modify-write operations as separate
// read and write. It is unsafe - concurrent RMW operations are not
// isolated at all. This option will likely be removed in the future.
enum class write_isolation {
FORBID_RMW, LWT_ALWAYS, LWT_RMW_ONLY, UNSAFE_RMW
};
static constexpr auto WRITE_ISOLATION_TAG_KEY = "system:write_isolation";
static write_isolation get_write_isolation_for_schema(schema_ptr schema);
static write_isolation default_write_isolation;
public:
static void set_default_write_isolation(std::string_view mode);
protected:
// The full request JSON
rjson::value _request;
// All RMW operations involve a single item with a specific partition
// and optional clustering key, in a single table, so the following
// information is common to all of them:
schema_ptr _schema;
partition_key _pk = partition_key::make_empty();
clustering_key _ck = clustering_key::make_empty();
write_isolation _write_isolation;
// All RMW operations can have a ReturnValues parameter from the following
// choices. But note that only UpdateItem actually supports all of them:
enum class returnvalues {
NONE, ALL_OLD, UPDATED_OLD, ALL_NEW, UPDATED_NEW
} _returnvalues;
static returnvalues parse_returnvalues(const rjson::value& request);
// When _returnvalues != NONE, apply() should store here, in JSON form,
// the values which are to be returned in the "Attributes" field.
// The default null JSON means do not return an Attributes field at all.
// This field is marked "mutable" so that the const apply() can modify
// it (see explanation below), but note that because apply() may be
// called more than once, if apply() will sometimes set this field it
// must set it (even if just to the default empty value) every time.
mutable rjson::value _return_attributes;
public:
// The constructor of a rmw_operation subclass should parse the request
// and try to discover as many input errors as it can before really
// attempting the read or write operations.
rmw_operation(service::storage_proxy& proxy, rjson::value&& request);
// rmw_operation subclasses (update_item_operation, put_item_operation
// and delete_item_operation) shall implement an apply() function which
// takes the previous value of the item (if it was read) and creates the
// write mutation. If the previous value of item does not pass the needed
// conditional expression, apply() should return an empty optional.
// apply() may throw if it encounters input errors not discovered during
// the constructor.
// apply() may be called more than once in case of contention, so it must
// not change the state saved in the object (issue #7218 was caused by
// violating this). We mark apply() "const" to let the compiler validate
// this for us. The output-only field _return_attributes is marked
// "mutable" above so that apply() can still write to it.
virtual std::optional<mutation> apply(std::unique_ptr<rjson::value> previous_item, api::timestamp_type ts) const = 0;
// Convert the above apply() into the signature needed by cas_request:
virtual std::optional<mutation> apply(foreign_ptr<lw_shared_ptr<query::result>> qr, const query::partition_slice& slice, api::timestamp_type ts) override;
virtual ~rmw_operation() = default;
schema_ptr schema() const { return _schema; }
const rjson::value& request() const { return _request; }
rjson::value&& move_request() && { return std::move(_request); }
future<executor::request_return_type> execute(service::storage_proxy& proxy,
service::client_state& client_state,
tracing::trace_state_ptr trace_state,
service_permit permit,
bool needs_read_before_write,
stats& stats);
std::optional<shard_id> shard_for_execute(bool needs_read_before_write);
};
} // namespace alternator

View File

@@ -1,375 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "base64.hh"
#include "log.hh"
#include "serialization.hh"
#include "error.hh"
#include "rapidjson/writer.h"
#include "concrete_types.hh"
#include "cql3/type_json.hh"
static logging::logger slogger("alternator-serialization");
namespace alternator {
type_info type_info_from_string(std::string_view type) {
static thread_local const std::unordered_map<std::string_view, type_info> type_infos = {
{"S", {alternator_type::S, utf8_type}},
{"B", {alternator_type::B, bytes_type}},
{"BOOL", {alternator_type::BOOL, boolean_type}},
{"N", {alternator_type::N, decimal_type}}, //FIXME: Replace with custom Alternator type when implemented
};
auto it = type_infos.find(type);
if (it == type_infos.end()) {
return {alternator_type::NOT_SUPPORTED_YET, utf8_type};
}
return it->second;
}
type_representation represent_type(alternator_type atype) {
static thread_local const std::unordered_map<alternator_type, type_representation> type_representations = {
{alternator_type::S, {"S", utf8_type}},
{alternator_type::B, {"B", bytes_type}},
{alternator_type::BOOL, {"BOOL", boolean_type}},
{alternator_type::N, {"N", decimal_type}}, //FIXME: Replace with custom Alternator type when implemented
};
auto it = type_representations.find(atype);
if (it == type_representations.end()) {
throw std::runtime_error(format("Unknown alternator type {}", int8_t(atype)));
}
return it->second;
}
struct from_json_visitor {
const rjson::value& v;
bytes_ostream& bo;
void operator()(const reversed_type_impl& t) const { visit(*t.underlying_type(), from_json_visitor{v, bo}); };
void operator()(const string_type_impl& t) {
bo.write(t.from_string(rjson::to_string_view(v)));
}
void operator()(const bytes_type_impl& t) const {
bo.write(base64_decode(v));
}
void operator()(const boolean_type_impl& t) const {
bo.write(boolean_type->decompose(v.GetBool()));
}
void operator()(const decimal_type_impl& t) const {
try {
bo.write(t.from_string(rjson::to_string_view(v)));
} catch (const marshal_exception& e) {
throw api_error::validation(format("The parameter cannot be converted to a numeric value: {}", v));
}
}
// default
void operator()(const abstract_type& t) const {
bo.write(from_json_object(t, v, cql_serialization_format::internal()));
}
};
bytes serialize_item(const rjson::value& item) {
if (item.IsNull() || item.MemberCount() != 1) {
throw api_error::validation(format("An item can contain only one attribute definition: {}", item));
}
auto it = item.MemberBegin();
type_info type_info = type_info_from_string(rjson::to_string_view(it->name)); // JSON keys are guaranteed to be strings
if (type_info.atype == alternator_type::NOT_SUPPORTED_YET) {
slogger.trace("Non-optimal serialization of type {}", it->name);
return bytes{int8_t(type_info.atype)} + to_bytes(rjson::print(item));
}
bytes_ostream bo;
bo.write(bytes{int8_t(type_info.atype)});
visit(*type_info.dtype, from_json_visitor{it->value, bo});
return bytes(bo.linearize());
}
struct to_json_visitor {
rjson::value& deserialized;
const std::string& type_ident;
bytes_view bv;
void operator()(const reversed_type_impl& t) const { visit(*t.underlying_type(), to_json_visitor{deserialized, type_ident, bv}); };
void operator()(const decimal_type_impl& t) const {
auto s = to_json_string(*decimal_type, bytes(bv));
//FIXME(sarna): unnecessary copy
rjson::set_with_string_name(deserialized, type_ident, rjson::from_string(s));
}
void operator()(const string_type_impl& t) {
rjson::set_with_string_name(deserialized, type_ident, rjson::from_string(reinterpret_cast<const char *>(bv.data()), bv.size()));
}
void operator()(const bytes_type_impl& t) const {
std::string b64 = base64_encode(bv);
rjson::set_with_string_name(deserialized, type_ident, rjson::from_string(b64));
}
// default
void operator()(const abstract_type& t) const {
rjson::set_with_string_name(deserialized, type_ident, rjson::parse(to_json_string(t, bytes(bv))));
}
};
rjson::value deserialize_item(bytes_view bv) {
rjson::value deserialized(rapidjson::kObjectType);
if (bv.empty()) {
throw api_error::validation("Serialized value empty");
}
alternator_type atype = alternator_type(bv[0]);
bv.remove_prefix(1);
if (atype == alternator_type::NOT_SUPPORTED_YET) {
slogger.trace("Non-optimal deserialization of alternator type {}", int8_t(atype));
return rjson::parse(std::string_view(reinterpret_cast<const char *>(bv.data()), bv.size()));
}
type_representation type_representation = represent_type(atype);
visit(*type_representation.dtype, to_json_visitor{deserialized, type_representation.ident, bv});
return deserialized;
}
std::string type_to_string(data_type type) {
static thread_local std::unordered_map<data_type, std::string> types = {
{utf8_type, "S"},
{bytes_type, "B"},
{boolean_type, "BOOL"},
{decimal_type, "N"}, // FIXME: use a specialized Alternator number type instead of the general decimal_type
};
auto it = types.find(type);
if (it == types.end()) {
// fall back to string, in order to be able to present
// internal Scylla types in a human-readable way
return "S";
}
return it->second;
}
bytes get_key_column_value(const rjson::value& item, const column_definition& column) {
std::string column_name = column.name_as_text();
const rjson::value* key_typed_value = rjson::find(item, column_name);
if (!key_typed_value) {
throw api_error::validation(format("Key column {} not found", column_name));
}
return get_key_from_typed_value(*key_typed_value, column);
}
// Parses the JSON encoding for a key value, which is a map with a single
// entry, whose key is the type (expected to match the key column's type)
// and the value is the encoded value.
bytes get_key_from_typed_value(const rjson::value& key_typed_value, const column_definition& column) {
if (!key_typed_value.IsObject() || key_typed_value.MemberCount() != 1 ||
!key_typed_value.MemberBegin()->value.IsString()) {
throw api_error::validation(
format("Malformed value object for key column {}: {}",
column.name_as_text(), key_typed_value));
}
auto it = key_typed_value.MemberBegin();
if (it->name != type_to_string(column.type)) {
throw api_error::validation(
format("Type mismatch: expected type {} for key column {}, got type {}",
type_to_string(column.type), column.name_as_text(), it->name));
}
std::string_view value_view = rjson::to_string_view(it->value);
if (value_view.empty()) {
throw api_error::validation(
format("The AttributeValue for a key attribute cannot contain an empty string value. Key: {}", column.name_as_text()));
}
if (column.type == bytes_type) {
return base64_decode(it->value);
} else {
return column.type->from_string(rjson::to_string_view(it->value));
}
}
rjson::value json_key_column_value(bytes_view cell, const column_definition& column) {
if (column.type == bytes_type) {
std::string b64 = base64_encode(cell);
return rjson::from_string(b64);
} if (column.type == utf8_type) {
return rjson::from_string(std::string(reinterpret_cast<const char*>(cell.data()), cell.size()));
} else if (column.type == decimal_type) {
// FIXME: use specialized Alternator number type, not the more
// general "decimal_type". A dedicated type can be more efficient
// in storage space and in parsing speed.
auto s = to_json_string(*decimal_type, bytes(cell));
return rjson::from_string(s);
} else {
// Support for arbitrary key types is useful for parsing values of virtual tables,
// which can involve any type supported by Scylla.
// In order to guarantee that the returned type is parsable by alternator clients,
// they are represented simply as strings.
return rjson::from_string(column.type->to_string(bytes(cell)));
}
}
partition_key pk_from_json(const rjson::value& item, schema_ptr schema) {
std::vector<bytes> raw_pk;
// FIXME: this is a loop, but we really allow only one partition key column.
for (const column_definition& cdef : schema->partition_key_columns()) {
bytes raw_value = get_key_column_value(item, cdef);
raw_pk.push_back(std::move(raw_value));
}
return partition_key::from_exploded(raw_pk);
}
clustering_key ck_from_json(const rjson::value& item, schema_ptr schema) {
if (schema->clustering_key_size() == 0) {
return clustering_key::make_empty();
}
std::vector<bytes> raw_ck;
// FIXME: this is a loop, but we really allow only one clustering key column.
for (const column_definition& cdef : schema->clustering_key_columns()) {
bytes raw_value = get_key_column_value(item, cdef);
raw_ck.push_back(std::move(raw_value));
}
return clustering_key::from_exploded(raw_ck);
}
big_decimal unwrap_number(const rjson::value& v, std::string_view diagnostic) {
if (!v.IsObject() || v.MemberCount() != 1) {
throw api_error::validation(format("{}: invalid number object", diagnostic));
}
auto it = v.MemberBegin();
if (it->name != "N") {
throw api_error::validation(format("{}: expected number, found type '{}'", diagnostic, it->name));
}
try {
if (it->value.IsNumber()) {
// FIXME(sarna): should use big_decimal constructor with numeric values directly:
return big_decimal(rjson::print(it->value));
}
if (!it->value.IsString()) {
throw api_error::validation(format("{}: improperly formatted number constant", diagnostic));
}
return big_decimal(rjson::to_string_view(it->value));
} catch (const marshal_exception& e) {
throw api_error::validation(format("The parameter cannot be converted to a numeric value: {}", it->value));
}
}
const std::pair<std::string, const rjson::value*> unwrap_set(const rjson::value& v) {
if (!v.IsObject() || v.MemberCount() != 1) {
return {"", nullptr};
}
auto it = v.MemberBegin();
const std::string it_key = it->name.GetString();
if (it_key != "SS" && it_key != "BS" && it_key != "NS") {
return {"", nullptr};
}
return std::make_pair(it_key, &(it->value));
}
const rjson::value* unwrap_list(const rjson::value& v) {
if (!v.IsObject() || v.MemberCount() != 1) {
return nullptr;
}
auto it = v.MemberBegin();
if (it->name != std::string("L")) {
return nullptr;
}
return &(it->value);
}
// Take two JSON-encoded numeric values ({"N": "thenumber"}) and return the
// sum, again as a JSON-encoded number.
rjson::value number_add(const rjson::value& v1, const rjson::value& v2) {
auto n1 = unwrap_number(v1, "UpdateExpression");
auto n2 = unwrap_number(v2, "UpdateExpression");
rjson::value ret = rjson::empty_object();
std::string str_ret = std::string((n1 + n2).to_string());
rjson::set(ret, "N", rjson::from_string(str_ret));
return ret;
}
rjson::value number_subtract(const rjson::value& v1, const rjson::value& v2) {
auto n1 = unwrap_number(v1, "UpdateExpression");
auto n2 = unwrap_number(v2, "UpdateExpression");
rjson::value ret = rjson::empty_object();
std::string str_ret = std::string((n1 - n2).to_string());
rjson::set(ret, "N", rjson::from_string(str_ret));
return ret;
}
// Take two JSON-encoded set values (e.g. {"SS": [...the actual set]}) and
// return the sum of both sets, again as a set value.
rjson::value set_sum(const rjson::value& v1, const rjson::value& v2) {
auto [set1_type, set1] = unwrap_set(v1);
auto [set2_type, set2] = unwrap_set(v2);
if (set1_type != set2_type) {
throw api_error::validation(format("Mismatched set types: {} and {}", set1_type, set2_type));
}
if (!set1 || !set2) {
throw api_error::validation("UpdateExpression: ADD operation for sets must be given sets as arguments");
}
rjson::value sum = rjson::copy(*set1);
std::set<rjson::value, rjson::single_value_comp> set1_raw;
for (auto it = sum.Begin(); it != sum.End(); ++it) {
set1_raw.insert(rjson::copy(*it));
}
for (const auto& a : set2->GetArray()) {
if (!set1_raw.contains(a)) {
rjson::push_back(sum, rjson::copy(a));
}
}
rjson::value ret = rjson::empty_object();
rjson::set_with_string_name(ret, set1_type, std::move(sum));
return ret;
}
// Take two JSON-encoded set values (e.g. {"SS": [...the actual list]}) and
// return the difference of s1 - s2, again as a set value.
// DynamoDB does not allow empty sets, so if resulting set is empty, return
// an unset optional instead.
std::optional<rjson::value> set_diff(const rjson::value& v1, const rjson::value& v2) {
auto [set1_type, set1] = unwrap_set(v1);
auto [set2_type, set2] = unwrap_set(v2);
if (set1_type != set2_type) {
throw api_error::validation(format("Mismatched set types: {} and {}", set1_type, set2_type));
}
if (!set1 || !set2) {
throw api_error::validation("UpdateExpression: DELETE operation can only be performed on a set");
}
std::set<rjson::value, rjson::single_value_comp> set1_raw;
for (auto it = set1->Begin(); it != set1->End(); ++it) {
set1_raw.insert(rjson::copy(*it));
}
for (const auto& a : set2->GetArray()) {
set1_raw.erase(a);
}
if (set1_raw.empty()) {
return std::nullopt;
}
rjson::value ret = rjson::empty_object();
rjson::set_with_string_name(ret, set1_type, rjson::empty_array());
rjson::value& result_set = ret[set1_type];
for (const auto& a : set1_raw) {
rjson::push_back(result_set, rjson::copy(a));
}
return ret;
}
}

View File

@@ -1,89 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <string>
#include <string_view>
#include "types.hh"
#include "schema_fwd.hh"
#include "keys.hh"
#include "utils/rjson.hh"
#include "utils/big_decimal.hh"
namespace alternator {
enum class alternator_type : int8_t {
S, B, BOOL, N, NOT_SUPPORTED_YET
};
struct type_info {
alternator_type atype;
data_type dtype;
};
struct type_representation {
std::string ident;
data_type dtype;
};
type_info type_info_from_string(std::string_view type);
type_representation represent_type(alternator_type atype);
bytes serialize_item(const rjson::value& item);
rjson::value deserialize_item(bytes_view bv);
std::string type_to_string(data_type type);
bytes get_key_column_value(const rjson::value& item, const column_definition& column);
bytes get_key_from_typed_value(const rjson::value& key_typed_value, const column_definition& column);
rjson::value json_key_column_value(bytes_view cell, const column_definition& column);
partition_key pk_from_json(const rjson::value& item, schema_ptr schema);
clustering_key ck_from_json(const rjson::value& item, schema_ptr schema);
// If v encodes a number (i.e., it is a {"N": [...]}, returns an object representing it. Otherwise,
// raises ValidationException with diagnostic.
big_decimal unwrap_number(const rjson::value& v, std::string_view diagnostic);
// Check if a given JSON object encodes a set (i.e., it is a {"SS": [...]}, or "NS", "BS"
// and returns set's type and a pointer to that set. If the object does not encode a set,
// returned value is {"", nullptr}
const std::pair<std::string, const rjson::value*> unwrap_set(const rjson::value& v);
// Check if a given JSON object encodes a list (i.e., it is a {"L": [...]}
// and returns a pointer to that list.
const rjson::value* unwrap_list(const rjson::value& v);
// Take two JSON-encoded numeric values ({"N": "thenumber"}) and return the
// sum, again as a JSON-encoded number.
rjson::value number_add(const rjson::value& v1, const rjson::value& v2);
rjson::value number_subtract(const rjson::value& v1, const rjson::value& v2);
// Take two JSON-encoded set values (e.g. {"SS": [...the actual set]}) and
// return the sum of both sets, again as a set value.
rjson::value set_sum(const rjson::value& v1, const rjson::value& v2);
// Take two JSON-encoded set values (e.g. {"SS": [...the actual list]}) and
// return the difference of s1 - s2, again as a set value.
// DynamoDB does not allow empty sets, so if resulting set is empty, return
// an unset optional instead.
std::optional<rjson::value> set_diff(const rjson::value& v1, const rjson::value& v2);
}

View File

@@ -1,617 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "alternator/server.hh"
#include "log.hh"
#include <seastar/http/function_handlers.hh>
#include <seastar/http/short_streams.hh>
#include <seastar/core/coroutine.hh>
#include <seastar/json/json_elements.hh>
#include "seastarx.hh"
#include "error.hh"
#include "utils/rjson.hh"
#include "auth.hh"
#include <cctype>
#include "cql3/query_processor.hh"
#include "service/storage_service.hh"
#include "utils/overloaded_functor.hh"
static logging::logger slogger("alternator-server");
using namespace httpd;
namespace alternator {
static constexpr auto TARGET = "X-Amz-Target";
inline std::vector<std::string_view> split(std::string_view text, char separator) {
std::vector<std::string_view> tokens;
if (text == "") {
return tokens;
}
while (true) {
auto pos = text.find_first_of(separator);
if (pos != std::string_view::npos) {
tokens.emplace_back(text.data(), pos);
text.remove_prefix(pos + 1);
} else {
tokens.emplace_back(text);
break;
}
}
return tokens;
}
// Handle CORS (Cross-origin resource sharing) in the HTTP request:
// If the request has the "Origin" header specifying where the script which
// makes this request comes from, we need to reply with the header
// "Access-Control-Allow-Origin: *" saying that this (and any) origin is fine.
// Additionally, if preflight==true (i.e., this is an OPTIONS request),
// the script can also "request" in headers that the server allows it to use
// some HTTP methods and headers in the followup request, and the server
// should respond by "allowing" them in the response headers.
// We also add the header "Access-Control-Expose-Headers" to let the script
// access additional headers in the response.
// This handle_CORS() should be used when handling any HTTP method - both the
// usual GET and POST, and also the "preflight" OPTIONS method.
static void handle_CORS(const request& req, reply& rep, bool preflight) {
if (!req.get_header("origin").empty()) {
rep.add_header("Access-Control-Allow-Origin", "*");
// This is the list that DynamoDB returns for expose headers. I am
// not sure why not just return "*" here, what's the risk?
rep.add_header("Access-Control-Expose-Headers", "x-amzn-RequestId,x-amzn-ErrorType,x-amzn-ErrorMessage,Date");
if (preflight) {
sstring s = req.get_header("Access-Control-Request-Headers");
if (!s.empty()) {
rep.add_header("Access-Control-Allow-Headers", std::move(s));
}
s = req.get_header("Access-Control-Request-Method");
if (!s.empty()) {
rep.add_header("Access-Control-Allow-Methods", std::move(s));
}
// Our CORS response never change anyway, let the browser cache it
// for two hours (Chrome's maximum):
rep.add_header("Access-Control-Max-Age", "7200");
}
}
}
// DynamoDB HTTP error responses are structured as follows
// https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html
// Our handlers throw an exception to report an error. If the exception
// is of type alternator::api_error, it unwrapped and properly reported to
// the user directly. Other exceptions are unexpected, and reported as
// Internal Server Error.
class api_handler : public handler_base {
public:
api_handler(const std::function<future<executor::request_return_type>(std::unique_ptr<request> req)>& _handle) : _f_handle(
[this, _handle](std::unique_ptr<request> req, std::unique_ptr<reply> rep) {
return seastar::futurize_invoke(_handle, std::move(req)).then_wrapped([this, rep = std::move(rep)](future<executor::request_return_type> resf) mutable {
if (resf.failed()) {
// Exceptions of type api_error are wrapped as JSON and
// returned to the client as expected. Other types of
// exceptions are unexpected, and returned to the user
// as an internal server error:
try {
resf.get();
} catch (api_error &ae) {
generate_error_reply(*rep, ae);
} catch (rjson::error & re) {
generate_error_reply(*rep,
api_error::validation(re.what()));
} catch (...) {
generate_error_reply(*rep,
api_error::internal(format("Internal server error: {}", std::current_exception())));
}
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
}
auto res = resf.get0();
std::visit(overloaded_functor {
[&] (const json::json_return_type& json_return_value) {
slogger.trace("api_handler success case");
if (json_return_value._body_writer) {
// Unfortunately, write_body() forces us to choose
// from a fixed and irrelevant list of "mime-types"
// at this point. But we'll override it with the
// one (application/x-amz-json-1.0) below.
rep->write_body("json", std::move(json_return_value._body_writer));
} else {
rep->_content += json_return_value._res;
}
},
[&] (const api_error& err) {
generate_error_reply(*rep, err);
}
}, res);
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
});
}) { }
api_handler(const api_handler&) = default;
future<std::unique_ptr<reply>> handle(const sstring& path,
std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {
handle_CORS(*req, *rep, false);
return _f_handle(std::move(req), std::move(rep)).then(
[this](std::unique_ptr<reply> rep) {
rep->set_mime_type("application/x-amz-json-1.0");
rep->done();
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
});
}
protected:
void generate_error_reply(reply& rep, const api_error& err) {
rep._content += "{\"__type\":\"com.amazonaws.dynamodb.v20120810#" + err._type + "\"," +
"\"message\":\"" + err._msg + "\"}";
rep._status = err._http_code;
slogger.trace("api_handler error case: {}", rep._content);
}
future_handler_function _f_handle;
};
class gated_handler : public handler_base {
seastar::gate& _gate;
public:
gated_handler(seastar::gate& gate) : _gate(gate) {}
virtual future<std::unique_ptr<reply>> do_handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) = 0;
virtual future<std::unique_ptr<reply>> handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) final override {
return with_gate(_gate, [this, &path, req = std::move(req), rep = std::move(rep)] () mutable {
return do_handle(path, std::move(req), std::move(rep));
});
}
};
class health_handler : public gated_handler {
public:
health_handler(seastar::gate& pending_requests) : gated_handler(pending_requests) {}
protected:
virtual future<std::unique_ptr<reply>> do_handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {
handle_CORS(*req, *rep, false);
rep->set_status(reply::status_type::ok);
rep->write_body("txt", format("healthy: {}", req->get_header("Host")));
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
}
};
class local_nodelist_handler : public gated_handler {
public:
local_nodelist_handler(seastar::gate& pending_requests) : gated_handler(pending_requests) {}
protected:
virtual future<std::unique_ptr<reply>> do_handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {
rjson::value results = rjson::empty_array();
// It's very easy to get a list of all live nodes on the cluster,
// using gms::get_local_gossiper().get_live_members(). But getting
// just the list of live nodes in this DC needs more elaborate code:
sstring local_dc = locator::i_endpoint_snitch::get_local_snitch_ptr()->get_datacenter(
utils::fb_utilities::get_broadcast_address());
std::unordered_set<gms::inet_address> local_dc_nodes =
service::get_local_storage_service().get_token_metadata().
get_topology().get_datacenter_endpoints().at(local_dc);
for (auto& ip : local_dc_nodes) {
if (gms::get_local_gossiper().is_alive(ip)) {
rjson::push_back(results, rjson::from_string(ip.to_sstring()));
}
}
rep->set_status(reply::status_type::ok);
rep->set_content_type("json");
rep->_content = rjson::print(results);
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
}
};
// The CORS (Cross-origin resource sharing) protocol can send an OPTIONS
// request before ("pre-flight") the main request. The response to this
// request can be empty, but needs to have the right headers (which we
// fill with handle_CORS())
class options_handler : public gated_handler {
public:
options_handler(seastar::gate& pending_requests) : gated_handler(pending_requests) {}
protected:
virtual future<std::unique_ptr<reply>> do_handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {
handle_CORS(*req, *rep, true);
rep->set_status(reply::status_type::ok);
rep->write_body("txt", sstring(""));
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
}
};
future<> server::verify_signature(const request& req, const chunked_content& content) {
if (!_enforce_authorization) {
slogger.debug("Skipping authorization");
return make_ready_future<>();
}
auto host_it = req._headers.find("Host");
if (host_it == req._headers.end()) {
throw api_error::invalid_signature("Host header is mandatory for signature verification");
}
auto authorization_it = req._headers.find("Authorization");
if (authorization_it == req._headers.end()) {
throw api_error::missing_authentication_token("Authorization header is mandatory for signature verification");
}
std::string host = host_it->second;
std::string_view authorization_header = authorization_it->second;
auto pos = authorization_header.find_first_of(' ');
if (pos == std::string_view::npos || authorization_header.substr(0, pos) != "AWS4-HMAC-SHA256") {
throw api_error::invalid_signature(format("Authorization header must use AWS4-HMAC-SHA256 algorithm: {}", authorization_header));
}
authorization_header.remove_prefix(pos+1);
std::string credential;
std::string user_signature;
std::string signed_headers_str;
std::vector<std::string_view> signed_headers;
do {
// Either one of a comma or space can mark the end of an entry
pos = authorization_header.find_first_of(" ,");
std::string_view entry = authorization_header.substr(0, pos);
if (pos != std::string_view::npos) {
authorization_header.remove_prefix(pos + 1);
}
if (entry.empty()) {
continue;
}
std::vector<std::string_view> entry_split = split(entry, '=');
if (entry_split.size() != 2) {
continue;
}
std::string_view auth_value = entry_split[1];
if (entry_split[0] == "Credential") {
credential = std::string(auth_value);
} else if (entry_split[0] == "Signature") {
user_signature = std::string(auth_value);
} else if (entry_split[0] == "SignedHeaders") {
signed_headers_str = std::string(auth_value);
signed_headers = split(auth_value, ';');
std::sort(signed_headers.begin(), signed_headers.end());
}
} while (pos != std::string_view::npos);
std::vector<std::string_view> credential_split = split(credential, '/');
if (credential_split.size() != 5) {
throw api_error::validation(format("Incorrect credential information format: {}", credential));
}
std::string user(credential_split[0]);
std::string datestamp(credential_split[1]);
std::string region(credential_split[2]);
std::string service(credential_split[3]);
std::map<std::string_view, std::string_view> signed_headers_map;
for (const auto& header : signed_headers) {
signed_headers_map.emplace(header, std::string_view());
}
for (auto& header : req._headers) {
std::string header_str;
header_str.resize(header.first.size());
std::transform(header.first.begin(), header.first.end(), header_str.begin(), ::tolower);
auto it = signed_headers_map.find(header_str);
if (it != signed_headers_map.end()) {
it->second = std::string_view(header.second);
}
}
auto cache_getter = [&qp = _qp] (std::string username) {
return get_key_from_roles(qp, std::move(username));
};
return _key_cache.get_ptr(user, cache_getter).then([this, &req, &content,
user = std::move(user),
host = std::move(host),
datestamp = std::move(datestamp),
signed_headers_str = std::move(signed_headers_str),
signed_headers_map = std::move(signed_headers_map),
region = std::move(region),
service = std::move(service),
user_signature = std::move(user_signature)] (key_cache::value_ptr key_ptr) {
std::string signature = get_signature(user, *key_ptr, std::string_view(host), req._method,
datestamp, signed_headers_str, signed_headers_map, content, region, service, "");
if (signature != std::string_view(user_signature)) {
_key_cache.remove(user);
throw api_error::unrecognized_client("The security token included in the request is invalid.");
}
});
}
static tracing::trace_state_ptr create_tracing_session(tracing::tracing& tracing_instance) {
tracing::trace_state_props_set props;
props.set<tracing::trace_state_props::full_tracing>();
props.set_if<tracing::trace_state_props::log_slow_query>(tracing_instance.slow_query_tracing_enabled());
return tracing_instance.create_session(tracing::trace_type::QUERY, props);
}
// truncated_content_view() prints a potentially long chunked_content for
// debugging purposes. In the common case when the content is not excessively
// long, it just returns a view into the given content, without any copying.
// But when the content is very long, it is truncated after some arbitrary
// max_len (or one chunk, whichever comes first), with "<truncated>" added at
// the end. To do this modification to the string, we need to create a new
// std::string, so the caller must pass us a reference to one, "buf", where
// we can store the content. The returned view is only alive for as long this
// buf is kept alive.
static std::string_view truncated_content_view(const chunked_content& content, std::string& buf) {
constexpr size_t max_len = 1024;
if (content.empty()) {
return std::string_view();
} else if (content.size() == 1 && content.begin()->size() <= max_len) {
return std::string_view(content.begin()->get(), content.begin()->size());
} else {
buf = std::string(content.begin()->get(), std::min(content.begin()->size(), max_len)) + "<truncated>";
return std::string_view(buf);
}
}
static tracing::trace_state_ptr maybe_trace_query(service::client_state& client_state, sstring_view op, const chunked_content& query) {
tracing::trace_state_ptr trace_state;
tracing::tracing& tracing_instance = tracing::tracing::get_local_tracing_instance();
if (tracing_instance.trace_next_query() || tracing_instance.slow_query_tracing_enabled()) {
trace_state = create_tracing_session(tracing_instance);
std::string buf;
tracing::add_session_param(trace_state, "alternator_op", op);
tracing::add_query(trace_state, truncated_content_view(query, buf));
tracing::begin(trace_state, format("Alternator {}", op), client_state.get_client_address());
}
return trace_state;
}
future<executor::request_return_type> server::handle_api_request(std::unique_ptr<request> req) {
_executor._stats.total_operations++;
sstring target = req->get_header(TARGET);
std::vector<std::string_view> split_target = split(target, '.');
//NOTICE(sarna): Target consists of Dynamo API version followed by a dot '.' and operation type (e.g. CreateTable)
std::string op = split_target.empty() ? std::string() : std::string(split_target.back());
// JSON parsing can allocate up to roughly 2x the size of the raw
// document, + a couple of bytes for maintenance.
// TODO: consider the case where req->content_length is missing. Maybe
// we need to take the content_length_limit and return some of the units
// when we finish read_content_and_verify_signature?
size_t mem_estimate = req->content_length * 2 + 8000;
auto units_fut = get_units(*_memory_limiter, mem_estimate);
if (_memory_limiter->waiters()) {
++_executor._stats.requests_blocked_memory;
}
auto units = co_await std::move(units_fut);
assert(req->content_stream);
chunked_content content = co_await httpd::read_entire_stream(*req->content_stream);
co_await verify_signature(*req, content);
if (slogger.is_enabled(log_level::trace)) {
std::string buf;
slogger.trace("Request: {} {} {}", op, truncated_content_view(content, buf), req->_headers);
}
auto callback_it = _callbacks.find(op);
if (callback_it == _callbacks.end()) {
_executor._stats.unsupported_operations++;
co_return api_error::unknown_operation(format("Unsupported operation {}", op));
}
if (_pending_requests.get_count() >= _max_concurrent_requests) {
_executor._stats.requests_shed++;
co_return api_error::request_limit_exceeded(format("too many in-flight requests (configured via max_concurrent_requests_per_shard): {}", _pending_requests.get_count()));
}
_pending_requests.enter();
auto leave = defer([this] { _pending_requests.leave(); });
//FIXME: Client state can provide more context, e.g. client's endpoint address
// We use unique_ptr because client_state cannot be moved or copied
executor::client_state client_state{executor::client_state::internal_tag()};
tracing::trace_state_ptr trace_state = maybe_trace_query(client_state, op, content);
tracing::trace(trace_state, op);
rjson::value json_request = co_await _json_parser.parse(std::move(content));
co_return co_await callback_it->second(_executor, client_state, trace_state,
make_service_permit(std::move(units)), std::move(json_request), std::move(req));
}
void server::set_routes(routes& r) {
api_handler* req_handler = new api_handler([this] (std::unique_ptr<request> req) mutable {
return handle_api_request(std::move(req));
});
r.put(operation_type::POST, "/", req_handler);
r.put(operation_type::GET, "/", new health_handler(_pending_requests));
// The "/localnodes" request is a new Alternator feature, not supported by
// DynamoDB and not required for DynamoDB compatibility. It allows a
// client to enquire - using a trivial HTTP request without requiring
// authentication - the list of all live nodes in the same data center of
// the Alternator cluster. The client can use this list to balance its
// request load to all the nodes in the same geographical region.
// Note that this API exposes - openly without authentication - the
// information on the cluster's members inside one data center. We do not
// consider this to be a security risk, because an attacker can already
// scan an entire subnet for nodes responding to the health request,
// or even just scan for open ports.
r.put(operation_type::GET, "/localnodes", new local_nodelist_handler(_pending_requests));
r.put(operation_type::OPTIONS, "/", new options_handler(_pending_requests));
}
//FIXME: A way to immediately invalidate the cache should be considered,
// e.g. when the system table which stores the keys is changed.
// For now, this propagation may take up to 1 minute.
server::server(executor& exec, cql3::query_processor& qp)
: _http_server("http-alternator")
, _https_server("https-alternator")
, _executor(exec)
, _qp(qp)
, _key_cache(1024, 1min, slogger)
, _enforce_authorization(false)
, _enabled_servers{}
, _pending_requests{}
, _callbacks{
{"CreateTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.create_table(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"DescribeTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.describe_table(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"DeleteTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.delete_table(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"UpdateTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.update_table(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"PutItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.put_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"UpdateItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.update_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"GetItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.get_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"DeleteItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.delete_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"ListTables", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.list_tables(client_state, std::move(permit), std::move(json_request));
}},
{"Scan", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.scan(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"DescribeEndpoints", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.describe_endpoints(client_state, std::move(permit), std::move(json_request), req->get_header("Host"));
}},
{"BatchWriteItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.batch_write_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"BatchGetItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.batch_get_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"Query", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.query(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"TagResource", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.tag_resource(client_state, std::move(permit), std::move(json_request));
}},
{"UntagResource", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.untag_resource(client_state, std::move(permit), std::move(json_request));
}},
{"ListTagsOfResource", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.list_tags_of_resource(client_state, std::move(permit), std::move(json_request));
}},
{"ListStreams", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.list_streams(client_state, std::move(permit), std::move(json_request));
}},
{"DescribeStream", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.describe_stream(client_state, std::move(permit), std::move(json_request));
}},
{"GetShardIterator", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.get_shard_iterator(client_state, std::move(permit), std::move(json_request));
}},
{"GetRecords", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.get_records(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
} {
}
future<> server::init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port, std::optional<tls::credentials_builder> creds,
bool enforce_authorization, semaphore* memory_limiter, utils::updateable_value<uint32_t> max_concurrent_requests) {
_memory_limiter = memory_limiter;
_enforce_authorization = enforce_authorization;
_max_concurrent_requests = std::move(max_concurrent_requests);
if (!port && !https_port) {
return make_exception_future<>(std::runtime_error("Either regular port or TLS port"
" must be specified in order to init an alternator HTTP server instance"));
}
return seastar::async([this, addr, port, https_port, creds] {
try {
_executor.start().get();
if (port) {
set_routes(_http_server._routes);
_http_server.set_content_length_limit(server::content_length_limit);
_http_server.set_content_streaming(true);
_http_server.listen(socket_address{addr, *port}).get();
_enabled_servers.push_back(std::ref(_http_server));
}
if (https_port) {
set_routes(_https_server._routes);
_https_server.set_content_length_limit(server::content_length_limit);
_https_server.set_content_streaming(true);
_https_server.set_tls_credentials(creds->build_reloadable_server_credentials([](const std::unordered_set<sstring>& files, std::exception_ptr ep) {
if (ep) {
slogger.warn("Exception loading {}: {}", files, ep);
} else {
slogger.info("Reloaded {}", files);
}
}).get0());
_https_server.listen(socket_address{addr, *https_port}).get();
_enabled_servers.push_back(std::ref(_https_server));
}
} catch (...) {
slogger.error("Failed to set up Alternator HTTP server on {} port {}, TLS port {}: {}",
addr, port ? std::to_string(*port) : "OFF", https_port ? std::to_string(*https_port) : "OFF", std::current_exception());
std::throw_with_nested(std::runtime_error(
format("Failed to set up Alternator HTTP server on {} port {}, TLS port {}",
addr, port ? std::to_string(*port) : "OFF", https_port ? std::to_string(*https_port) : "OFF")));
}
});
}
future<> server::stop() {
return parallel_for_each(_enabled_servers, [] (http_server& server) {
return server.stop();
}).then([this] {
return _pending_requests.close();
}).then([this] {
return _json_parser.stop();
});
}
server::json_parser::json_parser() : _run_parse_json_thread(async([this] {
while (true) {
_document_waiting.wait().get();
if (_as.abort_requested()) {
return;
}
try {
_parsed_document = rjson::parse_yieldable(std::move(_raw_document));
_current_exception = nullptr;
} catch (...) {
_current_exception = std::current_exception();
}
_document_parsed.signal();
}
})) {
}
future<rjson::value> server::json_parser::parse(chunked_content&& content) {
if (content.size() < yieldable_parsing_threshold) {
return make_ready_future<rjson::value>(rjson::parse(std::move(content)));
}
return with_semaphore(_parsing_sem, 1, [this, content = std::move(content)] () mutable {
_raw_document = std::move(content);
_document_waiting.signal();
return _document_parsed.wait().then([this] {
if (_current_exception) {
return make_exception_future<rjson::value>(_current_exception);
}
return make_ready_future<rjson::value>(std::move(_parsed_document));
});
});
}
future<> server::json_parser::stop() {
_as.request_abort();
_document_waiting.signal();
_document_parsed.broken();
return std::move(_run_parse_json_thread);
}
}

View File

@@ -1,91 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "alternator/executor.hh"
#include <seastar/core/future.hh>
#include <seastar/http/httpd.hh>
#include <seastar/net/tls.hh>
#include <optional>
#include "alternator/auth.hh"
#include "utils/small_vector.hh"
#include "utils/updateable_value.hh"
#include <seastar/core/units.hh>
namespace alternator {
using chunked_content = rjson::chunked_content;
class server {
static constexpr size_t content_length_limit = 16*MB;
using alternator_callback = std::function<future<executor::request_return_type>(executor&, executor::client_state&,
tracing::trace_state_ptr, service_permit, rjson::value, std::unique_ptr<request>)>;
using alternator_callbacks_map = std::unordered_map<std::string_view, alternator_callback>;
http_server _http_server;
http_server _https_server;
executor& _executor;
cql3::query_processor& _qp;
key_cache _key_cache;
bool _enforce_authorization;
utils::small_vector<std::reference_wrapper<seastar::httpd::http_server>, 2> _enabled_servers;
gate _pending_requests;
alternator_callbacks_map _callbacks;
semaphore* _memory_limiter;
utils::updateable_value<uint32_t> _max_concurrent_requests;
class json_parser {
static constexpr size_t yieldable_parsing_threshold = 16*KB;
chunked_content _raw_document;
rjson::value _parsed_document;
std::exception_ptr _current_exception;
semaphore _parsing_sem{1};
condition_variable _document_waiting;
condition_variable _document_parsed;
abort_source _as;
future<> _run_parse_json_thread;
public:
json_parser();
// Moving a chunked_content into parse() allows parse() to free each
// chunk as soon as it is parsed, so when chunks are relatively small,
// we don't need to store the sum of unparsed and parsed sizes.
future<rjson::value> parse(chunked_content&& content);
future<> stop();
};
json_parser _json_parser;
public:
server(executor& executor, cql3::query_processor& qp);
future<> init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port, std::optional<tls::credentials_builder> creds,
bool enforce_authorization, semaphore* memory_limiter, utils::updateable_value<uint32_t> max_concurrent_requests);
future<> stop();
private:
void set_routes(seastar::httpd::routes& r);
future<> verify_signature(const seastar::httpd::request&, const chunked_content&);
future<executor::request_return_type> handle_api_request(std::unique_ptr<request> req);
};
}

View File

@@ -1,112 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "stats.hh"
#include "utils/histogram_metrics_helper.hh"
#include <seastar/core/metrics.hh>
namespace alternator {
const char* ALTERNATOR_METRICS = "alternator";
stats::stats() : api_operations{} {
// Register the
seastar::metrics::label op("op");
_metrics.add_group("alternator", {
#define OPERATION(name, CamelCaseName) \
seastar::metrics::make_total_operations("operation", api_operations.name, \
seastar::metrics::description("number of operations via Alternator API"), {op(CamelCaseName)}),
#define OPERATION_LATENCY(name, CamelCaseName) \
seastar::metrics::make_histogram("op_latency", \
seastar::metrics::description("Latency histogram of an operation via Alternator API"), {op(CamelCaseName)}, [this]{return to_metrics_histogram(api_operations.name);}),
OPERATION(batch_get_item, "BatchGetItem")
OPERATION(batch_write_item, "BatchWriteItem")
OPERATION(create_backup, "CreateBackup")
OPERATION(create_global_table, "CreateGlobalTable")
OPERATION(create_table, "CreateTable")
OPERATION(delete_backup, "DeleteBackup")
OPERATION(delete_item, "DeleteItem")
OPERATION(delete_table, "DeleteTable")
OPERATION(describe_backup, "DescribeBackup")
OPERATION(describe_continuous_backups, "DescribeContinuousBackups")
OPERATION(describe_endpoints, "DescribeEndpoints")
OPERATION(describe_global_table, "DescribeGlobalTable")
OPERATION(describe_global_table_settings, "DescribeGlobalTableSettings")
OPERATION(describe_limits, "DescribeLimits")
OPERATION(describe_table, "DescribeTable")
OPERATION(describe_time_to_live, "DescribeTimeToLive")
OPERATION(get_item, "GetItem")
OPERATION(list_backups, "ListBackups")
OPERATION(list_global_tables, "ListGlobalTables")
OPERATION(list_tables, "ListTables")
OPERATION(list_tags_of_resource, "ListTagsOfResource")
OPERATION(put_item, "PutItem")
OPERATION(query, "Query")
OPERATION(restore_table_from_backup, "RestoreTableFromBackup")
OPERATION(restore_table_to_point_in_time, "RestoreTableToPointInTime")
OPERATION(scan, "Scan")
OPERATION(tag_resource, "TagResource")
OPERATION(transact_get_items, "TransactGetItems")
OPERATION(transact_write_items, "TransactWriteItems")
OPERATION(untag_resource, "UntagResource")
OPERATION(update_continuous_backups, "UpdateContinuousBackups")
OPERATION(update_global_table, "UpdateGlobalTable")
OPERATION(update_global_table_settings, "UpdateGlobalTableSettings")
OPERATION(update_item, "UpdateItem")
OPERATION(update_table, "UpdateTable")
OPERATION(update_time_to_live, "UpdateTimeToLive")
OPERATION_LATENCY(put_item_latency, "PutItem")
OPERATION_LATENCY(get_item_latency, "GetItem")
OPERATION_LATENCY(delete_item_latency, "DeleteItem")
OPERATION_LATENCY(update_item_latency, "UpdateItem")
OPERATION(list_streams, "ListStreams")
OPERATION(describe_stream, "DescribeStream")
OPERATION(get_shard_iterator, "GetShardIterator")
OPERATION(get_records, "GetRecords")
OPERATION_LATENCY(get_records_latency, "GetRecords")
});
_metrics.add_group("alternator", {
seastar::metrics::make_total_operations("unsupported_operations", unsupported_operations,
seastar::metrics::description("number of unsupported operations via Alternator API")),
seastar::metrics::make_total_operations("total_operations", total_operations,
seastar::metrics::description("number of total operations via Alternator API")),
seastar::metrics::make_total_operations("reads_before_write", reads_before_write,
seastar::metrics::description("number of performed read-before-write operations")),
seastar::metrics::make_total_operations("write_using_lwt", write_using_lwt,
seastar::metrics::description("number of writes that used LWT")),
seastar::metrics::make_total_operations("shard_bounce_for_lwt", shard_bounce_for_lwt,
seastar::metrics::description("number writes that had to be bounced from this shard because of LWT requirements")),
seastar::metrics::make_total_operations("requests_blocked_memory", requests_blocked_memory,
seastar::metrics::description("Counts a number of requests blocked due to memory pressure.")),
seastar::metrics::make_total_operations("requests_shed", requests_shed,
seastar::metrics::description("Counts a number of requests shed due to overload.")),
seastar::metrics::make_total_operations("filtered_rows_read_total", cql_stats.filtered_rows_read_total,
seastar::metrics::description("number of rows read during filtering operations")),
seastar::metrics::make_total_operations("filtered_rows_matched_total", cql_stats.filtered_rows_matched_total,
seastar::metrics::description("number of rows read and matched during filtering operations")),
seastar::metrics::make_total_operations("filtered_rows_dropped_total", [this] { return cql_stats.filtered_rows_read_total - cql_stats.filtered_rows_matched_total; },
seastar::metrics::description("number of rows read and dropped during filtering operations")),
});
}
}

View File

@@ -1,104 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <cstdint>
#include <seastar/core/metrics_registration.hh>
#include "seastarx.hh"
#include "utils/estimated_histogram.hh"
#include "cql3/stats.hh"
namespace alternator {
// Object holding per-shard statistics related to Alternator.
// While this object is alive, these metrics are also registered to be
// visible by the metrics REST API, with the "alternator" prefix.
class stats {
public:
stats();
// Count of DynamoDB API operations by types
struct {
uint64_t batch_get_item = 0;
uint64_t batch_write_item = 0;
uint64_t create_backup = 0;
uint64_t create_global_table = 0;
uint64_t create_table = 0;
uint64_t delete_backup = 0;
uint64_t delete_item = 0;
uint64_t delete_table = 0;
uint64_t describe_backup = 0;
uint64_t describe_continuous_backups = 0;
uint64_t describe_endpoints = 0;
uint64_t describe_global_table = 0;
uint64_t describe_global_table_settings = 0;
uint64_t describe_limits = 0;
uint64_t describe_table = 0;
uint64_t describe_time_to_live = 0;
uint64_t get_item = 0;
uint64_t list_backups = 0;
uint64_t list_global_tables = 0;
uint64_t list_tables = 0;
uint64_t list_tags_of_resource = 0;
uint64_t put_item = 0;
uint64_t query = 0;
uint64_t restore_table_from_backup = 0;
uint64_t restore_table_to_point_in_time = 0;
uint64_t scan = 0;
uint64_t tag_resource = 0;
uint64_t transact_get_items = 0;
uint64_t transact_write_items = 0;
uint64_t untag_resource = 0;
uint64_t update_continuous_backups = 0;
uint64_t update_global_table = 0;
uint64_t update_global_table_settings = 0;
uint64_t update_item = 0;
uint64_t update_table = 0;
uint64_t update_time_to_live = 0;
uint64_t list_streams = 0;
uint64_t describe_stream = 0;
uint64_t get_shard_iterator = 0;
uint64_t get_records = 0;
utils::time_estimated_histogram put_item_latency;
utils::time_estimated_histogram get_item_latency;
utils::time_estimated_histogram delete_item_latency;
utils::time_estimated_histogram update_item_latency;
utils::time_estimated_histogram get_records_latency;
} api_operations;
// Miscellaneous event counters
uint64_t total_operations = 0;
uint64_t unsupported_operations = 0;
uint64_t reads_before_write = 0;
uint64_t write_using_lwt = 0;
uint64_t shard_bounce_for_lwt = 0;
uint64_t requests_blocked_memory = 0;
uint64_t requests_shed = 0;
// CQL-derived stats
cql3::cql_stats cql_stats;
private:
// The metric_groups object holds this stat object's metrics registered
// as long as the stats object is alive.
seastar::metrics::metric_groups _metrics;
};
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,53 +0,0 @@
/*
* Copyright 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "serializer.hh"
#include "schema.hh"
#include "db/extensions.hh"
namespace alternator {
class tags_extension : public schema_extension {
public:
static constexpr auto NAME = "scylla_tags";
tags_extension() = default;
explicit tags_extension(const std::map<sstring, sstring>& tags) : _tags(std::move(tags)) {}
explicit tags_extension(bytes b) : _tags(tags_extension::deserialize(b)) {}
explicit tags_extension(const sstring& s) {
throw std::logic_error("Cannot create tags from string");
}
bytes serialize() const override {
return ser::serialize_to_buffer<bytes>(_tags);
}
static std::map<sstring, sstring> deserialize(bytes_view buffer) {
return ser::deserialize_from_buffer(buffer, boost::type<std::map<sstring, sstring>>());
}
const std::map<sstring, sstring>& tags() const {
return _tags;
}
private:
std::map<sstring, sstring> _tags;
};
}

View File

@@ -13,7 +13,7 @@
{
"method":"GET",
"summary":"get row cache save period in seconds",
"type": "long",
"type":"int",
"nickname":"get_row_cache_save_period_in_seconds",
"produces":[
"application/json"
@@ -35,7 +35,7 @@
"description":"row cache save period in seconds",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -48,7 +48,7 @@
{
"method":"GET",
"summary":"get key cache save period in seconds",
"type": "long",
"type":"int",
"nickname":"get_key_cache_save_period_in_seconds",
"produces":[
"application/json"
@@ -70,7 +70,7 @@
"description":"key cache save period in seconds",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -83,7 +83,7 @@
{
"method":"GET",
"summary":"get counter cache save period in seconds",
"type": "long",
"type":"int",
"nickname":"get_counter_cache_save_period_in_seconds",
"produces":[
"application/json"
@@ -105,7 +105,7 @@
"description":"counter cache save period in seconds",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -118,7 +118,7 @@
{
"method":"GET",
"summary":"get row cache keys to save",
"type": "long",
"type":"int",
"nickname":"get_row_cache_keys_to_save",
"produces":[
"application/json"
@@ -140,7 +140,7 @@
"description":"row cache keys to save",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -153,7 +153,7 @@
{
"method":"GET",
"summary":"get key cache keys to save",
"type": "long",
"type":"int",
"nickname":"get_key_cache_keys_to_save",
"produces":[
"application/json"
@@ -175,7 +175,7 @@
"description":"key cache keys to save",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -188,7 +188,7 @@
{
"method":"GET",
"summary":"get counter cache keys to save",
"type": "long",
"type":"int",
"nickname":"get_counter_cache_keys_to_save",
"produces":[
"application/json"
@@ -210,7 +210,7 @@
"description":"counter cache keys to save",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -448,7 +448,7 @@
{
"method": "GET",
"summary": "Get key entries",
"type": "long",
"type": "int",
"nickname": "get_key_entries",
"produces": [
"application/json"
@@ -568,7 +568,7 @@
{
"method": "GET",
"summary": "Get row entries",
"type": "long",
"type": "int",
"nickname": "get_row_entries",
"produces": [
"application/json"
@@ -688,7 +688,7 @@
{
"method": "GET",
"summary": "Get counter entries",
"type": "long",
"type": "int",
"nickname": "get_counter_entries",
"produces": [
"application/json"

View File

@@ -70,7 +70,7 @@
{
"method":"POST",
"summary":"Force a major compaction of this column family",
"type":"void",
"type":"string",
"nickname":"force_major_compaction",
"produces":[
"application/json"
@@ -121,7 +121,7 @@
"description":"The minimum number of sstables in queue before compaction kicks off",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -172,7 +172,7 @@
"description":"The maximum number of sstables in queue before compaction kicks off",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -223,7 +223,7 @@
"description":"The maximum number of sstables in queue before compaction kicks off",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
},
{
@@ -231,7 +231,7 @@
"description":"The minimum number of sstables in queue before compaction kicks off",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -380,54 +380,16 @@
"operations":[
{
"method":"GET",
"summary":"check if the auto_compaction property is enabled for a given table",
"summary":"check if the auto compaction disabled",
"type":"boolean",
"nickname":"get_auto_compaction",
"nickname":"is_auto_compaction_disabled",
"produces":[
"application/json"
],
"parameters":[
{
"name":"name",
"description":"The table name in keyspace:name format",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
},
{
"method":"POST",
"summary":"Enable table auto compaction",
"type":"void",
"nickname":"enable_auto_compaction",
"produces":[
"application/json"
],
"parameters":[
{
"name":"name",
"description":"The table name in keyspace:name format",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
},
{
"method":"DELETE",
"summary":"Disable table auto compaction",
"type":"void",
"nickname":"disable_auto_compaction",
"produces":[
"application/json"
],
"parameters":[
{
"name":"name",
"description":"The table name in keyspace:name format",
"description":"The column family name in keyspace:name format",
"required":true,
"allowMultiple":false,
"type":"string",
@@ -493,7 +455,7 @@
"operations":[
{
"method":"GET",
"summary":"Returns a list of sstable filenames that contain the given partition key on this node",
"summary":"Returns a list of filenames that contain the given key on this node",
"type":"array",
"items":{
"type":"string"
@@ -513,7 +475,7 @@
},
{
"name":"key",
"description":"The partition key. In a composite-key scenario, use ':' to separate the columns in the key.",
"description":"The key",
"required":true,
"allowMultiple":false,
"type":"string",
@@ -582,7 +544,7 @@
"summary":"sstable count for each level. empty unless leveled compaction is used",
"type":"array",
"items":{
"type": "long"
"type":"int"
},
"nickname":"get_sstable_count_per_level",
"produces":[
@@ -649,54 +611,6 @@
}
]
},
{
"path":"/column_family/toppartitions/{name}",
"operations":[
{
"method":"GET",
"summary":"Toppartitions query",
"type":"toppartitions_query_results",
"nickname":"toppartitions",
"produces":[
"application/json"
],
"parameters":[
{
"name":"name",
"description":"The column family name in keyspace:name format",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"duration",
"description":"Duration (in milliseconds) of monitoring operation",
"required":true,
"allowMultiple":false,
"type": "long",
"paramType":"query"
},
{
"name":"list_size",
"description":"number of the top partitions to list",
"required":false,
"allowMultiple":false,
"type": "long",
"paramType":"query"
},
{
"name":"capacity",
"description":"capacity of stream summary: determines amount of resources used in query processing",
"required":false,
"allowMultiple":false,
"type": "long",
"paramType":"query"
}
]
}
]
},
{
"path":"/column_family/metrics/memtable_columns_count/",
"operations":[
@@ -959,7 +873,7 @@
{
"method":"GET",
"summary":"Get memtable switch count",
"type": "long",
"type":"int",
"nickname":"get_memtable_switch_count",
"produces":[
"application/json"
@@ -983,7 +897,7 @@
{
"method":"GET",
"summary":"Get all memtable switch count",
"type": "long",
"type":"int",
"nickname":"get_all_memtable_switch_count",
"produces":[
"application/json"
@@ -1120,7 +1034,7 @@
{
"method":"GET",
"summary":"Get read latency",
"type": "long",
"type":"int",
"nickname":"get_read_latency",
"produces":[
"application/json"
@@ -1273,7 +1187,7 @@
{
"method":"GET",
"summary":"Get all read latency",
"type": "long",
"type":"int",
"nickname":"get_all_read_latency",
"produces":[
"application/json"
@@ -1289,7 +1203,7 @@
{
"method":"GET",
"summary":"Get range latency",
"type": "long",
"type":"int",
"nickname":"get_range_latency",
"produces":[
"application/json"
@@ -1313,7 +1227,7 @@
{
"method":"GET",
"summary":"Get all range latency",
"type": "long",
"type":"int",
"nickname":"get_all_range_latency",
"produces":[
"application/json"
@@ -1329,7 +1243,7 @@
{
"method":"GET",
"summary":"Get write latency",
"type": "long",
"type":"int",
"nickname":"get_write_latency",
"produces":[
"application/json"
@@ -1482,7 +1396,7 @@
{
"method":"GET",
"summary":"Get all write latency",
"type": "long",
"type":"int",
"nickname":"get_all_write_latency",
"produces":[
"application/json"
@@ -1498,7 +1412,7 @@
{
"method":"GET",
"summary":"Get pending flushes",
"type": "long",
"type":"int",
"nickname":"get_pending_flushes",
"produces":[
"application/json"
@@ -1522,7 +1436,7 @@
{
"method":"GET",
"summary":"Get all pending flushes",
"type": "long",
"type":"int",
"nickname":"get_all_pending_flushes",
"produces":[
"application/json"
@@ -1538,7 +1452,7 @@
{
"method":"GET",
"summary":"Get pending compactions",
"type": "long",
"type":"int",
"nickname":"get_pending_compactions",
"produces":[
"application/json"
@@ -1562,7 +1476,7 @@
{
"method":"GET",
"summary":"Get all pending compactions",
"type": "long",
"type":"int",
"nickname":"get_all_pending_compactions",
"produces":[
"application/json"
@@ -1578,7 +1492,7 @@
{
"method":"GET",
"summary":"Get live ss table count",
"type": "long",
"type":"int",
"nickname":"get_live_ss_table_count",
"produces":[
"application/json"
@@ -1602,7 +1516,7 @@
{
"method":"GET",
"summary":"Get all live ss table count",
"type": "long",
"type":"int",
"nickname":"get_all_live_ss_table_count",
"produces":[
"application/json"
@@ -1618,7 +1532,7 @@
{
"method":"GET",
"summary":"Get live disk space used",
"type": "long",
"type":"int",
"nickname":"get_live_disk_space_used",
"produces":[
"application/json"
@@ -1642,7 +1556,7 @@
{
"method":"GET",
"summary":"Get all live disk space used",
"type": "long",
"type":"int",
"nickname":"get_all_live_disk_space_used",
"produces":[
"application/json"
@@ -1658,7 +1572,7 @@
{
"method":"GET",
"summary":"Get total disk space used",
"type": "long",
"type":"int",
"nickname":"get_total_disk_space_used",
"produces":[
"application/json"
@@ -1682,7 +1596,7 @@
{
"method":"GET",
"summary":"Get all total disk space used",
"type": "long",
"type":"int",
"nickname":"get_all_total_disk_space_used",
"produces":[
"application/json"
@@ -2138,7 +2052,7 @@
{
"method":"GET",
"summary":"Get speculative retries",
"type": "long",
"type":"int",
"nickname":"get_speculative_retries",
"produces":[
"application/json"
@@ -2162,7 +2076,7 @@
{
"method":"GET",
"summary":"Get all speculative retries",
"type": "long",
"type":"int",
"nickname":"get_all_speculative_retries",
"produces":[
"application/json"
@@ -2242,7 +2156,7 @@
{
"method":"GET",
"summary":"Get row cache hit out of range",
"type": "long",
"type":"int",
"nickname":"get_row_cache_hit_out_of_range",
"produces":[
"application/json"
@@ -2266,7 +2180,7 @@
{
"method":"GET",
"summary":"Get all row cache hit out of range",
"type": "long",
"type":"int",
"nickname":"get_all_row_cache_hit_out_of_range",
"produces":[
"application/json"
@@ -2282,7 +2196,7 @@
{
"method":"GET",
"summary":"Get row cache hit",
"type": "long",
"type":"int",
"nickname":"get_row_cache_hit",
"produces":[
"application/json"
@@ -2306,7 +2220,7 @@
{
"method":"GET",
"summary":"Get all row cache hit",
"type": "long",
"type":"int",
"nickname":"get_all_row_cache_hit",
"produces":[
"application/json"
@@ -2322,7 +2236,7 @@
{
"method":"GET",
"summary":"Get row cache miss",
"type": "long",
"type":"int",
"nickname":"get_row_cache_miss",
"produces":[
"application/json"
@@ -2346,7 +2260,7 @@
{
"method":"GET",
"summary":"Get all row cache miss",
"type": "long",
"type":"int",
"nickname":"get_all_row_cache_miss",
"produces":[
"application/json"
@@ -2362,7 +2276,7 @@
{
"method":"GET",
"summary":"Get cas prepare",
"type": "long",
"type":"int",
"nickname":"get_cas_prepare",
"produces":[
"application/json"
@@ -2386,7 +2300,7 @@
{
"method":"GET",
"summary":"Get cas propose",
"type": "long",
"type":"int",
"nickname":"get_cas_propose",
"produces":[
"application/json"
@@ -2410,7 +2324,7 @@
{
"method":"GET",
"summary":"Get cas commit",
"type": "long",
"type":"int",
"nickname":"get_cas_commit",
"produces":[
"application/json"
@@ -2902,52 +2816,6 @@
"description":"The column family type"
}
}
},
"toppartitions_record":{
"id":"toppartitions_record",
"description":"nodetool toppartitions query record",
"properties":{
"partition":{
"type":"string",
"description":"Partition key"
},
"count":{
"type":"long",
"description":"Number of read/write operations"
},
"error":{
"type":"long",
"description":"Indication of inaccuracy in counting PKs"
}
}
},
"toppartitions_query_results":{
"id":"toppartitions_query_results",
"description":"nodetool toppartitions query results",
"properties":{
"read_cardinality":{
"type":"long",
"description":"Number of the unique operations in the sample set"
},
"read":{
"type":"array",
"items":{
"type":"toppartitions_record"
},
"description":"Read results"
},
"write_cardinality":{
"type":"long",
"description":"Number of the unique operations in the sample set"
},
"write":{
"type":"array",
"items":{
"type":"toppartitions_record"
},
"description":"Write results"
}
}
}
}
}

View File

@@ -118,7 +118,7 @@
{
"method": "GET",
"summary": "Get pending tasks",
"type": "long",
"type": "int",
"nickname": "get_pending_tasks",
"produces": [
"application/json"
@@ -127,24 +127,6 @@
}
]
},
{
"path": "/compaction_manager/metrics/pending_tasks_by_table",
"operations": [
{
"method": "GET",
"summary": "Get pending tasks by table name",
"type": "array",
"items": {
"type": "pending_compaction"
},
"nickname": "get_pending_tasks_by_table",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/compaction_manager/metrics/completed_tasks",
"operations": [
@@ -181,7 +163,7 @@
{
"method": "GET",
"summary": "Get bytes compacted",
"type": "long",
"type": "int",
"nickname": "get_bytes_compacted",
"produces": [
"application/json"
@@ -197,7 +179,7 @@
"description":"A row merged information",
"properties":{
"key":{
"type": "long",
"type":"int",
"description":"The number of sstable"
},
"value":{
@@ -262,23 +244,6 @@
}
}
},
"pending_compaction": {
"id": "pending_compaction",
"properties": {
"cf": {
"type": "string",
"description": "The column family name"
},
"ks": {
"type":"string",
"description": "The keyspace name"
},
"task": {
"type":"long",
"description": "The number of pending tasks"
}
}
},
"history": {
"id":"history",
"description":"Compaction history information",

View File

@@ -1,30 +0,0 @@
"/v2/config/{id}": {
"get": {
"description": "Return a config value",
"operationId": "find_config_id",
"produces": [
"application/json"
],
"tags": ["config"],
"parameters": [
{
"name": "id",
"in": "path",
"description": "ID of config to return",
"required": true,
"type": "string"
}
],
"responses": {
"200": {
"description": "Config value"
},
"default": {
"description": "unexpected error",
"schema": {
"$ref": "#/definitions/ErrorModel"
}
}
}
}
}

View File

@@ -1,90 +0,0 @@
{
"apiVersion":"0.0.1",
"swaggerVersion":"1.2",
"basePath":"{{Protocol}}://{{Host}}",
"resourcePath":"/error_injection",
"produces":[
"application/json"
],
"apis":[
{
"path":"/v2/error_injection/injection/{injection}",
"operations":[
{
"method":"POST",
"summary":"Activate an injection that triggers an error in code",
"type":"void",
"nickname":"enable_injection",
"produces":[
"application/json"
],
"parameters":[
{
"name":"injection",
"description":"injection name, should correspond to an injection added in code",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"one_shot",
"description":"boolean flag indicating whether the injection should be enabled to trigger only once",
"required":false,
"allowMultiple":false,
"type":"boolean",
"paramType":"query"
}
]
},
{
"method":"DELETE",
"summary":"Deactivate an injection previously activated by the API",
"type":"void",
"nickname":"disable_injection",
"produces":[
"application/json"
],
"parameters":[
{
"name":"injection",
"description":"injection name",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
}
]
},
{
"path":"/v2/error_injection/injection",
"operations":[
{
"method":"GET",
"summary":"List all enabled injections on all shards, i.e. injections that will trigger an error in the code",
"type":"array",
"items":{
"type":"string"
},
"nickname":"get_enabled_injections_on_all",
"produces":[
"application/json"
],
"parameters":[]
},
{
"method":"DELETE",
"summary":"Deactivate all injections previously activated on all shards by the API",
"type":"void",
"nickname":"disable_on_all",
"produces":[
"application/json"
],
"parameters":[]
}
]
}
]
}

View File

@@ -110,7 +110,7 @@
{
"method":"GET",
"summary":"Get count down endpoint",
"type": "long",
"type":"int",
"nickname":"get_down_endpoint_count",
"produces":[
"application/json"
@@ -126,7 +126,7 @@
{
"method":"GET",
"summary":"Get count up endpoint",
"type": "long",
"type":"int",
"nickname":"get_up_endpoint_count",
"produces":[
"application/json"
@@ -180,11 +180,11 @@
"description": "The endpoint address"
},
"generation": {
"type": "long",
"type": "int",
"description": "The heart beat generation"
},
"version": {
"type": "long",
"type": "int",
"description": "The heart beat version"
},
"update_time": {
@@ -209,7 +209,7 @@
"description": "Holds a version value for an application state",
"properties": {
"application_state": {
"type": "long",
"type": "int",
"description": "The application state enum index"
},
"value": {
@@ -217,7 +217,7 @@
"description": "The version value"
},
"version": {
"type": "long",
"type": "int",
"description": "The application state version"
}
}

View File

@@ -75,7 +75,7 @@
{
"method":"GET",
"summary":"Returns files which are pending for archival attempt. Does NOT include failed archive attempts",
"type": "long",
"type":"int",
"nickname":"get_current_generation_number",
"produces":[
"application/json"
@@ -99,7 +99,7 @@
{
"method":"GET",
"summary":"Get heart beat version for a node",
"type": "long",
"type":"int",
"nickname":"get_current_heart_beat_version",
"produces":[
"application/json"
@@ -148,30 +148,6 @@
]
}
]
},
{
"path":"/gossiper/force_remove_endpoint/{addr}",
"operations":[
{
"method":"POST",
"summary":"Force remove an endpoint from gossip",
"type":"void",
"nickname":"force_remove_endpoint",
"produces":[
"application/json"
],
"parameters":[
{
"name":"addr",
"description":"The endpoint address",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
}
]
}
]
}

View File

@@ -99,7 +99,7 @@
{
"method": "GET",
"summary": "Get create hint count",
"type": "long",
"type": "int",
"nickname": "get_create_hint_count",
"produces": [
"application/json"
@@ -123,7 +123,7 @@
{
"method": "GET",
"summary": "Get not stored hints count",
"type": "long",
"type": "int",
"nickname": "get_not_stored_hints_count",
"produces": [
"application/json"

View File

@@ -76,7 +76,7 @@
"items":{
"type":"message_counter"
},
"nickname":"get_replied_messages",
"nickname":"get_completed_messages",
"produces":[
"application/json"
],
@@ -191,7 +191,7 @@
{
"method":"GET",
"summary":"Get the version number",
"type": "long",
"type":"int",
"nickname":"get_version",
"produces":[
"application/json"
@@ -249,7 +249,7 @@
"MIGRATION_REQUEST",
"PREPARE_MESSAGE",
"PREPARE_DONE_MESSAGE",
"UNUSED__STREAM_MUTATION",
"STREAM_MUTATION",
"STREAM_MUTATION_DONE",
"COMPLETE_MESSAGE",
"REPAIR_CHECKSUM_RANGE",

View File

@@ -68,7 +68,7 @@
"summary":"Get the hinted handoff enabled by dc",
"type":"array",
"items":{
"type":"array"
"type":"mapper_list"
},
"nickname":"get_hinted_handoff_enabled_by_dc",
"produces":[
@@ -105,7 +105,7 @@
{
"method":"GET",
"summary":"Get the max hint window",
"type": "long",
"type":"int",
"nickname":"get_max_hint_window",
"produces":[
"application/json"
@@ -128,7 +128,7 @@
"description":"max hint window in ms",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -141,7 +141,7 @@
{
"method":"GET",
"summary":"Get max hints in progress",
"type": "long",
"type":"int",
"nickname":"get_max_hints_in_progress",
"produces":[
"application/json"
@@ -164,7 +164,7 @@
"description":"max hints in progress",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -177,7 +177,7 @@
{
"method":"GET",
"summary":"get hints in progress",
"type": "long",
"type":"int",
"nickname":"get_hints_in_progress",
"produces":[
"application/json"
@@ -602,7 +602,7 @@
{
"method": "GET",
"summary": "Get cas write metrics",
"type": "long",
"type": "int",
"nickname": "get_cas_write_metrics_unfinished_commit",
"produces": [
"application/json"
@@ -632,7 +632,7 @@
{
"method": "GET",
"summary": "Get cas write metrics",
"type": "long",
"type": "int",
"nickname": "get_cas_write_metrics_condition_not_met",
"produces": [
"application/json"
@@ -641,28 +641,13 @@
}
]
},
{
"path": "/storage_proxy/metrics/cas_write/failed_read_round_optimization",
"operations": [
{
"method": "GET",
"summary": "Get cas write metrics",
"type": "long",
"nickname": "get_cas_write_metrics_failed_read_round_optimization",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/cas_read/unfinished_commit",
"operations": [
{
"method": "GET",
"summary": "Get cas read metrics",
"type": "long",
"type": "int",
"nickname": "get_cas_read_metrics_unfinished_commit",
"produces": [
"application/json"
@@ -686,13 +671,28 @@
}
]
},
{
"path": "/storage_proxy/metrics/cas_read/condition_not_met",
"operations": [
{
"method": "GET",
"summary": "Get cas read metrics",
"type": "int",
"nickname": "get_cas_read_metrics_condition_not_met",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/read/timeouts",
"operations": [
{
"method": "GET",
"summary": "Get read metrics",
"type": "long",
"type": "int",
"nickname": "get_read_metrics_timeouts",
"produces": [
"application/json"
@@ -707,7 +707,7 @@
{
"method": "GET",
"summary": "Get read metrics",
"type": "long",
"type": "int",
"nickname": "get_read_metrics_unavailables",
"produces": [
"application/json"
@@ -791,36 +791,6 @@
}
]
},
{
"path": "/storage_proxy/metrics/cas_read/moving_average_histogram",
"operations": [
{
"method": "GET",
"summary": "Get CAS read rate and latency histogram",
"$ref": "#/utils/rate_moving_average_and_histogram",
"nickname": "get_cas_read_metrics_latency_histogram",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/view_write/moving_average_histogram",
"operations": [
{
"method": "GET",
"summary": "Get view write rate and latency histogram",
"$ref": "#/utils/rate_moving_average_and_histogram",
"nickname": "get_view_write_metrics_latency_histogram",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/range/moving_average_histogram",
"operations": [
@@ -842,7 +812,7 @@
{
"method": "GET",
"summary": "Get range metrics",
"type": "long",
"type": "int",
"nickname": "get_range_metrics_timeouts",
"produces": [
"application/json"
@@ -857,7 +827,7 @@
{
"method": "GET",
"summary": "Get range metrics",
"type": "long",
"type": "int",
"nickname": "get_range_metrics_unavailables",
"produces": [
"application/json"
@@ -902,7 +872,7 @@
{
"method": "GET",
"summary": "Get write metrics",
"type": "long",
"type": "int",
"nickname": "get_write_metrics_timeouts",
"produces": [
"application/json"
@@ -917,7 +887,7 @@
{
"method": "GET",
"summary": "Get write metrics",
"type": "long",
"type": "int",
"nickname": "get_write_metrics_unavailables",
"produces": [
"application/json"
@@ -986,21 +956,6 @@
}
]
},
{
"path": "/storage_proxy/metrics/cas_write/moving_average_histogram",
"operations": [
{
"method": "GET",
"summary": "Get CAS write rate and latency histogram",
"$ref": "#/utils/rate_moving_average_and_histogram",
"nickname": "get_cas_write_metrics_latency_histogram",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path":"/storage_proxy/metrics/read/estimated_histogram/",
"operations":[
@@ -1023,7 +978,7 @@
{
"method":"GET",
"summary":"Get read latency",
"type": "long",
"type":"int",
"nickname":"get_read_latency",
"produces":[
"application/json"
@@ -1055,7 +1010,7 @@
{
"method":"GET",
"summary":"Get write latency",
"type": "long",
"type":"int",
"nickname":"get_write_latency",
"produces":[
"application/json"
@@ -1087,7 +1042,7 @@
{
"method":"GET",
"summary":"Get range latency",
"type": "long",
"type":"int",
"nickname":"get_range_latency",
"produces":[
"application/json"

View File

@@ -104,68 +104,6 @@
}
]
},
{
"path":"/storage_service/toppartitions/",
"operations":[
{
"method":"GET",
"summary":"Toppartitions query",
"type":"toppartitions_query_results",
"nickname":"toppartitions_generic",
"produces":[
"application/json"
],
"parameters":[
{
"name":"table_filters",
"description":"Optional list of table name filters in keyspace:name format",
"required":false,
"allowMultiple":false,
"type":"array",
"items":{
"type":"string"
},
"paramType":"query"
},
{
"name":"keyspace_filters",
"description":"Optional list of keyspace filters",
"required":false,
"allowMultiple":false,
"type":"array",
"items":{
"type":"string"
},
"paramType":"query"
},
{
"name":"duration",
"description":"Duration (in milliseconds) of monitoring operation",
"required":true,
"allowMultiple":false,
"type": "long",
"paramType":"query"
},
{
"name":"list_size",
"description":"number of the top partitions to list",
"required":false,
"allowMultiple":false,
"type": "long",
"paramType":"query"
},
{
"name":"capacity",
"description":"capacity of stream summary: determines amount of resources used in query processing",
"required":false,
"allowMultiple":false,
"type": "long",
"paramType":"query"
}
]
}
]
},
{
"path":"/storage_service/nodes/leaving",
"operations":[
@@ -520,7 +458,7 @@
{
"method":"GET",
"summary":"Return the generation value for this node.",
"type": "long",
"type":"int",
"nickname":"get_current_generation_number",
"produces":[
"application/json"
@@ -573,21 +511,6 @@
}
]
},
{
"path":"/storage_service/cdc_streams_check_and_repair",
"operations":[
{
"method":"POST",
"summary":"Checks that CDC streams reflect current cluster topology and regenerates them if not.",
"type":"void",
"nickname":"cdc_streams_check_and_repair",
"produces":[
"application/json"
],
"parameters":[]
}
]
},
{
"path":"/storage_service/snapshots",
"operations":[
@@ -659,15 +582,7 @@
},
{
"name":"kn",
"description":"Comma seperated keyspaces name that their snapshot will be deleted",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"cf",
"description":"an optional table name that its snapshot will be deleted",
"description":"Comma seperated keyspaces name to snapshot",
"required":false,
"allowMultiple":false,
"type":"string",
@@ -731,7 +646,7 @@
{
"method":"POST",
"summary":"Trigger a cleanup of keys on a single keyspace",
"type": "long",
"type":"int",
"nickname":"force_keyspace_cleanup",
"produces":[
"application/json"
@@ -763,7 +678,7 @@
{
"method":"GET",
"summary":"Scrub (deserialize + reserialize at the latest version, skipping bad rows if any) the given keyspace. If columnFamilies array is empty, all CFs are scrubbed. Scrubbed CFs will be snapshotted first, if disableSnapshot is false",
"type": "long",
"type":"int",
"nickname":"scrub",
"produces":[
"application/json"
@@ -811,7 +726,7 @@
{
"method":"GET",
"summary":"Rewrite all sstables to the latest version. Unlike scrub, it doesn't skip bad rows and do not snapshot sstables first.",
"type": "long",
"type":"int",
"nickname":"upgrade_sstables",
"produces":[
"application/json"
@@ -885,7 +800,7 @@
"summary":"Return an array with the ids of the currently active repairs",
"type":"array",
"items":{
"type": "long"
"type":"int"
},
"nickname":"get_active_repair_async",
"produces":[
@@ -895,50 +810,13 @@
}
]
},
{
"path":"/storage_service/repair_status/",
"operations":[
{
"method":"GET",
"summary":"Query the repair status and return when the repair is finished or timeout",
"type":"string",
"enum":[
"RUNNING",
"SUCCESSFUL",
"FAILED"
],
"nickname":"repair_await_completion",
"produces":[
"application/json"
],
"parameters":[
{
"name":"id",
"description":"The repair ID to check for status",
"required":true,
"allowMultiple":false,
"type": "long",
"paramType":"query"
},
{
"name":"timeout",
"description":"Seconds to wait before the query returns even if the repair is not finished. The value -1 or not providing this parameter means no timeout",
"required":false,
"allowMultiple":false,
"type": "long",
"paramType":"query"
}
]
}
]
},
{
"path":"/storage_service/repair_async/{keyspace}",
"operations":[
{
"method":"POST",
"summary":"Invoke repair asynchronously. You can track repair progress by using the get supplying id",
"type": "long",
"type":"int",
"nickname":"repair_async",
"produces":[
"application/json"
@@ -1032,14 +910,6 @@
"type":"string",
"paramType":"query"
},
{
"name":"ignore_nodes",
"description":"Which hosts are to ignore in this repair. Multiple hosts can be listed separated by commas.",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"trace",
"description":"If the value is the string 'true' with any capitalization, enable tracing of the repair.",
@@ -1077,7 +947,7 @@
"description":"The repair ID to check for status",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -1175,14 +1045,6 @@
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"ignore_nodes",
"description":"List of dead nodes to ingore in removenode operation",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
@@ -1415,18 +1277,18 @@
},
{
"name":"dynamic_update_interval",
"description":"interval in ms (default 100)",
"description":"integer, in ms (default 100)",
"required":false,
"allowMultiple":false,
"type":"long",
"type":"integer",
"paramType":"query"
},
{
"name":"dynamic_reset_interval",
"description":"interval in ms (default 600,000)",
"description":"integer, in ms (default 600,000)",
"required":false,
"allowMultiple":false,
"type":"long",
"type":"integer",
"paramType":"query"
},
{
@@ -1631,7 +1493,7 @@
"description":"Stream throughput",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -1639,7 +1501,7 @@
{
"method":"GET",
"summary":"Get stream throughput mb per sec",
"type": "long",
"type":"int",
"nickname":"get_stream_throughput_mb_per_sec",
"produces":[
"application/json"
@@ -1655,7 +1517,7 @@
{
"method":"GET",
"summary":"get compaction throughput mb per sec",
"type": "long",
"type":"int",
"nickname":"get_compaction_throughput_mb_per_sec",
"produces":[
"application/json"
@@ -1677,7 +1539,7 @@
"description":"compaction throughput",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -1834,22 +1696,6 @@
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"load_and_stream",
"description":"Load the sstables and stream to all replica nodes that owns the data",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"primary_replica_only",
"description":"Load the sstables and stream to primary replica node that owns the data. Repair is needed after the load and stream process",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
@@ -1960,14 +1806,6 @@
"allowMultiple":false,
"type":"long",
"paramType":"query"
},
{
"name":"fast",
"description":"Lightweight tracing mode: if true, slow queries tracing records only session headers",
"required":false,
"allowMultiple":false,
"type":"boolean",
"paramType":"query"
}
]
},
@@ -2105,7 +1943,7 @@
{
"method":"GET",
"summary":"Returns the threshold for warning of queries with many tombstones",
"type": "long",
"type":"int",
"nickname":"get_tombstone_warn_threshold",
"produces":[
"application/json"
@@ -2127,7 +1965,7 @@
"description":"tombstone debug threshold",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -2140,7 +1978,7 @@
{
"method":"GET",
"summary":"",
"type": "long",
"type":"int",
"nickname":"get_tombstone_failure_threshold",
"produces":[
"application/json"
@@ -2162,7 +2000,7 @@
"description":"tombstone debug threshold",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -2175,7 +2013,7 @@
{
"method":"GET",
"summary":"Returns the threshold for rejecting queries due to a large batch size",
"type": "long",
"type":"int",
"nickname":"get_batch_size_failure_threshold",
"produces":[
"application/json"
@@ -2197,7 +2035,7 @@
"description":"batch size debug threshold",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -2221,7 +2059,7 @@
"description":"throttle in kb",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -2234,7 +2072,7 @@
{
"method":"GET",
"summary":"Get load",
"type": "long",
"type":"int",
"nickname":"get_metrics_load",
"produces":[
"application/json"
@@ -2250,7 +2088,7 @@
{
"method":"GET",
"summary":"Get exceptions",
"type": "long",
"type":"int",
"nickname":"get_exceptions",
"produces":[
"application/json"
@@ -2266,7 +2104,7 @@
{
"method":"GET",
"summary":"Get total hints in progress",
"type": "long",
"type":"int",
"nickname":"get_total_hints_in_progress",
"produces":[
"application/json"
@@ -2282,7 +2120,7 @@
{
"method":"GET",
"summary":"Get total hints",
"type": "long",
"type":"int",
"nickname":"get_total_hints",
"produces":[
"application/json"
@@ -2291,77 +2129,7 @@
]
}
]
},
{
"path":"/storage_service/view_build_statuses/{keyspace}/{view}",
"operations":[
{
"method":"GET",
"summary":"Gets the progress of a materialized view build",
"type":"array",
"items":{
"type":"mapper"
},
"nickname":"view_build_statuses",
"produces":[
"application/json"
],
"parameters":[
{
"name":"keyspace",
"description":"The keyspace",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"view",
"description":"View name",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
}
]
},
{
"path":"/storage_service/sstable_info",
"operations":[
{
"method":"GET",
"summary":"SSTable information",
"type":"array",
"items":{
"type":"table_sstables"
},
"nickname":"sstable_info",
"produces":[
"application/json"
],
"parameters":[
{
"name":"keyspace",
"description":"The keyspace",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"cf",
"description":"column family name",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
]
}
}
],
"models":{
"mapper":{
@@ -2466,10 +2234,6 @@
"threshold":{
"type":"long",
"description":"The slow query logging threshold in microseconds. Queries that takes longer, will be logged"
},
"fast":{
"type":"boolean",
"description":"Is lightweight tracing mode enabled. In that mode tracing ignore events and tracks only sessions."
}
}
},
@@ -2525,92 +2289,6 @@
"description":"The endpoint details"
}
}
},
"named_maps":{
"id":"named_maps",
"properties":{
"group":{
"type":"string"
},
"attributes":{
"type":"array",
"items":{
"type":"mapper"
}
}
}
},
"sstable":{
"id":"sstable",
"properties":{
"size":{
"type":"long",
"description":"Total size in bytes of sstable"
},
"data_size":{
"type":"long",
"description":"The size in bytes on disk of data"
},
"index_size":{
"type":"long",
"description":"The size in bytes on disk of index"
},
"filter_size":{
"type":"long",
"description":"The size in bytes on disk of filter"
},
"timestamp":{
"type":"datetime",
"description":"File creation time"
},
"generation":{
"type":"long",
"description":"SSTable generation"
},
"level":{
"type":"long",
"description":"SSTable level"
},
"version":{
"type":"string",
"enum":[
"ka", "la", "mc", "md"
],
"description":"SSTable version"
},
"properties":{
"type":"array",
"description":"SSTable attributes",
"items":{
"type":"mapper"
}
},
"extended_properties":{
"type":"array",
"description":"SSTable extended attributes",
"items":{
"type":"named_maps"
}
}
}
},
"table_sstables":{
"id":"table_sstables",
"description":"Per-table SSTable info and attributes",
"properties":{
"keyspace":{
"type":"string"
},
"table":{
"type":"string"
},
"sstables":{
"type":"array",
"items":{
"$ref":"sstable"
}
}
}
}
}
}

View File

@@ -32,7 +32,7 @@
{
"method":"GET",
"summary":"Get number of active outbound streams",
"type": "long",
"type":"int",
"nickname":"get_all_active_streams_outbound",
"produces":[
"application/json"
@@ -48,7 +48,7 @@
{
"method":"GET",
"summary":"Get total incoming bytes",
"type": "long",
"type":"int",
"nickname":"get_total_incoming_bytes",
"produces":[
"application/json"
@@ -72,7 +72,7 @@
{
"method":"GET",
"summary":"Get all total incoming bytes",
"type": "long",
"type":"int",
"nickname":"get_all_total_incoming_bytes",
"produces":[
"application/json"
@@ -88,7 +88,7 @@
{
"method":"GET",
"summary":"Get total outgoing bytes",
"type": "long",
"type":"int",
"nickname":"get_total_outgoing_bytes",
"produces":[
"application/json"
@@ -112,7 +112,7 @@
{
"method":"GET",
"summary":"Get all total outgoing bytes",
"type": "long",
"type":"int",
"nickname":"get_all_total_outgoing_bytes",
"produces":[
"application/json"
@@ -154,7 +154,7 @@
"description":"The peer"
},
"session_index":{
"type": "long",
"type":"int",
"description":"The session index"
},
"connecting":{
@@ -211,7 +211,7 @@
"description":"The ID"
},
"files":{
"type": "long",
"type":"int",
"description":"Number of files to transfer. Can be 0 if nothing to transfer for some streaming request."
},
"total_size":{
@@ -242,7 +242,7 @@
"description":"The peer address"
},
"session_index":{
"type": "long",
"type":"int",
"description":"The session index"
},
"file_name":{

View File

@@ -52,37 +52,6 @@
}
]
},
{
"path":"/system/drop_sstable_caches",
"operations":[
{
"method":"POST",
"summary":"Drop in-memory caches for data which is in sstables",
"type":"void",
"nickname":"drop_sstable_caches",
"produces":[
"application/json"
],
"parameters":[
]
}
]
},
{
"path":"/system/uptime_ms",
"operations":[
{
"method":"GET",
"summary":"Get system uptime, in milliseconds",
"type":"long",
"nickname":"get_system_uptime",
"produces":[
"application/json"
],
"parameters":[]
}
]
},
{
"path":"/system/logger/{name}",
"operations":[

View File

@@ -20,9 +20,9 @@
*/
#include "api.hh"
#include <seastar/http/file_handler.hh>
#include <seastar/http/transformers.hh>
#include <seastar/http/api_docs.hh>
#include "http/file_handler.hh"
#include "http/transformers.hh"
#include "http/api_docs.hh"
#include "storage_service.hh"
#include "commitlog.hh"
#include "gossiper.hh"
@@ -36,13 +36,9 @@
#include "endpoint_snitch.hh"
#include "compaction_manager.hh"
#include "hinted_handoff.hh"
#include "error_injection.hh"
#include <seastar/http/exception.hh>
#include "http/exception.hh"
#include "stream_manager.hh"
#include "system.hh"
#include "api/config.hh"
logging::logger apilog("api");
namespace api {
@@ -75,13 +71,6 @@ future<> set_server_init(http_context& ctx) {
});
}
future<> set_server_config(http_context& ctx) {
auto rb02 = std::make_shared < api_registry_builder20 > (ctx.api_doc, "/v2");
return ctx.http_server.set_routes([&ctx, rb02](routes& r) {
set_config(rb02, ctx, r);
});
}
static future<> register_api(http_context& ctx, const sstring& api_name,
const sstring api_desc,
std::function<void(http_context& ctx, routes& r)> f) {
@@ -93,42 +82,10 @@ static future<> register_api(http_context& ctx, const sstring& api_name,
});
}
future<> set_transport_controller(http_context& ctx, cql_transport::controller& ctl) {
return ctx.http_server.set_routes([&ctx, &ctl] (routes& r) { set_transport_controller(ctx, r, ctl); });
}
future<> unset_transport_controller(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_transport_controller(ctx, r); });
}
future<> set_rpc_controller(http_context& ctx, thrift_controller& ctl) {
return ctx.http_server.set_routes([&ctx, &ctl] (routes& r) { set_rpc_controller(ctx, r, ctl); });
}
future<> unset_rpc_controller(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_rpc_controller(ctx, r); });
}
future<> set_server_storage_service(http_context& ctx) {
return register_api(ctx, "storage_service", "The storage service API", set_storage_service);
}
future<> set_server_repair(http_context& ctx, sharded<netw::messaging_service>& ms) {
return ctx.http_server.set_routes([&ctx, &ms] (routes& r) { set_repair(ctx, r, ms); });
}
future<> unset_server_repair(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_repair(ctx, r); });
}
future<> set_server_snapshot(http_context& ctx, sharded<db::snapshot_ctl>& snap_ctl) {
return ctx.http_server.set_routes([&ctx, &snap_ctl] (routes& r) { set_snapshot(ctx, r, snap_ctl); });
}
future<> unset_server_snapshot(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_snapshot(ctx, r); });
}
future<> set_server_snitch(http_context& ctx) {
return register_api(ctx, "endpoint_snitch_info", "The endpoint snitch info API", set_endpoint_snitch);
}
@@ -143,14 +100,9 @@ future<> set_server_load_sstable(http_context& ctx) {
"The column family API", set_column_family);
}
future<> set_server_messaging_service(http_context& ctx, sharded<netw::messaging_service>& ms) {
future<> set_server_messaging_service(http_context& ctx) {
return register_api(ctx, "messaging_service",
"The messaging service API", [&ms] (http_context& ctx, routes& r) {
set_messaging_service(ctx, r, ms);
});
}
future<> unset_server_messaging_service(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_messaging_service(ctx, r); });
"The messaging service API", set_messaging_service);
}
future<> set_server_storage_proxy(http_context& ctx) {
@@ -197,9 +149,6 @@ future<> set_server_done(http_context& ctx) {
rb->register_function(r, "collectd",
"The collectd API");
set_collectd(ctx, r);
rb->register_function(r, "error_injection",
"The error injection API");
set_error_injection(ctx, r);
});
}

View File

@@ -21,15 +21,13 @@
#pragma once
#include <seastar/json/json_elements.hh>
#include <type_traits>
#include "json/json_elements.hh"
#include <boost/lexical_cast.hpp>
#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string/classification.hpp>
#include <boost/units/detail/utility.hpp>
#include "api/api-doc/utils.json.hh"
#include "utils/histogram.hh"
#include <seastar/http/exception.hh>
#include "http/exception.hh"
#include "api_init.hh"
#include "seastarx.hh"
@@ -218,44 +216,4 @@ std::vector<T> concat(std::vector<T> a, std::vector<T>&& b) {
return a;
}
template <class T, class Base = T>
class req_param {
public:
sstring name;
sstring param;
T value;
req_param(const request& req, sstring name, T default_val) : name(name) {
param = req.get_query_param(name);
if (param.empty()) {
value = default_val;
return;
}
try {
// boost::lexical_cast does not use boolalpha. Converting a
// true/false throws exceptions. We don't want that.
if constexpr (std::is_same_v<Base, bool>) {
// Cannot use boolalpha because we (probably) want to
// accept 1 and 0 as well as true and false. And True. And fAlse.
std::transform(param.begin(), param.end(), param.begin(), ::tolower);
if (param == "true" || param == "1") {
value = T(true);
} else if (param == "false" || param == "0") {
value = T(false);
} else {
throw boost::bad_lexical_cast{};
}
} else {
value = T{boost::lexical_cast<Base>(param)};
}
} catch (boost::bad_lexical_cast&) {
throw bad_param_exception(format("{} ({}): type error - should be {}", name, param, boost::units::detail::demangle(typeid(Base).name())));
}
}
operator T() const { return value; }
};
utils_json::estimated_histogram time_to_json_histogram(const utils::time_estimated_histogram& val);
}

View File

@@ -19,16 +19,9 @@
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "database_fwd.hh"
#include "database.hh"
#include "service/storage_proxy.hh"
#include <seastar/http/httpd.hh>
namespace service { class load_meter; }
namespace locator { class shared_token_metadata; }
namespace cql_transport { class controller; }
class thrift_controller;
namespace db { class snapshot_ctl; }
namespace netw { class messaging_service; }
#include "http/httpd.hh"
namespace api {
@@ -38,34 +31,18 @@ struct http_context {
httpd::http_server_control http_server;
distributed<database>& db;
distributed<service::storage_proxy>& sp;
service::load_meter& lmeter;
const sharded<locator::shared_token_metadata>& shared_token_metadata;
http_context(distributed<database>& _db,
distributed<service::storage_proxy>& _sp,
service::load_meter& _lm, const sharded<locator::shared_token_metadata>& _stm)
: db(_db), sp(_sp), lmeter(_lm), shared_token_metadata(_stm) {
distributed<service::storage_proxy>& _sp)
: db(_db), sp(_sp) {
}
const locator::token_metadata& get_token_metadata();
};
future<> set_server_init(http_context& ctx);
future<> set_server_config(http_context& ctx);
future<> set_server_snitch(http_context& ctx);
future<> set_server_storage_service(http_context& ctx);
future<> set_server_repair(http_context& ctx, sharded<netw::messaging_service>& ms);
future<> unset_server_repair(http_context& ctx);
future<> set_transport_controller(http_context& ctx, cql_transport::controller& ctl);
future<> unset_transport_controller(http_context& ctx);
future<> set_rpc_controller(http_context& ctx, thrift_controller& ctl);
future<> unset_rpc_controller(http_context& ctx);
future<> set_server_snapshot(http_context& ctx, sharded<db::snapshot_ctl>& snap_ctl);
future<> unset_server_snapshot(http_context& ctx);
future<> set_server_gossip(http_context& ctx);
future<> set_server_load_sstable(http_context& ctx);
future<> set_server_messaging_service(http_context& ctx, sharded<netw::messaging_service>& ms);
future<> unset_server_messaging_service(http_context& ctx);
future<> set_server_messaging_service(http_context& ctx);
future<> set_server_storage_proxy(http_context& ctx);
future<> set_server_stream_manager(http_context& ctx);
future<> set_server_gossip_settle(http_context& ctx);

View File

@@ -208,11 +208,9 @@ void set_cache_service(http_context& ctx, routes& r) {
});
cs::get_row_capacity.set(r, [&ctx] (std::unique_ptr<request> req) {
return ctx.db.map_reduce0([](database& db) -> uint64_t {
return db.row_cache_tracker().region().occupancy().used_space();
}, uint64_t(0), std::plus<uint64_t>()).then([](const int64_t& res) {
return make_ready_future<json::json_return_type>(res);
});
return map_reduce_cf(ctx, uint64_t(0), [](const column_family& cf) {
return cf.get_row_cache().get_cache_tracker().region().occupancy().used_space();
}, std::plus<uint64_t>());
});
cs::get_row_hits.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -253,19 +251,15 @@ void set_cache_service(http_context& ctx, routes& r) {
cs::get_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {
// In origin row size is the weighted size.
// We currently do not support weights, so we use num entries instead
return ctx.db.map_reduce0([](database& db) -> uint64_t {
return db.row_cache_tracker().partitions();
}, uint64_t(0), std::plus<uint64_t>()).then([](const int64_t& res) {
return make_ready_future<json::json_return_type>(res);
});
return map_reduce_cf(ctx, 0, [](const column_family& cf) {
return cf.get_row_cache().partitions();
}, std::plus<uint64_t>());
});
cs::get_row_entries.set(r, [&ctx] (std::unique_ptr<request> req) {
return ctx.db.map_reduce0([](database& db) -> uint64_t {
return db.row_cache_tracker().partitions();
}, uint64_t(0), std::plus<uint64_t>()).then([](const int64_t& res) {
return make_ready_future<json::json_return_type>(res);
});
return map_reduce_cf(ctx, 0, [](const column_family& cf) {
return cf.get_row_cache().partitions();
}, std::plus<uint64_t>());
});
cs::get_counter_capacity.set(r, [] (std::unique_ptr<request> req) {

View File

@@ -21,8 +21,8 @@
#include "collectd.hh"
#include "api/api-doc/collectd.json.hh"
#include <seastar/core/scollectd.hh>
#include <seastar/core/scollectd_api.hh>
#include "core/scollectd.hh"
#include "core/scollectd_api.hh"
#include "endian.h"
#include <boost/range/irange.hpp>
#include <regex>
@@ -64,7 +64,7 @@ static const char* str_to_regex(const sstring& v) {
void set_collectd(http_context& ctx, routes& r) {
cd::get_collectd.set(r, [&ctx](std::unique_ptr<request> req) {
auto id = ::make_shared<scollectd::type_instance_id>(req->param["pluginid"],
auto id = make_shared<scollectd::type_instance_id>(req->param["pluginid"],
req->get_query_param("instance"), req->get_query_param("type"),
req->get_query_param("type_instance"));

View File

@@ -22,15 +22,10 @@
#include "column_family.hh"
#include "api/api-doc/column_family.json.hh"
#include <vector>
#include <seastar/http/exception.hh>
#include "http/exception.hh"
#include "sstables/sstables.hh"
#include "utils/estimated_histogram.hh"
#include <algorithm>
#include "db/system_keyspace_view_types.hh"
#include "db/data_listeners.hh"
#include "storage_service.hh"
extern logging::logger apilog;
namespace api {
using namespace httpd;
@@ -39,7 +34,7 @@ using namespace std;
using namespace json;
namespace cf = httpd::column_family_json;
std::tuple<sstring, sstring> parse_fully_qualified_cf_name(sstring name) {
const utils::UUID& get_uuid(const sstring& name, const database& db) {
auto pos = name.find("%3A");
size_t end;
if (pos == sstring::npos) {
@@ -51,22 +46,14 @@ std::tuple<sstring, sstring> parse_fully_qualified_cf_name(sstring name) {
} else {
end = pos + 3;
}
return std::make_tuple(name.substr(0, pos), name.substr(end));
}
const utils::UUID& get_uuid(const sstring& ks, const sstring& cf, const database& db) {
try {
return db.find_uuid(ks, cf);
return db.find_uuid(name.substr(0, pos), name.substr(end));
} catch (std::out_of_range& e) {
throw bad_param_exception(format("Column family '{}:{}' not found", ks, cf));
throw bad_param_exception("Column family '" + name.substr(0, pos) + ":"
+ name.substr(end) + "' not found");
}
}
const utils::UUID& get_uuid(const sstring& name, const database& db) {
auto [ks, cf] = parse_fully_qualified_cf_name(name);
return get_uuid(ks, cf, db);
}
future<> foreach_column_family(http_context& ctx, const sstring& name, function<void(column_family&)> f) {
auto uuid = get_uuid(name, ctx.db.local());
@@ -76,28 +63,28 @@ future<> foreach_column_family(http_context& ctx, const sstring& name, function<
}
future<json::json_return_type> get_cf_stats(http_context& ctx, const sstring& name,
int64_t column_family_stats::*f) {
int64_t column_family::stats::*f) {
return map_reduce_cf(ctx, name, int64_t(0), [f](const column_family& cf) {
return cf.get_stats().*f;
}, std::plus<int64_t>());
}
future<json::json_return_type> get_cf_stats(http_context& ctx,
int64_t column_family_stats::*f) {
int64_t column_family::stats::*f) {
return map_reduce_cf(ctx, int64_t(0), [f](const column_family& cf) {
return cf.get_stats().*f;
}, std::plus<int64_t>());
}
static future<json::json_return_type> get_cf_stats_count(http_context& ctx, const sstring& name,
utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
return map_reduce_cf(ctx, name, int64_t(0), [f](const column_family& cf) {
return (cf.get_stats().*f).hist.count;
}, std::plus<int64_t>());
}
static future<json::json_return_type> get_cf_stats_sum(http_context& ctx, const sstring& name,
utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
auto uuid = get_uuid(name, ctx.db.local());
return ctx.db.map_reduce0([uuid, f](database& db) {
// Histograms information is sample of the actual load
@@ -113,14 +100,14 @@ static future<json::json_return_type> get_cf_stats_sum(http_context& ctx, const
static future<json::json_return_type> get_cf_stats_count(http_context& ctx,
utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
return map_reduce_cf(ctx, int64_t(0), [f](const column_family& cf) {
return (cf.get_stats().*f).hist.count;
}, std::plus<int64_t>());
}
static future<json::json_return_type> get_cf_histogram(http_context& ctx, const sstring& name,
utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
utils::UUID uuid = get_uuid(name, ctx.db.local());
return ctx.db.map_reduce0([f, uuid](const database& p) {
return (p.find_column_family(uuid).get_stats().*f).hist;},
@@ -131,7 +118,7 @@ static future<json::json_return_type> get_cf_histogram(http_context& ctx, const
});
}
static future<json::json_return_type> get_cf_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
static future<json::json_return_type> get_cf_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
std::function<utils::ihistogram(const database&)> fun = [f] (const database& db) {
utils::ihistogram res;
for (auto i : db.get_column_families()) {
@@ -147,7 +134,7 @@ static future<json::json_return_type> get_cf_histogram(http_context& ctx, utils:
}
static future<json::json_return_type> get_cf_rate_and_histogram(http_context& ctx, const sstring& name,
utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
utils::UUID uuid = get_uuid(name, ctx.db.local());
return ctx.db.map_reduce0([f, uuid](const database& p) {
return (p.find_column_family(uuid).get_stats().*f).rate();},
@@ -158,7 +145,7 @@ static future<json::json_return_type> get_cf_rate_and_histogram(http_context& c
});
}
static future<json::json_return_type> get_cf_rate_and_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family_stats::*f) {
static future<json::json_return_type> get_cf_rate_and_histogram(http_context& ctx, utils::timed_rate_moving_average_and_histogram column_family::stats::*f) {
std::function<utils::rate_moving_average_and_histogram(const database&)> fun = [f] (const database& db) {
utils::rate_moving_average_and_histogram res;
for (auto i : db.get_column_families()) {
@@ -179,27 +166,27 @@ static future<json::json_return_type> get_cf_unleveled_sstables(http_context& ct
}, std::plus<int64_t>());
}
static int64_t min_partition_size(column_family& cf) {
static int64_t min_row_size(column_family& cf) {
int64_t res = INT64_MAX;
for (auto sstables = cf.get_sstables(); auto& i : *sstables) {
res = std::min(res, i->get_stats_metadata().estimated_partition_size.min());
for (auto i: *cf.get_sstables() ) {
res = std::min(res, i->get_stats_metadata().estimated_row_size.min());
}
return (res == INT64_MAX) ? 0 : res;
}
static int64_t max_partition_size(column_family& cf) {
static int64_t max_row_size(column_family& cf) {
int64_t res = 0;
for (auto sstables = cf.get_sstables(); auto& i : *sstables) {
res = std::max(i->get_stats_metadata().estimated_partition_size.max(), res);
for (auto i: *cf.get_sstables() ) {
res = std::max(i->get_stats_metadata().estimated_row_size.max(), res);
}
return res;
}
static integral_ratio_holder mean_partition_size(column_family& cf) {
static integral_ratio_holder mean_row_size(column_family& cf) {
integral_ratio_holder res;
for (auto sstables = cf.get_sstables(); auto& i : *sstables) {
auto c = i->get_stats_metadata().estimated_partition_size.count();
res.sub += i->get_stats_metadata().estimated_partition_size.mean() * c;
for (auto i: *cf.get_sstables() ) {
auto c = i->get_stats_metadata().estimated_row_size.count();
res.sub += i->get_stats_metadata().estimated_row_size.mean() * c;
res.total += c;
}
return res;
@@ -250,22 +237,17 @@ static future<json::json_return_type> sum_sstable(http_context& ctx, bool total)
});
}
future<json::json_return_type> map_reduce_cf_time_histogram(http_context& ctx, const sstring& name, std::function<utils::time_estimated_histogram(const column_family&)> f) {
return map_reduce_cf_raw(ctx, name, utils::time_estimated_histogram(), f, utils::time_estimated_histogram_merge).then([](const utils::time_estimated_histogram& res) {
return make_ready_future<json::json_return_type>(time_to_json_histogram(res));
});
}
template <typename T>
class sum_ratio {
uint64_t _n = 0;
T _total = 0;
public:
void operator()(T value) {
future<> operator()(T value) {
if (value > 0) {
_total += value;
_n++;
}
return make_ready_future<>();
}
// Returns average value of all registered ratios.
T get() && {
@@ -275,7 +257,7 @@ public:
static double get_compression_ratio(column_family& cf) {
sum_ratio<double> result;
for (auto sstables = cf.get_sstables(); auto& i : *sstables) {
for (auto i : *cf.get_sstables()) {
auto compression_ratio = i->get_compression_ratio();
if (compression_ratio != sstables::metadata_collector::NO_COMPRESSION_RATIO) {
result(compression_ratio);
@@ -311,8 +293,8 @@ void set_column_family(http_context& ctx, routes& r) {
return res;
});
cf::get_column_family.set(r, [&ctx] (std::unique_ptr<request> req){
std::list<cf::column_family_info> res;
cf::get_column_family.set(r, [&ctx] (const_req req){
vector<cf::column_family_info> res;
for (auto i: ctx.db.local().get_column_families_mapping()) {
cf::column_family_info info;
info.ks = i.first.first;
@@ -320,7 +302,7 @@ void set_column_family(http_context& ctx, routes& r) {
info.type = "ColumnFamilies";
res.push_back(info);
}
return make_ready_future<json::json_return_type>(json::stream_range_as_array(std::move(res), std::identity()));
return res;
});
cf::get_column_family_name_keyspace.set(r, [&ctx] (const_req req){
@@ -332,15 +314,15 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_memtable_columns_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], uint64_t{0}, [](column_family& cf) {
return map_reduce_cf(ctx, req->param["name"], 0, [](column_family& cf) {
return cf.active_memtable().partition_count();
}, std::plus<>());
}, std::plus<int>());
});
cf::get_all_memtable_columns_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, uint64_t{0}, [](column_family& cf) {
return map_reduce_cf(ctx, 0, [](column_family& cf) {
return cf.active_memtable().partition_count();
}, std::plus<>());
}, std::plus<int>());
});
cf::get_memtable_on_heap_size.set(r, [] (const_req req) {
@@ -414,31 +396,29 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_memtable_switch_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx,req->param["name"] ,&column_family_stats::memtable_switch_count);
return get_cf_stats(ctx,req->param["name"] ,&column_family::stats::memtable_switch_count);
});
cf::get_all_memtable_switch_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx, &column_family_stats::memtable_switch_count);
return get_cf_stats(ctx, &column_family::stats::memtable_switch_count);
});
// FIXME: this refers to partitions, not rows.
cf::get_estimated_row_size_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
utils::estimated_histogram res(0);
for (auto sstables = cf.get_sstables(); auto& i : *sstables) {
res.merge(i->get_stats_metadata().estimated_partition_size);
for (auto i: *cf.get_sstables() ) {
res.merge(i->get_stats_metadata().estimated_row_size);
}
return res;
},
utils::estimated_histogram_merge, utils_json::estimated_histogram());
});
// FIXME: this refers to partitions, not rows.
cf::get_estimated_row_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], int64_t(0), [](column_family& cf) {
uint64_t res = 0;
for (auto sstables = cf.get_sstables(); auto& i : *sstables) {
res += i->get_stats_metadata().estimated_partition_size.count();
for (auto i: *cf.get_sstables() ) {
res += i->get_stats_metadata().estimated_row_size.count();
}
return res;
},
@@ -448,8 +428,8 @@ void set_column_family(http_context& ctx, routes& r) {
cf::get_estimated_column_count_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
utils::estimated_histogram res(0);
for (auto sstables = cf.get_sstables(); auto& i : *sstables) {
res.merge(i->get_stats_metadata().estimated_cells_count);
for (auto i: *cf.get_sstables() ) {
res.merge(i->get_stats_metadata().estimated_column_count);
}
return res;
},
@@ -463,67 +443,67 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_pending_flushes.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx,req->param["name"] ,&column_family_stats::pending_flushes);
return get_cf_stats(ctx,req->param["name"] ,&column_family::stats::pending_flushes);
});
cf::get_all_pending_flushes.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx, &column_family_stats::pending_flushes);
return get_cf_stats(ctx, &column_family::stats::pending_flushes);
});
cf::get_read.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_count(ctx,req->param["name"] ,&column_family_stats::reads);
return get_cf_stats_count(ctx,req->param["name"] ,&column_family::stats::reads);
});
cf::get_all_read.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_count(ctx, &column_family_stats::reads);
return get_cf_stats_count(ctx, &column_family::stats::reads);
});
cf::get_write.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_count(ctx, req->param["name"] ,&column_family_stats::writes);
return get_cf_stats_count(ctx, req->param["name"] ,&column_family::stats::writes);
});
cf::get_all_write.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_count(ctx, &column_family_stats::writes);
return get_cf_stats_count(ctx, &column_family::stats::writes);
});
cf::get_read_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, req->param["name"], &column_family_stats::reads);
return get_cf_histogram(ctx, req->param["name"], &column_family::stats::reads);
});
cf::get_read_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family_stats::reads);
return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family::stats::reads);
});
cf::get_read_latency.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_sum(ctx,req->param["name"] ,&column_family_stats::reads);
return get_cf_stats_sum(ctx,req->param["name"] ,&column_family::stats::reads);
});
cf::get_write_latency.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats_sum(ctx, req->param["name"] ,&column_family_stats::writes);
return get_cf_stats_sum(ctx, req->param["name"] ,&column_family::stats::writes);
});
cf::get_all_read_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, &column_family_stats::writes);
return get_cf_histogram(ctx, &column_family::stats::writes);
});
cf::get_all_read_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_rate_and_histogram(ctx, &column_family_stats::writes);
return get_cf_rate_and_histogram(ctx, &column_family::stats::writes);
});
cf::get_write_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, req->param["name"], &column_family_stats::writes);
return get_cf_histogram(ctx, req->param["name"], &column_family::stats::writes);
});
cf::get_write_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family_stats::writes);
return get_cf_rate_and_histogram(ctx, req->param["name"], &column_family::stats::writes);
});
cf::get_all_write_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, &column_family_stats::writes);
return get_cf_histogram(ctx, &column_family::stats::writes);
});
cf::get_all_write_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_rate_and_histogram(ctx, &column_family_stats::writes);
return get_cf_rate_and_histogram(ctx, &column_family::stats::writes);
});
cf::get_pending_compactions.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -539,11 +519,11 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_live_ss_table_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx, req->param["name"], &column_family_stats::live_sstable_count);
return get_cf_stats(ctx, req->param["name"], &column_family::stats::live_sstable_count);
});
cf::get_all_live_ss_table_count.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_stats(ctx, &column_family_stats::live_sstable_count);
return get_cf_stats(ctx, &column_family::stats::live_sstable_count);
});
cf::get_unleveled_sstables.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -566,42 +546,35 @@ void set_column_family(http_context& ctx, routes& r) {
return sum_sstable(ctx, true);
});
// FIXME: this refers to partitions, not rows.
cf::get_min_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], INT64_MAX, min_partition_size, min_int64);
return map_reduce_cf(ctx, req->param["name"], INT64_MAX, min_row_size, min_int64);
});
// FIXME: this refers to partitions, not rows.
cf::get_all_min_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, INT64_MAX, min_partition_size, min_int64);
return map_reduce_cf(ctx, INT64_MAX, min_row_size, min_int64);
});
// FIXME: this refers to partitions, not rows.
cf::get_max_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], int64_t(0), max_partition_size, max_int64);
return map_reduce_cf(ctx, req->param["name"], int64_t(0), max_row_size, max_int64);
});
// FIXME: this refers to partitions, not rows.
cf::get_all_max_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, int64_t(0), max_partition_size, max_int64);
return map_reduce_cf(ctx, int64_t(0), max_row_size, max_int64);
});
// FIXME: this refers to partitions, not rows.
cf::get_mean_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {
// Cassandra 3.x mean values are truncated as integrals.
return map_reduce_cf(ctx, req->param["name"], integral_ratio_holder(), mean_partition_size, std::plus<integral_ratio_holder>());
return map_reduce_cf(ctx, req->param["name"], integral_ratio_holder(), mean_row_size, std::plus<integral_ratio_holder>());
});
// FIXME: this refers to partitions, not rows.
cf::get_all_mean_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {
// Cassandra 3.x mean values are truncated as integrals.
return map_reduce_cf(ctx, integral_ratio_holder(), mean_partition_size, std::plus<integral_ratio_holder>());
return map_reduce_cf(ctx, integral_ratio_holder(), mean_row_size, std::plus<integral_ratio_holder>());
});
cf::get_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], uint64_t(0), [] (column_family& cf) {
auto sstables = cf.get_sstables();
return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return std::accumulate(cf.get_sstables()->begin(), cf.get_sstables()->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return s + sst->filter_get_false_positive();
});
}, std::plus<uint64_t>());
@@ -609,8 +582,7 @@ void set_column_family(http_context& ctx, routes& r) {
cf::get_all_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, uint64_t(0), [] (column_family& cf) {
auto sstables = cf.get_sstables();
return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return std::accumulate(cf.get_sstables()->begin(), cf.get_sstables()->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return s + sst->filter_get_false_positive();
});
}, std::plus<uint64_t>());
@@ -618,8 +590,7 @@ void set_column_family(http_context& ctx, routes& r) {
cf::get_recent_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], uint64_t(0), [] (column_family& cf) {
auto sstables = cf.get_sstables();
return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return std::accumulate(cf.get_sstables()->begin(), cf.get_sstables()->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return s + sst->filter_get_recent_false_positive();
});
}, std::plus<uint64_t>());
@@ -627,8 +598,7 @@ void set_column_family(http_context& ctx, routes& r) {
cf::get_all_recent_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, uint64_t(0), [] (column_family& cf) {
auto sstables = cf.get_sstables();
return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return std::accumulate(cf.get_sstables()->begin(), cf.get_sstables()->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return s + sst->filter_get_recent_false_positive();
});
}, std::plus<uint64_t>());
@@ -660,54 +630,48 @@ void set_column_family(http_context& ctx, routes& r) {
cf::get_bloom_filter_disk_space_used.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], uint64_t(0), [] (column_family& cf) {
auto sstables = cf.get_sstables();
return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return s + sst->filter_size();
return std::accumulate(cf.get_sstables()->begin(), cf.get_sstables()->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return sst->filter_size();
});
}, std::plus<uint64_t>());
});
cf::get_all_bloom_filter_disk_space_used.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, uint64_t(0), [] (column_family& cf) {
auto sstables = cf.get_sstables();
return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return s + sst->filter_size();
return std::accumulate(cf.get_sstables()->begin(), cf.get_sstables()->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return sst->filter_size();
});
}, std::plus<uint64_t>());
});
cf::get_bloom_filter_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], uint64_t(0), [] (column_family& cf) {
auto sstables = cf.get_sstables();
return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return s + sst->filter_memory_size();
return std::accumulate(cf.get_sstables()->begin(), cf.get_sstables()->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return sst->filter_memory_size();
});
}, std::plus<uint64_t>());
});
cf::get_all_bloom_filter_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, uint64_t(0), [] (column_family& cf) {
auto sstables = cf.get_sstables();
return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return s + sst->filter_memory_size();
return std::accumulate(cf.get_sstables()->begin(), cf.get_sstables()->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return sst->filter_memory_size();
});
}, std::plus<uint64_t>());
});
cf::get_index_summary_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, req->param["name"], uint64_t(0), [] (column_family& cf) {
auto sstables = cf.get_sstables();
return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return s + sst->get_summary().memory_footprint();
return std::accumulate(cf.get_sstables()->begin(), cf.get_sstables()->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return sst->get_summary().memory_footprint();
});
}, std::plus<uint64_t>());
});
cf::get_all_index_summary_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, uint64_t(0), [] (column_family& cf) {
auto sstables = cf.get_sstables();
return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return s + sst->get_summary().memory_footprint();
return std::accumulate(cf.get_sstables()->begin(), cf.get_sstables()->end(), uint64_t(0), [](uint64_t s, auto& sst) {
return sst->get_summary().memory_footprint();
});
}, std::plus<uint64_t>());
});
@@ -812,22 +776,25 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_cas_prepare.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf_time_histogram(ctx, req->param["name"], [](const column_family& cf) {
return cf.get_stats().estimated_cas_prepare;
});
cf::get_cas_prepare.set(r, [] (std::unique_ptr<request> req) {
//TBD
unimplemented();
//auto id = get_uuid(req->param["name"], ctx.db.local());
return make_ready_future<json::json_return_type>(0);
});
cf::get_cas_propose.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf_time_histogram(ctx, req->param["name"], [](const column_family& cf) {
return cf.get_stats().estimated_cas_accept;
});
cf::get_cas_propose.set(r, [] (std::unique_ptr<request> req) {
//TBD
unimplemented();
//auto id = get_uuid(req->param["name"], ctx.db.local());
return make_ready_future<json::json_return_type>(0);
});
cf::get_cas_commit.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf_time_histogram(ctx, req->param["name"], [](const column_family& cf) {
return cf.get_stats().estimated_cas_learn;
});
cf::get_cas_commit.set(r, [] (std::unique_ptr<request> req) {
//TBD
unimplemented();
//auto id = get_uuid(req->param["name"], ctx.db.local());
return make_ready_future<json::json_return_type>(0);
});
cf::get_sstables_per_read_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -838,11 +805,11 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_tombstone_scanned_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, req->param["name"], &column_family_stats::tombstone_scanned);
return get_cf_histogram(ctx, req->param["name"], &column_family::stats::tombstone_scanned);
});
cf::get_live_scanned_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {
return get_cf_histogram(ctx, req->param["name"], &column_family_stats::live_scanned);
return get_cf_histogram(ctx, req->param["name"], &column_family::stats::live_scanned);
});
cf::get_col_update_time_delta_histogram.set(r, [] (std::unique_ptr<request> req) {
@@ -853,51 +820,19 @@ void set_column_family(http_context& ctx, routes& r) {
return make_ready_future<json::json_return_type>(res);
});
cf::get_auto_compaction.set(r, [&ctx] (const_req req) {
const utils::UUID& uuid = get_uuid(req.param["name"], ctx.db.local());
column_family& cf = ctx.db.local().find_column_family(uuid);
return !cf.is_auto_compaction_disabled_by_user();
cf::is_auto_compaction_disabled.set(r, [] (const_req req) {
// FIXME
// currently auto compaction is disable
// it should be changed when it would have an API
return true;
});
cf::enable_auto_compaction.set(r, [&ctx](std::unique_ptr<request> req) {
return foreach_column_family(ctx, req->param["name"], [](column_family &cf) {
cf.enable_auto_compaction();
}).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
cf::get_built_indexes.set(r, [](const_req) {
// FIXME
// Currently there are no index support
return std::vector<sstring>();
});
cf::disable_auto_compaction.set(r, [&ctx](std::unique_ptr<request> req) {
return foreach_column_family(ctx, req->param["name"], [](column_family &cf) {
cf.disable_auto_compaction();
}).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
});
cf::get_built_indexes.set(r, [&ctx](std::unique_ptr<request> req) {
auto ks_cf = parse_fully_qualified_cf_name(req->param["name"]);
auto&& ks = std::get<0>(ks_cf);
auto&& cf_name = std::get<1>(ks_cf);
return db::system_keyspace::load_view_build_progress().then([ks, cf_name, &ctx](const std::vector<db::system_keyspace::view_build_progress>& vb) mutable {
std::set<sstring> vp;
for (auto b : vb) {
if (b.view.first == ks) {
vp.insert(b.view.second);
}
}
std::vector<sstring> res;
auto uuid = get_uuid(ks, cf_name, ctx.db.local());
column_family& cf = ctx.db.local().find_column_family(uuid);
res.reserve(cf.get_index_manager().list_indexes().size());
for (auto&& i : cf.get_index_manager().list_indexes()) {
if (!vp.contains(secondary_index::index_table_name(i.metadata().name()))) {
res.emplace_back(i.metadata().name());
}
}
return make_ready_future<json::json_return_type>(res);
});
});
cf::get_compression_metadata_off_heap_memory_used.set(r, [](const_req) {
// FIXME
@@ -925,15 +860,17 @@ void set_column_family(http_context& ctx, routes& r) {
});
cf::get_read_latency_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return map_reduce_cf_time_histogram(ctx, req->param["name"], [](const column_family& cf) {
return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
return cf.get_stats().estimated_read;
});
},
utils::estimated_histogram_merge, utils_json::estimated_histogram());
});
cf::get_write_latency_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return map_reduce_cf_time_histogram(ctx, req->param["name"], [](const column_family& cf) {
return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](column_family& cf) {
return cf.get_stats().estimated_write;
});
},
utils::estimated_histogram_merge, utils_json::estimated_histogram());
});
cf::set_compaction_strategy_class.set(r, [&ctx](std::unique_ptr<request> req) {
@@ -968,48 +905,5 @@ void set_column_family(http_context& ctx, routes& r) {
return make_ready_future<json::json_return_type>(res);
});
});
cf::get_sstables_for_key.set(r, [&ctx](std::unique_ptr<request> req) {
auto key = req->get_query_param("key");
auto uuid = get_uuid(req->param["name"], ctx.db.local());
return ctx.db.map_reduce0([key, uuid] (database& db) {
return db.find_column_family(uuid).get_sstables_by_partition_key(key);
}, std::unordered_set<sstring>(),
[](std::unordered_set<sstring> a, std::unordered_set<sstring>&& b) mutable {
a.insert(b.begin(),b.end());
return a;
}).then([](const std::unordered_set<sstring>& res) {
return make_ready_future<json::json_return_type>(container_to_vec(res));
});
});
cf::toppartitions.set(r, [&ctx] (std::unique_ptr<request> req) {
auto name = req->param["name"];
auto [ks, cf] = parse_fully_qualified_cf_name(name);
api::req_param<std::chrono::milliseconds, unsigned> duration{*req, "duration", 1000ms};
api::req_param<unsigned> capacity(*req, "capacity", 256);
api::req_param<unsigned> list_size(*req, "list_size", 10);
apilog.info("toppartitions query: name={} duration={} list_size={} capacity={}",
name, duration.param, list_size.param, capacity.param);
return seastar::do_with(db::toppartitions_query(ctx.db, {{ks, cf}}, {}, duration.value, list_size, capacity), [&ctx] (db::toppartitions_query& q) {
return run_toppartitions_query(q, ctx, true);
});
});
cf::force_major_compaction.set(r, [&ctx](std::unique_ptr<request> req) {
if (req->get_query_param("split_output") != "") {
fail(unimplemented::cause::API);
}
return foreach_column_family(ctx, req->param["name"], [](column_family &cf) {
return cf.compact_all_sstables();
}).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
});
}
}

View File

@@ -24,8 +24,6 @@
#include "api.hh"
#include "api/api-doc/column_family.json.hh"
#include "database.hh"
#include <seastar/core/future-util.hh>
#include <any>
namespace api {
@@ -39,15 +37,9 @@ template<class Mapper, class I, class Reducer>
future<I> map_reduce_cf_raw(http_context& ctx, const sstring& name, I init,
Mapper mapper, Reducer reducer) {
auto uuid = get_uuid(name, ctx.db.local());
using mapper_type = std::function<std::unique_ptr<std::any>(database&)>;
using reducer_type = std::function<std::unique_ptr<std::any>(std::unique_ptr<std::any>, std::unique_ptr<std::any>)>;
return ctx.db.map_reduce0(mapper_type([mapper, uuid](database& db) {
return std::make_unique<std::any>(I(mapper(db.find_column_family(uuid))));
}), std::make_unique<std::any>(std::move(init)), reducer_type([reducer = std::move(reducer)] (std::unique_ptr<std::any> a, std::unique_ptr<std::any> b) mutable {
return std::make_unique<std::any>(I(reducer(std::any_cast<I>(std::move(*a)), std::any_cast<I>(std::move(*b)))));
})).then([] (std::unique_ptr<std::any> r) {
return std::any_cast<I>(std::move(*r));
});
return ctx.db.map_reduce0([mapper, uuid](database& db) {
return mapper(db.find_column_family(uuid));
}, init, reducer);
}
@@ -59,46 +51,35 @@ future<json::json_return_type> map_reduce_cf(http_context& ctx, const sstring& n
});
}
template<class Mapper, class I, class Reducer, class Result>
future<I> map_reduce_cf_raw(http_context& ctx, const sstring& name, I init,
Mapper mapper, Reducer reducer, Result result) {
auto uuid = get_uuid(name, ctx.db.local());
return ctx.db.map_reduce0([mapper, uuid](database& db) {
return mapper(db.find_column_family(uuid));
}, init, reducer);
}
template<class Mapper, class I, class Reducer, class Result>
future<json::json_return_type> map_reduce_cf(http_context& ctx, const sstring& name, I init,
Mapper mapper, Reducer reducer, Result result) {
return map_reduce_cf_raw(ctx, name, init, mapper, reducer).then([result](const I& res) mutable {
return map_reduce_cf_raw(ctx, name, init, mapper, reducer, result).then([result](const I& res) mutable {
result = res;
return make_ready_future<json::json_return_type>(result);
});
}
future<json::json_return_type> map_reduce_cf_time_histogram(http_context& ctx, const sstring& name, std::function<utils::time_estimated_histogram(const column_family&)> f);
struct map_reduce_column_families_locally {
std::any init;
std::function<std::unique_ptr<std::any>(column_family&)> mapper;
std::function<std::unique_ptr<std::any>(std::unique_ptr<std::any>, std::unique_ptr<std::any>)> reducer;
future<std::unique_ptr<std::any>> operator()(database& db) const {
auto res = seastar::make_lw_shared<std::unique_ptr<std::any>>(std::make_unique<std::any>(init));
return do_for_each(db.get_column_families(), [res, this](const std::pair<utils::UUID, seastar::lw_shared_ptr<table>>& i) {
*res = std::move(reducer(std::move(*res), mapper(*i.second.get())));
}).then([res] {
return std::move(*res);
});
}
};
template<class Mapper, class I, class Reducer>
future<I> map_reduce_cf_raw(http_context& ctx, I init,
Mapper mapper, Reducer reducer) {
using mapper_type = std::function<std::unique_ptr<std::any>(column_family&)>;
using reducer_type = std::function<std::unique_ptr<std::any>(std::unique_ptr<std::any>, std::unique_ptr<std::any>)>;
auto wrapped_mapper = mapper_type([mapper = std::move(mapper)] (column_family& cf) mutable {
return std::make_unique<std::any>(I(mapper(cf)));
});
auto wrapped_reducer = reducer_type([reducer = std::move(reducer)] (std::unique_ptr<std::any> a, std::unique_ptr<std::any> b) mutable {
return std::make_unique<std::any>(I(reducer(std::any_cast<I>(std::move(*a)), std::any_cast<I>(std::move(*b)))));
});
return ctx.db.map_reduce0(map_reduce_column_families_locally{init,
std::move(wrapped_mapper), wrapped_reducer}, std::make_unique<std::any>(init), wrapped_reducer).then([] (std::unique_ptr<std::any> res) {
return std::any_cast<I>(std::move(*res));
});
return ctx.db.map_reduce0([mapper, init, reducer](database& db) {
auto res = init;
for (auto i : db.get_column_families()) {
res = reducer(res, mapper(*i.second.get()));
}
return res;
}, init, reducer);
}
@@ -111,12 +92,9 @@ future<json::json_return_type> map_reduce_cf(http_context& ctx, I init,
}
future<json::json_return_type> get_cf_stats(http_context& ctx, const sstring& name,
int64_t column_family_stats::*f);
int64_t column_family::stats::*f);
future<json::json_return_type> get_cf_stats(http_context& ctx,
int64_t column_family_stats::*f);
std::tuple<sstring, sstring> parse_fully_qualified_cf_name(sstring name);
int64_t column_family::stats::*f);
}

View File

@@ -20,18 +20,17 @@
*/
#include "commitlog.hh"
#include "db/commitlog/commitlog.hh"
#include <db/commitlog/commitlog.hh>
#include "api/api-doc/commitlog.json.hh"
#include "database.hh"
#include <vector>
namespace api {
template<typename T>
static auto acquire_cl_metric(http_context& ctx, std::function<T (db::commitlog*)> func) {
typedef T ret_type;
template<typename Func>
static auto acquire_cl_metric(http_context& ctx, Func&& func) {
typedef std::result_of_t<Func(db::commitlog *)> ret_type;
return ctx.db.map_reduce0([func = std::move(func)](database& db) {
return ctx.db.map_reduce0([func = std::forward<Func>(func)](database& db) {
if (db.commitlog() == nullptr) {
return make_ready_future<ret_type>();
}
@@ -64,15 +63,15 @@ void set_commitlog(http_context& ctx, routes& r) {
});
httpd::commitlog_json::get_completed_tasks.set(r, [&ctx](std::unique_ptr<request> req) {
return acquire_cl_metric<uint64_t>(ctx, std::bind(&db::commitlog::get_completed_tasks, std::placeholders::_1));
return acquire_cl_metric(ctx, std::bind(&db::commitlog::get_completed_tasks, std::placeholders::_1));
});
httpd::commitlog_json::get_pending_tasks.set(r, [&ctx](std::unique_ptr<request> req) {
return acquire_cl_metric<uint64_t>(ctx, std::bind(&db::commitlog::get_pending_tasks, std::placeholders::_1));
return acquire_cl_metric(ctx, std::bind(&db::commitlog::get_pending_tasks, std::placeholders::_1));
});
httpd::commitlog_json::get_total_commit_log_size.set(r, [&ctx](std::unique_ptr<request> req) {
return acquire_cl_metric<uint64_t>(ctx, std::bind(&db::commitlog::get_total_size, std::placeholders::_1));
return acquire_cl_metric(ctx, std::bind(&db::commitlog::get_total_size, std::placeholders::_1));
});
}

View File

@@ -24,7 +24,6 @@
#include "api/api-doc/compaction_manager.json.hh"
#include "db/system_keyspace.hh"
#include "column_family.hh"
#include <utility>
namespace api {
@@ -39,16 +38,6 @@ static future<json::json_return_type> get_cm_stats(http_context& ctx,
return make_ready_future<json::json_return_type>(res);
});
}
static std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash> sum_pending_tasks(std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>&& a,
const std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& b) {
for (auto&& i : b) {
if (i.second) {
a[i.first] += i.second;
}
}
return std::move(a);
}
void set_compaction_manager(http_context& ctx, routes& r) {
cm::get_compactions.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -58,9 +47,8 @@ void set_compaction_manager(http_context& ctx, routes& r) {
for (const auto& c : cm.get_compactions()) {
cm::summary s;
s.id = c->compaction_uuid.to_sstring();
s.ks = c->ks_name;
s.cf = c->cf_name;
s.ks = c->ks;
s.cf = c->cf;
s.unit = "keys";
s.task_type = sstables::compaction_name(c->type);
s.completed = c->total_keys_written;
@@ -73,32 +61,6 @@ void set_compaction_manager(http_context& ctx, routes& r) {
});
});
cm::get_pending_tasks_by_table.set(r, [&ctx] (std::unique_ptr<request> req) {
return ctx.db.map_reduce0([&ctx](database& db) {
return do_with(std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>(), [&ctx, &db](std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& tasks) {
return do_for_each(db.get_column_families(), [&tasks](const std::pair<utils::UUID, seastar::lw_shared_ptr<table>>& i) {
table& cf = *i.second.get();
tasks[std::make_pair(cf.schema()->ks_name(), cf.schema()->cf_name())] = cf.get_compaction_strategy().estimated_pending_compactions(cf);
return make_ready_future<>();
}).then([&tasks] {
return std::move(tasks);
});
});
}, std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>(), sum_pending_tasks).then(
[](const std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& task_map) {
std::vector<cm::pending_compaction> res;
res.reserve(task_map.size());
for (auto i : task_map) {
cm::pending_compaction task;
task.ks = i.first.first;
task.cf = i.first.second;
task.task = i.second;
res.emplace_back(std::move(task));
}
return make_ready_future<json::json_return_type>(res);
});
});
cm::force_user_defined_compaction.set(r, [] (std::unique_ptr<request> req) {
//TBD
// FIXME
@@ -141,37 +103,29 @@ void set_compaction_manager(http_context& ctx, routes& r) {
});
cm::get_compaction_history.set(r, [] (std::unique_ptr<request> req) {
std::function<future<>(output_stream<char>&&)> f = [](output_stream<char>&& s) {
return do_with(output_stream<char>(std::move(s)), true, [] (output_stream<char>& s, bool& first){
return s.write("[").then([&s, &first] {
return db::system_keyspace::get_compaction_history([&s, &first](const db::system_keyspace::compaction_history_entry& entry) mutable {
cm::history h;
h.id = entry.id.to_sstring();
h.ks = std::move(entry.ks);
h.cf = std::move(entry.cf);
h.compacted_at = entry.compacted_at;
h.bytes_in = entry.bytes_in;
h.bytes_out = entry.bytes_out;
for (auto it : entry.rows_merged) {
httpd::compaction_manager_json::row_merged e;
e.key = it.first;
e.value = it.second;
h.rows_merged.push(std::move(e));
}
auto fut = first ? make_ready_future<>() : s.write(", ");
first = false;
return fut.then([&s, h = std::move(h)] {
return formatter::write(s, h);
});
}).then([&s] {
return s.write("]").then([&s] {
return s.close();
});
});
});
});
};
return make_ready_future<json::json_return_type>(std::move(f));
return db::system_keyspace::get_compaction_history().then([] (std::vector<db::system_keyspace::compaction_history_entry> history) {
std::vector<cm::history> res;
res.reserve(history.size());
for (auto& entry : history) {
cm::history h;
h.id = entry.id.to_sstring();
h.ks = std::move(entry.ks);
h.cf = std::move(entry.cf);
h.compacted_at = entry.compacted_at;
h.bytes_in = entry.bytes_in;
h.bytes_out = entry.bytes_out;
for (auto it : entry.rows_merged) {
httpd::compaction_manager_json::row_merged e;
e.key = it.first;
e.value = it.second;
h.rows_merged.push(std::move(e));
}
res.push_back(std::move(h));
}
return make_ready_future<json::json_return_type>(res);
});
});
cm::get_compaction_info.set(r, [] (std::unique_ptr<request> req) {

View File

@@ -1,119 +0,0 @@
/*
* Copyright 2018 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "api/config.hh"
#include "api/api-doc/config.json.hh"
#include "db/config.hh"
#include "database.hh"
#include <sstream>
#include <boost/algorithm/string/replace.hpp>
namespace api {
template<class T>
json::json_return_type get_json_return_type(const T& val) {
return json::json_return_type(val);
}
/*
* As commented on db::seed_provider_type is not used
* and probably never will.
*
* Just in case, we will return its name
*/
template<>
json::json_return_type get_json_return_type(const db::seed_provider_type& val) {
return json::json_return_type(val.class_name);
}
std::string_view format_type(std::string_view type) {
if (type == "int") {
return "integer";
}
return type;
}
future<> get_config_swagger_entry(std::string_view name, const std::string& description, std::string_view type, bool& first, output_stream<char>& os) {
std::stringstream ss;
if (first) {
first=false;
} else {
ss <<',';
};
ss << "\"/config/" << name <<"\": {"
"\"get\": {"
"\"description\": \"" << boost::replace_all_copy(boost::replace_all_copy(boost::replace_all_copy(description,"\n","\\n"),"\"", "''"), "\t", " ") <<"\","
"\"operationId\": \"find_config_"<< name <<"\","
"\"produces\": ["
"\"application/json\""
"],"
"\"tags\": [\"config\"],"
"\"parameters\": ["
"],"
"\"responses\": {"
"\"200\": {"
"\"description\": \"Config value\","
"\"schema\": {"
"\"type\": \"" << format_type(type) << "\""
"}"
"},"
"\"default\": {"
"\"description\": \"unexpected error\","
"\"schema\": {"
"\"$ref\": \"#/definitions/ErrorModel\""
"}"
"}"
"}"
"}"
"}";
return os.write(ss.str());
}
namespace cs = httpd::config_json;
void set_config(std::shared_ptr < api_registry_builder20 > rb, http_context& ctx, routes& r) {
rb->register_function(r, [&ctx] (output_stream<char>& os) {
return do_with(true, [&os, &ctx] (bool& first) {
auto f = make_ready_future();
for (auto&& cfg_ref : ctx.db.local().get_config().values()) {
auto&& cfg = cfg_ref.get();
f = f.then([&os, &first, &cfg] {
return get_config_swagger_entry(cfg.name(), std::string(cfg.desc()), cfg.type_name(), first, os);
});
}
return f;
});
});
cs::find_config_id.set(r, [&ctx] (const_req r) {
auto id = r.param["id"];
for (auto&& cfg_ref : ctx.db.local().get_config().values()) {
auto&& cfg = cfg_ref.get();
if (id == cfg.name()) {
return cfg.value_as_json();
}
}
throw bad_param_exception(sstring("No such config entry: ") + id);
});
}
}

View File

@@ -1,30 +0,0 @@
/*
* Copyright (C) 2018 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "api.hh"
#include <seastar/http/api_docs.hh>
namespace api {
void set_config(std::shared_ptr<api_registry_builder20> rb, http_context& ctx, routes& r);
}

View File

@@ -1,69 +0,0 @@
/*
* Copyright (C) 2020 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "api/api-doc/error_injection.json.hh"
#include "api/api.hh"
#include <seastar/http/exception.hh>
#include "log.hh"
#include "utils/error_injection.hh"
#include "seastar/core/future-util.hh"
namespace api {
namespace hf = httpd::error_injection_json;
void set_error_injection(http_context& ctx, routes& r) {
hf::enable_injection.set(r, [](std::unique_ptr<request> req) {
sstring injection = req->param["injection"];
bool one_shot = req->get_query_param("one_shot") == "True";
auto& errinj = utils::get_local_injector();
return errinj.enable_on_all(injection, one_shot).then([] {
return make_ready_future<json::json_return_type>(json::json_void());
});
});
hf::get_enabled_injections_on_all.set(r, [](std::unique_ptr<request> req) {
auto& errinj = utils::get_local_injector();
auto ret = errinj.enabled_injections_on_all();
return make_ready_future<json::json_return_type>(ret);
});
hf::disable_injection.set(r, [](std::unique_ptr<request> req) {
sstring injection = req->param["injection"];
auto& errinj = utils::get_local_injector();
return errinj.disable_on_all(injection).then([] {
return make_ready_future<json::json_return_type>(json::json_void());
});
});
hf::disable_on_all.set(r, [](std::unique_ptr<request> req) {
auto& errinj = utils::get_local_injector();
return errinj.disable_on_all().then([] {
return make_ready_future<json::json_return_type>(json::json_void());
});
});
}
} // namespace api

View File

@@ -1,30 +0,0 @@
/*
* Copyright (C) 2019 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "api.hh"
namespace api {
void set_error_injection(http_context& ctx, routes& r);
}

View File

@@ -21,7 +21,7 @@
#include "gossiper.hh"
#include "api/api-doc/gossiper.json.hh"
#include "gms/gossiper.hh"
#include <gms/gossiper.hh>
namespace api {
using namespace json;
@@ -66,13 +66,6 @@ void set_gossiper(http_context& ctx, routes& r) {
return make_ready_future<json::json_return_type>(json_void());
});
});
httpd::gossiper_json::force_remove_endpoint.set(r, [](std::unique_ptr<request> req) {
gms::inet_address ep(req->param["addr"]);
return gms::get_local_gossiper().force_remove_endpoint(ep).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
});
}
}

View File

@@ -23,10 +23,9 @@
#include "api/lsa.hh"
#include "api/api.hh"
#include <seastar/http/exception.hh>
#include "http/exception.hh"
#include "utils/logalloc.hh"
#include "log.hh"
#include "database.hh"
namespace api {

View File

@@ -21,7 +21,7 @@
#include "messaging_service.hh"
#include "message/messaging_service.hh"
#include <seastar/rpc/rpc_types.hh>
#include "rpc/rpc_types.hh"
#include "api/api-doc/messaging_service.json.hh"
#include <iostream>
#include <sstream>
@@ -53,8 +53,8 @@ std::vector<message_counter> map_to_message_counters(
* according to a function that it gets as a parameter.
*
*/
future_json_function get_client_getter(sharded<netw::messaging_service>& ms, std::function<uint64_t(const shard_info&)> f) {
return [&ms, f](std::unique_ptr<request> req) {
future_json_function get_client_getter(std::function<uint64_t(const shard_info&)> f) {
return [f](std::unique_ptr<request> req) {
using map_type = std::unordered_map<gms::inet_address, uint64_t>;
auto get_shard_map = [f](messaging_service& ms) {
std::unordered_map<gms::inet_address, unsigned long> map;
@@ -63,74 +63,70 @@ future_json_function get_client_getter(sharded<netw::messaging_service>& ms, std
});
return map;
};
return ms.map_reduce0(get_shard_map, map_type(), map_sum<map_type>).
return get_messaging_service().map_reduce0(get_shard_map, map_type(), map_sum<map_type>).
then([](map_type&& map) {
return make_ready_future<json::json_return_type>(map_to_message_counters(map));
});
};
}
future_json_function get_server_getter(sharded<netw::messaging_service>& ms, std::function<uint64_t(const rpc::stats&)> f) {
return [&ms, f](std::unique_ptr<request> req) {
future_json_function get_server_getter(std::function<uint64_t(const rpc::stats&)> f) {
return [f](std::unique_ptr<request> req) {
using map_type = std::unordered_map<gms::inet_address, uint64_t>;
auto get_shard_map = [f](messaging_service& ms) {
std::unordered_map<gms::inet_address, unsigned long> map;
ms.foreach_server_connection_stats([&map, f] (const rpc::client_info& info, const rpc::stats& stats) mutable {
map[gms::inet_address(info.addr.addr())] = f(stats);
map[gms::inet_address(net::ipv4_address(info.addr))] = f(stats);
});
return map;
};
return ms.map_reduce0(get_shard_map, map_type(), map_sum<map_type>).
return get_messaging_service().map_reduce0(get_shard_map, map_type(), map_sum<map_type>).
then([](map_type&& map) {
return make_ready_future<json::json_return_type>(map_to_message_counters(map));
});
};
}
void set_messaging_service(http_context& ctx, routes& r, sharded<netw::messaging_service>& ms) {
get_timeout_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
void set_messaging_service(http_context& ctx, routes& r) {
get_timeout_messages.set(r, get_client_getter([](const shard_info& c) {
return c.get_stats().timeout;
}));
get_sent_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
get_sent_messages.set(r, get_client_getter([](const shard_info& c) {
return c.get_stats().sent_messages;
}));
get_replied_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
return c.get_stats().replied;
}));
get_dropped_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
get_dropped_messages.set(r, get_client_getter([](const shard_info& c) {
// We don't have the same drop message mechanism
// as origin has.
// hence we can always return 0
return 0;
}));
get_exception_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
get_exception_messages.set(r, get_client_getter([](const shard_info& c) {
return c.get_stats().exception_received;
}));
get_pending_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
get_pending_messages.set(r, get_client_getter([](const shard_info& c) {
return c.get_stats().pending;
}));
get_respond_pending_messages.set(r, get_server_getter(ms, [](const rpc::stats& c) {
get_respond_pending_messages.set(r, get_server_getter([](const rpc::stats& c) {
return c.pending;
}));
get_respond_completed_messages.set(r, get_server_getter(ms, [](const rpc::stats& c) {
get_respond_completed_messages.set(r, get_server_getter([](const rpc::stats& c) {
return c.sent_messages;
}));
get_version.set(r, [&ms](const_req req) {
return ms.local().get_raw_version(req.get_query_param("addr"));
get_version.set(r, [](const_req req) {
return netw::get_local_messaging_service().get_raw_version(req.get_query_param("addr"));
});
get_dropped_messages_by_ver.set(r, [&ms](std::unique_ptr<request> req) {
get_dropped_messages_by_ver.set(r, [](std::unique_ptr<request> req) {
shared_ptr<std::vector<uint64_t>> map = make_shared<std::vector<uint64_t>>(num_verb);
return ms.map_reduce([map](const uint64_t* local_map) mutable {
return netw::get_messaging_service().map_reduce([map](const uint64_t* local_map) mutable {
for (auto i = 0; i < num_verb; i++) {
(*map)[i]+= local_map[i];
}
@@ -143,7 +139,7 @@ void set_messaging_service(http_context& ctx, routes& r, sharded<netw::messaging
messaging_verb v = i; // for type safety we use messaging_verb values
auto idx = static_cast<uint32_t>(v);
if (idx >= map->size()) {
throw std::runtime_error(format("verb index out of bounds: {:d}, map size: {:d}", idx, map->size()));
throw std::runtime_error(sprint("verb index out of bounds: %lu, map size: %lu", idx, map->size()));
}
if ((*map)[idx] > 0) {
c.count = (*map)[idx];
@@ -155,19 +151,5 @@ void set_messaging_service(http_context& ctx, routes& r, sharded<netw::messaging
});
});
}
void unset_messaging_service(http_context& ctx, routes& r) {
get_timeout_messages.unset(r);
get_sent_messages.unset(r);
get_replied_messages.unset(r);
get_dropped_messages.unset(r);
get_exception_messages.unset(r);
get_pending_messages.unset(r);
get_respond_pending_messages.unset(r);
get_respond_completed_messages.unset(r);
get_version.unset(r);
get_dropped_messages_by_ver.unset(r);
}
}

View File

@@ -23,11 +23,8 @@
#include "api.hh"
namespace netw { class messaging_service; }
namespace api {
void set_messaging_service(http_context& ctx, routes& r, sharded<netw::messaging_service>& ms);
void unset_messaging_service(http_context& ctx, routes& r);
void set_messaging_service(http_context& ctx, routes& r);
}

View File

@@ -26,8 +26,6 @@
#include "service/storage_service.hh"
#include "db/config.hh"
#include "utils/histogram.hh"
#include "database.hh"
#include "seastar/core/scheduling_specific.hh"
namespace api {
@@ -35,70 +33,12 @@ namespace sp = httpd::storage_proxy_json;
using proxy = service::storage_proxy;
using namespace json;
/**
* This function implement a two dimentional map reduce where
* the first level is a distributed storage_proxy class and the
* second level is the stats per scheduling group class.
* @param d - a reference to the storage_proxy distributed class.
* @param mapper - the internal mapper that is used to map the internal
* stat class into a value of type `V`.
* @param reducer - the reducer that is used in both outer and inner
* aggregations.
* @param initial_value - the initial value to use for both aggregations
* @return A future that resolves to the result of the aggregation.
*/
template<typename V, typename Reducer, typename InnerMapper>
future<V> two_dimensional_map_reduce(distributed<service::storage_proxy>& d,
InnerMapper mapper, Reducer reducer, V initial_value) {
return d.map_reduce0( [mapper, reducer, initial_value] (const service::storage_proxy& sp) {
return map_reduce_scheduling_group_specific<service::storage_proxy_stats::stats>(
mapper, reducer, initial_value, sp.get_stats_key());
}, initial_value, reducer);
static future<utils::rate_moving_average> sum_timed_rate(distributed<proxy>& d, utils::timed_rate_moving_average proxy::stats::*f) {
return d.map_reduce0([f](const proxy& p) {return (p.get_stats().*f).rate();}, utils::rate_moving_average(),
std::plus<utils::rate_moving_average>());
}
/**
* This function implement a two dimentional map reduce where
* the first level is a distributed storage_proxy class and the
* second level is the stats per scheduling group class.
* @param d - a reference to the storage_proxy distributed class.
* @param f - a field pointer which is the implicit internal reducer.
* @param reducer - the reducer that is used in both outer and inner
* aggregations.
* @param initial_value - the initial value to use for both aggregations* @return
* @return A future that resolves to the result of the aggregation.
*/
template<typename V, typename Reducer, typename F>
future<V> two_dimensional_map_reduce(distributed<service::storage_proxy>& d,
V F::*f, Reducer reducer, V initial_value) {
return two_dimensional_map_reduce(d, [f] (F& stats) {
return stats.*f;
}, reducer, initial_value);
}
/**
* A partial Specialization of sum_stats for the storage proxy
* case where the get stats function doesn't return a
* stats object with fields but a per scheduling group
* stats object, the name was also changed since functions
* partial specialization is not supported in C++.
*
*/
template<typename V, typename F>
future<json::json_return_type> sum_stats_storage_proxy(distributed<proxy>& d, V F::*f) {
return two_dimensional_map_reduce(d, [f] (F& stats) { return stats.*f; }, std::plus<V>(), V(0)).then([] (V val) {
return make_ready_future<json::json_return_type>(val);
});
}
static future<utils::rate_moving_average> sum_timed_rate(distributed<proxy>& d, utils::timed_rate_moving_average service::storage_proxy_stats::stats::*f) {
return two_dimensional_map_reduce(d, [f] (service::storage_proxy_stats::stats& stats) {
return (stats.*f).rate();
}, std::plus<utils::rate_moving_average>(), utils::rate_moving_average());
}
static future<json::json_return_type> sum_timed_rate_as_obj(distributed<proxy>& d, utils::timed_rate_moving_average service::storage_proxy_stats::stats::*f) {
static future<json::json_return_type> sum_timed_rate_as_obj(distributed<proxy>& d, utils::timed_rate_moving_average proxy::stats::*f) {
return sum_timed_rate(d, f).then([](const utils::rate_moving_average& val) {
httpd::utils_json::rate_moving_average m;
m = val;
@@ -106,93 +46,29 @@ static future<json::json_return_type> sum_timed_rate_as_obj(distributed<proxy>&
});
}
httpd::utils_json::rate_moving_average_and_histogram get_empty_moving_average() {
return timer_to_json(utils::rate_moving_average_and_histogram());
}
static future<json::json_return_type> sum_timed_rate_as_long(distributed<proxy>& d, utils::timed_rate_moving_average service::storage_proxy_stats::stats::*f) {
static future<json::json_return_type> sum_timed_rate_as_long(distributed<proxy>& d, utils::timed_rate_moving_average proxy::stats::*f) {
return sum_timed_rate(d, f).then([](const utils::rate_moving_average& val) {
return make_ready_future<json::json_return_type>(val.count);
});
}
utils_json::estimated_histogram time_to_json_histogram(const utils::time_estimated_histogram& val) {
utils_json::estimated_histogram res;
for (size_t i = 0; i < val.size(); i++) {
res.buckets.push(val.get(i));
res.bucket_offsets.push(val.get_bucket_lower_limit(i));
}
return res;
}
static future<json::json_return_type> sum_estimated_histogram(http_context& ctx, utils::time_estimated_histogram service::storage_proxy_stats::stats::*f) {
return two_dimensional_map_reduce(ctx.sp, f, utils::time_estimated_histogram_merge,
utils::time_estimated_histogram()).then([](const utils::time_estimated_histogram& val) {
return make_ready_future<json::json_return_type>(time_to_json_histogram(val));
});
}
static future<json::json_return_type> sum_estimated_histogram(http_context& ctx, utils::estimated_histogram service::storage_proxy_stats::stats::*f) {
return two_dimensional_map_reduce(ctx.sp, f, utils::estimated_histogram_merge,
utils::estimated_histogram()).then([](const utils::estimated_histogram& val) {
static future<json::json_return_type> sum_estimated_histogram(http_context& ctx, utils::estimated_histogram proxy::stats::*f) {
return ctx.sp.map_reduce0([f](const proxy& p) {return p.get_stats().*f;}, utils::estimated_histogram(),
utils::estimated_histogram_merge).then([](const utils::estimated_histogram& val) {
utils_json::estimated_histogram res;
res = val;
return make_ready_future<json::json_return_type>(res);
});
}
static future<json::json_return_type> total_latency(http_context& ctx, utils::timed_rate_moving_average_and_histogram service::storage_proxy_stats::stats::*f) {
return two_dimensional_map_reduce(ctx.sp, [f] (service::storage_proxy_stats::stats& stats) {
return (stats.*f).hist.mean * (stats.*f).hist.count;
}, std::plus<double>(), 0.0).then([](double val) {
static future<json::json_return_type> total_latency(http_context& ctx, utils::timed_rate_moving_average_and_histogram proxy::stats::*f) {
return ctx.sp.map_reduce0([f](const proxy& p) {return (p.get_stats().*f).hist.mean * (p.get_stats().*f).hist.count;}, 0.0,
std::plus<double>()).then([](double val) {
int64_t res = val;
return make_ready_future<json::json_return_type>(res);
});
}
/**
* A partial Specialization of sum_histogram_stats
* for the storage proxy case where the get stats
* function doesn't return a stats object with
* fields but a per scheduling group stats object,
* the name was also changed since function partial
* specialization is not supported in C++.
*/
template<typename F>
future<json::json_return_type>
sum_histogram_stats_storage_proxy(distributed<proxy>& d,
utils::timed_rate_moving_average_and_histogram F::*f) {
return two_dimensional_map_reduce(d, [f] (service::storage_proxy_stats::stats& stats) {
return (stats.*f).hist;
}, std::plus<utils::ihistogram>(), utils::ihistogram()).
then([](const utils::ihistogram& val) {
return make_ready_future<json::json_return_type>(to_json(val));
});
}
/**
* A partial Specialization of sum_timer_stats for the
* storage proxy case where the get stats function
* doesn't return a stats object with fields but a
* per scheduling group stats object, the name
* was also changed since partial function specialization
* is not supported in C++.
*/
template<typename F>
future<json::json_return_type>
sum_timer_stats_storage_proxy(distributed<proxy>& d,
utils::timed_rate_moving_average_and_histogram F::*f) {
return two_dimensional_map_reduce(d, [f] (service::storage_proxy_stats::stats& stats) {
return (stats.*f).rate();
}, std::plus<utils::rate_moving_average_and_histogram>(),
utils::rate_moving_average_and_histogram()).then([](const utils::rate_moving_average_and_histogram& val) {
return make_ready_future<json::json_return_type>(timer_to_json(val));
});
}
void set_storage_proxy(http_context& ctx, routes& r) {
sp::get_total_hints.set(r, [](std::unique_ptr<request> req) {
//TBD
@@ -200,40 +76,33 @@ void set_storage_proxy(http_context& ctx, routes& r) {
return make_ready_future<json::json_return_type>(0);
});
sp::get_hinted_handoff_enabled.set(r, [&ctx](std::unique_ptr<request> req) {
const auto& filter = service::get_storage_proxy().local().get_hints_host_filter();
return make_ready_future<json::json_return_type>(!filter.is_disabled_for_all());
sp::get_hinted_handoff_enabled.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// hinted handoff is not supported currently,
// so we should return false
return make_ready_future<json::json_return_type>(false);
});
sp::set_hinted_handoff_enabled.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
auto enable = req->get_query_param("enable");
auto filter = (enable == "true" || enable == "1")
? db::hints::host_filter(db::hints::host_filter::enabled_for_all_tag {})
: db::hints::host_filter(db::hints::host_filter::disabled_for_all_tag {});
return service::get_storage_proxy().invoke_on_all([filter = std::move(filter)] (service::storage_proxy& sp) {
return sp.change_hints_host_filter(filter);
}).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
return make_ready_future<json::json_return_type>(json_void());
});
sp::get_hinted_handoff_enabled_by_dc.set(r, [](std::unique_ptr<request> req) {
std::vector<sstring> res;
const auto& filter = service::get_storage_proxy().local().get_hints_host_filter();
const auto& dcs = filter.get_dcs();
res.reserve(res.size());
std::copy(dcs.begin(), dcs.end(), std::back_inserter(res));
//TBD
unimplemented();
std::vector<sp::mapper_list> res;
return make_ready_future<json::json_return_type>(res);
});
sp::set_hinted_handoff_enabled_by_dc_list.set(r, [](std::unique_ptr<request> req) {
auto dcs = req->get_query_param("dcs");
auto filter = db::hints::host_filter::parse_from_dc_list(std::move(dcs));
return service::get_storage_proxy().invoke_on_all([filter = std::move(filter)] (service::storage_proxy& sp) {
return sp.change_hints_host_filter(filter);
}).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
//TBD
unimplemented();
auto enable = req->get_query_param("dcs");
return make_ready_future<json::json_return_type>(json_void());
});
sp::get_max_hint_window.set(r, [](std::unique_ptr<request> req) {
@@ -352,15 +221,15 @@ void set_storage_proxy(http_context& ctx, routes& r) {
});
sp::get_read_repair_attempted.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read_repair_attempts);
return sum_stats(ctx.sp, &proxy::stats::read_repair_attempts);
});
sp::get_read_repair_repaired_blocking.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read_repair_repaired_blocking);
return sum_stats(ctx.sp, &proxy::stats::read_repair_repaired_blocking);
});
sp::get_read_repair_repaired_background.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read_repair_repaired_background);
return sum_stats(ctx.sp, &proxy::stats::read_repair_repaired_background);
});
sp::get_schema_versions.set(r, [](std::unique_ptr<request> req) {
@@ -376,154 +245,163 @@ void set_storage_proxy(http_context& ctx, routes& r) {
});
});
sp::get_cas_read_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_read_timeouts);
sp::get_cas_read_timeouts.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_read_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_read_unavailables);
sp::get_cas_read_unavailables.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_write_timeouts);
sp::get_cas_write_timeouts.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_write_unavailables);
sp::get_cas_write_unavailables.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_write_unfinished_commit);
sp::get_cas_write_metrics_unfinished_commit.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_contention.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_estimated_histogram(ctx, &proxy::stats::cas_write_contention);
sp::get_cas_write_metrics_contention.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_condition_not_met.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_write_condition_not_met);
sp::get_cas_write_metrics_condition_not_met.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_failed_read_round_optimization.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_failed_read_round_optimization);
sp::get_cas_read_metrics_unfinished_commit.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_read_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_read_unfinished_commit);
sp::get_cas_read_metrics_contention.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_read_metrics_contention.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_estimated_histogram(ctx, &proxy::stats::cas_read_contention);
sp::get_cas_read_metrics_condition_not_met.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_read_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::read_timeouts);
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::read_timeouts);
});
sp::get_read_metrics_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::read_unavailables);
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::read_unavailables);
});
sp::get_range_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::range_slice_timeouts);
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::range_slice_timeouts);
});
sp::get_range_metrics_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::range_slice_unavailables);
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::range_slice_unavailables);
});
sp::get_write_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::write_timeouts);
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::write_timeouts);
});
sp::get_write_metrics_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::write_unavailables);
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::write_unavailables);
});
sp::get_read_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::read_timeouts);
return sum_timed_rate_as_obj(ctx.sp, &proxy::stats::read_timeouts);
});
sp::get_read_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::read_unavailables);
return sum_timed_rate_as_obj(ctx.sp, &proxy::stats::read_unavailables);
});
sp::get_range_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::range_slice_timeouts);
return sum_timed_rate_as_obj(ctx.sp, &proxy::stats::range_slice_timeouts);
});
sp::get_range_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::range_slice_unavailables);
return sum_timed_rate_as_obj(ctx.sp, &proxy::stats::range_slice_unavailables);
});
sp::get_write_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::write_timeouts);
return sum_timed_rate_as_obj(ctx.sp, &proxy::stats::write_timeouts);
});
sp::get_write_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::write_unavailables);
return sum_timed_rate_as_obj(ctx.sp, &proxy::stats::write_unavailables);
});
sp::get_range_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_histogram_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::range);
return sum_histogram_stats(ctx.sp, &proxy::stats::range);
});
sp::get_write_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_histogram_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::write);
return sum_histogram_stats(ctx.sp, &proxy::stats::write);
});
sp::get_read_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_histogram_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read);
return sum_histogram_stats(ctx.sp, &proxy::stats::read);
});
sp::get_range_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::range);
return sum_timer_stats(ctx.sp, &proxy::stats::range);
});
sp::get_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::write);
});
sp::get_cas_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats(ctx.sp, &proxy::stats::cas_write);
});
sp::get_cas_read_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats(ctx.sp, &proxy::stats::cas_read);
});
sp::get_view_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
//TBD
// FIXME
// No View metrics are available, so just return empty moving average
return make_ready_future<json::json_return_type>(get_empty_moving_average());
return sum_timer_stats(ctx.sp, &proxy::stats::write);
});
sp::get_read_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read);
return sum_timer_stats(ctx.sp, &proxy::stats::read);
});
sp::get_read_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_estimated_histogram(ctx, &service::storage_proxy_stats::stats::estimated_read);
return sum_estimated_histogram(ctx, &proxy::stats::estimated_read);
});
sp::get_read_latency.set(r, [&ctx](std::unique_ptr<request> req) {
return total_latency(ctx, &service::storage_proxy_stats::stats::read);
return total_latency(ctx, &proxy::stats::read);
});
sp::get_write_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_estimated_histogram(ctx, &service::storage_proxy_stats::stats::estimated_write);
return sum_estimated_histogram(ctx, &proxy::stats::estimated_write);
});
sp::get_write_latency.set(r, [&ctx](std::unique_ptr<request> req) {
return total_latency(ctx, &service::storage_proxy_stats::stats::write);
return total_latency(ctx, &proxy::stats::write);
});
sp::get_range_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::range);
return sum_timer_stats(ctx.sp, &proxy::stats::range);
});
sp::get_range_latency.set(r, [&ctx](std::unique_ptr<request> req) {
return total_latency(ctx, &service::storage_proxy_stats::stats::range);
return total_latency(ctx, &proxy::stats::range);
});
}

File diff suppressed because it is too large Load Diff

View File

@@ -21,26 +21,10 @@
#pragma once
#include <seastar/core/sharded.hh>
#include "api.hh"
#include "db/data_listeners.hh"
namespace cql_transport { class controller; }
class thrift_controller;
namespace db { class snapshot_ctl; }
namespace netw { class messaging_service; }
namespace api {
void set_storage_service(http_context& ctx, routes& r);
void set_repair(http_context& ctx, routes& r, sharded<netw::messaging_service>& ms);
void unset_repair(http_context& ctx, routes& r);
void set_transport_controller(http_context& ctx, routes& r, cql_transport::controller& ctl);
void unset_transport_controller(http_context& ctx, routes& r);
void set_rpc_controller(http_context& ctx, routes& r, thrift_controller& ctl);
void unset_rpc_controller(http_context& ctx, routes& r);
void set_snapshot(http_context& ctx, routes& r, sharded<db::snapshot_ctl>& snap_ctl);
void unset_snapshot(http_context& ctx, routes& r);
seastar::future<json::json_return_type> run_toppartitions_query(db::toppartitions_query& q, http_context &ctx, bool legacy_request = false);
}

View File

@@ -22,22 +22,14 @@
#include "api/api-doc/system.json.hh"
#include "api/api.hh"
#include <seastar/core/reactor.hh>
#include <seastar/http/exception.hh>
#include "http/exception.hh"
#include "log.hh"
#include "database.hh"
extern logging::logger apilog;
namespace api {
namespace hs = httpd::system_json;
void set_system(http_context& ctx, routes& r) {
hs::get_system_uptime.set(r, [](const_req req) {
return std::chrono::duration_cast<std::chrono::milliseconds>(engine().uptime()).count();
});
hs::get_all_logger_names.set(r, [](const_req req) {
return logging::logger_registry().get_all_logger_names();
});
@@ -73,16 +65,6 @@ void set_system(http_context& ctx, routes& r) {
}
return json::json_void();
});
hs::drop_sstable_caches.set(r, [&ctx](std::unique_ptr<request> req) {
apilog.info("Dropping sstable caches");
return ctx.db.invoke_on_all([] (database& db) {
return db.drop_caches();
}).then([] {
apilog.info("Caches dropped");
return json::json_return_type(json::json_void());
});
});
}
}

View File

@@ -1,252 +0,0 @@
/*
* Copyright (C) 2018 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "atomic_cell.hh"
#include "atomic_cell_or_collection.hh"
#include "counters.hh"
#include "types.hh"
atomic_cell atomic_cell::make_dead(api::timestamp_type timestamp, gc_clock::time_point deletion_time) {
return atomic_cell_type::make_dead(timestamp, deletion_time);
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, bytes_view value, atomic_cell::collection_member cm) {
return atomic_cell_type::make_live(timestamp, single_fragment_range(value));
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, managed_bytes_view value, atomic_cell::collection_member cm) {
return atomic_cell_type::make_live(timestamp, fragment_range(value));
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, ser::buffer_view<bytes_ostream::fragment_iterator> value, atomic_cell::collection_member cm) {
return atomic_cell_type::make_live(timestamp, value);
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, const fragmented_temporary_buffer::view& value, collection_member cm)
{
return atomic_cell_type::make_live(timestamp, value);
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, bytes_view value,
gc_clock::time_point expiry, gc_clock::duration ttl, atomic_cell::collection_member cm) {
return atomic_cell_type::make_live(timestamp, single_fragment_range(value), expiry, ttl);
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, managed_bytes_view value,
gc_clock::time_point expiry, gc_clock::duration ttl, atomic_cell::collection_member cm) {
return atomic_cell_type::make_live(timestamp, fragment_range(value), expiry, ttl);
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, ser::buffer_view<bytes_ostream::fragment_iterator> value,
gc_clock::time_point expiry, gc_clock::duration ttl, atomic_cell::collection_member cm) {
return atomic_cell_type::make_live(timestamp, value, expiry, ttl);
}
atomic_cell atomic_cell::make_live(const abstract_type& type, api::timestamp_type timestamp, const fragmented_temporary_buffer::view& value,
gc_clock::time_point expiry, gc_clock::duration ttl, collection_member cm)
{
return atomic_cell_type::make_live(timestamp, value, expiry, ttl);
}
atomic_cell atomic_cell::make_live_counter_update(api::timestamp_type timestamp, int64_t value) {
return atomic_cell_type::make_live_counter_update(timestamp, value);
}
atomic_cell atomic_cell::make_live_uninitialized(const abstract_type& type, api::timestamp_type timestamp, size_t size) {
return atomic_cell_type::make_live_uninitialized(timestamp, size);
}
atomic_cell::atomic_cell(const abstract_type& type, atomic_cell_view other)
: _data(other._view) {
set_view(_data);
}
// Based on:
// - org.apache.cassandra.db.AbstractCell#reconcile()
// - org.apache.cassandra.db.BufferExpiringCell#reconcile()
// - org.apache.cassandra.db.BufferDeletedCell#reconcile()
int
compare_atomic_cell_for_merge(atomic_cell_view left, atomic_cell_view right) {
if (left.timestamp() != right.timestamp()) {
return left.timestamp() > right.timestamp() ? 1 : -1;
}
if (left.is_live() != right.is_live()) {
return left.is_live() ? -1 : 1;
}
if (left.is_live()) {
auto c = compare_unsigned(left.value(), right.value());
if (c != 0) {
return c;
}
if (left.is_live_and_has_ttl() != right.is_live_and_has_ttl()) {
// prefer expiring cells.
return left.is_live_and_has_ttl() ? 1 : -1;
}
if (left.is_live_and_has_ttl()) {
if (left.expiry() != right.expiry()) {
return left.expiry() < right.expiry() ? -1 : 1;
} else {
// prefer the cell that was written later,
// so it survives longer after it expires, until purged.
if (left.ttl() != right.ttl()) {
return left.ttl() < right.ttl() ? 1 : -1;
} else {
return 0;
}
}
}
} else {
// Both are deleted
if (left.deletion_time() != right.deletion_time()) {
// Origin compares big-endian serialized deletion time. That's because it
// delegates to AbstractCell.reconcile() which compares values after
// comparing timestamps, which in case of deleted cells will hold
// serialized expiry.
return (uint64_t) left.deletion_time().time_since_epoch().count()
< (uint64_t) right.deletion_time().time_since_epoch().count() ? -1 : 1;
}
}
return 0;
}
atomic_cell_or_collection atomic_cell_or_collection::copy(const abstract_type& type) const {
if (_data.empty()) {
return atomic_cell_or_collection();
}
return atomic_cell_or_collection(managed_bytes(_data));
}
atomic_cell_or_collection::atomic_cell_or_collection(const abstract_type& type, atomic_cell_view acv)
: _data(acv._view)
{
}
bool atomic_cell_or_collection::equals(const abstract_type& type, const atomic_cell_or_collection& other) const
{
if (_data.empty() || other._data.empty()) {
return _data.empty() && other._data.empty();
}
if (type.is_atomic()) {
auto a = atomic_cell_view::from_bytes(type, _data);
auto b = atomic_cell_view::from_bytes(type, other._data);
if (a.timestamp() != b.timestamp()) {
return false;
}
if (a.is_live() != b.is_live()) {
return false;
}
if (a.is_live()) {
if (a.is_counter_update() != b.is_counter_update()) {
return false;
}
if (a.is_counter_update()) {
return a.counter_update_value() == b.counter_update_value();
}
if (a.is_live_and_has_ttl() != b.is_live_and_has_ttl()) {
return false;
}
if (a.is_live_and_has_ttl()) {
if (a.ttl() != b.ttl() || a.expiry() != b.expiry()) {
return false;
}
}
return a.value() == b.value();
}
return a.deletion_time() == b.deletion_time();
} else {
return as_collection_mutation().data == other.as_collection_mutation().data;
}
}
size_t atomic_cell_or_collection::external_memory_usage(const abstract_type& t) const
{
return _data.external_memory_usage();
}
std::ostream&
operator<<(std::ostream& os, const atomic_cell_view& acv) {
if (acv.is_live()) {
return fmt_print(os, "atomic_cell{{{},ts={:d},expiry={:d},ttl={:d}}}",
acv.is_counter_update()
? "counter_update_value=" + to_sstring(acv.counter_update_value())
: to_hex(to_bytes(acv.value())),
acv.timestamp(),
acv.is_live_and_has_ttl() ? acv.expiry().time_since_epoch().count() : -1,
acv.is_live_and_has_ttl() ? acv.ttl().count() : 0);
} else {
return fmt_print(os, "atomic_cell{{DEAD,ts={:d},deletion_time={:d}}}",
acv.timestamp(), acv.deletion_time().time_since_epoch().count());
}
}
std::ostream&
operator<<(std::ostream& os, const atomic_cell& ac) {
return os << atomic_cell_view(ac);
}
std::ostream&
operator<<(std::ostream& os, const atomic_cell_view::printer& acvp) {
auto& type = acvp._type;
auto& acv = acvp._cell;
if (acv.is_live()) {
std::ostringstream cell_value_string_builder;
if (type.is_counter()) {
if (acv.is_counter_update()) {
cell_value_string_builder << "counter_update_value=" << acv.counter_update_value();
} else {
cell_value_string_builder << "shards: ";
auto ccv = counter_cell_view(acv);
cell_value_string_builder << ::join(", ", ccv.shards());
}
} else {
cell_value_string_builder << type.to_string(to_bytes(acv.value()));
}
return fmt_print(os, "atomic_cell{{{},ts={:d},expiry={:d},ttl={:d}}}",
cell_value_string_builder.str(),
acv.timestamp(),
acv.is_live_and_has_ttl() ? acv.expiry().time_since_epoch().count() : -1,
acv.is_live_and_has_ttl() ? acv.ttl().count() : 0);
} else {
return fmt_print(os, "atomic_cell{{DEAD,ts={:d},deletion_time={:d}}}",
acv.timestamp(), acv.deletion_time().time_since_epoch().count());
}
}
std::ostream&
operator<<(std::ostream& os, const atomic_cell::printer& acp) {
return operator<<(os, static_cast<const atomic_cell_view::printer&>(acp));
}
std::ostream& operator<<(std::ostream& os, const atomic_cell_or_collection::printer& p) {
if (p._cell._data.empty()) {
return os << "{ null atomic_cell_or_collection }";
}
os << "{ ";
if (p._cdef.type->is_multi_cell()) {
os << "collection ";
auto cmv = p._cell.as_collection_mutation();
os << collection_mutation_view::printer(*p._cdef.type, cmv);
} else {
os << atomic_cell_view::printer(*p._cdef.type, p._cell.as_atomic_cell(p._cdef));
}
return os << " }";
}

View File

@@ -26,132 +26,111 @@
#include "tombstone.hh"
#include "gc_clock.hh"
#include "utils/managed_bytes.hh"
#include "utils/fragment_range.hh"
#include <seastar/net//byteorder.hh>
#include <seastar/util/bool_class.hh>
#include "net/byteorder.hh"
#include <cstdint>
#include <iosfwd>
#include <concepts>
#include "utils/fragmented_temporary_buffer.hh"
#include <seastar/util/gcc6-concepts.hh>
#include "serializer.hh"
template<typename T, typename Input>
static inline
void set_field(Input& v, unsigned offset, T val) {
reinterpret_cast<net::packed<T>*>(v.begin() + offset)->raw = net::hton(val);
}
template<typename T>
static inline
T get_field(const bytes_view& v, unsigned offset) {
return net::ntoh(*reinterpret_cast<const net::packed<T>*>(v.begin() + offset));
}
class abstract_type;
class collection_type_impl;
class atomic_cell_or_collection;
using atomic_cell_value = managed_bytes;
template <mutable_view is_mutable>
using atomic_cell_value_basic_view = managed_bytes_basic_view<is_mutable>;
using atomic_cell_value_view = atomic_cell_value_basic_view<mutable_view::no>;
using atomic_cell_value_mutable_view = atomic_cell_value_basic_view<mutable_view::yes>;
template <typename T>
requires std::is_trivial_v<T>
static void set_field(atomic_cell_value_mutable_view& out, unsigned offset, T val) {
auto out_view = managed_bytes_mutable_view(out);
out_view.remove_prefix(offset);
write<T>(out_view, val);
}
template <typename T>
requires std::is_trivial_v<T>
static void set_field(atomic_cell_value& out, unsigned offset, T val) {
auto out_view = atomic_cell_value_mutable_view(out);
set_field(out_view, offset, val);
}
template <FragmentRange Buffer>
static void set_value(managed_bytes& b, unsigned value_offset, const Buffer& value) {
auto v = managed_bytes_mutable_view(b).substr(value_offset, value.size_bytes());
for (auto frag : value) {
write_fragmented(v, single_fragmented_view(frag));
}
}
template <typename T, FragmentedView Input>
requires std::is_trivial_v<T>
static T get_field(Input in, unsigned offset = 0) {
in.remove_prefix(offset);
return read_simple<T>(in);
}
/*
* Represents atomic cell layout. Works on serialized form.
*
* Layout:
*
* <live> := <int8_t:flags><int64_t:timestamp>(<int64_t:expiry><int32_t:ttl>)?<value>
* <dead> := <int8_t: 0><int64_t:timestamp><int64_t:deletion_time>
* <live> := <int8_t:flags><int64_t:timestamp>(<int32_t:expiry><int32_t:ttl>)?<value>
* <dead> := <int8_t: 0><int64_t:timestamp><int32_t:deletion_time>
*/
class atomic_cell_type final {
private:
static constexpr int8_t LIVE_FLAG = 0x01;
static constexpr int8_t EXPIRY_FLAG = 0x02; // When present, expiry field is present. Set only for live cells
static constexpr int8_t COUNTER_UPDATE_FLAG = 0x08; // Cell is a counter update.
static constexpr int8_t COUNTER_IN_PLACE_REVERT = 0x10;
static constexpr unsigned flags_size = 1;
static constexpr unsigned timestamp_offset = flags_size;
static constexpr unsigned timestamp_size = 8;
static constexpr unsigned expiry_offset = timestamp_offset + timestamp_size;
static constexpr unsigned expiry_size = 8;
static constexpr unsigned expiry_size = 4;
static constexpr unsigned deletion_time_offset = timestamp_offset + timestamp_size;
static constexpr unsigned deletion_time_size = 8;
static constexpr unsigned deletion_time_size = 4;
static constexpr unsigned ttl_offset = expiry_offset + expiry_size;
static constexpr unsigned ttl_size = 4;
friend class counter_cell_builder;
private:
static bool is_counter_update(atomic_cell_value_view cell) {
return cell.front() & COUNTER_UPDATE_FLAG;
static bool is_counter_update(bytes_view cell) {
return cell[0] & COUNTER_UPDATE_FLAG;
}
static bool is_live(atomic_cell_value_view cell) {
return cell.front() & LIVE_FLAG;
static bool is_counter_in_place_revert_set(bytes_view cell) {
return cell[0] & COUNTER_IN_PLACE_REVERT;
}
static bool is_live_and_has_ttl(atomic_cell_value_view cell) {
return cell.front() & EXPIRY_FLAG;
template<typename BytesContainer>
static void set_counter_in_place_revert(BytesContainer& cell, bool flag) {
cell[0] = (cell[0] & ~COUNTER_IN_PLACE_REVERT) | (flag * COUNTER_IN_PLACE_REVERT);
}
static bool is_dead(atomic_cell_value_view cell) {
static bool is_live(const bytes_view& cell) {
return cell[0] & LIVE_FLAG;
}
static bool is_live_and_has_ttl(const bytes_view& cell) {
return cell[0] & EXPIRY_FLAG;
}
static bool is_dead(const bytes_view& cell) {
return !is_live(cell);
}
// Can be called on live and dead cells
static api::timestamp_type timestamp(atomic_cell_value_view cell) {
static api::timestamp_type timestamp(const bytes_view& cell) {
return get_field<api::timestamp_type>(cell, timestamp_offset);
}
static void set_timestamp(atomic_cell_value_mutable_view& cell, api::timestamp_type ts) {
template<typename BytesContainer>
static void set_timestamp(BytesContainer& cell, api::timestamp_type ts) {
set_field(cell, timestamp_offset, ts);
}
// Can be called on live cells only
private:
template <mutable_view is_mutable>
static managed_bytes_basic_view<is_mutable> do_get_value(managed_bytes_basic_view<is_mutable> cell) {
auto expiry_field_size = bool(cell.front() & EXPIRY_FLAG) * (expiry_size + ttl_size);
template<typename BytesView>
static BytesView do_get_value(BytesView cell) {
auto expiry_field_size = bool(cell[0] & EXPIRY_FLAG) * (expiry_size + ttl_size);
auto value_offset = flags_size + timestamp_size + expiry_field_size;
cell.remove_prefix(value_offset);
return cell;
}
public:
static atomic_cell_value_view value(managed_bytes_view cell) {
static bytes_view value(bytes_view cell) {
return do_get_value(cell);
}
static atomic_cell_value_mutable_view value(managed_bytes_mutable_view cell) {
static bytes_mutable_view value(bytes_mutable_view cell) {
return do_get_value(cell);
}
// Can be called on live counter update cells only
static int64_t counter_update_value(atomic_cell_value_view cell) {
static int64_t counter_update_value(bytes_view cell) {
return get_field<int64_t>(cell, flags_size + timestamp_size);
}
// Can be called only when is_dead() is true.
static gc_clock::time_point deletion_time(atomic_cell_value_view cell) {
static gc_clock::time_point deletion_time(const bytes_view& cell) {
assert(is_dead(cell));
return gc_clock::time_point(gc_clock::duration(get_field<int64_t>(cell, deletion_time_offset)));
return gc_clock::time_point(gc_clock::duration(
get_field<int32_t>(cell, deletion_time_offset)));
}
// Can be called only when is_live_and_has_ttl() is true.
static gc_clock::time_point expiry(atomic_cell_value_view cell) {
static gc_clock::time_point expiry(const bytes_view& cell) {
assert(is_live_and_has_ttl(cell));
auto expiry = get_field<int64_t>(cell, expiry_offset);
auto expiry = get_field<int32_t>(cell, expiry_offset);
return gc_clock::time_point(gc_clock::duration(expiry));
}
// Can be called only when is_live_and_has_ttl() is true.
static gc_clock::duration ttl(atomic_cell_value_view cell) {
static gc_clock::duration ttl(const bytes_view& cell) {
assert(is_live_and_has_ttl(cell));
return gc_clock::duration(get_field<int32_t>(cell, ttl_offset));
}
@@ -159,16 +138,15 @@ public:
managed_bytes b(managed_bytes::initialized_later(), flags_size + timestamp_size + deletion_time_size);
b[0] = 0;
set_field(b, timestamp_offset, timestamp);
set_field(b, deletion_time_offset, static_cast<int64_t>(deletion_time.time_since_epoch().count()));
set_field(b, deletion_time_offset, deletion_time.time_since_epoch().count());
return b;
}
template <FragmentRange Buffer>
static managed_bytes make_live(api::timestamp_type timestamp, const Buffer& value) {
static managed_bytes make_live(api::timestamp_type timestamp, bytes_view value) {
auto value_offset = flags_size + timestamp_size;
managed_bytes b(managed_bytes::initialized_later(), value_offset + value.size_bytes());
managed_bytes b(managed_bytes::initialized_later(), value_offset + value.size());
b[0] = LIVE_FLAG;
set_field(b, timestamp_offset, timestamp);
set_value(b, value_offset, value);
std::copy_n(value.begin(), value.size(), b.begin() + value_offset);
return b;
}
static managed_bytes make_live_counter_update(api::timestamp_type timestamp, int64_t value) {
@@ -179,52 +157,62 @@ public:
set_field(b, value_offset, value);
return b;
}
template <FragmentRange Buffer>
static managed_bytes make_live(api::timestamp_type timestamp, const Buffer& value, gc_clock::time_point expiry, gc_clock::duration ttl) {
static managed_bytes make_live(api::timestamp_type timestamp, bytes_view value, gc_clock::time_point expiry, gc_clock::duration ttl) {
auto value_offset = flags_size + timestamp_size + expiry_size + ttl_size;
managed_bytes b(managed_bytes::initialized_later(), value_offset + value.size_bytes());
managed_bytes b(managed_bytes::initialized_later(), value_offset + value.size());
b[0] = EXPIRY_FLAG | LIVE_FLAG;
set_field(b, timestamp_offset, timestamp);
set_field(b, expiry_offset, static_cast<int64_t>(expiry.time_since_epoch().count()));
set_field(b, ttl_offset, static_cast<int32_t>(ttl.count()));
set_value(b, value_offset, value);
set_field(b, expiry_offset, expiry.time_since_epoch().count());
set_field(b, ttl_offset, ttl.count());
std::copy_n(value.begin(), value.size(), b.begin() + value_offset);
return b;
}
static managed_bytes make_live_uninitialized(api::timestamp_type timestamp, size_t size) {
// make_live_from_serializer() is intended for users that need to serialise
// some object or objects to the format used in atomic_cell::value().
// With just make_live() the patter would look like follows:
// 1. allocate a buffer and write to it serialised objects
// 2. pass that buffer to make_live()
// 3. make_live() needs to prepend some metadata to the cell value so it
// allocates a new buffer and copies the content of the original one
//
// The allocation and copy of a buffer can be avoided.
// make_live_from_serializer() allows the user code to specify the timestamp
// and size of the cell value as well as provide the serialiser function
// object, which would write the serialised value of the cell to the buffer
// given to it by make_live_from_serializer().
template<typename Serializer>
GCC6_CONCEPT(requires requires(Serializer serializer, bytes::iterator it) {
serializer(it);
})
static managed_bytes make_live_from_serializer(api::timestamp_type timestamp, size_t size, Serializer&& serializer) {
auto value_offset = flags_size + timestamp_size;
managed_bytes b(managed_bytes::initialized_later(), value_offset + size);
b[0] = LIVE_FLAG;
set_field(b, timestamp_offset, timestamp);
serializer(b.begin() + value_offset);
return b;
}
template <mutable_view is_mutable>
friend class basic_atomic_cell_view;
template<typename ByteContainer>
friend class atomic_cell_base;
friend class atomic_cell;
};
/// View of an atomic cell
template<mutable_view is_mutable>
class basic_atomic_cell_view {
template<typename ByteContainer>
class atomic_cell_base {
protected:
managed_bytes_basic_view<is_mutable> _view;
friend class atomic_cell;
ByteContainer _data;
protected:
void set_view(managed_bytes_basic_view<is_mutable> v) {
_view = v;
}
basic_atomic_cell_view() = default;
explicit basic_atomic_cell_view(managed_bytes_basic_view<is_mutable> v) : _view(std::move(v)) { }
atomic_cell_base(ByteContainer&& data) : _data(std::forward<ByteContainer>(data)) { }
friend class atomic_cell_or_collection;
public:
operator basic_atomic_cell_view<mutable_view::no>() const noexcept {
return basic_atomic_cell_view<mutable_view::no>(_view);
}
bool is_counter_update() const {
return atomic_cell_type::is_counter_update(_view);
return atomic_cell_type::is_counter_update(_data);
}
bool is_counter_in_place_revert_set() const {
return atomic_cell_type::is_counter_in_place_revert_set(_data);
}
bool is_live() const {
return atomic_cell_type::is_live(_view);
return atomic_cell_type::is_live(_data);
}
bool is_live(tombstone t, bool is_counter) const {
return is_live() && !is_covered_by(t, is_counter);
@@ -233,167 +221,162 @@ public:
return is_live() && !is_covered_by(t, is_counter) && !has_expired(now);
}
bool is_live_and_has_ttl() const {
return atomic_cell_type::is_live_and_has_ttl(_view);
return atomic_cell_type::is_live_and_has_ttl(_data);
}
bool is_dead(gc_clock::time_point now) const {
return atomic_cell_type::is_dead(_view) || has_expired(now);
return atomic_cell_type::is_dead(_data) || has_expired(now);
}
bool is_covered_by(tombstone t, bool is_counter) const {
return timestamp() <= t.timestamp || (is_counter && t.timestamp != api::missing_timestamp);
}
// Can be called on live and dead cells
api::timestamp_type timestamp() const {
return atomic_cell_type::timestamp(_view);
return atomic_cell_type::timestamp(_data);
}
void set_timestamp(api::timestamp_type ts) {
atomic_cell_type::set_timestamp(_view, ts);
atomic_cell_type::set_timestamp(_data, ts);
}
// Can be called on live cells only
atomic_cell_value_basic_view<is_mutable> value() const {
return atomic_cell_type::value(_view);
}
// Can be called on live cells only
size_t value_size() const {
return atomic_cell_type::value(_view).size();
}
bool is_value_fragmented() const {
return _view.is_fragmented();
auto value() const {
return atomic_cell_type::value(_data);
}
// Can be called on live counter update cells only
int64_t counter_update_value() const {
return atomic_cell_type::counter_update_value(_view);
return atomic_cell_type::counter_update_value(_data);
}
// Can be called only when is_dead(gc_clock::time_point)
gc_clock::time_point deletion_time() const {
return !is_live() ? atomic_cell_type::deletion_time(_view) : expiry() - ttl();
return !is_live() ? atomic_cell_type::deletion_time(_data) : expiry() - ttl();
}
// Can be called only when is_live_and_has_ttl()
gc_clock::time_point expiry() const {
return atomic_cell_type::expiry(_view);
return atomic_cell_type::expiry(_data);
}
// Can be called only when is_live_and_has_ttl()
gc_clock::duration ttl() const {
return atomic_cell_type::ttl(_view);
return atomic_cell_type::ttl(_data);
}
// Can be called on live and dead cells
bool has_expired(gc_clock::time_point now) const {
return is_live_and_has_ttl() && expiry() <= now;
}
managed_bytes_view serialize() const {
return _view;
bytes_view serialize() const {
return _data;
}
void set_counter_in_place_revert(bool flag) {
atomic_cell_type::set_counter_in_place_revert(_data, flag);
}
};
class atomic_cell_view final : public basic_atomic_cell_view<mutable_view::no> {
atomic_cell_view(managed_bytes_view v)
: basic_atomic_cell_view(v) {}
template<mutable_view is_mutable>
atomic_cell_view(basic_atomic_cell_view<is_mutable> view)
: basic_atomic_cell_view<mutable_view::no>(view) {}
friend class atomic_cell;
class atomic_cell_view final : public atomic_cell_base<bytes_view> {
atomic_cell_view(bytes_view data) : atomic_cell_base(std::move(data)) {}
public:
static atomic_cell_view from_bytes(const abstract_type& t, managed_bytes_view v) {
return atomic_cell_view(v);
}
static atomic_cell_view from_bytes(const abstract_type& t, bytes_view v) {
return atomic_cell_view(managed_bytes_view(v));
}
static atomic_cell_view from_bytes(bytes_view data) { return atomic_cell_view(data); }
friend class atomic_cell;
friend std::ostream& operator<<(std::ostream& os, const atomic_cell_view& acv);
class printer {
const abstract_type& _type;
const atomic_cell_view& _cell;
public:
printer(const abstract_type& type, const atomic_cell_view& cell) : _type(type), _cell(cell) {}
friend std::ostream& operator<<(std::ostream& os, const printer& acvp);
};
};
class atomic_cell_mutable_view final : public basic_atomic_cell_view<mutable_view::yes> {
atomic_cell_mutable_view(managed_bytes_mutable_view data)
: basic_atomic_cell_view(data) {}
class atomic_cell_mutable_view final : public atomic_cell_base<bytes_mutable_view> {
atomic_cell_mutable_view(bytes_mutable_view data) : atomic_cell_base(std::move(data)) {}
public:
static atomic_cell_mutable_view from_bytes(const abstract_type& t, managed_bytes_mutable_view v) {
return atomic_cell_mutable_view(v);
}
static atomic_cell_mutable_view from_bytes(bytes_mutable_view data) { return atomic_cell_mutable_view(data); }
friend class atomic_cell;
};
using atomic_cell_ref = atomic_cell_mutable_view;
class atomic_cell final : public basic_atomic_cell_view<mutable_view::yes> {
managed_bytes _data;
atomic_cell(managed_bytes b) : _data(std::move(b)) {
set_view(_data);
}
class atomic_cell_ref final : public atomic_cell_base<managed_bytes&> {
public:
class collection_member_tag;
using collection_member = bool_class<collection_member_tag>;
atomic_cell_ref(managed_bytes& buf) : atomic_cell_base(buf) {}
};
atomic_cell(atomic_cell&& o) noexcept : _data(std::move(o._data)) {
set_view(_data);
class atomic_cell final : public atomic_cell_base<managed_bytes> {
atomic_cell(managed_bytes b) : atomic_cell_base(std::move(b)) {}
public:
atomic_cell(const atomic_cell&) = default;
atomic_cell(atomic_cell&&) = default;
atomic_cell& operator=(const atomic_cell&) = default;
atomic_cell& operator=(atomic_cell&&) = default;
static atomic_cell from_bytes(managed_bytes b) {
return atomic_cell(std::move(b));
}
atomic_cell& operator=(const atomic_cell&) = delete;
atomic_cell& operator=(atomic_cell&& o) {
_data = std::move(o._data);
set_view(_data);
return *this;
atomic_cell(atomic_cell_view other) : atomic_cell_base(managed_bytes{other._data}) {}
operator atomic_cell_view() const {
return atomic_cell_view(_data);
}
operator atomic_cell_view() const { return atomic_cell_view(managed_bytes_view(_data)); }
atomic_cell(const abstract_type& t, atomic_cell_view other);
static atomic_cell make_dead(api::timestamp_type timestamp, gc_clock::time_point deletion_time);
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, bytes_view value,
collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, managed_bytes_view value,
collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, ser::buffer_view<bytes_ostream::fragment_iterator> value,
collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, const fragmented_temporary_buffer::view& value,
collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, const bytes& value,
collection_member cm = collection_member::no) {
return make_live(type, timestamp, bytes_view(value), cm);
static atomic_cell make_dead(api::timestamp_type timestamp, gc_clock::time_point deletion_time) {
return atomic_cell_type::make_dead(timestamp, deletion_time);
}
static atomic_cell make_live_counter_update(api::timestamp_type timestamp, int64_t value);
static atomic_cell make_live(const abstract_type&, api::timestamp_type timestamp, bytes_view value,
gc_clock::time_point expiry, gc_clock::duration ttl, collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type&, api::timestamp_type timestamp, managed_bytes_view value,
gc_clock::time_point expiry, gc_clock::duration ttl, collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type&, api::timestamp_type timestamp, ser::buffer_view<bytes_ostream::fragment_iterator> value,
gc_clock::time_point expiry, gc_clock::duration ttl, collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type&, api::timestamp_type timestamp, const fragmented_temporary_buffer::view& value,
gc_clock::time_point expiry, gc_clock::duration ttl, collection_member = collection_member::no);
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, const bytes& value,
gc_clock::time_point expiry, gc_clock::duration ttl, collection_member cm = collection_member::no)
static atomic_cell make_live(api::timestamp_type timestamp, bytes_view value) {
return atomic_cell_type::make_live(timestamp, value);
}
static atomic_cell make_live(api::timestamp_type timestamp, const bytes& value) {
return make_live(timestamp, bytes_view(value));
}
static atomic_cell make_live_counter_update(api::timestamp_type timestamp, int64_t value) {
return atomic_cell_type::make_live_counter_update(timestamp, value);
}
static atomic_cell make_live(api::timestamp_type timestamp, bytes_view value,
gc_clock::time_point expiry, gc_clock::duration ttl)
{
return make_live(type, timestamp, bytes_view(value), expiry, ttl, cm);
return atomic_cell_type::make_live(timestamp, value, expiry, ttl);
}
static atomic_cell make_live(const abstract_type& type, api::timestamp_type timestamp, bytes_view value, ttl_opt ttl, collection_member cm = collection_member::no) {
static atomic_cell make_live(api::timestamp_type timestamp, const bytes& value,
gc_clock::time_point expiry, gc_clock::duration ttl)
{
return make_live(timestamp, bytes_view(value), expiry, ttl);
}
static atomic_cell make_live(api::timestamp_type timestamp, bytes_view value, ttl_opt ttl) {
if (!ttl) {
return make_live(type, timestamp, value, cm);
return atomic_cell_type::make_live(timestamp, value);
} else {
return make_live(type, timestamp, value, gc_clock::now() + *ttl, *ttl, cm);
return atomic_cell_type::make_live(timestamp, value, gc_clock::now() + *ttl, *ttl);
}
}
static atomic_cell make_live_uninitialized(const abstract_type& type, api::timestamp_type timestamp, size_t size);
template<typename Serializer>
static atomic_cell make_live_from_serializer(api::timestamp_type timestamp, size_t size, Serializer&& serializer) {
return atomic_cell_type::make_live_from_serializer(timestamp, size, std::forward<Serializer>(serializer));
}
friend class atomic_cell_or_collection;
friend std::ostream& operator<<(std::ostream& os, const atomic_cell& ac);
class printer : atomic_cell_view::printer {
public:
printer(const abstract_type& type, const atomic_cell_view& cell) : atomic_cell_view::printer(type, cell) {}
friend std::ostream& operator<<(std::ostream& os, const printer& acvp);
};
};
class collection_mutation_view;
// Represents a mutation of a collection. Actual format is determined by collection type,
// and is:
// set: list of atomic_cell
// map: list of pair<atomic_cell, bytes> (for key/value)
// list: tbd, probably ugly
class collection_mutation {
public:
managed_bytes data;
collection_mutation() {}
collection_mutation(managed_bytes b) : data(std::move(b)) {}
collection_mutation(collection_mutation_view v);
operator collection_mutation_view() const;
};
class collection_mutation_view {
public:
bytes_view data;
bytes_view serialize() const { return data; }
static collection_mutation_view from_bytes(bytes_view v) { return { v }; }
};
inline
collection_mutation::collection_mutation(collection_mutation_view v)
: data(v.data) {
}
inline
collection_mutation::operator collection_mutation_view() const {
return { data };
}
class column_definition;
int compare_atomic_cell_for_merge(atomic_cell_view left, atomic_cell_view right);
void merge_column(const abstract_type& def,
void merge_column(const column_definition& def,
atomic_cell_or_collection& old,
const atomic_cell_or_collection& neww);

View File

@@ -24,7 +24,6 @@
// Not part of atomic_cell.hh to avoid cyclic dependency between types.hh and atomic_cell.hh
#include "types.hh"
#include "types/collection.hh"
#include "atomic_cell.hh"
#include "atomic_cell_or_collection.hh"
#include "hashing.hh"
@@ -34,13 +33,12 @@ template<>
struct appending_hash<collection_mutation_view> {
template<typename Hasher>
void operator()(Hasher& h, collection_mutation_view cell, const column_definition& cdef) const {
cell.with_deserialized(*cdef.type, [&] (collection_mutation_view_description m_view) {
::feed_hash(h, m_view.tomb);
for (auto&& key_and_value : m_view.cells) {
::feed_hash(h, key_and_value.first);
::feed_hash(h, key_and_value.second, cdef);
}
});
auto m_view = collection_type_impl::deserialize_mutation_form(cell);
::feed_hash(h, m_view.tomb);
for (auto&& key_and_value : m_view.cells) {
::feed_hash(h, key_and_value.first);
::feed_hash(h, key_and_value.second, cdef);
}
}
};
@@ -87,9 +85,9 @@ struct appending_hash<atomic_cell_or_collection> {
template<typename Hasher>
void operator()(Hasher& h, const atomic_cell_or_collection& c, const column_definition& cdef) const {
if (cdef.is_atomic()) {
feed_hash(h, c.as_atomic_cell(cdef), cdef);
feed_hash(h, c.as_atomic_cell(), cdef);
} else {
feed_hash(h, c.as_collection_mutation(), cdef);
}
}
};
};

View File

@@ -22,7 +22,6 @@
#pragma once
#include "atomic_cell.hh"
#include "collection_mutation.hh"
#include "schema.hh"
#include "hashing.hh"
@@ -36,39 +35,32 @@ private:
atomic_cell_or_collection(managed_bytes&& data) : _data(std::move(data)) {}
public:
atomic_cell_or_collection() = default;
atomic_cell_or_collection(atomic_cell_or_collection&&) = default;
atomic_cell_or_collection(const atomic_cell_or_collection&) = delete;
atomic_cell_or_collection& operator=(atomic_cell_or_collection&&) = default;
atomic_cell_or_collection& operator=(const atomic_cell_or_collection&) = delete;
atomic_cell_or_collection(atomic_cell ac) : _data(std::move(ac._data)) {}
atomic_cell_or_collection(const abstract_type& at, atomic_cell_view acv);
static atomic_cell_or_collection from_atomic_cell(atomic_cell data) { return { std::move(data._data) }; }
atomic_cell_view as_atomic_cell(const column_definition& cdef) const { return atomic_cell_view::from_bytes(*cdef.type, _data); }
atomic_cell_mutable_view as_mutable_atomic_cell(const column_definition& cdef) { return atomic_cell_mutable_view::from_bytes(*cdef.type, _data); }
atomic_cell_or_collection(collection_mutation cm) : _data(std::move(cm._data)) { }
atomic_cell_or_collection copy(const abstract_type&) const;
atomic_cell_view as_atomic_cell() const { return atomic_cell_view::from_bytes(_data); }
atomic_cell_ref as_atomic_cell_ref() { return { _data }; }
atomic_cell_mutable_view as_mutable_atomic_cell() { return atomic_cell_mutable_view::from_bytes(_data); }
atomic_cell_or_collection(collection_mutation cm) : _data(std::move(cm.data)) {}
explicit operator bool() const {
return !_data.empty();
}
static constexpr bool can_use_mutable_view() {
return true;
bool can_use_mutable_view() const {
return !_data.is_fragmented();
}
static atomic_cell_or_collection from_collection_mutation(collection_mutation data) { return std::move(data._data); }
collection_mutation_view as_collection_mutation() const;
bytes_view serialize() const;
bool equals(const abstract_type& type, const atomic_cell_or_collection& other) const;
size_t external_memory_usage(const abstract_type&) const;
class printer {
const column_definition& _cdef;
const atomic_cell_or_collection& _cell;
public:
printer(const column_definition& cdef, const atomic_cell_or_collection& cell)
: _cdef(cdef), _cell(cell) { }
printer(const printer&) = delete;
printer(printer&&) = delete;
friend std::ostream& operator<<(std::ostream&, const printer&);
};
friend std::ostream& operator<<(std::ostream&, const printer&);
static atomic_cell_or_collection from_collection_mutation(collection_mutation data) {
return std::move(data.data);
}
collection_mutation_view as_collection_mutation() const {
return collection_mutation_view{_data};
}
bytes_view serialize() const {
return _data;
}
bool operator==(const atomic_cell_or_collection& other) const {
return _data == other._data;
}
size_t external_memory_usage() const {
return _data.external_memory_usage();
}
friend std::ostream& operator<<(std::ostream&, const atomic_cell_or_collection&);
};

View File

@@ -26,7 +26,10 @@
namespace auth {
constexpr std::string_view allow_all_authenticator_name("org.apache.cassandra.auth.AllowAllAuthenticator");
const sstring& allow_all_authenticator_name() {
static const sstring name = meta::AUTH_PACKAGE_NAME + "AllowAllAuthenticator";
return name;
}
// To ensure correct initialization order, we unfortunately need to use a string literal.
static const class_registrator<

View File

@@ -37,7 +37,7 @@ class migration_manager;
namespace auth {
extern const std::string_view allow_all_authenticator_name;
const sstring& allow_all_authenticator_name();
class allow_all_authenticator final : public authenticator {
public:
@@ -52,8 +52,8 @@ public:
return make_ready_future<>();
}
virtual std::string_view qualified_java_name() const override {
return allow_all_authenticator_name;
virtual const sstring& qualified_java_name() const override {
return allow_all_authenticator_name();
}
virtual bool require_authentication() const override {
@@ -72,19 +72,19 @@ public:
return make_ready_future<authenticated_user>(anonymous_user());
}
virtual future<> create(std::string_view, const authentication_options& options) const override {
virtual future<> create(stdx::string_view, const authentication_options& options) const override {
return make_ready_future();
}
virtual future<> alter(std::string_view, const authentication_options& options) const override {
virtual future<> alter(stdx::string_view, const authentication_options& options) const override {
return make_ready_future();
}
virtual future<> drop(std::string_view) const override {
virtual future<> drop(stdx::string_view) const override {
return make_ready_future();
}
virtual future<custom_options> query_custom_options(std::string_view role_name) const override {
virtual future<custom_options> query_custom_options(stdx::string_view role_name) const override {
return make_ready_future<custom_options>();
}

View File

@@ -26,7 +26,10 @@
namespace auth {
constexpr std::string_view allow_all_authorizer_name("org.apache.cassandra.auth.AllowAllAuthorizer");
const sstring& allow_all_authorizer_name() {
static const sstring name = meta::AUTH_PACKAGE_NAME + "AllowAllAuthorizer";
return name;
}
// To ensure correct initialization order, we unfortunately need to use a string literal.
static const class_registrator<

View File

@@ -23,6 +23,7 @@
#include "auth/authorizer.hh"
#include "exceptions/exceptions.hh"
#include "stdx.hh"
namespace cql3 {
class query_processor;
@@ -34,7 +35,7 @@ class migration_manager;
namespace auth {
extern const std::string_view allow_all_authorizer_name;
const sstring& allow_all_authorizer_name();
class allow_all_authorizer final : public authorizer {
public:
@@ -49,20 +50,20 @@ public:
return make_ready_future<>();
}
virtual std::string_view qualified_java_name() const override {
return allow_all_authorizer_name;
virtual const sstring& qualified_java_name() const override {
return allow_all_authorizer_name();
}
virtual future<permission_set> authorize(const role_or_anonymous&, const resource&) const override {
return make_ready_future<permission_set>(permissions::ALL);
}
virtual future<> grant(std::string_view, permission_set, const resource&) const override {
virtual future<> grant(stdx::string_view, permission_set, const resource&) const override {
return make_exception_future<>(
unsupported_authorization_operation("GRANT operation is not supported by AllowAllAuthorizer"));
}
virtual future<> revoke(std::string_view, permission_set, const resource&) const override {
virtual future<> revoke(stdx::string_view, permission_set, const resource&) const override {
return make_exception_future<>(
unsupported_authorization_operation("REVOKE operation is not supported by AllowAllAuthorizer"));
}
@@ -73,7 +74,7 @@ public:
"LIST PERMISSIONS operation is not supported by AllowAllAuthorizer"));
}
virtual future<> revoke_all(std::string_view) const override {
virtual future<> revoke_all(stdx::string_view) const override {
return make_exception_future(
unsupported_authorization_operation("REVOKE operation is not supported by AllowAllAuthorizer"));
}

View File

@@ -45,7 +45,7 @@
namespace auth {
authenticated_user::authenticated_user(std::string_view name)
authenticated_user::authenticated_user(stdx::string_view name)
: name(sstring(name)) {
}

View File

@@ -41,7 +41,7 @@
#pragma once
#include <string_view>
#include <experimental/string_view>
#include <functional>
#include <iosfwd>
#include <optional>
@@ -49,6 +49,7 @@
#include <seastar/core/sstring.hh>
#include "seastarx.hh"
#include "stdx.hh"
namespace auth {
@@ -66,7 +67,7 @@ public:
/// An anonymous user.
///
authenticated_user() = default;
explicit authenticated_user(std::string_view name);
explicit authenticated_user(stdx::string_view name);
};
///

View File

@@ -57,7 +57,7 @@ inline bool any_authentication_options(const authentication_options& aos) noexce
class unsupported_authentication_option : public std::invalid_argument {
public:
explicit unsupported_authentication_option(authentication_option k)
: std::invalid_argument(format("The {} option is not supported.", k)) {
: std::invalid_argument(sprint("The %s option is not supported.", k)) {
}
};

View File

@@ -45,6 +45,7 @@
#include "auth/common.hh"
#include "auth/password_authenticator.hh"
#include "cql3/query_processor.hh"
#include "db/config.hh"
#include "utils/class_registrator.hh"
const sstring auth::authenticator::USERNAME_KEY("username");

View File

@@ -41,7 +41,7 @@
#pragma once
#include <string_view>
#include <experimental/string_view>
#include <memory>
#include <set>
#include <stdexcept>
@@ -55,10 +55,10 @@
#include "auth/authentication_options.hh"
#include "auth/resource.hh"
#include "auth/sasl_challenge.hh"
#include "bytes.hh"
#include "enum_set.hh"
#include "exceptions/exceptions.hh"
#include "stdx.hh"
namespace db {
class config;
@@ -96,7 +96,7 @@ public:
///
/// A fully-qualified (class with package) Java-like name for this implementation.
///
virtual std::string_view qualified_java_name() const = 0;
virtual const sstring& qualified_java_name() const = 0;
virtual bool require_authentication() const = 0;
@@ -122,7 +122,7 @@ public:
///
/// The options provided must be a subset of `supported_options()`.
///
virtual future<> create(std::string_view role_name, const authentication_options& options) const = 0;
virtual future<> create(stdx::string_view role_name, const authentication_options& options) const = 0;
///
/// Alter the authentication record of an existing user.
@@ -131,25 +131,39 @@ public:
///
/// Callers must ensure that the specification of `alterable_options()` is adhered to.
///
virtual future<> alter(std::string_view role_name, const authentication_options& options) const = 0;
virtual future<> alter(stdx::string_view role_name, const authentication_options& options) const = 0;
///
/// Delete the authentication record for a user. This will disallow the user from logging in.
///
virtual future<> drop(std::string_view role_name) const = 0;
virtual future<> drop(stdx::string_view role_name) const = 0;
///
/// Query for custom options (those corresponding to \ref authentication_options::options).
///
/// If no options are set the result is an empty container.
///
virtual future<custom_options> query_custom_options(std::string_view role_name) const = 0;
virtual future<custom_options> query_custom_options(stdx::string_view role_name) const = 0;
///
/// System resources used internally as part of the implementation. These are made inaccessible to users.
///
virtual const resource_set& protected_resources() const = 0;
///
/// A stateful SASL challenge which supports many authentication schemes (depending on the implementation).
///
class sasl_challenge {
public:
virtual ~sasl_challenge() = default;
virtual bytes evaluate_response(bytes_view client_response) = 0;
virtual bool is_complete() const = 0;
virtual future<authenticated_user> get_authenticated_user() const = 0;
};
virtual ::shared_ptr<sasl_challenge> new_sasl_challenge() const = 0;
};

View File

@@ -41,7 +41,7 @@
#pragma once
#include <string_view>
#include <experimental/string_view>
#include <functional>
#include <optional>
#include <stdexcept>
@@ -54,6 +54,7 @@
#include "auth/permission.hh"
#include "auth/resource.hh"
#include "seastarx.hh"
#include "stdx.hh"
namespace auth {
@@ -100,7 +101,7 @@ public:
///
/// A fully-qualified (class with package) Java-like name for this implementation.
///
virtual std::string_view qualified_java_name() const = 0;
virtual const sstring& qualified_java_name() const = 0;
///
/// Query for the permissions granted directly to a role for a particular \ref resource (and not any of its
@@ -116,14 +117,14 @@ public:
///
/// \throws \ref unsupported_authorization_operation if granting permissions is not supported.
///
virtual future<> grant(std::string_view role_name, permission_set, const resource&) const = 0;
virtual future<> grant(stdx::string_view role_name, permission_set, const resource&) const = 0;
///
/// Revoke a set of permissions from a role for a particular \ref resource.
///
/// \throws \ref unsupported_authorization_operation if revoking permissions is not supported.
///
virtual future<> revoke(std::string_view role_name, permission_set, const resource&) const = 0;
virtual future<> revoke(stdx::string_view role_name, permission_set, const resource&) const = 0;
///
/// Query for all directly granted permissions.
@@ -137,7 +138,7 @@ public:
///
/// \throws \ref unsupported_authorization_operation if revoking permissions is not supported.
///
virtual future<> revoke_all(std::string_view role_name) const = 0;
virtual future<> revoke_all(stdx::string_view role_name) const = 0;
///
/// Revoke all permissions granted to any role for a particular resource.

View File

@@ -28,15 +28,15 @@
#include "database.hh"
#include "schema_builder.hh"
#include "service/migration_manager.hh"
#include "timeout_config.hh"
namespace auth {
namespace meta {
constexpr std::string_view AUTH_KS("system_auth");
constexpr std::string_view USERS_CF("users");
constexpr std::string_view AUTH_PACKAGE_NAME("org.apache.cassandra.auth.");
const sstring DEFAULT_SUPERUSER_NAME("cassandra");
const sstring AUTH_KS("system_auth");
const sstring USERS_CF("users");
const sstring AUTH_PACKAGE_NAME("org.apache.cassandra.auth.");
}
@@ -47,9 +47,9 @@ future<> do_after_system_ready(seastar::abort_source& as, seastar::noncopyable_f
struct empty_state { };
return delay_until_system_ready(as).then([&as, func = std::move(func)] () mutable {
return exponential_backoff_retry::do_until_value(1s, 1min, as, [func = std::move(func)] {
return func().then_wrapped([] (auto&& f) -> std::optional<empty_state> {
return func().then_wrapped([] (auto&& f) -> stdx::optional<empty_state> {
if (f.failed()) {
auth_log.debug("Auth task failed with error, rescheduling: {}", f.get_exception());
auth_log.info("Auth task failed with error, rescheduling: {}", f.get_exception());
return { };
}
return { empty_state() };
@@ -58,67 +58,40 @@ future<> do_after_system_ready(seastar::abort_source& as, seastar::noncopyable_f
}).discard_result();
}
static future<> create_metadata_table_if_missing_impl(
std::string_view table_name,
future<> create_metadata_table_if_missing(
stdx::string_view table_name,
cql3::query_processor& qp,
std::string_view cql,
stdx::string_view cql,
::service::migration_manager& mm) {
static auto ignore_existing = [] (seastar::noncopyable_function<future<>()> func) {
return futurize_invoke(std::move(func)).handle_exception_type([] (exceptions::already_exists_exception& ignored) { });
};
auto& db = qp.db();
auto parsed_statement = cql3::query_processor::parse_statement(cql);
auto& parsed_cf_statement = static_cast<cql3::statements::raw::cf_statement&>(*parsed_statement);
auto& db = qp.db().local();
parsed_cf_statement.prepare_keyspace(meta::AUTH_KS);
if (db.has_schema(meta::AUTH_KS, sstring(table_name))) {
return make_ready_future<>();
}
auto parsed_statement = static_pointer_cast<cql3::statements::raw::cf_statement>(
cql3::query_processor::parse_statement(cql));
parsed_statement->prepare_keyspace(meta::AUTH_KS);
auto statement = static_pointer_cast<cql3::statements::create_table_statement>(
parsed_cf_statement.prepare(db, qp.get_cql_stats())->statement);
parsed_statement->prepare(db, qp.get_cql_stats())->statement);
const auto schema = statement->get_cf_meta_data(qp.db());
const auto schema = statement->get_cf_meta_data(qp.db().local());
const auto uuid = generate_legacy_id(schema->ks_name(), schema->cf_name());
schema_builder b(schema);
b.set_uuid(uuid);
schema_ptr table = b.build();
return ignore_existing([&mm, table = std::move(table)] () {
return mm.announce_new_column_family(table);
});
return mm.announce_new_column_family(b.build(), false);
}
future<> create_metadata_table_if_missing(
std::string_view table_name,
cql3::query_processor& qp,
std::string_view cql,
::service::migration_manager& mm) noexcept {
return futurize_invoke(create_metadata_table_if_missing_impl, table_name, qp, cql, mm);
}
future<> wait_for_schema_agreement(::service::migration_manager& mm, const database& db, seastar::abort_source& as) {
future<> wait_for_schema_agreement(::service::migration_manager& mm, const database& db) {
static const auto pause = [] { return sleep(std::chrono::milliseconds(500)); };
return do_until([&db, &as] {
as.check();
return db.get_version() != database::empty_version;
}, pause).then([&mm, &as] {
return do_until([&mm, &as] {
as.check();
return mm.have_schema_agreement();
}, pause);
return do_until([&db] { return db.get_version() != database::empty_version; }, pause).then([&mm] {
return do_until([&mm] { return mm.have_schema_agreement(); }, pause);
});
}
::service::query_state& internal_distributed_query_state() noexcept {
#ifdef DEBUG
// Give the much slower debug tests more headroom for completing auth queries.
static const auto t = 30s;
#else
static const auto t = 5s;
#endif
static const timeout_config tc{t, t, t, t, t, t, t};
static thread_local ::service::client_state cs(::service::client_state::internal_tag{}, tc);
static thread_local ::service::query_state qs(cs, empty_service_permit());
return qs;
}
}

View File

@@ -22,25 +22,22 @@
#pragma once
#include <chrono>
#include <string_view>
#include <experimental/string_view>
#include <seastar/core/future.hh>
#include <seastar/core/abort_source.hh>
#include <seastar/util/noncopyable_function.hh>
#include <seastar/core/seastar.hh>
#include <seastar/core/reactor.hh>
#include <seastar/core/resource.hh>
#include <seastar/core/sstring.hh>
#include <seastar/core/smp.hh>
#include "log.hh"
#include "seastarx.hh"
#include "utils/exponential_backoff_retry.hh"
#include "service/query_state.hh"
using namespace std::chrono_literals;
class database;
class timeout_config;
namespace service {
class migration_manager;
@@ -54,16 +51,16 @@ namespace auth {
namespace meta {
constexpr std::string_view DEFAULT_SUPERUSER_NAME("cassandra");
extern const std::string_view AUTH_KS;
extern const std::string_view USERS_CF;
extern const std::string_view AUTH_PACKAGE_NAME;
extern const sstring DEFAULT_SUPERUSER_NAME;
extern const sstring AUTH_KS;
extern const sstring USERS_CF;
extern const sstring AUTH_PACKAGE_NAME;
}
template <class Task>
future<> once_among_shards(Task&& f) {
if (this_shard_id() == 0u) {
if (engine().cpu_id() == 0u) {
return f();
}
@@ -78,16 +75,11 @@ inline future<> delay_until_system_ready(seastar::abort_source& as) {
future<> do_after_system_ready(seastar::abort_source& as, seastar::noncopyable_function<future<>()> func);
future<> create_metadata_table_if_missing(
std::string_view table_name,
stdx::string_view table_name,
cql3::query_processor&,
std::string_view cql,
::service::migration_manager&) noexcept;
stdx::string_view cql,
::service::migration_manager&);
future<> wait_for_schema_agreement(::service::migration_manager&, const database&, seastar::abort_source&);
///
/// Time-outs for internal, non-local CQL queries.
///
::service::query_state& internal_distributed_query_state() noexcept;
future<> wait_for_schema_agreement(::service::migration_manager&, const database&);
}

View File

@@ -51,7 +51,7 @@ extern "C" {
#include <boost/algorithm/string/join.hpp>
#include <boost/range.hpp>
#include <seastar/core/seastar.hh>
#include <seastar/core/reactor.hh>
#include "auth/authenticated_user.hh"
#include "auth/common.hh"
@@ -61,18 +61,18 @@ extern "C" {
#include "cql3/untyped_result_set.hh"
#include "exceptions/exceptions.hh"
#include "log.hh"
#include "database.hh"
namespace auth {
std::string_view default_authorizer::qualified_java_name() const {
return "org.apache.cassandra.auth.CassandraAuthorizer";
const sstring& default_authorizer_name() {
static const sstring name = meta::AUTH_PACKAGE_NAME + "CassandraAuthorizer";
return name;
}
static constexpr std::string_view ROLE_NAME = "role";
static constexpr std::string_view RESOURCE_NAME = "resource";
static constexpr std::string_view PERMISSIONS_NAME = "permissions";
static constexpr std::string_view PERMISSIONS_CF = "role_permissions";
static const sstring ROLE_NAME = "role";
static const sstring RESOURCE_NAME = "resource";
static const sstring PERMISSIONS_NAME = "permissions";
static const sstring PERMISSIONS_CF = "role_permissions";
static logging::logger alogger("default_authorizer");
@@ -94,13 +94,13 @@ default_authorizer::~default_authorizer() {
static const sstring legacy_table_name{"permissions"};
bool default_authorizer::legacy_metadata_exists() const {
return _qp.db().has_schema(meta::AUTH_KS, legacy_table_name);
return _qp.db().local().has_schema(meta::AUTH_KS, legacy_table_name);
}
future<bool> default_authorizer::any_granted() const {
static const sstring query = format("SELECT * FROM {}.{} LIMIT 1", meta::AUTH_KS, PERMISSIONS_CF);
static const sstring query = sprint("SELECT * FROM %s.%s LIMIT 1", meta::AUTH_KS, PERMISSIONS_CF);
return _qp.execute_internal(
return _qp.process(
query,
db::consistency_level::LOCAL_ONE,
{},
@@ -111,9 +111,9 @@ future<bool> default_authorizer::any_granted() const {
future<> default_authorizer::migrate_legacy_metadata() const {
alogger.info("Starting migration of legacy permissions metadata.");
static const sstring query = format("SELECT * FROM {}.{}", meta::AUTH_KS, legacy_table_name);
static const sstring query = sprint("SELECT * FROM %s.%s", meta::AUTH_KS, legacy_table_name);
return _qp.execute_internal(
return _qp.process(
query,
db::consistency_level::LOCAL_ONE).then([this](::shared_ptr<cql3::untyped_result_set> results) {
return do_for_each(*results, [this](const cql3::untyped_result_set_row& row) {
@@ -158,7 +158,7 @@ future<> default_authorizer::start() {
_migration_manager).then([this] {
_finished = do_after_system_ready(_as, [this] {
return async([this] {
wait_for_schema_agreement(_migration_manager, _qp.db(), _as).get0();
wait_for_schema_agreement(_migration_manager, _qp.db().local()).get0();
if (legacy_metadata_exists()) {
if (!any_granted().get0()) {
@@ -176,7 +176,7 @@ future<> default_authorizer::start() {
future<> default_authorizer::stop() {
_as.request_abort();
return _finished.handle_exception_type([](const sleep_aborted&) {}).handle_exception_type([](const abort_requested_exception&) {});
return _finished.handle_exception_type([](const sleep_aborted&) {});
}
future<permission_set>
@@ -185,14 +185,15 @@ default_authorizer::authorize(const role_or_anonymous& maybe_role, const resourc
return make_ready_future<permission_set>(permissions::NONE);
}
static const sstring query = format("SELECT {} FROM {}.{} WHERE {} = ? AND {} = ?",
static const sstring query = sprint(
"SELECT %s FROM %s.%s WHERE %s = ? AND %s = ?",
PERMISSIONS_NAME,
meta::AUTH_KS,
PERMISSIONS_CF,
ROLE_NAME,
RESOURCE_NAME);
return _qp.execute_internal(
return _qp.process(
query,
db::consistency_level::LOCAL_ONE,
{*maybe_role.name, r.name()}).then([](::shared_ptr<cql3::untyped_result_set> results) {
@@ -206,12 +207,13 @@ default_authorizer::authorize(const role_or_anonymous& maybe_role, const resourc
future<>
default_authorizer::modify(
std::string_view role_name,
stdx::string_view role_name,
permission_set set,
const resource& resource,
std::string_view op) const {
stdx::string_view op) const {
return do_with(
format("UPDATE {}.{} SET {} = {} {} ? WHERE {} = ? AND {} = ?",
sprint(
"UPDATE %s.%s SET %s = %s %s ? WHERE %s = ? AND %s = ?",
meta::AUTH_KS,
PERMISSIONS_CF,
PERMISSIONS_NAME,
@@ -220,35 +222,34 @@ default_authorizer::modify(
ROLE_NAME,
RESOURCE_NAME),
[this, &role_name, set, &resource](const auto& query) {
return _qp.execute_internal(
return _qp.process(
query,
db::consistency_level::ONE,
internal_distributed_query_state(),
{permissions::to_strings(set), sstring(role_name), resource.name()}).discard_result();
});
}
future<> default_authorizer::grant(std::string_view role_name, permission_set set, const resource& resource) const {
future<> default_authorizer::grant(stdx::string_view role_name, permission_set set, const resource& resource) const {
return modify(role_name, std::move(set), resource, "+");
}
future<> default_authorizer::revoke(std::string_view role_name, permission_set set, const resource& resource) const {
future<> default_authorizer::revoke(stdx::string_view role_name, permission_set set, const resource& resource) const {
return modify(role_name, std::move(set), resource, "-");
}
future<std::vector<permission_details>> default_authorizer::list_all() const {
static const sstring query = format("SELECT {}, {}, {} FROM {}.{}",
static const sstring query = sprint(
"SELECT %s, %s, %s FROM %s.%s",
ROLE_NAME,
RESOURCE_NAME,
PERMISSIONS_NAME,
meta::AUTH_KS,
PERMISSIONS_CF);
return _qp.execute_internal(
return _qp.process(
query,
db::consistency_level::ONE,
internal_distributed_query_state(),
{},
true).then([](::shared_ptr<cql3::untyped_result_set> results) {
std::vector<permission_details> all_details;
@@ -266,16 +267,16 @@ future<std::vector<permission_details>> default_authorizer::list_all() const {
});
}
future<> default_authorizer::revoke_all(std::string_view role_name) const {
static const sstring query = format("DELETE FROM {}.{} WHERE {} = ?",
future<> default_authorizer::revoke_all(stdx::string_view role_name) const {
static const sstring query = sprint(
"DELETE FROM %s.%s WHERE %s = ?",
meta::AUTH_KS,
PERMISSIONS_CF,
ROLE_NAME);
return _qp.execute_internal(
return _qp.process(
query,
db::consistency_level::ONE,
internal_distributed_query_state(),
{sstring(role_name)}).discard_result().handle_exception([role_name](auto ep) {
try {
std::rethrow_exception(ep);
@@ -286,13 +287,14 @@ future<> default_authorizer::revoke_all(std::string_view role_name) const {
}
future<> default_authorizer::revoke_all(const resource& resource) const {
static const sstring query = format("SELECT {} FROM {}.{} WHERE {} = ? ALLOW FILTERING",
static const sstring query = sprint(
"SELECT %s FROM %s.%s WHERE %s = ? ALLOW FILTERING",
ROLE_NAME,
meta::AUTH_KS,
PERMISSIONS_CF,
RESOURCE_NAME);
return _qp.execute_internal(
return _qp.process(
query,
db::consistency_level::LOCAL_ONE,
{resource.name()}).then_wrapped([this, resource](future<::shared_ptr<cql3::untyped_result_set>> f) {
@@ -302,13 +304,14 @@ future<> default_authorizer::revoke_all(const resource& resource) const {
res->begin(),
res->end(),
[this, res, resource](const cql3::untyped_result_set::row& r) {
static const sstring query = format("DELETE FROM {}.{} WHERE {} = ? AND {} = ?",
static const sstring query = sprint(
"DELETE FROM %s.%s WHERE %s = ? AND %s = ?",
meta::AUTH_KS,
PERMISSIONS_CF,
ROLE_NAME,
RESOURCE_NAME);
return _qp.execute_internal(
return _qp.process(
query,
db::consistency_level::LOCAL_ONE,
{r.get_as<sstring>(ROLE_NAME), resource.name()}).discard_result().handle_exception(

View File

@@ -51,6 +51,8 @@
namespace auth {
const sstring& default_authorizer_name();
class default_authorizer : public authorizer {
cql3::query_processor& _qp;
@@ -69,17 +71,19 @@ public:
virtual future<> stop() override;
virtual std::string_view qualified_java_name() const override;
virtual const sstring& qualified_java_name() const override {
return default_authorizer_name();
}
virtual future<permission_set> authorize(const role_or_anonymous&, const resource&) const override;
virtual future<> grant(std::string_view, permission_set, const resource&) const override;
virtual future<> grant(stdx::string_view, permission_set, const resource&) const override;
virtual future<> revoke( std::string_view, permission_set, const resource&) const override;
virtual future<> revoke( stdx::string_view, permission_set, const resource&) const override;
virtual future<std::vector<permission_details>> list_all() const override;
virtual future<> revoke_all(std::string_view) const override;
virtual future<> revoke_all(stdx::string_view) const override;
virtual future<> revoke_all(const resource&) const override;
@@ -92,7 +96,7 @@ private:
future<> migrate_legacy_metadata() const;
future<> modify(std::string_view, permission_set, const resource&, std::string_view) const;
future<> modify(stdx::string_view, permission_set, const resource&, stdx::string_view) const;
};
} /* namespace auth */

View File

@@ -41,33 +41,37 @@
#include "auth/password_authenticator.hh"
extern "C" {
#include <crypt.h>
#include <unistd.h>
}
#include <algorithm>
#include <chrono>
#include <random>
#include <string_view>
#include <optional>
#include <boost/algorithm/cxx11/all_of.hpp>
#include <seastar/core/seastar.hh>
#include <seastar/core/reactor.hh>
#include "auth/authenticated_user.hh"
#include "auth/common.hh"
#include "auth/passwords.hh"
#include "auth/roles-metadata.hh"
#include "cql3/untyped_result_set.hh"
#include "log.hh"
#include "service/migration_manager.hh"
#include "utils/class_registrator.hh"
#include "database.hh"
namespace auth {
constexpr std::string_view password_authenticator_name("org.apache.cassandra.auth.PasswordAuthenticator");
const sstring& password_authenticator_name() {
static const sstring name = meta::AUTH_PACKAGE_NAME + "PasswordAuthenticator";
return name;
}
// name of the hash column.
static constexpr std::string_view SALTED_HASH = "salted_hash";
static constexpr std::string_view DEFAULT_USER_NAME = meta::DEFAULT_SUPERUSER_NAME;
static const sstring DEFAULT_USER_PASSWORD = sstring(meta::DEFAULT_SUPERUSER_NAME);
static const sstring SALTED_HASH = "salted_hash";
static const sstring DEFAULT_USER_NAME = meta::DEFAULT_SUPERUSER_NAME;
static const sstring DEFAULT_USER_PASSWORD = meta::DEFAULT_SUPERUSER_NAME;
static logging::logger plogger("password_authenticator");
@@ -78,8 +82,6 @@ static const class_registrator<
cql3::query_processor&,
::service::migration_manager&> password_auth_reg("org.apache.cassandra.auth.PasswordAuthenticator");
static thread_local auto rng_for_salt = std::default_random_engine(std::random_device{}());
password_authenticator::~password_authenticator() {
}
@@ -89,40 +91,108 @@ password_authenticator::password_authenticator(cql3::query_processor& qp, ::serv
, _stopped(make_ready_future<>()) {
}
static bool has_salted_hash(const cql3::untyped_result_set_row& row) {
return !row.get_or<sstring>(SALTED_HASH, "").empty();
// TODO: blowfish
// Origin uses Java bcrypt library, i.e. blowfish salt
// generation and hashing, which is arguably a "better"
// password hash than sha/md5 versions usually available in
// crypt_r. Otoh, glibc 2.7+ uses a modified sha512 algo
// which should be the same order of safe, so the only
// real issue should be salted hash compatibility with
// origin if importing system tables from there.
//
// Since bcrypt/blowfish is _not_ (afaict) not available
// as a dev package/lib on most linux distros, we'd have to
// copy and compile for example OWL crypto
// (http://cvsweb.openwall.com/cgi/cvsweb.cgi/Owl/packages/glibc/crypt_blowfish/)
// to be fully bit-compatible.
//
// Until we decide this is needed, let's just use crypt_r,
// and some old-fashioned random salt generation.
static constexpr size_t rand_bytes = 16;
static thread_local crypt_data tlcrypt = { 0, };
static sstring hashpw(const sstring& pass, const sstring& salt) {
auto res = crypt_r(pass.c_str(), salt.c_str(), &tlcrypt);
if (res == nullptr) {
throw std::system_error(errno, std::system_category());
}
return res;
}
static const sstring& update_row_query() {
static const sstring update_row_query = format("UPDATE {} SET {} = ? WHERE {} = ?",
meta::roles_table::qualified_name,
SALTED_HASH,
meta::roles_table::role_col_name);
return update_row_query;
static bool checkpw(const sstring& pass, const sstring& salted_hash) {
auto tmp = hashpw(pass, salted_hash);
return tmp == salted_hash;
}
static sstring gensalt() {
static sstring prefix;
std::random_device rd;
std::default_random_engine e1(rd());
std::uniform_int_distribution<char> dist;
sstring valid_salt = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789./";
sstring input(rand_bytes, 0);
for (char&c : input) {
c = valid_salt[dist(e1) % valid_salt.size()];
}
sstring salt;
if (!prefix.empty()) {
return prefix + input;
}
// Try in order:
// blowfish 2011 fix, blowfish, sha512, sha256, md5
for (sstring pfx : { "$2y$", "$2a$", "$6$", "$5$", "$1$" }) {
salt = pfx + input;
const char* e = crypt_r("fisk", salt.c_str(), &tlcrypt);
if (e && (e[0] != '*')) {
prefix = pfx;
return salt;
}
}
throw std::runtime_error("Could not initialize hashing algorithm");
}
static sstring hashpw(const sstring& pass) {
return hashpw(pass, gensalt());
}
static bool has_salted_hash(const cql3::untyped_result_set_row& row) {
return utf8_type->deserialize(row.get_blob(SALTED_HASH)) != data_value::make_null(utf8_type);
}
static const sstring update_row_query = sprint(
"UPDATE %s SET %s = ? WHERE %s = ?",
meta::roles_table::qualified_name(),
SALTED_HASH,
meta::roles_table::role_col_name);
static const sstring legacy_table_name{"credentials"};
bool password_authenticator::legacy_metadata_exists() const {
return _qp.db().has_schema(meta::AUTH_KS, legacy_table_name);
return _qp.db().local().has_schema(meta::AUTH_KS, legacy_table_name);
}
future<> password_authenticator::migrate_legacy_metadata() const {
plogger.info("Starting migration of legacy authentication metadata.");
static const sstring query = format("SELECT * FROM {}.{}", meta::AUTH_KS, legacy_table_name);
static const sstring query = sprint("SELECT * FROM %s.%s", meta::AUTH_KS, legacy_table_name);
return _qp.execute_internal(
return _qp.process(
query,
db::consistency_level::QUORUM,
internal_distributed_query_state()).then([this](::shared_ptr<cql3::untyped_result_set> results) {
db::consistency_level::QUORUM).then([this](::shared_ptr<cql3::untyped_result_set> results) {
return do_for_each(*results, [this](const cql3::untyped_result_set_row& row) {
auto username = row.get_as<sstring>("username");
auto salted_hash = row.get_as<sstring>(SALTED_HASH);
return _qp.execute_internal(
update_row_query(),
return _qp.process(
update_row_query,
consistency_for_user(username),
internal_distributed_query_state(),
{std::move(salted_hash), username}).discard_result();
}).finally([results] {});
}).then([] {
@@ -136,11 +206,10 @@ future<> password_authenticator::migrate_legacy_metadata() const {
future<> password_authenticator::create_default_if_missing() const {
return default_role_row_satisfies(_qp, &has_salted_hash).then([this](bool exists) {
if (!exists) {
return _qp.execute_internal(
update_row_query(),
return _qp.process(
update_row_query,
db::consistency_level::QUORUM,
internal_distributed_query_state(),
{passwords::hash(DEFAULT_USER_PASSWORD, rng_for_salt), DEFAULT_USER_NAME}).then([](auto&&) {
{hashpw(DEFAULT_USER_PASSWORD), DEFAULT_USER_NAME}).then([](auto&&) {
plogger.info("Created default superuser authentication record.");
});
}
@@ -151,6 +220,8 @@ future<> password_authenticator::create_default_if_missing() const {
future<> password_authenticator::start() {
return once_among_shards([this] {
gensalt(); // do this once to determine usable hashing
auto f = create_metadata_table_if_missing(
meta::roles_table::name,
_qp,
@@ -159,7 +230,7 @@ future<> password_authenticator::start() {
_stopped = do_after_system_ready(_as, [this] {
return async([this] {
wait_for_schema_agreement(_migration_manager, _qp.db(), _as).get0();
wait_for_schema_agreement(_migration_manager, _qp.db().local()).get0();
if (any_nondefault_role_row_satisfies(_qp, &has_salted_hash).get0()) {
if (legacy_metadata_exists()) {
@@ -184,18 +255,18 @@ future<> password_authenticator::start() {
future<> password_authenticator::stop() {
_as.request_abort();
return _stopped.handle_exception_type([] (const sleep_aborted&) { }).handle_exception_type([](const abort_requested_exception&) {});
return _stopped.handle_exception_type([] (const sleep_aborted&) { });
}
db::consistency_level password_authenticator::consistency_for_user(std::string_view role_name) {
db::consistency_level password_authenticator::consistency_for_user(stdx::string_view role_name) {
if (role_name == DEFAULT_USER_NAME) {
return db::consistency_level::QUORUM;
}
return db::consistency_level::LOCAL_ONE;
}
std::string_view password_authenticator::qualified_java_name() const {
return password_authenticator_name;
const sstring& password_authenticator::qualified_java_name() const {
return password_authenticator_name();
}
bool password_authenticator::require_authentication() const {
@@ -212,11 +283,11 @@ authentication_option_set password_authenticator::alterable_options() const {
future<authenticated_user> password_authenticator::authenticate(
const credentials_map& credentials) const {
if (!credentials.contains(USERNAME_KEY)) {
throw exceptions::authentication_exception(format("Required key '{}' is missing", USERNAME_KEY));
if (!credentials.count(USERNAME_KEY)) {
throw exceptions::authentication_exception(sprint("Required key '%s' is missing", USERNAME_KEY));
}
if (!credentials.contains(PASSWORD_KEY)) {
throw exceptions::authentication_exception(format("Required key '{}' is missing", PASSWORD_KEY));
if (!credentials.count(PASSWORD_KEY)) {
throw exceptions::authentication_exception(sprint("Required key '%s' is missing", PASSWORD_KEY));
}
auto& username = credentials.at(USERNAME_KEY);
@@ -227,26 +298,22 @@ future<authenticated_user> password_authenticator::authenticate(
// obsolete prepared statements pretty quickly.
// Rely on query processing caching statements instead, and lets assume
// that a map lookup string->statement is not gonna kill us much.
return futurize_invoke([this, username, password] {
static const sstring query = format("SELECT {} FROM {} WHERE {} = ?",
return futurize_apply([this, username, password] {
static const sstring query = sprint(
"SELECT %s FROM %s WHERE %s = ?",
SALTED_HASH,
meta::roles_table::qualified_name,
meta::roles_table::qualified_name(),
meta::roles_table::role_col_name);
return _qp.execute_internal(
return _qp.process(
query,
consistency_for_user(username),
internal_distributed_query_state(),
{username},
true);
}).then_wrapped([=](future<::shared_ptr<cql3::untyped_result_set>> f) {
try {
auto res = f.get0();
auto salted_hash = std::optional<sstring>();
if (!res->empty()) {
salted_hash = res->one().get_opt<sstring>(SALTED_HASH);
}
if (!salted_hash || !passwords::check(password, *salted_hash)) {
if (res->empty() || !checkpw(password, res->one().get_as<sstring>(SALTED_HASH))) {
throw exceptions::authentication_exception("Username and/or password are incorrect");
}
return make_ready_future<authenticated_user>(username);
@@ -254,56 +321,51 @@ future<authenticated_user> password_authenticator::authenticate(
std::throw_with_nested(exceptions::authentication_exception("Could not verify password"));
} catch (exceptions::request_execution_exception& e) {
std::throw_with_nested(exceptions::authentication_exception(e.what()));
} catch (exceptions::authentication_exception& e) {
std::throw_with_nested(e);
} catch (...) {
std::throw_with_nested(exceptions::authentication_exception("authentication failed"));
}
});
}
future<> password_authenticator::create(std::string_view role_name, const authentication_options& options) const {
future<> password_authenticator::create(stdx::string_view role_name, const authentication_options& options) const {
if (!options.password) {
return make_ready_future<>();
}
return _qp.execute_internal(
update_row_query(),
return _qp.process(
update_row_query,
consistency_for_user(role_name),
internal_distributed_query_state(),
{passwords::hash(*options.password, rng_for_salt), sstring(role_name)}).discard_result();
{hashpw(*options.password), sstring(role_name)}).discard_result();
}
future<> password_authenticator::alter(std::string_view role_name, const authentication_options& options) const {
future<> password_authenticator::alter(stdx::string_view role_name, const authentication_options& options) const {
if (!options.password) {
return make_ready_future<>();
}
static const sstring query = format("UPDATE {} SET {} = ? WHERE {} = ?",
meta::roles_table::qualified_name,
static const sstring query = sprint(
"UPDATE %s SET %s = ? WHERE %s = ?",
meta::roles_table::qualified_name(),
SALTED_HASH,
meta::roles_table::role_col_name);
return _qp.execute_internal(
return _qp.process(
query,
consistency_for_user(role_name),
internal_distributed_query_state(),
{passwords::hash(*options.password, rng_for_salt), sstring(role_name)}).discard_result();
{hashpw(*options.password), sstring(role_name)}).discard_result();
}
future<> password_authenticator::drop(std::string_view name) const {
static const sstring query = format("DELETE {} FROM {} WHERE {} = ?",
future<> password_authenticator::drop(stdx::string_view name) const {
static const sstring query = sprint(
"DELETE %s FROM %s WHERE %s = ?",
SALTED_HASH,
meta::roles_table::qualified_name,
meta::roles_table::qualified_name(),
meta::roles_table::role_col_name);
return _qp.execute_internal(
query, consistency_for_user(name),
internal_distributed_query_state(),
{sstring(name)}).discard_result();
return _qp.process(query, consistency_for_user(name), {sstring(name)}).discard_result();
}
future<custom_options> password_authenticator::query_custom_options(std::string_view role_name) const {
future<custom_options> password_authenticator::query_custom_options(stdx::string_view role_name) const {
return make_ready_future<custom_options>();
}
@@ -312,13 +374,75 @@ const resource_set& password_authenticator::protected_resources() const {
return resources;
}
::shared_ptr<sasl_challenge> password_authenticator::new_sasl_challenge() const {
return ::make_shared<plain_sasl_challenge>([this](std::string_view username, std::string_view password) {
credentials_map credentials{};
credentials[USERNAME_KEY] = sstring(username);
credentials[PASSWORD_KEY] = sstring(password);
return this->authenticate(credentials);
});
::shared_ptr<authenticator::sasl_challenge> password_authenticator::new_sasl_challenge() const {
class plain_text_password_challenge : public sasl_challenge {
const password_authenticator& _self;
public:
plain_text_password_challenge(const password_authenticator& self) : _self(self) {
}
/**
* SASL PLAIN mechanism specifies that credentials are encoded in a
* sequence of UTF-8 bytes, delimited by 0 (US-ASCII NUL).
* The form is : {code}authzId<NUL>authnId<NUL>password<NUL>{code}
* authzId is optional, and in fact we don't care about it here as we'll
* set the authzId to match the authnId (that is, there is no concept of
* a user being authorized to act on behalf of another).
*
* @param bytes encoded credentials string sent by the client
* @return map containing the username/password pairs in the form an IAuthenticator
* would expect
* @throws javax.security.sasl.SaslException
*/
bytes evaluate_response(bytes_view client_response) override {
plogger.debug("Decoding credentials from client token");
sstring username, password;
auto b = client_response.crbegin();
auto e = client_response.crend();
auto i = b;
while (i != e) {
if (*i == 0) {
sstring tmp(i.base(), b.base());
if (password.empty()) {
password = std::move(tmp);
} else if (username.empty()) {
username = std::move(tmp);
}
b = ++i;
continue;
}
++i;
}
if (username.empty()) {
throw exceptions::authentication_exception("Authentication ID must not be null");
}
if (password.empty()) {
throw exceptions::authentication_exception("Password must not be null");
}
_credentials[USERNAME_KEY] = std::move(username);
_credentials[PASSWORD_KEY] = std::move(password);
_complete = true;
return {};
}
bool is_complete() const override {
return _complete;
}
future<authenticated_user> get_authenticated_user() const override {
return _self.authenticate(_credentials);
}
private:
credentials_map _credentials;
bool _complete = false;
};
return ::make_shared<plain_text_password_challenge>(*this);
}
}

View File

@@ -52,7 +52,7 @@ class migration_manager;
namespace auth {
extern const std::string_view password_authenticator_name;
const sstring& password_authenticator_name();
class password_authenticator : public authenticator {
cql3::query_processor& _qp;
@@ -61,7 +61,7 @@ class password_authenticator : public authenticator {
seastar::abort_source _as;
public:
static db::consistency_level consistency_for_user(std::string_view role_name);
static db::consistency_level consistency_for_user(stdx::string_view role_name);
password_authenticator(cql3::query_processor&, ::service::migration_manager&);
@@ -71,7 +71,7 @@ public:
virtual future<> stop() override;
virtual std::string_view qualified_java_name() const override;
virtual const sstring& qualified_java_name() const override;
virtual bool require_authentication() const override;
@@ -81,13 +81,13 @@ public:
virtual future<authenticated_user> authenticate(const credentials_map& credentials) const override;
virtual future<> create(std::string_view role_name, const authentication_options& options) const override;
virtual future<> create(stdx::string_view role_name, const authentication_options& options) const override;
virtual future<> alter(std::string_view role_name, const authentication_options& options) const override;
virtual future<> alter(stdx::string_view role_name, const authentication_options& options) const override;
virtual future<> drop(std::string_view role_name) const override;
virtual future<> drop(stdx::string_view role_name) const override;
virtual future<custom_options> query_custom_options(std::string_view role_name) const override;
virtual future<custom_options> query_custom_options(stdx::string_view role_name) const override;
virtual const resource_set& protected_resources() const override;

View File

@@ -1,84 +0,0 @@
/*
* Copyright (C) 2018 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "auth/passwords.hh"
#include <cerrno>
#include <optional>
extern "C" {
#include <crypt.h>
#include <unistd.h>
}
namespace auth::passwords {
static thread_local crypt_data tlcrypt = { 0, };
namespace detail {
scheme identify_best_supported_scheme() {
const auto all_schemes = { scheme::bcrypt_y, scheme::bcrypt_a, scheme::sha_512, scheme::sha_256, scheme::md5 };
// "Random", for testing schemes.
const sstring random_part_of_salt = "aaaabbbbccccdddd";
for (scheme c : all_schemes) {
const sstring salt = sstring(prefix_for_scheme(c)) + random_part_of_salt;
const char* e = crypt_r("fisk", salt.c_str(), &tlcrypt);
if (e && (e[0] != '*')) {
return c;
}
}
throw no_supported_schemes();
}
sstring hash_with_salt(const sstring& pass, const sstring& salt) {
auto res = crypt_r(pass.c_str(), salt.c_str(), &tlcrypt);
if (!res || (res[0] == '*')) {
throw std::system_error(errno, std::system_category());
}
return res;
}
const char* prefix_for_scheme(scheme c) noexcept {
switch (c) {
case scheme::bcrypt_y: return "$2y$";
case scheme::bcrypt_a: return "$2a$";
case scheme::sha_512: return "$6$";
case scheme::sha_256: return "$5$";
case scheme::md5: return "$1$";
default: return nullptr;
}
}
} // namespace detail
no_supported_schemes::no_supported_schemes()
: std::runtime_error("No allowed hashing schemes are supported on this system") {
}
bool check(const sstring& pass, const sstring& salted_hash) {
return detail::hash_with_salt(pass, salted_hash) == salted_hash;
}
} // namespace auth::paswords

View File

@@ -1,125 +0,0 @@
/*
* Copyright (C) 2018 ScyllaDB
*/
/*
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <random>
#include <stdexcept>
#include <seastar/core/sstring.hh>
#include "seastarx.hh"
namespace auth::passwords {
class no_supported_schemes : public std::runtime_error {
public:
no_supported_schemes();
};
///
/// Apache Cassandra uses a library to provide the bcrypt scheme. Many Linux implementations do not support bcrypt, so
/// we support alternatives. The cost is loss of direct compatibility with Apache Cassandra system tables.
///
enum class scheme {
bcrypt_y,
bcrypt_a,
sha_512,
sha_256,
md5
};
namespace detail {
template <typename RandomNumberEngine>
sstring generate_random_salt_bytes(RandomNumberEngine& g) {
static const sstring valid_bytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789./";
static constexpr std::size_t num_bytes = 16;
std::uniform_int_distribution<std::size_t> dist(0, valid_bytes.size() - 1);
sstring result(num_bytes, 0);
for (char& c : result) {
c = valid_bytes[dist(g)];
}
return result;
}
///
/// Test each allowed hashing scheme and report the best supported one on the current system.
///
/// \throws \ref no_supported_schemes when none of the known schemes is supported.
///
scheme identify_best_supported_scheme();
const char* prefix_for_scheme(scheme) noexcept;
///
/// Generate a implementation-specific salt string for hashing passwords.
///
/// The `RandomNumberEngine` is used to generate the string, which is an implementation-specific length.
///
/// \throws \ref no_supported_schemes when no known hashing schemes are supported on the system.
///
template <typename RandomNumberEngine>
sstring generate_salt(RandomNumberEngine& g) {
static const scheme scheme = identify_best_supported_scheme();
static const sstring prefix = sstring(prefix_for_scheme(scheme));
return prefix + generate_random_salt_bytes(g);
}
///
/// Hash a password combined with an implementation-specific salt string.
///
/// \throws \ref std::system_error when an unexpected implementation-specific error occurs.
///
sstring hash_with_salt(const sstring& pass, const sstring& salt);
} // namespace detail
///
/// Run a one-way hashing function on cleartext to produce encrypted text.
///
/// Prior to applying the hashing function, random salt is amended to the cleartext. The random salt bytes are generated
/// according to the random number engine `g`.
///
/// The result is the encrypted cyphertext, and also the salt used but in a implementation-specific format.
///
/// \throws \ref std::system_error when the implementation-specific implementation fails to hash the cleartext.
///
template <typename RandomNumberEngine>
sstring hash(const sstring& pass, RandomNumberEngine& g) {
return detail::hash_with_salt(pass, detail::generate_salt(g));
}
///
/// Check that cleartext matches previously hashed cleartext with salt.
///
/// \ref salted_hash is the result of invoking \ref hash, which is the implementation-specific combination of the hashed
/// password and the salt that was generated for it.
///
/// \returns `true` if the cleartext matches the salted hash.
///
/// \throws \ref std::system_error when an unexpected implementation-specific error occurs.
///
bool check(const sstring& pass, const sstring& salted_hash);
} // namespace auth::passwords

View File

@@ -24,9 +24,19 @@
#include "auth/authorizer.hh"
#include "auth/common.hh"
#include "auth/service.hh"
#include "db/config.hh"
namespace auth {
permissions_cache_config permissions_cache_config::from_db_config(const db::config& dc) {
permissions_cache_config c;
c.max_entries = dc.permissions_cache_max_entries();
c.validity_period = std::chrono::milliseconds(dc.permissions_validity_in_ms());
c.update_period = std::chrono::milliseconds(dc.permissions_update_interval_in_ms());
return c;
}
permissions_cache::permissions_cache(const permissions_cache_config& c, service& ser, logging::logger& log)
: _cache(c.max_entries, c.validity_period, c.update_period, log, [&ser, &log](const key_type& k) {
log.debug("Refreshing permissions for {}", k.first);

View File

@@ -22,7 +22,7 @@
#pragma once
#include <chrono>
#include <string_view>
#include <experimental/string_view>
#include <functional>
#include <iostream>
#include <optional>
@@ -37,6 +37,7 @@
#include "auth/resource.hh"
#include "auth/role_or_anonymous.hh"
#include "log.hh"
#include "stdx.hh"
#include "utils/hash.hh"
#include "utils/loading_cache.hh"
@@ -58,6 +59,8 @@ namespace auth {
class service;
struct permissions_cache_config final {
static permissions_cache_config from_db_config(const db::config&);
std::size_t max_entries;
std::chrono::milliseconds validity_period;
std::chrono::milliseconds update_period;

View File

@@ -61,7 +61,7 @@ std::ostream& operator<<(std::ostream& os, resource_kind kind) {
return os;
}
static const std::unordered_map<resource_kind, std::string_view> roots{
static const std::unordered_map<resource_kind, stdx::string_view> roots{
{resource_kind::data, "data"},
{resource_kind::role, "roles"}};
@@ -101,25 +101,24 @@ static permission_set applicable_permissions(const role_resource_view& rv) {
permission::DESCRIBE>();
}
resource::resource(resource_kind kind) : _kind(kind) {
_parts.emplace_back(roots.at(kind));
resource::resource(resource_kind kind) : _kind(kind), _parts{sstring(roots.at(kind))} {
}
resource::resource(resource_kind kind, utils::small_vector<sstring, 3> parts) : resource(kind) {
resource::resource(resource_kind kind, std::vector<sstring> parts) : resource(kind) {
_parts.reserve(parts.size() + 1);
_parts.insert(_parts.end(), std::make_move_iterator(parts.begin()), std::make_move_iterator(parts.end()));
}
resource::resource(data_resource_t, std::string_view keyspace) : resource(resource_kind::data) {
_parts.emplace_back(keyspace);
resource::resource(data_resource_t, stdx::string_view keyspace)
: resource(resource_kind::data, std::vector<sstring>{sstring(keyspace)}) {
}
resource::resource(data_resource_t, std::string_view keyspace, std::string_view table) : resource(resource_kind::data) {
_parts.emplace_back(keyspace);
_parts.emplace_back(table);
resource::resource(data_resource_t, stdx::string_view keyspace, stdx::string_view table)
: resource(resource_kind::data, std::vector<sstring>{sstring(keyspace), sstring(table)}) {
}
resource::resource(role_resource_t, std::string_view role) : resource(resource_kind::role) {
_parts.emplace_back(role);
resource::resource(role_resource_t, stdx::string_view role)
: resource(resource_kind::role, std::vector<sstring>{sstring(role)}) {
}
sstring resource::name() const {
@@ -174,7 +173,7 @@ data_resource_view::data_resource_view(const resource& r) : _resource(r) {
}
}
std::optional<std::string_view> data_resource_view::keyspace() const {
std::optional<stdx::string_view> data_resource_view::keyspace() const {
if (_resource._parts.size() == 1) {
return {};
}
@@ -182,7 +181,7 @@ std::optional<std::string_view> data_resource_view::keyspace() const {
return _resource._parts[1];
}
std::optional<std::string_view> data_resource_view::table() const {
std::optional<stdx::string_view> data_resource_view::table() const {
if (_resource._parts.size() <= 2) {
return {};
}
@@ -211,7 +210,7 @@ role_resource_view::role_resource_view(const resource& r) : _resource(r) {
}
}
std::optional<std::string_view> role_resource_view::role() const {
std::optional<stdx::string_view> role_resource_view::role() const {
if (_resource._parts.size() == 1) {
return {};
}
@@ -231,9 +230,9 @@ std::ostream& operator<<(std::ostream& os, const role_resource_view& v) {
return os;
}
resource parse_resource(std::string_view name) {
static const std::unordered_map<std::string_view, resource_kind> reverse_roots = [] {
std::unordered_map<std::string_view, resource_kind> result;
resource parse_resource(stdx::string_view name) {
static const std::unordered_map<stdx::string_view, resource_kind> reverse_roots = [] {
std::unordered_map<stdx::string_view, resource_kind> result;
for (const auto& pair : roots) {
result.emplace(pair.second, pair.first);
@@ -242,7 +241,7 @@ resource parse_resource(std::string_view name) {
return result;
}();
utils::small_vector<sstring, 3> parts;
std::vector<sstring> parts;
boost::split(parts, name, [](char ch) { return ch == '/'; });
if (parts.empty()) {

Some files were not shown because too many files have changed in this diff Show More