Compare commits

167 Commits

Author SHA1 Message Date
Pekka Enberg
1915521974 release: prepare for 1.0.4 2016-05-29 10:41:38 +03:00
Tomasz Grabiec
ef9974e723 tests: Add unit tests for schema_registry
(cherry picked from commit 90c31701e3)
2016-05-18 14:52:45 +03:00
Tomasz Grabiec
93ac6a584a schema_registry: Fix possible hang in maybe_sync() if syncer doesn't defer
Spotted during code review.

If it doesn't defer, we may execute then_wrapped() body before we
change the state. Fix by moving then_wrapped() body after state changes.

(cherry picked from commit 443e5aef5a)
2016-05-18 13:53:14 +03:00
Tomasz Grabiec
2457a16d23 migration_manager: Fix schema syncing with older version
The problem was that "s" would not be marked as synced-with if it came from
shard != 0.

As a result, mutation using that schema would fail to apply with an exception:

  "attempted to mutate using not synced schema of ..."

The problem could surface when altering schema without changing
columns and restarting one of the nodes so that it forgets past
versions.

Fixes #1258.

Will be covered by dtest:

  SchemaManagementTest.test_prepared_statements_work_after_node_restart_after_altering_schema_without_changing_columns

(cherry picked from commit 8703136a4f)
2016-05-18 13:52:24 +03:00
Tomasz Grabiec
daabc8777d migration_manager: Invalidate prepared statements on every schema change
Currently we only do that when column set changes. When prepared
statements are executed, parameters like read repair chance are read
from schema version stored in the statement. Not invalidating prepared
statements on changes of such parameters will appear as if alter took
no effect.

Fixes #1255.
Message-Id: <1462985495-9767-1-git-send-email-tgrabiec@scylladb.com>

(cherry picked from commit 13d8cd0ae9)
(cherry picked from commit 734cfa949a)
2016-05-15 13:36:39 +03:00
Raphael S. Carvalho
b259e1b0bc tests: test that leveled strategy was fixed
L1 wasn't being compacted into L2.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <1a357896a448eafa7da4d28bc56fa02b89d4193e.1460508373.git.raphaelsc@scylladb.com>
(cherry picked from commit beaacbda2e)
2016-05-09 08:17:59 +03:00
Raphael S. Carvalho
322f194032 sstables: Fix leveled compaction strategy
There is a problem in the implementation of leveled compaction strategy that
prevents level 1 from being compacted into level 2, and so forth. As a result,
all sstables will only belong to either level 0 or 1. One of the consequences
is level 1 being overwhelmed by a huge amount of sstables.

The root of the problem is a conditional statement in the code that prevents a
single sstable, with level > 0, from being compacted into a subsequent level
that is empty or has no overlapping sstables.

Fixes #1180.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <9a4bffdb0368dea77b49c23687015ff5832299ab.1460508373.git.raphaelsc@scylladb.com>
(cherry picked from commit c7b728e716)
2016-05-09 08:17:39 +03:00
Glauber Costa
c51b05efb3 throttle: always release at least one request if we are below the limit
Our current throttling code releases one request per 1MB of memory available
that we have. If we are below the memory limit, but not by 1MB or more, then
we will keep getting to unthrottle, but never really do anything.

If another memtable is close to the flushing point, those requests may be
exactly the ones that would make it flush. Without them, we'll freeze the
database.

In general, we need to always release at least one request to make sure that
progress is always achieved.

This fixes #1144

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 9c87ae3496)
2016-05-09 08:14:37 +03:00
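
The rule described in c51b05efb3 can be illustrated with a small sketch. This is not the Scylla implementation (which is C++); the function name and its two parameters are hypothetical, and only the "one request per 1MB, but at least one when below the limit" behaviour is taken from the commit message.

```python
MB = 1024 * 1024


def requests_to_release(memory_used: int, memory_limit: int) -> int:
    """Return how many throttled requests may be released right now."""
    if memory_used >= memory_limit:
        # At or above the limit: keep throttling.
        return 0
    available = memory_limit - memory_used
    # One request per full 1MB of headroom, but never fewer than one,
    # so that some progress is made whenever we are below the limit.
    return max(1, available // MB)
```

Without the `max(1, ...)` clamp, a headroom of less than 1MB would release nothing, which is the stall the commit message describes.
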
Glauber Costa
2e41a09631 memtable_list: make sure at least two memtables are available
This is usually not a problem for the main memtable list - although it can be,
depending on settings, but shows up easily for the streaming memtables list.

We would like to have at least two memtables, even if we have to cut it short.
If we don't do that, one memtable will use all available memory and we'll
force throttling until the memtable gets totally flushed.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 2c5dfe08c1)
2016-05-09 08:14:37 +03:00
Glauber Costa
44cfbc15d0 unnest throttle_state
throttle_state is currently a nested member of database, but there is no
particular reason - aside from the fact that it is currently only ever
referenced by the database - for us to do so.

We'll soon want to have some interaction between this and the column family, to
allow us to flush during throttle. To make that easier, let's unnest it.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 1daede7396)
2016-05-09 08:14:37 +03:00
Glauber Costa
c9bd954237 move information about memtables' region group inside memtable list
This is a preparation patch so we can move the throttling infrastructure inside
the memtable_list. To do that, the region group will have to be passed to the
throttler so let's just go ahead and store it.

In consequence of that, all that the CF has to tell us is what is the current
schema - no longer how to create a new memtable.

Also, with a new parameter to be passed to the memtable_list the creation code
gets quite big and hard to follow. So let's move the creation functions to a
helper.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 39def369ce)
2016-05-09 08:14:37 +03:00
Calle Wilund
6c4d7223fe database.cc: Fix compilation error with boost 1.55
Message-Id: <1461067254-526-1-git-send-email-calle@scylladb.com>
(cherry picked from commit 9130b0de16)
2016-05-04 08:42:21 +03:00
Calle Wilund
c1a5488993 sstables: Fix compilation error on boost 1.55
Message-Id: <1461067254-526-2-git-send-email-calle@scylladb.com>
(cherry picked from commit 49d3d79dfe)
2016-05-04 08:42:15 +03:00
Pekka Enberg
9c9f62e30b release: prepare for 1.0.3 2016-05-02 14:29:15 +03:00
Pekka Enberg
c147676ccb dist/docker/redhat: Make sure image builds against latest Scylla
Use "yum clean expire-cache" to make sure we build against the latest
Scylla release.
Message-Id: <1460374418-27315-1-git-send-email-penberg@scylladb.com>

(cherry picked from commit 355c3ea331)
2016-04-27 15:07:38 +03:00
Raphael S. Carvalho
07adedf28a tests: fix use-after-free in sstable test
After commit a843aea547, a gate was introduced to make sure that
an asynchronous operation is finished before column family is
destroyed. A sstable testcase was not stopping column family,
instead it just removed column family from compaction manager.
That could cause a use-after-free if column family is destroyed
while the asynchronous operation is running. Let's fix it by
stopping column family in the test.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <ed910ec459c1752148099e6dc503e7f3adee54da.1461177411.git.raphaelsc@scylladb.com>
(cherry picked from commit eb51c93a5a)
2016-04-26 10:37:40 +03:00
Pekka Enberg
8ca530b6d3 Merge "Backport atomic sstable deletion to 1.0" from Avi
"This patchset is a backport of the atomic sstable deletion patchset, which
 waits until all shards agree to delete an sstable set before deleting it,
 avoiding the resurrecting data problem.

 The first four patches are identical to master, the last patch is new.

 Fixes #1181"
2016-04-25 14:12:33 +03:00
Avi Kivity
e5a123ea80 sstables: avoid long-duration smp calls in delete_atomically()
Since seastar is limited to 128 cross-shard calls per shard-pair,
long-duration smp calls can lead to deadlocks.

Prevent such calls by returning immediately from shard 0 (which manages
the deletions), and calling back to the requesting shard when the deletion
completes.
2016-04-25 13:21:00 +03:00
Avi Kivity
9bfce3255a db: delete compacted sstables atomically
If sstables A, B are compacted, A and B must be deleted atomically.
Otherwise, if A has data that is covered by a tombstone in B, and that
tombstone is deleted, and if B is deleted while A is not, then the data
in A is resurrected.

Fixes #1181.

(cherry picked from commit a843aea547)
2016-04-25 11:41:50 +03:00
Avi Kivity
d2251199b2 sstables: convert sstable::mark_for_deletion() to atomic deletion infrastructure
All deletions must go through the same data structure, or some atomic
deletions will never be satisfied.

(cherry picked from commit 3798d04ae8)
2016-04-25 11:41:39 +03:00
Avi Kivity
bed6437b38 main: cancel pending atomic deletions on shutdown
A shared sstable must be compacted by all shards before it can be deleted.
Since we're stopping, that's not going to happen.  Cancel those pending
deletions to let anyone waiting on them to continue.

(cherry picked from commit e43dbac836)
2016-04-25 11:41:28 +03:00
Avi Kivity
70508734a5 sstables: add delete_atomically(), for atomically deleting multiple sstables
When we compact a set of sstables, we have to remove the set atomically,
otherwise we can resurrect data if the following happens:

 insert data to sstable A
 insert tombstone to sstable B
 compact A+B -> C (removing both data and tombstone)
 delete B only
 read data from A

Since an sstable may be shared by multiple shards, and each shard performs
compaction at a different time, we need to defer deletion of an sstable
set until all shards agree that the set can be deleted.

An additional atomicity issue exists because posix does not provide a way
to atomically delete multiple files.  This issue is not addressed by this
patch.

(cherry picked from commit 2ba584db8d)
2016-04-25 11:41:20 +03:00
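
The deferral described in 70508734a5 (delete an sstable set only once all shards agree) can be modelled roughly as follows. This is a toy model, not the Scylla code: the class name, its methods, and the `shards_sharing` mapping are all assumptions made for illustration, and no files are touched.

```python
class AtomicDeletionManager:
    """Toy coordinator that defers deleting an sstable set until all shards agree."""

    def __init__(self, shards_sharing):
        # shards_sharing: sstable name -> set of shard ids that share it (hypothetical input)
        self._shards_sharing = {n: set(s) for n, s in shards_sharing.items()}
        self._agreed = {n: set() for n in shards_sharing}
        self._pending = []   # sstable sets still waiting for agreement
        self.deleted = []    # sets that were deleted, in order

    def delete_atomically(self, shard, sstables):
        """Record that `shard` no longer needs `sstables`; delete when everyone agrees."""
        group = frozenset(sstables)
        if group not in self._pending:
            self._pending.append(group)
        for name in group:
            self._agreed[name].add(shard)
        self._delete_ready_sets()

    def _delete_ready_sets(self):
        waiting = []
        for group in self._pending:
            ready = all(self._agreed[n] >= self._shards_sharing[n] for n in group)
            if ready:
                # Every sharing shard has agreed: remove the whole set together.
                self.deleted.append(sorted(group))
            else:
                waiting.append(group)
        self._pending = waiting
```

With sstables A and B both shared by shards 0 and 1, a request from shard 0 alone deletes nothing; the set is removed only after shard 1 makes the same request, so B is never deleted while A is still live.
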
Pekka Enberg
60307f62fe release: prepare for 1.0.2 2016-04-20 22:10:57 +03:00
Gleb Natapov
8006a15e3b udt: fix error generation if accessed type is not udt
Fixes #1198
Message-Id: <1460884314-3717-2-git-send-email-gleb@scylladb.com>

(cherry picked from commit f3b515052b)
2016-04-19 11:28:53 +03:00
Duarte Nunes
1cfbc29f01 udt: Implement to_string() for selectable
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1460884314-3717-1-git-send-email-gleb@scylladb.com>
(cherry picked from commit ece89069dd)
2016-04-19 11:28:46 +03:00
Tomasz Grabiec
c665455b71 tests: Add test for query of collection with deleted item
(cherry picked from commit 89bc32b020)
2016-04-18 11:30:28 +03:00
Tomasz Grabiec
b09c91d1c8 mutation_partition: Fix collection emptiness check
Broken by f15c380a4f.

This resulted in empty collection being returned in the results
instead of no collection.

Fixes org.apache.cassandra.cql3.validation.entities.CollectionsTest
from cassandra-unit-tests.

(cherry picked from commit c69d0a8e87)
2016-04-18 11:30:22 +03:00
Tomasz Grabiec
776ae831e6 types: Add default argument values to is_any_live()
(cherry picked from commit b0d4782016)
2016-04-18 11:30:16 +03:00
Pekka Enberg
2ad3c7532f Merge "Summary backport" from Glauber
This series contains 1.0 backports of the following series:

 * Commit 9b98278 ("Merge "Be able to boot without a Summary" from Glauber")
 * Commit 60352f8 ("Merge "Fixes for the reading of missing Summary" from Glauber")

The backport was done by Glauber because the original commits don't work
as-is due to I/O error handling differences in master and 1.0.

Fixes #1170
2016-04-13 22:02:40 +03:00
Glauber Costa
91c35c3e19 sstable_tests: make sure the generation of the Summary is sane
When we recreate the summary from a missing Summary, we should make
sure it is generated sanely, and that it resembles the Summary that
would have otherwise been there.

In this test we'll grab one of the Summary tests we've been doing,
and just apply it to the non-existent Summary file. We expect
the same results on those cases. Plus, a new test is added with some
sanity checking.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:18:03 -04:00
Glauber Costa
4f0cc195dc be robust against broken summary files
Now that we can boot without a Summary file, we can just as easily boot
with a broken one.

Suggested by Nadav, and it is actually very easy to do, so do it.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:17:54 -04:00
Glauber Costa
c9f7986be4 review fixes for generate_summary
Spotted by Avi post-merge
1) Need to close the file
2) Should be using the parameter pc instead of the default_class

1.0 backport: general_disk_error is non-existent. Replace it with just
propagating the exception

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:17:15 -04:00
Glauber Costa
4feaf1372b clear components if reading toc fails
This shouldn't be a problem in practice, because if read_toc() fails,
the users will just tend to discard the sstable object altogether, and
not insist on using it.

However, if somebody does try to keep using it, a subsequent read_toc() could
theoretically have some components filled up leading the new reader to believe
the toc was populated successfully.

It is easier to just clear the _components set and never worry about it, than
trying to reason about whether or not that could happen.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:14:04 -04:00
Glauber Costa
3ebfecc88e index_reader: avoid misleading parent name
Also add comments about the expected signature of IndexConsumer

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:13:56 -04:00
Glauber Costa
c841d87fe3 summary: generate one if it is not present
There are cases in which a Summary file will not be present, and imported
SSTables will have just the Index and Data files. In earlier versions of
Cassandra, a Summary didn't exist, so one may not be generated when migrating.

In Issue #1170, we can see an example of tables generated by CQLSSTableWriter,
and they lack a Summary. Cassandra is robust against this and can cope
perfectly with the Summary not existing. I will argue that we should do the
same.

1.0 backport: open_checked_file_dma -> open_file_dma

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:13:11 -04:00
Glauber Costa
7a887ea2ea sstables: allow read_toc to be called more than once
We do that by bailing immediately if we detect that the components
map is already populated. This allows us to call read_toc() earlier
if we need to - for instance, to inquire about the existence of the
Summary - without the need to re-read the components again later.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:10:52 -04:00
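
A minimal sketch of the early-exit behaviour from 7a887ea2ea, combined with the clear-on-failure behaviour from 4feaf1372b. The class, the `load_toc` callable, and the `toc_reads` counter are hypothetical stand-ins; the real code is C++ and reads the TOC file asynchronously.

```python
class SSTable:
    """Toy sstable whose TOC can be read more than once without re-reading."""

    def __init__(self, load_toc):
        # load_toc: hypothetical callable yielding component names from the TOC
        self._load_toc = load_toc
        self._components = set()
        self.toc_reads = 0

    def read_toc(self):
        if self._components:
            return  # already populated: bail immediately
        self.toc_reads += 1
        try:
            for name in self._load_toc():
                self._components.add(name)
        except Exception:
            # Do not leave a half-filled set behind; a later read_toc()
            # must not mistake it for a successfully read TOC.
            self._components.clear()
            raise

    def has_component(self, name):
        self.read_toc()
        return name in self._components
```

A caller can then ask `has_component("Summary.db")` early and still call `read_toc()` later without a second read.
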
Glauber Costa
bc4d63c802 sstables: avoid passing schema unnecessarily
for prepare_summary we can just pass the min interval as a parameter and
avoid having the schema do yet another hop. For sealing the summary, it
is completely unused and we can do away with it.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:10:41 -04:00
Glauber Costa
616196b543 index reader: make index_consumer a template parameter
This is done so we can use other consumers. An example of that, is regeneration
of the Summary from an existing Index.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:10:32 -04:00
Glauber Costa
a04f462904 make get_sstable_key_range an instance method
Because just creating an SSTable object does not generate any I/O,
get_sstable_key_range should be an instance method. The main advantage
of doing that is that we won't have to read the summary twice. The way
we're doing it currently, if it happens to be a shard-relevant table we'll
call load() - which reads the summary again.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:10:18 -04:00
Glauber Costa
ebf8fb802e do not re-read the summary
There are times in which we read the Summary file twice. That actually happens
every time during normal boot (it doesn't during refresh). First during
get_sstable_key_range and then again during load().

Every summary will have at least one entry, so we can easily test for whether
or not this is properly initialized.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-04-13 14:10:00 -04:00
Avi Kivity
8d1374e911 sstables: filter sstables single-row read using first_key/last_key
Using leveled compaction strategy, only a few sstables will contain a
given key, so we need to filter out the rest.  Using the summary entries
to filter keys works if the key is before the first summary entry,
but does not work if it is after the last summary entry, because the last
summary entry does not represent the last key; so sstables that
are towards the beginning of the ring are read even if they do not contain
the key, greatly reducing read performance.

Fix by consulting the summary's first_key/last_key entries before consulting
the summary entry array.

(cherry picked from commit 715794cce6)
2016-04-13 09:25:07 +03:00
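
The difference between the two filters in 8d1374e911 can be shown with a small sketch. Integers stand in for ring positions, and both function names are hypothetical; this is not the Scylla code.

```python
def summary_only_filter(key, summary_entries):
    """Old behaviour (as described): reject only keys before the first summary entry."""
    return key >= summary_entries[0]


def may_contain(key, first_key, last_key):
    """New behaviour: consult the sstable's first_key/last_key bounds first."""
    return first_key <= key <= last_key


# An sstable whose sampled summary entries are 10, 20, 30 and whose
# actual last key is 35 (the last summary entry is not the last key).
summary_entries = [10, 20, 30]
first_key, last_key = 10, 35

# A key well past the end of this sstable:
passes_old_filter = summary_only_filter(50, summary_entries)
passes_new_filter = may_contain(50, first_key, last_key)
```

Here the summary-only check lets key 50 through, so the sstable would be read for nothing, while the first_key/last_key check rejects it.
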
Avi Kivity
bacc769328 Update seastar submodule (branch-1.0)
* seastar aa281bd...0225940 (10):
  > memory: avoid exercising the reclaimers for oversized requests
  > memory: fix live objects counter underflow due to cross-cpu free
  > core/reactor: Don't abort in allocate_aligned_buffer() on allocation failure
  > scripts/posix_net_conf.sh: added a support for bonding interfaces
  > scripts/posix_net_conf.sh: move the NIC configuration code into a separate function
  > scripts/posix_net_conf.sh: implement the logic for selecting default MQ mode
  > scripts/posix_net_conf.sh: forward the interface name as a parameter
  > http/routes: Remove request failure logging to stderr
  > lowres_clock: Initialize _now when the clock is created
  > apps/iotune: fix broken URL
2016-04-11 09:18:47 +03:00
Avi Kivity
241eb9e199 Update seastar submodule to point to scylla-seastar
This allows us to cherry-pick seastar fixes.
2016-04-10 18:25:31 +03:00
Pekka Enberg
58fdfe5bc9 release: prepare for 1.0.1 2016-04-09 19:21:21 +03:00
Tomasz Grabiec
f45cc1b229 tests: cql_query_test: Add test for slicing in reverse
(cherry picked from commit 3e0c24934b)
2016-04-09 18:42:53 +03:00
Tomasz Grabiec
14f9eeaafd mutation_partition: Fix static row being returned when paginating
Reproduced by dtest paging_test.py:TestPagingData.static_columns_paging_test.

Broken by f15c380a4f, where the
calculation of has_ck_selector got broken, in such a way that present
clustering restrictions were treated as if not present, which resulted
in static row being returned when it shouldn't.

While at it, unify the check between query_compacted() and
do_compact() by extracting it to a function.

(cherry picked from commit c2b955d40b)
2016-04-09 18:42:53 +03:00
Tomasz Grabiec
05df90ad4b mutation_partition: Fix reversed trim_rows()
The first erase_and_dispose(), which removes rows between last
position and beginning of the next range, can invalidate end()
iterator of the range. Fix by looking up end after erasing.

mutation_partition::range() was split into lower_bound() and
upper_bound() to allow for that.

This affects for example queries with descending order where the
selected clustering range is empty and falls before all rows.

Exposed by f15c380a4f, which is now
calling do_compact() during query.

Reproduced by dtest paging_test.py:TestPagingData.static_columns_paging_test

(cherry picked from commit a1539fed95)
2016-04-09 18:42:53 +03:00
Tomasz Grabiec
5646faba18 tests: Add test for query digest calculation
(cherry picked from commit 474a35ba6b)
2016-04-09 18:42:52 +03:00
Tomasz Grabiec
814df06245 tests: mutation_source: Include random mutations in generate_mutation_sets() result
Probably increases coverage.

(cherry picked from commit 4418da77e6)
2016-04-09 18:42:52 +03:00
Tomasz Grabiec
5ac9e2501c tests: mutation_test: Move mutation generator to mutation_source_test.hh
So that it can be reused.

(cherry picked from commit 5d768d0681)
2016-04-09 18:42:52 +03:00
Tomasz Grabiec
34ddfb4498 tests: mutation_test: Add test case for querying of expired cells
(cherry picked from commit 30d25bc47a)
2016-04-09 18:42:52 +03:00
Tomasz Grabiec
e4d4d0b31c partition_slice_builder: Add new setters
(cherry picked from commit 58bbd4203f)
2016-04-09 18:42:52 +03:00
Tomasz Grabiec
4125f279c0 tests: result_set_assertions: Add and_only_that()
(cherry picked from commit 7cd8e61429)
2016-04-09 18:42:52 +03:00
Tomasz Grabiec
e276e7b1e3 database: Compact mutations when executing data queries
Currently data query digest includes cells and tombstones which may have
expired or be covered by higher-level tombstones. This causes digest
mismatch between replicas if some elements are compacted on one of the
nodes and not on others. This mismatch triggers read-repair which doesn't
resolve because mutations received by mutation queries are not differing,
they are compacted already.

The fix adds compacting step before writing and digesting query results by
reusing the algorithm used by mutation query. This is not the most optimal
way to fix this. The compaction step could be folded with the query writing,
there is redundancy in both steps. However such change carries more risk,
and thus was postponed.

perf_simple_query test (cassandra-stress-like partitions) shows regression
from 83k to 77k (7%) ops/s.

Fixes #1165.

(cherry picked from commit f15c380a4f)
2016-04-09 18:42:52 +03:00
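
The digest mismatch described in e276e7b1e3 can be reproduced in miniature. The cell layout (name, value, write timestamp, expiry), the `compact` and `digest` helpers, and the use of SHA-256 are all assumptions for illustration; they are not the Scylla data model or its digest algorithm.

```python
import hashlib


def compact(cells, partition_tombstone_ts, now):
    """Keep only cells that are still live at time `now`."""
    live = []
    for name, value, write_ts, expires_at in cells:
        if write_ts <= partition_tombstone_ts:
            continue  # covered by a higher-level tombstone
        if expires_at is not None and expires_at <= now:
            continue  # expired
        live.append((name, value, write_ts, expires_at))
    return live


def digest(cells):
    """Hash the cells in a stable order (by cell name)."""
    h = hashlib.sha256()
    for cell in sorted(cells, key=lambda c: c[0]):
        h.update(repr(cell).encode())
    return h.hexdigest()


# Replica A has already dropped the expired cell on disk; replica B has not.
replica_a = [("c1", "x", 10, None)]
replica_b = [("c1", "x", 10, None), ("c2", "y", 11, 100)]
now = 200

raw_digests_match = digest(replica_a) == digest(replica_b)
compacted_digests_match = (
    digest(compact(replica_a, 0, now)) == digest(compact(replica_b, 0, now))
)
```

Digesting the raw cells makes the replicas disagree even though they hold the same live data; compacting before digesting removes the spurious difference.
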
Tomasz Grabiec
a516b24111 mutation_query: Extract main part of mutation_query() into more generic querying_reader
So that it can be reused in query()

(cherry picked from commit e4e8acc946)
2016-04-09 18:42:52 +03:00
Gleb Natapov
4642c706c1 commitlog, sstables: enlarge XFS extent allocation for large files
With big rows I see contention in XFS allocations which cause reactor
thread to sleep. Commitlog is a main offender, so enlarge extent to
commitlog segment size for big files (commitlog and sstable Data files).

Message-Id: <20160404110952.GP20957@scylladb.com>
(cherry picked from commit 70575699e4)
2016-04-07 09:52:15 +03:00
Nadav Har'El
4666c095bc sstables: overhaul range tombstone reading
Until recently, we believed that range tombstones we read from sstables will
always be for entire rows (or more generalized clustering-key prefixes),
not for arbitrary ranges. But as we found out, because Cassandra insists
that range tombstones do not overlap, it may take two overlapping row
tombstones and convert them into three range tombstones which look like
general ranges (see the patch for a more detailed example).

Not only do we need to accept such "split" range tombstones, we also need
to convert them back to our internal representation which, in the above
example, involves two overlapping tombstones. This is what this patch does.

This patch also contains a test for this case: We created in Cassandra
an sstable with two overlapping deletions, and verify that when we read
it to Scylla, we get these two overlapping deletions - despite the
sstable file actually having contained three non-overlapping tombstones.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <b7c07466074bf0db6457323af8622bb5210bb86a.1459399004.git.glauber@scylladb.com>
(cherry picked from commit 99ecda3c96)
2016-03-31 12:58:07 +03:00
Nadav Har'El
507e6ec75a sstables: merge range tombstones if possible
This is a rewrite of Glauber's earlier patch to do the same thing, taking
into account Avi's comments (do not use a class, do not throw from the
constructor, etc.). I also verified that the actual use case which was
broken in #1136 was fixed by this patch.

Currently, we have no support for range tombstones because CQL will not
generate them as of version 2.x. Thrift will, but we can safely leave this for
the future.

However, we have seen cases during a real migration in which a pure-CQL
Cassandra would generate range tombstones in its SSTables.

Although we are not sure how and why, those range tombstones were of a special
kind: their end and next's start range were adjacent, which means that in
reality, they could very well have been written as a single range tombstone for
an entire clustering key - which we support just fine.

This code will attempt to fix this problem temporarily by merging such ranges
if possible. Care must be taken so that we don't end up accepting a true
generic range tombstone by accident.

Fixes #1136

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459333972-20345-1-git-send-email-nyh@scylladb.com>
(cherry picked from commit 0fc9a5ee4d)
2016-03-31 12:57:56 +03:00
Glauber Costa
29d6952ddd sstables: fix exception printouts in check_marker
As Nadav noticed in his bug report, check_marker is creating its error messages
using characters instead of numbers - which is what we intended here in the
first place.

That happens because sprint(), when faced with an 8-bit type, interprets this
as a character.  To avoid that we'll use uint16_t types, taking care not to
sign-extend them.

The bug also noted that one of the error messages is missing a parameter, and
that is also fixed.

Fixes #1122

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <74f825bbff8488ffeb1911e626db51eed88629b1.1459266115.git.glauber@scylladb.com>
(cherry picked from commit 23808ba184)
2016-03-31 12:56:02 +03:00
Pekka Enberg
9fae641099 release: prepare for 1.0.0 2016-03-30 12:19:12 +03:00
Pekka Enberg
ccd1fe4348 Revert "sanity check Seastar's I/O queue configuration"
This reverts commit 7b88ba8882, it's too
late for it.
2016-03-29 16:44:55 +03:00
Glauber Costa
7b88ba8882 sanity check Seastar's I/O queue configuration
While Seastar in general can accept any parameter for its I/O queues, Scylla
in particular shouldn't run with them disabled. Such will be the status when
the max-io-requests parameter is not enabled.

On top of that, we would like to have enough depth per I/O queue not to allow
for shard-local parallelism. Therefore, we will require a minimum per-queue
capacity of 4. In machines where the disk iodepth is not enough to allow for 4
concurrent requests per shard, one should reduce the number of I/O queues.

For --max-io-requests, we will check the parameter itself. However, the
--num-io-queues parameter is not mandatory, and given enough concurrent
requests, Seastar's default configuration can very well just be doing the right
thing. So for that, we will check the final result of each I/O queue.

As it is the case with other checks of the sorts, this can be overridden by
the --developer-mode switch.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <63bf7e91ac10c95810351815bb8f5e94d75592a5.1458836000.git.glauber@scylladb.com>
(cherry picked from commit e750a94300)
2016-03-29 16:37:16 +03:00
Pekka Enberg
46825a5e07 release: prepare for 1.0.rc3 2016-03-29 16:22:31 +03:00
Benoît Canet
740d98901f collectd: Write to the network to get rid of spurious log messages
Closes #1018

Suggested-by: Avi Kivity <avi@scylladb.com>
Signed-off-by: Benoît Canet <benoit@scylladb.com>
Message-Id: <1458759378-4935-1-git-send-email-benoit@scylladb.com>
(cherry picked from commit 4ac1126677)
2016-03-29 11:47:19 +03:00
Tomasz Grabiec
ceff8b9b41 schema_tables: Wait for notifications to be processed.
Listeners may defer since:

 93015bcc54 "migration_manager: Make the migration callbacks runs inside seastar thread"

Not all places were adjusted to wait for them. Fix that.

Message-Id: <1458837613-27616-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 53bbcf4a1e)
2016-03-29 11:18:32 +03:00
Gleb Natapov
1b2dbcc26e config: enable truncate_request_timeout_in_ms option
Option truncate_request_timeout_in_ms is used by truncate. Mark it as
used.

Message-Id: <20160323162649.GH2282@scylladb.com>
(cherry picked from commit 0afd1c6f0a)
2016-03-29 11:16:53 +03:00
Raphael Carvalho
75b2db7862 sstables: fix deletion of sstable with temporary TOC
After 4e52b41a4, remove_by_toc_name() became aware of temporary TOC
files, however, it doesn't consider that some components may be
missing if temporary TOC is present.
When creating a new sstable, the first thing we do is to write all
components into temporary TOC, so content of a temporary TOC isn't
reliable until it is renamed.

Solution is about implementing the following flow (described by Avi):
"Flow should be:

  - remove all components in parallel
  - forgive ENOENT, since the component may not have been written;
otherwise deletion error should be raised
  - fsync the directory
  - delete the temporary TOC
"

This problem can be reproduced by running compaction without disk
space, so compaction would fail and leave a partial sstable that would
be marked for deletion. Afterwards, remove_by_toc_name() would try to
delete a component that doesn't exist because it looked at the content
of temporary TOC.

Fixes #1095.

Signed-off-by: Raphael Carvalho <raphaelsc@scylladb.com>
Message-Id: <0cfcaacb43cc5bad3a8a7ea6c1fa6f325c5de97d.1459194263.git.raphaelsc@scylladb.com>
(cherry picked from commit d515a7fd85)
2016-03-29 10:56:49 +03:00
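
The deletion flow quoted in 75b2db7862 can be sketched as below. This is only an illustration under stated assumptions: the function name and its parameters are hypothetical, a thread pool stands in for Seastar's parallelism, and fsync on a directory descriptor is assumed to be supported (as it is on Linux).

```python
import os
from concurrent.futures import ThreadPoolExecutor


def remove_with_temporary_toc(directory, component_names, temporary_toc_name):
    """Delete a partially written sstable described by a temporary TOC."""

    def remove_component(name):
        try:
            os.remove(os.path.join(directory, name))
        except FileNotFoundError:
            # ENOENT is forgiven: the component may never have been written.
            pass

    # Remove all components in parallel; list() forces evaluation so that
    # any error other than ENOENT is raised here.
    with ThreadPoolExecutor() as pool:
        list(pool.map(remove_component, component_names))

    # fsync the directory so the removals are durable before the TOC goes away.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

    # Only now delete the temporary TOC itself.
    os.remove(os.path.join(directory, temporary_toc_name))
```

Deleting the temporary TOC last means that a crash partway through still leaves a marker on disk identifying the sstable as incomplete.
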
Tomasz Grabiec
789c1297dd storage_service: Fix typos
Message-Id: <1458837390-26634-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit d1db23e353)
2016-03-29 10:29:28 +03:00
Pekka Enberg
afeaaab034 Update scylla-ami submodule
* dist/ami/files/scylla-ami 89e7436...7019088 (1):
  > Re-enable clocksource=tsc on AMI
2016-03-29 09:59:34 +03:00
Takuya ASADA
80242ff443 dist: re-enable clocksource=tsc on AMI
The clocksource=tsc boot parameter was mistakenly dropped in b3c85aea89 and needs to be re-enabled.

[ penberg: Manual backport of commit 050fb911d5 to 1.0. ]
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1459180643-4389-1-git-send-email-syuu@scylladb.com>
2016-03-29 09:56:10 +03:00
Nadav Har'El
0b456578c0 sstable: fix read failure of certain sstables
We had a problem reading certain existing Cassandra sstables into
Scylla.

Our consume_range_tombstone() function assumes that the start and end
columns have a certain "end of component" markers, and want to verify
that assumption. But because of bugs in older versions of Cassandra,
see https://issues.apache.org/jira/browse/CASSANDRA-7593, sometimes the
"end of component" was missing (set to 0). CASSANDRA-7593 suggested
this problem might exist on the start column, so we allowed for that,
but now we discovered a case where also the end column is set to 0 -
causing the test in consume_range_tombstone() to fail and the sstable
read to fail - causing Scylla to not be able to import that sstable from
Cassandra. Allowing for a 0 also on the end column made it possible
to read that sstable, compact it, and so on.

Fixes #1125.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459173964-23242-1-git-send-email-nyh@scylladb.com>
(cherry picked from commit a05577ca41)
2016-03-28 17:10:10 +03:00
Pekka Enberg
3b5a55c6fc release: prepare for 1.0.rc2 2016-03-27 10:19:53 +03:00
Raphael Carvalho
4f1d37c3c9 Fix corner-case in refresh
Problem found by dtest which loads sstables with generation 1 and 2 into an
empty column family. The root of the problem is that reshuffle procedure
changes new sstables to start from generation 2 at least. So reshuffle could
try to set generation 1 to 2 when generation 2 exists.
This problem can be fixed by starting from generation 1 instead, so reshuffle
would handle this case properly.

Fixes #1099.

Signed-off-by: Raphael Carvalho <raphaelsc@scylladb.com>
Message-Id: <88c51fbda9557a506ad99395aeb0a91cd550ede4.1458917237.git.raphaelsc@scylladb.com>
(cherry picked from commit e6e5999282)
2016-03-27 10:04:28 +03:00
Avi Kivity
8422a42381 dist: ami: fix AMI_OPT receiving no value
We assign AMI=0 and AMI_OPT=1, so in the true case, AMI_OPT has no value,
and a later compare fails.

(cherry picked from commit 077c0d1022)
2016-03-26 21:17:49 +03:00
Takuya ASADA
c0f31fac48 dist/ami: use tilde for release candidate builds
Sync with ubuntu package versioning rule

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458882718-29317-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 2582dbe4a0)
2016-03-26 16:51:24 +02:00
Calle Wilund
6fe88a663f database: Use disk-marking delete function in discard_sstables
Fixes #797

To make sure an inopportune crash after truncate does not leave
sstables on disk to be considered live, and thus resurrect data,
after a truncate, use delete function that renames the TOC file to
make sure we've marked sstables as dead on disk when we finish
this discard call.
Message-Id: <1458575440-505-2-git-send-email-calle@scylladb.com>

Rebase to 1.0:
Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-24 09:16:24 -04:00
Calle Wilund
5f76f3d445 sstables: Add delete func to rename TOC ensuring table is marked dead
Note: "normal" remove_by_toc_name must now be prepared for and check
if the TOC of the sstable is already moved to temp file when we
get to the juicy delete parts.
Message-Id: <1458575440-505-1-git-send-email-calle@scylladb.com>

For the rebase to 1.0:

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-24 09:05:03 -04:00
Asias He
6676d126aa streaming: Complete receive task after the flush
A STREAM_MUTATION_DONE message will signal the receiver that the sender
has completed the sending of stream mutations. When the receiver finds
it has zero tasks to send and zero tasks to receive, it will finish the
stream_session, and in turn finish the stream_plan if all the
stream_sessions are finished. We should call receive_task_completed only
after the flush finishes so that when stream_plan is finished all the
data is on disk.

Fixes repair_disjoint_data_test issue with Glauber's "[PATCH v4 0/9] Make
sure repairs do not cripple incoming load" series

======================================================================
FAIL: repair_disjoint_data_test
(repair_additional_test.RepairAdditionalTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "scylla-dtest/repair_additional_test.py",
line 102, in repair_disjoint_data_test
    self.check_rows_on_node(node1, 3000)
  File "scylla-dtest/repair_additional_test.py",
line 33, in check_rows_on_node
    self.assertEqual(len(result), rows, len(result))
AssertionError: 2461

(cherry picked from commit c2eff7e824)
2016-03-24 10:26:00 +02:00
Glauber Costa
38343ccbfe repair: rework repair code so we can limit parallelism
The repair code as it is right now is a bit convoluted: it resorts to detached
continuations + do_for_each when calling sync_ranges, and deals with the
problem of excessive parallelism by employing a semaphore inside that range.

Even with that, we still generate a great number of checksum requests
because the ranges themselves are processed in parallel.

It would be better to have a single semaphore to limit the overall parallelism
for all requests.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit f49e965d78)
2016-03-24 10:26:00 +02:00
Glauber Costa
f1272933fd database: keep streaming memtables in their own region group
Theoretically, because we can have a lot of pending streaming memtables, we can
have the database start throttling and incoming connections slowing down during
streaming.

Turns out this is actually a very easy condition to trigger. That is basically
because the other side of the wire in this case is quite efficient in sending
us work. This situation is alleviated a bit by reducing parallelism, but not
only does it not go away completely, once we have the tools to start increasing
parallelism again it will become commonplace.

The solution for this is to limit the streaming memtables to a fraction of the
total allowed dirty memory. Using the nesting capability built into the LSA
regions, we will make the streaming region group a child of the main region
group.  With that, we can throttle streaming requests separately, while at the
same time being able to control the total amount of dirty memory as well.

Because of this property, it can still be the case that incoming requests will
throttle earlier due to streaming - unless we allow for more dirty memory to be
used during repairs - but at least that effect will be limited.
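
As a rough sketch of the nesting described above (the class name, the
limits and the method names are hypothetical and are not the actual LSA
region group API), a child group can charge its usage to every ancestor
and throttle when any group on the path is over its limit:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of nested dirty-memory accounting.
// It is not Scylla's region_group; it only shows the parent/child idea.
class region_group_sketch {
    region_group_sketch* _parent;
    std::size_t _limit;
    std::size_t _used = 0;
public:
    explicit region_group_sketch(std::size_t limit,
                                 region_group_sketch* parent = nullptr)
        : _parent(parent), _limit(limit) {}

    // Charge memory to this group and to every ancestor.
    void add(std::size_t bytes) {
        for (region_group_sketch* g = this; g; g = g->_parent) {
            g->_used += bytes;
        }
    }

    // Release memory from this group and from every ancestor.
    void release(std::size_t bytes) {
        for (region_group_sketch* g = this; g; g = g->_parent) {
            assert(g->_used >= bytes);
            g->_used -= bytes;
        }
    }

    // A group throttles when it, or any ancestor, exceeds its limit.
    bool should_throttle() const {
        for (const region_group_sketch* g = this; g; g = g->_parent) {
            if (g->_used > g->_limit) {
                return true;
            }
        }
        return false;
    }

    std::size_t used() const { return _used; }
};
```

With a streaming group limited to a fraction of the main group, streaming
writes throttle on their own limit first, while their usage still counts
against the total, which is why regular requests can throttle earlier.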

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 34a9fc106f)
2016-03-24 10:26:00 +02:00
Glauber Costa
ccd623aa87 streaming memtables: coalesce incoming writes
The repair process will potentially send ranges containing few mutations,
definitely not enough to fill a memtable. It wants to know whether or not each
of those ranges individually succeeded or failed, so we need a future for each.

Small memtables being flushed are bad, and we would like to write bigger
memtables so we can better utilize our disks.

One of the ways to fix that is changing the repair itself to send more
mutations in a single batch. But relying on that is a bad idea for two reasons:

First, the goals of the SSTable writer and the repair sender are at odds. The
SSTable writer wants to write as few SSTables as possible, while the repair
sender wants to break down the range in pieces as small as it can and checksum
them individually, so it doesn't have to send a lot of mutations for no reason.

Second, even if the repair process wants to process larger ranges at once, some
ranges themselves may be small. So while most ranges would be large, we would
still have potentially some fairly small SSTables lying around.

The best course of action in this case is to coalesce the incoming streams
on the write side.  repair can now choose whatever strategy - small or big
ranges - it wants, knowing that the incoming memtables will be coalesced together.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 455d5a57d2)
2016-03-24 10:26:00 +02:00
Glauber Costa
8176fa8379 streaming: add incoming streaming mutations to a different sstable
Keeping the mutations coming from the streaming process as mutations like any
other has a number of advantages - and that's why we do it.

However, this makes it impossible for Seastar's I/O scheduler to differentiate
between incoming requests from clients, and those that are arriving from peers
in the streaming process.

As a result, if the streaming mutations consume a significant fraction of the
total mutations, and we happen to be using the disk at its limits, we are in no
position to provide any guarantees - defeating the whole purpose of the
scheduler.

To implement that, we'll keep a separate set of memtables that will contain
only streaming mutations. We don't have to do it this way, but doing so
makes life a lot easier. In particular, to write an SSTable, our API requires
(because the filter requires) that a good estimate of the number of partitions
is provided in advance. The partitions also need to be sorted.

We could write mutations directly to disk, but the above conditions couldn't be
met without significant effort. In particular, because mutations can be
arriving from multiple peer nodes, we can't really sort them without keeping a
staging area anyway.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 5fa866223d)
2016-03-24 10:26:00 +02:00
Glauber Costa
d03910f46d priority manager: separate streaming reads from writes
Streaming currently has one class, which can be used to contain the read
operations being generated by the streaming process. Those reads come from two
places:

- checksums (if doing repair)
- reading mutations to be sent over the wire.

Depending on the amount of data we're dealing with, that can generate a
significant chunk of data, with seconds worth of backlog, and if we need to
have the incoming writes intertwined with those reads, those can take a long
time.

Even if a node is only acting as a receiver, it may still read a lot:
during repairs, those reads come from the checksums.

However, in more complicated failure scenarios, it is not hard to imagine a
node that will be both sending and receiving a lot of data.

The best way to guarantee progress on both fronts, is to put both kinds of
operations into different classes.

This patch introduces a new write class, and renames the old read class so it
can have a more meaningful name.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 10c8ca6ace)
2016-03-24 10:26:00 +02:00
Glauber Costa
0c75700d8c database: make seal_on_overflow a method of the memtable_list
Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 78189de57f)
2016-03-24 10:26:00 +02:00
Glauber Costa
478975b3fa database: move add_memtable as a method of the memtable_list
The column family still has to teach the memtable list how to allocate a new memtable,
since it uses CF parameters to do so.

After that, the memtable_list's constructor takes a seal and a create function and is complete.
The copy constructor can now go, since there are no users left.
The behavior of keeping a reference to the underlying memtables can also go, since we can now
guarantee that nobody is keeping references to it (it is not even a shared pointer anymore).
Individual memtables are, and users may be keeping references to them individually.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 635bb942b2)
2016-03-24 10:26:00 +02:00
Glauber Costa
5ce76258c8 database: move active_memtable to memtable_list
Each list can have a different active memtable. The column family method keeps
existing, since the two separate sets of memtable are just an implementation
detail to deal with the problem of streaming QoS: *the* active memtable keeps
being the one from the main list.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 6ba95d450f)
2016-03-24 10:26:00 +02:00
Glauber Costa
4cf8791d56 database: create a class for memtable_list
memtable_list is currently just an alias for a vector of memtables.  Let's move
them to a class on its own, exporting the relevant methods to keep user code
unchanged as much as possible.

This will help us keep separate lists of memtables.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit af6c7a5192)
2016-03-24 10:26:00 +02:00
Pekka Enberg
ccd51010f1 Merge seastar upstream
* seastar 9f2b868...aa281bd (7):
  > shared_promise: Add move assignment operator
  > lowres_clock: Fix stretched time
  > scripts: Delete tap with ip instead of tunctl
  > vla: Actually be exception-safe
  > vla: Ensure memory is freed if ctor throws
  > vla: Ensure memory is correctly freed
  > net: Improve error message when parsing invalid ipv4 address
2016-03-24 10:25:42 +02:00
Shlomi Livne
8e78cbfc2d fix a collision between --ami command line param and env
sysconfig scylla-server includes an AMI variable, and the script also used
an AMI variable. Fix this by renaming the script variable.

6a18634f9f introduced this issue since it
started importing the sysconfig scylla-server

Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <0bc472bb885db2f43702907e3e40d871f1385972.1458767984.git.shlomi@scylladb.com>
(cherry picked from commit d3a91e737b)
2016-03-24 08:18:45 +02:00
Shlomi Livne
c6c176b1be scylla_io_setup import scylla-server env args
scylla_io_setup requires the scylla-server env to be set up to run
correctly. Previously scylla_io_setup was encapsulated in
scylla-io.service, which assured this.

Extract CPUSET and SMP from SCYLLA_ARGS, as CPUSET is needed for invoking
io_tune.

Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <d49af9cb54ae327c38e451ff76fe0322e64a5f00.1458747527.git.shlomi@scylladb.com>
(cherry picked from commit 6a18634f9f)
2016-03-23 17:55:33 +02:00
Shlomi Livne
9795edbe04 dist/ami: Use the actual number of disks instead of AWS meta service
We have seen in some cases that when using the boto api to start
instances the aws metadata service
http://169.254.169.254/latest/meta-data/block-device-mapping/ returns
an incorrect number of disks - work around that by checking the actual
number of disks using lsblk

Adding a validation at the end verifying that after all computations the
NR_IO_QUEUES will not be greater than the number of shards (we had an
issue with i2.8x)

Fixes: #1062

Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <54c51cd94dd30577a3fe23aef3ce916c01e05504.1458721659.git.shlomi@scylladb.com>
(cherry picked from commit 4ecc37111f)
2016-03-23 11:22:25 +02:00
Shlomi Livne
1539c8b136 fix centos local ami creation (revert some changes)
in centos we do not have a version file created - revert the changes
introduced when adding ubuntu ami creation

Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <69c80dcfa7afe4f5db66dde2893d9253a86ac430.1458578004.git.shlomi@scylladb.com>
(cherry picked from commit b7e338275b)
2016-03-23 11:22:25 +02:00
Takuya ASADA
0396a94eaf dist: allow more requests for i2 instances
i2 instances have better performance than others, so allow more requests.
Fixes #921

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458251067-1533-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 769204d41e)
2016-03-23 11:22:25 +02:00
Raphael Carvalho
3c40c1be71 service: fix refresh
Vlad and I were working on finding the root of the problems with
refresh. We found that refresh was deleting existing sstable files
because of a bug in a function that was supposed to return the maximum
generation of a column family.
The intention of this function is to get generation from last element
of column_family::_sstables, which is of type std::map.
However, we were incorrectly using std::map::end() to get last element,
so garbage was being read instead of maximum generation.
If the garbage value is lower than the minimum generation of a column
family, then reshuffle_sstables() would set generation of all existing
sstables to a lower value. That would confuse our mechanism used to
delete sstables because sstables loaded at boot stage were touched.
The solution is to use rbegin() instead of end() to
get the last element from column_family::_sstables.

The other problem is that refresh will only load generations that are
larger than or equal to X, so new sstables with lower generation will
not be loaded. The solution is to create a set with the generations of
live SSTables from all shards, and to use this set to determine whether
a generation is new or not.

The last change was about providing an unused generation to reshuffle
procedure by adding one to the maximum generation. That's important to
prevent reshuffle from touching an existing SSTable.
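
A minimal sketch of the rbegin() fix and the "maximum plus one" rule
described above. The map type and the function names are hypothetical
stand-ins for column_family::_sstables and its helpers, and returning 0
for an empty map is an assumption made for this illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <string>

// Hypothetical stand-in for column_family::_sstables: an ordered map
// keyed by generation. The mapped type is irrelevant to the sketch.
using sstable_map_sketch = std::map<std::int64_t, std::string>;

// Largest generation in the map, or 0 when the map is empty.
// end() is a past-the-end iterator and must never be dereferenced;
// rbegin() refers to the last element, but only in a non-empty map.
inline std::int64_t max_generation_sketch(const sstable_map_sketch& sstables) {
    if (sstables.empty()) {
        return 0;
    }
    return sstables.rbegin()->first;
}

// First generation that cannot collide with an existing sstable.
inline std::int64_t next_unused_generation_sketch(const sstable_map_sketch& sstables) {
    const std::int64_t max_gen = max_generation_sketch(sstables);
    assert(max_gen < std::numeric_limits<std::int64_t>::max());
    return max_gen + 1;
}
```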

Tested 'refresh' under the following scenarios:
1) Existing generations: 1, 2, 3, 4. New ones: 5, 6.
2) Existing generations: 3, 4, 5, 6. New ones: 1, 2.
3) Existing generations: 1, 2, 3, 4. New ones: 7, 8.
4) No existing generation. No new generation.
5) No existing generation. New ones: 1, 2.
I also had to adapt existing testcase for reshuffle procedure.

Fixes #1073.

Signed-off-by: Raphael Carvalho <raphaelsc@scylladb.com>
Message-Id: <1c7b8b7f94163d5cd00d90247598dd7d26442e70.1458694985.git.raphaelsc@scylladb.com>
(cherry picked from commit 370b1336fe)
2016-03-23 11:22:25 +02:00
Benoît Canet
de969a5d6f dist/ubuntu: Fix the init script variable sourcing
The variable sourcing was crashing the init script on ubuntu.
Fix it with the suggestion from Avi.

Signed-off-by: Benoît Canet <benoit@scylladb.com>
Message-Id: <1458685099-1160-1-git-send-email-benoit@scylladb.com>
(cherry picked from commit 1594bdd5bb)
2016-03-23 11:22:25 +02:00
Takuya ASADA
0ade2894f7 dist: stop using '-p' option on lsblk since Ubuntu doesn't support it
In scylla_setup interactive mode we use lsblk to list candidate
block devices for RAID, and the -p option prints full device paths.

Since the Ubuntu 14.04LTS version of lsblk doesn't support this option, we
need to use non-full path names and complete the paths before passing them to
scylla_raid_setup.

Fixes #1030

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458325411-9870-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 6edd909b00)
2016-03-23 09:16:04 +02:00
Takuya ASADA
6b36315040 dist: allow to run 'sudo scylla_ami_setup' for Ubuntu AMI
Allows to run scylla_ami_setup from scylla-server.conf

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
(cherry picked from commit a6cd085c38)
2016-03-23 09:14:49 +02:00
Takuya ASADA
edc5f8f2f7 dist: launch scylla_ami_setup on Ubuntu AMI
Since upstart does not have same behavior as systemd, we need to run scylla_io_setup and scylla_ami_setup in scylla-server.conf's pre-start stanza.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
(cherry picked from commit 7828023599)
2016-03-23 09:14:49 +02:00
Takuya ASADA
066149ad46 dist: fix broken scylla_install_pkg --local-pkg and --unstable on Ubuntu
--local-pkg and --unstable arguments were not handled on Ubuntu; support them.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
(cherry picked from commit 93bf7bff8e)
2016-03-23 09:14:49 +02:00
Takuya ASADA
1f07468195 dist: prevent to show up dialog on apt-get in scylla_raid_setup
"apt-get -y install mdadm" shows up a dialog to select install mode of postfix, this will block scylla-ami-setup.service forever since it is running as background task, we need to prevent it.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
(cherry picked from commit 0c83b34d0c)
2016-03-23 09:14:49 +02:00
Takuya ASADA
0577ae5a61 dist: Ubuntu based AMI support
This introduces Ubuntu AMI.
Both the CentOS AMI and the Ubuntu AMI need to be built on the same distribution, so the build_ami.sh script automatically detects the current distribution and selects the base AMI image.

Fixes #998

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
(cherry picked from commit b097ed6d75)
2016-03-23 09:14:49 +02:00
Pekka Enberg
054cf13cd0 Update scylla-ami submodule
* dist/ami/files/scylla-ami 84bcd0d...89e7436 (3):
  > Merge "iotune packaging fix for scylla-ami" from Takuya
  > Ubuntu AMI support on scylla_install_ami
  > scylla_ami_setup is not POSIX sh compatible, change shebang to /bin/bash
2016-03-23 09:07:07 +02:00
Takuya ASADA
71446edc97 dist: on scylla_io_setup, SMP and CPUSET should be empty when the parameter not present
Fixes #1060

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458659928-2050-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit dac2bc3055)
2016-03-23 09:06:00 +02:00
Takuya ASADA
c1d8a62b5b dist: remove scylla-io-setup.service and make it standalone script
(cherry picked from commit 9889712d43)
2016-03-23 09:06:00 +02:00
Takuya ASADA
a3baef6b45 dist: on scylla_io_setup print out message both for stdout and syslog
(cherry picked from commit 2cedab07f2)
2016-03-23 09:06:00 +02:00
Takuya ASADA
feaba177e2 dist: introduce dev-mode.conf and scylla_dev_mode_setup
(cherry picked from commit 83112551bb)
2016-03-23 09:06:00 +02:00
Tomasz Grabiec
83a289bdcd cql3: batch_statement: Execute statements sequentially
Currently we execute all statements in parallel, but some statements
depend on order, in particular list append/prepend. Fix by executing
sequentially.

Fixes cql_additional_tests.py:TestCQL.batch_and_list_test dtest.

Fixes #1075.

Message-Id: <1458672874-4749-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 5f44afa311)
2016-03-22 21:06:21 +02:00
Tomasz Grabiec
382e7e63b3 Fix assertion in row_cache_alloc_stress
Fixes the following assertion failure:

  row_cache_alloc_stress: tests/row_cache_alloc_stress.cc:120: main(int, char**)::<lambda()>::<lambda()>: Assertion `mt->occupancy().used_space() < memory::stats().free_memory()' failed.

memory::stats()::free_memory() may be much lower than the actual
amount of reclaimable memory in the system since LSA zones will try to
keep a lot of free segments to themselves. Fix by using actual amount
of reclaimable memory in the check.

(cherry picked from commit a4e3adfbec)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
deeed904f4 logalloc: Introduce tracker::occupancy()
Returns occupancy information for all memory allocated by LSA, including
segment pools / zones.

(cherry picked from commit a0cba3c86f)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
d927053b3b logalloc: Rename tracker::occupancy() to region_occupancy()
(cherry picked from commit 529c8b8858)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
8b8923b5af managed_bytes: Make operator[] work for large blobs as well
Fixes assertion in mutation_test:

mutation_test: ./utils/managed_bytes.hh:349: blob_storage::char_type* managed_bytes::data(): Assertion `!_u.ptr->next'

Introduced in ea7c2dd085

Message-Id: <1458648786-9127-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit ca08db504b)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
48ec129595 perf_simple_query: Make duration configurable
(cherry picked from commit 6e73c3f3dc)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
a4757a6737 mutation_test: Add allocation failure stress test for apply()
The test injects allocation failures at every allocation site during
apply(). Only allocations through allocation_strategy are instrumented,
but currently those should include all allocations in the apply() path.

The target and source mutations are randomized.

(cherry picked from commit 2fbb55929d)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
223b73849d mutation_test: Add more apply() tests
(cherry picked from commit 8ede27f9c6)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
ba4b1eac45 mutation_test: Hoist make_blob() to a function
(cherry picked from commit 36575d9f01)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
9cf5fabfdf mutation_test: Make make_blob() return different blob each time
random_bytes was constructed with the same seed each time.

(cherry picked from commit 4c85d06df7)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
5723c664ad mutation_test: Fix use-after-free
The problem was that verify_row() was returning a future which was not
waited on. Fix by running the code in a thread.

(cherry picked from commit 19b3df9f0f)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
9635a83edd mutation_partition: Fix friend declarations
Missing "class" confuses CLion IDE.

(cherry picked from commit a7966e9b71)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
24c68e48a5 mutation_partition: Make apply() atomic even in case of exception
We cannot leave partially applied mutation behind when the write
fails. It may fail if memory allocation fails in the middle of
apply(). This would, for example, violate write atomicity: readers
should either see the whole write or none at all.

This fix makes apply() revert partially applied data upon failure, by
the means of ReversiblyMergeable concept. In a nutshell the idea is
to store old state in the source mutation as we apply it and swap back
in case of exception. At cell level this swapping is inexpensive, just
rewiring pointers. For this to work, the source mutation needs to be
brought into mutable form, so frozen mutations need to be unfrozen. In
practice this doesn't increase amount of cell allocations in the
memtable apply path because incoming data will usually be newer and we
will have to copy it into LSA anyway. There are extra allocations
though for the data structures which hold cells.
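
The swap-and-revert idea can be sketched as follows. This is a simplified
model under stated assumptions, not the actual implementation: the cell
type, the timestamp-wins merge rule, the revert_flag field (standing in
for the REVERT flag) and the fail_at parameter used to simulate an
allocation failure are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cell: the newer timestamp wins a merge.
struct cell_sketch {
    long timestamp;
    std::string value;
    bool revert_flag;  // set on a source cell that did not win the merge
};

// Merge src into dst, leaving in src what is needed to undo the merge.
inline void apply_reversibly(cell_sketch& dst, cell_sketch& src) {
    if (src.timestamp > dst.timestamp) {
        std::swap(dst, src);       // src now holds the old dst state
    } else {
        src.revert_flag = true;    // nothing moved; remember that
    }
}

// Undo a previous apply_reversibly() on the same pair.
inline void revert(cell_sketch& dst, cell_sketch& src) {
    if (src.revert_flag) {
        src.revert_flag = false;
    } else {
        std::swap(dst, src);
    }
}

// Apply src to dst cell by cell. fail_at simulates an allocation failure
// just before the cell with that index is merged.
inline void apply_row(std::vector<cell_sketch>& dst,
                      std::vector<cell_sketch>& src,
                      std::size_t fail_at) {
    assert(dst.size() == src.size());
    std::size_t done = 0;
    try {
        for (; done < dst.size(); ++done) {
            if (done == fail_at) {
                throw std::bad_alloc();
            }
            apply_reversibly(dst[done], src[done]);
        }
    } catch (...) {
        // Roll back the cells merged so far, newest first, then rethrow.
        while (done > 0) {
            --done;
            revert(dst[done], src[done]);
        }
        throw;
    }
}
```

After a failed apply_row() both rows are back in their original state, so
a reader sees either the whole write or none of it.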

I didn't see significant change in performance of:

  build/release/tests/perf/perf_simple_query -c1 -m1G --write --duration 13

The score fluctuates around ~77k ops/s.

Fixes #283.

(cherry picked from commit dc290f0af7)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
80cb0a28e1 mutation_partition: Make intrusive sets ReversiblyMergeable
(cherry picked from commit e09d186c7c)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
95a9f66b75 mutation_partition: Make row_tombstones_entry ReversiblyMergeable
(cherry picked from commit f1a4feb1fc)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
58448d4b05 mutation_partition: Make rows_entry ReversiblyMergeable
(cherry picked from commit e4a576a90f)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
0a4d0e95f2 mutation_partition: Make row_marker ReversiblyMergeable
(cherry picked from commit aadcd75d89)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
2c73e1c2e8 mutation_partition: Make row ReversiblyMergeable
(cherry picked from commit ea7c2dd085)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
0ebd1ae62a atomic_cell_or_collection: Introduce as_atomic_cell_ref()
Needed for setting the REVERT flag on existing cell.

(cherry picked from commit c9d4f5a49c)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
14f616de3f atomic_cell_hash: Specialize appending_hash<> for atomic_cell and collection_mutation
(cherry picked from commit 1ffe06165d)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
827c0f68c3 atomic_cell: Add REVERT flag
Needed to make atomic cells ReversiblyMergeable.

(cherry picked from commit bfc6413414)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
e3607a4c16 tombstone: Make ReversiblyMergeable
(cherry picked from commit 7fcfa97916)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
59270c6d00 Introduce the concept of ReversiblyMergeable
(cherry picked from commit 1407173186)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
3be5d3a7c9 mutation_partition: row: Add empty()
(cherry picked from commit 9fc7f8a5ed)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
cd6697b506 mutation_partition: row: Allow storing empty cells internally
Currently only "set" storage could store empty cells, but not the
"vector" one because there empty cell has the meaning of being
missing. To implement rollback, we need to be able to distinguish empty
cells from missing ones. Solve by making vector storage use a bitmap
for presence checking instead of emptiness. This adds 4 bytes to
vector storage.
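
A minimal sketch of bitmap-based presence checking. It assumes a 32-bit
bitmap (consistent with the 4 bytes mentioned above) and uses std::string
as a stand-in for a cell; the class and its methods are hypothetical and
do not mirror the real row storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical vector storage: presence lives in a bitmap, so a cell with
// an empty value is distinguishable from a cell that is missing.
class vector_storage_sketch {
    std::uint32_t _present = 0;
    std::vector<std::string> _cells;
public:
    static constexpr std::size_t max_columns = 32;

    // Store a value (possibly empty) and mark the column as present.
    void set(std::size_t id, std::string value) {
        assert(id < max_columns);
        if (id >= _cells.size()) {
            _cells.resize(id + 1);
        }
        _cells[id] = std::move(value);
        _present |= std::uint32_t(1) << id;
    }

    // Mark the column as missing and drop its value.
    void erase(std::size_t id) {
        assert(id < max_columns);
        _present &= ~(std::uint32_t(1) << id);
        if (id < _cells.size()) {
            _cells[id].clear();
        }
    }

    // Presence is decided by the bitmap, never by emptiness of the value.
    bool has(std::size_t id) const {
        return id < max_columns && ((_present >> id) & 1u) != 0;
    }

    const std::string& get(std::size_t id) const {
        assert(has(id));
        return _cells[id];
    }
};
```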

(cherry picked from commit d5e66a5b0d)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
acc9849e2b mutation_partition: Make row::merge() tolerate empty row
The row may be empty and still have a set storage, in which case
rbegin() dereference is undefined behavior.

(cherry picked from commit ed1e6515db)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
a445f6a7be managed_bytes: Mark move-assignment noexcept
(cherry picked from commit 184e2831e7)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
88ed9c53a6 managed_bytes: Make copy assignment exception-safe
(cherry picked from commit 92d4cfc3ab)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
50f98ff90a managed_bytes: Make linearization_context::forget() noexcept
It is needed for noexcept destruction, which we need for exception
safety in higher layers.

According to [1], erase() only throws if key comparison throws, and in
our case it doesn't.

[1] http://en.cppreference.com/w/cpp/container/unordered_map/erase

(cherry picked from commit 22d193ba9f)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
30ffb2917f mutation: Add copy assignment operator
We already have a copy constructor, so can have copy assignment as
well.

(cherry picked from commit 87d7279267)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
6ef8b45bf4 mutation_partition: Add cell_entry constructor which makes an empty cell
(cherry picked from commit 8134992024)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
144829606a mutation_partition: Make row::vector_to_set() exception-safe
Currently allocation failure can leave the old row in a
half-moved-from state and leak cell_entry objects.

(cherry picked from commit 518e956736)
2016-03-22 19:59:16 +02:00
Tomasz Grabiec
2eb54bb068 mutation_partition: Unmark cell_entry's copy constructor as noexcept
It was a mistake, it certainly may throw because it copies cells.

(cherry picked from commit c91eefa183)
2016-03-22 19:59:16 +02:00
Pekka Enberg
a133e48515 Merge seastar upstream
* seastar 6a207e1...9f2b868 (10):
  > memory: set free memory to non-zero value in debug mode
  > Merge "Increase IOTune's robustness by including a timeout" from Glauber
  > shared_future: add companion class, shared_promise
  > rpc: fix client connection stopping
  > semaphore: allow wait() and signal() after broken()
  > run reactor::stop() only once
  > sharded: fix start with reference parameter
  > core: add asserts to rwlock
  > util/defer: Fix cancel() not being respected
  > tcp: Do not return accept until the connection is connected
2016-03-22 15:49:51 +02:00
Asias He
5db0049d99 gossip: Sync gossip_digest.idl.hh and application_state.hh
We did the clean up in idl/gossip_digest.idl.hh, but the patch to clean
up gms/application_state.hh was never merged.

To maintain compatibility with previous versions of scylla, we cannot
change application_state.hh; instead, change the idl to be in sync with
application_state.hh.

Message-Id: <3a78b159d5cb60bc65b354d323d163ce8528b36d.1458557948.git.asias@scylladb.com>
(cherry picked from commit 39992dd559)
2016-03-22 15:22:12 +02:00
Takuya ASADA
ac80445bd9 dist: enable collectd on scylla_setup by default, to make scyllatop usable
Fixes #1037

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458324769-9152-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 6b2a8a2f70)
2016-03-22 15:16:54 +02:00
Asias He
0c3ffba5c8 messaging_service: Take reference of ms in send_message_timeout_and_retry
Take a reference of messaging_service object inside
send_message_timeout_and_retry to make sure it is not freed during the
life time of send_message_timeout_and_retry operation.

(cherry picked from commit b8abd88841)
2016-03-22 13:20:47 +02:00
Gleb Natapov
7ca3d22c7d messaging: do not admit new requests during messaging service shutdown.
Sending a message may open new client connection which will never be
closed in case messaging service is shutting down already.

Fixes #1059

Message-Id: <1458639452-29388-3-git-send-email-gleb@scylladb.com>
(cherry picked from commit 1e6352e398)
2016-03-22 13:18:12 +02:00
Gleb Natapov
9b1d2dad89 messaging: do not delete client during messaging service shutdown
Messaging service stop() method calls stop() on all clients. If
remove_rpc_client_one() is called while those stops are running,
client::stop() will be called twice, which is not supposed to happen. Fix it
by ignoring client remove requests during messaging service shutdown.

Fixes #1059

Message-Id: <1458639452-29388-2-git-send-email-gleb@scylladb.com>
(cherry picked from commit 357c91a076)
2016-03-22 13:18:05 +02:00
Pekka Enberg
7e6a7a6cb5 release: prepare for 1.0.rc1 2016-03-22 12:19:03 +02:00
Pekka Enberg
ec7f637384 dist/ubuntu: Use tilde for release candidate builds
The version number ordering rules are different for rpm and deb. Use
tilde ('~') for the latter to ensure a release candidate is ordered
_before_ a final version.
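
To illustrate the deb ordering rule, here is a simplified comparison
function modeled on the algorithm dpkg documents for a single version
component (no epoch or revision handling). The function names are
hypothetical and this is only a sketch, not the packaging tool's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline bool is_digit_sketch(char c) { return c >= '0' && c <= '9'; }

inline bool is_alpha_sketch(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Character at index i, or NUL past the end of the string.
inline char char_at_sketch(const std::string& s, std::size_t i) {
    return i < s.size() ? s[i] : '\0';
}

// Sort weight of a character: '~' sorts before everything, including the
// end of the string; letters sort before other punctuation.
inline int order_sketch(char c) {
    if (c == '\0' || is_digit_sketch(c)) return 0;
    if (is_alpha_sketch(c)) return c;
    if (c == '~') return -1;
    return static_cast<unsigned char>(c) + 256;
}

// Returns a negative, zero or positive value, like strcmp().
inline int deb_compare_sketch(const std::string& a, const std::string& b) {
    std::size_t i = 0, j = 0;
    while (i < a.size() || j < b.size()) {
        // Compare the non-digit parts character by character.
        while ((i < a.size() && !is_digit_sketch(a[i])) ||
               (j < b.size() && !is_digit_sketch(b[j]))) {
            const int ac = order_sketch(char_at_sketch(a, i));
            const int bc = order_sketch(char_at_sketch(b, j));
            if (ac != bc) return ac - bc;
            ++i;
            ++j;
        }
        // Compare the numeric parts: skip leading zeros, then the longer
        // run of digits wins, otherwise the first differing digit decides.
        while (char_at_sketch(a, i) == '0') ++i;
        while (char_at_sketch(b, j) == '0') ++j;
        int first_diff = 0;
        while (is_digit_sketch(char_at_sketch(a, i)) &&
               is_digit_sketch(char_at_sketch(b, j))) {
            if (first_diff == 0) first_diff = a[i] - b[j];
            ++i;
            ++j;
        }
        if (is_digit_sketch(char_at_sketch(a, i))) return 1;
        if (is_digit_sketch(char_at_sketch(b, j))) return -1;
        if (first_diff != 0) return first_diff;
    }
    return 0;
}
```

Under this ordering "1.0~rc1" compares lower than "1.0", while "1.0.rc1"
compares higher, which is why the tilde is needed for the deb package.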

Message-Id: <1458627524-23030-1-git-send-email-penberg@scylladb.com>
(cherry picked from commit ae33e9fe76)
2016-03-22 12:18:52 +02:00
Nadav Har'El
eecfb2e4ef sstable: fix use-after-free of temporary ioclass copy
Commit 6a3872b355 fixed some use-after-free
bugs but introduced a new one because of a typo:

Instead of capturing a reference to the long-living io-class object, as
all the code does, one place in the code accidentally captured a *copy*
of this object. This copy had a very temporary life, and when a reference
to that *copy* was passed to sstable reading code which assumed that it
lives at least as long as the read call, a use-after-free resulted.
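
The difference between the two captures can be sketched like this. The
type and function names are hypothetical; the sketch only shows that a
by-value capture hands out the address of a private copy whose lifetime
ends with the closure, and it does not reproduce the actual sstable code.

```cpp
#include <cassert>

// Hypothetical stand-in for the long-lived io-class object.
struct io_class_sketch { int id; };

// Models reading code that keeps the address it was given and therefore
// needs the referent to live at least as long as the read.
inline const io_class_sketch* keep_address(const io_class_sketch& pc) {
    return &pc;
}

// By-reference capture: the closure hands out the original object.
inline bool ref_capture_sees_original(const io_class_sketch& long_lived) {
    auto by_ref = [&long_lived] { return keep_address(long_lived); };
    return by_ref() == &long_lived;
}

// By-value capture: the closure hands out its own copy, which is
// destroyed together with the closure.
inline bool copy_capture_sees_original(const io_class_sketch& long_lived) {
    auto by_copy = [long_lived] { return keep_address(long_lived); };
    return by_copy() == &long_lived;
}

inline void check_capture_semantics() {
    io_class_sketch pc{42};
    assert(ref_capture_sees_original(pc));
    assert(!copy_capture_sees_original(pc));
}
```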

Fixes #1072

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1458595629-9314-1-git-send-email-nyh@scylladb.com>
(cherry picked from commit 2eb0627665)
2016-03-22 08:08:49 +02:00
Pekka Enberg
1f6476351a build: Invoke Seastar build only once
Make sure we invoke the Seastar ninja build only once from our own build
process so that we don't have multiple ninjas racing with each other.

Refs #1061.

Message-Id: <1458563076-29502-1-git-send-email-penberg@scylladb.com>
(cherry picked from commit 4892a6ded9)
2016-03-22 08:08:02 +02:00
Pekka Enberg
0d95dd310a Revert "build: prepare for 1.0 release series"
This reverts commit 80d2b72068. It breaks
the RPM build which does not allow the "-" character to appear in
version numbers.
2016-03-22 08:03:22 +02:00
Avi Kivity
80d2b72068 build: prepare for 1.0 release series 2016-03-21 18:44:05 +02:00
Asias He
ac95f04ff9 gossip: Handle unknown application_state when printing
In case an unknown application_state is received, we should be able to
handle it when printing.

Message-Id: <98d2307359292e90c8925f38f67a74b69e45bebe.1458553057.git.asias@scylladb.com>
(cherry picked from commit 7acc9816d2)
2016-03-21 11:59:35 +02:00
Pekka Enberg
08a8a4a1b4 main: Defer API server hooks until commitlog replay
Defer registering services to the API server until commitlog has been
replayed to ensure that nobody is able to trigger sstable operations via
'nodetool' before we are ready for them.
Message-Id: <1458116227-4671-1-git-send-email-penberg@scylladb.com>

(cherry picked from commit 972fc6e014)
2016-03-18 09:20:49 +02:00
Pekka Enberg
b7e9924299 main: Fix broadcast_address and listen_address validation errors
Fix the validation error message to look like this:

  Scylla version 666.development-20160316.49af399 starting ...
  WARN  2016-03-17 12:24:15,137 [shard 0] config - Option partitioner is not (yet) used.
  WARN  2016-03-17 12:24:15,138 [shard 0] init - NOFILE rlimit too low (recommended setting 200000, minimum setting 10000; you may run out of file descriptors.
  ERROR 2016-03-17 12:24:15,138 [shard 0] init - Bad configuration: invalid 'listen_address': eth0: boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> > (Invalid argument)
  Exiting on unhandled exception of type 'bad_configuration_error': std::exception

Instead of:

  Exiting on unhandled exception of type 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >': Invalid argument

Fixes #1051.

Message-Id: <1458210329-4488-1-git-send-email-penberg@scylladb.com>
(cherry picked from commit 69dacf9063)
2016-03-18 09:00:23 +02:00
Takuya ASADA
19ed269cc7 dist: follow sysconfig setting when counting number of cpus on scylla_io_setup
When NR_CPU >= 8, scylla_sysconfig_setup disables cpu0 on AMI.
However, scylla_io_setup is not aware of that and tries to assign NR_CPU queues, so scylla fails to start because queues > cpus.
With this fix, scylla_io_setup checks the sysconfig settings: if '--smp <n>' is specified in SCYLLA_ARGS, it uses n to limit the number of queues.
Also, when the instance type has no pre-configured parameters, we need to pass the --cpuset parameters to iotune. Otherwise iotune will run on a different set of CPUs, which may have different performance characteristics.

Fixes #996, #1043, #1046

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458221762-10595-2-git-send-email-syuu@scylladb.com>
(cherry picked from commit 4cc589872d)
2016-03-18 08:58:00 +02:00
Takuya ASADA
a223450a56 dist: On scylla_sysconfig_setup, don't disable cpu0 on non-AMI environments
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458221762-10595-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 6f71173827)
2016-03-18 08:57:56 +02:00
Paweł Dziepak
8f4800b30e lsa: update _closed_occupancy after freeing all segments
_closed_occupancy will be used when a region is removed from its region
group, make sure that it is accurate.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
(cherry picked from commit 338fd34770)
2016-03-18 08:11:31 +02:00
Pekka Enberg
7d13d115c6 dist: Fix '--developer-mode' parsing in scylla_io_setup
We need to support the following variations:

   --developer-mode true
   --developer-mode 1
   --developer-mode=true
   --developer-mode=1
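The four accepted spellings can be matched as in the sketch below. Note that scylla_io_setup itself is a shell script; this C++ version only illustrates the matching logic, and the names `is_truthy` and `developer_mode_enabled` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Values treated as "enabled" in the variations listed above.
inline bool is_truthy(const std::string& v) {
    return v == "true" || v == "1";
}

// Returns true if the argument list enables developer mode, accepting both
// "--developer-mode VALUE" and "--developer-mode=VALUE".
inline bool developer_mode_enabled(const std::vector<std::string>& args) {
    const std::string flag = "--developer-mode";
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& a = args[i];
        if (a == flag) {
            // Separate-token form: the value is the next argument.
            if (i + 1 < args.size() && is_truthy(args[i + 1])) {
                return true;
            }
        } else if (a.compare(0, flag.size() + 1, flag + "=") == 0) {
            // "=" form: the value follows the equals sign.
            if (is_truthy(a.substr(flag.size() + 1))) {
                return true;
            }
        }
    }
    return false;
}

inline void example_usage() {
    assert(developer_mode_enabled({"--developer-mode", "true"}));
    assert(developer_mode_enabled({"--developer-mode=1"}));
}
```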

Fixes #1026.
Message-Id: <1458203393-26658-1-git-send-email-penberg@scylladb.com>

(cherry picked from commit 0434bc3d33)
2016-03-17 11:00:14 +02:00
Glauber Costa
c9c52235a1 stream_session: print debug message for STREAM_MUTATION
For this verb(), we don't call get_session - and it doesn't look like we will.
We currently have no debug message for this one, which makes it harder to debug
the stream of messages. Print it.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit a3ebf640c6)
2016-03-17 08:18:54 +02:00
Glauber Costa
52eeab089c stream_session: remove duplicated debug message
Whenever we call get_session, that will print a debug message about the arrival
of this new verb. Because we also print that explicitly in PREPARE_DONE, that
message gets duplicated.

That confuses poor developers who are, for a while, left wondering why
the sender is sending the message twice.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 0ab4275893)
2016-03-17 08:18:49 +02:00
Glauber Costa
49af399a2e sstables: do not assume mutation_reader will be kept alive
Our sstables::mutation_reader has a specialization in which start and end
ranges are passed as futures. That is needed because we may have to read the
index file for those.

This works well under the assumption that every time a mutation_reader is
created it will be used, since whoever is using it will surely keep the state
of the reader alive.

However, that assumption has not been true for a while. We use a reader
interface for reading everything from mutations and sstables to cache entries,
and when we create an sstable mutation_reader, that does not mean we'll use it.
In fact we won't, if the read can be serviced first by a higher level entity.

If that happens to be the case, the reader will be destructed. However, since
it may take more time than that for the start and end futures to resolve, by
the time they are resolved the state of the mutation reader will no longer be
valid.

The proposed fix for that is to only resolve the future inside
mutation_reader's read() function. If that function is called, we can have a
reasonable expectation that the caller object is being kept alive.
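The idea of deferring the resolution to read() can be sketched as follows. This is a synchronous simplification under stated assumptions: the real code deals with Seastar futures, and `lazy_range_reader` and its callback are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical reader whose (start, end) range comes from a possibly
// expensive callback, standing in for the index-file lookup.
class lazy_range_reader {
    std::function<std::pair<long, long>()> _resolve;
    std::optional<std::pair<long, long>> _range;
public:
    explicit lazy_range_reader(std::function<std::pair<long, long>()> resolve)
        : _resolve(std::move(resolve)) {}

    // The range is resolved on the first read(), never at construction, so a
    // reader that is destroyed unused never triggers the lookup.
    long read() {
        if (!_range) {
            _range = _resolve();
        }
        return _range->second - _range->first;
    }
};

inline void example_usage() {
    int lookups = 0;
    {
        lazy_range_reader unused([&lookups] { ++lookups; return std::make_pair(0L, 10L); });
    }
    assert(lookups == 0);  // destroyed without reading: no lookup happened
}
```

Because nothing is started at construction time, no pending work can outlive a reader that was never used.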

A second way to fix this would be to force the mutation reader to be kept alive
by transforming it into a shared pointer and acquiring a reference to itself.
However, because the reader may turn out not to be used, the delayed read
actually has the advantage of not even reading anything from the disk if there
is no need for it.

Also, because sstables can be compacted, we can't guarantee that the sst object
itself, used in the resolution of start and end, is still alive, which is the
same problem. If we delay those calls, we also solve this similar
problem. We assume here that the outer reader is keeping the SSTable object
alive.

I must note that I have not reproduced this problem. What goes above is the
result of the analysis we have made in #1036. That being the case, a thorough
review is appreciated.

Fixes #1036

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <a7e4e722f76774d0b1f263d86c973061fb7fe2f2.1458135770.git.glauber@scylladb.com>
(cherry picked from commit 6a3872b355)
2016-03-16 19:41:06 +02:00
Nadav Har'El
d915370e3f Allow uncompression at end of file
Asking to read from byte 100 when a file has 50 bytes is an obvious error.
But what if we ask to read from byte 50? What if we ask to read 0 bytes at
byte 50? :-)

Before this patch, code which asked to read from the EOF position would
get an exception. After this patch, it would simply read nothing, without
error. This allows, for example, reading 0 bytes from position 0 on a file
with 0 bytes, which apparently happened in issue #1039...

A read which starts at a position higher than the EOF position still
generates an exception.
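The boundary rule described above can be written down as a small check. This is only a sketch of the rule, not the patched code; `readable_bytes` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Returns how many bytes a read of `len` bytes at `pos` can deliver from a
// file of `file_size` bytes. Starting exactly at EOF is allowed and yields 0;
// starting beyond EOF is an error.
inline std::size_t readable_bytes(std::size_t file_size, std::size_t pos, std::size_t len) {
    if (pos > file_size) {
        throw std::out_of_range("read starts past end of file");
    }
    return std::min(len, file_size - pos);
}

inline void example_usage() {
    assert(readable_bytes(50, 50, 10) == 0);  // read at EOF: nothing, no error
    assert(readable_bytes(0, 0, 0) == 0);     // empty file, empty read
}
```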

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1458137867-10998-1-git-send-email-nyh@scylladb.com>
(cherry picked from commit 02ba8ffbe8)
2016-03-16 19:40:59 +02:00
Nadav Har'El
a6d5e67923 Fix out-of-range exception when uncompressing 0 bytes
The uncompression code reads the compressed chunks containing the bytes
pos through pos + len - 1. This, however, is not correct when len==0,
and pos + len - 1 may even be -1, causing an out-of-range exception when
calling locate() to find the chunks containing this byte position.

So we need to treat len==0 specially, and in this case we don't read
anything, and don't need to locate() the chunks to read.
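The special case can be illustrated with the sketch below. It assumes fixed-size uncompressed chunks and uses a hypothetical helper, `chunks_to_read`; the real code calls locate() on the compression metadata instead.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>

// Returns the indices of the first and last chunk covering the uncompressed
// bytes pos .. pos + len - 1, or nullopt when there is nothing to read.
inline std::optional<std::pair<std::uint64_t, std::uint64_t>>
chunks_to_read(std::uint64_t pos, std::uint64_t len, std::uint64_t chunk_size) {
    if (len == 0) {
        // pos + len - 1 is not a valid "last byte" here, so skip the lookup.
        return std::nullopt;
    }
    return std::make_pair(pos / chunk_size, (pos + len - 1) / chunk_size);
}

inline void example_usage() {
    assert(!chunks_to_read(0, 0, 4096).has_value());  // nothing to locate
    auto r = chunks_to_read(4000, 200, 4096);         // spans a chunk boundary
    assert(r && r->first == 0 && r->second == 1);
}
```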

Refs #1039.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1458135987-10200-1-git-send-email-nyh@scylladb.com>
(cherry picked from commit 73297c7872)
2016-03-16 15:55:12 +02:00
Takuya ASADA
f885750f90 dist: do not auto-start scylla-server job on Ubuntu package install time
Fixes #1017

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458122424-22889-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit f1d18e9980)
2016-03-16 13:55:30 +02:00
Pekka Enberg
36f55e409d tests/gossip_test: Fix messaging service stop
This fixes gossip test shutdown similar to what commit 13ce48e ("tests:
Fix stop of storage_service in cql_test_env") did for CQL tests:

  gossip_test: /home/penberg/scylla/seastar/core/sharded.hh:439: Service& seastar::sharded<Service>::local() [with Service = net::messaging_service]: Assertion `local_is_initialized()' failed.
  Running 1 test case...

  [snip]

  unknown location(0): fatal error in "test_boot_shutdown": signal: SIGABRT (application abort requested)
  seastar/tests/test-utils.cc(32): last checkpoint
Message-Id: <1458126520-20025-1-git-send-email-penberg@scylladb.com>

(cherry picked from commit 2f519b9b34)
2016-03-16 13:15:39 +02:00
Asias He
c436fb5892 streaming: Handle cf is deleted after the deletion check
The cf can be deleted after the cf deletion check. Handle this case as
well.

Use "warn" level to log if cf is missing. Although we can handle the
case, it is good to distinguish whether the receiver of streaming
applied all the stream mutations or not. We believe that the cf is
missing because it was dropped, but it could be missing because of a bug
or something we didn't anticipate here.

Related patch: "streaming: Handle cf is deleted when sending
STREAM_MUTATION_DONE"

Fixes simple_add_new_node_while_schema_changes_test failure.
Message-Id: <c4497e0500f50e0a3422efb37e73130765c88c57.1458090598.git.asias@scylladb.com>

(cherry picked from commit 2d50c71ca3)
2016-03-16 11:47:03 +02:00
Asias He
950bcd3e38 tests: Fix stop of storage_service in cql_test_env
In stop() of storage_service, it unregisters the verb handler. In the
test, we stop messaging_service before storage_service. Fix it by
deferring stop of messaging_service.
Message-Id: <c71f7b5b46e475efe2fac4c1588460406f890176.1458086329.git.asias@scylladb.com>

(cherry picked from commit 13ce48e775)
2016-03-16 11:36:36 +02:00
5842 changed files with 101694 additions and 553461 deletions

@@ -1,4 +0,0 @@
.git
build
seastar/build
testlog

1
.gitattributes vendored

@@ -1,3 +1,2 @@
*.cc diff=cpp
*.hh diff=cpp
*.svg binary

101
.github/CODEOWNERS vendored

@@ -1,101 +0,0 @@
# AUTH
auth/* @elcallio @vladzcloudius
# CACHE
row_cache* @tgrabiec
*mutation* @tgrabiec
test/boost/mvcc* @tgrabiec
# CDC
cdc/* @kbr- @elcallio @piodul @jul-stas
test/cql/cdc_* @kbr- @elcallio @piodul @jul-stas
test/boost/cdc_* @kbr- @elcallio @piodul @jul-stas
# COMMITLOG / BATCHLOG
db/commitlog/* @elcallio
db/batch* @elcallio
# COORDINATOR
service/storage_proxy* @gleb-cloudius
# COMPACTION
compaction/* @raphaelsc @nyh
# CQL TRANSPORT LAYER
transport/*
# CQL QUERY LANGUAGE
cql3/* @tgrabiec @psarna @cvybhu
# COUNTERS
counters* @jul-stas
tests/counter_test* @jul-stas
# DOCS
docs/* @annastuchlik @tzach
docs/alternator @annastuchlik @tzach @nyh @psarna
# GOSSIP
gms/* @tgrabiec @asias
# DOCKER
dist/docker/*
# LSA
utils/logalloc* @tgrabiec
# MATERIALIZED VIEWS
db/view/* @nyh @psarna
cql3/statements/*view* @nyh @psarna
test/boost/view_* @nyh @psarna
# PACKAGING
dist/* @syuu1228
# REPAIR
repair/* @tgrabiec @asias @nyh
# SCHEMA MANAGEMENT
db/schema_tables* @tgrabiec @nyh
db/legacy_schema_migrator* @tgrabiec @nyh
service/migration* @tgrabiec @nyh
schema* @tgrabiec @nyh
# SECONDARY INDEXES
db/index/* @nyh @psarna
cql3/statements/*index* @nyh @psarna
test/boost/*index* @nyh @psarna
# SSTABLES
sstables/* @tgrabiec @raphaelsc @nyh
# STREAMING
streaming/* @tgrabiec @asias
service/storage_service.* @tgrabiec @asias
# ALTERNATOR
alternator/* @nyh @psarna
test/alternator/* @nyh @psarna
# HINTED HANDOFF
db/hints/* @piodul @vladzcloudius
# REDIS
redis/* @nyh @syuu1228
test/redis/* @nyh @syuu1228
# READERS
reader_* @denesb
querier* @denesb
test/boost/mutation_reader_test.cc @denesb
test/boost/querier_cache_test.cc @denesb
# PYTEST-BASED CQL TESTS
test/cql-pytest/* @nyh
# RAFT
raft/* @kbr- @gleb-cloudius @kostja
test/raft/* @kbr- @gleb-cloudius @kostja
# HEAT-WEIGHTED LOAD BALANCING
db/heat_load_balance.* @nyh @gleb-cloudius

@@ -1,15 +0,0 @@
This is Scylla's bug tracker, to be used for reporting bugs only.
If you have a question about Scylla, and not a bug, please ask it in
our mailing-list at scylladb-dev@googlegroups.com or in our slack channel.
- [] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.
*Installation details*
Scylla version (or git commit hash):
Cluster size:
OS (RHEL/CentOS/Ubuntu/AWS AMI):
*Hardware details (for performance issues)* Delete if unneeded
Platform (physical/VM/cloud instance type/docker):
Hardware: sockets= cores= hyperthreading= memory=
Disks: (SSD/HDD, count)

@@ -1,35 +0,0 @@
name: "Docs / Publish"
# For more information,
# see https://sphinx-theme.scylladb.com/stable/deployment/production.html#available-workflows
on:
  push:
    branches:
      - master
    paths:
      - "docs/**"
  workflow_dispatch:
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          persist-credentials: false
          fetch-depth: 0
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: 3.7
      - name: Set up env
        run: make -C docs setupenv
      - name: Build docs
        run: make -C docs multiversion
      - name: Build redirects
        run: make -C docs redirects
      - name: Deploy docs to GitHub Pages
        run: ./docs/_utils/deploy.sh
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

@@ -1,28 +0,0 @@
name: "Docs / Build PR"
# For more information,
# see https://sphinx-theme.scylladb.com/stable/deployment/production.html#available-workflows
on:
  pull_request:
    branches:
      - master
    paths:
      - "docs/**"
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          persist-credentials: false
          fetch-depth: 0
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: 3.7
      - name: Set up env
        run: make -C docs setupenv
      - name: Build docs
        run: make -C docs test

24
.gitignore vendored

@@ -9,27 +9,3 @@ dist/ami/files/*.rpm
dist/ami/variables.json
dist/ami/scylla_deploy.sh
*.pyc
Cql.tokens
.kdev4
*.kdev4
CMakeLists.txt.user
.cache
.tox
*.egg-info
__pycache__CMakeLists.txt.user
.gdbinit
resources
.pytest_cache
/expressions.tokens
tags
!db/tags/
testlog
test/*/*.reject
.vscode
docs/_build
docs/poetry.lock
compile_commands.json
.ccls-cache/
.mypy_cache
.envrc
rust/Cargo.lock

20
.gitmodules vendored

@@ -1,23 +1,11 @@
[submodule "seastar"]
path = seastar
url = ../seastar
url = ../scylla-seastar
ignore = dirty
[submodule "swagger-ui"]
path = swagger-ui
url = ../scylla-swagger-ui
ignore = dirty
[submodule "libdeflate"]
path = libdeflate
url = ../libdeflate
[submodule "abseil"]
path = abseil
url = ../abseil-cpp
[submodule "scylla-jmx"]
path = tools/jmx
url = ../scylla-jmx
[submodule "scylla-tools"]
path = tools/java
url = ../scylla-tools-java
[submodule "scylla-python3"]
path = tools/python3
url = ../scylla-python3
[submodule "dist/ami/files/scylla-ami"]
path = dist/ami/files/scylla-ami
url = ../scylla-ami

@@ -1,3 +0,0 @@
Avi Kivity <avi@scylladb.com> Avi Kivity' via ScyllaDB development <scylladb-dev@googlegroups.com>
Raphael S. Carvalho <raphaelsc@scylladb.com> Raphael S. Carvalho' via ScyllaDB development <scylladb-dev@googlegroups.com>
Pavel Emelyanov <xemul@scylladb.com> Pavel Emelyanov' via ScyllaDB development <scylladb-dev@googlegroups.com>

@@ -1,804 +0,0 @@
cmake_minimum_required(VERSION 3.18)
project(scylla)
if(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
message(STATUS "Setting build type to 'Release' as none was specified.")
set(CMAKE_BUILD_TYPE "Release" CACHE
STRING "Choose the type of build." FORCE)
# Set the possible values of build type for cmake-gui
set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS
"Debug" "Release" "Dev" "Sanitize")
endif()
if(CMAKE_BUILD_TYPE)
string(TOLOWER "${CMAKE_BUILD_TYPE}" BUILD_TYPE)
else()
set(BUILD_TYPE "release")
endif()
function(default_target_arch arch)
set(x86_instruction_sets i386 i686 x86_64)
if(CMAKE_SYSTEM_PROCESSOR IN_LIST x86_instruction_sets)
set(${arch} "westmere" PARENT_SCOPE)
elseif(CMAKE_SYSTEM_PROCESSOR EQUAL "aarch64")
set(${arch} "armv8-a+crc+crypto" PARENT_SCOPE)
else()
set(${arch} "" PARENT_SCOPE)
endif()
endfunction()
default_target_arch(target_arch)
if(target_arch)
set(target_arch_flag "-march=${target_arch}")
endif()
set(cxx_coro_flag)
if (CMAKE_CXX_COMPILER_ID MATCHES GNU)
set(cxx_coro_flag -fcoroutines)
endif()
# Configure Seastar compile options to align with Scylla
set(Seastar_CXX_FLAGS ${cxx_coro_flag} ${target_arch_flag} CACHE INTERNAL "" FORCE)
set(Seastar_CXX_DIALECT gnu++20 CACHE INTERNAL "" FORCE)
add_subdirectory(seastar)
add_subdirectory(abseil)
# Exclude absl::strerror from the default "all" target since it's not
# used in Scylla build and, moreover, makes use of deprecated glibc APIs,
# such as sys_nerr, which are not exposed from "stdio.h" since glibc 2.32,
# which happens to be the case for recent Fedora distribution versions.
#
# Need to use the internal "absl_strerror" target name instead of namespaced
# variant because `set_target_properties` does not understand the latter form,
# unfortunately.
set_target_properties(absl_strerror PROPERTIES EXCLUDE_FROM_ALL TRUE)
# System libraries dependencies
find_package(Boost COMPONENTS filesystem program_options system thread regex REQUIRED)
find_package(Lua REQUIRED)
find_package(ZLIB REQUIRED)
find_package(ICU COMPONENTS uc REQUIRED)
set(scylla_build_dir "${CMAKE_BINARY_DIR}/build/${BUILD_TYPE}")
set(scylla_gen_build_dir "${scylla_build_dir}/gen")
file(MAKE_DIRECTORY "${scylla_build_dir}" "${scylla_gen_build_dir}")
# Place libraries, executables and archives in ${buildroot}/build/${mode}/
foreach(mode RUNTIME LIBRARY ARCHIVE)
set(CMAKE_${mode}_OUTPUT_DIRECTORY "${scylla_build_dir}")
endforeach()
# Generate C++ source files from thrift definitions
function(scylla_generate_thrift)
set(one_value_args TARGET VAR IN_FILE OUT_DIR SERVICE)
cmake_parse_arguments(args "" "${one_value_args}" "" ${ARGN})
get_filename_component(in_file_name ${args_IN_FILE} NAME_WE)
set(aux_out_file_name ${args_OUT_DIR}/${in_file_name})
set(outputs
${aux_out_file_name}_types.cpp
${aux_out_file_name}_types.h
${aux_out_file_name}_constants.cpp
${aux_out_file_name}_constants.h
${args_OUT_DIR}/${args_SERVICE}.cpp
${args_OUT_DIR}/${args_SERVICE}.h)
add_custom_command(
DEPENDS
${args_IN_FILE}
thrift
OUTPUT ${outputs}
COMMAND ${CMAKE_COMMAND} -E make_directory ${args_OUT_DIR}
COMMAND thrift -gen cpp:cob_style,no_skeleton -out "${args_OUT_DIR}" "${args_IN_FILE}")
add_custom_target(${args_TARGET}
DEPENDS ${outputs})
set(${args_VAR} ${outputs} PARENT_SCOPE)
endfunction()
scylla_generate_thrift(
TARGET scylla_thrift_gen_cassandra
VAR scylla_thrift_gen_cassandra_files
IN_FILE "${CMAKE_SOURCE_DIR}/interface/cassandra.thrift"
OUT_DIR ${scylla_gen_build_dir}
SERVICE Cassandra)
# Parse antlr3 grammar files and generate C++ sources
function(scylla_generate_antlr3)
set(one_value_args TARGET VAR IN_FILE OUT_DIR)
cmake_parse_arguments(args "" "${one_value_args}" "" ${ARGN})
get_filename_component(in_file_pure_name ${args_IN_FILE} NAME)
get_filename_component(stem ${in_file_pure_name} NAME_WE)
set(outputs
"${args_OUT_DIR}/${stem}Lexer.hpp"
"${args_OUT_DIR}/${stem}Lexer.cpp"
"${args_OUT_DIR}/${stem}Parser.hpp"
"${args_OUT_DIR}/${stem}Parser.cpp")
add_custom_command(
DEPENDS
${args_IN_FILE}
OUTPUT ${outputs}
# Remove #ifdef'ed code from the grammar source code
COMMAND sed -e "/^#if 0/,/^#endif/d" "${args_IN_FILE}" > "${args_OUT_DIR}/${in_file_pure_name}"
COMMAND antlr3 "${args_OUT_DIR}/${in_file_pure_name}"
# We replace many local `ExceptionBaseType* ex` variables with a single function-scope one.
# Because we add such a variable to every function, and because `ExceptionBaseType` is not a global
# name, we also add a global typedef to avoid compilation errors.
COMMAND sed -i -e "/^.*On :.*$/d" "${args_OUT_DIR}/${stem}Lexer.hpp"
COMMAND sed -i -e "/^.*On :.*$/d" "${args_OUT_DIR}/${stem}Lexer.cpp"
COMMAND sed -i -e "/^.*On :.*$/d" "${args_OUT_DIR}/${stem}Parser.hpp"
COMMAND sed -i
-e "s/^\\( *\\)\\(ImplTraits::CommonTokenType\\* [a-zA-Z0-9_]* = NULL;\\)$/\\1const \\2/"
-e "/^.*On :.*$/d"
-e "1i using ExceptionBaseType = int;"
-e "s/^{/{ ExceptionBaseType\\* ex = nullptr;/; s/ExceptionBaseType\\* ex = new/ex = new/; s/exceptions::syntax_exception e/exceptions::syntax_exception\\& e/"
"${args_OUT_DIR}/${stem}Parser.cpp"
VERBATIM)
add_custom_target(${args_TARGET}
DEPENDS ${outputs})
set(${args_VAR} ${outputs} PARENT_SCOPE)
endfunction()
set(antlr3_grammar_files
cql3/Cql.g
alternator/expressions.g)
set(antlr3_gen_files)
foreach(f ${antlr3_grammar_files})
get_filename_component(grammar_file_name "${f}" NAME_WE)
get_filename_component(f_dir "${f}" DIRECTORY)
scylla_generate_antlr3(
TARGET scylla_antlr3_gen_${grammar_file_name}
VAR scylla_antlr3_gen_${grammar_file_name}_files
IN_FILE "${CMAKE_SOURCE_DIR}/${f}"
OUT_DIR ${scylla_gen_build_dir}/${f_dir})
list(APPEND antlr3_gen_files "${scylla_antlr3_gen_${grammar_file_name}_files}")
endforeach()
# Generate C++ sources from ragel grammar files
seastar_generate_ragel(
TARGET scylla_ragel_gen_protocol_parser
VAR scylla_ragel_gen_protocol_parser_file
IN_FILE "${CMAKE_SOURCE_DIR}/redis/protocol_parser.rl"
OUT_FILE ${scylla_gen_build_dir}/redis/protocol_parser.hh)
# Generate C++ sources from Swagger definitions
set(swagger_files
api/api-doc/cache_service.json
api/api-doc/collectd.json
api/api-doc/column_family.json
api/api-doc/commitlog.json
api/api-doc/compaction_manager.json
api/api-doc/config.json
api/api-doc/endpoint_snitch_info.json
api/api-doc/error_injection.json
api/api-doc/failure_detector.json
api/api-doc/gossiper.json
api/api-doc/hinted_handoff.json
api/api-doc/lsa.json
api/api-doc/messaging_service.json
api/api-doc/storage_proxy.json
api/api-doc/storage_service.json
api/api-doc/stream_manager.json
api/api-doc/system.json
api/api-doc/task_manager.json
api/api-doc/task_manager_test.json
api/api-doc/utils.json)
set(swagger_gen_files)
foreach(f ${swagger_files})
get_filename_component(fname "${f}" NAME_WE)
get_filename_component(dir "${f}" DIRECTORY)
seastar_generate_swagger(
TARGET scylla_swagger_gen_${fname}
VAR scylla_swagger_gen_${fname}_files
IN_FILE "${CMAKE_SOURCE_DIR}/${f}"
OUT_DIR "${scylla_gen_build_dir}/${dir}")
list(APPEND swagger_gen_files "${scylla_swagger_gen_${fname}_files}")
endforeach()
# Create C++ bindings for IDL serializers
function(scylla_generate_idl_serializer)
set(one_value_args TARGET VAR IN_FILE OUT_FILE)
cmake_parse_arguments(args "" "${one_value_args}" "" ${ARGN})
get_filename_component(out_dir ${args_OUT_FILE} DIRECTORY)
set(idl_compiler "${CMAKE_SOURCE_DIR}/idl-compiler.py")
find_package(Python3 COMPONENTS Interpreter)
add_custom_command(
DEPENDS
${args_IN_FILE}
${idl_compiler}
OUTPUT ${args_OUT_FILE}
COMMAND ${CMAKE_COMMAND} -E make_directory ${out_dir}
COMMAND Python3::Interpreter ${idl_compiler} --ns ser -f ${args_IN_FILE} -o ${args_OUT_FILE})
add_custom_target(${args_TARGET}
DEPENDS ${args_OUT_FILE})
set(${args_VAR} ${args_OUT_FILE} PARENT_SCOPE)
endfunction()
set(idl_serializers
idl/cache_temperature.idl.hh
idl/commitlog.idl.hh
idl/consistency_level.idl.hh
idl/frozen_mutation.idl.hh
idl/frozen_schema.idl.hh
idl/gossip_digest.idl.hh
idl/hinted_handoff.idl.hh
idl/idl_test.idl.hh
idl/keys.idl.hh
idl/messaging_service.idl.hh
idl/mutation.idl.hh
idl/paging_state.idl.hh
idl/partition_checksum.idl.hh
idl/paxos.idl.hh
idl/query.idl.hh
idl/raft.idl.hh
idl/range.idl.hh
idl/read_command.idl.hh
idl/reconcilable_result.idl.hh
idl/replay_position.idl.hh
idl/result.idl.hh
idl/ring_position.idl.hh
idl/streaming.idl.hh
idl/token.idl.hh
idl/tracing.idl.hh
idl/truncation_record.idl.hh
idl/uuid.idl.hh
idl/view.idl.hh)
set(idl_gen_files)
foreach(f ${idl_serializers})
get_filename_component(idl_name "${f}" NAME)
get_filename_component(idl_target "${idl_name}" NAME_WE)
get_filename_component(idl_dir "${f}" DIRECTORY)
string(REPLACE ".idl.hh" ".dist.hh" idl_out_hdr_name "${idl_name}")
scylla_generate_idl_serializer(
TARGET scylla_idl_gen_${idl_target}
VAR scylla_idl_gen_${idl_target}_files
IN_FILE "${CMAKE_SOURCE_DIR}/${f}"
OUT_FILE ${scylla_gen_build_dir}/${idl_dir}/${idl_out_hdr_name})
list(APPEND idl_gen_files "${scylla_idl_gen_${idl_target}_files}")
endforeach()
set(scylla_sources
absl-flat_hash_map.cc
alternator/auth.cc
alternator/conditions.cc
alternator/controller.cc
alternator/executor.cc
alternator/expressions.cc
alternator/serialization.cc
alternator/server.cc
alternator/stats.cc
alternator/streams.cc
api/api.cc
api/cache_service.cc
api/collectd.cc
api/column_family.cc
api/commitlog.cc
api/compaction_manager.cc
api/config.cc
api/endpoint_snitch.cc
api/error_injection.cc
api/failure_detector.cc
api/gossiper.cc
api/hinted_handoff.cc
api/lsa.cc
api/messaging_service.cc
api/storage_proxy.cc
api/storage_service.cc
api/stream_manager.cc
api/system.cc
api/task_manager.cc
api/task_manager_test.cc
atomic_cell.cc
auth/allow_all_authenticator.cc
auth/allow_all_authorizer.cc
auth/authenticated_user.cc
auth/authentication_options.cc
auth/authenticator.cc
auth/common.cc
auth/default_authorizer.cc
auth/password_authenticator.cc
auth/passwords.cc
auth/permission.cc
auth/permissions_cache.cc
auth/resource.cc
auth/role_or_anonymous.cc
auth/roles-metadata.cc
auth/sasl_challenge.cc
auth/service.cc
auth/standard_role_manager.cc
auth/transitional.cc
bytes.cc
caching_options.cc
canonical_mutation.cc
cdc/cdc_partitioner.cc
cdc/generation.cc
cdc/log.cc
cdc/metadata.cc
cdc/split.cc
clocks-impl.cc
collection_mutation.cc
compaction/compaction.cc
compaction/compaction_manager.cc
compaction/compaction_strategy.cc
compaction/leveled_compaction_strategy.cc
compaction/size_tiered_compaction_strategy.cc
compaction/time_window_compaction_strategy.cc
compress.cc
converting_mutation_partition_applier.cc
counters.cc
cql3/abstract_marker.cc
cql3/attributes.cc
cql3/cf_name.cc
cql3/column_condition.cc
cql3/column_identifier.cc
cql3/column_specification.cc
cql3/constants.cc
cql3/cql3_type.cc
cql3/expr/expression.cc
cql3/expr/prepare_expr.cc
cql3/expr/restrictions.cc
cql3/functions/aggregate_fcts.cc
cql3/functions/castas_fcts.cc
cql3/functions/error_injection_fcts.cc
cql3/functions/functions.cc
cql3/functions/user_function.cc
cql3/index_name.cc
cql3/keyspace_element_name.cc
cql3/lists.cc
cql3/maps.cc
cql3/operation.cc
cql3/prepare_context.cc
cql3/query_options.cc
cql3/query_processor.cc
cql3/restrictions/statement_restrictions.cc
cql3/result_set.cc
cql3/role_name.cc
cql3/selection/abstract_function_selector.cc
cql3/selection/selectable.cc
cql3/selection/selection.cc
cql3/selection/selector.cc
cql3/selection/selector_factories.cc
cql3/selection/simple_selector.cc
cql3/sets.cc
cql3/statements/alter_keyspace_statement.cc
cql3/statements/alter_service_level_statement.cc
cql3/statements/alter_table_statement.cc
cql3/statements/alter_type_statement.cc
cql3/statements/alter_view_statement.cc
cql3/statements/attach_service_level_statement.cc
cql3/statements/authentication_statement.cc
cql3/statements/authorization_statement.cc
cql3/statements/batch_statement.cc
cql3/statements/cas_request.cc
cql3/statements/cf_prop_defs.cc
cql3/statements/cf_statement.cc
cql3/statements/create_aggregate_statement.cc
cql3/statements/create_function_statement.cc
cql3/statements/create_index_statement.cc
cql3/statements/create_keyspace_statement.cc
cql3/statements/create_service_level_statement.cc
cql3/statements/create_table_statement.cc
cql3/statements/create_type_statement.cc
cql3/statements/create_view_statement.cc
cql3/statements/delete_statement.cc
cql3/statements/detach_service_level_statement.cc
cql3/statements/drop_aggregate_statement.cc
cql3/statements/drop_function_statement.cc
cql3/statements/drop_index_statement.cc
cql3/statements/drop_keyspace_statement.cc
cql3/statements/drop_service_level_statement.cc
cql3/statements/drop_table_statement.cc
cql3/statements/drop_type_statement.cc
cql3/statements/drop_view_statement.cc
cql3/statements/function_statement.cc
cql3/statements/grant_statement.cc
cql3/statements/index_prop_defs.cc
cql3/statements/index_target.cc
cql3/statements/ks_prop_defs.cc
cql3/statements/list_permissions_statement.cc
cql3/statements/list_service_level_attachments_statement.cc
cql3/statements/list_service_level_statement.cc
cql3/statements/list_users_statement.cc
cql3/statements/modification_statement.cc
cql3/statements/permission_altering_statement.cc
cql3/statements/property_definitions.cc
cql3/statements/raw/parsed_statement.cc
cql3/statements/revoke_statement.cc
cql3/statements/role-management-statements.cc
cql3/statements/schema_altering_statement.cc
cql3/statements/select_statement.cc
cql3/statements/service_level_statement.cc
cql3/statements/sl_prop_defs.cc
cql3/statements/truncate_statement.cc
cql3/statements/update_statement.cc
cql3/statements/strongly_consistent_modification_statement.cc
cql3/statements/strongly_consistent_select_statement.cc
cql3/statements/use_statement.cc
cql3/type_json.cc
cql3/untyped_result_set.cc
cql3/update_parameters.cc
cql3/user_types.cc
cql3/util.cc
cql3/ut_name.cc
cql3/values.cc
data_dictionary/data_dictionary.cc
db/batchlog_manager.cc
db/commitlog/commitlog.cc
db/commitlog/commitlog_entry.cc
db/commitlog/commitlog_replayer.cc
db/config.cc
db/consistency_level.cc
db/cql_type_parser.cc
db/data_listeners.cc
db/extensions.cc
db/heat_load_balance.cc
db/hints/host_filter.cc
db/hints/manager.cc
db/hints/resource_manager.cc
db/hints/sync_point.cc
db/large_data_handler.cc
db/legacy_schema_migrator.cc
db/marshal/type_parser.cc
db/rate_limiter.cc
db/schema_tables.cc
db/size_estimates_virtual_reader.cc
db/snapshot-ctl.cc
db/sstables-format-selector.cc
db/system_distributed_keyspace.cc
db/system_keyspace.cc
db/view/row_locking.cc
db/view/view.cc
db/view/view_update_generator.cc
db/virtual_table.cc
dht/boot_strapper.cc
dht/i_partitioner.cc
dht/murmur3_partitioner.cc
dht/range_streamer.cc
dht/token.cc
replica/distributed_loader.cc
duration.cc
exceptions/exceptions.cc
readers/mutation_readers.cc
frozen_mutation.cc
frozen_schema.cc
generic_server.cc
gms/application_state.cc
gms/endpoint_state.cc
gms/failure_detector.cc
gms/feature_service.cc
gms/gossip_digest_ack2.cc
gms/gossip_digest_ack.cc
gms/gossip_digest_syn.cc
gms/gossiper.cc
gms/inet_address.cc
gms/versioned_value.cc
gms/version_generator.cc
hashers.cc
index/secondary_index.cc
index/secondary_index_manager.cc
init.cc
keys.cc
utils/lister.cc
locator/abstract_replication_strategy.cc
locator/azure_snitch.cc
locator/ec2_multi_region_snitch.cc
locator/ec2_snitch.cc
locator/everywhere_replication_strategy.cc
locator/gce_snitch.cc
locator/gossiping_property_file_snitch.cc
locator/local_strategy.cc
locator/network_topology_strategy.cc
locator/production_snitch_base.cc
locator/rack_inferring_snitch.cc
locator/simple_snitch.cc
locator/simple_strategy.cc
locator/snitch_base.cc
locator/token_metadata.cc
lang/lua.cc
main.cc
replica/memtable.cc
message/messaging_service.cc
multishard_mutation_query.cc
mutation.cc
mutation_fragment.cc
mutation_partition.cc
mutation_partition_serializer.cc
mutation_partition_view.cc
mutation_query.cc
readers/mutation_reader.cc
mutation_writer/feed_writers.cc
mutation_writer/multishard_writer.cc
mutation_writer/partition_based_splitting_writer.cc
mutation_writer/shard_based_splitting_writer.cc
mutation_writer/timestamp_based_splitting_writer.cc
partition_slice_builder.cc
partition_version.cc
querier.cc
query.cc
query_ranges_to_vnodes.cc
query-result-set.cc
raft/fsm.cc
raft/log.cc
raft/raft.cc
raft/server.cc
raft/tracker.cc
service/broadcast_tables/experimental/lang.cc
range_tombstone.cc
range_tombstone_list.cc
tombstone_gc_options.cc
tombstone_gc.cc
reader_concurrency_semaphore.cc
redis/abstract_command.cc
redis/command_factory.cc
redis/commands.cc
redis/keyspace_utils.cc
redis/lolwut.cc
redis/mutation_utils.cc
redis/options.cc
redis/query_processor.cc
redis/query_utils.cc
redis/server.cc
redis/service.cc
redis/stats.cc
release.cc
repair/repair.cc
repair/row_level.cc
replica/database.cc
replica/table.cc
row_cache.cc
schema.cc
schema_mutations.cc
schema_registry.cc
serializer.cc
service/client_state.cc
service/forward_service.cc
service/migration_manager.cc
service/misc_services.cc
service/pager/paging_state.cc
service/pager/query_pagers.cc
service/paxos/paxos_state.cc
service/paxos/prepare_response.cc
service/paxos/prepare_summary.cc
service/paxos/proposal.cc
service/priority_manager.cc
service/qos/qos_common.cc
service/qos/service_level_controller.cc
service/qos/standard_service_level_distributed_data_accessor.cc
service/raft/raft_group_registry.cc
service/raft/raft_rpc.cc
service/raft/raft_sys_table_storage.cc
service/raft/group0_state_machine.cc
service/storage_proxy.cc
service/storage_service.cc
sstables/compress.cc
sstables/integrity_checked_file_impl.cc
sstables/kl/reader.cc
sstables/metadata_collector.cc
sstables/m_format_read_helpers.cc
sstables/mx/reader.cc
sstables/mx/writer.cc
sstables/prepended_input_stream.cc
sstables/random_access_reader.cc
sstables/sstable_directory.cc
sstables/sstable_mutation_reader.cc
sstables/sstables.cc
sstables/sstable_set.cc
sstables/sstables_manager.cc
sstables/sstable_version.cc
sstables/writer.cc
streaming/consumer.cc
streaming/progress_info.cc
streaming/session_info.cc
streaming/stream_coordinator.cc
streaming/stream_manager.cc
streaming/stream_plan.cc
streaming/stream_reason.cc
streaming/stream_receive_task.cc
streaming/stream_request.cc
streaming/stream_result_future.cc
streaming/stream_session.cc
streaming/stream_session_state.cc
streaming/stream_summary.cc
streaming/stream_task.cc
streaming/stream_transfer_task.cc
table_helper.cc
tasks/task_manager.cc
thrift/controller.cc
thrift/handler.cc
thrift/server.cc
thrift/thrift_validation.cc
timeout_config.cc
tools/scylla-sstable-index.cc
tools/scylla-types.cc
tracing/traced_file.cc
tracing/trace_keyspace_helper.cc
tracing/trace_state.cc
tracing/tracing_backend_registry.cc
tracing/tracing.cc
transport/controller.cc
transport/cql_protocol_extension.cc
transport/event.cc
transport/event_notifier.cc
transport/messages/result_message.cc
transport/server.cc
types.cc
unimplemented.cc
utils/arch/powerpc/crc32-vpmsum/crc32_wrapper.cc
utils/array-search.cc
utils/ascii.cc
utils/base64.cc
utils/big_decimal.cc
utils/bloom_calculations.cc
utils/bloom_filter.cc
utils/buffer_input_stream.cc
utils/build_id.cc
utils/config_file.cc
utils/directories.cc
utils/disk-error-handler.cc
utils/dynamic_bitset.cc
utils/error_injection.cc
utils/exceptions.cc
utils/file_lock.cc
utils/generation-number.cc
utils/gz/crc_combine.cc
utils/gz/gen_crc_combine_table.cc
utils/human_readable.cc
utils/i_filter.cc
utils/large_bitset.cc
utils/like_matcher.cc
utils/limiting_data_source.cc
utils/logalloc.cc
utils/managed_bytes.cc
utils/multiprecision_int.cc
utils/murmur_hash.cc
utils/rate_limiter.cc
utils/rjson.cc
utils/runtime.cc
utils/updateable_value.cc
utils/utf8.cc
utils/uuid.cc
utils/UUID_gen.cc
validation.cc
vint-serialization.cc
zstd.cc)
set(scylla_gen_sources
"${scylla_thrift_gen_cassandra_files}"
"${scylla_ragel_gen_protocol_parser_file}"
"${swagger_gen_files}"
"${idl_gen_files}"
"${antlr3_gen_files}")
add_executable(scylla
${scylla_sources}
${scylla_gen_sources})
target_link_libraries(scylla PRIVATE
seastar
# Boost dependencies
Boost::filesystem
Boost::program_options
Boost::system
Boost::thread
Boost::regex
Boost::headers
# Abseil libs
absl::hashtablez_sampler
absl::raw_hash_set
absl::synchronization
absl::graphcycles_internal
absl::stacktrace
absl::symbolize
absl::debugging_internal
absl::demangle_internal
absl::time
absl::time_zone
absl::int128
absl::city
absl::hash
absl::malloc_internal
absl::spinlock_wait
absl::base
absl::dynamic_annotations
absl::raw_logging_internal
absl::exponential_biased
absl::throw_delegate
# System libs
ZLIB::ZLIB
ICU::uc
systemd
zstd
snappy
${LUA_LIBRARIES}
thrift
crypt)
target_link_libraries(scylla PRIVATE
-Wl,--build-id=sha1 # Force SHA1 build-id generation
# TODO: Use lld linker if it's available, otherwise gold, else bfd
-fuse-ld=lld)
# TODO: patch dynamic linker to match configure.py behavior
target_compile_options(scylla PRIVATE
-std=gnu++20
${cxx_coro_flag}
${target_arch_flag})
# Hacks needed to expose internal APIs for xxhash dependencies
target_compile_definitions(scylla PRIVATE XXH_PRIVATE_API HAVE_LZ4_COMPRESS_DEFAULT)
target_include_directories(scylla PRIVATE
"${CMAKE_CURRENT_SOURCE_DIR}"
libdeflate
abseil
"${scylla_gen_build_dir}")
###
### Create crc_combine_table helper executable.
### Use it to generate crc_combine_table.cc to be used in scylla at build time.
###
add_executable(crc_combine_table utils/gz/gen_crc_combine_table.cc)
target_link_libraries(crc_combine_table PRIVATE seastar)
target_include_directories(crc_combine_table PRIVATE "${CMAKE_CURRENT_SOURCE_DIR}")
target_compile_options(crc_combine_table PRIVATE
-std=gnu++20
${cxx_coro_flag}
${target_arch_flag})
add_dependencies(scylla crc_combine_table)
# Generate an additional source file at build time that is needed for Scylla compilation
add_custom_command(OUTPUT "${scylla_gen_build_dir}/utils/gz/crc_combine_table.cc"
COMMAND $<TARGET_FILE:crc_combine_table> > "${scylla_gen_build_dir}/utils/gz/crc_combine_table.cc"
DEPENDS crc_combine_table)
target_sources(scylla PRIVATE "${scylla_gen_build_dir}/utils/gz/crc_combine_table.cc")
###
### Generate version file and supply appropriate compile definitions for release.cc
###
execute_process(COMMAND ${CMAKE_SOURCE_DIR}/SCYLLA-VERSION-GEN --output-dir "${CMAKE_BINARY_DIR}/gen" RESULT_VARIABLE scylla_version_gen_res)
if(scylla_version_gen_res)
message(SEND_ERROR "Version file generation failed. Return code: ${scylla_version_gen_res}")
endif()
file(READ "${CMAKE_BINARY_DIR}/gen/SCYLLA-VERSION-FILE" scylla_version)
string(STRIP "${scylla_version}" scylla_version)
file(READ "${CMAKE_BINARY_DIR}/gen/SCYLLA-RELEASE-FILE" scylla_release)
string(STRIP "${scylla_release}" scylla_release)
get_property(release_cdefs SOURCE "${CMAKE_SOURCE_DIR}/release.cc" PROPERTY COMPILE_DEFINITIONS)
list(APPEND release_cdefs "SCYLLA_VERSION=\"${scylla_version}\"" "SCYLLA_RELEASE=\"${scylla_release}\"")
set_source_files_properties("${CMAKE_SOURCE_DIR}/release.cc" PROPERTIES COMPILE_DEFINITIONS "${release_cdefs}")
###
### Custom command for building libdeflate. Link the library to scylla.
###
set(libdeflate_lib "${scylla_build_dir}/libdeflate/libdeflate.a")
add_custom_command(OUTPUT "${libdeflate_lib}"
COMMAND make -C "${CMAKE_SOURCE_DIR}/libdeflate"
BUILD_DIR=../build/${BUILD_TYPE}/libdeflate/
CC=${CMAKE_C_COMPILER}
"CFLAGS=${target_arch_flag}"
../build/${BUILD_TYPE}/libdeflate//libdeflate.a) # The double slash is important!
# Hack to force generating custom command to produce libdeflate.a
add_custom_target(libdeflate DEPENDS "${libdeflate_lib}")
target_link_libraries(scylla PRIVATE "${libdeflate_lib}")
# TODO: create cmake/ directory and move utilities (generate functions etc) there
# TODO: Build tests if BUILD_TESTING=on (using CTest module)


@@ -1,22 +0,0 @@
# Contributing to Scylla
## Asking questions or requesting help
Use the [Scylla Users mailing list](https://groups.google.com/g/scylladb-users) or the [Slack workspace](http://slack.scylladb.com) for general questions and help.
Join the [Scylla Developers mailing list](https://groups.google.com/g/scylladb-dev) for deeper technical discussions and to discuss your ideas for contributions.
## Reporting an issue
Please use the [issue tracker](https://github.com/scylladb/scylla/issues/) to report issues or to suggest features. Fill in as much information as you can in the issue template, especially for performance problems.
## Contributing code to Scylla
Before you can contribute code to Scylla for the first time, you should sign the [Contributor License Agreement](https://www.scylladb.com/open-source/contributor-agreement/) and send the signed form to cla@scylladb.com. You can then submit your changes as patches to the [scylladb-dev mailing list](https://groups.google.com/forum/#!forum/scylladb-dev) or as a pull request to the [Scylla project on GitHub](https://github.com/scylladb/scylla).
If you need help formatting or sending patches, [check out these instructions](https://github.com/scylladb/scylla/wiki/Formatting-and-sending-patches).
The Scylla C++ source code uses the [Seastar coding style](https://github.com/scylladb/seastar/blob/master/coding-style.md) so please adhere to that in your patches. Note that Scylla code is written with `using namespace seastar`, so you should not explicitly add the `seastar::` prefix to Seastar symbols. You will usually not need to add `using namespace seastar` to new source files, because most Scylla header files have `#include "seastarx.hh"`, which does this.
Header files in Scylla must be self-contained, i.e., each can be included without having to include specific other headers first. To verify that your change did not break this property, run `ninja dev-headers`. If you added or removed header files, you must `touch configure.py` first - this will cause `configure.py` to be automatically re-run to generate a fresh list of header files.
For more criteria on what reviewers consider good code, see the [review checklist](https://github.com/scylladb/scylla/blob/master/docs/dev/review-checklist.md).


@@ -1,422 +0,0 @@
# Guidelines for developing Scylla
This document is intended to help developers and contributors to Scylla get started. The first part consists of general guidelines that make no assumptions about a development environment or tooling. The second part describes a particular environment and work-flow for exemplary purposes.
## Overview
This section covers some high-level information about the Scylla source code and work-flow.
### Getting the source code
Scylla uses [Git submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules) to manage its dependency on Seastar and other tools. Be sure that all submodules are correctly initialized when cloning the project:
```bash
$ git clone https://github.com/scylladb/scylla
$ cd scylla
$ git submodule update --init --recursive
```
### Dependencies
Scylla is fairly fussy about its build environment, requiring a very recent
version of the C++20 compiler and numerous tools and libraries to build.
Run `./install-dependencies.sh` (as root) to use your Linux distribution's
package manager to install the appropriate packages on your build machine.
However, this will only work on very recent distributions. For example,
currently Fedora users must upgrade to Fedora 32 otherwise the C++ compiler
will be too old, and not support the new C++20 standard that Scylla uses.
Alternatively, to avoid having to upgrade your build machine or install
various packages on it, we provide another option - the **frozen toolchain**.
This is a script, `./tools/toolchain/dbuild`, that can execute build or run
commands inside a Docker image that contains exactly the right build tools and
libraries. The `dbuild` technique is useful for beginners, but is also the way
in which ScyllaDB produces official releases, so it is highly recommended.
To use `dbuild`, you simply prefix any build or run command with it. Building
and running Scylla becomes as easy as:
```bash
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla
$ ./tools/toolchain/dbuild ./build/release/scylla --developer-mode 1
```
### Build system
**Note**: Compiling Scylla requires, conservatively, 2 GB of memory per native
thread, and up to 3 GB per native thread while linking. GCC >= 10 is
required.
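As a rough illustration of the memory figures in the note above, the following sketch picks a Ninja job count that is bounded by both the number of CPUs and the available memory. The function name `suggest_jobs` and its arguments are hypothetical and not part of the Scylla build tooling; the default of 3 GB per job simply reuses the linking figure quoted above.

```bash
# Sketch only: choose a -j value limited by CPU count and by memory.
# Arguments: total memory in GB, number of CPUs, GB needed per job (default 3).
suggest_jobs() {
    mem_gb=$1
    cpus=$2
    gb_per_job=${3:-3}
    # Integer division: how many jobs fit in memory.
    jobs=$(( mem_gb / gb_per_job ))
    # Never run more jobs than there are CPUs.
    if [ "$jobs" -gt "$cpus" ]; then
        jobs=$cpus
    fi
    # Always allow at least one job.
    if [ "$jobs" -lt 1 ]; then
        jobs=1
    fi
    echo "$jobs"
}
```

For example, on a machine with 16 GB of memory and 8 CPUs, `ninja-build -j"$(suggest_jobs 16 8)"` would limit the build to 5 parallel jobs.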
Scylla is built with [Ninja](https://ninja-build.org/), a low-level rule-based system. A Python script, `configure.py`, generates a Ninja file (`build.ninja`) based on configuration options.
To build for the first time:
```bash
$ ./configure.py
$ ninja-build
```
Afterwards, it is sufficient to just execute Ninja.
The full suite of options for project configuration is available via
```bash
$ ./configure.py --help
```
The most important option is:
- `--enable-dpdk`: [DPDK](http://dpdk.org/) is a set of libraries and drivers for fast packet processing. During development, it's not necessary to enable support even if it is supported by your platform.
Source files and build targets are tracked manually in `configure.py`, so the script needs to be updated when new files or targets are added or removed.
To save time -- for instance, to avoid compiling all unit tests -- you can also specify specific targets to Ninja. For example,
```bash
$ ninja-build build/release/tests/schema_change_test
$ ninja-build build/release/service/storage_proxy.o
```
You can also specify a single mode. For example
```bash
$ ninja-build release
```
This builds everything in release mode. The valid modes are:
* Debug: Enables [AddressSanitizer](https://github.com/google/sanitizers/wiki/AddressSanitizer)
and other sanity checks. It has no optimizations, which allows for debugging with tools like
GDB. Debugging builds are generally slower and generate much larger object files than release builds.
* Release: Fewer checks and more optimizations. It still has debug info.
* Dev: No optimizations or debug info. The objective is to compile and link as fast as possible.
This is useful for the first iterations of a patch.
Note that by default unit test binaries are stripped, so they can't be used with gdb or seastar-addr2line.
To include debug information in the unit test binary, build the test binary with a `_g` suffix. For example,
```bash
$ ninja-build build/release/tests/schema_change_test_g
```
### Unit testing
Unit tests live in the `/tests` directory. Like with application source files, test sources and executables are specified manually in `configure.py` and need to be updated when changes are made.
A test target can be any executable. A non-zero return code indicates test failure.
Most tests in the Scylla repository are built using the [Boost.Test](http://www.boost.org/doc/libs/1_64_0/libs/test/doc/html/index.html) library. Utilities for writing tests with Seastar futures are also included.
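The exit-code convention described above can be sketched in a few lines of shell. This is only an illustration of "any executable, non-zero means failure"; it is not how `test.py` is implemented, and the helper name `run_test` is hypothetical.

```bash
# Sketch only: run any command as a test and report the result
# based purely on its exit status (0 = pass, non-zero = fail).
run_test() {
    if "$@" >/dev/null 2>&1; then
        echo "PASS: $1"
    else
        echo "FAIL: $1"
        return 1
    fi
}
```

A call such as `run_test build/release/tests/row_cache_test -- -c1 -m1G` would then print `PASS` or `FAIL` followed by the path of the test executable.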
Run all tests through the test execution wrapper with
```bash
$ ./test.py --mode={debug,release}
```
The `--name` argument can be specified to run a particular test.
Alternatively, you can execute the test executable directly. For example,
```bash
$ build/release/tests/row_cache_test -- -c1 -m1G
```
The `-c1 -m1G` arguments limit this Seastar-based test to a single system thread and 1 GB of memory.
### Preparing patches
All changes to Scylla are submitted as patches to the public [mailing list](mailto:scylladb-dev@googlegroups.com). Once a patch is approved by one of the maintainers of the project, it is committed to the maintainers' copy of the repository at https://github.com/scylladb/scylla.
Detailed instructions for formatting patches for the mailing list and advice on preparing good patches are available at the [ScyllaDB website](http://docs.scylladb.com/contribute/). There are also some guidelines that can help you make the patch review process smoother:
1. Before generating patches, make sure your Git configuration points to `.gitorderfile`. You can do it by running
```bash
$ git config diff.orderfile .gitorderfile
```
2. If you are sending more than a single patch, push your changes into a new branch of your fork of Scylla on GitHub and add a URL pointing to this branch to your cover letter.
3. If you are sending a new revision of an earlier patchset, add a brief summary of changes in this version, for example:
```
In v3:
- declared move constructor and move assignment operator as noexcept
- used std::variant instead of a union
...
```
4. Add information about the tests run with this fix. It can look like
```
"Tests: unit ({mode}), dtest ({smp})"
```
The usual is "Tests: unit (dev)", although running debug tests is encouraged.
5. When answering review comments, prefer inline quotes as they make it easier to track the conversation across multiple e-mails.
6. The Linux kernel's [Submitting Patches](https://www.kernel.org/doc/html/v4.19/process/submitting-patches.html) document offers excellent advice on how to prepare patches and patchsets for review. Since the Scylla development process is derived from the kernel's, almost all of the advice there is directly applicable.
### Finding a person to review and merge your patches
You can use the `scripts/find-maintainer` script to find a subsystem maintainer and/or reviewer for your patches. The script accepts a filename in the git source tree as an argument and outputs a list of subsystems the file belongs to and their respective maintainers and reviewers. For example, if you changed the `cql3/statements/create_view_statement.hh` file, run the script as follows:
```bash
$ ./scripts/find-maintainer cql3/statements/create_view_statement.hh
```
and you will get output like this:
```
CQL QUERY LANGUAGE
Tomasz Grabiec <tgrabiec@scylladb.com> [maintainer]
MATERIALIZED VIEWS
Nadav Har'El <nyh@scylladb.com> [reviewer]
```
### Running Scylla
Once Scylla has been compiled, executing the (`debug` or `release`) target will start a running instance in the foreground:
```bash
$ build/release/scylla
```
The `scylla` executable requires a configuration file, `scylla.yaml`. By default, this is read from `$SCYLLA_HOME/conf/scylla.yaml`. A good starting point for development is located in the repository at `/conf/scylla.yaml`.
For development, a directory at `$HOME/scylla` can be used for all Scylla-related files:
```bash
$ mkdir -p $HOME/scylla $HOME/scylla/conf
$ cp conf/scylla.yaml $HOME/scylla/conf/scylla.yaml
$ # Edit configuration options as appropriate
$ SCYLLA_HOME=$HOME/scylla build/release/scylla
```
The `scylla.yaml` file in the repository by default writes all database data to `/var/lib/scylla`, which likely requires root access. Change the `data_file_directories` and `commitlog_directory` fields as appropriate.
Scylla has a number of requirements for the file-system and operating system to operate ideally and at peak performance. However, during development, these requirements can be relaxed with the `--developer-mode` flag.
Additionally, when running on under-powered platforms like portable laptops, the `--overprovisioned` flag is useful.
On a development machine, one might run Scylla as
```bash
$ SCYLLA_HOME=$HOME/scylla build/release/scylla --overprovisioned --developer-mode=yes
```
To interact with scylla it is recommended to build our versions of
cqlsh and nodetool. They are available at
https://github.com/scylladb/scylla-tools-java and can be built with
```bash
$ sudo ./install-dependencies.sh
$ ant jar
```
cqlsh should work out of the box, but nodetool depends on a running
scylla-jmx (https://github.com/scylladb/scylla-jmx). It can be built
with
```bash
$ mvn package
```
and must be started with
```bash
$ ./scripts/scylla-jmx
```
### Branches and tags
Multiple release branches are maintained on the Git repository at https://github.com/scylladb/scylla. Release 1.5, for instance, is tracked on the `branch-1.5` branch.
Similarly, tags are used to pin-point precise release versions, including hot-fix versions like 1.5.4. These are named `scylla-1.5.4`, for example.
Most development happens on the `master` branch. Release branches are cut from `master` based on time and/or features. When a patch against `master` fixes a serious issue like a node crash or data loss, it is backported to a particular release branch with `git cherry-pick` by the project maintainers.
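The naming scheme above can be expressed as two small helpers. This is a sketch that assumes every release follows the `branch-X.Y` and `scylla-X.Y.Z` patterns shown in the examples; the function names are hypothetical.

```bash
# Sketch only: derive branch and tag names from a version string,
# assuming the naming patterns shown in the examples above.
release_branch() {
    # Keep only the first two components: 1.5.4 -> 1.5
    echo "branch-$(echo "$1" | cut -d. -f1,2)"
}

release_tag() {
    echo "scylla-$1"
}
```

With these helpers, `git checkout "$(release_tag 1.5.4)"` would check out the tag for hot-fix release 1.5.4, and `git log "$(release_branch 1.5.4)"` would show the history of the branch it belongs to.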
## Example: development on Fedora 25
This section describes one possible work-flow for developing Scylla on a Fedora 25 system. It is presented as an example to help you to develop a work-flow and tools that you are comfortable with.
### Preface
This guide will be written from the perspective of a fictitious developer, Taylor Smith.
### Git work-flow
Having two Git remotes is useful:
- A public clone of Scylla (`"public"`)
- A private clone of Scylla (`"private"`) for in-progress work or work that is not yet ready to share
The first step to contributing a change to Scylla is to create a local branch dedicated to it. For example, a feature that fixes a bug in the CQL statement for creating tables could be called `ts/cql_create_table_error/v1`. The branch name is prefaced by the developer's initials and has a suffix indicating that this is the first version. The version suffix is useful when branches are shared publicly and changes are requested on the mailing list. Having a branch for each version of the patch (or patch set) shared publicly makes it easier to reference and compare the history of a change.
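When a new revision of a patch set is requested, the version suffix can be bumped mechanically. The helper below is a sketch of that convention; its name `next_version_branch` is hypothetical, and it assumes the branch name ends in `/v<number>` as in the example above.

```bash
# Sketch only: compute the branch name for the next revision of a patch set.
# Assumes the branch name ends in "/v<number>".
next_version_branch() {
    branch=$1
    prefix=${branch%/v*}     # everything before the final "/v<number>"
    version=${branch##*/v}   # the number after the final "/v"
    echo "$prefix/v$(( version + 1 ))"
}
```

For example, `git checkout -b "$(next_version_branch ts/cql_create_table_error/v1)"` would create `ts/cql_create_table_error/v2` from the current branch.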
Setting the upstream branch of your development branch to `master` is a useful way to track your changes. You can do this with
```bash
$ git branch -u master ts/cql_create_table_error/v1
```
As a patch set is developed, you can periodically push the branch to the private remote to back-up work.
Once the patch set is ready to be reviewed, push the branch to the public remote and prepare an email to the `scylladb-dev` mailing list. Including a link to the branch on your public remote allows for reviewers to quickly test and explore your changes.
### Development environment and source code navigation
Scylla includes a [CMake](https://cmake.org/) file, `CMakeLists.txt`, for use only with development environments (not for building) so that they can properly analyze the source code.
[CLion](https://www.jetbrains.com/clion/) is a commercial IDE that offers reasonably good source code navigation and advice for code hygiene, though its C++ parser sometimes makes errors and flags false issues.
Other good options that directly parse CMake files are [KDevelop](https://www.kdevelop.org/) and [QtCreator](https://wiki.qt.io/Qt_Creator).
To use the `CMakeLists.txt` file with these programs, define the `FOR_IDE` CMake variable or shell environmental variable.
[Eclipse](https://eclipse.org/cdt/) is another open-source option. It doesn't natively work with CMake projects, and its C++ parser has many of the same issues as CLion's.
### Distributed compilation: `distcc` and `ccache`
Scylla's compilation times can be long. Two tools help somewhat:
- [ccache](https://ccache.samba.org/) caches compiled object files on disk and re-uses them when possible
- [distcc](https://github.com/distcc/distcc) distributes compilation jobs to remote machines
A reasonably-powered laptop acts as the coordinator for compilation. A second, more powerful, machine acts as a passive compilation server.
Having a direct wired connection between the machines ensures that object files can be transmitted quickly and limits the overhead of remote compilation.
The coordinator has been assigned the static IP address `10.0.0.1` and the passive compilation machine has been assigned `10.0.0.2`.
On Fedora, installing the `ccache` package places symbolic links for `gcc` and `g++` in the `PATH`. This allows normal compilation to transparently invoke `ccache` for compilation and cache object files on the local file-system.
Next, set `CCACHE_PREFIX` so that `ccache` is responsible for invoking `distcc` as necessary:
```bash
export CCACHE_PREFIX="distcc"
```
On each host, edit `/etc/sysconfig/distccd` to include the allowed coordinators and the total number of jobs that the machine should accept.
This example is for the laptop, which has 2 physical cores (4 logical cores with hyper-threading):
```
OPTIONS="--allow 10.0.0.2 --allow 127.0.0.1 --jobs 4"
```
`10.0.0.2` has 8 physical cores (16 logical cores) and 64 GB of memory.
As a rule of thumb, the number of jobs that a machine is configured to accept should be equal to the number of its native threads.
Restart the `distccd` service on all machines.
On the coordinator machine, edit `$HOME/.distcc/hosts` with the available hosts for compilation. Order of the hosts indicates preference.
```
10.0.0.2/16 localhost/2
```
In this example, `10.0.0.2` will be sent up to 16 jobs and the local machine will be sent up to 2. Allowing for two extra threads on the host machine for coordination, we run compilation with `16 + 2 + 2 = 20` jobs in total: `ninja-build -j20`.
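The arithmetic above can be reproduced with a short helper that sums the per-host limits from a hosts line and adds the extra coordination threads. This is a sketch: the function name `distcc_jobs` is hypothetical, and hosts without an explicit `/LIMIT` are simply ignored rather than given a default.

```bash
# Sketch only: sum the "/LIMIT" values of a distcc hosts line and add
# a number of extra local threads for coordination (default 2).
distcc_jobs() {
    hosts=$1
    extra=${2:-2}
    total=0
    for host in $hosts; do
        case $host in
            */*)
                limit=${host##*/}    # text after the last "/"
                limit=${limit%%,*}   # drop options such as ",lzo"
                total=$(( total + limit ))
                ;;
            *)
                # No explicit limit given: ignored in this sketch.
                ;;
        esac
    done
    echo $(( total + extra ))
}
```

With the hosts line from the example, `distcc_jobs '10.0.0.2/16 localhost/2'` prints `20`, which matches the `-j20` used above.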
When a compilation is in progress, the status of jobs on all remote machines can be visualized in the terminal with `distccmon-text` or graphically as a GTK application with `distccmon-gnome`.
One thing to keep in mind is that linking object files happens on the coordinating machine, which can be a bottleneck. See the next sections for ways to speed up this process.
### Using the `gold` linker
Linking Scylla can be slow. The gold linker can replace GNU ld and often speeds up the linking process. On Fedora, you can switch the system linker using
```bash
$ sudo alternatives --config ld
```
### Using split dwarf
With debug info enabled, most of the link time is spent copying and
relocating it. It is possible to leave most of the debug info out of
the link by writing it to a side .dwo file. This is done by passing
`-gsplit-dwarf` to gcc.
Unfortunately, `-gsplit-dwarf` on its own would slow down `gdb` startup. To
avoid that, the gold linker can be told to create an index with
`--gdb-index`.
More info at https://gcc.gnu.org/wiki/DebugFission.
Both options can be enabled by passing `--split-dwarf` to `configure.py`.
Note that distcc is *not* compatible with it, but icecream
(https://github.com/icecc/icecream) is.
### Testing changes in Seastar with Scylla
Sometimes Scylla development is closely tied with a feature being developed in Seastar. It can be useful to compile Scylla with a particular check-out of Seastar.
One way to do this is to create a local remote for the Seastar submodule in the Scylla repository:
```bash
$ cd $HOME/src/scylla
$ cd seastar
$ git remote add local /home/tsmith/src/seastar
$ git remote update
$ git checkout -t local/my_local_seastar_branch
```
### Generating code coverage report
Install dependencies:
$ dnf install llvm # for llvm-profdata and llvm-cov
$ dnf install lcov # for genhtml
Instruct `configure.py` to generate build files for `coverage` mode:
$ ./configure.py --mode=coverage
Build the tests you want to run, then run them via `test.py` (important!):
$ ./test.py --mode=coverage [...]
Alternatively, you can run individual tests via `./scripts/coverage.py --run`.
Open the link printed at the end. Be horrified. Go and write more tests.
For more details see `./scripts/coverage.py --help`.
### Resolving stack backtraces
Scylla may print stack backtraces to the log for several reasons.
For example:
- When aborting (e.g. due to assertion failure, internal error, or segfault)
- When detecting seastar reactor stalls (where a seastar task runs for a long time without yielding the cpu to other tasks on that shard)
The backtraces contain code pointers so they are not very helpful without resolving into code locations.
To resolve the backtraces, one needs the scylla relocatable package that contains the scylla binary (with debug information),
as well as the dynamic libraries it is linked against.
Builds from our automated build system are uploaded to the cloud
and can be searched on http://backtrace.scylladb.com/
Make sure you have the exact `build-id` of the scylla server to locate
its respective relocatable package, which is required for decoding the backtraces it prints.
The build-id is printed to the system log when scylla starts.
It can also be found by executing `scylla --build-id`, or
by using the `file` utility, for example:
```
$ scylla --build-id
4cba12e6eb290a406bfa4930918db23941fd4be3
$ file scylla
scylla: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=4cba12e6eb290a406bfa4930918db23941fd4be3, with debug_info, not stripped, too many notes (256)
```
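When only the identifier is needed, for instance to compare it with the value printed in a log, it can be cut out of the `file` output shown above. The filter below is a sketch that assumes the output contains a `BuildID[sha1]=<hex>` field as in that example; the function name is hypothetical.

```bash
# Sketch only: read `file` output on stdin and print just the build-id,
# assuming it appears as "BuildID[sha1]=<hex digits>".
build_id_from_file_output() {
    sed -n 's/.*BuildID\[sha1\]=\([0-9a-f]\{1,\}\).*/\1/p'
}
```

It would be used as `file scylla | build_id_from_file_output`, and prints nothing if no build-id field is present.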
To find the build-id of a coredump, use the `eu-unstrip` utility as follows:
```
$ eu-unstrip -n --core <coredump> | awk '/scylla$/ { s=$2; sub(/@.*$/, "", s); print s; exit(0); }'
4cba12e6eb290a406bfa4930918db23941fd4be3
```
### Core dump debugging
See [debugging.md](docs/dev/debugging.md).

IDL.md Normal file

@@ -0,0 +1,103 @@
#IDL definition
The schema we use is similar to the C++ schema.
Use class or struct similar to the object you need the serializer for.
Use namespace when applicable.
##keywords
* class/struct - a class or a struct like C++
class/struct can have final or stub marker
* namespace - has the same C++ meaning
* enum class - has the same C++ meaning
* final modifier for class - when a class is marked as final, it will not contain a size parameter. Note that a final class cannot be extended by a future version, so use with care
* stub class - when a class is marked as stub, no code will be generated for this class and it is only there as documentation.
* version attributes - marking a field with [[version id ]] indicates that the field is available starting from a specific version
* template - A template class definition like C++
##Syntax
###Namespace
```
namespace ns_name { namespace-body }
```
* ns_name: either a previously unused identifier, in which case this is original-namespace-definition or the name of a namespace, in which case this is extension-namespace-definition
* namespace-body: possibly empty sequence of declarations of any kind (including class and struct definitions as well as nested namespaces)
###class/struct
`
class-key class-name final(optional) stub(optional) { member-specification } ;(optional)
`
* class-key: one of class or struct.
* class-name: the name of the class that's being defined, optionally followed by the keyword final, optionally followed by the keyword stub
* final: when a class is marked as final, it cannot be extended and there is no need to serialize its size; use with care.
* stub: when a class is marked as stub, no code will be generated for it and it is added for documentation only.
* member-specification: a list of access specifiers and public member accessors; see class member below.
* to be compatible with C++ a class definition can be followed by a semicolon.
###enum
`enum-key identifier enum-base { enumerator-list(optional) }`
* enum-key: only enum class is supported
* identifier: the name of the enumeration that's being declared.
* enum-base: colon (:), followed by a type-specifier-seq that names an integral type (see the C++ standard for the full list of all possible integral types).
* enumerator-list: comma-separated list of enumerator definitions, each of which is either simply an identifier, which becomes the name of the enumerator, or an identifier with an initializer: identifier = integral value.
Note that though C++ allows a constexpr as an initializer value, it makes the documentation less readable, hence it is not permitted.
###class member
`type member-access attributes(optional) default-value(optional);`
* type: Any valid C++ type, following the C++ notation. Note that there should be a serializer for the type, but declaration order is not mandatory
* member-access: the way the member can be accessed. If the member is public, it can be the name itself. If not, it can be a getter function, which should be followed by parentheses. Note that getters can (and probably should) be const methods.
* attributes: Attributes are defined by square brackets. Currently they are used to mark the version in which a specific member was added: [ [ version version-number] ] marks that the member was added in the given version number.
###template
`template < parameter-list > class-declaration`
* parameter-list - a non-empty comma-separated list of the template parameters.
* class-declaration - (See class section) The class name declared becomes a template name.
##IDL example
Comments start with two forward slashes and are ignored until the end of the line.
```
namespace utils {
// An example of a stub class
class UUID stub {
int64_t most_sig_bits;
int64_t least_sig_bits;
}
}
namespace gms {
//an enum example
enum class application_state:int {STATUS = 0,
LOAD,
SCHEMA,
DC};
// example of final class
class versioned_value final {
// getter and setter as public member
int version;
sstring value;
}
class heart_beat_state {
//getter as function
int32_t get_generation();
//default value example
int32_t get_heart_beat_version() = 1;
}
class endpoint_state {
heart_beat_state get_heart_beat_state();
std::map<application_state, versioned_value> get_application_state_map();
}
class gossip_digest {
inet_address get_endpoint();
int32_t get_generation();
//mark that a field was added on a specific version
int32_t get_max_version() [ [version 0.14.2] ];
}
class gossip_digest_ack {
std::vector<gossip_digest> digests();
std::map<inet_address, gms::endpoint_state> get_endpoint_state_map();
}
}
```

@@ -1,11 +1,2 @@
This project includes code developed by the Apache Software Foundation (http://www.apache.org/),
especially Apache Cassandra.
It includes files from https://github.com/antonblanchard/crc32-vpmsum (author Anton Blanchard <anton@au.ibm.com>, IBM).
These files are located in utils/arch/powerpc/crc32-vpmsum. Their license may be found in licenses/LICENSE-crc32-vpmsum.TXT.
It includes modified code from https://gitbox.apache.org/repos/asf?p=cassandra-dtest.git (owned by The Apache Software Foundation)
It includes modified tests from https://github.com/etcd-io/etcd.git (owned by The etcd Authors)
It includes files from https://github.com/bytecodealliance/wasmtime-cpp (owned by Bytecode Alliance), licensed with Apache License 2.0.

29
README-DPDK.md Normal file

@@ -0,0 +1,29 @@
Seastar and DPDK
================
Seastar uses the Data Plane Development Kit to drive NIC hardware directly. This
provides an enormous performance boost.
To enable DPDK, specify `--enable-dpdk` to `./configure.py`, and `--dpdk-pmd` as a
run-time parameter. This will use the DPDK package provided as a git submodule with the
seastar sources.
To use your own self-compiled DPDK package, follow this procedure:
1. Set up the host to compile DPDK:
- Ubuntu
`sudo apt-get install -y build-essential linux-image-extra-$(uname -r)`
2. Prepare a DPDK SDK:
- Download the latest DPDK release: `wget http://dpdk.org/browse/dpdk/snapshot/dpdk-1.8.0.tar.gz`
- Untar it.
- Edit config/common_linuxapp: set CONFIG_RTE_MBUF_REFCNT and CONFIG_RTE_LIBRTE_KNI to 'n'.
- For DPDK 1.7.x: edit config/common_linuxapp:
- Set CONFIG_RTE_LIBRTE_PMD_BOND to 'n'.
- Set CONFIG_RTE_MBUF_SCATTER_GATHER to 'n'.
- Set CONFIG_RTE_LIBRTE_IP_FRAG to 'n'.
- Start the tools/setup.sh script as root.
- Compile a linuxapp target (option 9).
- Install IGB_UIO module (option 11).
- Bind some physical port to IGB_UIO (option 17).
- Configure hugepage mappings (option 14/15).
3. Run configure.py: `./configure.py --dpdk-target <path to the untarred dpdk-1.8.0 above>/x86_64-native-linuxapp-gcc`.

162
README.md

@@ -1,110 +1,96 @@
# Scylla
#Scylla
[![Slack](https://img.shields.io/badge/slack-scylla-brightgreen.svg?logo=slack)](http://slack.scylladb.com)
[![Twitter](https://img.shields.io/twitter/follow/ScyllaDB.svg?style=social&label=Follow)](https://twitter.com/intent/follow?screen_name=ScyllaDB)
##Building Scylla
## What is Scylla?
In addition to required packages by Seastar, the following packages are required by Scylla.
Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB.
Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.
For more information, please see the [ScyllaDB web site].
[ScyllaDB web site]: https://www.scylladb.com
## Build Prerequisites
Scylla is fairly fussy about its build environment, requiring very recent
versions of the C++20 compiler and of many libraries to build. The document
[HACKING.md](HACKING.md) includes detailed information on building and
developing Scylla, but to get Scylla building quickly on (almost) any build
machine, Scylla offers a [frozen toolchain](tools/toolchain/README.md).
This is a pre-configured Docker image which includes recent versions of all
the required compilers, libraries and build tools. Using the frozen toolchain
allows you to avoid changing anything in your build machine to meet Scylla's
requirements - you just need to meet the frozen toolchain's prerequisites
(mostly, Docker or Podman being available).
## Building Scylla
Building Scylla with the frozen toolchain `dbuild` is as easy as:
```bash
$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla
### Submodules
Scylla uses submodules, so make sure you pull the submodules first by doing:
```
git submodule init
git submodule update --recursive
```
For further information, please see:
### Building and Running Scylla on Fedora
* Installing required packages:
* [Developer documentation] for more information on building Scylla.
* [Build documentation] on how to build Scylla binaries, tests, and packages.
* [Docker image build documentation] for information on how to build Docker images.
[developer documentation]: HACKING.md
[build documentation]: docs/dev/building.md
[docker image build documentation]: dist/docker/debian/README.md
## Running Scylla
To start Scylla server, run:
```bash
$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1
```
sudo yum install yaml-cpp-devel lz4-devel zlib-devel snappy-devel jsoncpp-devel thrift-devel antlr3-tool antlr3-C++-devel libasan libubsan gcc-c++ gnutls-devel ninja-build ragel libaio-devel cryptopp-devel xfsprogs-devel numactl-devel hwloc-devel libpciaccess-devel libxml2-devel python3-pyparsing
```
This will start a Scylla node with one CPU core allocated to it and data files stored in the `tmp` directory.
The `--developer-mode` option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations).
Please note that you need to run Scylla with `dbuild` if you built it with the frozen toolchain.
* Build Scylla
```
./configure.py --mode=release --with=scylla --disable-xen
ninja-build build/release/scylla -j2 # you can use more cpus if you have tons of RAM
For more run options, run:
```bash
$ ./tools/toolchain/dbuild ./build/release/scylla --help
```
## Testing
* Run Scylla
```
./build/release/scylla
See [test.py manual](docs/dev/testing.md).
```
## Scylla APIs and compatibility
By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and
Thrift. There is also support for the API of Amazon DynamoDB™,
which needs to be enabled and configured in order to be used. For more
information on how to enable the DynamoDB™ API in Scylla,
and the current compatibility of this feature as well as Scylla-specific extensions, see
[Alternator](docs/alternator/alternator.md) and
[Getting started with Alternator](docs/alternator/getting-started.md).
* run Scylla with one CPU and ./tmp as data directory
## Documentation
```
./build/release/scylla --datadir tmp --commitlog-directory tmp --smp 1
```
Documentation can be found [here](docs/dev/README.md).
Seastar documentation can be found [here](http://docs.seastar.io/master/index.html).
User documentation can be found [here](https://docs.scylladb.com/).
* For more run options:
```
./build/release/scylla --help
```
## Training
## Building Fedora RPM
As a pre-requisite, you need to install [Mock](https://fedoraproject.org/wiki/Mock) on your machine:
```
# Install mock:
sudo yum install mock
# Add user to the "mock" group:
usermod -a -G mock $USER && newgrp mock
```
Then, to build an RPM, run:
```
./dist/redhat/build_rpm.sh
```
The built RPM is stored in ``/var/lib/mock/<configuration>/result`` directory.
For example, on Fedora 21 mock reports the following:
```
INFO: Done(scylla-server-0.00-1.fc21.src.rpm) Config(default) 20 minutes 7 seconds
INFO: Results and/or logs in: /var/lib/mock/fedora-21-x86_64/result
```
## Building Fedora-based Docker image
Build a Docker image with:
```
cd dist/docker
docker build -t <image-name> .
```
Run the image with:
```
docker run -p $(hostname -i):9042:9042 -i -t <image name>
```
Training material and online courses can be found at [Scylla University](https://university.scylladb.com/).
The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling,
administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions,
multi-datacenters and how Scylla integrates with third-party applications.
## Contributing to Scylla
If you want to report a bug or submit a pull request or a patch, please read the [contribution guidelines].
Do not send pull requests.
If you are a developer working on Scylla, please read the [developer guidelines].
Send patches to the mailing list address scylladb-dev@googlegroups.com.
Be sure to subscribe.
[contribution guidelines]: CONTRIBUTING.md
[developer guidelines]: HACKING.md
## Contact
* The [users mailing list] and [Slack channel] are for users to discuss configuration, management, and operations of the ScyllaDB open source.
* The [developers mailing list] is for developers and people interested in following the development of ScyllaDB to discuss technical topics.
[Users mailing list]: https://groups.google.com/forum/#!forum/scylladb-users
[Slack channel]: http://slack.scylladb.com/
[Developers mailing list]: https://groups.google.com/forum/#!forum/scylladb-dev
In order for your patches to be merged, you must sign the Contributor's
License Agreement, protecting your rights and ours. See
http://www.scylladb.com/opensource/cla/.

@@ -1,109 +1,19 @@
#!/bin/sh
USAGE=$(cat <<-END
Usage: $(basename "$0") [-h|--help] [-o|--output-dir PATH] [--date-stamp DATE] -- generate Scylla version and build information files.
Options:
-h|--help show this help message.
-o|--output-dir PATH specify destination path at which the version files are to be created.
-d|--date-stamp DATE manually set date for release parameter
By default, the script will attempt to parse 'version' file
in the current directory, which should contain a string of
'\$version-\$release' form.
Otherwise, it will call 'git log' on the source tree (the
directory, which contains the script) to obtain current
commit hash and use it for building the version and release
strings.
The script assumes that it's called from the Scylla source
tree.
The files created are:
SCYLLA-VERSION-FILE
SCYLLA-RELEASE-FILE
SCYLLA-PRODUCT-FILE
By default, these files are created in the 'build'
subdirectory under the directory containing the script.
The destination directory can be overriden by
using '-o PATH' option.
END
)
DATE=""
while [[ $# -gt 0 ]]; do
opt="$1"
case $opt in
-h|--help)
echo "$USAGE"
exit 0
;;
-o|--output-dir)
OUTPUT_DIR="$2"
shift
shift
;;
--date-stamp)
DATE="$2"
shift
shift
;;
*)
echo "Unexpected argument found: $1"
echo
echo "$USAGE"
exit 1
;;
esac
done
SCRIPT_DIR="$(dirname "$0")"
if [ -z "$OUTPUT_DIR" ]; then
OUTPUT_DIR="$SCRIPT_DIR/build"
fi
if [ -z "$DATE" ]; then
DATE=$(date --utc +%Y%m%d)
fi
# Default scylla product/version tags
PRODUCT=scylla
VERSION=5.2.0-dev
VERSION=1.0.4
if test -f version
then
SCYLLA_VERSION=$(cat version | awk -F'-' '{print $1}')
SCYLLA_RELEASE=$(cat version | awk -F'-' '{print $2}')
else
DATE=$(date +%Y%m%d)
GIT_COMMIT=$(git log --pretty=format:'%h' -n 1)
SCYLLA_VERSION=$VERSION
if [ -z "$SCYLLA_RELEASE" ]; then
DATE=$(date --utc +%Y%m%d)
GIT_COMMIT=$(git -C "$SCRIPT_DIR" log --pretty=format:'%h' -n 1 --abbrev=12)
# For custom package builds, replace "0" with "counter.your_name",
# where counter starts at 1 and increments for successive versions.
# This ensures that the package manager will select your custom
# package over the standard release.
SCYLLA_BUILD=0
SCYLLA_RELEASE=$SCYLLA_BUILD.$DATE.$GIT_COMMIT
elif [ -f "$OUTPUT_DIR/SCYLLA-RELEASE-FILE" ]; then
echo "setting SCYLLA_RELEASE only makes sense in clean builds" 1>&2
exit 1
fi
fi
if [ -f "$OUTPUT_DIR/SCYLLA-RELEASE-FILE" ]; then
GIT_COMMIT_FILE=$(cat "$OUTPUT_DIR/SCYLLA-RELEASE-FILE" |cut -d . -f 3)
if [ "$GIT_COMMIT" = "$GIT_COMMIT_FILE" ]; then
exit 0
fi
SCYLLA_RELEASE=$DATE.$GIT_COMMIT
fi
echo "$SCYLLA_VERSION-$SCYLLA_RELEASE"
mkdir -p "$OUTPUT_DIR"
echo "$SCYLLA_VERSION" > "$OUTPUT_DIR/SCYLLA-VERSION-FILE"
echo "$SCYLLA_RELEASE" > "$OUTPUT_DIR/SCYLLA-RELEASE-FILE"
echo "$PRODUCT" > "$OUTPUT_DIR/SCYLLA-PRODUCT-FILE"
mkdir -p build
echo "$SCYLLA_VERSION" > build/SCYLLA-VERSION-FILE
echo "$SCYLLA_RELEASE" > build/SCYLLA-RELEASE-FILE

1
abseil

Submodule abseil deleted from 7f3c0d7811

@@ -1,13 +0,0 @@
/*
* Copyright (C) 2020-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include "absl-flat_hash_map.hh"
size_t sstring_hash::operator()(std::string_view v) const noexcept {
return absl::Hash<std::string_view>{}(v);
}

@@ -1,34 +0,0 @@
/*
* Copyright (C) 2020-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include <absl/container/flat_hash_map.h>
#include <seastar/core/sstring.hh>
using namespace seastar;
struct sstring_hash {
using is_transparent = void;
size_t operator()(std::string_view v) const noexcept;
};
struct sstring_eq {
using is_transparent = void;
bool operator()(std::string_view a, std::string_view b) const noexcept {
return a == b;
}
};
template <typename K, typename V, typename... Ts>
struct flat_hash_map : public absl::flat_hash_map<K, V, Ts...> {
};
template <typename V>
struct flat_hash_map<sstring, V>
: public absl::flat_hash_map<sstring, V, sstring_hash, sstring_eq> {};

@@ -1,158 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include "alternator/error.hh"
#include "log.hh"
#include <string>
#include <string_view>
#include <gnutls/crypto.h>
#include "hashers.hh"
#include "bytes.hh"
#include "alternator/auth.hh"
#include <fmt/format.h>
#include "auth/common.hh"
#include "auth/password_authenticator.hh"
#include "auth/roles-metadata.hh"
#include "service/storage_proxy.hh"
#include "alternator/executor.hh"
#include "cql3/selection/selection.hh"
#include "query-result-set.hh"
#include "cql3/result_set.hh"
#include <seastar/core/coroutine.hh>
namespace alternator {
static logging::logger alogger("alternator-auth");
static hmac_sha256_digest hmac_sha256(std::string_view key, std::string_view msg) {
hmac_sha256_digest digest;
int ret = gnutls_hmac_fast(GNUTLS_MAC_SHA256, key.data(), key.size(), msg.data(), msg.size(), digest.data());
if (ret) {
throw std::runtime_error(fmt::format("Computing HMAC failed ({}): {}", ret, gnutls_strerror(ret)));
}
return digest;
}
static hmac_sha256_digest get_signature_key(std::string_view key, std::string_view date_stamp, std::string_view region_name, std::string_view service_name) {
auto date = hmac_sha256("AWS4" + std::string(key), date_stamp);
auto region = hmac_sha256(std::string_view(date.data(), date.size()), region_name);
auto service = hmac_sha256(std::string_view(region.data(), region.size()), service_name);
auto signing = hmac_sha256(std::string_view(service.data(), service.size()), "aws4_request");
return signing;
}
static std::string apply_sha256(std::string_view msg) {
sha256_hasher hasher;
hasher.update(msg.data(), msg.size());
return to_hex(hasher.finalize());
}
static std::string apply_sha256(const std::vector<temporary_buffer<char>>& msg) {
sha256_hasher hasher;
for (const temporary_buffer<char>& buf : msg) {
hasher.update(buf.get(), buf.size());
}
return to_hex(hasher.finalize());
}
static std::string format_time_point(db_clock::time_point tp) {
time_t time_point_repr = db_clock::to_time_t(tp);
std::string time_point_str;
time_point_str.resize(17);
::tm time_buf;
// strftime prints the terminating null character as well
std::strftime(time_point_str.data(), time_point_str.size(), "%Y%m%dT%H%M%SZ", ::gmtime_r(&time_point_repr, &time_buf));
time_point_str.resize(16);
return time_point_str;
}
void check_expiry(std::string_view signature_date) {
//FIXME: The default 15min can be changed with X-Amz-Expires header - we should honor it
std::string expiration_str = format_time_point(db_clock::now() - 15min);
std::string validity_str = format_time_point(db_clock::now() + 15min);
if (signature_date < expiration_str) {
throw api_error::invalid_signature(
fmt::format("Signature expired: {} is now earlier than {} (current time - 15 min.)",
signature_date, expiration_str));
}
if (signature_date > validity_str) {
throw api_error::invalid_signature(
fmt::format("Signature not yet current: {} is still later than {} (current time + 15 min.)",
signature_date, validity_str));
}
}
std::string get_signature(std::string_view access_key_id, std::string_view secret_access_key, std::string_view host, std::string_view method,
std::string_view orig_datestamp, std::string_view signed_headers_str, const std::map<std::string_view, std::string_view>& signed_headers_map,
const std::vector<temporary_buffer<char>>& body_content, std::string_view region, std::string_view service, std::string_view query_string) {
auto amz_date_it = signed_headers_map.find("x-amz-date");
if (amz_date_it == signed_headers_map.end()) {
throw api_error::invalid_signature("X-Amz-Date header is mandatory for signature verification");
}
std::string_view amz_date = amz_date_it->second;
check_expiry(amz_date);
std::string_view datestamp = amz_date.substr(0, 8);
if (datestamp != orig_datestamp) {
throw api_error::invalid_signature(
format("X-Amz-Date date does not match the provided datestamp. Expected {}, got {}",
orig_datestamp, datestamp));
}
std::string_view canonical_uri = "/";
std::stringstream canonical_headers;
for (const auto& header : signed_headers_map) {
canonical_headers << fmt::format("{}:{}", header.first, header.second) << '\n';
}
std::string payload_hash = apply_sha256(body_content);
std::string canonical_request = fmt::format("{}\n{}\n{}\n{}\n{}\n{}", method, canonical_uri, query_string, canonical_headers.str(), signed_headers_str, payload_hash);
std::string_view algorithm = "AWS4-HMAC-SHA256";
std::string credential_scope = fmt::format("{}/{}/{}/aws4_request", datestamp, region, service);
std::string string_to_sign = fmt::format("{}\n{}\n{}\n{}", algorithm, amz_date, credential_scope, apply_sha256(canonical_request));
hmac_sha256_digest signing_key = get_signature_key(secret_access_key, datestamp, region, service);
hmac_sha256_digest signature = hmac_sha256(std::string_view(signing_key.data(), signing_key.size()), string_to_sign);
return to_hex(bytes_view(reinterpret_cast<const int8_t*>(signature.data()), signature.size()));
}
future<std::string> get_key_from_roles(service::storage_proxy& proxy, std::string username) {
schema_ptr schema = proxy.data_dictionary().find_schema("system_auth", "roles");
partition_key pk = partition_key::from_single_value(*schema, utf8_type->decompose(username));
dht::partition_range_vector partition_ranges{dht::partition_range(dht::decorate_key(*schema, pk))};
std::vector<query::clustering_range> bounds{query::clustering_range::make_open_ended_both_sides()};
const column_definition* salted_hash_col = schema->get_column_definition(bytes("salted_hash"));
if (!salted_hash_col) {
co_await coroutine::return_exception(api_error::unrecognized_client(format("Credentials cannot be fetched for: {}", username)));
}
auto selection = cql3::selection::selection::for_columns(schema, {salted_hash_col});
auto partition_slice = query::partition_slice(std::move(bounds), {}, query::column_id_vector{salted_hash_col->id}, selection->get_query_options());
auto command = ::make_lw_shared<query::read_command>(schema->id(), schema->version(), partition_slice,
proxy.get_max_result_size(partition_slice), query::tombstone_limit(proxy.get_tombstone_limit()));
auto cl = auth::password_authenticator::consistency_for_user(username);
service::client_state client_state{service::client_state::internal_tag()};
service::storage_proxy::coordinator_query_result qr = co_await proxy.query(schema, std::move(command), std::move(partition_ranges), cl,
service::storage_proxy::coordinator_query_options(executor::default_timeout(), empty_service_permit(), client_state));
cql3::selection::result_set_builder builder(*selection, gc_clock::now(), cql_serialization_format::latest());
query::result_view::consume(*qr.query_result, partition_slice, cql3::selection::result_set_builder::visitor(builder, *schema, *selection));
auto result_set = builder.build();
if (result_set->empty()) {
co_await coroutine::return_exception(api_error::unrecognized_client(format("User not found: {}", username)));
}
const bytes_opt& salted_hash = result_set->rows().front().front(); // We only asked for 1 row and 1 column
if (!salted_hash) {
co_await coroutine::return_exception(api_error::unrecognized_client(format("No password found for user: {}", username)));
}
co_return value_cast<sstring>(utf8_type->deserialize(*salted_hash));
}
}

@@ -1,33 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include <string>
#include <string_view>
#include <array>
#include "gc_clock.hh"
#include "utils/loading_cache.hh"
namespace service {
class storage_proxy;
}
namespace alternator {
using hmac_sha256_digest = std::array<char, 32>;
using key_cache = utils::loading_cache<std::string, std::string, 1>;
std::string get_signature(std::string_view access_key_id, std::string_view secret_access_key, std::string_view host, std::string_view method,
std::string_view orig_datestamp, std::string_view signed_headers_str, const std::map<std::string_view, std::string_view>& signed_headers_map,
const std::vector<temporary_buffer<char>>& body_content, std::string_view region, std::string_view service, std::string_view query_string);
future<std::string> get_key_from_roles(service::storage_proxy& proxy, std::string username);
}

@@ -1,738 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include <list>
#include <map>
#include <string_view>
#include "alternator/conditions.hh"
#include "alternator/error.hh"
#include "cql3/constants.hh"
#include <unordered_map>
#include "utils/rjson.hh"
#include "serialization.hh"
#include "utils/base64.hh"
#include "utils/rjson.hh"
#include <stdexcept>
#include <boost/algorithm/cxx11/all_of.hpp>
#include <boost/algorithm/cxx11/any_of.hpp>
#include "utils/overloaded_functor.hh"
#include "expressions.hh"
namespace alternator {
static logging::logger clogger("alternator-conditions");
comparison_operator_type get_comparison_operator(const rjson::value& comparison_operator) {
static std::unordered_map<std::string, comparison_operator_type> ops = {
{"EQ", comparison_operator_type::EQ},
{"NE", comparison_operator_type::NE},
{"LE", comparison_operator_type::LE},
{"LT", comparison_operator_type::LT},
{"GE", comparison_operator_type::GE},
{"GT", comparison_operator_type::GT},
{"IN", comparison_operator_type::IN},
{"NULL", comparison_operator_type::IS_NULL},
{"NOT_NULL", comparison_operator_type::NOT_NULL},
{"BETWEEN", comparison_operator_type::BETWEEN},
{"BEGINS_WITH", comparison_operator_type::BEGINS_WITH},
{"CONTAINS", comparison_operator_type::CONTAINS},
{"NOT_CONTAINS", comparison_operator_type::NOT_CONTAINS},
};
if (!comparison_operator.IsString()) {
throw api_error::validation(format("Invalid comparison operator definition {}", rjson::print(comparison_operator)));
}
std::string op = comparison_operator.GetString();
auto it = ops.find(op);
if (it == ops.end()) {
throw api_error::validation(format("Unsupported comparison operator {}", op));
}
return it->second;
}
namespace {
struct size_check {
// True iff size passes this check.
virtual bool operator()(rapidjson::SizeType size) const = 0;
// Check description, such that format("expected array {}", check.what()) is human-readable.
virtual sstring what() const = 0;
};
class exact_size : public size_check {
rapidjson::SizeType _expected;
public:
explicit exact_size(rapidjson::SizeType expected) : _expected(expected) {}
bool operator()(rapidjson::SizeType size) const override { return size == _expected; }
sstring what() const override { return format("of size {}", _expected); }
};
struct empty : public size_check {
bool operator()(rapidjson::SizeType size) const override { return size < 1; }
sstring what() const override { return "to be empty"; }
};
struct nonempty : public size_check {
bool operator()(rapidjson::SizeType size) const override { return size > 0; }
sstring what() const override { return "to be non-empty"; }
};
} // anonymous namespace
// Check that array has the expected number of elements
static void verify_operand_count(const rjson::value* array, const size_check& expected, const rjson::value& op) {
if (!array && expected(0)) {
// If expected() allows an empty AttributeValueList, it is also fine
// that it is missing.
return;
}
if (!array || !array->IsArray()) {
throw api_error::validation("With ComparisonOperator, AttributeValueList must be given and an array");
}
if (!expected(array->Size())) {
throw api_error::validation(
format("{} operator requires AttributeValueList {}, instead found list size {}",
op, expected.what(), array->Size()));
}
}
struct rjson_engaged_ptr_comp {
bool operator()(const rjson::value* p1, const rjson::value* p2) const {
return rjson::single_value_comp()(*p1, *p2);
}
};
// It's not enough to compare underlying JSON objects when comparing sets,
// as internally they're stored in an array, and the order of elements is
// not important in set equality. See issue #5021
static bool check_EQ_for_sets(const rjson::value& set1, const rjson::value& set2) {
if (!set1.IsArray() || !set2.IsArray() || set1.Size() != set2.Size()) {
return false;
}
std::set<const rjson::value*, rjson_engaged_ptr_comp> set1_raw;
for (auto it = set1.Begin(); it != set1.End(); ++it) {
set1_raw.insert(&*it);
}
for (const auto& a : set2.GetArray()) {
if (!set1_raw.contains(&a)) {
return false;
}
}
return true;
}
// Moreover, the JSON being compared can be a nested document with outer
// layers of lists and maps and some inner set - and we need to get to that
// inner set to compare it correctly with check_EQ_for_sets() (issue #8514).
static bool check_EQ(const rjson::value* v1, const rjson::value& v2);
static bool check_EQ_for_lists(const rjson::value& list1, const rjson::value& list2) {
if (!list1.IsArray() || !list2.IsArray() || list1.Size() != list2.Size()) {
return false;
}
auto it1 = list1.Begin();
auto it2 = list2.Begin();
while (it1 != list1.End()) {
// Note: Alternator limits an item's depth (rjson::parse() limits
// it to around 37 levels), so this recursion is safe.
if (!check_EQ(&*it1, *it2)) {
return false;
}
++it1;
++it2;
}
return true;
}
static bool check_EQ_for_maps(const rjson::value& list1, const rjson::value& list2) {
if (!list1.IsObject() || !list2.IsObject() || list1.MemberCount() != list2.MemberCount()) {
return false;
}
for (auto it1 = list1.MemberBegin(); it1 != list1.MemberEnd(); ++it1) {
auto it2 = list2.FindMember(it1->name);
if (it2 == list2.MemberEnd() || !check_EQ(&it1->value, it2->value)) {
return false;
}
}
return true;
}
// Check if two JSON-encoded values match with the EQ relation
static bool check_EQ(const rjson::value* v1, const rjson::value& v2) {
if (v1 && v1->IsObject() && v1->MemberCount() == 1 && v2.IsObject() && v2.MemberCount() == 1) {
auto it1 = v1->MemberBegin();
auto it2 = v2.MemberBegin();
if (it1->name != it2->name) {
return false;
}
if (it1->name == "SS" || it1->name == "NS" || it1->name == "BS") {
return check_EQ_for_sets(it1->value, it2->value);
} else if(it1->name == "L") {
return check_EQ_for_lists(it1->value, it2->value);
} else if(it1->name == "M") {
return check_EQ_for_maps(it1->value, it2->value);
} else {
// Other, non-nested types (number, string, etc.) can be compared
// literally, comparing their JSON representation.
return it1->value == it2->value;
}
} else {
// If v1 and/or v2 are missing (IsNull()) the result should be false.
// In the unlikely case that the object is malformed (issue #8070),
// let's also return false.
return false;
}
}
// Check if two JSON-encoded values match with the NE relation
static bool check_NE(const rjson::value* v1, const rjson::value& v2) {
return !check_EQ(v1, v2);
}
// Check if two JSON-encoded values match with the BEGINS_WITH relation
bool check_BEGINS_WITH(const rjson::value* v1, const rjson::value& v2,
bool v1_from_query, bool v2_from_query) {
bool bad = false;
if (!v1 || !v1->IsObject() || v1->MemberCount() != 1) {
if (v1_from_query) {
throw api_error::validation("begins_with() encountered malformed argument");
} else {
bad = true;
}
} else if (v1->MemberBegin()->name != "S" && v1->MemberBegin()->name != "B") {
if (v1_from_query) {
throw api_error::validation(format("begins_with supports only string or binary type, got: {}", *v1));
} else {
bad = true;
}
}
if (!v2.IsObject() || v2.MemberCount() != 1) {
if (v2_from_query) {
throw api_error::validation("begins_with() encountered malformed argument");
} else {
bad = true;
}
} else if (v2.MemberBegin()->name != "S" && v2.MemberBegin()->name != "B") {
if (v2_from_query) {
throw api_error::validation(format("begins_with() supports only string or binary type, got: {}", v2));
} else {
bad = true;
}
}
if (bad) {
return false;
}
auto it1 = v1->MemberBegin();
auto it2 = v2.MemberBegin();
if (it1->name != it2->name) {
return false;
}
if (it2->name == "S") {
return rjson::to_string_view(it1->value).starts_with(rjson::to_string_view(it2->value));
} else /* it2->name == "B" */ {
return base64_begins_with(rjson::to_string_view(it1->value), rjson::to_string_view(it2->value));
}
}
static bool is_set_of(const rjson::value& type1, const rjson::value& type2) {
return (type2 == "S" && type1 == "SS") || (type2 == "N" && type1 == "NS") || (type2 == "B" && type1 == "BS");
}
// Check if two JSON-encoded values match with the CONTAINS relation
bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2) {
if (!v1) {
return false;
}
const auto& kv1 = *v1->MemberBegin();
const auto& kv2 = *v2.MemberBegin();
if (kv1.name == "S" && kv2.name == "S") {
return rjson::to_string_view(kv1.value).find(rjson::to_string_view(kv2.value)) != std::string_view::npos;
} else if (kv1.name == "B" && kv2.name == "B") {
return rjson::base64_decode(kv1.value).find(rjson::base64_decode(kv2.value)) != bytes::npos;
} else if (is_set_of(kv1.name, kv2.name)) {
for (auto i = kv1.value.Begin(); i != kv1.value.End(); ++i) {
if (*i == kv2.value) {
return true;
}
}
} else if (kv1.name == "L") {
for (auto i = kv1.value.Begin(); i != kv1.value.End(); ++i) {
if (!i->IsObject() || i->MemberCount() != 1) {
clogger.error("check_CONTAINS received a list whose element is malformed");
return false;
}
const auto& el = *i->MemberBegin();
if (el.name == kv2.name && el.value == kv2.value) {
return true;
}
}
}
return false;
}
// Check if two JSON-encoded values match with the NOT_CONTAINS relation
static bool check_NOT_CONTAINS(const rjson::value* v1, const rjson::value& v2) {
if (!v1) {
return false;
}
return !check_CONTAINS(v1, v2);
}
// Check if a JSON-encoded value equals any element of an array, which must have at least one element.
static bool check_IN(const rjson::value* val, const rjson::value& array) {
if (!array[0].IsObject() || array[0].MemberCount() != 1) {
throw api_error::validation(
format("IN operator encountered malformed AttributeValue: {}", array[0]));
}
const auto& type = array[0].MemberBegin()->name;
if (type != "S" && type != "N" && type != "B") {
throw api_error::validation(
"IN operator requires AttributeValueList elements to be of type String, Number, or Binary ");
}
if (!val) {
return false;
}
bool have_match = false;
for (const auto& elem : array.GetArray()) {
if (!elem.IsObject() || elem.MemberCount() != 1 || elem.MemberBegin()->name != type) {
throw api_error::validation(
"IN operator requires all AttributeValueList elements to have the same type ");
}
if (!have_match && *val == elem) {
// Can't return yet, must check types of all array elements. <sigh>
have_match = true;
}
}
return have_match;
}
// Another variant of check_IN, this one for ConditionExpression. It needs to
// check whether the first element in the given vector is equal to any of the
// others.
static bool check_IN(const std::vector<rjson::value>& array) {
const rjson::value* first = &array[0];
for (unsigned i = 1; i < array.size(); i++) {
if (check_EQ(first, array[i])) {
return true;
}
}
return false;
}
static bool check_NULL(const rjson::value* val) {
return val == nullptr;
}
static bool check_NOT_NULL(const rjson::value* val) {
return val != nullptr;
}
// Only types S, N or B (string, number or bytes) may be compared by the
// various comparison operators - lt, le, gt, ge, and between.
// Note that in particular, if the value is missing (v->IsNull()), this
// check returns false.
static bool check_comparable_type(const rjson::value& v) {
if (!v.IsObject() || v.MemberCount() != 1) {
return false;
}
const rjson::value& type = v.MemberBegin()->name;
return type == "S" || type == "N" || type == "B";
}
// Check if two JSON-encoded values match with cmp.
template <typename Comparator>
bool check_compare(const rjson::value* v1, const rjson::value& v2, const Comparator& cmp,
bool v1_from_query, bool v2_from_query) {
bool bad = false;
if (!v1 || !check_comparable_type(*v1)) {
if (v1_from_query) {
throw api_error::validation(format("{} allows only the types String, Number, or Binary", cmp.diagnostic));
}
bad = true;
}
if (!check_comparable_type(v2)) {
if (v2_from_query) {
throw api_error::validation(format("{} allows only the types String, Number, or Binary", cmp.diagnostic));
}
bad = true;
}
if (bad) {
return false;
}
const auto& kv1 = *v1->MemberBegin();
const auto& kv2 = *v2.MemberBegin();
if (kv1.name != kv2.name) {
return false;
}
if (kv1.name == "N") {
return cmp(unwrap_number(*v1, cmp.diagnostic), unwrap_number(v2, cmp.diagnostic));
}
if (kv1.name == "S") {
return cmp(std::string_view(kv1.value.GetString(), kv1.value.GetStringLength()),
std::string_view(kv2.value.GetString(), kv2.value.GetStringLength()));
}
if (kv1.name == "B") {
return cmp(rjson::base64_decode(kv1.value), rjson::base64_decode(kv2.value));
}
// cannot reach here, as check_comparable_type() verifies the type is one
// of the above options.
return false;
}
struct cmp_lt {
template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs < rhs; }
// We cannot use the normal comparison operators like "<" on the bytes
// type, because they treat individual bytes as signed but we need to
// compare them as *unsigned*. So we need a specialization for bytes.
bool operator()(const bytes& lhs, const bytes& rhs) const { return compare_unsigned(lhs, rhs) < 0; }
static constexpr const char* diagnostic = "LT operator";
};
struct cmp_le {
template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs <= rhs; }
bool operator()(const bytes& lhs, const bytes& rhs) const { return compare_unsigned(lhs, rhs) <= 0; }
static constexpr const char* diagnostic = "LE operator";
};
struct cmp_ge {
template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs >= rhs; }
bool operator()(const bytes& lhs, const bytes& rhs) const { return compare_unsigned(lhs, rhs) >= 0; }
static constexpr const char* diagnostic = "GE operator";
};
struct cmp_gt {
template <typename T> bool operator()(const T& lhs, const T& rhs) const { return lhs > rhs; }
bool operator()(const bytes& lhs, const bytes& rhs) const { return compare_unsigned(lhs, rhs) > 0; }
static constexpr const char* diagnostic = "GT operator";
};
// True if v is between lb and ub, inclusive. Throws or returns false
// (depending on bounds_from_query parameter) if lb > ub.
template <typename T>
static bool check_BETWEEN(const T& v, const T& lb, const T& ub, bool bounds_from_query) {
if (cmp_lt()(ub, lb)) {
if (bounds_from_query) {
throw api_error::validation(
format("BETWEEN operator requires lower_bound <= upper_bound, but {} > {}", lb, ub));
} else {
return false;
}
}
return cmp_ge()(v, lb) && cmp_le()(v, ub);
}
static bool check_BETWEEN(const rjson::value* v, const rjson::value& lb, const rjson::value& ub,
bool v_from_query, bool lb_from_query, bool ub_from_query) {
if ((v && v_from_query && !check_comparable_type(*v)) ||
(lb_from_query && !check_comparable_type(lb)) ||
(ub_from_query && !check_comparable_type(ub))) {
throw api_error::validation("BETWEEN operator allows only the types String, Number, or Binary");
}
if (!v || !v->IsObject() || v->MemberCount() != 1 ||
!lb.IsObject() || lb.MemberCount() != 1 ||
!ub.IsObject() || ub.MemberCount() != 1) {
return false;
}
const auto& kv_v = *v->MemberBegin();
const auto& kv_lb = *lb.MemberBegin();
const auto& kv_ub = *ub.MemberBegin();
bool bounds_from_query = lb_from_query && ub_from_query;
if (kv_lb.name != kv_ub.name) {
if (bounds_from_query) {
throw api_error::validation(
format("BETWEEN operator requires the same type for lower and upper bound; instead got {} and {}",
kv_lb.name, kv_ub.name));
} else {
return false;
}
}
if (kv_v.name != kv_lb.name) { // Cannot compare different types, so v is NOT between lb and ub.
return false;
}
if (kv_v.name == "N") {
const char* diag = "BETWEEN operator";
return check_BETWEEN(unwrap_number(*v, diag), unwrap_number(lb, diag), unwrap_number(ub, diag), bounds_from_query);
}
if (kv_v.name == "S") {
return check_BETWEEN(std::string_view(kv_v.value.GetString(), kv_v.value.GetStringLength()),
std::string_view(kv_lb.value.GetString(), kv_lb.value.GetStringLength()),
std::string_view(kv_ub.value.GetString(), kv_ub.value.GetStringLength()),
bounds_from_query);
}
if (kv_v.name == "B") {
return check_BETWEEN(rjson::base64_decode(kv_v.value), rjson::base64_decode(kv_lb.value), rjson::base64_decode(kv_ub.value), bounds_from_query);
}
if (v_from_query) {
throw api_error::validation(
format("BETWEEN operator requires AttributeValueList elements to be of type String, Number, or Binary; instead got {}",
kv_lb.name));
} else {
return false;
}
}
// Verify one Expect condition on one attribute (whose content is "got")
// for the verify_expected() below.
// This function returns true or false depending on whether the condition
// succeeded - it does not throw ConditionalCheckFailedException.
// However, it may throw ValidationException on input validation errors.
static bool verify_expected_one(const rjson::value& condition, const rjson::value* got) {
const rjson::value* comparison_operator = rjson::find(condition, "ComparisonOperator");
const rjson::value* attribute_value_list = rjson::find(condition, "AttributeValueList");
const rjson::value* value = rjson::find(condition, "Value");
const rjson::value* exists = rjson::find(condition, "Exists");
// There are three types of conditions that Expected supports:
// A value, not-exists, and a comparison of some kind. Each allows
// and requires a different combination of parameters in the request.
if (value) {
if (exists && (!exists->IsBool() || exists->GetBool() != true)) {
throw api_error::validation("Cannot combine Value with Exists!=true");
}
if (comparison_operator) {
throw api_error::validation("Cannot combine Value with ComparisonOperator");
}
return check_EQ(got, *value);
} else if (exists) {
if (comparison_operator) {
throw api_error::validation("Cannot combine Exists with ComparisonOperator");
}
if (!exists->IsBool() || exists->GetBool() != false) {
throw api_error::validation("Exists!=false requires Value");
}
// Remember Exists=false, so we're checking that the attribute does *not* exist:
return !got;
} else {
if (!comparison_operator) {
throw api_error::validation("Missing ComparisonOperator, Value or Exists");
}
comparison_operator_type op = get_comparison_operator(*comparison_operator);
switch (op) {
case comparison_operator_type::EQ:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_EQ(got, (*attribute_value_list)[0]);
case comparison_operator_type::NE:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_NE(got, (*attribute_value_list)[0]);
case comparison_operator_type::LT:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_compare(got, (*attribute_value_list)[0], cmp_lt{}, false, true);
case comparison_operator_type::LE:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_compare(got, (*attribute_value_list)[0], cmp_le{}, false, true);
case comparison_operator_type::GT:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_compare(got, (*attribute_value_list)[0], cmp_gt{}, false, true);
case comparison_operator_type::GE:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_compare(got, (*attribute_value_list)[0], cmp_ge{}, false, true);
case comparison_operator_type::BEGINS_WITH:
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
return check_BEGINS_WITH(got, (*attribute_value_list)[0], false, true);
case comparison_operator_type::IN:
verify_operand_count(attribute_value_list, nonempty(), *comparison_operator);
return check_IN(got, *attribute_value_list);
case comparison_operator_type::IS_NULL:
verify_operand_count(attribute_value_list, empty(), *comparison_operator);
return check_NULL(got);
case comparison_operator_type::NOT_NULL:
verify_operand_count(attribute_value_list, empty(), *comparison_operator);
return check_NOT_NULL(got);
case comparison_operator_type::BETWEEN:
verify_operand_count(attribute_value_list, exact_size(2), *comparison_operator);
return check_BETWEEN(got, (*attribute_value_list)[0], (*attribute_value_list)[1],
false, true, true);
case comparison_operator_type::CONTAINS:
{
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
// Expected's "CONTAINS" has this artificial limitation.
// ConditionExpression's "contains()" does not...
const rjson::value& arg = (*attribute_value_list)[0];
const auto& argtype = (*arg.MemberBegin()).name;
if (argtype != "S" && argtype != "N" && argtype != "B") {
throw api_error::validation(
format("CONTAINS operator requires a single AttributeValue of type String, Number, or Binary, "
"got {} instead", argtype));
}
return check_CONTAINS(got, arg);
}
case comparison_operator_type::NOT_CONTAINS:
{
verify_operand_count(attribute_value_list, exact_size(1), *comparison_operator);
// Expected's "NOT_CONTAINS" has this artificial limitation.
// ConditionExpression's "contains()" does not...
const rjson::value& arg = (*attribute_value_list)[0];
const auto& argtype = (*arg.MemberBegin()).name;
if (argtype != "S" && argtype != "N" && argtype != "B") {
throw api_error::validation(
format("NOT_CONTAINS operator requires a single AttributeValue of type String, Number, or Binary, "
"got {} instead", argtype));
}
return check_NOT_CONTAINS(got, arg);
}
}
throw std::logic_error(format("Internal error: corrupted operator enum: {}", int(op)));
}
}
conditional_operator_type get_conditional_operator(const rjson::value& req) {
const rjson::value* conditional_operator = rjson::find(req, "ConditionalOperator");
if (!conditional_operator) {
return conditional_operator_type::MISSING;
}
if (!conditional_operator->IsString()) {
throw api_error::validation("'ConditionalOperator' parameter, if given, must be a string");
}
auto s = rjson::to_string_view(*conditional_operator);
if (s == "AND") {
return conditional_operator_type::AND;
} else if (s == "OR") {
return conditional_operator_type::OR;
} else {
throw api_error::validation(
format("'ConditionalOperator' parameter must be AND, OR or missing. Found {}.", s));
}
}
// Check if the existing values of the item (previous_item) match the
// conditions given by the Expected and ConditionalOperator parameters
// (if they exist) in the request (an UpdateItem, PutItem or DeleteItem).
// This function can throw a ValidationException API error if there
// are errors in the format of the condition itself.
bool verify_expected(const rjson::value& req, const rjson::value* previous_item) {
const rjson::value* expected = rjson::find(req, "Expected");
auto conditional_operator = get_conditional_operator(req);
if (conditional_operator != conditional_operator_type::MISSING &&
(!expected || (expected->IsObject() && expected->GetObject().ObjectEmpty()))) {
throw api_error::validation("'ConditionalOperator' parameter cannot be specified for missing or empty Expression");
}
if (!expected) {
return true;
}
if (!expected->IsObject()) {
throw api_error::validation("'Expected' parameter, if given, must be an object");
}
bool require_all = conditional_operator != conditional_operator_type::OR;
return verify_condition(*expected, require_all, previous_item);
}
bool verify_condition(const rjson::value& condition, bool require_all, const rjson::value* previous_item) {
for (auto it = condition.MemberBegin(); it != condition.MemberEnd(); ++it) {
const rjson::value* got = nullptr;
if (previous_item) {
got = rjson::find(*previous_item, rjson::to_string_view(it->name));
}
bool success = verify_expected_one(it->value, got);
if (success && !require_all) {
// When !require_all, one success is enough!
return true;
} else if (!success && require_all) {
// When require_all, one failure is enough!
return false;
}
}
// If we got here and require_all, none of the checks failed, so succeed.
// If we got here and !require_all, all of the checks failed, so fail.
return require_all;
}
static bool calculate_primitive_condition(const parsed::primitive_condition& cond,
const rjson::value* previous_item) {
std::vector<rjson::value> calculated_values;
calculated_values.reserve(cond._values.size());
for (const parsed::value& v : cond._values) {
calculated_values.push_back(calculate_value(v,
cond._op == parsed::primitive_condition::type::VALUE ?
calculate_value_caller::ConditionExpressionAlone :
calculate_value_caller::ConditionExpression,
previous_item));
}
switch (cond._op) {
case parsed::primitive_condition::type::BETWEEN:
if (calculated_values.size() != 3) {
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error(format("Wrong number of values {} in BETWEEN primitive_condition", cond._values.size()));
}
return check_BETWEEN(&calculated_values[0], calculated_values[1], calculated_values[2],
cond._values[0].is_constant(), cond._values[1].is_constant(), cond._values[2].is_constant());
case parsed::primitive_condition::type::IN:
return check_IN(calculated_values);
case parsed::primitive_condition::type::VALUE:
if (calculated_values.size() != 1) {
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error(format("Wrong number of values {} in VALUE primitive_condition", cond._values.size()));
}
// Unwrap the boolean wrapped as the value (if it is a boolean)
if (calculated_values[0].IsObject() && calculated_values[0].MemberCount() == 1) {
auto it = calculated_values[0].MemberBegin();
if (it->name == "BOOL" && it->value.IsBool()) {
return it->value.GetBool();
}
}
throw api_error::validation(
format("ConditionExpression: condition results in a non-boolean value: {}",
calculated_values[0]));
default:
// All the rest of the operators have exactly two parameters (and unless
// we have a bug in the parser, that's what we have in the parsed object):
if (calculated_values.size() != 2) {
throw std::logic_error(format("Wrong number of values {} in primitive_condition object", cond._values.size()));
}
}
switch (cond._op) {
case parsed::primitive_condition::type::EQ:
return check_EQ(&calculated_values[0], calculated_values[1]);
case parsed::primitive_condition::type::NE:
return check_NE(&calculated_values[0], calculated_values[1]);
case parsed::primitive_condition::type::GT:
return check_compare(&calculated_values[0], calculated_values[1], cmp_gt{},
cond._values[0].is_constant(), cond._values[1].is_constant());
case parsed::primitive_condition::type::GE:
return check_compare(&calculated_values[0], calculated_values[1], cmp_ge{},
cond._values[0].is_constant(), cond._values[1].is_constant());
case parsed::primitive_condition::type::LT:
return check_compare(&calculated_values[0], calculated_values[1], cmp_lt{},
cond._values[0].is_constant(), cond._values[1].is_constant());
case parsed::primitive_condition::type::LE:
return check_compare(&calculated_values[0], calculated_values[1], cmp_le{},
cond._values[0].is_constant(), cond._values[1].is_constant());
default:
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error(format("Unknown type {} in primitive_condition object", (int)(cond._op)));
}
}
// Check if the existing values of the item (previous_item) match the
// conditions given by the given parsed ConditionExpression.
bool verify_condition_expression(
const parsed::condition_expression& condition_expression,
const rjson::value* previous_item) {
if (condition_expression.empty()) {
return true;
}
bool ret = std::visit(overloaded_functor {
[&] (const parsed::primitive_condition& cond) -> bool {
return calculate_primitive_condition(cond, previous_item);
},
[&] (const parsed::condition_expression::condition_list& list) -> bool {
auto verify_condition = [&] (const parsed::condition_expression& e) {
return verify_condition_expression(e, previous_item);
};
switch (list.op) {
case '&':
return boost::algorithm::all_of(list.conditions, verify_condition);
case '|':
return boost::algorithm::any_of(list.conditions, verify_condition);
default:
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error("bad operator in condition_list");
}
}
}, condition_expression._expression);
return condition_expression._negated ? !ret : ret;
}
}
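
Note on the byte comparison above: cmp_lt and its siblings route bytes through compare_unsigned() because the ordinary operators would treat each byte as signed. The following is a minimal, self-contained sketch of that idea combined with the inclusive BETWEEN check. It is not the project's implementation: byte_string, compare_unsigned_sketch and between_sketch are hypothetical names, and a std::vector of signed bytes merely stands in for the bytes type, whose definition is not part of this diff.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the "bytes" type (assumption: signed elements).
using byte_string = std::vector<std::int8_t>;

// Lexicographic three-way comparison that treats every byte as unsigned.
// Returns a negative value, zero, or a positive value, like memcmp().
inline int compare_unsigned_sketch(const byte_string& a, const byte_string& b) {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        const auto x = static_cast<std::uint8_t>(a[i]);
        const auto y = static_cast<std::uint8_t>(b[i]);
        if (x != y) {
            return x < y ? -1 : 1;
        }
    }
    // All shared bytes are equal, so the shorter string sorts first.
    if (a.size() == b.size()) {
        return 0;
    }
    return a.size() < b.size() ? -1 : 1;
}

// Inclusive range check. If the bounds are reversed, either throw (bounds
// written by the user in the request) or report "not between" (bounds read
// from stored data), mirroring the bounds_from_query flag above.
inline bool between_sketch(const byte_string& v, const byte_string& lb,
                           const byte_string& ub, bool bounds_from_query) {
    if (compare_unsigned_sketch(ub, lb) < 0) {
        if (bounds_from_query) {
            throw std::invalid_argument("BETWEEN requires lower_bound <= upper_bound");
        }
        return false;
    }
    return compare_unsigned_sketch(v, lb) >= 0 && compare_unsigned_sketch(v, ub) <= 0;
}
```

With signed comparison the byte 0x80 (stored as -128) would sort before 0x7f; with the unsigned comparison sketched here it sorts after it, which is the ordering the comment on cmp_lt asks for.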

@@ -1,48 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
/*
* This file contains definitions and functions related to placing conditions
* on Alternator queries (equivalent of CQL's restrictions).
*
* With conditions, it's possible to add criteria to selection requests (Scan, Query)
* and use them for narrowing down the result set, by means of filtering or indexing.
*
* Ref: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Condition.html
*/
#pragma once
#include "cql3/restrictions/statement_restrictions.hh"
#include "serialization.hh"
#include "expressions_types.hh"
namespace alternator {
enum class comparison_operator_type {
EQ, NE, LE, LT, GE, GT, IN, BETWEEN, CONTAINS, NOT_CONTAINS, IS_NULL, NOT_NULL, BEGINS_WITH
};
comparison_operator_type get_comparison_operator(const rjson::value& comparison_operator);
enum class conditional_operator_type {
AND, OR, MISSING
};
conditional_operator_type get_conditional_operator(const rjson::value& req);
bool verify_expected(const rjson::value& req, const rjson::value* previous_item);
bool verify_condition(const rjson::value& condition, bool require_all, const rjson::value* previous_item);
bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2);
bool check_BEGINS_WITH(const rjson::value* v1, const rjson::value& v2, bool v1_from_query, bool v2_from_query);
bool verify_condition_expression(
const parsed::condition_expression& condition_expression,
const rjson::value* previous_item);
}
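
Note on verify_condition(): the require_all flag selects between AND semantics (every check must pass) and OR semantics (one passing check is enough), and an empty set of checks yields the value of require_all itself. The sketch below illustrates only that short-circuit logic with plain callables. evaluate_all_or_any is a hypothetical name, and the std::function-based interface is an assumption made to keep the example standalone; it is not the signature declared above.

```cpp
#include <functional>
#include <vector>

// Evaluate checks in order, stopping as soon as the result is decided.
// require_all == true  : AND semantics, the first failure decides (false).
// require_all == false : OR semantics, the first success decides (true).
inline bool evaluate_all_or_any(const std::vector<std::function<bool()>>& checks,
                                bool require_all) {
    for (const auto& check : checks) {
        const bool success = check();
        if (success && !require_all) {
            return true;   // OR: one success is enough
        }
        if (!success && require_all) {
            return false;  // AND: one failure is enough
        }
    }
    // AND with no failures succeeds; OR with no successes fails.
    return require_all;
}
```

One consequence of this design is that an empty condition list succeeds under AND and fails under OR, the same convention std::all_of and std::any_of follow for empty ranges.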

@@ -1,159 +0,0 @@
/*
* Copyright (C) 2021-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include <seastar/net/dns.hh>
#include "controller.hh"
#include "server.hh"
#include "executor.hh"
#include "rmw_operation.hh"
#include "db/config.hh"
#include "cdc/generation_service.hh"
#include "service/memory_limiter.hh"
#include "auth/service.hh"
#include "service/qos/service_level_controller.hh"
using namespace seastar;
namespace alternator {
static logging::logger logger("alternator_controller");
controller::controller(
sharded<gms::gossiper>& gossiper,
sharded<service::storage_proxy>& proxy,
sharded<service::migration_manager>& mm,
sharded<db::system_distributed_keyspace>& sys_dist_ks,
sharded<cdc::generation_service>& cdc_gen_svc,
sharded<service::memory_limiter>& memory_limiter,
sharded<auth::service>& auth_service,
sharded<qos::service_level_controller>& sl_controller,
const db::config& config)
: _gossiper(gossiper)
, _proxy(proxy)
, _mm(mm)
, _sys_dist_ks(sys_dist_ks)
, _cdc_gen_svc(cdc_gen_svc)
, _memory_limiter(memory_limiter)
, _auth_service(auth_service)
, _sl_controller(sl_controller)
, _config(config)
{
}
sstring controller::name() const {
return "alternator";
}
sstring controller::protocol() const {
return "dynamodb";
}
sstring controller::protocol_version() const {
return version;
}
std::vector<socket_address> controller::listen_addresses() const {
return _listen_addresses;
}
future<> controller::start_server() {
return seastar::async([this] {
_listen_addresses.clear();
auto preferred = _config.listen_interface_prefer_ipv6() ? std::make_optional(net::inet_address::family::INET6) : std::nullopt;
auto family = _config.enable_ipv6_dns_lookup() || preferred ? std::nullopt : std::make_optional(net::inet_address::family::INET);
// Create an smp_service_group to be used for limiting the
// concurrency when forwarding Alternator requests between
// shards - if necessary for LWT.
smp_service_group_config c;
c.max_nonlocal_requests = 5000;
_ssg = create_smp_service_group(c).get0();
rmw_operation::set_default_write_isolation(_config.alternator_write_isolation());
executor::set_default_timeout(std::chrono::milliseconds(_config.alternator_timeout_in_ms()));
net::inet_address addr = utils::resolve(_config.alternator_address, family).get0();
auto get_cdc_metadata = [] (cdc::generation_service& svc) { return std::ref(svc.get_cdc_metadata()); };
_executor.start(std::ref(_gossiper), std::ref(_proxy), std::ref(_mm), std::ref(_sys_dist_ks), sharded_parameter(get_cdc_metadata, std::ref(_cdc_gen_svc)), _ssg.value()).get();
_server.start(std::ref(_executor), std::ref(_proxy), std::ref(_gossiper), std::ref(_auth_service), std::ref(_sl_controller)).get();
// Note: from this point on, if start_server() throws for any reason,
// it must first call stop_server() to stop the executor and server
// services we just started - or Scylla will cause an assertion
// failure when the controller object is destroyed in the exception
// unwinding.
std::optional<uint16_t> alternator_port;
if (_config.alternator_port()) {
alternator_port = _config.alternator_port();
_listen_addresses.push_back({addr, *alternator_port});
}
std::optional<uint16_t> alternator_https_port;
std::optional<tls::credentials_builder> creds;
if (_config.alternator_https_port()) {
alternator_https_port = _config.alternator_https_port();
_listen_addresses.push_back({addr, *alternator_https_port});
creds.emplace();
auto opts = _config.alternator_encryption_options();
if (opts.empty()) {
// Earlier versions mistakenly configured Alternator's
// HTTPS parameters via the "server_encryption_options"
// configuration parameter. We *temporarily* continue
// to allow this, for backward compatibility.
opts = _config.server_encryption_options();
if (!opts.empty()) {
logger.warn("Setting server_encryption_options to configure "
"Alternator's HTTPS encryption is deprecated. Please "
"switch to setting alternator_encryption_options instead.");
}
}
opts.erase("require_client_auth");
opts.erase("truststore");
try {
utils::configure_tls_creds_builder(creds.value(), std::move(opts)).get();
} catch(...) {
logger.error("Failed to set up Alternator TLS credentials: {}", std::current_exception());
stop_server().get();
std::throw_with_nested(std::runtime_error("Failed to set up Alternator TLS credentials"));
}
}
bool alternator_enforce_authorization = _config.alternator_enforce_authorization();
_server.invoke_on_all(
[this, addr, alternator_port, alternator_https_port, creds = std::move(creds), alternator_enforce_authorization] (server& server) mutable {
return server.init(addr, alternator_port, alternator_https_port, creds, alternator_enforce_authorization,
&_memory_limiter.local().get_semaphore(),
_config.max_concurrent_requests_per_shard);
}).handle_exception([this, addr, alternator_port, alternator_https_port] (std::exception_ptr ep) {
logger.error("Failed to set up Alternator HTTP server on {} port {}, TLS port {}: {}",
addr, alternator_port ? std::to_string(*alternator_port) : "OFF", alternator_https_port ? std::to_string(*alternator_https_port) : "OFF", ep);
return stop_server().then([ep = std::move(ep)] { return make_exception_future<>(ep); });
}).then([addr, alternator_port, alternator_https_port] {
logger.info("Alternator server listening on {}, HTTP port {}, HTTPS port {}",
addr, alternator_port ? std::to_string(*alternator_port) : "OFF", alternator_https_port ? std::to_string(*alternator_https_port) : "OFF");
}).get();
});
}
future<> controller::stop_server() {
return seastar::async([this] {
if (!_ssg) {
return;
}
_server.stop().get();
_executor.stop().get();
_listen_addresses.clear();
destroy_smp_service_group(_ssg.value()).get();
});
}
future<> controller::request_stop_server() {
return stop_server();
}
}

@@ -1,94 +0,0 @@
/*
* Copyright (C) 2021-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include <seastar/core/sharded.hh>
#include <seastar/core/smp.hh>
#include "protocol_server.hh"
namespace service {
class storage_proxy;
class migration_manager;
class memory_limiter;
}
namespace db {
class system_distributed_keyspace;
class config;
}
namespace cdc {
class generation_service;
}
namespace gms {
class gossiper;
}
namespace auth {
class service;
}
namespace qos {
class service_level_controller;
}
namespace alternator {
// This is the official DynamoDB API version.
// It represents the last major reorganization of that API, and all the features
// that were added since did NOT increment this version string.
constexpr const char* version = "2012-08-10";
using namespace seastar;
class executor;
class server;
class controller : public protocol_server {
sharded<gms::gossiper>& _gossiper;
sharded<service::storage_proxy>& _proxy;
sharded<service::migration_manager>& _mm;
sharded<db::system_distributed_keyspace>& _sys_dist_ks;
sharded<cdc::generation_service>& _cdc_gen_svc;
sharded<service::memory_limiter>& _memory_limiter;
sharded<auth::service>& _auth_service;
sharded<qos::service_level_controller>& _sl_controller;
const db::config& _config;
std::vector<socket_address> _listen_addresses;
sharded<executor> _executor;
sharded<server> _server;
std::optional<smp_service_group> _ssg;
public:
controller(
sharded<gms::gossiper>& gossiper,
sharded<service::storage_proxy>& proxy,
sharded<service::migration_manager>& mm,
sharded<db::system_distributed_keyspace>& sys_dist_ks,
sharded<cdc::generation_service>& cdc_gen_svc,
sharded<service::memory_limiter>& memory_limiter,
sharded<auth::service>& auth_service,
sharded<qos::service_level_controller>& sl_controller,
const db::config& config);
virtual sstring name() const override;
virtual sstring protocol() const override;
virtual sstring protocol_version() const override;
virtual std::vector<socket_address> listen_addresses() const override;
virtual future<> start_server() override;
virtual future<> stop_server() override;
virtual future<> request_stop_server() override;
};
}

@@ -1,93 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include <seastar/http/httpd.hh>
#include "seastarx.hh"
namespace alternator {
// api_error contains a DynamoDB error message to be returned to the user.
// It can be returned by value (see executor::request_return_type) or thrown.
// DynamoDB's error messages are described in detail in
// https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html
// An error message has an HTTP code (almost always 400), a type, e.g.,
// "ResourceNotFoundException", and a human readable message.
// Eventually alternator::api_handler will convert a returned or thrown
// api_error into a JSON object, and that is returned to the user.
class api_error final : public std::exception {
public:
using status_type = httpd::reply::status_type;
status_type _http_code;
std::string _type;
std::string _msg;
api_error(std::string type, std::string msg, status_type http_code = status_type::bad_request)
: _http_code(std::move(http_code))
, _type(std::move(type))
, _msg(std::move(msg))
{ }
// Factory functions for some common types of DynamoDB API errors
static api_error validation(std::string msg) {
return api_error("ValidationException", std::move(msg));
}
static api_error resource_not_found(std::string msg) {
return api_error("ResourceNotFoundException", std::move(msg));
}
static api_error resource_in_use(std::string msg) {
return api_error("ResourceInUseException", std::move(msg));
}
static api_error invalid_signature(std::string msg) {
return api_error("InvalidSignatureException", std::move(msg));
}
static api_error missing_authentication_token(std::string msg) {
return api_error("MissingAuthenticationTokenException", std::move(msg));
}
static api_error unrecognized_client(std::string msg) {
return api_error("UnrecognizedClientException", std::move(msg));
}
static api_error unknown_operation(std::string msg) {
return api_error("UnknownOperationException", std::move(msg));
}
static api_error access_denied(std::string msg) {
return api_error("AccessDeniedException", std::move(msg));
}
static api_error conditional_check_failed(std::string msg) {
return api_error("ConditionalCheckFailedException", std::move(msg));
}
static api_error expired_iterator(std::string msg) {
return api_error("ExpiredIteratorException", std::move(msg));
}
static api_error trimmed_data_access_exception(std::string msg) {
return api_error("TrimmedDataAccessException", std::move(msg));
}
static api_error request_limit_exceeded(std::string msg) {
return api_error("RequestLimitExceeded", std::move(msg));
}
static api_error serialization(std::string msg) {
return api_error("SerializationException", std::move(msg));
}
static api_error table_not_found(std::string msg) {
return api_error("TableNotFoundException", std::move(msg));
}
static api_error internal(std::string msg) {
return api_error("InternalServerError", std::move(msg), status_type::internal_server_error);
}
// Provide the "std::exception" interface, to make it easier to print this
// exception in log messages. Note that this function is *not* used to
// format the error to send it back to the client - server.cc has
// generate_error_reply() to format an api_error as the DynamoDB protocol
// requires.
virtual const char* what() const noexcept override;
mutable std::string _what_string;
};
}
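
Note on api_error: the header above declares what() together with a mutable _what_string member, but the definition of what() is not part of this diff. The sketch below shows one way such a lazily built message could work, purely as an illustration. api_error_sketch is a hypothetical class, the plain int status code replaces httpd::reply::status_type so that the example needs no Seastar headers, and the message layout is an assumption rather than the format the project uses.

```cpp
#include <exception>
#include <string>
#include <utility>

class api_error_sketch final : public std::exception {
public:
    int _http_code;      // assumption: plain int instead of the HTTP status enum
    std::string _type;   // e.g. "ValidationException"
    std::string _msg;    // human-readable message

    api_error_sketch(std::string type, std::string msg, int http_code = 400)
        : _http_code(http_code)
        , _type(std::move(type))
        , _msg(std::move(msg))
    { }

    // Factory function in the style of the ones declared above.
    static api_error_sketch validation(std::string msg) {
        return api_error_sketch("ValidationException", std::move(msg));
    }

    // Build the text on first use and cache it, so the returned pointer
    // stays valid for as long as the exception object lives.
    const char* what() const noexcept override {
        try {
            if (_what_string.empty()) {
                _what_string = _type + " (HTTP " + std::to_string(_http_code) + "): " + _msg;
            }
            return _what_string.c_str();
        } catch (...) {
            // Allocation failed; fall back to a static string.
            return "api_error_sketch";
        }
    }

private:
    mutable std::string _what_string;
};
```

Caching in a mutable member is what lets a const, noexcept what() return a C string without allocating on every call.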

File diff suppressed because it is too large

@@ -1,242 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include <seastar/core/future.hh>
#include <seastar/http/httpd.hh>
#include "seastarx.hh"
#include <seastar/json/json_elements.hh>
#include <seastar/core/sharded.hh>
#include "service/migration_manager.hh"
#include "service/client_state.hh"
#include "service_permit.hh"
#include "db/timeout_clock.hh"
#include "alternator/error.hh"
#include "stats.hh"
#include "utils/rjson.hh"
namespace db {
class system_distributed_keyspace;
}
namespace query {
class partition_slice;
class result;
}
namespace cql3::selection {
class selection;
}
namespace service {
class storage_proxy;
}
namespace cdc {
class metadata;
}
namespace gms {
class gossiper;
}
namespace alternator {
class rmw_operation;
struct make_jsonable : public json::jsonable {
rjson::value _value;
public:
explicit make_jsonable(rjson::value&& value);
std::string to_json() const override;
};
/**
* Make return type for serializing the object "streamed",
* i.e. directly to the HTTP output stream. Note: this is only useful for
* (very) large objects, because streaming has overhead of its own,
* but for massive lists of return objects it can help avoid
* large allocations and many reallocations.
*/
json::json_return_type make_streamed(rjson::value&&);
struct json_string : public json::jsonable {
std::string _value;
public:
explicit json_string(std::string&& value);
std::string to_json() const override;
};
namespace parsed {
class path;
};
schema_ptr get_table(service::storage_proxy& proxy, const rjson::value& request);
bool is_alternator_keyspace(const sstring& ks_name);
// Wraps the db::get_tags_of_table and throws if the table is missing the tags extension.
const std::map<sstring, sstring>& get_tags_of_table_or_throw(schema_ptr schema);
// An attribute_path_map object is used to hold data for various attribute
// paths (parsed::path) in a hierarchy of attribute paths. Each attribute path
// has a root attribute, which is then modified by member and index operators -
// for example in "a.b[2].c" we have "a" as the root, then ".b" member, then
// "[2]" index, and finally ".c" member.
// Data can be added to an attribute_path_map using the add() function, but
// requires that attributes with data not be *overlapping* or *conflicting*:
//
// 1. Two attribute paths which are identical, or where one is an ancestor
// of the other, are considered *overlapping* and not allowed. If a.b.c has
// data, we can't add more data in a.b.c, in any of its descendants like
// a.b.c.d, or in any of its ancestors like a.b.
//
// 2. Two attribute paths which need the same parent to have both a member and
// an index are considered *conflicting* and not allowed. E.g., if a.b has
// data, you can't add a[1]. The meaning of adding both would be that the
// attribute a is both a map and an array, which isn't sensible.
//
// These two requirements are common to the two places where Alternator uses
// this abstraction to describe how a hierarchical item is to be transformed:
//
// 1. In ProjectionExpression: for filtering from a full top-level attribute
// only the parts which the user asked for in ProjectionExpression.
//
// 2. In UpdateExpression: for taking the previous value of a top-level
// attribute, and modifying it based on the instructions the user
// wrote in UpdateExpression.
template<typename T>
class attribute_path_map_node {
public:
using data_t = T;
// We need the extra unique_ptr<> here because libstdc++ unordered_map
// doesn't work with incomplete types :-(
using members_t = std::unordered_map<std::string, std::unique_ptr<attribute_path_map_node<T>>>;
// The indexes list is sorted because DynamoDB requires handling writes
// beyond the end of a list in index order.
using indexes_t = std::map<unsigned, std::unique_ptr<attribute_path_map_node<T>>>;
// The prohibition on "overlap" and "conflict" explained above means
// that only one of data, members or indexes is non-empty.
std::optional<std::variant<data_t, members_t, indexes_t>> _content;
bool is_empty() const { return !_content; }
bool has_value() const { return _content && std::holds_alternative<data_t>(*_content); }
bool has_members() const { return _content && std::holds_alternative<members_t>(*_content); }
bool has_indexes() const { return _content && std::holds_alternative<indexes_t>(*_content); }
// get_members() assumes that has_members() is true
members_t& get_members() { return std::get<members_t>(*_content); }
const members_t& get_members() const { return std::get<members_t>(*_content); }
indexes_t& get_indexes() { return std::get<indexes_t>(*_content); }
const indexes_t& get_indexes() const { return std::get<indexes_t>(*_content); }
T& get_value() { return std::get<T>(*_content); }
const T& get_value() const { return std::get<T>(*_content); }
};
template<typename T>
using attribute_path_map = std::unordered_map<std::string, attribute_path_map_node<T>>;
using attrs_to_get_node = attribute_path_map_node<std::monostate>;
// attrs_to_get lists which top-level attributes are needed, and possibly also
// which part of the top-level attribute is really needed (when nested
// attribute paths appeared in the query).
// Most code actually uses optional<attrs_to_get>. There, a disengaged
// optional means we should get all attributes, not specific ones.
using attrs_to_get = attribute_path_map<std::monostate>;
class executor : public peering_sharded_service<executor> {
gms::gossiper& _gossiper;
service::storage_proxy& _proxy;
service::migration_manager& _mm;
db::system_distributed_keyspace& _sdks;
cdc::metadata& _cdc_metadata;
// An smp_service_group to be used for limiting the concurrency when
// forwarding Alternator requests between shards - if necessary for LWT.
smp_service_group _ssg;
public:
using client_state = service::client_state;
using request_return_type = std::variant<json::json_return_type, api_error>;
stats _stats;
static constexpr auto ATTRS_COLUMN_NAME = ":attrs";
static constexpr auto KEYSPACE_NAME_PREFIX = "alternator_";
static constexpr std::string_view INTERNAL_TABLE_PREFIX = ".scylla.alternator.";
executor(gms::gossiper& gossiper, service::storage_proxy& proxy, service::migration_manager& mm, db::system_distributed_keyspace& sdks, cdc::metadata& cdc_metadata, smp_service_group ssg)
: _gossiper(gossiper), _proxy(proxy), _mm(mm), _sdks(sdks), _cdc_metadata(cdc_metadata), _ssg(ssg) {}
future<request_return_type> create_table(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> describe_table(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> delete_table(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> update_table(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> put_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> get_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> delete_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> update_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> list_tables(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> scan(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> describe_endpoints(client_state& client_state, service_permit permit, rjson::value request, std::string host_header);
future<request_return_type> batch_write_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> batch_get_item(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> query(client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value request);
future<request_return_type> tag_resource(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> untag_resource(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> list_tags_of_resource(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> update_time_to_live(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> describe_time_to_live(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> list_streams(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> describe_stream(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> get_shard_iterator(client_state& client_state, service_permit permit, rjson::value request);
future<request_return_type> get_records(client_state& client_state, tracing::trace_state_ptr, service_permit permit, rjson::value request);
future<request_return_type> describe_continuous_backups(client_state& client_state, service_permit permit, rjson::value request);
future<> start();
future<> stop() { return make_ready_future<>(); }
static sstring table_name(const schema&);
static db::timeout_clock::time_point default_timeout();
static void set_default_timeout(db::timeout_clock::duration timeout);
private:
static db::timeout_clock::duration s_default_timeout;
public:
static schema_ptr find_table(service::storage_proxy&, const rjson::value& request);
private:
friend class rmw_operation;
static void describe_key_schema(rjson::value& parent, const schema&, std::unordered_map<std::string,std::string> * = nullptr);
static void describe_key_schema(rjson::value& parent, const schema& schema, std::unordered_map<std::string,std::string>&);
public:
static std::optional<rjson::value> describe_single_item(schema_ptr,
const query::partition_slice&,
const cql3::selection::selection&,
const query::result&,
const std::optional<attrs_to_get>&);
static std::vector<rjson::value> describe_multi_item(schema_ptr schema,
const query::partition_slice& slice,
const cql3::selection::selection& selection,
const query::result& query_result,
const std::optional<attrs_to_get>& attrs_to_get);
static void describe_single_item(const cql3::selection::selection&,
const std::vector<bytes_opt>&,
const std::optional<attrs_to_get>&,
rjson::value&,
bool = false);
static void add_stream_options(const rjson::value& stream_spec, schema_builder&, service::storage_proxy& sp);
static void supplement_table_info(rjson::value& descr, const schema& schema, service::storage_proxy& sp);
static void supplement_table_stream_info(rjson::value& descr, const schema& schema, service::storage_proxy& sp);
};
}
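The *overlapping* and *conflicting* rules documented for attribute_path_map in the header above can be sketched as a small insertion routine. This is only an illustration under simplifying assumptions: `sketch_node`, `sketch_add()` and `sketch_rejects()` are hypothetical names, the stored data is reduced to `std::monostate`, and errors are reported with `std::invalid_argument` rather than an api_error.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <variant>
#include <vector>

// Hypothetical, simplified model of attribute_path_map_node<std::monostate>.
struct sketch_node {
    using members_t = std::unordered_map<std::string, std::unique_ptr<sketch_node>>;
    using indexes_t = std::map<unsigned, std::unique_ptr<sketch_node>>;
    // monostate: this path has data. Otherwise the node holds its children.
    std::optional<std::variant<std::monostate, members_t, indexes_t>> content;
};
using sketch_map = std::unordered_map<std::string, sketch_node>;
using sketch_op = std::variant<std::string, unsigned>;   // ".member" or "[index]"

// Returns the child of `node` selected by `op`, creating it if needed.
sketch_node& sketch_child(sketch_node& node, const sketch_op& op) {
    if (node.content && std::holds_alternative<std::monostate>(*node.content)) {
        throw std::invalid_argument("overlapping paths: an ancestor already has data");
    }
    if (const std::string* member = std::get_if<std::string>(&op)) {
        if (!node.content) {
            node.content = sketch_node::members_t();
        }
        auto* members = std::get_if<sketch_node::members_t>(&*node.content);
        if (!members) {
            throw std::invalid_argument("conflicting paths: parent is already used as a list");
        }
        auto& child = (*members)[*member];
        if (!child) {
            child = std::make_unique<sketch_node>();
        }
        return *child;
    }
    if (!node.content) {
        node.content = sketch_node::indexes_t();
    }
    auto* indexes = std::get_if<sketch_node::indexes_t>(&*node.content);
    if (!indexes) {
        throw std::invalid_argument("conflicting paths: parent is already used as a map");
    }
    auto& child = (*indexes)[std::get<unsigned>(op)];
    if (!child) {
        child = std::make_unique<sketch_node>();
    }
    return *child;
}

// Marks the path root + ops as holding data, enforcing both rules.
void sketch_add(sketch_map& paths, const std::string& root, const std::vector<sketch_op>& ops) {
    sketch_node* node = &paths[root];
    for (const sketch_op& op : ops) {
        node = &sketch_child(*node, op);
    }
    if (node->content) {
        throw std::invalid_argument("overlapping paths: this path or a descendant already has data");
    }
    node->content = std::monostate();
}

// Convenience wrapper: true if sketch_add() refuses the path.
bool sketch_rejects(sketch_map& paths, const std::string& root, const std::vector<sketch_op>& ops) {
    try {
        sketch_add(paths, root, ops);
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}

inline void sketch_path_map_demo() {
    sketch_map m;
    sketch_add(m, "a", {std::string("b"), std::string("c")});                                 // a.b.c
    assert(sketch_rejects(m, "a", {std::string("b"), std::string("c"), std::string("d")}));   // overlap
    assert(sketch_rejects(m, "a", {std::string("b")}));                                       // overlap
    assert(sketch_rejects(m, "a", {1u}));                                                     // conflict
    assert(!sketch_rejects(m, "a", {std::string("b"), std::string("e")}));                    // sibling is fine
}
```

Because a node holds exactly one of data, members or indexes, both rules reduce to checking which alternative the variant currently holds.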

@@ -1,740 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include "expressions.hh"
#include "serialization.hh"
#include "utils/base64.hh"
#include "conditions.hh"
#include "alternator/expressionsLexer.hpp"
#include "alternator/expressionsParser.hpp"
#include "utils/overloaded_functor.hh"
#include "error.hh"
#include "seastarx.hh"
#include <seastar/core/print.hh>
#include <seastar/util/log.hh>
#include <boost/algorithm/cxx11/any_of.hpp>
#include <boost/algorithm/cxx11/all_of.hpp>
#include <functional>
#include <unordered_map>
namespace alternator {
template <typename Func, typename Result = std::result_of_t<Func(expressionsParser&)>>
Result do_with_parser(std::string_view input, Func&& f) {
expressionsLexer::InputStreamType input_stream{
reinterpret_cast<const ANTLR_UINT8*>(input.data()),
ANTLR_ENC_UTF8,
static_cast<ANTLR_UINT32>(input.size()),
nullptr };
expressionsLexer lexer(&input_stream);
expressionsParser::TokenStreamType tstream(ANTLR_SIZE_HINT, lexer.get_tokSource());
expressionsParser parser(&tstream);
auto result = f(parser);
return result;
}
parsed::update_expression
parse_update_expression(std::string_view query) {
try {
return do_with_parser(query, std::mem_fn(&expressionsParser::update_expression));
} catch (...) {
throw expressions_syntax_error(format("Failed parsing UpdateExpression '{}': {}", query, std::current_exception()));
}
}
std::vector<parsed::path>
parse_projection_expression(std::string_view query) {
try {
return do_with_parser(query, std::mem_fn(&expressionsParser::projection_expression));
} catch (...) {
throw expressions_syntax_error(format("Failed parsing ProjectionExpression '{}': {}", query, std::current_exception()));
}
}
parsed::condition_expression
parse_condition_expression(std::string_view query) {
try {
return do_with_parser(query, std::mem_fn(&expressionsParser::condition_expression));
} catch (...) {
throw expressions_syntax_error(format("Failed parsing ConditionExpression '{}': {}", query, std::current_exception()));
}
}
namespace parsed {
void update_expression::add(update_expression::action a) {
std::visit(overloaded_functor {
[&] (action::set&) { seen_set = true; },
[&] (action::remove&) { seen_remove = true; },
[&] (action::add&) { seen_add = true; },
[&] (action::del&) { seen_del = true; }
}, a._action);
_actions.push_back(std::move(a));
}
void update_expression::append(update_expression other) {
if ((seen_set && other.seen_set) ||
(seen_remove && other.seen_remove) ||
(seen_add && other.seen_add) ||
(seen_del && other.seen_del)) {
throw expressions_syntax_error("Each of SET, REMOVE, ADD, DELETE may only appear once in UpdateExpression");
}
std::move(other._actions.begin(), other._actions.end(), std::back_inserter(_actions));
seen_set |= other.seen_set;
seen_remove |= other.seen_remove;
seen_add |= other.seen_add;
seen_del |= other.seen_del;
}
void condition_expression::append(condition_expression&& a, char op) {
std::visit(overloaded_functor {
[&] (condition_list& x) {
// If 'a' has a single condition, we could, instead of inserting
// it, insert its single condition (possibly negated if a._negated).
// But considering that we don't evaluate these expressions many
// times, this optimization is not worth the extra code complexity.
if (!x.conditions.empty() && x.op != op) {
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error("condition_expression::append called with mixed operators");
}
x.conditions.push_back(std::move(a));
x.op = op;
},
[&] (primitive_condition& x) {
// Shouldn't happen unless we have a bug in the parser
throw std::logic_error("condition_expression::append called on primitive_condition");
}
}, _expression);
}
void path::check_depth_limit() {
if (1 + _operators.size() > depth_limit) {
throw expressions_syntax_error(format("Document path exceeded {} nesting levels", depth_limit));
}
}
std::ostream& operator<<(std::ostream& os, const path& p) {
os << p.root();
for (const auto& op : p.operators()) {
std::visit(overloaded_functor {
[&] (const std::string& member) {
os << '.' << member;
},
[&] (unsigned index) {
os << '[' << index << ']';
}
}, op);
}
return os;
}
} // namespace parsed
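The path printing and depth check above both walk a list of operators that are either a member name or a list index. A minimal standalone sketch of that representation follows. `sketch_path`, `sketch_overloaded` (standing in for overloaded_functor) and the `limit` parameter are illustrative assumptions; the real depth_limit value is not shown in this file.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <variant>
#include <vector>

// Minimal stand-in for overloaded_functor: merges several lambdas into one visitor.
template <typename... Fs> struct sketch_overloaded : Fs... { using Fs::operator()...; };
template <typename... Fs> sketch_overloaded(Fs...) -> sketch_overloaded<Fs...>;

// Hypothetical, simplified model of parsed::path.
struct sketch_path {
    std::string root;
    std::vector<std::variant<std::string, unsigned>> operators;
};

// Formats the path the same way as the operator<< above: a.b[2].c
std::string sketch_to_string(const sketch_path& p) {
    std::ostringstream os;
    os << p.root;
    for (const auto& op : p.operators) {
        std::visit(sketch_overloaded {
            [&] (const std::string& member) { os << '.' << member; },
            [&] (unsigned index) { os << '[' << index << ']'; }
        }, op);
    }
    return os.str();
}

// The root counts as one nesting level, each operator as one more.
bool sketch_within_depth(const sketch_path& p, std::size_t limit) {
    return 1 + p.operators.size() <= limit;
}

inline void sketch_path_demo() {
    sketch_path p{"a", {std::string("b"), 2u, std::string("c")}};
    assert(sketch_to_string(p) == "a.b[2].c");
    assert(sketch_within_depth(p, 4));
    assert(!sketch_within_depth(p, 3));
}
```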
// The following resolve_*() functions resolve references in parsed
// expressions of different types. Resolving a parsed expression means
// replacing:
// 1. In parsed::path objects, replace references like "#name" with the
// attribute name from ExpressionAttributeNames,
// 2. In parsed::constant objects, replace references like ":value" with
// the value from ExpressionAttributeValues.
// These functions also track which name and value references were used, to
// allow complaining if some remain unused.
// Note that the resolve_*() functions modify the expressions in-place,
// so if we ever intend to cache parsed expressions, we need to pass a copy
// into these functions.
//
// Doing the "resolving" stage before the evaluation stage has two benefits.
// First, it allows us to be compatible with DynamoDB in catching unused
// names and values (see issue #6572). Second, in the FilterExpression case,
// we need to resolve the expression just once but then use it many times
// (once for each item to be filtered).
static std::optional<std::string> resolve_path_component(const std::string& column_name,
const rjson::value* expression_attribute_names,
std::unordered_set<std::string>& used_attribute_names) {
if (column_name.size() > 0 && column_name.front() == '#') {
if (!expression_attribute_names) {
throw api_error::validation(
format("ExpressionAttributeNames missing, entry '{}' required by expression", column_name));
}
const rjson::value* value = rjson::find(*expression_attribute_names, column_name);
if (!value || !value->IsString()) {
throw api_error::validation(
format("ExpressionAttributeNames missing entry '{}' required by expression", column_name));
}
used_attribute_names.emplace(column_name);
return std::string(rjson::to_string_view(*value));
}
return std::nullopt;
}
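The resolving step described above, replacing "#name" references and remembering which ones were used, can be sketched without rjson. In this sketch a `std::unordered_map` stands in for ExpressionAttributeNames, `std::invalid_argument` stands in for api_error::validation, and `sketch_unused_names()` is a hypothetical helper showing how the recorded set could later be used to detect unused entries; none of these names come from the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Assumption: a plain map stands in for the ExpressionAttributeNames JSON object.
using sketch_names = std::unordered_map<std::string, std::string>;

// Returns the replacement for a "#name" reference, or nullopt if the
// component is a literal attribute name that needs no resolving.
std::optional<std::string> sketch_resolve_component(const std::string& component,
        const sketch_names* names, std::unordered_set<std::string>& used) {
    if (component.empty() || component.front() != '#') {
        return std::nullopt;
    }
    if (!names) {
        throw std::invalid_argument("ExpressionAttributeNames missing, needed for " + component);
    }
    auto it = names->find(component);
    if (it == names->end()) {
        throw std::invalid_argument("ExpressionAttributeNames missing entry " + component);
    }
    used.insert(component);   // remember that this reference was needed
    return it->second;
}

// Hypothetical follow-up check: which supplied references were never used?
std::vector<std::string> sketch_unused_names(const sketch_names& names,
        const std::unordered_set<std::string>& used) {
    std::vector<std::string> unused;
    for (const auto& entry : names) {
        if (used.count(entry.first) == 0) {
            unused.push_back(entry.first);
        }
    }
    std::sort(unused.begin(), unused.end());   // stable output order
    return unused;
}

inline void sketch_resolve_demo() {
    sketch_names names{{"#n", "name"}, {"#x", "extra"}};
    std::unordered_set<std::string> used;
    assert(sketch_resolve_component("#n", &names, used) == std::optional<std::string>("name"));
    assert(!sketch_resolve_component("age", &names, used).has_value());
    assert(sketch_unused_names(names, used) == std::vector<std::string>{"#x"});
}
```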
static void resolve_path(parsed::path& p,
const rjson::value* expression_attribute_names,
std::unordered_set<std::string>& used_attribute_names) {
std::optional<std::string> r = resolve_path_component(p.root(), expression_attribute_names, used_attribute_names);
if (r) {
p.set_root(std::move(*r));
}
for (auto& op : p.operators()) {
std::visit(overloaded_functor {
[&] (std::string& s) {
r = resolve_path_component(s, expression_attribute_names, used_attribute_names);
if (r) {
s = std::move(*r);
}
},
[&] (unsigned index) {
// nothing to resolve
}
}, op);
}
}
static void resolve_constant(parsed::constant& c,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_values) {
std::visit(overloaded_functor {
[&] (const std::string& valref) {
if (!expression_attribute_values) {
throw api_error::validation(
format("ExpressionAttributeValues missing, entry '{}' required by expression", valref));
}
const rjson::value* value = rjson::find(*expression_attribute_values, valref);
if (!value) {
throw api_error::validation(
format("ExpressionAttributeValues missing entry '{}' required by expression", valref));
}
if (value->IsNull()) {
throw api_error::validation(
format("ExpressionAttributeValues null value for entry '{}' required by expression", valref));
}
validate_value(*value, "ExpressionAttributeValues");
used_attribute_values.emplace(valref);
c.set(*value);
},
[&] (const parsed::constant::literal& lit) {
// Nothing to do, already resolved
}
}, c._value);
}
void resolve_value(parsed::value& rhs,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values) {
std::visit(overloaded_functor {
[&] (parsed::constant& c) {
resolve_constant(c, expression_attribute_values, used_attribute_values);
},
[&] (parsed::value::function_call& f) {
for (parsed::value& value : f._parameters) {
resolve_value(value, expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
}
},
[&] (parsed::path& p) {
resolve_path(p, expression_attribute_names, used_attribute_names);
}
}, rhs._value);
}
void resolve_set_rhs(parsed::set_rhs& rhs,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values) {
resolve_value(rhs._v1, expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
if (rhs._op != 'v') {
resolve_value(rhs._v2, expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
}
}
void resolve_update_expression(parsed::update_expression& ue,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values) {
for (parsed::update_expression::action& action : ue.actions()) {
resolve_path(action._path, expression_attribute_names, used_attribute_names);
std::visit(overloaded_functor {
[&] (parsed::update_expression::action::set& a) {
resolve_set_rhs(a._rhs, expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
},
[&] (parsed::update_expression::action::remove& a) {
// nothing to do
},
[&] (parsed::update_expression::action::add& a) {
resolve_constant(a._valref, expression_attribute_values, used_attribute_values);
},
[&] (parsed::update_expression::action::del& a) {
resolve_constant(a._valref, expression_attribute_values, used_attribute_values);
}
}, action._action);
}
}
static void resolve_primitive_condition(parsed::primitive_condition& pc,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values) {
for (parsed::value& value : pc._values) {
resolve_value(value,
expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
}
}
void resolve_condition_expression(parsed::condition_expression& ce,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values) {
std::visit(overloaded_functor {
[&] (parsed::primitive_condition& cond) {
resolve_primitive_condition(cond,
expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
},
[&] (parsed::condition_expression::condition_list& list) {
for (parsed::condition_expression& cond : list.conditions) {
resolve_condition_expression(cond,
expression_attribute_names, expression_attribute_values,
used_attribute_names, used_attribute_values);
}
}
}, ce._expression);
}
void resolve_projection_expression(std::vector<parsed::path>& pe,
const rjson::value* expression_attribute_names,
std::unordered_set<std::string>& used_attribute_names) {
for (parsed::path& p : pe) {
resolve_path(p, expression_attribute_names, used_attribute_names);
}
}
// condition_expression_on() checks whether a condition_expression places any
// condition on the given attribute. It can be useful, for example, for
// checking whether the condition tries to restrict a key column.
static bool value_on(const parsed::value& v, std::string_view attribute) {
return std::visit(overloaded_functor {
[&] (const parsed::constant& c) {
return false;
},
[&] (const parsed::value::function_call& f) {
for (const parsed::value& value : f._parameters) {
if (value_on(value, attribute)) {
return true;
}
}
return false;
},
[&] (const parsed::path& p) {
return p.root() == attribute;
}
}, v._value);
}
static bool primitive_condition_on(const parsed::primitive_condition& pc, std::string_view attribute) {
for (const parsed::value& value : pc._values) {
if (value_on(value, attribute)) {
return true;
}
}
return false;
}
bool condition_expression_on(const parsed::condition_expression& ce, std::string_view attribute) {
return std::visit(overloaded_functor {
[&] (const parsed::primitive_condition& cond) {
return primitive_condition_on(cond, attribute);
},
[&] (const parsed::condition_expression::condition_list& list) {
for (const parsed::condition_expression& cond : list.conditions) {
if (condition_expression_on(cond, attribute)) {
return true;
}
}
return false;
}
}, ce._expression);
}
// for_condition_expression_on() runs a given function over all the attributes
// mentioned in the expression. If the same attribute is mentioned more than
// once, the function will be called more than once for the same attribute.
static void for_value_on(const parsed::value& v, const noncopyable_function<void(std::string_view)>& func) {
std::visit(overloaded_functor {
[&] (const parsed::constant& c) { },
[&] (const parsed::value::function_call& f) {
for (const parsed::value& value : f._parameters) {
for_value_on(value, func);
}
},
[&] (const parsed::path& p) {
func(p.root());
}
}, v._value);
}
void for_condition_expression_on(const parsed::condition_expression& ce, const noncopyable_function<void(std::string_view)>& func) {
std::visit(overloaded_functor {
[&] (const parsed::primitive_condition& cond) {
for (const parsed::value& value : cond._values) {
for_value_on(value, func);
}
},
[&] (const parsed::condition_expression::condition_list& list) {
for (const parsed::condition_expression& cond : list.conditions) {
for_condition_expression_on(cond, func);
}
}
}, ce._expression);
}
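The two traversals above (condition_expression_on() and for_condition_expression_on()) share one idea: recurse through function-call parameters and report the root of every path. A minimal sketch of that idea follows. The `sketch_*` types are hypothetical simplifications of parsed::value, and `std::function` is used here in place of Seastar's noncopyable_function. Unlike the original condition_expression_on(), `sketch_mentions()` does not stop at the first match.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>
#include <variant>
#include <vector>

// Hypothetical, simplified model of parsed::value.
struct sketch_value;
struct sketch_constant { int v; };
struct sketch_attr_path { std::string root; };
struct sketch_call {
    std::string name;
    std::vector<sketch_value> params;
};
struct sketch_value {
    std::variant<sketch_constant, sketch_call, sketch_attr_path> v;
};

// Calls func once for every path mentioned in val, including repeats.
void sketch_for_each_attribute(const sketch_value& val,
        const std::function<void(std::string_view)>& func) {
    if (const auto* call = std::get_if<sketch_call>(&val.v)) {
        for (const sketch_value& param : call->params) {
            sketch_for_each_attribute(param, func);
        }
    } else if (const auto* path = std::get_if<sketch_attr_path>(&val.v)) {
        func(path->root);
    }
    // Constants mention no attribute, so there is nothing to report.
}

// True if val refers to the given top-level attribute anywhere.
bool sketch_mentions(const sketch_value& val, std::string_view attribute) {
    bool found = false;
    sketch_for_each_attribute(val, [&] (std::string_view a) {
        found = found || a == attribute;
    });
    return found;
}

inline void sketch_traversal_demo() {
    // Models f(a, 1, g(b, a)); the function names are made up.
    sketch_value inner{sketch_call{"g", {sketch_value{sketch_attr_path{"b"}},
                                         sketch_value{sketch_attr_path{"a"}}}}};
    sketch_value expr{sketch_call{"f", {sketch_value{sketch_attr_path{"a"}},
                                        sketch_value{sketch_constant{1}},
                                        inner}}};
    std::map<std::string, int> counts;
    sketch_for_each_attribute(expr, [&] (std::string_view a) { ++counts[std::string(a)]; });
    assert(counts["a"] == 2);   // mentioned twice, reported twice
    assert(counts["b"] == 1);
    assert(sketch_mentions(expr, "b"));
    assert(!sketch_mentions(expr, "c"));
}
```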
// The following calculate_value() functions calculate, or evaluate, a parsed
// expression. The parsed expression is assumed to have been "resolved", with
// the matching resolve_* function.
// calculate_size() is ConditionExpression's size() function, i.e., it takes
// a JSON-encoded value and returns its "size" as defined differently for the
// different types - also as a JSON-encoded number.
// It returns a JSON-encoded "null" value if this value's type has no size
// defined. Comparisons against this non-numeric value will later fail.
static rjson::value calculate_size(const rjson::value& v) {
// NOTE: If v is improperly formatted for our JSON value encoding, it
// must come from the request itself, not from the database, so it makes
// sense to throw a ValidationException if we see such a problem.
if (!v.IsObject() || v.MemberCount() != 1) {
throw api_error::validation(format("invalid object: {}", v));
}
auto it = v.MemberBegin();
int ret;
if (it->name == "S") {
if (!it->value.IsString()) {
throw api_error::validation(format("invalid string: {}", v));
}
ret = it->value.GetStringLength();
} else if (it->name == "NS" || it->name == "SS" || it->name == "BS" || it->name == "L") {
if (!it->value.IsArray()) {
throw api_error::validation(format("invalid set: {}", v));
}
ret = it->value.Size();
} else if (it->name == "M") {
if (!it->value.IsObject()) {
throw api_error::validation(format("invalid map: {}", v));
}
ret = it->value.MemberCount();
} else if (it->name == "B") {
if (!it->value.IsString()) {
throw api_error::validation(format("invalid byte string: {}", v));
}
ret = base64_decoded_len(rjson::to_string_view(it->value));
} else {
rjson::value json_ret = rjson::empty_object();
rjson::add(json_ret, "null", rjson::value(true));
return json_ret;
}
rjson::value json_ret = rjson::empty_object();
rjson::add(json_ret, "N", rjson::from_string(std::to_string(ret)));
return json_ret;
}
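The per-type size rules implemented by calculate_size() can be sketched without the JSON encoding. Here `sketch_attr` is a hypothetical flattened representation (type tag plus payload), `std::nullopt` plays the role of the JSON "null" result, and `sketch_base64_decoded_len()` is an assumed simplification that only handles padded input; it is not the project's base64_decoded_len(). Maps ("M") are left out of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical flattened attribute value: a type tag plus its payload.
struct sketch_attr {
    std::string type;                 // "S", "B", "SS", "NS", "BS", "L", "N", ...
    std::string scalar;               // payload for "S" and "B"
    std::vector<std::string> items;   // payload for sets and lists
};

// Decoded byte count of padded base64 text (assumption: input length is a multiple of 4).
std::size_t sketch_base64_decoded_len(std::string_view s) {
    if (s.size() % 4 != 0) {
        throw std::invalid_argument("malformed base64");
    }
    if (s.empty()) {
        return 0;
    }
    std::size_t padding = 0;
    if (s[s.size() - 1] == '=') {
        ++padding;
        if (s[s.size() - 2] == '=') {
            ++padding;
        }
    }
    return s.size() / 4 * 3 - padding;
}

// Returns the size, or nullopt for types that have no size defined.
std::optional<std::size_t> sketch_size(const sketch_attr& v) {
    if (v.type == "S") {
        return v.scalar.size();                        // length in bytes
    }
    if (v.type == "B") {
        return sketch_base64_decoded_len(v.scalar);    // length of the decoded bytes
    }
    if (v.type == "NS" || v.type == "SS" || v.type == "BS" || v.type == "L") {
        return v.items.size();                         // number of elements
    }
    return std::nullopt;
}

inline void sketch_size_demo() {
    assert(sketch_size(sketch_attr{"S", "hello", {}}) == std::optional<std::size_t>(5));
    assert(sketch_size(sketch_attr{"B", "aGVsbG8=", {}}) == std::optional<std::size_t>(5));
    assert(sketch_size(sketch_attr{"SS", "", {"a", "b", "c"}}) == std::optional<std::size_t>(3));
    assert(!sketch_size(sketch_attr{"N", "42", {}}).has_value());
}
```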
static const rjson::value& calculate_value(const parsed::constant& c) {
return std::visit(overloaded_functor {
[&] (const parsed::constant::literal& v) -> const rjson::value& {
return *v;
},
[&] (const std::string& valref) -> const rjson::value& {
// Shouldn't happen, we should have called resolve_value() earlier
// and replaced the value reference by the literal constant.
throw std::logic_error("calculate_value() called before resolve_value()");
}
}, c._value);
}
static rjson::value to_bool_json(bool b) {
rjson::value json_ret = rjson::empty_object();
rjson::add(json_ret, "BOOL", rjson::value(b));
return json_ret;
}
static bool known_type(std::string_view type) {
static thread_local const std::unordered_set<std::string_view> types = {
"N", "S", "B", "NS", "SS", "BS", "L", "M", "NULL", "BOOL"
};
return types.contains(type);
}
using function_handler_type = rjson::value(calculate_value_caller, const rjson::value*, const parsed::value::function_call&);
static const
std::unordered_map<std::string_view, function_handler_type*> function_handlers {
{"list_append", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::UpdateExpression) {
throw api_error::validation(
format("{}: list_append() not allowed here", caller));
}
if (f._parameters.size() != 2) {
throw api_error::validation(
format("{}: list_append() accepts 2 parameters, got {}", caller, f._parameters.size()));
}
rjson::value v1 = calculate_value(f._parameters[0], caller, previous_item);
rjson::value v2 = calculate_value(f._parameters[1], caller, previous_item);
rjson::value ret = list_concatenate(v1, v2);
if (ret.IsNull()) {
throw api_error::validation("UpdateExpression: list_append() given a non-list");
}
return ret;
}
},
{"if_not_exists", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::UpdateExpression) {
throw api_error::validation(
format("{}: if_not_exists() not allowed here", caller));
}
if (f._parameters.size() != 2) {
throw api_error::validation(
format("{}: if_not_exists() accepts 2 parameters, got {}", caller, f._parameters.size()));
}
if (!std::holds_alternative<parsed::path>(f._parameters[0]._value)) {
throw api_error::validation(
format("{}: if_not_exists() must include path as its first argument", caller));
}
rjson::value v1 = calculate_value(f._parameters[0], caller, previous_item);
rjson::value v2 = calculate_value(f._parameters[1], caller, previous_item);
return v1.IsNull() ? std::move(v2) : std::move(v1);
}
},
{"size", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpression) {
throw api_error::validation(
format("{}: size() not allowed here", caller));
}
if (f._parameters.size() != 1) {
throw api_error::validation(
format("{}: size() accepts 1 parameter, got {}", caller, f._parameters.size()));
}
rjson::value v = calculate_value(f._parameters[0], caller, previous_item);
return calculate_size(v);
}
},
{"attribute_exists", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpressionAlone) {
throw api_error::validation(
format("{}: attribute_exists() not allowed here", caller));
}
if (f._parameters.size() != 1) {
throw api_error::validation(
format("{}: attribute_exists() accepts 1 parameter, got {}", caller, f._parameters.size()));
}
if (!std::holds_alternative<parsed::path>(f._parameters[0]._value)) {
throw api_error::validation(
format("{}: attribute_exists()'s parameter must be a path", caller));
}
rjson::value v = calculate_value(f._parameters[0], caller, previous_item);
return to_bool_json(!v.IsNull());
}
},
{"attribute_not_exists", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpressionAlone) {
throw api_error::validation(
format("{}: attribute_not_exists() not allowed here", caller));
}
if (f._parameters.size() != 1) {
throw api_error::validation(
format("{}: attribute_not_exists() accepts 1 parameter, got {}", caller, f._parameters.size()));
}
if (!std::holds_alternative<parsed::path>(f._parameters[0]._value)) {
throw api_error::validation(
format("{}: attribute_not_exists()'s parameter must be a path", caller));
}
rjson::value v = calculate_value(f._parameters[0], caller, previous_item);
return to_bool_json(v.IsNull());
}
},
{"attribute_type", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpressionAlone) {
throw api_error::validation(
format("{}: attribute_type() not allowed here", caller));
}
if (f._parameters.size() != 2) {
throw api_error::validation(
format("{}: attribute_type() accepts 2 parameters, got {}", caller, f._parameters.size()));
}
// There is no real reason for the following check (not
// allowing the type to come from a document attribute), but
// DynamoDB does this check, so we do too...
if (!f._parameters[1].is_constant()) {
throw api_error::validation(
format("{}: attribute_type()'s second parameter must be an expression attribute", caller));
}
rjson::value v0 = calculate_value(f._parameters[0], caller, previous_item);
rjson::value v1 = calculate_value(f._parameters[1], caller, previous_item);
if (v1.IsObject() && v1.MemberCount() == 1 && v1.MemberBegin()->name == "S") {
// If the type parameter is not one of the legal types
// we should generate an error, not a failed condition:
if (!known_type(rjson::to_string_view(v1.MemberBegin()->value))) {
throw api_error::validation(
format("{}: attribute_types()'s second parameter, {}, is not a known type",
caller, v1.MemberBegin()->value));
}
if (v0.IsObject() && v0.MemberCount() == 1) {
return to_bool_json(v1.MemberBegin()->value == v0.MemberBegin()->name);
} else {
return to_bool_json(false);
}
} else {
throw api_error::validation(
format("{}: attribute_type() second parameter must refer to a string, got {}", caller, v1));
}
}
},
{"begins_with", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpressionAlone) {
throw api_error::validation(
format("{}: begins_with() not allowed here", caller));
}
if (f._parameters.size() != 2) {
throw api_error::validation(
format("{}: begins_with() accepts 2 parameters, got {}", caller, f._parameters.size()));
}
rjson::value v1 = calculate_value(f._parameters[0], caller, previous_item);
rjson::value v2 = calculate_value(f._parameters[1], caller, previous_item);
return to_bool_json(check_BEGINS_WITH(v1.IsNull() ? nullptr : &v1, v2,
f._parameters[0].is_constant(), f._parameters[1].is_constant()));
}
},
{"contains", [] (calculate_value_caller caller, const rjson::value* previous_item, const parsed::value::function_call& f) {
if (caller != calculate_value_caller::ConditionExpressionAlone) {
throw api_error::validation(
format("{}: contains() not allowed here", caller));
}
if (f._parameters.size() != 2) {
throw api_error::validation(
format("{}: contains() accepts 2 parameters, got {}", caller, f._parameters.size()));
}
rjson::value v1 = calculate_value(f._parameters[0], caller, previous_item);
rjson::value v2 = calculate_value(f._parameters[1], caller, previous_item);
return to_bool_json(check_CONTAINS(v1.IsNull() ? nullptr : &v1, v2));
}
},
};
// Given a parsed::path and an item read from the table, extract the value
// of a certain attribute path, such as "a" or "a.b.c[3]". Returns a null
// value if the item or the requested attribute does not exist.
// Note that the item is assumed to be encoded in JSON using DynamoDB
// conventions - each level of a nested document is a map with one key -
// a type (e.g., "M" for map) - and its value is the representation of
// that value.
static rjson::value extract_path(const rjson::value* item,
const parsed::path& p, calculate_value_caller caller) {
if (!item) {
return rjson::null_value();
}
const rjson::value* v = rjson::find(*item, p.root());
if (!v) {
return rjson::null_value();
}
for (const auto& op : p.operators()) {
if (!v->IsObject() || v->MemberCount() != 1) {
// This shouldn't happen. We shouldn't have stored malformed
// objects. But today Alternator does not validate the structure
// of nested documents before storing them, so this can happen on
// read.
throw api_error::validation(format("{}: malformed item read: {}", caller, *item));
}
const char* type = v->MemberBegin()->name.GetString();
v = &(v->MemberBegin()->value);
std::visit(overloaded_functor {
[&] (const std::string& member) {
if (type[0] == 'M' && v->IsObject()) {
v = rjson::find(*v, member);
} else {
v = nullptr;
}
},
[&] (unsigned index) {
if (type[0] == 'L' && v->IsArray() && index < v->Size()) {
v = &(v->GetArray()[index]);
} else {
v = nullptr;
}
}
}, op);
if (!v) {
return rjson::null_value();
}
}
return rjson::copy(*v);
}
// Given a parsed::value, which can refer either to a constant value from
// ExpressionAttributeValues, to the value of some attribute, or to a function
// of other values, this function calculates the resulting value.
// "caller" determines which expression - ConditionExpression or
// UpdateExpression - is asking for this value. We need to know this because
// DynamoDB allows a different choice of functions for different expressions.
rjson::value calculate_value(const parsed::value& v,
calculate_value_caller caller,
const rjson::value* previous_item) {
return std::visit(overloaded_functor {
[&] (const parsed::constant& c) -> rjson::value {
return rjson::copy(calculate_value(c));
},
[&] (const parsed::value::function_call& f) -> rjson::value {
auto function_it = function_handlers.find(std::string_view(f._function_name));
if (function_it == function_handlers.end()) {
throw api_error::validation(
format("{}: unknown function '{}' called.", caller, f._function_name));
}
return function_it->second(caller, previous_item, f);
},
[&] (const parsed::path& p) -> rjson::value {
return extract_path(previous_item, p, caller);
}
}, v._value);
}
// Same as calculate_value() above, except takes a set_rhs, which may be
// either a single value, or v1+v2 or v1-v2.
rjson::value calculate_value(const parsed::set_rhs& rhs,
const rjson::value* previous_item) {
switch (rhs._op) {
case 'v':
return calculate_value(rhs._v1, calculate_value_caller::UpdateExpression, previous_item);
case '+': {
rjson::value v1 = calculate_value(rhs._v1, calculate_value_caller::UpdateExpression, previous_item);
rjson::value v2 = calculate_value(rhs._v2, calculate_value_caller::UpdateExpression, previous_item);
return number_add(v1, v2);
}
case '-': {
rjson::value v1 = calculate_value(rhs._v1, calculate_value_caller::UpdateExpression, previous_item);
rjson::value v2 = calculate_value(rhs._v2, calculate_value_caller::UpdateExpression, previous_item);
return number_subtract(v1, v2);
}
}
// Can't happen
return rjson::null_value();
}
} // namespace alternator
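
The traversal done by extract_path() above can be illustrated without rjson. The following is a minimal sketch, not the Alternator implementation: the `node` type, the `make_s`/`make_m`/`make_l` helpers and `sample_item()` are hypothetical stand-ins for the DynamoDB JSON encoding (a type tag such as "M" or "L" plus a payload), and a missing attribute is reported as a null pointer rather than a null JSON value.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for a DynamoDB-encoded value: a type tag plus the
// payload that belongs to that tag.
struct node;
using node_ptr = std::shared_ptr<const node>;
struct node {
    char type;                                // 'S', 'M' or 'L'
    std::string str;                          // payload when type == 'S'
    std::map<std::string, node_ptr> members;  // payload when type == 'M'
    std::vector<node_ptr> elements;           // payload when type == 'L'
};

node_ptr make_s(std::string s) {
    return std::make_shared<node>(node{'S', std::move(s), {}, {}});
}
node_ptr make_m(std::map<std::string, node_ptr> m) {
    return std::make_shared<node>(node{'M', {}, std::move(m), {}});
}
node_ptr make_l(std::vector<node_ptr> l) {
    return std::make_shared<node>(node{'L', {}, {}, std::move(l)});
}

// A path step is either a ".name" dereference or an "[index]" dereference.
using step = std::variant<std::string, unsigned>;

// Walk the path; return nullptr when the root or any step does not exist,
// or when a step is applied to a value of the wrong type.
const node* extract_path(const std::map<std::string, node_ptr>& item,
                         const std::string& root,
                         const std::vector<step>& steps) {
    auto it = item.find(root);
    if (it == item.end()) {
        return nullptr;
    }
    const node* v = it->second.get();
    for (const step& s : steps) {
        if (const auto* name = std::get_if<std::string>(&s)) {
            if (v->type != 'M') {
                return nullptr;
            }
            auto m = v->members.find(*name);
            if (m == v->members.end()) {
                return nullptr;
            }
            v = m->second.get();
        } else {
            unsigned index = std::get<unsigned>(s);
            if (v->type != 'L' || index >= v->elements.size()) {
                return nullptr;
            }
            v = v->elements[index].get();
        }
    }
    return v;
}

// Sample item with one attribute: a = {"M": {"b": {"L": ["x", "y"]}}}
std::map<std::string, node_ptr> sample_item() {
    node_ptr list = make_l({make_s("x"), make_s("y")});
    node_ptr inner = make_m({{"b", list}});
    return {{"a", inner}};
}
```

With `sample_item()`, the path a.b[1] resolves to the string "y", while a.b[5], a.c and a missing root all yield a null pointer instead of an error, mirroring the "returns a null value" behaviour described in the comment above extract_path().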

@@ -1,249 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
/*
* The DynamoDB protocol is based on JSON, and most DynamoDB requests
* describe the operation and its parameters via JSON objects such as maps
* and lists. Nevertheless, in some types of requests an "expression" is
* passed as a single string, and we need to parse this string. These
* cases include:
* 1. Attribute paths, such as "a[3].b.c", are used in projection
* expressions as well as inside other expressions described below.
* 2. Condition expressions, such as "(NOT (a=b OR c=d)) AND e=f",
* used in conditional updates, filters, and other places.
* 3. Update expressions, such as "SET #a.b = :x, c = :y DELETE d"
*
* All these expression syntaxes are very simple: Most of them could be
* parsed as regular expressions, and the parenthesized condition expression
* could be done with a simple hand-written lexical analyzer and recursive-
* descent parser. Nevertheless, we decided to specify these parsers in the
* ANTLR3 language already used in the Scylla project, hopefully making these
* parsers easier to reason about, and easier to change if needed - and
* reducing the amount of boiler-plate code.
*/
grammar expressions;
options {
language = Cpp;
}
@parser::namespace{alternator}
@lexer::namespace{alternator}
/* TODO: explain what these traits are. I haven't seen them explained
* in any document... Compilation fails without them because definitions
* of "expressionsLexerTraits" and "expressionsParserTraits" are needed.
*/
@lexer::traits {
class expressionsLexer;
class expressionsParser;
typedef antlr3::Traits<expressionsLexer, expressionsParser> expressionsLexerTraits;
}
@parser::traits {
typedef expressionsLexerTraits expressionsParserTraits;
}
@lexer::header {
#include "alternator/expressions.hh"
// ANTLR generates a bunch of unused variables and functions. Yuck...
#pragma GCC diagnostic ignored "-Wunused-variable"
#pragma GCC diagnostic ignored "-Wunused-function"
}
@parser::header {
#include "expressionsLexer.hpp"
}
/* By default, ANTLR3 composes elaborate syntax-error messages, saying which
* token was unexpected, where, and so on, but then dutifully writes these
* error messages to the standard error, and returns from the parser as if
* everything was fine, with a half-constructed output object! If we define
* the "displayRecognitionError" method, it will be called upon to build this
* error message, and we can instead throw an exception to stop the parsing
* immediately. This is good enough for now, for our simple needs, but if
* we ever want to show more information about the syntax error, Cql3.g
* contains an elaborate implementation (it would be nice if we could reuse
* it, not duplicate it).
* Unfortunately, we have to repeat the same definition twice - once for the
* parser, and once for the lexer.
*/
@parser::context {
void displayRecognitionError(ANTLR_UINT8** token_names, ExceptionBaseType* ex) {
throw expressions_syntax_error("syntax error");
}
}
@lexer::context {
void displayRecognitionError(ANTLR_UINT8** token_names, ExceptionBaseType* ex) {
throw expressions_syntax_error("syntax error");
}
}
/*
* Lexical analysis phase, i.e., splitting the input up to tokens.
* Lexical analyzer rules have names starting in capital letters.
* "fragment" rules do not generate tokens, and are just aliases used to
* make other rules more readable.
* Characters *not* listed here, e.g., '=', '(', etc., will be handled
* as individual tokens in their own right.
* Whitespace spans are skipped, so do not generate tokens.
*/
WHITESPACE: (' ' | '\t' | '\n' | '\r')+ { skip(); };
/* shortcuts for case-insensitive keywords */
fragment A:('a'|'A');
fragment B:('b'|'B');
fragment C:('c'|'C');
fragment D:('d'|'D');
fragment E:('e'|'E');
fragment F:('f'|'F');
fragment G:('g'|'G');
fragment H:('h'|'H');
fragment I:('i'|'I');
fragment J:('j'|'J');
fragment K:('k'|'K');
fragment L:('l'|'L');
fragment M:('m'|'M');
fragment N:('n'|'N');
fragment O:('o'|'O');
fragment P:('p'|'P');
fragment Q:('q'|'Q');
fragment R:('r'|'R');
fragment S:('s'|'S');
fragment T:('t'|'T');
fragment U:('u'|'U');
fragment V:('v'|'V');
fragment W:('w'|'W');
fragment X:('x'|'X');
fragment Y:('y'|'Y');
fragment Z:('z'|'Z');
/* These keywords must appear before the generic NAME token below,
* because NAME matches too, and the first to match wins.
*/
SET: S E T;
REMOVE: R E M O V E;
ADD: A D D;
DELETE: D E L E T E;
AND: A N D;
OR: O R;
NOT: N O T;
BETWEEN: B E T W E E N;
IN: I N;
fragment ALPHA: 'A'..'Z' | 'a'..'z';
fragment DIGIT: '0'..'9';
fragment ALNUM: ALPHA | DIGIT | '_';
INTEGER: DIGIT+;
NAME: ALPHA ALNUM*;
NAMEREF: '#' ALNUM+;
VALREF: ':' ALNUM+;
/*
* Parsing phase - parsing the string of tokens generated by the lexical
* analyzer defined above.
*/
path_component: NAME | NAMEREF;
path returns [parsed::path p]:
root=path_component { $p.set_root($root.text); }
( '.' name=path_component { $p.add_dot($name.text); }
| '[' INTEGER ']' { $p.add_index(std::stoi($INTEGER.text)); }
)*;
value returns [parsed::value v]:
VALREF { $v.set_valref($VALREF.text); }
| path { $v.set_path($path.p); }
| NAME { $v.set_func_name($NAME.text); }
'(' x=value { $v.add_func_parameter($x.v); }
(',' x=value { $v.add_func_parameter($x.v); })*
')'
;
update_expression_set_rhs returns [parsed::set_rhs rhs]:
v=value { $rhs.set_value(std::move($v.v)); }
( '+' v=value { $rhs.set_plus(std::move($v.v)); }
| '-' v=value { $rhs.set_minus(std::move($v.v)); }
)?
;
update_expression_set_action returns [parsed::update_expression::action a]:
path '=' rhs=update_expression_set_rhs { $a.assign_set($path.p, $rhs.rhs); };
update_expression_remove_action returns [parsed::update_expression::action a]:
path { $a.assign_remove($path.p); };
update_expression_add_action returns [parsed::update_expression::action a]:
path VALREF { $a.assign_add($path.p, $VALREF.text); };
update_expression_delete_action returns [parsed::update_expression::action a]:
path VALREF { $a.assign_del($path.p, $VALREF.text); };
update_expression_clause returns [parsed::update_expression e]:
SET s=update_expression_set_action { $e.add(s); }
(',' s=update_expression_set_action { $e.add(s); })*
| REMOVE r=update_expression_remove_action { $e.add(r); }
(',' r=update_expression_remove_action { $e.add(r); })*
| ADD a=update_expression_add_action { $e.add(a); }
(',' a=update_expression_add_action { $e.add(a); })*
| DELETE d=update_expression_delete_action { $e.add(d); }
(',' d=update_expression_delete_action { $e.add(d); })*
;
// Note the "EOF" token at the end of the update expression. We want the
// parser to match the entire string given to it - not just its beginning!
update_expression returns [parsed::update_expression e]:
(update_expression_clause { e.append($update_expression_clause.e); })* EOF;
projection_expression returns [std::vector<parsed::path> v]:
p=path { $v.push_back(std::move($p.p)); }
(',' p=path { $v.push_back(std::move($p.p)); } )* EOF;
primitive_condition returns [parsed::primitive_condition c]:
v=value { $c.add_value(std::move($v.v));
$c.set_operator(parsed::primitive_condition::type::VALUE); }
( ( '=' { $c.set_operator(parsed::primitive_condition::type::EQ); }
| '<' '>' { $c.set_operator(parsed::primitive_condition::type::NE); }
| '<' { $c.set_operator(parsed::primitive_condition::type::LT); }
| '<' '=' { $c.set_operator(parsed::primitive_condition::type::LE); }
| '>' { $c.set_operator(parsed::primitive_condition::type::GT); }
| '>' '=' { $c.set_operator(parsed::primitive_condition::type::GE); }
)
v=value { $c.add_value(std::move($v.v)); }
| BETWEEN { $c.set_operator(parsed::primitive_condition::type::BETWEEN); }
v=value { $c.add_value(std::move($v.v)); }
AND
v=value { $c.add_value(std::move($v.v)); }
| IN '(' { $c.set_operator(parsed::primitive_condition::type::IN); }
v=value { $c.add_value(std::move($v.v)); }
(',' v=value { $c.add_value(std::move($v.v)); })*
')'
)?
;
// The following rules for parsing boolean expressions are verbose and
// somewhat strange because of Antlr 3's limitations on recursive rules,
// common rule prefixes, and (lack of) support for operator precedence.
// These rules could have been written more clearly using a more powerful
// parser generator - such as Yacc.
boolean_expression returns [parsed::condition_expression e]:
b=boolean_expression_1 { $e.append(std::move($b.e), '|'); }
(OR b=boolean_expression_1 { $e.append(std::move($b.e), '|'); } )*
;
boolean_expression_1 returns [parsed::condition_expression e]:
b=boolean_expression_2 { $e.append(std::move($b.e), '&'); }
(AND b=boolean_expression_2 { $e.append(std::move($b.e), '&'); } )*
;
boolean_expression_2 returns [parsed::condition_expression e]:
p=primitive_condition { $e.set_primitive(std::move($p.c)); }
| NOT b=boolean_expression_2 { $e = std::move($b.e); $e.apply_not(); }
| '(' b=boolean_expression ')' { $e = std::move($b.e); }
;
condition_expression returns [parsed::condition_expression e]:
boolean_expression { e=std::move($boolean_expression.e); } EOF;
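
The three-level rule structure above (boolean_expression, boolean_expression_1, boolean_expression_2) is the usual way to encode precedence: OR binds loosest, then AND, then NOT and parentheses. As the comment at the top of the grammar notes, the same thing could be done with a hand-written recursive-descent parser. The following is a minimal sketch of that alternative, under loud assumptions: the primitive conditions are replaced by the literals TRUE and FALSE, the expression is evaluated directly instead of building a parsed::condition_expression, and `tokenize`, `bool_parser` and `evaluate` are hypothetical names.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Split the input into words and parentheses. Words are upper-cased so that
// keywords are case-insensitive, as in the grammar above.
std::vector<std::string> tokenize(std::string_view in) {
    std::vector<std::string> out;
    std::string word;
    auto flush = [&] {
        if (!word.empty()) {
            out.push_back(word);
            word.clear();
        }
    };
    for (char ch : in) {
        if (ch == '(' || ch == ')') {
            flush();
            out.emplace_back(1, ch);
        } else if (std::isspace(static_cast<unsigned char>(ch))) {
            flush();
        } else {
            word += static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
        }
    }
    flush();
    return out;
}

class bool_parser {
    std::vector<std::string> _tokens;
    std::size_t _pos = 0;

    bool accept(const std::string& t) {
        if (_pos < _tokens.size() && _tokens[_pos] == t) {
            ++_pos;
            return true;
        }
        return false;
    }
    // Like boolean_expression_2: a primitive, NOT x, or a parenthesized expression.
    bool level2() {
        if (accept("NOT")) {
            return !level2();
        }
        if (accept("(")) {
            bool v = level0();
            if (!accept(")")) {
                throw std::runtime_error("syntax error");
            }
            return v;
        }
        if (accept("TRUE")) {
            return true;
        }
        if (accept("FALSE")) {
            return false;
        }
        throw std::runtime_error("syntax error");
    }
    // Like boolean_expression_1: level2 (AND level2)*
    bool level1() {
        bool v = level2();
        while (accept("AND")) {
            bool r = level2();  // always parse the operand, even if v is false
            v = v && r;
        }
        return v;
    }
    // Like boolean_expression: level1 (OR level1)*
    bool level0() {
        bool v = level1();
        while (accept("OR")) {
            bool r = level1();
            v = v || r;
        }
        return v;
    }
public:
    explicit bool_parser(std::vector<std::string> tokens) : _tokens(std::move(tokens)) {}
    bool parse() {
        bool v = level0();
        // Equivalent of the EOF token: the whole input must be consumed.
        if (_pos != _tokens.size()) {
            throw std::runtime_error("syntax error");
        }
        return v;
    }
};

bool evaluate(std::string_view expr) {
    return bool_parser(tokenize(expr)).parse();
}
```

For example, `evaluate("true OR false AND false")` is true because AND is parsed one level below OR, whereas `evaluate("(true OR false) AND false")` is false. Throwing on the first error plays the same role as the displayRecognitionError override in the grammar.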

@@ -1,89 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include <string>
#include <stdexcept>
#include <vector>
#include <unordered_set>
#include <string_view>
#include <seastar/util/noncopyable_function.hh>
#include "expressions_types.hh"
#include "utils/rjson.hh"
namespace alternator {
class expressions_syntax_error : public std::runtime_error {
public:
using runtime_error::runtime_error;
};
parsed::update_expression parse_update_expression(std::string_view query);
std::vector<parsed::path> parse_projection_expression(std::string_view query);
parsed::condition_expression parse_condition_expression(std::string_view query);
void resolve_update_expression(parsed::update_expression& ue,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values);
void resolve_projection_expression(std::vector<parsed::path>& pe,
const rjson::value* expression_attribute_names,
std::unordered_set<std::string>& used_attribute_names);
void resolve_condition_expression(parsed::condition_expression& ce,
const rjson::value* expression_attribute_names,
const rjson::value* expression_attribute_values,
std::unordered_set<std::string>& used_attribute_names,
std::unordered_set<std::string>& used_attribute_values);
void validate_value(const rjson::value& v, const char* caller);
bool condition_expression_on(const parsed::condition_expression& ce, std::string_view attribute);
// for_condition_expression_on() runs the given function on the attributes
// that the expression uses. It may run for the same attribute more than once
// if the same attribute is used more than once in the expression.
void for_condition_expression_on(const parsed::condition_expression& ce, const noncopyable_function<void(std::string_view)>& func);
// calculate_value() behaves slightly differently (in particular, it supports
// different functions) when used in different types of expressions, as
// enumerated in this enum:
enum class calculate_value_caller {
UpdateExpression, ConditionExpression, ConditionExpressionAlone
};
inline std::ostream& operator<<(std::ostream& out, calculate_value_caller caller) {
switch (caller) {
case calculate_value_caller::UpdateExpression:
out << "UpdateExpression";
break;
case calculate_value_caller::ConditionExpression:
out << "ConditionExpression";
break;
case calculate_value_caller::ConditionExpressionAlone:
out << "ConditionExpression";
break;
default:
out << "unknown type of expression";
break;
}
return out;
}
rjson::value calculate_value(const parsed::value& v,
calculate_value_caller caller,
const rjson::value* previous_item);
rjson::value calculate_value(const parsed::set_rhs& rhs,
const rjson::value* previous_item);
} /* namespace alternator */

@@ -1,257 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include <vector>
#include <string>
#include <variant>
#include <seastar/core/shared_ptr.hh>
#include "utils/rjson.hh"
/*
* Parsed representation of expressions and their components.
*
* Types in the alternator::parsed namespace are used for holding the parse
* tree - objects generated by the Antlr rules after parsing an expression.
* Because of the way Antlr works, all these objects are default-constructed
* first, and then assigned when the rule is completed, so all these types
* have only default constructors - but setter functions to set them later.
*/
namespace alternator {
namespace parsed {
// "path" is an attribute's path in a document, e.g., a.b[3].c.
class path {
// All paths have a "root", a top-level attribute, and any number of
// "dereference operators" - each either an index (e.g., "[2]") or a
// dot (e.g., ".xyz").
std::string _root;
std::vector<std::variant<std::string, unsigned>> _operators;
// It is useful to limit the depth of a user-specified path, because it
// allows us to use recursive algorithms without worrying about recursion
// depth. DynamoDB officially limits the length of paths to 32 components
// (including the root) so let's use the same limit.
static constexpr unsigned depth_limit = 32;
void check_depth_limit();
public:
void set_root(std::string root) {
_root = std::move(root);
}
void add_index(unsigned i) {
_operators.emplace_back(i);
check_depth_limit();
}
void add_dot(std::string name) {
_operators.emplace_back(std::move(name));
check_depth_limit();
}
const std::string& root() const {
return _root;
}
bool has_operators() const {
return !_operators.empty();
}
const std::vector<std::variant<std::string, unsigned>>& operators() const {
return _operators;
}
std::vector<std::variant<std::string, unsigned>>& operators() {
return _operators;
}
friend std::ostream& operator<<(std::ostream&, const path&);
};
// When an expression is first parsed, all constants are references, like
// ":val1", into ExpressionAttributeValues, and are stored in the std::string
// alternative of the variant. The resolve_value() function later replaces
// these references by the JSON item extracted from ExpressionAttributeValues.
struct constant {
// We use lw_shared_ptr<rjson::value> just to make rjson::value copyable,
// to make this entire object copyable as ANTLR needs.
using literal = lw_shared_ptr<rjson::value>;
std::variant<std::string, literal> _value;
void set(const rjson::value& v) {
_value = make_lw_shared<rjson::value>(rjson::copy(v));
}
void set(std::string& s) {
_value = s;
}
};
// "value" is a value used in the right hand side of an assignment
// expression, "SET a = ...". It can be a constant (a reference to a value
// included in the request, e.g., ":val"), a path to an attribute from the
// existing item (e.g., "a.b[3].c"), or a function of other such values.
// Note that the real right-hand-side of an assignment is actually a bit
// more general - it allows either a value, or a value+value or value-value -
// see class set_rhs below.
struct value {
struct function_call {
std::string _function_name;
std::vector<value> _parameters;
};
std::variant<constant, path, function_call> _value;
void set_constant(constant c) {
_value = std::move(c);
}
void set_valref(std::string s) {
_value = constant { std::move(s) };
}
void set_path(path p) {
_value = std::move(p);
}
void set_func_name(std::string s) {
_value = function_call {std::move(s), {}};
}
void add_func_parameter(value v) {
std::get<function_call>(_value)._parameters.emplace_back(std::move(v));
}
bool is_constant() const {
return std::holds_alternative<constant>(_value);
}
bool is_path() const {
return std::holds_alternative<path>(_value);
}
bool is_func() const {
return std::holds_alternative<function_call>(_value);
}
};
// The right-hand-side of a SET in an update expression can be either a
// single value (see above), or value+value, or value-value.
class set_rhs {
public:
char _op; // '+', '-', or 'v'
value _v1;
value _v2;
void set_value(value&& v1) {
_op = 'v';
_v1 = std::move(v1);
}
void set_plus(value&& v2) {
_op = '+';
_v2 = std::move(v2);
}
void set_minus(value&& v2) {
_op = '-';
_v2 = std::move(v2);
}
};
class update_expression {
public:
struct action {
path _path;
struct set {
set_rhs _rhs;
};
struct remove {
};
struct add {
constant _valref;
};
struct del {
constant _valref;
};
std::variant<set, remove, add, del> _action;
void assign_set(path p, set_rhs rhs) {
_path = std::move(p);
_action = set { std::move(rhs) };
}
void assign_remove(path p) {
_path = std::move(p);
_action = remove { };
}
void assign_add(path p, std::string v) {
_path = std::move(p);
_action = add { constant { std::move(v) } };
}
void assign_del(path p, std::string v) {
_path = std::move(p);
_action = del { constant { std::move(v) } };
}
};
private:
std::vector<action> _actions;
bool seen_set = false;
bool seen_remove = false;
bool seen_add = false;
bool seen_del = false;
public:
void add(action a);
void append(update_expression other);
bool empty() const {
return _actions.empty();
}
const std::vector<action>& actions() const {
return _actions;
}
std::vector<action>& actions() {
return _actions;
}
};
// A primitive_condition is a condition expression involving one condition,
// while the full condition_expression below adds boolean logic over these
// primitive conditions.
// The supported primitive conditions are:
// 1. Binary operators - v1 OP v2, where OP is =, <>, <, <=, >, or >= and
// v1 and v2 are values - from the item (an attribute path), the query
// (a ":val" reference), or a function of the above (only the size()
// function is supported).
// 2. Ternary operator - v1 BETWEEN v2 AND v3 (means v1 >= v2 AND v1 <= v3).
// 3. N-ary operator - v1 IN ( v2, v3, ... )
// 4. A single function call (attribute_exists etc.). The parser actually
// accepts a more general "value" here but later stages reject a value
// which is not a function call (because DynamoDB does it too).
class primitive_condition {
public:
enum class type {
UNDEFINED, VALUE, EQ, NE, LT, LE, GT, GE, BETWEEN, IN
};
type _op = type::UNDEFINED;
std::vector<value> _values;
void set_operator(type op) {
_op = op;
}
void add_value(value&& v) {
_values.push_back(std::move(v));
}
bool empty() const {
return _op == type::UNDEFINED;
}
};
class condition_expression {
public:
bool _negated = false; // If true, the entire condition is negated
struct condition_list {
char op = '|'; // '&' or '|'
std::vector<condition_expression> conditions;
};
std::variant<primitive_condition, condition_list> _expression = condition_list();
void set_primitive(primitive_condition&& p) {
_expression = std::move(p);
}
void append(condition_expression&& c, char op);
void apply_not() {
_negated = !_negated;
}
bool empty() const {
return std::holds_alternative<condition_list>(_expression) &&
std::get<condition_list>(_expression).conditions.empty();
}
};
} // namespace parsed
} // namespace alternator
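
The path class above stores its dereference operators as a vector of `std::variant<std::string, unsigned>`, and declares an `operator<<` whose definition is not part of this header. The following sketch shows one way such a vector can be rendered back to text with `std::visit` and an overloaded functor. The names `overloaded`, `path_operator` and `path_to_string` are assumptions for illustration only (the project has its own `overloaded_functor`), and this is not claimed to match the real `operator<<`.

```cpp
#include <string>
#include <variant>
#include <vector>

// Standard C++17 helper that merges several lambdas into one visitor.
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

// Same shape as the element type of parsed::path::_operators.
using path_operator = std::variant<std::string, unsigned>;

// Render a root attribute and its dereference operators, e.g. "a.b[3].c".
std::string path_to_string(const std::string& root,
                           const std::vector<path_operator>& operators) {
    std::string out = root;
    for (const auto& op : operators) {
        std::visit(overloaded {
            [&] (const std::string& member) { out += "." + member; },
            [&] (unsigned index) { out += "[" + std::to_string(index) + "]"; }
        }, op);
    }
    return out;
}
```

Calling `path_to_string("a", {std::string("b"), 3u, std::string("c")})` produces "a.b[3].c". Because the visitor must handle every alternative of the variant, adding a new kind of operator would be a compile error at every visit site until it is handled.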

@@ -1,114 +0,0 @@
/*
* Copyright 2020-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include "seastarx.hh"
#include "service/paxos/cas_request.hh"
#include "utils/rjson.hh"
#include "executor.hh"
namespace alternator {
// An rmw_operation encapsulates the common logic of all the item update
// operations which may involve a read of the item before the write
// (so-called Read-Modify-Write operations). These operations include PutItem,
// UpdateItem and DeleteItem: All of these may be conditional operations (the
// "Expected" parameter) which require a read before the write, and UpdateItem
// may also have an update expression which refers to the item's old value.
//
// The code below supports running the read and the write together as one
// transaction using LWT (this is why rmw_operation is a subclass of
// cas_request, as required by storage_proxy::cas()), but also has optional
// modes not using LWT.
class rmw_operation : public service::cas_request, public enable_shared_from_this<rmw_operation> {
public:
// The following options choose which mechanism to use for isolating
// parallel write operations:
// * The FORBID_RMW option forbids RMW (read-modify-write) operations
// such as conditional updates. For the remaining write-only
// operations, ordinary quorum writes are isolated enough.
// * The LWT_ALWAYS option always uses LWT (lightweight transactions)
// for any write operation - whether or not it also has a read.
// * The LWT_RMW_ONLY option uses LWT only for RMW operations, and uses
// ordinary quorum writes for write-only operations.
// This option is not safe if the user may send both RMW and write-only
// operations on the same item.
// * The UNSAFE_RMW option does read-modify-write operations as separate
// read and write. It is unsafe - concurrent RMW operations are not
// isolated at all. This option will likely be removed in the future.
enum class write_isolation {
FORBID_RMW, LWT_ALWAYS, LWT_RMW_ONLY, UNSAFE_RMW
};
static constexpr auto WRITE_ISOLATION_TAG_KEY = "system:write_isolation";
static write_isolation get_write_isolation_for_schema(schema_ptr schema);
static write_isolation default_write_isolation;
public:
static void set_default_write_isolation(std::string_view mode);
protected:
// The full request JSON
rjson::value _request;
// All RMW operations involve a single item with a specific partition
// and optional clustering key, in a single table, so the following
// information is common to all of them:
schema_ptr _schema;
partition_key _pk = partition_key::make_empty();
clustering_key _ck = clustering_key::make_empty();
write_isolation _write_isolation;
// All RMW operations can have a ReturnValues parameter from the following
// choices. But note that only UpdateItem actually supports all of them:
enum class returnvalues {
NONE, ALL_OLD, UPDATED_OLD, ALL_NEW, UPDATED_NEW
} _returnvalues;
static returnvalues parse_returnvalues(const rjson::value& request);
// When _returnvalues != NONE, apply() should store here, in JSON form,
// the values which are to be returned in the "Attributes" field.
// The default null JSON means do not return an Attributes field at all.
// This field is marked "mutable" so that the const apply() can modify
// it (see explanation below), but note that because apply() may be
// called more than once, if apply() will sometimes set this field it
// must set it (even if just to the default empty value) every time.
mutable rjson::value _return_attributes;
public:
// The constructor of a rmw_operation subclass should parse the request
// and try to discover as many input errors as it can before really
// attempting the read or write operations.
rmw_operation(service::storage_proxy& proxy, rjson::value&& request);
// rmw_operation subclasses (update_item_operation, put_item_operation
// and delete_item_operation) shall implement an apply() function which
// takes the previous value of the item (if it was read) and creates the
// write mutation. If the previous value of item does not pass the needed
// conditional expression, apply() should return an empty optional.
// apply() may throw if it encounters input errors not discovered during
// the constructor.
// apply() may be called more than once in case of contention, so it must
// not change the state saved in the object (issue #7218 was caused by
// violating this). We mark apply() "const" to let the compiler validate
// this for us. The output-only field _return_attributes is marked
// "mutable" above so that apply() can still write to it.
virtual std::optional<mutation> apply(std::unique_ptr<rjson::value> previous_item, api::timestamp_type ts) const = 0;
// Convert the above apply() into the signature needed by cas_request:
virtual std::optional<mutation> apply(foreign_ptr<lw_shared_ptr<query::result>> qr, const query::partition_slice& slice, api::timestamp_type ts) override;
virtual ~rmw_operation() = default;
schema_ptr schema() const { return _schema; }
const rjson::value& request() const { return _request; }
rjson::value&& move_request() && { return std::move(_request); }
future<executor::request_return_type> execute(service::storage_proxy& proxy,
service::client_state& client_state,
tracing::trace_state_ptr trace_state,
service_permit permit,
bool needs_read_before_write,
stats& stats);
std::optional<shard_id> shard_for_execute(bool needs_read_before_write);
};
} // namespace alternator
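
The comments above require apply() to be const because it may run more than once under contention, with only the output field marked mutable. The following is a small sketch of that contract, not of LWT itself: a `std::atomic<int>` compare-and-exchange loop stands in for storage_proxy::cas(), an `int` stands in for the item, and `rmw_sketch`, `increment_if_below` and `returned_old()` are hypothetical names.

```cpp
#include <atomic>
#include <optional>

class rmw_sketch {
protected:
    // Output-only field, playing the role of _return_attributes: it is
    // "mutable" so that the const apply() may still write to it.
    mutable std::optional<int> _returned_old;
public:
    virtual ~rmw_sketch() = default;

    // Compute the new value from the previous one, or return an empty
    // optional if the condition fails. Const: it may be called repeatedly.
    virtual std::optional<int> apply(int previous) const = 0;

    // Retry until the compare-and-exchange succeeds or the condition fails.
    bool execute(std::atomic<int>& cell) const {
        int seen = cell.load();
        for (;;) {
            std::optional<int> next = apply(seen);
            if (!next) {
                return false;
            }
            if (cell.compare_exchange_weak(seen, *next)) {
                return true;
            }
            // On failure "seen" now holds the current value and apply()
            // runs again, so it must not depend on state from earlier runs.
        }
    }

    std::optional<int> returned_old() const { return _returned_old; }
};

class increment_if_below : public rmw_sketch {
    int _limit;
public:
    explicit increment_if_below(int limit) : _limit(limit) {}
    std::optional<int> apply(int previous) const override {
        _returned_old = previous;  // set on every call, not only on success
        if (previous >= _limit) {
            return std::nullopt;
        }
        return previous + 1;
    }
};
```

Since apply() overwrites `_returned_old` on every call, the value observed after execute() always belongs to the last attempt, which is the property the comment on `_return_attributes` asks for.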

@@ -1,442 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include "utils/base64.hh"
#include "utils/rjson.hh"
#include "log.hh"
#include "serialization.hh"
#include "error.hh"
#include "rapidjson/writer.h"
#include "concrete_types.hh"
#include "cql3/type_json.hh"
#include "position_in_partition.hh"
static logging::logger slogger("alternator-serialization");
namespace alternator {
bool is_alternator_keyspace(const sstring& ks_name);
type_info type_info_from_string(std::string_view type) {
static thread_local const std::unordered_map<std::string_view, type_info> type_infos = {
{"S", {alternator_type::S, utf8_type}},
{"B", {alternator_type::B, bytes_type}},
{"BOOL", {alternator_type::BOOL, boolean_type}},
{"N", {alternator_type::N, decimal_type}}, //FIXME: Replace with custom Alternator type when implemented
};
auto it = type_infos.find(type);
if (it == type_infos.end()) {
return {alternator_type::NOT_SUPPORTED_YET, utf8_type};
}
return it->second;
}
type_representation represent_type(alternator_type atype) {
static thread_local const std::unordered_map<alternator_type, type_representation> type_representations = {
{alternator_type::S, {"S", utf8_type}},
{alternator_type::B, {"B", bytes_type}},
{alternator_type::BOOL, {"BOOL", boolean_type}},
{alternator_type::N, {"N", decimal_type}}, //FIXME: Replace with custom Alternator type when implemented
};
auto it = type_representations.find(atype);
if (it == type_representations.end()) {
throw std::runtime_error(format("Unknown alternator type {}", int8_t(atype)));
}
return it->second;
}
struct from_json_visitor {
const rjson::value& v;
bytes_ostream& bo;
void operator()(const reversed_type_impl& t) const { visit(*t.underlying_type(), from_json_visitor{v, bo}); };
void operator()(const string_type_impl& t) {
bo.write(t.from_string(rjson::to_string_view(v)));
}
void operator()(const bytes_type_impl& t) const {
bo.write(rjson::base64_decode(v));
}
void operator()(const boolean_type_impl& t) const {
bo.write(boolean_type->decompose(v.GetBool()));
}
void operator()(const decimal_type_impl& t) const {
try {
bo.write(t.from_string(rjson::to_string_view(v)));
} catch (const marshal_exception& e) {
throw api_error::validation(format("The parameter cannot be converted to a numeric value: {}", v));
}
}
// default
void operator()(const abstract_type& t) const {
bo.write(from_json_object(t, v, cql_serialization_format::internal()));
}
};
bytes serialize_item(const rjson::value& item) {
if (item.IsNull() || item.MemberCount() != 1) {
throw api_error::validation(format("An item can contain only one attribute definition: {}", item));
}
auto it = item.MemberBegin();
type_info type_info = type_info_from_string(rjson::to_string_view(it->name)); // JSON keys are guaranteed to be strings
if (type_info.atype == alternator_type::NOT_SUPPORTED_YET) {
slogger.trace("Non-optimal serialization of type {}", it->name);
return bytes{int8_t(type_info.atype)} + to_bytes(rjson::print(item));
}
bytes_ostream bo;
bo.write(bytes{int8_t(type_info.atype)});
visit(*type_info.dtype, from_json_visitor{it->value, bo});
return bytes(bo.linearize());
}
struct to_json_visitor {
rjson::value& deserialized;
const std::string& type_ident;
bytes_view bv;
void operator()(const reversed_type_impl& t) const { visit(*t.underlying_type(), to_json_visitor{deserialized, type_ident, bv}); };
void operator()(const decimal_type_impl& t) const {
auto s = to_json_string(*decimal_type, bytes(bv));
//FIXME(sarna): unnecessary copy
rjson::add_with_string_name(deserialized, type_ident, rjson::from_string(s));
}
void operator()(const string_type_impl& t) {
rjson::add_with_string_name(deserialized, type_ident, rjson::from_string(reinterpret_cast<const char *>(bv.data()), bv.size()));
}
void operator()(const bytes_type_impl& t) const {
std::string b64 = base64_encode(bv);
rjson::add_with_string_name(deserialized, type_ident, rjson::from_string(b64));
}
// default
void operator()(const abstract_type& t) const {
rjson::add_with_string_name(deserialized, type_ident, rjson::parse(to_json_string(t, bytes(bv))));
}
};
rjson::value deserialize_item(bytes_view bv) {
rjson::value deserialized(rapidjson::kObjectType);
if (bv.empty()) {
throw api_error::validation("Serialized value empty");
}
alternator_type atype = alternator_type(bv[0]);
bv.remove_prefix(1);
if (atype == alternator_type::NOT_SUPPORTED_YET) {
slogger.trace("Non-optimal deserialization of alternator type {}", int8_t(atype));
return rjson::parse(std::string_view(reinterpret_cast<const char *>(bv.data()), bv.size()));
}
type_representation type_representation = represent_type(atype);
visit(*type_representation.dtype, to_json_visitor{deserialized, type_representation.ident, bv});
return deserialized;
}
std::string type_to_string(data_type type) {
static thread_local std::unordered_map<data_type, std::string> types = {
{utf8_type, "S"},
{bytes_type, "B"},
{boolean_type, "BOOL"},
{decimal_type, "N"}, // FIXME: use a specialized Alternator number type instead of the general decimal_type
};
auto it = types.find(type);
if (it == types.end()) {
// fall back to string, in order to be able to present
// internal Scylla types in a human-readable way
return "S";
}
return it->second;
}
bytes get_key_column_value(const rjson::value& item, const column_definition& column) {
std::string column_name = column.name_as_text();
const rjson::value* key_typed_value = rjson::find(item, column_name);
if (!key_typed_value) {
throw api_error::validation(format("Key column {} not found", column_name));
}
return get_key_from_typed_value(*key_typed_value, column);
}
// Parses the JSON encoding for a key value, which is a map with a single
// entry whose key is the type and the value is the encoded value.
// If this type does not match the desired "type_str", an api_error::validation
// error is thrown (the "name" parameter is the name of the column and
// "value_name" describes the kind of value; both are mentioned in the
// exception message).
// If the type does match, a reference to the encoded value is returned.
static const rjson::value& get_typed_value(const rjson::value& key_typed_value, std::string_view type_str, std::string_view name, std::string_view value_name) {
if (!key_typed_value.IsObject() || key_typed_value.MemberCount() != 1 ||
!key_typed_value.MemberBegin()->value.IsString()) {
throw api_error::validation(
format("Malformed value object for {} {}: {}",
value_name, name, key_typed_value));
}
auto it = key_typed_value.MemberBegin();
if (rjson::to_string_view(it->name) != type_str) {
throw api_error::validation(
format("Type mismatch: expected type {} for {} {}, got type {}",
type_str, value_name, name, it->name));
}
return it->value;
}
// Parses the JSON encoding for a key value, which is a map with a single
// entry, whose key is the type (expected to match the key column's type)
// and the value is the encoded value.
bytes get_key_from_typed_value(const rjson::value& key_typed_value, const column_definition& column) {
auto& value = get_typed_value(key_typed_value, type_to_string(column.type), column.name_as_text(), "key column");
std::string_view value_view = rjson::to_string_view(value);
if (value_view.empty()) {
throw api_error::validation(
format("The AttributeValue for a key attribute cannot contain an empty string value. Key: {}", column.name_as_text()));
}
if (column.type == bytes_type) {
return rjson::base64_decode(value);
} else {
return column.type->from_string(value_view);
}
}
rjson::value json_key_column_value(bytes_view cell, const column_definition& column) {
if (column.type == bytes_type) {
std::string b64 = base64_encode(cell);
return rjson::from_string(b64);
} else if (column.type == utf8_type) {
return rjson::from_string(std::string(reinterpret_cast<const char*>(cell.data()), cell.size()));
} else if (column.type == decimal_type) {
// FIXME: use specialized Alternator number type, not the more
// general "decimal_type". A dedicated type can be more efficient
// in storage space and in parsing speed.
auto s = to_json_string(*decimal_type, bytes(cell));
return rjson::from_string(s);
} else {
// Support for arbitrary key types is useful for parsing values of virtual tables,
// which can involve any type supported by Scylla.
// In order to guarantee that the returned type is parsable by alternator clients,
// they are represented simply as strings.
return rjson::from_string(column.type->to_string(bytes(cell)));
}
}
partition_key pk_from_json(const rjson::value& item, schema_ptr schema) {
std::vector<bytes> raw_pk;
// FIXME: this is a loop, but we really allow only one partition key column.
for (const column_definition& cdef : schema->partition_key_columns()) {
bytes raw_value = get_key_column_value(item, cdef);
raw_pk.push_back(std::move(raw_value));
}
return partition_key::from_exploded(raw_pk);
}
clustering_key ck_from_json(const rjson::value& item, schema_ptr schema) {
if (schema->clustering_key_size() == 0) {
return clustering_key::make_empty();
}
std::vector<bytes> raw_ck;
// FIXME: this is a loop, but we really allow only one clustering key column.
for (const column_definition& cdef : schema->clustering_key_columns()) {
bytes raw_value = get_key_column_value(item, cdef);
raw_ck.push_back(std::move(raw_value));
}
return clustering_key::from_exploded(raw_ck);
}
position_in_partition pos_from_json(const rjson::value& item, schema_ptr schema) {
auto ck = ck_from_json(item, schema);
if (is_alternator_keyspace(schema->ks_name())) {
return position_in_partition::for_key(std::move(ck));
}
const auto region_item = rjson::find(item, scylla_paging_region);
const auto weight_item = rjson::find(item, scylla_paging_weight);
if (bool(region_item) != bool(weight_item)) {
throw api_error::validation("Malformed value object: region and weight have to be either both missing or both present");
}
partition_region region;
bound_weight weight;
if (region_item) {
auto region_view = rjson::to_string_view(get_typed_value(*region_item, "S", scylla_paging_region, "key region"));
auto weight_view = rjson::to_string_view(get_typed_value(*weight_item, "N", scylla_paging_weight, "key weight"));
region = parse_partition_region(region_view);
if (weight_view == "-1") {
weight = bound_weight::before_all_prefixed;
} else if (weight_view == "0") {
weight = bound_weight::equal;
} else if (weight_view == "1") {
weight = bound_weight::after_all_prefixed;
} else {
throw std::runtime_error(fmt::format("Invalid value for weight: {}", weight_view));
}
return position_in_partition(region, weight, region == partition_region::clustered ? std::optional(std::move(ck)) : std::nullopt);
}
if (ck.is_empty()) {
return position_in_partition(position_in_partition::partition_start_tag_t());
}
return position_in_partition::for_key(std::move(ck));
}
big_decimal unwrap_number(const rjson::value& v, std::string_view diagnostic) {
if (!v.IsObject() || v.MemberCount() != 1) {
throw api_error::validation(format("{}: invalid number object", diagnostic));
}
auto it = v.MemberBegin();
if (it->name != "N") {
throw api_error::validation(format("{}: expected number, found type '{}'", diagnostic, it->name));
}
try {
if (!it->value.IsString()) {
// We shouldn't reach here. Callers normally validate their input
// earlier with validate_value().
throw api_error::validation(format("{}: improperly formatted number constant", diagnostic));
}
return big_decimal(rjson::to_string_view(it->value));
} catch (const marshal_exception& e) {
throw api_error::validation(format("The parameter cannot be converted to a numeric value: {}", it->value));
}
}
std::optional<big_decimal> try_unwrap_number(const rjson::value& v) {
if (!v.IsObject() || v.MemberCount() != 1) {
return std::nullopt;
}
auto it = v.MemberBegin();
if (it->name != "N" || !it->value.IsString()) {
return std::nullopt;
}
try {
return big_decimal(rjson::to_string_view(it->value));
} catch (const marshal_exception& e) {
return std::nullopt;
}
}
const std::pair<std::string, const rjson::value*> unwrap_set(const rjson::value& v) {
if (!v.IsObject() || v.MemberCount() != 1) {
return {"", nullptr};
}
auto it = v.MemberBegin();
std::string it_key = it->name.GetString();
if (it_key != "SS" && it_key != "BS" && it_key != "NS") {
return {std::move(it_key), nullptr};
}
return std::make_pair(it_key, &(it->value));
}
const rjson::value* unwrap_list(const rjson::value& v) {
if (!v.IsObject() || v.MemberCount() != 1) {
return nullptr;
}
auto it = v.MemberBegin();
if (it->name != std::string("L")) {
return nullptr;
}
return &(it->value);
}
// Take two JSON-encoded numeric values ({"N": "thenumber"}) and return the
// sum, again as a JSON-encoded number.
rjson::value number_add(const rjson::value& v1, const rjson::value& v2) {
auto n1 = unwrap_number(v1, "UpdateExpression");
auto n2 = unwrap_number(v2, "UpdateExpression");
rjson::value ret = rjson::empty_object();
std::string str_ret = std::string((n1 + n2).to_string());
rjson::add(ret, "N", rjson::from_string(str_ret));
return ret;
}
rjson::value number_subtract(const rjson::value& v1, const rjson::value& v2) {
auto n1 = unwrap_number(v1, "UpdateExpression");
auto n2 = unwrap_number(v2, "UpdateExpression");
rjson::value ret = rjson::empty_object();
std::string str_ret = std::string((n1 - n2).to_string());
rjson::add(ret, "N", rjson::from_string(str_ret));
return ret;
}
// Take two JSON-encoded set values (e.g. {"SS": [...the actual set]}) and
// return the sum of both sets, again as a set value.
rjson::value set_sum(const rjson::value& v1, const rjson::value& v2) {
auto [set1_type, set1] = unwrap_set(v1);
auto [set2_type, set2] = unwrap_set(v2);
if (set1_type != set2_type) {
throw api_error::validation(format("Mismatched set types: {} and {}", set1_type, set2_type));
}
if (!set1 || !set2) {
throw api_error::validation("UpdateExpression: ADD operation for sets must be given sets as arguments");
}
rjson::value sum = rjson::copy(*set1);
std::set<rjson::value, rjson::single_value_comp> set1_raw;
for (auto it = sum.Begin(); it != sum.End(); ++it) {
set1_raw.insert(rjson::copy(*it));
}
for (const auto& a : set2->GetArray()) {
if (!set1_raw.contains(a)) {
rjson::push_back(sum, rjson::copy(a));
}
}
rjson::value ret = rjson::empty_object();
rjson::add_with_string_name(ret, set1_type, std::move(sum));
return ret;
}
// Take two JSON-encoded set values (e.g. {"SS": [...the actual set]}) and
// return the difference v1 - v2, again as a set value.
// DynamoDB does not allow empty sets, so if resulting set is empty, return
// an unset optional instead.
std::optional<rjson::value> set_diff(const rjson::value& v1, const rjson::value& v2) {
auto [set1_type, set1] = unwrap_set(v1);
auto [set2_type, set2] = unwrap_set(v2);
if (set1_type != set2_type) {
throw api_error::validation(format("Set DELETE type mismatch: {} and {}", set1_type, set2_type));
}
if (!set1 || !set2) {
throw api_error::validation("UpdateExpression: DELETE operation can only be performed on a set");
}
std::set<rjson::value, rjson::single_value_comp> set1_raw;
for (auto it = set1->Begin(); it != set1->End(); ++it) {
set1_raw.insert(rjson::copy(*it));
}
for (const auto& a : set2->GetArray()) {
set1_raw.erase(a);
}
if (set1_raw.empty()) {
return std::nullopt;
}
rjson::value ret = rjson::empty_object();
rjson::add_with_string_name(ret, set1_type, rjson::empty_array());
rjson::value& result_set = ret[set1_type];
for (const auto& a : set1_raw) {
rjson::push_back(result_set, rjson::copy(a));
}
return ret;
}
// Take two JSON-encoded list values (remember that a list value is
// {"L": [...the actual list]}) and return the concatenation, again as
// a list value.
// Returns a null value if one of the arguments is not actually a list.
rjson::value list_concatenate(const rjson::value& v1, const rjson::value& v2) {
const rjson::value* list1 = unwrap_list(v1);
const rjson::value* list2 = unwrap_list(v2);
if (!list1 || !list2) {
return rjson::null_value();
}
rjson::value cat = rjson::copy(*list1);
for (const auto& a : list2->GetArray()) {
rjson::push_back(cat, rjson::copy(a));
}
rjson::value ret = rjson::empty_object();
rjson::add(ret, "L", std::move(cat));
return ret;
}
}
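
The set_sum() and set_diff() helpers above can be illustrated without rjson. The following is a minimal sketch, assuming a set is modelled as a plain vector of strings; `string_set`, `sketch_set_sum` and `sketch_set_diff` are hypothetical names used only for illustration and are not part of Alternator. It mirrors the two behaviours visible above: the union keeps the first set's order and appends unseen elements of the second, and an empty difference is reported as an unset optional.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the array inside {"SS": [...]}; not an Alternator type.
using string_set = std::vector<std::string>;

// Union: keep the elements of s1 in order, then append the elements of s2
// that s1 does not contain (same shape as the loop in set_sum()).
string_set sketch_set_sum(const string_set& s1, const string_set& s2) {
    string_set sum = s1;
    std::set<std::string> seen(s1.begin(), s1.end());
    for (const auto& e : s2) {
        if (!seen.count(e)) {
            sum.push_back(e);
        }
    }
    return sum;
}

// Difference: s1 minus s2. An empty result becomes std::nullopt, because
// DynamoDB does not allow empty sets.
std::optional<string_set> sketch_set_diff(const string_set& s1, const string_set& s2) {
    std::set<std::string> remaining(s1.begin(), s1.end());
    for (const auto& e : s2) {
        remaining.erase(e);
    }
    if (remaining.empty()) {
        return std::nullopt;
    }
    // The result is rebuilt from the ordered set, so it comes out sorted.
    return string_set(remaining.begin(), remaining.end());
}
```

Because the difference is rebuilt from an ordered container, its element order follows the comparator rather than the input order, whereas the union preserves the order of its first argument.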

@@ -1,93 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include <string>
#include <string_view>
#include <optional>
#include "types.hh"
#include "schema_fwd.hh"
#include "keys.hh"
#include "utils/rjson.hh"
#include "utils/big_decimal.hh"
class position_in_partition;
namespace alternator {
enum class alternator_type : int8_t {
S, B, BOOL, N, NOT_SUPPORTED_YET
};
struct type_info {
alternator_type atype;
data_type dtype;
};
struct type_representation {
std::string ident;
data_type dtype;
};
inline constexpr std::string_view scylla_paging_region(":scylla:paging:region");
inline constexpr std::string_view scylla_paging_weight(":scylla:paging:weight");
type_info type_info_from_string(std::string_view type);
type_representation represent_type(alternator_type atype);
bytes serialize_item(const rjson::value& item);
rjson::value deserialize_item(bytes_view bv);
std::string type_to_string(data_type type);
bytes get_key_column_value(const rjson::value& item, const column_definition& column);
bytes get_key_from_typed_value(const rjson::value& key_typed_value, const column_definition& column);
rjson::value json_key_column_value(bytes_view cell, const column_definition& column);
partition_key pk_from_json(const rjson::value& item, schema_ptr schema);
clustering_key ck_from_json(const rjson::value& item, schema_ptr schema);
position_in_partition pos_from_json(const rjson::value& item, schema_ptr schema);
// If v encodes a number (i.e., it is a {"N": "..."}), returns an object representing it. Otherwise,
// raises ValidationException with diagnostic.
big_decimal unwrap_number(const rjson::value& v, std::string_view diagnostic);
// try_unwrap_number is like unwrap_number, but returns an unset optional
// when the given v does not encode a number.
std::optional<big_decimal> try_unwrap_number(const rjson::value& v);
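// Check if a given JSON object encodes a set (i.e., it is a {"SS": [...]}, {"NS": [...]}
// or {"BS": [...]}) and return the set's type and a pointer to that set. If v is not an
// object with exactly one member, the returned value is {"", nullptr}; if its single member
// is not a set type, the returned value is {that member's name, nullptr}.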
const std::pair<std::string, const rjson::value*> unwrap_set(const rjson::value& v);
// Check if a given JSON object encodes a list (i.e., it is a {"L": [...]})
// and return a pointer to that list, or nullptr if it does not.
const rjson::value* unwrap_list(const rjson::value& v);
// Take two JSON-encoded numeric values ({"N": "thenumber"}) and return the
// sum, again as a JSON-encoded number.
rjson::value number_add(const rjson::value& v1, const rjson::value& v2);
rjson::value number_subtract(const rjson::value& v1, const rjson::value& v2);
// Take two JSON-encoded set values (e.g. {"SS": [...the actual set]}) and
// return the sum of both sets, again as a set value.
rjson::value set_sum(const rjson::value& v1, const rjson::value& v2);
// Take two JSON-encoded set values (e.g. {"SS": [...the actual set]}) and
// return the difference v1 - v2, again as a set value.
// DynamoDB does not allow empty sets, so if resulting set is empty, return
// an unset optional instead.
std::optional<rjson::value> set_diff(const rjson::value& v1, const rjson::value& v2);
// Take two JSON-encoded list values (remember that a list value is
// {"L": [...the actual list]}) and return the concatenation, again as
// a list value.
// Returns a null value if one of the arguments is not actually a list.
rjson::value list_concatenate(const rjson::value& v1, const rjson::value& v2);
}
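
The serialized form produced by serialize_item() and consumed by deserialize_item() is a single type-tag byte followed by a payload. The following is a minimal sketch of that layout, assuming the payload is treated as opaque bytes; `tag`, `sketch_serialize` and `sketch_deserialize` are hypothetical names for illustration only, and only the enumerator order is taken from alternator_type above.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>

// Mirrors the enumerator order of alternator_type; everything else is hypothetical.
enum class tag : int8_t { S, B, BOOL, N, NOT_SUPPORTED_YET };

// Prefix the (opaque) payload with a single type-tag byte.
std::string sketch_serialize(tag t, std::string_view payload) {
    std::string out;
    out.reserve(payload.size() + 1);
    out.push_back(static_cast<char>(t));
    out.append(payload);
    return out;
}

// Split a serialized value back into its tag and payload.
// The returned view points into `bytes`, so the caller must keep it alive.
std::pair<tag, std::string_view> sketch_deserialize(std::string_view bytes) {
    if (bytes.empty()) {
        throw std::invalid_argument("Serialized value empty");
    }
    tag t = static_cast<tag>(bytes[0]);
    bytes.remove_prefix(1);
    return {t, bytes};
}
```

In the real code the payload for S, B, BOOL and N is the encoding of the matching Scylla data type, while NOT_SUPPORTED_YET falls back to storing the printed JSON of the whole item after the tag byte.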

@@ -1,632 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include "alternator/server.hh"
#include "log.hh"
#include <seastar/http/function_handlers.hh>
#include <seastar/http/short_streams.hh>
#include <seastar/core/coroutine.hh>
#include <seastar/json/json_elements.hh>
#include <seastar/util/defer.hh>
#include <seastar/util/short_streams.hh>
#include "seastarx.hh"
#include "error.hh"
#include "service/qos/service_level_controller.hh"
#include "utils/rjson.hh"
#include "auth.hh"
#include <cctype>
#include "service/storage_proxy.hh"
#include "gms/gossiper.hh"
#include "utils/overloaded_functor.hh"
#include "utils/fb_utilities.hh"
static logging::logger slogger("alternator-server");
using namespace httpd;
namespace alternator {
static constexpr auto TARGET = "X-Amz-Target";
inline std::vector<std::string_view> split(std::string_view text, char separator) {
std::vector<std::string_view> tokens;
if (text.empty()) {
return tokens;
}
while (true) {
auto pos = text.find_first_of(separator);
if (pos != std::string_view::npos) {
tokens.emplace_back(text.data(), pos);
text.remove_prefix(pos + 1);
} else {
tokens.emplace_back(text);
break;
}
}
return tokens;
}
// Handle CORS (Cross-origin resource sharing) in the HTTP request:
// If the request has the "Origin" header specifying where the script which
// makes this request comes from, we need to reply with the header
// "Access-Control-Allow-Origin: *" saying that this (and any) origin is fine.
// Additionally, if preflight==true (i.e., this is an OPTIONS request),
// the script can also "request" in headers that the server allows it to use
// some HTTP methods and headers in the followup request, and the server
// should respond by "allowing" them in the response headers.
// We also add the header "Access-Control-Expose-Headers" to let the script
// access additional headers in the response.
// This handle_CORS() should be used when handling any HTTP method - both the
// usual GET and POST, and also the "preflight" OPTIONS method.
static void handle_CORS(const request& req, reply& rep, bool preflight) {
if (!req.get_header("origin").empty()) {
rep.add_header("Access-Control-Allow-Origin", "*");
// This is the list that DynamoDB returns for expose headers. I am
// not sure why not just return "*" here, what's the risk?
rep.add_header("Access-Control-Expose-Headers", "x-amzn-RequestId,x-amzn-ErrorType,x-amzn-ErrorMessage,Date");
if (preflight) {
sstring s = req.get_header("Access-Control-Request-Headers");
if (!s.empty()) {
rep.add_header("Access-Control-Allow-Headers", std::move(s));
}
s = req.get_header("Access-Control-Request-Method");
if (!s.empty()) {
rep.add_header("Access-Control-Allow-Methods", std::move(s));
}
// Our CORS response never changes anyway, let the browser cache it
// for two hours (Chrome's maximum):
rep.add_header("Access-Control-Max-Age", "7200");
}
}
}
// DynamoDB HTTP error responses are structured as follows
// https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html
// Our handlers throw an exception to report an error. If the exception
// is of type alternator::api_error, it is unwrapped and properly reported to
// the user directly. Other exceptions are unexpected, and reported as
// Internal Server Error.
class api_handler : public handler_base {
public:
api_handler(const std::function<future<executor::request_return_type>(std::unique_ptr<request> req)>& _handle) : _f_handle(
[this, _handle](std::unique_ptr<request> req, std::unique_ptr<reply> rep) {
return seastar::futurize_invoke(_handle, std::move(req)).then_wrapped([this, rep = std::move(rep)](future<executor::request_return_type> resf) mutable {
if (resf.failed()) {
// Exceptions of type api_error are wrapped as JSON and
// returned to the client as expected. Other types of
// exceptions are unexpected, and returned to the user
// as an internal server error:
try {
resf.get();
} catch (api_error &ae) {
generate_error_reply(*rep, ae);
} catch (rjson::error & re) {
generate_error_reply(*rep,
api_error::validation(re.what()));
} catch (...) {
generate_error_reply(*rep,
api_error::internal(format("Internal server error: {}", std::current_exception())));
}
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
}
auto res = resf.get0();
std::visit(overloaded_functor {
[&] (const json::json_return_type& json_return_value) {
slogger.trace("api_handler success case");
if (json_return_value._body_writer) {
// Unfortunately, write_body() forces us to choose
// from a fixed and irrelevant list of "mime-types"
// at this point. But we'll override it with the
// one (application/x-amz-json-1.0) below.
rep->write_body("json", std::move(json_return_value._body_writer));
} else {
rep->_content += json_return_value._res;
}
},
[&] (const api_error& err) {
generate_error_reply(*rep, err);
}
}, res);
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
});
}) { }
api_handler(const api_handler&) = default;
future<std::unique_ptr<reply>> handle(const sstring& path,
std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {
handle_CORS(*req, *rep, false);
return _f_handle(std::move(req), std::move(rep)).then(
[this](std::unique_ptr<reply> rep) {
rep->set_mime_type("application/x-amz-json-1.0");
rep->done();
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
});
}
protected:
void generate_error_reply(reply& rep, const api_error& err) {
rjson::value results = rjson::empty_object();
rjson::add(results, "__type", rjson::from_string("com.amazonaws.dynamodb.v20120810#" + err._type));
rjson::add(results, "message", err._msg);
rep._content = rjson::print(std::move(results));
rep._status = err._http_code;
slogger.trace("api_handler error case: {}", rep._content);
}
future_handler_function _f_handle;
};
class gated_handler : public handler_base {
seastar::gate& _gate;
public:
gated_handler(seastar::gate& gate) : _gate(gate) {}
virtual future<std::unique_ptr<reply>> do_handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) = 0;
virtual future<std::unique_ptr<reply>> handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) final override {
return with_gate(_gate, [this, &path, req = std::move(req), rep = std::move(rep)] () mutable {
return do_handle(path, std::move(req), std::move(rep));
});
}
};
class health_handler : public gated_handler {
public:
health_handler(seastar::gate& pending_requests) : gated_handler(pending_requests) {}
protected:
virtual future<std::unique_ptr<reply>> do_handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {
handle_CORS(*req, *rep, false);
rep->set_status(reply::status_type::ok);
rep->write_body("txt", format("healthy: {}", req->get_header("Host")));
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
}
};
class local_nodelist_handler : public gated_handler {
service::storage_proxy& _proxy;
gms::gossiper& _gossiper;
public:
local_nodelist_handler(seastar::gate& pending_requests, service::storage_proxy& proxy, gms::gossiper& gossiper)
: gated_handler(pending_requests)
, _proxy(proxy)
, _gossiper(gossiper) {}
protected:
virtual future<std::unique_ptr<reply>> do_handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {
rjson::value results = rjson::empty_array();
// It's very easy to get a list of all live nodes on the cluster,
// using _gossiper.get_live_members(). But getting
// just the list of live nodes in this DC needs more elaborate code:
auto& topology = _proxy.get_token_metadata_ptr()->get_topology();
sstring local_dc = topology.get_datacenter();
std::unordered_set<gms::inet_address> local_dc_nodes = topology.get_datacenter_endpoints().at(local_dc);
for (auto& ip : local_dc_nodes) {
if (_gossiper.is_alive(ip)) {
rjson::push_back(results, rjson::from_string(ip.to_sstring()));
}
}
rep->set_status(reply::status_type::ok);
rep->set_content_type("json");
rep->_content = rjson::print(results);
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
}
};
// The CORS (Cross-origin resource sharing) protocol can send an OPTIONS
// request before ("pre-flight") the main request. The response to this
// request can be empty, but needs to have the right headers (which we
// fill with handle_CORS())
class options_handler : public gated_handler {
public:
options_handler(seastar::gate& pending_requests) : gated_handler(pending_requests) {}
protected:
virtual future<std::unique_ptr<reply>> do_handle(const sstring& path, std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {
handle_CORS(*req, *rep, true);
rep->set_status(reply::status_type::ok);
rep->write_body("txt", sstring(""));
return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
}
};
future<std::string> server::verify_signature(const request& req, const chunked_content& content) {
if (!_enforce_authorization) {
slogger.debug("Skipping authorization");
return make_ready_future<std::string>();
}
auto host_it = req._headers.find("Host");
if (host_it == req._headers.end()) {
throw api_error::invalid_signature("Host header is mandatory for signature verification");
}
auto authorization_it = req._headers.find("Authorization");
if (authorization_it == req._headers.end()) {
throw api_error::missing_authentication_token("Authorization header is mandatory for signature verification");
}
std::string host = host_it->second;
std::string_view authorization_header = authorization_it->second;
auto pos = authorization_header.find_first_of(' ');
if (pos == std::string_view::npos || authorization_header.substr(0, pos) != "AWS4-HMAC-SHA256") {
throw api_error::invalid_signature(format("Authorization header must use AWS4-HMAC-SHA256 algorithm: {}", authorization_header));
}
authorization_header.remove_prefix(pos+1);
std::string credential;
std::string user_signature;
std::string signed_headers_str;
std::vector<std::string_view> signed_headers;
do {
// Either one of a comma or space can mark the end of an entry
pos = authorization_header.find_first_of(" ,");
std::string_view entry = authorization_header.substr(0, pos);
if (pos != std::string_view::npos) {
authorization_header.remove_prefix(pos + 1);
}
if (entry.empty()) {
continue;
}
std::vector<std::string_view> entry_split = split(entry, '=');
if (entry_split.size() != 2) {
continue;
}
std::string_view auth_value = entry_split[1];
if (entry_split[0] == "Credential") {
credential = std::string(auth_value);
} else if (entry_split[0] == "Signature") {
user_signature = std::string(auth_value);
} else if (entry_split[0] == "SignedHeaders") {
signed_headers_str = std::string(auth_value);
signed_headers = split(auth_value, ';');
std::sort(signed_headers.begin(), signed_headers.end());
}
} while (pos != std::string_view::npos);
std::vector<std::string_view> credential_split = split(credential, '/');
if (credential_split.size() != 5) {
throw api_error::validation(format("Incorrect credential information format: {}", credential));
}
std::string user(credential_split[0]);
std::string datestamp(credential_split[1]);
std::string region(credential_split[2]);
std::string service(credential_split[3]);
std::map<std::string_view, std::string_view> signed_headers_map;
for (const auto& header : signed_headers) {
signed_headers_map.emplace(header, std::string_view());
}
for (auto& header : req._headers) {
std::string header_str;
header_str.resize(header.first.size());
std::transform(header.first.begin(), header.first.end(), header_str.begin(), ::tolower);
auto it = signed_headers_map.find(header_str);
if (it != signed_headers_map.end()) {
it->second = std::string_view(header.second);
}
}
auto cache_getter = [&proxy = _proxy] (std::string username) {
return get_key_from_roles(proxy, std::move(username));
};
return _key_cache.get_ptr(user, cache_getter).then([this, &req, &content,
user = std::move(user),
host = std::move(host),
datestamp = std::move(datestamp),
signed_headers_str = std::move(signed_headers_str),
signed_headers_map = std::move(signed_headers_map),
region = std::move(region),
service = std::move(service),
user_signature = std::move(user_signature)] (key_cache::value_ptr key_ptr) {
std::string signature = get_signature(user, *key_ptr, std::string_view(host), req._method,
datestamp, signed_headers_str, signed_headers_map, content, region, service, "");
if (signature != std::string_view(user_signature)) {
_key_cache.remove(user);
throw api_error::unrecognized_client("The security token included in the request is invalid.");
}
return user;
});
}
static tracing::trace_state_ptr create_tracing_session(tracing::tracing& tracing_instance) {
tracing::trace_state_props_set props;
props.set<tracing::trace_state_props::full_tracing>();
props.set_if<tracing::trace_state_props::log_slow_query>(tracing_instance.slow_query_tracing_enabled());
return tracing_instance.create_session(tracing::trace_type::QUERY, props);
}
// truncated_content_view() prints a potentially long chunked_content for
// debugging purposes. In the common case when the content is not excessively
// long, it just returns a view into the given content, without any copying.
// But when the content is very long, it is truncated after some arbitrary
// max_len (or one chunk, whichever comes first), with "<truncated>" added at
// the end. To do this modification to the string, we need to create a new
// std::string, so the caller must pass us a reference to one, "buf", where
// we can store the content. The returned view is only valid for as long as this
// buf is kept alive.
static std::string_view truncated_content_view(const chunked_content& content, std::string& buf) {
constexpr size_t max_len = 1024;
if (content.empty()) {
return std::string_view();
} else if (content.size() == 1 && content.begin()->size() <= max_len) {
return std::string_view(content.begin()->get(), content.begin()->size());
} else {
buf = std::string(content.begin()->get(), std::min(content.begin()->size(), max_len)) + "<truncated>";
return std::string_view(buf);
}
}
static tracing::trace_state_ptr maybe_trace_query(service::client_state& client_state, std::string_view username, sstring_view op, const chunked_content& query) {
tracing::trace_state_ptr trace_state;
tracing::tracing& tracing_instance = tracing::tracing::get_local_tracing_instance();
if (tracing_instance.trace_next_query() || tracing_instance.slow_query_tracing_enabled()) {
trace_state = create_tracing_session(tracing_instance);
std::string buf;
tracing::add_session_param(trace_state, "alternator_op", op);
tracing::add_query(trace_state, truncated_content_view(query, buf));
tracing::begin(trace_state, format("Alternator {}", op), client_state.get_client_address());
if (!username.empty()) {
tracing::set_username(trace_state, auth::authenticated_user(username));
}
}
return trace_state;
}
future<executor::request_return_type> server::handle_api_request(std::unique_ptr<request> req) {
_executor._stats.total_operations++;
sstring target = req->get_header(TARGET);
std::vector<std::string_view> split_target = split(target, '.');
//NOTICE(sarna): Target consists of Dynamo API version followed by a dot '.' and operation type (e.g. CreateTable)
std::string op = split_target.empty() ? std::string() : std::string(split_target.back());
// JSON parsing can allocate up to roughly 2x the size of the raw
// document, + a couple of bytes for maintenance.
// TODO: consider the case where req->content_length is missing. Maybe
// we need to take the content_length_limit and return some of the units
// when we finish read_content_and_verify_signature?
size_t mem_estimate = req->content_length * 2 + 8000;
auto units_fut = get_units(*_memory_limiter, mem_estimate);
if (_memory_limiter->waiters()) {
++_executor._stats.requests_blocked_memory;
}
auto units = co_await std::move(units_fut);
assert(req->content_stream);
chunked_content content = co_await util::read_entire_stream(*req->content_stream);
auto username = co_await verify_signature(*req, content);
if (slogger.is_enabled(log_level::trace)) {
std::string buf;
slogger.trace("Request: {} {} {}", op, truncated_content_view(content, buf), req->_headers);
}
auto callback_it = _callbacks.find(op);
if (callback_it == _callbacks.end()) {
_executor._stats.unsupported_operations++;
co_return api_error::unknown_operation(format("Unsupported operation {}", op));
}
if (_pending_requests.get_count() >= _max_concurrent_requests) {
_executor._stats.requests_shed++;
co_return api_error::request_limit_exceeded(format("too many in-flight requests (configured via max_concurrent_requests_per_shard): {}", _pending_requests.get_count()));
}
_pending_requests.enter();
auto leave = defer([this] () noexcept { _pending_requests.leave(); });
//FIXME: Client state can provide more context, e.g. client's endpoint address
// We use unique_ptr because client_state cannot be moved or copied
executor::client_state client_state = username.empty()
? service::client_state{service::client_state::internal_tag()}
: service::client_state{service::client_state::internal_tag(), _auth_service, _sl_controller, username};
co_await client_state.maybe_update_per_service_level_params();
tracing::trace_state_ptr trace_state = maybe_trace_query(client_state, username, op, content);
tracing::trace(trace_state, op);
rjson::value json_request = co_await _json_parser.parse(std::move(content));
co_return co_await callback_it->second(_executor, client_state, trace_state,
make_service_permit(std::move(units)), std::move(json_request), std::move(req));
}
void server::set_routes(routes& r) {
api_handler* req_handler = new api_handler([this] (std::unique_ptr<request> req) mutable {
return handle_api_request(std::move(req));
});
r.put(operation_type::POST, "/", req_handler);
r.put(operation_type::GET, "/", new health_handler(_pending_requests));
// The "/localnodes" request is a new Alternator feature, not supported by
// DynamoDB and not required for DynamoDB compatibility. It allows a
// client to enquire - using a trivial HTTP request without requiring
// authentication - the list of all live nodes in the same data center of
// the Alternator cluster. The client can use this list to balance its
// request load to all the nodes in the same geographical region.
// Note that this API exposes - openly without authentication - the
// information on the cluster's members inside one data center. We do not
// consider this to be a security risk, because an attacker can already
// scan an entire subnet for nodes responding to the health request,
// or even just scan for open ports.
r.put(operation_type::GET, "/localnodes", new local_nodelist_handler(_pending_requests, _proxy, _gossiper));
r.put(operation_type::OPTIONS, "/", new options_handler(_pending_requests));
}
//FIXME: A way to immediately invalidate the cache should be considered,
// e.g. when the system table which stores the keys is changed.
// For now, this propagation may take up to 1 minute.
server::server(executor& exec, service::storage_proxy& proxy, gms::gossiper& gossiper, auth::service& auth_service, qos::service_level_controller& sl_controller)
: _http_server("http-alternator")
, _https_server("https-alternator")
, _executor(exec)
, _proxy(proxy)
, _gossiper(gossiper)
, _auth_service(auth_service)
, _sl_controller(sl_controller)
, _key_cache(1024, 1min, slogger)
, _enforce_authorization(false)
, _enabled_servers{}
, _pending_requests{}
, _callbacks{
{"CreateTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.create_table(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"DescribeTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.describe_table(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"DeleteTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.delete_table(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"UpdateTable", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.update_table(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"PutItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.put_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"UpdateItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.update_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"GetItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.get_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"DeleteItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.delete_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"ListTables", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.list_tables(client_state, std::move(permit), std::move(json_request));
}},
{"Scan", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.scan(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"DescribeEndpoints", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.describe_endpoints(client_state, std::move(permit), std::move(json_request), req->get_header("Host"));
}},
{"BatchWriteItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.batch_write_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"BatchGetItem", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.batch_get_item(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"Query", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.query(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"TagResource", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.tag_resource(client_state, std::move(permit), std::move(json_request));
}},
{"UntagResource", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.untag_resource(client_state, std::move(permit), std::move(json_request));
}},
{"ListTagsOfResource", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.list_tags_of_resource(client_state, std::move(permit), std::move(json_request));
}},
{"UpdateTimeToLive", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.update_time_to_live(client_state, std::move(permit), std::move(json_request));
}},
{"DescribeTimeToLive", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.describe_time_to_live(client_state, std::move(permit), std::move(json_request));
}},
{"ListStreams", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.list_streams(client_state, std::move(permit), std::move(json_request));
}},
{"DescribeStream", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.describe_stream(client_state, std::move(permit), std::move(json_request));
}},
{"GetShardIterator", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.get_shard_iterator(client_state, std::move(permit), std::move(json_request));
}},
{"GetRecords", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.get_records(client_state, std::move(trace_state), std::move(permit), std::move(json_request));
}},
{"DescribeContinuousBackups", [] (executor& e, executor::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit, rjson::value json_request, std::unique_ptr<request> req) {
return e.describe_continuous_backups(client_state, std::move(permit), std::move(json_request));
}},
} {
}
future<> server::init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port, std::optional<tls::credentials_builder> creds,
bool enforce_authorization, semaphore* memory_limiter, utils::updateable_value<uint32_t> max_concurrent_requests) {
_memory_limiter = memory_limiter;
_enforce_authorization = enforce_authorization;
_max_concurrent_requests = std::move(max_concurrent_requests);
if (!port && !https_port) {
return make_exception_future<>(std::runtime_error("Either regular port or TLS port"
" must be specified in order to init an alternator HTTP server instance"));
}
return seastar::async([this, addr, port, https_port, creds] {
_executor.start().get();
if (port) {
set_routes(_http_server._routes);
_http_server.set_content_length_limit(server::content_length_limit);
_http_server.set_content_streaming(true);
_http_server.listen(socket_address{addr, *port}).get();
_enabled_servers.push_back(std::ref(_http_server));
}
if (https_port) {
set_routes(_https_server._routes);
_https_server.set_content_length_limit(server::content_length_limit);
_https_server.set_content_streaming(true);
_https_server.set_tls_credentials(creds->build_reloadable_server_credentials([](const std::unordered_set<sstring>& files, std::exception_ptr ep) {
if (ep) {
slogger.warn("Exception loading {}: {}", files, ep);
} else {
slogger.info("Reloaded {}", files);
}
}).get0());
_https_server.listen(socket_address{addr, *https_port}).get();
_enabled_servers.push_back(std::ref(_https_server));
}
});
}
future<> server::stop() {
return parallel_for_each(_enabled_servers, [] (http_server& server) {
return server.stop();
}).then([this] {
return _pending_requests.close();
}).then([this] {
return _json_parser.stop();
});
}
server::json_parser::json_parser() : _run_parse_json_thread(async([this] {
while (true) {
_document_waiting.wait().get();
if (_as.abort_requested()) {
return;
}
try {
_parsed_document = rjson::parse_yieldable(std::move(_raw_document));
_current_exception = nullptr;
} catch (...) {
_current_exception = std::current_exception();
}
_document_parsed.signal();
}
})) {
}
future<rjson::value> server::json_parser::parse(chunked_content&& content) {
if (content.size() < yieldable_parsing_threshold) {
return make_ready_future<rjson::value>(rjson::parse(std::move(content)));
}
return with_semaphore(_parsing_sem, 1, [this, content = std::move(content)] () mutable {
_raw_document = std::move(content);
_document_waiting.signal();
return _document_parsed.wait().then([this] {
if (_current_exception) {
return make_exception_future<rjson::value>(_current_exception);
}
return make_ready_future<rjson::value>(std::move(_parsed_document));
});
});
}
future<> server::json_parser::stop() {
_as.request_abort();
_document_waiting.signal();
_document_parsed.broken();
return std::move(_run_parse_json_thread);
}
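// Illustrative sketch, not part of this file: the hand-off pattern used by
// json_parser above (one long-lived worker, one document in flight at a
// time, errors carried back as std::exception_ptr), rewritten with standard
// C++ threads instead of Seastar threads and futures. This is a swapped-in
// technique for illustration only. The class name `background_parser` and
// the use of std::stol as the "parser" are assumptions made for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <exception>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>
#include <utility>

class background_parser {
    std::mutex _serialize;          // one caller at a time (like _parsing_sem)
    std::mutex _m;                  // protects the fields below
    std::condition_variable _cv;
    std::string _raw;               // document waiting to be parsed
    long _parsed = 0;               // result of the last parse
    std::exception_ptr _error;      // error of the last parse, if any
    bool _has_work = false;
    bool _done = false;
    bool _stopping = false;
    std::thread _worker;            // declared last: started after all fields

    void run() {
        std::unique_lock<std::mutex> lk(_m);
        while (true) {
            _cv.wait(lk, [this] { return _has_work || _stopping; });
            if (_stopping) {
                return;
            }
            _has_work = false;
            try {
                _parsed = std::stol(_raw);
                _error = nullptr;
            } catch (...) {
                _error = std::current_exception();
            }
            _done = true;
            _cv.notify_all();
        }
    }
public:
    background_parser() : _worker([this] { run(); }) {}
    ~background_parser() {
        {
            std::lock_guard<std::mutex> lk(_m);
            _stopping = true;
        }
        _cv.notify_all();
        _worker.join();
    }
    // Hands the document to the worker and blocks until it has been parsed.
    long parse(std::string doc) {
        std::lock_guard<std::mutex> one_at_a_time(_serialize);
        std::unique_lock<std::mutex> lk(_m);
        _raw = std::move(doc);
        _has_work = true;
        _done = false;
        _cv.notify_all();
        _cv.wait(lk, [this] { return _done; });
        if (_error) {
            std::rethrow_exception(_error);
        }
        return _parsed;
    }
};
```

// Unlike the Seastar version, this sketch blocks the calling thread while it
// waits, and it does not bypass the worker for small documents.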
const char* api_error::what() const noexcept {
if (_what_string.empty()) {
_what_string = format("{} {}: {}", static_cast<int>(_http_code), _type, _msg);
}
return _what_string.c_str();
}
}

@@ -1,84 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include "alternator/executor.hh"
#include <seastar/core/future.hh>
#include <seastar/core/condition-variable.hh>
#include <seastar/http/httpd.hh>
#include <seastar/net/tls.hh>
#include <optional>
#include "alternator/auth.hh"
#include "service/qos/service_level_controller.hh"
#include "utils/small_vector.hh"
#include "utils/updateable_value.hh"
#include <seastar/core/units.hh>
namespace alternator {
using chunked_content = rjson::chunked_content;
class server {
static constexpr size_t content_length_limit = 16*MB;
using alternator_callback = std::function<future<executor::request_return_type>(executor&, executor::client_state&,
tracing::trace_state_ptr, service_permit, rjson::value, std::unique_ptr<request>)>;
using alternator_callbacks_map = std::unordered_map<std::string_view, alternator_callback>;
http_server _http_server;
http_server _https_server;
executor& _executor;
service::storage_proxy& _proxy;
gms::gossiper& _gossiper;
auth::service& _auth_service;
qos::service_level_controller& _sl_controller;
key_cache _key_cache;
bool _enforce_authorization;
utils::small_vector<std::reference_wrapper<seastar::httpd::http_server>, 2> _enabled_servers;
gate _pending_requests;
alternator_callbacks_map _callbacks;
semaphore* _memory_limiter;
utils::updateable_value<uint32_t> _max_concurrent_requests;
class json_parser {
static constexpr size_t yieldable_parsing_threshold = 16*KB;
chunked_content _raw_document;
rjson::value _parsed_document;
std::exception_ptr _current_exception;
semaphore _parsing_sem{1};
condition_variable _document_waiting;
condition_variable _document_parsed;
abort_source _as;
future<> _run_parse_json_thread;
public:
json_parser();
// Moving a chunked_content into parse() allows parse() to free each
// chunk as soon as it is parsed, so when chunks are relatively small,
// we don't need to store the sum of unparsed and parsed sizes.
future<rjson::value> parse(chunked_content&& content);
future<> stop();
};
json_parser _json_parser;
public:
server(executor& executor, service::storage_proxy& proxy, gms::gossiper& gossiper, auth::service& service, qos::service_level_controller& sl_controller);
future<> init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port, std::optional<tls::credentials_builder> creds,
bool enforce_authorization, semaphore* memory_limiter, utils::updateable_value<uint32_t> max_concurrent_requests);
future<> stop();
private:
void set_routes(seastar::httpd::routes& r);
// If verification succeeds, returns the authenticated user's username
future<std::string> verify_signature(const seastar::httpd::request&, const chunked_content&);
future<executor::request_return_type> handle_api_request(std::unique_ptr<request> req);
};
}

@@ -1,99 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include "stats.hh"
#include "utils/histogram_metrics_helper.hh"
#include <seastar/core/metrics.hh>
namespace alternator {
const char* ALTERNATOR_METRICS = "alternator";
stats::stats() : api_operations{} {
// Register the per-operation counters and latency histograms.
seastar::metrics::label op("op");
_metrics.add_group("alternator", {
#define OPERATION(name, CamelCaseName) \
seastar::metrics::make_total_operations("operation", api_operations.name, \
seastar::metrics::description("number of operations via Alternator API"), {op(CamelCaseName)}),
#define OPERATION_LATENCY(name, CamelCaseName) \
seastar::metrics::make_histogram("op_latency", \
seastar::metrics::description("Latency histogram of an operation via Alternator API"), {op(CamelCaseName)}, [this]{return to_metrics_histogram(api_operations.name);}),
OPERATION(batch_get_item, "BatchGetItem")
OPERATION(batch_write_item, "BatchWriteItem")
OPERATION(create_backup, "CreateBackup")
OPERATION(create_global_table, "CreateGlobalTable")
OPERATION(create_table, "CreateTable")
OPERATION(delete_backup, "DeleteBackup")
OPERATION(delete_item, "DeleteItem")
OPERATION(delete_table, "DeleteTable")
OPERATION(describe_backup, "DescribeBackup")
OPERATION(describe_continuous_backups, "DescribeContinuousBackups")
OPERATION(describe_endpoints, "DescribeEndpoints")
OPERATION(describe_global_table, "DescribeGlobalTable")
OPERATION(describe_global_table_settings, "DescribeGlobalTableSettings")
OPERATION(describe_limits, "DescribeLimits")
OPERATION(describe_table, "DescribeTable")
OPERATION(describe_time_to_live, "DescribeTimeToLive")
OPERATION(get_item, "GetItem")
OPERATION(list_backups, "ListBackups")
OPERATION(list_global_tables, "ListGlobalTables")
OPERATION(list_tables, "ListTables")
OPERATION(list_tags_of_resource, "ListTagsOfResource")
OPERATION(put_item, "PutItem")
OPERATION(query, "Query")
OPERATION(restore_table_from_backup, "RestoreTableFromBackup")
OPERATION(restore_table_to_point_in_time, "RestoreTableToPointInTime")
OPERATION(scan, "Scan")
OPERATION(tag_resource, "TagResource")
OPERATION(transact_get_items, "TransactGetItems")
OPERATION(transact_write_items, "TransactWriteItems")
OPERATION(untag_resource, "UntagResource")
OPERATION(update_continuous_backups, "UpdateContinuousBackups")
OPERATION(update_global_table, "UpdateGlobalTable")
OPERATION(update_global_table_settings, "UpdateGlobalTableSettings")
OPERATION(update_item, "UpdateItem")
OPERATION(update_table, "UpdateTable")
OPERATION(update_time_to_live, "UpdateTimeToLive")
OPERATION_LATENCY(put_item_latency, "PutItem")
OPERATION_LATENCY(get_item_latency, "GetItem")
OPERATION_LATENCY(delete_item_latency, "DeleteItem")
OPERATION_LATENCY(update_item_latency, "UpdateItem")
OPERATION(list_streams, "ListStreams")
OPERATION(describe_stream, "DescribeStream")
OPERATION(get_shard_iterator, "GetShardIterator")
OPERATION(get_records, "GetRecords")
OPERATION_LATENCY(get_records_latency, "GetRecords")
});
_metrics.add_group("alternator", {
seastar::metrics::make_total_operations("unsupported_operations", unsupported_operations,
seastar::metrics::description("number of unsupported operations via Alternator API")),
seastar::metrics::make_total_operations("total_operations", total_operations,
seastar::metrics::description("number of total operations via Alternator API")),
seastar::metrics::make_total_operations("reads_before_write", reads_before_write,
seastar::metrics::description("number of performed read-before-write operations")),
seastar::metrics::make_total_operations("write_using_lwt", write_using_lwt,
seastar::metrics::description("number of writes that used LWT")),
seastar::metrics::make_total_operations("shard_bounce_for_lwt", shard_bounce_for_lwt,
seastar::metrics::description("number of writes that had to be bounced from this shard because of LWT requirements")),
seastar::metrics::make_total_operations("requests_blocked_memory", requests_blocked_memory,
seastar::metrics::description("Counts a number of requests blocked due to memory pressure.")),
seastar::metrics::make_total_operations("requests_shed", requests_shed,
seastar::metrics::description("Counts a number of requests shed due to overload.")),
seastar::metrics::make_total_operations("filtered_rows_read_total", cql_stats.filtered_rows_read_total,
seastar::metrics::description("number of rows read during filtering operations")),
seastar::metrics::make_total_operations("filtered_rows_matched_total", cql_stats.filtered_rows_matched_total,
seastar::metrics::description("number of rows read and matched during filtering operations")),
seastar::metrics::make_total_operations("filtered_rows_dropped_total", [this] { return cql_stats.filtered_rows_read_total - cql_stats.filtered_rows_matched_total; },
seastar::metrics::description("number of rows read and dropped during filtering operations")),
});
}
}

@@ -1,91 +0,0 @@
/*
* Copyright 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include <cstdint>
#include <seastar/core/metrics_registration.hh>
#include "seastarx.hh"
#include "utils/estimated_histogram.hh"
#include "cql3/stats.hh"
namespace alternator {
// Object holding per-shard statistics related to Alternator.
// While this object is alive, these metrics are also registered to be
// visible by the metrics REST API, with the "alternator" prefix.
class stats {
public:
stats();
// Count of DynamoDB API operations, by type
struct {
uint64_t batch_get_item = 0;
uint64_t batch_write_item = 0;
uint64_t create_backup = 0;
uint64_t create_global_table = 0;
uint64_t create_table = 0;
uint64_t delete_backup = 0;
uint64_t delete_item = 0;
uint64_t delete_table = 0;
uint64_t describe_backup = 0;
uint64_t describe_continuous_backups = 0;
uint64_t describe_endpoints = 0;
uint64_t describe_global_table = 0;
uint64_t describe_global_table_settings = 0;
uint64_t describe_limits = 0;
uint64_t describe_table = 0;
uint64_t describe_time_to_live = 0;
uint64_t get_item = 0;
uint64_t list_backups = 0;
uint64_t list_global_tables = 0;
uint64_t list_tables = 0;
uint64_t list_tags_of_resource = 0;
uint64_t put_item = 0;
uint64_t query = 0;
uint64_t restore_table_from_backup = 0;
uint64_t restore_table_to_point_in_time = 0;
uint64_t scan = 0;
uint64_t tag_resource = 0;
uint64_t transact_get_items = 0;
uint64_t transact_write_items = 0;
uint64_t untag_resource = 0;
uint64_t update_continuous_backups = 0;
uint64_t update_global_table = 0;
uint64_t update_global_table_settings = 0;
uint64_t update_item = 0;
uint64_t update_table = 0;
uint64_t update_time_to_live = 0;
uint64_t list_streams = 0;
uint64_t describe_stream = 0;
uint64_t get_shard_iterator = 0;
uint64_t get_records = 0;
utils::time_estimated_histogram put_item_latency;
utils::time_estimated_histogram get_item_latency;
utils::time_estimated_histogram delete_item_latency;
utils::time_estimated_histogram update_item_latency;
utils::time_estimated_histogram get_records_latency;
} api_operations;
// Miscellaneous event counters
uint64_t total_operations = 0;
uint64_t unsupported_operations = 0;
uint64_t reads_before_write = 0;
uint64_t write_using_lwt = 0;
uint64_t shard_bounce_for_lwt = 0;
uint64_t requests_blocked_memory = 0;
uint64_t requests_shed = 0;
// CQL-derived stats
cql3::cql_stats cql_stats;
private:
// The metric_groups object keeps this stats object's metrics registered
// for as long as the stats object is alive.
seastar::metrics::metric_groups _metrics;
};
}

File diff suppressed because it is too large

@@ -1,825 +0,0 @@
/*
* Copyright 2021-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include <chrono>
#include <cstdint>
#include <optional>
#include <seastar/core/sstring.hh>
#include <seastar/core/coroutine.hh>
#include <seastar/core/sleep.hh>
#include <seastar/core/future.hh>
#include <seastar/core/lowres_clock.hh>
#include <seastar/coroutine/maybe_yield.hh>
#include <boost/multiprecision/cpp_int.hpp>
#include "gms/gossiper.hh"
#include "gms/inet_address.hh"
#include "inet_address_vectors.hh"
#include "locator/abstract_replication_strategy.hh"
#include "log.hh"
#include "gc_clock.hh"
#include "replica/database.hh"
#include "service_permit.hh"
#include "timestamp.hh"
#include "service/storage_proxy.hh"
#include "service/pager/paging_state.hh"
#include "service/pager/query_pagers.hh"
#include "gms/feature_service.hh"
#include "sstables/types.hh"
#include "mutation.hh"
#include "types.hh"
#include "types/map.hh"
#include "utils/rjson.hh"
#include "utils/big_decimal.hh"
#include "utils/fb_utilities.hh"
#include "cql3/selection/selection.hh"
#include "cql3/values.hh"
#include "cql3/query_options.hh"
#include "cql3/column_identifier.hh"
#include "alternator/executor.hh"
#include "alternator/controller.hh"
#include "alternator/serialization.hh"
#include "dht/sharder.hh"
#include "db/config.hh"
#include "db/tags/utils.hh"
#include "ttl.hh"
static logging::logger tlogger("alternator_ttl");
namespace alternator {
// We record the expiration-time attribute enabled on a table in a tag,
// TTL_TAG_KEY.
// Currently, the *value* of this tag is simply the name of the attribute,
// and the expiration scanner interprets it as an Alternator attribute name:
// it can refer to a real column or, if that doesn't exist, to a member of
// the ":attrs" map column. Although this is designed for Alternator, it may
// be good enough for CQL as well (there, the ":attrs" column won't exist).
static const sstring TTL_TAG_KEY("system:ttl_attribute");
future<executor::request_return_type> executor::update_time_to_live(client_state& client_state, service_permit permit, rjson::value request) {
_stats.api_operations.update_time_to_live++;
if (!_proxy.data_dictionary().features().alternator_ttl) {
co_return api_error::unknown_operation("UpdateTimeToLive not yet supported. Experimental support is available if the 'alternator-ttl' experimental feature is enabled on all nodes.");
}
schema_ptr schema = get_table(_proxy, request);
rjson::value* spec = rjson::find(request, "TimeToLiveSpecification");
if (!spec || !spec->IsObject()) {
co_return api_error::validation("UpdateTimeToLive missing mandatory TimeToLiveSpecification");
}
const rjson::value* v = rjson::find(*spec, "Enabled");
if (!v || !v->IsBool()) {
co_return api_error::validation("UpdateTimeToLive requires boolean Enabled");
}
bool enabled = v->GetBool();
v = rjson::find(*spec, "AttributeName");
if (!v || !v->IsString()) {
co_return api_error::validation("UpdateTimeToLive requires string AttributeName");
}
// Although the DynamoDB documentation specifies that attribute names
// should be between 1 and 64K bytes, in practice, it only allows
// between 1 and 255 bytes. There are no other limitations on which
// characters are allowed in the name.
if (v->GetStringLength() < 1 || v->GetStringLength() > 255) {
co_return api_error::validation("The length of AttributeName must be between 1 and 255");
}
sstring attribute_name(v->GetString(), v->GetStringLength());
std::map<sstring, sstring> tags_map = get_tags_of_table_or_throw(schema);
if (enabled) {
if (tags_map.contains(TTL_TAG_KEY)) {
co_return api_error::validation("TTL is already enabled");
}
tags_map[TTL_TAG_KEY] = attribute_name;
} else {
auto i = tags_map.find(TTL_TAG_KEY);
if (i == tags_map.end()) {
co_return api_error::validation("TTL is already disabled");
} else if (i->second != attribute_name) {
co_return api_error::validation(format(
"Requested to disable TTL on attribute {}, but a different attribute {} is enabled.",
attribute_name, i->second));
}
tags_map.erase(TTL_TAG_KEY);
}
co_await db::update_tags(_mm, schema, std::move(tags_map));
// Prepare the response, which contains a TimeToLiveSpecification
// basically identical to the request's
rjson::value response = rjson::empty_object();
rjson::add(response, "TimeToLiveSpecification", std::move(*spec));
co_return make_jsonable(std::move(response));
}
future<executor::request_return_type> executor::describe_time_to_live(client_state& client_state, service_permit permit, rjson::value request) {
_stats.api_operations.describe_time_to_live++;
schema_ptr schema = get_table(_proxy, request);
std::map<sstring, sstring> tags_map = get_tags_of_table_or_throw(schema);
rjson::value desc = rjson::empty_object();
auto i = tags_map.find(TTL_TAG_KEY);
if (i == tags_map.end()) {
rjson::add(desc, "TimeToLiveStatus", "DISABLED");
} else {
rjson::add(desc, "TimeToLiveStatus", "ENABLED");
rjson::add(desc, "AttributeName", rjson::from_string(i->second));
}
rjson::value response = rjson::empty_object();
rjson::add(response, "TimeToLiveDescription", std::move(desc));
co_return make_jsonable(std::move(response));
}
// expiration_service is a sharded service responsible for cleaning up expired
// items in all tables with per-item expiration enabled. Currently, this means
// Alternator tables with TTL configured via an UpdateTimeToLive request.
//
// Here is a brief overview of how the expiration service works:
//
// An expiration thread on each shard periodically scans the items (i.e.,
// rows) owned by this shard, looking for items whose chosen expiration-time
// attribute indicates they are expired, and deletes those items.
// The expiration-time "attribute" can be either an actual Scylla column
// (must be numeric) or an Alternator "attribute" - i.e., an element in
// the ATTRS_COLUMN_NAME map<utf8,bytes> column where the numeric expiration
// time is encoded in DynamoDB's JSON encoding inside the bytes value.
// To avoid scanning the same items RF times in RF replicas, only one node is
// responsible for scanning a token range at a time. Normally, this is the
// node owning this range as a "primary range" (the first node in the ring
// with this range), but when this node is down, the secondary owner (the
// second in the ring) may take over.
// An expiration thread is responsible for all tables which need expiration
// scans. Currently, the different tables are scanned sequentially (not in
// parallel).
// The expiration thread scans items using CL=QUORUM to ensure that it reads
// a consistent expiration-time attribute. This means that the items are read
// locally, and in addition QUORUM-1 other nodes (one other node when RF=3)
// need to read the data and send digests.
// When the expiration thread decides that an item has expired and wants
// to delete it, it does it using a CL=QUORUM write. This allows this
// deletion to be visible for consistent (quorum) reads. The deletion,
// like user deletions, will also appear on the CDC log and therefore
// Alternator Streams if enabled - currently as ordinary deletes (the
// userIdentity flag is currently missing; this is issue #11523).
expiration_service::expiration_service(data_dictionary::database db, service::storage_proxy& proxy, gms::gossiper& g)
: _db(db)
, _proxy(proxy)
, _gossiper(g)
{
}
// Convert the big_decimal used to represent expiration time to an integer.
// Any fractional part is dropped. If the number is negative or invalid,
// 0 is returned, and if it's too high, the maximum unsigned long is returned.
static unsigned long bigdecimal_to_ul(const big_decimal& bd) {
// The big_decimal format has an integer mantissa of arbitrary length
// "unscaled_value" and then a (power of 10) exponent "scale".
if (bd.unscaled_value() <= 0) {
return 0;
}
if (bd.scale() == 0) {
// The fast path, when the expiration time is an integer, scale==0.
return static_cast<unsigned long>(bd.unscaled_value());
}
// Because the mantissa can be of arbitrary length, we work on it
// as a string. TODO: find a less ugly algorithm.
auto str = bd.unscaled_value().str();
if (bd.scale() > 0) {
int len = str.length();
if (len < bd.scale()) {
return 0;
}
str = str.substr(0, len-bd.scale());
} else {
if (bd.scale() < -20) {
return std::numeric_limits<unsigned long>::max();
}
for (int i = 0; i < -bd.scale(); i++) {
str.push_back('0');
}
}
// strtoul() returns ULONG_MAX if the number is too large, or 0 if not
// a number.
return strtoul(str.c_str(), nullptr, 10);
}
// The following is_expired() functions all check if an item with the given
// expiration time has expired, according to the DynamoDB API rules.
// The rules are:
// 1. If the expiration time attribute's value is not a number type,
// the item is not expired.
// 2. The expiration time is measured in seconds since the UNIX epoch.
// 3. If the expiration time is more than 5 years in the past, it is assumed
// to be malformed and ignored - and the item does not expire.
static bool is_expired(gc_clock::time_point expiration_time, gc_clock::time_point now) {
return expiration_time <= now &&
expiration_time > now - std::chrono::years(5);
}
static bool is_expired(const big_decimal& expiration_time, gc_clock::time_point now) {
unsigned long t = bigdecimal_to_ul(expiration_time);
// We assume - and the assumption turns out to be correct - that the
// epoch of gc_clock::time_point and the one used by the DynamoDB protocol
// are the same (the UNIX epoch in UTC). The resolution (seconds) is also
// the same.
return is_expired(gc_clock::time_point(gc_clock::duration(std::chrono::seconds(t))), now);
}
static bool is_expired(const rjson::value& expiration_time, gc_clock::time_point now) {
std::optional<big_decimal> n = try_unwrap_number(expiration_time);
return n && is_expired(*n, now);
}
// expire_item() expires an item - i.e., deletes it as appropriate for
// expiration - with CL=QUORUM and (FIXME!) in a way Alternator Streams
// understands it is an expiration event - not a user-initiated deletion.
static future<> expire_item(service::storage_proxy& proxy,
const service::query_state& qs,
const std::vector<bytes_opt>& row,
schema_ptr schema,
api::timestamp_type ts) {
// Prepare the row key to delete
// NOTICE: the order of columns is guaranteed by the fact that selection::wildcard
// is used, which indicates that columns appear in the order defined by
// schema::all_columns_in_select_order() - partition key columns go first,
// immediately followed by clustering key columns
std::vector<bytes> exploded_pk;
const unsigned pk_size = schema->partition_key_size();
const unsigned ck_size = schema->clustering_key_size();
for (unsigned c = 0; c < pk_size; ++c) {
const auto& row_c = row[c];
if (!row_c) {
// This shouldn't happen - all key columns must have values.
// But if it ever happens, let's just *not* expire the item.
// FIXME: log or increment a metric if this happens.
return make_ready_future<>();
}
exploded_pk.push_back(*row_c);
}
auto pk = partition_key::from_exploded(exploded_pk);
mutation m(schema, pk);
// If there's no clustering key, a tombstone should be created directly
// on a partition, not on a clustering row - otherwise it will look like
// an open-ended range tombstone, which will crash on KA/LA sstable format.
// See issue #6035
if (ck_size == 0) {
m.partition().apply(tombstone(ts, gc_clock::now()));
} else {
std::vector<bytes> exploded_ck;
for (unsigned c = pk_size; c < pk_size + ck_size; ++c) {
const auto& row_c = row[c];
if (!row_c) {
// This shouldn't happen - all key columns must have values.
// But if it ever happens, let's just *not* expire the item.
// FIXME: log or increment a metric if this happens.
return make_ready_future<>();
}
exploded_ck.push_back(*row_c);
}
auto ck = clustering_key::from_exploded(exploded_ck);
m.partition().clustered_row(*schema, ck).apply(tombstone(ts, gc_clock::now()));
}
std::vector<mutation> mutations;
mutations.push_back(std::move(m));
return proxy.mutate(std::move(mutations),
db::consistency_level::LOCAL_QUORUM,
executor::default_timeout(), // FIXME - which timeout?
qs.get_trace_state(), qs.get_permit(),
db::allow_per_partition_rate_limit::no);
}
static size_t random_offset(size_t min, size_t max) {
static thread_local std::default_random_engine re{std::random_device{}()};
std::uniform_int_distribution<size_t> dist(min, max);
return dist(re);
}
// Get a list of secondary token ranges for the given node, and the primary
// node responsible for each of these token ranges.
// A "secondary range" is a range of tokens where for each token, the second
// node (in ring order) out of the RF replicas that hold this token is the
// given node.
// In the expiration scanner, we want to scan a secondary range but only if
// this range's primary node is down. For this we need to return not just
// a list of this node's secondary ranges - but also the primary owner of
// each of those ranges.
static std::vector<std::pair<dht::token_range, gms::inet_address>> get_secondary_ranges(
const locator::effective_replication_map_ptr& erm,
gms::inet_address ep) {
const auto& tm = *erm->get_token_metadata_ptr();
const auto& sorted_tokens = tm.sorted_tokens();
std::vector<std::pair<dht::token_range, gms::inet_address>> ret;
if (sorted_tokens.empty()) {
on_internal_error(tlogger, "Token metadata is empty");
}
auto prev_tok = sorted_tokens.back();
for (const auto& tok : sorted_tokens) {
inet_address_vector_replica_set eps = erm->get_natural_endpoints(tok);
if (eps.size() <= 1 || eps[1] != ep) {
prev_tok = tok;
continue;
}
// Add the range (prev_tok, tok] to ret. However, if the range wraps
// around, split it to two non-wrapping ranges.
if (prev_tok < tok) {
ret.emplace_back(
dht::token_range{
dht::token_range::bound(prev_tok, false),
dht::token_range::bound(tok, true)},
eps[0]);
} else {
ret.emplace_back(
dht::token_range{
dht::token_range::bound(prev_tok, false),
std::nullopt},
eps[0]);
ret.emplace_back(
dht::token_range{
std::nullopt,
dht::token_range::bound(tok, true)},
eps[0]);
}
prev_tok = tok;
}
return ret;
}
// A class for iterating over all the token ranges *owned* by this shard.
// To avoid code duplication, it is a template with two distinct cases -
// <primary> and <secondary>:
//
// In the <primary> case, we consider a token *owned* by this shard if:
// 1. This node is a replica for this token.
// 2. Moreover, this node is the *primary* replica of the token (i.e., the
// first replica in the ring).
// 3. In this node, this shard is responsible for this token.
// We will use this definition of which shard in the cluster owns which tokens
// to split the expiration scanner's work between all the shards of the
// system.
//
// In the <secondary> case, we consider a token *owned* by this shard if:
// 1. This node is the *secondary* replica for this token (i.e., the second
// replica in the ring).
// 2. The primary replica for this token is currently marked down.
// 3. In this node, this shard is responsible for this token.
// We use the <secondary> case to handle the possibility that some of the
// nodes in the system are down. A dead node will not be expiring
// the tokens owned by it, so we want the secondary owner to take over its
// primary ranges.
//
// FIXME: need to decide how to choose primary ranges in multi-DC setup!
// We could call get_primary_ranges_within_dc() below instead of get_primary_ranges().
// NOTICE: Iteration currently starts from a random token range in order to improve
// the chances of covering all ranges during a scan when restarts occur.
// A more deterministic way would be to regularly persist the scanning state,
// but that incurs overhead that we want to avoid if not needed.
enum primary_or_secondary_t {primary, secondary};
template<primary_or_secondary_t primary_or_secondary>
class token_ranges_owned_by_this_shard {
// ranges_holder_primary holds just the primary ranges themselves
class ranges_holder_primary {
const dht::token_range_vector _token_ranges;
public:
ranges_holder_primary(const locator::effective_replication_map_ptr& erm, gms::gossiper& g, gms::inet_address ep)
: _token_ranges(erm->get_primary_ranges(ep)) {}
std::size_t size() const { return _token_ranges.size(); }
const dht::token_range& operator[](std::size_t i) const {
return _token_ranges[i];
}
bool should_skip(std::size_t i) const {
return false;
}
};
// ranges_holder_secondary holds the secondary token ranges plus each
// range's primary owner, needed to implement should_skip().
class ranges_holder_secondary {
std::vector<std::pair<dht::token_range, gms::inet_address>> _token_ranges;
gms::gossiper& _gossiper;
public:
ranges_holder_secondary(const locator::effective_replication_map_ptr& erm, gms::gossiper& g, gms::inet_address ep)
: _token_ranges(get_secondary_ranges(erm, ep))
, _gossiper(g) {}
std::size_t size() const { return _token_ranges.size(); }
const dht::token_range& operator[](std::size_t i) const {
return _token_ranges[i].first;
}
// range i should be skipped if its primary owner is alive.
bool should_skip(std::size_t i) const {
return _gossiper.is_alive(_token_ranges[i].second);
}
};
schema_ptr _s;
// _token_ranges will contain a list of token ranges owned by this node.
// We'll further need to split each such range to the pieces owned by
// the current shard, using _intersecter.
using ranges_holder = std::conditional_t<
primary_or_secondary == primary_or_secondary_t::primary,
ranges_holder_primary,
ranges_holder_secondary>;
const ranges_holder _token_ranges;
// NOTICE: _range_idx is used modulo _token_ranges size when accessing
// the data to ensure that it doesn't go out of bounds
size_t _range_idx;
size_t _end_idx;
std::optional<dht::selective_token_range_sharder> _intersecter;
public:
token_ranges_owned_by_this_shard(replica::database& db, gms::gossiper& g, schema_ptr s)
: _s(s)
, _token_ranges(db.find_keyspace(s->ks_name()).get_effective_replication_map(),
g, utils::fb_utilities::get_broadcast_address())
, _range_idx(random_offset(0, _token_ranges.size() - 1))
, _end_idx(_range_idx + _token_ranges.size())
{
tlogger.debug("Generating token ranges starting from base range {} of {}", _range_idx, _token_ranges.size());
}
// Return the next token_range owned by this shard, or nullopt when the
// iteration ends.
std::optional<dht::token_range> next() {
// We may need three or more iterations in the following loop if a
// vnode doesn't intersect with the given shard at all (such a small
// vnode is unlikely, but possible). The loop cannot be infinite
// because each iteration of the loop advances _range_idx.
for (;;) {
if (_intersecter) {
std::optional<dht::token_range> ret = _intersecter->next();
if (ret) {
return ret;
}
// done with this range, go to next one
++_range_idx;
_intersecter = std::nullopt;
}
if (_range_idx == _end_idx) {
return std::nullopt;
}
// If should_skip(), the range should be skipped. This happens for
// a secondary range whose primary owning node is still alive.
while (_token_ranges.should_skip(_range_idx % _token_ranges.size())) {
++_range_idx;
if (_range_idx == _end_idx) {
return std::nullopt;
}
}
_intersecter.emplace(_s->get_sharder(), _token_ranges[_range_idx % _token_ranges.size()], this_shard_id());
}
}
// Same as next(), just return a partition_range instead of token_range
std::optional<dht::partition_range> next_partition_range() {
std::optional<dht::token_range> ret = next();
if (ret) {
return dht::to_partition_range(*ret);
} else {
return std::nullopt;
}
}
};
// Precomputed information needed to perform a scan on partition ranges
struct scan_ranges_context {
schema_ptr s;
bytes column_name;
std::optional<std::string> member;
::shared_ptr<cql3::selection::selection> selection;
std::unique_ptr<service::query_state> query_state_ptr;
std::unique_ptr<cql3::query_options> query_options;
::lw_shared_ptr<query::read_command> command;
scan_ranges_context(schema_ptr s, service::storage_proxy& proxy, bytes column_name, std::optional<std::string> member)
: s(s)
, column_name(column_name)
, member(member)
{
// FIXME: don't read the entire items - read only parts of it.
// We must read the key columns (to be able to delete) and also
// the requested attribute. If the requested attribute is a map's
// member we may be forced to read the entire map - but it would
// be good if we can read only the single item of the map - it
// should be possible (and a must for issue #7751!).
lw_shared_ptr<service::pager::paging_state> paging_state = nullptr;
auto regular_columns = boost::copy_range<query::column_id_vector>(
s->regular_columns() | boost::adaptors::transformed([] (const column_definition& cdef) { return cdef.id; }));
selection = cql3::selection::selection::wildcard(s);
query::partition_slice::option_set opts = selection->get_query_options();
opts.set<query::partition_slice::option::allow_short_read>();
// It is important that the scan bypass cache to avoid polluting it:
opts.set<query::partition_slice::option::bypass_cache>();
std::vector<query::clustering_range> ck_bounds{query::clustering_range::make_open_ended_both_sides()};
auto partition_slice = query::partition_slice(std::move(ck_bounds), {}, std::move(regular_columns), opts);
command = ::make_lw_shared<query::read_command>(s->id(), s->version(), partition_slice, proxy.get_max_result_size(partition_slice), query::tombstone_limit(proxy.get_tombstone_limit()));
executor::client_state client_state{executor::client_state::internal_tag()};
tracing::trace_state_ptr trace_state;
// NOTICE: empty_service_permit is used because the TTL service has fixed parallelism
query_state_ptr = std::make_unique<service::query_state>(client_state, trace_state, empty_service_permit());
// FIXME: What should we do on multi-DC? Will we run the expiration on the same ranges on all
// DCs or only once for each range? If the latter, we need to change the CLs in the
// scanner and deleter.
db::consistency_level cl = db::consistency_level::LOCAL_QUORUM;
query_options = std::make_unique<cql3::query_options>(cl, std::vector<cql3::raw_value>{});
query_options = std::make_unique<cql3::query_options>(std::move(query_options), std::move(paging_state));
}
};
// Scan data in a list of token ranges in one table, looking for expired
// items and deleting them.
// Because of issue #9167, partition_ranges must have a single partition
// range for this code to work correctly.
static future<> scan_table_ranges(
service::storage_proxy& proxy,
const scan_ranges_context& scan_ctx,
dht::partition_range_vector&& partition_ranges,
abort_source& abort_source,
named_semaphore& page_sem,
expiration_service::stats& expiration_stats)
{
const schema_ptr& s = scan_ctx.s;
assert (partition_ranges.size() == 1); // otherwise issue #9167 will cause incorrect results.
auto p = service::pager::query_pagers::pager(proxy, s, scan_ctx.selection, *scan_ctx.query_state_ptr,
*scan_ctx.query_options, scan_ctx.command, std::move(partition_ranges), nullptr);
while (!p->is_exhausted()) {
if (abort_source.abort_requested()) {
co_return;
}
auto units = co_await get_units(page_sem, 1);
// We don't need to limit the page size in number of rows because there is a
// builtin limit of the page's size in bytes. Setting this limit to 1
// is useful for debugging the paging code with moderate-size data.
uint32_t limit = std::numeric_limits<uint32_t>::max();
// FIXME: which timeout?
// FIXME: if read times out, need to retry it.
std::unique_ptr<cql3::result_set> rs = co_await p->fetch_page(limit, gc_clock::now(), executor::default_timeout());
auto rows = rs->rows();
auto meta = rs->get_metadata().get_names();
std::optional<unsigned> expiration_column;
for (unsigned i = 0; i < meta.size(); i++) {
const cql3::column_specification& col = *meta[i];
if (col.name->name() == scan_ctx.column_name) {
expiration_column = i;
break;
}
}
if (!expiration_column) {
continue;
}
for (const auto& row : rows) {
const bytes_opt& cell = row[*expiration_column];
if (!cell) {
continue;
}
auto v = meta[*expiration_column]->type->deserialize(*cell);
bool expired = false;
// FIXME: don't recalculate "now" all the time
auto now = gc_clock::now();
if (scan_ctx.member) {
// In this case, the expiration-time attribute we're
// looking for is a member in a map, saved serialized
// into bytes using Alternator's serialization (basically
// a JSON serialized into bytes)
// FIXME: is it possible to find a specific member of a map
// without iterating through it like we do here and compare
// the key?
for (const auto& entry : value_cast<map_type_impl::native_type>(v)) {
std::string attr_name = value_cast<sstring>(entry.first);
if (value_cast<sstring>(entry.first) == *scan_ctx.member) {
bytes value = value_cast<bytes>(entry.second);
rjson::value json = deserialize_item(value);
expired = is_expired(json, now);
break;
}
}
} else {
// For a real column to contain an expiration time, it
// must be a numeric type.
// FIXME: Currently we only support decimal_type (which is
// what Alternator uses), but other numeric types can be
// supported as well to make this feature more useful in CQL.
// Note that kind::decimal is also checked above.
big_decimal n = value_cast<big_decimal>(v);
expired = is_expired(n, now);
}
if (expired) {
expiration_stats.items_deleted++;
// FIXME: maybe don't recalculate new_timestamp() all the time
// FIXME: if expire_item() throws on timeout, we need to retry it.
auto ts = api::new_timestamp();
co_await expire_item(proxy, *scan_ctx.query_state_ptr, row, s, ts);
}
}
// FIXME: once in a while, persist p->state(), so on reboot
// we don't start from scratch.
}
}
// scan_table() scans, in one table, data "owned" by this shard, looking for
// expired items and deleting them.
// We consider each node to "own" its primary token ranges, i.e., the tokens
// that this node is their first replica in the ring. Inside the node, each
// shard "owns" subranges of the node's token ranges - according to the node's
// sharding algorithm.
// When a node goes down, the token ranges owned by it will not be scanned
// and items in those token ranges will not expire, so in the future (FIXME)
// this function should additionally work on token ranges whose primary owner
// is down and this node is the range's secondary owner.
// If the TTL (expiration-time scanning) feature is not enabled for this
// table, scan_table() returns false without doing anything. Remember that the
// TTL feature may be enabled later so this function will need to be called
// again when the feature is enabled.
// Currently this function scans the entire table (or, rather the parts owned
// by this shard) at full rate, once. In the future (FIXME) we should consider
// how to pace this scan, how and when to repeat it, how to interleave or
// parallelize scanning of multiple tables, and how to continue scans after a
// reboot.
static future<bool> scan_table(
service::storage_proxy& proxy,
data_dictionary::database db,
gms::gossiper& gossiper,
schema_ptr s,
abort_source& abort_source,
named_semaphore& page_sem,
expiration_service::stats& expiration_stats)
{
// Check if an expiration-time attribute is enabled for this table.
// If not, just return false immediately.
// FIXME: the setting of the TTL may change in the middle of a long scan!
std::optional<std::string> attribute_name = db::find_tag(*s, TTL_TAG_KEY);
if (!attribute_name) {
co_return false;
}
// attribute_name may be one of the schema's columns (in Alternator, this
// means it's a key column), or an element in Alternator's attrs map
// encoded in Alternator's JSON encoding.
// FIXME: To make this less Alternator-specific, we should encode in the
// single key's value three things:
// 1. The name of a column
// 2. Optionally if column is a map, a member in the map
// 3. The deserializer for the value: CQL or Alternator (JSON).
// The deserializer can be guessed: If the given column or map item is
// numeric, it can be used directly. If it is a "bytes" type, it needs to
// be deserialized using Alternator's deserializer.
bytes column_name = to_bytes(*attribute_name);
const column_definition *cd = s->get_column_definition(column_name);
std::optional<std::string> member;
if (!cd) {
member = std::move(attribute_name);
column_name = bytes(executor::ATTRS_COLUMN_NAME);
cd = s->get_column_definition(column_name);
tlogger.info("table {} TTL enabled with attribute {} in {}", s->cf_name(), *member, executor::ATTRS_COLUMN_NAME);
} else {
tlogger.info("table {} TTL enabled with attribute {}", s->cf_name(), *attribute_name);
}
if (!cd) {
tlogger.info("table {} TTL column is missing, not scanning", s->cf_name());
co_return false;
}
data_type column_type = cd->type;
// Verify that the column has the right type: If "member" exists
// the column must be a map, and if it doesn't, the column must
// (currently) be a decimal_type. If the column has the wrong type
// nothing can get expired in this table, and it's pointless to
// scan it.
if ((member && column_type->get_kind() != abstract_type::kind::map) ||
(!member && column_type->get_kind() != abstract_type::kind::decimal)) {
tlogger.info("table {} TTL column has unsupported type, not scanning", s->cf_name());
co_return false;
}
expiration_stats.scan_table++;
// FIXME: need to pace the scan, not do it all at once.
scan_ranges_context scan_ctx{s, proxy, std::move(column_name), std::move(member)};
token_ranges_owned_by_this_shard<primary> my_ranges(db.real_database(), gossiper, s);
while (std::optional<dht::partition_range> range = my_ranges.next_partition_range()) {
// Note that because of issue #9167 we need to run a separate
// query on each partition range, and can't pass several of
// them into one partition_range_vector.
dht::partition_range_vector partition_ranges;
partition_ranges.push_back(std::move(*range));
// FIXME: if scanning a single range fails, including network errors,
// we fail the entire scan (and rescan from the beginning). Need to
// reconsider this. Saving the scan position might be a good enough
// solution for this problem.
co_await scan_table_ranges(proxy, scan_ctx, std::move(partition_ranges), abort_source, page_sem, expiration_stats);
}
// If each node only scans its own primary ranges, then when any node is
// down part of the token range will not get scanned. This can be viewed
// as acceptable (when the node comes back online, it will resume its scan),
// but as noted in issue #9787, we can allow more prompt expiration
// by tasking another node to take over scanning of the dead node's primary
// ranges. What we do here is that this node will also check expiration
// on its *secondary* ranges - but only those whose primary owner is down.
token_ranges_owned_by_this_shard<secondary> my_secondary_ranges(db.real_database(), gossiper, s);
while (std::optional<dht::partition_range> range = my_secondary_ranges.next_partition_range()) {
expiration_stats.secondary_ranges_scanned++;
dht::partition_range_vector partition_ranges;
partition_ranges.push_back(std::move(*range));
co_await scan_table_ranges(proxy, scan_ctx, std::move(partition_ranges), abort_source, page_sem, expiration_stats);
}
co_return true;
}
future<> expiration_service::run() {
// FIXME: don't just tight-loop, think about timing, pace, and
// store position in durable storage, etc.
// FIXME: think about working on different tables in parallel.
// also need to notice when a new table is added, a table is
// deleted or when ttl is enabled or disabled for a table!
for (;;) {
auto start = lowres_clock::now();
// _db.get_tables() may change under our feet during a
// long-living loop, so we must keep our own copy of the list of
// schemas.
std::vector<schema_ptr> schemas;
for (auto cf : _db.get_tables()) {
schemas.push_back(cf.schema());
}
for (schema_ptr s : schemas) {
co_await coroutine::maybe_yield();
if (shutting_down()) {
co_return;
}
try {
co_await scan_table(_proxy, _db, _gossiper, s, _abort_source, _page_sem, _expiration_stats);
} catch (...) {
// The scan of a table may fail in the middle for many
// reasons, including network failure and even the table
// being removed. We'll continue scanning this table later
// (if it still exists). In any case it's important to catch
// the exception and not let the scanning service die for
// good.
// If the table has been deleted, it is expected that the scan
// will fail at some point, and even a warning is excessive.
if (_db.has_schema(s->ks_name(), s->cf_name())) {
tlogger.warn("table {}.{} expiration scan failed: {}",
s->ks_name(), s->cf_name(), std::current_exception());
} else {
tlogger.info("expiration scan failed when table {}.{} was deleted",
s->ks_name(), s->cf_name());
}
}
}
_expiration_stats.scan_passes++;
// The TTL scanner runs above once over all tables, at full steam.
// After completing such a scan, we sleep until it's time to start
// another scan. TODO: If the scan went too fast, we can slow it down
// in the next iteration by reducing the scanner's scheduling-group
// share (if using a separate scheduling group), or introduce
// finer-grain sleeps into the scanning code.
std::chrono::milliseconds scan_duration(std::chrono::duration_cast<std::chrono::milliseconds>(lowres_clock::now() - start));
std::chrono::milliseconds period(long(_db.get_config().alternator_ttl_period_in_seconds() * 1000));
if (scan_duration < period) {
try {
tlogger.info("sleeping {} seconds until next period", (period - scan_duration).count()/1000.0);
co_await seastar::sleep_abortable(period - scan_duration, _abort_source);
} catch(seastar::sleep_aborted&) {}
} else {
tlogger.warn("scan took {} seconds, longer than period - not sleeping", scan_duration.count()/1000.0);
}
}
}
future<> expiration_service::start() {
// Called by main() on each shard to start the expiration-service
// thread. Just runs run() in the background and allows stop().
if (_db.features().alternator_ttl) {
if (!shutting_down()) {
_end = run().handle_exception([] (std::exception_ptr ep) {
tlogger.error("expiration_service failed: {}", ep);
});
}
}
return make_ready_future<>();
}
future<> expiration_service::stop() {
if (_abort_source.abort_requested()) {
throw std::logic_error("expiration_service::stop() called a second time");
}
_abort_source.request_abort();
if (!_end) {
// if _end was not set, start() was never called
return make_ready_future<>();
}
return std::move(*_end);
}
expiration_service::stats::stats() {
_metrics.add_group("expiration", {
seastar::metrics::make_total_operations("scan_passes", scan_passes,
seastar::metrics::description("number of passes over the database")),
seastar::metrics::make_total_operations("scan_table", scan_table,
seastar::metrics::description("number of table scans (counting each scan of each table that enabled expiration)")),
seastar::metrics::make_total_operations("items_deleted", items_deleted,
seastar::metrics::description("number of items deleted after expiration")),
seastar::metrics::make_total_operations("secondary_ranges_scanned", secondary_ranges_scanned,
seastar::metrics::description("number of token ranges scanned by this node while their primary owner was down")),
});
}
} // namespace alternator


@@ -1,80 +0,0 @@
/*
* Copyright 2021-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include "seastarx.hh"
#include <seastar/core/sharded.hh>
#include <seastar/core/abort_source.hh>
#include <seastar/core/semaphore.hh>
#include "data_dictionary/data_dictionary.hh"
namespace gms {
class gossiper;
}
namespace replica {
class database;
}
namespace service {
class storage_proxy;
}
namespace alternator {
// expiration_service is a sharded service responsible for cleaning up expired
// items in all tables with per-item expiration enabled. Currently, this means
// Alternator tables with TTL configured via an UpdateTimeToLive request.
class expiration_service final : public seastar::peering_sharded_service<expiration_service> {
public:
// Object holding per-shard statistics related to the expiration service.
// While this object is alive, these metrics are also registered to be
// visible by the metrics REST API, with the "expiration_" prefix.
class stats {
public:
stats();
uint64_t scan_passes = 0;
uint64_t scan_table = 0;
uint64_t items_deleted = 0;
uint64_t secondary_ranges_scanned = 0;
private:
// The metric_groups object holds this stat object's metrics registered
// as long as the stats object is alive.
seastar::metrics::metric_groups _metrics;
};
private:
data_dictionary::database _db;
service::storage_proxy& _proxy;
gms::gossiper& _gossiper;
// _end is set by start(), and resolves when the background service
// started by it ends. To ask the background service to end, _abort_source
// should be triggered. stop() below uses both _abort_source and _end.
std::optional<future<>> _end;
abort_source _abort_source;
// Ensures that at most 1 page of scan results at a time is processed by the TTL service
named_semaphore _page_sem{1, named_semaphore_exception_factory{"alternator_ttl"}};
bool shutting_down() { return _abort_source.abort_requested(); }
stats _expiration_stats;
public:
// sharded_service<expiration_service>::start() creates this object on
// all shards, so calls this constructor on each shard. Later, the
// additional start() function should be invoked on all shards.
expiration_service(data_dictionary::database, service::storage_proxy&, gms::gossiper&);
future<> start();
future<> run();
// sharded_service<expiration_service>::stop() calls the following stop()
// method on each shard. This stop() asks the service on this shard to
// shut down as quickly as it can. The returned future indicates when the
// service is no longer running.
// stop() may be called even before start(), but may only be called once -
// calling it twice will result in an exception.
future<> stop();
};
} // namespace alternator


@@ -1,29 +0,0 @@
{
"apiVersion":"0.0.1",
"swaggerVersion":"1.2",
"basePath":"{{Protocol}}://{{Host}}",
"resourcePath":"/authorization_cache",
"produces":[
"application/json"
],
"apis":[
{
"path":"/authorization_cache/reset",
"operations":[
{
"method":"POST",
"summary":"Reset cache",
"type":"void",
"nickname":"authorization_cache_reset",
"produces":[
"application/json"
],
"parameters":[
]
}
]
}
],
"models":{
}
}


@@ -13,7 +13,7 @@
{
"method":"GET",
"summary":"get row cache save period in seconds",
"type": "long",
"type":"int",
"nickname":"get_row_cache_save_period_in_seconds",
"produces":[
"application/json"
@@ -35,7 +35,7 @@
"description":"row cache save period in seconds",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -48,7 +48,7 @@
{
"method":"GET",
"summary":"get key cache save period in seconds",
"type": "long",
"type":"int",
"nickname":"get_key_cache_save_period_in_seconds",
"produces":[
"application/json"
@@ -70,7 +70,7 @@
"description":"key cache save period in seconds",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -83,7 +83,7 @@
{
"method":"GET",
"summary":"get counter cache save period in seconds",
"type": "long",
"type":"int",
"nickname":"get_counter_cache_save_period_in_seconds",
"produces":[
"application/json"
@@ -105,7 +105,7 @@
"description":"counter cache save period in seconds",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -118,7 +118,7 @@
{
"method":"GET",
"summary":"get row cache keys to save",
"type": "long",
"type":"int",
"nickname":"get_row_cache_keys_to_save",
"produces":[
"application/json"
@@ -140,7 +140,7 @@
"description":"row cache keys to save",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -153,7 +153,7 @@
{
"method":"GET",
"summary":"get key cache keys to save",
"type": "long",
"type":"int",
"nickname":"get_key_cache_keys_to_save",
"produces":[
"application/json"
@@ -175,7 +175,7 @@
"description":"key cache keys to save",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -188,7 +188,7 @@
{
"method":"GET",
"summary":"get counter cache keys to save",
"type": "long",
"type":"int",
"nickname":"get_counter_cache_keys_to_save",
"produces":[
"application/json"
@@ -210,7 +210,7 @@
"description":"counter cache keys to save",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -397,36 +397,6 @@
}
]
},
{
"path": "/cache_service/metrics/key/hits_moving_avrage",
"operations": [
{
"method": "GET",
"summary": "Get key hits moving avrage",
"type": "#/utils/rate_moving_average",
"nickname": "get_key_hits_moving_avrage",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/cache_service/metrics/key/requests_moving_avrage",
"operations": [
{
"method": "GET",
"summary": "Get key requests moving avrage",
"type": "#/utils/rate_moving_average",
"nickname": "get_key_requests_moving_avrage",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/cache_service/metrics/key/size",
"operations": [
@@ -448,7 +418,7 @@
{
"method": "GET",
"summary": "Get key entries",
"type": "long",
"type": "int",
"nickname": "get_key_entries",
"produces": [
"application/json"
@@ -517,36 +487,6 @@
}
]
},
{
"path": "/cache_service/metrics/row/hits_moving_avrage",
"operations": [
{
"method": "GET",
"summary": "Get row hits moving avrage",
"type": "#/utils/rate_moving_average",
"nickname": "get_row_hits_moving_avrage",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/cache_service/metrics/row/requests_moving_avrage",
"operations": [
{
"method": "GET",
"summary": "Get row requests moving avrage",
"type": "#/utils/rate_moving_average",
"nickname": "get_row_requests_moving_avrage",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/cache_service/metrics/row/size",
"operations": [
@@ -568,7 +508,7 @@
{
"method": "GET",
"summary": "Get row entries",
"type": "long",
"type": "int",
"nickname": "get_row_entries",
"produces": [
"application/json"
@@ -637,36 +577,6 @@
}
]
},
{
"path": "/cache_service/metrics/counter/hits_moving_avrage",
"operations": [
{
"method": "GET",
"summary": "Get counter hits moving avrage",
"type": "#/utils/rate_moving_average",
"nickname": "get_counter_hits_moving_avrage",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/cache_service/metrics/counter/requests_moving_avrage",
"operations": [
{
"method": "GET",
"summary": "Get counter requests moving avrage",
"type": "#/utils/rate_moving_average",
"nickname": "get_counter_requests_moving_avrage",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/cache_service/metrics/counter/size",
"operations": [
@@ -688,7 +598,7 @@
{
"method": "GET",
"summary": "Get counter entries",
"type": "long",
"type": "int",
"nickname": "get_counter_entries",
"produces": [
"application/json"

@@ -55,57 +55,6 @@
"paramType":"query"
}
]
},
{
"method":"POST",
"summary":"Start reporting on one or more collectd metric",
"type":"void",
"nickname":"enable_collectd",
"produces":[
"application/json"
],
"parameters":[
{
"name":"pluginid",
"description":"The plugin ID, describe the component the metric belongs to. Examples are cache, thrift, etc'. Regex are supported.The plugin ID, describe the component the metric belong to. Examples are: cache, thrift etc'. regex are supported",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"instance",
"description":"The plugin instance typically #CPU indicating per CPU metric. Regex are supported. Omit for all",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"type",
"description":"The plugin type, the type of the information. Examples are total_operations, bytes, total_operations, etc'. Regex are supported. Omit for all",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"type_instance",
"description":"The plugin type instance, the specific metric. Exampls are total_writes, total_size, zones, etc'. Regex are supported, Omit for all",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"enable",
"description":"set to true to enable all, anything else or omit to disable",
"required":false,
"allowMultiple":false,
"type":"boolean",
"paramType":"query"
}
]
}
]
},
@@ -114,10 +63,10 @@
"operations":[
{
"method":"GET",
"summary":"Get a list of all collectd metrics and their status",
"summary":"Get a collectd value",
"type":"array",
"items":{
"type":"collectd_metric_status"
"type":"type_instance_id"
},
"nickname":"get_collectd_items",
"produces":[
@@ -125,25 +74,6 @@
],
"parameters":[
]
},
{
"method":"POST",
"summary":"Enable or disable all collectd metrics",
"type":"void",
"nickname":"enable_all_collectd",
"produces":[
"application/json"
],
"parameters":[
{
"name":"enable",
"description":"set to true to enable all, anything else or omit to disable",
"required":false,
"allowMultiple":false,
"type":"boolean",
"paramType":"query"
}
]
}
]
}
@@ -183,20 +113,6 @@
}
}
}
},
"collectd_metric_status":{
"id":"collectd_metric_status",
"description":"Holds a collectd id and an enable flag",
"properties":{
"id":{
"description":"The metric ID",
"type":"type_instance_id"
},
"enable":{
"description":"Is the metric enabled",
"type":"boolean"
}
}
}
}
}

File diff suppressed because it is too large

@@ -102,47 +102,7 @@
"parameters":[
{
"name":"type",
"description":"The type of compaction to stop. Can be one of: COMPACTION | CLEANUP | SCRUB | UPGRADE | RESHAPE",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
]
},
{
"path":"/compaction_manager/stop_keyspace_compaction/{keyspace}",
"operations":[
{
"method":"POST",
"summary":"Stop all running compaction-like tasks in the given keyspace and tables having the provided type.",
"type":"void",
"nickname":"stop_keyspace_compaction",
"produces":[
"application/json"
],
"parameters":[
{
"name":"keyspace",
"description":"The keyspace to stop compaction in",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"tables",
"description":"Comma-separated tables to stop compaction in",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"type",
"description":"The type of compaction to stop. Can be one of: COMPACTION | CLEANUP | SCRUB | UPGRADE | RESHAPE",
"description":"the type of compaction to stop. Can be one of: - COMPACTION - VALIDATION - CLEANUP - SCRUB - INDEX_BUILD",
"required":true,
"allowMultiple":false,
"type":"string",
@@ -158,7 +118,7 @@
{
"method": "GET",
"summary": "Get pending tasks",
"type": "long",
"type": "int",
"nickname": "get_pending_tasks",
"produces": [
"application/json"
@@ -167,24 +127,6 @@
}
]
},
{
"path": "/compaction_manager/metrics/pending_tasks_by_table",
"operations": [
{
"method": "GET",
"summary": "Get pending tasks by table name",
"type": "array",
"items": {
"type": "pending_compaction"
},
"nickname": "get_pending_tasks_by_table",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/compaction_manager/metrics/completed_tasks",
"operations": [
@@ -221,7 +163,7 @@
{
"method": "GET",
"summary": "Get bytes compacted",
"type": "long",
"type": "int",
"nickname": "get_bytes_compacted",
"produces": [
"application/json"
@@ -237,7 +179,7 @@
"description":"A row merged information",
"properties":{
"key":{
"type": "long",
"type":"int",
"description":"The number of sstable"
},
"value":{
@@ -302,23 +244,6 @@
}
}
},
"pending_compaction": {
"id": "pending_compaction",
"properties": {
"cf": {
"type": "string",
"description": "The column family name"
},
"ks": {
"type":"string",
"description": "The keyspace name"
},
"task": {
"type":"long",
"description": "The number of pending tasks"
}
}
},
"history": {
"id":"history",
"description":"Compaction history information",

@@ -1,30 +0,0 @@
"/v2/config/{id}": {
"get": {
"description": "Return a config value",
"operationId": "find_config_id",
"produces": [
"application/json"
],
"tags": ["config"],
"parameters": [
{
"name": "id",
"in": "path",
"description": "ID of config to return",
"required": true,
"type": "string"
}
],
"responses": {
"200": {
"description": "Config value"
},
"default": {
"description": "unexpected error",
"schema": {
"$ref": "#/definitions/ErrorModel"
}
}
}
}
}

@@ -21,8 +21,8 @@
"parameters":[
{
"name":"host",
"description":"The host name. If absent, the local server broadcast/listen address is used",
"required":false,
"description":"The host name",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"query"
@@ -45,8 +45,8 @@
"parameters":[
{
"name":"host",
"description":"The host name. If absent, the local server broadcast/listen address is used",
"required":false,
"description":"The host name",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"query"

@@ -1,90 +0,0 @@
{
"apiVersion":"0.0.1",
"swaggerVersion":"1.2",
"basePath":"{{Protocol}}://{{Host}}",
"resourcePath":"/error_injection",
"produces":[
"application/json"
],
"apis":[
{
"path":"/v2/error_injection/injection/{injection}",
"operations":[
{
"method":"POST",
"summary":"Activate an injection that triggers an error in code",
"type":"void",
"nickname":"enable_injection",
"produces":[
"application/json"
],
"parameters":[
{
"name":"injection",
"description":"injection name, should correspond to an injection added in code",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"one_shot",
"description":"boolean flag indicating whether the injection should be enabled to trigger only once",
"required":false,
"allowMultiple":false,
"type":"boolean",
"paramType":"query"
}
]
},
{
"method":"DELETE",
"summary":"Deactivate an injection previously activated by the API",
"type":"void",
"nickname":"disable_injection",
"produces":[
"application/json"
],
"parameters":[
{
"name":"injection",
"description":"injection name",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
}
]
},
{
"path":"/v2/error_injection/injection",
"operations":[
{
"method":"GET",
"summary":"List all enabled injections on all shards, i.e. injections that will trigger an error in the code",
"type":"array",
"items":{
"type":"string"
},
"nickname":"get_enabled_injections_on_all",
"produces":[
"application/json"
],
"parameters":[]
},
{
"method":"DELETE",
"summary":"Deactivate all injections previously activated on all shards by the API",
"type":"void",
"nickname":"disable_on_all",
"produces":[
"application/json"
],
"parameters":[]
}
]
}
]
}

@@ -42,25 +42,6 @@
}
]
},
{
"path":"/failure_detector/endpoint_phi_values",
"operations":[
{
"method":"GET",
"summary":"Get end point phi values",
"type":"array",
"items":{
"type":"endpoint_phi_values"
},
"nickname":"get_endpoint_phi_values",
"produces":[
"application/json"
],
"parameters":[
]
}
]
},
{
"path":"/failure_detector/endpoints/",
"operations":[
@@ -110,7 +91,7 @@
{
"method":"GET",
"summary":"Get count down endpoint",
"type": "long",
"type":"int",
"nickname":"get_down_endpoint_count",
"produces":[
"application/json"
@@ -126,7 +107,7 @@
{
"method":"GET",
"summary":"Get count up endpoint",
"type": "long",
"type":"int",
"nickname":"get_up_endpoint_count",
"produces":[
"application/json"
@@ -180,11 +161,11 @@
"description": "The endpoint address"
},
"generation": {
"type": "long",
"type": "int",
"description": "The heart beat generation"
},
"version": {
"type": "long",
"type": "int",
"description": "The heart beat version"
},
"update_time": {
@@ -209,7 +190,7 @@
"description": "Holds a version value for an application state",
"properties": {
"application_state": {
"type": "long",
"type": "int",
"description": "The application state enum index"
},
"value": {
@@ -217,24 +198,10 @@
"description": "The version value"
},
"version": {
"type": "long",
"type": "int",
"description": "The application state version"
}
}
},
"endpoint_phi_value": {
"id" : "endpoint_phi_value",
"description": "Holds phi value for a single end point",
"properties": {
"phi": {
"type": "double",
"description": "Phi value"
},
"endpoint": {
"type": "string",
"description": "end point address"
}
}
}
}
}

@@ -75,7 +75,7 @@
{
"method":"GET",
"summary":"Returns files which are pending for archival attempt. Does NOT include failed archive attempts",
"type": "long",
"type":"int",
"nickname":"get_current_generation_number",
"produces":[
"application/json"
@@ -99,7 +99,7 @@
{
"method":"GET",
"summary":"Get heart beat version for a node",
"type": "long",
"type":"int",
"nickname":"get_current_heart_beat_version",
"produces":[
"application/json"
@@ -148,30 +148,6 @@
]
}
]
},
{
"path":"/gossiper/force_remove_endpoint/{addr}",
"operations":[
{
"method":"POST",
"summary":"Force remove an endpoint from gossip",
"type":"void",
"nickname":"force_remove_endpoint",
"produces":[
"application/json"
],
"parameters":[
{
"name":"addr",
"description":"The endpoint address",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
}
]
}
]
}

@@ -7,61 +7,6 @@
"application/json"
],
"apis":[
{
"path":"/hinted_handoff/sync_point",
"operations":[
{
"method":"POST",
"summary":"Creates a hints sync point. It can be used to wait until hints between given nodes are replayed. A sync point allows you to wait for hints accumulated at the moment of its creation - it won't wait for hints generated later. A sync point is described entirely by its ID - there is no state kept server-side, so there is no need to delete it.",
"type":"string",
"nickname":"create_hints_sync_point",
"produces":[
"application/json"
],
"parameters":[
{
"name":"target_hosts",
"description":"A list of nodes towards which hints should be replayed. Multiple hosts can be listed by separating them with commas. If not provided or empty, the point will resolve when current hints towards all nodes in the cluster are sent.",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
},
{
"method":"GET",
"summary":"Get the status of a hints sync point, possibly waiting for it to be reached.",
"type":"string",
"enum":[
"DONE",
"IN_PROGRESS"
],
"nickname":"get_hints_sync_point",
"produces":[
"application/json"
],
"parameters":[
{
"name":"id",
"description":"The ID of the hint sync point which should be checked or waited on",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"timeout",
"description":"Timeout in seconds after which the query returns even if hints are still being replayed. No value or 0 will cause the query to return immediately. A negative value will cause the query to wait until the sync point is reached",
"required":false,
"allowMultiple":false,
"type":"long",
"paramType":"query"
}
]
}
]
},
{
"path":"/hinted_handoff/hints",
"operations":[
@@ -154,7 +99,7 @@
{
"method": "GET",
"summary": "Get create hint count",
"type": "long",
"type": "int",
"nickname": "get_create_hint_count",
"produces": [
"application/json"
@@ -178,7 +123,7 @@
{
"method": "GET",
"summary": "Get not stored hints count",
"type": "long",
"type": "int",
"nickname": "get_not_stored_hints_count",
"produces": [
"application/json"

@@ -76,7 +76,7 @@
"items":{
"type":"message_counter"
},
"nickname":"get_replied_messages",
"nickname":"get_completed_messages",
"produces":[
"application/json"
],
@@ -191,7 +191,7 @@
{
"method":"GET",
"summary":"Get the version number",
"type": "long",
"type":"int",
"nickname":"get_version",
"produces":[
"application/json"
@@ -249,10 +249,10 @@
"MIGRATION_REQUEST",
"PREPARE_MESSAGE",
"PREPARE_DONE_MESSAGE",
"UNUSED__STREAM_MUTATION",
"STREAM_MUTATION",
"STREAM_MUTATION_DONE",
"COMPLETE_MESSAGE",
"UNUSED__REPAIR_CHECKSUM_RANGE",
"REPAIR_CHECKSUM_RANGE",
"GET_SCHEMA_VERSION"
]
}

@@ -68,7 +68,7 @@
"summary":"Get the hinted handoff enabled by dc",
"type":"array",
"items":{
"type":"array"
"type":"mapper_list"
},
"nickname":"get_hinted_handoff_enabled_by_dc",
"produces":[
@@ -105,7 +105,7 @@
{
"method":"GET",
"summary":"Get the max hint window",
"type": "long",
"type":"int",
"nickname":"get_max_hint_window",
"produces":[
"application/json"
@@ -128,7 +128,7 @@
"description":"max hint window in ms",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -141,7 +141,7 @@
{
"method":"GET",
"summary":"Get max hints in progress",
"type": "long",
"type":"int",
"nickname":"get_max_hints_in_progress",
"produces":[
"application/json"
@@ -164,7 +164,7 @@
"description":"max hints in progress",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -177,7 +177,7 @@
{
"method":"GET",
"summary":"get hints in progress",
"type": "long",
"type":"int",
"nickname":"get_hints_in_progress",
"produces":[
"application/json"
@@ -602,7 +602,7 @@
{
"method": "GET",
"summary": "Get cas write metrics",
"type": "long",
"type": "int",
"nickname": "get_cas_write_metrics_unfinished_commit",
"produces": [
"application/json"
@@ -632,7 +632,7 @@
{
"method": "GET",
"summary": "Get cas write metrics",
"type": "long",
"type": "int",
"nickname": "get_cas_write_metrics_condition_not_met",
"produces": [
"application/json"
@@ -641,28 +641,13 @@
}
]
},
{
"path": "/storage_proxy/metrics/cas_write/failed_read_round_optimization",
"operations": [
{
"method": "GET",
"summary": "Get cas write metrics",
"type": "long",
"nickname": "get_cas_write_metrics_failed_read_round_optimization",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/cas_read/unfinished_commit",
"operations": [
{
"method": "GET",
"summary": "Get cas read metrics",
"type": "long",
"type": "int",
"nickname": "get_cas_read_metrics_unfinished_commit",
"produces": [
"application/json"
@@ -686,13 +671,28 @@
}
]
},
{
"path": "/storage_proxy/metrics/cas_read/condition_not_met",
"operations": [
{
"method": "GET",
"summary": "Get cas read metrics",
"type": "int",
"nickname": "get_cas_read_metrics_condition_not_met",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/read/timeouts",
"operations": [
{
"method": "GET",
"summary": "Get read metrics",
"type": "long",
"type": "int",
"nickname": "get_read_metrics_timeouts",
"produces": [
"application/json"
@@ -707,7 +707,7 @@
{
"method": "GET",
"summary": "Get read metrics",
"type": "long",
"type": "int",
"nickname": "get_read_metrics_unavailables",
"produces": [
"application/json"
@@ -716,36 +716,6 @@
}
]
},
{
"path": "/storage_proxy/metrics/read/timeouts_rates",
"operations": [
{
"method": "GET",
"summary": "Get read metrics rates",
"type": "#/utils/rate_moving_average",
"nickname": "get_read_metrics_timeouts_rates",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/read/unavailables_rates",
"operations": [
{
"method": "GET",
"summary": "Get read metrics rates",
"type": "#/utils/rate_moving_average",
"nickname": "get_read_metrics_unavailables_rates",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/read/histogram",
"operations": [
@@ -753,7 +723,7 @@
"method": "GET",
"summary": "Get read metrics",
"$ref": "#/utils/histogram",
"nickname": "get_read_metrics_latency_histogram_depricated",
"nickname": "get_read_metrics_latency_histogram",
"produces": [
"application/json"
],
@@ -768,66 +738,6 @@
"method": "GET",
"summary": "Get range metrics",
"$ref": "#/utils/histogram",
"nickname": "get_range_metrics_latency_histogram_depricated",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/read/moving_average_histogram",
"operations": [
{
"method": "GET",
"summary": "Get read metrics",
"$ref": "#/utils/rate_moving_average_and_histogram",
"nickname": "get_read_metrics_latency_histogram",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/cas_read/moving_average_histogram",
"operations": [
{
"method": "GET",
"summary": "Get CAS read rate and latency histogram",
"$ref": "#/utils/rate_moving_average_and_histogram",
"nickname": "get_cas_read_metrics_latency_histogram",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/view_write/moving_average_histogram",
"operations": [
{
"method": "GET",
"summary": "Get view write rate and latency histogram",
"$ref": "#/utils/rate_moving_average_and_histogram",
"nickname": "get_view_write_metrics_latency_histogram",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/range/moving_average_histogram",
"operations": [
{
"method": "GET",
"summary": "Get range metrics rate and histogram",
"$ref": "#/utils/rate_moving_average_and_histogram",
"nickname": "get_range_metrics_latency_histogram",
"produces": [
"application/json"
@@ -842,7 +752,7 @@
{
"method": "GET",
"summary": "Get range metrics",
"type": "long",
"type": "int",
"nickname": "get_range_metrics_timeouts",
"produces": [
"application/json"
@@ -857,7 +767,7 @@
{
"method": "GET",
"summary": "Get range metrics",
"type": "long",
"type": "int",
"nickname": "get_range_metrics_unavailables",
"produces": [
"application/json"
@@ -866,43 +776,13 @@
}
]
},
{
"path": "/storage_proxy/metrics/range/timeouts_rates",
"operations": [
{
"method": "GET",
"summary": "Get range metrics rates",
"type": "#/utils/rate_moving_average",
"nickname": "get_range_metrics_timeouts_rates",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/range/unavailables_rates",
"operations": [
{
"method": "GET",
"summary": "Get range metrics rates",
"type": "#/utils/rate_moving_average",
"nickname": "get_range_metrics_unavailables_rates",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/write/timeouts",
"operations": [
{
"method": "GET",
"summary": "Get write metrics",
"type": "long",
"type": "int",
"nickname": "get_write_metrics_timeouts",
"produces": [
"application/json"
@@ -917,7 +797,7 @@
{
"method": "GET",
"summary": "Get write metrics",
"type": "long",
"type": "int",
"nickname": "get_write_metrics_unavailables",
"produces": [
"application/json"
@@ -926,36 +806,6 @@
}
]
},
{
"path": "/storage_proxy/metrics/write/timeouts_rates",
"operations": [
{
"method": "GET",
"summary": "Get write metrics rates",
"type": "#/utils/rate_moving_average",
"nickname": "get_write_metrics_timeouts_rates",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/write/unavailables_rates",
"operations": [
{
"method": "GET",
"summary": "Get write metrics rates",
"type": "#/utils/rate_moving_average",
"nickname": "get_write_metrics_unavailables_rates",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/write/histogram",
"operations": [
@@ -963,21 +813,6 @@
"method": "GET",
"summary": "Get write metrics",
"$ref": "#/utils/histogram",
"nickname": "get_write_metrics_latency_histogram_depricated",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path": "/storage_proxy/metrics/write/moving_average_histogram",
"operations": [
{
"method": "GET",
"summary": "Get write metrics",
"$ref": "#/utils/rate_moving_average_and_histogram",
"nickname": "get_write_metrics_latency_histogram",
"produces": [
"application/json"
@@ -986,21 +821,6 @@
}
]
},
{
"path": "/storage_proxy/metrics/cas_write/moving_average_histogram",
"operations": [
{
"method": "GET",
"summary": "Get CAS write rate and latency histogram",
"$ref": "#/utils/rate_moving_average_and_histogram",
"nickname": "get_cas_write_metrics_latency_histogram",
"produces": [
"application/json"
],
"parameters": []
}
]
},
{
"path":"/storage_proxy/metrics/read/estimated_histogram/",
"operations":[
@@ -1023,7 +843,7 @@
{
"method":"GET",
"summary":"Get read latency",
"type": "long",
"type":"int",
"nickname":"get_read_latency",
"produces":[
"application/json"
@@ -1055,7 +875,7 @@
{
"method":"GET",
"summary":"Get write latency",
"type": "long",
"type":"int",
"nickname":"get_write_latency",
"produces":[
"application/json"
@@ -1087,7 +907,7 @@
{
"method":"GET",
"summary":"Get range latency",
"type": "long",
"type":"int",
"nickname":"get_range_latency",
"produces":[
"application/json"

@@ -104,68 +104,6 @@
}
]
},
{
"path":"/storage_service/toppartitions/",
"operations":[
{
"method":"GET",
"summary":"Toppartitions query",
"type":"toppartitions_query_results",
"nickname":"toppartitions_generic",
"produces":[
"application/json"
],
"parameters":[
{
"name":"table_filters",
"description":"Optional list of table name filters in keyspace:name format",
"required":false,
"allowMultiple":false,
"type":"array",
"items":{
"type":"string"
},
"paramType":"query"
},
{
"name":"keyspace_filters",
"description":"Optional list of keyspace filters",
"required":false,
"allowMultiple":false,
"type":"array",
"items":{
"type":"string"
},
"paramType":"query"
},
{
"name":"duration",
"description":"Duration (in milliseconds) of monitoring operation",
"required":true,
"allowMultiple":false,
"type": "long",
"paramType":"query"
},
{
"name":"list_size",
"description":"number of the top partitions to list",
"required":false,
"allowMultiple":false,
"type": "long",
"paramType":"query"
},
{
"name":"capacity",
"description":"capacity of stream summary: determines amount of resources used in query processing",
"required":false,
"allowMultiple":false,
"type": "long",
"paramType":"query"
}
]
}
]
},
{
"path":"/storage_service/nodes/leaving",
"operations":[
@@ -239,22 +177,6 @@
}
]
},
{
"path":"/storage_service/scylla_release_version",
"operations":[
{
"method":"GET",
"summary":"Fetch a string representation of the Scylla version.",
"type":"string",
"nickname":"get_scylla_release_version",
"produces":[
"application/json"
],
"parameters":[
]
}
]
},
{
"path":"/storage_service/schema_version",
"operations":[
@@ -520,7 +442,7 @@
{
"method":"GET",
"summary":"Return the generation value for this node.",
"type": "long",
"type":"int",
"nickname":"get_current_generation_number",
"produces":[
"application/json"
@@ -573,21 +495,6 @@
}
]
},
{
"path":"/storage_service/cdc_streams_check_and_repair",
"operations":[
{
"method":"POST",
"summary":"Checks that CDC streams reflect current cluster topology and regenerates them if not.",
"type":"void",
"nickname":"cdc_streams_check_and_repair",
"produces":[
"application/json"
],
"parameters":[]
}
]
},
{
"path":"/storage_service/snapshots",
"operations":[
@@ -624,7 +531,7 @@
},
{
"name":"kn",
"description":"Keyspace(s) to snapshot. Multiple keyspaces can be provided using a comma-separated list. If omitted, snapshot all keyspaces.",
"description":"Comma seperated keyspaces name to snapshot",
"required":false,
"allowMultiple":false,
"type":"string",
@@ -632,19 +539,11 @@
},
{
"name":"cf",
"description":"Table(s) to snapshot. Multiple tables (in a single keyspace) can be provided using a comma-separated list. If omitted, snapshot all tables in the given keyspace(s).",
"description":"the column family to snapshot",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"sf",
"description":"Skip flush. When set to \"true\", do not flush memtables before snapshotting (snapshot will not contain unflushed data)",
"required":false,
"allowMultiple":false,
"type":"boolean",
"paramType":"query"
}
]
},
@@ -667,15 +566,7 @@
},
{
"name":"kn",
"description":"Comma-separated keyspaces name that their snapshot will be deleted",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"cf",
"description":"an optional table name that its snapshot will be deleted",
"description":"Comma seperated keyspaces name to snapshot",
"required":false,
"allowMultiple":false,
"type":"string",
@@ -723,7 +614,7 @@
},
{
"name":"cf",
"description":"Comma-separated column family names",
"description":"Comma seperated column family names",
"required":false,
"allowMultiple":false,
"type":"string",
@@ -739,7 +630,7 @@
{
"method":"POST",
"summary":"Trigger a cleanup of keys on a single keyspace",
"type": "long",
"type":"int",
"nickname":"force_keyspace_cleanup",
"produces":[
"application/json"
@@ -755,39 +646,7 @@
},
{
"name":"cf",
"description":"Comma-separated column family names",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
]
},
{
"path":"/storage_service/keyspace_offstrategy_compaction/{keyspace}",
"operations":[
{
"method":"POST",
"summary":"Perform offstrategy compaction, if needed, in a single keyspace",
"type":"boolean",
"nickname":"perform_keyspace_offstrategy_compaction",
"produces":[
"application/json"
],
"parameters":[
{
"name":"keyspace",
"description":"The keyspace to operate on",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"cf",
"description":"Comma-separated table names",
"description":"Comma seperated column family names",
"required":false,
"allowMultiple":false,
"type":"string",
@@ -802,8 +661,8 @@
"operations":[
{
"method":"GET",
"summary":"Scrub (deserialize + reserialize at the latest version, resolving corruptions if any) the given keyspace. If columnFamilies array is empty, all CFs are scrubbed. Scrubbed CFs will be snapshotted first, if disableSnapshot is false. Scrub has the following modes: Abort (default) - abort scrub if corruption is detected; Skip (same as `skip_corrupted=true`) skip over corrupt data, omitting them from the output; Segregate - segregate data into multiple sstables if needed, such that each sstable contains data with valid order; Validate - read (no rewrite) and validate data, logging any problems found.",
"type": "long",
"summary":"Scrub (deserialize + reserialize at the latest version, skipping bad rows if any) the given keyspace. If columnFamilies array is empty, all CFs are scrubbed. Scrubbed CFs will be snapshotted first, if disableSnapshot is false",
"type":"int",
"nickname":"scrub",
"produces":[
"application/json"
@@ -825,33 +684,6 @@
"type":"boolean",
"paramType":"query"
},
{
"name":"scrub_mode",
"description":"How to handle corrupt data (overrides 'skip_corrupted'); ",
"required":false,
"allowMultiple":false,
"type":"string",
"enum":[
"ABORT",
"SKIP",
"SEGREGATE",
"VALIDATE"
],
"paramType":"query"
},
{
"name":"quarantine_mode",
"description":"Controls whether to scrub quarantined sstables (default INCLUDE)",
"required":false,
"allowMultiple":false,
"type":"string",
"enum":[
"INCLUDE",
"EXCLUDE",
"ONLY"
],
"paramType":"query"
},
{
"name":"keyspace",
"description":"The keyspace to query about",
@@ -862,7 +694,7 @@
},
{
"name":"cf",
"description":"Comma-separated column family names",
"description":"Comma seperated column family names",
"required":false,
"allowMultiple":false,
"type":"string",
@@ -878,7 +710,7 @@
{
"method":"GET",
"summary":"Rewrite all sstables to the latest version. Unlike scrub, it doesn't skip bad rows and do not snapshot sstables first.",
"type": "long",
"type":"int",
"nickname":"upgrade_sstables",
"produces":[
"application/json"
@@ -902,7 +734,7 @@
},
{
"name":"cf",
"description":"Comma-separated column family names",
"description":"Comma seperated column family names",
"required":false,
"allowMultiple":false,
"type":"string",
@@ -934,7 +766,7 @@
},
{
"name":"cf",
"description":"Comma-separated column family names",
"description":"Comma seperated column family names",
"required":false,
"allowMultiple":false,
"type":"string",
@@ -944,68 +776,13 @@
}
]
},
{
"path":"/storage_service/active_repair/",
"operations":[
{
"method":"GET",
"summary":"Return an array with the ids of the currently active repairs",
"type":"array",
"items":{
"type": "long"
},
"nickname":"get_active_repair_async",
"produces":[
"application/json"
],
"parameters":[]
}
]
},
{
"path":"/storage_service/repair_status/",
"operations":[
{
"method":"GET",
"summary":"Query the repair status and return when the repair is finished or timeout",
"type":"string",
"enum":[
"RUNNING",
"SUCCESSFUL",
"FAILED"
],
"nickname":"repair_await_completion",
"produces":[
"application/json"
],
"parameters":[
{
"name":"id",
"description":"The repair ID to check for status",
"required":true,
"allowMultiple":false,
"type": "long",
"paramType":"query"
},
{
"name":"timeout",
"description":"Seconds to wait before the query returns even if the repair is not finished. The value -1 or not providing this parameter means no timeout",
"required":false,
"allowMultiple":false,
"type": "long",
"paramType":"query"
}
]
}
]
},
{
"path":"/storage_service/repair_async/{keyspace}",
"operations":[
{
"method":"POST",
"summary":"Invoke repair asynchronously. You can track repair progress by using the get supplying id",
"type": "long",
"type":"int",
"nickname":"repair_async",
"produces":[
"application/json"
@@ -1099,14 +876,6 @@
"type":"string",
"paramType":"query"
},
{
"name":"ignore_nodes",
"description":"Which hosts are to ignore in this repair. Multiple hosts can be listed separated by commas.",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"trace",
"description":"If the value is the string 'true' with any capitalization, enable tracing of the repair.",
@@ -1144,7 +913,7 @@
"description":"The repair ID to check for status",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -1167,22 +936,6 @@
}
]
},
{
"path":"/storage_service/force_terminate_repair",
"operations":[
{
"method":"POST",
"summary":"Force terminate all repair sessions",
"type":"void",
"nickname":"force_terminate_all_repair_sessions_new",
"produces":[
"application/json"
],
"parameters":[
]
}
]
},
{
"path":"/storage_service/decommission",
"operations":[
@@ -1228,7 +981,7 @@
"operations":[
{
"method":"POST",
"summary":"Removes a node from the cluster. Replicated data that logically belonged to this node is redistributed among the remaining nodes.",
"summary":"Removes token (and all data associated with enpoint that had it) from the ring",
"type":"void",
"nickname":"remove_node",
"produces":[
@@ -1242,14 +995,6 @@
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"ignore_nodes",
"description":"Comma-separated list of dead nodes to ignore in removenode operation. Use the same method for all nodes to ignore: either Host IDs or ip addresses.",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
@@ -1440,12 +1185,11 @@
],
"parameters":[
{
"name":"type",
"description":"Which keyspaces to return",
"name":"non_system",
"description":"When set to true limit to non system",
"required":false,
"allowMultiple":false,
"type":"string",
"enum": [ "all", "user", "non_local_strategy" ],
"type":"boolean",
"paramType":"query"
}
]
@@ -1482,18 +1226,18 @@
},
{
"name":"dynamic_update_interval",
"description":"interval in ms (default 100)",
"description":"integer, in ms (default 100)",
"required":false,
"allowMultiple":false,
"type":"long",
"type":"integer",
"paramType":"query"
},
{
"name":"dynamic_reset_interval",
"description":"interval in ms (default 600,000)",
"description":"integer, in ms (default 600,000)",
"required":false,
"allowMultiple":false,
"type":"long",
"type":"integer",
"paramType":"query"
},
{
@@ -1698,7 +1442,7 @@
"description":"Stream throughput",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -1706,7 +1450,7 @@
{
"method":"GET",
"summary":"Get stream throughput mb per sec",
"type": "long",
"type":"int",
"nickname":"get_stream_throughput_mb_per_sec",
"produces":[
"application/json"
@@ -1722,7 +1466,7 @@
{
"method":"GET",
"summary":"get compaction throughput mb per sec",
"type": "long",
"type":"int",
"nickname":"get_compaction_throughput_mb_per_sec",
"produces":[
"application/json"
@@ -1744,7 +1488,7 @@
"description":"compaction throughput",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -1901,22 +1645,6 @@
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"load_and_stream",
"description":"Load the sstables and stream to all replica nodes that owns the data",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"primary_replica_only",
"description":"Load the sstables and stream to primary replica node that owns the data. Repair is needed after the load and stream process",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
@@ -1992,65 +1720,6 @@
}
]
},
{
"path":"/storage_service/slow_query",
"operations":[
{
"method":"POST",
"summary":"Set slow query parameter",
"type":"void",
"nickname":"set_slow_query",
"produces":[
"application/json"
],
"parameters":[
{
"name":"enable",
"description":"set it to true to enable, anything else to disable",
"required":false,
"allowMultiple":false,
"type":"boolean",
"paramType":"query"
},
{
"name":"ttl",
"description":"TTL in seconds",
"required":false,
"allowMultiple":false,
"type":"long",
"paramType":"query"
},
{
"name":"threshold",
"description":"Slow query record threshold in microseconds",
"required":false,
"allowMultiple":false,
"type":"long",
"paramType":"query"
},
{
"name":"fast",
"description":"Lightweight tracing mode: if true, slow queries tracing records only session headers",
"required":false,
"allowMultiple":false,
"type":"boolean",
"paramType":"query"
}
]
},
{
"method":"GET",
"summary":"Returns the slow query record configuration.",
"type":"slow_query_info",
"nickname":"get_slow_query_info",
"produces":[
"application/json"
],
"parameters":[
]
}
]
},
{
"path":"/storage_service/auto_compaction/{keyspace}",
"operations":[
@@ -2073,7 +1742,7 @@
},
{
"name":"cf",
"description":"Comma-separated column family names",
"description":"Comma seperated column family names",
"required":false,
"allowMultiple":false,
"type":"string",
@@ -2100,7 +1769,7 @@
},
{
"name":"cf",
"description":"Comma-separated column family names",
"description":"Comma seperated column family names",
"required":false,
"allowMultiple":false,
"type":"string",
@@ -2172,7 +1841,7 @@
{
"method":"GET",
"summary":"Returns the threshold for warning of queries with many tombstones",
"type": "long",
"type":"int",
"nickname":"get_tombstone_warn_threshold",
"produces":[
"application/json"
@@ -2194,7 +1863,7 @@
"description":"tombstone debug threshold",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -2207,7 +1876,7 @@
{
"method":"GET",
"summary":"",
"type": "long",
"type":"int",
"nickname":"get_tombstone_failure_threshold",
"produces":[
"application/json"
@@ -2229,7 +1898,7 @@
"description":"tombstone debug threshold",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -2242,7 +1911,7 @@
{
"method":"GET",
"summary":"Returns the threshold for rejecting queries due to a large batch size",
"type": "long",
"type":"int",
"nickname":"get_batch_size_failure_threshold",
"produces":[
"application/json"
@@ -2264,7 +1933,7 @@
"description":"batch size debug threshold",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -2288,7 +1957,7 @@
"description":"throttle in kb",
"required":true,
"allowMultiple":false,
"type": "long",
"type":"int",
"paramType":"query"
}
]
@@ -2301,7 +1970,7 @@
{
"method":"GET",
"summary":"Get load",
"type": "long",
"type":"int",
"nickname":"get_metrics_load",
"produces":[
"application/json"
@@ -2317,7 +1986,7 @@
{
"method":"GET",
"summary":"Get exceptions",
"type": "long",
"type":"int",
"nickname":"get_exceptions",
"produces":[
"application/json"
@@ -2333,7 +2002,7 @@
{
"method":"GET",
"summary":"Get total hints in progress",
"type": "long",
"type":"int",
"nickname":"get_total_hints_in_progress",
"produces":[
"application/json"
@@ -2349,7 +2018,7 @@
{
"method":"GET",
"summary":"Get total hints",
"type": "long",
"type":"int",
"nickname":"get_total_hints",
"produces":[
"application/json"
@@ -2358,77 +2027,7 @@
]
}
]
},
{
"path":"/storage_service/view_build_statuses/{keyspace}/{view}",
"operations":[
{
"method":"GET",
"summary":"Gets the progress of a materialized view build",
"type":"array",
"items":{
"type":"mapper"
},
"nickname":"view_build_statuses",
"produces":[
"application/json"
],
"parameters":[
{
"name":"keyspace",
"description":"The keyspace",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"view",
"description":"View name",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
}
]
},
{
"path":"/storage_service/sstable_info",
"operations":[
{
"method":"GET",
"summary":"SSTable information",
"type":"array",
"items":{
"type":"table_sstables"
},
"nickname":"sstable_info",
"produces":[
"application/json"
],
"parameters":[
{
"name":"keyspace",
"description":"The keyspace",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"cf",
"description":"column family name",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
]
}
}
],
"models":{
"mapper":{
@@ -2492,11 +2091,11 @@
"description":"The column family"
},
"total":{
"type":"long",
"type":"int",
"description":"The total snapshot size"
},
"live":{
"type":"long",
"type":"int",
"description":"The live snapshot size"
}
}
@@ -2518,28 +2117,6 @@
}
}
},
"slow_query_info": {
"id":"slow_query_info",
"description":"Slow query triggering information",
"properties":{
"enable":{
"type":"boolean",
"description":"Is slow query logging enable or disable"
},
"ttl":{
"type":"long",
"description":"The slow query TTL in seconds"
},
"threshold":{
"type":"long",
"description":"The slow query logging threshold in microseconds. Queries that takes longer, will be logged"
},
"fast":{
"type":"boolean",
"description":"Is lightweight tracing mode enabled. In that mode tracing ignore events and tracks only sessions."
}
}
},
"endpoint_detail":{
"id":"endpoint_detail",
"description":"Endpoint detail",
@@ -2592,92 +2169,6 @@
"description":"The endpoint details"
}
}
},
"named_maps":{
"id":"named_maps",
"properties":{
"group":{
"type":"string"
},
"attributes":{
"type":"array",
"items":{
"type":"mapper"
}
}
}
},
"sstable":{
"id":"sstable",
"properties":{
"size":{
"type":"long",
"description":"Total size in bytes of sstable"
},
"data_size":{
"type":"long",
"description":"The size in bytes on disk of data"
},
"index_size":{
"type":"long",
"description":"The size in bytes on disk of index"
},
"filter_size":{
"type":"long",
"description":"The size in bytes on disk of filter"
},
"timestamp":{
"type":"datetime",
"description":"File creation time"
},
"generation":{
"type":"long",
"description":"SSTable generation"
},
"level":{
"type":"long",
"description":"SSTable level"
},
"version":{
"type":"string",
"enum":[
"ka", "la", "mc", "md", "me"
],
"description":"SSTable version"
},
"properties":{
"type":"array",
"description":"SSTable attributes",
"items":{
"type":"mapper"
}
},
"extended_properties":{
"type":"array",
"description":"SSTable extended attributes",
"items":{
"type":"named_maps"
}
}
}
},
"table_sstables":{
"id":"table_sstables",
"description":"Per-table SSTable info and attributes",
"properties":{
"keyspace":{
"type":"string"
},
"table":{
"type":"string"
},
"sstables":{
"type":"array",
"items":{
"$ref":"sstable"
}
}
}
}
}
}


@@ -32,7 +32,7 @@
{
"method":"GET",
"summary":"Get number of active outbound streams",
"type": "long",
"type":"int",
"nickname":"get_all_active_streams_outbound",
"produces":[
"application/json"
@@ -48,7 +48,7 @@
{
"method":"GET",
"summary":"Get total incoming bytes",
"type": "long",
"type":"int",
"nickname":"get_total_incoming_bytes",
"produces":[
"application/json"
@@ -72,7 +72,7 @@
{
"method":"GET",
"summary":"Get all total incoming bytes",
"type": "long",
"type":"int",
"nickname":"get_all_total_incoming_bytes",
"produces":[
"application/json"
@@ -88,7 +88,7 @@
{
"method":"GET",
"summary":"Get total outgoing bytes",
"type": "long",
"type":"int",
"nickname":"get_total_outgoing_bytes",
"produces":[
"application/json"
@@ -112,7 +112,7 @@
{
"method":"GET",
"summary":"Get all total outgoing bytes",
"type": "long",
"type":"int",
"nickname":"get_all_total_outgoing_bytes",
"produces":[
"application/json"
@@ -154,7 +154,7 @@
"description":"The peer"
},
"session_index":{
"type": "long",
"type":"int",
"description":"The session index"
},
"connecting":{
@@ -211,7 +211,7 @@
"description":"The ID"
},
"files":{
"type": "long",
"type":"int",
"description":"Number of files to transfer. Can be 0 if nothing to transfer for some streaming request."
},
"total_size":{
@@ -242,7 +242,7 @@
"description":"The peer address"
},
"session_index":{
"type": "long",
"type":"int",
"description":"The session index"
},
"file_name":{


@@ -1,29 +0,0 @@
{
"swagger": "2.0",
"info": {
"version": "1.0.0",
"title": "Scylla API",
"description": "The scylla API version 2.0",
"termsOfService": "http://www.scylladb.com/tos/",
"contact": {
"name": "Scylla Team",
"email": "info@scylladb.com",
"url": "http://scylladb.com"
},
"license": {
"name": "AGPL",
"url": "https://github.com/scylladb/scylla/blob/master/LICENSE.AGPL"
}
},
"host": "{{Host}}",
"basePath": "/v2",
"schemes": [
"http"
],
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"paths": {


@@ -52,76 +52,6 @@
}
]
},
{
"path":"/system/log",
"operations":[
{
"method":"POST",
"summary":"Write a message to the Scylla log",
"type":"void",
"nickname":"write_log_message",
"produces":[
"application/json"
],
"parameters":[
{
"name":"message",
"description":"The message to write to the log",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"level",
"description":"The logging level to use",
"required":true,
"allowMultiple":false,
"type":"string",
"enum":[
"error",
"warn",
"info",
"debug",
"trace"
],
"paramType":"query"
}
]
}
]
},
{
"path":"/system/drop_sstable_caches",
"operations":[
{
"method":"POST",
"summary":"Drop in-memory caches for data which is in sstables",
"type":"void",
"nickname":"drop_sstable_caches",
"produces":[
"application/json"
],
"parameters":[
]
}
]
},
{
"path":"/system/uptime_ms",
"operations":[
{
"method":"GET",
"summary":"Get system uptime, in milliseconds",
"type":"long",
"nickname":"get_system_uptime",
"produces":[
"application/json"
],
"parameters":[]
}
]
},
{
"path":"/system/logger/{name}",
"operations":[


@@ -1,251 +0,0 @@
{
"apiVersion":"0.0.1",
"swaggerVersion":"1.2",
"basePath":"{{Protocol}}://{{Host}}",
"resourcePath":"/task_manager",
"produces":[
"application/json"
],
"apis":[
{
"path":"/task_manager/list_modules",
"operations":[
{
"method":"GET",
"summary":"Get all modules names",
"type":"array",
"items":{
"type":"string"
},
"nickname":"get_modules",
"produces":[
"application/json"
],
"parameters":[
]
}
]
},
{
"path":"/task_manager/list_module_tasks/{module}",
"operations":[
{
"method":"GET",
"summary":"Get a list of tasks",
"type":"array",
"items":{
"type":"task_stats"
},
"nickname":"get_tasks",
"produces":[
"application/json"
],
"parameters":[
{
"name":"module",
"description":"The module to query about",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"internal",
"description":"Boolean flag indicating whether internal tasks should be shown (false by default)",
"required":false,
"allowMultiple":false,
"type":"boolean",
"paramType":"query"
},
{
"name":"keyspace",
"description":"The keyspace to query about",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"table",
"description":"The table to query about",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
]
},
{
"path":"/task_manager/task_status/{task_id}",
"operations":[
{
"method":"GET",
"summary":"Get task status",
"type":"task_status",
"nickname":"get_task_status",
"produces":[
"application/json"
],
"parameters":[
{
"name":"task_id",
"description":"The uuid of a task to query about",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
}
]
},
{
"path":"/task_manager/abort_task/{task_id}",
"operations":[
{
"method":"POST",
"summary":"Abort running task and its descendants",
"type":"void",
"nickname":"abort_task",
"produces":[
"application/json"
],
"parameters":[
{
"name":"task_id",
"description":"The uuid of a task to abort",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
}
]
},
{
"path":"/task_manager/wait_task/{task_id}",
"operations":[
{
"method":"GET",
"summary":"Wait for a task to complete",
"type":"task_status",
"nickname":"wait_task",
"produces":[
"application/json"
],
"parameters":[
{
"name":"task_id",
"description":"The uuid of a task to wait for",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
}
]
}
]
}
],
"models":{
"task_stats" :{
"id": "task_stats",
"description":"A task statistics object",
"properties":{
"task_id":{
"type":"string",
"description":"The uuid of a task"
},
"state":{
"type":"string",
"enum":[
"created",
"running",
"done",
"failed"
],
"description":"The state of a task"
}
}
},
"task_status":{
"id":"task_status",
"description":"A task status object",
"properties":{
"id":{
"type":"string",
"description":"The uuid of the task"
},
"type":{
"type":"string",
"description":"The description of the task"
},
"state":{
"type":"string",
"enum":[
"created",
"running",
"done",
"failed"
],
"description":"The state of the task"
},
"is_abortable":{
"type":"boolean",
"description":"Boolean flag indicating whether the task can be aborted"
},
"start_time":{
"type":"datetime",
"description":"The start time of the task"
},
"end_time":{
"type":"datetime",
"description":"The end time of the task (unspecified when the task is not completed)"
},
"error":{
"type":"string",
"description":"Error string, if the task failed"
},
"parent_id":{
"type":"string",
"description":"The uuid of the parent task"
},
"sequence_number":{
"type":"long",
"description":"The running sequence number of the task"
},
"shard":{
"type":"long",
"description":"The number of a shard the task is running on"
},
"keyspace":{
"type":"string",
"description":"The keyspace the task is working on (if applicable)"
},
"table":{
"type":"string",
"description":"The table the task is working on (if applicable)"
},
"entity":{
"type":"string",
"description":"Task-specific entity description"
},
"progress_units":{
"type":"string",
"description":"A description of the progress units"
},
"progress_total":{
"type":"double",
"description":"The total number of units to complete for the task"
},
"progress_completed":{
"type":"double",
"description":"The number of units completed so far"
}
}
}
}
}


@@ -1,185 +0,0 @@
{
"apiVersion":"0.0.1",
"swaggerVersion":"1.2",
"basePath":"{{Protocol}}://{{Host}}",
"resourcePath":"/task_manager_test",
"produces":[
"application/json"
],
"apis":[
{
"path":"/task_manager_test/test_module",
"operations":[
{
"method":"POST",
"summary":"Register test module in task manager",
"type":"void",
"nickname":"register_test_module",
"produces":[
"application/json"
],
"parameters":[
]
},
{
"method":"DELETE",
"summary":"Unregister test module in task manager",
"type":"void",
"nickname":"unregister_test_module",
"produces":[
"application/json"
],
"parameters":[
]
}
]
},
{
"path":"/task_manager_test/test_task",
"operations":[
{
"method":"POST",
"summary":"Register test task",
"type":"string",
"nickname":"register_test_task",
"produces":[
"application/json"
],
"parameters":[
{
"name":"task_id",
"description":"The uuid of a task to register",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"shard",
"description":"The shard of the task",
"required":false,
"allowMultiple":false,
"type":"long",
"paramType":"query"
},
{
"name":"parent_id",
"description":"The uuid of a parent task",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"keyspace",
"description":"The keyspace the task is working on",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"table",
"description":"The table the task is working on",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"type",
"description":"The type of the task",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
},
{
"name":"entity",
"description":"Task-specific entity description",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
},
{
"method":"DELETE",
"summary":"Unregister test task",
"type":"void",
"nickname":"unregister_test_task",
"produces":[
"application/json"
],
"parameters":[
{
"name":"task_id",
"description":"The uuid of a task to register",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
]
},
{
"path":"/task_manager_test/finish_test_task/{task_id}",
"operations":[
{
"method":"POST",
"summary":"Finish test task",
"type":"void",
"nickname":"finish_test_task",
"produces":[
"application/json"
],
"parameters":[
{
"name":"task_id",
"description":"The uuid of a task to finish",
"required":true,
"allowMultiple":false,
"type":"string",
"paramType":"path"
},
{
"name":"error",
"description":"The error with which task fails (if it does)",
"required":false,
"allowMultiple":false,
"type":"string",
"paramType":"query"
}
]
}
]
},
{
"path":"/task_manager_test/ttl",
"operations":[
{
"method":"POST",
"summary":"Set ttl in seconds and get last value",
"type":"long",
"nickname":"get_and_update_ttl",
"produces":[
"application/json"
],
"parameters":[
{
"name":"ttl",
"description":"The number of seconds for which the tasks will be kept in memory after it finishes",
"required":true,
"allowMultiple":false,
"type":"long",
"paramType":"query"
}
]
}
]
}
]
}


@@ -65,41 +65,6 @@
"description":"The series of values to which the counts in `buckets` correspond"
}
}
},
"rate_moving_average": {
"id":"rate_moving_average",
"description":"A meter metric which measures mean throughput and one, five, and fifteen-minute exponentially-weighted moving average throughputs",
"properties":{
"rates": {
"type":"array",
"items":{
"type":"double"
},
"description":"One, five and fifteen mintues rates"
},
"mean_rate": {
"type":"double",
"description":"The mean rate from startup"
},
"count": {
"type":"long",
"description":"Total number of events from startup"
}
}
},
"rate_moving_average_and_histogram": {
"id":"rate_moving_average_and_histogram",
"description":"A timer metric which aggregates timing durations and provides duration statistics, plus throughput statistics",
"properties":{
"meter": {
"type":"rate_moving_average",
"description":"The metric rate moving average"
},
"hist": {
"type":"histogram",
"description":"The metric histogram"
}
}
}
}
}
}
}


@@ -1,15 +1,28 @@
/*
* Copyright 2015-present ScyllaDB
* Copyright 2015 ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "api.hh"
#include <seastar/http/file_handler.hh>
#include <seastar/http/transformers.hh>
#include <seastar/http/api_docs.hh>
#include "http/file_handler.hh"
#include "http/transformers.hh"
#include "http/api_docs.hh"
#include "storage_service.hh"
#include "commitlog.hh"
#include "gossiper.hh"
@@ -23,52 +36,35 @@
#include "endpoint_snitch.hh"
#include "compaction_manager.hh"
#include "hinted_handoff.hh"
#include "error_injection.hh"
#include "authorization_cache.hh"
#include <seastar/http/exception.hh>
#include "http/exception.hh"
#include "stream_manager.hh"
#include "system.hh"
#include "api/config.hh"
#include "task_manager.hh"
#include "task_manager_test.hh"
logging::logger apilog("api");
namespace api {
static std::unique_ptr<reply> exception_reply(std::exception_ptr eptr) {
try {
std::rethrow_exception(eptr);
} catch (const replica::no_such_keyspace& ex) {
} catch (const no_such_keyspace& ex) {
throw bad_param_exception(ex.what());
}
// We never going to get here
throw std::runtime_error("exception_reply");
return std::make_unique<reply>();
}
future<> set_server_init(http_context& ctx) {
auto rb = std::make_shared < api_registry_builder > (ctx.api_doc);
auto rb02 = std::make_shared < api_registry_builder20 > (ctx.api_doc, "/v2");
return ctx.http_server.set_routes([rb, &ctx, rb02](routes& r) {
return ctx.http_server.set_routes([rb, &ctx](routes& r) {
r.register_exeption_handler(exception_reply);
r.put(GET, "/ui", new httpd::file_handler(ctx.api_dir + "/index.html",
new content_replace("html")));
r.add(GET, url("/ui").remainder("path"), new httpd::directory_handler(ctx.api_dir,
new content_replace("html")));
rb->set_api_doc(r);
rb02->set_api_doc(r);
rb02->register_api_file(r, "swagger20_header");
rb->register_function(r, "system",
"The system related API");
set_system(ctx, r);
});
}
future<> set_server_config(http_context& ctx, const db::config& cfg) {
auto rb02 = std::make_shared < api_registry_builder20 > (ctx.api_doc, "/v2");
return ctx.http_server.set_routes([&ctx, &cfg, rb02](routes& r) {
set_config(rb02, ctx, r, cfg);
rb->set_api_doc(r);
});
}
@@ -83,86 +79,13 @@ static future<> register_api(http_context& ctx, const sstring& api_name,
});
}
future<> set_transport_controller(http_context& ctx, cql_transport::controller& ctl) {
return ctx.http_server.set_routes([&ctx, &ctl] (routes& r) { set_transport_controller(ctx, r, ctl); });
future<> set_server_storage_service(http_context& ctx) {
return register_api(ctx, "storage_service", "The storage service API", set_storage_service);
}
future<> unset_transport_controller(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_transport_controller(ctx, r); });
}
future<> set_rpc_controller(http_context& ctx, thrift_controller& ctl) {
return ctx.http_server.set_routes([&ctx, &ctl] (routes& r) { set_rpc_controller(ctx, r, ctl); });
}
future<> unset_rpc_controller(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_rpc_controller(ctx, r); });
}
future<> set_server_storage_service(http_context& ctx, sharded<service::storage_service>& ss, sharded<gms::gossiper>& g, sharded<cdc::generation_service>& cdc_gs, sharded<db::system_keyspace>& sys_ks) {
return register_api(ctx, "storage_service", "The storage service API", [&ss, &g, &cdc_gs, &sys_ks] (http_context& ctx, routes& r) {
set_storage_service(ctx, r, ss, g.local(), cdc_gs, sys_ks);
});
}
future<> set_server_sstables_loader(http_context& ctx, sharded<sstables_loader>& sst_loader) {
return ctx.http_server.set_routes([&ctx, &sst_loader] (routes& r) { set_sstables_loader(ctx, r, sst_loader); });
}
future<> unset_server_sstables_loader(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_sstables_loader(ctx, r); });
}
future<> set_server_view_builder(http_context& ctx, sharded<db::view::view_builder>& vb) {
return ctx.http_server.set_routes([&ctx, &vb] (routes& r) { set_view_builder(ctx, r, vb); });
}
future<> unset_server_view_builder(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_view_builder(ctx, r); });
}
future<> set_server_repair(http_context& ctx, sharded<repair_service>& repair) {
return ctx.http_server.set_routes([&ctx, &repair] (routes& r) { set_repair(ctx, r, repair); });
}
future<> unset_server_repair(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_repair(ctx, r); });
}
future<> set_server_authorization_cache(http_context &ctx, sharded<auth::service> &auth_service) {
return register_api(ctx, "authorization_cache",
"The authorization cache API", [&auth_service] (http_context &ctx, routes &r) {
set_authorization_cache(ctx, r, auth_service);
});
}
future<> unset_server_authorization_cache(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_authorization_cache(ctx, r); });
}
future<> set_server_snapshot(http_context& ctx, sharded<db::snapshot_ctl>& snap_ctl) {
return ctx.http_server.set_routes([&ctx, &snap_ctl] (routes& r) { set_snapshot(ctx, r, snap_ctl); });
}
future<> unset_server_snapshot(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_snapshot(ctx, r); });
}
future<> set_server_snitch(http_context& ctx, sharded<locator::snitch_ptr>& snitch) {
return register_api(ctx, "endpoint_snitch_info", "The endpoint snitch info API", [&snitch] (http_context& ctx, routes& r) {
set_endpoint_snitch(ctx, r, snitch);
});
}
future<> unset_server_snitch(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_endpoint_snitch(ctx, r); });
}
future<> set_server_gossip(http_context& ctx, sharded<gms::gossiper>& g) {
future<> set_server_gossip(http_context& ctx) {
return register_api(ctx, "gossiper",
"The gossiper API", [&g] (http_context& ctx, routes& r) {
set_gossiper(ctx, r, g.local());
});
"The gossiper API", set_gossiper);
}
future<> set_server_load_sstable(http_context& ctx) {
@@ -170,67 +93,35 @@ future<> set_server_load_sstable(http_context& ctx) {
"The column family API", set_column_family);
}
future<> set_server_messaging_service(http_context& ctx, sharded<netw::messaging_service>& ms) {
future<> set_server_messaging_service(http_context& ctx) {
return register_api(ctx, "messaging_service",
"The messaging service API", [&ms] (http_context& ctx, routes& r) {
set_messaging_service(ctx, r, ms);
});
}
future<> unset_server_messaging_service(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_messaging_service(ctx, r); });
"The messaging service API", set_messaging_service);
}
future<> set_server_storage_proxy(http_context& ctx, sharded<service::storage_service>& ss) {
future<> set_server_storage_proxy(http_context& ctx) {
return register_api(ctx, "storage_proxy",
"The storage proxy API", [&ss] (http_context& ctx, routes& r) {
set_storage_proxy(ctx, r, ss);
});
"The storage proxy API", set_storage_proxy);
}
future<> set_server_stream_manager(http_context& ctx, sharded<streaming::stream_manager>& sm) {
future<> set_server_stream_manager(http_context& ctx) {
return register_api(ctx, "stream_manager",
"The stream manager API", [&sm] (http_context& ctx, routes& r) {
set_stream_manager(ctx, r, sm);
});
"The stream manager API", set_stream_manager);
}
future<> unset_server_stream_manager(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_stream_manager(ctx, r); });
}
future<> set_server_cache(http_context& ctx) {
return register_api(ctx, "cache_service",
"The cache service API", set_cache_service);
}
future<> set_hinted_handoff(http_context& ctx, sharded<gms::gossiper>& g) {
return register_api(ctx, "hinted_handoff",
"The hinted handoff API", [&g] (http_context& ctx, routes& r) {
set_hinted_handoff(ctx, r, g.local());
});
}
future<> unset_hinted_handoff(http_context& ctx) {
return ctx.http_server.set_routes([&ctx] (routes& r) { unset_hinted_handoff(ctx, r); });
}
future<> set_server_gossip_settle(http_context& ctx, sharded<gms::gossiper>& g) {
auto rb = std::make_shared < api_registry_builder > (ctx.api_doc);
return ctx.http_server.set_routes([rb, &ctx, &g](routes& r) {
rb->register_function(r, "failure_detector",
"The failure detector API");
set_failure_detector(ctx, r, g.local());
});
}
future<> set_server_compaction_manager(http_context& ctx) {
future<> set_server_gossip_settle(http_context& ctx) {
auto rb = std::make_shared < api_registry_builder > (ctx.api_doc);
return ctx.http_server.set_routes([rb, &ctx](routes& r) {
rb->register_function(r, "compaction_manager",
"The Compaction manager API");
set_compaction_manager(ctx, r);
rb->register_function(r, "failure_detector",
"The failure detector API");
set_failure_detector(ctx,r);
rb->register_function(r, "cache_service",
"The cache service API");
set_cache_service(ctx,r);
rb->register_function(r, "endpoint_snitch_info",
"The endpoint snitch info API");
set_endpoint_snitch(ctx, r);
});
}
@@ -238,71 +129,23 @@ future<> set_server_done(http_context& ctx) {
auto rb = std::make_shared < api_registry_builder > (ctx.api_doc);
return ctx.http_server.set_routes([rb, &ctx](routes& r) {
rb->register_function(r, "compaction_manager",
"The Compaction manager API");
set_compaction_manager(ctx, r);
rb->register_function(r, "lsa", "Log-structured allocator API");
set_lsa(ctx, r);
rb->register_function(r, "commitlog",
"The commit log API");
set_commitlog(ctx,r);
rb->register_function(r, "hinted_handoff",
"The hinted handoff API");
set_hinted_handoff(ctx, r);
rb->register_function(r, "collectd",
"The collectd API");
set_collectd(ctx, r);
rb->register_function(r, "error_injection",
"The error injection API");
set_error_injection(ctx, r);
});
}
future<> set_server_task_manager(http_context& ctx) {
auto rb = std::make_shared < api_registry_builder > (ctx.api_doc);
return ctx.http_server.set_routes([rb, &ctx](routes& r) {
rb->register_function(r, "task_manager",
"The task manager API");
set_task_manager(ctx, r);
});
}
#ifndef SCYLLA_BUILD_MODE_RELEASE
future<> set_server_task_manager_test(http_context& ctx, lw_shared_ptr<db::config> cfg) {
auto rb = std::make_shared < api_registry_builder > (ctx.api_doc);
return ctx.http_server.set_routes([rb, &ctx, &cfg = *cfg](routes& r) mutable {
rb->register_function(r, "task_manager_test",
"The task manager test API");
set_task_manager_test(ctx, r, cfg);
});
}
#endif
void req_params::process(const request& req) {
// Process mandatory parameters
for (auto& [name, ent] : params) {
if (!ent.is_mandatory) {
continue;
}
try {
ent.value = req.param[name];
} catch (std::out_of_range&) {
throw httpd::bad_param_exception(fmt::format("Mandatory parameter '{}' was not provided", name));
}
}
// Process optional parameters
for (auto& [name, value] : req.query_parameters) {
try {
auto& ent = params.at(name);
if (ent.is_mandatory) {
throw httpd::bad_param_exception(fmt::format("Parameter '{}' is expected to be provided as part of the request url", name));
}
ent.value = value;
} catch (std::out_of_range&) {
throw httpd::bad_param_exception(fmt::format("Unsupported optional parameter '{}'", name));
}
}
}
}
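The two passes in `req_params::process` above (mandatory values from the URL path first, then optional values from the query string) can be sketched without the HTTP layer. This is a minimal illustration, not the project's code: `fake_request`, `param_def`, `param_table` and `process_params` are hypothetical names, and `std::invalid_argument` stands in for `httpd::bad_param_exception`.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an HTTP request: path parameters and query
// parameters live in two separate maps.
struct fake_request {
    std::unordered_map<std::string, std::string> path_params;
    std::unordered_map<std::string, std::string> query_params;
};

// Hypothetical parameter definition: an optional value plus a flag.
struct param_def {
    std::optional<std::string> value;
    bool is_mandatory = false;
};

using param_table = std::unordered_map<std::string, param_def>;

// Fills `params` from `req`; throws std::invalid_argument on any violation.
inline void process_params(param_table& params, const fake_request& req) {
    // Pass 1: every mandatory parameter must be present in the path.
    for (auto& kv : params) {
        if (!kv.second.is_mandatory) {
            continue;
        }
        auto it = req.path_params.find(kv.first);
        if (it == req.path_params.end()) {
            throw std::invalid_argument("Mandatory parameter '" + kv.first + "' was not provided");
        }
        kv.second.value = it->second;
    }
    // Pass 2: every query parameter must be a known, optional parameter.
    for (const auto& kv : req.query_params) {
        auto it = params.find(kv.first);
        if (it == params.end()) {
            throw std::invalid_argument("Unsupported optional parameter '" + kv.first + "'");
        }
        if (it->second.is_mandatory) {
            throw std::invalid_argument("Parameter '" + kv.first + "' must be part of the request url");
        }
        it->second.value = kv.second;
    }
}
```

In this sketch an unknown query parameter is rejected rather than ignored, which matches the behaviour of the function above.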

@@ -1,25 +1,34 @@
/*
* Copyright 2015-present ScyllaDB
* Copyright 2015 ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <seastar/json/json_elements.hh>
#include <type_traits>
#include "json/json_elements.hh"
#include <boost/lexical_cast.hpp>
#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string/classification.hpp>
#include <boost/units/detail/utility.hpp>
#include "api/api-doc/utils.json.hh"
#include "utils/histogram.hh"
#include "utils/estimated_histogram.hh"
#include <seastar/http/exception.hh>
#include "http/exception.hh"
#include "api_init.hh"
#include "seastarx.hh"
namespace api {
@@ -58,7 +67,7 @@ T map_sum(T&& dest, const S& src) {
for (auto i : src) {
dest[i.first] += i.second;
}
return std::move(dest);
return dest;
}
template <typename MAP>
@@ -81,6 +90,13 @@ inline std::vector<sstring> split(const sstring& text, const char* separator) {
return boost::split(tokens, text, boost::is_any_of(separator));
}
/**
* Split a column family parameter
*/
inline std::vector<sstring> split_cf(const sstring& cf) {
return split(cf, ",");
}
/**
* A helper function to sum values on a distributed object that
* has a get_stats method.
@@ -94,57 +110,61 @@ future<json::json_return_type> sum_stats(distributed<T>& d, V F::*f) {
});
}
inline double pow2(double a) {
return a * a;
}
// FIXME: Move to utils::ihistogram::operator+=()
inline utils::ihistogram add_histogram(utils::ihistogram res,
const utils::ihistogram& val) {
if (res.count == 0) {
return val;
}
if (val.count == 0) {
return std::move(res);
}
if (res.min > val.min) {
res.min = val.min;
}
if (res.max < val.max) {
res.max = val.max;
}
double ncount = res.count + val.count;
// To get an estimated sum we take the estimated mean
// and multiply it by the true count
res.sum = res.sum + val.mean * val.count;
double a = res.count/ncount;
double b = val.count/ncount;
double mean = a * res.mean + b * val.mean;
res.variance = (res.variance + pow2(res.mean - mean) )* a +
(val.variance + pow2(val.mean -mean))* b;
res.mean = mean;
res.count = res.count + val.count;
for (auto i : val.sample) {
res.sample.push_back(i);
}
return res;
}
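The mean and variance arithmetic in `add_histogram` above is a count-weighted merge of two summaries. The sketch below isolates just that step, assuming population (not sample) variance. `summary` and `merge_summaries` are hypothetical names, and the min/max/sample handling of the original is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reduced summary: only the fields needed for the merge.
struct summary {
    std::uint64_t count = 0;
    double mean = 0;
    double variance = 0;  // population variance
};

// Merges two summaries as if their underlying samples had been pooled.
inline summary merge_summaries(summary a, const summary& b) {
    if (a.count == 0) {
        return b;
    }
    if (b.count == 0) {
        return a;
    }
    const double n = static_cast<double>(a.count) + static_cast<double>(b.count);
    const double wa = static_cast<double>(a.count) / n;  // weight of a
    const double wb = static_cast<double>(b.count) / n;  // weight of b
    const double mean = wa * a.mean + wb * b.mean;
    const double da = a.mean - mean;
    const double db = b.mean - mean;
    // Each side contributes its own variance plus the squared shift of its
    // mean relative to the pooled mean.
    a.variance = wa * (a.variance + da * da) + wb * (b.variance + db * db);
    a.mean = mean;
    a.count += b.count;
    return a;
}
```

For example, merging the summary of {1, 3} (mean 2, variance 1) with the summary of {5, 7} (mean 6, variance 1) yields mean 4 and variance 5, the same as computing both directly over {1, 3, 5, 7}.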
inline
httpd::utils_json::histogram to_json(const utils::ihistogram& val) {
httpd::utils_json::histogram h;
h = val;
h.sum = val.estimated_sum();
return h;
}
inline
httpd::utils_json::rate_moving_average meter_to_json(const utils::rate_moving_average& val) {
httpd::utils_json::rate_moving_average m;
m = val;
return m;
}
inline
httpd::utils_json::rate_moving_average_and_histogram timer_to_json(const utils::rate_moving_average_and_histogram& val) {
httpd::utils_json::rate_moving_average_and_histogram h;
h.hist = to_json(val.hist);
h.meter = meter_to_json(val.rate);
return h;
}
template<class T, class F>
future<json::json_return_type> sum_histogram_stats(distributed<T>& d, utils::timed_rate_moving_average_and_histogram F::*f) {
future<json::json_return_type> sum_histogram_stats(distributed<T>& d, utils::ihistogram F::*f) {
return d.map_reduce0([f](const T& p) {return (p.get_stats().*f).hist;}, utils::ihistogram(),
std::plus<utils::ihistogram>()).then([](const utils::ihistogram& val) {
return d.map_reduce0([f](const T& p) {return p.get_stats().*f;}, utils::ihistogram(),
add_histogram).then([](const utils::ihistogram& val) {
return make_ready_future<json::json_return_type>(to_json(val));
});
}
template<class T, class F>
future<json::json_return_type> sum_timer_stats(distributed<T>& d, utils::timed_rate_moving_average_and_histogram F::*f) {
return d.map_reduce0([f](const T& p) {return (p.get_stats().*f).rate();}, utils::rate_moving_average_and_histogram(),
std::plus<utils::rate_moving_average_and_histogram>()).then([](const utils::rate_moving_average_and_histogram& val) {
return make_ready_future<json::json_return_type>(timer_to_json(val));
});
}
template<class T, class F>
future<json::json_return_type> sum_timer_stats(distributed<T>& d, utils::timed_rate_moving_average_summary_and_histogram F::*f) {
return d.map_reduce0([f](const T& p) {return (p.get_stats().*f).rate();}, utils::rate_moving_average_and_histogram(),
std::plus<utils::rate_moving_average_and_histogram>()).then([](const utils::rate_moving_average_and_histogram& val) {
return make_ready_future<json::json_return_type>(timer_to_json(val));
});
}
inline int64_t min_int64(int64_t a, int64_t b) {
return std::min(a,b);
}
@@ -158,36 +178,33 @@ inline int64_t max_int64(int64_t a, int64_t b) {
* It combines the total and the subset used for the ratio, and its
* to_json method returns the ratio sub/total
*/
template<typename T>
struct basic_ratio_holder : public json::jsonable {
T total = 0;
T sub = 0;
struct ratio_holder : public json::jsonable {
double total = 0;
double sub = 0;
virtual std::string to_json() const {
if (total == 0) {
return "0";
}
return std::to_string(sub/total);
}
basic_ratio_holder() = default;
basic_ratio_holder& add(T _total, T _sub) {
ratio_holder() = default;
ratio_holder& add(double _total, double _sub) {
total += _total;
sub += _sub;
return *this;
}
basic_ratio_holder(T _total, T _sub) {
ratio_holder(double _total, double _sub) {
total = _total;
sub = _sub;
}
basic_ratio_holder<T>& operator+=(const basic_ratio_holder<T>& a) {
ratio_holder& operator+=(const ratio_holder& a) {
return add(a.total, a.sub);
}
friend basic_ratio_holder<T> operator+(basic_ratio_holder a, const basic_ratio_holder<T>& b) {
friend ratio_holder operator+(ratio_holder a, const ratio_holder& b) {
return a += b;
}
};
typedef basic_ratio_holder<double> ratio_holder;
typedef basic_ratio_holder<int64_t> integral_ratio_holder;
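The ratio holder above accumulates `total` and `sub` separately and divides only when serialized, so partial results from several shards can be added before the ratio is taken. A minimal sketch of that idea, without the JSON base class, follows. `ratio_acc` is a hypothetical name, and converting to `double` before dividing is an assumption made here so that an integral instantiation still yields a fractional result.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical accumulator: keeps numerator and denominator apart so that
// partial results can be summed before the ratio is computed.
template <typename T>
struct ratio_acc {
    T total{};
    T sub{};

    ratio_acc& operator+=(const ratio_acc& o) {
        total += o.total;
        sub += o.sub;
        return *this;
    }

    friend ratio_acc operator+(ratio_acc a, const ratio_acc& b) {
        return a += b;
    }

    // Returns sub/total as a double; 0 when nothing was accumulated.
    double ratio() const {
        if (total == 0) {
            return 0.0;
        }
        return static_cast<double>(sub) / static_cast<double>(total);
    }
};
```

Summing the parts first matters: averaging per-shard ratios would weight a shard with 10 requests the same as one with 30.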
class unimplemented_exception : public base_exception {
public:
@@ -207,105 +224,4 @@ std::vector<T> concat(std::vector<T> a, std::vector<T>&& b) {
return a;
}
template <class T, class Base = T>
class req_param {
public:
sstring name;
sstring param;
T value;
req_param(const request& req, sstring name, T default_val) : name(name) {
param = req.get_query_param(name);
if (param.empty()) {
value = default_val;
return;
}
try {
// boost::lexical_cast does not use boolalpha. Converting a
// true/false throws exceptions. We don't want that.
if constexpr (std::is_same_v<Base, bool>) {
// Cannot use boolalpha because we (probably) want to
// accept 1 and 0 as well as true and false. And True. And fAlse.
std::transform(param.begin(), param.end(), param.begin(), ::tolower);
if (param == "true" || param == "1") {
value = T(true);
} else if (param == "false" || param == "0") {
value = T(false);
} else {
throw boost::bad_lexical_cast{};
}
} else {
value = T{boost::lexical_cast<Base>(param)};
}
} catch (boost::bad_lexical_cast&) {
throw bad_param_exception(format("{} ({}): type error - should be {}", name, param, boost::units::detail::demangle(typeid(Base).name())));
}
}
operator T() const { return value; }
};
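The boolean branch of `req_param` above accepts `true`, `false`, `1` and `0` in any letter case. A standalone sketch of that parsing rule is shown here. `parse_bool` is a hypothetical helper, and returning `std::nullopt` instead of throwing `boost::bad_lexical_cast` is a simplification for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <optional>
#include <string>

// Hypothetical helper: case-insensitive parse of true/false/1/0.
// Returns std::nullopt for anything else.
inline std::optional<bool> parse_bool(std::string text) {
    // Go through unsigned char so std::tolower never sees a negative value.
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (text == "true" || text == "1") {
        return true;
    }
    if (text == "false" || text == "0") {
        return false;
    }
    return std::nullopt;
}
```

A caller that wants the original behaviour could throw its own bad-parameter exception when the result is empty.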
using mandatory = bool_class<struct mandatory_tag>;
class req_params {
public:
struct def {
std::optional<sstring> value;
mandatory is_mandatory = mandatory::no;
def(std::optional<sstring> value_ = std::nullopt, mandatory is_mandatory_ = mandatory::no)
: value(std::move(value_))
, is_mandatory(is_mandatory_)
{ }
def(mandatory is_mandatory_)
: is_mandatory(is_mandatory_)
{ }
};
private:
std::unordered_map<sstring, def> params;
public:
req_params(std::initializer_list<std::pair<sstring, def>> l) {
for (const auto& [name, ent] : l) {
add(std::move(name), std::move(ent));
}
}
void add(sstring name, def ent) {
params.emplace(std::move(name), std::move(ent));
}
void process(const request& req);
const std::optional<sstring>& get(const char* name) const {
return params.at(name).value;
}
template <typename T = sstring>
const std::optional<T> get_as(const char* name) const {
return get(name);
}
template <typename T = sstring>
requires std::same_as<T, bool>
const std::optional<bool> get_as(const char* name) const {
auto value = get(name);
if (!value) {
return std::nullopt;
}
std::transform(value->begin(), value->end(), value->begin(), ::tolower);
if (value == "true" || value == "yes" || value == "1") {
return true;
}
if (value == "false" || value == "no" || value == "0") {
return false;
}
throw boost::bad_lexical_cast{};
}
};
utils_json::estimated_histogram time_to_json_histogram(const utils::time_estimated_histogram& val);
}

@@ -3,60 +3,25 @@
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <seastar/http/httpd.hh>
#include <seastar/core/future.hh>
#include "replica/database_fwd.hh"
#include "tasks/task_manager.hh"
#include "seastarx.hh"
namespace service {
class load_meter;
class storage_proxy;
class storage_service;
} // namespace service
class sstables_loader;
namespace streaming {
class stream_manager;
}
namespace locator {
class token_metadata;
class shared_token_metadata;
class snitch_ptr;
} // namespace locator
namespace cql_transport { class controller; }
class thrift_controller;
namespace db {
class snapshot_ctl;
class config;
namespace view {
class view_builder;
}
class system_keyspace;
}
namespace netw { class messaging_service; }
class repair_service;
namespace cdc { class generation_service; }
namespace gms {
class gossiper;
}
namespace auth { class service; }
#include "database.hh"
#include "service/storage_proxy.hh"
#include "http/httpd.hh"
namespace api {
@@ -64,54 +29,23 @@ struct http_context {
sstring api_dir;
sstring api_doc;
httpd::http_server_control http_server;
distributed<replica::database>& db;
distributed<database>& db;
distributed<service::storage_proxy>& sp;
service::load_meter& lmeter;
const sharded<locator::shared_token_metadata>& shared_token_metadata;
sharded<tasks::task_manager>& tm;
http_context(distributed<replica::database>& _db,
distributed<service::storage_proxy>& _sp,
service::load_meter& _lm, const sharded<locator::shared_token_metadata>& _stm, sharded<tasks::task_manager>& _tm)
: db(_db), sp(_sp), lmeter(_lm), shared_token_metadata(_stm), tm(_tm) {
http_context(distributed<database>& _db,
distributed<service::storage_proxy>& _sp)
: db(_db), sp(_sp) {
}
const locator::token_metadata& get_token_metadata();
};
future<> set_server_init(http_context& ctx);
future<> set_server_config(http_context& ctx, const db::config& cfg);
future<> set_server_snitch(http_context& ctx, sharded<locator::snitch_ptr>& snitch);
future<> unset_server_snitch(http_context& ctx);
future<> set_server_storage_service(http_context& ctx, sharded<service::storage_service>& ss, sharded<gms::gossiper>& g, sharded<cdc::generation_service>& cdc_gs, sharded<db::system_keyspace>& sys_ks);
future<> set_server_sstables_loader(http_context& ctx, sharded<sstables_loader>& sst_loader);
future<> unset_server_sstables_loader(http_context& ctx);
future<> set_server_view_builder(http_context& ctx, sharded<db::view::view_builder>& vb);
future<> unset_server_view_builder(http_context& ctx);
future<> set_server_repair(http_context& ctx, sharded<repair_service>& repair);
future<> unset_server_repair(http_context& ctx);
future<> set_transport_controller(http_context& ctx, cql_transport::controller& ctl);
future<> unset_transport_controller(http_context& ctx);
future<> set_rpc_controller(http_context& ctx, thrift_controller& ctl);
future<> unset_rpc_controller(http_context& ctx);
future<> set_server_authorization_cache(http_context& ctx, sharded<auth::service> &auth_service);
future<> unset_server_authorization_cache(http_context& ctx);
future<> set_server_snapshot(http_context& ctx, sharded<db::snapshot_ctl>& snap_ctl);
future<> unset_server_snapshot(http_context& ctx);
future<> set_server_gossip(http_context& ctx, sharded<gms::gossiper>& g);
future<> set_server_storage_service(http_context& ctx);
future<> set_server_gossip(http_context& ctx);
future<> set_server_load_sstable(http_context& ctx);
future<> set_server_messaging_service(http_context& ctx, sharded<netw::messaging_service>& ms);
future<> unset_server_messaging_service(http_context& ctx);
future<> set_server_storage_proxy(http_context& ctx, sharded<service::storage_service>& ss);
future<> set_server_stream_manager(http_context& ctx, sharded<streaming::stream_manager>& sm);
future<> unset_server_stream_manager(http_context& ctx);
future<> set_hinted_handoff(http_context& ctx, sharded<gms::gossiper>& g);
future<> unset_hinted_handoff(http_context& ctx);
future<> set_server_gossip_settle(http_context& ctx, sharded<gms::gossiper>& g);
future<> set_server_cache(http_context& ctx);
future<> set_server_compaction_manager(http_context& ctx);
future<> set_server_messaging_service(http_context& ctx);
future<> set_server_storage_proxy(http_context& ctx);
future<> set_server_stream_manager(http_context& ctx);
future<> set_server_gossip_settle(http_context& ctx);
future<> set_server_done(http_context& ctx);
future<> set_server_task_manager(http_context& ctx);
future<> set_server_task_manager_test(http_context& ctx, lw_shared_ptr<db::config> cfg);
}

@@ -1,33 +0,0 @@
/*
* Copyright (C) 2022-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include "api/api-doc/authorization_cache.json.hh"
#include "api/authorization_cache.hh"
#include "api/api.hh"
#include "auth/common.hh"
namespace api {
using namespace json;
void set_authorization_cache(http_context& ctx, routes& r, sharded<auth::service> &auth_service) {
httpd::authorization_cache_json::authorization_cache_reset.set(r, [&auth_service] (std::unique_ptr<request> req) -> future<json::json_return_type> {
co_await auth_service.invoke_on_all([] (auth::service& auth) -> future<> {
auth.reset_authorization_cache();
return make_ready_future<>();
});
co_return json_void();
});
}
void unset_authorization_cache(http_context& ctx, routes& r) {
httpd::authorization_cache_json::authorization_cache_reset.unset(r);
}
}

@@ -1,18 +0,0 @@
/*
* Copyright (C) 2022-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include "api.hh"
namespace api {
void set_authorization_cache(http_context& ctx, routes& r, sharded<auth::service> &auth_service);
void unset_authorization_cache(http_context& ctx, routes& r);
}

@@ -1,9 +1,22 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "cache_service.hh"
@@ -164,20 +177,6 @@ void set_cache_service(http_context& ctx, routes& r) {
return make_ready_future<json::json_return_type>(0);
});
cs::get_key_hits_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {
// TBD
// FIXME
// See above
return make_ready_future<json::json_return_type>(meter_to_json(utils::rate_moving_average()));
});
cs::get_key_requests_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {
// TBD
// FIXME
// See above
return make_ready_future<json::json_return_type>(meter_to_json(utils::rate_moving_average()));
});
cs::get_key_size.set(r, [] (std::unique_ptr<request> req) {
// TBD
// FIXME
@@ -195,64 +194,42 @@ void set_cache_service(http_context& ctx, routes& r) {
});
cs::get_row_capacity.set(r, [&ctx] (std::unique_ptr<request> req) {
return ctx.db.map_reduce0([](replica::database& db) -> uint64_t {
return db.row_cache_tracker().region().occupancy().used_space();
}, uint64_t(0), std::plus<uint64_t>()).then([](const int64_t& res) {
return make_ready_future<json::json_return_type>(res);
});
return map_reduce_cf(ctx, 0, [](const column_family& cf) {
return cf.get_row_cache().get_cache_tracker().region().occupancy().used_space();
}, std::plus<uint64_t>());
});
cs::get_row_hits.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, uint64_t(0), [](const replica::column_family& cf) {
return cf.get_row_cache().stats().hits.count();
}, std::plus<uint64_t>());
return map_reduce_cf(ctx, 0, [](const column_family& cf) {
return cf.get_row_cache().stats().hits;
}, std::plus<int64_t>());
});
cs::get_row_requests.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, uint64_t(0), [](const replica::column_family& cf) {
return cf.get_row_cache().stats().hits.count() + cf.get_row_cache().stats().misses.count();
}, std::plus<uint64_t>());
return map_reduce_cf(ctx, 0, [](const column_family& cf) {
return cf.get_row_cache().stats().hits + cf.get_row_cache().stats().misses;
}, std::plus<int64_t>());
});
cs::get_row_hit_rate.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, ratio_holder(), [](const replica::column_family& cf) {
return ratio_holder(cf.get_row_cache().stats().hits.count() + cf.get_row_cache().stats().misses.count(),
cf.get_row_cache().stats().hits.count());
return map_reduce_cf(ctx, ratio_holder(), [](const column_family& cf) {
return ratio_holder(cf.get_row_cache().stats().hits + cf.get_row_cache().stats().misses,
cf.get_row_cache().stats().hits);
}, std::plus<ratio_holder>());
});
cs::get_row_hits_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf_raw(ctx, utils::rate_moving_average(), [](const replica::column_family& cf) {
return cf.get_row_cache().stats().hits.rate();
}, std::plus<utils::rate_moving_average>()).then([](const utils::rate_moving_average& m) {
return make_ready_future<json::json_return_type>(meter_to_json(m));
});
});
cs::get_row_requests_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf_raw(ctx, utils::rate_moving_average(), [](const replica::column_family& cf) {
return cf.get_row_cache().stats().hits.rate() + cf.get_row_cache().stats().misses.rate();
}, std::plus<utils::rate_moving_average>()).then([](const utils::rate_moving_average& m) {
return make_ready_future<json::json_return_type>(meter_to_json(m));
});
});
cs::get_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {
// In origin row size is the weighted size.
// We currently do not support weights, so we use num entries instead
return ctx.db.map_reduce0([](replica::database& db) -> uint64_t {
return db.row_cache_tracker().partitions();
}, uint64_t(0), std::plus<uint64_t>()).then([](const int64_t& res) {
return make_ready_future<json::json_return_type>(res);
});
return map_reduce_cf(ctx, 0, [](const column_family& cf) {
return cf.get_row_cache().num_entries();
}, std::plus<uint64_t>());
});
cs::get_row_entries.set(r, [&ctx] (std::unique_ptr<request> req) {
return ctx.db.map_reduce0([](replica::database& db) -> uint64_t {
return db.row_cache_tracker().partitions();
}, uint64_t(0), std::plus<uint64_t>()).then([](const int64_t& res) {
return make_ready_future<json::json_return_type>(res);
});
return map_reduce_cf(ctx, 0, [](const column_family& cf) {
return cf.get_row_cache().num_entries();
}, std::plus<uint64_t>());
});
cs::get_counter_capacity.set(r, [] (std::unique_ptr<request> req) {
@@ -287,20 +264,6 @@ void set_cache_service(http_context& ctx, routes& r) {
return make_ready_future<json::json_return_type>(0);
});
cs::get_counter_hits_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {
// TBD
// FIXME
// See above
return make_ready_future<json::json_return_type>(meter_to_json(utils::rate_moving_average()));
});
cs::get_counter_requests_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {
// TBD
// FIXME
// See above
return make_ready_future<json::json_return_type>(meter_to_json(utils::rate_moving_average()));
});
cs::get_counter_size.set(r, [] (std::unique_ptr<request> req) {
// TBD
// FIXME

@@ -1,9 +1,22 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once

@@ -1,25 +1,34 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "collectd.hh"
#include "api/api-doc/collectd.json.hh"
#include <seastar/core/scollectd.hh>
#include <seastar/core/scollectd_api.hh>
#include "core/scollectd.hh"
#include "core/scollectd_api.hh"
#include "endian.h"
#include <boost/range/irange.hpp>
#include <regex>
namespace api {
using namespace scollectd;
using namespace httpd;
using namespace json;
namespace cd = httpd::collectd_json;
static auto transformer(const std::vector<collectd_value>& values) {
@@ -27,34 +36,23 @@ static auto transformer(const std::vector<collectd_value>& values) {
for (auto v: values) {
switch (v._type) {
case scollectd::data_type::GAUGE:
collected_value.values.push(v.d());
collected_value.values.push(v.u._d);
break;
case scollectd::data_type::COUNTER:
collected_value.values.push(v.ui());
break;
case scollectd::data_type::REAL_COUNTER:
collected_value.values.push(v.d());
case scollectd::data_type::DERIVE:
collected_value.values.push(v.u._i);
break;
default:
collected_value.values.push(v.ui());
collected_value.values.push(v.u._ui);
break;
}
}
return collected_value;
}
static const char* str_to_regex(const sstring& v) {
if (v != "") {
return v.c_str();
}
return ".*";
}
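`str_to_regex` above maps an empty query parameter to the pattern `.*`, so an omitted filter matches every id. The same rule can be sketched with the standard library alone. `matches_filter` is a hypothetical helper, not part of the collectd API.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: an empty filter matches everything, otherwise the
// filter is treated as a regular expression that must match the whole value.
inline bool matches_filter(const std::string& value, const std::string& filter) {
    const std::regex re(filter.empty() ? ".*" : filter);
    return std::regex_match(value, re);
}
```

Because `std::regex_match` requires the entire string to match, a filter such as `cpu` does not match `cpu-0`; `cpu.*` would.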
void set_collectd(http_context& ctx, routes& r) {
cd::get_collectd.set(r, [&ctx](std::unique_ptr<request> req) {
auto id = ::make_shared<scollectd::type_instance_id>(req->param["pluginid"],
auto id = make_shared<scollectd::type_instance_id>(req->param["pluginid"],
req->get_query_param("instance"), req->get_query_param("type"),
req->get_query_param("type_instance"));
@@ -74,7 +72,7 @@ void set_collectd(http_context& ctx, routes& r) {
});
cd::get_collectd_items.set(r, [](const_req req) {
std::vector<cd::collectd_metric_status> res;
std::vector<cd::type_instance_id> res;
auto ids = scollectd::get_collectd_ids();
for (auto i: ids) {
cd::type_instance_id id;
@@ -82,44 +80,10 @@ void set_collectd(http_context& ctx, routes& r) {
id.plugin_instance = i.plugin_instance();
id.type = i.type();
id.type_instance = i.type_instance();
cd::collectd_metric_status it;
it.id = id;
it.enable = scollectd::is_enabled(i);
res.push_back(it);
res.push_back(id);
}
return res;
});
cd::enable_collectd.set(r, [](std::unique_ptr<request> req) -> future<json::json_return_type> {
std::regex plugin(req->param["pluginid"].c_str());
std::regex instance(str_to_regex(req->get_query_param("instance")));
std::regex type(str_to_regex(req->get_query_param("type")));
std::regex type_instance(str_to_regex(req->get_query_param("type_instance")));
bool enable = strcasecmp(req->get_query_param("enable").c_str(), "true") == 0;
return smp::invoke_on_all([enable, plugin, instance, type, type_instance]() {
for (auto id: scollectd::get_collectd_ids()) {
if (std::regex_match(std::string(id.plugin()), plugin) &&
std::regex_match(std::string(id.plugin_instance()), instance) &&
std::regex_match(std::string(id.type()), type) &&
std::regex_match(std::string(id.type_instance()), type_instance)) {
scollectd::enable(id, enable);
}
}
}).then([] {
return json::json_return_type(json_void());
});
});
cd::enable_all_collectd.set(r, [](std::unique_ptr<request> req) -> future<json::json_return_type> {
bool enable = strcasecmp(req->get_query_param("enable").c_str(), "true") == 0;
return smp::invoke_on_all([enable] {
for (auto id: scollectd::get_collectd_ids()) {
scollectd::enable(id, enable);
}
}).then([] {
return json::json_return_type(json_void());
});
});
}
}

@@ -1,9 +1,22 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once

File diff suppressed because it is too large

@@ -1,47 +1,45 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "api.hh"
#include "api/api-doc/column_family.json.hh"
#include "replica/database.hh"
#include <seastar/core/future-util.hh>
#include <any>
#include "database.hh"
namespace api {
void set_column_family(http_context& ctx, routes& r);
const table_id& get_uuid(const sstring& name, const replica::database& db);
future<> foreach_column_family(http_context& ctx, const sstring& name, std::function<void(replica::column_family&)> f);
template<class Mapper, class I, class Reducer>
future<I> map_reduce_cf_raw(http_context& ctx, const sstring& name, I init,
Mapper mapper, Reducer reducer) {
auto uuid = get_uuid(name, ctx.db.local());
using mapper_type = std::function<std::unique_ptr<std::any>(replica::database&)>;
using reducer_type = std::function<std::unique_ptr<std::any>(std::unique_ptr<std::any>, std::unique_ptr<std::any>)>;
return ctx.db.map_reduce0(mapper_type([mapper, uuid](replica::database& db) {
return std::make_unique<std::any>(I(mapper(db.find_column_family(uuid))));
}), std::make_unique<std::any>(std::move(init)), reducer_type([reducer = std::move(reducer)] (std::unique_ptr<std::any> a, std::unique_ptr<std::any> b) mutable {
return std::make_unique<std::any>(I(reducer(std::any_cast<I>(std::move(*a)), std::any_cast<I>(std::move(*b)))));
})).then([] (std::unique_ptr<std::any> r) {
return std::any_cast<I>(std::move(*r));
});
}
const utils::UUID& get_uuid(const sstring& name, const database& db);
future<> foreach_column_family(http_context& ctx, const sstring& name, std::function<void(column_family&)> f);
template<class Mapper, class I, class Reducer>
future<json::json_return_type> map_reduce_cf(http_context& ctx, const sstring& name, I init,
Mapper mapper, Reducer reducer) {
return map_reduce_cf_raw(ctx, name, init, mapper, reducer).then([](const I& res) {
auto uuid = get_uuid(name, ctx.db.local());
return ctx.db.map_reduce0([mapper, uuid](database& db) {
return mapper(db.find_column_family(uuid));
}, init, reducer).then([](const I& res) {
return make_ready_future<json::json_return_type>(res);
});
}
@@ -49,61 +47,33 @@ future<json::json_return_type> map_reduce_cf(http_context& ctx, const sstring& n
template<class Mapper, class I, class Reducer, class Result>
future<json::json_return_type> map_reduce_cf(http_context& ctx, const sstring& name, I init,
Mapper mapper, Reducer reducer, Result result) {
return map_reduce_cf_raw(ctx, name, init, mapper, reducer).then([result](const I& res) mutable {
auto uuid = get_uuid(name, ctx.db.local());
return ctx.db.map_reduce0([mapper, uuid](database& db) {
return mapper(db.find_column_family(uuid));
}, init, reducer).then([result](const I& res) mutable {
result = res;
return make_ready_future<json::json_return_type>(result);
});
}
future<json::json_return_type> map_reduce_cf_time_histogram(http_context& ctx, const sstring& name, std::function<utils::time_estimated_histogram(const replica::column_family&)> f);
struct map_reduce_column_families_locally {
std::any init;
std::function<std::unique_ptr<std::any>(replica::column_family&)> mapper;
std::function<std::unique_ptr<std::any>(std::unique_ptr<std::any>, std::unique_ptr<std::any>)> reducer;
future<std::unique_ptr<std::any>> operator()(replica::database& db) const {
auto res = seastar::make_lw_shared<std::unique_ptr<std::any>>(std::make_unique<std::any>(init));
return do_for_each(db.get_column_families(), [res, this](const std::pair<table_id, seastar::lw_shared_ptr<replica::table>>& i) {
*res = reducer(std::move(*res), mapper(*i.second.get()));
}).then([res] {
return std::move(*res);
});
}
};
template<class Mapper, class I, class Reducer>
future<I> map_reduce_cf_raw(http_context& ctx, I init,
Mapper mapper, Reducer reducer) {
using mapper_type = std::function<std::unique_ptr<std::any>(replica::column_family&)>;
using reducer_type = std::function<std::unique_ptr<std::any>(std::unique_ptr<std::any>, std::unique_ptr<std::any>)>;
auto wrapped_mapper = mapper_type([mapper = std::move(mapper)] (replica::column_family& cf) mutable {
return std::make_unique<std::any>(I(mapper(cf)));
});
auto wrapped_reducer = reducer_type([reducer = std::move(reducer)] (std::unique_ptr<std::any> a, std::unique_ptr<std::any> b) mutable {
return std::make_unique<std::any>(I(reducer(std::any_cast<I>(std::move(*a)), std::any_cast<I>(std::move(*b)))));
});
return ctx.db.map_reduce0(map_reduce_column_families_locally{init,
std::move(wrapped_mapper), wrapped_reducer}, std::make_unique<std::any>(init), wrapped_reducer).then([] (std::unique_ptr<std::any> res) {
return std::any_cast<I>(std::move(*res));
});
}
template<class Mapper, class I, class Reducer>
future<json::json_return_type> map_reduce_cf(http_context& ctx, I init,
Mapper mapper, Reducer reducer) {
return map_reduce_cf_raw(ctx, init, mapper, reducer).then([](const I& res) {
return ctx.db.map_reduce0([mapper, init, reducer](database& db) {
auto res = init;
for (auto i : db.get_column_families()) {
res = reducer(res, mapper(*i.second.get()));
}
return res;
}, init, reducer).then([](const I& res) {
return make_ready_future<json::json_return_type>(res);
});
}
future<json::json_return_type> get_cf_stats(http_context& ctx, const sstring& name,
int64_t replica::column_family_stats::*f);
int64_t column_family::stats::*f);
future<json::json_return_type> get_cf_stats(http_context& ctx,
int64_t replica::column_family_stats::*f);
std::tuple<sstring, sstring> parse_fully_qualified_cf_name(sstring name);
int64_t column_family::stats::*f);
}

@@ -1,24 +1,36 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "commitlog.hh"
#include "db/commitlog/commitlog.hh"
#include <db/commitlog/commitlog.hh>
#include "api/api-doc/commitlog.json.hh"
#include "replica/database.hh"
#include <vector>
namespace api {
template<typename T>
static auto acquire_cl_metric(http_context& ctx, std::function<T (db::commitlog*)> func) {
typedef T ret_type;
template<typename Func>
static auto acquire_cl_metric(http_context& ctx, Func&& func) {
typedef std::result_of_t<Func(db::commitlog *)> ret_type;
return ctx.db.map_reduce0([func = std::move(func)](replica::database& db) {
return ctx.db.map_reduce0([func = std::forward<Func>(func)](database& db) {
if (db.commitlog() == nullptr) {
return make_ready_future<ret_type>();
}
@@ -34,7 +46,7 @@ void set_commitlog(http_context& ctx, routes& r) {
auto res = make_shared<std::vector<sstring>>();
return ctx.db.map_reduce([res](std::vector<sstring> names) {
res->insert(res->end(), names.begin(), names.end());
}, [](replica::database& db) {
}, [](database& db) {
if (db.commitlog() == nullptr) {
return make_ready_future<std::vector<sstring>>(std::vector<sstring>());
}
@@ -51,15 +63,15 @@ void set_commitlog(http_context& ctx, routes& r) {
});
httpd::commitlog_json::get_completed_tasks.set(r, [&ctx](std::unique_ptr<request> req) {
return acquire_cl_metric<uint64_t>(ctx, std::bind(&db::commitlog::get_completed_tasks, std::placeholders::_1));
return acquire_cl_metric(ctx, std::bind(&db::commitlog::get_completed_tasks, std::placeholders::_1));
});
httpd::commitlog_json::get_pending_tasks.set(r, [&ctx](std::unique_ptr<request> req) {
return acquire_cl_metric<uint64_t>(ctx, std::bind(&db::commitlog::get_pending_tasks, std::placeholders::_1));
return acquire_cl_metric(ctx, std::bind(&db::commitlog::get_pending_tasks, std::placeholders::_1));
});
httpd::commitlog_json::get_total_commit_log_size.set(r, [&ctx](std::unique_ptr<request> req) {
return acquire_cl_metric<uint64_t>(ctx, std::bind(&db::commitlog::get_total_size, std::placeholders::_1));
return acquire_cl_metric(ctx, std::bind(&db::commitlog::get_total_size, std::placeholders::_1));
});
}

@@ -1,9 +1,22 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once

@@ -1,62 +1,57 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include <seastar/core/coroutine.hh>
#include "compaction_manager.hh"
#include "compaction/compaction_manager.hh"
#include "api/api-doc/compaction_manager.json.hh"
#include "db/system_keyspace.hh"
#include "column_family.hh"
#include "unimplemented.hh"
#include "storage_service.hh"
#include <utility>
namespace api {
using namespace scollectd;
namespace cm = httpd::compaction_manager_json;
using namespace json;
static future<json::json_return_type> get_cm_stats(http_context& ctx,
int64_t compaction_manager::stats::*f) {
return ctx.db.map_reduce0([f](replica::database& db) {
return ctx.db.map_reduce0([f](database& db) {
return db.get_compaction_manager().get_stats().*f;
}, int64_t(0), std::plus<int64_t>()).then([](const int64_t& res) {
return make_ready_future<json::json_return_type>(res);
});
}
static std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash> sum_pending_tasks(std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>&& a,
const std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& b) {
for (auto&& i : b) {
if (i.second) {
a[i.first] += i.second;
}
}
return std::move(a);
}
void set_compaction_manager(http_context& ctx, routes& r) {
cm::get_compactions.set(r, [&ctx] (std::unique_ptr<request> req) {
return ctx.db.map_reduce0([](replica::database& db) {
return ctx.db.map_reduce0([](database& db) {
std::vector<cm::summary> summaries;
const compaction_manager& cm = db.get_compaction_manager();
for (const auto& c : cm.get_compactions()) {
cm::summary s;
s.id = c.compaction_uuid.to_sstring();
s.ks = c.ks_name;
s.cf = c.cf_name;
s.ks = c->ks;
s.cf = c->cf;
s.unit = "keys";
s.task_type = sstables::compaction_name(c.type);
s.completed = c.total_keys_written;
s.total = c.total_partitions;
s.task_type = sstables::compaction_name(c->type);
s.completed = c->total_keys_written;
s.total = c->total_partitions;
summaries.push_back(std::move(s));
}
return summaries;
@@ -65,32 +60,6 @@ void set_compaction_manager(http_context& ctx, routes& r) {
});
});
cm::get_pending_tasks_by_table.set(r, [&ctx] (std::unique_ptr<request> req) {
return ctx.db.map_reduce0([&ctx](replica::database& db) {
return do_with(std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>(), [&ctx, &db](std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& tasks) {
return do_for_each(db.get_column_families(), [&tasks](const std::pair<table_id, seastar::lw_shared_ptr<replica::table>>& i) {
replica::table& cf = *i.second.get();
tasks[std::make_pair(cf.schema()->ks_name(), cf.schema()->cf_name())] = cf.get_compaction_strategy().estimated_pending_compactions(cf.as_table_state());
return make_ready_future<>();
}).then([&tasks] {
return std::move(tasks);
});
});
}, std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>(), sum_pending_tasks).then(
[](const std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& task_map) {
std::vector<cm::pending_compaction> res;
res.reserve(task_map.size());
for (auto i : task_map) {
cm::pending_compaction task;
task.ks = i.first.first;
task.cf = i.first.second;
task.task = i.second;
res.emplace_back(std::move(task));
}
return make_ready_future<json::json_return_type>(res);
});
});
cm::force_user_defined_compaction.set(r, [] (std::unique_ptr<request> req) {
//TBD
// FIXME
@@ -100,35 +69,16 @@ void set_compaction_manager(http_context& ctx, routes& r) {
cm::stop_compaction.set(r, [&ctx] (std::unique_ptr<request> req) {
auto type = req->get_query_param("type");
return ctx.db.invoke_on_all([type] (replica::database& db) {
return ctx.db.invoke_on_all([type] (database& db) {
auto& cm = db.get_compaction_manager();
return cm.stop_compaction(type);
cm.stop_compaction(type);
}).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
});
cm::stop_keyspace_compaction.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {
auto ks_name = validate_keyspace(ctx, req->param);
auto table_names = parse_tables(ks_name, ctx, req->query_parameters, "tables");
if (table_names.empty()) {
table_names = map_keys(ctx.db.local().find_keyspace(ks_name).metadata().get()->cf_meta_data());
}
auto type = req->get_query_param("type");
co_await ctx.db.invoke_on_all([&ks_name, &table_names, type] (replica::database& db) {
auto& cm = db.get_compaction_manager();
return parallel_for_each(table_names, [&db, &cm, &ks_name, type] (sstring& table_name) {
auto& t = db.find_column_family(ks_name, table_name);
return cm.stop_compaction(type, &t.as_table_state());
});
});
co_return json_void();
});
cm::get_pending_tasks.set(r, [&ctx] (std::unique_ptr<request> req) {
return map_reduce_cf(ctx, int64_t(0), [](replica::column_family& cf) {
return cf.get_compaction_strategy().estimated_pending_compactions(cf.as_table_state());
}, std::plus<int64_t>());
return get_cm_stats(ctx, &compaction_manager::stats::pending_tasks);
});
cm::get_completed_tasks.set(r, [&ctx] (std::unique_ptr<request> req) {
@@ -150,37 +100,29 @@ void set_compaction_manager(http_context& ctx, routes& r) {
});
cm::get_compaction_history.set(r, [] (std::unique_ptr<request> req) {
std::function<future<>(output_stream<char>&&)> f = [](output_stream<char>&& s) {
return do_with(output_stream<char>(std::move(s)), true, [] (output_stream<char>& s, bool& first){
return s.write("[").then([&s, &first] {
return db::system_keyspace::get_compaction_history([&s, &first](const db::system_keyspace::compaction_history_entry& entry) mutable {
cm::history h;
h.id = entry.id.to_sstring();
h.ks = std::move(entry.ks);
h.cf = std::move(entry.cf);
h.compacted_at = entry.compacted_at;
h.bytes_in = entry.bytes_in;
h.bytes_out = entry.bytes_out;
for (auto it : entry.rows_merged) {
httpd::compaction_manager_json::row_merged e;
e.key = it.first;
e.value = it.second;
h.rows_merged.push(std::move(e));
}
auto fut = first ? make_ready_future<>() : s.write(", ");
first = false;
return fut.then([&s, h = std::move(h)] {
return formatter::write(s, h);
});
}).then([&s] {
return s.write("]").then([&s] {
return s.close();
});
});
});
});
};
return make_ready_future<json::json_return_type>(std::move(f));
return db::system_keyspace::get_compaction_history().then([] (std::vector<db::system_keyspace::compaction_history_entry> history) {
std::vector<cm::history> res;
res.reserve(history.size());
for (auto& entry : history) {
cm::history h;
h.id = entry.id.to_sstring();
h.ks = std::move(entry.ks);
h.cf = std::move(entry.cf);
h.compacted_at = entry.compacted_at;
h.bytes_in = entry.bytes_in;
h.bytes_out = entry.bytes_out;
for (auto it : entry.rows_merged) {
httpd::compaction_manager_json::row_merged e;
e.key = it.first;
e.value = it.second;
h.rows_merged.push(std::move(e));
}
res.push_back(std::move(h));
}
return make_ready_future<json::json_return_type>(res);
});
});
cm::get_compaction_info.set(r, [] (std::unique_ptr<request> req) {

@@ -1,9 +1,22 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once

@@ -1,105 +0,0 @@
/*
* Copyright 2018-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include "api/config.hh"
#include "api/api-doc/config.json.hh"
#include "db/config.hh"
#include <sstream>
#include <boost/algorithm/string/replace.hpp>
namespace api {
template<class T>
json::json_return_type get_json_return_type(const T& val) {
return json::json_return_type(val);
}
/*
* As commented on db::seed_provider_type is not used
* and probably never will.
*
* Just in case, we will return its name
*/
template<>
json::json_return_type get_json_return_type(const db::seed_provider_type& val) {
return json::json_return_type(val.class_name);
}
std::string_view format_type(std::string_view type) {
if (type == "int") {
return "integer";
}
return type;
}
future<> get_config_swagger_entry(std::string_view name, const std::string& description, std::string_view type, bool& first, output_stream<char>& os) {
std::stringstream ss;
if (first) {
first=false;
} else {
ss <<',';
};
ss << "\"/config/" << name <<"\": {"
"\"get\": {"
"\"description\": \"" << boost::replace_all_copy(boost::replace_all_copy(boost::replace_all_copy(description,"\n","\\n"),"\"", "''"), "\t", " ") <<"\","
"\"operationId\": \"find_config_"<< name <<"\","
"\"produces\": ["
"\"application/json\""
"],"
"\"tags\": [\"config\"],"
"\"parameters\": ["
"],"
"\"responses\": {"
"\"200\": {"
"\"description\": \"Config value\","
"\"schema\": {"
"\"type\": \"" << format_type(type) << "\""
"}"
"},"
"\"default\": {"
"\"description\": \"unexpected error\","
"\"schema\": {"
"\"$ref\": \"#/definitions/ErrorModel\""
"}"
"}"
"}"
"}"
"}";
return os.write(ss.str());
}
namespace cs = httpd::config_json;
void set_config(std::shared_ptr < api_registry_builder20 > rb, http_context& ctx, routes& r, const db::config& cfg) {
rb->register_function(r, [&cfg] (output_stream<char>& os) {
return do_with(true, [&os, &cfg] (bool& first) {
auto f = make_ready_future();
for (auto&& cfg_ref : cfg.values()) {
auto&& cfg = cfg_ref.get();
f = f.then([&os, &first, &cfg] {
return get_config_swagger_entry(cfg.name(), std::string(cfg.desc()), cfg.type_name(), first, os);
});
}
return f;
});
});
cs::find_config_id.set(r, [&cfg] (const_req r) {
auto id = r.param["id"];
for (auto&& cfg_ref : cfg.values()) {
auto&& cfg = cfg_ref.get();
if (id == cfg.name()) {
return cfg.value_as_json();
}
}
throw bad_param_exception(sstring("No such config entry: ") + id);
});
}
}

@@ -1,17 +0,0 @@
/*
* Copyright (C) 2018-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include "api.hh"
#include <seastar/http/api_docs.hh>
namespace api {
void set_config(std::shared_ptr<api_registry_builder20> rb, http_context& ctx, routes& r, const db::config& cfg);
}

@@ -1,68 +1,42 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "locator/token_metadata.hh"
#include "locator/snitch_base.hh"
#include "locator/production_snitch_base.hh"
#include "endpoint_snitch.hh"
#include "api/api-doc/endpoint_snitch_info.json.hh"
#include "api/api-doc/storage_service.json.hh"
#include "utils/fb_utilities.hh"
namespace api {
void set_endpoint_snitch(http_context& ctx, routes& r, sharded<locator::snitch_ptr>& snitch) {
static auto host_or_broadcast = [](const_req req) {
auto host = req.get_query_param("host");
return host.empty() ? gms::inet_address(utils::fb_utilities::get_broadcast_address()) : gms::inet_address(host);
};
httpd::endpoint_snitch_info_json::get_datacenter.set(r, [&ctx](const_req req) {
auto& topology = ctx.shared_token_metadata.local().get()->get_topology();
auto ep = host_or_broadcast(req);
if (!topology.has_endpoint(ep, locator::topology::pending::yes)) {
// Cannot return error here, nodetool status can race, request
// info about just-left node and not handle it nicely
return sstring(locator::production_snitch_base::default_dc);
}
return topology.get_datacenter(ep);
void set_endpoint_snitch(http_context& ctx, routes& r) {
httpd::endpoint_snitch_info_json::get_datacenter.set(r, [] (const_req req) {
return locator::i_endpoint_snitch::get_local_snitch_ptr()->get_datacenter(req.get_query_param("host"));
});
httpd::endpoint_snitch_info_json::get_rack.set(r, [&ctx](const_req req) {
auto& topology = ctx.shared_token_metadata.local().get()->get_topology();
auto ep = host_or_broadcast(req);
if (!topology.has_endpoint(ep, locator::topology::pending::yes)) {
// Cannot return error here, nodetool status can race, request
// info about just-left node and not handle it nicely
return sstring(locator::production_snitch_base::default_rack);
}
return topology.get_rack(ep);
httpd::endpoint_snitch_info_json::get_rack.set(r, [] (const_req req) {
return locator::i_endpoint_snitch::get_local_snitch_ptr()->get_rack(req.get_query_param("host"));
});
httpd::endpoint_snitch_info_json::get_snitch_name.set(r, [&snitch] (const_req req) {
return snitch.local()->get_name();
httpd::endpoint_snitch_info_json::get_snitch_name.set(r, [] (const_req req) {
return locator::i_endpoint_snitch::get_local_snitch_ptr()->get_name();
});
httpd::storage_service_json::update_snitch.set(r, [&snitch](std::unique_ptr<request> req) {
locator::snitch_config cfg;
cfg.name = req->get_query_param("ep_snitch_class_name");
return locator::i_endpoint_snitch::reset_snitch(snitch, cfg).then([] {
return make_ready_future<json::json_return_type>(json::json_void());
});
});
}
void unset_endpoint_snitch(http_context& ctx, routes& r) {
httpd::endpoint_snitch_info_json::get_datacenter.unset(r);
httpd::endpoint_snitch_info_json::get_rack.unset(r);
httpd::endpoint_snitch_info_json::get_snitch_name.unset(r);
httpd::storage_service_json::update_snitch.unset(r);
}
}

@@ -1,22 +1,30 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "api.hh"
namespace locator {
class snitch_ptr;
}
namespace api {
void set_endpoint_snitch(http_context& ctx, routes& r, sharded<locator::snitch_ptr>&);
void unset_endpoint_snitch(http_context& ctx, routes& r);
void set_endpoint_snitch(http_context& ctx, routes& r);
}

@@ -1,56 +0,0 @@
/*
* Copyright (C) 2020-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#include "api/api-doc/error_injection.json.hh"
#include "api/api.hh"
#include <seastar/http/exception.hh>
#include "log.hh"
#include "utils/error_injection.hh"
#include <seastar/core/future-util.hh>
namespace api {
namespace hf = httpd::error_injection_json;
void set_error_injection(http_context& ctx, routes& r) {
hf::enable_injection.set(r, [](std::unique_ptr<request> req) {
sstring injection = req->param["injection"];
bool one_shot = req->get_query_param("one_shot") == "True";
auto& errinj = utils::get_local_injector();
return errinj.enable_on_all(injection, one_shot).then([] {
return make_ready_future<json::json_return_type>(json::json_void());
});
});
hf::get_enabled_injections_on_all.set(r, [](std::unique_ptr<request> req) {
auto& errinj = utils::get_local_injector();
auto ret = errinj.enabled_injections_on_all();
return make_ready_future<json::json_return_type>(ret);
});
hf::disable_injection.set(r, [](std::unique_ptr<request> req) {
sstring injection = req->param["injection"];
auto& errinj = utils::get_local_injector();
return errinj.disable_on_all(injection).then([] {
return make_ready_future<json::json_return_type>(json::json_void());
});
});
hf::disable_on_all.set(r, [](std::unique_ptr<request> req) {
auto& errinj = utils::get_local_injector();
return errinj.disable_on_all().then([] {
return make_ready_future<json::json_return_type>(json::json_void());
});
});
}
} // namespace api

@@ -1,17 +0,0 @@
/*
* Copyright (C) 2019-present ScyllaDB
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
#pragma once
#include "api.hh"
namespace api {
void set_error_injection(http_context& ctx, routes& r);
}

@@ -1,9 +1,22 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "failure_detector.hh"
@@ -15,10 +28,10 @@ namespace api {
namespace fd = httpd::failure_detector_json;
void set_failure_detector(http_context& ctx, routes& r, gms::gossiper& g) {
fd::get_all_endpoint_states.set(r, [&g](std::unique_ptr<request> req) {
void set_failure_detector(http_context& ctx, routes& r) {
fd::get_all_endpoint_states.set(r, [](std::unique_ptr<request> req) {
std::vector<fd::endpoint_state> res;
for (auto i : g.get_endpoint_states()) {
for (auto i : gms::get_local_gossiper().endpoint_state_map) {
fd::endpoint_state val;
val.addrs = boost::lexical_cast<std::string>(i.first);
val.is_alive = i.second.is_alive();
@@ -39,54 +52,41 @@ void set_failure_detector(http_context& ctx, routes& r, gms::gossiper& g) {
return make_ready_future<json::json_return_type>(res);
});
fd::get_up_endpoint_count.set(r, [&g](std::unique_ptr<request> req) {
int res = g.get_up_endpoint_count();
return make_ready_future<json::json_return_type>(res);
fd::get_up_endpoint_count.set(r, [](std::unique_ptr<request> req) {
return gms::get_up_endpoint_count().then([](int res) {
return make_ready_future<json::json_return_type>(res);
});
});
fd::get_down_endpoint_count.set(r, [&g](std::unique_ptr<request> req) {
int res = g.get_down_endpoint_count();
return make_ready_future<json::json_return_type>(res);
fd::get_down_endpoint_count.set(r, [](std::unique_ptr<request> req) {
return gms::get_down_endpoint_count().then([](int res) {
return make_ready_future<json::json_return_type>(res);
});
});
fd::get_phi_convict_threshold.set(r, [] (std::unique_ptr<request> req) {
return make_ready_future<json::json_return_type>(8);
return gms::get_phi_convict_threshold().then([](double res) {
return make_ready_future<json::json_return_type>(res);
});
});
fd::get_simple_states.set(r, [&g] (std::unique_ptr<request> req) {
std::map<sstring, sstring> nodes_status;
for (auto& entry : g.get_endpoint_states()) {
nodes_status.emplace(entry.first.to_sstring(), entry.second.is_alive() ? "UP" : "DOWN");
}
return make_ready_future<json::json_return_type>(map_to_key_value<fd::mapper>(nodes_status));
fd::get_simple_states.set(r, [] (std::unique_ptr<request> req) {
return gms::get_simple_states().then([](const std::map<sstring, sstring>& map) {
return make_ready_future<json::json_return_type>(map_to_key_value<fd::mapper>(map));
});
});
fd::set_phi_convict_threshold.set(r, [](std::unique_ptr<request> req) {
double phi = atof(req->get_query_param("phi").c_str());
return make_ready_future<json::json_return_type>("");
return gms::set_phi_convict_threshold(phi).then([]() {
return make_ready_future<json::json_return_type>("");
});
});
fd::get_endpoint_state.set(r, [&g] (std::unique_ptr<request> req) {
auto* state = g.get_endpoint_state_for_endpoint_ptr(gms::inet_address(req->param["addr"]));
if (!state) {
return make_ready_future<json::json_return_type>(format("unknown endpoint {}", req->param["addr"]));
}
std::stringstream ss;
g.append_endpoint_state(ss, *state);
return make_ready_future<json::json_return_type>(sstring(ss.str()));
});
fd::get_endpoint_phi_values.set(r, [](std::unique_ptr<request> req) {
std::map<gms::inet_address, gms::arrival_window> map;
std::vector<fd::endpoint_phi_value> res;
auto now = gms::arrival_window::clk::now();
for (auto& p : map) {
fd::endpoint_phi_value val;
val.endpoint = p.first.to_sstring();
val.phi = p.second.phi(now);
res.emplace_back(std::move(val));
}
return make_ready_future<json::json_return_type>(res);
fd::get_endpoint_state.set(r, [](std::unique_ptr<request> req) {
return gms::get_endpoint_state(req->param["addr"]).then([](const sstring& state) {
return make_ready_future<json::json_return_type>(state);
});
});
}

@@ -1,23 +1,30 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "api.hh"
namespace gms {
class gossiper;
}
namespace api {
void set_failure_detector(http_context& ctx, routes& r, gms::gossiper& g);
void set_failure_detector(http_context& ctx, routes& r);
}

@@ -1,62 +1,68 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "gossiper.hh"
#include "api/api-doc/gossiper.json.hh"
#include "gms/gossiper.hh"
#include <gms/gossiper.hh>
namespace api {
using namespace json;
void set_gossiper(http_context& ctx, routes& r, gms::gossiper& g) {
httpd::gossiper_json::get_down_endpoint.set(r, [&g] (const_req req) {
auto res = g.get_unreachable_members();
void set_gossiper(http_context& ctx, routes& r) {
httpd::gossiper_json::get_down_endpoint.set(r, [] (const_req req) {
auto res = gms::get_local_gossiper().get_unreachable_members();
return container_to_vec(res);
});
httpd::gossiper_json::get_live_endpoint.set(r, [&g] (const_req req) {
auto res = g.get_live_members();
httpd::gossiper_json::get_live_endpoint.set(r, [] (const_req req) {
auto res = gms::get_local_gossiper().get_live_members();
return container_to_vec(res);
});
httpd::gossiper_json::get_endpoint_downtime.set(r, [&g] (const_req req) {
httpd::gossiper_json::get_endpoint_downtime.set(r, [] (const_req req) {
gms::inet_address ep(req.param["addr"]);
return g.get_endpoint_downtime(ep);
return gms::get_local_gossiper().get_endpoint_downtime(ep);
});
httpd::gossiper_json::get_current_generation_number.set(r, [&g] (std::unique_ptr<request> req) {
httpd::gossiper_json::get_current_generation_number.set(r, [] (std::unique_ptr<request> req) {
gms::inet_address ep(req->param["addr"]);
return g.get_current_generation_number(ep).then([] (int res) {
return gms::get_local_gossiper().get_current_generation_number(ep).then([] (int res) {
return make_ready_future<json::json_return_type>(res);
});
});
httpd::gossiper_json::get_current_heart_beat_version.set(r, [&g] (std::unique_ptr<request> req) {
httpd::gossiper_json::get_current_heart_beat_version.set(r, [] (std::unique_ptr<request> req) {
gms::inet_address ep(req->param["addr"]);
return g.get_current_heart_beat_version(ep).then([] (int res) {
return gms::get_local_gossiper().get_current_heart_beat_version(ep).then([] (int res) {
return make_ready_future<json::json_return_type>(res);
});
});
httpd::gossiper_json::assassinate_endpoint.set(r, [&g](std::unique_ptr<request> req) {
httpd::gossiper_json::assassinate_endpoint.set(r, [](std::unique_ptr<request> req) {
if (req->get_query_param("unsafe") != "True") {
return g.assassinate_endpoint(req->param["addr"]).then([] {
return gms::get_local_gossiper().assassinate_endpoint(req->param["addr"]).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
}
return g.unsafe_assassinate_endpoint(req->param["addr"]).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
});
httpd::gossiper_json::force_remove_endpoint.set(r, [&g](std::unique_ptr<request> req) {
gms::inet_address ep(req->param["addr"]);
return g.force_remove_endpoint(ep).then([] {
return gms::get_local_gossiper().unsafe_assassinate_endpoint(req->param["addr"]).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
});

@@ -1,23 +1,30 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "api.hh"
namespace gms {
class gossiper;
}
namespace api {
void set_gossiper(http_context& ctx, routes& r, gms::gossiper& g);
void set_gossiper(http_context& ctx, routes& r);
}

@@ -1,98 +1,34 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include <algorithm>
#include <vector>
#include "hinted_handoff.hh"
#include "api/api-doc/hinted_handoff.json.hh"
#include "gms/inet_address.hh"
#include "gms/gossiper.hh"
#include "service/storage_proxy.hh"
namespace api {
using namespace scollectd;
using namespace json;
namespace hh = httpd::hinted_handoff_json;
void set_hinted_handoff(http_context& ctx, routes& r, gms::gossiper& g) {
hh::create_hints_sync_point.set(r, [&ctx, &g] (std::unique_ptr<request> req) -> future<json::json_return_type> {
auto parse_hosts_list = [&g] (sstring arg) {
std::vector<sstring> hosts_str = split(arg, ",");
std::vector<gms::inet_address> hosts;
hosts.reserve(hosts_str.size());
if (hosts_str.empty()) {
// No target_hosts specified means that we should wait for hints for all nodes to be sent
const auto members_set = g.get_live_members();
std::copy(members_set.begin(), members_set.end(), std::back_inserter(hosts));
} else {
for (const auto& host_str : hosts_str) {
try {
gms::inet_address host;
host = gms::inet_address(host_str);
hosts.push_back(host);
} catch (std::exception& e) {
throw httpd::bad_param_exception(format("Failed to parse host address {}: {}", host_str, e.what()));
}
}
}
return hosts;
};
std::vector<gms::inet_address> target_hosts = parse_hosts_list(req->get_query_param("target_hosts"));
return ctx.sp.local().create_hint_sync_point(std::move(target_hosts)).then([] (db::hints::sync_point sync_point) {
return json::json_return_type(sync_point.encode());
});
});
hh::get_hints_sync_point.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {
db::hints::sync_point sync_point;
const sstring encoded = req->get_query_param("id");
try {
sync_point = db::hints::sync_point::decode(encoded);
} catch (std::exception& e) {
throw httpd::bad_param_exception(format("Failed to parse the sync point description {}: {}", encoded, e.what()));
}
lowres_clock::time_point deadline;
const sstring timeout_str = req->get_query_param("timeout");
try {
deadline = [&] {
if (timeout_str.empty()) {
// Empty string - don't wait at all, just check the status
return lowres_clock::time_point::min();
} else {
const auto timeout = std::stoll(timeout_str);
if (timeout >= 0) {
// Wait until the point is reached, or until `timeout` seconds elapse
return lowres_clock::now() + std::chrono::seconds(timeout);
} else {
// Negative value indicates infinite timeout
return lowres_clock::time_point::max();
}
}
} ();
} catch (std::exception& e) {
throw httpd::bad_param_exception(format("Failed to parse the timeout parameter {}: {}", timeout_str, e.what()));
}
using return_type = hh::ns_get_hints_sync_point::get_hints_sync_point_return_type;
using return_type_wrapper = hh::ns_get_hints_sync_point::return_type_wrapper;
return ctx.sp.local().wait_for_hint_sync_point(std::move(sync_point), deadline).then([] {
return json::json_return_type(return_type_wrapper(return_type::DONE));
}).handle_exception_type([] (const timed_out_error&) {
return json::json_return_type(return_type_wrapper(return_type::IN_PROGRESS));
});
});
void set_hinted_handoff(http_context& ctx, routes& r) {
hh::list_endpoints_pending_hints.set(r, [] (std::unique_ptr<request> req) {
//TBD
unimplemented();
@@ -136,16 +72,5 @@ void set_hinted_handoff(http_context& ctx, routes& r, gms::gossiper& g) {
});
}
void unset_hinted_handoff(http_context& ctx, routes& r) {
hh::create_hints_sync_point.unset(r);
hh::get_hints_sync_point.unset(r);
hh::list_endpoints_pending_hints.unset(r);
hh::truncate_all_hints.unset(r);
hh::schedule_hint_delivery.unset(r);
hh::pause_hints_delivery.unset(r);
hh::get_create_hint_count.unset(r);
hh::get_not_stored_hints_count.unset(r);
}
}
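
The get_hints_sync_point handler in the hunk above turns the `timeout` query parameter into a deadline in three cases: empty means "only check the status", a non-negative value means "wait that many seconds", and a negative value means "wait indefinitely". A minimal standalone sketch of that mapping follows. It assumes std::chrono::steady_clock as a stand-in for Seastar's lowres_clock, and parse_deadline with its `now` parameter is a hypothetical helper introduced here for illustration, not a function from the Scylla tree.

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>
#include <string>

// Assumption: a standard steady clock stands in for seastar's lowres_clock.
using clock_type = std::chrono::steady_clock;

// Hypothetical helper mirroring the lambda in the handler.
// `now` is passed in so the result is deterministic in tests.
clock_type::time_point parse_deadline(const std::string& timeout_str,
                                      clock_type::time_point now) {
    if (timeout_str.empty()) {
        // Empty string: do not wait at all, just check the status.
        return clock_type::time_point::min();
    }
    // std::stoll throws std::invalid_argument or std::out_of_range on bad input.
    const long long timeout = std::stoll(timeout_str);
    if (timeout >= 0) {
        // Wait until the point is reached, or until `timeout` seconds elapse.
        return now + std::chrono::seconds(timeout);
    }
    // Negative value: infinite timeout.
    return clock_type::time_point::max();
}
```

In the handler itself the std::stoll exceptions are caught and rethrown as httpd::bad_param_exception, so a malformed `timeout` is reported to the client as a bad parameter rather than an internal error.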

@@ -1,24 +1,30 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "api.hh"
namespace gms {
class gossiper;
}
namespace api {
void set_hinted_handoff(http_context& ctx, routes& r, gms::gossiper& g);
void unset_hinted_handoff(http_context& ctx, routes& r);
void set_hinted_handoff(http_context& ctx, routes& r);
}

@@ -1,28 +1,40 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "api/api-doc/lsa.json.hh"
#include "api/lsa.hh"
#include "api/api.hh"
#include <seastar/http/exception.hh>
#include "http/exception.hh"
#include "utils/logalloc.hh"
#include "log.hh"
#include "replica/database.hh"
namespace api {
static logging::logger alogger("lsa-api");
static logging::logger logger("lsa-api");
void set_lsa(http_context& ctx, routes& r) {
httpd::lsa_json::lsa_compact.set(r, [&ctx](std::unique_ptr<request> req) {
alogger.info("Triggering compaction");
return ctx.db.invoke_on_all([] (replica::database&) {
logger.info("Triggering compaction");
return ctx.db.invoke_on_all([] (database&) {
logalloc::shard_tracker().reclaim(std::numeric_limits<size_t>::max());
}).then([] {
return json::json_return_type(json::json_void());

@@ -1,9 +1,22 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once

@@ -1,20 +1,33 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "messaging_service.hh"
#include "message/messaging_service.hh"
#include <seastar/rpc/rpc_types.hh>
#include "rpc/rpc_types.hh"
#include "api/api-doc/messaging_service.json.hh"
#include <iostream>
#include <sstream>
using namespace httpd::messaging_service_json;
using namespace netw;
using namespace net;
namespace api {
@@ -40,8 +53,8 @@ std::vector<message_counter> map_to_message_counters(
* according to a function that it gets as a parameter.
*
*/
future_json_function get_client_getter(sharded<netw::messaging_service>& ms, std::function<uint64_t(const shard_info&)> f) {
return [&ms, f](std::unique_ptr<request> req) {
future_json_function get_client_getter(std::function<uint64_t(const shard_info&)> f) {
return [f](std::unique_ptr<request> req) {
using map_type = std::unordered_map<gms::inet_address, uint64_t>;
auto get_shard_map = [f](messaging_service& ms) {
std::unordered_map<gms::inet_address, unsigned long> map;
@@ -50,74 +63,70 @@ future_json_function get_client_getter(sharded<netw::messaging_service>& ms, std
});
return map;
};
return ms.map_reduce0(get_shard_map, map_type(), map_sum<map_type>).
return get_messaging_service().map_reduce0(get_shard_map, map_type(), map_sum<map_type>).
then([](map_type&& map) {
return make_ready_future<json::json_return_type>(map_to_message_counters(map));
});
};
}
future_json_function get_server_getter(sharded<netw::messaging_service>& ms, std::function<uint64_t(const rpc::stats&)> f) {
return [&ms, f](std::unique_ptr<request> req) {
future_json_function get_server_getter(std::function<uint64_t(const rpc::stats&)> f) {
return [f](std::unique_ptr<request> req) {
using map_type = std::unordered_map<gms::inet_address, uint64_t>;
auto get_shard_map = [f](messaging_service& ms) {
std::unordered_map<gms::inet_address, unsigned long> map;
ms.foreach_server_connection_stats([&map, f] (const rpc::client_info& info, const rpc::stats& stats) mutable {
map[gms::inet_address(info.addr.addr())] = f(stats);
map[gms::inet_address(net::ipv4_address(info.addr))] = f(stats);
});
return map;
};
return ms.map_reduce0(get_shard_map, map_type(), map_sum<map_type>).
return get_messaging_service().map_reduce0(get_shard_map, map_type(), map_sum<map_type>).
then([](map_type&& map) {
return make_ready_future<json::json_return_type>(map_to_message_counters(map));
});
};
}
void set_messaging_service(http_context& ctx, routes& r, sharded<netw::messaging_service>& ms) {
get_timeout_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
void set_messaging_service(http_context& ctx, routes& r) {
get_timeout_messages.set(r, get_client_getter([](const shard_info& c) {
return c.get_stats().timeout;
}));
get_sent_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
get_sent_messages.set(r, get_client_getter([](const shard_info& c) {
return c.get_stats().sent_messages;
}));
get_replied_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
return c.get_stats().replied;
}));
get_dropped_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
get_dropped_messages.set(r, get_client_getter([](const shard_info& c) {
// We don't have the same drop message mechanism
// as origin has.
// hence we can always return 0
return 0;
}));
get_exception_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
get_exception_messages.set(r, get_client_getter([](const shard_info& c) {
return c.get_stats().exception_received;
}));
get_pending_messages.set(r, get_client_getter(ms, [](const shard_info& c) {
get_pending_messages.set(r, get_client_getter([](const shard_info& c) {
return c.get_stats().pending;
}));
get_respond_pending_messages.set(r, get_server_getter(ms, [](const rpc::stats& c) {
get_respond_pending_messages.set(r, get_server_getter([](const rpc::stats& c) {
return c.pending;
}));
get_respond_completed_messages.set(r, get_server_getter(ms, [](const rpc::stats& c) {
get_respond_completed_messages.set(r, get_server_getter([](const rpc::stats& c) {
return c.sent_messages;
}));
get_version.set(r, [&ms](const_req req) {
return ms.local().get_raw_version(req.get_query_param("addr"));
get_version.set(r, [](const_req req) {
return net::get_local_messaging_service().get_raw_version(req.get_query_param("addr"));
});
get_dropped_messages_by_ver.set(r, [&ms](std::unique_ptr<request> req) {
get_dropped_messages_by_ver.set(r, [](std::unique_ptr<request> req) {
shared_ptr<std::vector<uint64_t>> map = make_shared<std::vector<uint64_t>>(num_verb);
return ms.map_reduce([map](const uint64_t* local_map) mutable {
return net::get_messaging_service().map_reduce([map](const uint64_t* local_map) mutable {
for (auto i = 0; i < num_verb; i++) {
(*map)[i]+= local_map[i];
}
@@ -130,7 +139,7 @@ void set_messaging_service(http_context& ctx, routes& r, sharded<netw::messaging
messaging_verb v = i; // for type safety we use messaging_verb values
auto idx = static_cast<uint32_t>(v);
if (idx >= map->size()) {
throw std::runtime_error(format("verb index out of bounds: {:d}, map size: {:d}", idx, map->size()));
throw std::runtime_error(sprint("verb index out of bounds: %lu, map size: %lu", idx, map->size()));
}
if ((*map)[idx] > 0) {
c.count = (*map)[idx];
@@ -142,19 +151,5 @@ void set_messaging_service(http_context& ctx, routes& r, sharded<netw::messaging
});
});
}
void unset_messaging_service(http_context& ctx, routes& r) {
get_timeout_messages.unset(r);
get_sent_messages.unset(r);
get_replied_messages.unset(r);
get_dropped_messages.unset(r);
get_exception_messages.unset(r);
get_pending_messages.unset(r);
get_respond_pending_messages.unset(r);
get_respond_completed_messages.unset(r);
get_version.unset(r);
get_dropped_messages_by_ver.unset(r);
}
}
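
Both get_client_getter and get_server_getter above build one address-to-counter map per shard and then combine the maps with map_reduce0 and map_sum. The definition of map_sum is not part of this hunk, so the following is only a sketch of the reduction as it appears to work. It assumes map_sum adds counters key by key, uses std::string keys in place of gms::inet_address, and replaces the sharded map_reduce0 call with a sequential loop named reduce_shards, which is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Assumption: std::string keys stand in for gms::inet_address.
using map_type = std::unordered_map<std::string, uint64_t>;

// Assumed behaviour of map_sum: add every counter in `val` into `dest`.
map_type map_sum(map_type dest, const map_type& val) {
    for (const auto& entry : val) {
        dest[entry.first] += entry.second;
    }
    return dest;
}

// Hypothetical sequential stand-in for sharded<T>::map_reduce0:
// start from an empty map and fold each shard's map into it.
map_type reduce_shards(const std::vector<map_type>& per_shard) {
    map_type total;
    for (const auto& shard_map : per_shard) {
        total = map_sum(std::move(total), shard_map);
    }
    return total;
}
```

With this shape, an endpoint seen on several shards ends up with a single entry holding the sum of its per-shard counters, which is what map_to_message_counters then serializes.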

@@ -1,20 +1,30 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include "api.hh"
namespace netw { class messaging_service; }
namespace api {
void set_messaging_service(http_context& ctx, routes& r, sharded<netw::messaging_service>& ms);
void unset_messaging_service(http_context& ctx, routes& r);
void set_messaging_service(http_context& ctx, routes& r);
}

@@ -1,9 +1,22 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "storage_proxy.hh"
@@ -13,8 +26,6 @@
#include "service/storage_service.hh"
#include "db/config.hh"
#include "utils/histogram.hh"
#include "replica/database.hh"
#include <seastar/core/scheduling_specific.hh>
namespace api {
@@ -22,208 +33,57 @@ namespace sp = httpd::storage_proxy_json;
using proxy = service::storage_proxy;
using namespace json;
utils::time_estimated_histogram timed_rate_moving_average_summary_merge(utils::time_estimated_histogram a, const utils::timed_rate_moving_average_summary_and_histogram& b) {
return a.merge(b.histogram());
}
/**
* This function implements a two-dimensional map reduce where
* the first level is a distributed storage_proxy class and the
* second level is the stats per scheduling group class.
* @param d - a reference to the storage_proxy distributed class.
* @param mapper - the internal mapper that is used to map the internal
* stat class into a value of type `V`.
* @param reducer - the reducer that is used in both outer and inner
* aggregations.
* @param initial_value - the initial value to use for both aggregations
* @return A future that resolves to the result of the aggregation.
*/
template<typename V, typename Reducer, typename InnerMapper>
future<V> two_dimensional_map_reduce(distributed<service::storage_proxy>& d,
InnerMapper mapper, Reducer reducer, V initial_value) {
return d.map_reduce0( [mapper, reducer, initial_value] (const service::storage_proxy& sp) {
return map_reduce_scheduling_group_specific<service::storage_proxy_stats::stats>(
mapper, reducer, initial_value, sp.get_stats_key());
}, initial_value, reducer);
}
/**
* This function implements a two-dimensional map reduce where
* the first level is a distributed storage_proxy class and the
* second level is the stats per scheduling group class.
* @param d - a reference to the storage_proxy distributed class.
* @param f - a field pointer which is the implicit internal reducer.
* @param reducer - the reducer that is used in both outer and inner
* aggregations.
* @param initial_value - the initial value to use for both aggregations
* @return A future that resolves to the result of the aggregation.
*/
template<typename V, typename Reducer, typename F, typename C>
future<V> two_dimensional_map_reduce(distributed<service::storage_proxy>& d,
C F::*f, Reducer reducer, V initial_value) {
return two_dimensional_map_reduce(d, [f] (F& stats) -> V {
return stats.*f;
}, reducer, initial_value);
}
/**
* A partial Specialization of sum_stats for the storage proxy
* case where the get stats function doesn't return a
* stats object with fields but a per scheduling group
* stats object, the name was also changed since functions
* partial specialization is not supported in C++.
*
*/
template<typename V, typename F>
future<json::json_return_type> sum_stats_storage_proxy(distributed<proxy>& d, V F::*f) {
return two_dimensional_map_reduce(d, [f] (F& stats) { return stats.*f; }, std::plus<V>(), V(0)).then([] (V val) {
return make_ready_future<json::json_return_type>(val);
});
}
static future<utils::rate_moving_average> sum_timed_rate(distributed<proxy>& d, utils::timed_rate_moving_average service::storage_proxy_stats::stats::*f) {
return two_dimensional_map_reduce(d, [f] (service::storage_proxy_stats::stats& stats) {
return (stats.*f).rate();
}, std::plus<utils::rate_moving_average>(), utils::rate_moving_average());
}
static future<json::json_return_type> sum_timed_rate_as_obj(distributed<proxy>& d, utils::timed_rate_moving_average service::storage_proxy_stats::stats::*f) {
return sum_timed_rate(d, f).then([](const utils::rate_moving_average& val) {
httpd::utils_json::rate_moving_average m;
m = val;
return make_ready_future<json::json_return_type>(m);
});
}
httpd::utils_json::rate_moving_average_and_histogram get_empty_moving_average() {
return timer_to_json(utils::rate_moving_average_and_histogram());
}
static future<json::json_return_type> sum_timed_rate_as_long(distributed<proxy>& d, utils::timed_rate_moving_average service::storage_proxy_stats::stats::*f) {
return sum_timed_rate(d, f).then([](const utils::rate_moving_average& val) {
return make_ready_future<json::json_return_type>(val.count);
});
}
utils_json::estimated_histogram time_to_json_histogram(const utils::time_estimated_histogram& val) {
utils_json::estimated_histogram res;
for (size_t i = 0; i < val.size(); i++) {
res.buckets.push(val.get(i));
res.bucket_offsets.push(val.get_bucket_lower_limit(i));
}
return res;
}
static future<json::json_return_type> sum_estimated_histogram(http_context& ctx, utils::timed_rate_moving_average_summary_and_histogram service::storage_proxy_stats::stats::*f) {
return two_dimensional_map_reduce(ctx.sp, [f] (service::storage_proxy_stats::stats& stats) {
return (stats.*f).histogram();
}, utils::time_estimated_histogram_merge, utils::time_estimated_histogram()).then([](const utils::time_estimated_histogram& val) {
return make_ready_future<json::json_return_type>(time_to_json_histogram(val));
});
}
static future<json::json_return_type> sum_estimated_histogram(http_context& ctx, utils::estimated_histogram service::storage_proxy_stats::stats::*f) {
return two_dimensional_map_reduce(ctx.sp, f, utils::estimated_histogram_merge,
utils::estimated_histogram()).then([](const utils::estimated_histogram& val) {
static future<json::json_return_type> sum_estimated_histogram(http_context& ctx, sstables::estimated_histogram proxy::stats::*f) {
return ctx.sp.map_reduce0([f](const proxy& p) {return p.get_stats().*f;}, sstables::estimated_histogram(),
sstables::merge).then([](const sstables::estimated_histogram& val) {
utils_json::estimated_histogram res;
res = val;
return make_ready_future<json::json_return_type>(res);
});
}
static future<json::json_return_type> total_latency(http_context& ctx, utils::timed_rate_moving_average_summary_and_histogram service::storage_proxy_stats::stats::*f) {
return two_dimensional_map_reduce(ctx.sp, [f] (service::storage_proxy_stats::stats& stats) {
return (stats.*f).hist.mean * (stats.*f).hist.count;
}, std::plus<double>(), 0.0).then([](double val) {
static future<json::json_return_type> total_latency(http_context& ctx, utils::ihistogram proxy::stats::*f) {
return ctx.sp.map_reduce0([f](const proxy& p) {return (p.get_stats().*f).mean * (p.get_stats().*f).count;}, 0.0,
std::plus<double>()).then([](double val) {
int64_t res = val;
return make_ready_future<json::json_return_type>(res);
});
}
/**
* A partial Specialization of sum_histogram_stats
* for the storage proxy case where the get stats
* function doesn't return a stats object with
* fields but a per scheduling group stats object,
* the name was also changed since function partial
* specialization is not supported in C++.
*/
template<typename F>
future<json::json_return_type>
sum_histogram_stats_storage_proxy(distributed<proxy>& d,
utils::timed_rate_moving_average_summary_and_histogram F::*f) {
return two_dimensional_map_reduce(d, [f] (service::storage_proxy_stats::stats& stats) {
return (stats.*f).hist;
}, std::plus<utils::ihistogram>(), utils::ihistogram()).
then([](const utils::ihistogram& val) {
return make_ready_future<json::json_return_type>(to_json(val));
});
}
/**
* A partial Specialization of sum_timer_stats for the
* storage proxy case where the get stats function
* doesn't return a stats object with fields but a
* per scheduling group stats object, the name
* was also changed since partial function specialization
* is not supported in C++.
*/
template<typename F>
future<json::json_return_type>
sum_timer_stats_storage_proxy(distributed<proxy>& d,
utils::timed_rate_moving_average_summary_and_histogram F::*f) {
return two_dimensional_map_reduce(d, [f] (service::storage_proxy_stats::stats& stats) {
return (stats.*f).rate();
}, std::plus<utils::rate_moving_average_and_histogram>(),
utils::rate_moving_average_and_histogram()).then([](const utils::rate_moving_average_and_histogram& val) {
return make_ready_future<json::json_return_type>(timer_to_json(val));
});
}
void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_service>& ss) {
void set_storage_proxy(http_context& ctx, routes& r) {
sp::get_total_hints.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_hinted_handoff_enabled.set(r, [&ctx](std::unique_ptr<request> req) {
const auto& filter = service::get_storage_proxy().local().get_hints_host_filter();
return make_ready_future<json::json_return_type>(!filter.is_disabled_for_all());
sp::get_hinted_handoff_enabled.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// hinted handoff is not supported currently,
// so we should return false
return make_ready_future<json::json_return_type>(false);
});
sp::set_hinted_handoff_enabled.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
auto enable = req->get_query_param("enable");
auto filter = (enable == "true" || enable == "1")
? db::hints::host_filter(db::hints::host_filter::enabled_for_all_tag {})
: db::hints::host_filter(db::hints::host_filter::disabled_for_all_tag {});
return service::get_storage_proxy().invoke_on_all([filter = std::move(filter)] (service::storage_proxy& sp) {
return sp.change_hints_host_filter(filter);
}).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
return make_ready_future<json::json_return_type>(json_void());
});
sp::get_hinted_handoff_enabled_by_dc.set(r, [](std::unique_ptr<request> req) {
std::vector<sstring> res;
const auto& filter = service::get_storage_proxy().local().get_hints_host_filter();
const auto& dcs = filter.get_dcs();
res.reserve(res.size());
std::copy(dcs.begin(), dcs.end(), std::back_inserter(res));
//TBD
unimplemented();
std::vector<sp::mapper_list> res;
return make_ready_future<json::json_return_type>(res);
});
sp::set_hinted_handoff_enabled_by_dc_list.set(r, [](std::unique_ptr<request> req) {
auto dcs = req->get_query_param("dcs");
auto filter = db::hints::host_filter::parse_from_dc_list(std::move(dcs));
return service::get_storage_proxy().invoke_on_all([filter = std::move(filter)] (service::storage_proxy& sp) {
return sp.change_hints_host_filter(filter);
}).then([] {
return make_ready_future<json::json_return_type>(json_void());
});
//TBD
unimplemented();
auto enable = req->get_query_param("dcs");
return make_ready_future<json::json_return_type>(json_void());
});
sp::get_max_hint_window.set(r, [](std::unique_ptr<request> req) {
@@ -342,19 +202,19 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se
});
sp::get_read_repair_attempted.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read_repair_attempts);
return sum_stats(ctx.sp, &proxy::stats::read_repair_attempts);
});
sp::get_read_repair_repaired_blocking.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read_repair_repaired_blocking);
return sum_stats(ctx.sp, &proxy::stats::read_repair_repaired_blocking);
});
sp::get_read_repair_repaired_background.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read_repair_repaired_background);
return sum_stats(ctx.sp, &proxy::stats::read_repair_repaired_background);
});
sp::get_schema_versions.set(r, [&ss](std::unique_ptr<request> req) {
return ss.local().describe_schema_versions().then([] (auto result) {
sp::get_schema_versions.set(r, [](std::unique_ptr<request> req) {
return service::get_local_storage_service().describe_schema_versions().then([] (auto result) {
std::vector<sp::mapper_list> res;
for (auto e : result) {
sp::mapper_list entry;
@@ -366,154 +226,127 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se
});
});
sp::get_cas_read_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_read_timeouts);
sp::get_cas_read_timeouts.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_read_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_read_unavailables);
sp::get_cas_read_unavailables.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_write_timeouts);
sp::get_cas_write_timeouts.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_write_unavailables);
sp::get_cas_write_unavailables.set(r, [](std::unique_ptr<request> req) {
//TBD
// FIXME
// cas is not supported yet, so just return 0
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_write_unfinished_commit);
sp::get_cas_write_metrics_unfinished_commit.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_contention.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_estimated_histogram(ctx, &proxy::stats::cas_write_contention);
sp::get_cas_write_metrics_contention.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_condition_not_met.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_write_condition_not_met);
sp::get_cas_write_metrics_condition_not_met.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_write_metrics_failed_read_round_optimization.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_failed_read_round_optimization);
sp::get_cas_read_metrics_unfinished_commit.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_read_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_stats(ctx.sp, &proxy::stats::cas_read_unfinished_commit);
sp::get_cas_read_metrics_contention.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_cas_read_metrics_contention.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_estimated_histogram(ctx, &proxy::stats::cas_read_contention);
sp::get_cas_read_metrics_condition_not_met.set(r, [](std::unique_ptr<request> req) {
//TBD
unimplemented();
return make_ready_future<json::json_return_type>(0);
});
sp::get_read_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::read_timeouts);
return sum_stats(ctx.sp, &proxy::stats::read_timeouts);
});
sp::get_read_metrics_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::read_unavailables);
return sum_stats(ctx.sp, &proxy::stats::read_unavailables);
});
sp::get_range_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::range_slice_timeouts);
return sum_stats(ctx.sp, &proxy::stats::range_slice_timeouts);
});
sp::get_range_metrics_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::range_slice_unavailables);
return sum_stats(ctx.sp, &proxy::stats::range_slice_unavailables);
});
sp::get_write_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::write_timeouts);
return sum_stats(ctx.sp, &proxy::stats::write_timeouts);
});
sp::get_write_metrics_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::write_unavailables);
});
sp::get_read_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::read_timeouts);
});
sp::get_read_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::read_unavailables);
});
sp::get_range_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::range_slice_timeouts);
});
sp::get_range_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::range_slice_unavailables);
});
sp::get_write_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::write_timeouts);
});
sp::get_write_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::write_unavailables);
});
sp::get_range_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_histogram_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::range);
});
sp::get_write_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_histogram_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::write);
});
sp::get_read_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_histogram_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read);
return sum_stats(ctx.sp, &proxy::stats::write_unavailables);
});
sp::get_range_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::range);
return sum_histogram_stats(ctx.sp, &proxy::stats::range);
});
sp::get_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::write);
});
sp::get_cas_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats(ctx.sp, &proxy::stats::cas_write);
});
sp::get_cas_read_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats(ctx.sp, &proxy::stats::cas_read);
});
sp::get_view_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
//TBD
// FIXME
// No View metrics are available, so just return empty moving average
return make_ready_future<json::json_return_type>(get_empty_moving_average());
return sum_histogram_stats(ctx.sp, &proxy::stats::write);
});
sp::get_read_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read);
return sum_histogram_stats(ctx.sp, &proxy::stats::read);
});
sp::get_read_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_estimated_histogram(ctx, &service::storage_proxy_stats::stats::read);
return sum_estimated_histogram(ctx, &proxy::stats::estimated_read);
});
sp::get_read_latency.set(r, [&ctx](std::unique_ptr<request> req) {
return total_latency(ctx, &service::storage_proxy_stats::stats::read);
return total_latency(ctx, &proxy::stats::read);
});
sp::get_write_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_estimated_histogram(ctx, &service::storage_proxy_stats::stats::write);
return sum_estimated_histogram(ctx, &proxy::stats::estimated_write);
});
sp::get_write_latency.set(r, [&ctx](std::unique_ptr<request> req) {
return total_latency(ctx, &service::storage_proxy_stats::stats::write);
return total_latency(ctx, &proxy::stats::write);
});
sp::get_range_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {
return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::range);
return sum_histogram_stats(ctx.sp, &proxy::stats::read);
});
sp::get_range_latency.set(r, [&ctx](std::unique_ptr<request> req) {
return total_latency(ctx, &service::storage_proxy_stats::stats::range);
return total_latency(ctx, &proxy::stats::range);
});
}
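
Many of the handlers above pass a pointer-to-member (for example `&proxy::stats::read_timeouts`) to a helper that adds up one counter over all shards. The sketch below shows only that general pattern in plain C++. It is not the project's implementation: `shard_stats` and `sum_field` are hypothetical names, and the per-shard state is modelled as a simple `std::vector` rather than a sharded service.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one shard's statistics (not the real Scylla type).
struct shard_stats {
    std::uint64_t read_timeouts;
    std::uint64_t write_timeouts;
};

// Sum a single counter across all shards. The counter is selected by a
// pointer-to-member, so one helper serves every field of the same type.
template <typename Stats, typename T>
T sum_field(const std::vector<Stats>& shards, T Stats::*field) {
    assert(field != nullptr);  // a null member pointer cannot be dereferenced
    return std::accumulate(shards.begin(), shards.end(), T{},
        [field](T acc, const Stats& s) { return acc + s.*field; });
}
```

Calling `sum_field(shards, &shard_stats::read_timeouts)` selects the counter at the call site, which is presumably why each handler in the diff differs only in the member it names.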


@@ -1,20 +1,30 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <seastar/core/sharded.hh>
#include "api.hh"
namespace service { class storage_service; }
namespace api {
void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_service>& ss);
void set_storage_proxy(http_context& ctx, routes& r);
}

File diff suppressed because it is too large


@@ -1,61 +1,30 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
#include <seastar/core/sharded.hh>
#include "api.hh"
#include "db/data_listeners.hh"
namespace cql_transport { class controller; }
class thrift_controller;
namespace db {
class snapshot_ctl;
namespace view {
class view_builder;
}
class system_keyspace;
}
namespace netw { class messaging_service; }
class repair_service;
namespace cdc { class generation_service; }
class sstables_loader;
namespace gms {
class gossiper;
}
namespace api {
// verify that the keyspace parameter is found, otherwise a bad_param_exception exception is thrown
// containing the description of the respective keyspace error.
sstring validate_keyspace(http_context& ctx, const parameters& param);
// splits a request parameter assumed to hold a comma-separated list of table names
// verify that the tables are found, otherwise a bad_param_exception exception is thrown
// containing the description of the respective no_such_column_family error.
std::vector<sstring> parse_tables(const sstring& ks_name, http_context& ctx, const std::unordered_map<sstring, sstring>& query_params, sstring param_name);
void set_storage_service(http_context& ctx, routes& r, sharded<service::storage_service>& ss, gms::gossiper& g, sharded<cdc::generation_service>& cdc_gs, sharded<db::system_keyspace>& sys_ls);
void set_sstables_loader(http_context& ctx, routes& r, sharded<sstables_loader>& sst_loader);
void unset_sstables_loader(http_context& ctx, routes& r);
void set_view_builder(http_context& ctx, routes& r, sharded<db::view::view_builder>& vb);
void unset_view_builder(http_context& ctx, routes& r);
void set_repair(http_context& ctx, routes& r, sharded<repair_service>& repair);
void unset_repair(http_context& ctx, routes& r);
void set_transport_controller(http_context& ctx, routes& r, cql_transport::controller& ctl);
void unset_transport_controller(http_context& ctx, routes& r);
void set_rpc_controller(http_context& ctx, routes& r, thrift_controller& ctl);
void unset_rpc_controller(http_context& ctx, routes& r);
void set_snapshot(http_context& ctx, routes& r, sharded<db::snapshot_ctl>& snap_ctl);
void unset_snapshot(http_context& ctx, routes& r);
seastar::future<json::json_return_type> run_toppartitions_query(db::toppartitions_query& q, http_context &ctx, bool legacy_request = false);
void set_storage_service(http_context& ctx, routes& r);
}


@@ -1,9 +1,22 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#include "stream_manager.hh"
@@ -74,13 +87,13 @@ static hs::stream_state get_state(
return state;
}
void set_stream_manager(http_context& ctx, routes& r, sharded<streaming::stream_manager>& sm) {
void set_stream_manager(http_context& ctx, routes& r) {
hs::get_current_streams.set(r,
[&sm] (std::unique_ptr<request> req) {
return sm.invoke_on_all([] (auto& sm) {
[] (std::unique_ptr<request> req) {
return streaming::get_stream_manager().invoke_on_all([] (auto& sm) {
return sm.update_all_progress_info();
}).then([&sm] {
return sm.map_reduce0([](streaming::stream_manager& stream) {
}).then([] {
return streaming::get_stream_manager().map_reduce0([](streaming::stream_manager& stream) {
std::vector<hs::stream_state> res;
for (auto i : stream.get_initiated_streams()) {
res.push_back(get_state(*i.second.get()));
@@ -96,17 +109,17 @@ void set_stream_manager(http_context& ctx, routes& r, sharded<streaming::stream_
});
});
hs::get_all_active_streams_outbound.set(r, [&sm](std::unique_ptr<request> req) {
return sm.map_reduce0([](streaming::stream_manager& stream) {
hs::get_all_active_streams_outbound.set(r, [](std::unique_ptr<request> req) {
return streaming::get_stream_manager().map_reduce0([](streaming::stream_manager& stream) {
return stream.get_initiated_streams().size();
}, 0, std::plus<int64_t>()).then([](int64_t res) {
return make_ready_future<json::json_return_type>(res);
});
});
hs::get_total_incoming_bytes.set(r, [&sm](std::unique_ptr<request> req) {
hs::get_total_incoming_bytes.set(r, [](std::unique_ptr<request> req) {
gms::inet_address peer(req->param["peer"]);
return sm.map_reduce0([peer](streaming::stream_manager& sm) {
return streaming::get_stream_manager().map_reduce0([peer](streaming::stream_manager& sm) {
return sm.get_progress_on_all_shards(peer).then([] (auto sbytes) {
return sbytes.bytes_received;
});
@@ -115,8 +128,8 @@ void set_stream_manager(http_context& ctx, routes& r, sharded<streaming::stream_
});
});
hs::get_all_total_incoming_bytes.set(r, [&sm](std::unique_ptr<request> req) {
return sm.map_reduce0([](streaming::stream_manager& sm) {
hs::get_all_total_incoming_bytes.set(r, [](std::unique_ptr<request> req) {
return streaming::get_stream_manager().map_reduce0([](streaming::stream_manager& sm) {
return sm.get_progress_on_all_shards().then([] (auto sbytes) {
return sbytes.bytes_received;
});
@@ -125,9 +138,9 @@ void set_stream_manager(http_context& ctx, routes& r, sharded<streaming::stream_
});
});
hs::get_total_outgoing_bytes.set(r, [&sm](std::unique_ptr<request> req) {
hs::get_total_outgoing_bytes.set(r, [](std::unique_ptr<request> req) {
gms::inet_address peer(req->param["peer"]);
return sm.map_reduce0([peer] (streaming::stream_manager& sm) {
return streaming::get_stream_manager().map_reduce0([peer] (streaming::stream_manager& sm) {
return sm.get_progress_on_all_shards(peer).then([] (auto sbytes) {
return sbytes.bytes_sent;
});
@@ -136,8 +149,8 @@ void set_stream_manager(http_context& ctx, routes& r, sharded<streaming::stream_
});
});
hs::get_all_total_outgoing_bytes.set(r, [&sm](std::unique_ptr<request> req) {
return sm.map_reduce0([](streaming::stream_manager& sm) {
hs::get_all_total_outgoing_bytes.set(r, [](std::unique_ptr<request> req) {
return streaming::get_stream_manager().map_reduce0([](streaming::stream_manager& sm) {
return sm.get_progress_on_all_shards().then([] (auto sbytes) {
return sbytes.bytes_sent;
});
@@ -147,13 +160,4 @@ void set_stream_manager(http_context& ctx, routes& r, sharded<streaming::stream_
});
}
void unset_stream_manager(http_context& ctx, routes& r) {
hs::get_current_streams.unset(r);
hs::get_all_active_streams_outbound.unset(r);
hs::get_total_incoming_bytes.unset(r);
hs::get_all_total_incoming_bytes.unset(r);
hs::get_total_outgoing_bytes.unset(r);
hs::get_all_total_outgoing_bytes.unset(r);
}
}


@@ -1,9 +1,22 @@
/*
* Copyright (C) 2015-present ScyllaDB
* Copyright 2015 Cloudius Systems
*/
/*
* SPDX-License-Identifier: AGPL-3.0-or-later
* This file is part of Scylla.
*
* Scylla is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Scylla is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Scylla. If not, see <http://www.gnu.org/licenses/>.
*/
#pragma once
@@ -12,7 +25,6 @@
namespace api {
void set_stream_manager(http_context& ctx, routes& r, sharded<streaming::stream_manager>& sm);
void unset_stream_manager(http_context& ctx, routes& r);
void set_stream_manager(http_context& ctx, routes& r);
}

Some files were not shown because too many files have changed in this diff