Now that table no longer accept shared SSTables, those two functions can
be simplified by removing the shared condition.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
With off-strategy work on reshard on boot and refresh, table no
longer needs to work with Shared SSTables. That will unlock
a host of cleanups.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The tests in test_projection_expression.py test that ProjectionExpression
works - including attribute paths - for the GetItem, Query and Scan
operations.
There is a fourth read operation - BatchGetItem, and it supports
ProjectionExpression too. We tested BatchGetItem + ProjectionExpression in
test_batch.py, but this only tests the basic feature, with top-level
attributes, and we were missing a test for nested document paths.
This patch adds such a test. It is still xfailing on Alternator (and passing
on DynamoDB), because attribute paths are still not supported (this is
issue #5024).
Refs #5024.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200629063244.287571-1-nyh@scylladb.com>
This patch adds three more tests for the ProjectionExpression parameter
of GetItem. They are tests for nested document paths like a.b[2].c.
We don't support nested paths in Alternator yet (this is issue #5024),
so the new tests all xfail (and pass on DynamoDB).
We already had similar tests for UpdateExpression, which also needs to
support document paths, but the tests were missing for ProjectionExpression.
I am planning to start the implementation of document paths with
ProjectionExpression (which is the simplest use of document paths), so I
want the tests for this expression to be as complete as possible.
Refs #5024.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200628213208.275050-1-nyh@scylladb.com>
"
The snapshotting code is already well isolated from the rest of
the storage_service, so it's relatively easy to move it into
independent component, thus de-bloating the storage_service.
As a side effect this allows painless removal of calls to global
get_storage_service() from schema::describe code.
Test: unit(debug), dtest.snapshot_test(dev), manual start-stop
"
* 'br-snapshot-controller-4' of https://github.com/xemul/scylla:
snap: Get rid of storage_service reference in schema.cc
main: Stop http server
snapshot: Make check_snapshot_not_exist a method
snapshots: Move ops gate from storage_service
snapshot: Move lock from storage_service
snapshot: Move all code into db::snapshot_ctl class
storage_service: Move all snapshot code into snapshot-ctl.cc
snapshots: Initial skeleton
snapshots: Properly shutdown API endpoints
api: Rewrap set_server_snapshot lambda
We are currently investigating a segmentation fault, which is suspected
to be caused by a leaked pause handle. Although according to the latest
theory the handle leak is not the root cause of the issue, just a
symptom, its better to catch any bugs that would cause a handle leaking
at the act, and not later when some side-effect causes a segfault.
Refs: #6613
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200625153729.522811-1-bdenes@scylladb.com>
"
Before this series scylla would effectively infinite loop when, for
example, casting a decimal with a negative scale to float.
Fixes#6720
"
* 'espindola/fix-decimal-issue' of https://github.com/espindola/scylla:
big_decimal: Add a test for a corner case
big_decimal: Correctly handle negative scales
big_decimal: Add a as_rational member function
big_decimal: Move constructors out of line
As HACKING.md suggests, we now require gcc version >= 10. Set the
minimum at 10.1.1, as that is the first official 10 release:
https://gcc.gnu.org/releases.html
Tests: manually run configure.py and ensure it passes/fails
appropriately.
Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
Now when the snapshot stopping is correctly handled, we may pull the database
reference all the way down to the schema::describe().
One tricky place is in table::napshot() -- the local db reference is pulled
through an smp::submit_to call, but thanks to the shard checks in the place
where it is needed the db is still "local"
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently it's not stopped at all, so calling a REST request shutdown-time
may crash things at random places.
Fixes: #5702
But it's not the end of the story. Since the server stays up while we are
shutting things down, each subsystem should carefully handle the cases when
it's half-down, but a request comes. A better solution is to unregister
rest verbs eventually, but httpd's rules cannot do it now.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
For this de-static run_snapshot_*_operation (because we no longer have
the static global to get the lock from) and make the snapshot_ctl be
peering_sharded_service to call invoke_on.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This includes
- rename namespace in snapshot-ctl.[cc|hh]
- move methods from storage_service to snapshot_ctl
- move snapshot_details struct
- temporarily make storage_service._snapshot_lock and ._snapshot_ops public
- replace two get_local_storage_service() occurrences with this._db
The latter is not 100% clear as the code that does this references "this"
from another shard, but the _db in question is the distributed object, so
they are all the same on all instances.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This is plain move, no other modifications are made, even the
"service" namespace is kept, only few broken indentation fixes.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
A placeholder for snapshotting code that will be moved into it
from the storage_service.
Also -- pass it through the API for future use.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Now with the seastar httpd routes unset() at hands we
can shut down individual API endpoints. Do this for
snapshot calls, this will make snapshot controller stop
safe.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The lambda calls the core snapshot method deep inside the
json marshalling callback. This will bring problems with
stopping the snapshot controller in the next patches.
To prepare for this -- call the .get_snapshot_details()
first, then keep the result in do_with() context. This
change doesn't affect the issue the lambde in question is
about to solve as the whole result set is anyway kept in
memory while being streamed outside.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This behavior is different from cassandra, but without arithmetic
operations it doesn't seem possible to notice the difference from
CQL. Using avg produces the same results, since we use an initial
value of 0 (scale = 0).
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
A negative scale was being passed an a positive value to
boost::multiprecision::pow, which would never finish.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Before Scylla 3.0, we used to send streaming mutations using
individual RPC requests and flush them together using dedicated
streaming memtables. This mechanism is no longer in use and all
versions that use it have long reached end-of-life.
Remove this code.
This makes the code a bit easier to read as there are no discarded
futures and no references to having to keep a subscription alive,
which we don't with current seastar.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200527013120.179763-1-espindola@scylladb.com>
"
Row level repair, when using a local reader, is prone to deadlocking on
the streaming reader concurrency semaphore. This has been observed to
happen with at least two participating nodes, running more concurrent
repairs than the maximum allowed amount of reads by the concurrency
semaphore. In this situation, it is possible that two repair instances,
competing for the last available permits on both nodes, get a permit on
one of the nodes and get queued on the other one respectively. As
neither will let go of the permit it already acquired, nor give up
waiting on the failed-to-acquired permit, a deadlock happens.
To prevent this, we make the local repair reader evictable. For this we
reuse the already existing evictable reader mechanism of the multishard
combining reader. This patchset refactors this evictable reader
mechanism into a standalone flat mutation reader, then exposes it to the
outside world.
The repair reader is paused after the repair buffer is filled, which is
currently 32MB, so the cost of a possible reader recreation is amortized
over 32MB read.
The repair reader is said to be local, when it can use the shard-local
partitioner. This is the case if the participating nodes are homogenous
(their shard configuration is identical), that is the repair instance
has to read just from one shard. A non-local reader uses the multishard
reader, which already makes its shard readers evictable and hence is not
prone to the deadlock described here.
Fixes: #6272
Tests: unit(dev, release, debug)
"
* 'repair-row-level-evictable-local-reader/v3' of https://github.com/denesb/scylla:
repair: row_level: destroy reader on EOS or error
repair: row_level: use evictable_reader for local reads
mutation_reader: expose evictable_reader
mutation_reader: evictable_reader: add auto_pause flag
mutation_reader: make evictable_reader a flat_mutation_reader
mutation_reader: s/inactive_shard_read/inactive_evictable_reader/
mutation_reader: move inactive_shard_reader code up
mutation_reader: fix indentation
mutation_reader: shard_reader: extract remote_reader as evictable_reader
mutation_reader: reader_lifecycle_policy: make semaphore() available early
The database has a mechanism of performing internal CQL queries,
mainly to edit its own local tables. Unfortunately, it's easy
to use the interface incorrectly - e.g. issuing an `ALTER TABLE`
statement on a non-local table will result in not propagating
the schema change to other nodes, which in turn leads to
inconsistencies. In order to avoid such mistakes (one of them
was a root cause of #6513), when an attempt to alter a distributed
table via a local interface is performed, it results in an error.
Tests: unit(dev)
Fixes#6700
Message-Id: <61be3defb57be79f486e6067ceff4f4c965e34cb.1592990796.git.sarna@scylladb.com>
The function that determines if a level L, where L > 0, is disjoint,
is returning false if level is disjoint.
That's because it incorrectly accounts an overlapping SSTable in
the level as a disjoint SSTable. So we need to inverse the logic.
The side effect is that boot will always try to reshape levels
greater than 0 because reshape procedure incorrectly thinks that
levels are overlapping when they're actually disjoint.
Fixes#6695.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200623180221.229695-1-raphaelsc@scylladb.com>
Currently the message only mentions the endpoint and the error message
returned from the replica. Add the keyspace and table to this message to
provide more context. This should help investigations of such errors
greatly, as in the case of tests where there is usually a single table,
we can already guess what exactly is timing out based on this.
We should add even more context, like the kind of the query (single
partition or range scan) but this information is not readily available
in the surrounding scope so this patch defers it.
Refs: #6548
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200624054647.413256-1-bdenes@scylladb.com>
Manual translation from JSON to string_view is replaced
with rjson::to_string_view helper function. In one place,
a redundant string_view intermediary is removed
in favor of creating the string straight from JSON.
Message-Id: <2aa9d9fedd73f14b7640870d14db4f2f0bd7bd8a.1592936139.git.sarna@scylladb.com>
Updating tags was erroneously done locally, which means that
the schema change was not propagated to other nodes.
The new code announces new schema globally.
Fixes#6513
Branches: 4.0,4.1
Tests: unit(dev)
dtest(alternator_tests.AlternatorTest.test_update_condition_expression_and_write_isolation)
Message-Id: <3a816c4ecc33c03af4f36e51b11f195c231e7ce1.1592935039.git.sarna@scylladb.com>
To avoid having to make it an optional with all the additional checks,
we just replace it with an empty reader instead, this also also achieves
the desired effect of releasing the read permit and all the associated
resources early.
Row level repair, when using a local reader, is prone to deadlocking on
the streaming reader concurrency semaphore. This has been observed to
happen with at least two participating nodes, running more concurrent
repairs than the maximum allowed amount of reads by the concurrency
semaphore. In this situation, it is possible that two repair instances,
competing for the last available permits on both nodes, get a permit on
one of the nodes and get queued on the other one respectively. As
neither will let go of the permit it already acquired, nor give up
waiting on the failed-to-acquired permit, a deadlock happens.
To prevent this, we make the local repair reader evictable. For this we
reuse the newly exposed evictable reader.
The repair reader is paused after the repair buffer is filled, which is
currently 32MB, so the cost of a possible reader recreation is amortized
over 32MB read.
The repair reader is said to be local, when it can use the shard-local
partitioner. This is the case if the participating nodes are homogenous
(their shard configuration is identical), that is the repair instance
has to read just from one shard. A non-local reader uses the multishard
reader, which already makes its shard readers evictable and hence is not
prone to the deadlock described here.
Expose functions for the outside world to create evictable readers. We
expose two functions, which create an evictable reader with
`auto_pause::yes` and `auto_pause::no` respectively. The function
creating the latter also returns a handle in addition to the reader,
which can be used to pause the reader.
Currently the evictable reader unconditionally pauses the underlying
reader after each use (`fill_buffer()` or `fast_forward_to()` call).
This is fine for current users (the multishard reader), but the future
user we are doing all this refactoring for -- repair -- will want to
control when the underlying reader is paused "manually". Both these
behaviours can easily be supported in a single implementation, so we
add an `auto_pause` flag to allow the creator of the evictable reader
to control this.
The `evictable_reader` class is almost a proper flat mutation reader
already, it roughly offers the same interface. This patch makes this
formal: changing the class to inherit from `flat_mutation_reader::impl`,
and implement all virtual methods. This also entails a departure from
using the lifecycle policy to pause/resume and create readers, instead
using more general building blocks like the reader concurrency semaphore
and a mutation source.
Unlike refresh on upload dir, column family population shouldn't mutate
level of SSTables to level 0. Otherwise, LCS will have to regenerate all
levels by rewriting the data multiple times, hurting a lot the write
amplification and consequently the node performance. That's also affecting
the time for a node to boot because reshape may be triggered as a result
of this.
Refs #6695.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200622192502.187532-2-raphaelsc@scylladb.com>
The thrift compiler (since 0.13 at least) complains that
the csharp target is deprecated and recommend replacing it
with netstd. Since we don't use either, humor it.
I suspect that this warning caused some spurious rebuilds,
but have not proven it.