Commit Graph

1440 Commits

Author SHA1 Message Date
Konstantin Osipov
94006d77b1 lwt: add cas_contention_timeout_in_ms to config
Make the default conform to the origin.
Message-Id: <20191006154532.54856-3-kostja@scylladb.com>
2019-10-08 00:02:35 +02:00
Nadav Har'El
f2f0f5eb0f alternator: add https support
Merged patch series from Piotr Sarna:

This series adds HTTPS support for Alternator.
The series comes with --https option added to alternator-test, which makes
the test harness run all the tests with HTTPS instead of HTTP. All the tests
pass, albeit with security warnings that a self-signed x509 certificate was
used and it should not be trusted.

Fixes #5042
Refs scylladb/seastar#685

Patches:
  docs: update alternator entry on HTTPS
  alternator-test: suppress the "Unverified HTTPS request" warning
  alternator-test: add HTTPS info to README.md
  alternator-test: add HTTPS to test_describe_endpoints
  alternator-test: add --https parameter
  alternator: add HTTPS support
  config: add alternator HTTPS port
2019-10-07 12:38:20 +03:00
Piotr Sarna
b42eb8b80a config: add alternator HTTPS port
The config variable will be used to set up a TLS-based server
for serving alternator HTTPS requests.
2019-10-03 19:10:29 +02:00
Avi Kivity
3cb081eb84 Merge " hinted handoff: fix races during shutdown and draining" from Vlad
"
Fix races that may lead to use-after-free events and file system level exceptions
during shutdown and drain.

The root cause of use-after-free events in question is that space_watchdog blocks on
end_point_hints_manager::file_update_mutex() and we need to make sure this mutex is alive as long as
it's accessed even if the corresponding end_point_hints_manager instance
is destroyed in the context of manager::drain_for().

File system exceptions may occur when space_watchdog attempts to scan a
directory while it's being deleted from the drain_for() context.
In case of such an exception new hints generation is going to be blocked
- including for materialized views, till the next space_watchdog round (in 1s).

Issues that are fixed are #4685 and #4836.

Tested as follows:
 1) Patched the code to trigger the race with (a lot) higher
    probability and ran a slightly modified hinted handoff replace
    dtest with a debug binary 100 times. A side effect of this
    testing was the discovery of #4836.
 2) Using the same patch as above, tested that there are no crashes and
    nodes survive stop/start sequences (they did not without this series)
    in the context of all hinted handoff dtests. Ran the whole set of
    tests with a dev binary 10 times.
"

* 'hinted_handoff_race_between_drain_for_and_space_watchdog_no_global_lock-v2' of https://github.com/vladzcloudius/scylla:
  hinted handoff: fix a race on a directory removal between space_watchdog and drain_for()
  hinted handoff: make taking file_update_mutex safe
  db::hints::manager::drain_for(): fix alignment
  db::hints::manager: serialize calls to drain_for()
  db::hints: cosmetics: identation and missing method qualifier
2019-10-03 14:38:00 +03:00
Avi Kivity
c6b66d197b Merge "Couple of preparatory patches for lwt" from Gleb
"
This is a collection of assorted patches that will be needed for LWT.
Most of them are trivial, but one touches a lot of files, so it has a
good chance of causing rebase headaches (I already had to rebase it on
top of Alternator). Let's push them earlier instead of carrying them in
the lwt branch.
"

* 'gleb/lwt-prepare-v2' of github.com:scylladb/seastar-dev:
  lwt: make _last_timestamp_micros static
  lwt: Add client_state::get_timestamp_for_paxos() function
  lwt: Pass client_state reference all the way to storage_proxy::query
  exceptions: Add a constructor for unavailable_exception that allows providing a custom message
  serializer: Add std::variant support
  lwt: Add missing functions to utils/UUID_gen.hh
2019-09-29 13:02:26 +03:00
Tomasz Grabiec
5b0e48f25b Merge "toppartitions: don't transport schema_ptr across shards" from Avi
When the toppartitions operation gathers results, it copies partition
keys with their schema_ptr:s. When these schema_ptr:s are copied
or destroyed, they can cause leaks or premature frees of the schema
in its original shard since reference count operations are not atomic.

Fix that by converting the schema_ptr to a global_schema_ptr during
transportation.

Fixes #5104 (direct bug)
Fixes #5018 (schema prematurely freed, toppartitions previously executed on that node)
Fixes #4973 (corrupted memory pool of the same size class as schema, toppartitions previously executed on that node)

Tests: new test added that fails with the existing code in debug mode,
manual toppartitions test
2019-09-26 17:09:54 +02:00
Avi Kivity
670f398a8a toppartitions: do not copy schema_ptr:s in item keys across shards
Copying schema_ptrs across shards results in memory corruption since
lw_shared_ptr does not use atomic operations for reference counts.
Prevent that by converting schema_ptr:s to global_schema_ptr:s before
shipping them across shards in the map operation, and converting them
back to local schema_ptr:s in the reduce operation.
2019-09-26 17:26:40 +03:00
Avi Kivity
f015bd69b7 toppartitions: compare schemas using schema::id(), not pointer to schema
This allows keys from different stages in the schema's life to compare equal.
This is safe since the partition key cannot change, unlike the rest of the schema.

More importantly, it will allow us to compare keys made local after a pass through
global_schema_ptr, which does not guarantee that the schema_ptr conversion will be
the same even when starting with the same global_schema_ptr.
2019-09-26 17:15:46 +03:00
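The comparison change in f015bd69b7 can be illustrated with a minimal sketch. `Schema`, `ItemKey`, `same_key`, and the integer id are hypothetical stand-ins invented for this example; the commit itself only says that schemas are compared via schema::id() rather than by pointer.

```cpp
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical stand-in for a table schema. The id is assumed to stay the
// same for the lifetime of the table, while the version changes on ALTER.
struct Schema {
    std::uint64_t id;
    int version;
};

// Hypothetical stand-in for a toppartitions item key.
struct ItemKey {
    std::shared_ptr<const Schema> schema;
    std::string partition_key;
};

// Compare by schema id rather than by pointer, so that two keys referring
// to different in-memory copies (or versions) of the same table still match.
inline bool same_key(const ItemKey& a, const ItemKey& b) {
    return a.schema->id == b.schema->id && a.partition_key == b.partition_key;
}
```

With a pointer comparison, two keys whose schema pointers were produced by separate conversions would never compare equal, even if they name the same partition of the same table.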
Avi Kivity
ba64ec78cf messaging_service: use rpc::tuple instead of variadic futures for rpc
Since variadic future<> is deprecated, switch to rpc::tuple for multiple
return values in rpc calls. This is more or less mechanical translation.
2019-09-26 12:09:31 +02:00
Gleb Natapov
e72a105b5e lwt: Pass client_state reference all the way to storage_proxy::query
client_state holds a state to generate monotonically increasing unique
timestamp. Queries with a SERIAL consistency level need it to generate
a paxos round.
2019-09-26 11:44:00 +03:00
Rafael Ávila de Espíndola
5af8b1e4a3 types: recreate dependent user types.
In the system.types table a user type refers to another by name. When
a user type is modified, only its entry in the table is changed.

At runtime a user type has a direct pointer to the types it uses. To
handle the discrepancy we need to recreate any dependent types when an
entry in system.types changes.

Fixes #5049

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-09-25 15:41:45 -07:00
Rafael Ávila de Espíndola
34eddafdb0 types: Don't modify the type list in db::cql_type_parser::raw_builder
With this patch db::cql_type_parser::raw_builder creates a local copy
of the list of existing types and uses that internally. By doing that
build() should have no observable behavior other than returning the
new types.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-09-25 15:41:45 -07:00
Rafael Ávila de Espíndola
d6b2e3b23b types: pass a reference to prepare_internal
We were never passing a null pointer and never saving a copy of the
lw_shared_ptr. Passing a reference is more flexible as not all callers
are required to hold the user_types_metadata in a lw_shared_ptr.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-09-25 15:40:30 -07:00
Rafael Ávila de Espíndola
4d0916a094 commitlog: Handle gate_closed_exception
Before this patch, if the _gate is closed, with_gate throws and
forward_to is not executed. When the promise<> p is destroyed it marks
its _task as a broken promise.

What happens next depends on the branch.

On master, we warn when the shared_future is destroyed, so this patch
changes the warning from a broken_promise to a gate closed.

On 3.1, we warn when the promises in shared_future::_peers are
destroyed since they no longer have a future attached: The future that
was attached was the "auto f" just before the with_gate call, and it
is destroyed when with_gate throws. The net result is that this patch
fixes the warning in 3.1.

I will send a patch to seastar to make the warning on master more
consistent with the warning in 3.1.

Fixes #4394

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190917211915.117252-1-espindola@scylladb.com>
2019-09-17 23:41:21 +02:00
Piotr Sarna
feec3825aa view: degrade shutdown bookkeeping update failures log to warn
Currently, if updating bookkeeping operations for view building fails,
we log the error message and continue. However, during shutdown,
some errors are more likely to happen due to existing issues
like #4384. To differentiate actual errors from semi-expected
errors during shutdown, the latter are now logged with a warning
level instead of error.

Fixes #4954
2019-09-16 10:13:06 +03:00
Tomasz Grabiec
79935df959 commitlog: replay: Respect back-pressure from memtable space to prevent OOM
Commit log replay was bypassing memtable space back-pressure, and if
replay was faster than memtable flush, it could lead to OOM.

The fix is to call database::apply_in_memory() instead of
table::apply(). The former blocks when memtable space is full.

Fixes #4982.

Tests:
  - unit (release)
  - manual, replay with memtable flush failing and without failing

Message-Id: <1568381952-26256-1-git-send-email-tgrabiec@scylladb.com>
2019-09-15 11:51:56 +03:00
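The difference between the two apply paths in 79935df959 can be sketched roughly as follows. This is a single-threaded model, not Scylla code: `MemtableSpace`, `apply_unchecked`, and `apply_with_backpressure` are hypothetical names, and "blocking until space is available" is modelled as a synchronous flush.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical model of a bounded memtable space.
class MemtableSpace {
    std::size_t _limit;
    std::size_t _used = 0;
    std::size_t _flushes = 0;
public:
    explicit MemtableSpace(std::size_t limit) : _limit(limit) {}

    std::size_t used() const { return _used; }
    std::size_t flushes() const { return _flushes; }

    // Models flushing the memtable to disk, which releases its memory.
    void flush() {
        _used = 0;
        ++_flushes;
    }

    // Path without back-pressure: memory use grows without bound if the
    // writer is faster than the flusher.
    void apply_unchecked(std::size_t bytes) { _used += bytes; }

    // Path with back-pressure: the write does not proceed until there is
    // room for it. Waiting is modelled here as flushing synchronously.
    void apply_with_backpressure(std::size_t bytes) {
        if (bytes > _limit) {
            throw std::invalid_argument("write larger than memtable space");
        }
        while (_used + bytes > _limit) {
            flush();
        }
        _used += bytes;
    }
};
```

In this model, replaying ten 30-byte writes into a 100-byte space through the back-pressured path never exceeds the limit, while the unchecked path ends at 300 bytes.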
Tomasz Grabiec
8517eecc28 Revert "Simplify db::cql_type_parser::parse"
This reverts commit 7f64a6ec4b.

Fixes #5011

The reverted commit exposes #3760 for all schemas, not only those
which have UDTs.

The problem is that table schema deserialization now requires keyspace
to be present. If the replica hasn't received schema changes which
introduce the keyspace yet, the write will fail.
2019-09-12 12:45:21 +02:00
Nadav Har'El
b2bd3bbc1f alternator: add "--alternator-address" configuration parameter
So far we had the "--alternator-port" option for configuring the port
on which the Alternator server listens, but the server always listened
on any address. It is important to also be able to configure the listen
address - it is useful in tests running several instances of Scylla on
the same machine, and useful in multi-homed machines with several interfaces.

So this patch adds the "--alternator-address" option, defaulting to 0.0.0.0
(to listen on all interfaces). It works like the many other "--*-address"
options that Scylla already has.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190808204641.28648-1-nyh@scylladb.com>
2019-09-11 18:01:05 +03:00
Nadav Har'El
c0518183c2 alternator: require alternator-port configuration
Until now, we always opened the Alternator port along with Scylla's
regular ports (CQL etc.). This should really be made optional.

With this patch, by default Alternator does NOT start and does not
open a port. Run Scylla with --alternator-port=8000 to open an Alternator
API port on port 8000, as was the default until now. It's also possible
to set this in scylla.yaml.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2019-09-11 12:38:31 +03:00
Tomasz Grabiec
a09479e63c Merge "Validate position in partition monotonicity" from Benny
Introduce mutation_fragment_stream_validator class and use it as a
Filter to flat_mutation_reader::consume_in_thread from
sstable::write_components to validate partition region and optionally
clustering key monotonicity.

Fixes #4803
2019-09-09 15:38:31 +02:00
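The kind of check described in a09479e63c can be sketched as a small stateful filter. This is a simplified illustration, not the mutation_fragment_stream_validator class itself: `Region`, `StreamValidator`, and `accept` are hypothetical names, and clustering keys are modelled as plain strings.

```cpp
#include <optional>
#include <string>

// Hypothetical regions, in the order they must appear within a partition.
enum class Region { partition_start, static_row, clustered, partition_end };

class StreamValidator {
    bool _check_keys;
    std::optional<Region> _last_region;
    std::optional<std::string> _last_key;
public:
    // check_keys enables the optional clustering key monotonicity check.
    explicit StreamValidator(bool check_keys) : _check_keys(check_keys) {}

    // Returns true if the fragment is valid given everything seen so far.
    // A rejected fragment leaves the validator state unchanged.
    bool accept(Region region, const std::string& clustering_key = "") {
        if (region == Region::partition_start) {
            // A partition may start only at the beginning of the stream
            // or right after the previous partition has ended.
            const bool ok = !_last_region || *_last_region == Region::partition_end;
            if (ok) {
                _last_region = region;
                _last_key.reset();
            }
            return ok;
        }
        // Regions must never go backwards within a partition.
        if (!_last_region || region < *_last_region) {
            return false;
        }
        // Only clustered fragments may repeat.
        if (region == *_last_region && region != Region::clustered) {
            return false;
        }
        if (_check_keys && region == Region::clustered) {
            // Keys must be strictly increasing.
            if (_last_key && clustering_key <= *_last_key) {
                return false;
            }
            _last_key = clustering_key;
        }
        _last_region = region;
        return true;
    }
};
```

Storing and comparing the last key is the extra cost that the key check adds, which is why the sketch makes it optional through the constructor flag.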
Benny Halevy
42f6462837 config: enable_sstable_key_validation by default in debug build
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-09-09 15:30:59 +03:00
Benny Halevy
34d306b982 config: add enable_sstable_key_validation option
Key monotonicity validation incurs overhead to store the last key and to compare against it,
so provide an option to enable or disable it (disabled by default).

Refs #4804

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-09-09 15:30:59 +03:00
Gleb Natapov
e52ebfb957 cql3: remove unused next_timestamp() function
next_timestamp() just calls get_timestamp() directly and nobody uses it
anyway.

Message-Id: <20190905101648.GO21540@scylladb.com>
2019-09-05 17:20:21 +03:00
Botond Dénes
e02b93cae1 schema_tables: convert_schema_to_mutations: return canonical_mutations
In preparation to the schema push/pull migrating to use canonical
mutations, convert the method producing the schema mutations to return a
vector of canonical mutations. The only user, MIGRATION_REQUEST verb,
converts the canonical mutations back to frozen mutations. This is very
inefficient, but this path will only be used in mixed clusters. After
all nodes are upgraded the verb will be sending the canonical mutations
directly instead.
2019-09-04 08:47:20 +03:00
Piotr Sarna
23c891923e main: make sure view_builder doesn't propagate semaphore errors
Stopping services which occurs in a destructor of deferred_action
should not throw, or it will end the program with
terminate(). View builder breaks a semaphore during its shutdown,
which results in propagating a broken_semaphore exception,
which in turn results in throwing an exception during stop().get().
In order to fix that issue, semaphore exceptions are explicitly
ignored, since they're expected to appear during shutdown.

Fixes #4875
2019-09-01 11:59:57 +03:00
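The approach in 23c891923e, ignoring only the exception that is expected during shutdown, can be sketched in standard C++. `BrokenSemaphore` and `stop_ignoring_broken_semaphore` are hypothetical names for this example; the real code works with Seastar futures and its own exception type.

```cpp
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for the exception raised when a semaphore is
// broken while something is still waiting on it.
struct BrokenSemaphore : std::runtime_error {
    BrokenSemaphore() : std::runtime_error("semaphore broken") {}
};

// Runs the given stop function. The expected shutdown exception is
// swallowed (returns true); any other exception still propagates.
template <typename StopFn>
bool stop_ignoring_broken_semaphore(StopFn&& stop) {
    try {
        std::forward<StopFn>(stop)();
        return false;
    } catch (const BrokenSemaphore&) {
        return true;
    }
}
```

Catching only the one expected type keeps genuine failures visible, which matters when the call runs from a destructor where an escaping exception would terminate the program.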
Botond Dénes
136fc856c5 treewide: silence discarded future warnings for questionable discards
This patch silences the remaining discarded future warnings: those
where it cannot be determined with reasonable confidence that this was
indeed the actual intent of the author, or where discarding the
future could lead to problems. For all those places a FIXME is added,
with the intent that these will be soon followed-up with an actual fix.
I deliberately haven't fixed any of these, even if the fix seems
trivial. It is too easy to overlook a bad fix mixed in with so many
mechanical changes.
2019-08-26 19:28:43 +03:00
Botond Dénes
fddd9a88dd treewide: silence discarded future warnings for legit discards
This patch silences those future discard warnings where it is clear that
discarding the future was actually the intent of the original author,
*and* they took the necessary precautions (handling errors). The patch
also adds some trivial error handling (logging the error) in some
places, which were lacking this, but otherwise look ok. No functional
changes.
2019-08-26 18:54:44 +03:00
Avi Kivity
67b0d379e0 main: add glue between db::config and cql3::cql_config
Copy values between the flat db::config and the hierarchical cql_config, adding
observers to keep the values updated.
2019-08-21 19:35:59 +02:00
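The glue described in 67b0d379e0 (copy the initial value, then install an observer to keep it updated) might look roughly like this. All names here (`ObservableValue`, `FlatConfig`, `CqlConfig`, `restriction_limit`, `glue`) are hypothetical and exist only for the sketch; they are not the db::config or cql3::cql_config members.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical value that notifies observers whenever it is updated.
template <typename T>
class ObservableValue {
    T _value;
    std::vector<std::function<void(const T&)>> _observers;
public:
    explicit ObservableValue(T v) : _value(std::move(v)) {}

    const T& get() const { return _value; }

    void set(T v) {
        _value = std::move(v);
        for (auto& observer : _observers) {
            observer(_value);
        }
    }

    void observe(std::function<void(const T&)> f) {
        _observers.push_back(std::move(f));
    }
};

// Hypothetical flat configuration: every option lives at the top level.
struct FlatConfig {
    ObservableValue<int> restriction_limit{100};
};

// Hypothetical hierarchical configuration: options are grouped by area.
struct CqlConfig {
    struct Restrictions {
        int limit = 0;
    };
    Restrictions restrictions;
};

// Copies the current value and installs an observer for later updates.
// The caller must keep `cql` alive for as long as `flat` can change.
inline void glue(FlatConfig& flat, CqlConfig& cql) {
    cql.restrictions.limit = flat.restriction_limit.get();
    flat.restriction_limit.observe([&cql](const int& v) {
        cql.restrictions.limit = v;
    });
}
```

The observer captures the hierarchical config by reference, so the sketch only works if that object outlives the flat one.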
Vlad Zolotarov
d253846c91 hinted handoff: fix a race on a directory removal between space_watchdog and drain_for()
The endpoint directories scanned by space_watchdog may get deleted
by the manager::drain_for().

If a deleted directory is given to a lister::scan_dir() this will end up
in an exception and as a result a space_watchdog will skip this round
and hinted handoff is going to be disabled (for all agents including MVs)
for the whole space_watchdog round.

Let's make sure this doesn't happen by serializing the scanning and deletion
using end_point_hints_manager::file_update_mutex.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-08-20 11:46:46 -04:00
Vlad Zolotarov
b34c36baa2 hinted handoff: make taking file_update_mutex safe
end_point_hints_manager::file_update_mutex is taken by space_watchdog
but while space_watchdog is waiting for it the corresponding
end_point_hints_manager instance may get destroyed by manager::drain_for()
or by manager::stop().

This will end up in a use-after-free event.

Let's change the end_point_hints_manager's API in a way that would prevent
such an unsafe locking:

   - Introduce the with_file_update_mutex().
   - Make end_point_hints_manager::file_update_mutex() method private.

Fixes #4685
Fixes #4836

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-08-20 11:26:19 -04:00
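The API change in b34c36baa2 can be illustrated with a standard C++ analogue. This is only a sketch under stated assumptions: it uses std::mutex and std::shared_ptr instead of Seastar primitives, `EndPointHintsManager` is a hypothetical stand-in class, and the real with_file_update_mutex() may be implemented differently.

```cpp
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for end_point_hints_manager.
class EndPointHintsManager {
    // The mutex lives on the heap so that its lifetime can be extended
    // beyond the lifetime of the manager itself.
    std::shared_ptr<std::mutex> _file_update_mutex = std::make_shared<std::mutex>();
public:
    // Runs func while holding the file update mutex. The local shared_ptr
    // copy keeps the mutex alive even if the manager is destroyed while
    // func is running. The guard is declared after the copy, so it is
    // destroyed (and the mutex unlocked) before the mutex can be freed.
    template <typename Func>
    auto with_file_update_mutex(Func&& func) {
        auto keep_alive = _file_update_mutex;
        std::lock_guard<std::mutex> guard(*keep_alive);
        return std::forward<Func>(func)();
    }
};
```

Because callers can no longer obtain the mutex directly, they cannot hold a reference to it that outlives its owner, which is the unsafe pattern the commit removes.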
Vlad Zolotarov
dbad9fcc7d db::hints::manager::drain_for(): fix alignment
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-08-20 10:58:36 -04:00
Vlad Zolotarov
7a12b46fc9 db::hints::manager: serialize calls to drain_for()
If drain_for() is running together with itself: one instance for the local
node and one for some other node, erasing of elements from the _ep_managers
map may lead to a use-after-free event.

Let's serialize drain_for() calls with a semaphore.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-08-20 10:58:36 -04:00
Vlad Zolotarov
09600f1779 db::hints: cosmetics: identation and missing method qualifier
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-08-20 10:58:36 -04:00
Piotr Sarna
3cc5a04301 db,view: wrap view update generation in stream scheduling group
Generating view updates is used by streaming, so the service itself
should also run under the matching scheduling group.
2019-08-20 00:24:50 +02:00
Calle Wilund
1afc899e37 type_parser: Fix/improve exception messages
Removes long-standing FIXME for message detail
Also simplifies some code, removing duplication.

Message-Id: <20190812134144.2417-1-calle@scylladb.com>
2019-08-12 17:03:43 +03:00
Avi Kivity
e6cde72d2b Merge "Fix cql server admission control to take all leftover work into account" from Gleb
"
Current admission control takes a permit when a CQL request starts and
releases it when the reply is sent, but some requests may leave background
work behind after that point (some because there is genuine background
work to do, like completing a write or doing a read repair, and some because
a read/write may be stuck in a queue longer than the request's timeout), so
after Scylla replies with a timeout some resources are still occupied.

The series fixes this by passing the permit down to storage_proxy where
it is held until all background work is completed.

Fixes #4768
"

* 'gleb/admission-v3' of github.com:scylladb/seastar-dev:
  transport: add a metric to follow memory available for service permit.
  storage_proxy: store a permit in a read executor
  storage_proxy: store a permit in a write response handler
  Pass service permit to storage_proxy
  transport: introduce service_permit class and use it instead of semaphore_units
  transport: hold admission a permit until a reply is sent
  transport: remove cql server load balancer
2019-08-12 11:02:37 +03:00
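The idea behind this series, holding the permit until the last piece of work that needs it has finished, can be sketched with reference counting. `Permit` and `ServicePermit` below are hypothetical names for this example and the pool is a plain counter; the real service_permit class wraps Seastar semaphore units.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical permit: takes units from a pool on construction and
// returns them on destruction.
class Permit {
    std::size_t& _pool;
    std::size_t _units;
public:
    Permit(std::size_t& pool, std::size_t units) : _pool(pool), _units(units) {
        _pool -= _units;
    }
    ~Permit() { _pool += _units; }

    Permit(const Permit&) = delete;
    Permit& operator=(const Permit&) = delete;
};

// A shareable handle: the units are released only when the last holder
// (the request itself or any background work it left behind) drops it.
using ServicePermit = std::shared_ptr<Permit>;
```

For example, if the request path and a background write both hold a `ServicePermit`, sending the reply drops only one reference, and the units return to the pool when the background write completes.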
Gleb Natapov
6a4207f202 Pass service permit to storage_proxy
The current CQL transport code acquires a permit before processing a query and
releases it when the query gets a reply, but some queries leave work behind.
If the work is allowed to accumulate without any limit, a server may
eventually run out of memory. To prevent that, the permit system should
account for the background work as well. The patch is a first step in
this direction. It passes a permit down to storage proxy where it will
later be held by background work.
2019-08-12 10:20:43 +03:00
Gleb Natapov
7e3805ed3d transport: remove cql server load balancer
It is buggy, unused, and unnecessarily complicates the code.
2019-08-11 16:08:52 +03:00
Calle Wilund
d3410f0e48 config: Add rpc_interface_prefer_ipv6 parameter
As already existing in scylla.yaml
2019-08-06 08:32:10 +00:00
Calle Wilund
0028cecb8e config: Add listen_interface_perfer_ipv6 parameter
As already existing in scylla.yaml.
https://github.com/apache/cassandra/blob/cassandra-3.11/conf/cassandra.yaml#L622
2019-08-06 08:32:10 +00:00
Calle Wilund
39d18178eb config.cc: Fix enable_ipv6_dns_lookup actual param name
When adding the option (and iterating through config refactoring),
the member name and the config param name got out of sync.
2019-08-06 08:32:09 +00:00
Tomasz Grabiec
bf70ee3986 config, exceptions: Add helper for handling internal errors
The handler is intended to be called when internal invariants are
violated and the operation cannot safely continue. The handler either
throws (default) or aborts, depending on configuration option.

Passing --abort-on-internal-error on the command line will switch to
aborting.

The reason we don't abort by default is that it may bring the whole
cluster down and cause unavailability, while it may not be necessary
to do so. It's safer to fail just the affected operation,
e.g. repair. However, failing the operation with an exception leaves
little information for debugging the root cause. So the idea is that the
user would enable aborts on only one of the nodes in the cluster to
get a core dump and not bring the whole cluster down.
2019-08-02 11:13:54 +02:00
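The throw-or-abort behaviour described in bf70ee3986 can be sketched as below. The flag and function names are assumptions made for this example and std::runtime_error is a placeholder; the commit does not show the helper's actual signature or exception type.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical process-wide switch, standing in for the effect of
// passing --abort-on-internal-error on the command line.
inline bool abort_on_internal_error = false;

// Called when an internal invariant is violated and the operation cannot
// safely continue. Throws by default, so only the affected operation
// fails; aborts (producing a core dump) when the switch is enabled.
[[noreturn]] inline void on_internal_error(const std::string& msg) {
    if (abort_on_internal_error) {
        std::abort();
    }
    throw std::runtime_error(msg);
}
```

Keeping the default at "throw" matches the reasoning in the commit message: an operator can enable aborting on a single node to capture a core dump without risking the whole cluster.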
Avi Kivity
e03c7003f1 toppartitions: fix race between listener removal and reads
Data listener reads are implemented as flat_mutation_readers, which
take a reference to the listener and then execute asynchronously.
The listener can be removed between the time when the reference is
taken and actual execution, resulting in a dangling pointer
dereference.

Fix by using a weak_ptr to avoid writing to a destroyed object. Note that writes
don't need protection because they execute atomically.

Fixes #4661.

Tests: unit (dev)
2019-07-22 13:26:18 +02:00
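The fix in e03c7003f1 can be illustrated with std::weak_ptr. Note the assumption: Seastar has its own weak_ptr type with different ownership rules, so this sketch only shows the general pattern of checking that the target still exists before touching it. `Listener` and `DeferredRead` are hypothetical names.

```cpp
#include <memory>

// Hypothetical data listener that counts the reads it observes.
struct Listener {
    int reads_seen = 0;
    void on_read() { ++reads_seen; }
};

// Hypothetical deferred read: it records a weak reference when created
// and runs later, possibly after the listener has been removed.
class DeferredRead {
    std::weak_ptr<Listener> _listener;
public:
    explicit DeferredRead(const std::shared_ptr<Listener>& listener)
        : _listener(listener) {}

    // Notifies the listener if it still exists. Returns false instead of
    // dereferencing a dangling pointer when it has been removed.
    bool execute() {
        if (auto listener = _listener.lock()) {
            listener->on_read();
            return true;
        }
        return false;
    }
};
```

A read created before the listener is removed simply becomes a no-op for that listener when it finally executes.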
Rafael Ávila de Espíndola
636e2470b1 Always close commitlog files
We were using segment::_closed to decide whether _file was already
closed. Unfortunately they are not exactly the same thing. As far as
I understand it, segments can be closed and reused without actually
closing the file.

Found with a seastar patch that asserts on destroying an open
append_challenged_posix_file_impl.

Fixes #4745.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190721171332.7995-1-espindola@scylladb.com>
2019-07-22 10:08:57 +03:00
Piotr Sarna
c1d5aef735 db: add system_schema.computed_columns
Information on which columns of a table are 'computed' is now kept
in system_schema.computed_columns system table.
2019-07-19 11:58:42 +02:00
Piotr Sarna
17c323c096 database: add fixing previous secondary index schemas
If a schema was created before computed columns were implemented,
its token column may not have been marked as computed.
To remedy this, if no computed column is found, the schema
will be recreated.
The code will work correctly even without this patch in order to support
upgrading from legacy versions, but it's still important: it transforms
token columns from the legacy format to new computed format, which will
eventually (after a few release cycles) allow dropping the support for
legacy format altogether.
2019-07-19 11:58:42 +02:00
Piotr Sarna
3c5dd94306 view: remove unused token_for function
The function was only used once in code removed in this series.
2019-07-19 11:58:42 +02:00
Piotr Sarna
6a6871aa0e view: check for computed columns in view
Currently, having a 'computed' column in view update generation
indicates that token value needs to be generated and assigned to it.
2019-07-19 11:58:42 +02:00
Piotr Sarna
a0e02df36a service: add computed columns feature
Computed columns feature should be checked before creating
index schemas the new way - by adding computed column names
to system_schema.computed_columns.
2019-07-19 11:58:42 +02:00
Tomasz Grabiec
14700c2ac4 Merge "Fix the system.size_estimates table" from Kamil
Fixes a segfault when querying for an empty keyspace.

Also, fixes an infinite loop on smp > 1. Queries to
system.size_estimates table which are not single-partition queries
caused Scylla to go into an infinite loop inside
multishard_combining_reader::fill_buffer. This happened because
multishard_combining_reader assumes that shards return rows belonging
to separate partitions, which was not the case for
size_estimates_mutation_reader.

Fixes #4689.
2019-07-15 22:09:30 +02:00