DowngradingConsistencyRetryPolicy uses live replicas count from
Unavailable exception to adjust CL for retry, but when there are pending
nodes CL is increased internally by a coordinator and that may prevent
retried query from succeeding. Adjust live replica count in case of
pending node presence so that retried query will be able to proceed.
Fixes#2535
Message-Id: <20170710085238.GY2324@scylladb.com>
This patch makes storage proxy to choose replicas to read from base on
their cache hit rates. Replicas with higher cache hit rates will see
more requests while replicas with lower hit rates will see less. Local
node has a special bonus and will get more requests even if another node
has slightly higher cache hit rate (same goes for local vs remote DC),
but after the patch it is no longer guarantied that a coordinator node
will be chosen as a replica for the read (if the feature is enabled).
Currently storage proxy has to loop over remaining replicas to search
for suitable extra replica, but doing it in filter_for_query() is
extremely easy, so do it there instead.
Merge filter_for_query_dc_local() functionality into filter_for_query().
This is more efficient since filter_for_query_dc_local() partitions
endpoints into 'local' and 'remote' set but filter_for_query() already
does it for CL=LOCAL so for such queries we needlessly do it twice.
During bootstrapping additional copies of data has to be made to ensure
that CL level is met (see CASSANDRA-833 for details). Our code does
that, but it does not take into account that bootstraping node can be
dead which may cause request to proceed even though there is no
enough live nodes for it to be completed. In such a case request neither
completes nor timeouts, so it appear to be stuck from CQL layer POV. The
patch fixes this by taking into account pending nodes while checking
that there are enough sufficient live nodes for operation to proceed.
Fixes#965
Message-Id: <20160303165250.GG2253@scylladb.com>
Move unavailable_exception to exceptions.hh where other CQL transport
level exceptions are defined in.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Move 'consistency_level' enumeration to a separate header file to fix
dependency issues that arise when we move 'unavailable_exception' to
exceptions.hh.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Cleanup consistency_level.hh by removing untranslated code that's been
sitting in the tree for a while.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Transport layer expects to get error code in an exception of type
exceptions::cassandra_exception. Fix code to use it as a base for
all user visible exceptions and put correct error code there.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Use std::partition_copy() and boost::range::algorithm::partition().
- Don't use std::move() when returning a local vector variable.
Works only if all replicas (participating in CL) has the same live
data. Does not detects mismatch in tombstones (no infrastructure yet).
Does not report timeout yet.