"Commit 4cd9c4c0c5441cf55e280c6f2f2e5529426b9c98 introduced a minor
issue: a wrong snitch instance may be used when updating a Gossiper state
(if I/O CPU is different from CPU0).
In order to fix this issue a local snitch instance on CPU0 should be used,
just like a Gossiper local instance.
We have to move some interfaces to i_endpoint_snitch
from being private in a gossiping_property_file_snitch in order to be
able to access it using snitch_ptr handle."
Don't ignore yet another returned future in reload_configuration().
Since commit 5e8037b50a
storage_service::gossip_snitch_info() returns a future.
This patch takes this into an account.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
When we access a gossiper instance we use a _gossip_started
state of a snitch, which is set in a gossiper_starting() method.
gossiper_starting() method however is invoked by a gossiper on CPU0
only therefore the _gossip_started snitch state will be set for an
instance on CPU0 only.
Therefore instead of synchronizing the _gossip_started state between
all shards we just have to make sure we check it on the right CPU,
which is CPU0.
This patch fixes this issue.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Adjust the interface and distribution of prefer_local parameter read
from a snitch property file with the rest of similar parameters (e.g. dc and rack):
they are read and their values are distributed (copied) across all shards'
instances.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Make reload_gossiper_state() be a virtual method
of a base class in order to allow calling it using a snitch_ptr
handle.
A base class already has a ton of virtual methods so no harm is
done performance-wise. Using virtual methods instead of doing
dynamic_cast results in a much cleaner code however.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Move the member and add an access method.
This is needed in order to be able to access this state using
snitch_ptr handle.
This also allows to get rid of ec2_multi_region_snitch::_helper_added
member since it duplicates _gossip_started semantics.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
* seastar 9ae6407...258daf9 (6):
> rpc server: Add pending and sent messages to server
> scripts: posix_net_conf.sh: Use a generic logic for RPS configuring
> scripts: posix_net_conf.sh: allow passing a NIC name as a parameter
> doc: link to the tutorial
> tutorial: begin documenting the network API
> slab: remove bogus uintptr_t definition
"In 5e8037b50a (gossip: Futurize
add_local_application_state()) , we futurized add_local_application_state.
However, not all of the callers are futurized. Fix it up."
"- Fix snitch names from EC2XXX to Ec2XXX to align with configuration.
- Copy cassandra-rackdc.properties file to /var/lib/scylla/conf
- Set SCYLLA_HOME before booting process"
During testing build, the debugging statement at the end
of the function body (after return statements) causes compilation to
fail due to the flag -Werror=return-type:
service/storage_service.cc: In member function ‘future<> service::storage_service::clear_snapshot(sstring, std::vector<basic_sstring<char, unsigned int, 15u> >)’:
service/storage_service.cc:1358:1: error: control reaches end of non-void function [-Werror=return-type]
Which traces back to 21f84d77. Let's attach a then_wrapped()
clause to parallel_for_each() adding the debug message as
suggested by Avi.
CC: Glauber Costa <glommer@scylladb.com>
Signed-off-by: Lucas Meneghel Rodrigues <lmr@scylladb.com>
We are ignoring the future returned by seastar::async. Futurize it so
caller can wait for the application state to be actually applied.
In addition, dropping the unused add_local_application_states function.
We use boost::any to convert to and from database values (stored in
serlialized form) and native C++ values. boost::any captures information
about the data type (how to copy/move/delete etc.) and stores it inside
the boost::any instance. We later retrieve the real value using
boost::any_cast.
However, data_value (which has a boost::any member) already has type
information as a data_type instance. By teaching data_type intances about
the corresponding native type, we can elimiante the use of boost::any.
While boost::any is evil and eliminating it improves efficiency somewhat,
the real goal is growing native type support in data_type. We will use that
later to store native types in the cache, enabling O(log n) access to
collections, O(1) access to tuples, and more efficient large blob support.
"gossiping_property_file_snitch checks its property
file (cassandra-rackdc.properties) for changes every minute and
if there were changes it re-registers the helper and initiates
re-read of the new DC and Rack values in the corresponding places.
Therefore we need the ability to unregister/register the corresponding subscriber
at the same time when a subscriber list is possibly iterated by
some other asynchronous context on the current CPU.
The current gossiper implementation assumes that subscribers list may not be
changed from the context different from the one that iterates on their list.
So, this had to be fixed.
There was also missing an update_endpoint(ep) interface in the locator::topology
class and the corresponding token_metadata::update_topology(ep) wrapper.
Also there were some bugs in the gossiping_property_file::reload_configuration()
method."
On hindsight, it doesn't make much sense to print an
empty string, so let's only print stdout if it's non
None, non empty.
Signed-off-by: Lucas Meneghel Rodrigues <lmr@scylladb.com>
This functions were empty and now they have the intended code:
- Register the reconnectable_snitch_helper if "prefer_local"
parameter was given the TRUE value.
- Set the application INTERNAL_IP state to listen_address().
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
- Invoke reload_gossiper_state() and gossip_snitch_info() on CPU0 since
gossiper is effectively running on CPU0 therefore all methods
modifying its state should be invoked on CPU0 as well.
- Don't invoke any method on external "distributed" objects unless their
corresponding per-shard service object have already been initialized.
- Update a local Node info in a storage_service::token_metadata::topology
when reloading snitch configuration when DC and/or Rack info has changed.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Introduce a subscribers_list class that exposes 3 methods:
- push_back(s) - adds a new element s to the back of the list
- remove(s) - removes an element s from the list
- for_each(f) - invoke f on each element of the list
- make a subscriber_list store shared_ptr to a subscriber
to allow removing (currently it stores a naked pointer to the object).
subscribers_list allows push_back() and remove() to be called while
another thread (e.g. seastar::async()) is in the middle of for_each().
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Simplify subscribers_list::remove() method.
- load_broadcaster: inherit from enable_shared_from_this instead
of async_sharded_service.
* seastar 9d8913a...9ae6407 (2):
> core/memory.cc: Declare member min_free_pages from cpu_pages struct
> http: All http replies should have version set
It may happen that the user will migrate a table to Scylla which
compaction strategy isn't supported yet, such as Data tiered.
Let's handle that by falling back to size-tiered compaction
strategy and printing a warning message.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Do not hold the api lock while streaming the data since it might take a
long time, so we need to reconcile other operations while we are in the
middle of rebuild.