cql_query_test did not configure the Broadcast address before
it was used for the first time.
The Broadcast address is an essential part of a Node's configuration.
There is an assert in utils::fb_utilities::get_broadcast_address()
that ensures the broadcast address has been properly configured
before its first use; without this patch, that assert is triggered.
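
The guard described above can be sketched roughly as follows. This is a
minimal illustration, not the actual fb_utilities code: the class name
broadcast_address_holder and its members are hypothetical, and the address
is kept as a plain string for brevity.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the broadcast-address holder (names are
// illustrative only; the real code lives in utils::fb_utilities).
class broadcast_address_holder {
    std::optional<std::string> _addr;
public:
    // Must be called once during startup (or test setup).
    void set(std::string addr) { _addr = std::move(addr); }

    bool is_set() const { return _addr.has_value(); }

    // Asserts that the address was configured before its first use.
    const std::string& get() const {
        assert(_addr && "broadcast address used before it was configured");
        return *_addr;
    }
};
```

A test would call set() in its setup code before anything reaches get().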
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
"Fixes for commitlog (debug) test failures related to shutdowns.
Note that most of the fixes here are only really related to the tests
failing, not to real scylla runs. However, at some point we'll
have real shutdown in scylla as well (not just hard exit), at which
point this becomes more relevant there as well.
Main issue was post-flush continuation chains for stats update
remaining unexecuted, due to task reordering, once the commitlog
object itself had been destroyed. This could have been handled by just
making the stats object a shared pointer, but in general it seems more
prudent to enforce having all tasks completed after shutdown.
* Change commitlog shutdown to use gate+wait for all outstanding ops
(flush, write, timer). Thus we can ensure everything is finished
when returning from "shutdown".
* Fix bug with "commitlog::clear" (test method) not doing the intended deed
* Most importantly, fix the tests themselves, cleaning up old crud, and
fixing invalid assumptions (CL behaviour changed quite a bit since tests
were created), and remove races.
Disclaimer: I've _never_ managed to reproduce the debug tests failing
like in jenkins locally (though I managed to provoke other failures),
but at least jenkins runs with this series have been clean. Knock knock."
Now that #475 is solved and read_indexes() is guaranteed to return disjoint
sets of keys, the sstable key reader can be simplified: only two key
lookups are needed (the first and the last one) and there is no need for
range splitting.
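
As a rough illustration of why two lookups suffice once the key sets are
disjoint and ordered: everything between the position of the first key and
the position of the last key belongs to the range. The sketch below uses a
sorted in-memory vector and a hypothetical key_range() helper; it is not the
sstable reader code, and it assumes lo <= hi.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns the half-open index range [begin, end) of
// the keys in sorted_keys that fall inside the closed interval [lo, hi].
// Assumes sorted_keys is sorted ascending and lo <= hi.
std::pair<std::size_t, std::size_t> key_range(
        const std::vector<std::string>& sorted_keys,
        const std::string& lo,
        const std::string& hi) {
    // Lookup 1: position of the first key that is >= lo.
    auto first = std::lower_bound(sorted_keys.begin(), sorted_keys.end(), lo);
    // Lookup 2: position just past the last key that is <= hi.
    auto last = std::upper_bound(first, sorted_keys.end(), hi);
    return {static_cast<std::size_t>(first - sorted_keys.begin()),
            static_cast<std::size_t>(last - sorted_keys.begin())};
}
```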
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
"This series adds the mighty EC2MultiRegionSnitch and some missing
multi-DC related functionality:
- Use the proper Broadcast Address: either the one from the
.yaml configuration (if present) or the one configured by some
scylla component (e.g. snitch).
- Introduce the ability to switch to internal IPs when connecting
to Nodes in the same data center.
- Store the known internal IPs in system.peers table and
load them immediately during boot.
This series also contains some related fixes done on the way."
* Do close + fsync on all segments
* Make sure all pending cycle/sync ops are guarded with a gate, and
explicitly wait for this gate on shutdown to make sure we don't
leave hanging flushes in the task queue.
* Fix bug where "commitlog::clear" did not in fact shut down the CL,
due to "_shutdown" being already set.
Note: This is (at least currently) not an issue for anything other than tests,
since we don't shutdown the normal server "properly", i.e. the CL itself
will not go away, and hanging tasks are ok, as long as the sync-all is done
(which it was previously). But, to make tests predictable, and future-proof
the CL, this is better.
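
The gate+wait idea can be sketched as below. Note the assumptions: this is a
thread-based stand-in written with the standard library only, whereas the
real code runs on futures and uses the framework's own gate; simple_gate and
its method names are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical, thread-based gate: each outstanding op (flush, write,
// timer) enters before it starts and leaves when it completes.
class simple_gate {
    std::mutex _m;
    std::condition_variable _cv;
    std::size_t _pending = 0;
    bool _closed = false;
public:
    // Returns false once the gate is closed, so no new op may start.
    bool try_enter() {
        std::lock_guard<std::mutex> g(_m);
        if (_closed) {
            return false;
        }
        ++_pending;
        return true;
    }

    // Called by an op that previously entered, when it is done.
    void leave() {
        std::lock_guard<std::mutex> g(_m);
        assert(_pending > 0);
        if (--_pending == 0) {
            _cv.notify_all();
        }
    }

    // Refuses new ops and blocks until all outstanding ops have left.
    void close() {
        std::unique_lock<std::mutex> l(_m);
        _closed = true;
        _cv.wait(l, [this] { return _pending == 0; });
    }
};
```

With this shape, "shutdown" amounts to close(): once it returns, nothing that
entered the gate is still pending, and nothing new can enter.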
sstable level is set to zero by default, but it may be set to
a different value if a new sstable is the result of leveled
compaction. This is done outside write_components.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
We were incorrectly setting s.header.min_index_interval to
BASE_SAMPLING_LEVEL, which luckily is the default value of
min index interval. BASE_SAMPLING_LEVEL was also used as
the min index interval when checking if the estimated
number of summary entries is greater than the limit.
To fix these problems, get the min index interval from the schema
and use this value to check the limit.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
In addition to what EC2Snitch does, this snitch registers
a reconnectable_snitch_helper that will make messaging_service
connect to internal IPs when it connects to nodes in the same
data center as the current Node.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v4:
- Added dual license in newly added files.
New in v3:
- Returned the Apache license.
New in v2:
- Update the license to the latest version. ;)
Add utils::fb_utilities::set_broadcast_address().
Set it to either broadcast_address or listen_address configuration value
if appropriate values are set. If none of the two values above
are set - abort the application.
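
The selection logic can be sketched as follows. This is an illustration
under assumptions: pick_broadcast_address() is a hypothetical helper,
addresses are plain strings, and giving broadcast_address precedence over
listen_address is assumed rather than stated above.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: choose the address to broadcast from the two
// configuration values. Returns std::nullopt when neither is usable.
// Assumption: broadcast_address takes precedence over listen_address.
std::optional<std::string> pick_broadcast_address(
        const std::optional<std::string>& broadcast_address,
        const std::optional<std::string>& listen_address) {
    if (broadcast_address && !broadcast_address->empty()) {
        return broadcast_address;
    }
    if (listen_address && !listen_address->empty()) {
        return listen_address;
    }
    return std::nullopt;
}
```

The caller would abort the application when the result is empty and pass
the chosen value to set_broadcast_address() otherwise.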
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Simplify the utils::fb_utilities::get_broadcast() logic.
reconnectable_snitch_helper implements i_endpoint_state_change_subscriber
and triggers reconnect using the internal IP to the nodes in the
same data center when one of the following events happens:
- on_join()
- on_change() - when INTERNAL_IP state is changed
- on_alive()
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v4:
- Added dual license for newly added files.
New in v3:
- Fix reconnect() logic.
- Returned the Apache license.
- Check if the new local address is not already stored in the cache.
- Get rid of get_ep_addr().
New in v2:
- Update the license to the latest version. ;)
Added load_config() function that reads AWS info and property file
and distributes the read values on all shards.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
This map will contain the (internal) IPs corresponding to specific Nodes.
The mapping is also stored in the system.peers table.
So, instead of always connecting to the external IP, messaging_service::get_rpc_client()
will query _preferred_ip_cache and will connect to the external IP only if
there is no entry for a given Node.
We will call init_local_preferred_ip_cache() at the end of system table init.
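
The lookup-with-fallback can be sketched like this. The container type and
the signature are assumptions made for illustration (addresses are plain
strings here); only the "use the cached internal IP, else the external one"
behaviour is taken from the description above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical cache type: external (public) IP -> preferred (internal) IP.
using preferred_ip_cache = std::unordered_map<std::string, std::string>;

// Returns the cached internal IP for a Node if there is one; otherwise
// falls back to the external IP that was passed in.
std::string get_preferred_ip(const preferred_ip_cache& cache,
                             const std::string& external_ip) {
    auto it = cache.find(external_ip);
    return it != cache.end() ? it->second : external_ip;
}
```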
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Improved the _preferred_ip_cache description.
- Code styling issues.
New in v3:
- Make get_internal_ip() public.
- get_rpc_client(): restore the get_preferred_ip() usage dropped
in v2 by mistake during rebase.
This function erases shard_info objects from all _clients maps.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Use remove_rpc_client_one() instead of direct map::erase().
- Ensure messaging_service::stop() blocks until all rpc_protocol::client::stop()
are over.
- Remove the async code from rpc_protocol_client_wrapper destructor - call
stop() everywhere it's needed instead. Ensure that
rpc_protocol_client_wrapper is always "stopped" when its destructor is called.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v3:
- Code style fixes.
- Killed rpc_protocol_client_wrapper::_stopped.
- Killed rpc_protocol_client_wrapper::~rpc_protocol_client_wrapper().
- Use std::move() for saving shared pointer before
erasing the entry from _clients in
remove_rpc_client_one() in
order to avoid extra ref count bumping.
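
The std::move() point can be illustrated with a small sketch. The client
type, the map type and remove_client() are hypothetical stand-ins and not
the messaging_service code; the sketch only shows that moving the pointer
out of the map entry before erasing it does not touch the reference count.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical client with a stop() that must run before destruction.
struct client {
    bool stopped = false;
    void stop() { stopped = true; }
};

using client_map = std::unordered_map<std::string, std::shared_ptr<client>>;

// Removes the entry for id and hands the client back so the caller can
// stop it. Returns nullptr if there is no such entry.
std::shared_ptr<client> remove_client(client_map& clients, const std::string& id) {
    auto it = clients.find(id);
    if (it == clients.end()) {
        return nullptr;
    }
    // Move instead of copy: ownership is transferred out of the map entry,
    // so the reference count is not incremented and then decremented.
    auto c = std::move(it->second);
    clients.erase(it);
    return c;
}
```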
This makes the code cleaner. It would also require fewer changes
if we decide to increase the _clients size in the future.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
get_preferred_ips() returns all preferred_ip's stored in system.peers
table.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Get rid of extra std::move().
Scylla is not a "daemon" (which forks twice), but it can be a "fork" (forks once) when we don't use "exec" to call startup scripts.
Fixes #495
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
"This series adds two types of functionality to the storage_proxy: it adds the
API that returns the timeout constants from the config, and it aligns the
metrics of the read, write and range to origin StorageProxy metrics."
read_indexes() will not work for a column family whose minimum
index interval is different from the sampling level or whose sampling
level is lower than BASE_SAMPLING_LEVEL.
That's because the function was using the sampling level to determine
the interval between indexes that are stored by the index summary.
Instead, a method from downsampling will be used to calculate the
effective interval based on both the minimum_index_interval and
sampling_level parameters.
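
To show how the two parameters interact, here is a simplified sketch. It
rests on assumptions not stated above: that BASE_SAMPLING_LEVEL is 128 and
that the average effective interval is minimum_index_interval scaled by
BASE_SAMPLING_LEVEL / sampling_level, as in the origin's index summary. The
per-entry downsampling method used by the patch may differ in detail.

```cpp
#include <cassert>

// Assumed value, mirroring the origin's downsampling constant.
constexpr int BASE_SAMPLING_LEVEL = 128;

// Hypothetical helper: average number of index entries between two
// consecutive summary entries. At full sampling this equals
// min_index_interval; halving sampling_level doubles the interval.
double effective_index_interval(int min_index_interval, int sampling_level) {
    assert(min_index_interval > 0);
    assert(sampling_level > 0 && sampling_level <= BASE_SAMPLING_LEVEL);
    return static_cast<double>(BASE_SAMPLING_LEVEL) / sampling_level
            * min_index_interval;
}
```

Using sampling_level alone as the interval only gives the right answer when
it happens to equal the effective interval, which is why both parameters are
needed.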
Fixes issue #474.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>