A shared sstable must be compacted by all shards before it can be deleted.
Since we're stoping, that's not going to happen. Cancel those pending
deletions to let anyone waiting on them to continue.
While Seastar in general can accept any parameter for its I/O queues, Scylla
in particular shouldn't run with them disabled. Such will be the status when
the max-io-requests parameter is not enabled.
On top of that, we would like to have enough depth per I/O queue not to allow
for shard-local parallelism. Therefore, we will require a minimum per-queue
capacity of 4. In machines where the disk iodepth is not enough to allow for 4
concurrent requests per shard, one should reduce the number of I/O queues.
For --max-io-requests, we will check the parameter itself. However, the
--num-io-queues parameter is not mandatory, and given enough concurrent
requests, Seastar's default configuration can very well just be doing the right
thing. So for that, we will check the final result of each I/O queue.
As it is the case with other checks of the sorts, this can be overridden by
the --developer-mode switch.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <63bf7e91ac10c95810351815bb8f5e94d75592a5.1458836000.git.glauber@scylladb.com>
Since initialization now runs in a thread storage, messaging and
gossiper services initialization code may take advantage of it too.
Message-Id: <20160323094732.GF2282@scylladb.com>
Fix the validation error message to look like this:
Scylla version 666.development-20160316.49af399 starting ...
WARN 2016-03-17 12:24:15,137 [shard 0] config - Option partitioner is not (yet) used.
WARN 2016-03-17 12:24:15,138 [shard 0] init - NOFILE rlimit too low (recommended setting 200000, minimum setting 10000; you may run out of file descriptors.
ERROR 2016-03-17 12:24:15,138 [shard 0] init - Bad configuration: invalid 'listen_address': eth0: boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> > (Invalid argument)
Exiting on unhandled exception of type 'bad_configuration_error': std::exception
Instead of:
Exiting on unhandled exception of type 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >': Invalid argument
Fixes#1051.
Message-Id: <1458210329-4488-1-git-send-email-penberg@scylladb.com>
Defer registering services to the API server until commitlog has been
replayed to ensure that nobody is able to trigger sstable operations via
'nodetool' before we are ready for them.
Message-Id: <1458116227-4671-1-git-send-email-penberg@scylladb.com>
Streaming is used by bootstrap and repair. Streaming uses storage_proxy
class to apply the frozen_mutation and db/column_family class to
invalidate row cache. Defer the initalization just before repair and
bootstrap init.
Message-Id: <8e99cf443239dd8e17e6b6284dab171f7a12365c.1458034320.git.asias@scylladb.com>
Register the REPAIR_CHECKSUM_RANGE messaging service verb handler after
we have replayed the commitlog to avoid responding with bogus checksums.
Message-Id: <1458027934-8546-1-git-send-email-penberg@scylladb.com>
Defer registering migration manager RPC verbs after commitlog has has
been replayed so that our own schema is fully loaded before other other
nodes start querying it or sending schema updates.
Message-Id: <1457971028-7325-1-git-send-email-penberg@scylladb.com>
Deletion of previous stale, temporary SSTables is done by Shard0. Therefore,
let's run Shard0 first. Technically, we could just have all shards agree on the
deletion and just delete it later, but that is prone to races.
Those races are not supposed to happen during normal operation, but if we have
bugs, they can. Scylla's Github Issue #1014 is an example of a situation where
that can happen, making existing problems worse. So running a single shard
first and getting making sure that all temporary tables are deleted provides
extra protection against such situations.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
While looking at initialization code I felt like my head is going to
explode. Moving initialization into a thread makes things a little bit
better. Only lightly tested.
Message-Id: <20160310163142.GE28529@scylladb.com>
We start services like gossiper before system keyspace is initialized
which means we can start writing too early. Shuffle code so that system
keyspace is initialized earlier.
Refs #1014
Message-Id: <1457593758-9444-1-git-send-email-penberg@scylladb.com>
We require SSE 4.2 (for commitlog CRC32), verify it exists early and bail
out if it does not.
We need to check early, because the compiler may use newer instructions
in the generated code; the earlier we check, the lower the probability
we hit an undefined opcode exception.
Message-Id: <1456665401-18252-1-git-send-email-avi@scylladb.com>
When shutting down a node gracefully, this patch asks all ongoing repairs
started on this node to stop as soon as possible (without completing
their work), and then waits for these repairs to finish (with failure,
usually, because they didn't complete).
We need to do this, because if the repair loop continues to run while we
start destructing the various services it relies on, it can crash (as
reported in #699, although the specific crash reported there no longer
occurs after some changes in the streaming code). Additionally, it is
important that to stop the ongoing repair, and not wait for it to complete
its normal operation, because that can take a very long time, and shutdown
is supposed to not take more than a few seconds.
Fixes#699.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1455218873-6201-1-git-send-email-nyh@scylladb.com>
Fixes#868.
Registerring exit hooks while reactor is already iterating over exit
hooks is not allowed and currently leads to undefined behavior
observed in #868. While we should make the failure more user friendly,
registering exit hooks concurrently with shutdown will not be allowed.
We don't expect exit hooks to be registered after exit starts because
this would violate the guarantee which says that exit hooks are
executed in reverse order of registration. Starting exit sequence in
the middle of initialization sequence would result in use after free
errors. Btw, I'm not sure if currently there's anything which prevents
this
To solve this problem, move the exit hook to initilization
sequence. In case of tests, the cleanup has to be called explicitly.
supervisor_notify() calls periodically, to log message on systemd.
So raise(SIGSTOP) will called multiple times, upstart doesn't expected that.
We need to call it just one time.
Fixes#846
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
We register engine().at_exit() callbacks when we initialize the services. We
do not really call the callbacks at the moment due to #293.
It is pretty hard to see the whole picture in which order the services
are shutdown. Instead of for each services to register a at_exit()
callbacks, I proposal to have a single at_exit() callback which do the
shutdown for all the services. In cassandra, the shutdown work is done
in storage_service::drain_on_shutdown callbacks.
In this patch, the drain_on_shutdown is executed during shutdown.
As a result, the proper gossip shutdown is executed and fixes#790.
With this patch, when Ctrl-C on a node, it looks like:
INFO [shard 0] storage_service - Drain on shutdown: starts
INFO [shard 0] gossip - Announcing shutdown
INFO [shard 0] storage_service - Node 127.0.0.1 state jump to normal
INFO [shard 0] storage_service - Drain on shutdown: stop_gossiping done
INFO [shard 0] storage_service - CQL server stopped
INFO [shard 0] storage_service - Drain on shutdown: shutdown rpc and cql server done
INFO [shard 0] storage_service - Drain on shutdown: shutdown messaging_service done
INFO [shard 0] storage_service - Drain on shutdown: flush column_families done
INFO [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
INFO [shard 0] storage_service - Drain on shutdown: done
The API needs to be available at an early stage of the initialization,
on the other hand not all the specific APIs are available at that time.
This patch breaks the API initialization into stages, in each stage
additional commands will be available.
While setting that the api header files was broken into api_init.hh that
is relevent to the main and to api.hh which holds the different
api helper functions.
Fixes#754
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1453822331-16729-2-git-send-email-amnon@scylladb.com>
This reverts commit f0d68e4 ("main: start the http server in the first
step"). The service layer is not ready to serve clients before it's
fully up and running which causes early startup crashes everywhere.
Message-Id: <1452768015-22763-1-git-send-email-penberg@scylladb.com>
With Docker we might be running on a filesystem that does not support DMA
(aufs; or tmpfs on boot2docker), so let --developer-mode allow running
on those file systems.
Message-Id: <1452593083-25601-1-git-send-email-avi@scylladb.com>
Wait for the future returned by the http server start process to resolve,
so we know it is started. If it doesn't, we'll hit the or_terminate()
further down the line and exit with an error code.
Message-Id: <1452092806-11508-3-git-send-email-avi@scylladb.com>
Implement the wait for gossip to settle logic in the bootup process.
CASSANDRA-4288
Fixes:
bootstrap_test.py:TestBootstrap.shutdown_wiped_node_cannot_join_test
1) start node2
2) wait for cql connection with node2 is ready
3) stop node2
4) delete data and commitlog directory for node2
5) start node2
In step 5, sometimes I saw in shadow round of node2, it gets node2's
status as BOOT from other nodes in the cluster instead of NORMAL. The
problem is we do not wait for gossip to settle before we start cql server,
as a result, when we stop node2 in step 3), other nodes in the cluster
have not got node2's status update to NORMAL.
This patch adds a new type of message, "REPAIR_CHECKSUM_RANGE" to scylla's
"messaging_service" RPC mechanism, for the use of repair:
With this message the repair's master host tells a slave host to calculate
the checksum of a column-family's partitions in a given token range, and
return that checksum.
The implementation of this message uses the checksum_range() function
defined in the previous patch.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This change set the http server to start as the first step in the boot
order.
It is helpfull if some other step takes a long time or stuck.
Fixes#725
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
The start_native_transport() function in storage_service expects the
'enabled' option to be defined. If the option is not defined, it means
that encryption is implicitly disabled.
Fixes#718.
* Massage user options in main
* Use them in storage_service, and if needed, load certificates etc
and pass to transport/cql server.
Conflicts:
service/storage_service.cc
If we can't open the file, we will fail with a misterious error. It is a costumary
scenario, though, since people who are unaware or have just forgotten about seastar's
restriction of direct io access may put those files in tmpfs and other mount points.
We have a direct_io check that is designed exactly for this purpose, so as to give
the user a better error message. This patch makes use of it.
Fixes#644
Signed-off-by: Glauber Costa <glauber@scylladb.com>
With this patch, start two nodes
node 1:
scylla --rpc-address 127.0.0.1 --broadcast-rpc-address 127.0.0.11
node 2:
scylla --rpc-address 127.0.0.2 --broadcast-rpc-address 127.0.0.12
On node 1:
cqlsh> SELECT rpc_address from system.peers;
rpc_address
-------------
127.0.0.12
which means client should use this address to connect node 2 for cql and
thrift protocol.
Add utils::fb_utilities::set_broadcast_address().
Set it to either broadcast_address or listen_address configuration value
if appropriate values are set. If none of the two values above
are set - abort the application.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Simplify the utils::fb_utilities::get_broadcast() logic.