supervisor_notify() is called periodically to log messages on systemd,
so raise(SIGSTOP) ends up being called multiple times, which upstart
does not expect. We need to call it just once.
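
A minimal sketch of the one-shot guard, written in Python rather than the project's C++. The class name `SupervisorNotifier`, the `ready` flag and the injectable `stop_self` callback are hypothetical and exist only for illustration; the actual patch may be structured differently.

```python
import os
import signal


class SupervisorNotifier:
    """Hypothetical helper: status messages may repeat, SIGSTOP must not."""

    def __init__(self, stop_self=None):
        # Default action mirrors raise(SIGSTOP); injectable so it can be tested.
        self._stop_self = stop_self or (
            lambda: os.kill(os.getpid(), signal.SIGSTOP)
        )
        self._stopped_once = False
        self.messages = []

    def notify(self, message, ready=False):
        # Called periodically; recording every message is harmless.
        self.messages.append(message)
        # Stop the process only on the first "ready" notification.
        if ready and not self._stopped_once:
            self._stopped_once = True
            self._stop_self()
```

Repeated `notify(..., ready=True)` calls keep logging but trigger the stop action only the first time.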
Fixes #846
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
We register engine().at_exit() callbacks when we initialize the services. We
do not really call the callbacks at the moment due to #293.
It is pretty hard to see the whole picture of the order in which the
services are shut down. Instead of having each service register an
at_exit() callback, I propose having a single at_exit() callback which
does the shutdown for all the services. In Cassandra, the shutdown work
is done in the storage_service::drain_on_shutdown callback.
In this patch, drain_on_shutdown is executed during shutdown.
As a result, the proper gossip shutdown is executed, which fixes #790.
With this patch, pressing Ctrl-C on a node looks like:
INFO [shard 0] storage_service - Drain on shutdown: starts
INFO [shard 0] gossip - Announcing shutdown
INFO [shard 0] storage_service - Node 127.0.0.1 state jump to normal
INFO [shard 0] storage_service - Drain on shutdown: stop_gossiping done
INFO [shard 0] storage_service - CQL server stopped
INFO [shard 0] storage_service - Drain on shutdown: shutdown rpc and cql server done
INFO [shard 0] storage_service - Drain on shutdown: shutdown messaging_service done
INFO [shard 0] storage_service - Drain on shutdown: flush column_families done
INFO [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
INFO [shard 0] storage_service - Drain on shutdown: done
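
The single-callback approach can be sketched as follows, in Python for brevity. `ShutdownCoordinator` and `add_step` are hypothetical names, not part of the patch; the sketch only shows how one callback can run every shutdown step in an explicit order and log progress in the style of the output above.

```python
class ShutdownCoordinator:
    """Hypothetical registry: one callback drains every service in order."""

    def __init__(self, log=print):
        self._steps = []  # (description, callable) pairs, run in insertion order
        self._log = log

    def add_step(self, description, action):
        self._steps.append((description, action))

    def drain_on_shutdown(self):
        self._log("Drain on shutdown: starts")
        for description, action in self._steps:
            action()
            self._log(f"Drain on shutdown: {description} done")
        self._log("Drain on shutdown: done")
```

Because the steps live in one list, the shutdown order is visible in a single place instead of being spread over many per-service callbacks.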
The API needs to be available at an early stage of the initialization;
on the other hand, not all the specific APIs are available at that time.
This patch breaks the API initialization into stages; in each stage
additional commands become available.
While doing that, the API header files were split into api_init.hh,
which is relevant to main, and api.hh, which holds the different
API helper functions.
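
A rough sketch of staged initialization, in Python. `StagedApi`, `enable_stage` and the example paths are hypothetical and only illustrate the idea that each stage makes more commands available; they are not the functions declared in api_init.hh.

```python
class StagedApi:
    """Hypothetical API registry whose commands appear stage by stage."""

    def __init__(self):
        self._handlers = {}

    def enable_stage(self, handlers):
        # Each initialization stage adds the commands that are now safe to serve.
        self._handlers.update(handlers)

    def call(self, path):
        handler = self._handlers.get(path)
        if handler is None:
            raise LookupError(f"{path} is not available yet")
        return handler()
```

A caller that asks for a command from a later stage gets a clear "not available yet" error instead of reaching a service that has not been started.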
Fixes #754
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1453822331-16729-2-git-send-email-amnon@scylladb.com>
This reverts commit f0d68e4 ("main: start the http server in the first
step"). The service layer is not ready to serve clients before it's
fully up and running, which causes early startup crashes everywhere.
Message-Id: <1452768015-22763-1-git-send-email-penberg@scylladb.com>
With Docker we might be running on a filesystem that does not support DMA
(aufs; or tmpfs on boot2docker), so let --developer-mode allow running
on those file systems.
Message-Id: <1452593083-25601-1-git-send-email-avi@scylladb.com>
Wait for the future returned by the http server start process to resolve,
so we know it is started. If it doesn't, we'll hit the or_terminate()
further down the line and exit with an error code.
Message-Id: <1452092806-11508-3-git-send-email-avi@scylladb.com>
Implement the wait-for-gossip-to-settle logic in the bootup process.
CASSANDRA-4288
Fixes:
bootstrap_test.py:TestBootstrap.shutdown_wiped_node_cannot_join_test
1) start node2
2) wait until the cql connection with node2 is ready
3) stop node2
4) delete data and commitlog directory for node2
5) start node2
In step 5, I sometimes saw that in the shadow round of node2, it gets
node2's status as BOOT from other nodes in the cluster instead of NORMAL.
The problem is that we do not wait for gossip to settle before we start
the cql server; as a result, when we stop node2 in step 3), other nodes
in the cluster have not yet received node2's status update to NORMAL.
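
One way to picture "waiting for gossip to settle" is a polling loop that returns once the observed state stops changing. The sketch below is in Python; the function signature, the `get_state` callback and all default values are assumptions for illustration and are not taken from the patch or from CASSANDRA-4288.

```python
import time


def wait_for_gossip_to_settle(get_state, required_stable_polls=3,
                              poll_interval=1.0, max_polls=60,
                              sleep=time.sleep):
    """Return True once get_state() is unchanged for enough consecutive polls.

    Returns False if the state never settles within max_polls polls.
    """
    previous = get_state()
    stable = 0
    for _ in range(max_polls):
        sleep(poll_interval)
        current = get_state()
        if current == previous:
            stable += 1
            if stable >= required_stable_polls:
                return True
        else:
            # State changed: restart the count from this new observation.
            stable = 0
            previous = current
    return False
```

Starting the cql server only after such a function returns would give other nodes time to observe the NORMAL status before clients can connect.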
This patch adds a new type of message, "REPAIR_CHECKSUM_RANGE" to scylla's
"messaging_service" RPC mechanism, for the use of repair:
With this message the repair's master host tells a slave host to calculate
the checksum of a column-family's partitions in a given token range, and
return that checksum.
The implementation of this message uses the checksum_range() function
defined in the previous patch.
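
To illustrate what the slave computes, here is a small Python sketch. It is not the project's checksum_range(): the `(token, payload)` pair layout, the half-open `(start, end]` range and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib


def checksum_range(partitions, start_token, end_token):
    """Hash every partition whose token lies in (start_token, end_token].

    `partitions` is a hypothetical iterable of (token, payload_bytes) pairs.
    """
    digest = hashlib.sha256()
    # Sort so that both hosts hash the partitions in the same order.
    for token, payload in sorted(partitions):
        if start_token < token <= end_token:
            digest.update(token.to_bytes(8, "big", signed=True))
            digest.update(payload)
    return digest.hexdigest()
```

If the master and the slave get different checksums for the same range, the range needs repair; equal checksums mean the range can be skipped.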
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This change sets the http server to start as the first step in the boot
order.
It is helpful if some other step takes a long time or gets stuck.
Fixes #725
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
The start_native_transport() function in storage_service expects the
'enabled' option to be defined. If the option is not defined, it means
that encryption is implicitly disabled.
Fixes #718.
* Massage user options in main
* Use them in storage_service, and if needed, load certificates etc
and pass to transport/cql server.
Conflicts:
service/storage_service.cc
If we can't open the file, we will fail with a mysterious error. It is a common
scenario, though, since people who are unaware of or have just forgotten about seastar's
direct I/O restriction may put those files in tmpfs and other mount points.
We have a direct_io check that is designed exactly for this purpose, so as to give
the user a better error message. This patch makes use of it.
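
The kind of check described here can be sketched in Python as below. This is not seastar's direct_io check; the function name, the injectable `opener` parameter and the error text are assumptions. It relies on `O_DIRECT` opens failing with `EINVAL` on filesystems that do not support direct I/O.

```python
import errno
import os


def check_direct_io_support(path, opener=os.open):
    """Raise a descriptive error if `path` cannot be opened with O_DIRECT.

    O_DIRECT is Linux-specific; where it is missing the flag falls back to 0.
    """
    flags = os.O_RDONLY | getattr(os, "O_DIRECT", 0)
    try:
        fd = opener(path, flags)
    except OSError as e:
        if e.errno == errno.EINVAL:
            raise RuntimeError(
                f"{path}: filesystem does not support direct I/O (O_DIRECT)"
            ) from e
        raise
    os.close(fd)
```

Running such a check at startup turns an obscure failure later on into one clear message that names the offending path.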
Fixes #644
Signed-off-by: Glauber Costa <glauber@scylladb.com>
With this patch, start two nodes:
node 1:
scylla --rpc-address 127.0.0.1 --broadcast-rpc-address 127.0.0.11
node 2:
scylla --rpc-address 127.0.0.2 --broadcast-rpc-address 127.0.0.12
On node 1:
cqlsh> SELECT rpc_address from system.peers;
rpc_address
-------------
127.0.0.12
which means clients should use this address to connect to node 2 for
the cql and thrift protocols.
Add utils::fb_utilities::set_broadcast_address().
Set it to either broadcast_address or listen_address configuration value
if appropriate values are set. If neither of the two values above
is set, abort the application.
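
The selection logic can be sketched in Python as follows. The function name is hypothetical, and preferring broadcast_address over listen_address when both are set is an assumption; the commit message only says that one of the two is used.

```python
def select_broadcast_address(broadcast_address=None, listen_address=None):
    """Pick the address to advertise to other nodes."""
    # Assumption: an explicit broadcast_address wins over listen_address.
    if broadcast_address:
        return broadcast_address
    if listen_address:
        return listen_address
    # Neither value is configured: abort, as the commit message describes.
    raise SystemExit("neither broadcast_address nor listen_address is set")
```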
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Simplify the utils::fb_utilities::get_broadcast() logic.
I was mildly annoyed by seeing two warnings about the same directory not
being XFS, when the sstable directory and the commitlog directory are the
same one (I don't know if this is typical, but this is what I do in all
my tests...). So I wrote this trivial patch to make sure not to test the
same directory twice.
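
A small Python sketch of the deduplication idea. The function name is hypothetical, and comparing resolved paths with `os.path.realpath` (rather than the raw strings) is an assumption that goes slightly beyond what the patch describes.

```python
import os


def unique_directories(paths):
    """Return paths with duplicates removed, comparing resolved locations."""
    seen = set()
    unique = []
    for path in paths:
        resolved = os.path.realpath(path)
        if resolved not in seen:
            seen.add(resolved)
            unique.append(path)
    return unique
```

Running the filesystem check over the deduplicated list gives one warning per directory even when the sstable and commitlog directories coincide.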
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
"This series adds EC2Snitch.
Since both GossipingPropertyFileSnitch and the EC2SnitchXXX family of
snitches use the same property file, it was logical to share the
corresponding code. Most of this series does just that... "
This function returns the directory containing the configuration
files. It takes the environment variables into account as follows:
- If SCYLLA_CONF is defined, this is the directory
- else if SCYLLA_HOME is defined, then $SCYLLA_HOME/conf is the directory
- else "conf" is the directory, namely the configuration files should be
looked for in ./conf
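
The lookup order above can be sketched in Python as follows. This mirrors the described rules only; the `environ` parameter is added for testability and is not part of the real get_conf_dir().

```python
import os


def get_conf_dir(environ=os.environ):
    """Resolve the configuration directory from the environment."""
    if "SCYLLA_CONF" in environ:
        return environ["SCYLLA_CONF"]
    if "SCYLLA_HOME" in environ:
        return os.path.join(environ["SCYLLA_HOME"], "conf")
    # Fall back to ./conf relative to the current working directory.
    return "conf"
```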
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Updated get_conf_dir() description.
For a lot of users, running Scylla in some kinds of filesystems that do not support
O_DIRECT is quite frustrating: it will fail at some point, with random error messages
that aren't really meaningful.
We should try to check for that, and fail with a good error message. Also, since our
performance claims won't really hold in anything other than XFS, we should warn the user
if that is not the setup we encounter.
Fixes #409
Signed-off-by: Glauber Costa <glommer@scylladb.com>
Although it is technically a seastar problem, most complaints about it
come from the Scylla side. I prefer to keep the message here so we can reference
a Scylla issue.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
This adds version number generation in the build system. Version numbers
follow the format:
<version>-<release>
where release consists of:
<date>-<git-hash>
The version and release numbers are generated by the SCYLLA-VERSION-GEN
script and they are stored in SCYLLA-VERSION-FILE and
SCYLLA-RELEASE-FILE files so that other parts of the build system can
easily pick them up.
For builds that happen from release tarballs, for example,
SCYLLA-VERSION-GEN looks for a "version" file in the tree and just uses
that.
Basically, we're doing pretty much the same as Git is doing in its build
system.
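
The version format can be illustrated with a short Python sketch. This is not the SCYLLA-VERSION-GEN script: the function name and the YYYYMMDD date layout are assumptions, since the commit message does not spell out how the date is rendered.

```python
import datetime


def make_version(version, build_date, git_hash):
    """Compose <version>-<release>, where release is <date>-<git-hash>."""
    # Assumption: the date is rendered as YYYYMMDD.
    release = f"{build_date.strftime('%Y%m%d')}-{git_hash}"
    return f"{version}-{release}"
```

For example, a hypothetical version `0.17` built on 2016-01-26 from commit `abc1234` would be rendered as `0.17-20160126-abc1234` under these assumptions.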
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
"It moves the API configuration from the command line argument to the general
config; it also makes the api-doc directory configurable instead of hard
coded."
* Issue the "stop" method on DB (flushed CL + tables (partially))
* Do hard exit (_exit) to escape destructors and sanity checks.
This patch is horrible but sort of a workaround for various interdependency
shutdown issues. Until services can actually be turned off, this might be
a viable option.
Refs #293. I will not call it a fix.
This replaces the http configuration to use the general configuration
object instead of the command line argument. This allows configuring
the API from the configuration file and not just from the command
line.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>