Commit Graph

135 Commits

Author SHA1 Message Date
Takuya ASADA
4162fb158c main: raise SIGSTOP only when scylla become ready
supervisor_notify() is called periodically to log status messages to systemd,
so raise(SIGSTOP) would be called multiple times, which upstart does not expect.
We need to call it only once.

Fixes #846

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
2016-01-27 23:30:26 +09:00
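The once-only behaviour described in commit 4162fb158c above can be sketched as follows. This is a minimal illustration, not the code from the patch: `once_notifier` and `notify_ready()` are hypothetical names, and the action is injected so that the sketch can be exercised without actually stopping the process.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper (not from the patch): runs a "ready" action at most
// once, no matter how often the periodic status path calls notify_ready().
class once_notifier {
    std::function<void()> _action;
    bool _done = false;
public:
    explicit once_notifier(std::function<void()> action)
        : _action(std::move(action)) {
        assert(_action && "action must be callable");
    }

    // Returns true only on the call that actually ran the action.
    bool notify_ready() {
        if (_done) {
            return false;
        }
        _done = true;
        _action();
        return true;
    }
};
```

Under upstart the injected action would presumably be something like `[] { raise(SIGSTOP); }` (from `<csignal>`), so the signal is raised a single time even though status updates keep arriving.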
Takuya ASADA
b4accd8904 main: autodetect systemd/upstart
We can autodetect systemd/upstart from environment variables, so no program argument is needed.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
2016-01-27 23:29:32 +09:00
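One way the autodetection in commit b4accd8904 above could work is sketched here. The commit message does not say which variables are inspected; this sketch assumes `NOTIFY_SOCKET` (which systemd sets for services that use the notify protocol) and `UPSTART_JOB` (which upstart exports to job processes). The `supervisor` enum and `detect_supervisor()` are hypothetical names.

```cpp
#include <cstdlib>

enum class supervisor { none, systemd, upstart };

// Pure decision function, so the logic can be checked without touching the
// real environment. Which variables to look at is an assumption of this sketch.
inline supervisor detect_supervisor(const char* notify_socket,
                                    const char* upstart_job) {
    if (notify_socket && *notify_socket) {
        return supervisor::systemd;
    }
    if (upstart_job && *upstart_job) {
        return supervisor::upstart;
    }
    return supervisor::none;
}

// Convenience overload that reads the process environment.
inline supervisor detect_supervisor() {
    return detect_supervisor(std::getenv("NOTIFY_SOCKET"),
                             std::getenv("UPSTART_JOB"));
}
```

Keeping the environment lookup in a thin wrapper means the decision itself needs no command-line flag and stays easy to test.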
Asias He
b2f2c1c28c storage_service: Add drain on shutdown logic
We register engine().at_exit() callbacks when we initialize the services. We
do not really call the callbacks at the moment due to #293.

It is pretty hard to see the whole picture of the order in which the services
are shut down. Instead of having each service register its own at_exit()
callback, I propose to have a single at_exit() callback which does the
shutdown for all the services. In Cassandra, the shutdown work is done
in the storage_service::drain_on_shutdown callback.

In this patch, the drain_on_shutdown is executed during shutdown.

As a result, the proper gossip shutdown is executed and fixes #790.

With this patch, pressing Ctrl-C on a node produces output like:

INFO  [shard 0] storage_service - Drain on shutdown: starts
INFO  [shard 0] gossip - Announcing shutdown
INFO  [shard 0] storage_service - Node 127.0.0.1 state jump to normal
INFO  [shard 0] storage_service - Drain on shutdown: stop_gossiping done
INFO  [shard 0] storage_service - CQL server stopped
INFO  [shard 0] storage_service - Drain on shutdown: shutdown rpc and cql server done
INFO  [shard 0] storage_service - Drain on shutdown: shutdown messaging_service done
INFO  [shard 0] storage_service - Drain on shutdown: flush column_families done
INFO  [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
INFO  [shard 0] storage_service - Drain on shutdown: done
2016-01-27 11:45:52 +08:00
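The idea in commit b2f2c1c28c above, a single shutdown callback that runs every step in one visible order, can be sketched as below. This is not the Scylla implementation: `shutdown_step` and this free-standing `drain_on_shutdown()` are hypothetical, the steps are plain synchronous callables rather than Seastar futures, and the progress lines only imitate the log shown in the commit message.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A named shutdown action. Hypothetical type for this sketch.
using shutdown_step = std::pair<std::string, std::function<void()>>;

// Runs every step in the order given and returns the progress lines, so the
// whole shutdown sequence is defined in exactly one place.
inline std::vector<std::string> drain_on_shutdown(
        const std::vector<shutdown_step>& steps) {
    std::vector<std::string> log;
    log.push_back("Drain on shutdown: starts");
    for (const auto& step : steps) {
        step.second();  // perform the action
        log.push_back("Drain on shutdown: " + step.first + " done");
    }
    log.push_back("Drain on shutdown: done");
    return log;
}
```

A caller would list the steps once (stop gossiping, shut down the RPC and CQL servers, shut down messaging, flush column families, shut down the commitlog) and register only this one function as the exit callback.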
Amnon Heiman
b1845cddec Breaking the API initialization into stages
The API needs to be available at an early stage of initialization,
but not all of the specific APIs are available at that time.

This patch breaks the API initialization into stages, in each stage
additional commands will be available.

While setting that up, the API header files were split into api_init.hh,
which is relevant to main, and api.hh, which holds the different
API helper functions.

Fixes #754

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1453822331-16729-2-git-send-email-amnon@scylladb.com>
2016-01-26 17:41:31 +02:00
Avi Kivity
71eb79aedd main: exit with code 0 on shutdown
To avoid confusing systemd.

Fixes #823.

Message-Id: <1453220473-28712-1-git-send-email-avi@scylladb.com>
2016-01-26 16:26:53 +02:00
Takuya ASADA
b92a075a34 main: support supervisor_notify() on Ubuntu
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1453422886-26297-1-git-send-email-syuu@scylladb.com>
2016-01-24 12:10:41 +02:00
Pekka Enberg
733584c44d main: Start the API service as the last step
This reverts commit f0d68e4 ("main: start the http server in the first
step"). The service layer is not ready to serve clients before it's
fully up and running which causes early startup crashes everywhere.
Message-Id: <1452768015-22763-1-git-send-email-penberg@scylladb.com>
2016-01-14 12:55:50 +02:00
Avi Kivity
39f81b95d6 main: make --developer-mode relax dma requirements
With Docker we might be running on a filesystem that does not support DMA
(aufs; or tmpfs on boot2docker), so let --developer-mode allow running
on those file systems.
Message-Id: <1452593083-25601-1-git-send-email-avi@scylladb.com>
2016-01-12 13:34:46 +02:00
Avi Kivity
3d5f6de683 main: notify systemd of startup progress
Send current startup stage via sd_notify STATUS variable; let it know that
startup is complete via READY=1.

Fixes #760.
2016-01-12 11:58:24 +02:00
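For commit 3d5f6de683 above, the notification payload can be sketched as a plain string builder. The `sd_notify` protocol takes newline-separated `VARIABLE=value` assignments; `make_notify_state()` is a hypothetical helper, and the stage names used with it are placeholders rather than the strings Scylla actually reports.

```cpp
#include <string>

// Builds an sd_notify-style state string: an optional READY=1 line followed
// by a STATUS= line describing the current startup stage.
inline std::string make_notify_state(const std::string& stage, bool ready) {
    std::string state;
    if (ready) {
        state += "READY=1\n";
    }
    state += "STATUS=" + stage;
    return state;
}
```

The result could then be handed to `sd_notify(0, state.c_str())` from `<systemd/sd-daemon.h>`, which requires linking against libsystemd and is therefore left out of the sketch.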
Avi Kivity
3377739fa3 main: wait for API http server to start
Wait for the future returned by the http server start process to resolve,
so we know it is started.  If it doesn't, we'll hit the or_terminate()
further down the line and exit with an error code.
Message-Id: <1452092806-11508-3-git-send-email-avi@scylladb.com>
2016-01-07 16:44:07 +02:00
Asias He
933614bdf9 main: Change API server starting message
It comes from the Seastar HTTP server and is inaccurate.

Message-Id: <6a634437d2bd4368400010e25969e215894c2df9.1452162686.git.asias@scylladb.com>
2016-01-07 15:53:28 +02:00
Asias He
8c909122a6 gossip: Add wait_for_gossip_to_settle
Implement the wait for gossip to settle logic in the bootup process.

CASSANDRA-4288

Fixes:
bootstrap_test.py:TestBootstrap.shutdown_wiped_node_cannot_join_test

1) start node2
2) wait until the cql connection with node2 is ready
3) stop node2
4) delete data and commitlog directory for node2
5) start node2

In step 5, I sometimes saw that in node2's shadow round it gets node2's
status as BOOT from other nodes in the cluster instead of NORMAL. The
problem is that we do not wait for gossip to settle before we start the cql
server; as a result, when we stop node2 in step 3), other nodes in the cluster
have not yet received node2's status update to NORMAL.
2016-01-07 10:09:25 +02:00
Nadav Har'El
f5b2135a80 repair: repair_checksum_range message
This patch adds a new type of message, "REPAIR_CHECKSUM_RANGE" to scylla's
"messaging_service" RPC mechanism, for the use of repair:

With this message the repair's master host tells a slave host to calculate
the checksum of a column-family's partitions in a given token range, and
return that checksum.

The implementation of this message uses the checksum_range() function
defined in the previous patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2016-01-05 15:38:40 +02:00
Avi Kivity
2ba4910385 main: verify that the NOFILE rlimit is sufficient
Require 10k files, recommend 200k.

Allow bypassing via --developer-mode.

Fixes #692.
2015-12-30 11:02:08 +02:00
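The check in commit 2ba4910385 above can be sketched with POSIX `getrlimit()`. The thresholds are read here as 10000 and 200000, which is an assumption about what "10k" and "200k" mean exactly; `nofile_verdict`, `classify_nofile()` and `nofile_allows_startup()` are hypothetical names.

```cpp
#include <stdexcept>
#include <sys/resource.h>

enum class nofile_verdict { ok, below_recommended, below_required };

constexpr rlim_t required_nofile = 10000;      // assumed reading of "10k"
constexpr rlim_t recommended_nofile = 200000;  // assumed reading of "200k"

// Classifies a soft RLIMIT_NOFILE value against the two thresholds.
inline nofile_verdict classify_nofile(rlim_t soft_limit) {
    if (soft_limit < required_nofile) {
        return nofile_verdict::below_required;
    }
    if (soft_limit < recommended_nofile) {
        return nofile_verdict::below_recommended;
    }
    return nofile_verdict::ok;
}

// Developer mode bypasses the hard requirement.
inline bool nofile_allows_startup(rlim_t soft_limit, bool developer_mode) {
    return developer_mode ||
           classify_nofile(soft_limit) != nofile_verdict::below_required;
}

// Reads the current soft limit of this process.
inline rlim_t current_nofile_soft_limit() {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
        throw std::runtime_error("getrlimit(RLIMIT_NOFILE) failed");
    }
    return lim.rlim_cur;
}
```

A startup path would call `nofile_allows_startup(current_nofile_soft_limit(), developer_mode)`, refuse to start on `false`, and log a warning when the verdict is `below_recommended`.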
Avi Kivity
c26689f325 init: bail out if running not on an XFS filesystem
Allow an override via '--developer-mode true', and use it in
the docker setup, since that cannot be expected to use XFS.

Fixes #658.
2015-12-30 10:56:21 +02:00
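A Linux-only sketch of the XFS check from commit c26689f325 above, using `statfs()` and the XFS superblock magic number. The helper names are hypothetical, and the actual patch may detect the filesystem differently.

```cpp
#include <sys/vfs.h>

// XFS superblock magic ("XFSB"), the value of XFS_SUPER_MAGIC in <linux/magic.h>.
constexpr long xfs_super_magic = 0x58465342;

inline bool is_xfs_magic(long f_type) {
    return f_type == xfs_super_magic;
}

// Returns false when the path cannot be inspected at all.
inline bool is_on_xfs(const char* path) {
    struct statfs buf{};
    if (statfs(path, &buf) != 0) {
        return false;
    }
    return is_xfs_magic(static_cast<long>(buf.f_type));
}

// Developer mode overrides the XFS requirement, e.g. for Docker setups.
inline bool filesystem_allows_startup(bool on_xfs, bool developer_mode) {
    return on_xfs || developer_mode;
}
```

Separating the policy (`filesystem_allows_startup`) from the probe (`is_on_xfs`) keeps the developer-mode override independent of how the filesystem is detected.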
Amnon Heiman
f0d68e4161 main: start the http server in the first step
This change sets the http server to start as the first step in the boot
order.

It is helpful if some other step takes a long time or gets stuck.

Fixes #725

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2015-12-29 14:20:57 +02:00
Pekka Enberg
ca1f9f1c9a main: Fix implicitly disabled client encryption options
The start_native_transport() function in storage_service expects the
'enabled' option to be defined. If the option is not defined, it means
that encryption is implicitly disabled.

Fixes #718.
2015-12-28 16:24:49 +02:00
Calle Wilund
fae3bb7a24 storage_service: Set up CQL server as SSL if specified
* Massage user options in main
* Use them in storage_service, and if needed, load certificates etc
  and pass to transport/cql server.

Conflicts:
	service/storage_service.cc
2015-12-28 10:13:48 +00:00
Calle Wilund
70f293d82e main/init: Use server_encryption_options
* Reads server_encryption_options
* Interpret the above, and load and initialize credentials
  and use with messaging service init if required
2015-12-28 10:10:35 +00:00
Glauber Costa
e299127e81 main: check if options file can be read.
If we can't open the file, we will fail with a mysterious error. It is a customary
scenario, though, since people who are unaware of or have just forgotten about seastar's
restriction of direct io access may put those files in tmpfs and other mount points.

We have a direct_io check that is designed exactly for this purpose, so as to give
the user a better error message. This patch makes use of it.

Fixes #644

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2015-12-27 12:20:40 +02:00
Avi Kivity
167addbfe1 main: remove issue #417 (poll mode) warning
Fixed.
2015-12-09 19:00:32 +02:00
Asias He
2022117234 failure_detector: Enable phi_convict_threshold option
Adjusts the sensitivity of the failure detector on an exponential scale.

Use as:

$ scylla --phi-convict-threshold 9

Defaults to 8.
2015-11-30 11:09:36 +02:00
Asias He
7ddf8963f5 config: Enable broadcast_rpc_address option
With this patch, start two nodes:

node 1:
scylla --rpc-address 127.0.0.1 --broadcast-rpc-address 127.0.0.11

node 2:
scylla --rpc-address 127.0.0.2 --broadcast-rpc-address 127.0.0.12

On node 1:
cqlsh> SELECT rpc_address from system.peers;

 rpc_address
-------------
  127.0.0.12

which means clients should use this address to connect to node 2 for the cql
and thrift protocols.
2015-11-24 10:07:31 +08:00
Asias He
2c8867c348 config: Enable storage_port option
2015-10-29 08:58:41 +08:00
Asias He
8218ab7922 storage_service: Implement start_native_transport and start_rpc_server
They are used for APIs. Share the code in main.cc as well.
2015-10-27 21:48:37 +08:00
Pekka Enberg
a772938e73 transport/server: Round-robin CQL request load balancing
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-10-27 13:24:58 +02:00
Vlad Zolotarov
5613979a85 utils::fb_utilities: add the ability to set a broadcast address
Add utils::fb_utilities::set_broadcast_address().
Set it to either broadcast_address or listen_address configuration value
if appropriate values are set. If none of the two values above
are set - abort the application.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Simplify the utils::fb_utilities::get_broadcast() logic.
2015-10-26 14:10:39 +02:00
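The selection rule in commit 5613979a85 above can be sketched as follows. The commit only says that one of the two configuration values is used and that the application aborts when neither is set; preferring `broadcast_address` over `listen_address` is an assumption of this sketch, empty strings stand in for unset options, and an exception stands in for the abort. `choose_broadcast_address()` is a hypothetical name.

```cpp
#include <stdexcept>
#include <string>

// Picks the address to broadcast. An empty string means "option not set".
inline std::string choose_broadcast_address(const std::string& broadcast_address,
                                            const std::string& listen_address) {
    if (!broadcast_address.empty()) {
        return broadcast_address;
    }
    if (!listen_address.empty()) {
        return listen_address;
    }
    // The real code aborts the application; the sketch throws instead.
    throw std::invalid_argument(
        "neither broadcast_address nor listen_address is set");
}
```

The chosen value would then be passed once to `utils::fb_utilities::set_broadcast_address()`, the setter this commit adds.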
Asias He
a137006d27 main: Set load_broadcaster for storage_service during startup
Note that only storage_service on shard 0 will access load_broadcaster
2015-10-23 11:28:06 +08:00
Gleb Natapov
5b97604735 load_broadcaster: fix linkage error in debug mode
Also move it to service namespace.
2015-10-22 18:18:05 +02:00
Gleb Natapov
ece6c68288 convert loadBroadcaster
2015-10-22 17:10:20 +03:00
Glauber Costa
c3e68ad5a4 guarantee that CF directories exist for system tables
This is done in a single shard, so it lives in main.cc

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-17 13:08:07 +02:00
Pekka Enberg
c80deeb6fe cql3/query_processor: Add get_query_processor() helper
Signed-off-by: Pekka Enberg <penberg@scylladb.com>
2015-10-15 09:18:52 +03:00
Nadav Har'El
39f70a043d main: don't warn twice about the same directory
I was mildly annoyed by seeing two warnings about the same directory not
being XFS, when the sstable directory and the commitlog directory are the
same one (I don't know if this is typical, but this is what I do in all
my tests...). So I wrote this trivial patch to make sure not to test the
same directory twice.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-10-13 11:23:55 +03:00
Avi Kivity
e252475e67 Merge "locator: Adding EC2Snitch" from Vlad
"This series adds EC2Snitch.

Since both GossipingPropertyFileSnitch and the EC2SnitchXXX family of
snitches use the same property file, it was logical to share the corresponding
code. Most of this series does just that... "
2015-10-11 14:55:26 +03:00
Vlad Zolotarov
de6cf8db51 db::config: add get_conf_dir()
This function returns the directory containing the configuration
files. It takes the environment variables into account as follows:
   - If SCYLLA_CONF is defined, it is the directory
   - else if SCYLLA_HOME is defined, then $SCYLLA_HOME/conf is the directory
   - else "conf" is the directory, namely the configuration files are
     looked up in ./conf

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Updated get_conf_dir() description.
2015-10-08 20:57:11 +03:00
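The lookup order described for get_conf_dir() in commit de6cf8db51 above translates almost directly into code. This is a sketch rather than the actual db::config implementation: `resolve_conf_dir()` is a hypothetical free function, and the environment values are passed in explicitly so that the order can be checked in isolation.

```cpp
#include <cstdlib>
#include <string>

// Applies the documented precedence: SCYLLA_CONF, then $SCYLLA_HOME/conf,
// then the relative directory "conf". A null pointer means "not defined".
inline std::string resolve_conf_dir(const char* scylla_conf,
                                    const char* scylla_home) {
    if (scylla_conf) {
        return scylla_conf;
    }
    if (scylla_home) {
        return std::string(scylla_home) + "/conf";
    }
    return "conf";
}

// Convenience overload that reads the process environment.
inline std::string resolve_conf_dir() {
    return resolve_conf_dir(std::getenv("SCYLLA_CONF"),
                            std::getenv("SCYLLA_HOME"));
}
```

For example, with only `SCYLLA_HOME=/opt/scylla` defined, the function yields `/opt/scylla/conf`.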
Calle Wilund
6416c62d39 main: Actually start the batchlog_manager service loop
Was not invoked previously.
2015-10-07 14:30:09 +02:00
Glauber Costa
651937becf Revert "pass db::config to storage service as well"
This reverts commit c2b981cd82.
2015-10-05 13:21:33 +02:00
Glauber Costa
c2b981cd82 pass db::config to storage service as well
We would like to access configuration, but don't want to poke other services
in order to do so.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-02 18:23:26 +02:00
Avi Kivity
01e01a2bd7 main: fix typo in 'check_direct_io_support()'
2015-09-30 20:16:07 +03:00
Glauber Costa
73a1fab273 sanity check the filesystem
For a lot of users, running Scylla on filesystems that do not support
O_DIRECT is quite frustrating: it will fail at some point, with random error messages
that aren't really meaningful.

We should try to check for that, and fail with a good error message. Also, since our
performance claims won't really hold in anything other than XFS, we should warn the user
if that is not the setup we encounter.

Fixes #409

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-09-30 17:58:27 +03:00
Glauber Costa
91408d3cbc warn users on 100 % CPU usage
Although it is technically a seastar problem, most complaints about it are
coming from the Scylla side.  I prefer to keep the message here so we can reference
a Scylla issue.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-09-29 16:40:24 +03:00
Paweł Dziepak
34e66e60c1 main: disable thrift by default
Fixes #205.

Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-22 09:48:44 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices
2015-09-20 10:43:39 +03:00
Calle Wilund
e001df0a35 Main: Resolve scylla.conf based on ENV vars + do more explicit error logging
Refs #135
2015-09-16 15:44:34 +03:00
Calle Wilund
bd14d40a35 main: configure logging before reading yaml as well as after
So that command line --log* options can affect config logging.
2015-09-16 15:43:32 +03:00
Pekka Enberg
eab6094124 main: Print version number at startup
Now that we have a version number, let's tell the world about it!

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-14 11:35:32 +03:00
Pekka Enberg
5ef77a8a56 build: Add version number generation
This adds version number generation in the build system. Version numbers
follow the format:

  <version>-<release>

where release consists of:

  <date>-<git-hash>

The version and release numbers are generated by the SCYLLA-VERSION-GEN
script and they are stored in SCYLLA-VERSION-FILE and
SCYLLA-RELEASE-FILE files so that other parts of the build system can
easily pick them up.

For builds that happen from release tarballs, for example,
SCYLLA-VERSION-GEN looks for a "version" file in the tree and just uses
that.

Basically, we're doing pretty much the same as Git is doing in its build
system.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-14 11:23:31 +03:00
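The version format from commit 5ef77a8a56 above, `<version>-<release>` with release being `<date>-<git-hash>`, can be sketched as simple string composition. The real values come from the SCYLLA-VERSION-GEN script; `build_id` and the two functions below are hypothetical, and the exact date and hash formats are not specified by the commit message.

```cpp
#include <string>

// Hypothetical container for the three components named in the commit message.
struct build_id {
    std::string version;
    std::string date;
    std::string git_hash;
};

// release = <date>-<git-hash>
inline std::string release_string(const build_id& id) {
    return id.date + "-" + id.git_hash;
}

// full version = <version>-<release>
inline std::string full_version(const build_id& id) {
    return id.version + "-" + release_string(id);
}
```

With placeholder components `X`, `D` and `H`, `full_version` produces `X-D-H`.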
Avi Kivity
6ccb0b15b5 Merge "Move the API configuration from command line to configuration" from Amnon
"It moves the API configuration from the command line argument to the general
config; it also makes the api-doc directory configurable instead of hard
coded."
2015-09-09 12:35:03 +03:00
Calle Wilund
945d2f73b3 Main: Do not actually stop any services on exit.
* Issue the "stop" method on DB (flushed CL + tables (partially))
* Do hard exit (_exit) to escape destructors and sanity checks.

This patch is horrible but sort of a workaround for various interdependency
shutdown issues. Until services can actually be turned off, this might be
a viable option.

Refs #293. I will not call it a fix.
2015-09-08 11:13:34 +02:00
Amnon Heiman
d2556787d8 main: Take the http configuration from the configuration object
This changes the http configuration to use the general configuration
object instead of the command line argument. This will allow configuring
the API from the configuration file and not just from the command
line.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-09-08 03:07:39 +03:00