776 Commits

Author SHA1 Message Date
Avi Kivity
3377739fa3 main: wait for API http server to start
Wait for the future returned by the http server start process to resolve,
so we know it is started.  If it doesn't, we'll hit the or_terminate()
further down the line and exit with an error code.
Message-Id: <1452092806-11508-3-git-send-email-avi@scylladb.com>
2016-01-07 16:44:07 +02:00
Asias He
933614bdf9 main: Change API server starting message
It comes from the Seastar HTTP server and is inaccurate.

Message-Id: <6a634437d2bd4368400010e25969e215894c2df9.1452162686.git.asias@scylladb.com>
2016-01-07 15:53:28 +02:00
Asias He
8c909122a6 gossip: Add wait_for_gossip_to_settle
Implement the wait for gossip to settle logic in the bootup process.

CASSANDRA-4288

Fixes:
bootstrap_test.py:TestBootstrap.shutdown_wiped_node_cannot_join_test

1) start node2
2) wait until the cql connection with node2 is ready
3) stop node2
4) delete the data and commitlog directories for node2
5) start node2

In step 5, I sometimes saw that node2, in its shadow round, gets its own
status from other nodes in the cluster as BOOT instead of NORMAL. The
problem is that we do not wait for gossip to settle before we start the cql
server; as a result, when we stop node2 in step 3, the other nodes in the
cluster have not yet received node2's status update to NORMAL.
2016-01-07 10:09:25 +02:00
Nadav Har'El
f5b2135a80 repair: repair_checksum_range message
This patch adds a new type of message, "REPAIR_CHECKSUM_RANGE" to scylla's
"messaging_service" RPC mechanism, for the use of repair:

With this message the repair's master host tells a slave host to calculate
the checksum of a column-family's partitions in a given token range, and
return that checksum.

The implementation of this message uses the checksum_range() function
defined in the previous patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2016-01-05 15:38:40 +02:00
Avi Kivity
2ba4910385 main: verify that the NOFILE rlimit is sufficient
Require 10k files, recommend 200k.

Allow bypassing via --developer-mode.
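
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the names `nofile_verdict`, `classify_nofile` and `check_nofile_rlimit` are hypothetical, and treating developer mode as "downgrade the hard failure to a warning" is an assumption about what "bypassing" means. Only the 10k and 200k thresholds come from the message.

```cpp
#include <sys/resource.h>

#include <cerrno>
#include <cstdint>
#include <system_error>

// Hypothetical result type for this sketch.
enum class nofile_verdict { too_low, below_recommended, ok };

// Thresholds taken from the commit message: require 10k, recommend 200k.
constexpr std::uint64_t required_nofile = 10'000;
constexpr std::uint64_t recommended_nofile = 200'000;

// Pure policy function, so the decision can be tested without touching
// the limits of the running process.
nofile_verdict classify_nofile(std::uint64_t soft_limit, bool developer_mode) {
    if (soft_limit < required_nofile) {
        // Assumption: developer mode turns the fatal case into a warning.
        return developer_mode ? nofile_verdict::below_recommended
                              : nofile_verdict::too_low;
    }
    if (soft_limit < recommended_nofile) {
        return nofile_verdict::below_recommended;
    }
    return nofile_verdict::ok;
}

// Reads the soft RLIMIT_NOFILE of the current process and classifies it.
nofile_verdict check_nofile_rlimit(bool developer_mode) {
    struct rlimit lim{};
    if (::getrlimit(RLIMIT_NOFILE, &lim) != 0) {
        throw std::system_error(errno, std::generic_category(),
                                "getrlimit(RLIMIT_NOFILE)");
    }
    return classify_nofile(static_cast<std::uint64_t>(lim.rlim_cur),
                           developer_mode);
}
```

A caller would presumably exit with an error on `too_low` and log a warning on `below_recommended`.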

Fixes #692.
2015-12-30 11:02:08 +02:00
Avi Kivity
c26689f325 init: bail out if running not on an XFS filesystem
Allow an override via '--developer-mode true', and use it in
the docker setup, since that cannot be expected to use XFS.

Fixes #658.
2015-12-30 10:56:21 +02:00
Amnon Heiman
f0d68e4161 main: start the http server in the first step
This change sets the http server to start as the first step in the boot
order.

It is helpful if some other step takes a long time or gets stuck.

Fixes #725

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2015-12-29 14:20:57 +02:00
Pekka Enberg
ca1f9f1c9a main: Fix implicitly disabled client encryption options
The start_native_transport() function in storage_service expects the
'enabled' option to be defined. If the option is not defined, it means
that encryption is implicitly disabled.

Fixes #718.
2015-12-28 16:24:49 +02:00
Calle Wilund
fae3bb7a24 storage_service: Set up CQL server as SSL if specified
* Massage user options in main
* Use them in storage_service, and if needed, load certificates etc
  and pass to transport/cql server.

Conflicts:
	service/storage_service.cc
2015-12-28 10:13:48 +00:00
Calle Wilund
70f293d82e main/init: Use server_encryption_options
* Reads server_encryption_options
* Interprets the above, loads and initializes credentials,
  and uses them with messaging service init if required
2015-12-28 10:10:35 +00:00
Glauber Costa
e299127e81 main: check if options file can be read.
If we can't open the file, we will fail with a mysterious error. It is a customary
scenario, though, since people who are unaware of, or have just forgotten about, seastar's
restriction of direct io access may put those files in tmpfs and other mount points.

We have a direct_io check that is designed exactly for this purpose, so as to give
the user a better error message. This patch makes use of it.

Fixes #644

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2015-12-27 12:20:40 +02:00
Avi Kivity
167addbfe1 main: remove issue #417 (poll mode) warning
Fixed.
2015-12-09 19:00:32 +02:00
Asias He
2022117234 failure_detector: Enable phi_convict_threshold option
Adjusts the sensitivity of the failure detector on an exponential scale.

Use as:

$ scylla --phi-convict-threshold 9

Default to 8.
2015-11-30 11:09:36 +02:00
Asias He
7ddf8963f5 config: Enable broadcast_rpc_address option
With this patch, start two nodes

node 1:
scylla --rpc-address 127.0.0.1 --broadcast-rpc-address 127.0.0.11

node 2:
scylla --rpc-address 127.0.0.2 --broadcast-rpc-address 127.0.0.12

On node 1:
cqlsh> SELECT rpc_address from system.peers;

 rpc_address
-------------
  127.0.0.12

which means clients should use this address to connect to node 2 for the
cql and thrift protocols.
2015-11-24 10:07:31 +08:00
Asias He
2c8867c348 config: Enable storage_port option
2015-10-29 08:58:41 +08:00
Asias He
8218ab7922 storage_service: Implement start_native_transport and start_rpc_server
They are used for APIs. Share the code in main.cc as well.
2015-10-27 21:48:37 +08:00
Pekka Enberg
a772938e73 transport/server: Round-robin CQL request load balancing
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-10-27 13:24:58 +02:00
Vlad Zolotarov
5613979a85 utils::fb_utilities: add the ability to set a broadcast address
Add utils::fb_utilities::set_broadcast_address().
Set it to either broadcast_address or listen_address configuration value
if appropriate values are set. If none of the two values above
are set - abort the application.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Simplify the utils::fb_utilities::get_broadcast() logic.
2015-10-26 14:10:39 +02:00
Asias He
a137006d27 main: Set load_broadcaster for storage_service during startup
Note only storage_service on shard 0 will access load_broadcaster
2015-10-23 11:28:06 +08:00
Gleb Natapov
5b97604735 load_broadcaster: fix linkage error in debug mode
Also move it to service namespace.
2015-10-22 18:18:05 +02:00
Gleb Natapov
ece6c68288 convert loadBroadcaster
2015-10-22 17:10:20 +03:00
Glauber Costa
c3e68ad5a4 guarantee that CF directories exist for system tables
This is done in a single shard, so it lives in main.cc

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-17 13:08:07 +02:00
Pekka Enberg
c80deeb6fe cql3/query_processor: Add get_query_processor() helper
Signed-off-by: Pekka Enberg <penberg@scylladb.com>
2015-10-15 09:18:52 +03:00
Nadav Har'El
39f70a043d main: don't warn twice about the same directory
I was mildly annoyed by seeing two warnings about the same directory not
being XFS, when the sstable directory and the commitlog directory are the
same one (I don't know if this is typical, but this is what I do in all
my tests...). So I wrote this trivial patch to make sure not to test the
same directory twice.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-10-13 11:23:55 +03:00
Avi Kivity
e252475e67 Merge "locator: Adding EC2Snitch" from Vlad
"This series adds EC2Snitch.

Since both GossipingPropertyFileSnitch and the EC2SnitchXXX family of
snitches use the same property file, it was logical to share the
corresponding code. Most of this series does just that... "
2015-10-11 14:55:26 +03:00
Vlad Zolotarov
de6cf8db51 db::config: add get_conf_dir()
This function returns the directory containing the configuration
files. It takes the environment variables into account as follows:
   - If SCYLLA_CONF is defined - this is the directory
   - else if SCYLLA_HOME is defined, then $SCYLLA_HOME/conf is the directory
   - else "conf" is the directory, namely the configuration files should be
     looked for in ./conf
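
The lookup order above can be sketched as follows. This is only an illustration of the three rules, not the body of db::config::get_conf_dir(): the helper names `resolve_conf_dir` and `conf_dir_from_environment` are hypothetical, and returning a plain `std::string` is an assumption.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: applies the precedence rules to two values that may
// be null when the corresponding environment variable is unset.
std::string resolve_conf_dir(const char* scylla_conf, const char* scylla_home) {
    if (scylla_conf != nullptr) {
        // Rule 1: SCYLLA_CONF names the directory directly.
        return scylla_conf;
    }
    if (scylla_home != nullptr) {
        // Rule 2: fall back to $SCYLLA_HOME/conf.
        return std::string(scylla_home) + "/conf";
    }
    // Rule 3: relative "conf" directory, i.e. ./conf.
    return "conf";
}

// Hypothetical wrapper that reads the real environment.
std::string conf_dir_from_environment() {
    return resolve_conf_dir(std::getenv("SCYLLA_CONF"),
                            std::getenv("SCYLLA_HOME"));
}
```

Splitting the pure rule from the `std::getenv` calls keeps the precedence easy to test without modifying the process environment.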

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Updated get_conf_dir() description.
2015-10-08 20:57:11 +03:00
Calle Wilund
6416c62d39 main: Actually start the batchlog_manager service loop
Was not invoked previously.
2015-10-07 14:30:09 +02:00
Glauber Costa
651937becf Revert "pass db::config to storage service as well"
This reverts commit c2b981cd82.
2015-10-05 13:21:33 +02:00
Glauber Costa
c2b981cd82 pass db::config to storage service as well
We would like to access configuration, but don't want to poke other services
in order to do so.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-02 18:23:26 +02:00
Avi Kivity
01e01a2bd7 main: fix typo in 'check_direct_io_support()'
2015-09-30 20:16:07 +03:00
Glauber Costa
73a1fab273 sanity check the filesystem
For a lot of users, running Scylla on filesystems that do not support
O_DIRECT is quite frustrating: it will fail at some point, with random error messages
that aren't really meaningful.

We should try to check for that, and fail with a good error message. Also, since our
performance claims won't really hold on anything other than XFS, we should warn the user
if that is not the setup we encounter.

Fixes #409

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-09-30 17:58:27 +03:00
Glauber Costa
91408d3cbc warn users on 100 % CPU usage
Although it is technically a seastar problem, most complaints about it
come from the Scylla side.  I prefer to keep the message here so we can reference
a Scylla issue.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-09-29 16:40:24 +03:00
Paweł Dziepak
34e66e60c1 main: disable thrift by default
Fixes #205.

Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-22 09:48:44 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices
2015-09-20 10:43:39 +03:00
Calle Wilund
e001df0a35 Main: Resolve scylla.conf based on ENV vars + do more explicit error logging
Refs #135
2015-09-16 15:44:34 +03:00
Calle Wilund
bd14d40a35 main: configure logging before reading yaml as well as after
So that command line --log* options can affect config logging.
2015-09-16 15:43:32 +03:00
Pekka Enberg
eab6094124 main: Print version number at startup
Now that we have a version number, let's tell the world about it!

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-14 11:35:32 +03:00
Pekka Enberg
5ef77a8a56 build: Add version number generation
This adds version number generation in the build system. Version numbers
follow the format:

  <version>-<release>

where release consists of:

  <date>-<git-hash>

The version and release numbers are generated by the SCYLLA-VERSION-GEN
script and they are stored in SCYLLA-VERSION-FILE and
SCYLLA-RELEASE-FILE files so that other parts of the build system can
easily pick them up.

For builds that happen from release tarballs, for example,
SCYLLA-VERSION-GEN looks for a "version" file in the tree and just uses
that.

Basically, we're doing pretty much the same as Git is doing in its build
system.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-14 11:23:31 +03:00
Avi Kivity
6ccb0b15b5 Merge "Move the API configuration from command line to configuration" from Amnon
"It moves the API configuration from the command line argument to the general
config; it also makes the api-doc directory configurable instead of hard
coded."
2015-09-09 12:35:03 +03:00
Calle Wilund
945d2f73b3 Main: Do not actually stop any services on exit.
* Issue the "stop" method on DB (flushed CL + tables (partially))
* Do hard exit (_exit) to escape destructors and sanity checks.

This patch is horrible but sort of a workaround for various interdependency
shutdown issues. Until services can actually be turned off, this might be
a viable option.

Refs #293. I will not call it a fix.
2015-09-08 11:13:34 +02:00
Amnon Heiman
d2556787d8 main: Take the http configuration from the configuration object
This changes the http configuration to use the general configuration
object instead of the command line argument. This allows configuring
the API from the configuration file and not just from the command
line.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-09-08 03:07:39 +03:00
Asias He
7cc768a864 gossip: Fix wrong cluster name and partitioner name
Right now, gossip returns hard coded cluster and partitioner name.

  sstring get_cluster_name() {
      // FIXME: DatabaseDescriptor.getClusterName()
      return "my_cluster_name";
  }
  sstring get_partitioner_name() {
      // FIXME: DatabaseDescriptor.getPartitionerName()
      return "my_partitioner_name";
  }

Fix it by setting the correct names from the configuration options.

With this

   cqlsh 127.0.0.$i -e "SELECT * from system.local;"

returns the correct cluster_name.

Fixes #291
2015-09-07 09:21:18 +03:00
Shlomi Livne
2e6dd2f585 Update CQL start message
Align with origin CQL boot message

Resolves an issue when starting a cluster with CCM - wait till all
servers have opened their CQL port

Fixes #39

Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
2015-09-04 11:02:02 +02:00
Calle Wilund
2fa699896b Main: Use lock files in data dir + CL dir to ensure single instance.
Acquires and maintains lock files during execution.
Lock files are deleted on "clean" exit, and re-taken on uncontended
startup.
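
One way to get this behaviour is an RAII holder around an advisory lock, sketched below. This is not the implementation from this commit: the class name `lock_file` and its interface are hypothetical, and the use of `flock()` is an assumption chosen for the illustration (the actual locking primitive is not stated in the message).

```cpp
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

#include <optional>
#include <string>
#include <utility>

// Hypothetical RAII holder for a single lock file.
class lock_file {
    int _fd = -1;
    std::string _path;

    lock_file(int fd, std::string path) : _fd(fd), _path(std::move(path)) {}

public:
    lock_file(lock_file&& other) noexcept
        : _fd(std::exchange(other._fd, -1)), _path(std::move(other._path)) {}
    lock_file(const lock_file&) = delete;
    lock_file& operator=(const lock_file&) = delete;
    lock_file& operator=(lock_file&&) = delete;

    // "Clean" exit: delete the lock file, then release the lock by closing.
    ~lock_file() {
        if (_fd >= 0) {
            ::unlink(_path.c_str());
            ::close(_fd);
        }
    }

    // Returns a holder if the lock was taken, or nullopt if the file could
    // not be opened or another holder already has the lock.
    static std::optional<lock_file> try_acquire(const std::string& path) {
        int fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_CLOEXEC, 0644);
        if (fd < 0) {
            return std::nullopt;
        }
        // Non-blocking exclusive lock; fails if someone else holds it.
        if (::flock(fd, LOCK_EX | LOCK_NB) != 0) {
            ::close(fd);
            return std::nullopt;
        }
        return lock_file(fd, path);
    }
};
```

With this shape, a stale file left behind by a crash is not a problem: the kernel drops the lock when the process dies, so the next startup can re-take it on the existing file.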

Fixes #34
2015-09-01 17:50:18 +02:00
Calle Wilund
1ba655d905 Main: Do commit log replay at startup
2015-08-31 14:29:51 +02:00
Avi Kivity
554645db91 Revert "Merge "Move the API configuration from command line to configuration" from Amnon"
See issue #59 for details.

This reverts commit 5aa0244d32, reversing
changes made to 7fb109a58d.
2015-08-30 12:09:00 +03:00
Avi Kivity
5aa0244d32 Merge "Move the API configuration from command line to configuration" from Amnon
"This series addresses issues #59 and #23.

It moves the API configuration from the command line argument to the general
config; it also makes the api-doc directory configurable instead of hard
coded."

Fixes #59
Fixes #23
2015-08-29 12:34:04 +03:00
Avi Kivity
c734ef2b72 Merge seastar upstream
* seastar 10e09b0...2e041c2 (7):
  > Merge "Change app_template::run() to terminate when callback is done" from Tomasz
  > resource: Fix compilation for hwloc version 1.8.0
  > memory: Fix infinite recursion when throwing std::bad_alloc
  > core/reactor: Throw the right error code when connect() fails
  > future: improve exception safety
  > xen: add missing virtual destructors
  > circular_buffer: do not destroy uninitialized object

app_template::run() users updated to call app_template::run_deprecated().
2015-08-28 23:52:49 +03:00
Amnon Heiman
9ef7d1ee69 main: Take the http configuration from the configuration object
This changes the http configuration to use the general configuration
object instead of the command line argument. This allows configuring
the API from the configuration file and not just from the command
line.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-28 20:24:59 +03:00
Avi Kivity
5f62f7a288 Revert "Merge "Commit log replay" from Calle"
Due to test breakage.

This reverts commit 43a4491043, reversing
changes made to 5dcf1ab71a.
2015-08-27 12:39:08 +03:00