* seastar 0083ee8...85ca12d (1):
> Merge "Run-time logging configuration" from Jesse
Includes patch from Jesse:
"Switch to Seastar for logging option handling
In addition to updating the abstraction layer for Seastar logging in `log.hh`,
the configuration system (`db/config.{hh,cc}`) has been updated in two ways:
- The string-map type for Boost.program_options is now defined in Seastar.
- A configuration value can be marked as `UsedFromSeastar`. This is like `Used`,
except the option is expected to be defined in the Boost.Program_options
description for Seastar. If the option is not defined in Seastar, or it is
defined with a different type, then a run-time exception is thrown early in
Scylla's initialization. This is necessary because logging options which are
now defined in Seastar were previously defined in Scylla and support for these
options in the YAML file cannot be dropped. In order to be able to verify that
options marked `UsedFromSeastar` are actually defined in Seastar, the
interface for adding options to `db::config` has changed from taking a
`boost::program_options::options_description_easy_init` (which is handle into
a `boost::program_options::options_description` which only allows adding
options) to taking a `boost::program_options::options_description`
directly (which also allows querying existing options).
Scylla also fully defers to Seastar's support for run-time logging
configuration."
Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <ef26cffb91bef1ae95d508187a6dd861a6c4fc84.1503344007.git.jhaberku@scylladb.com>
Some (most?) users don't read logs or release notes, so they won't notice
that the ByteOrdered and Random partitioners were deprecated in 2.0. Make
them notice by refusing to start with a deprecated partitioner, unless a
switch is explicitly enabled.
Message-Id: <20170820073424.8331-1-avi@scylladb.com>
Large allocations can require cache evictions to be satisfied, and can
therefore induce long latencies. Enable the seastar large allocation
warning so we can hunt them down and fix them.
Message-Id: <20170819135212.25230-1-avi@scylladb.com>
* Configuration for cluster_name is commented-out in config file.
* Default value set to empty string and if not rewritten by user then
warning is printed and value is reset to "ScyllaDB Cluster".
Fixes#2648.
Message-Id: <20170808113322.9313-1-daniel@scylladb.com>
The default of 2ms is somewhat arbitrary. Now that we have a lot more
mileage deploying Scylla applications in production it does sound not
only arbitrary, but high.
In particular, it is really hard to achieve 1ms latencies in the face of
CPU-heavy workloads with it.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <1499354495-27173-1-git-send-email-glauber@scylladb.com>
Boot should not continue until a future returned by
wait_for_gossip_to_settle() is resolved. Commit 991ec4a16 mistakenly
broke that, so restore it back. Also fix calls for supervisor::notify()
to be in the right places.
Message-Id: <20170702082355.GQ14563@scylladb.com>
Node may go down, so after it restarts cache hit rate info will be
incorrect and it can be overwhelmed with traffic until new and
up-to-date cache hit rate arrives. Solve this by dropping node's
information on connection reset, it is more accurate than relying on
gossip which may be slow and miss reboot of a node.
This patch adds new class cache_hitrate_calculator whose responsibility
is to periodically calculate average cache hit rates between all shards
for each CF.
Removing non-murmur3 partitioners will allow us to reduce memory footprint
and speed up some code by utilizing the properties of the murmur3 partitioner
token.
Message-Id: <20170528172536.16079-1-avi@scylladb.com>
- introcduced "seastarx.hh" header, which does a "using namespace seastar";
- 'net' namespace conflicts with seastar::net, renamed to 'netw'.
- 'transport' namespace conflicts with seastar::transport, renamed to
cql_transport.
- "logger" global variables now conflict with logger global type, renamed
to xlogger.
- other minor changes
There is no need to wait when starting the prometheus server. As it is
up to each of the modules to register its metrics when it is ready.
This is especially important when debuging boot issues.
This patch moves the prometheus initilization to be done at an early
stage of the boot sequencec.
Fixes#2144
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1489041986-28974-1-git-send-email-amnon@scylladb.com>
Refs #1813 (fixes scylla part)
Added require_client_auth and priority_string options to
server_encryption_options/client_encryption_options an process them.
Allows TLS method/algo specification. Also enabled enforcing known cert
authentication for both node-to-node and client communication.
"Intended to reduce memory usage when resharding by sharing sstable
components among shards. File descriptors are also shared from now
on, meaning that a much smaller number of file descriptors will be
used during resharding.
Fixes #1951."
branch 'excessive_memory_usage_v4' of github.com:raphaelsc/scylla
* 'excessive_memory_usage_v4' of github.com:raphaelsc/scylla:
db: avoid excessive memory usage during resharding
checked_file_impl: add support to dup
sstables: group sstable components that can be shared among shards
sstables: rename sstable member
After resharding, sstables may be owned by all shards, which
means that file descriptors and memory usage for metadata will
increase by a factor equal to number of shards. That can easily
lead to OOM.
SSTable components are immutable, so they can be stored in one
shard and shared with others that need it. We use the following
formula to decide which shard will open the sstable and share
it with the others: (generation % smp::count), which is the
inverse of how we calculate generation for new sstables.
So if no resharding is performed, everything is shard-local.
With this approach, resource usage due to loaded sstables will
be evenly distributed among shards.
For this approach to work, we now only populate keyspaces from
shard 0. It's now the sole responsible for iterating through
column family dirs. In addition, most of population functions
are now free and take distributed database object as parameter.
Fixes#1951.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Transform the supervisor_notify() and related functions into
the "supervisor" class and place this class implementation in
a separate .cc file.
This is going to fix the compilation breakage of tests introduced
by a
commit 8014adc2a1
init: serialize the creation of system_traces KS objects
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1483663955-20096-1-git-send-email-vladz@scylladb.com>
Serialize the creation of a system_traces KS objects when
they do not exist - the initial cluster boot.
Avoid creating them in parallel by different cluster Nodes
in order to avoid issue #420.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1483552503-12873-3-git-send-email-vladz@scylladb.com>
Previously, if the Prometheus port (by default, 0.0.0.0:9180) could not
be opened, the following message appeared in the log about 10 seconds into
the run, and Scylla crashed.
ERROR 2017-01-01 19:31:04,066 [shard 0] seastar - Exiting on unhandled exception: std::system_error (error system:98, Address already in use)
The puzzled user would have no idea *which* address was already in use, why,
or why Scylla stopped.
In this patch, before the above message we get the much more informative
message:
ERROR 2017-01-01 19:58:19,080 [shard 0] init - Could not start Prometheus API server on 0.0.0.0:9180: std::system_error (error system:98, Address already in use)
We continue to print the original message - and exit - in this case,
under the assumption that it's better not to run the database while
improperly configured.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20170102121304.2060-1-nyh@scylladb.com>
Refuse to boot if we don't have at least 1 GiB per shard, unless in developer
mode.
The primary violator here is docker, but since it starts in developer mode,
it won't get fixed. We need some extra logic for this case.
Message-Id: <20161221090222.28677-1-avi@scylladb.com>
Shared sstables will now be resharded in the same order to guarantee
that all shards owning a sstable will agree on its deletion nearly
the same time, therefore, reducing disk space requirement.
That's done by picking which column family to reshard in UUID order,
and each individual column family will reshard its shared sstables
in generation order.
Fixes#1952.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <87ff649ed24590c55c00cbb32bffd8fa2743e36e.1482342754.git.raphaelsc@scylladb.com>
RPC messaging service is initialized before the Tracing service, so
we should prevent creation of tracing spans before the service is
fully initialized.
We will use an already existing "_down" state and extend it in a way
that !_down equals "started", where "started" is TRUE when the local
service is fully initialized.
We will also split the Tracing service initialization into two parts:
1) Initialize the sharded object.
2) Start the tracing service:
- Create the I/O backend service.
- Enable tracing.
Fixes issue #1939
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1481836429-28478-1-git-send-email-vladz@scylladb.com>
The problem is that replay will unlink any segments which were on disk
at the time the replay starts. However, some of those segments may
have been created by current node since the boot. If a segment is part
of reserve for example, it will be unlinked by replay, but we will
still use that segment to log mutations. Those mutations will not be
visible to replay after a crash though.
The fix is to record preexisting segents before any new segments will
have a chance to be created and use that as the replay list.
Introduced in abe7358767.
dtest failure:
commitlog_test.py:TestCommitLog.test_commitlog_replay_on_startup
Message-Id: <1481117436-6243-1-git-send-email-tgrabiec@scylladb.com>
Since Ubuntu 15.10/16.04 still uses Upstart to manage GUI session (not as init), when we directly launch Scylla on Ubuntu's GUI Terminal(not using systemctl or initctl), raise(SIGSTOP) mistakenly calls (Because GUI session has "UPSTART_JOB" environment variable, won't happen when running Scylla as systemd service).
To avoid this, we need to verify UPSTART_JOB == "scylla-server".
If it's part of GUI session UPSTART_JOB has to be "unity7", we need to avoid raise(SIGSTOP) in that case.
Fixes#1199
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1480620421-28967-1-git-send-email-syuu@scylladb.com>
"We currently write the size_estimates system table for every schema
on a periodic basis, currently set to 5 minutes, which can interfere
with an ongoing workload.
This patchset virtualizes it such that queries are intercepted and we
calculate the results on the fly, only for the ranges the caller is interested in.
Fixes#1616"
* 'virtual-estimates/v4' of github.com:duarten/scylla:
size_estimates_virtual_reader: Add unit test
db: Delete size_estimates_recorder
size_estimates: Add virtual reader
column_family: Add support for virtual readers
storage_service: get_local_tokens() returns a future
nonwrapping_range: Add slice() function
range: Find a sequence's lower and upper bounds
system_keyspace: Build mutations for size estimates
size_estimates: Store the token range as bytes
range_estimates: Add schema
murmur3_partitioner: Convert maximum_token to sstring
Now that access to the size_estimates system is virtualized, we no
longer need the recorder.
Fixes#1616
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Exception handling was broken because after io checker, storage_io_error
exception is wrapped around system error exceptions. Also the message
when handling exception wasn't precise enough for all cases. For example,
lack of permission to write to existing data directory.
Fixes#883.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <b2dc75010a06f16ab1b676ce905ae12e930a700a.1478542388.git.raphaelsc@scylladb.com>
Wrapping ranges are a pain, so we are moving wrap handling to the edges.
Since cql can't generate wrapping ranges, this means thrift and the ring
maintenance code; also range->ring transformations need to merge the first
and last ranges.
Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>