The condition in question is sanity check for a SW bug.
This SW bug (if occurs) is not critical - there is an additional protection
against it in the stop_foreground_and_write().
Having said all that, since we shell not throw from a destructor,
replace throwing of a std::logic_error with an logger error message.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1471773320-7398-1-git-send-email-vladz@cloudius-systems.com>
Glauber "eagle eyes" Costa pointed out that the Scylla logo used in our
Docker image documentation looks broken because it's missing the Scylla
text.
Fix the problem by using the Scylla mascot instead.
Message-Id: <1471525154-2800-1-git-send-email-penberg@scylladb.com>
The bug tracker URL in our Docker image documentation is not clickable
because the URL Markdown extracts automatically is broken.
Fix that and add some more links on how to get help and report issues.
Message-Id: <1471524880-2501-1-git-send-email-penberg@scylladb.com>
allow user to use the `supervisorctl' program to start and stop
services. `exec` needed to be added to the scylla and scylla-jmx starter
scripts - otherwise supervisord loses track of the actual process we
want to manage.
Signed-off-by: Yoav Kleinberger <yoav@scylladb.com>
Message-Id: <1471442960-110914-1-git-send-email-yoav@scylladb.com>
"This series includes a time stamp representation changes Avi asked.
In addition is fixes a session "duration" semantics to be the time
it took to satisfy the user's request and not a time it took to
achieve the complete replication factor."
* seastar 823a404...81df893 (3):
> memory: Do not increase g_allocs on failure in allocate and allocate_aligned
> memory: Balance the g_frees and g_allocs
> Merge "thread: explicitly yield on get()" from Glauber
Fixes#1586.
Once unlink_leftmost_without_rebalance() has been called on a bi::set no
other method can be used. This includes clear_and_disposed() used by the
mutation_partition destructor.
We like unlink_leftmost_without_rebalance() because it is efficient, so
the solution is to manually finish destroying clustering row and range
tombstone sets in the reader destructor using that function.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
mutation_fragment() constructor allocates memory. If it fails the
already unlinked parts of mutation (either rows_entry or range_tombtone)
will be leaked.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
The WorkingDirectory directive does not support environment variables on
systemd version that is shipped with Ubuntu 16.04. Fortunately, not
setting WorkingDirectory implicitly sets it to user home directory,
which is the same thing (i.e. /var/lib/scylla).
Fixes#1319
Signed-of-by: Benoit Canet <benoit@scylladb.com>
Message-Id: <1470053876-1019-1-git-send-email-benoit@scylladb.com>
We have two counters that tracks how many memtable flushes are in progress, and
how much memory are they pinning.
The problem is, after we have revamped the code to limit the amount of flushes
in progress, those counters became useless: as they live inside the semaphore
side, they will only be incremented once we have past the semaphore.
One wouldn't notice if working with CPU-bound problems, where memtables don't
pile. But as soon as they do, those counters will always show the same numbers:
the depth of the semaphore, which doesn't mean much. The problem is poised to
become much worse: once we enable write behind in full and set the semaphore's
depth to one, that's the number we'll see here all the time.
The fix is to move the counters outside the semaphore, which will bring back its
old semantics.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <c5ae6903e170f3f356cdda7ed78a4c9ba8d5f024.1471370504.git.glauber@scylladb.com>
A session's "duration" should be a time it took to
handle a request, which is a time till response to a user.
In other words - till a consistency level is reached.
Before this patch is was a time that takes a complete
handling of a request, which is the time it takes to handle
all replicas and not only those required to reach a CL.
This patch fixes this situation by extending the trace_state's state
values to 3 states: inactive, foreground and background.
A primary session may be in 3 states:
- "inactive": between the creation and a begin() call.
- "foreground": after a begin() call and before a
stop_foreground_and_write() call.
- "background": after a stop_foreground_and_write() call and till the
state object is destroyed.
- Traces are not allowed while state is in an "inactive" state.
- The time the primary session was in a "foreground" state is the time
reported as a session's "duration".
- Traces that have arrived during the "background" state will be recorded
as usual but their "elapsed" time will be greater or equal to the
session's "duration".
Secondary sessions may only be in an "inactive" or in a "foreground"
states.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
- Define an tracing::elapsed_clock type (std::chrono::steady_clock).
Use it instead of trace_state::clock_type.
- Store the "elapsed" information in a form of elapsed_clock::duration.
- Make all keyspace_backend specific conversions inside the trace_keyspace_helper
class, where they belong.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
events_records are promised to be kept alive till the future returned
by apply_events_mutation() resolves: it's dowithificated by a caller already.
In addition, since its passed by a reference, it's a logical thing to demand
it to be kept alive by a caller till the future above resolves.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
"This series changes the tracing back pressure scheme from limiting the amount traces in
a single session by a fixed number to have a per-shard budget consumed by all active tracing
sessions.
It was really easy to cause the traces to be dropped even if there weren't too many
active traces: e.g. if there was a single active session which creates more traces
than a per-session limit (30) the traces above 30-th were going to be dropped. Namely
traces were dropped when there were only 30 active traces, which is ridiculous.
This series introduces two main changes:
- Changes the records budgeting from being per-session to be per-shard. This substantially
increases the amount of active records after which new records are going to be dropped.
- Introduces a flow when events' records are written BEFORE the corresponding tracing
session is over (right now traces are written to I/O back end only when the session object
is destroyed).
The later is meant to virtually eliminate the traces drops in normal situations at all.
Of course, if a back end is slow or if there are a lot of small sessions that do not complete we would still have
to drop new sessions/records in order to avoid uncontrolled growth of a memory foot print of Tracing.
If we see the later case happening a lot in the future we may add lowres timers to each session that would
commit the cached records for writing every X time. But let's not try to optimize something that we
are not completely sure has to be optimized... "
The histogram implementation uses sampling to estimate the mean and sum.
This patch adds a method that returns an estimated sum based on the mean
and the total number of events measured.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1467547341-30438-2-git-send-email-amnon@scylladb.com>
Currently, query_processor.cc code formatting is all over the place,
which makes the file hard to read. Apply some formatting magic to make
it prettier.
Message-Id: <1470832486-26020-2-git-send-email-penberg@scylladb.com>
"Ranges that wrap around are a source of complexity and bugs. This patchset
adds a nonwrapping_range class, which specifies the range can't wrap around.
It is the user of the nonwrapping_range that is required to enforce this
constraint.
The idea is to incrementaly disallow ranges that wrap around. We do it
for query::clustering_range in this patchset, and it can be done similarly
for other ranges. This moves the burden of unwrapping ranges to the edges.
Fixes#1544"
This patch changes the type of query::clustering_range to express that
ranges that wrap around are not allowed, and ranges that have the
start bound after the end bound are considered empty.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch makes the storage_proxy return an empty result when the
query doesn't define any clustering ranges (default or specific).
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch makes make_clustering_range not enforce that the range be
non-wrapping, so that it can be validated differently if needed. A
make_clustering_range_and_validate function is introduced that keeps
the old behavior.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch introduces the nonwrapping_range class. This class is
intended to be used by code that requires non wrapping ranges.
Internally, it uses a wrapping_range. Users are responsible for
ensuring the bounds are correct when creating a nonwrapping_range.
The path proposed here is to incrementally replace usages of
wrapping_range/range by nonwrapping_range, pushing usages of wrapping
ranges as further to the edges as possible.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch renames range to wrapping_range in preparation for adding a
new range type, nonwrapping_range.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Currently, partition snapshot destructor can throw which is a big no-no.
The solution is to ignore the exception and leave versions unmerged and
hope that subsequent reads will succeed at merging.
However, another problem is that the merge doesn't use allocating
sections which means that memory won't be reclaimed to satisfy its
needs. If the cache is full this may result in partition versions not
being merged for a very long time.
This patch introduces partition_snapshot::merge_partition_versions()
which contains all the version merging logic that was previously present
in the snapshot destructor. This function may throw so that it can be
used with allocating sections.
The actual merging and handling of potential erros is done from
partition_snapshot_reader destructor. It tries to merge versions under
the allocating section. Only if that fails it gives up and leaves them
unmerged.
Fixes#1578Fixes#1579.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1471265544-23579-1-git-send-email-pdziepak@scylladb.com>
$ tools/scyllatop/scyllatop.py '*gossip*'
node-1/gossip-0/gauge-heart_beat_version 1.0
node-2/gossip-0/gauge-heart_beat_version 1.0
node-3/gossip-0/gauge-heart_beat_version 1.0
Gossip heart beat version changes every second. If everyting is working
correctly, the gauge-heart_beat_version output should be 1.0. If not,
the gauge-heart_beat_version output should be less than 1.0.
Message-Id: <cbdaa1397cdbcd0dc6a67987f8af8038fd9b2d08.1470712861.git.asias@scylladb.com>
[v2: fix check for static column (don't check if the schema is not compound)
and move want-static-columns flag inside the filtering context to avoid
changing all the callers.]
When a CQL request asks to read only a range of clustering keys inside
a partition, we actually need to read not just these clustering rows, but
also the static columns and add them to the response (as explained by Tomek
in issue #1568).
With the current code, that CQL request is translated into an
sstable::read_row() with a clustering-key filter. But this currently
only reads the requested clustering keys - NOT the static columns.
We don't want sstable::read_row() to unconditionally read the from disk
the static columns because if, for example, they are already cached, we
might not want to read them from disk. We don't have such partial-partition
cache yet, but we are likely to have one in the future.
This patch adds in the clustering key filter object a flag of whether we
need to read the static columns (actually, it's function, returning this
flag per partition, to match the API for the clustering-key filtering).
When sstable::read_row() sees the flag for this partition is true, it also
request to read the static columns.
Currently, the code always passes "true" for this flag - because we don't
have the logic to cache partially-read partitions.
The current find_disk_ranges() code does not yet support returning a non-
contiguous byte range, so this patch, if it notices that this partition
really has static columns in addition to the range it needs to read,
falls back to reading the entire partition. This is a correct solution
(and fixes#1568) but not the most efficient solution. Because static
columns are relatively rare, let's start with this solution (correct
by less efficient when there are static columns) and providing the non-
contiguous reading support is left as a FIXME.
Fixes#1568
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1471124536-19471-1-git-send-email-nyh@scylladb.com>
When the housekeeping configuration name was changed from conf to cfg it
was no longer included as part of the conf rpm.
This change adds a macro that determines of if the file should be
included or not and use that marco to conditionally add the
configuration file to the rpm.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1471169042-19099-1-git-send-email-amnon@scylladb.com>
Files with a conf extension are run by the scylla_prepare on the AMI.
The scylla-housekeeping configuration file is not a bash script and
should not be run.
This patch changes its extension to cfg which is more python like.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1470896759-22651-2-git-send-email-amnon@scylladb.com>