Commit Graph

10283 Commits

Author SHA1 Message Date
Paweł Dziepak
3fe5ed3cd9 query: use result_view::consume() where appropriate
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
cb2a557cf7 query::result: reduce chunk count
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
ea3ac0a270 frozen_mutation: reduce chunk count in constructor
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
1daf4c73a3 frozen_mutation: avoid buffer linearization and copy
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
3707d7fec3 frozen_mutation: use bytes_ostream internally
Unlike bytes, bytes_ostream supports fragmented buffers, thus reducing
the pressure on the memory allocator caused by large frozen partitions.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
c0425b63ff frozen_mutation: add mutation_view()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
89f7b46f61 idl: switch to utils::input_stream
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
dcf794b04d idl: make bytes compatible with bytes_ostream
This patch makes idl type "bytes" compatible with both bytes and
bytes_ostream.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
2c5ec44281 atomic_cell: add overloads taking const bytes&
Deserialization code is going to use a proxy object that will be casted
to either bytes or bytes_ostream depending on the demand. It cannot be
casted directly to bytes_view though as it won't extend the lifetime of
the buffer appropriately. The simples solution is just to add overloads
that accept const bytes&.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
387434a76c bytes_ostream: add reduce_chunk_count()
Deserialization code has now two variants. The faster one can be used
only when the source buffer is not fragmented. reduce_chunk_count() aims
to increase number of cases when the fast path can be used.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
7d4b7fd5fc bytes_ostream: add equality operator
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
222bde7e6f bytes_ostream: introduce upper bound on chunk size
This patch makes append() and write() limit the maximum size of a single
allocation to bytes_ostream::max_chunk_size.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
0ee98ea4c4 tests: add fragmented input stream test
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
e76203c927 utils: add input_stream
input_stream performs a type erasure on seastar::simple_input_stream and
fragmented_input_stream. The main goal is to keep the overhead for the
cases when simple_input_stream is used minimum.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Paweł Dziepak
29827a9726 utils: add fragmented_input_stream
fragmented_input_stream is an input stream usable by IDL-generated
deserializers which can read from fragmented buffers.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-22 09:31:33 +01:00
Asias He
4ffd867ad0 gossip: Add log when cluster or partioner mismatch
It is easier for user to figure out the configuration error.

The log looks like:

   WARN  2016-08-22 15:04:56,214 [shard 0] gossip - ClusterName mismatch
   from 127.0.0.2 test2!=test

   WARN  2016-08-22 15:06:16,106 [shard 0] gossip - Partitioner mismatch from 127.0.0.2
   org.apache.cassandra.dht.RandomPartitioner!=org.apache.cassandra.dht.Murmur3Partitioner

Fixes: #1587

Message-Id: <745ed8857da6f70745735b94eef7b226d2f22e10.1471849834.git.asias@scylladb.com>
2016-08-22 11:06:31 +03:00
Raphael S. Carvalho
77d4cd21d7 sstables: Fix estimation of pending tasks for leveled strategy
There were two underflow bugs.

1) in variable i, causing get_level() to see an invalid level and
throw an exception as a result.
2) when estimating number of pending tasks for a level.

Fixes #1603.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <cce993863d9de4d1f49b3aabe981c475700595fc.1471636164.git.raphaelsc@scylladb.com>
2016-08-22 10:37:15 +03:00
Vlad Zolotarov
92921fe110 tracing::trace_state: push the UUID to the end of an error message in a destructor
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1471780783-25406-1-git-send-email-vladz@cloudius-systems.com>
2016-08-21 16:50:52 +03:00
Vlad Zolotarov
0683d4bd29 tracing::trace_state: don't throw in a destructor
The condition in question is sanity check for a SW bug.

This SW bug (if occurs) is not critical - there is an additional protection
against it in the stop_foreground_and_write().

Having said all that, since we shell not throw from a destructor,
replace throwing of a std::logic_error with an logger error message.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1471773320-7398-1-git-send-email-vladz@cloudius-systems.com>
2016-08-21 13:50:52 +03:00
Avi Kivity
2dcbbf9006 Merge seastar upstream
* seastar 4254111...6fadd98 (6):
  > simple_input_stream: remove explicit copy constructor
  > simple_stream: add [[gnu::always_inline]]
  > perf_fstream: don't initialize variable-size array inline
  > rpc: annotate template function calls with "template"
  > iotune: avoid "auto" parameters
  > file: ext4 support
2016-08-21 13:38:43 +03:00
Paweł Dziepak
e60bb83688 sstables: optimise clustering rows filtering
Clustering rows in the sstables are sorted in the ascending order so we
can use that to minimise number of comparisons when checking if a row is
in the requested range.

Refs #1544.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Reviewed-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <1471608921-30818-1-git-send-email-pdziepak@scylladb.com>
2016-08-19 18:11:11 +03:00
Pekka Enberg
2bf5e8de6e dist/docker: Use Scylla mascot as the logo
Glauber "eagle eyes" Costa pointed out that the Scylla logo used in our
Docker image documentation looks broken because it's missing the Scylla
text.

Fix the problem by using the Scylla mascot instead.
Message-Id: <1471525154-2800-1-git-send-email-penberg@scylladb.com>
2016-08-19 12:50:02 +03:00
Pekka Enberg
4d90e1b4d4 dist/docker: Fix bug tracker URL in the documentation
The bug tracker URL in our Docker image documentation is not clickable
because the URL Markdown extracts automatically is broken.

Fix that and add some more links on how to get help and report issues.
Message-Id: <1471524880-2501-1-git-send-email-penberg@scylladb.com>
2016-08-19 12:49:52 +03:00
Yoav Kleinberger
25fb5e831e docker: extend supervisor capabilities
allow user to use the `supervisorctl' program to start and stop
services. `exec` needed to be added to the scylla and scylla-jmx starter
scripts - otherwise supervisord loses track of the actual process we
want to manage.

Signed-off-by: Yoav Kleinberger <yoav@scylladb.com>
Message-Id: <1471442960-110914-1-git-send-email-yoav@scylladb.com>
2016-08-18 15:08:11 +03:00
Avi Kivity
d33d958239 Merge "tracing: cleanups" from Vlad
"This series includes a time stamp representation changes Avi asked.
In addition is fixes a session "duration" semantics to be the time
it took to satisfy the user's request and not a time it took to
achieve the complete replication factor."
2016-08-18 14:36:19 +03:00
Pekka Enberg
1553bec57a dist/docker: Documentation cleanups
- Fix invisible characters to be space so that Markdown to PDF
  conversion works.

- Fix formatting of examples to be consistent.

- Spellcheck.

Message-Id: <1471514924-29361-1-git-send-email-penberg@scylladb.com>
2016-08-18 13:09:37 +03:00
Pekka Enberg
4ca260a526 dist/docker: Document image command line options
This patch documents all the command line options Scylla's Docker image supports.

Message-Id: <1471513755-27518-1-git-send-email-penberg@scylladb.com>
2016-08-18 13:01:58 +03:00
Avi Kivity
42094524e7 Merge seastar upstream
* seastar ab29b12...4254111 (3):
  > file: fix size() return type
  > build: adjust -fsantize=vptr broken warning
  > thread_impl.hh: add missing include
2016-08-18 10:34:58 +03:00
Avi Kivity
d0308ff488 Merge seastar upstream
* seastar 81df893...ab29b12 (1):
  > core: Fix bug in make_file_impl() which affects directory scanning
2016-08-17 21:57:03 +03:00
Amos Kong
9d53305475 systemd: have the first housekeeping check right after start
Issue: https://github.com/scylladb/scylla/issues/1594

Currently systemd run first housekeeping check at the end of
first timer period. We expected it to be run right after start.

This patch makes systemd to be consistent with upstart.

Signed-off-by: Amos Kong <amos@scylladb.com>
Message-Id: <4cc880d509b0a7b283278122a70856e21e5f1649.1471433388.git.amos@scylladb.com>
2016-08-17 16:02:00 +03:00
Avi Kivity
0033197ba9 Merge seastar upstream
* seastar 823a404...81df893 (3):
  > memory: Do not increase g_allocs on failure in allocate and allocate_aligned
  > memory: Balance the g_frees and g_allocs
  > Merge "thread: explicitly yield on get()" from Glauber

Fixes #1586.
2016-08-17 13:28:30 +03:00
Avi Kivity
4871b19337 Merge "Fixes for streamed_mutation_from_mutation" from Paweł
"This series contains fixes for two memory leaks in
streamed_mutation_from_mutation.

Fixes #1557."
2016-08-17 13:24:22 +03:00
Avi Kivity
e7eb76fc58 Introduce stdx.hh header file
So we don't have to create an stdx = std::experimental alias everywhere.
Message-Id: <1471417039-21391-1-git-send-email-avi@scylladb.com>
2016-08-17 11:19:49 +01:00
Paweł Dziepak
148e9c5608 streamed_mutation_from_mutation: fix destroying bi::sets
Once unlink_leftmost_without_rebalance() has been called on a bi::set no
other method can be used. This includes clear_and_disposed() used by the
mutation_partition destructor.

We like unlink_leftmost_without_rebalance() because it is efficient, so
the solution is to manually finish destroying clustering row and range
tombstone sets in the reader destructor using that function.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-17 11:03:59 +01:00
Paweł Dziepak
fe9575d01d streamed_mutation_from_mutation: fix leak on allocation failure
mutation_fragment() constructor allocates memory. If it fails the
already unlinked parts of mutation (either rows_entry or range_tombtone)
will be leaked.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-08-17 11:02:24 +01:00
Benoit Canet
90ef150ee9 systemd: Remove WorkingDirectory directive
The WorkingDirectory directive does not support environment variables on
systemd version that is shipped with Ubuntu 16.04. Fortunately, not
setting WorkingDirectory implicitly sets it to user home directory,
which is the same thing (i.e. /var/lib/scylla).

Fixes #1319

Signed-of-by: Benoit Canet <benoit@scylladb.com>
Message-Id: <1470053876-1019-1-git-send-email-benoit@scylladb.com>
2016-08-17 12:34:11 +03:00
Raphael S. Carvalho
108fd1fade database: close file in lister
After listing is done, let's close file. This fixes no bug.
It's only an improvement.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <2f52d297bcf6a6b6e3429912c28f17e6b37f8842.1471381607.git.raphaelsc@scylladb.com>
2016-08-17 11:01:44 +03:00
Glauber Costa
b361dee488 database: memtables pending flushes tell us nothing
We have two counters that tracks how many memtable flushes are in progress, and
how much memory are they pinning.

The problem is, after we have revamped the code to limit the amount of flushes
in progress, those counters became useless: as they live inside the semaphore
side, they will only be incremented once we have past the semaphore.

One wouldn't notice if working with CPU-bound problems, where memtables don't
pile. But as soon as they do, those counters will always show the same numbers:
the depth of the semaphore, which doesn't mean much. The problem is poised to
become much worse: once we enable write behind in full and set the semaphore's
depth to one, that's the number we'll see here all the time.

The fix is to move the counters outside the semaphore, which will bring back its
old semantics.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <c5ae6903e170f3f356cdda7ed78a4c9ba8d5f024.1471370504.git.glauber@scylladb.com>
2016-08-17 10:54:15 +03:00
Piotr Jastrzebski
bb0c4c3c40 Fix compilation errors
query::range parameter in mutation_partiton::range
has to be changed to nonwrapping_range.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <36e444bfe90586f8d3b08ca36d8dc13d5898ef97.1471347402.git.piotr@scylladb.com>
2016-08-16 12:49:54 +01:00
Vlad Zolotarov
37da6f53f8 tracing: fix a session "duration" semantics
A session's "duration" should be a time it took to
handle a request, which is a time till response to a user.
In other words - till a consistency level is reached.

Before this patch is was a time that takes a complete
handling of a request, which is the time it takes to handle
all replicas and not only those required to reach a CL.

This patch fixes this situation by extending the trace_state's state
values to 3 states: inactive, foreground and background.

A primary session may be in 3 states:
  - "inactive": between the creation and a begin() call.
  - "foreground": after a begin() call and before a
    stop_foreground_and_write() call.
  - "background": after a stop_foreground_and_write() call and till the
    state object is destroyed.

- Traces are not allowed while state is in an "inactive" state.
- The time the primary session was in a "foreground" state is the time
  reported as a session's "duration".
- Traces that have arrived during the "background" state will be recorded
  as usual but their "elapsed" time will be greater or equal to the
  session's "duration".

Secondary sessions may only be in an "inactive" or in a "foreground"
states.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-16 12:32:34 +03:00
Vlad Zolotarov
f83e33fc13 tracing: make "elapsed" be std::chrono::duration
- Define an tracing::elapsed_clock type (std::chrono::steady_clock).
     Use it instead of trace_state::clock_type.
   - Store the "elapsed" information in a form of elapsed_clock::duration.
   - Make all keyspace_backend specific conversions inside the trace_keyspace_helper
     class, where they belong.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-16 12:32:34 +03:00
Vlad Zolotarov
ebf13da9c9 tracing::session_record: make start_at to be a time_point
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-16 12:32:34 +03:00
Vlad Zolotarov
5e24f6f442 trace_keyspace_helper::apply_events_mutation(): Avoid extra std::move of std::deque
events_records are promised to be kept alive till the future returned
by apply_events_mutation() resolves: it's dowithificated by a caller already.

In addition, since its passed by a reference, it's a logical thing to demand
it to be kept alive by a caller till the future above resolves.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-08-16 12:32:34 +03:00
Avi Kivity
bf02ca831d Merge "Tracing: change a back pressure scheme" from Vlad
"This series changes the tracing back pressure scheme from limiting the amount traces in
a single session by a fixed number to have a per-shard budget consumed by all active tracing
sessions.

It was really easy to cause the traces to be dropped even if there weren't too many
active traces: e.g. if there was a single active session which creates more traces
than a per-session limit (30) the traces above 30-th were going to be dropped. Namely
traces were dropped when there were only 30 active traces, which is ridiculous.

This series introduces two main changes:
   - Changes the records budgeting from being per-session to be per-shard. This substantially
     increases the amount of active records after which new records are going to be dropped.
   - Introduces a flow when events' records are written BEFORE the corresponding tracing
     session is over (right now traces are written to I/O back end only when the session object
     is destroyed).

The later is meant to virtually eliminate the traces drops in normal situations at all.
Of course, if a back end is slow or if there are a lot of small sessions that do not complete we would still have
to drop new sessions/records in order to avoid uncontrolled growth of a memory foot print of Tracing.

If we see the later case happening a lot in the future we may add lowres timers to each session that would
commit the cached records for writing every X time. But let's not try to optimize something that we
are not completely sure has to be optimized... "
2016-08-16 12:21:02 +03:00
Amnon Heiman
0706db9387 API: use the estimated sum when converting histogram to json
The function that convert histogram to the json histogram object need to
use the estimated_sum to get the actual sum and not the sampled sum.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1467547341-30438-3-git-send-email-amnon@scylladb.com>
2016-08-16 11:06:51 +03:00
Amnon Heiman
4c14b2a527 histogram: Add an estimated sum method
The histogram implementation uses sampling to estimate the mean and sum.
This patch adds a method that returns an estimated sum based on the mean
and the total number of events measured.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1467547341-30438-2-git-send-email-amnon@scylladb.com>
2016-08-16 11:06:50 +03:00
Pekka Enberg
fde8677f1a cql3/query_processor: Clean up code formatting
Currently, query_processor.cc code formatting is all over the place,
which makes the file hard to read. Apply some formatting magic to make
it prettier.

Message-Id: <1470832486-26020-2-git-send-email-penberg@scylladb.com>
2016-08-16 10:39:15 +03:00
Pekka Enberg
ce07822f49 cql3/query_processor: Use type deduction to make code more readable
Use the 'auto' specifier for variables and lambda parameters to make the
code more readable.

Message-Id: <1470832486-26020-1-git-send-email-penberg@scylladb.com>
2016-08-16 10:39:11 +03:00
Avi Kivity
b1f9688432 Merge "range: Add nonwrapping_range" from Duarte
"Ranges that wrap around are a source of complexity and bugs. This patchset
adds a nonwrapping_range class, which specifies the range can't wrap around.
It is the user of the nonwrapping_range that is required to enforce this
constraint.

The idea is to incrementaly disallow ranges that wrap around. We do it
for query::clustering_range in this patchset, and it can be done similarly
for other ranges. This moves the burden of unwrapping ranges to the edges.

Fixes #1544"
2016-08-16 10:08:24 +03:00
Duarte Nunes
5161ea283f query: query::clustering_range can't wrap around
This patch changes the type of query::clustering_range to express that
ranges that wrap around are not allowed, and ranges that have the
start bound after the end bound are considered empty.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-15 14:50:20 +00:00