10509 Commits

Author SHA1 Message Date
Amos Kong
102792ee4b systemd: reset housekeeping timer at each start
Currently the housekeeping timer is not reset when we restart scylla-server.
We expect the service to run at each start, which is consistent with the
upstart script in Ubuntu 14.04.

When we restart scylla-server, the housekeeping timer will also be restarted,
so let's replace "OnBootSec" with "OnActiveSec".

Fixes: #1601

Signed-off-by: Amos Kong <amos@scylladb.com>
Message-Id: <a22943cc11a3de23db266c52fd476c08014098c4.1480607401.git.amos@scylladb.com>
2016-12-06 18:34:09 +02:00
Takuya ASADA
125c39d8d1 dist/common/systemd/scylla-housekeeping.timer: workaround to avoid crash of systemd on RHEL 7.3
RHEL 7.3's systemd contains a known bug in timer.c:
https://github.com/systemd/systemd/issues/2632

This is a workaround to avoid hitting the bug.

Fixes #1846

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1480452194-11683-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 8464903021)
2016-12-06 10:49:12 +02:00
Glauber Costa
38e78bb8a2 commitlog: use read ahead for replay requests
Aside from putting the requests in the commitlog class, read ahead
will help us go through the file faster.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 59a41cf7f1)
2016-12-01 16:15:49 +01:00
Glauber Costa
58504bda5b commitlog: use commitlog priority for replay
Right now replay is being issued with the standard seastar priority.
The rationale for that at the time was that it is an early event that
doesn't really share the disk with anybody.

That is largely untrue now that we start compactions on boot.
Compactions may fight for bandwidth with the commitlog, and with such
low priority the commitlog is guaranteed to lose.

Fixes #1856

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit aa375cd33d)
2016-12-01 16:15:44 +01:00
Glauber Costa
be9e419a32 commitlog: close file after read, and not at stop
There are other code paths that may interrupt the read in the middle
and bypass stop. It's safer this way.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <8c32ca2777ce2f44462d141fd582848ac7cf832d.1479477360.git.glauber@scylladb.com>
(cherry picked from commit 60b7d35f15)
2016-12-01 16:15:39 +01:00
Glauber Costa
7647acd201 commitlog: close replay file
Replay file is opened, so it should be closed. We're not seeing any
problems arising from this, but they may happen. Enabling read ahead in
this stream makes them happen immediately. Fix it.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
(cherry picked from commit 4d3d774757)
2016-12-01 16:15:35 +01:00
Takuya ASADA
f35c088363 dist/common/scripts/scylla_kernel_check: fix incorrect document URL
Fixes #1871

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1480327243-18177-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 1042e40188)
2016-11-29 11:12:37 +02:00
Avi Kivity
f19fca6b8b Update seastar submodule
* seastar 28aeb47...c50a301 (1):
  > Collectd get_value_map safe scan the map

Fixes #1835.
2016-11-27 18:21:07 +02:00
Pekka Enberg
0d6223580c release: prepare for 1.4.2
scylla-1.4.2
2016-11-24 10:58:28 +02:00
Duarte Nunes
6152f091d4 thrift: Don't apply cell limit across rows
In Thrift, SliceRange defines a count that limits the number of cells
to return from that row (in CQL3 terms, it limits the number of rows
in that partition). While this limit is honored in the engine, the
Thrift layer also applies the same limit, which, while redundant in
most cases, is used to support the get_paged_slice verb.

Currently, the limit is not being reset per Thrift row (CQL3
partition), so in practice, instead of limiting the cells in a row,
we're limiting the rows we return as well. This patch fixes that by
ensuring the limit applies only within a row/partition.
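
The per-row behavior described above can be sketched as follows. This is only an illustration under stated assumptions: `row_sketch` and `limit_cells_per_row` are hypothetical names, a row is modeled as a plain list of cell names, and none of this is Scylla's Thrift code. The point it shows is that the remaining budget is reset for every row instead of being shared across rows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in: a Thrift row (CQL3 partition) is a list of cell names.
using row_sketch = std::vector<std::string>;

// Apply a SliceRange-style count to each row independently.
// The budget is reset at the start of every row, so a count of N caps the
// cells taken from each row rather than the total across all rows.
std::vector<row_sketch> limit_cells_per_row(const std::vector<row_sketch>& rows,
                                            std::size_t count) {
    std::vector<row_sketch> out;
    for (const auto& row : rows) {
        std::size_t remaining = count;  // reset per row, not shared across rows
        row_sketch kept;
        for (const auto& cell : row) {
            if (remaining == 0) {
                break;
            }
            kept.push_back(cell);
            --remaining;
        }
        out.push_back(std::move(kept));
    }
    return out;
}
```

With a count of 2, three rows holding 3, 2 and 1 cells yield 2, 2 and 1 cells. A budget shared across rows would instead have been exhausted after the first row.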

Fixes #1882

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20161123220001.15496-1-duarte@scylladb.com>
(cherry picked from commit a527ba285f)
2016-11-24 10:39:08 +02:00
Avi Kivity
9bc54bcf5e tests: fix tests with boost 1.60
In boost 1.60, the executable's command-line arguments are expected to
be separated from the boost command-line arguments by '--'.  Detect
this requirement and comply with it.
Message-Id: <1477212424-3831-1-git-send-email-avi@scylladb.com>

(cherry picked from commit fc8210a875)
2016-11-24 10:11:11 +02:00
Avi Kivity
d6c9abd9ae storage_proxy: don't query concurrently needlessly during range queries
storage_proxy has an optimization where it tries to query multiple token
ranges concurrently to satisfy very large requests (an optimization which is
likely meaningless when paging is enabled, as it always should be).  However,
the rows-per-range code severely underestimates the number of rows per range,
resulting in a large number of "read-ahead" internal queries being performed,
the results of most of which are discarded.

Fix by disabling this code. We should likely remove it completely, but let's
start with a band-aid that can be backported.

Fixes #1863.

Message-Id: <20161120165741.2488-1-avi@scylladb.com>
(cherry picked from commit 6bdb8ba31d)
2016-11-21 18:28:14 +02:00
Raphael S. Carvalho
b862c30bc0 db: do not leak deleted sstable when deletion triggers an exception
The leak results in deleted sstables remaining open until shutdown, so disk
space isn't released. That's because column_family::rebuild_sstable_list()
will not remove references to deleted sstables if an exception was triggered in
sstables::delete_atomically(). An sstable only has its files closed when its
object is destructed.

The exception happens when a major compaction is issued in parallel with a
regular one, and one of them is unable to delete an sstable already deleted
by the other. That results in remove_by_toc_name() throwing
boost::filesystem::filesystem_error because the TOC and temporary TOC don't exist.

We wouldn't have seen this problem if major compaction were going through
compaction manager, but remove_by_toc_name() and rebuild_sstable_list() should
be made resilient.

Fixes #1840.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <d43b2e78f9658e2c3c5bbb7f813756f18874bf92.1479390842.git.raphaelsc@scylladb.com>
(cherry picked from commit 3dc9294023)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <fd3035b14f4e10f6bfd36cfd644388a95e60e6a8.1479431741.git.raphaelsc@scylladb.com>
2016-11-18 13:11:14 +02:00
Raphael S. Carvalho
87fedbbb9d sstables: Fix "fix ad-hoc summary creation" backport
Commit 8bc1c87cfd ("sstables: fix ad-hoc summary creation ") didn't take
into account that there were some places calling
index_consume_entry_context() with wrong arguments.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <98ca5f7bf517c4e55dacc3a23ad294d3bcde8907.1479431741.git.raphaelsc@scylladb.com>
2016-11-18 13:03:53 +02:00
Gleb Natapov
8bc1c87cfd sstables: fix ad-hoc summary creation
If the sstable Summary is not present, Scylla does not refuse to boot but
instead creates summary information on the fly. There is a bug in this
code though. The Summary file is a map between keys and offsets into the Index
file, but the code creates a map between keys and Data file offsets
instead. Fix it by keeping the offset of an index entry in the index_entry
structure and using it during Summary file creation.
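
A rough sketch of the distinction, assuming hypothetical types and a fixed sampling step (the real index_entry structure and the sampling policy are not shown in this message): each index entry carries both offsets, and the summary must record the position in the Index file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified index entry: it remembers where the partition
// lives in the Data file and where the entry itself sits in the Index file.
struct index_entry_sketch {
    std::string key;
    std::uint64_t data_offset;   // position of the partition in the Data file
    std::uint64_t index_offset;  // position of this entry in the Index file
};

struct summary_entry_sketch {
    std::string key;
    std::uint64_t index_offset;  // a summary points into the Index file
};

// Build a summary by sampling every `sampling`-th index entry (the sampling
// step is an assumption of this sketch). The summary records index_offset,
// never data_offset.
std::vector<summary_entry_sketch> build_summary_sketch(
        const std::vector<index_entry_sketch>& index, std::size_t sampling) {
    std::vector<summary_entry_sketch> summary;
    if (sampling == 0) {
        return summary;  // avoid an endless loop on a zero step
    }
    for (std::size_t i = 0; i < index.size(); i += sampling) {
        summary.push_back({index[i].key, index[i].index_offset});
    }
    return summary;
}
```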

Fixes #1857.

Reviewed-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20161116165421.GA22296@scylladb.com>
(cherry picked from commit ae0a2935b4)
2016-11-17 11:46:18 +02:00
Raphael S. Carvalho
44448b3ab9 main: fix exception handling when initializing data or commitlog dirs
Exception handling was broken because, after the io checker, a storage_io_error
exception is wrapped around system error exceptions. Also, the message
when handling the exception wasn't precise enough for all cases, for example
lack of permission to write to an existing data directory.

Fixes #883.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <b2dc75010a06f16ab1b676ce905ae12e930a700a.1478542388.git.raphaelsc@scylladb.com>
(cherry picked from commit 9a9f0d3a0f)
2016-11-16 15:13:14 +02:00
Amnon Heiman
419045548d API: cache_capacity should use uint for summing
Using an integer as the type for the map_reduce causes numeric overflow.
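
To illustrate why the accumulator type matters, here is a small sketch. It assumes per-shard capacities are summed with `std::accumulate`; the function names are hypothetical and this is not Scylla's map_reduce API. The type of the initial value decides the type of the running sum.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical per-shard capacities in bytes. std::accumulate uses the type
// of its initial value for the running sum, so starting from uint64_t{0}
// keeps the sum 64 bits wide.
std::uint64_t total_capacity_sketch(const std::vector<std::uint64_t>& per_shard) {
    return std::accumulate(per_shard.begin(), per_shard.end(), std::uint64_t{0});
}

// What a 32-bit accumulator would retain of the same total: the low 32 bits.
std::uint32_t truncated_to_32_bits(std::uint64_t total) {
    return static_cast<std::uint32_t>(total);
}
```

For four shards of 1 GiB each, the 64-bit total is 4 GiB, while the low 32 bits of that total are zero.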

Fixes #1801

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1479299425-782-1-git-send-email-amnon@scylladb.com>
(cherry picked from commit a4be7afbb0)
2016-11-16 15:04:07 +02:00
Paweł Dziepak
25830dd6ac partition_version: make sure that snapshot is destroyed under LSA
Snapshot destructor may free some objects managed by the LSA. That's why
partition_snapshot_reader destructor explicitly destroys the snapshot it
uses. However, it was possible that an exception thrown by _read_section
prevented that from happening, so the snapshot was destroyed implicitly
without the current allocator set to LSA.

Refs #1831.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1478778570-2795-1-git-send-email-pdziepak@scylladb.com>
(cherry picked from commit f16d6f9c40)
2016-11-16 12:47:38 +00:00
Paweł Dziepak
7a6dd3c56d query_pagers: distinct queries do not have clustering keys
Query pager needs to handle results that contain partitions with
possibly multiple clustering rows quite differently than results with
just one row per partition (for example a page may end in the middle of a
partition). However, the logic dealing with partitions with clustering
rows doesn't work correctly for SELECT DISTINCT queries, which are
much more similar to the ones for schemas without a clustering key.

The solution is to set _has_clustering_keys to false in case of SELECT
DISTINCT queries regardless of the schema, which will make the pager
correctly expect each partition to return at most one row.

Fixes #1822.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1478612486-13421-1-git-send-email-pdziepak@scylladb.com>
(cherry picked from commit 055d78ee4c)
2016-11-16 10:17:55 +01:00
Paweł Dziepak
739bc54246 row_cache: touch entries read during range queries
Fixes #1847.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1479230809-27547-1-git-send-email-pdziepak@scylladb.com>
(cherry picked from commit 999dafbe57)
2016-11-15 20:39:05 +00:00
Avi Kivity
092b214a2e Merge "Fixes for histogram and moving average calculations" from Glauber
"JMX metrics were found to be either not showing, or showing absurd
values.  Turns out there were multiple things wrong with them. The
patches were sent separately but conflict with one another. This series
is a collection of the patches needed to fix the issues we saw.

Fixes #1832, #1836, #1837"

(cherry picked from commit bf20aa722b)
2016-11-13 11:43:09 +02:00
Amnon Heiman
5cca752ebb API: fix a typo in storage_proxy
This patch fixes a typo in the URL definition that prevented the metric
in JMX from finding it.

Fixes #1821

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1478563869-20504-1-git-send-email-amnon@scylladb.com>
(cherry picked from commit c8082ccadb)
2016-11-13 09:24:00 +02:00
Paweł Dziepak
5990044158 Merge "Remove quadratic behavior from atomic sstable deletion" from Avi
"The atomic sstable deletion provides exception safety at the cost of
quadratic behavior in the number of sstables awaiting deletion.  This
causes high cpu utilization during startup.

Change the code to avoid quadratic complexity, and add some unit tests.

See #1812."

(cherry picked from commit 985d2f6d4a)
2016-11-13 00:14:19 +02:00
Glauber Costa
e8c804f4f9 histogram: moving averages: fix inverted parameters
moving_averages constructor is defined like this:

    moving_average(latency_counter::duration interval, latency_counter::duration tick_interval)

But when it is time to initialize them, we do this:

	... {tick_interval(), std::chrono::minutes(1)} ...

As it can be seen, the interval and tick interval are inverted. This
leads to the metrics being assigned bogus values.
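
A minimal sketch of the corrected argument order follows. The `_sketch` names, the nanosecond duration type, and the 5-second tick are assumptions made for illustration; only the parameter order (interval first, tick interval second) comes from the declaration quoted above.

```cpp
#include <cassert>
#include <chrono>

// Stand-in for latency_counter::duration (an assumption of this sketch).
using duration_sketch = std::chrono::nanoseconds;

// Simplified stand-in that mirrors the quoted parameter order:
// first the averaging interval, then the tick interval.
struct moving_average_sketch {
    duration_sketch interval;
    duration_sketch tick_interval;

    moving_average_sketch(duration_sketch interval_, duration_sketch tick_interval_)
        : interval(interval_), tick_interval(tick_interval_) {}
};

// Hypothetical tick period standing in for tick_interval() in the message.
inline duration_sketch tick_interval_sketch() {
    return std::chrono::seconds(5);
}

// Corrected order: the one-minute interval goes first, the tick interval second.
inline moving_average_sketch make_one_minute_average() {
    return moving_average_sketch(std::chrono::minutes(1), tick_interval_sketch());
}
```

Because both parameters share one type, the compiler cannot catch a swap. Wrapping each in its own small struct would be one way to turn such a mistake into a compile error.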

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <d83f09eed20ea2ea007d120544a003b2e0099732.1478798595.git.glauber@scylladb.com>
(cherry picked from commit d3f11fbabf)
2016-11-11 10:15:55 +02:00
Pekka Enberg
ce9468a95d abstract_replication_strategy: Fix exception type if class not found
Change abstract_replication_strategy::create_replication_strategy() to
throw exceptions::configuration_error if replication strategy class
lookup fails, to make sure the error is converted to the correct CQL response.

Fixes #1755

Message-Id: <1476361262-28723-1-git-send-email-penberg@scylladb.com>
(cherry picked from commit 3b4e6cdc5e)
2016-11-07 15:11:06 +02:00
Calle Wilund
a7616b9116 auth::password_authenticator: Ensure exceptions are processed in continuation
Fixes #1718 (even more)
Message-Id: <1475497389-27016-1-git-send-email-calle@scylladb.com>

(cherry picked from commit 5b815b81b4)
2016-11-07 09:23:38 +02:00
Calle Wilund
2f8a846b46 auth::password_authenticator: "authenticate" should not throw undeclared excpt
Fixes #1718

Message-Id: <1475487331-25927-1-git-send-email-calle@scylladb.com>
(cherry picked from commit d24d0f8f90)
2016-11-07 09:23:33 +02:00
Pekka Enberg
5ef32112ae release: prepare for 1.4.1
scylla-1.4.1
2016-11-07 09:17:46 +02:00
Pekka Enberg
36ad8a8fd8 cql3: Fix selecting same column multiple times
Under the hood, the selectable::add_and_get_index() function
deliberately filters out duplicate columns. This causes
simple_selector::get_output_row() to return a row with all duplicate
columns filtered out, which triggers an assertion because of a row
mismatch with metadata (which contains the duplicate columns).

The fix is rather simple: just make selection::from_selectors() use
selection_with_processing if the number of selectors and column
definitions doesn't match -- like Apache Cassandra does.

Fixes #1367
Message-Id: <1477989740-6485-1-git-send-email-penberg@scylladb.com>

(cherry picked from commit e1e8ca2788)
2016-11-01 09:34:22 +00:00
Pekka Enberg
ef012856a5 release: prepare for 1.4.0
scylla-1.4.0
2016-10-31 14:04:54 +02:00
Avi Kivity
e87bed5816 Update seastar submodule
* seastar b7be36a...28aeb47 (1):
  > rpc: Avoid using zero-copy interface of output_stream (Fixes #1786)
scylla-1.4-rc3
2016-10-28 14:15:02 +03:00
Pekka Enberg
577ffc5851 auth: Fix resource level handling
We use the `data_resource` class in the CQL parser, which lets users refer
to a table resource without specifying a keyspace. This asserts out in
get_level() for no good reason as we already know the intended level
based on the constructor. Therefore, change `data_resource` to track the
level like upstream Cassandra does and use that.

Fixes #1790

Message-Id: <1477599169-2945-1-git-send-email-penberg@scylladb.com>
(cherry picked from commit b54870764f)
2016-10-27 23:37:57 +03:00
Glauber Costa
a4fffc9c5d auth: always convert string to upper case before comparing
We store all auth perm strings in upper case, but the user might very
well pass this in lower case.

We could use a standard key comparator / hash here, but since the
strings tend to be small, the new sstring will likely be allocated on
the stack here and this approach yields significantly less code.
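
The normalize-then-compare approach could look roughly like this. The sketch uses `std::string` instead of sstring and handles ASCII only; `to_upper_sketch` and `matches_stored_permission` are hypothetical names, not functions from the auth code.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Return an upper-cased copy of the input (ASCII only).
std::string to_upper_sketch(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}

// Hypothetical lookup: stored permission names are upper case, so the
// user-supplied name is normalized before the comparison.
bool matches_stored_permission(const std::string& stored_upper,
                               const std::string& user_input) {
    return stored_upper == to_upper_sketch(user_input);
}
```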

Fixes #1791.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <51df92451e6e0a6325a005c19c95eaa55270da61.1477594199.git.glauber@scylladb.com>
(cherry picked from commit ef3c7ab38e)
2016-10-27 22:09:42 +03:00
Pekka Enberg
10ba47674a release: prepare for 1.4.rc3
2016-10-26 12:20:13 +03:00
Tomasz Grabiec
1c278d9abf Update seastar submodule
* seastar 810ef2b...b7be36a (2):
  > rpc: Fix crash during connection teardown
  > rpc: Move _connected flag to protocol::connection
2016-10-26 10:03:58 +02:00
Tomasz Grabiec
be0b5ad962 Merge seastar upstream
* seastar 742eb00...810ef2b (1):
  > rpc: Do not close client connection on error response for a timed out request

Refs #1778
2016-10-25 13:55:19 +02:00
Vlad Zolotarov
707b59100c service::storage_proxy: use global_trace_state_ptr when using invoke_on
When trace_state may migrate to a different shard a global_trace_state_ptr
has to be used.

This patch completes the patch below:

commit 7e180c7bd3
Author: Vlad Zolotarov <vladz@cloudius-systems.com>
Date:   Tue Sep 20 19:09:27 2016 +0300

    tracing: introduce the tracing::global_trace_state_ptr class

Fixes #1770

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1476993537-27388-1-git-send-email-vladz@cloudius-systems.com>
(cherry picked from commit f75a350a8f)
2016-10-25 11:36:16 +03:00
Takuya ASADA
dc8fa5090d dist/ami: fix incorrect /etc/fstab entry on CentOS7 base image
There was an incorrect rootfs entry in /etc/fstab:
 /dev/sda1 / xfs defaults,noatime 1 1
This causes a boot error after updating to a new kernel.
(see:
https://github.com/scylladb/scylla/issues/1597#issuecomment-250243187)

So the entry was replaced with
 UUID=<uuid>  / xfs defaults,noatime 1 1
Also, all recent security updates were applied.

Fixes #1597
Fixes #1707

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1475094957-9464-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 80e3d8286c)
2016-10-20 11:53:04 +03:00
Avi Kivity
ea9a8e7f65 Update seastar submodule
* seastar c960804...742eb00 (1):
  > rpc: Add missing adjustment of snd_buf::size

Fixes #1767.
2016-10-19 19:45:16 +03:00
Tomasz Grabiec
8d91b8652f partition_version: Fix corruption of partition_version list
The move constructor of partition_version was not invoking the move
constructor of anchorless_list_base_hook. As a result, when
partition_version objects were moved, e.g. during LSA compaction, they
were unlinked from their lists.
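
The class of bug can be shown with a small sketch. `hook_sketch` and `version_sketch` are hypothetical types, loosely inspired by the description above and not Scylla's anchorless_list_base_hook or partition_version. A derived class with a user-written move constructor has to forward to the base move constructor, otherwise the base is default-constructed and the new object is not in the list.

```cpp
#include <cassert>
#include <utility>

// Hypothetical intrusive list hook for illustration only.
struct hook_sketch {
    hook_sketch* prev = nullptr;
    hook_sketch* next = nullptr;

    hook_sketch() = default;
    hook_sketch(const hook_sketch&) = delete;

    // Moving a hook takes over the old object's position in the list and
    // repoints both neighbors at the new object.
    hook_sketch(hook_sketch&& other) noexcept : prev(other.prev), next(other.next) {
        if (prev) { prev->next = this; }
        if (next) { next->prev = this; }
        other.prev = nullptr;
        other.next = nullptr;
    }

    // Link `n` directly after this hook (assumes `n` is currently unlinked).
    void insert_after(hook_sketch& n) {
        n.prev = this;
        n.next = next;
        if (next) { next->prev = &n; }
        next = &n;
    }
};

struct version_sketch : hook_sketch {
    int payload = 0;

    explicit version_sketch(int p) : payload(p) {}

    // The base move constructor is invoked explicitly. Writing only
    // `: payload(o.payload)` would default-construct the hook and leave
    // the new object out of the list.
    version_sketch(version_sketch&& o) noexcept
        : hook_sketch(std::move(o)), payload(o.payload) {}
};
```

After linking `b` behind `a` and move-constructing `c` from `b`, `a.next` points at `c`, so the moved object stays reachable from its list.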

This can make readers return invalid data, because not all versions
will be reachable.

It also causes leaks of the versions which are not directly attached
to a memtable entry. This will trigger an assertion failure in the LSA region
destructor. This assertion triggers with row cache disabled. With cache
enabled (default) all segments are merged into the cache region, which
currently is not destroyed on shutdown, so this problem would go
unnoticed. With cache disabled, memtable region is destroyed after
memtable is flushed and after all readers stop using that memtable.

Fixes #1753.
Message-Id: <1476778472-5711-1-git-send-email-tgrabiec@scylladb.com>

(cherry picked from commit fe387f8ba0)
2016-10-18 10:58:47 +02:00
Pekka Enberg
830df18df5 release: prepare for 1.4.0
2016-10-14 14:37:13 +03:00
Amnon Heiman
ced171c28b scylla_setup: Reorder questions and actions
The expected behaviour in the scylla_setup script is that a question
will be followed by the answer.

For example, after asking whether Scylla should be run as a service, the
relevant actions will be taken before the following question.

This patch addresses two such misorderings:
1. scylla-housekeeping depends on scylla-server, so the
setup should first set up the scylla-server service and only then ask
about (and install if needed) scylla-housekeeping.
2. The node_exporter should be placed after the io_setup is done.

Fixes #1739

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1476370098-25617-1-git-send-email-amnon@scylladb.com>
(cherry picked from commit 7829da13b4)
2016-10-13 18:29:52 +03:00
Avi Kivity
34bf40b552 Merge "node_exporter service on ubuntu 16" from Amnon
"This series addresses two issues that interfere with running the node_exporter as a service in Ubuntu 16.
1. The service file should be packed in the deb file
2. When setting the node_exporter as a service it doesn't need to run as the scylla user"

* 'amnon/node_exporter_ubuntu_v2' of github.com:cloudius-systems/seastar-dev:
  node-exporter service: No need to run as scylla user
  debian package: Include the node_exporter service file

(cherry picked from commit 1506b06617)
2016-10-13 15:54:41 +03:00
Avi Kivity
a68d829644 Update seastar submodule
* seastar f9f4746...c960804 (1):
  > Merge "Prometheus API with grafana uses labels" from Amnon
2016-10-13 15:53:51 +03:00
Takuya ASADA
551c4ff965 dist/common/script/scylla_io_setup: handle comma correctly when parsing cpuset
The script mistakenly split the value at "," when the cpuset list is separated
by commas. Instead of matching possible patterns of the argument, let's
pass all characters until reaching a space delimiter or the end of the line.

Fixes #1716

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1476171037-32373-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit ccad720bb1)
2016-10-11 10:43:17 +03:00
Pekka Enberg
19b35e812b release: prepare for 1.4.rc2
scylla-1.4-rc2
2016-10-10 16:09:16 +03:00
Takuya ASADA
d9ac058bff dist/ubuntu: add realpath to dependency, requires for scylla_setup
We need a dependency on realpath, since scylla_setup uses it.

Fixes #1740.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1475788340-22939-1-git-send-email-syuu@scylladb.com>
(cherry picked from commit 8452045b85)
2016-10-10 15:59:10 +03:00
Pekka Enberg
766367a6c5 dist/docker: Use Scylla 1.4 RPM repository
2016-10-10 15:21:08 +03:00
Pekka Enberg
7ac9b6e9ca docs/docker: Tag --listen-address as 1.4 feature
The Docker Hub documentation is the same for all image versions. Tag
`--listen-address` as a 1.4 feature.

Message-Id: <1475819164-7865-1-git-send-email-penberg@scylladb.com>
(cherry picked from commit 3b75ff1496)
2016-10-10 14:34:04 +03:00
Vlad Zolotarov
b5bab524e1 api::storage_service::slow_query: don't use duration_cast in GET
The slow_query_record_ttl() and slow_query_threshold() return the duration
of the appropriate type already - no need for an additional cast.
In addition there was a mistake in a cast of ttl.

Fixes #1734

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1475669400-5925-1-git-send-email-vladz@cloudius-systems.com>
(cherry picked from commit 006999f46c)
2016-10-09 19:27:09 +03:00