Commit Graph

17840 Commits

Author SHA1 Message Date
Nadav Har'El
9dd3c59c77 docs/metrics.md: explain Prometheus and Grafana
docs/metrics.md so far explained just the REST API for retrieving current
metrics from a single Scylla node. In this patch, I add basic explanations
on how to use the Prometheus and Grafana tools included in the
"scylla-grafana-monitoring" project.

It is true that technically, what is being explained here doesn't come
with the Scylla project and requires the separate scylla-grafana-monitoring
to be installed as well. Nevertheless, most Scylla developers will need this
knowledge eventually and suprisingly it appears it was never documented
anywhere accessible to newbie developers, and I think metrics.md is the
right place to introduce it.

In fact, I myself wasn't aware until today that Prometheus actually had
its own Web UI on port 9090, and that it is probably more useful for
developers than Grafana is.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Reviewed-by: Botond Denes <bdenes@scylladb.com>
Message-Id: <20190129114214.17786-1-nyh@scylladb.com>
2019-01-29 15:46:06 +02:00
Duarte Nunes
35c03f41a4 Merge 'Fix multiple contains for one column' from Piotr
"
An error in validating CONTAINS restrictions against collections caused
only the first restriction to be taken into account due to returning
prematurely.
This miniseries provides a fix for that as well as a matching test case.

Tests: unit (release)
Fixes #4161
"

* 'fix_multiple_contains_for_one_column' of https://github.com/psarna/scylla:
  tests: enable CONTAINS tests for filtering
  cql3: remove premature return from is_satisfied_by
  cql3: restore indentation
2019-01-29 11:10:13 +00:00
Piotr Sarna
11aae54cca tests: enable CONTAINS tests for filtering
Tests for filtering with CONTAINS restrictions were not enabled,
so they are now. Also, another case for having two CONTAINS restrictions
for a single column is added.

Refs #4161
2019-01-29 11:47:28 +01:00
Piotr Sarna
9595fec2ec cql3: remove premature return from is_satisfied_by
Function which checked whether a CONTAINS restriction is satisfied
by a collection erroneously returned prematurely after checking
just the first restriction - which works fine for the usual case,
but fails if there are multiple CONTAINS restrictions present
for a column.

Fixes #4161
2019-01-29 11:47:28 +01:00
Piotr Sarna
89af01315d cql3: restore indentation 2019-01-29 11:47:28 +01:00
Pekka Enberg
7bda3abbc6 toolchain/dbuild: Fix permission errors when SELinux is enabled
Use the ":z" suffix to tell Docker to relabel file objets on shared
volumes. Fixes accessing filesystem via dbuild when SELinux is enabled.

Message-Id: <20190128160557.2066-1-penberg@scylladb.com>
2019-01-28 18:16:53 +02:00
Paweł Dziepak
11a1f97307 Merge "Fix cleanup of temporary sstable directories" from Benny
"
Cleanup of temporary sstable directories in distributed_loader::populate_column_family
is completely broken and non tested. This code path was never executed since
populate_column_family doesn't currently list subdirectories at all.

This patchset fixes this code path and scans subdirectories in populate_column_family.
Also, a unit test is added for testing the cleanup of incomplete (unsealed) sstables.

Fixes: #4129
"

* 'projects/sst-temp-dir-cleanup/v3' of https://github.com/bhalevy/scylla:
  tests: add test_distributed_loader_with_incomplete_sstables
  tests: single_node_cql_env::do_with: use the provided data_file_directories path if available
  tests: single_node_cql_env::_data_dir is not used
  distributed_loader: populate_column_family should scan directories too
  sstables: fix is_temp_dir
  distributed_loader: populate_column_family: ignore directories other than sstable::is_temp_dir
  distributed_loader: remove temporary sstable directories only on shard 0
  distributed_loader: push future returned by rmdir into futures vector
2019-01-28 12:23:00 +00:00
Duarte Nunes
ea34e242de Merge 'Do not use hints for view building' from Piotr
"
This series prevents view building to fall back to storing hints.
Instead, it will try to send hints to an endpoint as if it has
consistency level ONE, and in case of failure retry the whole
building step. Then, view building will never be marked as finished
prematurely (because of pending hints), which will help avoid
creating inconsistencies when decommissioning a node from the cluster.

Tests:
  unit (release)
  dtest (materialized_views_test.py.*)

Fixes #3857
Fixes #4039
"

* 'do_not_mark_view_as_built_with_hints_7' of https://github.com/psarna/scylla:
  db,view: add updating view_building_paused statistics
  database: add view_building_paused metrics
  table: make populate_views not allow hints
  db,view: add allow_hints parameter to mutate_MV
  storage_proxy: add allow_hints parameter to send_to_endpoint
2019-01-28 10:31:14 +00:00
Piotr Sarna
9a6261ca27 db,view: add updating view_building_paused statistics
Each time view building does is paused because of connection failure,
view_building_paused metrics is bumped.
2019-01-28 09:38:42 +01:00
Piotr Sarna
e30b0663d6 database: add view_building_paused metrics
The metrics exposes how many times view building process was paused,
e.g. because target node was down or overloaded.
2019-01-28 09:38:42 +01:00
Piotr Sarna
5dec6dc6c6 table: make populate_views not allow hints
View building uses populate_views to generate and send view updates.
This procedure will now not allow hints to be used to acknowledge
the write. Instead, the whole building step will be retried on failure.

Fixes #3857
Fixes #4039
2019-01-28 09:38:42 +01:00
Piotr Sarna
e30cf22956 db,view: add allow_hints parameter to mutate_MV
Mutating MV function can now accept a parameter whether
hints should be allowed during sending mutations to endpoints.
2019-01-28 09:38:42 +01:00
Piotr Sarna
e0fe9ce2c0 storage_proxy: add allow_hints parameter to send_to_endpoint
With hints allowed, send_to_endpoint will leverage consistency level ANY
to send data. Otherwise, it will use the default - cl::ONE.
2019-01-28 09:38:41 +01:00
Rafael Ávila de Espíndola
5332ebd50c Update the description of compaction_large_partition_warning_threshold_mb
Despite the name, this option also controls if a warning is issued
during memtable writes.

Warning during memtable writes is useful but the option name also
exists in cassandra, so probably the best we can do is update the
description.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190125020821.72815-1-espindola@scylladb.com>
2019-01-28 09:09:35 +02:00
Takuya ASADA
5c6c008109 dist/ami: follow build script changes on -jmx/-tools/-ami packages
We need to follow changes of rpm package build procedure on
-jmx/-tools/-ami packages, since it have been changed when we merged
relocatable pacakge.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190127204436.13959-1-syuu@scylladb.com>
2019-01-28 09:08:32 +02:00
Takuya ASADA
7db1b45839 reloc: move relocatable libraries from /opt/scylladb/lib to /opt/scylladb/libreloc
On Scylla 3rdparty tools, we add /opt/scylladb/lib to LD_LIBRARY_PATH.
We use same directory for relocatable binaries, including libc.so.6.
Once we install both scylla-env package and relocatable version of scylla-server package, the loader tries to load libc from /opt/scylladb/lib then entire distribution become unusable.

We may able to use Obsoletes or Conflict tag on .rpm/.deb to avoid
install new Scylla package with scylla-env, but it's better & safer not to share
same directory for different purpose.

Fixes #3943

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190128023757.25676-1-syuu@scylladb.com>
2019-01-28 09:04:56 +02:00
Avi Kivity
274f553485 tools: toolchain: run dbuild container with same timezone as host
Make it easier to work interactively by not reporting surprising times.

There are also reports that dtest fails with incorrect timezones, but those
are probably bugs in dtest.
Message-Id: <20190127134754.1428-1-avi@scylladb.com>
2019-01-27 22:48:42 +00:00
Benny Halevy
36b6a3ebcf tests: add test_distributed_loader_with_incomplete_sstables
Test removal of sstables with temporary TOC file,
with and without temporary sstable directory.

Temporary sstable directories may be empty or still have
leftover components in them.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-27 14:48:24 +02:00
Benny Halevy
64a23ea3bc tests: single_node_cql_env::do_with: use the provided data_file_directories path if available
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-27 14:14:32 +02:00
Benny Halevy
441809094a tests: single_node_cql_env::_data_dir is not used
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-27 14:14:32 +02:00
Benny Halevy
74ef09a3a2 distributed_loader: populate_column_family should scan directories too
To detect and cleanup leftover temporary sstable directories.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-27 14:14:32 +02:00
Benny Halevy
bd85975277 sstables: fix is_temp_dir
1. fs::canonical required that the path will exist.
   and there is no need for fs::canonical here.
2. fs::path::extension will return the leading dot.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-27 14:14:32 +02:00
Benny Halevy
c2a5f3b842 distributed_loader: populate_column_family: ignore directories other than sstable::is_temp_dir
populate_column_family currently lists only regular files. ignoring all directories.
A later patch in this series allows it to list also directories so to cleanup
the temporary sstable directories, yet valid sub-directories, like staging|upload|snapshots,
may still exist and need to be ignored.

Other kinds of handling, like validating recgnized sub-directories and halting on
unrecognized sub-directories are possible, yet out of scope for this patch(set).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-27 14:14:32 +02:00
Benny Halevy
9bd7b2f4e6 distributed_loader: remove temporary sstable directories only on shard 0
Similar to calling remove_sstable_with_temp_toc later on in
populate_column_family(), we need only one thread to do the
cleanup work and the existing convention is that it's shard 0.

Since lister::rmdir is checking remove_file of all entries
(recursively) and the dir itself, doing that concurrently would fail.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-27 14:14:32 +02:00
Benny Halevy
bcfb2e509b distributed_loader: push future returned by rmdir into futures vector 2019-01-27 14:14:32 +02:00
Asias He
ee0bb0aa94 tests: Drop the unsupported random_read mode in perf_sstable
It is not supported. Remove it.
Message-Id: <fe31e090574be96a9620b6902ceb843699d558d0.1548403105.git.asias@scylladb.com>
2019-01-25 14:24:40 +00:00
Avi Kivity
85abb13679 Merge "Fix cross shard cf usage" from Piotr
"
Lambda passed to distribute_reader_and_consume_on_shards shouldn't
capture shard local variables.

Fixes #4108

Tests:
unit(release),
dtest(update_cluster_layout_tests.TestLargeScaleCluster.add_50_nodes_test)
"

* 'haaawk/4108/v2' of github.com:scylladb/seastar-dev:
  Fix cross shard cf usage in repair
  Fix cross shard cf usage in streaming
2019-01-24 19:40:44 +02:00
Avi Kivity
d0f9e00e85 Merge " Support 64-bit gc_clock" (fixes) from Benny
"
Use int64_t in data::cell for expiry / deletion time.

Extend time_overflow unit tests in cql_query_test to use
select statements with and without bypass cache to access deeper
into the system.

Refs #3353
"

* 'projects/gc_clock_64_fixes/v1' of https://github.com/bhalevy/scylla:
  tests: extend time_overflow unit tests
  data::cell: use int64_t for expiry and deletion time
2019-01-24 19:15:12 +02:00
Piotr Jastrzebski
fab1b7a3a2 Fix cross shard cf usage in repair
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 18:13:49 +01:00
Piotr Jastrzebski
1ac7283550 Fix cross shard cf usage in streaming
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 18:13:30 +01:00
Glauber Costa
ec66dd6562 scylla_setup: tell users about the possibility of a non-interactive session
From day1, scylla_setup can be run either iteractively or through
command line parameters. Still, one of the requests we are asked the
most from users is whether we can provide them with a version of
scylla_setup that they can call from their scripts.

This probably happens because once you call a script interactively,
it may not be totally obvious that a different mode is available.
Even when we do tell users about that possibility, the request number
two is then "which flags do I pass?"

The solution I am proposing is to just tell users the answers to those
qestions at the end of an interactive session. After this patch, we
print the following message to the console:

  ScyllaDB setup finished.
  scylla_setup accepts command line arguments as well! For easily provisioning in a similar environmen than this, type:

    scylla_setup --no-raid-setup --nic eth0 --no-kernel-check \
                 --no-verify-package --no-enable-service --no-ntp-setup \
                 --no-node-exporter --no-fstrim-setup

  Also, to avoid the time-consuming I/O tuning you can add --no-io-setup and copy the contents of /etc/scylla.d/io*
  Only do that if you are moving the files into machines with the exact same hardware

Notes on the implementation: it is unfortunate for these purposes that
all our options are negated. Most conditionals are branching on true
conditions, so although I could write this:

  args.no_option = not interactive_ask_service(...)
  if not args.no_option:
    ...

I opted in this patch to write:

  option = interactive_ask_service(...)
  args.no_option = not option
  if option:
    ...

There is an extra line and we have to update args separately, but it
makes it less hard to get confused in the conditional with the double
negation. Let me know if there are disagreements here.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190124153832.21140-1-glauber@scylladb.com>
2019-01-24 17:41:26 +02:00
Benny Halevy
6efd85ed01 tests: extend time_overflow unit tests
Test also cql select queries with and without bypass cache.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-24 15:55:06 +02:00
Benny Halevy
7373825473 data::cell: use int64_t for expiry and deletion time
Ttl may still use int32_t to reduce footprint

Refs #3353

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-01-24 15:55:06 +02:00
Takuya ASADA
597059b4b1 dist/debian: skip stripping libprotobuf.so.15
dh_strip won't able to strip libprotobuf.so.15, and we actually don't
need to strip dependency libraries, so skip it.

Fixes #4135

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190123202213.2117-4-syuu@scylladb.com>
2019-01-24 15:51:56 +02:00
Takuya ASADA
aefc18e70d dist/debian: install /usr/bin/file for dh_strip
dh_strip requires /usr/bin/file but does not automatically installed, so
install it on build_deb.sh.

Fixes #4134

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190123202213.2117-3-syuu@scylladb.com>
2019-01-24 15:51:53 +02:00
Benny Halevy
fbebd0bb1d thrift: validate_column_name: fix exception format string
It's printing uint32_t rather than char*.

Refs #4140

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190124104002.32381-1-bhalevy@scylladb.com>
2019-01-24 12:46:23 +02:00
Avi Kivity
b58b82c9a2 Merge "Cut build dependencies around types.hh" from Piotr
"
I've recently had to work around types.hh/types.cc files and had
very unpleasent experience with incremental build on every change
to types.hh. It took ~30 min on my machine which is almost as much
as the clean build.

I looked around and it turns out that types.hh contains the whole
hierarchy of the types. On the same time, many places access the
types only through abstract_type which is the root of the
hierarchy.

This patchset extracts user_type_impl, tuple_type_impl,
map_type_impl, set_type_impl, list_type_impl and
collection_type_impl from types.hh and places each of them
in a separate header.

The result of this is that change in user_type_impl causes now
incremental build of ~6 min instead of ~30 min.
Change to tuple_type_impl causes incremental build of ~7.5 min
instead of ~30 min and change to map_type_impl triggers incremental
build that takes ~20 min instead of ~30 min.

Tests: unit(release)
"

* 'haaawk/types_build_speedup_2/rfc/2' of github.com:scylladb/seastar-dev:
  Stop including types/list.hh in cql3/tuples.hh
  Stop including types/set.hh into cql3/sets.hh
  Move collection_type_impl out of types.hh to types/collection.hh
  Move set_type_impl out of types.hh to types/set.hh
  Move list_type_impl out of types.hh to types/list.hh
  Move map_type_impl out of types.hh to types/map.hh
  Move tuple_type_impl from types.hh to types/tuple.hh
  Decouple database.hh from types/user.hh
  Allow to use shared_ptr with incomplete type other than sstable
  Move user_type_impl out of types.hh to types/user.hh
2019-01-24 11:21:22 +02:00
Piotr Jastrzebski
a3912a35f5 Stop including types/list.hh in cql3/tuples.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:57:19 +01:00
Piotr Jastrzebski
fe8dfc8fdc Stop including types/set.hh into cql3/sets.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:57:19 +01:00
Piotr Jastrzebski
5a5201a50b Move collection_type_impl out of types.hh to types/collection.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:56:38 +01:00
Piotr Jastrzebski
ad016a732b Move set_type_impl out of types.hh to types/set.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:56:38 +01:00
Piotr Jastrzebski
b1e1b66732 Move list_type_impl out of types.hh to types/list.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:56:38 +01:00
Piotr Jastrzebski
147cc031db Move map_type_impl out of types.hh to types/map.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:56:38 +01:00
Piotr Jastrzebski
b6b2fdc5be Move tuple_type_impl from types.hh to types/tuple.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:56:38 +01:00
Piotr Jastrzebski
7666e81b51 Decouple database.hh from types/user.hh
This commit declares shared_ptr<user_types_metadata> in
database.hh were user_types_metadata is an incomplete type so
it requires
"Allow to use shared_ptr with incomplete type other than sstable"
to compile correctly.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:55:04 +01:00
Piotr Jastrzebski
316be5c6b5 Allow to use shared_ptr with incomplete type other than sstable
When seastar/core/shared_ptr_incomplete.hh is included in a header
then it causes problems with all declarations of shared_ptr<T> with
incomplete type T that end up in the same compilation unit.

The problem happens when we have a compilation unit that includes
two headers a.hh and b.hh such that a.hh includes
seastar/core/shared_ptr_incomplete.hh and b.hh declares
shared_ptr<T> with incomplete type T. On the same time this
compilation unit does not use declared shared_ptr<T> so it should
compile and work but it does not because shared_ptr_incomplete.hh
is included and it forces instantiation of:

template <typename T>
T*
lw_shared_ptr_accessors<T,
void_t<decltype(lw_shared_ptr_deleter<T>{})>>::to_value(lw_shared_ptr_counter_base*
counter) {
    return static_cast<T*>(counter);
}

for each declared shared_ptr<T> with incomplete type T. Even the once
that are never used.

Following commit "Decouple database.hh from types/user.hh"
moves user_types_metadata type out of database.hh and instead
declares shared_ptr<user_types_metadata> in database.hh where
user_types_metadata is incomplete. Without this commit
the compilation of the following one fails with:

In file included from ./sstables/sstables.hh:34,
                 from ./db/size_estimates_virtual_reader.hh:38,
                 from db/system_keyspace.cc:77:
seastar/include/seastar/core/shared_ptr_incomplete.hh: In
instantiation of ‘static T*
seastar::internal::lw_shared_ptr_accessors<T,
seastar::internal::void_t<decltype
(seastar::lw_shared_ptr_deleter<T>{})>
>::to_value(seastar::lw_shared_ptr_counter_base*) [with T =
user_types_metadata]’:
seastar/include/seastar/core/shared_ptr.hh:243:51:   required from
‘static void seastar::internal::lw_shared_ptr_accessors<T,
seastar::internal::void_t<decltype
(seastar::lw_shared_ptr_deleter<T>{})>
>::dispose(seastar::lw_shared_ptr_counter_base*) [with T =
user_types_metadata]’
seastar/include/seastar/core/shared_ptr.hh:300:31:   required from
‘seastar::lw_shared_ptr<T>::~lw_shared_ptr() [with T =
user_types_metadata]’
./database.hh:1004:7:   required from ‘static void
seastar::internal::lw_shared_ptr_accessors_no_esft<T>::dispose(seastar::lw_shared_ptr_counter_base*)
[with T = keyspace_metadata]’
seastar/include/seastar/core/shared_ptr.hh:300:31:   required from
‘seastar::lw_shared_ptr<T>::~lw_shared_ptr() [with T =
keyspace_metadata]’
./db/size_estimates_virtual_reader.hh:233:67:   required from here
seastar/include/seastar/core/shared_ptr_incomplete.hh:38:12: error:
invalid static_cast from type ‘seastar::lw_shared_ptr_counter_base*’
to type ‘user_types_metadata*’
     return static_cast<T*>(counter);
            ^~~~~~~~~~~~~~~~~~~~~~~~
[131/415] CXX build/release/distributed_loader.o

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:45:25 +01:00
Piotr Jastrzebski
e92b4c3dbc Move user_type_impl out of types.hh to types/user.hh
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 09:04:04 +01:00
Rafael Ávila de Espíndola
f7d1dc16d4 database: Use nop_large_partition_handler to avoid self-reporting
Currently nop_large_partition_handler is only used in tests, but it
can also be used avoid self-reporting.

Tests: unit(Release)

I also tested starting scylla with
--compaction-large-partition-warning-threshold-mb=0.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190123205059.39573-1-espindola@scylladb.com>
2019-01-23 21:11:21 +00:00
Avi Kivity
4882f29f82 Merge "Detemplatize primary key restrictions" from Piotr
"
This series is a first small step towards rewriting
CQL restrictions layer. Primary key restrictions used to be
a template that accepts either partition_key or clustering_key,
but the implementation is already based on virtual inheritance,
so in multiple cases these templates need specializations.

Refs #3815
"

* 'detemplatize_primary_key_restrictions_2' of https://github.com/psarna/scylla:
  cql3: alias single_column_primary_key_restrictions
  cql3: remove KeyType template from statement_restrictions
  cql3: remove template from primary_key_restrictions
  cql3: remove forwarding_primary_key_restrictions
2019-01-23 17:43:03 +02:00
Piotr Sarna
9982587bea cql3: alias single_column_primary_key_restrictions
In preparation for detemplatizing this class, it's aliased with
single_column_partition_key restrictions and
single_column_clustering_key_restrictions accordingly.
2019-01-23 17:43:03 +02:00