Commit Graph

941 Commits

Author SHA1 Message Date
Botond Dénes
47e07b787e restricted_mutation_reader: restrict based-on memory consumption
Restrict readers based on their memory consumption, instead of the count
of the top-level readers. To do this an interposer is installed at the
input_stream level which tracks buffers emmited by the stream. This way
we can have an accurate picture of the readers' actual memory
consumption.
New readers will consume 16k units from the semaphore up-front. This is
to account their own memory-consumption, apart from the buffers they
will allocate. Creating the reader will be deferred to when there are
enough resources to create it. As before only new readers will be
blocked on an exhausted semaphore, existing readers can continue to
work.
2017-10-03 12:44:12 +03:00
Avi Kivity
78eae8bf48 Revert "Merge "Make restricting_mutation_reader more accurate" from Botond"
This reverts commit c6e5dcc556, reversing
changes made to 19b21a0ab2. Failes to build,
plus author has more changes.
2017-10-03 11:58:59 +03:00
Botond Dénes
43dba8f173 Update reader restriction related metrics
Update description of existing reader count metrics, add memory
consumption metrics.
2017-09-20 11:16:21 +03:00
Botond Dénes
33e97e7457 restricted_mutation_reader: restrict based-on memory consumption
Restrict readers based on their memory consumption, instead of the count
of the top-level readers. To do this an interposer is installed at the
input_stream level which tracks buffers emmited by the stream. This way
we can have an accurate picture of the readers' actual memory
consumption.
New readers will consume 16k units from the semaphore up-front. This is
to account their own memory-consumption, apart from the buffers they
will allocate. Creating the reader will be deferred to when there are
enough resources to create it. As before only new readers will be
blocked on an exhausted semaphore, existing readers can continue to
work.
2017-09-20 11:14:35 +03:00
Avi Kivity
e44517851e untyped_result_set: reduce dependencies
Forward-declare untyped_result_set and untyped_result_set_row, and remove
the include from query_processor.hh.
Message-Id: <20170916170859.27612-3-avi@scylladb.com>
2017-09-18 15:15:15 +02:00
Tomasz Grabiec
571cac95ed schema_tables: Make make_scylla_tables_mutation() visible
For tests.
2017-09-14 20:26:31 +02:00
Tomasz Grabiec
f943d2efbf schema_tables: Don't alter tables which differ only in version
We apply deletion of scylla_tables.version to the incoming schema
mutations so that table schema version is recalculated after merge.
The mutations which we read from local schema tables may not have it
deleted in which case all tables would be considered as differing on
the presence of the version field. Avoid this by deleting the field
from old mutations as well.
2017-09-14 20:26:31 +02:00
Tomasz Grabiec
99272087e6 schema_mutations: Use mutation_opt instead of stdx::optional<mutation> 2017-09-14 20:26:31 +02:00
Avi Kivity
0aaefe665b system_keyspace: add missing include 2017-09-11 20:09:45 +03:00
Avi Kivity
d3cde2e2be size_estimates_virtual_reader.hh: add missing include 2017-09-11 20:09:45 +03:00
Piotr Jastrzebski
dd5dc75605 Stop calling _local_cache.stop in at_exit.
This removes a race condition that was causing #2721

Fixes #2721

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <ad060fab43d63c17db9f811c421d7ab26e5e57c8.1503933021.git.piotr@scylladb.com>
2017-09-03 15:55:48 +03:00
Avi Kivity
0524cbbd72 Merge db/config.cc cleanups from Jesse
* 'jhk/config_hygiene/v1' of https://github.com/hakuch/scylla:
  db/config.cc: Clarify documentation for `typed_value_ex`
  db/config.cc: Fix formatting and warnings
  db/config.cc: Remove unnecessary `mutable` on lambdas
  db/config.cc: Remove unused variables
2017-09-03 11:08:53 +03:00
Botond Dénes
a980ff6463 Use abort() instead of assert + throw in unreachable code
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <393c3730111dfe090c44d8fc2e31602956a7d008.1504022425.git.bdenes@scylladb.com>
2017-09-03 11:07:27 +03:00
Raphael S. Carvalho
0218d6fd8f db/config: add sstable_data_integrity_check option
If enabled, interposer for checking integrity of sstable component
writes will be used.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2017-08-30 13:57:08 -03:00
Botond Dénes
d1209c548a Fix -Wreturn-type warnings
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <99f7a006daaa78eb87720ac51c394093398bc868.1504013915.git.bdenes@scylladb.com>
2017-08-29 16:41:09 +03:00
Jesse Haber-Kucharsky
abf4c1688d db/config.cc: Clarify documentation for typed_value_ex 2017-08-28 10:08:29 -04:00
Jesse Haber-Kucharsky
7374f9d86f db/config.cc: Fix formatting and warnings 2017-08-28 10:08:29 -04:00
Jesse Haber-Kucharsky
90666e5744 db/config.cc: Remove unnecessary mutable on lambdas 2017-08-28 10:08:29 -04:00
Jesse Haber-Kucharsky
449bd60480 db/config.cc: Remove unused variables 2017-08-28 10:08:29 -04:00
Tzach Livyatan
12fb975282 Fix typos in metrics description
Fixes #2658

Signed-off-by: Tzach Livyatan <tzach@scylladb.com>
Message-Id: <20170803121732.19640-1-tzach@scylladb.com>
2017-08-28 10:48:28 +03:00
Avi Kivity
576e33149f Merge seastar upstream
* seastar 0083ee8...85ca12d (1):
  > Merge "Run-time logging configuration" from Jesse

Includes patch from Jesse:

"Switch to Seastar for logging option handling

In addition to updating the abstraction layer for Seastar logging in `log.hh`,
the configuration system (`db/config.{hh,cc}`) has been updated in two ways:

- The string-map type for Boost.program_options is now defined in Seastar.

- A configuration value can be marked as `UsedFromSeastar`. This is like `Used`,
  except the option is expected to be defined in the Boost.Program_options
  description for Seastar. If the option is not defined in Seastar, or it is
  defined with a different type, then a run-time exception is thrown early in
  Scylla's initialization. This is necessary because logging options which are
  now defined in Seastar were previously defined in Scylla and support for these
  options in the YAML file cannot be dropped. In order to be able to verify that
  options marked `UsedFromSeastar` are actually defined in Seastar, the
  interface for adding options to `db::config` has changed from taking a
  `boost::program_options::options_description_easy_init` (which is handle into
  a `boost::program_options::options_description` which only allows adding
  options) to taking a `boost::program_options::options_description`
  directly (which also allows querying existing options).

Scylla also fully defers to Seastar's support for run-time logging
configuration."

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <ef26cffb91bef1ae95d508187a6dd861a6c4fc84.1503344007.git.jhaberku@scylladb.com>
2017-08-27 13:11:33 +03:00
Jesse Haber-Kucharsky
af95d3baa7 db/config.cc: Remove unused function
Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <5a4e4e153c2d87e838d1cf6def7a494a92a72f63.1503344007.git.jhaberku@scylladb.com>
2017-08-27 13:08:19 +03:00
Amnon Heiman
abbd78367c Add configuration to disable per keyspace and column family metrics
The number of keysapce and column family metrics reported is
proportional to the number of shards times the number of keysapce/column
families.

This can cause a performance issue both on the reporting system and on
the collecting system.

This patch adds a configuration flag (set to false by default) to enable
or disable those metrics.

Fixes #2701

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20170821113843.1036-1-amnon@scylladb.com>
2017-08-22 19:19:54 +03:00
Avi Kivity
de011ece52 main: deprecate non-murmur3 partitioners more forcefully
Some (most?) users don't read logs or release notes, so they won't notice
that the ByteOrdered and Random partitioners were deprecated in 2.0. Make
them notice by refusing to start with a deprecated partitioner, unless a
switch is explicitly enabled.
Message-Id: <20170820073424.8331-1-avi@scylladb.com>
2017-08-21 14:32:22 +02:00
Avi Kivity
5a2439e702 main: check for large allocations
Large allocations can require cache evictions to be satisfied, and can
therefore induce long latencies. Enable the seastar large allocation
warning so we can hunt them down and fix them.

Message-Id: <20170819135212.25230-1-avi@scylladb.com>
2017-08-21 10:25:40 +03:00
Raphael S. Carvalho
872412d31a db/config: introduce sstable_summary_ratio option
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2017-08-11 01:36:21 -03:00
Daniel Fiala
06089474c9 Print warning if user uses default cluster_name
* Configuration for cluster_name is commented-out in config file.
* Default value set to empty string and if not rewritten by user then
  warning is printed and value is reset to "ScyllaDB Cluster".

Fixes #2648.

Message-Id: <20170808113322.9313-1-daniel@scylladb.com>
2017-08-08 14:47:17 +03:00
Avi Kivity
a71138fc84 config: mark column_index_size_in_kb as Used
Fixes #2681
Message-Id: <20170808100415.16296-1-avi@scylladb.com>
2017-08-08 11:08:00 +01:00
Avi Kivity
86de6cc7fb Merge seastat upstream
* seastar f14d2a3...7a49ae5 (8):
  > sharded: improve support for cooperating sharded<> services
  > sharded: support for peer services
  > semaphore: add a version of with_semaphore that takes a duration timeout
  > scripts: perftune.py: fix the CPU mask generation for more than 64 CPUs
  > Revert "future-utils: make when_all() (vector variant) exception safe"
  > Revert "future-utils: fix gross compilation errors in when_all()"
  > future-utils: fix gross compilation errors in when_all()
  > future-utils: make when_all() (vector variant) exception safe

Includes change to batchlog_manager constructor to adapt it to
seastar::sharded::start() change.
2017-08-06 17:47:47 +03:00
Avi Kivity
ebff739a84 Merge "use paging for compaction history" from Amnon
"This series adds an option to use paging in internal query and use that for the
get compaction history function.

Internal paging will be done explicitly, to use paging, you first create a
state object (that contains the query as well) and use that state to get the
first page, the result will contain both the query result and a new state that
can be used to get the next page.

Fixes #2366"

* 'amnon/paged_compaction_history_v5' of github.com:cloudius-systems/seastar-dev:
  system_keyspace: Use paging for get compaction history
  Add paging for internal queries
  query_options: Allows creating query_options from query_options
2017-08-02 18:15:58 +03:00
Asias He
cf6f4a5185 gossip: Introduce the shadow_round_ms option
It specifies the maximum gossip shadow round time. It can be used to
reduce the gossip feature check time during node boot up.
For instance, when the first node in the cluster, which listed both
itself and other node as seed in the yaml config, boots up, it will try
to talk to other seed nodes which are not started yet. The gossip shadow
round will be used to fetch the feature info of the cluster. Since there
is no other seed node in the cluster, the shadow round will fail. User
can reduce the default shadow_round_ms option to reduce the boot time.

Fixes #2615
Message-Id: <10916ce9059f3c7f1a1fb465919ae57de3b67d59.1500540297.git.asias@scylladb.com>
2017-08-02 09:52:35 +03:00
Duarte Nunes
85e85ec72e Don't catch polymorphic exceptions by value
It makes gcc a very sad compiler.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170726172053.5639-2-duarte@scylladb.com>
2017-07-27 09:39:58 +03:00
Duarte Nunes
50ad0003c6 db/schema_tables: Drop dropped columns when dropping tables
Fixes #2633

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170726150228.2593-2-duarte@scylladb.com>
2017-07-26 18:41:28 +02:00
Duarte Nunes
3425403126 db/schema_tables: Store column_name in text form
As does Cassandra.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170726150228.2593-1-duarte@scylladb.com>
2017-07-26 18:41:12 +02:00
Duarte Nunes
33e18a1779 db/schema_tables: Consider differing dropped columns
If a node is notified of a schema change where the schema's dropped
columns have changes, that node will miss the changes to the dropped
columns. A scenario where this can happen is where a column c is
dropped, then added as a different typed, and then dropped again, with
a node n having seen the first drop and being notified of the
subsequent add and drop.

Fixes #2616

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170725170622.4380-1-duarte@scylladb.com>
2017-07-26 11:59:34 +02:00
Tomasz Grabiec
ecc85988dd legacy_schema_migrator: Don't snapshot empty legacy tables
Otherwise we will create a new (empty) snapshot each time we boot.
Message-Id: <1500573920-31478-2-git-send-email-tgrabiec@scylladb.com>
2017-07-21 16:56:31 +02:00
Duarte Nunes
937fe80a1a Merge 'Fix possible inconsistency of table schema version' from Tomasz
"Fixes issues uncovered in longevity test (#2608).

Main problem is that due to time drift scylla_tables.version column
may not get deleted on all nodes doing the schema merge, which will
make some nodes come up with different table schema version than others.

The inconsistency will not heal because scylla_tables doesn't
take part in the schema sync. This is fixed by the last patch.

This will cause nodes to constantly try to sync the schema, which under
some conditions triggers #2617."

* tag 'tgrabiec/fix-table-schema-version-inconsistency-v1' of github.com:scylladb/seastar-dev:
  schema_tables: Add scylla_tables to ALL
  schema: Make schema_mutations equality consistent with digest
  schema_tables: Extract compact_for_schema_digest()
  schema_tables: Always drop scylla_tables::version
2017-07-21 16:55:23 +02:00
Duarte Nunes
7eecda3a61 schema: Support compaction enabled attribute
Fixes #2547

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170721132206.3037-1-duarte@scylladb.com>
2017-07-21 15:38:45 +02:00
Amnon Heiman
e345d05ebe system_keyspace: Use paging for get compaction history
there could be a lot of compactions when querying for compaction
history.

This patch changes the query to use paging. It would collect all results
when returning to the caller.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2017-07-20 18:17:49 +03:00
Tomasz Grabiec
ed2388da2c schema_tables: Add scylla_tables to ALL
So that scylla_tables takes part in the digest and in mutations sent
as part of schema sync. Otherwise inconsistencies in scylla_tables
will not heal.

Refs #2608.
2017-07-20 15:47:10 +02:00
Tomasz Grabiec
6adbe61e2f schema_tables: Extract compact_for_schema_digest() 2017-07-20 15:47:10 +02:00
Tomasz Grabiec
1b85c316bf schema_tables: Always drop scylla_tables::version
It can happen that due to time drift between nodes, the incoming
"version" cell will have higher timestamp than api::new_timestamp().
In such case the column would not be dropped and would cause version
mismatch between nodes.

Ensure it's always covered by using max of current time and cell's
timestamp.

Refs #2608.
2017-07-20 15:47:10 +02:00
Avi Kivity
c5ee62a6a4 Merge "restrict background writers with scheduling groups" from Glauber
"This patchset restricts background writers - such as compactions,
streaming flushes and memtable flushes to a maximum amount of CPU usage
through a seastar::thread_scheduling_group.

The said maximum is recommended to be set  50 % - it is default
disabled, but can be adjusted through a configuration option until we
are able to auto-tune this.

The second patch in this series provides a preview on how such auto-tune
would look like. By implementing a simple controller we automatically
adjust the quota for the memtable writer processes, so that the rate at
which bytes come in is equal to the rates at which bytes are flushed.

Tail latencies are greatly reduced by this series, and heavy spikes that
previously appeared on CPU-bound workloads are no more."

* 'memtable-controller-v5' of https://github.com/glommer/scylla:
  simple controller for memtable/streaming writer shares.
  restrict background writers to 50 % of CPU.
2017-07-20 10:58:53 +03:00
Calle Wilund
7a583585a2 system_keyspace: Make sure "system" is written to keyspaces (visible)
Fixes #2514

Bug in schema version 3 update: We failed to write "system" to the
schema tables. Only visible on an empty instance of course.

Message-Id: <1500469809-23546-2-git-send-email-calle@scylladb.com>
2017-07-19 16:18:56 +03:00
Calle Wilund
247c36e048 system_schema: Fix remaining places not handing two system keyspaces
Some places remained where code looked directly at
system_keyspace::NAME to determine iff a ks is
considered special/system/protected. Including
schema digest calculation.

Export "is_system_keyspace" and use accordingly.

Message-Id: <1500469809-23546-1-git-send-email-calle@scylladb.com>
2017-07-19 16:18:45 +03:00
Duarte Nunes
1daf1bc4bb Merge 'Revert back to 1.7 schema layout in memory' from Tomasz
"Fixes schema layout incompatibility in a mixed 1.7 and 2.0 cluster (#2555)
by reverting back to using the old layout in memory and thus also
in across-node requests. We still use the new v3 layout in schema
tables (needed by drivers and external tools). Translations happen
when converting to/from schema mutations."

* tag 'tgrabiec/use-v2-schema-layout-in-memory-v2' of github.com:scylladb/seastar-dev:
  schema: Revert back to the 1.7 layout of static compact tables in memory
  schema: Use v3 column layout when converting to/from schema mutations
  schema: Encapsulate column layout translations in the v3_columns class
2017-07-19 12:52:52 +02:00
Duarte Nunes
115ff1095e db/view: Use view schema for view pk operations
Instead of base schema.

Fixes #2504

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170718190703.12972-1-duarte@scylladb.com>
2017-07-19 09:59:34 +02:00
Tomasz Grabiec
a9237c1666 schema: Revert back to the 1.7 layout of static compact tables in memory
We are using C* 3.x compatible layout in schema tables but want to
keep using the 1.7 layout in memory for compatibility during rolling
upgrade. This patch switches the schema and schema_builder classes
back to the old layout. Translation of layout happens when converting
to/from schema mutations.

Notable changes:

 1) Includes a revert of commit 6260f31e08
    "thrift: Update CQL mapping of static CFs".

 2) Brings back the "default_validation_class" schema attribute. In v3
    it can be dervied from column definitions, but in v2 it can't, so
    we have to store it.

 3) legacy_schema_migrator and schema_builder don't have to do
    conversions to v3, this is now handled by the v3_columns
    class. schema_builder works with the same layout as schema, that
    is v2.

 4) Includes a revert of commit 66991a7ccb
    "v3 schema test fixes"

Fixes #2555.
2017-07-19 09:52:15 +02:00
Tomasz Grabiec
dc2dc056a4 schema: Use v3 column layout when converting to/from schema mutations 2017-07-19 09:52:15 +02:00
Glauber Costa
c9a529ebee simple controller for memtable/streaming writer shares.
This patch introduces a simple controller that will adjust memtables CPU
shares, trying to keep it around the soft limit: if we start going below
it means we're too fast (unless we are idle) and shares are adjusted
downwards. If we start going above it means we're too fast and shares
are adjusted upwards.

I have tested this extensively in a single-CPU setup with various
CPU-bound workloads while tracking virtual dirty and the results are
good, with virtual dirty fluctuating only slightly, somewhere within the
desired range.

Exceptions to this are:
1) when the load is very light - the idle system goes faster, and that's
   ok
2) when the load is very high - as foreground requests dominate we can't
   flush fast enough and hit the hard limit. However, in such scenarios
   the memtable shares do hit its maximum, and the results are no worse
   than they are right now and this will only be fixed by CPU-limiting the
   actual requests.

This feature can be disabled with a config option - that is scheduled to
go away as we acquire more confidence in this. When the feature is
disabled, all background writers (streaming, compaction, memtables) will
share the same scheduling group, with static quotas.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2017-07-18 23:35:47 -04:00