scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 18:10:39 +00:00

Author	SHA1	Message	Date
Botond Dénes	d1209c548a	Fix -Wreturn-type warnings Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <99f7a006daaa78eb87720ac51c394093398bc868.1504013915.git.bdenes@scylladb.com>	2017-08-29 16:41:09 +03:00
Tzach Livyatan	12fb975282	Fix typos in metrics description Fixes #2658 Signed-off-by: Tzach Livyatan <tzach@scylladb.com> Message-Id: <20170803121732.19640-1-tzach@scylladb.com>	2017-08-28 10:48:28 +03:00
Avi Kivity	576e33149f	Merge seastar upstream * seastar 0083ee8...85ca12d (1): > Merge "Run-time logging configuration" from Jesse Includes patch from Jesse: "Switch to Seastar for logging option handling In addition to updating the abstraction layer for Seastar logging in `log.hh`, the configuration system (`db/config.{hh,cc}`) has been updated in two ways: - The string-map type for Boost.program_options is now defined in Seastar. - A configuration value can be marked as `UsedFromSeastar`. This is like `Used`, except the option is expected to be defined in the Boost.Program_options description for Seastar. If the option is not defined in Seastar, or it is defined with a different type, then a run-time exception is thrown early in Scylla's initialization. This is necessary because logging options which are now defined in Seastar were previously defined in Scylla and support for these options in the YAML file cannot be dropped. In order to be able to verify that options marked `UsedFromSeastar` are actually defined in Seastar, the interface for adding options to `db::config` has changed from taking a `boost::program_options::options_description_easy_init` (which is a handle into a `boost::program_options::options_description` which only allows adding options) to taking a `boost::program_options::options_description` directly (which also allows querying existing options). Scylla also fully defers to Seastar's support for run-time logging configuration." Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com> Message-Id: <ef26cffb91bef1ae95d508187a6dd861a6c4fc84.1503344007.git.jhaberku@scylladb.com>	2017-08-27 13:11:33 +03:00
Jesse Haber-Kucharsky	af95d3baa7	db/config.cc: Remove unused function Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com> Message-Id: <5a4e4e153c2d87e838d1cf6def7a494a92a72f63.1503344007.git.jhaberku@scylladb.com>	2017-08-27 13:08:19 +03:00
Amnon Heiman	abbd78367c	Add configuration to disable per keyspace and column family metrics The number of keyspace and column family metrics reported is proportional to the number of shards times the number of keyspaces/column families. This can cause a performance issue both on the reporting system and on the collecting system. This patch adds a configuration flag (set to false by default) to enable or disable those metrics. Fixes #2701 Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <20170821113843.1036-1-amnon@scylladb.com>	2017-08-22 19:19:54 +03:00
Avi Kivity	de011ece52	main: deprecate non-murmur3 partitioners more forcefully Some (most?) users don't read logs or release notes, so they won't notice that the ByteOrdered and Random partitioners were deprecated in 2.0. Make them notice by refusing to start with a deprecated partitioner, unless a switch is explicitly enabled. Message-Id: <20170820073424.8331-1-avi@scylladb.com>	2017-08-21 14:32:22 +02:00
Avi Kivity	5a2439e702	main: check for large allocations Large allocations can require cache evictions to be satisfied, and can therefore induce long latencies. Enable the seastar large allocation warning so we can hunt them down and fix them. Message-Id: <20170819135212.25230-1-avi@scylladb.com>	2017-08-21 10:25:40 +03:00
Raphael S. Carvalho	872412d31a	db/config: introduce sstable_summary_ratio option Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2017-08-11 01:36:21 -03:00
Daniel Fiala	06089474c9	Print warning if user uses default cluster_name * Configuration for cluster_name is commented-out in config file. * Default value set to empty string and if not rewritten by user then warning is printed and value is reset to "ScyllaDB Cluster". Fixes #2648. Message-Id: <20170808113322.9313-1-daniel@scylladb.com>	2017-08-08 14:47:17 +03:00
Avi Kivity	a71138fc84	config: mark column_index_size_in_kb as Used Fixes #2681 Message-Id: <20170808100415.16296-1-avi@scylladb.com>	2017-08-08 11:08:00 +01:00
Avi Kivity	86de6cc7fb	Merge seastar upstream * seastar f14d2a3...7a49ae5 (8): > sharded: improve support for cooperating sharded<> services > sharded: support for peer services > semaphore: add a version of with_semaphore that takes a duration timeout > scripts: perftune.py: fix the CPU mask generation for more than 64 CPUs > Revert "future-utils: make when_all() (vector variant) exception safe" > Revert "future-utils: fix gross compilation errors in when_all()" > future-utils: fix gross compilation errors in when_all() > future-utils: make when_all() (vector variant) exception safe Includes change to batchlog_manager constructor to adapt it to seastar::sharded::start() change.	2017-08-06 17:47:47 +03:00
Avi Kivity	ebff739a84	Merge "use paging for compaction history" from Amnon "This series adds an option to use paging in internal query and use that for the get compaction history function. Internal paging will be done explicitly, to use paging, you first create a state object (that contains the query as well) and use that state to get the first page, the result will contain both the query result and a new state that can be used to get the next page. Fixes #2366" * 'amnon/paged_compaction_history_v5' of github.com:cloudius-systems/seastar-dev: system_keyspace: Use paging for get compaction history Add paging for internal queries query_options: Allows creating query_options from query_options	2017-08-02 18:15:58 +03:00
Asias He	cf6f4a5185	gossip: Introduce the shadow_round_ms option It specifies the maximum gossip shadow round time. It can be used to reduce the gossip feature check time during node boot up. For instance, when the first node in the cluster, which listed both itself and other node as seed in the yaml config, boots up, it will try to talk to other seed nodes which are not started yet. The gossip shadow round will be used to fetch the feature info of the cluster. Since there is no other seed node in the cluster, the shadow round will fail. User can reduce the default shadow_round_ms option to reduce the boot time. Fixes #2615 Message-Id: <10916ce9059f3c7f1a1fb465919ae57de3b67d59.1500540297.git.asias@scylladb.com>	2017-08-02 09:52:35 +03:00
Duarte Nunes	85e85ec72e	Don't catch polymorphic exceptions by value It makes gcc a very sad compiler. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170726172053.5639-2-duarte@scylladb.com>	2017-07-27 09:39:58 +03:00
Duarte Nunes	50ad0003c6	db/schema_tables: Drop dropped columns when dropping tables Fixes #2633 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170726150228.2593-2-duarte@scylladb.com>	2017-07-26 18:41:28 +02:00
Duarte Nunes	3425403126	db/schema_tables: Store column_name in text form As does Cassandra. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170726150228.2593-1-duarte@scylladb.com>	2017-07-26 18:41:12 +02:00
Duarte Nunes	33e18a1779	db/schema_tables: Consider differing dropped columns If a node is notified of a schema change where the schema's dropped columns have changed, that node will miss the changes to the dropped columns. A scenario where this can happen is where a column c is dropped, then added with a different type, and then dropped again, with a node n having seen the first drop and being notified of the subsequent add and drop. Fixes #2616 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170725170622.4380-1-duarte@scylladb.com>	2017-07-26 11:59:34 +02:00
Tomasz Grabiec	ecc85988dd	legacy_schema_migrator: Don't snapshot empty legacy tables Otherwise we will create a new (empty) snapshot each time we boot. Message-Id: <1500573920-31478-2-git-send-email-tgrabiec@scylladb.com>	2017-07-21 16:56:31 +02:00
Duarte Nunes	937fe80a1a	Merge 'Fix possible inconsistency of table schema version' from Tomasz "Fixes issues uncovered in longevity test (#2608). Main problem is that due to time drift scylla_tables.version column may not get deleted on all nodes doing the schema merge, which will make some nodes come up with different table schema version than others. The inconsistency will not heal because scylla_tables doesn't take part in the schema sync. This is fixed by the last patch. This will cause nodes to constantly try to sync the schema, which under some conditions triggers #2617." * tag 'tgrabiec/fix-table-schema-version-inconsistency-v1' of github.com:scylladb/seastar-dev: schema_tables: Add scylla_tables to ALL schema: Make schema_mutations equality consistent with digest schema_tables: Extract compact_for_schema_digest() schema_tables: Always drop scylla_tables::version	2017-07-21 16:55:23 +02:00
Duarte Nunes	7eecda3a61	schema: Support compaction enabled attribute Fixes #2547 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170721132206.3037-1-duarte@scylladb.com>	2017-07-21 15:38:45 +02:00
Amnon Heiman	e345d05ebe	system_keyspace: Use paging for get compaction history There could be a lot of compactions when querying for compaction history. This patch changes the query to use paging. It would collect all results when returning to the caller. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2017-07-20 18:17:49 +03:00
Tomasz Grabiec	ed2388da2c	schema_tables: Add scylla_tables to ALL So that scylla_tables takes part in the digest and in mutations sent as part of schema sync. Otherwise inconsistencies in scylla_tables will not heal. Refs #2608.	2017-07-20 15:47:10 +02:00
Tomasz Grabiec	6adbe61e2f	schema_tables: Extract compact_for_schema_digest()	2017-07-20 15:47:10 +02:00
Tomasz Grabiec	1b85c316bf	schema_tables: Always drop scylla_tables::version It can happen that due to time drift between nodes, the incoming "version" cell will have higher timestamp than api::new_timestamp(). In such case the column would not be dropped and would cause version mismatch between nodes. Ensure it's always covered by using max of current time and cell's timestamp. Refs #2608.	2017-07-20 15:47:10 +02:00
Avi Kivity	c5ee62a6a4	Merge "restrict background writers with scheduling groups" from Glauber "This patchset restricts background writers - such as compactions, streaming flushes and memtable flushes to a maximum amount of CPU usage through a seastar::thread_scheduling_group. The said maximum is recommended to be set 50 % - it is default disabled, but can be adjusted through a configuration option until we are able to auto-tune this. The second patch in this series provides a preview on how such auto-tune would look like. By implementing a simple controller we automatically adjust the quota for the memtable writer processes, so that the rate at which bytes come in is equal to the rates at which bytes are flushed. Tail latencies are greatly reduced by this series, and heavy spikes that previously appeared on CPU-bound workloads are no more." * 'memtable-controller-v5' of https://github.com/glommer/scylla: simple controller for memtable/streaming writer shares. restrict background writers to 50 % of CPU.	2017-07-20 10:58:53 +03:00
Calle Wilund	7a583585a2	system_keyspace: Make sure "system" is written to keyspaces (visible) Fixes #2514 Bug in schema version 3 update: We failed to write "system" to the schema tables. Only visible on an empty instance of course. Message-Id: <1500469809-23546-2-git-send-email-calle@scylladb.com>	2017-07-19 16:18:56 +03:00
Calle Wilund	247c36e048	system_schema: Fix remaining places not handling two system keyspaces Some places remained where code looked directly at system_keyspace::NAME to determine iff a ks is considered special/system/protected. Including schema digest calculation. Export "is_system_keyspace" and use accordingly. Message-Id: <1500469809-23546-1-git-send-email-calle@scylladb.com>	2017-07-19 16:18:45 +03:00
Duarte Nunes	1daf1bc4bb	Merge 'Revert back to 1.7 schema layout in memory' from Tomasz "Fixes schema layout incompatibility in a mixed 1.7 and 2.0 cluster (#2555) by reverting back to using the old layout in memory and thus also in across-node requests. We still use the new v3 layout in schema tables (needed by drivers and external tools). Translations happen when converting to/from schema mutations." * tag 'tgrabiec/use-v2-schema-layout-in-memory-v2' of github.com:scylladb/seastar-dev: schema: Revert back to the 1.7 layout of static compact tables in memory schema: Use v3 column layout when converting to/from schema mutations schema: Encapsulate column layout translations in the v3_columns class	2017-07-19 12:52:52 +02:00
Duarte Nunes	115ff1095e	db/view: Use view schema for view pk operations Instead of base schema. Fixes #2504 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170718190703.12972-1-duarte@scylladb.com>	2017-07-19 09:59:34 +02:00
Tomasz Grabiec	a9237c1666	schema: Revert back to the 1.7 layout of static compact tables in memory We are using C* 3.x compatible layout in schema tables but want to keep using the 1.7 layout in memory for compatibility during rolling upgrade. This patch switches the schema and schema_builder classes back to the old layout. Translation of layout happens when converting to/from schema mutations. Notable changes: 1) Includes a revert of commit `6260f31e08` "thrift: Update CQL mapping of static CFs". 2) Brings back the "default_validation_class" schema attribute. In v3 it can be derived from column definitions, but in v2 it can't, so we have to store it. 3) legacy_schema_migrator and schema_builder don't have to do conversions to v3, this is now handled by the v3_columns class. schema_builder works with the same layout as schema, that is v2. 4) Includes a revert of commit `66991a7ccb` "v3 schema test fixes" Fixes #2555.	2017-07-19 09:52:15 +02:00
Tomasz Grabiec	dc2dc056a4	schema: Use v3 column layout when converting to/from schema mutations	2017-07-19 09:52:15 +02:00
Glauber Costa	c9a529ebee	simple controller for memtable/streaming writer shares. This patch introduces a simple controller that will adjust memtables CPU shares, trying to keep it around the soft limit: if we start going below it means we're too fast (unless we are idle) and shares are adjusted downwards. If we start going above it means we're too slow and shares are adjusted upwards. I have tested this extensively in a single-CPU setup with various CPU-bound workloads while tracking virtual dirty and the results are good, with virtual dirty fluctuating only slightly, somewhere within the desired range. Exceptions to this are: 1) when the load is very light - the idle system goes faster, and that's ok 2) when the load is very high - as foreground requests dominate we can't flush fast enough and hit the hard limit. However, in such scenarios the memtable shares do hit its maximum, and the results are no worse than they are right now and this will only be fixed by CPU-limiting the actual requests. This feature can be disabled with a config option - that is scheduled to go away as we acquire more confidence in this. When the feature is disabled, all background writers (streaming, compaction, memtables) will share the same scheduling group, with static quotas. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2017-07-18 23:35:47 -04:00
Glauber Costa	4f01ec0910	restrict background writers to 50 % of CPU. In scylla, we have foreground processes, which are latency sensitive and need to be responded to as fast as possible in order to maintain good latency profiles, and background processes, which are less so. The most important background processes we have during normal write workload operations are memtable writes and sstable compactions. Those processes are quite CPU-intensive, and left unchecked will easily dominate the CPU. Lower values of task-quota usually help, as it will force those processes to preempt more, but aren't enough to guarantee good isolation. We have seen boxes with good NVMe storage having their throughput reduced to less than half of the original baseline in a short dive down for the duration of a compaction. In the long run, our goal is to leverage the CPU scheduler to make sure that those processes are balanced with respect to all the others. However, the current state of affairs is causing grievances at this very moment. Thankfully, those processes live in a seastar::thread, that ships with its own rudimentary bandwidth control mechanism: the scheduling group. The goal of this patch is to wrap background processes together in a scheduling group, and assign to such group 50 % of our CPU power; the remainder being left to foreground processes. While we pride ourselves in dynamically adjusting things to the workload, we won't be able to do this properly before the CPU scheduler lands - and let's face it, leaving background processes run wild is not adaptive either. Every workload would benefit most from a different value for such shares, but 50 % is as fair as it gets if we really need static partitioning in the meantime. As a defense against unforeseen consequences, we'll leave the actual value as an option, but will do our best to hide it - as this is not a tunable that we want to be part of a normal Scylla setup. The most convenient place for this tunable is still db::config, so we can easily pass it down to the database layer - but we will not document it in the yaml, and will clearly note in the help string that it is not supposed to be tuned. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2017-07-18 23:35:33 -04:00
Asias He	adc5f0bd21	gossip: Implement the missing fd_max_interval_ms and fd_initial_value_ms option It is useful for larger clusters with larger gossip message latency. By default the fd_max_interval_ms is 2 seconds which means the failure_detector will ignore any gossip message update interval larger than 2 seconds. However, in larger clusters, the gossip message update interval can be larger than 2 seconds. Fixes #2603. Message-Id: <49b387955fbf439e49f22e109723d3a19d11a1b9.1500278434.git.asias@scylladb.com>	2017-07-17 13:29:16 +03:00
Duarte Nunes	13caccf1cf	Merge 'Fixes around migration to v3 schema tables' from Tomasz branch 'tgrabiec/schema-migration-fixes' of github.com:scylladb/seastar-dev: schema: Use proper name comparator legacy_schema_migrator: Properly migrate non-UTF8 named columns schema_tables: Store column_name in text form legacy_schema_migrator: Migrate columns like Cassandra schema_builder: Add factory method for default_names legacy_schema_migrator: Simplify logic thrift: Don't set regular_column_name_type schema: Use proper column name type for static columns schema: Fix column_name_type() for static compact tables schema: Introduce clustering_column_at() thrift: Reuse cell_comparator::to_sstring() for obtaining comparator type partition_slice_builder: Use proper column's type instead of regular_column_name_type()	2017-07-17 11:16:52 +02:00
Tomasz Grabiec	7e54290d38	legacy_schema_migrator: Properly migrate non-UTF8 named columns Currently migrator assumed all columns are utf8-named, which doesn't have to be the case for static compact tables. Refs #2597. Due to #2573, we can assume that Scylla wasn't used with non-utf8 column names, and that old names are always in textual form.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	60a76efd37	schema_tables: Store column_name in text form That's how it is stored by Cassandra. Refs #2597.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	61229a7536	legacy_schema_migrator: Migrate columns like Cassandra This fixes generation of synthetic columns for static compact tables. Current code always generates synthetic clustering column with utf8 type and synthetic regular column with bytes type (in schema_builder). That's fine when creating a new CQL table, but not when migrating existing tables created via thrift API. Fixes #2584. This also migrates empty compact value columns like Cassandra does. Such columns are present in compact tables without regular columns, e.g.: create table test (k int, ck int, primary key (k, ck)) with compact storage; They should be migrated to a synthetic regular column with empty_type type and a non-empty name.	2017-07-17 09:40:06 +02:00
Tomasz Grabiec	6dc299c27a	legacy_schema_migrator: Simplify logic The expression "is_dense.value_or(true)" is always true inside the if, so drop it. This allows us to drop temporary calulated_is_dense. We can also get rid of one of the if branches by extracting builder.set_is_dense() outside.	2017-07-17 09:40:06 +02:00
Vlad Zolotarov	45e23d8090	db::config: fix the permissions cache related parameters description Make the descriptions of permissions_validity_in_ms, permissions_update_interval_in_ms and permissions_cache_max_entries more readable and more related to what they really do. Mention the non-zero value requirement for the permissions_update_interval_in_ms and the permissions_cache_max_entries when the permissions cache is enabled. Adjust the parameters description in the scylla.yaml too. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1499957053-31792-1-git-send-email-vladz@scylladb.com>	2017-07-13 16:00:40 +01:00
Tomasz Grabiec	30ec4af949	legacy_schema_migrator: Fix calculation of is_dense Current algorithm was marking tables with regular columns not named "value" as not dense, which doesn't have to be the case. It can be either way. It should be enough to look at clustering components. If there is a clustering key, then table is dense if and only if all comparator components belong to the clustering key. If there is no clustering key, then if there are any regular columns we're sure it's not dense. Fixes #2587. Message-Id: <1499877777-7083-1-git-send-email-tgrabiec@scylladb.com>	2017-07-13 17:28:09 +03:00
Avi Kivity	a397889c81	Merge "Preserve table schema digest on schema tables migration" from Tomasz "Currently new nodes calculate digests based on v3 schema mutations, which are very different from v2 mutations. As a result they will use schemas with different table_schema_version than the old nodes. The old nodes will not recognize the version and will try to request its definition. That will fail, because old nodes don't understand v3 schema mutations. To fix this problem, let's preserve the digests during migration, so that they're the same on new and old nodes. This will allow requests to proceed as usual. This does not solve the problem of schema being changed during the rolling upgrade. This is not allowed, as it would bring the same problem back. Fixes #2549." * tag 'tgrabiec/use-consistent-schema-table-digests-v2' of github.com:cloudius-systems/seastar-dev: tests: Add test for concurrent column addition legacy_schema_migrator: Set digest to one compatible with the old nodes schema_tables: Persist table_schema_version schema_tables: Introduce system_schema.scylla_tables schema_tables: Simplify read_table_mutations() schema_tables: Resurrect v2 read_table_mutations() system_keyspace: Forward-declare legacy schemas legacy_schema_migrator: Take storage_proxy as dependency	2017-07-11 17:22:42 +03:00
Gleb Natapov	739dd878e3	consistency_level: report less live endpoints in Unavailable exception if there are pending nodes DowngradingConsistencyRetryPolicy uses live replicas count from Unavailable exception to adjust CL for retry, but when there are pending nodes CL is increased internally by a coordinator and that may prevent retried query from succeeding. Adjust live replica count in case of pending node presence so that retried query will be able to proceed. Fixes #2535 Message-Id: <20170710085238.GY2324@scylladb.com>	2017-07-11 16:51:56 +03:00
Tomasz Grabiec	f5909ec515	legacy_schema_migrator: Set digest to one compatible with the old nodes Calculate and set digest using v2 mutations so that digests are the same before and after migration. This is needed so that no schema definition exchange is required during rolling upgrade. Fixes #2549.	2017-07-11 14:52:23 +02:00
Tomasz Grabiec	5b69d99bf8	schema_tables: Persist table_schema_version When migrating schema tables from v2 to v3, mutations underlying table schema will change, and so will their digest. However, we want the digest to be the same on new nodes as on the old nodes, because schema exchange is not possible between the two nodes, so they must not need to request schema definitions from each other. The solution is to make the digest persistable, so that it sticks to given table schema, surviving both migration and node restarts. On migration from v2, the digest will be calculated from v2 mutations, so it will be the same on new and old nodes.	2017-07-11 14:52:23 +02:00
Tomasz Grabiec	cdf5b67522	schema_tables: Introduce system_schema.scylla_tables It will be used to store Scylla-specific table metadata. We cannot store it in the standard "tables" table for compatibility reasons - Cassandra will fail to read schema if it encounters columns it is not expecting.	2017-07-11 14:52:23 +02:00
Tomasz Grabiec	cdcdf4772f	schema_tables: Simplify read_table_mutations()	2017-07-11 14:52:23 +02:00
Tomasz Grabiec	6e62bc77f1	schema_tables: Resurrect v2 read_table_mutations()	2017-07-11 14:52:23 +02:00
Tomasz Grabiec	4b5818a404	system_keyspace: Forward-declare legacy schemas	2017-07-11 14:52:23 +02:00
Tomasz Grabiec	8624edc0fa	legacy_schema_migrator: Take storage_proxy as dependency Will be needed to query for mutations.	2017-07-11 14:52:23 +02:00


923 Commits