scylladb

Author	SHA1	Message	Date
Nadav Har'El	b2b7ae4e41	alternator: require alternator-port configuration Until now, we always opened the Alternator port along with Scylla's regular ports (CQL etc.). This should really be made optional. With this patch, by default Alternator does NOT start and does not open a port. Run Scylla with --alternator-port=8000 to open an Alternator API port on port 8000, as was the default until now. It's also possible to set this in scylla.yaml. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2019-08-19 15:48:16 +03:00
Calle Wilund	1afc899e37	type_parser: Fix/improve exception messages Removes long-standing FIXME for message detail Also simplifies some code, removing duplication. Message-Id: <20190812134144.2417-1-calle@scylladb.com>	2019-08-12 17:03:43 +03:00
Avi Kivity	e6cde72d2b	Merge "Fix cql server admission control to take all leftover work into account" from Gleb " Current admission control takes a permit when a cql request starts and releases it when the reply is sent, but some requests may leave background work behind after that point (some because there is genuine background work to do, like completing a write or doing a read repair, and some because a read/write may be stuck in a queue longer than the request's timeout), so after Scylla replies with a timeout some resources are still occupied. The series fixes this by passing the permit down to storage_proxy where it is held until all background work is completed. Fixes #4768 " * 'gleb/admission-v3' of github.com:scylladb/seastar-dev: transport: add a metric to follow memory available for service permit. storage_proxy: store a permit in a read executor storage_proxy: store a permit in a write response handler Pass service permit to storage_proxy transport: introduce service_permit class and use it instead of semaphore_units transport: hold admission a permit until a reply is sent transport: remove cql server load balancer	2019-08-12 11:02:37 +03:00
Gleb Natapov	6a4207f202	Pass service permit to storage_proxy Current cql transport code acquires a permit before processing a query and releases it when the query gets a reply, but some queries leave work behind. If the work is allowed to accumulate without any limit, a server may eventually run out of memory. To prevent that, the permit system should account for the background work as well. The patch is a first step in this direction. It passes a permit down to storage proxy where it will later be held by background work.	2019-08-12 10:20:43 +03:00
Gleb Natapov	7e3805ed3d	transport: remove cql server load balancer It is buggy, unused and unnecessarily complicates the code.	2019-08-11 16:08:52 +03:00
Calle Wilund	d3410f0e48	config: Add rpc_interface_prefer_ipv6 parameter As already existing in scylla.yaml	2019-08-06 08:32:10 +00:00
Calle Wilund	0028cecb8e	config: Add listen_interface_prefer_ipv6 parameter As already existing in scylla.yaml. https://github.com/apache/cassandra/blob/cassandra-3.11/conf/cassandra.yaml#L622	2019-08-06 08:32:10 +00:00
Calle Wilund	39d18178eb	config.cc: Fix enable_ipv6_dns_lookup actual param name When adding option (and iterating through config refactoring) the member name and the config param name got out of sync	2019-08-06 08:32:09 +00:00
Tomasz Grabiec	bf70ee3986	config, exceptions: Add helper for handling internal errors The handler is intended to be called when internal invariants are violated and the operation cannot safely continue. The handler either throws (default) or aborts, depending on configuration option. Passing --abort-on-internal-error on the command line will switch to aborting. The reason we don't abort by default is that it may bring the whole cluster down and cause unavailability, while it may not be necessary to do so. It's safer to fail just the affected operation, e.g. repair. However, failing the operation with an exception leaves little information for debugging the root cause. So the idea is that the user would enable aborts on only one of the nodes in the cluster to get a core dump and not bring the whole cluster down.	2019-08-02 11:13:54 +02:00
Avi Kivity	e03c7003f1	toppartitions: fix race between listener removal and reads Data listener reads are implemented as flat_mutation_readers, which take a reference to the listener and then execute asynchronously. The listener can be removed between the time when the reference is taken and actual execution, resulting in a dangling pointer dereference. Fix by using a weak_ptr to avoid writing to a destroyed object. Note that writes don't need protection because they execute atomically. Fixes #4661. Tests: unit (dev)	2019-07-22 13:26:18 +02:00
Rafael Ávila de Espíndola	636e2470b1	Always close commitlog files We were using segment::_closed to decide whether _file was already closed. Unfortunately they are not exactly the same thing. As far as I understand it, segments can be closed and reused without actually closing the file. Found with a seastar patch that asserts on destroying an open append_challenged_posix_file_impl. Fixes #4745. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190721171332.7995-1-espindola@scylladb.com>	2019-07-22 10:08:57 +03:00
Piotr Sarna	c1d5aef735	db: add system_schema.computed_columns Information on which columns of a table are 'computed' is now kept in system_schema.computed_columns system table.	2019-07-19 11:58:42 +02:00
Piotr Sarna	17c323c096	database: add fixing previous secondary index schemas If a schema was created before computed columns were implemented, its token column may not have been marked as computed. To remedy this, if no computed column is found, the schema will be recreated. The code will work correctly even without this patch in order to support upgrading from legacy versions, but it's still important: it transforms token columns from the legacy format to new computed format, which will eventually (after a few release cycles) allow dropping the support for legacy format altogether.	2019-07-19 11:58:42 +02:00
Piotr Sarna	3c5dd94306	view: remove unused token_for function The function was only used once in code removed in this series.	2019-07-19 11:58:42 +02:00
Piotr Sarna	6a6871aa0e	view: check for computed columns in view Currently, having a 'computed' column in view update generation indicates that token value needs to be generated and assigned to it.	2019-07-19 11:58:42 +02:00
Piotr Sarna	a0e02df36a	service: add computed columns feature Computed columns feature should be checked before creating index schemas the new way - by adding computed column names to system_schema.computed_columns.	2019-07-19 11:58:42 +02:00
Tomasz Grabiec	14700c2ac4	Merge "Fix the system.size_estimates table" from Kamil Fixes a segfault when querying for an empty keyspace. Also, fixes an infinite loop on smp > 1. Queries to system.size_estimates table which are not single-partition queries caused Scylla to go into an infinite loop inside multishard_combining_reader::fill_buffer. This happened because multishard_combining_reader assumes that shards return rows belonging to separate partitions, which was not the case for size_estimates_mutation_reader. Fixes #4689.	2019-07-15 22:09:30 +02:00
Piotr Sarna	ac7531d8d9	db,hints: decouple in-flight hints limits from resource manager The resource manager is used to manage common resources between various hints managers. In-flight hints used to be one of the shared resources, but it proves to cause starvation, when one manager eats the whole limit - which may be especially painful if the background materialized views hints manager starves the regular hints manager, which can in turn start failing user writes because of admission control. This patch makes the limit per-manager again, which effectively reverts the limit to its original behavior. Fixes #4483 Message-Id: <8498768e8bccbfa238e6a021f51ec0fa0bf3f7f9.1559649491.git.sarna@scylladb.com>	2019-07-12 19:21:26 +03:00
Kamil Braun	60a4867a5b	Fix infinite looping when performing a range query on system.size_estimates. Queries to system.size_estimates table which are not single-partition queries caused Scylla to go into an infinite loop inside multishard_combining_reader::fill_buffer. This happened because multishard_combining_reader assumes that shards return rows belonging to separate partitions, which was not the case for size_estimates_mutation_reader. This commit fixes the issue and closes #4689.	2019-07-12 18:09:15 +02:00
Kamil Braun	ba5a02169e	Fix segmentation fault when querying system.size_estimates for an empty keyspace.	2019-07-12 18:02:10 +02:00
Kamil Braun	a1665b74a9	Refactor size_estimates_virtual_reader Move the implementation of size_estimates_mutation_reader to a separate compilation unit to speed up compilation times and increase readability. Refactor tests to use seastar::thread.	2019-07-12 17:53:00 +02:00
Calle Wilund	1f5e1d22bf	db::config: Add enable ipv6 switch (default off) Off by default to prevent problems during cluster migration when needing to gossip with non-ipv6 aware nodes.	2019-07-08 14:13:09 +00:00
Calle Wilund	4ef940169f	Replace use of "ipv4_addr" with socket_address Allows the various sockets to use ipv6 address binding if so configured.	2019-07-08 14:13:09 +00:00
Calle Wilund	5fd811ec8a	gms::inet_address: Change inet_address to wrap actual seastar::net::inet_address Thusly handle all types net::inet_address can handle. I.e. ipv6.	2019-07-08 14:13:09 +00:00
Calle Wilund	f317d7a975	commitlog: Simplify commitlog extension iteration Fixes #4640 Iterating extensions in commitlog.cc should mimic that in sstables.cc, i.e. a simple future-chain. Should also use same order for read and write open, as we should preserve transformation stack order. Message-Id: <20190702150028.18042-1-calle@scylladb.com>	2019-07-02 18:37:44 +03:00
Tomasz Grabiec	eb496b5eae	Merge "Allow changing configuration at runtime" from Avi This patchset allows changing the configuration at runtime. The user triggers this by editing the configuration file normally, then signalling the database with SIGHUP (as is traditional). The implementation is somewhat complicated due to the need to store non-atomic mutable state per-shard and to synchronize the values in all shards. This is somewhat similar to Seastar's sharded<>, but that cannot be used since the configuration is read before Seastar is initialized (due to the need to read command-line options). Tests: unit (dev, debug), manual test with extra prints (dev) Ref #2689 Fixes #2517.	2019-07-01 15:04:59 +02:00
Avi Kivity	2abe015150	database: allow live update of the compaction_enforce_min_threshold config item Change the type from bool to updateable_value<bool> throughout the dependency chain and mark it as live updateable. In theory we should also observe the value and trigger compaction if it changes, but I don't think it is worthwhile.	2019-06-28 16:43:25 +03:00
Avi Kivity	8d7c1c7231	db: seed_provider_type: add operator==() Dynamically updateable configuration requires checking whether configuration items changed or not, so we can skip firing notifiers for the common case where nothing changed. This patch adds a comparison operator for seed_provider_type, which was missing it.	2019-06-28 16:43:25 +03:00
Avi Kivity	da2a98cde6	config: don't allow assignment to config values Currently, we allow adjusting configuration via cfg.whatever() = 5; by returning a mutable reference from cfg.whatever(). Soon, however, this operation will have side effects (updating all references to the config item, and triggering notifiers). While this can be done with a proxy, it is too tricky. Switch to an ordinary setter interface: cfg.whatever.set(5); Because boost::program_options no longer gets a reference to the value to be written to, we have to move the update to a notifier, and the value_ex() function has to be adjusted to infer whether it was called with a vector type after it is called, not before.	2019-06-28 16:43:25 +03:00
Glauber Costa	d916601ea4	toppartitions: fix typo toppartitons -> toppartitions Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20190627160937.7842-1-glauber@scylladb.com>	2019-06-27 19:13:58 +03:00
Piotr Sarna	85a3a4b458	view: ignore duplicated key entries in progress virtual reader Build progress virtual reader uses Scylla-specific scylla_views_builds_in_progress table in order to represent legacy views_builds_in_progress rows. The Scylla-specific table contains additional cpu_id clustering key part, which is trimmed before returning it to the user. That may cause duplicated clustering row fragments to be emitted by the reader, which may cause undefined behaviour in consumers. The solution is to keep track of previous clustering keys for each partition and drop fragments that would cause duplication. That way if any shard is still building a view, its progress will be returned, and if many shards are still building, the returned value will indicate the progress of a single arbitrary shard. Fixes #4524 Tests: unit(dev) + custom monotonicity checks from <tgrabiec@scylladb.com>	2019-06-11 13:01:31 +02:00
Juliana Oliveira	fd83f61556	Add a warning for partitions with too many rows This patch adds a warning option to the user for situations where rows count may get bigger than initially designed. Through the warning, users can be aware of possible data modeling problems. The threshold is initially set to '100,000'. Tests: unit (dev) Message-Id: <20190528075612.GA24671@shenzou.localdomain>	2019-06-06 19:48:57 +03:00
Piotr Sarna	74f6ab7599	db: drop unnecessary double computation when feeding hash When feeding hash for schema digest, compact_for_schema_digest is mistakenly called twice, which may result in needless recomputation. Message-Id: <8f52201cf428a55e7057d8438025275023eb9288.1559826555.git.sarna@scylladb.com>	2019-06-06 16:16:47 +03:00
Calle Wilund	1e37e1d40c	commitlog: Add optional use of O_DSYNC mode Refs #3929 Optionally enables O_DSYNC mode for segment files, and when enabled ignores actual flushing and just barriers any ongoing writes. Iff using O_DSYNC mode, we will not only truncate the file to max size, but also do an actual initial write of zeros to it, since XFS (intended target) has observably worse behaviour on non-physical file blocks. Once written (and maybe recycled) we should have rather satisfying throughput on writes. Note that the O_DSYNC behaviour is hidden behind a default disabled option. While users should probably seldom worry about this, we should add some sort of logic in main/init that, unless specified by the user, evaluates the commitlog disk and sets this to true if it is using XFS and looks ok. This is because using O_DSYNC on things like EXT4 etc has quite horrible performance. All above statements about performance and O_DSYNC behaviour are based on a sampling of benchmark results (modified fsqual) on a statistically non-significant selection of disks. However, at least there the observed behaviour is a rather large difference between ::fallocate'd disk area vs. actually written using O_DSYNC on XFS, and O_DSYNC on EXT4. Note also that measurements on O_DSYNC vs. no O_DSYNC do not take into account the wall-clock time of doing manual disk flush. This is intentionally ignored, since in the commitlog case, at least using periodic mode, flushes are relatively rare. Message-Id: <20190520120331.10229-1-calle@scylladb.com>	2019-05-20 15:10:48 +03:00
Avi Kivity	5b2c8847c7	Merge "Pre timestamp based data segregation cleanup" from Botond " This series contains loosely related generic cleanup patches that the timestamp based data segregation series depends on. Most of the patches have to do with making headers self-sustainable, that is compilable on their own. This was needed to be able to ensure that the new headers introduced or touched by that series are self-sustainable too. This series also introduces `schema_fwd.hh` which contains a forward declaration of `schema` and `schema_ptr` classes. No effort was made to find and replace all existing ad-hoc schema forward declarations in the source tree. " * 'pre-timestamp-based-data-segregation-cleanup/v1' of https://github.com/denesb/scylla: encoding_stats.hh: add missing include sstables/time_window_compaction_strategy.hh: make self-sufficient sstables/size_tiered_compaction_strategy.hh: make self-sufficient sstables/compaction_strategy_impl.hh: make header self-sufficient compaction_strategy.hh: use schema_fwd.hh db/extensions.hh: use schema_fwd.hh Add schema_fwd.hh	2019-05-15 17:37:06 +03:00
Avi Kivity	82b91c1511	Merge "gc_clock: Fix hashing to be backwards-compatible" from Tomasz " Commit `d0f9e00` changed the representation of the gc_clock::duration from int32_t to int64_t. Mutation hashing uses appending_hash<gc_clock::time_point>, which by default feeds duration::count() into the hasher. duration::rep changed from int32_t to int64_t, which changes the value of the hash. This affects schema digest and query digests, resulting in mismatches between nodes during a rolling upgrade. Fixes #4460. Refs #4485. " * tag 'fix-gc_clock-digest-v2.1' of github.com:tgrabiec/scylla: tests: Add test which verifies that schema digest stays the same tests: Add sstables for the schema digest test schema_tables, storage_service: Make schema digest insensitive to expired tombstones in empty partition db/schema_tables: Move feed_hash_for_schema_digest() to .cc file hashing: Introduce type-erased interface for the hasher hashing: Introduce C++ concept for the hasher hashers: Rename hasher to cryptopp_hasher gc_clock: Fix hashing to be backwards-compatible	2019-05-14 16:59:50 +03:00
Tomasz Grabiec	285ada5035	Merge "config: remove _make_config_values macro" from Avi The _make_config_values macro reduces duplication (both the item name and the types need to be available as C++ identifiers and as runtime strings), but is hard to work with. The macro is huge and editors don't handle it well, errors aren't identified at the correct location, and since the macro doesn't have types, it's hard to refactor. This series replaces the macro with ordinary C++ code. Some repetition is introduced, but IMO the result is easier to maintain than the macro. As a bonus the bulk of the code is moved away from the header file. Tests: unit (dev), manual testing of the config REST API * https://github.com/avikivity/scylla config-no-macro/v2 config: make the named_value type name available without requiring _make_config_values config: remove value_status from named_value template parameter list config: add named_value::value_as_json() api: config: stop using _make_config_values config: auto-add named_values into config_file config: add allowed_values parameter to named_value constructor config: convert _make_config_values to individual named_value member declarations and initializers	2019-05-14 16:00:23 +03:00
Botond Dénes	690ef09b8f	db/extensions.hh: use schema_fwd.hh	2019-05-14 13:27:30 +03:00
Tomasz Grabiec	9de071d214	schema_tables, storage_service: Make schema digest insensitive to expired tombstones in empty partition Schema digest is calculated by querying for mutations of all schema tables, then compacting them so that all tombstones in them are dropped. However, even if the mutation becomes empty after compaction, we still feed its partition key. If the same mutations were compacted prior to the query, because the tombstones expire, we won't get any mutation at all and won't feed the partition key. So schema digest will change once an empty partition of some schema table is compacted away. That's not a problem during normal cluster operation because the tombstones will expire at all nodes at the same time, and schema digest, although it changes, will change to the same value on all nodes at about the same time. This fix changes digest calculation to not feed any digest for partitions which are empty after compaction. The digest returned by schema_mutations::digest() is left unchanged by this patch. It affects the table schema version calculation. It's not changed because the version is calculated on boot, where we don't yet know all the cluster features. It's possible to fix this but it's more complicated, so this patch defers that. Refs #4485.	2019-05-14 10:43:06 +02:00
Tomasz Grabiec	3a4a903674	db/schema_tables: Move feed_hash_for_schema_digest() to .cc file	2019-05-14 10:43:06 +02:00
Paweł Dziepak	49b4aeca4d	Merge "hinted handoff: prevent sending attempts" from Vlad " Fix the broken logic that is meant to prevent sending hints when node is in a DOWN NORMAL state. " * 'hinted_handoff_stop_sending_to_down_node-v2' of https://github.com/vladzcloudius/scylla: hints_manager: rename the state::ep_state_is_not_normal enum value hinted handoff: fix the logic that detects that the destination node is in DN state hinted_handoff: sender::can_send(): optimize gossiper::is_alive(ep) check hinted handoff: end_point_hints_manager::sender: use _gossiper instead of _shard_manager.local_gossiper() types.cc: fix the compilation with fmt v5.3.0	2019-05-09 15:18:57 +01:00
Vlad Zolotarov	f07c341efc	hints_manager: rename the state::ep_state_is_not_normal enum value Rename this state value to better reflect the reality: state::ep_state_is_not_normal -> state::ep_state_left_the_ring The manager gets to this state when the destination Node has left the ring. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2019-05-08 15:46:47 -04:00
Vlad Zolotarov	93ba700458	hinted handoff: fix the logic that detects that the destination node is in DN state When node is in a DN state its gossiper state may be NORMAL, SHUTDOWN or "" depending on the use case. In addition to that if node has been removed from the ring its state is also going to be removed from the gossiper_state map. Let's consider the above when deciding if node is in the DN state. Fixes #4461 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2019-05-08 14:53:01 -04:00
Avi Kivity	1c65ba6e66	Use correct scylla_tables schema for removing version column Mutations carry their schema, so use that instead of bring in a global schema, which may change as features are added. Message-Id: <20190505132542.6472-1-avi@scylladb.com>	2019-05-06 13:51:08 +02:00
Piotr Sarna	cf8d2a5141	Revert "view: cache is_index for view pointer" This reverts commit `dbe8491655`. Caching the value was not done in a correct manner, which resulted in longevity tests failures. Fixes #4478 Branches: 3.1 Message-Id: <762ca9db618ca2ed7702372fbafe8ecd193dcf4d.1557129652.git.sarna@scylladb.com>	2019-05-06 11:45:46 +03:00
Benny Halevy	d9136f96f3	commitlog: descriptor: skip leading path from filename std::regex_match of the leading path may run out of stack with long paths in debug build. Using rfind instead to look up the last '/' in the pathname and skip it if found. Fixes #4464 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20190505144133.4333-1-bhalevy@scylladb.com>	2019-05-05 17:51:56 +03:00
Gleb Natapov	95c6d19f6c	batchlog_manager: fix array out of bound access endpoint_filter() function assumes that each bucket of std::unordered_multimap contains elements with the same key only, so its size can be used to know how many elements with a particular key are there. But this is not the case, elements with multiple keys may share a bucket. Fix it by counting keys in other way. Fixes #3229 Message-Id: <20190501133127.GE21208@scylladb.com>	2019-05-01 17:30:11 +03:00
Tomasz Grabiec	077c639e42	Merge "Simplify the result_set_row API" from Rafael Currently null and missing values are treated differently. Missing values throw no_such_column. Null values return nullptr, std::nullopt or throw null_column_value. The api is a bit confusing since a function returning a std::optional either returns std::nullopt or throws depending on why there is no value. With this patch series only get_nonnull throws and there is only one exception type. * https://github.com/espindola/scylla.git espindola/merge-null-and-missing-v2: query-result-set: merge handling of null and missing values Remove result_set_row::has Return a reference from get_nonnull	2019-04-30 11:06:29 +02:00
Rafael Ávila de Espíndola	63c47117b5	Return a reference from get_nonnull No reason to copy if we don't have to. Now that get_nonnull doesn't copy, replace a raw use of get_data_value with it. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-04-29 21:14:11 -07:00
Rafael Ávila de Espíndola	0474458872	Remove result_set_row::has Now that the various get methods return nullptr or std::nullopt on missing values, we don't need to do double lookups. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-04-29 19:56:26 -07:00
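
Commit d9136f96f3 above describes replacing a std::regex_match over the leading path with an rfind lookup of the last '/'. A minimal sketch of that idea follows. The helper name strip_leading_path and the example file name are hypothetical and are not taken from the Scylla source; the real descriptor parsing does more than this.

```cpp
#include <string_view>

// Hypothetical helper (name is an assumption, not Scylla's API):
// returns the part of `pathname` after the last '/'.
// If there is no '/', the input is returned unchanged.
// rfind is a linear scan with constant stack use, unlike a
// recursive regex match over a long path.
constexpr std::string_view strip_leading_path(std::string_view pathname) {
    const auto pos = pathname.rfind('/');
    return pos == std::string_view::npos ? pathname
                                         : pathname.substr(pos + 1);
}

// Compile-time checks (C++17); the file name is illustrative only.
static_assert(strip_leading_path(std::string_view("/some/dir/CommitLog-1-2.log"))
                  == std::string_view("CommitLog-1-2.log"),
              "leading directories are skipped");
static_assert(strip_leading_path(std::string_view("CommitLog-1-2.log"))
                  == std::string_view("CommitLog-1-2.log"),
              "a bare file name is returned as-is");
```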
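
Commit 95c6d19f6c above points out that a bucket of std::unordered_multimap may hold elements with different keys, so the bucket size is not the number of elements for one key. The sketch below contrasts the two ways of counting. The alias endpoint_map and both function names are hypothetical; the commit only says the keys are now counted "in other way", so using count() here is an assumption about one reasonable fix, not the actual patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the map used by endpoint_filter():
// key = rack name, value = endpoint address.
using endpoint_map = std::unordered_multimap<std::string, std::string>;

// Unreliable: bucket_size() counts every element stored in the bucket,
// and elements with different keys may share that bucket.
std::size_t elements_in_bucket_of(const endpoint_map& m, const std::string& key) {
    assert(m.bucket_count() > 0);  // bucket() requires at least one bucket
    return m.bucket_size(m.bucket(key));
}

// Reliable: count() returns only the elements whose key compares equal.
std::size_t elements_with_key(const endpoint_map& m, const std::string& key) {
    return m.count(key);
}
```

The bucket-based figure is always at least as large as the per-key count, and using it as an array bound is what can lead to an out-of-bounds access when keys collide.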
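
Merge 82b91c1511 above explains that widening gc_clock::duration's representation from int32_t to int64_t changed the bytes fed to the hasher, and therefore the digest. The sketch below reproduces that effect with a toy FNV-1a hasher. Everything here is hypothetical: fnv1a_hasher, feed_integer, old_duration, new_duration and both digest functions are illustration names, and narrowing the count back to 32 bits is only an assumed way of keeping the digest stable. The series itself may do this differently.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Toy 64-bit FNV-1a hasher, a stand-in for the real hasher.
struct fnv1a_hasher {
    std::uint64_t state = 14695981039346656037ull;
    constexpr void update_byte(std::uint8_t b) {
        state ^= b;
        state *= 1099511628211ull;
    }
};

// Feeds an integer byte by byte, least significant byte first,
// so the number of bytes fed depends on sizeof(T).
template <typename T>
constexpr void feed_integer(fnv1a_hasher& h, T v) {
    using U = std::make_unsigned_t<T>;
    const U u = static_cast<U>(v);
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        h.update_byte(static_cast<std::uint8_t>(u >> (8 * i)));
    }
}

using old_duration = std::chrono::duration<std::int32_t>;  // before the change
using new_duration = std::chrono::duration<std::int64_t>;  // after the change

// Default behaviour: feed count() using whatever width the rep has.
template <typename D>
constexpr std::uint64_t digest_default(D d) {
    fnv1a_hasher h;
    feed_integer(h, d.count());
    return h.state;
}

// Assumed compatibility approach: feed the count as 32 bits, as before.
constexpr std::uint64_t digest_compat(new_duration d) {
    fnv1a_hasher h;
    feed_integer(h, static_cast<std::int32_t>(d.count()));
    return h.state;
}

static_assert(digest_default(old_duration(5)) != digest_default(new_duration(5)),
              "widening the rep changes the digest");
static_assert(digest_compat(new_duration(5)) == digest_default(old_duration(5)),
              "feeding 32 bits restores the old digest");
```

The narrowing cast only matches the old digest for values that fit in 32 bits, which is the case the rolling-upgrade mismatch is about.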


1407 Commits