scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-03 13:37:04 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	5b0e48f25b	Merge "toppartitions: don't transport schema_ptr across shards" from Avi When the toppartitions operation gathers results, it copies partition keys with their schema_ptr:s. When these schema_ptr:s are copies or destroyed, they can cause leaks or premature frees of the schema in its original shard since reference count operations in are not atomic. Fix that by converting the schema_ptr to a global_schema_ptr during transportation. Fixes #5104 (direct bug) Fixes #5018 (schema prematurely freed, toppartitions previously executed on that node) Fixes #4973 (corrupted memory pool of the same size class as schema, toppartitions previously executed on that node) Tests: new test added that fails with the existing code in debug mode, manual toppartitions test	2019-09-26 17:09:54 +02:00
Avi Kivity	670f398a8a	toppartitions: do not copy schema_ptr:s in item keys across shards Copying schema_ptrs across shards results in memory corruption since lw_shared_ptr does not use atomic operations for reference counts. Prevent that by converting schema_ptr:s to global_schema_ptr:s before shipping them across shards in the map operation, and converting them back to local schema_ptr:s in the reduce operation.	2019-09-26 17:26:40 +03:00
Avi Kivity	f015bd69b7	toppartitions: compare schemas using schema::id(), not pointer to schema This allows keys from different stages in the schema's like to compare equal. This is safe since the partition key cannot change, unlike the rest of the schema. More importantly, it will allow us to compare keys made local after a pass through global_schema_ptr, which does not guarantee that the schema_ptr conversion will be the same even when starting with the same global_schema_ptr.	2019-09-26 17:15:46 +03:00
Avi Kivity	ba64ec78cf	messaging_service: use rpc::tuple instead of variadic futures for rpc Since variadic future<> is deprecated, switch to rpc::tuple for multiple return values in rpc calls. This is more or less mechanical translation.	2019-09-26 12:09:31 +02:00
Rafael Ávila de Espíndola	5af8b1e4a3	types: recreate dependent user types. In the system.types table a user type refers to another by name. When a user type is modified, only its entry in the table is changed. At runtime a user type has direct pointer to the types it uses. To handle the discrepancy we need to recreate any dependent types when a entry in system.types changes. Fixes #5049 Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-09-25 15:41:45 -07:00
Rafael Ávila de Espíndola	34eddafdb0	types: Don't modify the type list in db::cql_type_parser::raw_builder With this patch db::cql_type_parser::raw_builder creates a local copy of the list of existing types and uses that internally. By doing that build() should have no observable behavior other than returning the new types. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-09-25 15:41:45 -07:00
Rafael Ávila de Espíndola	d6b2e3b23b	types: pass a reference to prepare_internal We were never passing a null pointer and never saving a copy of the lw_shared_ptr. Passing a reference is more flexible as not all callers are required to hold the user_types_metadata in a lw_shared_ptr. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-09-25 15:40:30 -07:00
Rafael Ávila de Espíndola	4d0916a094	commitlog: Handle gate_closed_exception Before this patch, if the _gate is closed, with_gate throws and forward_to is not executed. When the promise<> p is destroyed it marks its _task as a broken promise. What happens next depends on the branch. On master, we warn when the shared_future is destroyed, so this patch changes the warning from a broken_promise to a gate closed. On 3.1, we warn when the promises in shared_future::_peers are destroyed since they no longer have a future attached: The future that was attached was the "auto f" just before the with_gate call, and it is destroyed when with_gate throws. The net result is that this patch fixes the warning in 3.1. I will send a patch to seastar to make the warning on master more consistent with the warning in 3.1. Fixes #4394 Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190917211915.117252-1-espindola@scylladb.com>	2019-09-17 23:41:21 +02:00
Piotr Sarna	feec3825aa	view: degrade shutdown bookkeeping update failures log to warn Currently, if updating bookkeeping operations for view building fails, we log the error message and continue. However, during shutdown, some errors are more likely to happen due to existing issues like #4384. To differentiate actual errors from semi-expected errors during shutdown, the latter are now logged with a warning level instead of error. Fixes #4954	2019-09-16 10:13:06 +03:00
Tomasz Grabiec	79935df959	commitlog: replay: Respect back-pressure from memtable space to prevent OOM Commit log replay was bypassing memtable space back-pressure, and if replay was faster than memtable flush, it could lead to OOM. The fix is to call database::apply_in_memory() instead of table::apply(). The former blocks when memtable space is full. Fixes #4982. Tests: - unit (release) - manual, replay with memtable flush failin and without failing Message-Id: <1568381952-26256-1-git-send-email-tgrabiec@scylladb.com>	2019-09-15 11:51:56 +03:00
Tomasz Grabiec	8517eecc28	Revert "Simplify db::cql_type_parser::parse" This reverts commit `7f64a6ec4b`. Fixes #5011 The reverted commit exposes #3760 for all schemas, not only those which have UDTs. The problem is that table schema deserialization now requires keyspace to be present. If the replica hasn't received schema changes which introduce the keyspace yet, the write will fail.	2019-09-12 12:45:21 +02:00
Nadav Har'El	b2bd3bbc1f	alternator: add "--alternator-address" configuration parameter So far we had the "--alternator-port" option allowing to configure the port on which the Alternator server listens on, but the server always listened to any address. It is important to also be able to configure the listen address - it is useful in tests running several instances of Scylla on the same machine, and useful in multi-homed machines with several interfaces. So this patch adds the "--alternator-address" option, defaulting to 0.0.0.0 (to listen on all interfaces). It works like the many other "--*-address" options that Scylla already has. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190808204641.28648-1-nyh@scylladb.com>	2019-09-11 18:01:05 +03:00
Nadav Har'El	c0518183c2	alternator: require alternator-port configuration Until now, we always opened the Alternator port along with Scylla's regular ports (CQL etc.). This should really be made optional. With this patch, by default Alternator does NOT start and does not open a port. Run Scylla with --alternator-port=8000 to open an Alternator API port on port 8000, as was the default until now. It's also possible to set this in scylla.yaml. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2019-09-11 12:38:31 +03:00
Tomasz Grabiec	a09479e63c	Merge "Validate position in partition monotonicity" from Benny Introduce mutation_fragment_stream_validator class and use it as a Filter to flat_mutation_reader::consume_in_thread from sstable::write_components to validate partition region and optionally clustering key monotonicity. Fixes #4803	2019-09-09 15:38:31 +02:00
Benny Halevy	42f6462837	config: enable_sstable_key_validation by default in debug build Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-09-09 15:30:59 +03:00
Benny Halevy	34d306b982	config: add enable_sstable_key_validation option key monotonicity validation requires an overhead to store the last key and also to compare therefore provide an option to enable/disable it (disabled by default). Refs #4804 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-09-09 15:30:59 +03:00
Gleb Natapov	e52ebfb957	cql3: remove unused next_timestamp() function next_timestamp() just calls get_timestamp() directly and nobody uses it anyway. Message-Id: <20190905101648.GO21540@scylladb.com>	2019-09-05 17:20:21 +03:00
Botond Dénes	e02b93cae1	schema_tables: convert_schema_to_mutations: return canonical_mutations In preparation to the schema push/pull migrating to use canonical mutations, convert the method producing the schema mutations to return a vector of canonical mutations. The only user, MIGRATION_REQUEST verb, converts the canonical mutations back to frozen mutations. This is very inefficient, but this path will only be used in mixed clusters. After all nodes are upgraded the verb will be sending the canonical mutations directly instead.	2019-09-04 08:47:20 +03:00
Piotr Sarna	23c891923e	main: make sure view_builder doesn't propagate semaphore errors Stopping services which occurs in a destructor of deferred_action should not throw, or it will end the program with terminate(). View builder breaks a semaphore during its shutdown, which results in propagating a broken_semaphore exception, which in turn results in throwing an exception during stop().get(). In order to fix that issue, semaphore exceptions are explicitly ignored, since they're expected to appear during shutdown. Fixes #4875	2019-09-01 11:59:57 +03:00
Botond Dénes	136fc856c5	treewide: silence discarded future warnings for questionable discards This patches silences the remaining discarded future warnings, those where it cannot be determined with reasonable confidence that this was indeed the actual intent of the author, or that the discarding of the future could lead to problems. For all those places a FIXME is added, with the intent that these will be soon followed-up with an actual fix. I deliberately haven't fixed any of these, even if the fix seems trivial. It is too easy to overlook a bad fix mixed in with so many mechanical changes.	2019-08-26 19:28:43 +03:00
Botond Dénes	fddd9a88dd	treewide: silence discarded future warnings for legit discards This patch silences those future discard warnings where it is clear that discarding the future was actually the intent of the original author, and they did the necessary precautions (handling errors). The patch also adds some trivial error handling (logging the error) in some places, which were lacking this, but otherwise look ok. No functional changes.	2019-08-26 18:54:44 +03:00
Avi Kivity	67b0d379e0	main: add glue between db::config and cql3::cql_config Copy values between the flat db::config and the hierarchical cql_config, adding observers to keep the values updated.	2019-08-21 19:35:59 +02:00
Piotr Sarna	3cc5a04301	db,view: wrap view update generation in stream scheduling group Generating view updates is used by streaming, so the service itself should also run under the matching scheduling group.	2019-08-20 00:24:50 +02:00
Calle Wilund	1afc899e37	type_parser: Fix/improve exception messages Removes long-standing FIXME for message detail Also simplifies some code, removing duplication. Message-Id: <20190812134144.2417-1-calle@scylladb.com>	2019-08-12 17:03:43 +03:00
Avi Kivity	e6cde72d2b	Merge "Fix cql server admission control to take all leftover work into account" from Gleb " Current admission control takes a permit when cql requests starts and releases it when reply is sent, but some requests may leave background work behind after that point (some because there is genuine background work to do like complete a write or do a read repair, and some because a read/write may stuck in a queue longer than the request's timeout), so after Scylla replies with a timeout some resources are still occupied. The series fixes this by passing the permit down to storage_proxy where it is held until all background work is completed. Fixes #4768 " * 'gleb/admission-v3' of github.com:scylladb/seastar-dev: transport: add a metric to follow memory available for service permit. storage_proxy: store a permit in a read executor storage_proxy: store a permit in a write response handler Pass service permit to storage_proxy transport: introduce service_permit class and use it instead of semaphore_units transport: hold admission a permit until a reply is sent transport: remove cql server load balancer	2019-08-12 11:02:37 +03:00
Gleb Natapov	6a4207f202	Pass service permit to storage_proxy Current cql transport code acquire a permit before processing a query and release it when the query gets a reply, but some quires leave work behind. If the work is allowed to accumulate without any limit a server may eventually run out of memory. To prevent that the permit system should account for the background work as well. The patch is a first step in this direction. It passes a permit down to storage proxy where it will be later hold by background work.	2019-08-12 10:20:43 +03:00
Gleb Natapov	7e3805ed3d	transport: remove cql server load balancer It is buggy, unused and unnecessary complicates the code.	2019-08-11 16:08:52 +03:00
Calle Wilund	d3410f0e48	config: Add rpc_interface_prefer_ipv6 parameter As already existing in scylla.yaml	2019-08-06 08:32:10 +00:00
Calle Wilund	0028cecb8e	config: Add listen_interface_perfer_ipv6 parameter As already existing in scylla.yaml. https://github.com/apache/cassandra/blob/cassandra-3.11/conf/cassandra.yaml#L622	2019-08-06 08:32:10 +00:00
Calle Wilund	39d18178eb	config.cc: Fix enable_ipv6_dns_lookup actual param name When adding option (and iterating through config refactoring) the member name and the config param name got out of sync	2019-08-06 08:32:09 +00:00
Tomasz Grabiec	bf70ee3986	config, exceptions: Add helper for handling internal errors The handler is intended to be called when internal invariants are violated and the operation cannot safely continue. The handler either throws (default) or aborts, depending on configuration option. Passing --abort-on-internal-error on the command line will switch to aborting. The reason we don't abort by default is that it may bring the whole cluster down and cause unavailability, while it may not be necessary to do so. It's safer to fail just the affected operation, e.g. repair. However, failing the operation with an exception leaves little information for debugging the root cause. So the idea is that the user would enable aborts on only one of the nodes in the cluster to get a core dump and not bring the whole cluster down.	2019-08-02 11:13:54 +02:00
Avi Kivity	e03c7003f1	toppartitions: fix race between listener removal and reads Data listener reads are implemented as flat_mutation_readers, which take a reference to the listener and then execute asynchronously. The listener can be removed between the time when the reference is taken and actual execution, resulting in a dangling pointer dereference. Fix by using a weak_ptr to avoid writing to a destroyed object. Note that writes don't need protection because they execute atomically. Fixes #4661. Tests: unit (dev)	2019-07-22 13:26:18 +02:00
Rafael Ávila de Espíndola	636e2470b1	Always close commitlog files We were using segment::_closed to decide whether _file was already closed. Unfortunately they are not exactly the same thing. As far as I understand it, segments can be closed and reused without actually closing the file. Found with a seastar patch that asserts on destroying an open append_challenged_posix_file_impl. Fixes #4745. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20190721171332.7995-1-espindola@scylladb.com>	2019-07-22 10:08:57 +03:00
Piotr Sarna	c1d5aef735	db: add system_schema.computed_columns Information on which columns of a table are 'computed' is now kept in system_schema.computed_columns system table.	2019-07-19 11:58:42 +02:00
Piotr Sarna	17c323c096	database: add fixing previous secondary index schemas If a schema was created before computed columns were implemented, its token column may not have been marked as computed. To remedy this, if no computed column is found, the schema will be recreated. The code will work correctly even without this patch in order to support upgrading from legacy versions, but it's still important: it transforms token columns from the legacy format to new computed format, which will eventually (after a few release cycles) allow dropping the support for legacy format altogether.	2019-07-19 11:58:42 +02:00
Piotr Sarna	3c5dd94306	view: remove unused token_for function The function was only used once in code removed in this series.	2019-07-19 11:58:42 +02:00
Piotr Sarna	6a6871aa0e	view: check for computed columns in view Currently, having a 'computed' column in view update generation indicates that token value needs to be generated and assigned to it.	2019-07-19 11:58:42 +02:00
Piotr Sarna	a0e02df36a	service: add computed columns feature Computed columns feature should be checked before creating index schemas the new way - by adding computed column names to system_schema.computed_columns.	2019-07-19 11:58:42 +02:00
Tomasz Grabiec	14700c2ac4	Merge "Fix the system.size_estimates table" from Kamil Fixes a segfault when querying for an empty keyspace. Also, fixes an infinite loop on smp > 1. Queries to system.size_estimates table which are not single-partition queries caused Scylla to go into an infinite loop inside multishard_combining_reader::fill_buffer. This happened because multishard_combinind_reader assumes that shards return rows belonging to separate partitions, which was not the case for size_estimates_mutation_reader. Fixes #4689.	2019-07-15 22:09:30 +02:00
Piotr Sarna	ac7531d8d9	db,hints: decouple in-flight hints limits from resource manager The resource manager is used to manage common resources between various hints managers. In-flight hints used to be one of the shared resources, but it proves to cause starvation, when one manager eats the whole limit - which may be especially painful if the background materialized views hints manager starves the regular hints manager, which can in turn start failing user writes because of admission control. This patch makes the limit per-manager again, which effectively reverts the limit to its original behavior. Fixes #4483 Message-Id: <8498768e8bccbfa238e6a021f51ec0fa0bf3f7f9.1559649491.git.sarna@scylladb.com>	2019-07-12 19:21:26 +03:00
Kamil Braun	60a4867a5b	Fix infinite looping when performing a range query on system.size_estimates. Queries to system.size_estimates table which are not single parition queries caused Scylla to go into an infinite loop inside multishard_combining_reader::fill_buffer. This happened because multishard_combinind_reader assumes that shards return rows belonging to separate partitions, which was not the case for size_estimates_mutation_reader. This commit fixes the issue and closes #4689.	2019-07-12 18:09:15 +02:00
Kamil Braun	ba5a02169e	Fix segmentation fault when querying system.size_estimates for an empty keyspace.	2019-07-12 18:02:10 +02:00
Kamil Braun	a1665b74a9	Refactor size_estimates_virtual_reader Move the implementation of size_estimates_mutation_reader to a separate compilation unit to speed up compilation times and increase readability. Refactor tests to use seastar::thread.	2019-07-12 17:53:00 +02:00
Calle Wilund	1f5e1d22bf	db::config: Add enable ipv6 switch (default off) Off by default to prevent problems during cluster migration when needing to gossip with non-ipv6 aware nodes.	2019-07-08 14:13:09 +00:00
Calle Wilund	4ef940169f	Replace use of "ipv4_addr" with socket_address Allows the various sockets to use ipv6 address binding if so configured.	2019-07-08 14:13:09 +00:00
Calle Wilund	5fd811ec8a	gms::inet_address: Change inet_address to wrap actual seastar::net::inet_address Thusly handle all types net::inet_address can handle. I.e. ipv6.	2019-07-08 14:13:09 +00:00
Calle Wilund	f317d7a975	commitlog: Simplify commitlog extension iteration Fixes #4640 Iterating extensions in commitlog.cc should mimic that in sstables.cc, i.e. a simple future-chain. Should also use same order for read and write open, as we should preserve transformation stack order. Message-Id: <20190702150028.18042-1-calle@scylladb.com>	2019-07-02 18:37:44 +03:00
Tomasz Grabiec	eb496b5eae	Merge "Allow changing configuration at runtime" from Avi This patchset allows changing the configuration at runtime, The user triggers this by editing the configuration file normally, then signalling the database with SIGHUP (as is traditional). The implementation is somewhat complicated due the need to store non-atomic mutable state per-shard and to synchronize the values in all shards. This is somewhat similar to Seastar's sharded<>, but that cannot be used since the configuration is read before Seastar is initialized (due to the need to read command-line options). Tests: unit (dev, debug), manual test with extra prints (dev) Ref #2689 Fixes #2517.	2019-07-01 15:04:59 +02:00
Avi Kivity	2abe015150	database: allow live update of the compaction_enforce_min_threshold config item Change the type from bool to updateable_value<bool> throughout the dependency chain and mark it as live updateable. In theory we should also observe the value and trigger compaction if it changes, but I don't think it is worthwhile.	2019-06-28 16:43:25 +03:00
Avi Kivity	8d7c1c7231	db: seed_provider_type: add operator==() Dynamically updateable configuration requires checking whether configuration items changed or not, so we can skip firing notifiers for the common case where nothing changed. This patch adds a comparison operator for seed_provider_type, which was missing it.	2019-06-28 16:43:25 +03:00

1 2 3 4 5 ...

1429 Commits