Commit 976324bbb8 changed the code to use
get_application_state_ptr to get a pointer to the application_state.
It may return nullptr, which is then dereferenced unconditionally.
In resharding_test.py:ReshardingTest_nodes4_with_SizeTieredCompactionStrategy.resharding_by_smp_increase_test, we saw:
- 4 nodes in the test
- n1, n2, n3, n4 are started
- n1 is stopped
- n1 is changed to use a different shard config
- n1 is restarted (2019-01-27 04:56:00,377)
The backtrace happened on n2 right after n1 restarted:
INFO 2019-01-27 04:56:05,175 [shard 0] gossip - Feature STREAM_WITH_RPC_STREAM is enabled
INFO 2019-01-27 04:56:05,175 [shard 0] gossip - Feature WRITE_FAILURE_REPLY is enabled
INFO 2019-01-27 04:56:05,175 [shard 0] gossip - Feature XXHASH is enabled
WARN 2019-01-27 04:56:05,177 [shard 0] gossip - Fail to send EchoMessage to 127.0.58.1: seastar::rpc::closed_error (connection is closed)
INFO 2019-01-27 04:56:05,205 [shard 0] gossip - InetAddress 127.0.58.1 is now UP, status =
Segmentation fault on shard 0.
Backtrace:
0x00000000041c0782
0x00000000040d9a8c
0x00000000040d9d35
0x00000000040d9d83
/lib64/libpthread.so.0+0x00000000000121af
0x0000000001a8ac0e
0x00000000040ba39e
0x00000000040ba561
0x000000000418c247
0x0000000004265437
0x000000000054766e
/lib64/libc.so.6+0x0000000000020f29
0x00000000005b17d9
We do not know exactly when this backtrace happened, but according to the logs from n3 and n4:
INFO 2019-01-27 04:56:22,154 [shard 0] gossip - InetAddress 127.0.58.2 is now DOWN, status = NORMAL
INFO 2019-01-27 04:56:21,594 [shard 0] gossip - InetAddress 127.0.58.2 is now DOWN, status = NORMAL
We can be sure the backtrace on n2 happened before 04:56:21 minus 19
seconds (the delay before gossip notices that a peer is down), so the
abort time is around 04:56:0X.
The migration_manager::maybe_schedule_schema_pull call that triggers the
backtrace must have been scheduled before n1 was restarted, because it
dereferences the application_state pointer after sleeping for 60
seconds; so maybe_schedule_schema_pull was called around 04:55:0X, which
is before n1 was restarted.
So my theory is: migration_manager::maybe_schedule_schema_pull is
scheduled while n1 still has the SCHEMA application_state. When n1
restarts, n2 gets a new application state from n1 which does not have
SCHEMA yet. When migration_manager::maybe_schedule_schema_pull wakes up
from the 60-second sleep, n1 has a non-empty endpoint_state but an empty
application_state for SCHEMA. We dereference the nullptr
application_state and abort.
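A minimal, self-contained sketch of the shape of the fix, using
simplified stand-ins rather than Scylla's real gossiper types:

    #include <map>
    #include <string>

    // Simplified stand-ins for the gossiper state (illustrative only).
    struct versioned_value { std::string value; };
    using application_state_map = std::map<std::string, versioned_value>;

    // Pointer-returning lookup, like get_application_state_ptr(): may be null.
    const versioned_value* get_application_state_ptr(const application_state_map& m,
                                                     const std::string& key) {
        auto it = m.find(key);
        return it == m.end() ? nullptr : &it->second;
    }

    void maybe_schedule_schema_pull(const application_state_map& m) {
        const versioned_value* v = get_application_state_ptr(m, "SCHEMA");
        if (!v) {
            // The endpoint restarted and has not gossiped SCHEMA yet;
            // bail out instead of dereferencing a nullptr.
            return;
        }
        // safe to use v->value from here on
    }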
Fixes: #4148
Tests: resharding_test.py:ReshardingTest_nodes4_with_SizeTieredCompactionStrategy.resharding_by_smp_increase_test
Message-Id: <9ef33277483ae193a49c5f441486ee6e045d766b.1548896554.git.asias@scylladb.com>
This commit declares shared_ptr<user_types_metadata> in
database.hh, where user_types_metadata is an incomplete type, so
it requires
"Allow to use shared_ptr with incomplete type other than sstable"
to compile correctly.
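Roughly the shape of the declaration in question (the enclosing class
and member name are illustrative):

    // database.hh (sketch): only a forward declaration is visible here,
    // so shared_ptr must accept an incomplete element type.
    class user_types_metadata;

    class keyspace_metadata {
        shared_ptr<user_types_metadata> _user_types;
        // ...
    };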
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
This header, which is easily replaced with a forward declaration,
introduces a dependency on database.hh everywhere. Remove it and scatter
includes of database.hh in source files that really need it.
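The mechanics of the change, sketched (the function is illustrative):

    // Before: every includer of this header transitively pulled in database.hh.
    #include "database.hh"
    void apply_schema_change(database& db);

    // After: a forward declaration is enough for a reference parameter;
    // only the .cc files that actually touch database's members include it.
    class database;
    void apply_schema_change(database& db);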
Different nodes can concurrently create the distributed system
keyspace on boot, before the "if not exists" clause can take effect.
However, the resulting schema mutations will be different since
different nodes use different timestamps. This patch forces the
timestamps to be the same across all nodes, so we avoid some schema
mismatches.
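A sketch of the idea, with hypothetical names; every node derives the
mutations' timestamp from the same constant instead of its local clock:

    #include <cstdint>

    // Shared by all nodes, so concurrent boots generate identical
    // mutations. The value is chosen arbitrarily for this sketch.
    constexpr int64_t dist_system_ks_creation_timestamp = 0;

    int64_t timestamp_for_initial_dist_system_schema() {
        // before: something like api::new_timestamp(), which differs per node
        return dist_system_ks_creation_timestamp;
    }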
This fixes a bug exposed by ca5dfdf, whereby the initialization of the
distributed system keyspace is done before waiting for schema
agreement. While waiting for schema agreement in
storage_service::join_token_ring(), the node still hasn't joined the
ring and schemas can't be pulled from it, so nodes can deadlock. A
similar situation can happen between a seed node and a non-seed node,
where the seed node progresses to a different "wait for schema
agreement" barrier, but still can't make progress because it can't
pull the schema from the non-seed node still trying to join the ring.
Finally, it is assumed that changes to the schema of the current
distributed system keyspace tables will be protected by a cluster
feature and a subsequent schema synchronization, such that all nodes
will be at a point where schemas can be transferred around.
Fixes #3976
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20181211113407.20075-1-duarte@scylladb.com>
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().
Mechanically converted with https://github.com/avikivity/unsprint.
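A typical before/after (variable names illustrative):

    auto before = sprint("%s: %d rows", name, count); // printf-style; now throws
                                                      // on mismatched arguments
    auto after  = format("{}: {} rows", name, count); // fmt-style replacement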
We allow tables to be created with the ID property, mostly for
advanced recovery cases. However, we need to validate that the ID
doesn't match an existing one. We currently do this in
database::add_column_family(), but this is already too late in the
normal workflow: if we allow the schema change to go through, then
it is applied to the system tables and loaded the next time the node
boots, regardless of us throwing from database::add_column_family().
To fix this, we perform this validation when announcing a new table.
Note that the check wasn't removed from database::add_column_family();
it has been there since 2015 and there might have been other reasons to
add it that are unrelated to the ID property.
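A sketch of the announce-time check (the lookup and function names are
assumptions, not the exact API):

    // Runs when announcing a new table, before the change is committed
    // to the system tables and becomes visible on reboot.
    void validate_new_table_id(const database& db, const schema& new_table) {
        if (db.column_family_exists(new_table.id())) { // hypothetical lookup
            throw exceptions::invalid_request_exception(
                format("A table with ID {} already exists", new_table.id()));
        }
    }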
Refs #2059
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20181001230142.46743-1-duarte@scylladb.com>
Previously, dropping a table with secondary indexes failed, because
secondary indexes are internally backed by materialized views.
This commit drops the dependent secondary indexes before
dropping the table.
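The ordering, sketched as if running inside a seastar::thread for
clarity (the names here are illustrative, not the real API):

    void drop_table_with_indexes(sstring ks_name, sstring cf_name) {
        auto schema = local_db().find_schema(ks_name, cf_name);
        for (const index_metadata& im : schema->indices()) {
            // Drop the materialized view backing each secondary index first...
            announce_view_drop(ks_name, index_backing_view_name(im)).get();
        }
        // ...then the table itself.
        announce_column_family_drop(ks_name, cf_name).get();
    }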
Fixes #3202
At some points while bootstrapping [1], new non-seed Scylla nodes wait
for schema agreement among all known endpoints in the cluster.
The check for schema agreement was in
`service::migration_manager::is_ready_for_bootstrap`. This function
would return `true` if, at the time of its invocation, the node was
aware of at least one `UP` peer (not itself) and all `UP` peers had
the same schema version as the node.
We wish to re-use this check in the `auth` sub-system to ensure that
the schema for the internal system tables used for access control has
propagated to the entire cluster.
Unlike in `service/storage_service.cc`, where `is_ready_for_bootstrap`
was only invoked for seed nodes, we wish to wait for schema agreement
for all nodes regardless of whether or not they are seeds.
For a single-node cluster with itself as a seed,
`is_ready_for_bootstrap` would always return `false`.
We therefore change the conditions for schema agreement. Schema
agreement is now reached when there are no known peers (so the endpoint
map of the gossiper consists only of ourselves), or when there is at
least one `UP` peer and all `UP` peers have the same schema version as
us.
This change should not impact any bootstrap behavior in
`storage_service` because seed nodes do not invoke the function and
non-seed nodes wait for peer visibility before checking for schema
agreement.
Since this function is no longer checking for schema agreement only in
the context of bootstrapping non-seed nodes, we rename it to reflect its
generality.
[1] http://thelastpickle.com/blog/2017/05/23/auto-bootstrapping-part1.html
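The new condition, as a self-contained sketch with simplified types:

    #include <map>
    #include <string>

    struct peer_info {
        bool up;
        std::string schema_version;
    };

    // 'peers' excludes ourselves; 'our_version' is our own schema version.
    bool have_schema_agreement(const std::map<std::string, peer_info>& peers,
                               const std::string& our_version) {
        if (peers.empty()) {
            return true; // no known peers: a single-node cluster trivially agrees
        }
        bool any_up = false;
        for (const auto& [addr, p] : peers) {
            if (!p.up) {
                continue; // DOWN peers do not block agreement
            }
            any_up = true;
            if (p.schema_version != our_version) {
                return false;
            }
        }
        return any_up; // otherwise we need at least one UP peer
    }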
We were taking a reference to a temporary value in several places.
Fix them by using get_application_state_ptr(), which also avoids a copy.
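The bug pattern, distilled into generic C++:

    #include <string>

    struct endpoint_state {
        std::string schema;
        const std::string& get_schema() const { return schema; }
    };

    endpoint_state get_endpoint_state(); // returns a copy

    void example() {
        // BUG: binds a reference into a temporary; lifetime extension does
        // not apply through a function call returning a reference, so the
        // temporary dies at the end of the full expression.
        const std::string& schema = get_endpoint_state().get_schema();
        // FIX (in spirit): take a stable pointer into the gossiper's own
        // map via get_application_state_ptr() instead.
        (void)schema;
    }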
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
gossiper::get_endpoint_state_for_endpoint() returns a copy of
endpoint_state, which we've seen can be very expensive.
This patch adds a similar function which returns a pointer instead,
and changes the call sites where using the pointer-returning variant
is deemed safe (the pointer neither escapes the function, nor crosses
any defer point).
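Sketched signatures of the two variants (simplified; not the exact
declarations):

    // Returns a copy: safe to hold anywhere, but expensive.
    endpoint_state get_endpoint_state_for_endpoint(inet_address ep) const;

    // Returns a pointer into the gossiper's map: no copy, but only valid
    // while the caller neither escapes it nor crosses a defer point.
    const endpoint_state* get_endpoint_state_for_endpoint_ptr(inet_address ep) const;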
Fixes #764
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
We don't pull schema during a rolling upgrade, that is, until the
schema_tables_v3 feature is enabled on all nodes.
Because features are enabled from the gossiper timer, there is a race
between feature enablement and the processing of endpoint states which
may trigger a schema pull. It can happen that we first try to pull, and
only later enable the feature. In that case the schema pull will not
happen until the next schema change.
The fix is to ensure that pulls abandoned because the feature was not
yet enabled are retried when it is enabled.
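A sketch of the shape of the fix; the listener API and bookkeeping names
here are assumptions, not the real interfaces:

    // When a pull is abandoned because the feature is still off, remember it:
    if (!_features.cluster_supports_schema_tables_v3()) {
        _pending_pulls.insert(endpoint); // hypothetical bookkeeping
        return;
    }

    // And once, at startup, arrange for the retry when the feature turns on:
    _features.schema_tables_v3_enabled_callback = [this] {
        for (const auto& ep : _pending_pulls) {
            maybe_schedule_schema_pull(ep); // retry the abandoned pulls
        }
        _pending_pulls.clear();
    };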
Fixes sporadic failure in dtest:
repair_additional_test.py:RepairAdditionalTest.repair_schema_test
Message-Id: <1506428715-8182-2-git-send-email-tgrabiec@scylladb.com>
If there is a schema pull during a rolling upgrade between two 2.0
nodes, then schema merge will delete the persisted schema version. When
the node loads that table again, e.g. on restart, it will generate a
version different from the one which 1.7 nodes use. This will cause
reads and writes to fail.
To avoid this, disable pulls until all nodes are upgraded.
Fixes#2802.
If schema merging completes at a lower rate than pull requests arrive,
then merge processes will accumulate and needlessly request and hold schema mutations.
In rare cases, when there are constant schema changes, they may even
overflow memory. This was seen in dtest:
concurrent_schema_changes_test.py:TestConcurrentSchemaChanges.create_lots_of_schema_churn_test
Allowing only one active and one queued pull request per remote
endpoint is enough.
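A sketch of the coalescing, using a per-endpoint seastar semaphore of
one unit (the bookkeeping names are illustrative):

    future<> submit_pull(inet_address ep) {
        auto& sem = pull_semaphore_for(ep); // hypothetical: semaphore(1) per endpoint
        if (sem.waiters() >= 1) {
            // One merge is running and one is already queued; the queued
            // one will pick up any newer schema, so this request coalesces.
            return make_ready_future<>();
        }
        return with_semaphore(sem, 1, [ep] {
            return do_merge_schema_from(ep);
        });
    }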
Some places remained where code looked directly at
system_keyspace::NAME to determine if a keyspace is
considered special/system/protected, including the
schema digest calculation.
Export "is_system_keyspace" and use it accordingly.
Message-Id: <1500469809-23546-1-git-send-email-calle@scylladb.com>
The old nodes which are still using v2 schema tables will fail to
apply our response, with error messages complaining about not being
able to locate schema of certain versions (new schema tables). This
change inhibits such errors by responding with an empty mutation list.
Currently it results in scary error messages in the logs about not
being able to find the schema of a given version. It's benign, but may
scare users. In the future, incompatibilities could result in more
subtle errors. Better to inhibit them completely.
When making the schema mutations for a view update, we should only
include the base table schema mutations (in case the target node
doesn't contain them) when the view is being directly updated. When it
is being updated as a side effect of updating the base table, then
including the base schema mutations will hide the actual changes being
performed on the base.
Fixes #2500
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1497782822-2711-1-git-send-email-duarte@scylladb.com>
- introduced the "seastarx.hh" header, which does a "using namespace seastar";
- the 'net' namespace conflicts with seastar::net; renamed to 'netw'.
- the 'transport' namespace conflicts with seastar::transport; renamed to
  cql_transport.
- the "logger" global variables now conflict with the logger global type;
  renamed to xlogger.
- other minor changes
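Per the description above, the header is essentially:

    // seastarx.hh (sketch)
    #pragma once

    using namespace seastar;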
This patch changes the migration path for table updates such that the
base table mutations are sent and applied atomically with the view
schema mutations.
This ensures that after schema merging, we have a consistent mapping
of base table versions to view table versions, which will be used in
later patches.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
There is a workaround for a notification race, which attaches keyspace
mutations to other schema changes in case the target node missed the
keyspace creation. Currently that workaround generates keyspace
mutations on the spot instead of using the ones stored in the schema
tables. Those mutations would have the current timestamp, as if the
keyspace had just been modified. This is problematic because it may
overwrite keyspace parameters with a newer timestamp but stale values,
if the node is not up to date with the keyspace metadata.
That's especially the case when booting up a node without enabling
auto_bootstrap. In such a case the node will not wait for schema sync
before creating the auth tables. Such table creation will attach
potentially out-of-date mutations for the keyspace metadata, which may
overwrite changes made to the keyspace parameters earlier in the
cluster.
Refs #2129.
This patch forbids dropping a column family if there are still views
associated with it, and also forbids dropping a view through the drop
table statement.
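A sketch of the two validations (names are illustrative):

    void validate_column_family_drop(const database& db, const schema& s) {
        if (s.is_view()) {
            throw exceptions::invalid_request_exception(
                "Cannot use DROP TABLE on a materialized view");
        }
        if (!db.find_column_family(s.id()).views().empty()) { // hypothetical lookup
            throw exceptions::invalid_request_exception(
                "Cannot drop a table when materialized views still depend on it");
        }
    }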
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
There are places in which we need to use the column family object many
times, with deferring points in between. Because the column family may
have been destroyed in the deferring point, we need to go and find it
again.
If we use lw_shared_ptr, however, we'll be able to at least guarantee
that the object will be alive. Some users will still need to check, if
they want to guarantee that the column family wasn't removed. But others
that only need to make sure we don't access an invalid object will be
able to avoid the cost of re-finding it just fine.
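The pattern, sketched (the accessor returning lw_shared_ptr is
hypothetical):

    future<> flush_and_compact(database& db, sstring ks, sstring cf) {
        lw_shared_ptr<column_family> cfp = db.find_column_family_ptr(ks, cf); // hypothetical
        return cfp->flush().then([cfp] {
            // 'cfp' keeps the object alive across the deferring point.
            // Callers that must also know the table wasn't dropped still
            // need an explicit check; this only guarantees a valid object.
            return cfp->compact_all_sstables();
        });
    }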
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <722bf49e158da77ff509372c2034e5707706e5bf.1478111467.git.glauber@scylladb.com>
"Writes may start to be rejected by replicas after issuing alter table
which doesn't affect columns. This affects all versions with alter table
support.
Fixes#1258"
The problem was that "s" would not be marked as synced if it came from
shard != 0.
As a result, mutation using that schema would fail to apply with an exception:
"attempted to mutate using not synced schema of ..."
The problem could surface when altering schema without changing
columns and restarting one of the nodes so that it forgets past
versions.
Fixes#1258.
Will be covered by dtest:
SchemaManagementTest.test_prepared_statements_work_after_node_restart_after_altering_schema_without_changing_columns
Commit d3fe0c5 ("Refactor db/keyspace/column_family toplogy") changed
database::find_keyspace() to throw a std::nested_exception so the catch
block in migration_manager::announce_keyspace_drop() no longer catches
the exception. Fix the issue by explicitly checking if the keyspace
exists and throwing the correct exception type if it doesn't.
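The shape of the fix (sketch; simplified signature):

    void announce_keyspace_drop(database& db, const sstring& ks_name) {
        if (!db.has_keyspace(ks_name)) {
            throw exceptions::configuration_exception(
                format("Cannot drop non existing keyspace '{}'.", ks_name));
        }
        // ... proceed with announcing the drop ...
    }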
Fixes TestCQL.keyspace_test.
Message-Id: <1461218910-26691-1-git-send-email-penberg@scylladb.com>
This patch defines the member functions responsible for announcing the
create, update and drop user-defined-type migrations.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
The verb is just for reporting and debugging purposes, but it is better
not to register it until it can return a meaningful value. Besides, it
really belongs to the migration manager subsystem anyway.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1458037053-14836-1-git-send-email-pdziepak@scylladb.com>
At the moment the callbacks return void, so it is impossible to wait
for the callbacks to complete. Make the callbacks run inside a seastar
thread, so that if we need to wait for an operation in a callback, we
can call foo_operation().get() in the callback. This is easier than
making the callbacks return future<>.
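The mechanism, sketched:

    // Run the listener callbacks inside a seastar thread so each one may
    // block on futures with .get(). The caller must keep 'listeners'
    // alive until the returned future resolves.
    future<> notify_listeners(std::vector<std::function<void()>>& listeners) {
        return seastar::async([&listeners] {
            for (auto& cb : listeners) {
                cb(); // may internally call foo_operation().get()
            }
        });
    }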
Defer registering the migration manager RPC verbs until after the
commitlog has been replayed, so that our own schema is fully loaded
before other nodes start querying it or sending schema updates.
Message-Id: <1457971028-7325-1-git-send-email-penberg@scylladb.com>
Replicates https://issues.apache.org/jira/browse/CASSANDRA-7910 :
"Prepare a statement with a wildcard in the select clause.
2. Alter the table - add a column
3. execute the prepared statement
Expected result - get all the columns including the new column
Actual result - get the columns except the new column"
Currently the notify_*() method family broadcasts to all shards, so the
schema merging code invokes them only on shard 0, to avoid duplicate
notifications. We can simplify this by making the notify_*() methods
per-instance and thus shard-local.