scylladb

Author	SHA1	Message	Date
Piotr Jastrzebski	ec3d59bf13	Add flag to configure max size of a cached partition. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> (cherry picked from commit `636a4acfd0`)	2016-07-27 14:09:34 +03:00
Tomasz Grabiec	35c1781913	schema_tables: Fix hang during keyspace drop Fixes #1484. We drop tables as part of keyspace drop. Table drop starts with creating a snapshot on all shards. All shards must use the same snapshot timestamp which, among other things, is part of the snapshot name. The timestamp is generated using supplied timestamp generating function (joinpoint object). The joinpoint object will wait for all shards to arrive and then generate and return the timestamp. However, we drop tables in parallel, using the same joinpoint instance. So joinpoint may be contacted by snapshotting shards of tables A and B concurrently, generating timestamp t1 for some shards of table A and some shards of table B. Later the remaining shards of table A will get a different timestamp. As a result, different shards may use different snapshot names for the same table. The snapshot creation will never complete because the sealing fiber waits for all shards to signal it, on the same name. The fix is to give each table a separate joinpoint instance. Message-Id: <1469117228-17879-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `5e8f0efc85`)	2016-07-22 15:36:45 +02:00
Tomasz Grabiec	9c430c2cff	schema_tables: Add more logging Message-Id: <1468917771-2592-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `a0832f08d2`)	2016-07-20 10:13:28 +03:00
Duarte Nunes	aacc7193f2	schema: Replace keyspace's schema_ptr on CF update This patch ensures we replace the schema_ptr held by its respective keyspace object when a column family is being updated. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20160623085710.26168-1-duarte@scylladb.com>	2016-06-23 11:11:52 +02:00
Calle Wilund	8cdf4e37fb	schema_tables: Fix merge_keyspaces to handle alter keyspace Must keep "altered" alive into the call chain.	2016-05-10 14:32:51 +00:00
Duarte Nunes	809b45e160	udt: Add drop type statement Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-04-20 18:07:02 +02:00
Duarte Nunes	d1f215b743	udt: Merge user defined type mutations This patch implements the merge_types() function, allowing mutations to user defined types to be applied. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-04-20 09:54:06 +02:00
Duarte Nunes	d6d29f7c52	schema: Replace ad hoc func with indirect_equal_to Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-04-20 09:54:06 +02:00
Duarte Nunes	dd75fe8ec0	udt: Add mutations for user defined types This patch implements mutations for user defined types. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-04-20 09:54:06 +02:00
Duarte Nunes	c7b3a4b144	udt: Parse user types system table This patch loads and parses the user types system table during bootstrap. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-04-20 09:54:06 +02:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Avi Kivity	a919113fdb	schema_tables: fix deadlock in cross-node communications Seastar wrongly limits the number of concurrent submit_to()s to a single remote shard. This can cause an ABBA deadlock: fiberA fiberB (x127) submit_to(0) # lock schema <- returns submit_to(0) # lock schema (waits) submit_to(0) # do work (waits) The fiberBs wait for fiberA, which in turn waits for a fiberB to return. While the correct fix is to remove the client-side limit and replace it with a server-side per-verb limit, we start with a simpler fix that replaces the blocking lock call with a non-blocking call, removing the deadlock. Fixes #1088. Message-Id: <1459095357-28950-1-git-send-email-avi@scylladb.com>	2016-03-28 10:12:10 +03:00
Tomasz Grabiec	53bbcf4a1e	schema_tables: Wait for notifications to be processed. Listeners may defer since: `93015bcc54` "migration_manager: Make the migration callbacks runs inside seastar thread" Not all places were adjusted to wait for them. Fix that. Message-Id: <1458837613-27616-1-git-send-email-tgrabiec@scylladb.com>	2016-03-24 19:04:12 +02:00
Asias He	93015bcc54	migration_manager: Make the migration callbacks runs inside seastar thread At the moment, the callbacks return void, so it is impossible to wait for the callbacks to complete. Make the callbacks run inside a seastar thread, so if we need to wait for the callback, we can make it call foo_operation().get() in the callback. It is easier than making the callbacks return future<>.	2016-03-15 15:41:23 +08:00
Glauber Costa	a339296385	database: turn sstable generation number into an optional This patch makes sure that every time we need to create a new generation number - the very first step in the creation of a new SSTable, the respective CF is already initialized and populated. Failure to do so can lead to data being overwritten. Extensive details about why this is important can be found in Scylla's Github Issue #1014 Nothing should be writing to SSTables before we have the chance to populate the existing SSTables and calculate what should the next generation number be. However, if that happens, we want to protect against it in a way that does not involve overwriting existing tables. This is one of the ways to do it: every column family starts in an unwriteable state, and when it can finally be written to, we mark it as writeable. Note that this cannot be a part of add_column_family. That adds a column family to a db in memory only, and if anybody is about to write to a CF, that was most likely already called. We need to call this explicitly when we are sure we're ready to issue disk operations safely. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-10 21:06:05 -05:00
Tomasz Grabiec	04f2482d74	schema_tables: Log results of schema merge Currently schema changes are only logged at the coordinator node which initiates the change. It would be helpful in post-mortem analysis to also see when and how schema changes are resolved when applied on other nodes. Message-Id: <1456953095-1982-1-git-send-email-tgrabiec@scylladb.com>	2016-03-03 11:12:15 +02:00
Calle Wilund	590ec1674b	truncate: Require timestamp join-function to ensure equal values Fixes #937 In fixing #884, truncation not truncating memtables properly, time stamping in truncate was made shard-local. This however breaks the snapshot logic, since for all shards in a truncate, the sstables should snapshot to the same location. This patch adds a required function argument to truncate (and by extension drop_column_family) that produces a time stamp in a "join" fashion (i.e. same on all shards), and utilizes the joinpoint type in caller to do so. Message-Id: <1456332856-23395-2-git-send-email-calle@scylladb.com>	2016-02-24 18:59:31 +02:00
Calle Wilund	18203a4244	database::truncate/drop: Move time stamp generation to shard Fixes #884 Time stamps for truncation must be generated after flush, either by splitting the truncate into two (or more) for-each-shard operations, or simply by doing time stamping per shard (this solution). We generate TS on each shard after flushing, and then rely on the actual stored value to be the highest time point generated. This should however, from batch replay point of view, be functionally equivalent. And not a problem.	2016-02-09 15:45:37 +00:00
Gleb Natapov	63a5aa6122	prevent superfluous frozen_mutation copying Sometimes frozen_mutation is copied while it can be moved instead. Fix those cases. Message-Id: <20160204165708.GI6705@scylladb.com>	2016-02-07 10:54:16 +02:00
Paweł Dziepak	4927ff95da	schema: read collections from comparator Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-18 08:35:33 +01:00
Tomasz Grabiec	e62857da48	schema_tables: Wait for make_directory_for_column_family() to finish in merge_tables()	2016-01-11 10:34:55 +01:00
Tomasz Grabiec	71bbbceced	schema_tables: Notify about table creation after it is fully inited I'm not aware of any issues it could cause, but it makes more sense that way.	2016-01-11 10:34:55 +01:00
Tomasz Grabiec	8deb3f18d3	query_processor: Invalidate prepared statements when columns change Replicates https://issues.apache.org/jira/browse/CASSANDRA-7910 : "Prepare a statement with a wildcard in the select clause. 2. Alter the table - add a column 3. execute the prepared statement Expected result - get all the columns including the new column Actual result - get the columns except the new column"	2016-01-11 10:34:55 +01:00
Tomasz Grabiec	d80ffc580f	schema_tables: Notify about table schema update	2016-01-11 10:34:54 +01:00
Tomasz Grabiec	8817e9613d	migration_manager: Simplify notifications Currently the notify_() method family broadcasts to all shards, so schema merging code invokes them only on shard 0, to avoid doubling notifications. We can simplify this by making the notify_() methods per-instance and thus shard-local.	2016-01-11 10:34:54 +01:00
Paweł Dziepak	f24f677dde	db/schema_tables: simplify column difference computation Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-11 10:34:54 +01:00
Paweł Dziepak	ae3acd0f9c	system_tables: store schema::dropped_columns in system tables Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-11 10:34:54 +01:00
Tomasz Grabiec	d8ff9ee441	schema_tables: Make merge_tables() compare by mutations Schema version is calculated from mutations, so merge_schema should also look at mutation changes to detect schema changes whenever version changes.	2016-01-11 10:34:53 +01:00
Tomasz Grabiec	5707c5e7ca	schema_tables: Simplify merge_tables() and merge_keyspaces() read_schema_for_keyspaces() drops empty results so the emptiness checks are always false and we can remove some redundancy.	2016-01-11 10:34:53 +01:00
Tomasz Grabiec	bfefe5a546	schema_tables: Calculate digest from mutations We want the node's schema version to change whenever table_schema_version of any table changes. The latter is calculated by hashing mutations so we should also use mutation hash when calculating schema digest.	2016-01-11 10:34:53 +01:00
Tomasz Grabiec	b91c92401f	migration_manager: Implement migration_manager::announce_column_family_update	2016-01-11 10:34:53 +01:00
Tomasz Grabiec	8164902c84	schema_tables: Change column_family schema on schema sync Notifications are not implemented yet.	2016-01-11 10:34:52 +01:00
Tomasz Grabiec	4e5a52d6fa	db: Make read interface schema version aware The intent is to make data returned by queries always conform to a single schema version, which is requested by the client. For CQL queries, for example, we want to use the same schema which was used to compile the query. The other node expects to receive data conforming to the requested schema. Interface on shard level accepts schema_ptr, across nodes we use table_schema_version UUID. To transfer schema_ptr across shards, we use global_schema_ptr. Because schema is identified with UUID across nodes, requestors must be prepared for being queried for the definition of the schema. They must hold a live schema_ptr around the request. This guarantees that schema_registry will always know about the requested version. This is not an issue because for queries the requestor needs to hold on to the schema anyway to be able to interpret the results. But care must be taken to always use the same schema version for making the request and parsing the results. Schema requesting across nodes is currently stubbed (throws runtime exception).	2016-01-11 10:34:52 +01:00
Tomasz Grabiec	04eb58159a	query: Add schema_version field to read_command	2016-01-11 10:34:51 +01:00
Tomasz Grabiec	f58c2dec1e	schema: Make schema objects versioned The version needs to change value not only on structural changes but also temporal. This is needed for nodes to detect if the version they see was already synchronized with or not even if it has the same structure as the past versions. We also need to end up with the same version on all nodes when schema changes are commuted. For regular mutable schemas version will be calculated from underlying mutations when schema is announced. For static schemas of system keyspace it is calculated by hashing scylla version and column id, because we don't have mutations at the time of building the schema.	2016-01-08 21:10:26 +01:00
Tomasz Grabiec	fdb9e01eb4	schema_tables: Use schema_mutations for schema_ptr translations We will be able to reuse the code in frozen_schema. We need to read data in mutation form so that we can construct the correct schema_table_version, and attach the mutations to schema_ptr.	2016-01-08 21:10:26 +01:00
Tomasz Grabiec	d07e32bc32	schema_tables: Simplify schema building invocation chain	2016-01-08 21:10:26 +01:00
Tomasz Grabiec	3c3ea20640	schema_tables: Drop pkey parameter from add_table_to_schema_mutation() It simplifies add_table_to_schema_mutation() interface. The current code is also a bit confusing, partition_key is created with the keyspaces() schema and used in mutations destined for the columnfamilies() schema. It works, the types are the same, but looks a bit scary.	2016-01-08 21:10:26 +01:00
Pekka Enberg	e56bf8933f	Improve not implemented errors Print out the function name where we're throwing the exception from to make it easier to debug such exceptions.	2015-12-18 10:51:37 +01:00
Tomasz Grabiec	bc23ebcbc3	schema_tables: Replace schema_result::value_type with equivalent movable type future<> requires and will assert nothrow move constructible types.	2015-12-07 09:50:27 +01:00
Tomasz Grabiec	8d88ece896	schema_tables: Fix "comment" property not being loaded from storage	2015-11-30 10:57:36 +02:00
Avi Kivity	2c3591cbd9	data_value de-any-fication We use boost::any to convert to and from database values (stored in serialized form) and native C++ values. boost::any captures information about the data type (how to copy/move/delete etc.) and stores it inside the boost::any instance. We later retrieve the real value using boost::any_cast. However, data_value (which has a boost::any member) already has type information as a data_type instance. By teaching data_type instances about the corresponding native type, we can eliminate the use of boost::any. While boost::any is evil and eliminating it improves efficiency somewhat, the real goal is growing native type support in data_type. We will use that later to store native types in the cache, enabling O(log n) access to collections, O(1) access to tuples, and more efficient large blob support.	2015-10-30 17:38:51 +01:00
Raphael S. Carvalho	6bea503f9a	db: fallback to sizetiered if compaction strategy isn't supported It may happen that the user will migrate a table to Scylla whose compaction strategy isn't supported yet, such as Date tiered. Let's handle that by falling back to size-tiered compaction strategy and printing a warning message. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-10-29 09:33:28 +02:00
Raphael S. Carvalho	a21af32eed	db: do not ignore compaction strategy class When building the in-memory schema for a column family, we were ignoring compaction strategy class because of a bug in the existing code. Example: suppose that you create a column family with leveled compaction strategy. This option would be ignored and the default strategy (size-tiered) would be used instead. Found this problem while working on leveled compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-10-18 11:06:37 +03:00
Glauber Costa	e99e418238	schema_tables: make sure CF directory exists upon creation In Cassandra, when you create a new column family, a directory for it immediately appears under the KS directory. In the past, we have made a decision to delay that creation until the first SSTable is created, which works well in general. There is a problem, however, for backup restoration: the standard procedure to call loadNewSSTables is to do that in an empty directory. But the directory simply won't be there until we create the first SSTable: bummer! In the current incarnation of the code in schema_tables.cc, there is already some code that runs on CPU0 only. That is a perfect place for the directory creation. So let's do it. After this patch, a directory for the CF appears right after the CF creation. Signed-off-by: Glauber Costa <glommer@scylladb.com>	2015-10-17 13:08:07 +02:00
Glauber Costa	b2fef14ada	do not calculate truncation time independently Currently, we are calculating truncated_at during truncate() independently for each shard. It will work if we're lucky, but it is fairly easy to trigger cases in which each shard will end up with a slightly different time. The main problem here, is that this time is used as the snapshot name when auto snapshots are enabled. Previous to my last fixes, this would just generate two separate directories in this case, which is wrong but not severe. But after the fix, this means that both shards will wait for one another to synchronize and this will hang the database. Fix this by making sure that the truncation time is calculated before invoke_on_all in all needed places. Signed-off-by: Glauber Costa <glommer@scylladb.com>	2015-10-09 17:17:11 +03:00
Pekka Enberg	95012793e5	db/schema_tables: Wire up drop keyspace notifications Signed-off-by: Pekka Enberg <penberg@scylladb.com>	2015-10-08 13:10:48 +02:00
Pekka Enberg	5878f62b18	db/schema_tables: Clean up indentation Almost the whole file is (accidentally) indented four spaces to the right for no reason. Fix that up because it's annoying as hell. Signed-off-by: Pekka Enberg <penberg@scylladb.com>	2015-10-06 17:09:27 +02:00
Pekka Enberg	1f9e769dd3	db/schema_tables: Remove obsolete ifdef'd code Remove ifdef'd code that we won't be converting to C++ because of design differences. Signed-off-by: Pekka Enberg <penberg@scylladb.com>	2015-10-06 17:09:27 +02:00
Pekka Enberg	6e304cd58c	db/schema_tables: Fix merge_keyspaces() to actually drop keyspaces When we query schema keyspaces after we have applied a delete mutation, the dropped keyspace does not exist in the "after" result set. Fix the merge_keyspaces() algorithm to take that into account. Makes merge_keyspaces() really call to database::drop_keyspace() when a keyspace is dropped. Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>	2015-10-06 14:53:35 +03:00
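
The hang described in commit 35c1781913 (one joinpoint shared by tables that are dropped in parallel) can be illustrated with a small, single-threaded toy model. This is only a sketch under stated assumptions, not Scylla's implementation: the `Joinpoint` class, its `arrive()` method, `drop_tables()`, and the integer clock are all hypothetical stand-ins, and the arrival order is fixed by hand rather than produced by real shards.

```python
from collections import defaultdict
from itertools import count


class Joinpoint:
    """Toy stand-in: hands out one timestamp per group of `shards` arrivals."""

    def __init__(self, shards, clock):
        self.shards = shards
        self.clock = clock
        self.waiting = []

    def arrive(self, caller):
        """Register an arrival.

        Returns {caller: timestamp} for the whole group once `shards`
        callers have arrived, and an empty dict before that.
        """
        self.waiting.append(caller)
        if len(self.waiting) < self.shards:
            return {}
        ts = next(self.clock)
        released = {c: ts for c in self.waiting}
        self.waiting = []
        return released


def drop_tables(arrivals, joinpoint_for):
    """Replay (table, shard) arrivals and collect the timestamps each table saw."""
    stamps = defaultdict(set)
    for table, shard in arrivals:
        released = joinpoint_for(table).arrive((table, shard))
        for (released_table, _shard), ts in released.items():
            stamps[released_table].add(ts)
    return dict(stamps)


SHARDS = 2
# Shards of tables A and B reach the snapshot step interleaved.
arrivals = [("A", 0), ("B", 0), ("A", 1), ("B", 1)]

# One joinpoint shared by both tables: the first group mixes A and B.
shared = Joinpoint(SHARDS, count(1))
shared_result = drop_tables(arrivals, lambda table: shared)

# One joinpoint per table, as in the fix.
clock = count(1)
per_table = {name: Joinpoint(SHARDS, clock) for name in ("A", "B")}
separate_result = drop_tables(arrivals, lambda table: per_table[table])

print(shared_result)
print(separate_result)
```

With the shared instance, each table ends up with two different timestamps across its shards, which in the real system means two different snapshot names and a sealing fiber that never hears from every shard. With one instance per table, every table sees exactly one timestamp.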


76 Commits