scylladb

Author	SHA1	Message	Date
Vlad Zolotarov	c2ab54e9c7	sstables flushing: enable incremental backup (if requested) Enable incremental backup when sstables are flushed if incremental backup has been requested. It has been enabled in the regular flushing flow before but wasn't in the compaction flow. This patch enables it in both places and does it using a backup capability of sstable::write_components() method(s). Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-01-21 12:13:20 +02:00
Tomasz Grabiec	06d1f4b584	database: Print table name when printing mutation	2016-01-19 13:46:28 +01:00
Tomasz Grabiec	52073d619c	database: Add trace-level logging of applied mutations	2016-01-19 13:46:28 +01:00
Pekka Enberg	7d3a3bd201	Merge "column family cleanup support" from Raphael "This patch is intended to add support to column family cleanup, which will make 'nodetool cleanup' possible. Why is this feature needed? Remove irrelevant data from a node that loses part of its token range to a newly added node."	2016-01-18 10:15:05 +02:00
Paweł Dziepak	18d0a57bf4	commitlog: use commitlog entry writer and reader Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-13 10:20:06 +01:00
Raphael S. Carvalho	a5c90194f5	db: add support to clean up a column family Cleanup is a procedure that will discard irrelevant keys from all sstables of a column family, thus saving disk space. Scylla will clean up a sstable by using compaction code, in which this sstable will be the only input used. Compaction manager was changed to become aware of cleanup, such that it will be able to schedule cleanup requests and also know how to handle them properly. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-12 03:53:04 -02:00
Raphael S. Carvalho	9c13c1c738	compaction: move compaction execution from strategy to manager Currently, compaction strategy is the responsible for both getting the sstables selected for compaction and running compaction. Moving the code that runs compaction from strategy to manager is a big improvement, which will also make possible for the compaction manager to keep track of which sstables are being compacted at a moment. This change will also be needed for cleanup and concurrent compaction on the same column family. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-12 00:04:27 -02:00
Raphael S. Carvalho	5c674091dc	db: move code that rebuilds sstable list to a function That code will be used by column family cleanup, so let's put that code into a function. This change also improves the code readability. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-11 19:51:04 -02:00
Raphael S. Carvalho	58189dd489	db: move generation calculation code to a function Code that calculates generation should be put in a function. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-11 19:51:02 -02:00
Tomasz Grabiec	8deb3f18d3	query_processor: Invalidate prepared statements when columns change Replicates https://issues.apache.org/jira/browse/CASSANDRA-7910 : "Prepare a statement with a wildcard in the select clause. 2. Alter the table - add a column 3. execute the prepared statement Expected result - get all the columns including the new column Actual result - get the columns except the new column"	2016-01-11 10:34:55 +01:00
Tomasz Grabiec	c6a52bed73	db: Fail when attempting to mutate using not synced schema	2016-01-11 10:34:53 +01:00
Tomasz Grabiec	f0d886893d	db: Mark new schemas as synced	2016-01-11 10:34:52 +01:00
Tomasz Grabiec	8164902c84	schema_tables: Change column_family schema on schema sync Notifications are not implemented yet.	2016-01-11 10:34:52 +01:00
Tomasz Grabiec	d81a46d7b5	column_family: Add schema setters There is one current schema for given column_family. Entries in memtables and cache can be at any of the previous schemas, but they're always upgraded to current schema on access.	2016-01-11 10:34:52 +01:00
Tomasz Grabiec	4e5a52d6fa	db: Make read interface schema version aware The intent is to make data returned by queries always conform to a single schema version, which is requested by the client. For CQL queries, for example, we want to use the same schema which was used to compile the query. The other node expects to receive data conforming to the requested schema. Interface on shard level accepts schema_ptr, across nodes we use table_schema_version UUID. To transfer schema_ptr across shards, we use global_schema_ptr. Because schema is identified with UUID across nodes, requestors must be prepared for being queried for the definition of the schema. They must hold a live schema_ptr around the request. This guarantees that schema_registry will always know about the requested version. This is not an issue because for queries the requestor needs to hold on to the schema anyway to be able to interpret the results. But care must be taken to always use the same schema version for making the request and parsing the results. Schema requesting across nodes is currently stubbed (throws runtime exception).	2016-01-11 10:34:52 +01:00
Tomasz Grabiec	036974e19b	Make mutation interfaces support multiple versions Schema is tracked in memtable and cache per-entry. Entries are upgraded lazily on access. Incoming mutations are upgraded to table's current schema on given shard. Mutating nodes need to keep schema_ptr alive in case schema version is requested by target node.	2016-01-11 10:34:51 +01:00
Tomasz Grabiec	9eef4d1651	db: Learn schema versions when adding tables	2016-01-11 10:34:51 +01:00
Tomasz Grabiec	dbb7b7ebe3	db: Move system keyspace initialization to init_system_keyspace()	2016-01-08 21:10:26 +01:00
Avi Kivity	0c755d2c94	db: reduce log spam when ignoring an sstable With 10 sstables/shard and 50 shards, we get ~10*50*50 messages = 25,000 log messages about sstables being ignored. This is not reasonable. Reduce the log level to debug, and move the message to database.cc, because at its original location, the containing function has nothing to do with the message itself. Reviewed-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Message-Id: <1452181687-7665-1-git-send-email-avi@scylladb.com>	2016-01-07 19:23:25 +02:00
Vlad Zolotarov	07f8549683	database: filter out manifest.json files Filter out manifest.json files when reading sstables during bootup and when loading new sstables ('nodetool refresh'). Fixes issue #529 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Message-Id: <1451911734-26511-3-git-send-email-vladz@cloudius-systems.com>	2016-01-07 15:56:02 +02:00
Vlad Zolotarov	c5aa2d6f1a	database: lister: add a filtering option Add a possibility to pass a filter functor receiving a full path to a directory entry and returning a boolean value: TRUE if an entry should be enumerated and FALSE - if it should be filtered out. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Message-Id: <1451911734-26511-2-git-send-email-vladz@cloudius-systems.com>	2016-01-07 15:56:01 +02:00
Pekka Enberg	f4bdec4d09	Merge "Support for deleting all snapshots" from Vlad "Add support for deleting all snapshots of all keyspaces." Fixes #639.	2016-01-05 15:42:44 +02:00
Glauber Costa	74fbd8fac0	do not call open_file_dma directly We have an API that wraps open_file_dma which we use in some places, but in many other places we call the reactor version directly. This patch changes the latter to match the former. It will have the added benefit of allowing us to make easier changes to these interfaces if needed. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <29296e4ec6f5e84361992028fe3f27adc569f139.1451950408.git.glauber@scylladb.com>	2016-01-05 10:37:57 +02:00
Vlad Zolotarov	7bb2b2408b	database::clear_snapshot(): added support for deleting all snapshots When 'nodetool clearsnapshot' is given no parameters it should remove all existing snapshots. Fixes issue #639 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-01-03 14:22:25 +02:00
Vlad Zolotarov	d5920705b8	service::storage_service: move clear_snapshot() code to 'database' class service::storage_service::clear_snapshot() was built around _db.local() calls so it makes more sense to move its code into the 'database' class instead of calling _db.local().bla_bla() all the time. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-01-03 14:22:17 +02:00
Vlad Zolotarov	756de38a9d	database: actually check that a snapshot directory exists Actually check that a snapshot directory with a given tag exists instead of just checking that a 'snapshot' directory exists. Fixes issue #689 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2015-12-29 12:59:00 +01:00
Avi Kivity	41bd266ddd	db: provide more information on "Unrecognized error" while loading sstables This information can be used to understand the root cause of the failure. Refs #692.	2015-12-29 10:23:32 +02:00
Pekka Enberg	eeadf601e6	Merge "cleanups and improvements" from Raphael	2015-12-18 13:45:11 +02:00
Pekka Enberg	e56bf8933f	Improve not implemented errors Print out the function name where we're throwing the exception from to make it easier to debug such exceptions.	2015-12-18 10:51:37 +01:00
Raphael S. Carvalho	41be378ff1	db: fix build of sstable list in column_family::compact_sstables The last two loops were incorrectly inside the first one. That's a bug because a new sstable may be emplaced more than once in the sstable list, which can cause several problems. mark_for_deletion may also be called more than once for compacted sstables, however, it is idempotent. Found this issue while auditing the code. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-16 17:46:17 +02:00
Raphael S. Carvalho	6142efaedb	db: fix indentation Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-14 12:43:34 -02:00
Raphael S. Carvalho	7bbc1b49b6	db: add missing sstable::mark_for_deletion call If a sstable doesn't belong to current shard, mark_for_deletion should be called for the deletion manager to still work. It doesn't mean that the sstable will be deleted, but that the sstable is not relevant to the current shard, thus it can be deleted by the deletion manager in the future. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-14 12:42:26 -02:00
Amnon Heiman	2086c651ba	column_family: get_snapshot_details should return empty map for no snapshots If there is no snapshot directory for the specific column family, get_snapshot_details should return an empty map. This patch checks that a directory exists before trying to iterate over it. Fixes #619 Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2015-12-07 12:51:04 +01:00
Tomasz Grabiec	bc23ebcbc3	schema_tables: Replace schema_result::value_type with equivalent movable type future<> requires and will assert nothrow move constructible types.	2015-12-07 09:50:27 +01:00
Amnon Heiman	7e79d35f85	Estimated histogram: Clean the add interface The add interface of the estimated histogram is confusing as it is not clear what units are used. This patch removes the general add method and replaces it with an add_nano that adds nanoseconds or an add that takes a duration. To be compatible with origin, nanosecond values are translated to microseconds.	2015-12-01 15:28:06 +02:00
Asias He	aa2b11f21b	database: Move is_replacing and get_replace_address to database class So they can be used outside storage_service.	2015-11-30 09:15:42 +08:00
Tomasz Grabiec	a7c11d1e30	db: Fix handling of missing column family The FIXMEs are no longer valid, we load schema on bootstrap and don't support hot-plugging of column families via file system (nor does Cassandra). Handling of missing tables matches Cassandra 2.1, applies log it and continue, queries propagate the error.	2015-11-25 16:59:15 +02:00
Raphael S. Carvalho	0f3ccc1143	db: optimize the sstable loading process Currently, we only determine if a sstable belongs to current shard after loading some of its components into memory. For example, filter may be considerably big and its content is irrelevant to decide if a sstable should be included to a given shard. Start using the functions previously introduced to optimize the sstable loading process. add_sstable no longer checks if a sstable is relevant to the current shard. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-11-19 13:34:25 -02:00
Raphael S. Carvalho	0ce2b7bc8d	db: introduce belongs_to_current_shard Returns true if key range belongs to current shard. False otherwise. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-11-19 13:34:21 -02:00
Raphael S. Carvalho	966e8c7144	db: introduce parallelism to sstable loading Boot may be slow because the function that loads sstables do so serially instead of in parallel. In the callback supplied to lister::scan_dir, let's push the future returned by probe_file (function that loads sstable) into a vector of future and wait for all of them at the end. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-11-19 13:34:11 -02:00
Glauber Costa	fa1ae45218	database: export collectd metrics about the state of memtable flushing When analyzing a recent performance issue, I found helpful to keep track of the amount of memtables that are currently in flight, as well as how much memory they are consuming in the system. Although those are memtable statistics, I am grouping them under the "cf_stats" structure: being the column family a central piece of the puzzle, it is reasonable to assume that a lot of metrics about it would be potentially welcome in the future. Note that we don't want to reuse the "stats" structure in the column family: for once, the fields not always map precisely (pending flushes, for instance, only tracks explicit flushes), and also the stats structure is a lot more complex than we need. Signed-off-by: Glauber Costa <glommer@scylladb.com>	2015-11-12 20:17:22 +02:00
Calle Wilund	284b10cabe	Make partition_slice::row_ranges multiplex on partition Allows for having more than one clustering row range set, depending on PK queried (although right now limited to one - which happens to be exactly the number of multiplexing paging needs... What a coincidence...) Encapsulates the row_ranges member in a query function, and if needed holds ranges outside the default one in an extra object. Query result::builder::add_partition now fetches the correct row range for the partition, and this is the range used in subsequent iteration.	2015-11-10 13:12:33 +01:00
Gleb Natapov	d77a2a0f03	do not try to write same memtable to sstable twice if moving it to a cache failed. Error handling in column_family::try_flush_memtable_to_sstable() is misplaced. It happens after update_cache(), so writing sstable may have succeeded, but moving memtable into the cache may have failed. update_cache() destroys memtable even if it fails, but error handler is not aware of it (it does not even distinguish whether error happened during sstable creation or moving into cache) and when it tells caller to retry it retries with already destroyed memtable. Fix it by ignoring moving to cache errors.	2015-11-09 11:27:37 +01:00
Avi Kivity	cb93af2ad7	Revert "do not try to write same memtable to sstable twice if moving it to a cache failed." This reverts commit `fff37d15cd`. Says Tomek (and the comment in the code): "update_cache() must be called before unlinking the memtable because cache + memtable at any time is supposed to be authoritative source of data for contained partitions. If there is a cache hit in cache, sstables won't be checked. If we unlink the memtable before cache is updated, it's possible that a query will miss data which was in that unlinked memtable, if it hits in the cache (with an old value)."	2015-11-09 11:22:12 +02:00
Gleb Natapov	fff37d15cd	do not try to write same memtable to sstable twice if moving it to a cache failed. Error handling in column_family::try_flush_memtable_to_sstable() is misplaced. It happens after update_cache(), so writing sstable may have succeeded, but moving memtable into the cache may have failed. update_cache() destroys memtable even if it fails, but error handler is not aware of it (it does not even distinguish whether error happened during sstable creation or moving into cache) and when it tells caller to retry it retries with already destroyed memtable. Fix it by ignoring moving to cache errors.	2015-11-09 09:56:45 +02:00
Asias He	20ecb0bede	database: Introduce get_initial_tokens Get initial tokens specified by the initial_token in scylla.conf. E.g., --initial-token "-1112521204969569328,1117992399013959838" --initial-token "1117992399013959838" It can be multiple tokens split by comma.	2015-11-04 10:40:12 +08:00
Calle Wilund	ceb9f4d647	database: Just do commitlog::shutdown on shutdown. It will do flushes.	2015-10-26 14:56:24 +01:00
Avi Kivity	f7087da054	Merge "GET methods for snapshots" from Glauber "The snapshots API need to expose GET methods so people can query information on them. Now that taking snapshots is supported, this relatively simple series implement get_snapshot_details, a column family method, and wire that up through the storage_service."	2015-10-22 15:23:45 +03:00
Avi Kivity	5f3a46eabb	Merge "load_new_sstables" from Glauber "This patchset implements load_new_sstables, allowing one to move tables inside the data directory of a CF, and then call "nodetool refresh" to start using them. Keep in mind that for Cassandra, this is deemed an unsafe operation: https://issues.apache.org/jira/browse/CASSANDRA-6245 It is still for us something we should not recommend - unless the CF is totally empty and not yet used, but we can do a much better job in the safety front. To guarantee that, the process works in four steps: 1) All writes to this specific column family are disabled. This is a horrible thing to do, because dirty memory can grow much more than desired during this. Throughout this implementation, we will try to keep the time during which the writes are disabled to its bare minimum. While disabling the writes, each shard will tell us about the highest generation number it has seen. 2) We will scan all tables that we haven't seen before. Those are any tables found in the CF datadir, that are higher than the highest generation number seen so far. We will link them to new generation numbers that are sequential to the ones we have so far, and end up with a new generation number that is returned to the next step. 3) The generation number computed in the previous step is now propagated to all CFs, which guarantees that all further writes will pick generation numbers that won't conflict with the existing tables. Right after doing that, the writes are resumed. 4) The tables we found in step 2 are passed on to each of the CFs. They can now load those tables while operations to the CF proceed normally."	2015-10-22 13:42:24 +03:00
Glauber Costa	36cea4313e	column family: load new sstables CF-level code to load new SSTables. There isn't really a lot of complication here. We don't even need to repopulate the entire SSTable directory: by requiring that the external service who is coordinating this tell us explicitly about the new SSTables found in the scan process, we can just load them specifically and add them to the SSTable map. All new tables will start their lives as shared tables, and will be unshared if it is possible to do so: this all happens inside add_sstable and there isn't really anything special in this front. Signed-off-by: Glauber Costa <glommer@scylladb.com>	2015-10-21 18:06:22 +02:00


463 Commits