scylladb

Author	SHA1	Message	Date
Raphael Carvalho	de4b4e593d	db: better handling of failure in column_family::populate Improve handling of failure by saving first exception and ignoring the remaining futures. At the moment, code only throws first exception and doesn't care about any possible remaining future. Signed-off-by: Raphael Carvalho <raphaelsc@scylladb.com> Message-Id: <383dc4445db09dd2fbce093d4609a0a0bc38a405.1458240398.git.raphaelsc@scylladb.com>	2016-03-20 17:33:20 +02:00
Benoît Canet	3b1d3d977d	exceptions: Shutdown communications on non file I/O errors Apply the same treatment to non file filesystem I/O errors. Signed-off-by: Benoît Canet <benoit@scylladb.com> Message-Id: <1458154098-9977-2-git-send-email-benoit@scylladb.com>	2016-03-17 15:02:54 +02:00
Benoît Canet	1fb9a48ac5	exception: Optionally shutdown communication on I/O errors. I/O errors cannot be fixed by Scylla; the only solution is to shut down the database communications. Signed-off-by: Benoît Canet <benoit@scylladb.com> Message-Id: <1458154098-9977-1-git-send-email-benoit@scylladb.com>	2016-03-17 15:02:52 +02:00
Pekka Enberg	d4b4baad98	Merge "Add more information to query result digest" from Paweł "This series adds more information (i.e. keys and tombstones) to the query result digest in order to ensure correctness and increase the chances of early detection of disagreement between replicas. The digest is no longer computed by hashing query::result but build using the query result builder. That is necessary since the query result itself doesn't contain all information required to compute the digest. Another consequence of this is that now replicas asked for a result need to send both the result and the digest to the coordinator as it won't be able to compute the digest itself. Unfortunately, these patches change our on wire communication: 1) hash computation is different 2) format of query::result is changed (and it is made non-final) Fixes #182."	2016-03-14 08:22:05 +02:00
Paweł Dziepak	82d2a2dccb	specify whether query::result, result_digest or both are needed Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-11 18:27:13 +00:00
Glauber Costa	a339296385	database: turn sstable generation number into an optional This patch makes sure that every time we need to create a new generation number - the very first step in the creation of a new SSTable, the respective CF is already initialized and populated. Failure to do so can lead to data being overwritten. Extensive details about why this is important can be found in Scylla's Github Issue #1014 Nothing should be writing to SSTables before we have the chance to populate the existing SSTables and calculate what the next generation number should be. However, if that happens, we want to protect against it in a way that does not involve overwriting existing tables. This is one of the ways to do it: every column family starts in an unwriteable state, and when it can finally be written to, we mark it as writeable. Note that this cannot be a part of add_column_family. That adds a column family to a db in memory only, and if anybody is about to write to a CF, that was most likely already called. We need to call this explicitly when we are sure we're ready to issue disk operations safely. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-10 21:06:05 -05:00
Glauber Costa	94e90d4a17	column_family: do not open code generation calculation We already have a function that wraps this, re-use it. This FIXME is still relevant, so just move it there. Let's not lose it. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-10 21:05:47 -05:00
Glauber Costa	46fdeec60a	column_family: remove mutation_count We use memory usage as a threshold these days, and nowhere is _mutation_count checked. Get rid of it. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-10 21:05:47 -05:00
Asias He	4abaacfc61	db: Introduce column_family_exists It is cheaper than throwing a no_such_column_family exception to test if a cf is gone, e.g., deleted.	2016-03-09 16:50:38 +08:00
Glauber Costa	8260b8fc6f	touch CF directories during startup We try to be robust against files disappearing (due to any kind of corruption) inside the data directory. But if the data directory itself goes missing, that's a situation that we don't handle correctly. We will keep accepting writes normally, but when we try to flush the memtable to disk, we'll fail with a system error. Having the CF directory disappearing is not a common thing. But it is also one that we can easily protect against, by touching all CF directories we know about on startup. Fixes #999 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <ed66373dccca11742150a6d08e21ece3980227d3.1457379853.git.glauber@scylladb.com>	2016-03-09 09:06:51 +02:00
Vlad Zolotarov	a45ecaf336	database: store "incremental backup" configuration value in per-shard instance Store the "incremental_backups" configuration value in the database class (and use it when creating a keyspace::config) in order to be able to modify it in runtime. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-03-06 17:22:48 +02:00
Paweł Dziepak	bdc23ae5b5	remove db/serializer.hh includes Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-02 09:07:09 +00:00
Raphael S. Carvalho	34ed930aa4	sstables: fix lack of accuracy in disk usage report To report disk usage, scylla was only taking into account size of sstable data component. Other components such as index and filter may be relatively big too. Therefore, 'nodetool status' would report an inaccurate disk usage. That can be fixed by taking into account size of all sstable components. Fixes #943. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <08453585223570006ac4d25fe5fb909ad6c140a5.1456762244.git.raphaelsc@scylladb.com>	2016-03-01 08:58:42 +02:00
Tomasz Grabiec	6cec131432	query: Switch to IDL-generated views and writers The query result footprint for cassandra-stress mutation as reported by tests/memory-footprint increased by 18% from 285 B to 337 B. perf_simple_query shows slight regression in throughput (-8%): build/release/tests/perf/perf_simple_query -c4 -m1G --partitions 100000 Before: ~433k tps After: ~400k tps	2016-02-26 12:26:13 +01:00
Avi Kivity	a74f68eeb2	Merge "Properly tag readers" from Glauber "Gleb has recently noted that our query reads are not even being registered with the I/O queue. Investigating what is happening, I found out that the priority that make_reader receives was not being properly passed downwards to the SSTable reader. The reader code is also used by the compaction class, and that one is fine. But the CQL reads are not. On top of that, there are also some other places where the tag was not properly propagated, and those are patched."	2016-02-25 18:35:58 +02:00
Raphael S. Carvalho	fc4cbcde72	Revert "Revert "database: Fix use and assumptions about pending compations"" This reverts commit `a4d92750eb`. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <8a405e7c1daf94c4d70d8084f59ce7205d56fe52.1456415398.git.raphaelsc@scylladb.com>	2016-02-25 18:02:01 +02:00
Pekka Enberg	a4d92750eb	Revert "database: Fix use and assumptions about pending compations" This reverts commit `9586793c70`. It breaks sstable_test as follows: [penberg@nero scylla]$ build/release/tests/sstable_test --smp 1 Running 81 test cases... INFO [shard 0] compaction_manager - Asked to stop INFO [shard 0] compaction_manager - Stopped sstable_test: database.cc:878: future<> column_family::run_compaction(sstables::compaction_descriptor): Assertion `_stats.pending_compactions > 0' failed. unknown location(0): fatal error in "compaction_manager_test": signal: SIGABRT (application abort requested) tests/sstable_datafile_test.cc(1023): last checkpoint	2016-02-25 15:28:06 +02:00
Calle Wilund	9586793c70	database: Fix use and assumptions about pending compations Fixes #934 - faulty assert in discard_sstables run_with_compaction_disabled clears out a CF from compaction manager queue. discard_sstables wants to assert on this, but looks at the wrong counters. pending_compactions is an indicator on how much interested parties want a CF compacted (again and again). It should not be considered an indicator of compactions actually being done. This modifies the usage slightly so that: 1.) The counter is always incremented, even if compaction is disallowed. The counter's value on end of run_with_compaction_disabled is then instead used as an indicator as to whether a compaction should be re-triggered. (If compactions finished, it will be zero) 2.) Document the use and purpose of the pending counter, and add method to re-add CF to compaction for r_w_c_d above. 3.) discard_sstables now asserts on the right things. Message-Id: <1456332824-23349-1-git-send-email-calle@scylladb.com>	2016-02-25 08:57:04 +02:00
Glauber Costa	336babfcb8	database: add a priority class to a few SSTable readers Not all SSTable readers will end up getting the right tag for a priority class. In particular, the range reader, also used for the memtables, completely ignores any priority class. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-02-24 18:00:34 -05:00
Glauber Costa	2816bc6fed	database: use a reference instead of a pointer to store the priority classes We will always initialize it, so don't use a pointer. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-02-24 18:00:34 -05:00
Glauber Costa	80ab41a715	memtable reader: also include a priority class There are situations when a memtable is already flushed but the memtable reader will continue to be in place, relaying reads to the underlying table. For that reason, the "memtables don't need a priority class" argument gets obviously broken. We need to pass a priority class for its reader as well. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-02-24 18:00:34 -05:00
Calle Wilund	590ec1674b	truncate: Require timestamp join-function to ensure equal values Fixes #937 In fixing #884, truncation not truncating memtables properly, time stamping in truncate was made shard-local. This however breaks the snapshot logic, since for all shards in a truncate, the sstables should snapshot to the same location. This patch adds a required function argument to truncate (and by extension drop_column_family) that produces a time stamp in a "join" fashion (i.e. same on all shards), and utilizes the joinpoint type in caller to do so. Message-Id: <1456332856-23395-2-git-send-email-calle@scylladb.com>	2016-02-24 18:59:31 +02:00
Tomasz Grabiec	d3b7e143dc	db: Fix error handling in populate_keyspace() When find_uuid() fails Scylla would terminate with: Exiting on unhandled exception of type 'std::out_of_range': _Map_base::at But we are supposed to ignore directories for unknown column families. The try {} catch block is doing just that when no_such_column_family is thrown from the find_column_family() call which follows find_uuid(). Fix by converting std::out_of_range to no_such_column_family. Message-Id: <1456056280-3933-1-git-send-email-tgrabiec@scylladb.com>	2016-02-21 14:19:31 +02:00
Raphael S. Carvalho	55be1830ff	database: make column_family::rebuild_sstable_list safer If any of the allocation in rebuild_sstable_list fail, the system may be left with an incorrect set of sstables. It's probably safer to assign the new set of sstables as a last step. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <52b188262dcc06730dc9220b54ff6810d7dca1ae.1455835030.git.raphaelsc@scylladb.com>	2016-02-21 11:55:15 +02:00
Tomasz Grabiec	a921479e71	Merge tag '807-v3' from https://github.com/avikivity/scylla From Avi: This patchset introduces a linearization context for managed_bytes objects. Within this context, any scattered managed_bytes (found only in lsa regions, so limited to memtable and cache) are auto-linearized for the lifetime of the context. This ensures that key and value lookups can use fast contiguous iterators instead of using slow discontiguous iterators (or crashing, as is the case now).	2016-02-16 14:29:48 +01:00
Avi Kivity	3c60310e38	key: relax some APIs to accept partition_key_view instead of const partition_key& Using a partition_key_view can save an allocation in some cases. We will make use of it when we linearize a partition_key; during the process we are given a simple byte pointer, and constructing a partition_key from that requires an allocation.	2016-02-09 19:55:13 +02:00
Calle Wilund	18203a4244	database::truncate/drop: Move time stamp generation to shard Fixes #884 Time stamps for truncation must be generated after flush, either by splitting the truncate into two (or more) for-each-shard operations, or simply by doing time stamping per shard (this solution). We generate TS on each shard after flushing, and then rely on the actual stored value to be the highest time point generated. This should however, from batch replay point of view, be functionally equivalent. And not a problem.	2016-02-09 15:45:37 +00:00
Calle Wilund	873f87430d	database: Check sstable dir name UUID part when populating CF Fixes #870 Only load sstables from CF directories that match the current CF uuid. Message-Id: <1454938450-4338-1-git-send-email-calle@scylladb.com>	2016-02-08 14:48:19 +01:00
Avi Kivity	f3ca597a01	Merge "Sstable cleanup fixes" from Tomasz " - Added waiting for async cleanup on clean shutdown - Crash in the middle of sstable removal doesn't leave system in a non-bootable state"	2016-02-04 12:36:13 +02:00
Tomasz Grabiec	136c9d9247	sstables: Improve error message in case of generation duplication Refs #870.	2016-02-03 17:35:50 +01:00
Raphael S. Carvalho	a46aa47ab1	make sstables::compact_sstables return list of created sstables Now, sstables::compact_sstables() receives as input a list of sstables to be compacted, and outputs a list of sstables generated by compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <0d8397f0395ce560a7c83cccf6e897a7f464d030.1454110234.git.raphaelsc@scylladb.com>	2016-01-31 12:39:20 +02:00
Raphael S. Carvalho	ee84f310d9	move deletion of sstables generated by interrupted compaction This deletion should be handled by sstables::compact_sstables, which is responsible for the creation of new sstables. It also simplifies the code. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <541206be2e910ab4edb1500b098eb5ebf29c6509.1454110234.git.raphaelsc@scylladb.com>	2016-01-31 12:39:20 +02:00
Raphael S. Carvalho	3b7970baff	compaction: delete generated sstables in event of an interrupt Generated sstables may imply either fully or partially written. Compaction is interrupted if it was deliberately asked to stop (stop API) or it was forced to do so in event of a failure, e.g., out of disk space. There is a need to explicitly delete sstables generated by a compaction that was interrupted. Otherwise, such sstables will waste disk space and even worsen read performance, which degrades as number of generations to look at increases. Fixes #852. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <49212dbf485598ae839c8e174e28299f7127f63e.1453912119.git.raphaelsc@scylladb.com>	2016-01-28 14:05:57 +02:00
Tomasz Grabiec	9fa62af96b	database: Move implementation to .cc Message-Id: <1453980679-27226-1-git-send-email-tgrabiec@scylladb.com>	2016-01-28 13:35:33 +02:00
Glauber Costa	3f94070d4e	use auto&& instead of auto& for priority classes. By Avi's request, who reminds us that auto& is more suited for situations in which we are assigning to the variable in question. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <87c76520f4df8b8c152e60cac3b5fba5034f0b50.1453820373.git.glauber@scylladb.com>	2016-01-26 17:00:20 +02:00
Glauber Costa	b63611e148	mark I/O operations with priority classes After this patch, our I/O operations will be tagged into a specific priority class. The available classes are 5, and were defined in the previous patch: 1) memtable flush 2) commitlog writes 3) streaming mutation 4) SSTable compaction 5) CQL query Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Glauber Costa	f6cfb04d61	add a priority class to mutation readers SSTables already have a priority argument wired to their read path. However, most of our reads do not call that interface directly, but employ the services of a mutation reader instead. Some of those readers will be used to read through a mutation_source, and those have to be patched as well. Right now, whenever we need to pass a class, we pass Seastar's default priority class. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Glauber Costa	15336e7eb7	key_source: turn it into a class Its definition as a lambda function is inconvenient, because it does not allow us to use default values for parameters. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Glauber Costa	58fdae33bd	mutation_source: turn it into a class Its definition as a lambda function is inconvenient, because it does not allow us to use default values for parameters. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Vlad Zolotarov	c2ab54e9c7	sstables flushing: enable incremental backup (if requested) Enable incremental backup when sstables are flushed if incremental backup has been requested. It has been enabled in the regular flushing flow before but wasn't in the compaction flow. This patch enables it in both places and does it using a backup capability of sstable::write_components() method(s). Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2016-01-21 12:13:20 +02:00
Tomasz Grabiec	06d1f4b584	database: Print table name when printing mutation	2016-01-19 13:46:28 +01:00
Tomasz Grabiec	52073d619c	database: Add trace-level logging of applied mutations	2016-01-19 13:46:28 +01:00
Pekka Enberg	7d3a3bd201	Merge "column family cleanup support" from Raphael "This patch is intended to add support to column family cleanup, which will make 'nodetool cleanup' possible. Why is this feature needed? Remove irrelevant data from a node that loses part of its token range to a newly added node."	2016-01-18 10:15:05 +02:00
Paweł Dziepak	18d0a57bf4	commitlog: use commitlog entry writer and reader Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-01-13 10:20:06 +01:00
Raphael S. Carvalho	a5c90194f5	db: add support to clean up a column family Cleanup is a procedure that will discard irrelevant keys from all sstables of a column family, thus saving disk space. Scylla will clean up a sstable by using compaction code, in which this sstable will be the only input used. Compaction manager was changed to become aware of cleanup, such that it will be able to schedule cleanup requests and also know how to handle them properly. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-12 03:53:04 -02:00
Raphael S. Carvalho	9c13c1c738	compaction: move compaction execution from strategy to manager Currently, compaction strategy is responsible for both getting the sstables selected for compaction and running compaction. Moving the code that runs compaction from strategy to manager is a big improvement, which will also make it possible for the compaction manager to keep track of which sstables are being compacted at a moment. This change will also be needed for cleanup and concurrent compaction on the same column family. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-12 00:04:27 -02:00
Raphael S. Carvalho	5c674091dc	db: move code that rebuilds sstable list to a function That code will be used by column family cleanup, so let's put that code into a function. This change also improves the code readability. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-11 19:51:04 -02:00
Raphael S. Carvalho	58189dd489	db: move generation calculation code to a function Code that calculates generation should be put in a function. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2016-01-11 19:51:02 -02:00
Tomasz Grabiec	8deb3f18d3	query_processor: Invalidate prepared statements when columns change Replicates https://issues.apache.org/jira/browse/CASSANDRA-7910 : "1. Prepare a statement with a wildcard in the select clause. 2. Alter the table - add a column 3. execute the prepared statement Expected result - get all the columns including the new column Actual result - get the columns except the new column"	2016-01-11 10:34:55 +01:00
Tomasz Grabiec	c6a52bed73	db: Fail when attempting to mutate using not synced schema	2016-01-11 10:34:53 +01:00

502 Commits