scylladb

Author	SHA1	Message	Date
Nadav Har'El	99ecda3c96	sstables: overhaul range tombstone reading Until recently, we believed that range tombstones we read from sstables will always be for entire rows (or more generalized clustering-key prefixes), not for arbitrary ranges. But as we found out, because Cassandra insists that range tombstones do not overlap, it may take two overlapping row tombstones and convert them into three range tombstones which look like general ranges (see the patch for a more detailed example). Not only do we need to accept such "split" range tombstones, we also need to convert them back to our internal representation which, in the above example, involves two overlapping tombstones. This is what this patch does. This patch also contains a test for this case: We created in Cassandra an sstable with two overlapping deletions, and verify that when we read it to Scylla, we get these two overlapping deletions - despite the sstable file actually having contained three non-overlapping tombstones. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <b7c07466074bf0db6457323af8622bb5210bb86a.1459399004.git.glauber@scylladb.com>	2016-03-31 12:49:50 +03:00
Duarte Nunes	26a3461908	cql: Fix antlr3 missing token leak This patch overrides the antlr3 function that allocates the missing tokens that would eventually leak. The override stores these tokens in a vector, ensuring memory is freed whenever the parser is destroyed. Fixes #1147 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1459355146-17402-1-git-send-email-duarte@scylladb.com>	2016-03-31 08:44:45 +03:00
Duarte Nunes	f7a12adb6f	cql3: Disable pg-style string format test antlr3 leaks the token itself creates when recovering from a mismatch in the case the missing token can be determined. Until this bug is fixed or circumvented, the test should remain disabled. Ref #1147 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1459345403-8243-1-git-send-email-duarte@scylladb.com>	2016-03-30 16:44:47 +03:00
Duarte Nunes	db881fdc8f	cql: Add support for pg-style string literal This patch adds support for pg-style string literals to the CQL grammar. Fixes #1078 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1459093238-2529-1-git-send-email-duarte@scylladb.com>	2016-03-28 17:06:03 +03:00
Tomasz Grabiec	341b509f68	cql_test_env: Make initialization exception-safe Currently start() is not prepared to handle exceptions thrown from service initialization. It's easy to trigger such an exception by starting two tests at the same time, which will result in socket bind error. Exception thrown from start() typically results in assertion failures like this one: seastar::sharded<Service>::~sharded() [with Service = database]: Assertion `_instances.empty()' failed. This patch fixes the problem by combining start() and stop() in a single do_with() and using RAII for stopping services. Now exceptions thrown from service initialization should stop services in proper order and let the original exception pass through. Example result: fatal error in "test_new_schema_with_no_structural_change_is_propagated": std::runtime_error: bind: Address already in use Message-Id: <1458768018-27662-1-git-send-email-tgrabiec@scylladb.com>	2016-03-24 11:20:01 +02:00
Raphael Carvalho	370b1336fe	service: fix refresh Vlad and I were working on finding the root of the problems with refresh. We found that refresh was deleting existing sstable files because of a bug in a function that was supposed to return the maximum generation of a column family. The intention of this function is to get generation from last element of column_family::_sstables, which is of type std::map. However, we were incorrectly using std::map::end() to get last element, so garbage was being read instead of maximum generation. If the garbage value is lower than the minimum generation of a column family, then reshuffle_sstables() would set generation of all existing sstables to a lower value. That would confuse our mechanism used to delete sstables because sstables loaded at boot stage were touched. Solution to this problem is about using rbegin() instead of end() to get last element from column_family::_sstables. The other problem is that refresh will only load generations that are larger than or equal to X, so new sstables with lower generation will not be loaded. Solution is about creating a set with generation of live SSTables from all shards, and using this set to determine whether a generation is new or not. The last change was about providing an unused generation to reshuffle procedure by adding one to the maximum generation. That's important to prevent reshuffle from touching an existing SSTable. Tested 'refresh' under the following scenarios: 1) Existing generations: 1, 2, 3, 4. New ones: 5, 6. 2) Existing generations: 3, 4, 5, 6. New ones: 1, 2. 3) Existing generations: 1, 2, 3, 4. New ones: 7, 8. 4) No existing generation. No new generation. 5) No existing generation. New ones: 1, 2. I also had to adapt existing testcase for reshuffle procedure. Fixes #1073. Signed-off-by: Raphael Carvalho <raphaelsc@scylladb.com> Message-Id: <1c7b8b7f94163d5cd00d90247598dd7d26442e70.1458694985.git.raphaelsc@scylladb.com>	2016-03-23 10:21:58 +02:00
Tomasz Grabiec	a4e3adfbec	Fix assertion in row_cache_alloc_stress Fixes the following assertion failure: row_cache_alloc_stress: tests/row_cache_alloc_stress.cc:120: main(int, char**)::<lambda()>::<lambda()>: Assertion `mt->occupancy().used_space() < memory::stats().free_memory()' failed. memory::stats()::free_memory() may be much lower than the actual amount of reclaimable memory in the system since LSA zones will try to keep a lot of free segments to themselves. Fix by using actual amount of reclaimable memory in the check.	2016-03-22 16:31:04 +01:00
Tomasz Grabiec	529c8b8858	logalloc: Rename tracker::occupancy() to region_occupancy()	2016-03-22 14:56:44 +01:00
Tomasz Grabiec	6e73c3f3dc	perf_simple_query: Make duration configurable	2016-03-21 21:49:53 +01:00
Tomasz Grabiec	2fbb55929d	mutation_test: Add allocation failure stress test for apply() The test injects allocation failures at every allocation site during apply(). Only allocations through allocation_strategy are instrumented, but currently those should include all allocations in the apply() path. The target and source mutations are randomized.	2016-03-21 21:49:53 +01:00
Tomasz Grabiec	8ede27f9c6	mutation_test: Add more apply() tests	2016-03-21 21:49:53 +01:00
Tomasz Grabiec	36575d9f01	mutation_test: Hoist make_blob() to a function	2016-03-21 21:49:53 +01:00
Tomasz Grabiec	4c85d06df7	mutation_test: Make make_blob() return different blob each time random_bytes was constructed with the same seed each time.	2016-03-21 21:49:53 +01:00
Tomasz Grabiec	19b3df9f0f	mutation_test: Fix use-after-free The problem was that verify_row() was returning a future which was not waited on. Fix by running the code in a thread.	2016-03-21 21:49:53 +01:00
Benoît Canet	1fb9a48ac5	exception: Optionally shutdown communication on I/O errors. I/O errors cannot be fixed by Scylla; the only solution is to shut down the database communications. Signed-off-by: Benoît Canet <benoit@scylladb.com> Message-Id: <1458154098-9977-1-git-send-email-benoit@scylladb.com>	2016-03-17 15:02:52 +02:00
Paweł Dziepak	13849fd129	tests/lsa: add test for region groups Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-17 11:20:22 +00:00
Paweł Dziepak	ed53784cb6	tests/lsa: do not leak memory in large allocation test Large allocations test, unsurprisingly, allocates a lot of memory. Do not leak it so that any tests that are going to be run afterwards have still some memory left. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-17 11:19:13 +00:00
Pekka Enberg	2f519b9b34	tests/gossip_test: Fix messaging service stop This fixes gossip test shutdown similar to what commit `13ce48e` ("tests: Fix stop of storage_service in cql_test_env") did for CQL tests: gossip_test: /home/penberg/scylla/seastar/core/sharded.hh:439: Service& seastar::sharded<Service>::local() [with Service = net::messaging_service]: Assertion `local_is_initialized()' failed. Running 1 test case... [snip] unknown location(0): fatal error in "test_boot_shutdown": signal: SIGABRT (application abort requested) seastar/tests/test-utils.cc(32): last checkpoint Message-Id: <1458126520-20025-1-git-send-email-penberg@scylladb.com>	2016-03-16 13:15:18 +02:00
Asias He	13ce48e775	tests: Fix stop of storage_service in cql_test_env In stop() of storage_service, it unregisters the verb handler. In the test, we stop messaging_service before storage_service. Fix it by deferring stop of messaging_service. Message-Id: <c71f7b5b46e475efe2fac4c1588460406f890176.1458086329.git.asias@scylladb.com>	2016-03-16 08:32:01 +02:00
Asias He	9f64c36a08	storage_service: Fix pending_range_calculator_service Since calculate_pending_ranges will modify token_metadata, we need to replicate to other shards. With this patch, when we call calculate_pending_ranges, token_metadata will be replicated to other non-zero shards. In addition, it is not useful as a standalone class. We can merge it into the storage_service. Kill one singleton class. Fixes #1033 Refs #962 Message-Id: <fb5b26311cafa4d315eb9e72d823c5ade2ab4bda.1457943074.git.asias@scylladb.com>	2016-03-14 10:14:22 +02:00
Pekka Enberg	d4b4baad98	Merge "Add more information to query result digest" from Paweł "This series adds more information (i.e. keys and tombstones) to the query result digest in order to ensure correctness and increase the chances of early detection of disagreement between replicas. The digest is no longer computed by hashing query::result but build using the query result builder. That is necessary since the query result itself doesn't contain all information required to compute the digest. Another consequence of this is that now replicas asked for a result need to send both the result and the digest to the coordinator as it won't be able to compute the digest itself. Unfortunately, these patches change our on wire communication: 1) hash computation is different 2) format of query::result is changed (and it is made non-final) Fixes #182."	2016-03-14 08:22:05 +02:00
Paweł Dziepak	82d2a2dccb	specify whether query::result, result_digest or both are needed Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-11 18:27:13 +00:00
Raphael S. Carvalho	031bf57c19	sstables: bail out if toc exists for generation used by write_components Currently, if sstable::write_components() is called to write a new sstable using the same generation of a sstable that exists, a temporary TOC will be unconditionally created. Afterwards, the same sstable::write_components() will fail when it reaches sstable::create_data(). The reason is obvious because data component exists for that generation (in this scenario). After that, user will not be able to boot scylla anymore because there is a generation with both a TOC and a temporary TOC. We cannot simply remove a generation with TOC and temporary TOC because user data will be lost (again, in this scenario). After all, the temporary TOC was only created because sstable::write_components() was wrongly called with the generation of a sstable that exists. Solution proposed by this patch is to trigger exception if a TOC file exists for the generation used. Some SSTable unit tests were also changed to guarantee that we don't try to overwrite components of an existing sstable. Refs #1014. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <caffc4e19cdcf25e4c6b9dd277d115422f8246c4.1457643565.git.raphaelsc@scylladb.com>	2016-03-11 09:22:51 +02:00
Glauber Costa	a339296385	database: turn sstable generation number into an optional This patch makes sure that every time we need to create a new generation number - the very first step in the creation of a new SSTable, the respective CF is already initialized and populated. Failure to do so can lead to data being overwritten. Extensive details about why this is important can be found in Scylla's Github Issue #1014 Nothing should be writing to SSTables before we have the chance to populate the existing SSTables and calculate what should the next generation number be. However, if that happens, we want to protect against it in a way that does not involve overwriting existing tables. This is one of the ways to do it: every column family starts in an unwriteable state, and when it can finally be written to, we mark it as writeable. Note that this cannot be a part of add_column_family. That adds a column family to a db in memory only, and if anybody is about to write to a CF, that was most likely already called. We need to call this explicitly when we are sure we're ready to issue disk operations safely. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-10 21:06:05 -05:00
Pekka Enberg	ab502bcfa8	types: Implement to_string for timestamps and dates The to_string() function is used for logging purpose so use boost to_iso_extended_string() to format both timestamps and dates. Fixes #968 (showstopper) Message-Id: <1457528755-6164-1-git-send-email-penberg@scylladb.com>	2016-03-09 14:08:33 +01:00
Tomasz Grabiec	2abd62b5cb	bytes_ostream: Drop methods which serialize integers This will make bytes_ostream completely agnostic to serialization format, which should be determined by layer above it. Message-Id: <1457004221-8345-2-git-send-email-tgrabiec@scylladb.com>	2016-03-03 13:27:27 +02:00
Paweł Dziepak	d50594351b	db: remove old-style serializers Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-02 09:09:30 +00:00
Paweł Dziepak	92f9c9428e	cql3: don't insert row marker if schema is_cql3_table() Checking schema::is_dense() is not enough to know whether row marker should be inserted or not as there may be compact storage tables that are not considered dense (namely, a table with no clustering key). Row marker should only be inserted if schema::is_cql3_table() is true. Fixes #931. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1456834937-1630-1-git-send-email-pdziepak@scylladb.com>	2016-03-01 13:29:53 +01:00
Paweł Dziepak	6a6c12f8c4	tests/commitlog: use unaligned_cast instead of reinterpret_cast corrupt_segment() is meant to write some garbage at arbitrary position in the commitlog segment. That position is not necessarily properly aligned for uint32_t. Silences ubsan complaints about unaligned write. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1456827726-21288-1-git-send-email-pdziepak@scylladb.com>	2016-03-01 12:57:06 +02:00
Paweł Dziepak	e194835d8a	tests/idl: add test for stdx::optional<> serialization Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1456761055-23916-1-git-send-email-pdziepak@scylladb.com>	2016-02-29 18:12:59 +02:00
Calle Wilund	0de8f6d24f	cql_test_env: Shutdown auth on test stop Ensures no spurious timer tasks tries to touch stopped distributed objects. Message-Id: <1456753987-6914-4-git-send-email-calle@scylladb.com>	2016-02-29 16:06:33 +02:00
Tomasz Grabiec	135c1fa306	tests: memory_footprint: Report size in query results	2016-02-26 12:26:13 +01:00
Tomasz Grabiec	697d9bfa56	serializer: Introduce as_input_stream(bytes_view)	2016-02-26 12:26:13 +01:00
Tomasz Grabiec	4284715ddf	Relax includes	2016-02-26 12:26:13 +01:00
Raphael S. Carvalho	7f0371129c	tests: sstable_test: submit compaction request through column family That's needed for reverted commit `9586793c` to work. It's also the correct thing to do, i.e. column family submits itself to manager. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <2a1d141ad929c1957933f57412083dd52af0390b.1456415398.git.raphaelsc@scylladb.com>	2016-02-25 18:02:00 +02:00
Asias He	697b16414a	gossip: Make gossip message handling async In each gossip round, i.e., gossiper::run(), we do: 1) send syn message 2) peer node: receive syn message, send back ack message 3) process ack message in handle_ack_msg apply_state_locally mark_alive send_gossip_echo handle_major_state_change on_restart mark_alive send_gossip_echo mark_dead on_dead on_join apply_new_states do_on_change_notifications on_change 4) send back ack2 message 5) peer node: process ack2 message apply_state_locally At the moment, syn is "wait" message, it times out in 3 seconds. In step 3, all the registered gossip callbacks are called which might take significant amount of time to complete. In order to reduce the gossip round latency, we make syn "no-wait" and do not run the handle_ack_msg inside the gossip::run(). As a result, we will not get an ack message as the return value of a syn message any more, so a GOSSIP_DIGEST_ACK message verb is introduced. With this patch, the gossip message exchange is now async. It is useful when some nodes are down in the cluster. We will not delay the gossip round, which is supposed to run every second, 3*n seconds (n = 1-3, since it talks to 1-3 peer nodes in each gossip round) or even longer (considering the time to run gossip callbacks). Later, we can make talking to the 1-3 peer nodes in parallel to reduce latency even more. Refs: #900	2016-02-24 19:33:39 +08:00
Tomasz Grabiec	79bcb5a616	tests: Fix build of memory_footprint	2016-02-23 19:12:54 +01:00
Tomasz Grabiec	c591157755	tests: mutation_query: Add test for dropping partitions with expired tombstones	2016-02-22 20:23:29 +01:00
Paweł Dziepak	597ed15dfd	tests: add idl unit test Test auto-generated and writer-based serialization as well as deserialization of simple compound type, vectors and variants. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-02-19 21:19:30 +00:00
Tomasz Grabiec	9d11968ad8	Rename serialization_format to cql_serialization_format	2016-02-15 16:53:56 +01:00
Tomasz Grabiec	b74301302c	tests: Add test for key serialization	2016-02-10 15:22:56 +01:00
Tomasz Grabiec	095efd01d6	keys: Make from_exploded() and components() work without schema For simplicity, we want to have keys serializable and deserializable without schema for now. We will serialize keys in a generic form of a vector of components where the format of components is specified by CQL binary protocol. So conversion between keys and vector of components needs to be possible to do without schema. We may want to make keys schema-dependent back in the future to apply space optimizations specific to column types. Existing code should still pass schema& to construct and access the key when possible. One optimization had to be reverted in this change - avoidance of storing key length (2 bytes) for single-component partition keys. One consequence of this, in addition to a bit larger keys, is that we can no longer avoid copy when constructing single-component partition keys from a ready "bytes" object. I haven't noticed any significant performance difference in: tests/perf/perf_simple_query -c1 --write It does ~130K tps on my machine.	2016-02-10 14:35:13 +01:00
Tomasz Grabiec	b777cc9565	tests: Fix tests to not rely on key representation	2016-02-10 14:35:13 +01:00
Paweł Dziepak	ababdfc9e2	tests/batchlog: use proper batchlog version Since `42e3999a00` "Check batchlog version before replaying" there is a version check in batchlog replay. However, the test wasn't updated and still used some arbitrary version number which caused it to fail. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1454595368-21670-1-git-send-email-pdziepak@scylladb.com>	2016-02-04 16:50:45 +02:00
Avi Kivity	f3ca597a01	Merge "Sstable cleanup fixes" from Tomasz " - Added waiting for async cleanup on clean shutdown - Crash in the middle of sstable removal doesn't leave system in a non-bootable state"	2016-02-04 12:36:13 +02:00
Tomasz Grabiec	355874281a	sstables: Do not register exit hooks from static initializer Fixes #868. Registering exit hooks while reactor is already iterating over exit hooks is not allowed and currently leads to undefined behavior observed in #868. While we should make the failure more user friendly, registering exit hooks concurrently with shutdown will not be allowed. We don't expect exit hooks to be registered after exit starts because this would violate the guarantee which says that exit hooks are executed in reverse order of registration. Starting exit sequence in the middle of initialization sequence would result in use after free errors. Btw, I'm not sure if currently there's anything which prevents this. To solve this problem, move the exit hook to initialization sequence. In case of tests, the cleanup has to be called explicitly.	2016-02-03 17:35:50 +01:00
Calle Wilund	159dbe3a64	sstable_datafile_tests: Replace '---' with auto Fixes compilation issues on some g++. Message-Id: <1454323749-21933-1-git-send-email-calle@scylladb.com>	2016-02-01 12:58:33 +02:00
Raphael S. Carvalho	a46aa47ab1	make sstables::compact_sstables return list of created sstables Now, sstables::compact_sstables() receives as input a list of sstables to be compacted, and outputs a list of sstables generated by compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <0d8397f0395ce560a7c83cccf6e897a7f464d030.1454110234.git.raphaelsc@scylladb.com>	2016-01-31 12:39:20 +02:00
Asias He	5003c6e78b	config: Introduce shutdown_announce_in_ms option Time a node waits after sending gossip shutdown message in milliseconds. Reduces ./cql_query_test execution time from real 2m24.272s user 0m8.339s sys 0m10.556s to real 1m17.765s user 0m3.698s sys 0m11.578	2016-01-27 11:19:38 +08:00
Paweł Dziepak	490201fd1c	row_cache: protect against stale entries row_cache::update() does not explicitly invalidate the entries it failed to update in case of a failure. This could lead to inconsistency between row cache and sstables. In practice that's not a problem because before row_cache::update() fails it will cause all entries in the cache to be invalidated during memory reclaim, but it's better to be safe and explicitly remove entries that should be updated but it was not possible to do so. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1453829681-29239-1-git-send-email-pdziepak@scylladb.com>	2016-01-26 20:34:41 +01:00
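
The "service: fix refresh" commit (370b1336fe) in the log above attributes the bug to reading the last element of a std::map through end() instead of rbegin(). The following is a minimal sketch of that distinction, not the Scylla code: `sstable_map`, `max_generation` and `next_unused_generation` are hypothetical names, the map only stands in for column_family::_sstables, and returning 0 for an empty map is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for column_family::_sstables: generation -> sstable name.
using sstable_map = std::map<int64_t, std::string>;

// Returns the highest generation stored in the map.
// std::map keeps keys sorted, so rbegin() refers to the last (largest) key.
// end() is one past the last element; dereferencing it is undefined behavior,
// which is how a garbage generation could be read.
// Assumption for this sketch: an empty map yields generation 0.
int64_t max_generation(const sstable_map& sstables) {
    if (sstables.empty()) {
        return 0;
    }
    return sstables.rbegin()->first;
}

// An unused generation is obtained by adding one to the maximum,
// as the commit message describes for the reshuffle procedure.
int64_t next_unused_generation(const sstable_map& sstables) {
    return max_generation(sstables) + 1;
}
```

With existing generations 3, 4, 5 and 6, `max_generation` returns 6 and `next_unused_generation` returns 7, so a reshuffle that starts from that value would not touch an existing generation.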

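The "tests/commitlog" commit (6a6c12f8c4) in the log above replaces a reinterpret_cast write with unaligned_cast because the target offset may not be aligned for uint32_t. The sketch below shows the same concern using std::memcpy, which is a different technique from Scylla's unaligned_cast helper; `write_u32_unaligned` and `read_u32_unaligned` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Writes a 32-bit value at an arbitrary byte offset in the buffer.
// std::memcpy is well defined for any offset, whereas
// *reinterpret_cast<uint32_t*>(buf.data() + pos) = value is undefined
// behavior when the address is not suitably aligned for uint32_t.
// Returns false (and writes nothing) if the value would not fit.
bool write_u32_unaligned(std::vector<char>& buf, std::size_t pos, uint32_t value) {
    if (pos > buf.size() || buf.size() - pos < sizeof(value)) {
        return false;
    }
    std::memcpy(buf.data() + pos, &value, sizeof(value));
    return true;
}

// Reads a 32-bit value back from an arbitrary byte offset.
// The caller must ensure pos + 4 does not exceed buf.size().
uint32_t read_u32_unaligned(const std::vector<char>& buf, std::size_t pos) {
    uint32_t value = 0;
    std::memcpy(&value, buf.data() + pos, sizeof(value));
    return value;
}
```

Writing at an odd offset such as 3 in a 16-byte buffer changes only bytes 3 through 6, and reading from the same offset returns the value that was written.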

1043 Commits