scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-02 13:06:57 +00:00

Author	SHA1	Message	Date
Botond Dénes	ddd70dc113	Use dht::token_range alias for last/preferred replicas Use the pre-existing type alias instead of fully spelling out the type everywhere.	2018-05-10 06:22:39 +03:00
Botond Dénes	52affa2a61	storage_proxy::coordinator_query_result: merge constructors into one w/ default params	2018-05-10 06:22:39 +03:00
Botond Dénes	3b6f4e4901	querier: check only the end bound of ranges when matching them The querier provides a `matches(const nonwrapping_range&)` member to allow for checking whether a range matches that with which the querier was originally created. The check for match is more lax than a strict equality check as ranges are shrunk as the query progresses. Because of this the above member only checked that one of the bounds of the examined ranges matches. This is adequate for this purpose because, in the context of a single query, it is guaranteed that no two read requests to the same replica will have overlapping ranges. However, Avi pointed out in a recent, related review that this check can be made a little more strict by requiring that the end bounds of the two ranges always match, instead of allowing any of the bounds to match.	2018-05-10 06:22:39 +03:00
Botond Dénes	eba90d0208	querier: take range and slice by value It needs to copy these anyway so give callers the opportunity to move these in.	2018-05-10 06:22:39 +03:00
Botond Dénes	546a0e292e	querier: remove const params from make_compaction_state()	2018-05-10 06:22:39 +03:00
Botond Dénes	bc01833cad	querier: make _range and _slice const Since we are storing them on the heap we can make them const and still be movable. We get the cake and can eat it too.	2018-05-10 06:22:39 +03:00
Botond Dénes	f5b012c952	flat_multi_range_mutation_reader: optimize for non-plural range vectors Don't create a flat_multi_range_mutation_reader when the range vector has 0 or 1 element. In the former case create an empty reader and in the latter just create a reader with the mutation-source with the only range in the vector.	2018-05-10 06:22:39 +03:00
Botond Dénes	16319c2036	range: clean the deduced transformed type wrapping_range and nonwrapping_range offer a transform() member function which allows creating a new range by applying a transformer function to the bounds of the current range. The type of bounds of the new range is deduced from the return type for this transformer function. However the return type is used as-is, with any CV or reference attached to it. Since it doesn't make sense to create a range of references or a type with CV qualifiers strip these off the deduced type.	2018-05-10 06:22:39 +03:00
Avi Kivity	911c2e7953	Merge "Support Bloom filter format for SSTables 3.x." from Vladimir " In SSTables 3.0, the base and increment fields have been swapped in Bloom filters to reduce collisions (see CASSANDRA-8413). This affects the resulting values written to Filter.db. This patchset adds support for reading/writing Filter.db in the format corresponding to the version of SSTables. Tests: unit {release} Filter.db files have been generated using Cassandra 3.11 with same data as in unit tests and are validated to match those generated by Scylla. " * 'projects/sstables-30/write-filter/v1-2' of https://github.com/argenet/scylla: Fix mistakes and typos in comments (minor clean-up) Check Filter.db in SSTables 3.x write tests. Support Bloom filter format used in SSTables 3.0. Remove unused overload of i_filter::get_filter().	2018-05-09 11:16:09 +03:00
Vladimir Krivopalov	51c8ea74d6	sstables: generate non-empty summaries for m format Add summary entries as needed. Also removes the duplicate line that assigned summary byte cost. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com> Message-Id: <0d387c68523bae0c121cb15ad1e651ee9a8e4b4a.1525732404.git.vladimir@scylladb.com>	2018-05-09 11:15:02 +03:00
Vladimir Krivopalov	b59549cd16	Fix mistakes and typos in comments (minor clean-up) Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-08 15:28:43 -07:00
Vladimir Krivopalov	e739bb3280	Check Filter.db in SSTables 3.x write tests. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-08 15:28:35 -07:00
Vladimir Krivopalov	0f37c0e684	Support Bloom filter format used in SSTables 3.0. The two hash values, base and increment, used to produce indices for setting bits in the filter, have been swapped in SSTables 3.0. See CASSANDRA-8413 for details. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-08 15:28:27 -07:00
Vladimir Krivopalov	fe2358e8bd	Remove unused overload of i_filter::get_filter(). Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-08 15:28:18 -07:00
Calle Wilund	b2b1a1f7e1	database: Fix assert in truncate Fixes crash in cql_tests.StorageProxyCQLTester.table_test "avoid race condition when deleting sstable on behalf..." changed discard_sstables behaviour to only return rp:s for sstables owned and submitted for deletion (not all matching time stamp), which can in some cases cause zero rp returned. Message-Id: <20180508070003.1110-1-calle@scylladb.com>	2018-05-08 22:29:21 +01:00
Vlad Zolotarov	48c96d09d6	db::hints::manager: drain hints when the node is decommissioned/removed When a node is decommissioned/removed it will drain all its hints and all remote nodes that have hints to it will drain their hints to this node. What does "drain" mean? The node that "drains" hints to a specific destination will ignore failures and will continue sending hints till the end of the current segment, erase it and move to the next one till there are no more segments left. After all hints are drained the corresponding hints directory is removed. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-05-08 22:29:21 +01:00
Vlad Zolotarov	ec76f8a27d	db::hints::manager: add a few more trace messages Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-05-08 22:29:21 +01:00
Vlad Zolotarov	6ede32156f	db::hints::manager::end_point_hints_manager::sender: add set_stopping()/stopping() methods It's nicer to have access methods instead of working directly with enum_set methods and values. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-05-08 22:29:21 +01:00
Vlad Zolotarov	94da744f37	db::hints::manager::end_point_hints_manager::stop(): log the last exception instead of forwarding it Returning a future with an exception from end_point_manager::stop() is practically useless because the best the caller can do is to log it and continue as if it didn't happen because it has other things to shut down. Therefore in order to simplify the caller we will log the exception if it happens and will always return a non-exceptional future. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-05-08 22:29:21 +01:00
Vlad Zolotarov	8aedbf9d18	db::hints: manager.hh: cleanup: fix the comments Fix the comments that went out of sync with the current implementation. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-05-08 22:29:21 +01:00
Vlad Zolotarov	5463b58faa	db::hints::manager: rework end_point_hints_manager::stop() to use seastar::async() This simplifies the code reading and extending. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-05-08 22:29:21 +01:00
Botond Dénes	6f7d919470	database: when dropping a table evict all relevant queriers Queriers shouldn't outlive the table they read from as that could lead to use-after-free problems when they are destroyed. Fixes: #3414 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <3d7172cef79bb52b7097596e1d4ebba3a6ff757e.1525716986.git.bdenes@scylladb.com>	2018-05-07 21:20:25 +03:00
Duarte Nunes	c053275a48	db/view/row_locking: Add timeout when waiting for the lock This ensures we respect the write timeout set by the client when applying base writes, in case a write takes too long to acquire the row lock for the read-before-write phase of a materialized view update. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180507132755.8751-1-duarte@scylladb.com>	2018-05-07 18:22:39 +01:00
Duarte Nunes	113294074d	Merge seastar upstream * seastar ac02df7...840002c (20): > dpdk: protect against missing statistics > alien: make visible in documentation > Merge "rewrite iotune to conform to the new ioscheduler" from Glauber > app_template: Correct outdated comment > apps, tests: Catch polymorphic exceptions by reference > configure.py: Enhance detection for gcc -fvisibility=hidden bug > reactor: add rudimentary task histogram reporting > Revert "Merge "rewrite iotune to conform to the new ioscheduler" from Glauber" > Merge "rewrite iotune to conform to the new ioscheduler" from Glauber > build: Use the same warning name for Clang and GCC > core/rwlock: Add support for timeouts > fs qualification: protect against EINTR > Docker: Fix failing build due to missing GNU make > reactor: move optional to experimental so we compile with c++14 > future: remove allocation from future::get() thread context switch > Merge "rpc streaming" from Gleb > reactor: put mountpoint_params in seastar namespace > Tutorial: in PDF version of tutorial, better backtick typesetting > tutorial: support, and start using, links to other sections > tutorial: improve second half of semaphores section Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-05-07 18:22:39 +01:00
Avi Kivity	368e15a8e2	Update scylla-ami submodule * dist/ami/files/scylla-ami 8a6e4dd...e0b35dc (1): > change default roles for EBS / ephemeral	2018-05-07 12:34:04 +03:00
Duarte Nunes	4b3562c3f5	db/view: Limit number of pending view updates This patch adds a simple and naive mechanism to ensure a base replica doesn't overwhelm a potentially overloaded view replica by sending too many concurrent view updates. We add a semaphore to limit to 100 the number of outstanding view updates. We limit globally per shard, and not per destination view replica. We also limit statically. Refs #2538 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180426134457.21290-2-duarte@scylladb.com>	2018-05-07 11:25:27 +03:00
Duarte Nunes	2be75bdfc9	db/timeout_clock: Properly scope type names Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180426134457.21290-1-duarte@scylladb.com>	2018-05-07 11:24:41 +03:00
Nadav Har'El	c93b56034d	tests: improve usability of cql_assertions.hh error messages The functions in cql_assertions.hh are very convenient, but have one frustrating drawback: When you have many of those assertions in one test, it's very hard to know which of the similar assertions failed. The problem is that an error often looks like this: unknown location(0): fatal error: in "test_many_columns": std::runtime_error: Expected 2 row(s) but got 0 tests/cql_assertions.cc(131): last checkpoint Which of the many similar checks in "test_many_columns" failed? Note the unhelpful "unknown location" and also the "last checkpoint" points to code in cql_assertions.cc, not in the actual test, so it is useless. The root cause of these problems is that the Boost macros use the C preprocessor __FILE__ and __LINE__, which, inside actual C++ functions like is_rows(), record that function's location instead of the caller's. Fixing this will not be simple. But this patch has a much simpler solution - fixing the "last checkpoint". What ruins the last checkpoint is the use of BOOST_REQUIRE inside the cql_assertions.cc is_rows() - when that succeeds, it records the location inside cql_assertions.cc (!) as the last success. If we just replace BOOST_REQUIRE by our own test (just like in the rest of the cql_assertions.cc code), this code will not override the last checkpoint. The user can see the last real successful BOOST_REQUIRE, or use BOOST_TEST_PASSPOINT() to set his own checkpoints between different parts of the same test. After this patch, and with adding BOOST_TEST_PASSPOINT() calls between different parts of my test, the failure above now looks like: unknown location(0): fatal error: in "test_many_columns": std::runtime_error: Expected 2 row(s) but got 0 tests/secondary_index_test.cc(299): last checkpoint The "last checkpoint" now shows me exactly where my failing check was. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180501152638.26238-1-nyh@scylladb.com>	2018-05-07 09:19:45 +01:00
Duarte Nunes	eabe471ce8	tests/secondary_index_test: Don't catch polymorphic exceptions by value Don't slice exceptions by catching them by value. Instead of catching by reference, use assert_that_failed(). Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180506153745.4512-1-duarte@scylladb.com>	2018-05-06 18:53:40 +03:00
Duarte Nunes	ab5a45b00c	Merge 'Improve debuggability of result_message' from Avi "This patchset adds ostream operators to result_message and uses them in cql_assertions." * tag 'result_message-print/v1.1' of https://github.com/avikivity/scylla: tests: cql_assersions: improve error message when a row is not found transport: add ostream support to result_message transport: const correctness for result_message::accept()	2018-05-06 14:52:56 +01:00
Avi Kivity	6d3fb69827	tests: cql_assersions: improve error message when a row is not found Display the row and the result set.	2018-05-06 16:28:37 +03:00
Avi Kivity	07d69ebce2	transport: add ostream support to result_message Allow printing result_message:s for debugging.	2018-05-06 16:28:35 +03:00
Avi Kivity	50d4d01cb7	tests: fix view_schema_test cql_assertion types Use utf8_type where warranted. Fixes view_schema_test failure where the rows did not match. I don't understand exactly why the failure happened (using the wrong type should not cause a failure here), but the change fixes the problem. Tests: view_schema_test (release) Message-Id: <20180506130015.7450-1-avi@scylladb.com>	2018-05-06 14:25:22 +01:00
Avi Kivity	31f2b3ce15	transport: const correctness for result_message::accept() The visitor does not alter the result_message it is visiting (and its signature indicates that) so accept() should be const-qualified to indicate that and to allow visiting const result_message:s.	2018-05-06 15:51:48 +03:00
Avi Kivity	cc900c23a6	Merge "Write Statistics.db in SSTables 3.x format." from Vladimir " This patchset adds support for writing Statistics.db in the SSTables 'mc' (3.x) format. This file is essential for reading data stored in Data.db as it contains base values used for delta encoding and types of columns. This patchset also fixes several bugs found in writing data and index files as well as bugs in a statistics-related structure definition. Tests: unit {debug, release} All SSTables files for write unit tests are validated to be processed by sstabledump and output is verified to show the expected data. " * 'projects/sstables-30/write-statistics/v1' of https://github.com/argenet/scylla: Add test covering the composite partition key case. Add Statistics.db files to write tests for SSTables 3.0. Do not check rows and cells for expiration when writing them to the data file. Fix promoted index serialization. Fix the order of items in stats_metadata. Fix timestamp_epoch value which was truncated on exceeding int32_t type limit. Write serialization header to Statistics.db for SSTables 3.x. Do not pass schema to metadata_collector::update(column_stats) Collect metadata statistics when writing SSTables 3.0. Call get_metadata_collector() instead of referencing sstable::_collector directly. Fix logic of writing TTLed cells in SSTable 3.0 format. Separate statistics for count of cells, columns and rows in column_stats. Deserialize collection in a way that doesn't incur shared_ptr counter increment and is generally shorter. Track both min & max values for timestamp, TTL and local deletion time in metadata_collector. Add class for tracking both extremum values (min and max) on updates.	2018-05-05 16:53:08 +03:00
Vladimir Krivopalov	4ecb3a5e2a	Add test covering the composite partition key case. Mainly to check that the composite type is properly serialized when writing serialization header to Statistics.db. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-04 15:45:11 -07:00
Vladimir Krivopalov	1b3989adcd	Add Statistics.db files to write tests for SSTables 3.0. For these tests to work, all time-related values are now fixed as these are stored in Statistics.db files. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-04 15:45:11 -07:00
Vladimir Krivopalov	293ee6ae3f	Do not check rows and cells for expiration when writing them to the data file. Although this logic may be seen as a useful optimization, it hinders unit tests writing SSTables 3.0 as those need to have fixed time-related values to produce Statistics.db files with the same content on each run. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-04 15:45:11 -07:00
Vladimir Krivopalov	44bc0f1493	Fix promoted index serialization. There is a new field introduced in the SSTables 3.0 index file format named 'partition_header_length' that can be used to skip over to the first clustering row in a wide partition. This one has not been previously written and caused malformed indices. Updated the corresponding test to include a static row and write multiple wide partitions. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-04 15:45:10 -07:00
Vladimir Krivopalov	56ac941a2e	Fix the order of items in stats_metadata. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-04 15:45:10 -07:00
Vladimir Krivopalov	926cdc6d70	Fix timestamp_epoch value which was truncated on exceeding int32_t type limit. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-04 15:45:10 -07:00
Vladimir Krivopalov	5db6002720	Write serialization header to Statistics.db for SSTables 3.x. Serialization header is a new component in Statistics.db introduced in SSTables 3.0 ('ma') format. It is essential for reading the data file as it contains the base values used for delta-encoded values (timestamps, TTLs, local deletion times) and description of column types. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-04 15:43:17 -07:00
Vladimir Krivopalov	6e4601d177	Do not pass schema to metadata_collector::update(column_stats) Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-04 15:22:32 -07:00
Vladimir Krivopalov	a10ad6b623	Collect metadata statistics when writing SSTables 3.0. Track min/max timestamps, TTLs, local deletion times and count of cells, columns and rows. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>	2018-05-04 15:22:30 -07:00
Raphael S. Carvalho	abcfc19fe9	db: make compaction slightly faster by not using filtering reader on unshared sstable After reboot, all existing sstables are considered shared. That's a safe default. Reader used by compaction decides to use filtering reader (filters out data that doesn't belong to this shard) if sstable is considered shared even though it may actually be unshared. By avoiding filtering reader we're avoiding an extra check for each key, and that may be meaningful for compaction of tons of small partitions and even range reads of such. We do so by fixing sstable::_shared, which is now set properly for existing sstables at start. quick check using microbenchmark which extends perf_sstable with compaction mode: before: 69407.61 +- 37.03 partitions / sec (30 runs, 1 concurrent ops) after: 70161.09 +- 40.35 partitions / sec (30 runs, 1 concurrent ops) Fixes #3042. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20180504182158.21130-1-raphaelsc@scylladb.com>	2018-05-04 19:34:09 +01:00
Raphael S. Carvalho	b65bc511fe	sstables/compaction_manager: log user initiated compaction Sometimes it's hard to figure out from the log whether the user ran a major compaction. Fixes #1303. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20180504181047.20277-1-raphaelsc@scylladb.com>	2018-05-04 19:15:58 +01:00
Duarte Nunes	7916368df8	Merge "Introduce system.large_partitions table" from Piotr " This series introduces a system.large_partitions table, used to gather information on largest partitions in the cluster. Schema below allows easy extraction of most offending keys and removal by sstable name, which happens when a table is compacted away. Schema: ( keyspace_name text, table_name text, sstable_name text, partition_size bigint, key text, compaction_time timestamp, PRIMARY KEY((keyspace_name, table_name), sstable_name, partition_size, key) ) WITH CLUSTERING ORDER BY (partition_size DESC); " Closes #3292. * 'large_partition_table_3' of https://github.com/psarna/scylla: database, sstables, tests: add large_partition_handler db: add large_partition_handler interface with implementations docs: init system_keyspace entry with system.large_partitions db: add system.large_partitions table	2018-05-04 18:18:50 +01:00
Piotr Sarna	bc019205b3	schema: fix typos in a comment Message-Id: <2b2a169e8a511fa9e0e1556ac7559ce9bef896e1.1525431353.git.sarna@scylladb.com>	2018-05-04 15:26:51 +01:00
Piotr Sarna	fe02c3d0e2	database, sstables, tests: add large_partition_handler This commit makes database, sstables and tests aware of which large_partition_handler they use. Proper large_partition_handler is retrievable from config information and is based on existing compaction_large_partition_warning_threshold_mb entry. Right now CQL TABLE variant of large_partition_handler is used in the database. Tests use a NOP version of large_partition_handler, which does not depend on CQL queries at all.	2018-05-04 14:38:13 +02:00
Piotr Sarna	14b3c7e7e7	db: add large_partition_handler interface with implementations This commit introduces large_partition_handler class, which can be used to take additional action when large partitions are written. It comes with two implementations: * NOP, used in tests, which does nothing on large partition update/delete * CQL TABLE, which inserts/deletes information on particular sstable to system.large_partitions table, in order to be retrievable from cqlsh later. References #3292	2018-05-04 12:46:31 +02:00
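
Commit 0f37c0e684 above says that the two hash values used to derive Bloom filter bit indices, base and increment, were swapped in SSTables 3.0. The effect of such a swap can be illustrated with a minimal double-hashing sketch. This is not Scylla's code: `filter_format`, `bit_indices` and all parameter names are hypothetical, and which input hash plays which role in each format is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical format tag; not a Scylla type.
enum class filter_format { legacy, sstables_3x };

// Derive `hash_count` bit positions from two 64-bit hashes by double hashing:
// position i is (base + i * inc) mod bit_count. Assumption: the legacy format
// uses h0 as the base and h1 as the increment, and the 3.x format swaps them.
// `bit_count` must be non-zero.
std::vector<uint64_t> bit_indices(uint64_t h0, uint64_t h1, filter_format fmt,
                                  unsigned hash_count, uint64_t bit_count) {
    uint64_t base = h0;
    uint64_t inc = h1;
    if (fmt == filter_format::sstables_3x) {
        std::swap(base, inc);
    }
    std::vector<uint64_t> out;
    out.reserve(hash_count);
    for (unsigned i = 0; i < hash_count; ++i) {
        out.push_back(base % bit_count);
        base += inc;  // unsigned overflow wraps around, which is well defined
    }
    return out;
}
```

With the same pair of hashes the two formats generally set different bits, so a filter written in one format cannot be probed with the other format's index order. This is why a reader or writer would need to pick the order according to the SSTable version.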

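Commit 16319c2036 above describes stripping CV qualifiers and references from the type deduced for a transformed range. A minimal sketch of that idea follows, assuming a simplified stand-in type `simple_range` (hypothetical, not Scylla's `wrapping_range` or `nonwrapping_range`) whose `transform()` deduces the new bound type from the transformer's return type and then cleans it.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical, simplified range with two plain bounds; for illustration only.
template <typename T>
struct simple_range {
    T start;
    T end;

    // U is the transformer's return type with any reference and const/volatile
    // qualifiers removed, so a transformer returning `const X&` yields
    // simple_range<X> rather than a range of references.
    template <typename Transformer,
              typename U = std::remove_cv_t<std::remove_reference_t<
                  decltype(std::declval<Transformer&>()(std::declval<const T&>()))>>>
    simple_range<U> transform(Transformer t) const {
        return simple_range<U>{t(start), t(end)};
    }
};
```

For example, transforming a `simple_range<std::pair<int, int>>` with a lambda that returns `const int&` to the first member produces a `simple_range<int>` holding copies of the values.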

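The "Write Statistics.db in SSTables 3.x format" merge (cc900c23a6) lists a class for tracking both extremum values (min and max) on updates, used for timestamps, TTLs and local deletion times. A rough sketch of what such a tracker could look like is given here. The name `extremum_tracker_sketch` and its interface are hypothetical and are not taken from the Scylla sources.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical tracker that remembers the smallest and largest value seen.
template <typename T>
class extremum_tracker_sketch {
    T _min{};
    T _max{};
    bool _seen = false;  // false until the first update() call
public:
    void update(T value) {
        if (!_seen) {
            // The first value is both the minimum and the maximum.
            _min = value;
            _max = value;
            _seen = true;
            return;
        }
        _min = std::min(_min, value);
        _max = std::max(_max, value);
    }
    bool empty() const { return !_seen; }
    T min() const { return _min; }
    T max() const { return _max; }
};
```

Tracking both ends in one object means a writer would only need a single `update()` call per observed value, instead of maintaining separate minimum and maximum fields.
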
15289 Commits