scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 18:10:39 +00:00

Author	SHA1	Message	Date
Botond Dénes	dec4e5659b	test/boost/mutation_test: simplify test_compaction_data_stream_split test This test has very elaborate infrastructure essentially duplicating mutation, mutation::apply() and mutation::operator==. Drop all this extra code and use mutations directly instead. This makes migrating the test to v2 easier.	2022-02-21 12:29:24 +02:00
Botond Dénes	2941803da0	mutation_partition: do_compact(): do drop row tombstones covered by higher order tombstones The comment on the public methods calling said method promises to do so but doesn't actually follow through. This patch fixes this for row tombstones, to mirror the behaviour of the mutation compactor. This is especially important for tests that compare mutations compacted with different methods.	2022-02-21 12:29:24 +02:00
Botond Dénes	f2e2b84038	multishard_mutation_query: migrate to v2 Mostly mechanical transformation. The main difference is in the detached compaction state, from which we now get the range tombstone change, instead of the range tombstone list. The code around this is a bit awkward, will become simpler when compactor drops v1 support.	2022-02-21 12:29:24 +02:00
Botond Dénes	b330cba792	mutation_fragment_v2: range_tombstone_change: add memory_usage()	2022-02-21 12:29:24 +02:00
Botond Dénes	9e48237b86	evictable_reader_v2: terminate active range tombstones on reader recreation Reader recreation messes with the continuity of the mutation fragment stream because it breaks snapshot isolation. We cannot guarantee that a range tombstone or even the partition started before will continue after too. So we have to make sure to wrap up all loose threads when recreating the reader. We already close uncontinued partitions. This commit also takes care of closing any range tombstone started by unconditionally emitting a null range tombstone. This is legal to do, even if no range tombstone was in effect.	2022-02-21 12:29:24 +02:00
Botond Dénes	6db08ddeb2	evictable_reader_v2: restore handling of non-monotonically increasing positions We thought that unlike v1, v2 will not need this. But it does. Handled similarly to how v1 did it: we ensure each buffer represents forward progress, when the last fragment in the buffer is a range tombstone change: * Ensure the content of the buffer represents progress w.r.t. _next_position_in_partition, thus ensuring the next time we recreate the reader it will continue from a later position. * Continue reading until the next (peeked) fragment has a strictly larger position. The code is just much nicer because it uses coroutines.	2022-02-21 12:29:24 +02:00
Botond Dénes	498d03836b	evictable_reader_v2: simplify handling of reader recreation The evictable reader has a handful of flags dictating what to do after the reader is recreated: what to validate, what to drop, etc. We actually need a single flag telling us if the reader was recreated or not; all other things can be derived from existing fields. This patch does exactly that. Furthermore it folds do_fill_buffer() into fill_buffer() and replaces the awkward-to-use `should_drop_fragment()` with `examine_first_fragments()`, which does a much better job of encapsulating all validation and fragment dropping logic. This code reorganization also fixes two bugs introduced by the v2 conversion: * The loop in `do_fill_buffer()` could become infinite in certain circumstances due to a difference between the v1 and v2 versions of `is_end_of_stream()`. * The position of the first non-dropped fragment was not validated (this was integrated into the range tombstone trimming which was thrown out by the conversion).	2022-02-21 12:29:24 +02:00
Botond Dénes	d4ac473f7d	mutation: counter_write_query: use v2 reader	2022-02-21 12:27:55 +02:00
Botond Dénes	fcda35d08e	mutation: migrate consume() to v2 The underlying mutation format is still v1, so consume() ends up doing an online conversion. This allows converting all downstream code to v2, leaving the conversion close to the code that is yet to be migrated to v2 native: the mutation itself.	2022-02-21 12:27:55 +02:00
Botond Dénes	1fa6537a2f	mutation_fragment_v2,flat_mutation_reader_v2: mirror v1 concept organization Currently all concepts are in mutation_fragment_v2.hh and flat_mutation_reader_v2.hh. Organize concepts similar to how the v1 ones are: move high-level consume concepts into mutation_consumer_concepts.hh.	2022-02-21 12:27:55 +02:00
Botond Dénes	fb0e0ec7c1	mutation_reader: compacting_reader: require a v2 input reader Before we add a v2 output option to the compactor, we want to get rid of all the v1 inputs to make it simpler. This means that for a while the compacting reader will be in a strange place of having a v2 input and a v1 output. Hopefully, not for long.	2022-02-21 12:27:55 +02:00
Botond Dénes	45b36d91c6	db/view/view_builder: use v2 reader	2022-02-21 12:27:55 +02:00
Botond Dénes	bba20f5cce	test/lib/flat_mutation_reader_assertions: adjust has_monotonic_positions() to v2 spec The v2 spec allows for non-strictly monotonically increasing positions, but has_monotonic_positions() tried to enforce it. Relax the check so it conforms to the spec.	2022-02-21 12:27:55 +02:00
Nadav Har'El	f292d3d679	alternator: make schema modifications in CreateTable atomic The Alternator CreateTable operation currently performs several schema-changing operations separately - one by one: It creates a keyspace, a table in that keyspace and possibly also multiple views, and it sets tags on the table. A consequence of this is that concurrent CreateTable and DeleteTable operations (for example) can result in unexpected errors or inconsistent states - for example CreateTable wants to create the table in the keyspace it just created, but a concurrent DeleteTable deleted it. We have two issues about this problem (#6391 and #9868) and three tests (test_table.py::test_concurrent_create_and_delete_table) reproducing it. In this patch we fix these problems by switching to the modern Scylla schema-changing API: Instead of doing several schema-changing operations one by one, we create a vector of schema mutations performing all these operations - and then perform all these mutations together. When the experimental Raft-based schema modifications are enabled, this completely solves the races, and the tests begin to pass. However, if the experimental Raft mode is not enabled, these tests continue to fail because there is still no locking while applying the different schema mutations (not even on a single node). So I put a special fixture "fails_without_raft" on these tests - which means that the tests xfail if run without Raft, and are expected to pass when run with Raft. Indeed, after this patch test/alternator/run --raft test_table.py::test_concurrent_create_and_delete_table shows three passing tests (they also pass if we drastically increase the number of iterations), while test/alternator/run test_table.py::test_concurrent_create_and_delete_table shows three xfailing tests. All other Alternator tests pass as before with this patch, verifying that the handling of new tables, new views, tags, and CDC log tables, all happen correctly even after this patch.
A note about the implementation: Before this patch, the CreateTable code used high-level functions like prepare_new_column_family_announcement(). These high-level functions become unusable if we write multiple schema operations to one list of mutations, because for example this function validates that the keyspace had already been created - when it hasn't and that's the whole point. So instead we had to use lower-level functions like add_table_or_view_to_schema_mutation() and before_create_column_family(). However, despite being lower level, these functions were public so I think it's reasonable to use them, and we probably have no other alternative. Fixes #6391 Fixes #9868 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-02-18 09:03:52 +02:00
Nadav Har'El	46120ca4f4	Merge 'tools/scylla-sstable: change output of dump commands to JSON' from Botond Dénes Replacing the previous text output with the exception of the dump-data command. The text output was supposed to be human-friendly but it is not really human friendlier than a well formatted JSON, the latter having the additional advantage of being machine friendly too. Although the text output already exists, having just one output format makes the code much simpler and easier to maintain so we chose not to pay the higher maintenance price for a format that is not expected to see much (if any) use. Although the JSON written by the tool is not formatted, it can easily be formatted by e.g. piping it through `jq`. The latter also allows lookup of specific field(s). The JSON schema of each command is documented in the --help output of the respective command (e.g. scylla sstable data-dump --help) . We keep the text output of the dump-data command as this is using scylla's built-in printer that we also use in logging and tests. Some people might be used to this format, so leave it in: the code already exists for it and lives in scylla core, so we don't need to maintain it separately. The default output-format of dump-data is now JSON. A smoke test suite is added for the dump commands too. The tests only check that some output is present and that it is valid JSON. Refs: #9882 Tests: unit(dev) Also on: https://github.com/denesb/scylla.git scylla-sstable-json/v2 Changelog v3: * Rebase on recent master (which has the required seastar fixes for debug tests) v2: * Document the JSON schema of each command. * Use the SAX-style API of rapidjson to generate streaming JSON, instead of hand-generating it. 
Closes #10074 * github.com:scylladb/scylla: test/cql-pytest: add tests for scylla-sstable's dump commands test/cql-pytest: prepare for tool tests tools/schema_loader: auto-create the keyspace for all statements tools/scylla-sstable: change output of dump-scylla-metadata to json tools/scylla-sstable: change output of dump-statistics to json tools/scylla-sstable: change output of dump-summary to json tools/scylla-sstable: change output of dump-compression-info to json tools/scylla-sstable: change output of dump-index to json tools/scylla-sstable: add json support in --dump-data tools/scylla-sstable: add json_writer tools/scylla-sstable: use fmt::print in --dump-data tools/scylla-sstable: prepare --dump-data for multiple output formats	2022-02-18 07:52:19 +02:00
Botond Dénes	1e038b40cf	test/cql-pytest: add tests for scylla-sstable's dump commands The tests are smoke-tests: they mostly check that scylla doesn't crash while dumping and it produces some output. When dumping json, the test checks that it is valid json.	2022-02-17 15:24:24 +02:00
Botond Dénes	afab1a97c6	test/cql-pytest: prepare for tool tests We want to add tool tests. These tests will have to invoke scylla executable (as tools are hosted by the latter) and they want access to the scylla data directories. Propagate the scylla path and data directory used from `run` into the test suite via pytest request parameters.	2022-02-17 15:24:24 +02:00
Botond Dénes	96082631c8	tools/schema_loader: auto-create the keyspace for all statements Currently the keyspace is only auto-created for create type statements. However the keyspace is needed even without UDTs being involved: for example if the table contains a collection type. So auto-create the keyspace unconditionally before preparing the first statement. Also add a test-case with a create table statement which requires the keyspace to be present at prepare time.	2022-02-17 15:24:24 +02:00
Botond Dénes	59ce247164	tools/scylla-sstable: change output of dump-scylla-metadata to json	2022-02-17 15:24:24 +02:00
Botond Dénes	2a7ed8212f	tools/scylla-sstable: change output of dump-statistics to json	2022-02-17 15:24:24 +02:00
Botond Dénes	a617e66878	tools/scylla-sstable: change output of dump-summary to json	2022-02-17 14:17:11 +02:00
Botond Dénes	fb6b7c8036	tools/scylla-sstable: change output of dump-compression-info to json	2022-02-17 14:17:11 +02:00
Botond Dénes	f5c6d7e12e	tools/scylla-sstable: change output of dump-index to json	2022-02-17 14:17:11 +02:00
Botond Dénes	bdbbda29c1	tools/scylla-sstable: add json support in --dump-data But keep the old text output-format too. One can switch between the two with the --output-format flag, which defaults to "json".	2022-02-17 14:17:11 +02:00
Botond Dénes	03bbf1b362	tools/scylla-sstable: add json_writer Wraps a rapidjson::Writer<> and mirrors the latter's API, providing more convenient overloads for the Key() and String() methods, as well as providing some extra, scylla-sstable specific methods too.	2022-02-17 14:17:11 +02:00
Botond Dénes	72f27c8782	tools/scylla-sstable: use fmt::print in --dump-data The rest of the code is standardizing on fmt::print(); bring the code for --dump-data in line.	2022-02-17 14:17:11 +02:00
Botond Dénes	ba2a61b2bc	tools/scylla-sstable: prepare --dump-data for multiple output formats Extract the actual dumping code into a separate class, which also implements sstable_consumer interface. The dumping consumer now just forwards calls to actual dumper through the abstract consumer interface, allowing different concrete dumpers to be instantiated.	2022-02-17 14:17:11 +02:00
Piotr Dulikowski	adfd9d2f7a	abstract_read_resolver::fail_request: make non-virtual This method is not overridden by any of the derived classes, so it does not need to be virtual. (cherry picked from commit b7fb93dc46531bca8db535301a069df52991f9d9)	2022-02-17 12:34:37 +02:00
Michael Livshin	f8d4bafa5a	to_string.hh: include <map> The code uses `std::map`, so it should include the definition explicitly. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-17 08:53:48 +02:00
Michael Livshin	a657dc9787	scylla-gdb.py: set source language to c++ When you interrupt a process in gdb using Ctrl-C or attach gdb to a running process, usually gdb will show the current frame as `syscall()` (no source information). But in some less usual setups gdb may happen to know that `syscall()` is implemented in assembly, and even knows which line is current in which assembly file. An unfortunate effect of gdb knowing that the current frame's source language is assembly is that since assembly is not C++, gdb's expression parser switches to "auto" while in the `syscall()` stack frame. And in the "auto" language explicit C++ global namespace references like "::debug::the_database" are not syntactically valid, which renders much of scylla-gdb.py unusable unless you remember to go up the call stack before doing anything. But since scylla-gdb.py is there to help debug Scylla, and Scylla is written in C++, we can just set gdb source language to "c++" and avoid the problem. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com> Message-Id: <20220216235301.1206341-1-michael.livshin@scylladb.com>	2022-02-17 08:43:59 +02:00
Botond Dénes	948bc359c2	Merge "ME sstable format support" from Michael Livshin " This series implements support for the ME sstable format (introduced in C* 3.11.11). Tests: unit(dev) " * tag 'me-sstable-format-v5' of https://github.com/cmm/scylla: sstables: validate originating host id sstable: add is_uploaded() predicate config: make the ME sstable format default scylla-gdb.py: recognize ME sstables sstables: store originating host id in stats metadata system_keyspace: cache local host id before flushing database_test: ensure host id continuity sstables_manager: add get_local_host_id() method and support sstables_manager: formalize inheritability system_keyspace, main: load (or create) local host id earlier sstable_3_x_test: test ME sstable format too add "ME_SSTABLE" cluster feature add "sstable_format" config add support for the ME sstable format scylla-sstable: add ability to dump optionals and utils::UUID sstables: add ability to write and parse optionals globalize sstables::write(..., utils::UUID)	2022-02-16 18:28:16 +02:00
Michael Livshin	79bf79ebd3	sstables: validate originating host id Add an additional sstable validation step to check that originating host id matches the local host id. This is only done for ME-and-up sstables, which do not come from upload/, and when the local host id is known. When local host id is unknown, check that the sstable belongs to a system keyspace, i.e. whether it is plausible that Scylla is still booting up and hasn't loaded/generated the local host id yet. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	3511d7cd21	sstable: add is_uploaded() predicate Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	3bf1e137fc	config: make the ME sstable format default Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	0ca58096cf	scylla-gdb.py: recognize ME sstables Also use the opportunity to unify two closely-related lists into a dictionary. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	dd4e330cc5	sstables: store originating host id in stats metadata With this change, ME sstables start carrying their originating host id, which makes ME format feature-complete so it can be made default. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	0ccd56e036	system_keyspace: cache local host id before flushing Later in this series the ME sstable format is made default, which means that `system.local` will likely be written as ME. Since, in ME, originating host id is a part of sstable stats metadata, the local host id needs to either already be cached by the time `system.local` is flushed, or to somehow be special-case-ignored when flushing `system.local`. The former (done here) is optimistic (cache before flush), but the alternative would be an abstraction violation and would also cost a little time upon each sstable write. (Cache-before-flush could be undone by catching any exceptions during flush and un-caching, but inability to `co_await` in catch clauses makes the code look rather awkward. And there is no need to bother because bootstrap failures should be fatal anyway) Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	d8cc535297	database_test: ensure host id continuity The "populate_from_quarantine_works" test case creates sstables with one db config, then reads them with another. Ensure that both configs have the same host id so the sstables pass validation. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	3fef604075	sstables_manager: add get_local_host_id() method and support Since ME sstable format includes originating host id in stats metadata, local host id needs to be made available for writing and validation. Both Scylla server (where local host id comes from the `system.local` table) and unit tests (where it is fabricated) must be accommodated. Regardless of how the host id is obtained, it is stored in the db config instance and accessed through `sstables_manager`. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	0895188851	sstables_manager: formalize inheritability The class is already inherited from in tests (along with overriding a non-virtual method), so this seems to be called for. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	7d2af177eb	system_keyspace, main: load (or create) local host id earlier We want it to be cached before any sstable is written, so do it right after system_keyspace::minimal_setup(). Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	387c882dc7	sstable_3_x_test: test ME sstable format too Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	d370558279	add "ME_SSTABLE" cluster feature Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	0b1447c702	add "sstable_format" config Initialize it to "md" until ME format support is complete (i.e. storing originating host id in sstable stats metadata is implemented), so at present there is no observable change by default. Also declare "enable_sstables_md_format" unused -- the idea, going forward, being that only "sstable_format" controls the written sstable file format and that no more per-format enablement config options shall be added. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	c96708d262	add support for the ME sstable format The ME format has been introduced in Cassandra 3.11.11: `11952fae77/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java (L123)` `d84c6e9810` It adds originating host id to sstable metadata in support of fixing loss of commit log data when moving sstables between nodes: https://issues.apache.org/jira/browse/CASSANDRA-16619 In Scylla: * The supported way to ingest sstables is via upload/, where stored commit log replay position should be disregarded (but see https://github.com/scylladb/scylla/issues/10080). * A later commit in this series implements originating host id validation for native ME sstables. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	3712a82ca7	scylla-sstable: add ability to dump optionals and utils::UUID Needed for the ME sstable format. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:24 +02:00
Michael Livshin	26bae0cd39	sstables: add ability to write and parse optionals (that is, instances of `std::optional`). The ME sstable format includes optional originating host id in stats metadata. We know how to write and parse uuids, but not how to write and parse optionals. The format is (used by C* in this case, and also happens to be consistent with how booleans are serialized): first a boolean indicating whether the contents are present (0 or 1, as a byte), then the contents (if any). Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:23 +02:00
Michael Livshin	c00d272b16	globalize sstables::write(..., utils::UUID) Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-02-16 18:21:23 +02:00
Benny Halevy	b7b0c19fdc	test: uuid: cement the assumption that default and null uuid are equal Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220216081623.830627-2-bhalevy@scylladb.com>	2022-02-16 10:19:47 +02:00
Benny Halevy	489e50ef3a	utils: uuid: make operator bool explicit Following up on `69fcc053bb` To prevent unintentional implicit conversions e.g. to a number. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220216081623.830627-1-bhalevy@scylladb.com>	2022-02-16 10:19:47 +02:00
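
The optional encoding described in commit 26bae0cd39 (a presence byte, 0 or 1, followed by the contents if any) can be sketched as follows. This is an illustrative Python sketch, not Scylla's C++ code: the function names are hypothetical, and writing the UUID as its 16 raw bytes is an assumption, since the commit message does not spell out the UUID layout.

```python
import struct
import uuid
from typing import Optional, Tuple


def write_optional_uuid(value: Optional[uuid.UUID]) -> bytes:
    """Serialize an optional UUID: one presence byte, then the contents if present."""
    if value is None:
        return struct.pack("B", 0)  # absent: a single 0 byte, no contents
    # present: a 1 byte followed by the UUID (assumed here: its 16 raw bytes)
    return struct.pack("B", 1) + value.bytes


def parse_optional_uuid(buf: bytes, offset: int = 0) -> Tuple[Optional[uuid.UUID], int]:
    """Parse an optional UUID starting at offset; return (value, next_offset)."""
    (present,) = struct.unpack_from("B", buf, offset)
    offset += 1
    if present == 0:
        return None, offset
    if present != 1:
        raise ValueError(f"invalid presence byte: {present}")
    # uuid.UUID raises ValueError if fewer than 16 bytes remain
    return uuid.UUID(bytes=buf[offset:offset + 16]), offset + 16
```

An absent value costs a single byte on disk, and a reader can skip over the field without knowing anything about the content type beyond its size.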


30167 Commits