scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 03:45:11 +00:00

Author	SHA1	Message	Date
Kefu Chai	3a67c31df0	compaction_manager: pass const reference to ctor the callers of the constructor do not move the variable into this parameter, and the constructor itself is not able to consume it, as the parameter is a vector while `compaction_sstable_registration` uses an `unordered_set` for tracking the sstables being compacted. so, to avoid creating a temporary copy of the vector, let's just pass by reference. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14661	2023-07-13 11:19:44 +03:00
Petr Gusev	3737bf8fa2	topology.cc: unindex_node: _dc_racks removal fix The eps reference was reused to manipulate the racks dictionary. This resulted in assigning a set of nodes from the racks dictionary to an element of the _dc_endpoints dictionary. The problem was demonstrated by the dtest test_decommission_last_node_in_rack (scylladb/scylla-dtest#3299). The test set up four nodes, three on one rack and one on another, all within a single data center (dc). It then switched to a 'network_topology_strategy' for one keyspace and tried to decommission the single node on the second rack. This decommission command failed with the error message 'zero replica after the removal'. This happened because unindex_node assigned the empty list from the second rack as a value for the single dc in the _dc_endpoints dictionary. As a result, we got an empty nodes list for the single dc in natural_endpoints_tracker::_all_endpoints, node_count == 0 in data_center_endpoints, _rf_left == 0, so network_topology_strategy::calculate_natural_endpoints rejected all the endpoints and returned an empty endpoint_set. In repair_service::do_decommission_removenode_with_repair this caused the 'zero replica after the removal' error. With this fix the test passes both with the --consistent-cluster-management option and without it. A specific unit test for this problem was added. Fixes: #14184 Closes #14673	2023-07-13 11:16:01 +03:00
Asias He	1b577e0414	repair: Release permit earlier when the repair_reader is done Consider - 10 repair instances take all the 10 _streaming_concurrency_sem - repair readers are done but the permits are not released since they are waiting for view update _registration_sem - view updates trying to take the _streaming_concurrency_sem to make progress of view update so it could release _registration_sem, but it could not take _streaming_concurrency_sem since the 10 repair instances have taken them - deadlock happens Note, when the readers are done, i.e., reaching EOS, the repair reader replaces the underlying (evictable) reader with an empty reader. The empty reader is not evictable, so the resources cannot be forcibly released. To fix, release the permits manually as soon as the repair readers are done even if the repair job is waiting for _registration_sem. Fixes #14676 Closes #14677	2023-07-13 11:00:35 +03:00
Nadav Har'El	6a7d980a5d	docs/alternator: list more DynamoDB features not in Alternator This patch adds to docs/alternator/compatibility.md mentions of three recently-added DynamoDB features (ReturnValuesOnConditionCheckFailure, DeletionProtectionEnabled and TableClass) which Alternator does not yet support. Each of these mentions also links to the github issue we have on each feature - issues #14481, #14482 and #10431 respectively. During a review of this patch, the reviewers didn't like that I used words like "recent" and "new" to describe recently-added DynamoDB features, and asked that I use specific dates instead. So this is what I do in this patch for the new features - and I also went back and fixed a few pre-existing references to "recent" and "new" features, and added the dates. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14483	2023-07-13 09:52:08 +02:00
Kamil Braun	9d4b3c6036	test: use correct timestamp resolution in `test_group0_history_clearing_old_entries` In `10c1f1dc80` I fixed `make_group0_history_state_id_mutation` to use correct timestamp resolution (microseconds instead of milliseconds) which was supposed to fix the flakiness of `test_group0_history_clearing_old_entries`. Unfortunately, the test is still flaky, although now it's failing at a later step -- this is because I was sloppy and I didn't adjust this second part of the test to also use microsecond resolution. The test is counting the number of entries in the `system.group0_history` table that are older than a certain timestamp, but it's doing the counting using millisecond resolution, causing it to give results that are off by one sometimes. Fix it by using microseconds everywhere. Fixes #14653 Closes #14670	2023-07-13 10:33:52 +03:00
Kefu Chai	aeb160a654	sstables: use sstables_manager::uuid_stable_identifier() instead of accessing the `feature_service`'s member variable, use the accessor provided by sstable_manager. so we always access this setting via a single channel. this should help with the readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14658	2023-07-13 10:31:06 +03:00
Tomasz Grabiec	b7bc991aa1	Merge 'Fix `test_node_isolation` flakiness' from Kamil Braun The test isolates a node and then connects to it through CQL. The `connect()` step would often timeout on ARM debug builds. This was already dealt with in the past in the context of other tests: #11289. The `ManagerClient.con_gen` function creates a connection in a way that avoids the problem -- connection timeout settings are adjusted to account for the slowness. Use it in this test to fix the flakiness. At the same time, reduce the timeout used for the actual CQL request (after the driver has already connected), because the test expects this request to timeout and waiting for 200 seconds here is just a waste of time. Closes #14663 * github.com:scylladb/scylladb: test: test_node_isolation: use `ManagerClient.con_gen` to create CQL connection test: manager_client: make `con_gen` for `ManagerClient.__init__` nonoptional	2023-07-12 16:36:54 +02:00
Calle Wilund	890f1f4ad3	generic_server: Handle TLS error codes indicating broken pipe Fixes #14625 In broken pipe detection, handle also TLS error codes. Requires https://github.com/scylladb/seastar/pull/1729 Closes #14626	2023-07-12 16:04:33 +03:00
Botond Dénes	6a63abcb9f	Merge 'doc: fix broken links reported by the link checker' from Anna Stuchlik This PR fixes or removes broken links reported by an online link checker. Fixes https://github.com/scylladb/scylladb/issues/14488 Closes #14462 * github.com:scylladb/scylladb: doc: update the link to ABRT doc: fix broken links on the Scylla SStable page	2023-07-12 16:02:23 +03:00
Asias He	d3034e0fab	view_update_generator: Increase the registration_queue_size When repair writes a sstable to disk, we check if the sstable needs view update processing. If yes, the sstable will be placed into the staging dir for processing, with the _registration_sem semaphore to prevent too many pending unprocessed sstables. We have seen multiple cases in the field where view update processing is inefficient and way too slow, which prevents the base table repair from finishing on time. This patch increases the registration_queue_size to a bigger number to mitigate the problem that slow view update processing blocks repair. It is better to have a consistent base table + inconsistent view table than inconsistent base table + inconsistent view table. Currently, sstables in the staging dir are not compacted. So we cannot increase the _registration_sem by too big a number, to avoid accumulating too many sstables. The view_build_test.cc is updated to make the test pass. Closes #14241	2023-07-12 15:51:35 +03:00
Tomasz Grabiec	e8ee0a2f86	Merge 'group0_state_machine: use correct comparison for timeuuids in `merger`' from Kamil Braun In `d2a4079bbe`, `merger` was modified so that when we merge a command, `last_group0_state_id` is taken to be the maximum of the merged command's state_id and the current `last_group0_state_id`. This is necessary for achieving the same behavior as if the commands were applied individually instead of being merged -- where we take the maximum state ID from `group0_history` table which was applied until now (because the table is sorted using the state IDs and we take the greatest row). However, a subtle bug was introduced -- the `std::max` function uses the `utils::UUID` standard comparison operator which is unfortunately not the same as timeuuid comparison that Scylla performs when sorting the `group0_history` table. So in rare cases it could return the smaller of the two timeuuids w.r.t. the correct timeuuid ordering. This would then lead to commands being applied which should have been turned to no-ops due to the `prev_state_id` check -- and then, for example, permanent schema desync or worse. Fix it by using the correct comparison method. Fixes: #14600 Closes #14616 * github.com:scylladb/scylladb: utils/UUID: reference `timeuuid_tri_compare` in `UUID::operator<=>` comment group0_state_machine: use correct comparison for timeuuids in `merger` utils/UUID: introduce `timeuuid_tri_compare` for `const UUID&` utils/UUID: introduce `timeuuid_tri_compare` for `const int8_t*`	2023-07-12 14:48:18 +02:00
Botond Dénes	296837120d	db: move virtual tables into virtual_tables.cc The definitions of virtual tables make up approximately a quarter of the huge system_keyspace.cc file (almost 4K lines), pulling in a lot of headers only used by them. Move them to a separate source file to make system_keyspace.cc easier for humans and compilers to digest. This patch also moves the `register_virtual_tables()`, `install_virtual_readers()` as well as the `virtual_tables` global. Closes #14308	2023-07-12 15:26:54 +03:00
Anna Stuchlik	a414ac8fde	doc: update the link to ABRT	2023-07-12 14:13:42 +02:00
Kefu Chai	8f31f28446	build: cmake: add test/raft tests Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14656	2023-07-12 15:06:59 +03:00
Kamil Braun	820d7e9520	test: test_node_isolation: use `ManagerClient.con_gen` to create CQL connection The test isolates a node and then connects to it through CQL. The `connect()` step would often timeout on ARM debug builds. This was already dealt with in the past in the context of other tests: #11289. The `ManagerClient.con_gen` function creates a connection in a way that avoids the problem -- connection timeout settings are adjusted to account for the slowness. Use it in this test to fix the flakiness. At the same time, reduce the timeout used for the actual CQL request (after the driver has already connected), because the test expects this request to timeout and waiting for 200 seconds here is just a waste of time.	2023-07-12 12:34:02 +02:00
Kefu Chai	20c7b6057b	test: silence the deprecation warning. because `lw_shared_ptr::operator=(T&&)` was deprecated. we started to have the following warning: ``` /home/kefu/dev/scylladb/test/boost/statement_restrictions_test.cc:394:41: warning: 'operator=' is deprecated: call make_lw_shared<> and assign the result instead [-Wdeprecated-declarations] 394 \| definition.column_specification = std::move(specification); \| ^ /home/kefu/dev/scylladb/seastar/include/seastar/core/shared_ptr.hh:346:7: note: 'operator=' has been explicitly marked deprecated here 346 \| [[deprecated("call make_lw_shared<> and assign the result instead")]] \| ^ 1 warning generated. ``` so, in this change, we use the recommended way to update a lw_shared_ptr. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14648	2023-07-12 13:10:33 +03:00
Kamil Braun	3464877276	test: manager_client: make `con_gen` for `ManagerClient.__init__` nonoptional `ManagerClient` is given a function that is used to create CQL connections to the Scylla cluster. For some reason it was typed as `Optional` even though it was never passed `None`. Fix it.	2023-07-12 11:44:15 +02:00
Kefu Chai	5443bf69f7	storage_proxy: print the expected ex.what() before this change, the format string contains two placeholders, but only one extra argument is passed in. if we actually format this logging message, fmtlib would throw. after this change, we pass the exception's error message as yet another argument. this logging message is printed with "trace" level, guess that's why we haven't had the exception thrown by fmtlib. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14628	2023-07-12 12:34:51 +03:00
Nadav Har'El	a4087f58df	alternator: fix error path for size() function on constants The DynamoDB documentation for the size() function claims that it only works on paths (attribute names or references), but it actually works on constants from the query (e.g., ":val") as well. It turns out that Alternator supports this undocumented case already, but gets the error path wrong: Usually, when size() is calculated on the data, if the data has the wrong type of size() (e.g., an integer), the condition simply doesn't match. But if the value comes from the query - it should generate an error that the query is wrong - ValidationException. This patch fixes this case, and also adds tests for it that pass on both DynamoDB and Alternator (after this patch). Fixes #14592 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14593	2023-07-12 12:29:05 +03:00
Pavel Emelyanov	eb549234b0	scylla-gdb: Fix tables filtering There's a -k\|--keyspace argument to the tables command that's supposed to filter tables belonging to a specific keyspace, but it doesn't work. Fix it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #14634	2023-07-12 12:26:25 +03:00
Avi Kivity	0fc067a54c	build: add -Wimplicit-fallthrough to cmake In `0cabf4eeb9` ("build: disable implicit fallthrough"), we added -Wimplicit-fallthrough to configure.py, but forgot to add it to cmake. Closes #14629	2023-07-12 12:24:22 +03:00
Nadav Har'El	f08bc83cb2	cql-pytest: translate Cassandra's tests for CAST operations This is a translation of Cassandra's CQL unit test source file functions/CastFctsTest.java into our cql-pytest framework. There are 13 tests, 9 of them currently xfail. The failures are caused by one recently-discovered issue: Refs #14501: Cannot Cast Counter To Double and by three previously unknown or undocumented issues: Refs #14508: SELECT CAST column names should match Cassandra's Refs #14518: CAST from timestamp to string not same as Cassandra on zero milliseconds Refs #14522: Support CAST function not only in SELECT Curiously, the careful translation of this test also caused me to find a bug in Cassandra https://issues.apache.org/jira/browse/CASSANDRA-18647 which the test in Java missed because it made the same mistake as the implementation. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14528	2023-07-12 11:42:04 +03:00
Nadav Har'El	599636b307	test/alternator: fix flaky test test_ttl_expiration_gsi_lsi The Alternator test test_ttl.py::test_ttl_expiration_gsi_lsi was flaky. The test incorrectly assumes that when we write an already expired item, it will be visible for a short time until being deleted by the TTL thread. But this doesn't need to be true - if the test is slow enough, it may go look for the item after it was already expired! So we fix this test by splitting it into two parts - in the first part we write a non-expiring item, and notice it eventually appears in the GSI, LSI, and base-table. Then we write the same item again, with an expiration time - and now it should eventually disappear from the GSI, LSI and base-table. This patch also fixes a small bug which prevented this test from running on DynamoDB. Fixes #14495 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14496	2023-07-12 11:23:12 +03:00
Botond Dénes	968421a3e0	Merge 'Stop task manager compaction module properly' from Aleksandra Martyniuk Due to wrong order of stopping of compaction services, shutdown needs to wait until all compactions are complete, which may take really long. Moreover, test version of compaction manager does not abort task manager, which is strictly bounded to it, but stops its compaction module. This results in tests waiting for compaction task manager's tasks to be unregistered, which never happens. Stopping and aborting of compaction manager and task manager's compaction module are performed in a proper order. Closes #14461 * github.com:scylladb/scylladb: tasks: test: abort task manager when wrapped_compaction_manager is destructed compaction: swap compaction manager stopping order compaction: modify compaction_manager::stop()	2023-07-12 09:54:00 +03:00
Avi Kivity	118fa59ba8	tools: add cqlsh shortcut Add bin/cqlsh as a shortcut to tools/cqlsh/bin/cqlsh, intended for developers. Closes #14362	2023-07-12 09:36:59 +03:00
Pavel Emelyanov	033e5348aa	scylla-gdb: Print all clients from all idx's The scylla netw command prints clients from [0] index only, but there are more of them on messaging service. Print all Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #14633	2023-07-12 09:29:02 +03:00
Botond Dénes	c5cb23a825	Merge 'Add `scylla table` to scylla-gdb' from Pavel Emelyanov The command is to print interesting and/or hard-to-get-by-hand info about individual tables Closes #14635 * github.com:scylladb/scylladb: test: Add 'scylla table' cmd test scylla-gdb: Print table phased barriers scylla-gdb: Add 'table' command	2023-07-12 09:26:59 +03:00
Kamil Braun	dc6f6cb6b0	cql_test_env: load host ID from sstables after restart Performance tests such as `perf-fast-forward` are executed in our CI environments in two steps (two invocations of the `scylla` process): first by populating data directories (with `--populate` option), then by running the actual test. These tests are using `cql_test_env`, which did not load the previously saved (in the populate step) Host ID of this node, but generated a new one randomly instead. In `b39ca97919` we enabled `consistent_cluster_management` by default. This caused the perf tests to hang in `setup_group0` at `read_barrier` step. That's because Raft group 0 was initialized with old configuration -- the one created during the populate step -- but the Raft server was started with a newly generated Host ID (which is used as the server's Raft ID), so the server considered itself as being outside the configuration. Fix this by reloading the Host ID from disk, simulating more closely the behavior of main.cc initialization. Fixes #14599 Closes #14640	2023-07-11 23:30:44 +03:00
Avi Kivity	1545ae2d3b	Merge 'Make SSTable cleanup more efficient by fast forwarding to next owned range' from Raphael "Raph" Carvalho Today, SSTable cleanup skips to the next partition, one at a time, when it finds that the current partition is no longer owned by this node. That's very inefficient because when a cluster is growing in size, existing nodes lose multiple sequential tokens in its owned ranges. Another inefficiency comes from fetching index pages spanning all unowned tokens, which was described in https://github.com/scylladb/scylladb/issues/14317. To solve both problems, cleanup will now use multi range reader, to guarantee that it will only process the owned data and as a result skip unowned data. This results in cleanup scanning an owned range and then fast forwarding to the next one, until it's done with them all. This reduces significantly the amount of data in the index caching, as index will only be invoked at each range boundary instead. Without further ado, before: `INFO 2023-07-01 07:10:26,281 [shard 0] compaction - [Cleanup keyspace2.standard1 701af580-17f7-11ee-8b85-a479a1a77573] Cleaned 1 sstables to [./tmp/1/keyspace2/standard1-b490ee20179f11ee9134afb16b3e10fd/me-3g7a_0s8o_06uww24drzrroaodpv-big-Data.db:level=0]. 2GB to 1GB (~50% of original) in 26248ms = 81MB/s. ~9443072 total partitions merged to 4750028.` after: `INFO 2023-07-01 07:07:52,354 [shard 0] compaction - [Cleanup keyspace2.standard1 199dff90-17f7-11ee-b592-b4f5d81717b9] Cleaned 1 sstables to [./tmp/1/keyspace2/standard1-b490ee20179f11ee9134afb16b3e10fd/me-3g7a_0s4m_5hehd2rejj8w15d2nt-big-Data.db:level=0]. 2GB to 1GB (~50% of original) in 17424ms = 123MB/s. ~9443072 total partitions merged to 4750028.` Fixes #12998. Fixes #14317. 
Closes #14469 * github.com:scylladb/scylladb: test: Extend cleanup correctness test to cover more cases compaction: Make SSTable cleanup more efficient by fast forwarding to next owned range sstables: Close SSTable reader if index exhaustion is detected in fast forward call sstables: Simplify sstable reader initialization compaction: Extend make_sstable_reader() interface to work with mutation_source test: Extend sstable partition skipping test to cover fast forward using token	2023-07-11 23:28:15 +03:00
Avi Kivity	9cdae78d04	test: expr_test: add copyright/license Closes #14613	2023-07-11 21:45:27 +03:00
Raphael S. Carvalho	60ba1d8b47	test: Extend cleanup correctness test to cover more cases Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-07-11 13:56:24 -03:00
Raphael S. Carvalho	8d58ff1be6	compaction: Make SSTable cleanup more efficient by fast forwarding to next owned range Today, SSTable cleanup skips to the next partition, one at a time, when it finds that the current partition is no longer owned by this node. That's very inefficient because when a cluster is growing in size, existing nodes lose multiple sequential tokens in its owned ranges. Another inefficiency comes from fetching index pages spanning all unowned tokens, which was described in #14317. To solve both problems, cleanup will now use multi range reader, to guarantee that it will only process the owned data and as a result skip unowned data. This results in cleanup scanning an owned range and then fast forwarding to the next one, until it's done with them all. This reduces significantly the amount of data in the index caching, as index will only be invoked at each range boundary instead. Without further ado, before: ... 2GB to 1GB (~50% of original) in 26248ms = 81MB/s. ~9443072 total partitions merged to 4750028. after: ... 2GB to 1GB (~50% of original) in 17424ms = 123MB/s. ~9443072 total partitions merged to 4750028. Fixes #12998. Fixes #14317. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-07-11 13:56:24 -03:00
Raphael S. Carvalho	1fefe597e6	sstables: Close SSTable reader if index exhaustion is detected in fast forward call When wiring multi range reader with cleanup, I found that cleanup wouldn't be able to release disk space of input SSTables earlier. The reason is that multi range reader fast forward to the next range, therefore it enables mutation_reader::forwarding, and as a result, combined reader cannot release readers proactively as it cannot tell for sure that the underlying reader is exhausted. It may have reached EOS for the current range, but it may have data for the next one. The concept of EOS actually only applies to the current range being read. A reader that returned EOS will actually get out of this state once the combined reader fast forward to the next range. Therefore, only the underlying reader, i.e. the sstable reader, can for certain know that the data source is completely exhausted, given that tokens are read in monotonically increasing order. For reversed reads, that's not true but fast forward to range is not actually supported yet for it. Today, the SSTable reader already knows that the underlying SSTable was exhausted in fast_forward_to(), after it call index_reader's advance_to(partition_range), therefore it disables subsequent reads. We can take a step further and also check that the index was exhausted, i.e. reached EOF. So if the index is exhausted, and there's no partition to read after the fast_forward_to() call, we know that there's nothing left to do in this reader, and therefore the reader can be closed proactively, allowing the disk space of SSTable to be reclaimed if it was already deleted. 
We can see that the combined reader, under multi range reader, will incrementally find a set of disjoint SSTable exhausted, as it fast foward to owned ranges 1: INFO 2023-07-05 10:51:09,570 [shard 0] mutation_reader - flat_multi_range_mutation_reader(): fast forwarding to range [{-4525396453480898112, start},{-4525396453480898112, end}] INFO 2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-1-big-Data.db, start == end, eof ? true INFO 2023-07-05 10:51:09,570 [shard 0] sstable - closing reader 0x60100029d800 for /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-1-big-Data.db INFO 2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-3-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-4-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-5-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-6-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-7-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-8-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-9-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,570 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-10-big-Data.db, start == end, eof ? 
false 2: INFO 2023-07-05 10:51:09,572 [shard 0] mutation_reader - flat_multi_range_mutation_reader(): fast forwarding to range [{-2253424581619911583, start},{-2253424581619911583, end}] INFO 2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-2-big-Data.db, start == end, eof ? true INFO 2023-07-05 10:51:09,572 [shard 0] sstable - closing reader 0x60100029d400 for /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-2-big-Data.db INFO 2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-4-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-5-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-6-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-7-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-8-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-9-big-Data.db, start == end, eof ? false INFO 2023-07-05 10:51:09,572 [shard 0] sstable - sstable /tmp/scylla-9831a31a-66f3-4541-8681-000ac8e21bbb/me-10-big-Data.db, start == end, eof ? false And so on. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-07-11 13:56:24 -03:00
Raphael S. Carvalho	f08a4eaacb	sstables: Simplify sstable reader initialization It's odd that we see things like: if (!is_initialized()) { return initialize().then([this] { if (!is_initialized()) { and return ensure_initialized().then([this, &pr] { if (!is_initialized()) { One might think initialize will actually initialize the reader by setting up context, and ensure_initialized() will even have stronger guarantees, meaning that the reader must be initialized by it. But none are true. In the context of single-partition read, it can happen initialize() will not set up context, meaning is_initialized() returns false, which is why initialization must be checked even after we call ensure_initialized(). Let's merge ensure_initialized() and initialize() into a maybe_initialize() which returns a boolean saying if the reader is initialized. It makes the code initializing the reader easier to understand.	2023-07-11 13:56:23 -03:00
Michał Chojnowski	b511d57fc8	Revert "Merge 'Compaction resharding tasks' from Aleksandra Martyniuk" This reverts commit `2a58b4a39a`, reversing changes made to `dd63169077`. After patch `87c8d63b7a`, table_resharding_compaction_task_impl::run() performs the forbidden action of copying a lw_shared_ptr (_owned_ranges_ptr) on a remote shard, which is a data race that can cause a use-after-free, typically manifesting as allocator corruption. Note: before the bad patch, this was avoided by copying the _contents_ of the lw_shared_ptr into a new, local lw_shared_ptr. Fixes #14475 Fixes #14618 Closes #14641	2023-07-11 19:11:37 +03:00
Calle Wilund	e1a52af69e	messaging_service: Do TLS init early Fixes #14299 failure_detector can try sending messages to TLS endpoints before start_listen has been called (why?). Need TLS initialized before this. So do on service creation. Closes #14493	2023-07-11 18:19:01 +03:00
Kefu Chai	b4dc3f7cd9	scylla-gdb: add sstable::generation_type printer to inspect the sstable generation after uuid-based generation change. in this change: * a pretty printer for sstable::generation_type is added * now that the pretty printer for the generation_type is registered, we can just leverage it when printing the sstable name, so instead of checking if the `_generation` member variable contains `_value`, we delegate it to `str()`, which is used by `str.format()`. as the behavior of `str()` is similar to that of the gdb `print` command, and calls `value.format_string()`, which in turn calls into `to_string()` if the "value" in question has a pretty printer. after this change, the printer is able to print both the generations before the uuid change and the ones after the change. a typical gdb session looks like: ``` (gdb) p generation._value $5 = f0770b40-1c7c-11ee-b136-bf28f8d18b88 (gdb) p generation $10 = 3g7g_0bu7_0jpvk2p0mmtlsb8lu0 (gdb) p/x generation._value.least_sig_bits $7 = 0xb136bf28f8d18b88 (gdb) p/x generation._value.most_sig_bits $8 = 0xf0770b401c7c11ee ``` if we use `scripts/base36-uuid.py` to encode the msb and lsb, we'd need to: ```console scripts/base36-uuid.py -e 0xf0770b401c7c11ee 0xb136bf28f8d18b88 3g7g_0bu7_0jpvk2p0mmtlsb8lu0 ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14561	2023-07-11 15:56:20 +03:00
Raphael S. Carvalho	3b1829f0d8	compaction: base compaction throughput on amount of data read Today, we base compaction throughput on the amount of data written, but it should be based on the amount of input data compacted instead, to show the amount of data compaction had to process during its execution. A good example is a compaction which expires 99% of data: today, throughput would be calculated on the 1% written, which will mislead the reader to think that compaction was terribly slow. Fixes #14533. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #14615	2023-07-11 15:48:05 +03:00
Kefu Chai	25f4a7c400	sstables: format using format string instead of concatenating strings, let's format using the builtin support of `log::debug()`. for two reasons: 1. better performance, after this change, we don't need to materialize the concatenated string, if the "debug" level logging is not enabled. seastar::log only formats when a certain log level is enabled. 2. better readability. with the format string, it is clear what is the fixed part, and which arguments are to be formatted. this also helps us to move to compile-time formatting check, as fmtlib requires the caller to be explicit when it wants to use runtime format string. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14627	2023-07-11 15:31:20 +03:00
Pavel Emelyanov	5518502085	test: Add 'scylla table' cmd test Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-11 15:12:43 +03:00
Pavel Emelyanov	2c2ad09d3c	scylla-gdb: Print table phased barriers These barriers show if there's any operation in progress (read, write, flush or stream). These are crucial to know if stopping fails, e.g. see issue #13100 These barriers are summarized in the 'scylla memory' command, but they are also good to know on a per-table basis Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-11 15:10:47 +03:00
Pavel Emelyanov	1948b8fa17	scylla-gdb: Add 'table' command There's 'scylla tables' one that lists tables on the given/current shard, but the list is unable to show lots of information. It prints the table address so it can be explored by hand, but some data is more handy to be parsed and printed with the script The syntax is $ scylla table ks.cf For now just print the schema version. To be extended in the future. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-07-11 15:08:55 +03:00
Botond Dénes	bc5174ced6	Merge 'doc: move the package installation instructions to the documentation' from Anna Stuchlik Refs: https://github.com/scylladb/scylla-docs/issues/4091 Fixes https://github.com/scylladb/scylla-docs/issues/3419 This PR moves the installation instructions from the [website](https://www.scylladb.com/download/) to the documentation. Key changes: - The instructions are mostly identical, so they were squeezed into one page with different tabs. - I've merged the info for Ubuntu and Debian, as well as CentOS and RHEL. - The page uses variables that should be updated each release (at least for now). - The Java requirement was updated from Java 8 to Java 11 following [this issue](https://github.com/scylladb/scylla-docs/issues/3419). - In addition, the title of the Unified Installer page has been updated to communicate better about its contents. Closes #14504 * github.com:scylladb/scylladb: doc: update the prerequisites section doc: improve the title of Unified Installer page doc: move package install instructions to the docs	2023-07-11 14:30:11 +03:00
Kamil Braun	051728318d	utils/UUID: reference `timeuuid_tri_compare` in `UUID::operator<=>` comment	2023-07-11 13:19:50 +02:00
Avi Kivity	f26e36f448	Update seastar submodule * seastar 2b7a341210...bac344d584 (3): > tls: Export error_category instance used by tls + some common error codes > reactor: cast enum to int when formatting it > cooking: bump up zlib to 1.2.13	2023-07-11 13:24:32 +03:00
Kamil Braun	5779230d28	group0_state_machine: use correct comparison for timeuuids in `merger` In `d2a4079bbe`, `merger` was modified so that when we merge a command, `last_group0_state_id` is taken to be the maximum of the merged command's state_id and the current `last_group0_state_id`. This is necessary for achieving the same behavior as if the commands were applied individually instead of being merged -- where we take the maximum state ID from `group0_history` table which was applied until now (because the table is sorted using the state IDs and we take the greatest row). However, a subtle bug was introduced -- the `std::max` function uses the `utils::UUID` standard comparison operator which is unfortunately not the same as timeuuid comparison that Scylla performs when sorting the `group0_history` table. So in rare cases it could return the smaller of the two timeuuids w.r.t. the correct timeuuid ordering. This would then lead to commands being applied which should have been turned to no-ops due to the `prev_state_id` check -- and then, for example, permanent schema desync or worse. Fix it by using the correct comparison method. Fixes: #14600	2023-07-11 11:48:02 +02:00
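The ordering mismatch behind 5779230d28 can be reproduced in miniature. The sketch below uses Python's standard `uuid` module as a stand-in: Python orders `uuid.UUID` values by their raw 128-bit integer, while `UUID.time` extracts the version 1 timestamp. It is not Scylla's `utils::UUID` comparison nor `timeuuid_tri_compare`, which are C++ and have their own rules for ties; the helper `make_v1` and the field values are made up for illustration.

```python
import uuid


def make_v1(time_low: int, time_mid: int, time_hi: int) -> uuid.UUID:
    """Build a version 1 UUID from explicit timestamp fields (hypothetical helper)."""
    return uuid.UUID(fields=(time_low, time_mid, 0x1000 | time_hi,
                             0x80, 0x00, 0x000000000001))


# In a version 1 UUID the LOW 32 bits of the timestamp occupy the most
# significant bits, so raw ordering and timestamp ordering can disagree.
earlier = make_v1(0xFFFFFFFF, 0x0000, 0x000)  # timestamp 0x0FFFFFFFF
later = make_v1(0x00000000, 0x0001, 0x000)    # timestamp 0x100000000

# Raw comparison picks the UUID with the smaller timestamp.
by_raw = max(earlier, later)

# Comparing by timestamp picks the one that is actually later.
by_time = max(earlier, later, key=lambda u: u.time)

print(by_raw == earlier, by_time == later)
```

A maximum taken with the raw ordering can therefore return the older of two timeuuids, which is the kind of rare misordering the commit message describes for `std::max`.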
Kamil Braun	5ce802676f	utils/UUID: introduce `timeuuid_tri_compare` for `const UUID&` The existing `timeuuid_tri_compare` operates on UUIDs serialized in byte buffers. Introduce a version which operates directly on the `utils::UUID` type. To reuse existing comparison code, we serialize to a buffer before comparing. But we avoid allocations by using `std::array`. Since the serialized size needs to be known at compile time for `std::array`, mark `UUID::serialized_size()` as `constexpr`.	2023-07-11 11:48:02 +02:00
Kamil Braun	668beedadc	utils/UUID: introduce `timeuuid_tri_compare` for `const int8_t*` `timeuuid_tri_compare` takes `bytes_view` parameters and converts them to `const int8_t*` before comparing. Extract the part that operates on `const int8_t*` to a separate function which we will reuse in a later commit.	2023-07-11 11:48:02 +02:00
Kefu Chai	ef78b31b43	s3/client: add tagging ops with tagging ops, we will be able to attach kv pairs to an object. this will allow us to mark sstable components with taggings, and filter them based on those taggings. * test/pylib/minio_server.py: enable anonymous user to perform more actions. because the tagging related ops are not enabled by "mc anonymous set public", we have to enable them using "set-json" subcommand. * utils/s3/client: add methods to manipulate taggings. * test/boost/s3_test: add a simple test accordingly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14486	2023-07-11 09:30:46 +03:00
Kefu Chai	3b6e37051b	build: cmake: add more tests to CMake to be in-sync with configure.py Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14479	2023-07-11 09:21:26 +03:00

37842 Commits