scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 03:56:42 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	a396c27efc	Merge 'message: messaging_service: fix topology_ignored for pending endpoints in get_rpc_client' from Kamil Braun `get_rpc_client` calculates a `topology_ignored` field when creating a client which says whether the client's endpoint had topology information when this client was created. This is later used to check if that client needs to be dropped and replaced with a new client which uses the correct topology information. The `topology_ignored` field was incorrectly calculated as `true` for pending endpoints even though we had topology information for them. This would lead to unnecessary drops of RPC clients later. Fix this. Remove the default parameter for `with_pending` from `topology::has_endpoint` to avoid similar bugs in the future. Apparently this fixes #11780. The verbs used by decommission operation use RPC client index 1 (see `do_get_rpc_client_idx` in message/messaging_service.cc). From local testing with additional logging I found that by the time this client is created (i.e. the first verb in this group is used), we already know the topology. The node is pending at that point - hence the bug would cause us to assume we don't know the topology, leading us to dropping the RPC client later, possibly in the middle of a decommission operation. Fixes: #11780 Closes #11942 * github.com:scylladb/scylladb: message: messaging_service: check for known topology before calling is_same_dc/rack test: reenable test_topology::test_decommission_node_add_column test/pylib: util: configurable period in wait_for message: messaging_service: fix topology_ignored for pending endpoints in get_rpc_client message: messaging_service: topology independent connection settings for GOSSIP verbs	2022-11-17 20:14:32 +03:00
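The pending-endpoint fix described in the entry above can be illustrated with a small Python sketch. It is not the C++ code from `message/messaging_service.cc`; the `Topology` class, the `Pending` enum, and the addresses are hypothetical stand-ins. It shows why a required `with_pending` argument makes the caller state explicitly whether pending endpoints count as known.

```python
from enum import Enum
from typing import Set


class Pending(Enum):
    NO = 0
    YES = 1


class Topology:
    """Toy topology: a set of normal endpoints and a set of pending ones."""

    def __init__(self, normal: Set[str], pending: Set[str]) -> None:
        self._normal = set(normal)
        self._pending = set(pending)

    # `with_pending` deliberately has no default value, so every caller
    # must decide whether pending endpoints count as known.
    def has_endpoint(self, endpoint: str, with_pending: Pending) -> bool:
        if endpoint in self._normal:
            return True
        return with_pending is Pending.YES and endpoint in self._pending


def topology_ignored(topology: Topology, endpoint: str) -> bool:
    """True only when nothing is known about the endpoint, pending or not."""
    return not topology.has_endpoint(endpoint, Pending.YES)
```

With this shape, a client created for a pending endpoint is not flagged as having ignored topology, so it would not be dropped later for that reason.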
Nadav Har'El	e393639114	test/cql-pytest: reproducer for crash in LWT with null key This patch adds a reproducer for issue #11954: Attempting an "IF NOT EXISTS" (LWT) write with a null key crashes Scylla, instead of producing a simple error message (like happens without the "IF NOT EXISTS" after #7852 was fixed). The test passed on Cassandra, but crashes Scylla. Because of this crash, we can't just mark the test "xfail" and it's temporarily marked "skip" instead. Refs #11954. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11982	2022-11-17 07:31:13 +02:00
Avi Kivity	3497891cf9	utils: spell "barrett" correctly As P. T. Barnoom famously said, "write what you like but spell my name correctly". Following that, we correct the spelling of Barrett's name in the source tree. Closes #11989	2022-11-16 16:30:38 +02:00
Kamil Braun	9b2449d3ea	test: reenable test_topology::test_decommission_node_add_column Also improve the test to increase the probability of reproducing #11780 by injecting sleeps in appropriate places. Without the fix for #11780 from the earlier commit, the test reproduces the issue in roughly half of all runs in dev build on my laptop.	2022-11-16 14:01:50 +01:00
Kamil Braun	0f49813312	test/pylib: util: configurable period in wait_for	2022-11-16 14:01:50 +01:00
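A polling helper with a configurable period, as named in the entry above, might look roughly like the following. This is a sketch under assumptions: the signature, the default period, and the use of an absolute `time.monotonic()` deadline are illustrative and not taken from `test/pylib/util.py`.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def wait_for(pred: Callable[[], Awaitable[Optional[T]]],
                   deadline: float,
                   period: float = 1.0) -> T:
    """Poll `pred` every `period` seconds until it returns a non-None value.

    `deadline` is an absolute time.monotonic() timestamp (an assumption of
    this sketch). TimeoutError is raised if the next poll would pass it.
    """
    while True:
        result = await pred()
        if result is not None:
            return result
        if time.monotonic() + period > deadline:
            raise TimeoutError("wait_for: deadline exceeded")
        await asyncio.sleep(period)
```

A shorter period lets a test react sooner to a state change, at the cost of polling more often.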
Nadav Har'El	2f2f01b045	materialized views: fix view writes after base table schema change When we write to a materialized view, we need to know some information defined in the base table such as the columns in its schema. We have a "view_info" object that tracks each view and its base. This view_info object has a couple of mutable attributes which are used to lazily-calculate and cache the SELECT statement needed to read from the base table. If the base-table schema ever changes - and the code calls set_base_info() at that point - we need to forget this cached statement. If we don't (as before this patch), the SELECT will use the wrong schema and writes will no longer work. This patch also includes a reproducing test that failed before this patch, and passes afterwards. The test creates a base table with a view that has a non-trivial SELECT (it has a filter on one of the base-regular columns), makes a benign modification to the base table (just a silly addition of a comment), and then tries to write to the view - and before this patch it fails. Fixes #10026 Fixes #11542	2022-11-16 13:58:21 +02:00
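The caching pattern described in the entry above (a lazily built statement that must be forgotten when the base schema changes) can be sketched in Python as follows. The `ViewInfo` class here is a toy model with hypothetical fields, not Scylla's C++ `view_info`; only the name `set_base_info` comes from the commit message.

```python
from typing import Callable, List, Optional


class ViewInfo:
    """Toy per-view object that caches a SELECT string derived from the base schema."""

    def __init__(self, base_columns: List[str],
                 build_select: Callable[[List[str]], str]) -> None:
        self._base_columns = base_columns
        self._build_select = build_select
        self._cached_select: Optional[str] = None  # computed lazily on first use

    def select_statement(self) -> str:
        if self._cached_select is None:
            self._cached_select = self._build_select(self._base_columns)
        return self._cached_select

    def set_base_info(self, base_columns: List[str]) -> None:
        self._base_columns = base_columns
        # Forget the statement built from the old schema; without this reset
        # later calls would keep returning the stale statement.
        self._cached_select = None
```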
Botond Dénes	bd1fcbc38f	Merge 'Introduce reverse vector_deserializer.' from Michał Radwański As indicated in #11816, we'd like to enable deserializing vectors in reverse. The forward deserialization is achieved by reading from an input_stream. The input stream internally is a singly linked list with complicated logic. In order to allow for going through it in reverse, instead when creating the reverse vector initializer, we scan the stream and store substreams to all the places that are a starting point for a next element. The iterator itself just deserializes elements from the remembered substreams, this time in reverse. Fixes #11816 Closes #11956 * github.com:scylladb/scylladb: test/boost/serialization_test.cc: add test for reverse vector deserializer serializer_impl.hh: add reverse vector serializer serializer_impl: remove unneeded generic parameter	2022-11-16 07:37:24 +02:00
Nadav Har'El	e4dba6a830	test/cql-pytest: add test for when MV requires IS NOT NULL As noted in issue #11979, Scylla inconsistently (and unlike Cassandra) requires "IS NOT NULL" on some but not all materialized-view key columns. Specifically, Scylla does not require "IS NOT NULL" on the base's partition key, while Cassandra does. This patch is a test which demonstrates this inconsistency. It currently passes on Cassandra and fails on Scylla, so is marked xfail. Refs #11979 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11980	2022-11-15 14:21:48 +01:00
Botond Dénes	34f29c8d67	Merge 'Use with_sstable_directory() helper in tests' from Pavel Emelyanov The helper is already widely used, one (last) test case can benefit from using it too Closes #11978 * github.com:scylladb/scylladb: test: Indentation fix after previous patch test: Use with_sstable_directory() helper	2022-11-15 14:21:48 +01:00
Nadav Har'El	8a4ab87e44	Merge 'utils: crc: generate crc barrett fold tables at compile time' from Avi Kivity We use Barrett tables (misspelled in the code unfortunately) to fold crc computations of multiple buffers into a single crc. This is important because it turns out to be faster to compute crc of three different buffers in parallel rather than compute the crc of one large buffer, since the crc instruction has latency 3. Currently, we have a separate code generation step to compute the fold tables. The step generates a new C++ source files with the tables. But modern C++ allows us to do this computation at compile time, avoiding the code generation step. This simplifies the build. This series does that. There is some complication in that the code uses compiler intrinsics for the computation, and these are not constexpr friendly. So we first introduce constexpr-friendly alternatives and use them. To prove the transformation is correct, I compared the generated code from before the series and from just before the last step (where we use constexpr evaluation but still retain the generated file) and saw no difference in the values. Note that constexpr is not strictly needed - we could have run the code in the global variables' initializer. But that would cause a crash if we run on a pre-clmul machine, and is not as fun. 
Closes #11957 * github.com:scylladb/scylladb: test: crc: add unit tests for constexpr clmul and barrett fold utils: crc combine table: generate at compile time utils: barrett: inline functions in header utils: crc combine table: generate tables at compile time utils: crc combine table: extract table generation into a constexpr function utils: crc combine table: extract "pow table" code into constexpr function utils: crc combine table: store tables std::array rather than C array utils: barrett: make the barrett reduction constexpr friendly utils: clmul: add 64-bit constexpr clmul utils: barrett: extract barrett reduction constants utils: barrett: reorder functions utils: make clmul() constexpr	2022-11-15 14:21:48 +01:00
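The series above replaces compiler intrinsics with constexpr-friendly alternatives. As an illustration of the arithmetic only, a carry-less multiplication (the operation a `clmul` intrinsic performs) can be written with plain shifts and XORs. The Python function below is a sketch of that idea, not the C++ code from the series, and it works on unbounded integers rather than fixed-width words.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two non-negative integers.

    Each operand is read as a polynomial over GF(2), one coefficient per bit.
    Partial products are combined with XOR instead of addition, so no carries
    propagate between bit positions.
    """
    result = 0
    shift = 0
    while b:
        if b & 1:
            result ^= a << shift
        b >>= 1
        shift += 1
    return result
```

For example, `clmul(0b11, 0b11)` is `0b101`, because (x + 1)(x + 1) = x^2 + 1 when coefficients are taken modulo 2.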
Pavel Emelyanov	8dcd9d98d6	test: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-14 20:11:01 +03:00
Pavel Emelyanov	c9128e9791	test: Use with_sstable_directory() helper It's already used everywhere, but one test case wires up the sstable_directory by hand. Fix it too, but keep in mind that the caller fn stops the directory early. (indentation is deliberately left broken) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-14 20:11:01 +03:00
Michał Radwański	32c60b44c5	test/boost/serialization_test.cc: add test for reverse vector deserializer This test is just a copy-pasted version of forward serializer test.	2022-11-14 16:06:24 +01:00
Botond Dénes	8e38551d93	Merge 'Allow each compaction group to have its own compaction backlog tracker' from Raphael "Raph" Carvalho Today, compaction_backlog_tracker is managed in each compaction_strategy implementation. So every compaction strategy is managing its own tracker and providing a reference to it through get_backlog_tracker(). But this prevents each group from having its own tracker, because there's only a single compaction_strategy instance per table. To remove this limitation, compaction_strategy impl will no longer manage trackers but will instead provide an interface for trackers to be created, such that each compaction_group will be allowed to create its own tracker and manage it by itself. Now table's backlog will be the sum of all compaction_group backlogs. The normalization factor is applied on the sum, so we don't have to adjust each individual backlog to any factor. Closes #11762 * github.com:scylladb/scylladb: replica: Allow one compaction_backlog_tracker for each compaction_group compaction: Make compaction_state available for compaction tasks being stopped compaction: Implement move assignment for compaction_backlog_tracker compaction: Fix compaction_backlog_tracker move ctor compaction: Use table_state's backlog tracker in compaction_read_monitor_generator compaction: kill undefined get_unimplemented_backlog_tracker() replica: Refactor table::set_compaction_strategy for multiple groups Fix exception safety when transferring ongoing charges to new backlog tracker replica: move_sstables_from_staging: Use tracker from group owning the SSTable replica: Move table::backlog_tracker_adjust_charges() to compaction_group replica: table::discard_sstables: Use compaction_group's backlog tracker replica: Disable backlog tracker in compaction_group::stop() replica: database_sstable_write_monitor: use compaction_group's backlog tracker replica: Move table::do_add_sstable() to compaction_group test/sstable_compaction_test: Switch to 
table_state::get_backlog_tracker() compaction/table_state: Introduce get_backlog_tracker()	2022-11-14 07:05:28 +02:00
Avi Kivity	b8cb34b928	test: crc: add unit tests for constexpr clmul and barrett fold Check that the constexpr variants indeed match the runtime variants. I verified manually that exactly one computation in each test is executed at run time (and is compared against a constant).	2022-11-13 16:22:29 +02:00
Raphael S. Carvalho	b88acffd66	replica: Allow one compaction_backlog_tracker for each compaction_group Today, compaction_backlog_tracker is managed in each compaction_strategy implementation. So every compaction strategy is managing its own tracker and providing a reference to it through get_backlog_tracker(). But this prevents each group from having its own tracker, because there's only a single compaction_strategy instance per table. To remove this limitation, compaction_strategy impl will no longer manage trackers but will instead provide an interface for trackers to be created, such that each compaction group will be allowed to have its own tracker, which will be managed by compaction manager. On compaction strategy change, table will update each group with the new tracker, which is created using the previously introduced ompaction_group_sstable_set_updater. Now table's backlog will be the sum of all compaction_group backlogs. The normalization factor is applied on the sum, so we don't have to adjust each individual backlog to any factor. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:22:51 -03:00
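The remark above that the normalization factor is applied to the sum can be shown with a tiny Python sketch. The function name is hypothetical, and modelling the factor as a multiplier is an assumption made for illustration; the commit message does not say how the factor is applied.

```python
from typing import Iterable


def table_backlog(group_backlogs: Iterable[float], normalization_factor: float) -> float:
    """Sum the per-group backlogs, then apply the factor once to the total.

    Assumption: the factor is a multiplier. Because multiplication distributes
    over addition, scaling the sum equals scaling each group and then summing,
    so individual group backlogs need no adjustment.
    """
    return sum(group_backlogs) * normalization_factor
```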
Raphael S. Carvalho	835927a2ad	test/sstable_compaction_test: Switch to table_state::get_backlog_tracker() Important for decoupling backlog tracker from table's compaction strategy. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	1ec0ef18a5	compaction/table_state: Introduce get_backlog_tracker() This interface will be helpful for allowing replica::table, unit tests and sstables::compaction to access the compaction group's tracker which will be managed by the compaction manager, once we complete the decoupling work. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Nadav Har'El	ff87624fb4	test/cql-pytest: add another regression test for reversed-type bug In commit `544ef2caf3` we fixed a bug where a reversed clustering-key order caused problems using a secondary index because of incorrect type comparison. That commit also included a regression test for this fix. However, that fix was incomplete, and improved later in commit `c8653d1321`. That later fix was labeled "better safe than sorry", and did not include a test demonstrating any actual bug, so unsurprisingly we never backported that second fix to any older branches. Recently we discovered that missing the second patch does cause real problems, and this patch includes a test which fails when the first patch is in, but the second patch isn't (and passes when both patches are in, and also passes on Cassandra). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11943	2022-11-11 11:01:22 +02:00
Kamil Braun	4a2ec888d5	Merge 'test.py: use internal id to manage servers' from Alecco Instead of using assigned IP addresses, use a local integer ID for managing servers. IP address can be reused by a different server. While there, get host ID (UUID). This can also be reused with `node replace` so it's not good enough for tracking. Closes #11747 * github.com:scylladb/scylladb: test.py: use internal id to manage servers test.py: rename hostname to ip_addr test.py: get host id test.py: use REST api client in ScyllaCluster test.py: remove unnecessary reference to web app test.py: requests without aiohttp ClientSession	2022-11-10 17:12:16 +01:00
Alejo Sanchez	700054abee	test.py: use internal id to manage servers Instead of using assigned IP addresses, use an internal server id. Define types to distinguish local server id, host ID (UUID), and IP address. This is needed to test servers changing IP address and for node replace (host UUID). Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
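Distinct types for the three kinds of identifier mentioned above could be declared along these lines. This is a sketch under assumptions: the type names, the `ServerRegistry` class, and its methods are hypothetical and are not taken from test.py.

```python
from dataclasses import dataclass
from typing import Dict, NewType

# Three identifiers that must not be confused with one another.
ServerNum = NewType("ServerNum", int)   # local id assigned by the test harness
HostID = NewType("HostID", str)         # host UUID reported by the node
IPAddress = NewType("IPAddress", str)   # may change or be reused over time


@dataclass
class ServerInfo:
    ip_addr: IPAddress
    host_id: HostID


class ServerRegistry:
    """Tracks servers by a local integer id that is never reused."""

    def __init__(self) -> None:
        self._next_id = 1
        self._servers: Dict[ServerNum, ServerInfo] = {}

    def add(self, ip_addr: IPAddress, host_id: HostID) -> ServerNum:
        server_id = ServerNum(self._next_id)
        self._next_id += 1
        self._servers[server_id] = ServerInfo(ip_addr, host_id)
        return server_id

    def get(self, server_id: ServerNum) -> ServerInfo:
        return self._servers[server_id]
```

Keying on the local id means a server keeps its identity in the harness even if its IP address changes or a later server reuses that address.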
Alejo Sanchez	1e38f5478c	test.py: rename hostname to ip_addr The code explicitly manages an IP as string, make it explicit in the variable name. Define its type and test for set in the instance instead of using an empty string as placeholder. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	f478eb52a3	test.py: get host id When initializing a ScyllaServer, try to get the host id instead of only checking the REST API is up. Use the existing aiohttp session from ScyllaCluster. In case of HTTP error check the status was not an internal error (500+). Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	78663dda72	test.py: use REST api client in ScyllaCluster Move the REST api client to ScyllaCluster. This will allow the cluster to query its own servers. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	75ea345611	test.py: remove unnecessary reference to web app The aiohttp.web.Application only needs to be passed, so don't store a reference in ScyllaCluster object. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	a5316b0c6b	test.py: requests without aiohttp ClientSession Simplify REST helper by doing requests without a session. Reusing an aiohttp.ClientSession causes knock-on effects on `rest_api/test_task_manager` due to handling exceptions outside of an async with block. Requests for cluster management and Scylla REST API don't need session, anyway. Raise HTTPError with status code, text reason, params, and json. In ScyllaCluster.install_and_start() instead of adding one more custom exception, just catch all exceptions as they will be re-raised later. While there avoid code duplication and improve sanity, type checking, and lint score. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Botond Dénes	21bc37603a	Merge 'utils: config_src: add set_value_on_all_shards functions' from Benny Halevy Currently when we set a single value we need to call broadcast_to_all_shards to let observers on all shards get notified of the new value. However, the latter broadcasts all values to all shards so it's terribly inefficient. Instead, add async set_value_on_all_shards functions to broadcast a value to all shards. Use those in system_keyspace for db_config_table virtual table and in task_manager_test to update the task_manager ttl. Refs #7316 Closes #11893 * github.com:scylladb/scylladb: tests: check ttl on different shards utils: config_src: add set_value_on_all_shards functions utils: config_file: add config_source::API	2022-11-10 07:16:39 +02:00
Aleksandra Martyniuk	b0ed4d1f0f	tests: check ttl on different shards The test that checks whether the ttl is properly set is extended to also check that the ttl value is changed on a non-zero shard.	2022-11-09 16:58:46 +02:00
Botond Dénes	725e5b119d	Revert "replica: Pick new generation for SSTables being moved from staging dir" This reverts commit `ba6186a47f`. Said commit violates the widely held assumption that sstable generations can be used as sstable identity. One known problem this caused is a potential OOO partition emitted when reading from sstables (#11843). We now also have a better fix for #11789 (the bug this commit was meant to fix): `4aa0b16852`. So we can revert without regressions. Fixes: #11843 Closes #11886	2022-11-09 16:35:31 +02:00
Michał Chojnowski	3e0c7a6e9f	test: sstable_datafile_test: eliminate a use of std::regex to prevent stack overflow This usage of std::regex overflows the seastar::thread stack size (128 KiB), causing memory corruption. Fix that. Closes #11911	2022-11-08 14:41:34 +02:00
Tomasz Grabiec	a9063f9582	Merge 'service/raft: failure detector: ping `raft::server_id`s, not `gms::inet_address`es' from Kamil Braun Whenever a Raft configuration change is performed, `raft::server` calls `raft_rpc::add_server`/`raft_rpc::remove_server`. Our `raft_rpc` implementation has a function, `_on_server_update`, passed in the constructor, which it called in `add_server`/`remove_server`; that function would update the set of endpoints detected by the direct failure detector. `_on_server_update` was passed an IP address and that address was added to / removed from the failure detector set (there's another translation layer between the IP addresses and internal failure detector 'endpoint ID's; but we can ignore it for the purposes of this commit). Therefore: the failure detector was pinging a certain set of IP addresses. These IP addresses were updated during Raft configuration changes. To implement the `is_alive(raft::server_id)` function (required by `raft::failure_detector` interface), we would translate the ID using the Raft address map, which is currently also updated during configuration changes, to an IP address, and check if that IP address is alive according to the direct failure detector (which maintained an `_alive_set` of type `unordered_set<gms::inet_address>`). This all works well but it assumes that servers can be identified using IP addresses - it doesn't play well with the fact that servers may change their IP addresses. The only immutable identifier we have for a server is `raft::server_id`. In the future, Raft configurations will not associate IP addresses with Raft servers; instead we will assume that IP addresses can change at any time, and there will be a different mechanism that eventually updates the Raft address map with the latest IP address for each `raft::server_id`. To prepare us for that future, in this commit we no longer operate in terms of IP addresses in the failure detector, but in terms of `raft::server_id`s. 
Most of the commit is boilerplate, changing `gms::inet_address` to `raft::server_id` and function/variable names. The interesting changes are: - in `is_alive`, we no longer need to translate the `raft::server_id` to an IP address, because now the stored `_alive_set` already contains `raft::server_id`s instead of `gms::inet_address`es. - the `ping` function now takes a `raft::server_id` instead of `gms::inet_address`. To send the ping message, we need to translate this to IP address; we do it by the `raft_address_map` pointer introduced in an earlier commit. Thus, there is still a point where we have to translate between `raft::server_id` and `gms::inet_address`; but observe we now do it at the last possible moment - just before sending the message. If we have no translation, we consider the `ping` to have failed - it's equivalent to a network failure where no route to a given address was found. Closes #11759 * github.com:scylladb/scylladb: direct_failure_detector: get rid of complex `endpoint_id` translations service/raft: ping `raft::server_id`s, not `gms::inet_address`es service/raft: store `raft_address_map` reference in `direct_fd_pinger` gms: gossiper: move `direct_fd_pinger` out to a separate service gms: gossiper: direct_fd_pinger: extract generation number caching to a separate class	2022-11-07 16:42:35 +01:00
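The late translation described in the entry above (ping by server ID, resolve the address just before sending, treat a missing mapping as a failed ping) can be sketched in Python. The `Pinger` class, its constructor arguments, and the `send` callback are hypothetical; the real code is C++ and uses `raft_address_map`.

```python
import uuid
from typing import Callable, Dict


class Pinger:
    """Pings servers identified by an immutable ID rather than by IP address."""

    def __init__(self, address_map: Dict[uuid.UUID, str],
                 send: Callable[[str], bool]) -> None:
        self._address_map = address_map  # server ID -> current IP, may be updated
        self._send = send                # sends a ping, returns True on success

    def ping(self, server_id: uuid.UUID) -> bool:
        # Translate at the last possible moment so that the most recent
        # address known for this ID is the one that gets used.
        addr = self._address_map.get(server_id)
        if addr is None:
            # No translation: treated like a network failure with no route.
            return False
        return self._send(addr)
```

Because the alive set would be keyed by the same IDs, a server that changes its IP address remains the same entry from the failure detector's point of view.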
Avi Kivity	91f2cd5ac4	test: lib: exception_predicate: use boost::regex instead of std::regex std::regex was observed to overflow stack on aarch64 in debug mode. Use boost::regex until the libstdc++ bug[1] is fixed. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582 Closes #11888	2022-11-07 14:03:25 +02:00
Petr Gusev	44f48bea0f	raft: test_remove_node_with_concurrent_ddl The test runs remove_node command with background ddl workload. It was written in an attempt to reproduce scylladb#11228 but seems to have value on its own. The if_exists parameter has been added to the add_table and drop_table functions, since the driver could retry the request sent to a removed node, but that request might have already been completed. Function wait_for_host_known waits until the information about the node reaches the destination node. Since we add new nodes at each iteration in main, this can take some time. A number of abort-related options were added to SCYLLA_CMDLINE_OPTIONS as it simplifies nailing down problems. Closes #11734	2022-11-04 17:16:35 +01:00
Kamil Braun	e086521c1a	direct_failure_detector: get rid of complex `endpoint_id` translations The direct failure detector operates on abstract `endpoint_id`s for pinging. The `pinger` interface is responsible for translating these IDs to 'real' addresses. Earlier we used two types of addresses: IP addresses in 'production' code (`gms::gossiper::direct_fd_pinger`) and `raft::server_id`s in test code (in `randomized_nemesis_test`). For each of these use cases we would maintain mappings between `endpoint_id`s and the address type. In recent commits we switched the 'production' code to also operate on Raft server IDs, which are UUIDs underneath. In this commit we switch `endpoint_id`s from `unsigned` type to `utils::UUID`. Because each use case operates in Raft server IDs, we can perform a simple translation: `raft_id.uuid()` to get an `endpoint_id` from a Raft ID, `raft::server_id{ep_id}` to obtain a Raft ID from an `endpoint_id`. We no longer have to maintain complex sharded data structures to store the mappings.	2022-11-04 09:38:08 +01:00
Kamil Braun	ac70a05c7e	service/raft: store `raft_address_map` reference in `direct_fd_pinger` The pinger will use the map to translate `raft::server_id`s to `gms::inet_address`es when pinging.	2022-11-04 09:38:08 +01:00
Kamil Braun	2c20f2ab9d	gms: gossiper: move `direct_fd_pinger` out to a separate service In a later commit `direct_fd_pinger` will operate in terms of `raft::server_id`s. Decouple it from `gossiper` since we don't want to entangle `gossiper` with Raft-specific stuff.	2022-11-04 09:38:08 +01:00
Pavel Emelyanov	efbfcdb97e	Merge 'Replicate `raft_address_map` non-expiring entries to other shards' from Kamil Braun Replicating `raft_address_map` entries is needed for the following use cases: - the direct failure detector - currently it assumes a static mapping of `raft::server_id`s to `gms::inet_address`es, which is obtained on Raft group 0 configuration changes. To handle dynamic mappings we need to modify the failure detector so it pings `raft::server_id`s and obtains the `gms::inet_address` before sending the message from `raft_address_map`. The failure detector is sharded, so we need the mappings to be available on all shards. - in the future we'll have multiple Raft groups running on different shards. To send messages they'll need `raft_address_map`. Initially I tried to replicate all entries - expiring and non-expiring. The implementation turned out to be very complex - we need to handle dropping expired entries and refreshing expiring entries' timestamps across shards, and doing this correctly while accounting for possible races is quite problematic. Eventually I arrived at the conclusion that replicating only non-expiring entries, and furthermore allowing non-expiring entries to be added only on shard 0, is good enough for our use cases: - The direct failure detector is pinging group 0 members only; group 0 members correspond exactly to the non-expiring entries. - Group 0 configuration changes are handled on shard 0, so non-expiring entries are added/removed on shard 0. - When we have multiple Raft groups, we can reuse a single Raft server ID for all Raft servers running on a single node belonging to different groups; they are 'namespaced' by the group IDs. Furthermore, every node has a server that belongs to group 0. 
Thus for every Raft server in every group, it has a corresponding server in group 0 with the same ID, which has a non-expiring entry in `raft_address_map`, which is replicated to all shards; so every group will be able to deliver its messages. With these assumptions the implementation is short and simple. We can always complicate it in the future if we find that the assumptions are too strong. Closes #11791 * github.com:scylladb/scylladb: test/raft: raft_address_map_test: add replication test service/raft: raft_address_map: replicate non-expiring entries to other shards service/raft: raft_address_map: assert when entry is missing in drop_expired_entries service/raft: turn raft_address_map into a service	2022-11-03 18:34:42 +03:00
Avi Kivity	ca2010144e	test: loading_cache_test: fix use-after-free in test_loading_cache_remove_leaves_no_old_entries_behind We capture `key` by reference, but it is in another continuation. Capture it by value, and avoid the default capture specification. Found by clang 15 + asan + aarch64. Closes #11884	2022-11-03 17:23:40 +02:00
Avi Kivity	0c3967cf5e	Merge 'scylla-gdb.py: improve scylla-fiber' from Botond Dénes The main theme of this patchset is improving `scylla-fiber`, with some assorted unrelated improvements tagging along. In lieu of explicit support for mapping up continuation chains in memory from seastar (there is one but it uses function calls), scylla fiber uses a quite crude method to do this: it scans task objects for outbound references to other task objects to find waiter tasks and scans inbound references from other tasks to find waited-on tasks. This works well for most objects, but there are some problematic ones: * `seastar::thread_context`: the waited-on task (`seastar::(anonymous namespace)::thread_wake_task`) is allocated on the thread's stack which is not in the object itself. Scylla fiber now scans the stack bottom-up to find this task. * `seastar::smp_message_queue::async_work_item`: the waited-on task lives on another shard. Scylla fiber now digs out the remote shard from the work item and continues the search on the remote shard. * `seastar::when_all_state`: the waited-on task is a member in the same object, tripping loop detection and terminating the search. Scylla fiber now uses the `_continuation` member explicitly to look for the next links. Other minor improvements were also done, like including the shard of the task in the printout. Example demonstrating all the new additions: ``` (gdb) scylla fiber 0x000060002d650200 Stopping because loop is detected: task 0x000061c00385fb60 was seen before. 
[shard 28] #-13 (task) 0x000061c00385fba0 0x00000000003b5b00 vtable for seastar::internal::when_all_state_component<seastar::future<void> > + 16 [shard 28] #-12 (task) 0x000061c00385fb60 0x0000000000417010 vtable for seastar::internal::when_all_state<seastar::internal::identity_futures_tuple<seastar::future<void>, seastar::future<void> >, seastar::future<void>, seastar::future<void> > + 16 [shard 28] #-11 (task) 0x000061c009f16420 0x0000000000419830 _ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_6futureISt5tupleIJNS4_IvEES6_EEE14discard_resultEvEUlDpOT_E_ZNS8_14then_impl_nrvoISC_S6_EET0_OT_EUlOS3_RSC_ONS_12future_stateIS7_EEE_S7_EE + 16 [shard 28] #-10 (task) 0x000061c0098e9e00 0x0000000000447440 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::stop()::{lambda(unsigned int)#1}::operator()(unsigned int)::{lambda()#1}>::run_and_dispose()::{lambda(auto:1)#1}, seastar::future<void>::then_wrapped_nrvo<void, seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::stop()::{lambda(unsigned int)#1}::operator()(unsigned int)::{lambda()#1}> >(seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::stop()::{lambda(unsigned int)#1}::operator()(unsigned int)::{lambda()#1}>&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::stop()::{lambda(unsigned int)#1}::operator()(unsigned int)::{lambda()#1}>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #-9 (task) 0x000060000858dcd0 0x0000000000449d68 vtable for seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::stop()::{lambda(unsigned int)#1}::operator()(unsigned int)::{lambda()#1}> + 16 [shard 0] #-8 (task) 0x0000600050c39f60 0x00000000007abe98 vtable for 
seastar::parallel_for_each_state + 16 [shard 0] #-7 (task) 0x000060000a59c1c0 0x0000000000449f60 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::sharded<cql_transport::cql_server>::stop()::{lambda(seastar::future<void>)#2}, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, {lambda(seastar::future<void>)#2}>({lambda(seastar::future<void>)#2}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(seastar::future<void>)#2}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #-6 (task) 0x000060000a59c400 0x0000000000449ea0 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, cql_transport::controller::do_stop_server()::{lambda(std::unique_ptr<seastar::sharded<cql_transport::cql_server>, std::default_delete<seastar::sharded<cql_transport::cql_server> > >&)#1}::operator()(std::unique_ptr<seastar::sharded<cql_transport::cql_server>, std::default_delete<seastar::sharded<cql_transport::cql_server> > >&) const::{lambda()#1}::operator()() const::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda()#1}, {lambda()#1}>({lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda()#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #-5 (task) 0x0000600009d86cc0 0x0000000000449c00 vtable for seastar::internal::do_with_state<std::tuple<std::unique_ptr<seastar::sharded<cql_transport::cql_server>, std::default_delete<seastar::sharded<cql_transport::cql_server> > > >, seastar::future<void> > + 16 [shard 0] #-4 (task) 0x00006000019ffe20 0x00000000007ab368 vtable for seastar::(anonymous namespace)::thread_wake_task + 16 [shard 0] #-3 (task) 0x00006000085ad080 0x0000000000809e18 vtable for seastar::thread_context + 16 [shard 0] #-2 (task) 0x0000600009c04100 0x00000000006067f8 
_ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_5asyncIZZN7service15storage_service5drainEvENKUlRS6_E_clES7_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayIT_E4typeEDpNSC_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSD_DpOSG_EUlvE0_ZNS_6futureIvE14then_impl_nrvoIST_SV_EET0_SQ_EUlOS3_RST_ONS_12future_stateINS1_9monostateEEEE_vEE + 16 [shard 0] #-1 (task) 0x000060000a59c080 0x0000000000606ae8 _ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIvEENS_6futureIvE12finally_bodyIZNS_5asyncIZZN7service15storage_service5drainEvENKUlRS9_E_clESA_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayIT_E4typeEDpNSF_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSG_DpOSJ_EUlvE1_Lb0EEEZNS5_17then_wrapped_nrvoIS5_SX_EENSD_ISG_E4typeEOT0_EUlOS3_RSX_ONS_12future_stateINS1_9monostateEEEE_vEE + 16 [shard 0] #0 (task) 0x000060002d650200 0x0000000000606378 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<service::storage_service::run_with_api_lock<service::storage_service::drain()::{lambda(service::storage_service&)#1}>(seastar::basic_sstring<char, unsigned int, 15u, true>, service::storage_service::drain()::{lambda(service::storage_service&)#1}&&)::{lambda(service::storage_service&)#1}::operator()(service::storage_service&)::{lambda()#1}, false>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, {lambda(service::storage_service&)#1}>({lambda(service::storage_service&)#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(service::storage_service&)#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #1 (task) 0x000060000bc40540 0x0000000000606d48 
_ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIvEENS_6futureIvE12finally_bodyIZNS_3smp9submit_toIZNS_7shardedIN7service15storage_serviceEE9invoke_onIZNSB_17run_with_api_lockIZNSB_5drainEvEUlRSB_E_EEDaNS_13basic_sstringIcjLj15ELb1EEEOT_EUlSF_E_JES5_EET1_jNS_21smp_submit_to_optionsESK_DpOT0_EUlvE_EENS_8futurizeINSt9result_ofIFSJ_vEE4typeEE4typeEjSN_SK_EUlvE_Lb0EEEZNS5_17then_wrapped_nrvoIS5_S10_EENSS_ISJ_E4typeEOT0_EUlOS3_RS10_ONS_12future_stateINS1_9monostateEEEE_vEE + 16 [shard 0] #2 (task) 0x000060000332afc0 0x00000000006cb1c8 vtable for seastar::continuation<seastar::internal::promise_base_with_type<seastar::json::json_return_type>, api::set_storage_service(api::http_context&, seastar::httpd::routes&)::{lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)#38}::operator()(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >) const::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)#38}, {lambda()#1}<seastar::json::json_return_type> >({lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)#38}&&)::{lambda(seastar::internal::promise_base_with_type<seastar::json::json_return_type>&&, {lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)#38}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #3 (task) 0x000060000a1af700 0x0000000000812208 vtable for seastar::continuation<seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >, seastar::httpd::function_handler::function_handler(std::function<seastar::future<seastar::json::json_return_type> (std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)> const&)::{lambda(std::unique_ptr<seastar::httpd::request, 
std::default_delete<seastar::httpd::request> >, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}::operator()(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >) const::{lambda(seastar::json::json_return_type&&)#1}, seastar::future<seastar::json::json_return_type>::then_impl_nrvo<seastar::json::json_return_type&&, seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > > >(seastar::json::json_return_type&&)::{lambda(seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&, seastar::json::json_return_type&, seastar::future_state<seastar::json::json_return_type>&&)#1}, seastar::json::json_return_type> + 16 [shard 0] #4 (task) 0x0000600009d86440 0x0000000000812228 vtable for seastar::continuation<seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >, seastar::httpd::function_handler::handle(seastar::basic_sstring<char, unsigned int, 15u, true> const&, std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)::{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}, seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >::then_impl_nrvo<{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}, seastar::future>({lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}&&)::{lambda(seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&, 
{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}&, seastar::future_state<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&)#1}, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > > + 16 [shard 0] #5 (task) 0x0000600009dba0c0 0x0000000000812f48 vtable for seastar::continuation<seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >, seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >::handle_exception<std::function<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > (std::__exception_ptr::exception_ptr)>&>(std::function<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > (std::__exception_ptr::exception_ptr)>&)::{lambda(auto:1&&)#1}, seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >::then_wrapped_nrvo<seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >, {lambda(auto:1&&)#1}>({lambda(auto:1&&)#1}&&)::{lambda(seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&, {lambda(auto:1&&)#1}&, seastar::future_state<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&)#1}, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > > + 16 [shard 0] #6 (task) 0x0000600026783ae0 0x00000000008118b0 vtable for seastar::continuation<seastar::internal::promise_base_with_type<bool>, seastar::httpd::connection::generate_reply(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)::{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}, 
seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >::then_impl_nrvo<{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}, seastar::httpd::connection::generate_reply(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)::{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}<bool> >({lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}&&)::{lambda(seastar::internal::promise_base_with_type<bool>&&, {lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}&, seastar::future_state<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&)#1}, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > > + 16 [shard 0] #7 (task) 0x000060000a4089c0 0x0000000000811790 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::httpd::connection::read_one()::{lambda()#1}::operator()()::{lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<std::unique_ptr> >)#2}::operator()(std::default_delete<std::unique_ptr>) const::{lambda(std::default_delete<std::unique_ptr>)#1}::operator()(std::default_delete<std::unique_ptr>) const::{lambda(bool)#2}, seastar::future<bool>::then_impl_nrvo<{lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<std::unique_ptr> >)#2}, {lambda(std::default_delete<std::unique_ptr>)#1}<void> >({lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<std::unique_ptr> >)#2}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<std::unique_ptr> >)#2}&, seastar::future_state<bool>&&)#1}, bool> + 16 [shard 0] #8 (task) 0x000060000a5b16e0 0x0000000000811430 vtable for 
seastar::internal::do_until_state<seastar::httpd::connection::read()::{lambda()#1}, seastar::httpd::connection::read()::{lambda()#2}> + 16 [shard 0] #9 (task) 0x000060000aec1080 0x00000000008116d0 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::httpd::connection::read()::{lambda(seastar::future<void>)#3}, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, {lambda(seastar::future<void>)#3}>({lambda(seastar::future<void>)#3}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(seastar::future<void>)#3}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #10 (task) 0x000060000b7d2900 0x0000000000811950 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<seastar::httpd::connection::read()::{lambda()#4}, true>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::httpd::connection::read()::{lambda()#4}>(seastar::httpd::connection::read()::{lambda()#4}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::httpd::connection::read()::{lambda()#4}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 Found no further pointers to task objects. If you think there should be more, run `scylla fiber 0x000060002d650200 --verbose` to learn more. Note that continuation across user-created seastar::promise<> objects are not detected by scylla-fiber. 
``` Closes #11822 * github.com:scylladb/scylladb: scylla-gdb.py: collection_element: add support for boost::intrusive::list scylla-gdb.py: optional_printer: eliminate infinite loop scylla-gdb.py: scylla-fiber: add note about user-instantiated promise objects scylla-gdb.py: scylla-fiber: reject self-references when probing pointers scylla-gdb.py: scylla-fiber: add starting task to known tasks scylla-gdb.py: scylla-fiber: add support for walking over when_all scylla-gdb.py: add when_all_state to task type whitelist scylla-gdb.py: scylla-fiber: also print shard of tasks scylla-gdb.py: scylla-fiber: unify task printing scylla-gdb.py: scylla fiber: add support for walking over shards scylla-gdb.py: scylla fiber: add support for walking over seastar threads scylla-gdb.py: scylla-ptr: keep current thread context scylla-gdb.py: improve scylla column_families scylla-gdb.py: scylla_sstables.filename(): fix generation formatting scylla-gdb.py: improve schema_ptr scylla-gdb.py: scylla memory: restore compatibility with <= 5.1	2022-11-03 13:52:31 +02:00
Nadav Har'El	b9d88a3601	cql/pytest: add reproducer for timestamp column validation issue This patch adds a reproducing test for issue #11588, which is still open, so the test is expected to fail on Scylla ("xfail") and pass on Cassandra. The test shows that Scylla allows an out-of-range value to be written to a timestamp column, but then it can't be read back. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11864	2022-11-01 08:11:01 +02:00
Kamil Braun	db6cc035ed	test/raft: raft_address_map_test: add replication test	2022-10-31 09:17:12 +01:00
Kamil Braun	159bb32309	service/raft: turn raft_address_map into a service	2022-10-31 09:17:10 +01:00
Botond Dénes	91516c1d68	scylla-gdb.py: improve scylla column_families Rename to scylla tables. Less typing and more up-to-date. By default it now only lists tables from local shard. Added flag -a which brings back old behaviour (lists on all shards). Added -u (only list user tables) and -k (list tables of provided keyspace only) filtering options.	2022-10-31 08:18:19 +02:00
Tomasz Grabiec	687df05e28	db: make_forwardable::reader: Do not emit range_tombstone_change with position past the range Since the end bound is exclusive, the end position should be before_key(), not after_key(). Affects only tests, as far as I know, only there we can get an end bound which is a clustering row position. Would cause failures once row cache is switched to v2 representation because of violated assumptions about positions. Introduced in `76ee3f029c` Closes #11823	2022-10-24 17:06:52 +03:00
Botond Dénes	e981bd4f21	Merge 'Alternator, MV: fix bug in some view updates which set the view key to its existing value' from Nadav Har'El As described in issue #11801, we saw in Alternator, when a GSI has both partition and sort keys which were non-key attributes in the base, cases where updating the GSI-sort-key attribute to the same value it already had caused the entire GSI row to be deleted. This series fixes this bug (it was a bug in our materialized views implementation) and adds a reproducing test (plus a few more tests for similar situations which worked before the patch, and continue to work after it). Fixes #11801 Closes #11808 * github.com:scylladb/scylladb: test/alternator: add test for issue 11801 MV: fix handling of view update which reassign the same key value materialized views: inline used-once and confusing function, replace_entry()	2022-10-21 10:49:28 +03:00
Avi Kivity	9ebac12e60	test: mutation-test: fix off-by-one in test_large_collection_allocation The test wants to see that no allocations larger than 128k are present, but sets the warning threshold to exactly 128k. Due to an off-by-one in Seastar, this went unnoticed. However, now that the off-by-one in Seastar is fixed [1], this test starts to fail. Fix by setting the warning threshold to 128k + 1. [1] `429efb5086` Closes #11817	2022-10-21 10:04:40 +03:00
Avi Kivity	db79f1eb60	Merge 'cql3: expr: Add unit tests for evaluate()' from Jan Ciołek This PR adds some unit tests for the `expr::evaluate()` function. At first I wanted to add the unit tests as part of #11658, but their size grew and grew, until I decided that they deserve their own pull request. I found a few places where I think it would be better to behave in a different way, but nothing serious. Closes #11815 * github.com:scylladb/scylladb: test/boost: move expr_test_utils.hh to .hh and .cc in test/lib cql3: expr: Add unit tests for bind_variable validation of collections cql3: expr: Add test for subscripted list and map cql3: expr: Add test for usertype_constructor cql3: expr: Add test for tuple_constructor cql3: expr: Add tests for evaluation of collection constructors cql3: expr: Add tests for evaluation of column_values and bind_variables cql3: expr: Add constant evaluation tests test/boost: Add expr_test_utils.hh cql3: Add ostream operator for raw_value cql3: add is_empty_value() to raw_value and raw_value_view	2022-10-20 22:55:34 +03:00
Jan Ciolek	4c4ed8e6df	test/boost: move expr_test_utils.hh to .hh and .cc in test/lib expr_test_utils.hh was a header file with helper methods for expression tests. All functions were inline, because I didn't know how to create and link a .cc file in test/boost. Now the header is split into expr_test_utils.hh and expr_test_utils.cc and moved to test/lib, which is designed to hold this kind of file. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-10-20 17:31:37 +02:00
Avi Kivity	6ce659be5b	Merge "Deglobalize snitch" from Pavel E " Snitch was the junction of several services' deps because it was the holder of endpoint->dc/rack mappings. Now this information is all on topology object, so snitch can be finally made main-local " * 'br-deglobalize-snitch' of https://github.com/xemul/scylla: code: Deglobalize snitch tests: Get local reference on global snitch instance once gossiper: Pass current snitch name into checker snitch: Add sharded<snitch_ptr> arg to reset_snitch() api: Move update_snitch endpoint api: Use local snitch reference api: Unset snitch endpoints on stop storage_service: Keep local snitch reference system_keyspace: Don't use global snitch instance snitch: Add const snitch_ptr::operator->()	2022-10-20 16:51:24 +03:00
Konstantin Osipov	8c920add42	test: (pytest) fix the pytest wrapper to work on Ubuntu Ubuntu doesn't have python, only python2 and python3. Closes #11810	2022-10-20 15:53:24 +03:00

3857 Commits