scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 03:20:37 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	89a93a784e	tablets: Do not allocate tablets on nodes being decommissioned If a tablet-based table is created concurrently with a node being decommissioned after tablets are already drained, the new table may be permanently left with replicas on the node which is no longer in the topology. That creates an immediate availability risk because we are running with one replica down. This also violates invariants about replica placement and this state cannot be fixed by topology operations. One effect is that this will lead to load balancer failure which will inhibit progress of any topology operations: load_balancer - Replica 154b0380-1dd2-11b2-9fdd-7156aa720e1a:0 of tablet 7e03dd40-537b-11ef-9fdd-7156aa720e1a:1 not found in topology, at: ... Fixes #20032 (cherry picked from commit `f5c74a5df2`) Closes scylladb/scylladb#20067	2024-08-08 11:57:09 +03:00
Dawid Medrek	d065d6f05d	db/hints: Log when ignoring invalid hint directories In `58784cd`, `aa4b06a` and other commits migrating hinted handoff from IPs to host IDs (scylladb/scylladb#15567), we started ignoring hint directories of invalid names, i.e. those that represent neither an IP address, nor a host ID. They remain on disk and are taken into account while computing e.g. the total size of hints, but they're not used in any way. These changes add logs informing the user when Scylla encounters such a directory. Closes scylladb/scylladb#17566 (cherry picked from commit `a5528a2093`) Closes scylladb/scylladb#19892	2024-08-07 10:55:06 +02:00
Michael Litvak	ccd01caed8	db: fix waiting for counter update operations on table stop When a table is dropped it should wait for all pending operations in the table before the table is destroyed, because the operations may use the table's resources. With counter update operations, currently this is not the case. The table may be destroyed while there is a counter update operation in progress, causing an assert to be triggered due to a resource being destroyed while it's in use. The reason the operation is not waited for is a mistake in the lifetime management of the object representing the write in progress. The commit fixes it so the object lives for the duration of the entire counter update operation, by moving it to the `do_with` list. Fixes scylladb/scylla-enterprise#4475 Closes scylladb/scylladb#20017	2024-08-05 12:52:32 +02:00
Piotr Dulikowski	78e3f0f208	Merge '[Backport 6.0] hinted handoff: migrate sync point to host ID ' from Dawid Mędrek Change the format of sync points to use host ID instead of IPs, to be consistent with the use of host IDs in hinted handoff module. Introduce sync point v3 format which is the same as v2 except it stores host IDs instead of IPs. The decoding supports both formats with host IDs and IPs, so a sync point contains now a variant of either types, and in the case of new type the translation is avoided. Fixes scylladb/scylladb#18653 (cherry picked from commit scylladb/scylladb@b824e73) (cherry picked from commit scylladb/scylladb@afc9a1a) (cherry picked from commit scylladb/scylladb@c56de90) (cherry picked from commit scylladb/scylladb@222dbf2) In scylladb/scylladb#18733, we were experiencing a test failure because the test code was receiving the reply `"DONE"` instead of `"IN_PROGRESS"` when awaiting a sync point. The cluster consisted of two nodes and the last few steps of the test that are relevant were: 1. Stop node 2. 2. Enable an error injection on node 1 to prevent it from sending hints. 3. Perform mutations on node 1 leading it to save hints towards node 2. 4. Start node 2 again. 5. Create a sync point on node 1. 6. Decommission node 2. 7. Await the created sync point on node 1. Decommissioning node 2 led to node 1 trying to drain hints saved towards it. However, due to the error injection, the draining process was stuck and never finished. Because of that, when node 1 received a request to await the sync point, the hint endpoint manager corresponding to node 2 was still present -- all of that was expected by the test. What was unexpected by the test was the fact that now that hinted handoff has started identifying nodes by their host IDs, but sync points themselves still used IP addresses internally, there had to be a point in the code where mapping one data type to the other would happen. That place in the code is `manager::wait_for_sync_point()`. 
When a node is decommissioned/removed, its host ID--IP mapping is removed from the locator::token_metadata. Since node 2 had been decommissioned, we no longer had access to the mapping we needed and so the code used the "default" replay position, which, when compared, is smaller than any other replay position except for itself. Because of that, Scylla thought that all of the hints corresponding to the sync point it got had been replayed and returned `"DONE"` to the test's code, effectively leading to its failure. These changes prevent that from happening as we start using host IDs in the internal format used by sync points. Similar failures might still occur if a sync point is created before the migration to host-ID-based hinted handoff takes place, but awaited only after the migration. However, the chances that that would happen are quite slim. The test itself should proceed without any failures now. Fixes scylladb/scylladb#18733 Closes scylladb/scylladb#19967 * github.com:scylladb/scylladb: test/boost: include test/lib/test_utils.hh test/boost/hint_test.cc: Add missing parse() callback db/hints: migrate sync point to host ID db/hints: rename sync point structures with _v1 suffix to _v1_v2	2024-08-05 09:46:48 +02:00
Kefu Chai	e1dab2779d	test/boost: include test/lib/test_utils.hh this change was created in the same spirit of 505900f18f. because we are deprecating the operator<< for vector and unordered_map in Seastar, some tests do not compile anymore if we disable these operators. so to be prepared for the change disabling them, let's include test/lib/test_utils.hh for accessing the printer dedicated for Boost.test. and also '#include <fmt/ranges.h>' when necessary, because, in order to format the ranges using {fmt}, we need to use fmt/ranges.h. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-08-02 15:04:34 +02:00
Kamil Braun	7a2396867b	Merge '[Backport 6.0] raft: fix the shutdown phase being stuck' from Emil Maskovsky Some of the calls inside the raft_group0_client::start_operation() method were missing the abort source parameter. This caused the repair test to be stuck in the shutdown phase - the abort source has been triggered, but the operations were not checking it. This was in particular the case of operations that try to take the ownership of the raft group semaphore (get_units(semaphore)) - these waits should be cancelled when the abort source is triggered. This should fix the following tests that were failing in some percentage of dtest runs (about 1-3 of 100): * TestRepairAdditional::test_repair_kill_1 * TestRepairAdditional::test_repair_kill_3 Fixes #19223 (cherry picked from commit `2dbe9ef2f2`) (cherry picked from commit `5dfc50d354`) Refs #19860 Closes scylladb/scylladb#19985 * github.com:scylladb/scylladb: raft: fix the shutdown phase being stuck raft: use the abort source reference in raft group0 client interface	2024-08-02 11:25:37 +02:00
Emil Maskovsky	b99d87863d	raft: fix the shutdown phase being stuck Some of the calls inside the `raft_group0_client::start_operation()` method were missing the abort source parameter. This caused the repair test to be stuck in the shutdown phase - the abort source has been triggered, but the operations were not checking it. This was in particular the case of operations that try to take the ownership of the raft group semaphore (`get_units(semaphore)`) - these waits should be cancelled when the abort source is triggered. This should fix the following tests that were failing in some percentage of dtest runs (about 1-3 of 100): * TestRepairAdditional::test_repair_kill_1 * TestRepairAdditional::test_repair_kill_3 Fixes scylladb/scylladb#19223 (cherry picked from commit `5dfc50d354`)	2024-08-01 19:37:02 +02:00
Emil Maskovsky	0770069dda	raft: use the abort source reference in raft group0 client interface Most callers of the raft group0 client interface are passing a real source instance, so we can use the abort source reference in the client interface. This change makes the code simpler and more consistent. (cherry picked from commit `2dbe9ef2f2`)	2024-08-01 19:36:00 +02:00
Dawid Medrek	13183069f7	test/boost/hint_test.cc: Add missing parse() callback Before these changes, compilation was failing with the following error: In file included from test/boost/hint_test.cc:12: /usr/include/fmt/ranges.h:298:7: error: no member named 'parse' in 'fmt::formatter<db::hints::sync_point::host_id_or_addr>' 298 \| f.parse(ctx); \| ~ ^ We add the missing callback. Closes scylladb/scylladb#19375	2024-08-01 14:49:36 +02:00
Michael Litvak	df0503afd6	db/hints: migrate sync point to host ID Change the format of sync points to use host ID instead of IPs, to be consistent with the use of host IDs in hinted handoff module. Introduce sync point v3 format which is the same as v2 except it stores host IDs instead of IPs. The encoding of sync points now always uses the new v3 format with host IDs. The decoding supports both formats with host IDs and IPs, so a sync point contains now a variant of either types, and in the case of the new format the translation from IP to host ID is avoided.	2024-07-31 18:00:28 +02:00
Michael Litvak	42ee9f9e59	db/hints: rename sync point structures with _v1 suffix to _v1_v2 rename sync point types and variables to have v1/v2 suffix according to their use.	2024-07-31 17:59:08 +02:00
Kamil Braun	9572674f25	docs: extend "forbidden operations" section for Raft-topology upgrade The Raft-topology upgrade procedure must not be run concurrently with version upgrade. (cherry picked from commit `bb0c3cdc65`) Closes scylladb/scylladb#19837	2024-07-29 16:53:01 +02:00
Tomasz Grabiec	416cbafd16	Merge '[Backport 6.0] sstables: fix some mixups between the writer's schema and the sstable's schema' from Michał Chojnowski There are two schemas associated with a sstable writer: the sstable's schema (i.e. the schema of the table at the time when the sstable object was created), and the writer's schema (equal to the schema of the reader which is feeding into the writer). It's easy to mix up the two and break something as a result. The writer's schema is needed to correctly interpret and serialize the data passing through the writer, and to populate the on-disk metadata about the on-disk schema. The sstables's schema is used to configure some parameters for newly created sstable, such as bloom filter false positive ratio, or compression. This series fixes the known mixups between the two — when setting up compression, and when setting up the bloom filters. Fixes scylladb/scylladb#16065 The bug is present in all supported versions, so the patch has to be backported to all of them. (cherry picked from commit `a1834efd82`) (cherry picked from commit `d10b38ba5b`) (cherry picked from commit `1a8ee69a43`) Refs scylladb/scylladb#19695 Closes scylladb/scylladb#19877 * github.com:scylladb/scylladb: sstables/mx/writer: when creating local_compression, use the sstables's schema, not the writer's sstables/mx/writer: when creating filter, use the sstables's schema, not the writer's sstables: for i_filter downcasts, use dynamic_cast instead of static_cast	2024-07-29 15:36:52 +02:00
Jenkins Promoter	36cb61589d	Update ScyllaDB version to: 6.0.3	2024-07-29 15:21:14 +03:00
Takuya ASADA	fefa76bffc	scylla_raid_setup: install update-initramfs when it's not available scylla_raid_setup may fail on Ubuntu minimal image since it calls update-initramfs without installing. (cherry picked from commit `b6dedf1ee1`) Closes scylladb/scylladb#19871	2024-07-25 13:58:11 +03:00
Michał Chojnowski	43ba44ce97	sstables/mx/writer: when creating local_compression, use the sstables's schema, not the writer's There are two schemas associated with a sstable writer: the sstable's schema (i.e. the schema of the table at the time when the sstable object was created), and the writer's schema (equal to the schema of the reader which is feeding into the writer). It's easy to mix up the two and break something as a result. The writer's schema is needed to correctly interpret and serialize the data passing through the writer, and to populate the on-disk metadata about the on-disk schema. The sstables's schema is used to configure some parameters for newly created sstable, such as bloom filter false positive ratio, or compression. The problem fixed by this patch is that the writer was wrongly creating the compressor objects based on its own schema, but using them based on the sstable's schema. This patch forces the writer to use the sstable's schema for both. (cherry picked from commit `1a8ee69a43`)	2024-07-25 12:23:58 +02:00
Michał Chojnowski	d6d3a91283	sstables/mx/writer: when creating filter, use the sstables's schema, not the writer's There are two schemas associated with a sstable writer: the sstable's schema (i.e. the schema of the table at the time when the sstable object was created), and the writer's schema (equal to the schema of the reader which is feeding into the writer). It's easy to mix up the two and break something as a result. The writer's schema is needed to correctly interpret and serialize the data passing through the writer, and to populate the on-disk metadata about the on-disk schema. The sstables's schema is used to configure some parameters for newly created sstable, such as bloom filter false positive ratio, or compression. The problem fixed by this patch is that the writer was wrongly creating the filter based on its own schema, while the layer outside the writer was interpreting it as if it was created with the sstable's schema. This patch forces the writer to pick the filter's parameters based on the sstable's schema instead. (cherry picked from commit `d10b38ba5b`)	2024-07-25 12:23:58 +02:00
Michał Chojnowski	0555d4c30b	sstables: for i_filter downcasts, use dynamic_cast instead of static_cast As of this patch, those static_casts are actually invalid in some cases (they cast to the wrong type) because of an oversight. A later patch will fix that. But to even write a reliable reproducer for the problem, we must force the invalid casts to manifest as a crash (instead of weird results). This patch both allows writing a reproducer for the bug and serves as a bit of defensive programming for the future. (cherry picked from commit `a1834efd82`) # Conflicts: # sstables/sstables.cc	2024-07-25 12:23:58 +02:00
Nadav Har'El	af39675c38	alternator: fix "/localnodes" to not return nodes still joining Alternator's "/localnodes" HTTP request is supposed to return the list of nodes in the local DC to which the user can send requests. The existing implementation incorrectly used gossiper::is_alive() to check for which nodes to return - but "alive" nodes include nodes which are still joining the cluster and not really usable. These nodes can remain in the JOINING state for a long time while they are copying data, and an attempt to send requests to them will fail. The fix for this bug is trivial: change the call to is_alive() to a call to is_normal(). But the hard part of this patch is the testing: 1. An existing multi-node test for "/localnodes" assumed that right after a new node was created, it appears on "/localnodes". But after this patch, it may take a bit more time for the bootstrapping to complete and the new node to appear in /localnodes - so I had to add a retry loop. 2. I added a test that reproduces the bug fixed here, and verifies its fix. The test is in the multi-node topology framework. It adds an injection which delays the bootstrap, which leaves a new node in JOINING state for a long time. The test then verifies that the new node is alive (as checked by the REST API), but is not returned by "/localnodes". 3. The new injection for delaying the bootstrap is unfortunately not very pretty - I had to do it in three places because we have several code paths of how bootstrap works without repair, with repair, without Raft and with Raft - and I wanted to delay all of them. Fixes #19694. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#19725 (cherry picked from commit `bac7c33313`) (deleted test for cherry-pick)	2024-07-24 11:29:36 +03:00
Lakshmi Narayanan Sreethar	3c1fd843c8	[Backport 6.0]: sstables: do not reload components of unlinked sstables The SSTable is removed from the reclaimed memory tracking logic only when its object is deleted. However, there is a risk that the Bloom filter reloader may attempt to reload the SSTable after it has been unlinked but before the SSTable object is destroyed. Prevent this by removing the SSTable from the reclaimed list maintained by the manager as soon as it is unlinked. The original logic that updated the memory tracking in `sstables_manager::deactivate()` is left in place as (a) the variables have to be updated only when the SSTable object is actually deleted, as the memory used by the filter is not freed as long as the SSTable is alive, and (b) the `_reclaimed.erase(sst)` is still useful during shutdown, for example, when the SSTable is not unlinked but just destroyed. Fixes https://github.com/scylladb/scylladb/issues/19722 Closes scylladb/scylladb#19717 github.com:scylladb/scylladb: boost/bloom_filter_test: add testcase to verify unlinked sstables are not reloaded sstables: do not reload components of unlinked sstables sstables/sstables_manager: introduce on_unlink method (cherry picked from commit `591876b44e`) Backported from #19717 to 6.0 Closes scylladb/scylladb#19830	2024-07-23 23:16:53 +03:00
Piotr Dulikowski	9cc20e7c4d	Merge '[Backport 6.0] schema: fix describe of indexes on collections' from ScyllaDB If the index was created on collection (both frozen or not), its description wasn't a correct create statement. This patch fixes the bug and includes functions like `full()`, `keys()`, `values()`, ... used to create index on collections. Fixes scylladb/scylladb#19278 (cherry picked from commit `253feb6811`) (cherry picked from commit `b65a4c66f0`) Refs #19381 Closes scylladb/scylladb#19700 * github.com:scylladb/scylladb: cql-pytest/test_describe: add a test for describe indexes schema/schema: fix column names in index description	2024-07-22 12:33:47 +02:00
Kamil Braun	9efaca0bd2	Merge '[Backport 6.0] test: raft: fix the flaky `test_raft_recovery_stuck`' from ScyllaDB Use the rolling restart to avoid spurious driver reconnects. This can be eventually reverted once the scylladb/python-driver#295 is fixed. Fixes scylladb/scylladb#19154 (cherry picked from commit `ef3393bd36`) (cherry picked from commit `a89facbc74`) Refs #19771 Closes scylladb/scylladb#19809 * github.com:scylladb/scylladb: test: raft: fix the flaky `test_raft_recovery_stuck` test: raft: code cleanup in `test_raft_recovery_stuck`	2024-07-22 11:14:44 +02:00
Emil Maskovsky	62c9709f4a	test: raft: fix the flaky `test_raft_recovery_stuck` Use the rolling restart to avoid spurious driver reconnects. This can be eventually reverted once the scylladb/python-driver#295 is fixed. Fixes scylladb/scylladb#19154 (cherry picked from commit `a89facbc74`)	2024-07-20 02:17:50 +00:00
Emil Maskovsky	64d414f10a	test: raft: code cleanup in `test_raft_recovery_stuck` Cleaning up the imports. (cherry picked from commit `ef3393bd36`)	2024-07-20 02:17:50 +00:00
Kamil Braun	f32ed716ed	Merge '[Backport 6.0] Fix lwt semaphore guard accounting' from ScyllaDB Currently the guard does not account correctly for ongoing operation if semaphore acquisition fails. It may signal a semaphore when it is not held. Should be backported to all supported versions. (cherry picked from commit `87beebeed0`) (cherry picked from commit `4178589826`) Refs #19699 Closes scylladb/scylladb#19796 * github.com:scylladb/scylladb: test: add test to check that coordinator lwt semaphore continues functioning after locking failures paxos: do not signal semaphore if it was not acquired	2024-07-19 19:06:36 +02:00
Gleb Natapov	c437c8be36	test: add test to check that coordinator lwt semaphore continues functioning after locking failures (cherry picked from commit `4178589826`)	2024-07-18 15:34:17 +00:00
Gleb Natapov	1c04b95c68	paxos: do not signal semaphore if it was not acquired The guard signals a semaphore during destruction if it is marked as locked, but currently it may be marked as locked even if locking failed. Fix this by using semaphore_units instead of managing the locked flag manually. Fixes: https://github.com/scylladb/scylladb/issues/19698 (cherry picked from commit `87beebeed0`)	2024-07-18 15:34:16 +00:00
Emil Maskovsky	5649b55e08	test: raft: fix the flaky `test_change_ip` The python driver might currently trigger spurious reconnects that cause the `NoHostAvailable` to be thrown, which is not expected. This patch adds a retry mechanism to the test to skip this failure if it occurs, as a work-around. The proper fix is expected to be done in the scylladb/python-driver#295, once fixed there this work-around can be reverted. Fixes: scylladb/scylla#18547 (cherry picked from commit `6b9992737a`) Closes scylladb/scylladb#19773	2024-07-18 15:06:23 +02:00
Emil Maskovsky	b4745406da	raft: Fix crash in leader_host API handler The leader_host API handler was eventually using the `req` unique_ptr after it had already been destroyed (passed down to the future lambda by reference). This was causing an occasional crash in some tests. Reworked the leader_host handler to use the req only outside of the future lambda. Also updated the code to handle the possibility that the non-default leader group (other than Group 0) might reside on a different shard than the shard 0 - using the same concept of calling on all shards via `invoke_on_all()` as done for the other requests. Fixes scylladb/scylladb#19714 (cherry picked from commit `b2db8f4b9b`) Closes scylladb/scylladb#19742	2024-07-16 13:29:37 +02:00
Anna Stuchlik	27faec3015	doc: replace a link on the CDC+Kafka page This commit replaces a link to the installation section with a link to the getting started section. (cherry picked from commit `f90867c740`) Closes scylladb/scylladb#19712	2024-07-16 13:15:45 +02:00
Emil Maskovsky	06c356df8f	test: raft: fix the topology failure recovery test flakiness Setting the error condition for all nodes in the cluster to avoid having to check which one is the coordinator. This should make the test more stable and avoid the flakiness observed when the coordinator node is the one that got the error condition injected. Randomizing the retrieved running servers to reproduce the issue more frequently and to avoid making any assumptions about the order of the servers. Note that only the "raft_topology_barrier_fail" needs to run on a non-coordinator node, the other error "stream_ranges_fail" can be injected on any node (including the coordinator). Fixes: #18614 (cherry picked from commit `9dbad34205`) Closes scylladb/scylladb#19708	2024-07-15 16:27:22 +02:00
Michael Litvak	815a707b0a	storage_proxy: remove response handler if no targets When writing a mutation, it might happen that there are no live targets to send the mutation to, yet the request can be satisfied. For example, when writing with CL=ANY to a dead node, the request is completed by storing a local hint. Currently, in that case, a write response handler is created for the request and it remains active until it times out because it is not removed anywhere, even though the write is completed successfully after storing the hint. The response handler should be removed usually when receiving responses from all targets, but in this case there are no targets to trigger the removal. In this commit we check if we don't have live targets to send the mutation to. If so, we remove the response handler immediately. Fixes scylladb/scylladb#19529 (cherry picked from commit `a9fdd0a93a`) Closes scylladb/scylladb#19680	2024-07-15 08:24:18 +02:00
Botond Dénes	16452f9cf5	Merge '[Backport 6.0] scylla-sstable: add method to load the schema from the sstable itself' from ScyllaDB As it turns out, each sstable carries its own schema in its serialization header (Statistics component). This schema is incomplete -- the names of the key columns are not stored, just their type. Static and regular columns do have names and types stored however. This bare-bones schema is enough to parse and display the content of the sstable. Another thing missing is schema options (the stuff after the `WITH` keyword, except the clustering order). The only options stored are the compression options (in the CompressionInfo component), this is actually needed to read the Data component. This series adds a new method to `tools/schema_loader.cc` to extract the schema stored in the sstable itself. This new schema load method is used as the last fall-back for obtaining the schema, in case scylla-sstable is trying to autodetect the schema of the sstable. Although, right now this bare-bones schema is enough for everything scylla-sstable does, it is more future proof to stick to the "full" schema if possible, so this new method is the last resort for now. Fixes: https://github.com/scylladb/scylladb/issues/17869 Fixes: https://github.com/scylladb/scylladb/issues/18809 New functionality, no backport needed. (cherry picked from commit `435c01d1e6`) (cherry picked from commit `0d7335dd27`) (cherry picked from commit `8f2ba03465`) (cherry picked from commit `43c44f0af5`) (cherry picked from commit `145a67f77c`) Refs #19169 Closes scylladb/scylladb#19711 * github.com:scylladb/scylladb: tools/scylla-sstable: log loaded schema with trace level tools/scylla-sstable: load schema from the sstable as fallback tools/schema_loader: introduce load_schema_from_sstable() test/lib/random_schema: remove assert on min number of regular columns sstables: introduce load_metadata()	2024-07-12 16:55:44 +03:00
Botond Dénes	5d94a08250	tools/scylla-sstable: log loaded schema with trace level The schema of the sstable can be interesting, so log it with trace level. Unfortunately, this is not the nice CQL statement we are used to (that requires a database object), but the not-nearly-so-nice CFMetadata printout. Still, it is better than nothing. (cherry picked from commit `145a67f77c`)	2024-07-12 10:36:59 +00:00
Botond Dénes	4f74e6f28e	tools/scylla-sstable: load schema from the sstable as fallback When auto-detecting the schema of the sstable, if all other methods failed, load the schema from the sstable's serialization header. This schema is incomplete. It is just enough to parse and display the content of the sstable. Although parsing and displaying the content of the sstable is all scylla-sstable does, it is more future-compatible to use the full schema when possible. So the always-available but minimal schema that each sstable has on itself is used just as a fallback. The test which tested the case when all schema load attempts fail doesn't work now, because loading the serialization header always succeeds. So convert this test into two positive tests, testing the serialization header schema fallback instead. (cherry picked from commit `43c44f0af5`)	2024-07-12 10:36:59 +00:00
Botond Dénes	f42e8e872a	tools/schema_loader: introduce load_schema_from_sstable() Allows loading the schema from an sstable's serialization header. This schema is incomplete, but it is enough to parse and display the content of the sstable. (cherry picked from commit `8f2ba03465`)	2024-07-12 10:36:59 +00:00
Botond Dénes	f7c8c32929	test/lib/random_schema: remove assert on min number of regular columns It is legal for a schema to have 0 regular columns, so remove the assert on the schema specification's regular column count. (cherry picked from commit `0d7335dd27`)	2024-07-12 10:36:59 +00:00
Botond Dénes	4f165eb3e9	sstables: introduce load_metadata() Loads just the metadata components. No validation. Split off from load(), to allow scylla-sstable to partially load an sstable. (cherry picked from commit `435c01d1e6`)	2024-07-12 10:36:59 +00:00
Michał Jadwiszczak	25f8fd0b5c	cql-pytest/test_describe: add a test for describe indexes (cherry picked from commit `b65a4c66f0`)	2024-07-11 12:59:27 +00:00
Michał Jadwiszczak	67764e7d66	schema/schema: fix column names in index description Previously description of index didn't include functions for indexes on collections like full(), keys(), values(), etc... (cherry picked from commit `253feb6811`)	2024-07-11 12:59:27 +00:00
Tomasz Grabiec	43ff19273c	Merge '[Backport 6.0] mutation_partition_v2: in apply_monotonically(), avoid bad_alloc on sentinel insertion' from ScyllaDB apply_monotonically() is run with reclaim disabled. So with some bad luck, sentinel insertion might fail with bad_alloc even on a perfectly healthy node. We can't deal with the failure of sentinel insertion, so this will result in a crash. This patch prevents the spurious OOM by reserving some memory (1 LSA segment) and only making it available right before the critical allocations. Fixes https://github.com/scylladb/scylladb/issues/19552 (cherry picked from commit `f784be6a7e`) (cherry picked from commit `7b3f55a65f`) (cherry picked from commit `78d6471ce4`) Refs #19617 Closes scylladb/scylladb#19675 * github.com:scylladb/scylladb: mutation_partition_v2: in apply_monotonically(), avoid bad_alloc on sentinel insertion logalloc: add hold_reserve logalloc: generalize refill_emergency_reserve()	2024-07-10 14:28:01 +02:00
Michał Chojnowski	aee0150506	mutation_partition_v2: in apply_monotonically(), avoid bad_alloc on sentinel insertion apply_monotonically() is run with reclaim disabled. So with some bad luck, sentinel insertion might fail with bad_alloc even on a perfectly healthy node. We can't deal with the failure of sentinel insertion, so this will result in a crash. This patch prevents the spurious OOM by reserving some memory (1 LSA segment) and only making it available right before the critical allocations. Fixes scylladb/scylladb#19552 (cherry picked from commit `78d6471ce4`)	2024-07-10 08:36:11 +00:00
Michał Chojnowski	c5c19e90ac	logalloc: add hold_reserve mutation_partition_v2::apply_monotonically() needs to perform some allocations in a destructor, to ensure that the invariants of the data structure are restored before returning. But it is usually called with reclaiming disabled, so the allocations might fail even in a perfectly healthy node with plenty of reclaimable memory. This patch adds a mechanism which allows to reserve some LSA memory (by asking the allocator to keep it unused) and make it available for allocation right when we need to guarantee allocation success. (cherry picked from commit `7b3f55a65f`)	2024-07-10 08:36:11 +00:00
Michał Chojnowski	985f5a50f6	logalloc: generalize refill_emergency_reserve() In the next patch, we will want to do the same thing as refill_emergency_reserve() does, just with a quantity different than _emergency_reserve_max. So we split off the shareable part to a new function, and use it to implement refill_emergency_reserve(). (cherry picked from commit `f784be6a7e`)	2024-07-10 08:36:11 +00:00
Botond Dénes	ae11381d7c	Merge '[Backport 6.0] reader_concurrency_semaphore: make CPU concurrency configurable' from Botond Dénes The reader concurrency semaphore restricts the concurrency of reads that require CPU (intention: they read from the cache) to 1, meaning that if there is even a single active read which declares that it needs just CPU to proceed, no new read is admitted. This is meant to keep the concurrency of reads in the cache at 1. The idea is that concurrency in the cache is not useful: it just leads to the reactor rotating between these reads, all of them finishing later than they could if they were the only active read in the cache. This was observed to backfire in the case where reads from a single table are mostly very fast, but on some keys are very slow (hint: collection full of tombstones). In this case the slow read holds up the fast reads in the queue, increasing the 99th percentile latencies significantly. This series proposes to fix this, by making the CPU concurrency configurable. We don't like tunables like this and this is not a proper fix, but a workaround. The proper fix would be to allow to cut any page early, but we cannot cut a page in the middle of a row. We could maybe have a way of detecting slow reads and excluding them from the CPU concurrency. This would be a heuristic and it would be hard to get right. So in this series a robust and simple configurable is offered, which can be used on those few clusters which do suffer from the too strict concurrency limit. We have seen it in very few cases so far, so this doesn't seem to be wide-spread. Fixes: https://github.com/scylladb/scylladb/issues/19017 This PR backports https://github.com/scylladb/scylladb/pull/19018 and its follow-up https://github.com/scylladb/scylladb/pull/19600. 
Closes scylladb/scylladb#19644 * github.com:scylladb/scylladb: reader_concurrency_semaphore: execution_loop(): move maybe_admit_waiters() to the inner loop test/boost/reader_concurrency_semaphore_test: add test for live-configurable cpu concurrency test/boost/reader_concurrency_semaphore_test: hoist require_can_admit reader_concurrency_semaphore: wire in the configurable cpu concurrency reader_concurrency_semaphore: add cpu_concurrency constructor parameter db/config: introduce reader_concurrency_semahore_cpu_concurrency	2024-07-10 07:23:08 +03:00
Anna Stuchlik	4ec5a06101	doc: update Scylla Doctor installation This commit updates the instructions on how to download and run Scylla Doctor, following the changes in how Scylla Doctor is released. (cherry picked from commit `2ffda9b262`) Closes scylladb/scylladb#19525	2024-07-09 14:32:21 +03:00
Anna Stuchlik	dcf4c757b2	doc: remove support for Debian 10 This PR removes support for Debian 10, which reached end of life on June 30, 2024. Refs https://github.com/scylladb/scylla-enterprise/issues/4377 (cherry picked from commit `1f340428ea`) Closes scylladb/scylladb#19630	2024-07-09 12:55:11 +02:00
Wojciech Przytuła	a7fe9eeffd	storage_proxy: fix uninitialized LWT contention counter When debugging the issue of high LWT contention metric, we (the drivers team) discovered that at least 3 drivers (Go, Java, Rust) cause high numbers in that metric in LWT workloads - we doubted that all those drivers route LWT queries badly. We tried to understand that metric and its semantics. It took 3 people over 10 hours to figure out what it is supposed to count. People from the core team suspected that it was the drivers sending requests to different shards, causing contention. Then we ran the workload against a single-node, single-shard cluster... and observed contention. Finally, we looked into the Scylla code and saw it. Uninitialized stack value. The core member was shocked. But we, the drivers people, felt we always knew it. It's yet another time that we are blamed for a server-side issue. We rebuilt scylla with the variable initialized to 0 and the metric kept being 0. To prevent such errors in the future, let's consider some lints that warn against uninitialized variables. This is such an obvious feature of e.g. Rust, and yet this has been shown to cause a painful bug in 2024. Fixes: scylladb/scylladb#19654 (cherry picked from commit `36a125bf97`) Closes scylladb/scylladb#19657	2024-07-09 11:41:10 +02:00
Michael Litvak	ad6eb1cadf	view: drain view builder before database The view builder performs write operations on the database. In order for the view builder to shut down gracefully without errors, we need to ensure the database can handle writes while the view builder is drained. The commit changes the drain order, so that the view builder is drained before the database shuts down. Fixes scylladb/scylladb#18929 (cherry picked from commit `9d9318c564`) Closes scylladb/scylladb#19636	2024-07-08 19:16:26 +02:00
Botond Dénes	dadc0c32e1	reader_concurrency_semaphore: execution_loop(): move maybe_admit_waiters() to the inner loop Now that the CPU concurrency limit is configurable, new reads might be ready to execute right after the current one was executed. So move the poll for admitting new reads into the inner loop, to prevent the situation where the inner loop yields and a concurrent do_wait_admission() finds that there are waiters (queued because at the time they arrived at the semaphore, the _ready_list was not empty) but it is possible to admit a new read. When this happens the semaphore will dump diagnostics to help debug the apparent contradiction, which can generate a lot of log spam. Moving the poll into the inner loop prevents the false-positive contradiction detection from firing. Refs: scylladb/scylladb#19017 Closes scylladb/scylladb#19600 (cherry picked from commit `155acbb306`)	2024-07-08 08:13:40 +03:00
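
The CPU concurrency limit discussed in the reader_concurrency_semaphore commits above can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_cpu_admission` and all of its members are hypothetical names invented for illustration and do not mirror ScyllaDB's actual `reader_concurrency_semaphore` code. It shows how a limit of 1 queues every read behind a single active one, while a higher limit lets fast reads proceed alongside a slow one.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical toy model of a configurable CPU concurrency limit.
// Not ScyllaDB's implementation; names are made up for illustration.
class toy_cpu_admission {
    std::size_t _cpu_concurrency;  // configurable limit (1 models the old fixed behaviour)
    std::size_t _active = 0;       // reads currently admitted and needing CPU
    std::deque<int> _waiters;      // ids of queued reads, in arrival order
public:
    explicit toy_cpu_admission(std::size_t cpu_concurrency)
        : _cpu_concurrency(cpu_concurrency) {
        assert(cpu_concurrency > 0);
    }

    // Returns true if the read is admitted immediately, false if it was queued.
    bool try_admit(int read_id) {
        if (_waiters.empty() && _active < _cpu_concurrency) {
            ++_active;
            return true;
        }
        _waiters.push_back(read_id);
        return false;
    }

    // Called when an active read finishes. Admits as many queued reads as the
    // limit allows and returns how many were admitted.
    std::size_t on_read_done() {
        if (_active > 0) {
            --_active;
        }
        std::size_t admitted = 0;
        while (!_waiters.empty() && _active < _cpu_concurrency) {
            _waiters.pop_front();
            ++_active;
            ++admitted;
        }
        return admitted;
    }

    std::size_t active() const { return _active; }
    std::size_t waiting() const { return _waiters.size(); }
};
```

With a limit of 1, a second `try_admit()` call is queued until `on_read_done()` runs; with a limit of 2, the second read is admitted right away. Polling for waiters every time a read completes is the toy analogue of moving the admission check into the inner loop.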

43053 Commits