scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-02 21:17:01 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	fa0077fb77	Merge 'S3 chunked download source bug fixes' from Ernest Zaslavsky - Fix missing negation in the `if` in the background downloading fiber - Add test to catch this case - Improve the s3 proxy to inject errors if the same resource is requested more than once - Suppress client retry since retrying the same request when each produces multiple buffers may lead to the same data appearing more than once in the buffer deque - Inject exception from the test to simulate response callback failure in the middle No need to backport anything since this class is not used yet Closes scylladb/scylladb#24657 * github.com:scylladb/scylladb: s3_test: Add s3_client test for non-retryable error handling s3_test: Add trace logging for default_retry_strategy s3_client: Fix edge case when the range is exhausted s3_client: Fix indentation in try..catch block s3_client: Stop retries in chunked download source s3_client: Enhance test coverage for retry logic s3_client: Add test for Content-Range fix s3_client: Fix missing negation s3_client: Refine logging s3_client: Improve logging placement for current_range output	2025-07-02 14:45:10 +03:00
Konstantin Osipov	37fc4edeb5	test.py: add a way to provide pytest arguments via test.py Now that we use a single pytest.ini for all tests, different developer preferences collide. There should be an easy way to override pytest.ini defaults from the command line. Fixes https://github.com/scylladb/scylladb/issues/21800 Closes scylladb/scylladb#24573	2025-07-02 12:20:43 +03:00
Avi Kivity	dfaed80f55	Merge 'types: add byte-comparable format support for native cql3 types' from Lakshmi Narayanan Sreethar This PR introduces a new `comparable_bytes` class to add byte-comparable format support for all the [native cql3 data types](https://opensource.docs.scylladb.com/stable/cql/types.html#native-types) except `counter` type as that is not comparable. The byte-comparable format is a pre-requisite for implementing the trie based index format for our sstables(https://github.com/scylladb/scylladb/issues/19191). This implementation adheres to the byte-comparable format specification in https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/bytecomparable/ByteComparable.md Note that support for composite data types like lists, maps, and sets has not been implemented yet and will be made available in a separate PR. Refs https://github.com/scylladb/scylladb/issues/19407 New feature - backport not required. Closes scylladb/scylladb#23541 * github.com:scylladb/scylladb: types/comparable_bytes: add testcase to verify compatibility with cassandra types/comparable_bytes: support variable-length natively byte-ordered data types types/comparable_bytes: support decimal cql3 types types/comparable_bytes: introduce count_digits() method types/comparable_bytes: support uuid and timeuuid cql3 types types/comparable_bytes: support varint cql3 type types/comparable_bytes: support skipping sign byte write in decode_signed_long_type types/comparable_bytes: introduce encode/decode_varint_length types/comparable_bytes: support float and double cql3 types types/comparable_bytes: support date, time and timestamp cql3 types types/comparable_bytes: support bigint cql3 type types/comparable_bytes: support fixed length signed integers types/comparable_bytes: support boolean cql3 type types: introduce comparable_bytes class bytes_ostream: overload write() to support writing from FragmentedView docs: fix minor typo in docs/dev/cql3-type-mapping.md	2025-07-02 11:58:32 +03:00
Avi Kivity	1e0b015c8b	Merge 'cql3: Represent create_statement using managed_bytes' from Dawid Mędrek When describing a table, we need to do it carefully: if some columns were dropped, we must specify that explicitly by ``` ALTER TABLE {table} DROP {column} USING TIMESTAMP ... ``` in the result of the DESCRIBE statement. Failing to do so could lead to data resurrection. However, if a table has been altered many, many times, we might end up with a huge create statement. Constructing it could, in turn, trigger an oversized allocation. Some tests ran into that very problem in fact. In this commit, we want to mitigate the problem: instead of allocating a contiguous chunk of memory for the create statement, we use `bytes_ostream` and `managed_bytes` to possibly keep data scattered in memory. It makes handling `cql3::description` less convenient in the code, but since the struct is pretty much immediately serialized after creating it, it's a very good trade-off. A reproducer is intentionally not provided by this commit: it's easy to test the change, but adding and dropping a huge number of columns would take a really long amount of time, so we need to omit it. Fixes scylladb/scylladb#24018 Backport: all of the supported versions are affected, so we want to backport the changes there. Closes scylladb/scylladb#24151 * github.com:scylladb/scylladb: cql3/description: Serialize only rvalues of description cql3: Represent create_statement using managed_string cql3/statements/describe_statement.cc: Don't copy descriptions cql3: Use managed_bytes instead of bytes in DESCRIBE utils/managed_string.hh: Introduce managed_string and fragmented_ostringstream	2025-07-01 21:59:38 +03:00
Lakshmi Narayanan Sreethar	5f5a8cf54c	types/comparable_bytes: add testcase to verify compatibility with cassandra	2025-07-01 22:19:08 +05:30
Lakshmi Narayanan Sreethar	6c1853a830	types/comparable_bytes: support variable-length natively byte-ordered data types The following cql3 data types - ascii, blob, duration, inet, and text - are natively byte-ordered in their serialized forms. To encode them into a byte-comparable format, zeros are escaped, and since these types have variable lengths, the encoded form is terminated in an escaped state to mark its end. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:08 +05:30
Lakshmi Narayanan Sreethar	5c77d17834	types/comparable_bytes: support decimal cql3 types The decimal cql3 type is internally stored as a scale and an unscaled integer. To convert them into a byte comparable format, they are first normalized into a base-100 exponent and a mantissa that lies in [0.01, 1) and then encoded into a byte sequence that preserves the numerical order. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:08 +05:30
Lakshmi Narayanan Sreethar	832236d044	types/comparable_bytes: introduce count_digits() method Implemented a method `count_digits()` to return the number of significant digits in a given boost::multiprecision:cpp_int. This is required to convert big_decimal to a byte comparable format. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:08 +05:30
Lakshmi Narayanan Sreethar	a00c5d3899	types/comparable_bytes: support uuid and timeuuid cql3 types The uuid type values are composed of two fixed-length unsigned integers: an msb and an lsb. The msb contains a version digit, which must be pulled first in a byte-comparable representation. For version 1 uuids, in addition to extracting the version digit first, the msb must be rearranged to make it byte comparable. The lsb is written as is. For the timeuuid type, the msb is handled similarly to the version 1 uuid values. The lsb however is treated differently - the sign bits of all bytes are inverted to preserve the legacy comparison order, which compared individual bytes as signed values. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:08 +05:30
Lakshmi Narayanan Sreethar	4592b9764c	types/comparable_bytes: support varint cql3 type Any varint value less than 7 bytes is encoded using the signed long encoding format and remaining values are all encoded using the full form encoding : <signbyte><length as unsigned integer - 7><7 or more bytes>, where <signbyte> is 00 for negative numbers and FF for positive ones, and the length's bytes are inverted if the number is negative (so that longer length sorts smaller). Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar	ad45a19373	types/comparable_bytes: introduce encode/decode_varint_length The length of a varint value is encoded separately as an unsigned variable-length integer. For negative varint values, the encoded bytes are flipped to ensure that longer lengths sort smaller. This patch implements both encoding and decoding logic for varint lengths and will be used by the subsequent patch. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar	7af153c237	types/comparable_bytes: support float and double cql3 types The sign bit is flipped for positive values to ensure that they are ordered after negative values. For negative values, all the bytes are inverted, allowing larger negative values to be ordered before smaller ones. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
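The float/double rule in the commit above can be illustrated with a minimal Python sketch. It is not the ScyllaDB implementation (which is C++); the function names `encode_double` and `decode_double` are hypothetical, and the sketch assumes a big-endian IEEE 754 serialized form and leaves NaN handling out.

```python
import struct

def encode_double(value: float) -> bytes:
    """Hypothetical helper: make an IEEE 754 double byte-comparable."""
    raw = struct.pack(">d", value)  # big-endian serialized form (assumption)
    if raw[0] & 0x80:
        # Negative: invert every byte so larger magnitudes sort first.
        return bytes(b ^ 0xFF for b in raw)
    # Positive: flip only the sign bit so positives sort after negatives.
    return bytes([raw[0] ^ 0x80]) + raw[1:]

def decode_double(encoded: bytes) -> float:
    """Hypothetical inverse of encode_double."""
    if encoded[0] & 0x80:
        # Leading bit set means the original value was positive.
        raw = bytes([encoded[0] ^ 0x80]) + encoded[1:]
    else:
        raw = bytes(b ^ 0xFF for b in encoded)
    return struct.unpack(">d", raw)[0]

values = [float("-inf"), -2.5, -1.0, 0.0, 1.0, 2.5, float("inf")]
# Sorting by the encoded bytes reproduces the numeric order.
assert sorted(reversed(values), key=encode_double) == values
```

Comparing the encoded byte strings lexicographically then gives the same order as comparing the numbers, which is the property a byte-comparable format needs.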
Lakshmi Narayanan Sreethar	0145c1d705	types/comparable_bytes: support date, time and timestamp cql3 types Both the date and time cql3 types are internally unsigned fixed length integers. Their serialized form is already byte comparable, so the encoder and decoder return the serialized bytes as it is. The timestamp type is encoded using the fixed length signed integer encoding. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar	b6ff3f5304	types/comparable_bytes: support bigint cql3 type The bigint type, internally implemented as a long data type, is encoded using a variable-length encoding similar to UTF-8. This enables a significant amount of space to be saved when smaller numbers are frequently used, while still permitting large values to be efficiently encoded. The first bit of the encoding represents the inverted sign (i.e., 1 for positive, 0 for negative), followed by length encoded as a sequence of bits matching the inverted sign. This is then followed by a differing bit (except for 9-byte encodings) and the bits of the number's two's complement. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar	c0d25060bd	types/comparable_bytes: support fixed length signed integers To encode fixed-length signed integers in a byte-comparable format, the first bit of each value is inverted. This ensures that negative numbers are ordered before positive ones during comparison. This patch adds support for the data types : byte_type (tinyint), short_type (smallint), and int32_type (int). Although long_type (bigint) is a fixed length integer type, it has different byte comparable encoding and will be handled separately in another patch. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
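As a rough illustration of the sign-bit inversion described in the commit above, here is a small Python sketch for a 32-bit `int`. The names `encode_int32` and `decode_int32` are hypothetical and not taken from the patch; the sketch assumes the serialized form is big-endian two's complement.

```python
import struct

def encode_int32(value: int) -> bytes:
    """Hypothetical helper: big-endian two's complement with the first bit inverted."""
    raw = bytearray(struct.pack(">i", value))
    raw[0] ^= 0x80  # invert the sign bit so negatives order before positives
    return bytes(raw)

def decode_int32(encoded: bytes) -> int:
    """Hypothetical inverse: restore the sign bit and unpack."""
    raw = bytearray(encoded)
    raw[0] ^= 0x80
    return struct.unpack(">i", bytes(raw))[0]

values = [-2**31, -1000, -1, 0, 1, 1000, 2**31 - 1]
# Unsigned lexicographic comparison of the encodings matches numeric order.
assert sorted(reversed(values), key=encode_int32) == values
```

The same idea would apply to `tinyint` and `smallint` with 1-byte and 2-byte widths; `bigint` is excluded here because the log states it uses a different, variable-length encoding.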
Lakshmi Narayanan Sreethar	8572afca2b	types/comparable_bytes: support boolean cql3 type Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar	74c556a33d	types: introduce comparable_bytes class This patch implements a new class, `comparable_bytes`, designed to implement methods for converting data values to and from byte-comparable formats. The class stores the comparable bytes as `managed_bytes` and currently provides the structure for all required methods. The actual logic for converting various data types will be implemented in subsequent patches. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
Lakshmi Narayanan Sreethar	e4c7cb7834	bytes_ostream: overload write() to support writing from FragmentedView Overloaded write() method to support writing a FragmentedView into bytes_ostream. Also added a testcase to verify the implementation. The new helper will be used by the byte_comparable implementation during the encode/decode process. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-07-01 22:19:07 +05:30
Ernest Zaslavsky	acf15eba8e	s3_test: Add s3_client test for non-retryable error handling Introduce a test that injects a non-retryable error and verifies that the chunked download source throws an exception as expected.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	a5246bbe53	s3_test: Add trace logging for default_retry_strategy Introduce trace-level logging for `default_retry_strategy` in `s3_test` to improve visibility into retry logic during test execution.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	d2d69cbc8c	s3_client: Stop retries in chunked download source Disable retries for S3 requests in the chunked download source to prevent duplicate chunks from corrupting the buffer queue. The response handler now throws an exception to bypass the retry strategy, allowing the next range to be attempted cleanly. This exception is only triggered for retryable errors; unretryable ones immediately halt further requests.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	c75acd274c	s3_client: Enhance test coverage for retry logic Extend the S3 proxy to support error injection when the client makes multiple requests to the same resource—useful for testing retry behavior and failure handling.	2025-07-01 18:45:17 +03:00
Ernest Zaslavsky	ec59fcd5e4	s3_client: Add test for Content-Range fix Introduce a test that accurately verifies the Content-Range behavior, ensuring the previous fix is properly validated.	2025-07-01 18:45:17 +03:00
Tomasz Grabiec	97679002ee	Merge 'Co-locate tablets of different tables' from Michael Litvak Add the option to co-locate tablets of different tables. For example, a base table and its CDC table, or a local index. main changes and ideas: * "table group" - a set of one or more tables that should be co-located. (Example: base table and CDC table). A group consists of one base table and zero or more children tables. * new column `base_table` in `system.tablets`: when creating a new table, it can be set to point to a base table, which the new table's tablets will be co-located with. when it's set, the tablet map information should be retrieved from the base table map. the child map doesn't contain per-tablet information. * co-located tables always have the same tablet count and the same tablet replicas. each tablet operation - migration, resize, repair - is applied on all tablets in a synchronized manner by the topology coordinator. * resize decision for a group is made by combining the per-table hints and comparing the average tablet size (over all tablets in the group) with the target tablet size. * the tablets load balancer works with the base table as a representative of the group. it represents a single migration unit with some `group_size` that is taken into account. * view tablets are co-located with base tablets when the partition keys match. Fixes https://github.com/scylladb/scylladb/issues/17043 backport is not needed. this is preliminary work for support of MVs and CDC with tablets. 
Closes scylladb/scylladb#22906 * github.com:scylladb/scylladb: tablets: validate no clustering row mutations on co-located tables raft_group0_client: extend validate_change to mixed_change type docs: topology-over-raft: document co-located tables tablet-mon.py: visual indication for co-located tablets tablet-mon.py: handle co-located tablets test/boost/view_schema_test.cc: fix race in wait_until_built boost/tablets_test: test load balancing and resize of co-located tablets test/tablets: test tablets colocation tablets: co-locate view tablets with base when the partition keys match test/pylib/tablets: common get_tablet_count api test_mv_tablets: use get_tablet_replicas from common tablets api test/pylib/tablets: fix test api to read tablet replicas from base table tablets: allocator: create co-located tables in a single operation alternator: prepare all new tables in a single announcement migration_manager: add notification for creating multiple tables tablets: read_tablet_transition_stage: read from base table storage service: allow repair request only on base tables tablets: keyspace_rf_change: apply on base table storage service: generate tablet migration updates on base tables tablets: replace all_tables method tablets: split when all co-located tablets are ready tablets: load balancer: sizing plan for table groups tablets: load balancer: handle co-located tablets tablets: allocate co-located tablets tablets: handle migration of co-located tablets storage service: add repair colocated tablets rpc tablets: save and read tablet metadata of co-located tables tablets: represent co-located tables in tablet metadata tablets: add base_table column to system.tablets docs: update system.tablets schema	2025-07-01 16:02:30 +02:00
Tomasz Grabiec	6290b70d53	Merge 'repair: postpone repair until topology is not busy ' from Aleksandra Martyniuk Currently, repair_service::repair_tablets starts repair if there is no ongoing tablet operations. The check does not consider global topology operations, like tablet resize finalization. Hence, if: - topology is in the tablet_resize_finalization state; - repair starts (as there is no tablet transitions) and holds the erm; - resize finalization finishes; then the repair sees a topology state different than the actual - it does not see that the storage groups were already split. Repair code does not handle this case and it results with on_internal_error. Start repair when topology is not busy. The check isn't atomic, as it's done on a shard 0. Thus, we compare the topology versions to ensure that the business check is valid. Fixes: https://github.com/scylladb/scylladb/issues/24195. Needs backport to all branches since they are affected Closes scylladb/scylladb#24202 * github.com:scylladb/scylladb: test: add test for repair and resize finalization repair: postpone repair until topology is not busy	2025-07-01 16:02:22 +02:00
Łukasz Paszkowski	a22d1034af	test.py: Fix test_compactionhistory_rows_merged_time_window_compaction_strategy The test has two major problems 1. Wrongly computed time windows. Data was not spread across two 1-minute windows causing the test to generate even three sstables instead of two 2. Timestamp was not propagated to the prepared CQL statements. So in fact, a current time was used implicitly 3. Because of the incorrect timestamp issue, the remaining tests testing purged tombstones were affected as well. Fixes https://github.com/scylladb/scylladb/issues/24532 Closes scylladb/scylladb#24609	2025-07-01 15:01:21 +03:00
Dawid Mędrek	ac9062644f	cql3: Represent create_statement using managed_string When describing a table, we need to do it carefully: if some columns were dropped, we must specify that explicitly by ``` ALTER TABLE {table} DROP {column} USING TIMESTAMP ... ``` in the result of the DESCRIBE statement. Failing to do so could lead to data resurrection. However, if a table has been altered many, many times, we might end up with a huge create statement. Constructing it could, in turn, trigger an oversized allocation. Some tests ran into that very problem in fact. In this commit, we want to mitigate the problem: instead of allocating a contiguous chunk of memory for the create statement, we use `fragmented_ostringstream` and `managed_string` to possibly keep data scattered in memory. It makes handling `cql3::description` less convenient in the code, but since the struct is pretty much immediately serialized after creating it, it's a very good trade-off. We provide a reproducer. It consistently passes with this commit, while having about 50% chance of failure before it (based on my own experiments). Playing with the parameters of the test doesn't seem to improve that chance, so let's keep it as-is. Fixes scylladb/scylladb#24018	2025-07-01 12:58:02 +02:00
Michael Litvak	fb18b95b3c	test/boost/view_schema_test.cc: fix race in wait_until_built create the view waiter before creating the view, otherwise if the waiter is created after the view is built we may lose the notification.	2025-07-01 13:20:19 +03:00
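The lost-notification race fixed in the commit above can be sketched in a few lines of Python. This is a simplified, single-threaded model and not the test's actual code: `BuildNotifier`, `make_waiter` and `notify_built` are hypothetical names, and the notifier is assumed to signal only the waiters that are registered at the moment the view finishes building.

```python
class BuildNotifier:
    """Hypothetical notifier that only signals waiters registered before the event."""

    def __init__(self):
        self._waiters = []

    def make_waiter(self):
        waiter = {"built": False}
        self._waiters.append(waiter)
        return waiter

    def notify_built(self):
        # Waiters created after this call never see the notification.
        for waiter in self._waiters:
            waiter["built"] = True

def waiter_created_first() -> bool:
    notifier = BuildNotifier()
    waiter = notifier.make_waiter()   # register first ...
    notifier.notify_built()           # ... then the view gets built
    return waiter["built"]

def waiter_created_late() -> bool:
    notifier = BuildNotifier()
    notifier.notify_built()           # the view is built before anyone listens
    waiter = notifier.make_waiter()
    return waiter["built"]            # stays False: the notification was lost
```

Creating the waiter before creating the view corresponds to `waiter_created_first`, which cannot miss the event regardless of how quickly the build completes.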
Michael Litvak	3b4af89615	boost/tablets_test: test load balancing and resize of co-located tablets Add unit tests of load balancing and resize with co-located tablets.	2025-07-01 13:20:19 +03:00
Michael Litvak	65ed0548d6	test/tablets: test tablets colocation Add tests with co-located tablets, testing migration and other relevant operations.	2025-07-01 13:20:19 +03:00
Michael Litvak	e01aae7871	test/pylib/tablets: common get_tablet_count api Introduce a common get_tablet_count test api instead of it being duplicated in few tests, and fix it to read the tablet count from the base table.	2025-07-01 13:20:19 +03:00
Michael Litvak	e719da3739	test_mv_tablets: use get_tablet_replicas from common tablets api Replace the duplicated get_tablet_replicas method in test_mv_tablets with the common method from the tablets api, to reduce code duplication and use the correct method that reads the tablet replicas from the base table.	2025-07-01 13:20:19 +03:00
Michael Litvak	6bfb82844f	test/pylib/tablets: fix test api to read tablet replicas from base table When reading tablet replicas from system.tablets, we need to refer to the base table partition, if any. We fix and simplify the test api for reading tablet replicas to read from the base table.	2025-07-01 13:20:19 +03:00
Michael Litvak	ddf02c9489	tablets: replace all_tables method The method all_tables in tablet_metadata is used for iterating over all tables in the tablet metadata with their tablet maps. Now that we have co-located tables we need to make the distinction on which tables we want to iterate over. In some cases we want to iterate over each group of co-located tables, treating them as one unit, and in other cases we want to iterate over all tables, doesn't matter if they are part of a co-located group and have a base table. We replace all_tables with new methods that can be used for each of the cases.	2025-07-01 13:20:18 +03:00
Pavel Emelyanov	26c7f7d98b	Merge 'encryption_at_rest_test: Fix some spurious errors' from Calle Wilund Fixes #24574 * Ensure we close the embedded load_cache objects on encryption shutdown, otherwise we can, in unit testing, get destruction of these while a timer is still active -> assert * Add extra exception handling to `network_error_test_helper`, so even if test framework might exception-escape, we properly stop the network proxy to avoid use after free. Closes scylladb/scylladb#24633 * github.com:scylladb/scylladb: encryption_at_rest_test: Add exception handler to ensure proxy stop encryption: Ensure stopping timers in provider cache objects	2025-07-01 11:33:20 +03:00
Pavel Emelyanov	6826856cf8	Merge 'test.py: Fix start 3rd party services' from Andrei Chekun Move the start of 3rd party services under the `try` clause to avoid the situation where the main process collapses without stopping the services. Without this, if something goes wrong during start, the exit artifacts are not executed, so the process stays around forever. This functionality is in 2025.2 and can potentially affect jobs, so a backport is needed. Closes scylladb/scylladb#24734 * github.com:scylladb/scylladb: test.py: use unique hostname for Minio test.py: Catch possible exceptions during 3rd party services start	2025-07-01 11:33:19 +03:00
Calle Wilund	8d37e5e24b	encryption_at_rest_test: Add exception handler to ensure proxy stop If boost test is run such that we somehow except even in a test macro such as BOOST_REQUIRE_THROW, we could end up not stopping the net proxy used, causing a use after free.	2025-06-30 11:36:38 +00:00
Andrei Chekun	c6c3e9f492	test.py: use unique hostname for Minio To avoid the situation where the port is already occupied on localhost, use a unique hostname for Minio	2025-06-30 12:03:06 +02:00
Nadav Har'El	7db5e9a3e9	test/cqlpy: reproducer for decimal parsing with very high exponent This patch adds tests reproducing issue #24581, where Scylla incorrectly parsed "decimal"-type literals in CQL with very high exponents, near or above the 32-bit limit. For example, 1.1234e-2147483647 was incorrectly read as 1.1234E+2147483649, while it should be (as we explain in comments in the test) an error. The tests in this patch failed (in multiple checks) before #24581 was fixed, and pass after it was fixed. These tests all pass on Cassandra 3, confirming our understanding on the limits of "decimal" to be correct. But they fail on Cassandra 4 and 5 due to a regression https://issues.apache.org/jira/browse/CASSANDRA-20723 in Cassandra, that mistakenly limited "decimal" exponents to just 309. Refs #24581 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#24646	2025-06-30 10:37:13 +03:00
Botond Dénes	ee6d7c6ad9	test/boost/memtable_test: only inject error for test table Currently the test indiscriminately injects failures into the flushes of any table, via the IO extension mechanism. The tests want to check that the node correctly handles the IO error by self-isolating, however the indiscriminate IO errors can have unintended consequences when they hit raft, leading to disorderly shutdown and failure of the tests. Testing raft's resiliency to IO errors is of course worth doing, but it is not the goal of this particular test, so to avoid the fallout, the IO errors are limited to the test tables only. Fixes: https://github.com/scylladb/scylladb/issues/24637 Closes scylladb/scylladb#24638	2025-06-30 10:08:49 +03:00
Avi Kivity	e2cda38b0f	Merge 'alternator: improve, document and test table/index name lengths' from Nadav Har'El Whereas DynamoDB limits the names of tables, LSIs and GSIs to 255 characters each, Alternator currently has different (and lower) limitations: 1. A table name must be up to 222 characters. 2. For a GSI, the sum of the table's and GSI's name length, plus 1, must be up to 222 characters. 3. For an LSI, the sum of the table's and LSI's name length, plus 2, must be up to 222 characters. The first patch documents these existing limitations, improves their testing, and fixes a tiny bug found by one of the tests (where UpdateTable adding a GSI's limit testing is off by one). The second patch unfortunately shows with a reproducer (issue #24598) this limit of 222 is problematic and we may need to lower it: If a user creates a table of length 222 and then enables Alternator streams, Scylla shuts down on an IO error. This will need to be fixed later, but at least this patch properly documents the existing behavior. No need to backport this patch - it is a very minor improvement that it is unlikely users care about and there is no potential for harm. Closes scylladb/scylladb#24597 * github.com:scylladb/scylladb: test/alternator: reproducer for streams bug with long table name alternator: improve, document and test table/index name lengths	2025-06-29 18:53:48 +03:00
Avi Kivity	b33dd2bd7d	Merge 'sstables/mx/writer: handle non-full prefix row keys' from Botond Dénes Although valid for compact tables, non-full (or empty) clustering key prefixes are not handled for row keys when writing sstables. Only the present components are written, consequently if the key is empty, it is omitted entirely. When parsing sstables, the parsing code unconditionally parses a full prefix. This mis-match results in parsing failures, as the parser parses part of the row content as a key resulting in a garbage key and subsequent mis-parsing of the row content and maybe even subsequent partitions. Introduce a new system table: `system.corrupt_data` and infrastructure similar to `large_data_handler`: `corrupt_data_handler` which abstracts how corrupt data is handled. The sstable writer now passes rows such corrupt keys to the corrupt data handler. This way, we avoid corrupting the sstables beyond parsing and the rows are also kept around in system.corrupt_data for later inspection and possible recovery. Add a full-stack test which checks that rows with bad keys are correctly handled. Fixes: https://github.com/scylladb/scylladb/issues/24489 The bug is present in all versions, has to be backported to all supported versions. 
Closes scylladb/scylladb#24492 * github.com:scylladb/scylladb: test/boost/sstable_datafile_test: add test for corrupt data sstables/mx/writer: handler rows with empty keys test/lib/cql_assertions: introduce columns_assertions sstables: add corrupt_data_handler to sstables::sstables tools/scylla-sstable: make large_data_handler a local db: introduce corrupt_data_handler mutation: introduce frozen_mutation_fragment_v2 mutation/mutation_partition_view: read_{clustering,static}_row(): return row type mutation/mutation_partition_view: extract de-ser of {clustering,static} row idl-compiler.py: generate skip() definition for enums serializers idl: extract full_position.idl from position_in_partition.idl db/system_keyspace: add apply_mutation() db/system_keyspace: introduce the corrupt_data table	2025-06-29 18:18:36 +03:00
Avi Kivity	48d9f3d2e3	Merge 'mutation: check key of inserted rows' from Botond Dénes Make sure the keys are full prefixes as it is expected to be the case for rows. On several occasions we have seen empty row keys make their way into the sstables, despite the fact that they are not allowed by the CQL frontend. This means that such empty keys are possibly results of memory corruption or use-after-{free,copy} errors. The source of the corruption is impossible to pinpoint when the empty key is discovered in the sstable. So this patch adds checks for such keys to places where mutations are built: when building or unserializing mutations. Fixes: https://github.com/scylladb/scylladb/issues/24506 Not a typical backport candidate (not a bugfix or regression fix), but we should still backport so we have the additional checks deployed to existing production clusters. Closes scylladb/scylladb#24497 * github.com:scylladb/scylladb: mutation: check key of inserted rows compound: optimize is_full() for single-component types	2025-06-29 18:10:17 +03:00
Nadav Har'El	50d370f06e	test/alternator: reproducer for streams bug with long table name The two tests in this patch reproduce issue #24598: When enabling Alternator streams on an Alternator table with a very long name, such as the maximum allowed name length 222, the result is an I/O error and a Scylla shutdown. The two tests are currently marked "skip", otherwise they would crash the Scylla being tested. Refs #24598 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-06-29 11:40:55 +03:00
Nadav Har'El	0ce0b2934f	alternator: improve, document and test table/index name lengths Whereas DynamoDB limits the names of tables, LSIs and GSIs to 255 characters each, Alternator currently has different (and lower) limitations: 1. A table name must be up to 222 characters. 2. For a GSI, the sum of the table's and GSI's name length, plus 1, must be up to 222 characters. 3. For an LSI, the sum of the table's and LSI's name length, plus 2, must be up to 222 characters. These specific limitations were never documented, so in this patch we add this information to docs/alternator/compatibility.md. Moreover, these limitations were only partially tested, so in this patch we add testing for more cases that we forgot to check - such as length of LSI names (only GSI were checked before this patch), or adding a GSI to an existing table. It is important to check all these corner cases because there is a risk that if we attempt to create a table without checking its length, we can end up with an I/O error that brings down Scylla. In one case - UpdateTable adding a GSI to an existing table - the new test exposed a trivial bug: Because UpdateTable wants to verify the new GSI doesn't have the same name as an existing LSI, it mistakenly applied the LSI's length name limit instead of the GSI's name length limit, which is one byte less than it should be. So this patch fixes this trivial bug as well. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-06-29 11:40:55 +03:00
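The three length rules listed in the commit above can be written down as a short Python sketch. This only restates the arithmetic from the commit message; the function names and the `MAX_NAME_LENGTH` constant are hypothetical, and other naming rules (allowed characters, minimum length) are deliberately left out.

```python
MAX_NAME_LENGTH = 222  # limit quoted in the commit message

def table_name_ok(table: str) -> bool:
    """Rule 1: the table name alone must be up to 222 characters."""
    return len(table) <= MAX_NAME_LENGTH

def gsi_name_ok(table: str, gsi: str) -> bool:
    """Rule 2: table name length + GSI name length + 1 must be up to 222."""
    return len(table) + len(gsi) + 1 <= MAX_NAME_LENGTH

def lsi_name_ok(table: str, lsi: str) -> bool:
    """Rule 3: table name length + LSI name length + 2 must be up to 222."""
    return len(table) + len(lsi) + 2 <= MAX_NAME_LENGTH

# With a 200-character table name, a GSI name may be one character
# longer than an LSI name.
table = "t" * 200
assert gsi_name_ok(table, "g" * 21) and not gsi_name_ok(table, "g" * 22)
assert lsi_name_ok(table, "l" * 20) and not lsi_name_ok(table, "l" * 21)
```

The one-character difference between the GSI and LSI rules is the off-by-one that the commit describes in UpdateTable, where the LSI limit was applied to a new GSI.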
Pavel Emelyanov	23d86ede72	Merge 'audit: introduce debug level logs on happy path' from Dario Mirovic Audit component defines `audit` logger which it uses only for `error` and `info` logs, regarding `audit` module initialization and errors during audit log writing. This change introduces `debug` level logs on the happy path of audit log writes. Fixes: https://github.com/scylladb/scylladb/issues/23773 No backport needed - this is a small quality-of-life improvement. Closes scylladb/scylladb#24658 * github.com:scylladb/scylladb: audit: change audit test logger level to `debug` audit: introduce debug level logs on happy path	2025-06-27 20:10:54 +03:00
Dario Mirovic	ec6249b581	audit: change audit test logger level to `debug` Audit module tests should show the `debug` level messages. This change makes audit_test.py `audit` module log level to `debug`. Closes scylladb/scylladb#23773	2025-06-27 16:27:33 +02:00
Botond Dénes	495f607e73	test/cluster/test_read_repair: write 100 rows in trace test This test asserts that a read repair really happened. To ensure this happens it writes a single partition after enabling the database_apply error injection point. For some reason, the write is sometimes reordered with the error injection and the write will get replicated to both nodes and no read repair will happen, failing the test. To make the test less sensitive to such rare reordering, add a clustering column to the table and write 100 rows. The chance of all 100 of them being reordered with the error injection should be low enough that it doesn't happen again (famous last words). Fixes: #24330 Closes scylladb/scylladb#24403	2025-06-27 16:23:08 +03:00
Pavel Emelyanov	4c0154f156	Merge 'test.py: enhance allure reporting' from Andrei Chekun Add run ID for process output file to be not overwritten in the next case: first run failed, second passed. They are using the same name, so the second run will overwrite and delete the file. This will help to investigate in case of C++ test fails Add attaching Scylla log files to allure report in case test failed. This is an alternative for link in JUnit report that exists in CI. That change will help to investigate the cluster tests fails. Example can be found in the failed [job](https://jenkins.scylladb.com/job/scylla-master/job/byo/job/byo_build_tests_dtest/2980/allure/). Backport is not needed, this is only framework enhancements Closes scylladb/scylladb#24677 * github.com:scylladb/scylladb: test.py: Attach node logs in allure report in case of fail test.py: Add run id to the boost output file	2025-06-27 16:22:03 +03:00
Botond Dénes	e715a150b9	tools/scylla-nodetool: backup: add --move-files parameter Allow opting in for backup to move the files instead of copying them. Fixes: https://github.com/scylladb/scylladb/issues/24372 Closes scylladb/scylladb#24503	2025-06-27 16:21:39 +03:00

9073 Commits