scylladb/test
Avi Kivity c59985c38b Merge 'cql3: limit large allocations when parsing queries' from Botond Dénes
Queries are stored and passed around as sstring/std::string_view. While they are normally small enough not to cause problems, the `test_cdc_large_values.TestLargeColumnsWithCDC.test_single_column_blob_max_size_with_cdc_preimage_full_postimage[unprepared_statements]` test demonstrates that queries can be arbitrarily large, putting heavy strain on Scylla internals via large allocations and, in the extreme case, causing denial of service.

This PR attempts to alleviate this by using fragmented storage for queries: read the query as a fragmented string from the input stream in `transport/server.cc`, propagate it as such to `query_processor::prepare()`, and also store it as such in `cql3::cql_statement::raw_cql_statement`. Also avoid linearizing raw values in the CQL expression tree: switch `cql3::expr::untyped_constant::raw_text` to fragmented storage.
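
To illustrate the idea of fragmented storage, here is a minimal sketch, assuming C++17. The `fragmented_text` class and its members are hypothetical names invented for this example; this is not the `utils::chunked_string` added by this PR. It only shows how capping the size of each fragment avoids one large contiguous allocation while data is appended.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical illustration only; not Scylla's utils::chunked_string.
class fragmented_text {
    std::size_t _max_fragment;            // cap on the number of bytes per fragment
    std::size_t _size = 0;                // total number of bytes stored
    std::vector<std::string> _fragments;  // each holds at most _max_fragment bytes
public:
    explicit fragmented_text(std::size_t max_fragment) : _max_fragment(max_fragment) {
        assert(max_fragment > 0);
    }

    // Append data, topping up the last fragment first and then opening new ones.
    void append(std::string_view data) {
        while (!data.empty()) {
            if (_fragments.empty() || _fragments.back().size() == _max_fragment) {
                _fragments.emplace_back();
                _fragments.back().reserve(std::min(_max_fragment, data.size()));
            }
            std::string& last = _fragments.back();
            const std::size_t n = std::min(_max_fragment - last.size(), data.size());
            last.append(data.substr(0, n));
            data.remove_prefix(n);
            _size += n;
        }
    }

    std::size_t size() const { return _size; }
    const std::vector<std::string>& fragments() const { return _fragments; }

    // Explicit, opt-in linearization: the only place where a single
    // buffer of the full size is allocated.
    std::string linearize() const {
        std::string out;
        out.reserve(_size);
        for (const auto& f : _fragments) {
            out += f;
        }
        return out;
    }
};
```

Making `linearize()` a separate, explicit call keeps the large allocation visible at the call site, so the places that still need contiguous input are easy to find.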

For this to be possible, some infrastructure code had to be made fragmented storage friendly: ascii/utf8 validation, hashers, from_hex and importantly: `abstract_type::from_string()`.
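
As a rough sketch of what "fragmented storage friendly" means for a helper such as from_hex, the example below decodes hex digits spread over several fragments without first joining them. It assumes C++17; `hex_digit` and `from_hex_fragmented` are hypothetical names, and a plain `std::vector<std::string>` stands in for a fragmented buffer view. For brevity the output is a contiguous vector, which a real implementation would likely also want to avoid for large values.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Value of one hex digit, or -1 if c is not a hex digit.
int hex_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: decodes hex text stored in fragments.
// Returns std::nullopt on a non-hex character or an odd number of digits.
std::optional<std::vector<std::uint8_t>>
from_hex_fragmented(const std::vector<std::string>& fragments) {
    std::vector<std::uint8_t> out;
    int pending = -1;  // high nibble waiting for its low nibble, or -1 if none
    for (const std::string& fragment : fragments) {
        for (char c : fragment) {
            const int v = hex_digit(c);
            if (v < 0) {
                return std::nullopt;
            }
            if (pending < 0) {
                pending = v;  // may be carried across a fragment boundary
            } else {
                out.push_back(static_cast<std::uint8_t>((pending << 4) | v));
                pending = -1;
            }
        }
    }
    if (pending >= 0) {
        return std::nullopt;  // odd number of digits
    }
    return out;
}
```

The `pending` variable is the key detail: a two-digit byte can straddle a fragment boundary, so the decoder has to carry state from one fragment to the next instead of handling each fragment in isolation.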

Unfortunately, the query still has to be linearized for parsing itself: although ANTLR allows a custom InputStream implementation, it plays pointer arithmetic games with the pointers obtained from it, so fragmented input cannot be used.

Still, this PR limits the places where the query is linearized to the
following:
* Parsing
* Audit
* Logs and error messages

So the normal query paths for queries that actually can get arbitrarily large (UPDATE and INSERT) should only linearize the query temporarily for parsing.
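
The "linearize only temporarily for parsing" pattern can be sketched as follows, assuming C++17. `with_linearized` and `count_tokens` are hypothetical names; `count_tokens` merely stands in for a parser that requires contiguous input and is unrelated to the actual CQL parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: builds a contiguous copy of the fragments, passes a
// view of it to `parse`, and frees the copy when the call returns.
// `parse` must not return anything that points into the view.
template <typename Parse>
auto with_linearized(const std::vector<std::string>& fragments, Parse&& parse) {
    std::size_t total = 0;
    for (const auto& f : fragments) {
        total += f.size();
    }

    std::string contiguous;  // the one large allocation, scoped to this call
    contiguous.reserve(total);
    for (const auto& f : fragments) {
        contiguous += f;
    }

    return parse(std::string_view(contiguous));
}

// Stand-in for a parser that needs contiguous input:
// counts whitespace-separated tokens.
std::size_t count_tokens(std::string_view text) {
    std::size_t tokens = 0;
    bool in_token = false;
    for (char c : text) {
        const bool is_space = (c == ' ' || c == '\t' || c == '\n');
        if (!is_space && !in_token) {
            ++tokens;
        }
        in_token = !is_space;
    }
    return tokens;
}
```

With this shape, the contiguous buffer lives only for the duration of the parse call, while the long-lived copy of the query stays fragmented.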

Fixes #10779

Improvement, no backport

Closes scylladb/scylladb#28619

* github.com:scylladb/scylladb:
  tracing: add_query(): change query param to utils::chunked_string
  cql3: store raw query string in utils::chunked_string
  serializer: add serializer<utils::chunked_string>
  utils/reusable_buffer: add get_linearized_view(managed_bytes_view)
  cql3/expr: use utils::chunked_string for untyped_constant::raw_text
  types: abstract_type::from_string() switch to fragmented buffers (implementation)
  types: abstract_type::from_string() switch to fragmented buffers (interface)
  types: use write_fragmented from utils/fragment_range.hh
  types: timestamp_from_string(): don't assume std::string_view is null-terminated
  types/duration: don't assume std::string_view is null-terminated
  utils/hashers: add calculate(managed_bytes_view) overload
  utils/ascii: add validate(managed_bytes_view) overload
  utils: add managed_bytes_fwd.hh
  utils: add chunked_string
  utils: add managed_bytes_basic_view::byte_iterator
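
The two null-termination fixes listed above address a general hazard: a `std::string_view` may point into the middle of a larger buffer, so C-style parsers such as `strtol`, which read until a NUL or a non-digit, can read past the end of the view. A minimal sketch of a bounded alternative, assuming C++17 and using a hypothetical `parse_long` helper (not the code changed by these commits):

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses the whole view as a base-10 integer.
// std::from_chars takes an explicit [first, last) range, so it never reads
// beyond the view and does not need a terminating NUL.
std::optional<long> parse_long(std::string_view text) {
    long value = 0;
    const char* first = text.data();
    const char* last = text.data() + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc() || ptr != last) {
        return std::nullopt;  // not a number, out of range, or trailing characters
    }
    return value;
}
```

For a view covering only the first four characters of `"20260526"`, this parses just those four digits, whereas passing the view's `data()` pointer to `strtol` would consume all eight.
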
2026-05-26 15:00:53 +03:00

Scylla in-source tests.

For details on how to run the tests, see docs/dev/testing.md

Shared C++ utilities and libraries are in lib/; shared Python code is in pylib/.

* alternator - Python tests which connect to a single server and use the DynamoDB API
* unit, boost, raft - unit tests in C++
* cqlpy - Python tests which connect to a single server and use CQL
* topology* - tests that set up clusters and add/remove nodes
* cql - approval tests that use CQL and pre-recorded output
* rest_api - tests for Scylla REST API Port 9000
* scylla-gdb - tests for scylla-gdb.py helper script
* nodetool - tests for C++ implementation of nodetool

If you can use an existing folder, consider adding your test to it. Use a new folder only for a new large category or subsystem, or when the test environment differs significantly from every existing suite: for example, you plan to start scylladb with a different configuration, you intend to add many tests, and you would like them to share a Scylla cluster (clusters can be reused for tests within the same folder).

To add a new folder, create a new directory, then copy a suite.ini from an existing folder into it and edit it.