In August 2022, DynamoDB added an "S3 Import" feature, which we don't yet
support - so let's document this missing feature in the compatibility
document.
Refs #11739.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #11740
This series adds support for detecting collections that have too many items
and recording them in `system.large_cells`.
A configuration variable was added to db/config: `compaction_collection_elements_count_warning_threshold`, set by default to 10000.
Collections that have more elements than this threshold will be warned about and recorded as a large cell in the `system.large_cells` table. The documentation has been updated accordingly.
A new column was added to system.large_cells: `collection_elements`.
Similar to the `rows` column in system.large_partitions, `collection_elements` holds the number of elements in a collection when the large cell is a collection, or 0 if it isn't. Note that the collection may be recorded in system.large_cells due to its size, like any other cell, and/or due to the number of elements in it, if it crosses the said threshold.
Note that #11449 called for a new system.large_collections table, but extending system.large_cells follows the logic of system.large_partitions and is a smaller change overall, hence it was preferred.
Since the system keyspace schema is hard coded, the schema version of system.large_cells was bumped, and since the change is not backward compatible, we added a cluster feature - `LARGE_COLLECTION_DETECTION` - to enable using it.
The large_data_handler large cell detection record function will populate the new column only when the new cluster feature is enabled.
In addition, unit tests were added in sstable_3_x_test for testing large cells detection by cell size, and large_collection detection by the number of items.
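The recording rule described above can be sketched as follows. This is only an illustration in Python, not the C++ code in db/large_data_handler: the function name, the dict layout and the strict comparison against the cell-size threshold are assumptions made for the sketch. The column name follows the commit list below.

```python
from typing import Optional

def large_cell_record(cell_size: int,
                      collection_elements: int,
                      cell_size_threshold: int,
                      elements_threshold: int = 10000,
                      feature_enabled: bool = True) -> Optional[dict]:
    """Return the row to record for a cell, or None if the cell is not large.

    Hypothetical helper: it only mirrors the rule in the cover letter.
    """
    too_big = cell_size > cell_size_threshold
    too_many = collection_elements > elements_threshold
    if not (too_big or too_many):
        return None
    row = {"cell_size": cell_size}
    # The new column is populated only when the cluster feature is enabled.
    if feature_enabled:
        row["collection_elements"] = collection_elements
    return row
```

A small collection with many elements is recorded even though its size is under the cell-size threshold, and a non-collection cell that is too big is recorded with 0 elements.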
Closes #11449
Closes #11674
* github.com:scylladb/scylladb:
sstables: mx/writer: optimize large data stats members order
sstables: mx/writer: keep large data stats entry as members
db: large_data_handler: dynamically update config thresholds
utils/updateable_value: add transforming_value_updater
db/large_data_handler: cql_table_large_data_handler: record large_collections
db/large_data_handler: pass ref to feature_service to cql_table_large_data_handler
db/large_data_handler: cql_table_large_data_handler: move ctor out of line
docs: large-rows-large-cells-tables: fix typos
db/system_keyspace: add collection_elements column to system.large_cells
gms/feature_service: add large_collection_detection cluster feature
test: sstable_3_x_test: add test_sstable_too_many_collection_elements
test: lib: simple_schema: add support for optional collection column
test: lib: simple_schema: build schema in ctor body
test: lib: simple_schema: cql: define s1 as static only if built this way
db/large_data_handler: maybe_record_large_cells: consider collection_elements
db/large_data_handler: debug cql_table_large_data_handler::delete_large_data_entries
sstables: mx/writer: pass collection_elements to writer::maybe_record_large_cells
sstables: mx/writer: add large_data_type::elements_in_collection
db/large_data_handler: get the collection_elements_count_threshold
db/config: add compaction_collection_elements_count_warning_threshold
test: sstable_3_x_test: add test_sstable_write_large_cell
test: sstable_3_x_test: pass cell_threshold_bytes to large_data_handler
test: sstable_3_x_test: large_data_handler: prepare callback for testing large_cells
test: sstable_3_x_test: large_data tests: use BOOST_REQUIRE_[GL]T
test: sstable_3_x_test: test_sstable_log_too_many_rows: use tests::random
The "virtual dirty" term is not very informative. "Virtual" means
"not real", but it doesn't say in which way it isn't real.
In this case, virtual dirty refers to real dirty memory, minus
the portion of memtables that has been written to disk (but not
yet sealed - in that case it would not be dirty in the first
place).
I chose the term "spooled memory" for "the portion of memtables
that has been written to disk". At least the unique term will cause
people to look it up, and it may be easier to remember. From that
we have "unspooled memory".
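As a tiny worked example of the relation between the terms (a sketch only; the function and parameter names are made up here and do not correspond to the actual accounting code):

```python
def unspooled_dirty(real_dirty: int, spooled: int) -> int:
    """Dirty memtable bytes that have not yet been written to disk.

    Hypothetical helper: unspooled = real dirty - spooled.
    """
    if not 0 <= spooled <= real_dirty:
        raise ValueError("spooled memory must be between 0 and real dirty memory")
    return real_dirty - spooled
```

With 100 bytes of real dirty memory of which 30 were already written to a not-yet-sealed sstable, the unspooled memory is 70 bytes.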
I plan to further change the accounting to account for spooled memory
rather than unspooled, as that is a more natural term, but that is left
for later.
The documentation, config item, and metrics are adjusted. The config
item is practically unused so it isn't worth keeping compatibility here.
And bump the schema version offset since the new schema
should be distinguishable from the previous one.
Refs scylladb/scylladb#11660
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Add a new large_data_stats type and entry for keeping
the collection_elements_count_threshold and the maximum value
of collection_elements.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Extend the cql3 truncate statement to accept attributes,
similar to modification statements.
To achieve that we define cql3::statements::raw::truncate_statement
derived from raw::cf_statement, and implement its pure virtual
prepare() method to make a prepared truncate_statement.
The latter is no longer derived from raw::cf_statement,
and just stores a schema_ptr to get to the keyspace and column_family.
`test_truncate_using_timeout` cql-pytest was added to test
the new USING TIMEOUT feature.
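For illustration, the statement shape this enables can be sketched with a small Python helper that builds the CQL text. The helper name is hypothetical and is not part of the test suite; the keyspace, table and timeout values are placeholders.

```python
from typing import Optional

def truncate_cql(keyspace: str, table: str, timeout: Optional[str] = None) -> str:
    """Build a TRUNCATE statement, optionally with a USING TIMEOUT clause."""
    stmt = f"TRUNCATE TABLE {keyspace}.{table}"
    if timeout is not None:
        # The timeout is a CQL duration literal such as "5s" or "500ms".
        stmt += f" USING TIMEOUT {timeout}"
    return stmt

print(truncate_cql("ks", "tbl", "5s"))  # TRUNCATE TABLE ks.tbl USING TIMEOUT 5s
```

Without a timeout argument the helper returns the plain statement, so existing callers are unaffected.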
Fixes #11408
Also, update the docs/cql/ddl.rst truncate-statement section accordingly.
Closes #11409
* github.com:scylladb/scylladb:
docs: cql-extensions: add TRUNCATE to USING TIMEOUT section.
docs: cql: ddl: add support for TRUNCATE USING TIMEOUT
cql3, storage_proxy: add support for TRUNCATE USING TIMEOUT
cql3: selectStatement: restrict to USING TIMEOUT in grammar
cql3: deleteStatement: restrict to USING TIMEOUT|TIMESTAMP in grammar
The series contains fixes for the system.large_* log warnings and the respective documentation.
This prepares the way for adding a new system.large_collections table (see #11449).
Fixes #11620
Fixes #11621
Fixes #11622
The respective fixes should be backported to different release branches, based on the patches they depend on (mentioned in each issue).
Closes #11623
* github.com:scylladb/scylladb:
docs: adjust to sstable base name
docs: large-partition-table: adjust for additional rows column
docs: debugging-large-partition: update log warning example
db/large_data_handler: print static cell/collection description in log warning
db/large_data_handler: separate pk and ck strings in log warning with delimiter
List the queries that support the TIMEOUT parameter.
Mention the newly added support for TRUNCATE
USING TIMEOUT.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Since 244df07771 (scylla 5.1),
only the sstable basename is kept in the large_* system tables.
The base path can be determined from the keyspace and
table name.
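A sketch of how the full path might be recovered from a recorded basename. It assumes the usual on-disk layout `<data_dir>/<keyspace>/<table>-<table id>/`, which is not stated in this commit, and the helper name is made up for the example:

```python
from pathlib import Path
from typing import List

def candidate_sstable_paths(data_dir: str, keyspace: str,
                            table: str, basename: str) -> List[Path]:
    """Find files called `basename` under <data_dir>/<keyspace>/<table>-*/.

    Hypothetical helper; the directory layout is an assumption.
    """
    ks_dir = Path(data_dir) / keyspace
    # A table may have several directories (e.g. after being re-created),
    # so return every match rather than guessing a single one.
    return sorted(p for p in ks_dir.glob(f"{table}-*/{basename}") if p.is_file())
```

Returning a list keeps the helper honest about the case where more than one table directory matches.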
Fixes #11621
Adjust the examples in the documentation accordingly.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Since a7511cf600 (scylla 5.0),
sstables containing partitions with too many rows are recorded in system.large_partitions.
Adjust the doc accordingly.
Fixes #11622
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The log warning format has changed since f3089bf3d1
and was fixed in the previous patch to include
a delimiter between the partition key, clustering key, and
column name.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Update several aspects of the alternator/getting-started.md which were
not up-to-date:
* When the document was written, Alternator was moving quickly so we
recommended running a nightly version. This is no longer the case, so
we should recommend running the latest stable build.
* The link to the download page is no longer helpful for getting Docker
instructions (it shows some generic download options). Instead point to
our dockerhub page.
* Replace mentions of "Scylla" by the new official name, "ScyllaDB".
* Miscellaneous copy-edits.
Fixes #11218
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #11605
Fix https://github.com/scylladb/scylladb/issues/11373
- Updated the information on the "Counting all rows in a table is slow" page.
- Added COUNT to the list of selectors of the SELECT statement (somehow it was missing).
- Added a note to the description of the COUNT() function with a link to the KB page for troubleshooting, so that users can easily find the KB page.
Closes #11417
* github.com:scylladb/scylladb:
doc: add a comment to remove the note in version 5.1
doc: update the information on the Counting all rows page and add the recommendation to upgrade ScyllaDB
doc: add a note to the description of COUNT with a reference to the KB article
doc: add COUNT to the list of acceptable selectors of the SELECT statement
Fix https://github.com/scylladb/scylladb/issues/11376
This PR adds the upgrade guide from version 5.0 to 5.1. It involves adding new files (5.0-to-5.1) and language/formatting improvements to the existing content (shared by several upgrade guides).
Closes #11577
* github.com:scylladb/scylladb:
doc: upgrade the command to upgrade the ScyllaDB image from 5.0 to 5.1
doc: add the guide to upgrade ScyllaDB from 5.0 to 5.1
This PR adds the missing upgrade guides for upgrading the ScyllaDB image to a patch release:
- ScyllaDB 5.0: /upgrade/upgrade-opensource/upgrade-guide-from-5.x.y-to-5.x.z/upgrade-guide-from-5.x.y-to-5.x.z-image/
- ScyllaDB Enterprise: /upgrade/upgrade-enterprise/upgrade-guide-from-2021.1-to-2022.1/upgrade-guide-from-2022.1-to-2022.1-image/ (the file name is wrong and will be fixed with another PR)
In addition, the section regarding the recommended upgrade procedure has been improved.
Fixes https://github.com/scylladb/scylladb/issues/11450
Fixes https://github.com/scylladb/scylladb/issues/11452
Closes #11460
* github.com:scylladb/scylladb:
doc: update the commands to upgrade the ScyllaDB image
doc: fix the filename in the index to resolve the warnings and fix the link
doc: apply feedback by adding the step to load the new repo and fixing the links
doc: fix the version name in file upgrade-guide-from-2021.1-to-2022.1-image.rst
doc: rename the upgrade-image file to upgrade-image-opensource and update all the links to that file
doc: update the Enterprise guide to include the Enterprise-only image file
doc: update the image files
doc: split the upgrade-image file to separate files for Open Source and Enterprise
doc: clarify the alternative upgrade procedures for the ScyllaDB image
doc: add the upgrade guide for ScyllaDB Image from 2022.x.y. to 2022.x.z
doc: add the upgrade guide for ScyllaDB Image from 5.x.y. to 5.x.z
It points to a private scylladb repo, which has no place in user-facing
documentation. For now there is no public replacement, but a similar
functionality is in the works for Scylla Manager.
Fixes: #11573
Closes #11580
In compatibility.md where we refer to the missing ability to add a GSI
to an existing table - let's refer to a new issue specifically about this
feature, instead of the old bigger issue about UpdateItem.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #11568
This tiny series fixes some small errors and out-of-date information in the Alternator documentation and code comments.
Closes #11547
* github.com:scylladb/scylladb:
alternator ttl: comment fixes
docs/alternator: fix mention of old alternator-test directory
The directory that used to be called alternator-test is now (and has
been for a long time) really test/alternator. So let's fix the
references to it in docs/alternator/alternator.md.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The purpose of this PR is to update the information about the default SSTable format.
It
Closes #11431
* github.com:scylladb/scylladb:
doc: simplify the information about default formats in different versions
doc: update the SSTables 3.0 Statistics File Format to add the UUID host_id option of the ME format
doc: add the information regarding the ME format to the SSTables 3.0 Data File Format page
doc: fix additional information regarding the ME format on the SStable 3.x page
doc: add the ME format to the table
add a comment to remove the information when the documentation is versioned (in 5.1)
doc: replace Scylla with ScyllaDB
doc: fix the formatting and language in the updated section
doc: fix the default SStable format
The scope of this PR:
- Removing support for Ubuntu 16.04 and Debian 9.
- Adding support for Debian 11.
Closes #11461
* github.com:scylladb/scylladb:
doc: remove support for Debian 9 from versions 2022.1 and 2022.2
doc: remove support for Ubuntu 16.04 from versions 2022.1 and 2022.2
doc: add support for Debian 11 to versions 2022.1 and 2022.2
Fix https://github.com/scylladb/scylla-doc-issues/issues/816
Fix https://github.com/scylladb/scylla-docs/issues/1613
This PR fixes the CQL version in the Interfaces page, so that it is the same as in other places across the docs and in sync with the version reported by ScyllaDB (see https://github.com/scylladb/scylla-doc-issues/issues/816#issuecomment-1173878487).
To make sure the same CQL version is used across the docs, we should use the `|cql-version|` variable rather than hardcode the version number on several pages.
The variable is specified in the conf.py file:
```
rst_prolog = """
.. |cql-version| replace:: 3.3.1
"""
```
Closes #11320
* github.com:scylladb/scylladb:
doc: add the Cassandra version on which the tools are based
doc: fix the version number
doc: update the Enterprise version where the ME format was introduced
doc: add the ME format to the Cassandra Compatibility page
doc: replace Scylla with ScyllaDB
doc: rewrite the Interfaces table to the new format to include more information about CQL support
doc: remove the CQL version from pages other than Cassandra compatibility
doc: fix the CQL version in the Interfaces table