scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 06:05:53 +00:00

Author	SHA1	Message	Date
Benny Halevy	4e5bfe2c18	size_tiered_backlog_tracker: make log4 helper static It is completely generic. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-06-26 13:30:43 +03:00
Benny Halevy	5d6c2b0d12	size_tiered_backlog_tracker: define struct sstables_backlog_contribution Encapsulate the contribution-related members in struct contribution, to be used for strong exception safety. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-06-26 13:29:38 +03:00
Benny Halevy	bf69584ccc	size_tiered_backlog_tracker: update_sstables: update total_bytes only if set changed Although replace_sstables is supposed to be called only once per {old_ssts, new_ssts} it is safer to update `_total_bytes` with `sst->data_size()` only if the sst was inserted/erased successfully. Otherwise _total_bytes may go out of sync with the contents of _all. That said, the next step should be to refer to the compaction_group's main sstable set directly rather than maintaining a "shadow" set in the tracker. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-06-26 13:28:50 +03:00
Benny Halevy	1a8cc84981	compaction_backlog_tracker: replace_sstables: pass old and new sstables vectors by ref To facilitate rollback on the error handling path, to provide strong exception safety guarantees. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-06-26 13:27:18 +03:00
Benny Halevy	0877e7a846	compaction_backlog_tracker: replace_sstables: add FIXME comments about strong exception safety Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-06-26 12:51:48 +03:00
Kamil Braun	be5b61b870	Merge 'cql3: expr: break up expression.hh header' from Avi Kivity It's very annoying to add a declaration to expression.hh and watch the whole world get recompiled. Improve that by moving less-common functions to a new header expr-utils.hh. Move the evaluation machinery to a new header evaluate.hh. The remaining definitions in expression.hh should not change as often, and thus cause less frequent recompiles. Closes #14346 * github.com:scylladb/scylladb: cql3: expr: break up expression.hh header cql3: expr: restrictions.hh: protect against double inclusions cql3: constants: deinline cql3: statement_restrictions: deinline cql3: deinline operation::fill_prepare_context()	2023-06-23 10:19:28 +02:00
Nadav Har'El	0a1283c813	Merge 'cql3:statements:describe_statement: check pointer after casting to UDF/UDA' from Michał Jadwiszczak There was a bug in describe_statement. If executing `DESC FUNCTION <uda name>` or ` DESC AGGREGATE <udf name>`, Scylla was crashing because the function was found (`functions::find()` searches both UDFs and UDAs) but the function was bad and the pointer wasn't checked after cast. Added a test for this. Fixes: #14360 Closes #14332 * github.com:scylladb/scylladb: cql-pytest:test_describe: add test for filtering UDF and UDA cql3:statements:describe_statement: check pointer to UDF/UDA	2023-06-22 20:54:25 +03:00
Michał Jadwiszczak	d3d9a15505	cql-pytest:test_describe: add test for filtering UDF and UDA	2023-06-22 18:08:45 +02:00
Michał Jadwiszczak	d498451cdf	cql3:statements:describe_statement: check pointer to UDF/UDA While looking for specific UDF/UDA, result of `functions::functions::find()` needs to be filtered out based on function's type. Fixes: #14360	2023-06-22 18:08:16 +02:00
Avi Kivity	b858a4669d	cql3: expr: break up expression.hh header Adding a function declaration to expression.hh causes many recompilations. Reduce that by: - moving some restrictions-related definitions to the existing expr/restrictions.hh - moving evaluation related names to a new header expr/evaluate.hh - move utilities to a new header expr/expr-utilities.hh expression.hh contains only expression definitions and the most basic and common helpers, like printing.	2023-06-22 14:21:03 +03:00
Avi Kivity	25c351a4f6	cql3: expr: restrictions.hh: protect against double inclusions Add #pragma once. Right now it's safe as it only has declarations (which can be repeated), but soon it will have a definition.	2023-06-22 14:19:43 +03:00
Avi Kivity	7302088274	cql3: constants: deinline To reduce future header fan-in, deinline all non-trivial functions. While these are on the hot path, they can't be inlined anyway as they're virtual, and they're quite heavy anyway.	2023-06-22 14:19:43 +03:00
Avi Kivity	6c0f8a73c5	cql3: statement_restrictions: deinline Reduce future header fan-in by deinlining functions. These are all on the prepare path.	2023-06-22 14:19:43 +03:00
Avi Kivity	3834a1fd7c	cql3: deinline operation::fill_prepare_context() To reduce operation.hh include fan-in, deinline fill_prepare_context(). It's not performance sensitive as it's on the prepare phase.	2023-06-22 14:19:43 +03:00
Kamil Braun	23a60df92d	Merge 'cql3: expr: simplify evaluate()' from Avi Kivity Make evaluate()'s body more regular, then exploit it by replacing the long list of branches with a lambda template. Closes #14306 * github.com:scylladb/scylladb: cql3: expr: simplify evaluate() cql3: expr: standardize evaluate() branches to call do_evaluate() cql3: expr: rename evaluate(ExpressionElement) to do_evaluate()	2023-06-22 12:18:36 +02:00
Kamil Braun	563d466de1	Merge 'cql3: select_statement: coroutinize indexed statement's do_execute()' from Avi Kivity Improves readability, and probably a little faster too. Closes #14311 * github.com:scylladb/scylladb: cql3: select_statement: reindent indexed_table_select_statement::do_execute cql3: select_statement: simplify inner lambda in indexed_table_select_statement::do_execute() cql3: select_statement: coroutinize indexed_table_select_statement::do_execute()	2023-06-22 12:10:45 +02:00
Botond Dénes	55e09dbdc0	Merge 'doc: move cloud deployment instruction to docs -v2' from Anna Stuchlik This is V2 of https://github.com/scylladb/scylladb/pull/14108 This commit moves the installation instruction for the cloud from the [website](https://www.scylladb.com/download/) to the docs. The scope: * Added new files with instructions for AWS, GCP, and Azure. * Added the new files to the index. * Updating the "Install ScyllaDB" page to create the "Cloud Deployment" section. * Adding new bookmarks in other files to create stable links, for example, ".. _networking-ports:" * Moving common files to the new "installation-common" directory. This step is required to exclude the open source-only files in the Enterprise repository. In addition: - The Configuration Reference file was moved out of the installation section (it's not about installation at all) - The links to creating a cluster were removed from the installation page (as not related). Related: https://github.com/scylladb/scylla-docs/issues/4091 Closes #14153 * github.com:scylladb/scylladb: doc: remove the rpm-info file (What is in each RPM) from the installation section doc: move cloud deployment instruction to docs -v2	2023-06-22 12:58:30 +03:00
Avi Kivity	32b27d6a08	cql3: expr: change evaluation_input vector components to take spans Spans are slightly cleaner, slightly faster (as they avoid an indirection), and allow for replacing some of the arguments with small_vector:s. Closes #14313	2023-06-22 11:28:01 +02:00
Anna Stuchlik	950ef5195e	Merge branch 'master' into anna-install-cloud-v2	2023-06-22 10:03:29 +02:00
Botond Dénes	e1c2de4fb8	Merge 'forward_service: fix forgetting case-sensitivity in aggregates ' from Jan Ciołek There was a bug that caused aggregates to fail when used on case-sensitive columns. For example: ```cql SELECT SUM("SomeColumn") FROM ks.table; ``` would fail, with a message saying that there is no column "somecolumn". This is because the case-sensitivity got lost on the way. For non case-sensitive column names we convert them to lowercase, but for case sensitive names we have to preserve the name as originally written. The problem was in `forward_service` - we took a column name and created a non case-sensitive `column_identifier` out of it. This converted the name to lowercase, and later such column couldn't be found. To fix it, let's make the `column_identifier` case-sensitive. It will preserve the name, without converting it to lowercase. Fixes: https://github.com/scylladb/scylladb/issues/14307 Closes #14340 * github.com:scylladb/scylladb: service/forward_service.cc: make case-sensitivity explicit cql-pytest/test_aggregate: test case-sensitive column name in aggregate forward_service: fix forgetting case-sensitivity in aggregates	2023-06-22 08:25:33 +03:00
Botond Dénes	320159c409	Merge 'Compaction group major compaction task' from Aleksandra Martyniuk Task manager task covering compaction group major compaction. Uses multiple inheritance on already existing major_compaction_task_executor to keep track of the operation with task manager. Closes #14271 * github.com:scylladb/scylladb: test: extend test_compaction_task.py test: use named variable for task tree depth compaction: turn major_compaction_task_executor into major_compaction_task_impl compaction: take gate holder out of task executor compaction: extend signature of some methods tasks: keep shared_ptr to impl in task compaction: rename compaction_task_executor methods	2023-06-22 08:15:17 +03:00
Avi Kivity	8576502c48	Merge 'raft topology: ban left nodes from the cluster' from Kamil Braun Use the new Seastar functionality for storing references to connections to implement banning hosts that have left the cluster (either decommissioned or using removenode) in raft-topology mode. Any attempts at communication from those nodes will be rejected. This works not only for nodes that restart, but also for nodes that were running behind a network partition and we removed them. Even when the partition resolves, the existing nodes will effectively put a firewall from that node. Some changes to the decommission algorithm had to be introduced for it to work with node banning. As a side effect a pre-existing problem with decommission was fixed. Read the "introduce `left_token_ring` state" and "prepare decommission path for node banning" commits for details. Closes #13850 * github.com:scylladb/scylladb: test: pylib: increase checking period for `get_alive_endpoints` test: add node banning test test: pylib: manager_client: `get_cql()` helper test: pylib: ScyllaCluster: server pause/unpause API raft topology: ban left nodes raft topology: skip `left_token_ring` state during `removenode` raft topology: prepare decommission path for node banning raft topology: introduce `left_token_ring` state raft topology: `raft_topology_cmd` implicit constructor messaging_service: implement host banning messaging_service: exchange host IDs and map them to connections messaging_service: store the node's host ID messaging_service: don't use parameter defaults in constructor main: move messaging_service init after system_keyspace init	2023-06-21 20:16:45 +03:00
Anna Stuchlik	c65abb06cd	doc: update the OSS docs landing page Fixes https://github.com/scylladb/scylladb/issues/14333 This commit replaces the documentation landing page with the Open Source-only documentation landing page. This change is required as now there is a separate landing page for the ScyllaDB documentation, so the page is duplicated, creating a bad user experience. Closes #14343	2023-06-21 17:06:48 +03:00
Jan Ciołek	16c21d7252	service/forward_service.cc: make case-sensitivity explicit Make it explicit that the boolean argument determines case-sensitivity. It emphasizes its importance. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-06-21 16:02:41 +02:00
Jan Ciolek	854b0301be	cql-pytest/test_aggregate: test case-sensitive column name in aggregate There was a bug which made aggregates fail when used with case-sensitive column names. Add a test to make sure that this doesn't happen in the future. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-06-21 14:49:24 +02:00
Jan Ciolek	7fca350075	forward_service: fix forgetting case-sensitivity in aggregates There was a bug that caused aggregates to fail when used on case-sensitive columns. For example: ``` SELECT SUM("SomeColumn") FROM ks.table; ``` would fail, with a message saying that there is no column "somecolumn". This is because the case-sensitivity got lost on the way. For non case-sensitive column names we convert them to lowercase, but for case sensitive names we have to preserve the name as originally written. The problem was in `forward_service` - we took a column name and created a non case-sensitive `column_identifier` out of it. This converted the name to lowercase, and later such column couldn't be found. To fix it, let's make the `column_identifier` case-sensitive. It will preserve the name, without converting it to lowercase. Fixes: https://github.com/scylladb/scylladb/issues/14307 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-06-21 14:37:42 +02:00
Nadav Har'El	8a9de08510	sstable: limit compression chunk size to 128 KB The chunk size used in sstable compression can be set when creating a table, using the "chunk_length_in_kb" parameter. It can be any power-of-two multiple of 1KB. Very large compression chunks are not useful - they offer diminishing returns on compression ratio, and require very large memory buffers and reading a very large amount of disk data just to read a small row. In fact, small chunks are recommended - Scylla defaults to 4 KB chunks, and Cassandra lowered their default from 64 KB (in Cassandra 3) to 16 KB (in Cassandra 4). Therefore, allowing arbitrarily large chunk sizes is just asking for trouble. Today, a user can ask for a 1 GB chunk size, and crash or hang Scylla when it runs out of memory. So in this patch we add a hard limit of 128 KB for the chunk size - anything larger is refused. Fixes #9933 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14267	2023-06-21 14:26:02 +03:00
Kefu Chai	f014ccf369	Revert "Revert "Merge 'treewide: add uuid_sstable_identifier_enabled support' from Kefu Chai"" This reverts commit `562087beff`. The regressions introduced by the reverted change have been fixed. So let's revert this revert to resurrect the uuid_sstable_identifier_enabled support. Fixes #10459	2023-06-21 13:02:40 +03:00
Avi Kivity	e233f471b8	Merge 'Respect tablet shard assignment' from Tomasz Grabiec This PR changes the system to respect shard assignment to tablets in tablet metadata (system.tablets): 1. The tablet allocator is changed to distribute tablets evenly across shards taking into account currently allocated tablets in the system. Each tablet has equal weight. vnode load is ignored. 2. CDC subsystem was not adjusted (not supported yet) 3. sstable sharding metadata reflects tablet boundaries 5. resharding is NOT supported yet (the node will abort on boot if there is a need to reshard tablet-based tables) 6. The system is NOT prepared to handle tablet migration / topology changes in a safe way. 7. Sstable cleanup is not wired properly yet After this PR, dht::shard_of() and schema::get_sharder() are deprecated. One should use table::shard_of() and effective_replication_map::get_sharder() instead. To make the life easier, support was added to obtain table pointer from the schema pointer: ``` schema_ptr s; s->table().shard_of(...) 
``` Closes #13939 * github.com:scylladb/scylladb: locator: network_topology_startegy: Allocate shards to tablets locator: Store node shard count in topology service: topology: Extract topology updating to a lambda test: Move test_tablets under topology_experimental sstables: Add trace-level logging related to shard calculation schema: Catch incorrect uses of schema::get_sharder() dht: Rename dht::shard_of() to dht::static_shard_of() treewide: Replace dht::shard_of() uses with table::shard_of() / erm::shard_of() storage_proxy: Avoid multishard reader for tablets storage_proxy: Obtain shard from erm in the read path db, storage_proxy: Drop mutation/frozen_mutation ::shard_of() forward_service: Use table sharder alternator: Use table sharder db: multishard: Obtain sharder from erm sstable_directory: Improve trace-level logging db: table: Introduce shard_of() helper db: Use table sharder in compaction sstables: Compute sstable shards using sharder from erm when loading sstables: Generate sharding metadata using sharder from erm when writing test: partitioner: Test split_range_to_single_shard() on tablet-like sharder dht: Make split_range_to_single_shard() prepared for tablet sharder sstables: Move compute_shards_for_this_sstable() to load() dht: Take sharder externally in splitting functions locator: Make sharder accessible through effective_replication_map dht: sharder: Document guarantees about mapping stability tablets: Implement tablet sharder tablets: Include pending replica in get_shard() dht: sharder: Introduce next_shard() db: token_ring_table: Filter out tablet-based keyspaces db: schema: Attach table pointer to schema schema_registry: Fix SIGSEGV in learn() when concurrent with get_or_load() schema_registry: Make learn(schema_ptr) attach entry to the target schema test: lib: cql_test_env: Expose feature_service test: Extract throttle object to separate header	2023-06-21 10:20:41 +03:00
Calle Wilund	f18e967939	storage_proxy: Make split_stats resilient to being called from different scheduling group Fixes #11017 When doing writes, storage proxy creates types deriving from abstract_write_response_handler. These are created in the various scheduling groups executing the write inducing code. They pick up a group-local reference to the various metrics used by SP. Normally all code using (and esp. modifying) these metrics are executed in the same scheduling group. However, if gossip sees a node go down, it will notify listeners, which eventually calls get_ep_stat and register_metrics. This code (before this patch) uses _active_ scheduling group to eventually add metrics, using a local dict as guard against double regs. If, as described above, we're called in a different sched group than the original one however, this can cause double registrations. Fixed here by keeping a reference to creating scheduling group and using this, not active one, when/if creating new metrics. Closes #14294	2023-06-21 10:08:27 +03:00
Tomasz Grabiec	ebdebb982b	locator: network_topology_startegy: Allocate shards to tablets Uses a simple algorithm for allocating shards which chooses the least-loaded shard on a given node, encapsulated in load_sketch. Takes load due to current tablet allocation into account. Each tablet, new or allocated for other tables, is assumed to have an equal load weight.	2023-06-21 00:58:25 +02:00
Tomasz Grabiec	e110167a2a	locator: Store node shard count in topology Will be needed by tablet allocator.	2023-06-21 00:58:25 +02:00
Tomasz Grabiec	dd968e16bf	service: topology: Extract topology updating to a lambda Reduces code duplication.	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	6defcb7bd5	test: Move test_tablets under topology_experimental Tablets will rely on shard_count information in topology, which is set only when using experimental raft-based topology.	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	34f28aa0cb	sstables: Add trace-level logging related to shard calculation	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	f6625e16ee	schema: Catch incorrect uses of schema::get_sharder() We still use it in many places in unit tests, which is ok because those tables are vnode-based. We want to check incorrect uses in production as they may lead to hard to debug consistency problems.	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	29cbdb812b	dht: Rename dht::shard_of() to dht::static_shard_of() This is in order to prevent new incorrect uses of dht::shard_of() from being accidentally added. It also makes sure that all current uses are caught by the compiler and require an explicit rename.	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	21198e8470	treewide: Replace dht::shard_of() uses with table::shard_of() / erm::shard_of() dht::shard_of() does not use the correct sharder for tablet-based tables. Code which is supposed to work with all kinds of tables should use erm::get_sharder().	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	fb0bdcec0c	storage_proxy: Avoid multishard reader for tablets Currently, the coordinator splits the partition range at vnode (or tablet) boundaries and then tries to merge adjacent ranges which target the same replica. This is an optimization which makes less sense with tablets, which are supposed to be of substantial size. If we don't merge the ranges, then with tablets we can avoid using the multishard reader on the replica side, since each tablet lives on a single shard. The main reason to avoid a multishard reader is avoiding its complexity, and avoiding adapting it to work with tablet sharding. Currently, the multishard reader implementation makes several assumptions about shard assignment which do not hold with tablets. It assumes that shards are assigned in a round-robin fashion.	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	10e05eec66	storage_proxy: Obtain shard from erm in the read path dht::shard_of() does not use the correct sharder for tablet-based tables. Code which is supposed to work with all kinds of tables should use erm::get_sharder().	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	e48ec6fed3	db, storage_proxy: Drop mutation/frozen_mutation ::shard_of() dht::shard_of() does not use the correct sharder for tablet-based tables. Code which is supposed to work with all kinds of tables should use erm::get_sharder().	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	d4497a058e	forward_service: Use table sharder schema::get_sharder() does not return the correct sharder for tablet-based tables. Code which is supposed to work with all kinds of tables should use erm::get_sharder().	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	ab94e74774	alternator: Use table sharder schema::get_sharder() does not return the correct sharder for tablet-based tables. Code which is supposed to work with all kinds of tables should use erm::get_sharder().	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	d92287f997	db: multishard: Obtain sharder from erm This is not strictly necessary, as the multishard reader will be later avoided altogether for tablet-based tables, but it is a step towards converting all code to use the erm->get_sharder() instead of schema::get_sharder().	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	18f567385c	sstable_directory: Improve trace-level logging	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	34ba8a6a53	db: table: Introduce shard_of() helper Saves some boilerplate code.	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	36da062bcb	db: Use table sharder in compaction	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	ad983ac23d	sstables: Compute sstable shards using sharder from erm when loading schema::get_sharder() does not use the correct sharder for tablet-based tables. Code which is supposed to work with all kinds of tables should obtain the sharder from erm::get_sharder().	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	17d6163548	sstables: Generate sharding metadata using sharder from erm when writing We need to keep sharding metadata consistent with tablet mapping to shards in order for node restart to detect that those sstables belong to a single shard and that resharding is not necessary. Resharding of sstables based on tablet metadata is not implemented yet and will abort after this series. Keeping sharding metadata accurate for tablets is only necessary until compaction group integration is finished. After that, we can use the sstable token range to determine the owning tablet and thus the owning shard. Before that, we can't, because a single sstable may contain keys from different tablets, and the whole key range may overlap with keys which belong to other shards.	2023-06-21 00:58:24 +02:00
Tomasz Grabiec	36e12020b9	test: partitioner: Test split_range_to_single_shard() on tablet-like sharder	2023-06-21 00:58:24 +02:00


37531 Commits